sispca.utils
Classes
Kernel | Custom data class for more efficient storage of kernels.
Functions
normalize_col | Z-score normalization.
tr_cov | Calculate the trace of the covariance matrix of the hidden representation.
gaussian_kernel | Calculate the Gaussian kernel matrix.
delta_kernel | Calculate the delta kernel matrix.
hsic_gaussian | Calculate the HSIC between two tensors using Gaussian kernel.
hsic_linear | Calculate the HSIC between two tensors using linear kernel.
gram_schmidt | Project the data to an orthogonal space using Gram-Schmidt process.
slice_sparse_matrix | Slice a PyTorch sparse matrix K by the indices in index such that the result is K[index, :][:, index].
Module Contents
- sispca.utils.normalize_col(x, center=True, scale=True)
Z-score normalization.
- Parameters:
x (2D tensor) – (n_sample, n_feature).
- Returns:
(x - x.mean(dim=0)) / x.std(dim=0)
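The documented return value can be reproduced in a few lines of plain PyTorch. This is a minimal sketch, not the library's source: `zscore_columns` is a hypothetical stand-in, and it assumes `center` and `scale` simply toggle the mean subtraction and the division by the standard deviation.

```python
import torch

def zscore_columns(x, center=True, scale=True):
    """Hypothetical stand-in mirroring the documented return value."""
    if center:
        x = x - x.mean(dim=0, keepdim=True)  # subtract each column's mean
    if scale:
        x = x / x.std(dim=0, keepdim=True)   # divide by each column's unbiased std
    return x

x = torch.tensor([[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]])
z = zscore_columns(x)  # every column now has mean 0 and std 1
```

A column with zero variance would produce a division by zero in this sketch; whether the library guards against that case is not stated here.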
- sispca.utils.tr_cov(x)
Calculate the trace of the covariance matrix of the hidden representation.
- Parameters:
x (2D tensor) – (n_sample, n_feature).
- Returns:
tr(x @ x.T)
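The quantity tr(x @ x.T) can be computed in several equivalent ways, which the following sketch checks numerically. It assumes the columns of x are centered, in which case the trace equals (n_sample - 1) times the total sample variance; whether the library applies any normalization is not documented here.

```python
import torch

torch.manual_seed(0)
x = torch.randn(6, 3, dtype=torch.float64)
x = x - x.mean(dim=0, keepdim=True)       # center the columns

trace_gram = torch.trace(x @ x.T)         # trace of the (n, n) Gram matrix
trace_scatter = torch.trace(x.T @ x)      # trace of the (d, d) scatter matrix
sum_of_squares = (x ** 2).sum()           # same value without any matrix product

n = x.shape[0]
total_variance = x.var(dim=0).sum()       # trace of the unbiased sample covariance
```

All three forms agree, and `sum_of_squares` equals `(n - 1) * total_variance` for centered data.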
- sispca.utils.gaussian_kernel(x, bw=None)
Calculate the Gaussian kernel matrix.
- Parameters:
x (2D tensor) – (n_sample, n_feature).
bw – Bandwidth of the Gaussian kernel. If None, it is set to the median distance.
- Returns:
K (2D tensor): (n_sample, n_sample).
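A Gaussian kernel with a median-distance bandwidth might look like the sketch below. Two details are assumptions rather than documented behavior: the kernel form exp(-d^2 / (2 * bw^2)), and taking the median over off-diagonal pairwise Euclidean distances. `gaussian_kernel_sketch` is a hypothetical name.

```python
import torch

def gaussian_kernel_sketch(x, bw=None):
    """Hypothetical Gaussian kernel; the library's exact scaling may differ."""
    diff = x[:, None, :] - x[None, :, :]          # (n, n, d) pairwise differences
    d = diff.pow(2).sum(dim=-1).sqrt()            # (n, n) Euclidean distances
    if bw is None:
        off_diag = ~torch.eye(x.shape[0], dtype=torch.bool)
        bw = d[off_diag].median()                 # median heuristic (assumed form)
    return torch.exp(-d.pow(2) / (2 * bw ** 2))   # assumed kernel form

x = torch.tensor([[0.0], [1.0], [3.0]])
K = gaussian_kernel_sketch(x)  # symmetric (3, 3) matrix with ones on the diagonal
```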
- sispca.utils.delta_kernel(x)
Calculate the delta kernel matrix.
- Parameters:
x (2D array) – Category labels. (n_sample, n_feature).
- Returns:
K (2D tensor): (n_sample, n_sample).
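For a single column of category labels, a delta kernel has K[i, j] = 1 when samples i and j share a label and 0 otherwise. The sketch below builds it directly and also through a one-hot factor Q with K = Q @ Q.T. How the library combines multiple label columns is not specified here, so the example sticks to one column.

```python
import torch
import torch.nn.functional as F

labels = torch.tensor([0, 1, 0, 2])

# Direct construction: compare every pair of labels.
K = (labels[:, None] == labels[None, :]).float()

# Equivalent low-rank form: one-hot encode, then take the Gram matrix.
Q = F.one_hot(labels).float()   # shape (4, 3), one column per category
K_from_Q = Q @ Q.T
```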
- sispca.utils.hsic_gaussian(x, y, bw=None)
Calculate the HSIC between two tensors using Gaussian kernel.
- Parameters:
x (2D tensor) – (n_sample, n_feature_1).
y (2D tensor) – (n_sample, n_feature_2).
bw – Bandwidth of the Gaussian kernel. If None, it is set to the median distance.
- Returns:
HSIC between x and y.
- Return type:
HSIC (float)
- sispca.utils.hsic_linear(x, y)
Calculate the HSIC between two tensors using linear kernel.
- Parameters:
x (2D tensor) – (n_sample, n_feature_1).
y (2D tensor) – (n_sample, n_feature_2).
- Returns:
HSIC between x and y.
- Return type:
HSIC (float)
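As an illustration of what an HSIC value measures, the sketch below uses the common biased estimator tr(K H L H) / (n - 1)^2 with linear kernels, where H is the centering matrix. The normalization constant is an assumption and may differ from what the library uses; `hsic_from_kernels` and `hsic_linear_sketch` are hypothetical names.

```python
import torch

def hsic_from_kernels(K, L):
    """Biased HSIC estimator from two (n, n) kernel matrices (assumed scaling)."""
    n = K.shape[0]
    H = torch.eye(n, dtype=K.dtype) - torch.ones(n, n, dtype=K.dtype) / n
    return torch.trace(K @ H @ L @ H) / (n - 1) ** 2

def hsic_linear_sketch(x, y):
    # Linear kernels are plain Gram matrices of the inputs.
    return hsic_from_kernels(x @ x.T, y @ y.T)

torch.manual_seed(0)
x = torch.randn(8, 3, dtype=torch.float64)
y = x @ torch.randn(3, 2, dtype=torch.float64)   # y depends linearly on x

hsic_dep = hsic_linear_sketch(x, y)
hsic_const = hsic_linear_sketch(x, torch.ones(8, 1, dtype=torch.float64))

# With linear kernels the estimator reduces to a squared cross-covariance norm.
xc = x - x.mean(dim=0, keepdim=True)
yc = y - y.mean(dim=0, keepdim=True)
hsic_crosscov = (xc.T @ yc).pow(2).sum() / (x.shape[0] - 1) ** 2
```

A constant y carries no information about x, so `hsic_const` is zero up to rounding, while `hsic_dep` is strictly positive. Swapping the Gram matrices for Gaussian kernel matrices in `hsic_from_kernels` gives the Gaussian-kernel variant under the same assumed scaling.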
- sispca.utils.gram_schmidt(x)
Project the data to an orthogonal space using Gram-Schmidt process.
- Parameters:
x (2D tensor)
- Returns:
data with orthonormal columns.
- Return type:
x_new (2D tensor)
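One way to obtain orthonormal columns is the modified Gram-Schmidt procedure sketched below. It is an illustrative implementation that assumes the columns of x are linearly independent; it is not taken from the library, and `gram_schmidt_sketch` is a hypothetical name.

```python
import torch

def gram_schmidt_sketch(x):
    """Modified Gram-Schmidt over the columns of x (assumes full column rank)."""
    cols = []
    for j in range(x.shape[1]):
        v = x[:, j].clone()
        for q in cols:
            v = v - (q @ v) * q      # remove the component along each earlier column
        cols.append(v / v.norm())    # normalize to unit length
    return torch.stack(cols, dim=1)

torch.manual_seed(0)
x = torch.randn(6, 3, dtype=torch.float64)
x_new = gram_schmidt_sketch(x)
gram = x_new.T @ x_new               # close to the 3 x 3 identity
```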
- class sispca.utils.Kernel(target_type, Q=None, target_kernel=None)
Custom data class for more efficient storage of kernels.
- Usage:
kernel = Kernel('continuous', Q=target_data)  # K = Q @ Q.T
kernel.realization()  # return the (n, n) kernel matrix
kernel.subset(idx)  # return the sub-kernel matrix of shape (m, m), where m = len(idx)
kernel.xtKx(x)  # return x.T @ K @ x
- Parameters:
target_type (str) – One of ['continuous', 'categorical', 'identity', 'custom']. The type of the target data. If 'custom', target_kernel must be provided.
Q (int or 2D tensor) – If int, Q is the dimension of the identity matrix. If 2D tensor, Q is the decomposed matrix (n_obs, n_var) where K = Q @ Q.T.
target_kernel (2D tensor) – The pre-calculated kernel matrix of shape (n_obs, n_obs). Applied when target_type is 'custom'. Will be stored as a sparse tensor.
- target_type
- Q = None
- target_kernel = None
- shape
- _rank = None
- _sanity_check()
- _shape()
- realization()
- xtKx(x)
- subset(idx)
Helper function to extract batched inputs for training. idx (tensor) is the index of the batch.
- rank()
Calculate the rank of the kernel matrix.
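The storage saving comes from the factorization K = Q @ Q.T: the quantities exposed by the class can be computed from Q alone, without materializing the (n_obs, n_obs) matrix. The sketch below checks these identities in plain PyTorch. It illustrates the algebra only and makes no claim about how the class implements its methods.

```python
import torch

torch.manual_seed(0)
Q = torch.randn(10, 3, dtype=torch.float64)   # (n_obs, n_var) factor
x = torch.randn(10, 2, dtype=torch.float64)
idx = torch.tensor([1, 4, 7])                 # indices of a mini-batch

K = Q @ Q.T                                   # dense realization, for comparison only

# x.T @ K @ x through the factor: only small (n_var, 2) products are formed.
QtX = Q.T @ x
xtKx_factored = QtX.T @ QtX

# Sub-kernel for a batch: slice the rows of Q, then take the Gram matrix.
K_sub_factored = Q[idx] @ Q[idx].T

# K and Q have the same rank, so the rank can be read from the smaller factor.
rank_Q = torch.linalg.matrix_rank(Q)
```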
- sispca.utils.slice_sparse_matrix(K: torch.sparse_coo_tensor, index: torch.Tensor)
Slice a PyTorch sparse matrix K by the indices in index such that the result is K[index, :][:, index].
- Parameters:
K – Input sparse matrix (torch.sparse_coo_tensor).
index – 1D tensor of row/column indices to slice (torch.Tensor).
- Returns:
A new sparse matrix corresponding to K[index, :][:, index].
- Return type:
torch.sparse_coo_tensor
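Slicing a sparse COO matrix without converting it to dense can be done by remapping the stored coordinates. The sketch below is one possible approach, assuming K is square and index contains no duplicates; `slice_sparse_sketch` is a hypothetical name and not the library's implementation.

```python
import torch

def slice_sparse_sketch(K, index):
    """Return K[index, :][:, index] as a sparse COO tensor (unique indices assumed)."""
    K = K.coalesce()
    rows, cols = K.indices()
    vals = K.values()
    # Map each original position to its position in the slice, or -1 if dropped.
    lookup = torch.full((K.shape[0],), -1, dtype=torch.long)
    lookup[index] = torch.arange(index.numel())
    new_rows, new_cols = lookup[rows], lookup[cols]
    keep = (new_rows >= 0) & (new_cols >= 0)      # entries whose row and column survive
    m = index.numel()
    return torch.sparse_coo_tensor(
        torch.stack([new_rows[keep], new_cols[keep]]), vals[keep], (m, m)
    )

dense = torch.tensor([[1.0, 0.0, 2.0, 0.0],
                      [0.0, 3.0, 0.0, 0.0],
                      [4.0, 0.0, 5.0, 6.0],
                      [0.0, 0.0, 7.0, 8.0]])
index = torch.tensor([2, 0])
sliced = slice_sparse_sketch(dense.to_sparse(), index)
```

Converting `sliced` back with `to_dense()` gives the same result as `dense[index][:, index]`, including the reordering implied by the order of `index`.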