nmf_models.nmf_models_mod_updates.intNMF
- class intNMF(n_topics, epochs=15, init='random', mod1_skew=0.5, seed=None, reg=None)[source]
Bases:
object
Class to run int NMF on multiome data.
- theta
cell x topic matrix (joint low dimensional embedding)
- Type:
array-like
- phi_rna
topic x gene matrix. Gives the loading matrix to define topics.
- Type:
array-like
- phi_atac
topic x region matrix. Gives the loading matrix to define topics.
- Type:
array-like
- loss
l2 norm of the reconstruction error i.e. ||X_rna - WH_rna||_2 + ||X_atac - WH_atac||_2
- Type:
- loss_atac
l2 norm for the reconstruction error of just the atac matrix i.e. ||X_atac - WH_atac||_2
- Type:
- loss_rna
l2 norm for the reconstruction error of just the rna matrix i.e. ||X_rna - WH_rna||_2
- Type:
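As a rough illustration of how the three loss attributes relate to the factor matrices, the sketch below computes them with NumPy on toy data. It assumes that W corresponds to theta, that H_rna and H_atac correspond to phi_rna and phi_atac, and that "l2 norm" means the Frobenius norm of the residual matrix. The helper name reconstruction_losses is hypothetical and is not part of the documented class.

```python
import numpy as np

def reconstruction_losses(x_rna, x_atac, theta, phi_rna, phi_atac):
    """Return (loss, loss_rna, loss_atac) for dense toy inputs.

    Assumption: each loss is the Frobenius norm of the reconstruction residual.
    """
    resid_rna = np.asarray(x_rna) - theta @ phi_rna      # cells x genes
    resid_atac = np.asarray(x_atac) - theta @ phi_atac   # cells x regions
    loss_rna = np.linalg.norm(resid_rna)    # Frobenius norm is the default for 2-D arrays
    loss_atac = np.linalg.norm(resid_atac)
    return loss_rna + loss_atac, loss_rna, loss_atac

# Toy factors with the documented shapes.
rng = np.random.default_rng(0)
n_cells, n_genes, n_regions, n_topics = 6, 8, 10, 3
theta = rng.random((n_cells, n_topics))        # cell x topic
phi_rna = rng.random((n_topics, n_genes))      # topic x gene
phi_atac = rng.random((n_topics, n_regions))   # topic x region

# Data built exactly from the factors, so every residual is zero.
x_rna = theta @ phi_rna
x_atac = theta @ phi_atac
loss, loss_rna, loss_atac = reconstruction_losses(x_rna, x_atac, theta, phi_rna, phi_atac)
```

On real data the residuals are non-zero, and the combined value is simply the sum of the two per-modality terms.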
- Parameters:
n_topics (int) – The number of latent topics
epochs (int) – Number of iterations during optimisation. Defaults to 15.
init (string) – Method of initialising W and H. Defaults to random.
mod1_skew (float) – Relative weighting of the two modalities, between 0 and 1. Defaults to 0.5.
reg (string) – Whether to include l1 or l2 regularisation. (This is TODO.) Defaults to None.
seed (int) – Random seed to use. Defaults to None, i.e. no control of the random seed. (Useful for reproducibility when using random initialisation.)
Methods
fit(rna_mat, atac_mat, rna_names=None, atac_names=None) – Optimise NMF.
- fit(rna_mat, atac_mat, rna_names=None, atac_names=None)[source]
Optimise NMF.
Uses the accelerated hierarchical alternating least squares (HALS) algorithm proposed in https://arxiv.org/pdf/1107.5194.pdf, modified to jointly factorise two matrices.
The only required arguments are the matrices to factorise. The GEX and ATAC matrices are expected in cell-by-feature format and should be scipy sparse matrices.
The optimisation solves min ||X_rna - (theta . phi_rna)||_2 and min ||X_atac - (theta . phi_atac)||_2 s.t. theta, phi_rna, phi_atac > 0, so that theta holds the latent topic scores for each cell, and phi_rna and phi_atac allow reconstruction of X_rna and X_atac.
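To make the idea of a shared theta across two modalities concrete, here is a minimal NumPy sketch of joint NMF. It is not the accelerated HALS scheme used by fit. It swaps in plain multiplicative updates, minimises the sum of squared Frobenius norms, works on dense arrays, and ignores mod1_skew. The function name joint_nmf_mu and its eps parameter are hypothetical.

```python
import numpy as np

def joint_nmf_mu(x_rna, x_atac, n_topics, epochs=200, seed=0, eps=1e-10):
    """Joint NMF via multiplicative updates (illustrative only, not accelerated HALS)."""
    rng = np.random.default_rng(seed)
    theta = rng.random((x_rna.shape[0], n_topics))       # shared cell x topic matrix
    phi_rna = rng.random((n_topics, x_rna.shape[1]))     # topic x gene loadings
    phi_atac = rng.random((n_topics, x_atac.shape[1]))   # topic x region loadings
    history = []
    for _ in range(epochs):
        # Update each loading matrix with theta held fixed.
        gram = theta.T @ theta
        phi_rna *= (theta.T @ x_rna) / (gram @ phi_rna + eps)
        phi_atac *= (theta.T @ x_atac) / (gram @ phi_atac + eps)
        # Update theta using both modalities at once; this is the joint step.
        numer = x_rna @ phi_rna.T + x_atac @ phi_atac.T
        denom = theta @ (phi_rna @ phi_rna.T + phi_atac @ phi_atac.T) + eps
        theta *= numer / denom
        # Record the squared-error objective after this epoch.
        history.append(
            np.linalg.norm(x_rna - theta @ phi_rna) ** 2
            + np.linalg.norm(x_atac - theta @ phi_atac) ** 2
        )
    return theta, phi_rna, phi_atac, history

# Toy non-negative data that share one set of cell-level topic scores.
rng = np.random.default_rng(1)
true_theta = rng.random((30, 3))
x_rna = true_theta @ rng.random((3, 20))
x_atac = true_theta @ rng.random((3, 40))
theta, phi_rna, phi_atac, history = joint_nmf_mu(x_rna, x_atac, n_topics=3)
```

Because theta appears in both reconstruction terms, each update to it pools evidence from the RNA and ATAC matrices, which is what makes the embedding joint.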
- Parameters:
rna_mat (scipy sparse matrix, or coercible) – Single-cell gene expression matrix.
atac_mat (scipy sparse matrix, or coercible) – Single-cell ATAC (chromatin accessibility) matrix.
rna_names (list, optional) – Gene names. Must be the same length as the number of columns in rna_mat.
atac_names (list, optional) – Region names. Must be the same length as the number of columns in atac_mat.
- Returns:
The fitted object. Access the low-dimensional embedding with self.theta, or the loadings with self.phi_rna or self.phi_atac.
- Return type:
self
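A possible end-to-end call might look like the sketch below. The import path is an assumption taken from the page heading, and the helpers make_toy_multiome and fit_toy_model are hypothetical, as is the synthetic data. The import sits inside the function so that the data helper can be run without the package installed.

```python
import numpy as np
from scipy import sparse

def make_toy_multiome(n_cells=50, n_genes=30, n_regions=80, seed=0):
    """Build small synthetic cell-by-feature sparse matrices (illustrative data only)."""
    rng = np.random.default_rng(seed)
    rna = sparse.csr_matrix(rng.poisson(1.0, size=(n_cells, n_genes)).astype(float))
    atac = sparse.csr_matrix((rng.random((n_cells, n_regions)) < 0.1).astype(float))
    gene_names = [f"gene_{i}" for i in range(n_genes)]
    region_names = [f"region_{i}" for i in range(n_regions)]
    return rna, atac, gene_names, region_names

def fit_toy_model(n_topics=5):
    """Fit intNMF on the toy data; requires the package to be installed."""
    # Assumed import path, inferred from the module name in the page heading.
    from nmf_models.nmf_models_mod_updates import intNMF
    rna, atac, genes, regions = make_toy_multiome()
    model = intNMF(n_topics, epochs=15, seed=0)
    model.fit(rna, atac, rna_names=genes, atac_names=regions)
    return model

# With the package available, the fitted attributes would be read like this:
# model = fit_toy_model()
# model.theta     # cell x topic embedding
# model.phi_rna   # topic x gene loadings
# model.phi_atac  # topic x region loadings
```

Both matrices must share the same cells in the same row order, since a single theta row describes each cell in both modalities.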