This is an experimental function and is
subject to change.
Performs consensus integrative non-negative matrix factorization (c-iNMF) to return factorized \(H\), \(W\), and \(V\) matrices. To address the non-convex nature of NMF, this function builds on the cNMF method proposed by D. Kotliar et al., 2019. Regular iNMF is run multiple times with different random starts; the factors in \(W\) and the \(V\) matrices from all runs are pooled and clustered, and the consensus is taken from the clusters of the largest population. The cell factor loading \(H\) matrices are finally solved with the consensus \(W\) and \(V\) matrices held fixed.
Please see runINMF for detailed introduction to the regular
iNMF algorithm which is run multiple times in this function.
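For orientation, the iNMF objective as it is usually stated (Welch et al., 2019) is reproduced below. The symbols \(E_i\) (the non-negative input matrix of dataset \(i\), features by cells) and \(d\) (the number of datasets) are not defined on this page and are introduced here only for illustration; runINMF remains the authoritative description.

\[
\arg\min_{H \ge 0,\; W \ge 0,\; V \ge 0} \;
\sum_{i=1}^{d} \left\lVert E_i - (W + V_i) H_i \right\rVert_F^2
\; + \; \lambda \sum_{i=1}^{d} \left\lVert V_i H_i \right\rVert_F^2
\]

Here \(W\) is the shared factor matrix, \(V_i\) the dataset-specific factor matrix, \(H_i\) the cell factor loading matrix of dataset \(i\), and \(\lambda\) corresponds to the lambda argument.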
The consensus iNMF algorithm is based on the consensus NMF (cNMF) method (D. Kotliar et al., 2019).
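The consensus idea described above can be illustrated with a small base R sketch. This is not the rliger implementation: the simulated runs, the 95% outlier cutoff, the use of kmeans and the per-cluster median are all assumptions chosen for illustration, and the sketch pools a single factor matrix instead of \(W\) and \(V\) together. It only shows how pooled factors from several random starts might be filtered by nearest-neighbor distance (with the neighbor count set by rho * nRandomStarts) and then reduced to one consensus factor per cluster.

```r
# Toy sketch of a cNMF-style consensus step (NOT the rliger implementation).
set.seed(1)
nGenes <- 60
k <- 3
nRandomStarts <- 10
rho <- 0.3

# Hypothetical "true" factors: each factor is high on a different third of genes.
truth <- sapply(seq_len(k), function(j) {
    as.numeric(seq_len(nGenes) %% k == j - 1) + 0.1
})

# Simulate replicate runs: each returns a noisy copy with shuffled factor order.
runs <- lapply(seq_len(nRandomStarts), function(i) {
    noisy <- truth + matrix(rnorm(nGenes * k, sd = 0.02), nGenes, k)
    pmax(noisy, 0)[, sample(k)]
})

# 1. Pool all factors from all runs and scale each to unit L2 norm.
pool <- do.call(cbind, runs)                 # nGenes x (k * nRandomStarts)
pool <- sweep(pool, 2, sqrt(colSums(pool^2)), "/")

# 2. Score each factor by its mean distance to its nearest neighbors.
nNeighbors <- round(rho * nRandomStarts)     # 3 with the default arguments
d <- as.matrix(dist(t(pool)))
knnDist <- apply(d, 1, function(x) mean(sort(x)[2:(nNeighbors + 1)]))
keep <- knnDist <= quantile(knnDist, 0.95)   # assumed outlier cutoff

# 3. Cluster the retained factors into k groups and take the median of each.
kept <- pool[, keep, drop = FALSE]
cl <- kmeans(t(kept), centers = k, nstart = 10)$cluster
consensus <- sapply(seq_len(k), function(j) {
    apply(kept[, cl == j, drop = FALSE], 1, median)
})
dim(consensus)                               # nGenes x k
```

In the real function the consensus \(W\) and \(V\) matrices would then be held fixed while the \(H\) matrices are solved; that final step is omitted here.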
Usage
runCINMF(object, k = 20, lambda = 5, rho = 0.3, ...)
# S3 method for liger
runCINMF(
  object,
  k = 20,
  lambda = 5,
  rho = 0.3,
  nIteration = 30,
  nRandomStarts = 10,
  HInit = NULL,
  WInit = NULL,
  VInit = NULL,
  seed = 1,
  nCores = 2L,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)
# S3 method for Seurat
runCINMF(
  object,
  k = 20,
  lambda = 5,
  rho = 0.3,
  datasetVar = "orig.ident",
  layer = "ligerScaleData",
  assay = NULL,
  reduction = "cinmf",
  nIteration = 30,
  nRandomStarts = 10,
  HInit = NULL,
  WInit = NULL,
  VInit = NULL,
  seed = 1,
  nCores = 2L,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)
Arguments
- object
 A liger object or a Seurat object with non-negative scaled data of variable features (done with scaleNotCenter).
- k
 Inner dimension of factorization (number of factors). Generally, a higher k will be needed for datasets with more sub-structure. Default 20.
- lambda
 Regularization parameter. Larger values penalize dataset-specific effects more strongly (i.e. alignment should increase as lambda increases). Default 5.
- rho
 Number between 0 and 1. Fraction used to determine the number of nearest neighbors to look at for consensus (by rho * nRandomStarts). Default 0.3.
- ...
 Arguments passed to methods.
- nIteration
 Total number of block coordinate descent iterations to perform. Default 30.
- nRandomStarts
 Number of replicate runs for creating the pool of factorization results. Default 10.
- HInit
 Initial values to use for \(H\) matrices. A list object where each element is the initial \(H\) matrix of each dataset. Default NULL.
- WInit
 Initial values to use for \(W\) matrix. A matrix object. Default NULL.
- VInit
 Initial values to use for \(V\) matrices. A list object where each element is the initial \(V\) matrix of each dataset. Default NULL.
- seed
 Random seed to allow reproducible results. Default 1.
- nCores
 The number of parallel tasks to speed up the computation. Default 2L. Only supported on platforms with OpenMP support.
- verbose
 Logical. Whether to show information of the progress. Default getOption("ligerVerbose"), or TRUE if users have not set it.
- datasetVar
 Metadata variable name that stores the dataset source annotation. Default "orig.ident".
- layer
 For Seurat>=4.9.9, the name of the layer to retrieve input non-negative scaled data from. Default "ligerScaleData". For older Seurat, data is always retrieved from the scale.data slot.
- assay
 Name of assay to use. Default NULL uses the current active assay.
- reduction
 Name of the reduction to store the result in. Also used as the feature key. Default "cinmf".
Value
liger method - Returns updated input liger object
- A list of all \(H\) matrices can be accessed with getMatrix(object, "H")
- A list of all \(V\) matrices can be accessed with getMatrix(object, "V")
- The \(W\) matrix can be accessed with getMatrix(object, "W")
Seurat method - Returns updated input Seurat object
- \(H\) matrices for all datasets will be concatenated and transposed (all cells by k), and form a DimReduc object in the reductions slot named by argument reduction.
- \(W\) matrix will be presented as feature.loadings in the same DimReduc object.
- \(V\) matrices, an objective error value and the dataset variable used for the factorization are currently stored in the misc slot of the same DimReduc object.
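The shape of the embedding described for the Seurat method can be illustrated with toy matrices in base R. This is only a sketch of the "concatenate and transpose" step: the dataset names, cell names and the "cinmf_" column prefix are hypothetical, and no Seurat object is created.

```r
# Toy H matrices (k factors x cells), one per dataset; all names are hypothetical.
set.seed(1)
k <- 4
H <- list(
    ctrl = matrix(runif(k * 5), k, 5,
                  dimnames = list(NULL, paste0("ctrl_", 1:5))),
    stim = matrix(runif(k * 3), k, 3,
                  dimnames = list(NULL, paste0("stim_", 1:3)))
)

# Concatenate along the cell dimension, then transpose to cells x k.
embedding <- t(do.call(cbind, H))
colnames(embedding) <- paste0("cinmf_", seq_len(k))  # assumed key style
dim(embedding)
```

Each row of the result corresponds to one cell across all datasets and each column to one factor, which is the orientation a cell embedding in a DimReduc object uses.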
References
Joshua D. Welch et al., Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity, Cell, 2019
Dylan Kotliar et al., Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-Seq, eLife, 2019
Examples
# \donttest{
pbmc <- normalize(pbmc)
#> ℹ Normalizing datasets "ctrl"
#> ℹ Normalizing datasets "stim"
#> ✔ Normalizing datasets "stim" ... done
#> 
#> ℹ Normalizing datasets "ctrl"
#> ✔ Normalizing datasets "ctrl" ... done
#> 
pbmc <- selectGenes(pbmc)
#> ℹ Selecting variable features for dataset "ctrl"
#> ✔ ... 168 features selected out of 249 shared features.
#> ℹ Selecting variable features for dataset "stim"
#> ✔ ... 166 features selected out of 249 shared features.
#> ✔ Finally 173 shared variable features are selected.
pbmc <- scaleNotCenter(pbmc)
#> ℹ Scaling dataset "ctrl"
#> ✔ Scaling dataset "ctrl" ... done
#> 
#> ℹ Scaling dataset "stim"
#> ✔ Scaling dataset "stim" ... done
#> 
if (requireNamespace("RcppPlanc", quietly = TRUE)) {
    pbmc <- runCINMF(pbmc)
}
#> Replicating iNMF runs ■■■■■■■■■■■■■■■■                  50% | [ 5 / 10 ] ETA:  …
#> Replicating iNMF runs ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■  100% | [ 10 / 10 ] ETA: …
#> ℹ Taking the consensus
#> ✔ Taking the consensus ... done
#> 
#> ℹ ANLS optimization with consensus fixed
#> ✔ ANLS optimization with consensus fixed  ... objective error: 35854.9654727417
#> 
# }