Perform factorization for subset of data — optimizeSubset • rliger

Re-optimizes the factorization for a subset of cells, using an efficient updating strategy that takes advantage of the information in the existing factorization.

Usage

optimizeSubset(
  object,
  clusterVar = NULL,
  useClusters = NULL,
  lambda = NULL,
  nIteration = 30,
  cellIdx = NULL,
  scaleDatasets = NULL,
  seed = 1,
  verbose = getOption("ligerVerbose"),
  cell.subset = cellIdx,
  cluster.subset = useClusters,
  max.iters = nIteration,
  datasets.scale = scaleDatasets,
  thresh = NULL
)
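The last arguments in the signature are older names whose defaults point at their replacements. As a rough illustration of that mapping, the base R sketch below renames deprecated entries in a named argument list. `modernizeArgs` is a hypothetical helper written for this example and is not part of rliger; `thresh` is dropped because the argument list marks it as deprecated with no replacement.

```r
# Old-name -> new-name map, read off the defaults in the Usage signature.
deprecated <- c(
  cell.subset    = "cellIdx",
  cluster.subset = "useClusters",
  max.iters      = "nIteration",
  datasets.scale = "scaleDatasets"
)

# Hypothetical helper (NOT an rliger function): rename deprecated entries in a
# named argument list and drop `thresh`, which has no replacement.
modernizeArgs <- function(args) {
  args$thresh <- NULL
  old <- names(args) %in% names(deprecated)
  names(args)[old] <- unname(deprecated[names(args)[old]])
  args
}

oldCall <- list(cell.subset = 1:200, max.iters = 30, thresh = 1e-6)
newCall <- modernizeArgs(oldCall)
names(newCall)

# With a factorized liger object called `object`, the updated call could be:
# object <- do.call(optimizeSubset, c(list(object), newCall))
```

Calling the function with the new names directly is simpler. The helper is only useful when older scripts build their argument lists programmatically.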

Arguments

object: liger object. Should have integrative factorization (e.g. runINMF) performed in advance.
clusterVar, useClusters: Together, these conveniently select the clusters used to subset the object. clusterVar is the name of a variable in cellMeta(object), and useClusters should be a vector of cluster names in that variable. By default, clusterVar is the default cluster variable (see runCluster, or defaultCluster at "Cell metadata access"). For more complex conditions, users can instead select cells explicitly with cellIdx. useClusters overrides cellIdx.
lambda: Numeric regularization parameter. Default NULL uses the lambda value from the latest factorization.
nIteration: Maximum number of block coordinate descent iterations to perform. Default 30.
cellIdx: Valid index vector that applies to the whole object. See subsetLiger for requirements. Default NULL.
scaleDatasets: Names of datasets to re-scale after subsetting. Default NULL does not re-scale.
seed: Random seed to allow reproducible results. Default 1. Used by runINMF factorization.
verbose: Logical. Whether to show progress information. Default getOption("ligerVerbose"), which is TRUE if users have not set it.
cell.subset, cluster.subset, max.iters, datasets.scale: Deprecated. These arguments have been replaced and will be removed in the future. See Usage for the replacements.
thresh: Deprecated. New implementation of iNMF does not require a threshold for convergence detection. Setting a large enough nIteration will bring it to convergence.
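When the cells of interest cannot be described by cluster names alone, cellIdx can be built from any logical condition on per-cell metadata. The sketch below uses a small made-up data frame as a stand-in for cellMeta(object). The column names (`dataset`, `cluster`, `nUMI`) and all values are assumptions for illustration only.

```r
# Hypothetical stand-in for cellMeta(object): one row per cell.
meta <- data.frame(
  dataset = rep(c("ctrl", "stim"), each = 5),
  cluster = c("0", "1", "1", "2", "0", "1", "1", "0", "2", "1"),
  nUMI    = c(900, 1500, 400, 2100, 1800, 1300, 700, 2500, 1600, 1100),
  stringsAsFactors = FALSE
)

# Complex condition: cluster "1" cells from "stim", or any cell with > 2000 UMIs.
keep <- (meta$cluster == "1" & meta$dataset == "stim") | meta$nUMI > 2000

# Integer positions over the whole object, in ascending order.
cellIdx <- which(keep)
cellIdx

# With a factorized liger object called `object`, this index would be passed as:
# object <- optimizeSubset(object, cellIdx = cellIdx)
```

Because useClusters overrides cellIdx, leave useClusters as NULL when selecting cells this way.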

Value

Subset object with optimized factorization matrices: the W matrix in the liger object, and the H and V matrices in each ligerDataset object in the datasets slot. scaleData in the ligerDataset objects of the datasets specified by scaleDatasets will also be updated to reflect the subset.

Examples

pbmc <- normalize(pbmc)
#> ℹ Normalizing datasets "ctrl"
#> ℹ Normalizing datasets "stim"
#> ✔ Normalizing datasets "stim" ... done
#> 
#> ℹ Normalizing datasets "ctrl"
#> ✔ Normalizing datasets "ctrl" ... done
#> 
pbmc <- selectGenes(pbmc)
#> ℹ Selecting variable features for dataset "ctrl"
#> ✔ ... 168 features selected out of 249 shared features.
#> ℹ Selecting variable features for dataset "stim"
#> ✔ ... 166 features selected out of 249 shared features.
#> ✔ Finally 173 shared variable features are selected.
pbmc <- scaleNotCenter(pbmc)
#> ℹ Scaling dataset "ctrl"
#> ✔ Scaling dataset "ctrl" ... done
#> 
#> ℹ Scaling dataset "stim"
#> ✔ Scaling dataset "stim" ... done
#> 
if (requireNamespace("RcppPlanc", quietly = TRUE)) {
    # Only running a few iterations for fast examples
    pbmc <- runINMF(pbmc, k = 20, nIteration = 2)
    pbmc <- optimizeSubset(pbmc, cellIdx = sort(sample(ncol(pbmc), 200)),
                           nIteration = 2)
}
#> ℹ Subsetting dataset: "ctrl"
#> ℹ Subsetting dataset: "stim"
#> ✔ Subsetting dataset: "stim" ... done
#> 
#> ℹ Subsetting dataset: "ctrl"
#> ✔ Subsetting dataset: "ctrl" ... done
#>
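The example draws its 200 cells with sample(), so the chosen subset changes from run to run unless the random number generator is seeded first. Whether the seed argument of optimizeSubset also governs that draw is not stated here, so the sketch below seeds explicitly before sampling. `nCells` is a hypothetical stand-in for ncol(pbmc).

```r
# Hypothetical cell count standing in for ncol(pbmc).
nCells <- 600

# Seed, then draw a sorted random subset of 200 cell positions.
set.seed(1)
idxA <- sort(sample(nCells, 200))

# Re-seeding with the same value reproduces the same subset.
set.seed(1)
idxB <- sort(sample(nCells, 200))

identical(idxA, idxB)

# The fixed index can then be reused across runs, e.g.:
# pbmc <- optimizeSubset(pbmc, cellIdx = idxA, nIteration = 2)
```

Storing the index in a variable before the call also makes it easy to record which cells were used.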