Quantile align (normalize) factor loadings — quantileAlignSNF • rliger

This function is deprecated. Use 'quantileNorm' instead.

Usage

quantileAlignSNF(
  object,
  knn_k = 20,
  k2 = 500,
  prune.thresh = 0.2,
  ref_dataset = NULL,
  min_cells = 20,
  quantiles = 50,
  nstart = 10,
  resolution = 1,
  dims.use = 1:ncol(x = object@H[[1]]),
  dist.use = "CR",
  center = FALSE,
  small.clust.thresh = 0,
  id.number = NULL,
  print.mod = FALSE,
  print.align.summary = FALSE
)

Arguments

object: liger object. Should run optimizeALS before calling.
knn_k: Number of nearest neighbors for within-dataset knn graph (default 20).
k2: Horizon parameter for shared nearest factor graph. Distances to all but the k2 nearest neighbors are set to 0 (cuts down on memory usage for very large graphs). (default 500)
prune.thresh: Minimum allowed edge weight. Any edges below this are removed (given weight 0) (default 0.2)
ref_dataset: Name of dataset to use as a "reference" for normalization. By default, the dataset with the largest number of cells is used.
min_cells: Minimum number of cells to consider a cluster shared across datasets (default 20)
quantiles: Number of quantiles to use for quantile normalization (default 50).
nstart: Number of times to perform Louvain community detection with different random starts (default 10).
resolution: Controls the number of communities detected. Higher resolution -> more communities. (default 1)
dims.use: Indices of factors to use for shared nearest factor determination (default 1:ncol(H[[1]])).
dist.use: Distance metric to use in calculating nearest neighbors (default "CR").
center: Centers the data when scaling factors (useful for less sparse modalities like methylation data). (default FALSE)
small.clust.thresh: Before regular alignment, extracts small clusters (those with fewer cells than this threshold) that load highly on a single factor (default 0 -- no small cluster extraction).
id.number: Number to use for identifying edge file (when running in parallel) (generates random value by default).
print.mod: Print modularity output from clustering algorithm (default FALSE).
print.align.summary: Print summary of clusters which did not align normally (default FALSE).

Value

liger object with H.norm and cluster slots set.

Details

This process builds a shared factor neighborhood graph to jointly cluster cells, then quantile normalizes corresponding clusters.

The first step, building the shared factor neighborhood graph, is performed in SNF(). It produces a graph representation in which edge weights between cells (across all datasets) correspond to their similarity in the shared factor neighborhood space. An important parameter here is knn_k, the number of neighbors used to build the shared factor space (see SNF()). Modularity-based community detection (Louvain clustering) is then performed on this graph to identify clusters shared across datasets. The method was first developed by Waltman and van Eck (2013) and source code is available at http://www.ludowaltman.nl/slm/. The most important parameter here is resolution, which controls the number of communities detected: higher values yield more communities.
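To make the role of knn_k more concrete, the following base R sketch shows one plausible way a "factor neighborhood" could be summarized for a single dataset. It is not the code inside SNF(): the toy matrix H, the Euclidean distance (rather than dist.use = "CR"), and the names max_factor, neighbors and profile are all assumptions made for illustration.

```r
# Illustrative sketch only; this is NOT rliger's SNF() implementation.
set.seed(1)
H <- matrix(runif(30 * 4), nrow = 30, ncol = 4)  # toy cells-by-factors loadings
knn_k <- 5                                        # neighbors per cell (toy value)

# Label each cell by the factor on which it loads most highly.
max_factor <- max.col(H, ties.method = "first")

# Within-dataset Euclidean distances (an assumption), excluding self-matches.
d <- as.matrix(dist(H))
diag(d) <- Inf

# Indices of the knn_k nearest neighbors of each cell (one row per cell).
neighbors <- t(apply(d, 1, function(row) order(row)[seq_len(knn_k)]))

# For every cell, count how many of its neighbors carry each factor label.
profile <- t(apply(neighbors, 1, function(idx) {
  tabulate(max_factor[idx], nbins = ncol(H))
}))
```

Each row of profile sums to knn_k, so a larger knn_k gives a smoother, less local summary of a cell's neighborhood, while a smaller value emphasizes local structure.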

Next we perform quantile alignment for each dataset, factor, and cluster (by stretching/compressing datasets' quantiles to better match those of the reference dataset). These aligned factor loadings are combined into a single matrix and returned as H.norm.
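The stretching and compressing of quantiles can be sketched in base R for a single factor within a single cluster. This is a minimal illustration under stated assumptions, not the package's implementation: align_to_reference is a hypothetical helper, the data are simulated, and the piecewise-linear interpolation between quantiles is one reasonable choice among several.

```r
# Hypothetical helper: map the values in x onto the quantiles of ref.
align_to_reference <- function(x, ref, quantiles = 50) {
  probs <- seq(0, 1, length.out = quantiles + 1)
  q_x   <- quantile(x, probs, names = FALSE)    # quantiles of the dataset to align
  q_ref <- quantile(ref, probs, names = FALSE)  # quantiles of the reference dataset
  # Piecewise-linear map sending each quantile of x to the matching quantile of ref.
  approx(q_x, q_ref, xout = x, ties = mean, rule = 2)$y
}

set.seed(1)
ref <- rexp(200, rate = 1)    # simulated loadings in the reference dataset
x   <- rexp(150, rate = 0.5)  # simulated loadings in another dataset, on a wider scale
aligned <- align_to_reference(x, ref)
```

After alignment the values in aligned span the same range as ref while keeping the ordering of the cells in x. In the function documented here, the analogous step is repeated for each dataset, factor, and cluster.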

Examples

if (FALSE) {
# liger object, factorization complete
ligerex
# do basic quantile alignment
ligerex <- quantileAlignSNF(ligerex)
# higher resolution for more clusters (note that SNF is conserved)
ligerex <- quantileAlignSNF(ligerex, resolution = 1.2)
# change knn_k for more fine-grained local clustering
ligerex <- quantileAlignSNF(ligerex, knn_k = 15, resolution = 1.2)
}