This process builds a shared factor neighborhood graph to jointly cluster cells, then quantile normalizes corresponding clusters.
The first step, building the shared factor neighborhood graph, is performed
in SNF(), and produces a graph representation where edge weights between
cells (across all datasets) correspond to their similarity in the shared
factor neighborhood space. An important parameter here is nNeighbors,
the number of neighbors used to build the shared factor space.
Next we perform quantile alignment for each dataset, factor, and cluster (by stretching/compressing datasets' quantiles to better match those of the reference dataset).
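The quantile alignment idea can be illustrated with a small base R sketch. It is a simplification under stated assumptions, not the package's implementation: it aligns a single vector of factor loadings to a reference vector, whereas the function does this per dataset, factor, and cluster. The helper name alignToReference and the simulated data are hypothetical.

```r
# Sketch: map `values` onto the distribution of `ref` via matched quantiles.
# `alignToReference` is a hypothetical helper, not part of any package.
alignToReference <- function(values, ref, nQuantiles = 50) {
  probs <- seq(0, 1, length.out = nQuantiles + 1)
  qValues <- quantile(values, probs = probs, names = FALSE)
  qRef <- quantile(ref, probs = probs, names = FALSE)
  # Piecewise-linear map from the dataset's quantiles to the reference's.
  # ties = "ordered" tolerates repeated quantile values (e.g. many zeros).
  approx(x = qValues, y = qRef, xout = values, rule = 2, ties = "ordered")$y
}

set.seed(1)
ref <- rnorm(500, mean = 0, sd = 1)    # loadings in the reference dataset
other <- rnorm(400, mean = 3, sd = 2)  # loadings in another dataset
aligned <- alignToReference(other, ref)

# After alignment the two distributions should roughly agree.
summary(ref)
summary(aligned)
```

The mapping is monotone, so the rank order of cells within the aligned dataset is preserved; only the scale and shape of the distribution are stretched or compressed toward the reference.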
Usage
quantileNorm(object, ...)

# S3 method for liger
quantileNorm(
  object,
  quantiles = 50,
  reference = NULL,
  minCells = 20,
  nNeighbors = 20,
  useDims = NULL,
  center = FALSE,
  maxSample = 1000,
  eps = 0.9,
  refineKNN = TRUE,
  clusterName = "quantileNorm_cluster",
  seed = 1,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)

# S3 method for Seurat
quantileNorm(
  object,
  reduction = "inmf",
  quantiles = 50,
  reference = NULL,
  minCells = 20,
  nNeighbors = 20,
  useDims = NULL,
  center = FALSE,
  maxSample = 1000,
  eps = 0.9,
  refineKNN = TRUE,
  clusterName = "quantileNorm_cluster",
  seed = 1,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)
Arguments
- object
A liger or Seurat object with a valid factorization result available (i.e. runIntegration performed in advance).
- ...
Arguments passed to other S3 methods of this function.
- quantiles
Number of quantiles to use for quantile normalization. Default 50.
- reference
Character, numeric or logical selection of one dataset, out of all available datasets in object, to use as a "reference" for quantile normalization. Default NULL tries to find an RNA dataset with the largest number of cells; if no RNA dataset is available, the globally largest dataset is used.
- minCells
Minimum number of cells to consider a cluster shared across datasets. Default 20.
- nNeighbors
Number of nearest neighbors for the within-dataset KNN graph. Default 20.
- useDims
Indices of factors to use for shared nearest factor determination. Default NULL uses all factors.
- center
Whether to center the data when scaling factors. Could be useful for less sparse modalities like methylation data. Default FALSE.
- maxSample
Maximum number of cells used for quantile normalization of each cluster and factor. Default 1000.
- eps
The error bound of the nearest neighbor search. Lower values give more accurate nearest neighbor graphs but take much longer to compute. Default 0.9.
- refineKNN
Whether to increase robustness of cluster assignments using the KNN graph. Default TRUE.
- clusterName
Variable name that will store the clustering result in the metadata of a liger object or a Seurat object. Default "quantileNorm_cluster".
- seed
Random seed to allow reproducible results. Default 1.
- verbose
Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if the option has not been set.
- reduction
Name of the reduction where the LIGER integration result is stored. Default "inmf".
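The default behavior described for reference (prefer the largest RNA dataset, otherwise the largest dataset overall) can be sketched in base R. This is only an illustration of the stated rule: the data frame layout, the modality labels, and the helper pickReference are hypothetical and do not reflect how the package stores or inspects datasets.

```r
# Hypothetical summary table of datasets (not an rliger structure).
datasets <- data.frame(
  name = c("rna1", "atac1", "rna2"),
  modality = c("rna", "atac", "rna"),
  nCells = c(3000, 8000, 5000),
  stringsAsFactors = FALSE
)

# Sketch of the documented default: largest RNA dataset if any exists,
# otherwise the largest dataset of any modality.
pickReference <- function(datasets) {
  rna <- datasets[datasets$modality == "rna", , drop = FALSE]
  candidates <- if (nrow(rna) > 0) rna else datasets
  candidates$name[which.max(candidates$nCells)]
}

pickReference(datasets)  # "rna2": the ATAC dataset is larger but not RNA
```

Passing reference explicitly overrides this choice, which can matter when the largest RNA dataset is not the one whose distribution the others should match.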
Value
Updated input object
liger method
- Update the H.norm slot with the aligned cell factor loading, ready for running graph-based community detection clustering or dimensionality reduction for visualization.
- Update the cellMeta slot with a cluster assignment based on cell factor loading.
Seurat method
- Update the reductions slot with a new DimReduc object containing the aligned cell factor loading.
- Update the metadata with a cluster assignment based on cell factor loading.
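A call might look like the following sketch. It assumes the function is provided by the rliger package and that the input already holds a factorization result; the wrapper alignDatasets is a hypothetical helper used only to show how the documented arguments are passed, and the argument values are the documented defaults.

```r
# Sketch of a call on an object that already holds a factorization
# (i.e. runIntegration has been run). `alignDatasets` is hypothetical.
alignDatasets <- function(object, reference = NULL, nNeighbors = 20) {
  # Assumption: quantileNorm() comes from the rliger package.
  if (!requireNamespace("rliger", quietly = TRUE)) {
    stop("The rliger package is required for this sketch.")
  }
  rliger::quantileNorm(
    object,
    reference = reference,    # NULL lets the function pick a reference
    nNeighbors = nNeighbors,  # neighbors for the within-dataset KNN graph
    quantiles = 50,
    clusterName = "quantileNorm_cluster",
    seed = 1                  # fixed seed for reproducible results
  )
}
```

The returned object is the updated input, so the result would typically be assigned back to the same variable before clustering or visualization.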