
This process builds a shared factor neighborhood graph to jointly cluster cells, then quantile normalizes corresponding clusters.

The first step, building the shared factor neighborhood graph, is performed in SNF(), and produces a graph representation where edge weights between cells (across all datasets) correspond to their similarity in the shared factor neighborhood space. An important parameter here is nNeighbors, the number of neighbors used to build the shared factor space.
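The idea behind a factor neighborhood can be sketched in base R. This is a simplified illustration only, not the code inside SNF(): the helper name `snfProfile`, the toy matrix `H`, and the use of exact Euclidean distances are all assumptions made for the example (the package searches neighbors within each dataset and uses an approximate search controlled by eps). Each cell is labeled by its highest-loading factor, and its profile records which factor labels occur among its k nearest neighbors; cells with similar profiles would receive a high edge weight.

```r
# Hypothetical helper: per-cell profile of neighbor factor labels.
# H is a cells x factors loading matrix for ONE dataset (needs >= 2 factors).
snfProfile <- function(H, k) {
  # Label each cell by the factor with the largest loading.
  maxFactor <- max.col(H, ties.method = "first")
  # Exact pairwise distances (an assumption; fine for a toy example).
  d <- as.matrix(dist(H))
  diag(d) <- Inf  # a cell is not its own neighbor
  profile <- t(apply(d, 1, function(row) {
    nn <- order(row)[seq_len(k)]              # indices of the k nearest cells
    tabulate(maxFactor[nn], nbins = ncol(H))  # count neighbor labels per factor
  }))
  profile / k  # fraction of neighbors carrying each factor label
}

# Toy loadings: cells 1-3 load on factor 1, cells 4-6 on factor 2.
H <- rbind(
  c(0.9, 0.1), c(0.8, 0.2), c(0.7, 0.3),
  c(0.2, 0.8), c(0.1, 0.9), c(0.3, 0.7)
)
profile <- snfProfile(H, k = 2)
print(profile)
```

In this toy case the two groups of cells end up with non-overlapping profiles, which is the situation in which a joint clustering separates them cleanly. A larger nNeighbors smooths the profiles at the cost of blurring small populations.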

Next we perform quantile alignment for each dataset, factor, and cluster (by stretching/compressing datasets' quantiles to better match those of the reference dataset).
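The stretching/compressing step can be illustrated for a single vector of loadings in base R. This is a minimal sketch under stated assumptions, not the package's implementation: the helper name `quantileAlign` is hypothetical, the way `quantiles` is turned into a probability grid is assumed, and the real function applies the mapping separately for each dataset, factor, and cluster.

```r
# Hypothetical helper: map the values of x onto the quantiles of ref.
quantileAlign <- function(x, ref, quantiles = 50) {
  # Evenly spaced probabilities from 0 to 1 (assumed grid).
  probs <- seq(0, 1, length.out = quantiles + 1)
  qx <- quantile(x, probs, names = FALSE)
  qref <- quantile(ref, probs, names = FALSE)
  # Piecewise-linear map sending each quantile of x to the matching
  # quantile of ref; rule = 2 clamps values outside the observed range.
  approx(qx, qref, xout = x, ties = mean, rule = 2)$y
}

# Toy loadings for one factor within one cluster.
ref <- c(1, 2, 3, 4, 5)   # reference dataset
x <- c(2, 4, 6, 8, 10)    # dataset to be aligned
aligned <- quantileAlign(x, ref, quantiles = 4)
print(aligned)
```

After the mapping, the aligned values share the quantiles of the reference while preserving the rank order of the original cells. More quantiles give a finer mapping, which is only meaningful when each cluster contains enough cells (compare minCells).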

Usage

quantileNorm(object, ...)

# S3 method for liger
quantileNorm(
  object,
  quantiles = 50,
  reference = NULL,
  minCells = 20,
  nNeighbors = 20,
  useDims = NULL,
  center = FALSE,
  maxSample = 1000,
  eps = 0.9,
  refineKNN = TRUE,
  clusterName = "quantileNorm_cluster",
  seed = 1,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)

# S3 method for Seurat
quantileNorm(
  object,
  reduction = "inmf",
  quantiles = 50,
  reference = NULL,
  minCells = 20,
  nNeighbors = 20,
  useDims = NULL,
  center = FALSE,
  maxSample = 1000,
  eps = 0.9,
  refineKNN = TRUE,
  clusterName = "quantileNorm_cluster",
  seed = 1,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)

Arguments

object

A liger or Seurat object with a valid factorization result available (i.e. runIntegration performed in advance).

...

Arguments passed to other S3 methods of this function.

quantiles

Number of quantiles to use for quantile normalization. Default 50.

reference

Character, numeric or logical selection of one dataset, out of all available datasets in object, to use as a "reference" for quantile normalization. Default NULL tries to find an RNA dataset with the largest number of cells; if no RNA dataset is available, the globally largest dataset is used.

minCells

Minimum number of cells to consider a cluster shared across datasets. Default 20.

nNeighbors

Number of nearest neighbors for the within-dataset KNN graph. Default 20.

useDims

Indices of factors to use for shared nearest factor determination. Default NULL uses all factors.

center

Whether to center the data when scaling factors. Could be useful for less sparse modalities like methylation data. Default FALSE.

maxSample

Maximum number of cells used for quantile normalization of each cluster and factor. Default 1000.

eps

The error bound of the nearest neighbor search. Lower values give more accurate nearest neighbor graphs but take much longer to compute. Default 0.9.

refineKNN

Whether to increase robustness of cluster assignments using a KNN graph. Default TRUE.

clusterName

Variable name that will store the clustering result in the metadata of a liger object or a Seurat object. Default "quantileNorm_cluster".

seed

Random seed to allow reproducible results. Default 1.

verbose

Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if users have not set it.

reduction

Name of the reduction where LIGER integration result is stored. Default "inmf".

Value

Updated input object

  • liger method

    • Update the H.norm slot with the aligned cell factor loading, ready for running graph-based community detection clustering or dimensionality reduction for visualization.

    • Update the cellMeta slot with a cluster assignment based on cell factor loading.

  • Seurat method

    • Update the reductions slot with a new DimReduc object containing the aligned cell factor loading.

    • Update the metadata with a cluster assignment based on cell factor loading.

Examples

pbmc <- quantileNorm(pbmcPlot)
#>  Using largest dataset of recommended type as reference: "ctrl" with 300 cells