This function scales normalized gene expression data after variable genes have been selected. The data are not mean-centered before scaling, so that the values stay non-negative as NMF requires. The computation applied to each normalized dataset matrix is given by the following equation:
$$S_{i,j}=\frac{N_{i,j}}{\sqrt{\sum_{p=1}^{n}\frac{N_{i,p}^2}{n-1}}}$$
Where \(N\) denotes the normalized matrix for an individual dataset, \(S\) is the output scaled matrix for this dataset, and \(n\) is the number of cells in this dataset. \(i\) and \(j\) denote the gene and cell indices, respectively, and \(p\) is the cell iterator.
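The equation can be checked by hand on a small matrix. The following is a minimal base R sketch, not the package's implementation: the matrix values, the gene and cell names, and the variable names (N, geneScale, S, S_ref) are made up for illustration, and it assumes every gene has at least one non-zero value so the denominator is never zero.

```r
# Toy normalized matrix: 3 genes (rows) x 4 cells (columns); values are made up.
N <- matrix(
  c(0.0, 0.2, 0.4, 0.1,
    0.5, 0.0, 0.3, 0.2,
    0.1, 0.1, 0.0, 0.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

n <- ncol(N)                               # number of cells in this dataset
geneScale <- sqrt(rowSums(N^2) / (n - 1))  # per-gene denominator from the equation
S <- N / geneScale                         # recycling divides element [i, j] by geneScale[i]

# Cross-check with base R: scale() with center = FALSE divides each column by
# its root mean square, sqrt(sum(x^2) / (n - 1)), so we transpose in and out.
S_ref <- t(scale(t(N), center = FALSE, scale = TRUE))
isTRUE(all.equal(S, S_ref, check.attributes = FALSE))
```

Because nothing is subtracted from the data, every entry of S stays non-negative, which is the property NMF relies on.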
See the Methylation dataset section below for details on how methylation datasets are handled.
Usage
scaleNotCenter(object, ...)
# S3 method for dgCMatrix
scaleNotCenter(object, ...)
# S3 method for ligerDataset
scaleNotCenter(
object,
features = NULL,
chunk = 1000,
verbose = getOption("ligerVerbose", TRUE),
...
)
# S3 method for ligerMethDataset
scaleNotCenter(
object,
features = NULL,
verbose = getOption("ligerVerbose", TRUE),
...
)
# S3 method for liger
scaleNotCenter(
object,
useDatasets = NULL,
features = varFeatures(object),
verbose = getOption("ligerVerbose", TRUE),
remove.missing = NULL,
...
)
# S3 method for Seurat
scaleNotCenter(
object,
assay = NULL,
layer = "ligerNormData",
save = "ligerScaleData",
datasetVar = "orig.ident",
features = NULL,
...
)
Arguments
- object
liger object, ligerDataset object, dgCMatrix, or a Seurat object.
- ...
Arguments passed to other methods. The call order is: the "liger" method calls the "ligerDataset" method, which then calls the "dgCMatrix" method. The "Seurat" method calls the "dgCMatrix" method directly.
- features
Character, numeric or logical index that chooses the variable features to be scaled. The "liger" method by default uses varFeatures(object). The "ligerDataset" method by default uses all features. The "Seurat" method by default uses Seurat::VariableFeatures(object).
- chunk
Integer. Maximum number of cells in each chunk when scaling is applied to any HDF5-based dataset. Default 1000.
- verbose
Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if users have not set it.
- useDatasets
A character vector of the names, or a numeric or logical vector of the indices, of the datasets to be scaled but not centered. Default NULL applies to all datasets.
- remove.missing
Deprecated. This functionality is covered by other parts of the workflow and is no longer needed. Will be ignored if specified.
- assay
Name of the assay to use. Default NULL uses the current active assay.
- layer
For Seurat>=4.9.9, the name of the layer to retrieve normalized data from. Default "ligerNormData". For older Seurat, the data is always retrieved from the data slot.
- save
For Seurat>=4.9.9, the name of the layer to store the scaled data in. Default "ligerScaleData". For older Seurat, the result is stored in the scale.data slot.
- datasetVar
Metadata variable name that stores the dataset source annotation. Default "orig.ident".
Value
Updated object
dgCMatrix method - Returns scaled dgCMatrix object
ligerDataset method - Updates the scaleData and scaledUnsharedData (if unshared variable features are available) slots of the object
liger method - Updates the scaleData and scaledUnsharedData (if unshared variable features are available) slots of the chosen datasets
Seurat method - Adds a named layer in the chosen assay (V5), or updates the scale.data slot of the chosen assay (<=V4)
Note
Since the scaling of genes is applied on a per-dataset basis, other scaling
methods that operate on a single concatenated matrix of multiple datasets
are not equivalent alternatives, even if options like center are set to
FALSE. Hence we implemented an efficient solution that works in this
situation, provided through the Seurat S3 method.
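To illustrate why scaling a concatenated matrix is not equivalent, here is a small base R sketch. The helper scaleRows is hypothetical (it simply applies the equation from the top of this page) and the two toy datasets are made up; it is not how the package or Seurat performs the computation.

```r
# Hypothetical helper: scale each gene (row) without centering, as in the equation above.
scaleRows <- function(N) N / sqrt(rowSums(N^2) / (ncol(N) - 1))

# Two toy datasets sharing the same 2 genes; dataset "b" has much larger values for gene 1.
a <- matrix(c(1, 2, 3,
              2, 2, 2), nrow = 2, byrow = TRUE)
b <- matrix(c(10, 20, 30, 40,
               1,  0,  1,  0), nrow = 2, byrow = TRUE)

perDataset   <- cbind(scaleRows(a), scaleRows(b))  # scale each dataset, then combine
concatenated <- scaleRows(cbind(a, b))             # combine first, then scale

# The two results differ: in the concatenated case, the large values in "b"
# dominate the denominator and shrink gene 1 in "a".
isTRUE(all.equal(perDataset, concatenated))
```

With per-dataset scaling, each gene's sum of squares within a dataset equals that dataset's cell count minus one, regardless of how large the values are in the other datasets.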
Methylation dataset
Because gene body mCH proportions are negatively correlated with gene
expression level in neurons, we need to reverse the direction of the
methylation data before performing the integration. We do this by simply
subtracting all values from the maximum methylation value. The resulting
values are positively correlated with gene expression. This is only
applied to the variable genes selected beforehand. Please make sure that the
argument modal is set accordingly when running createLiger. This way, the
function can automatically detect methylation datasets and take the proper
action. If it is not set, users can still manually obtain the equivalent
processing by running
scaleNotCenter(lig, useDatasets = c("other", "datasets")), and then
reverseMethData(lig, useDatasets = c("meth", "datasets")).
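The reversal step can be pictured with a small base R sketch. It assumes that "the maximum methylation value" means the single largest value in the dataset's matrix, and the meth and expr matrices below are made-up toy values; the package's own implementation may differ in detail.

```r
# Toy gene-body methylation matrix: 2 genes x 4 cells; values are made up.
meth <- matrix(c(0.9, 0.7, 0.4, 0.1,
                 0.2, 0.3, 0.6, 0.8), nrow = 2, byrow = TRUE)
# Toy expression values for the same cells, anti-correlated with meth by construction.
expr <- matrix(c(1, 3, 6, 9,
                 8, 7, 4, 2), nrow = 2, byrow = TRUE)

# Assumption: subtract every value from the matrix-wide maximum.
reversed <- max(meth) - meth

corBefore <- cor(meth[1, ], expr[1, ])      # negative
corAfter  <- cor(reversed[1, ], expr[1, ])  # positive, same magnitude
```

Subtracting from the maximum rather than negating flips the direction of the correlation while keeping all values non-negative, which is what NMF requires.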
Examples
pbmc <- normalize(pbmc)
#> ℹ Normalizing datasets "ctrl"
#> ℹ Normalizing datasets "stim"
#> ✔ Normalizing datasets "stim" ... done
#>
#> ℹ Normalizing datasets "ctrl"
#> ✔ Normalizing datasets "ctrl" ... done
#>
pbmc <- selectGenes(pbmc)
#> ℹ Selecting variable features for dataset "ctrl"
#> ✔ ... 168 features selected out of 249 shared features.
#> ℹ Selecting variable features for dataset "stim"
#> ✔ ... 166 features selected out of 249 shared features.
#> ✔ Finally 173 shared variable features are selected.
pbmc <- scaleNotCenter(pbmc)
#> ℹ Scaling dataset "ctrl"
#> ✔ Scaling dataset "ctrl" ... done
#>
#> ℹ Scaling dataset "stim"
#> ✔ Scaling dataset "stim" ... done
#>