Impute the peak counts from gene expression data referring to an ATAC dataset after integration
Source: R/ATAC.R
imputeKNN.Rd
This function creates peak data for a dataset that contains only gene expression. It uses the aligned cell factor loadings to find nearest neighbors between cells from the query dataset (without peaks) and cells from the reference dataset (with peaks), then imputes peak counts for the query cells based on the neighbor weights. The selected reference dataset must therefore be of the "atac" modality.
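The idea described above can be sketched in base R on toy data. This is only an illustration under stated assumptions, not the package's implementation: the matrices `hRef`, `hQuery` and `peakRef`, the helper `imputeOne()`, and the distance-decaying weight `exp(-d)` are all hypothetical choices made for the example.

```r
# Toy sketch of KNN-based peak imputation (hypothetical data, base R only).
set.seed(1)
nRef <- 6; nQuery <- 4; nFactors <- 5; nPeaks <- 8; nNeighbors <- 3

# Aligned factor loadings: rows are cells, columns are factors.
hRef   <- matrix(runif(nRef * nFactors), nrow = nRef)
hQuery <- matrix(runif(nQuery * nFactors), nrow = nQuery)

# Peak counts of the reference cells: rows are peaks, columns are cells.
peakRef <- matrix(rpois(nPeaks * nRef, lambda = 2), nrow = nPeaks)

# Euclidean distances: query cells in rows, reference cells in columns.
d <- as.matrix(dist(rbind(hQuery, hRef)))[seq_len(nQuery),
                                          nQuery + seq_len(nRef)]

# Impute one query cell from its nearest reference cells.
# Assumption: weights decay as exp(-distance); with weight = FALSE
# every neighbor contributes equally.
imputeOne <- function(distToRef, weight = TRUE) {
  nn <- order(distToRef)[seq_len(nNeighbors)]
  w <- if (weight) exp(-distToRef[nn]) else rep(1, nNeighbors)
  w <- w / sum(w)  # weights sum to 1
  as.vector(peakRef[, nn, drop = FALSE] %*% w)
}

imputed <- apply(d, 1, imputeOne)                       # peaks x query cells
imputedFlat <- apply(d, 1, imputeOne, weight = FALSE)   # unweighted variant
```

Because the weights are non-negative and sum to one, each imputed value is an average of the neighbors' counts and so stays within the range of the reference data.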
Usage
imputeKNN(
object,
reference,
queries = NULL,
nNeighbors = 20,
weight = TRUE,
norm = TRUE,
scale = FALSE,
verbose = getOption("ligerVerbose", TRUE),
...,
knn_k = nNeighbors
)
Arguments
- object
liger object with aligned factor loadings computed in advance.
- reference
Name of a dataset containing peak data to impute into query dataset(s).
- queries
Names of datasets to be augmented by imputation. Should not include reference. Default NULL uses all datasets except the reference.
- nNeighbors
The maximum number of nearest neighbors to search. Default 20.
- weight
Logical. Whether to use KNN distances as weight matrix. Default TRUE.
- norm
Logical. Whether to normalize the imputed data. Default TRUE.
- scale
Logical. Whether to scale but not center the imputed data. Default FALSE.
- verbose
Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if the option has not been set.
- ...
Optional arguments to be passed to normalize when norm = TRUE.
- knn_k
Deprecated. See Usage section for replacement.
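To make "scale but not center" concrete, here is a minimal base R sketch on a hypothetical toy matrix `x`. It assumes the scaling factor is the per-feature root mean square with an n - 1 denominator, which is what base R's `scale()` uses when `center = FALSE`. The exact computation inside the package may differ.

```r
# Hypothetical toy data: rows are cells, columns are features.
x <- matrix(c(1, 2, 3, 4, 0, 8), nrow = 3)

# Base R scaling without centering: divide each column by its
# root mean square, sqrt(sum(x^2) / (n - 1)).
scaled <- scale(x, center = FALSE, scale = TRUE)

# The same computation written out by hand.
rms <- sqrt(colSums(x^2) / (nrow(x) - 1))
manual <- sweep(x, 2, rms, "/")
```

Since no mean is subtracted, zero entries stay zero and non-negative data stays non-negative.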
Value
The input object, where the queried ligerDataset objects in the datasets slot are replaced. These datasets will all be converted to the ligerATACDataset class, with an additional slot rawPeak to store the imputed peak counts, and normPeak for normalized imputed peak counts if norm = TRUE.
Examples
bmmc <- normalize(bmmc)
#> ℹ Normalizing datasets "rna"
#> ℹ Normalizing datasets "atac"
#> ✔ Normalizing datasets "atac" ... done
#>
#> ℹ Normalizing datasets "rna"
#> ✔ Normalizing datasets "rna" ... done
#>
bmmc <- selectGenes(bmmc, useDatasets = "rna")
#> ℹ Selecting variable features for dataset "rna"
#> ✔ ... 83 features selected out of 172 shared features.
#> ✔ Finally 83 shared variable features are selected.
bmmc <- scaleNotCenter(bmmc)
#> ℹ Scaling dataset "rna"
#> ✔ Scaling dataset "rna" ... done
#>
#> ℹ Scaling dataset "atac"
#> ✔ Scaling dataset "atac" ... done
#>
if (requireNamespace("RcppPlanc", quietly = TRUE)) {
    bmmc <- runINMF(bmmc, k = 20)
    bmmc <- alignFactors(bmmc)
    bmmc <- normalizePeak(bmmc)
    bmmc <- imputeKNN(bmmc, reference = "atac", queries = "rna")
}
#> ℹ Using largest dataset of recommended type as reference: "rna" with 340 cells
#> ℹ Normalizing peak of dataset: "atac"
#> ✔ Normalizing peak of dataset: "atac" ... done
#>
#> ℹ Imputing 1 query dataset: "rna"
#> ℹ from reference dataset: "atac"
#> ℹ Normalizing peak of dataset: "rna"
#> ✔ Normalizing peak of dataset: "rna" ... done
#>