Linking genes to putative regulatory elements

Evaluates the relationship between each gene-peak pair using the specified correlation method. Typically used to infer the correlation between gene expression and imputed peak counts in datasets that did not originally contain the peak modality (i.e. applied to the result of imputeKNN).

Usage

linkGenesAndPeaks(
  object,
  useDataset,
  pathToCoords,
  useGenes = NULL,
  method = c("spearman", "pearson", "kendall"),
  alpha = 0.05,
  verbose = getOption("ligerVerbose", TRUE),
  path_to_coords = pathToCoords,
  genes.list = useGenes,
  dist = method
)

Arguments

object: A liger object whose datasets slot contains a dataset of ligerATACDataset class.
useDataset: Name of one dataset, with both normalized gene expression and normalized peak counts available.
pathToCoords: Path to the gene coordinates file, usually a BED file.
useGenes: Character vector of gene names to be tested. Default NULL uses all genes available in useDataset.
method: Type of correlation to calculate, one of "spearman", "pearson" or "kendall". Default "spearman".
alpha: Numeric, significance threshold for correlation p-value. Peak-gene correlations with p-values below this threshold are considered significant. Default 0.05.
verbose: Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if users have not set it.
path_to_coords, genes.list, dist: Deprecated. See Usage section for replacement.
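
To make the roles of method and alpha concrete, here is a minimal base-R sketch of the thresholding idea on simulated data. It is not the package's implementation: the vectors `gene`, `peakLinked` and `peakUnlinked` and the helper `linkScore` are hypothetical, and `stats::cor.test` is only an assumed stand-in for whatever test the function runs internally.

```r
# Illustrative only: simulated values, not rliger internals.
set.seed(1)
nCells <- 200
gene <- rnorm(nCells)                          # hypothetical normalized expression
peakLinked <- gene + rnorm(nCells, sd = 0.5)   # peak that tracks the gene
peakUnlinked <- rnorm(nCells)                  # peak generated independently

# Hypothetical helper: correlation if significant, otherwise 0
linkScore <- function(peak, gene, method = "spearman", alpha = 0.05) {
    test <- cor.test(peak, gene, method = method)
    # Keep the estimate only when the p-value is below alpha
    if (test$p.value < alpha) unname(test$estimate) else 0
}

scores <- c(
    linked = linkScore(peakLinked, gene),
    unlinked = linkScore(peakUnlinked, gene)
)
print(scores)
```

Lowering alpha makes the filter stricter, so fewer gene-peak pairs keep a nonzero value.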

Value

A sparse matrix with peak names as rows and gene names as columns. Element (i, j) is the correlation between peak i and gene j, or 0 if the gene and peak are not significantly linked.
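
As a rough illustration of how such a matrix might be inspected, the sketch below builds a small stand-in with the Matrix package and extracts the nonzero links. The peak names, gene names and correlation values are made up, and the peak naming format is an assumption rather than something the function guarantees.

```r
library(Matrix)

# Toy stand-in for the returned matrix: peaks as rows, genes as columns.
# All names and values here are hypothetical.
corr <- sparseMatrix(
    i = c(1, 3, 2),
    j = c(1, 1, 2),
    x = c(0.42, -0.31, 0.57),
    dims = c(3, 2),
    dimnames = list(
        c("chr1:100-600", "chr1:900-1400", "chr2:50-550"),
        c("GENE_A", "GENE_B")
    )
)

# Peaks with a nonzero (i.e. retained) link to one gene
geneA <- corr[, "GENE_A"]
linkedPeaks <- geneA[geneA != 0]
print(linkedPeaks)

# Long-format table of all retained links
links <- summary(corr)
linkTable <- data.frame(
    peak = rownames(corr)[links$i],
    gene = colnames(corr)[links$j],
    correlation = links$x
)
print(linkTable)
```

Because non-significant pairs are stored as zeros, working with the nonzero entries only keeps this kind of lookup cheap even when many genes and peaks are tested.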

Examples

# \donttest{
if (requireNamespace("RcppPlanc", quietly = TRUE) &&
    requireNamespace("GenomicRanges", quietly = TRUE) &&
    requireNamespace("IRanges", quietly = TRUE) &&
    requireNamespace("psych", quietly = TRUE)) {
    bmmc <- normalize(bmmc)
    bmmc <- selectGenes(bmmc)
    bmmc <- scaleNotCenter(bmmc)
    bmmc <- runINMF(bmmc, miniBatchSize = 100)
    bmmc <- alignFactors(bmmc)
    bmmc <- normalizePeak(bmmc)
    bmmc <- imputeKNN(bmmc, reference = "atac", queries = "rna")
    corr <- linkGenesAndPeaks(
        bmmc, useDataset = "rna",
        pathToCoords = system.file("extdata/hg19_genes.bed", package = "rliger")
    )
}
#> ℹ Normalizing datasets "rna"
#> ℹ Normalizing datasets "atac"
#> ✔ Normalizing datasets "atac" ... done
#> 
#> ℹ Normalizing datasets "rna"

#> ✔ Normalizing datasets "rna" ... done
#> 
#> ℹ Selecting variable features for dataset "rna"
#> ✔ ... 83 features selected out of 172 shared features.
#> ℹ Selecting variable features for dataset "atac"
#> ✔ ... 126 features selected out of 172 shared features.
#> ✔ Finally 135 shared variable features are selected.
#> ℹ Scaling dataset "rna"
#> ✔ Scaling dataset "rna" ... done
#> 
#> ℹ Scaling dataset "atac"
#> ✔ Scaling dataset "atac" ... done
#> 
#> ℹ Using largest dataset of recommended type as reference: "rna" with 340 cells
#> ℹ Normalizing peak of dataset: "atac"
#> ✔ Normalizing peak of dataset: "atac" ... done
#> 
#> ℹ Imputing 1 query dataset: "rna"
#> ℹ from reference dataset: "atac"
#> ℹ Normalizing peak of dataset: "rna"
#> ✔ Normalizing peak of dataset: "rna" ... done
#> 
#> ℹ 172 genes to be tested against 995 peaks
#> ℹ Calculating correlation for gene-peak pairs...
# }
