Skip to contents

Evaluate the relationships between pairs of genes and peaks based on specified distance metric. Usually used for inferring the correlation between gene expression and imputed peak counts for datasets without the modality originally (i.e. applied to imputeKNN result).

Usage

linkGenesAndPeaks(
  object,
  useDataset,
  pathToCoords,
  useGenes = NULL,
  method = c("spearman", "pearson", "kendall"),
  alpha = 0.05,
  verbose = getOption("ligerVerbose", TRUE),
  path_to_coords = pathToCoords,
  genes.list = useGenes,
  dist = method
)

Arguments

object

A liger object, with datasets that is of ligerATACDataset class in the datasets slot.

useDataset

Name of one dataset, with both normalized gene expression and normalized peak counts available.

pathToCoords

Path tothe gene coordinates file, usually a BED file.

useGenes

Character vector of gene names to be tested. Default NULL uses all genes available in useDataset.

method

Choose the type of correlation to calculate, from "spearman", "pearson" and "kendall". Default "spearman"

alpha

Numeric, significance threshold for correlation p-value. Peak-gene correlations with p-values below this threshold are considered significant. Default 0.05.

verbose

Logical. Whether to show information of the progress. Default getOption("ligerVerbose") or TRUE if users have not set.

path_to_coords, genes.list, dist

Deprecated. See Usage section for replacement.

Value

A sparse matrix with peak names as rows and gene names as columns, with each element indicating the correlation between peak i and gene j, 0 if the gene and peak are not significantly linked.

See also

Examples

# \donttest{
if (requireNamespace("RcppPlanc", quietly = TRUE) &&
    requireNamespace("GenomicRanges", quietly = TRUE) &&
    requireNamespace("IRanges", quietly = TRUE) &&
    requireNamespace("psych", quietly = TRUE)) {
    bmmc <- normalize(bmmc)
    bmmc <- selectGenes(bmmc)
    bmmc <- scaleNotCenter(bmmc)
    bmmc <- runINMF(bmmc, miniBatchSize = 100)
    bmmc <- quantileNorm(bmmc)
    bmmc <- normalizePeak(bmmc)
    bmmc <- imputeKNN(bmmc, reference = "atac", queries = "rna")
    corr <- linkGenesAndPeaks(
        bmmc, useDataset = "rna",
        pathToCoords = system.file("extdata/hg19_genes.bed", package = "rliger")
    )
}
#>  Normalizing datasets "rna"
#>  Normalizing datasets "atac"
#>  Normalizing datasets "atac" ... done
#> 
#>  Normalizing datasets "rna"

#>  Normalizing datasets "rna" ... done
#> 
#>  Selecting variable features for dataset "rna"
#>  ... 83 features selected out of 172 shared features.
#>  Selecting variable features for dataset "atac"
#>  ... 126 features selected out of 172 shared features.
#>  Finally 135 shared variable features are selected.
#>  Scaling dataset "rna"
#>  Scaling dataset "rna" ... done
#> 
#>  Scaling dataset "atac"
#>  Scaling dataset "atac" ... done
#> 
#>  Using largest dataset of recommended type as reference: "rna" with 340 cells
#>  Normalizing peak of dataset: "atac"
#>  Normalizing peak of dataset: "atac" ... done
#> 
#>  Imputing 1 query dataset: "rna"
#>  from reference dataset: "atac"
#>  Normalizing peak of dataset: "rna"
#>  Normalizing peak of dataset: "rna" ... done
#> 
#>  172 genes to be tested against 995 peaks
#>  Calculating correlation for gene-peak pairs...
#>  ■■■■■■■■■■■■■■■■■■■■■■■■          77% | ETA:  1s
#>  ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■  100% | ETA:  0s
# }