Find shared and dataset-specific markers — getFactorMarkers • rliger

Filters genes on the shared (\(W\)) and dataset-specific (\(V\)) components of the factorization, then selects the genes that load most significantly on each factor, either in a shared or in a dataset-specific way.

Usage

getFactorMarkers(
  object,
  dataset1,
  dataset2,
  factorShareThresh = 10,
  datasetSpecificity = NULL,
  logFCThresh = 1,
  pvalThresh = 0.05,
  nGenes = 30,
  printGenes = FALSE,
  verbose = getOption("ligerVerbose", TRUE),
  factor.share.thresh = factorShareThresh,
  dataset.specificity = datasetSpecificity,
  log.fc.thresh = logFCThresh,
  pval.thresh = pvalThresh,
  num.genes = nGenes,
  print.genes = printGenes
)

Arguments

object: liger object with factorization results.
dataset1: Name of first dataset. Required.
dataset2: Name of second dataset. Required.
factorShareThresh: Numeric. Only factors with a dataset specificity less than or equal to this threshold will be used. Default 10.
datasetSpecificity: Numeric vector. Pre-calculated dataset specificity, if available. Its length should match the total number of factors. Default NULL, in which case it is calculated automatically with calcDatasetSpecificity.
logFCThresh: Numeric. Lower log-fold change threshold for differential expression in markers. Default 1.
pvalThresh: Numeric. Upper p-value threshold for Wilcoxon rank test for gene expression. Default 0.05.
nGenes: Integer. Max number of genes to report for each dataset. Default 30.
printGenes: Logical. Whether to print ordered markers passing logFC, UMI and frac thresholds, when verbose = TRUE. Default FALSE.
verbose: Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if the option has not been set.
factor.share.thresh, dataset.specificity, log.fc.thresh, pval.thresh, num.genes, print.genes: Deprecated. See the Usage section for their replacements.
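
As a rough illustration of how logFCThresh, pvalThresh and nGenes interact, the sketch below applies the same kind of filtering to a made-up table in base R. It is not the rliger implementation: the gene names and values are hypothetical, and the strictness of the comparisons and the ranking by logFC are assumptions made for the example.

```r
# Toy table standing in for per-gene test results on one factor.
# All gene names and values here are made up for illustration.
candidates <- data.frame(
  feature = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  logFC   = c(2.5, 0.4, 1.8, 3.1, 1.2),
  pval    = c(0.010, 0.001, 0.200, 0.030, 0.049),
  stringsAsFactors = FALSE
)

logFCThresh <- 1     # lower bound on log-fold change
pvalThresh  <- 0.05  # upper bound on p-value
nGenes      <- 2     # cap on the number of genes reported

# Keep genes passing both thresholds (strict comparisons are an assumption)
keep <- candidates$logFC > logFCThresh & candidates$pval < pvalThresh
passing <- candidates[keep, ]

# Rank by decreasing logFC (an assumption) and keep at most nGenes rows
passing <- passing[order(passing$logFC, decreasing = TRUE), ]
selected <- head(passing, nGenes)

print(selected$feature)
```

In this toy table, geneB fails the log-fold change threshold and geneC fails the p-value threshold, so three genes pass and the cap of two keeps geneD and geneA.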

Value

A list object consisting of the following entries:

value of dataset1: data.frame of dataset1-specific markers
shared: data.frame of shared markers
value of dataset2: data.frame of dataset2-specific markers
num_factors_V1: A frequency table giving the number of factors in which each marker appears, for dataset1
num_factors_V2: A frequency table giving the number of factors in which each marker appears, for dataset2
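
To make the idea of a per-marker frequency table concrete, the following base R sketch counts how many factors each gene appears in, using a made-up marker table. The data and the variable names (markers, numFactors, multi) are hypothetical, and this is only an assumption about what such a table looks like, not the code rliger uses to build num_factors_V1 or num_factors_V2.

```r
# Toy marker table: one row per (gene, factor) pair; values are made up.
markers <- data.frame(
  feature    = c("geneA", "geneB", "geneA", "geneC", "geneA", "geneB"),
  factor_num = c(1L, 1L, 2L, 2L, 3L, 3L),
  stringsAsFactors = FALSE
)

# Count the number of factors each gene is a marker for
numFactors <- table(markers$feature)
print(numFactors)

# Genes that are markers for more than one factor
multi <- names(numFactors)[numFactors > 1]
print(multi)
```

A gene with a high count in such a table marks several factors at once, which can be worth checking before treating it as specific to any single factor.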

Examples

library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
result <- getFactorMarkers(pbmcPlot, dataset1 = "ctrl", dataset2 = "stim")
#> ! Factor 7 did not appear as max in any cell in either dataset
#> 
print(class(result))
#> [1] "list"
print(names(result))
#> [1] "ctrl"           "shared"         "stim"           "num_factors_V1"
#> [5] "num_factors_V2"
result$shared %>% group_by(factor_num) %>% top_n(2, logFC)
#> # A tibble: 38 × 4
#> # Groups:   factor_num [19]
#>    feature  factor_num logFC    pval
#>    <chr>         <int> <dbl>   <dbl>
#>  1 DUSP2             1  6.52 1      
#>  2 NPM1              1  7.15 0.889  
#>  3 ID3               2  3.42 0.161  
#>  4 CD83              2  4.29 0.620  
#>  5 IL1B              3  6.14 0.0377 
#>  6 H2AFZ             3  5.98 0.0942 
#>  7 S100A10           4 10.9  0.653  
#>  8 S100A11           4  7.97 1      
#>  9 MARCKSL1          5  2.94 0.0660 
#> 10 IL8               5 10.9  0.00408
#> # ℹ 28 more rows
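
The grouped selection above can also be done without dplyr. The sketch below is a base R variant that rebuilds only the ten rows printed above (not the full result$shared table), picks the top gene per factor by logFC, and then keeps the rows with a p-value below 0.05. The names shown, top1 and significant are introduced here for illustration.

```r
# The ten rows printed above, re-entered by hand; the remaining rows of
# result$shared are not reproduced here.
shown <- data.frame(
  feature    = c("DUSP2", "NPM1", "ID3", "CD83", "IL1B",
                 "H2AFZ", "S100A10", "S100A11", "MARCKSL1", "IL8"),
  factor_num = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L),
  logFC      = c(6.52, 7.15, 3.42, 4.29, 6.14, 5.98, 10.9, 7.97, 2.94, 10.9),
  pval       = c(1, 0.889, 0.161, 0.620, 0.0377, 0.0942, 0.653, 1, 0.0660, 0.00408),
  stringsAsFactors = FALSE
)

# Split by factor and take the row with the largest logFC in each group
byFactor <- split(shown, shown$factor_num)
top1 <- do.call(rbind, lapply(byFactor, function(df) df[which.max(df$logFC), ]))
print(top1$feature)
# → "NPM1" "CD83" "IL1B" "S100A10" "IL8"

# Keep only rows whose p-value is below 0.05
significant <- shown[shown$pval < 0.05, ]
print(significant$feature)
# → "IL1B" "IL8"
```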