Calculate agreement metric after integration

This metric quantifies how much the factorization and alignment distorts the geometry of the original datasets. The greater the agreement, the less distortion of geometry there is. This is calculated by performing dimensionality reduction on the original and integrated (factorized or plus aligned) datasets, and measuring similarity between the k nearest neighbors for each cell in original and integrated datasets. The Jaccard index is used to quantify similarity, and is the final metric averages across all cells.

Note that for most datasets, the greater the chosen nNeighbor, the greater the agreement in general. Although agreement can theoretically approach 1, in practice it is usually no higher than 0.2-0.3.

Usage

calcAgreement(
  object,
  ndims = 40,
  nNeighbors = 15,
  useRaw = FALSE,
  byDataset = FALSE,
  seed = 1,
  dr.method = NULL,
  k = nNeighbors,
  use.aligned = NULL,
  rand.seed = seed,
  by.dataset = byDataset
)

Arguments

object: liger object. Should call alignFactors before calling.
ndims: Number of factors to produce in NMF. Default 40.
nNeighbors: Number of nearest neighbors to use in calculating Jaccard index. Default 15.
useRaw: Whether to evaluate just factorized \(H\) matrices instead of using aligned \(H.norm\) matrix. Default FALSE uses aligned matrix.
byDataset: Whether to return agreement calculated for each dataset instead of the average for all datasets. Default FALSE.
seed: Random seed to allow reproducible results. Default 1.
dr.method: We no longer support other methods but just NMF.
k, rand.seed, by.dataset: See Usage for replacement.
use.aligned: Use useRaw instead.

Value

A numeric vector of agreement metric. A single value if byDataset = FALSE or each dataset a value otherwise.

Examples

if (requireNamespace("RcppPlanc", quietly = TRUE)) {
    pbmc <- pbmc %>%
    normalize %>%
    selectGenes %>%
    scaleNotCenter %>%
    runINMF %>%
    alignFactors
    calcAgreement(pbmc)
}
#> ℹ Normalizing datasets "ctrl"
#> ℹ Normalizing datasets "stim"
#> ✔ Normalizing datasets "stim" ... done
#> 
#> ℹ Normalizing datasets "ctrl"

#> ✔ Normalizing datasets "ctrl" ... done
#> 
#> ℹ Selecting variable features for dataset "ctrl"
#> ✔ ... 168 features selected out of 249 shared features.
#> ℹ Selecting variable features for dataset "stim"
#> ✔ ... 166 features selected out of 249 shared features.
#> ✔ Finally 173 shared variable features are selected.
#> ℹ Scaling dataset "ctrl"
#> ✔ Scaling dataset "ctrl" ... done
#> 
#> ℹ Scaling dataset "stim"
#> ✔ Scaling dataset "stim" ... done
#> 
#> ℹ Using largest dataset of recommended type as reference: "ctrl" with 300 cells
#> [1] 0.3723238