Create liger object

This function creates a liger object from multiple datasets, which may be supplied in various forms (see rawData).

Always make a copy of any H5AD file before passing it to this function: rliger functions write to the file, after which it can no longer be read back into Python. This will be fixed in the future.
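A minimal sketch of working on a copy is shown below. The helper copyForLiger and the path "path/to/data.h5ad" are hypothetical and only serve as illustration; the copy is placed in a temporary directory by default, so pass a persistent directory for real analyses.

```r
# Hypothetical helper (not part of rliger): copy a file into `dir` under a
# fresh name with the same extension, and return the path of the copy.
copyForLiger <- function(path, dir = tempdir()) {
    stopifnot(file.exists(path))
    ext <- tools::file_ext(path)
    copyPath <- tempfile(tmpdir = dir, fileext = paste0(".", ext))
    if (!file.copy(from = path, to = copyPath)) {
        stop("Could not copy ", path)
    }
    copyPath
}

# Hypothetical input file written from Python; replace with your own path.
h5adPath <- "path/to/data.h5ad"
if (file.exists(h5adPath) && requireNamespace("rliger", quietly = TRUE)) {
    # rliger only ever sees the copy, so the original stays untouched.
    lig <- rliger::createLiger(
        list(sample1 = copyForLiger(h5adPath)),
        formatType = "anndata"
    )
}
```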

Usage

createLiger(
  rawData,
  modal = NULL,
  organism = "human",
  cellMeta = NULL,
  removeMissing = TRUE,
  addPrefix = "auto",
  formatType = "10X",
  anndataX = "X",
  dataName = NULL,
  indicesName = NULL,
  indptrName = NULL,
  genesName = NULL,
  barcodesName = NULL,
  newH5 = TRUE,
  verbose = getOption("ligerVerbose", TRUE),
  ...,
  raw.data = rawData,
  take.gene.union = NULL,
  remove.missing = removeMissing,
  format.type = formatType,
  data.name = dataName,
  indices.name = indicesName,
  indptr.name = indptrName,
  genes.name = genesName,
  barcodes.name = barcodesName
)

Arguments

rawData: Named list of datasets. Required. Allowed elements include a matrix, a Seurat object, a SingleCellExperiment object, an AnnData object, a ligerDataset object or the filename of an HDF5 file. See Details for HDF5 reading.
modal: Character vector for modality setting. Use one string for all datasets, or the same number of strings as the number of datasets. Currently supported options are "default", "rna", "atac", "spatial" and "meth".
organism: Character vector setting the organism used to identify mito, ribo and hemo genes for expression percentage calculation. Use one string for all datasets, or the same number of strings as the number of datasets. Currently supported options are "mouse", "human", "zebrafish", "rat", and "drosophila".
cellMeta: data.frame of metadata at single-cell level. Default NULL.
removeMissing: Logical. Whether to remove cells that do not have any counts from each dataset. Default TRUE.
addPrefix: Logical. Whether to add "datasetName_" as a prefix of cell identifiers (e.g. barcodes) to avoid duplicates across multiple libraries (common with 10X data). Default "auto" detects whether the matrix columns already have the exact prefix. A logical value forces the action.
formatType: Select preset of H5 file structure. Currently available options are "10x" and "anndata". Can be either a single specification for all datasets or a character vector with one entry per dataset.
anndataX: The HDF5 path to the raw count data in an H5AD file. See createH5LigerDataset Details. Default "X".
dataName, indicesName, indptrName: The paths in an H5 file for the raw sparse matrix data. These three types of data stand for the x, i, and p slots of a dgCMatrix-class object. Default NULL uses the formatType preset.
genesName, barcodesName: The paths in an H5 file for the gene names and cell barcodes. Default NULL uses the formatType preset.
newH5: When using HDF5-based data and a subset is created after removing missing cells/features, whether to create new HDF5 files for the subset. Default TRUE. If FALSE, the data will be subset into memory, which can be dangerous for large-scale analysis.
verbose: Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if the option has not been set.
...: Additional slot values that should be directly placed in object.
raw.data, remove.missing, format.type, data.name, indices.name, indptr.name, genes.name, barcodes.name: See Usage section for replacement.
take.gene.union: Will be ignored.
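The "auto" behaviour described for addPrefix can be pictured with the base R sketch below. The helper addPrefixAuto is hypothetical and is not rliger's implementation; it assumes that "already has the exact prefix" means every identifier starts with "datasetName_".

```r
# Hypothetical helper (not part of rliger): add "<datasetName>_" to cell
# identifiers unless every identifier already starts with that exact prefix.
addPrefixAuto <- function(barcodes, datasetName) {
    prefix <- paste0(datasetName, "_")
    if (all(startsWith(barcodes, prefix))) {
        # Assumed rule: leave identifiers alone when all carry the prefix.
        return(barcodes)
    }
    paste0(prefix, barcodes)
}

# Unprefixed barcodes get the dataset name prepended.
addPrefixAuto(c("AAAC-1", "TTTG-1"), "ctrl")
# Barcodes that already carry the prefix are returned unchanged.
addPrefixAuto(c("ctrl_AAAC-1", "ctrl_TTTG-1"), "ctrl")
```

Passing addPrefix = TRUE or addPrefix = FALSE to createLiger skips this kind of detection and forces the action, as stated in the argument description.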

Examples

# Create from raw count matrices
ctrl.raw <- rawData(pbmc, "ctrl")
stim.raw <- rawData(pbmc, "stim")
pbmc1 <- createLiger(list(ctrl = ctrl.raw, stim = stim.raw))
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> ℹ calculating QC for dataset "ctrl"
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "ctrl"

#> ✔ calculating QC for dataset "ctrl" ... done
#> 
#> ! No human mitochondrial gene found in the union of dataset "stim"
#> ℹ calculating QC for dataset "stim"
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "stim"

#> ✔ calculating QC for dataset "stim" ... done
#> 

# Create from H5 files
h5Path <- system.file("extdata/ctrl.h5", package = "rliger")
tempPath <- tempfile(fileext = ".h5")
file.copy(from = h5Path, to = tempPath)
#> [1] TRUE
lig <- createLiger(list(ctrl = tempPath))
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> ℹ calculating QC for dataset "ctrl"
#> 
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "ctrl"

#> ✔ calculating QC for dataset "ctrl" ... done
#> 

# Create from other container object
if (requireNamespace("SeuratObject", quietly = TRUE)) {
    ctrl.seu <- SeuratObject::CreateSeuratObject(ctrl.raw)
    stim.seu <- SeuratObject::CreateSeuratObject(stim.raw)
    pbmc2 <- createLiger(list(ctrl = ctrl.seu, stim = stim.seu))
}
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> ℹ calculating QC for dataset "ctrl"
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "ctrl"

#> ✔ calculating QC for dataset "ctrl" ... done
#> 
#> ! No human mitochondrial gene found in the union of dataset "stim"
#> ℹ calculating QC for dataset "stim"
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "stim"

#> ✔ calculating QC for dataset "stim" ... done
#>
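
The modal and organism arguments accept either one string for all datasets or one string per dataset. The sketch below reuses the pbmc example data from above and assumes the rliger package is installed; the choice of "rna" for both datasets is only for illustration.

```r
if (requireNamespace("rliger", quietly = TRUE)) {
    library(rliger)
    ctrl.raw <- rawData(pbmc, "ctrl")
    stim.raw <- rawData(pbmc, "stim")
    pbmc3 <- createLiger(
        list(ctrl = ctrl.raw, stim = stim.raw),
        modal = c("rna", "rna"),  # one entry per dataset
        organism = "human"        # a single string applies to all datasets
    )
}
```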
