Creates a liger object from multiple datasets, which can be supplied in various forms (see rawData).

Make a copy of any H5AD file before passing it in: rliger functions write to the file, after which it can no longer be read back into Python. This will be fixed in the future.
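
The copy-first workflow can be sketched as follows. The file name pbmc.h5ad is hypothetical, and an empty placeholder file stands in for a real H5AD file so that the copy step runs anywhere; the createLiger() call is left as a comment because it needs a real file.

```r
# Placeholder standing in for a real H5AD file (hypothetical name).
originalPath <- file.path(tempdir(), "pbmc.h5ad")
file.create(originalPath)

# Work on a copy so the original stays readable from Python.
workingPath <- tempfile(fileext = ".h5ad")
copied <- file.copy(from = originalPath, to = workingPath)

# With a real H5AD file, pass the copy rather than the original:
# lig <- createLiger(list(ctrl = workingPath), formatType = "anndata")
```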

Usage

createLiger(
  rawData,
  modal = NULL,
  organism = "human",
  cellMeta = NULL,
  removeMissing = TRUE,
  addPrefix = "auto",
  formatType = "10X",
  anndataX = "X",
  dataName = NULL,
  indicesName = NULL,
  indptrName = NULL,
  genesName = NULL,
  barcodesName = NULL,
  newH5 = TRUE,
  verbose = getOption("ligerVerbose", TRUE),
  ...,
  raw.data = rawData,
  take.gene.union = NULL,
  remove.missing = removeMissing,
  format.type = formatType,
  data.name = dataName,
  indices.name = indicesName,
  indptr.name = indptrName,
  genes.name = genesName,
  barcodes.name = barcodesName
)

Arguments

rawData

Named list of datasets. Required. Allowed elements include a matrix, a Seurat object, a SingleCellExperiment object, an AnnData object, a ligerDataset object or the file name of an HDF5 file. See Details for HDF5 reading.

modal

Character vector for modality setting. Use one string for all datasets, or the same number of strings as the number of datasets. Currently supported options are "default", "rna", "atac", "spatial" and "meth".

organism

Character vector setting the organism used to identify mito, ribo and hemo genes for expression percentage calculation. Use one string for all datasets, or the same number of strings as the number of datasets. Currently supported options are "mouse", "human", "zebrafish", "rat", and "drosophila".

cellMeta

data.frame of metadata at single-cell level. Default NULL.

removeMissing

Logical. Whether to remove cells that do not have any counts from each dataset. Default TRUE.
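
As an illustration of what removing cells without counts means, here is a minimal sketch on a toy matrix using the Matrix package. It only mirrors the description above and is not rliger's internal implementation; the object names are made up for the example.

```r
library(Matrix)

# Toy genes-by-cells count matrix; cell "c2" has no counts at all.
counts <- sparseMatrix(
  i = c(1, 2, 3), j = c(1, 1, 3), x = c(5, 2, 7),
  dims = c(3, 3),
  dimnames = list(c("g1", "g2", "g3"), c("c1", "c2", "c3"))
)

# Keep only cells (columns) that have at least one count.
nonEmpty <- colSums(counts) > 0
filtered <- counts[, nonEmpty, drop = FALSE]
colnames(filtered)
```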

addPrefix

Logical or "auto". Whether to add "datasetName_" as a prefix of cell identifiers (e.g. barcodes) to avoid duplicates across multiple libraries (common with 10X data). The default "auto" detects whether the matrix column names already have the exact prefix. A logical value forces the action.
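
The idea behind the "auto" setting can be sketched in base R. The helper addDatasetPrefix() below is hypothetical and not part of rliger; it only illustrates the check described above (prefix the identifiers unless all of them already start with "datasetName_").

```r
# Hypothetical helper, not part of rliger: prefix barcodes only when needed.
addDatasetPrefix <- function(barcodes, datasetName) {
  prefix <- paste0(datasetName, "_")
  if (all(startsWith(barcodes, prefix))) {
    return(barcodes)  # already prefixed, leave untouched
  }
  paste0(prefix, barcodes)
}

# Unprefixed barcodes get the prefix; prefixed ones are returned unchanged.
fresh <- addDatasetPrefix(c("AAAC-1", "TTTG-1"), "ctrl")
already <- addDatasetPrefix(c("ctrl_AAAC-1", "ctrl_TTTG-1"), "ctrl")
```

Both calls yield the same identifiers, so applying the helper twice does not stack prefixes.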

formatType

Preset of the H5 file structure. Currently available options are "10x" and "anndata". Can be either a single specification for all datasets or a character vector with one entry per dataset.

anndataX

The HDF5 path to the raw count data in an H5AD file. See createH5LigerDataset Details. Default "X".

dataName, indicesName, indptrName

The paths in an H5 file for the raw sparse matrix data. These three types of data stand for the x, i, and p slots of a dgCMatrix-class object. Default NULL uses the formatType preset.
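
For readers unfamiliar with the x, i, and p slots, the following sketch builds a small dgCMatrix with the Matrix package and prints them. The matrix is a toy example and does not come from any H5 file.

```r
library(Matrix)

# 3 genes x 2 cells with three non-zero counts.
m <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2), x = c(4, 1, 9),
                  dims = c(3, 2))

m@x  # non-zero values, stored column by column: 4 1 9
m@i  # zero-based row index of each value:        0 2 1
m@p  # column pointers into x and i:              0 2 3
```

Column k of the matrix owns the entries from position p[k] + 1 to p[k + 1] of x and i (in R's one-based indexing), which is why p has one more element than there are columns.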

genesName, barcodesName

The paths in an H5 file for the gene names and cell barcodes. Default NULL uses the formatType preset.

newH5

When using HDF5-based data and a subset is created after removing missing cells/features, whether to create new HDF5 files for the subset. Default TRUE. If FALSE, the data will be subset into memory, which can be dangerous for large-scale analysis.

verbose

Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if the option has not been set.

...

Additional slot values that should be placed directly in the object.

raw.data, remove.missing, format.type, data.name, indices.name, indptr.name, genes.name, barcodes.name

[Superseded] See Usage section for replacement.

take.gene.union

[Defunct] Will be ignored.

Examples

# Create from raw count matrices
ctrl.raw <- rawData(pbmc, "ctrl")
stim.raw <- rawData(pbmc, "stim")
pbmc1 <- createLiger(list(ctrl = ctrl.raw, stim = stim.raw))
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> calculating QC for dataset "ctrl" ... done
#> ! No human mitochondrial gene found in the union of dataset "stim"
#> Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> calculating QC for dataset "stim" ... done

# Create from H5 files
h5Path <- system.file("extdata/ctrl.h5", package = "rliger")
tempPath <- tempfile(fileext = ".h5")
file.copy(from = h5Path, to = tempPath)
#> [1] TRUE
lig <- createLiger(list(ctrl = tempPath))
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> calculating QC for dataset "ctrl" ... done

# Create from other container objects
if (requireNamespace("SeuratObject", quietly = TRUE)) {
    ctrl.seu <- SeuratObject::CreateSeuratObject(ctrl.raw)
    stim.seu <- SeuratObject::CreateSeuratObject(stim.raw)
    pbmc2 <- createLiger(list(ctrl = ctrl.seu, stim = stim.seu))
}
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> calculating QC for dataset "ctrl" ... done
#> ! No human mitochondrial gene found in the union of dataset "stim"
#> Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> calculating QC for dataset "stim" ... done