This function creates a liger object from multiple datasets of various forms (see rawData).
Make a copy of any H5AD files before passing them in: rliger functions write to the files, after which they can no longer be read back into Python. This will be fixed in the future.
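The precaution above can be sketched as follows. This is only an illustration: the path `path/to/data.h5ad` and the dataset name `sample1` are hypothetical placeholders, and the block does nothing unless that file exists.

```r
library(rliger)

# Hypothetical location of an H5AD file; replace with your own path.
h5adPath <- "path/to/data.h5ad"

if (file.exists(h5adPath)) {
    # Work on a temporary copy so the original stays readable from Python.
    workPath <- tempfile(fileext = ".h5ad")
    stopifnot(file.copy(from = h5adPath, to = workPath))

    # "sample1" is an arbitrary dataset name chosen for this sketch.
    lig <- createLiger(list(sample1 = workPath), formatType = "anndata")
}
```

Only the copy at `workPath` is handed to rliger, so any writes land there rather than in the original file.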
Usage
createLiger(
rawData,
modal = NULL,
organism = "human",
cellMeta = NULL,
removeMissing = TRUE,
addPrefix = "auto",
formatType = "10X",
anndataX = "X",
dataName = NULL,
indicesName = NULL,
indptrName = NULL,
genesName = NULL,
barcodesName = NULL,
newH5 = TRUE,
verbose = getOption("ligerVerbose", TRUE),
...,
raw.data = rawData,
take.gene.union = NULL,
remove.missing = removeMissing,
format.type = formatType,
data.name = dataName,
indices.name = indicesName,
indptr.name = indptrName,
genes.name = genesName,
barcodes.name = barcodesName
)
Arguments
- rawData
Named list of datasets. Required. Allowed elements include a matrix, a Seurat object, a SingleCellExperiment object, an AnnData object, a ligerDataset object, or the filename of an HDF5 file. See Details for HDF5 reading.
- modal
Character vector for modality setting. Use one string for all datasets, or as many strings as there are datasets. Currently supported options are "default", "rna", "atac", "spatial" and "meth".
- organism
Character vector setting the organism used to identify mito, ribo and hemo genes for expression percentage calculation. Use one string for all datasets, or as many strings as there are datasets. Currently supported options are "mouse", "human", "zebrafish", "rat", and "drosophila".
- cellMeta
data.frame of metadata at single-cell level. Default NULL.
- removeMissing
Logical. Whether to remove cells that do not have any counts from each dataset. Default TRUE.
- addPrefix
Logical. Whether to add "datasetName_" as a prefix to cell identifiers (e.g. barcodes) to avoid duplicates across multiple libraries (common with 10X data). Default "auto" detects whether the matrix columns already have the exact prefix. A logical value forces the action.
- formatType
Select preset of H5 file structure. Currently available options are "10x" and "anndata". Can be either a single specification for all datasets or a character vector that matches each dataset.
- anndataX
The HDF5 path to the raw count data in an H5AD file. See createH5LigerDataset Details. Default "X".
- dataName, indicesName, indptrName
The paths in an H5 file for the raw sparse matrix data. These three types of data correspond to the x, i, and p slots of a dgCMatrix-class object. Default NULL uses the formatType preset.
- genesName, barcodesName
The paths in an H5 file for the gene names and cell barcodes. Default NULL uses the formatType preset.
- newH5
When using HDF5-based data and subsets are created after removing missing cells/features, whether to create new HDF5 files for the subsets. Default TRUE. If FALSE, data will be subset into memory, which can be dangerous for large-scale analysis.
- verbose
Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if users have not set it.
- ...
Additional slot values that should be directly placed in object.
- raw.data, remove.missing, format.type, data.name, indices.name, indptr.name, genes.name, barcodes.name
Deprecated. See the Usage section for the replacement of each.
- take.gene.union
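As an illustration of the per-dataset arguments described above, the following sketch passes one modal string per dataset and a single organism string shared by all datasets. It assumes the example `pbmc` object shipped with rliger; the choice of `"rna"` for both datasets is made for this sketch only.

```r
library(rliger)

# Raw count matrices taken from the example object shipped with rliger.
ctrl.raw <- rawData(pbmc, "ctrl")
stim.raw <- rawData(pbmc, "stim")

lig <- createLiger(
    list(ctrl = ctrl.raw, stim = stim.raw),
    modal = c("rna", "rna"),  # one string per dataset, in list order
    organism = "human"        # a single string applies to every dataset
)
lig
```

A single modal string, such as `modal = "rna"`, would be applied to every dataset in the same way that `organism` is here.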
Examples
# Create from raw count matrices
ctrl.raw <- rawData(pbmc, "ctrl")
stim.raw <- rawData(pbmc, "stim")
pbmc1 <- createLiger(list(ctrl = ctrl.raw, stim = stim.raw))
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> ℹ calculating QC for dataset "ctrl"
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "ctrl"
#> ✔ calculating QC for dataset "ctrl" ... done
#>
#> ! No human mitochondrial gene found in the union of dataset "stim"
#> ℹ calculating QC for dataset "stim"
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "stim"
#> ✔ calculating QC for dataset "stim" ... done
#>
# Create from H5 files
h5Path <- system.file("extdata/ctrl.h5", package = "rliger")
tempPath <- tempfile(fileext = ".h5")
file.copy(from = h5Path, to = tempPath)
#> [1] TRUE
lig <- createLiger(list(ctrl = tempPath))
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> ℹ calculating QC for dataset "ctrl"
#>
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "ctrl"
#> ✔ calculating QC for dataset "ctrl" ... done
#>
# Create from other container object
if (requireNamespace("SeuratObject", quietly = TRUE)) {
    ctrl.seu <- SeuratObject::CreateSeuratObject(ctrl.raw)
    stim.seu <- SeuratObject::CreateSeuratObject(stim.raw)
    pbmc2 <- createLiger(list(ctrl = ctrl.seu, stim = stim.seu))
}
#> ! No human mitochondrial gene found in the union of dataset "ctrl"
#> ℹ calculating QC for dataset "ctrl"
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "ctrl"
#> ✔ calculating QC for dataset "ctrl" ... done
#>
#> ! No human mitochondrial gene found in the union of dataset "stim"
#> ℹ calculating QC for dataset "stim"
#> ℹ Updated QC variables: "nUMI", "nGene", "mito", "ribo", and "hemo"
#> ℹ calculating QC for dataset "stim"
#> ✔ calculating QC for dataset "stim" ... done
#>