For convenience, the default formatType = "10x" directly fits the structure of cellranger output. formatType = "anndata" works for the current AnnData H5AD file specification (see Details). If a customized H5 file structure is presented, any of rawData, indicesName, indptrName, genesName and barcodesName should be specified accordingly to override the formatType preset.
Always work on a copy of an H5AD file: rliger functions write to the file they are given, and a file modified this way can no longer be read back into Python. This will be fixed in the future.
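The copy step can be sketched in base R as below. The helper copyForLiger is hypothetical (it is not part of rliger), and the stand-in file only exists so that the sketch runs anywhere; substitute the path of a real H5AD file in practice.

```r
# Hypothetical helper (not part of rliger): copy an H5/H5AD file to a
# temporary location, keeping the file extension, and return the new path.
copyForLiger <- function(path) {
    stopifnot(file.exists(path))
    ext <- tools::file_ext(path)
    copyPath <- tempfile(fileext = if (nzchar(ext)) paste0(".", ext) else "")
    ok <- file.copy(from = path, to = copyPath)
    if (!ok) stop("Could not copy ", path)
    copyPath
}

# Stand-in for a real H5AD file so that the sketch is self-contained.
original <- tempfile(fileext = ".h5ad")
writeBin(as.raw(1:10), original)

# Only the copy should ever be handed to rliger; the original stays untouched.
workingCopy <- copyForLiger(original)

# With a real H5AD file, the copy would then be loaded with:
# ld <- rliger::createH5LigerDataset(workingCopy, formatType = "anndata")
```

Keeping the original out of reach of any rliger call means the file that Python reads is never modified.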
Usage
createH5LigerDataset(
  h5file,
  formatType = "10x",
  rawData = NULL,
  normData = NULL,
  scaleData = NULL,
  barcodesName = NULL,
  genesName = NULL,
  indicesName = NULL,
  indptrName = NULL,
  anndataX = "X",
  modal = c("default", "rna", "atac", "spatial", "meth"),
  featureMeta = NULL,
  ...
)
Arguments
- h5file
Filename of an H5 file.
- formatType
Select preset of H5 file structure. Default "10x". Alternatively, "anndata" is supported for H5AD files.
- rawData, indicesName, indptrName
The paths in an H5 file for the raw sparse matrix data. These three types of data stand for the x, i and p slots of a dgCMatrix-class object. Default NULL uses the formatType preset.
- normData
The path in an H5 file for the "x" vector of the normalized sparse matrix. Default NULL.
- scaleData
The path in an H5 file for the Group that contains the sparse matrix constructing information for the scaled data. Default NULL.
- genesName, barcodesName
The paths in an H5 file for the gene names and cell barcodes. Default NULL uses the formatType preset.
- anndataX
The HDF5 path to the raw count data in an H5AD file. See Details. Default "X".
- modal
Name of modality for this dataset. Currently, the options "default", "rna", "atac", "spatial" and "meth" are supported. Default "default".
- featureMeta
Data frame of feature metadata. Default NULL.
- ...
Additional slot data. See ligerDataset for details. Given values will be placed directly in the corresponding slots.
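Overriding the preset with explicit paths can be sketched as follows, assuming that rliger and hdf5r are installed. The five paths are assumptions about the bundled extdata/ctrl.h5 file (a cellranger-style layout with datasets under matrix/), not values confirmed by this page, so the sketch checks that they exist before using them. For a customized file, replace them with whatever the listing prints.

```r
library(rliger)

# Work on a copy of the example file shipped with rliger.
h5Path <- system.file("extdata/ctrl.h5", package = "rliger")
tempPath <- tempfile(fileext = ".h5")
file.copy(from = h5Path, to = tempPath)

# Assumed layout: these paths are a guess at a cellranger-style file and
# must be adapted to the structure of a customized H5 file.
customPaths <- c(
    rawData = "matrix/data",
    indicesName = "matrix/indices",
    indptrName = "matrix/indptr",
    genesName = "matrix/features/name",
    barcodesName = "matrix/barcodes"
)

# Inspect the file first and confirm that every assumed path is present.
h5 <- hdf5r::H5File$new(tempPath, mode = "r")
print(h5$ls(recursive = TRUE)$name)
present <- vapply(customPaths, function(p) h5$exists(p), logical(1))
h5$close_all()
stopifnot(all(present))

# Explicit paths take the place of the formatType preset.
ld <- createH5LigerDataset(
    tempPath,
    rawData = customPaths[["rawData"]],
    indicesName = customPaths[["indicesName"]],
    indptrName = customPaths[["indptrName"]],
    genesName = customPaths[["genesName"]],
    barcodesName = customPaths[["barcodesName"]]
)
```

Listing the file before constructing the dataset makes a wrong path fail early with a clear message instead of inside the constructor.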
Value
H5-based ligerDataset object
Details
For an H5AD file written from an AnnData object, formatType = "anndata" lets the function infer the proper structure. However, a typical AnnData-based analysis tends to update the adata.X attribute in place, and there is no standard or enforced convention for where the raw count data, which LIGER needs, is stored. Therefore, the argument anndataX is exposed for specifying this location. The default value "X" looks for adata.X. If the raw data is stored in a layer, e.g. adata.layers['count'], then use anndataX = "layers/count". If it is stored in adata.raw.X, then use anndataX = "raw/X". If your AnnData object does not retain the raw counts, you will have to go back to the Python workflow, insert them at the desired location in the object and rewrite the H5AD file, or start again from the upstream source files from which the AnnData object was originally created.
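The mapping from the Python-side location to anndataX can be sketched as below. The helper anndataXFor and the file name adata_copy.h5ad are hypothetical and only serve as illustration; the call to createH5LigerDataset runs only if such a file exists.

```r
# Hypothetical helper (not part of rliger): translate where the raw counts
# live in the AnnData object into the HDF5 path expected by `anndataX`.
anndataXFor <- function(where = c("X", "layer", "raw"), layer = NULL) {
    where <- match.arg(where)
    switch(
        where,
        X = "X",                          # adata.X
        layer = {
            stopifnot(is.character(layer), length(layer) == 1)
            paste0("layers/", layer)      # adata.layers[layer]
        },
        raw = "raw/X"                     # adata.raw.X
    )
}

# Raw counts kept in adata.layers['count'].
countPath <- anndataXFor("layer", layer = "count")

# Hypothetical copy of an H5AD file; the block is skipped when it is absent.
h5adPath <- "adata_copy.h5ad"
if (file.exists(h5adPath)) {
    ld <- rliger::createH5LigerDataset(
        h5adPath,
        formatType = "anndata",
        anndataX = countPath
    )
}
```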
Examples
h5Path <- system.file("extdata/ctrl.h5", package = "rliger")
tempPath <- tempfile(fileext = ".h5")
file.copy(from = h5Path, to = tempPath)
#> [1] TRUE
ld <- createH5LigerDataset(tempPath)
#> Found more than one class "H5D" in cache; using the first, from namespace 'rliger'
#> Also defined by ‘hdf5r’