Enables easy loading of sparse data matrices provided by 10X Genomics.
read10X works generally for 10X CellRanger pipelines, including CellRanger < 3.0, CellRanger >= 3.0 and CellRanger-ARC.
read10XRNA invokes read10X and extracts the "Gene Expression" feature type, so that the result can be used directly to construct a liger object. See Examples for a demonstration.
read10XATAC works for both the CellRanger-ARC and CellRanger-ATAC pipelines, but needs user arguments to recognize the pipeline correctly. Similarly, the returned value can be used directly to construct a liger object.
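To make the on-disk layout concrete, here is a minimal sketch that writes a tiny CellRanger < 3.0 style folder (matrix.mtx, genes.tsv, barcodes.tsv) and reads it back by hand with the Matrix package. It is not how read10X is implemented; the folder name, gene names, barcodes and counts are all made up for illustration.

```r
library(Matrix)

# A tiny 3-gene x 2-cell count matrix (values are made up).
counts <- sparseMatrix(
  i = c(1, 3, 2),
  j = c(1, 1, 2),
  x = c(5, 1, 2),
  dims = c(3, 2)
)

# Write the three files into a temporary, hypothetical sample folder.
mockDir <- file.path(tempdir(), "mock10x")
dir.create(mockDir, showWarnings = FALSE)
invisible(writeMM(counts, file.path(mockDir, "matrix.mtx")))
write.table(
  data.frame(id = c("ENSG01", "ENSG02", "ENSG03"),
             symbol = c("GeneA", "GeneB", "GeneC")),
  file.path(mockDir, "genes.tsv"),
  sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE
)
writeLines(c("AAAC-1", "AAAG-1"), file.path(mockDir, "barcodes.tsv"))

# Read the folder back manually and attach dimnames.
mat <- as(readMM(file.path(mockDir, "matrix.mtx")), "CsparseMatrix")
genes <- read.table(file.path(mockDir, "genes.tsv"), sep = "\t",
                    stringsAsFactors = FALSE)
barcodes <- readLines(file.path(mockDir, "barcodes.tsv"))
rownames(mat) <- genes[[2]]  # second column, mirroring the geneCol default
colnames(mat) <- barcodes    # first column, mirroring the cellCol default
```

Going by the description above, a folder of this shape is what read10X expects when path points at a single data directory.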
Usage
read10X(
path,
sampleNames = NULL,
addPrefix = FALSE,
useFiltered = NULL,
reference = NULL,
geneCol = 2,
cellCol = 1,
returnList = FALSE,
verbose = getOption("ligerVerbose", TRUE),
sample.dirs = path,
sample.names = sampleNames,
use.filtered = useFiltered,
data.type = NULL,
merge = NULL,
num.cells = NULL,
min.umis = NULL
)
read10XRNA(
path,
sampleNames = NULL,
addPrefix = FALSE,
useFiltered = NULL,
reference = NULL,
returnList = FALSE,
...
)
read10XATAC(
path,
sampleNames = NULL,
addPrefix = FALSE,
useFiltered = NULL,
pipeline = c("atac", "arc"),
arcFeatureType = "Peaks",
returnList = FALSE,
geneCol = 2,
cellCol = 1,
verbose = getOption("ligerVerbose", TRUE)
)
Arguments
- path
(A) A directory containing the matrix.mtx, genes.tsv (or features.tsv), and barcodes.tsv files provided by 10X. A vector, a named vector, a list or a named list can be given in order to load several data directories. (B) The 10X root directory where subdirectories of per-sample output folders can be found. By default, sample names are taken from the names of the vector or list, or from the subfolder names.
- sampleNames
A vector of names to override the detected or set sample names for what is given to path. Default NULL. If no names are detected at all and multiple samples are given, they are named by numbers.
- addPrefix
Logical, whether to add sample names as a prefix to the barcodes. Default FALSE.
- useFiltered
Logical, if path is given as case B, whether to use the filtered feature barcode matrix instead of the raw (unfiltered) one. Default TRUE.
- reference
When a CellRanger < 3.0 root folder is given to path, the reference whose output matrix should be imported. Only needed when multiple references are present. Default NULL.
- geneCol
Specify which column of genes.tsv or features.tsv to use for gene names. Default 2.
- cellCol
Specify which column of barcodes.tsv to use for cell names. Default 1.
- returnList
Logical, whether to still return a structured list instead of a single matrix object when only one sample and only one feature type can be found. In all other cases a list is always returned. Default FALSE.
- verbose
Logical, whether to show progress information. Default getOption("ligerVerbose"), or TRUE if users have not set it.
- sample.dirs, sample.names, use.filtered
These arguments are renamed and will be deprecated in the future. Please see usage for corresponding arguments.
- data.type, merge, num.cells, min.umis
These arguments are defunct because the functionality can or should be fulfilled with other functions.
- ...
Arguments passed to read10X.
- pipeline
Which CellRanger pipeline type to look for the ATAC data in. Choose "atac" to read the peak matrix from cellranger-atac pipeline output folder(s), or "arc" to split the ATAC feature subset out from the multiomic cellranger-arc pipeline output folder(s). Default "atac".
- arcFeatureType
When pipeline = "arc", which feature type holds the ATAC data of interest. Default "Peaks". Another possible feature type is "Chromatin Accessibility". If the specified type cannot be found, the error message shows the available options.
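The roles of geneCol, arcFeatureType and addPrefix can be illustrated with a small sketch in plain R. It assumes a CellRanger >= 3.0 style features table whose third column holds the feature type, and it assumes an underscore as the separator between sample name and barcode; neither detail is confirmed by this page. All data and the sample name "rep1" are made up, and this is not the package's own code.

```r
library(Matrix)

# Hypothetical features table: column 1 = ID, column 2 = name,
# column 3 = feature type (assumed layout).
features <- data.frame(
  id   = c("ENSG01", "ENSG02", "chr1:100-200", "chr1:300-400"),
  name = c("GeneA", "GeneB", "chr1:100-200", "chr1:300-400"),
  type = c("Gene Expression", "Gene Expression", "Peaks", "Peaks"),
  stringsAsFactors = FALSE
)
barcodes <- c("AAAC-1", "AAAG-1")

# Made-up multiomic counts: 4 features x 2 cells.
counts <- sparseMatrix(
  i = c(1, 2, 3, 4, 4),
  j = c(1, 2, 1, 1, 2),
  x = c(3, 1, 2, 7, 4),
  dims = c(4, 2)
)

# geneCol selects which features column supplies the row names.
geneCol <- 2
rownames(counts) <- features[[geneCol]]
colnames(counts) <- barcodes

# Split rows by feature type, keeping "Gene Expression" first if present.
types <- unique(features$type)
types <- c(intersect("Gene Expression", types),
           setdiff(types, "Gene Expression"))
byType <- lapply(types, function(tp) {
  counts[features$type == tp, , drop = FALSE]
})
names(byType) <- types

# Pick the ATAC subset, listing the available types on failure.
arcFeatureType <- "Peaks"
if (!arcFeatureType %in% names(byType)) {
  stop("Feature type not found. Available: ",
       paste(names(byType), collapse = ", "))
}
atac <- byType[[arcFeatureType]]

# Prefix barcodes with the sample name (the "_" separator is an assumption).
sampleName <- "rep1"
colnames(atac) <- paste0(sampleName, "_", colnames(atac))
```

Prefixing barcodes this way keeps cell names unique when several samples that share barcode sequences are later combined.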
Value
When only one sample is given or detected, only one feature type is detected (or CellRanger < 3.0 is used), and returnList = FALSE, a sparse matrix object (dgCMatrix class) is returned.
read10XRNA and read10XATAC, which are modality specific, return a list named by samples, where each element is the corresponding sparse matrix object (dgCMatrix class).
read10X generally returns a list named by samples. Each sample element is another list named by feature types, even if only one feature type is detected (or CellRanger < 3.0 is used), for data structure consistency. The feature type "Gene Expression" always comes first if available.
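The nested return structure can be mimicked with mock data to show how it is typically navigated. The helper mockMat, the sample names and the matrix sizes below are hypothetical; only the shape of the lists follows the description above.

```r
library(Matrix)

# Hypothetical helper: a sparse matrix with one nonzero per feature.
mockMat <- function(nFeature, nCell) {
  sparseMatrix(
    i = seq_len(nFeature),
    j = rep(1, nFeature),
    x = rep(1, nFeature),
    dims = c(nFeature, nCell)
  )
}

# Shape described for read10X: samples -> feature types -> dgCMatrix.
full <- list(
  rep1 = list(`Gene Expression` = mockMat(3, 2), Peaks = mockMat(4, 2)),
  rep2 = list(`Gene Expression` = mockMat(3, 5), Peaks = mockMat(4, 5))
)
names(full)       # sample names
names(full$rep1)  # feature types, "Gene Expression" first

# Shape described for read10XRNA: samples -> dgCMatrix.
rnaOnly <- lapply(full, function(smp) smp[["Gene Expression"]])
matClasses <- vapply(rnaOnly, function(m) class(m)[1], character(1))
```

A flat, per-sample list such as rnaOnly is the form the description says can be passed on directly to construct a liger object.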
Examples
if (FALSE) {
# For output from CellRanger < 3.0
dir <- 'path/to/data/directory'
list.files(dir) # Should show barcodes.tsv, genes.tsv, and matrix.mtx
mat <- read10X(dir)
class(mat) # Should show dgCMatrix
# For root directory from CellRanger < 3.0
dir <- 'path/to/root'
list.dirs(dir) # Should show sample names
matList <- read10X(dir)
names(matList) # Should show the sample names
class(matList[[1]][["Gene Expression"]]) # Should show dgCMatrix
# For output from CellRanger >= 3.0 with multiple data types
dir <- 'path/to/data/directory'
list.files(dir) # Should show barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz
matList <- read10X(dir, sampleNames = "tissue1")
names(matList) # Should show "tissue1"
names(matList$tissue1) # Should show feature types, e.g. "Gene Expression"
# For root directory from CellRanger >= 3.0 with multiple data types
dir <- 'path/to/root'
list.dirs(dir) # Should show sample names, e.g. "rep1", "rep2", "rep3"
matList <- read10X(dir)
names(matList) # Should show the sample names: "rep1", "rep2", "rep3"
names(matList$rep1) # Should show the available feature types for rep1
}
if (FALSE) {
# For creating LIGER object from root directory of CellRanger >= 3.0
dir <- 'path/to/root'
list.dirs(dir) # Should show sample names, e.g. "rep1", "rep2", "rep3"
matList <- read10XRNA(dir)
names(matList) # Should show the sample names: "rep1", "rep2", "rep3"
sapply(matList, class) # Should show that every matrix class is "dgCMatrix"
lig <- createLigerObject(matList)
}