ligerDataset class
Source: R/classes.R, R/generics.R, R/ligerDataset-methods.R
ligerDataset-class.Rd
Object for storing dataset-specific information. It will be embedded within a higher-level liger object.
Usage
rawData(x, dataset = NULL)
rawData(x, dataset = NULL, check = TRUE) <- value
normData(x, dataset = NULL)
normData(x, dataset = NULL, check = TRUE) <- value
scaleData(x, dataset = NULL)
scaleData(x, dataset = NULL, check = TRUE) <- value
scaleUnsharedData(x, dataset = NULL)
scaleUnsharedData(x, dataset = NULL, check = TRUE) <- value
getMatrix(x, slot = "rawData", dataset = NULL, returnList = FALSE)
h5fileInfo(x, info = NULL)
h5fileInfo(x, info = NULL, check = TRUE) <- value
getH5File(x, dataset = NULL)
# S4 method for ligerDataset,missing
getH5File(x, dataset = NULL)
featureMeta(x, check = NULL)
featureMeta(x, check = TRUE) <- value
# S4 method for ligerDataset
show(object)
# S4 method for ligerDataset
dim(x)
# S4 method for ligerDataset
dimnames(x)
# S4 method for ligerDataset,list
dimnames(x) <- value
# S4 method for ligerDataset
rawData(x, dataset = NULL)
# S4 method for ligerDataset,ANY,ANY,matrixLike_OR_NULL
rawData(x, dataset = NULL, check = TRUE) <- value
# S4 method for ligerDataset,ANY,ANY,H5D
rawData(x, dataset = NULL, check = TRUE) <- value
# S4 method for ligerDataset
normData(x, dataset = NULL)
# S4 method for ligerDataset,ANY,ANY,matrixLike_OR_NULL
normData(x, dataset = NULL, check = TRUE) <- value
# S4 method for ligerDataset,ANY,ANY,H5D
normData(x, dataset = NULL, check = TRUE) <- value
# S4 method for ligerDataset,missing
scaleData(x, dataset = NULL)
# S4 method for ligerDataset,ANY,ANY,matrixLike_OR_NULL
scaleData(x, dataset = NULL, check = TRUE) <- value
# S4 method for ligerDataset,ANY,ANY,H5D
scaleData(x, dataset = NULL, check = TRUE) <- value
# S4 method for ligerDataset,ANY,ANY,H5Group
scaleData(x, dataset = NULL, check = TRUE) <- value
# S4 method for ligerDataset,missing
scaleUnsharedData(x, dataset = NULL)
# S4 method for ligerDataset,missing,ANY,matrixLike_OR_NULL
scaleUnsharedData(x, check = TRUE) <- value
# S4 method for ligerDataset,missing,ANY,H5D
scaleUnsharedData(x, check = TRUE) <- value
# S4 method for ligerDataset,missing,ANY,H5Group
scaleUnsharedData(x, check = TRUE) <- value
# S4 method for ligerDataset,ANY,missing,missing
getMatrix(
x,
slot = c("rawData", "normData", "scaleData", "scaleUnsharedData", "H", "V", "U", "A",
"B"),
dataset = NULL
)
# S4 method for ligerDataset
h5fileInfo(x, info = NULL)
# S4 method for ligerDataset
h5fileInfo(x, info = NULL, check = TRUE) <- value
# S4 method for ligerDataset
featureMeta(x, check = NULL)
# S4 method for ligerDataset
featureMeta(x, check = TRUE) <- value
# S3 method for ligerDataset
cbind(x, ..., deparse.level = 1)
Arguments
- x, object
A ligerDataset object.
- dataset
Not applicable for ligerDataset methods.
- check
Whether to perform an object validity check when setting a new value.
- value
See the detail sections below for requirements.
- slot
The slot name when using getMatrix.
- returnList
Not applicable for ligerDataset methods.
- info
Name of the entry in the h5fileInfo slot.
- ...
See the detail sections below for explanation.
- deparse.level
Not used here.
Slots
rawData
Raw data. Feature by cell matrix. Most of the time, a sparse matrix of integers for RNA and ATAC data.
normData
Normalized data. Feature by cell matrix. Sparse if the rawData it is normalized from is sparse.
scaleData
Scaled data, usually restricted to the shared variable features, by cells. Most of the time a sparse matrix of floating-point numbers. This is the data used for iNMF factorization.
scaleUnsharedData
Scaled data of variable features not shared with other datasets. This is the data used for UINMF factorization.
varUnsharedFeatures
Variable features not shared with other datasets.
V
iNMF output matrix holding the dataset specific gene loading of each factor. Feature by factor matrix.
A
Online iNMF intermediate product matrix.
B
Online iNMF intermediate product matrix.
H
iNMF output matrix holding the factor loading of each cell. Factor by cell matrix.
U
UINMF output matrix holding the unshared variable gene loading of each factor. Feature by factor matrix.
h5fileInfo
A list of metadata about the HDF5 file used for constructing the object.
featureMeta
Feature metadata, DataFrame object.
colnames
Character vector of unique cell identifiers.
rownames
Character vector of unique feature names.
Matrix access
For a ligerDataset object, the rawData(), normData(), scaleData() and scaleUnsharedData() methods are exported for users to access the corresponding feature expression matrix. Replacement methods are also available to modify the slots.
For other matrices, such as the dataset-specific \(H\) and \(V\), please use the getMatrix() method and specify the slot name. Directly accessing a slot with @ is generally not recommended.
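As a brief illustration of the two access routes described above, the following sketch compares a dedicated accessor with getMatrix(). It assumes the rliger package is installed and uses its bundled pbmc example object (the same one used in the Examples section); behavior may differ across rliger versions.

```r
# Sketch only: assumes rliger and its bundled `pbmc` example object.
if (requireNamespace("rliger", quietly = TRUE)) {
  library(rliger)
  ctrl <- dataset(pbmc, "ctrl")

  # Dedicated accessor for the raw count matrix
  viaAccessor <- rawData(ctrl)
  # Generic accessor, naming the slot explicitly
  viaGetMatrix <- getMatrix(ctrl, "rawData")

  # Both calls are expected to return the same feature-by-cell matrix
  print(identical(viaAccessor, viaGetMatrix))
}
```

Either route avoids touching the slot with @, which keeps user code independent of the internal class layout.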
H5 file and information access
A ligerDataset object has a slot called h5fileInfo, which is a list object. The first element is called $H5File, which is an H5File class object and is the connection to the input file. The second element is $filename, which stores the absolute path of the H5 file on the current machine. The third element, $formatType, stores the name of the preset being used, if applicable. The remaining keys pair with paths in the H5 file that point to specific data for constructing a feature expression matrix.
The h5fileInfo() method accesses the list described above and simply retrieves the corresponding value. When info = NULL, it returns the whole list. When length(info) == 1, it returns the requested list value. When more than one entry is requested, it returns a subset list.
The replacement method modifies the list elements and the corresponding slot value (if applicable) at the same time. For example, running h5fileInfo(obj, "rawData") <- newPath not only updates the list, but also updates the rawData slot with the H5D class data at "newPath" in the H5File object.
getH5File() is a wrapper and is equivalent to h5fileInfo(obj, "H5File").
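The three return shapes of h5fileInfo() can be sketched as follows. This is a hedged example, not a verified recipe: it assumes that rliger and hdf5r are installed, that rliger ships an example file at "extdata/ctrl.h5", and that createH5LigerDataset() accepts a file path as its first argument. The variable names h5ld and h5conn are illustrative.

```r
# Sketch only: the packaged example file path is an assumption; the code
# silently does nothing if the file or the packages are not available.
if (requireNamespace("rliger", quietly = TRUE) &&
    requireNamespace("hdf5r", quietly = TRUE)) {
  library(rliger)
  h5Path <- system.file("extdata/ctrl.h5", package = "rliger")
  if (nzchar(h5Path)) {
    # Work on a temporary copy so the packaged file is left untouched
    tempPath <- tempfile(fileext = ".h5")
    file.copy(from = h5Path, to = tempPath)
    h5ld <- createH5LigerDataset(tempPath)

    # info = NULL (the default): the whole list
    print(names(h5fileInfo(h5ld)))
    # A single entry: the value itself
    print(h5fileInfo(h5ld, "filename"))
    # Several entries: a subset list
    str(h5fileInfo(h5ld, c("filename", "formatType")), max.level = 1)
    # Shortcut for h5fileInfo(h5ld, "H5File")
    h5conn <- getH5File(h5ld)
    print(class(h5conn))
  }
}
```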
Feature metadata access
A featureMeta slot is included in each ligerDataset object. This slot requires a DataFrame-class object, the same class used by the cellMeta slot of a liger object. However, the associated S4 methods only provide access to the whole table for now. Accessing information inside the table works the same way as data.frame operations. For example, featureMeta(ligerD)$nCell or featureMeta(ligerD)[varFeatures(ligerObj), "gene_var"].
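A minimal sketch of inspecting the feature metadata table is given below. It assumes rliger is installed and uses the bundled pbmc object. Which columns are present depends on the processing steps already run, so the sketch lists them instead of assuming a particular column exists.

```r
# Sketch only: assumes rliger and its bundled `pbmc` example object.
if (requireNamespace("rliger", quietly = TRUE)) {
  library(rliger)
  ctrl <- dataset(pbmc, "ctrl")
  fm <- featureMeta(ctrl)

  # The table is expected to hold one row per feature of the dataset
  print(class(fm))
  print(nrow(fm) == nrow(ctrl))

  # List the available columns before selecting one with `$` or `[`
  print(colnames(fm))

  # Peek at the first few rows, data.frame style
  print(head(fm, 3))
}
```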
Dimensionality
For a ligerDataset object, columns correspond to cells and rows correspond to features. Therefore, for ligerDataset objects, dim() returns a numeric vector of two numbers: the number of features and the number of cells. dimnames() returns a list of two character vectors: the feature names and the cell barcodes.
For a direct call of the dimnames<- method, value should be a list with a character vector of feature names as the first element and cell identifiers as the second element. For the colnames<- method, value should be the character vector of cell identifiers. For the rownames<- method, value should be the character vector of feature names.
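The replacement methods above might be used as in the following sketch, which renames the cells of an in-memory dataset. It assumes rliger is installed and uses the bundled pbmc object; the "renamed_" prefix and the newIds variable are purely illustrative.

```r
# Sketch only: assumes rliger and its bundled `pbmc` example object.
if (requireNamespace("rliger", quietly = TRUE)) {
  library(rliger)
  ctrl <- dataset(pbmc, "ctrl")

  # dim() is c(number of features, number of cells)
  print(dim(ctrl))

  # Replace cell identifiers: one unique ID per cell
  newIds <- paste0("renamed_", seq_len(ncol(ctrl)))
  colnames(ctrl) <- newIds
  print(head(colnames(ctrl)))

  # Equivalent direct call: a list of feature names, then cell identifiers
  dimnames(ctrl) <- list(rownames(ctrl), newIds)
}
```

The change applies only to the local copy held in ctrl; the dataset stored inside pbmc is not modified by this sketch.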
Subsetting
For more detail on subsetting a liger object or a ligerDataset object, please check out subsetLiger and subsetLigerDataset. Here, we set the S3 "single-bracket" method [ as a quick wrapper to subset a ligerDataset object. i and j serve as the feature and cell subscripts, respectively, and can be any valid index referring to the available features and cells in a dataset. ... arguments are passed to subsetLigerDataset so that advanced options are allowed.
Concatenate ligerDataset
The cbind() method is implemented for concatenating ligerDataset objects by cells. When applied, all feature expression matrices are merged, taking the union of all features for the rows.
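To make the row-union behavior concrete, here is a conceptual sketch in base R using plain matrices. It is not rliger's implementation: expandRows() is a hypothetical helper, and filling features absent from a dataset with zeros is an assumption made for illustration.

```r
# Conceptual sketch in base R; not rliger's implementation.
# Two small feature-by-cell matrices with partly overlapping features.
a <- matrix(1:6, nrow = 3,
            dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2")))
b <- matrix(7:10, nrow = 2,
            dimnames = list(c("geneB", "geneD"), c("cell3", "cell4")))

# Hypothetical helper: expand a matrix to a common feature set, filling
# features it lacks with zeros (assumed fill value).
expandRows <- function(m, features) {
  out <- matrix(0, nrow = length(features), ncol = ncol(m),
                dimnames = list(features, colnames(m)))
  out[rownames(m), ] <- m
  out
}

# Rows of the result are the union of features; columns are all cells
allFeatures <- union(rownames(a), rownames(b))
merged <- cbind(expandRows(a, allFeatures), expandRows(b, allFeatures))
print(dim(merged))
print(merged)
```

With actual ligerDataset objects, the documented call takes the form cbind(x, ...), and the merging is handled by the method itself.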
Examples
ctrl <- dataset(pbmc, "ctrl")
# Methods for base generics
ctrl
#> An object of class ligerDataset with 300 cells
#> rawData: 266 features
print(ctrl)
#> An object of class ligerDataset with 300 cells
#> rawData: 266 features
dim(ctrl)
#> [1] 266 300
ncol(ctrl)
#> [1] 300
nrow(ctrl)
#> [1] 266
colnames(ctrl)[1:5]
#> [1] "ctrl_AAACATACCTCGCT.1" "ctrl_AAACGGCTCTTCGC.1" "ctrl_AACACTCTAAGTAG.1"
#> [4] "ctrl_AACCGCCTCAGGAG.1" "ctrl_AACGTTCTTCCGTC.1"
rownames(ctrl)[1:5]
#> [1] "ISG15" "ID3" "RPL11" "MARCKSL1" "RPS8"
ctrl[1:5, 1:5]
#> An object of class ligerDataset with 5 cells
#> rawData: 5 features
# rliger generics
## raw data
m <- rawData(ctrl)
class(m)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
dim(m)
#> [1] 266 300
## normalized data
pbmc <- normalize(pbmc)
#> ℹ Normalizing datasets "ctrl"
#> ℹ Normalizing datasets "stim"
#> ✔ Normalizing datasets "stim" ... done
#>
#> ℹ Normalizing datasets "ctrl"
#> ✔ Normalizing datasets "ctrl" ... done
#>
ctrl <- dataset(pbmc, "ctrl")
m <- normData(ctrl)
class(m)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
dim(m)
#> [1] 266 300
## scaled data
pbmc <- selectGenes(pbmc)
#> ℹ Selecting variable features for dataset "ctrl"
#> ✔ ... 168 features selected out of 249 shared features.
#> ℹ Selecting variable features for dataset "stim"
#> ✔ ... 166 features selected out of 249 shared features.
#> ✔ Finally 173 shared variable features are selected.
pbmc <- scaleNotCenter(pbmc)
#> ℹ Scaling dataset "ctrl"
#> ✔ Scaling dataset "ctrl" ... done
#>
#> ℹ Scaling dataset "stim"
#> ✔ Scaling dataset "stim" ... done
#>
ctrl <- dataset(pbmc, "ctrl")
m <- scaleData(ctrl)
class(m)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
dim(m)
#> [1] 173 300
n <- scaleData(pbmc, "ctrl")
identical(m, n)
#> [1] TRUE
## Any other matrices
if (requireNamespace("RcppPlanc", quietly = TRUE)) {
  pbmc <- runOnlineINMF(pbmc, k = 20, minibatchSize = 100)
  ctrl <- dataset(pbmc, "ctrl")
  V <- getMatrix(ctrl, "V")
  V[1:5, 1:5]
  Vs <- getMatrix(pbmc, "V")
  length(Vs)
  names(Vs)
  identical(Vs$ctrl, V)
}
#> [1] TRUE