ligerDataset class — ligerDataset-class • rliger

Object for storing dataset-specific information. It is embedded within a higher-level liger object.

Usage

rawData(x, dataset = NULL)

rawData(x, dataset = NULL, check = TRUE) <- value

normData(x, dataset = NULL)

normData(x, dataset = NULL, check = TRUE) <- value

scaleData(x, dataset = NULL)

scaleData(x, dataset = NULL, check = TRUE) <- value

scaleUnsharedData(x, dataset = NULL)

scaleUnsharedData(x, dataset = NULL, check = TRUE) <- value

getMatrix(x, slot = "rawData", dataset = NULL, returnList = FALSE)

h5fileInfo(x, info = NULL)

h5fileInfo(x, info = NULL, check = TRUE) <- value

getH5File(x, dataset = NULL)

# S4 method for ligerDataset,missing
getH5File(x, dataset = NULL)

featureMeta(x, check = NULL)

featureMeta(x, check = TRUE) <- value

# S4 method for ligerDataset
show(object)

# S4 method for ligerDataset
dim(x)

# S4 method for ligerDataset
dimnames(x)

# S4 method for ligerDataset,list
dimnames(x) <- value

# S4 method for ligerDataset
rawData(x, dataset = NULL)

# S4 method for ligerDataset,ANY,ANY,matrixLike_OR_NULL
rawData(x, dataset = NULL, check = TRUE) <- value

# S4 method for ligerDataset,ANY,ANY,H5D
rawData(x, dataset = NULL, check = TRUE) <- value

# S4 method for ligerDataset
normData(x, dataset = NULL)

# S4 method for ligerDataset,ANY,ANY,matrixLike_OR_NULL
normData(x, dataset = NULL, check = TRUE) <- value

# S4 method for ligerDataset,ANY,ANY,H5D
normData(x, dataset = NULL, check = TRUE) <- value

# S4 method for ligerDataset,missing
scaleData(x, dataset = NULL)

# S4 method for ligerDataset,ANY,ANY,matrixLike_OR_NULL
scaleData(x, dataset = NULL, check = TRUE) <- value

# S4 method for ligerDataset,ANY,ANY,H5D
scaleData(x, dataset = NULL, check = TRUE) <- value

# S4 method for ligerDataset,ANY,ANY,H5Group
scaleData(x, dataset = NULL, check = TRUE) <- value

# S4 method for ligerDataset,missing
scaleUnsharedData(x, dataset = NULL)

# S4 method for ligerDataset,missing,ANY,matrixLike_OR_NULL
scaleUnsharedData(x, check = TRUE) <- value

# S4 method for ligerDataset,missing,ANY,H5D
scaleUnsharedData(x, check = TRUE) <- value

# S4 method for ligerDataset,missing,ANY,H5Group
scaleUnsharedData(x, check = TRUE) <- value

# S4 method for ligerDataset,ANY,missing,missing
getMatrix(
  x,
  slot = c("rawData", "normData", "scaleData", "scaleUnsharedData", "H", "V", "U", "A",
    "B"),
  dataset = NULL
)

# S4 method for ligerDataset
h5fileInfo(x, info = NULL)

# S4 method for ligerDataset
h5fileInfo(x, info = NULL, check = TRUE) <- value

# S4 method for ligerDataset
featureMeta(x, check = NULL)

# S4 method for ligerDataset
featureMeta(x, check = TRUE) <- value

# S3 method for ligerDataset
cbind(x, ..., deparse.level = 1)

Arguments

x, object: A ligerDataset object.
dataset: Not applicable for ligerDataset methods.
check: Whether to perform an object validity check when setting a new value.
value: See the detail sections for requirements.
slot: The slot name when using getMatrix.
returnList: Not applicable for ligerDataset methods.
info: Name of the entry in h5fileInfo slot.
...: See the detail sections for explanation.
deparse.level: Not used here.

Slots

rawData: Raw data. Feature by cell matrix. Most of the time a sparse matrix of integers for RNA and ATAC data.
normData: Normalized data. Feature by cell matrix. Sparse if the rawData it is normalized from is sparse.
scaleData: Scaled data, usually the subset of shared variable features by cells. Most of the time a sparse matrix of floating-point numbers. This is the data used for iNMF factorization.
scaleUnsharedData: Scaled data of variable features not shared with other datasets. This is the data used for UINMF factorization.
varUnsharedFeatures: Variable features not shared with other datasets.
V: iNMF output matrix holding the dataset specific gene loading of each factor. Feature by factor matrix.
A: Online iNMF intermediate product matrix.
B: Online iNMF intermediate product matrix.
H: iNMF output matrix holding the factor loading of each cell. Factor by cell matrix.
U: UINMF output matrix holding the unshared variable gene loading of each factor. Feature by factor matrix.
h5fileInfo: List of meta-information about the HDF5 file used for constructing the object.
featureMeta: Feature metadata, DataFrame object.
colnames: Character vector of unique cell identifiers.
rownames: Character vector of unique feature names.

Matrix access

For a ligerDataset object, the rawData(), normData(), scaleData() and scaleUnsharedData() methods are exported for users to access the corresponding feature expression matrix. Replacement methods are also available to modify the slots.

For other dataset-specific matrices, such as H and V, please use the getMatrix() method and specify the slot name. Directly accessing a slot with @ is generally not recommended.
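
The accessor and replacement forms described above can be combined in a round trip. The following is a minimal sketch, assuming the rliger package and its bundled pbmc example object are installed; the variable name roundTripOK is only for illustration.

```r
# Assumes rliger and its bundled `pbmc` example object are available
if (requireNamespace("rliger", quietly = TRUE)) {
    library(rliger)
    ctrl <- dataset(pbmc, "ctrl")

    # Accessor: retrieve the raw feature-by-cell matrix
    m <- rawData(ctrl)

    # Replacement method: write a matrix back into the slot; check = TRUE
    # (the default) requests the object validity check
    rawData(ctrl, check = TRUE) <- m

    # Compare the slot content with the matrix that was assigned
    roundTripOK <- identical(rawData(ctrl), m)
}
```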

H5 file and information access

A ligerDataset object has a slot called h5fileInfo, which is a list object. The first element is called $H5File, which is an H5File class object and is the connection to the input file. The second element is $filename, which stores the absolute path of the H5 file on the current machine. The third element, $formatType, stores the name of the preset being used, if applicable. The remaining keys pair with paths in the H5 file that point to the specific data used for constructing a feature expression matrix.

The h5fileInfo() method accesses the list described above and simply retrieves the corresponding value. When info = NULL, it returns the whole list. When length(info) == 1, it returns the requested list value. When more than one entry is requested, it returns a subset list.

The replacement method modifies the list elements and corresponding slot value (if applicable) at the same time. For example, running h5fileInfo(obj, "rawData") <- newPath not only updates the list, but also updates the rawData slot with the H5D class data at "newPath" in the H5File object.

getH5File() is a wrapper and is equivalent to h5fileInfo(obj, "H5File").
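
The retrieval rule described above can be sketched with a plain list in base R. This is an illustration only, not rliger's implementation: infoList, its placeholder entries, and the helper getInfo() are hypothetical stand-ins for the h5fileInfo slot and its accessor.

```r
# Hypothetical stand-in for the h5fileInfo slot (the real list also holds an
# H5File connection under $H5File)
infoList <- list(
    filename = "/path/to/data.h5",   # placeholder file path
    formatType = "10X",              # placeholder preset name
    rawData = "matrix/data"          # placeholder path inside the H5 file
)

# Hypothetical helper mirroring the documented behavior of h5fileInfo()
getInfo <- function(lst, info = NULL) {
    if (is.null(info)) return(lst)              # info = NULL: whole list
    if (length(info) == 1) return(lst[[info]])  # one name: the value itself
    lst[info]                                   # several names: a subset list
}

whole <- getInfo(infoList)
one <- getInfo(infoList, "filename")
several <- getInfo(infoList, c("filename", "formatType"))
```

Note that a single name returns the bare value, while several names return a list, so code that handles both cases needs to check the result type.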

Feature metadata access

Each ligerDataset object has a featureMeta slot. This slot requires a DataFrame-class object, the same class used by the cellMeta slot of a liger object. For now, however, the associated S4 methods only provide access to the whole table. Individual entries are accessed in the same way as with a data.frame, for example featureMeta(ligerD)$nCell or featureMeta(ligerD)[varFeatures(ligerObj), "gene_var"].
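
The data.frame-style access mentioned above can be sketched as follows. This example uses a base R data.frame as a hypothetical stand-in for the table returned by featureMeta(); the values and the varFeats vector are made up for illustration.

```r
# Hypothetical stand-in for the table returned by featureMeta(); the real slot
# holds a DataFrame object, but the indexing below has the same form
fm <- data.frame(
    nCell = c(120L, 45L, 300L),
    gene_var = c(0.8, 0.1, 1.5),
    row.names = c("ISG15", "ID3", "RPL11")
)
varFeats <- c("RPL11", "ISG15")   # hypothetical variable feature names

nCellCol <- fm$nCell                    # one column as a vector
selected <- fm[varFeats, "gene_var"]    # values for the chosen features,
                                        # returned in the order requested
```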

Dimensionality

In a ligerDataset object, columns correspond to cells and rows correspond to features. Therefore, dim() returns a numeric vector of two numbers: the number of features and the number of cells. dimnames() returns a list of two character vectors: the feature names and the cell barcodes.

When calling the dimnames<- method directly, value should be a list with a character vector of feature names as the first element and a character vector of cell identifiers as the second element. For the colnames<- method, value should be the character vector of cell identifiers. For the rownames<- method, value should be the character vector of feature names.
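
The shape of value expected by these methods can be sketched with an ordinary matrix, which follows the same feature-by-cell convention. The matrix and the names below are made up for illustration and do not involve a ligerDataset object.

```r
# A small feature-by-cell matrix standing in for a ligerDataset
m <- matrix(0, nrow = 3, ncol = 2)

# dimnames<- takes a list: feature names first, cell identifiers second
dimnames(m) <- list(
    c("geneA", "geneB", "geneC"),
    c("cell1", "cell2")
)

matDim <- dim(m)          # number of features, then number of cells
featNames <- rownames(m)

# colnames<- takes only the character vector of cell identifiers
colnames(m) <- c("ctrl_cell1", "ctrl_cell2")
```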

Subsetting

For more details on subsetting a liger object or a ligerDataset object, please see subsetLiger and subsetLigerDataset. Here, the S3 "single-bracket" method [ is provided as a quick wrapper to subset a ligerDataset object. i and j serve as the feature and cell subscripts, respectively, and can be any valid index referring to the available features and cells in a dataset. ... arguments are passed to subsetLigerDataset so that advanced options are available.

Concatenate ligerDataset

The cbind() method is implemented for concatenating ligerDataset objects by cells. When it is applied, all feature expression matrices are merged, taking the union of all features as the rows.
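
The idea of merging by the union of features can be sketched with dense base R matrices. This is not rliger's implementation: the helper mergeByFeatureUnion() is hypothetical, and filling features absent from one input with zeros is an assumption made for this illustration.

```r
# Hypothetical helper: column-bind two feature-by-cell matrices over the
# union of their row names (assumption: absent features are filled with 0)
mergeByFeatureUnion <- function(a, b) {
    allFeatures <- union(rownames(a), rownames(b))
    pad <- function(m) {
        out <- matrix(0, nrow = length(allFeatures), ncol = ncol(m),
                      dimnames = list(allFeatures, colnames(m)))
        out[rownames(m), ] <- m   # copy existing rows into place by name
        out
    }
    cbind(pad(a), pad(b))
}

a <- matrix(1:4, nrow = 2, dimnames = list(c("g1", "g2"), c("a1", "a2")))
b <- matrix(5:8, nrow = 2, dimnames = list(c("g2", "g3"), c("b1", "b2")))
merged <- mergeByFeatureUnion(a, b)
```

Here merged has one row per feature found in either input and one column per cell from both inputs.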

Examples

ctrl <- dataset(pbmc, "ctrl")

# Methods for base generics
ctrl
#> An object of class ligerDataset with 300 cells
#> rawData: 266 features
print(ctrl)
#> An object of class ligerDataset with 300 cells
#> rawData: 266 features
dim(ctrl)
#> [1] 266 300
ncol(ctrl)
#> [1] 300
nrow(ctrl)
#> [1] 266
colnames(ctrl)[1:5]
#> [1] "ctrl_AAACATACCTCGCT.1" "ctrl_AAACGGCTCTTCGC.1" "ctrl_AACACTCTAAGTAG.1"
#> [4] "ctrl_AACCGCCTCAGGAG.1" "ctrl_AACGTTCTTCCGTC.1"
rownames(ctrl)[1:5]
#> [1] "ISG15"    "ID3"      "RPL11"    "MARCKSL1" "RPS8"    
ctrl[1:5, 1:5]
#> An object of class ligerDataset with 5 cells
#> rawData: 5 features

# rliger generics
## raw data
m <- rawData(ctrl)
class(m)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
dim(m)
#> [1] 266 300
## normalized data
pbmc <- normalize(pbmc)
#> ℹ Normalizing datasets "ctrl"
#> ℹ Normalizing datasets "stim"
#> ✔ Normalizing datasets "stim" ... done
#> 
#> ℹ Normalizing datasets "ctrl"

#> ✔ Normalizing datasets "ctrl" ... done
#> 
ctrl <- dataset(pbmc, "ctrl")
m <- normData(ctrl)
class(m)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
dim(m)
#> [1] 266 300
## scaled data
pbmc <- selectGenes(pbmc)
#> ℹ Selecting variable features for dataset "ctrl"
#> ✔ ... 168 features selected out of 249 shared features.
#> ℹ Selecting variable features for dataset "stim"
#> ✔ ... 166 features selected out of 249 shared features.
#> ✔ Finally 173 shared variable features are selected.
pbmc <- scaleNotCenter(pbmc)
#> ℹ Scaling dataset "ctrl"
#> ✔ Scaling dataset "ctrl" ... done
#> 
#> ℹ Scaling dataset "stim"
#> ✔ Scaling dataset "stim" ... done
#> 
ctrl <- dataset(pbmc, "ctrl")
m <- scaleData(ctrl)
class(m)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
dim(m)
#> [1] 173 300
n <- scaleData(pbmc, "ctrl")
identical(m, n)
#> [1] TRUE
## Any other matrices
if (requireNamespace("RcppPlanc", quietly = TRUE)) {
    pbmc <- runOnlineINMF(pbmc, k = 20, minibatchSize = 100)
    ctrl <- dataset(pbmc, "ctrl")
    V <- getMatrix(ctrl, "V")
    V[1:5, 1:5]
    Vs <- getMatrix(pbmc, "V")
    length(Vs)
    names(Vs)
    identical(Vs$ctrl, V)
}
#> [1] TRUE