Package 'scINSIGHT' reference manual

Title:	Interpretation of Heterogeneous Single-Cell Gene Expression Data
Description:	We develop a novel matrix factorization tool named 'scINSIGHT' to jointly analyze multiple single-cell gene expression samples from biologically heterogeneous sources, such as different disease phases, treatment groups, or developmental stages. Given multiple gene expression samples from different biological conditions, 'scINSIGHT' simultaneously identifies common and condition-specific gene modules and quantifies their expression levels in each sample in a lower-dimensional space. With the factorized results, the inferred expression levels and memberships of common gene modules can be used to cluster cells and detect cell identities, and the condition-specific gene modules can help compare functional differences in transcriptomes from distinct conditions. Please also see Qian K, Fu SW, Li HW, Li WV (2022) <doi:10.1186/s13059-022-02649-3>.
Authors:	Kun Qian [aut, ctb, cre], Wei Vivian Li [aut, ctb]
Maintainer:	Kun Qian <[email protected]>
License:	GPL-3
Version:	0.1.4
Built:	2026-05-26 05:50:23 UTC
Source:	https://github.com/vivianstats/scinsight

Create an scINSIGHT object.

Description

This function initializes an scINSIGHT object with normalized data passed in.

Usage

create_scINSIGHT(norm.data, condition)

Arguments

norm.data

List of normalized expression matrices (genes by cells). Gene names should be the same in all matrices.

condition

Vector specifying sample conditions.

Value

scINSIGHT object with norm.data slot set.

Examples

# Demonstration using matrices with randomly generated numbers
set.seed(1)  # make the random matrices reproducible
# Four samples: 500 genes (rows) by 100 to 160 cells (columns)
S1 <- matrix(runif(50000, 0, 2), 500, 100)
S2 <- matrix(runif(60000, 0, 2), 500, 120)
S3 <- matrix(runif(80000, 0, 2), 500, 160)
S4 <- matrix(runif(75000, 0, 2), 500, 150)
data <- list(S1, S2, S3, S4)
sample <- c("sample1", "sample2", "sample3", "sample4")
condition <- c("control", "activation", "control", "activation")
# Name the matrices and the conditions by sample
names(data) <- sample
names(condition) <- sample
scINSIGHTx <- create_scINSIGHT(data, condition)
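
The input requirements stated under Arguments (a list of genes-by-cells matrices with identical gene names, and one condition label per sample) can be checked before the object is created. The following is a minimal base-R sketch. `check_scinsight_input` is a hypothetical helper and not part of the package, the gene names are made up, and the check assumes that gene names are stored as row names, which may be stricter than what the package itself requires.

```r
# Hypothetical helper (not part of scINSIGHT): checks the documented input
# requirements before create_scINSIGHT() is called.
check_scinsight_input <- function(norm.data, condition) {
  # At least two samples, each given as a matrix
  stopifnot(is.list(norm.data), length(norm.data) >= 2)
  stopifnot(all(vapply(norm.data, is.matrix, logical(1))))
  # One condition label per sample
  stopifnot(length(condition) == length(norm.data))
  # Assumption: gene names are the row names and must match exactly
  genes <- rownames(norm.data[[1]])
  stopifnot(!is.null(genes))
  same_genes <- vapply(norm.data,
                       function(m) identical(rownames(m), genes),
                       logical(1))
  stopifnot(all(same_genes))
  invisible(TRUE)
}

set.seed(1)
genes <- paste0("gene", 1:50)  # made-up gene names
A <- matrix(runif(50 * 10, 0, 2), 50, 10, dimnames = list(genes, NULL))
B <- matrix(runif(50 * 12, 0, 2), 50, 12, dimnames = list(genes, NULL))
toy_data <- list(sample1 = A, sample2 = B)
toy_condition <- c(sample1 = "control", sample2 = "activation")
ok <- check_scinsight_input(toy_data, toy_condition)
```

If any matrix has different or missing row names, the helper stops with an error instead of returning.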

Perform scINSIGHT on normalized datasets

Description

Perform INterpreting single cell gene expresSIon bioloGically Heterogeneous daTa (scINSIGHT) to return factorized $W_{\ell1}$, $W_{\ell2}$, $H$ and $V$ matrices.

This factorization produces, for each sample, a $W_{\ell1}$ matrix (cells by $K_j$) and a $W_{\ell2}$ matrix (cells by $K$); a single $V$ matrix ($K$ by genes) shared by all samples; and an $H$ matrix ($K_j$ by genes) for each condition. The $W_{\ell2}$ matrices give the expression levels of the $K$ common gene modules in each sample, and $V$ is the membership matrix of those common gene modules. The $W_{\ell1}$ matrices give the expression levels of the $K_j$ condition-specific gene modules in each sample, and the $H$ matrices are the membership matrices of the condition-specific gene modules, one per condition.
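
The dimensions listed above can be made concrete with random stand-in matrices. The sketch below assumes, following the cited paper rather than this manual, that the cells-by-genes expression of a sample is approximated by the sum $W_{\ell1} H_j + W_{\ell2} V$. All sizes and values are made up for illustration.

```r
set.seed(1)
n_cells <- 100; n_genes <- 500  # made-up sizes for a single sample
K <- 5; K_j <- 2

W_l1 <- matrix(runif(n_cells * K_j), n_cells, K_j)  # condition-specific expression
W_l2 <- matrix(runif(n_cells * K),   n_cells, K)    # common expression
H_j  <- matrix(runif(K_j * n_genes), K_j, n_genes)  # condition-specific membership
V    <- matrix(runif(K * n_genes),   K,   n_genes)  # common membership

# Assumed additive reconstruction, cells by genes
X_hat <- W_l1 %*% H_j + W_l2 %*% V
dim(X_hat)     # 100 500
dim(t(X_hat))  # 500 100, the genes-by-cells layout used for norm.data
```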

Usage

run_scINSIGHT(
  object,
  K = seq(5, 15, 2),
  K_j = 2,
  LDA = c(0.001, 0.01, 0.1, 1, 10),
  thre.niter = 500,
  thre.delta = 0.01,
  num.cores = 1,
  B = 5,
  out.dir = NULL,
  method = "increase"
)

Arguments

object

scINSIGHT object.

K

Number of common gene modules. (default c(5, 7, 9, 11, 13, 15))

K_j

Number of condition-specific gene modules. (default 2)

LDA

Regularization parameters. (default c(0.001, 0.01, 0.1, 1, 10))

thre.niter

Maximum number of block coordinate descent iterations to perform. (default 500)

thre.delta

Stop iterating when the reduction in the objective function falls below this threshold. (default 0.01)

num.cores

Number of cores used for optimizing factorizations in parallel. (default 1)

B

Number of repeated runs, using random seeds 1 to B. (default 5)

out.dir

Output directory of scINSIGHT results. (default NULL)

method

Method of updating the factorization (default "increase"). If multiple values of $K$ are provided, the user can choose between "increase" and "decrease".

For "increase", the algorithm first performs factorization with the smallest $K=K_1$. It then initializes $K_2-K_1$ new factors, where $K_2$ is the next larger value of $K$, and performs factorization with these new factors added. This process continues until the largest $K$ is reached.

For "decrease", the algorithm first performs factorization with the largest $K=K_1$. It then chooses $K_2$ factors, where $K_2$ is the next smaller value of $K$, and performs factorization with the chosen factors. This process continues until the smallest $K$ is reached.
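
The order in which the candidate values of $K$ are visited under each method can be illustrated in base R. This only sketches the schedule described for the method argument, using the default K; it does not reproduce how the package initializes or chooses factors.

```r
K <- seq(5, 15, 2)  # default candidate values of K

# "increase": start from the smallest K, then add factors step by step
K_increase  <- sort(K)
new_factors <- diff(K_increase)  # K_2 - K_1 at each step

# "decrease": start from the largest K, then keep fewer factors step by step
K_decrease <- sort(K, decreasing = TRUE)

K_increase   # 5 7 9 11 13 15
new_factors  # 2 2 2 2 2
K_decrease   # 15 13 11 9 7 5
```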

Value

scINSIGHT object with $W_1$, $W_2$, $H$, $V$ and parameters slots set.

The scINSIGHT Class

Description

The scINSIGHT object is created from two or more single-cell datasets. To construct an scINSIGHT object, the user needs to provide at least two normalized expression (or another single-cell modality) matrices and the condition vector.

Details

The key slots used in the scINSIGHT object are described below.

Slots

norm.data: List of normalized expression matrices (genes by cells). All matrices should contain the same genes, with identical gene names.
condition: Vector specifying each sample's condition name.
W_1: List of $W_{\ell1}$ estimated by scINSIGHT; names correspond to sample names.
W_2: List of $W_{\ell2}$ estimated by scINSIGHT; names correspond to sample names.
H: List of $H$ estimated by scINSIGHT; names correspond to condition names.
V: Matrix $V$ estimated by scINSIGHT.
norm.W_2: List of $W_{\ell2}$ after normalization. Recommended for downstream analysis.
clusters: List of cluster results.
parameters: List of selected parameters, including $K$ and $\lambda$.
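
As a rough illustration of the downstream use suggested for norm.W_2, the sketch below stacks per-sample cells-by-$K$ matrices and clusters the cells. The matrices are random stand-ins shaped like the norm.W_2 slot, not output from a fitted object, and `stats::kmeans` with three centers is an arbitrary stand-in rather than the clustering method used by the package.

```r
set.seed(1)
K <- 5
# Mock list shaped like the norm.W_2 slot: one cells-by-K matrix per sample
norm_W2 <- list(
  sample1 = matrix(runif(100 * K), 100, K),
  sample2 = matrix(runif(120 * K), 120, K)
)

# Stack all cells into one matrix and record each cell's sample of origin
all_cells <- do.call(rbind, norm_W2)
sample_id <- rep(names(norm_W2), times = vapply(norm_W2, nrow, integer(1)))

# Arbitrary choice of clustering method and number of clusters
km <- kmeans(all_cells, centers = 3, nstart = 10)

# Cross-tabulate samples against cluster labels
table(sample_id, km$cluster)
```

With real results, a table like this shows whether cells from different samples share clusters.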
