—————- 2023.03.06 ———————-
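I'm finally back to fill in this gap!!!
scRNA-seq analysis falls roughly into the following modules:
- preprocessing of the sequencing data
- conversion to a matrix
- data mining
Preprocessing includes barcode demultiplexing and read alignment.
The alignment results are then used for quantification, and doublets and empty droplets are removed to obtain the matrix. Quantification includes expression quantification; and although scRNA-seq coverage is low, splicing has also been studied, with some progress.
QC steps must also be performed at the level of transcripts. When counting at the gene level, the number of genes drops sharply once unexpressed genes are filtered out, which makes it hard to resolve cellular heterogeneity.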
normalization:
CPM, TPM
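As a rough illustration of the two normalizations named above, here is a minimal sketch for a single cell in plain Python. The function names, the toy counts, and the gene lengths are all made up for the example; in practice a dedicated library would usually handle this on the full matrix.

```python
def cpm(counts):
    """Counts per million: scale each gene count by the cell's total count."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total * 1e6 for c in counts]


def tpm(counts, gene_lengths_kb):
    """Transcripts per million: divide by gene length first, then scale to one million."""
    rates = [c / length for c, length in zip(counts, gene_lengths_kb)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in counts]
    return [r / total * 1e6 for r in rates]


# Toy cell with three genes (hypothetical numbers).
cell_counts = [10, 30, 60]
lengths_kb = [1.0, 3.0, 2.0]

cell_cpm = cpm(cell_counts)              # depends only on the cell's total count
cell_tpm = tpm(cell_counts, lengths_kb)  # also corrects for gene length
```

The only difference between the two is the length correction: in CPM the second gene gets three times the value of the first, while in TPM they come out equal because the second gene is three times as long.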
data correction (batch correction, noise correction)
data processing stages:
- raw data
- normalized data
- corrected data
- feature-selected data
- dimensionality-reduced data
stages above grouped into three layers:
- measured data
- corrected data
- reduced data
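Whether the quantification is of gene expression or of splicing, the end result is a matrix. Data mining on that matrix, such as dimensionality reduction and clustering, then yields cell clusters, pseudotime analysis, and so on.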
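To make the "matrix in, cell clusters out" idea concrete, here is a minimal sketch of the clustering step in plain Python. It is not how any particular tool does it: it uses a basic k-means with centroids initialized to the first k cells (a simplification), skips the dimensionality reduction that would normally come first, and runs on a made-up 4 cells x 3 genes matrix.

```python
import math


def log1p_transform(matrix):
    """Apply log(1 + x) to every entry to dampen the effect of very large counts."""
    return [[math.log1p(v) for v in row] for row in matrix]


def kmeans(points, k, n_iter=20):
    """Basic Lloyd's k-means; centroids start at the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each cell goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy normalized matrix: rows are cells, columns are genes (hypothetical values).
cells = [
    [100, 0, 5],
    [0, 90, 4],
    [95, 2, 6],
    [1, 100, 5],
]

labels = kmeans(log1p_transform(cells), k=2)
```

Cells 0 and 2 express mostly the first gene and cells 1 and 3 mostly the second, so they end up in two separate groups. Real data would need feature selection and dimensionality reduction before this step, and graph-based clustering is a common alternative to k-means.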
tools:
- also performs cell QC, using three covariates to flag dying cells, doublets, and empty droplets:
- number of counts per barcode (count depth)
- number of genes per barcode
- the fraction of counts from mitochondrial genes per barcode
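The three covariates listed above can be computed directly from a barcodes x genes count matrix. The sketch below is a plain-Python illustration rather than the method of any specific tool; the function name, the toy matrix, and the "MT-" gene-symbol prefix used to spot mitochondrial genes are assumptions for the example.

```python
def qc_covariates(counts, gene_names, mito_prefix="MT-"):
    """Return per-barcode QC covariates from a barcodes x genes count matrix."""
    # Assumption: mitochondrial genes are recognizable by a symbol prefix.
    mito_idx = [
        i for i, g in enumerate(gene_names)
        if g.upper().startswith(mito_prefix)
    ]
    results = []
    for row in counts:
        depth = sum(row)                        # number of counts per barcode
        n_genes = sum(1 for c in row if c > 0)  # number of genes detected
        mito = sum(row[i] for i in mito_idx)    # counts from mitochondrial genes
        frac = mito / depth if depth > 0 else 0.0
        results.append(
            {"count_depth": depth, "n_genes": n_genes, "mito_fraction": frac}
        )
    return results


# Toy data: three barcodes, four genes (hypothetical values).
genes = ["ACTB", "GAPDH", "MT-CO1", "MT-ND1"]
matrix = [
    [50, 30, 10, 10],  # healthy-looking cell
    [2, 0, 6, 2],      # low depth, mostly mitochondrial counts
    [0, 0, 0, 0],      # nothing captured
]

qc = qc_covariates(matrix, genes)
```

The covariates are usually judged together rather than one at a time: a barcode with low depth, few genes, and a high mitochondrial fraction looks like a dying cell, while unusually high depth and gene count may point to a doublet. Any cutoffs are dataset-specific, so none are hard-coded here.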
10x Genomics: what is Fixed RNA Profiling
- single-sample solutions
- multiplexing solutions
reference: