Background: Single-cell RNA sequencing (scRNA-seq) enables many in-depth transcriptomic analyses at single-cell resolution. It is already widely used for exploring the dynamic developmental processes of life, studying gene regulation mechanisms, and discovering new cell types. However, the low RNA capture rate, which causes highly sparse expression with dropout, makes downstream analyses difficult.
Method: Most current methods use a bimodal model to fit gene expression data dominated by zeros. In this paper, we propose scRNA-seq complementation (SCC) to solve the dropout problem in scRNA-seq data. First, we find the nearest neighbor cells of every cell. Then, we use a mixture model to impute the dropouts in the scRNA-seq data. The model identifies likely dropouts and estimates reasonable gene expression values for them.
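The two steps can be illustrated with a minimal sketch. It is not the SCC implementation: the mixture model is replaced here by a much simpler neighbor-voting rule, chosen only for illustration. The function names, the Euclidean distance, and the parameters `k` and `threshold` are all assumptions. A zero is treated as a likely dropout when more than `threshold` of the cell's nearest neighbors express the gene, and it is then filled with the mean of those neighbors' non-zero values.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(cells, k):
    """For each cell, return the indices of its k nearest other cells."""
    neighbors = []
    for i, cell in enumerate(cells):
        dists = sorted(
            (euclidean(cell, other), j)
            for j, other in enumerate(cells)
            if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def impute_dropouts(cells, k=2, threshold=0.5):
    """Fill zeros that look like dropouts, using neighbor cells.

    Simplified stand-in for a mixture model (an assumption, not SCC):
    the dropout score of a zero is the fraction of neighbors that
    express the gene. Zeros scoring above `threshold` are replaced by
    the mean of the neighbors' non-zero values; others are kept.
    """
    neighbors = nearest_neighbors(cells, k)
    imputed = [list(cell) for cell in cells]  # leave the input untouched
    for i, cell in enumerate(cells):
        for g, value in enumerate(cell):
            if value != 0:
                continue  # only zeros are dropout candidates
            observed = [cells[j][g] for j in neighbors[i] if cells[j][g] > 0]
            dropout_score = len(observed) / len(neighbors[i])
            if dropout_score > threshold:
                imputed[i][g] = sum(observed) / len(observed)
    return imputed


# Toy matrix: rows are cells, columns are genes (hypothetical counts).
cells = [
    [5, 0, 3],  # gene 1 is zero here but expressed in similar cells
    [5, 4, 3],
    [6, 4, 3],
    [0, 0, 9],  # a different cell type; its zeros are likely real
    [0, 0, 8],
]
imputed = impute_dropouts(cells, k=2, threshold=0.5)
```

In this toy example, the zero in the first cell is filled from its two neighbors, while the zeros in the last two cells are kept because their neighbors give too little support for expression.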
Results: Experimental results show that SCC is competitive with two existing methods, while showing superiority in reducing the intra-class distance of cells and improving clustering accuracy on both simulated and real data.
Conclusions: SCC is an effective tool for resolving dropout noise in scRNA-seq data. The code is freely accessible at https://github.com/nwpuzhengyan/SCC.