Comparative analysis of NovaSeq 6000 and MGISEQ 2000 single-cell RNA sequencing data
Weiran Chen, Md Wahiduzzaman, Quan Li, Yixue Li, Guangyong Zheng, Tao Huang
Background: Single-cell RNA sequencing (scRNA-seq) is a widely applied method of transcriptome exploration that helps reveal cell-type composition and cell-state heterogeneity in specific biological processes. Different sequencing platforms and processing pipelines may yield different results even for the same samples. Benchmarking sequencing platforms and processing pipelines is therefore a necessary step in interpreting scRNA-seq data. However, recent comparisons have been restricted to either sequencing platforms or analysis pipelines alone. Little is known about which analysis pipelines best match specific sequencing platforms in terms of sensitivity, precision, and related measures.
Methods: We downloaded public scRNA-seq data generated by two distinct sequencers, NovaSeq 6000 and MGISEQ 2000. The data were then processed with each of three pipelines: Drop-seq-tools, UMI-tools, and Cell Ranger. We calculated multiple measurements from the expression profiles of the six platform-pipeline combinations.
Results: All three pipelines had comparable performance overall. The Cell Ranger pipeline achieved the best precision, while UMI-tools prevailed in sensitivity and marker calling.
Conclusions: Our work provides insight into the selection of scRNA-seq data processing tools for two sequencing platforms, as well as a framework for evaluating platform-pipeline combinations.
We propose that evaluations of scRNA-seq data processing pipelines should compare sequencer-pipeline combinations rather than benchmark sequencers or pipelines in isolation. We compared sequencer-pipeline combinations in terms of gene detection, dropout rates, and the numbers of markers and cell types. Based on these results, we make recommendations for different research purposes, such as finding more marker genes or maximizing precision.
Keywords: single-cell RNA sequencing; cell type; data processing; pipeline; platform
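The gene detection and dropout comparisons across the six platform-pipeline combinations can be illustrated with a minimal Python sketch. The metric definitions here are assumptions, since this section does not state the exact formulas used: gene detection is taken as the number of genes with at least one count per cell, and dropout rate as the fraction of zero entries in a cells-by-genes count matrix. The function names and the input layout are hypothetical.

```python
from itertools import product

# The two sequencers and three pipelines named in the Methods.
SEQUENCERS = ["NovaSeq 6000", "MGISEQ 2000"]
PIPELINES = ["Drop-seq-tools", "UMI-tools", "Cell Ranger"]

# Six platform-pipeline combinations (2 sequencers x 3 pipelines).
COMBINATIONS = list(product(SEQUENCERS, PIPELINES))


def genes_detected_per_cell(counts):
    """Count genes with at least one read/UMI in each cell.

    `counts` is a list of rows, one row per cell, one column per gene
    (an assumed layout for this sketch).
    """
    return [sum(1 for c in cell if c > 0) for cell in counts]


def dropout_rate(counts):
    """Fraction of zero entries in the cells x genes matrix (assumed definition)."""
    total = sum(len(cell) for cell in counts)
    zeros = sum(1 for cell in counts for c in cell if c == 0)
    return zeros / total if total else 0.0


def summarize(matrices):
    """Summarize each combination's count matrix.

    `matrices` maps (sequencer, pipeline) tuples to count matrices.
    Returns a dict with the mean number of detected genes per cell
    and the overall dropout rate for every combination supplied.
    """
    summary = {}
    for combo, counts in matrices.items():
        detected = genes_detected_per_cell(counts)
        mean_detected = sum(detected) / len(detected) if detected else 0.0
        summary[combo] = {
            "mean_genes_detected": mean_detected,
            "dropout_rate": dropout_rate(counts),
        }
    return summary
```

Calling `summarize` with one count matrix per entry of `COMBINATIONS` gives a per-combination table that can be compared side by side, which mirrors the combination-level comparison proposed above rather than a sequencer-only or pipeline-only benchmark.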