Abstract
Single-cell genomics gives us a new perspective for understanding multivariate phenotypic and genetic effects at the cellular level. Recently, technologies have begun to measure different modalities of individual cells, such as transcriptomes, epigenomes, metabolomes, and spatial profiles. However, integrating multimodal single-cell data to identify cell-to-cell correspondences remains a challenging task. Our viewpoint emphasizes the importance of data integration at a biologically relevant level of granularity. Furthermore, it is crucial to take into account the inherent discrepancies between modalities in order to balance biological discovery against noise removal. In this article, we give a systematic review of the most popular single-cell integration methods and models, covering cell label transfer, data visualization, and clustering tasks for downstream analysis. We further evaluate more than 10 popular integration methods on paired and unpaired gold-standard datasets. Moreover, we discuss the data preferences, limitations, applications, challenges, and future directions of these methods.
Keywords
single-cell omics / single-cell integration / clustering analysis
Cite this article
Yulong Kan, Weihao Wang, Yunjing Qi, Zhongxiao Zhang, Xikeng Liang, Shuilin Jin.
A comparison of integration methods for single-cell RNA sequencing data and ATAC sequencing data.
Quant. Biol., 2025, 13(2): e91 DOI:10.1002/qub2.91
RIGHTS & PERMISSIONS
The Author(s). Quantitative Biology published by John Wiley & Sons Australia, Ltd on behalf of Higher Education Press.