Abstract
Single-cell -omics has emerged as a powerful tool for elucidating cellular heterogeneity in health and disease. Parallel advances in artificial intelligence (AI), particularly in pattern recognition, feature extraction and predictive modelling, now offer unprecedented opportunities to translate these insights into clinical applications. Here, we propose the single-cell -omics-based Disease Predictor through AI (scDisPreAI), a unified framework that leverages AI to integrate single-cell -omics data, enabling robust prediction of disease and disease stage alongside biomarker discovery. The foundation of scDisPreAI is a large, standardised database spanning diverse diseases and multiple disease stages. Rigorous data preprocessing, including normalisation and batch effect correction, ensures that biological rather than technical variation drives downstream models. Machine learning pipelines or deep learning architectures can then be trained in a multi-task fashion to classify both disease identity and disease stage. Crucially, interpretability techniques such as SHapley Additive exPlanations (SHAP) values or attention weights pinpoint the genes most influential for these predictions, highlighting biomarkers that may be shared across diseases or disease stages. By combining predictive modelling with interpretable biomarker identification, scDisPreAI may be deployed as a clinical decision assistant, flagging potential therapeutic targets for drug repurposing and guiding tailored treatments. In this editorial, we outline the technical and methodological roadmap for scDisPreAI and emphasise future directions, including the incorporation of multi-omics, standardised protocols and prospective clinical validation, to fully harness the transformative potential of single-cell AI in precision medicine.
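The multi-task idea described above can be illustrated with a minimal sketch. It is not the scDisPreAI implementation: it assumes a cells-by-genes count matrix, uses library-size normalisation with a log transform as one common preprocessing choice, and trains a small shared-layer network with two softmax heads (disease identity and disease stage) in plain NumPy. Batch effect correction and SHAP-based interpretation are left out, and all function names, parameters and the toy data generator are hypothetical.

```python
import numpy as np

def normalise_log1p(counts, target_sum=1e4):
    """Scale each cell (row) to target_sum total counts, then apply log1p."""
    lib = counts.sum(axis=1, keepdims=True).astype(float)
    lib[lib == 0] = 1.0  # avoid division by zero for empty cells
    return np.log1p(counts / lib * target_sum)

def zscore(X, eps=1e-8):
    """Standardise each gene (column) to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(params, X):
    """Shared tanh layer followed by two task-specific softmax heads."""
    H = np.tanh(X @ params["W"] + params["b"])
    p_disease = softmax(H @ params["Wd"] + params["bd"])
    p_stage = softmax(H @ params["Ws"] + params["bs"])
    return H, p_disease, p_stage

def train_multitask(X, y_disease, y_stage, n_diseases, n_stages,
                    n_hidden=16, lr=0.1, epochs=300, seed=0):
    """Full-batch gradient descent on the sum of two cross-entropy losses."""
    rng = np.random.default_rng(seed)
    n, g = X.shape
    params = {
        "W": rng.normal(0, 0.1, (g, n_hidden)), "b": np.zeros(n_hidden),
        "Wd": rng.normal(0, 0.1, (n_hidden, n_diseases)), "bd": np.zeros(n_diseases),
        "Ws": rng.normal(0, 0.1, (n_hidden, n_stages)), "bs": np.zeros(n_stages),
    }
    Yd = np.eye(n_diseases)[y_disease]  # one-hot disease labels
    Ys = np.eye(n_stages)[y_stage]      # one-hot stage labels
    for _ in range(epochs):
        H, Pd, Ps = forward(params, X)
        Gd = (Pd - Yd) / n  # gradient of the disease loss w.r.t. its logits
        Gs = (Ps - Ys) / n  # gradient of the stage loss w.r.t. its logits
        # Both tasks send gradients into the shared layer.
        dH = (Gd @ params["Wd"].T + Gs @ params["Ws"].T) * (1 - H ** 2)
        params["Wd"] -= lr * H.T @ Gd
        params["bd"] -= lr * Gd.sum(axis=0)
        params["Ws"] -= lr * H.T @ Gs
        params["bs"] -= lr * Gs.sum(axis=0)
        params["W"] -= lr * X.T @ dH
        params["b"] -= lr * dH.sum(axis=0)
    return params

def predict(params, X):
    """Return the most probable disease and stage label for each cell."""
    _, Pd, Ps = forward(params, X)
    return Pd.argmax(axis=1), Ps.argmax(axis=1)

def make_toy_data(n_cells=120, n_genes=30, n_diseases=3, n_stages=2, seed=0):
    """Synthetic Poisson counts with a few label-dependent genes (illustration only)."""
    rng = np.random.default_rng(seed)
    y_disease = rng.integers(0, n_diseases, n_cells)
    y_stage = rng.integers(0, n_stages, n_cells)
    rates = np.full((n_cells, n_genes), 2.0)
    for i in range(n_cells):
        d, s = y_disease[i], y_stage[i]
        rates[i, d * 3:(d + 1) * 3] += 10.0            # disease marker genes
        rates[i, 20 + s * 3:20 + (s + 1) * 3] += 10.0  # stage marker genes
    return rng.poisson(rates).astype(float), y_disease, y_stage
```

A possible usage, again on synthetic data only: `counts, yd, ys = make_toy_data()`, then `X = zscore(normalise_log1p(counts))`, `params = train_multitask(X, yd, ys, 3, 2)` and `predict(params, X)`. Sharing the hidden layer between the two heads is one simple way to let disease and stage prediction inform each other; a real pipeline would also need held-out evaluation and batch correction before any biological interpretation.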
Keywords
artificial intelligence / Clinical Decision Assistant / deep learning / disease prediction / drug repurposing / machine learning / medicine / single-cell -omics
Cite this article
Matteo Barberis, Jinkun Xie. Towards a unified framework for single-cell -omics-based disease prediction through AI. Clinical and Translational Medicine, 2025, 15(4): e70290. DOI: 10.1002/ctm2.70290
RIGHTS & PERMISSIONS
2025 The Author(s). Clinical and Translational Medicine published by John Wiley & Sons Australia, Ltd on behalf of Shanghai Institute of Clinical Bioinformatics.