Graph transformer with disease subgraph positional encoding for improved comorbidity prediction

Xihan Qin , Li Liao

Quant. Biol. ›› 2025, Vol. 13 ›› Issue (4) : e70008 DOI: 10.1002/qub2.70008
RESEARCH ARTICLE


Abstract

Comorbidity, the co-occurrence of multiple medical conditions in a single patient, profoundly impacts disease management and outcomes. Understanding these complex interconnections is crucial, especially in contexts where comorbidities exacerbate outcomes. Leveraging insights from the human interactome and advances in graph-based methodologies, this study introduces the transformer with subgraph positional encoding (TSPE) for disease comorbidity prediction. Inspired by biologically supervised embedding, TSPE employs the transformer's attention mechanism together with subgraph positional encoding (SPE) to capture interactions between nodes and disease associations. The proposed SPE proves more effective than the Laplacian positional encoding used in Dwivedi et al.'s graph transformer, underscoring the importance of integrating clustering and disease-specific information for improved predictive accuracy. Evaluated on real clinical benchmark datasets (RR0 and RR1), TSPE delivers substantial performance gains over the state-of-the-art method, achieving up to 28.24% higher ROC AUC (area under the receiver operating characteristic curve) and 4.93% higher accuracy. The method shows promise for adaptation to other complex graph-based tasks and applications. The source code is available on GitHub (xihan-qin/TSPE-GraphTransformer).
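For context, the Laplacian positional encoding baseline that TSPE is compared against (Dwivedi et al.'s graph transformer) assigns each node the components of the smallest non-trivial eigenvectors of the normalized graph Laplacian. The sketch below illustrates only that baseline, not the paper's disease subgraph encoding; the function name and the toy 4-node cycle graph are illustrative choices, not taken from the paper.

```python
import numpy as np

def laplacian_positional_encoding(adj: np.ndarray, k: int) -> np.ndarray:
    """Return the k smallest non-trivial eigenvectors of the symmetric
    normalized Laplacian as k-dimensional node positional encodings."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors
    eigvals, eigvecs = np.linalg.eigh(lap)
    # Skip the trivial first eigenvector (eigenvalue ~0, constant per component)
    return eigvecs[:, 1:k + 1]

# Toy example: a 4-node cycle graph; each node receives a 2-dim position
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
pe = laplacian_positional_encoding(A, k=2)
```

Because eigenvector signs are arbitrary, Dwivedi et al. randomly flip the signs of these encodings during training; TSPE's SPE replaces this spectral scheme with clustering- and disease-aware positional information.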

Keywords

comorbidity / graph embedding / graph transformer / human interactome / subgraph positional encoding

Cite this article

Download citation ▾
Xihan Qin, Li Liao. Graph transformer with disease subgraph positional encoding for improved comorbidity prediction. Quant. Biol., 2025, 13(4): e70008 DOI:10.1002/qub2.70008


References

[1] Goodell S, Druss BG, Walker ER. Mental disorders and medical comorbidity. Robert Wood Johnson Foundation; 2011.

[2] Guo M, Yu Y, Wen T, Zhang X, Liu B, Zhang J, et al. Analysis of disease comorbidity patterns in a large-scale China population. BMC Med Genom. 2019;12(S12):1-10.

[3] Sanyaolu A, Okorie C, Marinkovic A, Patidar R, Younis K, Desai P, et al. Comorbidity and its impact on patients with COVID-19. SN Compr Clin Med. 2020;2(8):1069-76.

[4] Fang X, Shen L, Yu H, Wang P, Zhang Y, Chen Z, et al. Epidemiological, comorbidity factors with severity and prognosis of COVID-19: a systematic review and meta-analysis. Aging (Albany NY). 2020;12(13):12493-503.

[5] Guan W-J, Liang W-H, Yi Z, Liang H-R, Chen Z-S, Li Y-M, et al. Comorbidity and its impact on 1590 patients with COVID-19 in China: a nationwide analysis. Eur Respir J. 2020;55(5):2000547.

[6] Jones PJ, Ma R, McNally RJ. Bridge centrality: a network approach to understanding comorbidity. Multivariate Behav Res. 2021;56(2):353-67.

[7] Prasad K, AlOmar SY, Alqahtani SAM, Malik MZ, Kumar V. Brain disease network analysis to elucidate the neurological manifestations of COVID-19. Mol Neurobiol. 2021;58(5):1875-93.

[8] Astore C, Zhou H, Ilkowski B, Forness J, Skolnick J. LeMeDISCO is a computational method for large-scale prediction & molecular interpretation of disease comorbidity. Commun Biol. 2022;5(1):870.

[9] Nam Y, Jung S-H, Yun J-S, Sriram V, Singhal P, Byrska-Bishop M, et al. Discovering comorbid diseases using an inter-disease interactivity network based on biobank-scale PheWAS data. Bioinformatics. 2023;39(1):btac822.

[10] Menche J, Sharma A, Kitsak M, Ghiassian SD, Vidal M, Loscalzo J, et al. Uncovering disease-disease relationships through the incomplete interactome. Science. 2015;347(6224):1257601.

[11] Qin X, Liao L. Improving disease comorbidity prediction with biologically supervised graph embedding. In: Bansal MS, et al., editors. Computational Advances in Bio and Medical Sciences. ICCABS 2023. Lecture Notes in Computer Science, vol. 14548. Springer Nature Switzerland; 2025. p. 178-90.

[12] Akram P, Liao L. Prediction of comorbid diseases using weighted geometric embedding of human interactome. BMC Med Genom. 2019;12(Suppl 7):161.

[13] Tenenbaum JB, de Silva V, Langford JC. A global geometric framework for nonlinear dimensionality reduction. Science. 2000;290(5500):2319-23.

[14] Doll R, Bradford Hill A. The mortality of doctors in relation to their smoking habits. Br Med J. 1954;1(4877):1451-5.

[15] Moni MA, Liò P. How to build personalized multi-omics comorbidity profiles. Front Cell Dev Biol. 2015;3:28.

[16] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Adv Neural Inf Process Syst. 2017;30.

[17] Devlin J, Chang M-W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1. Association for Computational Linguistics; 2019. p. 4171-86.

[18] Ansari N, Babaei V, Najafpour MM. Enhancing catalysis studies with chat generative pre-trained transformer (ChatGPT): conversation with ChatGPT. Dalton Trans. 2024;53(8):3534-47.

[19] Chen C-FR, Fan Q, Panda R. CrossViT: cross-attention multi-scale vision transformer for image classification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2021. p. 357-66.

[20] Roy SK, Deria A, Hong D, Rasti B, Plaza A, Chanussot J. Multimodal fusion transformer for remote sensing image classification. IEEE Trans Geosci Rem Sens. 2023;61:1-20.

[21] Maurício J, Domingues I, Bernardino J. Comparing vision transformers and convolutional neural networks for image classification: a literature review. Appl Sci. 2023;13(9):5521.

[22] Tysinger EP, Rai BK, Sinitskiy AV. Can we quickly learn to 'translate' bioactive molecules with transformer models? J Chem Inf Model. 2023;63(6):1734-44.

[23] Ma J, Zhao Z, Li T, Liu Y, Ma J, Zhang R. GraphsformerCPI: graph transformer for compound-protein interaction prediction. Interdiscipl Sci Comput Life Sci. 2024;16(2):1-17.

[24] Poulain R, Beheshti R. Graph transformers on EHRs: better representation improves downstream performance. In: Proceedings of the Twelfth International Conference on Learning Representations (ICLR); 2024. Available from OpenReview.net (id=pe0Vdv7rsL).

[25] Dwivedi VP, Bresson X. A generalization of transformer networks to graphs. 2020. Preprint at arXiv:2012.09699.

[26] Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015;10(3):e0118432.

[27] Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta Protein Struct. 1975;405(2):442-51.

[28] Porollo A, Meller J. Prediction-based fingerprints of protein-protein interactions. Proteins: Struct, Funct, Bioinf. 2007;66(3):630-45.

[29] Tan KP, Varadarajan R, Madhusudhan MS. DEPTH: a web server to compute depth and predict small-molecule binding cavities in proteins. Nucleic Acids Res. 2011;39(Suppl_2):W242-8.

[30] Zhang J, Zhang H, Xia C, Sun L. Graph-BERT: only attention is needed for learning graph representations. 2020. Preprint at arXiv:2001.05140.

[31] Abdi H. The eigen-decomposition: eigenvalues and eigenvectors. In: Encyclopedia of Measurement and Statistics; 2007. p. 304-8.

[32] Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. In: Proceedings of the International Conference on Learning Representations (ICLR); 2017. Available from OpenReview.net (id=SJU4ayYgl).

[33] Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph attention networks. 2017. Preprint at arXiv:1710.10903.

[34] Ieremie I, Ewing RM, Niranjan M. TransformerGO: predicting protein-protein interactions by modelling the attention between sets of gene ontology terms. Bioinformatics. 2022;38(8):2269-77.

[35] Grover A, Leskovec J. node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016. p. 855-64.

[36] Chen L, Tan X, Wang D, Zhong F, Liu X, Yang T, et al. TransformerCPI: improving compound-protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics. 2020;36(16):4406-14.

[37] BCEWithLogitsLoss. PyTorch 2.2 documentation [cited 2024 Apr 8]. Available from the PyTorch website.

[38] Shen C, Wang Q, Priebe CE. One-hot graph encoder embedding. IEEE Trans Pattern Anal Mach Intell. 2023;45(6):7933-8.

[39] Qin X, Shen C. Efficient graph encoder embedding for large sparse graphs in Python. In: Science and Information Conference. Cham: Springer Nature Switzerland; 2024. p. 568-77.

[40] Chicco D, Tötsch N, Jurman G. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Min. 2021;14:1-22.

[41] Youden WJ. Index for rating diagnostic tests. Cancer. 1950;3(1):32-5.

RIGHTS & PERMISSIONS

The Author(s). Quantitative Biology published by John Wiley & Sons Australia, Ltd on behalf of Higher Education Press.
