Integrating sequence and chemical insights: a co-modeling AI prediction framework for peptides

Zihan Liu, Meiru Yan, Zhihui Zhu, Yongfu Guo, Mouzheng Xu, Jiaqi Wang

Journal of Materials Informatics ›› 2025, Vol. 5 ›› Issue (2) : 17

DOI: 10.20517/jmi.2024.91
Research Article

Abstract

Understanding how the primary structure of peptides affects their physicochemical properties is crucial for developing peptide-based applications. Peptides can be represented biologically, as sequences of amino acids, or chemically, as molecular architectures composed of atoms and chemical bonds. This study examines how these biological and chemical representations influence the local interpretability and accuracy of the corresponding prediction models, and develops “feature attribution” methodologies based on each representation. The effectiveness of these methodologies is validated through physicochemical analyses in the context of peptide aggregation propensity (AP) prediction, with training datasets derived from high-throughput molecular dynamics (MD) simulations. Our findings reveal significant discrepancies between the attributions extracted from sequence-based and chemical structure-based representations, which motivates a co-modeling framework that integrates insights from both perspectives. Empirical comparisons demonstrate that the contrastive learning-based co-modeling framework excels in both effectiveness and efficiency. This research not only extends the applicability of the attribution method but also lays the groundwork for elucidating, with the aid of domain-specific knowledge, the intrinsic mechanisms governing peptide activities and functions. Moreover, the co-modeling strategy is poised to enhance the precision of downstream applications and to facilitate future work in drug discovery and protein engineering.
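To make the idea of sequence-level feature attribution concrete, the following is a minimal sketch using integrated gradients, one common attribution technique. The abstract does not state which attribution method or model the authors use, so everything here is an assumption for illustration: the one-hot encoding, the toy sigmoid model standing in for an AP predictor, the random weights, the all-zero baseline, and the example pentapeptide.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """Encode a peptide as a (length, 20) one-hot matrix."""
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    for i, aa in enumerate(seq):
        x[i, AMINO_ACIDS.index(aa)] = 1.0
    return x


def toy_model(x, w, b):
    """Hypothetical stand-in for an AP predictor: sigmoid of a linear score."""
    return 1.0 / (1.0 + np.exp(-(np.sum(x * w) + b)))


def toy_gradient(x, w, b):
    """Analytic gradient of toy_model with respect to the input x."""
    p = toy_model(x, w, b)
    return p * (1.0 - p) * w


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps
    grad_sum = np.zeros_like(x)
    for a in alphas:
        grad_sum += toy_gradient(baseline + a * (x - baseline), w, b)
    return (x - baseline) * grad_sum / steps


def residue_attributions(seq, w, b):
    """Sum the attributions over the one-hot axis: one score per residue."""
    x = one_hot(seq)
    baseline = np.zeros_like(x)  # assumed baseline: "no residue" at any position
    return integrated_gradients(x, baseline, w, b).sum(axis=1)


rng = np.random.default_rng(0)
seq = "FFKDW"  # arbitrary example peptide, not taken from the paper
w = rng.normal(size=(len(seq), len(AMINO_ACIDS)))  # random, untrained weights
b = 0.1
attr = residue_attributions(seq, w, b)
```

A useful sanity check for this kind of attribution is completeness: the per-residue scores in `attr` should sum to the difference between the model output for the peptide and the output for the baseline. An analogous attribution over atoms and bonds of a molecular graph could then be compared residue by residue with the sequence-based scores.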

Keywords

Deep learning / molecular dynamics / peptide / aggregation propensity / feature attribution

Cite this article

Zihan Liu, Meiru Yan, Zhihui Zhu, Yongfu Guo, Mouzheng Xu, Jiaqi Wang. Integrating sequence and chemical insights: a co-modeling AI prediction framework for peptides. Journal of Materials Informatics, 2025, 5(2): 17. DOI: 10.20517/jmi.2024.91

