Artificial intelligence in fusion protein three-dimensional structure prediction: Review and perspective

Himansu Kumar, Pora Kim

Clinical and Translational Medicine, 2024, Vol. 14, Issue 8: e1789. DOI: 10.1002/ctm2.1789
REVIEW

Abstract

• This review outlines the overall pipeline and landscape for predicting the 3D structures of fusion proteins.

• This review identifies the factors to consider at each step when predicting the 3D structures of fusion proteins with AI approaches.

• This review highlights the latest advancements and ongoing challenges in predicting the 3D structures of fusion proteins with deep learning models.

• This review explores the advantages and challenges of using AlphaFold2, RoseTTAFold, trRosetta, and D-I-TASSER to model 3D structures.

Keywords

AI / AlphaFold2 / deep learning / fusion protein structure / protein structure prediction / RoseTTAFold

Cite this article

Himansu Kumar, Pora Kim. Artificial intelligence in fusion protein three-dimensional structure prediction: Review and perspective. Clinical and Translational Medicine, 2024, 14(8): e1789. DOI: 10.1002/ctm2.1789

RIGHTS & PERMISSIONS

2024 The Author(s). Clinical and Translational Medicine published by John Wiley & Sons Australia, Ltd on behalf of Shanghai Institute of Clinical Bioinformatics.
