Exploration on learning molecular docking with deep learning models

Qin Xie, Wei Ma, Jianhang Zhang, Shiliang Li, Xiaobing Deng, Youjun Xu, Weilin Zhang

Quant. Biol. 2023, Vol. 11, Issue (3): 320-331. DOI: 10.15302/J-QB-022-0321

RESEARCH ARTICLE


Abstract

Background: Molecular docking-based virtual screening (VS) aims to select ligands with potential pharmacological activity from millions or even billions of molecules, which can significantly reduce the number of compounds that need to be tested experimentally. However, many molecules in such libraries have low affinity for a given protein target, so docking them wastes substantial computational resources.

Methods: We implemented a fast and practical molecular screening approach, DL-DockVS (deep learning dock virtual screening), which uses deep learning models (regression and classification models) to learn the outcomes of pipelined docking programs step by step.
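
One possible reading of combining classification and regression models is a cascade: a classifier first discards molecules predicted to dock poorly, and a regressor then ranks the survivors by predicted docking score. The sketch below illustrates only that general idea; it is not the authors' implementation. The function name `cascade_screen`, the probability cutoff, the `top_k` parameter, the toy predictors, and the "lower score is better" convention are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def cascade_screen(
    molecules: List[str],
    keep_probability: Callable[[str], float],
    predicted_score: Callable[[str], float],
    prob_cutoff: float = 0.5,
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Filter with a classifier, then rank survivors with a regressor.

    Assumption: a lower (more negative) predicted score means a better
    predicted docking result.
    """
    # Stage 1: drop molecules the classifier considers unlikely to dock well.
    survivors = [m for m in molecules if keep_probability(m) >= prob_cutoff]
    # Stage 2: rank the remaining molecules by predicted docking score.
    ranked = sorted(
        ((m, predicted_score(m)) for m in survivors), key=lambda pair: pair[1]
    )
    return ranked[:top_k]


# Hypothetical stand-ins for trained models (plain lookup tables).
toy_prob = {"mol_a": 0.9, "mol_b": 0.2, "mol_c": 0.7, "mol_d": 0.6}
toy_score = {"mol_a": -9.1, "mol_b": -5.0, "mol_c": -7.4, "mol_d": -8.2}

selected = cascade_screen(
    list(toy_prob),
    keep_probability=lambda m: toy_prob[m],
    predicted_score=lambda m: toy_score[m],
)
print(selected)
```

In a real screen, only the molecules returned by such a cascade would be passed on to the docking program, which is where any runtime saving would come from.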

Results: We showed that this approach successfully weeds out compounds with poor docking scores while retaining compounds with potentially high docking scores against 10 DUD-E protein targets. A self-built dataset of about 1.9 million molecules was used to further verify DL-DockVS, yielding good results in terms of recall rate, active-compound enrichment factor, and runtime.
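
For readers unfamiliar with the two screening metrics named above, the sketch below computes them on toy data. It assumes the commonly used definitions: recall is the fraction of relevant compounds that survive the filter, and the enrichment factor at a given fraction is the active rate in the top of the ranked list divided by the active rate in the whole library. The abstract does not state the exact definitions used in the paper, so the function names, the rounding rule, and the data here are illustrative assumptions.

```python
from typing import List, Set


def recall(selected: Set[str], relevant: Set[str]) -> float:
    """Fraction of relevant compounds that were retained."""
    return len(selected & relevant) / len(relevant)


def enrichment_factor(ranked_ids: List[str], actives: Set[str], fraction: float) -> float:
    """Active rate in the top `fraction` of the ranking over the library-wide rate."""
    n_top = max(1, int(round(len(ranked_ids) * fraction)))
    hits = sum(1 for mol in ranked_ids[:n_top] if mol in actives)
    return (hits / n_top) / (len(actives) / len(ranked_ids))


# Toy library of 10 molecules, already ranked best-first by predicted score.
ranked = [f"m{i}" for i in range(10)]

ef_20 = enrichment_factor(ranked, actives={"m0", "m1"}, fraction=0.2)
rec = recall(selected=set(ranked[:5]), relevant={"m0", "m1", "m7"})
print(ef_20, rec)
```

An enrichment factor of 1 corresponds to random selection, so values well above 1 indicate that actives are concentrated near the top of the ranking.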

Conclusions: We comprehensively evaluated the practicality and effectiveness of DL-DockVS against 10 protein targets. Because it reduces runtime while maintaining the success rate, it is a useful and promising approach for screening ultra-large compound libraries in the age of big data. A model trained for one specific target can also be conveniently reused to predict high docking-score molecules in other chemical libraries without repeating the docking computation.

Keywords

molecular docking / ultra-large virtual screening / deep learning

Cite this article

Qin Xie, Wei Ma, Jianhang Zhang, Shiliang Li, Xiaobing Deng, Youjun Xu, Weilin Zhang. Exploration on learning molecular docking with deep learning models. Quant. Biol., 2023, 11(3): 320-331. DOI: 10.15302/J-QB-022-0321



RIGHTS & PERMISSIONS

The Author(s). Published by Higher Education Press.


Supplementary files

QB-22321-OF-XYJ_suppl_1
