Algorithmic challenges in structure-based drug design and NMR structural biology
Received date: 17 Oct 2011
Accepted date: 23 Nov 2011
Published date: 05 Mar 2012
The three-dimensional structure of a biomolecule, rather than its one-dimensional sequence, determines its biological function. At present, the most accurate structures are derived from experimental data measured mainly by two techniques: X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. Because neither X-ray crystallography nor NMR spectroscopy can directly measure the positions of atoms in a biomolecule, algorithms must be designed to compute atom coordinates from the data. One salient feature of most NMR structure computation algorithms is their reliance on stochastic search to find the lowest-energy conformations that satisfy the experimentally derived geometric restraints. However, the correctness of the stochastic search has not been established, nor can the errors in the output structures be quantified. Although exact algorithms exist for computing structures from angular restraints, similar algorithms that use distance restraints remain to be developed.
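To make the reliance on stochastic search concrete, the following is a minimal sketch of simulated annealing against distance restraints. It is a toy, not any published NMR program: a single point is placed in 3D so that its distances to fixed anchors fall within given bounds. The function names, the squared-violation energy, the cooling schedule, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def restraint_energy(point, restraints):
    """Sum of squared violations of (anchor, lower, upper) distance bounds."""
    energy = 0.0
    for anchor, lower, upper in restraints:
        d = math.dist(point, anchor)
        if d < lower:
            energy += (lower - d) ** 2
        elif d > upper:
            energy += (d - upper) ** 2
    return energy


def anneal(restraints, seed, steps=20000, t_start=1.0, t_end=1e-3, step_size=0.5):
    """Toy simulated annealing; returns the best point seen and its energy."""
    rng = random.Random(seed)
    # Hypothetical starting box; a real program would start from a chain model.
    current = [rng.uniform(-10.0, 10.0) for _ in range(3)]
    e_current = restraint_energy(current, restraints)
    best, e_best = list(current), e_current
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        trial = [c + rng.gauss(0.0, step_size) for c in current]
        e_trial = restraint_energy(trial, restraints)
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if e_trial <= e_current or rng.random() < math.exp((e_current - e_trial) / t):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


# Restraints generated from a known point, with +/- 0.1 tolerance on each distance.
true_point = (1.0, 2.0, 3.0)
anchors = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0), (0.0, 0.0, 5.0)]
restraints = [(a, math.dist(true_point, a) - 0.1, math.dist(true_point, a) + 0.1)
              for a in anchors]

for seed in (0, 1, 2):
    point, energy = anneal(restraints, seed)
    print(seed, [round(c, 3) for c in point], round(energy, 6))
```

Different seeds generally return different points, and nothing in the procedure certifies that the lowest-energy region was reached or bounds the error of the returned coordinates, which is the concern raised above.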
An important application of structures is rational drug design, where protein-ligand docking plays a critical role. In fact, various docking programs that place a compound into the binding site of a target protein have been used routinely by medicinal chemists for both lead identification and optimization. Unfortunately, despite ongoing methodological advances and some success stories, the performance of current docking algorithms is still data-dependent. These algorithms formulate the docking problem as a match of two sets of feature points. Both the selection of feature points and the search for the best poses with the minimum scores are accomplished through stochastic search methods. Both the uncertainty in the scoring function and the limited sampling space attained by the stochastic search contribute to their failures. Recently, we have developed two novel docking algorithms: a data-driven docking algorithm and a general docking algorithm that does not rely on experimental data. Our algorithms search the pose space exhaustively, with the pose space itself being limited to a set of hierarchical manifolds that represent, respectively, surfaces, curves and points with unique geometric and energetic properties. These algorithms promise to be especially valuable for the docking of fragments and small compounds as well as for virtual screening.
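The idea of docking as a match between two sets of feature points, searched exhaustively over a restricted pose space, can be sketched in a few lines. This is a deliberately simplified 2D toy and not the authors' algorithm: the pose space here is a plain grid of rotations and integer translations rather than hierarchical manifolds, and the score is the RMSD between index-matched points rather than an energy-based scoring function. All names and grid sizes are hypothetical.

```python
import math
from itertools import product


def transform(points, angle, shift):
    """Rotate 2D points by angle (radians) about the origin, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]


def rmsd(a, b):
    """Root mean square deviation between two index-matched point lists."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def exhaustive_dock(ligand, site, n_angles=72, shifts=range(-3, 4)):
    """Score every pose on a fixed grid; return (score, angle, shift) of the best."""
    best = None
    for k, dx, dy in product(range(n_angles), shifts, shifts):
        angle = 2.0 * math.pi * k / n_angles
        score = rmsd(transform(ligand, angle, (dx, dy)), site)
        if best is None or score < best[0]:
            best = (score, angle, (dx, dy))
    return best


# Hypothetical feature points: the site is the ligand in a known pose.
ligand = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
site = transform(ligand, math.pi / 2.0, (2, -1))

score, angle, shift = exhaustive_dock(ligand, site)
print(round(score, 6), round(math.degrees(angle), 1), shift)
```

Because every grid pose is visited, the returned pose is guaranteed to be the best one on that grid, in contrast to a stochastic search. The cost grows with the grid size, which is why restricting the pose space matters.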
Key words: structure-based drug design (SBDD); virtual screening (VS); protein-ligand docking; scoring function; molecular dynamics (MD); Monte Carlo (MC); simulated annealing (SA); Markov chain Monte Carlo (MCMC); nuclear magnetic resonance (NMR); nuclear Overhauser effect (NOE); residual dipolar couplings (RDCs); chemical shift (CS); inferential structure determination (ISD); Bayesian; Gibbs sampling; probability distribution functions (PDFs); degrees of freedom (DOF); van der Waals (VDW); root mean square deviation (RMSD); manifold; Poisson-Boltzmann equation (PBE)
Lincong WANG, Shuxue ZOU, Yao WANG. Algorithmic challenges in structure-based drug design and NMR structural biology[J]. Frontiers of Electrical and Electronic Engineering, 2012, 7(1): 69-84. DOI: 10.1007/s11460-012-0193-z