A computational toolbox for molecular property prediction based on quantum mechanics and quantitative structure-property relationship

Qilei Liu, Yinke Jiang, Lei Zhang, Jian Du

Front. Chem. Sci. Eng., 2022, Vol. 16, Issue 2: 152-167. DOI: 10.1007/s11705-021-2060-z
RESEARCH ARTICLE

Abstract

The chemical industry continually seeks opportunities to convert raw materials efficiently and economically into commodity chemicals and higher value-added chemical-based products. The life cycle of a chemical product involves conceptual product design, experimental investigation, sustainable manufacturing through appropriate chemical processes, and waste disposal. Throughout these stages, one of the most important requirements is a set of molecular property prediction models that relate molecular structures to product properties. In this paper, a framework combining quantum mechanics and quantitative structure-property relationship is established for fast prediction of molecular properties such as activity coefficients. The workflow of the framework consists of three steps. In the first step, a database is created to collect basic molecular information. In the second step, quantum mechanics-based calculations are performed to predict quantum mechanics-based/derived molecular properties (pseudo-experimental data), which are stored in a database. In the third step, these data are used to develop quantitative structure-property relationship methods for fast property prediction. The whole framework has been implemented in a molecular property prediction toolbox. Two case studies highlighting different aspects of the toolbox are presented: the prediction of heats of reaction and the prediction of solid-liquid phase equilibria.
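As a rough illustration of the first case study's target quantity, the sketch below computes a heat of reaction from heats of formation using Hess's law. It is not the toolbox's implementation: the function name `heat_of_reaction` and the lookup table are hypothetical, and the table holds textbook standard enthalpies of formation as a stand-in for the quantum mechanics-derived database described in the abstract.

```python
from typing import Dict

# Standard enthalpies of formation at 298 K in kJ/mol (textbook values).
# Assumption: in the framework described above, these numbers would come
# from the QM-derived (pseudo-experimental) database instead.
HEAT_OF_FORMATION: Dict[str, float] = {
    "CH4(g)": -74.8,
    "O2(g)": 0.0,
    "CO2(g)": -393.5,
    "H2O(l)": -285.8,
}


def heat_of_reaction(stoichiometry: Dict[str, float],
                     table: Dict[str, float] = HEAT_OF_FORMATION) -> float:
    """Hess's law: sum of nu_i * dHf_i over all species.

    Stoichiometric coefficients nu_i are negative for reactants and
    positive for products.
    """
    return sum(nu * table[species] for species, nu in stoichiometry.items())


# Combustion of methane: CH4 + 2 O2 -> CO2 + 2 H2O(l)
combustion = {"CH4(g)": -1, "O2(g)": -2, "CO2(g)": 1, "H2O(l)": 2}
dH = heat_of_reaction(combustion)
print(f"Heat of reaction: {dH:.1f} kJ/mol")
```

Keeping the property table separate from the reaction definition mirrors the workflow's split between stored molecular data and the models that consume it, so the table could be swapped for predicted values without touching the calculation.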

Keywords

molecular property / quantum mechanics / quantitative structure-property relationship / heat of reaction / solid-liquid phase equilibrium

Cite this article

Qilei Liu, Yinke Jiang, Lei Zhang, Jian Du. A computational toolbox for molecular property prediction based on quantum mechanics and quantitative structure-property relationship. Front. Chem. Sci. Eng., 2022, 16(2): 152-167. https://doi.org/10.1007/s11705-021-2060-z

Acknowledgements

The authors are grateful for the financial support of the National Natural Science Foundation of China (Grant Nos. 22078041 and 21808025) and the Fundamental Research Funds for the Central Universities (Grant No. DUT20JC41).

Electronic Supplementary Material

Supplementary material is available in the online version of this article at https://dx.doi.org/10.1007/s11705-021-2060-z and is accessible for authorized users.

RIGHTS & PERMISSIONS

2021 Higher Education Press