Clustering molecular energy landscapes by adaptive network embedding

Paula Mercurio, Di Liu

Journal of Materials Informatics ›› 2024, Vol. 4 ›› Issue (1) : 3

DOI: 10.20517/jmi.2023.40
Research Article


Abstract

To explore the chemical space of all possible small molecules efficiently, a common approach is to reduce the dimensionality of the system to facilitate downstream machine learning tasks. To this end, we present a data-driven approach for clustering the potential energy landscapes of molecular structures, applying recently developed network embedding techniques to obtain latent variables defined through the embedding function. To scale up the method, we also incorporate an entropy-sensitive adaptive scheme for hierarchical sampling of the energy landscape, based on metadynamics and transition path theory. By taking into account the kinetic information implied by the energy landscape of a system, we can interpret dynamical node-node relationships in reduced dimensions. We demonstrate the framework on Lennard-Jones clusters and a human DNA sequence.

Keywords

Network embedding / metadynamics / transition path theory / energy landscapes

Cite this article

Paula Mercurio, Di Liu. Clustering molecular energy landscapes by adaptive network embedding. Journal of Materials Informatics, 2024, 4(1): 3. DOI: 10.20517/jmi.2023.40


