Bridging language models and computational materials science: A prompt-driven framework for material property prediction

Shuai Lv, Lei Peng, Wentiao Wu, Yufan Yao, Shizhe Jiao, Wei Hu

Materials Genome Engineering Advances ›› 2025, Vol. 3 ›› Issue (2) : e70013 DOI: 10.1002/mgea.70013
RESEARCH ARTICLE


Abstract

Large language models (LLMs) have demonstrated effectiveness in interpreting complex data, yet they struggle in specialized applications such as material property prediction because of their limited integration with domain-specific knowledge. To overcome these challenges, we introduce MatAgent, an artificial intelligence (AI) agent that combines computational chemistry tools, such as first-principles (FP) calculations, with the capabilities of LLMs to predict key material properties. Through prompt engineering and advanced reasoning techniques, MatAgent integrates a suite of tools and acquires domain-specific knowledge for material property prediction, enabling it to predict material properties accurately without the need for predefined input structures. Experimental results show that MatAgent substantially improves prediction accuracy and efficiency. As a novel approach that couples LLMs with FP calculation tools, MatAgent highlights the potential of combining advanced computational techniques to enhance material property prediction, representing a significant advance in computational materials science.
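The agent architecture the abstract describes (an LLM reasoning step that selects and invokes domain tools, with results fed back for the final answer) can be sketched in a few lines. This is a generic illustration, not code from the paper: the keyword router stands in for the LLM's reasoning step, and the canned lookup tables stand in for real first-principles calculations.

```python
# Minimal sketch of a prompt-driven tool-dispatch loop in the spirit of an
# LLM agent such as MatAgent.  The "model" is a stub keyword router and the
# tools return canned reference values; a real system would call an LLM API
# and launch first-principles calculations instead.

TOOLS = {
    # tool name -> callable that "computes" a property for a formula
    "band_gap": lambda formula: {"Si": 1.12, "GaAs": 1.42}.get(formula),
    "lattice_constant": lambda formula: {"Si": 5.431, "GaAs": 5.653}.get(formula),
}

def route(query: str) -> str:
    """Stand-in for the LLM reasoning step: choose a tool from the prompt."""
    for name in TOOLS:
        if name.replace("_", " ") in query.lower():
            return name
    raise ValueError("no matching tool for query")

def agent(query: str, formula: str) -> str:
    tool = route(query)            # reasoning: decide which tool applies
    value = TOOLS[tool](formula)   # acting: invoke the domain tool
    return f"{tool} of {formula}: {value}"  # observation folded into the answer

print(agent("What is the band gap of this material?", "Si"))
# → band_gap of Si: 1.12
```

The separation between the routing step and the tool registry mirrors the reasoning-and-acting pattern common to tool-using agents: adding a new capability means registering a new callable, not retraining the model.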

Keywords

AI agent / AI for materials science / large language models / material property prediction

Cite this article

Shuai Lv, Lei Peng, Wentiao Wu, Yufan Yao, Shizhe Jiao, Wei Hu. Bridging language models and computational materials science: A prompt-driven framework for material property prediction. Materials Genome Engineering Advances, 2025, 3(2): e70013 DOI:10.1002/mgea.70013



RIGHTS & PERMISSIONS

2025 The Author(s). Materials Genome Engineering Advances published by Wiley-VCH GmbH on behalf of University of Science and Technology Beijing.
