Abstract
Symbolic regression (SR), which searches for mathematical expressions that fit a given data set to construct an interpretable model, is emerging as a powerful computational technique with the potential to transform "black box" machine learning methods into physically and chemically interpretable expressions in materials science research. In this review, current advancements in SR are surveyed, focusing on the underlying theories, fundamental flowcharts, various techniques, implemented codes, and application fields. More importantly, the challenges that must be overcome and the future opportunities for unlocking the full potential of SR in materials design and research are discussed, including graphics processing unit acceleration and transfer learning algorithms, the trade-off between expression accuracy and complexity, physically or chemically interpretable SR with generative large language models, and multimodal SR methods.
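To make the idea concrete, the search the abstract describes can be sketched in miniature: sample small expression trees over a few operators and variables, score each against the data, and keep the best. This is an illustrative toy only (random search rather than the genetic-programming, sparsity-based, or deep-learning SR methods the review covers), and all function names below are hypothetical.

```python
import random

# Toy symbolic regression: random search over small expression trees.
# Operators and terminals are deliberately minimal; real SR tools search
# the expression space far more efficiently.
OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b}

def random_expr(depth=3):
    """Build a random expression tree as nested tuples."""
    if depth == 0 or random.random() < 0.3:
        # Terminal: the input variable or a small constant.
        return random.choice(["x", round(random.uniform(-2, 2), 1)])
    op = random.choice(list(OPS))
    return (op, random_expr(depth - 1), random_expr(depth - 1))

def evaluate(expr, x):
    """Recursively evaluate an expression tree at input x."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, x), evaluate(right, x))

def mse(expr, xs, ys):
    """Mean-squared error of the expression on the data set."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def symbolic_regress(xs, ys, trials=20000, seed=0):
    """Keep the lowest-error expression found by pure random search."""
    random.seed(seed)
    best, best_err = "x", mse("x", xs, ys)
    for _ in range(trials):
        cand = random_expr()
        err = mse(cand, xs, ys)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err

if __name__ == "__main__":
    xs = [i / 4 for i in range(-8, 9)]
    ys = [x * x + 1 for x in xs]  # hidden "law": y = x^2 + 1
    expr, err = symbolic_regress(xs, ys)
    print(expr, err)
```

Unlike a neural network fit, the output here is an explicit formula (a tree such as `('+', ('*', 'x', 'x'), 1.0)`), which is exactly the interpretability that motivates SR in materials research.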
Keywords: explainable machine learning / material database / materials science / representation learning / symbolic regression
Cite this article
Guanjie Wang, Erpeng Wang, Zefeng Li, Jian Zhou, Zhimei Sun. Exploring the mathematic equations behind the materials science data using interpretable symbolic regression. Interdisciplinary Materials. 2024;3(5):637–657. DOI: 10.1002/idm2.12180
|
| [193] |
Flores E, Wölke C, Yan P, et al. Learning the laws of lithium-ion transport in electrolytes using symbolic regression. Digit Discov. 2022;1(4):440–447.
|
| [194] |
Zhong Y, Hu X, Sarker D, et al. Data analytics accelerates the experimental discovery of Cu1−xAgxGaTe2 based thermoelectric chalcogenides with high figure of merit. J Mater Chem A. 2023;11(35):18651–18659.
|
| [195] |
Wang E, Wang G, Zhou J, Sun Z. MBenes-supported single atom catalysts for oxygen reduction and oxygen evolution reaction by first-principles study and machine learning. Natl Sci Open. 2024;3:20230043.
|
| [196] |
Ram S, Choi GH, Lee AS, Lee SC, Bhattacharjee S. Combining first-principles modeling and symbolic regression for designing efficient single-atom catalysts in the oxygen evolution reaction on Mo2CO2 MXenes. ACS Appl Mater Interfaces. 2023;15(37):43702–43711.
|
| [197] |
Kenoufi A, Kholmurodov K. Symbolic regression of inter-atomic potentials via genetic programming. Biol Chem Res. 2015;2:1–10.
|
| [198] |
Hernandez A, Balasubramanian A, Yuan F, Mason SAM, Mueller T. Fast, accurate, and transferable many-body interatomic potentials by symbolic regression. npj Comput Mater. 2019;5:112.
|
| [199] |
Hernandez A, Mueller T. Generalizability of functional forms for interatomic potential models discovered by symbolic regression. Phys Rev Mater. 2023;7(5):053804.
|
| [200] |
Pospichal P, Murphy E, O’Neill M, Schwarz J, Jaros J. Acceleration of grammatical evolution using graphics processing units: computational intelligence on consumer games and graphics hardware. Proceedings of the 13th Annual Conference Companion on Genetic and Evolutionary Computation, 431–438. Association for Computing Machinery;2011.
|
| [201] |
Van Heeswijk M, Miche Y, Oja E, Lendasse A. GPU-accelerated and parallelized ELM ensembles for large-scale regression. Neurocomputing. 2011;74(16):2430–2437.
|
| [202] |
Chen Q, Xue B, Zhang M. Genetic programming for instance transfer learning in symbolic regression. IEEE Trans Cybern. 2022;52(1):25–38.
|
| [203] |
Muller B, Al-Sahaf H, Xue B, Zhang M. Transfer learning: a building block selection mechanism in genetic programming for symbolic regression. Proceedings of the Genetic and Evolutionary Computation Conference Companion, 350–351. Association for Computing Machinery;2019.
|
| [204] |
Haslam E, Xue B, Zhang M. Further investigation on genetic programming with transfer learning for symbolic regression. IEEE Congress on Evolutionary Computation (CEC), 3598–3605. IEEE;2016.
|
| [205] |
Otte C. Safe and interpretable machine learning: a methodological review. Computational Intelligence in Intelligent Data Analysis. Vol 445. Springer;2013:111–122.
|
| [206] |
Smits GF, Kotanchek M. Pareto-front exploitation in symbolic regression. Genetic Programming Theory and Practice II. Vol 8. Springer;2005:283–299.
|
RIGHTS & PERMISSIONS
2024 The Authors. Interdisciplinary Materials published by Wuhan University of Technology and John Wiley & Sons Australia, Ltd.