MatSci-ML Studio: an interactive workflow toolkit for automated machine learning in materials science

Yu Wang, Fei Wang, Guangmao Yan, Jun Wang, Guodong Niu, Jing Feng, Jian Mao, Yan Zhao

Journal of Materials Informatics ›› 2025, Vol. 5 ›› Issue (4) : 51

DOI: 10.20517/jmi.2025.45
Research Article


Abstract

Machine learning (ML) has become a cornerstone of modern materials science, offering powerful tools for predicting material properties and accelerating experimental workflows. However, its widespread adoption is often hindered by the steep learning curve associated with programming languages such as Python, which presents a significant technical barrier for many domain experts. To address this challenge, we introduce MatSci-ML Studio: an interactive and user-friendly software toolkit designed to empower materials scientists with limited coding expertise. In contrast to traditional code-based frameworks, MatSci-ML Studio features an intuitive graphical user interface that encapsulates a comprehensive, end-to-end ML workflow. This integrated platform seamlessly guides users through data management, advanced preprocessing, multi-strategy feature selection, automated hyperparameter optimization, and model training, democratizing advanced computational analysis for the materials community. Notably, it incorporates advanced capabilities such as a SHapley Additive exPlanations-based interpretability analysis module for explaining model predictions and a multi-objective optimization engine for exploring complex design spaces. The practicality and effectiveness of MatSci-ML Studio are demonstrated through representative case studies, confirming its capacity to lower the technical barrier for ML applications, foster innovation, and significantly enhance the efficiency of data-driven materials science.
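The multi-objective optimization mentioned in the abstract rests on the idea of Pareto dominance: a candidate is kept only if no other candidate is at least as good in every objective and strictly better in one. The following is a minimal sketch of that idea, assuming two objectives that are both maximized. The function names and the candidate values are hypothetical illustrations, and this brute-force filter is not taken from MatSci-ML Studio itself.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b in every objective and
    strictly better in at least one (assumes all objectives are maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical alloy candidates: (tensile strength in MPa, elongation in %).
candidates = [
    (520.0, 8.0),
    (480.0, 12.0),
    (450.0, 10.0),  # dominated by (480.0, 12.0)
    (530.0, 6.0),
    (400.0, 15.0),
]

front = pareto_front(candidates)
print(front)
```

The filter compares every pair of candidates, so it scales quadratically with the number of candidates. That is adequate for illustrating the trade-off between two properties, but larger design spaces usually call for dedicated evolutionary algorithms.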

Keywords

Materials informatics / machine learning / materials science / automation tools / performance prediction

Cite this article

Yu Wang, Fei Wang, Guangmao Yan, Jun Wang, Guodong Niu, Jing Feng, Jian Mao, Yan Zhao. MatSci-ML Studio: an interactive workflow toolkit for automated machine learning in materials science. Journal of Materials Informatics, 2025, 5(4): 51. DOI: 10.20517/jmi.2025.45


