A multi-agent deep reinforcement learning framework for the generative design of alloys and processing routes

Bilal Muhammed; Akash Bhattacharjee; B. P. Gautham; Amol Joshi

doi:10.36922/IJAMD025480050

International Journal of AI for Materials and Design, 2026, Vol. 3, Issue 1: 46-68. DOI: 10.36922/IJAMD025480050

ORIGINAL RESEARCH ARTICLE

Abstract

The design of alloys and their manufacturing processes requires extensive exploration of a broad design space comprising many compositional and processing variables, much of which remains inadequately explored in practice. The existence of multiple viable processing routes for achieving the desired alloy properties further complicates the design process. This paper presents a multi-agent deep reinforcement learning (DRL) framework for the in silico design of alloys and their processing routes and conditions, tailored to specific property targets. The framework consists of distinct decentralized DRL agents, each responsible for decisions on either composition selection or one of the individual manufacturing steps in the process. Each agent interacts with its own environment, which represents the assigned process, and the agents share responsibility for both process-specific outcomes and overall property satisfaction, as governed by the reward functions. The reward functions integrate considerations of sustainability, cost, and manufacturability into the decision-making process. A generative design step is proposed that uses the trained DRL agents to produce multiple design alternatives for a given requirement. The framework is applied to the design of a hot-rolled steel sheet, exploring two feasible processing routes, conventional casting and thin slab casting, and yields several alternatives for each route. The framework's performance is evaluated on two experimental cases from the literature, indicating its success in biasing the sampled designs toward the preferred solution space. A benchmark study compares the framework against designs produced by materials engineers for three distinct use cases, and the proposed framework is reported to perform better.
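The shared-responsibility reward described in the abstract can be illustrated with a minimal sketch. It assumes that each agent's reward is a weighted sum of its own process-specific score and a property-satisfaction term common to all agents, minus cost and sustainability penalties. The function names, weights, property names, and numbers below are hypothetical and are not taken from the paper, whose actual reward functions are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the source does not report any values."""
    local: float = 1.0           # process-specific outcome
    shared: float = 1.0          # overall property satisfaction
    cost: float = 0.1            # cost penalty
    sustainability: float = 0.1  # sustainability penalty


def property_satisfaction(achieved, targets):
    """Shared term: negative mean relative deviation from the targets (0.0 is best)."""
    deviations = [abs(achieved[name] - targets[name]) / abs(targets[name])
                  for name in targets]
    return -sum(deviations) / len(deviations)


def agent_reward(local_score, achieved, targets, cost, footprint,
                 weights=RewardWeights()):
    """Reward for one agent: its own process term plus the shared property term,
    minus assumed cost and sustainability penalties."""
    shared = property_satisfaction(achieved, targets)
    return (weights.local * local_score
            + weights.shared * shared
            - weights.cost * cost
            - weights.sustainability * footprint)


# Illustrative numbers only, not taken from the paper.
targets = {"yield_strength_mpa": 300.0, "elongation_pct": 30.0}
achieved = {"yield_strength_mpa": 285.0, "elongation_pct": 33.0}

# Both agents receive the same shared term but different local terms and penalties.
composition_reward = agent_reward(0.8, achieved, targets, cost=1.2, footprint=0.5)
rolling_reward = agent_reward(0.6, achieved, targets, cost=0.4, footprint=0.9)
print(composition_reward, rolling_reward)
```

In this sketch the shared term couples the agents: a composition choice that moves the final properties away from the targets lowers the reward of every downstream agent as well, which is one simple way to express joint responsibility for the end result.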

Keywords

Alloy and processing design / In silico design / Multi-agent systems / Deep reinforcement learning / Manufacturing process routes

Acknowledgments

The authors sincerely thank Harshad Khadilkar for his guidance and insightful comments, which helped in the development and implementation of the DRL agent strategy. They also extend their gratitude to Dr. Gerald Tennyson for his valuable guidance and comments to refine the alloy design use-case problem. Furthermore, the authors thank Srimannarayana Pusuluri, Surya Ardham, Harisankar K.R., and Sandeep Pusuluri for their active participation in the benchmark study. Finally, the authors would like to thank TCS Research for supporting this work.

Funding

None.

Conflict of interest

The authors declare that they have a pending patent titled ‘Methods and systems for automated design of materials and its manufacturing process for desired properties’ assigned to Tata Consultancy Services Ltd.

Author contributions

Conceptualization: Bilal Muhammed, Akash Bhattacharjee, B. P. Gautham

Data curation: Akash Bhattacharjee

Formal analysis: Bilal Muhammed, Akash Bhattacharjee

Investigation: Bilal Muhammed, B. P. Gautham

Methodology: All authors

Software: Bilal Muhammed

Supervision: B. P. Gautham

Validation: Akash Bhattacharjee, B. P. Gautham

Visualization: Bilal Muhammed, Akash Bhattacharjee

Writing - original draft: Bilal Muhammed

Writing - review & editing: Akash Bhattacharjee, B. P. Gautham, Amol Joshi

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data

The data used in this study are proprietary and subject to organizational confidentiality agreements, and therefore cannot be shared.

Further disclosure

Part of the findings was presented by Akash Bhattacharjee at the AI/ML & Multiscale Modeling for Materials Discovery Symposium at IIT Delhi, New Delhi. The title of the presentation was “Framework for in-silico generative alloy design.”
