Intelligent smelting process, management system: Efficient and intelligent management strategy by incorporating large language model

Tianjie FU, Shimin LIU, Peiyu LI

Front. Eng, 2024, Vol. 11, Issue 3: 396–412. DOI: 10.1007/s42524-024-4013-y
Industrial Engineering and Intelligent Manufacturing
RESEARCH ARTICLE

Abstract

In the steelmaking industry, enhancing production cost-effectiveness and operational efficiency requires the integration of intelligent systems to support production activities. Thus, effectively integrating various production modules is crucial to enable collaborative operations throughout the entire production chain, reducing management costs and complexities. This paper proposes, for the first time, the integration of Vision-Language Model (VLM) and Large Language Model (LLM) technologies in the steel manufacturing domain, creating a novel steelmaking process management system. The system facilitates data collection, analysis, visualization, and intelligent dialogue for the steelmaking process. The VLM module provides textual descriptions for slab defect detection, while LLM technology supports the analysis of production data and intelligent question-answering. The feasibility, superiority, and effectiveness of the system are demonstrated through production data and comparative experiments. The system has significantly lowered costs and enhanced operational understanding, marking a critical step toward intelligent and cost-effective management in the steelmaking domain.

Keywords

smelting steel / process management / large language models / intelligent Q & A / ChatGPT

Cite this article

Tianjie FU, Shimin LIU, Peiyu LI. Intelligent smelting process, management system: Efficient and intelligent management strategy by incorporating large language model. Front. Eng, 2024, 11(3): 396‒412 https://doi.org/10.1007/s42524-024-4013-y

Competing Interests

The authors declare that they have no competing interests.

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

RIGHTS & PERMISSIONS

© 2024 The Author(s). This article is published with open access at link.springer.com and journal.hep.com.cn.