Intelligent diagnosis for hot-rolled strip crown with unbalanced data using a hybrid multi-stage ensemble model

Cheng-yan Ding, Jie Sun, Xiao-jian Li, Wen Peng, Dian-hua Zhang

Journal of Central South University, 2024, Vol. 31, Issue 3: 762-782. DOI: 10.1007/s11771-024-5579-6

Abstract

To improve the smart manufacturing capabilities of strip hot rolling, this paper proposes a data-driven approach, built on digital twin (DT) and cyber-physical system (CPS) concepts, for diagnosing hot-rolled strip crown. Because the hot rolling process features heredity, nonlinearity and strong coupling, the diagnosis of strip crown is an imbalanced problem with ill-defined decision boundaries. Conventional regression methods tend to learn more from the majority class and therefore neglect strips with unqualified crown. To address this challenge, a hybrid multi-stage ensemble model (HMSEN) is presented to classify strip crown. First, a novel data-resampling method that combines adaptive synthetic sampling (ADASYN) with repeated edited nearest neighbor (RENN) is proposed to assign more attention to unqualified crown. Then, using the reinforced data, a multi-stage ensemble model is built to enhance the classification performance. Furthermore, the best-performing HMSEN is identified by exploring various combinations of base classifiers. The experimental results demonstrate that the proposed resampling method outperforms the comparison methods on the crown dataset. Notably, the proposed HMSEN outperforms not only the existing regression models but also the mechanism model. HMSEN is therefore the most robust and effective of the compared models for intelligently diagnosing hot-rolled strip crown with unbalanced data.
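The resampling step named in the abstract (ADASYN oversampling followed by RENN cleaning) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names `nearest`, `adasyn` and `renn`, the neighbor counts, the one-vs-rest treatment of minority classes, the choice to clean all classes, and the toy two-feature data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def nearest(idx, X, k, candidates):
    """Indices of the k candidates closest to X[idx], excluding idx itself."""
    others = [j for j in candidates if j != idx]
    return sorted(others, key=lambda j: math.dist(X[idx], X[j]))[:k]


def adasyn(X, y, k=5, rng=None):
    """Simplified ADASYN: oversample each smaller class toward the largest one,
    generating more synthetic points around samples with many other-class neighbors."""
    rng = rng or random.Random(0)
    counts = Counter(y)
    target = max(counts.values())
    X_new, y_new = list(X), list(y)
    everyone = range(len(X))
    for label, count in counts.items():
        if count == target or count < 2:
            continue
        members = [i for i in everyone if y[i] == label]
        # r_i: share of other-class points among the k nearest neighbors.
        ratios = [sum(y[j] != label for j in nearest(i, X, k, everyone)) / k
                  for i in members]
        total = sum(ratios)
        if total == 0:  # no member lies near another class; nothing to reinforce
            continue
        for i, r in zip(members, ratios):
            n_synthetic = round(r / total * (target - count))
            same_class = nearest(i, X, k, members)
            for _ in range(n_synthetic):
                j = rng.choice(same_class)
                lam = rng.random()  # interpolation weight in [0, 1)
                X_new.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
                y_new.append(label)
    return X_new, y_new


def renn(X, y, k=3, max_iter=100):
    """Repeated edited nearest neighbors: drop samples whose label disagrees
    with the plurality of their k nearest neighbors, until nothing changes."""
    X, y = list(X), list(y)
    for _ in range(max_iter):
        everyone = range(len(X))
        keep = []
        for i in everyone:
            # Neighbors arrive nearest-first, so vote ties favor the closest label.
            votes = Counter(y[j] for j in nearest(i, X, k, everyone))
            if not votes or votes.most_common(1)[0][0] == y[i]:
                keep.append(i)
        if len(keep) == len(X):
            break
        X, y = [X[i] for i in keep], [y[i] for i in keep]
    return X, y


# Toy data (hypothetical): 60 "qualified" strips and 12 "unqualified" ones.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(60)]
X += [[data_rng.gauss(1.5, 1), data_rng.gauss(1.5, 1)] for _ in range(12)]
y = ["qualified"] * 60 + ["unqualified"] * 12

X_over, y_over = adasyn(X, y)
X_clean, y_clean = renn(X_over, y_over)
print(Counter(y), Counter(y_over), Counter(y_clean))
```

Running oversampling before cleaning mirrors the order given in the abstract: synthetic minority samples are added first, and the repeated nearest-neighbor edit then removes samples (original or synthetic) that sit in regions dominated by another class.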

Keywords

hot-rolled strip crown diagnosis / imbalanced multi-class classification / multi-stage ensemble modeling / data-resampling method / smart manufacturing / cyber-physical system

Cite this article

Cheng-yan Ding, Jie Sun, Xiao-jian Li, Wen Peng, Dian-hua Zhang. Intelligent diagnosis for hot-rolled strip crown with unbalanced data using a hybrid multi-stage ensemble model. Journal of Central South University, 2024, 31(3): 762-782. https://doi.org/10.1007/s11771-024-5579-6
