A neural network-based production process modeling and variable importance analysis approach in corn to sugar factory
Yi Tong, Mou Shu, Mingxin Li, Yingwei Liu, Ran Tao, Congcong Zhou, You Zhao, Guoxing Zhao, Yi Li, Yachao Dong, Lei Zhang, Linlin Liu, Jian Du
The corn to sugar process has long suffered from high energy consumption and thin profit margins. However, it is hard to upgrade or optimize the process with mechanistic unit operation models because the underlying processes are highly complex. Big data technology offers a promising alternative through its ability to turn huge amounts of data into insights for operational decisions. In this paper, a neural network-based production process modeling and variable importance analysis approach is proposed for corn to sugar processes. It comprises data preprocessing, dimensionality reduction, modeling based on a multilayer perceptron, a convolutional neural network and a recurrent neural network, and an extended weights connection method. In the established model, the dextrose equivalent value is selected as the output, and 654 sites from the distributed control system (DCS) are selected as the inputs. LASSO analysis is first applied to reduce the input dimension from 654 to 155, and genetic algorithm optimization then reduces the inputs further to 50. Finally, variable importance analysis is carried out with the extended weights connection method, and the 20 most important sites are selected for each neural network. The results indicate that the multilayer perceptron and recurrent neural network models have a relative error of less than 0.1%, a better prediction result than other models, and that the 20 most important sites they select are easier to interpret. The main contributions of this work are a high-accuracy process simulation model and a basis for process optimization built on the selected most important sites, both of which help maintain high-quality, stable production in corn to sugar processes.
big data / corn to sugar factory / neural network / variable importance analysis
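As a rough illustration of the variable importance step described in the abstract, the sketch below ranks the inputs of a small feed-forward network by multiplying its layer weight matrices together, in the spirit of connection-weight methods. The abstract does not give the formula of the extended weights connection method, so the chained matrix product, the function names `connection_weight_importance` and `top_k_inputs`, and the toy weights are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def connection_weight_importance(weights):
    """Chain the layer weight matrices and return one signed score per input.

    weights: list of 2-D arrays; weights[k] has shape (n_in_k, n_out_k).
    Assumption: the last layer has a single output unit (here it would be
    the dextrose equivalent value). Biases and activations are ignored.
    """
    product = weights[0]
    for w in weights[1:]:
        product = product @ w
    return product[:, 0]

def top_k_inputs(scores, k):
    """Indices of the k inputs with the largest absolute score, strongest first."""
    order = np.argsort(-np.abs(scores), kind="stable")
    return order[:k]

# Toy network: 3 inputs -> 2 hidden units -> 1 output (weights are made up).
w_hidden = np.array([[1.0, 0.0],
                     [0.0, 2.0],
                     [0.5, 0.5]])
w_out = np.array([[1.0],
                  [-1.0]])

scores = connection_weight_importance([w_hidden, w_out])
ranking = top_k_inputs(scores, 2)
print(scores)   # signed importance score for each input
print(ranking)  # indices of the two most influential inputs
```

In a setting like the one in the abstract, `weights` would hold the trained weights of a network with 50 inputs and `k` would be 20. A score near zero, as for the third toy input, means that positive and negative pathways cancel, which is one reason signed products are often preferred over absolute-value sums.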