Online machine learning for stream wastewater influent flow rate prediction under unprecedented emergencies

Pengxiao Zhou, Zhong Li, Yimei Zhang, Spencer Snowling, Jacob Barclay

Front. Environ. Sci. Eng., 2023, Vol. 17, Issue 12: 152

DOI: 10.1007/s11783-023-1752-7
RESEARCH ARTICLE

Abstract

● Online learning models accurately predict influent flow rate at wastewater plants.

● Models adapt to changing input-output relationships and handle large data sets efficiently.

● Online learning models outperform conventional batch learning models.

● An optimal prediction strategy is identified through uncertainty analysis.

● The proposed models provide support for coping with emergencies like COVID-19.

Accurate influent flow rate prediction is important for operators and managers at wastewater treatment plants (WWTPs), because flow rate is closely related to wastewater characteristics such as biochemical oxygen demand (BOD), total suspended solids (TSS), and pH. Previous studies have shown that data-driven models are effective tools for predicting influent flow rate. However, most of these studies have focused on batch learning, which is inadequate for wastewater prediction in the era of COVID-19 because influent patterns have changed significantly. Online learning, which is well suited to streaming data, large data sets, and changing data patterns, has the potential to address this issue. In this study, the conventional batch learning models Random Forest (RF), K-Nearest Neighbors (KNN), and Multi-Layer Perceptron (MLP) were compared with their respective online learning counterparts, Adaptive Random Forest (aRF), Adaptive K-Nearest Neighbors (aKNN), and Adaptive Multi-Layer Perceptron (aMLP), for predicting influent flow rate at two Canadian WWTPs. In all scenarios, the online learning models achieved a higher coefficient of determination (R²), a lower mean absolute percentage error (MAPE), and a lower root mean square error (RMSE) than the conventional batch learning models. The R² values on the testing data set for 24-h ahead prediction of the aRF, aKNN, and aMLP at Plant A were 0.90, 0.73, and 0.87, respectively; the corresponding values at Plant B were 0.75, 0.78, and 0.56. The proposed online learning models make reliable predictions under changing data patterns and handle continuous, large influent data streams efficiently. They can provide robust decision support for wastewater treatment and management in the changing era of COVID-19 and under other unprecedented emergencies that could change influent patterns.
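The contrast between batch and online learning described above can be illustrated with a minimal sketch. It is not the aRF, aKNN, or aMLP implementation used in the study: the data are synthetic, the "online" model is a plain sliding-window nearest-neighbour regressor, and the names `make_stream`, `knn_predict`, and `run_comparison`, the window length of 72 h, and k = 5 are all assumptions chosen for illustration. The batch model is fitted once and never updated, whereas the online model predicts each new sample first and then learns from it (test-then-train); both are scored with R², MAPE, and RMSE.

```python
import math
import random
from collections import deque


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(memory, x, k=5):
    """Average the targets of the k stored samples whose input is closest to x."""
    nearest = sorted(memory, key=lambda s: abs(s[0] - x))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def make_stream(n=600, shift_at=300, seed=0):
    """Synthetic hourly flow (hypothetical): a daily cycle whose level drops abruptly."""
    rng = random.Random(seed)
    stream = []
    for t in range(n):
        hour = t % 24
        base = 100.0 if t < shift_at else 70.0  # abrupt change in the flow pattern
        flow = base + 20.0 * math.sin(2 * math.pi * hour / 24) + rng.gauss(0, 2)
        stream.append((hour, flow))
    return stream


def run_comparison(train_size=200, window=72):
    """Score a frozen batch model and a test-then-train online model on one stream."""
    stream = make_stream()
    train, test = stream[:train_size], stream[train_size:]
    batch_memory = list(train)                   # fitted once, never updated
    online_memory = deque(train, maxlen=window)  # sliding window drops old samples
    y_true, batch_pred, online_pred = [], [], []
    for x, y in test:
        batch_pred.append(knn_predict(batch_memory, x))
        online_pred.append(knn_predict(online_memory, x))  # predict first...
        online_memory.append((x, y))                       # ...then learn from the sample
        y_true.append(y)
    scores = {}
    for name, pred in (("batch", batch_pred), ("online", online_pred)):
        scores[name] = {"r2": r2(y_true, pred),
                        "mape": mape(y_true, pred),
                        "rmse": rmse(y_true, pred)}
    return scores


if __name__ == "__main__":
    for name, metrics in run_comparison().items():
        print(name, {k: round(v, 3) for k, v in metrics.items()})
```

In this toy setting the frozen batch model keeps predicting the pre-change flow level, while the sliding-window model recovers once its window has filled with post-change samples. The sketch shows only the mechanism; it says nothing about the magnitude of the gains reported for the two plants.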

Keywords

Wastewater prediction / Data stream / Online learning / Batch learning / Influent flow rates

Cite this article

Pengxiao Zhou, Zhong Li, Yimei Zhang, Spencer Snowling, Jacob Barclay. Online machine learning for stream wastewater influent flow rate prediction under unprecedented emergencies. Front. Environ. Sci. Eng., 2023, 17(12): 152 DOI:10.1007/s11783-023-1752-7

RIGHTS & PERMISSIONS

Higher Education Press 2013
