Utilizing machine learning models to grasp water quality dynamic changes in lake eutrophication through phytoplankton parameters
Yong Fang, Ruting Huang, Yeyin Zhang, Jun Zhang, Wenni Xi, Xianyang Shi
● Accurate identification of lake eutrophication was achieved via ML models.
● XGBoost model has superior performance in identifying limiting nutrients.
● LightGBM model effectively uses phytoplankton for water quality characterization.
● ML model with TN/TP ratio and phytoplankton can track lake eutrophication dynamics.
Phytoplankton are vital indicators of eutrophication. However, relying solely on phytoplankton parameters such as chlorophyll-a limits our understanding of the complex eutrophication conditions in natural lakes, particularly the timely analysis of changes in limiting nutrients and their concentrations. This study presents machine learning (ML) models for predicting and identifying lake eutrophication. Five tree-based ML models were developed using recent hydrological, water quality, and meteorological data collected at 34 sites in the Huating Lake basin over 5 months. The extreme gradient boosting (XGBoost) model predicted the total nitrogen/total phosphorus (TN/TP) ratio with high accuracy (R² = 0.88; RMSE = 24.60; MAPE = 26.14%). Analysis of the TN/TP ratio and output eigenvalue weight revealed that phosphorus plays a crucial role in eutrophication, probably because of the low-flow and deep-water characteristics of the basin. Furthermore, the light gradient boosting machine (LightGBM) model predicted phytoplankton parameters with high accuracy, especially the Shannon index (H′) (R² = 0.92; RMSE = 0.11; MAPE = 4.95%). The mesotrophic classification of Huating Lake determined using the H′ threshold coincided with the findings from the H′ analysis. Future research should cover a wider range of pollution sources and spatiotemporal dimensions to further validate these findings. Overall, this study highlights the potential of incorporating the TN/TP ratio and phytoplankton parameters into ML techniques for effective monitoring and management of environmental conditions.
Machine learning / Lake / Phytoplankton / Water quality
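For readers unfamiliar with the quantities reported in the abstract, the following minimal sketch shows how the Shannon index (H′) and the three evaluation metrics (R², RMSE, MAPE) are conventionally computed. It is not the authors' code. The function names are hypothetical, and the natural-log base for H′ is an assumption, since the abstract does not state which base was used.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts.

    Assumption: natural logarithm; some studies use log base 2 instead.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target variable."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error (%); undefined if any true value is 0."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
```

With these definitions, a sample in which four taxa are equally abundant gives H′ = ln 4, the maximum for four taxa. RMSE is expressed in the units of the predicted variable, which is why the RMSE values reported for the TN/TP ratio and for H′ are not directly comparable.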