Utilizing machine learning models to grasp water quality dynamic changes in lake eutrophication through phytoplankton parameters

Yong Fang, Ruting Huang, Yeyin Zhang, Jun Zhang, Wenni Xi, Xianyang Shi

Front. Environ. Sci. Eng. ›› 2025, Vol. 19 ›› Issue (2) : 14

DOI: 10.1007/s11783-025-1934-6
RESEARCH ARTICLE


Abstract

Phytoplankton serve as vital indicators of eutrophication levels. However, relying solely on phytoplankton parameters, such as chlorophyll-a, limits our understanding of the intricate eutrophication conditions in natural lakes, particularly the timely analysis of changes in limiting nutrients and their concentrations. This study presents machine learning (ML) models for predicting and identifying lake eutrophication. Five tree-based ML models were developed using the latest data on hydrological, water quality, and meteorological parameters obtained from 34 sites in the Huating Lake basin over 5 months. The extreme gradient boosting (XGBoost) model exhibited high accuracy in predicting the total nitrogen/total phosphorus (TN/TP) ratio (R² = 0.88; RMSE = 24.60; MAPE = 26.14%). Analysis of the TN/TP ratio and output eigenvalue weight revealed that phosphorus plays a crucial role in eutrophication, probably because of the low-flow and deep-water characteristics of the basin. Furthermore, the light gradient boosting machine (LightGBM) model exhibited high accuracy in predicting phytoplankton parameters, especially the Shannon index (H′) (R² = 0.92; RMSE = 0.11; MAPE = 4.95%). The mesotrophic classification of Huating Lake, determined using the H′ threshold, coincided with the findings from the H′ analysis. Future research should cover a wider range of pollution sources and spatiotemporal dimensions to further validate our findings. Overall, this study highlights the potential of incorporating the TN/TP ratio and phytoplankton parameters into ML techniques for effective monitoring and management of environmental conditions.
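The three scores quoted in the abstract (R², RMSE, and MAPE) follow standard definitions. The sketch below shows one common way to compute them; it is an illustration, not the authors' evaluation code. The function names and the sample TN/TP values are hypothetical, and MAPE is assumed to be expressed in percent.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all observed values are identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero.
    """
    relative = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * relative / len(y_true)


if __name__ == "__main__":
    # Hypothetical observed and predicted TN/TP ratios, for illustration only.
    observed = [40.0, 55.0, 120.0, 80.0]
    predicted = [45.0, 50.0, 110.0, 90.0]
    print(r_squared(observed, predicted))
    print(rmse(observed, predicted))
    print(mape(observed, predicted))
```

RMSE is scale-dependent, so the values reported for TN/TP and for H′ are not directly comparable with each other; R² and MAPE are the scale-free scores.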

Keywords

Machine learning / Lake / Phytoplankton / Water quality

Highlight

● Accurate identification of lake eutrophication was achieved via ML models.

● XGBoost model has superior performance in identifying limiting nutrients.

● LightGBM model effectively uses phytoplankton for water quality characterization.

● ML model with TN/TP ratio and phytoplankton can track lake eutrophication dynamics.
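The Shannon index (H′) that the phytoplankton models target is conventionally defined as H′ = −Σ pᵢ ln pᵢ, where pᵢ is the share of individuals belonging to taxon i. A minimal sketch of that calculation follows. The function name and the cell counts are hypothetical, and the natural logarithm is an assumption: the abstract does not state the log base, and some studies use log₂ instead.

```python
import math


def shannon_index(counts):
    """Shannon diversity index H' = -sum(p_i * ln(p_i)).

    counts: abundance per taxon (e.g. cell counts); zero counts are skipped
    because p * ln(p) tends to 0 as p tends to 0.
    Natural log is an assumption; swap in math.log2 if the study uses base 2.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


if __name__ == "__main__":
    # Hypothetical cell counts for four phytoplankton taxa at one site.
    site_counts = [120, 45, 30, 5]
    print(shannon_index(site_counts))
```

A community dominated by one taxon gives H′ near 0, while an even split across S taxa gives the maximum ln S.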

Cite this article

Yong Fang, Ruting Huang, Yeyin Zhang, Jun Zhang, Wenni Xi, Xianyang Shi. Utilizing machine learning models to grasp water quality dynamic changes in lake eutrophication through phytoplankton parameters. Front. Environ. Sci. Eng., 2025, 19(2): 14 DOI:10.1007/s11783-025-1934-6

RIGHTS & PERMISSIONS

Higher Education Press 2025
