Classification of forest vegetation with the application of iterative reallocation and model-based clustering

Naghmeh Pakgohar, Javad Eshaghi Rad, Hossein Gholami, Ahmad Alijanpour, David W. Roberts, Attila Lengyel, Enrico Feoli

Journal of Forestry Research, 2025, Vol. 36, Issue 1: 75. DOI: 10.1007/s11676-025-01867-2

Original Paper

Abstract

Numerous clustering algorithms are valuable for pattern recognition in forest vegetation, and new ones are continually being proposed. While some are well known, others are underutilized in vegetation science. This study compares the performance of practical iterative reallocation algorithms with that of model-based clustering algorithms. The data are from forest vegetation in Virginia (United States), the Hyrcanian Forest (Asia), and European beech forests. Practical iterative reallocation algorithms were applied as non-hierarchical methods, and finite Gaussian mixture modeling was used as the model-based clustering method. Because model-based clustering is limited in the dimensionality it can handle, principal coordinates analysis was employed to reduce the dataset’s dimensions. A log transformation was applied to achieve a normal distribution for the pseudo-species data before calculating the Bray–Curtis dissimilarity. The findings indicate that the reallocation of misclassified objects based on silhouette width (OPTSIL) with Flexible-β (−0.25) had the highest mean among the tested clustering algorithms, with Silhouette width 1 (REMOS1) with Flexible-β (−0.25) second. Model-based clustering, by contrast, performed poorly. Based on these results, it is recommended to use OPTSIL with Flexible-β (−0.25) and REMOS1 with Flexible-β (−0.25) for forest vegetation classification instead of model-based clustering, particularly for the heterogeneous datasets common in forest vegetation community data.
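To make the quantities named in the abstract concrete, the following is a minimal Python sketch of a log transformation, the Bray–Curtis dissimilarity, and Rousseeuw's silhouette width, which is the criterion that silhouette-based reallocation methods work with. It is not the OPTSIL or REMOS procedure and not the authors' code. The `log(x + 1)` offset, the function names, and the four-plot toy dataset are assumptions made purely for illustration.

```python
import math


def log_transform(plots):
    """Apply log(x + 1) to each abundance (the +1 offset is an assumption)."""
    return [[math.log1p(v) for v in row] for row in plots]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den > 0 else 0.0


def dissimilarity_matrix(plots):
    """Full symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(plots)
    return [[bray_curtis(plots[i], plots[j]) for j in range(n)] for i in range(n)]


def silhouette_widths(dist, labels):
    """Silhouette width s(i) = (b - a) / max(a, b) for every object."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")
    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton cluster: width is conventionally 0
            widths.append(0.0)
            continue
        # a: mean dissimilarity to the other members of the object's own cluster
        a = sum(dist[i][j] for j in own) / len(own)
        # b: mean dissimilarity to the nearest neighbouring cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths


def mean_silhouette(dist, labels):
    """Mean silhouette width of a partition."""
    widths = silhouette_widths(dist, labels)
    return sum(widths) / len(widths)


# Hypothetical toy data: 4 plots x 3 species (not from the study).
plots = [[10, 0, 1], [8, 1, 0], [0, 9, 7], [1, 10, 6]]
dist = dissimilarity_matrix(log_transform(plots))

good = [0, 0, 1, 1]  # groups plots with similar composition
bad = [0, 1, 0, 1]   # splits similar plots across clusters

print("mean silhouette (good):", mean_silhouette(dist, good))
print("mean silhouette (bad): ", mean_silhouette(dist, bad))
```

Objects with a negative silhouette width sit closer, on average, to a neighbouring cluster than to their own. This is the sense in which an object counts as misclassified in silhouette-based reallocation.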

Keywords

Classification / Heuristic clustering / Finite mixture / Forest ecosystems / Model-based clustering

Cite this article

Naghmeh Pakgohar, Javad Eshaghi Rad, Hossein Gholami, Ahmad Alijanpour, David W. Roberts, Attila Lengyel, Enrico Feoli. Classification of forest vegetation with the application of iterative reallocation and model-based clustering. Journal of Forestry Research, 2025, 36(1): 75. DOI: 10.1007/s11676-025-01867-2

RIGHTS & PERMISSIONS

Northeast Forestry University
