Classification of forest vegetation with the application of iterative reallocation and model-based clustering

Naghmeh Pakgohar, Javad Eshaghi Rad, Hossein Gholami, Ahmad Alijanpour, David W. Roberts, Attila Lengyel, Enrico Feoli

Journal of Forestry Research ›› 2025, Vol. 36 ›› Issue (1)

DOI: 10.1007/s11676-025-01867-2
Original Paper


Abstract

Numerous clustering algorithms are valuable for pattern recognition in forest vegetation, and new ones are continually being proposed. While some are well known, others are underutilized in vegetation science. This study compares the performance of practical iterative reallocation algorithms with that of model-based clustering algorithms. The data are from forest vegetation in Virginia (United States), the Hyrcanian Forest (Asia), and European beech forests. Practical iterative reallocation algorithms were applied as non-hierarchical methods, and finite Gaussian mixture modeling was used as the model-based clustering method. Because model-based clustering is limited in the dimensionality it can handle, principal coordinates analysis was employed to reduce the dimensions of the dataset. A log transformation was applied to the pseudo-species data to approximate a normal distribution before calculating the Bray–Curtis dissimilarity. The findings indicate that the reallocation of misclassified objects based on silhouette width (OPTSIL) with Flexible-β (−0.25) had the highest mean among the tested clustering algorithms, with Silhouette width 1 (REMOS1) with Flexible-β (−0.25) second. However, model-based clustering performed poorly. Based on these results, OPTSIL with Flexible-β (−0.25) and REMOS1 with Flexible-β (−0.25) are recommended for forest vegetation classification instead of model-based clustering, particularly for the heterogeneous datasets common in forest vegetation community data.
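The preprocessing and evaluation steps named in the abstract (log transformation, Bray–Curtis dissimilarity, principal coordinates analysis, silhouette width) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `log1p` form of the log transformation, and the toy cover matrix are all assumptions made for illustration, and the Gaussian mixture, OPTSIL and REMOS steps are omitted.

```python
import numpy as np

def bray_curtis(x):
    """Pairwise Bray-Curtis dissimilarity between rows of a non-negative matrix."""
    x = np.asarray(x, dtype=float)
    num = np.abs(x[:, None, :] - x[None, :, :]).sum(axis=2)
    den = (x[:, None, :] + x[None, :, :]).sum(axis=2)
    # Two empty plots give 0/0; treat their dissimilarity as 0.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def pcoa(d, k):
    """Principal coordinates analysis (classical scaling), keeping k axes."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:k]
    # Bray-Curtis is not Euclidean, so eigenvalues can be negative; clip them to 0.
    kept = np.clip(vals[order], 0.0, None)
    return vecs[:, order] * np.sqrt(kept)

def mean_silhouette(d, labels):
    """Mean silhouette width of a partition, computed from a dissimilarity matrix."""
    labels = np.asarray(labels)
    widths = np.zeros(d.shape[0])
    for i in range(d.shape[0]):
        same = labels == labels[i]
        same[i] = False
        if not same.any():
            continue  # singleton cluster: width 0 by convention
        a = d[i, same].mean()
        b = min(d[i, labels == c].mean() for c in np.unique(labels) if c != labels[i])
        scale = max(a, b)
        widths[i] = (b - a) / scale if scale > 0 else 0.0
    return widths.mean()

# Hypothetical toy data: 6 plots x 4 species, cover values invented for illustration.
cover = np.array([
    [40, 30, 0, 0],
    [35, 25, 5, 0],
    [45, 20, 0, 0],
    [0, 0, 50, 20],
    [0, 5, 40, 30],
    [0, 0, 35, 25],
], dtype=float)

d = bray_curtis(np.log1p(cover))   # log transformation, then dissimilarity
scores = pcoa(d, k=2)              # reduced coordinates, e.g. as input to a mixture model
good = mean_silhouette(d, [0, 0, 0, 1, 1, 1])
bad = mean_silhouette(d, [0, 1, 0, 1, 0, 1])
print(f"silhouette (grouped): {good:.3f}, silhouette (alternating): {bad:.3f}")
```

In this toy case the partition that follows the two species groups should score a higher mean silhouette width than the alternating one, which is the kind of comparison silhouette-based evaluation relies on.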

Keywords

Classification / Heuristic clustering / Finite mixture / Forest ecosystems / Model-based clustering

Cite this article

Naghmeh Pakgohar, Javad Eshaghi Rad, Hossein Gholami, Ahmad Alijanpour, David W. Roberts, Attila Lengyel, Enrico Feoli. Classification of forest vegetation with the application of iterative reallocation and model-based clustering. Journal of Forestry Research, 2025, 36(1). DOI: 10.1007/s11676-025-01867-2



RIGHTS & PERMISSIONS

Northeast Forestry University
