Developing kNN forest data imputation for Catalonia

Timo Pukkala, Núria Aquilué, Ariadna Just, Jordi Corbera, Antoni Trasobares

Journal of Forestry Research, 2024, Vol. 35, Issue 1: 80. DOI: 10.1007/s11676-024-01735-5
Original Paper

Abstract

Combining LiDAR (Light Detection And Ranging) scanning with field inventories can provide spatially continuous, wall-to-wall information on forest characteristics. This information can be used in forest mapping, scenario analyses, and forest management planning. This study aimed to find the optimal way to obtain continuous forest data for Catalonia using k nearest neighbors (kNN) imputation. In this method, data are imputed to a location from the k field-measured sample plots that are most similar to the location in terms of LiDAR metrics and topographic variables. Weighted multidimensional Euclidean distance was used as the similarity measure. The study tested two methods of optimizing the distance measure. The first method worked in two steps: it first optimized the set of LiDAR and topographic variables used in the measure, together with the transformations of these variables, and then optimized the weights of the selected variables. The second method optimized the variable set, the transformations, and the weights in a single step. The two-step method gave the best imputation results. In the study area, the use of three to five nearest neighbors was recommended. Altitude and latitude turned out to be the most important variables when assessing the similarity of two locations in Catalan forests in the context of kNN data imputation. The optimal distance measure always included both LiDAR metrics and topographic variables. The study showed that the optimal similarity measure may differ between regions. Therefore, the study suggests that kNN data imputation should always begin with optimizing the measure used to select the k nearest neighbors.
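
The imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the distance is the square root of the weighted sum of squared predictor differences, that the imputed value is the plain mean over the k nearest plots, and that the predictors are already scaled. The function names, the two predictors, the weights, and the plot data are all hypothetical. The leave-one-out RMSE is shown only as one plausible objective that a weight optimizer could minimize.

```python
import math


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance between two predictor vectors (assumed form)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def knn_impute(target, plots, weights, k=3):
    """Impute an attribute to `target` as the mean over the k most similar plots.

    plots: list of (predictor_vector, attribute_value) pairs from field plots.
    Taking the unweighted mean of the neighbors is an assumption of this sketch.
    """
    ranked = sorted(plots, key=lambda p: weighted_distance(target, p[0], weights))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


def loo_rmse(plots, weights, k=3):
    """Leave-one-out RMSE of the imputation for a given weight vector."""
    squared_errors = []
    for i, (x, y) in enumerate(plots):
        others = plots[:i] + plots[i + 1:]  # hold out plot i
        squared_errors.append((knn_impute(x, others, weights, k) - y) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical plots: (scaled altitude, scaled LiDAR height metric) -> volume
plots = [
    ((0.20, 0.50), 100.0),
    ((0.30, 0.60), 120.0),
    ((0.80, 0.10), 40.0),
    ((0.90, 0.20), 50.0),
    ((0.25, 0.55), 110.0),
]
weights = (1.0, 0.5)  # hypothetical variable weights

estimate = knn_impute((0.22, 0.52), plots, weights, k=3)
error = loo_rmse(plots, weights, k=3)
print(estimate, error)
```

In this sketch, an optimizer such as differential evolution or simulated annealing (both listed in the keywords) would search over `weights`, and in the two-step variant over the variable set and transformations first, to minimize a criterion such as `loo_rmse`.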

Keywords

Forest inventory / Differential evolution / Simulated annealing / LiDAR

Cite this article

Timo Pukkala, Núria Aquilué, Ariadna Just, Jordi Corbera, Antoni Trasobares. Developing kNN forest data imputation for Catalonia. Journal of Forestry Research, 2024, 35(1): 80. https://doi.org/10.1007/s11676-024-01735-5
