Remote sensing using unmanned aerial vehicles and ensemble learning for rice aboveground biomass prediction: insights into selecting base and meta learners

Jikai LIU; Weiqiang WANG; Xueqing ZHU; Jun LI; Bo LI; Xinwei LI

doi:10.15302/J-FASE-2026684

ENG. Agric. ›› 2026, Vol. 13 ›› Issue (5) :26684 DOI: 10.15302/J-FASE-2026684

RESEARCH ARTICLE

Abstract

Remote sensing using unmanned aerial vehicles (UAVs) combined with machine learning (ML) has significantly advanced field-scale prediction of aboveground biomass (AGB). Although ensemble learning frameworks (ELFs) typically outperform individual ML algorithms in accuracy, systematic evaluations of meta learner selection and of the effects of base learner quantity and diversity remain limited. Leveraging vegetation indices and texture features extracted from multi-temporal UAV imagery, ELFs were constructed incorporating nine ML algorithms with three meta learners (Linear model, Random forest and Bayesian model averaging (BMA)) to systematically evaluate how base learner configuration affects prediction accuracy. Using the fused feature set of sensitive vegetation indices and texture features, Gaussian process regression (GPR) achieved the highest accuracy among all base learners, with R² = 0.769 and RMSE = 1.83 t·ha^–1. All three meta learners outperformed the best base learner, with the BMA meta learner yielding the highest accuracy (R² = 0.795, RMSE = 1.73 t·ha^–1). However, meta learner performance depended strongly on the composition of the base learner pool: stability was optimal with five base learners, and maximum accuracy was achieved by hybrid ensembles that combined linear-kernel models with GPR. This study highlights the importance of both meta learner selection and base learner composition in ELFs for aboveground biomass prediction in rice. These findings offer methodological guidance for UAV-based high-precision monitoring of crop AGB, with practical implications for precision agriculture and crop management.

Keywords

Aboveground biomass / ensemble learning frameworks / machine learning / rice / unmanned aerial vehicles

Highlight

	● Fusion of vegetation indices and texture features improves rice aboveground biomass (AGB) prediction over single-feature models.
	● Bayesian model averaging meta learner achieves superior precision and robustness in AGB prediction.
	● Five base learners provide the optimal balance for ensemble learning frameworks.
	● A hybrid ensemble of linear-kernel models and Gaussian process regression yields the highest accuracy.

Cite this article

Jikai LIU, Weiqiang WANG, Xueqing ZHU, Jun LI, Bo LI, Xinwei LI. Remote sensing using unmanned aerial vehicles and ensemble learning for rice aboveground biomass prediction: insights into selecting base and meta learners. ENG. Agric., 2026, 13(5): 26684 DOI:10.15302/J-FASE-2026684

1 Introduction

Aboveground biomass (AGB) serves as a useful agroecological indicator, particularly for monitoring crop growth, predicting yields, and assessing crop-environment-management interactions^[¹^]. Conventionally, AGB measurement relies on destructive sampling, which, despite its accuracy, is labor-intensive, time-consuming and spatially limited. These constraints render it impractical for large-scale or high-frequency dynamic monitoring^[²,³^]. Consequently, developing efficient, non-destructive methodologies for AGB prediction is imperative for the progression of precision agriculture.

Recent advancements in remote sensing using unmanned aerial vehicles (UAVs) have catalyzed significant progress in high-resolution crop AGB monitoring^[⁴^]. UAV platforms offer exceptional spatial and temporal resolution, operational flexibility and cost-efficiency, positioning them as ideal tools for continuous, large-scale crop growth assessment^[⁵^]. Although vegetation indices (VIs) are widely adopted for AGB prediction^[⁶–⁸^], they have critical limitations across growth stages: soil background interference during early growth and signal saturation in later stages both compromise predictive accuracy^[⁹,¹⁰^]. To address these challenges, texture features (TFs) have been integrated to capture canopy spatial heterogeneity, frequently combined with machine learning (ML) techniques to develop robust non-linear AGB predictive models^[¹¹–¹⁵^]. ML algorithms are particularly effective for disentangling complex relationships within high-dimensional VI and TF feature sets, offering enhanced generalization and resistance to overfitting^[¹⁶,¹⁷^]. However, because of structural and parametric uncertainties, individual ML algorithms often lack robustness under heterogeneous field conditions, which ultimately limits their generalizability across diverse agricultural scenarios^[¹⁸–²⁰^].

Ensemble learning frameworks (ELFs) integrate predictions from multiple base learners via a meta learner, thereby leveraging model complementarity to improve predictive accuracy and stability^[²¹–²³^]. For example, Li et al.^[²⁴^] proposed a UAV hyperspectral-based ensemble model combining four ML algorithms with a Linear model (LM) meta learner, achieving significant improvements in wheat yield prediction. Similarly, Fei et al.^[²⁵^] developed a wheat yield prediction framework incorporating eight ML algorithms via Bayesian model averaging (BMA), which substantially enhanced predictive performance. Despite these advancements, several critical research gaps remain to be addressed^[²⁶^]. First, the selection and comparative assessment of meta learners remain insufficient, with many studies relying on simplistic stacking structures while overlooking the potential of alternative meta learners. Second, base learner selection often lacks systematic analysis, and correlations or redundancies among ML algorithms are rarely scrutinized. Third, the applicability and robustness of ELFs under heterogeneous field environments, such as different fertilizer application regimes, and diverse cultivars, have not been validated.

To address these gaps, this study leverages sensitive VIs and TFs feature sets extracted from multispectral UAV imagery to establish ELFs. These frameworks integrate nine diverse ML algorithms with three meta learners (BMA, Random forest (RF), LM) to systematically evaluate rice AGB predictive performance under varying fertilizer treatments and cultivars. The specific objectives are as follows: (1) compare the predictive performance of nine ML models as base learners; (2) elucidate the adaptability and predictive advantages of ELFs with three meta learners; (3) quantify how the quantity and compositional diversity of base learners influence the overall efficacy of the ELFs. Overall, this study aims to overcome current limitations in model stability, generalization and adaptability, providing methodological support for efficient rice AGB prediction and contributing to the advancement of precision agriculture.

2 Materials and methods

2.1 Study site and experimental design

Field experiments were conducted from June to October 2020 in Xiaogang Village, Fengyang County, Anhui Province, China (32°16′ N, 117°42′ E). The study site is characterized by a typical subtropical humid monsoon climate with distinct seasonality. A split-plot experimental design was implemented with three replicates, where nitrogen application rates were designated as the main plot factor and rice cultivars as the subplot factor (Fig. 1). This design yielded a total of 36 experimental plots, each covering an area of 16 m² (2 m × 8 m). Nitrogen fertilizer was applied using a split-application strategy: 40% was administered as basal fertilizer, 30% at the tillering stage, and the remaining 30% at the heading stage. All other agronomic practices adhered to local guidelines for high-yield rice cultivation.

2.2 UAV images acquisition and preprocessing

Multispectral imagery was acquired at six key growth stages (Table 1) using a DJI Phantom 4 Multispectral Quadcopter UAV (SZ DJI Technology Co., Shenzhen, China) (Table S1). The onboard sensor captures five spectral bands: blue (450 ± 16 nm), green (560 ± 16 nm), red (650 ± 16 nm), red edge (730 ± 16 nm), and near-infrared (840 ± 26 nm), each with a 2.08 megapixel resolution. To ensure consistent solar illumination, all flights were conducted under clear, cloudless conditions between 11:00 and 14:00. Flight missions were preplanned via DJI GS Pro software with a flying altitude of 30 m and a cruising speed of 2.0 m·s^–1. Forward and side overlaps were optimized at 90% and 85%, respectively, to ensure high-quality photogrammetric reconstruction.

Photogrammetric processing was executed using Pix4Dmapper (Pix4D, Lausanne, Switzerland), encompassing a standardized workflow that included image georeferencing, image alignment, dense point cloud generation, digital surface model construction and orthomosaic production^[²^]. Subsequently, individual spectral bands were co-registered and stacked into composite multispectral images. All images were projected to WGS 84 UTM Zone 50N coordinate system with a ground sampling distance of 1.7 cm. Radiometric correction was performed via the empirical line method, which established linear regressions between image pixel digital numbers and the reflectance values of calibrated reference panels^[⁵^].
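The empirical line method described above can be illustrated with a minimal sketch. It assumes one band, a handful of reference panels with known reflectance, and an ordinary least-squares fit; the function names and panel values are hypothetical and are not taken from the study.

```python
def fit_empirical_line(panel_dn, panel_reflectance):
    """Least-squares fit of reflectance = gain * DN + offset for one band."""
    n = len(panel_dn)
    mean_dn = sum(panel_dn) / n
    mean_ref = sum(panel_reflectance) / n
    sxx = sum((d - mean_dn) ** 2 for d in panel_dn)
    sxy = sum((d - mean_dn) * (r - mean_ref)
              for d, r in zip(panel_dn, panel_reflectance))
    gain = sxy / sxx
    offset = mean_ref - gain * mean_dn
    return gain, offset


def dn_to_reflectance(dn, gain, offset):
    """Apply the fitted line to a pixel digital number."""
    return gain * dn + offset


# Hypothetical panel readings (illustrative values only).
panel_dn = [1000.0, 2000.0, 3000.0]
panel_reflectance = [0.10, 0.20, 0.30]
gain, offset = fit_empirical_line(panel_dn, panel_reflectance)
pixel_reflectance = dn_to_reflectance(2500.0, gain, offset)
```

In practice the fit would be repeated for each of the five bands, since every band has its own gain and offset.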

2.3 Rice aboveground biomass data collection

Ground truth AGB measurements were taken synchronously with each UAV campaign (Table 1). Three representative rice plants were destructively sampled from each plot and transported to the laboratory for further processing. The samples were initially oven-dried at 105 °C for 30 min to deactivate enzymatic activity, then transferred to a constant-temperature oven and dried at 75 °C until a constant weight was achieved. Dry weight was determined using an electronic balance with 0.001 g precision.

2.4 Remote sensing feature extraction

2.4.1 Vegetation indices

VIs are widely used spectral features in remote sensing, commonly derived from algebraic combinations of multispectral bands to amplify spectral contrasts between vegetated and non-vegetated backgrounds^[⁵,²⁷^]. For this study, 16 VIs previously demonstrated to be effective for crop AGB prediction were selected for analysis (Table 2).
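As an illustration of how a VI is built from band reflectances, the sketch below computes the normalized difference vegetation index from near-infrared and red values and averages it over a plot. NDVI is used only because it is widely known; whether it is among the 16 indices in Table 2 is not stated in this text, and the helper names are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    if denom == 0:
        return 0.0  # assumption: treat empty pixels as zero rather than failing
    return (nir - red) / denom


def plot_mean_index(nir_pixels, red_pixels, index_fn=ndvi):
    """Average a per-pixel index over all pixels of one plot."""
    values = [index_fn(n, r) for n, r in zip(nir_pixels, red_pixels)]
    return sum(values) / len(values)


# Hypothetical reflectance values for three pixels of one plot.
plot_value = plot_mean_index([0.50, 0.45, 0.55], [0.10, 0.09, 0.11])
```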

2.4.2 Texture features

TFs characterize the spatial heterogeneity of the canopy, effectively reflecting variations in crop density, growth consistency, and structural complexity^[¹⁰,¹²^]. Among various texture extraction methods, the gray-level co-occurrence matrix is extensively used for its capacity to generate rotation-invariant and multiscale textural descriptors with high computational efficiency^[²^]. In this study, eight texture metrics were selected: mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment and correlation. To minimize inter-band redundancy and enhance extraction efficiency, TFs were derived from the first two principal components of the multispectral imagery based on principal component analysis.
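To make the co-occurrence idea concrete, here is a minimal pure-Python sketch that builds a symmetric, normalized gray-level co-occurrence matrix for a single pixel offset and derives several of the metrics named above using their common definitions. The offset, number of gray levels and window size used in the study are not given in this text, so the values below are assumptions, and a principal-component image would first need to be quantized to integer gray levels.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # pair (i, j)
                counts[j][i] += 1  # mirrored pair keeps the matrix symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_metrics(p):
    """Common texture descriptors computed from a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "second_moment": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Hypothetical 3 x 3 patch already quantized to three gray levels.
patch = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 2, 2],
]
matrix = glcm(patch, levels=3)
metrics = texture_metrics(matrix)
```

Mean, variance and correlation follow the same pattern, using the row and column marginals of the matrix.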

2.5 Feature selection

Recursive feature elimination (RFE) is an iterative pruning mechanism designed to identify optimal feature subsets by recursively training a predictive model and sequentially discarding features with the minimal contribution to model performance^[⁴⁴^]. The RFE workflow typically entails three steps: (1) assessing feature importance scores using a base model, (2) iteratively removing the lowest-ranked features, and (3) producing a final feature ranking according to the elimination sequence^[¹⁹^]. For this study, an RF-driven RFE procedure was implemented to screen for VIs and TFs sensitive to rice AGB. The feature selection process was optimized via five fold cross-validation, with the minimum root mean square error (RMSE) serving as the decisive criterion for determining the optimal feature subset.

2.6 Aboveground biomass prediction models

2.6.1 Base learners

To comprehensively assess the predictive performance of different algorithms, nine widely used ML methods were selected as base learners, categorized into three distinct architectures: (1) linear-kernel models, including Stepwise multiple regression (SMR), Partial least squares regression (PLSR), Ridge regression (RR) and Least absolute shrinkage and selection operator (LASSO), (2) tree-kernel models, including Random forest regression (RFR), Extreme gradient boosting (XGBoost), Gradient boosting machine (GBM) and (3) non-parametric models, namely Support vector regression (SVR) and Gaussian process regression (GPR) (Table S2).

SMR iteratively refines the predictor set based on predefined statistical criteria to yield a parsimonious and interpretable framework^[⁴⁵^]. PLSR facilitates robust modeling of high-dimensional, multicollinear datasets by projecting variables into latent components that maximize the covariance between predictors and responses^[⁴⁶^]. To address collinearity and prevent overfitting, RR and LASSO implement distinct regularization strategies; RR uses an L2 norm penalty to stabilize coefficient estimation^[¹⁹,⁴⁷^] whereas LASSO uses an L1 norm penalty to encourage model sparsity and facilitate simultaneous variable selection^[²⁶^]. RFR uses bootstrap aggregation (bagging) to average the outputs of multiple decision trees, thereby enhancing model stability and mitigating overfitting^[²^]. GBM constructs a series of weak learners by iteratively correcting the residuals of preceding models, thereby effectively capturing intricate non-linear dependencies^[⁴⁶^]. XGBoost serves as a high-performance implementation of gradient boosting, incorporating second-order optimization, regularization and efficient handling of missing values^[¹⁴^]. SVR implements the structural risk minimization principle, constructing an optimal hyperplane to minimize prediction error while maintaining a robust margin in high-dimensional spaces^[⁴⁸^]. GPR offers a non-parametric Bayesian perspective, modeling complex relationships within a flexible probabilistic framework that inherently allows for predictive uncertainty quantification^[⁴⁹^].

2.6.2 Meta learners

To improve prediction performance and model robustness, stacking ELFs were implemented in this study (Fig. 2). Stacking is a multi-tiered learning paradigm that uses a meta learner to synthesize predictions from multiple base learners. By training on first-level outputs, the meta learner exploits inter-model complementarities, effectively balancing individual algorithmic biases and maximizing overall generalization^[¹,²⁰,²⁵^]. To systematically assess the impact of ELF architecture, three types of meta learners were evaluated: LM, RF and BMA.

LM is a statistical model used to describe the linear relationship between the dependent variable and independent variables, and quantifies the relative contribution of each individual model by optimizing regression coefficients and an intercept^[²²^]. The formula of the LM meta learner is:

$$P_{\mathrm{LM}} = \beta_0 + \sum_{s=1}^{S} \beta_s X_s \tag{1}$$

where P_LM is the predicted output of the LM meta learner; β₀ is the intercept term; β_s is the coefficient of the s-th base learner; X_s is the output of the s-th base learner; and S is the number of base learners, with S ∈ {1,2,3,...,9}.

The final prediction of the RF meta learner is obtained by averaging the predictions from all decision trees^[²^] as:

$$P_{\mathrm{RF}} = \frac{1}{T} \sum_{t=1}^{T} P_t \tag{2}$$

where P_t is the prediction of the t-th decision tree and T is the number of decision trees, set to 500.

BMA is a probabilistic ensemble forecasting approach grounded in Bayesian inference, which systematically incorporates model selection uncertainty into predictive outcomes. By weighting the full predictive distributions of candidate models according to their posterior probabilities, BMA yields a composite posterior predictive distribution that comprehensively accounts for both within-model variability and between-model discrepancies^[¹⁷,²⁵,⁵⁰^]. Formally, y is the variable to be predicted, D is the observed data. M = {M₁,M₂,...,M_S} is the ensemble of all base learners’ predictions and S is the number of base learners. The BMA predictive distribution for a target variable y is expressed as:

$$p(y \mid D) = \sum_{s=1}^{S} p(M_s \mid D)\, p(y \mid M_s, D) \tag{3}$$

The model weights ω_s = p(M_s | D) are derived via Bayes’ theorem:

$$\omega_s = \frac{p(D \mid M_s)\, p(M_s)}{\sum_{j=1}^{S} p(D \mid M_j)\, p(M_j)} \tag{4}$$

where p(M_s) is the prior probability of the s-th model M_s. A non-informative prior was used, namely the uniform model prior p(M_s) = 1/S. p(D|M_s) is the marginal likelihood of the data under model M_s.

For regression problems, the prediction residuals of each model are assumed to be normally distributed. Hence, each conditional predictive distribution is parameterized as a Gaussian likelihood function:

$$p(y \mid M_s, D) = \mathrm{Normal}(y \mid \mu_s, \delta_s^2) \tag{5}$$

where μ_s and δ_s² denote the predictive mean and variance of the s-th model M_s, respectively.

The final output of BMA is the complete mixture predictive distribution p(y|D), which not only provides a point prediction, but also, more importantly, allows the predictive variance to be analytically decomposed via the law of total variance:

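$$\mathrm{Var}(y \mid D) = \sum_{s=1}^{S} \omega_s \delta_s^2 + \sum_{s=1}^{S} \omega_s \left(\mu_s - E[y \mid D]\right)^2 \tag{6}$$

where Var(y|D) is the variance of y given the training data D under the considered ML models, $\sum_{s=1}^{S} \omega_s \delta_s^2$ is the within-model variance, and $\sum_{s=1}^{S} \omega_s (\mu_s - E[y \mid D])^2$ is the between-model variance, in which the deviation of each model mean from the mixture mean is squared.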
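The weighting and variance decomposition can be sketched in a few lines of Python. The text does not state how the marginal likelihood p(D|M_s) was computed, so this sketch assumes it is approximated by the Gaussian likelihood of each base learner's residuals on held-out data, with δ_s² estimated as the mean squared residual. The between-model term uses the squared deviation of each model mean from the mixture mean, as the law of total variance requires. All names and numbers are hypothetical.

```python
import math


def gaussian_log_likelihood(y_obs, y_pred, variance):
    """Log-likelihood of residuals under Normal(0, variance)."""
    n = len(y_obs)
    sse = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    return -0.5 * n * math.log(2 * math.pi * variance) - sse / (2 * variance)


def bma_weights(log_likelihoods, priors=None):
    """Posterior model weights via Bayes' theorem, computed in log space."""
    s = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / s] * s  # uniform prior p(M_s) = 1/S
    log_post = [ll + math.log(pr) for ll, pr in zip(log_likelihoods, priors)]
    peak = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lp - peak) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bma_predict(weights, means, variances):
    """Mixture mean with within-model and between-model variance."""
    mean = sum(w * m for w, m in zip(weights, means))
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return mean, within, between


# Hypothetical held-out observations and two base-learner predictions (t/ha).
y_obs = [4.0, 6.0, 8.0]
preds = {"model_a": [4.1, 5.9, 8.2], "model_b": [3.0, 7.0, 9.5]}
variances = {name: sum((o - p) ** 2 for o, p in zip(y_obs, p_list)) / len(y_obs)
             for name, p_list in preds.items()}
log_liks = [gaussian_log_likelihood(y_obs, preds[k], variances[k]) for k in preds]
weights = bma_weights(log_liks)

# Hypothetical predictive means of the two models for one new plot.
mean, within, between = bma_predict(weights, [5.0, 5.6],
                                    [variances[k] for k in preds])
```

The model with the smaller residuals receives almost all of the weight here, which is the behavior that lets BMA down-weight poor base learners without removing them.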

2.7 Model construction and evaluation

During the model construction phase, the sample dataset was partitioned into a training set and an independent testing set using a 7:3 ratio. For the training of base learners, a five fold cross-validation scheme was implemented to generate the out-of-sample predictions matrix (OSPM). The specific procedure was as follows. The training set was randomly segmented into five mutually exclusive and exhaustive subsets. In each iterative cycle, a single subset was designated as the validation set, while the remaining four subsets constituted the training fold. Within each training fold, a minimalistic hyperparameter tuning protocol (Table S3) was executed via internal three fold cross-validation. The resultant trained base learners were then used to generate predictions on the corresponding validation subset. Upon completion of all five iterations, the validation predictions were aggregated to form the OSPM, which subsequently served as the feature space for the secondary training set. Concurrently, each base learner generated five distinct predictions for the independent testing set throughout this process; these predictions were averaged to construct the secondary testing set. In the stacking phase, LM, RF and BMA were trained on the secondary training set via five fold cross-validation protocol. Subsequently, the final meta learners were validated against the secondary independent test set. To mitigate the potential influence of stochasticity inherent in data partitioning, the entire procedure was repeated over 20 independent iterations (Fig. 2). Then, an exhaustive evaluation of all 502 possible ELFs (comprising 2–9 base learners) was conducted, yielding 1506 sets of independent test results for rigorous comparative analysis.
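The construction of the OSPM and the count of candidate ensembles can be sketched as follows. This is a simplified illustration in plain Python: the two toy base learners stand in for the nine real algorithms, the inner three fold hyperparameter tuning and the 20 repetitions are omitted, and every function name is hypothetical.

```python
import random
from itertools import combinations


def kfold_indices(n_samples, k, seed):
    """Shuffle sample indices and deal them into k disjoint folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def out_of_sample_predictions(x, y, x_test, fit, k=5, seed=0):
    """Held-out predictions for each training sample, plus test
    predictions averaged over the k fold models."""
    oof = [None] * len(x)
    test_avg = [0.0] * len(x_test)
    for fold in kfold_indices(len(x), k, seed):
        held_out = set(fold)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        predict = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        for i in fold:
            oof[i] = predict(x[i])
        for j, x_new in enumerate(x_test):
            test_avg[j] += predict(x_new) / k
    return oof, test_avg


def fit_mean(x_train, y_train):
    """Toy base learner: always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x_new: mean


def fit_line(x_train, y_train):
    """Toy base learner: simple linear regression on one feature."""
    n = len(x_train)
    mx, my = sum(x_train) / n, sum(y_train) / n
    sxx = sum((a - mx) ** 2 for a in x_train)
    slope = sum((a - mx) * (b - my) for a, b in zip(x_train, y_train)) / sxx
    return lambda x_new: my + slope * (x_new - mx)


# Hypothetical single-feature data set.
x = [float(i) for i in range(20)]
y = [2.0 * v + 1.0 for v in x]
x_test = [20.0, 21.0]

oof_mean, test_mean = out_of_sample_predictions(x, y, x_test, fit_mean)
oof_line, test_line = out_of_sample_predictions(x, y, x_test, fit_line)

# One row per sample, one column per base learner.
ospm = [list(row) for row in zip(oof_mean, oof_line)]
secondary_test = [list(row) for row in zip(test_mean, test_line)]

# Every subset of 2 to 9 of the nine base learners, times three meta learners.
names = ["SMR", "PLSR", "RR", "LASSO", "RFR", "XGBoost", "GBM", "SVR", "GPR"]
subsets = [c for size in range(2, len(names) + 1)
           for c in combinations(names, size)]
print(len(subsets), len(subsets) * 3)  # 502 1506
```

Because each training sample is predicted only by a model that never saw it, the meta learner is trained on predictions that resemble those it will receive for the test set.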

2.8 Model evaluation

Two metrics were used to evaluate the predictive performance of the base learners and the meta learners: the coefficient of determination (R²) and the RMSE^[⁵¹^]. R² reflects the goodness-of-fit, denoting the proportion of variance explained by the model, while RMSE quantifies the magnitude of deviation between observed and predicted AGB values.
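For reference, both metrics can be computed as in the short sketch below. It assumes R² is defined as one minus the ratio of the residual to the total sum of squares; some studies instead report the squared Pearson correlation, and the text does not say which form was used. The sample values are hypothetical.

```python
import math


def rmse(y_obs, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs))


def r_squared(y_obs, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted AGB values (t/ha).
y_obs = [2.0, 4.0, 6.0, 8.0]
y_pred = [3.0, 4.0, 6.0, 7.0]
score_r2 = r_squared(y_obs, y_pred)
score_rmse = rmse(y_obs, y_pred)
```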

To ensure objective and generalizable model evaluation, a one-way analysis of variance was performed. Tukey’s honestly significant difference test was employed for post hoc pairwise comparisons to identify significant differences between combinations^[¹^].

3 Results

3.1 Sensitive feature selection and rice aboveground biomass prediction based on base learners

3.1.1 Sensitive feature selection

RFE was applied to identify the most important features for rice AGB prediction (Fig. 3 and Table S4). Based on five fold cross-validation, the optimal VIs subset comprised seven indices, yielding an R² of 0.697 and an RMSE of 2.11 t·ha^–1. For TFs, the model reached its peak performance with eight features, resulting in an R² of 0.762 and an RMSE of 1.89 t·ha^–1. The selected VIs encompassed the entire spectral range captured by the multispectral sensor. The most sensitive TFs included correlation, second moment, entropy, mean and contrast, which were evenly distributed across the first and second principal components (Section 2.4.2).

3.1.2 Aboveground biomass prediction using base learners

Figure 4 shows the predictive performance of nine ML base learners across three feature sets. Both input features and algorithm selection exert a decisive influence on the predictive accuracy and generalization capability of AGB models. The results show that models using the fused VIs and TFs feature set significantly outperformed those relying on either feature set alone. Considerable variability in predictive efficacy was observed among the evaluated base learners (Table S5). GPR and RFR were the top-performing group, having no statistically significant differences in their predictive accuracy. A second performance group included SVR, RR, LASSO, XGBoost and GBM models, all of which showed comparable predictive results. Conversely, PLSR had the poorest performance. With the VIs + TFs test set, GPR had the highest R² (0.769) and the lowest RMSE (1.83 t·ha^–1), followed closely by RFR (R² = 0.763, RMSE = 1.85 t·ha^–1). Notably, the accuracy enhancements derived from feature fusion were model-dependent, with linear algorithms such as RR, LASSO and SMR giving the most pronounced improvements upon the integration of TFs.

3.2 Rice aboveground biomass prediction based on three meta learners

3.2.1 Performance comparison of three meta learners for aboveground biomass prediction

The three meta learners all performed best on the fused feature set, while yielding comparable results on the individual VIs-only and TFs-only feature sets (Fig. 5). For the fused set, a divergent performance trend was observed between the training and test datasets: the BMA meta learner had the highest goodness-of-fit on the training set whereas the LM meta learner achieved superior performance on the test set. To further elucidate the disparities among the meta learners, their stability and overfitting susceptibility were assessed using two metrics derived from the fused feature set: (1) the standard deviations of R² and RMSE, and (2) the performance discrepancy between the training and test sets (|ΔR²| and |ΔRMSE|) (Fig. 6). The stability analysis indicated that the BMA and LM meta learners had the highest and statistically indistinguishable robustness, whereas the RF meta learner showed the lowest stability. Regarding overfitting tendency, the BMA meta learner was the most generalizable, while the LM meta learner was the most prone to overfitting. Based on the comprehensive assessment of predictive performance, stability and generalization, the BMA meta learner was selected as the optimal meta learner.

3.2.2 Rice aboveground biomass prediction based on the best meta learner

Table 3 provides details of AGB prediction accuracy between the BMA meta learner and the optimal base learner (GPR) across three feature sets. The results indicate that BMA consistently outperformed GPR under all configurations. For the VI-only feature set, BMA meta learner increased R² from 0.719 to 0.762 (an increase of 6.01%) and reduced RMSE from 2.02 to 1.86 t·ha^–1 (a decrease of 7.67%) by integrating five base learners. For the TF-only feature set, the integration of three base learners enhanced R² from 0.736 to 0.762 (an increase of 3.54%) and reduced RMSE from 1.97 to 1.86 t·ha^–1 (a decrease of 5.54%). For the VIs + TFs fused feature set, BMA meta learner achieved the highest R² (0.795, an increase of 3.31%) and the lowest RMSE (1.73 t·ha^–1, a decrease of 5.57%) by aggregating five base learners. The scatter plots of predicted versus observed AGB values show that the BMA meta learner provided superior goodness-of-fit and enhanced robustness across both the training and validation datasets (Fig. 7).

3.3 Influence of base learner number and type on ensemble learning frameworks performance

3.3.1 Influence of base learner number on BMA meta learner performance

To evaluate the sensitivity of BMA meta learner performance to the number of base learners, the ensemble size was incrementally expanded from two to nine and the corresponding prediction accuracy was compared (Table 4 and Fig. 8). As additional base learners were incorporated, predictive accuracy gradually improved and reached its maximum at an ensemble size of nine, with an average test set R² of 0.793 and an average RMSE of 1.73 t·ha^–1. However, the performance trajectory plateaued once the ensemble size exceeded five: no statistically significant differences were observed beyond this threshold, indicating diminishing marginal utility from further expanding the ensemble. Balancing predictive performance and computational efficiency, an ensemble size of five was identified as the optimal configuration.

3.3.2 Influence of base learner type on BMA meta learner performance

To further elucidate the influence of base learner type on the performance of ELFs with the BMA meta learner, the nine algorithms were categorized into three groups (Table S2). Prediction accuracy discrepancies across various type combinations were then systematically evaluated (Figs. 9 and 10). Ensemble configurations that included the optimal single base learner (GPR) significantly outperformed those excluding it, whereas configurations containing the poorest single base learner (PLSR) performed significantly worse than those omitting it. As shown in Fig. 10, the structural diversity of base learner combinations profoundly influenced AGB prediction. First, ensembles dominated by tree-kernel models (RFR, XGBoost and GBM) generally achieved higher accuracy than those composed of linear-kernel models (PLSR, LASSO and RR). Second, when tree-kernel or linear-kernel models were further combined with non-linear models such as SVR and GPR, hybrid combinations incorporating linear-kernel base learners generally retained an advantage. Notably, the optimal hybrid combination, consisting of linear-kernel models and GPR, achieved the highest predictive accuracy (R² = 0.795 and RMSE = 1.73 t·ha^–1) using the following combination of base learners: SMR, PLSR, GPR, LASSO and RR.

4 Discussion

4.1 Advantages of multisource feature fusion for rice aboveground biomass prediction

Multisource remote sensing feature fusion is widely recognized as an effective strategy for improving the accuracy of crop growth parameter estimation^[⁵²^]. The findings of the present study demonstrate that fused features outperform individual spectral or texture features in predicting rice AGB (Figs. 4 and 5 and Table 3). This improvement can be attributed to the integration of complementary information from multiple sources, aligning with the conclusions reported by Liu et al.^[¹¹^] and Xu et al.^[¹⁰,¹³^]. Specifically, VIs mainly reflect the photosynthetic capacity and growth status of the crop^[¹²^]. In contrast, TFs capture spatial variations in image grayscale distribution, providing valuable insights into canopy structure, coverage density and spatial heterogeneity^[²^]. The fusion of these two feature types enables a more comprehensive feature representation, in which VIs convey physiological vigor and TFs characterize the spatial structure of the canopy. This integrated strategy contributes to more robust and accurate rice AGB prediction.

4.2 Influence of base learners on rice aboveground biomass prediction

The performance and underlying mechanisms of base learners are fundamental determinants of the effectiveness of ELFs^[⁵³,⁵⁴^]. This study showed that GPR and RFR had superior and stable predictive performance (Fig. 4 and Table S5), which aligns with recent studies in crop parameter retrieval^[⁵⁵^]. Based on a non-parametric Bayesian framework, GPR models the covariance structure of the target function via kernel methods, enabling it to adaptively capture the complex, dynamic and non-linear relationships between multisource remote sensing features and AGB across the growth cycle^[⁵⁶,⁵⁷^]. RFR mitigates overfitting by aggregating multiple decision trees, a process that facilitates the modeling of high-order feature interactions^[²^]. In contrast, the linear structure of PLSR limits its ability to represent the complex, non-linear relationships present in the fused feature space^[⁵³^]. Notably, TFs yielded the most pronounced performance improvements for linear models, particularly PLSR and LASSO. This phenomenon suggests that while linear models have a restricted capability for characterizing complex physiological relationships, they are highly responsive to TFs that directly characterize canopy structure^[¹⁰^]. Therefore, incorporating TFs can partially compensate for the inherent non-linear modeling limitations of linear algorithms.

4.3 Improvement of rice aboveground biomass prediction performance by meta learners

The predictive accuracy of ELFs is contingent upon the performance of the meta learner^[²⁶^]. Beyond improving estimation precision, the introduction of an effective meta learner enhances model resilience and stability by synthesizing outputs from diverse base learners, effectively mitigating individual algorithmic biases and maintaining consistent performance across various datasets^[⁵⁸^]. In this study, the systematic evaluation of LM, RF and BMA revealed that the BMA meta learner achieved superior predictive accuracy and generalization capability (Fig. 6). The superiority of BMA stems from its rigorous probabilistic foundation. Unlike the deterministic linear weighting used by LM or the tree-averaging mechanism of RF, BMA treats each base learner as a candidate model and assigns its contribution based on posterior model probability^[²²,⁵⁹^]. This process effectively performs Bayesian inference on model uncertainty relative to the training data. Consequently, BMA automatically assigns higher weights to more accurate base learners while explicitly quantifying uncertainty in ELFs. This property enables BMA to generate more robust and probabilistically interpretable results, particularly when outputs from base learners diverge, which is consistent with the findings of Shu et al.^[¹⁹^], Fei et al.^[²⁵^], and Zheng and Zhang^[⁵⁸^].

4.4 Influence of base learner number and type on meta learner performance

The effectiveness of ELFs is markedly affected by both the quantity and heterogeneity of base learners^[²⁶,⁶⁰^]. Based on progressive combination analysis (Fig. 8 and Table 4), this study demonstrates that the BMA meta learner achieves an optimal balance between predictive stability and precision when the number of base learners is five. Increasing the ensemble size beyond this threshold can introduce redundancy and noise, which increase computational cost, and ultimately reduce the ability of the meta learner to derive efficient and accurate predictions. This observation aligns with the results of Muslim et al.^[⁶⁰^] and Qi et al.^[²⁶^]. While incorporating the superior individual base learner (GPR) substantially improved overall accuracy, the inclusion of the weakest learner (PLSR) negatively affected performance (Fig. 9). Notably, different base learners had distinct error patterns (Fig. 10). This study found that the optimal ensemble architecture is not necessarily formed by a uniform type of high-performing models, but rather through a hybridization of linear-kernel models (SMR, LASSO and RR) and a non-linear model (GPR) (Table 4). This configuration uses the principle of error complementarity, where the relatively stable but less flexible linear models offset the high-variance predictions of GPR, which excels at capturing complex non-linear relationships. This finding provides a clear guideline for constructing high-performance ELFs: optimal performance is achieved by combining base learners that possess both high individual accuracy and divergent inductive biases^[²⁶,⁶¹^].

4.5 Strengths, limitations and future outlook

Compared with previous studies, this study offers two contributions. First, a systematic comparative analysis identifies BMA as a competitive meta learner that significantly improves predictive accuracy and robustness (Fig. 11). Second, controlled experiments quantified the sensitivity of ELF performance to the quantity and heterogeneity of base learners, providing empirical data to guide the systematic configuration of base learners within ELFs. Nevertheless, several limitations are evident. First, the current framework relies on standard ML algorithms and does not incorporate advanced temporal deep-learning architectures (e.g., LSTM and transformers)^[⁶²^]. Future work should explore the integration of these models to evaluate their potential gains within the current ELFs. Second, the model was validated at a single experimental site; therefore, its spatial transferability remains to be rigorously tested. Future efforts should include multisite and multiyear external validations using leave-one-site-out and/or spatially blocked cross-validation protocols. To enhance spatial generalizability, adaptive recalibration strategies should also be explored. For target regions where only a few labeled samples are available, the meta learner could be updated with that small labeled set. For unlabeled regions, transfer learning or domain adaptation could use unlabeled target data together with auxiliary meteorological and remote sensing variables to mitigate domain mismatch and enhance robustness across agroecological zones.
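The leave-one-site-out protocol proposed above can be sketched as follows. This is a minimal illustration under stated assumptions: the site labels, the vegetation index values and the AGB values are hypothetical, and a one-variable least-squares line stands in for the full ELF.

```python
import math


def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices) for each distinct site."""
    for held_out in sorted(set(sites)):
        train = [i for i, s in enumerate(sites) if s != held_out]
        test = [i for i, s in enumerate(sites) if s == held_out]
        yield held_out, train, test


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (placeholder model)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical plots from three sites: one vegetation index and AGB (t/ha) per plot.
sites = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
vi = [0.20, 0.40, 0.60, 0.30, 0.50, 0.70, 0.25, 0.45, 0.65]
agb = [2.1, 4.0, 6.2, 3.4, 5.1, 7.3, 2.4, 4.6, 6.3]

site_rmse = {}
for site, train, test in leave_one_site_out(sites):
    # Fit only on the other sites, then predict the held-out site.
    slope, intercept = fit_line([vi[i] for i in train], [agb[i] for i in train])
    predicted = [slope * vi[i] + intercept for i in test]
    site_rmse[site] = rmse([agb[i] for i in test], predicted)

for site, error in site_rmse.items():
    print(f"held-out site {site}: RMSE = {error:.3f} t/ha")
```

Because every plot from the held-out site is excluded from training, the per-site RMSE approximates performance at an unseen location more closely than a random split over pooled plots would.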

5 Conclusions

Based on UAV multispectral imagery, this study developed stacking ELFs by integrating sensitive VIs and TFs with various base learners and meta learners. Different ELF architectures were systematically evaluated for AGB prediction in rice. There were three main findings. (1) The fusion of VIs and TFs significantly improved the accuracy of AGB prediction. Among the nine machine learning algorithms assessed, GPR was the best base learner. (2) All three stacking meta learners further improved AGB prediction accuracy, with BMA delivering the best performance on the fused feature set (R² = 0.795, RMSE = 1.73 t·ha^–1). (3) Meta learner performance was highly sensitive to the composition of the base learner pool. Optimal predictive stability was attained with five base learners, and the hybridization of linear-kernel models with GPR yielded the highest accuracy. Overall, this study underscores the importance of base learner configuration and meta learner selection in enhancing crop monitoring precision, and provides valuable insights for the advancement of precision agriculture.

References

[1]	Xuan F, Su W, Chen Z, Huang X D, Zhai W G, Li X C, Zeng Y L, Li Z, Li J S, Huang J X . Performance of stacking machine learning and volume model for improving corn above ground biomass prediction. Plant Phenomics, 2025, 7(3): 100068

[2]	Liu J K, Zhu Y J, Song L J, Su X X, Li J, Zheng J, Zhu X Q, Ren L T, Wang W H, Li X W . Optimizing window size and directional parameters of GLCM texture features for estimating rice AGB based on UAVs multispectral imagery. Frontiers in Plant Science, 2023, 14: 1284235

[3]	Zhu Y J, Liu J K, Tao X Y, Su X X, Li W Y, Zha H N, Wu W G, Li X W . A three-dimensional conceptual model for estimating the above-ground biomass of winter wheat using digital and multispectral unmanned aerial vehicle images at various growth stages. Remote Sensing, 2023, 15(13): 3332

[4]	Guebsi R, Mami S, Chokmani K . Drones in precision agriculture: a comprehensive review of applications, technologies, and challenges. Drones, 2024, 8(11): 686

[5]	Liu J K, Wang W Q, Li J, Mustafa G, Su X X, Nian Y, Ma Q, Zhen F X, Wang W W, Li X W . UAV remote sensing technology for wheat growth monitoring in precision agriculture: comparison of data quality and growth parameter inversion. Agronomy, 2025, 15(1): 159

[6]	Maimaitijiang M, Sagan V, Sidike P, Maimaitiyiming M, Hartling S, Peterson K T, Maw M J W, Shakoor N, Mockler T, Fritschi F B. Vegetation index weighted canopy volume model (CVM_VI) for soybean biomass estimation from unmanned aerial system-based RGB imagery. ISPRS Journal of Photogrammetry and Remote Sensing, 2019, 151: 27–41

[7]	Shi J, Yang K L, Yuan N G, Li Y J, Ma L F, Liu Y D, Fang S H, Peng Y, Zhu R S, Wu X T, Gong Y . UAV-based rice aboveground biomass estimation using a random forest model with multi-organ feature selection. European Journal of Agronomy, 2025, 164: 127529

[8]	Luo S J, Jiang X Q, He Y B, Li J P, Jiao W H, Zhang S L, Xu F, Han Z C, Sun J, Yang J P, Wang X Y, Ma X T, Lin Z R. Multi-dimensional variables and feature parameter selection for aboveground biomass estimation of potato based on UAV multispectral imagery. Frontiers in Plant Science, 2022, 13: 948249

[9]	Li Z H, Zhao Y, Taylor J, Gaulton R, Jin X L, Song X Y, Li Z H, Meng Y, Chen P F, Feng H K, Wang C, Guo W, Xu X G, Chen L P, Yang G J. Comparison and transferability of thermal, temporal and phenological-based in-season predictions of above-ground biomass in wheat crops from proximal crop reflectance data. Remote Sensing of Environment, 2022, 273: 112967

[10]	Xu T Y, Wang F M, Shi Z, Xie L L, Yao X P. Dynamic estimation of rice aboveground biomass based on spectral and spatial information extracted from hyperspectral remote sensing images at different combinations of growth stages. ISPRS Journal of Photogrammetry and Remote Sensing, 2023, 202: 169–183

[11]	Liu Y, Fan Y G, Feng H K, Chen R Q, Bian M B, Ma Y P, Yue J B, Yang G J . Estimating potato above-ground biomass based on vegetation indices and texture features constructed from sensitive bands of UAV hyperspectral imagery. Computers and Electronics in Agriculture, 2024, 220: 108918

[12]	Xu L, Zhou L F, Meng R, Zhao F, Lv Z G, Xu B Y, Zeng L L, Yu X, Peng S B . An improved approach to estimate ratoon rice aboveground biomass by integrating UAV-based spectral, textural and structural features. Precision Agriculture, 2022, 23(4): 1276–1301

[13]	Xu T Y, Wang F M, Shi Z, Miao Y X. Multi-scale monitoring of rice aboveground biomass by combining spectral and textural information from UAV hyperspectral images. International Journal of Applied Earth Observation and Geoinformation, 2024, 127: 103655

[14]	Wang D J, Xing Y Q, Fu A M, Tang J, Chang X Q, Yang H, Yang S H, Li Y X . Mapping forest aboveground biomass using multi-source remote sensing data based on the XGBoost algorithm. Forests, 2025, 16(2): 347

[15]	Zhang P P, Lu B, Ge J Y, Wang X Y, Yang Y D, Shang J L, La Z, Zang H D, Zeng Z H . Using UAV-based multispectral and RGB imagery to monitor above-ground biomass of oat-based diversified cropping. European Journal of Agronomy, 2025, 162: 127422

[16]	Shaikh T A, Rasool T, Rasheed Lone F . Towards leveraging the role of machine learning and artificial intelligence in precision agriculture and smart farming. Computers and Electronics in Agriculture, 2022, 198: 107119

[17]	Sharma A, Jain A, Gupta P, Chowdary V . Machine learning applications for precision agriculture: a comprehensive review. IEEE Access, 2021, 9: 4843–4873

[18]	Zhang Y Z, Ma J, Liang S L, Li X S, Liu J D . A stacking ensemble algorithm for improving the biases of forest aboveground biomass estimations from multiple remotely sensed datasets. GIScience & Remote Sensing, 2022, 59(1): 234–249

[19]	Shu M Y, Fei S P, Zhang B Y, Yang X H, Guo Y, Li B G, Ma Y T . Application of UAV multisensor data and ensemble approach for high-throughput estimation of maize phenotyping traits. Plant Phenomics, 2022, 2022: 9802585

[20]	Du P J, Mu H W, Guo S C, Chen Y, Zhang X G, Tang P F. Ensemble learning in remote sensing applications: progress and prospects. National Remote Sensing Bulletin, 2025, 29(6): 1614–1635 (in Chinese)

[21]	Yin J N, Medellín-Azuara J, Escriva-Bou A, Liu Z . Bayesian machine learning ensemble approach to quantify model uncertainty in predicting groundwater storage change. Science of the Total Environment, 2021, 769: 144715

[22]	Fei S P, Hassan M A, Xiao Y G, Su X, Chen Z, Cheng Q, Duan F Y, Chen R Q, Ma Y T . UAV-based multi-sensor data fusion and machine learning algorithm for yield prediction in wheat. Precision Agriculture, 2023, 24(1): 187–212

[23]	Zhang M Z, Chen T E, Gu X H, Kuai Y, Wang C, Chen D, Zhao C J . UAV-borne hyperspectral estimation of nitrogen content in tobacco leaves based on ensemble learning methods. Computers and Electronics in Agriculture, 2023, 211: 108008

[24]	Li Z P, Chen Z, Cheng Q, Duan F Y, Sui R X, Huang X Q, Xu H G . UAV-based hyperspectral and ensemble machine learning for predicting yield in winter wheat. Agronomy, 2022, 12(1): 202

[25]	Fei S P, Chen Z, Li L, Ma Y T, Xiao Y G . Bayesian model averaging to improve the yield prediction in wheat breeding trials. Agricultural and Forest Meteorology, 2023, 328: 109237

[26]	Qi H, Lü L J, Sun H F, Li S, Li T T, Hou L. Yield estimation of wheat lines based on UAV hyperspectral remote sensing and machine learning. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(7): 260–269 (in Chinese)

[27]	Su X X, Nian Y, Yue H, Zhu Y J, Li J, Wang W Q, Sheng Y L, Ma Q, Liu J K, Wang W H, Li X W . Improving wheat leaf nitrogen concentration (LNC) estimation across multiple growth stages using feature combination indices (FCIs) from UAV multispectral imagery. Agronomy, 2024, 14(5): 1052

[28]	Rouse J W Jr, Haas R H, Schell J A, Deering D W. Monitoring vegetation systems in the Great Plains with ERTS. In: Proceedings of the 3rd Earth Resources Technology Satellite Symposium. Washington, DC: NASA, 1974, 309–317

[29]	Barnes E M, Clarke T R, Richards S E, Colaizzi P, Haberland J, Kostrzewski M, Waller P, Choi C, Riley E, Thompson T, Lascano R J, Li H, Moran M S. Coincident detection of crop water stress, nitrogen status and canopy density using ground-based multispectral data. In: Proceedings of the 5th International Conference on Precision Agriculture. Madison: American Society of Agronomy, 2000, 16–19

[30]	Datt B. A new reflectance index for remote sensing of chlorophyll content in higher plants: tests using Eucalyptus leaves. Journal of Plant Physiology, 1999, 154(1): 30–36

[31]	Gitelson A A, Kaufman Y J, Merzlyak M N . Use of a green channel in remote sensing of global vegetation from EOS-MODIS. Remote Sensing of Environment, 1996, 58(3): 289–298

[32]	Jordan C F . Derivation of leaf-area index from quality of light on the forest floor. Ecology, 1969, 50(4): 663–666

[33]	Tucker C J . Red and photographic infrared linear combinations for monitoring vegetation. Remote Sensing of Environment, 1979, 8(2): 127–150

[34]	Gitelson A A, Gritz Y, Merzlyak M N . Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. Journal of Plant Physiology, 2003, 160(3): 271–282

[35]	Huete A R . A soil-adjusted vegetation index (SAVI). Remote Sensing of Environment, 1988, 25(3): 295–309

[36]	Rondeaux G, Steven M, Baret F. Optimization of soil-adjusted vegetation indices. Remote Sensing of Environment, 1996, 55(2): 95–107

[37]	Qi J, Chehbouni A, Huete A R, Kerr Y H, Sorooshian S . A modified soil adjusted vegetation index. Remote Sensing of Environment, 1994, 48(2): 119–126

[38]	Sripada R P, Heiniger R W, White J G, Weisz R . Aerial color infrared photography for determining late-season nitrogen requirements in corn. Agronomy Journal, 2005, 97(5): 1443–1451

[39]	Nagler P L, Scott R L, Westenburg C, Cleverly J R, Glenn E P, Huete A R . Evapotranspiration on western U.S. rivers estimated using the Enhanced Vegetation Index from MODIS and data from eddy covariance and Bowen ratio flux towers. Remote Sensing of Environment, 2005, 97(3): 337–351

[40]	Gitelson A A, Kaufman Y J, Stark R, Rundquist D . Novel algorithms for remote estimation of vegetation fraction. Remote Sensing of Environment, 2002, 80(1): 76–87

[41]	Dash J, Curran P J . The MERIS terrestrial chlorophyll index. International Journal of Remote Sensing, 2004, 25(23): 5403–5413

[42]	Jay S, Gorretta N, Morel J, Maupas F, Bendoula R, Rabatel G, Dutartre D, Comar A, Baret F . Estimating leaf chlorophyll content in sugar beet canopies using millimeter- to centimeter-scale reflectance imagery. Remote Sensing of Environment, 2017, 198: 173–186

[43]	Badgley G, Field C B, Berry J A . Canopy near-infrared reflectance and terrestrial photosynthesis. Science Advances, 2017, 3(3): e1602244

[44]	Liu W, Wang J Y . Recursive elimination-election algorithms for wrapper feature selection. Applied Soft Computing, 2021, 113: 107956

[45]	Liu J K, Zhu Y J, Tao X Y, Chen X F, Li X W . Rapid prediction of winter wheat yield and nitrogen use efficiency using consumer-grade unmanned aerial vehicles multispectral imagery. Frontiers in Plant Science, 2022, 13: 1032170

[46]	Zheng R Y, Jia Y Y, Ullagaddi C, Allen C, Rausch K, Singh V, Schnable J C, Kamruzzaman M . Optimizing feature selection with gradient boosting machines in PLS regression for predicting moisture and protein in multi-country corn kernels via NIR spectroscopy. Food Chemistry, 2024, 456: 140062

[47]	Fei S P, Hassan M A, He Z H, Chen Z, Shu M Y, Wang J K, Li C C, Xiao Y G . Assessment of ensemble learning to predict wheat grain yield based on UAV-multispectral reflectance. Remote Sensing, 2021, 13(12): 2338

[48]	Su X X, Nian Y, Shaghaleh H, Hamad A, Yue H, Zhu Y J, Li J, Wang W Q, Wang H, Ma Q, Liu J K, Li X W, Alhaj Hamoud Y. Combining features selection strategy and features fusion strategy for SPAD estimation of winter wheat based on UAV multispectral imagery. Frontiers in Plant Science, 2024, 15: 1404238

[49]	Rasmussen C E. Gaussian processes in machine learning. In: Bousquet O, von Luxburg U, Rätsch G, eds. Advanced Lectures on Machine Learning. Berlin, Heidelberg: Springer, 2004, 63–71

[50]	Hao Y F, Baik J, Choi M . Combining generalized complementary relationship models with the Bayesian Model Averaging method to estimate actual evapotranspiration over China. Agricultural and Forest Meteorology, 2019, 279: 107759

[51]	Richter K, Atzberger C, Hank T B, Mauser W . Derivation of biophysical variables from Earth observation data: validation and statistical measures. Journal of Applied Remote Sensing, 2012, 6(1): 063557

[52]	Berger K, Machwitz M, Kycko M, Kefauver S C, Van Wittenberghe S, Gerhards M, Verrelst J, Atzberger C, van der Tol C, Damm A, Rascher U, Herrmann I, Paz V S, Fahrner S, Pieruschka R, Prikaziuk E, Buchaillot M L, Halabuk A, Celesti M, Koren G, Gormus E T, Rossini M, Foerster M, Siegmann B, Abdelbaki A, Tagliabue G, Hank T, Darvishzadeh R, Aasen H, Garcia M, Pôças I, Bandopadhyay S, Sulis M, Tomelleri E, Rozenstein O, Filchev L, Stancile G, Schlerf M. Multi-sensor spectral synergies for crop stress detection and monitoring in the optical domain: a review. Remote Sensing of Environment, 2022, 280: 113198

[53]	Zhai W G, Li C C, Cheng Q, Ding F, Chen Z . Exploring multisource feature fusion and stacking ensemble learning for accurate estimation of maize chlorophyll content using unmanned aerial vehicle remote sensing. Remote Sensing, 2023, 15(13): 3454

[54]	Gawdiya S, Kumar D, Ahmed B, Sharma R K, Das P, Choudhary M, Mattar M A . Field scale wheat yield prediction using ensemble machine learning techniques. Smart Agricultural Technology, 2024, 9: 100543

[55]	Wu T Z, Zhang Z W, Wang Q, Jin W J, Meng K, Wang C, Yin G F, Xu B D, Shi Z H . Estimating rice leaf area index at multiple growth stages with Sentinel-2 data: an evaluation of different retrieval algorithms. European Journal of Agronomy, 2024, 161: 127362

[56]	Zhang S H, Duan J Z, Qi X H, Gao Y Z, He L, Liu L R, Guo T C, Feng W . Combining spectrum, thermal, and texture features using machine learning algorithms for wheat nitrogen nutrient index estimation and model transferability analysis. Computers and Electronics in Agriculture, 2024, 222: 109022

[57]	Zhu S L, Yang T L, Han D W, Zhang W J, Zain M, Yu Q Q, Zhao Y Y, Wu F, Yao Z S, Liu T, Sun C M . ODP: a novel indicator for estimating photosynthetic capacity and yield of maize through UAV hyperspectral images. Computers and Electronics in Agriculture, 2025, 235: 110350

[58]	Zheng J H, Zhang S . Improving rice phenology simulations based on the Bayesian model averaging method. European Journal of Agronomy, 2023, 142: 126646

[59]	Li G, Liu Z J, Zhang J W, Han H M, Shu Z K . Bayesian model averaging by combining deep learning models to improve lake water level prediction. Science of the Total Environment, 2024, 906: 167718

[60]	Muslim M A, Nikmah T L, Pertiwi D A A, Subhan, Jumanto, Dasril Y, Iswanto. New model combination meta-learner to improve accuracy prediction P2P lending with stacking ensemble learning. Intelligent Systems with Applications, 2023, 18: 200204

[61]	Wang Y M, Zhang Z, Feng L W, Du Q Y, Runge T . Combining multi-source data and machine learning approaches to predict winter wheat yield in the Conterminous United States. Remote Sensing, 2020, 12(8): 1232

[62]	Liu J K, Wang W Q, Su X X, Li J, Nian Y, Zhu X Q, Ma Q, Li X W. Prediction of rice yield and nitrogen use efficiency based on UAV multispectral imaging and machine learning. Transactions of the Chinese Society of Agricultural Engineering, 2025, 41(20): 127–138 (in Chinese)

RIGHTS & PERMISSIONS

The Author(s) 2026. Published by Higher Education Press. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0)