Interpretable machine learning model for forecasting compressive strength of aeolian sand concrete

Yun CHEN; Qianwang FU; Wenbo ZHENG

Front. Struct. Civ. Eng., 2025, Vol. 19, Issue 10: 1602–1620. DOI: 10.1007/s11709-025-1227-y

RESEARCH ARTICLE

Abstract

This study integrates Bayesian optimization (BO) with the natural gradient boosting (NGBoost) algorithm to accurately predict aeolian sand concrete (ASC) compressive strength. The main results are summarized as follows. 1) The NGBoost model demonstrates high precision in predicting ASC compressive strength, achieving testing set metrics with a coefficient of determination of 0.945, mean squared error of 4.145 MPa², and root mean squared error of 2.036 MPa. 2) Feature importance ranking from the NGBoost model identifies age as the most significant factor influencing ASC compressive strength, while the effects of aeolian sand ratio, water-to-binder ratio (W/B), and coarse aggregate are minimal. 3) SHapley Additive exPlanations (SHAP) analysis indicates that age, cement, coarse aggregate, and superplasticizer are positively correlated with the compressive strength of ASC, whereas the aeolian sand ratio, W/B, and fine aggregate show negative correlations. 4) A Python-based graphical user interface (GUI) has been developed to enable engineers to predict ASC compressive strength efficiently, thus enhancing the model’s practical application.

Keywords

ASC / compressive strength / NGBoost / BO / SHAP / GUI

1 Introduction

Aeolian sand is an extremely fine sand found in deserts and the Gobi, consisting of surface sand that has been transported and deposited by wind [1,2]. China’s landscape includes 334000 km² of desertified land, 37000 km² of wind-sanded regions, and 1162000 km² of the Gobi Desert, for a total of 1533000 km², which represents 15.9% of the nation’s total area and exceeds the total area of its arable land. In recent years, desertification has been progressing at an accelerating rate, shifting from an ecological and environmental issue to a significant economic and social challenge that contributes to both poverty and social instability. As a result, addressing desertification has become a worldwide challenge. Halting the expansion of deserts and promoting their reclamation are now fundamental to securing a sustainable living environment and expanding habitable spaces in the 21st century [3,4]. Western China possesses vast resources of aeolian sand. Employing this material as a replacement for river sand in concrete production offers a sustainable solution to both the increasing river sand shortage and ecological degradation. Therefore, developing and researching aeolian sand concrete (ASC) is an important initiative for advancing the economy and enhancing the environment.

With the advent of the “Western Development” and the “One Belt, One Road” strategies, ASC has become a key research focus in sustainable construction. The performance of ASC can vary considerably due to the variability in raw materials and mixing ratios. Extensive experimental studies have been conducted to optimize mix ratio design [5–7], analyze working performance, and assess the mechanical properties [8–15] and durability [16–24] of ASC. Studies suggest that incorporating a specific amount of aeolian sand enhances both the mechanical properties and the workability of concrete. However, a higher aeolian sand content may lead to significant losses in these areas, though this can be mitigated with the use of additives and admixtures. Durability studies on ASC, encompassing frost resistance, carbonation resistance, ionic erosion resistance, and multi-factor durability, have demonstrated that frost resistance improves with increased aeolian sand content, peaking at 100%, whereas carbonation resistance initially increases and then declines as sand content rises, peaking at an optimal level of 20%. Additionally, incorporating fly ash improves resistance to carbonation and penetration by chloride ions. Recent experimental studies have also explored the mechanical properties and seismic performance of ASC in structural components such as beams, columns, and beam-column joints [25–28], providing a theoretical foundation for ASC’s future applications in building structures.

Predicting concrete performance has long been an area of significant interest. Yet, its complexity, shaped by many factors, complicates accurate forecasts through conventional methodologies and models. The task of predicting ASC performance entails additional considerations, which further complicates understanding its mechanical properties. Recently, advancements in machine learning (ML) technologies have introduced innovative strategies in concrete performance prediction. These ML methods have proven effective in accurately forecasting outcomes in concrete performance.

ML has revolutionized the analysis of concrete’s mechanical properties, with earlier models focusing mainly on predicting compressive strength [29–43]. Research indicates that standalone models such as deep learning neural networks (DNN), support vector machines, and random forests (RF) can produce robust predictive results [44]. For instance, Nguyen et al. [34] applied a DNN to a database of 335 geopolymer concrete samples and successfully predicted the material’s compressive strength, demonstrating the DNN’s efficacy in concrete proportioning design. In a similar study, Gomaa et al. [35] predicted the compressive strength of alkali-activated concrete using an RF model; their work highlighted curing conditions and water content as essential variables. Predictive accuracy for concrete compressive strength is often enhanced by more than 20% when ensemble learning models are used rather than single algorithms [45–47]. For example, Asteris et al. [48] constructed a hybrid ensemble model for concrete containing supplementary cementitious materials, quantifying the effects of age, percentage of superplasticizer, and water-to-binder ratio (W/B) on compressive strength. The predictability of concrete’s compressive strength becomes notably less consistent when recycled aggregates are used, owing to their diverse origins and characteristics. For concrete made with recycled aggregates, Peng and Unluer [49] combined ML models with a swarm intelligence algorithm to assess compressive strength. They found that the combined model, improved through hyperparameter optimization, outperformed standalone models.

ML techniques have been employed by researchers to explore not only the mechanical properties but also the durability performance of concrete structures [50–53]. Khan [50] used artificial neural networks (ANN) to study how silica fume affects the chloride resistance of high-performance concrete. The results showed that silica fume provided a significant improvement in chloride resistance compared to conventional concrete mixtures. Additionally, Liu et al. [51] built a composite model incorporating six ML models to assess the carbonation depth in recycled concrete, examining the effects of common design factors such as moisture content and CO₂ levels on carbonation depth. Their findings suggest that carbonation depth is inversely related to exposure time and moisture content, but increases with higher cement content. To assess the influence of air-entraining agents and the W/B on the frost resistance of recycled concrete, Liu et al. [52] utilized a combination of multivariate adaptive regression splines, Gaussian process regression, and ANN models to estimate the durability factor. In a separate study, Liu et al. [53] predicted the compressive strength of recycled concrete affected by sulfate attack using a range of ML algorithms. According to their findings, strength degradation was primarily influenced by water absorption and cement content in recycled aggregates.

In conclusion, ML technologies have demonstrated their efficacy in forecasting both the durability and mechanical properties of concrete. To estimate the compressive strength of various concrete types, ML algorithms were utilized, considering mix proportions and other relevant factors. This facilitates the optimization of concrete mix designs and enhances mechanical performance. For durability, ML models predict concrete’s resistance to various degradative processes including chloride ion penetration, carbonation depth, freeze–thaw cycles, and sulfate attacks. Through analyzing interactions among variables such as the W/B, age, and admixtures, ML models provide critical insights into enhancing concrete durability, contributing to the creation of more robust and lasting concrete structures. In addition to its significant achievements in predicting the mechanical and durability properties of concrete, ML has been extensively applied across various other engineering domains. Notable applications include assessing surface treatment effects on the tribological performance of tool steels [54], developing bearing capacity models for geogrid-reinforced stone columns in soft clay [55], identifying structural damage [56], and predicting surface roughness in electro-discharge machining [57]. These diverse applications demonstrate its broad utility in practical engineering scenarios. Despite ML’s significant successes in concrete applications, research into forecasting ASC compressive strength with ML algorithms remains limited.

This study focuses on predicting the compressive strength of ASC using ML methods. A detailed data set was compiled, encompassing key factors such as cement, aeolian sand ratio, and age, among others. This data set was established to enable precise forecasts of ASC’s compressive strength. Additionally, advanced predictive models for ASC’s compressive strength were created using Bayesian optimization (BO) with three ML algorithms, namely natural gradient boosting (NGBoost), extreme gradient boosting (XGBoost), and RF. Each model’s performance was evaluated and compared using the coefficient of determination (R²), mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and a20 in order to determine the most accurate model. Furthermore, a sensitivity analysis of the input factors was conducted using SHapley Additive exPlanations (SHAP). Finally, a graphical user interface (GUI) built with Python was developed to forecast the compressive strength of ASC. Figure 1 illustrates the research roadmap. The structure of this paper is outlined as follows: Section 2 reviews the existing limitations in current knowledge, introduces advancements in the proposed method, and evaluates the practical implications of the study’s findings. Section 3 offers a comprehensive overview of the methodology and the workflow within the framework. Section 4 explores the results of hyperparameter tuning for ML models, assesses their performance in predicting the compressive strength of ASC, utilizes SHAP for feature sensitivity analysis, compares the proposed model with existing literature, and discusses the development of a GUI. Finally, Section 5 summarizes the conclusions of the study, highlighting its limitations and suggesting directions for future research.

2 Research significance

Forecasting the compressive strength of ASC presents a considerable challenge, largely owing to the intricate interplay of various parameters that contribute to its development. While the detrimental impact of aeolian sand addition on compressive strength is extensively documented in existing literature, a robust framework for the mixture proportioning of ASC, one that adequately accounts for key influential factors such as aeolian sand dosage, cement content, and W/B, is currently lacking. Furthermore, the scarcity of comprehensive experimental data impedes the advancement of generalized ML models applicable to materials science problems. Consequently, this study proposes an ML approach specifically designed to address these identified knowledge gaps.

A review of prior research reveals that existing ML algorithms have typically been applied to specific data sets, generating results that, while accurate for their chosen data, lack the capability for model updates. The performance of these ML models can be negatively impacted by the inclusion of previously unaccounted-for input features or the exclusion of important existing ones. Moreover, a dynamic ML model that researchers can readily leverage for broader research benefits is currently unavailable.

ML models in engineering, especially those used for predicting the compressive strength of ASC, encounter intrinsic limitations stemming from the inherent complexity of concrete and various environmental factors. This study undertakes a rigorous assessment of these constraints, offering a thorough analysis of ML model predictability and their inherent boundaries. By offering input options via a GUI, practitioners are provided with an intuitive way to input relevant parameters for the design and optimization of concrete. Although ML tools provide valuable insights, it is essential to view them as initial estimators rather than definitive solutions, considering the uncertainties and assumptions inherent in the modeling process. Nonetheless, the developed GUI-based model functions as an evolving foundational framework, improving its predictive capabilities through continuous updates. This method enables practitioners to make effective use of ML tools while staying aware of their built-in limitations and the evolving nature of this specialized field.

3 Methodology

3.1 Data collection

In this study, a comprehensive data set of 280 ASC compressive strength data points was compiled, primarily sourced from Refs. [5–28]. This data set was employed to train the ML algorithms detailed in Table 1. Each record in this database consists of eight input variables (cement, W/B, aeolian sand ratio, fine aggregate, coarse aggregate, fly ash/B, superplasticizer, and age) and one output variable, compressive strength.

3.2 Data preprocessing

In this study, box plots were employed for data visualization, enabling analysis of outliers. This method was selected to detect outliers that deviate substantially from the bulk of the data, as these could adversely affect the model’s predictive accuracy. Detected outliers were replaced with mean values from the data set to maintain the model’s efficacy. This approach was implemented to bolster the reliability of the data set and minimize the potential for anomalous data points to negatively impact the model’s performance. As depicted in Fig. 2, outliers were observed in the data points of seven input parameters: fly ash/B, fine aggregate, coarse aggregate, cement, W/B, age, and superplasticizer. Notably, the outliers associated with coarse aggregate, fine aggregate, and cement exhibit larger magnitudes.
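The outlier treatment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the conventional box-plot rule (points beyond 1.5 times the interquartile range from the quartiles) and assumes the replacement value is the mean of the whole column; the function name and the sample values are hypothetical.

```python
from statistics import mean, quantiles


def replace_outliers_with_mean(values, whisker=1.5):
    """Replace box-plot outliers in one feature column with the column mean."""
    # Quartiles of the column; "inclusive" treats the data as the full population.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    # Assumption: the replacement is the mean over all values in the column.
    column_mean = mean(values)
    return [v if lower <= v <= upper else column_mean for v in values]


# Hypothetical cement contents (kg/m^3); 900 is an implausible entry.
cement = [350, 360, 380, 400, 410, 420, 450, 900]
cleaned = replace_outliers_with_mean(cement)
```

Replacing rather than dropping outliers keeps all 280 records, at the cost of pulling the affected feature values toward the center of the distribution.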

3.3 Data visualization

In this study, Fig. 3 displays heatmaps that illustrate the Pearson correlation coefficients (PCC) among the output and input variables. The existence of a strong linear relationship between the parameters is indicated by a PCC value approaching either 1 or −1, demonstrating a high degree of association. Figure 3 illustrates that most of the input parameters have PCC values below 0.7, signifying the absence of multicollinearity among the parameters and no adverse impact on the predicted output results. With respect to the correlation between output and input parameters, a particularly strong linear relationship, quantified by a PCC value exceeding 0.7, was discerned between the age of the material and its compressive strength. Conversely, a strong inverse correlation exists between cement and fly ash/B, indicated by a PCC value of −0.67.
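For reference, the PCC used in the heatmap can be computed directly from its definition. The sketch below is illustrative only; the helper names are hypothetical, and the 0.7 cutoff is the threshold quoted in the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def strongly_correlated(x, y, threshold=0.7):
    """Flag a pair of variables whose |PCC| reaches the threshold."""
    return abs(pearson(x, y)) >= threshold
```

Note that the check uses the absolute value of the PCC, so a value such as −0.67 falls just below the 0.7 threshold.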

Additionally, histograms were used as another visualization technique, as shown in Fig. 4. The histograms describe the range and density of the data distribution for each input and output parameter. Figures 4(a)–4(i) illustrate that the cement content was concentrated within the range of 350–450 kg/m³, while the W/B was concentrated between 0.4 and 0.5. The fly ash/B was concentrated within the range of 0–0.1, and the aeolian sand ratio was concentrated between 20% and 60%. The fine aggregate content was primarily 550–650 kg/m³, and the coarse aggregate content was 1100–1200 kg/m³. Furthermore, the superplasticizer content was concentrated within 0–4 kg/m³, and the age was concentrated at 7 and 28 d. Finally, the compressive strength was mainly concentrated in the range of 20–60 MPa.

3.4 Data normalization

Normalization techniques were utilized to standardize the data, effectively adjusting the elements to possess a mean of zero and a standard deviation of unity. This method of standardization improves the accuracy of the model and facilitates more exact predictions [29]. By standardizing all feature scales, this method sought to primarily improve the performance and effectiveness of ML algorithms.

$$x_{\mathrm{standardized}} = \frac{x - \bar{x}}{\sigma}. \tag{1}$$

In Eq. (1), x signifies the feature undergoing standardization, σ denotes the standard deviation of x, and x̄ represents the mean of x in the data set.
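Equation (1) can be applied column by column, as in the following minimal sketch. The source does not say whether the population or the sample standard deviation was used; the population form is assumed here, and the function name and sample ages are illustrative.

```python
from statistics import mean, pstdev


def standardize(values):
    """Apply Eq. (1): subtract the mean and divide by the standard deviation."""
    x_bar = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(v - x_bar) / sigma for v in values]


# Hypothetical curing ages in days.
ages = [7, 14, 28, 56]
ages_standardized = standardize(ages)
```

In practice the mean and standard deviation are usually estimated on the training split only and then reused for the test split, so that no test information leaks into training; whether the study did this is not stated.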

3.5 Machine learning algorithms

3.5.1 Natural gradient boosting

Introduced by Duan et al. [58] in 2019, NGBoost is a novel supervised ML model designed specifically for probabilistic prediction tasks. For probabilistic regression, P_θ(y|x), this innovative method employs a multi-parameter boosting framework [36], in which the parameters of the conditional distribution, θ, serve as the targets for gradient boosting. It is built upon a structure integrating three essential modular components: a scoring rule, a probability distribution, and a base learner [58], with a visualization provided in Fig. 5(a). Determining the type of base learner constitutes a hyperparameter decision, and decision trees (DT) are often favored due to their demonstrated effectiveness in handling structured inputs when used with NGBoost. To express uncertainty within NGBoost, prediction intervals are employed. A variety of probability distributions can be employed for predictive inference, including the Bernoulli distribution for binary outcomes and the normal distribution for continuous outputs [58]. In this research, we exclusively used the normal distribution to probabilistically estimate compressive strength values. NGBoost adopts a unique approach to the scoring rule compared to conventional gradient boosting models like categorical boosting, implementing maximum-likelihood estimation to evaluate the quality of predictions generated from a probabilistic distribution [36]. The training of NGBoost further incorporates natural gradient techniques; these techniques refine model parameters and train independent base learners, consequently diminishing the influence stemming from the support of multidimensional prediction outputs and the parameterization process of the probability distribution [36].
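To make the scoring rule and the natural gradient concrete, the sketch below evaluates both for a single observation under a normal distribution. It assumes the distribution is parameterized by (μ, log σ) and scored by the negative log-likelihood; under that parameterization the Fisher information is diag(1/σ², 2), so the natural gradient is the ordinary gradient rescaled by its inverse. Function names and the numeric values are illustrative, not taken from the study.

```python
import math


def normal_nll(y, mu, log_sigma):
    """Negative log-likelihood of y under Normal(mu, exp(log_sigma)^2)."""
    sigma = math.exp(log_sigma)
    return 0.5 * math.log(2 * math.pi) + log_sigma + 0.5 * ((y - mu) / sigma) ** 2


def natural_gradient(y, mu, log_sigma):
    """Natural gradient of the NLL with respect to (mu, log_sigma)."""
    sigma2 = math.exp(2 * log_sigma)
    # Ordinary gradient of the NLL.
    grad_mu = (mu - y) / sigma2
    grad_log_sigma = 1.0 - (y - mu) ** 2 / sigma2
    # Multiply by the inverse Fisher information diag(sigma^2, 1/2).
    return grad_mu * sigma2, grad_log_sigma / 2.0


# Hypothetical case: observed strength 40 MPa, predicted Normal(35, 2^2).
nat_mu, nat_log_sigma = natural_gradient(40.0, 35.0, math.log(2.0))
```

A boosting step moves the parameters against this direction: here the mean is pushed upward toward the observation and the predicted spread is widened, because the observation lies more than two standard deviations from the current mean.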

3.5.2 Random forests

The RF algorithm, an ensemble learning technique rooted in DT, generates numerous DTs and combines their predictions through averaging or voting to achieve regression forecasts. The procedure is as follows. First, the training data set is randomly divided into several subsets, each comprising a distinct combination of samples and features. Next, a DT prediction model is constructed for each subset, where the features for node splitting are randomly chosen from the subset’s feature set. Each DT model then generates predictions for the test data set. Finally, the predictions from all DTs are combined, using either the mean or a weighted sum, to produce the final prediction. Figure 5(b) presents a schematic illustration of the RF method.

3.5.3 Extreme gradient boosting

XGBoost, an ML algorithm grounded in gradient boosting trees, is designed to tackle regression and classification tasks. Central to this algorithm is the iterative training of a series of DT models, which are then combined to boost prediction precision. In addition, XGBoost incorporates regularization terms for each weak learner to control model complexity, enhance generalization, and mitigate overfitting, thus improving its effectiveness in both regression and classification. A schematic depiction outlining the fundamental principles of XGBoost can be observed in Fig. 5(c).
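For orientation, the regularization mentioned above is usually written as a penalty added to the training loss in the standard XGBoost formulation. The expression below is that textbook form, not an equation from this paper; the symbols K (number of trees), T (number of leaves in a tree), w (vector of leaf weights), and the penalty coefficients γ and λ are introduced here for illustration only.

```latex
\mathcal{L} = \sum_{i=1}^{N} l\left(y_i, y_i'\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert^{2}
```

The first term measures the fit of the predictions y_i′ to the observations y_i, while the second penalizes trees with many leaves or large leaf weights.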

3.6 Model development

3.6.1 Model construction techniques

The data set was split at the start of model building, 20% for testing and 80% for training. Following this split, the study used the training data to fine-tune the hyperparameters of three algorithms: NGBoost, RF, and XGBoost. Optimized algorithms served to build predictive models for ASC compressive strength. The testing set, depicted in Fig. 6, served to evaluate the models’ predictive accuracy.
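A random 80/20 partition of the 280 records can be sketched as follows. The source does not report a random seed or whether the split was stratified, so the seed and the plain shuffle below are assumptions, and the function name is hypothetical.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle record indices and hold out a fraction of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: fixed seed for repeatability
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(280)
```

With 280 records this yields 224 training samples and 56 testing samples.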

3.6.2 Hyperparameter optimization

Hyperparameter optimization for the three ML models was conducted using BO. Unlike grid and random searches, BO builds a probabilistic model of the unknown objective function [59]. Specifically, the distribution of the objective function was represented using a Gaussian process [60]. In this paper, the optimization goal (x⁺) is to find the sampling point at which an unknown function f(x) reaches its maximum value. The hyperparameter search space, H, was incorporated into the optimization process as defined in Eq. (2) [61]:

$$x^{+} = \underset{x \in H}{\arg\max}\, f(x). \tag{2}$$

BO is grounded in Bayes’ theorem [62], which gives the probability of an event under a given condition, as shown in Eq. (3). In the optimization, the posterior distribution results from combining the prior of f(x) with sample data. This information is used to pinpoint where the function f(x) reaches its maximum, consistent with Eq. (2). An acquisition function, which directs the selection of the next hyperparameter set to investigate [59], plays a crucial role in this process. By evaluating the acquisition function, one can determine the expected optimal hyperparameters.

$$P(\phi \mid s) = \frac{P(s \mid \phi)\, P(\phi)}{P(s)}. \tag{3}$$

In this context, ϕ denotes the event whose probability we aim to find, while s indicates the given condition. P(ϕ) is the prior probability, P(s) is the marginal probability of the condition, P(s|ϕ) is the likelihood function, and P(ϕ|s) represents the posterior probability.
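The role of the acquisition function can be illustrated with expected improvement, one common choice for maximization problems. The source does not state which acquisition function was used, so this choice is an assumption, as are the function and parameter names. Given the Gaussian process posterior mean and standard deviation at a candidate point, the function scores how much improvement over the best value found so far can be expected.

```python
import math


def expected_improvement(mu, sigma, best_so_far, xi=0.0):
    """Expected improvement of a candidate for a maximization problem.

    mu, sigma: posterior mean and standard deviation at the candidate point.
    best_so_far: best objective value observed so far.
    xi: optional exploration margin (hypothetical default of 0).
    """
    gain = mu - best_so_far - xi
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gain * cdf + sigma * pdf
```

The next hyperparameter set to evaluate is the candidate with the largest score. Points with a high predicted value or a large uncertainty both score well, which balances exploitation and exploration.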

Evaluation of the ASC compressive strength prediction models across hyperparameter configurations followed, using 10-fold cross-validation (10-CV). During each iteration of this procedure, one of the ten subsets of the training data served as the validation set, while the other nine were used for training. The validation set’s MSE was selected as the evaluation metric for the model. After ten repetitions of the procedure, the average of the obtained MSEs determined the model’s final MSE. The model’s final training error was represented by Eq. (4):

$$CV_{(10)} = \frac{1}{10} \sum_{i=1}^{10} MSE_i. \tag{4}$$

In this context, CV₍₁₀₎ represents the final training error, while MSE_i denotes the MSE obtained on the validation set in the ith iteration.
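The 10-CV procedure and Eq. (4) can be sketched generically as below. The `fit` and `predict` callables stand in for any of the three models and are hypothetical placeholders; folds are taken as contiguous blocks, which assumes the training data have already been shuffled.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_mse(x, y, fit, predict, k=10):
    """Average validation MSE over k folds, as in Eq. (4)."""
    fold_mses = []
    for val_idx in k_fold_indices(len(x), k):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        squared_errors = [(y[i] - predict(model, x[i])) ** 2 for i in val_idx]
        fold_mses.append(sum(squared_errors) / len(squared_errors))
    return sum(fold_mses) / k


# Toy stand-in model: always predicts the mean of the training targets.
def fit_mean(x_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x_row):
    return model
```

During BO, each candidate hyperparameter set would be scored by calling `cross_validated_mse` on the training data, and the resulting average would be fed back to the optimizer.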

The concluding performance was assessed by calculating the average of the outcomes from the ten iterations, as illustrated in Fig. 7. Combining BO with 10-CV enhances the reliability of model evaluation and preserves the model’s generalization capability. The outcomes of the hyperparameter optimization were presented in Table 2.

3.6.3 Model performance evaluation

Six established statistical metrics were used to assess the models: RMSE, MSE, MAE, R², MAPE, and a20. R² represents the proportion of variance in the observed data that is explained by the model, whereas RMSE, MAE, MAPE, and MSE gauge the precision of the model’s predictions. A model is generally considered more accurate as R² approaches 1 and the values of MAE, MAPE, MSE, and RMSE decline. The a20 index serves as a reliability measure, indicating the proportion of samples whose predicted values deviate by no more than 20% from the corresponding experimental values. Together, these indicators validate the model’s accuracy and reliability. Equations (5)–(10) define these metrics, where y_i denotes the actual value, ȳ the mean of the actual values, y_i′ the predicted value, and N the sample size. S denotes the total sample size of the data set, and s20 signifies the number of samples for which the ratio of the experimental value to the predicted value falls within the range of 0.80 to 1.20.

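$$R^{2} = 1 - \frac{\sum_{i=1}^{N} (y_i - y_i')^{2}}{\sum_{i=1}^{N} (y_i - \bar{y})^{2}}, \tag{5}$$

$$MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - y_i')^{2}, \tag{6}$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - y_i')^{2}}, \tag{7}$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - y_i' \right|, \tag{8}$$

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - y_i'}{y_i} \right| \times 100, \tag{9}$$

$$a20 = \frac{s20}{S}. \tag{10}$$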
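Equations (5)–(10) translate directly into code. The sketch below is a plain-Python illustration with hypothetical names and made-up sample values; it takes S equal to N, that is, a20 is computed over the same set of samples as the other metrics.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R2, MSE, RMSE, MAE, MAPE (%) and a20 as in Eqs. (5)-(10)."""
    n = len(y_true)
    y_bar = sum(y_true) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    mse = ss_res / n
    # s20: samples whose experimental/predicted ratio lies in [0.80, 1.20].
    s20 = sum(1 for y, p in zip(y_true, y_pred) if 0.80 <= y / p <= 1.20)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n,
        "MAPE": 100.0 * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / n,
        "a20": s20 / n,
    }


# Hypothetical measured and predicted strengths in MPa.
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 27.0, 41.0, 65.0]
metrics = regression_metrics(measured, predicted)
```

Because RMSE is the square root of MSE, the two always rank models identically; the reported test values of 4.145 MPa² and 2.036 MPa are consistent with this relationship.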

3.7 SHapley Additive exPlanations

ML interpretability pertains to the degree to which human interpreters can comprehend the rationale behind the decisions or forecasts generated by ML models. The SHAP model, grounded in game theory, provides a post-hoc interpretative technique [30]. The incremental contribution of each feature to the model was assessed using this technique. Through the utilization of the SHAP method, precise contribution values were computed and amalgamated to anticipate the model’s output, elucidated in Eqs. (11)–(13). Unlike traditional methods used to ascertain feature importance in ML outcomes, the SHAP method not only assesses the impact and significance of each feature but also examines the patterns of influence these features have on the predictions. It further enables in-depth analysis of individual sample feature impacts, encompassing their positive or negative attributes, thereby facilitating both global and local elucidations.

$$g(x) = \varphi_0 + \sum_{j=1}^{M} \varphi_j x_j = f(x), \tag{11}$$

$$\varphi_j = \sum_{S} \frac{|S|!\,(M - |S| - 1)!}{M!} \left( f_x(S \cup \{x_j\}) - f_x(S) \right), \tag{12}$$

$$S \subseteq \{x_1, x_2, \ldots, x_p\} \setminus \{x_j\}. \tag{13}$$

In this context, x is the set of feature values for each case, and x₁, x₂, …, x_p are the feature sets for all cases. S designates a feature subset, while M represents the total feature count. φ₀ is the mean predicted value from the ML algorithm across the training data and φ_j represents the contribution of the jth feature. g(x) is the post-hoc model’s predicted outcome, and f(x) the ML algorithm’s prediction.
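For a model with only a few features, the Shapley contributions can be computed exactly by enumerating every feature subset. The sketch below is a simplified illustration rather than the SHAP library's algorithm: it assumes that features outside the subset S are set to a single baseline point (whereas SHAP averages over background data), and the toy strength model and its coefficients are invented.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley contribution of each feature of x relative to a baseline."""
    m = len(x)

    def value(subset):
        # Features in the subset take their actual values; the rest use the baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in range(m)]
        return model(mixed)

    contributions = []
    for j in range(m):
        others = [k for k in range(m) if k != j]
        phi_j = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = set(subset)
                phi_j += weight * (value(s | {j}) - value(s))
        contributions.append(phi_j)
    return contributions


# Hypothetical toy model with inputs [age (d), W/B, cement (kg/m^3)].
def toy_strength(z):
    return 10.0 + 0.5 * z[0] - 20.0 * z[1] + 0.02 * z[2]


sample = [28.0, 0.40, 400.0]
reference = [7.0, 0.50, 350.0]
phi = shapley_values(toy_strength, sample, reference)
```

The contributions add up to the difference between the prediction for the sample and the prediction for the baseline, which is the additive property that the explanation model relies on. Exact enumeration grows exponentially with the number of features, so it is practical only for small inputs such as the eight variables used here.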

Shown in Fig. 8 is the SHAP model’s functional diagram, where the average compressive strength of all specimens serves as the baseline value. In determining the final predicted compressive strength, features 1 and 2 represent characteristics that positively impact compressive strength, such as increased curing age or reduced W/B. Conversely, feature 3, potentially indicating a high aeolian sand ratio, has a detrimental effect on compressive strength.

4 Results and discussion

4.1 Hyperparameter tuning results

For the computational analysis in this research, the Python scikit-learn library was utilized. NGBoost, RF, and XGBoost require specific parameter adjustments to optimize prediction performance. The selection of parameters for these models was guided by Refs. [31,32,63,64].

In this study, four hyperparameters were configured for the NGBoost model: n_estimators, col_sample, learning_rate, and minibatch_frac per iteration. A smaller col_sample value increases base learner diversity, limiting the chance of overfitting, although it may cause a degree of information loss. Increasing the col_sample value can boost stability, yet it also introduces the potential for overfitting. When a smaller learning_rate is used, each base learner has less influence on the model, so additional iterations are required to properly fit the training data [64]. A larger learning_rate accelerates convergence but increases the chance of overfitting. A smaller minibatch_frac can speed up training but might introduce randomness, leading to instability in the training process. A larger minibatch_frac enhances the stability of gradient estimates, although it might lengthen the training time [64]. Increasing n_estimators can significantly boost model performance, albeit with more computational demands. An appropriate increase in n_estimators can enhance model generalizability, yet an overly large value heightens the risk of overfitting [64].

The RF model’s crucial parameters are max_depth and n_estimators. The n_estimators parameter specifies how many DTs are created within the RF, enabling the optimization of the model’s bias and variance. The max_depth parameter governs the tree generation process by limiting how deep each tree can grow. For the XGBoost model, the critical parameters that require configuration are learning_rate, n_estimators, and max_depth. XGBoost shares max_depth and n_estimators with the RF model, while the learning_rate parameter determines the weight given to each DT.

Table 2 offers a comprehensive summary of the hyperparameters that need optimization for the models NGBoost, RF, and XGBoost chosen in this study, along with their range settings and best values [64]. To find the best parameter values, BO proceeds iteratively, using the MSE from each iteration to adjust parameters for the following iteration, thereby continuously improving performance [64]. Figure 9 illustrates the optimization process.

4.2 Machine learning models predictive performance analysis and evaluation

4.2.1 Predictive performance of machine learning models

The performance of the three ML algorithms in predicting the compressive strength of ASC is illustrated in Fig. 10, which includes both test data and predictions. The bar graphs below each panel display the error information. Figure 10 shows how performance varies among the ML algorithms on both the testing and training sets. The XGBoost model demonstrated excellent performance during training; however, its generalization to the test set was comparatively poor, mainly because the scarcity of new data restricted its predictive ability. The NGBoost and RF models outperform the XGBoost model on the test set, attributable to their incorporation of strong randomness and diversity, which enhances model generalization. When the NGBoost and RF models are compared, both show similar performance on the training and test sets; however, NGBoost has a lower error on the test set, suggesting better generalization capability.

4.2.2 Machine learning predictive model evaluation and comparison

Table 3 and Fig. 11 display the compressive strength predictions for ASC from the three models, encompassing both training and testing phase results.

While the XGBoost model demonstrated a high R² value of 0.997 during training, a significant decrease to 0.939 on the testing set indicates that the model may be overfitting the training data. In contrast, the NGBoost and RF models, both based on DT ensemble techniques, demonstrated better generalization, achieving R² values over 0.93 on the test set. Compared to RF, NGBoost showed slightly superior R² values for both test and training data sets and presented lower MAPE, MAE, RMSE, and MSE, highlighting its enhanced ability to generalize.

This pattern is also evident in the Taylor diagram (Fig. 12), which shows graphically how close each model lies to the “Ref” point, that is, how well its predictions match the measured data (closer proximity signifies better performance). NGBoost lay closest to the “Ref” point, whereas the RF and XGBoost models showed lower predictive performance for compressive strength.
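A Taylor diagram places each model according to three statistics: the standard deviation of its predictions, their correlation with the reference data, and the centered root-mean-square difference, which is the distance to the “Ref” point. The sketch below computes these quantities with the standard definitions; the data are hypothetical and the function name `taylor_statistics` is introduced only for this example.

```python
import math


def taylor_statistics(reference, predicted):
    """Return (std_ref, std_pred, correlation, centered RMS difference)."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    dr = [r - mean_r for r in reference]
    dp = [p - mean_p for p in predicted]
    std_r = math.sqrt(sum(d * d for d in dr) / n)
    std_p = math.sqrt(sum(d * d for d in dp) / n)
    corr = sum(a * b for a, b in zip(dr, dp)) / (n * std_r * std_p)
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dr, dp)) / n)
    return std_r, std_p, corr, crmsd


# Hypothetical strengths in MPa, for illustration only.
reference = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 29.0, 41.0, 48.0]
std_r, std_p, corr, crmsd = taylor_statistics(reference, predicted)

# Law-of-cosines identity that makes the diagram geometrically consistent.
identity = std_r ** 2 + std_p ** 2 - 2.0 * std_r * std_p * corr
print(f"corr = {corr:.4f}, centered RMSD = {crmsd:.4f}")
```

A model whose predictions reproduce both the spread of the measurements and their ordering has a small centered RMS difference and therefore plots close to “Ref”.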

The predictive performance of the three models on the training and testing sets is compared in Fig. 13. For the XGBoost model, the test set data exhibit greater scatter, whereas the training set data cluster near the regression line. Comparing the NGBoost and RF models, the NGBoost model aligns its predictions more consistently with the regression line for both the test and training sets, usually underestimating the training data by less than 10%. Both models show converging distributions of the test and training data, suggesting better generalization abilities. These findings agree with the results shown in Table 3.

In summary, the NGBoost and RF models predict ASC compressive strength more accurately than the XGBoost model. Their success stems from their ability to capture the nonlinear relationships between predictors and target. These models are particularly effective in handling high-dimensional data and in identifying the underlying patterns and structures that capture the critical features of the data set. Compared with the other two models, the NGBoost model shows a lower prediction error on the testing set, highlighting its robust generalization ability.

4.3 Global interpretation based on SHapley Additive exPlanations analysis

Subsection 4.2 showed that, among the three ML models examined, the NGBoost model predicted ASC compressive strength most effectively, achieving R² values exceeding 0.93 on both the training and testing sets. Based on the outputs of the NGBoost model, SHAP explainability techniques were used to interpret the influence of the individual feature parameters and to compare these interpretations with the results obtained from the NGBoost model.

4.3.1 Average SHapley Additive exPlanations values of features

Figure 14(a) shows the mean SHAP value assigned to each feature parameter in the NGBoost regression model. These values indicate how strongly each feature affects the predicted target.

As shown in Fig. 14(a), age is the most significant factor, with a mean SHAP value of 5.81. It is followed by the aeolian sand ratio, coarse aggregate, and W/B, with SHAP values of 1.37, 1.26, and 1.11, respectively. Fine aggregate, superplasticizer, cement, and fly ash/B are less influential, with SHAP values of 0.85, 0.85, 0.51, and 0.47, respectively.
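To put these values in proportion, the short calculation below normalizes the eight mean SHAP values from Fig. 14(a) so that they sum to one. Expressing the importances as shares is an illustrative convention adopted here, not a step reported in the study.

```python
# Mean SHAP values as reported for Fig. 14(a).
mean_shap = {
    "age": 5.81,
    "aeolian sand ratio": 1.37,
    "coarse aggregate": 1.26,
    "W/B": 1.11,
    "fine aggregate": 0.85,
    "superplasticizer": 0.85,
    "cement": 0.51,
    "fly ash/B": 0.47,
}

total = sum(mean_shap.values())
shares = {name: value / total for name, value in mean_shap.items()}

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:20s} {share:6.1%}")

# Ratio between the most important feature and the runner-up.
age_vs_next = mean_shap["age"] / mean_shap["aeolian sand ratio"]
```

Under this normalization, age accounts for a little under half of the total mean SHAP value and is roughly four times as influential as the aeolian sand ratio, the next feature in the ranking.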

4.3.2 SHapley Additive exPlanations heat map of impact values

Average SHAP values summarize overall feature importance, whereas SHAP heatmaps show the SHAP value of each feature for every sample. This detailed visualization clarifies how individual features influence the prediction for each sample.
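The per-sample values behind such a plot can be illustrated with an exact Shapley computation on a small example. The sketch below is not the fitted NGBoost model: `toy_strength`, its coefficients, the three features, and the single baseline row (standing in for the background data that SHAP averages over) are all hypothetical.

```python
from itertools import permutations


def toy_strength(age, cement, wb):
    """Hypothetical strength model in MPa; not the fitted NGBoost model."""
    return 10.0 + 0.4 * age + 0.05 * cement - 30.0 * wb + 0.001 * age * cement


def shapley_values(model, x, baseline):
    """Exact Shapley values of one sample relative to a baseline row."""
    names = list(x)
    phi = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        prev = model(**current)
        for name in order:
            current[name] = x[name]  # switch this feature to the sample value
            new = model(**current)
            phi[name] += (new - prev) / len(orders)  # average marginal effect
            prev = new
    return phi


baseline = {"age": 28.0, "cement": 350.0, "wb": 0.45}
sample = {"age": 7.0, "cement": 400.0, "wb": 0.40}
phi = shapley_values(toy_strength, sample, baseline)

for name, value in phi.items():
    print(f"{name}: {value:+.3f} MPa")
```

The contributions of one sample sum to the difference between its prediction and the baseline prediction. Averaging the absolute contributions over all samples gives a global importance ranking, while plotting them sample by sample gives the detailed view discussed here.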

Figure 14(b) provides a detailed SHAP depiction of the impact of the feature parameters on the prediction of ASC compressive strength. The parameters are arranged in descending order of their influence on the predictions. The magnitude of a feature’s impact on ASC compressive strength is directly proportional to its SHAP value. Blue denotes a negative correlation, while red signifies a positive correlation.

As shown in Fig. 14(b), the higher SHAP values show that age is positively associated with ASC’s compressive strength, indicating that strength tends to increase as ASC ages. Similar trends were observed for coarse aggregate, superplasticizer, and cement. Despite some outliers for coarse aggregate, a positive correlation was generally observed, suggesting that higher material purity in ASC preparation correlates with better mechanical properties. In contrast, the trend for aeolian sand ratio, W/B, and fine aggregate was the opposite: higher values of these features lead to lower compressive strength. Fly ash/B does not show a distinct linear correlation, so further examination is required for this variable. Moreover, the agreement between the SHAP-based contributions of the input variables and the experimental results supports the accuracy and reliability of the predictions produced by the NGBoost model.

4.4 Dependency analysis based on SHapley Additive exPlanations

A dependency analysis was conducted using the NGBoost model to evaluate the impact of different input parameters on ASC compressive strength and to investigate the nonlinear interactions underscored by a comprehensive SHAP analysis. SHAP dependency plots were used to assess how various input variables affect ASC’s compressive strength. Figure 15 depicts the relationships between distinct input attributes and the compressive strength of ASC.

Figure 15 illustrates how the SHAP values for age, coarse aggregate, aeolian sand ratio, W/B, and other parameters change with the values of these variables. Figure 15(a) illustrates how age affects compressive strength. Generally, compressive strength increases with age, with the fitted curve resembling a quadratic function. For ages below 14 d, the SHAP value is negative, indicating that early age lowers the predicted compressive strength relative to the data set average, with a minimum of approximately −18 MPa at around 3 d. Conversely, for ages above 14 d, age has an enhancing effect on compressive strength, reaching a maximum SHAP value of about 9 MPa at 58 d. The sample points are distributed fairly uniformly along the X-axis, suggesting that the age values in the sample set encompass both ordinary concrete and ASC, which makes the data set more representative.

Additionally, Fig. 15(a) shows how coarse aggregate affects compressive strength, using color to represent its level: blue for low and red for high coarse aggregate content. The coarse aggregate values are evenly distributed across the different ages, consistent with real-world observations. At a fixed age, a high coarse aggregate content can have either a negative or a positive effect, contrary to the expectation that higher coarse aggregate content typically reduces compressive strength. In Fig. 15(c), when the aeolian sand ratio is constant, a high W/B can likewise have either a positive or a negative effect. This suggests a strong correlation between compressive strength and the aeolian sand ratio under typical conditions. Similarly, Figs. 15(b) and 15(d) show the trends in SHAP values for the other significant parameters and their associated features.

4.5 Comparison of the proposed Bayesian optimization with natural gradient boosting (BO-NGBoost) model with literature models

To assess the efficacy of the proposed BO-NGBoost model, its performance was benchmarked against three contemporary models developed by Rehman et al. [65], Zhao et al. [66], and Ji et al. [67] for predicting concrete compressive strength. As presented in Table 4, the BO-NGBoost model demonstrated superior performance, exhibiting a higher R² and lower MAPE, MAE, and RMSE values. This outcome underscores the BO-NGBoost model’s enhanced accuracy and generalizability in forecasting the compressive strength of ASC compared to existing models in the literature.

4.6 Graphical user interface design

A Python application with a GUI was developed to make the ML model easier to use for predicting ASC’s compressive strength; the GUI is shown in Fig. 16. Built with Python’s Tkinter module, the GUI incorporates the NGBoost model to generate predictions. Users enter feature data, such as W/B, fly ash/B, aeolian sand ratio, and age, and then click the predict button to receive the predicted ASC compressive strength. The interface is designed to be intuitive, allowing users without technical expertise to operate the ML model. Overall, the GUI provides fast access to ASC compressive strength predictions, thereby enhancing the model’s practical usability.
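The structure of such an interface can be sketched as follows. This is an illustrative outline only, not the application shown in Fig. 16: the field list is limited to the four inputs named above, and `parse_inputs`, `placeholder_predict` (with made-up coefficients), and `build_gui` are hypothetical names. In a real tool, the placeholder would be replaced by a call to the trained model.

```python
FEATURES = ["W/B", "fly ash/B", "aeolian sand ratio", "age (d)"]


def parse_inputs(raw):
    """Convert raw entry strings to floats; name the field if one is invalid."""
    values = []
    for name in FEATURES:
        text = raw.get(name, "").strip()
        try:
            values.append(float(text))
        except ValueError:
            raise ValueError(f"Invalid number for {name!r}: {text!r}") from None
    return values


def placeholder_predict(values):
    """Hypothetical stand-in for the trained model's prediction, in MPa."""
    wb, fly_ash, sand_ratio, age = values
    return 40.0 - 30.0 * wb - 5.0 * fly_ash - 8.0 * sand_ratio + 0.2 * age


def build_gui(predict=placeholder_predict):
    """Build the input form and return the Tk root window."""
    import tkinter as tk  # imported here so the helpers work without a display

    root = tk.Tk()
    root.title("ASC compressive strength (sketch)")
    entries = {}
    for row, name in enumerate(FEATURES):
        tk.Label(root, text=name).grid(row=row, column=0, sticky="w")
        entry = tk.Entry(root)
        entry.grid(row=row, column=1)
        entries[name] = entry
    result = tk.StringVar(value="Enter values and click Predict")

    def on_predict():
        try:
            values = parse_inputs({n: e.get() for n, e in entries.items()})
            result.set(f"Predicted strength: {predict(values):.2f} MPa")
        except ValueError as exc:
            result.set(str(exc))  # show the validation message in the window

    tk.Button(root, text="Predict", command=on_predict).grid(
        row=len(FEATURES), column=0, columnspan=2)
    tk.Label(root, textvariable=result).grid(
        row=len(FEATURES) + 1, column=0, columnspan=2)
    return root


# To open the window on a machine with a display: build_gui().mainloop()
```

Keeping input parsing and prediction outside the window code allows both to be tested without a display and lets the model be swapped without touching the layout.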

5 Conclusions

A comprehensive database of ASC compressive strength was assembled as part of this research. The prediction of ASC compressive strength was carried out using three ML algorithms, and their respective performances were evaluated. SHAP analysis was then employed to interpret the predictions, yielding the following conclusions.

1) The NGBoost model demonstrated excellent accuracy in estimating ASC compressive strength. For the testing set, the evaluation metrics R², MSE, RMSE, MAE, and MAPE were 0.945, 4.145 MPa², 2.036 MPa, 1.661 MPa, and 0.044, respectively.

2) Compared to the RF and XGBoost models, the NGBoost-based probabilistic prediction model was more suitable for accounting for the effects of random variables, offering greater reliability and robustness.

3) The NGBoost model ranked features based on their influence on ASC compressive strength, identifying age as the dominant factor, while the effects of aeolian sand ratio, coarse aggregate, and W/B were comparatively small.

4) SHAP analysis revealed that age, coarse aggregate, superplasticizer, and cement were positively correlated with ASC compressive strength, whereas the aeolian sand ratio, W/B, and fine aggregate were negatively correlated.

5) A comparison with existing ML models in the literature demonstrates that the BO-NGBoost model predicts ASC compressive strength with superior accuracy.

To address a limitation of the study, further data collection is necessary to validate the performance of the proposed model. Future research should focus on developing new ML models that incorporate recent advancements in metaheuristic optimization techniques. Additionally, developing ML models to predict the compressive strength of other alternative cementitious materials is recommended.

References


[1]	Dong W , Shen X D , Xue H J , He J , Liu Y . Research on the freeze–thaw cyclic test and damage model of aeolian sand lightweight aggregate concrete. Construction & Building Materials, 2016, 123: 792–799

[2]	Guan M S , Wang G , Wang Y , Wei C Q , Lai Z C , Du H B , Liu Z Y . Bond behavior of square CFT using manufactured sand and recycled coarse aggregate. Construction & Building Materials, 2021, 269: 121289

[3]	Li D J , Xu D Y , Wang Z Y , Ding X , Song A . Ecological compensation for desertification control: A review. Journal of Geographical Sciences, 2018, 28(3): 367–384

[4]	Qu C W , Qin Y J , Luo L , Zhang L L . Mechanical properties and acoustic emission analysis of desert sand concrete reinforced with steel fiber. Scientific Reports, 2022, 12(1): 20488

[5]	Yan W L , Wu G , Dong Z Q . Optimization of the mix proportion for desert sand concrete based on a statistical model. Construction & Building Materials, 2019, 226: 469–482

[6]	Luo X B , Xing G H , Qiao L , Miao P Y , Yu X G , Ma K Z . Multi-objective optimization of the mix proportion for dune sand concrete based on response surface methodology. Construction & Building Materials, 2023, 366: 129928

[7]	Xing G H , Luo X B , Miao P Y , Qiao L , Yu X G , Qin Y J . Proposed mix design method for dune sand concrete using close packing model and mortar film thickness theory. Journal of Materials in Civil Engineering, 2023, 35(11): 04023395

[8]	Zhang H M , Zheng S H , Jing P Y , Yuan C , Li Y G . Influence of pore structure characteristics on the strength of aeolian sand concrete. Građevinar, 2022, 76(1): 35–45

[9]	Hou L N , Wen B J , Huang W , Zhang X , Zhang X Y . Mechanical properties and microstructure of polypropylene-glass-fiber-reinforced desert sand concrete. Polymers, 2023, 15(24): 4675

[10]	Zhou M H , Dong W . The relationship between pore structure and strength of aeolian sand concrete under low temperature. Journal of Building Engineering, 2023, 80: 108067

[11]	Yang S H , Zhang L , Xu Z F . Effect of high temperature on residual splitting strength of desert sand concrete. Structural Concrete, 2023, 24(3): 3208–3219

[12]	Shen Y J , Peng C , Hao J S , Bai Z P , Li Y G , Yang B H . High temperature resistance of desert sand concrete: Strength change and intrinsic mechanism. Construction & Building Materials, 2022, 327: 126948

[13]	Liu Y J , Yang W W , Chen X L , Liu H F , Yan N N . Effect of desert sand on the mechanical properties of desert sand concrete (DSC) after elevated temperature. Advances in Civil Engineering, 2021, 2021(1): 3617552

[14]	Li Y G , Zhang H M , Liu G X , Hu D W , Ma X R . Multi-scale study on mechanical property and strength prediction of aeolian sand concrete. Construction & Building Materials, 2020, 247: 118538

[15]	Li Y G , Zhang H M , Liu X Y , Liu G X , Hu D W , Meng X Z . Time-varying compressive strength model of aeolian sand concrete considering the harmful pore ratio variation and heterogeneous nucleation effect. Advances in Civil Engineering, 2019, 2019(1): 5485630

[16]	Li G F , Gao B , Zhu C , Hu H , Fang H Q . Study on the deterioration characteristics of aeolian sand concrete under the coupling effect of multiple factors in harsh environments. Plos One, 2023, 18(11): e0289847

[17]	Xue H J , Shen X D , Liu Q , Wang R Y , Liu Z . Analysis of the damage to the aeolian sand concrete surfaces caused by wind-sand erosion. Journal of Advanced Concrete Technology, 2017, 15(12): 724–737

[18]	Bai J W , Zhao Y R , Shi J N , He X Y . Damage degradation model of aeolian sand concrete under freeze–thaw cycles based on macro-microscopic perspective. Construction & Building Materials, 2022, 327: 126885

[19]	Dong W , Sun A Q , Wang X S . NMR-based analysis of fractal characteristics of the pore structure of fully aeolian sand concrete under carbonation-dry-wet cycles. Materials Today Communications, 2024, 39: 108815

[20]	Dong W , Sun A Q , Zhou M H . Microstructure and chloride transport of aeolian sand concrete under long-term natural immersion. Science and Engineering of Composite Materials, 2024, 31(1): 20220242

[21]	Dong W , Wang J F . Deterioration law and life prediction of aeolian sand concrete under sulfate freeze–thaw cycles. Construction & Building Materials, 2024, 411: 134593

[22]	Dong W , Zhou M H . A study of the damage mechanism and microstructures of aeolian sand concrete specimens undergoing salt-freezing effects. Journal of the Minerals Metals & Materials Society, 2023, 75(12): 5290–5299

[23]	Zou Y X , Shen X D , Zuo X B , Xue H J , Li G F . Experimental study on microstructure evolution of aeolian sand concrete under the coupling freeze–thaw cycles and carbonation. European Journal of Environmental and Civil Engineering, 2022, 26(4): 1267–1282

[24]	Li Y G , Zhang H M , Chen S J , Wang H R , Liu G X . Multi-scale study on the durability degradation mechanism of aeolian sand concrete under freeze–thaw conditions. Construction & Building Materials, 2022, 340: 127433

[25]	Li Z Q , Zhai D S , Li J . Seismic behavior of the dune sand concrete beam-column joints under cyclic loading. Structures, 2022, 40: 1014–1024

[26]	Ren Q X , Zhou K , Hou C , Tao Z , Han L H . Dune sand concrete-filled steel tubular (CFST) stub columns under axial compression: Experiments. Thin-walled Structures, 2018, 124: 291–302

[27]	Wang W H , Han L H , Li W , Jia Y H . Behavior of concrete-filled steel tubular stub columns and beams using dune sand as part of fine aggregate. Construction & Building Materials, 2014, 51: 352–363

[28]	Wang Y H , Chu Q , Han Q , Zhang Z P , Ma X Y . Experimental study on the seismic damage behavior of aeolian sand concrete columns. Journal of Asian Architecture and Building Engineering, 2020, 19(2): 138–150

[29]	Hosseinzadeh M , Dehestani M , Hosseinzadeh A . Prediction of mechanical properties of recycled aggregate fly ash concrete employing machine learning algorithms. Journal of Building Engineering, 2023, 76: 107006

[30]	Han F L , Lv Y , Liu Y , Zhang X F , Yu W B , Cheng C S , Yang W . Exploring interpretable ensemble learning to predict mechanical strength and thermal conductivity of aerogel incorporated concrete. Construction & Building Materials, 2023, 392: 131781

[31]	Luo X , Li Y , Lin H , Li H W , Shen J L , Pan B , Bi W L , Zhang W S . Research on predicting compressive strength of magnesium silicate hydrate cement based on machine learning. Construction & Building Materials, 2023, 406: 133412

[32]	Luo X , Li Y , Wang Q A , Mu J L , Liu Y Z . Machine learning based modeling for predicting the compressive strength of solid waste material-incorporated magnesium phosphate cement. Journal of Cleaner Production, 2024, 442: 141172

[33]	Golafshani E M , Behnood A , Arashpour M . Predicting the compressive strength of normal and high-performance concretes using ANN and ANFIS hybridized with grey wolf optimizer. Construction & Building Materials, 2020, 232: 117266

[34]	Nguyen K T , Nguyen Q D , Le T A , Shin J , Lee K . Analyzing the compressive strength of green fly ash based geopolymer concrete using experiment and machine learning approaches. Construction & Building Materials, 2020, 247: 118581

[35]	Gomaa E , Han T H , ElGawady M , Huang J , Kumar A . Machine learning to predict properties of fresh and hardened alkali-activated concrete. Cement and Concrete Composites, 2021, 115: 103863

[36]	Sun Y , Lee H S . An interpretable probabilistic machine learning model for forecasting compressive strength of oil palm shell-based lightweight aggregate concrete containing fly ash or silica fume. Construction & Building Materials, 2024, 426: 136176

[37]	Jin K K , Li Y , Shen J L , Lin H , Fan M T , Shi J J . Investigation on compressive strength and splitting tensile strength of manufactured sand concrete: Machine learning prediction and experimental verification. Journal of Building Engineering, 2024, 97: 110852

[38]	Mai H V T , Nguyen M H , Trinh S H , Ly H B . Optimization of machine learning models for predicting the compressive strength of fiber-reinforced self-compacting concrete. Frontiers of Structural and Civil Engineering, 2023, 17(2): 284–305

[39]	Qiong T , Jha I , Bahrami A , Isleem H F , Kumar R , Samui P . Proposed numerical and machine learning models for fiber-reinforced polymer concrete-steel hollow and solid elliptical columns. Frontiers of Structural and Civil Engineering, 2024, 18(8): 1169–1194

[40]	Le Q H , Nguyen D H , Sang-To T , Khatir S , Le-Minh H , Gandomi A H , Cuong-Le T . Machine learning based models for predicting compressive strength of geopolymer concrete. Frontiers of Structural and Civil Engineering, 2024, 18(7): 1028–1049

[41]	Nguyen N H , Vo T P , Lee S , Asteris P G . Heuristic algorithm-based semi-empirical formulas for estimating the compressive strength of the normal and high performance concrete. Construction & Building Materials, 2021, 304: 124467

[42]	Ashrafian A , Panahi E , Salehi S , Karoglou M , Asteris P G . Mapping the strength of agro-ecological lightweight concrete containing oil palm by-product using artificial intelligence techniques. Structures, 2023, 48: 1209–1229

[43]	Alkayem N F , Shen L , Mayya A , Asteris P G , Fu R , Di Luzio G , Strauss A , Cao M . Prediction of concrete and FRC properties at high temperature using machine and deep learning: A review of recent advances and future perspectives. Journal of Building Engineering, 2024, 83: 108369

[44]	Dai B , Gu C S , Zhao E F , Qin X N . Statistical model optimized random forest regression model for concrete dam deformation monitoring. Structural Control and Health Monitoring, 2018, 25(6): e2170

[45]	Li Q F , Song Z M . Prediction of compressive strength of rice husk ash concrete based on stacking ensemble learning model. Journal of Cleaner Production, 2023, 382: 135279

[46]	Salami B A , Iqbal M , Abdulraheem A , Jalal F E , Alimi W , Jamal A , Tafsirojjaman T , Liu Y , Bardhan A . Estimating compressive strength of lightweight foamed concrete using neural, genetic and ensemble machine learning approaches. Cement and Concrete Composites, 2022, 133: 104721

[47]	Cakiroglu C , Shahjalal M , Islam K , Mahmood S M F , Billah A H M M , Nehdi M L . Explainable ensemble learning data-driven modeling of mechanical properties of fiber-reinforced rubberized recycled aggregate concrete. Journal of Building Engineering, 2023, 76: 107279

[48]	Asteris P G , Skentou A D , Bardhan A , Samui P , Pilakoutas K . Predicting concrete compressive strength using hybrid ensembling of surrogate machine learning models. Cement and Concrete Research, 2021, 145: 106449

[49]	Peng Y M , Unluer C . Modeling the mechanical properties of recycled aggregate concrete using hybrid machine learning algorithms. Resources, Conservation and Recycling, 2023, 190: 106812

[50]	Khan M I . Predicting properties of high performance concrete containing composite cementitious materials using artificial neural networks. Automation in Construction, 2012, 22: 516–524

[51]	Liu K H , Alam M S , Zhu J , Zheng J K , Chi L . Prediction of carbonation depth for recycled aggregate concrete using ANN hybridized with swarm intelligence algorithms. Construction & Building Materials, 2021, 301: 124382

[52]	Liu K H , Zou C Y , Zhang X C , Yan J C . Innovative prediction models for the frost durability of recycled aggregate concrete using soft computing methods. Journal of Building Engineering, 2021, 34: 101822

[53]	Liu K H , Dai Z H , Zhang R B , Zheng J K , Zhu J , Yang X C . Prediction of the sulfate resistance for recycled aggregate concrete based on ensemble learning algorithms. Construction & Building Materials, 2022, 317: 125917

[54]	Cavaleri L , Asteris P G , Psyllaki P P , Douvika M G , Skentou A D , Vaxevanidis N M . Prediction of surface treatment effects on the tribological performance of tool steels using artificial neural networks. Applied Sciences, 2019, 9(14): 2788

[55]	Ghanizadeh A R , Ghanizadeh A , Asteris P G , Fakharian P , Armaghani D J . Developing bearing capacity model for geogrid-reinforced stone columns improved soft clay utilizing MARS-EBS hybrid method. Transportation Geotechnics, 2023, 38: 100906

[56]	Sadegh Barkhordari M , Jahed Armaghani D , Asteris P G . Structural damage identification using ensemble deep convolutional neural network models. Computer Modeling in Engineering & Sciences, 2023, 134(2): 835–855

[57]	Cavaleri L , Chatzarakis G E , Trapani F D , Douvika M G , Roinos K , Vaxevanidis N M , Asteris P G . Modeling of surface roughness in electro-discharge machining using artificial neural networks. Advanced Materials Research, 2017, 6(2): 169–184

[58]	Duan T , Avati A , Ding D Y , Thai K K , Basu S , Ng A Y , Schuler A . NGBoost: Natural gradient boosting for probabilistic prediction. 2019, arXiv: 1910.03225

[59]	Snoek J , Larochelle H , Adams R P . Practical Bayesian optimization of machine learning algorithms. 2012, arXiv: 1206.2944

[60]	Seeger M . Gaussian processes for machine learning. International Journal of Neural Systems, 2004, 14(2): 69–106

[61]	Wu J , Chen X Y , Zhang H , Xiong L D , Lei H , Deng S H . Hyperparameter optimization for machine learning models based on Bayesian optimization. Journal of Electronic Science and Technology, 2019, 17: 26–40

[62]	Brochu E , Cora V M , De Freitas N . A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. 2010, arXiv: 1012.2599

[63]	Cao Y , Su F M , Antwi-Afari M F , Lei J , Wu X G , Liu Y . Enhancing mix proportion design of low carbon concrete for shield segment using a combination of Bayesian optimization-NGBoost and NSGA-III algorithm. Journal of Cleaner Production, 2024, 465: 142746

[64]	Wu X G , Feng Z B , Liu J , Chen H Y , Liu Y . Predicting existing tunnel deformation from adjacent foundation pit construction using hybrid machine learning. Automation in Construction, 2024, 165: 105516

[65]	Rehman F , Khokhar S A , Khushnood R A . ANN based predictive mimicker for mechanical and rheological properties of eco-friendly geopolymer concrete. Case Studies in Construction Materials, 2022, 17: e01536

[66]	Zhao W J , Feng S Y , Liu J X , Sun B C . An explainable intelligent algorithm for the multiple performance prediction of cement-based grouting materials. Construction & Building Materials, 2023, 366: 130146

[67]	Ji Y C , Wang D Y , Wang J . Study of recycled concrete properties and prediction using machine learning methods. Journal of Building Engineering, 2024, 94: 110067

RIGHTS & PERMISSIONS

Higher Education Press
