Interpretable machine learning framework for strength prediction of glass powder-enhanced concrete

Abdullah Faiz Al ASMARI

doi:10.1007/s11709-026-1272-1

ENG. Struct. Civ. Eng ›› DOI: 10.1007/s11709-026-1272-1

RESEARCH ARTICLE

Abstract

The escalating environmental impact of cement production has intensified the demand for sustainable alternatives in concrete formulations. This study investigates the predictive modeling of compressive strength (CS) in concrete incorporating glass powder (GP) as a partial cement substitute. A data set comprising 308 experimental samples with nine input variables, including binder composition, curing age, particle size, and chemical constituents, was analyzed using six machine learning (ML) models: decision tree (DT), random forest (RF), extra trees (ET), k-nearest neighbors (KNN), support vector regression (SVR), and extreme gradient boosting. Each model was evaluated on an 80% training and 20% testing split using the coefficient of determination (R²), root mean square error (RMSE), mean absolute error, mean absolute percentage error, variance accounted for, root mean square error to standard deviation ratio, and weighted mean absolute percentage error. Among all models, the DT model exhibited the highest predictive accuracy with R² = 0.97 (train) and 0.96 (test), RMSE = 3.80 MPa, and minimal residual spread, as confirmed by Shapley additive explanations (SHAP) analysis and residual error plots. RF and ET models also performed robustly, benefiting from ensemble learning. Conversely, SVR and KNN displayed higher error margins, reflecting limitations in modeling nonlinear relationships in heterogeneous concrete systems. While the models show strong promise, future research should explore larger, multi-regional data sets and hybrid descriptors to improve generalizability and capture complex chemophysical interactions.

Keywords

GP / ML / SHAP analysis / ensemble models / predictive modeling

1 Introduction

Concrete is one of the most widely utilized materials in the construction sector, valued for its substantial compressive strength (CS), durability, and the plentiful availability of its raw materials, such as cement, aggregates, and water [1–3]. However, the production of Portland cement, the primary binding agent in concrete, is notably energy-intensive and contributes significantly to global CO₂ emissions, accounting for approximately 5%–8% of anthropogenic CO₂ output [4–6]. In response to growing environmental concerns, researchers have been actively exploring the use of alternative materials to replace cement and natural aggregates, including supplementary cementitious materials (SCMs) such as fly ash, silica fume, rice husk ash, ground granulated blast-furnace slag (GGBS), and notably, waste glass powder (WGP) [7]. WGP, primarily composed of amorphous silica, exhibits pozzolanic properties that enable it to react with calcium hydroxide (Ca(OH)₂) produced during cement hydration, resulting in the formation of additional calcium silicate hydrate (C–S–H) gel, thereby enhancing the microstructure and mechanical performance of concrete [5,8]. Furthermore, the use of finely ground glass helps mitigate the alkali-silica reaction (ASR), which can otherwise compromise concrete durability [9–11]. Studies have demonstrated that incorporating WGP at optimal dosages not only enhances compressive and tensile strength but also reduces CO₂ emissions by up to 20% compared to traditional concrete [4,8,11]. Thus, the valorization of glass waste in the form of powder or recycled aggregate presents a promising strategy for sustainable and high-performance concrete production, aligning with the principles of a circular economy and carbon footprint reduction [12,13].

While the incorporation of WGP and recycled glass aggregates in concrete offers significant enhancements in both sustainability and mechanical strength, it also raises concerns regarding long-term durability, primarily due to ASR [14,15]. ASR is a chemical process wherein reactive silica from the glass interacts with alkali hydroxides in the cement paste, leading to the formation of an expansive gel. This gel can absorb moisture and expand, ultimately causing internal stresses and cracks within the concrete matrix [16,17]. The probability of ASR increases when the glass particles are relatively large (typically greater than 1 mm) or when soda-lime glass with high sodium oxide (Na₂O) content is used without treatment [18]. Research indicates that grinding the glass to a very fine size, typically less than 75 to 100 µm, significantly reduces ASR risks, primarily due to enhanced pozzolanic behavior and decreased reactive surface exposure [18,19]. For instance, Abdelli et al. [20] reported that incorporating finely crushed waste glass as a 25% replacement for fine aggregate (FA) resulted in an 11.56% increase in CS while mitigating ASR. Moreover, combining WGP with other SCMs such as fly ash or silica fume, or employing chemical additives like calcium nitrate [Ca(NO₃)₂], can inhibit ASR by blocking alkali diffusion and neutralizing reactive sites [3,21]. Nonetheless, ensuring optimal concrete performance necessitates precise control over mix design parameters, including glass dosage and particle size.

Concrete is a multiphase construction material consisting of cement, fine and coarse aggregates (CAs), water, and admixtures, each of which plays a distinct role in its mechanical performance [21–23]. Cement hydrates on reaction with water to form C–S–H, the primary binding phase, which contributes greatly to the gain in CS over time [24]. Strength properties such as CS are usually determined on molded cube or cylinder specimens cured for 7, 28, and 90 d, following standard testing procedures [25]. Although reliable, these procedures are labor-intensive, time-consuming, and costly, especially when many mixtures containing different additives must be tested [7,13,20]. The task becomes even more complicated when WGP is used as a partial cement replacement, because WGP is heterogeneous in both particle-size distribution and chemical characteristics and, as a pozzolan, can markedly alter the hydration process [5,8]. Studies have shown that, at suitable replacement levels, finely ground WGP can increase both early-age and long-term strength. Nevertheless, high replacement rates impair mechanical performance, which highlights the need to optimize dosages and validate them rigorously through experiments [26–28]. Moreover, WGP also affects other important properties, such as workability, initial and final setting times, and the microstructural development of the matrix, further complicating the accurate prediction of concrete strength [28].

The incorporation of artificial intelligence (AI) and machine learning (ML) into construction materials research has considerably advanced the assessment and improvement of concrete properties, especially CS [29,30]. Because these properties are strongly nonlinear and involve many latent relationships between variables, traditional regression models are often ineffective at predicting CS, particularly when materials such as WGP, fine recycled glass, or recycled concrete aggregate are involved. Consequently, various supervised ML techniques, including decision trees (DT), random forests (RF), support vector machines (SVM), and ensemble algorithms such as gradient boosting and adaptive boosting (AdaBoost), have demonstrated substantial predictive capabilities for concrete performance [31–33]. For instance, Yehia et al. [8] evaluated the efficacy of four ensemble models on glass-based concrete mixtures and reported that extreme gradient boosting regression (XGB) achieved the highest accuracy (coefficient of determination (R²) = 0.97, root mean square error (RMSE) = 2.67 MPa), effectively managing the complexity of inputs presented by glass-based mixtures.

The robustness of ensemble methods was further corroborated by Feng et al. [34], who applied AdaBoost to a data set comprising 1030 samples and achieved over 95% accuracy, illustrating the versatility of these models even with diverse mixture compositions. These findings underscore the growing importance of AI and ML in developing predictive models for sustainable concrete, where traditional methods often encounter limitations. Furthermore, model transparency has been enhanced by interpretative tools such as Shapley additive explanations (SHAP), which assess the impact of each feature, such as curing period, WGP dosage, or water-cement ratio (W/C), thereby facilitating informed mix design and performance optimization [35–37].

To gain a comprehensive understanding of the research landscape, a keyword co-occurrence network was constructed based on recent publications in the field of ML applications for predicting concrete CS. As illustrated in Fig. 1, the predominant clusters emphasize key terms such as “ML”, “CS”, “RF”, and “fly ash”, highlighting the increasing convergence of AI-driven prediction and sustainable construction materials [1–37].

2 Methods

2.1 Data sets collection

To establish a reliable and widely applicable framework for predicting the CS of concrete that incorporates glass powder (GP), it is crucial to meticulously select and assess the key input variables affecting the material’s behavior. This study involved assembling a curated experimental data set of 308 samples, derived from an extensive review of peer-reviewed journal articles [5,7,8,20,26,27]. The criteria for literature selection were rigorously defined: 1) utilization of GP sourced from post-consumer or industrial waste glass; 2) partial substitution of ordinary Portland cement with GP by mass; 3) clearly documented chemical composition or oxide constituents of GP; 4) availability of detailed mix design parameters such as W/C, cement content, fine and coarse aggregate contents, and curing conditions; 5) CS values reported at specified curing ages.

The data set encompassed a broad range of mixture design variables and material characteristics, including nine oxide-based variables intended to represent the fundamental hydration mechanisms and pozzolanic reactivity of GP-blended concrete. A comprehensive statistical analysis was conducted to evaluate the data set’s distribution and variability, employing indicators like mean, minimum, maximum, standard deviation, variance, skewness, and kurtosis [29,30,34]. Table 1 summarizes the results, and the data distributions and correlations are shown in Figs. 2(a)–2(j). This analysis provided valuable insights into parameter behavior, aiding in the selection of suitable prediction models and enhancing the reliability of the CS estimation framework.

2.2 Correlation modeling

The Pearson correlation matrix depicted in Fig. 3 offers a thorough statistical analysis of the relationships between key input parameters and the CS of concrete incorporating ground GP. Among the variables studied, the W/C ratio shows the strongest negative correlation with CS (r = −0.68), indicating that an increase in water content relative to cement compromises matrix density, increases porosity, and reduces strength. Similarly, the silica content (SiO₂) in GP exhibits a significant inverse correlation with CS (r = −0.59), likely due to its predominantly glassy nature, which may be less reactive and hinder early-age strength development through reduced pozzolanic contribution. Another critical factor is the particle size of the GP, which shows a moderate negative correlation with CS (r = −0.47). This suggests that finer GP contributes more effectively to the pozzolanic reaction and microstructural densification, whereas coarser particles function more as inert fillers. Na₂O, a chemical constituent influencing alkalinity, also shows a negative relationship with CS (r = −0.40), possibly due to its role in elevating pore solution pH and enhancing alkali-silica reactivity, which may disrupt the formation of C–S–H gels. Interestingly, cement content, which is typically expected to enhance strength, shows a weak inverse correlation (r = −0.36).

This can be explained by interaction effects within the data set, where higher cement dosages might have been combined with larger GP sizes or higher W/C ratios, thus offsetting cement’s strength-contributing potential. In contrast, curing duration (d) is the only parameter demonstrating a meaningful positive influence on CS (r = +0.31), underscoring the critical importance of extended curing in facilitating both hydration and the pozzolanic reaction of GP, which contributes to continued strength gain over time. Other variables such as coarse aggregate (CA), calcium oxide (CaO), and FA presented minimal correlation values (ranging from −0.23 to −0.06), indicating their relatively lesser role in directly governing CS in GP-modified mixes. Overall, the analysis underscores that strength development in GP-based concretes is primarily governed by binder-phase chemistry, particle fineness, and curing regime, rather than by aggregate proportions. These insights are essential not only for optimizing material selection and mix design but also for guiding the selection of relevant input variables in predictive modeling frameworks aimed at forecasting concrete performance [31,38].
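
The Pearson coefficient r quoted above can be computed directly from its definition. The following is a minimal sketch; the function name `pearson_r` and the sample values are hypothetical and are not taken from the study's data set.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical values for illustration only (not from the 308-sample data set).
wc_ratio = [0.35, 0.40, 0.45, 0.50, 0.55]
strength_mpa = [62.0, 55.0, 51.0, 43.0, 38.0]

r = pearson_r(wc_ratio, strength_mpa)
print(round(r, 3))  # negative: strength falls as W/C rises in this toy sample
```

A value near −1 or +1 indicates a strong linear association, while a value near 0 indicates little linear association. Nonlinear or interaction effects, such as those suggested for cement content, are not captured by r.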

2.3 Machine learning models

This study employed six ML techniques, namely k-nearest neighbors regression (KNN), XGB, extra-trees regression (ET), DT, RF, and support vector regression (SVR), to predict the CS of concrete using nine input variables sourced from various studies. The primary modeling tool utilized was the Python-based Scikit-Learn library (version 3.12.4) [1,31,39]. As illustrated in Fig. 4, the data processing involved partitioning the data set into 80% for training and 20% for testing. Hyperparameters were optimized using a grid search method, and the models’ reliability was further evaluated through 5-fold cross-validation.
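
The workflow described above (80/20 split, grid search, 5-fold cross-validation) might look roughly like the following in scikit-learn. This is a sketch under stated assumptions: the data are synthetic stand-ins with the reported shape (308 samples, nine inputs), and the parameter grid, random seeds, and scoring choice are illustrative rather than the study's actual settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): 308 samples, nine input variables.
rng = np.random.default_rng(0)
X = rng.uniform(size=(308, 9))
y = 40 + 30 * X[:, 0] - 25 * X[:, 1] + rng.normal(scale=2.0, size=308)

# 80% training, 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Illustrative grid (assumption); each combination is scored by 5-fold CV.
param_grid = {"max_depth": [3, 5, 8], "min_samples_leaf": [1, 3, 5]}
search = GridSearchCV(
    DecisionTreeRegressor(random_state=42),
    param_grid,
    cv=5,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

print(search.best_params_)
print(-search.score(X_test, y_test))  # test RMSE of the refitted best model
```

The test split is held out of the grid search entirely, so the final score reflects data the tuned model has not seen.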

2.3.1 K-nearest neighbors

KNN is a non-parametric, instance-based learning algorithm that can be applied to both classification and regression problems. In the context of regression, KNN operates on the premise that samples with similar features are spatially close within the feature space. Given a training data set, the algorithm identifies the k nearest neighbors to a new instance by computing the distance between the instance and each training point using a predefined metric, typically Euclidean or Manhattan distance [40]. The predicted output is then obtained by averaging the target values of these k nearest points. Mathematically, the estimated output $\hat{y}$ for a query instance is expressed as Eq. (1).

$$\hat{y} = \frac{1}{k} \sum_{j=1}^{k} y_j, \tag{1}$$

where y_j represents the actual target value of the jth neighbor. To enhance prediction robustness, the model utilizes several key hyperparameters, where k is the number of nearest neighbors considered, influencing the trade-off between bias and variance; and p stands for the power parameter of the Minkowski distance function used for calculating distance (e.g., p = 2 for Euclidean distance, p = 1 for Manhattan distance). The underlying mechanism of KNN regression is illustrated schematically in Fig. 4(a).
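
Eq. (1) with a Minkowski distance of power p can be sketched in a few lines of NumPy. The function name `knn_predict` and the sample points are hypothetical; a production model would normally use a library implementation with scaled features.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3, p=2):
    """Average the targets of the k training points closest to x_query."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    diff = np.abs(X_train - np.asarray(x_query, dtype=float))
    # Minkowski distance: p = 2 is Euclidean, p = 1 is Manhattan.
    dist = (diff ** p).sum(axis=1) ** (1.0 / p)
    nearest = np.argsort(dist)[:k]
    return float(y_train[nearest].mean())

# Hypothetical one-feature example.
X_demo = [[0.0], [1.0], [2.0], [10.0]]
y_demo = [10.0, 20.0, 30.0, 100.0]
pred = knn_predict(X_demo, y_demo, [1.2], k=3)
print(pred)  # mean of the targets at x = 1.0, 2.0 and 0.0
```

A small k follows local structure closely (low bias, high variance), while a large k smooths the prediction toward the global mean.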

2.3.2 Extreme gradient boosting regression

XGB is a robust and scalable ensemble learning technique that employs an additive tree-based structure to minimize a regularized objective function, as shown in Fig. 4(b). It iteratively improves prediction performance by combining multiple weak learners, each aimed at correcting the residuals of its predecessors [41]. This approach is particularly suited to nonlinear and multivariate problems such as predicting the CS of concrete mixtures. The prediction is given by Eq. (2).

$$\hat{Y}_j = \sum_{m=1}^{M} h_m(X_j), \quad h_m \in H, \tag{2}$$

where $\hat{Y}_j$ is the predicted CS for the jth sample, $M$ is the total number of boosting rounds (trees), $h_m$ is the mth regression tree, $X_j$ is the input feature vector for the jth sample, and $H$ is the functional space of regression trees.

The regularized objective function is expressed as shown in Eq. (3).

$$O(\Psi) = \sum_{j=1}^{N} \ell(Y_j, \hat{Y}_j) + \sum_{m=1}^{M} \Omega(h_m), \tag{3}$$

where $\ell(Y_j, \hat{Y}_j)$ is a differentiable convex loss function (e.g., squared loss), $\Omega(h_m) = \alpha T_m + \frac{\beta}{2} \lVert w_m \rVert^2$ is the regularization term, $T_m$ is the number of leaf nodes in the mth tree, $w_m$ is the vector of leaf weights, and $\alpha$ and $\beta$ are regularization hyperparameters.
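
To make Eq. (3) concrete, the sketch below evaluates the objective for a given set of predictions and trees, using squared loss. It is only a numerical illustration of the formula, not the XGBoost training procedure; the function names and all numbers are hypothetical, and each tree is represented simply by its list of leaf weights.

```python
import numpy as np

def tree_penalty(leaf_weights, alpha, beta):
    """Omega(h_m) = alpha * T_m + (beta / 2) * ||w_m||^2 for one tree."""
    w = np.asarray(leaf_weights, dtype=float)
    return alpha * w.size + 0.5 * beta * float(np.sum(w ** 2))

def regularized_objective(y_true, y_pred, trees, alpha, beta):
    """Squared-loss data term plus the complexity penalty of every tree."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    loss = float(np.sum((y_true - y_pred) ** 2))
    penalty = sum(tree_penalty(w, alpha, beta) for w in trees)
    return loss + penalty

# Hypothetical example: two samples, two trees with 2 and 3 leaves.
objective = regularized_objective(
    y_true=[30.0, 40.0],
    y_pred=[31.0, 38.0],
    trees=[[1.0, -1.0], [0.5, 0.5, 2.0]],
    alpha=1.0,
    beta=2.0,
)
print(objective)  # 5.0 (loss) + 4.0 + 7.5 (penalties) = 16.5
```

Larger α discourages trees with many leaves, and larger β shrinks leaf weights, which is how the penalty counteracts overfitting.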

2.3.3 Decision tree

DT is a type of supervised learning method that segments a data set into branches according to feature values to forecast continuous results. It uses a recursive partitioning strategy, dividing the data into smaller groups by choosing the best split points that reduce a specified loss function, usually mean squared error (MSE) [41]. The ultimate prediction is made at the leaf node, where the path of the input data ends. This approach is straightforward, handles nonlinear relationships well, and does not require feature scaling, making it ideal for predicting concrete properties from various input parameters as expressed in Eq. (4) and shown in Fig. 4(d).

$$D = \{(X_j, Y_j)\}_{j=1}^{N}, \tag{4}$$

where $X_j$ is the feature vector for the jth instance and $Y_j$ is the actual output (e.g., CS).
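
The recursive partitioning step can be illustrated for a single feature: every midpoint between consecutive sorted values is tried as a threshold, and the one that minimizes the summed squared error of the two child nodes is kept (equivalent to minimizing the weighted MSE). The function name `best_split` and the data are hypothetical, and a full tree would repeat this search over all features and recurse into each child.

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, sse) of the best binary split on one feature."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_threshold, best_sse = None, np.inf
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue  # no threshold can separate identical feature values
        left, right = y[:i], y[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_threshold, best_sse = (x[i - 1] + x[i]) / 2.0, sse
    return float(best_threshold), float(best_sse)

# Hypothetical curing ages (d) and strengths (MPa).
threshold, sse = best_split([3, 7, 28, 90], [20.0, 22.0, 40.0, 42.0])
print(threshold, sse)  # splits between 7 d and 28 d
```

Because the split depends only on the ordering of feature values, rescaling a feature does not change the resulting tree.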

2.3.4 Random forest

RF is an ensemble learning technique that enhances prediction accuracy by creating multiple independent DTs and combining their results. It incorporates randomness by bootstrapping the data set and choosing a random subset of features at each node split, which improves generalization and reduces the risk of overfitting, as shown in Fig. 4(d) [32]. This method is especially effective for managing high-dimensional, nonlinear data sets, such as those found in concrete mix design, where variable interactions and material property heterogeneity can be intricate. The training data set is expressed in Eq. (5).

$$T = \{(\chi_i, \psi_i)\}_{i=1}^{N}, \tag{5}$$

where $\chi_i \in \mathbb{R}^d$ is the input feature vector for the ith sample, $\psi_i \in \mathbb{R}$ is the actual target output (e.g., CS), $N$ is the number of data samples, and $d$ is the number of input features.
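
The two sources of randomness in RF, and the final averaging step, can be sketched as follows. The sample and feature counts match those reported for the data set, while the feature subset size of 3, the seed, and the tree outputs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 308, 9  # sizes reported for the data set

# Bootstrap: draw n_samples row indices with replacement for one tree.
boot_idx = rng.integers(0, n_samples, size=n_samples)

# Rows never drawn are "out-of-bag" for this tree.
oob_idx = np.setdiff1d(np.arange(n_samples), boot_idx)

# Random feature subset considered at a single node split
# (a subset size of 3 is an illustrative assumption).
feat_idx = rng.choice(n_features, size=3, replace=False)

# The forest prediction is the mean of the individual tree predictions.
tree_predictions = np.array([41.0, 44.5, 39.5])  # hypothetical outputs, MPa
forest_prediction = float(tree_predictions.mean())

print(len(oob_idx), sorted(feat_idx.tolist()), forest_prediction)
```

Because each tree sees a different resample and different candidate features, their errors are partly uncorrelated, and averaging reduces the variance of the ensemble.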

2.3.5 Extra-trees regression

ET, also known as Extremely Randomized Trees, is an ensemble learning method that constructs multiple unpruned regression trees and combines their outcomes to improve predictive accuracy and reduce overfitting. Unlike RF, which determines optimal splits based on impurity measures, ET introduces additional randomness by randomly selecting both the features and the split thresholds, thereby enhancing model diversity. Each tree in the ensemble is trained on the entire data set (without bootstrapping) [33]. This approach typically results in faster computation and greater variance reduction due to increased decorrelation among the trees, as shown in Fig. 4(d). The training data set is expressed in Eq. (6).

$$D = \{(z_i, \theta_i)\}_{i=1}^{N}, \tag{6}$$

where $z_i \in \mathbb{R}^d$ is the feature vector of the ith sample, $\theta_i \in \mathbb{R}$ is the target output (e.g., CS), $N$ is the number of samples, and $d$ is the number of features.
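
The randomized split selection that distinguishes ET from RF can be sketched as below: for each randomly chosen candidate feature, one threshold is drawn uniformly between that feature's minimum and maximum, and the best of these random candidates is kept. This is a simplified single-node illustration with hypothetical function names and synthetic data, not a full tree builder.

```python
import numpy as np

def sse_after_split(x, y, threshold):
    """Summed squared error of the two children produced by a threshold."""
    total = 0.0
    for part in (y[x <= threshold], y[x > threshold]):
        if part.size:
            total += float(((part - part.mean()) ** 2).sum())
    return total

def extra_trees_split(X, y, n_candidates, rng):
    """Pick the best among randomly drawn (feature, threshold) candidates."""
    features = rng.choice(X.shape[1], size=n_candidates, replace=False)
    best = None
    for f in features:
        lo, hi = X[:, f].min(), X[:, f].max()
        if lo == hi:
            continue  # constant feature cannot be split
        t = rng.uniform(lo, hi)  # random cut point, no exhaustive search
        score = sse_after_split(X[:, f], y, t)
        if best is None or score < best[2]:
            best = (int(f), float(t), score)
    return best

# Synthetic data (assumption) in which feature 2 drives the target.
rng = np.random.default_rng(1)
X_demo = rng.uniform(size=(50, 4))
y_demo = 10 * X_demo[:, 2] + rng.normal(scale=0.1, size=50)
feature, cut, score = extra_trees_split(X_demo, y_demo, n_candidates=3, rng=rng)
print(feature, cut, score)
```

Skipping the exhaustive threshold search is what makes each split cheaper than in RF, at the cost of individually weaker trees.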

2.3.6 Support vector regression

SVR is a supervised learning method derived from SVM, specifically designed to tackle regression problems. The main goal of SVR is to find a function that closely approximates the actual output values within a defined margin of tolerance (ε), while also maintaining the model’s simplicity. Unlike traditional least squares regression, SVR focuses on minimizing the generalization error using an ε-insensitive loss function, which ignores minor deviations and penalizes only those errors exceeding ε. SVR is particularly beneficial for high-dimensional, nonlinear data sets due to its use of kernel functions (e.g., radial basis function, polynomial), which transform inputs into higher-dimensional spaces where linear regression becomes more effective as shown in Eq. (7) and Fig. 4(e).

$$T = \{(u_i, \phi_i)\}_{i=1}^{N}, \tag{7}$$

where $u_i \in \mathbb{R}^d$ is the input vector of the ith instance, $\phi_i \in \mathbb{R}$ is the corresponding output (e.g., CS), $N$ is the number of training samples, and $d$ is the number of input features.
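
The ε-insensitive loss described above is zero inside the tolerance band and grows linearly outside it. A minimal sketch follows, with a hypothetical function name and illustrative values.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """max(0, |y - y_hat| - epsilon), evaluated element-wise."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - epsilon)

# Hypothetical strengths (MPa) with a tolerance of 1 MPa.
losses = epsilon_insensitive_loss([30.0, 40.0, 50.0], [30.5, 43.0, 44.0], epsilon=1.0)
print(losses)  # the 0.5 MPa error is ignored; 3 and 6 MPa errors cost 2 and 5
```

Only samples with nonzero loss, or lying exactly on the band, end up as support vectors, which is why ε also controls how sparse the fitted model is.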

2.4 Performance analysis

To comprehensively assess the predictive accuracy and generalization potential of the constructed regression models, seven widely recognized statistical metrics were employed, as corroborated by earlier research [25–28]. These metrics encompass the R², RMSE, mean absolute error (MAE), variance accounted for (VAF), mean absolute percentage error (MAPE), root mean square error to standard deviation ratio (RSR), and weighted mean absolute percentage error (WMAPE). Together, these indicators create a comprehensive evaluation framework, examining not only how closely predicted values align with observed data but also the model’s dispersion, bias, and stability. The R² metric quantifies the proportion of variance in the observed data that the model explains, with values nearing 1 signifying a better fit. Conversely, RMSE, MAE, MAPE, RSR, and WMAPE are error-based metrics where lower values denote superior performance. VAF serves as a complement to R² by offering an additional view on variance explanation. While reducing prediction errors is advantageous, excessively low training errors may signal overfitting; hence, achieving a balance between accuracy and generalizability is essential. The mathematical formulations of these evaluation metrics are systematically presented in Table 2.
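
The seven metrics can be computed as in the sketch below. The formulas are the commonly used definitions and are an assumption here, since Table 2 is not reproduced in this text; in particular, population (not sample) variance and standard deviation are used for VAF and RSR. The function name and sample values are hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Common definitions of the seven metrics (assumed, see lead-in)."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(y_pred, dtype=float)
    err = y - p
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return {
        "R2": float(1.0 - np.sum(err ** 2) / np.sum((y - y.mean()) ** 2)),
        "RMSE": rmse,
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(100.0 * np.mean(np.abs(err / y))),       # percent
        "VAF": float(100.0 * (1.0 - np.var(err) / np.var(y))),  # percent
        "RSR": float(rmse / np.std(y)),
        "WMAPE": float(100.0 * np.sum(np.abs(err)) / np.sum(np.abs(y))),
    }

# Hypothetical observed and predicted strengths (MPa).
metrics = regression_metrics([20.0, 30.0, 40.0, 50.0], [22.0, 29.0, 41.0, 48.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions RSR equals the square root of 1 − R², so the two metrics carry the same information on a given data set, whereas MAPE and WMAPE differ in how strongly they weight errors on low-strength samples.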

3 Results and discussion

3.1 Prediction performance

The comparative analysis of ML models yields substantial insights into their predictive efficacy for estimating the CS of concrete incorporating GP. As illustrated in Fig. 5 and detailed in Table 3, the performance of the six regression models varies considerably across both training and testing data sets. During the training phase, the tree-based models (DT, ET, RF, and XGB), along with KNN, exhibit a close clustering of predicted values around the ideal 45° reference line. In contrast, SVR shows relatively lower accuracy and increased deviation from the parity line in both training and testing sets. During testing, the superiority of tree-based models becomes more pronounced. DT, RF, ET, and XGB maintain a high density of data points within the ±20% error boundaries and exhibit reduced error metrics such as RMSE and MAE, indicating strong generalization to unseen data. Conversely, KNN and SVR display broader scatter and more frequent deviations beyond the error threshold, reflecting weaker extrapolation and robustness.

Among the ML models evaluated, the DT model exhibited the most balanced and consistent performance across both training and testing data sets. It achieved an R² of 0.97, RMSE of 3.80 MPa, MAE of 3.03 MPa, MAPE of 10.23%, VAF of 97.87%, RSR of 0.15, and WMAPE of 6.52% on the training data, demonstrating a strong capability to capture the underlying patterns between mixture constituents and CS, as shown in Fig. 5(a). When applied to the 20% unseen testing data set, it maintained comparable accuracy with R² = 0.96, RMSE = 3.73 MPa, MAE = 3.00 MPa, MAPE = 10.19%, VAF = 98.26%, RSR = 0.14, and WMAPE = 6.24%, confirming its excellent generalization capability. This consistency is largely attributed to DT’s inherent strength in modeling nonlinear relationships and threshold effects, which are prevalent in concrete mix behavior. Its hierarchical structure allows for clear, interpretable rules that effectively partition the feature space, even with limited data. The slight difference in RMSE between training and testing (only 0.07 MPa) reflects minimal overfitting. The scatter plots further support this, as most predicted values align closely with the 45° parity line and fall within the ±20% error margin. These results suggest that the DT model is well-suited for predicting CS in concrete incorporating GP, offering a favorable balance between accuracy, interpretability, and generalization.

The RF model also delivered strong predictive performance, establishing itself as one of the top-performing algorithms in this study, as illustrated in Fig. 5(b). On the training data set, RF achieved a high R² of 0.96, indicating excellent goodness-of-fit, along with an RMSE of 4.12 MPa, MAE of 3.29 MPa, MAPE of 10.87%, VAF of 96.71%, RSR of 0.18, and WMAPE of 7.12%. These values confirm that the model accurately captured the nonlinear dependencies between the input variables and CS. In the testing phase, RF maintained its reliability with R² = 0.94, RMSE = 4.29 MPa, MAE = 3.45 MPa, MAPE = 11.42%, VAF = 95.18%, RSR = 0.22, and WMAPE = 7.39%, reflecting strong generalization and robustness. This stability is attributed to the ensemble learning mechanism of RF, which aggregates predictions from multiple DTs to reduce variance and avoid overfitting.

Based on the performance metrics presented in Table 3 for the ET model, it demonstrates notable predictive strength and generalization capability across both training and testing data sets, as shown in Fig. 5(c). During the training phase, ET achieves a high level of accuracy with an R² of 0.92, indicating that 92% of the variability in CS is accounted for by the model. The values of RMSE = 6.74 MPa, MAE = 5.44 MPa, and MAPE = 18.96% further corroborate the model’s ability to align predicted values with experimental observations. The VAF of 93.46% supports this, while RSR = 0.26 and WMAPE = 12.27% indicate low residual variability and error dispersion in the model output. When applied to the testing data set, the ET model maintains a comparable level of accuracy, thereby demonstrating strong generalization. The model achieves R² = 0.91, closely mirroring its training performance, along with RMSE = 6.94 MPa, MAE = 5.58 MPa, and MAPE = 18.26%, suggesting that its predictive accuracy does not significantly decline on unseen data. Furthermore, the VAF of 94.70%, RSR = 0.27, and WMAPE = 11.61% on the testing set confirm the model’s consistency and robustness in predicting CS across diverse samples. Technically, the ET model employs ensemble learning with increased randomness, selecting cut points at random during tree splitting to mitigate overfitting and enhance variance reduction. This design enables it to capture complex nonlinearities within the data set more effectively than conventional single-tree models. The close alignment of its R² values and the small change in RMSE between training and testing (0.20 MPa) illustrate stable model behavior and limited overfitting. These results establish ET as a reliable and accurate regression approach for forecasting CS.

In the evaluation of the remaining algorithms, namely KNN, XGB, and SVR, distinct strengths and limitations were observed in modeling the CS of concrete incorporating GP. All three models demonstrated moderate predictive capabilities yet exhibited limitations in capturing the highly nonlinear and multi-dimensional relationships inherent in the data set, particularly when compared to DT, RF, and ET. The KNN model, characterized by its sensitivity to local data structures due to its instance-based learning nature, achieved an R² of 0.93 on the training data, as shown in Fig. 5(d). However, it experienced a generalization loss, with a MAPE of 25.66% and a WMAPE of 11.64% during testing. This significant increase in error, despite a testing R² of 0.91, suggests that while KNN is effective at interpolating within dense feature clusters, it lacks robustness when extrapolating to sparser or unrepresented input regions. Additionally, the RSR values of 0.27 (train) and 0.25 (test) indicate considerable residual spread, undermining its reliability across varied input scenarios.

The XGB model, despite being a gradient-boosted tree ensemble, exhibited weaker generalization than other tree-based regressors such as RF and DT, as shown in Fig. 5(e). With an R² of 0.86 and an RMSE of 7.09 MPa during training, the model adequately captured primary trends; however, its performance declined notably on the test set, with an R² of 0.84, an RMSE of 8.28 MPa, and a MAPE of 22.33%. This suggests a tendency to overfit, likely due to excessive boosting iterations or insufficient regularization. The high RSR value of 0.33 on the test data further substantiates the presence of variance inflation. Although XGB is structurally designed to handle nonlinearities effectively, its performance was constrained by overfitting, particularly in data domains with subtle material behavior transitions.

The SVR model, shown in Fig. 5(f), which relies on optimal hyperplane construction within transformed kernel spaces, produced consistent but less competitive results. With R² values of 0.91 (train) and 0.90 (test), and corresponding RMSE values of 6.89 and 6.82 MPa, SVR demonstrated relative stability across data sets [42,43].

However, the higher MAPE (19.05% train; 21.07% test) and WMAPE (12.47% train; 13.24% test) suggest suboptimal handling of nonlinear and heteroscedastic data characteristics. The elevated RSR values (0.26 train; 0.28 test) further indicate persistent residual spread, implying that SVR’s fitted function failed to fully capture the underlying complexities of the mix design parameters affecting CS. Collectively, while KNN, XGB, and SVR offer some utility in CS prediction, their performance metrics consistently lag behind those of the DT, RF, and ET models. KNN struggles with sparsity and boundary generalization, XGB shows signs of overfitting without proper regularization, and SVR faces difficulties in capturing threshold-driven material behavior. These limitations underscore the critical need for models that can both learn complex nonlinear interactions and generalize effectively, qualities more reliably observed in DT, RF, and ET within this study.

Among the six ML models assessed for predicting the CS of concrete with GP, compared in Fig. 5(g) for the training data sets and Fig. 5(h) for the testing data sets, the tree-based methods DT, RF, and ET exhibited the highest accuracy and generalization. DT achieved top performance with minimal overfitting, while RF and ET demonstrated robust and stable predictions due to their ensemble nature. In contrast, KNN, XGB, and SVR showed lower reliability, with KNN struggling on sparse data, XGB prone to overfitting, and SVR limited in capturing complex nonlinearities. Overall, tree-based models proved most effective for modeling the heterogeneous nature of cementitious systems.

3.2 Error analysis

The error residual distribution histogram as shown in Fig. 6(a) and 6(b) offers significant insights into the prediction accuracy and stability of the models. The DT model displays a pronounced, narrow peak cantered near zero with a relatively steep Gaussian shape, indicating minimal variance and a high concentration of residuals close to zero. This reflects an exceptional fitting capacity with minimal overfitting and underfitting, confirming its consistent accuracy across unseen data. The RF and ET models respectively also demonstrate compact distributions, although slightly broader than DT. This suggests strong predictive power, supported by their ensemble nature, which stabilizes predictions by averaging over many DT. The ET’s slightly better curve sharpness over RF may be attributed to its use of random split thresholds, enhancing variance reduction while reducing correlation among trees. In contrast, the KNN and SVM curves show wider and flatter shapes indicating greater variance and prediction errors. KNN, being a lazy learner, is highly sensitive to feature scaling and noise, often leading to inconsistent predictions in heterogeneous data sets like concrete mixtures. SVM, though theoretically robust and interpretable via kernel methods, shows moderate dispersion, suggesting challenges in modeling complex, nonlinear interactions without extensive hyperparameter tuning. The XGB model displays a somewhat leptokurtic distribution slightly less peaked than DT but better than KNN/SVM. Its boosting mechanism improves accuracy by correcting residuals sequentially, but the tails reveal a few high residuals possibly due to overfitting tendencies from its aggressive tree depth and learning rate unless well-regularized. RF, although robust, displays a slightly broader spread compared to DT and ET, likely due to less randomization in its tree construction process. XGB shows moderate dispersion with a larger, indicating a tendency to underpredict certain samples. 
KNN and SVR present the widest spread of all, indicating frequent large prediction errors and lower robustness. In GP-modified concrete systems, where chemical composition, particle size, and hydration kinetics play critical roles, the DT and ET models clearly outperform the others in residual consistency and accuracy. Their ability to learn thresholds and nonlinearities aligns with the behavior of cementitious reactions. RF maintains reliability but with higher variance, while KNN and SVR lack robustness due to poor handling of multivariate feature interactions. XGB, though powerful, shows residual instability due to potential overfitting of rare compositional profiles.
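
The qualitative comparison of histogram shapes (peak sharpness, spread, and bias) can also be expressed through a few summary statistics of the residuals. The following is a minimal sketch under stated assumptions: the function name `residual_summary` is hypothetical, residuals are taken as observed minus predicted (so a positive mean indicates underprediction), and all numerical values are illustrative rather than drawn from the study's data set.

```python
from statistics import fmean, pstdev

def residual_summary(y_true, y_pred):
    """Mean, population standard deviation, and excess kurtosis of residuals.

    Residual = observed - predicted, so a positive mean signals underprediction.
    Excess kurtosis > 0 indicates a sharper peak and heavier tails than a Gaussian.
    """
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean = fmean(residuals)
    std = pstdev(residuals)
    if std == 0:
        kurt = float("nan")  # undefined when all residuals are identical
    else:
        fourth_moment = fmean([(r - mean) ** 4 for r in residuals])
        kurt = fourth_moment / std ** 4 - 3.0
    return {"mean": mean, "std": std, "excess_kurtosis": kurt}

# Illustrative (made-up) strengths in MPa, not the study's data.
observed = [30.0, 40.0, 50.0, 60.0]
tight_model = [31.0, 39.0, 51.0, 59.0]   # small, balanced errors
biased_model = [25.0, 38.0, 44.0, 52.0]  # consistently predicts too low

print(residual_summary(observed, tight_model))
print(residual_summary(observed, biased_model))
```

A smaller standard deviation corresponds to the narrower histogram described for DT, while a clearly positive mean residual corresponds to the underprediction tendency noted for XGB.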

3.3 Shapley additive explanations

Figure 7 illustrates the SHAP summary plot derived from the DT model, depicting the influence and significance of each input variable on the predicted CS of concrete incorporating GP. Each dot on the plot represents the SHAP value of an individual data point for a specific feature. The x-axis denotes the SHAP value, reflecting the magnitude and direction of the feature’s impact on the output prediction, while the y-axis lists features in descending order of importance [35,36]. The color gradient from blue to red indicates the actual feature values, with blue representing low values and red representing high values. Cement content (kg/m³) emerges as the most influential feature, demonstrating a strong positive contribution to CS predictions. High cement values (shown in red) shift the SHAP value significantly to the right, confirming their enhancing role in mechanical performance through improved hydration kinetics and binder content. This aligns with the expected role of cement as the principal binding phase in concrete. Curing time (d) is the second most impactful variable, showing a clear positive correlation with predicted strength. SHAP values for long curing durations (red) are skewed to the right, demonstrating the beneficial effect of prolonged hydration in developing C–S–H gel, particularly in pozzolanic systems such as those containing GP. This is consistent with the known behavior of blended cements, which continue to gain strength over extended curing periods. SiO₂ (%) and W/C ratio also exhibit strong explanatory power. Notably, high SiO₂ content has a predominantly negative SHAP value (left-shifted in red), suggesting that while SiO₂-rich pozzolans may contribute to long-term pozzolanic reactions, excessive substitution or poor reactivity may hinder early strength development.
Similarly, the W/C ratio shows a typical inverse relationship: higher W/C values reduce strength due to increased porosity and a weaker paste microstructure, which is clearly reflected in the left-shifted SHAP values for high W/C ratios.
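
The descending order of features on the y-axis of a SHAP summary plot is commonly obtained by ranking features according to their mean absolute SHAP value across all samples. The sketch below illustrates that ranking step only. The function name `rank_features`, the feature labels, and the SHAP values are hypothetical and purely illustrative; they are not the values underlying Fig. 7.

```python
def rank_features(shap_matrix, feature_names):
    """Sort features by mean absolute SHAP value, largest first.

    shap_matrix: one row per sample, one column per feature.
    """
    n_samples = len(shap_matrix)
    scores = {}
    for j, name in enumerate(feature_names):
        scores[name] = sum(abs(row[j]) for row in shap_matrix) / n_samples
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up SHAP values (MPa) for three samples and four features.
feature_names = ["cement", "curing_time", "w_c_ratio", "na2o"]
shap_matrix = [
    [6.0, 3.0, -2.0, 0.1],
    [-5.0, 2.5, 1.5, -0.2],
    [7.0, -4.0, -2.5, 0.0],
]

for name, score in rank_features(shap_matrix, feature_names):
    print(f"{name}: {score:.2f}")
```

Because the ranking uses absolute values, a feature with large contributions in both directions ranks high even if its signed contributions average out near zero.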

CaO content, fly ash (FA, kg/m³), and CA (kg/m³) exhibit a moderate influence on the system. Their SHAP values are relatively concentrated around zero, indicating that while they contribute to reaction chemistry and packing density, their impact on CS is less significant within the examined design space. CaO, as a primary component of cement and FA, displays variable behavior, with both low and high values exerting minimal effects. The GP particle size (µm) and Na₂O (%) are ranked lower and contribute little to the prediction variance in this DT model. The minimal SHAP spread for these parameters suggests that their effects may be nonlinear and indirectly captured by more dominant variables, or that their influence is obscured by interactions not fully captured in a univariate SHAP analysis. This observation may also indicate the need for further refinement or for combined descriptors (e.g., GP reactivity index or surface area) to quantify their role more accurately. In conclusion, the SHAP analysis highlights that the CS of GP-concrete is primarily determined by binder content (cement), hydration age (curing), and mixture fluidity (W/C). The DT model effectively captures these primary relationships with high interpretability. Secondary factors such as reactive oxide content and aggregate packing show subdued yet consistent contributions. These findings not only validate the model’s predictions but also align with concrete materials science, reinforcing SHAP’s utility in transparent and mechanistically coherent modeling [37].
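
To make the attribution principle concrete, the sketch below computes exact Shapley values by enumerating all feature coalitions for a small threshold-style model. It is an illustration under stated assumptions and not the procedure used in this study: `toy_strength` is a hypothetical stand-in rather than the fitted DT, absent features are replaced by a single baseline mix (whereas SHAP implementations typically average over background data), and all numbers are made up.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to one baseline point.

    Features outside a coalition take their baseline value. The cost grows
    as 2^n, so this is practical only for a handful of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_strength(z):
    """Hypothetical threshold model (MPa); NOT the study's fitted DT."""
    cement, curing_days, wc = z
    strength = 20.0
    if cement > 350:
        strength += 15.0
    if curing_days >= 28:
        strength += 10.0
    if cement > 350 and curing_days >= 28:
        strength += 4.0  # interaction between cement content and curing
    if wc > 0.5:
        strength -= 8.0
    return strength

x = [400.0, 56.0, 0.55]         # mix being explained (made-up)
baseline = [300.0, 7.0, 0.40]   # reference mix (made-up)
phi = shapley_values(toy_strength, x, baseline)
print(dict(zip(["cement", "curing_days", "w_c"], phi)))
```

In this toy case the attributions are approximately 17, 12, and -8 MPa: the 4 MPa interaction term is split equally between cement and curing, and the three values sum to the difference between the prediction for `x` and the prediction for `baseline`. This illustrates how interaction effects are distributed across features in a per-feature SHAP summary.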

4 Conclusions

This study offers a comprehensive evaluation of six ML models (DT, RF, ET, XGB, SVR, and KNN) for predicting the compressive strength of concrete incorporating GP, based on 308 experimental samples. Among these models, DT exhibited superior performance, achieving R² values of 0.97 and 0.96, RMSE values of 3.80 and 3.73 MPa, and MAPE below 10.5% on the training and testing data sets, respectively. SHAP analysis identified cement content, curing duration, and W/C ratio as the most influential features affecting CS. The RF model closely followed DT, with R² = 0.96/0.94 and RMSE = 4.12/4.29 MPa (train/test), while ET also demonstrated competitive accuracy (R² = 0.92/0.91). Conversely, KNN and SVR underperformed, with lower R² values (< 0.89) and higher residual errors, indicating weaker generalization. The error distribution and box plot analyses corroborated these findings, with DT displaying the narrowest residual spread and the fewest outliers. This research contributes to sustainable concrete design by demonstrating the reliability and interpretability of tree-based models for predicting performance in GP-blended systems. However, limitations include the absence of microstructural descriptors (e.g., porosity, C–S–H development) and of field-scale variability. Future research should integrate multiscale data, hybrid ML-mechanistic models, and uncertainty quantification to enhance robustness and generalization across diverse concretes and real-world construction conditions.
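
For readers wishing to reproduce the headline metrics on their own data, the sketch below computes R², RMSE, and MAPE using their standard textbook definitions, which are assumed here to match those used in the study. The function name `regression_metrics` is hypothetical, and the strength values are illustrative only.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE (same units as y), and MAPE (percent)."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # MAPE is undefined if any observed value is zero.
    mape = 100.0 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return {"r2": r2, "rmse": rmse, "mape": mape}

# Made-up compressive strengths in MPa, not the study's data.
observed = [20.0, 40.0, 60.0, 80.0]
predicted = [22.0, 38.0, 63.0, 79.0]
print(regression_metrics(observed, predicted))
```

Computing the same metrics separately on the training and testing partitions, as reported above, gives a simple check on generalization: a large gap between the two sets of values would indicate overfitting.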

References


[1]	Faraj R H , Mohammed A A , Mohammed A , Omer K M , Ahmed H U . Systematic multiscale models to predict the compressive strength of self-compacting concretes modified with nanosilica at different curing ages. Engineering with Computers, 2022, 38(S3): 2365–2388

[2]	Jeyasehar A , C G , Saravanan M , Thirugnanasambandam S . Development of fly ash based geopolymer precast concrete elements. Asian Journal of Civil Engineering, 2013, 14(4): 605–615

[3]	Ashish D K , Verma S K . Cementing efficiency of flash and rotary-calcined metakaolin in concrete. Journal of Materials in Civil Engineering, 2019, 31(12): 04019307

[4]	Baek C , Park S H , Suzuki M , Lee S H . Life cycle carbon dioxide assessment tool for buildings in the schematic design phase. Energy and Buildings, 2013, 61: 275–287

[5]	Muhedin D A , Ibrahim R K . Effect of waste glass powder as partial replacement of cement & sand in concrete. Case Studies in Construction Materials, 2023, 19: e02512

[6]	Abdulkareem O A , Ramli M , Matthews J C . Production of geopolymer mortar system containing high calcium biomass wood ash as a partial substitution to fly ash: An early age evaluation. Composites Part B: Engineering, 2019, 174: 106941

[7]	Elmikass A G , Makhlouf M H , Mostafa T S , Hamdy G A . Experimental study of the effect of partial replacement of cement with glass powder on concrete properties. Key Engineering Materials, 2022, 921: 231–238

[8]	Yehia S A , Shahin R I , Fayed S . Compressive behavior of eco-friendly concrete containing glass waste and recycled concrete aggregate using experimental investigation and machine learning techniques. Construction and Building Materials, 2024, 436: 137002

[9]	Teng S , Lim T Y D , Divsholi B S . Durability and mechanical properties of high strength concrete incorporating ultra fine ground granulated blast-furnace slag. Construction and Building Materials, 2013, 40: 875–881

[10]	Aziz A , Mehboob S S , Tayyab A , Khan D , Hayyat K , Ali A , Latif Qureshi Q B I . Enhancing sustainability in self-compacting concrete by optimizing blended supplementary cementitious materials. Scientific Reports, 2024, 14(1): 12326

[11]	Ndahirwa D , Zmamou H , Lenormand H , Leblanc N . The role of supplementary cementitious materials in hydration, durability and shrinkage of cement-based materials, their environmental and economic benefits: A review. Cleaner Materials, 2022, 5: 100123

[12]	Olabimtan S B , Damdelen Ö . Effect of waste glass powder and recycled fine aggregate in sustainable concrete. Journal of Structural Engineering & Applied Mechanics, 2023, 6(4): 343–363

[13]	Moreira O , Camões A , Malheiro R L M C , Jesus C . High glass waste incorporation towards sustainable high-performance concrete. CivilEng, 2024, 5(1): 41–64

[14]	Wang T , Nicolas R S , Nguyen T N , Kashani A , Ngo T . Experimental and numerical study of long-term alkali-silica reaction (ASR) expansion in mortar with recycled glass. Cement and Concrete Composites, 2023, 139: 105043

[15]	Gholampour A , Memarzadeh A , Nematzadeh M , Kiamahalleh M V , Ngo T D . Concrete containing recycled concrete coarse aggregate and crushed glass sand: Mitigating the effect of alkali–silica reaction. Structural Concrete, 2024, 25(5): 3682–3702

[16]	Akhnoukh A K . Improving concrete infrastructure projects conditions by mitigating alkali–silica reactivity of fine aggregates. Construction Materials, 2023, 3(2): 233–243

[17]	Ezell N D B , Hayes N , Lenarduzzi R , Clayton D , Ma Z J , Pape S L , Pape Y L . Experimental collaboration for thick concrete structures with alkali–silica reaction. In: Proceedings of AIP Conference. Melville, NY: AIP Publishing LLC, 2018, 1949(1): 030001

[18]	Zheng K . Pozzolanic reaction of glass powder and its role in controlling alkali–silica reaction. Cement and Concrete Composites, 2016, 67: 30–38

[19]	Maraghechi H. Development and assessment of alkali activated recycled glass-based concretes for civil infrastructure. Dissertation for the Doctoral Degree. University Park, PA: Pennsylvania State University, 2014

[20]	Abdelli H E , Mokrani L , Kennouche S , de Aguiar J L B . Utilization of waste glass in the improvement of concrete performance: A mini review. Waste Management and Research, 2020, 38(11): 1204–1213

[21]	Mohamed M , Elgabbas F , Elnemr A , Moaty M . The influence of glass powder as a cement replacement material on ultra-high-performance concrete. Al-Azhar University Civil Engineering Research Magazine, 2021, 43(2): 315–323

[22]	Zhang Q , Liu B , Sun Z , Li Q , Wang S , Lu X , Liu J , Zhang S . Preparation and hydration process of copper slag-granulated blast furnace slag-cement composites. Construction and Building Materials, 2024, 421: 135717

[23]	Gill P , Jangra P , Ashish D K . Non-destructive prediction of strength of geopolymer concrete employing lightweight recycled aggregates and copper slag. Energy, Ecology and Environment, 2023, 8(6): 596–609

[24]	Woo S P , Choi Y C . Synthesis of calcium silicate hydrate nanoparticles and their effect on cement hydration and compressive strength. Construction and Building Materials, 2023, 407: 133559

[25]	Cubilla E F , Puga K L N N , Villar J , Sanchez J R . Validation of different curing methods and their impact on the compressive strength of concrete. In: Proceedings of 2024 9th International Engineering, Sciences and Technology Conference. Piscataway: IEEE, 2024: 468–474

[26]	Khan F A , Shahzada K , Ullah Q S , Fahim M , Khan S W , Badrashi Y I . Development of environment-friendly concrete through partial addition of waste glass powder (WGP) as cement replacement. Civil Engineering Journal, 2020, 6(12): 2332–2343

[27]	Lam W L , Cai Y , Sun K , Shen P , Poon C . Roles of ultra-fine waste glass powder in early hydration of portland cement: hydration kinetics, mechanical performance, and microstructure. Construction and Building Materials, 2024, 415: 135042

[28]	Shirzad W , Behsoodi M M , Tasal M Y . Utilization and effects of various particle sizes of WGP as partial replacement of cement in concrete. International Advanced Researches and Engineering Journal, 2023, 7(3): 191–199

[29]	Tak M S N , Feng Y , Mahgoub M . Advanced machine learning techniques for predicting concrete compressive strength. Infrastructures, 2025, 10(2): 26

[30]	Nadimalla A , Masjuki S A , Gubbi A , Khan A , Mokashi I . Machine learning models for predicting the compressive strength of concrete with shredded PET bottles and M-sand as fine aggregate. IIUM Engineering Journal, 2025, 26(1): 42–56

[31]	Wagh M , George S , Algburi S , Waghmare C , Gupta T , Yadav A , Mohammed S J , Majdi A . Prediction of ANN, MLR, and NLR models for compressive strength performance in fly ash based self compacting concrete. Asian Journal of Civil Engineering, 2025, 26: 3519–3532

[32]	Waghmare C , Pathan M G , Hussain S A , Gupta T , Nikhade A , Wagh M , Ansari K . Machine learning based prediction of compressive strength in roller compacted concrete: A comparative study with PDP analysis. Asian Journal of Civil Engineering, 2025, 26: 2241–2253

[33]	Wagh M , Waghmare C , Gudadhe A , Thakur N , Mohammed S J , Algburi S , Majdi H S , Ansari K . Predicting compressive strength of sustainable concrete using advanced AI models: DLNN, RF, and MARS. Asian Journal of Civil Engineering, 2025, 26: 1939–1954

[34]	Feng D C , Liu Z T , Wang X D , Chen Y , Chang J Q , Wei D F , Jiang Z M . Machine learning-based compressive strength prediction for concrete: An adaptive boosting approach. Construction and Building Materials, 2020, 230: 117000

[35]	Zeng X. Enhancing the interpretability of SHAP values using large language models. 2024, arXiv: 2409.00079

[36]	Luo Z , Li S . An interpretable prediction model for pavement performance prediction based on XGBoost and SHAP. In: Proceedings of Second International Conference on Electronic Information Engineering and Computer Communication. Bellingham, WA: SPIE, 2023, 12594: 187–194

[37]	Alomari Y , Andó M . SHAP-based insights for aerospace PHM: Temporal feature importance, dependencies, robustness, and interaction analysis. Results in Engineering, 2024, 21: 101834

[38]	Mohammed A , Rafiq S , Sihag P , Kurda R , Mahmood W . Soft computing techniques: Systematic multiscale models to predict the compressive strength of HVFA concrete based on mix proportions and curing times. Journal of Building Engineering, 2021, 33: 101851

[39]	Kaveh A. Applications of Artificial Neural Networks and Machine Learning in Civil Engineering. Switzerland: Springer, 2024

[40]	Acito F. Predictive Analytics with KNIME: Analytics for Citizen Data Scientists. Cham: Springer Nature Switzerland, 2023: 209–227

[41]	Gamil Y . Machine learning in concrete technology: A review of current researches, trends, and applications. Frontiers in Built Environment, 2023, 9: 1145591

[42]	Saleh A N , Attar A A , Algburi S , Ahmed O K . Comparative study of the effect of silica nanoparticles and polystyrene on the properties of concrete. Results in Materials, 2023, 18: 100405

[43]	Yang Z , Liu S , Yu L , Xu L . A comprehensive study on the hardening features and performance of self-compacting concrete with high-volume fly ash and slag. Materials, 2021, 14(15): 4286

RIGHTS & PERMISSIONS

Higher Education Press
