Advanced machine learning techniques for predicting compressive strength of ultra-high performance concrete

Arslan Qayyum KHAN; Syed Ghulam MUHAMMAD; Ali RAZA; Preeda CHAIMAHAWAN; Amorn PIMANMAS

doi:10.1007/s11709-025-1169-4

Front. Struct. Civ. Eng. ›› 2025, Vol. 19 ›› Issue (4) :503 -523. DOI: 10.1007/s11709-025-1169-4

RESEARCH ARTICLE

Abstract

This study presents a robust framework for predicting the compressive strength of ultra-high performance concrete (UHPC) using machine learning models, based on a comprehensive data set of 761 data points derived from various UHPC mix designs. Six models, including K-Nearest Neighbors (KNN), Gradient Boosting Regression (GBR), Random Forest Regression (RFR), Support Vector Regression (SVR), Stacking and eXtreme Gradient Boosting (XGBoost), were evaluated. Among them, XGBoost demonstrated the best prediction accuracy, achieving a coefficient of determination (R²) of 0.969 and a root mean square error (RMSE) of 4.626 MPa, outperforming the other models. The Stacking model also performed well with an R² of 0.960, though it slightly overestimated at higher compressive strength levels. SHapley Additive exPlanations (SHAP) analysis revealed that curing time, silica fume, and aggregate content were the most significant factors influencing compressive strength. Curing time emerged as the dominant factor, significantly surpassing other variables such as silica fume and aggregate content in its impact on compressive strength. This dominance is attributed to its critical role in hydration and compressive strength development, while silica fume and aggregates primarily contributed by enhancing matrix densification and structural integrity. SHAP feature dependency analysis further highlighted complex interactions, particularly between water content and superplasticizer dosage, affecting workability and compressive strength.

Keywords

ultra-high performance concrete / compressive strength / machine learning / SHAP / prediction

Cite this article

Arslan Qayyum KHAN, Syed Ghulam MUHAMMAD, Ali RAZA, Preeda CHAIMAHAWAN, Amorn PIMANMAS. Advanced machine learning techniques for predicting compressive strength of ultra-high performance concrete. Front. Struct. Civ. Eng., 2025, 19(4): 503-523 DOI:10.1007/s11709-025-1169-4

1 Introduction

In response to the increasing demand for sustainable and resilient infrastructure, ultra-high performance concrete (UHPC) has emerged as a cutting-edge material in civil engineering. With compressive strengths greater than 150 MPa, approximately three times higher than conventional concrete, UHPC has gained significant attention as a transformative material for critical infrastructure applications such as high-rise buildings, bridges, and dams [1,2]. Its high compressive strength, low permeability, and enhanced resistance to environmental factors make it an ideal choice for these demanding applications [3,4].

The components of UHPC are meticulously selected and proportioned to balance workability, compressive strength, durability, and toughness. Cementitious materials, such as cement, fly ash, slag, silica fume, and nano silica, interact with additives like superplasticizers and steel fibers to achieve these properties [5,6]. For example, the use of silica fume and nano silica effectively reduces porosity, but their success depends heavily on the proper dispersion facilitated by superplasticizers [5,7]. Moreover, the packing density achieved through fine aggregates, quartz powder, and limestone powder significantly influences the mechanical properties and durability of the material [8,9].

However, despite its advantages, accurately predicting the compressive strength of UHPC presents a significant challenge due to its complex and heterogeneous composition [10]. The material is composed of various constituents, including cement, fly ash, slag, silica fume, nano silica, limestone powder, aggregates, quartz powder, superplasticizer, and steel fibers [11,12]. These components interact in nonlinear and multifaceted ways, rendering traditional predictive methods, such as empirical formulas and regression models, insufficient for capturing UHPC’s true behavior [12]. The need for more sophisticated predictive models that account for the intricate relationships between these materials is evident.

Given the complexities inherent in UHPC’s composition, predicting its compressive strength has become a formidable task for researchers [13,14]. Traditional methods often fall short, relying on limited parameters and failing to capture the intricate relationships that influence compressive strength outcomes [15]. Furthermore, these methods do not adequately account for the variability introduced by different mixing techniques, curing conditions, and environmental factors. Consequently, there is a pressing need for advanced predictive frameworks that leverage supervised machine learning (ML) techniques to analyze large data sets and enhance prediction accuracy [16,17].

Supervised ML techniques have gained traction in civil engineering, especially for predictive tasks [18–20]. These models utilize historical data to predict key parameters, such as material properties, structural performance, and infrastructure conditions. Linear regression, one of the simplest supervised techniques, establishes a linear relationship between input and output variables and is often used to predict continuous values like the compressive strength of concrete [21]. More complex methods, such as multiple linear regression, capture interactions between numerous input factors and their effects on the output [22]. Nonlinear models, such as decision tree regression and its derivatives, Random Forest Regression (RFR) and Gradient Boosting Regression (GBR), excel in handling nonlinear relationships and offer insights into the relative importance of various input variables [23,24].

Artificial Neural Networks (ANNs), inspired by the human brain, are another powerful branch of supervised learning that can model highly nonlinear relationships [25]. ANNs have demonstrated impressive results in predicting civil engineering parameters, such as the compressive strength of concrete, load-bearing capacities of structures, and energy efficiency of buildings [26,27]. Similarly, Support Vector Regression (SVR), a method that fits a hyperplane to minimize error between input variables and outputs, has been applied in predicting construction material properties and infrastructure performance [28].

Ensemble methods, such as eXtreme Gradient Boosting (XGBoost) and Stacking, are also gaining prominence in civil engineering. XGBoost, a gradient-boosting algorithm, combines multiple decision trees to improve predictive performance, while Stacking involves training multiple base models and combining their outputs using a meta-model for greater prediction accuracy [29,30]. These advanced techniques can capture complex interactions in UHPC’s composition more effectively than traditional methods. The selection of an appropriate ML technique depends on various factors, including the complexity of the data set and the nature of the relationships between inputs and outputs. Civil engineers often employ ensemble methods or hybrid models to enhance prediction accuracy [31].

Despite the extensive application of ML models in predicting concrete compressive strength, limited research has focused on UHPC, a material with highly nonlinear and intricate behavior due to its diverse components and extreme properties. This study addresses this gap by evaluating the predictive performance of six advanced ML models, KNN, SVR, RFR, GBR, Stacking, and XGBoost, on a robust data set of 761 UHPC mix designs. Furthermore, the integration of SHapley Additive exPlanations (SHAP) analysis provides novel insights into the significance and interactions of input parameters, such as curing time, silica fume, and aggregates, offering a unique approach to optimizing UHPC formulations. By combining prediction accuracy with interpretability, this study contributes to advancing the application of ML in UHPC research and practice.

2 ML models and performance evaluation

2.1 ML models

The six ML models, KNN, SVR, RFR, GBR, Stacking, and XGBoost, employed in this study have been widely used in related predictive tasks. For example, RFR and XGBoost have demonstrated high accuracy in predicting concrete and structural properties due to their ability to handle nonlinear relationships and high-dimensional data [16,29]. Stacking improves predictive accuracy by combining multiple models with complementary strengths [32]. While KNN is simple and intuitive, its performance is limited on complex data sets, unlike ensemble methods, which excel in generalization. This study leverages these models to address the challenges of predicting UHPC compressive strength.

2.1.1 K-nearest neighbors

KNN is a simple yet effective technique for classification and regression tasks. It operates on the principle of proximity, identifying the nearest data points in the feature space. The KNN model is a popular choice for newcomers to machine learning, as it is easy to implement and understand. It is a versatile supervised learning algorithm with two main variants: KNN classification and KNN regression. In classification, the algorithm assigns a new data point to the class that is most common among its nearest neighbors in the training data set, based on similarity measures such as the Euclidean or Manhattan distance, the formulas of which are shown in Eqs. (1) and (2). KNN regression, on the other hand, predicts a continuous target variable by averaging the values of the k nearest neighbors. The algorithm can effectively handle nonlinear relationships in the data and is helpful in situations where the decision boundaries are not well-defined. After training on the sample data, the labels of new data points are predicted through classification. Researchers have proposed many classification methods over the past few decades, but the KNN model remains a first choice for classifying data sets [33,34].

(1)

$$\text{Euclidean distance function} = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2},$$

(2)

$$\text{Manhattan distance function} = \sum_{i=1}^{k} |x_i - y_i|.$$
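The two distance functions and the neighbor-averaging step of KNN regression can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the implementation used in the study: the function names and the toy feature vectors and strength values are hypothetical. In the distance formulas, k counts feature dimensions, whereas in `knn_regress` the argument `k` is the number of neighbors.

```python
import math

def euclidean(x, y):
    """Eq. (1): square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Eq. (2): sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def knn_regress(train_X, train_y, query, k=3, distance=euclidean):
    """Predict by averaging the targets of the k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    nearest = [target for _, target in ranked[:k]]
    return sum(nearest) / len(nearest)

# Hypothetical, already-scaled features and strengths (MPa); not study data.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [120.0, 130.0, 140.0, 200.0]
prediction = knn_regress(train_X, train_y, query=(0.2, 0.2), k=3)
```

Because the distant point (5.0, 5.0) is not among the three nearest neighbors, it does not affect the prediction. This is also why feature scaling matters for KNN: a feature with a large numeric range would dominate either distance.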

2.1.2 Support vector regression

SVR is an ML algorithm specifically designed for regression tasks, derived from the principles of support vector machines (SVM). Whereas SVM classification seeks the optimal decision boundary between classes, SVR fits a function that accurately predicts continuous target values. The main idea behind SVR is to find a function that deviates from the true target values by no more than a specified margin, known as the ε-insensitive zone, while keeping the function as flat as possible. The algorithm achieves this by identifying the “support vectors”, the data points that lie on or outside the boundary of the ε-insensitive zone. These support vectors are the most influential in shaping the fitted function, as they have a substantial impact on the model’s predictions. SVR is anchored on these support vectors and can use various kernel functions, including linear, polynomial, and radial basis functions, to handle both linear and nonlinear relationships within the data. The model’s performance is controlled by two main hyperparameters: the regularization parameter (C) and the ε value. The C parameter determines the trade-off between the model’s complexity and the amount of error it can tolerate, while the ε value sets the size of the ε-insensitive zone. It is crucial to tune these hyperparameters to attain the desired balance between model complexity and generalization performance. SVR can handle complex, nonlinear data sets and can be implemented easily using libraries such as Scikit-learn in Python [35,36].
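The role of ε can be made concrete with the standard ε-insensitive loss, which ignores errors inside the margin and penalizes only the excess beyond it. The sketch below is a plain-Python illustration under that assumption; the function name and the sample values are hypothetical and are not taken from the study.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon zone, linear in the excess outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]

# Hypothetical strengths (MPa): only the second prediction misses by more than epsilon.
losses = epsilon_insensitive_loss(
    y_true=[150.0, 160.0, 170.0],
    y_pred=[150.5, 163.0, 170.0],
    epsilon=1.0,
)
```

Widening ε leaves more points inside the zone with zero loss, which tends to give a flatter function with fewer support vectors, while C scales how strongly the remaining errors are penalized.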

2.1.3 Random forest regressor

RFR is a robust ensemble learning algorithm commonly used for prediction tasks, particularly when dealing with complex, nonlinear relationships in the data. Random forests can be applied to both classification and regression tasks. One of their key advantages is the ability to limit overfitting while providing fast results. The methodology involves creating multiple decision tree regressors, each trained on a random subset of the training data. This randomization reduces overfitting and introduces diversity among the individual tree models, enabling the algorithm to capture intricate patterns that a single decision tree might overlook. The final prediction is obtained by averaging the outputs of these individual tree models, which reduces the overall variance and enhances predictive performance. The number of trees to be grown, an essential parameter in RFR, influences the trade-off between the complexity and computational efficiency of the model. Additionally, the “max_features” parameter, which determines the number of features to be considered at each split, significantly impacts the algorithm’s performance by controlling the level of randomization and the ability to capture crucial variables [37].

2.1.4 Gradient boosting regression

GBR is a decision tree-based ensemble ML algorithm, introduced by Friedman in 1999 for both regression and classification tasks. The main idea behind this algorithm is to build an ensemble of weak learners, specifically decision trees, iteratively, where each iteration fits a new tree on a randomly chosen subset of the training data to improve the current model. A notable aspect of GBR is that the lower the fraction of training data used in each iteration, the faster the regression process, as the model only needs to fit a smaller subset of the data at each step. The GBR model requires tuning two key parameters: the number of trees to be grown in the ensemble (ntrees) and the shrinkage rate, which controls the contribution of each tree to the final ensemble. By carefully optimizing these parameters, practitioners can balance model complexity, training speed, and overall prediction accuracy for their specific problem and data set, making GBR a powerful and versatile tool for various regression and classification tasks [38].
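The interplay between the number of trees and the shrinkage rate can be illustrated with a deliberately small boosting loop. The sketch below assumes squared-error loss, a single input feature, and one-split trees (stumps), and it omits the random subsampling described above for brevity. All names and data values are hypothetical; this is not the configuration used in the study.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on a 1-D feature (squared error)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_trees=50, shrinkage=0.1):
    """Fit stumps to the current residuals, adding each one scaled by the shrinkage rate."""
    base = sum(ys) / len(ys)          # initial model: the mean target
    stumps = []
    preds = [base] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + shrinkage * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + shrinkage * sum(s(x) for s in stumps)

# Hypothetical curing times (days) and strengths (MPa), for illustration only.
days = [1, 3, 7, 14, 28, 56]
strength = [90.0, 115.0, 135.0, 150.0, 160.0, 165.0]
model = gradient_boost(days, strength, n_trees=200, shrinkage=0.1)
```

A smaller shrinkage rate makes each tree contribute less, so more trees are needed to reach the same training error; this is the trade-off that tuning the two parameters together is meant to balance.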

2.1.5 Stacking

The stacking algorithm is an ML method that combines the strengths of multiple models to enhance overall prediction accuracy. In this study, the stacking model incorporated predictions from RFR, GBR, and SVR as base models, with a meta-model (XGBoost) trained to optimally combine their outputs. This ensemble approach leverages the RFR’s capability to handle nonlinearity, GBR’s focus on minimizing residual errors, and SVR’s ability to model complex relationships, resulting in a robust and accurate predictive framework. The meta-model can leverage the strengths of the individual models while compensating for their weaknesses, one of the critical benefits of the stacking model. It can often outperform using a single model alone by allowing the meta-model to learn the optimal way to blend the predictions. Stacking is a powerful ensemble learning method widely used in ML to boost prediction accuracy across various problem domains [39].

2.1.6 eXtreme Gradient Boosting

XGBoost is a robust and efficient gradient-boosting method widely used for supervised learning tasks, including regression and classification. It has garnered attention in the data science community and has been used in various data science and ML competitions, often outperforming other methods. XGBoost builds models sequentially, where each new tree aims to correct the errors made by the previous trees. Its regularization techniques help prevent overfitting, making it suitable for handling complex data. The model offers a wide range of customizable parameters and built-in features that allow users to fine-tune it according to their needs. Due to its robustness, scalability, and superior performance, XGBoost has become a frequent component of winning solutions in data science competitions and is widely used in real-world applications [40].

2.2 Evaluation metrics

Evaluation metrics are numerical indicators that assess the performance of ML models. They offer a quantitative way to measure aspects such as accuracy and precision. Analyzing these metrics allows one to gain insights into the model’s strengths and weaknesses, facilitating comparisons between different models. These metrics are crucial for developing and enhancing ML systems.

2.2.1 Coefficient of determination

R² is an evaluation metric that measures how well a regression model’s predictions match the actual data. It shows what proportion of the variability in the dependent variable can be explained by the model’s independent variable(s). A value of R² = 1 means the model fits the data perfectly, while R² = 0 means the model does not explain any variability. Generally, higher R² values indicate a better fit between the model’s predictions and the actual data [41]. The calculation formula of R² is shown in Eq. (3).

(3)

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i' - y_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}.$$

2.2.2 Mean absolute error

MAE is a metric that measures the average magnitude of errors in a set of predictions without considering the direction of the errors. It calculates the average of the absolute differences between the predicted and actual values, giving equal weight to each error. The calculation formula of MAE is given in Eq. (4). Lower values of MAE indicate better model performance [42].

(4)

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i' - y_i|.$$

2.2.3 Mean squared error

MSE is a metric that calculates the average of the squared differences between predicted and actual values. It is more sensitive to outliers than MAE, as squaring the errors before averaging gives more weight to larger errors. Lower values indicate better model performance [43]. Equation (5) represents the calculation formula of MSE.

(5)

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i' - y_i)^2.$$

2.2.4 Root mean squared error

RMSE is the square root of the MSE, measuring the differences between predicted and observed values in the same units as the target variable. Equation (6) shows the calculation formula of RMSE. Lower values indicate better model performance [44].

(6)

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i' - y_i)^2}.$$

2.2.5 Mean squared logarithmic error

MSLE measures the squared difference between the logarithms of the actual and predicted values, so it reflects their ratio rather than their absolute difference and is less sensitive to large errors than MSE. It is helpful when the target variable has a wide range. Lower values indicate better model performance [45]. Equation (7) represents the calculation formula of MSLE.

(7)

$$\mathrm{MSLE} = \frac{1}{n} \sum_{i=1}^{n} \left( \log_e(1 + y_i) - \log_e(1 + y_i') \right)^2.$$

2.2.6 Median absolute error

MDAE is a metric that takes the median of all the absolute differences between the predicted and actual values. It is more robust to outliers than mean-based metrics and provides a central tendency measure of the errors. Lower values indicate better model performance and robustness to outliers [46]. The calculation formula of MDAE is presented in Eq. (8).

(8)

$$\mathrm{MDAE} = \operatorname{median}\left( |y_i - y_i'| \right),$$

where $n$ is the number of samples, and $y_i$, $y_i'$, and $\bar{y}$ represent the actual, predicted, and mean values, respectively.
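For readers who want to check the six metrics by hand, the following plain-Python sketch computes them for a small set of values. It is an illustration rather than the study's code: the function name and the sample values are hypothetical, and R² is computed with the conventional total sum of squares taken around the mean of the actual values.

```python
import math
import statistics

def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MSE, RMSE, MSLE, and MDAE for paired actual/predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)     # total sum of squares
    mse = ss_res / n
    msle = sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / n
    return {
        "R2": 1 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MSLE": msle,
        "MDAE": statistics.median([abs(e) for e in errors]),
    }

# Hypothetical compressive strengths in MPa (not values from the study).
y_true = [120.0, 140.0, 160.0, 180.0]
y_pred = [122.0, 138.0, 163.0, 179.0]
metrics = regression_metrics(y_true, y_pred)
```

For these values the absolute errors are 2, 2, 3, and 1 MPa, giving MAE = 2.0, MSE = 4.5, MDAE = 2.0, and R² = 1 − 18/2000 = 0.991. MAE, RMSE, and MDAE are in MPa, MSE is in MPa², and R² and MSLE are dimensionless.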

3 Methodology

3.1 Collection of data sets

The experimental data set used in this study comprised 761 data points [47–49], meticulously compiled from published research papers. After collection, the data underwent thorough preprocessing and prescreening. This prescreening minimized the variation in the experimental data that could arise from the physical and chemical characteristics of the UHPC basic materials. The mix proportions included in the data set are UHPC-based, with all samples exhibiting compressive strengths greater than 120 MPa. By focusing on this specific range of compressive strength, the researchers aimed to control the variability in the experimental data that the diverse properties of the UHPC raw materials could introduce. The data set contains 12 input parameters, as listed in Tab.1, to predict the compressive strength of the UHPC mixture. The complete data set is available as supplementary material online to ensure transparency and facilitate further exploration.

Fig.1 shows that developing predictive models in data science involves several key steps. First, the data set is selected based on its compressive strength greater than 120 MPa and then preprocessed, which includes normalizing the data using techniques like Min–Max scaling. Next, various models such as KNN, SVR, RFR, GBR, Stacking, and XGBoost are considered. Hyperparameter tuning is performed to optimize the performance of models. The model is then trained, validated, and tested on unseen data. Accuracy is checked using k-fold cross-validation. If the model achieves the desired level of accuracy, it is saved; otherwise, the parameters are adjusted, and the model is retrained. Finally, parametric studies are conducted to identify influential parameters, ensuring a thorough and systematic approach to building and validating predictive models.

3.2 Input parameters

UHPC is an advanced construction material characterized by its exceptional mechanical properties and durability. Its unique composition includes a variety of fine-tuned ingredients, each contributing to the overall performance of the material. The key components include cement, fly ash, slag, silica fume, nano silica, limestone powder, aggregates, quartz powder, water, superplasticizer, steel fiber, and curing time as mentioned in Tab.1. Understanding the role of each component and their interrelationships is essential in optimizing UHPC’s properties.

Cement serves as the primary binder in UHPC, providing the matrix within which other components interact. Ordinary Portland Cement (OPC) is commonly used, with its chemical composition dominated by calcium silicates (C₃S and C₂S), which react with water to form calcium silicate hydrate (C-S-H) gel, the substance primarily responsible for compressive strength [50]. The high fineness of cement in UHPC enhances the hydration process, leading to a denser microstructure.

Fly ash, a by-product of coal combustion, is frequently incorporated into UHPC to improve workability and reduce the heat of hydration. It mainly contains silica (SiO₂), alumina (Al₂O₃), and iron oxides (Fe₂O₃) [51]. The pozzolanic reaction between fly ash and calcium hydroxide results in additional C-S-H gel formation, enhancing the concrete’s durability and compressive strength [33]. Slag improves the long-term compressive strength and durability of UHPC by refining the pore structure and reducing permeability [52].

Silica fume is a by-product of silicon metal or ferrosilicon alloy production, consisting of ultrafine particles of amorphous silicon dioxide (SiO₂) [53]. Its small particle size and high surface area enhance the packing density of UHPC, filling the voids between cement grains. The pozzolanic activity of silica fume also contributes to the production of additional C-S-H gel, significantly increasing the compressive strength and reducing the permeability of the concrete. Nano silica, characterized by even finer particles than silica fume, further enhances the microstructure of UHPC. The use of nano silica can significantly improve the mechanical properties and durability of UHPC, particularly in terms of resistance to cracking and shrinkage [54].

Limestone powder is often used as a filler in UHPC, improving the packing density of the mix. Composed mainly of calcium carbonate (CaCO₃), it also participates in the hydration process [55]. The fine particles of limestone powder help to reduce the water demand and enhance the workability of the UHPC mix. Aggregates in UHPC are typically fine and carefully graded to optimize the packing density and minimize voids [56]. Quartz sand is a common choice due to its hardness and chemical inertness. The selection of aggregates impacts the overall strength, durability, and thermal properties of UHPC. Proper grading ensures a dense and homogeneous mix, which is crucial for achieving the high strength and durability characteristics of UHPC [57]. Quartz powder is used as a micro-filler in UHPC to improve the packing density and reduce the porosity of the mix. Composed primarily of SiO₂, quartz powder enhances the interfacial transition zone (ITZ) between the cement paste and aggregates, contributing to the overall strength and durability of the concrete. Its use helps to achieve the ultra-high compressive strengths typical of UHPC.

Water plays a critical role in the hydration of cement and the overall workability of UHPC. The water-to-cement (W/C) ratio in UHPC is typically very low, often below 0.2, to minimize porosity and maximize compressive strength [58]. The quality and purity of the water used are essential to avoid introducing impurities that could affect the hydration process and long-term durability of the concrete. Superplasticizers are high-range water reducers that are essential in UHPC to achieve the desired workability at very low W/C ratios [59]. They work by dispersing cement particles, reducing the viscosity of the mix, and allowing for the incorporation of other fine materials like silica fume and nano silica. The use of superplasticizers helps to achieve a homogeneous and workable mix without compromising strength.

Steel fibers are added to UHPC to enhance its tensile strength, ductility, and toughness. The inclusion of fibers helps to bridge cracks and prevent their propagation, leading to a material that is not only strong but also capable of withstanding significant deformation without failure [60,61]. The interaction between steel fibers and the concrete matrix is critical for improving the post-cracking behavior of UHPC.

Curing time is a crucial factor in the development of UHPC’s mechanical properties. The hydration process continues over time, leading to increased strength and durability. Proper curing conditions, such as temperature and humidity control, are essential to ensure that the hydration process proceeds optimally. In some cases, heat curing is used to accelerate strength development and achieve the ultra-high performance characteristics that define UHPC [62].

The series of histograms presented in Fig.2 describes the distribution of the input parameters, including curing time, used in this study. These histograms show how frequently specific quantities occur across the samples. Cement, aggregates, silica fume, and water show a relatively wide distribution, suggesting their variable use in different mixtures, with most of the samples having moderate to high amounts of these materials. Conversely, materials such as fly ash, slag, nano silica, and quartz powder show highly skewed distributions with most of the data concentrated at the lower end, indicating their limited or specialized use in UHPC mixes. The distributions of superplasticizer and steel fiber also peak at lower values, implying that they are used in controlled, smaller quantities. Curing time shows a similarly skewed pattern, with most of the samples cured for shorter durations. The overall distribution highlights the careful balance of ingredients in UHPC formulation, where certain materials are used sparingly to achieve the desired mechanical properties, while others, such as cement and aggregates, form the bulk of the mixture.

3.3 Data scaling

The input features in the experimental data differ considerably in scale, which can negatively affect the precision and performance of the ML models. To address this, a data scaling step was included in the study. Normalization is a process in ML that involves adjusting the values of numeric columns in a data set to ensure they are on a consistent scale [63]. The min–max scaling approach shifts and rescales the values of each attribute to a range of 0 to 1, as given in Eq. (9). The z-score technique instead standardizes each feature using its mean and standard deviation, so that all features contribute comparably to the performance of the model [64]; its formula is given in Eq. (10). This preprocessing step enhances the model’s prediction accuracy, robustness, and interpretability, because each feature influences the model’s outputs on a comparable footing and the results are not skewed by the varying magnitudes of the input features.

(9)

$$X_n = \frac{X - X_{\min}}{X_{\max} - X_{\min}},$$

where $X_n$ is the normalized value, and $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of a feature.

(10)

$$z = \frac{x - \mu}{\sigma},$$

where $x$, $\mu$, and $\sigma$ represent the original value, mean, and standard deviation of the data set, respectively, and $z$ is the standardized value (z-score).
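Both scaling rules are short enough to verify directly. The sketch below applies Eqs. (9) and (10) to a single feature column in plain Python. The function names and the cement contents are hypothetical, and the population standard deviation is an assumption, since the text does not say whether the population or sample form was used.

```python
import statistics

def min_max_scale(values):
    """Eq. (9): shift and rescale a feature to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Eq. (10): centre on the mean and divide by the (population) standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical cement contents in kg/m^3 (not values from the study).
cement = [600.0, 750.0, 900.0]
scaled = min_max_scale(cement)
standardized = z_score(cement)
```

Here `scaled` is `[0.0, 0.5, 1.0]`, while `standardized` has zero mean and unit standard deviation. To avoid information leaking from held-out data, the minimum, maximum, mean, and standard deviation are normally computed on the training portion only and then reused for validation and test samples.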

3.4 Hyperparameters

To optimize the performance of the ML models, a combination of grid search and cross-validation was employed for hyperparameter tuning. The specific hyperparameter values for each model are listed in Tab.2. For XGBoost, key parameters such as learning rate, maximum tree depth, and the number of estimators were fine-tuned to balance model complexity and prediction accuracy, ensuring minimal overfitting. Similarly, for RFR, the number of trees and maximum features were adjusted to capture the nonlinear relationships in the data set while maintaining computational efficiency. The SVR model’s kernel type, regularization parameter (C), and epsilon were selected to achieve a trade-off between model flexibility and generalization. For GBR, learning rate and tree depth were optimized to reduce residual errors in iterative training. Stacking utilized the optimal configurations of its base models, including SVR, RFR, and GBR, with XGBoost as the meta-model for robust predictions. The KNN model’s optimal number of neighbors was determined based on cross-validation to ensure accurate predictions while avoiding overfitting.

These tuned hyperparameters, as detailed in Tab.2, were selected to maximize prediction accuracy, generalization to unseen data, and interpretability of the models.

3.5 k-fold cross-validation

k-fold cross-validation is a method for assessing the performance and generalization of ML models. This approach splits the data set into k equally sized subsets, or “folds”. The models are trained on k−1 folds, with the remaining fold acting as the validation set. This procedure is repeated k times, so that each fold is used exactly once as the validation set while the others are used for training. Choosing the value of k involves balancing several factors. Common choices are 5 or 10, as they provide an acceptable balance between bias and variance. A larger k, such as 10, is preferred for smaller data sets to ensure each fold is representative, while a smaller k, such as 5, can be sufficient for larger data sets [65].
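The splitting step can be sketched as an index generator. The example below is a minimal plain-Python illustration with a hypothetical function name. It uses k = 5 purely as an example, because the value of k used in the study is not stated here, and it omits the shuffling that is usually applied before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k nearly equal, contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop

# 761 samples matches the size of the data set; k = 5 is an assumed example value.
folds = list(k_fold_indices(n_samples=761, k=5))
fold_sizes = [len(validation) for _, validation in folds]
```

Since 761 is not divisible by 5, the folds cannot be exactly equal: one fold holds 153 samples and the other four hold 152. Each sample appears in exactly one validation fold, and a model's cross-validated score is then the average of its k validation scores.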

4 Results and discussion

4.1 Comparison of ML models

Tab.3 presents the detailed performance metrics of six ML models in predicting the compressive strength of UHPC. The models were evaluated based on various statistical measures, including the R², MSE, RMSE, MAE, MSLE, and MDAE. Fig.3 and Fig.4 further illustrate these metrics, offering a visual comparison of model performance across training, validation, and testing phases.
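For readers who want to reproduce metrics of this kind, the sketch below computes all six from paired actual and predicted values. It assumes the standard definitions (MSLE as the mean squared difference of log(1 + value), MDAE as the median absolute error); the function name is hypothetical and the formulas are not copied from the paper.

```python
import math
from statistics import mean, median

def regression_metrics(actual, predicted):
    """Return R2, MSE, RMSE, MAE, MSLE and MDAE for two equal-length sequences."""
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = mean(e * e for e in errors)
    avg = mean(actual)
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    ss_tot = sum((a - avg) ** 2 for a in actual)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mean(abs(e) for e in errors),
        # log1p requires values greater than -1, which holds for strengths in MPa.
        "MSLE": mean((math.log1p(p) - math.log1p(a)) ** 2
                     for a, p in zip(actual, predicted)),
        "MDAE": median(abs(e) for e in errors),
    }

# Illustrative values in MPa, not taken from Tab.3.
metrics = regression_metrics([100.0, 120.0, 150.0, 180.0],
                             [102.0, 118.0, 153.0, 179.0])
```

Because RMSE and MAE are in the units of the target (MPa) while R² and MSLE are dimensionless, reporting them together gives both an absolute and a relative view of the error.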

Among the models, XGBoost emerged as the top performer, achieving the highest R² values across all data sets: 0.994 for training, 0.961 for validation, and 0.953 for testing. The average R² for XGBoost was 0.969, indicating that the model was able to explain 96.9% of the variance in the data set (Fig.3). The strong performance of XGBoost can be attributed to its ability to handle nonlinear relationships and complex interactions between input features, which are critical when modeling the behavior of UHPC with diverse material compositions and curing conditions. This capability suggests that XGBoost can be effectively generalized to other high-strength concrete prediction tasks, particularly where complex material interactions are involved. Furthermore, XGBoost’s regularization parameters help prevent overfitting, ensuring that the model generalizes well on unseen data. Its robustness to high-dimensional data also makes it suitable for UHPC, which involves multiple interacting variables such as cement content, fly ash, slag, and W/C ratio.

The Stacking model also performed well, with an average R² of 0.960. Stacking is an ensemble learning technique that combines the predictions of multiple base models, improving overall accuracy. In this case, the combination of models such as decision trees and SVR in the Stacking algorithm allowed it to capture the intricate relationships within the data set effectively. However, the slightly lower R² compared to XGBoost may indicate that Stacking had minor difficulties generalizing on the validation and testing data, despite performing well during training (R² of 0.991). The MSE and RMSE values for Stacking were slightly higher than XGBoost, suggesting that while Stacking is accurate, it may not be as efficient in minimizing prediction errors, particularly for new data.

KNN, with an average R² of 0.959, performed reasonably well but showed an increase in errors during the validation and testing phases. KNN operates by predicting the target variable based on the average of its nearest neighbors in the feature space. Although it captured general trends effectively during training, the KNN model exhibited a tendency to overestimate compressive strength in the validation and testing sets, especially at higher values. The higher MSE and RMSE values in the validation and testing sets indicate that KNN’s reliance on proximity in the feature space may have limited its ability to generalize for more complex or extreme cases, where the relationships between input variables and compressive strength are highly nonlinear.

GBR achieved an average R² of 0.958, which is slightly lower than KNN and Stacking. However, GBR performed exceptionally well during training (R² of 0.993), suggesting that it was highly accurate in fitting the training data. GBR uses an ensemble of weak learners, typically decision trees, which are sequentially trained to correct the errors of previous trees. Despite this, GBR showed a significant drop in performance during validation and testing, as reflected by the increase in MSE and RMSE. The larger error values indicate that the model may have been overfitting to the training data, failing to generalize effectively on unseen data. This issue is common in boosting methods when the learning rate or the number of trees is not carefully tuned, leading to a model that fits the training data too closely but struggles with new samples.
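The sequential error correction described above can be illustrated with a minimal gradient boosting sketch for squared error, in which each weak learner is a one-split tree (a stump) fitted to the current residuals. The data are synthetic, all function names are hypothetical, and the settings (50 stages, learning rate 0.1) are arbitrary rather than the tuned values of this study.

```python
def fit_stump(x, residuals):
    """Best single-split regressor on one feature: (threshold, left_mean, right_mean)."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def stump_predict(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean

def fit_boosting(x, y, n_estimators, learning_rate):
    """Return (initial prediction, stumps, training MSE after each stage)."""
    initial = sum(y) / len(y)
    current = [initial] * len(y)
    stumps, history = [], []
    for _ in range(n_estimators):
        # Each new stump is trained on what the ensemble still gets wrong.
        residuals = [yi - ci for yi, ci in zip(y, current)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        current = [ci + learning_rate * stump_predict(stump, xi)
                   for xi, ci in zip(x, current)]
        history.append(sum((yi - ci) ** 2 for yi, ci in zip(y, current)) / len(y))
    return initial, stumps, history

def predict_boosting(initial, stumps, learning_rate, xi):
    return initial + learning_rate * sum(stump_predict(s, xi) for s in stumps)

# Synthetic data (NOT the UHPC data set): a curve plus an alternating offset.
x = [float(i) for i in range(20)]
y = [0.05 * xi ** 2 + (1.0 if i % 2 else -1.0) for i, xi in enumerate(x)]
initial, stumps, history = fit_boosting(x, y, n_estimators=50, learning_rate=0.1)
```

In this sketch the training error never increases from one stage to the next, so it cannot reveal overfitting on its own. That is why the gap between training and validation error, rather than the training error alone, is the useful diagnostic when choosing the learning rate and the number of trees.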

RFR demonstrated moderate performance, with an average R² of 0.932. RFR builds an ensemble of decision trees, each trained on a random subset of the data. This randomness helps reduce overfitting and increases the model’s ability to generalize. However, RFR’s prediction accuracy dropped during validation and testing phases, as indicated by its relatively high MSE and RMSE values. The model tends to average predictions across multiple trees, which can result in smoother predictions but may fail to capture subtle variations in the data. Additionally, RFR may have struggled with the highly nonlinear relationships present in the UHPC data set, as decision trees are generally more effective at capturing simpler patterns.

Lastly, SVR had the lowest overall performance, with an average R² of 0.930. SVR uses a hyperplane to predict continuous values, but in this case, it struggled to capture the complex relationships between the input variables and the compressive strength of UHPC. The significant drop in R² during validation (0.937) and testing (0.893) phases, combined with high MSE and RMSE values, indicates that SVR overfitted to the training data and failed to generalize. This could be due to the choice of kernel or regularization parameters, which might not have been optimal for this data set. SVR’s poor performance highlights the limitations of using simpler regression models for highly nonlinear and complex materials like UHPC.

Overall, XGBoost and Stacking provided the most accurate predictions, demonstrating their ability to capture the nonlinear interactions within the UHPC mix design. These ensemble models outperformed simpler algorithms such as SVR and KNN, as well as traditional tree-based models like RFR and GBR, by leveraging advanced techniques such as boosting and stacking to improve generalization and minimize errors. The superiority of XGBoost in particular can be attributed to its robust handling of high-dimensional data and effective regularization, making it well-suited for the prediction of compressive strength in complex materials like UHPC.

4.2 Compressive strength prediction

The prediction of compressive strength in UHPC using different ML models is visualized in Fig.4 and Fig.5, showcasing how well each model predicts the actual compressive strength based on the input data. Compressive strength is a critical property of UHPC, and it is difficult to predict accurately because of the highly nonlinear relationships between the mixture’s ingredients, curing conditions, and mechanical performance. The models used in this study demonstrated varying degrees of success in predicting compressive strength.

XGBoost, as illustrated in Fig.4(f), consistently outperformed other models in both accuracy and generalization. The predicted compressive strength values closely follow the actual data across the training, validation, and testing data sets. This strong alignment is also evident in Fig.5(f), where the predicted values are tightly clustered around the diagonal trend line, indicating near-perfect predictions. The RMSE for XGBoost was the lowest among the models, with values of 1.738 MPa for training and 4.626 MPa for validation. This minimal deviation suggests that XGBoost efficiently captures the nonlinear relationships between the UHPC components (such as cement, fly ash, superplasticizer, and steel fibers) and compressive strength. The superior performance of XGBoost can be explained by its ability to build multiple decision trees sequentially, where each new tree corrects the errors of the previous one. This iterative learning process enables XGBoost to minimize errors and adjust to the nuances in the data set, such as the complex interactions between UHPC’s ingredients. Moreover, XGBoost’s regularization techniques (i.e., L1 and L2 regularization) prevent overfitting, allowing it to maintain high prediction accuracy even on unseen data. The ability of XGBoost to handle missing data and its built-in cross-validation further contribute to its robust performance.
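For reference, the L1 and L2 penalties mentioned above enter the XGBoost training objective approximately as written below. The notation follows the XGBoost documentation rather than this paper: ℓ is the per-sample loss (assumed here to be squared error), f_k is the k-th tree, T is its number of leaves, w is its vector of leaf weights, and γ, λ, and α are the complexity, L2, and L1 penalty coefficients.

```latex
\mathcal{L} = \sum_{i=1}^{n} \ell\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \lVert w \rVert_2^2 + \alpha \lVert w \rVert_1
```

Larger values of γ, λ, or α penalize trees with many leaves or large leaf weights, which is the mechanism by which the regularization limits overfitting.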

Stacking also demonstrated a high level of accuracy in predicting compressive strength, with predictions closely following the actual values, as seen in Fig.4(e). Similar to XGBoost, Stacking exhibited strong predictive capabilities, particularly in the middle range of compressive strength values (120–180 MPa). However, it tended to slightly overestimate compressive strength for values greater than 150 MPa. The stacking model achieved an average R² of 0.960 and an RMSE of 3.762 MPa (Tab.3), which are competitive with those of XGBoost. The overestimation observed in the higher compressive strength range might be due to the stacking model combining predictions from different base learners, which can sometimes result in slight prediction biases when the base models do not perfectly capture the underlying data patterns. The strong performance of Stacking is attributed to its ensemble learning technique, which combines multiple ML models (e.g., SVR, decision trees, and random forest) to produce more accurate and generalized predictions. The meta-model in Stacking learns how to optimally weigh the contributions from each base model, leveraging their individual strengths. This ability to combine different models allows Stacking to handle the diverse influences of UHPC components on compressive strength, such as the combined effects of silica fume and nano silica on matrix densification or the impact of curing time and steel fibers on the mechanical performance.
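The idea of a meta-model that learns how to weigh its base learners can be sketched as follows. To keep the example self-contained, two deliberately simple base learners (a least-squares line and a 3-nearest-neighbor average) stand in for the base models of this study, and a linear least-squares combiner stands in for the XGBoost meta-model; the data are synthetic and all names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares line on one feature; returns a predict(q) function."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return lambda q: intercept + slope * q

def fit_knn(x, y, n_neighbors=3):
    """Nearest-neighbor average on one feature; returns a predict(q) function."""
    def predict(q):
        nearest = sorted(range(len(x)), key=lambda i: abs(x[i] - q))[:n_neighbors]
        return sum(y[i] for i in nearest) / len(nearest)
    return predict

def out_of_fold(fit, x, y, n_folds=5):
    """Predict every sample with a model that never saw it during fitting."""
    oof = [0.0] * len(x)
    for fold in range(n_folds):
        train = [i for i in range(len(x)) if i % n_folds != fold]
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i in range(fold, len(x), n_folds):
            oof[i] = model(x[i])
    return oof

def meta_weights(p1, p2, y):
    """Least-squares weights for y ~ w1*p1 + w2*p2 via the 2x2 normal equations."""
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det

def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

# Synthetic one-feature data (NOT the UHPC data set).
x = [float(i) for i in range(30)]
y = [3.0 * xi + 0.1 * xi ** 2 + (2.0 if i % 3 == 0 else -1.0) for i, xi in enumerate(x)]

oof_line = out_of_fold(fit_line, x, y)
oof_knn = out_of_fold(fit_knn, x, y)
w_line, w_knn = meta_weights(oof_line, oof_knn, y)
stacked = [w_line * a + w_knn * b for a, b in zip(oof_line, oof_knn)]
```

Fitting the combiner on out-of-fold predictions keeps it from simply rewarding whichever base learner memorized the training data best. A full implementation would then refit each base learner on all training data before combining their predictions on new samples.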

GBR also produced good predictions, though its performance was slightly lower than XGBoost and Stacking. As shown in Fig.4(d), the predicted values generally align with the actual values, but there is a tendency for GBR to slightly overestimate compressive strength, particularly for values greater than 150 MPa. This is reflected in the increasing RMSE from training (1.943 MPa) to validation (4.379 MPa), suggesting that GBR struggled to generalize as effectively as XGBoost. One reason for this could be GBR’s sensitivity to overfitting when the number of boosting iterations or the learning rate is not optimally tuned. Despite this, GBR remains effective at capturing nonlinear relationships, such as the impact of ultra-fine powders (e.g., quartz and silica fume) on the densification of the UHPC matrix and subsequent compressive strength gain.

RFR showed moderate predictive performance, with predicted values generally following the actual values but displaying more variance compared to the ensemble models (Fig.4(c)). RFR achieved an average R² of 0.932 and an RMSE of 5.075 MPa, indicating that while it was able to capture the general trend, it was less precise, particularly in the higher strength ranges. The wide spread of predicted values at compressive strengths above 200 MPa suggests that RFR may struggle to capture the finer details of the relationships between input parameters and compressive strength, especially when dealing with complex variables like the distribution of aggregates and the interaction between fibers and the cement matrix. The limitation of RFR in this context can be attributed to its reliance on averaging multiple decision trees, which, while effective at reducing variance, can also smooth out important details in the data set. In UHPC, small changes in material proportions, curing time, or water content can have significant effects on compressive strength, and RFR may not be as adept at capturing these subtleties compared to boosting algorithms like XGBoost.

KNN exhibited slightly lower prediction accuracy, with an average R² of 0.959 and higher errors in the validation and testing phases (Fig.4(a)). KNN’s reliance on the proximity of data points makes it effective at capturing general trends, but it struggled with predictions at higher compressive strength values, often leading to moderate overestimation. This is reflected in the error plot, where fluctuations close to zero are accompanied by larger errors at compressive strengths greater than 150 MPa. The model’s dependence on distance-based relationships might explain these errors, as the nonlinear interactions within UHPC (such as the effect of superplasticizers on workability and compressive strength) are not easily captured by proximity-based methods.

Finally, SVR exhibited the poorest performance, with the lowest R² (0.930) and the highest error values across all data sets (Fig.4(b)). The wide fluctuation in predicted values, particularly in the mid-range compressive strengths, suggests that SVR was unable to model the complex relationships between the mix proportions and compressive strength. The model’s high sensitivity to outliers and choice of kernel may have contributed to its inability to generalize well, especially in the presence of nonlinear relationships, such as the simultaneous effects of nano-silica and steel fibers on the UHPC matrix. The error distribution in Fig.4(b) indicates that while SVR was able to capture some general trends, its predictions were often too variable to be reliable for UHPC applications, where accurate compressive strength prediction is crucial for ensuring the performance of high-strength concrete structures.

In summary, XGBoost and Stacking were the most effective models for predicting the compressive strength of UHPC, demonstrating superior ability to handle complex, nonlinear relationships in the data set. These models not only provided accurate predictions but also minimized errors, making them highly suitable for use in civil engineering applications where compressive strength is a key factor in structural design and material selection. On the other hand, simpler models like SVR and KNN struggled with capturing the complex interactions between UHPC’s diverse components, leading to less accurate predictions. RFR and GBR, although ensemble methods themselves, provided only moderate prediction accuracy and were outperformed by XGBoost and Stacking.

4.3 Analysis of input parameters and their effects

The behavior and strength development of UHPC are significantly influenced by its complex composition, including a wide variety of fine materials, fibers, and chemical additives. Understanding how each parameter affects the compressive strength is crucial for optimizing UHPC mixtures for specific applications. Fig.6(a)–Fig.6(l) present the relationships between compressive strength and the key input parameters: cement content, fly ash, slag, silica fume, nano silica, limestone powder, aggregates, quartz powder, water content, superplasticizer, steel fibers, and curing time. Each of these components plays a unique role in shaping the mechanical properties of UHPC, and their interactions can significantly affect the material’s performance.

Cement serves as the primary binder in UHPC, contributing to the matrix formation and strength development. As shown in Fig.6(a), the data points indicate that while increasing cement content generally enhances compressive strength, the relationship is not linear. Compressive strength tends to plateau at higher cement contents, suggesting that beyond a certain threshold, additional cement does not significantly contribute to strength. This could be due to the fact that at very high cement contents, the workability of the mix may decrease, leading to difficulties in proper compaction and the formation of micro-voids, which counteract the strength gains. The interaction between cement and other fine materials like silica fume and superplasticizer is critical. For instance, in UHPC, very low W/C ratios are typically employed, and superplasticizers are used to maintain workability. The cementitious materials, especially in combination with supplementary cementitious materials (SCMs) like silica fume, improve the packing density of the matrix. However, the diminishing returns in strength with excessive cement content may also be related to the fact that higher cement contents increase the likelihood of shrinkage and cracking, especially if not adequately compensated by additives like fibers.

Fly ash, a pozzolanic material, is often added to UHPC mixtures to improve workability and reduce the heat of hydration. However, Fig.6(b) shows that increasing fly ash content typically reduces the compressive strength, particularly when the content exceeds 100 kg/m³. The compressive strength is highest when fly ash content is low, with values often exceeding 160 MPa at minimal fly ash dosages. Fly ash’s primary contribution to strength comes through its pozzolanic reaction, which enhances the long-term strength of concrete. However, at higher dosages, the reduction in cementitious content, due to partial replacement with fly ash, leads to lower early-age strength. The scattered data at higher fly ash contents suggests that while fly ash can improve certain properties like workability and reduce the cost of the mix, it should be used sparingly in UHPC to avoid compromising the compressive strength. The optimal use of fly ash in UHPC depends on balancing workability improvements with the need to maintain high strength.

Slag is often used in UHPC mixtures to enhance long-term strength and durability. As seen in Fig.6(c), the relationship between slag content and compressive strength is not straightforward. At low slag contents, there is little noticeable effect on strength, but at higher dosages, there appears to be a slight reduction in compressive strength. This behavior could be due to the fact that slag primarily enhances durability and long-term strength, rather than early-age strength, which is more directly measured by compressive strength tests. Slag’s ability to refine the pore structure and reduce permeability contributes to its role in improving the long-term performance of UHPC. However, the slight reduction in compressive strength at higher slag contents could be attributed to the slower hydration kinetics of slag compared to OPC. The reduction in early-age strength, particularly at high slag contents, may not be desirable for applications requiring high early-age strength, such as in precast construction.

Silica fume is a key component of UHPC, contributing to both the strength and durability of the material. Fig.6(d) shows that while silica fume content varies widely, there is a general trend of higher compressive strength with increased silica fume content. Silica fume particles are extremely fine, allowing them to fill the voids between cement grains, thereby improving the packing density of the matrix. In addition, silica fume reacts with calcium hydroxide produced during cement hydration to form additional calcium silicate hydrate (C-S-H), which is responsible for the material’s strength. However, the data also show a wide scatter in compressive strength at various silica fume contents, indicating that the effect of silica fume is highly dependent on other factors, such as the W/C ratio and superplasticizer dosage. Excessive silica fume can reduce workability, requiring higher doses of superplasticizer to maintain a flowable mix. If not properly dispersed, silica fume can agglomerate, leading to localized weak zones that may reduce overall strength.

Nano silica, like silica fume, enhances the microstructure of UHPC by filling the smallest voids and promoting the formation of additional C-S-H. However, Fig.6(e) shows that the compressive strength does not significantly increase with higher nano silica content beyond a certain point. Most of the data are clustered around low nano silica contents (0–100 kg/m³), and beyond this range, the effect on compressive strength diminishes. This plateau in strength may be due to the fact that nano silica primarily contributes to the densification of the matrix at low dosages, but its benefits taper off as the matrix becomes saturated with fine particles. At higher dosages, nano silica can cause issues with workability, requiring more superplasticizer and potentially leading to segregation if not properly managed.

Limestone powder is often used as a filler in UHPC to enhance the packing density of the mixture. Fig.6(f) shows that there is a wide variation in compressive strength with different limestone powder contents. At low contents (0–200 kg/m³), the strength tends to be higher, but beyond this range, compressive strength decreases. The reduction in strength at higher limestone powder contents can be attributed to the dilution effect, where an excessive amount of filler reduces the volume of reactive cementitious material. While limestone powder improves the packing density and reduces the overall porosity of the matrix, its role as a filler means that it does not contribute to strength in the same way as cement or SCMs like silica fume or slag.

Aggregates play a crucial role in UHPC by providing structural integrity and reducing shrinkage. Fig.6(g) shows that there is no clear linear correlation between aggregate content and compressive strength, with strength values scattered across a wide range of aggregate dosages. This behavior suggests that the quality of the aggregates (e.g., grading, shape, and surface texture) may have a more significant impact on compressive strength than the quantity alone. The wide spread of compressive strengths at various aggregate dosages indicates that factors such as aggregate grading and the interaction between aggregates and the cementitious matrix are crucial for optimizing UHPC performance. Proper grading of aggregates can enhance the packing density and reduce voids, leading to higher strength. However, poor grading or excessive aggregate content can result in a heterogeneous mix that compromises strength.

Quartz powder is commonly used as a micro-filler in UHPC to enhance the packing density of the mix and improve the ITZ between the cement paste and aggregates. Fig.6(h) shows that higher quartz powder contents (beyond 200 kg/m³) tend to reduce compressive strength, indicating that there is an optimal range for its use. The reduction in strength at higher contents may be due to the fact that quartz powder, while effective at filling voids, does not contribute to the hydration process and can reduce the overall reactivity of the mix.

Water plays a critical role in the hydration of cement and the workability of UHPC. Fig.6(i) shows that water content is tightly clustered between 150 and 250 kg/m³, where the highest compressive strengths are achieved. This range reflects the low W/C ratio typical of UHPC mixes, where minimizing the water content reduces porosity and enhances strength. Excessive water content leads to lower compressive strength, as shown by the scattered data points outside this optimal range. Too much water dilutes the cementitious material and increases the porosity of the matrix, resulting in lower strength. Conversely, insufficient water can lead to incomplete hydration and poor workability, which also negatively impacts strength.

Superplasticizers are essential in UHPC to achieve high workability at very low W/C ratios. Fig.6(j) shows that superplasticizer content is generally concentrated between 0 and 100 kg/m³, where compressive strength values are highest. The use of superplasticizers reduces the viscosity of the mix, allowing for better dispersion of fine materials like silica fume and nano silica. However, excessive superplasticizer content can lead to segregation and reduced strength, as indicated by the lower strength values at higher dosages.

Steel fibers are incorporated into UHPC to enhance tensile strength, ductility, and crack resistance. Fig.6(k) shows that moderate amounts of steel fibers (0–150 kg/m³) contribute positively to compressive strength. However, at higher dosages, the data become more scattered, suggesting that excessive fiber content can lead to mixing difficulties or fiber clumping, which reduces the overall strength of the material. The positive effect of steel fibers on compressive strength is primarily due to their ability to bridge micro-cracks and enhance the post-cracking behavior of UHPC. However, too many fibers can create a non-uniform distribution in the mix, leading to localized weaknesses.

Curing time is a crucial factor in the development of UHPC’s mechanical properties. Fig.6(l) shows that longer curing times generally lead to higher compressive strength, as the hydration process continues to develop strength over time. Proper curing conditions, such as controlled temperature and humidity, are essential to ensure that the hydration process is fully optimized. In some cases, heat curing can accelerate strength development, leading to ultra-high performance characteristics at earlier stages.

5 Feature importance and dependency analysis

Feature importance and dependency analyses are essential for understanding how each input parameter contributes to the predictive model’s performance. In this study, the SHAP method was used to evaluate the contribution of each feature in predicting the compressive strength of UHPC. SHAP values provide a game-theory-based explanation of model predictions, decomposing each prediction into additive components that correspond to the contributions of the individual features. This analysis is particularly important when dealing with complex models, such as those used in this study, where multiple features interact in nonlinear ways to influence the outcome.
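The additive decomposition can be made concrete with an exact Shapley value computation on a toy model. This is only a sketch of the underlying game-theoretic definition: it replaces "absent" features with a single baseline point, whereas SHAP implementations average over background data and use faster model-specific algorithms. The toy model, its coefficients, and the instance and baseline values are all hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, instance, baseline):
    """Exact Shapley values of value_fn at `instance`, relative to `baseline`.

    Features outside a coalition take their baseline value. The cost grows as
    2**n_features, so this is suitable only for very small examples.
    """
    n = len(instance)

    def coalition_value(members):
        mixed = [instance[j] if j in members else baseline[j] for j in range(n)]
        return value_fn(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                gain = coalition_value(members | {i}) - coalition_value(members)
                total += weight * gain
        phi.append(total)
    return phi

# Hypothetical toy "strength" model with one interaction term (NOT the paper's model).
def toy_model(features):
    curing_days, silica_fume, water = features
    return (0.5 * curing_days + 0.1 * silica_fume - 0.2 * water
            + 0.001 * silica_fume * curing_days)

instance = [28.0, 200.0, 180.0]   # illustrative mix
baseline = [7.0, 100.0, 200.0]    # illustrative reference mix
phi = shapley_values(toy_model, instance, baseline)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the effect of the interaction term is shared between the two features involved in it.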

SHAP not only identifies the overall importance of each feature but also captures feature interactions, which is crucial in the context of UHPC. UHPC is a composite material made from various fine powders, fibers, and chemical additives, each playing a specific role in enhancing its mechanical properties. Understanding which features have the most significant impact on compressive strength predictions, and how they interact, is critical for optimizing the material’s design.

5.1 Feature importance analysis

The SHAP feature importance analysis (Fig.7) ranks the features by their mean absolute SHAP values, giving a clear indication of which input parameters most strongly influence the prediction of UHPC compressive strength. According to the SHAP values, curing time emerged as the most influential feature, followed by silica fume and aggregates. These results align with engineering principles, as curing time significantly impacts the hydration process, which in turn affects the development of compressive strength. In UHPC, the extended hydration time allows for the formation of a denser and more cohesive matrix, especially when silica fume is present to enhance the pozzolanic reaction, producing additional C-S-H gel. This dense matrix is the primary contributor to UHPC’s superior strength.
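A ranking of this kind is obtained by averaging the absolute SHAP value of each feature over all samples, as in the short sketch below. The SHAP values are made-up numbers chosen so that the ordering can be checked by hand; they are not values from Fig.7, and the function name is hypothetical.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Return [(feature, mean |SHAP|), ...] sorted from most to least influential."""
    n_rows = len(shap_rows)
    importance = [
        sum(abs(row[j]) for row in shap_rows) / n_rows
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, importance),
                  key=lambda item: item[1], reverse=True)

# Made-up SHAP values for four samples (one row per sample, one column per feature).
feature_names = ["curing_time", "silica_fume", "aggregate", "water"]
shap_rows = [
    [12.0, 4.0, -3.0, 1.0],
    [-10.0, 5.0, 2.0, -1.0],
    [8.0, -6.0, 4.0, 0.5],
    [-14.0, 3.0, -3.0, -1.5],
]
ranking = rank_by_mean_abs_shap(shap_rows, feature_names)
```

Taking absolute values before averaging matters: a feature whose contributions are large but of mixed sign would otherwise average out to a small number and appear unimportant.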

As the most influential feature, curing time’s importance can be attributed to its direct impact on strength development. In UHPC, long curing periods are often necessary to allow for the full hydration of cementitious materials and the proper development of the microstructure. Longer curing times result in increased compressive strength because they provide more time for the cement particles to react with water, forming more C-S-H, which leads to higher material density and reduced porosity. The SHAP analysis supports this understanding, as curing time consistently exhibits positive SHAP values, indicating that longer curing times positively affect the model’s predictions of compressive strength.

The second most important feature, silica fume, plays a critical role in improving the packing density of the matrix and enhancing the pozzolanic reaction, which contributes to strength development. Its ultrafine particles fill voids between cement grains, thus reducing porosity and contributing to higher compressive strength. SHAP values for silica fume show that higher contents positively influence the predictions, as increased silica fume content leads to a denser matrix with more C-S-H gel formation, thus improving mechanical properties.

While aggregates typically provide bulk and reduce shrinkage in conventional concrete, their role in UHPC is more nuanced. SHAP values indicate that the contribution of aggregates to compressive strength can be both positive and negative, depending on their content and quality. Aggregates in UHPC are usually finely graded to improve packing density, which helps to reduce voids and improve strength. However, an excessive amount of aggregates may hinder workability and lead to inconsistencies in the matrix, which could reduce compressive strength. The SHAP analysis captures this complexity, with the aggregate content ranking high in importance but exhibiting both positive and negative impacts on the model predictions.

Other notable contributors to compressive strength predictions include superplasticizer, water content, and steel fiber. Superplasticizers are essential in maintaining workability at very low W/C ratios, which are critical in UHPC mixes. The SHAP values indicate that higher dosages of superplasticizer generally lead to higher predicted compressive strength. This is because superplasticizers allow for more efficient dispersion of cement particles and other fine materials, such as silica fume, resulting in a more uniform and dense matrix. However, excessive amounts of superplasticizer can lead to segregation, which could reduce strength, a trend reflected in the SHAP values.

Water content is another critical factor that directly influences the hydration process. The SHAP analysis shows that water content has a moderate impact on compressive strength predictions, with both positive and negative contributions. In UHPC, a very low W/C ratio is employed to minimize porosity and maximize strength, but insufficient water can lead to incomplete hydration and lower strength. Conversely, excessive water increases porosity, reducing strength. The SHAP values reflect this delicate balance, indicating that water content must be carefully optimized to achieve the best results.

Steel fibers are primarily used to improve the tensile strength and ductility of UHPC, but they also contribute to compressive strength by enhancing the material’s resistance to crack propagation. SHAP values indicate that moderate amounts of steel fibers positively influence compressive strength predictions, likely due to their ability to bridge cracks and enhance the post-cracking behavior of the material.

Other features, such as cement, quartz powder, limestone powder, fly ash, slag, and nano-silica, have lower SHAP values, indicating that their influence on compressive strength is less significant compared to curing time, silica fume, and aggregates. This does not mean these materials are unimportant, but their role may be secondary or conditional upon interactions with other features.

5.2 Feature dependency analysis

The SHAP feature dependency analysis provides deeper insights into how individual features influence the model predictions for different data points. This analysis reveals how changes in specific features affect the predicted compressive strength, as well as how features interact with one another. Fig.8 presents a scatter plot of SHAP values for each feature, revealing the complex interactions between input parameters and compressive strength predictions. The color coding from blue to red indicates the relative values of the features, with red representing higher values and blue representing lower values.

As indicated in the feature importance analysis, curing time has the largest impact on compressive strength predictions. In the SHAP dependency plot, curing time exhibits consistently positive SHAP values, particularly at higher values (red), confirming that longer curing times lead to increased compressive strength. This relationship aligns with engineering knowledge, as curing time allows for more complete hydration, resulting in a denser microstructure and higher strength. The SHAP values also show that curing time interacts positively with other factors like silica fume and steel fibers, further enhancing strength.

Both silica fume and steel fibers exhibit a similar trend, where higher values tend to positively influence compressive strength predictions. Silica fume, with its fine particle size, improves the packing density and pozzolanic reaction, while steel fibers enhance crack resistance. The SHAP values for these features show that when used in appropriate quantities, they contribute significantly to strength. However, the scatter in SHAP values also suggests that these features interact with other parameters like water content and aggregate grading, which can either amplify or reduce their effect on strength.

The relationship between aggregate content and compressive strength is more complex, as indicated by the SHAP values. The scatter plot shows both positive and negative SHAP values, suggesting that while aggregates generally contribute to strength by improving packing density, their effect can vary depending on other factors, such as water content and superplasticizer dosage. For example, excessive aggregate content can reduce workability, leading to inconsistencies in the matrix, while too little may not provide enough bulk to counteract shrinkage.

Water content and superplasticizer exhibit a mixed impact on compressive strength, as shown by the scatter in SHAP values. These features interact heavily with other parameters, such as cement and silica fume. For instance, low water content improves strength by reducing porosity, but it requires sufficient superplasticizer to maintain workability. The SHAP analysis captures this dependency, with certain combinations of water and superplasticizer leading to higher predicted strength, while others result in reduced predicted strength due to poor workability or segregation.
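
The way an interaction produces scatter in a dependency plot can be sketched with a two-feature toy model. Everything here is assumed for illustration: `toy_strength` and its coefficients are hypothetical and not calibrated to any UHPC data, and the quantities are arbitrary. The point is only that the same change in water content can receive a SHAP value of opposite sign depending on the superplasticizer level.

```python
def toy_strength(water, sp):
    """Hypothetical strength surrogate with a water x superplasticizer interaction."""
    return 200.0 + 0.5 * water - 0.02 * water * sp


def shapley_two_features(model, x, background):
    """Exact Shapley values for a two-feature model against one background point."""
    x1, x2 = x
    b1, b2 = background
    # Average the marginal contribution of each feature over both orderings
    phi1 = 0.5 * ((model(x1, b2) - model(b1, b2)) + (model(x1, x2) - model(b1, x2)))
    phi2 = 0.5 * ((model(b1, x2) - model(b1, b2)) + (model(x1, x2) - model(x1, b2)))
    return phi1, phi2


background = (180.0, 20.0)  # hypothetical reference mix: (water, superplasticizer)

# Same reduced water content, paired with a low and a high superplasticizer dosage
low_sp = shapley_two_features(toy_strength, (160.0, 10.0), background)
high_sp = shapley_two_features(toy_strength, (160.0, 40.0), background)

print("water SHAP at low superplasticizer: ", low_sp[0])
print("water SHAP at high superplasticizer:", high_sp[0])
```

In this toy setting the water attribution is about -4 at the low dosage and about +2 at the high dosage, so a single water value maps to several SHAP values. That vertical spread is what appears as scatter in a dependency plot.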

Although limestone powder and nano-silica have lower overall importance, their SHAP values reveal interesting patterns. For example, certain values of nano-silica and limestone powder can lead to noticeable shifts in the predictions, suggesting that while their overall impact is smaller, they can still play a role in specific scenarios, such as enhancing early-age strength or filling micro-voids in the matrix.

6 Conclusions

This study comprehensively analyzed the predictive capabilities of various ML models to forecast the compressive strength of UHPC. Six ML models (KNN, SVR, RFR, GBR, Stacking, and XGBoost) were trained and tested using a data set of 761 data points derived from various UHPC mix designs. The results showed that ensemble models, particularly XGBoost and Stacking, significantly outperformed the other models, delivering the most accurate predictions for UHPC compressive strength.

Among the models evaluated, XGBoost demonstrated the highest prediction accuracy, achieving an R² of 0.969 and an RMSE of 4.626 MPa during validation. Its superior performance was consistent across varying UHPC compositions, particularly those with diverse curing times, silica fume contents, and aggregate proportions. The model’s ability to capture complex nonlinear interactions among input features allowed it to generalize effectively, even when applied to data sets with wide variations in material properties and curing conditions. This adaptability highlights XGBoost’s potential for broader applications in predicting the mechanical properties of high-performance concretes with varied compositions. The Stacking model, while also performing well with an R² of 0.960, demonstrated slightly lower predictive capability, primarily in the higher compressive strength ranges where it tended to overestimate.
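
For readers who wish to reproduce these two metrics on their own predictions, a minimal sketch follows. It assumes the standard definitions of RMSE and R² (residual sum of squares relative to total sum of squares). The `measured` and `predicted` values are hypothetical and are not taken from the study's data set.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the target (MPa here)."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted compressive strengths (MPa)
measured = [120.0, 135.0, 150.0, 165.0]
predicted = [123.0, 131.0, 152.0, 164.0]

print("RMSE (MPa):", round(rmse(measured, predicted), 3))
print("R2:", round(r2_score(measured, predicted), 3))
```

Because RMSE carries the unit of the target, a value such as 4.626 MPa is best judged relative to the strength range of the data, whereas R² is dimensionless.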

The feature importance analysis using SHAP provided further insights into the influence of each input variable on the model’s predictions. The SHAP analysis revealed that curing time was the most significant factor influencing compressive strength, with the highest mean absolute SHAP values. This confirms the critical role that curing plays in UHPC strength development, as longer curing times allow for more complete hydration of the cementitious materials, resulting in a denser and more durable concrete matrix. The SHAP analysis also identified silica fume and aggregate content as other major contributors to compressive strength. Silica fume, due to its ultrafine particle size, enhances matrix densification and contributes to the pozzolanic reaction, leading to improved strength. Aggregates, when properly graded and optimized, improve the packing density of the concrete, reducing voids and contributing to overall strength development.
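
The ranking by mean absolute SHAP value mentioned above can be sketched as follows. The matrix `shap_matrix` (one row per sample, one column per feature) and the feature names are hypothetical placeholders; in practice the matrix would come from a SHAP explainer applied to the trained model.

```python
# Hypothetical per-sample SHAP values (rows: samples, columns: features)
feature_names = ["curing_time", "silica_fume", "aggregate", "steel_fiber"]
shap_matrix = [
    [12.0, 4.0, -3.0, 1.0],
    [-9.0, 5.0, 2.0, -0.5],
    [15.0, -3.0, 4.0, 1.5],
    [-12.0, -4.0, -3.0, 1.0],
]


def mean_abs_shap(matrix):
    """Mean absolute SHAP value per feature (column-wise)."""
    n_samples = len(matrix)
    n_features = len(matrix[0])
    return [sum(abs(row[j]) for row in matrix) / n_samples
            for j in range(n_features)]


importance = mean_abs_shap(shap_matrix)

# Sort features from most to least influential
ranking = sorted(zip(feature_names, importance), key=lambda item: item[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

Taking the absolute value before averaging matters: a feature whose contributions are large but of mixed sign would otherwise appear unimportant because positive and negative values cancel.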

Furthermore, the SHAP feature dependency analysis highlighted the complexity of interactions between different mix components. For instance, while superplasticizer and water content were identified as moderately important features, their effects were highly dependent on the quantities used and their interactions with other parameters. Higher superplasticizer contents, for example, helped improve the workability of the mix, especially at very low W/C ratios typical of UHPC, but excessive amounts led to segregation and reduced compressive strength. Similarly, steel fibers were found to improve compressive strength by enhancing crack resistance and post-cracking behavior, but too many fibers could lead to mixing difficulties, resulting in lower overall strength.

In summary, the use of advanced ML models like XGBoost and Stacking, combined with SHAP for interpretability, provides a robust and accurate framework for predicting the compressive strength of UHPC. The study demonstrated that these models can capture the complex, nonlinear relationships between UHPC mix components and compressive strength, offering not only precise predictions but also valuable insights into the role of individual features. By leveraging these models, engineers can optimize UHPC formulations to achieve desired strength outcomes more effectively. The XGBoost model, with the lowest prediction error among the evaluated models (RMSE of 4.626 MPa), is well suited for high-precision applications in structural engineering where the accurate prediction of compressive strength is critical. This approach can also aid in the design of more sustainable and cost-effective UHPC formulations, ensuring optimal performance in real-world applications.

The data set used in this study, while comprehensive, is limited to specific experimental conditions, and the predictions lack experimental validation. Furthermore, external factors such as environmental conditions and long-term durability are not considered. Future work should address these aspects to improve the robustness and applicability of the proposed framework.

References

[1]	Mathern A, von der Haar C, Marx S. Concrete support structures for offshore wind turbines: Current status, challenges, and future trends. Energies, 2021, 14(7): 1995

[2]	Rasul M, Ahmad S, Adekunle S K, Al-Dulaijan S U, Maslehuddin M, Ali S I. Evaluation of the effect of exposure duration and fiber content on the mechanical properties of polypropylene fiber-reinforced UHPC exposed to sustained elevated temperature. Journal of Testing and Evaluation, 2020, 48(6): 4355–4369

[3]	Iyer N R. An overview of cementitious construction materials. New Materials in Civil Engineering, 2020: 1–64

[4]	Ahmad S, Rasul M, Adekunle S K, Al-Dulaijan S U, Maslehuddin M, Ali S I. Mechanical properties of steel fiber-reinforced UHPC mixtures exposed to elevated temperature: Effects of exposure duration and fiber content. Composites. Part B, Engineering, 2019, 168: 291–301

[5]	Tabish M, Zaheer M M, Baqi A. Effect of nano-silica on mechanical, microstructural and durability properties of cement-based materials: A review. Journal of Building Engineering, 2023, 65: 105676

[6]	Vijayan D S, Devarajan P, Sivasuriyan A. A review on eminent application and performance of nano based silica and silica fume in the cement concrete. Sustainable Energy Technologies and Assessments, 2023, 56: 103105

[7]	Yang H, Monasterio M, Zheng D, Cui H, Tang W, Bao X, Chen X. Effects of nano silica on the properties of cement-based materials: A comprehensive review. Construction & Building Materials, 2021, 282: 122715

[8]	Wang D, Shi C, Farzadnia N, Shi Z, Jia H. A review on effects of limestone powder on the properties of concrete. Construction & Building Materials, 2018, 192: 153–166

[9]	Kumar S, Gupta R C, Shrivastava S, Csetenyi L, Thomas B S. Preliminary study on the use of quartz sandstone as a partial replacement of coarse aggregate in concrete based on clay content, morphology and compressive strength of combined gradation. Construction & Building Materials, 2016, 107: 103–108

[10]	Liu J, Wei J, Li J, Su Y, Wu C. A comprehensive review of ultra-high performance concrete (UHPC) behaviour under blast loads. Cement and Concrete Composites, 2024, 148: 105449

[11]	Bajaber M A, Hakeem I Y. UHPC evolution, development, and utilization in construction: A review. Journal of Materials Research and Technology, 2021, 10: 1058–1074

[12]	Marvila M T, de Azevedo A R G, de Matos P R, Monteiro S N, Vieira C M F. Materials for production of high and ultra-high performance concrete: Review and perspective of possible novel materials. Materials, 2021, 14(15): 4304

[13]	Fan D, Yu R, Fu S, Yue L, Wu C, Shui Z, Liu K, Song Q, Sun M, Jiang C. Precise design and characteristics prediction of Ultra-High Performance Concrete (UHPC) based on artificial intelligence techniques. Cement and Concrete Composites, 2021, 122: 104171

[14]	Fan D, Zhu J, Fan M, Lu J X, Chu S H, Dong E, Yu R. Intelligent design and manufacturing of ultra-high performance concrete (UHPC)—A review. Construction & Building Materials, 2023, 385: 131495

[15]	Li Z, Qi J, Hu Y, Wang J. Estimation of bond strength between UHPC and reinforcing bars using machine learning approaches. Engineering Structures, 2022, 262: 114311

[16]	Avcı-Karataş Ç. Modeling approach for estimation of ultimate load capacity of concrete-filled steel tube composite stub columns based on relevance vector machine. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi, 2021, 10(2): 615–626

[17]	Cui J, Xing G, Miao P, Zhang Y, Chang Z, Khan A Q. Flexural behavior of RC beams strengthened with BFRP bars and CFRP U-jackets: Experimental and numerical analysis. Journal of Building Engineering, 2024, 97: 110932

[18]	Avci-Karatas C. Prediction of ultimate load capacity of concrete-filled steel tube columns using multivariate adaptive regression splines (MARS). Steel and Composite Structures, 2019, 33(4): 583–594

[19]	Avci-Karatas C. Application of machine learning in prediction of shear capacity of headed steel studs in steel–concrete composite structures. International Journal of Steel Structures, 2022, 22(2): 539–556

[20]	Khan A Q, Awan H A, Rasul M, Siddiqi Z A, Pimanmas A. Optimized artificial neural network model for accurate prediction of compressive strength of normal and high strength concrete. Cleaner Materials, 2023, 10: 100211

[21]	Jiang T, Gradus J L, Rosellini A J. Supervised machine learning: A brief primer. Behavior Therapy, 2020, 51(5): 675–687

[22]	van Berkel N, Dennis S, Zyphur M, Li J, Heathcote A, Kostakos V. Modeling interaction as a complex system. Human–Computer Interaction, 2021, 36(4): 279–305

[23]	Nasteski V. An overview of the supervised machine learning methods. Horizons Series B, 2017, 4: 51–62

[24]	Osisanwo F Y, Akinsola J E T, Awodele O, Hinmikaiye J O, Olakanmi O, Akinjobi J. Supervised machine learning algorithms: Classification and comparison. International Journal of Computer Trends and Technology, 2017, 48(3): 128–138

[25]	Rasul M, Hosoda A. Prediction of occurrence of thermal cracking of RC abutments using artificial neural networks. Journal of Structural Engineering A, 2019, 65: 560–568

[26]	Avci-Karatas C. Artificial neural network (ANN) based prediction of ultimate axial load capacity of concrete-filled steel tube columns (CFSTCs). International Journal of Steel Structures, 2022, 22(5): 1341–1358

[27]	Rasul M, Hosoda A. Application of artificial neural network in predicting maximum thermal crack width of RC abutments using actual construction data. In: Proceedings of FIB Symposium 2019 Concrete-Innovations in Materials, Design and Structures. Krakow: fib, 2019, 1339–1346

[28]	Wang L. Estimating high-performance concrete compressive strength with support vector regression in hybrid method. Multiscale and Multidisciplinary Modeling, Experiments and Design, 2024, 7(1): 477–490

[29]	Khan A Q, Naveed M H, Rasheed M D, Miao P. Prediction of compressive strength of fly ash-based geopolymer concrete using supervised machine learning methods. Arabian Journal for Science and Engineering, 2024, 49(4): 4889–4904

[30]	Khan A Q, Naveed M H, Rasheed M D, Pimanmas A. Prediction of stress–strain behavior of PET FRP-confined concrete using machine learning models. Arabian Journal for Science and Engineering, 2024, 1–21

[31]	Song Y, Zhao J, Ostrowski K A, Javed M F, Ahmad A, Khan M I, Aslam F, Kinasz R. Prediction of compressive strength of fly-ash-based concrete using ensemble and non-ensemble supervised machine-learning approaches. Applied Sciences, 2021, 12(1): 361

[32]	Pernía-Espinoza A, Fernández-Ceniceros J, Antonanzas J, Urraca R, Martinez-de-Pison F J. Stacking ensemble with parsimonious base models to improve generalization capability in the characterization of steel bolted components. Applied Soft Computing, 2018, 70: 737–750

[33]	Chung K L, Wang L, Ghannam M, Guan M, Luo J. Prediction of concrete compressive strength based on early-age effective conductivity measurement. Journal of Building Engineering, 2021, 35: 101998

[34]	Boateng E Y, Otoo J, Abaye D A. Basic tenets of classification algorithms K-nearest-neighbor, support vector machine, random forest and neural network: A review. Journal of Data Analysis and Information Processing, 2020, 8(4): 341–357

[35]	Patel A K, Chatterjee S, Gorai A K. Development of a machine vision system using the support vector machine regression (SVR) algorithm for the online prediction of iron ore grades. Earth Science Informatics, 2019, 12(2): 197–210

[36]	Gholami R, Fakhari N. Support vector machine: Principles, parameters, and applications. Handbook of Neural Computation, 2017: 515–535

[37]	Ao Y, Li H, Zhu L, Ali S, Yang Z. The linear random forest algorithm and its advantages in machine learning assisted logging regression modeling. Journal of Petroleum Science Engineering, 2019, 174: 776–789

[38]	Rabbani A, Samui P, Kumari S. Implementing ensemble learning models for the prediction of shear strength of soil. Asian Journal of Civil Engineering, 2023, 24(7): 2103–2119

[39]	Ghasemieh A, Lloyed A, Bahrami P, Vajar P, Kashef R. A novel machine learning model with Stacking Ensemble Learner for predicting emergency readmission of heart-disease patients. Decision Analytics Journal, 2023, 7: 100242

[40]	Khan A Q, Deng P, Matsumoto T. Equivalent boundary conditions to analyze the realistic fatigue behaviors of a bridge RC slab. Structural Engineering and Mechanics, 2022, 82(3): 369–383

[41]	Chicco D, Warrens M J, Jurman G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ. Computer Science, 2021, 7: e623

[42]	Huang L, Zhang S S, Yu T, Wang Z Y. Compressive behaviour of large rupture strain FRP-confined concrete-encased steel columns. Construction & Building Materials, 2018, 183: 513–522

[43]	Ağbulut Ü, Gürel A E, Biçen Y. Prediction of daily global solar radiation using different machine learning algorithms: Evaluation and comparison. Renewable & Sustainable Energy Reviews, 2021, 135: 110114

[44]	Khanal S, Fulton J, Klopfenstein A, Douridas N, Shearer S. Integration of high resolution remotely sensed data and machine learning techniques for spatial prediction of soil properties and corn yield. Computers and Electronics in Agriculture, 2018, 153: 213–225

[45]	Chen P, Hsieh H, Su K, Sigalingging X K, Chen Y, Leu J. Predicting station level demand in a bike‐sharing system using recurrent neural networks. IET Intelligent Transport Systems, 2020, 14(6): 554–561

[46]	Lynch C J, Gore R. Short-range forecasting of COVID-19 during early onset at county, health district, and state geographic levels using seven methods: Comparative forecasting study. Journal of Medical Internet Research, 2021, 23(3): e24925

[47]	Khan M, Lao J, Dai J G. Comparative study of advanced computational techniques for estimating the compressive strength of UHPC. Journal of Asian Concrete Federation, 2022, 8(1): 51–68

[48]	Abuodeh O R, Abdalla J A, Hawileh R A. Assessment of compressive strength of Ultra-high Performance Concrete using deep machine learning techniques. Applied Soft Computing, 2020, 95: 106552

[49]	Kashem A, Karim R, Malo S C, Das P, Datta S D, Alharthai M. Hybrid data-driven approaches to predicting the compressive strength of ultra-high-performance concrete using SHAP and PDP analyses. Case Studies in Construction Materials, 2024, 20: e02991

[50]	Vinnakota S. Understanding the long-term evolution of CSH phases present in cement backfills. In: Proceedings of 37th Cement and Concrete Science Conference. London: University College London, 2019

[51]	Han Q, Zhang P, Wu J, Jing Y, Zhang D, Zhang T. Comprehensive review of the properties of fly ash-based geopolymer with additive of nano-SiO₂. Nanotechnology Reviews, 2022, 11(1): 1478–1498

[52]	Li S, Cheng S, Mo L, Deng M. Effects of steel slag powder and expansive agent on the properties of ultra-high performance concrete (UHPC): Based on a case study. Materials, 2020, 13(3): 683

[53]	Amin M, Zeyad A M, Tayeh B A, Saad Agwa I. Effect of ferrosilicon and silica fume on mechanical, durability, and microstructure characteristics of ultra high-performance concrete. Construction & Building Materials, 2022, 320: 126233

[54]	Park S, Wu S, Liu Z, Pyo S. The role of supplementary cementitious materials (SCMs) in ultra high performance concrete (UHPC): A review. Materials, 2021, 14(6): 1472

[55]	Liu C, He X, Deng X, Wu Y, Zheng Z, Liu J, Hui D. Application of nanomaterials in ultra-high performance concrete: A review. Nanotechnology Reviews, 2020, 9(1): 1427–1444

[56]	Dingqiang F, Rui Y, Zhonghe S, Chunfeng W, Jinnan W, Qiqi S. A novel approach for developing a green Ultra-High Performance Concrete (UHPC) with advanced particles packing meso-structure. Construction & Building Materials, 2020, 265: 120339

[57]	Arora A, Almujaddidi A, Kianmofrad F, Mobasher B, Neithalath N. Material design of economical ultra-high performance concrete (UHPC) and evaluation of their properties. Cement and Concrete Composites, 2019, 104: 103346

[58]	Ullah R, Qiang Y, Ahmad J, Vatin N I, El-Shorbagy M A. Ultra-high-performance concrete (UHPC): A state-of-the-art review. Materials, 2022, 15(12): 4131

[59]	Saleh S, Li Y L, Hamed E, Mahmood A H, Zhao X L. Workability, strength, and shrinkage of ultra-high-performance seawater, sea sand concrete with different OPC replacement ratios. Journal of Sustainable Cement-Based Materials, 2023, 12(3): 271–291

[60]	Zhang Y, Zhu Y, Qu S, Kumar A, Shao X. Improvement of flexural and tensile strength of layered-casting UHPC with aligned steel fibers. Construction & Building Materials, 2020, 251: 118893

[61]	Huang H, Gao X, Li L, Wang H. Improvement effect of steel fiber orientation control on mechanical performance of UHPC. Construction & Building Materials, 2018, 188: 709–721

[62]	Zhang X Y, Yu R, Zhang J J, Shui Z H. A low-carbon alkali activated slag based ultra-high performance concrete (UHPC): Reaction kinetics and microstructure development. Journal of Cleaner Production, 2022, 363: 132416

[63]	Di Guida R, Engel J, Allwood J W, Weber R J M, Jones M R, Sommer U, Viant M R, Dunn W B. Non-targeted UHPLC-MS metabolomic data processing methods: A comparative investigation of normalisation, missing value imputation, transformation and scaling. Metabolomics, 2016, 12(5): 93

[64]	Singh D, Singh B. Investigating the impact of data normalization on classification performance. Applied Soft Computing, 2020, 97: 105524

[65]	Raschka S. Model evaluation, model selection, and algorithm selection in machine learning. 2018, arXiv: 1811.12808

RIGHTS & PERMISSIONS

Higher Education Press