Development of optimized XGBoost model for accurate prediction of concrete-filled steel tubular column bearing capacity

Tran Minh LUAN, Samir KHATIR, Timon RABCZUK, Nicholas FANTUZZI, Thanh CUONG-LE

ENG. Struct. Civ. Eng ›› 2026, Vol. 20 ›› Issue (3): 524–540. DOI: 10.1007/s11709-026-1279-7

RESEARCH ARTICLE

Abstract

In recent years, with the rise of the Industry 4.0 revolution, machine learning (ML) has become familiar and is increasingly applied in the engineering field. This study proposes two new hybrid models, exponential-trigonometric optimized extreme gradient boosting (ETO-XGBoost) and whale optimization algorithm extreme gradient boosting (WOA-XGBoost), which combine extreme gradient boosting with the exponential-trigonometric optimization and whale optimization algorithms, respectively. A data set was built and analyzed using ABAQUS software and combined with data from an empirical formula to construct the training data set for the proposed models. The two hybrid models are compared with available models, including Kolmogorov-Arnold networks (KAN), artificial neural networks (ANN) and the Eurocode 4 standard, through important statistical indices such as mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE) and the correlation coefficient R. The analysis results show that the ETO-XGBoost and WOA-XGBoost models are the most effective of the compared ML models. The correlation coefficient R of both models reached very high values (0.9963 for ETO-XGBoost and 0.9956 for WOA-XGBoost, respectively). At the same time, the error indices of the ETO-XGBoost model were the smallest among the compared models, with RMSE = 63.3738, MAE = 47.4643 and MAPE = 1.8221; the corresponding indices of the WOA-XGBoost model were RMSE = 67.8956, MAE = 49.1825 and MAPE = 1.9040. In addition, the predictions of the ETO-XGBoost and WOA-XGBoost models showed the highest agreement with the actual data in predicting the axial strength of concrete-filled steel tubular (CFST) columns. Therefore, the ETO-XGBoost and WOA-XGBoost models can be considered powerful and accurate tools for predicting the compressive strength of CFST columns.

Graphical abstract

Keywords

ML / ETO-XGBoost / WOA-XGBoost / KAN / ANN / Eurocode 4

Cite this article

Tran Minh LUAN, Samir KHATIR, Timon RABCZUK, Nicholas FANTUZZI, Thanh CUONG-LE. Development of optimized XGBoost model for accurate prediction of concrete-filled steel tubular column bearing capacity. ENG. Struct. Civ. Eng, 2026, 20(3): 524-540 DOI:10.1007/s11709-026-1279-7


1 Introduction

Concrete-filled steel tubular (CFST) columns are structural members widely used in civil and industrial works thanks to their outstanding mechanical properties, such as high strength, high ductility and, especially, load-bearing capacity [1–3]. The column consists of a hollow steel tube filled with concrete, combining the stiffness of steel with the compressive strength of concrete. CFST columns are therefore widely used in high-rise buildings, bridges and many other structural applications to withstand large axial loads [4–6]. In recent years, the need for accurate prediction of the ultimate axial capacity of CFST columns has increased for safe and economical design. The two traditional approaches are to increase the cross-sectional size or to use high-strength materials [7], of which the former is less efficient in terms of space and cost. Studies by Cederwall et al. [8], Mursi and Uy [9], Sakino et al. [10], and Liu et al. [11] have investigated high-strength CFST columns, while Lai and Varma [12] developed corresponding design equations. However, data are still limited and numerical simulations for predicting the ultimate load of CFST columns have not been fully exploited [13,14].

In that context, machine learning (ML) and artificial intelligence (AI) methods have been increasingly applied in construction engineering [15–20], and especially to compressive structures [21–23], thanks to their ability to solve linear and nonlinear problems of varying complexity. AI models often employ hybrid [24,25] or single [26,27] intelligent computational methods, most commonly artificial neural networks (ANN) [28] and gene expression programming (GEP) [29], supported by many modern optimization algorithms. Hamdia et al. [30] used a Bayesian method to evaluate the uncertainty in material damage models, showing that the gradient-enhanced damage model has the highest reliability. Abhishek et al. [31] proposed a method combining deep machine learning and collocation techniques to solve partial differential equations, achieving an error of less than 5%. In the field of predicting the bearing capacity of CFST columns, many nature-inspired optimization algorithms have been applied. Ren et al. [24] combined support vector machines (SVM) with particle swarm optimization (PSO) to predict the bearing capacity of square CFST columns, achieving higher accuracy than traditional models. Sarir et al. [25] developed a hybrid model by integrating the invasive weed optimization (IWO) algorithm with ANN to predict the bearing capacity of circular CFST columns, and validated its performance against an artificial bee colony (ABC) with ANN model, showing that IWO provides greater accuracy than ABC in optimizing ANN parameters. Jayalekshmi et al. [32] proposed an ANN model to simulate the axial load capacity of circular CFST columns with high accuracy. Vu et al. [33] used the gradient tree boosting algorithm to predict the strength of CFST columns, while Ahmadi et al. [27] extended ANN to estimate the compressive strength of concrete confined in CFST with different cross-sectional shapes.
Building upon this foundational work, Saadoon et al. extended the application of the model to evaluate the ultimate bearing capacity of both rectangular [34] and circular [35] concrete-filled steel tube columns, thereby demonstrating its versatility across different cross-sectional geometries. In a parallel line of research, Nguyen et al. [36] improved the ANN performance using the IWO technique to avoid local minima. Furthermore, Sarir et al. [37] explored a diverse set of intelligent optimization strategies, including tree-based models, neural swarm algorithms, and whale optimization-inspired GEP, aiming to optimize model parameters more effectively and achieve greater accuracy, following the motivation and methodology presented in Ref. [36].

Recently, the trend of integrating advanced ML models with optimization algorithms has become more popular. Nguyen-Vu Luat et al. [38] developed a new hybrid intelligent system, GAP-BART, based on the Bayesian additive regression tree (BART) combined with three nature-inspired optimization algorithms (genetic algorithm, ABC, and particle swarm optimization) and achieved the expected results. In this study, an approach is proposed that replaces the BART model with the extreme gradient boosting (XGBoost) model, one of the most powerful and effective ML techniques available today for regression problems. XGBoost is built on boosted decision trees, combining nonlinear learning capability, resistance to overfitting, and fast training. This study applies the exponential-trigonometric optimization (ETO) algorithm [39] to automatically optimize the hyperparameter set of the XGBoost model used in predicting the compressive strength of CFST columns. A database was collected from ABAQUS numerical simulations and empirical formulas. Input data were normalized to improve algorithm efficiency, and a combination of geometric and strength parameters was considered in training the XGBoost model with hyperparameters optimized via ETO. The optimization objective is the lowest root mean square error (RMSE). Subsequently, the model was evaluated using indices such as the correlation coefficient (R), RMSE, mean absolute error (MAE) and regression plots.

Section 2 of the paper introduces existing methods, including ABAQUS software simulation, the Eurocode 4 standard, and a formulation proposed in previous research. Section 3 presents ML models, namely the Kolmogorov-Arnold network (KAN) and ANN models. Section 4 presents the two proposed hybrid models: the exponential-trigonometric optimized extreme gradient boosting (ETO-XGBoost) hybrid model and the whale optimization algorithm extreme gradient boosting (WOA-XGBoost) hybrid model. Section 5 presents the results and compares the effectiveness of the two proposed hybrid models with previous models. Finally, Section 6 presents the conclusions and outlines future research directions.

2 Review of existing approaches

To calculate and collect data for the AI models, numerical simulation using ABAQUS software is performed in this study. In addition, the Eurocode 4 standard [40] and the new calculation method proposed by Han et al. [41] are described.

2.1 Numerical simulation

In this research, finite element analysis was employed to simulate the structural behavior of CFST columns, utilizing the commercial software ABAQUS due to its robust capabilities in modeling complex structural systems. The primary objective of the simulation was to investigate the axial load-bearing capacity of CFST columns under different loading conditions. The numerical outcomes were subsequently validated through comparison with experimental data reported in prior studies, allowing for both verification of model accuracy and a more comprehensive understanding of the structural performance of CFST columns. Figure 1 illustrates the applied load diagram and boundary conditions of the CFST column compression test, which will facilitate simulation in the ABAQUS model.

Within the ABAQUS environment, the concrete core of the CFST columns was modeled using C3D8R elements, which are eight-node linear brick elements with reduced integration and hourglass control. The steel tube was represented by S4R elements (four-node shell elements capable of handling both thin and thick shell behavior, also featuring reduced integration and hourglass control). The interaction between the steel tube and the concrete core was simulated using “surface-to-surface contact” definitions. To capture the interface behavior in both the normal and tangential directions, contact properties were assigned, with normal behavior defined using the “hard contact” model and tangential behavior governed by a friction coefficient of 0.47 [42]. Furthermore, the accurate representation of the constitutive models for the concrete and steel materials played a crucial role in ensuring the reliability of the simulation results produced by ABAQUS. The concrete failure model adopted in the ABAQUS simulation is the concrete damaged plasticity (CDP) model, which incorporates several key parameters to represent the inelastic behavior of confined concrete. The dilation angle is specified as 38°, while the eccentricity is set at 0.7. Other essential parameters, such as the shape factor $K_c = 0.67$, the stress ratio $\sigma_{b0}/\sigma_{c0} = 1.2$, and the damage parameters $d_c$ and $d_t$, are defined in accordance with the empirical formulations reported in Ref. [42]. These parameters collectively define the yield surface and flow potential used to simulate the concrete’s damage and plasticity behavior under multiaxial loading conditions. Figure 2 presents the finite element models of two CFST column types constructed using ABAQUS. The comparison between the numerical results and the experimental data reported in Ref. [3] is illustrated in Fig. 3.

As shown in Fig. 3, the axial load capacities obtained from the finite element simulation are 7598 kN for the hollow circular CFST column and 7049 kN for the hollow square CFST column. In comparison with the experimental data [3], the corresponding values are 7640 and 7143 kN, respectively. These results indicate a close agreement between the numerical and experimental outcomes. Furthermore, the axial load–axial shortening curves derived from both methods exhibit similar trends. Overall, the strong correlation between simulation and experimental data confirms the accuracy and reliability of the ABAQUS model in predicting the structural performance of CFST columns.

2.2 Eurocode 4 standard

According to the Eurocode 4 standard [40], the characteristic axial compressive resistance of a CFST column without internal reinforcement can be determined using Eq. (1):

$N_{pl,Rk1} = A_a f_y + A_c f_{ck},$

where $A_a$ and $A_c$ denote the cross-sectional areas of the steel tube and the concrete core, respectively, and $f_y$ and $f_{ck}$ represent the yield strength of the steel and the characteristic compressive strength of the concrete, respectively. Confinement may be taken into account for columns with a relative slenderness $\bar{\lambda} \le 0.5$ under eccentric loading with an eccentricity ratio $e/d < 0.5$, where $\bar{\lambda} = \sqrt{N_{pl,Rk1}/N_{cr}}$ and $N_{cr}$ is the Euler buckling load of the CFST column. The design compressive resistance of the cross-section is then evaluated using Eq. (2):

$N_{pl,Rk2} = \eta_a A_a f_y + A_c f_{ck}\left(1 + \eta_c \dfrac{t}{d}\dfrac{f_y}{f_{ck}}\right),$

where $t$ is the wall thickness of the steel tube, $d$ is the outer diameter of the CFST column, and $\eta_a$ and $\eta_c$ are reduction coefficients that account for the effects of slenderness and confinement, computed using Eqs. (3) and (4):

$\eta_a = 0.25(3 + 2\bar{\lambda}) + \left[1 - 0.25(3 + 2\bar{\lambda})\right](10e/d),$

$\eta_c = \left(4.9 - 18.5\bar{\lambda} + 17\bar{\lambda}^2\right)\left(1 - 10e/d\right).$

To evaluate the slenderness of the column, the Euler buckling load is calculated based on Eq. (5):

$N_{cr} = \dfrac{\pi^2 (EI)_{eff}}{L^2},$

where

$(EI)_{eff} = E_a I_a + 0.6 E_{cm} I_c,$

$L$ is the column length, $E_{cm}$ and $E_a$ are the elastic moduli of concrete and steel, and $I_a$ and $I_c$ are the moments of inertia of the steel and concrete sections, respectively.
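As a worked illustration, the Eurocode 4 expressions above can be collected into a short script. The function names, argument ordering, and unit convention (areas in mm², strengths in MPa, resistance in N) are illustrative assumptions, not part of the standard.

```python
def eurocode4_plastic_resistance(A_a, A_c, f_y, f_ck):
    # Eq. (1): plain squash resistance N_pl,Rk1 = A_a*f_y + A_c*f_ck
    return A_a * f_y + A_c * f_ck

def eurocode4_confined_resistance(A_a, A_c, f_y, f_ck, t, d, e, lam_bar):
    # Eqs. (2)-(4); intended for relative slenderness lam_bar <= 0.5 and e/d < 0.5
    eta_a0 = 0.25 * (3.0 + 2.0 * lam_bar)
    eta_a = eta_a0 + (1.0 - eta_a0) * (10.0 * e / d)                       # Eq. (3)
    eta_c = (4.9 - 18.5 * lam_bar + 17.0 * lam_bar ** 2) * (1.0 - 10.0 * e / d)  # Eq. (4)
    # Eq. (2): confinement raises the concrete contribution via t/d and f_y/f_ck
    return eta_a * A_a * f_y + A_c * f_ck * (1.0 + eta_c * (t / d) * (f_y / f_ck))
```

For a stocky, concentrically loaded circular section the confined resistance of Eq. (2) exceeds the plain squash load of Eq. (1), which is the expected effect of the confinement term.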

2.3 Concrete-filled steel tubular column design with holes

Han et al. [41] proposed a formula for calculating the axial capacity of CFST columns with holes, given by Eq. (7):

$N_{Han} = f_{syi} A_{si} + f_{osc} A_{soc}.$

For columns with circular, rectangular and elliptical cross-section voids, $f_{osc\text{-}1}$ is calculated by Eq. (8):

$f_{osc\text{-}1} = C_1 \chi^2 f_{syo} + C_2 (1.14 + 1.02\xi) f_{ck}.$

For CFST columns with square cross-section voids, $f_{osc}$ is computed by Eq. (9):

$f_{osc} = C_1 \chi^2 f_{syo} + C_2 (1.18 + 0.85\xi) f_{ck}.$

The coefficients $C_1$ and $C_2$ are calculated by Eqs. (10) and (11), respectively:

$C_1 = \dfrac{\alpha}{1+\alpha}, \quad \alpha = \dfrac{A_{so}}{A_c},$

$C_2 = \dfrac{1+\alpha_n}{1+\alpha}, \quad \alpha_n = \dfrac{A_{so}}{A_{c,nom}}.$

In addition, $\chi$ and $\xi$ are the void ratio and the nominal confinement factor of the CFST column, respectively, calculated by Eqs. (12) and (13):

$\chi = \dfrac{D_i}{D_o - 2t_o},$

$\xi = \dfrac{f_{syo} A_{so}}{f_{ck} A_{c,nom}},$

where $A_{so}$ and $A_{c,nom}$ are the cross-sectional area of the outer steel tube and the nominal cross-sectional area of the concrete, respectively.

$A_{c,nom} = \dfrac{\pi (D_o - 2t_o)^2}{4},$

$A_{soc} = A_{so} + A_c.$

In addition, $f_{ck}$ is the characteristic concrete strength ($f_{ck} = 0.67 f_{cu}$ for normal-strength concrete), $f_{syi}$ and $f_{syo}$ are the yield strengths of the inner and outer steel tubes, respectively, $A_{si}$ is the cross-sectional area of the inner steel tube, and $D_o$ and $D_i$ are the outer and inner diameters (or outer and inner widths) of the circular (or square) steel tubes, respectively.
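The formula chain of Eqs. (7)–(15) can likewise be sketched in code. The function name and argument list are hypothetical, and the outer-tube area $A_{so}$ is computed here assuming a circular outer section.

```python
import math

def han_axial_capacity(D_o, D_i, t_o, A_si, A_c, f_syi, f_syo, f_cu, square=False):
    # Axial capacity of a CFST column with a hole, per Eqs. (7)-(15).
    f_ck = 0.67 * f_cu                                       # normal-strength concrete
    A_c_nom = math.pi * (D_o - 2.0 * t_o) ** 2 / 4.0         # Eq. (14)
    A_so = math.pi * (D_o ** 2 - (D_o - 2.0 * t_o) ** 2) / 4.0  # circular outer tube (assumption)
    alpha = A_so / A_c                                       # Eq. (10)
    alpha_n = A_so / A_c_nom                                 # Eq. (11)
    C1 = alpha / (1.0 + alpha)
    C2 = (1.0 + alpha_n) / (1.0 + alpha)
    chi = D_i / (D_o - 2.0 * t_o)                            # Eq. (12): void ratio
    xi = f_syo * A_so / (f_ck * A_c_nom)                     # Eq. (13): confinement factor
    if square:
        f_osc = C1 * chi ** 2 * f_syo + C2 * (1.18 + 0.85 * xi) * f_ck   # Eq. (9)
    else:
        f_osc = C1 * chi ** 2 * f_syo + C2 * (1.14 + 1.02 * xi) * f_ck   # Eq. (8)
    A_soc = A_so + A_c                                       # Eq. (15)
    return f_syi * A_si + f_osc * A_soc                      # Eq. (7)
```

For identical inputs, the circular-void coefficient set (Eq. (8)) yields a slightly higher $f_{osc}$ than the square-void set (Eq. (9)) once $\xi$ is moderate, since $1.14 + 1.02\xi > 1.18 + 0.85\xi$ for $\xi > 0.235$.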

3 Machine learning techniques

3.1 Kolmogorov-Arnold network

The KAN [43] is inspired by a fundamental theorem stating that any continuous multivariate function defined over a bounded domain can be decomposed into a finite sum of continuous univariate functions. Unlike traditional multi-layer perceptrons (MLPs), where activation functions are applied at the nodes, KAN applies learnable activation functions along the edges. This structural distinction enables KAN not only to capture complex compositional patterns in data but also to achieve high predictive accuracy by leveraging spline-like behavior. As a result, KAN offers enhanced interpretability and precision compared to conventional MLPs, presenting a promising direction for advancing deep learning architectures rooted in MLPs.

Vladimir Arnold and Andrey Kolmogorov demonstrated that any continuous function of several variables, defined on a bounded domain, can be constructed using only a finite number of single-variable continuous functions and addition operations. In particular, a smooth function $f: [0,1]^n \to \mathbb{R}$ can be decomposed into a composition of univariate functions and binary addition:

$f(x) = f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left[\sum_{p=1}^{n} \phi_{q,p}(x_p)\right].$

Let $\phi_{q,p}: [0,1] \to \mathbb{R}$ and $\Phi_q: \mathbb{R} \to \mathbb{R}$ be univariate functions, where the overall summation forms a multivariate operator. Both $\Phi_q$ and $\phi_{q,p}$ are functions of a single variable. Equation (16) can be interpreted as a two-stage procedure: first, each input component passes through its own set of nonlinear univariate activation functions; then, the outputs are aggregated through summation. This perspective lays the foundation for defining a family of KAN models. These models resemble MLPs in that they represent a mapping between two subsets of Euclidean spaces, say $A \subset \mathbb{R}^n$ and $B \subset \mathbb{R}^{n'}$. The functions $\{\phi_{ij},\, 1 \le i \le n',\, 1 \le j \le n\}$ form the learnable parameters of this model class. To proceed, we calculate the transformed output $x'$ for each input $x \in A$ as follows:

$x'_i = \sum_{j=1}^{n} \phi_{ij}(x_j).$

The resulting architecture functions as a universal approximator. Equation (16) achieves this by stacking two layers: the first transforms an input of dimension $n$ into an output of dimension $2n+1$, while the second maps from dimension $2n+1$ to a scalar output. Since the universal approximation theorem for MLPs typically demands a large or potentially infinite number of neurons, this construction suggests a more efficient alternative. Specifically, the number of univariate functions required to approximate any continuous multivariate function from $[0,1]^n$ to $\mathbb{R}^n$ is bounded by $(2n^2 + n) \times n$. Nevertheless, as highlighted in the original work, these univariate functions can be extremely complex, which can hinder their learnability and practical implementation.
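A minimal sketch of one Kolmogorov-Arnold layer (Eq. (17)) may clarify the structure: every edge $(i, j)$ carries its own univariate function, and each output node simply sums the transformed inputs. The fixed edge functions below are illustrative stand-ins for the learnable splines of a real KAN.

```python
import math

def kan_layer(x, phi):
    # Eq. (17): x'_i = sum_j phi_ij(x_j); phi[i][j] is the univariate
    # function on the edge from input j to output node i.
    return [sum(phi_ij(x_j) for phi_ij, x_j in zip(row, x)) for row in phi]

# Illustrative fixed edge functions (stand-ins for learnable splines),
# mapping R^2 -> R^3.
phi = [
    [math.sin, lambda t: t ** 2],
    [math.tanh, math.exp],
    [abs, lambda t: 2.0 * t],
]
y = kan_layer([0.5, -1.0], phi)
```

Stacking two such layers with widths $n \to 2n+1 \to 1$ reproduces the two-stage composition of Eq. (16).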

3.2 Artificial neural networks

The concept of ANN originated in biology, inspired by the way nerve cells (neurons) function and communicate in the human body [44]. An ANN model has three main components: the input layer, the hidden layer(s), and the output layer, as shown in Fig. 4. The input layer receives signals from the external environment, the output layer delivers the prediction or classification results, and the hidden layers act as intermediate processing stages, helping the model learn and analyze data in depth. The number of hidden layers can vary, from one simple layer to many more complex layers, depending on the nature and requirements of each specific problem. An important feature of ANNs is the use of nonlinear activation functions, which map inputs to outputs as signals propagate through the hidden layers. These activation functions provide the ability to model nonlinear and complex relationships in the data, making the ANN a powerful tool for prediction and classification. Equation (18) expresses this process mathematically, clarifying how the inputs are transformed through the hidden layers to produce the desired outputs. More details about the ANN model can be found in Ref. [44].

$net_k = \sum_j w_{kj} o_j \quad \text{and} \quad y_k = f(net_k),$

where $net_k$ represents the activation input of neuron $k$, $j$ runs over the neurons in the preceding layer, $w_{kj}$ denotes the weight connecting neuron $j$ to neuron $k$, $o_j$ is the output of neuron $j$, and $y_k$ is the output of neuron $k$ obtained by applying the activation function $f$, typically a sigmoid or logistic function:

$f(net_k) = \dfrac{1}{1 + e^{-\lambda\, net_k}},$

where $\lambda$ controls the slope of the function.

The weight $w_{kj}$ is trained and updated using Eq. (20):

$w_{kj}(t) = w_{kj}(t-1) + \Delta w_{kj}(t).$

The weight increment $\Delta w_{kj}(t)$ is:

$\Delta w_{kj}(t) = \eta \delta_{pj} o_{pj} + \alpha \Delta w_{kj}(t-1),$

where $\eta$ is the learning rate, $\delta_{pj}$ is the back-propagated error, $o_{pj}$ is the output of neuron $j$ for record $p$, $\alpha$ is the momentum coefficient, and $\Delta w_{kj}(t-1)$ is the weight change from the previous iteration.
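The forward pass (Eqs. (18) and (19)) and the momentum-based weight update (Eqs. (20) and (21)) can be sketched as follows; the helper names and the default values of $\eta$ and $\alpha$ are illustrative assumptions.

```python
import math

def sigmoid(net, lam=1.0):
    # Eq. (19): f(net) = 1 / (1 + exp(-lambda * net))
    return 1.0 / (1.0 + math.exp(-lam * net))

def neuron_output(weights, inputs, lam=1.0):
    # Eq. (18): net_k = sum_j w_kj * o_j ; y_k = f(net_k)
    net = sum(w * o for w, o in zip(weights, inputs))
    return sigmoid(net, lam)

def update_weight(w_prev, delta_pj, o_pj, dw_prev, eta=0.1, alpha=0.9):
    # Eqs. (20)-(21): gradient step scaled by the learning rate eta,
    # plus a momentum term alpha * (previous weight change).
    dw = eta * delta_pj * o_pj + alpha * dw_prev
    return w_prev + dw, dw
```

The momentum term reuses the previous increment $\Delta w_{kj}(t-1)$, which damps oscillations and speeds convergence along consistent gradient directions.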

4 Proposing a new hybrid model

4.1 Extreme gradient boosting

XGBoost is a powerful algorithm built upon the gradient boosting decision tree framework [45], offering significant improvements in model performance through gradient-based optimization. Leveraging the principles of classification and regression trees, XGBoost has proven to be an effective approach for both regression and classification tasks [46–48]. It can also be considered a soft computing library that integrates advanced algorithmic strategies with gradient boosted decision trees.

The optimized objective function in XGBoost includes two key components: one measuring the model’s prediction error and the other serving as a regularization term to mitigate overfitting [49]. Let $D = \{(x_i, y_i)\}$ denote the data set with $n$ samples and $m$ features. The model’s prediction is formulated as an additive ensemble of $K$ base learners. The prediction for a given sample is computed as follows:

$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \varphi,$

$\varphi = \{f(x) = w_{s(x)}\} \quad (s: \mathbb{R}^m \to T,\; w_s \in \mathbb{R}^T),$

where $\hat{y}_i$ denotes the predicted value for the $i$th sample, and $x_i$ refers to the input features of that sample. The prediction score of each base learner is $f_k(x_i)$, where each $f_k$ is a function in the ensemble. The symbol $\varphi$ represents the set of regression trees, which encodes the tree structure $s$, the mapping function $f(x)$, and the leaf weights $w$.

The objective function in XGBoost combines the standard loss function with a regularization term that accounts for model complexity. This formulation allows for evaluating both the predictive performance and the computational efficiency of the algorithm. In Eq. (24), the first term captures the traditional loss (e.g., mean squared error), while the second term penalizes overly complex models to help prevent overfitting.

$Obj = \sum_{i=1}^{m} l\!\left[y_i,\; \hat{y}_i^{(t-1)} + f_k(x_i)\right] + \Omega(f_k),$

$\Omega(f_k) = \gamma T + \dfrac{1}{2}\lambda \|w\|^2,$

where $i$ indexes the samples in the data set, $m$ refers to the number of instances used in constructing the $k$th decision tree, and $T$ is the number of leaves. The parameters $\gamma$ and $\lambda$ are regularization coefficients that help control the complexity of the model. Specifically, the regularization term plays a crucial role in smoothing the final learned weights, thereby reducing the risk of overfitting.
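Under a squared-error loss, the objective of Eqs. (24) and (25) can be computed directly; the function names and default coefficient values below are illustrative assumptions.

```python
def xgb_regularization(leaf_weights, gamma=0.1, lam=1.0):
    # Eq. (25): Omega(f) = gamma * T + 0.5 * lambda * ||w||^2,
    # where T is the number of leaves of the tree.
    T = len(leaf_weights)
    return gamma * T + 0.5 * lam * sum(w * w for w in leaf_weights)

def xgb_objective(y_true, y_pred_prev, tree_outputs, leaf_weights, gamma=0.1, lam=1.0):
    # Eq. (24) with l(y, y_hat) = (y - y_hat)^2: the new tree's output is
    # added to the previous round's prediction before the loss is evaluated.
    loss = sum((y - (p + f)) ** 2
               for y, p, f in zip(y_true, y_pred_prev, tree_outputs))
    return loss + xgb_regularization(leaf_weights, gamma, lam)
```

The $\gamma T$ term penalizes adding leaves, while the $\frac{1}{2}\lambda\|w\|^2$ term shrinks the leaf weights, which is how the algorithm trades accuracy on the training set against model complexity.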

4.2 Development of combined modeling approaches

4.2.1 Exponential-trigonometrie optimized extreme gradient boosting model

In engineering applications, many studies have been conducted to search for the optimal values of hyperparameters in ML models to improve prediction accuracy [25,50,51]. In this study, a hybrid system named ETO-XGBoost was developed to predict the ultimate axial load capacity of CFST columns. The system consists of the integration of XGBoost ML model and the ETO algorithm [39].

The ETO algorithm is organized into four main stages: a boundary control strategy, two key operational phases (exploration and exploitation), and a coordination mechanism for transitioning between them. Initially, ETO generates a random population of candidate solutions and begins by exploring the search space. During this phase, the search boundaries are dynamically updated to maintain flexibility and improve search effectiveness. The algorithm either guides agents toward the best-known solution from afar or intensifies the search locally around the current agents. If satisfactory improvement is not achieved, the algorithm proceeds to a more focused search by diving deeper into unexplored regions. The boundary search method is then used to reposition all agents within promising zones, followed by another cycle of exploration and exploitation. Throughout this process, two scalar parameters, $d_1$ and $d_2$, play a significant role in adjusting the search region, helping the algorithm converge faster and with greater precision. Furthermore, ETO incorporates a switching mechanism that allows smooth transitions between the exploration and exploitation phases. This dynamic adjustment not only helps maintain search balance but also improves the algorithm’s ability to avoid being trapped in local optima. The full operational details of ETO are described in the pseudocode (Algorithms 1–4).

In this proposed ETO-XGBoost hybrid model, four important hyperparameters of the XGBoost model are considered in the optimization process: number of trees (n_estimators), learning rate, maximum tree depth (max_depth), and subsample rate (subsamples). The ETO algorithm acts as a global search engine to determine the optimal hyperparameter combination for the XGBoost model. The process begins by randomly initializing a population of hyperparameter combinations in a predetermined search space. Then, at each iteration, the algorithm evaluates the quality of each individual through the objective function RMSE, which is calculated by training the XGBoost model with the corresponding hyperparameter combination and measuring the error on the test set. Based on the current best individual, the remaining individuals will update their positions through control formulas designed to balance exploration and exploitation of the search space. The optimization loop continues until the maximum number of iterations is reached or the objective function converges. The hyperparameter set with the lowest RMSE is chosen to build the final XGBoost model, helping improve both accuracy and reliability.
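The hyperparameter-tuning loop described above can be sketched as follows. The actual ETO position updates are given in Algorithms 1–4 and are replaced here by a simplified move-toward-best perturbation; the search-space bounds and the RMSE objective (a placeholder standing in for a trained XGBoost model) are likewise illustrative assumptions.

```python
import random

SEARCH_SPACE = {                 # bounds assumed for illustration only
    "n_estimators": (100.0, 1000.0),
    "learning_rate": (0.01, 0.3),
    "max_depth": (2.0, 10.0),
    "subsample": (0.5, 1.0),
}

def random_candidate():
    return {k: random.uniform(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}

def rmse_of(params):
    # Placeholder objective: in the paper this trains XGBoost with `params`
    # (integer parameters rounded) and returns the cross-validated RMSE.
    return (params["learning_rate"] - 0.07) ** 2 + (params["subsample"] - 0.5) ** 2

def optimize(pop_size=50, iterations=100):
    pop = [random_candidate() for _ in range(pop_size)]
    best = min(pop, key=rmse_of)
    for _ in range(iterations):
        for i, cand in enumerate(pop):
            # Stand-in for the ETO exploration/exploitation update:
            # move each agent toward the best-known solution with random jitter.
            new = {k: cand[k] + random.uniform(-1.0, 1.0) * (best[k] - cand[k])
                   for k in cand}
            # Boundary control: clamp every coordinate back into the search space.
            new = {k: min(max(new[k], lo), hi) for k, (lo, hi) in SEARCH_SPACE.items()}
            if rmse_of(new) < rmse_of(cand):
                pop[i] = new
        best = min(pop, key=rmse_of)
    return best
```

Only the acceptance rule (keep a move when its RMSE improves) and the boundary clamping are essential to the sketch; swapping the perturbation line for the true ETO equations recovers the ETO-XGBoost scheme.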

4.2.2 Whale optimization algorithm extreme gradient boosting model

Another hybrid model considered in this study is WOA-XGBoost, which is developed to predict the bearing capacity of CFST columns. This model integrates the XGBoost ML model with the whale optimization algorithm (WOA) [52].

The WOA is a population-based metaheuristic inspired by the foraging behavior of humpback whales. Their unique hunting strategy, known as bubble-net feeding, can be broken into three stages: initially encircling the prey, followed by creating bubble nets to trap it, and finally pinpointing the prey’s exact location, as illustrated in Fig. 5. During this process, whales utilize both contractive movement and spiral trajectories to update their positions [53].

In the encircling phase, whales are believed to estimate the location of their prey through echolocation and adjust their positions accordingly to surround it. The position update at iteration k is modeled using the following equation:

$Q^{k+1} = Q_{best}^{k} - (2a\rho_1 - a)\left|2\rho_2 Q_{best}^{k} - Q^{k}\right|,$

where $Q^k$ represents the position vector of a whale at the $k$th iteration, and the best position identified so far is denoted $Q_{best}^{k} = (Q_{best,1}^{k}, Q_{best,2}^{k}, \ldots, Q_{best,D}^{k})$, with $D$ representing the dimensionality of the search space. The values $\rho_1$ and $\rho_2$ are random numbers uniformly sampled from the interval $[0,1]$. The parameter $a$, which linearly decreases from 2 to 0 over time, is used to regulate the exploration-exploitation balance and is calculated using a specific control function:

$a = 2 - \dfrac{2k}{k_{max}}.$

As previously discussed, the WOA simulates both shrinking encirclement and spiral updating during the bubble-net hunting behavior. As the value of a decreases over iterations as shown in Eq. (27), whales narrow their search around the best solution, gradually encircling the prey. This results in a new position Qk+1 that lies between the current position Qk and the best-known position Qbestk. This movement models the contraction behavior of whales during hunting. Additionally, a mathematical model of spiral motion is used to simulate the swirling pattern whales follow during this stage.

Following the same principle as the ETO-XGBoost hybrid model, the WOA-XGBoost hybrid model also optimizes the hyperparameters of the XGBoost model. The algorithm initializes a set of random individuals in a predetermined search space. At each iteration, the RMSE value corresponding to each individual is calculated by training the XGBoost model with that combination of hyperparameters. The individual with the lowest RMSE is identified as the “leader whale” and is considered the temporary optimal solution. WOA then updates the positions of the other individuals in the population based on three main behaviors: 1) shrinking the encirclement around the leader; 2) moving in a spiral toward the leader; 3) exploring the search space by moving relative to a randomly selected individual when the exploration condition is satisfied. These mechanisms balance exploration and exploitation of the search space and help avoid local optima. The process is repeated until the maximum number of iterations is reached or the objective function converges. The result is the hyperparameter combination with the lowest RMSE, which is used to train the final XGBoost model, thereby ensuring the highest accuracy in predicting the bearing capacity of CFST columns.
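A minimal sketch of the WOA encircling update (Eqs. (26) and (27)) on a generic objective is shown below. The spiral and random-exploration branches of the full algorithm are omitted for brevity, and the function name and parameter defaults are illustrative.

```python
import random

def woa_minimize(f, dim, bounds, pop_size=30, k_max=200):
    # Simplified WOA: only the shrinking-encirclement update of Eq. (26).
    lo, hi = bounds
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=f)[:]
    for k in range(k_max):
        a = 2.0 - 2.0 * k / k_max                 # Eq. (27): a decreases from 2 to 0
        for i, Q in enumerate(pop):
            rho1, rho2 = random.random(), random.random()
            A = 2.0 * a * rho1 - a                # |A| > 1 early on -> exploration
            # Eq. (26): move relative to the best-known position Q_best
            new = [qb - A * abs(2.0 * rho2 * qb - q) for q, qb in zip(Q, best)]
            pop[i] = [min(max(x, lo), hi) for x in new]   # keep agents in bounds
        cand = min(pop, key=f)
        if f(cand) < f(best):                     # leader is replaced only on improvement
            best = cand[:]
    return best
```

As $a$ shrinks, the step factor $A$ shrinks with it, so agents contract onto the leader, which mirrors the encircling behavior described above.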

5 Discussion

5.1 Model setup

The data are randomly divided into ten parts using the k-fold cross-validation method. Figure 6 illustrates the process of training and testing the ML models using this method. Specifically, each model goes through ten rounds, each of which uses nine parts of the data for training and the remaining part for testing. This approach allows the entire data set to be used for testing in turn, thereby evaluating the accuracy of the model objectively and comprehensively. In each round, the model is adjusted on the training data to improve its predictive ability, while the testing data are used to evaluate actual performance. This method not only increases the reliability of the evaluation but also helps the model achieve better prediction performance on unseen data.
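The 10-fold splitting scheme described above can be sketched as follows; the function name and the fixed shuffle seed are illustrative assumptions.

```python
import random

def kfold_indices(n, k=10, seed=42):
    # Shuffle the sample indices once, then split them into k nearly equal folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    # Each round: one fold is held out for testing, the other k-1 train the model.
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

Because every fold serves exactly once as the test set, each of the $n$ samples is tested exactly once over the ten rounds.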

After training, the ML models are applied to predict the axial load-bearing capacity of CFST columns on the test data. The predicted results are compared with the actual values obtained from the experiment to evaluate the suitability of each model. The prediction performance is analyzed through statistical indicators: MAE, mean absolute percentage error (MAPE), RMSE, and the correlation coefficient (R). These indicators capture both the deviation and the correlation between the predicted and actual values, thereby reflecting the accuracy and reliability of the model. The formulas for calculating these indicators are given in Eqs. (28) to (31).

$RMSE = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(\hat{N}_{max} - N_{max}\right)^2},$

$MAE = \dfrac{1}{n}\sum_{i=1}^{n}\left|\hat{N}_{max} - N_{max}\right|,$

$MAPE = \dfrac{1}{n}\sum_{i=1}^{n}\left|\dfrac{\hat{N}_{max} - N_{max}}{N_{max}}\right| \times 100\%,$

$R = \dfrac{n\sum \hat{N}_{max} N_{max} - \left(\sum \hat{N}_{max}\right)\left(\sum N_{max}\right)}{\sqrt{n\sum \hat{N}_{max}^2 - \left(\sum \hat{N}_{max}\right)^2}\,\sqrt{n\sum N_{max}^2 - \left(\sum N_{max}\right)^2}},$

where $\hat{N}_{max}$ and $N_{max}$ are the predicted and actual axial capacities of the CFST column, respectively, and $n$ is the number of samples.
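The four indicators of Eqs. (28)–(31) can be computed in a few lines; the function name is illustrative, and MAPE is returned here as a percentage.

```python
import math

def metrics(actual, predicted):
    # Eqs. (28)-(31): RMSE, MAE, MAPE (%), and Pearson correlation R.
    n = len(actual)
    rmse = math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)
    mae = sum(abs(p - a) for a, p in zip(actual, predicted)) / n
    mape = sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / n * 100.0
    # Pearson R from the raw sums, matching the expanded form of Eq. (31).
    sa, sp = sum(actual), sum(predicted)
    sap = sum(a * p for a, p in zip(actual, predicted))
    sa2 = sum(a * a for a in actual)
    sp2 = sum(p * p for p in predicted)
    r = (n * sap - sa * sp) / math.sqrt((n * sa2 - sa ** 2) * (n * sp2 - sp ** 2))
    return rmse, mae, mape, r
```

A perfect prediction gives zero error indices and $R = 1$, which is a convenient sanity check for any implementation.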

A total of 1020 hollow CFST column samples were obtained from both ABAQUS simulation results and the analytical expression (Eq. (7)) introduced by Han et al. [41]. These specimens incorporated both normal-strength ($f_c \le 60$ MPa) and high-strength concrete (60 MPa $< f_c \le$ 120 MPa). Table 1 summarizes the key parameters of the data set, including the concrete compressive strength ($f_{c,cyl}$), the yield strengths of the inner ($f_{yi}$) and outer ($f_{yo}$) steel tubes, the inner ($d$) and outer ($D$) column diameters, the thicknesses of the inner ($t_i$) and outer ($t_o$) steel tubes, and the column length ($L$). These factors are considered essential predictors for the model. Additionally, the distributions of these properties are illustrated in Fig. 7.

Figure 8 provides a clear depiction of the cross-sectional geometry of a perforated hollow CFST column. The compressive strength of the concrete varies between 41 and 78 MPa. For the steel components, the yield strength ranges from 213 to 544 MPa for the inner tube and from 256 to 566 MPa for the outer tube. The inner column diameter lies within 90 to 206 mm, while the outer diameter spans 154 to 332 mm. The wall thickness ranges from 0.541 to 3.647 mm for the inner steel tube and from 1.273 to 4.504 mm for the outer steel tube. The ultimate bearing capacity of the specimens ranges from approximately 1053.693 to 5221.531 kN.

5.2 Comparison of results

For the proposed ETO-XGBoost hybrid model, a survey over multiple population sizes was conducted with 10-fold cross-validation. The population sizes considered were 10, 20, 30, 50, and 100. In these runs, the RMSE objective function served as the stopping criterion once it reached convergence. As shown in Fig. 9, the RMSE values stabilized after about the 20th iteration. The RMSE stabilized earliest at a population size of 50, and the lowest RMSE was obtained for population sizes of 50 to 100. Finally, the optimal ETO-XGBoost hybrid model was found with the following parameters: n_estimators = 500, learning rate = 0.0689, max_depth = 3, subsample = 0.5, and best RMSE = 63.3738.

Similar to the ETO-XGBoost hybrid model, the WOA-XGBoost model was investigated with different population sizes using 10-fold cross-validation; specifically, population sizes of 10, 20, 30, 50, and 100 were considered. In all these runs, the RMSE objective function was used as the stopping criterion once its value converged. The results in Fig. 10 show that the RMSE values tend to stabilize after about the 10th iteration. Notably, convergence was fastest with a population size of 50, and the lowest RMSE values were achieved for population sizes between 50 and 100. From this, the optimal WOA-XGBoost hybrid model was found with the following parameters: n_estimators = 500, learning rate = 0.0952, max_depth = 3, subsample = 0.5977, and best RMSE = 67.8958.
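The population-based tuning loop described above can be sketched generically. The snippet below is a simplified stand-in, not the authors' ETO or WOA implementation: the objective cv_rmse is a hypothetical placeholder for the cross-validated RMSE of an XGBoost model, and the update rule only mimics the exploration-exploitation idea (move each candidate toward the current best, plus a random perturbation):

```python
import random

# Hypothetical search bounds mirroring the tuned XGBoost hyperparameters
BOUNDS = {"learning_rate": (0.01, 0.3), "max_depth": (3, 10), "subsample": (0.5, 1.0)}

def cv_rmse(params):
    # Placeholder objective: in the study this would be the mean RMSE over
    # 10-fold cross-validation of an XGBoost model with these hyperparameters.
    return ((params["learning_rate"] - 0.07) ** 2
            + (params["max_depth"] - 3) ** 2
            + (params["subsample"] - 0.6) ** 2)

def optimize(pop_size=50, iters=30, seed=0):
    rng = random.Random(seed)
    def sample():
        return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}
    pop = [sample() for _ in range(pop_size)]
    best = min(pop, key=cv_rmse)
    for _ in range(iters):
        new_pop = []
        for p in pop:
            q = {}
            for k, (lo, hi) in BOUNDS.items():
                # Drift toward the best candidate (exploitation) plus
                # Gaussian noise (exploration); clipped to the bounds.
                step = rng.random() * (best[k] - p[k]) + rng.gauss(0, 0.05 * (hi - lo))
                q[k] = min(hi, max(lo, p[k] + step))
            new_pop.append(q)
        pop = new_pop
        cand = min(pop, key=cv_rmse)
        if cv_rmse(cand) < cv_rmse(best):
            best = cand  # keep the best solution found so far
    return best, cv_rmse(best)

best, score = optimize()
```

In practice max_depth would be rounded to an integer before training, and the population size and iteration count would be surveyed as in Figs. 9 and 10.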

In addition, to illustrate the generalization ability and the control of overfitting, the learning curves of the two models are presented in Fig. 11. The horizontal axis is the number of boosting rounds (iterations) and the vertical axis is the RMSE. In both models, RMSE (train) and RMSE (validation) decrease rapidly in the early stages and then level off as the number of rounds increases; the gap between the two curves is small and does not widen over time, indicating that the risk of overfitting is well controlled by regularization, subsampling, and early stopping. The results reported below confirm that the models generalize well to unseen data.

In this study, the two proposed hybrid models, ETO-XGBoost and WOA-XGBoost, were used to predict the bearing capacity of CFST columns and then compared with the KAN, ANN, and Eurocode 4 models. The prediction results are presented in Table 2, with the following R values: 0.9963 for ETO-XGBoost, 0.9956 for WOA-XGBoost, 0.9839 for KAN, 0.9859 for ANN, and 0.9785 for Eurocode 4. These R values are all close to 1.00, indicating a strong correlation between the actual data and the predicted axial compressive strength of CFST columns. In particular, the ETO-XGBoost and WOA-XGBoost models, with the highest R values of 0.9963 and 0.9956, respectively, outperformed the other models, confirming their more accurate prediction performance.

The metrics in Table 2 are computed on independent test sets in a 10-fold cross-validation protocol: in each round, the model is trained on the nine remaining folds and evaluated on the held-out fold. In each iteration of the cross-validation, the hyperparameters are tuned entirely on the training data, and the performance is then computed on the retained test fold, which is never used during training or tuning. Therefore, all reported errors and correlations quantify the models' ability to generalize to unseen data. There are significant differences in the RMSE, MAE, and MAPE indices among the ML models. Notably, the ETO-XGBoost model has the lowest errors, with an RMSE of 63.3738 kN, an MAE of 47.4643 kN, and a MAPE of 1.8221%, outperforming the other ML models presented in Table 2. The WOA-XGBoost model ranked second, with a correlation coefficient R of 0.9956, an RMSE of 67.8958 kN, an MAE of 49.1825 kN, and a MAPE of 1.9040%. From these results, it can be concluded that the ETO-XGBoost and WOA-XGBoost models exhibit the best predictive performance and best fit the data of circular CFST columns with holes. Figure 12 illustrates the actual and predicted values of the ML models, showing the high accuracy and correlation of these predictions. Among the five models, the ETO-XGBoost and WOA-XGBoost predictions lie closer to the actual values than those of the other two ML models. Figures 13-15 present the comparison of MAE, MAPE, and R between the models and the Eurocode 4 design standard.
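The leakage-free evaluation protocol can be sketched as follows; the fold count matches the study's 10-fold setup, while the seed and the commented tuning/fitting calls are illustrative placeholders:

```python
import random

def kfold_indices(n, k=10, seed=42):
    """Split n sample indices into k disjoint, shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# In each of the 10 rounds, the held-out fold is used only for the final
# evaluation; any hyperparameter tuning happens inside the training portion.
folds = kfold_indices(1020, k=10)
for test_fold in folds:
    train_idx = [i for f in folds if f is not test_fold for i in f]
    # tune_hyperparameters(train_idx); fit(train_idx); evaluate(test_fold)
    assert not set(train_idx) & set(test_fold)  # no leakage into the test fold
```

With 1020 samples, each of the 10 folds holds exactly 102 samples, so every model is trained on 918 samples per round.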

In addition, Fig. 16 shows the SHAP summary plot on the independent test set. The horizontal axis is the SHAP value; points to the right (left) of zero indicate feature values that increase (decrease) the prediction, and the colors encode the feature value from low to high. It can be seen that D makes the largest contributions in both directions, followed by d and to; the material variables fc and fyo have significant but smaller effects, while L is mostly concentrated around zero. Figure 17 summarizes the relative importance: D ≈ 49%, d ≈ 13%, to ≈ 11%, fc ≈ fyo ≈ 9%, ti ≈ 6%, fyi ≈ 3%, and L ≈ 1%. The observed directions of impact are consistent with the structural mechanics of CFST columns: increasing cross-sectional dimensions and material strengths increases compressive capacity, while increasing length reduces capacity through the slenderness effect, which strengthens confidence in applying the model.
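A Fig. 17-style relative-importance summary is obtained by averaging the absolute SHAP values per feature and normalizing to percentages. The SHAP matrix below is synthetic, with column scales chosen only to mimic the reported ranking (it is not the paper's SHAP output):

```python
import numpy as np

def relative_importance(shap_values, names):
    """Mean |SHAP| per feature, normalized to percentages."""
    mean_abs = np.abs(shap_values).mean(axis=0)
    pct = 100.0 * mean_abs / mean_abs.sum()
    return dict(zip(names, pct))

# Synthetic SHAP-value matrix (samples x features), not the paper's data
rng = np.random.default_rng(0)
names = ["D", "d", "to", "fc", "fyo", "ti", "fyi", "L"]
scales = np.array([49, 13, 11, 9, 9, 6, 3, 1], dtype=float)  # mimic reported ranking
shap_values = rng.normal(size=(200, 8)) * scales
imp = relative_importance(shap_values, names)
```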

Combining the ETO and WOA algorithms with XGBoost improves the prediction performance because it addresses the biggest drawback of XGBoost: sensitivity to hyperparameters (learning_rate, max_depth, n_estimators, etc.). If tuned manually, the model is prone to overfitting or to under-exploiting its potential. The ETO and WOA algorithms provide an efficient exploration-exploitation mechanism: starting from randomly initialized populations, they scan widely to detect promising parameter regions (exploration), then converge finely in the neighborhood of good solutions (exploitation) based on their update rules (the encircling/spiral moves of WOA; the driving mechanism and update trajectory of ETO). When the optimization objective is the RMSE, this process automatically selects a parameter combination that balances fit and generalization, limiting overfitting and underfitting and helping the model fully exploit the advantages of boosting in modeling multidimensional nonlinear relationships. As a result, the RMSE, MAE, and MAPE indices all decreased sharply and the coefficient R increased, showing that the ETO-XGBoost and WOA-XGBoost hybrid models outperform plain XGBoost.
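The encircling and logarithmic-spiral moves of WOA [52], with the control parameter a decreasing from 2 to 0, can be sketched on a toy one-dimensional objective. This is a simplified illustration: the random-agent search branch of the full algorithm is omitted, and the quadratic objective merely stands in for the cross-validated RMSE:

```python
import math
import random

def woa_step(x, best, a, rng):
    """One simplified WOA position update for candidate x toward the best solution.

    With probability 0.5 the encircling rule is applied, otherwise the
    logarithmic-spiral (bubble-net) rule; as a decreases from 2 to 0, the
    balance shifts from exploration to exploitation.
    """
    if rng.random() < 0.5:
        A = 2 * a * rng.random() - a          # shrinks toward 0 as a -> 0
        C = 2 * rng.random()
        D = abs(C * best - x)                 # distance to the (scaled) best
        return best - A * D                   # encircling the prey
    l = rng.uniform(-1, 1)
    D = abs(best - x)
    return D * math.exp(l) * math.cos(2 * math.pi * l) + best  # spiral move

# Toy 1-D objective standing in for the cross-validated RMSE
f = lambda x: (x - 3.0) ** 2
rng = random.Random(1)
pop = [rng.uniform(-10.0, 10.0) for _ in range(20)]
best = min(pop, key=f)
for t in range(100):
    a = 2.0 * (1 - t / 100)                   # linearly decreasing control parameter
    pop = [woa_step(x, best, a, rng) for x in pop]
    cand = min(pop, key=f)
    if f(cand) < f(best):
        best = cand                           # keep the best solution found so far
```

The same loop applied to a vector of XGBoost hyperparameters, with cross-validated RMSE as f, is the essence of the WOA-XGBoost hybrid.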

In summary, the two hybrid models ETO-XGBoost and WOA-XGBoost show high potential for application in the analysis, design, and safety assessment of compressive structures, especially CFST columns in wharves, high-rise buildings, and heavy infrastructure. Thanks to the strong nonlinear learning ability of XGBoost combined with automatic hyperparameter optimization by ETO/WOA, the models can quickly predict the axial load capacity and damage trend of composite structures, and they are easy to integrate into digital design platforms and smart design-support software for real-time assessment of load capacity and safety. However, in this study the models were trained mainly on axially compressed CFST data, so their current scope of application is limited to configurations, material parameter ranges (concrete and steel), and loading conditions within the training data domain; generalization to all elastic-plastic responses of other materials or loading boundaries cannot be guaranteed when the distribution shifts from the original data. To increase generalizability, we will add representative data, perform out-of-sample validation on independent and cross-domain data sets with uncertainty quantification and model calibration, and extend the models to diverse materials and loading conditions (eccentric, cyclic/dynamic), aiming to develop a reliable and general prediction tool for modern engineering applications.

6 Conclusions

In this paper, two hybrid models, ETO-XGBoost and WOA-XGBoost, are proposed to predict the bearing capacity of hollow CFST columns. To assess the accuracy and reliability of the proposed models, their results are compared with those of the existing KAN and ANN models, as well as with the Eurocode 4 design standard, an important international standard in the construction field. The data set required for these ML models is collected through ABAQUS simulations and empirical formulas from previous studies. The prediction results of the ML models, including the correlation coefficient R, MAE, and MAPE, are obtained. The research results clearly demonstrate that the ETO-XGBoost and WOA-XGBoost models have significantly superior performance compared with the other models studied. In particular, the correlation coefficients of the ETO-XGBoost and WOA-XGBoost models are both above 0.99, indicating a strong and stable correlation between the predicted and actual values. Moreover, the MAE and MAPE indices of the ETO-XGBoost and WOA-XGBoost models are the lowest among the models considered, confirming the high accuracy and efficiency of these models. Beyond the ETO-XGBoost and WOA-XGBoost models, the KAN and ANN models also show significant reliability in predicting the bearing capacity of CFST columns; their correlation coefficients are both above 0.98, demonstrating good and reliable prediction ability. This shows that all four models learn and generalize well from the data, providing predictions close to the actual values. In addition, comparing the prediction results of the ML models with the Eurocode 4 design standard further increases the persuasiveness and credibility of these models.

References

[1]

Ekmekyapar T , Al-Eliwi B J M . Experimental behaviour of circular concrete filled steel tube columns and design specifications. Thin-Walled Structures, 2016, 105: 220–230

[2]

Giakoumelis G , Lam D . Axial capacity of circular concrete-filled tube columns. Journal of Constructional Steel Research, 2004, 60(7): 1049–1068

[3]

Xiong M X , Xiong D X , Liew J Y R . Axial performance of short concrete filled steel tubes with high-and ultra-high-strength materials. Engineering Structures, 2017, 136: 494–510

[4]

Wang F , Han L , Li W . Analytical behavior of CFDST stub columns with external stainless steel tubes under axial compression. Thin-Walled Structures, 2018, 127: 756–768

[5]

Chen B C , Wang T L . Overview of concrete filled steel tube arch bridges in China. Practice Periodical on Structural Design and Construction, 2009, 14(2): 70–80

[6]

Han L H , Li W , Bjorhovde R . Developments and advanced applications of concrete-filled steel tubular (CFST) structures: Members. Journal of Constructional Steel Research, 2014, 100: 211–228

[7]

ANSI/AISC 360-99. Specification for Structural Steel Buildings. Chicago, IL: American Institute of Steel Construction, 1999

[8]

Cederwall K , Engstrom B , Grauers M . High-strength concrete used in composite columns. Special Publication, 1990, 121: 195–214

[9]

Mursi M , Uy B . Strength of slender concrete filled high strength steel box columns. Journal of Constructional Steel Research, 2004, 60(12): 1825–1848

[10]

Sakino K , Nakahara H , Morino S , Nishiyama I . Behavior of centrally loaded concrete-filled steel-tube short columns. Journal of Structural Engineering, 2004, 130(2): 180–188

[11]

Liu D , Gho W M , Yuan J . Ultimate capacity of high-strength rectangular concrete-filled steel hollow section stub columns. Journal of Constructional Steel Research, 2003, 59(12): 1499–1515

[12]

Lai Z , Varma A H . High-strength rectangular CFT members: Database, modeling, and design of short columns. Journal of Structural Engineering, 2018, 144(5): 04018036

[13]

Han L H , An Y F . Performance of concrete-encased CFST stub columns under axial compression. Journal of Constructional Steel Research, 2014, 93: 62–76

[14]

Evirgen B , Tuncan A , Taskin K . Structural behavior of concrete filled steel tubular sections (CFT/CFST) under axial compression. Thin-Walled Structures, 2014, 80: 46–56

[15]

Tang D , Gordan B , Koopialipoor M , Jahed Armaghani D , Tarinejad R , Pham B T , Huynh V V . Seepage analysis in short embankments using developing a metaheuristic method based on governing equations. Applied Sciences, 2020, 10(5): 1761

[16]

Mohammed A S , Asteris P G , Koopialipoor M , Alexakis D E , Lemonis M E , Armaghani D J . Stacking ensemble tree models to predict energy performance in residential buildings. Sustainability, 2021, 13(15): 8298

[17]

Asteris P G , Lemonis M E , Le T T , Tsavdaridis K D . Evaluation of the ultimate eccentric load of rectangular CFSTs using advanced neural network modeling. Engineering Structures, 2021, 248: 113297

[18]

Nguyen N H , Vo T P , Lee S , Asteris P G . Heuristic algorithm-based semi-empirical formulas for estimating the compressive strength of the normal and high performance concrete. Construction and Building Materials, 2021, 304: 124467

[19]

Asteris P G , Cavaleri L , Ly H B , Pham B T . Surrogate models for the compressive strength mapping of cement mortar materials. Soft Computing, 2021, 25(8): 6347–6372

[20]

Yang H , Wang Z , Song K . A new hybrid grey wolf optimizer-feature weighted-multiple kernel-support vector regression technique to predict TBM performance. Engineering with Computers, 2022, 38(3): 2469–2485

[21]

Huang L , Asteris P G , Koopialipoor M , Armaghani D J , Tahir M M . Invasive weed optimization technique-based ANN to the prediction of rock tensile strength. Applied Sciences, 2019, 9(24): 5372

[22]

Lu S , Koopialipoor M , Asteris P G , Bahri M , Armaghani D J . A novel feature selection approach based on tree models for evaluating the punching shear capacity of steel fiber-reinforced concrete flat slabs. Materials, 2020, 13(17): 3902

[23]

Asteris P G , Koopialipoor M , Armaghani D J , Kotsonis E A , Lourenço P B . Prediction of cement-based mortars compressive strength using machine learning techniques. Neural Computing and Applications, 2021, 33(19): 13089–13121

[24]

Ren Q , Li M , Zhang M , Shen Y , Si W . Prediction of ultimate axial capacity of square concrete-filled steel tubular short columns using a hybrid intelligent algorithm. Applied Sciences, 2019, 9(14): 2802

[25]

Sarir P , Shen S , Wang Z , Chen J , Horpibulsuk S , Pham B T . Optimum model for bearing capacity of concrete-steel columns with AI technology via incorporating the algorithms of IWO and ABC. Engineering with Computers, 2021, 37(2): 797–807

[26]

Moon J , Kim J J , Lee T H , Lee H E . Prediction of axial load capacity of stub circular concrete-filled steel tube using fuzzy logic. Journal of Constructional Steel Research, 2014, 101: 184–191

[27]

Ahmadi M , Naderpour H , Kheyroddin A . ANN model for predicting the compressive strength of circular steel-confined concrete. International Journal of Civil Engineering, 2017, 15(2): 213–221

[28]

Ahmadi M , Naderpour H , Kheyroddin A . Utilization of artificial neural networks to prediction of the capacity of CCFT short columns subject to short term axial load. Archives of Civil and Mechanical Engineering, 2014, 14(3): 510–517

[29]

Güneyisi E M , Gültekin A , Mermerdaş K . Ultimate capacity prediction of axially loaded CFST short columns. International Journal of Steel Structures, 2016, 16(1): 99–114

[30]

Hamdia K M , Msekh M A , Silani M , Thai T Q , Budarapu P R , Rabczuk T . Assessment of computational fracture models using Bayesian method. Engineering Fracture Mechanics, 2019, 205: 387–398

[31]

Mishra A , Anitescu C , Budarapu P R , Natarajan S , Vundavilli P R , Rabczuk T . An artificial neural network based deep collocation method for the solution of transient linear and nonlinear partial differential equations. Frontiers of Structural and Civil Engineering, 2024, 18(8): 1296–1310

[32]

Jayalekshmi S , Jegadesh J S S , Goel A . Empirical approach for determining axial strength of circular concrete filled steel tubular columns. Journal of The Institution of Engineers, 2018, 99: 257–268

[33]

Vu Q V , Truong V H , Thai H T . Machine learning-based prediction of CFST columns using gradient tree boosting algorithm. Composite Structures, 2021, 259: 113505

[34]

Saadoon A S , Nasser K Z , Mohamed I Q . A neural network model to predict ultimate strength of rectangular concrete filled steel tube beam-columns. Engineering and Technology Journal, 2012, 30(19): 3328–3340

[35]

Saadoon A S , Nasser K Z . Use of neural networks to predict ultimate strength of circular concrete filled steel tube beam-columns. University of Thi-Qar Journal for Engineering Sciences, 2013, 4(2): 48–62

[36]

Nguyen H Q , Ly H B , Tran V Q , Nguyen T A , Le T T , Pham B T . Optimization of artificial intelligence system by evolutionary algorithm for prediction of axial capacity of rectangular concrete filled steel tubes under compression. Materials, 2020, 13(5): 1205

[37]

Sarir P , Chen J , Asteris P G , Armaghani D J , Tahir M M . Developing GEP tree-based, neuro-swarm, and whale optimization models for evaluation of bearing capacity of concrete-filled steel tube columns. Engineering with Computers, 2021, 37(1): 1–19

[38]

Luat N V , Shin J , Lee K . Hybrid BART-based models optimized by nature-inspired metaheuristics to predict ultimate axial capacity of CCFST columns. Engineering with Computers, 2022, 38(2): 1421–1450

[39]

Luan T M , Khatir S , Tran M T , De Baets B , Cuong-Le T . Exponential-trigonometric optimization algorithm for solving complicated engineering problems. Computer Methods in Applied Mechanics and Engineering, 2024, 432: 117411

[40]

EN 1994-1-1. Eurocode 4: Design of Composite Steel and Concrete Structures—Part 1-1: General Rules and Rules for Buildings. Brussels: European Committee for Standardization, 2004

[41]

Han L H , Ren Q X , Li W . Test on stub stainless steel-concrete-carbon steel double-skin tubular (DST) columns. Journal of Constructional Steel Research, 2011, 67(3): 437–452

[42]

Luan T M , Khatir S , Cuong-Le T . Concrete damage plastic model for high strength-concrete: Applications in reinforced concrete slab and CFT columns. Iranian Journal of Science and Technology, Transactions of Civil Engineering, 2025, 49(4): 3529–3548

[43]

Liu Z , Wang Y , Vaidya S , Ruehle F , Halverson J , Soljačić M , Hou T Y , Tegmark M . KAN: Kolmogorov–Arnold networks. 2024, arXiv:2404.19756

[44]

Sharma V , Rai S , Dev A . A comprehensive study of artificial neural networks. International Journal of Advanced Research in Computer Science and Software Engineering, 2012, 2(10): 270–274

[45]

Chen T , He T . XGBoost: Extreme gradient boosting. R Package Version 0.4-2, 2015, 1(4): 1–4

[46]

Ding Z , Nguyen H , Bui X N , Zhou J , Moayedi H . Computational intelligence model for estimating intensity of blast-induced ground vibration in a mine based on imperialist competitive and extreme gradient boosting algorithms. Natural Resources Research, 2020, 29: 751–769

[47]

Le L T , Nguyen H , Zhou J , Dou J , Moayedi H . Estimating the heating load of buildings for smart city planning using a novel artificial intelligence technique PSO-XGBoost. Applied Sciences, 2019, 9(13): 2714

[48]

Nguyen H , Drebenstedt C , Bui X N , Bui D T . Prediction of blast-induced ground vibration in an open-pit mine by a novel hybrid model based on clustering and artificial neural network. Natural Resources Research, 2020, 29(2): 691–709

[49]

Chen T , Guestrin C . XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY: ACM, 2016: 785–794

[50]

Das S K , Biswal R K , Sivakugan N , Das B . Classification of slopes and prediction of factor of safety using differential evolution neural networks. Environmental Earth Sciences, 2011, 64(1): 201–210

[51]

Gomes G F , De Almeida F A , Junqueira D M , Cunha S S , Ancelotti A C . Optimized damage identification in CFRP plates by reduced mode shapes and GA-ANN methods. Engineering Structures, 2019, 181: 111–123

[52]

Mirjalili S , Lewis A . The whale optimization algorithm. Advances in Engineering Software, 2016, 95: 51–67

[53]

Guo H , Zhou J , Koopialipoor M , Armaghani D J , Tahir M M . Deep neural network and whale optimization algorithm to assess flyrock induced by blasting. Engineering with Computers, 2021, 37(1): 173–186

RIGHTS & PERMISSIONS

Higher Education Press
