Application of machine learning technique for predicting and evaluating chloride ingress in concrete

Van Quan TRAN , Van Loi GIAP , Dinh Phien VU , Riya Catherine GEORGE , Lanh Si HO

Front. Struct. Civ. Eng. ›› 2022, Vol. 16 ›› Issue (9) : 1153 -1169.

DOI: 10.1007/s11709-022-0830-4
RESEARCH ARTICLE


Abstract

The degradation of concrete structures in the marine environment is often related to chloride-induced corrosion of reinforcement steel. Therefore, the chloride concentration in concrete is a vital parameter for estimating the corrosion level of reinforcement steel. This research aims at predicting the chloride content in concrete using three hybrid models of gradient boosting (GB), artificial neural network (ANN), and random forest (RF) in combination with particle swarm optimization (PSO). The input variables for modeling include exposure condition, water/binder ratio (W/B), cement content, silica fume, exposure time, and depth of measurement. The results indicate that all three models performed well with high prediction accuracy (R2 ≥ 0.90). Among the three hybrid models, the GB_PSO model achieved the highest prediction accuracy (R2 = 0.9551, RMSE = 0.0327, and MAE = 0.0181). Based on the results of sensitivity analysis using SHapley Additive exPlanations (SHAP) and one-dimensional partial dependence plots (PDP-1D), it was found that the exposure condition and depth of measurement were the two most vital variables affecting the prediction of chloride content. When the number of different exposure conditions is larger than two, the exposure significantly impacts the chloride content of concrete because the chloride ion ingress is affected by both chemical and physical processes. This study provides an insight into the evaluation and prediction of the chloride content of concrete in the marine environment.

Keywords

gradient boosting / random forest / chloride content / concrete / sensitivity analysis.

Cite this article

Van Quan TRAN, Van Loi GIAP, Dinh Phien VU, Riya Catherine GEORGE, Lanh Si HO. Application of machine learning technique for predicting and evaluating chloride ingress in concrete. Front. Struct. Civ. Eng., 2022, 16(9): 1153-1169 DOI:10.1007/s11709-022-0830-4


1 Introduction

Reinforced concrete (RC) is one of the most widely used composite materials in many types of infrastructure, from road bridges and building structures to marine structures. It is well accepted that RC structures have a long service life as well as high resistance to aggressive environments. RC structures have two main components: concrete and steel. Concrete protects the steel from corrosion by providing a highly alkaline environment, which produces a passive oxide film around the steel. In RC structures that are exposed to the marine environment, the reinforcement steel can be influenced by the presence of chloride ions. It has been stated that the presence of chloride ions is the main problem affecting structures in the marine environment, and that it negatively influences RC structures [1,2]. Chloride ions first accumulate on the surface of the RC structure and gradually penetrate into the concrete [3]. The penetration of chloride ions destroys the oxide film around the reinforcement steel and triggers steel corrosion, which causes cracking and spalling of concrete, resulting in a reduction of the bearing capacity of structures [4]. This corrosion process significantly decreases the service life of concrete structures in seawater. The penetration or ingress of chloride into concrete is strongly dependent on its exposure condition, i.e., atmospheric zone, splash zone, tidal zone, and submerged zone [5]. The chloride ingress in concrete is mostly attributed to the absorption or diffusion mechanism [6]. In the submerged zone, the transport of chloride into concrete is governed by a diffusion mechanism because the concrete in this zone is in a seawater-saturated state, while the chloride ingress into concrete in the splash zone is mainly governed by the absorption mechanism [7].
In general, the chloride ingress in RC structures is computed based on Fick’s second law of diffusion, which is widely employed in the service-life design of marine structures [8–10]. Many authors have worked on chloride ingress in concrete.
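
As an illustration of how Fick's second law is typically applied, the sketch below evaluates the standard closed-form solution for a semi-infinite medium with constant surface concentration and zero initial chloride, C(x, t) = C_s [1 − erf(x / (2 √(D t)))]. The function name and all numerical values (surface content, diffusion coefficient, exposure time) are hypothetical and are not taken from the paper or its dataset.

```python
import math

def chloride_profile(depth_mm, time_s, surface_conc, diff_coeff_mm2_s):
    """Closed-form solution of Fick's second law for a semi-infinite medium
    with constant surface concentration and zero initial chloride."""
    if time_s <= 0:
        return surface_conc if depth_mm == 0 else 0.0
    arg = depth_mm / (2.0 * math.sqrt(diff_coeff_mm2_s * time_s))
    return surface_conc * (1.0 - math.erf(arg))

# Hypothetical values, for illustration only.
C_S = 0.5                                 # surface chloride content, g/100 g
D_APP = 1.0e-6                            # apparent diffusion coefficient, mm^2/s
SIX_MONTHS = 0.5 * 365.25 * 24 * 3600     # exposure time, s
DEPTHS_MM = (0.0, 5.0, 10.0, 20.0, 30.0)

profile = [chloride_profile(x, SIX_MONTHS, C_S, D_APP) for x in DEPTHS_MM]
for depth, content in zip(DEPTHS_MM, profile):
    print(f"depth = {depth:5.1f} mm  chloride = {content:.4f} g/100 g")
```

The profile decreases monotonically with depth, which is the qualitative behavior the data-driven models in this study are expected to reproduce.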

The diffusion coefficient of chloride is governed by many factors such as concrete strength, exposure condition, water-cement ratio (W/C), etc. It has been stated that increase in concrete strength results in a reduction of the diffusion coefficient [11]. Liu et al. [9] estimated the chloride distribution within offshore concrete using statistical analysis. They indicated that the tidal zone was the most dangerous exposure environment regarding chloride corrosion. In addition, it has been found that the W/C was strongly relevant to chloride diffusion; the higher value of W/C resulted in a higher value of diffusion coefficient, penetration depth, and surface concentration. They have also concluded that this statistical approach is simple and reliable. However, this approach can only be applied to a given structure at a specific place. In addition, other researchers stated that the statistical approach contains some significant disadvantages, such as the difficulty of parameter prediction [12] and low prediction accuracy [13,14]. This is because the interdependence of chloride diffusion and influence factor (such as exposure condition, penetration depth, etc.) is complicated and time-dependent [13,15]. Furthermore, other factors such as apparent surface chloride content and chloride concentration at the initial state also affect the ingress of chloride in concrete. These generate uncertainty and so it is necessary to build a robust model that can consider many factors, including time dependence.

Recently, Machine Learning (ML) and Artificial Intelligence have been used in many engineering problems [16–19], particularly in civil engineering [20–22]. For example, ML approaches such as artificial neural network (ANN) have been widely used to estimate mechanical properties such as compressive and flexural strength of concrete [23,24]. Furthermore, gradient boosting (GB) and random forest (RF) have also been adopted to predict the mechanical properties of concrete [25–27]. The durability of RC has been estimated through an ML approach by Tafasse et al. [28]. They indicated that ML models are comparable to functional models of CO2 and Cl ingress. Ahmad et al. [7] used ML models, namely ANN, decision tree (DT), and gene expression programming (GEP), to estimate the chloride concentration on the surface of structures in the marine environment. Using 12 input variables, they revealed that GEP had the best prediction accuracy among the three models. Liu et al. [29] employed ANN models to predict the diffusion coefficient using 653 data samples taken from the literature. They used 13 input variables related to concrete composition, experimental process, and mechanical properties. Other previous studies have found that ANN could be a promising tool for estimating the chloride diffusion coefficient. Hoang et al. [30] used four ML approaches to predict the chloride content based on 132 mortar specimens, with four input variables (mortar age, depth of measurement, diffusion dimension, presence of reinforcement). They found that the Multivariate Adaptive Regression Splines (MARS) model obtained the best prediction accuracy (R2 = 0.91).
Parichatprecha and Nimityongskul [31] used ANN and Linear Regression models to estimate the total charge moved by chloride ions in high-performance concrete with a dataset of 86 samples using eight input variables (including cement, water, fine aggregate, coarse aggregate, and type and dosage of pozzolanic materials) and found that ANN achieved a high prediction accuracy. Najimi et al. [32] used a feed-forward ANN in combination with an Artificial Bee Colony algorithm (FF-ABC) to estimate chloride penetration of self-consolidating concrete with 72 datasets. They used six input variables related to the binder, water-binder ratio, aggregates, and additives. The results indicated that the FF-ABC model achieved a very high accuracy of prediction (R2 = 0.9801).

All the above literature focused on predicting the chloride penetration using input variables related to the binder, aggregate, and mechanical properties obtained in the laboratory. In these previous models, researchers have not taken into account curing conditions, exposure conditions, and exposure time, which are the most dominant factors affecting chloride content. Furthermore, earlier research mainly used a single ML model with a small dataset to estimate chloride content, and those previous works have not fully considered the importance of input variables. Therefore, this study aims to evaluate and improve the prediction of the chloride content of concrete subjected to the marine environment, using a hybrid ML approach. To achieve this goal, three hybrid ML models produced from a combination of ANN, RF, and GB with particle swarm optimization (PSO) are employed to predict the chloride content. The input variables for these models are exposure time, exposure condition, binder content, water/binder ratio (W/B), silica fume content, and depth of measurement. Furthermore, to understand the importance of input variables as well as their influence on the output, sensitivity analyses, namely SHapley Additive exPlanations (SHAP) and PDP-1D, are performed. The results of this study can reduce the gap in the literature and practice relating to prediction of the chloride content of concrete in the marine environment.

2 Database construction

A total of 404 samples were collected from published studies, in which 324 data samples were collected from Ref. [33], and the remaining 80 data were taken from Ref. [34]. These datasets include six input parameters, namely exposure condition, W/B, cement content, silica fume, exposure time, depth of measurement, and one output, namely chloride content. The histogram of input variables of the databases is presented in Fig.1. Fig.1(a) shows the histogram and samples distribution for exposure conditions. The marine submerged (SUB) has the highest number of samples, while the marine splashes (SPL) and marine atmospheric (ATM) have almost the same number of samples. The chloride content mostly ranges from 0.0−0.4 g/100 g of concrete, and a small number of data belonging to SPL and SUB conditions have chloride content larger than 0.4 g/100 g. The ATM condition mostly has chloride content in a range of 0.0−0.3 g/100 g. The W/B changes from 0.35 to 0.50. These values of W/B are the common values used for conventional concrete (Fig.1(b)). The numbers of samples in the case with W/B of 0.35, 0.40, and 0.45 are nearly the same, while that of W/B of 0.50 is much smaller. The cement content varies from 335 to 400 kg/m3, the highest number of samples was found for the case with 360 kg/m3 (Fig.1(c)). The cases with 375 and 400 kg/m3 have a similar number of samples, a small number of samples was observed for 335 kg/m3. The amount of silica fume is in the range of 0−40 kg/m3, in which the largest number of data is found for the case without silica content (Fig.1(d)). The exposure time ranges from 2.5 to 9.0 months, and the depth of measurement varies from 0.5 to 30.0 mm (Fig.1(e) and Fig.1(f)). The chloride content varies from 0.0 to 0.6 g/100 g of concrete, and mostly between 0.0 to 0.4 g/100 g (Fig.1(g)). Overall, all input variables and output do not have good density distribution curves. Tab.1 shows the detailed description of input and output.

The correlations between input and output variables as well as among input variables are presented in Fig.2. There are different correlations among input variables, including high and low correlation. Among input variables, there is a quite high correlation between silica fume and cement content with |R| = 0.6. This is because silica fume is one type of supplementary replacement for cement content. Regarding the correlation between input variables and output, there is a high correlation between depth of measurement and chloride content with |R| = 0.7. This is because in fact as the measurement depth increases, the chloride content decreases. In addition, a fair correlation is found between W/B and output. Besides, it can be observed that based on the R values (R = 0), two features (cement content and time exposure) are independent of output. This may be caused by the limitation of the classical measure of dependence, such as the Pearson correlation coefficient, which is mainly sensitive to a linear relationship between two variables.

Indeed, the correlations between pairs of variables and output are not linear. Distance Correlation (DC) was introduced by Székely et al. [35] to address this deficiency of Pearson’s correlation. Pearson correlation coefficient R equal to zero (R = 0) (uncorrelatedness) does not imply independence, while DC equal to zero (DC = 0) implies independence. The DC between input and output variables is presented in Fig.3. Among input variables, the highest DC (DC = 0.73) is found between cement content and silica fume. The highest DC between input and output was found between depth of measurement and chloride content (DC = 0.67), followed by exposure condition and the chloride content (DC = 0.44). The lowest DC is obtained between time exposure and chloride with DC = 0.11.
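
The difference between the two measures can be illustrated with a minimal sketch of the sample distance correlation, computed from double-centered pairwise distance matrices. The code below is a plain-Python illustration on synthetic data, not the authors' implementation; the function names and the test signals are assumptions made for the example. It shows a case (y = x² on a symmetric interval) where Pearson's R is essentially zero while DC is clearly positive.

```python
import math

def _centered_distances(values):
    """Pairwise |a - b| distances, double-centered by row, column, and grand means."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]   # matrix is symmetric
    grand_mean = sum(row_means) / n
    return [[dist[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation between two equal-length 1-D sequences."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2, dvar_x, dvar_y = mean_product(a, b), mean_product(a, a), mean_product(b, b)
    if dvar_x * dvar_y == 0.0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))

def pearson(x, y):
    """Pearson correlation coefficient R."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sx = math.sqrt(sum((u - mx) ** 2 for u in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return cov / (sx * sy)

# Synthetic example: a purely nonlinear (quadratic) dependence.
x = [i / 20.0 - 1.0 for i in range(41)]      # -1.0 ... 1.0
y_quadratic = [u * u for u in x]
y_linear = [2.0 * u + 1.0 for u in x]

r_quadratic = pearson(x, y_quadratic)
dc_quadratic = distance_correlation(x, y_quadratic)
dc_linear = distance_correlation(x, y_linear)
print(f"quadratic: R = {r_quadratic:.3f}, DC = {dc_quadratic:.3f}")
print(f"linear:    DC = {dc_linear:.3f}")
```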

3 Machine learning methods

3.1 Artificial neural network

ANNs are computing architectures based on the physiology of animal brains consisting of many simple processors that operate in parallel (neurons), which are interconnected by a system of axons and dendrites through which simple signals are exchanged. The neurons become units, and interconnections become links in the case of an ANN, thereby becoming a vast simplification of brain anatomy.

Nodes in ANN are like neurons and are connected to other nodes. When the input to a node exceeds a threshold, it will output a signal. Multilayer perceptron is a class of ANN [22], where multiple layers of neurons are established. These layers are classified into three types: input, output, and hidden layers. Input layers consist of all the input parameters in the given database, and the output layer consists of the output parameters. Hidden layers are processing layers in between the input and output layers. The layers are interconnected to each other and are associated with different weights. ANNs are trained using a learning function where ANN is updated until the error is within a tolerance limit. The components and working of the ANN algorithm are illustrated in Fig.4. The input layer in this study has six nodes while the output layer is a single node. The detailed parameter information of the ANN used in this study can be found in Section 5.

3.2 Random forest

The RF algorithm is an example of an ensemble bagging technique, where multiple DTs are used for the classification and regression of data [36]. In ensemble bagging (bootstrap aggregating), multiple subsets are randomly sampled with replacement from the primary database, and a DT is fitted to each of these subsets. The outcomes from all these DTs are averaged to obtain the outcome of the RF algorithm. The accuracy of the regression/prediction generally improves as the number of subsets increases. This method mitigates the overfitting observed in the application of single DTs. A flow chart showing the different stages of the RF algorithm is given in Fig.5.
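
The bagging step described above can be sketched as follows. This is a deliberately simplified illustration: it uses single-split regression stumps on one feature instead of full decision trees, and it omits the random feature selection that RF also applies at each split. The data are synthetic (an exponentially decaying chloride profile) and all function names are hypothetical.

```python
import math
import random

def fit_stump(xs, ys):
    """Least-squares single-threshold split on a 1-D feature."""
    best = None
    for t in sorted(set(xs))[:-1]:               # left: x <= t, right: x > t
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:                             # all x identical: constant model
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_bagged_stumps(xs, ys, n_estimators=25, seed=0):
    """Fit one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Average the predictions of all base learners."""
    return sum(predict_stump(m, x) for m in models) / len(models)

# Synthetic data: depth (mm) against chloride content (g/100 g).
depths = [0.5, 2.5, 5.0, 7.5, 10.0, 15.0, 20.0, 25.0, 30.0]
chloride = [0.5 * math.exp(-d / 8.0) for d in depths]

ensemble = fit_bagged_stumps(depths, chloride)
shallow = predict_bagged(ensemble, 0.5)
deep = predict_bagged(ensemble, 30.0)
print(f"predicted at 0.5 mm: {shallow:.3f}, at 30 mm: {deep:.3f}")
```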

3.3 Gradient boosting

The GB algorithm also uses multiple DTs for classification and regression of data, but unlike the RF algorithm, it uses an ensemble boosting technique [37]. In this algorithm, an improved predictor is estimated progressively, with the DTs added sequentially. At each stage, a new DT is fitted to correct the error remaining from the previous stage. This model is very flexible and can reach an accurate solution quickly. However, it can lead to overfitting if iterations are allowed to continue until all training errors are removed.
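
A minimal sketch of this sequential error correction, assuming a squared-error loss (so that each stage fits the current residuals) and single-split stumps as base learners, is given below. The learning rate, the number of stages, and the synthetic data are illustrative assumptions, not the tuned values of this study.

```python
import math

def fit_stump(xs, ys):
    """Least-squares single-threshold split on a 1-D feature."""
    best = None
    for t in sorted(set(xs))[:-1]:               # left: x <= t, right: x > t
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:                             # all x identical: constant model
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_gradient_boosting(xs, ys, n_stages=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_stages):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}

def predict_gb(model, x):
    correction = sum(predict_stump(s, x) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * correction

def training_mse(model, xs, ys):
    return sum((y - predict_gb(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Synthetic data: depth (mm) against chloride content (g/100 g).
depths = [0.5, 2.5, 5.0, 7.5, 10.0, 15.0, 20.0, 25.0, 30.0]
chloride = [0.5 * math.exp(-d / 8.0) for d in depths]

mse_by_stages = {n: training_mse(fit_gradient_boosting(depths, chloride, n), depths, chloride)
                 for n in (0, 5, 50)}
print(mse_by_stages)
```

The training error keeps falling as stages are added, which is why the number of stages and the learning rate are usually tuned on held-out data rather than on the training set.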

3.4 Particle swarm optimization algorithm

PSO is a swarm intelligence-based metaheuristic algorithm and is among the oldest and most widely used nature-inspired algorithms. PSO is inspired by a flying swarm of birds searching for food, modeling social interactions. Kennedy and Eberhart [38] proposed the algorithm based on the social behavior of animal groups and sharing of information among the group to increase the survival advantage. For example, a flock of birds flying and finding a place to land becomes a complex issue due to the consideration of various factors like maximizing the availability of food, minimizing the risk of the existence of predators, etc.

A swarm is constituted by several particles (agents) that move in a high-dimensional solution space, whose goal is to find the best (optimal) solution. Typically, optimization aims to maximize or minimize a function that can have multiple local maxima and minima. However, there can be only one global maximum or minimum value. Finding the global optimum value for a function that is quite complex is computationally challenging. Each particle is aware of its positional coordinates in the solution space and stores the best solution it has achieved so far, known as the personal best (pbest). Each particle searches the solution space by assessing different points using several evaluation criteria at the same time. Also, the PSO keeps track of the best among all personal bests, known as the global best (gbest). Each particle in the swarm dynamically adjusts its velocity based on its own flying experience and that of its fellow particles. In other words, the new position of a particle depends on its current velocity, pbest, and gbest. Particle movement in the PSO algorithm is illustrated in Fig.6. Current motion influence, particle memory influence, and swarm influence are obtained by applying random weights to the current velocity, pbest, and gbest, respectively. Since the inception of the original PSO approach, many variations have been proposed, like the linear-decreasing inertia weight [39], constriction factor weight [40], dynamic inertia and maximum velocity reduction [40], quantum-inspired approach [41], and hybrid models [42]. The optimization process is detailed in Ref. [43].
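
The update rule described above can be sketched with a basic inertia-weight PSO. The code below minimizes a simple test function (the sphere function) rather than the cross-validated R2 used in this study; to maximize a score, one would minimize its negative. The inertia and acceleration coefficients, the swarm size, and the bounds are common textbook choices assumed for illustration, not the settings of the paper.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_social=1.5, seed=0):
    """Basic PSO: velocity = inertia term + pull to pbest + pull to gbest."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]                       # best value after each iteration
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (pbest[i][d] - pos[i][d])
                             + c_social * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # keep in bounds
            value = objective(pos[i])
            if value < pbest_val[i]:            # update personal best
                pbest[i], pbest_val[i] = pos[i][:], value
                if value < gbest_val:           # update global best
                    gbest, gbest_val = pos[i][:], value
        history.append(gbest_val)
    return gbest, gbest_val, history

def sphere(point):
    return sum(c * c for c in point)

best_point, best_value, history = pso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_point, best_value)
```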

3.5 Performance evaluation

In this paper, three statistical indicators are used, namely root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2). The value of R2 typically ranges from 0 to 1, with a higher value indicating higher prediction ability of the model. In contrast, lower values of MAE and RMSE imply better prediction accuracy. These values can be computed using the following formulas [43–45].

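$$R^2 = 1 - \frac{\sum_{i=1}^{k} (v_i - \bar{v}_i)^2}{\sum_{i=1}^{k} (v_i - \bar{v})^2}, \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{k} \sum_{i=1}^{k} (v_i - \bar{v}_i)^2}, \tag{2}$$

$$\mathrm{MAE} = \frac{1}{k} \sum_{i=1}^{k} \left| v_i - \bar{v}_i \right|, \tag{3}$$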

where $k$ denotes the total number of samples, $v_i$ and $\bar{v}_i$ are the actual and predicted outputs, respectively, and $\bar{v}$ is the mean value of the $v_i$.
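
A direct transcription of these three indicators into code may help when checking reported values. The sketch below follows the definitions given here (actual values, predicted values, and the mean of the actual values); the function names and the three sample values are hypothetical.

```python
import math

def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((v - p) ** 2 for v, p in zip(actual, predicted))
    ss_tot = sum((v - mean_actual) ** 2 for v in actual)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((v - p) ** 2 for v, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(v - p) for v, p in zip(actual, predicted)) / len(actual)

# Hypothetical chloride contents (g/100 g), for illustration only.
measured = [0.10, 0.20, 0.30]
predicted = [0.20, 0.20, 0.20]      # a model that always predicts the mean

print(r2_score(measured, predicted), rmse(measured, predicted), mae(measured, predicted))
```

A model that always predicts the mean of the measured values gives R2 close to zero, which is a convenient sanity check for an implementation.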

4 Model conception

ML models are developed in this study to predict the chloride content in concrete using parameters such as exposure condition, W/B, cement content, amount of silica fume, exposure time, and measurement depth. The methodology adopted in this study is illustrated in the flow chart in Fig.7. The steps of the methodology are explained below.

Step 1: Preparation of the database

This study uses a database of 404 experimental results pooled from literature as afore-mentioned. The data consists of seven parameters, as explained in Tab.1, where chloride content is used as the output parameter and others as input parameters. The training dataset is formed by randomly selecting 70% of data from the database. The remaining 30% is used as the testing data set. Models that have learned using the training data set are used to predict the chloride contents of the testing database later.

Step 2: Training models

The ML algorithms ANN, GB, and RF are combined with the Particle Swarm Optimization (PSO) algorithm, which is used for tuning their hyperparameters. In the tuning process, 10-fold cross-validation is used, with R2 as the cost function. The hyperparameter tuning is performed on the 70% of samples that form the training dataset. The details of the various optimum parameters of these hybrid algorithms (ANN_PSO, GB_PSO, and RF_PSO) are listed in Tab.2 and Section 5.

Step 3: Validating the model

After tuning hyperparameters, the performance of these models is evaluated through three metrics such as R2, MAE, and RMSE (shown in Eqs. (1) to (3)) in the whole dataset with the aid of Monte Carlo simulations which increases the reliability of ML models.

Step 4: Evaluating the effect of input parameters on performance model

In this step, the importance and influence of the input variables on the output variable are evaluated and simulated by calculating the SHAP values. After analyzing and evaluating the influence of input variables on the chloride content of concrete and the predictive performance of the model, less important input variables are removed. The best-proposed model is used to predict the chloride content of concrete.

Step 5: Sensitivity analysis

In this last step, after validation of these proposed ML models using the testing data set, the sensitivity of input parameters is performed. Sensitivity analysis shows how the changes in input parameters affect the output variable, chloride content in concrete. This process is carried out by Partial Dependence Plots (denoted PDP 1D), the details of which are described in Section 5.
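
The data handling in Steps 1 to 3 can be sketched as index bookkeeping: a random 70/30 split, a 10-fold partition of the training part for hyperparameter tuning, and repeated random splits for the Monte Carlo evaluation. The sketch assumes that each Monte Carlo run redraws the 70/30 split, which the text does not state explicitly; the function names are hypothetical and no model is fitted here.

```python
import random

def split_70_30(n_samples, rng):
    """Randomly split sample indices into 70% training and 30% testing."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = int(round(0.7 * n_samples))
    return indices[:cut], indices[cut:]

def k_fold(indices, k=10):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def monte_carlo_splits(n_samples, n_runs=500, seed=0):
    """One independent random 70/30 split per Monte Carlo run (assumption)."""
    rng = random.Random(seed)
    return [split_70_30(n_samples, rng) for _ in range(n_runs)]

# The database in this study has 404 samples.
train_idx, test_idx = split_70_30(404, random.Random(0))
folds = list(k_fold(train_idx, k=10))
runs = monte_carlo_splits(404, n_runs=500)
print(len(train_idx), len(test_idx), len(folds), len(runs))
```

In a full pipeline, the cross-validated R2 computed over `folds` would be the objective passed to the optimizer, and the metrics would be recomputed for each entry of `runs` to obtain their mean and standard deviation.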

5 Results and discussion

5.1 Performance evaluation of hybrid models

The hyper-parameters space of three ML models is listed in Tab.2, which presents the details of each ML model. To obtain the optimal model, K-Fold cross-validation is employed. The cost function of each ML model is evaluated using the R2 index.

Moreover, the hyperparameters of PSO strongly influence the cost function of the optimization problem, and the computational process is time-consuming. Of the various parameters, population size (Po) can be considered to have a strong effect on the calculation process of PSO [40,46]. Therefore, a sensitivity analysis of this hyperparameter is performed to select the optimal value for tuning the ML models in predicting the chloride content of concrete. The influence of Po on the convergence of the R2 score is shown in Fig.8, which illustrates that the R2 score reaches a stable value for Po values of 3, 10, and 20. Moreover, the computational time is lowest, at 24 s, for a Po of 3. As a result, a Po of 3 is acceptable for achieving the optimal hyperparameters.

The optimal values of hyperparameters corresponding to the highest prediction accuracy are obtained using 10-fold cross-validation. The optimum hyperparameters of the three ML models are displayed in Tab.3. In each iteration, to attain the optimal value, each hyperparameter of the model is varied over a specific range of values, while the other hyperparameters are kept at their defaults. After that, the prediction accuracy of each proposed model is computed, and the model is then fine-tuned to obtain the subsequent values. The details relating to the parameters employed for the trial-and-error test are shown in Tab.2. Tab.3 indicates that GB obtained the highest R2 score (0.8623), followed by RF and ANN.

Two general optimization algorithms, namely Bayesian optimization (BO) and Simulated Annealing (SA), are also employed and compared with PSO to give a comprehensive understanding of the advantages of the proposed models. The R2 value versus the number of iterations for the nine hybrid models is shown in Fig.9. All nine models achieve a high score of R2 (R2 > 0.81), in which the GB_PSO model has the highest value of R2 = 0.862, while the RF_SA and RF_BO models obtain the lowest value of R2 = 0.801. Fig.9 illustrates that all hybrid models using the PSO algorithm achieve higher accuracy than those using the BO and SA algorithms. Overall, the nine models achieve their maximum and stable value of R2 when the number of iterations is larger than 250. Thus, we can roughly conclude that the proposed hybrid models using the PSO algorithm are capable of estimating the chloride content.

The performance of the three hybrid models in terms of RMSE, MAE, and R2 for both the training and testing parts is presented in Fig.10. For RMSE (Fig.10(a)), the values of the training part are smaller than those of the testing part for all three hybrid models. For both training and testing parts, the ANN_PSO model has higher RMSE values than the GB_PSO and RF_PSO models. In addition, the standard deviations of the ANN_PSO model are also higher than those of the GB_PSO and RF_PSO models for both parts. Among the three models, the GB_PSO model shows the smallest RMSE. As with RMSE, the MAE values of the testing set are higher than those of the training set (Fig.10(b)). For the training set, the ANN_PSO model has the highest MAE with the largest deviation. The highest MAE is also found for the ANN_PSO model in the testing set. The GB_PSO model has the smallest MAE. For R2, the ANN_PSO model obtains the smallest value, while the GB_PSO model has the largest value (Fig.10(c)). Thus, from the results of RMSE, MAE, and R2, the GB_PSO model achieves the highest prediction ability, and the lowest performance is found for the ANN_PSO model.

The R2 value after 500 Monte Carlo simulations for three hybrid models is presented in Tab.4. For both train and test datasets, the GB_PSO model attains the highest score of R2 in terms of average and maximum values, and the standard deviation is the smallest. At the same time, the ANN_PSO obtains the lowest value of R2 in terms of the average and maximum values with the highest value of standard deviation.

Tab.5 provides a summary of the RMSE values for the three hybrid models after 500 Monte Carlo simulations. A lower value of RMSE indicates that the model has better prediction accuracy. The GB_PSO model achieves the lowest RMSE in both the minimum and average cases for both train and test datasets. In contrast, the ANN_PSO model has the highest RMSE values in the minimum and average cases for both train and test datasets. For the test dataset, the GB_PSO model shows the lowest standard deviation, while the highest standard deviation is found for the ANN_PSO model.

The summary of MAE value after 500 simulations for three hybrid models is listed in Tab.6. Just as for the case of RMSE, for both train and test datasets, the GB_PSO model achieves the lowest MAE values for both minimum and average cases. On the other hand, the ANN_PSO model has the highest values of MAE for both minimum and average cases. From the three tables above, it can be concluded that the GB_PSO outperforms the two remaining models.

The summaries of the evaluation metrics show that some overfitting of the ML models is present. Three solutions can be envisaged to reduce the overfitting: (i) completing the range of data values to reduce the data imbalance; (ii) modifying the optimization space of the hyperparameters; (iii) using novel metaheuristic algorithms such as Black Widow Optimization or the Slime Mold Algorithm.

5.2 Prediction performance of typical hybrid model GB_PSO

Based on the above results and discussion, the GB_PSO model has the best prediction accuracy. Thus, the GB_PSO model is selected for further investigation. Fig.11 shows the experimental and predicted chloride content as a function of the sample index for both training and testing datasets. In general, the predicted values match the actual values very well for both training and testing datasets. This indicates that the GB_PSO model performs well.

The regression graphs of the GB_PSO model for the training, testing, and all datasets are shown in Fig.12. For the training part, the GB_PSO model achieves very high prediction accuracy with R2 = 0.9573, RMSE = 0.0316, and MAE = 0.0163. Regarding the testing results, the prediction accuracy is slightly lower than that found in the training results, but the values of the three indicators still indicate a high prediction performance with R2 = 0.9500, RMSE = 0.0354, and MAE = 0.0225. The results for all data are presented in Fig.12(c). As can be observed, the prediction accuracy of the GB_PSO model is high with R2 = 0.9551, RMSE = 0.0327, and MAE = 0.0181. Furthermore, it can be seen that there is a small deviation when the chloride content ranges from 0.4–0.6 g/100 g. This can be attributed to the difference in exposure conditions. As previously mentioned, exposure conditions included SPL, SUB, and ATM, in which data points mostly lie in the range of 0.0−0.3 g/100 g. There are fewer data points in the range 0.4–0.6 g/100 g, and these mostly come from SPL and SUB (see Fig.1(a)). Thus, this difference in exposure conditions can lead to that small deviation.

In order to evaluate the effectiveness of the proposed model, a comparison between the results achieved in this study and those in previous literature is shown in Tab.7, which shows that this study has a larger number of data samples than the previous studies. Besides, this study evaluates the performance accuracy of the proposed model for the training, testing, and all datasets, while previous literature has mainly reported results for either the training or the testing dataset. Furthermore, this study employs three statistical indicators (R2, RMSE, and MAE), whereas the previous studies have mainly used R2 to assess model performance. The prediction accuracy of the GB_PSO model of this study is higher than that of the ANN and ANFIS models in the previous study conducted by Boğa et al. [47] and the four models in the previous study by Hoang et al. [30]. This may be because this study uses more input variables and the combination of GB and PSO. As stated in earlier studies, PSO has some advantages with respect to evolutionary algorithms [38,48]. For instance, PSO has no complicated operators, unlike evolutionary algorithms, and has fewer parameters that need to be adjusted [49]. Besides, the result of this study is comparable with the result in a previous study using the ANN model [31] and with that of the ANN and CART models [33]. In summary, it can be concluded that the GB_PSO is an outstanding model and has high accuracy in predicting the chloride content.

5.3 Feature importance analysis

The feature importance analysis using SHAP is shown in Fig.13. Based on the result of SHAP, the importance of each input variable is qualitatively evaluated. The order of input variables based on feature value is ranked from top to bottom (i.e. from red to blue color). From the figure, the boundaries of the feature values between positive and negative contributions can be explicitly observed. The depth of measurement is found as the most vital input variable, followed by exposure condition, W/B, cement content, time exposure, and silica fume. This is because concrete has a dense structure, and thus the chloride ions take time to penetrate into the concrete. As a result, generally, the larger the depth of measurement, the lower is the chloride concentration. The exposure condition was ranked the second most important feature, and this is consistent with the result obtained in the previous study by Alizadeh et al. [50]. They suggested that exposure zones strongly affected the surface concentration of chloride as well as chloride diffusion coefficient. Blue dots in the figure are for the SUB, which has positive SHAP values. This indicates that SUB condition has a strong influence on the chloride content because in this range concrete is directly exposed to a high concentration of chloride from seawater. The purple dots indicate the SPL, which also strongly affect the chloride content. The red dots represent the ATM, and this condition has less impact on chloride content. It has been reported that exposure conditions are not only impacted by the chemical properties of seawater but also physical attacks comprising wet and dry cycles, freezing and thawing cycles, and wave attack [51]. This indicates that curing condition is one of the important factors that remarkably influence chloride resistance.

The W/B ratio is also found to be one of the three most crucial factors that significantly influence the chloride concentration. This agrees well with previous studies, which stated that a lower W/B ratio leads to better chloride resistance [8,9]. This is because, with a lower W/B ratio, concrete possesses a denser structure, which leads to lower permeability. Cement content is also found to affect the chloride concentration; however, the boundary between its negative and positive contributions is not clearly defined. Time exposure has a lower feature importance than cement content, but there is a clear boundary between its negative and positive contributions. Silica fume has the lowest feature importance, but a clear boundary between its positive and negative contributions can also be observed. Previous studies have revealed that the addition of silica fume changes the chloride resistance because silica fume modifies the microstructure of concrete [52–54].
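To make the notion of a positive or negative SHAP contribution concrete, the sketch below computes exact Shapley values for a single sample by enumerating all feature coalitions. It is a minimal illustration under stated assumptions: `toy_model` is a hypothetical linear stand-in and not the GB_PSO model of this study, the background samples are invented, and absent features are filled in from the background rows. Practical SHAP libraries use faster approximations for tree ensembles.

```python
from itertools import combinations
from math import factorial

def toy_model(row):
    """Hypothetical stand-in for a trained regressor (NOT the paper's GB_PSO).
    row = (depth_mm, w_b_ratio, silica_fume_pct)."""
    depth, wb, sf = row
    return 0.5 - 0.01 * depth + 0.4 * wb - 0.005 * sf

def coalition_value(model, x, background, coalition):
    """Mean prediction when features in `coalition` take the values of x and
    the remaining features take the values of each background row."""
    total = 0.0
    for b in background:
        mixed = tuple(x[j] if j in coalition else b[j] for j in range(len(x)))
        total += model(mixed)
    return total / len(background)

def shapley_values(model, x, background):
    """Exact Shapley value of every feature of x (cost grows as 2^n)."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        value = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                value += weight * gain
        phi.append(value)
    return phi

# Invented background samples and one sample to explain (illustrative only).
background = [(5.0, 0.35, 0.0), (15.0, 0.40, 5.0), (25.0, 0.45, 10.0)]
x = (5.0, 0.45, 10.0)
phi = shapley_values(toy_model, x, background)
baseline = coalition_value(toy_model, x, background, set())
print("SHAP values (depth, W/B, silica fume):", phi)
print("baseline + sum of SHAP values:", baseline + sum(phi))
print("model prediction:", toy_model(x))
```

In this toy case the shallow depth and the high W/B of the sample receive positive values and the silica fume content a negative one, and the values sum to the difference between the prediction and the baseline, which is the additivity property on which SHAP plots rely.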

5.4 Partial dependence plot analysis

The PDP-1D analysis is performed to evaluate the impact of each input variable on the output (chloride content), and the results are shown in Fig.14. In the PDP-1D analysis, when the influence of one variable on the output is assessed, the other input variables are held constant. The chloride content is significantly affected by the depth of measurement, and the partial dependence decreases almost linearly with increasing depth of measurement (Fig.14(a)). The exposure condition has almost no influence on the chloride content when the number of exposure conditions varies from one to two (Fig.14(b)). However, when the number of exposure conditions increases from two to three, the partial dependence decreases significantly, which indicates that the exposure greatly affects the chloride content (Fig.14(b)). This phenomenon can be explained as follows. For example, if splash and atmospheric conditions are present, the transport mechanism is mainly attributed to absorption. However, when the submerged condition is included, the chloride ingress also involves a diffusion mechanism. In addition, an increase in the number of exposure conditions leads to more attacks (both chemical and physical) [51]. Fig.14(c) shows a remarkable influence of W/B on the chloride content: a higher W/B causes a higher chloride content because it results in a less dense concrete structure and thus a higher permeability [9]. Cement content and exposure time have a similar influence on the chloride concentration: an increase in either factor produces an increase in the chloride content (Fig.14(d) and Fig.14(e)). However, for silica fume content, the partial dependence decreases linearly with increasing silica fume content, which implies that increasing the silica fume amount improves chloride resistance. This is because, as stated in previous studies, the addition of silica fume leads to a denser structure owing to the pozzolanic reaction of silica fume with cement hydration products [55,56].
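The computation behind a one-dimensional partial dependence curve can be sketched as follows. In the common formulation, the feature of interest is set to each grid value in every sample, the other inputs keep their recorded values, and the predictions are averaged. The model `toy_chloride_model`, the samples, and the grid are hypothetical and serve only to show the procedure; they do not reproduce Fig.14.

```python
import math

def toy_chloride_model(row):
    """Hypothetical stand-in for a trained regressor (not the paper's GB_PSO).
    row = (depth_mm, w_b_ratio); output decays with depth and grows with W/B."""
    depth_mm, w_b = row
    return 0.6 * w_b * math.exp(-depth_mm / 20.0)

def partial_dependence_1d(model, data, feature_index, grid):
    """For each grid value, overwrite one feature in every row and average
    the model predictions over the whole data set."""
    curve = []
    for value in grid:
        total = 0.0
        for row in data:
            modified = list(row)
            modified[feature_index] = value  # other features stay as recorded
            total += model(tuple(modified))
        curve.append(total / len(data))
    return curve

# Invented samples (depth in mm, W/B ratio), illustrative only.
samples = [(5.0, 0.35), (15.0, 0.40), (25.0, 0.45), (35.0, 0.50)]
depth_grid = [0.0, 10.0, 20.0, 30.0, 40.0]
pdp_depth = partial_dependence_1d(toy_chloride_model, samples, 0, depth_grid)
for depth, value in zip(depth_grid, pdp_depth):
    print(f"depth={depth:5.1f} mm  partial dependence={value:.4f}")
```

For this toy model the curve falls monotonically with depth, which mirrors the qualitative trend reported for the depth of measurement, although the shape of the curve depends entirely on the assumed model.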

6 Conclusions

This paper used three hybrid models (ANN_PSO, RF_PSO, and GB_PSO) to estimate the chloride content of concrete in the marine environment. The input variables for modeling consist of exposure condition, W/B, cement content, silica fume, time of exposure, and depth of measurement.

The results indicate that all three models performed well, with high prediction accuracy (R2 ≥ 0.90). Among the three hybrid models, GB_PSO achieved the highest prediction accuracy (R2 = 0.9551, RMSE = 0.0327, and MAE = 0.0181). Based on the sensitivity analysis using SHAP and PDP-1D, the importance of the input variables and their influence on the chloride content of concrete in the marine environment were evaluated. The results also indicate that the exposure condition is one of the two most vital variables affecting the prediction of chloride content. In particular, when there are more than two exposure conditions, the exposure significantly impacts the chloride content of concrete because chloride ion ingress is affected by both chemical and physical attacks. Finally, this study provides insight into the prediction and evaluation of the chloride content of concrete in the marine environment, and the results can help narrow the gap in the literature and provide practical knowledge on chloride ingress in concrete.

This study was conducted using 404 data samples for concrete in marine conditions. Further research should consider a larger database as well as other input variables such as the grade of concrete, the type of coarse and fine aggregates, and other types of pozzolanic materials. It should also consider the chloride content after a longer exposure time to verify the results and approach of this study.

To improve the performance of the ML models, three measures can be envisaged to reduce overfitting: (i) filling gaps in the value ranges of the data to reduce data imbalance; (ii) modifying the search space of the hyperparameters; and (iii) using novel metaheuristic algorithms such as Black Widow Optimization and the Slime Mould Algorithm.
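The role of the hyperparameter search space mentioned in measure (ii) can be illustrated with a minimal inertia-weight PSO in which the bounds of each hyperparameter are explicit inputs. This is a sketch under stated assumptions, not the configuration used in this study: the objective `surrogate_error` is a hypothetical stand-in for a cross-validated error, and the swarm size, inertia weight, acceleration coefficients, and bounds are illustrative choices.

```python
import random

def pso_minimize(objective, bounds, n_particles=30, n_iter=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal inertia-weight PSO; `bounds` is a list of (low, high) pairs."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_val[k])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best so far
    for _ in range(n_iter):
        for k in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (inertia * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                # Clamp to the search space so particles stay inside the bounds.
                pos[k][d] = min(max(pos[k][d] + vel[k][d], lo), hi)
            val = objective(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[k][:], val
    return gbest, gbest_val

def surrogate_error(params):
    """Hypothetical validation-error surface, minimal at lr=0.1, n_trees=300."""
    lr, n_trees = params
    return ((lr - 0.1) / 0.1) ** 2 + ((n_trees - 300.0) / 100.0) ** 2

# Assumed search space: learning rate and number of trees (illustrative).
search_space = [(0.01, 0.5), (50.0, 500.0)]
best, best_val = pso_minimize(surrogate_error, search_space)
print("best hyperparameters:", best, "error:", best_val)
```

Narrowing or widening `search_space` changes which model complexities the swarm can reach, which is one way in which the optimization space may influence overfitting.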

References

[1]

Akiyama M, Frangopol D M, Suzuki M. Integration of the effects of airborne chlorides into reliability-based durability design of reinforced concrete structures in a marine environment. Structure and Infrastructure Engineering, 2012, 8(2): 125–134

[2]

Sadowski L, Nikoo M. Corrosion current density prediction in reinforced concrete by imperialist competitive algorithm. Neural Computing & Applications, 2014, 25(7−8): 1627–1638

[3]

Zhang P, Cong Y, Vogel M, Liu Z, Müller H S, Zhu Y, Zhao T. Steel reinforcement corrosion in concrete under combined actions: the role of freeze−thaw cycles, chloride ingress, and surface impregnation. Construction & Building Materials, 2017, 148: 113–121

[4]

Balafas I, Burgoyne C J. Environmental effects on cover cracking due to corrosion. Cement and Concrete Research, 2010, 40(9): 1429–1440

[5]

Dai J G, Akira Y, Wittmann F H, Yokota H, Zhang P. Water repellent surface impregnation for extension of service life of reinforced concrete structures in marine environments: The role of cracks. Cement and Concrete Composites, 2010, 32(2): 101–109

[6]

Khanzadeh Moradllo M, Shekarchi M, Hoseini M. Time-dependent performance of concrete surface coatings in tidal zone of marine environment. Construction & Building Materials, 2012, 30: 198–205

[7]

Ahmad A, Farooq F, Ostrowski K A, Śliwa-Wieczorek K, Czarnecki S. Application of novel machine learning techniques for predicting the surface chloride concentration in concrete containing waste material. Materials (Basel), 2021, 14(9): 2297

[8]

Chalee W, Jaturapitakkul C A, Chindaprasirt P. Predicting the chloride penetration of fly ash concrete in seawater. Marine Structures, 2009, 22(3): 341–353

[9]

Liu Q, Hu Z, Lu X, Yang J, Azim I, Sun W. Prediction of chloride distribution for offshore concrete based on statistical analysis. Materials (Basel), 2020, 13(1): 174

[10]

Cai R, Han T, Liao W, Huang J, Li D, Kumar A, Ma H. Prediction of surface chloride concentration of marine concrete using ensemble machine learning. Cement and Concrete Research, 2020, 136: 106164

[11]

Dhir R K, Jones M R, Elghaly A E. PFA concrete: Exposure temperature effects on chloride diffusion. Cement and Concrete Research, 1993, 23(5): 1105–1114

[12]

Wang H L, Dai J G, Sun X Y, Zhang X L. Time-dependent and stress-dependent chloride diffusivity of concrete subjected to sustained compressive loading. Journal of Materials in Civil Engineering, 2016, 28(8): 04016059

[13]

Liao K W, Chen C T, Wu B H, Chen W L, Yeh C M. Investigation of chloride diffusion in cement mortar via statistical learning theory. Magazine of Concrete Research, 2016, 68(5): 237–249

[14]

Liu J, Xing F, Dong B Q, Ma H Y, Pan D. New equation for description of chloride ions diffusion in concrete under shallow immersion condition. Materials Research Innovations, 2014, 18(sup2): S2−S265−S2−S269

[15]

van Noort R, Hunger M, Spiesz P. Long-term chloride migration coefficient in slag cement-based concrete and resistivity as an alternative test method. Construction & Building Materials, 2016, 115: 746–759

[16]

Guo H, Zhuang X, Rabczuk T. A deep collocation method for the bending analysis of Kirchhoff plate. Computers, Materials & Continua, 2019, 59(2): 433–456

[17]

Anitescu C, Atroshchenko E, Alajlan N, Rabczuk T. Artificial neural network methods for the solution of second order boundary value problems. Computers, Materials & Continua, 2019, 59(1): 345–359

[18]

Samaniego E, Anitescu C, Goswami S, Nguyen-Thanh V M, Guo H, Hamdia K, Zhuang X, Rabczuk T. An energy approach to the solution of partial differential equations in computational mechanics via machine learning: Concepts, implementation and applications. Computer Methods in Applied Mechanics and Engineering, 2020, 362: 112790

[19]

Zhuang X, Guo H, Alajlan N, Zhu H, Rabczuk T. Deep autoencoder based energy method for the bending, vibration, and buckling analysis of Kirchhoff plates with transfer learning. European Journal of Mechanics. A, Solids, 2021, 87: 104225

[20]

Tran Q A, Ho L S, Le H V, Prakash I, Pham B T. Estimation of the undrained shear strength of sensitive clays using optimized inference intelligence system. Neural Computing & Applications, 2022, 34(10): 7835–7849

[21]

Pham B T, Ly H B, Al-Ansari N, Ho L S. A comparison of Gaussian process and M5P for prediction of soil permeability coefficient. Hindawi Limited, 2021

[22]

Nguyen Q H, Ly H B, Ho L S, Al-Ansari N, Le H V, Tran V Q, Prakash I, Pham B T. Influence of data splitting on performance of machine learning models in prediction of shear strength of soil. Mathematical Problems in Engineering, 2021, 6: 1–15

[23]

Ben Chaabene W, Flah M, Nehdi M L. Machine learning prediction of mechanical properties of concrete: Critical review. Construction & Building Materials, 2020, 260: 119889

[24]

Moradi M J, Khaleghi M, Salimi J, Farhangi V, Ramezanianpour A M. Predicting the compressive strength of concrete containing metakaolin with different properties using ANN. Measurement, 2021, 183: 109790

[25]

Nguyen-Sy T, Wakim J, To Q D, Vu M N, Nguyen T D, Nguyen T T. Predicting the compressive strength of concrete from its compositions and age using the extreme gradient boosting method. Construction & Building Materials, 2020, 260: 119757

[26]

Zhang J, Ma G, Huang Y, Sun J, Aslani F, Nener B. Modelling uniaxial compressive strength of lightweight self-compacting concrete using random forest regression. Construction & Building Materials, 2019, 210: 713–719

[27]

Han Q, Gui C, Xu J, Lacidogna G. A generalized method to predict the compressive strength of high-performance concrete by improved random forest algorithm. Construction & Building Materials, 2019, 226: 734–742

[28]

Taffese W Z, Sistonen E. Machine learning for durability and service-life assessment of reinforced concrete structures: recent advances and future directions. Automation in Construction, 2017, 77: 1–14

[29]

Liu Q, Iqbal M F, Yang J, Lu X, Zhang P, Rauf M. Prediction of chloride diffusivity in concrete using artificial neural network: Modelling and performance evaluation. Construction & Building Materials, 2021, 268: 121082

[30]

Hoang N D, Chen C T, Liao K W. Prediction of chloride diffusion in cement mortar using multi-gene genetic programming and multivariate adaptive regression splines. Measurement, 2017, 112: 141–149

[31]

Parichatprecha R, Nimityongskul P. Analysis of durability of high performance concrete using artificial neural networks. Construction & Building Materials, 2009, 23(2): 910–917

[32]

Najimi M, Ghafoori N, Nikoo M. Modeling chloride penetration in self-consolidating concrete using artificial neural network combined with artificial bee colony algorithm. Journal of Building Engineering, 2019, 22: 216–226

[33]

Asghshahr M S, Rahai A, Ashrafi H. Prediction of chloride content in concrete using ANN and CART. Magazine of Concrete Research, 2016, 68(21): 1085–1098

[34]

Ashrafi H R, Ramezanianpour A A. Service life prediction of silica fume concretes. International Journal of Civil Engineering, 2007, 5: 182–197

[35]

Székely G J, Rizzo M L, Bakirov N K. Measuring and testing dependence by correlation of distances. Annals of Statistics, 2007, 35(6): 2769–2794

[36]

Ho T K. Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition. Montreal: IEEE, 1995, 278–282

[37]

Friedman J H. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 2001, 29(5): 1189–1232

[38]

Kennedy J, Eberhart R. Particle swarm optimization. In: Proceedings of ICNN’95 - International Conference on Neural Networks. Perth: IEEE, 1995, 1942–1948

[39]

Shi Y, Eberhart R C. Empirical study of particle swarm optimization. In: Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406). Washington, D.C.: IEEE, 1999, 1945–1950

[40]

Eberhart R C, Shi Y. Comparing inertia weights and constriction factors in particle swarm optimization. In: Proceedings of the 2000 Congress on Evolutionary Computation. CEC00 (Cat. No. 00TH8512). La Jolla: IEEE, 2000, 84–88

[41]

Han K H, Kim J H. Quantum-inspired evolutionary algorithm for a class of combinatorial optimization. IEEE Transactions on Evolutionary Computation, 2002, 6(6): 580–593

[42]

dos Santos Coelho L, Mariani V C. Particle swarm optimization with quasi-Newton local search for solving economic dispatch problem. In: Proceedings of the 2006 IEEE International Conference on Systems, Man and Cybernetics. Taipei, China: IEEE, 2006, 3109–3113

[43]

Le T T, Pham B T, Ly H B, Shirzadi A, Le L M. Development of 48-hour precipitation forecasting model using nonlinear autoregressive neural network. In: Ha-Minh C, Dao D, Benboudjema F, Derrible S, Huynh D, Tang A, eds. CIGOS 2019, Innovation for Sustainable Infrastructure. Lecture Notes in Civil Engineering, vol 54. Singapore: Springer, 2020, 1191–1196

[44]

Pham B T, Nguyen M D, Ly H B, Pham T A, Hoang V, Van Le H, Le T T, Nguyen H Q, Bui G L. Development of artificial neural networks for prediction of compression coefficient of soft soil. In: Ha-Minh C, Dao D, Benboudjema F, Derrible S, Huynh D, Tang A, eds. CIGOS 2019, Innovation for Sustainable Infrastructure. Lecture Notes in Civil Engineering, vol 54. Singapore: Springer, 2019, 1167–1172

[45]

Thanh T T M, Ly H B, Pham B T. A possibility of AI application on mode-choice prediction of transport users in Hanoi. In: Ha-Minh C, Dao D, Benboudjema F, Derrible S, Huynh D, Tang A, eds. CIGOS 2019, Innovation for Sustainable Infrastructure. Lecture Notes in Civil Engineering, vol 54. Singapore: Springer, 2020, 1179–1184

[46]

Piotrowski A P, Napiorkowski J J, Piotrowska A E. Population size in particle swarm optimization. Swarm and Evolutionary Computation, 2020, 58: 100718

[47]

Boğa A R, Öztürk M, Topcu I B. Using ANN and ANFIS to predict the mechanical and chloride permeability properties of concrete containing GGBFS and CNI. Composites. Part B, Engineering, 2013, 45(1): 688–696

[48]

Han F, Yao H F, Ling Q H. An improved evolutionary extreme learning machine based on particle swarm optimization. Neurocomputing, 2013, 116: 87–93

[49]

Ludermir T B, De Oliveira W R. Particle swarm optimization of MLP for the identification of factors related to common mental disorders. Expert Systems with Applications, 2013, 40(11): 4648–4652

[50]

Alizadeh R, Ghods P, Chini M, Hoseini M, Ghalibafian M, Shekarchi M. Effect of curing conditions on the service life design of RC structures in the Persian Gulf region. Journal of Materials in Civil Engineering, 2008, 20(1): 2–8

[51]

Yi Y, Zhu D, Guo S, Zhang Z, Shi C. A review on the deterioration and approaches to enhance the durability of concrete in the marine environment. Cement and Concrete Composites, 2020, 113: 103695

[52]

Zhang W, Ba H. Effect of silica fume addition and repeated loading on chloride diffusion coefficient of concrete. Materials and Structures, 2013, 46(7): 1183–1191

[53]

Shekarchi M, Rafiee A, Layssi H. Long-term chloride diffusion in silica fume concrete in harsh marine climates. Cement and Concrete Composites, 2009, 31(10): 769–775

[54]

Zhang P, Li D, Qiao Y, Zhang S, Sun C, Zhao T. Effect of air entrainment on the mechanical properties, chloride migration, and microstructure of ordinary concrete and fly ash concrete. Journal of Materials in Civil Engineering, 2018, 30(10): 04018265

[55]

Khan M I, Siddique R. Utilization of silica fume in concrete: Review of durability properties. Resources, Conservation and Recycling, 2011, 57: 30–35

[56]

Li L G, Zheng J Y, Ng P L, Zhu J, Kwan A K H. Cementing efficiencies and synergistic roles of silica fume and nano-silica in sulphate and chloride resistance of concrete. Construction & Building Materials, 2019, 223: 965–975

RIGHTS & PERMISSIONS

Higher Education Press
