Machine learning based models for predicting compressive strength of geopolymer concrete

Quang-Huy LE , Duy-Hung NGUYEN , Thanh SANG-TO , Samir KHATIR , Hoang LE-MINH , Amir H. GANDOMI , Thanh CUONG-LE

Front. Struct. Civ. Eng. ›› 2024, Vol. 18 ›› Issue (7): 1028–1049. DOI: 10.1007/s11709-024-1039-5

RESEARCH ARTICLE
Abstract

Recently, great attention has been paid to geopolymer concrete due to its advantageous mechanical and environmentally friendly properties. Much effort has been made in experimental studies to advance the understanding of geopolymer concrete, in which compressive strength is one of the most important properties. To facilitate engineering work on the material, an efficient predicting model is needed. In this study, three machine learning (ML)-based models, namely deep neural network (DNN), K-nearest neighbors (KNN), and support vector machines (SVM), are developed for forecasting the compressive strength of the geopolymer concrete. A total of 375 experimental samples are collected from the literature to build a database for the development of the predicting models. A careful procedure for data preprocessing is implemented, by which outliers are examined and removed from the database and input variables are standardized before feeding to the fitting process. The standard K-fold cross-validation approach is applied for evaluating the performance of the models so that overfitting status is well managed, thus the generalizability of the models is ensured. The effectiveness of the models is assessed via statistical metrics including root mean squared error (RMSE), mean absolute error (MAE), correlation coefficient (R), and the recently proposed performance index (PI). The basic mean square error (MSE) is used as the loss function to be minimized during the model fitting process. The three ML-based models are successfully developed for estimating the compressive strength, for which good correlations between the predicted and the true values are obtained for DNN, KNN, and SVM. The numerical results suggest that the DNN model generally outperforms the other two models.

Graphical abstract

Keywords

geopolymer concrete / compressive strength prediction / machine-learning based model / deep neural network / K-nearest neighbor / support vector machines

Cite this article

Quang-Huy LE, Duy-Hung NGUYEN, Thanh SANG-TO, Samir KHATIR, Hoang LE-MINH, Amir H. GANDOMI, Thanh CUONG-LE. Machine learning based models for predicting compressive strength of geopolymer concrete. Front. Struct. Civ. Eng., 2024, 18(7): 1028-1049 DOI:10.1007/s11709-024-1039-5


1 Introduction

Geopolymer concrete was first introduced and named by Davidovits and Cordi in 1979 [1], and is formed from the reaction between alumino-silicate materials and an alkaline solution. Information on the mechanism of, and factors influencing, geopolymerization can be found in the review work of Khale and Chaudhary [2]. The alumino-silicate component of the geopolymer can come either directly from nature or from by-products of other industries. In particular, sources of alumino-silicates for rendering geopolymers from natural materials include meta-kaolin, volcanic ash, etc., while by-product sources include corn cob ash and coconut husk ash (agricultural waste products), fly ash, red mud, and copper (industrial waste products), and glass wool fiber and paper waste (municipal waste products) [3–7]. Reviews of the history and progress of geopolymer technology can be found in the work of Duxson et al. [8]. Since its introduction, this type of material has been recognized as having many advantages, such as high environmental friendliness with low carbon-dioxide emission [9], and outstanding mechanical strength, thermal and chemical resistance, and durability [2,10], thus promising an alternative to traditional Portland cement. Additionally, geopolymers have great potential for serving sustainable development when industrial waste disposals, such as tailings, sludge, and fly ash, can be utilized in their manufacture [11]. The behavior of geopolymer can be ‘unsystematic’, varying from case to case according to the chemical reaction during the synthesizing process and the design of the mixture, including the selection of the proportions of ingredients. This nature, therefore, poses challenges to predicting the compressive strength of geopolymers.

A large number of experimental studies on geopolymer concretes have previously been performed. Among such research, factors affecting compressive strength have attracted broad interest [12–15]. An extended account can be found in the review work of Castillo et al. [16]. Other research has looked at mechanical properties relating to specific aspects of the material, as in Refs. [17–23]. Additionally, the behavior of geopolymer concrete has been studied in terms of microstructural properties such as fracture [24], the compressive stress–strain model [25], and structural performance [26].

To assist practical applications, design methods and procedures for geopolymer concrete mixtures have been proposed. Rangan [27] reported a study in which experimental data were utilized to develop a design guideline; from the report, a suitable mix proportion and manufacturing condition for geopolymer concretes could be identified by interpolating from the provided data. Procedures for mix design to reach a target compressive strength have also been proposed by Ferdous et al. [28] and Pavithra et al. [29]. Furthermore, mathematical estimations of the compressive strength of geopolymer concrete have been carried out. Mohammed et al. [30] worked on collected data and related equations proposed in the literature to recommend equations for estimating the mechanical properties of geopolymer concrete. More research in this direction can be found in Ref. [31].

Experimental work consumes considerable labor, time, and money, and practical design procedures based on it still impose limitations on long-term applications. This is because a design procedure is developed from results obtained in specific experimental tests. If new data are gathered with parameters that fall outside the ranges of the previous tests, they may reveal new trends and insights, and adjusting the current design procedure to account for such findings is not straightforward and may require considerable effort. Hence, new concepts with more complexity, adequate efficiency, and, more importantly, high scalability are current subjects of research. Techniques based on machine learning (ML) have emerged for developing predictive models due to the synergy between vast, readily available databases and computational technologies. An ML model learns from features contained in the data and offers new resources for engineering applications. The predictive power of a model can be further improved via the model fitting process when the data source is expanded.

Those advances have significantly altered approaches to solving engineering problems, shifting the focus from conventional mathematical models to data-driven ML models. An illustrative example of this paradigm shift can be found in Ref. [32], where the authors utilized ML-based methods to construct a surrogate model capable of predicting the flexural strength and stiffness of reinforced concrete beams; this surrogate model was subsequently employed in optimizing the structure's design. The authors of Ref. [33] employed deep neural networks (DNNs), a sub-branch of ML-based methods, as an approximation means for finding solutions of partial differential equations (PDEs) in the context of mechanical problems. In that study, the flexibility of the network architecture and the efficiency of the implemented algorithms were taken as advantages to build an approximate solution for PDEs. Guo et al. [34] introduced the deep collocation method as an alternative ML-based approach for the bending analysis of Kirchhoff plates. In that method, a loss function was designed for the neural network to minimize the residuals of the PDEs associated with the plate bending problem at collocation points distributed randomly within the domain and along the boundary. Those authors further expanded their research on the deep collocation method, incorporating optimization techniques for neural architecture search and transfer learning to reduce computational costs [35]. Zhuang et al. [36] introduced an unsupervised learning method, the deep autoencoder based energy method, for analyzing the bending, vibration, and buckling behaviors of Kirchhoff plates. In that approach, the loss functions of the ML model incorporated the physical principles associated with the bending, vibration, and buckling problems, and were minimized at collocation points located within the domain of the plate.

In the field of concrete materials, ML has been actively applied to mechanical prediction [37]. Recently, a large number of studies have developed ML-based models for predicting the compressive strength of geopolymer concrete. Van Dao et al. [38] built two models, namely adaptive neuro-fuzzy inference and artificial neural network models, considering four input parameters with a database of 210 samples. Nguyen et al. [39] designed DNN and deep residual network models to estimate the compressive strength of geopolymer from 335 experimental samples; a predetermined split of training and testing sets with respective proportions of 80% and 20% was implemented. Shahmansouri et al. [40] carried out an experimental study on the effect of silica fume and natural zeolite, partially substituting for ground granulated blast-furnace slag, on the compressive strength of geopolymer concrete; the authors then developed an artificial neural network model from 117 experimental specimens with five input variables to predict the compressive strength. Gupta and Rao [41] proposed artificial neural network, multiple linear regression, and multivariate nonlinear regression models to predict 28-d compressive strength, with data built from 289 samples and 12 input variables. Rahmati and Toufigh [42] constructed artificial neural network and support vector machine models to predict the compressive strength of geopolymer concrete, focusing on behavior at high temperatures ranging from 100 °C to 1000 °C. Ahmad et al. [43] developed artificial neural network, boosting, and AdaBoost models with nine input variables to forecast the compressive strength of geopolymer concrete; the boosting model was shown to perform best, after which a K-fold cross-validation procedure was carried out to confirm its precision. Emarah [44] collected a large data set of 860 samples with 12 effective input parameters to develop three ML-based models, including artificial neural networks, DNNs, and deep residual networks, for estimating the compressive strength; K-fold cross-validation was used to confirm model performance. Ahmad et al. [45] conducted a study with a similar approach but with different ML-based algorithms, namely decision tree, bagging regressor, and AdaBoost regressor, to predict the compressive strength from nine input parameters; K-fold cross-validation was again used for further confirmation of model performance. Peng and Unluer [46] carried out a study on predicting the 28-d compressive strength of geopolymer concrete with three ML algorithms, including backpropagation neural network, support vector machine, and extreme learning machine; the data comprised 110 samples with ten input variables and was divided into training, validating, and testing sets without implementing the K-fold cross-validation approach.

Owing to the complexity with which the compressive strength may develop and be affected, appraisal of different predicting models, with input variables across various ranges of values, is still required. In this study, we present three ML-based models for predicting the compressive strength of geopolymer concrete, namely DNN, K-nearest neighbors (KNN), and support vector machines (SVM). To contribute to the overall picture by diversifying the ranges of the influencing variables, we collected 375 samples from recently published reports in the literature to construct the database used for developing and investigating the models. The compressive strength corresponding to each ingredient content was plotted to indicate the characteristics of the database. Eight variables, namely coarse aggregate content, fine aggregate content, NaOH content, Na2SiO3 content, aluminosilicate content, curing temperature, curing time, and age, were considered as input contributors with potential influence on the compressive strength output. In most of the studies reviewed, data was divided in a predetermined manner into training, validating, and testing sets, and this may have led to some underestimation or overestimation in the assessments of model performance. Some studies applied cross-validation for the final or further examination of model performance. This study proposes a new direction for developing a data-driven model, in which the hyper-parameters are designed directly via a K-fold cross-validation process so that generalizability is monitored from the outset. Afterwards, the models are fitted utilizing the whole database, in preparation for future predictions.

In our work, we attempted not only to arrive at optimal architectures for the models according to the available data, but also to present thorough procedures for developing an ML-based model for a given data set. In particular, a careful data preprocessing procedure was implemented, in which each input variable was standardized across samples to ensure the robustness of the model's learning, and outliers within the data set were examined and removed from the training processes. Additionally, techniques to enhance training productivity were introduced and applied, including early stopping, dropout, and layer normalization. We confined our work to developing effective data-driven ML models for predicting the compressive strength of geopolymer concrete, in order to facilitate engineering applications; in-depth attention to geopolymerization, as well as experimental investigation, was outside our scope.

The estimations made by the trained models exhibited a high correlation with the actual compressive strength values, with R values around 0.9, demonstrating the efficiency of the models. Furthermore, low fluctuations were recorded over the cross-validation process, with the MAE values varying from just 6% to 12% of the average compressive strength value. This confirmed the generalizability of the hyper-parameters tuned for the models.

The remainder of the paper is organized as follows. Section 2 presents the methodology of the proposed methods. Section 3 provides the numerical results and discusses them. Finally, conclusions and future work are described in Section 4.

2 Methodology

For developing an ML-based model, its performance must of course receive primary attention. Thus, to construct a methodology from which efficient steps for the development of the model can be derived, we interpret performance in two distinct aspects, namely generalizability and effectiveness.

Generalizability represents the condition in which equivalent outcomes are attained when the model performs on any selection of the data seen and unseen during the fitting process. This is basically governed by appropriate selection of hyper-parameters. The generalizability of a model can be monitored by applying the K-fold cross-validation approach, which is illustrated in a later section. Effectiveness describes how accurately the model can estimate the target for a given input. This characteristic is achieved via an optimization process through which weight and bias factors are obtained such that the loss value is minimized. The optimization process is carried out with all data after the hyper-parameters are tuned via the K-fold cross-validation approach. The proposed procedure for developing the predicting model is indicated in the general flowchart shown in Fig.1.

To foster training effectiveness, the following techniques for assisting model training were applied.

1) Early stopping (via a callback method): this technique recognizes the epoch at which the model is no longer able to enhance its performance. At that epoch, the fitting process is stopped, and the best weight and bias factors from the training history are retrieved for future predictions.

2) Layer normalization [47]: this setup transforms the signal generated by a hidden layer, normalizing each hidden-unit value over the values of all hidden units in the current hidden layer.

3) Dropout [48]: during the model fitting process, at a layer to which dropout is assigned, a portion of the hidden units, determined by the dropout rate, is randomly eliminated from the process. This can help the model avoid overfitting during the learning process.
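The patience-based logic behind early stopping (technique 1 above) can be sketched in plain Python, independent of any particular ML framework; the function name and the toy loss sequence below are illustrative, not taken from the study's code.

```python
def fit_with_early_stopping(val_losses, patience=3):
    """Return (best_epoch, best_loss) given per-epoch validation losses.

    Training stops `patience` epochs after the best loss was observed,
    and the weights saved at `best_epoch` would then be restored.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:           # improvement: remember this epoch
            best_loss, best_epoch = loss, epoch
            wait = 0
        else:                          # no improvement: count towards patience
            wait += 1
            if wait >= patience:       # patience exhausted: stop training
                break
    return best_epoch, best_loss

# Example: validation loss stops improving after epoch 3
best = fit_with_early_stopping([1.0, 0.8, 0.7, 0.65, 0.66, 0.7, 0.71])
```

In a real training loop the same bookkeeping would run once per epoch, with the model weights checkpointed whenever a new best loss is seen.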

The performances of the three models were supervised by standard statistical metrics, including the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), correlation coefficient (R), and the recently proposed performance index (PI) [49], by which the most feasible model for the problem could be suggested.

\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i^{\mathrm{pred}} - y_i^{\mathrm{true}}\right)^{2},

\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i^{\mathrm{pred}} - y_i^{\mathrm{true}}\right)^{2}},

\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i^{\mathrm{pred}} - y_i^{\mathrm{true}}\right|,

R = \frac{n\sum_{i=1}^{n} y_i^{\mathrm{pred}}\, y_i^{\mathrm{true}} - \left(\sum_{i=1}^{n} y_i^{\mathrm{pred}}\right)\left(\sum_{i=1}^{n} y_i^{\mathrm{true}}\right)}{\sqrt{n\sum_{i=1}^{n}\left(y_i^{\mathrm{pred}}\right)^{2} - \left(\sum_{i=1}^{n} y_i^{\mathrm{pred}}\right)^{2}}\,\sqrt{n\sum_{i=1}^{n}\left(y_i^{\mathrm{true}}\right)^{2} - \left(\sum_{i=1}^{n} y_i^{\mathrm{true}}\right)^{2}}},

\mathrm{PI} = \frac{R_{\mathrm{RMSE}}}{R + 1}, \qquad R_{\mathrm{RMSE}} = \frac{\mathrm{RMSE}}{\left|\bar{y}^{\mathrm{true}}\right|},

where y_i^{pred} and y_i^{true} are the ith predicted and experimental values of the compressive strength, respectively; \bar{y}^{true} is the average value of the experimental compressive strength; and n is the number of samples.
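These metrics can be computed directly from their definitions; the NumPy sketch below is illustrative, and the helper name `metrics` is an assumption, not a function from the paper.

```python
import numpy as np

def metrics(y_pred, y_true):
    """Compute MSE, RMSE, MAE, R, and PI from predicted and true values."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    mse = np.mean((y_pred - y_true) ** 2)      # mean squared error (the loss)
    rmse = np.sqrt(mse)                        # root mean squared error
    mae = np.mean(np.abs(y_pred - y_true))     # mean absolute error
    r = np.corrcoef(y_pred, y_true)[0, 1]      # Pearson correlation coefficient
    rrmse = rmse / abs(y_true.mean())          # relative RMSE
    pi = rrmse / (r + 1.0)                     # performance index
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R": r, "PI": pi}
```

For a perfect prediction the errors and PI are zero and R equals one, which is a quick sanity check on the implementation.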

In the following sections, the fundamentals of the three ML-based models, namely DNN, KNN, and SVM, are described. All the models are developed in the Python programming language with the assistance of computational modules within ML programming libraries.

3 Database

The data set used in this study was collected from experimental results, where the input parameters were obtained through an extensive review of published articles in the literature. Our goal was to compile as many input parameters as possible based on the available information. These input parameters encompass eight factors, namely coarse aggregate content (kg/m3), fine aggregate content (kg/m3), NaOH content (kg/m3), Na2SiO3 content (kg/m3), aluminosilicate material content (kg/m3), curing temperature (°C), curing time (h), age of specimen (d), with the corresponding output being the compressive strength (MPa) of the geopolymer concrete.

To preserve the intrinsic properties and physical characteristics of the material, including those that may not have been previously discovered, the data used for model development in this study was obtained directly from experimental research findings, without employing any interpolation or data enrichment techniques. First, 30 outlier samples were identified and removed from the database before the random split for the cross-validation process, after which 345 samples remained for the development of the models. Next, to enhance training efficiency, a standardization calculation was performed separately for each input variable across the samples of the data set. The standardization process mapped the values of each input variable to a distribution with a mean of zero and a standard deviation of one; this was obtained by scaling each input variable separately, subtracting its mean and dividing by its standard deviation.
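The per-variable standardization described above can be sketched as follows; the two-column toy matrix is illustrative only, not data from the study.

```python
import numpy as np

def standardize(X):
    """Map each input column to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)       # per-variable mean across samples
    std = X.std(axis=0)         # per-variable standard deviation across samples
    return (X - mean) / std

# Toy example with two input variables (e.g., two ingredient contents)
X_scaled = standardize([[800.0, 10.0], [900.0, 12.0], [1000.0, 14.0]])
```

After the transformation, every column of `X_scaled` has mean 0 and standard deviation 1, which places the eight input variables on a common scale before fitting.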

The significance of an ML-based model relies heavily on the quality and characteristics of the collected data. Hence, to provide an overview, the features of the collected data set are illustrated. The relationship between the mix design variables and compressive strength is depicted in Fig.2, while Tab.1 presents the ranges of numerical values covered by the database. Additionally, the linear relationships between the input variables are quantified using the Pearson correlation coefficient, and the results are displayed in Fig.3. This correlation map offers a preliminary understanding of the data, revealing that most variables are independent of each other. However, two pairs of variables exhibit relatively strong correlations: the coarse aggregate content demonstrates a relatively high correlation with the fine aggregate content, with a correlation coefficient exceeding 0.5, while the curing temperature and curing time are highly correlated, with a correlation coefficient of 0.85. Taking advantage of the power of ML methods in feature extraction, we opted to include all variables when constructing the prediction models, regardless of the correlations described above.
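A correlation map like Fig.3 is a pairwise Pearson computation over the input columns; the sketch below uses pandas with illustrative column names and values standing in for the actual database.

```python
import pandas as pd

# Four of the eight input variables, with made-up values for illustration
df = pd.DataFrame({
    "coarse_aggregate": [1100.0, 1150.0, 1200.0, 1250.0],
    "fine_aggregate":   [550.0, 570.0, 610.0, 640.0],
    "curing_temp":      [25.0, 60.0, 80.0, 90.0],
    "curing_time":      [24.0, 24.0, 48.0, 48.0],
})

# Symmetric matrix of pairwise Pearson correlation coefficients
corr = df.corr(method="pearson")
```

Each diagonal entry is 1, and off-diagonal entries near ±1 flag pairs such as curing temperature and curing time that move together in the data.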

4 K-fold cross-validation

Hyper-parameters of the predicting models were tuned by a trial-and-error approach, in which both the generalizability and the effectiveness of the models were considered. To achieve this, the hyper-parameters of the model were first configured. Then, the K-fold cross-validation approach was applied to inspect the generalizability of the model, for which the data set was shuffled and split into k subsets, termed folds. All except one of the subsets were then selected for fitting the model, while the remaining subset was used for testing the prediction.

This procedure was rerun with the current selection of hyper-parameters until every subset had been selected for the testing task, through which the generalizability of the model could be evaluated according to the fluctuation of the outcomes among the folds. In particular, after each fold had been completed, the observed metrics were recorded, and the mean and standard deviation values of these metrics over the runs were calculated. By assessing these results, the hyper-parameters were revised until appropriate performance was reached. Afterwards, with the obtained hyper-parameters, the model parameters, including weight and bias coefficients, were fitted by optimization using the full data set. Considering the size of the available database and the recommendation of previous research [50], the number of subsets, k, was chosen to be 5. The schematic representation of the designed 5-fold cross-validation is illustrated in Fig.4.

In Fig.4, {P} = {MSE, RMSE, MAE, R} denotes the resulting metrics, {P}_mean is the set of mean values corresponding to {P}, and σ_{P} stores the corresponding standard deviation values.
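The 5-fold loop can be sketched as below: shuffle, split, score each held-out fold, then summarize by mean and standard deviation. The data are synthetic and the "model" is a deliberately trivial stand-in (predicting the training-fold mean), not one of the study's models.

```python
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))                   # 8 input variables, toy samples
y = rng.normal(loc=40.0, scale=5.0, size=40)   # toy compressive strengths

fold_mae = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    # Stand-in "model": predict the mean of the training fold
    y_hat = np.full(len(test_idx), y[train_idx].mean())
    fold_mae.append(np.mean(np.abs(y[test_idx] - y_hat)))  # MAE on held-out fold

mae_mean, mae_std = float(np.mean(fold_mae)), float(np.std(fold_mae))
```

A small `mae_std` relative to `mae_mean` across the folds is the fluctuation criterion used to judge generalizability before the final fit on the full data set.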

5 Machine learning models

5.1 Deep neural network

In brief, the DNN is a member of the deep learning family, which is a type of ML within the artificial intelligence field. The ‘deep’ in DNN signifies the ability to deepen the network by appending sequential layers; this enables the model to efficiently handle challenging tasks involving large and nonlinear data, making the DNN model considerably competitive for developing predictions from raw data.

Basically, a typical DNN model is made up of three main parts, namely an input layer, hidden layers, and an output layer. The architecture of a DNN model is illustrated in Fig.5. Fundamentally, at each hidden layer an activation transformation is applied to the input signal, after which the activated signal becomes the input information for the next layer. This process is repeated until the signal arrives at the output layer, where the predicted results are made. In particular, two mathematical operators are implemented at each node within a hidden layer to produce the activated signal feeding the next layer: linear summation and transformation by the activation function, as indicated in the following equation:

a_j^{(l)} = A\left(\sum_{k=1}^{n_L^{(l-1)}} a_k^{(l-1)} w_{kj}^{(l)} + b_j^{(l)}\right),

where a_j^{(l)} is the output signal at the jth node of the lth hidden layer, a_k^{(l-1)} is the output signal at the kth node of the (l−1)th hidden layer, w_{kj}^{(l)} is the weight factor connecting the kth node to the jth node, b_j^{(l)} is the bias at the jth node, n_L^{(l-1)} is the number of nodes in the (l−1)th hidden layer, and A(·) is the chosen activation function.

In the development of a DNN model, the hyper-parameters to be tuned include the number of hidden layers, the number of neural units per hidden layer, the activation function, the optimizer, the batch size, and the dropout (position and rate). As shown in the literature, with a proper design of the hyper-parameters combined with pertinent training, a DNN model can deal with experimental data and give good predictive results. This makes the DNN an effective tool for building an approximating model such that a set of input variables can be mapped to a target outcome with acceptable accuracy.
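The layer equation above can be sketched as a forward pass in NumPy, with ReLU activations in the hidden layers and a linear output for regression; the layer sizes (8 → 16 → 16 → 1) and random weights are illustrative, not the tuned architecture of the study.

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass: a^(l) = A(a^(l-1) W^(l) + b^(l)) per hidden layer."""
    a = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(0.0, a @ W + b)    # hidden layer: linear sum + ReLU
    return a @ weights[-1] + biases[-1]   # output layer: linear (regression)

rng = np.random.default_rng(0)
sizes = [8, 16, 16, 1]   # 8 inputs -> two hidden layers -> 1 output
weights = [rng.normal(scale=0.1, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
y_hat = forward(rng.normal(size=8), weights, biases)
```

Training then consists of adjusting `weights` and `biases` so that the MSE loss between `y_hat` and the experimental strength is minimized.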

5.2 K-nearest neighbors

KNN is a Supervised Learning technique that can be used for both classification and regression problems. The graphical illustration for the KNN method is presented in Fig.6.

The KNN algorithm assumes similarity between new observations and the observations in the data set. In other words, the KNN algorithm stores all the available data and evaluates a new observation based on its similarity to the stored data. The input consists of the k observations in the training set nearest to the new observation. The output depends on whether KNN is used for classification or regression: in KNN regression, the output value is the average of the values of the k nearest neighbors. Therefore, if k = 1, the output is equal to the value of the single nearest neighbor.

There are several methods for measuring the distance between two observations. For example, below are three common formulas for the distance between x and y with k attributes:

Euclidean: \sqrt{\sum_{i=1}^{k}\left(x_i - y_i\right)^{2}},

Manhattan: \sum_{i=1}^{k}\left|x_i - y_i\right|,

Minkowski: \left(\sum_{i=1}^{k}\left|x_i - y_i\right|^{q}\right)^{1/q}.

The most popular method for continuous variables is Euclidean distance.
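The three distance measures and the KNN regression rule above can be sketched as follows; `knn_predict` is an illustrative helper, not the study's implementation.

```python
import numpy as np

def euclidean(x, y):
    return float(np.sqrt(np.sum((np.asarray(x) - np.asarray(y)) ** 2.0)))

def manhattan(x, y):
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y))))

def minkowski(x, y, q):
    # q = 1 recovers Manhattan distance; q = 2 recovers Euclidean distance
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y)) ** q) ** (1.0 / q))

def knn_predict(x_new, X_train, y_train, k=3, dist=euclidean):
    """KNN regression: average the targets of the k nearest training points."""
    distances = [dist(x_new, x) for x in X_train]
    nearest = np.argsort(distances)[:k]    # indices of the k closest points
    return float(np.mean(np.asarray(y_train)[nearest]))
```

With k = 1 the prediction reduces to the target of the single nearest neighbor, matching the special case noted above.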

5.3 Support vector machines

SVM is a supervised ML algorithm that can also be used for classification or regression problems. The SVM algorithm plots each observation as a point in n-dimensional space (n is the number of features), with every feature represented by a coordinate. SVM then attempts to determine a line/hyperplane (in multidimensional space) that separates the observations. When predicting new observations, SVM classifies them based on their positions relative to the hyperplane.

In regression problems, SVM considers the points within a decision boundary around the best-fit line. The best-fit line is the hyperplane that contains the maximum number of observations. A boundary is then determined around the original hyperplane, and SVM takes into account only the observations that lie within this decision boundary and have the least error, i.e., observations within the margin of tolerance.

There are several crucial features of SVM.

1) Kernel. A kernel helps locate a hyperplane in a higher-dimensional space without raising the computational cost. Indeed, the computational cost usually grows as the dimension of the data increases, and this increase in dimension is needed when a suitable hyperplane cannot be found in the original space.

2) Hyperplane. This is the line/plane that separates two data classes in SVM. In support vector regression, the hyperplane is the line/plane that helps to predict the continuous output.

3) Decision boundary. A decision boundary can be considered a demarcation line on either side of the hyperplane; based on the decision boundary, observations can be classified as either positive or negative. This concept also applies to regression problems. Similar to classification approaches, a regression model also seeks to optimize generalization bounds. This is achieved by using an epsilon-insensitive loss function, which ignores errors within a certain distance from the true value. Fig.7 illustrates a regression function with an epsilon-insensitive band, where slack variables measure the cost of errors on the training points. Epsilon defines a tolerable error for the regression model, with its value depending on the specific problem: a higher tolerable error is obtained by widening the epsilon-insensitive band, and vice versa.
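The epsilon-insensitive idea can be sketched with scikit-learn's `SVR`; the synthetic near-linear data, kernel choice, and parameter values below are illustrative assumptions, not the settings tuned in the study.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(60, 1))
y = 2.0 * X.ravel() + rng.normal(scale=0.2, size=60)   # near-linear toy data

# epsilon sets the half-width of the insensitive band: training errors
# smaller than epsilon incur no loss
model = SVR(kernel="rbf", C=100.0, epsilon=0.1)
model.fit(X, y)
prediction = model.predict([[5.0]])   # expected to be roughly 2 * 5 = 10
```

Widening `epsilon` tolerates larger training errors and typically yields fewer support vectors, which is the trade-off described above.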

6 Results and discussions

First, the settings of the hyper-parameters for the DNN, KNN, and SVM models tuned by K-fold cross-validation are presented. Tab.2 shows the configuration of the DNN model, while Tab.3 and Tab.4 list the selected hyper-parameters for the KNN and SVM models, respectively.

Second, the performance of the models with finely tuned hyper-parameters is confirmed for two cases: the K-fold cross-validation process and the fitting process. For the K-fold cross-validation process, the statistical measurements are recorded and shown in Tab.5. Next, the values predicted by the DNN, KNN, and SVM models are compared with the experimental compressive strength in Fig.8–Fig.22. Additionally, the correlations between predicted and experimental values are indicated.

From Fig.8–Fig.22, it can be seen that all three models give good estimations of the compressive strength when compared to the experimental values. Additionally, examining the results over the cross-validation process, the models also show adequate generalizability because, for the different data splits, the outcomes are obtained with little variation. Furthermore, the correlation charts indicate that the predictions correlate well with the actual values.

The numerical results indicate that the DNN and SVM models yield relatively equivalent performance and outperform the KNN model, with the DNN model slightly better than its SVM counterpart. From Tab.5, the DNN model gives the predictions closest to the experimental data, with the average values of the correlation coefficient for the training and testing sets being 0.9609 and 0.8903, respectively, while those values are 0.8653 and 0.7691 for the KNN model and 0.9392 and 0.8628 for the SVM model. The DNN model also shows the lowest fluctuation, with standard deviations for the training and testing sets of 0.0027 and 0.0280, much smaller than the 0.0057 and 0.0596 of KNN and the 0.0065 and 0.0423 of SVM.

The RMSE values show a similar trend. For the DNN model, the average values for training and testing sets across the splits are (3.7207 ± 0.0851) and (6.0568 ± 0.6836), respectively. The figures for the KNN model are (6.6998 ± 0.0924) and (8.5465 ± 0.6947), while those for the SVM model are (4.5845 ± 0.2570) and (6.6792 ± 0.7736).

As recommended in Ref. [49], a model is considered to give reliable predictions if the resulting PI is close to zero, with 0.2 as the acceptable threshold.

Among the K-fold cross-validation runs, all the models give PIs much less than 0.2 for both training and testing sets. The PI value achieved by the DNN model over the cross-validation is 0.0466 for the training set, fluctuating by only 0.0009, and 0.0788 for the testing set, varying by 0.0101. The SVM model gives a performance comparable to that of the DNN model, with PIs of (0.0580 ± 0.0035) for the training set and (0.0882 ± 0.0114) for the testing set. Finally, the figures for the KNN model are (0.0881 ± 0.0013) and (0.1187 ± 0.0098) for the training and testing sets, respectively.

The effectiveness of a model can also be evaluated via the MAE. With the average compressive strength being 40.76, the MAE values of the DNN, SVM, and KNN models fall within acceptable ranges: from 6% to 12% of the average on the training data set and from 10% to 15% on the testing data set.

For the fitting process, the performance of the three models with tuned hyper-parameters over the whole database is shown in Tab.6. The prediction outcomes and the correlation presentations are plotted via Fig.23, Fig.24, and Fig.25 for the DNN, KNN, and SVM models, respectively.

Fig.23 and Fig.25 show that the DNN and SVM models give good predictions, since the estimations (red dashed line) agree well with the actual data (black dashed line) and the correlation points cluster densely around the perfect-fit lines. Compared with the DNN and SVM models, KNN shows lower performance, as can be seen in Fig.24.

From the numerical results summarized in Tab.6, all models in general provide adequate estimations of the compressive strength. Among them, the DNN model performs best, ranking highest in correlation coefficient at 0.9594, followed by 0.9352 for the SVM model and 0.8737 for the KNN model. The MAE of the DNN model is 2.5911, only about 6% of the average compressive strength and the smallest error compared with 3.1993 for SVM and 4.7125 for KNN. The RMSE values show a similar trend, with respective figures of 3.7484, 4.7205, and 6.4973 for the DNN, SVM, and KNN models.

As for the PI, all models make predictions with low PI values of 0.0469, 0.0599, and 0.0851 for the DNN, SVM, and KNN models, respectively, well below the recommended threshold of 0.2.

Among the regression models developed in this study, the DNN model exhibits superior performance compared to the other two models. However, this comparison should be interpreted in the context of the specific data set used here, as the characteristics of the database can significantly influence a model’s efficiency. It should also be acknowledged that the development of these models involved a trial-and-error approach, so the design choices may be influenced by subjective factors. In a recent research work [53], an optimization approach was employed to determine the most favorable architecture for an ML-based model, aiming to achieve optimal prediction efficiency for a specific experimental database. This may offer effective guidance on the architecture design of an ML-based model.

Based on these assessments, our future research direction will involve exploring optimization-based methods to automatically tune the hyperparameters of the models. This approach aims to maximize the model’s performance in predicting a target value from an experimental database. Additionally, we plan to investigate specialized machine learning models that incorporate tailored features to account for the unique properties of the problem and the specific characteristics of the data. In particular, we will focus on studying methods such as XGBoost [54], LightGBM [55], and CatBoost [56]. These models are well-suited for handling sparse data and can effectively reduce computational costs, making them promising candidates for our research application.
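The gradient-boosting idea behind XGBoost, LightGBM, and CatBoost can be illustrated with a deliberately minimal from-scratch sketch: each round fits a depth-one tree (a "stump") to the current residuals and adds a damped copy of it to the running prediction. This is not any of the cited libraries, which add many refinements (regularization, histogram splits, categorical handling); it only shows the core loop.

```python
import numpy as np

def fit_stump(X, residual):
    """Find the (feature, threshold) split whose two leaf means best fit the residual."""
    best = None  # (sse, feature, threshold, left_value, right_value)
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            lv, rv = residual[left].mean(), residual[~left].mean()
            sse = np.sum((residual - np.where(left, lv, rv)) ** 2)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]

def boost(X, y, n_rounds=50, lr=0.1):
    """Gradient boosting with squared loss: each stump is fitted to the residual."""
    pred = np.full(len(y), y.mean())
    stumps = []
    for _ in range(n_rounds):
        j, t, lv, rv = fit_stump(X, y - pred)
        pred += lr * np.where(X[:, j] <= t, lv, rv)
        stumps.append((j, t, lv, rv))
    return y.mean(), stumps

def predict(model, X, lr=0.1):
    """Sum the base value and the damped stump contributions (lr must match boost)."""
    base, stumps = model
    pred = np.full(len(X), base)
    for j, t, lv, rv in stumps:
        pred += lr * np.where(X[:, j] <= t, lv, rv)
    return pred

# Synthetic demonstration with 8 input variables, as in the paper's database.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 40 + 5 * X[:, 0] + 3 * X[:, 1] ** 2
model = boost(X, y)
rmse = np.sqrt(np.mean((predict(model, X) - y) ** 2))
print(f"training RMSE: {rmse:.3f}")
```

Because each stump is the least-squares fit to the residual, every damped update strictly reduces the training loss, which is the property that makes boosting effective even with very weak learners.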

Despite being built with a relatively small database, the three models give proper predictions with high correlations, demonstrating the potential of the proposed models in forecasting the compressive strength of geopolymer concrete. However, the compressive strength is governed by a complex combination of factors. The input variables for the models in this study are limited to eight, so the input information cannot cover all contributors to the compressive strength, which leaves the models’ learning somewhat coarse. Hence, adjusting the model design can only sharpen the tool; it cannot by itself improve performance indefinitely. An appropriate database is needed to back it up, in both quantitative and qualitative terms. Constructing an ML-based model for predicting compressive strength should therefore be viewed as a synergy of experimental work and programming development. Accordingly, in addition to the ongoing development of more effective ML-based models for compressive strength estimation, our forthcoming work encompasses expanding the existing database and enhancing the data pre-processing procedure.

In this study, samples giving irregular errors were eliminated from the database manually. A further concern is therefore the unexpected errors and noise produced during experimental work, which may blur the model’s perception during the learning process. Data preprocessing thus plays an important, prerequisite role in the successful development of an ML-based model.
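The manual screening and standardization steps discussed here can be automated. The sketch below is a generic illustration, not the screening rule actually used in this study: it flags samples whose target lies more than three standard deviations from the mean and then rescales each of the eight input variables to zero mean and unit variance, as described in the data-preprocessing procedure.

```python
import numpy as np

def zscore_screen(X, y, threshold=3.0):
    """Flag samples whose target lies more than `threshold` std devs from the mean."""
    z = np.abs((y - y.mean()) / y.std())
    keep = z <= threshold
    return X[keep], y[keep], np.flatnonzero(~keep)

def standardize(X):
    """Zero-mean, unit-variance scaling of each input variable (column)."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

# Synthetic demonstration: 375 strengths around 40.76 with one injected outlier.
rng = np.random.default_rng(1)
y = rng.normal(40.76, 8.0, size=375)
y[5] = 200.0  # irregular sample mimicking an experimental error
X = rng.normal(size=(375, 8))

Xc, yc, dropped = zscore_screen(X, y)
Xs, mu, sigma = standardize(Xc)
print("dropped samples:", dropped)
```

Note that the scaling parameters `mu` and `sigma` must be stored from the training data and reused unchanged when standardizing any new sample at prediction time.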

It is important to note that a trained model can never acquire information beyond what is contained in the database. Therefore, predictions for a sample with a mix proportion significantly outside the range of input variables in the database may be unreliable. However, this does not discredit the applicability of the proposed methods since, as discussed above, the capability of the models can be enhanced as the database is enlarged and refined.
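A simple guard against such extrapolation is to compare each input variable of a new mix against the range spanned by the training data before trusting the prediction. The sketch below is a hypothetical illustration; the variable names and values are invented for the example, not taken from the paper's database.

```python
import numpy as np

def in_training_range(x_new, X_train, tol=0.0):
    """Per-variable check of a new sample against the training min–max range."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return (x_new >= lo - tol) & (x_new <= hi + tol)

# Hypothetical two-variable training set (e.g., a binder dosage and a w/b ratio).
X_train = np.array([[100.0, 0.40],
                    [150.0, 0.50],
                    [120.0, 0.45]])

ok = in_training_range(np.array([130.0, 0.90]), X_train)
print(ok)  # first variable inside the range, second outside → low confidence
```

Any `False` entry signals that the model is being asked to extrapolate on that variable, so the resulting strength estimate should be treated with caution.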

7 Conclusions

In this study, three ML-based models for predicting the compressive strength of geopolymer concrete were successfully developed by utilizing experimental data. Hyper-parameters of the models were carefully tried and tuned via a K-fold cross-validation approach to guarantee generalizability and to foster the effectiveness of the prediction. Furthermore, the quality of the samples in the database was examined in order to promote the efficiency of the training process. The built models show high effectiveness in estimating the compressive strength of geopolymer concrete, with R values around 0.9. Generalizability is also demonstrated to be adequate, since the standard deviations of the MAE among the models account for only 5% to 12% of the MAE mean values. Additionally, the developed models perform well when evaluated by the recently proposed PI metric. The DNN model, in general, provides better performance than the other two models. The outcomes of this study suggest a new direction for developing a flexible and powerful tool for predicting the compressive strength of geopolymer concrete, including for mix proportion design.

In our future work, we have identified several key directions to focus on.

1) Developing new ML-based models with high performance for estimating physical properties based on experimental data. We aim to further enhance the accuracy and reliability of our models in predicting the target variable of interest.

2) Exploring optimization-based methods for tuning the hyperparameters of the models. We plan to treat the hyperparameters as design variables in an optimization process. By formulating an objective function based on the model’s loss value, we can automatically select optimal hyperparameters that maximize the model’s performance on the available database. Stability and generalizability of the optimization process will be carefully considered.

3) Sensitivity analysis will be conducted so that, via the attained model, the extent to which each input variable contributes to the target value can be deduced.
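The hyperparameter-tuning direction in item 2) above can be sketched concretely: treat each candidate hyperparameter value as a design point, use the K-fold cross-validation loss as the objective, and select the minimizer. The example below is a simplified stand-in, using the ridge penalty of a linear model as the single hyperparameter rather than the DNN/KNN/SVM settings tuned in this study.

```python
import numpy as np

def cv_mse(X, y, fit, predict, k=5, seed=0):
    """Average validation MSE over k folds — the objective to be minimized."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    losses = []
    for i in range(k):
        val = folds[i]
        tr = np.concatenate([folds[j] for j in range(k) if j != i])
        params = fit(X[tr], y[tr])
        losses.append(np.mean((predict(params, X[val]) - y[val]) ** 2))
    return float(np.mean(losses))

def make_fit(alpha):
    """Ridge regression with penalty `alpha` — the hyperparameter under study."""
    def fit(X, y):
        return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)
    return fit

predict = lambda w, X: X @ w

# Synthetic database with the paper's dimensions: 375 samples, 8 variables.
rng = np.random.default_rng(3)
X = rng.normal(size=(375, 8))
y = X @ rng.normal(size=8) + rng.normal(scale=0.5, size=375)

grid = [0.01, 0.1, 1.0, 10.0]
scores = {a: cv_mse(X, y, make_fit(a), predict) for a in grid}
best_alpha = min(scores, key=scores.get)
print("best alpha:", best_alpha)
```

A grid search like this is the simplest instance of the optimization process envisaged above; gradient-free optimizers or genetic algorithms, as in Ref. [53], replace the grid with an adaptive search over the same objective.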

References

[1]

Davidovits J, Cordi S A. Synthesis of new high temperature geo-polymers for reinforced plastics/composites. In: Pacific Technical Conference and Technical Displays. Costa Mesa, CA: Society of Plastic Engineers, 1979

[2]

Khale D, Chaudhary R. Mechanism of geopolymerization and factors influencing its development: A review. Journal of Materials Science, 2007, 42(3): 729–746

[3]

Farhan K Z, Johari M A M, Demirboğa R. Assessment of important parameters involved in the synthesis of geopolymer composites: A review. Construction and Building Materials, 2020, 264: 120276

[4]

Li Y, Min X, Ke Y, Liu D, Tang C. Preparation of red mud-based geopolymer materials from MSWI fly ash and red mud by mechanical activation. Waste Management, 2019, 83: 202–208

[5]

Khan K A, Raut A, Chandrudu C R, Sashidhar C. Design and development of sustainable geopolymer using industrial copper by product. Journal of Cleaner Production, 2021, 278: 123565

[6]

Sun Q, Tian S, Sun Q, Li B, Cai C, Xia Y, Wei X, Mu Q. Preparation and microstructure of fly ash geopolymer paste backfill material. Journal of Cleaner Production, 2019, 225: 376–390

[7]

Kiventerä J, Perumal P, Yliniemi J, Illikainen M. Mine tailings as a raw material in alkali activation: A review. International Journal of Minerals Metallurgy and Materials, 2020, 27(8): 1009–1020

[8]

Duxson P, Fernández-Jiménez A, Provis J L, Lukey G C, Palomo A, Van Deventer J S J. Geopolymer technology: The current state of the art. Journal of Materials Science, 2007, 42(9): 2917–2933

[9]

Gartner E. Industrially interesting approaches to ‘low-CO2’ cements. Cement and Concrete Research, 2004, 34(9): 1489–1498

[10]

Lloyd N A, Rangan B V. Geopolymer concrete: A review of development and opportunities. In: Proceedings of the 35th Conference on Our World in Concrete & Structures. Singapore, 2010: 25–27

[11]

Wang S, Liu B, Zhang Q, Wen Q, Lu X, Xiao K, Ekberg C, Zhang S. Application of geopolymers for treatment of industrial solid waste containing heavy metals: State-of-the-art review. Journal of Cleaner Production, 2023, 390: 136053

[12]

Al-Azzawi M, Yu T, Hadi M N S. Factors affecting the bond strength between the fly ash-based geopolymer concrete and steel reinforcement. Structures, 2018, 14: 262–272

[13]

Demie S, Nuruddin M F, Shafiq N. Effects of micro-structure characteristics of interfacial transition zone on the compressive strength of self-compacting geopolymer concrete. Construction & Building Materials, 2013, 41: 91–98

[14]

Deb P S, Nath P, Sarker P K. The effects of ground granulated blast-furnace slag blending with fly ash and activator content on the workability and strength properties of geopolymer concrete cured at ambient temperature. Materials & Design, 2014, 62: 32–39

[15]

Zhang H, Li L, Sarker P K, Long T, Shi X, Wang Q, Cai G. Investigating various factors affecting the long-term compressive strength of heat-cured fly ash geopolymer concrete and the use of orthogonal experimental design method. International Journal of Concrete Structures and Materials, 2019, 13(1): 63

[16]

Castillo H, Collado H, Droguett T. Factors affecting the compressive strength of geopolymers: A review. Minerals, 2021, 11(12): 1317

[17]

Assi L N, Deaver E, Elbatanouny M K, Ziehl P. Investigation of early compressive strength of fly ash-based geopolymer concrete. Construction & Building Materials, 2016, 112: 807–815

[18]

Nguyen K T, Ahn N, Le T A, Lee K. Theoretical and experimental study on mechanical properties and flexural strength of fly ash-geopolymer concrete. Construction & Building Materials, 2016, 106: 65–77

[19]

Olivia M, Nikraz H. Properties of fly ash geopolymer concrete designed by Taguchi method. Materials & Design, 2012, 36: 191–198

[20]

Shehab H K, Eisa A S, Wahba A M. Mechanical properties of fly ash based geopolymer concrete with full and partial cement replacement. Construction & Building Materials, 2016, 126: 560–565

[21]

Gunasekara C, Law D W, Setunge S. Long term permeation properties of different fly ash geopolymer concretes. Construction & Building Materials, 2016, 124: 352–362

[22]

Sarker P K. Bond strength of reinforcing steel embedded in fly ash-based geopolymer concrete. Materials and Structures, 2011, 44: 1021–1030

[23]

Farhan K Z, Johari M A M, Demirboğa R. Assessment of important parameters involved in the synthesis of geopolymer composites: A review. Construction and Building Materials, 2020, 264: 120276

[24]

Sarker P K, Haque R, Ramgolam K V. Fracture behaviour of heat cured fly ash based geopolymer concrete. Materials & Design, 2013, 44: 580–586

[25]

Noushini A, Aslani F, Castel A, Gilbert R I, Uy B, Foster S. Compressive stress−strain model for low-calcium fly ash-based geopolymer and heat-cured Portland cement concrete. Cement and Concrete Composites, 2016, 73: 136–146

[26]

Adak D, Sarkar M, Mandal S. Structural performance of nano-silica modified fly-ash based geopolymer concrete. Construction & Building Materials, 2017, 135: 430–439

[27]

Rangan B V. Fly Ash-Based Geopolymer Concrete. Perth: Curtin University of Technology, 2008

[28]

Ferdous M W, Kayali O, Khennane A. A detailed procedure of mix design for fly ash based geopolymer concrete. In: Proceedings of the Fourth Asia-Pacific Conference on FRP in Structures. Melbourne: APFIS, 2013: 11–13

[29]

Pavithra P, Srinivasula Reddy M, Dinakar P, Hanumantha Rao B, Satpathy B K, Mohanty A N. A mix design procedure for geopolymer concrete with fly ash. Journal of Cleaner Production, 2016, 133: 117–125

[30]

Mohammed A A, Ahmed H U, Mosavi A. Survey of mechanical properties of geopolymer concrete: A comprehensive review and data analysis. Materials, 2021, 14(16): 4690

[31]

Li N, Shi C, Zhang Z, Khennane A. A review on mixture design methods for geopolymer concrete. Composites Part B: Engineering, 2019, 178: 107490

[32]

Nariman N A, Hamdia K, Ramadan A M, Sadaghian H. Optimum design of flexural strength and stiffness for reinforced concrete beams using machine learning. Applied Sciences, 2021, 11(18): 8762

[33]

Samaniego E, Anitescu C, Goswami S, Nguyen-Thanh V M, Guo H, Hamdia K, Zhuang X, Rabczuk T. An energy approach to the solution of partial differential equations in computational mechanics via machine learning: Concepts, implementation and applications. Computer Methods in Applied Mechanics and Engineering, 2020, 362: 112790

[34]

Guo H, Zhuang X, Rabczuk T. A deep collocation method for the bending analysis of Kirchhoff plate. 2021, arXiv: 2102.02617

[35]

Guo H, Zhuang X, Chen P, Alajlan N, Rabczuk T. Stochastic deep collocation method based on neural architecture search and transfer learning for heterogeneous porous media. Engineering with Computers, 2022, 38(6): 5173–5198

[36]

Zhuang X, Guo H, Alajlan N, Zhu H, Rabczuk T. Deep autoencoder based energy method for the bending, vibration, and buckling analysis of Kirchhoff plates with transfer learning. European Journal of Mechanics. A, Solids, 2021, 87: 104225

[37]

Ben Chaabene W, Flah M, Nehdi M L. Machine learning prediction of mechanical properties of concrete: Critical review. Construction & Building Materials, 2020, 260: 119889

[38]

Van Dao D, Ly H B, Trinh S H, Le T T, Pham B T. Artificial intelligence approaches for prediction of compressive strength of geopolymer concrete. Materials, 2019, 12(6): 983

[39]

Nguyen K T, Nguyen Q D, Le T A, Shin J, Lee K. Analyzing the compressive strength of green fly ash based geopolymer concrete using experiment and machine learning approaches. Construction & Building Materials, 2020, 247: 118581

[40]

Shahmansouri A A, Yazdani M, Ghanbari S, Akbarzadeh Bengar H, Jafari A, Farrokh Ghatte H. Artificial neural network model to predict the compressive strength of eco-friendly geopolymer concrete incorporating silica fume and natural zeolite. Journal of Cleaner Production, 2021, 279: 123697

[41]

Gupta T, Rao M C. Prediction of compressive strength of geopolymer concrete using machine learning techniques. Structural Concrete, 2022, 23(5): 3073–3090

[42]

Rahmati M, Toufigh V. Evaluation of geopolymer concrete at high temperatures: An experimental study using machine learning. Journal of Cleaner Production, 2022, 372: 133608

[43]

Ahmad A, Ahmad W, Chaiyasarn K, Ostrowski K A, Aslam F, Zajdel P, Joyklad P. Prediction of geopolymer concrete compressive strength using novel machine learning algorithms. Polymers, 2021, 13(19): 3389

[44]

Emarah D A. Compressive strength analysis of fly ash-based geopolymer concrete using machine learning approaches. Results in Materials, 2022, 16: 100347

[45]

Ahmad A, Ahmad W, Aslam F, Joyklad P. Compressive strength prediction of fly ash-based geopolymer concrete via advanced machine learning techniques. Case Studies in Construction Materials, 2022, 16: e00840

[46]

Peng Y, Unluer C. Analyzing the mechanical performance of fly ash-based geopolymer concrete with different machine learning techniques. Construction & Building Materials, 2022, 316: 125785

[47]

Ba J L, Kiros J R, Hinton G E. Layer normalization. 2016, arXiv: 1607.06450

[48]

Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 2014, 15(1): 1929–1958

[49]

Gandomi A H, Roke D A. Assessment of artificial neural network and genetic programming as predictive tools. Advances in Engineering Software, 2015, 88: 63–72

[50]

Rodríguez J D, Pérez A, Lozano J A. Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(3): 569–575

[51]

Kingma D P, Ba J L. Adam: A method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR). San Diego, CA, 2015

[52]

Agarap A F. Deep learning using rectified linear units (ReLU). 2018, arXiv: 1803.08375

[53]

Hamdia K M, Zhuang X, Rabczuk T. An efficient optimization approach for designing machine learning models based on genetic algorithm. Neural Computing & Applications, 2021, 33(6): 1923–1933

[54]

Chen T, Guestrin C. XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery, 2016: 785–794

[55]

Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu T Y. LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 2017, 30

[56]

Dorogush A V, Ershov V, Gulin A. CatBoost: Gradient boosting with categorical features support. 2018, arXiv: 1810.11363

RIGHTS & PERMISSIONS

Higher Education Press