1 Introduction
In recent years, the acceleration of urbanization, coupled with limited land resources, has driven the increasing construction of underground tunnels as a viable solution to accommodate urban transportation demands [
1]. Shield machines have been widely employed in tunnel construction because of their high efficiency, eco-friendliness and safety features [
2]. Nevertheless, the process of shield tunneling inevitably induces surface settlement, posing potential risks to the safety of both surface and subsurface infrastructures [
3]. With continuous advances in shield machine technology, a growing number of tunneling projects employing large-diameter slurry balance shield machines have emerged domestically, such as the Shanghai Bund Tunnel and the Nanjing Heyan Road river-crossing tunnel [
4]. However, the employment of large-diameter shield machines results in one-off large-area excavations of soil, heightening the likelihood of encountering composite formations during construction. In such circumstances, slurry balance shield machines commonly adopt a local pressurization mode. This operational mode serves to mitigate cutterhead torque while modifying soil fluidity, thereby augmenting excavation efficiency. Yet, meticulous operation and rigorous management are imperative when adopting this mode, as improper operation can lead to excessive surface settlement and severe engineering accidents, such as the inclination of surrounding infrastructures [
5]. Consequently, investigating surface settlement induced by large-diameter shield machines in composite formations holds significant importance.
In tunnel engineering, traditional methodologies for predicting surface settlement mainly include empirical formulae [6,7], analytical solutions [8–11] and numerical simulations [12,13]. Empirical formulae are approximate expressions fitted to on-site measurements, so they are difficult to transfer between projects because of regional differences. Analytical solutions rely on mechanical principles along with specific assumptions to analyze surface settlement induced by shield tunneling. These methods are relatively simple and account for only a limited number of influencing factors, making them inadequate for describing the complex nonlinear relationship between tunneling and surface settlement. Numerical simulations discretize the formations, typically with finite element analysis, to simulate the construction process and predict surface settlement. However, the constitutive soil models used in numerical simulations diverge significantly from actual soil conditions, and the results are sensitive to the mesh configuration, both of which impair accuracy.
Developments in artificial intelligence and big data have paved the way for neural network models to characterize complex nonlinear relationships among high-dimensional variables. These models have been applied successfully in tunnel engineering [14–16]. In the domain of surface settlement prediction, scholars have conducted extensive related studies. Neaupane and Adhikari [17] developed a multi-layer perceptron capable of predicting longitudinal and lateral deformations above tunnels, incorporating parameters such as tunnel diameter, tunnel depth and soil strength. Drawing on data from the Istanbul Metro extension project, Ocak and Seker [
18] assessed the performance of artificial neural networks, support vector machines and Gaussian processes in predicting surface settlement. Their findings highlighted Gaussian processes as providing the most accurate predictions, achieving a notably low root mean square error (
RMSE) of 9. Zhou et al. [
19] successfully predicted surface settlement in the “soft upper and hard lower” formation of Nanchang Metro Line 1, validating the feasibility of back propagation (BP) neural network in surface settlement prediction. Furthermore, Chen et al. [
20], using 200 sets of monitoring data from Changsha Metro Line 4, constructed a prediction model for maximum surface settlement induced by earth pressure balance shield machine and compared the advantages and disadvantages of BP neural network, radial basis function and generalized regression neural network. Most of these studies predominantly employed traditional neural networks. In 2006, Huang et al. [
21] introduced the extreme learning machine (ELM) based on a single-hidden-layer feedforward neural network. ELM stands out for its straightforward topology, low algorithmic complexity and rapid convergence. Liu et al. [
22] achieved notable prediction accuracy and commendable generalization ability by inputting simplified factors affecting pit deformation into ELM for predicting pit displacement. Sun et al. [
23] used ELM for fault diagnosis of cutterheads, successfully identifying the presence and types of cutterhead failures. Shao et al. [
24] compared the predictive performance of ELM with that of partial least squares, least squares support vector machine and Gaussian process, using a data set obtained from a tunnel in Queens, New York, and focusing on excavation performance prediction. Among these models, ELM demonstrated the highest prediction accuracy, coupled with exceptionally fast learning speed. However, ELM has yet to be widely applied to the prediction of shield machine-induced surface settlement. Moreover, existing studies have mainly focused on homogeneous formations and give insufficient consideration to composite formations.
The parameters of neural network models are typically assigned empirically, which means that finding the optimal predictive performance of the model requires a large amount of testing time. In recent years, scholars have proposed many novel swarm intelligence optimization algorithms to tackle high-dimensional and multi-objective optimization issues [
25]. When integrated with neural network models, these algorithms can effectively determine optimal parameters based on data set features [
26]. Ant colony optimization proposed by Dorigo [
27] and particle swarm optimization proposed by Kennedy and Eberhart [
28] are two of the most classical algorithms in swarm intelligence optimization. With the introduction and improvement of these classical algorithms, their theoretical frameworks have progressively matured, accelerating the emergence of swarm intelligence optimization algorithms. Among them, sparrow search algorithm (SSA) is a new swarm intelligence optimization algorithm proposed by Xue and Shen [
29] in 2020 based on behaviors of predation and anti-predation of sparrow populations. Compared to classical algorithms, SSA has superior convergence rates, heightened stability, higher optimization accuracy, etc., presenting substantial research potential and developmental prospects [
30].
In addressing the engineering challenge associated with the excessive surface settlement induced by large-diameter slurry balance shield machines in composite formations, this study analyzed the longitudinal surface settlement laws of homogeneous and composite formations during excavations using data from the Jiangyin–Jingjiang Yangtze River Tunnel Left Line Project. A hybrid model, SSA-ELM, which integrates ELM with SSA was introduced to predict longitudinal surface settlement in different formations. The performance of the SSA-ELM model was compared with that of both ELM and SSA-BP model, and the accuracy and superiority of SSA-ELM were discussed. Additionally, an optimization strategy for tunneling parameters was proposed to mitigate the possibility of excessive surface settlement.
2 Machine learning methodology
2.1 Extreme learning machine
ELM is a highly efficient perceptron neural network [
21]. It is essentially a single-hidden layer feedforward neural network that employs a distinct method for assigning the weights and biases of the network. During the training process, the weights connecting input layer and hidden layer, along with the biases of hidden layer, are randomly generated while the weights of output layer are computed using Moore−Penrose generalized inverse matrix theory (i.e., ordinary least squares), eliminating the necessity for adjustments via BP algorithms. Therefore, with its unique network structure, ELM simplifies the training process, significantly enhances the operation speed, and reduces the overfitting issues commonly encountered in traditional neural networks. The structure of the ELM model with a single output layer is shown in Fig.1.
Fig.1 Structure of the ELM model with a single output layer.
Suppose a single-hidden layer feedforward neural network with M input nodes, L hidden nodes and N training input data. It maps the input value (xj) into the output value according to the following function [31,32]:
$$f(x_j) = \sum_{i=1}^{L} \beta_i h_i(w_i \cdot x_j + b_i), \quad j = 1, 2, \ldots, N$$
where βi is the weight vector connecting the ith hidden layer node and the output layer nodes, hi(·) is the activation function of the ith hidden layer output, wi is the weight vector connecting the input layer nodes and the ith hidden layer node and bi is the threshold of the ith hidden layer node. Hence, the relationship between the input value and the output value (yj) can be derived, as detailed in the following equation:
$$\sum_{i=1}^{L} \beta_i h_i(w_i \cdot x_j + b_i) = y_j = t_j + \varepsilon_j, \quad j = 1, 2, \ldots, N$$
where tj is the jth target value and εj is the error between the jth target value and the jth output value. The training process involves two primary steps: first, the network randomly generates the weight vectors connecting the input layer nodes and the hidden layer nodes, together with the thresholds of the hidden layer nodes; second, the network takes the least-squares error between the output value Hβ and the target value T as the objective function and minimizes it, where H is the hidden layer output matrix. The solution that satisfies this condition is the optimal one. The formula for the objective function is as follows:
$$\min_{\beta} \lVert H\beta - T \rVert$$
By applying knowledge from linear algebra and matrix theory, the optimal solution can be derived as
$$\hat{\beta} = H^{+} T$$
where $\hat{\beta}$ is the optimal weight vector connecting the hidden layer nodes and the output layer nodes and H+ is the Moore−Penrose generalized inverse matrix of H.
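The two training steps described above (random hidden-layer parameters, then a least-squares solve for the output weights through the Moore−Penrose generalized inverse) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `elm_fit` and `elm_predict`, the uniform initialization range, the ReLU activation and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def elm_fit(X, T, n_hidden, rng):
    """Train a single-hidden-layer ELM; returns (W, b, beta). Illustrative only."""
    # Step 1: random input-to-hidden weights and hidden thresholds (never trained)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Hidden layer output matrix H with a ReLU activation (assumed here)
    H = np.maximum(X @ W + b, 0.0)
    # Step 2: output weights from the Moore-Penrose generalized inverse of H
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass using the fixed random layer and the solved output weights."""
    return np.maximum(X @ W + b, 0.0) @ beta

# Synthetic regression data, used only to exercise the code
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
T = X @ np.array([1.0, -2.0, 0.5]) + 0.3
W, b, beta = elm_fit(X, T, n_hidden=20, rng=rng)
train_rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - T) ** 2)))
```

Because the hidden layer is fixed after initialization, training reduces to one linear solve, which is why no backpropagation loop appears in the sketch.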
2.2 Sparrow search algorithm
SSA simulates the behaviors of predation and anti-predation of sparrow populations. Compared with traditional intelligent swarm optimization algorithms, SSA exhibits better performance at a faster convergence rate, shorter solution time, higher calculation accuracy, and better parallelism [
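33]. SSA divides the sparrow population into two types: producers and scroungers [27].
The algorithm is simulated using virtual sparrows. Suppose there are n sparrows; their positions can be represented as
$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix}$$
where n is the number of sparrows, and d is the dimension of the variables to be optimized. The fitness values for all sparrows can be expressed as follows
$$F_X = \begin{bmatrix} f([x_{1,1} \; x_{1,2} \; \cdots \; x_{1,d}]) \\ f([x_{2,1} \; x_{2,2} \; \cdots \; x_{2,d}]) \\ \vdots \\ f([x_{n,1} \; x_{n,2} \; \cdots \; x_{n,d}]) \end{bmatrix}$$
where f(·) is the fitness function. A higher fitness value means that an individual has a higher priority in obtaining “food”, which is analogous to finding a better solution in the problem domain. At each iteration, the update of the producer's position can be described as follows
$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & R_2 < ST \\ X_{i,j}^{t} + Q \cdot L, & R_2 \geq ST \end{cases}$$
where $X_{i,j}^{t}$ is the position of the ith sparrow in the jth dimension at iteration t, t is the current iteration, itermax is the maximum iteration, α is a random number within (0,1], R2 is an early warning value within [0,1], ST is the safety threshold, with values in the range [0.5,1], Q is a random number that follows a normal distribution, and L is a 1 × d matrix whose elements are all equal to 1. Scroungers follow a different rule, and their positions are updated as follows
$$X_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\ X_{P}^{t+1} + \left| X_{i,j}^{t} - X_{P}^{t+1} \right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases}$$
where XP is the best position of producers, Xworst is the current worst position, and A is a 1 × d matrix whose elements are randomly assigned either 1 or −1, with A+ = AT(AAT)−1. In addition, some sparrows are aware of danger and act as scouts. In simulation experiments, it is assumed that these aware sparrows make up 10% to 20% of the total population. The initial positions of these sparrows are randomly generated within the population. The mathematical expression can be represented as
$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left| X_{i,j}^{t} - X_{best}^{t} \right|, & f_i > f_g \\ X_{i,j}^{t} + K \cdot \left( \dfrac{\left| X_{i,j}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon} \right), & f_i = f_g \end{cases}$$
where Xbest is the current best position, β is the step length control parameter, which is a random number following a normal distribution with mean 0 and variance 1, K is a random number within the range [0,1], fi is the fitness value of the current sparrow individual, fg and fw are the current global best and worst fitness values, respectively, and ε is a small constant that avoids division by zero.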
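The producer, scrounger and aware-sparrow (scout) updates described in this subsection can be sketched in NumPy as follows. This is a simplified illustration written for minimization (a lower fitness value is better), not the authors' code: the function name `ssa_minimize`, the search bounds, the clipping of positions to those bounds, the way the best producer is selected, the single-population test function and the default arguments are all assumptions.

```python
import numpy as np

def ssa_minimize(f, dim, n=10, iters=50, pd=0.2, sd=0.2, st=0.6,
                 lb=-1.0, ub=1.0, seed=0):
    """Simplified sparrow search for minimizing f over [lb, ub]^dim (sketch)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n, dim))
    fit = np.array([f(x) for x in X])
    best_x, best_f = X[fit.argmin()].copy(), float(fit.min())
    n_prod, n_scout = max(1, int(pd * n)), max(1, int(sd * n))
    ones = np.ones(dim)                      # the 1 x d matrix L of ones
    for _ in range(iters):
        order = np.argsort(fit)              # best (lowest fitness) first
        X, fit = X[order], fit[order]
        worst_x, f_g, f_w = X[-1].copy(), fit[0], fit[-1]
        r2 = rng.random()                    # early warning value R2
        for i in range(n_prod):              # producers
            if r2 < st:
                alpha = 1.0 - rng.random()   # random number in (0, 1]
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * iters))
            else:
                X[i] = X[i] + rng.normal() * ones
        X[:n_prod] = np.clip(X[:n_prod], lb, ub)
        x_p = min(X[:n_prod], key=f).copy()  # best producer position (assumed rule)
        for i in range(n_prod, n):           # scroungers
            if i + 1 > n / 2:
                X[i] = rng.normal() * np.exp((worst_x - X[i]) / (i + 1) ** 2)
            else:
                # A is a row of +/-1, so A^T (A A^T)^-1 reduces to A^T / d
                a_plus = rng.choice([-1.0, 1.0], size=dim) / dim
                X[i] = x_p + (np.abs(X[i] - x_p) @ a_plus) * ones
        for i in rng.choice(n, size=n_scout, replace=False):   # aware sparrows
            if fit[i] > f_g:
                X[i] = best_x + rng.normal() * np.abs(X[i] - best_x)
            else:
                k = rng.random()             # K in [0, 1), following the text
                X[i] = X[i] + k * np.abs(X[i] - worst_x) / ((fit[i] - f_w) + 1e-12)
        X = np.clip(X, lb, ub)
        fit = np.array([f(x) for x in X])
        if fit.min() < best_f:               # keep the global best found so far
            best_f, best_x = float(fit.min()), X[fit.argmin()].copy()
    return best_x, best_f

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return float(np.sum(x ** 2))

best_x, best_f = ssa_minimize(sphere, dim=5)
```

The defaults mirror the settings reported later in the paper (population 10, 50 iterations, 20% producers and scouts, ST = 0.6), but any objective returning a scalar can be passed as `f`.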
2.3 Sparrow search algorithm-extreme learning machine model
The flowchart of the hybrid model, SSA-ELM, is shown in Fig.2. The model was developed using the PyTorch framework. The training process proceeds as follows.
Fig.2 Flowchart of the SSA-ELM model.
Step 1: Import the data set and split it into training and testing sets according to the specific problem. Perform data normalization on both sets.
Step 2: Initialize parameters of the SSA-ELM model.
Step 3: Search for optimal parameters using training sets. Separate the sparrow population into producers and scroungers, calculate their fitness values, update the population positions, recalculate the fitness values, and continuously update the global best fitness and position.
Step 4: Evaluate whether the predefined maximum iteration has been reached. If the termination condition is met, terminate the iterative process; otherwise, return to Step 3 to further explore the optimal combination of weight vectors connecting the input layer nodes and hidden layer nodes, as well as the biases of the hidden layer nodes.
Step 5: Employ the refined model with the best initial parameters obtained from Step 4. Validate the prediction accuracy of the SSA-ELM model using testing sets.
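The coupling between the two components in Steps 3 and 4 is the fitness evaluation: each sparrow position encodes one candidate set of input weights and hidden biases, and its fitness is the training error of the resulting ELM. A minimal sketch of that evaluation is given below. The flat encoding of the position vector, the function name `elm_fitness`, the use of RMSE as the training error and the random stand-in data are assumptions made for illustration; the paper reports a PyTorch implementation, whereas NumPy is used here for brevity.

```python
import numpy as np

def elm_fitness(position, X_train, T_train, n_hidden):
    """Training RMSE of the ELM whose hidden layer is encoded in `position`."""
    n_in = X_train.shape[1]
    # Decode the flat position vector into input weights W and hidden biases b
    W = position[: n_in * n_hidden].reshape(n_in, n_hidden)
    b = position[n_in * n_hidden:]
    H = np.maximum(X_train @ W + b, 0.0)          # ReLU hidden layer output
    beta = np.linalg.pinv(H) @ T_train            # output weights by least squares
    rmse = float(np.sqrt(np.mean((H @ beta - T_train) ** 2)))
    return rmse, beta

# 14-14-1 network: each sparrow carries 14*14 weights plus 14 biases
n_in, n_hidden = 14, 14
dim = n_in * n_hidden + n_hidden

# Random stand-in data and population (hypothetical, not project measurements)
rng = np.random.default_rng(1)
X_train = rng.random((60, n_in))
T_train = rng.normal(size=60)
population = rng.uniform(-1.0, 1.0, size=(10, dim))

# One fitness evaluation per sparrow; the optimizer would iterate on these scores
scores = [elm_fitness(p, X_train, T_train, n_hidden)[0] for p in population]
best_index = int(np.argmin(scores))
```

In a full run, the optimizer would update `population` between evaluations and the best position found would define the hidden layer used on the testing sets in Step 5.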
3 Case study
This study analyzes the longitudinal surface settlement patterns across different formations, based on measurements from the Jiangyin−Jingjiang Yangtze River Tunnel Left Line Project. The input parameters and hyperparameters of the SSA-ELM model are then determined.
3.1 Project overview and data collection
The mainline of the tunnel is located between the Jiangyin Bridge and the Taizhou Bridge, covering a distance of approximately 6450 m. It interfaces with Jingjiang city to the north and Jiangyin city to the south. For this project, a composite slurry balance shield machine was employed, and the collected data included tunneling parameters, geological parameters, geometrical parameters and surface settlement.
Tunneling parameters were obtained from the data acquisition system, which recorded 2311 types of parameters at 5 min intervals. Following a rigorous screening process to eliminate parameters with negligible impact on the predictive model, this study ultimately selected 13 parameters: cutterhead speed (RPM), cutterhead torque (Tor), total thrust (Tr), advance rate (AR), grouting volume (GV), grouting pressure (GP), slurry inlet pipe pressure (SIP), slurry outlet pipe pressure (SOP), slurry inlet pipe flow (SIF), slurry outlet pipe flow (SOF), slurry inlet pipe density (SID), slurry outlet pipe density (SOD) and cutterhead hydraulic pressure (CHP).
Geological parameters include four factors: compression modulus, cohesion, internal friction angle and natural unit weight. Geometrical parameters include two factors: tunnel depth and section distance. These parameters were sourced from the geological survey report of the project. Presently, the shield machine has completed the underground excavations in the Jiangbei area and is proceeding with the riverbed crossing. Fig.3 shows the geological profile of the tunnel, which is broadly categorized into three types: 1) a composite formation of silt and sand; 2) a sand formation; and 3) a composite formation of silty clay and sand.
Fig.3 Geological profile of the river-crossing tunnel.
Surface settlement data were obtained from the monitoring points arranged according to the following principle: monitoring points are perpendicular to the tunnel axis, considering the surrounding environment and geological conditions. Within a distance not exceeding 100 m from the launch and reception shafts, monitoring sections (depicted in red) are established every 10 m. Between distances of 100 to 200 m from the launch and reception shafts, the frequency of monitoring sections (depicted in green) is reduced to one every 20 m. For other sections, the deployment of monitoring sections (depicted in blue) occurs at intervals of 50 m. Consequently, each section comprises a total of 11 monitoring points, as shown in Fig.4.
Fig.4 Layout of the monitoring points.
3.2 Collected data analysis
Owing to the limited number of rings and the incomplete surface settlement data for the composite formation of silt and sand, this formation was not considered in the subsequent analysis. This study selects a total of 184 rings of valid data from the Jiangbei area, which are categorized as follows: 87 rings in the sand formation (Ring numbers 374–460) and 97 rings in the composite formation of silty clay and sand (Ring numbers 461–557).
Initially, the change of lateral surface settlement in both the sand and the composite formations (silty clay and sand) was analyzed, with specific settlement results shown in Fig.5. For each formation, a certain section was selected for analysis. In the sand formation, the lateral surface settlement remains positive both before and after the passage of the shield machine, indicating a continuous heave of the surface. The most significant change in surface settlement occurs at the tunnel axis, where it decreases from 1.34 to 0.28 mm. On the other hand, in the composite formation, the lateral surface settlement is consistently negative before and after the passage of the shield machine, also with the most significant change occurring at the tunnel axis, which is 0.52 mm (change value). The analysis of lateral surface settlement revealed that the most significant changes in both formations occur at the tunnel axis. Therefore, in the subsequent analysis of longitudinal surface settlement, only the data at the tunnel axis were selected for detailed analysis, with three monitoring sections chosen in each formation.
Fig.5 Lateral surface settlement: (a) the sand formation; (b) the composite formation.
The longitudinal surface settlement of the three monitoring sections in the sand formation is shown in Fig.6. The reference line represents the condition where no surface settlements occur, with areas above the reference line indicating surface heave and those below indicating surface subsidence. Fig.6(a) provides further explanations regarding the positive and negative values on the x-axis. When the shield machine has not yet reached the monitoring section, the distance between the excavation face and the monitoring section is denoted as negative. Conversely, the distance is considered positive after the shield machine has passed. Fig.6 reveals that the trend of longitudinal surface settlement in the sand formation can be characterized as “initial heave followed by subsidence”. The initial heave is attributed to excessively high Tr and CHP, while subsequent subsidence occurs as the shield machine passes away from the monitoring section, leading to a loss of support above the shield and consequent disturbance and settlement of the soil body. In detail, Section A reaches the maximum heave at a distance of 20 m, with a value of 1.17 mm, and its maximum subsidence at approximately 120 m, with a value of −1.04 mm. Section B reaches its peak heave of 1.13 mm at around −30 m, while Section C reaches it at about −20 m, with a value of 1.34 mm. Regarding subsidence, Section B hits its maximum of −0.47 mm at approximately 50 m, whereas that of section C is observed at 90 m, with a value of −1.64 mm. Section A reaches its maximum heave earlier than Sections B and C because of its higher values of Tr and AR, which are 149.88 MN and 36 mm/min, respectively. For Section B, Tr is 143.48 MN and AR is 35 mm/min, while for Section C, Tr is 144.25 MN, and AR is 34 mm/min.
Fig.6 Longitudinal surface settlement: (a) Section A; (b) Section B; (c) Section C in the sand formation.
The longitudinal settlement of the composite formation is shown in Fig.7. The trend in Section D differs from that in Sections E and F because fewer measurements are available. However, it is noted that the maximum value of subsidence significantly surpasses that of heave. In particular, no surface heave occurs at Section E. In Section D, the maximum heave occurs at approximately −20 m, with a value of 0.19 mm, while the maximum subsidence occurs at approximately 100 m, with a value of −2.37 mm. In Section E, the maximum subsidence is −2.41 mm, occurring around 140 m. Furthermore, in Section F, the maximum heave and subsidence occur at distances of −80 and 50 m, respectively, with values of 0.14 and −2.05 mm. The distance between the maximum heave and subsidence is approximately 130 m. Notably, the range of longitudinal surface settlement in the composite formation is larger, reflecting the complexity of this formation. This underscores the necessity for timely monitoring of these changes during construction to ensure safety and structural integrity. When reaching Section D, Tr was 131.08 MN, with an AR of 33 mm/min. When reaching Section E, Tr was 136.19 MN, with an AR of 35 mm/min. Finally, when reaching Section F, Tr was 132.13 MN, while AR was 35 mm/min. It is observed that both Tr and AR are smaller than those in the sand formation.
Fig.7 Longitudinal surface settlement: (a) Section D; (b) Section E; (c) Section F in the composite formation.
3.3 Database building and analysis
Tunneling parameters in the database are intricate and susceptible to stochastic factors, thus necessitating preprocessing of the initial data. The initial data include missing data, duplicate data and outliers. For missing data, interpolation is used to fill in missing values based on adjacent time points. For duplicate data, only one instance at a given time is retained. Outliers are treated differently depending on their number: if few, they are treated as missing data; if many, they are directly removed. After preprocessing, downtime data must be detected, as such data are of little significance for this study. In line with the actual conditions of the shield machine, data recorded at times when AR, Tor, RPM or Tr is 0 are considered downtime data and are excluded. Lastly, to mitigate the impact of outliers on the overall data distribution, the Z-Score method is employed for outlier identification. This method presents the following advantages: 1) it is relatively straightforward to calculate, facilitating quick data processing; 2) it is unaffected by the magnitude of the data, ensuring comparability across different data sets. The Z-Score transformation formula is expressed as follows
$$Z = \frac{X - \bar{X}}{s}$$
where X is the original data, $\bar{X}$ is the average of the data and s is the standard deviation of the data.
Outliers are removed using a Z-Score threshold of 2. After all preprocessing steps are completed, the average of each tunneling parameter over a ring is taken as its representative value for that ring, which signifies the construction status of the shield machine for each ring. The preprocessed data are shown in Tab.1.
Tab.1 Basic descriptive statistics of preprocessed data
Parameter | Min. | Max. | Ave. | SD* | Unit |
RPM | 1.08 | 1.29 | 1.14 | 0.04 | r/min |
Tor | 6.78 | 13.66 | 10.42 | 1.41 | MN·m |
Tr | 99.35 | 168.20 | 138.36 | 13.94 | MN |
AR | 17.79 | 37.00 | 32.17 | 4.29 | mm/min |
GV | 32.70 | 64.20 | 33.95 | 1.96 | m^3 |
GP | 5.20 | 12.80 | 8.48 | 1.62 | × 10^5 Pa |
SIP | 4.12 | 6.53 | 5.30 | 0.65 | × 10^5 Pa |
SOP | 4.00 | 6.63 | 5.65 | 0.64 | × 10^5 Pa |
SIF | 39.04 | 48.02 | 42.50 | 1.94 | m^3/min |
SOF | 44.25 | 50.00 | 47.35 | 1.54 | m^3/min |
SID | 1.18 | 1.31 | 1.23 | 0.03 | × 10^3 kg/m^3 |
SOD | 1.22 | 1.35 | 1.29 | 0.02 | × 10^3 kg/m^3 |
CHP | 3.76 | 6.13 | 4.87 | 0.66 | × 10^5 Pa |
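The preprocessing chain described in Section 3.3 (downtime removal, Z-Score outlier filtering, per-ring averaging) can be sketched as below. The function name `preprocess`, the use of the sample standard deviation, the choice to discard a whole record when any one parameter exceeds the threshold, and the small two-column data set are assumptions made for illustration; they are not taken from the project database.

```python
import numpy as np

def preprocess(ring, data, threshold=2.0):
    """ring: (n,) ring number per record; data: (n, p) tunneling parameters."""
    # Downtime: any monitored parameter equal to 0 marks a stopped machine
    running = np.all(data != 0.0, axis=1)
    ring, data = ring[running], data[running]
    # Z-Score per parameter, using the sample standard deviation (assumption)
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Drop a record if any of its parameters lies beyond the threshold
    keep = np.all(np.abs(z) <= threshold, axis=1)
    ring, data = ring[keep], data[keep]
    # Representative value per ring: mean of the remaining records
    rings = np.unique(ring)
    means = np.vstack([data[ring == r].mean(axis=0) for r in rings])
    return rings, means

# Hypothetical records: columns are AR (mm/min) and Tr (MN)
ring = np.array([1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2])
data = np.array([
    [30.0, 140.0], [31.0, 141.0], [29.0, 139.0], [30.0, 140.0],
    [0.0, 140.0],                      # downtime record (AR = 0)
    [30.0, 140.0],
    [32.0, 142.0], [33.0, 143.0], [31.0, 141.0], [32.0, 142.0],
    [32.0, 142.0],
    [90.0, 142.0],                     # AR outlier, Z-Score of about 3
])
rings, means = preprocess(ring, data)
```

In this toy example the downtime record and the AR outlier are both dropped, so each ring is represented by the mean of its five remaining records.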
Strong correlations among parameters in a database can adversely affect both the speed of model training and the quality of predictions. Therefore, analyzing parameter correlations is imperative [34]. This study employs the Spearman coefficient (ρ) to evaluate the correlation between two parameters. The formula for calculation is presented as follows
$$\rho = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2 \sum_{i} (y_i - \bar{y})^2}}$$
where x and y are two sets of variables subject to correlation analysis (for the Spearman coefficient, the formula is evaluated on the ranks of the observations), and $\bar{x}$ and $\bar{y}$ are the averages of these two variable sets. Given that the dimensions of tunneling parameters vary, it is imperative to normalize the data prior to conducting correlation analysis to expedite the convergence rate of the model. The normalization procedure is as follows [35]:
$$x_i' = \frac{x_i - x_{min}}{x_{max} - x_{min}}$$
where $x_i'$ is the ith normalized data, xmin is the minimum value of the data and xmax is the maximum value of the data. After normalizing the 214 sets (including all types of formations) of tunneling parameters, the corresponding Spearman correlation coefficients are calculated to analyze the correlation among the various tunneling parameters. The specific calculation results are shown in Fig.8.
Fig.8 Heat map of Spearman coefficient.
In the heat map, a positive value indicates a positive correlation, while a negative value indicates a negative correlation. The larger the absolute value, the stronger the positive (or negative) correlation between the two parameters. A strong correlation is defined as one whose coefficient exceeds 0.7 in absolute value. From Fig.8, the following conclusions can be drawn: RPM exhibits a strong negative correlation with SIP, SOP, and CHP. On the other hand, GP demonstrates a strong positive correlation with SIP, SOP, and CHP, suggesting that an increase in GP corresponds to increases in SIP, SOP, and CHP. Additionally, there are strong correlations between SIP and both SOP and CHP, between SOP and CHP, between SIF and SOF, and between SID and SOD. After filtering, seven tunneling parameters are retained: Tor, Tr, AR, GV, SIP, SIF, and SID. These retained tunneling parameters, along with the geological and geometrical parameters, constitute the database corresponding to the model input parameters.
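The correlation screening described above can be illustrated with a short NumPy sketch that min-max normalizes the data, computes a Spearman matrix from ranks, and drops parameters whose absolute coefficient with an already retained parameter exceeds 0.7. The greedy keep-first rule in `drop_correlated` is an assumption (the paper does not state how one parameter of a correlated pair was chosen), the rank computation assumes no tied values, and the three toy columns are hypothetical.

```python
import numpy as np

def min_max(x):
    """Column-wise min-max normalization to [0, 1]."""
    return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))

def spearman_matrix(data):
    """Spearman coefficients as Pearson correlation of ranks (no ties assumed)."""
    ranks = np.argsort(np.argsort(data, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def drop_correlated(names, rho, limit=0.7):
    """Greedy filter (assumed rule): keep a parameter only if it is not
    strongly correlated with any parameter kept before it."""
    kept = []
    for j, name in enumerate(names):
        if all(abs(rho[j, names.index(k)]) <= limit for k in kept):
            kept.append(name)
    return kept

# Toy data: b is a monotone function of a, c is unrelated to both
a = np.arange(10.0)
c = np.array([3, 7, 0, 9, 4, 1, 8, 2, 6, 5], dtype=float)
data = np.column_stack([a, a ** 3, c])
names = ["a", "b", "c"]

rho = spearman_matrix(min_max(data))
kept = drop_correlated(names, rho)
```

Since the Spearman coefficient depends only on ranks, the matrix is the same with or without the normalization step; the normalized values are what the model itself consumes.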
3.4 Hyperparameter selection and model training
Sand is present in every formation considered. Therefore, a custom parameter, the composite ratio (φ), is defined to characterize the proportion of sand in the excavation face, providing a quantitative assessment of the composite formation. The calculation formula for φ is as follows:
$$\varphi = \frac{S_{sand}}{S_{total}}$$
where Ssand is the area of the sand in the excavation face and Stotal is the total area of the excavation face.
In summary, the input parameters for the SSA-ELM model are selected from the database and consist of 14 parameters (seven tunneling parameters, four geological parameters, two geometrical parameters and φ). Therefore, ELM has 14 input layer nodes and 1 output layer node. The number of hidden layer nodes is determined from an empirical formula [36] in terms of N, the number of input layer nodes; 14 hidden layer nodes are ultimately used. The activation function is the rectified linear unit. The configuration of SSA parameters adopted in this study is as follows: the proportions of producers and scouts are both 20%, ST is set to 0.6, the population size of sparrows is 10 and the maximum iteration is 50.
To better evaluate the predictive performance, three model evaluation metrics are adopted: mean absolute error (MAE), RMSE, and coefficient of determination (R2). A lower MAE (or RMSE) indicates a smaller model error, while a higher R2 indicates a better fit of the model. The calculation formulas are as follows:
$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|$$
$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$$
where N is the total number of samples, yi is the actual value of the ith sample, $\hat{y}_i$ is the predicted value of the ith sample and $\bar{y}$ is the average of the actual values across all samples.
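The three metrics can be computed directly from their standard definitions, as in the following sketch. The function names and the four-point example are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - y_hat)))

def rmse(y, y_hat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical measured and predicted settlements (mm)
y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.5, 2.0, 2.5, 4.0])

example_mae = mae(y, y_hat)      # (0.5 + 0 + 0.5 + 0) / 4
example_rmse = rmse(y, y_hat)    # sqrt(0.5 / 4)
example_r2 = r2(y, y_hat)        # 1 - 0.5 / 5
```

MAE and RMSE carry the unit of the settlement (mm), whereas R2 is dimensionless, which is why the two kinds of metric are read in opposite directions.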
To validate the generalization capabilities of the hybrid model across different formations, the model training divided 6 sets of cross-section data into 3 groups of test and training data sets. The specific partitioning method is shown in Tab.2.
Tab.2 Division of training set and testing set
Set number | Sand formation, training set | Sand formation, testing set | Composite formation, training set | Composite formation, testing set |
Ⅰ | Sections A and B | Section C | Sections D and E | Section F |
Ⅱ | Sections A and C | Section B | Sections D and F | Section E |
Ⅲ | Sections B and C | Section A | Sections E and F | Section D |
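The partition in Tab.2 amounts to holding out one section per formation in turn. A small sketch that enumerates the three splits is given below; the helper name `leave_one_out` and the pooling of the sand and composite sections into a single training list and a single testing list per set are assumptions made for illustration.

```python
from itertools import combinations

def leave_one_out(sections):
    """All (training sections, held-out section) pairs for one formation."""
    return [(list(train), next(s for s in sections if s not in train))
            for train in combinations(sections, 2)]

sand = ["A", "B", "C"]
composite = ["D", "E", "F"]

# Pair the k-th sand split with the k-th composite split, as in Tab.2
partition = [
    (s_train + c_train, [s_test, c_test])
    for (s_train, s_test), (c_train, c_test)
    in zip(leave_one_out(sand), leave_one_out(composite))
]
```

Each section appears in exactly one testing set, so every section is predicted once by a model that has not seen it during training.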
To mitigate overfitting in the network, the training sets were shuffled. SSA is used to optimize the weights and biases of ELM, employing error of the training set as the fitness function. The optimization of the SSA-ELM model is shown in Fig.9(a), while the prediction results for three training sets are shown in Fig.9(b)–Fig.9(d), respectively.
Fig.9 Results: (a) optimization using SSA; (b) training set I; (c) training set II; (d) training set III.
From Fig.9(a), it is evident that for training set I, ELM achieves optimal performance at iteration 15, with a fitness value of 0.1329. For training set II, optimal performance is achieved at iteration 14, with a fitness value of 0.1170. Training set III requires 25 iterations to reach optimal performance, attributed to its larger size compared to the others, with a fitness value of 0.1886, which is also the highest among the three sets. Notably, from Fig.9(b)–Fig.9(d), all three training sets exhibit predictive effects that meet expectations, with no significant occurrence of overfitting observed. Consequently, the analysis on the testing set can proceed.
4 Results and discussion
4.1 Performance analysis of the sparrow search algorithm-extreme learning machine model
The predicted results for Sections C and F in testing set I are shown in Fig.10(a), with the reference dashed line denoting the demarcation for an absolute error of 0.5 mm. In Section C, the SSA-ELM model replicates the anticipated trend of “initial heave followed by subsidence”. However, three of the data points (27.3%) show an absolute error surpassing 0.5 mm. In Section F, the SSA-ELM model successfully captures sudden variations (from Points 15 to 16), highlighting its sensitivity. Five data points (27.8%) show an absolute error exceeding 0.5 mm. The performance metrics for Sections C and F are shown in Fig.10(b), where the reference dashed line indicates the scenario where the predicted values equal the measured values. Specifically, in Section C, R2 is 0.9702, MAE is 0.3088 and RMSE is 0.3835. In Section F, R2, MAE, and RMSE are 0.7438, 0.3484, and 0.4093, respectively. The high accuracy of the hybrid model confirms that it can learn the characteristics of different formations and predict both the trends and the values of surface settlement across them, providing valuable insights for on-site construction activities.
Fig.10 Results of testing set Ⅰ: (a) surface settlement prediction; (b) performance metrics.
Similarly, the prediction results and performance metrics for testing set II are shown in Fig.11. In Section B, only 9.1% of the data points have absolute errors exceeding 0.5 mm, with corresponding values for R2, MAE, and RMSE of 0.8477, 0.1029, and 0.2913, respectively. In Section E, the predictions between Points 16 and 18 closely mirror the predicted trend observed in Fig.10(a). This is plausibly attributed to a high degree of data similarity and a limited amount of training data. The R2, MAE, and RMSE values are 0.7690, 0.4391, and 0.4943, respectively. The proportion of data points with absolute errors greater than 0.5 mm reaches 40% in Section E.
Fig.11 Results of testing set Ⅱ: (a) surface settlement prediction; (b) performance metrics.
The surface settlement prediction results for testing set III are shown in Fig.12(a), with the performance metrics shown in Fig.12(b). This testing set contains fewer data points than the previous two sets. Because of significant data fluctuations, interpolation was not applied to fill any gaps. The predicted trends for Sections A and D closely align with the actual trends, with minor fluctuations also being reflected to some extent (e.g., Points 13, 15, and 19). The proportions of data points with absolute errors greater than 0.5 mm are 27.3% and 23.1%, respectively. R2 for Sections A and D is 0.9078 and 0.8649, respectively, while MAE is 0.3440 and 0.3012, and RMSE is 0.4146 and 0.3652.
Fig.12 Results of testing set Ⅲ: (a) surface settlement prediction; (b) performance metrics.
To assess the prediction results more intuitively, the performance metrics for the three testing sets are summarized in Tab.3.
Tab.3 Summary of performance metrics of each testing set
Testing set | Section | R2 | MAE | RMSE
Testing set Ⅰ | C | 0.9702 | 0.3088 | 0.3835
Testing set Ⅰ | F | 0.7438 | 0.3484 | 0.4093
Testing set Ⅰ | C + F* | 0.8540 | 0.3334 | 0.3997
Testing set Ⅱ | B | 0.8477 | 0.1029 | 0.2913
Testing set Ⅱ | E | 0.7690 | 0.4391 | 0.4943
Testing set Ⅱ | B + E* | 0.8595 | 0.3530 | 0.4333
Testing set Ⅲ | A | 0.9078 | 0.3440 | 0.4146
Testing set Ⅲ | D | 0.8649 | 0.3012 | 0.3652
Testing set Ⅲ | A + D* | 0.9332 | 0.3208 | 0.3886
This study employs the hybrid SSA-ELM model to predict longitudinal surface settlement across six sections in two different geological formations. The results show that the model performs well across the sections, with an average R2 of 0.8822, MAE of 0.3357, and RMSE of 0.4072 over the three testing sets. The model therefore offers high accuracy in predicting longitudinal surface settlement across different formations, allowing early preparation against potential engineering emergencies.
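For readers who wish to reproduce this kind of evaluation, the following is a minimal sketch of how R2, MAE, RMSE, and the share of points whose absolute error exceeds 0.5 mm could be computed for one monitoring section. It assumes the usual coefficient-of-determination definition of R2; the function name `settlement_metrics` and the sample values are hypothetical and are not taken from the project data.

```python
import numpy as np

def settlement_metrics(measured, predicted, tol_mm=0.5):
    """Return R2, MAE, RMSE and the share of points with |error| > tol_mm."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - measured
    ss_res = np.sum(err ** 2)                              # residual sum of squares
    ss_tot = np.sum((measured - measured.mean()) ** 2)     # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "share_over_tol": float(np.mean(np.abs(err) > tol_mm)),
    }

# Hypothetical settlement values in mm (illustration only, not project data).
measured = [1.0, 0.4, -0.8, -2.1, -3.0]
predicted = [0.8, 0.7, -0.6, -2.9, -3.1]
metrics = settlement_metrics(measured, predicted)
print(metrics)
```

In this made-up sample, one of the five points has an absolute error above 0.5 mm, so the exceedance share is 20%.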
4.2 Comparative analysis of longitudinal surface settlement prediction models
To evaluate the predictive performance of SSA-ELM, two additional models, ELM and SSA-BP, were constructed and applied to the same data set. Their specifications are as follows.
1) ELM model. The architecture comprises 14 input layer nodes, 14 hidden layer nodes and a single output layer node (14-14-1). It employs the Rectified Linear Unit as the activation function and is optimized via Stochastic Gradient Descent with 256 training epochs.
2) SSA-BP model. BP mirrors the ELM in structure with a 14-14-1 layout but employs the Sigmoid as the activation function. Regarding SSA, the population size of sparrows is 64 and the maximum number of iterations is 128. The proportions of producers and scouts and the value of ST are the same as those used in the SSA-ELM hybrid model.
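To make the 14-14-1 layout concrete, the sketch below shows a textbook extreme learning machine with that shape. It rests on several assumptions: hidden weights are drawn at random and the output weights are solved in closed form with a pseudo-inverse, which is the classical ELM formulation and may differ from the training procedure described above. The class name `SimpleELM` and the synthetic data are hypothetical.

```python
import numpy as np

class SimpleELM:
    """Textbook ELM sketch: random hidden layer, least-squares output layer."""

    def __init__(self, n_inputs=14, n_hidden=14, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden-layer weights and biases are random and stay fixed.
        self.W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)
        self.beta = None  # output weights, set by fit()

    def _hidden(self, X):
        # ReLU activation on the hidden layer.
        return np.maximum(0.0, X @ self.W + self.b)

    def fit(self, X, y):
        H = self._hidden(X)
        # Minimum-norm least-squares solution for the output weights.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Synthetic data with 14 input features (illustration only).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(60, 14))
y_demo = X_demo @ rng.normal(size=14)
model = SimpleELM().fit(X_demo, y_demo)
y_hat = model.predict(X_demo)
```

In a hybrid scheme, a swarm optimizer would typically search over the random hidden-layer parameters (`W` and `b` here) rather than leaving them fixed.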
The prediction results of the three models are shown in Fig.13, where the dotted line serves as a reference line to distinguish different testing sets.
Fig.13 Comparison of different models on the same dataset.
The predictions of the SSA-ELM model align more closely with the measured results than those of the other two models. Moreover, the predicted trend of the SSA-ELM model closely follows the actual trend, while the ELM and SSA-BP models fluctuate at certain data points, indicating lower stability. The performance metrics of the three models are summarized in Tab.4. The average R2 of the SSA-ELM model is 0.8822, compared with 0.7891 for ELM and 0.7381 for SSA-BP. In terms of MAE and RMSE, the SSA-ELM model also outperforms the other two models, with an MAE of 0.3357 and an RMSE of 0.4072. ELM has an MAE of 0.5164 and an RMSE of 0.6192, while the SSA-BP model has an MAE of 0.4590 and an RMSE of 0.6036. The superiority of the SSA-ELM model over the ELM model underscores the effectiveness of SSA in improving prediction accuracy. Similarly, the SSA-ELM model surpasses the SSA-BP model, indicating a higher compatibility between ELM and SSA. In terms of training duration, the SSA-ELM model requires 0.2346 s, which is faster than the SSA-BP model (1.8427 s) but slower than ELM (0.1159 s). Increased training duration is one of the drawbacks of optimization algorithms: both the SSA-ELM and SSA-BP models require longer training than ELM alone. However, the SSA-ELM model is only 0.1187 s slower than ELM. Additionally, the faster training of the SSA-ELM model compared with the SSA-BP model further highlights this compatibility.
Tab.4 Summary of performance metrics of different models
Model | Testing set | R2 | MAE | RMSE | t* (s)
SSA-ELM | Ⅰ | 0.8540 | 0.3334 | 0.3997 | 0.2338
SSA-ELM | Ⅱ | 0.8595 | 0.3530 | 0.4333 | 0.2297
SSA-ELM | Ⅲ | 0.9332 | 0.3208 | 0.3886 | 0.2403
SSA-ELM | Ave. | 0.8822 | 0.3357 | 0.4072 | 0.2346
ELM | Ⅰ | 0.7561 | 0.4837 | 0.6090 | 0.1163
ELM | Ⅱ | 0.8015 | 0.4598 | 0.5265 | 0.1128
ELM | Ⅲ | 0.8096 | 0.6058 | 0.7221 | 0.1187
ELM | Ave. | 0.7891 | 0.5164 | 0.6192 | 0.1159
SSA-BP | Ⅰ | 0.7326 | 0.4044 | 0.5937 | 1.8436
SSA-BP | Ⅱ | 0.7778 | 0.4252 | 0.5332 | 1.8563
SSA-BP | Ⅲ | 0.7040 | 0.5474 | 0.6840 | 1.8282
SSA-BP | Ave. | 0.7381 | 0.4590 | 0.6036 | 1.8427
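The "Ave." rows of Tab.4 and the 0.1187 s training-time gap quoted in the text can be checked with a few lines of arithmetic. The sketch below assumes that each "Ave." entry is the plain arithmetic mean over the three testing sets. The per-set values are copied from Tab.4, while the variable names are illustrative.

```python
# Per-testing-set values (I, II, III) copied from Tab.4; "t" is training time in s.
tab4 = {
    "SSA-ELM": {"R2": [0.8540, 0.8595, 0.9332], "MAE": [0.3334, 0.3530, 0.3208],
                "RMSE": [0.3997, 0.4333, 0.3886], "t": [0.2338, 0.2297, 0.2403]},
    "ELM":     {"R2": [0.7561, 0.8015, 0.8096], "MAE": [0.4837, 0.4598, 0.6058],
                "RMSE": [0.6090, 0.5265, 0.7221], "t": [0.1163, 0.1128, 0.1187]},
    "SSA-BP":  {"R2": [0.7326, 0.7778, 0.7040], "MAE": [0.4044, 0.4252, 0.5474],
                "RMSE": [0.5937, 0.5332, 0.6840], "t": [1.8436, 1.8563, 1.8282]},
}

def column_means(rows):
    """Arithmetic mean of each metric over the testing sets."""
    return {name: sum(values) / len(values) for name, values in rows.items()}

averages = {model: column_means(rows) for model, rows in tab4.items()}

# Extra training time of SSA-ELM relative to plain ELM, in seconds.
extra_time = averages["SSA-ELM"]["t"] - averages["ELM"]["t"]

for model, avg in averages.items():
    print(model, {k: round(v, 4) for k, v in avg.items()})
print("SSA-ELM minus ELM training time (s):", round(extra_time, 4))
```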
4.3 Approach to optimization of tunneling parameters
Because no instances of excessive surface settlement have been observed in this project so far, this study does not explore specific management approaches for such issues in detail. Nevertheless, in anticipation of potential risks, an approach to optimizing the tunneling parameters is outlined as follows.
1) Evaluate the correlation between each tunneling parameter and longitudinal surface settlement to identify the tunneling parameters that need adjustment and optimization, and specify feasible value ranges for these parameters.
2) Use the predicted longitudinal surface settlement as the fitness function, so that the minimum fitness corresponds to the global optimum, i.e., the optimal combination of tunneling parameters.
3) Select an appropriate intelligent swarm optimization algorithm and determine key parameters such as the search space, the initial population size and the number of iterations. During configuration, carefully consider the operational principles of the shield machine, the construction conditions and any potential constraints.
4) Execute the intelligent swarm optimization algorithm to derive the optimal tunneling parameters. In each iteration, update the tunneling parameters based on the current predictions of longitudinal surface settlement and recalculate the prediction. Through this iterative process, identify a set of optimal tunneling parameters that keeps longitudinal surface settlement within the anticipated specifications.
5) Continually monitor longitudinal surface settlement during actual construction and compare it with the output of the prediction model. If significant discrepancies are found between the predicted and actual values, the prediction model or the optimization algorithm may need to be adjusted to align with practical requirements. The optimization objectives and constraints should also be adjusted in response to real-world circumstances.
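The steps above can be sketched in code. The example below is only an illustration under stated assumptions. It uses a basic particle swarm optimizer as a stand-in for whichever intelligent swarm algorithm is selected, and the function `predicted_settlement` is a hypothetical placeholder for the trained prediction model. The three normalized parameters and their bounds are invented for the demonstration and do not represent real tunneling parameters.

```python
import numpy as np

def predicted_settlement(params):
    """Hypothetical stand-in for the trained predictor (settlement magnitude, mm)."""
    target = np.array([0.6, 0.3, 0.8])  # made-up optimum in normalized units
    return float(np.sum((params - target) ** 2))

def particle_swarm_minimize(fitness, lower, upper, n_particles=30, n_iter=100, seed=0):
    """Basic particle swarm search inside the box [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                                   # personal bests
    best_val = np.array([fitness(p) for p in pos])
    g = int(np.argmin(best_val))
    g_pos, g_val = best_pos[g].copy(), float(best_val[g])   # global best
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g_pos - pos)
        pos = np.clip(pos + vel, lower, upper)              # enforce feasible ranges
        vals = np.array([fitness(p) for p in pos])          # re-predict settlement
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g = int(np.argmin(best_val))
        if best_val[g] < g_val:
            g_pos, g_val = best_pos[g].copy(), float(best_val[g])
    return g_pos, g_val

# Feasible ranges for three normalized, hypothetical tunneling parameters.
lower, upper = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
best_params, best_fitness = particle_swarm_minimize(predicted_settlement, lower, upper)
print(best_params, best_fitness)
```

Clipping to the bounds is one simple way to respect the feasible value ranges from step 1. Operational constraints that couple several parameters would need a penalty term or a repair step instead.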
5 Conclusions and outlook
This study addresses the engineering challenge of excessive surface settlement induced by large-diameter shield machines in composite formations. Using the Jiangyin−Jingjiang Yangtze River Tunnel Left Line Project as an example, the study analyzed the longitudinal surface settlement patterns in both homogeneous and composite formations. The SSA-ELM model was constructed to predict the longitudinal surface settlement across different formation sections, and its predictive performance was compared with that of the ELM and SSA-BP models. The main conclusions are summarized as follows.
1) In both the sand formation and the composite formation of silty clay and sand, the maximum surface settlement changes occur along the tunnel axis. In the sand formation, the trend of longitudinal surface settlement exhibits an “initial heave followed by subsidence” pattern, whereas certain monitoring sections in the composite formation do not exhibit heave.
2) The SSA-ELM model was employed to predict longitudinal surface settlement in the sand and composite formations, with tunneling parameters, geological parameters, geometrical parameters and a custom parameter as inputs. R2, MAE, and RMSE were used as performance metrics. For the testing sets, the model achieved an average R2 of 0.8822, an average MAE of 0.3357, and an average RMSE of 0.4072. The proposed model has a well-structured framework, and its predictions closely align with the measured data, confirming its feasibility and practicality for predicting longitudinal surface settlement in this study.
3) On the same data set, the SSA-ELM model achieved an average R2 of 0.8822, higher than that of ELM (0.7891) and SSA-BP (0.7381). Its average MAE and RMSE (0.3357 and 0.4072) were lower than those of ELM (0.5164 and 0.6192) and SSA-BP (0.4590 and 0.6036). Regarding training time, the SSA-ELM model requires 0.2346 s, compared with 0.1159 s for the ELM model and 1.8427 s for the SSA-BP model. The SSA-ELM model is thus faster than SSA-BP but slower than ELM. Considering that it is only 0.1187 s slower than ELM and has higher prediction accuracy, it is deemed suitable for guiding the construction of this project.
4) This study has laid the groundwork for predicting longitudinal surface settlement. However, as no instances of excessive surface settlement were observed during construction, specific management methods were not explored. Nevertheless, in anticipation of potential risks, this study outlines a targeted approach to the optimization of tunneling parameters. The optimization model requires continuous adjustment and refinement in light of actual site conditions to remain relevant to the real-world operational context and applicable in engineering practice.