Real-time safety control of shield attitude considering tunneling efficiency

Tugen FENG; Jinjian HU; Jian ZHANG; Guoping REN; Yongbo LI; Xiaopeng ZHAO

doi:10.1007/s11709-026-1255-2

ENG. Struct. Civ. Eng ›› DOI: 10.1007/s11709-026-1255-2

RESEARCH ARTICLE

Abstract

Shield attitude control is a critical aspect that must be continuously monitored during shield tunneling. To achieve scientifically rational settings for shield tunneling parameters, this study constructed multiple machine learning prediction models for shield attitude deviations and tunneling speed, and optimized the hyperparameters of these models using Bayesian optimization. Subsequently, a constrained grey wolf optimization (GWO) algorithm was employed to establish a real-time safety control method for attitude that considers tunneling efficiency, by dynamically updating the upper and lower bounds for adjustable parameters. The results indicate that the k-nearest neighbors (KNN) model achieved the highest prediction accuracy; however, due to its specific algorithmic principles, KNN is unsuitable for optimization tasks. Embedding the extreme gradient boosting model into the GWO algorithm yielded the best attitude control performance: the absolute attitude deviations were reduced by an average of 45.1% compared to actual values, while the rate of change for adjustable parameters did not exceed 30%. This approach ensures safety and tunneling efficiency during attitude correction and exhibits universal applicability. Compared with other optimization algorithms, GWO demonstrated significant advantages in both optimization effectiveness and computational time.

Keywords

shield attitude / tunneling speed / K-nearest neighbors / support vector regression / extreme gradient boosting / grey wolf optimization

Cite this article

Tugen FENG, Jinjian HU, Jian ZHANG, Guoping REN, Yongbo LI, Xiaopeng ZHAO. Real-time safety control of shield attitude considering tunneling efficiency. ENG. Struct. Civ. Eng DOI:10.1007/s11709-026-1255-2

1 Introduction

Shield tunneling, as an efficient, safe, and environmentally friendly tunnel construction method, has become an indispensable tool in the construction of underground rail transit systems [1]. Due to the complexity of underground spaces, designers plan an optimal excavation route based on geological survey results and design speed requirements before the shield machine is launched from the shaft. This planned route is known as the shield design axis, along which the shield must advance strictly during the excavation process. Currently, the operation of shield machines primarily relies on the skills of shield drivers. However, given the increasing number of shield tunnels, experienced shield drivers are relatively scarce. Moreover, shield drivers do not always make the optimal decisions, leading to situations where the shield attitude deviates significantly from the designed axis. If not properly managed, this can result in excavation face instability [2], segment damage [3], excessive surface settlement [4], and other risks, potentially causing serious engineering accidents. Therefore, research on shield attitude control is of paramount importance.

Shield attitude variation involves complex machine-soil interactions. Sugimoto et al. [5] first established a theoretical dynamic load model for shields during excavation. Zhang et al. [6] developed mechanical equilibrium equations for shields, accurately describing their attitude characteristics. Shen et al. [7] further derived a computational method for shield attitude based on the foundation reaction curve. While these studies provide the theoretical basis for shield attitude changes, their practical applicability in guiding actual construction is limited due to geological variability. Given the inherent complexity of shield machines, some scholars have approached the problem from a mechanical perspective, researching shield attitude control systems [8–10]. These systems are applicable during the shield design phase but offer limited guidance for on-site tunneling operations. Additionally, finite element simulations of the tunneling process can provide engineering recommendations for attitude adjustment [3,11], yet they struggle to meet real-time requirements.

The development of artificial intelligence has opened new avenues for shield attitude control. By leveraging the massive data generated during shield tunneling, it enables accurate and real-time prediction of future attitude changes. Since shield tunneling data are recorded at specific time intervals, time-series models such as long short-term memory (LSTM) and gated recurrent unit (GRU) are widely adopted by scholars for attitude prediction. Chen et al. [12] proposed an attitude deviation prediction model based on time-aware LSTM by incorporating a time-decay function. Zhang et al. [13] used principal component analysis (PCA) to reduce data dimensionality and associated noise, enabling real-time attitude deviation prediction through GRU’s rolling-window mode. Fu et al. [14] similarly employed PCA for multi-source data fusion and integrated it with temporal convolutional networks (TCN) to form a hybrid deep learning method for dynamic attitude and position prediction in super-large-diameter shields. Other time-series methods for attitude deviation prediction include the adaptive boosting (AdaBoost)-GRU model [15], the wavelet transform filter-convolutional neural network (CNN)-LSTM hybrid deep learning model [16], and the LSTM-Transformer model [17]. Additionally, traditional machine learning models such as Bayesian machine learning [18], light gradient boosting machine (LGBM) [19], and random forest (RF) [20] have been applied. Xiao et al. [21] compared four machine learning models (k-nearest neighbors (KNN), support vector regression (SVR), RF, AdaBoost) and four deep learning models (backpropagation neural network, CNN, LSTM, GRU) for shield attitude prediction and developed an early-warning system.

Accurate attitude deviation prediction provides critical references for engineering practice, while intelligent control based on such predictions holds greater practical value. Hu et al. [22] combined extreme gradient boosting (XGBoost) and Shapley additive explanations algorithms to establish an interpretable predictive model for shield attitude control, clarifying adjustment directions for key parameters. Xiao et al. [23] and Wang et al. [24] both developed association models linking attitude deviations with tunneling parameters, defining optimal control ranges for these parameters. To further advance automated tunneling, Xu et al. [25] proposed a multi-agent deep reinforcement learning framework based on state classification and assignment for adaptive shield attitude control. Optimization algorithms such as the grid method [26], grey wolf optimization [27], non-dominated sorting genetic algorithm II (NSGA-II) [28], and NSGA-III [29] have also been utilized for attitude correction, providing quantitative parameter settings for on-site tunneling guidance.

The artificial intelligence-enabled shield attitude control method, combined with on-site tunneling data, achieves high-precision real-time adjustment of shield attitude utilizing machine learning and optimization algorithms. However, most current studies optimize only the single objective of attitude deviations while neglecting tunneling efficiency, an approach that misaligns with practical engineering requirements. Additionally, to ensure safe and stable operation while reducing energy consumption, tunneling parameters should avoid drastic adjustments, a constraint rarely considered in existing research. To address these dual challenges, this study proposes a real-time safety control method for shield attitude that integrates tunneling efficiency. Through comparative analysis, the most suitable machine learning model for attitude correction is identified.

The rest of this paper is organized as follows. Section 2 introduces the project background and the original data analysis on which this paper is based. Section 3 presents the specific methods for real-time control of shield attitude. Section 4 validates and discusses the proposed methods. Section 5 elaborates on the innovations and limitations. Section 6 concludes this paper.

2 Project

The engineering project studied in this paper is the Shenda Intercity Railway Tunnel in northern Shenzhen City, Guangdong Province, China. It runs east–west, primarily traversing urban built-up areas. Construction employs twin tunnels: a left-line shield section of 3409 m and a right-line section of 3530 m, both constructed using the double-mode shield tunneling method. Key shield parameters are listed in Table 1. Shield propulsion is achieved through thrust cylinders, hydraulic pump stations, and control systems. The cylinder arrangement adapts to segment geometry, and directional adjustment during excavation requires cylinder grouping. The shield features 19 sets of dual cylinders divided into six regions: top (Group F), upper right (Group A), lower right (Group B), bottom (Group C), lower left (Group D), and upper left (Group E). Independent pressure control in these six zones enables shield attitude correction and steering. The cylinder arrangement is shown in Fig. 1.

3 Methodology

The methodology in this paper, illustrated in Fig. 2, comprises three interconnected phases. First, the raw tunneling data are preprocessed by removing non-tunneling-state records, detecting outliers, and normalizing, which yields clean data sets for machine learning. Second, the parameters are categorized into adjustable operational controls and non-adjustable status indicators, which serve as inputs to train prediction models for the attitude deviations and the tunneling speed (TS), with hyperparameters tuned by Bayesian optimization. Third, the trained models are embedded in a constrained grey wolf optimization (GWO) algorithm that iteratively minimizes attitude deviations under a minimum-speed constraint, adjusting parameters continuously across successive tunneling cycles until site requirements are met.

3.1 Data preprocessing

3.1.1 Data set introduction and parameter filtering

The data set used in this paper is sourced from the left-line shield section, with data recorded at 1-min intervals. The data set comprises 38542 timestamps collected from 2023-12-26T15:43:00 to 2024-01-22T16:15:00. During this period, the shield tunnel traversed slightly weathered granite formations. Typically, shield attitude deviations include the horizontal deviation of the shield head (HDSH), the horizontal deviation of the shield tail (HDST), the vertical deviation of the shield head (VDSH), and the vertical deviation of the shield tail (VDST). Figure 3 illustrates these four attitude deviation components. Figure 4 shows the distribution histogram of these four directional attitude deviations. As can be seen, the HDSH is primarily distributed between –10 and 20 mm, the HDST between –40 and 30 mm, the VDSH between 0 and 50 mm, and the VDST between –50 and 30 mm. Overall, the shield attitude control was effective, but there are a few instances where the absolute value of attitude deviations exceeded 50 mm, even reaching 100 mm. This indicates that relying solely on the experience of the shield driver cannot ensure that the shield attitude deviations remain within a small range; scientifically grounded real-time control methods should be adopted. Additionally, the shield tunneling speed (TS) refers to the distance the shield advances along the tunnel axis per unit time, reflecting the efficiency of the shield tunneling. Therefore, we establish mapping relationships between tunneling parameters and both the four attitude deviations and TS. Following consultations with field engineers and the literature [30–32], the selected adjustable parameters are the cutterhead speed (CS) and the propulsion pressures for Groups A–F (PA, PB, PC, PD, PE, PF); the selected non-adjustable parameters are the cutterhead torque, penetration rate, and total thrust force.

3.1.2 Remove non-tunneling state data

From a temporal perspective, frequent interruptions occur during segment installation, causing the shield machine to enter a standby state. These non-tunneling data provide no value for the research in this paper and should be detected and removed. A record is considered to be in a non-tunneling state when $S$ in Eq. (1) equals 0 [14]:

$$S = \mathrm{CT} \times \mathrm{TTF} \times \mathrm{TS} \times \mathrm{CS}, \tag{1}$$

where $\mathrm{CT}$ represents the cutterhead torque, $\mathrm{TTF}$ the total thrust force, $\mathrm{TS}$ the tunneling speed, and $\mathrm{CS}$ the cutterhead speed.
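As a minimal sketch of how the Eq. (1) filter could be applied, the snippet below keeps only records whose product of cutterhead torque, total thrust force, tunneling speed, and cutterhead speed is non-zero. The function name `is_tunneling` and the sample records are hypothetical illustrations, not values from the project data set.

```python
def is_tunneling(ct, ttf, ts, cs):
    """Return True when S = CT * TTF * TS * CS is non-zero (tunneling state)."""
    return ct * ttf * ts * cs != 0


# Hypothetical records: (cutterhead torque, total thrust force,
#                        tunneling speed, cutterhead speed)
records = [
    (1200.0, 15000.0, 18.5, 2.1),  # advancing
    (0.0, 3000.0, 0.0, 0.0),       # standby during segment installation
    (1100.0, 14800.0, 0.0, 2.0),   # cutterhead turning, but no advance
]

# Keep only the records that are in a tunneling state.
tunneling_records = [r for r in records if is_tunneling(*r)]
```

With noisy sensor readings, a small tolerance around zero may be preferable to an exact comparison; that choice is an assumption left to the user.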

3.1.3 Identify outliers

From a parametric perspective, tunneling data may contain outliers due to sensor failures and other reasons. This paper uses the $3\sigma$ criterion to identify outliers. The $3\sigma$ criterion is a statistical rule stating that, for normally distributed data, the probability of a value falling within the range $(\mu - 3\sigma, \mu + 3\sigma)$ (where $\mu$ is the mean of the data set and $\sigma$ is the standard deviation) is 0.9973. Data outside this range (< 0.3% probability) are designated as outliers. Identified outliers are handled in two steps. First, for outliers in the attitude deviations and TS (i.e., the model output parameters), all tunneling parameter data at the time of the outlier are deleted. Then, for outliers in the remaining input parameters, the outlying values are replaced with the corresponding tunneling parameter values from the previous moment.
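The $3\sigma$ screening and the previous-value replacement can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions: the helper names `three_sigma_mask` and `replace_with_previous` and the sample pressure series are hypothetical, and the population standard deviation is assumed.

```python
import numpy as np


def three_sigma_mask(values):
    """Boolean mask: True where a value lies more than 3*sigma from the mean."""
    values = np.asarray(values, dtype=float)
    mu, sigma = values.mean(), values.std()
    return np.abs(values - mu) > 3.0 * sigma


def replace_with_previous(values, mask):
    """Replace flagged entries with the value from the previous timestep."""
    cleaned = np.asarray(values, dtype=float).copy()
    for i in np.flatnonzero(mask):
        if i > 0:  # the first record has no predecessor and is left as-is
            cleaned[i] = cleaned[i - 1]
    return cleaned


# Hypothetical propulsion-pressure series with one sensor spike at index 15.
pressures = np.full(30, 50.0)
pressures[15] = 500.0

mask = three_sigma_mask(pressures)
cleaned = replace_with_previous(pressures, mask)
```

For output parameters, the same mask would instead be used to drop the whole record, for example with `data[~mask]`.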

3.1.4 Normalization

Following the basic data preprocessing steps of eliminating non-tunneling-state data and identifying outliers, feature extraction must be performed on input parameters. This is necessary because shield tunneling parameters vary greatly: some exhibit extensive value ranges while others remain relatively concentrated. If used directly in machine learning model training, parameters with wide value ranges would dominate distance or weight calculations, thereby overshadowing the significance of other parameters. Feature extraction techniques include standardization, normalization, and others [33]. This paper employs normalization to eliminate dimensional influences between different features:

$$y_t' = \frac{y_t - y_{\min}}{y_{\max} - y_{\min}}, \tag{2}$$

where $y_t$ is the raw data at time $t$, $y_{\min}$ is the minimum value of the data set, $y_{\max}$ is the maximum value of the data set, and $y_t'$ is the normalized data at time $t$.
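A minimal sketch of the min-max normalization in Eq. (2) is given below. The function name `min_max_normalize` and the sample values are hypothetical; the handling of a constant series (all zeros) is an added assumption, since Eq. (2) is undefined when the maximum equals the minimum.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] using y' = (y - y_min) / (y_max - y_min)."""
    y_min, y_max = min(values), max(values)
    span = y_max - y_min
    if span == 0:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(y - y_min) / span for y in values]


# Hypothetical propulsion pressures in bar.
normalized = min_max_normalize([10.0, 55.0, 100.0])
print(normalized)  # [0.0, 0.5, 1.0]
```

In a train/test workflow, the minimum and maximum would usually be taken from the training data and reused for new samples.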

3.2 Machine learning models

The adjustment of shield tunneling parameters introduces temporal delays in attitude deviations, meaning that current tunneling parameters do not immediately lead to changes in attitude but exhibit predictable lag effects. This temporal causality enables proactive prediction of attitude deviations. Therefore, before implementing automatic attitude correction, establishing a mapping relationship between current parameters and subsequent attitude deviations at the next timestep is essential. This paper introduces five machine learning models: KNN, XGBoost, light gradient boosting machine (LGBM), SVR, and multiple linear regression (MLR). By comparing these, we select the most suitable model to construct the relationship between tunneling parameters and attitude deviations.

3.2.1 K-nearest neighbors

The KNN algorithm [34] applies to regression tasks. It predicts the target value of a new sample point by measuring the distance between the query point and the labeled training data, then averaging the values of its k closest neighbors. The specific steps are as follows:

Step 1: Calculate the distance between each point in the labeled training data set and the current point, generally using the Euclidean distance formula:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}, \tag{3}$$

where $d(x, y)$ refers to the distance between sample points $x$ and $y$, and $x_i$ and $y_i$ are the $i$th features of samples $x$ and $y$, respectively.

Step 2: Sort the calculated distances in ascending order.

Step 3: Select the k points with the smallest distances from the current point.

Step 4: Return the average value of the first k points as the predicted value for the current point.
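The four steps above can be sketched in a few lines of NumPy. This is an illustrative implementation only; the function name `knn_predict` and the toy training data are hypothetical.

```python
import numpy as np


def knn_predict(X_train, y_train, query, k=3):
    """Predict a target value as the mean of the k nearest training targets."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    query = np.asarray(query, dtype=float)
    # Step 1: Euclidean distance from the query to every training sample.
    distances = np.sqrt(((X_train - query) ** 2).sum(axis=1))
    # Step 2: sort the distances in ascending order.
    order = np.argsort(distances)
    # Step 3: keep the k closest samples.
    nearest = order[:k]
    # Step 4: average their target values.
    return y_train[nearest].mean()


# Toy data: three samples near the origin and one far away.
X_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
y_train = [1.0, 2.0, 3.0, 10.0]
prediction = knn_predict(X_train, y_train, query=[0.1, 0.1], k=3)
print(prediction)  # 2.0
```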

3.2.2 Extreme gradient boosting

XGBoost is a scalable and optimized gradient boosting algorithm [35]. Its basic components are decision trees, which are known as “weak learners”. These decision trees collectively form XGBoost, making it an ensemble learning algorithm. Each decision tree is a classification and regression tree model used for regression or classification tasks. During training, XGBoost starts with the objective function to derive the weight value of each leaf node, the information gain after splitting nodes, and the feature importance ranking function. The construction of the current decision tree begins with a greedy algorithm that selects the feature to use at each node based on the calculated gain of the objective function. In prediction, input features are fed into each decision tree in turn, and each tree’s corresponding nodes have their predictive weights. The sum of all predictive weights yields the final prediction result.

3.2.3 Light gradient boosting machine

The fundamental principle of LGBM [36] is the same as that of XGBoost. It uses tree-based learning algorithms and significantly improves training speed and model performance through a series of optimization techniques. Its most representative feature is a histogram-based split point selection algorithm: continuous feature values are discretized into k intervals (bins), a histogram is constructed to count the number of samples in each interval, and the discrete bin values are then traversed to find the optimal split point. This method allows LGBM to traverse only the k bin values of the histogram, rather than all candidate feature values as XGBoost does, thereby significantly improving training speed. Additionally, LGBM employs a leaf-wise growth strategy with depth limitations. It selects the leaf with the highest split gain from the current set of leaves for splitting while setting a maximum depth to prevent overfitting.

3.2.4 Support vector regression

SVR [37] is a regression method based on support vector machines. Its core idea is to fit the data by minimizing prediction errors while maintaining a margin within which most data points fall. To handle nonlinear problems, SVR uses kernel functions to map the data into a high-dimensional space, where it seeks an optimal hyperplane to fit the training data. This hyperplane not only maximizes the margin from the training data but also minimizes the loss for the training data. SVR employs an ε-insensitive loss function, meaning that when the prediction error is smaller than a predefined threshold ε, no loss is calculated. This effectively creates a band of width 2ε around the target value. To prevent overfitting, SVR introduces a penalty term concept by setting a regularization parameter C, which penalizes data points that fall outside this band.
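The ε-insensitive loss described above can be written as $L_\varepsilon = \max(0, |y - \hat{y}| - \varepsilon)$. The short sketch below evaluates it for two hypothetical predictions; the function name and the numbers are illustrative assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the band of half-width epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Error of 0.25 mm is inside the band (epsilon = 0.5), so no loss is counted.
inside = epsilon_insensitive_loss(10.0, 10.25, epsilon=0.5)
# Error of 1.0 mm exceeds the band by 0.5, which is the loss.
outside = epsilon_insensitive_loss(10.0, 11.0, epsilon=0.5)
print(inside, outside)  # 0.0 0.5
```

In the full SVR objective, the regularization parameter C weights the sum of these losses against the model complexity term.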

3.2.5 Multiple linear regression

Linear regression is one of the simplest regression algorithms in machine learning. MLR [38] refers to a linear regression problem in which each sample has multiple features. The basic principle of MLR is to fit a linear model that describes the relationship between independent variables and dependent variables. This linear model typically uses the least squares method to estimate parameters, minimizing the sum of squared residuals between predicted values and actual observations.
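As a small illustration of least squares estimation, the sketch below fits an MLR model with `numpy.linalg.lstsq` on synthetic data generated from a known linear relationship. The data and coefficient values are hypothetical and chosen so that the recovered coefficients can be checked by hand.

```python
import numpy as np

# Synthetic samples generated from y = 1 + 2*x1 + 3*x2 (no noise).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the intercept is estimated with the slopes.
A = np.column_stack([np.ones(len(X)), X])

# Least squares solution minimizing the sum of squared residuals.
coef, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coef, 6))  # approximately [1. 2. 3.]
```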

3.3 Real-time safety control method for attitude

3.3.1 Construction of machine learning models based on Bayesian optimization

To obtain the most suitable machine learning model for optimizing shield tunneling parameters, this paper constructs the five machine learning models introduced in Subsection 3.2 for both the four directional attitude deviations and the TS. The input to each machine learning model consists of the adjustable parameters at time $t$ and the adjustable and non-adjustable parameters at time $t-1$. Since the geological conditions corresponding to the data set in this paper are homogeneous, it is unnecessary to include them as input parameters in the model. However, the non-adjustable parameters at time $t-1$ (cutterhead torque, penetration rate, total thrust force) are widely recognized as representative indicators of shield machine performance [39–41]. Incorporating these parameters as model inputs indirectly reflects the influence of geological formations on tunneling performance. The non-adjustable parameters at time $t$ are excluded because they are unsuitable as independent variables in the subsequent optimization process for attitude deviations.

After determining the input and output parameters of the model, the data set is divided into training, validation, and testing sets at a ratio of 4:1:1. Using the Bayesian optimization algorithm [42], with the negative mean absolute error between predicted and actual values as the optimization objective, the hyperparameters of each machine learning model are optimized to improve prediction accuracy.
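The input layout described in this subsection can be sketched as a lagged feature matrix. The snippet below is an illustration under stated assumptions: the helper name `build_samples` is hypothetical, the arrays are synthetic, and the label is assumed to be the output one step after the current adjustable parameters, following the next-timestep mapping described in Subsection 3.2.

```python
import numpy as np


def build_samples(adjustable, non_adjustable, targets):
    """Stack [adjustable(t), adjustable(t-1), non_adjustable(t-1)] as inputs.

    Assumption: the label for each row is the target at the following step.
    """
    adjustable = np.asarray(adjustable, dtype=float)
    non_adjustable = np.asarray(non_adjustable, dtype=float)
    targets = np.asarray(targets, dtype=float)
    X = np.hstack([
        adjustable[1:-1],      # adjustable parameters at the current step
        adjustable[:-2],       # adjustable parameters one step earlier
        non_adjustable[:-2],   # non-adjustable parameters one step earlier
    ])
    y = targets[2:]            # outputs one step after the current step
    return X, y


# Synthetic series: 6 timesteps, 7 adjustable (CS, PA..PF),
# 3 non-adjustable, and 5 outputs (four deviations and TS).
T = 6
adjustable = np.arange(T * 7).reshape(T, 7)
non_adjustable = np.arange(T * 3).reshape(T, 3)
targets = np.arange(T * 5).reshape(T, 5)

X, y = build_samples(adjustable, non_adjustable, targets)
print(X.shape, y.shape)  # (4, 17) (4, 5)
```

Each output column would then be paired with `X` to train one model per attitude deviation and one for the TS.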

3.3.2 Constrained Grey Wolf Optimization algorithm

Subsubsection 3.3.1 implements the advanced real-time prediction of shield machine attitude deviations and TS at time $t+1$. Based on this, if the predicted attitude deviations exceed a predefined limit, measures can be taken to correct the shield machine’s attitude deviations in advance. The main idea is to adjust the adjustable parameters at time $t+1$ after obtaining the adjustable and non-adjustable parameters at time $t$, so that the attitude deviations converge at time $t+2$. During this process, the upper and lower optimization limits for the adjustable parameters at time $t+1$ are set as a fluctuation range around the values at time $t$, without exceeding the maximum adjustment range of the parameters. That is, these optimization boundaries can be flexibly adjusted in real time based on actual conditions. This approach prevents drastic changes in CS and the propulsion pressures for Groups A–F, enhancing the safety of attitude correction. Concurrently, a lower limit for the shield TS is established to ensure excavation efficiency. The above optimization problem can be summarized as:

$$
\begin{aligned}
&\min \left( |\mathrm{HDSH}_{t+2}| + |\mathrm{HDST}_{t+2}| + |\mathrm{VDSH}_{t+2}| + |\mathrm{VDST}_{t+2}| \right) \\
&\qquad = |f_{\mathrm{HDSH}}(\mathrm{input})| + |f_{\mathrm{HDST}}(\mathrm{input})| + |f_{\mathrm{VDSH}}(\mathrm{input})| + |f_{\mathrm{VDST}}(\mathrm{input})|, \\
&\mathrm{input} = \left\{ \mathrm{CS}_{t+1}, \mathrm{PA}_{t+1}, \mathrm{PB}_{t+1}, \mathrm{PC}_{t+1}, \mathrm{PD}_{t+1}, \mathrm{PE}_{t+1}, \mathrm{PF}_{t+1}, \text{adjustable\_input}_t, \text{non-adjustable\_input}_t \right\}, \\
&\text{s.t.}
\begin{cases}
\max[\mathrm{CS}_{\min}, \mathrm{CS}_t \times (1-\lambda)] < \mathrm{CS}_{t+1} < \min[\mathrm{CS}_{\max}, \mathrm{CS}_t \times (1+\lambda)] \\
\max[\mathrm{PA}_{\min}, \mathrm{PA}_t \times (1-\lambda)] < \mathrm{PA}_{t+1} < \min[\mathrm{PA}_{\max}, \mathrm{PA}_t \times (1+\lambda)] \\
\max[\mathrm{PB}_{\min}, \mathrm{PB}_t \times (1-\lambda)] < \mathrm{PB}_{t+1} < \min[\mathrm{PB}_{\max}, \mathrm{PB}_t \times (1+\lambda)] \\
\max[\mathrm{PC}_{\min}, \mathrm{PC}_t \times (1-\lambda)] < \mathrm{PC}_{t+1} < \min[\mathrm{PC}_{\max}, \mathrm{PC}_t \times (1+\lambda)] \\
\max[\mathrm{PD}_{\min}, \mathrm{PD}_t \times (1-\lambda)] < \mathrm{PD}_{t+1} < \min[\mathrm{PD}_{\max}, \mathrm{PD}_t \times (1+\lambda)] \\
\max[\mathrm{PE}_{\min}, \mathrm{PE}_t \times (1-\lambda)] < \mathrm{PE}_{t+1} < \min[\mathrm{PE}_{\max}, \mathrm{PE}_t \times (1+\lambda)] \\
\max[\mathrm{PF}_{\min}, \mathrm{PF}_t \times (1-\lambda)] < \mathrm{PF}_{t+1} < \min[\mathrm{PF}_{\max}, \mathrm{PF}_t \times (1+\lambda)] \\
\mathrm{TS}_{t+2} = f_{\mathrm{TS}}(\mathrm{input}), \quad \mathrm{TS}_{t+2} > \mathrm{TS}_{\min},
\end{cases}
\end{aligned}
\tag{4}
$$

where $f_{\mathrm{HDSH}}$, $f_{\mathrm{HDST}}$, $f_{\mathrm{VDSH}}$, $f_{\mathrm{VDST}}$, and $f_{\mathrm{TS}}$ represent the prediction models for the four directional attitude deviations and the TS, respectively. $\lambda$ denotes the fluctuation ratio of adjustable parameters. $\mathrm{CS}_{\min}$ and $\mathrm{CS}_{\max}$ represent the minimum and maximum cutterhead rotational speeds, respectively. $\mathrm{PA}_{\min}$ and $\mathrm{PA}_{\max}$, $\mathrm{PB}_{\min}$ and $\mathrm{PB}_{\max}$, $\mathrm{PC}_{\min}$ and $\mathrm{PC}_{\max}$, $\mathrm{PD}_{\min}$ and $\mathrm{PD}_{\max}$, $\mathrm{PE}_{\min}$ and $\mathrm{PE}_{\max}$, and $\mathrm{PF}_{\min}$ and $\mathrm{PF}_{\max}$ represent the minimum and maximum values of the propulsion pressures of Groups A, B, C, D, E, and F, respectively. Finally, $\mathrm{TS}_{\min}$ represents the minimum value of the TS.
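The dynamic bounds can be illustrated with a short sketch. It assumes that the lower limit is the value at time $t$ scaled by $(1-\lambda)$ and the upper limit is the same value scaled by $(1+\lambda)$, as implied by the description of a fluctuation range around the values at time $t$, with both clipped to the hard limits. The function name `dynamic_bounds` and the current readings are hypothetical.

```python
def dynamic_bounds(current, hard_min, hard_max, lam):
    """Return (lower, upper) search limits for one adjustable parameter.

    Assumes a positive current value, so (1 - lam) shrinks and (1 + lam) grows it.
    """
    lower = max(hard_min, current * (1.0 - lam))
    upper = min(hard_max, current * (1.0 + lam))
    return lower, upper


# Hypothetical cutterhead speed of 2.0 r/min, hard range 1.3-4.2 r/min.
cs_bounds = dynamic_bounds(2.0, 1.3, 4.2, lam=0.3)   # about (1.4, 2.6)

# Hypothetical Group A pressure of 90 bar, hard range 10-100 bar:
# the +30% limit (117 bar) is clipped to the hard maximum of 100 bar.
pa_bounds = dynamic_bounds(90.0, 10.0, 100.0, lam=0.3)  # about (63.0, 100.0)
```

Because the bounds are recomputed from the latest readings at every step, the search region follows the machine state rather than staying fixed.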

This paper uses grey wolf optimization (GWO) [43] to solve the constrained optimization problem. The GWO algorithm is a metaheuristic optimization method that simulates the hunting behavior and social hierarchy of wolves in nature. The GWO algorithm divides the wolf population into four types: $\alpha$, $\beta$, $\delta$, and $\omega$. Here, $\alpha$ represents the leader who makes decisions; $\beta$ represents the subordinates who assist $\alpha$ in making decisions; $\delta$ follows the commands of $\alpha$ and $\beta$, and is mainly responsible for scouting and guarding; $\omega$ are the ordinary members who maintain the internal balance of the pack. In the context of optimization problems, $\alpha$, $\beta$, $\delta$, and $\omega$ correspond to the optimal solution, the second-best solution, the third-best solution, and the candidate solutions, respectively. The GWO algorithm conducts search and optimization by simulating the hunting behavior of wolves, with its optimization process illustrated in Fig. 5.

In the diagram, $\vec{A}$ and $\vec{C}$ are coefficient vectors, calculated as $\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}$ and $\vec{C} = 2\vec{r}_2$, where $\vec{r}_1$ and $\vec{r}_2$ are random vectors with components in the range [0, 1], and $\vec{a}$ is a convergence factor whose value decreases from 2 to 0 as the number of iterations increases.
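To make the search procedure concrete, the sketch below implements the standard GWO position update, in which each wolf moves toward the average of three positions guided by $\alpha$, $\beta$, and $\delta$ using the vectors $\vec{A}$ and $\vec{C}$. It is a minimal illustration, not the authors' implementation: the toy deviation and speed functions stand in for the trained prediction models, and enforcing the speed limit through a penalty term is an assumption, since the constraint-handling scheme is not specified in the text.

```python
import numpy as np


def gwo_minimize(objective, lower, upper, n_wolves=20, n_iter=30, seed=0):
    """Minimal grey wolf optimizer for a box-bounded minimization problem."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    wolves = rng.uniform(lower, upper, size=(n_wolves, dim))
    fitness = np.array([objective(w) for w in wolves])
    order = np.argsort(fitness)[:3]
    leaders = wolves[order].copy()        # alpha, beta, delta positions
    leader_fit = fitness[order].copy()
    history = [leader_fit[0]]

    for it in range(n_iter):
        a = 2.0 - 2.0 * it / n_iter       # convergence factor: 2 -> 0
        for i in range(n_wolves):
            candidates = []
            for leader in leaders:
                r1, r2 = rng.random(dim), rng.random(dim)
                A = 2.0 * a * r1 - a
                C = 2.0 * r2
                D = np.abs(C * leader - wolves[i])
                candidates.append(leader - A * D)
            # New position: mean of the three leader-guided moves, kept in bounds.
            wolves[i] = np.clip(np.mean(candidates, axis=0), lower, upper)
        fitness = np.array([objective(w) for w in wolves])
        # Keep the three best solutions seen so far as the new leaders.
        pool = np.vstack([leaders, wolves])
        pool_fit = np.concatenate([leader_fit, fitness])
        order = np.argsort(pool_fit)[:3]
        leaders, leader_fit = pool[order].copy(), pool_fit[order].copy()
        history.append(leader_fit[0])
    return leaders[0], leader_fit[0], history


# Hypothetical stand-ins for the trained models (not from the paper).
def toy_deviation(x):
    return abs(x[0] - 1.0) + abs(x[1] - 2.0)


def toy_speed(x):
    return 10.0 * x[0] + 2.0 * x[1]


TS_MIN = 15.0  # lower speed limit used for this toy problem


def penalized(x):
    # Assumption: the speed constraint is enforced with a penalty term.
    violation = max(0.0, TS_MIN - toy_speed(x))
    return toy_deviation(x) + 1e3 * violation


best_x, best_fit, history = gwo_minimize(penalized, [0.0, 0.0], [5.0, 5.0])
```

In the setting of this paper, `lower` and `upper` would be the dynamic limits on the adjustable parameters at time $t+1$, and the objective would call the embedded prediction models.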

Using the adjustable parameters at time $t+1$ as independent variables, we define the optimization objective as minimizing the attitude deviations at time $t+2$. The TS at time $t+2$ is incorporated as a dynamic constraint. By integrating the selected machine learning models into the GWO framework, a quantitative mapping from adjustable parameters to attitude deviations and TS is constructed. This process allows for the quantitative setting of adjustable parameters, achieving real-time automatic control of the shield attitude.

4 Verification and discussion

The computer processor used for training and testing the machine learning models, as well as optimizing the attitude deviations, is an AMD Ryzen 5 with 16GB of memory. The operating system is Windows 10 (64-bit). The programming language version used is Python 3.7.13, and the integrated development environments are Jupyter Notebook and PyCharm.

4.1 Model prediction results

After optimization using the Bayesian algorithm, the optimal hyperparameters for the five machine learning models corresponding to attitude deviations and TS are shown in Table 2.

Taking the HDSH as an example, Fig. 6 shows a comparison between the predicted and actual values of the five machine learning models. It can be seen that, except for MLR, the other four machine learning models achieved accurate predictions of the HDSH.

To quantitatively compare the prediction accuracy of different machine learning models, two performance evaluation metrics are used: mean absolute error ($\mathrm{MAE}$) and coefficient of determination ($R^2$). The definitions of these metrics are as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i - y_i|, \tag{5}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{n} (\bar{y} - y_i)^2}, \tag{6}$$

where $\hat{y}_i$ represents the predicted value, $y_i$ represents the actual value, and $\bar{y}$ represents the mean value. The closer $\mathrm{MAE}$ is to 0 and the closer $R^2$ is to 1, the higher the prediction accuracy of the model.
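The two metrics can be computed directly from their definitions. The sketch below uses plain Python and a small set of hypothetical actual and predicted deviations, chosen so that the results can be verified by hand.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between predictions and actual values."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R^2: one minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((mean_true - t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted deviations in mm.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]

mae = mean_absolute_error(y_true, y_pred)  # (2 + 2 + 3 + 1) / 4 = 2.0
r2 = r_squared(y_true, y_pred)             # 1 - 18 / 500 = 0.964
```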

Table 3 shows the prediction performance of attitude deviations and TS under the five machine learning models. As can be seen from Table 3, for the four directional attitude deviations, the KNN model has the highest prediction accuracy, with an average $\mathrm{MAE}$ of 1.123 mm and an average $R^2$ of 0.939. The XGBoost model has an average $\mathrm{MAE}$ of 1.536 mm and an average $R^2$ of 0.942, which is similar in prediction accuracy and second only to the KNN model. The SVR model has an average $\mathrm{MAE}$ of 1.608 mm and an average $R^2$ of 0.934, which is slightly less accurate than the XGBoost model. The LGBM model has an average $\mathrm{MAE}$ of 1.893 mm and an average $R^2$ of 0.932, showing a significant decrease in prediction accuracy compared to the previous models. The MLR model has an average $\mathrm{MAE}$ of 7.863 mm and an average $R^2$ of 0.433, which is much less accurate than the other models. For the TS, except for the MLR model, the prediction accuracy of the other four machine learning models is similar, and they achieve relatively accurate advanced real-time prediction.

The choice of feature extraction methods significantly impacts the predictive performance of machine learning models. Figure 7 illustrates the average prediction performance of each model under normalization, standardization, and no feature extraction. For results using standardization, the $\mathrm{MAE}$ averages for the HDSH, HDST, VDSH, VDST, and TS models were 2.307 mm, 4.119 mm, 2.923 mm, 3.397 mm, and 1.355 mm/min, respectively. The $R^2$ averages were 0.778, 0.797, 0.805, 0.867, and 0.673. Compared to predictions using normalization, these $\mathrm{MAE}$ averages increased by 0.9%, 18.7%, 33.4%, 3.9%, and 15.4%, respectively, while the $R^2$ averages decreased by 0.4%, 2.6%, 7.8%, 0.6%, and 10.3%. Without feature extraction, the $\mathrm{MAE}$ averages for the HDSH, HDST, VDSH, VDST, and TS models were 3.947 mm, 6.451 mm, 4.274 mm, 6.759 mm, and 1.681 mm/min, respectively. The $R^2$ averages were 0.570, 0.550, 0.618, 0.595, and 0.525. These $\mathrm{MAE}$ averages increased by 72.6%, 85.9%, 95.1%, 106.7%, and 43.2% compared to normalization, while the $R^2$ averages decreased by 27.0%, 32.8%, 29.3%, 31.8%, and 30.1%. This demonstrates that feature extraction substantially influences model performance in this predictive task, and that normalization yields superior prediction accuracy.

4.2 Attitude control results

To save computational resources in practical applications, this paper sets the attitude deviation limit to 30 mm, which is 60% of the maximum displacement of 50 mm specified in the Chinese standard (GB 50446-2017, 2017). This means that if the absolute value of the attitude deviation in any direction exceeds 30 mm, correction will be applied; otherwise, the shield machine will continue tunneling without attitude correction. Considering the ranges of adjustable parameters in the original data set, the maximum and minimum values for the adjustable parameters in Subsubsection 3.3.2 are set as follows: $\mathrm{CS}_{\min}$ = 1.3 r/min, $\mathrm{CS}_{\max}$ = 4.2 r/min, $\mathrm{PA}_{\min}$ = 10 bar, $\mathrm{PA}_{\max}$ = 100 bar, $\mathrm{PB}_{\min}$ = 30 bar, $\mathrm{PB}_{\max}$ = 150 bar, $\mathrm{PC}_{\min}$ = 40 bar, $\mathrm{PC}_{\max}$ = 250 bar, $\mathrm{PD}_{\min}$ = 20 bar, $\mathrm{PD}_{\max}$ = 120 bar, $\mathrm{PE}_{\min}$ = 10 bar, $\mathrm{PE}_{\max}$ = 100 bar, $\mathrm{PF}_{\min}$ = 10 bar, and $\mathrm{PF}_{\max}$ = 130 bar. The lower limit of the TS is set to $\mathrm{TS}_{\min}$ = 15 mm/min. Referencing the change ratios of adjustable parameters between adjacent timesteps in the original data set, $\lambda$ is set to 30%.
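The correction trigger described above reduces to a simple threshold check on the predicted deviations. The sketch below is illustrative: the function name `needs_correction` and the sample deviation values are hypothetical, while the 30 mm limit is the one stated in the text.

```python
DEVIATION_LIMIT_MM = 30.0  # 60% of the 50 mm limit in GB 50446-2017


def needs_correction(predicted_deviations, limit=DEVIATION_LIMIT_MM):
    """Return True if any directional deviation exceeds the limit in magnitude."""
    return any(abs(d) > limit for d in predicted_deviations.values())


# Hypothetical predictions in mm for the four deviation components.
within_limit = {"HDSH": 5.0, "HDST": -12.0, "VDSH": 22.0, "VDST": -3.0}
over_limit = {"HDSH": 5.0, "HDST": -12.0, "VDSH": 36.0, "VDST": -3.0}

print(needs_correction(within_limit))  # False: keep tunneling as-is
print(needs_correction(over_limit))    # True: run the optimization step
```

Running the optimizer only when this check fires is what saves computation between corrections.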

4.2.1 K-nearest neighbors embedded in grey wolf optimization control results

To compare the attitude control effects, we take timestep 2000 as an example. First, the KNN model with the highest prediction accuracy is embedded into GWO for real-time attitude correction, with the number of GWO iterations set to 30. Figure 8(a) shows the quantitative settings of adjustable parameters after optimization. Under this set of parameter settings, the optimized values and actual values of the four directional attitude deviations are compared in Fig. 8(b), and the TS changes from the original actual value of 18.75 to 18.59 mm/min. The variation rate of the CS is 8%, while the variation rates of the propulsion pressures for Groups A–F are 30%, 22%, 22%, 19%, 1%, and 30%, respectively. It can be seen that although the shield machine’s TS is largely unaffected after embedding the KNN model into GWO, the attitude correction effect is not ideal: the sum of the absolute values of the four directional attitude deviations is corrected by only 8.5 mm. The attitude correction results at other times are similar to those at timestep 2000, indicating that although KNN has the highest prediction accuracy for attitude deviations, it is not suitable as the embedded model for GWO.

4.2.2 Support vector regression embedded in grey wolf optimization control results

Since the KNN model, despite having the highest prediction accuracy, is not suitable as the embedded model for GWO, a machine learning model with the next-highest prediction accuracy must be chosen instead. The prediction accuracies of the SVR and XGBoost models are similar. Here, the SVR model is first embedded in GWO for real-time attitude correction. Taking time 2000 as an example, Fig. 9(a) shows the optimized settings of the adjustable parameters. Under these settings, the four directional attitude deviations predicted by the SVR model are compared with the actual values in Fig. 9(b). The sum of the absolute values of the four directional attitude deviations is corrected from 56 to 40.496 mm, a reduction of 27.7%. Moreover, the largest deviation, the shield head vertical deviation (VDSH), is reduced from 36 to 27.334 mm, falling below the limit of 30 mm and indicating a significant correction effect. The variation rate of the CS is 0%, while the variation rates of the propulsion pressures for Groups A–F are 9%, 6%, 5%, 13%, 12%, and 30%, respectively. The TS changes from the actual value of 18.75 mm/min to 22.76 mm/min, showing that attitude correction is achieved without affecting the tunneling efficiency of the shield machine.

We input the adjustable parameters from Fig. 9(a) into the KNN model for validation, and the results of the KNN validation are shown in Fig. 9(b). Combining Figs. 8 and 9, it can be observed that even with significant changes in the adjustable parameters, the attitude deviations predicted by the KNN model differ very little; in particular, the predicted HDST and VDST are exactly the same. This lack of responsiveness to changes in adjustable parameters is due to the algorithmic principles of KNN. KNN predicts the value of a new sample by calculating the distances between the input parameters of the known samples in the data set and the corresponding input parameters of the new sample. In this attitude control task, the adjustable parameters make up only a small proportion of all input parameters of the KNN model. As a result, regardless of how the adjustable parameters change, the output of the KNN model largely depends on the non-adjustable input parameters. This explains why the KNN model is not suitable as an embedded model for attitude correction.
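The mechanism can be reproduced on a toy example. The sketch below assumes a uniform-weight KNN regressor written directly in NumPy; the data, the feature layout (one adjustable input, three non-adjustable inputs), and the function name `knn_predict` are made up for illustration and do not come from the paper's data set.

```python
import numpy as np


def knn_predict(X_train, y_train, x_query, k=3):
    """Uniform-weight KNN regression: average the targets of the k nearest samples."""
    distances = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(distances)[:k]
    return float(np.mean(y_train[nearest])), sorted(nearest.tolist())


# Toy normalized data: column 0 is the single adjustable input,
# columns 1-3 are non-adjustable inputs (all values are made up).
X_train = np.array([
    [0.0, 0.1, 0.1, 0.1],
    [0.5, 0.1, 0.1, 0.1],
    [1.0, 0.1, 0.1, 0.1],
    [0.0, 0.9, 0.9, 0.9],
    [0.5, 0.9, 0.9, 0.9],
    [1.0, 0.9, 0.9, 0.9],
])
y_train = np.array([10.0, 12.0, 14.0, 30.0, 32.0, 34.0])  # made-up deviations, mm

non_adjustable = [0.1, 0.1, 0.1]
predictions = []
neighbour_sets = []
for adjustable in (0.0, 0.3, 0.7, 1.0):
    x_query = np.array([adjustable] + non_adjustable)
    pred, neighbours = knn_predict(X_train, y_train, x_query)
    predictions.append(pred)
    neighbour_sets.append(neighbours)
    print(adjustable, neighbours, pred)
```

Because the non-adjustable columns dominate the Euclidean distance, every query selects the same three neighbours and returns the same prediction, however the adjustable input is varied. An optimizer searching over the adjustable input therefore sees a flat objective.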

4.2.3 Extreme gradient boosting embedded in grey wolf optimization control results

The overall approach is the same as in Subsubsection 4.2.2, except that the SVR model is replaced with the XGBoost model. Figure 10(a) shows the optimized settings of the adjustable parameters. Under these settings, the four directional attitude deviations predicted by the XGBoost model are compared with the actual values in Fig. 10(b). The sum of the absolute values of the four directional attitude deviations is corrected from 56 to 30.711 mm, a reduction of 45.1%. Moreover, the largest deviation, the VDSH, is reduced from 36 to 24.307 mm, indicating a better correction effect than that of the SVR model. The variation rate of the CS is 4%, while the variation rates of the propulsion pressures for Groups A–F are 30%, 19%, 30%, 18%, 16%, and 23%, respectively. The TS changes from the actual value of 18.75 mm/min to 17.73 mm/min, which remains above the required minimum of 15 mm/min, showing that attitude correction is achieved without compromising the tunneling efficiency of the shield machine. The validation results of the KNN model once again confirm that the KNN model is not suitable as an embedded model for attitude correction.

Comparative analysis reveals that XGBoost, when embedded in GWO, achieves optimal error correction performance for both overall deviations and maximum deviations, surpassing SVR. Therefore, the XGBoost model is best suited for real-time safety control of shield machine attitude within the GWO framework.
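How a prediction model is embedded in GWO can be illustrated with a compact sketch. The position update follows the standard grey wolf optimizer of Mirjalili et al. [43]. Everything else is an assumption made for illustration: the TS constraint is handled with a static penalty, the bounds are enforced by clipping, and the two toy functions stand in for the trained attitude and TS models. None of these details is confirmed as the paper's implementation.

```python
import numpy as np


def constrained_gwo(objective, constraint, lower, upper,
                    n_wolves=20, n_iter=30, penalty=1e3, seed=0):
    """Minimize objective(x) subject to constraint(x) >= 0 within [lower, upper].

    Constraint violations are handled with a static penalty, which is an
    assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    def fitness(x):
        violation = max(0.0, -constraint(x))
        return objective(x) + penalty * violation

    wolves = rng.uniform(lower, upper, size=(n_wolves, dim))
    best_x, best_f = None, np.inf
    history = []

    for it in range(n_iter):
        scores = np.array([fitness(w) for w in wolves])
        order = np.argsort(scores)
        alpha, beta, delta = (wolves[i].copy() for i in order[:3])
        if scores[order[0]] < best_f:
            best_f, best_x = float(scores[order[0]]), alpha.copy()
        history.append(best_f)

        a = 2.0 * (1.0 - it / n_iter)  # decreases linearly from 2 towards 0
        candidates = []
        for leader in (alpha, beta, delta):
            r1 = rng.random((n_wolves, dim))
            r2 = rng.random((n_wolves, dim))
            A = 2.0 * a * r1 - a
            C = 2.0 * r2
            D = np.abs(C * leader - wolves)
            candidates.append(leader - A * D)
        # New position is the mean of the three leader-guided moves,
        # clipped so every parameter stays inside its bounds.
        wolves = np.clip(sum(candidates) / 3.0, lower, upper)

    # Evaluate the final population once more.
    scores = np.array([fitness(w) for w in wolves])
    i = int(np.argmin(scores))
    if scores[i] < best_f:
        best_f, best_x = float(scores[i]), wolves[i].copy()
    history.append(best_f)
    return best_x, best_f, history


# Hypothetical stand-ins for the trained prediction models.
def total_abs_deviation(x):   # plays the role of the attitude-deviation model
    return float(np.sum(np.abs(x - np.array([3.0, 60.0]))))


def tunneling_speed(x):       # plays the role of the TS model, mm/min
    return 5.0 * x[0] + 0.1 * x[1]


TS_MIN = 15.0
LOWER, UPPER = [2.45, 35.0], [4.2, 65.0]
best_x, best_f, history = constrained_gwo(
    objective=total_abs_deviation,
    constraint=lambda x: tunneling_speed(x) - TS_MIN,
    lower=LOWER, upper=UPPER,
)
print(best_x, best_f)
```

In an actual deployment, `total_abs_deviation` and `tunneling_speed` would be replaced by calls to the trained models, with the non-adjustable inputs held at their measured values.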

4.3 Discussion

4.3.1 Comparison of optimization results for different algorithms

To demonstrate the superiority of GWO in shield attitude optimization, the artificial fish swarm algorithm (AFS) [44], sparrow search algorithm (SSA) [45], and particle swarm optimization (PSO) [46] are introduced for comparison. Figure 11 presents the optimization effectiveness and computation time of these four algorithms. The “average optimization magnitude” in Fig. 11 refers to the mean absolute difference in attitude deviations across the four directions before and after optimization, reflecting each algorithm’s corrective capability. All four algorithms achieved the required TS of 15 mm/min. Under this condition, GWO achieved an average optimization magnitude of 6.806 mm, outperforming AFS, SSA, and PSO by 28.7%, 19.8%, and 40.6%, respectively. In addition, GWO completed the optimization in just 9 s, fully meeting practical engineering requirements while reducing computation time by 35.7%, 18.2%, and 43.8% compared with AFS, SSA, and PSO, respectively. Thus, GWO is the optimal algorithm for attitude deviation optimization in terms of both effectiveness and efficiency.
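As a reading aid, the computation times implied by these percentages can be backed out. This assumes each reduction is measured relative to the competing algorithm's own time; the resulting values are derived from the quoted figures and are not read from Fig. 11.

```latex
\text{reduction} = \frac{T_{\text{other}} - T_{\text{GWO}}}{T_{\text{other}}}
\;\Longrightarrow\;
T_{\text{other}} = \frac{T_{\text{GWO}}}{1 - \text{reduction}}

T_{\text{AFS}} \approx \frac{9}{1 - 0.357} \approx 14.0\ \text{s}, \quad
T_{\text{SSA}} \approx \frac{9}{1 - 0.182} \approx 11.0\ \text{s}, \quad
T_{\text{PSO}} \approx \frac{9}{1 - 0.438} \approx 16.0\ \text{s}
```

Under that assumption the three competing algorithms would take roughly 14 s, 11 s, and 16 s, respectively.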

4.3.2 Attitude control results for multiple moments

Based on the results discussed above, embedding the XGBoost model into GWO with the TS constraint is the optimal method for real-time attitude control. However, this conclusion was drawn from the result at a single moment. In this section, we select 2000 consecutive moments (from moment 9800 to moment 11799) with significant attitude deviations for attitude correction analysis. The optimized four directional attitude deviations and TS are compared with their corresponding actual values in Fig. 12. Since not every moment within these 2000 consecutive moments requires attitude correction, there are missing segments of optimized values in Figs. 12(a)–12(e). Among the optimized moments, the optimized values of the four directional attitude deviations are more convergent than the actual values. Specifically, the average absolute value of the optimized HDSH is 4.291 mm, compared with the actual average absolute value of 6.196 mm, a reduction of 30.7%. The average absolute value of the optimized HDST is 1.289 mm, compared with the actual average absolute value of 7.275 mm, a reduction of 82.3%. The average absolute value of the optimized VDSH is 17.486 mm, compared with the actual average absolute value of 22.296 mm, a reduction of 21.6%. The average absolute value of the optimized VDST is 9.464 mm, compared with the actual average absolute value of 19.158 mm, a reduction of 50.6%. While the attitude deviations converge, the optimized TS values do not differ significantly from the actual values. This indicates that the proposed method for real-time automatic control of shield machine attitude is generalizable and can correct attitudes while ensuring tunneling efficiency.
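The quoted reductions follow directly from the averages. The short check below recomputes them, assuming each reduction is defined as (actual − optimized) / actual; the variable and function names are illustrative only.

```python
# Average absolute deviations (mm) quoted in the text: (actual, optimized).
averages = {
    "HDSH": (6.196, 4.291),
    "HDST": (7.275, 1.289),
    "VDSH": (22.296, 17.486),
    "VDST": (19.158, 9.464),
}


def relative_reduction(actual, optimized):
    """Percentage reduction of the optimized value relative to the actual value."""
    return 100.0 * (actual - optimized) / actual


reductions = {name: round(relative_reduction(a, o), 1)
              for name, (a, o) in averages.items()}
print(reductions)
```

The recomputed values match the percentages reported for the four directions.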

4.3.3 Engineering application process

The application pathway for implementing these research findings in practical engineering is depicted in Fig. 13. After acquiring sufficient preliminary field measurement data, an XGBoost-based model for predicting shield attitude deviations and TS is established. Upon receiving real-time engineering data at time $t$, this model forecasts attitude deviations and TS for $t + 1$. If the predicted attitude deviations fail to meet on-site requirements, the GWO is deployed to derive optimized adjustable tunneling parameters for $t + 1$. Should the predicted deviations satisfy specifications, optimization may be omitted. At $t + 1$, the system initiates a new prediction-optimization cycle using freshly acquired field data. Since attitude tolerance thresholds dynamically evolve with geological variations, GWO objectives can be customized during optimization, for instance by deliberately converging vertical deviation toward nonzero targets in upper-hard-lower-soft formations to prevent pipe flotation. Continuously integrating new field data into the database and periodically updating models ensures sustained prediction accuracy throughout the tunneling process. This framework enables adaptive near-real-time control while accommodating complex geological constraints through context-aware optimization targeting.
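One cycle of this prediction-optimization loop can be sketched as follows. The sketch assumes the 30 mm threshold used earlier in the paper; `control_step`, `stub_predict`, and `stub_optimize` are hypothetical names, and the stubs merely stand in for the XGBoost predictor and the GWO search.

```python
from typing import Callable, Dict, Sequence

Params = Dict[str, float]


def control_step(
    field_data: Params,
    predict: Callable[[Params], Sequence[float]],
    optimize: Callable[[Params], Params],
    limit_mm: float = 30.0,
) -> Params:
    """One prediction-optimization cycle at time t.

    Returns the parameters to apply at t + 1: the optimized settings if any
    predicted deviation exceeds the limit, otherwise the current settings.
    """
    predicted = predict(field_data)
    if any(abs(d) > limit_mm for d in predicted):
        return optimize(field_data)
    return dict(field_data)


# Hypothetical stubs standing in for the trained predictor and the optimizer.
def stub_predict(data):
    return [data["PA"] - 20.0, 5.0, -3.0, 1.0]  # fake deviations, mm


def stub_optimize(data):
    return {**data, "PA": data["PA"] * 0.7}     # fake correction


kept = control_step({"PA": 40.0}, stub_predict, stub_optimize)
corrected = control_step({"PA": 60.0}, stub_predict, stub_optimize)
print(kept, corrected)
```

Passing the predictor and optimizer as callables keeps the loop unchanged when the models are retrained or when the optimization objective is customized for a particular formation.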

5 Innovations and limitations

The key innovations of this paper are twofold: 1) incorporating TS as a constraint during attitude optimization to ensure tunneling efficiency, and 2) dynamically setting optimization boundaries based on real-time fluctuation ranges and allowable value limits of adjustable parameters to enhance operational safety.

A key limitation lies in the data set, which was collected exclusively during tunneling through homogeneous geological formations. This prevents direct incorporation of geological factors into the machine learning model inputs and may reduce prediction accuracy in heterogeneous strata, particularly when traversing composite formations [47]. Significant variations in geological properties could trigger severe attitude deviations due to suboptimal tunneling parameters, substantially complicating attitude correction efforts.

6 Conclusions

This paper, based on the Shenzhen-Guangzhou Intercity Railway Tunnel Project, utilized Bayesian optimization with various machine learning models to achieve accurate predictions of shield machine attitude deviations and TS. By integrating these predictions with a constrained GWO, a real-time automatic control method for shield machine attitude was established. The specific conclusions are as follows.

1) Utilizing Bayesian algorithms for hyperparameter optimization of machine learning models achieved high-precision prediction models for shield attitude deviations and TS. However, embedding the KNN model, which yielded the highest prediction accuracy, into the GWO for attitude correction produced suboptimal results because of the predictive mechanism of the KNN model. Both the SVR and XGBoost models, when combined with the GWO, enabled real-time safety control of shield attitude. Notably, the XGBoost model demonstrated superior attitude correction performance, thus establishing it as the optimal model for attitude correction.

2) The proposed real-time safety control method considers tunneling efficiency during attitude correction and dynamically constrains adjustable parameters within ±30% fluctuation ranges of their current values, rather than within fixed thresholds, which ensures operational safety. This broadly applicable approach achieved attitude correction at each deviation-alert moment examined without compromising tunneling efficiency.

3) The GWO algorithm achieves an average optimization magnitude of 6.806 mm with a computation time of only 9 s, significantly outperforming AFS, SSA, and PSO. This establishes GWO as the optimal algorithm for attitude deviation optimization. During field deployment, optimization objectives can be customized to meet dynamic engineering requirements, while model accuracy is enhanced through periodic updates.

4) Future work will focus on collecting multi-strata tunneling data to quantify physical constraints during machine-soil interactions. These mechanical principles will be fused into neural networks via residual loss terms, developing physics-informed neural networks to enhance prediction accuracy and generalize across composite formations.

References


[1] Wei G, Feng F, Huang S, Xu T, Zhu J, Wang X, Zhu C. Full-scale loading test for shield tunnel segments: Load-bearing performance and failure patterns of lining structures. Underground Space, 2025, 20: 197–217

[2] Liu W, Ding L. Global sensitivity analysis of influential parameters for excavation stability of metro tunnel. Automation in Construction, 2020, 113: 103080

[3] Mo H, Chen J. Study on inner force and dislocation of segments caused by shield machine attitude. Tunnelling and Underground Space Technology, 2008, 23(3): 281–291

[4] Chen D, Feng X, Xu D, Jiang Q, Yang C, Yao P. Use of an improved ANN model to predict collapse depth of thin and extremely thin layered rock strata during tunnelling. Tunnelling and Underground Space Technology, 2016, 51: 372–386

[5] Sugimoto M, Sramoon A, Asce M. Theoretical model of shield behavior during excavation. I: Theory. Journal of Geotechnical and Geoenvironmental Engineering, 2002, 128(2): 138–155

[6] Zhang M, Zhang X, Wu H, Javadi A A, Dai Z. Theoretical study of mechanical behavior of tunnels considering soil-machine interactions. KSCE Journal of Civil Engineering, 2024, 28(10): 4473–4486

[7] Shen X, Yuan D, Jin D. Influence of shield attitude change on shield-soil interaction. Applied Sciences, 2019, 9(9): 1812

[8] Yue M, Sun W, Hu P. Dynamic coordinated control of attitude correction for the shield tunneling based on load observer. Automation in Construction, 2012, 24: 24–29

[9] Xie H, Duan X, Yang H, Liu Z. Automatic trajectory tracking control of shield tunneling machine under complex stratum working condition. Tunnelling and Underground Space Technology, 2012, 32: 87–97

[10] Huayong Y, Hu S, Guofang G, Guoliang H. Electro-hydraulic proportional control of thrust system for shield tunneling machine. Automation in Construction, 2009, 18(7): 950–956

[11] Sun W, Yue M, Wei J. Relationship between rectification moment and angle of shield based on numerical simulation. Journal of Central South University, 2012, 19(2): 517–521

[12] Chen L, Tian Z, Zhou S, Gong Q, Di H. Attitude deviation prediction of shield tunneling machine using Time-Aware LSTM networks. Transportation Geotechnics, 2024, 45: 101195

[13] Zhang N, Zhang N, Zheng Q, Xu Y. Real-time prediction of shield moving trajectory during tunnelling using GRU deep neural network. Acta Geotechnica, 2022, 17(4): 1167–1182

[14] Fu Y, Chen L, Xiong H, Chen X, Lu A, Zeng Y, Wang B. Data-driven real-time prediction for attitude and position of super-large diameter shield using a hybrid deep learning approach. Underground Space, 2024, 15: 275–297

[15] Xiao H, Chen Z, Cao R, Cao Y, Zhao L, Zhao Y. Prediction of shield machine posture using the GRU algorithm with adaptive boosting: A case study of Chengdu Subway project. Transportation Geotechnics, 2022, 37: 100837

[16] Zhou C, Xu H, Ding L, Wei L, Zhou Y. Dynamic prediction for attitude and position in shield tunneling: A deep learning method. Automation in Construction, 2019, 105: 102840

[17] Dai L, Chen W, Xiao M, Sun W, Wang Z. Prediction of super-large diameter shield attitude based on LSTM-Transformer. Scientific Reports, 2025, 15(1): 15725

[18] Wang L, Pan Q, Wang S. Data-driven predictions of shield attitudes using Bayesian machine learning. Computers and Geotechnics, 2024, 166: 106002

[19] Chen H, Li X, Feng Z, Wang L, Qin Y, Skibniewski M J, Chen Z, Liu Y. Shield attitude prediction based on Bayesian-LGBM machine learning. Information Sciences, 2023, 632: 105–129

[20] Li T, Liu J, Wu X, Su F, Liu Y. Dynamic prediction and control of a tunnel boring machine with a particle swarm optimization-random forest algorithm and an integrated digital twin. Applied Soft Computing, 2025, 178: 113294

[21] Xiao H, Xing B, Wang Y, Yu P, Liu L, Cao R. Prediction of shield machine attitude based on various artificial intelligence technologies. Applied Sciences, 2021, 11(21): 10264

[22] Hu M, Zhang H, Wu B, Li G, Zhou L. Interpretable predictive model for shield attitude control performance based on XGboost and SHAP. Scientific Reports, 2022, 12(1): 18226

[23] Xiao H, Cao R, Feng S. Intelligent attitude control method for shield tunneling machines considering a rectifying mechanism: A case study of the Chengdu subway. International Journal of Geomechanics, 2024, 24(8): 05024006

[24] Wang P, Kong X, Guo Z, Hu L. Prediction of axis attitude deviation and deviation correction method based on data driven during shield tunneling. IEEE Access, 2019, 7: 163487–163501

[25] Xu J, Bu J, Qin N, Huang D. SCA-MADRL: Multiagent deep reinforcement learning framework based on state classification and assignment for intelligent shield attitude control. Expert Systems with Applications, 2024, 235: 121258

[26] Huang H, Chang J, Zhang D, Zhang J, Wu H, Li G. Machine learning-based automatic control of tunneling posture of shield machine. Journal of Rock Mechanics and Geotechnical Engineering, 2022, 14(4): 1153–1164

[27] Zhang J, Hu J, Zong C, Feng T, Xu T. A tunneling speed enhancement method for super-large-diameter shield machines considering strata heterogeneity. Tunnelling and Underground Space Technology, 2025, 159: 106496

[28] Pan Y, Wang Z, Sun L, Chen J J. Dynamic prediction and multi-objective optimization on driving position of tunnel boring machine (TBM): an automated deep learning approach. Acta Geotechnica, 2024, 19(8): 5611–5636

[29] Zhang L, Li Y, Wang L, Wang J, Luo H. Physics-data driven multi-objective optimization for parallel control of TBM attitude. Advanced Engineering Informatics, 2025, 65: 103101

[30] Yu H, Qin C, Tao J, Liu C, Liu Q. A multi-channel decoupled deep neural network for tunnel boring machine torque and thrust prediction. Tunnelling and Underground Space Technology, 2023, 133: 104949

[31] Shen X, Yuan D, Lin X, Chen X, Peng Y. Evaluation and prediction of earth pressure balance shield performance in complex rock strata: A case study in Dalian, China. Journal of Rock Mechanics and Geotechnical Engineering, 2023, 15(6): 1491–1505

[32] Faramarzi L, Kheradmandian A, Azhari A. Evaluation and optimization of the effective parameters on the shield tbm performance: torque and thrust-using discrete element method (DEM). Geotechnical and Geological Engineering, 2020, 38(3): 2745–2759

[33] Zhang X, Zhang X, Liu Q, Xie W, Tang S, Wang Z. TBM big data preprocessing method in machine learning and its application to tunneling. Journal of Rock Mechanics and Geotechnical Engineering, 2024, 17(8): 4762–4783

[34] Khan A Q, Muhammad S G, Raza A, Chaimahawan P, Pimanmas A. Advanced machine learning techniques for predicting compressive strength of ultra-high performance concrete. Frontiers of Structural and Civil Engineering, 2025, 19(4): 503–523

[35] Sandamal K, Shashiprabha S, Muttil N, Rathnayake U. Pavement roughness prediction using explainable and supervised machine learning technique for long-term performance. Sustainability, 2023, 15(12): 9617

[36] Wang W, Feng H, Li Y, You Q, Zhou X. Research on prediction of EPB shield tunneling parameters based on LGBM. Buildings, 2024, 14(3): 820

[37] Shilton A, Lai D T H, Palaniswami M. A division algebraic framework for multidimensional support vector regression. IEEE Transactions on Systems, Man, and Cybernetics. Part B, Cybernetics, 2010, 40(2): 517–528

[38] Hayter A J, Liu W, Ah-Kine P. A ray method of confidence band construction for multiple linear regression models. Journal of Statistical Planning and Inference, 2009, 139(2): 329–334

[39] Sebbeh-Newton S, Ayawah P E A, Azure J W A, Kaba A G A, Ahmad F, Zainol Z, Zabidi H. Towards TBM automation: On-the-fly characterization and classification of ground conditions ahead of a TBM using data-driven approach. Applied Sciences, 2021, 11(3): 1060

[40] Xiao H, Yang W, Hu J, Zhang Y, Jing L, Chen Z. Significance and methodology: Preprocessing the big data for machine learning on TBM performance. Underground Space, 2022, 7(4): 680–701

[41] Li J, Chen Z, Li X, Jing L, Zhang Y, Xiao H, Wang S, Yang W, Wu L, Li P, et al. Feedback on a shared big dataset for intelligent TBM Part I: Feature extraction and machine learning methods. Underground Space, 2023, 11: 1–25

[42] Cho H, Han S, Heo I, Kang H, Kang W, Kim K. Heating temperature prediction of concrete structure damaged by fire using a Bayesian approach. Sustainability, 2020, 12(10): 4225

[43] Mirjalili S, Mirjalili S M, Lewis A. Grey wolf optimizer. Advances in Engineering Software, 2014, 69: 46–61

[44] Du T, Hu Y, Ke X. Improved quantum artificial fish algorithm application to distributed network considering distributed generation. Computational Intelligence and Neuroscience, 2015, 2015: 1–13

[45] Li E, Zhang N, Xi B, Zhou J, Gao X. Compressive strength prediction and optimization design of sustainable concrete based on squirrel search algorithm-extreme gradient boosting technique. Frontiers of Structural and Civil Engineering, 2023, 17(9): 1310–1325

[46] Ding Z. Research of improved particle swarm optimization algorithm. In: Proceedings of AIP Conference. Melville, NY: AIP Publishing LLC, 2017, 1839(1): 020148

[47] Zhang J, Lu S D, Feng T G, Yi B B, Liu J T. Research on reuse of silty fine sand in backfill grouting material and optimization of backfill grouting material proportions. Tunnelling and Underground Space Technology, 2022, 130: 104751

RIGHTS & PERMISSIONS

Higher Education Press
