Physics-informed online deep learning for advanced control of shield tail clearance in tunnel construction

Lulu WANG; Penghui LIN; Yongsheng LI; Hui LUO; Limao ZHANG

doi:10.1007/s42524-025-4148-5

Front. Eng ›› 2025, Vol. 12 ›› Issue (4) :828 -853. DOI: 10.1007/s42524-025-4148-5

Construction Engineering and Intelligent Construction

RESEARCH ARTICLE

Abstract

To more accurately estimate and control the magnitude of the shield tail clearance, a hybrid deep learning model with the integration of an online physics-informed deep neural network (online PDNN) and non-dominated sorting genetic algorithm-II (NSGA-II) is developed. The online PDNN has evolved from a deep learning framework constrained by the underlying physical mechanism of shield tail clearance measurements. The algorithm is used to forecast the shield tail clearance in tunnel boring machines (TBMs). The NSGA-II is employed to conduct the multi-objective optimization (MOO) process for shield tail clearance. The proposed method is validated in a tunnel case in China. Experimental results reveal that: (1) In comparison with some state-of-the-art algorithms, the online PDNN model demonstrates superior capability in predicting shield tail clearance above, upper-left, and upper-right, with R² scores of 0.93, 0.90, and 0.90, respectively; (2) The MOO achieves a comprehensive optimal solution, with the overall improvement percentage of shield tail clearance reaching 30.87% and a hypervolume of 32 under the 20% constraint condition, which surpasses the average performance of other MOO frameworks by 23 and 5.48%, respectively. The novelty of this research lies in coupling the constructed physical constraints and the online update mechanism into a causal analysis-oriented data-driven model, which not only enhances the model’s performance and interpretability but also realizes the control for the shield tail clearance by the integration of NSGA-II.

Keywords

physics-informed neural network / online update mechanism / multi-objective optimization / NSGA-II / shield tail clearance

Cite this article

Lulu WANG, Penghui LIN, Yongsheng LI, Hui LUO, Limao ZHANG. Physics-informed online deep learning for advanced control of shield tail clearance in tunnel construction. Front. Eng, 2025, 12(4): 828-853 DOI:10.1007/s42524-025-4148-5

1 Introduction

Rapid urbanization and population growth are making traffic congestion increasingly serious. To alleviate this situation, large-scale urban rail transit construction has been accelerating. A tunnel boring machine (TBM), as a piece of advanced equipment for tunneling, has been widely used in metro tunnel excavation. Shield tail clearance, referring to the clearance between the inner edge of the shield shell and the outer edge of the lining, plays a crucial role in the rational selection of the lining segment, the attitude correction of the shield machine, and the control of tunneling orientation (Liu et al., 2021). However, shield tail clearance estimation and control face various challenges under different tunneling conditions. In soft-hard uneven composite strata, the unbalanced distribution of thrust on the tunnel face can cause shield attitude deviations and over-excavation, directly affecting the clearance variation (Feng et al., 2022; Tang et al., 2022). Furthermore, silty fine sand and sandy clay interlayers tend to accumulate around the shield, forming a hard-plastic shell as water drains. This increases friction resistance, leading to uneven shield tail deformation and impacting clearance size (Han et al., 2021). These issues are particularly pronounced in shield TBMs, which have a larger contact area with the surrounding rock compared to gripper TBMs (Xu et al., 2021). Practice has shown that shield tail clearances that are either too narrow or too large can compromise construction safety (Liu and Bao, 2021; Lou et al., 2023). Therefore, it is necessary to study how to precisely measure and control the shield tail clearance to ensure the high efficiency and reliability of shield tunnel construction.

The main methods for shield tail clearance measurement include theoretical analysis, manual measurement, contact measurement, and non-contact measurement. The theory of minimum shield tail clearance calculation (Yu, 2022) is one of the effective methods for estimating shield tail clearance. However, due to its simplification of engineering reality and inability to pinpoint the location of the minimal shield tail clearance, this method is not widely used in engineering applications. At present, the most common way to evaluate shield tail clearance is to adopt a manual measurement technique that is user-friendly and yields intuitive results. Nevertheless, manual measurement has many drawbacks, such as susceptibility to human error, low levels of automation, poor real-time performance, and safety hazards (He et al., 2021). The contact measurement method realizes the measurement of shield tail clearance by installing a mechanical device at the shield tail. However, the complexity of the shield construction environment makes it inconvenient to install and replace the measuring equipment, thereby increasing the likelihood of damage to the tube segment (Yang, 2021). Alongside these three approaches, the non-contact measurement method is also widely used in clearance measurement.

Non-contact measurement mainly utilizes ultrasonic, laser, vision, and imaging technologies to measure shield tail clearance. Specifically, the ultrasonic-based measurement method utilizes ultrasonic sensors embedded inside the shield shell to send and receive ultrasonic signals uniformly distributed on the inner wall of the shield shell to accomplish the clearance measurement. However, the effectiveness of this method’s application is influenced by the thickness of the shield shell, the cleanliness of the environment, and the performance of the ultrasonic rangefinder (VMT, 2021). The laser-based measurement technology uses a laser range finder to scan the inner wall of the shield shell and the outer surface of the lining to locate their edges. Subsequently, shield tail clearance can be calculated indirectly by combining the scanning angles. Yet, its main drawback is that the accumulation of errors could affect the accuracy and stability of data (Sun and Zhuang, 2016). Currently, shield tail clearance measurement based on machine vision is a research direction that has attracted much attention. This method focuses on utilizing digital image processing techniques to calibrate the captured shield tail clearance images, extract the edges of the shield shell inwall and segment outwall, and calculate the clearance by considering the spatial structure of the shield tail (Chen et al., 2021). Although machine vision-based (MVB) systems perform well in laboratory tests, they are prone to interference from stray light, dust, and muck in the underground working environment. This vulnerability leads to unsatisfactory measurement accuracy in practical applications (Chen et al., 2020).

With the development of information technology, the vast amount of data collected by sensors has facilitated the application of machine learning (ML) and deep learning (DL) technologies in the field of civil engineering. At present, these methods are mainly applied in posture recognition (Chen et al., 2025), construction management (Zhou et al., 2018), and risk analysis (Zhang et al., 2024b), but they are less applied in shield tail clearance measurement. Furthermore, similar to the trajectory planning problem proposed by Chai et al. (2020), the shield tail clearance control task can also be regarded as a multi-objective optimization (MOO) problem. The shield tail clearances in different orientations can be viewed as different optimization objectives, which can then be minimized by appropriately adjusting the TBM’s key operating parameters. This approach yields a reasonable adjustment range for these parameters to keep the clearance within specified limits, thereby ensuring the smooth progress of the excavation process. The non-dominated sorting genetic algorithm-II (NSGA-II) is one of the most popular methods for solving MOO problems (Verma et al., 2021). For example, Chai et al. (2017) efficiently solved the multi-objective trajectory optimization problem for spacecraft using an improved NSGA-II method. Zhang and Lin (2021) implemented a hybrid approach that integrates extreme gradient boosting (XGBoost) and NSGA-II to mitigate tunneling-induced limit support pressure and ground surface deformation. However, to the best of our knowledge, no researchers have attempted to apply NSGA-II to solve the control problems related to shield tail clearance.

Overall, previous studies have significantly contributed to the advancement of shield tail clearance measurement techniques. However, due to the time-varying nature of shield tail clearance, existing methods still have some deficiencies, such as their inability to provide valuable evidence for controlling shield tail clearance. To address these challenges, this study develops an online physics-informed DL model to support the automatic measurement and control of shield tail clearance. There are three key research questions to be addressed: (1) how to construct the physical equations for calculating shield tail clearance at arbitrary positions and utilize them as physical constraints for training deep neural network (DNN) models; (2) how to facilitate the online training process of physics-informed DL models to realize dynamic prediction of shield tail clearance; (3) how to implement the MOO of shield tail clearance to provide a reference for on-site operators to control the shield tail clearance. The innovations of this research can be summarized in two key aspects: (a) integrating the derived physical equations for clearance measurement and the online updating mechanism into a DNN framework to generate the online physics-informed DNN (online PDNN) that can accurately predict the shield tail clearance, and (b) incorporating the developed online PDNN model with the NSGA-II algorithm to realize intelligent control of the shield tail clearance at arbitrary positions.

The remaining parts of this paper are organized as follows. Section 2 reviews existing research on the application of ML, physics-informed neural networks (PINNs), and online learning methods to tunnel problems. Section 3 introduces the details of our method, including the physical mechanism of shield tail clearance measurement, the framework of the online PDNN, and the MOO for shield tail clearance control. Section 4 applies the developed online PDNN approach to the actual construction project of Wuhan Metro Line 12, Guobo Center South Station–Lingwucun Station, aiming to verify its feasibility and effectiveness. Section 5 discusses the effect of using different algorithms in estimating and controlling shield tail clearance to highlight the advantages of the proposed approach. Section 6 summarizes the findings and presents future work.

2 Related studies

With the progress of modern information technology, the tunnel construction industry is undergoing an important stage of digital transformation, in which an emerging interest is to apply ML technology to improve the automation level of shield tunneling. For one thing, the growing abundance of monitoring data related to TBMs can create a big-data environment suitable for training ML models (Yin et al., 2022). For another, ML has powerful learning capabilities to construct potentially hidden relationships among multidimensional data, laying a critical foundation for the autonomous operation of control systems (Chai et al., 2024). For example, Afradi et al. (2020) accurately predicted the TBM penetration rate using a support vector machine (SVM). Mahmoodzadeh et al. (2021) found that Gaussian process regression (GPR) is more suitable for predicting the life of TBM disc cutters than support vector regression (SVR), decision trees (DT), and K-nearest neighbors (KNN) algorithms. Although these methods have been proven effective, the shallow learning mechanisms inherent in ML often lead to insufficient accuracy when capturing more complex nonlinear relationships between variables.

To break through the inherent limitations of ML, DL has been developed as an important branch of ML. The distinguishing advantage of DL is its ability to automatically carry out feature engineering, learn the implicit relationships between input features and output targets, and provide accurate prediction results. For example, Feng et al. (2021) predicted the field penetration index of three rock types with an average relative error of less than 0.15 by deploying a deep belief network (DBN). Zhang et al. (2024a) developed a hybrid DL model that integrates graph convolutional network (GCN) and long short-term memory (LSTM) networks, which is capable of predicting TBM penetration rate, over-excavation rate, energy consumption, and tool wear with a coefficient of determination (R²) score of no less than 0.83. In summary, the results of these existing studies confirm the effectiveness of data-driven approaches in addressing tunneling problems. However, the application of ML/DL strategies in tunneling projects still faces several challenges (Phoon and Zhang, 2023): (1) Shield machine tunneling reflects interactions between the ground and the machine, constrained by specific physical mechanisms, but purely data-driven ML/DL models often fail to offer interpretable insights into these mechanisms. (2) One of the prerequisites for ML/DL methods to describe complex interactions is the need for a large amount of training data, which is often very limited for many engineering projects.

To address these fundamental challenges, the PINN was proposed in 2019 (Raissi et al., 2019). Since its inception, the PINN has gained increasing attention in the tunneling engineering domain. For example, Zhang et al. (2023) and Liu et al. (2023) applied PINN models to estimate ground deformation caused by tunnel excavation. Shaban et al. (2023) used a PDNN to model the diffusion of chloride ions and predict the distribution of chloride concentrations in concrete. Lin et al. (2023) used a physics-informed DL approach to reliably predict the soil chamber pressure of TBM with an R² score of 0.96. Although most conventional PINNs yield acceptable results, they require the entire training data set to be available before the learning task begins, which is detrimental to the shield tail clearance measurement task. As a possible solution, an online learning scheme can be adopted to address this limitation, which can potentially be more practical and robust in practice compared to offline PINN.

Online learning methods can adaptively update the model with the incoming stream of observations, enabling the model to adapt more efficiently to the dynamic excavation process. Recently, some research on online ML/DL modeling in the field of tunnel engineering has been reported. Pan et al. (2022) developed an online version of the att-GCN model to predict the penetration and energy consumption of the shield machine. The results indicated that the online learning mechanism can improve the prediction accuracy of the model. Zhang et al. (2020a) integrated a reinforcement learning (RL) based optimizer with the extreme learning machine (ELM) model to enhance the forecasting of tunneling-induced ground response. The findings demonstrate that the optimized model achieves improved accuracy while reducing computational costs. The key to developing online ML/DL models is the determination of hyperparameters. The common methods for determining hyperparameters include grid search, random search, and Bayesian Optimization (BO). Compared with the other two methods, the BO selects parameter configurations for the next iteration by considering the results of previous iterations, which confers a notable edge in tackling complex global optimization problems (Shin et al., 2020). Consequently, the BO algorithm is used in this study to automatically search for optimal hyperparameter combinations for online PDNN models.

To address existing challenges and fill research gaps, herein, a hybrid approach with the integration of online PDNN and NSGA-II is proposed to conduct the prediction and optimization of shield tail clearance. As expected, the developed online PDNN model will be more practical and reliable. To be more specific, the established physical equations are incorporated into the loss function of the DL model, which contributes to reducing over-reliance on training data and improving the prediction performance and interpretability of the model (Rao et al., 2020). The developed physics-informed DL model is trained online to accurately capture and represent the dynamic changes in shield tail clearance during tunnel excavation. Combining online PDNN with NSGA-II for performing MOO, the optimal operational decision obtained could serve as a reference for operators to control shield tail clearance, ensuring the safety and efficiency of the excavation process.

3 Methodology

In this paper, a physics-informed online DL network is developed to estimate and control the shield tail clearance during tunnel excavation. In this section, the physical laws relevant to computing shield tail clearance are introduced first. The structure of the PDNN algorithm after integrating physical laws and the online update mechanism is then illustrated. Methods for evaluating and explaining the model are provided. At the end of this section, the MOO method for shield tail clearance control is described. An overall framework illustrating the proposed approach is shown in Fig. 1.

3.1 Physical mechanism in shield tail clearance calculation

Currently, the minimum shield tail clearance calculation theory serves as an important tool for analyzing shield tail clearance. This method primarily estimates the minimum clearance value based on the geometric relationship between the clearance and factors such as the radius of the tunnel curve, the pitch angle, and the length of the shield tail covering segments (Yu, 2022). By subsequently comparing the results with existing minimum shield tail clearance standards, it assesses whether the current clearance is too small. However, this method requires construction personnel to accurately identify the location of the minimum shield tail clearance, which can be challenging. In practice, field workers typically monitor shield tail clearance at eight specific locations: above, below, left, right, upper left, lower left, upper right, and lower right. Under these conditions, the minimum shield tail clearance theory becomes less applicable. Additionally, another limitation of this method is its inability to provide effective guidance for controlling shield tail clearance. To enhance its applicability in engineering scenarios, an improved method for calculating the shield tail clearance at different positions is proposed.

Assuming the assembly of the lining rings is in an ideal state, there are six models for the calculation of the minimum shield tail clearance (as shown in Figs. 2(a)–2(f)). In the first model, the TBM excavates along a straight line, and the central axes of the TBM, the lining ring, and the excavation route coincide (as exhibited in Fig. 3). The size of the shield tail clearance Y at different orientations is equal to the initial clearance δ₀ mm (as presented in Fig. 2(a)). In the second model, the variation of shield tail clearance generated by the shield pitch angle α (°) is δ₁ mm, where α > 0 when the TBM is tilted upward and α < 0 otherwise. Due to the influence of the shield pitch angle, the shield tail clearance values at opposite positions are not equal: the larger shield tail clearance is δ_max mm and the smaller is δ_min mm (as depicted in Fig. 2(b)). In the third model, since the TBM tunnels along a route with a deflection angle η (°) and a tunnel curve radius $R'$ m, the shield tail clearance at opposite positions differs by δ₂ mm (as displayed in Fig. 2(c)). In the fourth to sixth cases, the effects of the pitch angle and the radius of the tunnel curve on the shield tail clearance are considered at the same time (as shown in Figs. 2(d)–2(f)). Among them, Fig. 2(d) gives the case when δ₁ < δ₂, where the minimum shield tail clearance appears directly above, and the maximum shield tail clearance occurs directly below. When δ₁ > δ₂, the situation is the opposite, as indicated in Fig. 2(e): the minimum shield tail clearance appears directly below, and the maximum shield tail clearance appears directly above. From the geometries of Fig. 2, δ₁, δ₂, δ_min, and δ_max can be estimated by Eqs. (1)–(4), respectively.

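$$\delta_1 = l \tan\alpha, \tag{1}$$

$$\delta_2 = \left(R' - \frac{D}{2}\right)\left(1 - \sqrt{1 - \left(\frac{l}{R' - D/2}\right)^2}\right), \tag{2}$$

$$\delta_{min} = \begin{cases}
\delta_0 - \frac{1}{2}\delta_1 & (\alpha \ne 0,\ \eta = 0) \\
\delta_0 - \frac{1}{2}\delta_2 & (\alpha = 0,\ \eta \ne 0) \\
\delta_0 + \frac{1}{2}\delta_1 - \frac{1}{2}\delta_2 & (\alpha > 0,\ \eta \ne 0,\ \text{and } \delta_1 < \delta_2) \\
\delta_0 - \frac{1}{2}\delta_1 + \frac{1}{2}\delta_2 & (\alpha > 0,\ \eta \ne 0,\ \text{and } \delta_1 > \delta_2) \\
\delta_0 - \frac{1}{2}\delta_1 - \frac{1}{2}\delta_2 & (\alpha < 0,\ \eta \ne 0),
\end{cases} \tag{3}$$

$$\delta_{max} = 2\delta_0 - \delta_{min}, \tag{4}$$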

where l refers to the length of the segment covered by the shield tail, which is assumed to be equal to the width of the lining ring in this study; D represents the outer diameter of the lining ring.
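As a rough illustration of how Eqs. (1)–(4) might be evaluated, the following Python sketch computes δ₁, δ₂, δ_min, and δ_max for one operating state. The function name `clearance_extremes` and its argument names are hypothetical, and three assumptions are made that the text does not state: all lengths are passed in the same unit (mm), the magnitude of δ₁ is used so that the α < 0 branch also reduces the clearance, and the untreated tie δ₁ = δ₂ is grouped with the δ₁ < δ₂ branch.

```python
import math

def clearance_extremes(delta0, l, alpha_deg, eta_deg, R_curve, D):
    """Return (delta1, delta2, delta_min, delta_max) following Eqs. (1)-(4).

    All lengths (delta0, l, R_curve, D) must share one unit, e.g. mm (assumption).
    """
    alpha = math.radians(alpha_deg)
    # Eq. (1): clearance variation caused by the pitch angle.
    # Assumption: the magnitude is used, so delta1 >= 0 for either tilt direction.
    delta1 = l * abs(math.tan(alpha))
    # Eq. (2): clearance variation caused by the tunnel curve (zero on a straight drive).
    if eta_deg == 0:
        delta2 = 0.0
    else:
        r = R_curve - D / 2
        delta2 = r * (1 - math.sqrt(1 - (l / r) ** 2))
    # Eq. (3): pick the branch that matches the current pitch/deflection state.
    if eta_deg == 0:
        delta_min = delta0 - 0.5 * delta1
    elif alpha_deg == 0:
        delta_min = delta0 - 0.5 * delta2
    elif alpha_deg > 0 and delta1 <= delta2:  # tie handling is an assumption
        delta_min = delta0 + 0.5 * delta1 - 0.5 * delta2
    elif alpha_deg > 0:
        delta_min = delta0 - 0.5 * delta1 + 0.5 * delta2
    else:
        delta_min = delta0 - 0.5 * delta1 - 0.5 * delta2
    # Eq. (4): the two opposite clearances always sum to 2 * delta0.
    delta_max = 2 * delta0 - delta_min
    return delta1, delta2, delta_min, delta_max

# Illustrative values only (not taken from the case study).
d1, d2, dmin, dmax = clearance_extremes(30.0, 1500.0, 0.5, 1.0, 400000.0, 6200.0)
```

With these illustrative inputs the pitch-induced variation exceeds the curve-induced one, so the fourth branch of Eq. (3) applies.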

Taking the subgraph Fig. 2(f) as an example, the basic principle of shield tail clearance calculation is shown in Fig. 4. The three concentric circles centered on O represent the lining ring with radius $D/2$ and the two auxiliary circles with radii $(D/2 + \delta_{min})$ and $(D/2 + \delta_{max})$, respectively. The ellipse centered at O' is the 1-1 section of the shield shell. F₁ and F₂ are the two foci of the ellipse. The long and short axes of the ellipse are $2R/\cos\alpha$ and $2R$, respectively, where $R$ is the inner radius of the shield shell. According to the definition of the ellipse and the cosine theorem, we can obtain Eqs. (5)–(8), which implicitly represent the geometric relationships between the shield tail clearance, pitch angle, radius of the tunnel curve, and horizontal angle ($\beta$).

$$|EF_1| + |EF_2| = \frac{2R}{\cos\alpha}, \tag{5}$$

$$|EF_1|^2 = \left(|OF_1| + |OF_2|\right)^2 + |EF_2|^2 - 2\left(|OF_1| + |OF_2|\right)|EF_2|\cos\varphi, \tag{6}$$

$$|EF_1|^2 = |OF_1|^2 + \left(\frac{D}{2} + Y\right)^2 - 2\,|OF_1|\left(\frac{D}{2} + Y\right)\sin\beta, \tag{7}$$

$$\left(\frac{D}{2} + Y\right)^2 = |OF_2|^2 + |EF_2|^2 - 2\,|OF_2|\,|EF_2|\cos\varphi, \tag{8}$$

where $|EF_1|$ and $|EF_2|$ are the distances from the measuring point E to the focal points F₁ and F₂, respectively, and φ is the angle between $EF_2$ and $F_1F_2$.

After eliminating the three unknown variables $|EF_1|$, $|EF_2|$, and $\varphi$ by combining Eqs. (5)–(8), Eq. (9) for the calculation of the shield tail clearance at any position can be derived.

$$Y = -\frac{\sin\beta\left[4\left(\frac{R}{\cos\alpha}\right)^2|OF_2| + |OF_1|^2|OF_2| - 4\left(\frac{R}{\cos\alpha}\right)^2|OF_1| + |OF_1|^3 - |OF_2|^3 - |OF_1||OF_2|^2\right] - \sqrt{\begin{aligned} &32\left(\tfrac{R}{\cos\alpha}\right)^2|OF_1|^2|OF_2|^2\sin^2\beta - 8\left(\tfrac{R}{\cos\alpha}\right)^2|OF_1|^2|OF_2|^2 - 64\left(\tfrac{R}{\cos\alpha}\right)^4|OF_1||OF_2|\sin^2\beta \\ &- 32\left(\tfrac{R}{\cos\alpha}\right)^4|OF_1|^2 - 32\left(\tfrac{R}{\cos\alpha}\right)^4|OF_2|^2 + 64\left(\tfrac{R}{\cos\alpha}\right)^6 + 4\left(\tfrac{R}{\cos\alpha}\right)^2|OF_1|^4 + 4\left(\tfrac{R}{\cos\alpha}\right)^2|OF_2|^4 \\ &+ 16\left(\tfrac{R}{\cos\alpha}\right)^2|OF_1||OF_2|^3\sin^2\beta + 16\left(\tfrac{R}{\cos\alpha}\right)^2|OF_1|^3|OF_2|\sin^2\beta \end{aligned}}}{2\left[4\left(\frac{R}{\cos\alpha}\right)^2 - |OF_2|^2\sin^2\beta - 2|OF_1||OF_2|\sin^2\beta - |OF_1|^2\sin^2\beta\right]} - \frac{D}{2}, \tag{9}$$

$$|OF_1| = -\frac{R}{\cos\alpha} + R\tan\alpha + \frac{D}{2} + \delta_{min}, \tag{10}$$

$$|OF_2| = \frac{R}{\cos\alpha} + R\tan\alpha - \frac{D}{2} - \delta_{min}, \tag{11}$$

where β indicates the position at which the shield tail clearance needs to be calculated; $|OF_1|$ and $|OF_2|$ can be calculated by Eqs. (10) and (11), respectively. Combining Eqs. (1)–(4) and (9)–(11), it can be concluded that when the TBM model and segment geometry are fixed, the tunnel curve radius, pitch angle, and length of the shield tail covering segments are the key factors affecting the shield tail clearance. In particular, while the pitch angle changes only slightly during excavation, it significantly impacts the shield tail clearance. From Eq. (1), it can be seen that the greater the pitch angle, the greater the variation in shield tail clearance. However, the total clearance at opposite positions is a fixed value. Therefore, it is crucial to strictly control the pitch angle of the TBM to prevent excessive pitch angles from causing shield tail clearance changes that exceed the available clearance value, which could lead to issues such as TBM jamming or difficulties in segment assembly.
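The closed form in Eqs. (9)–(11) can be sketched in Python as below. This is only an illustration under stated assumptions: the function names `focal_offsets` and `tail_clearance` are hypothetical, all lengths share one unit, R is the quantity used in Eqs. (5)–(11) (the semi-minor axis of the elliptical section), and the first term under the square root is read as 32(R/cos α)²|OF₁|²|OF₂|² sin²β, which is the form obtained when |EF₁|, |EF₂|, and φ are eliminated with the distance OE taken as D/2 + Y.

```python
import math

def focal_offsets(alpha_deg, R, D, delta_min):
    """Return (a, |OF1|, |OF2|): semi-major axis R/cos(alpha) and Eqs. (10)-(11)."""
    alpha = math.radians(alpha_deg)
    a = R / math.cos(alpha)
    p = -a + R * math.tan(alpha) + D / 2 + delta_min  # Eq. (10)
    q = a + R * math.tan(alpha) - D / 2 - delta_min   # Eq. (11)
    return a, p, q

def tail_clearance(beta_deg, alpha_deg, R, D, delta_min):
    """Shield tail clearance Y at horizontal angle beta, following Eq. (9)."""
    a, p, q = focal_offsets(alpha_deg, R, D, delta_min)
    s = math.sin(math.radians(beta_deg))
    # Bracketed term multiplied by sin(beta) in the numerator of Eq. (9).
    lin = s * (4 * a**2 * q + p**2 * q - 4 * a**2 * p + p**3 - q**3 - p * q**2)
    # Radicand of Eq. (9); the first term is the assumed reading noted above.
    rad = (32 * a**2 * p**2 * q**2 * s**2 - 8 * a**2 * p**2 * q**2
           - 64 * a**4 * p * q * s**2 - 32 * a**4 * p**2 - 32 * a**4 * q**2
           + 64 * a**6 + 4 * a**2 * p**4 + 4 * a**2 * q**4
           + 16 * a**2 * p * q**3 * s**2 + 16 * a**2 * p**3 * q * s**2)
    den = 2 * (4 * a**2 - (p + q) ** 2 * s**2)
    return -(lin - math.sqrt(rad)) / den - D / 2

# Illustrative geometry in mm (not taken from the case study).
R, D = 3130.0, 6200.0
Y_30 = tail_clearance(30.0, 1.0, R, D, 20.0)
```

Two quick sanity checks follow from the formula itself: with α = 0 and δ_min equal to R − D/2 the result is that same uniform clearance at every β, and at β = 90° the function returns δ_min.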

Equation (9) realizes the estimation of clearance by revealing how the pitch angle and tunnel curve radius influence the shield tail clearance. However, the derivation of the physical equations does not consider the influence of other factors on the shield tail clearance, such as cylinder stroke difference and shield tail deviation. As a result, theoretical analysis methods often fail to accurately calculate the clearance size. To overcome these shortcomings, this study aims to develop an online PDNN model to better forecast and control shield tail clearance.

3.2 Online PDNN for shield tail clearance estimation

3.2.1 The architecture of PDNN model

Since the neural network was proposed, some researchers in the civil engineering field have started to explore the potential application of DNN in tunnel-related challenges (Gao et al., 2019; Lin et al., 2023; Zhang et al., 2020b). However, purely data-driven DNN models have the problems of over-reliance on data and a lack of reasonable explanation of the internal mechanism of the physical system (Chakraborty, 2021). A viable strategy for tackling these issues is to incorporate the physical laws into the framework of DNN to derive a novel PDNN algorithm. Specifically, one of the mainstream design ideas of the PDNN algorithm is to integrate the physical mechanisms into the model constraints. Figure 5 illustrates a general framework of physics-informed DNN.

As shown in Fig. 5, the general structure of a DNN consists of three main components, namely, an input layer that receives feature variables, multiple hidden layers for extracting and learning high-level features through nonlinear transformations, and an output layer for producing prediction results. Generally, the feature variables and predicted values can be represented by the vectors $x = \{x_1, x_2, \ldots, x_m\}$ and $\hat{y} = \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_r\}$, where $m$ represents the number of feature variables and r represents the number of output targets. These layers, each comprising a different number of neurons, are fully connected through weights and biases, and information is passed between neurons through a nonlinear activation function. Mathematically, the relationship can be represented as:

$$o_q^{s} = \sigma\left(x_q^{s}\right) = \sigma\left(\sum_q w_q^{s-1} o_q^{s-1} + b^{s-1}\right), \tag{12}$$

where $x_q^{s}$ and $o_q^{s}$ denote the input and output values of the qth neuron in the sth layer, respectively; $w_q^{s-1}$ represents the weight of the qth neuron in the (s−1)th layer; $b^{s-1}$ represents the bias of the (s−1)th layer; σ is an activation function that acts as a nonlinear mapping. Generally, the values of the parameters $w_q^{s-1}$ and $b^{s-1}$ are estimated automatically by minimizing the mean square error (MSE) during training.

$$MSE = \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{r}\left(\hat{y}_j^{(i)} - \tilde{y}_j^{(i)}\right)^2, \tag{13}$$

where $\hat{y}_j^{(i)}$ and $\tilde{y}_j^{(i)}$ are the predicted value and the real value for the jth output target of the ith sample, respectively; N is the number of collected samples. Compared with the Sigmoid and Tanh functions, the ReLU activation function converges more easily and alleviates the vanishing gradient problem, so it is employed to activate the neurons in this study.

$$\sigma\left(x_q^{s}\right) = \mathrm{ReLU}\left(x_q^{s}\right) = \begin{cases} x_q^{s}, & \text{if } x_q^{s} \ge 0 \\ 0, & \text{if } x_q^{s} < 0, \end{cases} \tag{14}$$

where the final output is determined by the input features, as well as the weights and biases associated with each neuron in the network.
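To make Eqs. (12)–(14) concrete, the following minimal Python sketch evaluates one fully connected layer with ReLU activation and the MSE of Eq. (13). It is an illustration only: the helper names are hypothetical, and the layer is written with a weight matrix and one bias per neuron, which is the usual fully connected form and an assumption relative to the compact notation of Eq. (12).

```python
def relu(v):
    """Eq. (14): pass non-negative inputs through, clamp negatives to zero."""
    return v if v >= 0 else 0.0

def dense_forward(inputs, weights, biases):
    """One layer of Eq. (12): weights[k][q] links input q to neuron k (assumed layout)."""
    return [relu(sum(w * o for w, o in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mse(pred, obs):
    """Eq. (13): squared errors summed over outputs, averaged over the N samples."""
    n = len(pred)
    return sum((p - t) ** 2
               for ps, ts in zip(pred, obs)
               for p, t in zip(ps, ts)) / n

# Two inputs feeding two neurons; the second neuron is switched off by ReLU.
hidden = dense_forward([1.0, -2.0], [[1.0, 0.5], [-1.0, 0.0]], [0.5, 0.0])
# Two samples with two outputs each.
error = mse([[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [1.0, 4.0]])
```

Stacking several such layers and minimizing `mse` with respect to the weights and biases gives the purely data-driven DNN that the physics-informed loss later extends.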

The fundamental idea of the PDNN model is to integrate the physical laws governing the problem into the loss function, thereby constructing a DNN model that incorporates physical constraints. The mathematical equations of fundamental physical laws can be categorized into dependent equations, ordinary differential equations (ODEs), and partial differential equations (PDEs) (Roy et al., 2023), with their physical interpretations varying across different engineering problems. Typically, these equations are used to represent the initial conditions, boundary conditions, and control equations that the problem must satisfy. For instance, Chai et al. (2020), in the task of optimizing the trajectory of a hypersonic vehicle, employed dependent equations and ODEs to describe the boundary conditions of the hypersonic vehicle system and the control equations based on flight kinematics, respectively. Due to the differing mathematical representations of physical laws, the corresponding physical loss functions are slightly different. As shown in Eqs. (15)–(17), $MSE_{p1}$, $MSE_{p2}$, and $MSE_{p3}$ represent the physical losses associated with the dependent equations, ODEs, and PDEs, respectively.

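$$MSE_{p1} = \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{r}\left(\hat{y}_j^{(i)} - Y_j^{(i)}\left(x^{(i)}\right)\right)^2, \tag{15}$$

$$MSE_{p2} = \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{r}\left(\frac{d^n \hat{y}_j^{(i)}}{d\left(x_t^{(i)}\right)^n} - \frac{d^n Y_j^{(i)}\left(x^{(i)}\right)}{d\left(x_t^{(i)}\right)^n}\right)^2, \tag{16}$$

$$MSE_{p3} = \frac{1}{N}\sum_{i=1}^{N}\sum_{t=1}^{m}\sum_{j=1}^{r}\left(\frac{\partial^n \hat{y}_j^{(i)}}{\partial\left(x_t^{(i)}\right)^n} - \frac{\partial^n Y_j^{(i)}\left(x^{(i)}\right)}{\partial\left(x_t^{(i)}\right)^n}\right)^2, \tag{17}$$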

where $x^{(i)}$ is the feature vector of the ith sample; $x_t^{(i)}$ is the tth feature of the ith sample; $\hat{y}_j^{(i)}$ and $Y_j^{(i)}(x^{(i)})$ are the predicted and theoretical values of the jth output for the ith sample, respectively; $\frac{d^n \hat{y}_j^{(i)}}{d(x_t^{(i)})^n}$ and $\frac{d^n Y_j^{(i)}(x^{(i)})}{d(x_t^{(i)})^n}$ denote the nth-order ordinary derivatives of $\hat{y}_j^{(i)}$ and $Y_j^{(i)}(x^{(i)})$ with respect to $x_t^{(i)}$, respectively; $\frac{\partial^n \hat{y}_j^{(i)}}{\partial(x_t^{(i)})^n}$ and $\frac{\partial^n Y_j^{(i)}(x^{(i)})}{\partial(x_t^{(i)})^n}$ are the nth-order partial derivatives of $\hat{y}_j^{(i)}$ and $Y_j^{(i)}(x^{(i)})$ with respect to $x_t^{(i)}$, respectively. The values of $\frac{d^n \hat{y}_j^{(i)}}{d(x_t^{(i)})^n}$ and $\frac{\partial^n \hat{y}_j^{(i)}}{\partial(x_t^{(i)})^n}$ can be obtained using an automatic differentiation technique, while $\frac{d^n Y_j^{(i)}(x^{(i)})}{d(x_t^{(i)})^n}$ and $\frac{\partial^n Y_j^{(i)}(x^{(i)})}{\partial(x_t^{(i)})^n}$ need to be derived by differentiating the physical equations. In summary, the loss function of the PDNN model is expressed as the weighted sum of Eq. (13) and Eqs. (15)–(17):

$$Loss = \lambda_1 MSE + \lambda_2 MSE_{p1} + \lambda_3 MSE_{p2} + \lambda_4 MSE_{p3}, \tag{18}$$

where $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$ represent the weights of the MSE, $MSE_{p1}$, $MSE_{p2}$, and $MSE_{p3}$, respectively. The sum of $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$ is 1.

The application of physical constraint equations not only helps to reduce the size of the input data set but also enables the output target to satisfy the corresponding physical properties. The PDNN model framework for shield tail clearance measurement after coupling the physical mechanism is shown in Fig. 6. The loss function of the developed PDNN model is formulated as:

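$$Loss\left(\hat{y}, \tilde{y}, Y\right) = \lambda_1 Loss_{obs}\left(\hat{y}, \tilde{y}\right) + \lambda_2 Loss_{phy}\left(\hat{y}, Y\right) = \lambda_1 \sum_{i=1}^{N}\sum_{j=1}^{r}\left(\hat{y}_j^{(i)} - \tilde{y}_j^{(i)}\right)^2 + \lambda_2 \sum_{i=1}^{N}\sum_{j=1}^{r}\left(\hat{y}_j^{(i)} - Y_j^{(i)}\right)^2, \tag{19}$$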

where $\hat{y}$, $\tilde{y}$, and $Y$ represent the sets formed by the predicted values, observed values, and theoretical values of the shield tail clearance from all the training samples, respectively; the first part $\sum_{i=1}^{N}\sum_{j=1}^{r}(\hat{y}_j^{(i)} - \tilde{y}_j^{(i)})^2$ represents the data loss $Loss_{obs}(\hat{y}, \tilde{y})$, calculated from the predicted value and observed value of shield tail clearance, while the second part $\sum_{i=1}^{N}\sum_{j=1}^{r}(\hat{y}_j^{(i)} - Y_j^{(i)})^2$ represents the physical loss $Loss_{phy}(\hat{y}, Y)$, which is derived from the predicted value of shield tail clearance and the theoretical value calculated based on Eq. (9).

Drawing on the research of Chai et al. (2021), which applied physical constraints to optimize overtaking strategies in maneuver planning, the proposed modeling approach that combines DNN with physical laws is expected to improve the model’s causal interpretability. Specifically, the model parameters, such as weights and biases, trained using this carefully designed loss function can minimize not only the data-driven component of the loss but also the boundary residual constraints represented by the dependent equations. This helps the model’s predicted solutions to approximate the theoretical solution more closely, resulting in improved alignment between the predicted data set and the observed data set.
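To make the structure of the combined loss in Eq. (19) concrete, the following minimal sketch evaluates it on plain Python lists. The function name `pdnn_loss` and the default weight `lam_obs=0.9` are illustrative assumptions (only a physical loss weight of 0.1 is reported for the case study), and the theoretical values from Eq. (9) are assumed to be precomputed.

```python
def pdnn_loss(y_pred, y_obs, y_theory, lam_obs=0.9, lam_phy=0.1):
    """Weighted sum of data loss and physical loss, in the form of Eq. (19).

    Each argument is a list of N samples; each sample is a list of r values.
    The default weights are assumptions made for illustration.
    """
    loss_obs = 0.0  # squared error against observed clearances
    loss_phy = 0.0  # squared error against theoretical clearances
    for pred_i, obs_i, theo_i in zip(y_pred, y_obs, y_theory):
        for p, o, t in zip(pred_i, obs_i, theo_i):
            loss_obs += (p - o) ** 2
            loss_phy += (p - t) ** 2
    return lam_obs * loss_obs + lam_phy * loss_phy
```

In an actual training loop this quantity would be computed on tensors so that gradients can flow back to the network weights; the list-based version only shows how the two terms are combined.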

The main steps involved in the proposed PDNN are illustrated in Algorithm 1. For training the PDNN, the input variables include shield pitch angle and several other important parameters x_other. The output variables are the magnitude of the shield tail clearance at different positions. Details on the strategy of hyperparameter tuning for model training as well as input and output variables are described in Subsections 3.2.2 and 4.2, respectively. Herein, the Adam optimizer and Xavier initialization are also applied in the training process of PDNN models.

3.2.2 Online update mechanism

Although DNN models embedded with physical laws reduce the dependence on data set size and reflect the engineering mechanisms more realistically, offline PDNN models may lose validity as the data set is continuously updated. Consequently, this study concentrates on developing an online PDNN model, which differs from the conventional PDNN model in how data are supplied and in how the model parameters are updated. Specifically, the data are provided sequentially to the online PDNN model, and the hyperparameters are dynamically adjusted as the data set is updated. In contrast, the entire data set is supplied in batches to the conventional offline PDNN model, and once that model has been trained, the parameters of the neural network are frozen, which means that they are not updated as the input data set changes. Figure 7 shows the development process of the online PDNN model.

In the online prediction concept, suppose that the initial data set contains N₀ samples. We choose N₁ of the N₀ samples as the training data to train the model, and the remaining N₂ samples as the testing data to validate the model. If the loss function can be minimized with the current hyperparameter combination θ, the current combination θ is regarded as optimal. Otherwise, a hyperparameter optimization algorithm is needed to find the optimal combination of hyperparameters $\theta^*$. Bayesian Optimization (BO) with Gaussian process priors has proven effective in determining the optimal hyperparameters for DL algorithms. Therefore, the BO algorithm is integrated into the proposed online PDNN model to automatically optimize the hyperparameters of the neural network. Pseudocode for optimizing the hyperparameters based on BO is shown in Algorithm 2. Two key aspects are worth highlighting: one is the use of Gaussian processes to construct a probabilistic surrogate model; the other is the use of expected improvement as an acquisition function to select the next sampling location.
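The acquisition step can be illustrated with a short sketch. It assumes the acquisition function is the standard expected improvement criterion for a minimization problem and that a Gaussian process posterior mean and standard deviation are already available for each candidate; the names `expected_improvement`, `pick_next`, and `posterior` are hypothetical and do not come from Algorithm 2.

```python
import math


def expected_improvement(mu, sigma, best_loss):
    """Expected improvement over best_loss for a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No uncertainty left: improvement is deterministic.
        return max(best_loss - mu, 0.0)
    z = (best_loss - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best_loss - mu) * cdf + sigma * pdf


def pick_next(candidates, posterior, best_loss):
    """Return the candidate hyperparameter setting with the largest EI.

    posterior(candidate) is assumed to return a (mean, std) pair.
    """
    return max(candidates,
               key=lambda c: expected_improvement(*posterior(c), best_loss))
```

The criterion rewards candidates that either have a low predicted loss or a high predictive uncertainty, which is how BO balances exploitation and exploration.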

All computations in the DL method can be viewed as nonlinear operators performed by neural networks, where the tunable parameters of the model are continuously computed and updated by minimizing the loss function. Therefore, when N₃ new samples arrive, we need to ascertain whether the current model performs well on them. If so, the model is sufficiently robust, and no adjustment is needed. Otherwise, we need to recalculate the optimal hyperparameters with the updated data set (containing N' samples).
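The update logic described above can be sketched as a simple loop. This is an illustration under stated assumptions, not the authors' implementation: `train`, `evaluate`, and `tune` are hypothetical callables standing in for PDNN training, evaluation on new samples, and the BO search, and `threshold` is an assumed acceptance level for the evaluation score. New samples are merged into the data set only when the current model fails on them, which is one possible reading of the procedure.

```python
def online_update(initial_data, batches, train, evaluate, tune, threshold):
    """Process new sample batches sequentially, retraining only when needed.

    train(data, hyperparams) -> model
    evaluate(model, samples) -> score (higher is better)
    tune(data) -> hyperparams
    """
    data = list(initial_data)
    hyperparams = tune(data)
    model = train(data, hyperparams)
    n_retrained = 0
    for batch in batches:
        if evaluate(model, batch) < threshold:
            # The current model does not fit the new samples: merge them,
            # search the hyperparameters again, and retrain.
            data.extend(batch)
            hyperparams = tune(data)
            model = train(data, hyperparams)
            n_retrained += 1
        # Otherwise the current model is kept unchanged.
    return model, hyperparams, n_retrained
```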

3.3 Model evaluation and interpretation

Five statistical metrics are employed to assess the performance of the developed online PDNN model: R², the variance accounted for (VAF), the a20_index, the mean absolute error (MAE), and the root mean square error (RMSE), whose calculation formulas are given in Eqs. (20)–(24). These indicators are widely used in the evaluation of regression models, and using several of them together makes the assessment of the differences between predicted and measured values more reliable. R² reflects the extent to which the independent variables explain the variation in the dependent variable. The a20_index, a criterion for engineering problems, quantitatively evaluates the reliability of the regression model with a 20% tolerance (Yavas et al., 2023). RMSE and MAE measure the deviation between predicted and measured values. The closer the values of R², VAF, and a20_index are to 1, and the closer the values of MAE and RMSE are to 0, the higher the predictive accuracy of the model.

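(20)

$$R_j^2 = 1 - \frac{\sum_{i=1}^{N} \left( \hat{y}_j^{(i)} - \tilde{y}_j^{(i)} \right)^2}{\sum_{i=1}^{N} \left( \tilde{y}_j - \tilde{y}_j^{(i)} \right)^2},$$

(21)

$$VAF_j = 1 - \frac{\sum_{i=1}^{N} \left( \hat{y}_j^{(i)} - \tilde{y}_j^{(i)} \right)^2}{\sum_{i=1}^{N} \left( \tilde{y}_j - \tilde{y}_j^{(i)} \right)^2},$$

(22)

$$a20\_index_j = \frac{m_j^{20}}{N},$$

(23)

$$MAE_j = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{y}_j^{(i)} - \tilde{y}_j^{(i)} \right|,$$

(24)

$$RMSE_j = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_j^{(i)} - \tilde{y}_j^{(i)} \right)^2},$$

where $\tilde{y}_j$ represents the mean of all samples for the $j$-th target; $m_j^{20}$ refers to the number of samples whose error for the $j$-th target does not exceed ±20% of the measured values.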
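As a worked illustration of these metrics for a single target, the sketch below computes them with the standard library only. The function name `regression_metrics` is hypothetical. R², a20_index, MAE, and RMSE follow the definitions above; for VAF the commonly used variance-based definition, 1 − var(observed − predicted)/var(observed), is assumed here, which may differ from the printed form of Eq. (21).

```python
import math


def regression_metrics(y_pred, y_obs):
    """Return R2, VAF, a20_index, MAE, and RMSE for one output target."""
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    errors = [p - o for p, o in zip(y_pred, y_obs)]
    mean_err = sum(errors) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    # Samples whose absolute error is within 20% of the measured value.
    within_20 = sum(1 for p, o in zip(y_pred, y_obs)
                    if abs(p - o) <= 0.2 * abs(o))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        # Assumed variance-based VAF (see the note above the code).
        "vaf": 1.0 - sum((e - mean_err) ** 2 for e in errors) / ss_tot,
        "a20_index": within_20 / n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
    }
```

For example, with measured values [10, 20, 30] and predictions [11, 19, 33], the residual sum of squares is 11 and the total sum of squares is 200, so R² = 1 − 11/200 = 0.945.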

DL algorithms with black-box mechanisms offer advantages due to their intricate network structures, enabling them to effectively fit a given data set. However, this complexity comes at the cost of reduced model interpretability. Understanding how complex models make decisions is not only a relevant problem in data science applications but also a focal point in engineering (Tahmassebi et al., 2022). To address the interpretability problem of online PDNN models, the partial dependence plot (PDP) and Shapley additive explanations (SHAP) analysis methods are introduced in this study, aiming to visualize the prediction results and enhance the model interpretability.

PDP analysis is a powerful technique in ML and statistical analysis, primarily used to understand the relationship between specific input features and model predictions. PDP allows for the examination of how variations in a selected feature influence the predicted outcomes while holding other features constant. This approach enables a focused analysis of individual or combined effects of features on model behavior, making it particularly useful for assessing feature importance and interpreting complex models. Typically, PDP can be one-dimensional (1D PDP) or two-dimensional (2D PDP), which depends on whether one or two input parameters are varied at a time.

SHAP provides an efficient way to estimate the marginal contribution of each input feature to the model output. The Shapley value $\phi_{x_t}$, which represents the contribution of the feature $x_t$, is determined by:

(25)

$$\phi_{x_t} = \sum_{x' \subseteq x \setminus \{x_t\}} \frac{|x'|! \, (m - 1 - |x'|)!}{m!} \left( R(x' \cup \{x_t\}) - R(x') \right),$$

where $x'$ is a feature subset that does not contain $x_t$; $R(x' \cup \{x_t\})$ and $R(x')$ represent the model's outputs on the feature set $x' \cup \{x_t\}$, obtained by merging $x'$ with $x_t$, and on the subset $x'$, respectively. SHAP analysis is based on the additive feature attribution method, which can simplify a complex model into a linear function:

(26)

$$\xi(x_t) = \phi_0 + \sum_{t=1}^{m} \phi_{x_t} x_t,$$

where $\phi_0$ is computed as the mean of the estimated values; the coalition vector $x_t \in \{0, 1\}^m$ describes whether feature $x_t$ is observed, with $x_t = 0$ and $x_t = 1$ indicating that the feature is absent or present, respectively.
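The Shapley formula in Eq. (25) can be evaluated exactly for a small number of features by enumerating all subsets. The sketch below does this for an arbitrary value function; `shapley_value` and `value` are hypothetical names, and `value(subset)` plays the role of the model output R on a feature subset. The cost grows exponentially with the number of features, which is why practical SHAP implementations rely on approximations.

```python
from itertools import combinations
from math import factorial


def shapley_value(features, target, value):
    """Exact Shapley value of `target` by subset enumeration (Eq. (25)).

    value(subset) returns the model output for a frozenset of features.
    """
    others = [f for f in features if f != target]
    m = len(features)
    phi = 0.0
    for size in range(len(others) + 1):
        # Weight |x'|! (m - 1 - |x'|)! / m! shared by all subsets of this size.
        weight = factorial(size) * factorial(m - 1 - size) / factorial(m)
        for subset in combinations(others, size):
            s = frozenset(subset)
            phi += weight * (value(s | {target}) - value(s))
    return phi
```

As a check, for a toy value function with individual contributions of 3, 1, and −2 for three features plus an interaction bonus of 2 when the first two are both present, the Shapley values are 4, 2, and −2: the interaction is split equally between the two interacting features, and the values sum to the output on the full feature set.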

3.4 Multi-objective optimization for shield tail clearance control

To avert excessive extrusion between the shield shell and the concrete lining, it is necessary to consider multi-objective optimization and control for the shield tail clearance. Generally, a minimization MOO problem needs to consider four components: the $r$-dimensional target vector $y(x)$, the $M_{ineq}$ inequality constraints, the $M_{eq}$ equality constraints, and the $m$-dimensional decision vector $x$, whose mathematical representations are given in Eqs. (27)–(29).

(27)

$$\text{Minimize} \quad y(x) = (y_1(x), y_2(x), \ldots, y_r(x)), \quad r \geq 2,$$

(28)

$$\text{subject to} \quad g_\rho(x) \leq 0, \ \rho = 1, 2, \ldots, M_{ineq}; \quad h_\mu(x) = 0, \ \mu = 1, 2, \ldots, M_{eq},$$

(29)

$$x = (x_1, x_2, \ldots, x_m) \in \Omega,$$

where $y_r(x)$ is the $r$-th objective function; $g_\rho(x)$ represents the $\rho$-th inequality constraint; $h_\mu(x)$ is the $\mu$-th equality constraint; $\Omega$ is the decision space of independent variables. In view of this, the overall objective of the shield tail clearance optimization problem in this study is to determine appropriate ranges for key TBM machine parameters by minimizing the shield tail clearance in different directions. This approach helps to prevent excessive extrusion between the shield shell and the concrete linings, thus reducing safety hazards and avoiding engineering accidents. It is worth noting that the optimized shield tail clearance is not only constrained by the geometries of the TBM shield shell and lining rings, but also must not fall below the minimum clearance specified by industry standards. Referring to the literature (Han, 2014), this study sets the warning threshold for shield tail clearance at 10 mm.

According to the existing literature (Chai et al., 2017), the NSGA-II algorithm is well suited to solving MOO problems. Compared with NSGA, the main innovation of NSGA-II is the adoption of three mechanisms, namely nondominated sorting, crowding distance, and the crowded comparison operator, which significantly reduce the complexity of the algorithm and speed up the computation. In the nondominated sorting procedure, the concept of dominance is introduced to search for the optimal solutions to the MOO problem. For example, let $x_{\tau 1} = \{x_1^*, \ldots, x_m^*\}$ and $x_{\tau 2} = \{x_1, \ldots, x_m\}$ be two solution vectors of the MOO problem (Eq. (27)); $x_t^*$ dominates $x_t$ (denoted as $x_t^* \prec x_t$) if $x_t^* \leq x_t$ for all $t = 1, 2, \ldots, m$ and $x_t^* < x_t$ for at least one $t = 1, 2, \ldots, m$. When a solution vector $x$ of Eq. (27) is not dominated by any other solution vector, it is called Pareto optimal. The set of all Pareto optimal solutions is denoted as the Pareto set.
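The dominance test and the extraction of nondominated solutions can be sketched in a few lines. In this illustration the comparison is applied to objective vectors under minimization, which is the usual convention in NSGA-II implementations and is an assumption here; the function names `dominates` and `pareto_set` are hypothetical.

```python
def dominates(a, b):
    """True if vector `a` dominates `b`: no worse everywhere, better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_set(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For the four vectors (1, 5), (2, 2), (3, 3), and (5, 1), only (3, 3) is dominated (by (2, 2)), so the other three form the nondominated set.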

Once the sorting is complete, the crowding distance is considered to maintain the diversity and uniform distribution of the solution set. The crowding distance of the $\tau$-th solution is formulated as follows:

(30)

$$T = \sum_{j=1}^{r} \frac{y_j^{\tau+1} - y_j^{\tau-1}}{y_j^{max} - y_j^{min}},$$

where $y_j^{\tau+1}$ and $y_j^{\tau-1}$ represent the $j$th objective function values of the $(\tau+1)$th and $(\tau-1)$th individuals, respectively; $y_j^{max}$ and $y_j^{min}$ refer to the maximum and minimum values of the $j$th objective function.

After the fast nondominated sorting and the calculation of crowding distance are completed, every population member is assigned two attributes: a nondomination rank and a crowding distance. Based on these two attributes, the crowded comparison operator is used to compare two solutions. Specifically, if two solutions have different ranks, the solution with the lower (i.e., better) rank is selected. If two solutions belong to the same front, the solution with the larger crowding distance is preferred.
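The crowding distance of Eq. (30) and the crowded comparison rule can be sketched as follows. The function names are hypothetical. Assigning an infinite distance to the boundary solutions of each objective follows the usual NSGA-II convention and is an assumption, since Eq. (30) only covers interior solutions with two neighbors.

```python
def crowding_distances(front):
    """Crowding distance of each objective vector within one front."""
    n = len(front)
    dist = [0.0] * n
    if n == 0:
        return dist
    for j in range(len(front[0])):
        # Indices of the solutions sorted by the j-th objective.
        order = sorted(range(n), key=lambda i: front[i][j])
        lo, hi = front[order[0]][j], front[order[-1]][j]
        # Assumed convention: boundary solutions are always preferred.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        for k in range(1, n - 1):
            gap = front[order[k + 1]][j] - front[order[k - 1]][j]
            dist[order[k]] += gap / (hi - lo)
    return dist


def crowded_better(rank_a, dist_a, rank_b, dist_b):
    """Crowded comparison: lower rank wins; ties go to the larger distance."""
    return rank_a < rank_b or (rank_a == rank_b and dist_a > dist_b)
```

For the front (1, 5), (2, 2), (5, 1), the middle solution receives (5 − 1)/(5 − 1) from each of the two objectives, giving a crowding distance of 2, while the two boundary solutions receive an infinite distance.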

The aforementioned steps are repeated until the maximum number of iterations is reached, at which point the algorithm terminates and the Pareto front is generated. To determine the best solution from the Pareto front, the distance between each Pareto optimal solution vector and the ideal solution vector is calculated by Eq. (31), where the ideal solution vector is composed of the optimum of each objective function.

(31)

$$d_\tau = \sqrt{\sum_{j=1}^{r} \left( y_j^{\tau} - o_j \right)^2},$$

where $y_j^{\tau}$ is the value of the $j$th objective function in the $\tau$th Pareto solution; $o_j$ is the ideal value of the $j$th objective function. The values of $d_\tau$ calculated for the different Pareto solutions are compared, and the solution with the minimum $d_\tau$ is chosen as the best solution.
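The selection of the best compromise solution can be sketched as follows. The function name `best_compromise` is hypothetical, the distance is taken to be Euclidean, and the ideal value of each objective is assumed to be its minimum over the Pareto front, since all objectives are minimized.

```python
import math


def best_compromise(pareto_front):
    """Return the Pareto solution closest to the ideal point, with its distance."""
    r = len(pareto_front[0])
    # Assumed ideal point: the best (smallest) value of each objective.
    ideal = [min(sol[j] for sol in pareto_front) for j in range(r)]

    def distance(sol):
        return math.sqrt(sum((sol[j] - ideal[j]) ** 2 for j in range(r)))

    best = min(pareto_front, key=distance)
    return best, distance(best), ideal
```

For the front (1, 5), (2, 2), (5, 1), the ideal point is (1, 1) and the distances are 4, √2, and 4, so (2, 2) is selected.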

The pseudocode for optimizing the shield tail clearance based on the hybrid approach of online PDNN and NSGA-II is shown in Algorithm 3. The main steps are as follows: First, select multiple conflicting objectives (minimizing shield tail clearance in different directions) and set appropriate parameter ranges and constraints. Then, apply NSGA-II for MOO to generate a Pareto front. In this process, the online PDNN model acts as a surrogate model to quickly predict shield tail clearance values. Finally, by calculating the distance between each Pareto optimal solution and the ideal point, the best solution with evenly reduced shield tail clearance across all directions can be identified.

4 Case study

4.1 Background

To verify the effectiveness of the proposed online PDNN model, we have applied it to the tunneling project of Wuhan Metro Line 12 (i.e., the first loop line of Wuhan Metro in China). The target section focused on in this case is the single-bore, two-lane tunnel between the Guobo Center South Station and Lingwucun Station. The design length and the segment width of the tunnel are 3,373.67 m and 2 m, respectively. For the excavation of this tunnel, a mud-water balanced TBM with a cutter diameter of 12.56 m is employed. The engineering floor plan, lining rings, and photos of the TBM’s data acquisition system are shown in Fig. 8(a)–8(c), respectively.

During the tunnel excavation process, the data acquisition system is used to monitor the performance parameters of the TBM in real-time, which can contribute to ensuring the safety and controllability of construction. Out of all the parameters collected from the TBM, 25 machine parameters closely associated with shield tail clearance are relatively easy to obtain, such as shield attitude parameters, cylinder stroke, chamber pressure, etc. Accordingly, these parameters are considered input features of the model to reliably forecast the shield tail clearance. A detailed description of all TBM operational parameters is given in Section 4.2.

In this case study, the data set is taken from the first 553 concrete lining rings (Ring No. 0–552), each with a diameter of 12.1 m and a width of 2 m. To mitigate the risk of segment damage caused by minimal local shield tail clearance, the onsite workers measured the shield tail clearance at three specific positions: directly above, upper left (β = 135°), and upper right (β = 45°), which are represented by y1, y2, and y3, respectively. As a result, the acquired data set comprises 553 rows and 28 columns, with each row containing the aforementioned 25 input features and 3 output targets. It is worth noting that, for the sake of consistency in variable naming, the pitch angle α is denoted by x1 hereafter.

4.2 Data resources

Table 1 provides a detailed overview of the 25 input features and 3 output targets, with their numerical distribution shown in Fig. 9. It can be observed that the data set is well-distributed in the sample space, with no extreme or anomalous data points. To clarify whether there is a linear correlation between the input features and each output target, Fig. 10 illustrates the results of the Mantel test. Based on Mantel's r_men and p_men values calculated from the feature and target matrices, it is clear that the 25 input features play a significant role in determining the magnitude of the shield tail clearance. Nevertheless, the Pearson correlation coefficients between all input features and each output target are less than 0.75, which indicates that no strong linear correlation is present in the prepared data set. In other words, it is almost impossible to accurately predict the shield tail clearance by relying solely on manual analysis. In light of this, this study utilizes DL technology to fully exploit the potential value of the acquired data and realize the dynamic prediction and control of the shield tail clearance.

As shown in Table 1, the input features have different units and scales, so their values vary greatly. This difference may adversely affect the convergence and prediction of the model. Thus, we perform Min-Max normalization on the data set before developing a DL model. That is, the original data are mapped linearly to the interval [0, 1] by Eq. (32), which preserves the shape of their distribution.

(32)

$$x_t^{normal} = \frac{x_t - \min(x_t)}{\max(x_t) - \min(x_t)},$$

where $x_t^{normal}$ is the normalized value, and $\min(x_t)$ and $\max(x_t)$ denote the minimum and maximum values of the feature $x_t$ in the data set, respectively. After data preprocessing is completed, the gray relational grade (GRG) is applied to further illustrate the rationality of the selected input features. The definition of GRG is given in Eq. (33), and the result is shown in Fig. 11.

(33)

$$\gamma(x_{ref}, x_{comp}) = \frac{1}{Q} \sum_{k=1}^{Q} \frac{\min_{comp} \min_{k} \left| x_{ref}(k) - x_{comp}(k) \right| + \zeta \max_{comp} \max_{k} \left| x_{ref}(k) - x_{comp}(k) \right|}{\left| x_{ref}(k) - x_{comp}(k) \right| + \zeta \max_{comp} \max_{k} \left| x_{ref}(k) - x_{comp}(k) \right|},$$

where $x_{ref}$ is the reference sequence; $x_{comp}$ refers to the comparability sequence; $k$ indexes the time period; $Q$ is the number of elements in each sequence; $\zeta$ is the distinguishing coefficient, $\zeta \in (0, 1)$. As presented in Fig. 11, the GRG values between the selected features and the output targets are all greater than 0.5, which demonstrates that using the 25 features to train the DL model for shield tail clearance estimation is reasonable. In particular, x1−x7 and x14−x25 tend to have a more direct impact on the variation tendency of the shield tail clearance, while the features x8−x13, which are directly related to TBM operation, have some indirect relationships with the shield tail clearance.
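The two preprocessing steps, Min-Max normalization (Eq. (32)) and the gray relational grade (Eq. (33)), can be sketched together. The function names are hypothetical, the absolute difference between reference and comparison sequences is used as the deviation term, and the default distinguishing coefficient `zeta=0.5` is a common choice assumed here rather than a value reported in the study.

```python
def min_max_normalize(column):
    """Map a list of values linearly to [0, 1], as in Eq. (32)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature carries no spread
    return [(v - lo) / (hi - lo) for v in column]


def grey_relational_grades(reference, comparisons, zeta=0.5):
    """GRG between one reference sequence and each comparison sequence."""
    deltas = [[abs(r - c) for r, c in zip(reference, comp)]
              for comp in comparisons]
    # Global minimum and maximum deviation over all sequences and all k.
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0.0:
        return [1.0 for _ in comparisons]  # every sequence equals the reference
    grades = []
    for row in deltas:
        coeffs = [(d_min + zeta * d_max) / (d + zeta * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades
```

In this setting, the reference sequence would be a normalized output target and each comparison sequence a normalized input feature; a grade closer to 1 indicates a sequence that follows the reference more closely.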

4.3 Implementation details

In terms of small-scale data sets, DNN models trained using traditional methods usually exhibit worse performance than machine learning models (Feng et al., 2019). To break this inherent limitation, an online version of the physics-informed DL model is developed in this study, which is expected to have higher accuracy and better generalization performance. According to Subsection 4.2, the number of selected feature variables is 25, and the total number of samples collected is 553. However, it is considered that the shield attitude of the current ring may be affected by the shield attitude of the adjacent ring. For this reason, when evaluating the shield tail clearance of the current ring, not only the feature data of the current ring should be taken into account, but also the feature and target data of the adjacent ring should be input into the model as feature variables. Thus, the number of samples used to train models is 552, and the data composition of each sample is 50 feature variables and 3 target variables.
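The construction of samples that pair each ring with its adjacent (previous) ring can be sketched as below. The function name `build_lagged_samples` is hypothetical, and because the exact way the previous ring's target data enter the input is not fully specified in the text, it is exposed as an optional flag rather than fixed.

```python
def build_lagged_samples(features, targets, include_prev_targets=False):
    """Pair each ring with the previous ring's data.

    features[k] and targets[k] hold the feature and target values of ring k.
    Returns (inputs, outputs) with one sample per ring from the second onward.
    """
    inputs, outputs = [], []
    for k in range(1, len(features)):
        # Current-ring features followed by previous-ring features.
        row = list(features[k]) + list(features[k - 1])
        if include_prev_targets:
            row += list(targets[k - 1])
        inputs.append(row)
        outputs.append(list(targets[k]))
    return inputs, outputs
```

With 553 rings and 25 features per ring, this yields 552 samples, each with 50 feature values when only the previous ring's features are appended, which matches the sample count and feature count stated above.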

According to Eq. (9), it is evident that the values of the parameter β differ when measuring clearances in different orientations, which makes the online PDNN models used to forecast the shield tail clearances at different locations have different loss functions. Therefore, for y1, y2, and y3, we design separate online PDNN models. The development process of the online PDNN models for predicting the shield tail clearance is as follows:

The first 501 samples are taken from the 552 samples after preprocessing, and 498 of these samples are randomly selected as the training data, while the remaining samples are used as the testing data. Considering the complexity of the input features and the scale of the sample set, the initial parameters of the online PDNN model are set as follows: the network has 2 hidden layers; the number of neurons in each hidden layer and the number of iterations after each data set update are shown in Fig. 12; the learning rate is 0.001; the physical loss weight is 0.1; and the ReLU function is used to activate the neurons. In addition, the Adam optimizer is used to optimize the coefficients of the neural network to improve the convergence speed and performance of the model.

To assess the reliability of the online PDNN method, the model is applied to the testing data, and the performance of the model is evaluated using five evaluation indicators: R², VAF, a20_index, MAE, and RMSE. Based on the resulting values of these evaluation metrics, if the model performs well, it can be directly applied to predict newly arrived samples. Otherwise, the model must be retrained based on the optimal hyperparameter combination searched by the BO algorithm. The ideal result is that the initial model's prediction accuracy on new samples remains high, which shows that the established model has strong generalization performance. If, however, the model does not match the new samples well, the new samples are integrated into the initial data set to obtain an updated data set. The updated data set is then divided into training data and testing data for model training and evaluation. The process is repeated until no new samples are available. The construction process of the data set used to train the online PDNN is depicted in Fig. 13. As an example, Fig. 14 presents the training loss of online PDNN models on (a) y1, (b) y2, and (c) y3 after the 1st, 6th, 12th, and 18th data set updates. It can be seen that the training loss decreases rapidly during the first 100 iterations and then gradually reaches a steady value.

After training the online PDNN model, it is integrated with the NSGA-II algorithm for shield tail clearance optimization. Based on the correlation between different TBM machine parameters, x1 to x25 are considered adjustable decision variables. However, the historical data from the previous moment are regarded as unadjustable fixed factors, such as the shield tail clearance $y_{1,t-1}$, the starting stroke of group C cylinder $x_{16,t-1}$, and the shield tail horizontal deviation $x_{4,t-1}$ from the previous moment. To clarify the impact of the adjustable parameter range on the optimization of the shield tail clearance, this study sets five constraint conditions, under which each adjustable parameter may vary upward or downward by 10%, 20%, 30%, 40%, or 50% of its current value.
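One way to turn such a constraint condition into search bounds for the optimizer is sketched below. The function name `parameter_bounds` is hypothetical, and the interpretation of each constraint as a symmetric band around the current value of every adjustable parameter is an assumption.

```python
def parameter_bounds(current_values, fraction):
    """Lower and upper bounds allowing each value to vary by +/- fraction of itself."""
    bounds = []
    for v in current_values:
        # abs() keeps lower <= upper when the current value is negative.
        delta = fraction * abs(v)
        bounds.append((v - delta, v + delta))
    return bounds
```

For instance, under the 20% condition a parameter currently at 100 would be searched within [80, 120].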

Reasonable hyperparameters can not only yield better optimization results but also significantly improve optimization efficiency. Therefore, Table 2 provides the hyperparameter analysis of the NSGA-II algorithm under the 20% constraint condition. It can be observed that while enlarging the population size appropriately can increase the hypervolume (HV) and overall improvement percentage of the Pareto solution set, it also increases the training time. For instance, when the population size increases from 50 to 100, the training time increases by 4.18 s, but the overall improvement percentage only improves by 0.37%. Compared to population size, mutation and crossover probabilities have a smaller effect on training time. Overall, the diversity of the Pareto solution set improves with an increase in these two parameters. Based on these observations, the population size is set to 50, with both crossover and mutation probabilities set to 0.9 in this experiment. Additionally, this study employs parallel computing techniques, which effectively reduces computation time, and sets the convergence tolerance to 1 × 10⁻⁶ and the maximum number of iterations to 1000.

4.4 Results analysis

The proposed online PDNN is suitable for online regression tasks on small sample data sets. It has demonstrated a strong capability in modeling the nonlinear relationship between TBM machine parameters and shield tail clearance. To demonstrate the superiority of the online PDNN, we adopt the traditional data set partitioning method (training set: testing set = 9:1) for training DNN and PDNN models. The performance comparison of online PDNN with the DNN and PDNN is shown in Table 3. To understand the effect of incorporating the physical laws into the algorithm, we used the SHAP and PDP methods to analyze the contribution of features, particularly the impact of pitch angle on model output y1. Detailed results are analyzed as follows:

(1) The proposed online PDNN method can accurately predict the shield tail clearance with R² scores of 0.90 or higher. As shown in Table 3, the online PDNN model performs best when predicting the shield tail clearance directly above (y1), achieving the lowest RMSE (2.97) and MAE (2.52), as well as the highest VAF (0.96) and R² score (0.93). For the upper left shield tail clearance (y2) and the upper right shield tail clearance (y3), the online PDNN model shows similar predictive performance. Specifically, both have an RMSE of 3.64 and an R² score of 0.90, with MAE values of 3.17 and 3.00 and VAF values of 0.93 and 0.94, respectively. In summary, these low error metric values and high R² scores indicate that the online PDNN method performs reliably. In addition, the a20_index values of the online PDNN models reach 1 when predicting the shield tail clearance at all three specified locations, which indicates that the proposed method has practical engineering significance.

(2) Compared with offline DNN and PDNN approaches, the online PDNN has better prediction results. As presented in Table 3, the a20_index values obtained with the different DNN algorithms are close to 1, demonstrating that both the original and improved DL methods are reliable with 20% tolerance. In terms of RMSE, MAE, VAF, and R², the online PDNN significantly outperforms the traditional DNN and PDNN methods. For example, in forecasting the upper left shield tail clearance, the online PDNN exhibits a decrease of 2.66 mm in RMSE and 1.73 mm in MAE, coupled with an improvement of 0.18 in VAF and 0.14 in R², compared with the second-best model, PDNN. Furthermore, Fig. 15 shows the shield tail clearance measurement deviations of the three models on the training and testing data. It can be seen that the online PDNN model significantly increases the number of samples with a prediction deviation of less than 5 mm (an average increase of 14.99% over the PDNN), which again demonstrates the superiority of the online PDNN approach.

(3) The online PDNN shows strong robustness, with its R² scores remaining at 0.9 and above after each addition of new samples to the original data set. Figure 16 exhibits the values of the five evaluation metrics for the online PDNN model on the training and testing data after each data set update. The online PDNN consistently achieves values above 0.90 for VAF, 1 for a20_index, and 0.90 for R², which indicates that the method generalizes well when adapting to new samples. Figure 17 illustrates the SHAP analysis result of the pitch angle. As presented in Fig. 17(a), the pitch angle ranks second among the 51 feature variables contributing to the output y1. Figure 17(b) demonstrates a positive correlation between the pitch angle and the corresponding SHAP value. To further explore the contribution of the interactions between input variables to shield tail clearance prediction, Fig. 18 presents a 2D PDP to reveal the interactions among the top three factors. All input feature values in the 2D PDP are standardized, and the contour lines represent the shield tail clearance directly above. For instance, in Fig. 18(a), when the standardized value of the previous moment's shield tail clearance $y_{1,t-1}$ is 0.07 and the pitch angle x1 is 0.9, the model can minimize y1 to 50 mm during the testing phase.

(4) The hybrid approach of online PDNN and NSGA-II performs well in searching for the best solution for MOO of shield tail clearance, with the optimal variation space of the adjustable parameters found to be the 20% up-down range of their current values. Figures 19 and 20 show the optimized objective values and the Pareto fronts obtained under the different constraints. It is evident that as the range of the adjustable decision variables increases, the shield tail clearance gradually narrows, and the Pareto solution set is more evenly distributed in the entire non-inferior solution space. As shown in Table 4, the overall optimized percentage of the shield tail clearance reaches up to 58.37% under the 50% constraint relative to the original data. According to the reference (Han, 2014), the warning value for shield tail clearance is 10 mm, the normal range is 10−20 mm, and the preferable range is 20−30 mm. The optimized magnitude of shield tail clearance does not fall below the warning value, which indicates that the optimized solutions obtained from the MOO are reasonable. Extensive adjustments to the machine parameters during shield tunneling can potentially lead to significant increases in safety risks and wear. Accordingly, the optimal variation space of adjustable parameters is confined to the 20% up-down range of their current values.

5 Discussions

The case analysis shows that the online PDNN method is an effective tool for estimating and controlling shield tail clearance. To further highlight the advantages of online PDNN, we compare its performance with other popular algorithms from two perspectives. From the prediction aspect, three powerful artificial intelligence models, XGBoost, LightGBM, and LSTM, are trained with the same online learning approach and the same training and testing data sets. The forecast performance comparison of different models is presented in Fig. 21. From the optimization aspect, this study conducts five comparative experiments. In the first and second groups, DNN and PDNN are each combined with NSGA-II to perform the MOO for shield tail clearance. In the third, fourth, and fifth groups, the online PDNN model is combined with NSGA-III, the multi-objective evolutionary algorithm based on decomposition (MOEA/D), and the adaptive geometry estimation based MOEA algorithm 2 (AGE-MOEA2) to control shield tail clearance. The optimization results obtained with the different MOO frameworks are shown in Table 5. In addition, this section compares the online PDNN approach with other non-contact measurement techniques. Findings from these results are summarized below.

(1) The online PDNN demonstrates superior performance compared to other state-of-the-art models across all five evaluation metrics, which include RMSE, MAE, VAF, a20_index, and R². As shown in Fig. 21, the online PDNN model consistently achieves the lowest values of RMSE and MAE, as well as the highest values of VAF, a20_index, and R², highlighting its accuracy in forecasting the shield tail clearance. Specifically, compared to the prediction performance of LightGBM on y1, the online PDNN reduces the RMSE and MAE by 1.07 and 0.79, respectively, and improves the VAF, a20_index, and R² by 0.91, 0.02, and 0.12, respectively. In terms of predicting the upper-left shield tail clearance y2, the ranking of the four models is online PDNN > LightGBM > XGBoost > LSTM. In particular, in comparison to the LSTM model, the online PDNN exhibits a reduction of 1.94 in RMSE, a reduction of 1.58 in MAE, an increase of 0.20 in VAF, an increase of 0.04 in a20_index, and an increase of 0.29 in R². In terms of predicting the upper-right shield tail clearance y3, the online PDNN outperforms the other three intelligent algorithms. It reduces the RMSE and MAE by at least 0.10 and 0.23, respectively, and improves the VAF and R² by at least 0.03 and 0.03, respectively. Therefore, it can be concluded that the online PDNN model is a more appropriate choice for minimizing prediction errors and enhancing model performance, which is consistent with the study by Feng et al. (2019). That is, DNN can effectively deal with issues related to small data sets with higher accuracy and superior generalization performance.

(2) Compared with other advanced MOO frameworks, the hybrid model integrating online PDNN and NSGA-II performs better in the MOO of shield tail clearance. Based on the average overall improvement percentages under different constraints shown in Table 5, the six MOO frameworks rank as Online PDNN + NSGA-II > Online PDNN + AGE-MOEA2 > Online PDNN + MOEA/D > Online PDNN + NSGA-III > PDNN + NSGA-II > DNN + NSGA-II. When the adjustable parameters are allowed to vary within ±10%, ±20%, ±30%, ±40%, and ±50% of their current values, the overall optimization percentage of the proposed hybrid method is higher than that of the other frameworks by at least 3.42%, 0.12%, 2.59%, 4.27%, and 0.57%, respectively. Furthermore, the proposed method achieves the Pareto optimal solution sets with the highest HVs. Compared with the other MOO frameworks, it improves the average HV by at least 9.20, indicating that its Pareto optimal solution sets have better distribution and diversity. Notably, a significant advantage of the online PDNN model over other non-contact measurement methods is that it does not require specialized equipment to measure shield tail clearance, making it more practical for tunnel engineering projects. For example, the VMT GmbH GB’s Automatic Tailskin Clearance Measurement System – SluM (VMT, 2021) and ENZAN’s Tail Clearance Control System (ENZAN, 2021) use ultrasonic sensors and industrial cameras, respectively, to collect data. This reliance on dedicated hardware makes the measurement of shield tail clearance more complex and expensive. In summary, the proposed hybrid approach integrating online PDNN and NSGA-II can effectively control shield tail clearance, which may help the excavation process advance smoothly.
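To make the hypervolume (HV) comparison in finding (2) more concrete, the following is a minimal sketch of how HV can be computed for a small solution set. It assumes two minimization objectives and a user-chosen reference point; the function names `pareto_front` and `hypervolume_2d` and the sample points are hypothetical, and the three-objective setting and reference point used in the case study are not reproduced here.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is minimized."""
    pts = [tuple(p) for p in points]
    front = []
    for p in pts:
        # q dominates p if q is no worse in every objective and differs from p.
        dominated = any(
            q != p and all(q_i <= p_i for q_i, p_i in zip(q, p))
            for q in pts
        )
        if not dominated:
            front.append(p)
    return front


def hypervolume_2d(points, reference):
    """Area dominated by the front and bounded by the reference point (2 objectives)."""
    front = sorted(set(pareto_front(points)))  # ascending f1, so f2 descends
    area = 0.0
    prev_f2 = reference[1]
    for f1, f2 in front:
        if f1 >= reference[0] or f2 >= prev_f2:
            continue  # point adds nothing inside the reference box
        area += (reference[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area


# Hypothetical objective values, for illustration only.
candidates = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
print(pareto_front(candidates))
print(hypervolume_2d(candidates, reference=(4.0, 4.0)))
```

A larger HV means the solution set covers more of the objective space relative to the reference point, which is why it is read as a joint indicator of convergence and diversity.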

6 Conclusions and future works

To accurately predict and control the magnitude of the shield tail clearance, a hybrid approach integrating the online PDNN and NSGA-II is proposed. The contribution of this study is reflected in two aspects. In terms of prediction, the physical equations applicable to measuring shield tail clearance at any position are first solved; the derived equations are then integrated into the loss function of the DNN to develop PDNN models. These PDNN models are trained using online learning techniques, which adaptively update the hyperparameters with the incoming stream of observations to achieve accurate estimation of shield tail clearance. In terms of MOO, this research enriches the area of optimization and control of shield tail clearance. With the proposed hybrid approach, shield tail clearance can be controlled within a reasonable range by adjusting the appropriate TBM operating parameters. In addition, the effectiveness of the method is verified with measured data from the Guobo Center South Station to Lingwucun Station section of Wuhan Metro Line 12.
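The idea of integrating the derived equations into the loss function can be sketched as follows. This is an illustrative composite loss only: it assumes a mean-squared data term plus a weighted mean-squared penalty on the gap between the network output and a physics-based estimate. The function name, the `weight` parameter, and the form of the penalty are assumptions made for illustration and are not the formulation defined in Section 3.2.

```python
def physics_informed_loss(y_pred, y_obs, y_phys, weight=0.5):
    """Data misfit plus a weighted penalty for deviating from a physics estimate.

    y_pred: network predictions of shield tail clearance
    y_obs:  measured clearances for the same samples
    y_phys: clearances computed from the physical equations (assumed given)
    weight: hypothetical trade-off factor between the two terms
    """
    n = len(y_pred)
    data_term = sum((p - o) ** 2 for p, o in zip(y_pred, y_obs)) / n
    physics_term = sum((p - q) ** 2 for p, q in zip(y_pred, y_phys)) / n
    return data_term + weight * physics_term


# Hypothetical values, for illustration only.
print(physics_informed_loss([1.0, 2.0], [1.0, 4.0], [2.0, 2.0], weight=0.5))
```

In an online setting, a loss of this kind would be re-evaluated on each incoming batch of observations so that the model keeps adapting while staying close to the physical relationship.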

The conclusions of this research are summarized below. (1) In terms of R², the online PDNN model demonstrates superior performance, with values of 0.93, 0.90, and 0.90 for the three objectives (y1, y2, and y3), respectively. Compared with the other five approaches (DNN, PDNN, Xgboost, LightGBM, and LSTM), the R² of the online PDNN exceeds the previous best result (Xgboost) by 0.06 for y1, and the previous best result (LightGBM) by 0.06 and 0.03 for y2 and y3, respectively. (2) With all other features held constant, an increase in the pitch angle has a positive influence on the output of the shield tail clearance prediction model. (3) The optimal variability space of the TBM operating parameters used for shield tail clearance optimization is a range of ±20% around their current values. Under this constraint, the optimized improvement percentage reaches 30.87%, with a hypervolume of 32, surpassing the average performance of the other MOO frameworks by 5.48% and 23, respectively. (4) In comparison with the existing SluM system and ENZAN’s Tail Clearance Control System, the online PDNN model can predict shield tail clearance without sacrificing accuracy or efficiency. Additionally, the measurement method based on the online PDNN eliminates the need for specialized equipment, which makes the measurement process easier and more cost-effective.
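For readers who wish to reproduce the kind of comparison summarized in conclusion (1), the five evaluation metrics can be sketched as below. The definitions are the commonly used forms and are an assumption here; they may differ in detail from those given in Section 3.3. In particular, VAF is expressed as a percentage and the a20_index is taken as the fraction of samples whose measured-to-predicted ratio lies within [0.8, 1.2]. The function name and sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, VAF (%), a20_index, and R2 for paired samples."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    mean_err = sum(errors) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    var_true = ss_tot / n
    within_20 = sum(
        1 for t, p in zip(y_true, y_pred) if p != 0 and 0.8 <= t / p <= 1.2
    )
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "vaf": (1.0 - var_err / var_true) * 100.0,
        "a20_index": within_20 / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical clearance values in millimetres, for illustration only.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(regression_metrics(measured, predicted))
```

Lower RMSE and MAE and higher VAF, a20_index, and R² indicate a better fit, which is the direction of comparison used throughout the discussion.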

Although the results are generally satisfactory, some limitations remain to be addressed in future work. First, the model performance depends on the choice of hyperparameters, so more comparative experiments and hyperparameter optimization are needed in future studies to further validate the superiority of the proposed method. Second, because the innovation of the method lies primarily in the development of the online PDNN model, less emphasis has been placed on advancing the MOO algorithm for shield tail clearance. Therefore, future research will explore the application of adaptive algorithms and multi-agent reinforcement learning (RL) in the shield tail clearance control framework. Additionally, integrating the proposed hybrid method with digital twin technology would allow it to be applied to more TBM construction projects with complex conditions, further verifying its applicability, generalizability, and repeatability.

References

[1]	Afradi A, Ebrahimabadi A, Hallajian T, (2020). Prediction of TBM penetration rate using support vector machine. Geosaberes: Revista de Estudos Geoeducacionais, 11: 467–479

[2]	Chai R, Guo Y, Zuo Z, Chen K, Shin H S, Tsourdos A, (2024). Cooperative motion planning and control for aerial-ground autonomous systems: Methods and applications. Progress in Aerospace Sciences, 146: 101005

[3]	Chai R, Savvaris A, Tsourdos A, Chai S, (2017). Solving multi-objective aeroassisted spacecraft trajectory optimization problems using extended NSGA-II. AIAA SPACE and Astronautics Forum and Exposition: 5193

[4]	Chai R, Savvaris A, Tsourdos A, Xia Y, Chai S, (2020). Solving multiobjective constrained trajectory optimization problem by an extended evolutionary algorithm. IEEE Transactions on Cybernetics, 50(4): 1630–1643

[5]	Chai R, Tsourdos A, Savvaris A, Chai S, Xia Y, Philip Chen C, (2021). Multiobjective overtaking maneuver planning for autonomous ground vehicles. IEEE Transactions on Cybernetics, 51(8): 4035–4049

[6]	Chai R, Tsourdos A, Savvaris A, Xia Y, Chai S, (2020). Real-time reentry trajectory planning of hypersonic vehicles: A two-step strategy incorporating fuzzy multiobjective transcription and deep neural network. IEEE Transactions on Industrial Electronics, 67(8): 6904–6915

[7]	Chakraborty S, (2021). Transfer learning based multi-fidelity physics informed deep neural network. Journal of Computational Physics, 426: 109942

[8]	Chen J, Liu S, Li W, Liu F, Zheng D, (2021). Design of laser vision measurement system for shield tail clearance of shield tunneling machine. Machine Tool & Hydraulics, 49: 77–80

[9]	Chen J, Zhou Z, Liu F, Zheng D, (2020). Research on the visual measurement based on shield tail clearance space structure of the shield machine. Machine Tool & Hydraulics, 48: 116–121

[10]	Chen K, Zhou X, Bao Z, Skibniewski M J, Fang W, (2025). Artificial intelligence in infrastructure construction: A critical review. Frontiers of Engineering Management, 12(1): 24–38

[11]	CSTNET (2021). The “Jiangcheng Pioneer” tunnel boring machine surpasses 1,000 rings while tunneling under the Yangtze River. Available at the website of finance.sina.com.cn

[12]	ENZAN (2021). Tail Clearance control system [EB/OL]. Available at the website of enzan-k.com

[13]	Feng S, Zhou H, Dong H, (2019). Using deep neural network with small dataset to predict material defects. Materials & Design, 162: 300–310

[14]	Feng S X, Chen Z Y, Luo H, Wang S Y, Zhao Y F, Liu L P, Ling D S, Jing L J, (2021). Tunnel boring machines (TBM) performance prediction: A case study using big data and deep learning. Tunnelling and Underground Space Technology, 110: 103636

[15]	Feng X, Wang P, Liu S, Wei H, Miao Y, Bu S, (2022). Mechanism and law analysis on ground settlement caused by shield excavation of small-radius curved tunnel. Rock Mechanics and Rock Engineering, 55(6): 3473–3488

[16]	Gao X, Shi M, Song X, Zhang C, Zhang H, (2019). Recurrent neural networks for real-time prediction of TBM operating parameters. Automation in Construction, 98: 225–235

[17]	Han X, Zhang F, He Y, Zhong H, (2021). Research on deformation treatment and control technology of tail shield of underwater large diameter slurry shield. In: IOP Conference Series: Earth and Environmental Science. IOP Publishing, 783: 012027

[18]	Han Y, (2014). Selection of construction parameters for general segment of shield tunnel. Theoretical Research in Urban Construction (electronic version): 3446–3447

[19]	He C, Xu C, Xiong D, (2021). Development of an automatic measurement device for double laser shield tail gap based on image recognition technology. In: IOP Conference Series: Earth and Environmental Science. IOP Publishing, 783: 012070

[20]	Lin P, Zhang L, Tiong R, (2023). Soil chamber pressure forecasting in tunnel construction using physics-informed deep learning. Available at SSRN 4614636

[21]	Liu J, Bao Z, (2021). Construction technology of shield tunnel in extremely-soft fluid plastic stratum. Tunnel Construction, 41: 527–532

[22]	Liu J, Shi C, Wang Z, Lei M, Zhao D, Cao C, (2021). Damage mechanism modelling of shield tunnel with longitudinal differential deformation based on elastoplastic damage model. Tunnelling and Underground Space Technology, 113: 103952

[23]	Liu L, Zhou W, Gutierrez M, (2023). Physics-informed ensemble machine learning framework for improved prediction of tunneling-induced short-and long-term ground settlement. Sustainability, 15(14): 11074

[24]	Lou P, Li Y, Tang X, Lu S, Xiao H, Zhang Z, (2023). Influence of double-line large-slope shield tunneling on settlement of ground surface and mechanical properties of surrounding rock and segment. Alexandria Engineering Journal, 63: 645–659

[25]	Mahmoodzadeh A, Mohammadi M, Hashim Ibrahim H, Nariman Abdulhamid S, Farid Hama Ali H, Mohammed Hasan A, Khishe M, Mahmud H, (2021). Machine learning forecasting models of disc cutters life of tunnel boring machine. Automation in Construction, 128: 103779

[26]	Pan Y, Fu X, Zhang L, (2022). Data-driven multi-output prediction for TBM performance during tunnel excavation: An attention-based graph convolutional network approach. Automation in Construction, 141: 104386

[27]	Phoon K K, Zhang W, (2023). Future of machine learning in geotechnics. Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 17: 7–22

[28]	Raissi M, Perdikaris P, Karniadakis G E, (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378: 686–707

[29]	Rao C, Sun H, Liu Y, (2020). Physics-informed deep learning for incompressible laminar flows. Theoretical & Applied Mechanics Letters, 10(3): 207–212

[30]	Roy A M, Bose R, Sundararaghavan V, Arróyave R, (2023). Deep learning-accelerated computational framework based on physics informed neural network for the solution of linear elasticity. Neural Networks, 162: 472–489

[31]	Shaban W M, Elbaz K, Zhou A, Shen S L, (2023). Physics-informed deep neural network for modeling the chloride diffusion in concrete. Engineering Applications of Artificial Intelligence, 125: 106691

[32]	Shin S, Lee Y, Kim M, Park J, Lee S, Min K, (2020). Deep neural network model with Bayesian hyperparameter optimization for prediction of NO_x at transient conditions in a diesel engine. Engineering Applications of Artificial Intelligence, 94: 103761

[33]	Sun L, Zhuang Q, (2016). Experimental study on the measurement device of shield tail gap. Modern Tunnelling Technology, 53: 56–59

[34]	Tahmassebi A, Motamedi M, Alavi A H, Gandomi A H, (2022). An explainable prediction framework for engineering problems: Case studies in reinforced concrete members modeling. Engineering Computations, 39(2): 609–626

[35]	Tang L, Kong X, Ling X, Zhao Y, Tang W, Zhang Y, (2022). Deviation correction strategy for the earth pressure balance shield based on shield-soil interactions. Frontiers of Mechanical Engineering, 17(2): 20

[36]	Verma S, Pant M, Snasel V, (2021). A comprehensive review on NSGA-II for multi-objective combinatorial optimization problems. IEEE Access: Practical Innovations, Open Solutions, 9: 57757–57791

[37]	VMT (2021). Automatic Tailskin Clearance Measurement System SLuM Ultra. Available at the website of enzank.com

[38]	WMG (2021). Infrastructure dragon! The “Wuhan-made” super-large diameter tunnel boring machine “Jiangcheng Pioneer” rolled off the production line. Available at the website of baijiahao.baidu.com

[39]	Xu Z, Wang W, Lin P, Nie L, Wu J, Li Z, (2021). Hard-rock TBM jamming subject to adverse geological conditions: Influencing factor, hazard mode and a case study of Gaoligongshan Tunnel. Tunnelling and Underground Space Technology, 108: 103683

[40]	Yang H, (2021). Computer vision shield interval automatic detection device development. Building Technology, 5: 24–27

[41]	Yavas M S, Gao Z, Mekaoui N, Saito T, (2023). A machine learning-based hybrid seismic analysis of a lead rubber bearing isolated building specimen. Soil Dynamics and Earthquake Engineering, 174: 108217

[42]	Yin X, Liu Q, Huang X, Pan Y, (2022). Perception model of surrounding rock geological conditions based on TBM operational big data and combined unsupervised-supervised learning. Tunnelling and Underground Space Technology, 120: 104285

[43]	Yu W, (2022). Analysis of shield tail clearance control. Journal of East China Jiaotong University, 39: 47–53 (in Chinese)

[44]	Zhang L, Guo J, Fu X, Tiong R L K, Zhang P, (2024a). Digital twin enabled real-time advanced control of TBM operation using deep learning methods. Automation in Construction, 158: 105240

[45]	Zhang L, Lin P, (2021). Multi-objective optimization for limiting tunnel-induced damages considering uncertainties. Reliability Engineering & System Safety, 216: 107945

[46]	Zhang L, Wang Y, Fu X, Song X, Lin P, (2024b). Geological risk prediction under uncertainty in tunnel excavation using online learning and hidden Markov model. Frontiers of Engineering Management, 1–20

[47]	Zhang P, Li H, Ha Q P, Yin Z-Y, Chen R-P, (2020a). Reinforcement learning based optimizer for improvement of predicting tunneling-induced ground responses. Advanced Engineering Informatics, 45: 101097

[48]	Zhang P, Wu H N, Chen R P, Dai T, Meng F Y, Wang H B, (2020b). A critical evaluation of machine learning and deep learning in shield-ground interaction prediction. Tunnelling and Underground Space Technology, 106: 103593

[49]	Zhang Z, Pan Q, Yang Z, Yang X, (2023). Physics-informed deep learning method for predicting tunnelling-induced ground deformations. Acta Geotechnica, 18(9): 4957–4972

[50]	Zhou H, Wang H, Zeng W, (2018). Smart construction site in mega construction projects: A case study on island tunneling project of Hong Kong−Zhuhai−Macao Bridge. Frontiers of Engineering Management, 5(1): 78–87

RIGHTS & PERMISSIONS

Higher Education Press
