1. Department of Geotechnical Engineering, College of Civil Engineering, Tongji University, Shanghai 200092, China
2. Key Laboratory of Road and Traffic Engineering of Ministry of Education, Tongji University, Shanghai 201804, China
mengbo_liu@tongji.edu.cn
History
Received: 2022-07-22
Accepted: 2022-12-01
Published: 2023-07-15
Issue Date
Revised Date
2023-04-04
Abstract
The moving trajectory of the pipe-jacking machine (PJM), which primarily determines the end quality of jacked tunnels, must be controlled strictly during the entire jacking process. Developing prediction models to support drivers in performing rectifications in advance can effectively avoid considerable trajectory deviations from the designed jacking axis. Hence, a gated recurrent unit (GRU)-based deep learning framework is proposed herein to dynamically predict the moving trajectory of the PJM. In this framework, operational data are first extracted from a data acquisition system; subsequently, they are preprocessed and used to establish GRU-based multivariate multistep-ahead direct prediction models. To verify the performance of the proposed framework, a case study of a large pipe-jacking project in Shanghai and comparisons with other conventional models (i.e., long short-term memory (LSTM) network and recurrent neural network (RNN)) are conducted. In addition, the effects of the activation function and input time-step length on the prediction performance of the proposed framework are investigated and discussed. The results show that the proposed framework can dynamically and precisely predict the PJM moving trajectory during the pipe-jacking process, with a minimum mean absolute error and root mean squared error (RMSE) of 0.1904 and 0.5011 mm, respectively. The RMSE of the GRU-based models is lower than those of the LSTM- and RNN-based models by 21.46% and 46.40% at the maximum, respectively. The proposed framework is expected to provide an effective decision support for moving trajectory control and serve as a foundation for the application of deep learning in the automatic control of pipe jacking.
1 Introduction
Pipe jacking is one of the most popular trenchless technologies used to construct new pipelines and tunnels below the ground surface in oil and gas, water supply, sewage, communication and electricity pipelines, and pipe-roof projects owing to its minimal impact on surface transportation and the surrounding structures [1–8]. In the pipe-jacking process, strict control of the moving trajectory of the pipe-jacking machine (PJM) is crucial and directly determines the construction quality of pipe-jacking tunnels. Considerable trajectory deviation from the designed jacking axis will affect the reception of the PJM and the assembly of pipe segments, causing tunnel inclination and affecting the performance during the operational period [9].
Currently, the moving trajectory is controlled based on the measured trajectory deviation and deviation-guided rectifications performed by the PJM driver. Because the control process is designed to impose effects after the deviation, untimely control can occur [10], which results in the snake-like motion of the PJM [11]. Hence, if the moving trajectory can be predicted to support the driver in performing adjustments in advance, then the shortcomings of the conventional measured deviation-guided rectifications can be effectively addressed. Therefore, an effective and robust model must be developed to dynamically predict the trajectory of the PJM during pipe jacking.
Several researchers have attempted to investigate the mechanism of pipe jacking using theoretical and experimental methods; however, they have primarily focused on analyzing the factors affecting the jacking force [12–24] as well as lubrication measures to decrease friction resistance [18,25–30]. The prediction of the moving trajectory of the PJM has not been conducted sufficiently. The moving trajectory of the PJM is affected by several complicated coupling factors, such as the unevenness of external forces at the excavation face and the periphery of the PJM, which is induced by different geological conditions; the uneven mass distribution of the PJM; the highly complex operation of the PJM; and the assembly errors of pipe segments, all of which hinder the development of accurate theoretical or numerical methods. Therefore, the development of an effective prediction model remains challenging.
In recent years, deep learning has been widely employed to address complex geotechnical engineering issues, such as predicting shield movement performance [31–33] and analyzing the energy consumption of shield tunnelling machines [34], owing to its capability in establishing connections among different groups of data [35,36]. During the pipe-jacking process, the PJM generates a significant amount of data containing hundreds of parameters at a high frequency. Specifically, these data provide a considerable amount of information that characterizes the pipe-jacking behavior, which provides a solid foundation for establishing data-driven models. Deep learning is indispensable for predicting the moving trajectory of the PJM, particularly when large volumes of data are involved [37]. Among the various deep learning algorithms, the gated recurrent unit (GRU) network has garnered increasing attention owing to its ability in predicting time series. In fact, it has been successfully applied for predicting the deformation of braced excavations [38], tunnelling-induced ground settlements [39], and the cutter wear of shield machines [40]. As such, this study is motivated to leverage the GRU network to predict the moving trajectory of the PJM.
To solve the existing problems, a GRU-based deep learning framework for the dynamic prediction of moving trajectory in pipe jacking is proposed herein. First, operational data are extracted from a data acquisition system and then preprocessed via standstill deletion, outlier elimination, denoising, and normalization. Subsequently, a GRU-based multivariate multistep-ahead direct prediction model is developed to dynamically predict the future moving trajectory using the historical data of the moving trajectory and other relevant covariates. The proposed framework is validated based on a pipe-jacking tunnel case study in Shanghai, China. In addition, the accuracy of the developed GRU-based prediction models is compared with that of existing conventional models, and the effects of different activation functions and time-step lengths are investigated. Finally, the potential implications and limitations of the proposed framework are discussed.
2 Background
2.1 Pipe-jacking operation
Pipe jacking is a typical trenchless technology used to construct pipelines and underpasses. In the pipe-jacking operation, the pipe segments are driven successively from the launching shaft to the receiving shaft by the PJM and the jacking system. The PJM excavates soil ahead of the pipe segments, whereas the jacking system, which typically comprises several hydraulic jacks, provides thrust. Meanwhile, the slurry system provides different types of bentonite or polymers to lubricate and decrease the friction between the pipe segments and the soil. A diagram of the pipe-jacking operation is illustrated in Fig.1. As shown, the unique characteristic of the pipe-jacking operation is that all the pipe segments move forward along with the PJM, whereas in shield tunnelling the linings remain stationary as the shield machine moves forward. Hence, the moving trajectory of the PJM is affected more significantly by the rear-installed structure than is that of a shield tunnel boring machine (TBM).
2.2 Moving trajectory of pipe-jacking machine
Similar to the TBM, the PJM generally exhibits three different movements when jacking in the ground: pitch, yaw, and roll [41]. In theoretical studies, the position of a certain point (typically the center point of the cutterhead) on a TBM, as well as the pitch, yaw, and roll angles, are used to describe the moving trajectory of the TBM [42–44]. In practical engineering, however, the following six key parameters, i.e., the horizontal deviation of the jacking machine head (JMH-HD), vertical deviation of the jacking machine head (JMH-VD), horizontal deviation of the jacking machine tail (JMT-HD), vertical deviation of the jacking machine tail (JMT-VD), pitch, and roll angles are monitored to present the moving trajectory of the PJM [10,45], as shown in Fig.2.
2.3 Time series prediction
From an analytical perspective, data generated from a PJM can be regarded as a time series (i.e., a sequence of sampled data points obtained from a continuous, real-valued process over time) [46]. Therefore, the prediction of the moving trajectory of the PJM can be regarded as a time-series prediction problem.
Time-series prediction methods can be categorized into two types: one predicts the variable y using previous values of y (i.e., univariate prediction), and the other incorporates additional information of other relevant covariates in the prediction process (i.e., multivariate prediction) [47]. A model for multivariate prediction can be formulated as a function F that uses the inputs of variable y with its previous values up to the current time t and a number C of covariates with their previous values up to the current time t. The model outputs a future window of size H for the y variable, as expressed in Eq. (1) [47]:
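$$\{y_{t+1}, \ldots, y_{t+H}\} = F\left(y_t, y_{t-1}, \ldots, y_{t-L_0+1}, \mathbf{c}_1, \ldots, \mathbf{c}_C\right), \tag{1}$$
where $\mathbf{c}_i = (c_{i,t}, c_{i,t-1}, \ldots, c_{i,t-L_i+1})$ represents the previous values of the covariate $i$. The sum of all the input sizes is expressed as $L = \sum_{i=0}^{C} L_i$. The multivariate prediction method typically yields better results than the univariate prediction method; therefore, the multivariate prediction method was used in this study.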
Depending on the size of the future window $H$, the prediction methods can be classified as one-step-ahead prediction (if the model predicts only the next time step, i.e., $H = 1$) and multistep-ahead prediction (if the model predicts a future window of size $H$, where $H > 1$) [48]. The multistep-ahead prediction can be classified into multistep-ahead recursive prediction (which predicts the future window recursively) and multistep-ahead direct prediction (which predicts the future window directly in one step) [48]. The multistep-ahead direct prediction method can learn the stochastic dependency between future values and incurs a lower computational cost compared with the multistep-ahead recursive prediction [47,48]. Therefore, a multivariate multistep-ahead direct prediction method was adopted in this study to predict the moving trajectory of the PJM.
3 Methodology
3.1 Overview
To achieve dynamic multivariate multistep-ahead direct prediction of the moving trajectory of the PJM during pipe jacking, a GRU-based deep learning framework is proposed herein. Fig.3 illustrates the flowchart of the developed framework, which incorporates three main steps, as described below.
1) Data extraction
A PJM integrates mechanical, electrical, automatic, control, and other multidisciplinary technologies; thus, it is a highly mechanized, automated, and intelligent excavation system [10]. During pipe jacking, this complex system generates a significant amount of data containing hundreds of parameters. The data used in this study were extracted from the data acquisition system of a PJM in the form of Excel files for each ring. These files contain all the information associated with pipe jacking, including the jacking time and distance, key excavation operational parameters, and moving trajectory parameters.
2) Data preprocessing
The data extracted from the data acquisition system of a PJM contain invalid information, including zero-value cells, errors, and noise. Therefore, data preprocessing, including standstill deletion, outlier elimination, denoising, and data normalization, was performed to cleanse and reframe the data used to establish a prediction model. Owing to its capability in inferring actual values based on time series containing noise and other inaccuracies [49], the Kalman filter was adopted in the denoising process to remove noise and measurement errors.
3) Prediction
The preprocessed data were input to the GRU network to establish multivariate multistep-ahead direct prediction models. The GRU network was adopted owing to its capability in addressing short-term memory problems and its advantages over the long short-term memory (LSTM) network, as follows: 1) the GRU requires fewer parameters and is less susceptible to overfitting; 2) the GRU is less computationally expensive; and 3) the improvement in the gate operations of the GRU can remove unnecessary information from previous time steps more effectively.
The pseudo code of the proposed framework is shown below (Algorithm 1).
3.2 Kalman filter
The Kalman filter [50] is a recursive filter that can be used to effectively estimate the state of a dynamic system based on system observations with noise; it has been widely applied in the fields of trajectory estimation, state prediction for control or diagnosis, data merging, etc. [51]. Although the original Kalman filter can only be applied to linear stochastic processes, it has been proven to provide an optimal linear estimate because it considers the estimate at the previous moment. The Kalman filter comprises two main processes: prediction and updating. In the prediction step, the predicted value at the current moment is obtained based on the actual value at the previous moment. In the update step, the true value is inferred from the observed and predicted values at the current moment. Their equations are provided below [52].
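The prediction step is expressed as
$$\hat{x}_k^- = A\,\hat{x}_{k-1}. \tag{2}$$
The update step is expressed as
$$\hat{x}_k = \hat{x}_k^- + K_k\left(z_k - H\,\hat{x}_k^-\right), \tag{3}$$
where $\hat{x}_k$ and $\hat{x}_{k-1}$ are the inferred actual values at moments $k$ and $k-1$, respectively; $\hat{x}_k^-$ is the predicted value at moment $k$; $z_k$ is the observed value at moment $k$; and $K_k$, $A$, and $H$ denote the Kalman gain, transition matrix, and observation matrix, respectively.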
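The prediction and update steps can be illustrated with a minimal scalar sketch. It is not the implementation used in this study: the function name `kalman_filter_1d`, the noise variances `Q` and `R`, the initial variance `P0`, and the synthetic constant signal are all assumptions chosen for illustration.

```python
import numpy as np

def kalman_filter_1d(z, A=1.0, H=1.0, Q=1e-4, R=1e-1, P0=1.0):
    """Filter a noisy 1-D series z with a scalar Kalman filter.

    A, H: transition and observation coefficients (scalars here).
    Q, R: process and measurement noise variances (assumed values).
    """
    z = np.asarray(z, dtype=float)
    x = z[0]          # initial state estimate (assumption: first observation)
    P = P0            # initial estimate variance
    out = np.empty_like(z)
    for k, zk in enumerate(z):
        # Prediction step: propagate the previous estimate.
        x_pred = A * x
        P_pred = A * P * A + Q
        # Update step: blend prediction and observation via the Kalman gain.
        K = P_pred * H / (H * P_pred * H + R)
        x = x_pred + K * (zk - H * x_pred)
        P = (1.0 - K * H) * P_pred
        out[k] = x
    return out

# Synthetic example: a constant signal corrupted by Gaussian noise.
rng = np.random.default_rng(0)
truth = np.full(200, 5.0)
noisy = truth + rng.normal(0.0, 0.5, size=200)
filtered = kalman_filter_1d(noisy)
```

A larger `R` relative to `Q` makes the filter trust the prediction more and smooths the series more strongly; in practice both values would need to be tuned for each operational parameter.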
3.3 GRU cell
Recurrent neural networks (RNNs) are known for their ability in processing time-series data. This capability is enabled by the hidden state, which learns hidden information from previous time-series data and is continuously updated to adapt to new inputs [53]. However, classical RNNs cannot adequately address unstable gradients and short-term memory problems [54]. Hence, the LSTM [55] cell was proposed and has been widely used owing to its effectiveness. However, owing to the complexity of the structure of LSTM cells, a high computational cost is incurred and overfitting is likely to occur. The GRU cell [56] is an improved version of the LSTM cell; it improves the gate operations while retaining the capability to manage short-term memory problems. The architecture of the GRU cell is shown in Fig.4. Unlike LSTM, the GRU cell performs two types of gate operations, which are performed by an update gate $z_t$ and a reset gate $r_t$. The current hidden state $h_t$ is calculated based on the current input $x_t$ and the hidden state of the previous time step $h_{t-1}$.
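The update gate is expressed as
$$z_t = \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right). \tag{4}$$
The reset gate is expressed as
$$r_t = \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right). \tag{5}$$
The current hidden state is calculated as
$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tanh\left(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\right). \tag{6}$$
In Eqs. (4)–(6), $W_z$, $W_r$, and $W_h$ denote the weight matrices of each fully connected layer for their connection to the input vector $x_t$; $U_z$, $U_r$, and $U_h$ are the weight matrices of each fully connected layer for their connection to the previous hidden state $h_{t-1}$; $b_z$, $b_r$, and $b_h$ are the bias terms for each fully connected layer; and the symbol $\odot$ represents element-wise multiplication. Meanwhile, $\sigma$ and tanh represent the logistic sigmoid and hyperbolic tangent functions, respectively, which are defined as
$$\sigma(x) = \frac{1}{1 + e^{-x}}, \tag{7}$$
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. \tag{8}$$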
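The gate operations of a GRU cell can be sketched as a single forward step in NumPy. This is an illustrative sketch rather than the Keras implementation used in the study: the function `gru_step`, the parameter dictionary layout, the hidden size of 8, and the random weights are assumptions, and the convention in which the update gate weights the previous hidden state is assumed as well.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, p):
    """One GRU time step: update gate, reset gate, new hidden state."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])        # update gate
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])        # reset gate
    g = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev) + p["bh"])  # candidate state
    return z * h_prev + (1.0 - z) * g

rng = np.random.default_rng(0)
n_in, n_hidden = 20, 8   # 20 input features; the hidden size is an assumption
params = {}
for gate in "zrh":
    params["W" + gate] = rng.normal(0.0, 0.1, (n_hidden, n_in))
    params["U" + gate] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
    params["b" + gate] = np.zeros(n_hidden)

# Run the cell over a random sequence of 60 time steps.
h = np.zeros(n_hidden)
for x_t in rng.normal(size=(60, n_in)):
    h = gru_step(x_t, h, params)
```

Because each new hidden state is a convex combination of the previous state and a tanh output, every component of `h` stays strictly inside (-1, 1).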
4 Application to case study
4.1 Brief project description
To validate the proposed data-driven framework, the pipe-jacking tunnel of a metro station, namely the Jing’an Temple station, located in Shanghai, China, was selected for a case study. The Jing’an Temple station on Shanghai Rail Transit Line 14 is the first station that uses jacking pipes as the main structure for a metro station in China. Owing to the restrictions of ground traffic and existing structures, a composite structural form was adopted for the Jing’an Temple station, as illustrated in Fig.5. The main structures can be classified into three zones, i.e., Zones A, B, and C. Zone B is composed of three pipe-jacking tunnels, i.e., a station hall (SH) tunnel and two station platform (SP) tunnels. The data used in this study were obtained during the SH tunnel excavation process.
According to a geological investigation report, the construction site is located in the ancient river distribution area of Shanghai. The ground up to a depth of 80.37 m contains Quaternary Late Pleistocene and Holocene sediments, which primarily include mucky, silty, and sandy soils. These soils are saturated, flow-plastic to soft-plastic clays with high compressibility and sensitivity, low strength, long stabilizing time, and significant settlement after being disturbed [57]. Fig.6 shows the longitudinal geological profile of the study site. Samples obtained from nine boreholes indicate that the station structure is primarily located in areas with muddy silty clay, muddy clay, and clay, and the SH tunnel primarily passes through muddy silty clay. In the field, 42 standard penetration tests, 12 static cone penetration tests, and two vane shear tests were conducted, and 141 undisturbed soil samples were obtained. Conventional laboratory tests, i.e., direct shear, unconfined compressive strength, triaxial, and particle analysis tests, as well as measurements of density, water content, and void ratio, were performed on the undisturbed soil samples. The detailed soil properties are available in Ref. [9]. The muddy silty clay layer featured a natural water content of 42.5%, a void ratio of 1.201, and a compressive modulus of 2.88 MPa. In addition, it exhibited prominent structural characteristics with a sensitivity value exceeding 4.0, which implies that the strength of the soil will decrease significantly after it is disturbed by construction operations.
The SH tunnel featured a total length of 82 m and comprised 55 pipe segments. The pipe segment featured a quasi-rectangular cross-section with a width of 9.5 m and a height of 4.88 m, and its coverage depth was 4.6 m. Considering the geological conditions, an earth-pressure-balance PJM was adopted for SH tunnel excavation. The PJM measured 9.53 m wide and 4.91 m high and featured a composite cutter plate. The cutterhead of the PJM is shown in Fig.7, and the main design parameters are listed in Tab.1. The two large cutters feature a diameter of 4.675 m and form the primary cutting area, whereas the four small cutters feature diameters of 2.3, 2.1, 1.7, and 1.7 m, respectively, and form the remaining cutting area. Because the large-cross-section PJM was jacked in muddy silty clay, which exhibited high compressibility and low strength, sinking of the PJM head and significant fluctuations in the posture can easily occur, thus rendering it difficult to predict and control the moving trajectory of the PJM.
4.2 Data extraction
To construct the SH tunnel of the Jing’an Temple station, a PJM was launched on April 8, 2021, and received on May 28, 2021. Data were obtained during the jacking process at a frequency of 1 Hz, including 953 parameters and 132471 rows. The parameters included value parameters, such as the cutter torque and total thrust, as well as state parameters that indicate the on–off state of the components in the PJM. The information contained in the data mathematically describes the state of the PJM and the surrounding geological conditions.
Raw data were obtained from ring Nos. 1 to 55 during the entire jacking process. Notably, the soil around the jacking exit and entrance was strengthened by a vertical metro jet system. The jacking in the reinforcement zone differed significantly from that in the normal zone. Therefore, only the data from ring Nos. 7 to 44 (normal jacking zone), i.e., a total of 38 rings comprising more than 75000 rows, were used in this study.
The aim of this study is to dynamically predict the moving trajectory of the PJM. During the jacking process, the JMH-HD, JMH-VD, JMT-HD, and JMT-VD were the most important and direct parameters that determine the moving trajectory of the large-cross-section rectangular PJM. Therefore, they were selected as the output parameters for the prediction models. The pitch and yaw of the PJM were not included because they can be reflected by the deviations of the four parameters above, and the roll was not noticeable during the jacking process owing to the large rectangular cross-section of the PJM. Fig.8 shows the four abovementioned output parameters of one ring, where step characteristics are presented in the deviation curves.
The appropriate input parameters must be selected before establishing the models. Factors affecting the moving trajectory of the PJM can be classified into four categories: man-made control, mechanical passive response, historical trend, and geological conditions [58]. Therefore, the correlative main operational parameters, i.e., the total thrust, advance rate, rotation angle, jacking stroke, torque, and rotation speed of the two large cutters; the torque and rotation speed of the four small cutters; and the historical values of the JMH-HD, JMH-VD, JMT-HD, and JMT-VD were selected as input parameters, as listed in Tab.2. Geological conditions were not included in the input parameters of the algorithm of the proposed framework primarily because of the following considerations.
1) As depicted in Fig.6, the geological conditions remained unchanged during the short 82 m distance of pipe jacking. Thus, the effects of geological conditions on the moving trajectory of the PJM were insignificant in the data used to establish the models.
2) The mechanical passive response parameters, such as the thrust and torque of the cutters, can indirectly reflect the geological information of the excavated section to some extent.
3) Compared with the automatically acquired operational parameters of the PJM, the geological data were sparse.
4.3 Data preprocessing pipeline
In machine learning, the process from raw data to the final interpretation is referred to as the “pipeline” [54,59]. The data preprocessing pipeline used in this study included four individual steps: 1) standstill deletion, 2) outlier elimination, 3) denoising, and 4) data normalization. The data preprocessing steps are described in detail herein using the torque of the large cutter 1 (Tol1) as an example. The pipeline was implemented using the programming language Python and several libraries.
4.3.1 Standstill deletion
Owing to the discontinuous pipe-jacking process, significant amounts of empty data were generated when the operation of the PJM was halted, i.e., during standstill. Fig.9 shows the standstill of Tol1. As standstill is not an actual jacking phase, all instances with standstills in the torque, total thrust, advance rate, jacking stroke, and rotation speed were deleted.
4.3.2 Outlier elimination
Owing to sensor malfunction, the operational parameters in the extracted data contained some outliers. Similar to previous studies [59–61], the Mahalanobis distance ($D_M$) [62] was computed to detect and eliminate the outliers. $D_M$, which represents the covariance distance of the data, is used to correct the Euclidean distance. It considers the relationship between various characteristics and is scale-invariant. For an instance $\mathbf{x} = (x_1, x_2, \ldots, x_n)^{\mathrm{T}}$, $D_M$ can be calculated using Eq. (9).
$$D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^{\mathrm{T}}\, \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}, \tag{9}$$
where $\boldsymbol{\mu}$ is the mean of $\mathbf{x}$ and $\boldsymbol{\Sigma}$ is the covariance matrix. The 85th percentile (p85) of the chi-square distribution is specified as the cutoff value $c$, and instances with $D_M$ outside of $c$ are considered outliers and thus removed. Fig.10 presents the outliers of Tol1 detected using $D_M$.
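The outlier rule can be sketched in NumPy as follows. This is a hedged illustration, not the study's code: it assumes that the squared distance is compared against the chi-square quantile with as many degrees of freedom as features, and it uses synthetic two-feature data so that the quantile has a closed form. The helper names `mahalanobis_sq` and `chi2_quantile_2dof` are hypothetical; for more features, a quantile routine such as `scipy.stats.chi2.ppf` could supply the cutoff.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row of X to the sample mean."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

def chi2_quantile_2dof(p):
    """Closed-form chi-square quantile, valid for 2 degrees of freedom only."""
    return -2.0 * np.log(1.0 - p)

# Synthetic data: 1000 two-feature instances with five injected gross outliers.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 2))
X[:5] += 10.0

d2 = mahalanobis_sq(X)
cutoff = chi2_quantile_2dof(0.85)   # p85 cutoff on the squared distance
keep = d2 <= cutoff
X_clean = X[keep]
```

A p85 cutoff is fairly aggressive: for Gaussian data it discards roughly 15% of instances by construction, so it trades some valid data for a cleaner training set.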
4.3.3 Denoising
Because of the geological environment and sensor accuracy, the observed values of the operational parameters contained noise [63]. Therefore, the observed values do not represent the actual values. A Kalman filter was used to remove the noise and measurement error. Fig.11 shows the observed values of Tol1 and the smoothed values obtained after applying the Kalman filter. Tab.2 shows the statistical characteristics of the operational parameters after the abovementioned data preprocessing steps were performed.
4.3.4 Data normalization
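Data normalization unifies the data to the same order of magnitude, eliminates the effect of dimensionality, and improves the convergence speed and performance of the prediction model. Because the outliers were eliminated, min–max normalization (as expressed in Eq. (10)) was performed to preserve the original data distribution.
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \tag{10}$$
where $x$ and $x'$ are the original and normalized values, respectively, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the parameter, respectively.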
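A minimal sketch of min–max normalization and its inverse is given below. The helper names and the small example array are hypothetical; computing the minimum and maximum on the training data only, and reusing them for test data and for mapping predictions back to physical units, is an assumption about good practice rather than a detail reported here.

```python
import numpy as np

def minmax_fit(X):
    """Return per-column minimum and maximum (computed on training data)."""
    X = np.asarray(X, dtype=float)
    return X.min(axis=0), X.max(axis=0)

def minmax_transform(X, lo, hi):
    """Scale each column to [0, 1] using the fitted bounds."""
    return (np.asarray(X, dtype=float) - lo) / (hi - lo)

def minmax_inverse(Xn, lo, hi):
    """Map normalized values back to the original units."""
    return np.asarray(Xn, dtype=float) * (hi - lo) + lo

# Two hypothetical parameters with very different magnitudes.
train = np.array([[100.0, 2.0], [300.0, 4.0], [200.0, 8.0]])
lo, hi = minmax_fit(train)
train_n = minmax_transform(train, lo, hi)
```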
4.4 Implementation of prediction models
This section describes the prediction model establishment process, the selection of performance evaluation metrics, and the hyperparameter fine-tuning process. All experimental tests were performed using the Jupyter Notebook, the deep learning toolkit Keras (TensorFlow as the backend), and the scikit-learn library on a computer with an Intel Xeon Silver 4215R 3.20 GHz CPU, 32 GB of memory, and NVIDIA Quadro RTX 4000 graphics card.
4.4.1 Establishment of prediction models
The GRU network was adopted to establish multivariate multistep-ahead direct prediction models for the moving trajectory. To achieve the best performance, four models were developed to predict the JMH-HD, JMH-VD, JMT-HD, and JMT-VD. For each prediction model, for example the JMH-HD, the previous values of the 19 covariates and JMH-HD were the input variables, and the future values of the JMH-HD were the output variables, as shown in Tab.2.
In this study, the prediction models were developed to predict the future 10 time steps based on the previous 60 time steps of the moving trajectory of the PJM. The fixed-length sliding time window approach [64] was used to construct X and y pairs for model training and testing (Fig.12). A sliding window with a length of 70 time steps was implemented on the training data to obtain training instances with X and y, i.e., the previous 60 and next 10 time steps, respectively. Subsequently, the sliding window was moved forward in strides of 10 time steps to construct the training data set. The test data set was constructed in the same manner. Consequently, 4672 data set instances were created, among which approximately 80% were training data sets and the remaining 20% were test data sets.
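The sliding-window construction can be sketched as follows, assuming 60 input steps, 10 output steps, and a stride of 10 as described above. The function name `make_windows` and the synthetic array of 1000 time steps with 20 features are illustrative assumptions.

```python
import numpy as np

def make_windows(features, target, n_in=60, n_out=10, stride=10):
    """Build (X, y) pairs: X holds n_in past steps of all features,
    y holds the next n_out steps of the target."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X, y = [], []
    last_start = len(features) - (n_in + n_out)
    for s in range(0, last_start + 1, stride):
        X.append(features[s : s + n_in])
        y.append(target[s + n_in : s + n_in + n_out])
    return np.stack(X), np.stack(y)

# Synthetic series: 1000 time steps, 20 input features.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 20))
target = features[:, 0]   # assumption: the target's history is also an input column
X, y = make_windows(features, target)
print(X.shape, y.shape)   # (94, 60, 20) (94, 10)
```

With a stride equal to the output length, consecutive target windows tile the series without overlap, whereas the input windows overlap by 50 of their 60 steps.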
4.4.2 Performance evaluation metrics
Moving trajectory prediction is a regression problem. Thus, two typically used metrics in regression problems, i.e., the mean absolute error (MAE) and root mean squared error (RMSE), were adopted to evaluate the model performance. The MAE and RMSE are expressed as follows:
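$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|, \tag{11}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}, \tag{12}$$
where $n$ is the number of data instances, $\hat{y}_i$ the predicted value of the $i$th data instance, and $y_i$ the real value of the $i$th data instance.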
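The two metrics can be computed in a few lines of NumPy. The example values are invented purely to show that a single large error raises the RMSE more than the MAE.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_pred) - np.asarray(y_true))))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(y_pred) - np.asarray(y_true)) ** 2)))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 8.0])   # one prediction is off by 4
print(mae(y_true, y_pred), rmse(y_true, y_pred))   # 1.0 2.0
```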
4.4.3 Hyperparameter fine-tuning
Hyperparameter fine-tuning is required when implementing deep learning models because the best combination of hyperparameters yields the optimal performance of the prediction model [65]. In this study, a grid search algorithm [66] with threefold cross-validation [67] was adopted to optimize the hyperparameters.
Cross-validation is typically performed to evaluate model performance [60,68–70]. In K-fold cross-validation, the training data set is segmented into K equal-sized subsets, i.e., $S_1, S_2, \ldots, S_K$. During each cycle, one of the subsets is used for validation, whereas the remaining $K-1$ subsets are used for training. This process is repeated K times and the final result is calculated as the arithmetic average of all results. Considering the large size of the training data set, the value of K was set to 3 to balance between computational cost and effectiveness. Fig.13 illustrates the threefold cross-validation process.
Instead of performing a manual search, a grid search algorithm was used to determine the optimal combinations of hyperparameters for the prediction models. The number of GRU layer units and the learning rate were considered as key hyperparameters; the tuning ranges and intervals of these hyperparameters are listed in Tab.3. All possible values for each hyperparameter were combined and used in the prediction models to obtain the desired performance. The number of training epochs for the grid search was set to 200. Finally, the hyperparameter combination that achieved the best performance using the validation data set during training was regarded as the optimal combination. The optimal hyperparameters and typical settings for the four prediction models are listed in Tab.4.
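The combination of grid search and threefold cross-validation can be sketched generically. To keep the example self-contained and fast, a closed-form ridge regression with a single hyperparameter `alpha` stands in for the GRU model; it is not the model tuned in this study. The names `kfold_indices`, `grid_search`, and `ridge_mae`, the contiguous unshuffled folds, and the synthetic data are all assumptions.

```python
import itertools
import numpy as np

def kfold_indices(n, k=3):
    """Yield (train_idx, val_idx) for k contiguous, near-equal folds."""
    folds = np.array_split(np.arange(n), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

def grid_search(X, y, grid, fit_and_score, k=3):
    """Try every hyperparameter combination; return the one with the
    lowest mean validation score over the k folds."""
    best_params, best_score = None, np.inf
    keys = list(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        params = dict(zip(keys, values))
        scores = [fit_and_score(X[tr], y[tr], X[va], y[va], **params)
                  for tr, va in kfold_indices(len(X), k)]
        mean_score = float(np.mean(scores))
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

def ridge_mae(X_tr, y_tr, X_va, y_va, alpha):
    """Stand-in model: ridge regression scored by validation MAE."""
    A = X_tr.T @ X_tr + alpha * np.eye(X_tr.shape[1])
    w = np.linalg.solve(A, X_tr.T @ y_tr)
    return float(np.mean(np.abs(X_va @ w - y_va)))

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0.0, 0.1, 90)
best, score = grid_search(X, y, {"alpha": [0.01, 1.0, 100.0]}, ridge_mae)
```

For the GRU models, `fit_and_score` would instead train a network with the given number of units and learning rate and return its validation error, which makes the cost of the search grow with the product of the grid sizes and K.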
4.5 Results
The four prediction models were trained and tested on the training and test data sets, respectively, using the hyperparameter settings listed in Tab.4. The MAE loss curve of the GRU-based model for JMH-HD prediction is shown in Fig.14. As shown, the loss value decreased rapidly in the first 50 epochs and converged to the minimum with subtle local oscillations when the epochs exceeded 150. Therefore, 300 epochs were sufficient for model training. Fig.15 shows the prediction results of the JMH-HD, JMH-VD, JMT-HD, and JMT-VD on the training and test data sets for the four GRU-based models. As shown, the developed models accurately captured the trend of the future moving trajectory and precisely predicted the future values. The scatter plots in the right column show the dense distribution of the points near the reference line (prediction = actual value), which indicates satisfactory prediction performance. Tab.5 presents the performance evaluation metrics (MAE and RMSE expressed in Eqs. (11) and (12), respectively) of each prediction model for the training and test data sets. The prediction model for the JMH-VD demonstrated the best performance, with MAE and RMSE values of 0.1302 and 0.4079 on the training data set, respectively, and MAE and RMSE values of 0.1904 and 0.5011 on the test data set, respectively. The prediction errors for the test data set were higher than those for the training data set, except for the prediction model of the JMT-HD, which indicated a slightly smaller prediction error on the test data set. In general, these prediction models achieved satisfactory prediction performance for the moving trajectory of the PJM.
5 Discussion
5.1 Comparison with other models
To evaluate the prediction accuracy of the proposed GRU-based models, two widely used models for time-series prediction, i.e., LSTM and the RNN, were compared with the GRU-based models. A brief introduction of the two models is provided below.
5.1.1 LSTM
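As mentioned in Subsection 3.3, the structure of the LSTM model is more complex than that of the GRU model. An LSTM cell comprises three types of gate operations, which are performed by the forget gate $f_t$, input gate $i_t$, and output gate $o_t$, separately. The current hidden state $h_t$ is calculated as follows.
$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right), \tag{13}$$
$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right), \tag{14}$$
$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right), \tag{15}$$
$$\tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right), \tag{16}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \tag{17}$$
$$h_t = o_t \odot \tanh(c_t), \tag{18}$$
where $c_t$ and $\tilde{c}_t$ are the cell state and the candidate cell state, respectively.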
The meanings of the variables and symbols in Eqs. (13)–(18) are the same as those described in Subsection 3.3. The number of units and the learning rate used in the LSTM-based models were obtained via grid search based on the tuning ranges listed in Tab.3. The other hyperparameter settings were the same as those for the GRU-based models.
5.1.2 RNN
Unlike the mechanisms of the GRU and LSTM, the data flow in an RNN layer is not subjected to gate operations. The current hidden state $h_t$ is calculated based on the current input $x_t$ and the hidden state of the previous time step $h_{t-1}$, which is expressed as
$$h_t = \tanh\left(W x_t + U h_{t-1} + b\right), \tag{19}$$
where $W$ and $U$ are the weight matrices of the input $x_t$ and the previous hidden state $h_{t-1}$, respectively, and $b$ is the bias term. Similar to the process of the LSTM-based models, the number of units and learning rate used in the RNN-based models were obtained through grid search, and other hyperparameter settings were the same as those used in the GRU-based models.
Tab.6 lists the optimal hyperparameters for the LSTM- and RNN-based models obtained via grid search. As shown, the optimal numbers of units for the LSTM and RNN are different from that for the GRU. The RNN-based models use fewer units to achieve the best prediction performance. These different models were compared based on the test data set, and the results are shown in Tab.7. In terms of predicting the JMH-HD, JMH-VD, JMT-HD, and JMT-VD, the GRU-based models achieved the best prediction performance, whereas the RNN-based models indicated the worst. The RMSE of the GRU-based models was 21.46% and 46.40% lower at the maximum than those of the LSTM- and RNN-based models, respectively. The inferior performance of the RNN was primarily attributed to the absence of gate operations; thus, the RNN can hardly manage the short-term memory problem caused by the long input time steps, i.e., 60, in this study. The adopted GRU-based models outperformed the LSTM-based models primarily due to the improvement in the gate operations, which resulted in the effective removal of unnecessary information from previous input time steps.
5.2 Effect of activation function
The activation function is vital to the nonlinear mapping representation between the input and output data of deep neural networks [71]. To investigate the effects of activation functions on the prediction performances of the GRU-based models, a novel activation function, referred to as tanhLU [72], was introduced to perform a comparison.
The activation function tanhLU integrates tanh into a linear unit, which is expressed as follows.
$$\mathrm{tanhLU}(x) = \alpha \tanh(\beta x) + \gamma x,$$
where $\alpha$ and $\beta$ determine the scaling extent of the tangent term, whereas $\gamma$ defines the slope of the asymptotic line for the combination unit. Based on a previous study, for RNNs with more than 200 units in each layer, tanhLU with the parameter set ($\alpha$, $\beta$, $\gamma$) recommended therein is a good alternative to tanh [72]. Therefore, the GRU-based model for predicting the JMH-VD, which contains 256 units, was selected for comparison. The activation function used for the GRU-based model in this study was the hyperbolic tangent function tanh, as shown in Eq. (6). Fig.16 shows the activation values and derivatives of tanh and tanhLU based on the recommended parameter set. For tanhLU, the introduced parameter $\gamma$ keeps the derivative greater than $\gamma$ over the entire domain, whereas the parameters $\alpha$ and $\beta$ transform the shape of the tanh term, which can result in different effects on the neural networks.
The tanh and tanhLU activation functions were each used to train the GRU-based model for predicting the JMH-VD, with all other parameters fixed. Fig.17 shows the training performance of the GRU-based model with the two activation functions. The tanhLU significantly accelerated the training process when large neural networks were involved. Only 100 epochs were required for the loss value to converge to the minimum value with subtle local oscillations when tanhLU was used, whereas 200 epochs were required when the conventional tanh was used. This is primarily because the slope parameter of the linear term restricts the minimum derivative to 0.1, thus effectively avoiding the saturation caused by a vanishing derivative of the activation function. In terms of prediction performance, tanh performed slightly better than tanhLU with this parameter set. The adopted tanh therefore struck a balance between training efficiency and prediction accuracy. However, the parameter set used in this study for tanhLU may not be optimal; the performance may be further improved using optimized tanhLU parameters.
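The saturation argument can be checked numerically. The sketch below assumes the form tanhLU(x) = α·tanh(βx) + γx, which matches the description of a scaled tangent term plus a linear unit; the function names are hypothetical, and the values α = 1 and β = 1 used in the usage note are illustrative only and are not the parameter set recommended in [72]. Only γ = 0.1 is taken from the text, which states that the minimum derivative is 0.1.

```python
import math


def tanhlu(x, alpha, beta, gamma):
    """Assumed form: scaled tanh term plus a linear term of slope gamma."""
    return alpha * math.tanh(beta * x) + gamma * x


def tanhlu_derivative(x, alpha, beta, gamma):
    """d/dx of the assumed tanhLU: alpha * beta * sech^2(beta x) + gamma."""
    sech2 = 1.0 - math.tanh(beta * x) ** 2
    return alpha * beta * sech2 + gamma


def tanh_derivative(x):
    """d/dx tanh(x) = 1 - tanh(x)^2, which vanishes for large |x|."""
    return 1.0 - math.tanh(x) ** 2
```

For a strongly saturated input such as x = 50, `tanh_derivative(50.0)` is numerically zero, whereas `tanhlu_derivative(50.0, 1.0, 1.0, 0.1)` stays at about 0.1, so gradients continue to flow through the linear term.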
5.3 Effect of length of input time steps
To analyze the effects of the input time-step length on the prediction performance of the GRU-based models, different input time-step lengths (i.e., 10, 20, 30, 40, 50, 60, 70, 80, and 90) were selected to create data sets, train models, and evaluate performance. For simplicity, the model used to predict the JMH-HD was selected for training and testing on these data sets, with the hyperparameters set the same as those listed in Tab.4. A comparison of the model performances for different input time-step lengths is shown in Tab.8. The comparison shows that the input time-step length significantly affects the model performance. The prediction model achieved the best performance on the test data set, with MAE and RMSE values of 0.6459 and 1.4748, respectively, when the input time-step length was set to 60. The model performance improved as the time-step length increased because more information was incorporated into the model input. However, the improvement was limited once the input time-step length became large enough to include considerable redundant information. In addition, a significant increase in the time-step length increases the computational cost. Therefore, the input time-step length was set to 60 in this study, as mentioned in Subsubsection 4.4.1.
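Creating a data set for a given input time-step length amounts to sliding a fixed-length window over the operational records. The following is a minimal sketch under stated assumptions: the function name `make_windows`, its arguments, and the single-target direct-prediction layout are hypothetical and are not taken from the study's code.

```python
def make_windows(features, target, n_in, horizon):
    """Build (X, y) samples for direct multistep-ahead prediction.

    features : list of per-time-step feature lists
    target   : list of target values, aligned with `features`
    n_in     : input time-step length (e.g., 10, 20, ..., 90)
    horizon  : how many steps after the window end the label lies
    Each X[i] holds n_in consecutive feature rows; y[i] is the target
    value `horizon` steps after the last row of that window.
    """
    X, y = [], []
    last_start = len(features) - n_in - horizon
    for start in range(last_start + 1):
        end = start + n_in
        X.append(features[start:end])
        y.append(target[end + horizon - 1])
    return X, y
```

For a record of fixed length, a longer window leaves fewer samples: with 100 time steps and `horizon=1`, `n_in=10` yields 90 samples, whereas `n_in=90` yields only 10, and each sample is nine times larger, which illustrates the cost side of the trade-off discussed above.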
5.4 Applications of proposed framework
The results show that the proposed framework for predicting the moving trajectory of the PJM is effective. The prediction models can provide decision support for the moving-trajectory control of the PJM. Once the model predicts that the moving trajectory of the PJM will deviate significantly from the designed jacking axis, the PJM driver can decrease this deviation by adjusting the operating parameters (the input parameters of the models) in advance. For a new pipe-jacking project, the proposed framework can be used to select input variables based on the practical settings of cutting and thrust systems, preprocess operational parameters, establish prediction models, and dynamically predict the moving trajectory. However, as the PJM advances, the inherent relationship between the operational parameters and moving trajectory will change because the surrounding geological conditions are not constant. In this regard, a transfer learning strategy [73] can be adopted to update the pretrained prediction models such that new characteristics can be mapped. Moreover, automatic control technology can be further developed based on the proposed framework to replace manual control in future studies.
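The decision-support idea can be illustrated with a simple check on the predicted deviations. This is only a sketch: the function name `flag_deviations` and the tolerance value in the usage note are hypothetical, and an actual warning criterion would have to come from the project's alignment specification.

```python
def flag_deviations(predicted_mm, tolerance_mm):
    """Return the indices of predicted steps whose absolute deviation
    from the designed axis exceeds the given tolerance (both in mm)."""
    return [i for i, d in enumerate(predicted_mm) if abs(d) > tolerance_mm]
```

For example, `flag_deviations([1.0, -6.0, 4.9, 5.1], 5.0)` flags the second and fourth predicted steps, which would prompt the driver to adjust the operating parameters before those deviations occur.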
6 Conclusions
A GRU-based deep learning framework for the dynamic prediction of the moving trajectory of the PJM was proposed herein; the framework includes data extraction, preprocessing, and GRU-based dynamic prediction models. The proposed framework was validated using operational data obtained from a pipe-jacking tunnel of the Jing’an Temple station in Shanghai, China. The GRU-based models selected for this study were compared with other conventional models (LSTM and RNN), and the effects of activation functions and input time-step lengths on the prediction performance were further investigated. The following conclusions were inferred.
1) The proposed framework established multivariate multistep-ahead direct prediction models based on historical operational data to map the complex correlations between the operational parameters and moving trajectory of the PJM. The case study verified that the proposed framework can dynamically and precisely predict the moving trajectory of the PJM during the pipe-jacking process. The GRU-based models achieved minimum MAE and RMSE values of 0.1904 and 0.5011 mm on the test data set, respectively.
2) Compared with two conventional time-series prediction models, i.e., the LSTM and RNN, the GRU-based models reduced the RMSE by up to 21.46% and 46.40%, respectively. The superiority of the GRU-based models was due to their improved gate operations, which address the short-term memory problem and effectively remove unnecessary information from the previous time steps.
3) A comparison of the training performances of the GRU-based models with the activation functions tanh and tanhLU showed that the linear unit introduced in tanhLU significantly accelerated the training process, although the prediction accuracy was slightly lower. The adopted tanh struck a balance between training efficiency and prediction accuracy.
4) Among the input time-step lengths tested, a length of 60 yielded the best performance on the test data set. The model performance improved as the time-step length increased; however, the improvement was limited once the input time-step length became large enough to include considerable redundant information.
In summary, the GRU-based deep learning framework effectively predicted the moving trajectory of the PJM and can serve as a reference for other pipe-jacking projects in the future. However, this study has some limitations that must be addressed. First, for long-distance pipe-jacking projects crossing different strata, geological conditions must be considered to achieve better prediction accuracy. Second, the assembly errors of pipe segments and human interference have non-negligible effects on the moving trajectory of PJMs and must be considered in future studies. Additionally, effective control strategies must be identified and incorporated into the proposed framework to establish a complete intelligent warning and control system for pipe-jacking operations.
References
[1]
Wang J, Wang K, Zhang T, Wang S. Key aspects of a DN4000 steel pipe jacking project in China: A case study of a water pipeline in the Shanghai Huangpu River. Tunnelling and Underground Space Technology, 2018, 72: 323–332
[2]
Chen X, Ma B, Najafi M, Zhang P. Long rectangular box jacking project: A case study. Underground Space, 2021, 6(2): 101–125
[3]
Xue Z F, Cheng W C, Wang L, Song G. Improvement of the shearing behaviour of loess using recycled straw fiber reinforcement. KSCE Journal of Civil Engineering, 2021, 25(9): 3319–3335
[4]
Hu W, Cheng W C, Wen S, Yuan K. Revealing the enhancement and degradation mechanisms affecting the performance of carbonate precipitation in EICP process. Frontiers in Bioengineering and Biotechnology, 2021, 9: 750258
[5]
Cheng W C, Bai X D, Sheil B B, Li G, Wang F. Identifying characteristics of pipejacking parameters to assess geological conditions using optimisation algorithm-based support vector machines. Tunnelling and Underground Space Technology, 2020, 106: 103592
[6]
Ren D J, Xu Y S, Shen J S, Zhou A, Arulrajah A. Prediction of ground deformation during pipe-jacking considering multiple factors. Applied Sciences (Basel, Switzerland), 2018, 8(7): 1051
[7]
Kumar R, Samui P, Kumari S, Roy S S. Determination of reliability index of cantilever retaining wall by RVM, MPMR and MARS. International Journal of Advanced Intelligence Paradigms, 2021, 18(3): 316–336
[8]
Samui P, Kim D, Jagan J, Roy S S. Determination of uplift capacity of suction caisson using gaussian process regression, minimax probability machine regression and extreme learning machine. Civil Engineering (Shiraz), 2019, 43(S1): 651–657
[9]
Yang Y F, Liao S M, Liu M B, Wu D P, Pan W Q, Li H. A new construction method for metro stations in dense urban areas in Shanghai soft ground: Open-cut shafts combined with quasi-rectangular jacking boxes. Tunnelling and Underground Space Technology, 2022, 125: 104530
[10]
Zhou C, Xu H, Ding L, Wei L, Zhou Y. Dynamic prediction for attitude and position in shield tunneling: A deep learning method. Automation in Construction, 2019, 105: 102840
[11]
Sugimoto M, Sramoon A. Theoretical model of shield behavior during excavation. I: Theory. Journal of Geotechnical and Geoenvironmental Engineering, 2002, 128(2): 138–155
[12]
Zhang P, Behbahani S S, Ma B, Iseley T, Tan L. A jacking force study of curved steel pipe roof in Gongbei tunnel: Calculation review and monitoring data analysis. Tunnelling and Underground Space Technology, 2018, 72: 305–322
[13]
Ji X, Zhao W, Ni P, Barla M, Han J, Jia P, Chen Y, Zhang C. A method to estimate the jacking force for pipe jacking in sandy soils. Tunnelling and Underground Space Technology, 2019, 90: 119–130
[14]
Barla M, Camusso M, Aiassa S. Analysis of jacking forces during microtunnelling in limestone. Tunnelling and Underground Space Technology, 2006, 21(6): 668–683
[15]
Ji X, Ni P, Barla M. Analysis of jacking forces during pipe jacking in granular materials using particle methods. Underground Space, 2019, 4(4): 277–288
[16]
Ong D, Choo C. Back-analysis and finite element modeling of jacking forces in weathered rocks. Tunnelling and Underground Space Technology, 2016, 51: 1–10
[17]
Rohner R, Hoch A. Calculation of jacking force by new ATV A-161. Tunnelling and Underground Space Technology, 2010, 25(6): 731–735
[18]
Wen K, Shimada H, Zeng W, Sasaoka T, Qian D. Frictional analysis of pipe–slurry–soil interaction and jacking force prediction of rectangular pipe jacking. European Journal of Environmental and Civil Engineering, 2020, 24(6): 814–832
[19]
Cheng W C, Ni J C, Shen J S L, Huang H W. Investigation into factors affecting jacking force: A case study. Proceedings of the Institution of Civil Engineers—Geotechnical Engineering, 2017, 170(4): 322–334
[20]
Li C, Zhong Z, Liu X, Tu Y, He G. Numerical simulation for an estimation of the jacking force of ultra-long-distance pipe jacking with frictional property testing at the rock mass–pipe interface. Tunnelling and Underground Space Technology, 2019, 89: 205–221
[21]
Yen J, Shou K. Numerical simulation for the estimation the jacking force of pipe jacking. Tunnelling and Underground Space Technology, 2015, 49: 218–229
[22]
Chapman D, Ichioka Y. Prediction of jacking forces for microtunnelling operations. Tunnelling and Underground Space Technology, 1999, 14: 31–41
[23]
Sheil B. Prediction of microtunnelling jacking forces using a probabilistic observational approach. Tunnelling and Underground Space Technology, 2021, 109: 103749
[24]
Yang S, Wang M, Du J, Guo Y, Geng Y, Li T. Research of jacking force of densely arranged pipe jacks process in pipe-roof pre-construction method. Tunnelling and Underground Space Technology, 2020, 97: 103277
[25]
Shou K, Yen J, Liu M. On the frictional property of lubricants and its impact on jacking force and soil–pipe interaction of pipe-jacking. Tunnelling and Underground Space Technology, 2010, 25(4): 469–477
[26]
Reilly C C, Orr T L. Physical modelling of the effect of lubricants in pipe jacking. Tunnelling and Underground Space Technology, 2017, 63: 44–53
[27]
He Z, Chen J. Experimental study on the complex contact frictional property of an ultralong distance large-section concrete pipe jacking and prediction of pipe string stuck. Advances in Materials Science and Engineering, 2019, 2019: 4353520
[28]
Ye Y, Peng L, Zhou Y, Yang W, Shi C, Lin Y. Prediction of friction resistance for slurry pipe jacking. Applied Sciences, 2019, 10(1): 207
[29]
Cheng W C, Wang L, Xue Z F, Ni J C, Rahman M M, Arulrajah A. Lubrication performance of pipejacking in soft alluvial deposits. Tunnelling and Underground Space Technology, 2019, 91: 102991
[30]
Ye Y, Peng L, Yang W, Zou Y, Cao C. Calculation of friction force for slurry pipe jacking considering soil–slurry–pipe interaction. Advances in Civil Engineering, 2020, 2020: 1–10
[31]
Bai X D, Cheng W C, Li G. A comparative study of different machine learning algorithms in predicting EPB shield behaviour: A case study at the Xi’an metro, China. Acta Geotechnica, 2021, 16(12): 4061–4080
[32]
Lin S S, Zhang N, Zhou A, Shen S L. Time-series prediction of shield movement performance during tunneling based on hybrid model. Tunnelling and Underground Space Technology, 2022, 119: 104245
[33]
Yan T, Shen S L, Zhou A. Identification of geological characteristics from construction parameters during shield tunnelling. Acta Geotechnica, 2023, 18(1): 535–551
[34]
Elbaz K, Yan T, Zhou A, Shen S L. Deep learning analysis for energy consumption of shield tunneling machine drive system. Tunnelling and Underground Space Technology, 2022, 123: 104405
[35]
Samui P, Roy S S, Balas V E. Handbook of Neural Computation. San Diego: Academic Press, an imprint of Elsevier, 2017
[36]
Kim D. Handbook of Research on Predictive Modeling and Optimization Methods in Science and Engineering. Hershey: IGI Global, 2018
[37]
Wang R, Li D, Chen E J, Liu Y. Dynamic prediction of mechanized shield tunneling performance. Automation in Construction, 2021, 132: 103958
[38]
Yang J, Liu Y, Yagiz S, Laouafa F. An intelligent procedure for updating deformation prediction of braced excavation in clay using gated recurrent unit neural networks. Journal of Rock Mechanics and Geotechnical Engineering, 2021, 13(6): 1485–1499
[39]
Zhang N, Zhou A, Pan Y, Shen S L. Measurement and prediction of tunnelling-induced ground settlement in karst region by using expanding deep learning method. Measurement, 2021, 183: 109700
[40]
Zhang N, Shen S L, Zhou A. A new index for cutter life evaluation and ensemble model for prediction of cutter wear. Tunnelling and Underground Space Technology, 2023, 131: 104830
[41]
Zhang Z, Ma L. Attitude Correction System and Cooperative Control of Tunnel Boring Machine. International Journal of Pattern Recognition and Artificial Intelligence, 2018, 32(11): 1859018
[42]
Xie H, Duan X, Yang H, Liu Z. Automatic trajectory tracking control of shield tunneling machine under complex stratum working condition. Tunnelling and Underground Space Technology, 2012, 32: 87–97
[43]
Tang X, Deng K, Wang L, Chen X. Research on natural frequency characteristics of thrust system for EPB machines. Automation in Construction, 2012, 22: 491–497
[44]
Zhao Y, Pan H, Wang H, Yu H. Dynamics research on grouping characteristics of a shield tunneling machine’s thrust system. Automation in Construction, 2017, 76: 97–107
[45]
Shen S L, Elbaz K, Shaban W M, Zhou A. Real-time prediction of shield moving trajectory during tunnelling. Acta Geotechnica, 2022, 17(4): 1533–1549
[46]
Längkvist M, Karlsson L, Loutfi A. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognition Letters, 2014, 42: 11–24
[47]
Romeu P, Zamora-Martínez F, Botella-Rocamora P, Pardo J. Stacked denoising auto-encoders for short-term time series forecasting. In: Artificial Neural Networks: Methods and Applications in Bio-/Neuroinformatics. Cham: Springer, 2015
[48]
Ben Taieb S, Bontempi G, Atiya A F, Sorjamaa A. A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition. Expert Systems with Applications, 2012, 39(8): 7067–7083
[49]
Li Q, Li R, Ji K, Dai W. Kalman filter and its application. In: 2015 8th International Conference on Intelligent Networks and Intelligent Systems (ICINIS). Tianjin: IEEE, 2015
[50]
Kalman R E. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 1960, 82(1): 35–45
[51]
Auger F, Hilairet M, Guerrero J M, Monmasson E, Orlowska-Kowalska T, Katsura S. Industrial applications of the Kalman filter: A review. IEEE Transactions on Industrial Electronics, 2013, 60(12): 5458–5471
[52]
Särkkä S. Bayesian Filtering and Smoothing. Cambridge: Cambridge University Press, 2013
[53]
Zhang N, Zhang N, Zheng Q, Xu Y S. Real-time prediction of shield moving trajectory during tunnelling using GRU deep neural network. Acta Geotechnica, 2022, 17(4): 1167–1182
[54]
Géron A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. Sonoma: O’Reilly Media, 2019
[55]
Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation, 1997, 9(8): 1735–1780
[56]
Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha: Association for Computational Linguistics, 2014
[57]
Liao S M, Liu J H, Wang R L, Li Z M. Shield tunneling and environment protection in Shanghai soft ground. Tunnelling and Underground Space Technology, 2009, 24(4): 454–465
[58]
Xiao H, Chen Z, Cao R, Cao Y, Zhao L, Zhao Y. Prediction of shield machine posture using the GRU algorithm with adaptive boosting: A case study of Chengdu Subway project. Transportation Geotechnics, 2022, 37: 100837
[59]
Erharter G H, Marcher T. MSAC: Towards data driven system behavior classification for TBM tunneling. Tunnelling and Underground Space Technology, 2020, 103: 103466
[60]
Zhang Q, Liu Z, Tan J. Prediction of geological conditions for a tunnel boring machine using big operational data. Automation in Construction, 2019, 100: 73–83
[61]
Zhang P, Wu H N, Chen R P, Dai T, Meng F Y, Wang H B. A critical evaluation of machine learning and deep learning in shield-ground interaction prediction. Tunnelling and Underground Space Technology, 2020, 106: 103593
[62]
Mahalanobis P C. On the generalized distance in statistics. Proceedings of the National Institute of Sciences (Calcutta), 1936, 2: 49–55
[63]
Yin X, Liu Q, Huang X, Pan Y. Perception model of surrounding rock geological conditions based on TBM operational big data and combined unsupervised-supervised learning. Tunnelling and Underground Space Technology, 2022, 120: 104285
[64]
Wu N, Green B, Ben X, O’Banion S. Deep transformer models for time series forecasting: The influenza prevalence case. 2020, arXiv:2001.08317
[65]
Kingma D P, Ba J. Adam: A Method for Stochastic Optimization. 2017, arXiv:1412.6980
[66]
Li J, Li P, Guo D, Li X, Chen Z. Advanced prediction of tunnel boring machine performance based on Big Data. Geoscience Frontiers, 2021, 12(1): 331–338
[67]
Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence. San Francisco: Morgan Kaufmann Publishers Inc., 1995
[68]
Wong T T. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognition, 2015, 48(9): 2839–2846
[69]
Lin S S, Shen S L, Zhang N, Zhou A. Modelling the performance of EPB shield tunnelling using machine and deep learning algorithms. Geoscience Frontiers, 2021, 12(5): 101177
[70]
Yan T, Shen S L, Zhou A, Chen X. Prediction of geological characteristics from shield operational parameters by integrating grid search and K-fold cross validation into stacking classification algorithm. Journal of Rock Mechanics and Geotechnical Engineering, 2022, 14(4): 1292–1303
[71]
Hayou S, Doucet A, Rousseau J. On the impact of the activation function on deep neural networks training. 2019, arXiv:1902.06853
[72]
Shen S L, Zhang N, Zhou A, Yin Z Y. Enhancement of neural networks with an alternative activation function tanhLU. Expert Systems with Applications, 2022, 199: 117181
[73]
Liu M, Liao S, Yang Y, Men Y, He J, Huang Y. Tunnel boring machine vibration-based deep learning for the ground identification of working faces. Journal of Rock Mechanics and Geotechnical Engineering, 2021, 13(6): 1340–1357
RIGHTS & PERMISSIONS
Higher Education Press