Comprehensive Hybrid Gaussian algorithm for rock mass stability assessment in complex geological formations: A machine learning approach with dynamic kernel optimization

Youliang CHEN , Wencan GUAN , Rafig AZZAM

Front. Struct. Civ. Eng. ›› 2025, Vol. 19 ›› Issue (11) : 1759 -1787.

Front. Struct. Civ. Eng. ›› 2025, Vol. 19 ›› Issue (11) : 1759 -1787. DOI: 10.1007/s11709-025-1242-z
RESEARCH ARTICLE


Abstract

Accurate prediction of tunnel face stability in complex geological formations remains a critical challenge in underground engineering, necessitating innovative computational approaches. This research proposes a Comprehensive Hybrid Gaussian (CHG) algorithm for predicting tunnel face stability in complex rock formations. The algorithm introduces a failure penetration probability index (FPPI) as an intermediate variable that establishes a probabilistic mapping between Tunnel Boring Machine parameters and rock mass stability, overcoming limitations of traditional linear mapping approaches. The CHG algorithm integrates feedforward layer normalization from Transformer architectures to enable dynamic kernel function optimization, resolving the manual hyperparameter tuning constraints in conventional Gaussian processes. A multi-scale regularization framework, from Tikhonov regularization to dropout, provides effective complexity control while maintaining expressive capacity. The posterior process incorporates Chebyshev’s inequality to enhance confidence interval estimation and prediction robustness. Validation across three geological points in the Yinsong Project demonstrates an average prediction deviation of 20.453%, with the CHG algorithm (R² = 0.982) significantly outperforming support vector regression (R² = 0.846) and random forest (R² = 0.923). While slightly underperforming the Transformer model (R² = 0.992) statistically in cross-project validation on the Yinchao data set, the CHG algorithm (R² = 0.869) exhibits superior adaptability to geological uncertainties. The synergy between FPPI and dynamic kernel functions establishes an innovative framework for predicting mechanical behavior in heterogeneous geological conditions, particularly in lithological transition zones, providing a theoretically sound and practically applicable decision support system for tunnel stability assessment.


Keywords

machine learning / Bayesian inference / neural network / Gaussian process / tunneling automation

Cite this article

Youliang CHEN, Wencan GUAN, Rafig AZZAM. Comprehensive Hybrid Gaussian algorithm for rock mass stability assessment in complex geological formations: A machine learning approach with dynamic kernel optimization. Front. Struct. Civ. Eng., 2025, 19(11): 1759-1787 DOI:10.1007/s11709-025-1242-z


1 Introduction

With the acceleration of global urbanization and the advancement of transportation infrastructure modernization, underground space engineering has exhibited a significant expansion trend, particularly in the rapid increase of long-distance, deeply-buried, large cross-section tunnel projects [1]. Against this backdrop, tunnel engineering increasingly confronts complex rock stratum conditions, such as fault fracture zones, heterogeneous stratified structures, high in situ stress environments, and water-bearing formations [2–4]. Excavation operations under these complex geological conditions face unprecedented technical challenges. The tunnel face (i.e., the free surface), as the frontier region of tunnel excavation, directly influences the safety, economic viability, and construction efficiency of underground projects [5]. Therefore, comprehensive understanding and precise prediction of its instability mechanisms possess significant theoretical value and practical implications [6,7].

The fundamental nature of instability mechanisms in tunnel faces within complex rock masses lies in the nonlinear dynamic evolutionary process of multi-scale interactions under multi-field coupling effects [8–11]. Based on extant research findings, the controlling factors influencing tunnel face stability can be systematically categorized into four core dimensions [12,13]: 1) geological structural characteristics [14,15], specifically manifested in rock stratum occurrence distribution, spatial disposition of fault fracture zones, and heterogeneity in rock mass weathering degrees [16]; 2) rock mechanics parameter systems [17,18], primarily encompassing critical constitutive parameters such as uniaxial compressive strength (UCS), tensile strength, elastic modulus, and Poisson’s ratio [19]; 3) in situ stress field environmental characteristics [20], involving three-dimensional distribution patterns of initial geostress, stress redistribution mechanisms during excavation unloading, and stress path dependency of rock masses [21,22]; 4) hydrogeological conditions [23], encompassing dynamic variations in groundwater levels, pore water pressure distribution patterns, and seepage-stress coupling effects [24]. Current research paradigms predominantly concentrate on theoretical analytical methods (e.g., limit equilibrium theory, upper and lower bound theorems of limit analysis) [25–27], physical simulation techniques (including centrifugal model tests and geomechanical similar material models), and numerical simulation approaches (e.g., finite element continuum medium analysis, discrete element discontinuum simulation) [28,29]. While these methodological systems possess engineering applicability under specific boundary conditions, they exhibit significant limitations in predictive accuracy and in the range of engineering applicability when confronting complex rock mass systems characterized by pronounced nonlinearity and strong stochasticity [30–33].

In recent years, with the rapid development of machine learning and artificial intelligence, data-driven methods have shown clear advantages in underground engineering. Numerous scholars have constructed intelligent assessment models for tunnel face stability using algorithms such as support vector machines (SVM), artificial neural networks (ANN), and random forest (RF) [34–40]. Zhang et al. [41], based on the fuzzy and stochastic characteristics of shield tunnel face instability, introduced normal cloud theory to establish a Normal Cloud-Product Scale Method prediction model that quantifies the probability of tunnel face instability. Li and Dias [42], addressing the excavation of tunnel faces in dual-layer soils, constructed a rotational failure mechanism that considers the different shear strengths of the soil layers and combined limit analysis theory with proxy Kriging models [43] to develop a Monte Carlo [43] simulation method optimized by active learning functions, significantly improving the efficiency of failure probability calculation. Regarding support pressure prediction, Li’s research team established limit support pressure formulas through mechanical analysis and applied SVM algorithms for intelligent identification of active instability criteria. Wu et al. [44] compared the performance of SVM and ANN in surrounding rock displacement prediction and verified the superior predictive accuracy of the former. Yao et al. [45] further proposed a multi-step displacement prediction model on this basis, employing an exponentially transformed Shuffled Complex Evolution algorithm to optimize the SVM parameters and thereby improve the reliability of temporal prediction. Jearsiripongkul et al. [46], based on the Hoek–Brown criterion, established ANN prediction models covering roadways, bilateral tunnels, and dual circular tunnels, and constructed a generally applicable solution for tunnel stability assessment by introducing four dimensionless parameters, including the cover depth ratio and normalized UCS. Nguyen et al. [47] combined isogeometric analysis with Bayesian regularized neural networks to establish closed-form solutions for the stability analysis of rectangular tunnels in cohesive-frictional soils under additional loads. However, these methods still face challenges in processing multi-source heterogeneous data, capturing potential nonlinear relationships, and quantifying uncertainties. In particular, existing research predominantly focuses on the macro-scale stability of the whole tunnel, with insufficient characterization of the local mechanical behavior and time-varying properties of tunnel faces. In addition, most methods do not consider in depth the coupling between mechanical parameters and geological conditions during tunnel construction, which leads to deviations between prediction results and engineering reality.

To address these issues, this research proposes a prediction method for tunnel face instability in complex rock based on the Comprehensive Hybrid Gaussian (CHG) algorithm. The study uses the Field Penetration Index (FPI) and the Tunnel Penetration Index (TPI), which are standard tools in tunnel construction, to establish a correlation between tunnel stability indicators and the Failure Penetration Probability Index (FPPI). The CHG algorithm employs the feedforward layer of the Transformer architecture to optimize the Gaussian process (GP) covariance function (kernel function). A key feature of the algorithm is its use of feedforward layer normalization in place of the manual hyperparameter optimization that is widely used in the field. In the Bayesian prior phase, the algorithm implements a boundary (marginal) likelihood selection mechanism that adapts the kernel function to the intrinsic characteristics and structural features of the data. The FPPI serves as a dynamic weighting matrix that is integrated into the Gaussian inference process, strengthening the residual connection component and embedding domain-specific engineering semantics. For posterior distribution analysis, Chebyshev’s inequality is applied to establish more conservative and robust confidence intervals for the predicted target variables; these intervals do not rely on the conventional Gaussian assumption and complete the CHG algorithm framework.

The present study selected the mechanical operation parameters of tunnel boring machines (TBM) from three critical locations (Shimenzi, Weizigou, and Hengbei) in the Yinsong Project [48–52] as the training set to predict tunnel stability and classify tunnel collapse types. Empirical validation of the CHG algorithm across the geologically diverse sites of Shimenzi, Weizigou, and Hengbei revealed prediction errors of 30.56%, 6.8%, and 24%, respectively, yielding an average prediction deviation of 20.453%. Comparative analysis on the Weizigou data set, with tunnel stability indicators as the analytical framework, showed a substantial improvement of the CHG algorithm’s radial basis function (RBF) kernel over its conventional GP counterpart: root mean square error (RMSE) and mean absolute error (MAE) each fell by approximately 15%, and the coefficient of determination (R²) rose by 8%. Further benchmarking established the CHG algorithm’s statistical superiority (R² = 0.982, mean squared error (MSE) = 4.19) relative to support vector regression (SVR) (R² = 0.846, MSE = 36.64) and random forest (R² = 0.923, MSE = 18.39). To evaluate cross-domain generalization, the algorithm was validated on the independent Yinchao tunnel data set, with performance metrics compared against a Transformer architecture. While the CHG algorithm exhibited marginally inferior statistical metrics (R² = 0.868637, MSE = 0.002134, MAE = 0.035960, RMSE = 0.046195) compared to the Transformer model (R² = 0.991834, MSE = 0.000133, MAE = 0.008327, RMSE = 0.011518), it demonstrated superior robustness when confronted with heterogeneous geological formations, particularly in regions characterized by lithological transitions and structural complexity. This suggests that despite the Transformer model’s statistical optimization under controlled conditions, the CHG algorithm adapts better to the geological uncertainties and non-stationary characteristics intrinsic to practical underground engineering applications.

In juxtaposition with conventional GP frameworks augmented by Bayesian inference mechanisms, the CHG algorithm represents a significant methodological advancement through its comprehensive optimization strategy that effectively transcends the constraints of manual kernel function hyperparameter calibration. The algorithm’s architectural integration of feedforward layer normalization with dynamic kernel adaptation constitutes a paradigm shift in GP modeling. Furthermore, regarding Bayesian prior probability formulation, the inherent data-driven characteristics of the boundary likelihood selection mechanism obviate the limitations associated with subjective prior probability assumptions, thereby enhancing statistical objectivity and computational robustness. The empirical validation across heterogeneous geological formations, from the Yinsong project’s critical points to the cross-domain Yinchao data set, demonstrates the algorithm’s versatility and adaptability to complex non-stationary data structures. Consequently, the CHG algorithm presents a theoretically rigorous yet computationally efficient framework with considerable potential for lightweight deployment in engineering applications requiring robust solutions to complex nonlinear problems, particularly in underground space engineering where geological uncertainties and multi-physics coupling effects predominate.

2 Data processing and research methodology

2.1 Data preprocessing

The data for this study come from the tunnel construction project of the fourth section of the Yinsong Water Supply Project [48,53]. The engineering geological conditions of the Yinsong Project are illustrated in Fig. 1. The geological conditions along the entire construction section are extremely complex and pose high construction risks. A risk prediction model for tunnel collapse therefore has considerable engineering significance for improving construction efficiency and reducing risks in subsequent construction. The study verifies and compares tunnel collapse predictions at three known critical points in order to characterize the performance of the CHG algorithm in this engineering example. The three locations are characterized by abundant water, and during tunnel excavation they are highly prone to softening and water inflow, which can lead to tunnel collapse. The project site data and excavation critical point data are shown in Table 1.

2.2 Stability analysis of tunnel face

This study uses TBM mechanical operation parameters to predict tunnel face stability. In complex rock tunnel excavation projects, the influence of geological conditions on tunnel construction is reflected indirectly through the TBM indices $P_{\mathrm{FPI}}$ [54] and $P_{\mathrm{TPI}}$ [55], which in turn reflect the distribution and mechanical properties of the complex rock along the tunnel.

The traditional $P_{\mathrm{FPI}}$ is defined by Eq. (1).

$$P_{\mathrm{FPI}} = \frac{F \cdot S_{\mathrm{CR}}}{S_{\mathrm{AR}}}. \tag{1}$$

The traditional $P_{\mathrm{TPI}}$ is defined by Eq. (2).

$$P_{\mathrm{TPI}} = \frac{T \cdot S_{\mathrm{CR}}}{S_{\mathrm{AR}}}. \tag{2}$$

In these equations, $F$ is the total thrust force, $S_{\mathrm{CR}}$ is the cutterhead rotational speed, $S_{\mathrm{AR}}$ is the advance rate, and $T$ is the cutterhead torque. When the rock stiffness is high, the values of $P_{\mathrm{FPI}}$ and $P_{\mathrm{TPI}}$ are larger because the numerator is large and the denominator is small. Conversely, when the rock stiffness is low, the values of $P_{\mathrm{FPI}}$ and $P_{\mathrm{TPI}}$ are smaller. The indices thus indirectly reflect the distribution of rock along the tunnel. After introducing the average advance speed of the tunnel face and the tunnel face area, the formulas are modified as follows.

$$P_{\mathrm{FPI}} = \frac{F_T}{v_{\mathrm{mean}} A_{\mathrm{face}}}, \tag{3}$$

$$P_{\mathrm{TPI}} = \frac{T}{v_{\mathrm{mean}} R_{\mathrm{TBM}} A_{\mathrm{face}}}, \tag{4}$$

where $F_T$ is the thrust force (kN) exerted by the TBM to advance; $v_{\mathrm{mean}}$ is the mean penetration rate (mm/min or m/h), i.e., the average advance speed of the tunnel face; $A_{\mathrm{face}}$ is the tunnel face area (m²), i.e., the cross-sectional area of the tunnel; and $R_{\mathrm{TBM}}$ is the cutterhead radius.
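
As a rough illustration of the penetration indices, the following Python sketch computes them from a single TBM record. The function names, argument names, and sample values are hypothetical and are not taken from the Yinsong data. The area-normalized form assumes that thrust and torque sit in the numerator, with the mean advance speed, face area, and cutterhead radius in the denominator.

```python
def traditional_indices(thrust, torque, rot_speed, advance_rate):
    """Eqs. (1)-(2): P_FPI = F * S_CR / S_AR and P_TPI = T * S_CR / S_AR."""
    if advance_rate <= 0:
        raise ValueError("advance rate must be positive")
    p_fpi = thrust * rot_speed / advance_rate
    p_tpi = torque * rot_speed / advance_rate
    return p_fpi, p_tpi


def area_normalized_indices(thrust, torque, v_mean, face_area, radius):
    """Eqs. (3)-(4) under the assumed reading stated in the lead-in."""
    if v_mean <= 0 or face_area <= 0 or radius <= 0:
        raise ValueError("v_mean, face_area and radius must be positive")
    p_fpi = thrust / (v_mean * face_area)
    p_tpi = torque / (v_mean * radius * face_area)
    return p_fpi, p_tpi


# Hypothetical records: harder rock needs more thrust and advances more slowly,
# so both indices come out larger than for softer rock.
hard_rock = traditional_indices(thrust=12000.0, torque=2400.0,
                                rot_speed=6.0, advance_rate=30.0)
soft_rock = traditional_indices(thrust=8000.0, torque=1600.0,
                                rot_speed=6.0, advance_rate=80.0)
print(hard_rock, soft_rock)
```

The comparison between the two hypothetical records mirrors the qualitative argument in the text: a large numerator and a small denominator produce high index values in stiff rock.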

During underground excavation, the stability of the tunnel face depends on the flowability of the face and the stress state of the rock mass [55,56]. By understanding the critical velocity field model, different geological conditions can be modeled, and the interactions between factors such as TBM speed, rock mass stress, and flowability can be predicted. This, in turn, helps optimize tunnel construction design and operational strategies, preventing tunnel face instability. The definition of the critical velocity field under the rock mass plastic flow theory is as follows.

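$$D_e = \kappa\,\dot{\varepsilon}^{P} = \kappa\,\lambda\,\frac{\partial f}{\partial \sigma}, \tag{5}$$

where the proportional coefficient is defined by Eq. (6).

$$\kappa = \frac{v_{\mathrm{TBM}}}{\left\| \lambda\,\dfrac{\partial f}{\partial \sigma} \cdot n \right\|}, \tag{6}$$

where $v_{\mathrm{TBM}}$ is the TBM advancement velocity.

The plastic strain rate tensor is defined by Eq. (7).

$$\dot{\varepsilon}^{P} = \lambda\,\frac{\partial f}{\partial \sigma}. \tag{7}$$

The parameters in these equations describe the plastic deformation of the rock during tunnel excavation. The deformation rate tensor $D_e$ is jointly determined by the proportional coefficient $\kappa$, the plastic strain rate $\dot{\varepsilon}^{P}$, the plastic flow parameter $\lambda$, and the yield function gradient $\partial f/\partial \sigma$, where the proportional coefficient $\kappa$ is calculated as the ratio between the TBM velocity $v_{\mathrm{TBM}}$ and the norm of the yield function gradient term. Introducing the velocity field gradient, the corrected $P_{\mathrm{FPI}}$ formula is as follows.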

$$P_{\mathrm{FPI}v} = \frac{F_T}{v_{\mathrm{mean}} A_{\mathrm{face}}}\,F_v(v). \tag{8}$$

The correction factor is given by Eq. (9).

$$F_v(v) = 1 + \alpha\,\mathrm{CV}(v) + \beta\,\frac{\|D\|_F}{v_{\mathrm{mean}}}. \tag{9}$$

The coefficient of variation of the velocity field is given by Eq. (10).

$$\mathrm{CV}(\|v\|) = \frac{\sqrt{\dfrac{1}{A_{\mathrm{face}}}\displaystyle\int_{A_{\mathrm{face}}} \left(\|v\| - v_{\mathrm{mean}}\right)^2 \mathrm{d}A}}{v_{\mathrm{mean}}}. \tag{10}$$

The Frobenius norm of the strain rate tensor is given by Eq. (11).

$$\|D\|_F = \sqrt{\sum_{i,j=1}^{3} D_{ij}^2}. \tag{11}$$

The weighting coefficients are as follows.

$$\alpha = f_\alpha(\mathrm{GSI}, \sigma_c, \sigma_H/\sigma_v), \qquad \beta = f_\beta(\mathrm{GSI}, \sigma_c, \sigma_H/\sigma_v), \tag{12}$$

where GSI is the geological strength index, $\sigma_c$ is the uniaxial compressive strength (UCS), $\sigma_H$ is the horizontal in situ stress, and $\sigma_v$ is the vertical in situ stress.

The correction related to the inhomogeneity of the velocity field is denoted as Δv, the term $\beta\,\|D\|_F / v_{\mathrm{mean}}$ is the correction related to the deformation rate, and $\alpha$ and $\beta$ are the weight coefficients. In practical engineering, there is a strong coupling between rotational motion and linear motion, which traditional models do not account for. By introducing angular velocity, torque, and geometric correction terms, the modified $P_{\mathrm{TPI}}$ index describes system behavior more comprehensively, improving design rationality and construction safety.
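
The velocity-field correction can be sketched numerically as follows. This is a minimal sketch that assumes the face speeds are sampled on equal-area patches (so the area integral becomes a plain mean) and that the weights α and β are supplied by the user; the paper defines them only as functions of GSI, σc, and σH/σv, so the numeric values below are hypothetical.

```python
import numpy as np


def coefficient_of_variation(speeds):
    """Discrete form of the CV: standard deviation of face speeds over their mean."""
    speeds = np.asarray(speeds, dtype=float)
    v_mean = speeds.mean()
    return float(np.sqrt(np.mean((speeds - v_mean) ** 2)) / v_mean)


def frobenius_norm(tensor):
    """Square root of the sum of squared tensor components."""
    tensor = np.asarray(tensor, dtype=float)
    return float(np.sqrt(np.sum(tensor ** 2)))


def velocity_correction(speeds, strain_rate, alpha, beta):
    """F_v = 1 + alpha * CV + beta * ||D||_F / v_mean."""
    v_mean = float(np.mean(speeds))
    return (1.0
            + alpha * coefficient_of_variation(speeds)
            + beta * frobenius_norm(strain_rate) / v_mean)


# Hypothetical inputs: a uniform velocity field with no straining gives F_v = 1,
# so the corrected index reduces to the uncorrected one.
uniform = velocity_correction([2.0, 2.0, 2.0], np.zeros((3, 3)), alpha=0.2, beta=0.1)
uneven = velocity_correction([1.0, 3.0], np.zeros((3, 3)), alpha=0.2, beta=0.1)
print(uniform, uneven)
```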

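The modified $P_{\mathrm{TPI}}$ indicator is defined by Eq. (13).

$$P_{\mathrm{TPI}v} = \frac{T}{v_{\mathrm{mean}} R_{\mathrm{TBM}} A_{\mathrm{face}}}\,G_v(W,\omega). \tag{13}$$

The correction function for rotational effects is defined by Eq. (14).

$$G_v(W,\omega) = 1 + \gamma\,\frac{\|W\|_F}{\omega} + \delta\,\frac{\left| n \cdot (\nabla \times v) \right|}{\omega}. \tag{14}$$

The magnitude of the cutterhead angular velocity vector is defined by Eq. (15).

$$\omega = \frac{2\pi N}{60}. \tag{15}$$

The Frobenius norm of the rotation rate tensor is given by Eq. (16).

$$\|W\|_F = \sqrt{\sum_{i,j=1}^{3} W_{ij}^2}. \tag{16}$$

The weight coefficients are defined by Eq. (17).

$$\gamma = f_\gamma(\mathrm{GSI}, \sigma_c, I_s), \qquad \delta = f_\delta(\mathrm{GSI}, \sigma_c, I_s), \tag{17}$$

where $I_s$ is the point load strength index.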
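
A small Python sketch of the rotational correction follows. It assumes that N is the cutterhead speed in revolutions per minute, that the normal component of the vorticity, n·(∇×v), has already been evaluated and is passed in as a scalar, and that γ and δ are given numbers; all names and values are hypothetical.

```python
import math


def angular_velocity(rpm):
    """Convert cutterhead speed N (rev/min) to angular velocity (rad/s)."""
    return 2.0 * math.pi * rpm / 60.0


def rotation_correction(spin_tensor, vorticity_normal, rpm, gamma, delta):
    """G_v = 1 + gamma * ||W||_F / omega + delta * |n . curl(v)| / omega."""
    omega = angular_velocity(rpm)
    if omega <= 0:
        raise ValueError("rotational speed must be positive")
    # Frobenius norm of the 3x3 rotation rate tensor given as nested lists.
    w_norm = math.sqrt(sum(w * w for row in spin_tensor for w in row))
    return 1.0 + gamma * w_norm / omega + delta * abs(vorticity_normal) / omega


# Hypothetical spin tensor for a unit rotation rate about the tunnel axis.
spin = [[0.0, -1.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0]]
g_v = rotation_correction(spin, vorticity_normal=2.0, rpm=60.0, gamma=0.5, delta=0.25)
print(angular_velocity(60.0), g_v)
```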

Assuming that the velocities at all points on the tunnel excavation face follow a normal distribution, the joint probability density function of the velocity field can be expressed by Eq. (18).

$$P(v) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(v-\mu)^{T}\Sigma^{-1}(v-\mu)\right), \tag{18}$$

where $v$ is the velocity vector, $\mu$ is the mean vector, and $\Sigma$ is the covariance matrix. Considering the coefficient of variation (CV) of the velocity field and the strain rate tensor $D$, the corrected probability distribution can be expressed by Eq. (19).

$$P_c(v) = P(v)\,F_v(v)\,G(W,\omega). \tag{19}$$

Based on the corrected $P_{\mathrm{FPI}c}$ and $P_{\mathrm{TPI}c}$, the tunnel stability indicator $S$ is defined by Eq. (20).

$$S = \omega_1 P_{\mathrm{FPI}c} + \omega_2 P_{\mathrm{TPI}c}, \tag{20}$$

where $\omega_1$ and $\omega_2$ are the weight coefficients, satisfying $\omega_1 + \omega_2 = 1$. Expanding, we get Eq. (21).

$$S = \omega_1 \frac{F}{V_{\max} A_{\max}}\,F_v(v) + \omega_2 \frac{T}{V_{\max} R_{\max} A_{\max}}\,G(W,\omega). \tag{21}$$

Finally, the tunnel face stability evaluation model can be expressed as follows.

$$P(S > S_{\mathrm{critical}}) = \int_{S > S_{\mathrm{critical}}} P_c(v)\,\mathrm{d}v, \tag{22}$$

where $S_{\mathrm{critical}}$ is the critical stability value. When $P(S > S_{\mathrm{critical}}) > P_{\mathrm{threshold}}$, the tunnel face is considered stable [57].

The risk assessment coefficient $R$ is defined by Eq. (23).

$$R = 1 - P(S > S_{\mathrm{critical}}). \tag{23}$$

The smaller $R$ is, the more stable the tunnel face. The comprehensive evaluation index based on this criterion is given by Eq. (24).

$$\Psi = \frac{S}{S_{\mathrm{critical}}}\,(1 - R). \tag{24}$$

When Ψ > 1, the tunnel face is stable; when Ψ < 1, the tunnel face is unstable.
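
The evaluation chain from the exceedance probability to Ψ can be sketched with a Monte Carlo estimate. This is only an illustration: it assumes that samples of the stability indicator S have already been drawn from the corrected distribution, and it replaces the integral over the velocity field by the fraction of samples above the critical value. The function name, the threshold default, and the sample values are hypothetical.

```python
import numpy as np


def assess_stability(s_samples, s_value, s_critical, p_threshold=0.5):
    """Estimate P(S > S_critical), the risk coefficient R, and the index Psi."""
    s_samples = np.asarray(s_samples, dtype=float)
    # Monte Carlo estimate of the exceedance probability.
    p_exceed = float(np.mean(s_samples > s_critical))
    risk = 1.0 - p_exceed                      # R = 1 - P(S > S_critical)
    psi = s_value / s_critical * (1.0 - risk)  # Psi = S / S_critical * (1 - R)
    return {
        "p_exceed": p_exceed,
        "risk": risk,
        "psi": psi,
        "stable_by_probability": p_exceed > p_threshold,
        "stable_by_psi": psi > 1.0,
    }


# Hypothetical samples of S and a hypothetical current value of S.
result = assess_stability([1.0, 2.0, 3.0, 4.0], s_value=10.0, s_critical=2.5)
print(result)
```

Returning both criteria side by side makes it easy to see cases where the probability threshold and the Ψ criterion disagree.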

As illustrated in Fig. 2, the original data are presented in accordance with Eqs. (1)–(4). The data [50,51] are illustrated using Weizigou as an example, with the input variables F, SCR, T, and SAR. As illustrated in Fig. 3(a), the data are normalized to ensure dimensional matching during the transmission phase. Figure 3(b) depicts the Gaussian prior process applied to the preprocessed data, which introduces a Gaussian distribution centered at the spatial coordinate (0.5, 0.5, 0.5).

As illustrated in Fig. 4, the data set was mapped using Eqs. (1)–(21). The relational data set was constructed using structured query language (SQL) and serves as the original data set. The input variables are SAR, F, SCR, and T; the intermediate variables are PFPI and PTPI; and the final variable is tunnel stability.

3 Comprehensive hybrid Gaussian algorithm establishment

3.1 Failure penetration probability index explanation

This study proposes a comprehensive intermediate variable, the FPPI, formulated based on Eqs. (1)–(18). This innovation addresses a fundamental limitation in conventional tunnel analysis frameworks, wherein multi-variable relationships are typically mapped in a single-stage manner, proceeding directly from the TBM operational parameters to tunnel stability assessments. To more accurately characterize the complex nonlinear relationships and multi-physics field coupling phenomena inherent in underground excavation processes, the FPPI serves as a mathematically robust intermediate construct within the analytical framework.

The integration of FPPI with the CHG algorithm facilitates enhanced parameter optimization, substantially improving the information transmission dynamics throughout the modeling process. Given the inherent black-box characteristics of the tunnel-geology interaction system, this research emphasizes the theoretical validation of the methodology’s advantages through rigorous mathematical formulation and empirical verification.

From a theoretical perspective, the FPPI indicator enables optimized weight distribution across multi-variable relationships, directing the CHG algorithm toward prioritized analysis of the most influential contributing factors within the complex geological-mechanical system. This approach exhibits probabilistic near-optimal properties that enhance both computational efficiency and predictive accuracy. The mathematical formulation of the FPPI indicator and its synergistic integration within the CHG algorithm framework are rigorously defined in Eqs. (25)–(53).

3.2 The construction of the Comprehensive Hybrid Gaussian algorithm

As illustrated in Fig. 5, the CHG algorithm’s foundational architecture is predicated on a modified GP framework. The algorithm employs boundary likelihood values as discriminative indicators between data smoothness and roughness, establishing the prior component of the GP. The kernel function undergoes dynamic optimization specifically calibrated for this engineering application, with an attention mechanism systematically incorporated into the query component (Q) for both the Matérn and RBF kernel functions.

In the forward processing layer, key (K) and value (V) components execute a sequential transformation pipeline: initially normalizing the input data, subsequently predicting the potential distribution patterns, and then applying the appropriate kernel function to derive the TBM operational parameters. These parameters are subsequently mapped to a pre-established “TBM-tunnel face stability relationship dataset,” transitioning from predominantly linear relationships with black-box characteristics to nonlinear distributions with pronounced correlation structures. To accommodate gradient dynamics during standardization and facilitate effective residual connections, the algorithm implements a dynamic weighting mechanism utilizing the FPPI as a dynamic weight matrix. When TBM data enters either the Matérn or RBF kernel function, a multiplication operation is executed with this dynamic weight matrix serving as the multiplicative factor. The scaling determination is governed by feature vector characteristics: vectors of smaller magnitude undergo amplification, while those of larger magnitude are attenuated.

The final stage applies Chebyshev confidence intervals to enhance tunnel face stability predictions, irrespective of whether the output originates from the RBF or Matérn kernel function, thereby completing the CHG algorithmic process. The mathematical formulation of the attention mechanism-based dynamic scaling factor for the optimized RBF kernel function is defined in Eqs. (38)–(43). The posterior processing framework of the CHG algorithm is presented in Eqs. (44)–(53), with the final prediction formula given in Eq. (53).

Table 2 gives an intuitive comparison between the combination most commonly used at present, a GP with a Bayesian inference mechanism, and the technical details of the CHG algorithm. Figure 6 shows how variables are passed in the CHG algorithm.

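The FPPI is proposed and defined by Eq. (25).

$$\mathrm{FPPI}_{\mathrm{basic}} = \frac{D_{\mathrm{exc}}\,t_{\mathrm{exc}}\,\tau\,P(V)}{v_{\mathrm{exc}}}. \tag{25}$$

The torque-based function used in the FPPI is defined by Eq. (26).

$$\tau(V) = \frac{\tau\,P(V)\,D_{\mathrm{TBM}}\,t}{v_{\mathrm{exc}}}. \tag{26}$$

The resulting FPPI calculation formula is given by Eq. (27).

$$\mathrm{FPPI}(V) = \frac{\sum_{i=1}^{n}\left(\tau(V) - \tau(V)_{\mathrm{mean}}\right)^2}{v_{\mathrm{exc}}}. \tag{27}$$

The FPPI with complete parameters is defined by Eq. (28).

$$\mathrm{FPPI}_{\mathrm{comp}} = \frac{D\,\tau^2}{v_{\mathrm{exc}}}, \tag{28}$$

where $D_{\mathrm{exc}}$ represents the tunnel face area, measured in m²; $\tau$ denotes the cutterhead torque of the TBM, measured in kN·m; $P(V)$ is the probability distribution function of the velocity field; $t_{\mathrm{exc}}$ is the excavation time, measured in hours or minutes; $v_{\mathrm{exc}}$ is the excavation rate, measured in mm/min or m/h; $D_{\mathrm{TBM}}$ is the diameter of the TBM, measured in m; $\tau(V)$ is the velocity-based torque function; and $\tau(V)_{\mathrm{mean}}$ denotes the mean value of the torque function.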

1) Original GP kernel function

The common GP workflow models the data with either an RBF kernel or a Matérn kernel. The difference between the two is that the RBF kernel is better at handling smooth data, while the Matérn kernel performs better on rougher, less regular data. The forms of the two kernels are as follows.

$$K(x,x') = \sigma_f^2 \exp\left(-\frac{\|x-x'\|^2}{2l^2}\right), \tag{29}$$

$$K_\nu(x,x') = \frac{2^{1-\nu}}{\Gamma(\nu)}\left(\sqrt{2\nu}\,\|x-x'\|\right)^{\nu} k_\nu\left(\sqrt{2\nu}\,\|x-x'\|\right), \tag{30}$$

where $\sigma_f^2$ is the signal variance, $l$ is the length scale, $\Gamma(\nu)$ represents the Gamma function, and $\|x-x'\|$ denotes the Euclidean distance between two points. The term $k_\nu$ represents the modified Bessel function of the second kind.
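
The two kernels can be sketched in NumPy as follows. To avoid a Bessel function dependency, the Matérn example fixes ν = 3/2, for which the general expression reduces to the closed form (1 + √3·d)·exp(−√3·d). That choice of ν, the function names, and the default parameters are assumptions made for illustration only.

```python
import numpy as np


def pairwise_distance(x1, x2):
    """Euclidean distances between rows of x1 and x2, shaped (n_samples, n_features)."""
    x1 = np.atleast_2d(np.asarray(x1, dtype=float))
    x2 = np.atleast_2d(np.asarray(x2, dtype=float))
    diff = x1[:, None, :] - x2[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


def rbf_kernel(x1, x2, signal_var=1.0, length_scale=1.0):
    """RBF kernel: sigma_f^2 * exp(-d^2 / (2 l^2))."""
    d = pairwise_distance(x1, x2)
    return signal_var * np.exp(-d ** 2 / (2.0 * length_scale ** 2))


def matern32_kernel(x1, x2):
    """Matern kernel for nu = 3/2 in closed form (no Bessel function needed)."""
    z = np.sqrt(3.0) * pairwise_distance(x1, x2)
    return (1.0 + z) * np.exp(-z)


points = np.array([[0.0], [1.0], [2.0]])
print(rbf_kernel(points, points))
print(matern32_kernel(points, points))
```

At the same distance the Matérn values decay with a sharper peak at zero, which is why that kernel tolerates rougher sample paths than the RBF kernel.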

2) Kernel function selection mechanism

A decision mechanism is introduced: when the data fall within a specific range, the boundary (marginal) likelihood values within that range are calculated to determine which kernel function to choose. The boundary likelihood value is calculated as follows.

$$\log p(Y \mid x, k) = -\frac{1}{2} y^{T} K^{-1} y - \frac{1}{2}\log|K| - \frac{n}{2}\log 2\pi, \tag{31}$$

where $Y$ is the observation vector, $x$ is the input feature matrix, $K$ is the covariance matrix generated by the kernel function, and $n$ is the number of observations. If the marginal likelihood of the RBF kernel is higher, the data tend to be smoother. If the marginal likelihood of the Matérn kernel is higher, the data may be rougher. This dynamic decision mechanism therefore improves the reliability of the CHG prior process and the stability of the training parameters.
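
The selection rule can be sketched as follows: evaluate the log marginal likelihood under each candidate kernel and keep the kernel with the higher value. The sketch assumes one-dimensional inputs, unit kernel hyperparameters, a Matérn kernel with ν = 3/2, and a small diagonal noise term added for numerical stability; none of these settings come from the paper, and the helper names are hypothetical.

```python
import numpy as np


def log_marginal_likelihood(K, y, noise_var=1e-4):
    """-0.5 y^T K^-1 y - 0.5 log|K| - (n/2) log(2 pi), via a Cholesky factor."""
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    L = np.linalg.cholesky(np.asarray(K, dtype=float) + noise_var * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log|K| = 2 * sum(log(diag(L))), so 0.5 * log|K| = sum(log(diag(L))).
    return float(-0.5 * y @ alpha
                 - np.sum(np.log(np.diag(L)))
                 - 0.5 * n * np.log(2.0 * np.pi))


def select_kernel(x, y, noise_var=1e-4):
    """Return the name of the kernel with the higher log marginal likelihood."""
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - x[None, :])
    candidates = {
        "rbf": np.exp(-0.5 * d ** 2),
        "matern32": (1.0 + np.sqrt(3.0) * d) * np.exp(-np.sqrt(3.0) * d),
    }
    scores = {name: log_marginal_likelihood(K, y, noise_var)
              for name, K in candidates.items()}
    return max(scores, key=scores.get), scores


# Hypothetical smooth signal sampled on a regular grid.
x = np.linspace(0.0, 5.0, 20)
best, scores = select_kernel(x, np.sin(x))
print(best, scores)
```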

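3) FPPI synergy with the RBF kernel function based on an attention-mechanism dynamic scaling factor

The improved boundary likelihood value of the FPPI is given by Eq. (32).

$$P(V) = \frac{1}{\sigma_V\sqrt{2\pi}} \exp\left(-\frac{1}{2}\,\frac{(V-\mu_V)^2}{\sigma_V^2}\right). \tag{32}$$

The confidence interval of the FPPI is considered through Eqs. (33)–(35).

$$P(V \mid \mathrm{FPPI}) = P(V)\,P(\mathrm{FPPI} \mid V, \sigma, W, a_i), \tag{33}$$

$$f(x) = \sum_{i=1}^{J} w_i\,\phi_i(x), \qquad w_i \sim N\left(0, \frac{\sigma^2}{J}\right), \tag{34}$$

$$\phi_i(x) = \exp\left(-\frac{(x-c_i)^2}{2l^2 a}\right), \tag{35}$$

where $f(x)$ is the internal Gaussian function of CHG, $w_i$ is the corresponding random weight, and $\phi_i$ is the RBF centered at $c_i$.

The resulting final RBF kernel function is defined by Eq. (36).

$$K(x,x') = \frac{\sigma^2}{J}\sum_{i=1}^{J} \exp\left(-\frac{(x-c_i)^2}{2l^2 a}\right)\exp\left(-\frac{(x'-c_i)^2}{2l^2 a}\right). \tag{36}$$

The optimized expression of the RBF is defined by Eq. (37).

$$\phi_j(x) = \exp\left(-\frac{\|x-c_j\|^2}{2\sigma_j^2}\right). \tag{37}$$

The dynamic scaling factor is defined by Eq. (38).

$$\sigma_j = \sigma_{\mathrm{base}} \cdot \psi(P_{\mathrm{FPI}}, P_{\mathrm{TPI}}, \mathrm{GSI}, \sigma_c). \tag{38}$$

The scaling factor based on the attention mechanism is defined by Eq. (39).

$$\psi = \mathrm{Norm}\left(\mathrm{Attention}(f_{\mathrm{query}}, f_{\mathrm{key}}, f_{\mathrm{value}}) + \mathrm{Residual}\right). \tag{39}$$

The optimized RBF kernel function is defined by Eq. (40).

$$k_{\mathrm{RBF}}^{\mathrm{opt}}(x,x') = \sigma_f^2 \exp\left(-\frac{\|x-x'\|^2}{2l^2\psi^2}\right). \tag{40}$$

The FPPI prediction function is coupled with the velocity field as shown in Eq. (41).

$$\mathrm{FPPI}_{\mathrm{new}} = f_c(\mathrm{FPPI}_{\mathrm{basic}}, W, a)\,G(W,a). \tag{41}$$

The moment factor is calculated by Eq. (42).

$$G(W,a) = \frac{W_{\mathrm{TBM}}\,v_f\,\tau(W,a)}{v_{\mathrm{exc}}}. \tag{42}$$

The strength parameter is calculated by Eq. (43).

$$W = \sum_{i=1}^{n} w_i^2, \tag{43}$$

where $\sigma_j$ represents the dynamic bandwidth parameter, obtained as the product of the baseline bandwidth $\sigma_{\mathrm{base}}$ and the scaling factor $\psi$, enabling adaptive adjustment of the kernel function to diverse data characteristics; FPPI, serving as a comprehensive intermediate variable, directly participates in the dynamic adjustment of the kernel function, embedding tunnel engineering semantics into the algorithm; and $\psi$ is the dynamic scaling factor generated by the attention mechanism, derived through the interactive calculation of the three feature vectors $f_{\mathrm{query}}$, $f_{\mathrm{key}}$, and $f_{\mathrm{value}}$, enabling the model to apply importance weighting to different feature dimensions.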
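
The effect of the scaling factor on the kernel can be illustrated with a short sketch. It treats ψ as a given positive scalar rather than computing it from an attention layer, since the paper does not spell out that layer in this section; the function name and the numeric values are hypothetical. The kernel follows the optimized RBF form, in which the squared length scale is multiplied by ψ².

```python
import numpy as np


def scaled_rbf_kernel(x1, x2, psi, signal_var=1.0, length_scale=1.0):
    """RBF kernel whose effective length scale is length_scale * psi."""
    if psi <= 0:
        raise ValueError("psi must be positive")
    x1 = np.atleast_2d(np.asarray(x1, dtype=float))
    x2 = np.atleast_2d(np.asarray(x2, dtype=float))
    d2 = np.sum((x1[:, None, :] - x2[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-d2 / (2.0 * length_scale ** 2 * psi ** 2))


# Two points one unit apart: psi > 1 widens the kernel (higher correlation),
# psi < 1 narrows it (lower correlation).
a, b = [[0.0]], [[1.0]]
baseline = scaled_rbf_kernel(a, b, psi=1.0)[0, 0]
widened = scaled_rbf_kernel(a, b, psi=2.0)[0, 0]
narrowed = scaled_rbf_kernel(a, b, psi=0.5)[0, 0]
print(baseline, widened, narrowed)
```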

4) CHG algorithm post-processing

During the CHG posterior process, Chebyshev’s inequality is introduced as a confidence interval criterion to make the confidence interval of the final output more robust. Chebyshev’s inequality can be expressed as Eq. (44).

$$P(|X-\mu| \ge k\sigma) \le \frac{1}{k^2}, \tag{44}$$

where $X$ is a random variable, $k$ is a positive constant, $\mu$ is the mean of the random variable, and $\sigma$ is the standard deviation of the random variable.

The value of $k$ at a given confidence level $\alpha$ is calculated by Eq. (45).

$$k = \frac{1}{\sqrt{1-\alpha}}. \tag{45}$$

The confidence interval is constructed by Eq. (46).

$$\mathrm{CI} = [\mu - k\sigma,\ \mu + k\sigma]. \tag{46}$$
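
A short sketch of the Chebyshev interval follows. It assumes that α denotes the confidence level, so that the bound 1/k² equals 1 − α and k = 1/√(1 − α); the function names and the sample numbers are hypothetical.

```python
import math


def chebyshev_k(confidence):
    """Multiplier k such that at least `confidence` of the mass lies within k sigma."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return 1.0 / math.sqrt(1.0 - confidence)


def chebyshev_interval(mean, std, confidence):
    """Distribution-free interval [mean - k*std, mean + k*std]."""
    k = chebyshev_k(confidence)
    return mean - k * std, mean + k * std


k_95 = chebyshev_k(0.95)
interval_75 = chebyshev_interval(10.0, 2.0, 0.75)
print(k_95, interval_75)
```

At the 95% level the multiplier is about 4.47, compared with 1.96 for a two-sided Gaussian interval, which is the sense in which the Chebyshev interval is more conservative: it holds for any distribution with finite variance.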

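Posterior predictive distribution:

$$p(y_* \mid X, Y, x_*) = N(y_* \mid \mu_*, \sigma_*^2). \tag{47}$$

Posterior mean:

$$\mu_* = k_*^{T}\left(K + \sigma_n^2 I\right)^{-1} y. \tag{48}$$

Posterior variance:

$$\sigma_*^2 = k_{**} - k_*^{T}\left(K + \sigma_n^2 I\right)^{-1} k_*. \tag{49}$$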
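
The posterior mean and variance can be computed with a few lines of NumPy. This is a minimal sketch of standard GP regression with a fixed RBF kernel on one-dimensional inputs, not the CHG implementation; the helper names, the noise level, and the toy data are assumptions.

```python
import numpy as np


def rbf(a, b, signal_var=1.0, length_scale=1.0):
    """RBF kernel matrix between two 1-D input arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-d2 / (2.0 * length_scale ** 2))


def gp_posterior(x_train, y_train, x_test, noise_var=1e-6):
    """Posterior mean k*^T (K + s I)^-1 y and variance k** - k*^T (K + s I)^-1 k*."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    K = rbf(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_star = rbf(x_train, x_test)            # shape (n_train, n_test)
    k_star_star = rbf(x_test, x_test)
    mean = k_star.T @ np.linalg.solve(K, y_train)
    cov = k_star_star - k_star.T @ np.linalg.solve(K, k_star)
    # Clip tiny negative values caused by round-off.
    return mean, np.clip(np.diag(cov), 0.0, None)


# Toy data: the posterior should reproduce the observations at the training
# inputs and become more uncertain away from them.
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.sin(x_obs)
mean_at_obs, var_at_obs = gp_posterior(x_obs, y_obs, x_obs)
_, var_far = gp_posterior(x_obs, y_obs, np.array([10.0]))
print(mean_at_obs, var_at_obs, var_far)
```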

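The Chebyshev-optimized prediction is defined by Eq. (50).

$$S = \begin{cases} \mu_S, & \text{if } |S - \mu_S| < k\sigma_S, \\ f_c(\mathrm{FPPI}_{\mathrm{new}}, W, a)\,G(W,a), & \text{otherwise.} \end{cases} \tag{50}$$

The improved boundary likelihood value is defined by Eq. (51).

$$P(V) = \frac{1}{\sigma_V\sqrt{2\pi}} \exp\left(-\frac{1}{2}\,\frac{(V-\mu_V)^2}{\sigma_V^2}\right). \tag{51}$$

The confidence interval of the FPPI is considered through Eq. (52).

$$P(V \mid \mathrm{FPPI}) = P(V)\,P(\mathrm{FPPI} \mid V, \sigma, W, a_i). \tag{52}$$

The final prediction function is defined by Eq. (53).

$$S = \omega\,f_c(\mathrm{FPPI}_{\mathrm{basic}}, \tau_{\mathrm{exc}}) + (1-\omega)\,G(W, a_i), \tag{53}$$

where $\sigma_V$ denotes the standard deviation associated with the boundary value, $V$ and $\mu_V$ represent the boundary value and its mean value, respectively, $P(\mathrm{FPPI} \mid V, \sigma, W, a_i)$ describes the probability distribution of FPPI given the boundary value $V$ and other parametric conditions, and $\mathrm{FPPI}_{\mathrm{basic}}$ refers to the basic FPPI.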

4 Model training

4.1 Control of overfitting

Overfitting constitutes a central challenge in machine learning, fundamentally characterized by a model’s superior performance on training data contrasted with substantial degradation on unseen data [9,36,58]. From a statistical perspective, overfitting manifests as an imbalance where model variance is excessively high while bias remains unduly low. Within the theoretical framework of function approximation, overfitting arises when the model’s complexity exceeds the intrinsic complexity of the target function, thereby capturing random noise in the training data rather than the underlying true distributional patterns. This phenomenon is particularly salient in kernel-based learning paradigms, attributable to the inherently high representational capacity of kernel functions; especially when the number of centers approximates or equals the training sample size, the model risks perfect memorization of the training instances at the expense of learning their fundamental structural essence.

In the realm of machine learning, regularization serves as a pivotal technique for managing model complexity and mitigating overfitting. Conventional kernel methods employ straightforward ridge regression regularization, achieved through the optimization of objective functions as exemplified in Eq. (54).

$$\min_{w}\ \|Kw-y\|^{2}+\lambda\|w\|^{2}, \tag{54}$$

where K denotes the kernel matrix, w represents the weight vector to be determined, y signifies the target value vector, and λ>0 serves as the scalar parameter controlling the regularization intensity. The kernel matrix is constructed based on the RBF kernel function in Eq. (29) to form Eq. (55).

$$K(x,x')=\sigma_f^{2}\exp\left(-\frac{\|x-x'\|^{2}}{2l^{2}}\right), \tag{55}$$

where $\sigma_f^{2}$ is the signal variance (controlling the output amplitude), $l$ is the length scale (controlling the function smoothness), and $\|x-x'\|$ is the Euclidean distance between input vectors. This methodology is extensively utilized in conventional GPs and SVMs; however, it manifests pronounced limitations when confronting intricate non-stationary data sets: fixed kernel parameters and a uniform regularization intensity inadequately accommodate local variations in data characteristics. The analysis of extant overfitting mitigation strategies is delineated in Table 3.

To surmount this limitation, the CHG algorithm introduces an innovative multi-level regularization framework, extending the optimization problem to Eq. (56).

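$$\min_{w,\theta}\ \|K_{\phi_\theta}w-y\|^{2}+\lambda_{\phi_\theta}\|w\|^{2}+\alpha\|\theta\|^{2}+R_{\text{dropout}}(\theta)+R_{\text{BN}}(\theta), \tag{56}$$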

where $K_{\phi_\theta}$ is the kernel matrix dynamically generated by the parameter network $\phi_\theta$, $\lambda_{\phi_\theta}>0$ is the adaptive regularization parameter generated by the parameter network, and $\theta$ constitutes the weight ensemble of the parameter network. $\alpha>0$ serves as the hyperparameter regulating the regularization intensity of neural network weights, $R_{\text{dropout}}(\theta)$ represents the implicit regularization term induced by dropout, and $R_{\text{BN}}(\theta)$ denotes the implicit regularization term introduced by Batch Normalization.

The paramount innovation of this extension resides in transmuting the static kernel matrix K and the fixed regularization coefficient λ into dynamically generated Kϕθ and λϕθ by the neural network ϕθ, concurrently incorporating multifaceted regularization terms targeting the neural network parameters θ. This architectural paradigm empowers the model to adaptively modulate the kernel function morphology and regularization intensity in accordance with the local idiosyncrasies of input data, thereby furnishing differentiated fitting strategies across disparate data domains.

Delving deeper from the vantage point of conventional kernel methodologies, the analytical solution for kernel weights is customarily articulated as Eq. (57).

$$w=(K^{T}K+\lambda I)^{-1}K^{T}y, \tag{57}$$

where $K^{T}K$ represents the Gram matrix of the kernel, $I$ denotes the identity matrix, and $\lambda>0$ is the globally uniform regularization parameter. This analytical form originates from the Lagrange multiplier method, essentially constraining the model complexity within the function space. The key breakthrough of the CHG algorithm lies in replacing this fixed parameter $\lambda$ with a dynamic parameter $\lambda_{\phi_\theta}(X)$ generated by a neural network output, thereby yielding a more flexible analytical expression, as illustrated in Eq. (58).

$$w=(K^{T}K+\lambda_{\phi_\theta}(X)I)^{-1}K^{T}y, \tag{58}$$

where λϕθ(X)>0 is an adaptive regularization parameter generated by the neural network ϕθ based on the input data X. This parameterization approach achieves a fundamental transition from a “globally uniform regularization strength” to a “data-dependent adaptive regularization strength,” enabling the model to impose stronger regularization in regions with complex data distributions while relaxing constraints in areas with simpler distributions, thereby better aligning with the intrinsic structure of non-stationary data.
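The closed-form solutions in Eqs. (57) and (58) differ only in where the regularization strength comes from. The sketch below solves the same linear system for a fixed λ and for a data-dependent λ. The function `toy_lambda` is a purely hypothetical stand-in for the parameter network, whose actual architecture is not reproduced here; the data, centers, and kernel settings are likewise assumptions.

```python
import numpy as np

def ridge_kernel_weights(K, y, lam):
    """Solve (K^T K + lam * I) w = K^T y for w, as in Eqs. (57)-(58)."""
    n_centers = K.shape[1]
    A = K.T @ K + lam * np.eye(n_centers)
    return np.linalg.solve(A, K.T @ y)

def toy_lambda(X, base=1e-2):
    """Hypothetical stand-in for the parameter network: more spread, more shrinkage."""
    return base * (1.0 + float(np.var(X)))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
centers = X[:10]                                   # assumed choice of centers
d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-d2 / 2.0)                              # RBF features, sigma_f = l = 1
y = np.sin(X[:, 0])

w_fixed = ridge_kernel_weights(K, y, lam=1e-2)            # Eq. (57)
w_adapt = ridge_kernel_weights(K, y, lam=toy_lambda(X))   # Eq. (58), toy lambda
```

Solving the linear system directly is preferred over forming the matrix inverse, and a larger λ always yields a weight vector with a smaller norm.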

Delving further into the parameter network ϕθ level, the CHG algorithm constructs a multi-level collaborative regularization framework that integrates the most effective regularization techniques from modern deep learning. Primarily, through the Dropout mechanism, the activation values of each hidden layer undergo random masking, as illustrated in Eq. (59).

$$h^{(l+1)}=f\left(W^{(l)}\left(h^{(l)}\odot m^{(l)}\right)\right), \tag{59}$$

where $h^{(l)}$ denotes the activation vector of the $l$th layer, $W^{(l)}$ represents the weight matrix of the $l$th layer, and $f(\cdot)$ is the nonlinear activation function (ReLU in the CHG framework). The symbol $\odot$ signifies element-wise multiplication (Hadamard product), while $m^{(l)}\in\{0,1\}^{d_l}$ is a binary mask vector in which each element is 1 with probability 0.8 and 0 with probability 0.2 (the latter corresponding to the neuron being “dropped out”).

From the perspective of ensemble learning theory, this mechanism is equivalent to implicitly training an exponential number of distinct networks with shared parameters, whose expected output can be expressed as Eq. (60):

$$\mathbb{E}[\phi_\theta(X)]\approx\frac{1}{M}\sum_{i=1}^{M}\phi_{\theta_i}(X), \tag{60}$$

where $M$ represents the number of networks in the ensemble (theoretically $2^{|\theta|}$, but practically limited to finite sampling) and $\phi_{\theta_i}$ denotes the configuration of the $i$th sub-network. This significantly reduces the variance of the parameter network, enhancing the robustness of kernel parameter generation. Concurrently, the Batch Normalization technique standardizes the activation values of each hidden layer, as defined by Eq. (61).
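Eqs. (59) and (60) can be illustrated with a single masked layer and a Monte Carlo average over repeated stochastic forward passes. The following is a sketch under simplifying assumptions: one layer, constant toy weights, and M = 200 samples, none of which are taken from the CHG configuration. It follows Eq. (59) literally and therefore does not rescale the kept activations, whereas common deep learning frameworks apply such a rescaling during training.

```python
import numpy as np

def dropout_layer(h, W, rng, keep_prob=0.8):
    """Eq. (59): mask the activations, apply the weights, then ReLU."""
    mask = (rng.random(h.shape) < keep_prob).astype(h.dtype)
    return np.maximum(W @ (h * mask), 0.0)

def mc_average(h, W, n_samples=200, seed=0):
    """Eq. (60): average M stochastic forward passes with shared weights."""
    rng = np.random.default_rng(seed)
    outs = [dropout_layer(h, W, rng) for _ in range(n_samples)]
    return np.mean(outs, axis=0)

# Hypothetical toy layer: 16 inputs of value 1, all weights 0.1.
h = np.ones(16)
W = np.full((4, 16), 0.1)
avg = mc_average(h, W)   # close to 0.1 * 16 * 0.8 = 1.28 per output unit
```

Each pass samples a different sub-network, so the average approximates the ensemble expectation with a variance that shrinks as the number of samples grows.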

$$\hat{h}^{(l)}=\frac{h^{(l)}-\mu_B}{\sqrt{\sigma_B^{2}+\varepsilon}}, \tag{61}$$

where $h^{(l)}$ denotes the raw activation vector of the $l$th layer, $\hat{h}^{(l)}$ represents the normalized activation vector, $\mu_B$ is the mean of each feature in the mini-batch $B$, $\sigma_B^{2}$ is the variance of each feature in the mini-batch $B$, and $\varepsilon=10^{-5}$ is a small constant added for numerical stability.
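A minimal NumPy sketch of the normalization in Eq. (61) is given below. It covers only the standardization step written in the equation; the learnable scale and shift parameters and the running statistics used at inference time in typical implementations are omitted, and the batch of random activations is an assumption for illustration.

```python
import numpy as np

def batch_norm(h, eps=1e-5):
    """Eq. (61): standardize each feature over the mini-batch (rows = samples)."""
    mu_b = h.mean(axis=0)
    var_b = h.var(axis=0)
    return (h - mu_b) / np.sqrt(var_b + eps)

# Hypothetical mini-batch: 32 samples, 4 features, arbitrary mean and scale.
rng = np.random.default_rng(0)
h = rng.normal(loc=3.0, scale=2.0, size=(32, 4))
h_hat = batch_norm(h)
```

After the transformation each feature has approximately zero mean and unit variance within the batch, regardless of its original scale.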

This not only accelerates training convergence but also provides a powerful implicit regularization effect. Theoretically, Batch Normalization imposes a Lipschitz constraint on the network mapping, limiting the sensitivity of the function response to input variations, thereby preventing the model from overfitting to minor perturbations in the training data. This smoothing effect bears resemblance to bandwidth control in traditional kernel methods, albeit in a more flexible and adaptive manner.

At the optimization algorithm level, the CHG model employs the AdamW optimizer to achieve decoupled weight decay, with parameter updates following Eq. (62).

$$\theta_t=\theta_{t-1}-\eta\left(\nabla_\theta L+\lambda_{\text{decay}}\theta_{t-1}\right), \tag{62}$$

where $\theta_t$ denotes the parameter vector at the $t$th step, $\eta>0$ represents the learning rate (initially set to 0.001 in the CHG framework), $\nabla_\theta L$ signifies the gradient of the loss function with respect to the parameters, and $\lambda_{\text{decay}}=0.0001$ serves as the weight decay coefficient.
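Eq. (62) writes the update compactly. In the standard AdamW algorithm, the gradient term is replaced by the bias-corrected ratio of first and second moment estimates, while the decay term acts directly on the parameters. The sketch below shows one such step in NumPy. The moment coefficients (0.9, 0.999) and the stabilizer 1e-8 are commonly used defaults and are assumptions here, since the study reports only the learning rate and the weight decay coefficient.

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, lr=1e-3, weight_decay=1e-4,
               beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdamW update; the decay term is applied outside the moment estimates."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad**2
    m_hat = m / (1.0 - beta1**t)          # bias-corrected first moment
    v_hat = v / (1.0 - beta2**t)          # bias-corrected second moment
    theta = theta - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * theta)
    return theta, m, v

# With a zero gradient, only the decoupled decay acts on the parameters.
theta = np.array([1.0, -2.0])
state = (np.zeros(2), np.zeros(2))
theta_new, *state = adamw_step(theta, np.zeros(2), *state, t=1)
```

Because the decay term never passes through the moment estimates, its strength does not depend on the magnitude of recent gradients.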

Notably, the weight decay term $\lambda_{\text{decay}}\theta_{t-1}$ is decoupled from the gradient term $\nabla_\theta L$, diverging from the implementation in traditional Adam. In contrast to conventional L2 regularization, AdamW separates weight decay from gradient updates, ensuring that the regularization effect remains unaffected by gradient magnitudes. This mechanism maintains stable parameter constraints even in regions exhibiting abrupt changes in gradient directions, which is paramount for upholding the stability of the kernel parameter network. Finally, an explicit L2 regularization term on the feature importance vector is incorporated into the overall loss function, as defined by Eq. (63).

$$L_{\text{total}}=L_{\text{MSE}}+\alpha\|s\|^{2}, \tag{63}$$

where $L_{\text{MSE}}=\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^{2}$ denotes the mean squared error loss, $s$ represents the feature importance vector (scale_factors), which governs the weights of each feature dimension in the anisotropic kernel, $\alpha=0.001$ serves as the hyperparameter controlling the L2 regularization strength, and $\|s\|^{2}=\sum_{j=1}^{d}s_j^{2}$ signifies the squared L2 norm of the feature importance vector. This design ensures that the feature importances within the anisotropic kernel function do not excessively concentrate on a minority of dimensions, thereby preventing the model from becoming overly sensitive to individual features and enhancing its robustness to variations in feature distributions.
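The loss in Eq. (63) amounts to a mean squared error plus a scaled sum of squared feature importances, which the following sketch evaluates directly. The function name and the numerical inputs are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def total_loss(y_true, y_pred, scale_factors, alpha=1e-3):
    """Eq. (63): MSE plus alpha times the squared L2 norm of scale_factors."""
    mse = np.mean((y_true - y_pred) ** 2)
    penalty = alpha * np.sum(scale_factors ** 2)
    return mse + penalty

# Hypothetical values: MSE = (0 + 4) / 2 = 2, penalty = 0.001 * 25 = 0.025.
loss = total_loss(np.array([1.0, 2.0]),
                  np.array([1.0, 4.0]),
                  np.array([3.0, 4.0]))
```

The penalty grows quadratically with each importance value, so concentrating weight on a single feature costs more than spreading the same total across several.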

The regularization framework proposed in this study is not a mere superposition but forms an organically unified hierarchical system that synergistically controls model complexity across different scales and spaces—from Tikhonov regularization in the function space [59], to importance constraints in the feature space, weight decay in the parameter space, and finally to dropout and batch normalization in the representation space. This multi-dimensional, multi-scale regularization strategy enables the CHG algorithm to maintain high expressive capacity while effectively managing model complexity, providing a theoretically sound and practically effective solution for complex nonlinear regression tasks.

Figure 7 presents a comparative visualization of regularization techniques, where it is evident that the CHG algorithm achieves an R2 score of 98.2%, substantially surpassing traditional L1, L2, and Elastic Net regularization methods (ranging from 75.2% to 75.9%). Concurrently, the CHG algorithm demonstrates marked superiority in error metrics such as RMSE, MAE, and generalization gap, registering values of merely 4.8%, 3.2%, and 1.5%, respectively, underscoring its enhanced predictive accuracy. Particularly noteworthy is the CHG algorithm’s exhibition of an exceedingly low generalization gap, indicating that despite attaining an exceptionally high R2 value, it effectively mitigates overfitting, thereby validating the reliability and efficacy of its regularization mechanisms.

4.2 Experimental repeatability and configuration

Experimental repeatability serves as a pivotal metric for evaluating the reliability and robustness of machine learning models. The performance of machine learning models is susceptible to various stochastic factors, including parameter initialization, data set partitioning, and the inherent randomness in optimization algorithms. In accordance with the ISO 13528 standard [60], a robust model should exhibit high repeatability, manifesting as consistent outcomes across multiple training iterations under similar conditions. Particularly for intricate neural network architectures, owing to the stochasticity in random initialization and optimization processes, it is imperative to conduct repeated experiments to ensure the scientific validity and reliability of reported results.

In this study, we employ a suite of machine learning evaluation metrics to comprehensively assess the experimental repeatability of the CHG algorithm, encompassing: 1) statistical distribution analysis of performance metrics; 2) residual analysis; 3) CV analysis for predictions; 4) comparative analysis between the optimal model and average performance. These methodologies collectively form a comprehensive framework for model stability evaluation. Detailed experimental performance metrics are illustrated in Figs. 8–15. The tunnel collapse prediction outcomes in this study are elaborated in Subsection 5.2, instance collapse point prediction.

4.2.1 Experimental design

We implemented a rigorous experimental protocol to comprehensively evaluate the stability and reliability of the CHG algorithm.

1) Experimental repetition

Twenty identical training-testing experiments were conducted to ensure statistical significance.

2) Data partitioning

While utilizing the consistent original data set across all experiments, we employed varying random seeds (42 + 5 × rep) for training-validation splits to assess robustness against data partitioning variance.

3) Test set consistency

A fixed test set was maintained across all experimental iterations to facilitate direct performance comparability.

4) Architectural consistency

All experiments utilized identical network architecture ([512,256,128,64]) and hyperparameters (learning rate = 0.001, batch size = 32, centers = 100). Specific hyperparameter configurations are shown in Table 4.
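The seeding scheme in the protocol above can be sketched as follows. The snippet assumes that the repetition index rep runs from 0 to 19, that the test set has already been removed, and that 20% of the remaining samples are held out for validation; the sample count and the validation fraction are illustrative assumptions, not values reported in this study.

```python
import numpy as np

N_REPEATS = 20
seeds = [42 + 5 * rep for rep in range(N_REPEATS)]   # assumes rep = 0, ..., 19

def split_train_val(n_samples, seed, val_fraction=0.2):
    """Shuffle the non-test indices with the given seed and split them."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_val = int(round(val_fraction * n_samples))
    return idx[n_val:], idx[:n_val]                  # (train, validation)

# One train/validation partition per repetition; the test set stays fixed.
splits = [split_train_val(1000, s) for s in seeds]
```

Deriving every seed from the repetition index makes each partition reproducible while still varying the split between repetitions.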

4.2.2 Evaluation metrics

To comprehensively evaluate the model performance, we adopt multiple complementary evaluation metrics, as detailed in Table 5.

Figure 8 illustrates the scatter plot relationship between accuracy and R2 coefficients across 20 experimental runs. As depicted, accuracy exhibits a significant positive correlation with R2 (correlation coefficient r = 0.86, p < 0.001), indicating intrinsic consistency between these two evaluation metrics in assessing model performance. Models with high accuracy (> 85%), marked by red stars, also demonstrate superior R2 values (R2 > 0.82), underscoring the dual advantages of the CHG algorithm in both classification and regression tasks. The sample distribution in the scatter plot is relatively concentrated, with the majority situated in the high-performance region (upper-right quadrant), signifying the algorithm’s robust performance stability. Figure 9 presents the boxplot distributions of residuals across different experiments. It is observable that the residuals generally exhibit symmetric distributions, with median lines proximate to zero, indicating the absence of systematic bias in model predictions. The interquartile ranges (IQR) are comparable across most experiments, with few outliers, reflecting predictive stability. The similarity in residual distributions between experiments further substantiates the algorithm’s repeatability. Notably, certain experiments (e.g., Experiment #8 and #13) display narrower residual distributions, corresponding to higher predictive precision.

Figure 10 compares the predictive performance of the optimal model (Experiment #17) against the average across all experiments. The predictions from the optimal model (in red) cluster more tightly around the ideal prediction line (diagonal), whereas the average predictions (in blue), while following the general trend, exhibit slight systematic deviations in certain regions. Statistical analysis reveals that the Pearson correlation coefficient between the optimal model’s predictions and actual values is 0.94, markedly higher than the 0.89 for average predictions. This disparity is particularly pronounced in extreme value predictions, highlighting the optimal model’s superiority in capturing nonlinear data relationships. Figure 11 delineates the comparison between the optimal model (Experiment #17) and average performance across four key metrics. This juxtaposition clearly demonstrates the magnitude of the optimal model’s advantages over the average, validating the statistical significance of the selected optimal model’s representativeness. Importantly, the performance uplift of the optimal model relative to the average falls within a statistically significant range (approximately 2σ), affirming the reliability and representativeness of its results.

Figure 12 depicts the scatter relationship between residuals from the optimal model and predicted values. The residuals are observed to scatter randomly around zero without discernible patterns, conforming to the homoscedasticity assumption of a sound regression model. The LOWESS trend line (in green) approximates horizontality and adheres closely to the zero line, indicating no apparent correlation between residuals and predicted values, thereby confirming the absence of systematic bias in model predictions. The residual distribution remains uniform across the range of predicted values, demonstrating the CHG algorithm’s consistent predictive capability throughout the prediction spectrum. Figure 13 analyzes the relationship between absolute residuals and predicted values, providing crucial insights into prediction uncertainty. The distribution of data points reveals slightly elevated absolute residuals for smaller predicted values (0.2–0.4 range) and larger ones (0.7–0.9 range), suggesting marginally reduced precision in extreme value predictions. The trend line indicates that the mean absolute residual (MAE = 0.031) maintains relative stability across different predicted value intervals, evidencing the consistency of the model’s predictive accuracy. Across the entire figure, 95% of points fall within an absolute residual < 0.08, quantifying the reliable bounds of prediction precision.

Figure 14 displays the distribution of prediction coefficient of variation (CV), sorted by predicted means. Low CV values signify high consistency in predictions across experiments, serving as a vital indicator of model stability. As shown, 60.4% of samples exhibit CV values below 10%, indicating substantial consistency in most prediction results across experiments. The CV exhibits a slight downward trend with increasing predicted values (red line), suggesting superior consistency in predictions for higher values. A minority of samples with high variation coefficients (> 40%) are primarily concentrated in regions of lower predicted values, attributable to their proximity to decision boundaries or representation of rare patterns.
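The per-sample coefficient of variation discussed above can be computed as in the sketch below, which takes a matrix of predictions with one row per experimental run. The use of the sample standard deviation (ddof = 1) and the three-run toy matrix are assumptions; the study does not state which estimator was used.

```python
import numpy as np

def prediction_cv(preds):
    """preds has shape (n_runs, n_samples); returns CV in percent per sample."""
    mean = preds.mean(axis=0)
    std = preds.std(axis=0, ddof=1)
    return 100.0 * std / np.abs(mean)

# Hypothetical predictions from three runs for two samples.
preds = np.array([[0.50, 0.10],
                  [0.52, 0.20],
                  [0.48, 0.15]])
cv = prediction_cv(preds)
share_below_10 = float(np.mean(cv < 10.0))   # fraction of samples with CV < 10%
```

Because the standard deviation is divided by the mean, samples with small predicted values naturally show larger CV values for the same absolute spread, which is consistent with the trend described for Fig. 14.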

4.3 Experiments with varying training set proportions

In addition to the aforementioned 20 repeated experiments, we conducted a separate analysis on the impact of training set size. To strictly adhere to the principle of controlling variables, these experiments were independent of the primary repeatability experiments, employing a fixed random seed for training across proportions.

1) Variation in training set proportions

This study subsampled subsets of varying scales from the original training set, with proportions ranging from 60% to 90% in 10% increments, establishing four distinct scale levels.

2) Fixed random seed

For each training set proportion experiment, the same random seed (42) was utilized for data partitioning, ensuring that differences between proportions stem solely from variations in data volume.

3) Cross-validation

For each training set scale level, 5-fold cross-validation was employed to evaluate model performance, further enhancing the reliability of results.

4) Sampling strategy

Stratified random sampling was used to maintain class distributions in each scale’s training set consistent with the original data.
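The stratified subsampling with a fixed seed described in the steps above can be sketched as follows. The snippet uses only NumPy, assumes integer class labels, and omits the 5-fold cross-validation step; the 70/30 class split in the example is hypothetical.

```python
import numpy as np

def stratified_subsample(labels, proportion, seed=42):
    """Return sorted indices that keep `proportion` of each class."""
    rng = np.random.default_rng(seed)
    keep = []
    for cls in np.unique(labels):
        members = np.flatnonzero(labels == cls)
        n_keep = int(round(proportion * len(members)))
        keep.append(rng.choice(members, size=n_keep, replace=False))
    return np.sort(np.concatenate(keep))

# Hypothetical labels: 70 samples of class 0 and 30 samples of class 1.
labels = np.array([0] * 70 + [1] * 30)
subsets = {p: stratified_subsample(labels, p) for p in (0.6, 0.7, 0.8, 0.9)}
```

Sampling within each class separately keeps the class ratio of every subset equal to that of the original data, so differences between proportions reflect data volume rather than class imbalance.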

Figure 15 quantifies the statistical characteristics of model prediction residuals under four training proportions using boxplots, with the red dashed line representing the zero-residual baseline. The symmetry of residual distributions reflects the model’s lack of systematic prediction bias. As the training set proportion increases from 60% to 90%, the box heights (IQR) progressively diminish, with interquartile distances narrowing, demonstrating that larger training sets significantly enhance model prediction stability. Figure 16 illustrates the probability density distributions of CHG algorithm model weights under different training set scales (60%–90%), reflecting the pattern whereby model parameters stabilize and concentrate with increasing training data volume. The morphological changes in weight distributions reveal that, under a fixed random seed, training data scale is a key variable influencing model convergence.

5 Results and comparison

5.1 Local validation of the radial basis function kernel function in the comprehensive hybrid gaussian algorithm

Figure 17 illustrates how the RBF (red asterisks) captures the distribution characteristics of the input data. This is a conceptual description diagram, where Feature 1 corresponds to the relationship between FPPI and the tunnel stability index (S) in this study, and Feature 2 corresponds to the relationship between FPPI and the tunnel stability index (S) in this study. Figure 18 illustrates the distribution of training data for the FPPI indicator parameters (thrust force, penetration rate, geological influence) based on the original data set collected from Figs. 4 and 5 and Eqs. (25)–(41). Figure 19 illustrates the residual Q–Q plot effectiveness evaluation and prediction accuracy evaluation. By introducing a dynamic scaling factor, the PSI was controlled to 1.6. For the Weizigou tunnel collapse indicator S, the RBF of the CHG algorithm achieved an RMSE reduction of approximately 15%, an MAE reduction of approximately 13%, and an R2 improvement of approximately 8% relative to the original GP algorithm.

Figure 20 presents a side-by-side comparison of the models’ predictions. The optimised RBF model (determination coefficient R2 = 0.982) demonstrates the highest accuracy in predicting the FPPI intermediate variable and consequently tunnel stability, with data points distributed closely around the ideal line. In contrast, the SVR model (R2 = 0.846) performs significantly worse, with data points scattered widely and deviating substantially from the ideal line. The random forest model (R2 = 0.923) falls between the two, outperforming the SVR model but not the optimised RBF model. Figure 21 compares the error distributions. The error distribution curve of the optimised RBF model is the narrowest, with a peak close to zero, indicating minimal prediction errors and high consistency. The SVR model exhibits a wider and flatter error distribution, indicating larger prediction errors and strong variability. The error distribution of the random forest model lies between the two. In predicting tunnel stability (S) with FPPI as an intermediate variable, the optimised RBF model therefore outperforms the other two models, attaining the highest accuracy (R2 = 0.982, in contrast to 0.846 for SVR and 0.923 for random forest) and the lowest mean squared error (4.19, in comparison to 36.64 for SVR and 18.39 for random forest).

5.2 Instance collapse point prediction

Figure 22 illustrates the collapse and damage prediction of the tunnel cross-sections at the Shimenzi Reservoir, Weizigou Area, and Hengbei River Section. The three primary geological formations are granitic rock, gneiss, and kaolinite. Each sub-figure is divided into upper and lower parts. The upper part illustrates the tunnel cross-section, with a circular black line denoting the tunnel boundary (radius 5 m). The colored markers represent different zones and instability modes: 1) yellow squares: kaolinite-rich zones; 2) blue stars: water inflow points; 3) red dots: collapse zones; 4) orange triangles: softened zones; 5) green dots: stable zones. The directions North, East, South, and West are indicated by the letters N, E, S, and W, respectively. The lower part illustrates the longitudinal section of the tunnel, with two parallel black lines denoting the upper and lower boundaries of the tunnel, respectively. The distribution of different failure modes along the longitudinal axis is shown by color-coded markers.

Figure 23(a) characterizes the Shimenzi Reservoir, where the proportion of collapse points is the highest, approximately 23.5%, with collapses primarily concentrated in the tunnel top area, classified as high to critical risk.

Figure 23(b) characterizes the Weizigou Area, where the primary risk is Water_Inrush, accounting for approximately 26.7%. Water inrush primarily occurs at the bottom of the tunnel, classified as high risk.

Figure 23(c) characterizes the Hengbei River Section, where the primary risk is kaolinization, accounting for approximately 31.0%. Kaolinization is distributed around the tunnel in a regional pattern, classified as medium to high risk.

Figure 24 illustrates the three main collapse types based on the publicly available data in Table 1: 1) Shimenzi Reservoir: local collapse (Collapse) in areas with well-developed joints, accounting for approximately 18%; 2) Weizigou Area: water inrush (Water_Inrush) due to high permeability and reservoir-controlled water levels, accounting for approximately 25%; 3) Hengbei River Section: softening due to a kaolinite content of 45%, accounting for approximately 25%. The CHG algorithm had a prediction error of 30.56% at Shimenzi, 6.8% at Weizigou, and 24% at Hengbei, with an overall average prediction error of 20.453%.
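The reported site errors are numerically consistent with a relative deviation between the predicted proportions (Fig. 23) and the recorded proportions (Fig. 24). The sketch below reproduces that arithmetic. Reading the error as |predicted − recorded| / recorded is an inference from the published numbers, not a definition stated in the text, and the dictionary keys are shorthand labels.

```python
# Predicted dominant-mode proportions (Fig. 23) and recorded ones (Fig. 24), in %.
predicted = {"Shimenzi": 23.5, "Weizigou": 26.7, "Hengbei": 31.0}
recorded = {"Shimenzi": 18.0, "Weizigou": 25.0, "Hengbei": 25.0}

# Assumed error definition: relative deviation with respect to the recorded value.
errors = {
    site: 100.0 * abs(predicted[site] - recorded[site]) / recorded[site]
    for site in predicted
}
average_error = sum(errors.values()) / len(errors)
```

Under this reading, the three site errors round to 30.56%, 6.8%, and 24%, and averaging the rounded values gives the reported 20.453%.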

The elevated prediction error of 30.56% observed at the Shimenzi Project site originates from the geological transition zone phenomenon. This region features intricate structural configurations characterized by multiple lithological interfaces, giving rise to zones of abrupt variations in rock types and mechanical properties. The mechanical behavior within these geological transition zones exhibits distinctive nonlinear characteristics, markedly diverging from those observed in monolithic lithological domains. Figure 25 presents the geological transition zone profile of the Shimenzi area, elucidating the geological etiology underlying the high prediction error (30.56%) at this engineering site. The cross-sectional diagram vividly illustrates two primary lithological transition zones: the limestone-gravel fault contact belt (F3 fault zone) and the gravel-andesite unconformity surface. The F3 fault zone spans a width of 2–5 m with a dip angle ranging from 50° to 65°, wherein the UCS of the rock mass plummets from 80 to 110 MPa in the limestone sector to 30–50 MPa within the fault zone, constituting a decrement of 50%–70%. Pronounced structural fracturing in this locale engenders conspicuous interfaces of mechanical parameter discontinuities. Conversely, the gravel-andesite unconformity manifests an orientation of 35°∠40° with characteristic undulating wavy morphology, where the saturated cohesion of the gravel diminishes by 30%–40% upon water saturation, while the contiguous andesite preserves a high-strength blocky architecture (UCS 70–90 MPa), thereby engendering substantial disparities in engineering mechanical attributes. This multifaceted structural milieu, replete with multiple lithological junctions, precipitates precipitous alterations in the physico-mechanical properties of the rock mass, rendering the interplay between TBM excavation parameters and geological responses exceedingly intricate. 
Such profoundly heterogeneous geological transition zones manifest unique nonlinear mechanical behavioral traits that transcend the representational capacity of extant models. Although the model demonstrates commendable performance under homogeneous stratigraphic conditions, its predictive efficacy encounters certain constraints when confronted with geologically intricate terrains akin to Shimenzi, ultimately culminating in conspicuously elevated prediction errors.

6 Generalization verification and discussions

6.1 Generalization verification

According to pertinent literature, the progressive collapse phenomenon documented in the 60808–60838 m interval of the Yinchao Tunnel Project [50] has been subjected to high-precision cross-sectional visualization analysis employing the CHG algorithm, thereby elucidating its intrinsic mechanisms.

Figure 26 illustrates the relationship between UCS and geological strength index (GSI) for three rock types encountered during the Yinchao Tunnel excavation process. The Hoek–Brown classification system categorizes rock mass quality into several zones ranging from “very poor” to “very good”. Confidence ellipses surrounding each rock type cluster elucidate the variability in rock mechanical parameters. Granite gneiss exhibits superior mechanical performance (UCS approximately 120 MPa, GSI approximately 75), whereas the fractured zone manifests markedly lower values (UCS approximately 25 MPa, GSI approximately 35), falling into high failure risk zones that necessitate specialized excavation strategies and support measures.

Figure 27 depicts the relationship between specific energy consumption and penetration rate, color-coded according to UCS values. Theoretical penetration curves for different UCS values (25, 60, and 120 MPa) demonstrate an inverse proportionality between rock strength and TBM penetration rate. The optimal performance region (10–25 MJ/m3, 6–12 mm/rev) represents the conditions for TBM’s most efficient operation. Efficiency isolines (η = 0.2 to 1.0) showcase the theoretical relationship between penetration rate and specific energy (penetration rate = η·100/SE). The regression line (r = X.XX) confirms a statistically significant negative correlation between specific energy demand and achievable penetration rate across all geological conditions.
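The efficiency isolines quoted for Fig. 27 follow the relation penetration rate = η·100/SE, which can be evaluated as in the sketch below. The units (specific energy in MJ/m3, penetration rate in mm/rev) are taken from the axis ranges mentioned in the text, and the sample value of 20 MJ/m3 is an assumption chosen from within the stated optimal region.

```python
def penetration_rate(specific_energy, eta):
    """Isoline relation quoted for Fig. 27: rate = eta * 100 / SE."""
    if specific_energy <= 0:
        raise ValueError("specific energy must be positive")
    return eta * 100.0 / specific_energy

# Penetration rates on three isolines at an assumed specific energy of 20 MJ/m3.
rates = {eta: penetration_rate(20.0, eta) for eta in (0.2, 0.6, 1.0)}
```

Along any single isoline the penetration rate is inversely proportional to the specific energy, which matches the negative correlation described for the regression line.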

High-precision cross-sectional visualization analysis based on the CHG algorithm unveils the intrinsic mechanisms of progressive collapse in the 60808–60838 m interval of the Yinchao Tunnel. The figure clearly demonstrates the evolutionary characteristics of collapse events in response to varying geological conditions. At mileage 60812 m, the cross-sectional visualization of the granite gneiss segment reveals typical stress concentration-induced instability features. The sectional diagram depicts concentrated block detachment in the crown Q1 block (red markers), aligning with stress redistribution phenomena at the intersections of internal structural planes within the rock mass. The spatial distribution of instability points indicates a locally controlled collapse mode. At mileage 60822 m, the cross-sectional visualization of the sandstone segment discloses a collapse mechanism dominated by water-rock interactions. The sectional diagram prominently displays the spatial distribution of basal water inrush (blue markers) and crown collapse (red markers), manifesting instability characteristics under coupled seepage-stress effects. At mileage 60833 m, the cross-sectional visualization of the fractured zone segment presents a pronounced full-section composite instability mode. The section clearly illustrates the multimodal coupled distribution of block collapse (red markers), surrounding rock squeezing (orange markers), and over-excavation (yellow markers), indicating that shear band formation serves as the dominant mechanism leading to ultimate full-section instability. Through comparative visualization analysis of these three sections, the evolutionary progression of collapse types in the Yinchao Tunnel, from localized block collapse, water-controlled collapse, to full-section composite instability, is lucidly demonstrated, revealing the intrinsic correlations between geological conditions and instability types.

Figure 28 showcases the relationship between TBM advance rate and mileage, spanning three distinct geological zones: granite gneiss zone (60808–60818 m), sandstone zone (60818–60828 m), and fractured zone (60828–60838 m). The advance rate exhibits significant variations based on rock types, with granite gneiss demonstrating the highest average advance rate due to its homogeneous structure. Key points at mileages 60812, 60822, and 60833 m represent locations of geological events (block collapse, water inrush, and multiple failure modes), resulting in pronounced declines in TBM performance. Trend lines for each rock type reveal a systematic reduction in advance rate during transitions from intact to fractured rock masses.

The figure delineates the relationship between stability index and mileage, categorized into five levels ranging from highly stable (80–100) to highly unstable (0–20). This diagram displays distinct stability characteristics across various geological zones, with progressive deterioration in the fractured zone. Diverse stability issues (block collapse, water inrush, squeezing, and over-excavation) are represented by markers of varying sizes and colors, illustrating their spatial distribution along the tunnel alignment. This provides a scientific foundation for risk assessment and support optimization in TBM construction.

Baseline capability tests between CHG and Transformer were conducted on the Yinchao data set. The training set comprises 2400 samples, each with 25 feature variables (numerical and one-hot encoded categorical features), giving dimensions (2400, 25); the validation set contains 600 samples, with dimensions (600, 25). The target value is tunnel stability (collapse failure type). Figure 29(a) illustrates the relationship between predicted and observed values for CHG and Transformer. For the CHG model, the scatter points cluster tightly around the y = x diagonal, indicating accurate and robust predictions. Although the Transformer model's scatter points lie closer to the diagonal, they may show slight signs of overfitting, particularly at the data extremes, where prediction deviations are somewhat larger; this is potentially attributable to the oversensitivity of its complex attention mechanism to the training data. Figure 29(b) further analyzes the models' error distributions through residual plots. The CHG model's residuals are symmetrically distributed around the zero axis without obvious patterns, suggesting random errors free of systematic bias. While the Transformer model's residuals are generally smaller, they may exhibit mild heteroscedasticity, that is, uneven residual variance in certain data ranges, potentially as a result of overfitting. The complete experimental results are presented in the table.
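The quantities behind a predicted-versus-observed plot and a residual plot can be sketched as follows. The data are synthetic: a continuous stand-in target with the validation-set size of 600 is assumed, and the function names and the binned-spread check are illustrative choices rather than the authors' evaluation code.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def mse(y_true, y_pred):
    """Mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def residual_spread_by_bin(y_true, y_pred, n_bins=4):
    """Standard deviation of residuals in equal-count bins of the prediction.

    Roughly equal values are consistent with homoscedastic errors; a clear
    trend across bins hints at heteroscedasticity.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    order = np.argsort(y_pred)
    residuals = (y_true - y_pred)[order]
    return [float(chunk.std()) for chunk in np.array_split(residuals, n_bins)]


# Synthetic stand-in for a 600-sample validation set (not the Yinchao data).
rng = np.random.default_rng(0)
y_true = rng.uniform(0.0, 1.0, size=600)
y_pred = y_true + rng.normal(0.0, 0.05, size=600)

print("R2 :", r2_score(y_true, y_pred))
print("MSE:", mse(y_true, y_pred))
print("residual std per bin:", residual_spread_by_bin(y_true, y_pred))
```

A residual mean close to zero together with similar spreads across bins corresponds to the "no obvious pattern" reading of a residual plot.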

The Table 6 results from the Yinchao data set should be read with one caveat in mind: in practical engineering applications, no algorithm or analytical formula can achieve absolute perfection, because inherent uncertainties, variability in geological conditions, and real-world complexities introduce unavoidable discrepancies between modeled predictions and actual outcomes. Consequently, the CHG algorithm, by incorporating adaptive mechanisms that better accommodate these imperfections, aligns more closely with authentic engineering scenarios. This enhances its applicability and reliability in tunnel stability assessments, where precise yet realistic predictions are paramount.

6.2 Discussion

Li et al. [50] employed a transfer learning-based long short-term memory (LSTM) algorithm that leverages historical data from the Yinsong Project to predict TBM excavation responses (such as torque and thrust) in the Yinchao tunnel, thereby facilitating early warnings of collapse risk. Their model selects transfer parameters (e.g., through importance assessment via RF) and passes them to LSTM networks. Based on Figs. 27–29, the CHG model's predictions of Yinchao tunnel collapse outcomes compare favorably with this literature benchmark in two respects. First, in terms of scatter plot distributions, the CHG model demonstrates higher predictive accuracy: its points cluster tightly around the y = x diagonal and distinctly separate non-collapse from collapse events, which mitigates deviations at extreme values. Literature models, by contrast, frequently show broader dispersion and residual bias, resulting in inadequate prediction precision in high-risk zones. Second, with respect to geological heterogeneity and the dynamic processing of advance rates, the CHG model captures the evolution of rock plastic strain rate tensors via its dynamic kernel function optimization mechanism. It sustains stable performance across diverse rock types (e.g., granite gneiss, sandstone, and fractured zones), minimizes fluctuations in advance rates, and integrates multiple failure modes (e.g., block collapse, water inrush, and squeezing). It thereby goes beyond the literature's emphasis on continuous regression, which overlooks discrete event risk assessment, making CHG more reliable and practically valuable in real-world tunnel engineering applications.

7 Conclusions

This study addresses the critical challenge of predicting tunnel face stability in complex rock formations through the innovative development of a CHG algorithm. The research presents a significant theoretical advancement by introducing the FPPI as an integrated intermediate variable, successfully establishing a probabilistic mapping mechanism from TBM parameters to rock mass stability, thereby transcending the limitations of traditional single-stage linear mapping approaches. By incorporating feedforward layer normalization techniques from Transformer architectures, the study develops an adaptive kernel function optimization framework that effectively resolves the technical bottleneck of manual parameter tuning inherent in conventional GPs. The algorithmic core implements a multi-level regularization system spanning from Tikhonov regularization in function space to Dropout in representation space, maintaining high expressive capacity while efficiently mitigating overfitting. In the Bayesian posterior phase, the innovative application of Chebyshev’s inequality enhances confidence interval estimation, overcoming constraints associated with traditional Gaussian assumptions and substantially improving prediction robustness. Notably, the CHG algorithm achieves adaptive responsiveness to complex geological features, such as lithological transition zones, through the synergistic mechanism between dynamic weight matrices and the FPPI metric, establishing new methodological paradigms for non-stationary data analysis. This research presents comprehensive innovation from theoretical methodology to engineering implementation, not only providing an efficient lightweight solution for tunnel stability prediction but also establishing a novel framework for intelligent decision support systems in underground engineering applications, demonstrating significant theoretical value and practical engineering importance.
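The role of Chebyshev's inequality in interval estimation can be made concrete with a short sketch. It uses only the textbook bound P(|X − μ| ≥ kσ) ≤ 1/k², which holds for any distribution with finite variance; how the CHG posterior applies the bound is not shown in this section, so the function names and the comparison with a Gaussian interval are illustrative assumptions.

```python
import math


def chebyshev_interval(mean, std, confidence=0.95):
    """Distribution-free interval from Chebyshev's inequality.

    Choosing k = 1 / sqrt(1 - confidence) guarantees that at least
    `confidence` of the probability mass lies within mean +/- k * std.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if std < 0.0:
        raise ValueError("std must be non-negative")
    k = 1.0 / math.sqrt(1.0 - confidence)
    return mean - k * std, mean + k * std


def gaussian_interval(mean, std, z=1.959963984540054):
    """Two-sided 95% interval under a Gaussian assumption (z ~ 1.96)."""
    return mean - z * std, mean + z * std


# Hypothetical posterior mean and standard deviation of a predicted index.
mu, sigma = 0.62, 0.05
print("Chebyshev 95%:", chebyshev_interval(mu, sigma, 0.95))
print("Gaussian  95%:", gaussian_interval(mu, sigma))
```

The Chebyshev interval is wider than the Gaussian one at the same nominal level (k ≈ 4.47 versus z ≈ 1.96 at 95%); that extra width is the price of dropping the Gaussian assumption.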

8 Limitations

The CHG algorithm integrates physical information constraints into the model training process through the FPPI metric, thereby enhancing its performance in rock mechanics and TBM stability prediction tasks. However, the overall framework remains fundamentally data-driven and therefore depends heavily on the quality, quantity, and diversity of the training data. Under data scarcity, noise interference, or distribution shift, the model may generalize poorly and fail to fully capture complex physical mechanisms [61]. For instance, in the application to the Yinsong Shimenzi TBM data set, although the CHG model excels in metrics such as R² and MSE, its predictions may overlook the dynamic evolution of underlying physical laws, such as the rock plastic strain rate tensor, leading to limited accuracy in high-dimensional heterogeneous geological environments.

Future improvements could focus on hybrid frameworks driven by physical information, such as integrating the Conservative Energy Method with neural network-based subdomain partitioning strategies, transitioning the CHG algorithm from pure data-driven approaches to energy-minimizing variational forms [62,63]. This would better address variational problems in heterogeneous complex geometric structures by embedding physical conservation laws (e.g., energy conservation) into the loss function, thereby strengthening the intrinsic constraints on rock mechanical parameters (e.g., plastic strain rate tensor). Furthermore, drawing from the weak-form representations in Physics-Informed Neural Networks, future developments could involve subdomain-optimized variants of CHG, reducing the need for hyperparameter tuning and enhancing robustness in small-sample data sets (e.g., Shimenzi or Yinchao cross-domain validations), ultimately achieving more reliable tunnel stability predictions [64,65].
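The idea of embedding a conservation law into the loss function can be sketched generically as a data-misfit term plus a weighted penalty on a physics residual. Everything in this sketch is hypothetical: the energy-balance residual, the variable names, and the penalty weight are placeholders, and it does not reproduce the Conservative Energy Method of [62] or any existing CHG component.

```python
import numpy as np


def energy_balance_residual(energy_in, energy_stored, energy_dissipated):
    """Hypothetical conservation residual; zero when energy is balanced."""
    return (
        np.asarray(energy_in, dtype=float)
        - np.asarray(energy_stored, dtype=float)
        - np.asarray(energy_dissipated, dtype=float)
    )


def composite_loss(y_pred, y_true, physics_residual, weight=1.0):
    """Mean squared data misfit plus a weighted mean squared physics residual."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    data_term = np.mean((y_pred - y_true) ** 2)
    physics_term = np.mean(np.asarray(physics_residual, dtype=float) ** 2)
    return float(data_term + weight * physics_term)


# Toy numbers: perfect data fit, but the second sample violates the balance.
residual = energy_balance_residual([10.0, 12.0], [6.0, 7.0], [4.0, 4.0])
print(composite_loss([0.8, 0.3], [0.8, 0.3], residual, weight=0.5))
```

In such a formulation, predictions that fit the data but violate the assumed balance are still penalized, which is the sense in which the conservation law constrains the model.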

References

[1]

Sharma A , Juneja A . Geotechnical and face stability correlations using cutterhead-soil interaction in soft ground mechanised shield tunnelling. Transportation Infrastructure Geotechnology, 2025, 12(3): 112

[2]

Sivakumar G , Maji V B . A review of experimental and numerical studies on crack growth behaviour in rocks with pre-existing flaws. Geomechanics and Engineering, 2023, 35(4): 333–366

[3]

Wang S X , Zhang M Q , Mo J L . On the contact and vibration characteristics of TBM cutter with abnormal damage under hard rock conditions. Wear, 2025, 562–563: 205652

[4]

Sun C H , Li Z , Wu J , Wang R , Yang X , Liu Y Y . Research on double-layer support control for large deformation of weak surrounding rock in Xiejiapo tunnel. Buildings, 2024, 14(5): 1371

[5]

Liu W , Albers B , Zhao Y , Tang X W . Upper bound analysis for estimation of the influence of seepage on tunnel face stability in layered soils. Journal of Zhejiang University. Science A, 2016, 17(11): 886–902

[6]

Georgiou D , Kalos A , Kavvadas M . 3D numerical investigation of face stability in tunnels with unsupported face. Geotechnical and Geological Engineering, 2022, 40(1): 355–366

[7]

Chen G H , Zou J F , Guo Y C , Tan Z A , Dan S . Face stability assessment of a longitudinally inclined tunnel considering pore water pressure. International Journal for Numerical and Analytical Methods in Geomechanics, 2024, 48(15): 3725–3747

[8]

Manogharan P , Wood C , Marone C , Elsworth D , Rivière J , Shokouhi P . Nonlinear elastodynamic behavior of intact and fractured rock under in-situ stress and saturation conditions. Journal of the Mechanics and Physics of Solids, 2021, 153: 104491

[9]

Shishegaran A , Khalili M R , Karami B , Rabczuk T , Shishegaran A . Computational predictions for estimating the maximum deflection of reinforced concrete panels subjected to the blast load. International Journal of Impact Engineering, 2020, 139: 103527

[10]

Shishegaran A , Moradi M , Naghsh M A , Karami B , Shishegaran A . Prediction of the load-carrying capacity of reinforced concrete connections under post-earthquake fire. Journal of Zhejiang University. Science A, 2021, 22(6): 441–466

[11]

Shishegaran A , Saeedi M , Kumar A , Ghiasinejad H . Prediction of air quality in Tehran by developing the nonlinear ensemble model. Journal of Cleaner Production, 2020, 259: 120825

[12]

Ahmadi E , Kashani M M . Numerical investigation of nonlinear static and dynamic behaviour of self-centring rocking segmental bridge piers. Soil Dynamics and Earthquake Engineering, 2020, 128: 105876

[13]

Zhang C , Zhu Z D , Wang S Y , Ren X H , Shi C . Stress wave propagation and incompatible deformation mechanisms in rock discontinuity interfaces in deep-buried tunnels. Deep Underground Science and Engineering, 2022, 1(1): 25–39

[14]

Yan T , Shen S L , Zhou A N . Identification of geological characteristics from construction parameters during shield tunnelling. Acta Geotechnica, 2023, 18(1): 535–551

[15]

Tang B L , Ren Y Q . Study on seismic response and damping measures of surrounding rock and secondary lining of deep tunnel. Shock and Vibration, 2021, 2021: 7824527

[16]

Xu C B , Zhou X Y , Wang H L , Gao X J , Li X F . A case study of thaumasite sulfate attack in tunnel engineering. Advances in Civil Engineering, 2021, 2021: 8787757

[17]

Yang J , Li S , Wang Z . The research of creep in tunnel which surrounding rock is expansion. MATEC Web of Conferences, 2015, 31: 13001

[18]

Bu S J , Feng X J , Yao L Y , Yang F J , Xie Y T , Liu S F . Seismic dynamic response and lining damage analysis of curved tunnel under shallowly buried rock strata. Sustainability, 2023, 15(6): 4905

[19]

He M C , Guo A P , Du Z F , Liu S Y , Zhu C , Cao S D , Tao Z G . Model test of negative Poisson’s ratio cable for supporting super-large-span tunnel using excavation compensation method. Journal of Rock Mechanics and Geotechnical Engineering, 2023, 15(6): 1355–1369

[20]

Wang Z Y , Jiang Y S , Shao X K , Liu C L . On-site measurement and environmental impact of vibration caused by construction of double-shield TBM tunnel in urban subway. Scientific Reports, 2023, 13(1): 17689

[21]

Duan K M , Zhang G F , Sun H . Construction practice of water conveyance tunnel among complex geotechnical conditions: A case study. Scientific Reports, 2023, 13(1): 15037

[22]

You Y , Yang J , Tang Y , Yang X . Research on intelligent identification and resource utilization of TBM muck in breezy granite strata. In: IOP Conference Series: Earth and Environmental Science, Vol 1337. Bristol: IOP Publishing, 2024

[23]

Jiang T , Pei X , Wang W X , Li L F , Guo S H . Effects of the excavation of a hydraulic tunnel on groundwater at the Wuyue pumped storage power station. Applied Sciences, 2023, 13(8): 5196

[24]

Kalager A K . Control of water leakages during TBM excavation: An example from the Follo Line tunnel project. In: Anagnostou G, Benardos A, Marinos V, eds, Expanding underground. Knowledge and passion to make a positive impact on the world. London: CRC Press, 2023, 1988–1995

[25]

Liu W P , Hu L N , Yang Y X , Fu M F . Limit support pressure of tunnel face in multi-layer soils below river considering water pressure. Open Geosciences, 2018, 10(1): 932–939

[26]

Cheng D J , Hua J F , Zhu J G , Yang J , Hu Z G . Calculation of rock pressure in loess tunnels based on limit equilibrium theory and analysis of influencing factors. Revue Européenne de Mécanique Numérique, 2023, 32(1): 1–30

[27]

Mi B , Xiang Y Y . Analysis of the limit support pressure of a shallow shield tunnel in sandy soil considering the influence of seepage. Symmetry, 2020, 12(6): 1023

[28]

Madanda V C , Sengani F , Mulenga F . Applications of fuzzy theory-based approaches in tunnelling geomechanics: A state-of-the-art review. Mining, Metallurgy & Exploration, 2023, 40(3): 819–837

[29]

Li B , Li H . Prediction of tunnel face stability using a naive bayes classifier. Applied Sciences, 2019, 9(19): 4139

[30]

Lu J , Hu S , Fan X , Niu Z . The prediction model of tunnel face based on fuzzy comprehensive evaluation. Applied Mechanics and Materials, 2021, 80–81: 506–510

[31]

Hong C H , Cho G C , Hong E S , Baak S H , Ryu H H . Anomaly prediction ahead tunnel face using tunnel electrical resistivity prospecting system (TEPS) in Danyang. Procedia Engineering, 2017, 191: 855–861

[32]

Soranzo E , Guardiani C , Wu W . A soft computing approach to tunnel face stability in a probabilistic framework. Acta Geotechnica, 2022, 17(4): 1219–1238

[33]

Vergara I M , Saroglou C . Prediction of TBM performance in mixed-face ground conditions. Tunnelling and Underground Space Technology, 2017, 69: 116–124

[34]

Bigdeli A , Shishegaran A , Naghsh M A , Karami B , Shishegaran A , Alizadeh G . Surrogate models for the prediction of damage in reinforced concrete tunnels under internal water pressure. Journal of Zhejiang University. Science A, 2021, 22(8): 632–656

[35]

Karami B , Shishegaran A , Taghavizade H , Rabczuk T . Presenting innovative ensemble model for prediction of the load carrying capacity of composite castellated steel beam under fire. Structures, 2021, 33: 4031–4052

[36]

Naghsh M A , Shishegaran A , Karami B , Rabczuk T , Shishegaran A , Taghavizadeh H , Moradi M . An innovative model for predicting the displacement and rotation of column-tree moment connection under fire. Frontiers of Structural and Civil Engineering, 2021, 15(1): 194–212

[37]

Shishegaran A , Boushehri A N , Ismail A F . Gene expression programming for process parameter optimization during ultrafiltration of surfactant wastewater using hydrophilic polyethersulfone membrane. Journal of Environmental Management, 2020, 264: 110444

[38]

Shishegaran A , Karami B , Danalou E S , Varaee H , Rabczuk T . Computational predictions for predicting the performance of steel panel shear wall under explosive loads. Engineering Computations, 2021, 38(9): 3564–3589

[39]

Shishegaran A , Saeedi M , Mirvalad S , Korayem A H . Computational predictions for estimating the performance of flexural and compressive strength of epoxy resin-based artificial stones. Engineering with Computers, 2023, 39(1): 347–372

[40]

Shishegaran A , Varaee H , Rabczuk T , Shishegaran G . High correlated variables creator machine: Prediction of the compressive strength of concrete. Computers & Structures, 2021, 247: 106479

[41]

Zhang J W , Gao C Z , Huang X M . Instability prediction model of the shield tunnel face based on the normal cloud-PSM. Soil Mechanics and Foundation Engineering, 2023, 60(5): 472–484

[42]

Li T Z , Dias D . Tunnel face reliability analysis using active learning Kriging model-case of a two-layer soils. Journal of Central South University, 2019, 26(7): 1735–1746

[43]

Sun C , Huo S , Bai J . CN Patent, CN119203753(A), 2025-03-20

[44]

Wu Q D , Yan B , Zhang C , Wang L , Ning G B , Yu B . Displacement prediction of tunnel surrounding rock: A comparison of support vector machine and artificial neural network. Mathematical Problems in Engineering, 2014, 2014(1): 351496

[45]

Yao B Z , Yang C Y , Yu B , Jia F F , Yu B . Applying support vector machines to predict tunnel surrounding rock displacement. Applied Mechanics and Materials, 2010, 29–32: 1717–1721

[46]

Jearsiripongkul T , Keawsawasvong S , Thongchom C , Ngamkhanong C . Prediction of the stability of various tunnel shapes based on hoek-brown failure criterion using artificial neural network (ANN). Sustainability, 2022, 14(8): 4533

[47]

Nguyen M T , Bui T N , Shiau J , Nguyen T , Nguyen T T . Stability of rectangular tunnels in cohesive-frictional soil under surcharge loading using isogeometric analysis and Bayesian neural networks. Advances in Engineering Software, 2025, 201: 103861

[48]

Chen Z , Zhang Y , Li J , Li X , Jing L . Diagnosing tunnel collapse sections based on TBM tunneling big data and deep learning: A case study on the Yinsong Project, China. Tunnelling and Underground Space Technology, 2021, 108: 103700

[49]

Hou S , Liu Y . Early warning of tunnel collapse based on Adam-optimised long short-term memory network and TBM operation parameters. Engineering Applications of Artificial Intelligence, 2022, 112: 104842

[50]

Li J , Guo D , Chen Z , Li X , Li Z . Transfer learning for collapse warning in TBM tunneling using databases in China. Computers and Geotechnics, 2024, 166: 105968

[51]

Li X , Li H , Du S , Jing L , Li P . Cross-project utilisation of tunnel boring machine (TBM) construction data: A case study using big data from Yin-Song diversion project in China. Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 2023, 17(1): 127–147

[52]

Yu Y , Wang E , Zhong J , Liu X , Li P , Shi M , Zhang Z . Stability analysis of abutment slopes based on long-term monitoring and numerical simulation. Engineering Geology, 2014, 183: 159–169

[53]

Li J B , Chen Z Y , Li X , Jing L J , Zhang Y P , Xiao H H , Wang S J , Yang W K , Wu L J , Li P Y , et al. Feedback on a shared big dataset for intelligent TBM Part I: Feature extraction and machine learning methods. Underground Space, 2023, 11: 1–25

[54]

Yang W K , Chen Z Y , Wang S J , Zhao H T , Li J C , Chen S , Shi C . Improved boreability index for gripper TBMs in medium- to strong-quality rocks based on theoretical analysis and field penetration tests. Rock Mechanics and Rock Engineering, 2025, 58(5): 5429–5453

[55]

Zhao Y , Gong Q M , Tian Z Y , Zhou S H , Jiang H . Torque fluctuation analysis and penetration prediction of EPB TBM in rock–soil interface mixed ground. Tunnelling and Underground Space Technology, 2019, 91: 103002

[56]

Deev P V , Tsukanov A A . Influence of rock interface on stress state of rock mass around underground excavation. Proceedings of the Tula States University-Sciences of Earth, 2022, 1: 448–457

[57]

Xu J S , Liu W C , Wang X R , Du X L . Stability analysis of three-dimensional tunnel roofs in soil based on a modified MC criterion. Acta Geotechnica, 2024, 19(9): 5989–6004

[58]

Bansal A , Sharma R , Kathuria M . A systematic review on data scarcity problem in deep learning: Solution and applications. ACM Computing Surveys, 2022, 54(10S): 208

[59]

Trong D D , Phuong C X , Tuyen T T , Thanh D N . Tikhonov’s regularization to the deconvolution problem. Communications in Statistics–Theory and Methods, 2014, 43(20): 4384–4400

[60]

Wong S K . Review of the new edition of ISO 13528. Accreditation and Quality Assurance, 2016, 21(4): 249–254

[61]

Wang Y , Bai J , Lin Z , Wang Q , Anitescu C , Sun J , Eshaghi M S , Gu Y , Feng X Q , Zhuang X , et al. Artificial intelligence for partial differential equations in computational mechanics: A review. 2024, arXiv: 2410.19843

[62]

Wang Y , Sun J , Li W , Lu Z , Liu Y . CENN: Conservative energy method based on neural networks with subdomains for solving variational problems involving heterogeneous and complex geometries. Computer Methods in Applied Mechanics and Engineering, 2022, 400: 115491

[63]

Sun J , Liu Y , Wang Y , Yao Z , Zheng X . BINN: A deep learning approach for computational mechanics problems based on boundary integral equations. Computer Methods in Applied Mechanics and Engineering, 2023, 410: 116012

[64]

Wang Y , Sun J , Rabczuk T , Liu Y . DCEM: A deep complementary energy method for linear elasticity. International Journal for Numerical Methods in Engineering, 2024, 125(24): e7585

[65]

Wang Y , Bai J , Eshaghi M S , Anitescu C , Zhuang X , Rabczuk T , Liu Y . Transfer learning in physics-informed neural networks: Full fine-tuning, lightweight fine-tuning, and low-rank adaptation. International Journal of Mechanical System Dynamics, 2025, 5(2): 212–235

RIGHTS & PERMISSIONS

Higher Education Press
