Shear stress distribution prediction in symmetric compound channels using data mining and machine learning models

Zohreh SHEIKH KHOZANI , Khabat KHOSRAVI , Mohammadamin TORABI , Amir MOSAVI , Bahram REZAEI , Timon RABCZUK

Front. Struct. Civ. Eng. ›› 2020, Vol. 14 ›› Issue (5) : 1097 -1109.

DOI: 10.1007/s11709-020-0634-3
TRANSDISCIPLINARY INSIGHT

Abstract

Predicting the shear stress distribution in open channels is of utmost importance in hydraulic structural engineering because it directly affects the design of stable channels. In this study, a series of experimental tests was first conducted to assess the shear stress distribution in prismatic compound channels. Shear stress was measured around the whole wetted perimeter of the compound channel for different floodplain widths and flow depths under subcritical and supercritical conditions. A set of data mining and machine learning algorithms, namely Random Forest (RF), M5P, Random Committee, KStar, and Additive Regression, was then applied to the measured data to predict the shear stress distribution in the compound channel. Among these five models, the RF method gave the most precise results, with the highest R² value of 0.9. Finally, this best-performing data mining method was compared with two well-known analytical models, the Shiono and Knight method (SKM) and the Shannon method, to assess how well the proposed model predicts the shear stress distribution. The results showed that the RF model predicts better than both the SKM and Shannon models.

Keywords

compound channel / machine learning / SKM model / shear stress distribution / data mining models

Cite this article

Zohreh SHEIKH KHOZANI, Khabat KHOSRAVI, Mohammadamin TORABI, Amir MOSAVI, Bahram REZAEI, Timon RABCZUK. Shear stress distribution prediction in symmetric compound channels using data mining and machine learning models. Front. Struct. Civ. Eng., 2020, 14(5): 1097-1109 DOI:10.1007/s11709-020-0634-3

Introduction

In the design of hydraulic structures, the boundary shear stress distribution is essential to understanding most flow characteristics, such as flow resistance, sediment transport, and cavitation. The stress distribution depends on parameters such as the flume geometry, the hydraulic conditions, the boundary roughness, and in particular the streamwise velocity component and the secondary flow pattern [1–5]. Because the compound cross section is the closest idealization of natural rivers, understanding the distribution of shear stress along the periphery of compound channels is essential. River morphology studies and the engineering of river beds and banks also depend on it, as do the analysis and design of flood control structures, which require detailed knowledge of the shear stress distribution along the flood route. The literature includes various investigations using different methods and case studies [6–10]. Because direct and indirect shear stress measurements are difficult and time-consuming, many analytical, semi-analytical, and numerical methods have been developed [11–17]. Rezaei and Knight [18] modified the Shiono and Knight method (SKM) to predict the shear stress distribution in compound channels with non-prismatic floodplains. Sheikh Khozani and Bonakdari [19] compared five analytical models for estimating the shear stress distribution in prismatic rectangular compound channels and investigated the performance of each model in each section of the compound channel. They concluded that the Tsallis entropy method gives good results with fewer calculations.

Soft computing and data mining methods are now increasingly applied to forecast hydraulic and hydrological phenomena [20–28].

For estimating shear stress distribution, Sheikh Khozani et al. [29] used the Randomize Neural Network (RNN) model in circular channels, compared its results with those of the Shannon entropy, and proposed a matrix-based equation. Khuntia et al. [30] developed a neural network model to predict the force applied to the walls of compound channel cross-sections. Sheikh Khozani et al. [31] applied different data mining models to estimate apparent shear stress in compound channels and concluded that the Bagging-M5P model gives the most accurate results.

To the authors' knowledge, few studies have estimated the shear stress distribution in compound channels using data mining models. Therefore, a set of experiments was carried out at different flow depths and flow conditions, and the resulting data were used to forecast the shear stress distribution in a smooth compound channel. About 1812 shear stress data points were applied to five models: Additive Regression (AR), M5P, KStar, Random Forest (RF), and Random Committee (RC). The performance of each model in predicting the shear stress distribution is investigated, and the most accurate model is selected. The output of that model is then compared with two analytical models, the SKM and the Shannon model.

Experimental apparatus and procedure

The experiments were conducted in a flume 18 m long with a simple rectangular compound cross-section. The flume width and depth are 1200 and 400 mm, respectively, and the bed slope is S0 = 2.003 × 10⁻³. The main channel dimensions are 398, 50, and 400 mm for width, depth, and floodplains, respectively. The main channel was constructed of PVC. The floodplain widths formed by the L-shaped aluminum sections in the prismatic compound channels are 100, 200, 300, and 400 mm. In this study, the distribution of shear stress in the prismatic compound channel with 100 mm floodplain width is investigated (see Figs. 1 and 2).

Uniform flow is controlled by a series of adjustable tailgates located at the end of the flume. In the test codes, OPC denotes overbank flow in the channel, the first three digits after OPC refer to the floodplain width, and the two digits that follow denote the flow discharge. Local boundary shear stress was measured with a Preston tube of 4.77 mm outer diameter along the wetted perimeter, at 25 mm transverse intervals on the bed and 10 mm vertical intervals on the walls. All measurements were made at one section, 14 m from the channel inlet. The ranges of the hydraulic parameters in the experimental data are presented in Table 1. The shear stress distribution was measured for different floodplain widths.

Previous research indicates that the shear stress distribution in a smooth compound channel is related to the channel geometry (floodplain width Bfp, main channel width Bmc, and whole-channel wetted perimeter L), the transverse coordinate (y), bankfull depth (h), flow depth over the main channel (H), channel bed slope (S0), flow velocity (V), fluid density (ρ), gravitational acceleration (g), and hydraulic radius (R). The dimensionless shear stress can then be expressed as a function:

$$\frac{\tau}{\rho g R S_0} = f\left(\frac{y}{L},\ \frac{B_{fp}}{B_{mc}},\ Fr,\ \frac{H}{h}\right),$$

where Fr is the Froude number.

In this study, y/L, Bfp/Bmc, Fr, and H/h are the input variables applied to each model, and the dimensionless shear stress is the output variable.
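As an illustration of how one measurement could be turned into the four model inputs and the dimensionless target, the following minimal Python sketch may help. The function names, the water density, and the use of a hydraulic depth in the Froude number are assumptions made here for illustration; the paper does not state which length scale enters Fr.

```python
import math

G = 9.81       # gravitational acceleration, m/s^2
RHO = 1000.0   # fluid density, kg/m^3 (assumed value for water)


def model_inputs(y, L, B_fp, B_mc, V, hydraulic_depth, H, h):
    """Return the four dimensionless inputs (y/L, Bfp/Bmc, Fr, H/h).

    `hydraulic_depth` is a hypothetical argument: the length scale used
    in the Froude number is an assumption of this sketch.
    """
    fr = V / math.sqrt(G * hydraulic_depth)
    return (y / L, B_fp / B_mc, fr, H / h)


def dimensionless_shear(tau, R, S0):
    """Return tau / (rho * g * R * S0), the model output variable."""
    return tau / (RHO * G * R * S0)
```

A call such as `model_inputs(0.1, 1.0, 0.1, 0.398, 0.5, 0.05, 0.06, 0.05)` returns one input row; the numbers in that call are illustrative, not measured values.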

Material and methods

Data mining methods

The economist Michael Lovell first used the term “data mining” in the Review of Economic Studies (1983). Data mining is a process that discovers trends and patterns [32]. It is a subfield of statistics and computer science that aims to discover patterns in data sets, extract trends and information from them, and prepare the extracted information in a structure suitable for further application [33].

In addition to the analysis step, it involves data management, inference considerations, pre-processing and post-processing of data, visualization, and interestingness metrics [30]. Unlike plain data analysis, data mining employs statistical or machine learning techniques to estimate, predict, and model patterns in the target data set [34]. The most common applications of data mining methods are association learning, anomaly detection, cluster detection, classification, and regression.

Random Forest model

RFs are ensemble learning methods for regression, classification, and related tasks that work by constructing a multitude of decision trees. The method was first introduced by Guo et al. [35], who used the random subspace method to implement the stochastic discrimination approach to classification proposed by Eugene Kleinberg [36]. An extension of the RF algorithm has been registered as a trademark [37]. Sun et al. [38] proposed a new RF algorithm for classification based on cooperative game theory, in which the power of each feature is evaluated with the Banzhaf power index by traversing the possible feature coalitions. Chen et al. [39] proposed an adaptive variable step method based on RFs, which accelerates the training process and reduces the number of information gain calculations. According to the reported evidence, the approach is suitable for most decision-tree-based models.

In this study, the optimum parameter settings of the RF model, namely batch size, maximum tree depth, number of decimal places, number of execution slots, number of features, number of iterations, and number of seeds, are 100, 0, 2, 1, 0, 100, and 4, respectively.

M5P model

The M5P algorithm was first introduced by Quinlan [40]. It is an upgraded version of the M5 algorithm. Model trees handle large data sets effectively and are robust to missing data.

As shown in Fig. 3, the schematic diagram of the M5 algorithm, the process first splits the input data (or input space) into subspaces.

Figure 3 shows the input space divided into subspaces S1, S2, and S3. The variation within each subspace is minimized using linear regression. The information from this step is then used to build the nodes of a tree-like structure. At this stage, the standard deviation reduction (SDR) is used to reduce the error at each node (Eq. (1)) [41]:

$$\mathrm{SDR} = \mathrm{sd}(S) - \sum_i \frac{|S_i|}{|S|} \times \mathrm{sd}(S_i), \tag{1}$$

where S is the data set that reaches the node, S_i are the subspaces, and sd is the standard deviation.
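The SDR of a candidate split can be evaluated directly from its definition. The sketch below is a minimal illustration, assuming the population standard deviation (`statistics.pstdev`); whether a given M5 implementation uses the population or the sample form is not stated in the text, and the function name `sdr` is hypothetical.

```python
from statistics import pstdev


def sdr(parent, subsets):
    """Standard deviation reduction of splitting `parent` into `subsets`.

    `parent` is the list of target values reaching the node and `subsets`
    is a list of lists that partition it (assumption: population sd).
    """
    n = len(parent)
    weighted = sum(len(s) / n * pstdev(s) for s in subsets)
    return pstdev(parent) - weighted


# A split that separates two tight clusters removes all of the spread.
print(sdr([1, 1, 5, 5], [[1, 1], [5, 5]]))  # 2.0
```

A tree builder would compute this quantity for every candidate split and keep the one with the largest value.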

An SDR lower than the expected error leads to over-training. To overcome this problem, a smoothing process is applied that combines all the models from the root to the leaf to establish the final model at the leaf. The value predicted at the leaf is combined with the value predicted by the linear regression model at each node (Eq. (2)) [42]:

$$E = \frac{ne + ka}{n + k}, \tag{2}$$

where E is the predicted value passed to the next higher node, e is the predicted value at the current node, a is the value predicted by the model at that node, n is the number of training samples, and k is a constant.
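The smoothing step is a weighted average of two predictions. As a minimal sketch (the function name `smooth` is hypothetical, and the value of k is left to the caller because the text only calls it a constant):

```python
def smooth(e, a, n, k):
    """Smoothed prediction E = (n*e + k*a) / (n + k).

    e: prediction at the current node, a: prediction of the model at
    the node being combined, n: number of training samples, k: constant.
    """
    return (n * e + k * a) / (n + k)


# With n equal to k the result is the plain average of e and a.
print(smooth(e=2.0, a=4.0, n=15, k=15))  # 3.0
```

The larger n is relative to k, the more the smoothed value follows the current node's own prediction.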

In this paper, the optimum parameter settings of the M5P model, including batch size, number of decimal places, number of instances, and number of seeds, are 100, 0, 2, 4, and 3, respectively.

KStar model

The KStar model, also known as the K* algorithm, is an instance-based learner and memory-based classifier presented by Cleary and Trigg [43] at a machine learning conference. The K* distance metric is based on the concept of entropy: the distance between two instances is derived from the probability of transforming one into the other in a “random walk away” manner, and classification is performed by summing these probabilities. In general, there is little evidence on how K* copes with noisy classes and attributes or with mixed-valued attributes in data sets [44].

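To specify the K* technique, let $\bar{t}$ be a sequence of transformations that maps one instance to another, $u$ a single transformation appended to it, $\Lambda$ the empty sequence, $P$ the set of all complete sequences, and $p$ the probability assigned to a sequence. Then (Eq. (3) to Eq. (5)):

$$0 \le \frac{p(\bar{t}u)}{p(\bar{t})} \le 1, \tag{3}$$

$$\sum_u p(\bar{t}u) = p(\bar{t}), \tag{4}$$

$$p(\Lambda) = 1. \tag{5}$$

It satisfies Eq. (6) as a consequence:

$$\sum_{\bar{t} \in P} p(\bar{t}) = 1. \tag{6}$$

Equation (7) defines the probability function P*, the probability of all paths from instance a to instance b:

$$P^*(b|a) = \sum_{\bar{t} \in P:\ \bar{t}(a) = b} p(\bar{t}). \tag{7}$$

P* satisfies the following properties:

$$\sum_b P^*(b|a) = 1, \qquad 0 \le P^*(b|a) \le 1. \tag{8}$$

Finally, the K* function is defined as Eq. (9):

$$K^*(b|a) = -\log_2 P^*(b|a). \tag{9}$$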
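To make the last two definitions concrete, the toy sketch below sums path probabilities into P* and converts the result to a K* distance. The representation of paths as `(probability, end_instance)` pairs and the function names are assumptions made for illustration; a real K* implementation derives these probabilities from attribute-wise transformation models.

```python
import math


def p_star(paths_from_a, b):
    """P*(b|a): total probability of all paths from a that end at b."""
    return sum(p for p, end in paths_from_a if end == b)


def k_star(paths_from_a, b):
    """K*(b|a) = -log2 P*(b|a); smaller values mean closer instances."""
    return -math.log2(p_star(paths_from_a, b))


# Toy example: three paths leave instance a, two of them reach "b".
paths = [(0.5, "b"), (0.25, "b"), (0.25, "c")]
print(p_star(paths, "b"))  # 0.75
print(k_star(paths, "c"))  # 2.0
```

Because the path probabilities sum to one, the P* values over all end instances also sum to one, as the properties of P* require.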

In this study, the optimum parameter settings of the KStar model, namely batch size, global blend, and minimum number of places, are 100, 1, and 1, respectively.

Additive Regression model

Additive regression is a nonparametric regression method first introduced in Ref. [45]. It is an essential part of the alternating conditional expectations algorithm, which employs a one-dimensional smoother (f_j(x_ij) in Eq. (10)) to create a class of nonparametric regression models. Because it relies on one-dimensional smoothers, the method is less affected by high dimensionality than a p-dimensional smoother. It is also more flexible than a standard linear model, yet more interpretable than a general regression surface. Multicollinearity, overfitting, and model selection are issues to be considered when applying an additive regression method.

Considering {y_i, x_i1, ..., x_ip}, i = 1 to n, as the data set for n units, where the x_ij are the predictors and y_i is the outcome value, the additive model is given by Eq. (10):

$$E[y_i \mid x_{i1}, \ldots, x_{ip}] = Y = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}) + \epsilon. \tag{10}$$

The additive regression model can be fitted with the backfitting algorithm presented by Yoshida [46], who employed a semiparametric method to explore the structure of AR models.
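The backfitting idea can be sketched in a few lines: each component function is refitted in turn to the residual left by the intercept and the other components. The sketch below is only an illustration under stated assumptions; the bin-mean smoother, the function names, and the synthetic data are hypothetical and are not the smoother or data used in the study.

```python
def bin_smoother(x, r, n_bins=5):
    """Hypothetical 1-D smoother: mean of r within equal-width bins of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0
    idx = [min(int((xi - lo) / width), n_bins - 1) for xi in x]
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for k, ri in zip(idx, r):
        sums[k] += ri
        counts[k] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [means[k] for k in idx]


def backfit(X, y, n_iter=20):
    """Fit y ~ beta0 + sum_j f_j(x_j) by cycling over the predictors."""
    n, p = len(y), len(X[0])
    beta0 = sum(y) / n
    f = [[0.0] * n for _ in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove the intercept and all other components.
            partial = [y[i] - beta0 - sum(f[k][i] for k in range(p) if k != j)
                       for i in range(n)]
            fj = bin_smoother([row[j] for row in X], partial)
            mean_fj = sum(fj) / n
            f[j] = [v - mean_fj for v in fj]  # keep each f_j centered
    fitted = [beta0 + sum(f[j][i] for j in range(p)) for i in range(n)]
    return beta0, f, fitted


# Synthetic additive data on a 10 x 10 grid (illustrative only).
X = [(i / 9, j / 9) for i in range(10) for j in range(10)]
y = [x1 ** 2 + 2.0 * x2 for x1, x2 in X]
beta0, f, fitted = backfit(X, y)
```

Centering each component keeps the intercept identifiable, so the fitted values retain the same mean as the observations.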

The optimum settings of the number of iterations and the shrinkage of the AR model are 12 and 1, respectively.

Random Committee model

RC belongs to the category of committee machines, which work with an ensemble of predictors such as ANNs or decision trees [47]. It is therefore an ensemble classifier whose training is based on classification. It uses a learning mechanism in which the committee members make predictions for new inputs, and these predictions are integrated by combining the estimates of the individual members. The RC functions as a meta-learning technique built on a number of randomized classifiers. The average of the estimates of the individual classifiers gives the final classification result.

Hu and Hwang [47] documented the concept of RC. They described the architecture of the gating and expert networks, in which base classifiers are constructed using different random seeds. The average of the estimates generated by the base classifiers forms the final prediction (see Fig. 4).

Let x be the vector of input variables and y the vector of output variables, with f(x) the underlying function and P(y | f(x)) the conditional density. Let $X^q = \{x_1^q, \ldots, x_{N_Q}^q\}$ be a set of $N_Q$ test points and $f^q = \{f_1^q, \ldots, f_{N_Q}^q\}$ the vector of the corresponding unknown response variables. Splitting the input data into M sets, $D = \{D_1, \ldots, D_M\}$, and denoting the data not in $D_i$ by $\bar{D}_i = D \setminus D_i$, we have in general:

$$P(f^q \mid \bar{D}_i, D_i) \propto P(f^q)\, P(\bar{D}_i \mid f^q)\, P(D_i \mid \bar{D}_i, f^q). \tag{11}$$

This can be approximated as Eq. (12):

$$P(D_i \mid \bar{D}_i, f^q) \approx P(D_i \mid f^q). \tag{12}$$

Combining Bayes’ formula with this approximation gives Eq. (13):

$$P(f^q \mid \bar{D}_i, D_i) \approx \mathrm{Const} \times \frac{P(f^q \mid \bar{D}_i)\, P(f^q \mid D_i)}{P(f^q)}, \tag{13}$$

and the approximate predictive density is calculated as Eq. (14):

$$\hat{P}(f^q \mid D) = \mathrm{Const} \times \frac{\prod_{i=1}^{M} P(f^q \mid D_i)}{P(f^q)^{M-1}}. \tag{14}$$

In this case, $\hat{E}$ and $\widehat{\mathrm{cov}}$ are estimated from $\hat{P}(f^q \mid D)$ as Eq. (15):

$$\hat{E}(f^q \mid D) = \frac{1}{C} \sum_{i=1}^{M} \mathrm{cov}(f^q \mid D_i)^{-1}\, E(f^q \mid D_i), \tag{15}$$

with

$$C = \widehat{\mathrm{cov}}(f^q \mid D)^{-1} = -(M-1)\,(\Sigma^{qq})^{-1} + \sum_{i=1}^{M} \mathrm{cov}(f^q \mid D_i)^{-1}. \tag{16}$$

This integration of the committee members' predictions resembles the Bayesian committee machine [47].
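For a single test point the committee combination reduces to scalar arithmetic: each member's prediction is weighted by its inverse variance, and the prior precision is subtracted M − 1 times so that it is not counted more than once. The sketch below assumes this scalar reading of the combination rule and a zero-mean prior; the function name and the numbers are illustrative only.

```python
def bcm_combine(means, variances, prior_var):
    """Combine M member predictions for one test point (scalar case).

    means[i], variances[i]: predictive mean and variance of member i.
    prior_var: prior variance at the test point (zero prior mean assumed).
    Returns the combined mean and variance.
    """
    m = len(means)
    precision = -(m - 1) / prior_var + sum(1.0 / v for v in variances)
    mean = sum(mu / v for mu, v in zip(means, variances)) / precision
    return mean, 1.0 / precision


# Two equally confident members that disagree.
mean, var = bcm_combine([1.0, 3.0], [0.5, 0.5], prior_var=1.0)
```

With a single member the rule returns that member's own mean and variance, which is a quick sanity check on the sign of the prior term.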

The optimum parameter settings of the RC model, namely batch size, number of decimal places, number of execution slots, number of iterations, and number of seeds, are 100, 1, 1, 15, and 1, respectively.

Analytical models

SKM model

The Navier-Stokes equation for a fluid element in steady uniform flow can be written as:

$$\rho\left(v\frac{\partial u}{\partial y} + w\frac{\partial u}{\partial z}\right) = \rho g S_0 + \frac{\partial \tau_{yx}}{\partial y} + \frac{\partial \tau_{zx}}{\partial z}, \tag{17}$$

where S0 is the bed slope; u, v, and w are the local velocities; τyx and τzx are the Reynolds stresses; and g and ρ are the gravitational acceleration and fluid density, respectively. An analytical solution of the Navier-Stokes equation for predicting the lateral variation of the depth-averaged velocity in compound channels was proposed by Shiono and Knight [11]. It accounts for the 3D flow through depth-integrated parameters, which simplifies its use, as follows:

$$\rho g S_0 H - \rho\frac{f}{8}\sqrt{1+s^2}\,U_d^2 + \frac{\partial}{\partial y}\left[\rho\lambda H^2\left(\frac{f}{8}\right)^{1/2} U_d\frac{\partial U_d}{\partial y}\right] = \frac{\partial}{\partial y}\left[H(\rho UV)_d\right], \tag{18}$$

where s is the channel side wall slope, and H, Ud, λ, f, and y are the local flow depth, the depth-averaged velocity, the dimensionless eddy viscosity, the Darcy-Weisbach friction factor, and the lateral coordinate, respectively. Shiono and Knight [11] proposed an analytical solution, initially ignoring the secondary flow term on the right-hand side of Eq. (18). They concluded that, with the secondary current term ignored, the velocity profile could be determined relatively accurately. However, when the bed friction, f, or the turbulent friction, λ, was increased, the relationship between the depth-averaged velocity and the bed shear stress was compromised to the point that both profiles could not be predicted accurately at the same time.

Shiono and Knight [48] proposed a secondary current model to improve the analytical results. From experimental results, they concluded that within certain regions of the flow the depth-averaged term on the right-hand side of Eq. (18) varies linearly in the y-direction on the floodplains and in the main channel, so that its derivative can be replaced by a constant, Γ, in each of these regions. Hence

$$\Gamma = \frac{\partial}{\partial y}\left[H(\rho UV)_d\right], \tag{19}$$

$$\rho g S_0 H - \rho\frac{f}{8}\sqrt{1+s^2}\,U_d^2 + \frac{\partial}{\partial y}\left[\rho\lambda H^2\left(\frac{f}{8}\right)^{1/2} U_d\frac{\partial U_d}{\partial y}\right] = \Gamma. \tag{20}$$

For a flat bed region (s = 0), Eq. (20) may be written as follows:

$$\rho g H S_0 - \frac{1}{8}\rho f U_d^2 + \frac{\partial}{\partial y}\left[\rho\lambda H^2\left(\frac{f}{8}\right)^{1/2} U_d\frac{\partial U_d}{\partial y}\right] = \Gamma. \tag{21}$$

According to Shiono and Knight [48], the analytical solution of Eq. (21) for a prismatic compound channel with a flat bed region and vertical side walls is expressed as follows:

$$U_d = \left[A_1 e^{\gamma y} + A_2 e^{-\gamma y} + k\right]^{1/2}, \tag{22}$$

where

$$k = \frac{8 g S_0 H}{f}(1-\beta), \qquad \gamma = \sqrt{\frac{2}{\lambda}}\left(\frac{f}{8}\right)^{1/4}\frac{1}{H}, \qquad \beta = \frac{\Gamma}{\rho g S_0 H}.$$

At each interface between selected panels, boundary conditions are used to determine the unknown coefficients A1 and A2.

Once the depth-averaged velocity is known, the bed shear stress can be calculated as:

$$\tau_b = \frac{\rho f U_d^2}{8}. \tag{23}$$

It should be noted that the SKM cannot model the shear stress distribution on the walls of rectangular compound channels.
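The panel constants and the bed shear stress of the flat-bed solution are simple to evaluate once f, λ, and Γ are chosen. The sketch below is a minimal illustration; the function names, the water density, and the numerical values of f and λ are assumptions, and the coefficients A1 and A2 would in practice come from the boundary conditions between panels.

```python
import math

RHO = 1000.0  # fluid density, kg/m^3 (assumed value for water)
G = 9.81      # gravitational acceleration, m/s^2


def skm_constants(H, S0, f, lam, Gamma_sec=0.0):
    """Return (k, gamma, beta) of the flat-bed SKM solution for one panel."""
    beta = Gamma_sec / (RHO * G * S0 * H)
    k = 8.0 * G * S0 * H * (1.0 - beta) / f
    gamma = math.sqrt(2.0 / lam) * (f / 8.0) ** 0.25 / H
    return k, gamma, beta


def depth_averaged_velocity(y, A1, A2, k, gamma):
    """Ud = [A1*exp(gamma*y) + A2*exp(-gamma*y) + k]^(1/2)."""
    return math.sqrt(A1 * math.exp(gamma * y) + A2 * math.exp(-gamma * y) + k)


def bed_shear(Ud, f):
    """tau_b = rho * f * Ud^2 / 8."""
    return RHO * f * Ud ** 2 / 8.0


# Far from any lateral boundary (A1 = A2 = 0) and with no secondary flow,
# the bed shear stress reduces to rho * g * S0 * H.
k, gamma, beta = skm_constants(H=0.05, S0=2.003e-3, f=0.02, lam=0.07)
tau = bed_shear(depth_averaged_velocity(0.0, 0.0, 0.0, k, gamma), f=0.02)
```

This limiting case is a convenient check that k and the shear stress formula are mutually consistent.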

Shannon model

Based on the Shannon entropy concept, Sterling and Knight [49] developed equations to estimate the shear stress distribution in channels. They proposed equations for predicting the shear stress distribution along the wetted perimeter of a circular channel without a flat bed, and separate equations for the wall and bed of trapezoidal channels and of circular channels with sediment. Sheikh Khozani and Bonakdari [19] used these models to estimate the shear stress distribution and compared them with other analytical models. The equations suggested by Sterling and Knight [49] are as follows:

$$\tau_w = \frac{1}{\lambda_w}\ln\left[1 + \left(e^{\lambda_w \tau_{\max(w)}} - 1\right)\frac{2(y - y_w)}{P_w}\right], \qquad y_w \le y \le \frac{P_w}{2}, \tag{24}$$

$$\tau_b = \frac{1}{\lambda_b}\ln\left[1 + \left(e^{\lambda_b \tau_{\max(b)}} - 1\right)\frac{2(y - y_w)}{P_b}\right], \qquad \frac{P_w}{2} \le y \le \frac{P_w}{2} + y_w, \tag{25}$$

where τw and τb are the shear stress values on the wall and bed of the floodplain or main channel, respectively; τmax(w) and τmax(b) are the maximum shear stress values on the wall and bed, respectively; Pw and Pb are the wall and bed wetted perimeters, respectively; and yw is an offset taken as 5 mm in the study of Sterling and Knight [49]. λw and λb are the Lagrange multipliers for the wall and bed of the compound channel subsections, respectively, calculated as:

$$\lambda_w = \left[\frac{\tau_{\max(w)}\, e^{\lambda_w \tau_{\max(w)}}}{e^{\lambda_w \tau_{\max(w)}} - 1} - \rho g R S_0\right]^{-1}, \tag{26}$$

$$\lambda_b = \left[\frac{\tau_{\max(b)}\, e^{\lambda_b \tau_{\max(b)}}}{e^{\lambda_b \tau_{\max(b)}} - 1} - \rho g R S_0\right]^{-1}, \tag{27}$$

where ρ is the fluid density, g is the gravitational acceleration, R is the hydraulic radius, and S0 is the channel slope. To compute the maximum shear stress, the relations proposed by Knight et al. [50] were used, as in the studies of other researchers [51–53].
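The Lagrange multiplier appears on both sides of its defining relation, so it has to be found numerically. The sketch below uses bisection, which is one possible choice and is not stated in the paper; the function names, the search bracket, and the restriction to a positive multiplier (mean shear stress between about half of and the full maximum) are assumptions of this illustration. Here `tau_mean` stands for ρgRS0.

```python
import math


def solve_lambda(tau_max, tau_mean, n_iter=200):
    """Solve 1/lam = tau_max*e^x/(e^x - 1) - tau_mean, with x = lam*tau_max.

    Bisection on x; this sketch only handles a positive multiplier.
    """
    def mean_shear(x):
        # Equals tau_max*e^x/(e^x - 1) - 1/lam, written to avoid overflow.
        return tau_max * (1.0 / (-math.expm1(-x)) - 1.0 / x)

    lo, hi = 1e-3, 700.0
    if not mean_shear(lo) < tau_mean < mean_shear(hi):
        raise ValueError("sketch needs roughly tau_max/2 < tau_mean < tau_max")
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if mean_shear(mid) < tau_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / tau_max


def wall_shear(y, lam, tau_max, P_w, y_w=0.005):
    """Wall shear stress at coordinate y (lengths in m, y_w = 5 mm)."""
    ratio = 2.0 * (y - y_w) / P_w
    return math.log1p(math.expm1(lam * tau_max) * ratio) / lam


lam_w = solve_lambda(tau_max=1.0, tau_mean=0.7)  # illustrative values
```

By construction the wall relation returns zero at y = yw and the maximum shear stress where 2(y − yw)/Pw equals one, which gives two quick checks on any implementation.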

Models performance evaluation

According to Dawson et al. [54], a single statistical criterion is not sufficient for evaluating a model. To investigate the performance of each model in estimating the shear stress distribution in compound channels, five commonly used criteria were applied: the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), Nash-Sutcliffe efficiency (NSE), and BIAS. These statistical indexes are calculated as:

$$R^2 = \left[\frac{\sum_{i=1}^{n}(x_{io} - \bar{x}_{io})(x_{ip} - \bar{x}_{ip})}{\sqrt{\sum_{i=1}^{n}(x_{io} - \bar{x}_{io})^2\,\sum_{i=1}^{n}(x_{ip} - \bar{x}_{ip})^2}}\right]^2, \tag{28}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}(x_{ip} - x_{io})^2}{n}}, \tag{29}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|x_{ip} - x_{io}\right|, \tag{30}$$

$$\mathrm{BIAS} = \frac{\sum_{i=1}^{n}(x_{ip} - x_{io})}{n}, \tag{31}$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}(x_{ip} - x_{io})^2}{\sum_{i=1}^{n}(x_{io} - \bar{x}_{io})^2}, \tag{32}$$

where x_ip is the shear stress value predicted by the models, x_io is the shear stress value observed in the laboratory, x̄_io and x̄_ip are the mean observed and predicted shear stress values, respectively, and n is the number of samples.
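The five indexes can be computed together from paired observed and predicted values. The sketch below is a minimal stdlib implementation; the function name is hypothetical, and NSE is taken in its standard form, one minus the ratio of the squared errors to the variance of the observations about their mean, which is an assumption about the intended definition.

```python
import math


def evaluate(obs, pred):
    """Return R2, RMSE, MAE, BIAS, and NSE for paired observations/predictions."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    s_oo = sum((o - mean_o) ** 2 for o in obs)
    s_pp = sum((p - mean_p) ** 2 for p in pred)
    sse = sum((p - o) ** 2 for o, p in zip(obs, pred))
    return {
        "R2": (s_op / math.sqrt(s_oo * s_pp)) ** 2,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(p - o) for o, p in zip(obs, pred)) / n,
        "BIAS": sum(p - o for o, p in zip(obs, pred)) / n,
        "NSE": 1.0 - sse / s_oo,
    }


# Predictions shifted by a constant: perfect correlation, but nonzero error.
scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

The example illustrates why more than one criterion is useful: a constant offset leaves R² at one while RMSE, MAE, BIAS, and NSE all register the error.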

These indexes were used by Sheikh Khozani et al. [31] to investigate the model performances in modeling apparent shear stress in compound channels.

Results and discussion

Selecting the best data mining model

All five models were applied to the shear stress distribution data measured in a straight rectangular compound channel. About 1812 data points were used in the modeling procedure: 70% for the training stage and 30% for the testing stage. The results of the testing stage are shown in Fig. 5 as a scatter plot and a hydrograph. According to this figure, the AR model gave the poorest predictions of shear stress distribution, with R² of 0.6745. As seen in Fig. 5, the AR model predicted the same shear stress values at different y/P in each test, and the hydrograph shows that it could not estimate shear stress over the whole wetted perimeter. The M5P and KStar models show similar results. As the hydrograph shows, these models are weak in predicting the maximum and minimum shear stresses on the walls and beds of the main channel and floodplains, but at other y/P they gave more accurate results than the AR model. The RC and RF models predict the maximum and minimum shear stress values better than the other models. The scatter plot in Fig. 5 clearly shows that the RF model, with R² of 0.9003, gave more precise results than the AR, KStar, M5P, and RC models. Therefore, the predictions of the RF model are compared with the two analytical models (the SKM and the Shannon models) in the next section.

The statistical criteria for all five data mining models are presented in Table 2. The RF model outperforms the other models, with the lowest RMSE of 0.971, whereas the AR model gave the poorest estimates of shear stress distribution in compound channels, with RMSE of 0.1707. Based on Fig. 5 and Table 2, the RF model was selected as the best of the five models for predicting the shear stress distribution in compound channels.

Comparison of the models

Five data mining methods were investigated for estimating the shear stress distribution in a prismatic compound channel with rectangular cross-section, and the RF model outperformed the others in all subsections of the compound channel. In this section, the RF model is compared with the Shannon and SKM models in forecasting the shear stress distribution. Figure 6 compares the two analytical models with the RF model. As seen in Fig. 6, the SKM model predicts the shear stress on the bed of the main channel better than on the bed of the floodplains. The SKM model can only estimate the bed shear stress and cannot predict wall shear stresses. According to Fig. 6, the SKM model overestimates the bed shear stress of the main channel and underestimates the bed shear stress of the floodplains. The accuracy of the SKM predictions for the bed of the main channel decreases as the floodplain width increases.

By contrast, for wider floodplains the shear stress predictions for the floodplain bed are more precise. As the floodplain width increases, the SKM model also reproduces the pattern of shear stress on the floodplain bed more accurately, as seen in Figs. 6(e), 6(f), 6(g), and 6(h). The Shannon model performs better than the SKM model. In all subsections the Shannon model overestimates shear stress, but it estimates wall shear stress better than bed shear stress. When the floodplain width is 100 mm, the Shannon model performs roughly the same as the SKM model for the main channel bed shear stress. As the floodplain width increases, the SKM results become weaker than those of the Shannon model. Of the three models, the RF model gives the most accurate results, as seen in Fig. 6. Besides giving the most accurate predictions of shear stress over the whole wetted perimeter, the RF model reproduces the pattern of the shear stress distribution very well. With the RF model, shear stress can be estimated over the whole channel boundary using only the hydraulic parameters y/L, Fr, H/h, and Bfp/Bmc. By contrast, the Shannon entropy model requires computing the Lagrange multipliers, and its results are not as accurate as those of the RF model. The SKM model can estimate only the bed shear stress; it requires the depth-averaged velocity, and computing the shear stress is a time-consuming procedure.

The statistical results of the comparison between the RF, Shannon, and SKM models are given in Table 3. Lower RMSE and MAE values indicate better model performance. Because the SKM model predicts only the bed shear stress of the floodplains and the main channel, its results in Table 3 cover only these predictions. According to this table, the RF model, with the lowest RMSE and MAE, gives the best estimates of shear stress distribution in compound channels. The Shannon entropy model performs better than the SKM model in predicting shear stress values. NSE values grade model performance as follows: very good for 0.75<NSE≤1, good for 0.65<NSE≤0.75, satisfactory for 0.5<NSE≤0.65, acceptable for 0.4<NSE≤0.5, and unsatisfactory for NSE≤0.4. As seen in Table 3, the NSE values for the RF model are higher than 0.95, so the RF model falls in the very good class for estimating shear stress values. For OPC-100, OPC-200, OPC-300, and OPC-400, the RF model gives the most precise results, with RMSE of 0.0166, 0.0255, 0.0338, and 0.0518, respectively, in comparison with the Shannon and SKM models. Overall, based on Fig. 6 and Table 3, the RF model is the most robust of the models in this study for estimating shear stress distribution in compound channels. It is worth adding that R², RMSE, MAE, NSE, and BIAS, which are used to assess how good regression models are, can in some cases overestimate (or underestimate) the training data. To overcome these issues, Bayesian methods can be used to improve the regression model [56–58].
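The NSE grading bands quoted above translate directly into a small lookup. The function name below is hypothetical; the thresholds are those stated in the text.

```python
def nse_grade(nse):
    """Map an NSE value to the performance class used in the text."""
    if nse > 0.75:
        return "very good"
    if nse > 0.65:
        return "good"
    if nse > 0.5:
        return "satisfactory"
    if nse > 0.4:
        return "acceptable"
    return "unsatisfactory"


print(nse_grade(0.96))  # very good
```

Each band is open at its lower bound and closed at its upper bound, so a value of exactly 0.75 is graded good rather than very good.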

Conclusions

In this research, the authors investigated the shear stress distribution in compound channels. A series of experiments was performed in prismatic compound channels of simple rectangular cross-section with floodplain widths of 100, 200, 300, and 400 mm, using the flume of the University of Birmingham. The results were used with five data mining methods, the AR, M5P, KStar, RC, and RF models, to predict the shear stress distribution. The AR model, with R² of 0.6745, was not able to estimate shear stress accurately over the whole wetted perimeter. The M5P and KStar models did not predict the maximum and minimum shear stresses on the walls and beds of the main channel and floodplains well; at other locations on the perimeter, however, they were more accurate than the AR model. The RC and RF models predict the maximum and minimum shear stress values better than the other models. The RF model, with R² of 0.9003, gives the most precise predictions of all the models. The Shannon and SKM analytical models were compared with the RF model. The SKM model can predict only the bed shear stress of the floodplains and the main channel, whereas the Shannon model predicts wall shear stresses more accurately than bed shear stresses. The accuracy of the SKM predictions for the main channel bed decreases as the floodplain width increases, while the shear stress predictions for the floodplain bed are more precise for wider floodplains. The RF machine learning model has lower RMSE and MAE than the two well-known analytical models in predicting the shear stress distribution over the whole wetted perimeter. The RF technique can estimate shear stress over the whole channel boundary using the hydraulic parameters y/L, Fr, H/h, and Bfp/Bmc. By contrast, the Shannon entropy and SKM models require the Lagrange multiplier and the depth-averaged velocity, respectively, and their results were not as accurate as those of the RF model.

References

[1]

Chiu C L, Chiou J D. Structure of 3-D flow in rectangular open channels. Journal of Hydraulic Engineering, 1986, 112(11): 1050–1067

[2]

Chiu C L, Lin G F. Computation of 3-D flow and shear in open channels. Journal of Hydraulic Engineering, 1983, 109(11): 1424–1440

[3]

Ghosh S N, Roy N. Boundary shear distribution in open channel flow. Journal of the Hydraulics Division, 1970, 96(4): 967–994

[4]

Knight D W, Demetriou J D, Hamed M E. Boundary shear in smooth rectangular channels. Journal of Hydraulic Engineering, 1984, 110(4): 405–422

[5]

Flintham T P, Carling P A. Prediction of mean bed and wall boundary shear in uniform and compositely rough channels. In: International Conference on River Regime Hydraulics Research Limited. Chichester: Wiley, 1988

[6]

Khatua K K, Patra K C. Boundary shear stress distribution in compound open channel flow. ISH Journal of Hydraulic Engineering, 2007, 13(3): 39–54

[7]

Knight D W, Hamed M E. Boundary shear in symmetrical compound channels. Journal of Hydraulic Engineering, 1984, 110(10): 1412–1430

[8]

Naik B, Khatua K K. Boundary shear stress distribution for a converging compound channel. ISH Journal of Hydraulic Engineering, 2016, 22(2): 212–219

[9]

Tominaga A, Nezu I, Ezaki K, Nakagawa H. Three-dimensional turbulent structure in straight open channel flows. Journal of Hydraulic Research, 1989, 27(1): 149–173

[10]

Rezaei B, Knight D W. Overbank flow in compound channels with nonprismatic floodplains. Journal of Hydraulic Engineering, 2011, 137(8): 815–824

[11]

Shiono K, Knight D W. Two-dimensional analytical solution for a compound channel. In: Proceedings of the 3rd International Symposium Refined flow modeling and turbulence measurements. Tokyo, 1988, 503–510

[12]

Khodashenas S R, Paquier A. A geometrical method for computing the distribution of boundary shear stress across irregular straight open channels. Journal of Hydraulic Research, 1999, 37(3): 381–388

[13]

Yang S Q, Lim S Y. Boundary shear stress distributions in trapezoidal channels. Journal of Hydraulic Research, 2005, 43(1): 98–102

[14]

Yang K, Nie R, Liu X, Cao S. Modeling depth-averaged velocity and boundary shear stress in rectangular compound channels with secondary flows. Journal of Hydraulic Engineering, 2013, 139(1): 76–83

[15]

Bonakdari H, Tooshmalani M, Sheikh Z. Predicting shear stress distribution in rectangular channels using entropy concept. International Journal of Engineering, Transaction A: Basics, 2015, 28: 360–367

[16]

Sheikh Khozani Z, Bonakdari H, Ebtehaj I. An analysis of shear stress distribution in circular channels with sediment deposition based on Gene Expression Programming. International Journal of Sediment Research, 2017, 32(4): 575–584

[17]

Sheikh Khozani Z, Bonakdari H, Zaji A H. Efficient shear stress distribution detection in circular channels using Extreme Learning Machines and the M5 model tree algorithm. Urban Water Journal, 2017, 14(10): 999–1006

[18]

Rezaei B, Knight D W. Application of the Shiono and Knight Method in compound channels with non-prismatic floodplains. Journal of Hydraulic Research, 2009, 47(6): 716–726

[19]

Sheikh Khozani Z, Bonakdari H. A comparison of five different models in predicting the shear stress distribution in straight compound channels. Scientia Iranica, 2016, 23: 2536–2545

[20]

Genç O, Gonen B, Ardıçlıoğlu M. A comparative evaluation of shear stress modeling based on machine learning methods in small streams. Journal of Hydroinformatics, 2015, 17(5): 805–816

[21]

Bonakdari H, Sheikh Khozani Z, Zaji A H, Asadpour N. Evaluating the apparent shear stress in prismatic compound channels using the Genetic Algorithm based on Multi-Layer Perceptron: A comparative study. Applied Mathematics and Computation, 2018, 338: 400–411

[22]

Sheikh Khozani Z, Bonakdari H, Ebtehaj I. An expert system for predicting shear stress distribution in circular open channels using gene expression programming. Water Science and Engineering, 2018, 11(2): 167–176

[23]

Sheikh Khozani Z, Bonakdari H, Zaji A H. Estimating shear stress in a rectangular channel with rough boundaries using an optimized SVM method. Neural Computing & Applications, 2018, 30: 1–13

[24]

Azad A, Farzin S, Kashi H, Sanikhani H, Karami H, Kisi O. Prediction of river flow using hybrid neuro-fuzzy models. Arabian Journal of Geosciences, 2018, 11(22): 718

[25]

Sanikhani H, Kisi O, Maroufpoor E, Yaseen Z M. Temperature-based modeling of reference evapotranspiration using several artificial intelligence models: Application of different modeling scenarios. Theoretical and Applied Climatology, 2019, 135(1–2): 449–462

[26]

Vonk J, Shackelford T K. The Oxford Handbook of Comparative Evolutionary Psychology. New York: Oxford University Press, 2012

[27]

Anitescu C, Atroshchenko E, Alajlan N, Rabczuk T. Artificial neural network methods for the solution of second order boundary value problems. Computers, Materials & Continua, 2019, 59(1): 345–359

[28]

Guo H, Zhuang X, Rabczuk T. A deep collocation method for the bending analysis of Kirchhoff plate. Computers, Materials & Continua, 2019, 59(2): 433–456

[29]

Sheikh Khozani Z, Bonakdari H, Zaji A H. Estimating the shear stress distribution in circular channels based on the randomized neural network technique. Applied Soft Computing, 2017, 58: 441–448

[30]

Khuntia J R, Devi K, Khatua K K. Flow distribution in a compound channel using an artificial neural network. Sustainable Water Resources Management, 2019, 5(4): 1–12

[31]

Sheikh Khozani Z, Khosravi K, Pham B T, Kløve B, Wan Mohtar W H M, Yaseen Z M. Determination of compound channel apparent shear stress: Application of novel data mining models. Journal of Hydroinformatics, 2019, 21(5): 798–811

[32]

Lovell M C. CAI on pcs—Some economic applications. Journal of Economic Education, 1987, 18: 319–329

[33]

Witten I H, Frank E, Hall M A, Pal C J. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 2016

[34]

Olson D L. Data mining in business services. Service Business, 2007, 1(3): 181–193

[35]

Guo H, Zhuang X, Rabczuk T. A deep collocation method for the bending analysis of Kirchhoff plate. Computers, Materials & Continua, 2019, 59(2): 433–456

[36]

Ho T K. The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(8): 832–844

[37]

Breiman L. Random forests. Machine Learning, 2001, 45(1): 5–32

[38]

Sun J, Zhong G, Huang K, Dong J. Banzhaf random forests: Cooperative game theory based random forests with consistency. Neural Networks, 2018, 106: 20–29

[39]

Chen M, Wang X, Feng B, Liu W. Structured random forest for label distribution learning. Neurocomputing, 2018, 320: 171–182

[40]

Quinlan J R. Learning with continuous classes. Mach Learn, 1992, 92: 343–348

[41]

Wang Y, Witten I H. Induction of model trees for predicting continuous classes. In: Proceedings of the 9th European Conference on Machine Learning Poster Papers. Prague, 1997, 128–137

[42]

Behnood A, Behnood V, Modiri Gharehveran M, Alyamac K E. Prediction of the compressive strength of normal and high-performance concretes using M5P model tree algorithm. Construction & Building Materials, 2017, 142: 199–207

[43]

Cleary J G, Trigg L E K. An instance-based learner using an entropic distance measure. Machine Learning Proceedings, 1995, 1995: 108–114

[44]

Tejera Hernández D C. An Experimental Study of K* Algorithm. International Journal of Information Engineering and Electronic Business, 2015, 2: 14–19

[45]

Friedman J H, Stuetzle W. Projection pursuit regression. Journal of the American Statistical Association, 1981, 76(376): 817–823

[46]

Yoshida T. Semiparametric method for model structure discovery in additive regression models. Economie & Statistique, 2018, 5: 124–136

[47]

Hu Y H, Hwang J N. Handbook of Neural Network Signal Processing. Boca Raton, FL: CRC Press, Inc., 2001

[48]

Shiono K, Knight D W. Turbulent open-channel flows with variable depth across the channel. Journal of Fluid Mechanics, 1991, 222: 617–646

[49]

Sterling M, Knight D. An attempt at using the entropy approach to predict the transverse distribution of boundary shear stress in open channel flow. Stochastic Environmental Research and Risk Assessment, 2002, 16: 127–142

[50]

Knight D W, Yuen K W H, Alhamid A A I. Boundary shear stress distributions in open channel flow. Physical Mechanisms of Mixing and Transport in the Environment, 1994, 1994: 51–87

[51]

Bonakdari H, Sheikh Z, Tooshmalani M. Comparison between Shannon and Tsallis entropies for prediction of shear stress distribution in open channels. Stochastic Environmental Research and Risk Assessment, 2015, 29(1): 1–11

[52]

Sheikh Z, Bonakdari H. Prediction of boundary shear stress in circular and trapezoidal channels with entropy concept. Urban Water Journal, 2016, 13(6): 629–636

[53]

Sheikh Khozani Z, Bonakdari H. Formulating the shear stress distribution in circular open channels based on the Renyi entropy. Physica A: Statistical Mechanics and Its Applications, 2018, 490: 114–126

[54]

Dawson C W, Abrahart R J, See L M. HydroTest: A web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts. Environmental Modelling & Software, 2007, 22(7): 1034–1052

[55]

Rezaei B. Overbank Flow in Compound Channels with Prismatic and Non-prismatic Floodplains. Birmingham: University of Birmingham, 2006

[56]

Vu-Bac N, Lahmer T, Zhang Y, Zhuang X, Rabczuk T. Stochastic predictions of interfacial characteristic of polymeric nanocomposites (PNCs). Composites. Part B, Engineering, 2014, 59: 80–95

[57]

Vu-Bac N, Lahmer T, Zhuang X, Nguyen-Thoi T, Rabczuk T. A software framework for probabilistic sensitivity analysis for computationally expensive models. Advances in Engineering Software, 2016, 100: 19–31

[58]

Vu-Bac N, Rafiee R, Zhuang X, Lahmer T, Rabczuk T. Uncertainty quantification for multiscale modeling of polymer nanocomposites with correlated parameters. Composites. Part B, Engineering, 2015, 68: 446–464

RIGHTS & PERMISSIONS

Higher Education Press
