Prediction of Boundary Shear Stress Distribution in Converging Compound Channel Using Gene Expression Programming

The computation of the boundary shear stress distribution in open channel flow is required for a variety of applications, including flow resistance relationships and the design of stable channels. During overbank flow, the river overtops the main channel and spills onto the floodplains on both sides. Because of the momentum transfer between the main channel and the adjacent floodplains, the flow structure in such compound channels becomes complicated, which strongly affects the shear stress distribution in the floodplain and main channel subsections. In addition, agriculture and development activities have encroached on the floodplains of river systems. As a consequence, the floodplain geometry changes along the length of the flow, resulting in a converging compound channel. Traditional formulas, which rely heavily on empirical approaches, do not predict the shear force distribution with high precision, so new and more accurate approaches are still in demand. In this paper, the boundary shear force carried by the floodplains is estimated by gene expression programming (GEP). A novel equation in terms of non-dimensional geometric and flow variables is constructed to predict the boundary shear force distribution. The percentage shear force carried by the floodplains predicted using GEP agrees better with the experimental data than the conventional formulas do (R = 0.96 and RMSE = 3.395 for the training data and R = 0.95 and RMSE = 4.022 for the testing data).


Introduction
In many natural rivers, the main channel is flanked by one or more adjacent floodplains. Because these channels comprise more than one of the basic elementary configurations, they are referred to as compound channels. Numerous hydraulic phenomena, such as channel roughness, sedimentation, catchment disintegration, bed morphology, land subsidence, and  ). A number of studies have investigated the wall shear stress distribution and flow resistance in two-stage and single-stage channels with movable bed conditions (Knight and Patel 1985; Nezu et al. 1993; Rhodes and Knight 1994; Ackerman and Hoover 2001; Tayfur 2002). Yang and Lim (1998) developed an analytical approach to calculate wall shear force distributions in compound channels. Khuntia et al. (2018) used artificial neural networks to predict the distribution of boundary shear stress in a two-stage straight channel.
Sellin (1964) examined the momentum transfer phenomenon in laboratory flumes. Consequently, many researchers attributed the non-uniformity of the boundary shear stress along the section perimeter to momentum transfer (Ghosh and Jena 1971; Patra et al. 2004). Knight and Demetriou (1983) developed a model for the boundary shear stress distribution in a homogeneous compound channel for width ratios (α = floodplain width (B)/main channel width (b)) up to 4. Khatua and Patra (2007) extended this work using data collected during their own experiments and proposed a model suitable for channels with a width ratio of 5. Mohanty and Khatua (2014) proposed another model for channels with 6.67 ≤ α ≤ 11.96. The geometries of prismatic and meandering compound channels have thus been studied thoroughly in laboratory flumes. However, when prismatic compound channel models were compared against non-prismatic compound channel data, substantial errors were detected in the estimated percentage of shear force carried by the floodplains (%Sfp) (Bousmar and Zech 1999; Bousmar et al. 2004; Proust et al. 2006).
In non-prismatic compound channel flow models, the extra momentum transfer should be taken into consideration. The boundary shear stress distribution is strongly influenced by the geometry of the cross-section and the two-phase flow structure. As a result, new models for non-prismatic compound sections are required. To derive an expression for %Sfp, experiments were performed on a two-stage channel with converging floodplains.
Constructing a model for wall shear stress by mathematical, analytical, or numerical methods is very difficult, because the relationships between the dependent and independent variables are hard to establish. Moreover, such models become cumbersome and laborious; hence, an easily implementable method such as gene expression programming (GEP) can be used to estimate boundary shear stress. It not only reduces the experimental effort and time but also eliminates rigorous computations. As a result, compound channels are increasingly being

Theoretical Background
Many researchers have modeled boundary shear stress in compound open channels. The formulae for the percentage of boundary shear force carried by the floodplain (%Sfp) in compound sections with various characteristics are given below. Knight and Demetriou (1983) proposed an equation for compound channels with width ratio α up to 4, expressed as

A percent Sfp equation was established by
2. In order to account for channels with non-homogeneous roughness,  developed The exponent can be evaluated from the relation is the ratio of roughness between floodplain and main channel.

Development of model for boundary shear stress estimation
In open channel flow, the boundary shear per unit length (SF) is usually considered to be uniform. Accordingly, SF = ρgAS, where ρ denotes the density of water, g the acceleration due to gravity, A the flow area, and S the bed slope of the channel. As the flow depth changes, the only variable that changes is the flow area (A). Therefore, the shear force depends on the flow area.
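The relation SF = ρgAS can be illustrated with a short Python sketch. The function name, the default constants, and the sample areas and slope are assumptions chosen for illustration; they are not values from the experiments.

```python
# Minimal sketch of SF = rho * g * A * S (shear force per unit length).
# The constants and sample inputs are illustrative assumptions only.
RHO_WATER = 1000.0  # density of water, kg/m^3
G = 9.81            # acceleration due to gravity, m/s^2


def shear_force_per_unit_length(area, slope, rho=RHO_WATER, g=G):
    """Return SF in N/m for a flow area in m^2 and a dimensionless bed slope."""
    if area < 0 or slope < 0:
        raise ValueError("area and slope must be non-negative")
    return rho * g * area * slope


if __name__ == "__main__":
    # With rho, g and S fixed, SF scales linearly with the flow area.
    for area in (0.02, 0.04, 0.08):
        print(area, shear_force_per_unit_length(area, slope=0.001))
```

Because ρ, g, and S are fixed for a given channel, doubling the flow area doubles SF, which is the dependence on flow area stated above.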

Development of GEP model
The relationship (Eq. 9) expresses the percentage boundary shear force carried by the floodplains as a function of geometric and hydraulic variables. In this study, the modeling procedure uses %Sfp as the target value and the four independent factors (α, β, θ, Xr) in Eq. (9) as input variables. The model is constructed using the four fundamental arithmetic operators (+, −, ×, /) and a set of basic mathematical functions (e^x, ln, x^2, average, cube root, maximum of two, inverse, tangent). A total of 112 datasets are used in the modeling process, some of which are shown in Table 1. The data are randomly divided between the two phases of the modeling process: 50% of the data is used for training and the other 50% for testing. RMSE was the fitness function (Ei), and the fitness (fi) was computed by Eq. (10), which accounts for the total sum of errors with respect to the target values. The first model was constructed with one gene and a head size of two. The numbers of genes and heads were then increased one by one in successive runs, and the results for the training and testing datasets were recorded. For head lengths greater than eight and more than three genes, the performance in the training and testing phases did not improve significantly.
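The random 50/50 division of the 112 datasets described above can be sketched as follows. This is only an illustration of such a split; the function name and the seed are assumptions, and the study does not state which random procedure or software setting was used.

```python
import random


def split_half(n_records, seed=0):
    """Randomly partition record indices 0..n_records-1 into two equal halves.

    The seed is a hypothetical choice made so the sketch is reproducible.
    """
    indices = list(range(n_records))
    rng = random.Random(seed)
    rng.shuffle(indices)
    half = n_records // 2
    return sorted(indices[:half]), sorted(indices[half:])


# 112 datasets, as in the study: 56 for training and 56 for testing.
train_idx, test_idx = split_half(112)
```

Fixing the seed keeps the training and testing subsets identical across runs, so that models with different numbers of genes and head lengths are compared on the same data.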
Consequently, a head length of eight and three genes per chromosome were chosen for the GEP model. Addition was used to link the three genes. Testing revealed that after 450,000 generations there was no visible change in the fitness function value or in the coefficient of determination of the training and testing data, indicating that the evolution had converged. Table 2 where a and p are the actual and predicted values, respectively, ā and p̄ are the means of the actual and predicted values, respectively, and N is the number of datasets.
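The chromosome layout chosen above (head length eight, three genes linked by addition) can be sketched in Python. The tail-length rule t = h(n − 1) + 1 is the standard GEP relation between head length h and the maximum function arity n. Taking n = 2 is an assumption here, based on the binary operators in the function set; the function names are hypothetical.

```python
def tail_length(head_length, max_arity):
    """Standard GEP tail length: t = h * (n - 1) + 1."""
    return head_length * (max_arity - 1) + 1


def chromosome_length(head_length, max_arity, n_genes):
    """Total number of symbols across all genes (head + tail per gene)."""
    gene_length = head_length + tail_length(head_length, max_arity)
    return n_genes * gene_length


def link_by_addition(gene_outputs):
    """Additive linking: the model output is the sum of the gene outputs."""
    return sum(gene_outputs)


# Assumed maximum arity of 2 (binary operators such as +, -, x, /).
HEAD, ARITY, GENES = 8, 2, 3
tail = tail_length(HEAD, ARITY)
total = chromosome_length(HEAD, ARITY, GENES)
```

Under these assumptions each gene has a tail of 9 symbols and a length of 17, so a three-gene chromosome holds 51 symbols. Any additional constant domain used by a particular GEP implementation is not counted here.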
For both the training and testing datasets, the performance assessments are shown in Table 3.
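The performance measures named in this paper (R, R², MAE, RMSE, and MAPE) can be computed from the actual values a and predicted values p as sketched below. The formulas are the commonly used definitions and are an assumption here, since the original equations are not reproduced in this text. In particular, R² is taken as the square of the correlation coefficient, and the sample values are invented for illustration.

```python
import math


def performance_metrics(actual, predicted):
    """Return R, R2, MAE, RMSE and MAPE (%) for paired actual/predicted values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    a_mean = sum(actual) / n
    p_mean = sum(predicted) / n
    cov = sum((a - a_mean) * (p - p_mean) for a, p in zip(actual, predicted))
    var_a = sum((a - a_mean) ** 2 for a in actual)
    var_p = sum((p - p_mean) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)  # correlation coefficient
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    # MAPE requires non-zero actual values.
    mape = 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return {"R": r, "R2": r * r, "MAE": mae, "RMSE": rmse, "MAPE": mape}


# Hypothetical %Sfp values, for illustration only (not experimental data).
sample_actual = [10.0, 20.0, 30.0, 40.0]
sample_predicted = [12.0, 18.0, 33.0, 39.0]
sample_metrics = performance_metrics(sample_actual, sample_predicted)
```

Applying the same function to the training and testing subsets separately gives one set of metrics for each phase.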

Conclusions
Gene expression programming (GEP) is an evolutionary algorithm that uses fixed-length linear chromosomes to encode computer programs and to discover succinct and understandable solutions rapidly. In this study, GEP is used to estimate the percentage shear force carried by the floodplains in converging compound channels. The proposed model depends on the width ratio, relative flow depth, converging angle, and relative distance. The model is shown to be suitable for the converging compound channel systems considered, covering various laboratory models. Compared with the converging angle and relative distance, the relative flow depth and width ratio were found to be more influential in computing the percentage shear force. In terms of R², MAE, RMSE, and MAPE for the different series of datasets, the GEP model shows superior results compared with other methods such as Naik and Khatua [48], Khatua et al. [46], Knight and Hamed [17], and Mohanty et al. [54]. The suitability of the GEP approach is judged by the model's mean percentage error. Within the range of non-dimensional parameters examined in this study, the findings demonstrate the effectiveness of the GEP model and its potential utility for real applications. The results suggest that the GEP model can be applied within this range without further restrictions.

Acknowledgement
The authors would like to convey their heartfelt gratitude to the anonymous editor and reviewers for their time and effort in reviewing and providing helpful feedback on the article.

Notation
The following symbols are used in this paper: