Prediction of Transmission Coefficients of Regular Wave Attenuation by Emergent Vegetation


The transmission coefficient (Kt) for wave attenuation by vegetation is an essential parameter for predicting wave height. In this paper, based on experimental data from three kinds of artificial vegetation models, genetic programming (GP), artificial neural networks (ANNs) and multivariate non-linear regression (MNLR) were used to analyze the dimensionless factors, including the Ursell number (Ur), relative width (RB), relative height (α) and volume fraction (φ). The proposed GP formulae were compared with the MNLR and ANN results. The predictions of the GP models were in good agreement with the measured data and outperformed the MNLR equations. In addition, GP and ANNs were used to obtain the weight of each factor. The results can provide a reference for the artificial planting of the three plants.



Emergent vegetation, represented by mangroves, is widely distributed in middle and low latitudes and can effectively attenuate wave height and protect the coastline from storm surges [1,2]. The attenuation law of waves in the nearshore vegetation zone is a major concern in the study of wave dissipation [3,4]. The wave-absorbing ability of vegetation can be expressed by the transmission coefficient (Kt). On the one hand, as the sea level rises, extreme waves such as storm surges are occurring more often. Kt can be predicted from parameters such as incident wave height, wave period and vegetation density, and disaster warnings can be issued for areas with a large transmission coefficient. On the other hand, as local awareness of environmental protection has strengthened, artificial mangrove planting has increased, and in some countries the mangrove area has generally first decreased and then increased. A sensitivity analysis of the factors that affect Kt identifies the parameters that have the greatest impact on the waves. This can provide guidance for mangrove planting and help achieve the greatest economic effect.

Multivariate non-linear regression (MNLR) analysis is the most traditional approach to establishing prediction formulas [5,6]. However, it requires the form of the objective function to be assumed in advance, which in turn requires background knowledge about the relationship between input and output. Furthermore, some formulas are complicated in form and do not reflect the actual physical process, and their prediction accuracy needs to be improved. Predictive methods such as artificial neural networks (ANNs) and genetic programming (GP) can effectively deal with complex multivariate nonlinear problems and have been used to estimate hydraulic characteristics [7-9]. The main advantage of using GP for symbolic regression is that there is no need to specify the size and shape of the approximation function in advance, and specific knowledge of the problem can be included in the search process through appropriate mathematical functions [10]. ANNs cannot yield an explicit formula as GP does, but they perform very well in sensitivity analysis. Therefore, in this paper, we demonstrate the applicability of GP to the prediction of Kt. To assess accuracy, the predictions obtained by the GP, ANN and MNLR methods were compared with experimental data. In addition, sensitivity analysis was carried out using GP and ANNs to obtain the weightings of the dimensionless parameters.
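As a rough illustration of how Kt relates to the wave parameters mentioned above (incident wave height and wave period), the following Python sketch computes Kt as the ratio of transmitted to incident wave height, solves the linear dispersion relation for the wave length, and evaluates an Ursell number. It is a minimal sketch under stated assumptions: the function names and the flume values are hypothetical, linear wave theory is assumed for the wave length, and the common form Ur = H L^2 / h^3 is assumed for the Ursell number, which may differ from the definition used in the experiments.

```python
import math


def transmission_coefficient(transmitted_height, incident_height):
    """Kt = transmitted wave height / incident wave height."""
    if incident_height <= 0:
        raise ValueError("incident wave height must be positive")
    return transmitted_height / incident_height


def wave_length(period, depth, g=9.81, tol=1e-10, max_iter=200):
    """Solve L = g*T^2/(2*pi) * tanh(2*pi*h/L) by damped fixed-point iteration.

    Assumes linear (Airy) wave theory.
    """
    deep_water = g * period ** 2 / (2.0 * math.pi)
    length = deep_water  # initial guess: deep-water wave length
    for _ in range(max_iter):
        target = deep_water * math.tanh(2.0 * math.pi * depth / length)
        updated = 0.5 * (length + target)  # averaging damps the oscillation
        if abs(updated - length) < tol:
            return updated
        length = updated
    return length


def ursell_number(wave_height, length, depth):
    """Assumed definition: Ur = H * L^2 / h^3."""
    return wave_height * length ** 2 / depth ** 3


# Hypothetical flume values, not taken from the experiments.
period, depth = 1.2, 0.4            # wave period (s), still-water depth (m)
incident, transmitted = 0.08, 0.06  # wave heights (m)
L = wave_length(period, depth)
print("Kt =", round(transmission_coefficient(transmitted, incident), 3))
print("L  =", round(L, 3), "m")
print("Ur =", round(ursell_number(incident, L, depth), 3))
```

The averaged update is used because the plain fixed-point iteration oscillates around the solution in shallow water; any root finder would serve equally well.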

Transmission coefficient
The normally accepted dimensionless parameter for evaluating the performance of vegetation is the transmission coefficient (Kt) [4,6], which is the ratio of the transmitted wave height to the incident wave height [5,11]. Kt ranges from 0 to 1, where 0 indicates no transmission and 1 indicates complete transmission. The larger the Kt, the weaker the wave-dissipating ability of the vegetation. Kt can be defined as a function of wave, fluid and structure properties.

Experimental study is the main approach to understanding the mechanism of wave attenuation by vegetation [12]. [...]

The wave length of a laboratory-scale regular wave can be calculated by:

The volume fraction can be calculated by the following formula:

where Vs is the volume of water and V is the volume of the vegetated area at still water. The parameters of Eq. (3) were expressed in terms of dimensionless parameters using Buckingham's π theorem. The dimensionless form of the relationship between the influencing factors can be obtained as follows:

Step 1: Data entry. Enter the dimensionless numbers into Eureqa by column, and name each variable in the row labelled "name". Each row represents a set of measurements or values that are in some sense simultaneous.
Step 2: Data preparation. The variables differ little in magnitude, so standardization can be decided according to the prompts of the software.
Step 3: Search definition. Edit the formula for the target expression and select the appropriate mathematical building blocks (i.e. arithmetic, trigonometric, exponential). In general, we need to select as many operators as possible to get a more [...] it performs well in most cases, as shown by the results obtained. By default, Eureqa randomly shuffles the data and then splits it into training and validation sets based on the total size of the data. The training set is used to generate and optimize solutions, and the validation set is used to test how well those models generalize to new data. Eureqa also uses the validation data to filter out the best models to display in the Eureqa interface.

Step 4: Start and stop search. Once the program starts to run, generations are produced continuously and the search does not stop automatically. The search is stopped only when the mean absolute error (MAE) is less than 0.02 and remains constant or varies little over a long period of time. Table 1 shows the target expression, mathematical building blocks, error metric, row weight and data splitting used in this study.
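The data splitting in Step 3 and the stopping rule in Step 4 can be sketched in a few lines of Python. This is only an illustration, not Eureqa's internal implementation: the function names are hypothetical, the 70/30 split fraction is an assumption (the text says only that the split depends on the total size of the data), and the `window` and `tolerance` values are assumed stand-ins for "a long period of time" and "varies little". Only the MAE threshold of 0.02 comes from the text.

```python
import random


def shuffle_split(rows, train_fraction=0.7, seed=0):
    """Randomly shuffle rows, then split into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def mean_absolute_error(measured, predicted):
    """MAE between measured and predicted Kt values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def should_stop(mae_history, threshold=0.02, window=50, tolerance=1e-4):
    """Stop once MAE is below the threshold and has plateaued.

    mae_history holds one MAE value per generation. The plateau test
    (window, tolerance) is an assumption made for this sketch.
    """
    if len(mae_history) < window:
        return False
    recent = mae_history[-window:]
    return recent[-1] < threshold and max(recent) - min(recent) < tolerance


# Hypothetical usage with ten row indices and a flat MAE history.
train_rows, validation_rows = shuffle_split(range(10))
print(len(train_rows), len(validation_rows))
print(should_stop([0.015] * 60))
```

Keeping the validation rows out of the search is what allows the validation MAE to indicate how well a candidate formula generalizes to new data.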
ANNs are mathematical models that simulate the nervous system of the brain to process complex information, based on the main functions of the human brain. [...] neurons. They are arranged into three basic layers: input, hidden and output (Fig. 3). ANNs assign relative weights to the connections between neurons and continuously adjust these weights with a training algorithm so as to minimize the prediction error and reach a given prediction precision. They are widely applied in parameter sensitivity analysis.

Step 2: Select Kt as the dependent variable, and RH, RB, α and φ as covariates.

Step 3: Partition the data. Randomly assign cases based on the relative numbers of cases: training 70%, test 30%. Automatic architecture selection.
Step 4: Select the architecture. Use one hidden layer, with the number of units selected automatically.
Step 5: Select the output content (independent variable importance analysis, and the predicted-by-observed chart in the network performance group).

Table 4.

The ANNs structure used in this study.
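The partitioning, single-hidden-layer architecture and importance analysis described in the steps above can be illustrated with a small pure-Python sketch. It rests on several assumptions: the data are synthetic (four generic inputs standing in for the four covariates, with a made-up linear target), the number of hidden units is fixed at 4 rather than selected automatically, and permutation importance is swapped in as a simple stand-in for the software's independent variable importance analysis, which may be computed differently. All function names are hypothetical.

```python
import math
import random


def train_mlp(X, y, hidden=4, epochs=500, lr=0.1, seed=0):
    """Train a one-hidden-layer network (tanh units, linear output) by SGD."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            h = [math.tanh(sum(w * v for w, v in zip(row, xi)) + b)
                 for row, b in zip(W1, b1)]
            err = sum(w * a for w, a in zip(W2, h)) + b2 - yi
            for j in range(hidden):
                grad_h = err * W2[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                W2[j] -= lr * err * h[j]
                for k in range(n_in):
                    W1[j][k] -= lr * grad_h * xi[k]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return sum(w * a for w, a in zip(W2, h)) + b2

    return predict


def mae(predict, X, y):
    """Mean absolute error of the network on a data set."""
    return sum(abs(predict(x) - t) for x, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, seed=0):
    """Normalized increase in MAE when each input column is shuffled."""
    rng = random.Random(seed)
    base = mae(predict, X, y)
    raw = []
    for k in range(len(X[0])):
        column = [x[k] for x in X]
        rng.shuffle(column)
        X_perm = [x[:k] + [c] + x[k + 1:] for x, c in zip(X, column)]
        raw.append(max(mae(predict, X_perm, y) - base, 0.0))
    total = sum(raw) or 1.0
    return [r / total for r in raw]


# Synthetic data: the target depends strongly on input 0, weakly on input 1.
rng = random.Random(1)
X = [[rng.random() for _ in range(4)] for _ in range(100)]
y = [0.9 - 0.5 * x[0] - 0.1 * x[1] for x in X]

# Random partition: 70% training, 30% test.
order = list(range(len(X)))
rng.shuffle(order)
train_idx, test_idx = order[:70], order[70:]
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

predict = train_mlp(X_train, y_train)
test_mae = mae(predict, X_test, y_test)
importance = permutation_importance(predict, X_test, y_test)
print("test MAE:", round(test_mae, 4))
print("normalized importance:", [round(v, 3) for v in importance])
```

Evaluating the importance on the held-out 30% keeps the ranking from simply reflecting how closely the network fits the training cases.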

Figure 4
Comparison between the measured dataset from rigid cylinder model and predicted value using (a) MNLR, (b) GP and (c) ANNs.

Figure 5
Comparison between the measured dataset from plastic synthetic vegetation model and predicted value using (a) MNLR, (b) GP and (c) ANNs.

Figure 6
Comparison between the measured dataset from cylinder array model and predicted value using (a) MNLR, (b) GP and (c) ANNs.

Figure 7
The mean values of each factor index, fitted 10 times, for (a-b) Model 1, (c-d) Model 2 and (e-f) Model 3.