Determination of infiltration model parameters using basic soil physical properties

Quantification of infiltration rate is a time-consuming process because of its variability and the challenges in accurately estimating infiltration model parameters. In this study, predictive equations for the parameters of the Horton, Kostiakov, Modified Kostiakov and Philip infiltration models were developed using basic soil properties. The model parameters were first determined by applying the non-linear Levenberg-Marquardt algorithm (LMA) to field-observed infiltration data, and were subsequently determined from predictive equations developed by regression analysis of the investigated soil properties. Regression analysis was carried out in two ways: by stepwise regression (SR), in which all the measured soil properties were used, and by applying principal component analysis (PCA) prior to multiple linear regression to reduce the number of predictors. The equations developed using stepwise regression and those developed after applying PCA explained 40-78% and 10-50% of the variation, respectively. The performance of the regression equations developed at the two information levels, along with LMA, for prediction of infiltration model parameters was evaluated by computing an overall performance index (OPI), which combines the relative weights of different statistical indicators, namely the Coefficient of Determination (R²), Nash–Sutcliffe Efficiency (ENS), Willmott's Index of Agreement (W), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). The evaluation showed that LMA, with the highest OPI value, is the most suitable technique for determining the parameters of the studied infiltration models. However, for the selected models with parameters determined at the two information levels, there was no significant difference in the OPI values of the computed infiltration rates, suggesting that equations developed after PCA can be used successfully to determine infiltration model parameters.


Infiltration is an important process of the hydrologic cycle, governed by gravity, suction and pressure forces exerted by the soil in absorbing water from the soil surface down its profile. Accurate estimation of infiltration rate is essential, as it is one of the significant factors determining the ability of soil to absorb water and the initiation of runoff in a landscape (Bayabil et al. 2019). Further, quantification of infiltration may help hydrologists, irrigation and agricultural engineers, and soil scientists in the accurate determination of soil moisture status, runoff, and sediment and solute transport, the estimation of artificial groundwater recharge, the design of irrigation and drainage systems, water balance modelling, etc. (Parhi et al. 2007; Ma and Shao 2008). Consequently, soil and water scientists have dedicated a great deal of attention to infiltration studies, resulting in the development of a large number of computational infiltration models. These models can be classified as physically based infiltration models such as those of Philip (1957), Green and Ampt (1911), Smith-Parlange (1978), semi-empirical models such as those of [...]

[...] findings of these studies with respect to regression equations developed using easily measured soil properties differ significantly on account of spatial heterogeneity resulting from anthropogenic activities and geological and pedological processes, suggesting that the results can be applied only at local scales.

Information regarding land use land cover (LULC) and soil type was taken from the available literature (Rasool et al. 2020). Shrubland, farmland and built-up land covered 8.3, 12.6 and 43.95% of the sub-basin, respectively, while the remaining portion of the sub-basin is mainly under water bodies, wetlands and aquatic vegetation.
The study region mainly consisted of clay (40.9%) and sandy clay (55.4%) soils, and only a small portion of the area (3.6%) is under loamy sand soils. Hence, we considered only clay and sandy clay textured soils for our study. A total of 23 locations were selected considering different soil types and land covers. A portable Global Positioning System (GPS) receiver was used to acquire the geographical coordinates of the experimental sites, and the selected sites were plotted using ArcGIS (Fig. 1).

Fig. 1 Study area along with selected experimentation sites
At each of the selected sites, soil samples were collected from an average depth of 25 cm; the collected samples were labelled, stored in plastic bags and taken to the laboratory for analysis. The particle size distribution of disturbed soil samples was estimated using the hydrometer method (Gee and Bauder 1986). For this purpose, the disturbed samples were air-dried, ground, passed through a 2 mm sieve and assayed in the laboratory. The undisturbed samples were used to determine antecedent moisture content by the gravimetric method. Organic carbon, saturated hydraulic conductivity (Ks), and soil water pressure heads at field capacity and permanent wilting point were determined using the Walkley and Black wet oxidation method (Walkley and Black 1934), the falling head method (Dingman 2002), and the pressure plate apparatus, respectively. The bulk density (ρb) was estimated by the core cutter method (Blake and Hartge 1986a). Soil porosity (η) was computed by applying Equation (1):

$$\eta = 1 - \frac{\rho_b}{\rho_s} \qquad (1)$$

where ρs is the particle density, determined using the pycnometer method (Blake and Hartge 1986b).

The geometric mean and geometric standard deviation of soil particle diameter were determined using the expressions of Shirazi and Boersma (1984), given in Equations (2) and (3):

$$d_g = \exp(a), \qquad a = \sum_{i=1}^{n} f_i \ln M_i \qquad (2)$$

$$\sigma_g = \exp\left[\left(\sum_{i=1}^{n} f_i \ln^2 M_i - a^2\right)^{0.5}\right] \qquad (3)$$

where dg is the geometric mean diameter of soil particles; σg is the geometric standard deviation of soil particle diameter; n is the number of soil textural fractions; fi is the proportion of total soil mass with a diameter equal to or less than Mi; and Mi is the arithmetic mean of two successive particle-size limits.

Field experiments were conducted using a double-ring infiltrometer with rings of 30 cm inner diameter and 60 cm outer diameter to measure the infiltration rate. The infiltrometer was carefully driven into the soil to a depth of 15 cm using a falling-weight hammer.
Water was filled into both rings carefully without disturbing the soil surface, and a steady head (water level) was maintained in both rings during the measurements. The rate of fall of the water level in the inner cylinder was measured at different time intervals, and the measurements were continued until the infiltration rate attained a steady value. The infiltration experiments were replicated three times at each of the selected sites to account for measurement variation and to determine the infiltration rate in the study region accurately.
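As a minimal sketch of the soil-property calculations referred to as Equations (1) to (3), the Python snippet below computes porosity from bulk and particle density, and the geometric mean and geometric standard deviation of particle diameter from textural fractions. Several things here are assumptions rather than details from the study: the function names, the representative diameters (arithmetic means of the USDA clay, silt and sand size limits), the use of mass fractions that sum to 1, and the sample values.

```python
import math

# Assumed representative diameters (mm): arithmetic means of the USDA
# size limits for clay (<0.002), silt (0.002-0.05) and sand (0.05-2).
M_MM = {"clay": 0.001, "silt": 0.026, "sand": 1.025}


def porosity(bulk_density, particle_density):
    """Total porosity as a fraction: 1 - rho_b / rho_s."""
    return 1.0 - bulk_density / particle_density


def texture_stats(fractions):
    """Return (d_g in mm, sigma_g) from mass fractions that sum to 1."""
    a = sum(f * math.log(M_MM[k]) for k, f in fractions.items())
    b2 = sum(f * math.log(M_MM[k]) ** 2 for k, f in fractions.items()) - a ** 2
    # Guard against a tiny negative value caused by rounding.
    return math.exp(a), math.exp(math.sqrt(max(b2, 0.0)))


# Hypothetical sample, not taken from the study's data.
eta = porosity(bulk_density=1.325, particle_density=2.65)
d_g, sigma_g = texture_stats({"sand": 0.45, "silt": 0.12, "clay": 0.43})
print(f"porosity = {eta:.2f}, d_g = {d_g:.4f} mm, sigma_g = {sigma_g:.2f}")
```

If the textural fractions are available as percentages instead, they would need to be divided by 100 before being passed to `texture_stats`.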

The movement of soil moisture in the unsaturated soil profile is described by a nonlinear partial differential equation derived by Richards (Richards 1931) [...] are preferred for field application. In this study, four infiltration models, namely Horton (H), Kostiakov (K), Modified Kostiakov (MK) and Philip (P), were selected. These four models were selected based on their practical utility and wide use in various studies (Mishra et al. 2003; Machiwal et al. 2006). All the selected models are based on empirical parameters and reflect the in-situ conditions (Wilson 2017), and thus predict infiltration rates more accurately (Turner 2006). The infiltration models assessed are briefly presented in Table 1. The parameters of infiltration models reflect the effect of the physical properties of soil on the infiltration rate, in addition to initial moisture content and vadose zone conditions (Ogbe et al. 2011). Thus, accurate estimation of model parameters is an important step in minimizing the difference between field-measured and model-predicted infiltration rates. In this study, the parameters of the infiltration models were determined from observed infiltration data using the non-linear Marquardt algorithm of the Statistical Package for Social Sciences, release 20.0 (SPSS 2011). This optimization method has been used extensively for parameter estimation of infiltration equations, as it can address the constraints of other parameter estimation techniques (Deep and Das 2008). The value of the parameter fc for the Horton and Modified Kostiakov infiltration models was determined experimentally and was used as a model parameter with the field data to assess the predictability of the infiltration models.

Table 1 Infiltration equations and fitting parameters of the models evaluated

Horton (1940)
Infiltration-rate equation: $f_p = f_c + (f_o - f_c)\,e^{-K_h t}$
Parameters: fp = the infiltration rate (cm hr$^{-1}$); fc = the final steady-state infiltration capacity; fo = the initial infiltration capacity; Kh = Horton's decay coefficient, specific to the soil characteristics and vegetation cover (T$^{-1}$); t = time from the start of infiltration (hr)

Kostiakov
Parameters: α (α > 0) and β (0 < β < 1) = Kostiakov empirical constants

Modified Kostiakov
Parameters: α′ (α′ > 0) and β′ (0 < β′ < 1) = Modified Kostiakov empirical constants without physical meaning, depending on the soil type, initial moisture content, rainfall rate and vegetative cover

Philip
Parameters: S = the sorptivity (cm hr$^{-1/2}$) and A = a parameter related to saturated hydraulic conductivity; S and A represent the effects of soil suction and gravity head, respectively

Prediction using linear regression

For the derivation of predictive regression equations for the parameters of the selected infiltration models, the observed or estimated parameters (measured or determined using LMA) of the assessed model were used as dependent variables and all the investigated soil properties were used as independent variables. To derive appropriate pedotransfer functions (PTFs) for predicting the infiltration model parameters, the regression models were derived by stepwise regression in SPSS. The stepwise regression (SR) method develops regression models in steps, adding a predictor to the model at each step. To prevent the procedure from entering an infinite loop, variables were added and removed at the 0.05 and 0.10 significance levels, respectively (Dashtaki et al. 2016). The efficiency of the developed equations was assessed by the coefficient of determination (R²). However, R² increases with the addition of a predictor at each step regardless of whether the added variable has increased the power of the regression equation, so the equation with the highest R² may appear to give a perfect fit only because it contains more variables. Therefore, in addition to R², the adjusted coefficient of determination (R²adj), which measures the fit of the multiple regression equations to the sample data, was used. R²adj is computed using Equation (4):

$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - K - 1} \qquad (4)$$

where R² is the coefficient of determination, n is the sample number and K is the number of independent variables in the regression equation.

The value of R²adj increases only if the predictor added at each step improves on the model obtained in the previous step and increases the power of the regression equation; otherwise, the value of R²adj decreases as more variables are added. Thus, in stepwise regression, the predictive equation for an infiltration parameter with the highest R²adj and a feasible value of R² is considered the best performing equation.
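The behaviour of the adjusted coefficient of determination can be illustrated with a short Python sketch. It assumes the standard textbook form of Equation (4), with n samples and K predictors; the function name and the R² values are hypothetical and are not results from the study.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n samples and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical R^2 values for a stepwise model grown to three predictors,
# fitted to 23 sites.
for k, r2 in [(1, 0.52), (2, 0.61), (3, 0.62)]:
    print(k, r2, round(adjusted_r2(r2, n=23, k=k), 3))
```

In this made-up sequence the third predictor raises R² from 0.61 to 0.62 but lowers the adjusted value from about 0.571 to 0.560, which is the situation in which the two-predictor equation would be preferred.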

Prediction using Principal Component Analysis prior to regression analysis
The prediction equations generated by stepwise regression for the infiltration model parameters generally use a large number of independent variables to capture most of the soil physical properties for greater accuracy. However, it has been reported that a large set of correlated variables is more difficult to interpret and apply in further analysis than a small set of uncorrelated variables (Dunteman 1989). To reduce the number of variables identified as important in the equations developed using stepwise regression, factor reduction by Principal Component Analysis (PCA) (Jolliffe 1986) was performed in SPSS. PCA reduces the number of variables by transforming an original set of variables into a considerably smaller set of uncorrelated variables that are linear functions of the original variables (Dunteman 1989). It aims to formulate a smaller set of variables containing most of the information in the original set. There are several strategies for selecting a subset from a large data set using PCA; we adopted a strategy similar to that of Andrews and Carroll (2001) and Rezaei et al. (2006). In this strategy, it was assumed that the infiltration parameters were best represented by the Principal Components (PCs) with eigenvalues > 1. Within each PC, only highly weighted factors, i.e. those with loading values (either positive or negative) within 10% of the highest weight, were retained. Furthermore, to reduce redundancy when more than one highly weighted variable was retained within a PC, correlation analysis was performed among the variables. The Pearson correlation coefficient was used to determine the correlations among the different soil properties.
The variables having a correlation coefficient greater than 0.7 were considered highly correlated (Andrews and Carroll 2001). To choose among the well correlated variables, the absolute values of the correlation coefficients were summed. In general, the variable with the highest correlation sum was assumed to best represent the group. To evaluate the extent to which the reduced data set represents the infiltration parameters, multiple linear regression (MLR) analysis was performed in SPSS using standardized infiltration parameters and standardized PCs as dependent and independent variables, respectively. The [...]

[...] are the values of measured, predicted, mean measured and mean predicted infiltration rates, respectively; j is the index of the jth infiltration measurement in a set of n infiltration readings for a soil; and n is the number of infiltration rate measurements.

The parameter estimation technique with lower values of MAE and RMSE and higher values of R², ENS and W was selected as the best performing, with good agreement between measured and predicted infiltration rates. However, when multiple indicators are used, it sometimes becomes very difficult to assess the overall performance and rank the superiority of one model over the [...]

An overview of the basic statistics (mean, standard deviation, coefficient of variation) of the soil physical properties in the study area is given in Table 2. The mean values of the textural components revealed that sand (44.4%) and clay (42.17%) are the major soil components, followed by silt (11.17%). To elucidate the total variation or heterogeneity of the given variables, the coefficient of variation (CV) was determined. The criterion proposed by Nielsen and Bouma (1985) was used to categorize the parameters into low (CV < 0.1), moderate (CV 0.1-1) and high (CV > 1) variability classes.
From the values of CV (Table 2), it was inferred that, on the whole, the textural fractions of sand, silt and clay have moderate variability, with CV > 0.1.

Table 2 Descriptive statistical analysis of soil properties

The moderate variability might be due to the presence of more than one soil texture in the study area, and also a consequence of pedogenic processes affected by the micro-topographical [...] variability of Ks in the study area is clay mineralogy, tillage practices, pore-size distribution and pore continuity, moisture availability, particle size distribution, and organic carbon content or biotic activity (Sarki et al. 2014 [...])
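The variability classes quoted from Nielsen and Bouma (1985) can be applied with a few lines of Python. This is only a sketch: the function names and the sand contents are hypothetical, the sample (n - 1) standard deviation is assumed because the text does not say which form was used, and a CV of exactly 1 is assigned to the moderate class as an arbitrary boundary choice.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def variability_class(cv):
    """Classes quoted in the text: low (<0.1), moderate (0.1-1), high (>1)."""
    if cv < 0.1:
        return "low"
    if cv <= 1.0:
        return "moderate"
    return "high"


# Hypothetical sand contents (%) at five sites, for illustration only.
sand = [38.0, 52.5, 41.0, 47.5, 43.0]
cv = coefficient_of_variation(sand)
print(f"CV = {cv:.2f} -> {variability_class(cv)}")
```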

Infiltration parameters estimated
The parameters of the different infiltration models were determined using the non-linear Marquardt algorithm by fitting the selected infiltration models to the observed data of the 23 sites. To give an overview of the infiltration parameters in the study area, the values of the parameters computed by fitting the selected infiltration models to the observed infiltration rates are presented in Table 3.

Table 3 Values of infiltration model parameters

From the statistical measure of dispersion (CV) of the model parameters, it was found that parameter A of the Philip model displays the highest variability in comparison with the other parameters. The high variability may be due to the fact that this parameter is related to saturated hydraulic conductivity (Dashtaki et al. 2016), which also has a high variation (CV = 1) in the study area. From the CV values in Table 3, it is observed that the parameters β and β′ have lower variability than α and α′ in the Kostiakov [...]
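The non-linear fitting step can be sketched in Python for the Horton model. The code below is a hand-written, basic Levenberg-Marquardt loop using NumPy, not the SPSS routine used in the study. It assumes that the final infiltration capacity fc is known from the field test and that only fo and Kh are fitted; the function names, starting values, damping schedule and the synthetic, noise-free data are all illustrative assumptions.

```python
import numpy as np


def horton(t, fo, kh, fc):
    """Horton infiltration rate: fc + (fo - fc) * exp(-kh * t)."""
    return fc + (fo - fc) * np.exp(-kh * t)


def fit_horton_lm(t, f_obs, fc, p0, n_iter=100):
    """Estimate (fo, kh) with a basic Levenberg-Marquardt loop; fc is fixed."""
    p = np.array(p0, dtype=float)
    lam = 1e-3                                  # damping factor
    r = f_obs - horton(t, p[0], p[1], fc)       # residuals
    sse = float(r @ r)
    for _ in range(n_iter):
        e = np.exp(-p[1] * t)
        # Jacobian of the model with respect to fo and kh
        J = np.column_stack([e, -(p[0] - fc) * t * e])
        A = J.T @ J
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), J.T @ r)
        p_new = p + step
        r_new = f_obs - horton(t, p_new[0], p_new[1], fc)
        sse_new = float(r_new @ r_new)
        if sse_new < sse:                       # accept the step, reduce damping
            p, r, sse = p_new, r_new, sse_new
            lam /= 10.0
        else:                                   # reject the step, increase damping
            lam *= 10.0
    return p, sse


# Synthetic readings (time in hr, rate in cm/hr) generated from known values.
t = np.array([0.05, 0.1, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0])
f_obs = horton(t, fo=12.0, kh=2.0, fc=1.5)
params, sse = fit_horton_lm(t, f_obs, fc=1.5, p0=(f_obs[0], 1.0))
print("fo, kh =", params, "SSE =", sse)
```

With measured data the residuals would not vanish, and a library optimizer would normally be preferred over this loop; the sketch only shows the accept or reject logic that distinguishes Levenberg-Marquardt from plain Gauss-Newton.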

Predictive regression equations for Infiltration Parameters
For accurate estimation of the infiltration rate, it is important that the parameters of the infiltration models are computed accurately. To achieve this objective, the infiltration model parameters were computed using two different approaches, namely stepwise regression and PCA. The computed parameters were then substituted into the selected infiltration models, and statistical analysis was carried out to assess the performance of the infiltration models with parameters obtained through the different techniques.


Prediction models using Stepwise regression
Predictive equations were developed for the parameters of the Horton, Kostiakov, Modified Kostiakov and Philip models for the given sub-basin (Table 4). The results of stepwise regression show that different soil properties were retained and eliminated for the prediction of different infiltration parameters. The independent variables retained and their relationships with the different infiltration parameters are presented in Table 4.

Broadly, the predictive equations generated using the retained variables were able to explain 40-78% of the variation for the different infiltration parameters (Table 4) [...]

The numbers in the matrix represent the contribution of each variable to the principal component. On the basis of the criterion adopted (PCs with eigenvalues > 1), only the first three PCs were retained (Table 5a). The selected PCs, with proportional variances of 0.405, 0.324 and 0.166 for PC1, PC2 and PC3, respectively, explained more than 89% of the cumulative variation (Table 5a). Under the selected PCs for the parameter A, PC1 has sand, clay, dg and WP, while PC2 has ρb, η and MC as the highly weighted variables; under PC3, only silt content was retained. To reduce redundancy, the interrelations among the selected variables were determined and a correlation matrix was calculated ( [...]

However, to check the acceptability of the regression equations for the prediction of infiltration parameters, R² was computed, and the predictive equations developed by applying PCA prior to regression (Table 6b) have acceptable R² values ranging from 0.33 to 0.60, [...]
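The selection rules described for the PCA step (retain PCs with eigenvalues > 1, keep variables whose loadings are within 10% of the highest loading, then reduce redundancy among variables correlated above 0.7) can be sketched with NumPy. This is a simplified illustration, not the SPSS procedure used in the study: the function names, the synthetic data and the variable labels are hypothetical, and the redundancy rule is applied only when every pair in a group is highly correlated.

```python
import numpy as np


def retained_components(X, names, tol=0.10):
    """Return the correlation matrix, the variance proportions and, for each
    PC with eigenvalue > 1, the variables within `tol` of the top loading."""
    R = np.corrcoef(X, rowvar=False)
    eigval, eigvec = np.linalg.eigh(R)          # ascending eigenvalues
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    groups = []
    for lam, vec in zip(eigval, eigvec.T):
        if lam > 1.0:
            w = np.abs(vec)
            groups.append([n for n, wi in zip(names, w)
                           if wi >= (1.0 - tol) * w.max()])
    return R, eigval / eigval.sum(), groups


def representative(R, names, group, r_high=0.7):
    """If all variables in a group are highly correlated, keep the one with
    the largest sum of absolute correlations; otherwise keep the group."""
    idx = [names.index(g) for g in group]
    sub = np.abs(R[np.ix_(idx, idx)])
    off_diag = sub[~np.eye(len(idx), dtype=bool)]
    if len(idx) > 1 and np.all(off_diag > r_high):
        return [group[int(np.argmax(sub.sum(axis=1)))]]
    return list(group)


# Synthetic data for 23 sites: two hidden factors drive five variables.
rng = np.random.default_rng(0)
base = rng.normal(size=(23, 2))
X = np.column_stack([base[:, 0], -base[:, 0], base[:, 0],
                     base[:, 1], -base[:, 1]]) + 0.3 * rng.normal(size=(23, 5))
names = ["sand", "clay", "dg", "bulk_density", "porosity"]
R, proportion, groups = retained_components(X, names)
print("variance proportions:", np.round(proportion, 3))
for g in groups:
    print(g, "->", representative(R, names, g))
```

The variables returned by `representative` would then serve as predictors in the multiple linear regression for each infiltration parameter.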

Performance assessment of derived equations
To check the validity of the techniques used to develop regression equations for the infiltration model parameters, scatter plots were drawn of the parameter values predicted using the PTF approach against the values estimated from observed data using the non-linear Levenberg-Marquardt algorithm (LMA) (Fig. 2).

From Fig. 2, it is observed that the points for parameters computed using the stepwise regression equations (Table 4) are closer to the parameters determined from the observed data using LMA and lie relatively close to the 1:1 line, whereas the parameters computed using the regression equations developed after factor reduction show more dispersion from the LMA-determined parameters. The same can also be observed from the R² values displayed in the plots. In general, the predictive equations developed using stepwise regression (Table 4) were found to perform better than the regression equations developed after applying PCA (Table 6b).

To check the difference between the parameters estimated by the two techniques, a paired sample t-test was carried out (Table 7). From the paired sample t-test ( [...]

The regression equations developed with a reduced number of independent variables after PCA were thus considered more feasible to apply in real-life problems, particularly when applied to larger areas, basins or sub-basins where heterogeneity exists. To assess the suitability of the regression equations presented in Tables 4 and 6b relative to LMA for estimating the parameters of the selected infiltration models, performance assessment was carried out by comparing field-measured infiltration rates with the predicted ones (determined by substituting the parameters [...] Philip for the soil textures of clay and sandy clay under different land covers are presented in Tables 8, 9, 10 and 11, respectively.
It is clearly observed that for the Horton model ( [...] (Table 10) [...] techniques are of sufficient quality to predict the parameters of the selected models. However, the prediction equations developed using SR have the maximum number of sites with ENS > 0.5, and are considered more feasible than those developed after applying PCA. Thus, in general, it may be concluded that the prediction equations developed using SR and regression after PCA may be used to determine infiltration parameters directly from soil properties rather than from observed infiltration data, saving the time and energy spent on laborious experimentation.

To select the overall best-fit technique of parameter estimation for the selected infiltration models in the study area, the overall performance of the three techniques was compared. The comparison was performed by considering the OPI computed from the statistical indices of all the selected sites irrespective of land cover and soil texture, i.e. taking N = 23 in the computational equation of OPI. The OPI of the parameter estimation techniques for the selected infiltration models is depicted in Fig. 3.

Fig. 3 OPI of parameter estimation techniques for the selected infiltration models

Fig. 3 shows that LMA, with the highest value of OPI, is the most suitable parameter estimation technique for all four infiltration models. Furthermore, among the selected models, LMA is least feasible for the Kostiakov model, since LMA for the Kostiakov model has over- and under-estimated the parameters α and β, respectively, resulting in under-prediction of the infiltration rate throughout the infiltration period and slight over-prediction at the end.
Hence, the OPI value for the Kostiakov model is slightly lower than those of the other infiltration models when LMA is used for parameter estimation. However, deriving parameters using the LM algorithm requires field experimentation, which is time-consuming; thus, the prediction equations developed using soil properties were analysed. Moreover, it is also observed that the OPI of the prediction equations developed using the maximum number of soil properties by SR is, in general, higher than the OPI [...] observed that among the selected models, for equations developed using the PTF approach by applying either SR or PCA, the Kostiakov model has the highest OPI value. The best performance of the Kostiakov model may be attributed to the fact that the variation explained by the derived PTFs, by applying SR and PCA, is 78% (Table 4) and 43% (Table 6b) for the parameter α, and 61% (Table 4) and 45% (Table 6b) for β, which is the highest in comparison with the parameters of the other selected models. However, in general, the equations developed by SR prove more suitable for determining the model parameters and predicting the infiltration rate in the study area. Since there is no significant difference (P > 0.05) in the OPI values of the parameter prediction equations developed for the selected models using SR and regression after PCA, equations generated by both approaches may be used successfully to determine model parameters. The values of the parameters determined in the study will also be useful for hydrologists to compute the infiltration rate precisely by substituting the parameters into the selected infiltration models (H, K, MK and P), and to select the best-fit infiltration model for the given area.
Furthermore, accurate estimation of soil infiltration rate, and thereby runoff rate, will be helpful in developing proper soil management strategies and conservation measures to minimize the risk of erosion and land degradation in the study area. However, this study considered only soil properties for developing predictive equations for infiltration model parameters. Incorporation of other factors such as land use, topography, horizon type, etc. may further enhance the prediction capability of the developed equations.
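The five statistical indicators used in the performance assessment (R², ENS, W, MAE and RMSE) can be computed as in the Python sketch below. The commonly used textbook definitions are assumed here (R² as the squared Pearson correlation, and Willmott's original index of agreement), since the exact expressions are not reproduced in this text; the function name and the infiltration rates are hypothetical. The OPI weighting is not implemented because its formula is not given here.

```python
import math


def performance_indicators(obs, pred):
    """Return R2, ENS, W, MAE and RMSE for paired observed/predicted rates."""
    n = len(obs)
    o_mean = sum(obs) / n
    p_mean = sum(pred) / n
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    s_oo = sum((o - o_mean) ** 2 for o in obs)
    s_pp = sum((p - p_mean) ** 2 for p in pred)
    s_op = sum((o - o_mean) * (p - p_mean) for o, p in zip(obs, pred))
    # Denominator of Willmott's index ("potential error")
    potential = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
                    for o, p in zip(obs, pred))
    return {
        "R2": s_op ** 2 / (s_oo * s_pp),
        "ENS": 1.0 - sq_err / s_oo,
        "W": 1.0 - sq_err / potential,
        "MAE": sum(abs(o - p) for o, p in zip(obs, pred)) / n,
        "RMSE": math.sqrt(sq_err / n),
    }


# Hypothetical measured and predicted infiltration rates (cm/hr).
measured = [10.2, 7.9, 5.4, 3.8, 2.9, 2.1, 1.7]
predicted = [9.6, 8.3, 5.9, 4.1, 2.6, 2.0, 1.8]
for name, value in performance_indicators(measured, predicted).items():
    print(f"{name}: {value:.3f}")
```

A constant bias illustrates why several indicators are needed: if every prediction is exactly 1 cm/hr too high, R² stays at 1 while ENS, W, MAE and RMSE all register the error.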

Conclusion
Estimation of infiltration rate is very challenging because of the variability of infiltration model parameters, which depend on various soil characteristics and land uses. In the current study, an effort was made to determine the parameters of the Horton, Kostiakov, Modified Kostiakov and Philip infiltration models in the urban sub-basin of the lesser Himalayas from easily measured soil properties. Field experiments using a double-ring infiltrometer were conducted to collect infiltration data. The parameters of the selected infiltration models were initially determined by applying the non-linear Levenberg-Marquardt algorithm (LMA) to the field-measured data. Because it is not easy to collect infiltration data in undulating terrain, an attempt was made to develop prediction equations for the infiltration parameters using soil properties. Two sets of prediction equations were developed: one by applying stepwise regression to all the measured soil properties, and the other by reducing the number of predictor variables using PCA prior to regression. Equations developed by subjecting all the investigated soil properties to stepwise regression analysis explained up to 78% of the variability for some infiltration parameters, and the regression equations developed after applying PCA explained up to 50% of the variation. Further, the equations developed by the two approaches have acceptable R² values and do not differ significantly, implying that the regression equations may be used efficiently in the prediction of infiltration parameters. Comparison of the measured and estimated infiltration rates revealed that the non-linear LMA performed better than the equations developed by the other two approaches, with OPI values ranging from 0.67 to 0.95, 0.49 to 0.58, and 0.40 to 0.59 for the LMA, SR and PCA methods, respectively.
It should also be noted that the OPI values of the studied infiltration models with parameters estimated at the two information levels were not significantly different; hence, equations developed by both approaches may be used with almost equal accuracy. Since the regression equations developed after PCA have fewer predictor variables, they may be more useful for determining the infiltration parameters when data availability is limited. In general, such predictive equations will be useful for estimating the infiltration rate in the hilly regions of the Himalayas, where it is otherwise difficult to measure infiltration rate precisely because of undulating terrain. Further, substituting the parameters into the infiltration model identified as the most feasible for the study area helps in precise calculation of the infiltration rate, which may help hydrologists in studying various hydrological processes, particularly the rate of runoff under different land uses. Accordingly, soil management strategies and conservation measures may be suggested to minimize the risk of erosion and land degradation in the study area. However, incorporation of other factors such as land use, topography, horizon type, etc. may further enhance the prediction capability of the developed equations.