A Mathematical Model for Soil-Water Characteristic Curve by Weibull Distribution

The soil-water characteristic curve (SWCC) plays a crucial role in unsaturated soil behavior. However, none of the existing models is fully applicable to all soil classes, so additional models are needed to best fit measured SWCC data. In this paper, a mathematical model for the soil-water characteristic curve (the Weibull model) is proposed based on the two-parameter Weibull distribution. It contains only two parameters, a and n, whose effects on the SWCC are independent. The Brutsaert, van Genuchten (VG), Boltzmann and Weibull models were fitted to 24 SWCC data sets from UNSODA 2.0, and the quality of fit of these models was compared. The results showed that the Weibull model fitted data from a variety of soil classes accurately, with an R² of 0.999 and an RMSE of 0.010. Taking into account the adjusted R², RMSE and ∑Ri criteria, the exponential-based Weibull model had a higher fitting accuracy and performed marginally better than the Brutsaert and VG models. With respect to the AICc criterion, the Weibull and Brutsaert models performed almost equally well, and both performed better than the VG model. The VG model required the largest average number of iterations and was therefore relatively difficult to fit. The Boltzmann model had a lower fitting accuracy and less flexibility than the other models. Consequently, the Weibull model can be used as an alternative soil-water characteristic curve model.


Introduction
The soil-water characteristic curve is defined as the relationship between soil suction and the amount of water (volumetric or gravimetric water content or degree of saturation) contained in the pores.
Different aspects of unsaturated soil behavior, such as shear strength, volume change, chemical diffusivity, chemical adsorption, water volume storage, specific heat and thermal conductivity, are related to the SWCC (Sillers and Fredlund 2001) and are commonly estimated from the soil-water characteristic curve and saturated soil properties (van Genuchten 1980, Vanapalli et al. 1996). For example, it has become an accepted procedure to derive the unsaturated permeability function by combining the SWCC with a flow equation (Rahimi et al. 2015). Therefore, knowledge of the SWCC is crucial in the theoretical analysis and engineering application of unsaturated soil.
Currently, various methods are available for measuring the SWCC of a given soil. One way is to conduct laboratory tests, including direct methods using pressure plates, Buchner funnels, tensiometers and pressure membranes, and indirect methods using filter paper, porous blocks and heat dissipation sensors. Almost all of these methods are based on the equilibrium of water content in the soil. However, reliable measurement of soil suction is challenging and cumbersome, since it is time-consuming, costly and sometimes inaccurate.
Among these approaches, category 1 is usually an empirical model for making mathematical representations or predictions of the SWCC. To date, the more commonly used soil-water characteristic curve models are those by Gardner (1956) (Gardner 1958), Brooks and Corey (1964), Brutsaert (1966) (Brutsaert 1967), van Genuchten (1980), Williams (1983) (Williams et al. 1983), McKee and Bumb (1984; a Boltzmann exponential form), McKee and Bumb (1987; Fermi form), and Fredlund and Xing (1994). Most of these models were developed by agricultural researchers, since the soil-water relationship governs the water supply for plant growth. These models are also of considerable significance in geotechnical engineering. Previous studies showed that empirical models can represent the variation of water content with soil suction; however, none of the models is fully applicable to all soil classes. It is therefore necessary to develop additional models to best fit measured SWCC data.
The objective of this study is to propose a mathematical model that can be used to fit measured SWCC data. The development of the SWCC model based on the Weibull function is presented in detail.
A database of 24 undisturbed soil samples covering all soil textural classes, selected from UNSODA 2.0, was fitted with the Weibull model. The Weibull SWCC model was then compared with previously proposed models in terms of fitting accuracy. The comparison indicates that the Weibull SWCC model performed better than the other models studied in terms of fitting accuracy.

Proposal for Weibull model
The Weibull distribution, a continuous distribution in probability theory and statistics, is named after the Swedish mathematician Waloddi Weibull, who described it in 1951. It is considered one of the most fundamental lifetime distributions and has been used successfully in many fields, such as survival analysis, reliability engineering and failure analysis, industrial engineering, extreme value theory and weather forecasting (e.g., Lu et al. 2002). Its cumulative distribution function is given by Eq. (2), where a and n are the scale and shape parameters of the distribution, respectively.
In Fig. 1, the black curve shows the cumulative distribution function (CDF), and the blue curve is the derivative of the black one. It can be seen that the CDF in Fig. 1 is an S-shaped curve.
It is well established that the SWCC is a reversed S-shaped curve on semi-logarithmic axes, so the CDF of the Weibull distribution can be used to fit SWCC data. However, the CDF is zero when x equals zero, i.e., F(x)|x=0 = 0, whereas the water content, θ, is equal to θs at a suction of zero kPa, i.e., θ(ψ)|ψ=0 = θs. Therefore, the CDF of the Weibull distribution needs to be modified to show the characteristics of the SWCC. With this in view, the modified CDF and its derivative, which describe the relationship between water content and suction and correspond to Eq. (2) and Eq. (1), are given as Eq. (3) and Eq. (4), respectively. Eq. (3) can be written in terms of the normalized volumetric water content, Θ, defined as the amount of water in the soil between the residual and saturated volumetric water contents, which gives the Weibull model, Eq. (5):
$$\Theta = \frac{\theta(\psi) - \theta_r}{\theta_s - \theta_r} = \exp\left[-(a\psi)^n\right] \qquad (5)$$
where Θ is the normalized volumetric water content; θ(ψ) is the volumetric water content at suction ψ; θs is the volumetric water content at the saturated state; θr is the volumetric water content at the residual state; and a and n are fitting parameters.
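As a minimal sketch of how the model can be evaluated, the function below assumes the Weibull model takes the form Θ = exp[-(aψ)^n], which is consistent with the properties stated in the text (Θ = 1 at zero suction and Θ = 1/e at ψ = 1/a). The function names and the parameter values are hypothetical and are not taken from the paper's tables.

```python
import math

def weibull_theta_norm(psi, a, n):
    """Normalized water content, assumed form Theta = exp(-(a*psi)**n)."""
    if psi <= 0:
        return 1.0  # fully saturated at zero suction
    return math.exp(-((a * psi) ** n))

def weibull_theta(psi, a, n, theta_r, theta_s):
    """Volumetric water content: theta_r + (theta_s - theta_r) * Theta."""
    return theta_r + (theta_s - theta_r) * weibull_theta_norm(psi, a, n)

# Illustrative (hypothetical) parameters: a in 1/kPa, n dimensionless.
a, n, theta_r, theta_s = 0.01, 0.5, 0.05, 0.45
for psi in (0.0, 1.0, 100.0, 1e4, 1e6):
    print(psi, round(weibull_theta(psi, a, n, theta_r, theta_s), 4))
```

The water content starts at θs for zero suction and decays towards θr as the suction grows, which is the reversed S-shape on a logarithmic suction axis.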

The effects of two parameters
Fig. 2(a) shows a plot of the Weibull model (Eq. (5)) with the n parameter held constant (n = 0.5) and a varying. The a parameter has the unit of reciprocal suction (kPa⁻¹) and is equal to the inverse of the soil suction at which the normalized volumetric water content, Θ, equals 1/e (approximately 0.3679). The a parameter does not affect the shape of the curve but shifts the curve towards the higher suction region as a decreases. This indicates that the inverse of parameter a is related to the air-entry value.
Fig. 2(b) shows a plot of the Weibull model with the a parameter held constant (a = 0.01 kPa⁻¹) and n varying. All the curves pass through the same point (1/a, 1/e). As can be seen from the curves, the larger the value of n, the steeper the curve in the transition zone. The n parameter is therefore related to the pore size distribution index: the more uniform the pore size distribution in the soil, the greater the value of n.
From the above analysis, the primary merits of the Weibull model can be summarized as follows: the two fitting parameters are physically meaningful; the effect of one parameter can be distinguished from that of the other; and the form of the model is relatively simple, containing only two parameters.
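The two parameter effects described above can be checked numerically. The sketch below again assumes the form Θ = exp[-(aψ)^n]; the helper `suction_at`, which inverts the model, is a hypothetical name introduced only for this illustration.

```python
import math

def theta_norm(psi, a, n):
    """Assumed Weibull model: Theta = exp(-(a*psi)**n)."""
    return math.exp(-((a * psi) ** n))

def suction_at(theta_target, a, n):
    """Invert the model: psi = (-ln Theta)**(1/n) / a."""
    return (-math.log(theta_target)) ** (1.0 / n) / a

# Effect of n at fixed a: every curve passes through (1/a, 1/e).
a = 0.01
for n in (0.5, 1.0, 2.0):
    print("n =", n, "Theta(1/a) =", theta_norm(1 / a, a, n))

# Effect of a at fixed n: the suction at a given Theta scales with 1/a,
# i.e. the curve shifts along the log-suction axis without changing shape.
n = 0.5
for a in (0.1, 0.01, 0.001):
    print("a =", a, "psi at Theta = 0.5:", suction_at(0.5, a, n))
```

Because Θ depends on a only through the product aψ, a tenfold decrease in a moves every point of the curve one decade to the right on a logarithmic suction axis.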

Data
Twenty-four soil samples with soil-water characteristic data were selected from the unsaturated soil hydraulic database (UNSODA 2.0) (Nemes et al. 2001).

Fitting results
Twenty-four sets of experimental θ-ψ data points were fitted with the Weibull model to determine the values of the model parameters (a, n, θr, θs) for each soil. These values were then used to calculate the volumetric water contents from the Weibull model at the given suction values for each soil.
The predicted volumetric water contents were compared against the measured ones at identical suction levels to investigate the prediction accuracy of the Weibull model.

Model selection
The soil-water characteristic models compared with the Weibull model in this study are parametric models based on a pore size function and capillary theory. The equations representing each model are given in Table 2. Each equation is written in the normalized water content form so that it can be applied between the saturated and residual water contents.
The Brutsaert model was chosen because it performed better than the other two-parameter models among nine SWCC models, including both two-parameter and three-parameter models, in the study by Sillers (Sillers et al. 2001). The van Genuchten model was included because its wide flexibility allows it to fit data from a great diversity of soil classes. The Boltzmann model was included because it is similar to the Weibull model, having only two parameters and an exponential form, although it is less commonly used.

Fitting procedure
The optimum fitting parameters for the water characteristic curve data sets of all 24 soil samples were obtained by an optimization technique. The parameters were determined by nonlinear curve fitting of custom functions in the Origin software, so that the model functions were as close as possible to the experimental data points without necessarily passing through any of them. This is an iterative method that starts from initial values of the parameters.
To compare fairly the number of iterations needed to converge to the best-fit parameters (that is, the degree of difficulty of fitting) for each model, the initial value of every parameter was set to 1. Because the allowed range of the parameters has a great influence on the fitting results, the upper and lower bounds of θs and θr were set to 1 and 0, and the other parameters were constrained to be greater than 0, except for the parameter a of the Boltzmann model, which could sometimes be less than zero.
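The fitting in this study was done with Origin's iterative nonlinear routine, which is not reproduced here. As a rough, non-equivalent illustration of how a and n can be estimated, the sketch below assumes the model form Θ = exp[-(aψ)^n] and treats θr and θs as known, so that the model linearizes to ln(-ln Θ) = n ln a + n ln ψ and ordinary least squares applies. The function name and the synthetic data are hypothetical; estimates of this kind are better viewed as starting values for a full nonlinear fit, since the transformation reweights the errors.

```python
import math

def fit_weibull_linearized(psi, theta, theta_r, theta_s):
    """Estimate (a, n) from ln(-ln Theta) = n*ln(a) + n*ln(psi) by least squares."""
    xs, ys = [], []
    for p, t in zip(psi, theta):
        big_theta = (t - theta_r) / (theta_s - theta_r)
        # The double-log transform is only defined for psi > 0 and 0 < Theta < 1.
        if p > 0 and 0.0 < big_theta < 1.0:
            xs.append(math.log(p))
            ys.append(math.log(-math.log(big_theta)))
    m = len(xs)
    x_mean, y_mean = sum(xs) / m, sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope of the regression line
    intercept = y_mean - n * x_mean    # equals n * ln(a)
    a = math.exp(intercept / n)
    return a, n

# Synthetic, noise-free data generated from known (hypothetical) parameters.
true_a, true_n, theta_r, theta_s = 0.01, 0.5, 0.05, 0.45
psi = [1.0, 10.0, 100.0, 1000.0, 10000.0]
theta = [theta_r + (theta_s - theta_r) * math.exp(-((true_a * p) ** true_n))
         for p in psi]

a_hat, n_hat = fit_weibull_linearized(psi, theta, theta_r, theta_s)
print(a_hat, n_hat)
```

With noise-free data the generating parameters are recovered; with measured data the result depends on how well θr and θs are known.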

Evaluation criteria for models performance
Two kinds of measures were used to compare the fitting accuracy of the SWCC models in this study.
The first comprised the root-mean-square error (RMSE) and the adjusted coefficient of determination (adjusted R²), which were used as relative measures of the goodness of fit of the SWCC models to the measured data. The RMSE is an indicator of the total error of the model function: the closer its value is to zero, the better the fit. The coefficient of determination, R², is a general index of fitting quality. Mathematically, R² (computed by Eq. (7)) tends to rise as the number of parameters increases. Because the models compared in this paper have different numbers of parameters, the adjusted R² is used to offset this rise. A value of adjusted R² close to 1 indicates a good fit. The RMSE and adjusted R² are computed using Eq. (6) and Eq. (8), where θm and θf denote the measured and fitted water contents, respectively; n is the number of soil-water characteristic data points for each soil sample; and k is the number of parameters.
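The three statistics can be sketched as follows. The definitions below are common textbook forms and are an assumption: the authors' Eqs. (6) to (8) may differ in detail, for example in the degrees of freedom used. The sample values are hypothetical.

```python
import math

def rmse(measured, fitted):
    """Root-mean-square error, assumed form sqrt(sum((m - f)^2) / n)."""
    n = len(measured)
    return math.sqrt(sum((m - f) ** 2 for m, f in zip(measured, fitted)) / n)

def r_squared(measured, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - f) ** 2 for m, f in zip(measured, fitted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(measured, fitted, k):
    """Adjusted R^2, assumed form 1 - (1 - R^2) * (n - 1) / (n - k)."""
    n = len(measured)
    return 1.0 - (1.0 - r_squared(measured, fitted)) * (n - 1) / (n - k)

# Hypothetical measured and fitted volumetric water contents for one sample.
theta_m = [0.45, 0.42, 0.35, 0.22, 0.12, 0.07]
theta_f = [0.44, 0.43, 0.34, 0.23, 0.11, 0.07]

print("RMSE        :", rmse(theta_m, theta_f))
print("R^2         :", r_squared(theta_m, theta_f))
print("adjusted R^2:", adjusted_r_squared(theta_m, theta_f, k=4))
```

For the same residuals, a larger k lowers the adjusted R², which is what allows models with different numbers of parameters to be compared.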
The second was the corrected Akaike Information Criterion (AICc), which imposes penalties for additional fitting parameters. It is based on information theory, but a heuristic way to think about it is as a criterion that seeks a model with a good fit to the data and the fewest parameters. For small sample sizes, AICc is defined as
$$\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}$$
where n and k are defined as above and AIC is short for the Akaike Information Criterion.
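A minimal sketch of the criterion is given below. It uses the standard small-sample correction 2k(k+1)/(n-k-1) and assumes the common least-squares expression AIC = n ln(RSS/n) + 2k, where RSS is the residual sum of squares; the paper's own AIC expression may count parameters differently. The residual value and sample size are hypothetical.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit, assumed form n*ln(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k

def aicc(rss, n, k):
    """Corrected AIC: AIC plus the small-sample term 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic_least_squares(rss, n, k) + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical case: same residual sum of squares, 14 data points,
# fitted once with 4 parameters and once with 5 parameters.
rss, n_points = 1.4e-3, 14
aicc_4 = aicc(rss, n_points, 4)
aicc_5 = aicc(rss, n_points, 5)
print(aicc_4, aicc_5)
```

When two models fit equally well, the one with the extra parameter receives the higher (worse) AICc, and the penalty is noticeable at sample sizes of around a dozen points.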
As n increases, AICc converges to AIC, so AICc can be applied to any sample size (Burnham and Anderson 2004). The lower the AICc value, the better the fit of the model to the data, so the preferred model is the one with the lowest AICc value.

Comparing the accuracy of the SWCC models
It can be observed in Fig. 4(b) that the fitting curves of the van Genuchten and Weibull models coincide completely. In Figs. 4(c) and (d), the Brutsaert, van Genuchten and Weibull models matched all the measured data accurately, with little deviation between them. Nevertheless, the fitting curves in the extrapolated regions differ between the two figures. That is, in Fig. 4(c) the fitted curves stay close and parallel to each other in the high-suction region, with the suction extending to infinity at constant water content, whereas in Fig. 4(d) the extrapolated curves differ significantly over the same range. Of the best-fit curves for the 24 soil samples, most are of the type shown in Fig. 4(c). The Boltzmann model typically has problems matching data at the junctions of the three zones (saturated, transition and residual) in almost all the soil samples, which reduces the flexibility of the curve.
Overall, the best-fit curves for the 24 soil samples indicate that all models performed very well except for the Boltzmann model, which showed low accuracy and inflexibility.
As the best-fit curves show, the differences between the Brutsaert, van Genuchten and Weibull models in fitting the observed data are marginal across the 24 soil samples.
The fitted parameters and the indicators of fitting quality for the various soil textural classes are summarized in Table 3. A higher R² value and a smaller RMSE value represent a higher prediction accuracy for the developed model (Gu et al. 2016b). For each soil sample, the four models were ranked from 1 to 4 according to the RMSE value. The model with the smallest RMSE comes first (i.e., Ri = 1), meaning the best fit. The ranking results are shown in Table 3. The ranks of each model were added up to compare the fitting performance of the different models.
The overall ranking (i.e., ∑Ri) of each model is shown in Table 4. The smaller the ∑Ri, the better the fit. On this basis, the four models rank in the order Weibull, Brutsaert, VG, Boltzmann.
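The rank-sum procedure described above can be sketched as follows. The RMSE values are hypothetical and cover only three samples; they are not the values in Table 3, and the way ties are broken (by input order) is an assumption.

```python
def rank_models(rmse_by_model):
    """Rank models 1..N by ascending RMSE for one soil sample."""
    ordered = sorted(rmse_by_model, key=rmse_by_model.get)
    return {model: rank for rank, model in enumerate(ordered, start=1)}

def sum_of_ranks(samples):
    """Add up each model's rank Ri over all soil samples."""
    totals = {}
    for rmse_by_model in samples:
        for model, rank in rank_models(rmse_by_model).items():
            totals[model] = totals.get(model, 0) + rank
    return totals

# Hypothetical RMSE values for three soil samples.
samples = [
    {"Brutsaert": 0.011, "VG": 0.012, "Boltzmann": 0.030, "Weibull": 0.010},
    {"Brutsaert": 0.008, "VG": 0.009, "Boltzmann": 0.025, "Weibull": 0.010},
    {"Brutsaert": 0.015, "VG": 0.014, "Boltzmann": 0.040, "Weibull": 0.012},
]

totals = sum_of_ranks(samples)
for model in sorted(totals, key=totals.get):
    print(model, totals[model])
```

A model does not need to be the best on every sample to obtain the smallest rank sum; it only needs to rank well consistently.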
Compared with the Brutsaert and VG models in terms of the RMSE and adjusted R² criteria, the Weibull model has the largest average and the smallest standard deviation, although the differences between the three are small. The Boltzmann model differs significantly from the other three models, with the lowest average adjusted R² and the highest RMSE. Thus, the performance of the Weibull model is slightly better than that of the Brutsaert and VG models according to the RMSE and adjusted R² criteria as well as ∑Ri.
Whichever evaluation criterion is adopted, the Boltzmann model shows the worst performance.
From the results presented in

Conclusions
The soil-water characteristic curve (SWCC) is one of the most important components for describing unsaturated soil behavior. A better representation of unsaturated soil behavior can be gained by describing the SWCC with mathematical models, and numerous models have been proposed to characterize the SWCC. In this paper, the modified cumulative distribution function of the Weibull distribution was used to describe the relationship between water content and soil suction. The conclusions of this paper can be summarized as follows:
2. The Weibull model, with an R² of 0.999 and an RMSE of 0.010, can accurately estimate the volumetric water content of soil at any given matric suction for the selected data sets.
3. Taking into account the adjusted R², RMSE and ∑Ri criteria, the exponential-based Weibull model had a higher fitting accuracy and performed marginally better than the Brutsaert and VG models. With respect to the AICc criterion, the Weibull and Brutsaert models performed almost equally well, and both performed better than the VG model. The VG model required the largest average number of iterations and was therefore relatively difficult to fit. The Boltzmann model had a lower fitting accuracy and less flexibility than the other models. Consequently, the Weibull model can be used as an alternative soil-water characteristic curve model.
Lu et al. (2002) fitted the fracture strength data of brittle materials to the Weibull distribution; Martínez-Antúnez et al. (2015) used the Weibull function to define the bioclimatic niche of some trees. The probability density function (PDF) and the cumulative distribution function (CDF) of the basic two-parameter Weibull distribution are given by Eq. (1) and Eq. (2), respectively.

Fig. 3
Fig. 3 shows the plot of measured versus predicted volumetric water content of 330 SWCC

Figures 4
Figures 4(a) to (d) show the best-fit curves to the experimental data from samples S2, S5, S10 and S24.
1. A mathematical model for the SWCC based on the Weibull distribution, called the Weibull model, was proposed. It contains only two parameters, a and n, whose effects on the SWCC are independent. A total of 330 SWCC experimental data points from UNSODA 2.0 covering all textural classes (two soil samples per soil class) were used to test the performance of the Weibull model and to compare it with three other widely used models, namely the Brutsaert, van Genuchten and Boltzmann models.
The selected samples were used to demonstrate the performance of the Weibull model and to make comparisons with three widely used models: the Brutsaert, van Genuchten and Boltzmann models. There were several reasons for choosing the dataset from UNSODA in this study. The selected dataset covers all soil textural classes, with each soil class represented by two samples with experimental data (see Table 1). In addition to soil-water characteristic data, other information, such as particle size distribution, mineralogy, hydraulic conductivity and water diffusivity, can also be found in UNSODA. Last but not least, other researchers can repeat the work based on the UNSODA codes and test the results. Therefore, it is a very suitable dataset for evaluating the performance of the SWCC model. The soil-water characteristic curves in this paper are presented in terms of volumetric water content, θ, plotted on an arithmetic scale, and soil suction, ψ, plotted on a logarithmic scale.
According to Table 4, the Brutsaert, van Genuchten, Boltzmann and Weibull models had average corrected Akaike Information Criterion values of -131, -122, -115 and -130, respectively. Fig. 5 is a plot of the average AICc versus the model type. The lowest AICc indicates the best fit to the given data sets. Although the Brutsaert model had the lowest AICc, -131, the AICc of the Weibull model was only slightly higher. In other words, there was little difference in AICc between the Brutsaert and Weibull models. As mentioned above, the Boltzmann model provided the poorest fit to the soil-water characteristic data, with the largest average AICc. The van Genuchten model did not perform well in terms of the AICc criterion, which may be due to the penalty imposed by AICc for its extra parameter compared with the Brutsaert and Weibull models, because there is no significant difference between these models in terms of the RMSE and adjusted R² criteria.
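The average AICc values quoted above can be easier to compare as differences from the lowest value. The short sketch below uses only the four averages reported in the text; computing the differences is an illustrative step added here, not a calculation reported by the authors.

```python
# Average AICc values reported in the text for the four models.
aicc = {"Brutsaert": -131, "van Genuchten": -122, "Boltzmann": -115, "Weibull": -130}

best = min(aicc.values())
delta = {model: value - best for model, value in aicc.items()}

# List the models from lowest (best) to highest AICc.
for model in sorted(delta, key=delta.get):
    print(f"{model:14s} AICc = {aicc[model]:5d}  delta = {delta[model]:2d}")
```

The Weibull model sits one unit above the Brutsaert model, while the van Genuchten and Boltzmann models are further away.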

Table 4
Table 4 also shows the average number of iterations required for the parameters to converge for every model. When the initial values of the parameters are the same, the number of iterations represents the ease of fitting. The average number of iterations for the Boltzmann model is the smallest, even though it has the largest AICc value, meaning that best-fit parameters are easy to find for the Boltzmann model. By contrast, the van Genuchten model is the most difficult to fit, since it requires a relatively large number of iterations. The difference in the average number of iterations between the Brutsaert and Weibull models is very small. Sillers and Fredlund (2001) conducted a statistical assessment of nine models and reported that the model with the lowest Akaike Information Criterion tended to require the least effort to find best-fit parameters and that the exponential-based models were the most difficult to fit. Their result differs from the result of this study. More specifically, the Boltzmann model, an exponential-based model, has the smallest average number of iterations together with the largest average AICc value. The initial values of all parameters in this study were set to 1, whereas the initial values were not necessarily equal in the study of Sillers and Fredlund (2001) (at least this is not mentioned in the paper). Therefore, different initial values of the parameters may be a reason for this disagreement.