A Comparative Study on Probability Analysis of Annual Maximum Daily Streamflow Using Conventional and L-Moments


The study aims at the probabilistic analysis of annual maximum daily streamflows at the gauging sites of the Godavari upper, Godavari middle, Pranahitha, Indravathi and Godavari lower sub-basins. Daily streamflow data at Chass, Ashwi and Pachegaon of Godavari upper; Manjalegaon, Dhalegaon, Zari, GR Bridge, Purna and Yelli of Godavari middle; Gandlapet, Mancherial, Somanpally and Perur of Pranahitha; Pathagudem, Chindnar, Sonarpal, Jagdalpur and Nowrangpur of Indravathi; and Sardaput, Injaram, Konta, Koida and Polavaram of Godavari lower sub-basins, for periods varying between 1965 and 2011 and collected from the Central Water Commission (CWC), India, were used in the analysis. Statistics of the annual maximum daily streamflow series at the gauging sites indicated moderately varied and positively skewed streamflows, with sharp peaks at the upstream gauging sites. Probabilistic analysis showed that the lognormal or gamma distribution with conventional moments fitted the maximum daily streamflow data at the gauging sites of the Godavari sub-basins. Among 2-parameter distributions with L-moments, GPA2 followed by GAM2/LN2 fitted the annual maximum daily streamflow data at most of the gauging sites. At the downstream-most gauging sites of the Pranahitha, Indravathi and Godavari lower sub-basins, the data followed the W2 probability distribution. Among 3-parameter distributions with L-moments, GPA3 at seven gauging sites, W3 and P3 at five gauging sites each, GLOG at four gauging sites and GEV at two gauging sites fitted the data. Based on the performance evaluation, 2-parameter distributions using L-moments at the upstream, 3-parameter distributions at the middle and probability distributions using conventional moments at the downstream gauging sites performed better in the Godavari upper and middle sub-basins. Probability distributions based on conventional moments or 3-parameter distributions using L-moments fitted the annual maximum daily streamflow data satisfactorily at the gauging sites in the Pranahitha, Indravathi and Godavari lower sub-basins.


Introduction
Streamflow represents an integrated response to catchment heterogeneity and the spatial variability of key hydrological processes such as precipitation, infiltration and evapotranspiration, and provides an insight into long-term hydroclimatic changes. Maximum streamflow (flood) analysis plays an important role in the hydrologic and economic evaluation of water resources projects. Floods occur frequently, and prediction of maximum streamflow is required for the design of hydraulic structures and for flood management studies.
Despite considerable spatial and temporal variations of streamflow, it is possible to predict streamflow fairly accurately using theoretical probability distributions. Probabilistic analysis of streamflow provides a means to capture its statistical structure and suggests appropriate distributions. An understanding of the probabilistic behaviour of streamflow is therefore necessary for effective utilization and management of water resources for different purposes. Selection of an appropriate probability distribution, however, depends on how well the distribution fits the historical data. The applicability of these distributions also lies in the fact that frequency plots based on the fitted distributions may be extrapolated to predict extreme events at larger recurrence intervals.
Further, conventional moments of a distribution do not impart easily interpreted information about the shape of the distribution, and parameter estimates of a distribution fitted by the moments are often less accurate than those obtained by other methods. L-moments, which are linear combinations of order statistics for probability distributions and data samples, on the other hand characterize a wide range of distributions and are more robust to the presence of outliers in the data, as they do not involve higher powers of the observations. They are also subject to less bias in the estimation of probability distribution parameters and approximate their asymptotic normal distribution more closely in finite samples. Kroll and Vogel (2002) recommended Pearson Type III and 3-parameter Lognormal distributions for describing low streamflow at 1505 gauged river sites in the United States for intermittent and perennial rivers respectively. Durrans et al. (2003) presented frequency analysis of streamflow in the U.S. Tennessee Valley using the Log-Pearson Type III distribution (LP3). Kumar et al. (2003) carried out regional frequency analysis based on L-moments and concluded that the Generalized Extreme Value (GEV) distribution was the robust distribution at the sites in the middle Ganges plains of India. Yue and Wang (2004) applied the method of L-moments to identify the probability distribution of annual streamflow in different climatic regions of Canada and recommended different probability distributions for different regions. Atiem and Harmancioglu (2006) derived hydrologically homogeneous regions and identified the regional statistical distributions for gauging sites on the Nile River tributaries, and showed that the hydrologically homogeneous region followed the Generalized Logistic (GLO) distribution. Gamage (2006) evaluated the goodness-of-fit of alternative probability distributions to sequences of annual minimum, average and maximum streamflows in Sri Lanka through L-moment ratio diagrams. 
The study revealed that annual maximum streamflow was best approximated by the GEV distribution. Memon (2007) compared the fits of the P3 and LP3 distributions using annual peak flood data of the Kotri site on the Indus river and concluded that P3 is the robust distribution for estimation of flood quantiles. Abida and Ellouze (2008) identified regional flood frequency distributions for the sites in different flood zones of Tunisia. Flood data in Northern Tunisia was observed to follow the Generalized Normal distribution, while the Generalized Normal and GEV distributions were the best-fit distributions at the sites in central and southern Tunisia respectively. Haddad and Rahman (2008) compared a number of distributions for catchments in southeast Australia and found that the GEV distribution was the best distribution for the selected catchments. Bettil Saf (2009) derived regional flood frequency estimates for the gauged sites in the West Mediterranean River Basins in Turkey and identified the P3 distribution for the Antalya and Lower-West Mediterranean sub-regions, and the GLO distribution for the Upper-West Mediterranean sub-region, as the best-fit distributions. Gubareva (2009) compared the P3, lognormal, GEV and Generalized Pareto (GPA) distributions in the estimation of maximum flows in the rivers of Austria and Siberia and concluded that P3 was the best-fit distribution. Haddad and Rahman (2010) examined seven probability distributions and found that two-parameter distributions were preferable to three-parameter distributions for Tasmania in Australia, with the lognormal appearing to be the best selection. Hussain et al. (2011) carried out flood frequency analysis at seven stations located on the main stream of the Indus river, Pakistan and found P3 and GLO to be the robust distributions. Rahman et al. (2013) found that a single distribution could not be specified as the best-fit distribution for flood flows in Australian states and identified LP3, GEV and GPA as the top three best-fit distributions. Mamman et al. (2017) fitted various probability distribution models to river flows of the Kainji reservoir in New Bussa, Niger State, Nigeria and recommended the Gumbel probability model. Drissia et al. (2019) carried out flood frequency analysis at the regional level over Kerala, India and found that GPA and GLO were the best-fit distributions for the stations in the study area.
Most of the earlier studies recommended probability distributions for maximum streamflow at the gauging sites in different basins based on either conventional moments or L-moments. In the present investigation, probability analysis of maximum daily streamflow at the gauging sites of the Godavari sub-basins has been carried out by comparing the performance of probability distributions fitted with conventional moments and with L-moments, in order to identify appropriate probability distributions.
Daily streamflow data at the gauging sites of the Godavari sub-basins, for periods varying between 1965 and 2011, were collected from the Central Water Commission (CWC), India and used in the analysis. The location map of the gauging sites is shown in Fig. 1 and a brief description of the sites is presented in Table 1. Statistical parameters such as the mean, standard deviation and coefficients of variation, skewness and kurtosis were used to describe the variability of streamflows. The present study adopts the methods of conventional moments and L-moments to select suitable probability distributions for annual maximum daily streamflow at the gauging sites of the Godavari sub-basins.

Fig.1 Location map of the gauging sites
Normal, Lognormal, Exponential, Extreme value and Gamma distributions with conventional moments were employed for fitting the streamflow series. Statistical tests such as Kolmogorov-Smirnov (K-S), Anderson-Darling (A-D) and Chi-square (χ²) were used to assess the reasonableness of the selected distribution. Commonly adopted 2-parameter distributions such as Generalized Pareto (GPA2), Log-normal (LN2), Gamma (GAM2) and Weibull (W2), and 3-parameter distributions such as Generalized Extreme Value (GEV), Generalized Pareto (GPA3), Generalized Logistic (GLOG), Log-Normal (LN3), Pearson (P3) and Weibull (W3), using L-moments were selected to fit the streamflow series. A performance measure in terms of the deviation between observed and computed L-moment ratios (L-cv in the case of 2-parameter and L-kurt in the case of 3-parameter distributions) is considered in the selection of appropriate probability distributions. Performance indicators such as Root Mean Square Error (RMSE), Relative Root Mean Square Error (RRMSE), Mean Absolute Deviation Index (MADI), Efficiency Coefficient (EC), Probability Plot Correlation Coefficient (PPCC) and Volumetric Error (VE) were used to evaluate the performance of the distributions. A brief description of the methods of conventional moments and L-moments, along with the performance evaluation criteria, is presented below.

Conventional moments
The method of conventional moments is the oldest and a widely used technique for fitting a frequency distribution to observed data. Conventional moments are in widespread use because they are available on most calculators and in statistical software packages, and because they are familiar and easy to interpret. The commonly adopted normal, lognormal, extreme value, exponential and gamma distributions were used in the present study to select suitable distributions.
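As an illustration of fitting by conventional moments, the sketch below matches the sample mean and standard deviation to four of the distributions named above and computes the Kolmogorov-Smirnov statistic for each. It is a minimal sketch and not the procedure used in the study: the function names and the flow values are hypothetical, the exponential distribution is assumed to be the one-parameter form, and the gamma distribution is omitted because its CDF is not available in the Python standard library.

```python
import math

def sample_mean_std(data):
    """Sample mean and (n - 1) standard deviation."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)
    return mean, math.sqrt(var)

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def fit_by_moments(data):
    """Return {name: cdf} with parameters obtained by matching mean and std."""
    mean, std = sample_mean_std(data)
    # Lognormal: match the mean and variance of the untransformed data.
    s2 = math.log(1.0 + (std / mean) ** 2)
    mu_ln, sigma_ln = math.log(mean) - 0.5 * s2, math.sqrt(s2)
    # Extreme value type I (Gumbel): scale from std, location from mean.
    beta = std * math.sqrt(6.0) / math.pi
    loc = mean - 0.5772156649 * beta
    return {
        "normal": lambda x: normal_cdf(x, mean, std),
        "lognormal": lambda x: normal_cdf(math.log(x), mu_ln, sigma_ln),
        "exponential": lambda x: 1.0 - math.exp(-x / mean),  # assumed 1-parameter
        "extreme value": lambda x: math.exp(-math.exp(-(x - loc) / beta)),
    }

def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))

# Hypothetical annual maximum daily flows (m^3/s), for illustration only.
flows = [1200.0, 860.0, 1540.0, 990.0, 2310.0, 1130.0, 780.0, 1675.0, 1420.0, 905.0]
fits = fit_by_moments(flows)
ks = {name: ks_statistic(flows, cdf) for name, cdf in fits.items()}
print(min(ks, key=ks.get), ks)
```

The distribution with the smallest K-S statistic is the closest fit under this single test; the study combines three tests before selecting a distribution.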

Fig. 2 Theoretical L-moment ratio diagrams for 2- and 3-parameter probability distributions
Visual inspection is, however, subjective and cannot discriminate between distributions when more than one appears to be a possible candidate in an L-moment ratio diagram (Peel et al., 2001). To avoid the difficulties with the visual interpretation of L-moment diagrams, a performance measure, i.e. the distance ($d_i$) between the computed and observed L-cv for a 2-parameter distribution and L-kurt for a 3-parameter distribution (Kroll and Vogel, 2002), which measures the closeness between sample and theoretical L-moment ratios, is used and is expressed as

$$d_i = \left| \tau_2\left[\tau_3^{0}(i)\right] - \tau_2^{0}(i) \right| \quad \text{for a 2-parameter probability distribution}$$

$$d_i = \left| \tau_4\left[\tau_3^{0}(i)\right] - \tau_4^{0}(i) \right| \quad \text{for a 3-parameter probability distribution}$$

where $\tau_2^{0}(i)$, $\tau_3^{0}(i)$ and $\tau_4^{0}(i)$ are the observed or sample L-cv, L-skewness and L-kurtosis respectively at a gauging site $i$; $\tau_2[\tau_3^{0}(i)]$ and $\tau_4[\tau_3^{0}(i)]$ are the theoretical L-cv and L-kurtosis values calculated for a distribution corresponding to the sample L-skewness at the gauging site $i$. The distribution with the smallest $d_i$ value provides the best fit to the sample data. To compare the performance of each probability distribution (PD) relative to the best probability distribution, a performance ratio (PR) is determined as

$$PR_{PD} = \frac{d_{i,\,\text{Best PD}}}{d_{i,\,PD}} \qquad (11)$$

The best probability distribution will have a PR of 1 and all other probability distributions will have a PR between 0 and 1.
A brief description of the probability distributions using conventional moments, and of the 2-parameter and 3-parameter distributions using L-moments, is presented in Table 3.
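The distance and performance-ratio calculation described above can be sketched as follows for 3-parameter distributions. This is an illustrative sketch under stated assumptions: sample L-moments are computed from unbiased probability-weighted moments, only the two distributions whose L-skew to L-kurt relationship has a simple closed form (GLOG and GPA3) are included, and the function names are hypothetical. The remaining distributions would need the polynomial approximations, which are not reproduced here.

```python
def sample_l_moments(data):
    """Return (l1, l2, t3, t4): mean, L-scale, L-skewness, L-kurtosis.

    Uses unbiased probability-weighted moments; needs at least 4 values.
    """
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    b3 = sum(j * (j - 1) * (j - 2) * x[j] for j in range(n)) / (
        n * (n - 1) * (n - 2) * (n - 3))
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2

# Theoretical L-kurtosis as a function of L-skewness (closed-form cases only).
THEORETICAL_L_KURT = {
    "GLOG": lambda t3: (1.0 + 5.0 * t3 ** 2) / 6.0,
    "GPA3": lambda t3: t3 * (1.0 + 5.0 * t3) / (5.0 + t3),
}

def distances(t3, t4):
    """d_i = |theoretical L-kurt at the sample L-skew - sample L-kurt|."""
    return {name: abs(f(t3) - t4) for name, f in THEORETICAL_L_KURT.items()}

def performance_ratios(d):
    """PR = d of best distribution / d of this distribution (best gets 1)."""
    d_best = min(d.values())
    return {name: 1.0 if di == d_best else d_best / di for name, di in d.items()}

# Hypothetical annual maximum daily flows (m^3/s), for illustration only.
flows = [1200.0, 860.0, 1540.0, 990.0, 2310.0, 1130.0, 780.0, 1675.0, 1420.0, 905.0]
l1, l2, t3, t4 = sample_l_moments(flows)
d = distances(t3, t4)
print("L-cv:", l2 / l1, "L-skew:", t3, "L-kurt:", t4)
print("distances:", d, "ratios:", performance_ratios(d))
```

For 2-parameter distributions the same pattern applies, with the theoretical L-cv as a function of L-skewness in place of the L-kurtosis relationship.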

Performance evaluation criteria
The statistical indicators used to evaluate the performance of the selected probability distributions in the present study are briefly discussed below.

Root Mean Square Error (RMSE)
It measures the differences between observed and estimated values and yields the residual error in terms of the mean square error (Yu et al., 1994). It is expressed as

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^2}$$

where $x_i$ and $y_i$ respectively denote the observed and estimated values and $n$ is the number of observations. RMSE gives the quantitative model error in units of the variate and can be interpreted as the standard deviation of the model prediction error. It indicates the relative performance of different models: the smallest RMSE value indicates the best-fit model of the variate.

Mean Absolute Deviation Index (MADI)
It gives the mean of the absolute deviations of the estimated values from the observed values, expressed relative to the observed data.
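The two error measures described so far can be sketched as below. The sketch assumes the usual definitions, which are not spelled out in the text: RMSE as the square root of the mean squared difference, and MADI as the mean of |observed - estimated| / observed. The function names and the sample values are hypothetical.

```python
import math

def rmse(observed, estimated):
    """Root mean square error, in the units of the variate."""
    n = len(observed)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(observed, estimated)) / n)

def madi(observed, estimated):
    """Mean absolute deviation index (assumed form): mean of |x - y| / x."""
    n = len(observed)
    return sum(abs((x - y) / x) for x, y in zip(observed, estimated)) / n

# Hypothetical observed and estimated flows (m^3/s), for illustration only.
obs = [100.0, 200.0, 400.0]
est = [110.0, 180.0, 400.0]
print(rmse(obs, est), madi(obs, est))
```

Both measures are zero for a perfect fit; MADI is dimensionless, so it can be compared across gauging sites with very different flow magnitudes.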

Probability Plot Correlation Coefficient (PPCC)
It evaluates the adequacy of a fitted distribution and is a measure of the linearity of a probability plot. It gives the correlation between the ordered observations and the corresponding fitted values determined by a plotting position. A value of the coefficient near one suggests that the observations could have been drawn from the fitted distribution. It is given by

$$PPCC = \frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 \sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}}$$

where $\bar{x}$ and $\bar{y}$ are the means of the observed and estimated values respectively.
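A minimal sketch of the PPCC calculation follows. The plotting-position formula is an assumption, since the text does not say which one was used: the sketch takes p_i = (i - a) / (n + 1 - 2a) with a = 0.44 (Gringorten) as a default. The function name, the quantile function and its parameters are hypothetical.

```python
import math

def ppcc(observed, quantile_fn, a=0.44):
    """Correlation between ordered observations and fitted quantiles.

    quantile_fn maps a non-exceedance probability to a fitted value.
    a is the plotting-position constant (0.44 is an assumed default).
    """
    xs = sorted(observed)
    n = len(xs)
    ys = [quantile_fn((i - a) / (n + 1 - 2 * a)) for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical flows and a hypothetical fitted extreme value (Gumbel) quantile function.
flows = [1200.0, 860.0, 1540.0, 990.0, 2310.0, 1130.0, 780.0, 1675.0]
loc, beta = 1050.0, 380.0  # illustrative parameters, not fitted values
gumbel_quantile = lambda p: loc - beta * math.log(-math.log(p))
print(ppcc(flows, gumbel_quantile))
```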

Volumetric Error (VE)
It is an absolute prediction error (Yu et al., 1994), expressed as in Eq. (16). It measures the percent error in volume (bias) between the observed and estimated values, summed over the data period. Negative VE values indicate underestimation of the variable.
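The equation for VE is not reproduced in the text, so the sketch below assumes one form that is consistent with the description: the difference between the summed estimated and summed observed values, as a percentage of the summed observed values, so that a negative result means underestimation. The function name and the sample values are hypothetical.

```python
def volumetric_error(observed, estimated):
    """Percent error in total volume (assumed form); negative = underestimation."""
    total_obs = sum(observed)
    return 100.0 * (sum(estimated) - total_obs) / total_obs

# Hypothetical observed and estimated flows (m^3/s), for illustration only.
obs = [100.0, 200.0]
est = [90.0, 180.0]
print(volumetric_error(obs, est))
```

Because positive and negative deviations cancel in the sums, a VE near zero indicates little overall bias but does not by itself indicate a close fit at individual points.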

Probabilistic analysis using conventional moments
Statistical parameters such as the mean, standard deviation, coefficient of variation, skewness coefficient and kurtosis coefficient of maximum daily streamflow at the gauging sites of the Godavari river basin were computed and are presented in Table 4. The statistics of the maximum daily streamflow series at the gauging sites indicated moderately varied and positively skewed streamflows. The variability and skewness, however, mostly decreased towards the downstream gauging sites, barring a few exceptions. Low values of kurtosis, except in the upper reaches, indicate flat peaks near the mean in the maximum daily streamflows.
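The statistical parameters listed above can be computed as in the following sketch. The choice of estimators is an assumption, since the text does not state which were used: the standard deviation uses the n - 1 divisor, while skewness and kurtosis are simple moment ratios, with kurtosis reported so that a normal distribution gives 3. The function name and the data are hypothetical.

```python
import math

def describe(data):
    """Mean, std, coefficient of variation, skewness and kurtosis of a series."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # central moments
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    std = math.sqrt(m2 * n / (n - 1))            # sample standard deviation
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,
        "skewness": m3 / m2 ** 1.5,              # positive = long right tail
        "kurtosis": m4 / m2 ** 2,                # 3 for a normal distribution
    }

# Hypothetical annual maximum daily flows (m^3/s), for illustration only.
flows = [1200.0, 860.0, 1540.0, 990.0, 2310.0, 1130.0, 780.0, 1675.0, 1420.0, 905.0]
print(describe(flows))
```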
Goodness-of-fit statistics from the Kolmogorov-Smirnov, Anderson-Darling and Chi-square tests for the different distributions were calculated and are tabulated in Table 5. For each test, a maximum weight of 5 was assigned to the distribution with the minimum value of the statistic and a minimum weight of 1 to the distribution with the maximum value, and the best-fit distribution was selected as the one with the maximum total weight over the three tests. It may be observed from Table 5 that either the lognormal or the gamma distribution fitted the maximum daily streamflow data well at most of the gauging sites of the Godavari sub-basins selected for the present study.

Table 6 presents the L-moments and L-moment ratios of annual maximum daily streamflow at the gauging sites of the Godavari sub-basins selected for the present study. The L-moment ratios of the streamflow series represent large variability accompanied by large skewness. However, the variability and skewness decreased at the gauging sites in the lower reaches. Fig. 3 illustrates sample estimates of L-cv versus L-skew for annual maximum daily streamflow at the gauging sites of the sub-basins. Also plotted on Fig. 3 are the L-cv versus L-skew relationships for the 2-parameter probability distributions employed in the present study: Lognormal (LN2), Weibull (W2), Gamma (GAM2) and Generalized Pareto (GPA2). Distances (d_i) between the sample L-moments at the gauging sites of the sub-basins and the L-moment relationships for the different 2-parameter distributions were calculated and are presented in Table 7. The minimum value of these distances at a gauging site indicates the best choice of distribution for describing the streamflow series. To compare the performance of each distribution relative to the best distribution, performance ratios were also determined (Table 7). It may be observed from the d_i values that no single distribution fitted the data at all the gauging sites. 
However, GPA2 followed by GAM2/LN2 fitted the data at most of the gauging sites, barring a few exceptions. At the downstream-most gauging sites of the Pranahitha, Indravathi and Godavari lower sub-basins, the data followed the W2 probability distribution. Table 8 presents the parameters of the 2-parameter probability distributions for use in the estimation of annual maximum daily streamflow at the gauging sites of the Godavari sub-basins.

Fig. 4 illustrates sample estimates of L-skew and L-kurt for annual maximum daily streamflow at the gauging sites of the sub-basins. Also plotted on Fig. 4 are the L-skew versus L-kurt relationships for the 3-parameter distributions used in the study: Generalized Extreme Value (GEV), Generalized Logistic (GLOG), Generalized Pareto (GPA3), Lognormal (LN3), Pearson Type III (P3) and Weibull (W3). These curves are drawn from the polynomial approximations suggested by Hosking (1991) and Stedinger (1993).
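The weighting scheme used to combine the Kolmogorov-Smirnov, Anderson-Darling and Chi-square statistics for the conventional-moment fits can be sketched as follows. This is a minimal sketch, assuming that the five candidate distributions are ranked separately for each test and that ties are broken by listing order. The function name and the statistic values are hypothetical and are not taken from Table 5.

```python
def total_weights(stats):
    """Sum rank-based weights over tests.

    stats: {test name: {distribution: test statistic}}.
    Within each test the smallest statistic gets the highest weight
    (5 for five distributions) and the largest gets weight 1.
    """
    totals = {}
    for values in stats.values():
        ranked = sorted(values, key=values.get)  # smallest statistic first
        top = len(ranked)
        for rank, name in enumerate(ranked):
            totals[name] = totals.get(name, 0) + (top - rank)
    return totals

# Hypothetical goodness-of-fit statistics at one gauging site.
stats = {
    "K-S": {"normal": 0.20, "lognormal": 0.08, "exponential": 0.30,
            "extreme value": 0.12, "gamma": 0.10},
    "A-D": {"normal": 2.1, "lognormal": 0.5, "exponential": 4.0,
            "extreme value": 0.9, "gamma": 0.4},
    "Chi-square": {"normal": 9.0, "lognormal": 3.0, "exponential": 15.0,
                   "extreme value": 5.0, "gamma": 4.0},
}
weights = total_weights(stats)
best = max(weights, key=weights.get)
print(best, weights)
```

Summing ranks rather than raw statistics keeps the three tests on a common scale, at the cost of ignoring how large the gaps between distributions are within a test.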

Probabilistic analysis using L-moments
Distances (d_i) between the sample L-moments at the gauging sites of the sub-basins and the L-moment relationships for the different 3-parameter distributions were calculated and are presented in Table 8. The minimum value of these distances at a gauging site indicates the best choice of distribution for describing the streamflow series. To compare the performance of each distribution relative to the best distribution, performance ratios were also determined (Table 8). It may be noticed that, as with the 2-parameter distributions, no single distribution fitted the data at all the gauging sites. GPA3 at seven gauging sites, W3 and P3 at five gauging sites each, GLOG at four gauging sites and GEV at two gauging sites fitted the data in the sub-basins. Table 9 presents the parameters of the best-fit probability distributions using conventional moments, and of the 2-parameter and 3-parameter probability distributions using L-moments, for use in the estimation of annual maximum daily streamflow at the gauging sites of the Godavari sub-basins.

Performance evaluation of probability distributions
The performance of the probability distributions recommended based on conventional and L-moments was evaluated using the performance indicators such as RMSE, RRMSE, MADI, PPCC and VE as presented in Table 10.
Low values of RMSE and MADI indicate the appropriateness of the recommended distributions. Values of RRMSE less than 10% substantiate the reasonableness of recommended distributions. PPCC values are above 90% at most of the gauging sites indicating satisfactory performance of the probability models recommended. VE values are mostly negative indicating underestimation of maximum daily streamflow at the gauging sites.
Based on the performance evaluation, it may be observed from Table 10 that probability distributions based on conventional moments performed better at most of the gauging sites in the downstream reaches of the sub-basins. This may be because the effect of outliers is not very significant at the downstream gauging sites. However, 2- or 3-parameter probability distributions using L-moments fitted the annual maximum daily streamflow data satisfactorily at the gauging sites in the upstream reaches. This implies a wide variation in streamflow in the upstream reaches, which increases the number of outliers and makes them relatively significant. In the Godavari upper and middle sub-basins, 2-parameter distributions using L-moments at the upstream-most, 3-parameter distributions at the middle and probability distributions using conventional moments at the downstream-most gauging sites performed better. As discussed, probability distributions based on conventional moments performed better at most of the gauging sites in the Pranahitha, Indravathi and Godavari lower sub-basins.
It may also be observed that at the gauging sites where the data showed moderate to large variability (0.075 < L-cv ≤ 0.4) and moderate skewness (0.05 < L-skew ≤ 0.15), probability distributions with conventional moments seem to be a better choice than the distributions with L-moments. However, 2- and 3-parameter distributions based on L-moments performed satisfactorily at the gauging sites where the data showed very large variability (L-cv > 0.4) and skewness (L-skew > 0.3).