Here, an application of MIKA-SHA to identify the dominant runoff controls of the Rappahannock watershed in the United States is described. Due to the large spatial extent of the watershed, it is expected that the spatial heterogeneity in catchment properties and climate variables are significant. In the current study, the SUPERFLEX library of MIKA-SHA is used with topography-based HRUs and soil-type-based HRUs. For simplicity, spatial heterogeneities of bedrock geology and land use are considered minor due to deep soil layers and large forest cover of the watershed, respectively.
3.1 Study Area
Rappahannock River basin (USGS Station number: 1664000) at Remington, Virginia, United States (Fig. 1) is an Intermediate Scale Area (ISA) river basin with a drainage area of 1605 km2 located in the southeastern quadrant of the United States. The watershed can be categorized as a forested catchment (fraction of forest: 0.75) with a humid climate (aridity index: 1.22). The mean elevation of the catchment is 216.1 m, and the mean slope is 30.33 m/km. Average daily runoff, precipitation, potential evaporation, and temperature values of the basin are 1.065 mm/day, 3.125 mm/day, 2.562 mm/day, and 12.4 oC, respectively. The catchment outlet is located in the southeastern corner of the catchment (Lat: 38.53068o, Long: -77.81360o).
Hydrometeorological data of the Rappahannock basin from 1/1/1998 to 31/12/2014 (17 years) are used for model spin-up (two years), model calibration (five years), model validation (five years), and model testing (five years). Four subcatchments are identified through the watershed delineation. The area percentages of four subcatchments are 32.1%, 22.4%, 25.4%, and 20.1% respectively. Further, the lengths from the subcatchment outlets to the catchment outlet along the main channel of four subcatchments are 0 km, 25.5 km, 22 km, and 22 km, respectively. Spatial variability of precipitation and temperature are considered and lumped at the subcatchment scale (separate time series for each subcatchment). For simplicity, spatial variability of potential evaporation is lumped at the catchment scale (single time series). Table 2 summarises details about the resolutions and sources of each hydrometeorological variable and catchment properties.
Basin characteristics, forcing terms and streamflow data of Rappahannock River Basin at Remington are included in both CAMELS (Newman et al. 2015) and MOPEX datasets (MOPEX 2021). Further, the Rappahannock River Basin at Fredericksburg (Basin ID: 1668000), which consists whole upper basin of the Rappahannock River (4130 km2), is also included in the MOPEX dataset. In this context, the study area used in the current study is a headwater subcatchment of the Rappahannock River Basin at Fredericksburg. Due to this data availability, both Rappahannock River Basin at Remington (study area of the current study) and Rappahannock River Basin at Fredericksburg have been used in many research studies, including parameter regionalization (Ao et al. 2006; Bardossy et al. 2016), priori parameter estimation (Duan et al. 2006), model comparison (Gan et al. 2006; Knoben et al. 2020), and automatic model induction (Spieler et al. 2020).
We avoid direct comparison of predictive capabilities in terms of efficiency values of rainfall-runoff models of the above-mentioned studies with predictive capabilities of MIKA-SHA learnt models as the research objectives, and modelling settings of those studies are significantly different from the objectives and modelling settings of the present study. In contrast to the multi-objective optimization used in MIKA-SHA, all of the studies mentioned above use single objective optimization based on either NSE or KGE. Most of the studies utilize lumped modelling instead of the semi-distributed modelling used in MIKA-SHA. However, comparing the research inferences between the studies mentioned above and the present study would be interesting.
Table 2
Details of hydrometeorological data and catchment properties
Data
|
Resolution/ Scale
|
Source
|
Precipitation (mm/day)
|
Subcatchment averaged
|
Daymet dataset (Daymet 2020)
|
Potential evaporation (mm/day)
|
Catchment averaged
|
CAMELS dataset (Newman et al. 2015)
|
Temperature (0C)
|
Subcatchment averaged
|
Dayment dataset (Dayment 2020)
|
Streamflow (mm/day)
|
Catchment averaged
|
CAMELS dataset (Newman et al. 2015)
|
Soil
|
1:250000
|
STATSGO2 soil data from USDA Natural Resources Conservation Service (NRCS) Web Soil Survey (WSS) (Web Soil Survey 2021)
|
Digital elevation
|
30 m
|
Shuttle Radar Topography Mission (SRTM) data from United States Geological Survey (USGS) EarthExplorer (USGS EarthExplorer 2020)
|
The spatial heterogeneity of the Rappahannock catchment in topography and soil types is considered and incorporated using HRUs. Under topography-based HRU classification, three HRUs are selected, namely floodplain (slope position threshold = 0.1), hill (Slope band percentage > 10), and plateau (Slope band percentage ≤ 10). The spatial variability of topography-based HRUs is shown in Fig. 2. Based on the major map units of STATSGO2 soil data, four HRUs are identified under the soil-type-based HRU classification. Each HRU may consist of one or more soil types (different soil types are separated using a "-"). Four HRUs include S1: Hayesville, S2: Myersville-Catoctin, S3: Occoquan-Meadowville-Buckhall, and S4: Worsham-Hazel-Cupeper. Figure 3 illustrates the spatial distribution of soil-type-based HRUs. Area percentages of each HRU under topography- and soil-type-based classifications are given in Table 3.
Table 3
Category
|
Area percentages
|
Subcatchment 1
|
Subcatchment 2
|
Subcatchment 3
|
Subcatchment 4
|
Topography-based
|
|
|
|
|
Hill
|
33.95
|
41.41
|
52.02
|
53.21
|
Floodplain
|
15.48
|
33.46
|
29.95
|
24.18
|
Plateau
|
50.57
|
25.13
|
18.03
|
22.61
|
Soil-type-based
|
|
|
|
|
S1
|
8.15
|
82.74
|
52.25
|
30.83
|
S2
|
36.73
|
10.48
|
33.58
|
26.57
|
S3
|
33.75
|
3.12
|
4.30
|
29.43
|
S4
|
21.37
|
3.66
|
9.86
|
13.18
|
3.2 Results
MIKA-SHA is applied to identify two optimal semi-distributed model configurations to represent the runoff dynamics of the Rappahannock catchment. One model is based on the topography-based HRUs, and the other model is on soil-type-based HRUs. Each time MIKA-SHA is run with the settings summarized in Table 4. Results obtained through the MIKA-SHA applications are presented in this section.
Table 4
Option
|
Setting
|
Independent Runs
|
50
|
Size of population
|
2000
|
Termination criteria
|
Generation number = 50
|
The randomized method used for initialization
|
Ramped Half and half
|
Special functions/ Mathematical functions
|
SUPERFLEX, DISTRIBUTED/ +, -, /, *
|
Input variables
|
Precipitation, temperature, potential evaporation
|
Dependent variable
|
Streamflow
|
Number of objective functions used
|
4 (VE/ KGE/ NSE/ logNSE)
|
Normalized range of constants
|
0 to 1
|
Depth of parse trees- initial/ maximum
|
3/ 5
|
The mating pool selection strategy
|
Tournament selection with four competitors
|
Genetic operator probability: mutation
Constant/ Tree/ Separation/ Node
|
0.5/0.5/0.3/0.3
|
Genetic operator probability: crossover
|
0.7
|
Count of CPUs used for parallel computation
|
40 units
|
Level of parallel computation
|
Performance evaluation level
|
Likelihood threshold - GLUE
|
NSE = 0.5
|
Behavioural models - GLUE
|
5000
|
3.2.1 Topography-based HRUs
Model configuration learnt by MIKA-SHA based on the topography-based HRUs is illustrated in Fig. 4 (hereinafter referred to as MIKA-SHA_SUPERFLEX_TOPO). The hillside model structure of the MIKA-SHA_SUPERFLEX_TOPO consists of two reservoirs (FR: a fast-reacting soil reservoir, SR: a slow-reacting soil reservoir) and one delay operator (a half-triangular lag function). One SR and two delay operators are included in the floodplain model structure. Plateau area model structure consists of three reservoirs (RR: a riparian reservoir, an FR and an SR) and two delay operators. The storage-discharge relationships of all three SRs of MIKA-SHA_SUPERFLEX_TOPO are based on the power function. The FR of the hillside model structure is also having a power function based storage-discharge relationship. The storage-discharge relationships of the RR and FR of the plateau area model structure are linear.
Figure 5 illustrates the simulated hydrographs of MIKA-SHA_SUPERFLEX_TOPO along with the observed hydrographs over one year (out of five) per each calibration (2004), validation (2008) and testing (2013) periods. A good visual match can be observed between simulated and observed hydrographs of each period. The simulated FDCs of MIKA-SHA_SUPERFLEX_TOPO, along with the observed FDCs of the Rappahannock catchment, are given in Fig. 7. Both simulated and observed FDCs of the calibration period follow each other significantly well. However, simulated FDCs tend to deviate slightly from the observed FDCs in the validation and testing periods, especially in low flow regimes.
Efficiency values of MIKA-SHA_SUPERFLEX_TOPO are graphically shown in Fig. 7. In many hydrological modelling exercises, it is common to see a slight deterioration of performance values of validation and testing periods where the discharge values of those periods are not used to train/ calibrate the model. Similar behaviour can also be observed here. As MIKA-SHA uses the performance values of the validation period in the optimal model selection process, the testing performance values represent the out of sample performance of MIKA-SHA_SUPERFLEX_TOPO. Interestingly, except for KGE, testing performances of the other three objective functions are higher than those in the validation period. The testing performance value of logNSE is even higher than that of the calibration period. This suggests that the MIKA-SHA_SUPERFLEX_TOPO is not overfitted to its training data.
Uncertainty analysis of MIKA-SHA_SUPERFLEX_TOPO reveals that 75.1% of observed discharge values of the calibration period lie between the 90% uncertainty bounds. This high percentage value suggests that the parameter uncertainty of MIKA-SHA_SUPERFLEX_TOPO alone is sufficient to estimate the total uncertainty satisfactorily. By studying the shape of the sensitivity scatter plots, 18 (out of 34) model sensitive parameters are identified. Six of them are associated with the hillside model structure, while four are associated with the floodplain model structure. Plateau area model structure also has six model sensitive parameters. The remaining two are the lag parameters of MIKA-SHA_SUPERFLEX_TOPO.
3.2.2 Soil-type-based HRUs
The optimal model learnt by MIKA-SHA under soil-type-based HRU classification (hereinafter referred to as MIKA-SHA_SUPERFLEX_SOIL) is shown in Fig. 8. Although it is possible to induce much more complex model configurations under soil-type-based HRU classification due to the higher number of HRUs, MIKA-SHA has identified a relatively simple model configuration. S1 model structure consists of two reservoirs (an FR and an SR), while the other three model structures consist of only one reservoir (an SR). In terms of model structures, both S3 and S4 share the same model architecture although different parameter values. The flow path above the SR in S1, S3 and S4 represents the direct portion of precipitation received to either FR or total runoff. In contrast, the same link in S2 represents the runoff generation through infiltration excess mechanism. The storage-discharge relationships of all four SRs are based on the power function. In contrast, the storage-discharge relationship of the FR in the S1 model structure is linear.
Figure 9 demonstrates the simulated hydrographs of MIKA-SHA_SUPERFLEX_SOIL and the observed hydrographs of the catchment over one year per calibration, validation, and testing period. Again a good visual match can be observed between the simulated and observed discharge signatures. Simulated FDC of MIKA-SHA_SUPERFLEX_SOIL closely follows the observed FDC in the calibration period (Fig. 10). However, it tends to deviate slightly in the medium and low flow regimes in the validation period. In contrast, simulated FDC of the testing period varies slightly only in the medium flow regime.
Efficiency values of MIKA-SHA_SUPERFLEX_SOIL are illustrated in Fig. 11. As observed with MIKA-SHA_SUPERFLEX_TOPO, slight deteriorations in model performances can be noted in validation and testing periods compared to the calibration period. However, on two occasions (with VE and logNSE) testing model performances are superior to the validation model performances. Similar to MIKA-SHA_SUPERFLEX_TOPO, logNSE value of the testing period is higher than that of the calibration period. Therefore, we might expect no overfitting issues here also. However, besides logNSE values, the range between the calibration and testing performance values is higher in MIKA-SHA_SUPERFLEX_SOIL than MIKA-SHA_SUPERFLEX_TOPO.
As per the uncertainty analysis of MIKA-SHA_SUPERFLEX_SOIL, 75.4% of observed discharge values of the calibration period fall between the 90% uncertainty bands. Therefore, the parameter uncertainty of MIKA-SHA_SUPERFLEX_SOIL alone is capable of estimating the total uncertainty satisfactorily. Sensitivity scatter plots of model parameters reveal that 10 model parameters out of 31 total model parameters are model sensitive parameters (S1–2, S2–3, S3–2, S4–1, and two lag parameters).
3.2.3 Topography vs. Soil-type
Visual inspection of hydrographs and FDCs is not sufficient to differentiate the performance between MIKA-SHA_SUPERFLEX_TOPO and MIKA-SHA_SUPERFLEX_SOIL. Hence, Fig. 12 graphically illustrates the performance difference in terms of overall efficiency values (one efficiency value over the calibration, validation and testing periods) under each objective function. From this graph, it is evident that the performance of MIKA-SHA_SUPERFLEX_SOIL is dominated by the performance of MIKA-SHA_SUPERFLEX_TOPO in terms of all four performance measures (i.e. MIKA-SHA_SUPERFLEX_TOPO is Pareto-optimal). The difference between the efficiency values is highest in KGE and lowest in logNSE. On this basis, it appears that the runoff dynamics of the Rappahannock catchment are predominantly controlled by the topography than the soil type of the area.
3.2.4 Previous Research Findings
According to results of the large-scale study on model structural uncertainty conducted by Knoben et al. (2020), out of 36 conceptual rainfall-runoff models, the Xinanjiang model (Zhao 1992) performed best for the Rappahannock River Basin at Remington in terms of the KGE values of the calibration period. Six different SUPERFLEX framework based conceptual models were also included among the 36 models used in this study. Among those six models, Hillslope, FLEX-Topo model (Savenije 2010) outperformed the remaining FLEX models. Although this study was a lumped modelling exercise, it is interesting to note that both the Xinanjiang model and Hillslope, FLEX-Topo model are based on the hypothesis that topography drives the runoff generation. In another study (Bardossy et al. 2016), out of three lumped conceptual rainfall-runoff models (HYMOD, HBV and Xinanjiang), the HBV model (Lindström et al. 1997) performed superior to the other two models in terms of the calibrated NSE values for the Rappahannock River Basin at Remington. In contrast to the other two models, runoff estimation in the HBV model is based on the power function. Interestingly, most of the reservoirs of MIKA-SHA_SUPERFLEX_TOPO and MIKA-SHA_SUPERFLEX_SOIL also utilize a similar function to represent the storage-discharge relationships.
Block-wise TOPMODEL was utilized by Ao et al. (2006) to relate model parameters to the basin physical characteristics of 10 United States watersheds, including the Rappahannock River Basin at Fredericksburg. As per the calibrated NSE values, Rappahannock River Basin achieved higher efficiency than the 10-basin averaged efficiency. This may indicate the dominance of topography towards runoff generation of the catchment relative to the other catchments because, in this study, spatial heterogeneity of the area was incorporated based on the topography. Both Duan et al. (2006) and Gan et al. (2006) reported that simpler conceptual rainfall-runoff models perform better than complex physics-based land surface models in simulating runoff responses of 12 MOPEX catchments, including the Rappahannock River Basin at Fredericksburg. Similarly, our MIKA-SHA findings also find relatively simpler models to perform better than more complex models in runoff prediction. Additionally, in terms of the number of reservoir units, three model configurations of MIKA-SHA_SUPERFLEX_TOPO, which represent hill, floodplain and plateau areas of Rappahannock basin, match with the hillslope, wetland and plateau model configurations of FLEX-Topo model (hillslope and plateau: two reservoirs, wetland: one reservoir) which have been defined based on expert's knowledge.
The importance of utilizing a multi-objective performance criterion to evaluate model performances in hydrological modelling can be highlighted through the results of Bardossy et al. (2016) and Knoben et al. (2020). Although with different time frames, both studies used single-objective optimization, the same spatial extent, and 10-year calibration periods. Bardossy et al. (2016) report, in terms of calibrated NSE values, the performance order in predicting the runoff response of Rappahannock River Basin at Remington in descending order as HBV (best one), HYMOD and then Xinanjiang model. However, in terms of the calibrated KGE values reported in Knoben et al. (2020), the order becomes reversed (Xinanjiang model performs best). This indicates that often possible to identify a model which performs well for a specific flow regime as the optimal model under single objective optimization depending on the sensitivity of the selected objective function. For example, the popular NSE value is highly sensitive to large discharge values (Gupta et al. 2009). This was the main reason for utilizing a multi-objective optimization framework within MIKA-SHA.