Identification of Dominant Runoff Controls Using Hydrologically Informed Machine Learning Approach

doi:10.21203/rs.3.rs-815493/v1


Research Article


This work is licensed under a CC BY 4.0 License


Relative dominance of the runoff controls, such as topography, geology, soil types, land use, and climate, may differ from catchment to catchment due to the spatial and temporal heterogeneity of landscape properties and climate variables. Understanding dominant runoff controls is an essential task in developing unified hydrological theories at the catchment scale. Semi-distributed rainfall-runoff models are often used to identify dominant runoff controls for a catchment of interest. In most such applications, the model selection is based on either expert judgement or experimental and fieldwork insights. Model selection is the most important step in any hydrological modelling exercise, as the findings are largely influenced by the selected model. Hence, a subjective model selection without sufficient expert knowledge or experimental insights may result in biased findings, especially for comparative studies such as the identification of dominant runoff controls. In this study, we use a physics informed machine learning toolbox based on genetic programming, Machine Induction Knowledge Augmented - System Hydrologique Asiatique (MIKA-SHA), to identify the relative dominance of runoff controls. We find the quantitative and automated approach based on MIKA-SHA to be highly appropriate for the intended task. MIKA-SHA does not require explicit user selections and relies on data and fundamental hydrological processes. The approach is tested using the Rappahannock River basin at Remington, Virginia, United States. Two rainfall-runoff models are learnt to represent the runoff dynamics of the catchment using topography-based and soil-type-based hydrologic response units independently. Based on prediction capabilities, topography is identified as the dominant runoff driver in this case.

Civil Engineering

rainfall-runoff models

physics informed machine learning

genetic programming

runoff controls

Identifying and understanding the behaviour of dominant hydrological processes is a major interest in catchment sciences. Further, it is important to appreciate how these dominant hydrological processes reflect variations in runoff controls, such as topography, geology, soil types and land use. Runoff control factors allow us to better understand the similarities between different catchments. Hence, consideration of the heterogeneity and uniqueness of place is a must in the derivation of unified hydrological theories at the catchment scale (Beven 2020). Understanding the relationships between hydrological processes and runoff controls may even help address arguably one of the most challenging tasks facing hydrologists: Prediction in Ungauged Basins (PUB) (Schröder 2006).

The runoff controls for a catchment are often identified through experimental work, which is expensive and largely limited to relatively small catchments. A modelling exercise based on data-driven models, which learn the hydrologic response of a catchment from historical hydrometeorological data and available spatial data, would be an effective, cheap and quick alternative when resources for an experimental exercise are limited. Moreover, pursuing a modelling exercise before experimental investigations would help the experimentalist plan field campaigns more efficiently based on the insights gained from models.

Distributed hydrological models are one of the few tools available nowadays to understand the influence of landscape characteristics on the hydrological response. To better understand how a watershed functions during a rainfall event, it is necessary not only to estimate model parameters but also to identify a model structure, as each watershed may act uniquely due to inherent spatial heterogeneity. In this context, flexible modelling frameworks and data-driven models might constitute the most suitable approach.

Many research studies that aim to identify runoff controls and their relative dominance in runoff generation have used semi-distributed modelling instead of fully distributed models because of its lower complexity, high prediction accuracy and lower computational and data demands. Addor and Melsen (2019) have reported that model selection in hydrological modelling is frequently based on legacy factors, such as the model's popularity, rather than adequacy factors, such as the appropriateness of the model for achieving the research objectives. A semi-distributed model chosen through subjective model selection may introduce biased research findings, especially for a study that aims to identify dominant runoff controls. This is because the selected model may already be founded on a particular runoff driver. For example, TOPMODEL (Beven et al. 1995) assumes topography as the dominant runoff driver, while the SWAT model (Arnold et al. 1998) assumes soil properties to be prevalent in controlling runoff dynamics.

In our prior work (Herath et al. 2021a, b), we developed a physics informed machine learning toolbox based on Genetic Programming (GP) (Koza 1992) for automatic semi-distributed rainfall-runoff model induction: Machine Induction Knowledge Augmented - System Hydrologique Asiatique (MIKA-SHA). MIKA-SHA uses hydrological knowledge to govern the learning algorithm so that it induces physically consistent models with high prediction accuracy. The hydrological knowledge is incorporated in the form of established modelling concepts. Prior model selection is not required, and model induction (both structure and parameters) is part of its machine learning (ML) framework. In this study, our main objective is to explore another potential use of MIKA-SHA: identifying the relative dominance of runoff controls on the discharge response of the watershed of concern. We believe the automated and quantitative approach used in MIKA-SHA makes it highly appropriate for understanding the dominant runoff drivers through an unbiased search process.

1.1 Runoff Controls

Hydrological variables, such as topography, geology, soil types, land use, and climate, are identified as runoff controls. The spatial and temporal heterogeneity of these variables and their complex interactions force each catchment to behave uniquely. This is one of the main reasons for the limited number of catchment scale hydrological theories within the hydrological modelling community (Nearing et al. 2020). Despite the large number of research studies conducted on runoff controls, there are still considerable gaps to be filled. Understanding the runoff controls is crucial, as the inferred results may be affected if the runoff control on which the model relies does not match the runoff control of the catchment. Further, understanding the relative influence of each runoff control is necessary for the interpretation of the catchment hydrology (Jencso and McGlynn 2011).

Table 1

Runoff controls
Runoff Control	Reported influences on runoff generation
Topography	Assumed by many models to be the dominant runoff controller, e.g. TOPMODEL (Beven et al. 1995), FLEX-topo (Savenije 2010), Xinanjiang model (Zhao 1992); Has a strong connection to runoff generation in mountainous watersheds (Jencso and McGlynn 2011); Controls the flashiness of flow in most of the European catchments (Kuentz et al. 2017); Controls water movement through the soil strata (Price 2011); Influences groundwater recharge and organization of drainage network (Devito et al. 2005); Used to derive the shape of the groundwater table (Condon and Maxwell 2015); Mainly controls the spatial variability of soil moisture (Woods et al. 1997); Used to extract information on other runoff controls (Savenije 2010; Gao et al. 2014a)
Geology	Controls the runoff response of even some hillslopes (Jencso and McGlynn 2011); Flow through the bedrock may form a significant runoff component (Onda et al. 2001); Interface between soil layer and bedrock may act as a preferential flow path; Bedrock permeability largely decides the moisture storage capacity (Vannier et al. 2016); Fractured bedrock can store a significant volume of groundwater, while hard crystalline bedrock with minor fracturing may hold only a small amount (McGuire et al. 2005); Controls the baseflow index of most European catchments (Kuentz et al. 2017)
Soil types	Influence surface and subsurface flow processes, water storage, residence time, and flow pathways; Influence (soil depth) partitioning of precipitation between discharge and moisture storage (Molin et al. 2020) and moisture storage capacity; Influence the speed of runoff generation, e.g. organic soils – quick runoff response, mineral soils – slow runoff response (Devito et al. 2005); Drainage properties (permeability and porosity) affect both surface and subsurface flow dynamics; Control spatiotemporal flow occurrence in intermittent rivers and ephemeral streams (Gutiérrez-Jurado et al. 2019); Different soil textures encourage vertical or lateral drainage, e.g. coarse soil texture – more vertical flow, fine soil texture – more lateral flow (Devito et al. 2005)
Land use	Affects infiltration characteristics; A direct link exists between vegetation and evapotranspiration (Meshgi et al. 2015); Affects baseflow generation through infiltration and evaporation (Price 2011); Identified as the second dominant controller of flashiness index and baseflow index of most European watersheds (Kuentz et al. 2017); Changes due to human influences may affect long-term baseflow recession (Wang and Cai 2010); Root zone's moisture storage capacity is identified as a critical water cycle parameter (Gao et al. 2014b); Forest cover is recognized as a dominant factor that influences flood generation (Savenije 2010)
Climate	Categorizes catchments as dry, arid, sub-humid, humid or wet; Different dominant processes are observed in different climates, e.g. dry, arid and sub-humid - storage or uptake dominates catchment dynamics with a greater tendency towards vertical flow (Devito et al. 2005); Spatial variability of climate variables is crucial in defining the functioning of drainage systems and the separation between snow and rain (Savenije 2010); Climate variables dominate most of the flow signatures of European watersheds (Kuentz et al. 2017)

1.2 Physics Informed Machine Learning

It is often observed that simple data-driven models outperform theory-driven models, such as physics-based and conceptual models, in terms of prediction accuracy in many hydrological applications (Nearing et al. 2020). At the same time, ML models are heavily criticized for the lack of interpretability of the induced models (often referred to as the black-box paradigm). This has prevented ML models from achieving in hydrology the level of success they have achieved in the commercial domain (Karpatne et al. 2017). Researchers who primarily contribute to physics-based process simulation modelling have little confidence in the capabilities of ML models and continuously raise questions about their lack of physical understanding (Sellars 2018). However, not recognizing the potential of ML models in hydrological modelling has been identified as a danger to the hydrological modelling community (Nearing et al. 2020). More importantly, most state-of-the-art ML capabilities have not yet been thoroughly tested in hydrological modelling (Shen et al. 2018).

One promising way forward to bridge the gap between physics-based process simulation and ML would be to incorporate existing hydrological knowledge to guide ML algorithms to induce more physically sound and consistent models (Babovic 2005; 2009). This concept is presently recognised in the machine learning community as a new modelling paradigm, known as physics informed machine learning (Physics Informed Machine Learning Conference 2016) or Theory Guided Data Science (TGDS) (Karpatne et al. 2017). The objective of TGDS is to blend existing scientific knowledge with data science models to induce more generalizable models that are consistent with scientific theories. As per the taxonomy defined by Karpatne et al. (2017), there are five different ways of incorporating scientific knowledge into data science models. They are (i) theory-guided design of data science models, (ii) theory-guided learning of data science models, (iii) theory-guided refinement of data science outputs, (iv) learning hybrid models of theory and data science, and (v) augmenting theory-based models using data science.

Machine Induction Knowledge Augmented - System Hydrologique Asiatique (MIKA-SHA) (Herath et al. 2021a, b) is a hydrologically informed semi-distributed rainfall-runoff model induction toolkit based on GP. MIKA-SHA is developed by extending the lumped modelling capabilities of the Machine Learning Rainfall-Runoff Model Induction Toolkit (ML-RR-MI) (Chadalawada et al. 2020) towards distributed modelling. MIKA-SHA is a hybrid TGDS approach where the existing hydrological knowledge on rainfall-runoff modelling is incorporated to guide the learning algorithm. This enables MIKA-SHA induced models to produce good prediction capabilities as well as readily interpretable model configurations.

Existing hydrological knowledge is incorporated by adding purpose-built functions to the function set of the GP based optimization framework of MIKA-SHA. At present, there are three different model building block libraries in MIKA-SHA: the SUPERFLEX library, which consists of generic model building components available in the SUPERFLEX flexible modelling framework (Fenicia et al. 2011; Kavetski and Fenicia 2011); the FUSE library, which consists of generic model building components available in the FUSE flexible modelling framework (Clark et al. 2008); and the TANK library, which includes model building blocks derived from the Sugawara TANK model template (Sugawara 1979). The most important feature of MIKA-SHA is that the framework can be easily equipped with any other internally coherent collection of model building components.

As MIKA-SHA relies on GP, there is no requirement to pre-define a model structure. Instead, identifying an appropriate model structure is part of the ML framework of MIKA-SHA, meaning that GP optimizes both model structure and model parameters simultaneously. Based on evolutionary computing, MIKA-SHA builds and tests hypotheses about the runoff dynamics using the available model building components of the chosen model inventory. The workflow of MIKA-SHA is fully automated and hence involves no subjectivity in model induction or model selection. The results are therefore expected to be free of user-selection bias. The toolkit is especially useful when expert knowledge and fieldwork insights about the study area are lacking.

The MIKA-SHA SUPERFLEX library is used to induce representative semi-distributed rainfall-runoff models in the current study. This library includes two purpose-built functions, namely SUPERFLEX and DISTRIBUTED, where the former represents the SUPERFLEX submodels built and the latter describes the semi-distributed models induced from those SUPERFLEX submodels. MIKA-SHA learns SUPERFLEX submodels using the model building components (reservoir units, lag functions, connection elements, and constitutive functions) available in the SUPERFLEX framework. The DISTRIBUTED function assigns a separate SUPERFLEX submodel to each Hydrologic Response Unit (HRU) used. HRUs within the same subcatchment share the same set of forcing terms. HRU outflows are first routed to the subcatchment outlets and then routed towards the catchment outlet. The DISTRIBUTED function uses the two-parameter gamma distribution function as the delay operator. For a detailed explanation of MIKA-SHA, please refer to Herath et al. (2021a, b).
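The routing idea described above can be illustrated with a minimal Python sketch of a single routing stage. It is not the MIKA-SHA implementation: the function names, the daily midpoint discretisation of the gamma density, the area weighting of HRU outflows and the example parameter values are all assumptions made for illustration only.

```python
import math
import numpy as np

def gamma_lag_weights(shape, scale, n_steps):
    """Discretise a two-parameter gamma density into lag weights.

    The density is sampled at the midpoint of each time step and the
    weights are normalised to sum to one (an assumed discretisation).
    """
    t = np.arange(n_steps) + 0.5
    log_pdf = ((shape - 1.0) * np.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    weights = np.exp(log_pdf)
    return weights / weights.sum()

def route(flow, weights):
    """Delay a flow series by convolving it with the lag weights."""
    flow = np.asarray(flow, dtype=float)
    return np.convolve(flow, weights)[: flow.size]

def subcatchment_outflow(hru_flows, hru_area_fractions, weights):
    """Area-weight the HRU outflows of one subcatchment, then apply the lag."""
    hru_flows = np.asarray(hru_flows, dtype=float)  # shape (n_hru, n_steps)
    fractions = np.asarray(hru_area_fractions, dtype=float)
    combined = fractions @ hru_flows
    return route(combined, weights)

# Hypothetical example: a unit pulse of runoff from two HRUs on day 0.
weights = gamma_lag_weights(shape=2.0, scale=1.5, n_steps=10)
hru_flows = np.zeros((2, 30))
hru_flows[:, 0] = 1.0
routed = subcatchment_outflow(hru_flows, [0.6, 0.4], weights)
```

In this sketch the same `route` function could be applied a second time, with a different pair of gamma parameters, to carry a subcatchment outflow on to the catchment outlet. How the lag parameters relate to channel length is not specified here and is left open.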

MIKA-SHA relies on a performance measures library (Chadalawada and Babovic 2017), which includes most of the objective functions commonly used to evaluate model performance. For the current study, four objective functions, namely Volumetric Efficiency (VE) (Criss and Winston 2008), Kling-Gupta Efficiency (KGE) (Gupta et al. 2009), Nash-Sutcliffe Efficiency (NSE) (Nash and Sutcliffe 1970), and log Nash-Sutcliffe Efficiency (logNSE) (Krause et al. 2005), are used in the multi-objective optimization scheme of MIKA-SHA. The respective equations are given in Eq. (1) to Eq. (4), where N: number of time steps, Q_ot: observed streamflow at time step t, Q_st: simulated streamflow at time step t, r: linear correlation coefficient between observed and simulated streamflow, α = σ_s/σ_o, β = μ_s/μ_o, σ: standard deviation, μ: mean (subscripts s and o denote simulated and observed streamflow), Q̄_ot: mean of observed discharge values, and log: natural logarithm.

$$VE=1-\frac{\sum_{t=1}^{N}\left|Q_{ot}-Q_{st}\right|}{\sum_{t=1}^{N}Q_{ot}} \tag{1}$$

$$KGE=1-\sqrt{\left(r-1\right)^{2}+\left(\alpha-1\right)^{2}+\left(\beta-1\right)^{2}} \tag{2}$$

$$NSE=1-\frac{\sum_{t=1}^{N}\left(Q_{ot}-Q_{st}\right)^{2}}{\sum_{t=1}^{N}\left(Q_{ot}-\overline{Q_{ot}}\right)^{2}} \tag{3}$$

$$logNSE=1-\frac{\sum_{t=1}^{N}\left(\log Q_{ot}-\log Q_{st}\right)^{2}}{\sum_{t=1}^{N}\left(\log Q_{ot}-\log \overline{Q_{ot}}\right)^{2}} \tag{4}$$
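For readers who want to reproduce the four efficiency measures, the following Python sketch shows one way to compute them with NumPy. The function names are hypothetical and this is not the MIKA-SHA performance measures library. VE is taken in the Criss and Winston (2008) form, with the absolute value applied to each residual; KGE uses α and β as the simulated-to-observed ratios of standard deviation and mean (Gupta et al. 2009); logNSE follows Eq. (4) as written, with the logarithm of the mean observed flow in the denominator. Zero flows are not handled.

```python
import numpy as np

def _as_arrays(q_obs, q_sim):
    return np.asarray(q_obs, dtype=float), np.asarray(q_sim, dtype=float)

def ve(q_obs, q_sim):
    """Volumetric Efficiency: 1 - sum|Qo - Qs| / sum(Qo)."""
    q_obs, q_sim = _as_arrays(q_obs, q_sim)
    return 1.0 - np.sum(np.abs(q_obs - q_sim)) / np.sum(q_obs)

def kge(q_obs, q_sim):
    """Kling-Gupta Efficiency with alpha = sigma_s/sigma_o, beta = mu_s/mu_o."""
    q_obs, q_sim = _as_arrays(q_obs, q_sim)
    r = np.corrcoef(q_obs, q_sim)[0, 1]
    alpha = np.std(q_sim) / np.std(q_obs)
    beta = np.mean(q_sim) / np.mean(q_obs)
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

def nse(q_obs, q_sim):
    """Nash-Sutcliffe Efficiency."""
    q_obs, q_sim = _as_arrays(q_obs, q_sim)
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

def log_nse(q_obs, q_sim):
    """NSE on natural-log flows, following Eq. (4) as written in the text."""
    q_obs, q_sim = _as_arrays(q_obs, q_sim)
    log_obs, log_sim = np.log(q_obs), np.log(q_sim)
    return 1.0 - np.sum((log_obs - log_sim) ** 2) / np.sum(
        (log_obs - np.log(q_obs.mean())) ** 2)

# Hypothetical flows (mm/day), for illustration only.
q_obs = np.array([1.0, 2.0, 4.0, 8.0])
q_sim = 1.1 * q_obs
scores = {"VE": ve(q_obs, q_sim), "KGE": kge(q_obs, q_sim),
          "NSE": nse(q_obs, q_sim), "logNSE": log_nse(q_obs, q_sim)}
```

All four measures equal 1 for a perfect simulation, which makes them convenient to combine in a multi-objective scheme where every objective is maximised.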

Here, an application of MIKA-SHA to identify the dominant runoff controls of the Rappahannock watershed in the United States is described. Due to the large spatial extent of the watershed, the spatial heterogeneity in catchment properties and climate variables is expected to be significant. In the current study, the SUPERFLEX library of MIKA-SHA is used with topography-based HRUs and soil-type-based HRUs. For simplicity, the spatial heterogeneities of bedrock geology and land use are considered minor due to the deep soil layers and large forest cover of the watershed, respectively.

3.1 Study Area

The Rappahannock River basin (USGS Station number: 1664000) at Remington, Virginia, United States (Fig. 1) is an Intermediate Scale Area (ISA) river basin with a drainage area of 1605 km² located in the southeastern quadrant of the United States. The watershed can be categorized as a forested catchment (fraction of forest: 0.75) with a humid climate (aridity index: 1.22). The mean elevation of the catchment is 216.1 m, and the mean slope is 30.33 m/km. Average daily runoff, precipitation, potential evaporation, and temperature values of the basin are 1.065 mm/day, 3.125 mm/day, 2.562 mm/day, and 12.4 °C, respectively. The catchment outlet is located in the southeastern corner of the catchment (Lat: 38.53068°, Long: -77.81360°).

Hydrometeorological data of the Rappahannock basin from 1/1/1998 to 31/12/2014 (17 years) are used for model spin-up (two years), model calibration (five years), model validation (five years), and model testing (five years). Four subcatchments are identified through the watershed delineation. The area percentages of four subcatchments are 32.1%, 22.4%, 25.4%, and 20.1% respectively. Further, the lengths from the subcatchment outlets to the catchment outlet along the main channel of four subcatchments are 0 km, 25.5 km, 22 km, and 22 km, respectively. Spatial variability of precipitation and temperature are considered and lumped at the subcatchment scale (separate time series for each subcatchment). For simplicity, spatial variability of potential evaporation is lumped at the catchment scale (single time series). Table 2 summarises details about the resolutions and sources of each hydrometeorological variable and catchment properties.
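The 17-year record can be partitioned as in the short Python sketch below. The exact boundary dates are an assumption: the text gives only the period lengths, and the sketch assumes the four periods run consecutively in the order listed (spin-up, calibration, validation, testing). This ordering is consistent with the example years 2004, 2008 and 2013 reported for the calibration, validation and testing hydrographs.

```python
from datetime import date

# Assumed consecutive periods covering 1/1/1998 to 31/12/2014.
PERIODS = {
    "spin_up": (date(1998, 1, 1), date(1999, 12, 31)),
    "calibration": (date(2000, 1, 1), date(2004, 12, 31)),
    "validation": (date(2005, 1, 1), date(2009, 12, 31)),
    "testing": (date(2010, 1, 1), date(2014, 12, 31)),
}

def period_of(day):
    """Return the name of the period that contains the given day."""
    for name, (start, end) in PERIODS.items():
        if start <= day <= end:
            return name
    raise ValueError(f"{day} is outside the 1998-2014 record")

def n_days(name):
    """Number of daily time steps in a period (both end dates included)."""
    start, end = PERIODS[name]
    return (end - start).days + 1

total_days = sum(n_days(name) for name in PERIODS)
```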

Basin characteristics, forcing terms and streamflow data of the Rappahannock River Basin at Remington are included in both the CAMELS (Newman et al. 2015) and MOPEX (MOPEX 2021) datasets. Further, the Rappahannock River Basin at Fredericksburg (Basin ID: 1668000), which comprises the whole upper basin of the Rappahannock River (4130 km²), is also included in the MOPEX dataset. In this context, the study area used in the current study is a headwater subcatchment of the Rappahannock River Basin at Fredericksburg. Due to this data availability, both the Rappahannock River Basin at Remington (the study area of the current study) and the Rappahannock River Basin at Fredericksburg have been used in many research studies, including parameter regionalization (Ao et al. 2006; Bardossy et al. 2016), a priori parameter estimation (Duan et al. 2006), model comparison (Gan et al. 2006; Knoben et al. 2020), and automatic model induction (Spieler et al. 2020).

We avoid a direct comparison between the efficiency values of the rainfall-runoff models in the above-mentioned studies and those of the models learnt by MIKA-SHA, because the research objectives and modelling settings of those studies differ significantly from those of the present study. In contrast to the multi-objective optimization used in MIKA-SHA, all of the studies mentioned above use single-objective optimization based on either NSE or KGE. Most of the studies use lumped modelling instead of the semi-distributed modelling used in MIKA-SHA. However, comparing the research inferences of those studies with the inferences of the present study would be interesting.

Table 2

Details of hydrometeorological data and catchment properties
Data	Resolution/ Scale	Source
Precipitation (mm/day)	Subcatchment averaged	Daymet dataset (Daymet 2020)
Potential evaporation (mm/day)	Catchment averaged	CAMELS dataset (Newman et al. 2015)
Temperature (°C)	Subcatchment averaged	Daymet dataset (Daymet 2020)
Streamflow (mm/day)	Catchment averaged	CAMELS dataset (Newman et al. 2015)
Soil	1:250000	STATSGO2 soil data from USDA Natural Resources Conservation Service (NRCS) Web Soil Survey (WSS) (Web Soil Survey 2021)
Digital elevation	30 m	Shuttle Radar Topography Mission (SRTM) data from United States Geological Survey (USGS) EarthExplorer (USGS EarthExplorer 2020)

The spatial heterogeneity of the Rappahannock catchment in topography and soil types is considered and incorporated using HRUs. Under the topography-based HRU classification, three HRUs are selected, namely floodplain (slope position threshold = 0.1), hill (slope band percentage > 10), and plateau (slope band percentage ≤ 10). The spatial variability of topography-based HRUs is shown in Fig. 2. Based on the major map units of the STATSGO2 soil data, four HRUs are identified under the soil-type-based HRU classification. Each HRU may consist of one or more soil types (different soil types are separated using a "-"). The four HRUs are S1: Hayesville, S2: Myersville-Catoctin, S3: Occoquan-Meadowville-Buckhall, and S4: Worsham-Hazel-Culpeper. Figure 3 illustrates the spatial distribution of soil-type-based HRUs. Area percentages of each HRU under the topography- and soil-type-based classifications are given in Table 3.

Table 3

HRU details
HRU category	Subcatchment 1 (area %)	Subcatchment 2 (area %)	Subcatchment 3 (area %)	Subcatchment 4 (area %)
Topography-based
Hill	33.95	41.41	52.02	53.21
Floodplain	15.48	33.46	29.95	24.18
Plateau	50.57	25.13	18.03	22.61
Soil-type-based
S1	8.15	82.74	52.25	30.83
S2	36.73	10.48	33.58	26.57
S3	33.75	3.12	4.30	29.43
S4	21.37	3.66	9.86	13.18
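As a worked example of how the figures in Table 3 combine with the subcatchment area percentages given earlier (32.1%, 22.4%, 25.4% and 20.1%), the following Python sketch computes the catchment-wide area share of each HRU as an area-weighted sum. The variable and function names are hypothetical, and these catchment-wide shares are derived here for illustration; they are not reported in the study.

```python
import numpy as np

# Subcatchment areas as fractions of the whole catchment (from the text).
sub_fraction = np.array([32.1, 22.4, 25.4, 20.1]) / 100.0

# HRU area percentages within subcatchments 1 to 4 (Table 3).
topo_hru = {
    "Hill": [33.95, 41.41, 52.02, 53.21],
    "Floodplain": [15.48, 33.46, 29.95, 24.18],
    "Plateau": [50.57, 25.13, 18.03, 22.61],
}
soil_hru = {
    "S1": [8.15, 82.74, 52.25, 30.83],
    "S2": [36.73, 10.48, 33.58, 26.57],
    "S3": [33.75, 3.12, 4.30, 29.43],
    "S4": [21.37, 3.66, 9.86, 13.18],
}

def catchment_share(percent_by_subcatchment):
    """Area-weighted HRU share (%) over the whole catchment."""
    return float(np.dot(sub_fraction, percent_by_subcatchment))

topo_share = {name: catchment_share(p) for name, p in topo_hru.items()}
soil_share = {name: catchment_share(p) for name, p in soil_hru.items()}
```

With these inputs, hill areas account for roughly 44% of the catchment, and the shares within each classification sum to 100% up to the rounding of the tabulated values.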

3.2 Results

MIKA-SHA is applied to identify two optimal semi-distributed model configurations that represent the runoff dynamics of the Rappahannock catchment. One model uses the topography-based HRUs, and the other uses the soil-type-based HRUs. In each case, MIKA-SHA is run with the settings summarized in Table 4. Results obtained through the MIKA-SHA applications are presented in this section.

Table 4

MIKA-SHA settings
Option	Setting
Independent Runs	50
Size of population	2000
Termination criteria	Generation number = 50
The randomized method used for initialization	Ramped Half and half
Special functions/ Mathematical functions	SUPERFLEX, DISTRIBUTED/ +, -, /, *
Input variables	Precipitation, temperature, potential evaporation
Dependent variable	Streamflow
Number of objective functions used	4 (VE/ KGE/ NSE/ logNSE)
Normalized range of constants	0 to 1
Depth of parse trees- initial/ maximum	3/ 5
The mating pool selection strategy	Tournament selection with four competitors
Genetic operator probability: mutation Constant/ Tree/ Separation/ Node	0.5/0.5/0.3/0.3
Genetic operator probability: crossover	0.7
Count of CPUs used for parallel computation	40 units
Level of parallel computation	Performance evaluation level
Likelihood threshold - GLUE	NSE = 0.5
Behavioural models - GLUE	5000

3.2.1 Topography-based HRUs

The model configuration learnt by MIKA-SHA for the topography-based HRUs is illustrated in Fig. 4 (hereinafter referred to as MIKA-SHA_SUPERFLEX_TOPO). The hillside model structure of MIKA-SHA_SUPERFLEX_TOPO consists of two reservoirs (FR: a fast-reacting soil reservoir, SR: a slow-reacting soil reservoir) and one delay operator (a half-triangular lag function). One SR and two delay operators are included in the floodplain model structure. The plateau area model structure consists of three reservoirs (RR: a riparian reservoir, an FR and an SR) and two delay operators. The storage-discharge relationships of all three SRs of MIKA-SHA_SUPERFLEX_TOPO are based on the power function. The FR of the hillside model structure also has a power function based storage-discharge relationship. The storage-discharge relationships of the RR and FR of the plateau area model structure are linear.
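The difference between linear and power function storage-discharge relationships can be sketched in a few lines of Python. The functional form Q = k·S^α (with α = 1 giving the linear case), the parameter names and the simple explicit daily update are assumptions made for illustration; they are not taken from the induced model, and the actual SUPERFLEX components and numerics may differ.

```python
def reservoir_outflow(storage, k, alpha=1.0):
    """Outflow from a reservoir: Q = k * S**alpha (alpha = 1 is linear)."""
    return k * storage ** alpha

def simulate_reservoir(inflow, k, alpha=1.0, initial_storage=0.0):
    """Step a single reservoir through an inflow series.

    Uses a simple explicit update and never releases more water than
    is currently stored. Returns the outflow series and final storage.
    """
    storage = initial_storage
    outflow = []
    for p in inflow:
        storage += p
        q = min(reservoir_outflow(storage, k, alpha), storage)
        storage -= q
        outflow.append(q)
    return outflow, storage

# Hypothetical inflow pulse (mm/day) and parameter values.
inflow = [10.0, 0.0, 0.0, 0.0, 0.0]
q_linear, s_linear = simulate_reservoir(inflow, k=0.1, alpha=1.0)
q_power, s_power = simulate_reservoir(inflow, k=0.1, alpha=2.0)
```

With the same coefficient, the power function reservoir in this example drains quickly while storage is high and slowly as it empties, whereas the linear reservoir releases a fixed fraction of storage at every step.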

Figure 5 illustrates the simulated hydrographs of MIKA-SHA_SUPERFLEX_TOPO along with the observed hydrographs for one year (out of five) from each of the calibration (2004), validation (2008) and testing (2013) periods. A good visual match can be observed between the simulated and observed hydrographs of each period. The simulated flow duration curves (FDCs) of MIKA-SHA_SUPERFLEX_TOPO, along with the observed FDCs of the Rappahannock catchment, are given in Fig. 6. The simulated and observed FDCs of the calibration period follow each other closely. However, the simulated FDCs tend to deviate slightly from the observed FDCs in the validation and testing periods, especially in low flow regimes.
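A flow duration curve of the kind compared above can be built from a daily discharge series with the short Python sketch below. The Weibull plotting position used for the exceedance probability is an assumed choice; the plotting position used for the figures in this study is not stated.

```python
import numpy as np

def flow_duration_curve(flows):
    """Return (exceedance probability, flow) pairs for a discharge series.

    Flows are sorted from highest to lowest, and the exceedance
    probability of the flow with rank m out of n is m / (n + 1).
    """
    sorted_flows = np.sort(np.asarray(flows, dtype=float))[::-1]
    n = sorted_flows.size
    exceedance = np.arange(1, n + 1) / (n + 1)
    return exceedance, sorted_flows

# Hypothetical three-day series (mm/day), for illustration only.
exceedance, sorted_flows = flow_duration_curve([3.0, 1.0, 2.0])
```

Plotting the simulated and observed curves on a logarithmic flow axis makes deviations in the low flow regime, such as those noted above, easier to see.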

Efficiency values of MIKA-SHA_SUPERFLEX_TOPO are shown graphically in Fig. 7. In many hydrological modelling exercises, it is common to see a slight deterioration in performance over the validation and testing periods, because the discharge values of those periods are not used to train or calibrate the model. Similar behaviour can also be observed here. As MIKA-SHA uses the performance values of the validation period in the optimal model selection process, the testing performance values represent the out-of-sample performance of MIKA-SHA_SUPERFLEX_TOPO. Interestingly, except for KGE, the testing performances under the other three objective functions are higher than those in the validation period. The testing performance value of logNSE is even higher than that of the calibration period. This suggests that MIKA-SHA_SUPERFLEX_TOPO is not overfitted to its training data.

Uncertainty analysis of MIKA-SHA_SUPERFLEX_TOPO reveals that 75.1% of observed discharge values of the calibration period lie between the 90% uncertainty bounds. This high percentage value suggests that the parameter uncertainty of MIKA-SHA_SUPERFLEX_TOPO alone is sufficient to estimate the total uncertainty satisfactorily. By studying the shape of the sensitivity scatter plots, 18 (out of 34) model sensitive parameters are identified. Six of them are associated with the hillside model structure, while four are associated with the floodplain model structure. Plateau area model structure also has six model sensitive parameters. The remaining two are the lag parameters of MIKA-SHA_SUPERFLEX_TOPO.
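The coverage statistic quoted above (the percentage of observations falling inside the 90% uncertainty bounds) can be sketched in Python as follows. This is a generic GLUE-style calculation, not the code used in the study: the use of NSE values as likelihood weights, the 5% and 95% weighted quantiles and all names are assumptions. Only the NSE threshold of 0.5 is taken from Table 4.

```python
import numpy as np

def glue_bounds(simulations, likelihoods, threshold=0.5, lower=0.05, upper=0.95):
    """Likelihood-weighted uncertainty bounds from behavioural simulations.

    simulations: array of shape (n_models, n_steps)
    likelihoods: one likelihood value (here NSE) per model
    """
    sims = np.asarray(simulations, dtype=float)
    like = np.asarray(likelihoods, dtype=float)
    behavioural = like >= threshold          # discard non-behavioural models
    sims, weights = sims[behavioural], like[behavioural]
    weights = weights / weights.sum()
    n_models, n_steps = sims.shape
    lo = np.empty(n_steps)
    hi = np.empty(n_steps)
    for t in range(n_steps):
        order = np.argsort(sims[:, t])
        cdf = np.cumsum(weights[order])      # weighted empirical CDF
        values = sims[order, t]
        lo[t] = values[min(np.searchsorted(cdf, lower), n_models - 1)]
        hi[t] = values[min(np.searchsorted(cdf, upper), n_models - 1)]
    return lo, hi

def coverage_percent(q_obs, lo, hi):
    """Percentage of observations lying within the bounds."""
    q_obs = np.asarray(q_obs, dtype=float)
    return 100.0 * np.mean((q_obs >= lo) & (q_obs <= hi))

# Tiny hypothetical example: the third model is non-behavioural (NSE < 0.5).
simulations = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
lo, hi = glue_bounds(simulations, likelihoods=[0.9, 0.8, 0.2])
coverage = coverage_percent([2.0, 5.0], lo, hi)
```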

3.2.2 Soil-type-based HRUs

The optimal model learnt by MIKA-SHA under the soil-type-based HRU classification (hereinafter referred to as MIKA-SHA_SUPERFLEX_SOIL) is shown in Fig. 8. Although it is possible to induce much more complex model configurations under the soil-type-based HRU classification due to the higher number of HRUs, MIKA-SHA has identified a relatively simple model configuration. The S1 model structure consists of two reservoirs (an FR and an SR), while the other three model structures consist of only one reservoir (an SR). S3 and S4 share the same model architecture, although with different parameter values. The flow path above the SR in S1, S3 and S4 represents the portion of precipitation passed directly to either the FR or the total runoff. In contrast, the same link in S2 represents runoff generation through the infiltration excess mechanism. The storage-discharge relationships of all four SRs are based on the power function, whereas the storage-discharge relationship of the FR in the S1 model structure is linear.

Figure 9 demonstrates the simulated hydrographs of MIKA-SHA_SUPERFLEX_SOIL and the observed hydrographs of the catchment over one year per calibration, validation, and testing period. Again a good visual match can be observed between the simulated and observed discharge signatures. Simulated FDC of MIKA-SHA_SUPERFLEX_SOIL closely follows the observed FDC in the calibration period (Fig. 10). However, it tends to deviate slightly in the medium and low flow regimes in the validation period. In contrast, simulated FDC of the testing period varies slightly only in the medium flow regime.

Efficiency values of MIKA-SHA_SUPERFLEX_SOIL are illustrated in Fig. 11. As observed with MIKA-SHA_SUPERFLEX_TOPO, slight deteriorations in model performance can be noted in the validation and testing periods compared to the calibration period. However, on two occasions (with VE and logNSE), the testing performance is superior to the validation performance. Similar to MIKA-SHA_SUPERFLEX_TOPO, the logNSE value of the testing period is higher than that of the calibration period. Therefore, we do not expect overfitting issues here either. However, apart from logNSE, the range between the calibration and testing performance values is larger for MIKA-SHA_SUPERFLEX_SOIL than for MIKA-SHA_SUPERFLEX_TOPO.

As per the uncertainty analysis of MIKA-SHA_SUPERFLEX_SOIL, 75.4% of observed discharge values of the calibration period fall between the 90% uncertainty bands. Therefore, the parameter uncertainty of MIKA-SHA_SUPERFLEX_SOIL alone is capable of estimating the total uncertainty satisfactorily. Sensitivity scatter plots of model parameters reveal that 10 model parameters out of 31 total model parameters are model sensitive parameters (S1–2, S2–3, S3–2, S4–1, and two lag parameters).

3.2.3 Topography vs. Soil-type

Visual inspection of hydrographs and FDCs is not sufficient to differentiate the performance of MIKA-SHA_SUPERFLEX_TOPO from that of MIKA-SHA_SUPERFLEX_SOIL. Hence, Fig. 12 graphically illustrates the performance difference in terms of overall efficiency values (one efficiency value over the calibration, validation and testing periods) under each objective function. From this graph, it is evident that MIKA-SHA_SUPERFLEX_TOPO dominates MIKA-SHA_SUPERFLEX_SOIL in terms of all four performance measures (i.e. MIKA-SHA_SUPERFLEX_TOPO is Pareto-optimal). The difference between the efficiency values is highest for KGE and lowest for logNSE. On this basis, it appears that the runoff dynamics of the Rappahannock catchment are controlled more by the topography than by the soil type of the area.
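The dominance comparison used here can be expressed as a small Python check. The efficiency values in the example are placeholders chosen only to illustrate the test; they are not the values plotted in Fig. 12, and the function name is hypothetical.

```python
def dominates(a, b):
    """True if model a Pareto-dominates model b when all objectives are maximised.

    a must be at least as good as b on every objective and strictly
    better on at least one.
    """
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

# Placeholder (VE, KGE, NSE, logNSE) tuples, NOT results from this study.
model_topo = (0.70, 0.80, 0.65, 0.72)
model_soil = (0.68, 0.74, 0.62, 0.71)

topo_dominates_soil = dominates(model_topo, model_soil)
soil_dominates_topo = dominates(model_soil, model_topo)
```

When one of two candidate models dominates the other in this sense, the choice between them does not depend on how the four objectives are weighted.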

3.2.4 Previous Research Findings

According to the results of the large-scale study on model structural uncertainty conducted by Knoben et al. (2020), out of 36 conceptual rainfall-runoff models, the Xinanjiang model (Zhao 1992) performed best for the Rappahannock River Basin at Remington in terms of the KGE values of the calibration period. Six different SUPERFLEX framework based conceptual models were also included among the 36 models used in that study. Among those six models, the Hillslope, FLEX-Topo model (Savenije 2010) outperformed the remaining FLEX models. Although that study was a lumped modelling exercise, it is interesting to note that both the Xinanjiang model and the Hillslope, FLEX-Topo model are based on the hypothesis that topography drives runoff generation. In another study (Bardossy et al. 2016), out of three lumped conceptual rainfall-runoff models (HYMOD, HBV and Xinanjiang), the HBV model (Lindström et al. 1997) outperformed the other two in terms of the calibrated NSE values for the Rappahannock River Basin at Remington. In contrast to the other two models, runoff estimation in the HBV model is based on the power function. Interestingly, most of the reservoirs of MIKA-SHA_SUPERFLEX_TOPO and MIKA-SHA_SUPERFLEX_SOIL also use a power function to represent the storage-discharge relationships.

Block-wise TOPMODEL was used by Ao et al. (2006) to relate model parameters to the physical characteristics of 10 United States watersheds, including the Rappahannock River Basin at Fredericksburg. As per the calibrated NSE values, the Rappahannock River Basin achieved a higher efficiency than the 10-basin average. Because spatial heterogeneity was incorporated in that study on the basis of topography, this may indicate that topography has a stronger influence on runoff generation in this catchment than in the other catchments. Both Duan et al. (2006) and Gan et al. (2006) reported that simpler conceptual rainfall-runoff models perform better than complex physics-based land surface models in simulating the runoff responses of 12 MOPEX catchments, including the Rappahannock River Basin at Fredericksburg. Similarly, our MIKA-SHA results show relatively simple models performing better than more complex models in runoff prediction. Additionally, in terms of the number of reservoir units, the three model configurations of MIKA-SHA_SUPERFLEX_TOPO, which represent the hill, floodplain and plateau areas of the Rappahannock basin, match the hillslope, wetland and plateau model configurations of the FLEX-Topo model (hillslope and plateau: two reservoirs, wetland: one reservoir), which were defined based on expert knowledge.

The importance of using a multi-objective performance criterion to evaluate model performance in hydrological modelling can be highlighted through the results of Bardossy et al. (2016) and Knoben et al. (2020). Although their time frames differ, both studies used single-objective optimization, the same spatial extent, and 10-year calibration periods. In terms of calibrated NSE values, Bardossy et al. (2016) report the performance order in predicting the runoff response of the Rappahannock River Basin at Remington, from best to worst, as HBV, HYMOD and then the Xinanjiang model. However, in terms of the calibrated KGE values reported in Knoben et al. (2020), the order is reversed (the Xinanjiang model performs best). This indicates that, under single-objective optimization, a model that performs well only for a specific flow regime can often be identified as the optimal model, depending on the sensitivity of the selected objective function. For example, the popular NSE value is highly sensitive to large discharge values (Gupta et al. 2009). This was the main reason for using a multi-objective optimization framework within MIKA-SHA.
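
The sensitivity argument above can be illustrated with a short sketch, assuming the standard definitions of NSE (Nash and Sutcliffe 1970) and KGE (Gupta et al. 2009) and a log-transformed NSE for low flows. The discharge series are synthetic placeholders, not data from the Rappahannock basin, and the small offset `eps` in `log_nse` is an assumption added to avoid taking the logarithm of zero.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency; 1 is a perfect fit."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def log_nse(obs, sim, eps=1e-6):
    """NSE on log-transformed flows, which emphasises low flows."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return nse(np.log(obs + eps), np.log(sim + eps))


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability and bias ratios."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Synthetic series: one flood peak surrounded by low flows.
obs = [1, 1, 2, 10, 4, 2, 1, 1]
sim_a = [2, 2, 3, 10, 5, 3, 2, 2]  # matches the peak, overestimates low flows
sim_b = [1, 1, 2, 6, 4, 2, 1, 1]   # matches low flows, underestimates the peak

for name, sim in (("A", sim_a), ("B", sim_b)):
    print(name, round(nse(obs, sim), 3), round(log_nse(obs, sim), 3),
          round(kge(obs, sim), 3))
```

In this toy case NSE ranks simulation A above B, whereas logNSE ranks B above A, so the "best" model depends on which single objective is optimized.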

4 Discussion

4.1 MIKA-SHA_SUPERFLEX_TOPO

According to the calibrated model parameters, the topmost flow pathway of each model structure (hill, floodplain and plateau) of MIKA-SHA_SUPERFLEX_TOPO contributes little to the total model outflow (hill: 1.7% of P, floodplain: 2% of P and plateau: 0.5% of P). In this respect, the hillside and plateau area model structures are quite similar. However, the percentage of input precipitation that passes through the FR is significantly higher in the hillside model structure (hill: 36.9% of P and plateau: 7.3% of P). Furthermore, there is a lag function associated with the FR of the plateau area model structure. Hence, a quicker runoff response to the forcing terms can be expected from the hillside model structure than from the plateau area model structure. On the other hand, the calibrated parameter values (power coefficient: hill = 3, plateau = 4.39; percentage of input P that passes through the SR: hill = 61.4%, plateau = 92.2%) suggest that the baseflow (outflow from the SR) of the plateau area model structure is significantly higher than that of the hillside model structure. We find these behaviours of the hillside and plateau area model structures to be reasonable: a quick runoff response may be expected on hillsides due to their steeper slopes, while a delayed, more subsurface-oriented response may be expected in plateau areas due to their milder slopes, which may result in longer residence times (water may have more time to reach deeper soil layers).
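
As a quick arithmetic check on the percentages quoted above, the following sketch takes the three reported shares of P for the hill and plateau structures at face value and treats them as a partition of the input precipitation. That reading is an assumption; the floodplain structure is omitted because only its topmost share is given here.

```python
# Shares of input precipitation P (in %) quoted in the text.
shares = {
    "hill": {"topmost": 1.7, "FR": 36.9, "SR": 61.4},
    "plateau": {"topmost": 0.5, "FR": 7.3, "SR": 92.2},
}

# Sum of the three shares per structure, rounded to one decimal place.
totals = {hru: round(sum(parts.values()), 1) for hru, parts in shares.items()}

for hru, parts in shares.items():
    slow_to_fast = parts["SR"] / parts["FR"]  # slow share relative to fast share
    print(f"{hru}: total = {totals[hru]}% of P, SR/FR = {slow_to_fast:.1f}")
```

Under this reading, the shares of each structure add up to 100% of P, and the slow pathway outweighs the fast pathway far more strongly in the plateau structure than in the hill structure.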

The floodplain model structure of MIKA-SHA_SUPERFLEX_TOPO has a relatively simple model architecture with only one reservoir. The floodplain area is expected to be saturated or nearly saturated and continuously connected with the stream. In an earlier SUPERFLEX application (Fenicia et al. 2016), where the model selection for each HRU was based on expert judgement, a simple linear reservoir model was similarly identified as sufficient to represent the quick runoff responses of riparian zones. Consistent with this, in the current application MIKA-SHA also identified a simple model with one reservoir to capture the runoff dynamics of the floodplains. However, the SR in the floodplain model structure has a power function relationship between its discharge and storage.

As mentioned earlier, all three SRs of MIKA-SHA_SUPERFLEX_TOPO have nonlinear discharge-storage relationships. This may suggest that the runoff response of the Rappahannock catchment is mainly nonlinear. Further, the inclusion of stable baseflow components in the model configuration of MIKA-SHA_SUPERFLEX_TOPO is reasonable because the main river channel of the basin can be categorized as a perennial river, where a continuous groundwater supply is required to sustain flow throughout the year.
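
To illustrate what a nonlinear discharge-storage relationship implies, the sketch below simulates the recession of a single reservoir whose outflow follows a power function, Q = k S^alpha, and compares it with a linear reservoir (alpha = 1). The explicit-Euler time stepping, the unit-less storage, and the values of `k` and `s0` are assumptions made for illustration only; alpha = 3 echoes the power coefficient reported for the hill structure but is not tied to its calibrated rate constant.

```python
def simulate_reservoir(s0, k, alpha, n_steps, dt=1.0, inflow=0.0):
    """Explicit-Euler simulation of dS/dt = inflow - k * S**alpha.

    Returns the outflow at each time step.
    """
    storage, flows = s0, []
    for _ in range(n_steps):
        q = k * storage ** alpha
        q = min(q, storage / dt + inflow)  # never release more water than available
        storage += (inflow - q) * dt
        flows.append(q)
    return flows


def recession_ratios(flows):
    """Ratio of each outflow to the previous one during a recession."""
    return [later / earlier for earlier, later in zip(flows, flows[1:])]


# Placeholder parameters; storage is treated as dimensionless here.
linear = simulate_reservoir(s0=1.0, k=0.1, alpha=1.0, n_steps=6)
power = simulate_reservoir(s0=1.0, k=0.1, alpha=3.0, n_steps=6)

print("linear:", [round(r, 3) for r in recession_ratios(linear)])
print("power: ", [round(r, 3) for r in recession_ratios(power)])
```

The linear reservoir recedes at a constant ratio between consecutive time steps, whereas the power-function reservoir recedes quickly at high storage and progressively more slowly as storage drops, which is one way a single reservoir can produce both a responsive peak and a sustained baseflow.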

4.2 MIKA-SHA_SUPERFLEX_SOIL

Although an FR is included in the S1 model structure of MIKA-SHA_SUPERFLEX_SOIL, the percentage of precipitation that passes through the FR (0.8%) is very low. Further, all SRs of MIKA-SHA_SUPERFLEX_SOIL have nonlinear storage-discharge relationships. Therefore, all four model structures of MIKA-SHA_SUPERFLEX_SOIL share relatively similar model architectures. Interestingly, most of the soil types in the four HRUs also differ little from each other in terms of their drainage properties, such as moderate/moderately rapid permeability, good drainage, deep/very deep soil layers, and medium surface runoff (Official Soil Series Descriptions 2020).

As mentioned earlier, the middle link of the S2 model structure represents runoff generation through the infiltration excess mechanism. Surface runoff components like infiltration excess overland flow can be expected in the S2 HRU, as the two soil types present in it (Myersville and Catoctin) may exhibit very rapid and moderately rapid surface runoff responses (Official Soil Series Descriptions 2020). The S3 and S4 model structures share the same model architecture (but with different model parameters). However, as per the simulated discharge values of the calibration period, the outflow of the S4 model structure is considerably higher than the outflow of the S3 model structure (i.e. the moisture storage is higher in the S3 model structure, as the evaporation losses are approximately the same in both model structures). One possible reason for this is the inclusion of the Worsham soil type in the S4 HRU, which is categorized as a soil type with very slow permeability (Official Soil Series Descriptions 2020) and may therefore generate high runoff and low storage.

4.3 Relative Dominance

Studies such as Fenicia et al. (2016) and Molin et al. (2020) use the predictive capabilities of SUPERFLEX models to understand the dominant runoff controls of the catchment of interest. In these studies, the SUPERFLEX model structure for each HRU was selected based on expert judgement. However, expert knowledge may not always be available. Further, one can argue that a pre-defined model structure may be biased towards a particular runoff control. In contrast, MIKA-SHA assumes no model structure a priori. Building an appropriate model architecture is part of MIKA-SHA's optimization framework. Additionally, MIKA-SHA builds and tests many hypotheses about the runoff dynamics before identifying an optimal model. This objective nature of MIKA-SHA makes it highly appropriate for identifying the relative dominance of runoff controls.

As discussed in Sect. 3.2.3, the optimal model identified based on topography-based HRU classification (MIKA-SHA_SUPERFLEX_TOPO) outperforms the optimal model identified based on soil-type-based HRU classification (MIKA-SHA_SUPERFLEX_SOIL) in terms of predictive capabilities. Hence, it may be safe to assume that the topography of the area primarily controls the hydrological response of the Rappahannock catchment. This appears logical given the landscape characteristics of the catchment, such as the high mean slope (30.33 m/km) and the similar drainage properties of the different soil types.

It is possible to run MIKA-SHA with HRUs defined based on other landscape types, such as bedrock lithology, land use and vegetation, or on combinations of different landscape types. In this way, one can identify the relative dominance of each landscape type in the total runoff response of the catchment.

4.4 MIKA-SHA's Model Induction Capability

MIKA-SHA quantitatively identifies the model structure best supported by the measured data as the optimal model (structure + parameters) for the catchment of interest. As the current study results show, the two optimal models identified based on topography- and soil-type-based HRUs demonstrate a logical match between their structural components and the catchment properties (a possible indication that the models perform well for the right reasons). Further, MIKA-SHA shows consistency in its findings even with different HRUs. For example, the nonlinear runoff response of the catchment was captured by both optimal models. This illustrates the capability of MIKA-SHA to mine knowledge from data, and hydrologists can rely on MIKA-SHA findings with more than just statistical confidence.

MIKA-SHA assigns an independent model structure to each HRU, following the semi-distributed modelling paradigm. Much more complex model configurations are possible with soil-type-based HRU classification due to the higher number of HRUs used. This additional model complexity might help to fit the model more closely to its training data. Nevertheless, the complex models induced under soil-type-based HRU classification could not outperform the topography-based models in terms of prediction capabilities. This serves as a clear indication of a topography-dominated hydrological response of the catchment.

5 Conclusions

The results of this study validate the potential of MIKA-SHA in identifying dominant runoff processes. The readily interpretable nature of MIKA-SHA induced models helps hydrologists to better understand the runoff dynamics of the watershed of interest. Regardless of the type of HRUs used, MIKA-SHA identifies an optimal model to represent catchment dynamics based on measured data. Both optimal models identified in the current study show a logical match between their model configurations and catchment properties. Further, the MIKA-SHA optimal models are consistent in their findings even between different HRUs.

The automated and quantitative approach used within the framework makes it well suited to comparative hydrological applications, such as identifying the relative dominance of runoff controls. Based on the prediction capabilities of the two rainfall-runoff models identified using topography-based and soil-type-based HRUs, in this case, the topography-based model outperformed the soil-type-based model. Hence, topography is recognized as the dominant runoff driver for the Rappahannock basin. We believe the approach used here can be easily extended towards catchment classification studies based on dominant runoff controls and can be especially useful in situations where experimental insights and expert knowledge are lacking. Also, the approach is expected to be an effective, cheap and quick alternative to the more expensive and time-consuming experimental investigations required to identify dominant runoff drivers.

Declarations

Ethics approval - Not applicable
Consent to participate - Not applicable
Consent for publication - Not applicable
Author contributions - All authors contributed to the design of the methodology. The first draft of the manuscript was prepared by H. M. V. V. Herath and reviewed by V. Babovic. Model simulations were performed by H. M. V. V. Herath and supervised by J. Chadalawada and V. Babovic. All authors read and approved the final manuscript.
Funding - Not applicable
Conflicts of interest - The authors declare no conflict of interest.
Availability of data and material - Data used in this study are publicly available through the references given.
Code availability - Code of MIKA-SHA is not yet publicly available. All the details of MIKA-SHA are presented in https://doi.org/10.5194/hess-25-4373-2021

References

Addor N, Melsen LA (2019) Legacy, rather than adequacy, drives the selection of hydrological models. Water Resour Res 55:378–390. doi:10.1029/2018wr022958
Ao T, Ishidaira H, Takeuchi K, Kiem AS, Yoshitari J, Fukami K, Magome J (2006) Relating BTOPMC model parameters to physical features of MOPEX basins. J Hydrol 320:84–102. doi:10.1016/j.jhydrol.2005.07.006
Arnold JG, Srinivasan R, Muttiah RS, Williams JR (1998) Large area hydrologic modeling and assessment part I: model development. J Am Water Resour Assoc 34:73–89. doi:10.1111/j.1752-1688.1998.tb05961.x
Babovic V (2005) Data mining in hydrology. Hydrol Process 19:1511–1515. doi:10.1002/hyp.5862
Babovic V (2009) Introducing knowledge into learning based on genetic programming. J Hydroinformatics 11:181–193. doi:10.2166/hydro.2009.041
Beven K (2020) Deep learning, hydrological processes and the uniqueness of place. Hydrol Process 34:3608–3613. doi:10.22541/au.158921737.74476942
Beven KJ, Lamb R, Quinn P, Romanowicz R, Freer J (1995) TOPMODEL. In: Singh VP (ed) Computer models of watershed hydrology. Water Resources Publications Highlands Ranch, pp 627–668
Chadalawada J, Babovic V (2017) Review and comparison of performance indices for automatic model induction. J Hydroinformatics 21:13–31. doi:10.2166/hydro.2017.078
Chadalawada J, Herath HMVV, Babovic V (2020) Hydrologically Informed Machine Learning for Rainfall-Runoff Modeling: A Genetic Programming‐Based Toolkit for Automatic Model Induction. Water Resour Res 56: e2019WR026933. doi:10.1029/2019wr026933
Clark MP, Slater AG, Rupp DE, Woods RA, Vrugt JA, Gupta HV, Wagener T, Hay LE (2008) Framework for Understanding Structural Errors (FUSE): A modular framework to diagnose differences between hydrological models. Water Resour Res 44. doi:10.1029/2007wr006735
Condon LE, Maxwell RM (2015) Evaluating the relationship between topography and groundwater using outputs from a continental-scale integrated hydrology model. Water Resour Res 51:6602–6621. doi:10.1002/2014wr016774
Criss RE, Winston WE (2008) Do Nash values have value? Discussion and alternate proposals. Hydrol Process 22:2723–2725. doi:10.1002/hyp.7072
Daymet. https://daymet.ornl.gov/. Accessed 20 March 2020
Devito K, Creed I, Gan T, Mendoza C, Petrone R, Silins U, Smerdon B (2005) A framework for broad-scale classification of hydrologic response units on the Boreal Plain: Is topography the last thing to consider? Hydrol Process 19:1705–1714. doi:10.1002/hyp.5881
Duan Q, Schaake J, Andréassian V, Franks S, Goteti G, Gupta HV, Gusev YM, Habets F, Hall A, Hay L, Hogue T (2006) Model Parameter Estimation Experiment (MOPEX): An overview of science strategy and major results from the second and third workshops. J Hydrol 320:3–17. doi:10.1016/j.jhydrol.2005.07.031
Fenicia F, Kavetski D, Savenije HH (2011) Elements of a flexible approach for conceptual hydrological modeling: 1. Motivation and theoretical development. Water Resour Res 47. doi:10.1029/2010wr010174
Fenicia F, Kavetski D, Savenije HH, Pfister L (2016) From spatially variable streamflow to distributed hydrological models: Analysis of key modeling decisions. Water Resour Res 52:954–989. doi:10.1002/2015wr017398
Gan TY, Gusev Y, Burges SJ, Nasonova O, Andréassian V, Hall A, Chahinian N, Schaake J (2006) Performance comparison of a complex physics-based land surface model and a conceptual, lumped-parameter hydrological model at the basin-scale. IAHS Publ 307:196
Gao H, Hrachowitz M, Fenicia F, Gharari S, Savenije HHG (2014a) Testing the realism of a topography-driven model (FLEX-Topo) in the nested catchments of the Upper Heihe, China. Hydrol Earth Syst Sci 18:1895–1915. doi:10.5194/hess-18-1895-2014
Gao H, Hrachowitz M, Schymanski SJ, Fenicia F, Sriwongsitanon N, Savenije HHG (2014b) Climate controls how ecosystems size the root zone storage capacity at catchment scale. Geophys Res Lett 41:7916–7923. doi:10.1002/2014gl061668
Gupta HV, Kling H, Yilmaz KK, Martinez GF (2009) Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. J Hydrol 377:80–91. doi:10.1016/j.jhydrol.2009.08.003
Gutiérrez-Jurado KY, Partington D, Batelaan O, Cook P, Shanafield M (2019) What triggers streamflow for intermittent rivers and ephemeral streams in low‐gradient catchments in Mediterranean climates. Water Resour Res 55:9926–9946. doi:10.1029/2019wr025041
Herath HMVV, Chadalawada J, Babovic V (2021a) Genetic programming for hydrological applications: to model or forecast this is the question. J Hydroinformatics 23:740–763. doi:10.2166/hydro.2021.179
Herath HMVV, Chadalawada J, Babovic V (2021b) Hydrologically Informed Machine Learning for Rainfall-Runoff Modelling: Towards Distributed Modelling. Hydrol Earth Syst Sci 25:4373–4401. https://doi.org/10.5194/hess-25-4373-2021
Jencso KG, McGlynn BL (2011) Hierarchical controls on runoff generation: Topographically driven hydrologic connectivity, geology, and vegetation. Water Resour Res 47. doi:10.1029/2011wr010666
Karpatne A, Atluri G, Faghmous JH, Steinbach M, Banerjee A, Ganguly A, Shekhar S, Samatova N, Kumar V (2017) Theory-guided data science: A new paradigm for scientific discovery from data. IEEE Trans Knowl Data Eng 29:2318–2331. doi:10.1109/tkde.2017.2720168
Kavetski D, Fenicia F (2011) Elements of a flexible approach for conceptual hydrological modeling: 2. Application and experimental insights. Water Resour Res 47. doi:10.1029/2011wr010748
Knoben WJM, Freer JE, Peel MC, Fowler KJA, Woods RA (2020) A brief analysis of conceptual model structure uncertainty using 36 models and 559 catchments. Water Resour Res 56:e2019WR025975. doi:10.1029/2019wr025975
Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection, vol 1. MIT Press, Cambridge
Krause P, Boyle DP, Bäse F (2005) Comparison of different efficiency criteria for hydrological model assessment. Adv Geosci 5:89–97. doi:10.5194/adgeo-5-89-2005
Kuentz A, Arheimer B, Hundecha Y, Wagener T (2017) Understanding hydrologic variability across Europe through catchment classification. Hydrol Earth Syst Sci 21:2863–2879. doi:10.5194/hess-21-2863-2017
Lindström G, Johansson B, Persson M, Gardelin M, Bergström S (1997) Development and test of the distributed HBV-96 hydrological model. J Hydrol 201:272–288. doi:10.1016/s0022-1694(97)00041-3
McGuire KJ, McDonnell JJ, Weiler M, Kendall C, McGlynn BL, Welker JM, Seibert J (2005) The role of topography on catchment-scale water residence time. Water Resour Res 41. doi:10.1029/2004wr003657
Meshgi A, Schmitter P, Chui TFM, Babovic V (2015) Development of a modular streamflow model to quantify runoff contributions from different land uses in tropical urban environments using genetic programming. J Hydrol 525:711–723. doi:10.1016/j.jhydrol.2015.04.032
Molin MD, Schirmer M, Zappa M, Fenicia F (2020) Understanding dominant controls on streamflow spatial variability to set up a semi-distributed hydrological model: the case study of the Thur catchment. Hydrol Earth Syst Sci 24:1319–1345. doi:10.5194/hess-24-1319-2020
MOPEX Model Parameter Estimation Experiment. https://www.nws.noaa.gov/ohd/mopex/mo_datasets.htm. Accessed 12 February 2021
Nash JE, Sutcliffe JV (1970) River flow forecasting through conceptual models part I - A discussion of principles. J Hydrol 10:282–290. doi:10.1016/0022-1694(70)90255-6
Nearing GS, Kratzert F, Sampson AK, Pelissier CS, Klotz D, Frame JM, Prieto C, Gupta HV (2020) What role does hydrological science play in the age of machine learning? Water Resour Res e2020WR028091. doi:10.31223/osf.io/3sx6g
Newman AJ, Clark MP, Sampson K, Wood A, Hay LE, Bock A, Viger RJ, Blodgett D, Brekke L, Arnold JR, Hopson T (2015) Development of a large-sample watershed-scale hydrometeorological data set for the contiguous USA: data set characteristics and assessment of regional variability in hydrologic model performance. Hydrol Earth Syst Sci 19:209–223. doi:10.5194/hess-19-209-2015
Official Soil Series Descriptions. https://soilseries.sc.egov.usda.gov/. Accessed 29 July 2020
Onda Y, Komatsu Y, Tsujimura M, Fujihara JI (2001) The role of subsurface runoff through bedrock on storm flow generation. Hydrol Process 15:1693–1706. doi:10.1002/hyp.234
Physics Informed Machine Learning Conference (2016, January), Santa Fe, New Mexico, USA
Price K (2011) Effects of watershed topography, soils, land use, and climate on baseflow hydrology in humid regions: A review. Prog Phys Geogr 35:465–492. doi:10.1177/0309133311402714
Savenije HH (2010) HESS Opinions "Topography driven conceptual modelling (FLEX-Topo)". Hydrol Earth Syst Sci 14:2681–2692. doi:10.5194/hess-14-2681-2010
Schröder B (2006) Pattern, process, and function in landscape ecology and catchment hydrology–how can quantitative landscape ecology support predictions in ungauged basins? Hydrol Earth Syst Sci 10:967–979. doi:10.5194/hess-10-967-2006
Sellars SL (2018) "Grand challenges" in big data and the Earth sciences. Bull Am Meteor Soc 99:ES95–ES98. doi:10.1175/bams-d-17-0304.1
Shen C, Laloy E, Elshorbagy A, Albert A, Bales J, Chang FJ, Ganguly S, Hsu KL, Kifer D, Fang Z, Fang K (2018) HESS Opinions: Incubating deep-learning-powered hydrologic science advances as a community. Hydrol Earth Syst Sci 22:5639–5656. doi:10.5194/hess-22-5639-2018
Spieler D, Mai J, Craig JR, Tolson BA, Schütze N (2020) Automatic model structure identification for conceptual hydrologic models. Water Resour Res 56:e2019WR027009. doi:10.1029/2019wr027009
Sugawara M (1979) Automatic calibration of the tank model/L'étalonnage automatique d'un modèle à cisterne. Hydrol Sci J 24:375–388. doi:10.1080/02626667909491876
USGS EarthExplorer. https://earthexplorer.usgs.gov/. Accessed 20 March 2020
Vannier O, Anquetin S, Braud I (2016) Investigating the role of geology in the hydrological response of Mediterranean catchments prone to flash-floods: Regional modelling study and process understanding. J Hydrol 541:158–172. doi:10.1016/j.jhydrol.2016.04.001
Wang D, Cai X (2010) Comparative study of climate and human impacts on seasonal baseflow in urban and agricultural watersheds. Geophys Res Lett 37. doi:10.1029/2009gl041879
Web Soil Survey (2021) https://websoilsurvey.sc.egov.usda.gov/App/HomePage.htm. Accessed 26 January 2021
Woods RA, Sivapalan M, Robinson JS (1997) Modeling the spatial variability of subsurface runoff using a topographic index. Water Resour Res 33:1061–1073. doi:10.1029/97wr00232
Zhao RJ (1992) The Xinanjiang model applied in China. J Hydrol 135:371–381. doi:10.1016/0022-1694(92)90096-e

Editorial decision: Major revisions (30 Aug, 2022)
Reviews received at journal: 19 Oct, 2021
Editor assigned by journal: 05 Sep, 2021
First submitted to journal: 04 Sep, 2021