How does the coupling of real-world policies with optimization models expand the practicality of solutions in reservoir operation problems?

This study compares how different formulations of a reservoir operation problem with conflicting objectives affect the quality of the generated solution set. Six models were developed for comparative analysis: three using dynamic programming (DP) and three using the evolutionary multi-objective direct policy search (EMODPS) algorithm. Afterward, to improve the quality of the generated solution set, an EMODPS model was selected and coupled with the zone-based hedging policy that is currently applied in real-world reservoir operations. The solutions generated by each model were then evaluated regarding proximity to the ideal point and three prominent performance indices (risk, resiliency, and vulnerability). The proposed methodology was applied to Boryeong Dam, a multi-purpose reservoir in South Korea that recently suffered a multi-year drought. Consequently, the solution sets from the EMODPS model yielded closer results than those of the stochastic DP model for optimality and diversity. Although the solutions from the algorithm performed better than the actual operation results under normal conditions, the actual operations executed based on the zone-based hedging rule outperformed the other two in the case of droughts. Among the EMODPS models, the one with the fewest parameters, the EMODPS-Gaussian model, resulted in better solutions for all cases. Finally, coupling the real-world policy with the optimally derived solutions in the case of droughts improved the frequency, duration, and magnitude of failures from the water suppliers' perspective, whereas the water users experienced an improvement in magnitude at the expense of more frequent failures.


Introduction
Reservoirs have been widely adopted as the principal source of freshwater for various uses including irrigation, municipal, industrial, hydropower generation, and flood mitigation (Ahmad et al., 2014). Nevertheless, the recent surge in the frequency and magnitude of erratic hydro-climatic variability and its consequences (Loucks, 2000; Salazar et al., 2016) has forced traditional reservoir management strategies to be reformed. In particular, the inherently complex nature of water management problems, arising from conflicting interests and the distinct risk attitudes of the stakeholder groups involved, has compelled advancements in reservoir operation techniques (Castelletti et al., 2008; Quinn et al., 2019).
During the last half-century, the derivation of optimal reservoir control policies using emerging technologies has gained much popularity and has become one of the key topics in the field of water resources management (Yakowitz, 1982; Yeh, 1985; Klemeš, 1987; Labadie, 2004). Many optimization techniques, including linear programming (Belaineh et al., 1999), nonlinear programming (Birhanu et al., 2014; Li and Huang, 2008), and dynamic programming (DP) (Stedinger et al., 1984; Kim and Palmer, 1997), have thus far been adopted as primary problem-solving tools by water resources engineers.
Algorithms belonging to the DP family, in particular, are counted among the most prominent options applicable to the majority of real-world optimization problems (Stedinger et al., 1984; Kim and Palmer, 1997; Kim et al., 2007; Castelletti et al., 2008). Provided that the system dynamics of the target problem are well known (Giuliani et al., 2016a), DP transforms the convoluted original problem into a sequence of relatively simpler problems. By discretizing the state and decision spaces, DP provides release decisions corresponding to every possible combination of the state variables (Kelman et al., 1990). Since this procedure can be applied to objective functions and constraints of any shape, DP and its variations (Stedinger et al., 1984; Mujumdar and Nirmala, 2007; Alaya et al., 2003; Saadat and Asghari, 2019) have been applied continuously to countless reservoir optimization problems.
Despite the wide applicability of DP, the approximation procedures that are compulsory in the DP framework may cause decision biases that potentially limit optimal decision making (Kasprzyk et al., 2009). Thus, continuous efforts have been made to develop optimization techniques that derive more precise solutions to multi-objective problems involving conflicting trade-offs (Haimes and Hall, 1977). One of the main approaches in this context is the adoption of multi-objective evolutionary algorithms (MOEAs) (Kollat and Reed, 2006; Javadi et al., 2015; Liu et al., 2019), in which the algorithm aims to tackle the target problem with the fewest possible simplifications. Specifically, the evolutionary multi-objective direct policy search (EMODPS) framework (Giuliani et al., 2016a), which combines direct policy search (DPS) (Rosenstein and Barto, 2001), nonlinear approximating networks, and MOEAs into a simulation-based optimization algorithm, has proved more effective in yielding solutions to certain problems (Quinn et al., 2019; Salazar et al., 2017). Since solutions generated from EMODPS may yield more flexible decisions under more varied circumstances, EMODPS is being employed in complex multi-objective applications where obtaining reference optimal solutions is relatively demanding (Giuliani et al., 2016a).
Although operating policies derived via optimization schemes such as DP and EMODPS may guarantee theoretically optimal alternatives, several factors have hindered their implementation in real-world reservoir operations (Quinn et al., 2019; Whateley et al., 2015), including the extremely risk-averse nature of decision makers (Watkins Jr and McKinney, 1997), administrative or legal constraints, and the lack of expert knowledge among related stakeholders.
Alternatively, decision makers have adopted various types of hedging rules on-site, such as one-point (Klemeš, 1977), two-point (Bayazit and Ünal, 1990), continuous (Hashimoto et al., 1982), and zone-based hedging (Shih and ReVelle, 1995), that conserve a portion of reservoir storage in case of future droughts (You and Cai, 2008a,b; Tu et al., 2008). In particular, zone-based (discrete) hedging rules are popular in on-site reservoir operations and are currently adopted by many operators owing to their simple methodology and relative ease of implementation for dam operators with limited expert knowledge. The zone-based hedging rule first prioritizes water according to its uses and then enforces restrictions sequentially in the case of droughts. These early restrictions prevent a reservoir from suffering severe water shortages in the future.
Owing to the discrepancy between the solutions derived from optimization algorithms and hedging rules, only a few studies have endeavored to merge policies derived from optimization algorithms with policies implemented on-site. As an extension of participatory or collaborative decision making efforts in water resources management (Langsdale et al., 2013; Palmer et al., 2013; Basco-Carrera et al., 2017), this study intends to link the two discrete worlds of theoretical optimization and real operation through the following procedures: (1) generate optimal solution sets of a multi-objective reservoir operation problem solely using distinct optimization algorithms based on K-fold cross-validation and compare the quality of the derived solutions, (2) compare their performances with those from actual operations executed based on status quo (SQ) rules, and (3) apply real-world hedging rules to the solutions from the optimization algorithms and explore the trade-off relationship between conflicting stakeholders based on risk, resiliency, and vulnerability. The authors intend to contribute both to the algorithmic component and to real-world drought response measures by providing a thorough analysis of multiple optimization frameworks and relating them to real-world operating policies.

Dynamic Programming
DP, an optimization algorithm that searches for an optimal set of decision variables that maximizes the current benefit plus the expected benefits from future operations (Tejada-Guibert et al., 1995), and its variations have been applied to multiple water resources management problems (Yakowitz, 1982; Kelman et al., 1990; Kim et al., 2007; Yang et al., 2009; Fayaed et al., 2013; Feng et al., 2020). For the calculations, DP first discretizes the entire range of states, decisions, and disturbance variables with appropriate intervals. Then, the objective functions and constraints are iteratively calculated in discrete time steps across the entire optimization time horizon. Consequently, a steady-state solution, or an optimal set of decision variables corresponding to each value of the state variables, is computed as the outcome of the optimization procedure.
For example, let us consider a multi-objective problem in which the aim is to maximize a series of objectives, $J = (J_1, J_2, \ldots, J_k)$, by deciding on the optimal policy of release decisions, $R_t^{*}$, across the optimization horizon $T$. Assuming perfect knowledge of the future inflow $Q_t$, this optimization problem can be solved with a deterministic DP (DDP) model by solving the following functional equation (i.e., recursive equation) under the time horizon: where $t$ = time index, $t = 1, \ldots, T$; $T$ = final time step; $n$ = number of periods remaining until the end of the optimization horizon; $f_t^{n}$ = the value of the benefit function from the current period to the end of the time horizon $T$; $Z_t$ = the value of the aggregated objective function during period $t$ with prescribed weights (i.e., $Z_t = \sum_{i=1}^{k} w_i \times J_{i,t}$); $S_t$ = the storage at the beginning of period $t$; $Q_t$ = the inflow during period $t$; $R_t$ = the actual release during period $t$; $\gamma$ = a discount factor; $S_{\min}$, $S_{\max}$ = minimum and maximum storage; and $R_{\min}$, $R_{\max}$ = minimum and maximum release, respectively.
Because this specific formulation of the problem includes two state variables ($S_t$ and $Q_t$) and a single decision variable ($R_t$), solving Eq. 1 yields an optimal set of decision variables $R_t^{*}$ with dimensions proportional to the discretization intervals of the state variables and the length of the optimization horizon. Because DP manages both nonlinear and multi-objective characteristics of complex reservoir operational design problems (Giuliani et al., 2016a) in its objective function, it has been considered one of the most prominent options in solving water resources management problems. In particular, the stochastic version of DP (SDP), which handles future uncertainties with hydrologic state variables or transition probabilities, has been adopted in reservoir operation problems (Stedinger et al., 1984; Tejada-Guibert et al., 1995; Kim and Palmer, 1997; Kim et al., Forthcoming).
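The backward recursion described above can be illustrated with a minimal Python sketch. It is not the formulation used in this study: the stage benefit (a negative squared supply deficit), the grids, and the function name `ddp_backward` are assumptions made for illustration, and storage is the only state variable because the inflow is taken as known.

```python
import numpy as np

def ddp_backward(inflow, demand, s_grid, r_grid, gamma=1.0):
    """Backward recursion of a deterministic DP with known inflows.

    The stage benefit is an illustrative assumption (negative squared
    supply deficit). r_grid must contain 0 so that a feasible release
    always exists. Returns the value table and, for each (period,
    storage) pair, the index of the best release in r_grid.
    """
    T, n_s = len(inflow), len(s_grid)
    value = np.zeros((T + 1, n_s))           # value[T] = 0: no terminal benefit
    policy = np.zeros((T, n_s), dtype=int)
    s_min, s_max = s_grid[0], s_grid[-1]
    for t in range(T - 1, -1, -1):           # step backward from the last period
        for i, s in enumerate(s_grid):
            best, best_k = -np.inf, 0
            for k, r in enumerate(r_grid):
                s_next = s + inflow[t] - r   # mass balance
                if s_next < s_min:
                    continue                 # infeasible: below minimum storage
                s_next = min(s_next, s_max)  # water above capacity spills
                benefit = -max(demand[t] - r, 0.0) ** 2
                # future value, interpolated on the discretized storage grid
                future = np.interp(s_next, s_grid, value[t + 1])
                total = benefit + gamma * future
                if total > best:
                    best, best_k = total, k
            value[t, i] = best
            policy[t, i] = best_k
    return value, policy

s_grid = np.linspace(0.0, 4.0, 5)            # storages 0, 1, 2, 3, 4
r_grid = np.array([0.0, 1.0, 2.0])           # candidate releases
value, policy = ddp_backward([2.0, 0.0, 0.0], [1.0, 1.0, 1.0], s_grid, r_grid)
```

In this toy case the table has one entry per period and storage level, which is why the size of the solution grows with the discretization resolution and the length of the horizon.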
Despite its advantages, few scholars have argued for the application of DP to more complicated water resources problems due to its notorious curses (Giuliani et al., 2014a): the curse of dimensionality (Bellman, 1957), the curse of modeling (Tsitsiklis and Van Roy, 1996), and the curse of multiple objectives (Powell, 2007). Briefly, the curse of dimensionality refers to the exponential increase in the number of computations with the number of state variables, the curse of modeling refers to the necessity of including all input variables in the model as supplementary state variables, and the curse of multiple objectives refers to the exponential increase in the number of computations with the number of objectives. This has motivated recent technical advancements that aim primarily to overcome these three curses by using novel approaches other than DP (Bertsekas, 2005; Giuliani et al., 2016a).

Evolutionary Multi-Objective Direct Policy Search
EMODPS is an example of the parameterization-simulation-optimization (PSO) approach (Koutsoyiannis and Economou, 2003), in which MOEAs are merged with DPS and nonlinear approximating networks to derive the set of optimal policies. Instead of introducing approximations during problem formulation, such as discretization in DP, EMODPS directly optimizes the parameters of the policy through a simulation-based optimization scheme. It is argued that the unique features of EMODPS partially resolve the three curses of DP: (1) the curse of dimensionality by evading the iterative computation of the value of the benefit function; (2) the curse of modeling by using a simulation model that can include exogenous information directly during simulation; and (3) the curse of multiple objectives by attaining the entire Pareto front through the adoption of MOEAs during its search (Kasprzyk et al., 2009; Giuliani et al., 2014b, 2016a). The problem formulation and optimization conducted by the EMODPS framework can also be illustrated through the same problem of maximizing a series of objectives, $J$. EMODPS searches for the set of parameters ($\Theta^{*} = \theta_1^{*}, \ldots, \theta_n^{*}$) that constitute the optimal policies $p_\theta^{*}$. The formulation of the optimal policy that maximizes $J$ can be expressed by the following equation: The policy parameter vector $\Theta$, which constitutes the operating policy $p_\theta$ through parameterization, is coupled with diverse types of networks of functions that are flexible in shape. Typical examples of network functions include artificial neural networks (ANN) (Cybenko, 1989) and radial basis functions (RBF) (Park and Sandberg, 1991).
The example below demonstrates the RBF with a cubic shape, where the decision variable $R_t$ is decided according to both the shape of the functions and the values of the parameters, $\Theta$ (Giuliani et al., 2016a; Quinn et al., 2017): where $n$ = the number of RBFs; $\theta_{j,1}$, $\theta_{j,2}$, $\theta_{j,3}$ = set of parameters that compose the $j$th RBF; $S_t$ = the value of the state variable at time $t$ (storage at time $t$ in this example); and $R_{\min}$, $R_{\max}$ = minimum and maximum release, respectively.
The last component in the EMODPS framework, the MOEA, is adopted to search for the optimal policy parameter set $\Theta^{*}$ from Eq. 6. Because MOEAs can discover the critical tradeoffs facing water resources systems, non-dominated or Pareto optimal sets of solutions can be derived without aggregating multiple objectives into a single function. Consequently, the policymakers are presented with the full tradeoff and can express their preference for a single or several alternatives among the non-dominated solutions. To generate a more precise set of solutions, diverse MOEA schemes have been developed, including the Niched Pareto Genetic Algorithm (NPGA) (Horn et al., 1994), the Non-dominated Sorting Genetic Algorithm (NSGA) (Srinivas and Deb, 1994), the Non-dominated Sorting Genetic Algorithm II (NSGA-II) (Deb et al., 2002), and the auto-adaptive Borg MOEA. For detailed research on the performances of different MOEAs, refer to the papers by Zitzler et al. (2000), Nicklow et al. (2010), and Hadka and Reed (2012).
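The notion of a non-dominated set can be made concrete with a short Python sketch. It is a plain pairwise filter, not NSGA-II or any other MOEA named above, and it assumes that every objective is minimized; the candidate values are made up for illustration.

```python
def non_dominated(points):
    """Return the points that no other point dominates (all objectives minimized)."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            # q dominates p: no worse in every objective, strictly better in one
            all(q_k <= p_k for q_k, p_k in zip(q, p))
            and any(q_k < p_k for q_k, p_k in zip(q, p))
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical (objective 1, objective 2) pairs for five candidate policies
candidates = [(1.0, 9.0), (3.0, 4.0), (4.0, 5.0), (6.0, 2.0), (7.0, 7.0)]
front = non_dominated(candidates)
```

Here `(4.0, 5.0)` and `(7.0, 7.0)` are both dominated by `(3.0, 4.0)`, so three candidates remain for the decision maker to choose among.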

Zone-based Hedging Rules
Operating reservoirs with alternatives derived from optimization algorithms such as DP or EMODPS would theoretically lead to superior performance regarding the optimized objectives. However, the on-site application of these policies may be limited by a lack of theoretical background among decision makers, both actual and perceived errors in the models and forecasts, and the problematic attribute of systems models that are difficult to explain or understand (Quinn et al., 2019). Instead, the adoption of relatively simpler policies, such as diverse variants of hedging rules or drought contingency plans (Shepherd, 1998), is more prevalent in real-world reservoir operations. Among various alternatives, zone-based (discrete) hedging rules (Shih and ReVelle, 1995), in which the release decisions are determined according to the storage at a specific time period, are considered foremost, primarily due to their relatively simple and intuitive application procedures. Figure 1 depicts an example of a zone-based hedging rule, which all twenty multipurpose dams in South Korea adopt as their main operating policy. Operating a reservoir solely based on the zone-based hedging rule requires the following procedures. First, the reservoir's total demand is prioritized according to use. In the case of South Korea, water use from a multipurpose dam is mainly classified into one of four categories, municipal (M), industrial (I), agricultural (A), and environmental (E), and priorities are granted to each category accordingly. For instance, municipal and industrial water is given the highest priority, irrigation the second, and environmental water the lowest. Finally, the release $R_t$ is determined according to the storage zone and can be expressed as Eq. 7 through Eq. 11, which are considered the key operating procedures for zone-based hedging rules: where $\alpha_t$, $\beta_t$, and $\gamma_t$ ($1 > \alpha_t > \beta_t > \gamma_t > 0$) = weighting factors determined according to the reservoir and the season; $D_t$ = the aggregate demand during day $t$; and $S_{DSL}$ = the dead level storage of the reservoir. This study uses the current zone-based SQ policy in two ways: (1) the performance of the solutions derived from optimization algorithms is compared with the actual performance under operation based on the SQ policy, and (2) the SQ policy is coupled during the simulation step of the PSO algorithm to analyze the effect of incorporating the hedging rule into an optimization algorithm.
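The logic of a zone-based rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Eq. 7 through Eq. 11: the three zone thresholds, the mapping of the factors to zones, and the function name `hedged_release` are hypothetical, and only the ordering of the factors and the role of the dead storage level follow the text.

```python
def hedged_release(storage, demand, thresholds, factors, s_dsl):
    """Ration the demand according to the storage zone.

    thresholds: three descending storage levels separating the zones
                (hypothetical values supplied by the caller).
    factors:    rationing factors (alpha, beta, gamma) with
                1 > alpha > beta > gamma > 0.
    s_dsl:      dead storage level; nothing is released at or below it.
    """
    alpha, beta, gamma = factors
    t1, t2, t3 = thresholds
    if storage <= s_dsl:
        return 0.0
    if storage >= t1:
        ratio = 1.0      # normal zone: the full demand is supplied
    elif storage >= t2:
        ratio = alpha    # first restriction
    elif storage >= t3:
        ratio = beta     # second restriction
    else:
        ratio = gamma    # most severe restriction
    # assumption: the release never draws storage below the dead level
    return min(ratio * demand, storage - s_dsl)

zones = (60.0, 40.0, 20.0)
factors = (0.9, 0.7, 0.5)
releases = [hedged_release(s, 10.0, zones, factors, 5.0)
            for s in (80.0, 50.0, 30.0, 10.0, 5.0)]
```

As storage falls through the zones, the release steps down from the full demand toward zero, which is the conserving behavior the rule is meant to enforce ahead of a drought.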

Study Site
This study selected the Boryeong Dam, a multi-purpose reservoir located on the western coast of the Korean Peninsula, as the test case, and applied the proposed methodology to derive optimal release sequences with multiple alternatives. Boryeong Dam supplies water for M&I, A, and E uses to eight nearby administrative districts: Taean, Seosan, Dangjin, Yesan, Hongseong, Boryeong, Cheongyang, and Seocheon (Figure 2). Because the difference between the reservoir's capacity ($116.9 \times 10^{6}$ m$^3$) and the districts' aggregate annual demand ($106.6 \times 10^{6}$ m$^3$) is marginal, management of the reservoir during the drawdown period is especially critical (Jung et al., 2016). Moreover, unlike typical reservoirs, the majority of the water demand from the Boryeong Dam is concentrated on M&I uses (83%). This high ratio of M&I water demand consequently increases the potential exposure of the public to drought damage if stocks are mismanaged.
Recently, districts dependent on Boryeong Dam suffered water shortages due to a multi-year drought from 2013 through 2019. As shown in Figure 3, the main cause of the drought was a significant reduction in the amount of cumulative rainfall during the water years (defined as October through September of the following year in this study). Due to this contraction in annual precipitation, Boryeong Dam measured the lowest storage rate in its recorded history (8.3%) in July 2017. Because of this severe drought and low storage, multiple measures such as the development of structured stakeholder engagement (Kim et al., 2019a,b) and construction of novel water infrastructures (Ihm et al., 2019) were carried out to minimize the multi-year drought damage.
To demonstrate how different optimization algorithms or problem formulation methodologies may ameliorate the performance of the reservoir, this study derived the optimal monthly release sequences for 20 historical water years (October 1999 through September 2019) using K-fold cross-validation. In this study, the observed inflow series were grouped into four subgroups (K = 4), each with 15 years of inflow series for optimization and the remaining five years of inflow for validation. During validation, the starting storage values were set as the actual storage values recorded at the beginning of the validation period (Table 1). Then, the performance of each model was comparatively analyzed. The overall flow of this study is shown in Fig. 4.
Fig. 4 Research flow chart adopted in this study
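The 15-year and 5-year split can be sketched as follows. The sketch assumes that each validation block consists of consecutive water years and that a water year is labeled by the calendar year in which it ends; the actual grouping is the one defined in Table 1, and the function name `kfold_water_years` is hypothetical.

```python
def kfold_water_years(first_year, n_years, k):
    """Split consecutive water years into k (training, validation) folds.

    Assumption: each fold holds out one contiguous block of years for
    validation and uses all remaining years for optimization.
    """
    years = list(range(first_year, first_year + n_years))
    size = n_years // k
    folds = []
    for i in range(k):
        validation = years[i * size:(i + 1) * size]
        training = [y for y in years if y not in validation]
        folds.append((training, validation))
    return folds

# Water years 2000 to 2019 (October 1999 through September 2019)
folds = kfold_water_years(2000, 20, 4)
```

With 20 years and K = 4, every fold contains 15 optimization years and 5 validation years, and each year is used for validation exactly once.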

Optimization and Reservoir Simulation
This study adopted the multi-objective optimal operations (M3O) toolbox developed by Giuliani et al. (2016b) to derive the optimal release rules with different alternatives. In total, six different models were used to derive six different optimal release rules for each subgroup: the DDP model with perfect forecast (DDP-Perfect), DDP model with average inflow forecast (DDP-Average), a stochastic DP model assuming a lognormal distribution of the inflow (SDP), and three EMODPS models with differently shaped approximating functions each with a distinct number of parameters (EMODPS-Linear, EMODPS-Root, and EMODPS-Gaussian).
All models attempt to minimize the two objectives that reflect the preferences of two conflicting groups of stakeholders: $J_{\mathrm{supply}}$, which reflects the interests of reservoir managers, and $J_{\mathrm{demand}}$, which reflects the satisfaction of water users. Both objectives are measured in terms of magnitude and expressed as the average percentage of deficit over the entire optimization (or simulation) time horizon $H$. The first objective minimizes the deficit between the Zone 2 storage (vertical line between Zone 1 and Zone 2 in Fig. 1) and the actual storage in each time step, and is expressed as a percentage: where $S_t$ = the actual storage at the beginning of month $t$, and $S_t^{\mathrm{target}}$ = the predetermined Zone 2 storage during month $t$.
The second objective is to minimize the average deficit between releases and demands, also expressed as a percentage: where $D_t$ = the aggregate demand during month $t$, and $R_t$ = the actual release during month $t$. Ideally, both $J_{\mathrm{supply}}$ and $J_{\mathrm{demand}}$ should be zero, which indicates that the actual storage is at or above the target storage during the entire simulation period and that the release is at least equal to the demand. The two contrasting objectives in this study, $J_{\mathrm{supply}}$ and $J_{\mathrm{demand}}$, aim to reflect the real-world conflicts that historically have occurred (Kim et al., 2019a, 2020).
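A short Python sketch shows how such average percentage deficits could be computed. The functional form below (the shortfall relative to the target, averaged over the horizon, with surpluses counted as zero) is an assumption consistent with the verbal description, not the study's exact equations, and the series are made-up numbers.

```python
def mean_percent_deficit(actual, target):
    """Average shortfall of `actual` below `target`, in percent of the target.

    Periods in which the target is met or exceeded contribute zero.
    """
    shortfalls = [max(t - a, 0.0) / t for a, t in zip(actual, target)]
    return 100.0 * sum(shortfalls) / len(shortfalls)

# Hypothetical four-month series
storage = [50.0, 30.0, 40.0, 60.0]
target_storage = [40.0, 40.0, 40.0, 40.0]
release = [10.0, 8.0, 5.0, 10.0]
demand = [10.0, 10.0, 10.0, 10.0]

j_supply = mean_percent_deficit(storage, target_storage)  # storage-side deficit
j_demand = mean_percent_deficit(release, demand)          # user-side deficit
```

Both values are zero only when storage never falls below its target and every demand is fully met, which matches the ideal point described above.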

Policy Derived using DP
In this study, three models were developed using DP as their primary optimization algorithm with different forecast information: DDP-Perfect, DDP-Average, and SDP. First, the DDP-Perfect model assumes perfect forecasting of future inflow during the entire optimization horizon. The solutions derived by the DDP-Perfect model served as the reference set, which the approximated solutions cannot dominate. Conversely, the DDP-Average model was set to make release decisions that assume the average inflow throughout the optimization horizon. Finally, the SDP model developed in this study assumes a lognormal distribution of the inflow series and derives optimal release decisions based on Eq. 15:
Because all DP-based models require a commensurate objective function, $Z_t(\cdot)$, for calculation in problems with two or more objectives, the aggregation was conducted as Eq. 16 with combinations of weighting factors $w_1$ and $w_2$: where $w_1$ and $w_2$ = weights multiplied by each of the objectives ($0 \le w_1, w_2 \le 1$); $Z_{1,t}$ and $Z_{2,t}$ = $J_{\mathrm{demand}}$ and $J_{\mathrm{supply}}$ calculated at time step $t$.
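The effect of varying the weighting factors can be illustrated with a small Python sketch. It assumes that the two weights sum to one, which the text does not state, and it picks among three made-up options instead of running a DP; the function names are hypothetical.

```python
def aggregate(z_demand, z_supply, w1, w2):
    """Weighted-sum aggregation of the two objectives into one value."""
    return w1 * z_demand + w2 * z_supply

def weight_sweep(options, n_weights=5):
    """For each weight pair, pick the option with the lowest aggregate value.

    options: list of (z_demand, z_supply) pairs, both to be minimized.
    Assumption: w2 = 1 - w1.
    """
    chosen = []
    for i in range(n_weights):
        w1 = i / (n_weights - 1)
        w2 = 1.0 - w1
        best = min(options, key=lambda z: aggregate(z[0], z[1], w1, w2))
        chosen.append((w1, best))
    return chosen

# Hypothetical (J_demand, J_supply) pairs in percent
options = [(1.0, 9.0), (3.0, 4.0), (6.0, 2.0)]
chosen = weight_sweep(options)
```

Each weight combination yields a single solution, so a DP-based model has to be solved repeatedly with different weights to trace out a set of trade-off alternatives.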

Policies derived using EMODPS
Because the type of optimization algorithm selected and the formulation of the approximating function affect the overall quality of the generated solution sets, this study generated multiple alternatives with EMODPS by varying the shape and number of parameters constituting the RBFs. Because Giuliani et al. (2016a) showed that the solution sets are sensitive to the shape of the parameterized function, three shapes of RBFs with different numbers of parameters were adopted in this study: a 4-parameter linear RBF (EMODPS-Linear), a 6-parameter root-shaped RBF (EMODPS-Root), and a 3-parameter Gaussian RBF (EMODPS-Gaussian). All formulations adopted NSGA-II for optimization due to its extensive use in water resources optimization problems and for equitable comparison. The first alternative, the EMODPS-Linear model, was constructed using linear functions with four policy parameters ($m_1$, $m_2$, $h_1$, and $h_2$). Consequently, the optimal release decision derived via linear RBFs was determined according to Eq. 17 through Eq. 19:
The second model adopted the Gaussian RBF, which is one of the most prominently adopted shapes of RBF in the field of scientific and engineering computing (Bugmann, 1998; Fasshauer and McCourt, 2012), to determine the policy for the optimal release sequences. The three policy parameters that constitute the simplified Gaussian RBF ($w_1$, $c_1$, and $b_1$) consequently determine the decision $R_t$ using the following equation:
The last shape adopted in this study was the root-shaped RBF with six policy parameters ($w_1$, $w_2$, $r_1$, $r_2$, $d_1$, and $d_2$), and the release decision was derived according to the following equation:
Several studies dealing with MOEAs have shown that both population diversity and convergence of the solutions to the Pareto optimal front are paramount (Wang et al., 2015; Liu et al., 2016; Ying et al., 2017).
Thus, to ensure both diversity and convergence of the optimal solution sets, all models in this study adopted a population of 40 individuals evolved over 500 generations during optimization. The generated optimal solution sets were then analyzed to identify more effective and reliable operating schemes during drought periods.
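How three parameters can define a complete release policy is sketched below in Python. The functional form is an assumption: a single Gaussian basis evaluated on normalized storage and rescaled to the release bounds. It is not necessarily the simplified Gaussian RBF used in this study, and the parameter values, bounds, and function name are made up for illustration.

```python
import math

def gaussian_rbf_release(storage, params, s_bounds, r_bounds):
    """Map storage to a release with one Gaussian basis function.

    params:   (w1, c1, b1) = weight, center, and radius (assumed roles).
    s_bounds: (s_min, s_max) used to normalize storage to [0, 1].
    r_bounds: (r_min, r_max) used to rescale the output to a release.
    """
    w1, c1, b1 = params
    s_min, s_max = s_bounds
    r_min, r_max = r_bounds
    s_norm = (storage - s_min) / (s_max - s_min)
    u = w1 * math.exp(-((s_norm - c1) / b1) ** 2)
    u = min(max(u, 0.0), 1.0)                 # keep the decision within [0, 1]
    return r_min + u * (r_max - r_min)

theta = (1.0, 1.0, 0.5)                       # hypothetical parameter vector
curve = [gaussian_rbf_release(s, theta, (0.0, 100.0), (0.0, 20.0))
         for s in (0.0, 50.0, 100.0)]
```

Because the policy is a continuous function of storage, it returns a release for any storage value, and an evolutionary search only has to tune the three numbers in `theta`.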

Incorporation of SQ Policy during Simulation
To analyze the effect of incorporating the hedging rule, this study coupled the real-world SQ policy during simulation in drought periods with the release decisions from the optimization algorithm. Among the types of hedging rules, zone-based hedging is used to operate Boryeong Dam. Under this rule, the actual release $R_t$ is determined according to the zone to which the storage $S_t$ belongs. In this study, this SQ operating policy was included in the simulation step according to Eq. 7 through Eq. 11, and real-world monthly rationing factors (Table 2) were adopted for a realistic simulation. Finally, the efficacy of incorporating hedging rules was analyzed regarding frequency, duration, and magnitude from the stances of both water suppliers and users.
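One way to couple a hedging rule with a policy inside a simulation loop is sketched below in Python. The coupling mechanism is an assumption (the hedging rule caps whatever release the policy proposes), and the stand-in policy, the one-threshold rule, and all numbers are hypothetical rather than taken from Table 2.

```python
def simulate(inflow, demand, s0, s_max, policy, hedge=None):
    """Mass-balance simulation; returns the storage and release series."""
    storage, releases = [s0], []
    for q, d in zip(inflow, demand):
        s = storage[-1]
        r = policy(s, d)
        if hedge is not None:
            r = min(r, hedge(s, d))      # assumption: hedging caps the release
        r = max(0.0, min(r, s + q))      # cannot release more than is available
        releases.append(r)
        storage.append(min(s + q - r, s_max))  # water above capacity spills
    return storage, releases

def meet_demand(s, d):
    """Hypothetical stand-in for an optimized policy: always supply the demand."""
    return d

def simple_hedge(s, d):
    """Hypothetical one-threshold rule: halve the supply below storage 30."""
    return d if s >= 30.0 else 0.5 * d

inflow = [0.0] * 6                       # six dry periods
demand = [10.0] * 6
s_plain, r_plain = simulate(inflow, demand, 50.0, 100.0, meet_demand)
s_hedged, r_hedged = simulate(inflow, demand, 50.0, 100.0, meet_demand, simple_hedge)
```

In this toy drought the unhedged run empties the reservoir and delivers nothing in the last period, whereas the hedged run accepts three moderate shortfalls and still holds water at the end, which is the supplier-versus-user trade-off examined in this study.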

Comparison in performance of models
Application of the proposed methodology to the reservoir at the study site resulted in diverse sets of solutions, each distinct in quality and diversity. For comparative analysis of their performance, the solutions generated by the DDP-Perfect, DDP-Average, SDP, EMODPS-Linear, EMODPS-Root, and EMODPS-Gaussian models were all plotted and compared with the actual operating results of the same period, which include on-site, real-time decision making by experts in reservoir operations (Figure 5). Multiple insights can be derived from the Pareto fronts generated by the different models. First, solutions derived by the optimization algorithms outperformed the results from the actual operation, indicated by the red diamond in Figure 5, in three of the four subgroups. Although on-site release decisions are made with more information than those in the optimization algorithms, the results indicated that adopting any of the optimization models in actual reservoir operation would improve performance under stationary conditions. However, the actual operation resulted in better performance in the case of droughts (Figure 5d). This result indicates that the SQ policy, on which the actual operations are based, may yield more practical outcomes when adequately employed under suitable conditions. Among the optimization models, the DDP-Perfect model, which assumed perfect forecasting of future inflow series, yielded the best performance in all subgroups, as initially expected. Of the three EMODPS models, the EMODPS-Gaussian model with three parameters yielded the results closest to the ideal point, represented as a blue star, among all subgroups in Figure 5.
Policies derived by the EMODPS-Linear model followed those from the EMODPS-Gaussian model regarding proximity to the reference set; however, they outperformed the other two models regarding solution diversity. Despite having more parameters than the other two, the EMODPS-Root model performed worst regarding proximity to the ideal point. This indicates that the quality of the solutions generated from the EMODPS algorithm is more sensitive to the shape of the network functions than to the number of parameters that constitute them. Moreover, in cases where deriving both optimal and diverse solution sets is challenging, the overall trade-off relationship between the performances and the number of generated solutions, which can be explored from the results of the EMODPS-Linear and EMODPS-Root models, stresses the importance of decision makers choosing optimal operating policies depending on their circumstances. Finally, from Figure 5, the alternatives derived from the DDP-Perfect model may be considered the optimal set under which the reservoir should be operated in all circumstances because their performance is adjacent to the ideal point. However, deterministic models are types of open-loop control, where the sequence of actions is not updated as a function of new observations of state variables (Herman et al., 2020). Due to this attribute, the optimal release decisions made under the deterministic models would severely underperform in the case of unexpected states. This is evidenced by the inferior performance of the DDP-Average model in two of the subgroups, K=2 and K=4 (Figure 5b, 5d).
Conversely, the SDP model and all types of EMODPS models, which belong to the group of policy-searching closed-loop models, update their release decisions dynamically according to either the present or future states. This indicates that the closed-loop models would be more flexible to future uncertainties and would surpass the performance of deterministic models in cases of unanticipated incidents such as floods or droughts. Moreover, the release decisions from the EMODPS models are defined under the entire domain of state variables, whereas the decisions from the DP models are only defined on limited domains because of the discretization of the state variables. This enhances their adaptability under various circumstances even more. Thus, policies from adaptive optimization algorithms with approximating functions of various shapes should be considered when deriving optimal solutions in reservoir operation problems.

Effect of coupling zone-based hedging rules during droughts
This section investigates the effect of coupling the zone-based hedging SQ policy with optimal solutions from the optimization algorithm during droughts. Given the total precipitation during the optimization horizon and the conflicting nature of the objectives, relocating the Pareto optimal solution sets closer to the ideal point is hardly achievable. Hence, this portion of the study explores the expansion of the trade-off relationship between the water users and suppliers that may arise from coupling the zone-based hedging policy during the simulation. Risk, resiliency, and vulnerability (Hashimoto et al., 1982) were chosen as performance indices to measure the frequency, duration, and magnitude of predefined failures. Higher values of risk, resiliency, and vulnerability correspond, respectively, to more frequent, longer-lasting, and greater failures.
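The three indices can be sketched as follows. The sketch follows this section's convention that the resiliency index is reported as the average length of a failure event. It assumes that failure in a period means a positive deficit and that vulnerability is the mean deficit over failing periods. The exact formulas used in the study are not restated in this section, so the function name `rrv` and these definitions are illustrative only.

```python
def rrv(deficits):
    """Return (risk, mean_failure_duration, vulnerability) for a deficit series.

    risk:          fraction of periods in failure (deficit > 0)
    duration:      average number of consecutive failing periods per event
    vulnerability: mean deficit over failing periods (one common variant)
    """
    n = len(deficits)
    failing = [d > 0 for d in deficits]
    n_fail = sum(failing)
    # A failure event starts whenever a failing period follows a non-failing one.
    events = sum(
        1 for i, f in enumerate(failing) if f and (i == 0 or not failing[i - 1])
    )
    risk = n_fail / n if n else 0.0
    duration = n_fail / events if events else 0.0
    vulnerability = sum(d for d in deficits if d > 0) / n_fail if n_fail else 0.0
    return risk, duration, vulnerability

# Toy monthly deficit series: two failure events, lasting 2 and 1 periods.
risk, duration, vulnerability = rrv([0, 2, 4, 0, 0, 3, 0, 0])
# risk = 0.375, duration = 1.5, vulnerability = 3.0
```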
To quantify the effect of the hedging rule under drought conditions, this study selected the EMODPS-Gaussian model under subgroup K=4 (Table 1), which yielded both optimal and diverse solution sets. The same set of derived solutions was simulated twice under the historical inflow, once with the real-world hedging rule and once without it. Supply-side failure was defined as not meeting the target storage (i.e., J supply > 0). Conversely, demand-side failure was split by water use and defined as a release less than the target demand. Since M&I and A uses are dominant (89% in the study reservoir), the risks and vulnerabilities for these two water uses were explored. Figure 6 presents the performance indices from the supply-side. Since the zone-based hedging rules restrict release from the reservoir at relatively low storage values, the performance of the system measured from the supply-side improved. In particular, the duration of failure, which is measured in terms of resiliency, was significantly reduced after incorporating hedging rules (Figure 6b). Whereas the expected length of failure was 7.7 months before hedging, the introduction of the hedging rule resulted in a 37.4% reduction in the average duration of failure (4.8 months after hedging). Hedging improved not only the duration but also the frequency and magnitude of failure (Figure 6a, 6c).
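The paired simulation described above can be sketched in a simplified form. The sketch assumes a single demand, a mass-balance reservoir, and a stand-in base policy that always plans to release the full demand. The zone thresholds and supply fractions are hypothetical and do not reproduce the actual rule applied at the study reservoir.

```python
def hedged_release(planned, storage, zones):
    """Scale the planned release by the supply fraction of the first zone whose
    storage threshold exceeds the current storage. zones is a list of
    (storage_threshold, supply_fraction) pairs sorted by ascending threshold."""
    for threshold, fraction in zones:
        if storage < threshold:
            return planned * fraction
    return planned

def simulate(inflows, demand, s0, capacity, zones=None):
    """Run the mass balance with (zones given) or without (zones=None) hedging."""
    storage, releases, storages = s0, [], []
    for q in inflows:
        planned = demand  # stand-in for the optimized release decision
        if zones is not None:
            planned = hedged_release(planned, storage, zones)
        release = min(planned, storage + q)             # limited by available water
        storage = min(storage + q - release, capacity)  # excess above capacity spills
        releases.append(release)
        storages.append(storage)
    return releases, storages

# Hypothetical drought: low constant inflow against a demand of 10 per period.
inflows, demand = [2, 2, 2, 2, 2, 2], 10
zones = [(10, 0.5), (20, 0.8)]  # illustrative thresholds and supply fractions

plain_rel, plain_sto = simulate(inflows, demand, s0=30, capacity=100)
hedge_rel, hedge_sto = simulate(inflows, demand, s0=30, capacity=100, zones=zones)

plain_def = [demand - r for r in plain_rel]  # per-period shortfall, no hedging
hedge_def = [demand - r for r in hedge_rel]  # per-period shortfall, with hedging
```

In this toy series, hedging keeps storage at or above the unhedged trajectory in every period and lowers the largest shortfall, while the number of periods with a shortfall rises. This is only an illustration of the mechanism, not a reproduction of the reported results.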
On the other hand, Figure 7 depicts the comparative performances before and after coupling hedging, calculated from the demand-side. Owing to the conflicting nature of supply and demand, the frequency of failure measured from the demand-side increased (Figure 7a, 7c). However, the average magnitude of failure (%) slightly improved from the standpoint of water users (Figure 7b, 7d). This result indicates that when hedging is applied, the users undergo more frequent failures but with smaller water deficits. Considering the extremely risk-averse nature of water users facing water deficiency, reducing the magnitude of failure by incorporating hedging seems to be a sound strategy given the increased risk of droughts in the future.

Conclusion
The primary motivation for this study was to compare distinct formulations of the optimization algorithms used in reservoir operation problems, both with one another and with actual operation outcomes. To this end, six different models were formulated with two eminent optimization algorithms, each with distinct characteristics: DP and EMODPS. Under normal inflow conditions, the performances of all optimization models, measured as the magnitude of the deficit compared to the target storage and demand, surpassed the results of the actual operation, which relies on real-time calculations and the experience of related experts. Among the optimization models, solutions from the EMODPS models dominated those from the DP models in their proximity to the ideal point and in the variety of solutions across the entire Pareto front. Finally, the EMODPS-Gaussian model, which has the fewest parameters, outperformed the other two in adjacency to the ideal point, and the trade-off between optimality and diversity could be explored through the EMODPS-Linear and EMODPS-Root models.
Another objective of this study was to incorporate the real-world zone-based hedging rule into the simulation of policies optimally derived by an optimization algorithm. The solutions from the EMODPS-Gaussian model were simulated under historical inflow series with and without coupling the real-world hedging policy. The results were evaluated for the frequency, duration, and magnitude of failure on both the supply and demand sides, and yielded a trade-off relationship between different performance metrics. Despite the expected improvement in supply indices, the frequency of failure from the demand-side increased after the incorporation of the real-world policy, while its average magnitude decreased. Taking the highly risk-averse nature of water users into consideration, this result, which decreases the average size of damage at the expense of increased frequency, may be worthwhile in actual operations. Ultimately, the conclusions drawn from this study highlight the necessity of narrowing the gap between theory and real-world applications, which has been stressed since the mid-20th century.
Despite multiple findings from the application of the proposed methodology, room for improvement remains for future studies. In particular, the value of different MOEAs could be explored by adopting various types of MOEAs during the optimization process. This advancement could yield solution sets with improved performance on the algorithmic side of the reservoir operation problem. In addition, simulating under future inflow scenarios that reflect the increased risk of future climate variability, especially from climate change, and analyzing the proposed alternatives under a wider climate range would enable the assessment of their performance under even more extreme climate conditions. Finally, adjusting the finer details that constitute the real-world zone-based hedging rules could provide the related decision makers with a wider range of options from which to choose.