Parameter dependences arising from calibration of a riverine diatom model - representation in terms of posterior conditional distributions

We address the analysis and proper representation of posterior dependence among parameters obtained from model calibration. A simple water quality model for the Elbe River (Germany) is referred to as an example. The joint posterior distribution of six model parameters is estimated by Markov Chain Monte Carlo sampling based on a quadratic likelihood function. The estimated distribution shows to which extent model parameters are controlled by observations, highlighting issues that cannot be settled unless more information becomes available. In our example, some vagueness occurs due to problems in distinguish-ing between the effects of either growth limitation by lack of silica or a temperature dependent algal loss rate. Knowing such indeﬁniteness of the model structure is crucial when the model is to be used in support of management options. Bayesian network technology can be employed to convey this information in a transparent way.

The Elbe River with station Geesthacht where the chlorophyll a and silica observations under study were taken. Some aspects of model forcing were obtained from stations Neu Darchau (river discharge), Schnackenburg (temperatures in 1997) and Schmilka (silica). The map also indicates the four most important tributaries.
Both F light (t) and F Si (t) can assume values between 0 and 1. 155 The model does not explicitly consider the water depth coordinate z. Light limitation is 156 specified just as a vertical average over water depth D. Given radiation intensity I(t) at the 157 water surface, the following parametrization of F light is assumed, where the 'Smith formula' (Smith, 1936) with some half-saturation constant K light has been 159 used. Light attenuation λ is assumed to be proportional to chlorophyll a concentration with 160 a constant factor λ S : In Eq. (4), water depth D is treated as a constant, indicating that its value remains unchanged 162 during each trajectory calculation. However, for each simulated trajectory the value of D is 163 adjusted to water discharge observed at station Neu Darchau (about 50 km upstream of 164 Geesthacht) at the time when this trajectory reaches Geesthacht Weir. A polynomial for-165 mula well reproduces the empirical relationship between discharge and water depth, slightly 166 enhancing, however, small values of D (see Scharfe et al., 2009, their Fig. 3). 167 The potential for chlorophyll a development may depend on the amount of silica ap-168 portioned to each simulated trajectory in agreement with observations at station Schmilka. 169 The simple model concept assumes that an initial reservoir of silica, C Si (t 0 ), is continuously In our experiment parameter K Si is set to the fixed value of 0.1 mg Si/l. cided to modify the model in such a way that in each year assimilation of silica is abandoned 184 during a 1-2 week period (see Scharfe et al., 2009, Fig. 11 therein). In this study we adopt 185 this approach to prevent that the large short-term discrepancies dominate the overall model 186 evaluation. In all time series shown in this paper, the special periods will be highlighted.
The interpretation of coefficient a being greater than one remains unspecific but could cover 190 an increased zooplankton grazing rate, for instance.

191
From the above equations we selected six parameters for this calibration study.
with an assumed observational error σ chl . The MCMC search algorithm should not explore 216 unrealistically large values for parameters or parameter combinations insufficiently con-217 trolled by observations. For that reason, for each parameter θ k a prior distribution of the 218 following form is introduced: Scaling ensures that ∞ 0 p(θ k )dθ k = 1. To adjust coefficients b k in Eq. (11), we first specify 220 for each parameter θ k values θ k that to be exceeded should be quite unlikely (see Table 1).

221
Probability P of finding θ k ≤ θ k reads With Eq. (11) and (b 2 k + θ 2 k ) −1 dθ k = arctan(θ k /b k )/b k , coefficients b k satisfying Eq. (12) 223 can then be calculated as: In the following we assume a value of P = 0.9.

225
All parameters θ k considered in this study are also constrained to positive values. The basic idea of graphical Gaussian modelling is to modify a sample correlation matrix formation divergence between two jointly normal distributions, assuming that their means 259 are equal (Kullback, 1959). Specific properties of V imply that the deviance assumes the 260 following simplified form (Edwards, 1995): If data are normally distributed, the deviance has an asymptotic χ 2 distribution with the 262 degrees of freedom given by the number of edges missing in the graph (Whittaker, 1990 where Pa(X i ) denotes the set of all parent nodes of node X i . For root nodes without parents 281 (applies to at least one node in a DAG), the conditional probability P(X i | Pa(X i )) is replaced   given measured data d is given by of each parameter. This enables a fast exploration in parameter spaces that are too high-320 dimensional to be visualized directly.  Table 2). Black lines represent simulations based on those parameters for     interplay between parameters λ S and K light (see Eqs. (4)) and (5)).   reasonable agreement with what one gets assuming high precision data (Fig. 6a).

420
It is interesting to see how posterior parameter distributions differ when calibration is 421 conducted using data from individual years (Fig. 7).     Fig. 8 (lower triangle). Right: Corresponding matrices S p and V p of partial correlations. Numbers in bold type correspond to edges that were maintained in the GGM.  Table 3 shows the original correlation matrix S of feasible parameter combinations,

454
EOFs of which were displayed in Fig. 4. Note the particularly strong correlations (either 455 positive or negative) between algal silica content f Si and the two algal loss related parame-456 ters log(a) and log(σ 0 ) (see Eq. (8)). Table 3 (Fig. 8). The BN is shown in a state after evidence for both µ 0 and f Si was entered. Calculations were performed using Netica. Conditional marginal distributions obtained from the truncated BN well reproduce those shown in Fig. 5c. White contours indicate unconditional distributions. to the low dimensionality (dof=2.7) of the posterior parameter space. Small differences be-471 tween the results from either S or V are in favour of the simplified GGM. Also the leading 472 EOFs obtained from V resemble those for S in Fig. 4 (not shown).

473
Interpretation of the GGM in Fig. 8 is the following. Assume, for instance, that parame- converted into undirected ones, giving the so-called moral graph (Cowell et al., 1999). The 483 moral graph derived from the BN in Fig. 9, for instance, would also contain an edge between λ S and f Si , as these two nodes have joint children K light and µ 0 . Such edge is missing in the 485 GGM in Fig. 8). Hence, the GGM tends to be more restrictive than the BN and it can there-486 fore be expected, that the simplified BN (6 edges out of 15 edges were removed; maximum 487 number of parents is 3 instead of 5 for the saturated graph) behaves similar to the saturated 488 BN.

489
To substantiate this agreement, Fig. 9 replicates the experiment shown in the upper pan- system's evolution. Spear and Hornberger (1980) found separation induced correlations be-531 tween model parameters to not exceed 0.23, which is why the authors did not embark on 532 a deeper analysis of the correlation matrix. According to Spear (1997) conventional mul-533 tivariate analyses proved to be unhelpful also in other studies using the same approach. In 534 our study, correlations were found to be much higher (see Table 3). We presume that this 535 relates to a) our model being much more controlled by observations and b) the huge number 536 of successful simulations (10 6 in our study) affordable with today's computer power.

537
The aforementioned studies motivated further developments leading to the GLUE (Gen-538 eralized Likelihood Uncertainty Estimation; Beven and Binley, 1992)  ally more expensive than MCMC using sequential sampling (Camacho et al., 2015). Ran-544 dom sampling is also used within the Bayesian Monte Carlo (BMC) approach, a method 545 related to GLUE but using a statistically rigorous likelihood function (Dilks et al., 1992).

546
According to Beven and Freer (2001) the advantage of MCMC might diminish when model 547 output likelihood has a complex shape. That seemed not to be the case in our application.

608
Given the model structure we used, the aspect worst controlled by data is decreasing chloro-609 phyll a concentrations explained by by either algae growth limited by lack of silica or a 610 strong algal loss rate. According Fig. 5c, choosing a relatively low algal silica content f Si 611 (and therefore a low depletion of the silica reservoir) implies a large loss rate σ 0 and a small 612 coefficient a governing effects of temperature on the loss rate (e.g. via grazing). By contrast, 613 for a large silica content the maximum loss rate should be set to a small value; its variation 614 for high temperatures becomes less constrained by the data.

615
One must be aware that simple (or even complex) model structures neglect many exter-  Observations Geesthacht Weir will be made available from a repository.

685
Code availability 686 The model code will be made available from a repository together with the data being used.
von Toussaint: Implementation of Markov Chain Monte Carlo algorithm, data analysis.