2.1 Method
The modelling analysis was carried out using the state-of-the-art advective EPANET model, an upgraded version of the EPANET model (advective-diffusive-dispersive model), which includes the diffusion and dispersion equations proposed by Romero-Gomez and Choi (2011) and a new diffusive-dispersive model developed by the authors called EPANET-DD (dynamic-dispersion).
The classic advective EPANET model solves the advective transport equation by solving a mass balance of the fundamental plug-flow substance that accounts for the advective transport and might include kinetic reaction processes (Eq. 1). Using this approach, the substance mass is assigned to discrete volume elements once all the connections in the network have been partitioned. Thereafter, the concentration within each volume segment is eventually subjected to reactions and then transferred to the adjacent downstream segment. If the latter is a junction node, the incoming mass and flow volumes combine with those already present at the network nodes. Once these processes have been exhausted for all network elements, the concentration is calculated and released in the first pipe segments, with flow leaving the node. In this case, the effect of longitudinal dispersion is totally neglected, as it is not considered important in most operating conditions (Rossman, et al., 1993).
\(\frac{{\partial C}_{i}(x,t)}{\partial t}=-{u}_{m}\frac{{\partial C}_{i}\left(x,t\right)}{\partial x}-K{C}_{i}\left(x,t\right)\) (1)
In the present application, in which a conservative tracer is used, the reaction term is neglected.
In contrast, the advective-diffusive-dispersive model, which includes the diffusion and dispersion equations proposed by Romero-Gomez and Choi (2011), solves the transport equation by highlighting the differences between mass flows backwards and forwards from a specific position, which result in different dispersion speeds that lead to solute transport in both directions (Eq. 2).
\(\frac{\partial C}{\partial t}=\frac{1}{\varDelta x}\left({\varphi }_{b}-{\varphi }_{f}\right)-{u}_{m}\frac{\partial C}{\partial x}\) (2)
in which
\({\varphi }_{b}=-{E}_{b}{\left.\frac{\partial C}{\partial x}\right|}_{b} and {\varphi }_{f}=-{E}_{f}{\left.\frac{\partial C}{\partial x}\right|}_{f}\) (3)
\({E}_{b}={E}_{b}\left(0\right){exp}\left(-xpT\right)+{\beta }_{b}\left(T\right){E}^{*}\) (4)
\({E}_{f}={E}_{f}\left(0\right){exp}\left(-xpT\right)+{\beta }_{f}\left(T\right){E}^{*}\) (5)
where \({E}_{b} and {E}_{f}\) (Eq. 4 and Eq. 5) are the dispersion parameters backwards and forwards with respect to the flow direction, \({u}_{m}\) is the flow average velocity and \({\beta }_{b}\left(T\right)={\beta }_{f}\left(T\right)=3.705\bullet T\).
The dimensionless travel time (T) was calculated as in Eq. (6). This parameter indicates how far the dispersion coefficient has progressed towards achieving stable conditions.
\(T=\frac{4{D}_{AB}\stackrel{-}{t}}{{d}^{2}}=4\frac{{x}^{*}}{{S}_{C}\bullet R}\) (6)
in which
\({x}^{\text{*}}=\frac{L}{d}\) is a dimensionless pipe length that defines the location of solute migration, \(L\), with respect to the pipe diameter, \(d\);
\(R=\frac{{u}_{m}\bullet d}{\upsilon }\) is the Reynolds number, which accounts for the mean flow velocity’s (\({u}_{m}\)) geometric dimensions (\(d\)) and fluid conveying properties (kinematic viscosity, ν);
\({S}_{C}=\frac{\upsilon }{{D}_{AB}}\) is the Schmidt number, which accounts for the solute properties (solute diffusion coefficient, \({D}_{AB}\));
and \(\stackrel{-}{t}=\frac{L}{{u}_{m}}\) is the time, which is defined as the ratio between the location of solute migration, \(L\), and the flow velocity (\({u}_{m}\)).
The previously reported expression of the coefficient \({\beta }_{b}\left(T\right)={\beta }_{f}\left(T\right)\) was determined, as the authors found that for short travel times (\(T<0.01\)), the dispersion rates were amplified by 25% more than the numerical results using the formulation proposed by Lee (2004), where \(\beta \left(T\right)=1-exp\left(-16T\right)\).
The EPANET-DD model solves the equations under quasi-steady flow conditions, solving the hydraulic problem under steady flow conditions with the EPANET-MATLAB-Toolkit (Eliades, et al., 2016) and the advection-diffusion-dispersion equation under dynamic flow conditions in the two-dimensional case with the classical random walk method (Delay, et al., 2005), implementing the diffusion and dispersion equations proposed by Romero-Gomez and Choi (2011).
As demonstrated in the literature ((Kinzelbach and Uffink, 1991) - (Delay, et al., 2005)), the use of this combined method is possible due to the similarities between the Fokker-Planck-Kolmogorov equation and the advection-dispersion equation. In fact, the two equations are essentially identical unless there is a conceptual difference between the parameters of the two equations, as the parameters present in the Fokker-Planck-Kolmogorov equation are independent of time, resulting from the stationary hypothesis. To overcome this problem and address the issues related to discontinuities that could cause local mass conservation errors (LaBolle, et al., 1996), Delay et al. (2005) provided a new equivalence, making this analogy valid again. This methodology can be easily applied to any flow model because the mass of the solute is discretized and transported by the particles in the random walk. Consequently, the mass conservation principle is automatically satisfied because the particles cannot suddenly disappear.
This model allows us to determine the position of the solute particles that move inside the network in the \(x\) and \(y\) directions as a function of the different flow regimes that occur inside the network, as shown in equations (7) and (8):
\(x=x+\frac{3}{2}{u}_{x}\left(1-{\left(\frac{y}{\frac{d}{2}}\right)}^{2}\right)dt+\sqrt{2\bullet {E}_{f or b}\bullet dt}\) (7)
\(y=y+{u}_{y}dt+\sqrt{\left({E}_{f}+{E}_{b}\right)\bullet dt}\) (8)
where 𝑢𝑥 corresponds to the component along the 𝑥 axis of the flow velocity, 𝑢𝑦 corresponds to the component along the 𝑦 axis of the flow velocity, 𝑑𝑡 is the duration of the contamination event, 𝑑 is the pipe diameter, and 𝐸𝑓 and 𝐸𝑏 are the forwards and backwards diffusion coefficients, respectively, as defined by Romero-Gomez and Choi (2011). The diffusion coefficient used in Eq. (7) assumes the forwards or backwards values depending on whether the flow direction is positive or negative. The above equation was developed considering laminar flow conditions, in which the velocities in the network are relatively low. This allows the particles to move freely along the 𝑦 axis. This characteristic is also highlighted by the presence of the term in round brackets, \(\left(1-{\left(\frac{y}{\frac{d}{2}}\right)}^{2}\right)\), which multiplies the x component of the velocity 𝑢𝑥. In fact, as the velocity along the \(x\) direction increases and the flow rate changes, the particles tend to move along the preferred flow direction, and the term in brackets disappears from the equation.
To confine the particles inside the pipe section, the previous equations are solved considering the following boundary conditions (Eq. 9 and Eq. 10).
\(\begin{array}{ccc}y=-2\bullet {y}_{max}-y& for& y<-\end{array}{y}_{max}\) (9)
\(\begin{array}{ccc}y=2\bullet {y}_{max}-y& for& y>{y}_{max}\end{array}\) (10)
where the particle position along 𝑦 is limited above and below by the physical presence of the pipe wall. The parameters −𝑦𝑚𝑎𝑥 and 𝑦𝑚𝑎𝑥 coincide with the value of the pipe radius and take on a positive and negative value since the 𝑥 axis has been placed at the centre of gravity with respect to the cross-section of the pipe. By means of these two boundary conditions, the particles are not only prevented from escaping from the pipe but are also reflected, which prevents the particles from settling along the wall. These conditions are called the boundary reflection condition.
At this point, the contaminant concentration has been determined through Eq. (11), in which the concentration value at the previous time has been increased by an amount that corresponds to the concentration per unit of particles \(\left(C\bullet n\right)\) passing through the control volume \(\left(\frac{L}{\varDelta x}\bullet \pi \frac{{d}^{2}}{4}\right)\), where \(L\) is the pipe length, \(\varDelta x\) is the section number of the pipe, and \(\pi \frac{{d}^{2}}{4}\) is the cross-sectional area of the pipe.
\(C=C+\frac{C\bullet n}{\frac{L}{\varDelta x}\bullet \pi \frac{{d}^{2}}{4}}\)(11)
The models were applied to the experimental network of Enna University – UKE (see (Piazza, et al., 2020) for details) suitably calibrated previously from the hydraulic point of view.
The roughness coefficient (reported in the Table 2) was calibrated according to the flow rate measured upstream of the network (1.44 m3 / h) and the diameter of each pipeline, calculating and iterating the uniform flow rate in order to coincide with the measured flow rate upstream of the network. Numerous experimental tests were conducted on the network, varying the pressure set at the pumping system (3.5–4.5 bar) and the flow rates drawn from the network nodes (between 5 and 15 L/min for nodes 5, 8 and 11).
The Table 1, Table 2 and Table 3 show the calibrated roughness values of the pipes and the standard deviation[1] (σ) values determined for the pressures at the nodes 6, 7, 9, 10, the flow rates flowing into the network and the flow rates tapped at the nodes 5, 8, 11.
Table 1
Standard deviation between the pressures measured in the network and simulated numerically.
|
Node 6
|
Node 7
|
Node 9
|
Node 10
|
σ [mH2O]
|
0.01
|
0.15
|
0.05
|
0.09
|
Table 2
Pipes roughness and standard deviation between the flow rates measured in the network and simulated numerically.
|
Link 5
|
Link 6
|
Link 7
|
Link 9
|
Link 10
|
Link 11
|
Link 13
|
Roughness [mm]
|
1
|
1
|
1
|
1
|
1
|
1
|
1
|
σ [m3/h]
|
0.12
|
0.12
|
0.08
|
0.11
|
0.11
|
0.11
|
0.15
|
Table 3
Standard deviation between the measured tapped flow rates and the numerically simulated flow rates.
|
Node 5
|
Node 8
|
Node 11
|
σ [L/min]
|
0.45
|
0.07
|
0.07
|
To calibrate the relative dispersion coefficients for the two models used in the present study, three statistical parameters were determined: the Nash-Sutcliffe efficiency (NSE), Kling-Gupta efficiency (KGE) and coefficient of determination (R2). The Nash-Sutcliffe efficiency (NSE) coefficient (Nash and Sutcliffe, 1970) is a hydrology metric that measures how well a model simulation predicts an outcome variable. It is defined as one minus the ratio of the error variance of the modelled time series divided by the variance of the observed time series, as shown in Eq. (12):
\(NSE=1-\frac{\sum {\left({y}_{i}-{y}_{i, sim}\right)}^{2}}{\sum {\left({y}_{i}-\stackrel{-}{y}\right)}^{2}}\) (12)
where \({y}_{i}\) and \({y}_{i, sim}\) correspond to the measured and simulated values of the variable, respectively, and \(\stackrel{-}{y}\) is the average of the measured values of \(y\). If NSE = 1, there is a perfect correspondence between the model and the observed data; if NSE = 0, the model has the same predictive capacity as the average of the time series in terms of the sum of the square errors. If NSE < 0, the observed mean is a better predictor of the model.
The Kling-Gupta efficiency (KGE) coefficient (Gupta, et al., 2009) is a metric that measures the goodness of fit (Eq. (13)). It consists of three main components: the correlation coefficient between the observations and simulations \(r\), the ratio between the standard deviation of the simulated values and the standard deviation of the observed values, and the ratio between the average of the simulated values and the average of the observed values.
\(KGE=1-\sqrt{{\left(r-1\right)}^{2}+{\left(\frac{{\sigma }_{sim}}{{\sigma }_{obs}}-1\right)}^{2}+{\left(\frac{{\mu }_{sim}}{{\mu }_{obs}}-1\right)}^{2}}\) (13)
Similar to the NSE coefficient, KGE = 1 indicates perfect agreement between the simulations and observations. For KGE values < = 0, analogous to what the authors observed for NSE values, all negative values below the threshold KGE = 0 indicate results with poor model performance.
The coefficient of determination (R2) is a measure of the goodness of fit of a statistical model. It is defined as the squared value of the linear correlation coefficient. The R2 value ranges between 0 and 1. A value of zero indicates that there is no correlation between two data series. On the other hand, higher values of the coefficient indicate a better fit of the model. However, it is not always true that large R2 values result in a good model fit, as the linear correlation coefficient could produce a perfectly positive or negative relationship (Moriasi, et al., 2007).
2.2 Optimization problem
The optimization problem was solved by using the Monte Carlo method, which is based on probabilistic procedures and can solve problems that present analytical difficulties that would otherwise be difficult to overcome (Tarantola, 2004).
Conceptually, the method is based on the possibility of sampling an assigned probability distribution, F(X), using numbers drawn at random (random numbers); that is, the possibility of generating a sequence of events X1, X2..., Xn..., distributed according to F(X). Instead of using numbers drawn at random, the authors use a sequence of numbers obtained through a well-defined iterative process; these numbers are referred to as pseudorandom because, although they are not random, they have statistical properties similar to those of true random numbers.
The Monte Carlo method yields reliable answers in the study of complex real systems, although the solution obtained is never exact in a statistical sense, as it is subject to uncertainty, which decreases as the number of statistical samples increases.
In the present case, the Monte Carlo method was used to solve the optimization problem for the positioning of three water quality sensors within the laboratory network of the University of Enna “KORE”, which is fully described in the following section. Given the uncertainty in the position, magnitude and duration of the contamination, 1000 simulations were performed, with the contamination parameters (mass of contaminant, duration of contamination and contamination node) set at random. User demands in all nodes were fixed at 2.5 l/min. The inlet head was fixed to 1.5 bar. At the same time, an experiment was carried out in node 5, with a duration of 3 minutes and a mass of 460 grams (leading to a constant concentration of 4600 mg/l).
Three objective functions were used:
• F_1: Detection likelihood, i.e., the probability that a sensor configuration will detect the contamination;
• F_2: Detection time, i.e., the average time between contamination and detection in 200 simulations;
• F_3: Detection redundancy, i.e., the probability that the contamination is detected by two sensors within 20 minutes.
The objective functions were slightly modified from those presented in (Preis and Ostfeld, 2008) to comply with the smaller dimensions of the analysed network, and they were equally weighted in the selection of the optimal sensor location.
[1] Standard deviation of zero means that there is no variability between the data.