Identification of source information for sudden hazardous chemical leakage accidents in surface water on the basis of particle swarm optimisation, differential evolution and Metropolis–Hastings sampling

A quick and accurate identification of source information on sudden hazardous chemical leakage accident is crucial for early accident warning and emergency response. This study firstly regards source identification problem of sudden hazardous chemical leakage accidents as an inverse problem and presents a source identification model based on the Bayesian framework. Secondly, a new identification method is designed on the basis of particle swarm optimisation (PSO), differential evolution (DE) and the Metropolis–Hastings (M–H) sampling method. Lastly, the designed method, i.e. PSO-DE-MH, is verified by an outdoor experiment analyses in a section of the South–North Water Transfer Project. Results show that the number of iterations, the average absolute error, the average relative error and the average standard deviations of the identification results obtained by PSO-DE-MH are less than those of PSO-DE and DE-MH. Moreover, the relative error and the sampling relative error of the identification results under five different measurement errors (MEs) (σ = 0.01, 0.05, 0.1, 0.15, 0.2) are less than 9.5% and 0.2%, respectively. The designed method is effective even when the standard deviation of the ME increases to 0.2. Therefore, the designed method can effectively and accurately obtain the source information of sudden hazardous chemical leakage accidents. This study provides a new idea and method to solve the difficult problems of emergency management.


Introduction
With the development of industrialisation and the acceleration of urban construction, the production, storage, transportation and utilisation of hazardous chemicals have also increased in the world (Zhao et al. 2018;Sun et al. 2019), which easily trigger sudden hazardous chemical leakage accidents (SHCLAs) in surface water Yoo and Choi 2019). SHCLAs have occurred in many countries, for example, in January 2014, a chemical leakage in West Virginia, USA (Hsu et al. 2017); in November 2015, water pollution in the Jialing River was caused by tailing leakage in Longnan City, China (Wang et al. 2015); in July 2016, water pollution in the Saskatchewan river was suffered by oil leakage, Canada (Debofsky et al. 2020); in May 2020, a diesel leakage accident occurred in Norilsk, Russia (Abramochkin et al. 2021); and so on. These accidents not only damage the water environment and threaten human health, but also affect social stability (Yu et al. 2015;Andrle and EI Badia 2015;Wang et al. 2019;Lei et al. 2019). However, it is difficult to effectively deal with SHCLAs depending on two aspects: one is that, in these accidents, there are a series of uncertainties, such as leakage time, leakage position, leakage intensity, damage degree and influence range; the other is that many hazardous chemicals, such as organic matter, heavy metals and bacteria, are difficult to be detected with the naked eye before causing harm . As a result, emergency decision-makers can effectively deal with SHCLAs only by grasping the law of hazardous chemical's migration and diffusion in surface water at the first time Lei et al. 2019). Amongst them, the historical reconstruction process of SHCLAs is the key to reflect the migration and diffusion law of hazardous chemicals in water and track the responsible persons (Liu and Wang 2020). Therefore, the ability to quickly and accurately identify the leakage source information of SHCLAs plays an important role in the process of formulating emergency disposal and rescue measures.
Identification method is a direct tool to measure the ability of source identification in sudden water pollution accident (SWPA) (Yang et al. 2016;Ashyralyev and Emharab 2018). According to the mathematical characteristics, the existing identification methods can be divided into four types, namely, the analytical method (Skaggs and Kabala 1995;Wu et al. 2020), regularisation method (Jana and Nair 2020;Ma et al. 2017), optimization method (Guan et al. 2006;Ayvaz 2016;Huang et al. 2018) and stochastic method (Yang et al. 2016;Wang et al. 2016;Jiang et al. 2018;Liu and Wang 2020;Jing et al. 2020). However, all four methods have their own advantages and disadvantages. Analytical method and regularisation method have high computational efficiency, but they sacrifice the precision of the partial solution. Optimization method can find the simulation results closest to the observed data by repeatedly calling the forward water quality model, but it cannot solve the uncertainties in the simulation process. Bayesian inference method is a more common stochastic method, which transforms emergency identification problems into the posterior estimation of unknown parameters on the basis of Bayesian inference and Markov chain Monte Carlo technique (Bayesian-MCMC) (Han et al. 2014;Yang et al. 2016;Jiang et al. 2017;Yu et al. 2020;Zhang et al. 2020). The identification results obtained by Bayesian-MCMC have strong randomness and avoid the decision risk caused by 'optimal' parameter distortion; however, its calculation increases exponentially with the increase in parameters (Verstegen et al. 2014;Zhang et al. 2017;Jiang et al. 2017).
Considering the urgency of dealing with SWPAs, some scholars have studied and designed some coupling methods based on optimisation method and stochastic method (Chen et al. 2007;Wang et al. 2016;Yang et al. 2016;Cao and Yun 2017;Wu et al. 2020). There are two types of coupling methods commonly used at present. One is hybrid optimization approach based on two or more different optimization methods, whereas the other is stochastic optimization approach based on stochastic method and optimisation method. For example, Cao and Yun (2017) studied single-point source identification problem and multi-point source identification problem by coupling particle swarm optimisation (PSO) and differential evolution (DE) and verified the accuracy of this designed method through several examples. Ayvaz (2016) analysed the advantage and disadvantage of genetic algorithm (GA) and reduced gradient method and developed a new hybrid optimization approach to identify groundwater pollution source. Yang et al. (2016) studied the multi-point source identification of SWPA by coupling DE and the Metropolis-Hastings (M-H) sampling method and verified the effectiveness and accuracy of the designed method by conducting outdoor experiments. Wu et al. (2020) explored the identification problem of SWPAs in rivers or lakes and designed a new identification method based on an adaptive Metropolis sampling method and DE and verified the accuracy of this designed method by conducting a case study. Although the first type of coupling method has better searching ability than single optimization method, it still cannot deal with uncertainties during the process of source identification; the other type of coupling method has the same ability to deal with uncertainties as stochastic method, but its searching ability needs to be strengthened. As typical SWPAs, SHCLAs have general characteristics similar to SWPAs but also have the characteristics of high risk, great harm, suddenly happening and difficult response (Chen et al. 2007;Wang et al. 2016). Therefore, in order to reduce the loss and impact of SHCLAs, it is necessary to design a faster and more accurate identification method.
Based on the above analysis and the urgency of emergency response to SHCLA, this study designs a new identification method on the basis of the M-H sampling method, PSO and DE. In the new method, a set of possible solutions is firstly searched from the solution space by PSO, and then the most likely solution set, which is close to the true value, is sampled from a set of possible solutions by the propagation idea of DE, and the identification results are finally obtained by the acceptance-rejection sampling idea of M-H from the most likely solution set. Therefore, the new approach not only uses the search ability of PSO and the propagation idea of DE to improve the accuracy and searching speed in the posterior space, but also uses the samples update strategy to reduce the uncertainty in the process of emergency identification problem. To verify the effectiveness and accuracy of the new method, this study takes outdoor experiment as a case study and comparatively analyses the identification results by using the new method, PSO-DE and DE-MH. The calculation results show that the new method effectively improves calculation efficiency and accuracy and thus meets the needs of emergency management.
The remainder of this paper is organised as follows: the "Problem description and model formulation" section describes the source identification problem of SHCLAs and proposes the emergency identification model. The "New identification method" section designs a new coupling method on the basis of the searching ability of DE and PSO under the Bayesian inference framework. The "Results and discussion" section verifies the new method through an outdoor experiment. The "Conclusions and future research" section summarises the conclusions and provides future researches.

Problem description and model formulation
In this section, the source identification problem of a SHCLA is first described, and then the laws of hazardous chemical migration and diffusion in water are discussed, and an emergency identification model is further established.

Description of source identification problem
A SHCLA that occurs in a river or a canal is taken as an example. We consider a section with a length of L and width of B, and the geometric shape of the section does not change much. The section has p observation points, and q j (t) is used to represent the concentration distribution function of water quality index obtained from the jth (j=1, 2, …, p) observation point at time t. S represents the leakage source information of hazardous chemicals (e.g. intensity, position and leakage time), as shown in Figure 1. Figure 1 shows that once some observation values suddenly appears abnormal, that is, once SHCLA occurs, emergency decision-makers urgently need to identify leakage source information S and then formulate emergency response measures to ensure the safety of water quality downstream. In other words, the source identification problem of a SHCLA is to study how to obtain the leakage source information S by the observation values q j (t) quickly and accurately.

Emergency identification model
In order to accurately simulate the migration and diffusion of hazardous chemicals in surface water after a SHCLA occurs, a two-dimensional water quality equation can be expressed as where C(x, y, t) is the concentration of hazardous chemicals in position (x, y) at t moment (g/L); u x and u y represent the longitudinal and transverse velocity of water flow (m/s), respectively; D x and D y are the longitudinal dispersion and transverse diffusion coefficients (m 2 /s), respectively; ζ(t) is the degradation function of hazardous chemicals; and S is the sources-sink terms.
However, the diffusion of hazardous chemicals in the water body is restricted by boundary and reflection due to the constraint of shore and bottom. We assume that the initial conditions are C=C 0 | x=0 and C=C 0 | t=∞ . If the bilateral reflection on both sides of the water body is considered and the leakage position of SHCLA is taken as the origin point, then the solution of Eq. (1) is where m is the leakage intensity of hazardous chemicals, g; b is the distance between the leakage point and the shore, m; B and h are respectively the width and average depth of the water body, m; n is the number of reflections across the boundary; δ is the degradation coefficient of hazardous chemicals; x and y are respectively the longitudinal distance and transverse distance from the predicted point to the leakage position, m; C 1 (x, y, t), C 2 (x, y, t) and C 3 (x, y, t) are the concentration increments generated at point (x, y) by the leakage source of hazardous chemicals, reflection near the shore and reflection far from the shore, respectively; and C h is the background concentration of water.
SHCLAs have many leakage modes, and one of which is instantaneous non-shore leakage mode (e.g. the leakage accident occurs when a vehicle carrying hazardous chemicals is driving on the bridge due to traffic accidents, machine malfunction or other reasons). If the background concentration of the water body is ignored, then C(x, y, t) can be expressed as If leakage mode is an instantaneous shore leakage, then C(x, y, t) can be expressed as If there are p (p≥1) observation points, d={d 1 , d 2 ,…, d p }, g={g 1 , g 2 ,…, g p } and h={h 1 , h 2 ,…, h p } respectively denote the set of observation values, prediction values and true values, then there exist measurement errors (MEs) and prediction errors (PEs). Due to the characteristics of symmetry, unimodality, boundedness and offsetting, the ME and PE are usually assumed to obey the normal Gauss distribution (Guo et al. 2009;Kastner et al. 2013). Therefore, the probability density functions of ME and PE at the ith observation point are respectively expressed as where σ d.i and σ g.i are respectively the standard deviations of the ME and PE at the ith observation point.
Normally, all observation points run independently, and then the likelihood function can be described as Eq.
Start Initializing position and volcity the particle swarm Calculating the fitness of each particle Updating pbest and gbest according to fitness, updatIng particle position and velocity The maximum likelihood estimator (MLE) is a method that can be used to estimate parameters by using observation values. Its principle is to find a parameter vector, which maximises the probability of observation values (Zeunert and Günter 2020). To improve the evaluation and convergence of parameters, the MLE is usually determined by minimising the negative log-likelihood function (MNLL) (Carrera and Neuman 1986b). Therefore, the emergency identification model of SHCLA can be expressed as If we substitute Eq. (6) or Eq. (7) into Eq. (10), then S can be identified or estimated by adopting the M-H sampling method. However, some limitations, such as large amount of computations and low sampling efficiency, exist. To obtain more accurate source information of the SHCLA, M-H needs to adjust the Markov chain to reach the convergence domain through a vast number of calculations, and this requirement reduces the identification efficiency and then affects the emergency effect of SHCLAs. A vast number of calculations outside the convergence domain can be avoided if we can improve the efficiency of M-H. Given that DE and PSO have strong global and local searching ability respectively, a new identification method that combines PSO, DE and M-H sampling is designed to identify the source of SHCLA in this study.

New identification method
This new identification method is designed by combining PSO, DE and M-H sampling on the basis of Bayesian inference. To explain the proposed identification method, we firstly discuss the procedure of PSO, DE and M-H, then explore the searching mechanism and operation process of the new method and finally discuss the index to diagnose the convergence of the new method.

Particle swarm optimisation
PSO, which was proposed in 1995, is an evolutionary algorithm based on the behaviour of birds (Eberhart and Kennedy 1995a). The basic idea of PSO is to find the optimal solution through the cooperation and information sharing amongst individuals in the population (Eberhart and Kennedy 1995b). In contrast to other evolution algorithms, PSO is easy to implement with few evolution parameters (Gupta et al. 2016;Jain et al. 2018). In PSO, it firstly initialises a group of random particles, where each particle has its position and velocity (Tang and Guan 2007;Hou and Jiang 2017). Then, the optimal solution is found by updating the iteration. In each iteration, the updated velocity of each particle is determined mainly by p best and g best , where p best denotes the best solution found so far by a particle in the position and g best indicates the best value found so far by all the particles in the population. Figure 2 shows the procedure of PSO.
The convergence speed of PSO is fast. Thus, this method can be applied to solve various problems with its unique advantages. However, it easily stagnates, and the convergence accuracy is low in the process of convergence.

Start
generaing randomly a new sample S (*) Does these samples meet with convergence criteria?

Initializing population
Transforming Parameter s coordinate

Differential evolution
DE is a population-based evolutionary algorithm that searches solutions randomly over a continuous space through real vector coding (Storn and Price 1997). The core idea of DE is to obtain the mutation operator by the difference of multiple pairs of vectors selected arbitrarily in the population (Ali and Törn 2004). Figure 3 illustrates the procedure of DE. Given the advantage of greedy optimisation, DE improves optimisation performance, but it greatly reduces the ability of the population to resist the local extremum attraction. As a result, DE is more suitable for accelerating convergence in the later state of the algorithm.

Metropolis-Hastings sampling method
M-H is a method to estimate unknown parameters by constructing a Markov chain (Raje and Krishnan 2012). To make the method ergodic in solution space, new samples are usually generated on the basis of a proposal distribution (Brooks and Roberts 1998). Figure 4 illustrates the procedure of M-H.
However, the sampling efficiency of M-H is not high because the acceptance rate of the constructed Markov chain during the transfer process may be small.

PSO-DE-MH
Due to numerous uncertainties that exist in the source identification problem of SHCLAs, the identification results obtained by the existing identification methods have poor efficiency or low accuracy. Therefore, to improve the efficiency and accuracy of identification results, a new method (i.e. PSO-DE-MH) is proposed in this section by combining PSO and DE with M-H on the basis of different searching mechanisms.
For M-H, the calculating efficiency and accuracy of the identification results largely depend on the convergence rate of the Markov chain. However, the convergence rate of the Markov chain is slow, especially for complex or highdimensional models since M-H need to repeatedly select samples from the proposal distribution space to obtain the accurate target region.
PSO and DE exactly improve the convergence rate of the Markov chain constructed by M-H. For example, the new testing parameters are examined by updating the position and velocity of particles based on the movement rule of PSO, or they are determined by mutation and crossoperation based on the propagation idea of DE. To consider the uncertainties of source identification problem fully and ensure that the parameter space is searched efficiently, we firstly search the possible solutions that are close to true values on the basis of PSO, and then the possible solutions closer to the true values are obtained by an improved mutation formula based on the propagation idea of DE. Therefore, the new testing parameters X pbest (*) are generated in accordance with true values *

Yi iter+1
Yi iter X (i) * * * * * * (X (i) (r 2 ) X (i) (r 3 )) * Parameter vectors Yi iter from the iterth movement by PSO Fig. 5 Process of generating new testing parameters X (*) with three dimensions by the particle movement formula of PSO and the revised mutation formula DE where X i,best is the ith generation population consisting of the individual optimal Y best of all particles generated by PSO; r 1 , r 2 and r 3 are mutually different integers in the ith generation population; E is a scaling factor of the difference vector and is a positive real number; and ε 2 is the given disturbance that reflects the uncertainties existing in the identification problem.
Calculating the fitness of each particle Calculating the velocity parameter and updating particle position Calculating individual optimal and global optimal according to fitness Does iter=iter max or satisfy the conditions?

Yes No
Determining Inertia Weight w Determining learning factor c1 and c2 Initializing the particle swarm Fig. 6 Operation process of the PSO-DE-MH In Eq. (11), E and ε 2 are the tuning parameters. E has a great influence on identification solutions. A small E converges prematurely and falls into local optimum, whereas a large E easily slows down the searching speed and skips global optimal solutions (Vitaliy 2006). In addition, if ε is too large, then it reduces the accuracy of identification solutions (Roberts and Rosenthal 2004). An interaction exists between E and ε 2 . When we choose a large value of E and ignore ε 2 , the searching process becomes random; when ε 2 is too large, it cannot reflect the influence of E on the sampling method. Therefore, neither E nor ε 2 should be set too large or too small. Regarding the proposed problem in this study, we choose E as a constant value in [0, 2] and set ε 2 within 1-20% of the range of prior distribution (Storn and Price 1997;Roberts and Rosenthal 2004). Moreover, the size of NP affects the searching efficiency. Generally, to maintain the balance between variety of convergence rate and population, the size of NP should be set to [10,50], where NP∈[10, 35] is suitable for lowdimensional problems and NP∈(35,50] for high-dimensional problems (Mohamed and Sabry 2012). Figure 5 shows the process of generating new testing parameters with three dimensions by the particle movement formula of PSO and the revised mutation formula DE. X (i) (j) in Figure 5 represents the individuals in the jth population with the ith iteration, j∈ [1, NP].
To coordinate the global and local search capability, the new method adopts some nonlinear change strategies, including inertial weight, learning factor and scaling factor. The detailed procedure of the new method to solve this considered problem is as follows: (i) Initialising the particle swarm In accordance with the numbers of S, the population size (NP), the maximum number of movement iter max and the maximum sampling times I max are determined, and the velocity V i iter and position Y i iter of the ith particle swarm (i = 1,2,…,I max ; iter=1, 2, …, iter max ) are initialized within the proposal probability distribution space of unknown parameters.
(ii) Calculating the inertial weight To make the particle swarm have good global and local search ability at the initial and later stages of iteration, respectively, a dynamic inertial weight is adopted, as shown in Eq. (12).
where iter is the number of particle swarm movement, w max is the maximum value of inertial weight and w min is the minimum value of inertial weight.

(iii) Calculating the time-varying learning factor
The learning factors c 1 and c 2 affect the individual optimum and global optimum of each particle, respectively. Venter and Sobieszczanski-Sobieski (2004) denoted c 1 as x-coordinate of source location 5000m y 0 y-coordinate of source location 21m t 0 Leakage time 120min self-confidence and c 2 as swarm-confidence. The smaller learning factor limits the movement of the particle, whereas the larger learning factor causes the particle to be divergent. A time-varying learning factor is adopted in the designed method, as shown in Eq. (13).
where c 11 , c 12 , c 21 and c 22 are constants.
(iv) Calculating and updating the fitness, position and movement speed of the particle swarm where ε 1 is disturbance factor, Y iter i;pbest is the individual optimum of the particle swarm after the iterth movement and Y iter i;gbest is the optimal value of Y iter i;pbest .
(v) Repeating Steps (ii)-(iv) until iter = iter max or Y iter i;gbest satisfies the minimum limit (vi) Combining the individual optimum Y iter i;pbest (iter = 1, 2, …, iter max ) to form the ith iteration population X (i) and calculating the concentration of hazardous chemical and conditional probability density of X (i) (vii) Selecting randomly three different particles from a new particle swarm and then performing mutation in accordance with Eq. (15) to obtain a new test parameter X (*) where E is a scaling factor; ε 2 is disturbance factor; and r 1 , r 2 and r 3 are respective mutually different integers in the ith generation population.  (viii) Obtaining the accept probability A(X (*) (r 1 ), X (*) ) at which the Markov chain moves from p(X (*) (r 1 )) to p(X (*) ) by Eq. (16) (ix) Generating a random number R (R∈[0,1]) that follows an uniform distribution If R < A(X (*) (r 1 ), X (*) ), then the testing parameters are accepted, and X (i) = X (*) ; otherwise, X (i) = X (*) (r 1 ).

(x) Repeating Steps (i)-(ix) until the predetermined itera-
tions are completed Figure 6 shows the detailed operation process of PSO-DE-MH.

Convergence diagnostic method
Scale reduction score (SRS) is a method to judge the convergence of an algorithm by calculating a quantitative diagnostic indicator (Gelman and Rubin 1992). The process of the method is as follows: where x ij is the jth sample value of the ith sampling sequence.
(iii) Calculating the average value of the k within-sequence variances W by Eq. (18) (iv) Calculating an index called scale reduction score (SRS) by Eq. (19) ffiffiffiffiffiffiffiffi (v) If ffiffiffiffiffiffiffiffi SRS p ≈1, then the generated samples converge to the posterior distribution of the unknown parameters. Otherwise, the generated samples do not converge to the posterior distribution.

Outdoor experiment analyses
As we all know, China is a country with extremely uneven water resources. In order to balance water resources, China has built the South-North Water Transfer Project (SNWTP) since 2002, which is divided into East Route Project (ERP), Middle Route Project (MRP) and West Route Project (WRP), as shown in Figure 7a. The starting point of MRP is the Danjiangkou Reservoir; its water supply area comprises the Henan Province, Hebei Province, Beijing City and Tianjin City . The total distance of MRP is 1276 km, which includes many hydraulic structures, such as over 1300 bridges, 88 diversion sluice gates, 53 exist sluice gates, 26 aqueducts and so on. Moreover, MRP is constructed in the form of three-dimensional cross arrangement, along which there are 1640 cross buildings, such as river channel crossing, left bank drainage, canal crossing, railway crossing and  highway crossing. It is worth noting that, according to the survey of China South-to-North Water Diversion Group, more than 600 pollution sources along the MRP are prone to the occurrence of SHCLAs. To verify the proposed method, we choose 3-km canal section as the experimental canal, which is along the direction of DongyangGe to the bridge Baiyun in Baoding City, Hebei Province, nearby some hydraulic buildings such as river channel crossing building, inverted siphon and sluice gate, and take DongyangGe sluice gate as observation point D, as shown in Figure 7b. During the experiment, we use sucrose as tracer; the system and pollutant characteristics are shown in Table 1. The first time the studied section was discovered to be polluted is set as 10:00 am, and the measurement is conducted every 10 min thereafter until 2:00 pm. The sequence values of sucrose concentration at observation point D are shown in Figure 8. Figure 8 shows that the sequence value of sucrose concentration at observation point D are approximately normally distributed, and the highest value is approximately 3.1777 g/m 3 occurring at approximately 11:50 am.

Analysis of emergency identification results
In accordance with the characteristics of SHCLA and the prior information of the leakage source, all unknown parameters are uniformly distributed and independent of each other. Then, the posterior probability density function of the unknown parameters can be expressed as ; 0:4 Â 10 6 < m < 1:6 Â 10 6 ; 0 < x 0 < 10 4 ; 0 < y 0 < 40; 0 < t 0 < 200 0 else To simplify the calculations, only ME is considered in this section and is assumed to follow a Gaussian distribution with σ d.i = 0.01. Moreover, we set the parameters E, NP and ε as stated in the rule in the "PSO-DE-MH" section. After repeated computations, we set E = 0.1, NP = 10, ε 1 = ε 2 = 0.01, w max = 0.9, w min = 0.4, c 11 = c 12 = 1.5, c 21 = c 22 = 0.5, iter max = 3000 and I max = 10,000. In accordance with the operation process of PSO-DE-MH, the frequency distribution and the iterative curve of pollution sources information are illustrated in Figures 9, 10 and  As shown in Figure 9a, the generated samples of source intensity m almost fall in [0.995 × 10 6 , 1.035 × 10 6 ]. Amongst them, 75.52% of the generated samples fall in [0.995 × 10 6 , 1.015 × 10 6 ]. Figure 9b shows that the generated samples of leakage time t 0 almost fall in [120,125]. Amongst them, 61.21% of the generated samples fall in [120,123]. According to Figure 10, the generated samples of leakage position (x 0 , y 0 ) almost respectively fall in [4990, 5130] and [21.05, 22.45]. Amongst them, over 60 percent of generated samples respectively fall in the [4990, 5080] and [21.05, 21.85]. According to Figure 11, the identification results approach the truth values of the parameters after approximately 300 iterations, and the constructed Markov chain has the characteristics of stable and piecewise smooth. As shown in Table 2, if we adopt PSO-DE-MH to solve the identification problem of hazard chemical leakage source, then the relative  error and the sampling standard deviation of unknown parameters are less than 1.3% and 1.4, respectively. Therefore, the identification results with the proposed method are close to the true values, and the generated samples also converge to the posterior distribution.

Comparative analysis with PSO-DE and DE-MH
To verify the effectiveness of PSO-DE-MH, DE-MH and PSO-DE are used in this study to solve this emergency identification problem of SHCLA simultaneously. The iterative curve of the unknown parameters is shown in Figures 12  and 13, respectively. We perform the following operations for the above three methods: (i) the identification results were analysed after eliminating the unstable results, as shown in Table 3. (ii) The calculation errors of the unknown parameters were extracted at intervals, as shown in Figure 14.

Analysis of ME impact on identification results
For SHCLAs, the effectiveness of the identification method largely depends on the constructed likelihood function, which is directly related to the distribution of ME. Therefore, the effects of ME on the identification results must be discussed (Lu et al. 2019). The identification results are obtained with limited sampling; thus, the sampling relative error (SRE) can be used to measure the effects of ME on the identification method (Cao et al. 2015) and can be expressed as where σ is the standard deviation of ME, V is the true value and n 1 is the number of sampling.
To analyse the effects of ME on the identification results, we choose five different standard deviations of ME (0.01, 0.05, 0.1, 0.15 and 0.2) to discuss the effects on the identification results. The mean and standard deviation of the leakage intensity m, leakage positions (x 0 , y 0 ) and leakage time t 0 under the five different standard deviations of ME are calculated respectively by PSO-DE-MH, as shown in Table 5. The relative error and SRE with five different standard deviations of ME are also calculated on the basis of the proposed method, as shown in Figure 15 and Figure 16.
As shown in Table 5, Figure 15 and Figure 16, the standard deviation, relative error and SRE of the identification results with five different standard deviations of ME are less than 14, 10% and 2%, respectively. Figure 15 shows that the relative errors of unknown parameters increase with the standard deviations of ME. When the standard deviation of ME is greater than 0.1, the effects on m and x 0 are maximum and minimum, respectively. Moreover, the relative errors of identification results are greater than 1%. The reason is that when σ ≥ 0.1, ME has a significant effect on the information of leakage source through the accept probability of the proposed method.
If ME is larger, the accept probability is close to 1, which leads to the samples being far away from the true values. When σ < 0.1, the effects on y 0 and x 0 are respectively maximum and minimum. Moreover, the relative errors of the identification results are less than 2.5%. The reason is that when σ < 0.1, ME mainly affects the accuracy of the identification results by the proposal distribution interval. If the range of proposal distribution of unknown parameters is small, then the identification results are affected by ME. Figure 16 shows that SRE also increases with the increase of the standard deviation of ME; the increasing speed of SRE is in turn leakage intensity, leakage position and leakage time. Since ME affects the sampling effectiveness of the proposed method by the accept probability, ME is an important factor that affects the accuracy of identification results. Therefore, PSO-DE-MH remains effective even when the standard deviations of ME increase to 0.2.

Conclusions and future research
For SHCLAs in the surface water, emergency rescue depends on the accuracy and effectiveness of the identification results of leakage sources. In this study, a new method called PSO-DE-MH is designed by integrating the advantages of PSO, DE and MH. The proposed method is tested with an outdoor experiment in a section of SNWTP. The results show that the proposed method can effectively improve the calculation efficiency and the accuracy of source information for SHCLA. The specific conclusions are as follows: (i) New samples generated by the proposed method can improve the accuracy of identification results. In contrast to the M-H sampling method under the Bayesian framework, PSO and DE are used in this study to ensure that the samples can be quickly sampled around the true values. Hence, PSO-DH-MH does not need to have high approximation accuracy in the entire research area and only needs to have high accuracy in important areas, especially near the true values. This approach not only reduces the calculation cost but also improves the accuracy of inversion.
(ii) One great advantage of PSO-DE-MH is its low computational cost. This study shows that the identification results obtained by the proposed method quickly stabilise after the start of iteration. The more accurate the prior information of the leakage source is, the higher the effectiveness of the sampling by PSO-DE-MH is. (iii) Another great advantage of PSO-DE-MH is its strong noise resistance. The uncertainty of information of leakage sources increases with the increase in ME. This study shows that the average relative error and the average mean error of the leakage source information by PSO-DE-MH are 0.57% and 1.24% when the standard deviation ME is 1%. Even if the standard deviation of ME is increased to 20%, PSO-DE-MH can still accurately identify the leakage source's information.
In conclusion, compared with the existing methods, the proposed method has better noise immunity and thus can be used to solve the leakage source identification problem of SHCLAs. Considering the increasing complexity of water environments and the diversity of observation data, how to construct a more practical emergency identification model based on complex hydrodynamic water quality models and diversity data must be the focus of further research.