Groundwater monitoring at industrial waste facility sites can be expensive. Apart from the initial capital cost of installing the monitoring network, the cost of analysing the water quality parameters can outweigh the cost of sampling (personnel, logistics, and instrumentation), data processing, storage, and reporting. The analytical cost can be reduced if the water quality parameters that have a high risk of impacting the groundwater regime can be identified, so that both the parameters monitored and the frequency of monitoring are adjusted according to the risk of probable impact. The question is how historical and ongoing monitoring data can be statistically evaluated to identify these high-impact water quality parameters, and how the results can be applied to modify or optimize a groundwater monitoring programme in terms of which water quality parameters to monitor and how frequently.

Descriptive, bivariate, and multivariate statistical analyses are commonly applied in water resource sciences (Helsel and Hirsch, 1992) and in water monitoring reporting. Although these methods are useful for understanding the characteristics of different data sets, and for identifying differences and/or correlations between two or more independent groups of data (Wilcoxon, 1945; Helsel and Hirsch, 1992; Belkhiri et al., 2010; Belkhiri, Boudoukha and Mouni, 2011; ITRC, 2013; McQuitty, 2018; Das et al., 2019; de Andrade Costa et al., 2020; Elumalai et al., 2020), they cannot quantify the probability of impact on the natural water regime. If the probability of impact cannot be quantified, it is impossible to perform a quantitative risk assessment of which water quality parameters have the highest risk of probable impact. This is where the Monte Carlo analysis becomes a useful tool.

The Monte Carlo analysis is a method that uses random data sampling techniques to obtain a probabilistic approximation to the solution of a mathematical equation/model (U.S. Environmental Protection Agency, 1997). The statistical sampling techniques used are complex because (1) the input model/equation is simulated hundreds or thousands of times, where each simulation run is equally likely, and the result is a probability distribution of possible outcomes, and (2) the Monte Carlo method transforms numbers from a random number generator, and sequences of these transformed numbers will repeat after a certain number of samples (Glen, 2022). In other words, the Monte Carlo method approximates the range of possible outcomes and the probability of each outcome (U.S. Environmental Protection Agency, 1997; Glen, 2022).
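
As an illustration of this repeated-simulation principle, the following minimal Python sketch propagates uncertain inputs through a toy dilution model. The model, the input distributions, and all parameter values are hypothetical and chosen only for demonstration; they are not taken from the data or the model used in this paper.

```python
import random
import statistics


def simulate_concentration(n_runs=10_000, seed=42):
    """Run a toy dilution model n_runs times with randomly sampled inputs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    outcomes = []
    for _ in range(n_runs):
        # Hypothetical inputs: source concentration (mg/L) and dilution factor.
        source = rng.lognormvariate(3.0, 0.5)
        dilution = rng.uniform(5.0, 20.0)
        # Each run yields one equally likely outcome of the model.
        outcomes.append(source / dilution)
    return outcomes


outcomes = sorted(simulate_concentration())
median = statistics.median(outcomes)
p95 = outcomes[int(0.95 * len(outcomes)) - 1]  # empirical 95th percentile
print(f"median = {median:.2f} mg/L, 95th percentile = {p95:.2f} mg/L")
```

The sorted list of outcomes is an empirical approximation of the probability distribution of the model result, from which percentiles or other summary statistics can be read.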

The aim of the Monte Carlo analysis applied in this paper was to characterize, quantitatively, the uncertainty and variability in assessing possible impacts and risk. Because it can quantify the probability of an impact or risk, it generates scenarios or outcomes with numerical values that can be used in a quantitative risk assessment (U.S. Environmental Protection Agency, 1997; Glen, 2022).
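
To make the idea of a quantified probability of impact concrete, the sketch below estimates, for each water quality parameter, the probability that a simulated concentration exceeds a guideline limit, and ranks the parameters accordingly. It assumes, purely for illustration, that concentrations are lognormally distributed; the parameter names, measurements, and limits are hypothetical, and the function `exceedance_probability` is not the procedure used in this paper.

```python
import math
import random
import statistics


def exceedance_probability(samples, limit, n_runs=20_000, seed=0):
    """Estimate P(concentration > limit) by Monte Carlo sampling.

    Assumption: concentrations follow a lognormal distribution whose
    parameters are estimated from the historical samples.
    """
    rng = random.Random(seed)
    logs = [math.log(x) for x in samples]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)
    hits = sum(rng.lognormvariate(mu, sigma) > limit for _ in range(n_runs))
    return hits / n_runs


# Hypothetical historical measurements (mg/L) and guideline limits (mg/L).
history = {
    "sulphate": ([180, 220, 260, 310, 240, 290], 250),
    "chloride": ([40, 55, 48, 60, 52, 45], 300),
}

# Rank parameters from highest to lowest estimated probability of exceedance.
ranking = sorted(
    ((name, exceedance_probability(values, limit))
     for name, (values, limit) in history.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, prob in ranking:
    print(f"{name}: estimated exceedance probability = {prob:.3f}")
```

A ranking of this kind is one possible basis for deciding which parameters to keep in a monitoring programme and how often to sample them.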