Artificial Neural Network (ANN) Model
ANNs, which are extensively employed for predicting valuable outputs from nonlinear variables, are shaped by three fundamental components: architecture, activation functions, and training algorithms21. The network comprises input, hidden, and output layers. The input layer accepts the variables and transmits them onward, while the hidden layer conveys transformations to the output layer, which generates the final output of the structure. The layers are interconnected through nodes (neurons), each performing a distinct nonlinear activation function. A hidden node produces an intermediate output by computing a weighted sum of its inputs and then transforming it with a transfer function. Hidden nodes pass data to connected nodes in the next layer until the output layer completes the process by producing the final output22. Factors such as the number of layers, the number of neurons, and the type of activation function employed significantly influence the performance of the ANN model; careful consideration and selection of these parameters are therefore crucial when constructing a model for a specific application23–25.
ANNs represent complex computational frameworks of a distributed nature, characterized by multiple processing elements operating concurrently. Within this intricately structured system, the interconnected components can autonomously adapt their connection strengths during the learning process. The primary aim of this research was to predict benzene concentrations in surface water sources using MATLAB (2019b) mathematical software. The ANN employed in this analysis consists of an input layer, a hidden layer, and an output layer, each comprising multiple neurons, as depicted in Fig. 1. To prevent numerical overflow arising from excessively large or small weights, the input and output data were normalized to the range between 0 and 1, as shown in Eq. (1)26:
$${x}_{\text{norm}}=\frac{{x}_{i}-{x}_{\text{min}}}{{x}_{\text{max}}-{x}_{\text{min}}}$$
1
In this context, the normalized value (xnorm) is calculated based on the original data (xi) using the maximum (xmax) and minimum (xmin) values. This process ensures that the scaled data fall within the range of 0 to 1.
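The scaling in Eq. (1) can be sketched as follows. Note that the study performed this step in MATLAB; the NumPy version below, with hypothetical sample values, is only an illustration of the same formula.

```python
import numpy as np

def min_max_normalize(x):
    """Eq. (1): scale an array to [0, 1] via (x - x_min) / (x_max - x_min)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)

# Hypothetical raw measurements (values are illustrative, not study data)
raw = np.array([2.0, 5.0, 8.0, 11.0])
norm = min_max_normalize(raw)
# The smallest value maps to 0 and the largest to 1.
```

Applying the same x_min and x_max recorded on the training data to new inputs keeps all data on a consistent scale.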
For this study, the feed-forward back propagation network (FFBPN) algorithm, initially proposed by Rumelhart, Hinton, and Williams27, was employed. The FFBPN algorithm is highly effective at learning in ANNs; it operates by propagating the error from the output layer back through the hidden layer to the input layer of the network to achieve the desired final outputs. The algorithm utilizes the gradient descent technique to calculate the network's weights and adjusts the interconnection weights to minimize the output error, as shown in Eq. (2)27:
$${W}_{ix}^{m}={W}_{ix}^{m-1}+{\Delta W}_{ix}^{m}={W}_{ix}^{m-1}+\eta \times {\delta }_{x}^{n}\times {A}_{i}^{n-1}$$
2
In Eq. (2), the connective weight (Wix) represents the weight associated with a particular connection, while η denotes the learning rate that influences the weight adjustment process. The error signal (\({\delta }_{x}^{n}\)) and the output value of the sublayer (\({A}_{i}^{n-1}\)) also play crucial roles in determining the new weight values. The summation function is employed to compute the weighted sum of all the input signals, serving as the initial step in the network's computation process, as described in Eq. (3)28:
$$f\left(x\right)={\sum }_{i=1}^{n}\left({W}_{ix}\times {a}_{i}\right)$$
3
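The weighted summation of Eq. (3) and the gradient-descent weight update of Eq. (2) can be sketched for a single neuron as below. This is a minimal illustration with assumed values for the learning rate and error signal, not the study's MATLAB implementation.

```python
import numpy as np

def weighted_sum(weights, inputs):
    """Eq. (3): sum of each input signal multiplied by its connection weight."""
    return float(np.dot(weights, inputs))

def update_weights(weights, inputs, delta, eta):
    """Eq. (2): W_new = W_old + eta * delta * A (one gradient-descent step)."""
    return weights + eta * delta * np.asarray(inputs, dtype=float)

w = np.array([0.5, -0.2])          # connection weights (illustrative)
a = np.array([1.0, 2.0])           # outputs of the previous layer
s = weighted_sum(w, a)             # 0.5*1.0 + (-0.2)*2.0 = 0.1
w_new = update_weights(w, a, delta=0.05, eta=0.1)
```

A larger learning rate eta takes bigger steps per update, which speeds convergence but risks overshooting the error minimum.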
In this paper, a hyperbolic tangent sigmoid transfer function was used in the hidden layer and a linear transfer function was used for the output layer29. To determine the ideal architecture, neural networks were trained using varying iteration numbers (epochs). The dataset was subjected to random partitioning, resulting in three separate subsets: 70% for training, 15% for validation, and the remaining 15% for testing purposes.
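The architecture just described (tansig hidden layer, linear output layer, 70/15/15 random partition) can be mirrored in a short NumPy sketch. The dataset size and the hidden-layer width of 10 neurons below are assumed for illustration; the actual work was done in MATLAB.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 100 samples with 19 input parameters (sizes assumed).
X = rng.random((100, 19))

# 70/15/15 random partition into training, validation, and test subsets.
idx = rng.permutation(len(X))
n_train, n_val = int(0.70 * len(X)), int(0.15 * len(X))
train_idx, val_idx, test_idx = np.split(idx, [n_train, n_train + n_val])

# Forward pass: hyperbolic tangent sigmoid (tansig) hidden layer,
# linear (purelin) output layer, with randomly initialized weights.
W1 = rng.standard_normal((19, 10)) * 0.1   # 10 hidden neurons (assumed)
b1 = np.zeros(10)
W2 = rng.standard_normal((10, 1)) * 0.1
b2 = np.zeros(1)

hidden = np.tanh(X[train_idx] @ W1 + b1)   # tansig squashes into (-1, 1)
y_hat = hidden @ W2 + b2                   # linear output: no squashing
```

The validation subset is used to stop training before the network overfits, while the test subset measures generalization on data never seen during training.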
Study Area and Data Collection
In this study, measurement data for wastewater parameters monitored at the AIWWTP were collected over a 9-month period. Wastewater samples for these parameters were collected daily by experts from the facility and analyzed in an accredited laboratory. A total of 19 parameters were utilized for the ANN. These parameters are listed in Table 1.
Model Performance Evaluation
The purpose of the performance evaluation of the trained ANN model was to assess the quality of the developed model. To achieve this, several statistical measurements were considered when evaluating the performance of the ANN model. These include the mean squared error (MSE) and the coefficient of multiple determination (R²), given by Eq. (4) and Eq. (5), respectively30:
$$\text{M}\text{S}\text{E}=\frac{1}{N}\sum _{t=1}^{N}{({Y}_{t}-{\widehat{Y}}_{t})}^{2}$$
4
$${\text{R}}^{2}=1-\frac{\sum _{t=1}^{N}{({Y}_{t}-{\widehat{Y}}_{t})}^{2}}{\sum _{t=1}^{N}{({Y}_{t}-\overline{Y})}^{2}}$$
5
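Both metrics can be computed directly from the observed and predicted series, as in the sketch below (the sample values are hypothetical). Conventionally, the R² denominator sums the squared deviations of the observations from their own mean, so a model matching the mean of the data scores 0 and a perfect model scores 1.

```python
import numpy as np

def mse(y, y_hat):
    """Eq. (4): mean of squared differences between observed and predicted."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean((y - y_hat) ** 2))

def r_squared(y, y_hat):
    """Eq. (5): 1 - SS_res / SS_tot, with SS_tot taken about the observed mean."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical observed vs. predicted values
y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
error = mse(y_obs, y_pred)       # small residuals -> small MSE
fit = r_squared(y_obs, y_pred)   # close to 1 for a good fit
```

A lower MSE and an R² approaching 1 together indicate that the trained network's predictions track the measured values closely.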