A Single Hidden Layer Artificial Neural Network Model that Employs Algebraic Neurons: Algebraic Learning Machine

Artificial neural networks (ANNs) have been employed successfully because of their high modeling capability, and many versions of the ANN have been proposed to increase this capability. Since the ANN is based on the biological neural network system, the only mathematical operation in its nodes is summation (or subtraction, when the coefficients are negative). This research was done to investigate the application of other mathematical operations, namely multiplication, division, logarithm, and exponential, in the nodes. Based on this idea, a novel single hidden layer feed-forward artificial neural network (SLFN) model, called the algebraic learning machine (ALM), was proposed. The proposed ALM was evaluated and validated with 60 different benchmark datasets. The obtained results were compared with the results obtained by the extreme learning machine (ELM), the randomized artificial neural network, the random vector functional link, and the back-propagation trained SLFN methods. The achieved results show that the proposed method is successful enough to be employed in classification and regression.


Introduction
Learning is one of the major abilities of humanity [1]. Therefore, many artificial learning methods have been built with the aim of modeling human learning mechanisms.
One of the popular machine learning methods is the artificial neural network (ANN), which is based on the human neural network [2][3][4].
ANN has been employed in many applications since 1943, in fields such as medicine, economics, and robotics [5]. The major reason behind the success of ANN is its high capability of adapting to model not only linear but also nonlinear problems. The adaptation of an ANN means determining the optimal values of its free parameters, and this determination stage is called training. In order to increase the adaptation capability of ANN, various ANN structures (e.g., feed-forward, recurrent, single hidden layer feed-forward) and ANN training methods (gradient-based methods, randomized methods) have been proposed [6][7][8][9].
In each of them, the inputs (and, in some cases, recurrent outputs) are weighted according to the weights learned by the employed training method; these weighted inputs are summed (in some cases subtracted, because of negative weights) and biased [10,11]. Finally, the nonlinearity of the ANN is gained by applying the transfer (activation) function to the obtained sum. As seen in this basic process, the main algebraic operations in an ANN are summation and subtraction, as reflected throughout the existing ANN literature [11][12][13].
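In symbols, this conventional node computation can be written as follows (a standard formulation; the symbols here are generic and are not taken from the cited works):

$$y = f\!\left(\sum_{i=1}^{d} w_i\, x_i + b\right),$$

where $x_i$ are the $d$ inputs, $w_i$ the learned weights, $b$ the bias, and $f(\cdot)$ the activation function.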
Based on the summation operator, it can easily be said that the existing ANN methods can be employed successfully in modeling a system or a phenomenon that has a summation relationship between its input variables (parameters). To the best knowledge of the author, there is no research in the literature that has employed any algebraic operation other than summation or subtraction. Although there is a large and growing literature on the success of ANNs that apply summation and/or subtraction in the nodes [8][14][15][16], the result of employing different algebraic operations in the nodes has not been investigated yet.
The research question in this study is whether systems that may have other algebraic relationships between their input variables/parameters, such as multiplication, division, exponential, and logarithmic relationships, can be adapted successfully. The motivation behind this paper is to seek the answer to this question. To evaluate and validate the effectiveness of applying various algebraic operations in the nodes, a novel single hidden layer feed-forward (SLFN) ANN structure, called the algebraic learning machine (ALM), was proposed, and the proposed ALM was trained by the methodology of the randomized ANN [17]. The proposed approach was validated against the back-propagation trained feed-forward artificial neural network (BPN), generalized regression artificial neural network (GRN), randomized artificial neural network (RNN), random vector functional link (RVFL), and extreme learning machine (ELM) methods [17-21].
The main contribution of this paper is employing various algebraic operators in the nodes and, by this means, increasing the modeling capability of traditional ANNs. In order to validate the proposed approach, 60 benchmark datasets (30 of them classification and the others regression datasets) were employed according to the methodology given in Section 2. The obtained results, in terms of success and processing time, are given and discussed in Section 3, and the last section concludes the paper.

Material
To validate the proposed approach, 30 benchmark classification datasets, whose details are given in Table 1, were employed in the tests. The first 5 datasets, which are the Lithuanian, Highleyman, banana-shaped, spherical, and multi-class datasets, are synthetic datasets [22], while the other datasets given in Table 1 are real-world datasets. The employed 30 benchmark regression datasets are summarized in Table 2.

Algebraic learning machine (ALM)
The proposed ALM is built on a single hidden layer feed-forward artificial neural network (SLFN), as given in Figure 1 [21]. As seen in Figure 1, there are different types of neurons in the hidden layer, namely summation, production, exponential, logistic, and trigonometric neurons. Each of them performs a different algebraic operation at its core, whereas conventional neurons perform only summation or subtraction operations. The algebraic operators employed in the cores of the hidden-layer neurons of ALM are not limited to the summation, production, exponential, logistic, and trigonometric operators; researchers may use any algebraic operator that may represent the modeled system or phenomenon. The output of each type of neuron in the hidden layer, shown in Figure 1, can be calculated as follows.
For the summation neurons, the output is

$$h_j^{\Sigma} = f\!\left(\sum_{i=1}^{d} w_{i,j}\, x_i + b_j\right),$$

and the production, exponential, logarithmic, and trigonometric neurons are obtained by replacing the summation operator at the core with the corresponding algebraic operator. Here $x_i$, $h_j$, $d$, $f(\cdot)$, $w_{i,j}$, and $b_j$ are the input, the output of the $j$-th neuron in the hidden layer, the number of features in the input, the activation (transfer) function, the weights in the input layer, and the biases, respectively. Furthermore, $\Sigma$, $\Pi$, $E$, $L$, and $T$ denote the summation, production, exponential, logarithmic, and trigonometric types of neurons in the hidden layer, of which there are $k_{\Sigma}$, $k_{\Pi}$, $k_{E}$, $k_{L}$, and $k_{T}$ neurons, respectively. The output of ALM is

$$\hat{y} = \sum_{j=1}^{N} \beta_j\, h_j + b_o,$$

where $N$ and $b_o$ are the total number of neurons in the hidden layer and the bias in the output layer, respectively. In ALM, the training methodology is based on that of the randomized artificial neural network [17]: the weights and biases in the input layer are assigned randomly and kept fixed. After assigning these parameters, it can be seen that the only unknowns are the weights and the bias in the output layer. These free parameters must be determined so as to minimize the norm of the error [21,24]:

$$\min_{\boldsymbol{\beta}} \left\| \mathbf{H}\boldsymbol{\beta} - \mathbf{y} \right\|.$$
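As an illustration, a minimal Python/NumPy sketch of how the hidden-layer outputs of the different neuron types could be computed is given below. Only the summation form is the standard ANN formulation; the core forms assumed for the production, exponential, logarithmic, and trigonometric neurons are illustrative choices, since the text specifies only that each type applies a different algebraic operator at its core.

```python
import numpy as np

def hidden_outputs(X, W, b, kind, f=np.tanh):
    """Hidden-layer outputs for one neuron type.

    X: (M, d) inputs; W: (d, k) random input weights; b: (k,) biases.
    Only the "summation" core is the standard ANN form; the other
    cores are illustrative assumptions.
    """
    if kind == "summation":          # f(xW + b), the conventional neuron
        core = X @ W + b
    elif kind == "production":       # assumed core: product of weighted inputs
        core = np.prod(X[:, :, None] * W[None, :, :], axis=1) + b
    elif kind == "exponential":      # assumed core: weighted sum of exp(x_i)
        core = np.exp(X) @ W + b
    elif kind == "logarithmic":      # assumed core: weighted sum of log|x_i|
        core = np.log(np.abs(X) + 1e-12) @ W + b
    elif kind == "trigonometric":    # assumed core: weighted sum of sin(x_i)
        core = np.sin(X) @ W + b
    else:
        raise ValueError(f"unknown neuron type: {kind}")
    return f(core)
```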
To obtain the unknown (unassigned) weights and biases, a training dataset is required; let $M$ denote the number of observations in this dataset. After applying the training dataset, the following matrix of the hidden-layer neuron outputs is obtained:

$$\mathbf{H} = \begin{bmatrix} h_1(\mathbf{x}_1) & \cdots & h_N(\mathbf{x}_1) \\ \vdots & \ddots & \vdots \\ h_1(\mathbf{x}_M) & \cdots & h_N(\mathbf{x}_M) \end{bmatrix}.$$
As seen, the matrix consisting of the outputs of the neurons in the hidden layer is, in general, not a square matrix. Therefore, the matrix consisting of the weights in the output layer can be calculated analytically by the generalized Moore-Penrose inverse method or by the Fisher method [17,21,24]:

$$\boldsymbol{\beta} = \mathbf{H}^{\dagger}\, \mathbf{y},$$

where $\mathbf{H}^{\dagger}$ denotes the generalized Moore-Penrose inverse of the matrix of the outputs of the neurons in the hidden layer.
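Putting the pieces together, a minimal sketch of the randomized training procedure described above (random fixed input weights and biases, hidden-output matrix, analytic pseudo-inverse solve) might look as follows; it reuses the illustrative `hidden_outputs` helper sketched earlier and is an assumption about the implementation, not the authors' exact code.

```python
import numpy as np

def train_alm(X, y, neurons_per_type=10, seed=0):
    """Train an ALM-style SLFN: random fixed input weights per neuron type,
    then an analytic solve for the output-layer weights and bias."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    types = ["summation", "production", "exponential",
             "logarithmic", "trigonometric"]
    params, H_cols = [], []
    for kind in types:
        W = rng.uniform(-1.0, 1.0, size=(d, neurons_per_type))  # random, kept fixed
        b = rng.uniform(-1.0, 1.0, size=neurons_per_type)
        params.append((kind, W, b))
        H_cols.append(hidden_outputs(X, W, b, kind))  # sketched earlier
    H = np.hstack(H_cols)                          # (M, N) hidden-output matrix
    H = np.hstack([H, np.ones((X.shape[0], 1))])   # ones column -> output bias b_o
    beta = np.linalg.pinv(H) @ y                   # Moore-Penrose solve: beta = H^+ y
    return params, beta

def predict_alm(X, params, beta):
    """Evaluate the trained ALM on new inputs."""
    H = np.hstack([hidden_outputs(X, W, b, kind) for kind, W, b in params])
    H = np.hstack([H, np.ones((X.shape[0], 1))])
    return H @ beta
```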

Validation Process
Each of the processes was applied according to 5-fold cross-validation. The BPN, GRN, RNN, RVFL, and ELM methods were employed in the validation process [17][18][19][20][21]. In order to achieve a fair comparison, the same partitions and optimization rules were employed for each of the methods. To evaluate and validate the proposed approach, the following methodology, summarized in three steps, was performed.

1st step: Normalizing the employed dataset into the range of -1 to 1.
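For concreteness, one way to carry out this normalization together with the 5-fold partitioning mentioned above is sketched below. Computing the scaling parameters on the training folds only is a common choice and an assumption here, not a detail stated in the paper; `X` and `y` stand for the feature matrix and targets of one benchmark dataset.

```python
import numpy as np
from sklearn.model_selection import KFold

def scale_to_unit_range(train, test):
    """Min-max scale both splits into [-1, 1] using training-fold statistics."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant features
    scale = lambda A: 2.0 * (A - lo) / span - 1.0
    return scale(train), scale(test)

# X, y: feature matrix and targets of one benchmark dataset (NumPy arrays)
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    X_tr, X_te = scale_to_unit_range(X[train_idx], X[test_idx])
    # ... train each compared method on (X_tr, y[train_idx]) and
    # evaluate it on (X_te, y[test_idx]) ...
```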

Analyzing the outcomes based on the neuron types employed in the hidden layer
To investigate the effect of the neuron types on the success of the ALM, 2 classification datasets, namely the Lithuanian (a synthetic dataset) and the liver (a real-world dataset) datasets, and 2 regression datasets, namely the approximate sinc (a synthetic dataset) and the forest fire (a real-world dataset) datasets, were employed. The following neuron-type configurations were employed in the tests.
a. All types of neurons (summation, production, exponential, logistic, and trigonometric) employed together,
b. only summation neurons,
c. only production neurons,
d. only exponential neurons,
e. only logistic neurons,
f. only trigonometric neurons.

As seen in Tables 3 and 4, there is no consistent relationship between the obtained success and the neuron types. For instance, in the Lithuanian dataset, the most suitable neuron type was found to be the trigonometric neuron, which may be due to the data distribution of the Lithuanian dataset, shown in Figure 6.
In summary, the major result obtained agrees well with the finding in the literature that the optimal network parameters depend on the data distribution of the dataset. The optimal neuron types, the number of neurons in the hidden layer, and the activation function must be determined in order to achieve better modeling, in other words, higher success.
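Since these parameters are data dependent, they can be selected by a simple validation search. A hypothetical sketch is given below; the candidate grids are illustrative rather than the values used in the paper, and `train_alm_types` is a hypothetical variant of the earlier `train_alm` restricted to a given set of neuron types.

```python
import itertools
import numpy as np

def select_alm_config(X_tr, y_tr, X_val, y_val):
    """Pick the neuron-type set and hidden-layer size with the lowest
    validation RMSE (candidate grids are illustrative)."""
    type_sets = [("summation",), ("production",), ("exponential",),
                 ("logistic",), ("trigonometric",),
                 ("summation", "production", "exponential",
                  "logistic", "trigonometric")]
    best_rmse, best_cfg = float("inf"), None
    for types, k in itertools.product(type_sets, [5, 10, 20, 40]):
        # train_alm_types: hypothetical variant of train_alm restricted
        # to the given neuron types, with k neurons per type
        params, beta = train_alm_types(X_tr, y_tr, types, k)
        pred = predict_alm(X_val, params, beta)
        rmse = float(np.sqrt(np.mean((pred - y_val) ** 2)))
        if rmse < best_rmse:
            best_rmse, best_cfg = rmse, (types, k)
    return best_cfg
```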

Success rates obtained by ALM
The proposed ALM, BPN, GRN, RNN, RVFL, and ELM were applied to the datasets given in Section 2. Each of these methods was optimized and applied according to 5-fold cross-validation. The obtained accuracies (%) and RMSEs are given in Tables 5 and 6, respectively, with the best ones shown in bold. Please note that, in the tests in this section, the neuron types were not optimized; all types of neurons, namely summation, production, exponential, logistic, and trigonometric neurons, were employed together. Although ALM is not faster than the other methods, as seen in Tables 7 and 8, the obtained processing times are still within an acceptable range.

Discussion
The results given in Tables 5-8 are summarized in Table 9 based on the employed methods, which are ALM, BPN, GRN, RNN, RVFL, and ELM.

Figure 1. The structure of the single hidden layer feed-forward artificial neural network.
Obtained accuracies based on the number of neurons and the activation function in the liver dataset: a) all neuron types, b) summation neurons, c) production neurons, d) exponential, e) logistic, f) trigonometric.
Obtained accuracies based on the number of neurons and the activation function in the approximate sinc dataset: a) all neuron types, b) summation neurons, c) production neurons, d) exponential, e) logistic, f) trigonometric.
Obtained accuracies based on the number of neurons and the activation function in the forest fire dataset: a) all neuron types, b) summation neurons, c) production neurons, d) exponential, e) logistic, f) trigonometric.
Figure 6. The Lithuanian dataset.