An Intelligent Feature Selection using Archimedes Optimization algorithm for Facial Analysis

Human facial analysis (HFA) has recently become an attractive topic for computer vision research due to technological progress and the growth of mobile applications. HFA explores several issues, such as gender recognition, facial expression, age, and race recognition, for automatically understanding social life. In addition, the development of several algorithms inspired by swarm intelligence, biological inspiration, and physical/mathematical rules gives another dimension to feature selection in the field of machine learning and computer vision. This paper develops a novel wrapper feature selection method for gender recognition using the Archimedes optimization algorithm (AOA). The paper's primary purpose is to automatically determine the optimal face area using AOA to recognize the gender of a person, categorized into two classes (men and women). In this paper, the facial image is divided into several sub-regions (blocks), where each area provides a vector of characteristics using one handcrafted technique, such as the local binary pattern (LBP), histogram of oriented gradients (HOG), or gray level co-occurrence matrix (GLCM). The proposed method (AOA) is assessed on two publicly available datasets: the Georgia Tech Face dataset (GT) and the Brazilian FEI dataset. The experimental results show a good performance of AOA compared to other recent and competitive optimizers, such as the Sine Cosine Algorithm (SCA), Henry Gas Solubility Optimization (HGSO), Equilibrium Optimizer (EO), Emperor Penguin Optimizer (EPO), Harris Hawks Optimizer (HHO), Multi-verse Optimizer (MVO), and Manta-ray Foraging Optimizer (MRFO), in terms of accuracy and the number of selected areas.


Introduction
Human vision allows performing several tasks in parallel and in a short time, particularly facial detection, gender recognition, and recognizing the state of mind, which differentiates the human being from others.

The automation of gender recognition represents a real challenge for scientific researchers, and it has a significant impact on the commercial field and video surveillance. For example, shopping centers are interested in knowing the sales rate and the category of people who buy their products, particularly the gender, age, and origins, to increase the sales rate. Another area requires the application of gender recognition to detect suspected people captured by surveillance cameras in large spaces such as airports, shopping malls, and gas stations. In order to reduce the time of searching for the target suspected person, the gender recognition application can contribute profoundly to solving this issue, especially for critical situations such as suicide bombing or airport

(EPO) [35], Manta-ray Foraging Optimizer (MRFO) [36]. Also, we notice that physics and math-

The experimental study is validated using two datasets including FEI and FERET. The obtained accuracy rates were 96% and 94%, respectively.

In gender recognition from face images, a big challenge that remains to this day is how to determine, intelligently and automatically, the most significant areas from face images characterized by local binary pattern (LBP), histogram of oriented gradients (HOG), or gray level co-occurrence matrix (GLCM) descriptors.

This paper automatically determines the significant areas of the face based on handcrafted features (LBP, HOG, or GLCM) using the Archimedes optimization algorithm (AOA), to solve gender recognition problems using an optimal number of extracted face areas.
The major contributions of this paper are as follows:
• Designing a novel wrapper physical algorithm, AOA, for gender identification using an automatic selection of the optimal and significant areas of face images.
• Comparing the performance of AOA with several recent and robust optimizers for facial analysis based on feature selection (FS).
• Evaluating the impact of three handcrafted features based on LBP, HOG, and GLCM.
• Testing the efficiency of AOA for gender recognition over two datasets: FEI and GT.

The paper is structured in six sections. Section 2 explains some works which

sessed on FEI datasets, and the experimental results showed that the third framework outperforms the others, reaching 90% in terms of accuracy. However, the accuracy increases to 94% when the decision is taken using the weighted vote. Also, the task of gender identification is solved by texture and geometric features, which can be determined by local binary pattern (LBP) and gray level co-

In the same context, a novel variant of LBP named adaptive patch-weight LBP (APWLBP) is proposed by [60]. Their method used a pyramid structure to compute the gradient using weight parameters determined by Eigen theory. The main objective of APWLBP is to determine the optimal projection on the hyper-plane with a high value of variance for gender recognition. The perfor-

3 × 3 pixels where the pixel to be processed is in the center, and its neighborhood is around. Fig. 1 shows an example of the execution of the LBP algorithm relating to the steps described below.

Step 1 - Extraction of the neighborhood of the pixel to be processed. The eight intensity values of the neighborhood of the pixel to be processed are extracted from a matrix of 3 × 3 pixels. In this example, each pixel has a different gray intensity value. The pixel being processed has an intensity value of 40.

Step 2 - Thresholding is performed on the intensity values of the neighboring pixels. Any pixel having an intensity value greater than or equal to that of the pixel being processed is assigned the value 1. The value 0 is assigned to any intensity value lower than that of the pixel being processed.

Step 3 - A multiplier matrix is stored. This matrix will be used in the next step of the algorithm to describe the resulting local binary pattern uniquely.

Step 4 - Element-by-element multiplication. This operation is carried out between the matrix resulting from the thresholding of step 2 and the multiplier matrix of step 3.

Step 5 - The values of the matrix resulting from step 4 are summed. This sum is written in the output image at the coordinates corresponding to the pixel being processed in the input image. The algorithm repeats steps 1 to 5 until all the pixels of the input image are processed. Following this procedure, a histogram is calculated to characterize the frequency of appearance of the various patterns. The number computed for each pixel in step 5 uniquely identifies a gray intensity pattern among the possible patterns. The shape of the resulting histogram is characteristic of the texture studied by the LBP algorithm.

In general, the task of extracting features from facial images using LBP starts by dividing the input image into several blocks (7 × 7). Then, we extract the histogram of each block based on LBP. The final step consists of concatenating all histograms in order to perform gender recognition. The concept of handcrafted features using LBP is shown in Fig. 2. To calculate the LBP code in a neighborhood of P pixels with a radius R, we simply count the occurrences of gray levels g_p greater than or equal to the central value, weighted by powers of two, using Eq. (1):

LBP_{P,R} = \sum_{p=0}^{P-1} S(g_p - g_c) \, 2^p    (1)

where g_p and g_c are the gray levels of a neighboring pixel and of the central pixel, respectively. S indicates the Heaviside function defined by Eq. (2):

S(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}    (2)
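The five LBP steps and the block-wise histogram concatenation described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the clockwise neighbor ordering (which fixes the multiplier matrix of step 3), the 256-bin histogram, and the way the image is cut into a 7 × 7 grid are all assumptions made for the example.

```python
def lbp_code(patch):
    """LBP code of the center pixel of a 3x3 patch given as a list of 3 rows."""
    center = patch[1][1]
    # Step 1: the eight neighbors, visited clockwise from the top-left corner
    # (assumed ordering; any fixed ordering gives a valid multiplier matrix).
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    # Steps 2-5: threshold against the center, weight by powers of two, sum.
    return sum((1 if g >= center else 0) << p for p, g in enumerate(neighbours))


def lbp_image(img):
    """LBP codes of all interior pixels of a 2-D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    return [[lbp_code([row[x - 1:x + 2] for row in img[y - 1:y + 2]])
             for x in range(1, w - 1)] for y in range(1, h - 1)]


def block_histograms(codes, grid=7, bins=256):
    """Split the code map into grid x grid blocks and concatenate their histograms."""
    h, w = len(codes), len(codes[0])
    features = []
    for by in range(grid):
        for bx in range(grid):
            hist = [0] * bins
            for y in range(by * h // grid, (by + 1) * h // grid):
                for x in range(bx * w // grid, (bx + 1) * w // grid):
                    hist[codes[y][x]] += 1
            features.extend(hist)
    return features
```

With a 7 × 7 grid and 256 bins, the concatenated vector has 49 × 256 entries, and each group of 256 entries corresponds to one candidate block that a wrapper feature selector could keep or discard.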

HOG is a very powerful descriptor proposed by Dalal

Figure 1: Basic LBP operator.

Gradient values (G_h, G_v) are computed for each pixel using a centered 1-D derivative filter in the horizontal and vertical directions. For this, the masks (S_h, S_v) defined by Eq. (3) and Eq. (4) are used.

Step 2 - The magnitude (|G(x, y)|) and gradient orientation (θ) of each pixel (x, y) are calculated using Eq. (7) and Eq. (8), where G_h and G_v represent the horizontal gradient and the vertical gradient at pixel (x, y), respectively.

Step 3 - The orientation histogram of the gradients inside each cell is calculated by quantizing the unsigned gradient orientation at each pixel into 9 orientation channels (bins). The histogram channels are evenly spread over 0 to 180° (unsigned case) or 0 to 360° (signed case).

Step 4 - The characteristic vector of each cell is normalized using the histograms of the block it belongs to. In this work we use the L2-norm for the normalization of the blocks; the normalization factor is computed from Hist, the non-normalized vector containing all the histograms in a block, its L2 norm ‖Hist‖_2, and a regularization term ε.
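The HOG steps above can be illustrated with a short plain-Python sketch. It relies on several assumptions that the surviving text does not confirm: the centered mask is taken to be [-1, 0, 1], each pixel votes with its gradient magnitude into a single bin, the cell size is 8 × 8, and the L2 normalization divides by sqrt(‖Hist‖² + ε²) as in the commonly cited HOG formulation. All function names are hypothetical.

```python
import math


def gradients(img, y, x):
    """Centered 1-D derivative (assumed mask [-1, 0, 1]) at an interior pixel."""
    g_h = img[y][x + 1] - img[y][x - 1]   # horizontal gradient
    g_v = img[y + 1][x] - img[y - 1][x]   # vertical gradient
    return g_h, g_v


def cell_histogram(img, y0, x0, size=8, bins=9):
    """Unsigned (0-180 degree) orientation histogram of one cell.

    The cell must lie at least one pixel inside the image border.
    """
    hist = [0.0] * bins
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            g_h, g_v = gradients(img, y, x)
            magnitude = math.hypot(g_h, g_v)
            angle = math.degrees(math.atan2(g_v, g_h)) % 180.0
            # Hard assignment to one bin, weighted by magnitude (assumption).
            hist[min(int(angle / (180.0 / bins)), bins - 1)] += magnitude
    return hist


def l2_normalize(block, eps=1e-6):
    """Divide a block vector by sqrt(||block||^2 + eps^2) (assumed form)."""
    norm = math.sqrt(sum(v * v for v in block) + eps ** 2)
    return [v / norm for v in block]
```

A block descriptor is then obtained by concatenating the histograms of the cells in the block and passing the result through `l2_normalize`.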

Step 5

• The energy (E_n): E_n expresses the regularity of the texture. A higher value of E_n signifies a completely homogeneous image.
• The contrast (C_n): It measures the rate of local variation in the picture (I).
• The entropy (E_t): E_t is the inverse of energy and characterizes the irregular appearance of the image, hence a strong correlation between these two attributes.
• Correlation (C_r): It can be compared to a measure of the linear dependence of gray levels in the image.
• Homogeneity (H_m): The homogeneity changes inversely to the contrast and takes on high values if the differences between the analyzed pixel pairs are small. It is therefore more sensitive to the diagonal elements of the GLCM, unlike the contrast, which depends more on the elements far from the diagonal.
• Dissimilarity (D_s): It expresses the same characteristics of the image as the contrast, with the difference that the weight of the GLCM entries increases linearly as they move away from the diagonal, rather than quadratically as in the case of contrast.
• The cluster shade and the cluster prominence give information on the degree of symmetry of the GLCM.
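As a rough illustration of the texture attributes listed above, the sketch below builds a normalized GLCM for one displacement and evaluates six of the attributes using the standard Haralick-style definitions. These definitions are an assumption: the paper's own formulas are not available here, and variants exist (for example, homogeneity with |i - j| instead of (i - j)² in the denominator). The names `glcm` and `glcm_features`, the `(dx, dy)` displacement encoding, and the number of gray levels are hypothetical.

```python
import math


def glcm(img, dx=1, dy=0, levels=8):
    """Normalized co-occurrence matrix of an integer image for displacement (dx, dy).

    Assumes pixel values in range(levels) and at least one valid pixel pair.
    """
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[img[y][x]][img[y2][x2]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Texture attributes of a normalized GLCM p (assumed standard definitions)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        # Correlation is undefined for a constant image; 0.0 is this sketch's choice.
        "correlation": cov / (sd_i * sd_j) if sd_i * sd_j > 0 else 0.0,
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
    }
```

Under these definitions, a constant image gives an energy of 1 and a contrast of 0, which matches the statement that a high energy corresponds to a homogeneous image.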

-The third step -Transfer coefficient & density scalar: In this step, collisions between objects occur until the equilibrium state is reached. The principal role of the transfer coefficient (T_c) is to switch from exploration to exploitation mode, as defined by Eq. (24). T_c increases exponentially over time until reaching 1; t is the current iteration, while T denotes the maximum number of iterations. Also, the decrease of the density scalar d_s in AOA, given by Eq. (25), helps to find an optimal solution.

-The fourth step -Exploration phase: In this step, a collision between agents occurs using a randomly selected material (Mr). So, the acceleration of the objects is updated using Eq. (26) when the transfer coefficient value is less than or equal to 0.5.

-The fifth step -Exploitation phase: In this step, no collision between agents occurs. So, the acceleration of the objects is updated using Eq. (27) when the transfer coefficient value is greater than 0.5, where Γ_Best is the acceleration of the optimal object O_Best.

-The sixth step -Normalization of acceleration: In this step, we normalize the acceleration in order to determine the rate of change using Eq. (28), where α and β are fixed to 0.9 and 0.1, respectively. The normalized acceleration Γ^{t+1}_{i-norm} determines the percentage of the step that each agent will change. A higher value of acceleration means that the object is in exploration mode; otherwise, the exploitation mode is operational.

-The seventh step -The update process: For the exploration phase (T_c ≤ 0.5), the position of the i-th object in iteration t + 1 is modified by Eq. (29), whereas the object position is updated by Eq.

In these updates, c_1 is equal to 2 and c_2 is fixed to 6.

The parameter δ is positively correlated with time and is proportional to the transfer coefficient T_c, i.e., δ = 2 × T_c. The main role of this parameter is to ensure a good balance between exploration and exploitation. During the first iterations, the margin between the best object and the other objects is large, which provides a high random walk. However, in the last iterations, the margin is reduced, which provides a low random walk.

-The eighth step -The evaluation: In this step, we evaluate the novel population using score

Evaluate the score for each object.

4: Determine the best object (O_Best) with its best density (D_Best), best volume (V_Best), and best acceleration (Γ_Best).

Adjust acceleration (Γ_i) using Eq. (26).

19: Compute the score of each object. Determine the best object (O_Best) with its best density (D_Best), best volume (V_Best), and best acceleration (Γ_Best).

In order to apply the process of gender recognition using a wrapper feature selection assisted by AOA, a good compromise between accuracy and a lower number of features must be assured. So, the score of each object is computed from (Acc) and (d), which are the accuracy obtained by the multilayer perceptron (MLP) neural network and the size of the selected histograms, respectively.
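To make the AOA steps and the score concrete, here is a minimal plain-Python sketch of one iteration. Only the constants stated in the text come from the source (the 0.5 switch on T_c, α = 0.9, β = 0.1, c_1 = 2, c_2 = 6, δ = 2 × T_c). Everything else is an assumption borrowed from the commonly published AOA formulation: the exponential forms of T_c and d_s, the density and volume drift toward the best object, the acceleration formula, the direction flag with `c4`, the clamping of positions to [0, 1], and the 0.5 threshold that turns a position into a block selection. The weighted `score` is likewise a hypothetical maximization form, since the paper's exact expression is not available here.

```python
import math
import random


def score(acc, d, total, w=0.99):
    # Hypothetical trade-off: reward accuracy, lightly reward fewer selected blocks.
    return w * acc + (1.0 - w) * (1.0 - d / total)


def aoa_step(objs, best, t, T, rng, c1=2.0, c2=6.0, c4=0.5, alpha=0.9, beta=0.1):
    """Update every object in place for iteration t of T (1 <= t <= T).

    Each object is a dict with list-valued keys 'x', 'den', 'vol', 'acc'.
    `best` should be a separate copy of the best object found so far.
    """
    tc = math.exp((t - T) / T)            # transfer coefficient, grows toward 1
    ds = math.exp((T - t) / T) - t / T    # density scalar, shrinks over time
    delta = 2.0 * tc                      # delta = 2 * T_c, as stated in the text
    snapshot = [dict(o) for o in objs]    # objects as they were before this step
    for o in objs:
        o["den"] = [v + rng.random() * (b - v) for v, b in zip(o["den"], best["den"])]
        o["vol"] = [v + rng.random() * (b - v) for v, b in zip(o["vol"], best["vol"])]
        # Exploration collides with a random material, exploitation uses the best.
        src = rng.choice(snapshot) if tc <= 0.5 else best
        o["acc"] = [(sd + sv * sa) / (d * v) for sd, sv, sa, d, v in
                    zip(src["den"], src["vol"], src["acc"], o["den"], o["vol"])]
    flat = [a for o in objs for a in o["acc"]]
    lo, span = min(flat), (max(flat) - min(flat)) or 1.0
    for o in objs:
        norm = [alpha * (a - lo) / span + beta for a in o["acc"]]
        if tc <= 0.5:   # exploration: move relative to a randomly chosen object
            ref = rng.choice(snapshot)["x"]
            new = [x + c1 * rng.random() * n * ds * (r - x)
                   for x, n, r in zip(o["x"], norm, ref)]
        else:           # exploitation: move around the best object
            f = 1.0 if 2.0 * rng.random() - c4 <= 0.5 else -1.0
            new = [b + f * c2 * rng.random() * n * ds * (delta * b - x)
                   for x, n, b in zip(o["x"], norm, best["x"])]
        o["x"] = [min(1.0, max(0.0, v)) for v in new]   # keep positions in [0, 1]


def selected_blocks(x, threshold=0.5):
    """Indices of the face blocks kept by a position vector (assumed binarization)."""
    return [i for i, v in enumerate(x) if v > threshold]
```

In a full loop, each position would be binarized with `selected_blocks`, the classifier accuracy on the retained blocks would be passed to `score`, and `best` would be refreshed with a copy of the highest-scoring object before the next call to `aoa_step`.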

The MLP is integrated as a classifier in the FS process using k-fold cross-validation. In this study, the value of k is fixed to 5 to ensure a fair comparison. So, 80% of the samples are used in the training step, and the rest are used for testing. Additionally, the architecture of the MLP is described in Fig. 5.

This architecture includes three layers:

Design framework
This part represents the core of our work, which consists of applying the AOA algorithm

The confusion matrix must be used and is defined in Table 1. Next, some measures must be computed, such as Accuracy (Ac), Recall (Re), Precision (Pr), and F-score (F_Score).
• The accuracy metric (A): Among the most important metrics, accuracy measures the rate of correct data classification.
• The recall metric (R): This metric, also called the true positive rate (TPR), indicates the percentage of male persons that are correctly predicted.
• The precision (P): It indicates the rate of correctly predicted samples.

where d is the number of relevant face blocks which increase the performance of gender recognition.
• F-score (F_Score): In statistics, the F-score indicates the harmonic mean between recall and precision. It is computed by Eq. (37).
• CPU time (Cpu): It is the time required by each algorithm.

Table 2.

Table 2: Parameters settings of physical, mathematical and swarm inspired algorithms.

According to the results of Table 5

The fitness curves obtained by the different optimizers are shown in the Figure.

In addition, we have graphically represented the ROC curve, as shown in Figure.

The obtained results showed the high performance of AOA-based HOG on both datasets in terms of accuracy and F-score. Also, SCA keeps the smallest number of relevant blocks in a short time. As an advantage of the proposed method, AOA ensures a good balance between the most relevant gender features from faces and the correct rate of gender classification. However, some drawbacks of AOA and the handcrafted methods can be highlighted, mainly the several parameters that are defined randomly in the initialization steps of AOA. Also, the number of blocks for each handcrafted descriptor is fixed to 7 × 7, which has a high impact on the performance. Furthermore, GLCM requires defining two parameters: the displacement (d) and the orientation (θ). All parameters can be tuned automatically with a hyper-heuristic AOA in the future.

New horizons can be explored, like the automatic fusion between textural descriptors assisted