4.1 Theoretical basis of genetic algorithm
The flow of the genetic algorithm is shown in Fig. 1.
The arithmetic crossover operator takes the linear combination of any two points x1 and x2 in the convex solution space d of the search space:
$$p{x}_{1}+(1-p){x}_{2},\quad p\in \left[0,1\right] \tag{16}$$
Assuming that x1 and x2 are selected from the population as the two parents of the crossover operation, the two offspring are produced as follows:
$${x}_{1}^{\prime}=p{x}_{2}+(1-p){x}_{1} \tag{17}$$
$${x}_{2}^{\prime}=p{x}_{1}+(1-p){x}_{2} \tag{18}$$
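As an illustration, a minimal sketch of this arithmetic crossover, with function and variable names of our own choosing, might look as follows:

```python
import random

def arithmetic_crossover(x1, x2):
    """Produce two offspring as convex combinations of two parent vectors (Eqs. 17-18)."""
    p = random.random()  # p drawn uniformly from [0, 1]
    child1 = [p * b + (1 - p) * a for a, b in zip(x1, x2)]  # x1' = p*x2 + (1-p)*x1
    child2 = [p * a + (1 - p) * b for a, b in zip(x1, x2)]  # x2' = p*x1 + (1-p)*x2
    return child1, child2

# Example: crossing over two candidate solutions
parent1, parent2 = [0.2, 0.8, 0.5], [0.9, 0.1, 0.4]
offspring1, offspring2 = arithmetic_crossover(parent1, parent2)
```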
Uniform mutation replaces the original gene value with a random number distributed uniformly over the range of the corresponding gene. This lets individuals move freely through the search space, but it performs poorly in the local search around specific values. To compensate for this deficiency, instead of drawing from the same uniform distribution, we apply a small random perturbation to the original gene value and use the perturbed value as the new gene value. After this mutation is applied with equal probability to each component, the solution changes only slightly along the solution vector.
For a parent X, if the element Xk is selected for mutation, the result of the mutation is:
$$\begin{array}{c}{X}^{\prime}=\left({X}_{1},{X}_{2},\cdots ,{X}_{k}^{\prime},\cdots ,{X}_{n}\right)\\ {X}_{k}^{\prime}={X}_{k}+\varDelta X\text{ or }{X}_{k}-\varDelta X\end{array} \tag{19}$$
where ΔX is the perturbation value.
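A hedged sketch of this perturbation-based mutation (the step size delta is an assumed parameter, not a value from the paper):

```python
import random

def perturbation_mutation(x, delta=0.05):
    """Mutate one randomly chosen gene by +/- delta (Eq. 19)."""
    mutant = list(x)
    k = random.randrange(len(mutant))     # index of the gene selected for mutation
    sign = random.choice((1.0, -1.0))     # add or subtract the perturbation with equal probability
    mutant[k] = mutant[k] + sign * delta  # X_k' = X_k + dX or X_k - dX
    return mutant
```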
4.2 Facial image feature enhancement
A given image of n × n pixels is first divided into m × m facial feature blocks, each of which is a sub-image of (n/m) × (n/m) pixels. The initial population is defined as follows: aij denotes the state of a facial feature block; if the block contains a salient facial feature, aij is set to 1, otherwise it is set to 0. The states aij of all feature blocks form the matrix Sk:
$${S}_{k}=\left[\begin{array}{cccc}{a}_{11}& {a}_{12}& \cdots & {a}_{1m}\\ {a}_{21}& {a}_{22}& \cdots & {a}_{2m}\\ \vdots & \vdots & \ddots & \vdots \\ {a}_{m1}& {a}_{m2}& \cdots & {a}_{mm}\end{array}\right],\quad k=1,2,3,\dots ,N \tag{20}$$
Here, Sk is one possible solution representing the salient feature region of a particular expression, and N is the population size. Let D be a training image of a given expression in the sample set, and let ID be the facial feature matrix of D. The training image D belongs to this expression if the following rule is satisfied:
$$\left|{I}_{D}\cap {S}_{k}\right|\ge {\Omega }\sum _{r=1}^{m} \sum _{c=1}^{m} {a}_{rc} \tag{21}$$
Ω is a threshold set to 0.8, which means that the image must share at least 80% of the selected feature blocks to be assigned to this expression.
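As an illustration only, a minimal sketch of this matching rule, assuming binary NumPy matrices and our own function names, might look as follows:

```python
import numpy as np

OMEGA = 0.8  # threshold from Eq. (21)

def belongs_to_expression(I_D, S_k, omega=OMEGA):
    """Check whether training image D (feature matrix I_D) matches candidate S_k (Eq. 21).

    Both arguments are m x m binary matrices in which 1 marks a salient feature block.
    """
    overlap = np.logical_and(I_D == 1, S_k == 1).sum()  # |I_D intersect S_k|
    return overlap >= omega * S_k.sum()                 # compare with omega * sum of a_rc

# Example with m = 4: four of the five selected blocks match (80%), so the rule holds
S_k = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]])
I_D = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
print(belongs_to_expression(I_D, S_k))  # True
```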
4.3 Facial expression recognition
For a given image sample A, image features are extracted through the optimal mapping vectors xk, such that
$${y}_{k}=A{x}_{k},\quad k=1,2,\cdots ,d \tag{22}$$
For each individual network we use an RBF network with a single hidden layer. The output layer of each network produces a result representing the expression category, and the image sample feature matrix calculated above has size m × d.
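As a sketch only (Gaussian basis functions, pre-computed parameters, and all variable names are our assumptions rather than the paper's implementation), the feature projection of Eq. (22) followed by a single-hidden-layer RBF network could be written as:

```python
import numpy as np

def extract_features(A, mapping_vectors):
    """Project the image sample A onto the d mapping vectors x_k (Eq. 22): y_k = A x_k."""
    return np.column_stack([A @ x_k for x_k in mapping_vectors])  # m x d feature matrix

def rbf_forward(features, centers, widths, weights):
    """Single-hidden-layer RBF network: Gaussian hidden units, then a linear output layer."""
    v = features.ravel()                                   # flatten the m x d feature matrix
    dists = np.linalg.norm(centers - v, axis=1)            # distance to each hidden-unit center
    hidden = np.exp(-(dists ** 2) / (2.0 * widths ** 2))   # Gaussian activations
    return hidden @ weights                                 # one score per expression category
```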
When the genetic algorithm is used to select individual neural networks for the ensemble, assume that M neural networks N1, N2, N3, ..., NM are trained independently and combined into an ensemble N by simple averaging. If the network NM is removed, the remaining networks N1, N2, N3, ..., NM−1 form a new ensemble N′, again by simple averaging. The correlation between neural networks Ni and Nj is defined as:
$${C}_{ij}=\int p\left(x\right)\left[{N}_{i}\left(x\right)-d\left(x\right)\right]\left[{N}_{j}\left(x\right)-d\left(x\right)\right]dx \tag{23}$$
Each genetic individual corresponds to a subset S of {N1, N2, N3, ..., NM}. Given a validation set V, the correlation between Ni and Nj is estimated on V as:
$${C}_{ij}^{V}=\sum _{x\in V} \left[{N}_{i}\left(x\right)-d\left(x\right)\right]\left[{N}_{j}\left(x\right)-d\left(x\right)\right]/\left|V\right| \tag{24}$$
The average error on the validation set V of the ensemble corresponding to the subset S is:
$$\left(\sum _{{N}_{i},{N}_{j}\in S} {C}_{ij}^{V}\right)/|S{|}^{2} \tag{25}$$
The fitness value of the genetic algorithm is taken as the reciprocal of this error.
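A minimal sketch of this fitness computation, assuming each network's validation outputs are stored as rows of a NumPy array and the chromosome is a binary selection mask (all names are ours), might look as follows:

```python
import numpy as np

def correlation_matrix(predictions, targets):
    """Estimate C_ij on the validation set V (Eq. 24).

    predictions: M x |V| array of network outputs N_i(x); targets: length-|V| array d(x).
    """
    errors = predictions - targets            # [N_i(x) - d(x)] for every network and sample
    return errors @ errors.T / targets.size   # C_ij^V = (sum over V) / |V|

def ensemble_fitness(C, chromosome):
    """Fitness of a genetic individual: the reciprocal of the ensemble error of Eq. (25)."""
    idx = np.flatnonzero(chromosome)                   # networks selected by the binary chromosome
    error = C[np.ix_(idx, idx)].sum() / idx.size ** 2  # (sum of C_ij over S) / |S|^2
    return 1.0 / error
```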
4.4 Classification of facial expression recognition results
After the facial features are selected, they are classified into five expressions: focus, doubt, distraction, excitement, and anxiety. A random forest classification algorithm is then applied, which effectively improves the recognition accuracy of expression classification.
In this algorithm, the Gini index is the splitting criterion for the decision trees of the random forest, and it is calculated as:
$$\text{Gini}\left(S\right)=1-\sum _{i=1}^{mtry} {P}_{i}^{2} \tag{26}$$
where Pi represents the probability that class Yi appears in the sample set S.
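A minimal sketch of this calculation (the function name and example labels are ours) is:

```python
from collections import Counter

def gini(labels):
    """Gini index of a set of class labels (Eq. 26): 1 minus the sum of squared class probabilities."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Example: a node containing three 'focus' samples and one 'doubt' sample
print(gini(['focus', 'focus', 'focus', 'doubt']))  # 1 - (0.75**2 + 0.25**2) = 0.375
```

In practice, an off-the-shelf implementation such as scikit-learn's RandomForestClassifier uses this Gini criterion by default.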
The schematic diagram of the two-dimensional facial expression recognition algorithm based on feature extraction and random forest classification is shown in Fig. 2.
4.5 Facial expression recognition and classification result analysis
Two parameters must be determined in the experiment: the number of hidden layer nodes in the RBF network and the number of subnets. First, the number of hidden layer nodes is determined. We set the number of hidden layer nodes to 2, 5, 10, 15, 20, 25, and 30 and use the RBF neural network to recognize facial expressions. The results show that, starting from 2 nodes, the detection rate increases with the number of nodes, peaks at 15 nodes, and then decreases as the number of nodes continues to grow. The experimental results of the hidden layer node comparison are as follows:
According to these results, the number of hidden layer nodes in the RBF neural network is set to 15. Next, the number of subnets is determined. The experimental results are:
The previous step showed that the detection rate is best with 6 networks. To allow the genetic algorithm to select individuals from a diverse pool of networks, we trained 6 × 2 = 12 individual networks and then used the genetic algorithm to select 6 of them for the network ensemble.
We also compare our method with other algorithms. In the experiment, five other facial expression recognition algorithms are evaluated, most of which achieve high accuracy. These algorithms and their accuracy on the five facial expressions are shown in Table 1. As Table 1 shows, the algorithm designed in this paper achieves the best results.
Table 1
Comparison of accuracy of different facial expression recognition algorithms
| Expression | Algorithm A | Algorithm B | Algorithm C | Algorithm D | Algorithm E | Algorithm in the paper |
|---|---|---|---|---|---|---|
| Focus | 0.40 | 0.60 | 0.70 | 0.60 | 0.94 | 0.98 |
| Doubt | 0.70 | 0.90 | 0.65 | 0.94 | 0.85 | 0.94 |
| Distracted | 0.70 | 0.60 | 0.80 | 0.90 | 0.90 | 0.98 |
| Excited | 0.86 | 0.85 | 0.87 | 0.90 | 0.90 | 1.00 |
| Anxiety | 0.70 | 0.76 | 0.80 | 0.90 | 0.82 | 0.95 |
| Average | 0.67 | 0.75 | 0.75 | 0.84 | 0.88 | 0.97 |