QGA–QGCNN: a model of quantum gate circuit neural network optimized by quantum genetic algorithm

Using a global optimization algorithm to optimize the initial weights and thresholds of traditional neural network models can effectively address premature convergence and low accuracy. However, shortcomings such as slow convergence speed and poor local search ability still exist. To solve these problems, this paper proposes QGA–QGCNN, a neural network model in which a Quantum Genetic Algorithm (QGA) optimizes a Quantum Gate Circuit Neural Network (QGCNN). In QGA–QGCNN, a QGA, with its strong global optimization ability and fast convergence, optimizes the initial parameters of the QGCNN. When dealing with more complex problems, the quantum-computing-based QGCNN model offers intrinsic parallel computing capabilities and can fully exploit its ability to handle fuzzy, uncertain problems, thereby improving detection performance. We use the authoritative 10% KDD CUP99 data set from the field of network intrusion detection to conduct simulation experiments on the proposed QGA–QGCNN model. Experimental results show that the proposed intrusion detection model has a lower false alarm rate and higher accuracy than conventional attack detection models. Moreover, optimizing QGCNN with QGA improves the convergence performance of the model.


Introduction
Artificial Neural Network (ANN) has been a research focus in the field of artificial intelligence since the 1980s. It has become an extremely important branch of artificial intelligence with features such as parallel processing, distributed storage and self-adaptation. Ali and Torn [17] studied the efficiency and robustness of genetic algorithms and other population-based global optimization methods, and showed that the effectiveness of a global optimization algorithm is improved by combining it with local search. One of the main issues with heuristics and meta-heuristics is the local optima stagnation phenomenon, often called premature convergence [18]. To resolve this problem, various global optimization algorithms have been proposed, which help to significantly improve the training of deep learning models; representative examples include GA, SCE, DE, PSO, ES, CMA-ES and random search [19]. Moreover, global optimization algorithms help improve search precision and accelerate the convergence rate [20].
Quantum Genetic Algorithm (QGA) is an intelligent optimization algorithm combining quantum computation and genetic algorithms. It was proposed by Han [21], who introduced quantum concepts such as the quantum state, the quantum gate, quantum state characteristics and probability amplitudes into the genetic algorithm. QGA is a probabilistic search algorithm that uses qubits to represent genes. In a Genetic Algorithm, a gene expresses one definite piece of information; in a Quantum Genetic Algorithm, because of quantum superposition, a gene expressed by qubits contains all possible information. QGA uses quantum gates as genetic operators to find the optimal solution, exhibiting fast convergence and good global search ability.
In this paper, to address the slow convergence and poor local search capability of traditional neural networks, we optimize QGCNN with QGA, which has strong global optimization ability. The QGA-QGCNN model is then evaluated in network intrusion detection simulation experiments. Compared with the traditional neural network model, the QGA-QGCNN model performs well in terms of false negative rate and prediction accuracy, and it converges faster. In Sect. 2, we provide a brief review of QGA. In Sect. 3, we first introduce the QGCNN model and then combine the quantum genetic algorithm with it. In Sect. 4, we perform intrusion detection simulation experiments to compare numerically with traditional intrusion detection models. Finally, we conclude and look ahead in Sect. 5.

Quantum genetic algorithm
Quantum Genetic Algorithm (QGA) is a combination of quantum computation and the Genetic Algorithm (GA), and it is a kind of probabilistic evolutionary algorithm. It uses concepts and theory from quantum computation, such as the qubit and the quantum superposition state. Chromosomes are encoded with qubits, and this probability-amplitude representation enables one quantum chromosome to express the information of multiple states at the same time. The action of the quantum gate on the superposition state serves as the evolutionary operation, which better maintains the diversity of the population. Moreover, the information of the optimal individual can easily be used to guide the mutation, so that the population evolves toward a good pattern at a reasonable rate, thus reaching an optimal solution of the goal. The procedure can be described in SPARKS language as Algorithm 1.

$q_i^t$ is the $i$-th chromosome of the $t$-th generation, defined as

$$q_i^t = \begin{bmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_k \\ \beta_1 & \beta_2 & \cdots & \beta_k \end{bmatrix},$$

where $k$ is the number of genes, determined by the actual problem. The iterative formula of the rotation angle $\theta_i$ of the quantum rotation gate is $\theta_i = s(\alpha_i, \beta_i)\,\Delta\theta_i$, where $s(\alpha_i, \beta_i)$ and $\Delta\theta_i$ represent the rotation direction and magnitude, respectively. If the fitness value $f(x_i) > f(b_i)$, the qubits of $q_i^t$ are adjusted so that the probability amplitudes evolve in the direction favoring the occurrence of $x_i$; otherwise, if $f(x_i) \le f(b_i)$, the qubits of $q_i^t$ are adjusted toward the occurrence of $b_i$. The rotation angle adjustment strategy of the quantum rotation gate is shown in Table 1.

The forward computation of the quantum network concludes with the following steps:

6. The actual output of the hidden layer is $h_g = (h_{g1}, h_{g2}, \ldots, h_{gp})^T$, whose value equals the probability amplitude of $|1\rangle$ of the corresponding quantum state output, with $U = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$.
7. The output layer produces its quantum state output.
8. The actual output of the quantum neural network is $y_g = (y_{g1}, y_{g2}, \ldots, y_{gm})^T$, whose value likewise equals the probability amplitude of $|1\rangle$ of the corresponding quantum state output, with the same $U$.
9. The sample label space is $\tilde{y} = (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_s)$, with label samples $\tilde{y}_g = (\tilde{y}_{g1}, \tilde{y}_{g2}, \ldots, \tilde{y}_{gm})^T$, $g = 1, 2, \ldots, s$, where $s$ is the number of samples.
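The rotation-gate update described above can be sketched as follows. This is a minimal illustration on a toy OneMax fitness; the phase step Δθ = 0.05π, the population size and the simplified rotation strategy (rotate each differing gene toward the best solution's bit) are choices made for the example, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 8          # number of genes (qubits) per chromosome
POP = 4        # population size

# Each chromosome is a 2 x K array of amplitudes (alpha_j, beta_j) per qubit,
# initialized to the uniform superposition 1/sqrt(2).
pop = np.full((POP, 2, K), 1.0 / np.sqrt(2.0))

def measure(chrom, rng):
    """Collapse a qubit chromosome to a classical bit string x,
    where P(x_j = 1) = beta_j ** 2."""
    return (rng.random(chrom.shape[1]) < chrom[1] ** 2).astype(int)

def rotate(chrom, x, best, dtheta=0.05 * np.pi):
    """Quantum rotation gate: for each gene where the measured bit differs
    from the best solution, rotate the qubit toward the best solution's bit
    (a simplified stand-in for the lookup table of Table 1)."""
    for j in range(chrom.shape[1]):
        if x[j] != best[j]:
            theta = dtheta if best[j] == 1 else -dtheta
            c, s = np.cos(theta), np.sin(theta)
            a, b = chrom[0, j], chrom[1, j]
            chrom[0, j], chrom[1, j] = c * a - s * b, s * a + c * b
    return chrom

def fitness(x):
    # toy fitness: number of ones (OneMax)
    return x.sum()

best = measure(pop[0], rng)
for t in range(50):
    for i in range(POP):
        x = measure(pop[i], rng)
        if fitness(x) > fitness(best):
            best = x.copy()
        pop[i] = rotate(pop[i], x, best)

print(int(fitness(best)))  # best fitness found (the OneMax optimum is K)
```

Note that the rotation preserves the normalization of each qubit's amplitudes, so one chromosome continues to encode a valid probability distribution over all $2^K$ bit strings throughout the search.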

Quantum gate circuit neural network

Quantum gate circuit neural network (QGCNN) is a quantum neural network constructed on the basis of quantum gates. It has neither the connection weights nor the thresholds of classical neural networks or of quantum neural networks built from classical neurons. The two quantum gate matrices $U(\theta)$ and $U(\varphi)$ in the quantum gate circuit neural network can approximately be regarded as the classical connection weight matrices. Therefore, updating the weights in the learning algorithm amounts to updating the corresponding angle parameters $\theta$ and $\varphi$ in the two quantum gate matrices. Updating weights in a learning algorithm is an optimization task; for training neural networks, the gradient descent algorithm is usually adopted [24]. Selvi and Valarmathi [25] have demonstrated that it performs well on large-scale learning problems.
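As a concrete, simplified picture of such a network, the sketch below treats every node as a single qubit whose activation is the probability amplitude of $|1\rangle$. The phase encoding and the way angles are aggregated across a layer are illustrative assumptions, not the paper's exact circuit; only the 12-10-1 circuit counts are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, m = 12, 10, 1   # input, hidden, output circuit counts (the 12-10-1 type)

# Trainable rotation angles playing the role of the "weight" matrices
# U(theta) (hidden layer) and U(phi) (output layer).
theta = rng.uniform(0, 2 * np.pi, size=(n, p))
phi = rng.uniform(0, 2 * np.pi, size=(p, m))

def forward(x, theta, phi):
    """Simplified forward pass: inputs in [0, 1] are encoded as phases,
    each layer shifts them by its rotation angles, and a node's activation
    is the probability amplitude of |1>, i.e. |sin(total phase)|."""
    a = 0.5 * np.pi * x                                            # phase encoding
    h = np.abs(np.sin((a[:, None] + theta).sum(axis=0)))           # hidden, shape (p,)
    y = np.abs(np.sin((np.arcsin(h)[:, None] + phi).sum(axis=0)))  # output, shape (m,)
    return h, y

x = rng.random(n)          # one normalized sample
h, y = forward(x, theta, phi)
print(h.shape, y.shape)    # (10,) (1,)
```

Because every activation is an amplitude, all layer outputs land in $[0, 1]$ by construction, which is what lets the text identify the network output directly with the probability amplitude of $|1\rangle$.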
Assuming the normalized expected output is $(\tilde{y}_{g1}, \tilde{y}_{g2}, \ldots, \tilde{y}_{gm})$, the error function of the network can be defined from the expected output $\tilde{y}_{gk}$ and the real output $y_{gk}$ as

$$E = \frac{1}{2}\sum_{k=1}^{m} \left(\tilde{y}_{gk} - y_{gk}\right)^2 .$$

According to the gradient descent algorithm, the gradient of $E$ with respect to the rotation angle of the quantum rotation gate in each layer of the network follows by the chain rule, where $h_{gj}$, the actual output of the hidden layer as shown in Eq. (4), appears in the hidden-layer terms.
Therefore, the updating formula of the rotation angle of the quantum rotation gate in each layer of the network is

$$\theta(t+1) = \theta(t) - \eta\,\frac{\partial E}{\partial \theta},$$

where $t$ is the number of iteration steps and $\eta$ is the learning rate.
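For a single toy "node" whose output is $\sin\theta$, the update rule above can be exercised as follows; the paper's analytic gradient is replaced here by a central-difference approximation, and the target value, learning rate and iteration count are chosen for the illustration.

```python
import numpy as np

def sse(y, y_target):
    """Sum-of-squares error E of Eq. (8), for a single output."""
    return 0.5 * np.sum((y_target - y) ** 2)

def update_angle(theta, grad, lr=0.2):
    """One gradient-descent step: theta(t+1) = theta(t) - lr * dE/dtheta."""
    return theta - lr * grad

# Toy example: a node whose output is sin(theta), trained toward target 0.9.
target = 0.9
E = lambda th: sse(np.sin(th), target)

def num_grad(th, eps=1e-6):
    # central-difference gradient (stands in for the analytic formula)
    return (E(th + eps) - E(th - eps)) / (2 * eps)

theta = 0.1
for t in range(500):
    theta = update_angle(theta, num_grad(theta))

print(round(float(np.sin(theta)), 3))  # 0.9
```

The node's output converges to the target because the only adjustable quantity is the rotation angle itself, mirroring how QGCNN training adjusts angle parameters rather than classical weights.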

Quantum gate circuit neural network model optimized by quantum genetic algorithm
To overcome the shortcomings of the traditional artificial neural network, such as slow learning convergence, the tendency to fall into local minima rather than reach the global optimum, and the random selection of initial weights, this paper uses the quantum genetic algorithm to optimize the quantum gate circuit neural network model. The weights of the QGCNN network are optimized by QGA, and the optimal search domain is selected in the specific solution space, so that the error between the output of the optimized QGCNN model and the solution of the objective function approaches 0. The QGA-QGCNN network model thus constructed is shown in Fig. 2.
The QGA-QGCNN model consists of three parts from top to bottom, respectively, the preprocessing module, the training module (QGA optimization module and QGCNN module) and the testing module. The preprocessing module, as its name implies, completes the task of data preprocessing, which is to normalize the original data set. The purpose is to eliminate the dimensional influence between indicators and to solve the comparability between data indicators. After data preprocessing, each index is in the same order of magnitude, which is suitable for comprehensive comparative evaluation.
Next, the training set and the test set are divided according to the proportion set by the user. The network optimization module completes the initial optimization of the weights, i.e., it uses the QGA algorithm to optimize the initial rotation angles $\theta$ and $\varphi$ of the QGCNN network in accordance with the procedure of Algorithm 1. The initial angles $\theta$ and $\varphi$ output by the optimization module are set as the initial values of the QGCNN hidden-layer quantum gate matrix $U(\theta)$ and the output-layer quantum gate matrix $U(\varphi)$, respectively, and the model is trained according to the QGCNN algorithm. Finally, the test module performs network intrusion detection. The procedure is illustrated in Algorithm 2.
Step 2 is the key step in which QGA optimizes QGCNN: it sets the fitness function to $f = 1/E$, where the sum of squared errors $E$ is given by Eq. (8). After the quantized samples of the training set are processed, $E$ can be obtained. Moreover, for a real sample $X = [x_1, x_2, \ldots, x_n]^T$, the quantum state transformation is given by Eq. (12).
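The fitness evaluation of this step can be sketched as follows, using the initial errors later reported in Table 5 as sample inputs; the `eps` guard against division by zero is an added safeguard, not part of the paper's definition.

```python
def fitness_from_error(E, eps=1e-12):
    """Fitness used by the QGA optimization step: the reciprocal of the
    sum-of-squares error E (Eq. (8)), so lower error means higher fitness.
    eps guards against division by zero for a perfect fit."""
    return 1.0 / (E + eps)

# Initial errors of QGCNN, GA-QGCNN and QGA-QGCNN from Table 5:
errors = [0.37, 0.16, 0.11]
print([round(fitness_from_error(e), 2) for e in errors])  # [2.7, 6.25, 9.09]
```

The ordering of these fitness values matches the later finding that QGA yields the best initial weights of the three schemes.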

Build effective feature sets
When constructing a network intrusion detection model, the construction of an effective feature set [26] and the design of the classification model are very important, as both directly affect detection performance. The original network intrusion data contain a large number of irrelevant and redundant features; if these raw data are fed directly into the detection model, not only is detection accuracy reduced, but the training time of the model is also increased.
The experimental data set selected in this paper is the KDD CUP99 data set. Each connection record contains 41 characteristic attribute fields (34 numerical and 7 symbolic) plus 1 class identification field.
The data set contains four types of attacks: DoS, R2L, U2R and Probe.

The effective feature subset of the KDD CUP99 data set, selected by the GA-based intrusion detection feature selection algorithm proposed in Ref. [27], is: {protocol_type, service, flag, src_bytes, dst_bytes, su_attempted, count, dst_host_count, dst_host_srv_count, dst_host_diff_srv_rate, dst_host_srv_diff_host_rate, dst_host_rerror_rate}.

To verify the effectiveness of the proposed QGA-QGCNN model through simulation, the experimental environment is Windows 10 with MATLAB 9.0 (R2016a), and the simulation data adopt the 10% KDD CUP99 subset (see Table 2), widely used in the field of network intrusion detection.
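Projecting connection records onto this subset can be sketched as follows; the attribute names come from the list above, while the toy header and record are purely illustrative.

```python
# The 12-feature subset from Ref. [27], by KDD CUP99 attribute name.
SELECTED = [
    "protocol_type", "service", "flag", "src_bytes", "dst_bytes",
    "su_attempted", "count", "dst_host_count", "dst_host_srv_count",
    "dst_host_diff_srv_rate", "dst_host_srv_diff_host_rate",
    "dst_host_rerror_rate",
]

def select_features(record, header, selected=SELECTED):
    """Project one connection record onto the chosen subset. `header` lists
    the attribute names in file order; names absent from `header` are
    skipped so this toy example stays self-contained."""
    idx = {name: i for i, name in enumerate(header)}
    return [record[idx[n]] for n in selected if n in idx]

# Toy illustration with a 3-attribute header:
header = ["protocol_type", "service", "wrong_fragment"]
record = ["tcp", "http", "0"]
print(select_features(record, header))  # ['tcp', 'http']
```

Dropping the other 29 attributes before training is what reduces both the detection model's input dimension and its training time, as the paragraph above argues.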

Network intrusion detection
Network intrusion refers to exploiting vulnerabilities and security flaws of a network information system to attack the system and its resources [28]. It covers any kind of offensive action against computer information systems, infrastructure, computer networks or personal computer equipment. Network intrusion can be divided into active and passive attacks. Active attacks lead to the tampering of data flows and the production of false data flows; they can be divided into tampering with messages, forging message data, and denial of service. In a passive attack, by contrast, the attacker makes no changes to the data: without the consent or approval of the user, the attacker obtains information or relevant data. Such attacks typically include eavesdropping, traffic analysis and cracking weakly encrypted data streams.
Network Intrusion Detection System (NIDS) [29,30] refers to the combination of software and hardware to detect the behaviors that endanger the security of computer system, such as collecting vulnerability information, causing access denial and obtaining system control rights beyond the legal scope. The purpose of NIDS is to identify potential attacks from TCP/IP message flows over the network. An intrusion detection system mainly consists of three functional components: information collection, information analysis and result processing (response). Its functional structure is shown in Fig. 3.
As shown in Fig. 2, the QGA-QGCNN model is a three-module structure, and an intrusion detection system likewise has a three-layer logical structure. Therefore, the QGA-QGCNN model and the intrusion detection system can be put in correspondence and applied to network intrusion detection. In an NIDS, the information collection layer corresponds to the data preprocessing module of QGA-QGCNN, the information analysis layer corresponds to the training module, and the result processing layer corresponds to the testing module.
After the QGCNN model is trained with QGA optimization, the process of network intrusion detection and analysis based on QGA-QGCNN is as follows:

Step 1 Feature selection on the original KDD CUP99 data set [31] reduces the feature dimension, speeds up detection and eliminates the large number of irrelevant and redundant features contained in the original network intrusion data.
Step 2 Normalize the data set after feature selection, quantize the normalized test samples with Eq. (12), and prepare the corresponding quantum states, where $i$ denotes the $i$-th sample in the system and $n$ is the feature dimension of a sample.
Step 3 The quantized sample characteristic parameters are input into the trained QGA-QGCNN model in turn, and the network output $y_i$ is calculated.
Step 4 The process of network intrusion detection is shown in Fig. 4. The error between the actual output $y_i$ and the expected output $\tilde{y}_i$ of a prediction sample is $e_i = |y_i - \tilde{y}_i|$, and the average error over the prediction samples is $\bar{e} = \frac{1}{n}\sum_{i=1}^{n} e_i$; these realize the evaluation and quantitative analysis of the network intrusion detection model.
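Step 4's error computation is straightforward; the sketch below uses hypothetical network outputs and labels.

```python
def prediction_errors(y, y_expected):
    """Per-sample absolute error e_i = |y_i - y~_i| and the average error
    e_bar = (1/n) * sum(e_i) used to evaluate the detection model."""
    e = [abs(a - b) for a, b in zip(y, y_expected)]
    return e, sum(e) / len(e)

y = [0.93, 0.08, 0.85, 0.12]       # hypothetical network outputs
y_tilde = [1.0, 0.0, 1.0, 0.0]     # expected labels (1 = attack, 0 = normal)
e, e_bar = prediction_errors(y, y_tilde)
print([round(v, 2) for v in e], round(e_bar, 3))  # [0.07, 0.08, 0.15, 0.12] 0.105
```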

Evaluation metrics
Important concepts related to the evaluation metrics in intrusion detection are shown in Table 3. It should be noted that the terms "positive" and "negative" do not refer to the value of the condition of interest, but to its presence or absence; the condition itself could, for example, be a disease, so that "positive" might mean "diseased" while "negative" might mean "healthy".
The evaluation metrics defined according to the concepts in Table 3 are as follows, which are also used as one of the evaluation metrics of our experiment.
Since even a small amount of undetected attacks can cause network paralysis and become a security threat, FNR is generally considered more important in network intrusion detection.
In order to compare with other detection models, we also compare the following evaluation metrics:
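These metrics can all be computed from the confusion counts of Table 3; the sketch below uses hypothetical counts and standard textbook definitions.

```python
def detection_metrics(tp, fn, fp, tn):
    """Standard metrics from confusion counts (Table 3). FNR, the share of
    attacks missed, is emphasized in intrusion detection because undetected
    attacks are costlier than false alarms."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # a.k.a. detection rate
    fnr = fn / (tp + fn)             # false negative rate = 1 - recall
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, fnr, f1

# Hypothetical counts: 90 attacks caught, 10 missed, 5 false alarms, 895 normal.
acc, p, r, fnr, f1 = detection_metrics(tp=90, fn=10, fp=5, tn=895)
print(round(acc, 3), round(fnr, 2))  # 0.985 0.1
```

Note how a model can reach high accuracy and precision while still missing one attack in ten, which is exactly why the text treats FNR as the more important metric.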

Parameter setting of the model
In this paper, the number of hidden-layer neurons $H$ is determined by the empirical formula $H = \sqrt{I + O} + a$ [32], where $a$ is a constant between 1 and 10, $I$ is the number of input neurons (given by the input vector length of a sample), and $O$ is the number of output neurons (given by the length of the output vector). After comparing the results of several trials, the basic structure of the QGA-QGCNN model is set to the 12-10-1 type. As shown in Fig. 1, these parameters refer to the numbers of quantum circuits in the input, hidden and output layers, i.e., $n$, $p$ and $m$, respectively.
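Assuming the empirical rule takes the common form $H = \sqrt{I + O} + a$ (the exact formula cited as Ref. [32] is an assumption here), a quick check recovers the chosen 12-10-1 structure:

```python
import math

def hidden_size(I, O, a):
    """Empirical rule H = sqrt(I + O) + a (an assumed common form of the
    rule cited as Ref. [32]), with a a constant between 1 and 10."""
    return round(math.sqrt(I + O) + a)

# 12 input circuits and 1 output circuit; a = 6 yields 10 hidden circuits.
print(hidden_size(12, 1, 6))  # 10
```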
The choice of hyperparameters changes the performance of the model. Therefore, we experiment by varying the population size and the learning rate. The larger the population, the less likely a single individual's solution is to dominate the evolutionary direction of the whole population; the smaller the population, the slower the search for the optimum and the greater the risk of local optima. Generally, one must balance solution quality against computation time, i.e., reduce computation time as much as possible while preserving solution quality. Similarly, if the learning rate is too small, convergence takes longer and the model may settle in a local rather than the global optimum. Conversely, if the learning rate is too large, gradient descent may overshoot the minimum and fail to converge. Choosing an appropriate learning rate therefore helps us determine the best weights.
We set the range of the population size from 50 to 300 and varied the learning rate from 10^−4 to 10^−1 while measuring Accuracy and FNR. In the results shown in Table 4, we can observe that FNR decreases as the learning rate increases; however, when the learning rate exceeds 0.01, the FNR of the model increases instead. The main reason is that a learning rate set too large prevents the model from converging, so the model cannot be trained effectively. The Accuracy results likewise reflect the relationship between the learning rate and detection performance. In addition, FNR decreases as the population size increases; once the population size reaches 200, FNR no longer decreases significantly and even increases. Therefore, we set 200 as the optimal population size and 10^−2 as the optimal learning rate.

Analysis of simulation results
Based on the above data set and parameter settings, we use the mapminmax function provided by MATLAB to perform Min-Max normalization on the 12 input feature attributes of the training and test sets. We then train the QGA-QGCNN model completely with the proposed learning algorithm; the process is shown in Fig. 4. Finally, the trained model is used for network intrusion detection: it classifies the test set according to the learned knowledge and outputs the detection results. Figure 5 lists the detection performance of our proposed model compared with typical network intrusion detection techniques: LSTM (256 hidden units), ANN (2 hidden layers), Support Vector Machine, k-nearest neighbors (k = 5), Naïve Bayes and Decision Trees.
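For readers outside MATLAB, the mapminmax step can be reproduced per feature column as follows; note that MATLAB's mapminmax scales to [-1, 1] by default.

```python
def minmax_normalize(column, lo=-1.0, hi=1.0):
    """Min-Max scaling of one feature column to [lo, hi], mirroring
    MATLAB's mapminmax (default output range [-1, 1])."""
    xmin, xmax = min(column), max(column)
    if xmax == xmin:
        return [lo for _ in column]   # constant column: map to the lower bound
    return [(hi - lo) * (x - xmin) / (xmax - xmin) + lo for x in column]

print(minmax_normalize([0, 5, 10]))  # [-1.0, 0.0, 1.0]
```

Applying this per column puts all 12 input attributes on the same scale, which is the comparability goal stated for the preprocessing module.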
The above models are based on typical machine learning techniques. Since the task of intrusion detection is to classify and identify network attacks, the selected models are based on commonly used classification algorithms. In addition, since the KDD CUP99 data set contains time series information, we also selected an LSTM model suitable for processing time series data for comparison.
It can be concluded from the experimental results (see Fig. 5) that the six detection models QGA-QGCNN, LSTM, ANN, SVM, KNN and DT all perform well for DoS, R2L and Probe intrusion detection; that is, they achieve accurate early warning of these three kinds of attacks with few false alarms. However, the NB model's precision for R2L is unsatisfactory, at only about 60%. Moreover, our proposed model has the lowest FNR and the highest accuracy.
For the detection of U2R attacks, QGA-QGCNN and LSTM show better detection performance with relatively low FNR, while the detection performance of the other models is comparatively poor. Although ANN, SVM, KNN, DT and NB show high precision for U2R detection, their FNR is relatively high and their recall is low, leading to low F1 scores. This means their false alarms are very rare, but these models tend to classify the data as normal; they cannot identify U2R attacks as accurately as the QGA-QGCNN and LSTM models. These results stem from the fact that U2R has only 52 samples, accounting for just 0.04% of the total training set. In a big data environment, data labeling is difficult, so the classifier cannot be effectively trained. This also explains why the prediction performance of the QGA-QGCNN model for U2R does not reach the accuracy achieved for the other three attack types.
The LSTM model is usually well suited to processing and predicting important events with relatively long intervals and delays in time series [33]. However, although the precision rates of QGA-QGCNN and LSTM are similar, the FNR of QGA-QGCNN is better. Because the QGA-QGCNN model processes and stores information in superposition, under the same conditions it learns more "knowledge". QGCNN applies the quantum-theoretic idea of superposition of states, so that the training of the quantum neural network is itself a superposition of multiple results, giving the resulting network an inherent fuzziness. Both theory and experiment [34] show that such a quantum gate circuit neural network model has an excellent classification effect for pattern recognition problems with uncertainty and cross data between patterns.
In general, our proposed QGA-QGCNN-based model has lower FNR when identifying four types of attacks, and the precision rate, recall rate and F1 value are better than other detection models. In other words, QGA-QGCNN can achieve more accurate intrusion alarm in the case of relatively low false alarm rate, that is, it can well understand the nature of network data and capture the changes of data.
Since QGA is an intelligent optimization algorithm combining quantum computing with the genetic algorithm, and it is responsible for the performance improvement of the model, we analyze it next. Figure 6 shows the convergence of the fitness function for the GA-QGCNN and QGA-QGCNN models. The simulation results show that the search efficiency of QGA is better than that of GA, with faster convergence and higher convergence precision. After 10 generations of QGA evolution, the fitness of individuals rises significantly, reaching 3.95 and beginning to converge; after 32 generations, the fitness has essentially converged, indicating that the fitness value has reached the optimum of the run. At the same stage, the optimal fitness value of QGA is better than that of GA. Table 5 lists the initial errors and iteration steps of QGCNN, GA-QGCNN and QGA-QGCNN at the end of the experiment, and Fig. 7 shows the training error curves of the three network models.
It can be seen from Table 5 and Fig. 7 that the network errors corresponding to the initial weights optimized by GA and QGA are 0.16 and 0.11, respectively, both lower than the error of 0.37 for the unoptimized QGCNN with randomly initialized weights, with QGA the stronger optimizer of the two. As an auxiliary algorithm for the QGCNN model, QGA shows good global optimization ability and effectively reduces the model's initial network error. The convergence curves of Fig. 7 and their zoomed-in portions show that QGA-QGCNN has the steepest initial slope: its error drops fastest and reaches the target value first, followed by GA-QGCNN, with the plain QGCNN model slowest. Together with Table 5, this shows that QGA-QGCNN accelerates convergence and improves the accuracy of intrusion detection. The rotation angles $\theta_{ij}$ and $\varphi_{jk}$ of QGCNN range over $[0, 2\pi]$. In the iterative process, global optimal solutions recur across this interval, i.e., there are multiple attractors per period, which significantly improves the convergence speed of the network. The initial weight matrix $W$ of an ANN is randomly selected from $[-1, 1]$, and the optimal solution does not recur, so the ANN has fewer global optima than the QGCNN network; during weight iteration, a single attractor slows the convergence of the network. Moreover, the ANN updates its weights with gradient descent on a more complex objective, which further slows convergence: when an output is close to 0 (or 1), the weight error changes very little, so the training process is very slow and more iteration steps are required.
It can be seen from Table 5 and Fig. 7 that, for an expected error of 0.0001, the QGCNN model needs 79 training steps, the GA-QGCNN model needs 35, and the QGA-QGCNN model needs 27, indicating that the proposed QGA-QGCNN clearly outperforms GA-QGCNN in convergence speed.

Conclusion and prospect
Based on an analysis of the network intrusion detection model and quantum gate circuits, this paper proposes QGA-QGCNN, a quantum gate circuit neural network model and algorithm optimized by the quantum genetic algorithm, and applies it to the field of network intrusion detection. It differs from traditional neural networks in its weighting, aggregation, activation and other mechanisms. In the QGA-QGCNN model, the rotation angle of the quantum rotation gate is the parameter to be adjusted: the quantum rotation gate performs the phase rotation and acts as the control bit that flips the qubit. The output value of each layer of the network is determined by the quantum rotation gate and a multi-qubit controlled-NOT gate. QGCNN makes full use of the superposition characteristics of quantum states to handle uncertain problems, which improves its ability to express the uncertain information in the target to be detected and further improves accuracy on the target problem.
Traditional algorithms easily fall into local minima, and intelligent optimization algorithms suffer from slow convergence and insufficient local search ability. In this paper, the QGA, an improvement on the GA, is applied to the learning process of the QGCNN neural network to correct its quantum rotation angles and improve its convergence performance. Both the QGA-QGCNN model and an NIDS have a three-layer structure (see Figs. 2 and 3) with similar data processing flows; we mapped the model onto the NIDS and constructed a QGA-QGCNN-based model suitable for intrusion detection (see Fig. 4). Simulation results show that, compared with well-known network intrusion detection schemes such as LSTM, ANN, SVM, kNN, NB and DT, the proposed scheme shows good detection performance. Moreover, the QGCNN optimized by QGA converges better than both the unoptimized model and the one optimized by the original GA. In future work, we will develop a dynamic QGA to provide larger-scale performance computation for the system. If future analysis shows that the convergence-speed advantage of QGA grows stronger in a more extensive search space, that will further validate the efficiency of QGA.