Cooperative Attack-Defense Decision-Making of Multi-UAV Using Satisficing Decision-Enhanced Wolf Pack Search Algorithm

Unmanned Aerial Vehicles (UAVs) have shown their superiority in complicated military missions. In this paper, a cooperative attack-defense decision-making method based on the satisficing decision-enhanced wolf pack search (SDEWPS) algorithm is developed for multi-UAV air combat. First, the mathematical model of multi-UAV air combat is provided and the attack-defense decision-making constraints are defined. Besides the traditional air combat situation, the capability of UAVs and target information, including target type and target intention, are considered to establish the air combat superiority function. Then, the wolf pack search (WPS) algorithm is used to solve the attack decision problem. To improve efficiency, satisficing decision theory is employed to enhance the WPS so that it obtains a satisficing solution rather than the optimal solution. The simulation results show that the developed method can realize cooperative attack decision-making.


Introduction
Owing to their stealth performance, autonomous capability and ability to reduce casualties, Unmanned Aerial Vehicles (UAVs) are widely used in modern air combat. Moreover, UAVs have clear advantages in carrying out military activities, including high maneuverability and the ability to avoid mistakes caused by pilots [1]. On the other hand, because combat missions are becoming more diverse in an increasingly complex battlefield environment, multi-UAV cooperative combat will be a primary air combat mode in the future. To further increase the combat effectiveness of UAVs, it is important and necessary to concentrate on the cooperative attack-defense decision-making of multi-UAV [2].
The cooperative attack-defense decision-making of multi-UAV is a challenging problem in which, to obtain better combat effectiveness at lower cost, targets of different positions, types and threat levels are allocated to UAVs with different functionality and performance. In general, cooperative attack-defense decision-making is a combinatorial optimization problem with various constraints, and several valuable works addressing it can be found in the literature. A genetic algorithm was used to obtain the decision-making plan in a multi-agent cooperative multiple-task assignment system [3,4]. An improved nondominated sorting genetic algorithm-II with an elitist strategy was adopted in [5] to search for the Pareto optimal solutions of multi-UAV task allocation for cooperatively attacking multiple targets. A simulated annealing genetic algorithm was proposed in [6,7] to find the optimal solution to the missile-target assignment problem. In [8], a method based on collective intelligence theory was studied to solve the air combat decision-making problem for coordinated attacks on multiple targets. Although all of the above-mentioned works can realize cooperative attack-defense decision-making, more efficient decision-making algorithms still need to be studied to reduce the high algorithmic complexity.
In future unmanned air combat, the rapidity and accuracy of decision-making are very important. The wolf pack search (WPS) algorithm imitates the hunting activity of a wolf pack. Compared with other swarm intelligence algorithms, the WPS algorithm has advantages in obtaining high-quality solutions and in avoiding the defect of falling into local optima [9]. Thus, the WPS algorithm is widely employed in combinatorial optimization problems, such as attack-defense decision-making, dynamic replanning and path planning of UAVs [10][11][12][13]. In [10], a modified two-part wolf pack search algorithm was presented to solve the combinatorial optimization model of multi-UAV cooperative attack-defense decision-making. In [13], the grey WPS algorithm was utilized to find a feasible trajectory that avoids collisions with obstacles and other UAVs, and to solve the three-dimensional path planning problem of UAVs.
Satisficing decision is a set-theoretic idea based on game theory. To improve the speed of solving optimization problems, satisficing decision aims to obtain a satisficing solution rather than the optimal solution [14]. On this basis, satisficing decision theory can combine the advantages of different intelligent algorithms, and a combined framework can be built for decision-making systems that require much less computation than existing systems. Furthermore, owing to its rapidity, satisficing decision theory has been widely used in different fields, such as control science, coordinated target assignment and multi-agent interaction [15][16][17]. A multi-UAV target assignment algorithm based on satisficing decision theory was studied for air-to-ground attack in [17], where satisficing decision was employed to improve the search efficiency of the optimization problem. In [18], satisficing decision was applied to obtain a near-optimal objective function value, and the solving efficiency of the multi-UAV task assignment problem was improved. Hence, satisficing decision theory is employed in this paper to enhance the WPS algorithm for cooperative attack-defense decision-making of multi-UAV.
The main contributions of this paper include:
- Besides the traditional air combat situation indicators, the capability of UAVs and target information, including target type and target intention, are considered in establishing the air combat superiority function.
- To improve search efficiency, the WPS algorithm is enhanced with satisficing decision theory to obtain a satisficing solution rather than the optimal solution.
This paper is organized as follows. The problem description, the mathematical model of air combat and the target assignment constraints are given in Section 2. In Section 3, a satisficing decision-enhanced wolf pack search (SDEWPS) algorithm is proposed to solve the cooperative attack decision-making problem of multi-UAV. The simulation results are presented in Section 4. Finally, further discussions and conclusions are presented in Section 5.

Unmanned air combat superiority function
In unmanned air combat, attack-defense decision-making is necessary for UAVs and is beneficial for improving the intelligence and flexibility of the command and control system. The problem described in this paper is to assign M UAVs to attack N targets. Namely, the objective is to design an attack-defense decision-making algorithm based on air combat information so as to obtain the cooperative attack-defense scheme of M UAVs against N targets and to determine which UAVs are used for attack and which for defense.
With the development of military technology in different countries, the capabilities of their UAVs vary tremendously, so that the capability priority has more impact on unmanned air combat. On the other hand, if the target intention is predicted in advance, a UAV can stay one step ahead in the air competition and decision-making becomes more flexible. Hence, besides the traditional air combat situation priority, the capability priority and target information (including target type and target intention) are all considered in this paper to establish the air combat superiority function. The structure of the cooperative attack decision of UAVs is shown in Fig. 1.

Air Combat Situation Priority
The air combat situation priority is related to the air combat situation and the relative positions of targets. The performance priority depends on the performance of the UAV, such as its weapons and sensors. Referring to [19], the air combat situation priority functions are given as follows.
Angle priority $P_\alpha$ is defined as [19]

$$P_\alpha = 1 - \frac{|\phi| + |\rho|}{180^\circ} \tag{1}$$

where $\phi$ is the position angle, $\rho$ is the target entrance angle, and $|\cdot|$ denotes the absolute value.
Velocity priority $P_v$ is given by (2) [19] in terms of $v_u$ and $v_t$, the velocities of the UAV and the target.
Distance priority $P_d$ is defined as follows [19]. Case 1: if the target performance is better than that of the UAV, we have $r_m < r_{mt} < r_r < r_{rt}$, and $P_d$ is defined by (3). Case 2: if the target performance is worse than that of the UAV, we have $r_{mt} < r_m < r_{rt} < r_r$, and $P_d$ is defined by (4). Here $R$ is the distance between the target and the UAV, $r_m$ is the maximum missile range of the UAV, $r_{mt}$ is the maximum missile range of the target, $r_r$ is the maximum detection range of the UAV radar, and $r_{rt}$ is the maximum detection range of the target radar. From (3) and (4), when the UAV and the target can attack each other, or can only detect each other while neither is able to attack, $P_d = 0.5$.
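As a quick numerical check of the angle priority in (1), the following Python sketch evaluates $P_\alpha$ for a few angle pairs. The function name and the use of degrees as the input unit are assumptions made for illustration; only the formula itself is taken from the text.

```python
def angle_priority(phi_deg: float, rho_deg: float) -> float:
    """Angle priority P_alpha = 1 - (|phi| + |rho|) / 180, with angles in degrees.

    phi_deg: position angle, rho_deg: target entrance angle.
    """
    return 1.0 - (abs(phi_deg) + abs(rho_deg)) / 180.0


if __name__ == "__main__":
    # Both angles zero gives the largest priority.
    print(angle_priority(0.0, 0.0))
    # Only magnitudes matter, so the signs of the angles do not change the value.
    print(angle_priority(-45.0, 45.0))
    print(angle_priority(90.0, 90.0))
```

Because only the absolute values enter the formula, mirrored geometries receive the same angle priority.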
Height priority $P_h$ is defined following [19] in terms of $h$, the height difference between the target and the UAV. Weapon priority $P_w$ is given following [19] in terms of $ll$ and $ll_t$, the numbers of missiles carried by the UAV and the target. From the above, the air combat situation priority is defined following [19] with weights $k_1, k_2, k_3, k_4$ for the individual priorities. However, due to the uncertainty and incompleteness of modern air combat, some of the air combat situation information may be missing or undetectable. Thus, the entropy weight method is used to determine these weights.
Define $P_1 = P_\alpha$, $P_2 = P_v$, $P_3 = P_d$, $P_4 = P_h$. The entropy of each priority is then calculated following [19], and the weights $k_s$ are obtained from these entropies. If some information is missing or cannot be obtained, the corresponding weight is set to $k_s = 0$.
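The entropy weight step can be illustrated with a minimal Python sketch. It uses the common textbook form of the entropy weight method (column-normalized values, Shannon entropy scaled by $\ln n$, weights proportional to $1 - $ entropy), which is an assumption here and may differ in detail from the expressions in [19]. The function name, the `available` mask and the sample values are hypothetical.

```python
import math


def entropy_weights(samples, available=None):
    """Entropy-weight sketch (ASSUMED textbook form, not necessarily that of [19]).

    samples:   list of rows; each row holds non-negative priority values
               (one column per priority, one row per UAV-target pair).
    available: optional list of booleans; a False entry marks a priority whose
               information is missing, so its weight is forced to 0.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed")
    m = len(samples[0])
    if available is None:
        available = [True] * m

    divergence = []
    for s in range(m):
        if not available[s]:
            divergence.append(0.0)  # missing information -> weight 0
            continue
        column = [row[s] for row in samples]
        total = sum(column)
        entropy = 0.0
        for value in column:
            if value > 0:  # the term 0 * ln(0) is taken as 0
                p = value / total
                entropy -= p * math.log(p)
        entropy /= math.log(n)
        divergence.append(max(0.0, 1.0 - entropy))

    norm = sum(divergence)
    if norm <= 1e-12:
        # All usable columns are uniform: fall back to equal weights.
        count = sum(1 for a in available if a)
        return [1.0 / count if a else 0.0 for a in available]
    return [d / norm for d in divergence]


if __name__ == "__main__":
    data = [[0.9, 0.2, 0.5], [0.1, 0.8, 0.5]]  # hypothetical priority values
    print(entropy_weights(data, available=[True, True, False]))
```

A priority whose values barely differ across samples carries little information and receives a small weight, while a priority flagged as unavailable receives weight 0, matching the rule stated in the text.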

Capability Priority
Apart from the air combat situation priority, the capability of the UAV also has a great impact on cooperative attack-defense decision-making. Based on [20], the UAV capability $Cap$ is calculated from the following quantities: $B$ is the maneuverability coefficient; $A_1$ and $A_2$ are the strike capability and detection capability of the UAV; and $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ and $\varepsilon_4$ are the maneuvering capability, survivability, endurance and electronic countermeasure capability.
For convenience of calculation, the capability priority $P_C$ is obtained by normalization from $Cap_u$, the capability of the UAV, and $Cap_t$, the capability of the target.

Target information
In general, the target information is related to target type and target intention. Targets can usually be divided into two types: combat UAVs, and support UAVs used for detection, surveillance and reconnaissance. The threat levels of combat UAVs and support UAVs are different. Furthermore, UAVs can be divided into leaders and followers. The responsibility of a follower is to help and follow the leader. Similarly, the importance degrees of leaders and followers are also different. The UAV value is used to reflect the importance degree in this paper; a greater threat level indicates that the corresponding UAV poses more threat. The UAV value is expressed in terms of $V_c$ and $V_s$, the threat levels of combat UAVs and support UAVs, and $V_l$ and $V_f$, the threat levels of leaders and followers.
Target intention is rarely taken into account in traditional air combat assessment models. In this paper, the target intention set is defined as [21]

$$I = \{A, S, P, F, D, R, C, E\}$$

where $A$, $S$, $P$, $F$, $D$, $R$, $C$ and $E$ denote the target intentions of attack, surveillance, penetration, feint, defense, reconnaissance, cover, and electronic interference, respectively.
The threat level $P_V$ of each intention is given in Table 1.

Problem Formulation
The cooperative attack-defense decision-making problem of multi-UAV can be formulated as a combinatorial optimization problem. The air combat superiority function is expressed following [22] as a weighted combination of the air combat situation priority, the capability priority and the target threat level, with weights $k_S$, $k_C$ and $k_V$, respectively.
Suppose that there are $M$ UAVs against $N$ targets. Specifically, the air combat superiority function of UAV $i$ attacking target $j$ is $P_{ij}$.
Then, the decision-making variable is defined as

$$x_{ij} = \begin{cases} 1, & \text{if UAV } i \text{ is assigned to attack target } j \\ 0, & \text{otherwise} \end{cases}$$

Hence, the performance index of cooperative attack decision-making is defined as the total superiority of the assignment,

$$\sum_{i=1}^{M} \sum_{j=1}^{N} P_{ij} x_{ij}$$

In order to ensure that all targets are attacked, each target should be assigned to at least one UAV [22]. Namely, we have

$$\sum_{i=1}^{M} x_{ij} \ge 1, \quad j = 1, \dots, N$$

To keep the UAVs at full operational effectiveness and to ensure firepower balance, firepower should be spread over different targets. Assume that the maximum number of UAVs attacking target $j$ is $D_j$ [22]. Then, we have

$$\sum_{i=1}^{M} x_{ij} \le D_j, \quad j = 1, \dots, N$$

In addition, in order to guarantee the safety of the UAVs, the number of targets that each UAV attacks is limited. Assume that the maximum number of targets that UAV $i$ can attack is $E_i$ [22]. Then, we have

$$\sum_{j=1}^{N} x_{ij} \le E_i, \quad i = 1, \dots, M$$

To sum up, the objective function of cooperative attack decision-making is given by

$$\max \sum_{i=1}^{M} \sum_{j=1}^{N} P_{ij} x_{ij} \quad \text{s.t.} \quad 1 \le \sum_{i=1}^{M} x_{ij} \le D_j, \quad \sum_{j=1}^{N} x_{ij} \le E_i$$

The aim of this work is to design a fast and efficient attack-defense algorithm that maximizes the objective function and yields the cooperative attack-defense scheme, thereby completing the attack-defense decision-making of $M$ UAVs against $N$ targets.
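The three assignment constraints described above (every target attacked at least once, at most $D_j$ UAVs per target, at most $E_i$ targets per UAV) can be checked mechanically. The sketch below assumes the assignment is stored as a binary $M \times N$ matrix `x` and the superiority values as a matrix `P`, and it assumes the performance index is the sum of $P_{ij}$ over the assigned pairs. The function names and sample numbers are hypothetical.

```python
def total_superiority(P, x):
    """Sum of P[i][j] over all assigned pairs (ASSUMED form of the performance index)."""
    return sum(P[i][j] * x[i][j] for i in range(len(P)) for j in range(len(P[0])))


def is_feasible(x, D, E):
    """Check the three assignment constraints for a binary M x N matrix x.

    D[j]: maximum number of UAVs allowed to attack target j.
    E[i]: maximum number of targets that UAV i may attack.
    """
    M, N = len(x), len(x[0])
    for j in range(N):
        attackers = sum(x[i][j] for i in range(M))
        # Each target needs at least one attacker and at most D[j] attackers.
        if attackers < 1 or attackers > D[j]:
            return False
    for i in range(M):
        # Each UAV attacks at most E[i] targets (it may also attack none).
        if sum(x[i]) > E[i]:
            return False
    return True


if __name__ == "__main__":
    P = [[0.6, 0.2], [0.3, 0.7]]  # hypothetical superiority values
    x = [[1, 0], [0, 1]]          # UAV 1 -> target 1, UAV 2 -> target 2
    print(is_feasible(x, D=[1, 1], E=[1, 1]), total_superiority(P, x))
```

A UAV with an all-zero row is still feasible under these constraints, which is consistent with UAVs being held back for defense.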

Attack-defense Decision-making Based on SDEWPS Algorithm
The WPS algorithm is a new meta-heuristic algorithm abstracted from the behavior of wolves cooperatively besieging prey in nature [23]. Based on the natural law that "the strong survive", the algorithm ensures that the wolf pack can quickly besiege prey and avoid being trapped in local optima through a cooperative search mode with a clear assignment of responsibilities. Furthermore, this rigorously organized system of the wolf pack is consistent with the idea of multi-UAV cooperative attack-defense decision-making. However, for various complex reasons, determining the strictly optimal solution of a practical optimization problem may not be feasible with the given resources [10], and it may waste a great deal of time. To avoid this problem, searching for a satisfactory solution in place of the optimal solution has attracted increasing attention. Satisficing decision is an improved exhaustive method based on game theory, whose origin can be traced back to Simon's "Minimum standard concept" [24]. In satisficing decision, once a solution is found that meets the expectations of the decision maker, the search can be terminated [25]. Thus, satisficing decision theory is employed to enhance the wolf pack algorithm so as to obtain a satisficing solution and improve the search efficiency.
The structure of the SDEWPS algorithm for attack-defense decision-making in unmanned air combat is shown in Fig. 2.
In order to better describe attack-defense decision-making based on the SDEWPS algorithm, some definitions are given as follows [9].
Definition 1 (Distance of wolves) The distance $d_w(p, q)$ between wolves $p$ and $q$ is defined through the XOR operation $\oplus$ applied to their encodings.
Definition 2 (Crossover operator) Assume that the position of the $i$th wolf is $X_i = (x_{i1}, x_{i2}, \dots, x_{iL})$. Then the crossover operator is defined as a two-dimensional array $(x_{ij}, x_{ik})$, where $j, k \in \{1, 2, \dots, L\}$ and $j \neq k$.
Definition 3 (Motion operator) Suppose that the position of the $i$th wolf is $X_i = (x_{i1}, x_{i2}, \dots, x_{iL})$. The motion operator $\Theta(X_i, r)$ indicates that $r$ crossover operators are randomly generated and that the corresponding encoding values in $X_i$ are exchanged in the order in which the operators were generated.
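Definitions 1 to 3 can be made concrete with a small Python sketch. Two assumptions are made here and are not taken from the source: a crossover operator is represented by a pair of distinct indices $(j, k)$ rather than by the values stored there, and the XOR-based distance is read as the number of positions at which two encodings differ. The function names are hypothetical.

```python
import random


def motion_operator(position, r, rng=random):
    """Theta(X_i, r): apply r randomly generated swaps (j, k), j != k, in sequence.

    Returns a new list; the input position is left unchanged.
    """
    new_position = list(position)
    for _ in range(r):
        j, k = rng.sample(range(len(new_position)), 2)  # two distinct indices
        new_position[j], new_position[k] = new_position[k], new_position[j]
    return new_position


def wolf_distance(p, q):
    """ASSUMED reading of the XOR distance: count of positions where p and q differ."""
    return sum(1 for a, b in zip(p, q) if a != b)


if __name__ == "__main__":
    rng = random.Random(0)
    wolf = [1, 2, 3, 4, 5]
    moved = motion_operator(wolf, 2, rng)
    print(moved, wolf_distance(wolf, moved))
```

Since every move is a swap, the result is always a permutation of the original encoding, so no UAV or target index is duplicated or lost.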
Furthermore, some definitions of satisficing decision are given as follows.
In satisficing decision, satisficing means choosing a decision-making strategy that is "good enough" rather than optimal [26]. First, a satisficing set is proposed according to the estimated benefits and costs. Then, the selectability function $W_s(u)$ and the rejectability function $W_r(u)$ are defined. The selectability function measures the degree to which the goal is achieved, and the rejectability function measures the cost of decision-making. In this way, satisficing decision theory is used to improve the searching speed of the WPS algorithm.
The satisficing set $\Sigma_\alpha$ is defined by [26]

$$\Sigma_\alpha = \{ u \in U : W_s(u) \ge \alpha W_r(u) \}$$

where $U$ is the decision universe of UAVs and $\alpha$ is the satisficing factor. However, for a satisficing unit $u$, there may exist other satisficing units $u'$ that are better than $u$, namely, $W_s(u') \ge W_s(u)$ or $W_r(u') \le W_r(u)$. Although such a solution may not be optimal, it can still meet the requirements.
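Membership in the satisficing set can be illustrated with a short Python sketch. It assumes the membership test $W_s(u) \ge \alpha W_r(u)$, which matches the satisficing condition $B - \alpha C \ge 0$ used in the simulation section. The candidate labels and their selectability and rejectability values are hypothetical.

```python
def satisficing_set(universe, selectability, rejectability, alpha):
    """Return all u in the universe with W_s(u) >= alpha * W_r(u) (ASSUMED test)."""
    return [u for u in universe if selectability(u) >= alpha * rejectability(u)]


if __name__ == "__main__":
    # Hypothetical candidate schemes with made-up benefit/cost scores.
    w_s = {"a": 0.9, "b": 0.5, "c": 0.2}
    w_r = {"a": 0.5, "b": 0.5, "c": 0.1}
    chosen = satisficing_set(["a", "b", "c"], w_s.get, w_r.get, alpha=1.11)
    print(chosen)
```

Raising $\alpha$ shrinks the set, since a candidate then needs a larger benefit relative to its cost to qualify.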
To ensure that the WPS algorithm can be adapted to the requirements of the multi-UAV attack-defense decision-making problem, the encoding method is first designed in this section. An encoding example is shown in Fig. 3 [27], where each arrow indicates that the UAV is assigned to attack the corresponding target.
In most cases of unmanned air combat, there is no one-to-one correspondence between targets and UAVs due to the differences in UAV performance.
If $M > N$, namely, the number of UAVs is greater than the number of targets, the target sequence needs to be extended. Assume that the encoding length is $L$; then we have $(k-1)N < M < kN = L$ $(k \in \mathbb{N}^*)$. The encoding diagram is shown in Fig. 4 [27]. In the other case, if $M < N$, namely, the number of UAVs is less than the number of targets, the UAV sequence needs to be extended. Assume that the encoding length is $L$; then we have $(k-1)M < N < kM = L$ $(k \in \mathbb{N}^*)$. The corresponding encoding diagram is shown in Fig. 5 [27].
As in the WPS algorithm, the wolf pack in the SDEWPS algorithm is divided into the lead wolf, searching wolves and fierce wolves [28]. The lead wolf is the leader of the wolf pack. It not only commands the wolves to catch the prey as soon as possible, but also ensures that the wolf pack catches the better prey. The searching wolves are the few elites of the wolf pack; they make autonomous decisions based on the concentration of the scent left by the prey in the search space and head for the highest concentration of scent in their vicinity. The rest of the wolves are fierce wolves. When the searching wolves discover high-quality prey, the lead wolf summons the fierce wolves to attack it so that the prey can be caught as quickly as possible. The distribution rule of the wolf pack gives priority to the wolf that first finds and catches the prey. This ensures that the wolf pack can effectively avoid local optima and quickly select suitable prey.
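The encoding-length rule stated above for the cases $M > N$ and $M < N$ can be sketched in a few lines of Python. The sketch assumes that the shorter sequence is repeated $k$ times, where $k$ is the smallest integer for which $k$ copies cover the longer sequence; it also accepts the boundary case in which the larger number is an exact multiple of the smaller one, which the strict inequalities in the text do not cover. The function name is hypothetical.

```python
def encoding_length(M: int, N: int) -> int:
    """Encoding length L = k * min(M, N), with k the smallest integer such that
    k * min(M, N) >= max(M, N) (ASSUMED handling of the boundary case)."""
    small, large = min(M, N), max(M, N)
    k = -(-large // small)  # integer ceiling of large / small
    return k * small


if __name__ == "__main__":
    # 8 UAVs against 12 targets, as in the simulation section: the UAV
    # sequence is extended, and (k - 1) * 8 < 12 < k * 8 holds for k = 2.
    print(encoding_length(8, 12))
```

For $M = N$ the rule reduces to a one-to-one encoding of length $M$.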
On the other hand, attack-defense decision-making in unmanned air combat is influenced by the uncertainty and incompleteness of the battlefield, so it is difficult to obtain the optimal solution. UAVs need to make trade-offs to match the overall requirements. Hence, it is suitable to use satisficing decision to improve the speed of the intelligent algorithm.
For the attack-defense decision-making problem of multi-UAV, satisficing decision is employed subject to the various constraints. According to the benefit and cost of each attack-defense decision-making scheme, the satisficing set is obtained by the SDEWPS algorithm. The rejectability function and selectability function of satisficing decision theory are used to enhance the search efficiency of the WPS algorithm. It is important to point out that the result may not be optimal, but it can meet the tactical requirements and complete the attack-defense mission.
Suppose that there are $M$ UAVs against $N$ targets. The benefit $b_{ij}$, namely the effect of UAV $i$ attacking target $j$, is defined in terms of $V_j$, the threat level of target $j$, and $p_{ij}$, the probability that UAV $i$ damages target $j$.
If $L_j$ ($L_j = 1, \dots, D_j$) UAVs attack target $j$ simultaneously, the overall damage probability $P_j$ under the coordinated attack of the $L_j$ UAVs is calculated following [29], and the estimated benefit of the multi-UAV cooperative attack on target $j$ is likewise defined following [29]. The overall attack benefit is denoted by $B_z$. In unmanned air combat, the UAV can also be attacked by the target. Thus, besides the attack benefit, the cost also needs to be considered. The cost of UAV $i$ is defined following [28] in terms of $V_i$, the importance degree of UAV $i$, and $p'_{ij}$, the probability that target $j$ damages UAV $i$.
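To give a feel for how a coordinated attack raises the overall damage probability, the sketch below uses the usual independence model, in which the target survives only if every attacking UAV fails. This model is an assumption made for illustration; the exact expression used in [29] is not reproduced here. The function name and the probabilities are hypothetical.

```python
def joint_damage_probability(p_list):
    """Overall damage probability of several UAVs attacking one target.

    ASSUMPTION: attacks are independent, so the target survives with
    probability prod(1 - p) and is damaged with probability 1 - prod(1 - p).
    """
    survive = 1.0
    for p in p_list:
        survive *= 1.0 - p
    return 1.0 - survive


if __name__ == "__main__":
    # Two UAVs, each with a hypothetical 0.5 damage probability.
    print(joint_damage_probability([0.5, 0.5]))
```

Under this model each additional attacker adds less than the previous one, which is one reason to limit the number of UAVs assigned to a single target.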
Meanwhile, the overall attack cost is denoted by $C_z$. In this paper, the rejectability function and selectability function are defined with the following quantities: $V_{j\max}$ is the maximal threat level of all targets and $V_{i\max}$ is the maximal importance degree of all UAVs, both of which are used for normalization, and $f$ is the penalty factor, which plays an important role in avoiding an over-concentrated allocation. The penalty factor is defined by (30), in which $\gamma \in (0, +\infty)$ is the regulatory factor, $m_j$ is the number of UAVs that attack target $j$, and $D_j$ is the maximum number of UAVs allowed to attack target $j$. If $m_j$ exceeds $D_j$, $f$ decreases quickly.
Hence, the new objective function of cooperative attack decision-making based on the SDEWPS algorithm is given by (31). Based on [26], the original wolf pack algorithm cannot be applied to the multi-UAV attack-defense decision-making problem directly. Therefore, a modified discrete WPS algorithm is proposed within the SDEWPS algorithm to solve this problem.
Suppose that the search space is $K \times L$. The position of the $i$th wolf, which represents a potential solution of the optimization problem, is given by

$$X_i = (x_{i1}, x_{i2}, \dots, x_{iL})$$

where $i = 1, 2, \dots, N$; $j = 1, 2, \dots, L$; $1 \le x_{ij} \le L$. The wolf pack predatory activity is abstracted as the lead wolf generation mechanism, the regeneration mechanism, the scouting behavior, the summoning behavior and the beleaguering behavior [30].
In the modified discrete WPS algorithm, the lead wolf generation mechanism and the regeneration mechanism are similar to those of the traditional WPS algorithm. In the initial wolf pack, the wolf with the optimal fitness value is the lead wolf. If, during iteration, a wolf attains a fitness value higher than that of the lead wolf of the previous generation, the lead wolf is updated to this wolf. To ensure higher quality and maintain the diversity of the wolf pack, the regeneration mechanism is based on "survival of the fittest" [31]. More specifically, in each iteration the wolves with lower fitness values are removed and the same number of new wolves are generated randomly. However, the new wolves may be generated in the already searched space, which wastes search resources. To avoid this problem, new wolves are preferentially generated in the unsearched space in this paper.
For the multi-UAV attack-defense decision-making problem, the UAVs are abstracted as the wolf pack, and the targets are the prey. The objective is to assign suitable prey to the wolves and maximize the objective function (31). The SDEWPS algorithm for multi-UAV cooperative attack decision-making consists of the following steps:
1) Data acquisition. Detect the target information and calculate the air combat superiority.
2) Encoding. Determine the encoding length $L$ of a wolf based on the number of UAVs $M$ and the number of targets $N$, and encode the wolf pack.
3) Initialization. Preset the wolf pack size $K$, the initial positions $X_i$, the step sizes $step_1$, $step_2$ and $step_3$, the maximum number of iterations $I_{max}$, the searching wolf scaling factor $\alpha$ and the maximum searching number $T_{max}$.
4) Elitism. Select the wolf with the maximum fitness function value as the lead wolf and the $n_{sw}$ next-best wolves as the searching wolves, where $n_{sw}$ is taken in $[N/(\alpha+1), N/\alpha]$.
5) Scouting. Searching wolf $i$ senses the concentration of prey odor at its current position; namely, the fitness function value $F_i$ of searching wolf $i$ is calculated. If $F_i > F_{lead}$ ($F_{lead}$ is the fitness function value of the lead wolf), let $F_{lead} = F_i$, and searching wolf $i$ becomes the lead wolf. If $F_i \le F_{lead}$, searching wolf $i$ searches in $h$ directions around its current position; namely, it performs the motion operator $\Theta(X_i, step_1)$ ($step_1$ is the searching step size of the searching wolves) $h$ times. Suppose that $F_{im}$ is the maximum fitness function value, obtained in the optimal search direction $p$ ($p \in \{1, 2, \dots, h\}$). If $F_{im} > F_i$, searching wolf $i$ takes a step forward in direction $p$, and $F_i = F_{im}$. The scouting behavior does not end until $F_i > F_{lead}$ or the maximum searching number is reached.
6) Summoning. The lead wolf $s$ summons the other wolves to approach its position. Specifically, a randomly chosen sub-sequence of length $step_2$ in the encoding of wolf $i$ is replaced by the sub-sequence at the same positions in the lead wolf. After that, the part of the encoding that has not been replaced is adjusted to avoid duplications. In the summoning process, if the fitness function value of wolf $i$ satisfies $F_i > F_{lead}$, let $F_{lead} = F_i$, and wolf $i$ becomes the lead wolf. Otherwise, wolf $i$ continues to move closer to the lead wolf until $d_w(s, i) < d_{w0}$, where $d_w(s, i)$ is the distance between wolf $i$ and lead wolf $s$, and $d_{w0}$ is the decision distance.
7) Beleaguering. The wolves are led by the lead wolf to besiege the prey as quickly as possible. Specifically, the beleaguering wolves perform randomly generated motion operators. After the beleaguering behavior, the wolves with higher fitness values are selected to participate in the next iteration.
8) Update. Update the position of the lead wolf, remove the wolves with lower fitness values and generate the same number of new wolves.
9) Terminal condition. The traditional WPS does not end until the optimal solution is obtained or the maximum number of iterations is reached. To meet the rapidity and timeliness demands of air combat, a new terminal condition is added in the SDEWPS algorithm: once a satisficing solution is obtained, the algorithm also ends. Then, output the current position of the lead wolf (the attack-defense scheme) and its fitness function value. Otherwise, go back to step 4).
From what has been discussed above, the flow diagram of the cooperative attack algorithm is shown in Fig. 6.
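The control flow of steps 4) to 9), and in particular the extra satisficing terminal condition, can be sketched as follows. This is a deliberately simplified illustration and not the authors' implementation: scouting, summoning and beleaguering are collapsed into a single greedy swap move, new wolves are drawn uniformly at random rather than preferentially from unsearched space, and the termination test is assumed to be $B - \alpha C \ge 0$ evaluated on the lead wolf. The fitness, benefit and cost functions are supplied by the caller, and all names and default values are hypothetical.

```python
import random


def sdewps_sketch(fitness, benefit, cost, length, alpha,
                  pack_size=20, max_iter=100, step=2, n_replace=3, seed=0):
    """Simplified wolf-pack loop with a satisficing stop (illustrative only).

    Returns (lead_wolf, iterations_used, satisficing_reached).
    """
    rng = random.Random(seed)

    def random_wolf():
        wolf = list(range(1, length + 1))
        rng.shuffle(wolf)
        return wolf

    def move(wolf):
        # Swap-based motion: `step` random exchanges of two encoding values.
        new = list(wolf)
        for _ in range(step):
            j, k = rng.sample(range(length), 2)
            new[j], new[k] = new[k], new[j]
        return new

    pack = [random_wolf() for _ in range(pack_size)]
    lead = max(pack, key=fitness)

    for iteration in range(1, max_iter + 1):
        # Stand-in for scouting/summoning/beleaguering: keep improving moves.
        for idx, wolf in enumerate(pack):
            candidate = move(wolf)
            if fitness(candidate) > fitness(wolf):
                pack[idx] = candidate

        # Lead wolf update.
        best = max(pack, key=fitness)
        if fitness(best) > fitness(lead):
            lead = best

        # Regeneration: replace the weakest wolves with random new ones.
        if n_replace > 0:
            pack.sort(key=fitness, reverse=True)
            pack[-n_replace:] = [random_wolf() for _ in range(n_replace)]

        # Satisficing terminal condition: stop as soon as B - alpha * C >= 0.
        if benefit(lead) - alpha * cost(lead) >= 0:
            return lead, iteration, True

    return lead, max_iter, False


if __name__ == "__main__":
    # Toy problem: fitness counts positions where the encoding equals its index.
    def fit(w):
        return sum(1 for i, v in enumerate(w) if v == i + 1)

    print(sdewps_sketch(fit, fit, lambda w: 1.0, length=6, alpha=3.0))
```

The point of the sketch is the last test in the loop: the search returns as soon as the lead wolf is good enough, instead of always running for `max_iter` iterations.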
From the above analysis, satisficing decision enables decision-making systems to obtain a satisficing result for the attack decision and to improve their decision speed. Sometimes, however, decision accuracy is more important in air combat. In such cases, the wolf pack algorithm is used to adjust the satisficing factor, improve the decision accuracy and finally achieve hybrid augmentation.

Simulation experiments
Assume that the air combat situation is as shown in Fig. 7. Simulation results are given to verify the correctness and reliability of the above method. All the experiments are simulated in MATLAB 2014a and performed on a desktop computer with an Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz and 4.00GB RAM. In Fig. 7, the blue triangles denote the 12 targets T 1 -T 12 , and the red asterisks denote the 8 UAVs U 1 -U 8 . The initial positions and speeds of the UAVs and targets are given in Table 2.
The intention of each target is shown in Table 3. According to equations (1)-(13), the air combat superiority can be calculated as shown in Table 4.
The air combat superiority curve of the SDEWPS algorithm is shown in Fig. 8. The benefit-cost curve is shown in Fig. 9. To ensure a balance of benefit and cost, the satisficing factor is chosen based on expertise as α = 1.11. The resulting satisficing decision curve is shown in Fig. 10. From the above simulation curves, the air combat superiority tends to increase with the number of iterations. Moreover, the benefit and cost change with the wolf pack behaviors. Under the enhancement of satisficing decision, it is not necessary to run the maximum number of iterations of the traditional WPS algorithm: the search ends with a satisficing solution once the satisficing condition (B − αC ≥ 0) is reached.
The cooperative attack results are shown in Table 5, which lists, for each UAV, the attacked targets and the allocated quantity. As seen from the attack-defense results, targets T 11 and T 9 are allocated to U 1 , targets T 2 and T 1 are allocated to U 2 , targets T 5 and T 8 are allocated to U 4 , targets T 3 and T 6 are allocated to U 5 , targets T 12 and T 10 are allocated to U 6 , and targets T 7 and T 4 are allocated to U 8 . It should be noted that no target is allocated to UAVs U 3 and U 7 . This is mainly because both of them are located in disadvantageous positions, so it is better for them to take defensive measures (the same conclusion can be drawn from Fig. 7).
In order to evaluate the performance of the SDEWPS algorithm, the proposed method is compared with the traditional WPS algorithm and satisficing decision (SD). Under the same test environment, over 50 simulation runs, the comparison results are shown in Table 6.
The comparison results illustrate that the SDEWPS algorithm can solve the cooperative attack decision-making problem of multi-UAV well. The optimal value of SDEWPS is 3.5165, which is only slightly smaller than that of the WPS algorithm and significantly better than that of satisficing decision. In addition, the run time of SDEWPS is 0.2841 s, which is superior to those of both the traditional WPS algorithm and satisficing decision. Therefore, the accuracy of the SDEWPS algorithm is similar to that of the traditional WPS algorithm, and the SDEWPS algorithm shares the advantage of satisficing decision in operating speed. In conclusion, the SDEWPS algorithm combines the advantages of both the traditional WPS algorithm and satisficing decision. It is well suited to the attack-defense decision-making problem in modern unmanned air combat systems.

Conclusion
In this paper, a method based on the SDEWPS algorithm has been studied for attack-defense decision-making of multi-UAV. To improve the speed of WPS, the selectability function and rejectability function have been designed to solve the satisficing problem. Satisficing decision theory is employed to enhance the WPS algorithm so that it obtains a satisficing solution rather than the optimal solution. The simulation results have shown that the method can solve the problem of cooperative attack-defense decision-making effectively.