Equilibrium Optimizer with Generalized Opposition-based Learning for Multiple Unmanned Aerial Vehicles Path Planning


Path planning for multiple Unmanned Aerial Vehicles (UAVs) is the benchmark problem of multi-UAV applications and belongs to the class of non-deterministic polynomial problems. Its objective is for multiple UAVs to fly safely from their specific start positions to their goal positions in three-dimensional space. This issue can be defined as a high-dimensional optimization problem, whose solution requires optimization techniques with global optimization capabilities. Equilibrium optimizer (EO) is a population-based meta-heuristic algorithm. To improve the ability of EO to solve high-dimensional problems, this paper proposes a modified equilibrium optimizer with generalized opposition-based learning (MGOEO), which improves population activity by adding mutation and crossover within the population. In addition, generalized opposition-based learning is used to construct the population, which helps the algorithm escape from local optima. First, numerical experiments show that MGOEO has better optimization precision than EO and several other swarm intelligence algorithms. Then, the paths of UAVs are simulated in three different obstacle environments. The simulation results show that MGOEO can obtain safe and smooth paths, which are better than those of EO and eight other state-of-the-art optimization algorithms.


Introduction
The Unmanned Aerial Vehicle (UAV), as a kind of modern equipment with strong mobility and excellent autonomous control capability, has received extensive attention from scholars in various research fields. Its research and application have involved post-disaster medical assistance, intelligent transportation, agricultural science, location monitoring and other modern industries (Yao et al., 2019;Liu et al., 2020;Radoglou-Grammatikis et al., 2020;Sarıçiçek & Akkuş, 2015). In the research of UAV applications, flight path planning is the benchmark problem for all applications and is the precondition for ensuring that a UAV can provide better service. The issue of UAV flight path planning can be described as requiring the UAV to find, among all paths from the starting point to the target position, an optimal path with minimal cost while avoiding all obstacles and maintaining steady flight. There have been many achievements in the research of a single UAV, from the two-dimensional environment to the three-dimensional environment (Lin et al., 2019;Qu et al., 2020;Yao & Wang, 2017). These studies focus on adding complex constraints to the flight environment to explore the safe path, such as increasing the number and coverage range of radar and artillery in a simulated battlefield environment. However, as research evolves and the needs of industry become more complex, the collaborative application of multi-UAVs will become a focus of future research.
The problem of multi-UAVs path planning is that, on the basis of a single UAV, multiple vehicles are required to complete their respective missions along different paths. The issue can be regarded as a high-dimensional optimization problem. The traditional algorithms A* and D* are often used for the path planning of ground robots, and their improved versions are also used for UAV path planning (Guruji et al., 2016;Cai et al., 2019;Guo et al., 2009). However, the solutions obtained are often not very satisfactory. Swarm intelligence optimization is often used to solve high-dimensional optimization problems, and it often performs well because it can, to some extent, escape from local optima (Chen & Pi, 2020). Therefore, researchers apply such algorithms to the UAV path planning problem, often resulting in a satisfactory path. Wang et al. used glowworm swarm optimization (GSO) to simulate the flight trajectories of a single UAV in two- and three-dimensional environments, respectively (Wang et al., 2015). Wang et al. concluded that GSO has a better ability to acquire flight trajectories compared to Dijkstra (Zhang et al., 2017), particle swarm optimization (PSO) (Liu et al., 2018), biogeography-based optimization (BBO) (Simon, 2008), and an improved version of the bat algorithm (IBA) (Wang et al., 2016). It can therefore be seen that different optimization algorithms have different search capabilities. The path planning of multiple UAVs is an NP-hard problem, which requires an algorithm with strong global optimization ability to plan paths that save total cost. The idea of a swarm intelligence algorithm, also called a nature-inspired heuristic algorithm, is to simulate the foraging behavior of social creatures in nature and to search the space of the objective optimization problem with the population as the unit.
What limits the search capability of a swarm intelligence algorithm is the problem of premature convergence in its late search. Different algorithms have been proposed to curb early convergence from different perspectives and so enhance the global search capability. These algorithms include the multi-verse optimizer (MVO), inspired by multiverse theory (Abd Elaziz et al., 2019), the whale optimization algorithm (WOA) (Chen et al., 2019a) and the salp swarm algorithm (SSA), which simulate the behavior of marine organisms, the sine cosine algorithm (SCA) (Gupta & Deep, 2019), based on mathematical functions, and the flower pollination algorithm (FPA) (Salgotra & Singh, 2017), etc. These algorithms have achieved good results in wind prediction, parameter identification and large-scale production scheduling (Chen & Pi, 2019;Chen et al., 2019b;Shao et al., 2020, 2019). At present, researchers apply the grey wolf optimizer (GWO) (Dewangan et al., 2019), MVO (Jain et al., 2019) and SSA (Saxena et al., 2019) to optimize the paths of multi-UAVs, where SSA is the most efficient. Wolpert et al. pointed out that no single optimization algorithm is suitable for solving all optimization problems, meaning that each algorithm performs somewhat differently on a particular optimization problem (Wolpert & Macready, 1997). This paper extends the study of a recently proposed population-based algorithm called Equilibrium optimizer (EO), which is based on the mass balance equation of a control volume (Faramarzi et al., 2020). EO has been successfully used for image segmentation (Abdel-Basset et al., 2020).
To improve the optimization performance of EO, we propose a new equilibrium optimizer with generalized opposition-based learning, which is used for function optimization and multi-UAVs path planning. Generalized opposition-based learning is an improved version of the traditional opposition-based learning. The traditional version has been successfully applied in the field of optimization algorithms. Many improvements have emerged in subsequent refinements, which are aimed at improving the learning rate of the optimization algorithm (Tizhoosh & Ventresca, 2008;Mahdavi et al., 2018;Wang et al., 2011;Draa, 2015;Abd Elaziz et al., 2017;Sapre & Mini, 2019). Compared with the traditional version, generalized opposition-based learning differs in that it uses random numbers to control the changes of the opposition learning points. Therefore, it has a better stochastic learning rate. In addition to embedding generalized opposition-based learning into EO, we add crossover and mutation under random probability to the EO framework, which can increase the diversity of the population and give the algorithm a better ability to escape from local optimal solutions on high-dimensional problems. The specific MGOEO design is described in Section 4 of this paper. First, the performance of the proposed MGOEO is verified using different types of continuous optimization test functions. Then, MGOEO is applied to multi-UAVs path planning in three different scenarios. The results show that MGOEO is effective and efficient.
The remainder of this article is organized as follows. Section 2 describes the problem of multi-UAVs path planning and gives the relevant model definition; Section 3 describes the framework of EO; Section 4 describes the design ideas of the proposed MGOEO and the implementation process; Section 5 carries out numerical experiments and discusses the results; Section 6 uses the proposed algorithm for multi-UAVs path simulation; Section 7 summarizes the work of this article and gives the future research direction.

Problem description
The problem of multi-UAVs flight path planning is to plan the tracks of at least three UAVs simultaneously in the search space. Its purpose is to require each UAV to take off from its set take-off point and reach the target position safely. As shown in Fig.1, three UAVs A, B and C each set their own target positions. During navigation, they need to avoid the obstacles in the diagram and maintain a smooth flight to the goal point. For UAV $i$, we [...]

$$ c_k\, l_k, \quad k = 1, 2, \cdots, n \qquad \text{s.t.}\ \ \chi_{i,j}(l_i, l_j) = 0, \quad \forall i, j = 1, 2, \cdots, n \tag{1} $$

In the process of flight, the flight cost of the UAVs needs to be calculated. It includes the fuel consumption, the cost of turning to avoid obstacles, and the cost of deviating from the final goal point. For a particular UAV in three-dimensional space, the set of flight positions from takeoff point $s$ to target point $t$ is recorded as $l_{x,y,z}$, which is shown in Eq.2.
$$ l_{x,y,z} = \{s, p_1, p_2, \cdots, p_n, t\} \tag{2} $$

Here, $p_1$ to $p_n$ are the waypoints selected during the flight. Assuming that the UAV flies at a constant speed, its fuel consumption is directly proportional to the flight distance, so the fuel consumption loss can be expressed as $c_{fuel}$, and its calculation method is shown in Eq.3.
More sharp turns during a UAV flight lead to more fuel consumption and more danger for the UAV. Therefore, sharp turns should be avoided as much as possible in the flight path, and the cost function of sharp turns represents the minimum number of turns when avoiding an obstacle. In path planning, three consecutive points $p_1, p_2, p_3$ are taken; if the cross product of the vectors $\overrightarrow{p_1 p_2}$ and $\overrightarrow{p_2 p_3}$ is zero, there is no sharp turn. All other cases are considered to incur turning consumption.
The number of sharp turns is recorded as: $c_{turn}$ = count of turning points in the path.
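The two path costs described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the fuel cost is taken as the path length with a proportionality constant of 1 (an assumption, since only proportionality to distance is stated), and a small tolerance `tol` is assumed for the zero cross-product test.

```python
import math


def path_length(path):
    """Sum of Euclidean distances between consecutive 3-D points.

    Used here as a stand-in for c_fuel (assumed proportionality constant 1).
    """
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))


def cross(u, v):
    """Cross product of two 3-D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def count_turns(path, tol=1e-9):
    """Count interior points where the cross product of the incoming and
    outgoing segment vectors is non-zero (c_turn in the text)."""
    turns = 0
    for p1, p2, p3 in zip(path, path[1:], path[2:]):
        u = tuple(b - a for a, b in zip(p1, p2))
        v = tuple(b - a for a, b in zip(p2, p3))
        if any(abs(c) > tol for c in cross(u, v)):
            turns += 1
    return turns


# Hypothetical path: s, two waypoints, t
example_path = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)]
print(path_length(example_path), count_turns(example_path))
```

For the hypothetical path above, the length is 3.0 and one turn is counted. Note that a zero cross product also occurs for an exact reversal of direction, which this criterion does not flag as a turn.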
Another consideration is whether all the UAVs eventually reach their target locations. In practical applications, there is an approximate range of objective precision for UAVs, which is calculated by Eq.4. The symbols $x_{end}, y_{end}, z_{end}$ represent the actual final location of a UAV in the mission, and $x_t, y_t, z_t$ indicate the specified goal location. The cost $c_{end}$ defines the cost of the objective by distance, so it is 0 when the final location completely overlaps the goal.
Therefore, the overall cost function is shown in Eq.5, where $\alpha_1, \alpha_2, \alpha_3$ represent the weights of the fuel consumption cost, the sharp turn cost and the terminal uncertainty cost, respectively. Different values of these parameters indicate different cost priorities.
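A small sketch of how the three costs might be combined is given below. It rests on two assumptions that are not confirmed by the text: the terminal cost is taken as the Euclidean distance between the final and goal locations, and the overall cost is taken as a weighted sum with weights $\alpha_1, \alpha_2, \alpha_3$. The function names and default weights are hypothetical.

```python
import math


def terminal_cost(end, target):
    """Assumed form of c_end: Euclidean distance between the actual final
    location (x_end, y_end, z_end) and the goal location (x_t, y_t, z_t)."""
    return math.dist(end, target)


def total_cost(c_fuel, c_turn, c_end, alphas=(1.0, 1.0, 1.0)):
    """Assumed form of the overall cost: a weighted sum of the three costs.

    The default weights are placeholders, not values from the paper.
    """
    a1, a2, a3 = alphas
    return a1 * c_fuel + a2 * c_turn + a3 * c_end


# The terminal cost vanishes when the UAV stops exactly on the goal.
c_end = terminal_cost((5.0, 5.0, 2.0), (5.0, 5.0, 2.0))
print(total_cost(c_fuel=3.0, c_turn=1, c_end=c_end, alphas=(1.0, 2.0, 5.0)))
```

Raising one weight relative to the others shifts the priority of the search, for example a large $\alpha_3$ penalizes paths that end far from the goal.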

The idea of the equilibrium optimizer is derived from the mass balance equation in a control volume, which can be expressed by a first-order ordinary differential equation as Eq.6.
$C$ is the concentration in the control volume $V$. In engineering thermodynamics, an open system is also called a control volume. $V \frac{dC}{dt}$ represents the rate of mass change in the control volume. $Q$ indicates the volumetric flow rate into and out of the control volume. $C_{eq}$ is the concentration in the equilibrium pool. $G$ is the mass generation rate in the control volume.
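For reference, the mass balance equation in the original EO formulation (Faramarzi et al., 2020) has the form below. It uses the same symbols as defined above and is assumed, not confirmed, to be the form that Eq.6 refers to.

```latex
V \frac{dC}{dt} = Q\, C_{eq} - Q\, C + G
```

Read term by term, the rate of mass change in the control volume equals the mass entering, minus the mass leaving, plus the mass generated inside.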
The model of the equilibrium optimizer still evolves in the form of a population. An equilibrium pool of five individual components is constructed in the algorithm to provide reference individuals for learning. $C_1$ to $C_4$ represent the four best solution vectors obtained by the population on the objective problem, and $C_{ave}$ is the arithmetic mean vector of these four vectors. The equilibrium pool is expressed as Eq.7.
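The construction of the equilibrium pool can be sketched as follows. This is an illustrative sketch that assumes a minimization problem and a population of at least four individuals; the function name and the list-based representation are hypothetical.

```python
def build_equilibrium_pool(population, fitness):
    """Return [C1, C2, C3, C4, C_ave] for a minimization problem.

    population: list of solution vectors (lists of floats)
    fitness:    list of objective values, one per individual
    """
    # Indices of individuals sorted from best (smallest) to worst fitness
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    best = [list(population[i]) for i in order[:4]]
    dim = len(best[0])
    # Arithmetic mean of the four best vectors, computed per dimension
    c_ave = [sum(c[d] for c in best) / len(best) for d in range(dim)]
    return best + [c_ave]


# Hypothetical two-dimensional population with known fitness values
pop = [[9.0, 9.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
fit = [9.0, 0.0, 1.0, 2.0, 3.0]
pool = build_equilibrium_pool(pop, fit)
print(pool[-1])  # mean of the four best vectors
```

In this example the worst individual is excluded and the mean vector is [1.5, 1.5].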
The core update equation of the algorithm is given in Eq.8, where $C$ indicates the position of the individual and $C_{eq}$ is a learning object randomly selected from the equilibrium pool. $F$ is a parameter (turnover rate) that changes over time and is calculated as shown in Eq.9. $\lambda$ and $r$ are random numbers between 0 and 1, and $a_1$ is defined as the constant 2.
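A sketch of one position update is given below. Because the equations themselves are not reproduced here, the formulas follow the original EO formulation (Faramarzi et al., 2020) and are assumed to match Eq.8 to Eq.11; the values `a2=1`, `gp=0.5` and `v=1`, the function names, and the guard against a zero $\lambda$ are assumptions of this sketch. It also includes the generation rate term $G$ and its control parameter GCP.

```python
import math
import random


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def eo_update(c, c_eq, it, max_it, rng=None,
              a1=2.0, a2=1.0, gp=0.5, v=1.0):
    """One EO update of position c toward the pool member c_eq.

    Assumed formulas (original EO formulation):
      t   = (1 - it/max_it) ** (a2 * it/max_it)
      F   = a1 * sign(r - 0.5) * (exp(-lambda * t) - 1)
      GCP = 0.5 * r1 if r2 >= gp else 0
      G   = GCP * (c_eq - lambda * c) * F
      new = c_eq + (c - c_eq) * F + G / (lambda * v) * (1 - F)
    """
    rng = rng or random.Random()
    t = (1.0 - it / max_it) ** (a2 * it / max_it)
    r1, r2 = rng.random(), rng.random()
    gcp = 0.5 * r1 if r2 >= gp else 0.0
    new = []
    for ci, ei in zip(c, c_eq):
        lam = max(rng.random(), 1e-12)  # guard against division by zero
        r = rng.random()
        f = a1 * sign(r - 0.5) * (math.exp(-lam * t) - 1.0)
        g = gcp * (ei - lam * ci) * f
        new.append(ei + (ci - ei) * f + g / (lam * v) * (1.0 - f))
    return new


rng = random.Random(1)
print(eo_update([1.0, -2.0, 0.5], [0.0, 0.0, 0.0], it=10, max_it=100, rng=rng))
```

Under these assumed formulas, $t$ reaches 0 at the last iteration, so $F$ vanishes and the updated position coincides with the selected pool member.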
$G$ in Eq.8 is called the generation rate and is the term that the algorithm uses to strengthen the exploitation of the exact solution. GCP, shown in Eq.10 and Eq.11, is the parameter controlling the generation rate, where $r_1$ and $r_2$ are random numbers between 0 and 1.
These two problems can directly lead to the algorithm's lack of precision in solving high-dimensional optimization problems. In order to improve the algorithm's search capability and curb premature convergence to local optima, we propose an improved version of the algorithm, MGOEO.
Firstly, the idea of cross-mutation is embedded in EO. Cross-mutation is the main means of generating new solutions in evolutionary optimization, and using it can improve the internal diversity of the population. Thus Eq.12 is used to generate the mutation solution, which is crossed with the solution generated by Eq.8 to form new populations for searching.
In addition, opposition-based learning is applied to extend the coverage space of the population, which can enhance the activity of the population. Opposition-based learning [...] This was the original version of opposition-based learning. In follow-up studies, researchers proposed the super-opposite and quasi-opposite variants. From a probabilistic point of view, they have a higher probability of finding a better solution than the original version (Tizhoosh & Ventresca, 2008).
Generalized opposition-based learning is a more random approach to learning proposed on this basis. As shown in Eq.14, the generalized opposite point of [...]
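The difference between the two kinds of opposite points can be sketched as follows. The forms used are the commonly cited ones and are assumptions here: the opposite point is $a + b - x$ and the generalized opposite point is $k(a + b) - x$ with a random $k \in [0, 1]$, following Wang et al. (2011). Resetting an out-of-range component to a random value inside the bounds is a further assumption, and the function names are hypothetical.

```python
import random


def opposite(x, lower, upper):
    """Classical opposite point: a + b - x in every dimension."""
    return [a + b - xi for xi, a, b in zip(x, lower, upper)]


def generalized_opposite(x, lower, upper, rng=None):
    """Generalized opposite point: k * (a + b) - x with one random k in [0, 1].

    Components that leave [a, b] are re-sampled uniformly inside the bounds
    (an assumed repair rule).
    """
    rng = rng or random.Random()
    k = rng.random()
    out = []
    for xi, a, b in zip(x, lower, upper):
        xo = k * (a + b) - xi
        if not a <= xo <= b:
            xo = rng.uniform(a, b)
        out.append(xo)
    return out


x = [1.0, 2.0]
lo, hi = [0.0, 0.0], [10.0, 10.0]
print(opposite(x, lo, hi))
print(generalized_opposite(x, lo, hi, random.Random(0)))
```

The classical opposite of [1.0, 2.0] in these bounds is [9.0, 8.0], a single fixed point, whereas the generalized version produces a different point for each draw of $k$, which is the extra randomness the text refers to.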

Discussion of results
To test the performance of MGOEO, we use it to solve for the optimal values of the functions F1-F23. Table 1 gives the mean value and standard deviation of the five algorithms in 30 independent runs. Fig. 2 [...]

if t1 has not changed for ten consecutive iterations then
    if rand < P then
        Generate the new population OP by using generalized opposition-based learning;
        Store the NP elements of the set NP OP into NC
        Compare C with NC and save the smaller value in C

[...] presented in Table 2. The result of the Friedman rank-sum test shows that MGOEO is the best on the whole set of functions, on the unimodal functions and on the high-dimensional multimodal functions. When dealing with fixed-dimension problems, its performance is slightly inferior to FPA, because FPA can find a better solution on the F20-F22 functions.
It is concluded that the optimization results of MGOEO are the best in all three map environments, and MGOEO can be considered an algorithm that effectively and efficiently solves the problem of multi-UAVs flight path planning.
The running time of an algorithm is another important measure of performance. [...] its original version EO, and these time expenditures come from the opposition population construction. We analyze this from the perspective of asymptotic computational complexity. Let $F$ denote the evaluation of the objective function. The computational complexity of EO can be expressed as $O(Maxgen \cdot NP \cdot F)$.
The computational complexity of MGOEO for the opposition population under probability $P$ is $O(P \cdot NP \cdot F)$. The computational complexity of MGOEO over $Maxgen$ iterations is $O(Maxgen \cdot (NP \cdot F \cdot (1 + P)))$.
The frequency of $(Maxgen \cdot NP \cdot F \cdot P)$ is smaller than that of $(Maxgen \cdot NP \cdot F)$, so the asymptotic time complexity of MGOEO is $O(Maxgen \cdot NP \cdot F)$, which is the same as that of EO. Therefore, the execution time of MGOEO does not grow faster than that of EO as the problem size increases. Combined with the running time statistics of these algorithms in Table 5, the running time of MGOEO is acceptable.
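The step from the per-iteration cost to the asymptotic bound can be written out explicitly. The only assumption is that $P$ is a probability, so $0 \le P \le 1$.

```latex
Maxgen \cdot \left( NP \cdot F + P \cdot NP \cdot F \right)
  = Maxgen \cdot NP \cdot F \cdot (1 + P)
  \le 2 \cdot Maxgen \cdot NP \cdot F
  \quad \Longrightarrow \quad
  O\!\left( Maxgen \cdot NP \cdot F \right)
```

The opposition population therefore adds at most a constant factor of 2 to the number of objective function evaluations, which does not change the asymptotic order.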

Conclusion and Prospect
In this paper, an equilibrium optimizer with generalized opposition-based learning is proposed for solving the problem of multi-UAVs path planning. The first step is to improve the solving precision of the algorithm on high-dimensional problems. We use cross-mutation operations and generalized opposition-based learning to improve the implementation framework of the algorithm, which can increase population diversity and thus extend the global search capability of [...]

Ethical standard
This article does not contain any studies with human participants performed by any of the authors.