An evolutionary trajectory planning algorithm for multi-UAV-assisted MEC system

This paper presents a multi-unmanned aerial vehicle (UAV)-assisted mobile edge computing system, where multiple UAVs are used to serve mobile users. We aim to minimize the overall energy consumption of the system by planning the trajectories of UAVs. To plan the trajectories, we need to consider the deployment of hovering points (HPs), their association with UAVs, and their visiting order for each UAV. The resulting problem is non-convex, nonlinear, NP-hard, and mixed-integer. To solve it, this paper proposes an evolutionary trajectory planning algorithm (ETPA), which comprises four phases. In the first phase, a variable-length genetic algorithm (GA) is adopted to update the deployment of HPs. In the second phase, redundant HPs are removed by the remove operator. Subsequently, a differential evolution clustering algorithm is adopted to group HPs into clusters without knowing the number of clusters (i.e., UAVs) in advance. Finally, a GA is proposed to construct the order of HPs for each UAV. The experimental results on a set of eight instances show that the proposed ETPA outperforms the compared algorithms in terms of the energy consumption of the system.


Introduction
With the development of mobile communication systems, a huge number of applications are emerging, such as virtual reality and online gaming. Such applications are usually latency-sensitive and require huge computational resources. However, due to the limited capabilities of mobile users' (MUs') devices, it is very difficult to execute these tasks locally.
Mobile edge computing (MEC) is a promising technology to address this issue. It provides low-latency, highly reliable services at or near MUs by executing their tasks at a nearby edge cloud and sending back the results (Asim et al. 2020). Owing to the shorter physical distance between the MEC server/edge cloud and MUs, it consumes less energy than mobile cloud computing. However, it still falls short of the requirements of MUs, as the location of the edge cloud is usually fixed and cannot be adjusted flexibly. Therefore, it cannot provide timely services during a natural disaster, when the terrestrial communication link may be broken or lost.
To satisfy this ever-increasing demand, the unmanned aerial vehicle (UAV) is regarded as one of the most promising technologies. Compared to traditional communication systems that rely on fixed terrestrial base stations, UAV-aided communication systems are more cost-effective and likely to achieve a better quality of service due to their flexible deployment, fully controllable mobility, and low cost. In fact, with the assistance of UAVs, the system performance (e.g., data rate and latency) can be significantly enhanced by establishing line-of-sight communication links between UAVs and MUs. In addition, by dynamically adjusting their flying and hovering locations, UAVs can further improve communication performance.
Recently, owing to these advantages, UAVs have been extensively used in various fields, such as wireless communication (Zaini and Xie 2019; Mozaffari et al. 2019), military (Low et al. 2017; Zeng et al. 2016), surveillance and monitoring (Olsson et al. 2010; Yuan et al. 2016), delivery of medical supplies (Gupta et al. 2020), and rescue operations (Gomez et al. 2015; Merwaday and Guvenc 2015). Very recently, UAVs have been used to enhance the capabilities of MEC systems. For example, a multi-UAV-enabled MEC system has been studied in which several UAVs are deployed as flying edge clouds for large-scale MUs. Zhang et al. (2020) proposed a UAV-assisted MEC system for efficient multitask scheduling to minimize completion time. Garg et al. (2018) studied the application of a UAV-empowered MEC system in cyber-threat detection of smart vehicles.
Moreover, to fully exploit the potential of UAV-assisted MEC systems, some researchers have studied path planning and trajectory design of UAVs. For instance, a multi-agent deep reinforcement learning-based trajectory planning algorithm has been proposed for a UAV-aided MEC framework, where several UAVs with different trajectories fly over the target area and support the ground MUs. Wu and Zhang (2018) studied a practical scenario of UAVs in an orthogonal frequency-division multiple access (OFDMA) system. They proposed an iterative block coordinate descent approach for optimizing the UAV's trajectory and OFDMA resource allocation to maximize the minimum average throughput of MUs. Diao et al. (2019) jointly optimized trajectory and data allocation to minimize the maximum energy consumption. Jeong et al. (2018) studied bit allocation and trajectory planning under latency and energy budget constraints. Hu et al. (2019) developed a UAV-assisted relaying and MEC system, where the UAV can act as the MEC server or the relay. They proposed a joint task scheduling and trajectory optimization algorithm to minimize the weighted sum energy consumption of UAVs and MUs subject to task constraints. Yang et al. (2019) presented the sum power minimization problem for a UAV-enabled MEC network. A multi-UAV-assisted MEC system has also been studied in which the UAVs act as edge servers to provide computing services for Internet of Things devices. Zeng et al. (2019) proposed an efficient algorithm to optimize the trajectory of a UAV, including the hovering locations and durations. They formulated the problem as a traveling salesman problem to minimize the energy consumption of the UAV. Asimand et al. (2021) studied a multi-UAV-assisted MEC system. They proposed a novel genetic trajectory planning algorithm with variable population size to minimize the energy consumption of the multi-UAV-assisted MEC system. Xu et al. (2021) investigated the computing delay issue in multi-UAV-assisted MEC systems, aiming to minimize the task completion time. Specifically, they considered both the partial offloading and binary offloading modes by jointly optimizing the time slot size, terminal device scheduling, computation resource allocation, and UAVs' trajectories. Tun et al. (2021) proposed a UAV-aided MEC system. They jointly minimized the energy consumption at the Internet of Things devices and the UAVs during task execution by optimizing the task offloading decision, resource allocation mechanism, and UAV's trajectory. Ji et al. (2020) investigated joint resource allocation and trajectory design for UAV-assisted MEC systems, optimizing both in order to minimize the weighted sum energy consumption of the UAV and user devices.
From the above introduction, it is clear that a variable number of UAVs has rarely been considered in current studies, although deploying an appropriate number of UAVs can improve the system's performance. The main contributions of this paper are summarized as follows:
• A new multi-UAV-assisted MEC system is proposed and formulated to minimize the energy consumption of the system by considering the deployment (including the number and locations) of hovering points (HPs), the number of UAVs, their association with HPs, and the order of HPs.
• The deployment of HPs is addressed by proposing a genetic algorithm (GA) with variable-length individuals. Specifically, evolutionary operators such as crossover and mutation are modified to handle variable-length individuals.
• An evolutionary trajectory planning algorithm (ETPA) is proposed that consists of four phases. First, a variable-length GA (VLGA) (Ting et al. 2009) is adopted to optimize the deployment of HPs. Subsequently, redundant HPs that have no MUs to serve are removed by the remove operator. After that, UAVs are associated with HPs via a differential evolution clustering (DEC) algorithm (Mostapha 2015). Finally, a GA is adopted to construct the order of HPs for UAVs.
The remainder of this paper is organized as follows. In Sect. 2, we introduce the system model, including the problem formulation of the proposed system. Section 3 presents the details of our proposed algorithm ETPA. In Sect. 4, the experimental studies are discussed. Finally, Sect. 5 concludes this paper.

System model
The UAVs fly over all the MUs to collect their data. We assume that each UAV hovers at some points for some time, during which the MUs can send their sensing data to it. We assume that the $j$-th UAV hovers over the HPs $t \in \mathcal{T}_j = \{1, 2, \ldots, T_j\}$. Therefore, one has
$$a_{ij}[t] \in \{0, 1\}, \quad \forall i, j, t,$$
where $a_{ij}[t] = 1$ denotes that the $i$-th MU decides to send its sensing data to the $j$-th UAV at the $t$-th HP, while $a_{ij}[t] = 0$ indicates otherwise. Then, one has
$$\sum_{j} a_{ij}[t] = 1, \quad \forall i, t,$$
which denotes that each MU should choose one UAV at each HP to send its sensing data. We assume that the MU always sends data to the closest UAV at each HP $t$. Then, one has
$$a_{ij}[t] = \begin{cases} 1, & \text{if the } j\text{-th UAV is the closest one to the } i\text{-th MU at HP } t, \\ 0, & \text{otherwise.} \end{cases} \tag{3}$$
Assume that at each HP $t$, the $j$-th UAV can accept at most $U_j$ MUs. Therefore, one has
$$\sum_{i} a_{ij}[t] \le U_j, \quad \forall j, t.$$
We assume that the $i$-th MU collects an amount of data $D_i$ that it intends to send to the UAV. The $j$-th UAV stops at $T_j$ points in the air, and each stop lasts for $T_{\max}$ seconds, where $T_{\max}$ is a fixed value.
Then, the time to send the data from the $i$-th MU to the $j$-th UAV at the $t$-th HP is $D_i / r_{ij}[t]$, where $r_{ij}[t]$ is the data rate given by (14). Also, define $F_i$ as the number of CPU cycles needed to process this task. Then, the processing time of the data at the UAV is $F_i / f_{ij}[t]$, where $f_{ij}[t]$ is the computation capacity that the UAV assigns to each data processing procedure, with $f_{ij}[t] \le f_{\max}$, where $f_{\max}$ is the maximal computing power the UAV can provide to each MU. Also, we have … Then, one can have …
Assume that the coordinate of the $i$-th MU is $(x_i, y_i)$ and the coordinate of the $j$-th UAV at the $t$-th HP is $(X_j[t], Y_j[t], H)$. Also, assume that the UAV's trajectory can be characterized by a sequence of locations $q_j[t] = (X_j[t], Y_j[t])$, $t \in \mathcal{T}_j$. In addition, all UAVs start from the same initial position $q[0]$ and finally come back to $q[0]$ after visiting all the HPs. Also, we have
$$\| q_j[t+1] - q_j[t] \| \le S_{\max},$$
where $S_{\max} = V_{\max} \cdot T_{\max}$ is the maximum horizontal distance that the UAV can travel and $V_{\max}$ is the maximum speed. Then, the horizontal distance between the $i$-th MU and the $j$-th UAV at the $t$-th HP is $\sqrt{(X_j[t] - x_i)^2 + (Y_j[t] - y_i)^2}$, and the distance between them is $\sqrt{H^2 + (X_j[t] - x_i)^2 + (Y_j[t] - y_i)^2}$. Then, the channel power gain can be given as … where $\beta_0$ denotes the channel power gain at the reference distance of 1 m.
If MUs decide to offload to the UAVs, the data rate can be given as … where $\sigma^2$ is the noise power and $p_i^{ue}$ is the transmission power, which is constrained by … The energy consumption of the $i$-th MU for sending data to the $j$-th UAV at the $t$-th HP is given by the transmission power multiplied by the transmission time, i.e., $p_i^{ue} D_i / r_{ij}[t]$. The whole energy consumption of all MUs is expressed as … Assume that the flying energy of the UAV is proportional to the flying distance/flying time; then, the flying energy can be calculated as … Also, for the hovering energy, one can have … where $P_H$ denotes the hovering power of the UAV.
The whole energy consumption of all UAVs is expressed as … where $C$ is the fixed cost, including take-off, landing, and maintenance costs, for adding UAVs. Then, we can have the optimization problem as follows.
P: min … subject to: … where the objective function is the sum of the hovering energy and flying energy of UAVs, and C8 and C9 present the lower and upper bounds of the X-axis and Y-axis, respectively.
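The per-MU quantities of the system model can be illustrated with a minimal Python sketch. It assumes a free-space line-of-sight channel (gain equal to the reference gain divided by the squared MU-UAV distance) and a Shannon-type rate with a bandwidth parameter; neither form is stated explicitly in the text above, and all function names and parameters are hypothetical.

```python
import math


def horizontal_distance(mu_xy, hp_xy):
    """Horizontal distance between an MU and a UAV hovering point (HP)."""
    return math.hypot(hp_xy[0] - mu_xy[0], hp_xy[1] - mu_xy[1])


def channel_gain(mu_xy, hp_xy, height, beta0):
    """Assumed LoS model: reference gain beta0 over the squared 3-D distance."""
    d_sq = height ** 2 + horizontal_distance(mu_xy, hp_xy) ** 2
    return beta0 / d_sq


def data_rate(gain, tx_power, noise_power, bandwidth):
    """Assumed Shannon-type rate; 'bandwidth' is a parameter of this sketch."""
    return bandwidth * math.log2(1.0 + tx_power * gain / noise_power)


def mu_energy(data_bits, rate, tx_power):
    """Upload energy of one MU: transmission power times transmission time."""
    return tx_power * data_bits / rate


def closest_hp(mu_xy, hps):
    """Index of the HP closest to the MU (the association rule of Eq. 3)."""
    return min(range(len(hps)), key=lambda t: horizontal_distance(mu_xy, hps[t]))
```

Because all UAVs hover at the same height H, ranking HPs by horizontal distance gives the same association as ranking them by 3-D distance.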

Motivation
By analyzing the proposed system model and problem formulation in Sect. 2, it is clear that (21(a)) is a non-convex, NP-hard, and nonlinear optimization problem. It cannot be solved by traditional optimization methods due to the following challenges.
• To solve (21(a)), we need to consider the number of UAVs, the number of HPs and their locations, which MU will send data to which HP, which UAV will visit which HPs, and in which order each UAV will visit its assigned HPs. Therefore, it is a complicated problem to tackle.
• (21(a)) contains the integer decision variable $M$ and the number of HPs $T_j$ for UAV $j$, the binary variable $a_{ij}$, and the continuous variables $X_j$ and $Y_j$. Therefore, it is a mixed decision variable problem, which is challenging to solve (Liao et al. 2014).
• Since the number of UAVs is unknown a priori, the clustering of HPs requires an unsupervised scheme (i.e., an initialization-free/parameter-free clustering algorithm) that can group closely spaced HPs into clusters automatically and simultaneously find an optimal number of clusters/UAVs (Sinaga and Yang 2020).
In this paper, we propose an algorithm called ETPA to design the trajectories of UAVs. The proposed algorithm consists of four phases: the deployment of HPs, removing redundant HPs, the association between UAVs and HPs, and the order of HPs for UAVs.
The main technical advantages of the proposed algorithm are as follows.
• ETPA accounts for the strong coupling among the deployment of HPs, the association between UAVs and HPs, and the order of HPs. It plans the trajectories of UAVs at each iteration through four phases: updating the deployment of HPs, removing redundant HPs, associating UAVs with HPs, and constructing the trajectories of UAVs.
• In ETPA, the deployment of HPs is solved by using the VLGA in Ting et al. (2009). Each individual represents a whole deployment; thus, the whole population represents a set of deployments. Since the length of individuals is variable, we modified the common crossover and mutation operators to handle variable-length individuals when updating the deployment of HPs.
• The optimization problem (21(a)) includes mixed decision variables, i.e., integer, binary, and continuous decision variables. By analyzing the problem, we transformed it into subproblems so that no subproblem involves mixed variables. We solved each subproblem independently with an efficient algorithm.

ETPA
The framework of ETPA is given in Algorithm 1. In the initialization, the locations of HPs are produced randomly, forming an initial population $POP = \{(X_1, Y_1), (X_2, Y_2), \ldots, (X_{\max}, Y_{\max})\}$. Subsequently, redundant HPs are removed by Algorithm 3 so that UAVs do not visit HPs that serve no MU. Then, the DEC algorithm in Algorithm 4 is adopted to group HPs into different clusters, and a UAV is assigned to each cluster. Afterward, the GA in Algorithm 5 is adopted to construct the order of HPs in each cluster. After that, $POP$ is evaluated via Eq. (21(a)). If it is feasible, the initial population has been generated successfully; otherwise, the initialization is repeated until it is feasible or the number of fitness evaluations ($FEs$) is not less than the maximum number of $FEs$ ($FEs_{\max}$).

The deployment of HPs
For the deployment of HPs, the VLGA in Ting et al. (2009) is adopted. GA is a simple, popular, and effective EA and has been successfully applied in many fields (Asim et al. 2017a, 2018; Mashwani and Salhi 2012; Mashwani et al. 2021). More specifically, different from Ting et al. (2009), tournament selection (Goldberg and Deb 1991), simulated binary crossover (SBX) (Deb and Agrawal 1995; Deb and georg Beyer 1995; Deb et al. 2007), and polynomial mutation (Kalyanmoy and Hans-georg 1996) operators are adopted in ETPA to generate an offspring population $POP_{off}$ (i.e., locations of new HPs). The individuals of $POP_{off}$ are used to update the parent population $POP$ (i.e., the locations of HPs).
Each individual in the GA represents a whole deployment of HPs, so the whole population represents a set of deployments, and the number of HPs in a deployment equals the length of the corresponding individual. Since the individuals in $POP$ do not all have the same length, the number of HPs varies during evolution: the individual length can be increased, kept unchanged, or reduced by applying SBX designed for variable-length individuals. By using Algorithm 2, we construct the offspring population $POP_{off}$. More specifically, we designed a special scheme to apply the SBX operator (Ting et al. 2009) to variable-length individuals. First, the lengths of both individuals are compared. For example, suppose individual 1 has four substrings and individual 2 has five substrings. To deal with the unequal lengths, the shorter individual is chosen as parent 1. After that, the substrings of parent 1 are mapped randomly to substrings of the longer individual (i.e., parent 2). The SBX crossover can then be performed on the four pairs of mapped substrings of parents 1 and 2 (Ting et al. 2009).
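The mapping scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Algorithm 2: each substring is taken to be one HP coordinate pair, the function names, the distribution index `eta`, and the bounds are hypothetical, and the sketch keeps both offspring at their parents' lengths (how ETPA lengthens or shortens individuals is not specified in the text).

```python
import random


def sbx_pair(p1, p2, eta, rng):
    """Standard simulated binary crossover on one pair of real values."""
    u = rng.random()
    if u <= 0.5:
        beta = (2.0 * u) ** (1.0 / (eta + 1.0))
    else:
        beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2


def clip(value, lo, hi):
    """Keep a coordinate inside the deployment region."""
    return max(lo, min(hi, value))


def variable_length_sbx(ind_a, ind_b, lo=0.0, hi=1000.0, eta=15.0, rng=None):
    """Cross two deployments (lists of (x, y) HPs) of possibly unequal length.

    Returns (child of the shorter parent, child of the longer parent).
    """
    rng = rng or random.Random()
    # The shorter individual plays the role of parent 1.
    short, long_ = (ind_a, ind_b) if len(ind_a) <= len(ind_b) else (ind_b, ind_a)
    # Map every HP of parent 1 to a distinct, randomly chosen HP of parent 2.
    mapping = rng.sample(range(len(long_)), len(short))
    child_short = []
    child_long = list(long_)  # unmapped HPs of parent 2 are copied unchanged
    for i, j in enumerate(mapping):
        x1, x2 = sbx_pair(short[i][0], long_[j][0], eta, rng)
        y1, y2 = sbx_pair(short[i][1], long_[j][1], eta, rng)
        child_short.append((clip(x1, lo, hi), clip(y1, lo, hi)))
        child_long[j] = (clip(x2, lo, hi), clip(y2, lo, hi))
    return child_short, child_long
```

With a four-HP and a five-HP parent, four pairs are crossed and one HP of the longer parent is inherited as is, matching the example in the text.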
If the new population were composed of the newly created descendants only, the old population's best individual might be lost. To eliminate this deficiency, the so-called elitism operator is used. This operator ensures that the previous population's best individual enters the new population without any modification; thus, the best solution found so far survives during the whole evolutionary process.
Algorithm 3 Removing HPs with no MU
1: U ← Find unique association between MUs and HPs;
2: D ← Find the set difference between the index set of HPs/POP and U;
3: UpdatedPOP ← Update HPs by removing HPs from POP with indexes D;

Removing redundant HPs
After associating MUs with their closest HPs via Eq. 3, some HPs are redundant because no MU is associated with them. We update the number of HPs by removing these redundant HPs with Algorithm 3. First, we find the unique association $U$ between MUs and HPs (line 1); then, we calculate the set difference $D$ between the index set of HPs/$POP$ (i.e., the indexes from 1 to size($POP$)) and $U$ (line 2); finally, we remove the HPs with indexes in $D$ from $POP$ (line 3). By removing redundant HPs, we prevent UAVs from visiting them; as a result, flying energy is saved. In addition, the running time of ETPA is shortened.
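The three lines of Algorithm 3 map directly onto set operations. The following Python sketch is one possible reading, with a hypothetical function name and 0-based indexes instead of the 1-based indexes used in the text.

```python
import math


def remove_redundant_hps(mus, hps):
    """Drop HPs that are not the closest HP of any MU.

    mus, hps: lists of (x, y) coordinates. Returns (kept HPs, removed indexes).
    """
    # Line 1: U = indexes of HPs that serve at least one MU.
    used = {min(range(len(hps)), key=lambda t: math.dist(mu, hps[t])) for mu in mus}
    # Line 2: D = index set of all HPs minus U.
    redundant = set(range(len(hps))) - used
    # Line 3: remove the HPs whose indexes are in D.
    kept = [hp for t, hp in enumerate(hps) if t not in redundant]
    return kept, sorted(redundant)
```

For example, with MUs at (0, 0), (1, 1), and (10, 10) and HPs at (0, 0), (50, 50), and (9, 9), the HP at (50, 50) serves no MU and is removed.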

Association between UAVs and HPs
In this section, we group HPs into different clusters, and then a UAV is associated with the HPs of each cluster. However, since the number of UAVs is unknown, we need a clustering algorithm that does not require the number of clusters/UAVs in advance. Clustering can be stated as a particular kind of NP-hard grouping optimization problem (Falkenauer 1998). Therefore, it can be solved by optimization algorithms and metaheuristics. Specifically, evolutionary algorithms (EAs) are widely used for solving NP-hard problems, as they provide near-optimal solutions to such problems in a reasonable time (Hruschka et al. 2009). Therefore, a large number of EAs for solving clustering problems have been proposed in the past. EAs are based on the optimization of some objective function (i.e., the so-called fitness function) that guides the evolutionary search (Hruschka et al. 2009). ETPA adopts the DEC algorithm in Mostapha (2015) to automatically cluster HPs. Specifically, DE/rand/1 and binomial crossover (Qin et al. 2009) are used to produce offspring. Like other EAs, DEC is based on a fitness function, which is computed using the Davies-Bouldin index (DBI) (Davies and Bouldin 1979). The DBI is a function of the ratio of the sum of within-cluster scatter to between-cluster separation (Bandyopadhyay and Maulik 2002). The scatter within cluster $C_i$ is computed as
$$S_{i,q} = \left( \frac{1}{n_i} \sum_{x \in C_i} \| x - z_i \|_2^{q} \right)^{1/q},$$
where $S_{i,q}$ is the $q$th root of the $q$th moment of the HPs in cluster $i$ with respect to their mean and is a measure of the dispersion of the HPs in cluster $i$. Specifically, $S_{i,1}$ is the average Euclidean distance of the vectors in class $i$ to the centroid of class $i$. Here, $z_i$ is the centroid of $C_i$, defined as
$$z_i = \frac{1}{n_i} \sum_{x \in C_i} x,$$
and $n_i$ is the cardinality of $C_i$, i.e., the number of HPs in cluster $C_i$. The Minkowski distance of order $t$ between clusters $C_i$ and $C_j$ is defined as
$$d_{ij,t} = \| z_i - z_j \|_t.$$
The DBI is then defined as
$$DB = \frac{1}{K} \sum_{i=1}^{K} R_{i,qt}, \tag{24}$$
where $K$ is the number of clusters and
$$R_{i,qt} = \max_{j,\, j \ne i} \left\{ \frac{S_{i,q} + S_{j,q}}{d_{ij,t}} \right\}.$$
The objective is to minimize the DBI to obtain a proper clustering of the HPs.
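As a concrete check of the definition, the following Python sketch computes the standard Davies-Bouldin index for a given partition of HPs. The function names are hypothetical, the defaults q = 1 and t = 2 are assumptions of this sketch, and at least two clusters with distinct centroids are required.

```python
import math


def centroid(points):
    """Mean position z_i of the HPs in one cluster."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def scatter(points, center, q=1):
    """S_{i,q}: q-th root of the q-th moment of the distances to the centroid."""
    return (sum(math.dist(p, center) ** q for p in points) / len(points)) ** (1.0 / q)


def minkowski(a, b, t=2):
    """Minkowski distance of order t between two centroids."""
    return sum(abs(x - y) ** t for x, y in zip(a, b)) ** (1.0 / t)


def davies_bouldin(clusters, q=1, t=2):
    """DBI of a partition given as a list of clusters (lists of points)."""
    centers = [centroid(c) for c in clusters]
    scatters = [scatter(c, z, q) for c, z in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # R_i: worst-case ratio of summed scatter to centroid separation.
        total += max(
            (scatters[i] + scatters[j]) / minkowski(centers[i], centers[j], t)
            for j in range(k) if j != i
        )
    return total / k
```

Compact, well-separated clusters give a small index, which is why the clustering phase minimizes it.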
The DEC algorithm is explained in Algorithm 4. First, for each individual in the population $POP$, a random number $j$ in the range $[j_{\min}, j_{\max}]$ is generated. This individual is assumed to represent the centers of $j$ clusters. To initialize these centers, $j$ HPs are chosen randomly from the set of HPs and placed at random positions in the individual. After that, the DBI is calculated by using Eq. (24). Subsequently, the offspring population is generated by using DE operators, and the new population is evaluated by using Eq. (24). The population with the minimum DBI is selected as the parent population for the next iteration. This process continues until the maximum number of iterations $MaxIter$ is reached. Finally, the solution with the minimum DBI is selected as the best solution; hence, the number of clusters with a proper clustering is obtained (i.e., clusters $C_1, \ldots, C_j$ are obtained, where $j$ represents the number of clusters).
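The offspring-generation step named above, DE/rand/1 mutation followed by binomial crossover, can be sketched as follows. The sketch assumes that all individuals are flat real vectors of equal length (for example, flattened cluster centers); how DEC handles individuals that encode different numbers of centers is not specified here, and the function name and default parameter values are hypothetical.

```python
import random


def de_rand_1_bin(population, target_idx, f=0.5, cr=0.9, rng=None):
    """Build one trial vector with DE/rand/1 mutation and binomial crossover.

    Requires at least four individuals of equal length.
    """
    rng = rng or random.Random()
    # Three mutually distinct individuals, all different from the target.
    candidates = [i for i in range(len(population)) if i != target_idx]
    r1, r2, r3 = rng.sample(candidates, 3)
    target = population[target_idx]
    dim = len(target)
    # Mutation: base vector plus a scaled difference vector.
    mutant = [
        population[r1][k] + f * (population[r2][k] - population[r3][k])
        for k in range(dim)
    ]
    # Binomial crossover: position k_rand always comes from the mutant.
    k_rand = rng.randrange(dim)
    return [
        mutant[k] if (rng.random() < cr or k == k_rand) else target[k]
        for k in range(dim)
    ]
```

In a clustering setting, the trial vector would be decoded into cluster centers and scored with the DBI before being compared with its parent.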

The order of HPs
In this subsection, we design the trajectories of all UAVs. This problem can be treated as a traveling salesman problem. In ETPA, we propose a GA to construct the order of HPs for all UAVs. GA is a popular EA that has shown good convergence on the traveling salesman problem (Larrañaga et al. 1999). Specifically, Swap, Flip, and Slide operators are used in the GA to produce offspring populations. The implemented operators are given below.
• Swap: selects two HPs and swaps them. The selected HPs can belong to the same route or to different routes.
• Flip/Inversion: selects a sub-route and reverses the visiting order of the HPs belonging to it.
• Slide/Insertion: selects an HP and inserts it in another place. The route into which it is inserted is selected randomly. With some probability, a new route containing only this HP can be created.
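A minimal Python sketch of the three operators on a single route is given below. A route is taken to be a list of HP indexes; the positions `i` and `j` are passed explicitly here, whereas a GA would draw them at random. Moves between different routes and the creation of new routes are not covered, and the function names are hypothetical.

```python
def swap(route, i, j):
    """Exchange the HPs at positions i and j."""
    new_route = list(route)
    new_route[i], new_route[j] = new_route[j], new_route[i]
    return new_route


def flip(route, i, j):
    """Reverse the visiting order of the sub-route between i and j (inclusive)."""
    i, j = min(i, j), max(i, j)
    return list(route[:i]) + list(route[i:j + 1])[::-1] + list(route[j + 1:])


def slide(route, i, j):
    """Remove the HP at position i and reinsert it at position j."""
    new_route = list(route)
    hp = new_route.pop(i)
    new_route.insert(j, hp)
    return new_route
```

All three functions return a new list and leave the input route unchanged, so a parent can be kept alongside its mutated copies.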
As can be seen from Algorithm 5, the algorithm requires two inputs: the coordinates of the HPs and the distance matrix, which contains the traveling distances among HPs. Furthermore, it requires some parameters to be set, such as the population size, the maximum number of iterations, and some additional constraints. After these steps, the initial population is created, which consists of randomly generated individuals. The fitness function simply sums the route lengths of all UAVs inside an individual. As can be seen in Algorithm 6, tournament selection is used, where the tournament size, i.e., the number of individuals that compete for survival, is 8. Therefore, the population size must be divisible by 8. The winner of the tournament is the member with the smallest fitness; this individual is selected for creating new individuals and also enters the new population without any modification. After selecting parents from the population, the GA operators, i.e., Swap, Flip, and Slide given in Algorithm 6, are applied to produce the offspring population. The population with the minimum tour length (i.e., minimum distance) is selected as the parent population for the next iteration. Finally, the best routes/solutions are obtained for the UAVs.
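The fitness evaluation and the group-wise tournament described above can be sketched in Python as follows. The sketch assumes that each route is a closed tour that starts and ends at a common depot (the shared initial position of the UAVs) and that the population is shuffled and split into groups of eight; these details, together with the function names, are assumptions rather than the exact procedure of Algorithms 5 and 6.

```python
import math
import random


def route_length(route, hps, depot):
    """Length of the closed tour depot -> HPs in visiting order -> depot."""
    stops = [depot] + [hps[t] for t in route] + [depot]
    return sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))


def fitness(individual, hps, depot):
    """Sum of the route lengths of all UAVs in one individual."""
    return sum(route_length(route, hps, depot) for route in individual)


def tournament_winners(population, fitness_fn, size=8, rng=None):
    """Shuffle the population, split it into groups, keep the best of each."""
    rng = rng or random.Random()
    order = list(range(len(population)))
    rng.shuffle(order)
    winners = []
    for start in range(0, len(order), size):
        group = order[start:start + size]
        best = min(group, key=lambda idx: fitness_fn(population[idx]))
        winners.append(population[best])
    return winners
```

Since every individual appears in exactly one group, the best individual of the population always wins its tournament, which gives the elitist behavior described in the text.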

Experimental settings
The parameter settings of the proposed multi-UAV-assisted MEC system are presented in Table 1. We tested eight instances with up to 200 MUs to evaluate the performance of ETPA. We assumed that all the MUs are distributed randomly in a 1000 m × 1000 m square region. The maximum number of fitness evaluations ($FEs_{\max}$) is set to 5000, and 20 independent runs are performed for each algorithm. The mean energy consumption and the standard deviation of the proposed system over 20 runs are denoted by mean EC and Std Dev, respectively. Furthermore, we performed the Wilcoxon rank-sum test at the 0.05 significance level. In the experimental results, we used ↑, ↓, and to show that ETPA performs significantly better than, worse than, and similar to its competitors, respectively.
Algorithm 6 GA operator with flip, slide, and swap
while i < popsize do
    Apply tournament selection with tournament size 8 to select 8 individuals from POP;
    for k := 1 to 8 step 1 do
        Flip ← Apply Flip to flip 2 HPs;
        Swap ← Apply Swap to transpose HPs from two random individuals;
        Slide ← Apply Slide operator to slide the HPs of random individual;
    end
    i = i + 8;
end while
OUTPUT: NEW POP POP_N;

Effectiveness of the deployment of HPs
The deployment of HPs is addressed by proposing a GA with variable-length individuals. To demonstrate its effectiveness, we replaced the proposed GA in ETPA with DEVIPS and developed a variant called DEVIPS-ETPA, in which the deployment of HPs is updated by DEVIPS. The experimental results of ETPA and DEVIPS-ETPA are presented in Table 2, which show that the proposed ETPA outperforms DEVIPS-ETPA in terms of mean EC. Furthermore, as summarized at the bottom of Table 2, ETPA provides better statistical results than DEVIPS-ETPA. Moreover, Fig. 2 shows that ETPA converges faster than DEVIPS-ETPA and maintains better performance during evolution. The better performance of ETPA can be attributed to the variable-length GA, which quickly finds a suitable number of HPs and thus improves performance.
The experimental results of ETPA and ETPA-W are given in Table 3, which show that the performance of ETPA is better than that of ETPA-W in terms of mean EC on all eight instances. In addition, ETPA provides statistically better results than ETPA-W, as can be seen at the bottom of Table 3. To further evaluate its effectiveness, Fig. 3

Effectiveness of the association between UAVs and HPs
To associate UAVs with HPs, this paper adopts the DEC algorithm given in Algorithm 4. To show the effectiveness of this association, we replaced DEC with the K-means algorithm (Jain 2010) and designed a variant called Kmeans-ETPA. The experimental results of ETPA and Kmeans-ETPA are listed in Table 4, which show that the performance of ETPA is better than that of Kmeans-ETPA in terms of mean EC on all eight instances. In addition, ETPA provides statistically better results than Kmeans-ETPA, as can be seen at the bottom of Table 4. To further evaluate its effectiveness, Fig. 4 presents the evolution of the mean EC of ETPA and Kmeans-ETPA on the eight instances, which shows that ETPA converges faster than Kmeans-ETPA and maintains better performance during evolution. The reason why ETPA performs better than Kmeans-ETPA is straightforward: the DEC algorithm in ETPA groups closely spaced HPs into the same cluster automatically, without knowing the number of clusters in advance, which reduces the EC of the system. In addition, it determines the number of UAVs, which reduces the extra cost and improves the system performance.

Effectiveness of GA
To construct the order of HPs for UAVs, this paper adopts the GA in Algorithm 5. To show the effectiveness of the GA, we compared ETPA with a variant called ETPA-Greedy. The experimental results are listed in Table 5, which show that the performance of ETPA is better than that of ETPA-Greedy in terms of mean EC on all eight instances. In addition, ETPA provides statistically better results than ETPA-Greedy, as can be seen at the bottom of Table 5. To further evaluate its effectiveness, Fig. 5 presents the evolution of the mean EC of ETPA and ETPA-Greedy on the eight instances, which shows that ETPA converges faster than ETPA-Greedy and maintains better performance during evolution. The reason why ETPA performs better than ETPA-Greedy is that the GA is a well-established EA known for its good convergence in solving NP-hard problems.

Conclusion
This paper has presented a multi-UAV-assisted MEC system, where multiple UAVs are used to serve MUs. A trajectory planning problem was formulated as an optimization problem with the aim of minimizing the system energy consumption. To solve the problem, we have proposed an evolutionary trajectory planning algorithm that consists of four phases. In the first phase, a genetic algorithm with variable-length individuals was adopted for the deployment of HPs. This algorithm updates the number and locations of HPs by using genetic operators designed for variable-length individuals. In the second phase, redundant HPs were removed by the remove operator. Afterward, the association between UAVs and HPs was determined by adopting the DEC algorithm. Finally, a GA was adopted to construct the trajectories of all UAVs with the aim of reducing their flight distances. The experimental results on eight instances with up to 200 MUs have shown that the proposed ETPA performs better than the compared variants in terms of minimizing the system energy consumption. In the future, we intend to propose a low-complexity algorithm that takes into account the dynamic environment of MEC systems.