A Boltzmann machine optimizing dynamic routing for FANETs

Routing optimization for FANETs is an NP-hard combinatorial optimization problem that is simple to model but difficult to solve. The quality of routing has a direct impact on the network quality of FANETs, which makes the design of routing protocols a very challenging topic. In this paper, we study the characteristics of dynamic routing, combine them with the characteristics of FANETs themselves, use node energy, bandwidth, link stability, etc., as routing metrics, and use a Boltzmann machine for route search to form an optimized dynamic routing protocol. The NS3 simulator is used to compare this protocol with the traditional MANET dynamic routing methods AODV and DSR, and the simulation results show that the routes obtained by the Boltzmann machine search are better than those of AODV and DSR in several respects, such as average end-to-end delay, average route survival time and control overhead.


Introduction
In recent years, UAVs have been increasingly used in military and civilian fields, such as coordinated reconnaissance (Awasthi et al. 2019), precision agriculture (Tsouros et al. 2019), disaster management (Erdelj et al. 2017), environmental monitoring (Tripolitsiotis et al. 2017), and aerial base stations (Savkin et al. 2018). As UAV technology has developed, it has become clear that a single UAV is not reliable enough and that its limited energy, capacity and payload make it difficult for it to complete complex tasks. Therefore, the trend is for multiple UAVs to work together. With the development of electronic technology, UAVs have been miniaturized, and to overcome the shortcomings of a single-UAV system, collaborating UAVs establish a network in ad hoc mode, which is called a flying ad hoc network (FANET) (Bekmezci et al. 2013; Arafat et al. 2019). In FANETs, communication between UAVs is particularly important, as it is the basis for UAV collaboration. Because a FANET is a highly dynamic network, the quality of routing directly determines the quality of the network and, in turn, how well the UAVs collaborate to accomplish their task.
At the same time, artificial intelligence technology is developing rapidly, and many combinatorial optimization tasks can now be handled using artificial intelligence algorithms such as neural networks. With the increase in the computing power of UAV devices, it has become possible to use neural network algorithms to solve FANET routing optimization problems. Whereas traditional routing algorithms are model-driven, neural network-based intelligent routing algorithms are data-driven: instead of solving a mathematical model, they take the network topology and network features as inputs and produce routing decisions or link states as outputs (Khan et al. 2020). These methods can train the algorithm model on real data without modeling the network environment and can quickly and accurately calculate routing decisions from the input data and the trained deep learning model, with faster routing convergence. The same algorithm model can address different network optimization demands given different training data and labels, so it has the advantages of accuracy, efficiency and generality, which represents the future development direction of FANET routing decisions. Yang et al. (2019) proposed an optimal routing protocol for DSR routing using a Hopfield neural network (HNN), and Wei et al. (2020) constructed a dynamic routing method using an HNN; both reported better simulation results. However, the HNN can converge to invalid solutions. In this paper, an intelligent dynamic routing algorithm based on a Boltzmann machine is proposed, in which the Boltzmann machine computes the network route with minimum routing overhead to achieve the optimal routing strategy.

FANET routing protocol
Currently, there is no routing protocol specifically designed for FANETs. In general, a FANET, as a special class of MANET, directly uses traditional MANET routing protocols. In MANETs, routing algorithms can be classified either according to the type of information that network nodes use to calculate priority routes or according to the method by which network nodes obtain routing information (Chin et al. 2002). The second classification is more common, and under it MANET routing is broadly divided into table-driven routing and on-demand driven routing. Table-driven routing is also known as proactive routing. This type of routing endeavors to maintain consistent and up-to-date routing information from each node in the network to all other nodes. It requires each node to create and maintain one or more tables storing routing information and reacts to changes in the network topology by disseminating routing updates to the entire network, thereby keeping routing information consistent across the network.
On-demand driven routing is also known as reactive routing. This type of routing creates routes only when they are needed by the network nodes, which is what is meant by "on-demand". When a node needs a route to a destination node, it acts as the source node and initiates a route discovery process within the network. The discovery process terminates once a suitable route is found or all possible routes have been checked. Once a route is created, it is maintained by a route maintenance mechanism until the route breaks abnormally or is no longer needed.
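As a rough illustration of the on-demand idea, the sketch below searches for a route only at the moment one is requested, using a breadth-first search that loosely mirrors a route request spreading outward from the source. The adjacency-map representation and the function name `discover_route` are assumptions made for this example; it is not the discovery procedure of AODV or DSR.

```python
from collections import deque

def discover_route(links, source, destination):
    """Return one minimum-hop route as a list of nodes, or None if unreachable.

    `links` maps each node to the set of neighbours currently in radio range
    (a hypothetical snapshot of the topology at request time).
    """
    parent = {source: None}           # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            route = []
            while node is not None:   # walk the parent links back to the source
                route.append(node)
                node = parent[node]
            return route[::-1]
        for neighbour in links.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None                       # every reachable node was checked

links = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A"}, "D": {"B"}, "E": set()}
print(discover_route(links, "A", "D"))
print(discover_route(links, "A", "E"))
```

The search ends in exactly the two ways the text describes: a route is found, or all candidates are exhausted and `None` is returned.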

Boltzmann machine
The Boltzmann machine (Ackley et al. 1985) is a kind of stochastic neural network proposed by introducing a simulated annealing algorithm with the help of concepts and methods of statistical physics. It is widely used in the field of combinatorial optimization (Khan et al. 2018; Nasrin et al. 2019; Wang et al. 2019).
The Boltzmann machine consists of $N$ neurons; the state $r_i$ ($i = 1, 2, \ldots, N$) of each neuron takes the value 0 or 1, and the neurons are connected in both directions, so the connection weights are symmetric. There is no distinct hierarchical boundary between layers; the structure of the Boltzmann machine network is shown in Fig. 1.
For a Boltzmann machine with $N$ neurons, the connection weights between neurons are $\{w_{ij}\}$, the output threshold of each neuron is $\{h_i\}$, the output is $\{r_i\}$, and the internal state of neuron $i$ is $\{E_i\}$ (where $i, j = 1, 2, \ldots, N$). Initially, $T|_{t=0} = T_0$ and $w_{ij} = w_{ji}$, and $\{w_{ij}\}$ and $\{h_i\}$ are random values in the interval $[-1, +1]$. The Boltzmann machine operates by repeatedly selecting a neuron $i$ at random from the $N$ neurons and, for the $k$-th selection, combining the output states $r_j(k)$ of all other neurons to obtain the internal state $E_i(k)$ of that neuron, as given by Eq. (1). The probability that the output state $r_i(k+1)$ of the $i$-th neuron is 1 is then given by Eq. (2). When this probability is greater than the pregiven probability value $\varepsilon_0$, the state of this neuron is updated to 1, i.e., $r_i(k+1) = 1$, and the output states of the other neurons remain unchanged. Then, the temperature parameter $T$ is updated according to a certain law, and the above operation is repeated until $T$ falls below a pregiven cutoff temperature $T_d$. From Eqs. (1) and (2), it can be seen that the lower the Boltzmann machine energy $E_i(k)$ is, the greater the probability that the state is 1. This makes the probability of occurrence of each local minimum state greater than that of the surrounding states, and the probability of occurrence of the global minimum greater than that of each local minimum.

Fig. 1 The structure of the Boltzmann machine network
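The single-neuron update described above can be sketched as follows. This is a minimal illustration that assumes the common textbook forms $E_i = \sum_{j \ne i} w_{ij} r_j - h_i$ and $P = 1/(1 + e^{-E_i/T})$; the sign convention of Eqs. (1) and (2) may differ. The function names, the default `eps0 = 0.5`, and the choice to set the neuron to 0 when the threshold is not exceeded are likewise assumptions.

```python
import math

def internal_state(weights, thresholds, states, i):
    """Net input of neuron i from all other neurons, minus its threshold (assumed form)."""
    total = sum(weights[i][j] * states[j] for j in range(len(states)) if j != i)
    return total - thresholds[i]

def fire_probability(e_i, temperature):
    """Logistic probability that the neuron outputs 1 at this temperature (assumed form)."""
    x = -e_i / temperature
    if x > 700.0:            # math.exp would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

def update_neuron(weights, thresholds, states, i, temperature, eps0=0.5):
    """Set neuron i to 1 when its probability exceeds eps0, else to 0 (assumption)."""
    p = fire_probability(internal_state(weights, thresholds, states, i), temperature)
    states[i] = 1 if p > eps0 else 0   # other neurons are left unchanged
    return p

# Symmetric weights and thresholds drawn from [-1, 1], as in the text (values are made up).
weights = [[0.0, 0.8, -0.3], [0.8, 0.0, 0.5], [-0.3, 0.5, 0.0]]
thresholds = [0.1, -0.2, 0.4]
states = [0, 1, 1]
print(update_neuron(weights, thresholds, states, i=0, temperature=1.0), states)
```

Repeating `update_neuron` for randomly chosen neurons while lowering `temperature` gives the overall loop the paragraph outlines.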
On the other hand, when the temperature $T$ is very high, the differences between the probabilities of occurrence of the states are greatly reduced, although the minima remain the most probable states; when $T$ is very low, the opposite is true: the differences are amplified, so that at the end of the search the neural network is much more likely to stay at the global minimum than at a local minimum. Therefore, the network tends to move toward nearby minima at the microscopic level and toward the energy minimum at the macroscopic level, and when caught in a local minimum, it can cross the potential barrier to escape that minimum and thus reach the global minimum.
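The effect of temperature can be checked numerically. The sketch below assumes that state probabilities follow the standard Boltzmann distribution $p(s) \propto e^{-E(s)/T}$; the three energies are made-up values standing for a global minimum, a local minimum and an ordinary state.

```python
import math

def boltzmann_distribution(energies, temperature):
    """Normalised probabilities exp(-E/T) / Z for a list of state energies."""
    lowest = min(energies)   # shift by the minimum for numerical stability
    weights = [math.exp(-(e - lowest) / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

energies = [0.0, 1.0, 3.0]   # hypothetical: global minimum, local minimum, ordinary state
for t in (100.0, 1.0, 0.1):
    print(t, [round(p, 4) for p in boltzmann_distribution(energies, t)])
```

At the highest temperature the three probabilities are nearly equal, while at the lowest temperature almost all of the probability mass sits on the lowest-energy state, which matches the qualitative argument above.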

Boltzmann machine for the dynamic routing optimization method

FANET route problem
For this type of network, if proactive routing is used, i.e., each network node maintains a routing table, there is an additional demand for node storage space. At the same time, the network is flooded with routing update information as the nodes move and cause topology changes. This has a very negative impact on the delivery of network information. If reactive routing is used, i.e., routes are discovered dynamically, route lookups are performed only when a node in the network has data to send. The advantage is that there is no need to maintain a routing table in real time. The disadvantage is that a route is not immediately available when data need to be sent and the route lookup incurs additional overhead, but this overhead is sufficiently small for FANETs with limited storage space. In FANETs, nodes usually use the full duplex mode, which means that two nodes must each be within the communication range of the other to communicate. Therefore, a FANET can be regarded as an undirected graph. When a node in the network needs to communicate with another node, it must find an optimal route among all routes from itself to that node, and the number of routes in the route set $R$ usually grows exponentially. Thus, we start from an initial set of paths and use an additional cost subproblem to generate routes of interest, i.e., routes with negative reduced cost. The problem of generating such routes can be described as the least-cost routing problem of Eq. (3), where $B_{ij}$ denotes the link bandwidth between nodes $i$ and $j$, $P$ denotes the current energy of nodes, and $S_{ij}$ denotes the prediction of link stability between nodes $i$ and $j$, which is performed as in (Yang et al. 2019), with the coefficients constrained by

$$\text{s.t.} \quad \alpha_1 + \alpha_2 + \alpha_3 = 1 \qquad (4)$$
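To make the role of the three metrics and of the constraint in Eq. (4) concrete, the sketch below combines bandwidth, node energy and link stability into a single link cost. The linear form, the normalisation of each metric to $[0, 1]$, the default coefficients and the function names are all assumptions made for illustration; this is not Eq. (3).

```python
import math

def link_cost(bandwidth, energy, stability, alphas=(0.4, 0.3, 0.3)):
    """Hypothetical per-link cost; all three metrics are assumed normalised to [0, 1]."""
    a1, a2, a3 = alphas
    if not math.isclose(a1 + a2 + a3, 1.0):
        raise ValueError("coefficients must satisfy a1 + a2 + a3 = 1")
    for value in (bandwidth, energy, stability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("metrics must be normalised to [0, 1]")
    # Higher bandwidth, energy and stability are better, so each term uses (1 - metric).
    return a1 * (1.0 - bandwidth) + a2 * (1.0 - energy) + a3 * (1.0 - stability)

def route_cost(route, metrics, alphas=(0.4, 0.3, 0.3)):
    """Sum the link costs over consecutive hops; metrics[(i, j)] = (B_ij, P, S_ij)."""
    return sum(link_cost(*metrics[(i, j)], alphas=alphas)
               for i, j in zip(route, route[1:]))

metrics = {(0, 1): (0.9, 0.8, 0.7), (1, 2): (0.5, 0.6, 0.9), (0, 2): (0.2, 0.9, 0.3)}
print(route_cost([0, 1, 2], metrics), route_cost([0, 2], metrics))
```

With these made-up numbers the two-hop route through node 1 costs less than the direct link, because the direct link has low bandwidth and low predicted stability.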

Optimization method
The Boltzmann machine is used to solve the routing problem by establishing a correspondence between the Boltzmann machine network structure and FANET routes so that the minimum of the energy function corresponds to the optimal route. To construct the Boltzmann network structure for routing, the FANET routing problem must be formulated as a 0-1 programming problem. For a FANET with $n$ nodes, it is in principle necessary to construct an $n \times n$ Boltzmann network whose states form a permutation matrix characterizing an optimal route. However, constructing an $n \times n$ matrix directly is too burdensome for FANET nodes when the network is large. Following the RIP routing protocol (Hedrick et al. 1988; Malkin et al. 1994), we consider that there is no route between the source node and the destination node if the destination cannot be reached within 16 hops. Therefore, a $16 \times 16$ Boltzmann network is constructed to characterize the optimal route of the FANET. The state $r_{ij}$ of each Boltzmann neuron is a 0-1 variable indicating whether node $i$ is the $j$-th node visited. In addition, $k_{ij}$ is the connection weight between two nodes, and the energy function of the optimized route is defined in terms of these states and weights. This definition makes each network structure at a local minimum of the energy function correspond to a route, and the smaller the integrated cost of a given route is, the smaller the energy of the corresponding Boltzmann structure. In this way, solving for the optimal route of a FANET becomes the convergence process of the Boltzmann network: it starts from any Boltzmann structure corresponding to a FANET route and changes in the direction of decreasing energy of the neural network structure. When the neural network becomes stable, the resulting network structure corresponds to the optimal or a near-optimal route of the FANET.
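The 0-1 encoding can be illustrated with a small sketch. It assumes that nodes are identified by local indices 0 to 15 and that entry `[i][p]` is 1 when node `i` is the `p`-th node visited; the helper names are hypothetical.

```python
MAX_HOPS = 16  # hop limit borrowed from RIP, as in the text

def encode_route(route, size=MAX_HOPS):
    """Return a size x size 0-1 matrix with entry [i][p] = 1 if node i is visited p-th."""
    if len(route) > size or len(set(route)) != len(route):
        raise ValueError("route must visit at most `size` distinct nodes")
    if any(not 0 <= node < size for node in route):
        raise ValueError("node indices must lie in range(size)")
    matrix = [[0] * size for _ in range(size)]
    for position, node in enumerate(route):
        matrix[node][position] = 1
    return matrix

def is_valid(matrix):
    """At most one active neuron per row (node) and per column (position)."""
    rows_ok = all(sum(row) <= 1 for row in matrix)
    cols_ok = all(sum(col) <= 1 for col in zip(*matrix))
    return rows_ok and cols_ok

def decode_route(matrix):
    """Read the visiting order back from the matrix, column by column."""
    return [col.index(1) for col in zip(*matrix) if 1 in col]

states = encode_route([3, 0, 7, 5])
print(is_valid(states), decode_route(states))
```

Routes shorter than 16 hops simply leave the trailing columns empty, which is why validity is checked as "at most one" rather than "exactly one" active neuron per row and column.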
To ensure that each Boltzmann network structure corresponds to a route, when the state of a node changes, the corresponding other nodes change their states at the same time, so that the whole network structure still represents a travelable route. At any moment, only one node in each row and column has a state value of 1, and no forbidden connection is activated. For this reason, the connection weights are specified in the algorithm, and the connections $C_c$ are divided into distance connections $C_d$, forbidden connections $C_i$ and own connections $C_b$. The distance connection weights are taken as the negative of the distance, so connections with shorter distances have larger weights and are more easily activated, which makes the minimum of the energy function correspond to the optimal path. The forbidden connections connect all nodes in the same row and in the same column, and their weights are set to a large negative number to ensure that no forbidden connection is activated at a local minimum, so that at most one node in each row and column has a state value of 1.
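One way to read the three connection types is sketched below for neurons indexed by (node, position). The constants `PENALTY` and `BIAS`, the rule that distance connections link consecutive positions, and the function name are assumptions made for illustration; they are not the definitions of $C_d$, $C_i$ and $C_b$ used in the paper.

```python
PENALTY = 1_000.0   # hypothetical magnitude of the "large negative" forbidden weight
BIAS = 0.0          # hypothetical weight of a neuron's own connection

def connection_weight(a, b, distance):
    """Weight between neurons a = (node, position) and b = (node, position)."""
    (i, p), (j, q) = a, b
    if a == b:
        return BIAS                  # own connection
    if i == j or p == q:
        return -PENALTY              # forbidden: same row or same column
    if abs(p - q) == 1:
        return -distance[i][j]       # distance connection between consecutive positions
    return 0.0                       # no connection otherwise

distance = [[0, 2, 5], [2, 0, 3], [5, 3, 0]]   # made-up symmetric link distances
print(connection_weight((0, 0), (1, 1), distance))
print(connection_weight((0, 0), (0, 1), distance))
```

Because the distance matrix is symmetric, the resulting weights are symmetric as well, which is the property the Boltzmann machine requires.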
The decay function of the control parameter $T$ depends on two quantities: $\delta$, a small positive quantity close to 0, and $\sigma_k$, the variance of the objective function value of the $k$-th Markov chain. This type of decay function is characterized by a relatively slow decrease in $T$ at the beginning and a faster decrease in $T$ as the search proceeds. This is advantageous for finding the global minimum because the slow initial decrease lets the algorithm try different solutions in the convergence domain, while the faster decrease later on allows the algorithm to converge more quickly. The initial value $T_0$ is chosen using the method given by Eq. (11) (Aarts et al. 1989).
In Eq. (11), $c$ is the initial acceptance rate constant ($c = 0.9$ in the experiment), $m_1$ and $m_2$ are the numbers of decreasing and increasing transformations of the objective function, respectively, among $m$ attempts (10 in the simulation) made at a given value of the control parameter $T$, and $\Delta f^{+}$ is the average increase of the objective function over the $m_2$ increasing transformations. The Boltzmann neural network updates the states of the neurons corresponding to routes in three ways to ensure that the entire neural network structure always represents a single travelable route.
1. Reverse order: two edges are removed from the overall route and replaced with another travelable route but with the opposite access order of the same nodes.
2. Transformation: a section of the route is removed and placed between two randomly selected consecutive nodes.
3. Swap: two nodes are randomly selected and swapped in the route.
For the updated neuron state, the neural network accepts it with a certain probability according to the simulated annealing algorithm.
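The annealing schedule and the three route updates can be sketched together as follows. The temperature formulas are the ones commonly attributed to Aarts and co-authors, $T_0 = \Delta f^{+} / \ln\big(m_2 / (m_2 c - m_1 (1 - c))\big)$ and $T_{k+1} = T_k / \big(1 + T_k \ln(1 + \delta) / (3 \sigma_k)\big)$, and it is an assumption that they match the equations used in the paper. The function names, the default `delta`, and the Metropolis acceptance rule are also assumptions.

```python
import math
import random

def initial_temperature(m1, m2, mean_increase, c=0.9):
    """T0 for which about a fraction c of the m1 + m2 trial transformations is accepted."""
    ratio = m2 * c - m1 * (1.0 - c)
    if not 0.0 < c < 1.0 or m1 < 0 or m2 <= 0 or ratio <= 0.0:
        raise ValueError("need 0 < c < 1, m1 >= 0, m2 > 0 and m2*c > m1*(1 - c)")
    return mean_increase / math.log(m2 / ratio)

def next_temperature(t, sigma_k, delta=0.1):
    """One cooling step; sigma_k is the spread of the objective over the k-th chain."""
    if sigma_k <= 0.0:
        raise ValueError("sigma_k must be positive")
    return t / (1.0 + t * math.log(1.0 + delta) / (3.0 * sigma_k))

def reverse_move(route, i, j):
    """Reverse order: visit the nodes of route[i..j] in the opposite order."""
    return route[:i] + route[i:j + 1][::-1] + route[j + 1:]

def transfer_move(route, i, j, k):
    """Transformation: cut route[i..j] out and reinsert it after index k of the rest."""
    segment, rest = route[i:j + 1], route[:i] + route[j + 1:]
    return rest[:k + 1] + segment + rest[k + 1:]

def swap_move(route, i, j):
    """Swap: exchange the nodes at positions i and j."""
    new_route = list(route)
    new_route[i], new_route[j] = new_route[j], new_route[i]
    return new_route

def accept(delta_cost, temperature, rng=random):
    """Metropolis rule: keep improvements, keep a worse route with prob. exp(-d/T)."""
    return delta_cost <= 0 or rng.random() < math.exp(-delta_cost / temperature)

route = [0, 1, 2, 3, 4, 5]
print(reverse_move(route, 1, 3))
print(transfer_move(route, 1, 2, 1))
print(swap_move(route, 1, 4))
print(initial_temperature(m1=4, m2=6, mean_increase=1.0))
```

Each move returns a new list containing the same nodes, so the result is always a single route. For routing between a fixed source and destination, a caller would presumably restrict the indices to interior positions so that the endpoints stay in place.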

Performance indicators
To verify whether the Boltzmann machine can optimize the routing of FANETs and thus improve the performance of the FANET network, we use the NS3 simulator to simulate and verify several key metrics affecting network performance (Nayyar 2018; Yang et al. 2019), such as the average end-to-end delay, average route survival time, and control overhead.
(1) Average end-to-end delay: This is the time required for data to be transmitted from one end of the network to the other. It is mainly the sum of sending delay, propagation delay, processing delay, and queuing delay. For highly dynamic real-time networks such as FANETs, low latency has been the core task of routing optimization. In a general sense, the delay determines the performance of routing protocols, and the higher the stability and the larger the bandwidth are, the lower the delay of routing.
(2) Average route lifetime: This directly reflects the quality of the route. A route with a longer lifetime means that the network is more stable. The worse the stability of a route is, the more the rapid topology changes of FANETs increase the probability of route breakage, and the indicator clearly reflects the impact of routing on network performance.
(3) Control overhead: The control overhead reflects the control message overhead caused by the periodic sending of route update messages by nodes to maintain the route and the route updates triggered by route breakage.

Simulation parameter setting
A Boltzmann machine routing protocol running in NS3 was developed in C++, and the following settings are used in the NS3 simulation environment (http://www.nsnam.org); see Table 1 for details. The network nodes are assumed to be at the same height, the size of the space is 2000 m × 2000 m, and 10 × 10 nodes are evenly distributed in this space. The initial energy of each node is set to 100%, the nodes use omnidirectional antennas, the signal reception range of the nodes is 250 m, the node mobility model is the random waypoint mobility model, the MAC layer uses IEEE 802.11 DCF, and the network layer uses the Boltzmann machine, AODV, and DSR. The protocols are compared under changes in overall network size and in node movement speed. The simulation time is 900 s.

Comparison of the impact on network performance when network size changes
The network nodes move at the same speed of 10 km/h, and the number of communication source nodes is gradually increased to study the impact of the Boltzmann machine, AODV, and DSR on network performance as the network load changes. The number of communication source nodes is increased in steps of 5, and the packet rate is 4 packets/s. When the number of communication source nodes reaches 40, the packet rate is reduced to 2 packets/s to avoid a meaningless comparison due to heavy congestion of the network. Figure 2 shows the average delay when the number of communication source nodes varies. The delay is not high for any of the three routing methods when the number of communication source nodes is lower than 15, while it increases as more communication source nodes are added, indicating the presence of congestion and multiple-access interference in the network. None of the three methods has a load balancing mechanism, so the latency becomes larger. The route obtained by the Boltzmann machine takes multiple node factors into account, so its latency remains the lowest. Figure 3 shows the control overhead when the number of communication source nodes varies. The control overheads of all three methods are relatively flat, which means that adding communication source nodes does not bring additional control overhead when the nodes move at the same speed. The control overhead of the Boltzmann machine is still substantially lower than that of AODV and DSR, which means that the routes obtained by the Boltzmann machine are still the most stable.

Comparison of the impact of node movement speed variation on network performance
The network uses 10 communication source nodes with a packet rate of 4 packets/s, and the movement speed of the network nodes is gradually increased from 0 to 50 km/h to compare the impact of the Boltzmann machine, AODV, and DSR routing protocols on network performance. Figure 4 shows the average delay when the node movement speed varies, from which we can see that the average delay of the Boltzmann machine is significantly lower than that of AODV and DSR, especially as the node speed increases. The main reason is that as the node movement speed increases, the topology changes more sharply and the risk of route breakage increases significantly, while the Boltzmann machine fully considers the performance of the nodes and takes into account the stability of the routes and other factors, so its route quality is significantly better. This is also verified by the control overhead in Fig. 5 and the average route lifetime in Fig. 6.

Conclusion
Artificial intelligence algorithms have been widely used in combinatorial optimization, and applying neural networks to the optimal route search of FANETs can not only effectively reduce the routing delay but also improve the stability and quality of routing. By considering the energy of the FANET nodes, link bandwidth, and link stability, the strategy of using a Boltzmann machine for route search can obtain high-quality, highly stable routes and thus improve network performance. The performance of the Boltzmann machine is not high in the early stage of learning, before the whole neural network reaches thermal equilibrium. For realistic UAVs, especially micro UAVs, how to further reduce the energy consumed by the algorithm and accelerate the convergence of the neural network is a direction for future research.