## 3.1 FOA based on locality sensitive hashing mechanism

As seen from Section 1.2, the main idea of locality sensitive hashing is that if two data blocks in a high- dimensional space are close, the result of a locality sensitive hashing of that data block has a high probability of similar. If two data blocks are farther apart, the hash result will have a small probability of similar. Therefore, for the FOA algorithm, its core is to design the locality sensitive hashing function h(x).

Given an n-dimensional problem *F*(*x*1,…,*x**n*) with problem domain \(x\in {\left[{R}_{min},{R}_{max}\right]}^{n}\), given an existing solution as \(X\text{'}=({x\text{'}}_{1},\dots ,{x\text{'}}_{n})\), and any other solution is \(X=({x}_{1},\dots ,{x}_{n})\), then the problem *F*(*x*1,…,*x**n*) relative to the solution \(X\text{'}\) the locality sensitive hashing function of is

$$h\left(X|{X}^{{\prime }}\right)=\left[\frac{\sqrt[2]{\left({x}_{1}-{x}_{1}{\prime }\right)+\dots +\left({x}_{n}-{x}_{n}^{{\prime }}\right)}}{{E}_{n}}\right] \left(8\right)$$

where [value] denotes the integer part of the value, \({E}_{n}\) denotes the solution over head of the problem \(F({x}_{1},\dots ,{x}_{n})\), the physical meaning is the range of each hashtable element. In short, the larger the value, the smaller the number of elements of the corresponding hash table. Meanwhile, the smaller the value, the more elements of the corresponding hash table. \({E}_{n}\) is calculated as follows.

$${E}_{n}=\frac{\sqrt[2]{{{(R}_{max}^{1}-{R}_{min}^{1})}^{2}+\dots +{{(R}_{max}^{n}-{R}_{min}^{n})}^{2}}}{w} \left(9\right)$$

Where w denotes the accuracy of the solved problem. It is used to control the range of each element of the hash table. For example, if given a two-dimensional (n = 2) problem *F*(*x*1, ,*x*2), its problem domain is \(x\in {\left[\text{0,5}\right]}^{2}\), \({x}_{i}\in \left[\text{0,5}\right], i=\text{1,2}\), then the solution range of the problem *F*(*x*1, ,*x*2) is\(\left(\text{0,0}\right),\dots ,\left(\text{5,5}\right)\), if given the parameter w=5, then \({E}_{n}\)=\(\frac{\sqrt[2]{50}}{5}\), i.e., the range of elements in each sensitive hash table is [[\(\frac{\sqrt[2]{50}}{5}\text{*}(j-1)\)],[\(\frac{\sqrt[2]{50}}{5}\text{*}j\)]], where j denotes the *j*th elements in the sensitive hash table, and if given the parameter w=10, then \({E}_{n}=\frac{\sqrt[2]{50}}{10}\), similarly, the range of elements in each sensitive hash table is [[\(\frac{\sqrt[2]{50}}{10}\text{*}\)],[\(\frac{\sqrt[2]{50}}{10}\text{*}(j+1)\)]]. Then, assuming that the current solution (i.e., for the population location in the FOA) is X=(0,0) and the target solution (i.e., the population location to be selected in the FOA) is *X*1=(0,1),*X*2= (1,0)༌*X*3= (1,1)༌*X*4= (0,2)༌ *X*5= (2,0)༌ *X*6= (2,2), then the sensitive hash, value cor, esponding, to the ta, get soluti, n(w=5) is \(h\left({X}_{1}|X\right)=\left[\frac{5}{\sqrt[2]{50}}\right]=0\),\(h\left({X}_{2}|X\right)=\left[\frac{5}{\sqrt[2]{50}}\right]=0\),\(h\left({X}_{3}|X\right)=\left[\frac{10}{\sqrt[2]{50}}\right]=1\),\(h\left({X}_{4}|X\right)=\left[\frac{20}{\sqrt[2]{50}}\right]=2\), \(h\left({X}_{5}|X\right)=\left[\frac{20}{\sqrt[2]{50}}\right]=2\),\(h\left({X}_{6}|X\right)=\left[\frac{40}{\sqrt[2]{50}}\right]=5\),Thus, the target solutions *X*1 and *X*2 are mapped into the first element of the locality sensitive hashing table, *X*3 is mapped into the second element of the locality sensitive hashing table, the target solutions *X*4 and *X*5 are mapped into the third element of the locality sensitive hashing table, and the target solution *X*6 is mapped into the sixth element of the locality sensitive hashing table. The graphical representation is shown in the following figure.

From Fig. 3, the locality sensitive hashing result of the target solution (the population location to be selected in the FOA) in this example is divided into four sensitive hashing table elements, i.e., elements \(j\in J=\left\{\text{1,2},\text{3,6}\right\}\). Then, how to migrate the FOA population location is another key problem. In this study, the roulette technique for the target solution selection and migrates the selected target solution were used as the new FOA population location.

Roulette as a commonly used selection method, also known as proportional selection method. The basic idea of Roulette is that the probability of each population location point being selected is related to its corresponding edge weight value, which is performed in the following steps.

Step1: First determine the selection probability of each locally sensitive hash table element.

$${\alpha }_{j}={e}^{j},jϵJ \left(10\right)$$

Step2: Calculate the sum of the weights between all FOA population location points to be selected and the current fruit fly population location \({X}_{i}\):\({\sum }_{\left\{j\in J\right\}}{\alpha }_{j}\)

Step 3: Calculate the probability of each FOA population location point x being selected.

$${p}_{k}=\frac{{\alpha }_{k}}{{\sum }_{\left\{jϵJ\right\}}{\alpha }_{j}} \left(11\right)$$

Step 4: Calculate the cumulative probability of each FOA population location point to be selected.

$${{p}^{{\prime }}}_{k}=\frac{{\sum }_{j=1}^{k}{p}_{j}}{{\sum }_{jϵJ}{\alpha }_{j}} \left(12\right)$$

Step 5: Generate a random number θ with uniform distribution in the interval [0,1].

Step5: If \(\theta \le {p}_{k}^{\text{'}}\) and \({p}_{k-1}^{\text{'}}\le \theta\), then element k of the locality sensitive hashing table is selected, and then, any FOA to be selected population location Xi in element k is randomly selected.

At this point, the new FOA population location has been selected.

Algorithm 1. LSHFOA |

Input: benchmark function and definition D Output: the optimal solution in the definition domain D |

1: Initialize a set of (n) population location points\(V=\{{X}_{1},{X}_{2},\dots ,{X}_{n}\}\) 2: Select a population location point as the initial: \(\forall {X}_{i}\in V\) 3: Loop \(t\le iter\) Then 4: Generate population individuals (Pop)for point \({X}_{i}\) according to the population 5: Calculate the benchmark function value for each individual fruit fly 6: Preservation of optimal fruit fly individuals 7: If trapped in local optimum Then 8: Calculate the probability of selecting the location point of the population to be selected in FOA and the point Xi: {\({\alpha }_{1},\dots ,{\alpha }_{n}\)} 9: Roulette calculates the selection probability of each population location point: P= {\({p}_{1},\dots ,{p}_{n}\)} 10: Randomly generated selection probabilities\({p}_{0}\) 11: Selecting new population locations based on locality sensitive hashing table 12: End if 13: Update population location point\({X}_{i}\) 14: End Loop 15: Return the optimal solution |

By means of roulette, the weight relationship between population location points can be mapped to the selection probability, and then random values are used for population location selection. This roulette selection method can, on the one hand, improve the selection probability of dominant population location points, i.e., satisfy the principle of optimization selection; on the other hand, roulette also has a chance to select other slightly inferior population location points, and this operation helps to enhance the diversity of fruit fly population locations and further ensures the global optimization capacity of the fruit fly optimization algorithm.

## 3.2 LSHFOA

In the improved fruit fly optimization algorithm (LSHFOA) proposed in this paper, locality sensitive hashing tables are used for the selection of Drosophila population locations, i.e., when the fruit fly optimization algorithm falls into a local optimum (usually measured by multiple population location invariance), the roulette mechanism is used for population location selection (according to locality sensitive hashing table). the pseudo-code representation of LSHFOA is shown in Algorithm 1[6–8].

**Algorithm 1** shows that the time complexity of the algorithm is O (*m*1*m*2), i.e., the time consume of the algorithm is related to the population size *m*1 and the number of iterations *m*2. Therefore, in order to reduce the time consuming of the algorithm, the population size in the experiments of this paper is 50 and the number of iterations is 300. In order to cover more initial location points, the size of the initial population location set V is 50, i.e., the fruit fly population location locality sensitive hashing table contains a total of 50 points[9–11]. According to the execution flow of Algorithm 1, the flow chart of LSHFOA is shown in Fig. 4.

Where V in Fig. 4 denotes a set of fruit fly population location points, i.e., each vertex in the locality sensitive hashing table of fruit fly population location, *X**i* denotes any point of population location points in V, Pop denotes the set of population individuals generated according to the population location *X**i* with a scale of 50, and Fitness denotes the fitness value of each fruit fly individual on the benchmark function, i.e., the Step6- Step7 denotes the new population location selection scheme, i.e., the locality sensitive hashing table model with roulette wheel selection method[12–15].