Evolutionary algorithm for optimized CNN architecture search applied to real-time boat detection in aerial images

When detecting boats in aerial images with neural networks, we have always been concerned with the execution time of these networks on the equipment on board the Unmanned Aerial Vehicle (UAV). Throughout its mission, the UAV captures images that must be processed in real time; for this purpose, a network optimized for execution time is essential. This article proposes an enhanced Network Architecture Search (NAS) method for finding time-optimized detection networks, for a given dataset, using an evolutionary algorithm. The search uses mutations as the mechanism of evolution, affecting the structure of the network and the hyper-parameters of its layers. Its original fitness function allows the choice of architectures that are not very greedy in terms of operations, specifically favouring small networks, which are fast to execute and quick to train, thus accelerating the search algorithm. Using this method, we were able to obtain detection networks with an improved mean Average Precision (mAP) compared to the initial network (parent) but with far fewer FLoating-point OPerations (Flops): a 68% reduction in operations. This yields a considerable gain in execution time, with 50 Frames Processed per Second (FPS) in an embedded environment on a drone.


Introduction
NAS has become widely used not only in computer vision but also in other domains, to find a whole scalable neural network architecture or the cells that structure such an architecture. In computer vision, architectures found by NAS outperform those developed by experts for tasks like image classification and object detection [1][2][3].
The space from which the architecture is taken is called the search space, as described in [4]. To explore it, a search strategy is needed. Many strategies have been used, based on random search, reinforcement learning, evolutionary algorithms, gradient-based methods, etc., which search for the architecture that performs well for the task. To evaluate an architecture's performance, a performance estimation strategy is also needed; training each architecture and measuring its performance on a validation set is one example of such a strategy.
Neuro-evolution, as presented by [5], tries to evolve neural networks using Genetic Algorithms. It was first used to find network weights as a replacement for gradient methods. It was also used to optimize the neural network architecture together with its weights using mechanisms inspired by biological evolution such as crossover, mutation and selection, as seen in [6].
Our proposal is a NAS method based on an evolutionary algorithm that reduces an initial known neural network in order to optimize it for soft real-time inference constraints. The application of this neural network model is the real-time detection of boats from an UAV on-board computer using a global shutter camera sensor. Inference time is of critical importance as the UAV has fixed wings and cannot hold position or reduce its speed below stall speed.
In this paper, we will first present some methods used in deep learning to get neural networks adapted to low computational resources and compare our method to other NASs. We will follow with a description of our evolution method and the experiments we performed. The resulting networks from the evolutions will be tested on the target environment and compared to other tiny architectures. Finally, we will discuss how to reduce the duration of the algorithm execution and some enhancements.

Related Works
The need to obtain optimal networks that run on different architectures, even those with limited computational resources, has attracted increasing interest in the scientific literature. The execution time of these networks, which is a function not only of the number of parameters and the number of addition and multiplication operations, but also of the architecture (environment) running them, is an important factor to take into consideration when designing these networks. Several studies have addressed these points in different ways.

Compression
Compression (with pruning or distillation): pruning, as explained in [7], is the reduction of the parameters of an existing network. One kind of pruning method removes parameters based on their magnitudes, which preserves accuracy. In [8], the authors prune the Yolov3 [9] model and reuse the original model weights (transfer learning between the same layers) for the pruned architecture. Distillation, as in [10], transfers the knowledge of a cumbersome model to a small model better suited for deployment.

Network design
It addresses designing scalable networks using layers with fewer parameters and fewer operations. We cite as examples [11,12], which exploit the depthwise separable convolution to reduce the number of parameters and the addition-multiplication operations that a traditional convolution uses. In [13], the authors use pointwise group convolution instead of pointwise convolution and introduce a new channel mixing technique. With the same number of parameters and fewer operations, their model achieves performance similar to [12].

Network architecture search
The NAS also proposes scalable architectures for different targets (GPUs, mobile). It uses many strategies to explore the search space; we cite as examples reinforcement learning [1,2,14], evolutionary algorithms [3,15] (faster convergence compared to reinforcement learning in the same search space), and gradient-based methods [16]. These methods usually search for architectures, blocks and cells on an image classification task, for a few epochs or with a small dataset. The final architecture (searched directly or built from the searched blocks or cells) is then trained from scratch and tested on the target dataset.
For our method, which is specific to detection, we were inspired by [17]. The authors use Yolov2 [18] as the initial parent and search for a new architecture by favouring certain mutation operations, such as convolutions with 1024 filters of size 3x3 (many parameters and Flops). The aim of their method is to enhance the mAP without worrying about speed, whereas our method looks for architectures with low Flops and high FPS. For this purpose, we use a configuration introducing fewer operations when adding convolution layers, and we include the Flops as a parameter in our fitness function.
Our method evolves the whole architecture except the last convolution layer and the region layer (Fig. 1). Whereas in [17], the backbone does not undergo any mutation to benefit from the transfer learning.
Since the objects in aerial images differ from those in standard images and pose greater challenges, searching within the part responsible for extracting image features (the backbone) is essential. Besides adapting the network to process this type of image, our method is able to reduce its parameters and operations, whereas with a fixed backbone, the number of operations of the generated networks cannot drop below that of the fixed part. To shorten the search, we also choose to start from an already established network, which is then optimized during the evolution. Finally, the method in [17] searches for multi-path networks whereas ours searches for a single path, as "the plain single-path architecture is a better choice for small networks" [19].
Compared to NAS methods intended for resource-constrained devices, our method has some differences and similarities. It is a single-step method, unlike [20,21] (an evolutionary and a gradient-based method respectively), which apply a subsequent optimization to the discovered architectures: the authors of [20] use a pruning technique, whereas those of [21] add mutations to networks that are likely to perform well with the final dataset. In our method, the searched networks do not undergo further optimisation steps, as they are searched with an aggressive fitness function directly on the target dataset. In [14,22], the authors include latency in the reward by measuring it on the target device, and in [23] by using pre-investigated operation latencies to predict the final architecture's latency. For our experiments, we chose to include the Flops beside the mAP, as we use only the server for evaluation; future work could use latency and integrate the target device in the search process. Method [21] uses a proxy dataset for evaluating the discovered networks, whereas in our method, like [22,23] (both for classification), we evaluate the networks on the target detection dataset.

Optimal CNN search
The aim of our work is to find an optimal architecture for a given detection dataset using an enhanced NAS based on an evolutionary algorithm that evolves an initial network. By optimal we mean an architecture leading to a model with a similar or improved mAP compared to the initial parent but with lower execution time. Algorithm 1 outlines the steps of the proposed evolution. The initial network is mutated using the mutation functions described below to generate C individuals that will constitute, with this network, the first generation of C + 1 individuals. Each mutation gives a new individual, and each individual of a generation is unique: the algorithm ensures that each individual appears only once in the generation by rejecting duplicates (Algorithm 1, line 12). These generated networks are then trained for a low iteration count on the training set and evaluated on the test set. The selection of the best individual in the generation is made using a fitness function: the individual with the highest fitness value becomes the parent of the next generation, and C new individuals are generated from it. For each generation, all the generated individuals and even their parent are trained, tested on the target dataset, and finally evaluated to choose the best individual. The algorithm thus continues to create generations and does not stop until the maximum number of generations G is reached. We will refer afterwards to the set of the best individuals taken from each generation as B.

Algorithm 1 Our proposed evolution
 1: C ⇐ 9                              ▷ New individuals per generation
 2: G ⇐ 50                             ▷ Max number of generations
 3: parent ⇐ tinyYolov2
 4: history ⇐ ∅
 5: while |history| < G do
 6:     generation ⇐ ∅                 ▷ Current generation
 7:     train(parent)
 8:     evaluate(parent)
 9:     add parent to generation
10:     while |generation| < C do
11:         child ⇐ mutate(parent)
12:         if child is not in generation then
13:             train(child)
14:             evaluate(child)
15:             add child to generation
16:         end if
17:     end while
18:     add generation to history
19:     parent ⇐ getbest(generation)   ▷ Returns the best individual according to its fitness
20: end while

The retraining of the generation parent is deliberate: it gives a chance to a child with similar performance to be selected in case of a decrease in parent fitness, and thus encourages diversity. If the parent maintains high performance, it is reused to generate other individuals. As the mAP of a parent can change when it is retrained, we maintain a reference to the best individual among all generations and update it when a model with improved performance is found.
To evaluate each candidate on the server, a fitness function is calculated using the mAP and the number of Flops expressed in billions (BFlops). We chose Flops instead of latency because its value depends only on the architecture and does not change with the memory and CPU load in a parallel environment, as is the case for latency. The function is formulated in (1) and represented in Fig. 2.
The target_flops equals 1 BFlops which is well below 5.344 BFlops, the Tiny Yolov2 number of operations. The current_flops and map are respectively the number of BFlops and the mAP of the current individual whose fitness is being calculated.
This function pushes the search to select networks with a smaller number of operations and penalizes those that are greedy, even if they have good accuracy. It therefore makes the algorithm gain speed by reducing the number of operations in the architectures and so reducing the training time. That said, the main goal is to find an architecture with a satisfactory mAP that executes in less time in production.
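For illustration, a toy fitness with the properties just described can be sketched as below. This is a placeholder shape, not the paper's actual Eq. 1 (which is not reproduced in this section); only the target_flops = 1 BFlops value is taken from the text:

```python
def fitness(map_value, current_flops, target_flops=1.0):
    # Illustrative shape only, NOT the paper's Eq. 1: reward mAP and
    # penalize individuals whose BFlops stay above target_flops.
    return map_value * (target_flops / current_flops)
```

With this shape, driving current_flops below target_flops inflates the fitness regardless of mAP, mirroring the paper's observation that the mAP influences the function less once the Flops target is reached.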
The different mutations used to create new individuals are listed below:

Add_convolution: Adds a new convolution layer with predefined parameters: 32 filters, with size, stride, and padding equal to 1, and a Leaky ReLU activation function. The position index at which the layer is inserted is chosen randomly between the first position in the network (included) and the index of the last convolution. Why a 1x1 convolution with 32 filters? This type of convolution was used for dimension reduction by [24]. During our experiments, we observed that this type of convolution with 32 filters introduces few operations (Flops) in general and, most of the time, enhances the mAP value when inserted early (at the bottom) in the network.

Add_pooling: Adds a new max pooling layer with a stride and a filter size of 2. The pooling layer can only be added between two convolution layers, to avoid a succession of pooling layers, which can introduce a loss of information. A new pooling layer cannot be inserted just before the last convolution in the network. The position index is chosen randomly; if the selected index is not suitable, another index is chosen.
Remove_convolution: Removes a convolution layer at a randomly chosen index, provided the layer is not between two pooling layers and its index is not that of the last convolution layer, which is used for detection. The first layer also cannot be removed when it is followed by a pooling layer. In these cases, another convolution layer is chosen.
Remove_pooling: Removes the pooling layer at a random index, without conditions.

Alter_filter_size: A randomly chosen convolution layer has its current filter size replaced with a different value selected from 1, 3, and 5.
Alter_filter_stride: A randomly chosen convolution layer has its current filter stride replaced with a different value selected from 1 and 2.
Alter_filter_number: A randomly chosen convolution layer has its filter number replaced with a value selected from 32, 64, 128, 256, 512, and 1024. By comparison, [17] uses the filter numbers 512, 1024, and 2048 and modifies only the head part.
For the last three operations, if the selected value equals the current one (filter size, stride, or number), the algorithm tries another convolution layer and another value until the operation can be applied.
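As a concrete sketch, the mutation dispatch and one of the mutation operations might look as follows. The dict-based layer encoding and the function names are our illustration, not the paper's implementation:

```python
import random

# Hypothetical layer encoding: each layer is a dict such as
# {"type": "conv", "filters": 32, "size": 1, "stride": 1}.
MUTATIONS = ["add_convolution", "add_pooling", "remove_convolution",
             "remove_pooling", "alter_filter_size", "alter_filter_stride",
             "alter_filter_number"]

def pick_mutation(weights=None):
    # Uniform choice in the first implementation; the improved one halves
    # the weights of add_pooling and alter_filter_stride.
    if weights is None:
        weights = [1.0] * len(MUTATIONS)
    return random.choices(MUTATIONS, weights=weights, k=1)[0]

def alter_filter_number(net):
    # Sketch of one mutation: pick a convolution layer at random and change
    # its filter count to a *different* value from the paper's list.
    idx = random.choice([i for i, l in enumerate(net) if l["type"] == "conv"])
    options = [v for v in (32, 64, 128, 256, 512, 1024)
               if v != net[idx]["filters"]]
    net = [dict(l) for l in net]          # mutate a copy, keep the parent intact
    net[idx]["filters"] = random.choice(options)
    return net
```

The same pattern (random index, validity check, retry on failure) extends to the other six operations.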

Experiments and results
We will discuss the choice of the initial parent used in the different experiments, and present the deep learning framework and the target dataset before detailing the two different implementations of our algorithm as well as the results. Raw output of the method is available as Online Resource 1.
We chose Tiny Yolov2 as the initial parent. It is a fast network compared to the normal Yolov2 [18], but less accurate. The choice of this network is justified by its simplicity and its architecture, constituted only of the well-known layers of a CNN: Convolution, Batch Normalization, Leaky Activation and Pooling. These layers are also implemented in the OpenCV version used by the drone running real-time detection, whereas some newer layers like Shrink and SiLU activations are not implemented or not tested for production in an edge environment. It is also a way to prove the algorithm's ability to find optimal and efficient architectures by starting from simple (single-path) networks and searching among these same networks.
The algorithm uses the Darknet framework for training and evaluating individuals: each generated individual has a Darknet configuration file listing the different layers of its architecture. This file is used to construct, train, and test the network with the Darknet framework. All the generated individuals use anchors to predict the bounding boxes. To train all the networks, including the initial Tiny Yolov2, we did not use pre-trained weights; all networks had their weights initialized randomly. For evaluation, each architecture is tested on the test set by resizing the test images to 416x416, the input size of all architectures. We use the following host computer configuration in Microsoft Azure for training and test: 12 Intel Xeon E5-2690 v3 vCPUs with 112 GiB RAM and an NVidia Tesla K80 GPU with 24 GiB RAM, running Ubuntu 18 LTS. The Darknet framework is compiled with CUDA, CUDNN, and OpenCV.
The dataset in [25], used for the experiments, contains aerial photographs of boats taken from a nadir orientation over diverse backgrounds, with boats of different sizes and shapes and under different weather conditions affecting the brightness and quality of the images. Some examples are shown in Fig. 10. It was chosen because of the difficulty of finding a network architecture that provides both good average precision and good execution time on it [26]; it is thus a good playing ground for optimizing a neural network through evolution. It is composed of 1,111 images of size 600x600 divided into two sets: 667 images for training and 444 for test.
For our experiments, we use a maximum generation number G equal to 50, a number of new individuals C equal to 9, and two different implementations. The first implementation chooses the mutation operation uniformly at random among the 7 operations described in Section 3. It allows an existing individual (from an old generation) to reappear in a new one. The second (improved) implementation has the following differences:
• The probabilities of the add_pooling and alter_filter_stride mutations become half of those of the other mutations.
• When an individual reappears in a new generation, its architecture is mutated until it gives birth to a new individual not belonging to any previous generation.
• An individual whose Flops exceed a max_flops threshold of 15 BFlops is ignored and its fitness becomes null.
• Although the maximum generation number G is set to 50, the evolution stops once an individual's fitness approaches the value of 1.
The utility of these changes is demonstrated in Sects. 4.1 and 4.2. The other training parameters are listed below: momentum = 0.9, decay = 0.0005, learning_rate = 0.001, policy = steps, steps = 10000, 15000, scales = 0.1, 0.1, and max_batches = 20000. The max_batches option defines the maximum number of iterations (batches) with which a network is trained. We chose the value of 20000 because, in a previous work [19], when training Yolov2, the model started stabilising its mAP at this iteration, even though that architecture was initialized with pre-trained weights and is very large compared to Tiny Yolov2. As a compromise, we kept this number of iterations with the same Learning Rate (LR) as in [25], which is 0.001, then reduced it at steps 10000 and 15000 to 0.0001 and 0.00001 respectively. From the first iteration to the 1000th, the LR grows from 0 to the value indicated in the learning_rate parameter and stays unchanged until the steps indicated above. We did not change the anchor values; we used the same ones as Tiny Yolov2 for all the architectures. An available Darknet augmentation technique, randomly altering size, exposure, saturation, and hue, is used during training with the values saturation = 1.5, exposure = 1.5, and hue = 0.1.

Figures 3 and 4 show the graphs, for the two experiments, representing the initial parent and the set of best individuals B according to the ratios of their mAPs and Flops relative to those of the initial parent (e.g. a Flops ratio of 0.2 means individual_flops / tiny_yolov2_flops = 0.2, and a mAP ratio of 0.9 means individual_mAP / tiny_yolov2_mAP = 0.9). The initial parent has mAP and Flops ratios both equal to 1. For some individuals, we added the number of the generation from which they were selected as the best.
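These hyper-parameters map directly onto the `[net]` section of a Darknet configuration file. The fragment below is a sketch under the values stated above; batch, subdivisions, and burn_in are assumptions (burn_in=1000 reflects the 1000-iteration LR ramp-up described in the text, and the batch size is not given for these two experiments):

```ini
[net]
# batch, subdivisions, and burn_in are illustrative assumptions.
batch=64
subdivisions=8
width=416
height=416
momentum=0.9
decay=0.0005
learning_rate=0.001
burn_in=1000
policy=steps
steps=10000,15000
scales=.1,.1
max_batches=20000
saturation=1.5
exposure=1.5
hue=.1
```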

First implementation experiments
In both experiments, some individuals slightly outperformed Tiny Yolov2 in terms of mAP while reducing the number of Flops. Others had small decreases in mAP but with a significant reduction in the number of Flops. Still others recorded very significant drops in Flops, which considerably impacted their mAP values.
To get an idea of the evolution key indicators and where these individuals are located along the evolution, we have drawn, in Figs. 5 and 6, the curves of mAP ratio, Flops ratio and Fitness/5 for the set of best individuals B across generations for the two experiments (we divide the fitness value by five to make its curve fit in the graph).
For both experiments, throughout the evolution, the Flops curve fell most of the time and stabilized around generation 15 for E1 and around generation 20 for the second experiment E2. The fitness curve continued to increase until some stabilization occurred from generation 20 onwards at various points in E1. This is due, on the one hand, to some individuals having been selected at least three times in succession as parents for the following generations and, on the other hand, to the similarities in performance between the best individuals of these generations, even when they differ. For E2, fitness went through mainly two stages: a first stage where the curve followed an upward slope, and a second where it wavered between two levels. The reasons for this last stage are parent reselection and performance similarity (as in the first experiment), in addition to two parents that skipped generations before reappearing in two new ones.

Regarding the mAP, it remains relatively stable during the first generations, but for different durations in the two experiments. Three individuals recorded significant mAP decreases compared to those around them: the individuals of generations 6 and 14 of E1 and individual 16 of E2, which ended up with a succession of layers with strides equal to 2, so that all their final feature maps were reduced by half (fewer bounding box predictions). The generations that followed were able to rectify these low values by removing a max pooling layer from these networks. Thereafter, the curves stabilized on several occasions but at low mAP values. These are networks that do not generalize well and suffer from underfitting; their losses remain high during training.
The reappearance of parents in later generations, especially at the end of the evolution, has no meaningful impact on the rest of the evolution. Applying mutations to these individuals, whether they appear early, in the middle or late, might have a better impact, as it would allow the exploration of new individuals and promote diversity.
The selected fitness function provides the best compromise between mAP and operation count during the first generations: it first maintains a good mAP while drastically reducing the network operation count, which provides a large spectrum of networks fit for real-time applications. Afterwards, it continues to reduce the network, but with a significant loss in mAP. To save time, the evolution must stop once the mAP is no longer interesting or the fitness has approached the value 1. This is explained by the fact that, when the number of Flops is below target_flops, the value of the fitness grows and the mAP no longer influences the function as much as before.
Another remark: as long as the number of Flops has not reached the target value, a decrease in mAP influences the fitness a little more, causing it to decrease. To give the mAP more time to improve, we can delay progress towards the target_flops value. For this, we can reduce the probability of occurrence of the mutations that quickly reduce the number of operations, in order to allow the hyper-parameters of the individual's layers to adapt.
With the current implementation, we were able to obtain optimized networks with a mAP similar to the initial network but with fewer operations. There is even an individual from E1 with an improved mAP and a reduction in operation count of up to 58%, and another from E2, also with an improved mAP, with an operation reduction of 28% relative to the initial parent.

Improved implementation experiment
For this experiment, based on the second implementation, we changed some training parameters. The concerned parameters and their new values are: batch = 256, mini_batch = 32, learning_rate = 0.01 and max_batches = 6000. The LR does not change at any step after it reaches the learning_rate parameter value at the 1000th iteration. The mini batch size was first fixed at 32, but when an out-of-memory error occurs, the mini_batch value is divided by 2 until the training data fits in memory, and the algorithm then continues the training of the individual. This experiment will be referenced as E3.
Figure 7 is the graph representing the outcome of the evolution in E3. As previously, we use for each individual the ratios of its mAP and Flops to those of the initial parent. Here the fitness value has not been rescaled. We add the execution time curve (in seconds), rescaled by 10.
Three individuals had a higher mAP than Tiny Yolov2: their mAP ratios are greater than 1. Others are similar or close to the initial parent. All the individuals, without exception, have a Flops ratio well below 1. Here too, we have very good alternatives to the initial parent, with more individuals that improve the mAP while considerably reducing the number of operations and the execution time. A peculiarity of individual 5, which has the highest mAP ratio and a Flops count of 0.41 times that of Tiny Yolov2, is that its final feature map has double the resolution of Tiny Yolov2's. Each fiber (prediction vector) of the final feature map predicts 5 bounding boxes, and since the resolution of this individual's feature map has doubled (more precisely, the width and the height have both doubled), the number of bounding box predictions has increased and, with it, the processing time of these bounding boxes. Hence the high execution time, which nevertheless does not exceed that of Tiny Yolov2.
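The effect of the doubled feature map on the prediction count can be made concrete. Assuming the standard Tiny Yolov2 grid of 13x13 for a 416x416 input (a downsampling factor of 32), doubling both the width and the height of the grid quadruples the number of predicted boxes:

```python
def num_predictions(grid_w, grid_h, boxes_per_cell=5):
    # Each fiber (prediction vector) of the final feature map
    # predicts boxes_per_cell bounding boxes.
    return grid_w * grid_h * boxes_per_cell

baseline = num_predictions(13, 13)   # standard 416/32 = 13 grid: 845 boxes
doubled = num_predictions(26, 26)    # width and height both doubled: 3380 boxes
```

Four times as many boxes must then pass through decoding and non-maximum suppression, which explains the higher post-processing time of individual 5.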
To highlight the contribution of the modifications made in this version of our algorithm, we present the evolutions of the different experiments together in Figs. 8 and 9. For the mAP, the E3 experiment improved the individuals over a greater number of generations compared to the others before slightly reducing their mAPs. The evolution in E3 did not generate low-performing individuals like individual 6 of E1 (highlighted earlier). For the Flops, E3 presented individuals with fewer Flops from the first generations, yet their mAPs are mostly higher than those of E1. Afterwards, all curves followed the same slope.
With the new implementation, we obtained optimized networks with a mAP similar to the initial network but with fewer operations, and even three individuals with improved mAPs that introduce reductions in operations ranging from 59% to 68% relative to the initial parent.

On board test
The number of Flops does not correlate directly with execution time: its scope is limited to the operations performed by the convolution and pooling layers, and the hardware and the implementation also have their impact on the execution time. Consequently, we were interested in the execution time of the models on the target edge device. We used the Nvidia Jetson Nano with OpenCV's DNN API and CUDA and chose a batch of 11 images to calculate the FPS value for the Nano during inference. For a preview of results in other robotic environments, we also conducted tests on a Raspberry Pi 4B with a batch of 8 images using OpenCV's DNN API. Table 1 presents the performances of three of our generated networks. These networks were trained for many more iterations to enhance their mAPs. The weights calculated during evolution for each network were used for weight initialization, and the supplementary training continued from the iteration at which the evolution training stopped; its cost is minimal, lower than starting from scratch. Other State Of The Art (SOTA) architectures were added to Table 1 to give an idea of their Flops. Some of them were trained and tested on our dataset for comparison with the generated networks.
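The FPS measurement itself reduces to timing repeated forward passes. A minimal sketch, where `infer` stands in for the network forward pass (for example, OpenCV's `net.forward()` after loading a Darknet model with `cv2.dnn.readNetFromDarknet`) and a few warm-up runs are excluded from the timing:

```python
import time

def measure_fps(infer, images, warmup=2):
    """Time a batch of inferences and return frames per second.

    `infer` is any callable taking one image; warm-up calls let lazy
    initialization (CUDA kernels, memory allocation) happen off the clock.
    """
    for img in images[:warmup]:
        infer(img)
    start = time.perf_counter()
    for img in images:
        infer(img)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed
```

On the Jetson Nano, `images` would be the batch of 11 test images mentioned above; the helper itself is device-agnostic.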
Compared to Tiny Yolov2, we were able to significantly increase the FPS on the on-board equipment with our optimized networks while obtaining a mAP close to the parent's. SOTA networks provide better accuracy, but at the cost of significantly more Flops than we can afford in our real-time setup. Indeed, when the drone takes a photo, it is about 20 Mpx; if the network E3.G4.I7 is tested on such an image after cutting it into 600x600 patches, it processes these patches in 1.1 s, which is satisfactory for our use case. Some detection examples with E3.G4.I7, Tiny Yolov2, and Yolov2 are given in Fig. 10.

Discussion
We can conclude that the standard Tiny Yolov2 network architecture is not the best fit for the small dataset we used. This is demonstrated by the successful convergence of our evolution method towards better-performing network architectures; the early, quick reduction of the network operation count, while still maintaining a good mAP, confirms this conclusion. It should be noted that the current fitness function reduces the network size aggressively, and this reduction may cause an imbalance with respect to the desired increase in mAP. Future work will involve experimenting with a fitness function that changes its selection criteria as the evolution progresses, the aim being to favour reducing the network in the early stage, then favour increasing the mAP progressively until the evolution ends.
Regarding the evolution's computational cost, we used G = 13 generations, each generation having N = C + 1 = 10 individuals. Each of these individuals goes through T = 6000 training iterations and one inference test. This sums up to about G × N × T = 780k batches processed with batch size B = 256. The most important factor determining the evolution's computational cost is the number of iterations of each training. We need to determine the number of iterations that provides a reliable fitness evaluation of an evolution. As the Flops count is known and does not depend on iterations, we focus on the mAP and the number of iterations necessary to determine the mAP difference between any two individuals. In our experiment, we stored the model weights every 1000 iterations. The question is: for any 2 different individual neural networks X and Y from the set of all individuals created during the evolution, what is the probability that the following implication of mAP Conservation (mAPC) is true: if mAP_X(i_2) ≤ mAP_Y(i_2) then mAP_X(i_1) ≤ mAP_Y(i_1), and if mAP_X(i_2) ≥ mAP_Y(i_2) then mAP_X(i_1) ≥ mAP_Y(i_1), where mAP_X(i) is the mAP value at i iterations for individual X, i_2 = 6000, and i_1 ∈ I_1 = {1000, 2000, 3000, 4000, 5000}. We ran this analysis over a set of 121 neural network architectures from the third experiment for each iteration count in I_1. The outcome is that the mAPC starts at 72.9% for iteration 1000 and then stagnates above iteration 2000 (the mAPC is 85.7% at 2000 and 87.9% at 5000). This means that the probability that an individual network performs better at a higher iteration count, while we interrupt the training at 2000 iterations, is 14.3%. Another mAPC analysis with a different setup (G = 50, T = 20000, B = 64) on 432 neural networks from the first experiment provided an mAPC of 95% between iterations 10000 and 20000.
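The mAPC criterion can be computed directly from the stored checkpoint mAPs. A sketch, assuming `map_early[k]` and `map_final[k]` hold the mAP of network k at iterations i_1 and i_2 respectively (the array names are ours):

```python
from itertools import combinations

def mapc(map_early, map_final):
    """Fraction of network pairs whose mAP ordering at the final
    iteration count already holds at the earlier iteration count."""
    kept = total = 0
    for x, y in combinations(range(len(map_final)), 2):
        total += 1
        if map_final[x] <= map_final[y] and map_early[x] <= map_early[y]:
            kept += 1
        elif map_final[x] >= map_final[y] and map_early[x] >= map_early[y]:
            kept += 1
    return kept / total
```

Running this over the checkpoints of all individuals, for each i_1 in I_1, reproduces the kind of analysis described above.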

Conclusion
We have been able to efficiently use a NAS-based evolutionary algorithm to adapt a known convolutional neural network, Tiny Yolov2, to real-time inference without significant loss in mAP for some generated individuals: we obtained an individual with an operation count of 1.706 BFlops instead of the original 5.344 BFlops of Tiny Yolov2. We observed that, with the selected fitness function, the first generations provide the best compromises between precision and operation count. Fitness is one of the parameters that drive the evolution, and we designed a function that mostly adapts the network to the real-time constraint. The choice of mutations, and the changes made to their probabilities and to the network training parameters, impacted the evolution and its outcome and provided improved models. We then measured the impact of reducing training iterations on the reliability of the fitness function. However, there is still room for improvement.

Author Contributions
The authors confirm contribution to the paper as follows: study conception, design and coding: IZ; data collection: IZ, YM; analysis and interpretation of results: IZ, YM; draft manuscript preparation: IZ, YM. All authors reviewed the results and approved the final version of the manuscript.
Funding This work was partially supported by AtlanSpace through the provision of cloud services.

Data availability
The dataset that supports the findings of this study is available from the corresponding author on reasonable request.

Code availability The source code of the evolutionary algorithm that is the subject of this work is not publicly available due to partial license detention by AtlanSpace, but it is available from the corresponding author on reasonable request and with the permission of AtlanSpace.

Declarations
Conflict of interest Authors Jamal Berrich and Toumi Bouchentouf declare they have no financial interests. Author Ilham Zerrouk was an employee of the company AtlanSpace until August 2021. Younes Moumen, Wassim Khiati and Ali El Habchi are still employees of AtlanSpace at the date of submission.
Ethical approval This article does not contain any studies involving human participants/animals performed by any of the authors.