Biological survival optimization algorithm with its engineering and neural network applications

This study proposes a novel and lightweight bio-inspired computation technique named the biological survival optimizer (BSO), which simulates the escape behavior of prey in the natural environment. The algorithm consists of two important phases, an escape phase and an adjustment phase. Specifically, in the escape phase, each search agent updates its location using the best, the worst and a neighboring individual of the population. The adjustment phase is implemented using the simplex algorithm to search for a better location for the worst agent within a small region. The effectiveness of BSO is validated on the CEC2017 benchmark problems, three classical engineering structural problems and neural network training models. Simulation comparison results, considering both convergence and accuracy, show that BSO has competitive performance compared with other state-of-the-art optimization techniques.


Introduction
Global optimization is the process of searching for the best solutions of a defined problem, with maximum or minimum objective functions, involved in many different areas. With the rapid development of science and knowledge, optimization problems have acquired diverse characteristics, such as non-convexity, discontinuity and high dimensionality; therefore, the demand for high-quality optimization algorithms is more urgent than before. Inspired by different biological mechanisms or natural phenomena, scholars have designed diverse optimization algorithms. In general, these optimization algorithms can be grouped into derivation-based and stochastic algorithms. The former utilizes gradient information of the objective function to build the search direction for obtaining better solutions. For some simple or ideal models, derivation-based algorithms can obtain competitive results at low computational cost. However, they also show several disadvantages, such as premature convergence, gradient dependence and instability on complicated or difficult optimization problems. The latter can deal with the drawbacks mentioned above through an individual cooperation mechanism, since stochastic algorithms use a search population of random solutions sampled in the feasible region to approximate better candidate solutions.
The main features of a stochastic algorithm are exploration and exploitation during the whole optimization course (Yang 2010). Exploration aims to help the search agents explore promising areas of the feasible space broadly. On the contrary, exploitation refers to guiding the population agents toward the best member in the promising areas obtained in the exploration phase. It is worth mentioning that a good trade-off between exploration and exploitation should be maintained during the search process, since favoring exploration helps improve diversity for local optima avoidance, while emphasizing exploitation leads to faster convergence.
Regardless of the many successful optimization algorithms, a question arises: are more effective methodologies still needed in the evolutionary computation area? The no free lunch (NFL) theorem (Wolpert and Macready 1997) logically proves that there is no single optimization method capable of solving all optimization tasks. This theorem encourages the proposal of new and efficient methods with the hope of solving a wider range of problems; so, the answer is positive. In addition, it is generally true that conventional optimization paradigms have strong local search capability in the objective space but usually suffer from premature convergence, whereas population-based stochastic search techniques tend to escape from local optima easily. This motivates us to investigate whether a new heuristic algorithm can be designed by combining stochastic operators and a local search algorithm. Finally, many interesting phenomena and behaviors in nature exhibit intelligent characteristics and provide many inspirations. Motivated by the above perspectives, in this paper the authors attempt to draw inspiration from biological survival behavior and propose a new search methodology.
The main work of this paper is as follows. On the one hand, two different behaviors, named the escape phase and the adjustment phase, are extracted by simulating the escape behavior of the Oryx and are modeled mathematically; then, the biological survival optimizer (BSO) is proposed. On the other hand, unlike other stochastic search algorithms, BSO is a parameter-less method and utilizes the simplex algorithm as a local generation strategy for searching promising individuals. The advantage of this operator, which is widely used in various algorithms, is that it not only performs effective local search but also generates agents that are better than the previous ones.
The following provides the basic organizational structure of this paper: Sect. 2 provides the literature review of the existing optimization algorithms. The background and the principles of BSO are summarized in Sect. 3. In Sect. 4, CEC2017 test problems are employed to evaluate the effectiveness of BSO. In Sect. 5, the proposed algorithm is utilized to solve three constrained engineering design issues. Neural network training models and data analysis are provided in Sect. 6. Section 7 concludes the main work of this paper.

Literature review
Over the past two decades, inspired by various laws or phenomena of nature, many heuristic optimization methods have been designed and applied to solve different practical problems successfully. According to the inspiration, the existing optimization methods can be classified into three different categories: evolution-based, physics-based and swarm-based algorithms.
Evolution-based algorithms are a class of methods inspired by the evolution mechanisms of nature. The genetic algorithm (GA), a representative method in this branch, mimics the Darwinian evolution theory (Akcan 2018; Sami et al. 2019).
Crossover and mutation operators are the main mechanisms for generating new offspring individuals and improving the quality of the search population over the course of iterations in GA. The biogeography-based optimizer (BBO) imitates the natural biogeography phenomenon in which biological species tend to migrate to better habitats (Ali et al. 2019). Other algorithms include the evolution strategy (ES) (Rafal 2020) and differential evolution (DE) (Pinar and Erkan 2019).
Physics-based algorithms, the second category, simulate physical phenomena or rules, in contrast to the evolution-based algorithms. The gravitational search algorithm (GSA) is designed by imitating Newtonian gravity and the laws of motion: all agents update their positions according to the gravitational attraction forces between them during iteration (Ricardo et al. 2019); the heavier the mass, the bigger the attractive force. Big-bang big-crunch (BBBC) imitates the big bang and big crunch principles, which are utilized to generate random directions and gather the search agents toward the best point obtained so far (Doddy et al. 2018). Other algorithms include the multiverse optimizer (MVO), which simulates the theory of the multiverse; the artificial raindrop algorithm (ARA), which simulates the phenomenon of natural rainfall (Jiang et al. 2017); the mine blast algorithm (MBA), which simulates the concept of mine bomb explosion; states of matter search (SMS), which derives from the simulation of the states-of-matter concept; and ray optimization (RO), which simulates Snell's law of light refraction (Amin et al. 2020).
Swarm-based algorithms, the third category, mimic the swarm behaviors of animals in nature. Particle swarm optimization (PSO), one of the most popular algorithms, imitates the foraging behavior of birds (lsiet and Mohamed 2019). PSO utilizes multiple agents to generate new search individuals: each search agent updates its position considering both the best position it has obtained so far and the best position found by the swarm so far. The slime mold algorithm is based on the oscillation mode of slime mold in nature and has been used for optimal parameter tuning of cost-effective fuzzy controllers (Precup et al. 2021). The bat algorithm (BA) imitates ultrasonic echolocation foraging behavior, which is used for sensing distance and differentiating between prey and obstacles; two update formulas and a random walking strategy are the core parts of the algorithm (Liu and Wu 2018). The grey wolf optimizer (GWO) imitates the leadership hierarchy and hunting mechanism of grey wolves: the pack is classified into four hierarchies (alpha, beta, delta and omega), and adaptive control parameters and different computation operators are implemented to perform optimization (Lu et al. 2018). Other algorithms include moth-flame optimization (MFO), which simulates the flight mechanism of moths (Yamany et al. 2015); cuckoo search (CS), which simulates brood parasitism behavior (Joshi et al. 2017); and the biology migration algorithm (BMA), which simulates the biology migration phenomenon in nature (Zhang et al. 2019).
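The PSO position update just described can be sketched as follows. The inertia weight `w` and acceleration coefficients `c1`, `c2` are typical textbook values, not parameters taken from the cited references:

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One PSO iteration: each particle is pulled toward its own best
    position (pbest) and the swarm's best position (gbest)."""
    rng = np.random.default_rng() if rng is None else rng
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v, v
```

This contrasts with BSO, introduced later, where each agent is guided by the best, worst and neighboring individuals rather than by a velocity memory.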
In addition to the algorithms mentioned above, the emergence of optimization problems in different fields has provided a solid foundation for the research of algorithms, for example, engineering-designed problems (Hafez et al. 2016), all-colors shortest path model (Carrabs et al. 2021), Mm-Wave massive MIMO system (Singh and Shukla 2022), neural network training model (Zhang and Wang 2017), and so on.
This review shows that many population-based algorithms have been proposed so far, and most of them are inspired by hunting and search phenomena. As noted in the introduction, conventional optimization algorithms have strong local search capability in the feasible space but usually suffer from premature convergence, whereas population-based stochastic search techniques tend to escape from local optima. This motivates us to investigate whether a new heuristic algorithm can be designed by combining stochastic operators and a local search algorithm. In addition, to the best of our knowledge, there is no swarm-based method imitating the escape behavior of the Oryx, which motivates us to design a new optimization technique by modelling this escape behavior mathematically for solving benchmark functions and different problems.

Biological survival optimizer (BSO)
This section includes two segments: on the one hand, the background of BSO is described broadly; on the other hand, the proposed calculation model of BSO is presented in detail.

Background
The Oryx is one of the most intelligent animals in nature. In the real world, escape behavior is a common phenomenon, in which Oryx need to cooperate and exchange information for survival. On the one hand, to improve its probability of success, the predator usually tends to attack the weak side of the Oryx group. On the other hand, the leaders of the Oryx group guide the other group members away from the hunter. Besides that, individuals should keep close to their neighbors while avoiding collisions with them. We use a topological structure to simulate the interactive relationships between individuals, no matter how close or how far apart those individuals are. However, some Oryx individuals may not escape from the hunt successfully; in other words, each individual has a certain escape probability. The stronger the individual, the higher its probability of success. After getting away from the threat, the group arrives at a new place temporarily. To avoid being attacked easily, the leaders must guide the individuals on the weaker side toward a better and more suitable location (Stephen et al. 2016; Mads et al. 2017; Braha 2012; Ballerini et al. 2008; Abdulhakeem and Mohammad 2019). Figure 1 presents four inspiration drawings of the whole process: (a) an image of the Oryx group; (b) the Oryx group getting away from a chasing lion; (c) the leaders helping to adjust the position of the worst individual; (d) arriving at a safe place.
We have to admit that there are many behaviors and rules in the real world. However, for simplicity, the proposed algorithm considers only the following rules.
1. The weaker side of the Oryx group is easily attacked by the predator. (Rule 1)
2. Each member has an escape probability value. (Rule 2)
3. Each search individual updates its position according to the best individual and its neighbors. (Rule 3)
4. The two best members guide the individual on the weaker side to modify its current position. (Rule 4)

Calculation model
In this section, the calculation models including escape phase and adjustment phase are first presented through converting the above rules properly. Then, we outline the pseudocodes and discuss the basic principle of BSO.

Group generation
It is assumed that the search population (Pop) comprises N feasible solutions (X_i, i = 1, 2, ..., N), which are generated randomly by uniform sampling within the bounds:

X_i = LB + r · (UB − LB),  i = 1, 2, ..., N

where N and D refer to the number of search members and the dimension of the problem, respectively; t denotes the current generation number; r denotes a random number generated from (0, 1); LB and UB are the lower and upper boundaries of the variables, respectively.
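A minimal sketch of this uniform initialization (the original Eqs. (1)–(2) are not reproduced in this excerpt, so the standard box-uniform form is assumed):

```python
import numpy as np

def initialize_population(N, D, LB, UB, rng=None):
    """Sample N agents uniformly inside the box [LB, UB]^D."""
    rng = np.random.default_rng() if rng is None else rng
    r = rng.random((N, D))     # r ~ U(0, 1) for every coordinate
    return LB + r * (UB - LB)  # X_i = LB + r * (UB - LB)
```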

Escape phase
This section simulates the behavior in which the leaders help the other group members away from the hunter. Since the predator usually attacks the weak side of the population (Rule 1), the individual on the weak side is easily caught by the hunter. In BSO, we use the roulette strategy to set a probability determining whether the current agent will be caught (Rule 2). Obviously, in terms of fitness, the better the individual, the lower this probability. If the current individual is caught, it is replaced by a new one. Besides that, Rule 3 states that each agent modifies its position according to the best individual and its neighbors. To define the neighborhood structure of an individual, a ring topology structure (see Fig. 2) is utilized in this paper (Kennedy and Mendes 2002). If the index of the individual is i, the index of the selected neighbor is i + 1; if the index of the agent is N, the index of the selected neighbor is 1. Here, we assume that the best individual has the best fitness, and the worst member the opposite. The following provides the calculation equations.
where each component of δ_i and φ_i is generated from (0, 1), pop_best and pop_worst are the best and worst individuals of the population, respectively, and X_neigh,i is the neighbor of X_i. Figure 4a depicts the pursuer attacking the animal population (black dots). In Fig. 4b, the best individual (pop_best) is shown in blue, the worst (pop_worst) in red, and the current individual (X_i) in purple.
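Since the escape-phase update equation itself is not reproduced in this excerpt, the following sketch only illustrates the described ingredients: the ring-topology neighbor, the random coefficient vectors δ and φ, and guidance by the best, worst and neighboring individuals. The exact update form used below is an assumption:

```python
import numpy as np

def escape_phase(pop, fitness, LB, UB, rng=None):
    """Hypothetical escape-phase sketch: each agent moves using the best,
    the worst and its ring-topology neighbor (the published update
    equation is not reproduced here, so this form is an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    N, D = pop.shape
    order = np.argsort(fitness)              # minimization assumed
    best, worst = pop[order[0]], pop[order[-1]]
    new_pop = pop.copy()
    for i in range(N):
        neigh = pop[(i + 1) % N]             # ring topology: i -> i+1, N -> 1
        delta, phi = rng.random(D), rng.random(D)
        step = delta * (best - pop[i]) + phi * (neigh - worst)
        new_pop[i] = np.clip(pop[i] + step, LB, UB)
    return new_pop
```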

Adjustment phase
This phase mainly simulates Rule 4: the leaders guide the members on the weak side of the population toward a safer place. This action is commanded by the two best individuals of the population. From the view of optimization, this phase aims to improve the worst individual with the assistance of the two best solutions of the population; therefore, the search is restricted to a small region defined by these three solutions. As shown in Fig. 4c, the best member and the second-best agent are labeled in blue and pink, respectively. The worst individual (red dot) moves toward a new place with the help of the other best members using the simplex algorithm, arriving at a new position as shown in Fig. 4d.
In other words, the movement within this region may ensure that the search individual finds a better position. Finally, after the above-mentioned phases, the initial population moves toward a new domain, and the quality of the population improves over the course of generations. The main pseudocode of BSO is summarized in Algorithm 1.
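The adjustment step can be illustrated with a Nelder–Mead-style reflection/expansion/contraction move on the worst individual, using the paper's base parameters α = 2 (expansion) and β = 0.5 (contraction). This is a sketch of the idea rather than the full procedure given in the paper's Appendix A:

```python
import numpy as np

def adjustment_phase(best, second, worst, f, alpha=2.0, beta=0.5):
    """One simplex-style step: reflect the worst point through the centroid
    of the two best; expand if the reflection improves, otherwise contract.
    (Illustrative sketch; the full procedure is in Appendix A.)"""
    centroid = (best + second) / 2.0
    reflect = centroid + (centroid - worst)              # reflection point
    if f(reflect) < f(worst):
        expand = centroid + alpha * (centroid - worst)   # expansion, alpha = 2
        return expand if f(expand) < f(reflect) else reflect
    return centroid + beta * (worst - centroid)          # contraction, beta = 0.5

# Example on the sphere function f(x) = sum(x^2)
f = lambda x: float(np.sum(np.asarray(x) ** 2))
```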

Discussion
The purpose of this section is to discuss the differences and connections between the proposed algorithm and the well-known particle swarm optimization. The similarities are obvious: both are inspired by nature and utilize a population of candidate solutions, initialized randomly in the search space, that proceeds toward the global optimum. At the same time, there are great differences between them, reflected in the following two aspects. On the one hand, their inspiration backgrounds are different: PSO is derived from the foraging behaviors and characteristics of birds, while the proposed BSO is derived from the escape behaviors and strategies of the Oryx in nature. On the other hand, the main distinction lies in the algorithm operators and iteration mechanisms during the evolution process: PSO mainly uses the guidance of local and global optimal solutions to update the speed and position of individuals, whereas BSO mainly consists of two parts, the escape phase and the adjustment phase, described below in detail.
The first phase consists of a ring topology structure, the roulette strategy and a stochastic operator. The ring topology, inspired by the notion of neighborhood, connects individuals by index, so that the first individual is the neighbor of the last individual and vice versa. It can be regarded as a neighborhood technique that aims to maintain the diversity of the solutions and improve the exploration capability of the algorithm. The roulette strategy is very close to Darwinian evolution theory, in the sense that the worst individual is easily eliminated while better members survive. We use this strategy to keep the elite solution and to determine, with a probability, whether a worse individual will be replaced by a new one. It is generally known that to prevent premature convergence during the evolution process, it is necessary to sample the whole search space systematically and generate new solutions that are diverse enough. Therefore, the stochastic search operator is used to improve the diversity of the population and guarantee that each agent can move toward other places randomly.

Algorithm 1 Biological Survival Optimizer
Input: N: the population size; D: the dimension of the optimization problem; UB, LB: the upper and lower bounds of the variables; control factors: expansion factor (α) and contraction factor (β); MaxIter: the termination criterion.
1: Generate an initial population using Eqs. (1) and (2)
2: Evaluate the objective function values
3: while the termination criterion is not satisfied do
4:   Sort the population based on fitness
5:   Set the probability (Pr) of each agent using roulette
6:   (Escape phase) Find the best solution (pop_best) and the worst solution (pop_worst); find the neighbor of each agent; update each agent's position
7:   (Adjustment phase) Find the two best members and the worst individual of the population; execute the simplex algorithm (see Appendix A) based on these three solutions
8: end while
Output: the best candidate solution
The stochastic operator not only generates potential solutions distributed across the search space but also improves the probability of finding the optimal solution. The second phase is implemented by the simplex algorithm, which simulates Rule 4. Another important reason for this choice is that the simplex algorithm can generate better solutions in each generation. Owing to this characteristic, the simplex algorithm is able to guide the search agents toward better places from generation to generation (see Fig. 4 for the whole search course of BSO). It not only helps strengthen the local search but also provides an accurate search direction during the optimization process. Besides that, BSO is a parameter-less method except for the two basic parameters used in the simplex algorithm, which makes the algorithm simple and convenient.

Numerical experiments
This section aims to evaluate the effectiveness of BSO by solving a set of CEC2017 test problems with various characteristics.

Benchmark problems
The CEC2017 benchmark problems are well suited to evaluating the performance of an optimization algorithm comprehensively, since they have diverse features including multimodality, non-separability, asymmetry and different variable subcomponents. According to the literature (Awad et al. 2016), these functions can be classified into four categories: unimodal (f1−f2), simple multimodal, hybrid (f10−f19) and composition (f20−f29) parts. More details about the basic characteristics of these problems can be found in the corresponding literature.

Evaluation indicator
This study adopts the following indicators for evaluating the effectiveness of algorithm.
• f_mean and std refer to the average value and standard deviation of the function error, respectively; • Wilcoxon's rank-sum test (Wang et al. 2011) determines whether there is a statistical difference between two algorithms on each problem at the 0.05 significance level: a p-value less than 0.05 means the performance of the two competing methods is statistically different at the 95% confidence level (h = 1); otherwise, there is no significant difference (h = 0). Besides that, the relevant comparison results are recorded at the bottom of each table; ‡ and † indicate that the performance of BSO is better than and worse than that of the corresponding algorithm, respectively, and a third symbol marks similar performance.
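The rank-sum comparison can be reproduced with SciPy's `ranksums`; a minimal example on two synthetic samples of 30 run errors (the means and spreads below are arbitrary illustration values):

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
errors_a = rng.normal(1.0, 0.1, 30)   # 30 independent runs of algorithm A
errors_b = rng.normal(2.0, 0.1, 30)   # 30 independent runs of algorithm B

stat, p = ranksums(errors_a, errors_b)
h = 1 if p < 0.05 else 0              # 1: significant difference at the 5% level
```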

Parameter settings
The stopping condition is the maximum number of iterations (MaxIter), which is set to 10,000 for all algorithms. Thirty independent runs are carried out to reduce computation error. In addition, eight existing optimization algorithms are compared with BSO; their parameters are set according to the corresponding references, e.g., DEDVR (Ghosh et al. 2020):
• DEDVR: F = 0.8, CR = 0.5;

Experimental results
The experimental results calculated by all algorithms are recorded in Tables 1, 2, 3 and 4.

Analysis of statistical and exploitation performance
For the unimodal problems f1−f2, it can be observed from Table 1 that although the proposed algorithm has the same f_mean and std values as CMAES on f2, CMAES has the best f_mean value and standard deviation with respect to its peers. According to the obtained p-values and h-values, BSO performs significantly better than the other seven compared algorithms and performs similarly to CMAES on f2. Figure 5 plots the convergence graph for f1; as may be observed, the convergence of BSO is slightly slower than that of QGDA and GLFGWO at the beginning of the iterations, but it ranks second and finally outperforms the other competitors. The bar charts in Fig. 7 also illustrate that BSO is strongly competitive in exploitation compared with the other optimization algorithms. For the simple multimodal problems, BSO performs better than the other compared algorithms in most cases according to the Wilcoxon's test values. This is ascribed to its powerful exploration capability: BSO has the simplex algorithm and two main stochastic search operators, which are designed to guarantee that each search individual is able to move toward potential regions randomly and generate promising offspring agents. Besides that, it can be observed from Fig. 5 that BSO has better convergence performance with respect to its competitors. The bar charts in Fig. 7 also show that BSO is not the best in solving f4, but it is clearly superior to the other competitors. Such evidence indicates that the proposed algorithm has promising exploration ability for avoiding local optima.

Analysis of statistical and balance performance
Different from the above two types of functions, the hybrid problems (f10−f19) usually require various techniques to optimize the different subcomponents partitioned on the variables. The statistical and comparison results recorded in Table 3 demonstrate that BSO has better performance than its competitors on f10, f13 and f14, performs slightly worse than DEDVR on five test problems and performs similarly to MGWO on f11, f15 and f16. Besides that, Fig. 6 provides three evolution graphs; it appears that although BSO is slightly worse than CMAES and MGWO, it is significantly better than the other compared methods over the course of iterations. The last category is the composition functions (f20−f29), which are non-separable and asymmetrical with different properties around different local optima; they are very difficult to optimize effectively and are well suited to evaluating the balance between exploration and exploitation. It can be summarized from the experimental data provided in Table 4 that the proposed algorithm obtains the best results on the f21, f23 and f26 problems. Although BSO is slightly weaker than DEDVR and MGWO on f24 and f29, respectively, there are no great differences between them according to the Wilcoxon's test results. From the given convergence graphs in Fig. 6, BSO has the best convergence rate compared with its peers. Figure 8 gives the bar charts of all algorithms, which demonstrate that BSO achieves a good balance between exploration and exploitation for escaping from different local optima compared with the other competitive algorithms.

Sensitivity analysis
As mentioned before, the proposed algorithm consists of two key components. This subsection discusses the role that each component plays in dealing with the CEC2017 benchmark functions. Specifically, to demonstrate that the stochastic search operator has an important effect on the proposed strategy, a variant BSOV is designed by removing the stochastic operator; in other words, BSOV has only the second search mechanism. Similarly, to study the role of the simplex algorithm, BSO is also modified by excluding the second search strategy, yielding BSOVV. These two variants are compared with the original BSO, and Tables 5, 6, 7 and 8 report the corresponding computational and comparison results. It is not difficult to observe that BSO outperforms the two modified variants (BSOV and BSOVV) on almost all test problems. This means that each search strategy indeed helps improve the quality of the population in varying environments. The reason may originate from the fact that the stochastic operator is designed using different individuals, which helps generate promising solutions to some extent. The comparisons between the two variants and the original BSO illustrate that each part has a significant effect on the performance of BSO, and removing either reduces performance. Therefore, it is necessary to combine them together to form the BSO algorithm.
Apart from the component analysis, there are two important parameters in BSO, the expansion factor (α) and the contraction factor (β). A detailed discussion of their influence is omitted here, since much effort has already been devoted to designing effective parameter combinations and different parameters may be suitable for different problems. This paper only defines a base case of α = 2 and β = 0.5 for computing the optimization problems. Scholars who are interested in this topic can investigate it in future work and further improve the search performance of the algorithm.

Engineering optimization problems
To further explore the effectiveness of the proposed algorithm in practice, three widely used practical engineering problems are adopted (Mirjalili and Lewis 2016; Mirjalili 2015; Askarzadeh 2016). Note that the stopping condition, the number of runs and the other relevant settings are kept the same as in subsection 4.3. In addition, to deal with the constraints involved in these problems, an effective constraint-handling technique is utilized. Tables 9, 10 and 11 summarize the experimental results in terms of computing time ('f_time'), the mean error value (f_mean) and standard deviation (std).
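The constraint-handling technique is not specified in this excerpt; a common choice for such problems, shown purely as an illustration, is a static penalty added to the objective:

```python
import numpy as np

def penalized(f, constraints, rho=1e6):
    """Wrap objective f with a static penalty: each violated inequality
    g(x) <= 0 adds rho * max(0, g(x))^2. (Illustrative only; the paper's
    actual constraint-handling method is not detailed in this excerpt.)"""
    def wrapped(x):
        viol = sum(max(0.0, g(x)) ** 2 for g in constraints)
        return f(x) + rho * viol
    return wrapped

# Example: minimize x0^2 + x1^2 subject to x0 + x1 >= 1  (g(x) = 1 - x0 - x1 <= 0)
f = lambda x: x[0] ** 2 + x[1] ** 2
g = lambda x: 1.0 - x[0] - x[1]
fp = penalized(f, [g])
```

The wrapped objective `fp` can then be handed to any unconstrained optimizer such as BSO; feasible points keep their original objective value, while infeasible ones are pushed away by the penalty.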

Case 1
The main target of the compression spring problem is to optimize the weight subject to four constraint conditions, involving three control variables (x1, x2 and x3). Equation (4) and Fig. 9 provide the mathematical model and architecture graph, respectively.

Fig. 9 The schematic of the tension/compression spring design problem

Fig. 10 The schematic of the pressure vessel design problem

For this problem, the results listed in Table 9 clearly show that the performance of BSO is remarkable with respect to the other optimizers on the different evaluation indicators. Although the standard deviation values and computing time ('f_time') of BSO are a little weaker than those of DEDVR, MGWO and GLFGWO, it generates the best 'f_mean' values, which means that the proposed algorithm is able to obtain an optimal design with minimum weight compared to the other competitors.

Case 2
The main target of the pressure vessel problem is to optimize the overall cost, including material, forming and welding costs, subject to four constraint conditions, with two discrete control variables (x1 and x2) and two continuous control variables (x3 and x4). Equation (5) and Fig. 10 provide the mathematical model and architecture graph, respectively.

Fig. 11 The schematic of the three-bar truss design problem

Table 10 clearly shows that the proposed algorithm delivers better results under the same run circumstances and termination criterion; although BSO has a slightly higher computing time ('f_time'), its superiority is statistically significant with respect to the other competitors considering the obtained f_mean and standard deviation results. This means that BSO is able to generate a set of parameter combinations such that the optimization objective is optimal.

Case 3
The main target of the three-bar truss design problem is to optimize the weight considering stress, deflection and buckling constraints, which are expressed by seven different constraint conditions, two decision variables (x1 and x2) and some control parameters. Equation (6) and Fig. 11 provide the mathematical model and architecture graph, respectively.
where l = 100 cm, P = 2 kN/cm², σ = 2 kN/cm², 0 ≤ x1 ≤ 1 and 0 ≤ x2 ≤ 1. According to the experimental results listed in Table 11, although the standard deviation value and computing time ('f_time') of BSO are slightly weaker than those of DEDVR, BSO, CMAES and DEDVR obtain the best results and sufficiently outperform the other optimization algorithms under the same termination criterion and run circumstances. Such evidence indicates that the proposed algorithm will be an attractive alternative optimizer for generating satisfactory results on challenging optimization problems in the future.

Neural network training models

Feedforward neural networks (FNNs) have been studied and applied widely (Zhang and Wang 2017). Among them, neural networks with three layers (an input layer, a hidden layer and an output layer) have been widely used in many practical applications. Therefore, this section employs three-layer FNNs to solve two nonlinear function approximation problems for further verifying the effectiveness of the proposed algorithm. Figure 12 depicts the basic architecture of the three-layer FNNs; it can be seen that the feedforward neural network is a unidirectional multi-layer structure, each layer contains several neurons, and each neuron receives signals from the previous layer and produces output to the next layer. The following equation calculates the output of each hidden unit based on the connection weights and activation function.

F(s_j) = 1 / (1 + e^(−s_j)),  with  s_j = Σ_{i=1}^{n} ω_ij · x_i + θ_j,  j = 1, 2, ..., h
where F is the sigmoid activation function, s_j is the input of the jth hidden unit, n denotes the number of input units, ω_ij is the connection weight from the ith input unit to the jth hidden unit, θ_j refers to the bias of the jth hidden unit, and x_i and h are the ith input variable and the number of hidden units, respectively.
The final result of the output layer can be calculated using the equation below.

o_p^i = F( Σ_{j=1}^{h} ω̄_jp · F(s_j) + θ̄_p ),  p = 1, 2, ..., P

where o_p^i (p = 1, 2, ..., P; i = 1, 2, ..., K) denotes the pth output for the ith input sample, P and K are the number of output units and the total number of training samples, respectively, ω̄_jp is the connection weight from the jth hidden unit to the pth output unit, and θ̄_p refers to the bias of the pth output unit. Figure 13 describes the optimization principle diagram of the FNN training model; the learning error between the actual output and the desired result is set as the optimization objective (J), which is expressed by the formula below.

J = (1/K) Σ_{i=1}^{K} Σ_{p=1}^{P} (d_p^i − o_p^i)²
where d i p and o i p are the desired output and actual result of the ith input unit, respectively.
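A matching sketch of the output layer and the learning error; the sigmoid at the output and the mean-squared form of $J$ are assumptions consistent with the surrounding definitions, not details confirmed by the text:

```python
import numpy as np

def output_layer(H, W2, theta2):
    """Network outputs o_p from hidden activations H of shape (K, h);
    a sigmoid output activation is assumed here."""
    return 1.0 / (1.0 + np.exp(-(H @ W2 - theta2)))

def learning_error(D, O):
    """Objective J: mean squared error between desired outputs D and
    actual outputs O over all samples and output units."""
    return np.mean((D - O) ** 2)
```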
Actually, the FNN training model is a high-dimensional and multi-modal problem whose aim is to obtain a set of optimal decision variables $(\omega, \theta, \bar{\omega}, \bar{\theta})$ that minimizes the objective function.
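To hand this model to a population-based optimizer such as BSO, the weights and biases are typically flattened into a single decision vector. The following helper is a hypothetical sketch (the names and layout are ours):

```python
import numpy as np

def unpack(z, n, h, P):
    """Split a flat decision vector z into (W1, theta1, W2, theta2)
    for an n-h-P feedforward network."""
    i = 0
    W1 = z[i:i + n * h].reshape(n, h); i += n * h
    t1 = z[i:i + h];                   i += h
    W2 = z[i:i + h * P].reshape(h, P); i += h * P
    t2 = z[i:i + P]
    return W1, t1, W2, t2

def dim(n, h, P):
    """Dimension of the search space: n*h + h + h*P + P."""
    return n * h + h + h * P + P
```

Each search agent of the optimizer is then simply one such vector, evaluated by unpacking it and computing $J$ on the training set.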

Nonlinear function approximation examples
This subsection utilizes two nonlinear function approximation problems, named SISO and MISO, both characterized by noise, to further evaluate the performance of BSO.
• SISO: For this function, shown in Fig. 14a-b, a 1-h-1 neural network architecture is trained for the approximation. The data set is sampled uniformly from the feasible region with increments of 0.1, where the first fifty data points and the remaining data serve as the training set and the test set, respectively.
• MISO: For this case, shown in Fig. 14c-d, a 3-h-1 neural network architecture is employed. The data set is sampled from the interval with steps of one, where the first ninety data points and the rest are defined as the training set and the test set, respectively.
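The SISO sampling scheme can be sketched as follows. The true target function and its feasible region are not restated in the text, so both are placeholders here:

```python
import numpy as np

# Placeholder target and interval; the paper's actual SISO function
# and feasible region are not reproduced here.
f = lambda x: np.sin(x)
x = np.arange(0.0, 10.0, 0.1)      # uniform sampling, step 0.1
y = f(x)

x_train, y_train = x[:50], y[:50]  # first fifty samples: training set
x_test,  y_test  = x[50:], y[50:]  # remaining samples: test set
```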

The choice of the number of the hidden layer nodes
The number of hidden layer units (h) is always a significant element of FNNs; the main purpose of this subsection is to determine an ideal value by testing h from two to eight. The comparison results recorded in Table 12 show that the overall performance of FNNs with two hidden layer nodes is significantly superior to the other versions. Therefore, FNNs with the architectures 1-2-1 and 3-2-1 are adopted for the SISO and MISO problems, respectively.
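The selection procedure can be illustrated as below: each candidate h from two to eight is trained and the value with the lowest training error is kept. Random search stands in for BSO and the target is a toy function, so this is only a sketch of the procedure, not a reproduction of Table 12:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda s: 1.0 / (1.0 + np.exp(-s))

def train_error(h, X, D, trials=200):
    """Best training MSE found by random search (a stand-in for BSO)
    for a 1-h-1 network on data (X, D)."""
    n, P = 1, 1
    best = np.inf
    for _ in range(trials):
        W1 = rng.standard_normal((n, h)); t1 = rng.standard_normal(h)
        W2 = rng.standard_normal((h, P)); t2 = rng.standard_normal(P)
        O = sigmoid(sigmoid(X @ W1 - t1) @ W2 - t2)
        best = min(best, np.mean((D - O) ** 2))
    return best

X = np.linspace(0, 1, 50).reshape(-1, 1)
D = 0.5 + 0.5 * np.sin(2 * np.pi * X)   # toy target, not the paper's SISO function
h_best = min(range(2, 9), key=lambda h: train_error(h, X, D))
```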

Experiments results
The experimental results recorded in Table 13 demonstrate that the overall performance of BSO is better than that of most of the compared algorithms under the same termination criterion and run environment, which means that it is able to obtain a set of reasonable parameters for minimizing the objective functions. Compared with MGSCA, MGWO and GLFGWO, however, the advantages of the proposed algorithm are not significant, and it is even slightly inferior on some test-error or p-value results. Figure 15 provides the convergence curves of all algorithms: BSO converges faster than its competitors in the first half of the iterations, while MGSCA, MGWO and GLFGWO have certain advantages in the later generations. These results illustrate that the proposed technique may not be the best among the existing algorithms for FNN problems, but it can still be regarded as an effective tool for solving different optimization problems.

Conclusion
In this paper, inspired by the escape and survival behavior of the oryx in nature, a new heuristic optimization technique named the biological survival optimizer (BSO) is proposed. The simplex method and several stochastic operators are introduced to explore and exploit the feasible region effectively. To evaluate the performance of BSO, the recent CEC2017 benchmark suite of functions with different characteristics, three engineering design problems and FNN training models are utilized. Experimental results compared with eight existing optimization algorithms demonstrate that BSO performs better than the other algorithms in most cases, showing that the proposed algorithm has good tracking ability. In addition, a sensitivity analysis of the role of each component of the proposed algorithm is presented and discussed extensively. In future work, BSO will be further improved and extended as a better choice for addressing diverse practical applications in the real world.