Strengthened Teaching–Learning-Based Optimization Algorithm for Numerical Optimization Tasks

The teaching–learning-based optimization algorithm (TLBO) is an efficient optimizer, but it suffers from shortcomings such as premature convergence and stagnation at local optima. In this paper, a strengthened teaching–learning-based optimization algorithm (STLBO) is proposed to enhance the basic TLBO's exploration and exploitation properties by introducing three strengthening mechanisms: a linearly increasing teaching factor, an elite system composed of the new teacher and the class leader, and a Cauchy mutation. Subsequently, seven variants of STLBO are designed based on combined deployments of the three improved mechanisms. The performance of the novel STLBOs is evaluated on thirteen numerical optimization tasks. The results show that STLBO7 ranks at the top of the list and is significantly better than the original TLBO; the remaining six STLBO variants also outperform TLBO. Finally, a set of comparisons is conducted between STLBO7 and other advanced optimization techniques. The numerical results and convergence curves show that STLBO7 clearly outperforms its competitors, with stronger local-optimum avoidance, faster convergence, and higher solution accuracy. All of the above demonstrates that the STLBOs improve the search performance of TLBO. Data Availability Statement: All data generated or analyzed during this study are included in this published article.


Introduction
In the past decades, meta-heuristic algorithms have been widely applied in various research fields and practical scenarios; they are favored by researchers because of their simplicity, efficiency, low computational cost, and extraordinary performance. Different from exact algorithms, meta-heuristic algorithms are a type of optimization technology that seeks approximate optimal solutions to a problem under limited time and resource constraints. Nowadays, with the development of emerging technologies such as the internet and artificial intelligence, large-scale optimization tasks with non-differentiable, non-convex, or discontinuous characteristics are increasing rapidly, which allows meta-heuristic algorithms to play an increasingly important role. In meta-heuristics, exploration and exploitation are two crucial abilities that together determine the search performance of an algorithm.
Many such optimizers have been proposed, for example particle swarm optimization (PSO), which is designed based on the predatory behavior of birds [12], the monarch butterfly optimization algorithm (MBO) [13], the earthworm optimization algorithm (EOA) [14], the salp swarm algorithm (SSA) [15], the elite genetic algorithm with tabu search (TS-EGA) [16], the krill herd algorithm (KH) [17], and so on. These nature-inspired, population-based optimizers have proved to be efficient. However, according to the No Free Lunch (NFL) theorem, no single optimizer is suitable for solving all problems.
Therefore, Rao et al. developed a new population-based optimization method called teaching-learning-based optimization algorithm (TLBO) by modeling human teaching-learning behaviors [18].
TLBO is a novel population-based optimization method, inspired by teaching behavior and teaching mode in human society. The design concept of TLBO is that the teacher in the class is keen to improve the average score of the whole class to his/her own level. Moreover, the students in the class try to increase their knowledge by learning from the teacher and communicating with each other. Ideally, each member of the class should perform his or her duties to make the whole group evolve in a better direction.
TLBO was first proposed to solve constrained mechanical design optimization, continuous non-linear function, and engineering optimization problems. Subsequently, it has been used in more and more academic research and practical application fields. For instance, in the literature [19], TLBO is used to optimize energy consumption and surface defects in wire-cut electric discharge machining; the results reveal that the surface defects around the slit are the least under the working conditions optimized by TLBO. In the work [20], a functional-link artificial neural network based on TLBO is proposed for real-time identification of a fuzzy-PID-controlled magnetic levitation system; further applications are reported in the literature up to [26]. It can be seen that the key parameters of an optimization algorithm have a great impact on its performance. Therefore, we first redesign the update mechanism of the teaching factor to better meet the requirements of both reality and the algorithm. Secondly, we add an elite system composed of the new teacher and the class leader in the learner phase to accelerate convergence and to guide the correct evolution of the population. Thirdly, the authors of [27] proposed fast evolutionary programming (FEP) by introducing a Cauchy mutation operator; their results reveal that FEP has excellent search performance. Therefore, we add a Cauchy mutation mechanism to the original TLBO to enhance population diversity and better escape local optima.

Contributions
The contribution of this paper is threefold. Firstly, a novel strengthened TLBO (STLBO) with three strengthening mechanisms (a linearly increasing teaching factor, an elite system, and a Cauchy mutation) is proposed. Secondly, seven STLBO variants are designed based on combined deployments of the three mechanisms. Thirdly, the STLBOs are evaluated on thirteen numerical optimization tasks and compared with several well-known optimizers.
For a clearer explanation, the distribution of the students' grades in two classes taught by teachers T1 and T2 after the implementation of a teaching activity is shown in Fig. 1, and the distribution of students' scores before and after the implementation of teaching activities in a single class is shown in Fig. 2.
As we all know, the distribution of achievements satisfies the normal distribution expressed by Equation (1). As shown in Fig. 1, curve-1 and curve-2 represent the distributions of grades obtained by the students taught by T1 and T2, respectively. Obviously, the teaching level of T2 is better than that of T1.
Therefore, teachers play an important role in the evolution of the class population.
f(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²))    (1)
where σ² is the variance and μ is the mean. The TLBO contains two stages: the "Teacher phase" and the "Learner phase". The specific implementation details are shown in the following.

Teacher phase
In the teacher phase, the individual with the optimal fitness in the class is designated as the teacher. The purpose of selecting the teacher is to impart knowledge and try to raise the average grade of the class to the teacher's level. In this situation, each decision variable can be regarded as a subject, and an individual's fitness can be regarded as his/her total score over all subjects. However, it is almost impossible for the teacher to fully achieve this goal because of factors such as the students' uneven learning and comprehension abilities. Therefore, a concept called the teaching factor (TF, calculated by Equation (3)) is introduced to conform to the facts.
Suppose that the current iteration number is k, and let M^k be the mean position of all individuals and X_teacher^k be the teacher at iteration k. The difference between X_teacher^k and M^k can be expressed by Equation (2):
Difference_Mean^k = r × (X_teacher^k − TF × M^k)    (2)
TF = round[1 + rand]    (3)
where r and rand are random numbers satisfying the uniform distribution on [0, 1].
Subsequently, the position of the current solution can be updated based on this difference, as shown in Equation (4):
X_i^k′ = X_i^k + Difference_Mean^k    (4)
where X_i^k (i = 1, 2, …, N) is the position of the ith search agent before the teaching activity, and X_i^k′ is the position of the ith search agent after the teaching activity in iteration k. N indicates the preset population size.
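To make the teacher-phase update concrete, the following Python sketch implements it for a minimization problem (the function name, the test objective, and the greedy acceptance step are our own illustrative choices, not prescribed by the original description):

```python
import numpy as np

def teacher_phase(pop, fitness, tf, rng):
    """One teacher-phase step of TLBO (minimization).

    pop     : (N, D) array of learner positions
    fitness : objective function, lower is better
    tf      : teaching factor
    rng     : numpy random Generator
    """
    scores = np.array([fitness(x) for x in pop])
    teacher = pop[scores.argmin()]              # best learner acts as the teacher
    mean = pop.mean(axis=0)                     # class mean M
    new_pop = pop.copy()
    for i in range(len(pop)):
        r = rng.random(pop.shape[1])            # r ~ U[0, 1]
        cand = pop[i] + r * (teacher - tf * mean)
        if fitness(cand) < scores[i]:           # accept only improvements
            new_pop[i] = cand
    return new_pop

# Illustrative usage on the sphere function
rng = np.random.default_rng(0)
pop = rng.uniform(-5.0, 5.0, size=(8, 3))
sphere = lambda x: float((x ** 2).sum())
new_pop = teacher_phase(pop, sphere, tf=2, rng=rng)
```

Because each learner keeps its old position unless the candidate is strictly better, the best fitness in the class can never worsen after a teacher-phase step.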

Learner phase
Learners (students) increase their knowledge in two different ways: one is through the teacher's teaching, and the other is through interaction with each other. In the learner phase, students improve their performance by communicating with each other. The communication is mainly conducted through group discussions, introductions of personal experience, formal or informal conversations, and so on. During communication, students can improve their knowledge through random interactions with others: if a student finds that another student has more knowledge than he/she does, he/she can learn something new. The specific individual renewal formula is as follows.
For an individual X_i^k′ after the teacher phase, another learner X_j^k′ (j ≠ i) is randomly selected. If the fitness of X_i^k′ is better than that of X_j^k′, the position update formula of X_i^k′ can be expressed as Equation (5):
X_i^{k+1} = X_i^k′ + r × (X_i^k′ − X_j^k′)    (5)
Otherwise, it can be expressed as Equation (6):
X_i^{k+1} = X_i^k′ + r × (X_j^k′ − X_i^k′)    (6)
Finally, the fitness is reevaluated. If the fitness of X_i^{k+1} is better than that of X_i^k′, the update is accepted; otherwise, X_i^k′ is kept unchanged.
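The learner-phase interaction described above can be sketched as follows (a minimal Python illustration for minimization; the function name and partner-selection details are our own choices):

```python
import numpy as np

def learner_phase(pop, fitness, rng):
    """One learner-phase step of TLBO (minimization)."""
    scores = np.array([fitness(x) for x in pop])
    new_pop = pop.copy()
    n, d = pop.shape
    for i in range(n):
        j = int(rng.integers(n - 1))
        if j >= i:                              # choose a partner j != i
            j += 1
        r = rng.random(d)
        if scores[i] < scores[j]:               # i knows more: move away from j
            cand = pop[i] + r * (pop[i] - pop[j])
        else:                                   # j knows more: move toward j
            cand = pop[i] + r * (pop[j] - pop[i])
        if fitness(cand) < scores[i]:           # keep the update only if it improves
            new_pop[i] = cand
    return new_pop

# Illustrative usage on the sphere function
rng = np.random.default_rng(1)
pop = rng.uniform(-5.0, 5.0, size=(8, 3))
sphere = lambda x: float((x ** 2).sum())
new_pop = learner_phase(pop, sphere, rng)
```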

Pseudocode of TLBO
In this subsection, the pseudocode of TLBO is presented in Algorithm 1.
Algorithm 1. Pseudocode of the TLBO algorithm
1: Inputs: population size N and the total number of iterations
2: Initialize the population randomly and evaluate the fitness of each learner
3: While the stopping criterion is not met do
4:   Designate the best learner as the teacher and calculate the mean M
5:   For i = 1 : N do % Teacher phase
6:     Update the position of learner X_i^k by Equations (2)–(4); accept it if its fitness is better
7:   End for
8:   For i = 1 : N do % Learner phase
9:     Let X_i^k′ be a learner and randomly select another learner X_j^k′ (j ≠ i)
10:    Update X_i^k′ according to their relative fitness; accept it if its fitness is better
11:  End for
12: End while
13: Return the global best solution
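Putting the two phases together, a compact, self-contained TLBO sketch might look like the following (an illustrative implementation for minimization on box-constrained problems; the function name and default parameters are our own choices):

```python
import numpy as np

def tlbo(fitness, low, high, dim, n_pop=20, n_iter=100, seed=0):
    """Minimal TLBO sketch (minimization) following Algorithm 1."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(low, high, size=(n_pop, dim))
    scores = np.array([fitness(x) for x in pop])
    for _ in range(n_iter):
        # --- Teacher phase ---
        teacher = pop[scores.argmin()]
        mean = pop.mean(axis=0)
        tf = int(rng.integers(1, 3))            # TF is 1 or 2 with equal probability
        for i in range(n_pop):
            cand = np.clip(pop[i] + rng.random(dim) * (teacher - tf * mean), low, high)
            f = fitness(cand)
            if f < scores[i]:
                pop[i], scores[i] = cand, f
        # --- Learner phase ---
        for i in range(n_pop):
            j = int(rng.integers(n_pop - 1))
            if j >= i:                          # partner j != i
                j += 1
            step = pop[i] - pop[j] if scores[i] < scores[j] else pop[j] - pop[i]
            cand = np.clip(pop[i] + rng.random(dim) * step, low, high)
            f = fitness(cand)
            if f < scores[i]:
                pop[i], scores[i] = cand, f
    best = int(scores.argmin())
    return pop[best], float(scores[best])

# Illustrative usage on the 5-dimensional sphere function
sphere = lambda x: float((x ** 2).sum())
best_x, best_f = tlbo(sphere, -10.0, 10.0, dim=5)
```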

The proposed STLBO algorithm
In this section, a strengthened teaching–learning-based optimization algorithm, called STLBO, is proposed based on several improvements to the original TLBO. In STLBO, three reinforcement mechanisms are introduced to improve the search performance of the conventional TLBO.

Linear increasing teaching factor
In the teacher phase, the teaching factor TF is the unique variable that decides how much the mean value is to be changed. In the original TLBO, TF is assigned the value 1 or 2 with equal probability. However, after full and detailed numerical experiments on TF, we find that a value of 3 or 4 may be a more desirable choice. From a practical point of view, compared with 1 or 2, a TF value of 3 or 4 means that the teacher has a higher teaching level and the students have a higher learning ability, and can even learn knowledge that the teacher has not taught. From the perspective of the algorithm, when TF is 3 or 4, the search agents can approach the global optimal solution faster, so the algorithm has a higher convergence speed. Further, in a realistic scenario, as teaching activities progress, the teaching level of a teacher should improve gradually rather than being randomly assigned.
Therefore, a linearly increasing TF is designed in this subsection to model its adaptive change with the iterations, as shown in Equation (7):
TF^k = TF_min + (TF_max − TF_min) × k / k_max    (7)
where k is the current iteration number, k_max represents the maximum number of iterations, and TF_min and TF_max are the initial and final values of the teaching factor, respectively.
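A minimal sketch of such a linearly increasing teaching factor (assuming, for illustration, that TF grows from 1 at the first iteration to 4 at the last, consistent with the discussion above; the exact bounds are an assumption, not taken from the original text):

```python
def teaching_factor(k, k_max, tf_min=1.0, tf_max=4.0):
    """Linearly increasing teaching factor.

    Grows from tf_min at iteration 0 to tf_max at iteration k_max,
    modeling a teacher whose teaching level improves over time.
    """
    return tf_min + (tf_max - tf_min) * k / k_max
```

In practice the returned value can be used directly, or rounded, in the teacher-phase update in place of the randomly assigned TF of the basic TLBO.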

Elite system in learner phase
In the learner phase of the conventional TLBO, class members increase their knowledge by communicating and learning with each other. However, this learning method has certain shortcomings. Most importantly, the class members lack the guidance of outstanding individuals. Such unguided mutual learning is disordered, without benchmarks or standards, and may therefore cause a lot of useless "learning". From the perspective of population evolution, this invalid learning strategy makes the population evolve in the wrong direction to a certain extent.
To reasonably remedy the above defect, an elite system is introduced into the learner phase of TLBO. It can be expressed as:
X_teacher^k = the best search agent in the current iteration k,
X_leader^k = the second-best search agent in the current iteration k.
In the implementation of the algorithm, the remaining search agents update their positions based on X_teacher^k and X_leader^k with the same probability. Let p be a random number in [0, 1]; if p is greater than 0.5, the position update formula is Equation (8); otherwise, it is Equation (9):
X_i^{k+1} = X_i^k′ + r × (X_teacher^k − X_i^k′)    (8)
X_i^{k+1} = X_i^k′ + r × (X_leader^k − X_i^k′)    (9)
where X_i^k′ denotes the position vector of the ith individual after the teacher phase, X_i^{k+1} denotes the position vector of the ith individual after the learner phase, and r denotes a random number in [0, 1].
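The elite-system update can be sketched as follows (an illustrative Python fragment; the helper name is our own, and the equal-probability switch between the teacher and the class leader follows the description above):

```python
import numpy as np

def elite_learner_step(pop, scores, rng):
    """Elite-system learner update (minimization).

    Each agent learns from the teacher (best agent) or the class
    leader (second-best agent) with equal probability.
    """
    order = np.argsort(scores)                  # lower fitness is better
    teacher, leader = pop[order[0]], pop[order[1]]
    new_pop = pop.copy()
    for i in range(len(pop)):
        guide = teacher if rng.random() > 0.5 else leader
        r = rng.random(pop.shape[1])
        new_pop[i] = pop[i] + r * (guide - pop[i])
    return new_pop

# Illustrative usage
rng = np.random.default_rng(2)
pop = rng.uniform(-5.0, 5.0, size=(6, 4))
scores = np.array([float((x ** 2).sum()) for x in pop])
new_pop = elite_learner_step(pop, scores, rng)
```

Because r lies in [0, 1), each new coordinate lies between the agent's old coordinate and the corresponding coordinate of its chosen elite guide.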

A Cauchy mutation mechanism
Although TLBO is a superior optimizer, it has defects such as premature convergence and weak local-optimum avoidance when dealing with some complex tasks. To solve this problem, it is necessary to introduce a mutation mechanism.
In this subsection, a Cauchy mutation (CM) strategy is integrated into the learner phase of TLBO. Generally, a mutation mechanism improves the search performance by enhancing the diversity of the population. In this paper, we use the Cauchy distribution to model the Cauchy mutation mechanism; the mutated variable satisfies C ~ Cauchy(x0, γ) with γ = 0.5 and x0 = 1.
The probability density function of the Cauchy distribution is shown in Equation (10):
f(x; x0, γ) = (1/π) × γ / ((x − x0)² + γ²)    (10)
where x0 is the location parameter defining the peak position of the distribution, and γ is the scale parameter specifying the half-width at half-maximum. When γ = 0.5 and x0 = 1, the probability density function of the Cauchy distribution is shown in Fig. 3.
In the implementation of the algorithm, a mutation vector CM is designed to model the CM mechanism, as shown in Equation (11):
CM = w ⊗ C    (11)
where w represents a D-dimensional weight vector with values between [0.001, 0.01], D is the number of decision variables, and C represents a D-dimensional vector of Cauchy-distributed random numbers.
Then, the new location update combined with the CM mechanism is expressed by Equation (12):
X_i^{k+1}′ = X_i^{k+1} + CM    (12)
where X_i^{k+1}′ indicates the location of the search agent after the CM operation is applied.
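A sketch of the CM perturbation described above (the additive combination, the function name, and the parameter defaults are assumptions based on our reading of this section; the weight range [0.001, 0.01] and the Cauchy parameters γ = 0.5, x0 = 1 follow the text):

```python
import numpy as np

def cauchy_mutation(x, rng, w_low=0.001, w_high=0.01, loc=1.0, scale=0.5):
    """Cauchy-mutation perturbation of a position vector x.

    Draws a D-dimensional Cauchy-distributed vector with the given
    location and scale, weights it by a small random vector w, and adds
    the result to the current position. The heavy tail of the Cauchy
    distribution occasionally produces long jumps that help the search
    escape local optima.
    """
    d = x.shape[0]
    w = rng.uniform(w_low, w_high, size=d)      # weight vector w
    c = loc + scale * rng.standard_cauchy(d)    # Cauchy(loc, scale) sample
    return x + w * c                            # perturbed position

# Illustrative usage
rng = np.random.default_rng(3)
x = np.zeros(10)
y = cauchy_mutation(x, rng)
```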

Schematic diagram of STLBO
In this subsection, the pseudocode of STLBO deployed with the above three mechanisms is presented in Algorithm 2.
Algorithm 2. Pseudocode of the STLBO algorithm
1: Inputs: population size N and the total number of iterations
2: Initialize the population randomly and evaluate the fitness of each learner
3: While the stopping criterion is not met do
4:   Calculate the mean M and the linearly increasing teaching factor TF by Equation (7)
5:   For i = 1 : N do % Teacher phase
6:     Update the position of learner X_i^k by Equations (2) and (4); accept it if its fitness is better
7:   End for
8:   Select the teacher X_teacher^k and the class leader X_leader^k
9:   For i = 1 : N do % Learner phase with the elite system
10:    Update X_i^k′ by Equation (8) or Equation (9) with equal probability
11:    Apply the Cauchy mutation of Equation (12)
12:    Accept the new position if its fitness is better
13:  End for
14: End while
15: Return the global best solution

Experiment and analysis
In this section, two experiments are carried out to measure the performance of the STLBOs. First, a comparative experiment is performed on the seven STLBO variants and the conventional TLBO; the variants are described in Table 1. Then, STLBO7, which has the best search performance, is selected for the next experiment, in which all results are recorded and compared with those of several well-known optimization techniques, namely HS, PSO, MFO, GA, and the traditional TLBO.
In this paper, the total number of iterations for the comparison methods is set to 1000, except for HS. The population size is set to 40 for PSO, MFO, and GA, and to 20 for TLBO and the STLBOs; this is because TLBO and the STLBOs perform 2 × N (population size) function evaluations per iteration, twice as many as PSO, MFO, and GA. For HS, the best population size is 5–7, so we set it to 6; to ensure the same total number of function evaluations (40,000) as the other algorithms, its total number of iterations is set to 40,000. Detailed parameter settings are listed in Table 2.
The experiments were implemented on a Windows 10 operating system with an Intel(R) CPU i7-6700HQ @ 2.60 GHz and 4.00 GB of RAM. All algorithms were coded in Python 3.7.
( Table 1 to be inserted here.) ( Table 2 to be inserted here.)
( Table 3 to be inserted here.) ( Table 4 to be inserted here.)

Comparative test of seven STLBOs with TLBO
In this subsection, comparison experiments between the seven STLBO variants and the conventional TLBO are carried out, and the results are reported in Tables 5 and 6. In the tables, "Average Rank" and "Overall Rank" provide a more intuitive indicator for algorithm comparison.
From Table 5, STLBO7 presents outstanding performance, followed by STLBO4 and STLBO6. From Table 6, STLBO7 still achieves an overwhelming victory over its opponents, followed by STLBO6 and STLBO4. Finally, the total ranking derived from Tables 5 and 6 is given in Table 7. ( Table 5 to be inserted here.) ( Table 6 to be inserted here.) ( Table 7 to be inserted here.)

Comparative test of STLBO7 with other optimizer
In this subsection, STLBO7 is compared with the other prestigious optimizers (HS, PSO, MFO, GA, and the traditional TLBO). On most benchmark functions STLBO7 obtains the best results; GA is the best optimization technique only on function F8 when the dimensions are 70D and 100D, followed by STLBO7.
It can be perceived from the total ranking table that STLBO7 is the most outstanding one among all the comparison methods.
Therefore, the newly designed STLBO7 is an efficient optimization technology.
( Table 8 to be inserted here.) ( Table 9 to be inserted here.) (Table 10 to be inserted here.) (Table 11 to be inserted here.) To further illustrate the effectiveness of the proposed STLBO7, the convergence curves of the comparison optimizers on the 70D functions are shown in Figs. 4 and 5. From Fig. 4, it can be perceived that, compared with the traditional TLBO and the other optimization techniques, STLBO7 has higher solution accuracy and the fastest convergence speed on all unimodal problems. From Fig. 5, STLBO7 is still far better than the other optimizers in solution accuracy and convergence speed on the multimodal benchmark functions other than function F8, on which GA attains the highest solution accuracy and the fastest convergence speed. Although neither TLBO nor STLBO7 can find the global optimal solution of F8 within 500 iterations, the latter still has better search performance than the former. These curves further illustrate the effectiveness of the proposed STLBO7.
In future research, the STLBOs will be applied to more practical tasks and to multi-objective problems. Meanwhile, designing binary versions of the STLBOs and applying them to practical problems is also a worthwhile research direction.