An improved linear prediction evolution algorithm based on nonlinear least square fitting model for optimization

The linear prediction evolution algorithm (LPE) is a recent addition to the field of optimization algorithms, with few parameters and high exploration capability. LPE has shown excellent results on some real-world problems but still suffers from issues such as slow convergence. To improve its performance, this study presents an improved linear prediction evolution algorithm (ILPE) that enhances its exploration capability. The proposed ILPE treats the population series of an evolutionary algorithm as a time series and uses a nonlinear least-square fitting model as a reproduction operator to forecast the next generation. The performance of the proposed ILPE is verified on the CEC2014 and CEC2017 benchmark functions. The comparison results show that ILPE outperforms LPE and is highly competitive with other state-of-the-art optimization algorithms.


Introduction
Real-world problems span interdisciplinary areas and usually involve nonlinear, discontinuous, and non-convex objective functions with high-dimensional and complex boundary conditions. Deterministic mathematical methods are often infeasible for obtaining satisfactory results on such problems. Meta-heuristic algorithms have therefore received much attention as powerful and general optimization methods for solving various complex problems (Dulebenets 2020; Bansal and Farswan 2017). Many new meta-heuristic algorithms have been developed in the last few decades, including swarm intelligence-based algorithms such as ant colony optimization (ACO) (Dorigo et al. 2006), particle swarm optimization (PSO) (Kennedy and Eberhart 1995), the artificial bee colony algorithm (ABC) (Karaboga and Basturk 2007), grey wolf optimization (GWO) (Mirjalili et al. 2014), teaching-learning-based optimization (TLBO) (Rao et al. 2011), and spider monkey optimization (SMO) (Bansal et al. 2014), as well as evolutionary algorithms, which are based on the biological laws of evolution. A few popular evolutionary algorithms are differential evolution (DE) (Storn and Price 1997), the genetic algorithm (GA) (Holland 1992), and genetic programming (GP) (Koza 1992).
Prediction-based evolutionary algorithms are a new trend in the field of evolutionary computation. The grey prediction evolution algorithm (GPE) is the first prediction-based evolutionary algorithm that incorporates the principles of grey prediction theory. In the initial stages of its development, the algorithm used the even grey model (EGM(1,1)) (Liu et al. 2015) as its foundation model to create a new reproduction operator; in particular, GPE employs the predicted values produced by the EGM(1,1) model as the evolutionary offspring. The distinguishing feature of GPE is its egm11 reproduction operator, which is based on the even grey model. The performance of the algorithm has been analyzed on benchmark functions from CEC 2005 and on some constrained engineering design problems. Hu et al. (2021) proposed a grey prediction evolution algorithm based on the even difference grey model (GPEed); its core innovation is a reproduction operator based on the even difference grey model that predicts the next generation of populations. To accelerate GPE, Gao et al. (2020) used an acceleration term to improve the prediction accuracy and proposed a grey prediction evolution algorithm based on an accelerated even grey model (GPEae). Dai et al. (2020) proposed topological opposition-based learning to enhance the local search capability of GPE (TOGPE) and successfully applied it to the CEC 2005 and CEC 2014 benchmark problems. Several variants of GPE have recently been developed to achieve adaptiveness and to tackle a range of optimization problems, such as multimodal multi-objective problems, economic dispatch problems, and constrained engineering design problems (Xu et al. 2020; Gao et al. 2022; Xiang et al. 2022).
In contrast to other evolutionary algorithms, LPE, proposed in 2021, is inspired by the linear least-square fitting model (York 1966). It is a metaheuristic built on a linear mathematical model: it treats a population series as a time series and predicts offspring using a linear least-square fitting model. It has simple code, only one control parameter (the population size), and strong exploration capability. LPE has been successfully applied to the 10-dimensional CEC2014 and CEC2017 benchmark functions (Liang et al. 2013; Wu et al. 2017) and to seven engineering design problems: the three-bar truss, pressure vessel, tension/compression spring, welded beam, tubular column, gear train, and cantilever beam design problems.
LPE is built on the most widely used static model, the linear least-square fitting model: it considers the population series as a time series and approximates a linear trend that may exist in the subsequent generation. This allows LPE to use the linear least-square fitting model as the reproduction operator to generate the next generation. The essence of this technique is an iterative predictive method within the evolutionary framework that assures stability during evolution. Although LPE performs well in solution accuracy on the 10-dimensional CEC2014 and CEC2017 benchmark problems, its robustness and convergence rate are not satisfactory. For 30-dimensional problems, it likewise fails to demonstrate effectiveness in convergence rate, solution accuracy, and robustness compared with other state-of-the-art algorithms, because the linear strategy cannot explore the whole search space efficiently and, after some iterations, stagnates with no further exploration.
In this paper, an improved linear prediction evolution algorithm (ILPE) is introduced to overcome these issues. Like LPE, ILPE considers the population series as a time series, but it uses a nonlinear least-square fitting technique as a reproduction operator to predict the next-generation population. In addition, to avoid stagnation in local minima, it applies a random perturbation when the three generations of populations are too close. To validate the superiority of ILPE, we use the 30-dimensional CEC2014 and CEC2017 benchmark functions.
The rest of the paper is organized as follows. Table 1 lists all the abbreviations and notations used in this research. Section 2 briefly discusses the preliminaries, including the linear prediction evolution algorithm (LPE). In Sect. 3, a detailed description of the proposed improved linear prediction evolution algorithm is presented. In Sect. 4, detailed experimental results and various statistical analyses are presented. Finally, concluding remarks are given in Sect. 5.

Linear prediction evolution algorithm
The linear prediction evolution algorithm was developed by Gao et al. (2021) based on the linear least-square fitting model (Miller 2006). LPE is an iterative predictive model: it has no crossover operator, and its only control parameter is the population size. The LPE reproduction operator is derived from the simplest statistical model, the linear least-square fitting model. LPE determines the approximate linear trend that may exist in a subsequent population to achieve iterative optimization.

Mathematical model
In this subsection, the linear least-square fitting model and the linear prediction operator of LPE are explained.

Linear least-square fitting model (Weisstein 2002): Assume that (t_1, x_1), (t_2, x_2), ..., (t_n, x_n) are n input data points. The linear least-square model can be represented as

x_i = a_0 + a_1 t_i + E_i,  i = 1, 2, ..., n,  (1)

where a_0 and a_1 are coefficients to be determined and E_i represents the error of the i-th pair of data. Note that the error E_i is a function of the two variables a_0 and a_1. More details can be found in Yan and Su (2009).

Estimation of a_0 and a_1 (Freund et al. 2006): The method of least squares is used to determine a_0 and a_1. We estimate a_0 and a_1 so that the sum of the squares of the differences between the observations x_i and the straight line is a minimum. Thus, the least-square criterion is

S = Σ_{i=1}^{n} E_i^2 = Σ_{i=1}^{n} (x_i − a_0 − a_1 t_i)^2.  (2)

The above equation is called the square loss function. The objective is to find values of a_0 and a_1 for which this error is minimum. From multivariable calculus, this requires finding the pair (a_0, a_1) at which the gradient of S with respect to a_0 and a_1 vanishes. To obtain the undetermined coefficients a_0 and a_1, the partial derivatives of S with respect to a_0 and a_1 are set to zero.
The values of a_0 and a_1 can be obtained from the following matrix equation:

[ n       Σ t_i   ] [a_0]   [ Σ x_i     ]
[ Σ t_i   Σ t_i^2 ] [a_1] = [ Σ t_i x_i ]  (4)

Now we construct a linear least-square model using the three input data (x_1, x_2, x_3) to produce the fourth predicted datum x̂_4. Consider the four related data points to be equidistant and specified as (1, x_1), (2, x_2), (3, x_3), and (4, x̂_4), respectively. From equation (4), we obtain a_1 = (x_3 − x_1)/2 and a_0 = (x_1 + x_2 + x_3)/3 − 2a_1.

Setting E_i = 0 and substituting the values of a_0 and a_1 into equation (1), the fourth predicted datum is

x̂_4 = a_0 + 4a_1 = (4x_3 + x_2 − 2x_1)/3.  (6)
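As a quick numerical check, the sketch below (Python with NumPy; the helper name lfp_predict is ours, not from the paper) fits the least-squares line through (1, x_1), (2, x_2), (3, x_3) and extrapolates it to t = 4, which reduces to the closed form x̂_4 = (4x_3 + x_2 − 2x_1)/3:

```python
import numpy as np

def lfp_predict(x1, x2, x3):
    """Least-squares line through the three equally spaced points
    (1, x1), (2, x2), (3, x3), extrapolated to t = 4."""
    t = np.array([1.0, 2.0, 3.0])
    x = np.array([x1, x2, x3])
    a1, a0 = np.polyfit(t, x, 1)   # slope, intercept of the fitted line
    return a0 + a1 * 4.0

x1, x2, x3 = 2.0, 3.5, 4.0
pred = lfp_predict(x1, x2, x3)
closed_form = (4 * x3 + x2 - 2 * x1) / 3
assert abs(pred - closed_form) < 1e-9
```

For a perfectly linear series such as 1, 2, 3, the extrapolation simply continues the line to 4.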

Initialization
The first step of LPE is to randomly initialize a population containing 3N individuals in the D-dimensional search space. We denote each population of N individuals by X^g = {x_1^g, x_2^g, ..., x_N^g}, i = 1, 2, ..., N and g = 0, 1, 2, ..., g_max, where g is the generation and g_max is the maximum number of generations. The i-th individual in the j-th dimension is initialized using the following equation:

x_{i,j} = low_j + r · (up_j − low_j),  (7)
where r represents a uniformly distributed random number between 0 and 1, and low_j and up_j are the lower and upper bounds of the j-th variable, respectively. Next, we sort the 3N individuals according to their objective function values and divide them into three populations. The top N individuals are taken as the first population X^0 (g = 0), the middle N individuals as the second population X^1 (g = 1), and the last N individuals as the third population X^2 (g = 2). These three populations form the initial population series, which is treated as a time series for predicting the next generation of populations.
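The initialization and split described above can be sketched as follows (an illustrative Python/NumPy sketch, not the authors' code; the sphere function stands in for an arbitrary objective to be minimized):

```python
import numpy as np

rng = np.random.default_rng(0)

def initialize(N, D, low, up, f):
    """Sample 3N individuals uniformly in the bounds, sort them by objective
    value, and split into three N-sized generations X0 (best), X1, X2."""
    pop = low + rng.random((3 * N, D)) * (up - low)   # x_ij = low_j + r*(up_j - low_j)
    order = np.argsort([f(ind) for ind in pop])       # ascending objective value
    pop = pop[order]
    X0, X1, X2 = pop[:N], pop[N:2 * N], pop[2 * N:]   # top, middle, last N
    return X0, X1, X2

sphere = lambda x: float(np.sum(x ** 2))              # assumed test objective
X0, X1, X2 = initialize(N=5, D=3, low=np.full(3, -5.0), up=np.full(3, 5.0), f=sphere)
```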

Reproduction
Based on the linear least-square fitting technique, a new reproduction operator called the linear fitting prediction operator (lfp operator) has been developed. The lfp operator is defined as follows. Let X^{g−2}, X^{g−1}, and X^g (g ≥ 2) represent three consecutive populations, and let three individuals x_{r1}^{g−2}, x_{r2}^{g−1}, and x_{r3}^{g} be randomly chosen from X^{g−2}, X^{g−1}, and X^g, respectively. Let u_i^{g+1} denote the i-th trial individual. Then, according to Eq. (6), the lfp operator is

u_{i,j}^{g+1} = (4 x_{r3,j}^{g} + x_{r2,j}^{g−1} − 2 x_{r1,j}^{g−2})/3,  j = 1, 2, ..., D,

where x_{r1,j}^{g−2}, x_{r2,j}^{g−1}, and x_{r3,j}^{g} denote the j-th dimension of the three individuals x_{r1}^{g−2}, x_{r2}^{g−1}, and x_{r3}^{g}, respectively.
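A population-level sketch of the lfp operator (illustrative Python/NumPy; function and variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)

def lfp_reproduce(X_gm2, X_gm1, X_g):
    """For each of the N trial individuals, pick one random parent from each
    of the three stored generations and extrapolate the fitted line
    dimension-wise: u = (4*x^g + x^(g-1) - 2*x^(g-2)) / 3."""
    N = X_g.shape[0]
    r1 = rng.integers(0, N, size=N)   # indices into the oldest generation
    r2 = rng.integers(0, N, size=N)   # indices into the middle generation
    r3 = rng.integers(0, N, size=N)   # indices into the newest generation
    return (4.0 * X_g[r3] + X_gm1[r2] - 2.0 * X_gm2[r1]) / 3.0
```

Because the operator is affine in its three inputs, a stationary population reproduces itself, while a steadily moving one is extrapolated forward.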

Selection operator
In this phase of LPE, the individual with the better fitness value is selected from the trial individual u_i^g and the target individual x_i^g according to a greedy selection strategy, which is used to update the population. If the objective function f is to be minimized, whether a newly generated trial individual u_i^g passes to the next iteration is decided by the greedy rule

x_i^{g+1} = u_i^g if f(u_i^g) ≤ f(x_i^g), otherwise x_i^{g+1} = x_i^g.  (9)

Algorithm 1 Pseudocode of LPE
Input: N, D, g_max, low_j, up_j
Output: Optimal value of the objective function f.
Initialization: Initialize X^0, X^1, and X^2 using equation (7). The three populations form the series PopX = {X^0, X^1, X^2}.
Linear prediction:
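The greedy selection can be sketched as follows (illustrative Python; f is the objective to minimize, and the sphere function below is an assumed example):

```python
import numpy as np

def greedy_select(X, U, f):
    """Greedy selection for minimization: keep the trial individual U[i]
    wherever it does not worsen the objective, else keep the target X[i]."""
    X_next = X.copy()
    for i in range(X.shape[0]):
        if f(U[i]) <= f(X[i]):
            X_next[i] = U[i]
    return X_next

sphere = lambda x: float(np.sum(x ** 2))      # assumed test objective
X = np.array([[2.0, 2.0], [0.5, 0.5]])
U = np.array([[1.0, 1.0], [1.0, 1.0]])
X_next = greedy_select(X, U, sphere)          # the trial wins only in row 0
```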

Motivation
The original LPE algorithm has no parameter other than the population size and is easy to implement. It has no mutation or crossover operator like other evolutionary algorithms. However, experimental analysis has shown that LPE is prone to getting stuck in local optima in some cases, owing to the inadequate search ability of the linear least-square fitting model. Although it is a good optimizer for lower-dimensional problems in terms of convergence rate and solution accuracy, its robustness is merely competitive.
For higher-dimensional problems, LPE fails to match the accuracy and convergence of other meta-heuristic algorithms. There is therefore scope to improve the search ability of the linear model and make it a better optimizer. Hence, we introduce a nonlinear least-square fitting model in place of the linear fitting operator to predict the next generation of the population, so that a larger area of the search space can be explored. By using the nonlinear least-square fitting model, the main aim of this paper is to increase the exploration capability and convergence rate of the LPE algorithm.

Mathematical model
In this subsection, the details of the nonlinear least-square fitting model are described.

Nonlinear least-square fitting model (Mullineux 2008; Wu 2002): Assume that (t_1, x_1), (t_2, x_2), ..., (t_n, x_n) are n input data points. The nonlinear least-square fitting model can be expressed as

x_i = a_0 + a_1 cos(ω t_i) + a_2 sin(ω t_i) + E_i,  (10)

where a_0, a_1, and a_2 are coefficients to be determined, E_i represents the residual of the i-th pair of data, and ω lies in (0, π).

Estimation of the constants: We use the least-square method to estimate a_0, a_1, and a_2. Thus, the least-square criterion is

S = Σ_{i=1}^{n} E_i^2 = Σ_{i=1}^{n} (x_i − a_0 − a_1 cos(ω t_i) − a_2 sin(ω t_i))^2.  (11)

The above equation is also called the sum-of-squares loss. Our aim is to find the values of a_0, a_1, and a_2 that minimize this error. To find the undetermined coefficients, we differentiate S partially with respect to a_0, a_1, and a_2 and set the derivatives to zero; the values of a_0, a_1, and a_2 can then be obtained from the resulting system of normal equations.

Now, as in LPE, let the four equally spaced data points be (1, x_1), (2, x_2), (3, x_3), and (4, x̂_4), respectively. The values of a_0, a_1, and a_2 depend on the choice of ω. With different values of ω ∈ (0, π) in steps of π/6, we obtain the fourth predicted data point using equation (13). With E_i = 0, the models corresponding to the different values of ω are shown in Table 2.
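Assuming a trigonometric basis of the form a_0 + a_1 cos(ωt) + a_2 sin(ωt) (our reading of the three-coefficient model with frequency ω; the helper name trig_predict is illustrative), the fourth-point prediction for a given ω can be sketched as:

```python
import numpy as np

def trig_predict(x1, x2, x3, omega):
    """Fit x(t) = a0 + a1*cos(omega*t) + a2*sin(omega*t) exactly through the
    three points (1, x1), (2, x2), (3, x3) (three unknowns, three equations,
    so each residual E_i = 0) and return the predicted value at t = 4."""
    t = np.array([1.0, 2.0, 3.0])
    A = np.column_stack([np.ones(3), np.cos(omega * t), np.sin(omega * t)])
    a = np.linalg.solve(A, np.array([x1, x2, x3]))
    return float(a @ np.array([1.0, np.cos(4 * omega), np.sin(4 * omega)]))
```

For example, with ω = π/2 the system reduces to x̂_4 = x_1 − x_2 + x_3, so each candidate ω yields a different closed-form predictor, matching the family of models compared in Table 2.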

Model selection
Various models with different values of ω have been evaluated on the CEC2014 benchmark functions (Liang et al. 2013). Here we consider the 30-dimensional CEC2014 benchmark functions; the best values (BEST), mean values (MEAN), and standard deviation values (STD) are recorded over 51 independent runs and presented in Table 3, with the best value for each indicator in bold. Table 3 shows the results on all 30 benchmark functions of the CEC2014 set in the 30-dimensional search space, where 'M' denotes the model and 'BP' denotes the benchmark problems. Analysis of the obtained results shows that model-6 performs best on the best values, with an average ranking of 1.4, while the average rankings of model-1, model-2, model-3, model-4, and model-5 are 2.3, 4.06, 4.33, 2.9, and 6, respectively. The average ranks of the mean and standard deviation values for model-6 are 1.43 and 1.46, which are lower than those of the other models. Table 4 shows the average ranks of all models. From Table 3, we can see that model-6 outperforms the other models; it is therefore adopted as the reproduction operator of ILPE.
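The average-rank comparison used for model selection can be sketched as follows (illustrative Python/NumPy; ties are broken arbitrarily rather than averaged, a simplification of the usual ranking):

```python
import numpy as np

def average_ranks(errors):
    """errors[k, m]: error of model m on benchmark k (lower is better).
    Rank the models within each benchmark (1 = best) and average the ranks
    over all benchmarks; the model with the lowest average rank wins."""
    ranks = np.argsort(np.argsort(errors, axis=1), axis=1) + 1
    return ranks.mean(axis=0)
```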

Pseudocode
As mentioned above, the original LPE has no parameter other than the population size. The proposed ILPE uses a nonlinear least-square fitting model to achieve a better balance between exploration and exploitation. As in the original LPE, after the three populations are initialized (equation (7)), ILPE generates the offspring using the reproduction operator (model-6) and the selection operator (formula (9)). In contrast to LPE, ILPE uses the nonlinear least-square fitting model as the reproduction operator. If all values of the data series are equal, ILPE uses a random perturbation to generate the subsequent value in order to avoid stagnation in local minima; otherwise, the ILPE reproduction operator is used to predict the next value. If a new solution exceeds the feasible region, it is replaced by a random solution in the feasible space.
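The stagnation guard and bound repair described above can be sketched per dimension as follows (illustrative Python; predict stands for any single-dimension prediction operator, and the tolerance eps is our assumption):

```python
import numpy as np

rng = np.random.default_rng(2)

def ilpe_trial(x_old, x_mid, x_new, low, up, predict, eps=1e-12):
    """Build one trial individual dimension-wise. If the three generation
    values coincide (stagnation), draw a random value inside the bounds;
    otherwise apply the prediction operator. Infeasible predictions are
    replaced by random feasible values."""
    u = np.empty_like(x_new)
    for j in range(x_new.size):
        if np.ptp([x_old[j], x_mid[j], x_new[j]]) < eps:   # all three equal
            u[j] = low[j] + rng.random() * (up[j] - low[j])
        else:
            u[j] = predict(x_old[j], x_mid[j], x_new[j])
        if not (low[j] <= u[j] <= up[j]):                  # bound repair
            u[j] = low[j] + rng.random() * (up[j] - low[j])
    return u
```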

Algorithm 2 Pseudocode of ILPE
Input: N, D, g_max, low_j, up_j, δ
Output: Optimal value of the objective function f.
Initialization: Initialize X^0, X^1, and X^2 using equation (7). The three populations form the series PopX = {X^0, X^1, X^2}.
ILPE prediction:

Parameter setting
The parameter setting of all the considered algorithms is shown in Table 5. Each algorithm runs independently 51 times with a population size of 50. The maximum number of iterations is set to 2000, and the dimension of each problem is set to 30.

Comparison of ILPE with LPE algorithm on CEC2014 benchmark functions
In this subsection, numerical results of ILPE and the original LPE on the CEC2014 benchmark functions are reported. The average rank for the best values of LPE is 5.9, and the average ranks for the MEAN and STD values of ILPE are superior to those of LPE. Table 7 shows the average ranking of the considered algorithms. These results indicate that ILPE is superior in terms of robustness and solution accuracy.

Comparison of ILPE with other meta-heuristic algorithms on CEC2014 benchmark functions
In this subsection, the experimental results of ILPE and other meta-heuristic algorithms on the CEC2014 benchmark functions are reported; the average ranks of the best values of the competing algorithms are 4.47, 5.2, 5.9, 5.07, 9.43, 9.77, 7.07, and 7.37, respectively. The average rank of the mean values of ILPE, 1.60, is superior to those of the other considered algorithms. The average rank of ILPE in terms of standard deviation is 3.73, which is worse only than that of DE but better than those of the other nine algorithms. Thus, from Tables 7 and 10, we can conclude that ILPE is a very good optimizer compared with GPE, GPEae, GPEed, PSO, ABC, GWO, GA, and TLBO, and it shows highly competitive robustness with its competitors.

Comparison of ILPE with other meta-heuristic algorithms on CEC2017 benchmark function
As in the numerical experiments above, the experimental results of ILPE and the other meta-heuristic algorithms on the CEC2017 benchmark functions are reported; the average ranks of the best values of the competing algorithms include 3.63, 5.6, 4.47, 9.33, 9.17, 7.5, and 6.83. From the average mean ranks, we can see that the rank of ILPE is 1.43, which is superior to those of the other considered algorithms. ILPE ranks above GPE, GPEae, GPEed, PSO, ABC, GWO, GA, and TLBO but below DE when considering the standard deviation. Thus, from the statistical results shown in Tables 9 and 11, we can see that ILPE performs well compared with GPE, GPEae, GPEed, PSO, DE, ABC, GWO, GA, and TLBO.

Convergence analysis
A comparison of the convergence behavior of the proposed ILPE and the other considered algorithms can be made through convergence curves. The convergence curves of all considered algorithms on selected benchmark functions from the CEC2014 and CEC2017 sets, including F1, F4, F8, F13, F15, and F17, are given in Figs. 1 and 2.

Statistical analysis
In this section, the Wilcoxon signed-rank test (Derrac et al. 2011; Gibbons and Chakraborti 2020) is used to analyze the significance of the differences between ILPE and its competitors. The Wilcoxon signed-rank test is a nonparametric test that evaluates whether the difference between ILPE and another meta-heuristic algorithm is sufficiently notable. Here, the test uses the optimal values (BEST) from Tables 10 and 11 to compare ILPE with the other considered algorithms. The test is performed pairwise at the 5% significance level under the null hypothesis; if the p-value is less than 0.05, the null hypothesis is rejected. Let '+', '≈', and '−' denote that ILPE performs better than, comparably to, and worse than the other considered algorithm, respectively. Of the 300 comparisons on the CEC2014 benchmark set, 273 have positive symbols; of the 300 comparisons on the CEC2017 benchmark functions, 283 show positive signs. The comparison results, with p-values, are shown in Tables 12 and 13, where 'BP' denotes the benchmark problems. From all of these analyses, it can be concluded that ILPE performs remarkably better than the other considered algorithms, excelling in accuracy, robustness, and reliability. A more intensive statistical analysis was performed on the numerical results of ILPE, GPE, GPEae, GPEed, LPE, PSO, DE, ABC, GWO, GA, and TLBO. Box plots depict the empirical distributions of the data. For the CEC2014 benchmark functions, box plots of the BEST, MEAN, and STD values for all of these algorithms are given in Fig. 3; the box plots for the CEC2017 benchmark functions are presented in Fig. 4. The box-plot analysis shows that ILPE is superior to the other considered algorithms on both the CEC2014 and CEC2017 benchmark functions.
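The '+/≈/−' bookkeeping used in the comparison tables (not the Wilcoxon test itself, which would be run with a statistics library such as SciPy) can be sketched as follows; the function name and tolerance are illustrative:

```python
def wtl_counts(ilpe_vals, other_vals, tol=1e-8):
    """Count '+', '~', '-' outcomes of ILPE against one competitor:
    '+' when the ILPE value is smaller (better, for minimization),
    '~' when the two values agree within tol, '-' otherwise."""
    wins = ties = losses = 0
    for a, b in zip(ilpe_vals, other_vals):
        if abs(a - b) <= tol:
            ties += 1
        elif a < b:
            wins += 1
        else:
            losses += 1
    return wins, ties, losses
```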

Computational complexity
The computational complexity of a meta-heuristic algorithm indicates how long the algorithm requires to compute the final outcome. The computational complexities of the ILPE and LPE algorithms are expressed in big-O notation and depend on the initialization, fitness evaluation, position update mechanism, mutation, and greedy selection. For all algorithms, the computational complexity of initialization is O(N×D), of a fitness evaluation O(N), and of the greedy selection O(N). In LPE, the prediction mechanism takes O(N×D) time; therefore, the total computational complexity is O(N×D×M), where D is the dimension of the search space, N is the population size, and M is the maximum number of generations. Although ILPE is based on a nonlinear least-square fitting approach, the obtained model is linear in its coefficients; therefore, like LPE, the ILPE prediction process also takes O(N×D) time, and the total computational complexity of ILPE is O(N×D×M). Similarly, the computational complexity of GPE is also O(N×D×M).

Conclusion
LPE is a new and competitive evolutionary algorithm with simple code, few parameters, and extensive search capability. This article develops a new variant called the improved linear prediction evolution algorithm (ILPE). The algorithm treats the population series of an evolutionary algorithm as a time series and uses a nonlinear least-square fitting model as a reproduction operator to predict the next generation. Experiments on the CEC2014 and CEC2017 benchmark functions show that ILPE outperforms LPE and is highly competitive with other state-of-the-art algorithms.

Data availability: This manuscript has no associated data.

Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.