Dual mutations collaboration mechanism with elites guiding and inferiors eliminating techniques for differential evolution

Differential evolution (DE) is a powerful evolutionary algorithm for global optimization problems. In general, appropriate mutation strategies and a proper equilibrium between global exploration and local exploitation are critical to the performance of DE. With this in mind, we present a novel DE variant, abbreviated DMIE-DE, which further enhances the optimization capacity of DE through a dual mutations collaboration mechanism with elites guiding and inferiors eliminating techniques. More specifically, an explorative mutation strategy, DE/current-to-embest, in which an elite individual serves as part of the difference vector, and an exploitative mutation strategy, DE/ebest-to-rand, in which an elite individual is selected as the base vector, are employed simultaneously to balance the local and global performance of the whole population, instead of the single mutation strategy used in the classical DE algorithm. The control parameters F and CR for these mutation strategies are updated adaptively, based on a rational probability distribution model and the successful experience from previous iterations, to supplement the optimization ability of DMIE-DE. Moreover, an inferior solutions eliminating technique is embedded to enhance the convergence speed and to compensate for the cost in fitness evaluations during the evolution process. To evaluate the performance of DMIE-DE, experiments are conducted against five state-of-the-art DE variants on 29 test functions from the CEC2017 benchmark set. The experimental results indicate that the performance of DMIE-DE is significantly better than, or at least comparable to, that of the considered DE variants.


Introduction
Differential evolution (DE), first proposed by Storn and Price (1997), is a simple yet powerful evolutionary algorithm. DE has exhibited notable performance owing to its simple structure, rapid convergence and strong robustness, and it has been applied successfully in many domains of science and engineering, such as neural networks (Su et al. 2019; Baioletti et al. 2020), power systems (Sakr et al. 2017; Reddy and Bijwe 2019), medicine (Nunes et al. 2017; Song et al. 2019; Hosny et al. 2020), other application areas (Paul and Das 2015; Tarkhaneh and Shen 2019) and many other practical optimization problems (Balamurugan and Muthukumar 2019; Huang et al. 2020).
Research on DE mainly follows four directions. The first is to implement rational mechanisms for adjusting the core parameters of DE. For example, a fitness-adaptive scheme for the scale factor F and crossover rate CR was developed in Ghosh et al. (2011) to maintain the exploration ability and help the population escape from premature convergence. Zhu et al. (2013) proposed an adaptive population correction method to improve the optimization ability of DE. Draa et al. (2015) employed a cosine distribution and a sine distribution to generate F and CR, respectively. AGDE (Mohamed and Mohamed 2019) utilized a predetermined candidate pool with a novel and effective adaptation scheme to generate appropriate values of CR. The second direction involves improving the mutation operator, which is one of the most significant operators in DE. Ghosh et al. (2011) proposed a trigonometric mutation strategy to accelerate convergence. Zhang and Sanderson (2009) developed a DE variant with an effective mutation strategy called DE/current-pbest/1, which is a milestone in the development of the DE algorithm. EFADE (Mohamed and Suganthan 2018) provided a new triangular mutation operator based on the convex combination vector of the triplet defined by three randomly chosen vectors. The third direction is to hybridize DE with other heuristic algorithms, including particle swarm optimization (Thangaraj et al. 2011), the artificial bee colony algorithm (Abraham et al. 2012), the grey wolf optimization algorithm (Luo and Liu 2020), the whale optimization algorithm (Luo and Shi 2019) and so on. The main purpose of hybridization is to combine the advantages of DE and the other algorithms to enhance the optimization capacity. The fourth direction adds an extra framework to DE. For example, a framework has been proposed to detect stagnation intelligently.
When the population was stagnant, the vectors involved in mutation were selected from the archive, and through this operation the quality of solutions was significantly improved. In the ADE-ALC algorithm (Fu et al. 2017), an ageing mechanism was introduced into the framework of DE to maintain the diversity of the population. Deng et al. (2020a) introduced a regeneration framework at the dimension level to alleviate the stagnation problem in DE.
No matter which of the above improvement strategies is used, an appropriate trade-off between local exploitation and global exploration is an important guideline for algorithm design, especially for the mutation operator, since excessive emphasis on one of them adversely affects the other. Based on this consideration, we propose a novel DE variant with two mutation strategies (DE/current-to-embest and DE/ebest-to-rand) and an inferior solution eliminating technique to further enhance DE's optimization ability. More specifically, in the strategy DE/current-to-embest, the current individual is taken as the base vector, while a randomly selected top-ranking elite individual and a middle-ranking individual serve as part of the difference vectors to guide the current individual towards promising areas, which benefits the population diversity without leading to premature convergence. The strategy DE/ebest-to-rand chooses an elite individual as the base vector and uses two randomly selected individuals to guide the elite individual in exploiting its neighbourhood. These two mutation strategies are employed simultaneously to generate the mutant vectors. Furthermore, the parameter adaptation scheme uses a Cauchy distribution model and a sinusoidal formula, both associated with the successful experience from previous iterations, to update the scale factor F and the crossover rate CR automatically.
The inferior solution eliminating technique is introduced to reduce the population size at certain generations by eliminating several badly performing individuals, which enhances the convergence speed and compensates for the cost in fitness evaluations during the evolution process. In summary, the main contributions of the proposed algorithm are as follows:
(1) Dual mutation strategies, DE/current-to-embest and DE/ebest-to-rand, are proposed and employed simultaneously to achieve an appropriate equilibrium between the global exploration and local exploitation abilities.
(2) The core control parameters F and CR are updated adaptively, on the basis of a rational probability distribution and the successful experience from previous iterations, to supplement the optimization ability of the algorithm.
(3) An inferior solution eliminating technique based on the principle of survival of the fittest is used to compensate for the cost in fitness evaluations and to speed up the population convergence.
To verify the effectiveness of the proposed DMIE-DE, it is compared with five state-of-the-art DE variants. Extensive experiments are carried out on the 30D, 50D and 100D benchmark functions from the CEC2017 test suite. The results indicate that the proposed DMIE-DE algorithm is superior to, or at least competitive with, the considered DE variants.
The remainder of this paper is arranged as follows. Section 2 reviews the basic DE. Section 3 briefly reviews some advanced methods related to our research. Section 4 outlines our approach for the proposed DMIE-DE algorithm. The simulation results, including the performance comparison with several DE variants on benchmark functions and the validation of the dual mutation strategies and the inferior solution eliminating technique, are described in Sect. 5. The conclusion is provided in Sect. 6.

Basic DE algorithm
In this section, the four basic operations of DE are described: initialization, mutation, crossover and selection. The global minimization task of D dimensions is considered in this paper and can be defined as follows:
$$f(\vec{X}^{*}) = \min_{\vec{X} \in S} f(\vec{X}), \quad \vec{X} = (x_1, x_2, \ldots, x_D) \tag{1}$$
where $f(\cdot)$ denotes the objective function, $\vec{X}$ is a D-dimensional solution vector and $\vec{X}^{*}$ is the global optimum of the objective function. The variable $S$ denotes the search space, and the target vectors are restricted to the search space by a predefined lower bound $\vec{X}_{\min} = (x_{\min,1}, x_{\min,2}, \ldots, x_{\min,D})$ and an upper bound $\vec{X}_{\max} = (x_{\max,1}, x_{\max,2}, \ldots, x_{\max,D})$.

Initialization
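To begin with, an initial population consisting of NP individuals is produced, each of which is expressed as a D-dimensional vector:
$$\vec{X}_i^{0} = (x_{i,1}^{0}, x_{i,2}^{0}, \ldots, x_{i,D}^{0}), \quad i = 1, 2, \ldots, NP \tag{2}$$
The initial population can be produced randomly by a uniform distribution within the search space via the following formula:
$$x_{i,j}^{0} = x_{\min,j} + rand[0,1] \cdot (x_{\max,j} - x_{\min,j}), \quad j = 1, 2, \ldots, D \tag{3}$$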
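The uniform initialization described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `init_population` and the example bounds are assumptions, not part of the original algorithm description.

```python
import random


def init_population(np_size, lower, upper, rng=random):
    """Sample NP vectors uniformly inside [lower_j, upper_j] for every dimension j."""
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(np_size)
    ]


# Example: 10 individuals in a 3-dimensional box (bounds chosen for illustration).
rng = random.Random(0)
pop = init_population(10, [-100.0] * 3, [100.0] * 3, rng)
```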

Mutation
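At each iteration, the mutation operator is applied to each target vector $\vec{X}_i^g$ to yield the corresponding mutant vector $\vec{V}_i^g$. The five most frequently used mutation strategies are listed below.
1) DE/rand/1:
$$\vec{V}_i^g = \vec{X}_{r_1}^g + F \cdot (\vec{X}_{r_2}^g - \vec{X}_{r_3}^g) \tag{4}$$
2) DE/best/1:
$$\vec{V}_i^g = \vec{X}_{best}^g + F \cdot (\vec{X}_{r_1}^g - \vec{X}_{r_2}^g) \tag{5}$$
3) DE/current-to-best/1:
$$\vec{V}_i^g = \vec{X}_i^g + F \cdot (\vec{X}_{best}^g - \vec{X}_i^g) + F \cdot (\vec{X}_{r_1}^g - \vec{X}_{r_2}^g) \tag{6}$$
4) DE/best/2:
$$\vec{V}_i^g = \vec{X}_{best}^g + F \cdot (\vec{X}_{r_1}^g - \vec{X}_{r_2}^g) + F \cdot (\vec{X}_{r_3}^g - \vec{X}_{r_4}^g) \tag{7}$$
5) DE/rand/2:
$$\vec{V}_i^g = \vec{X}_{r_1}^g + F \cdot (\vec{X}_{r_2}^g - \vec{X}_{r_3}^g) + F \cdot (\vec{X}_{r_4}^g - \vec{X}_{r_5}^g) \tag{8}$$
where the indices $r_1, r_2, r_3, r_4$ and $r_5 \in \{1, 2, \ldots, NP\}$ are mutually distinct integers that also differ from the index $i$. The vector $\vec{X}_{best}^g$ represents the best individual in the population at the gth generation. The parameter F is called the scaling factor, which is a positive constant within the range [0,1].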
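As an illustration of how the classical strategies differ only in their base vector and difference pairs, the following Python sketch implements the five standard forms. The helper names (`pick_distinct`, `combine`, `mutate`) are hypothetical, and the population is assumed to contain at least six individuals so that five distinct indices other than i exist.

```python
import random


def pick_distinct(np_size, i, k, rng):
    """Return k mutually distinct indices in [0, NP), all different from i."""
    return rng.sample([r for r in range(np_size) if r != i], k)


def combine(base, pairs, F):
    """Compute base + F * sum of (a - b) over the given difference pairs."""
    return [
        b + F * sum(a[j] - c[j] for a, c in pairs)
        for j, b in enumerate(base)
    ]


def mutate(pop, i, best, F, strategy, rng):
    """Build one mutant vector for target index i with a classical strategy."""
    r1, r2, r3, r4, r5 = (pop[r] for r in pick_distinct(len(pop), i, 5, rng))
    x = pop[i]
    if strategy == "rand/1":
        return combine(r1, [(r2, r3)], F)
    if strategy == "best/1":
        return combine(best, [(r1, r2)], F)
    if strategy == "current-to-best/1":
        return combine(x, [(best, x), (r1, r2)], F)
    if strategy == "best/2":
        return combine(best, [(r1, r2), (r3, r4)], F)
    if strategy == "rand/2":
        return combine(r1, [(r2, r3), (r4, r5)], F)
    raise ValueError(f"unknown strategy: {strategy}")
```

With F = 0 every strategy collapses to its base vector, which is a convenient sanity check when experimenting with the sketch.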

Crossover
The population diversity is enhanced by the crossover operator. The trial vector $\vec{U}_i^g = (u_{i,1}^g, u_{i,2}^g, \ldots, u_{i,D}^g)$ is produced by recombining the variables of the target vector $\vec{X}_i^g$ and the mutant vector $\vec{V}_i^g$. The most commonly implemented binomial crossover operator is formulated as follows:
$$u_{i,j}^g = \begin{cases} v_{i,j}^g, & \text{if } rand[0,1] \le CR \text{ or } j = j_{rand} \\ x_{i,j}^g, & \text{otherwise} \end{cases} \tag{9}$$
where $rand[0,1]$ is a random number drawn from the uniform distribution on [0,1]. The parameter CR controls the fraction of components copied from the mutant vector, and $j_{rand} \in \{1, 2, \ldots, D\}$ is a random integer which ensures that at least one component of the trial vector is inherited from the mutant vector.
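The role of $j_{rand}$ is easiest to see in code. The sketch below is a minimal Python version of binomial crossover; the function name and the use of 0-based indices are illustrative choices.

```python
import random


def binomial_crossover(target, mutant, CR, rng):
    """Copy each component from the mutant with probability CR.

    The component at a random index j_rand always comes from the mutant,
    so the trial vector can never be an exact copy of the target.
    """
    D = len(target)
    j_rand = rng.randrange(D)
    return [
        mutant[j] if (rng.random() <= CR or j == j_rand) else target[j]
        for j in range(D)
    ]
```

With CR close to 0 the trial inherits essentially only the forced component, whereas CR = 1 reproduces the mutant vector.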

Selection
The population is rebuilt by the selection operation between the target vector $\vec{X}_i^g$ and the trial vector $\vec{U}_i^g$. The vector with the better fitness value survives as the offspring. For a minimization problem, the selection operator is performed as follows:
$$\vec{X}_i^{g+1} = \begin{cases} \vec{U}_i^g, & \text{if } f(\vec{U}_i^g) \le f(\vec{X}_i^g) \\ \vec{X}_i^g, & \text{otherwise} \end{cases} \tag{10}$$
To sum up, the pseudocode of the classical DE algorithm is given in Algorithm 1.
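For orientation, a compact Python sketch of a classical DE/rand/1/bin loop is given below. It is not a transcription of Algorithm 1: the default settings (NP = 20, F = 0.5, CR = 0.9, 200 generations), the clipping of mutant components to the bounds and the `sphere` test function are all assumptions made for illustration.

```python
import random


def classic_de(f, lower, upper, np_size=20, F=0.5, CR=0.9, max_gen=200, seed=0):
    """Minimal DE/rand/1/bin with greedy one-to-one selection."""
    rng = random.Random(seed)
    D = len(lower)
    pop = [[lower[j] + rng.random() * (upper[j] - lower[j]) for j in range(D)]
           for _ in range(np_size)]
    fit = [f(x) for x in pop]
    for _ in range(max_gen):
        new_pop, new_fit = [], []
        for i in range(np_size):
            r1, r2, r3 = rng.sample([r for r in range(np_size) if r != i], 3)
            j_rand = rng.randrange(D)
            trial = []
            for j in range(D):
                if rng.random() <= CR or j == j_rand:
                    v = pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
                    v = min(max(v, lower[j]), upper[j])  # assumed bound handling
                else:
                    v = pop[i][j]
                trial.append(v)
            f_trial = f(trial)
            # Selection: the trial replaces the target if it is not worse.
            if f_trial <= fit[i]:
                new_pop.append(trial)
                new_fit.append(f_trial)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
    best = min(range(np_size), key=fit.__getitem__)
    return pop[best], fit[best]


def sphere(x):
    """Simple unimodal test function with optimum 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = classic_de(sphere, [-5.0] * 5, [5.0] * 5)
```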

Related works
Research on DE has reached an impressive state over the past two decades, and many variants have been proposed to enhance DE's optimization capability. In this section, we briefly review some modified methods related to our research. For a comprehensive survey on DE, please refer to Das et al. (2016) and Opara and Arabas (2019).
An appropriate trial vector generation strategy is of great significance to the performance of DE, and many scholars have proposed various improvement strategies. To modify the DE/current-to-best/1 approach, Zhang and Sanderson (2009) introduced a novel mutation strategy, DE/current-to-pbest, in their research on JADE. The innovation of this strategy lies in an optional external archive which can provide information about the evolution direction. Zheng et al. (2017) proposed a novel mutation strategy, referred to as DE/current-to-ci_mbest/1, which utilized the collective information of the m best individuals to form the difference vector in mutation. Efforts have also been made to utilize multiple mutation strategies during the evolution process. For example, Li et al. (2017a) introduced three mutation strategies, namely DE/current-to-cbest, DE/current-to-rbest/1 and DE/current-to-fbest. In their method, the whole population is divided into three subpopulations, and the individuals in each subpopulation use different mutations based on their fitness values. Two novel mutation operators, DE/current-to-ord_pbest and DE/current-to-ord_best, have also been introduced to enhance DE's performance. Li et al. (2020) presented two variants of the classical DE/rand/2 and DE/best/2 strategies, referred to as DE/e-rand/2 and DE/e-best/2, respectively, and used these two mutations to achieve a balance between global exploration and local exploitation. Deng et al. (2020b) employed an explorative mutation technique, DE/seeds-to-seeds, and an exploitative mutation strategy, DE/seeds-to-rand, to enhance the optimization ability of DE.
The control parameters, especially the scale factor F and the crossover rate CR, are also extremely important to DE's optimization capability. However, suitable parameter values depend strongly on the optimization problem at hand; therefore, researchers have introduced various adaptive or self-adaptive parameter adjustment methods to alleviate this problem. For instance, Draa et al. (2015) introduced a sinusoidal differential evolution, namely SinDE, which utilized two sinusoidal formulas to adjust the values of the scaling factor and the crossover rate. Recently, Draa et al. (2019) presented a new variant of SinDE, which utilized a compound sine-based formula to adjust F and CR. Compared with SinDE, the new version makes the parameter variation less monotone. Yu et al. (2013) proposed a two-level adaptive parameter control strategy. More specifically, the individual-level parameters were updated according to not only each individual's fitness value but also its distance from the best individual, and the population-level parameters were updated adaptively during the evolution process. Meng et al. (2018) introduced a DE variant called PALM-DE, which generates new parameters with an adaptive learning mechanism to ease the inconvenience of selecting control parameters. However, this variant relied heavily on the number of individuals to generate a suitable value for CR. Thus, on the basis of Meng et al. (2018), Meng et al. (2019) introduced a new parameter-adaptive DE variant called PaDE, which resolves inappropriate adaptation schemes with a novel grouping strategy.
To summarize, improvements to mutation strategies and control parameters are promising research areas, and they provide the basis for the DMIE-DE variant proposed in this paper.

Description of DMIE-DE
In this section, we describe the proposed DMIE-DE algorithm in detail. Firstly, we introduce the design motivation of DMIE-DE. Then, we describe the new mutation method with its two strategies, DE/current-to-embest and DE/ebest-to-rand, as well as the adaptation schemes for the main parameters. Lastly, the inferior solution eliminating technique is described.

Motivation
A conspicuous flaw of DE variants is that it is difficult to coordinate the global exploration ability and the local exploitation ability. Global and local search conflict with each other, yet both are needed when solving optimization problems. The mutation operator in DE is generally composed of a base vector and difference vectors. The base vector determines the reference point of the search, while the difference vectors provide the search direction and perturb the base vector. The classical mutation strategy DE/rand/1 is said to be the most frequently used scheme (Das and Suganthan 2011) and maintains outstanding global exploration ability, but it tends to cause stagnation because of its random manner. Studies indicate that greedy strategies, like DE/best/1 and DE/best/2, have higher convergence speed and strong exploitative ability (Li et al. 2017b), but they easily cause premature convergence. Therefore, if an algorithm employs two mutation strategies with different emphases at the same time, one focusing on exploitation and the other on exploration, to generate offspring and passes the better one to the next generation, it may improve accuracy and robustness because the explorative and exploitative advantages are combined. However, the problem with this scheme is that the number of fitness evaluations is also doubled. To overcome this problem, we can eliminate some badly performing individuals at certain times during the evolution process.

Dual mutations collaboration mechanism with elites guiding technique
In order to achieve a proper balance between global exploration and local exploitation, we propose dual mutation strategies called DE/current-to-embest and DE/ebest-to-rand. The first mutation, DE/current-to-embest, is relatively explorative to maintain the population diversity, and is formulated as Eq. (11). From Eq. (11), we can see that this strategy selects the current individual $\vec{X}_i$ as the base vector and that two difference vectors are utilized. More specifically, $\vec{X}_{ebest}$ is called the elite individual, which is randomly selected from the top 10% of individuals with better fitness values in the current population to guide the evolution direction. The vector $\vec{X}_{mbest}$ is called the medium individual, which is randomly selected from a group centred on the middle-ranking individual according to the fitness values; the group size is D/5, where the parameter D is the problem dimension. $\vec{X}_r$ is a random individual from the union $P \cup A$, where $P$ denotes the current population and $A$ represents an external archive used to store the inferior parents that failed in the selection process during the last iteration. Figure 1 illustrates this mutation strategy in a 2-D plane, in which the marker a denotes the vector $\vec{X}_{mbest} - \vec{X}_{r_2}$, and the markers b and c denote the other vectors shown in the figure. From the figure, we can draw two conclusions. On the one hand, the current individual serves as the search centre, which helps it get out of the current poor area. On the other hand, the strategy combines a relatively better solution, i.e. the elite individual, with a medium solution in the current population to guide the current individual to a more promising area, which contributes to a faster convergence speed.
The second mutation strategy, DE/ebest-to-rand, is designed to be relatively exploitative, and its formula can be written as follows:
$$\vec{V}_i = \vec{X}_{ebest} + F \cdot (\vec{X}_{r_1} - \vec{X}_{r_2}) \tag{12}$$
where the vector $\vec{X}_{ebest}$ is defined as in Eq. (11). The vector $\vec{X}_{r_1}$ is a random individual selected from the current population $P$, and $\vec{X}_{r_2}$ is a random individual from the union $P \cup A$. The schematic diagram for this strategy is shown in Fig. 2, in which the marker e denotes the difference vector $\vec{X}_{r_1} - \vec{X}_{r_2}$. From the figure, we can observe that this mutation strategy selects an elite individual as the base vector instead of the single best individual as in DE/best/1; the elite individual is expected to have better solutions around it, and this choice meanwhile alleviates premature convergence. What is more, the difference vector helps maintain the population diversity because of its random characteristic.
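The two strategies can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the display form of Eq. (11) is not reproduced in this text, so the form used in `current_to_embest` (current vector plus F times the elite difference plus F times the medium-minus-random difference) is an assumption modelled on DE/current-to-pbest. The helper names, the rounding of the 10% and D/5 group sizes, and the fact that the random picks are not forced to differ from each other are likewise assumptions.

```python
import random


def rank_indices(fitness):
    """Indices sorted from best (smallest fitness) to worst."""
    return sorted(range(len(fitness)), key=fitness.__getitem__)


def pick_elite(pop, order, rng, top_frac=0.1):
    """Random individual from the top 10% of the ranking (at least one)."""
    n_top = max(1, round(top_frac * len(pop)))
    return pop[rng.choice(order[:n_top])]


def pick_medium(pop, order, D, rng):
    """Random individual from a group of size D/5 centred on the median rank."""
    size = max(1, D // 5)
    start = max(0, len(order) // 2 - size // 2)
    return pop[rng.choice(order[start:start + size])]


def current_to_embest(pop, archive, i, order, F, rng):
    """ASSUMED form: X_i + F*(X_ebest - X_i) + F*(X_mbest - X_r)."""
    D = len(pop[i])
    x = pop[i]
    e = pick_elite(pop, order, rng)
    m = pick_medium(pop, order, D, rng)
    r = rng.choice(pop + archive)  # random individual from P union A
    return [x[j] + F * (e[j] - x[j]) + F * (m[j] - r[j]) for j in range(D)]


def ebest_to_rand(pop, archive, order, F, rng):
    """X_ebest + F*(X_r1 - X_r2), with r1 from P and r2 from P union A."""
    e = pick_elite(pop, order, rng)
    r1 = rng.choice(pop)
    r2 = rng.choice(pop + archive)
    return [e[j] + F * (r1[j] - r2[j]) for j in range(len(e))]
```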
At each generation, these two mutation strategies, i.e. DE/current-to-embest and DE/ebest-to-rand, are applied simultaneously to generate two trial vectors. The fitness values of these two trial vectors are calculated and compared, and the better one then competes against the target vector in the selection operation. This scheme is expected to enhance the optimization capacity and robustness of DMIE-DE because the exploration and exploitation strengths are complementary.
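The two-stage comparison can be written down directly. In the Python sketch below, the function name and the convention of returning the replaced parent (so that the caller can store it in the archive A) are illustrative assumptions; note that each call costs two fitness evaluations.

```python
def dual_trial_selection(target, f_target, trial_a, trial_b, f):
    """Keep the better of two trial vectors, then compare it with the target.

    Returns (survivor, survivor_fitness, replaced_parent_or_None).
    """
    fa, fb = f(trial_a), f(trial_b)  # two evaluations per target vector
    best_trial, f_best = (trial_a, fa) if fa <= fb else (trial_b, fb)
    if f_best <= f_target:
        return best_trial, f_best, target  # the losing parent may go to the archive
    return target, f_target, None
```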

Control parameters adaptation
The control parameters also play an important role in the effectiveness of DE. The scale factor F determines the search radius centred on the base vector, and the crossover rate CR controls the components inherited from the mutant vector. Different parameter adaptation schemes may be suitable for different optimization problems. In our work, we design a modification of the parameter adaptation scheme proposed in Zhang and Sanderson (2009).
At each generation, the scale factor $F_i$ for each individual is generated independently according to a Cauchy distribution model:
$$F_i \sim \mathrm{Cauchy}(\mu_F, \theta_F) \tag{13}$$
where $\mu_F$ is the location parameter, which determines the peak location of the distribution, and $\theta_F$ is the scale parameter, which specifies the half width at half maximum of the distribution. In our work, $\theta_F$ is fixed at 0.1, while $\mu_F$ is initialized to 0.3 and updated at each generation according to the previous successful experience by Eq. (14). In Eq. (14), the parameter $\rho$ is dynamically adjusted by a normal distribution, as given in Eq. (15), where the function abs(·) returns the absolute value and Normalrand(0, 1) is a random number drawn from the standard normal distribution. $S_F$ is the set of successful F values during the last iteration, and $\mathrm{mean}_{Lm}(\cdot)$ denotes the Lehmer mean function:
$$\mathrm{mean}_{Lm}(S_F) = \frac{\sum_{F \in S_F} F^2}{\sum_{F \in S_F} F} \tag{16}$$
According to Zheng et al. (2017), the Cauchy distribution is beneficial for the diversity of F and can avoid premature convergence to some extent. Moreover, the Lehmer mean is utilized to generate larger F values to enhance the population diversity.
Algorithm 2 (closing steps): Recount the population size: NP = NP − EP; end; else g = g + 1; end; Output: the current population and its size NP.
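A minimal Python sketch of the F sampling and the Lehmer mean is given below. Several details are assumptions rather than statements of the original method: the handling of out-of-range samples (resampling non-positive values and truncating at 1, as in JADE) and the blended update `(1 - rho) * mu + rho * lehmer_mean(successes)` are borrowed from the scheme of Zhang and Sanderson (2009), and `rho` is simply passed in by the caller because its exact generation rule is not reproduced here.

```python
import math
import random


def sample_F(mu_F, theta_F, rng):
    """Draw F from Cauchy(mu_F, theta_F) by inverting the Cauchy CDF.

    ASSUMED range handling (JADE-style): resample if F <= 0, truncate at 1.
    """
    while True:
        F = mu_F + theta_F * math.tan(math.pi * (rng.random() - 0.5))
        if F > 0.0:
            return min(F, 1.0)


def lehmer_mean(values):
    """Lehmer mean sum(v^2) / sum(v); it weights larger values more heavily."""
    return sum(v * v for v in values) / sum(values)


def update_mu(mu, successes, rho):
    """ASSUMED JADE-style blend of the old location and the Lehmer mean."""
    if not successes:
        return mu  # nothing succeeded in the last generation: keep mu
    return (1.0 - rho) * mu + rho * lehmer_mean(successes)
```

For instance, the Lehmer mean of {0.2, 0.8} is 0.68, noticeably above the arithmetic mean of 0.5, which illustrates why it tends to propagate larger F values.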
For the crossover rate CR, we employ a sine-based formula, given in Eq. (17). The parameter $\mu_{CR}$ is initialized to 0.8 and is updated by Eq. (18), in which the parameter $\rho$ has the same definition as in Eq. (15). The successful CR values are also stored and used to generate $\mu_{CR}$ through the Lehmer mean in Eq. (19). In Eq. (17), $\delta$ is set to 0.1 to control the rate of change caused by the sine term, and $rand[0,1]$ is a random number drawn from the uniform distribution on [0,1]. The design philosophy of the CR adaptation is that better CR values are more likely to generate individuals with higher chances of survival, and these better values should therefore be inherited by the next generation.

Inferiors eliminating technique
During the selection process, the fitness value of each trial vector must be evaluated for comparison with the target vector, which is time-consuming. When the proposed dual mutation strategy is applied, the number of function evaluations per generation is doubled. To offset this extra cost, the inferior solution eliminating technique is introduced. More specifically, some individuals with bad fitness values are removed from the population at a fixed generation interval, denoted by the parameter gip in our algorithm. The value of gip is important for the performance of the technique. On the one hand, if gip is too large, the technique is seldom triggered, so function evaluations are wasted on inferior individuals. On the other hand, if gip is too small, individuals are discarded frequently and the population diversity may suffer, which harms the optimization capacity. Therefore, it is necessary to choose a suitable value for gip. In our work, gip is set to a constant, i.e. 400, which will be discussed in the experimental part. During the elimination process, the eliminating size EP is associated with the initial population size, and the population size is reduced by a tenth every gip iterations. For a clearer illustration, this technique is presented as Algorithm 2, in which the function Mod(·) denotes the modulo operation and the function round(·) denotes the rounding function.
The dual mutation strategies, the control parameter adaptation scheme and the inferior solution eliminating technique have now been described in detail. The overall implementation of DMIE-DE is presented in Algorithm 3.
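The elimination step can be sketched in Python as follows. The sketch assumes that EP is computed as `round(NP_init / 10)`, which is one reading of "reduced by a tenth" together with the Mod(·) and round(·) functions mentioned above; the `min_size` floor and the decision to return the survivors sorted by fitness are additional assumptions made so that the example stays well defined.

```python
def eliminate_inferiors(pop, fitness, g, gip, np_init, min_size=4):
    """Every gip generations, drop the EP worst individuals.

    EP = round(np_init / 10) is an ASSUMED reading of the text; min_size is an
    ASSUMED floor that keeps enough individuals for mutation.
    Survivors are returned sorted from best to worst.
    """
    if g == 0 or g % gip != 0:
        return pop, fitness  # not an elimination generation
    ep = round(np_init / 10)
    keep = max(min_size, len(pop) - ep)
    order = sorted(range(len(pop)), key=fitness.__getitem__)[:keep]
    return [pop[k] for k in order], [fitness[k] for k in order]
```

With NP = 5·D and D = 10, for example, this removes the five worst individuals at generations 400, 800 and so on.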

Experimental results
In this section, the comparison results between the proposed DMIE-DE and five state-of-the-art DE variants on the adopted experimental platform are presented. Moreover, the parameter sensitivity and the effectiveness of the dual mutation strategies and the inferior solution eliminating technique are analysed in detail.

Benchmark functions and compared algorithms
To evaluate the performance of the proposed DMIE-DE algorithm, comparative experiments are carried out on 29 benchmark functions provided by the CEC2017 platform, and the ith function is denoted by $f_i$ in this paper. These functions can be divided into the following four groups with different characteristics: $f_1$, $f_3$: unimodal functions; $f_4$-$f_{10}$: simple multimodal functions; $f_{11}$-$f_{20}$: hybrid functions; $f_{21}$-$f_{30}$: composition functions. It should be pointed out that function $f_2$ has been removed from the test suite because of its unstable behaviour, according to the original reference (Wu et al. 2017), which also provides more detailed information about the other functions in this test suite.
The performance of DMIE-DE is compared with five state-of-the-art DE variants, including two classical DEs, JADE (Zhang and Sanderson 2009) and SinDE (Draa et al. 2015), and three recently proposed DE variants, TSDE (Liu et al. 2016), AGDE (Mohamed and Mohamed 2019) and EFADE (Mohamed and Suganthan 2018). For a convincing comparison, the parameters of the five compared DE variants are configured as recommended in the corresponding original references, and the details are listed in Table 1. In DMIE-DE, the initial population size NP is set to 5 · D for all problems, where D represents the problem dimension. This setting of NP is recommended by DE's original reference (Storn and Price 1997) and is also adopted by some advanced DE variants. According to the requirements for these test functions in Wu et al. (2017), the maximum number of function evaluations $FES_{max}$ is set to 10,000 · D, which also serves as the stopping criterion. For the fairness of the comparison, all computational results are obtained on a PC with an Intel(R) Xeon(R) Gold 5115 2.39 GHz CPU, 64.0 GB of memory and MATLAB R2018b on Windows 10.

Performance metric
In our simulation, each compared algorithm is executed for 50 independent runs to obtain the function error value, which is calculated by $f(\vec{X}_{best}) - f(\vec{X}^{*})$, where $f(\vec{X}_{best})$ is the best fitness value obtained by each algorithm and $f(\vec{X}^{*})$ is the fitness value of the actual global optimal solution. The average value (denoted by "Mean") and the standard deviation (denoted by "Std.") of the function error values are used to reflect the effectiveness of the tested algorithms in searching for the solution within limited evaluations. Convergence graphs of the function error value are also used to compare the convergence characteristics of each algorithm in the respective experiment.
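The two summary statistics can be computed as in the following Python sketch. The function name is hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population version is an assumption, since the text does not say which one is reported.

```python
import statistics


def error_stats(best_values, f_opt):
    """Mean and standard deviation of the function error over independent runs.

    best_values: best fitness found in each run; f_opt: known optimal fitness.
    """
    errors = [fb - f_opt for fb in best_values]
    return statistics.mean(errors), statistics.stdev(errors)


# Two hypothetical runs on a function whose optimal value is 100.
mean_err, std_err = error_stats([101.0, 103.0], 100.0)
```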
Moreover, to draw statistically sound conclusions, we employ the single-problem analysis based on the Wilcoxon test (García et al. 2009) at the 0.05 significance level to test whether there is a significant difference between DMIE-DE and each compared DE variant. The symbols "+/=/-" represent that the performance of DMIE-DE is significantly better than, similar to, or inferior to that of each competitor, respectively. Moreover, the multiple-problem analysis based on the Wilcoxon test and the Friedman test is also conducted at the 0.05 significance level. The Friedman test is a nonparametric method for multiple comparisons that detects significant differences and computes the average ranking of all the compared algorithms.
Algorithm 3 (closing steps): Update the optimal individual $\vec{X}_{best}$ and fitness value $f_{best}$; execute the inferior solution eliminating technique in Algorithm 2; end; Output: $\vec{X}_{best}$ and $f(\vec{X}_{best})$.

Results and analysis
The average and standard deviation of the function error values obtained by DMIE-DE and the five compared variants on the 50D benchmark functions are presented in Table 2, and the results for the 30D and 100D functions are summarized in Tables A1 and A2, respectively, in the supplementary material due to the page limit. In these tables, the smallest average function error value obtained for each function is highlighted in boldface. For convenience, the results of the single-problem analysis by the Wilcoxon test are listed in Table 3, and the number of smallest error values obtained by each algorithm is plotted in Fig. 3.
For the 30D test functions, in terms of the smallest function error values, DMIE-DE ranks second with the smallest error on 10 functions and is beaten by SinDE, which obtains the smallest error on 12 functions. Among the other algorithms, AGDE, TSDE, JADE and EFADE win on 6, 5, 4 and 4 functions, respectively. From the perspective of the statistical results based on the Wilcoxon test, DMIE-DE is significantly better than EFADE on 23 functions.
On the 50D problems, DMIE-DE still shows outstanding performance although higher-dimensional problems are more difficult to deal with. In terms of the average function error, DMIE-DE takes the first place among the compared algorithms by yielding the smallest function error values on 18 test functions, 8 more than on the 30D problems. SinDE ranks second by winning on 9 functions and is followed by AGDE with 2 functions. The other algorithms perform slightly worse and do not obtain the smallest function error value on any function. Based on the single-problem Wilcoxon test, it is noticeable that DMIE-DE obtains significantly better solutions on all 29 functions, with no worse or equal results, compared with the TSDE and EFADE algorithms. Compared with JADE, SinDE and AGDE, DMIE-DE yields better results on 26, 22 and 24 functions and shows equal performance on 1, 2 and 3 functions, respectively.
For the 100D test functions, DMIE-DE obtains the smallest function error on 21 functions, far more than the other algorithms. In detail, JADE wins on 7 functions and EFADE on 1 function, while SinDE, TSDE and AGDE do not yield the best average result on any function. In terms of the Wilcoxon test, DMIE-DE still outperforms TSDE on all 29 test functions, as in the 50D case. Against both SinDE and AGDE, DMIE-DE yields better solutions on 27 functions, equal solutions on 1 function and worse results on 1 function. JADE performs best among the five compared DE variants, beating DMIE-DE on 5 functions and obtaining equal results on 4 functions. EFADE is beaten by DMIE-DE on 26 functions, the exceptions being $f_9$, $f_{11}$ and $f_{14}$, on which it shows equal or better performance.
The multi-problem analysis results based on the Wilcoxon test between DMIE-DE and the other compared variants at the 0.05 significance level are presented in Table 4, in which R+ and R− denote the sums of ranks for the functions on which DMIE-DE performs better and worse than each competitor, respectively. The asymptotic p value indicates the level of difference between each pair of algorithms. If the asymptotic p value is less than the significance level (i.e. 0.05), the pair is marked "YES", meaning that the difference between DMIE-DE and its competitor is statistically significant at the 95% confidence level; otherwise it is marked "NO". From the results, we can draw two conclusions. On the one hand, the R+ value is much higher than R− against all five variants on the 30D, 50D and 100D functions, which indicates that DMIE-DE performs better than the other algorithms in all three dimensions. On the other hand, except for SinDE and TSDE on the 30D problems, the asymptotic p value is much lower than the significance level 0.05 in all cases. What is more, the asymptotic p values for the 50D and 100D functions are much lower than those for the 30D functions, which indicates the outstanding performance of DMIE-DE on higher-dimensional optimization problems.
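To make the quantities R+ and R− concrete, the following Python sketch computes the signed-rank sums for two lists of per-function mean errors. It covers only the rank sums, not the p value. Tied absolute differences share their average rank, and the ranks of zero differences are split evenly between R+ and R−; the latter is one common convention and is an assumption here, as is the function name.

```python
def signed_rank_sums(errors_a, errors_b):
    """Return (R+, R-) for algorithm A against algorithm B.

    R+ sums the ranks of the problems where A has the smaller error,
    R- those where B has the smaller error (minimization).
    """
    diffs = [b - a for a, b in zip(errors_a, errors_b)]
    order = sorted(range(len(diffs)), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based ranks pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    r_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    zero = sum(r for r, d in zip(ranks, diffs) if d == 0)
    return r_plus + zero / 2, r_minus + zero / 2


# Hypothetical mean errors of two algorithms on four functions.
r_plus, r_minus = signed_rank_sums([1.0, 2.0, 3.0, 10.0], [2.0, 4.0, 6.0, 9.0])
```

In this example the differences are 1, 2, 3 and −1, so the two tied absolute differences share rank 1.5 and the sums are R+ = 8.5 and R− = 1.5.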
Table 5 presents the average rankings of DMIE-DE and the other compared variants obtained by the Friedman test at the 0.05 significance level for solving 30D, 50D and 100D functions in CEC2017, with the first-place ranking values shown in boldface. From the results, we can see that DMIE-DE is the best among the compared algorithms on both lower- and higher-dimensional problems. More specifically, the average ranking of DMIE-DE remains in first place for all three dimension settings. Moreover, DMIE-DE shows the best performance among these competitors in terms of convergence speed and final solution accuracy in most cases, which supports the efficiency of balancing global exploration and local exploitation with the dual mutation strategies and the inferior solution eliminating technique.
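The average ranking used in the Friedman test can be illustrated with a short sketch: each algorithm is ranked on every function (1 is best, ties share the mean rank), and the ranks are averaged over all functions. The function name `average_ranks` and the sample error matrix are hypothetical and serve only to show the calculation.

```python
def average_ranks(errors):
    """Average Friedman rank of each algorithm (1 = best).

    errors[f][a] is the mean error of algorithm a on function f;
    algorithms with equal error on a function share the mean rank.
    """
    n_alg = len(errors[0])
    totals = [0.0] * n_alg
    for row in errors:
        order = sorted(range(n_alg), key=lambda a: row[a])
        i = 0
        while i < n_alg:
            j = i
            while j + 1 < n_alg and row[order[j + 1]] == row[order[i]]:
                j += 1
            mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i..j
            for k in range(i, j + 1):
                totals[order[k]] += mean_rank
            i = j + 1
    return [t / len(errors) for t in totals]


# Made-up errors of three algorithms on three functions (not paper data).
errors = [
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 3.0],
    [1.0, 1.0, 5.0],
]
print(average_ranks(errors))  # [1.5, 1.5, 3.0]
```

A lower average rank corresponds to better overall performance, which is how the first-place entries in a ranking table of this kind are determined.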

Technique validity and parameter sensitivity analysis
In order to verify the validity of the proposed dual mutation strategies (i.e. DE/current-to-embest and DE/ebest-to-rand) and the inferior solution eliminating technique, DMIE-DE is compared with three of its versions, DMIE-DE1, DMIE-DE2 and DMIE-DE3. The experiment is also based on the 29 functions from the CEC2017 test suite, and each function is tested for 50 independent runs to obtain the solutions. Detailed statistical results for the 50D functions are presented in Fig. 7, from which we can observe that DMIE-DE2 has the poorest performance in terms of the accuracy of the final solutions. The reason is that the designed DE/ebest-to-rand is a relatively exploitative strategy that gives DMIE-DE2 a faster convergence speed; however, this strategy works against population diversity and easily causes premature convergence. On the contrary, the mutation DE/current-to-embest in DMIE-DE1 is an explorative strategy, which helps to maintain population diversity and keeps the population active enough to find better solutions than DMIE-DE2, but it can easily lead to stagnation and fall into a local optimum. By contrast, DMIE-DE3 has the best performance among the three versions of DMIE-DE since it combines DE/ebest-to-rand and DE/current-to-embest, whose exploitation and exploration strengths are complementary. However, DMIE-DE3 still needs to be improved due to the doubled cost in function evaluations incurred by using two mutation strategies. Thus, the inferior solution eliminating technique plays an important role and helps DMIE-DE perform better than DMIE-DE3. In summary, the dual mutation strategies and the inferior solution eliminating technique play an indispensable role in balancing the exploration and exploitation abilities and help DMIE-DE achieve outstanding performance.
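The contrast between the two strategies can be sketched in code. The exact mutation formulas are not reproduced in this section, so the forms below are assumptions modelled on the strategy names and on the description that an elite individual serves as part of the difference vector in DE/current-to-embest and as the base vector in DE/ebest-to-rand. The helper `pick_elite`, the parameter `elite_frac` and the sample population are hypothetical.

```python
import random


def pick_elite(pop, fitness, elite_frac, rng):
    """Pick a random individual from the best elite_frac of the population (minimization)."""
    n_elite = max(1, int(round(elite_frac * len(pop))))
    best = sorted(range(len(pop)), key=lambda i: fitness[i])[:n_elite]
    return pop[rng.choice(best)]


def current_to_embest(pop, fitness, i, F, elite_frac, rng):
    # Assumed form: v = x_i + F*(x_elite - x_i) + F*(x_r1 - x_r2).
    # The base vector is the current individual, which preserves diversity.
    elite = pick_elite(pop, fitness, elite_frac, rng)
    r1, r2 = rng.sample([k for k in range(len(pop)) if k != i], 2)
    return [x + F * (e - x) + F * (a - b)
            for x, e, a, b in zip(pop[i], elite, pop[r1], pop[r2])]


def ebest_to_rand(pop, fitness, i, F, elite_frac, rng):
    # Assumed form: v = x_elite + F*(x_r1 - x_elite) + F*(x_r2 - x_r3).
    # The base vector is an elite, which concentrates the search around good solutions.
    elite = pick_elite(pop, fitness, elite_frac, rng)
    r1, r2, r3 = rng.sample([k for k in range(len(pop)) if k != i], 3)
    return [e + F * (a - e) + F * (b - c)
            for e, a, b, c in zip(elite, pop[r1], pop[r2], pop[r3])]


# Toy population on the sphere function (illustrative values only).
rng = random.Random(0)
pop = [[1.0, 1.0], [0.1, 0.0], [2.0, -1.0], [-3.0, 0.5], [0.5, 0.5], [4.0, 4.0]]
fitness = [sum(x * x for x in ind) for ind in pop]
v_explore = current_to_embest(pop, fitness, 0, 0.5, 0.2, rng)
v_exploit = ebest_to_rand(pop, fitness, 0, 0.5, 0.2, rng)
print(v_explore, v_exploit)
```

Under these assumed forms, each individual produces two mutant vectors per generation, one from each strategy, which is consistent with the doubled evaluation cost noted for DMIE-DE3.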
As mentioned previously, the inferior solution eliminating technique is proposed to avoid wasting function evaluations by eliminating some poor-performing individuals at a certain generation interval. The effectiveness of the inferior solution eliminating technique is controlled by the parameter gi p. In order to determine the best choice of gi p, the experiment is carried out on the CEC2017 functions with gi p set to 200, 300, 400, 500 and 600, respectively. The experimental results obtained by DMIE-DE with different gi p values for solving 50D functions are presented in Table 9, and the results for 30D and 100D functions are listed in Tables A5 and A6, respectively, of the supplementary material file. The smallest function error values are highlighted in bold to indicate the best-performing gi p value. Based on these function error values, the Friedman test is employed to draw a comprehensive comparison. The results are shown in Table 10, from which we can conclude that gi p equal to 400 is the most appropriate choice for DMIE-DE, as it obtains the best ranking results for 30D, 50D and 100D functions.
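The role of the generation interval can be illustrated with a minimal sketch. Only the idea of removing poor-performing individuals at a fixed interval comes from the text; how many individuals are removed and whether a minimum population size is enforced are not stated in this section, so `n_remove` and `min_size` are hypothetical parameters, and `interval` stands in for the parameter written as gi p.

```python
def eliminate_inferiors(pop, fitness, generation, interval=400, n_remove=1, min_size=4):
    """Drop the worst individuals every `interval` generations (minimization).

    Assumptions: `n_remove` individuals are removed each time, and the
    population never shrinks below `min_size`. Survivors keep their order.
    """
    if generation == 0 or generation % interval != 0:
        return pop, fitness
    n_remove = min(n_remove, max(0, len(pop) - min_size))
    keep = sorted(range(len(pop)), key=lambda i: fitness[i])[:len(pop) - n_remove]
    keep.sort()  # restore the original ordering of the survivors
    return [pop[i] for i in keep], [fitness[i] for i in keep]


# Toy example with five one-dimensional individuals (illustrative values only).
pop = [[0.0], [1.0], [2.0], [3.0], [4.0]]
fitness = [5.0, 1.0, 9.0, 2.0, 3.0]
print(eliminate_inferiors(pop, fitness, generation=399))  # unchanged
print(eliminate_inferiors(pop, fitness, generation=400))  # worst individual [2.0] removed
```

In this sketch, a smaller interval shrinks the population sooner and saves more function evaluations at the risk of losing diversity, while a larger interval does the opposite, which is the trade-off that the sensitivity experiment on the interval explores.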

Conclusion
The performance of the DE algorithm depends heavily on the mutation strategies and associated control parameters, which are expected to balance the global exploration and local exploitation abilities. Based on this consideration, we propose a dual mutations collaboration mechanism with elites guiding and inferiors eliminating techniques for DE. The dual mutation strategies are guided by elite individuals with different functional emphases. More specifically, the mutation strategy DE/current-to-embest is designed to be relatively explorative, while the mutation strategy DE/ebest-to-rand is relatively exploitative. During the evolution process, these two mutation strategies are employed simultaneously to achieve a balance between local and global performance. Moreover, a newly designed parameter adaptation method is applied to automatically adjust the parameters F and CR. They are updated according to a Cauchy distribution model and a sine formula, respectively, and the updating process draws on the successful experience from previous generations.
The inferior solution eliminating technique is supplemented to enhance the convergence speed and compensate for the cost of fitness evaluations in the evaluation process. The experiments between DMIE-DE and its three versions prove that