Unbalanced budget distribution for automatic algorithm configuration

Optimization algorithms often have several critical parameters, and improving their empirical performance depends on tuning them. Manual configuration of such parameters is a tedious task that often yields unsatisfactory results. Therefore, several automatic algorithm configuration frameworks have been proposed to regulate the parameters of a given algorithm for a series of problem instances. Although the developed frameworks perform very well on various problems, there is still a trade-off between accuracy and budget requirements that needs to be addressed. This work investigates the performance of an unbalanced distribution of budget among different configurations for the automatic algorithm configuration problem. Inspired by bandit-based approaches, the main goal is to find a better configuration that substantially improves the performance of the target algorithm while using a smaller run-time budget. In this work, the non-dominated sorting genetic algorithm II (NSGA-II) is employed as the target algorithm using the jMetalPy software platform, and the multimodal multi-objective optimization (MMO) test suite of CEC'2020 is used as the set of test problems. We carried out a comprehensive comparison with other well-known methods, including random search, Bayesian optimization, sequential model-based algorithm configuration (SMAC), iterated local search in parameter configuration space (ParamILS), iterated racing for automatic algorithm configuration (irace), and many-objective automatic algorithm configuration (MAC).
To characterize, validate, and evaluate the performance of these methods, hypervolume (HV), generational distance (GD), and the epsilon indicator (Iε+) are used as performance indicators. The experimental results demonstrate the efficiency of the proposed approach for automatic algorithm configuration with a minimal time budget in comparison with the other competitors.


Introduction
Optimization algorithms are broadly applied for solving different types of real-world optimization problems in various contexts such as economics, science, engineering, and biology (Abualigah et al. 2021a; Abualigah and Alkhrabsheh 2021; Palakonda et al. 2021; Osaba et al. 2021; Bharati et al. 2021). These methods, including exact methods (e.g., branch-and-bound or integer programming solvers such as CPLEX and CLASP) and heuristic approaches (such as local search methods or metaheuristic algorithms), usually contain a series of parameters that have a strong impact on their efficiency (Abualigah et al. 2021c, b). Even if default parameter settings for these algorithms are available, they have to be carefully and properly set to achieve their best performance when facing a particular problem (Hutter 2009).
A problem-specific setting of parameters can result in a much higher-performing optimization algorithm adapted to the problem at hand. Algorithm developers have traditionally designed and tuned the parameters of optimization algorithms in an ad hoc fashion. They select a few parameter configurations, each a complete assignment of values to parameters, and run experiments to test them. The results are then examined, and it is decided whether to check other configurations, modify the algorithm, or stop the process. This manual tuning approach has led to high-performing algorithms; however, it is time-consuming in terms of human effort, and algorithms are typically tested only on a rather limited set of instances.
Hence, this ad hoc process has gradually been sidelined, and the demand for systematic approaches that automatically configure parameters has been increasingly highlighted in the literature; such approaches are of high practical relevance in most contexts (Vermetten et al. 2020; Golabi et al. 2020; Ghambari et al. 2020; Corazza et al. 2021).
In recent years, automated algorithm configuration (AAC) approaches (Hutter et al. 2007; Blot et al. 2017), also referred to as automatic parameter tuning methods (Eiben et al. 1999), have been presented to overcome these limitations and reduce the tedious task of manually searching for high-performing parameters through another algorithm. The task in AAC is to find performance-optimizing settings of a series of parameters for a target algorithm whose behavior is controlled by these parameters. The configuration found during the tuning phase should generalize, so that it can be applied to similar but unseen instances. AAC not only affects algorithm-specific parameters but also has the potential to transform the way optimization algorithms are designed, since algorithm design decisions can roughly be classified into numerical and categorical parameters (Hutter et al. 2009).
Over the years, several AAC tools (e.g., ParamILS (Hutter et al. 2009), SMAC (Hutter et al. 2011), irace (López-Ibáñez et al. 2016), and MAC (Rakhshani et al. 2019)) have been proposed from different perspectives and successfully applied to (pre-tuned) state-of-the-art solvers in various problem domains (Hutter 2009; Hutter et al. 2009; López-Ibáñez et al. 2016). However, AAC is often computationally expensive, especially on large-scale datasets for real-world applications. The existing techniques in this field take considerable time and space, and they are often unsuitable for tackling real-world problems with limited resource budgets (e.g., time or function evaluations).
In this paper, we share much of our motivation with the existing research on AAC. What caught our attention about the available methods in the literature (e.g., random search, Bayesian optimization, irace) is that they allocate a predetermined budget to each generated configuration. These methods aim to identify good configurations more quickly; however, they tackle the fundamentally challenging problem of optimizing a high-dimensional and non-convex function with unknown smoothness and possibly noisy evaluations. Alternatively, we use an unbalanced distribution of the budget over the generated configurations instead of a fixed budget for all configurations. More precisely, we will show that it is beneficial to treat the number of function evaluations allotted to each configuration as a quantity to be optimized, and that this budget can be tuned automatically during the optimization procedure.
Motivated by the recent success of Hyperband (Falkner et al. 2018), a popular method for tuning the hyperparameters of expensive machine learning models, we adapt this approach to the AAC problem of configuring multi-objective evolutionary algorithms. Hyperband is a popular bandit-based optimization technique that relies on random search and has demonstrated superior anytime performance in finding the best configurations compared with traditional methods (e.g., Bayesian optimization) (Falkner et al. 2018). The idea behind it is to allocate exponentially more resources (e.g., budget or function evaluations) to more promising configurations. To the best of our knowledge, this is the first work that applies Hyperband to the AAC problem from the perspective of unbalanced budget distribution. To summarize, the main contributions of this paper are the following:

• We use the Hyperband algorithm as a bandit-based approach for parameter tuning of multi-objective evolutionary algorithms, in particular for configuring NSGA-II as the target algorithm, with the objective of finding the best parameters before executing the algorithm on the problem at hand.

• The presented approach uses an unbalanced distribution of the budget over the generated configurations, addressing how to allocate resources among randomly sampled configurations in order to reduce the computational time.

• We conducted comparative experiments with other existing methods in the literature to demonstrate the advantage of the proposed approach in reducing computational time.
To demonstrate the efficiency of Hyperband on the AAC problem, we have carried out an experimental study in which NSGA-II, as the target algorithm, was tuned for solving the multimodal multi-objective optimization (MMO) test suite of CEC'2020. We present a comparison with the most prominent configurators, including random search, Bayesian optimization, ParamILS, SMAC, irace, and MAC, on this test suite. To investigate and validate the performance of the mentioned approaches, hypervolume (HV), generational distance (GD), and the epsilon indicator (Iε+) are used as performance indicators. The obtained results show that this new proposition outperforms the other competitors both in improving solution quality and in minimizing the run-time budget. Furthermore, statistical tests show the superiority of Hyperband for solving the AAC problem when configuring multi-objective evolutionary algorithms for continuous optimization problems.
The rest of the paper is organized as follows. Section 2 briefly describes the AAC problem by giving details of the configuration process. A general review of existing approaches in this context, as well as the positioning of this work, is presented in Sect. 3. Section 4 elaborates the proposed methodology. Section 5 presents the experiments, examining the performance of the proposed algorithm on the multimodal multi-objective test suite. Finally, the conclusions and lines of future work are presented in Sect. 6.

Fig. 1 Automatic configuration of a given parameterized target algorithm, with performance optimized for a given set of problem instances (Stützle 2016)

The automatic algorithm configuration
The AAC problem includes: a parameterized algorithm, also called the target algorithm A; a set of parameters θ (which may be numerical, discrete, or categorical) that need to be tuned, where A(θ) denotes the target algorithm A under a specific configuration θ; a parameter search space Θ that determines the possible configurations, with θ ∈ Θ (including the variables and their domains); a set of problem instances (training and test instances) D; and a total configuration budget t. The general concept of AAC is illustrated in Fig. 1.
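The ingredients above can be summarized in a small sketch. All class and function names here are illustrative, not part of any specific AAC framework, and the toy target algorithm is invented:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

@dataclass
class AACProblem:
    """Minimal sketch of the AAC ingredients: A, Theta, D, and budget t."""
    target_algorithm: Callable[[Dict[str, Any], Any], float]  # A(theta) on one instance -> cost
    search_space: Dict[str, Tuple[float, float]]              # parameter name -> domain
    instances: List[Any]                                      # training problem instances D
    total_budget: int                                         # total configuration budget t

    def evaluate(self, theta: Dict[str, Any]) -> float:
        """Mean cost of A(theta) over the training instances."""
        costs = [self.target_algorithm(theta, inst) for inst in self.instances]
        return sum(costs) / len(costs)

# Toy usage: an "algorithm" whose cost depends on a single parameter x.
problem = AACProblem(
    target_algorithm=lambda theta, inst: (theta["x"] - inst) ** 2,
    search_space={"x": (0.0, 1.0)},
    instances=[0.2, 0.4, 0.6],
    total_budget=100,
)
print(problem.evaluate({"x": 0.4}))  # mean squared distance to the instances
```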
It should be mentioned that evaluating the performance of a target algorithm is not an easy task, due to the stochastic nature of the algorithms and their time-varying execution time, which depends on the sampled configurations. Accordingly, AAC can be considered an optimization problem that searches for the right behavioral parameters of some underlying algorithm. The problem can also be considered from a machine learning perspective: learning from a set of training instances in order to solve unseen problem instances with high efficacy. The configurator then takes advantage of the performance indicator of the target algorithm over predefined validation problem instances.

Configurator tools overview
The widely used methods in the context of AAC include racing approaches (Maron and Moore 1997) (e.g., F-race (Birattari et al. 2010) and its iterated version, irace (López-Ibáñez et al. 2016)), ParamILS (Hutter et al. 2009), sequential model-based algorithm configuration (SMAC) (Hutter et al. 2011), Bayesian search strategies, and random search, which have led to increasing automation of algorithm design and parameter setting.
In this section, we describe the most widely used configurators in detail and discuss their advantages and disadvantages. Thereafter, we clarify our motivation for using the Hyperband algorithm for AAC in the field of multi-objective optimization with evolutionary algorithms.
Random search Random search is an attractive first option for the AAC problem because of its simplicity, and it is widely used in the field of machine learning for hyperparameter optimization (Bergstra and Bengio 2012; Rakhshani et al. 2019). This method samples the search space randomly instead of discretizing it. Random search has no natural end, so a time budget has to be specified as the stopping criterion. It suffers from the curse of dimensionality and can waste large amounts of time on poorly performing areas of the search space. These drawbacks can be alleviated by other optimization methods, such as Bayesian optimization, which uses previous evaluation records to determine the next evaluation (Eggensperger et al. 2013). On the other hand, the advantages of random search are that it can locate the optimal values of weakly correlated parameters quite precisely, and that it is easily parallelized and resource-allocated, since each evaluation is independent.
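A minimal random-search configurator over a continuous space looks as follows. This is an illustrative sketch, with an invented toy objective, not the configurator used in the experiments:

```python
import random

def random_search(evaluate, space, n_trials, seed=0):
    """Sample configurations uniformly at random and keep the best.
    `space` maps a parameter name to a (low, high) interval."""
    rng = random.Random(seed)
    best_theta, best_cost = None, float("inf")
    for _ in range(n_trials):
        theta = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        cost = evaluate(theta)          # each evaluation is independent
        if cost < best_cost:
            best_theta, best_cost = theta, cost
    return best_theta, best_cost

# Toy objective with its minimum at x = 0.3, y = 0.7.
space = {"x": (0.0, 1.0), "y": (0.0, 1.0)}
theta, cost = random_search(lambda t: (t["x"] - 0.3) ** 2 + (t["y"] - 0.7) ** 2,
                            space, n_trials=200)
print(theta, cost)
```

Because the draws are independent, the loop body parallelizes trivially, which is the resource-allocation advantage mentioned above.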
Bayesian search method Bayesian optimization is an iterative machine learning approach that has become extremely popular for tuning hyperparameters in recent years (Shahriari et al. 2015). Unlike random search, Bayesian optimization determines the next evaluation points according to the results obtained from previous evaluations. This method is well suited to expensive objective functions that take a long time to compute, and it has shown good performance over continuous domains. Besides, this approach can tolerate stochastic noise in function evaluations. Generally, Bayesian optimization has two main components: (i) a Bayesian statistical surrogate model, typically a Gaussian process, for modeling the objective function; and (ii) an acquisition function that estimates the value that would be generated by evaluating the objective function at a candidate point, in order to decide where to sample next (Yang and Shami 2020).
ParamILS ParamILS, implemented in the Ruby programming language, was presented by Hutter et al. (2007, 2009). The core of ParamILS is based on iterated local search (ILS) to find the optimal parameter setting for the target algorithm within its configuration space. ParamILS is mostly used for configuration tasks that aim to minimize the algorithm's computation time through an adaptive capping mechanism, which bounds the execution time of configurations according to the observed performance of the current best configuration. However, using only a one-exchange neighborhood forces ParamILS to discretize each parameter in order to define the neighborhood of candidate configurations, which is not an easy task.
Further information about the ParamILS approach can be found in the original paper (Hutter et al. 2009).
SMAC Sequential model-based algorithm configuration is one of the most powerful tools for optimizing the configuration of an arbitrary algorithm across a set of instances and yields very good results on the AAC problem (Hutter et al. 2011). SMAC uses random forests (Friedman et al. 2001) to model the parameters of the target algorithm. The performance metric in SMAC is the algorithm's runtime. The method learns a joint model that integrates information about the instances with the random forest model. This joint model predicts the performance of the target algorithm in order to select promising parameter configurations, and these predictions are aggregated into a performance statistic for each parameter configuration. SMAC can tune both categorical and numerical parameters. However, a major criticism of SMAC is that it is limited to using runtime as the performance metric, and extracting instance features is not effective for all NP-hard problems (Bartz-Beielstein et al. 2020). SMAC is open-source; its code, written in Java and Python, is freely available with documentation for installation and use.

irace The irace package, a software tool implemented in R, is a generalization of the iterated F-race method (Birattari et al. 2010) and has been used successfully to automatically configure various state-of-the-art algorithms given a set of training instances of an optimization problem. In its latest versions, irace utilizes an elitist iterated racing algorithm. Configurations are drawn from a sampling distribution, which is uniformly random at the first iteration and is biased towards the best configurations in later iterations. At each iteration, a race based on evaluations over the training problem instances is run between the elite configurations and those generated in the current iteration. The worst configurations are then removed from the race based on a statistical test. This procedure is repeated until the race terminates, and the surviving configurations become the elites for the target algorithm. The advantages of irace are its ability to deal with different types of parameters and to assess the performance of each candidate configuration during the optimization process, which is a prominent feature in the context of multi-objective optimization. For more details about the irace package, we refer to the original paper (López-Ibáñez et al. 2016).
MAC Many-objective automatic algorithm configuration is a framework for automatically configuring many-objective optimization methods, implemented in Matlab and Python, introduced with the aim of integrating optimization methods and machine learning techniques (Rakhshani et al. 2019). In MAC, the idea of feature selection is incorporated into the stochastic RBF method using an undirected graph to identify the more important variables. In more detail, MAC tries to learn probabilistically about the relevance of configurations and the model's performance during the optimization process. The collected information is then processed and used to generate new solutions. The resulting informed component guides the algorithm toward regions of the search space that are likely to contain promising solutions. This orthogonal technique prevents MAC from considering all configurations uniformly and biases the search process toward good configurations.

Positioning of our research
According to the above explanations, finding a good configuration for a target algorithm is a complex optimization task. The problem has a nonlinear objective function, the variables interact with each other, and, owing to the stochastic nature of the problem, multiple local optima and noise are to be expected. As mentioned above, a series of tools have proliferated rapidly to address the AAC problem. Each has its advantages and disadvantages, but the main drawback of these methods is that they are computationally expensive. The available tools use a predetermined budget for the optimization process, meaning they allocate the same budget to each configuration while searching for the best one. Alternatively, we investigate an unbalanced budget distribution policy for AAC and tune this budget automatically during the optimization procedure. The main objective of this research is to verify and illustrate the effectiveness of the suggested bandit-based approach, a variation of random search with an explore/exploit strategy, in finding the best time allocation for each configuration in the AAC task.

Methodology
The parameter configuration of multi-objective optimization algorithms is a challenging task because of its effect on the final performance of the algorithm. The problem lies not only in handling the run-time budget required to evaluate every possible configuration but also in dealing with the size of the search space. Depending on the nature of the optimization problem at hand, as well as the number and type of the algorithm's parameters (continuous, categorical, ordinal), finding the best parameter configuration for an algorithm can be easy or hard. If an optimization algorithm has too many parameters, the number of possible configurations grows exponentially. The concern here is whether to consider a small number of configurations with a longer average running time, or conversely many configurations with a small average running time.
Table 1 The brackets of Hyperband for various values of s, when the number of configurations is 5 and the allocated budget is 5000 function evaluations

To address this trade-off, Li et al. (2017) proposed a method called Hyperband. Hyperband is an extension of the SuccessiveHalving algorithm (Jamieson and Talwalkar 2016), its building block, which was proposed for hyperparameter optimization. SuccessiveHalving uniformly allocates a budget B to a set of n hyperparameter configurations for one round and evaluates the configurations' performance. Afterward, it discards the worst half of the configurations and continues the procedure until only one configuration remains. Hence, the algorithm tends to allocate exponentially more resources to more promising configurations (Jamieson and Talwalkar 2016; Li et al. 2017). As the overall performance of SuccessiveHalving critically depends on the initially allocated computational resources, Hyperband exploits this strategy by evaluating a predetermined number of hyperparameter configurations with one unit of budget, then the best half with two units, the best half thereof with four units, and so on.
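The SuccessiveHalving building block described above can be sketched as follows. This is an illustrative Python sketch, not the original implementation; the toy `run` function is invented:

```python
def successive_halving(run, configs, min_resource, eta=2):
    """Allocate exponentially more resource to surviving configurations.
    `run(theta, r)` evaluates configuration theta with resource r and returns a loss."""
    survivors = list(configs)
    r = min_resource
    while len(survivors) > 1:
        losses = [(run(theta, r), theta) for theta in survivors]
        losses.sort(key=lambda pair: pair[0])               # best (lowest loss) first
        survivors = [theta for _, theta in losses[: max(1, len(survivors) // eta)]]
        r *= eta                                            # double the budget next round
    return survivors[0]

# Toy run function: loss shrinks towards |theta - 0.5| as the resource grows.
def run(theta, r):
    return abs(theta - 0.5) + 1.0 / r

configs = [0.1, 0.3, 0.45, 0.5, 0.8, 0.9, 0.2, 0.6]
print(successive_halving(run, configs, min_resource=1))  # -> 0.5
```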
The idea behind the Hyperband algorithm is that it adaptively allocates more resources to promising configurations while eliminating the poor ones. It uses an infinite-armed bandit approach over the number of configurations to be considered in parallel. A numerical example of the Hyperband procedure, showing the resources allocated to each configuration, is given in Table 1. Although Hyperband was initially designed only for the hyperparameter optimization problem, in this study we extend it to address the AAC problem.
Hyperband considers several possible values of n for a fixed budget B, based on a grid search over the feasible values of n. Each value of n is associated with a minimum resource r that every configuration receives before some of the worst configurations are eliminated. Hyperband consists of two components: (i) an inner loop, which runs SuccessiveHalving for fixed values of n and r; and (ii) an outer loop, which iterates over different values of n and r (as written in Eq. 1). Hyperband has two inputs, R and η, that dictate how many different executions of SuccessiveHalving are considered. R is the maximum amount of resource that can be assigned to a single configuration, while η, the downsampling rate, controls the proportion of configurations removed in each inner loop of SuccessiveHalving.
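Under the standard Hyperband schedule of Li et al. (2017), the outer-loop pairs (n, r) can be enumerated as below. The choice η = 3 here is only illustrative (Algorithm 1 in this paper defaults to η = 2):

```python
import math

def hyperband_brackets(R, eta=3):
    """Enumerate the (s, n, r) triples of Hyperband's outer loop (Li et al. 2017):
    bracket index s, n initial configurations, minimum resource r per configuration."""
    s_max = 0
    while eta ** (s_max + 1) <= R:      # s_max = floor(log_eta(R)), computed exactly
        s_max += 1
    B = (s_max + 1) * R                 # total budget per bracket
    brackets = []
    for s in range(s_max, -1, -1):
        n = math.ceil((B / R) * eta ** s / (s + 1))   # initial number of configurations
        r = R / eta ** s                              # minimum resource for each
        brackets.append((s, n, r))
    return brackets

for s, n, r in hyperband_brackets(R=81, eta=3):
    print(f"bracket s={s}: n={n} configurations, r={r:g} resource units each")
```

For R = 81 and η = 3, this yields the brackets (n, r) = (81, 1), (34, 3), (15, 9), (8, 27), (5, 81): aggressive exploration of many cheap configurations at one extreme and plain uniform allocation at the other.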
An overview of the main steps of the proposed methodology for solving the AAC problem is given in Algorithm 1. After determining the maximum amount of resource (R) and the fixed budget (B), the first step generates a set of random configurations. In the inner loop, the SuccessiveHalving algorithm is performed for fixed values of n and r and is repeated to find the best configuration. Each bracket is designed to use approximately B total resources and corresponds to a different trade-off between n and B/n. Within a bracket, a random grid search with the average budget is then performed for each configuration. Next, the algorithm removes the worst configurations and updates the population T. Hyperband needs several subroutines for any given learning problem: (i) a function that returns a set of n samples from some distribution defined over the configuration space; (ii) a function that takes a configuration θ ∈ Θ and a resource allocation r as input and returns the validation loss after training the configuration with the allocated resources; and (iii) a function that takes a set of configurations and their associated losses and returns the best-performing configurations. This procedure continues until the maximum budget B is reached, and the best configuration is returned.
Algorithm 1: Hyperband algorithm for AAC
Input: problem instance P, parameterized algorithm A, R, η (default η = 2)
Output: the best configuration θ for the target algorithm
  Define max-iter as the maximum number of iterations per configuration
  s_max ← log_η(R)
  B ← (s_max + 1)R
  Define n as the number of initial configurations (using Eq. 1)
  Initialization: generate n random configurations as T
  for it = 1 to max-iter do
    for each configuration i in T do
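Algorithm 1 can be sketched end to end in Python roughly as follows. This is a compact illustration following Li et al. (2017), not the authors' implementation; the toy `sample` and `run` functions are invented, and the bracket sizes are simplified:

```python
import math
import random

def hyperband(sample, run, R, eta=3, seed=0):
    """Compact Hyperband sketch: outer loop over brackets, inner SuccessiveHalving.
    sample(rng) draws one random configuration; run(theta, r) returns a loss."""
    rng = random.Random(seed)
    s_max = 0
    while eta ** (s_max + 1) <= R:          # s_max = floor(log_eta(R))
        s_max += 1
    B = (s_max + 1) * R                     # budget per bracket
    best_theta, best_loss = None, float("inf")
    for s in range(s_max, -1, -1):          # outer loop: the brackets
        n = math.ceil((B / R) * eta ** s / (s + 1))
        configs = [sample(rng) for _ in range(n)]
        r = R / eta ** s                    # minimum resource in this bracket
        for i in range(s + 1):              # inner loop: SuccessiveHalving
            r_i = r * eta ** i
            losses = sorted((run(theta, r_i), theta) for theta in configs)
            if losses[0][0] < best_loss:    # track the incumbent across brackets
                best_loss, best_theta = losses[0]
            configs = [theta for _, theta in losses[: max(1, len(configs) // eta)]]
    return best_theta, best_loss

# Toy problem: loss |theta - 0.5| plus a term that shrinks with more resource.
theta, loss = hyperband(sample=lambda rng: rng.random(),
                        run=lambda t, r: abs(t - 0.5) + 1.0 / r,
                        R=81)
print(theta, loss)
```

In the AAC setting of this paper, `run` would execute the target NSGA-II with configuration θ for r function evaluations and return the negated HV of the resulting front.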

Experimental setup
Benchmark set In this study, we validate the proposed methodology for solving the AAC problem on the multimodal multi-objective optimization (MMO) test suite of CEC'2020. In multi-objective optimization problems, there may exist two or more global or local Pareto optimal sets (PSs), some of which may correspond to the same Pareto front (PF). Such problems are called multimodal multi-objective optimization problems (MMOPs) (Liang et al. 2020; Palakonda et al. 2021). The selected MMO test suite of CEC'2020 contains challenging test functions that provide different types of Pareto optimal fronts, such as convex or concave, in both linear and nonlinear forms. The dimension of the problems is set to 10 variables. A summary of these functions is given in Table 2.
Performance assessment metrics In this study, the unary hypervolume (HV), one of the most broadly used performance indicators (Zitzler and Thiele 1999), is considered the baseline objective of the comparison for all competitors and should be maximized. HV combines, in a single value, performance in terms of both convergence and diversity (Halim et al. 2021). Precisely, this metric evaluates the volume of an approximation set relative to a reference point in the objective space, capturing convergence, uniformity, and spread, and, unlike many other indicators, it is Pareto compliant. Moreover, we use two further quality indicators: generational distance (GD) and the epsilon indicator (Iε+). These two metrics, known as convergence-based indicators, evaluate the closeness of the obtained approximation to the true Pareto front and should be minimized. The GD metric represents the average distance from the points in an approximation obtained by the algorithm to the true Pareto front.
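For intuition, the two-objective hypervolume can be computed with a simple sweep. This is an illustrative pure-Python routine, not the indicator implementation used in the experiments:

```python
def hypervolume_2d(front, ref):
    """Hypervolume of a 2-objective approximation set (both objectives minimized)
    with respect to the reference point `ref`."""
    # Sort by the first objective, then sweep and accumulate rectangles.
    pts = sorted(p for p in front if p[0] <= ref[0] and p[1] <= ref[1])
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:                          # a non-dominated step of the staircase
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv

front = [(1.0, 4.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(5.0, 5.0)))    # -> 12.0
```

Dominated points contribute nothing (the `y < prev_y` test skips them), which is exactly why HV rewards both convergence towards and spread along the front.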
AAC experimental protocol and compared methods For the evaluation of the proposed configurator, we consider NSGA-II (Deb et al. 2000) as the target algorithm A, using the jMetalPy framework (Benitez-Hidalgo et al. 2019). The parameters of the target algorithm are the population size (type: ordinal, values: {100, 200, 300, 400, 500}), the crossover rate (type: continuous, range: [0.5, 1]), and the mutation rate (type: continuous, range: [0.01, 0.2]). The crossover operator is simulated binary crossover (SBX), and polynomial mutation is used as the mutation mechanism. Random search, Bayesian optimization, ParamILS, SMAC, irace, and MAC, as the most widely used configurators, are considered for the comparison study. For each configurator, we performed 20 independent runs with a total budget of 8 hours of CPU time per configurator and 300,000 function evaluations for the target NSGA-II algorithm.
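The stated parameter space can be sampled as below. This sampler is only illustrative and does not use the jMetalPy API; the function and dictionary-key names are our own:

```python
import random

# The NSGA-II parameter space stated above: an ordinal population size
# plus two continuous rates.
POP_SIZES = [100, 200, 300, 400, 500]

def sample_nsga2_config(rng):
    """Draw one random NSGA-II configuration from the stated space."""
    return {
        "population_size": rng.choice(POP_SIZES),     # ordinal parameter
        "crossover_rate": rng.uniform(0.5, 1.0),      # SBX crossover probability
        "mutation_rate": rng.uniform(0.01, 0.2),      # polynomial mutation probability
    }

rng = random.Random(42)
cfg = sample_nsga2_config(rng)
print(cfg)
```

Such a sampler is the `sample` subroutine Hyperband needs: each draw is an independent candidate configuration whose budget the bracket schedule then decides.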

Results and discussion
This section presents and analyzes the obtained results, which are given in Table 3. In this table, we report the best and mean values of HV, as well as the mean and standard deviation (std) for the GD and Iε+ metrics. The best result for each problem is shown in bold.
As can be seen in Table 3, the Hyperband algorithm significantly improves the results for the MMF4 and MMF7 functions in terms of the best and mean values of the HV metric. Furthermore, the mean results in terms of the GD metric show the superiority of the Hyperband algorithm for most of the test functions reported in this table, although ParamILS and random search also obtained the minimum value of the GD metric for the MMF2 and MMF4 functions, respectively. In terms of Iε+, the Hyperband algorithm outperforms the other competitors.
One can also observe that the proposed methodology obtains the maximum value of the HV metric for all of the test functions in this table. For the MMF8, MMF10_l, MMF11, MMF11_l, MMF12_l, and MMF13_l functions, the Hyperband algorithm achieves the minimum value of the GD metric, while for the other functions reported in this table the random search method outperforms the other approaches. Remarkably, the performance of the Hyperband algorithm is superior to that of all the considered methods for the MMF10, MMF11, and MMF13 functions in terms of the Iε+ metric.
The mean values of the HV metric in Table 3 clearly indicate that the Hyperband algorithm outperforms the other competitors for all of the test functions listed. In particular, for the MMF14_a, MMF15, MMF15_a, and MMF15_a_l functions, the presented method obtains significantly better results. In terms of the GD metric, ParamILS and SMAC perform similarly for the MMF14 function. For the MMF14_a and MMF15_a functions, random search obtains the minimum GD value. SMAC performs best for the MMF15_a_l and MMF16_l3 functions, and ParamILS for the MMF15 and MMF16_l2 functions. For the MMF16_l1 test function, ParamILS, SMAC, random search, and the Bayesian algorithm have the same mean value in terms of the GD metric. However, the Hyperband algorithm shows its superiority over the other approaches for all of the functions in terms of the mean value of the Iε+ metric.
According to the obtained results, the proposed approach based on the Hyperband algorithm outperforms the other competitors in terms of the best and mean values of the HV metric for all test functions. The main reason is that the presented methodology can strike a good balance between diversification and intensification, in both the objective and the solution space, by assigning different budgets to different configurations. Generally, we can say that our methodology obtains statistically significantly better results than the other competitors thanks to its downsampling feature.
As part of the experiments, we also present the convergence plots of the HV metric for all methods on all test functions. These plots are illustrated in Figs. 2, 3, and 4. The shaded areas in these figures illustrate the standard deviation for the considered test functions. As shown in Fig. 2, the Hyperband algorithm is superior in terms of HV value for the MMF1, MMF4, MMF5, MMF7, and MMF8 test functions. Interestingly, in Fig. 3, for the MMF10, MMF11, MMF13, and MMF14 functions, the Hyperband algorithm not only obtains the highest HV value but also stops with the minimum number of function evaluations. Figure 3 also shows the superiority of the Hyperband algorithm in terms of HV value for MMF10_l, MMF11_l, MMF13, and MMF13_l. For the other functions in this figure, all of the mentioned approaches perform closely, as is evident for MMF12 and MMF12_l; however, for MMF14 the excellent performance of the Hyperband algorithm is obvious. Moreover, as shown in Fig. 4, the proposed approach remarkably outperforms the other competitors for the MMF14_a, MMF15, MMF15_a, MMF15_l, MMF15_a_l, MMF16_l1, MMF16_l2, and MMF16_l3 functions. In particular, for the MMF14_a, MMF15, and MMF15_a_l functions, the Hyperband algorithm reaches the highest HV value with the minimum number of function evaluations. Altogether, these convergence plots highlight the capability of the Hyperband algorithm to make a trade-off between diversification and intensification. The interesting point is that Hyperband can reduce the computation budget for generated configurations that do not perform well and reallocate it to more promising configurations. It should be mentioned that the suggested strategy differs from the other methods in the literature, which assign the same computational budget to all generated configurations. Figure 5 illustrates the overall rank of the HV metric for all of the compared methods, highlighting the superiority of the Hyperband approach. This algorithm reaches the first level of the ranking, being the winner in 25% of cases. The second is the MAC method, which obtains the best performance in 19.8% of cases. The SMAC, ParamILS, Bayesian, irace, and random search methods occupy the third to seventh levels of the ranking, in that order. This success is due to the capability of Hyperband to eliminate the worst configurations early and allocate more computational budget to the better configurations. Hyperband benefits from the concept of "n versus B/n," which allows it to balance the budget and to converge during the search process.
Statistical analysis (Kruskal-Wallis test)
In this study, the nonparametric Kruskal-Wallis test is applied to compare the distributions of the results obtained by the Hyperband algorithm and the other methods. To do so, we use the mean HV values of all methods over the test functions. The outcome of the Kruskal-Wallis test is reported as P values: a P value lower than 0.05 indicates that the Hyperband algorithm is better than the other method, while a P value greater than 0.95 indicates that the other method is better than the Hyperband algorithm. These results indicate that the proposed methodology based on the Hyperband algorithm competes very well and yields statistically better results than the other methods. Overall, Hyperband significantly outperforms the other methods across all test functions.
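Such a comparison can be reproduced with SciPy's implementation of the Kruskal-Wallis test. The HV samples below are synthetic placeholders chosen for illustration, not the paper's measurements.

```python
from scipy.stats import kruskal

# Synthetic HV samples for two methods (placeholders, not the paper's data).
hv_method_a = [0.71, 0.73, 0.72, 0.74, 0.70]
hv_method_b = [0.61, 0.63, 0.60, 0.62, 0.64]

# Null hypothesis: both samples come from the same distribution.
# A small P value means the HV distributions differ significantly.
statistic, p_value = kruskal(hv_method_a, hv_method_b)
significant = p_value < 0.05
```

Because the Kruskal-Wallis test is rank-based, it makes no normality assumption about the HV values, which is why it suits indicator data aggregated over heterogeneous test functions.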

Conclusion, limitations and future research direction
This study presented a new approach for automatic algorithm configuration of multi-objective evolutionary algorithms that finds the best configuration with minimum computational time. We investigated the performance of an unbalanced distribution of budget across different configurations of a target algorithm. Inspired by the Hyperband algorithm, the main goal of this study is to find a better configuration that substantially improves the performance of the target algorithm while using a smaller run-time budget, and to tune this budget automatically during the optimization procedure. To this end, the Hyperband algorithm is used to generate random sample configurations, and HV is taken as the cost function. During the optimization process, the algorithm allocates exponentially more resources, or budget, to the more promising configurations. The proposed methodology is designed with a highly flexible and extensible architecture. We evaluated this approach by applying it to the multimodal multi-objective optimization (MMO) test suite of CEC'2020. Given a parameterized algorithm (in particular, in configuring NSGA-II), the obtained results highlight the best performance of the presented methodology in comparison with the other methods while reducing the computational time. However, the applicability of the proposed method can be limited when many objective functions are present in a multi-objective optimization problem. Moreover, the proposed method might not be the best option when there is no limit on computational power and the algorithm configurator can be executed for a long time. In future work, we plan to extend this version of the algorithm to several objectives, where these objectives are other performance assessment metrics for multi-objective optimization. We also intend to apply our approach to other single- and multi-objective metaheuristic algorithms for challenging combinatorial problems.
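Since HV serves as the cost function, each candidate configuration is scored by the hypervolume of the front it produces. A minimal standard-library sketch for a two-dimensional minimization front is shown below; the paper relies on jMetalPy's indicator implementations, so this standalone version is only illustrative.

```python
def hypervolume_2d(front, ref):
    """Hypervolume of a 2-D minimization front: the area dominated by the
    front and bounded by the reference point `ref` (larger is better)."""
    # Keep only points that strictly dominate the reference point,
    # swept in order of increasing first objective.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:  # points not improving f2 are dominated; skip them
            area += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return area

# Two mutually non-dominated points against reference point (2, 2).
hv = hypervolume_2d([(0.0, 1.0), (1.0, 0.0)], (2.0, 2.0))
```

A configurator then simply prefers the configuration whose final front yields the larger `hv` under a fixed reference point shared by all runs.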

Table note: A: scalable number of variables; B: scalable number of objectives; C: Pareto optima known; D: Pareto front geometry; E: Pareto set geometry; F: scalable number of Pareto sets; G: N_ops = N_global + N_local, where N_ops is the number of Pareto sets (PSs) to be obtained, N_global the number of global PSs, and N_local the number of local PSs.

Fig. 2 The convergence plots of HV indicator for different methods

Fig. 4 The convergence plots of HV indicator for different methods

Fig. 5 The overall rank of AAC methods using HV metric over the MMO problems

Table 3 The obtained results of seven methods on MMO functions from CEC'2020. Significant results are in boldface