A Kernel Search Algorithm for Virtual Machine Consolidation Problem

Virtual machine consolidation describes the process of reallocating virtual machines (VMs) on a set of target servers. It can be formulated as a mixed integer linear programming problem, which is proven to be NP-hard. In this paper, we propose a kernel search (KS) heuristic algorithm based on hard variable fixing to quickly obtain a high-quality solution for large-scale virtual machine consolidation problems (VMCPs). Since the variable fixing strategies in existing KS works may render the VMCP infeasible, our proposed KS algorithm employs a more efficient strategy that chooses the set of fixed variables according to the corresponding reduced costs. Numerical results on VMCP instances demonstrate that the proposed KS algorithm significantly outperforms a state-of-the-art mixed integer linear programming solver in terms of CPU time, and that the proposed variable fixing strategy significantly improves the efficiency of the KS algorithm while the degradation of solution quality is negligible.


Introduction
Nowadays, due to the extensibility and flexibility of cloud computing, more and more internet services are delivered through it. Indeed, internet services can be instantiated inside virtual machines (VMs) and flexibly allocated to any server in a cloud data center. However, as the set of virtual machines changes dynamically, VM allocations become improper and load distribution becomes imbalanced, so the high energy consumption and low efficiency of cloud data centers grow increasingly serious. In fact, for large data centers, 15 to 20 percent resource utilization is common [1], and even an activated server kept idle consumes up to 66 percent of its peak power [2]. To improve efficiency and reduce energy consumption, cloud service providers employ VM migration technology to dynamically reallocate VMs, migrating old VMs among servers and mapping new VMs onto servers in the data center. This problem is called the virtual machine consolidation problem in the literature.
In general, virtual machine consolidation can be formulated as a mixed integer linear programming (MILP) problem that determines the activated servers and the reallocation of VMs such that the sum of server activation, VM allocation, and VM migration costs is minimized subject to the resource constraints of the servers and other practical constraints. The VMCP is strongly NP-hard [3], so there is no polynomial-time algorithm that solves the VMCP to optimality unless P = NP. Therefore, existing works mainly focus on heuristic [3][4][5][6][7] and metaheuristic [8][9][10][11] algorithms for solving the VMCP. Among heuristic algorithms, greedy heuristics are the most popular for solving VMCPs [12,13]; they are built on the classical first fit, best fit, first fit decreasing, and best fit decreasing mechanisms [5,6]. Greedy heuristics are used as a baseline for comparison in many works because they provide fast solutions, but they are more problem-dependent than other algorithms. More specifically, their design always depends on the specific problem structure and thus does not extend easily to other, similar problems. References [3] and [7] proposed heuristics that are more problem-independent than greedy algorithms. However, there is still a non-trivial gap, from 6% to 49% on average, between the solutions found by these heuristics and the optimal solution [3,4,7]. Although metaheuristic algorithms [8][9][10][11] are problem-independent and can provide better solutions than greedy algorithms, they require more iteration time.
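As a concrete illustration of the greedy mechanisms mentioned above, the following sketch implements first fit decreasing for a single resource dimension. It is a minimal sketch under our own naming and a one-dimensional simplification, not the algorithm of any cited reference.

```python
def first_fit_decreasing(vms, capacity):
    """Pack VM demands onto identical servers of the given capacity.

    VMs are sorted by decreasing demand; each VM goes to the first server
    with enough remaining capacity, and a new server is opened when none fits.
    Returns the list of per-server demand lists.
    """
    remaining = []  # free capacity of each opened server
    loads = []      # demands packed on each opened server
    for d in sorted(vms, reverse=True):
        for i, free in enumerate(remaining):
            if d <= free:
                remaining[i] -= d
                loads[i].append(d)
                break
        else:  # no opened server fits: activate a new one
            remaining.append(capacity - d)
            loads.append([d])
    return loads
```

Such a routine is fast but, as noted above, tied to the specific packing structure of the problem; it ignores migration and activation costs entirely.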
Fortunately, the kernel search (KS) algorithm is problem-independent and can quickly obtain a high-quality solution. The standard KS, first proposed in reference [14], was applied to the multi-dimensional knapsack problem, which is strongly NP-hard. Subsequently, KS-based algorithms have been successfully applied to portfolio optimization problems [15], facility location problems [16][17][18], index tracking problems [19,20], and general mixed integer programming problems [21]. Moreover, the KS heuristic requires little implementation effort, since the most cumbersome part of the search is carried out by a state-of-the-art MILP solver [15]. Unfortunately, to the best of our knowledge, there is no research on using KS to solve the VMCP. Furthermore, our preliminary experiments show that the standard KS is time-consuming on the VMCP. Therefore, we present a variant of kernel search based on hard variable fixing to address the VMCP. A crucial issue in variable fixing is the fixing strategy, which largely determines the efficiency of the KS algorithm. Reference [17] proposed a variant of KS based on hard variable fixing that fixes some binary variables to their values in the LP solution. However, our preliminary experiments indicate that this fixing strategy renders some VMCP instances infeasible. For this reason, our proposed variant adopts a more sophisticated but efficient fixing strategy that chooses the fixed binary variables according to their reduced costs so as to avoid infeasibility. As integer variables dominate the MILP formulation of the VMCP, we apply a similar fixing strategy to them. For a more detailed introduction to hard variable fixing, we refer readers to references [22,23].
The main contribution of this paper is a KS heuristic algorithm based on hard variable fixing that quickly obtains a high-quality solution for large-scale VMCPs. We provide a new variable fixing strategy that enhances the efficiency of the KS algorithm and avoids making the VMCP infeasible. In addition to fixing binary variables, we also fix the integer variables, which dominate the VMCP. Extensive computational results show that the proposed KS algorithm significantly outperforms three settings of the standard MILP solver that emphasize its heuristics, and that the proposed variable fixing strategy significantly improves the efficiency of the KS algorithm while the degradation of solution quality is negligible.
The paper is organized as follows. Section 2 presents the MILP formulation of the VMCP. Section 3 describes the proposed kernel search algorithm for solving VMCPs. Section 4 reports computational results. Finally, Section 5 draws some concluding remarks.

Virtual machine consolidation problem
Virtual machine consolidation describes the process of combining several different virtual machines and assigning them to a set of target servers. It can be used to optimize the allocation of VMs and servers so as to minimize the allocation costs of VMs, the activation costs of servers, and the migration costs of VMs. In this section, we follow [24] and present a compact mixed integer linear programming formulation of the VMCP.
Fig. 1 depicts an example of virtual machines before and after consolidation. Let J, I, and R denote the set of servers, the set of VM types to be allocated to the servers, and the set of server resources, respectively. As shown in Fig. 1, we consider a set of servers with capacity s_{j,r} for each j ∈ J, r ∈ R, and a set of old and new virtual machines with demand u_{i,r} for each i ∈ I, r ∈ R. Before consolidation, there are n_{i,j} old VMs of type i (e.g., VMs 1-4 in Fig. 1) currently allocated on server j, and Σ_{i∈I} d^{new}_i new incoming VMs (e.g., VM 5 in Fig. 1) that need to be allocated to servers. For notational purposes, we denote d_i = Σ_{j∈J} n_{i,j} for all i ∈ I. To model the consolidation, we introduce the integer variable x_{i,j} for the number of old VMs of type i allocated to server j, the binary variable y_j indicating whether or not server j is activated, the integer variable z_{i,j} for the number of old VMs of type i migrated to server j, and the integer variable x^{new}_{i,j} for the number of new incoming VMs of type i allocated to server j. With this notation, the VMCP can be formulated as follows:

  min  Σ_{i∈I} Σ_{j∈J} c^{alloc}_{i,j} x_{i,j} + Σ_{j∈J} c^{run}_j y_j + Σ_{i∈I} Σ_{j∈J} c^{mig}_{i,j} z_{i,j} + Σ_{i∈I} Σ_{j∈J} c^{new}_{i,j} x^{new}_{i,j}   (1a)
  s.t. Σ_{i∈I} u_{i,r} (x_{i,j} + x^{new}_{i,j}) ≤ s_{j,r} y_j,  ∀ j ∈ J, r ∈ R,   (1b)
       z_{i,j} ≥ x_{i,j} − n_{i,j},  ∀ i ∈ I, j ∈ J,   (1c)
       Σ_{j∈J} x_{i,j} = d_i,  ∀ i ∈ I,   (1d)
       Σ_{j∈J} x^{new}_{i,j} = d^{new}_i,  ∀ i ∈ I,   (1e)
       x_{i,j}, z_{i,j}, x^{new}_{i,j} ∈ Z_+,  ∀ i ∈ I, j ∈ J,   (1f)
       y_j ∈ {0, 1},  ∀ j ∈ J.   (1g)

The objective function (1a) to be minimized is the sum of the cost of allocating all old VMs to servers, the activation cost of the servers, the cost of migrating VMs among servers, and the cost of assigning all new incoming VMs to servers.
Here the coefficients c^{run}_j, c^{alloc}_{i,j}, c^{mig}_{i,j}, and c^{new}_{i,j} are nonnegative and represent the activation cost of server j, the cost of allocating an old VM of type i to server j, the cost of migrating a VM of type i to server j, and the cost of assigning a new incoming VM of type i to server j, respectively. Constraint (1b) ensures that, for every resource, the capacity of each server suffices for the aggregate workload of the VMs allocated to it. Constraint (1c) ensures that if the number of VMs of type i allocated to server j after the consolidation, x_{i,j}, is larger than that before the consolidation, n_{i,j}, then the number of VMs of type i migrated to server j, z_{i,j}, equals x_{i,j} − n_{i,j} (since c^{mig}_{i,j} ≥ 0); otherwise it equals zero (since z_{i,j} ∈ Z_+). Constraints (1d) and (1e) state that all old and new VMs of each type must be assigned to servers, respectively. Finally, constraints (1f) and (1g) restrict x_{i,j}, z_{i,j}, and x^{new}_{i,j} to be nonnegative integers and y_j to be binary. In the next section, we develop an efficient customized algorithm based on the kernel search heuristic for solving large-scale VMCPs.
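To make the formulation concrete, the following sketch evaluates the objective (1a) and checks constraints (1b)-(1e) for a candidate consolidation plan. It is an illustrative checker under our own naming, not code from the paper: the activation variables y and migration counts z are derived from the plan rather than optimized.

```python
def evaluate_plan(x, x_new, n, d_new, u, s, c_run, c_alloc, c_mig, c_new):
    """Objective (1a) and feasibility of (1b)-(1e) for a candidate plan.

    x[i][j]/x_new[i][j]: old/new VMs of type i on server j after consolidation,
    n[i][j]: old VMs before consolidation, d_new[i]: new VMs of type i to place,
    u[i][r]: demand, s[j][r]: capacity; remaining arguments: cost coefficients.
    """
    I, J, R = len(x), len(x[0]), len(s[0])
    # A server is activated as soon as it hosts any VM.
    y = [int(any(x[i][j] + x_new[i][j] > 0 for i in range(I))) for j in range(J)]
    # (1c): migrations are the increase in the number of hosted VMs of a type.
    z = [[max(x[i][j] - n[i][j], 0) for j in range(J)] for i in range(I)]
    feasible = all(  # (1b): every resource of every server must suffice
        sum(u[i][r] * (x[i][j] + x_new[i][j]) for i in range(I)) <= s[j][r] * y[j]
        for j in range(J) for r in range(R))
    feasible = feasible and all(sum(x[i]) == sum(n[i]) for i in range(I))     # (1d)
    feasible = feasible and all(sum(x_new[i]) == d_new[i] for i in range(I))  # (1e)
    cost = sum(c_run[j] * y[j] for j in range(J)) + sum(
        c_alloc[i][j] * x[i][j] + c_mig[i][j] * z[i][j] + c_new[i][j] * x_new[i][j]
        for i in range(I) for j in range(J))
    return cost, feasible
```

For a tiny instance with one VM type, two servers, and one resource, the checker reproduces the cost decomposition of (1a) term by term.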

The kernel search algorithm
In this section, we first give an intuitive, general description of the standard kernel search algorithm and then develop a new kernel search algorithm designed to quickly obtain a high-quality feasible solution for large-scale VMCPs. Most of the notation and definitions introduced in this section are used throughout the remainder of the paper.

The standard kernel search algorithm
Our description of the standard KS algorithmic framework mainly follows references [14,15,17,21]. To simplify the notation, let V and G denote the set of binary variables and the set of general integer variables in VMCP (1), respectively. We refer to the MILP problem including all variables in V ∪ G as the original problem, and denote by MILP(U) the restricted problem in which the binary variables in V\U (U ⊆ V) are fixed to zero. KS is essentially a heuristic framework with a general and flexible structure applicable to any MILP problem with binary variables. Fig. 2 describes an iteration of the standard KS framework. More specifically, using the information provided by the optimal solution of the (root node) linear programming (LP) relaxation of the original problem, the standard KS framework orders the binary variables according to a design criterion in the initialization phase. Under a common design criterion, the further left a binary variable is ranked in Fig. 2(a), the more likely it is to take the value 1 in an optimal solution of the original problem. We select the first |K| promising binary variables from the left (the circles in Fig. 2(a)) to construct the initial kernel K ⊂ V. The remaining binary variables in V\K (the rhombuses in Fig. 2(a)) are partitioned into N subsets, called buckets, denoted B_i, i = 1, ..., N. The size of the initial kernel involves a trade-off: (i) if K is too small, the optimal solution of MILP(U) with U := K is of poor quality, or the problem is even infeasible (solution quality); (ii) if K is too large, the optimal solution of MILP(U) with U := K cannot be found within a reasonable time (solution efficiency).
Neither (i) nor (ii) is ideal. We first solve the restricted problem that considers only the variables in K ∪ G (problem MILP(U) with U := K in Fig. 2(b)), and denote its solution by (v_min, g_min) and the corresponding objective value by UB_min. If the total computational time after solving the first restricted problem is still within the predefined time limit T_max, we solve the remaining restricted problems in the sequence (Figs. 2(c) and 2(d)), each considering the previous U plus the binary variables of the current bucket B_i. Two additional constraints, (1a) ≤ UB_min (2) and Σ_{j∈B_i} v_j ≥ 1 (3), are introduced before solving the restricted problem MILP(U ∪ B_i) to reduce the computational time required by the solver: the former requires any new solution to improve on the incumbent, and the latter forces at least one variable of the current bucket into the solution. This procedure is repeated until the number of buckets already analyzed reaches a limit N̄ ≤ N. The details of the standard KS are summarized in Algorithm 1.
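The initialization step described above can be sketched as follows. Sorting by LP-relaxation value is one common design criterion; the function and argument names are our own illustration.

```python
def build_kernel_and_buckets(lp_values, kernel_size, bucket_size):
    """Order binary variables by their LP-relaxation value (descending),
    take the first `kernel_size` indices as the initial kernel, and
    partition the rest into the bucket sequence B_1, ..., B_N."""
    order = sorted(range(len(lp_values)), key=lambda j: -lp_values[j])
    kernel = order[:kernel_size]
    rest = order[kernel_size:]
    buckets = [rest[k:k + bucket_size] for k in range(0, len(rest), bucket_size)]
    return kernel, buckets
```

Variables with LP value closest to 1 land in the kernel, mirroring the "further left, more promising" ordering of Fig. 2(a).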

The proposed kernel search algorithm
The kernel search algorithm has been shown to obtain high-quality solutions for various MILP problems with binary variables [15][16][17][18][19][21]. However, our preliminary experiments showed that, because the solution space of the restricted problems is still too large, solving them is time-consuming, leading to very poor performance of KS on VMCPs. Fortunately, the efficiency of KS can be improved by means of variable fixing [17]. Therefore, we present a variant of KS based on a new variable fixing strategy and apply it to solve large-scale VMCPs. The key change in our proposed KS, compared with the standard KS, is to choose some binary and integer variables and fix their values. The fixing strategies are crucial to the effectiveness and efficiency of the proposed KS algorithm. We therefore first discuss the binary variable fixing strategy, which reduces the number of binary variables in all restricted MILP problems. The straightforward strategies for fixing binary variables are as follows: 1) if v*_j = 0, the associated binary variable is permanently fixed to zero; 2) if v*_j = 1, the associated binary variable is permanently fixed to one; where v* is the optimal solution of the LP relaxation of the original problem. An existing variant of the KS algorithm using strategies 1) and 2) was proposed in reference [17], and its numerical results indicated that this strategy improves the efficiency of the KS algorithm with minor deterioration of solution quality. However, our preliminary experiments indicated that strategies 1) and 2) render some VMCP instances infeasible. For this reason, inspired by basic linear programming theory [25], we use a more sophisticated but efficient
strategy to avoid infeasibility. Each binary variable v_j has an associated reduced cost r*_j, obtained by solving the LP relaxation of the original problem. The reduced cost is a lower bound on the increase of the LP objective value if the value of the variable is increased by one unit. The strategies in our implementation are as follows: a) if r*_j ≥ ε for j ∈ V, the associated binary variable v_j is permanently fixed to zero (the squares in Fig. 3) and added to set Z; b) if r*_j ≤ −ε for j ∈ V, the associated binary variable v_j is permanently fixed to one (the triangles in Fig. 3) and added to set O; where ε > 0 controls the number of fixed binary variables. In our implementation, we set ε = 10^{-4}. Furthermore, since integer variables dominate the variables of the VMCP, we apply the following similar strategy to fix integer variables, which further improves the efficiency of the proposed KS algorithm significantly: c) if r*_j ≥ ε for j ∈ G, the associated integer variable g_j is permanently fixed to zero. Our proposed KS algorithm is summarized in Algorithm 2. After permanently fixing binary and integer variables according to strategies a), b), and c), and sorting the remaining unfixed binary variables, we construct the initial kernel K and the bucket sequence {B_i}, i = 1, ..., N. We then solve problem MILP(U) with U := K; however, a few instances are infeasible because the initial kernel is too small. To handle this situation, the proposed KS follows [21] and iteratively increases the size of the kernel until MILP(U) becomes feasible, at which point a new kernel and bucket sequence are created. The remaining steps are the same as those of the standard KS in Algorithm 1.
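Strategies a)-c) can be sketched as a single pass over the reduced costs. This is an illustrative sketch under our own naming; in the actual implementation the reduced costs come from the solver's LP relaxation.

```python
EPS = 1e-4  # threshold ε controlling how many variables are fixed

def fix_by_reduced_cost(reduced_costs, is_binary):
    """Apply strategies a)-c): fix a variable to 0 when its reduced cost
    is >= EPS, and a *binary* variable to 1 when its reduced cost is <= -EPS.
    Variables with |reduced cost| < EPS stay free.
    Returns (zero-fixed set Z, one-fixed set O, free index list)."""
    Z, O, free = set(), set(), []
    for j, (rc, binary) in enumerate(zip(reduced_costs, is_binary)):
        if rc >= EPS:
            Z.add(j)                  # strategies a) and c)
        elif binary and rc <= -EPS:
            O.add(j)                  # strategy b)
        else:
            free.append(j)
    return Z, O, free
```

Unlike fixing on the LP values themselves, variables whose reduced cost is smaller than ε in magnitude remain free, which is what leaves the restricted problems room to recover feasibility.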

Numerical results
In this section, the effectiveness and efficiency of the proposed KS algorithm for solving VMCPs are evaluated through simulation experiments. First, we demonstrate the performance of the proposed KS algorithm on VMCPs and compare it with the standard MILP solver under three different settings. Then we evaluate the impact of the variable fixing strategy on the performance of the KS algorithm. All experiments were conducted on a cluster of Intel(R) Xeon(R) Gold 6140 @ 2.30GHz machines with 192 GB RAM, running Linux in 64-bit mode.
Algorithm 2 The proposed KS algorithm
Input: VMCP, N, T_max, and ω. Output: UB_min.
1: Initialize UB_min := +∞, i := 1, O := ∅, and Z := ∅;
2: Solve the root-node LP relaxation of the original VMCP;
3: Permanently fix binary/integer variable values using the following strategies:
4:   (a) if reduced cost r*_j ≥ ε for j ∈ V, fix the binary variable v_j to zero and set Z := Z ∪ {v_j};
5:   (b) if reduced cost r*_j ≤ −ε for j ∈ V, fix the binary variable v_j to one and set O := O ∪ {v_j};
6:   (c) if reduced cost r*_j ≥ ε for j ∈ G, fix the integer variable g_j to zero;
7: Sort the unfixed binary variables in V\(O ∪ Z) according to a predefined sorting criterion;
8: Construct the initial kernel K from the first |K| binary variables in V\(O ∪ Z);
9: Partition the binary variables in V\(K ∪ O ∪ Z) into a sequence {B_i}, i = 1, ..., N, of buckets;
10: Set the maximum time limit T_i := T_max/(N + 1) for each bucket;
11: Solve problem MILP(U) with U := K;
12: if no feasible solution to MILP(U) is found then [21]
13:   while MILP(U) is not feasible do
14:     Add the first |K| × ω variables of the bucket sequence to set U;
15:     Solve problem MILP(U);
16:   end while
17:   Redefine the kernel K := U and the bucket sequence {B_i}, i = 1, ..., N;
18: end if
19: while i ≤ N̄ do
20:   Add the two constraints (2) and (3) to MILP(U ∪ B_i);
21:   Solve problem MILP(U) with U := U ∪ B_i within time limit T_i and denote its objective value by UB_i;
22:   if UB_i < UB_min then
23:     Update UB_min := UB_i;
24:   end if
25:   Set i := i + 1;
26: end while

In our experiments, the proposed KS was implemented in C++ and linked with the IBM ILOG CPLEX 20.1.0 optimizer [26]. After preliminary experiments, we set the following CPLEX parameters in the proposed KS. To save computation time when obtaining the root-node LP information, we do not apply the RINS heuristic (parameter RINSHeur) or the MILP heuristic (parameter HeurFreq), and we turn off the feasibility pump heuristic (parameter FPHeur) and the local branching heuristic (parameter LBHeur). For all restricted problems MILP(U), we use pseudo costs to drive the selection of the branching variable at each node (parameter VarSel) and generate mixed integer rounding cuts (parameter MIRCuts) moderately. All other CPLEX parameters were kept at their default values. Following [17], the three settings of the standard MILP solver CPLEX compared against the proposed KS algorithm are as follows:
• CPX-A: CPLEX with all parameters at their default values, except that parameter MIPEmphasis is set to feasibility.
• CPX-B: CPLEX with all parameters at their default values, except that the RINS heuristic (parameter RINSHeur) is applied every 20 nodes.
• CPX-C: CPLEX with all parameters at their default values, except that the local branching heuristic (parameter LBHeur) is turned on.
To gain insight into the effectiveness of the proposed variable fixing strategy, we consider three versions of the KS algorithm:
• KS(V, G): the proposed KS algorithm with variable fixing strategies a), b), and c).
• KS(V): the same settings as KS(V, G), except that integer variables are not fixed (strategy c) is not applied).
• KS': the standard kernel search algorithm without any variable fixing strategy.
Finally, the time limit and the number of threads were set to 7200 seconds and 12, respectively. Since optimal solutions for the VMCP instances are not available in the literature, we use the best solution found by CPLEX with default settings within 5 hours, denoted f*, to validate the performance of the three CPLEX settings and the three versions of the KS algorithm.

Testsets
All algorithms were tested on VMCP instances with 5 VM types and 10 server types with different features, as studied in [4]. The VM types and server types are reported in Tables 1 and 2, respectively. We generated four sizes of VMCP instances, with the number of servers |K| ∈ {7000, 8000, 9000, 10000}; each instance has an equal number of servers of each type. The VMCP instances are constructed by the following procedure. First, we randomly select elements k ∈ K into the subset K̄ with probability α = 50%. Then, for each server k ∈ K̄, we iteratively assign a random number n_{i,k} of VMs of type i until the maximum usage of the available resources of server k, σ_k = max_{r∈R} Σ_{i∈I} n_{i,k} u_{i,r} / s_{k,r}, exceeds a predefined value β. Finally, to obtain the parameters d^{new}_i for all i ∈ I, we iteratively assign a random number d^{new}_i of VMs of type i until the maximum usage of the total available resources, τ = max_{r∈R} Σ_{i∈I} d^{new}_i u_{i,r} / Σ_{k∈K} s_{k,r}, exceeds a predefined value γ. In general, the larger β and γ, the more old and new incoming VMs are constructed. As in references [4,10,27], we consider the linear power consumption model P_k = P_{idle,k} + (P_{max,k} − P_{idle,k}) ρ_k, where P_{idle,k} is the power consumption of server k in the idle state, P_{max,k} is the maximum (peak) power consumption of server k, and ρ_k is the CPU utilization of server k. Following [4], the activation cost c^{run}_k is set to P_{idle,k}, the allocation cost c^{alloc}_{i,k} is set to (P_{max,k} − P_{idle,k}) u_{i,CPU}/s_{k,CPU}, and the idle power consumption P_{idle,k} is set to 60% of the maximum power consumption P_{max,k}. As described in Section 2, c^{mig}_{i,k} is set to c^{alloc}_{i,k} in our experiments.
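The cost construction above can be sketched in a few lines. This is an illustrative helper under our own naming, for a single server type; only the 60% idle fraction and the (P_max − P_idle)·u_CPU/s_CPU allocation cost follow the text.

```python
def instance_costs(p_max, u_cpu, s_cpu, idle_fraction=0.6):
    """Derive activation, allocation, and migration costs from the linear
    power model used in the experiments: P_idle = 0.6 * P_max,
    c_run = P_idle, and c_alloc[i] = (P_max - P_idle) * u_cpu[i] / s_cpu.
    c_mig is set equal to c_alloc, as in the text."""
    p_idle = idle_fraction * p_max
    c_run = p_idle
    c_alloc = [(p_max - p_idle) * u / s_cpu for u in u_cpu]
    c_mig = list(c_alloc)
    return c_run, c_alloc, c_mig
```

With this construction, a server at full CPU load incurs exactly its peak power P_max as the sum of activation and allocation costs.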

Efficiency of the proposed KS algorithm
In this subsection, we present numerical results illustrating the efficiency of the proposed KS algorithm compared with the standard MILP solver under three different settings. Table 3 summarizes the computational results of the proposed KS algorithm and the three CPLEX settings. For each data set, we report the average error (Gap %) with respect to f* and the geometric mean of the CPU time (T) in seconds. The best solution value found by the proposed KS algorithm or a CPLEX setting is denoted f^H. The error for each instance is computed as 100 (f^H − f*)/f* and then geometrically averaged over all instances belonging to the same data set to obtain the statistic Gap %. As observed in Table 3, compared with the CPLEX settings CPX-A, CPX-B, and CPX-C, the CPU time taken by the KS algorithm to solve the VMCP is much smaller (169.94 seconds versus 357.17, 269.79, and 296.22 seconds, respectively). In particular, we observe that the quality of the solutions found by KS(V, G) is no worse than that of CPX-A, CPX-B, and CPX-C, and is even slightly better. From Table 3, we conclude that the performance of KS(V, G) is much better than that of the three CPLEX settings for all |K| ∈ {7000, 8000, 9000, 10000} and β ∈ {20%, 40%}.
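The two statistics reported in Table 3 can be sketched as follows; the function names are ours, and the geometric mean is shown for the (strictly positive) CPU times.

```python
import math

def gap_percent(f_h, f_star):
    """Error of a heuristic value f_h with respect to the reference f*."""
    return 100.0 * (f_h - f_star) / f_star

def geometric_mean(values):
    """Geometric mean, used here to aggregate CPU times over a data set;
    all values must be strictly positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

The geometric mean damps the influence of a few very slow instances compared with the arithmetic mean, which is why it is the customary aggregate for running times.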
To gain more insight into the error of KS(V, G) relative to the three CPLEX settings, we compare the worst error (Worst Gap %) returned by KS(V, G) and the three CPLEX settings. The statistic Worst Gap % reports the worst error over all instances belonging to the same data set. The comparison is summarized in Table 4, from which we conclude that the worst error returned by KS(V, G) is no worse than that of the three CPLEX settings. From the above computational results, we conclude that the proposed KS(V, G) algorithm outperforms the three CPLEX settings in both solution quality and CPU time on large-scale VMCP instances.

Performance of the proposed strategy of variable fixing
To demonstrate the advantage of applying the proposed variable fixing strategy to the KS algorithm, we compare KS(V, G) with KS(V) and KS' on the VMCPs. The proposed KS(V, G) fixes more variables than KS(V), and KS' fixes no variables. Table 5 reports the computational results of the three versions of the KS algorithm. As expected, (i) the CPU time of KS(V, G) is the smallest of the three versions, and (ii) the CPU time of KS' is the largest. This is reasonable, since the solution space of every restricted problem shrinks as the number of fixed variables grows. Furthermore, the quality of the solutions generally deteriorates as the number of fixed variables grows: the solutions found by KS(V, G) are slightly worse than those of KS(V) and KS', but the improvements in CPU time are remarkable. On some data sets, the average error of KS(V, G) is even slightly smaller than that of KS(V) or KS'; this is because some instances cannot be solved by KS(V) or KS' within the time limit, which inflates their average error. Next, we compare the worst errors of KS(V, G), KS(V), and KS', summarized in Table 6.
We observe that the worst error of the three versions of the KS algorithm grows as the value of β decreases; similar behavior can be seen in the CPU times reported in Table 5. In summary, the improvement in CPU time brought by the proposed variable fixing strategy is significant, while its impact on solution quality is negligible.

Conclusion
In this work, we have designed a new KS algorithm for the solution of large-scale VMCPs. The proposed KS algorithm is based on a new variable fixing strategy that, unlike the existing KS fixing strategy, avoids rendering the VMCP infeasible, making it better suited to large-scale VMCPs. Extensive computational experiments on large VMCP instances show that the proposed KS algorithm outperforms three heuristic-emphasizing settings of the standard MILP solver, and that the proposed variable fixing strategy significantly improves the efficiency of the KS algorithm while the degradation of solution quality is negligible.

Fig. 1 :
Fig. 1: An example of the virtual machine consolidation problem.
(a) Initialization: initial kernel and buckets. (b) Initialization: solve problem MILP(K).

Fig. 2 :
Fig. 2: An illustrative example of the operation of the standard KS algorithm.
Fig. 2(a) shows the initial kernel (circles) and the remaining binary variables in V\K (rhombuses), which are partitioned into N subsets (called buckets) B_i, i = 1, ..., N. Fig. 2(b) shows the first restricted problem, which considers only the variables in K ∪ G (problem MILP(U) with U := K); its solution is denoted (v_min, g_min) and its objective value UB_min. Figs. 2(c) and 2(d) show the subsequent restricted problems in the sequence, each considering the previous U plus the binary variables of bucket B_i under the two additional constraints (1a) ≤ UB_min (2) and Σ_{j∈B_i} v_j ≥ 1 (3).

Fig. 3 :
Fig. 3: An illustrative example of the operation of the proposed KS algorithm.

Fig. 3
Fig. 3 describes an iteration of the proposed KS algorithmic framework.

Algorithm 1 The standard KS algorithm
Input: VMCP, N, and T_max. Output: UB_min.
1: Initialize UB_min := +∞ and i := 1;
2: Solve the (root node) LP relaxation of the original VMCP;
3: Sort the binary variables according to a predefined sorting criterion;
4: Construct the initial kernel K from the first |K| binary variables;
5: Partition the binary variables in V\K into a sequence {B_i}, i = 1, ..., N, of buckets;
6: Set the maximum time limit T_i := T_max/(N + 1) for each bucket;
7: Solve problem MILP(U) with U := K;
8: while i ≤ N̄ do
9:   Add the two constraints (2) and (3) to MILP(U ∪ B_i);
10:  Solve problem MILP(U) with U := U ∪ B_i within time limit T_i and denote its objective value by UB_i;
11:  if UB_i < UB_min then
12:    Update UB_min := UB_i;
13:  end if
14:  Set i := i + 1;
15: end while
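To make the control flow of the standard KS loop concrete, the following self-contained sketch runs kernel search on a toy binary covering problem, using brute-force enumeration as a stand-in for the MILP solver. The toy problem (minimize c·v subject to a·v ≥ b), the ratio-based sorting criterion, and all names are our own illustration, not the VMCP instances of the paper.

```python
from itertools import product

def solve_restricted(allowed, a, c, b, ub=None, force_one_of=None):
    """Brute-force stand-in for the MILP solver: minimize c.v subject to
    a.v >= b over binary v, with v_j = 0 for every j not in `allowed`.
    `ub` mimics constraint (2); `force_one_of` mimics constraint (3)."""
    allowed = list(allowed)
    best_v, best_obj = None, float("inf")
    for bits in product([0, 1], repeat=len(allowed)):
        v = dict(zip(allowed, bits))
        if sum(a[j] * v[j] for j in allowed) < b:
            continue  # violates the covering constraint
        if force_one_of and not any(v.get(j, 0) for j in force_one_of):
            continue  # constraint (3): use at least one bucket variable
        obj = sum(c[j] * v[j] for j in allowed)
        if obj < best_obj and (ub is None or obj <= ub):  # constraint (2)
            best_v, best_obj = v, obj
    return best_v, best_obj

def kernel_search(a, c, b, kernel_size=2, bucket_size=1):
    """Steps of Algorithm 1: sort, build kernel and buckets, solve MILP(K),
    then scan the buckets, keeping the best upper bound UB_min."""
    n = len(c)
    order = sorted(range(n), key=lambda j: c[j] / a[j])  # cheapest coverage first
    kernel, rest = order[:kernel_size], order[kernel_size:]
    buckets = [rest[k:k + bucket_size] for k in range(0, len(rest), bucket_size)]
    U = list(kernel)
    _, ub_min = solve_restricted(U, a, c, b)  # first restricted problem MILP(K)
    for bucket in buckets:
        v, obj = solve_restricted(U + bucket, a, c, b,
                                  ub=ub_min, force_one_of=bucket)
        if v is not None and obj < ub_min:
            ub_min = obj
        U += bucket  # next iteration considers the previous U plus this bucket
    return ub_min
```

On the small instance below, the initial kernel alone yields an upper bound of 5, which the first bucket improves to the optimum 4.5; later buckets are pruned by constraints (2) and (3).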

Table 1 :
The five VM types.

Table 2 :
The ten server types.

Table 3 :
Comparison results of KS(V, G) and three CPLEX settings

Table 4 :
Worst error comparison results of KS(V, G) and three CPLEX settings

Table 5 :
Comparison results of KS(V, G), KS(V), and KS'

Table 6 :
Worst error comparison results of KS(V, G), KS(V), and KS'