Enhanced Virtual Machine Placement in Cloud Data Centers: Combinations of Fuzzy Logic with Reinforcement Learning and Biogeography-Based Optimization (BBO) Algorithms

Virtual Machine (VM) placement, the process of mapping VMs to Physical Machines (PMs), affects the performance of Cloud Data Centers (DCs). To enhance this performance, recent research has proposed placing VMs optimally with respect to conflicting objectives, for example Multi-Objective VM reBalance (MOVMrB) and Reinforcement Learning VM reBalance (RLVMrB). The MOVMrB algorithm is based on the Biogeography-Based Optimization (BBO) meta-heuristic, and the RLVMrB algorithm is inspired by reinforcement learning; both use the non-dominance method to evaluate generated solutions. Although this approach reaches acceptable results, once it meets the best solution with respect to one objective, it fails to consider other solutions that are optimal regarding all objectives. In this paper, we propose two enhanced multi-objective algorithms, Fuzzy-RLVMrB and Fuzzy-MOVMrB, that consider all objectives when evaluating candidate solutions in the solution space. All four algorithms aim to balance the load between VMs in terms of processor, bandwidth, and memory, as well as horizontal and vertical load balance. We simulated all algorithms using the CloudSim simulator and compared them in terms of horizontal and vertical load balance and execution time. The simulation results show that the Fuzzy-RLVMrB and Fuzzy-MOVMrB algorithms outperform the RLVMrB and MOVMrB algorithms in both vertical and horizontal load balancing. In addition, the RLVMrB and Fuzzy-RLVMrB algorithms have shorter execution times than the MOVMrB and Fuzzy-MOVMrB algorithms.


Introduction
Cloud computing, as a distributed computing model, provides everything as a service and aims to scale down the cost for both customers and providers [1]. In this regard, virtual machine (VM) placement, which is the mapping of VMs to Host Machines (HMs), is vitally important. Optimal VM placement makes data centers (DCs) more efficient and cuts providers' expenditure. Even though much research has been conducted on optimal VM placement, most studies have considered only one goal, such as reducing energy consumption, load balancing, increasing resource efficiency, preventing Service Level Agreement (SLA) violations, or increasing Quality of Service (QoS) [2]. However, in VM placement, several of these goals should be considered simultaneously, which makes it a multi-objective problem. In a multi-objective problem, candidate solutions are first generated using different algorithms and then compared to select the optimum solution. For VM placement, MOVMrB [3] and RLVMrB (our previous work) [4] are two multi-objective methods that demonstrated exemplary performance. MOVMrB generates candidate solutions using the Biogeography-Based Optimization (BBO) algorithm, whereas RLVMrB applies a reinforcement learning algorithm. However, in a multi-objective problem, the evaluation method used to compare candidate solutions across the various parameters affects the final results [4][5][6]. Therefore, in this paper, we propose two new algorithms, called Fuzzy-MOVMrB and Fuzzy-RLVMrB. These algorithms are the fuzzy versions of MOVMrB and RLVMrB, in which we apply a customized fuzzy logic method to evaluate the generated solutions in the VM placement problem. The proposed methods improve both horizontal load balancing, which considers the load imbalance between HMs, and vertical load balancing, which considers the load imbalance within each HM.
In contrast to Boolean logic, in which variables take a value of either one (true) or zero (false), fuzzy logic is a multi-valued logic in which the value of a variable (its degree of truth) can be any number in the range [0, 1]. A fuzzy system consists of four main components: fuzzifier, knowledge base, inference engine, and de-fuzzifier [7]. In the fuzzifier, the inputs are converted to fuzzy sets. Then, in the inference engine, the degree to which these fuzzy sets comply with the rules predefined for the system is measured. Finally, in the de-fuzzifier, the results of the inference engine are converted to a numerical value [7].
The remainder of this paper proceeds as follows. Section two reviews the literature. Section three presents the proposed methods, Fuzzy-RLVMrB and Fuzzy-MOVMrB. Section four describes the test settings and analyzes the results. Finally, the conclusion summarizes the findings and outlines future work.

Related work
In this section, the literature on VM placement is reviewed. First, the related works are investigated, and then the proposed algorithms are compared with them. Li et al. [3] proposed a multi-objective algorithm called MOVMrB for VM placement, which aimed to balance the load between HMs (horizontal load balance) and the load within each HM (vertical load balance). In both cases, load balancing was conducted with regard to CPU, bandwidth, and memory. In this work, the BBO algorithm was used to find the best solution in the solution space. To this end, two separate population sets were first generated: one population aimed at balancing the horizontal load and one aimed at balancing the vertical load. Each population contained three solutions, and information was exchanged between the two population sets. The non-dominance algorithm was used to evaluate the generated solutions. RLVMrB [4] was another work that aimed to balance horizontal and vertical loads. In this work, a multi-objective version of the reinforcement learning algorithm was proposed for VM placement. In the proposed method, the learner agent generated a new solution in each learning episode using one of three actions: random selection, mutation, and crossover. The reward received for each performed action was calculated as a weighted sum of the four parameters of horizontal and vertical load imbalance in terms of processor, bandwidth, and memory. The non-dominance algorithm was also applied in this work to evaluate the generated solutions. Azizi et al. [8] proposed a VM placement algorithm called MinPR, which aimed to reduce resource wastage and energy consumption. They applied a method called the Resource Usage Factor to minimize power consumption by turning off HMs. They ranked HMs based on performance so that higher-performance HMs were given priority.
In [7], a VM placement algorithm was proposed that aimed to reduce energy consumption and increase resource utilization. To this end, a best-fit-decreasing algorithm based on fuzzy logic was used, whose inputs were energy consumption and resource utilization. The authors simulated their proposed method using the CloudSim simulator. The evaluation results showed that they were able to improve resource wastage and energy consumption by 30% compared to heuristic algorithms. Gharehpasha et al. [9] proposed a multi-objective VM placement algorithm with the aim of reducing energy consumption and resource wastage. To achieve these goals, they used a combination of a multi-objective version of the whale optimization algorithm and a chaotic function-based multi-verse optimizer. The proposed method was also able to prevent an increase in VM migration.
In [10], the authors suggested a VM placement algorithm called multi-PSO, which used the Particle Swarm Optimization algorithm to reduce energy consumption and waste of resources. They formulated the VM placement problem as a multi-objective bin packing problem. TPVMP, proposed in [11], was an energy- and traffic-aware VM placement algorithm. The purpose of this algorithm was to reduce energy consumption and improve communication performance by reducing the cost of VM traffic. In addition, this algorithm minimized network congestion in heterogeneous DCs while leaving power consumption unchanged. It also identified overloaded HMs and selected VMs for migration using different migration policies. In 2020, Farzaei et al. [12] introduced a multi-objective genetic algorithm-based VM placement method to reduce energy consumption and resource waste, which considered the consumed traffic to raise efficiency and quality of service (QoS). Their proposed algorithm outperformed the FFD and ACO algorithms. The experimental results also showed that this algorithm achieved better performance in large search spaces. In 2020, Parvizi et al. [13] demonstrated that a genetic-based VM placement algorithm (NSGA-III) could reduce energy consumption and the number of active HMs. They compared their proposed method with algorithms such as FFD and the exact mathematical approach in terms of critical criteria such as algorithm execution time, resource wastage, and energy consumption. The results showed the superiority of the proposed algorithm. Liu et al. [14] introduced a multi-objective algorithm called NSGGA for VM placement. This algorithm tried to reduce the number of active HMs and to minimize communication traffic. Another goal of this method was to balance multidimensional resources in DCs. VMPMBBO was a multi-objective VM placement algorithm introduced by Zhang et al. in 2014.
VMPMBBO considered the placement problem as a complex system and tried to minimize both waste of resources and energy consumption. Their experiment showed better and more efficient convergence. In addition, they used a resource wastage model that considered bandwidth, memory, and CPU usage. Their power consumption model also considered the processor efficiency of the servers, and idle servers were shut down to prevent further power consumption. Gao et al. [15] proposed VMPACS, a multi-objective VM placement algorithm that used an ant colony algorithm to reduce energy consumption and resource wastage. They formulated the VM placement problem as a bin packing problem and considered CPU and memory resources in their work. In [16], the authors proposed a multi-objective VM placement algorithm called Island NSGA II. Their goal was to trade off energy consumption, resource wastage, and quality of service. Regarding these objectives, they formulated the VM placement problem as a bin packing problem and used the NSGA II algorithm. Wang et al. [17] proposed a VM placement algorithm called IGA, which aimed to maximize resource utilization, reduce traffic load, and achieve multidimensional balancing. For this purpose, they used an enhanced version of the genetic algorithm and an elitist strategy. They combined the values of all three objectives and performed the evaluation.
In [18], a multi-objective VM placement algorithm for homogeneous and heterogeneous data centers was proposed. The authors modeled VM placement as an integer linear programming problem with the goals of increasing the number of hosted VMs and reducing resource waste and the number of active HMs. They tried to increase customer satisfaction by lowering DC costs. To this end, they proposed two methods: in the first, the VM placement problem was solved for each goal in order of priority, while in the second, the goals were combined as a weighted sum. The simulation results showed that the second method reduced the number of active HMs by up to 30%. Table 1 depicts an analytical comparison of related work and our proposed methods. As shown in this table, we compare all methods in terms of the applied approach, objectives, simulation tools, and year of publication.

The proposed methodology
In single-objective problems, solutions are evaluated based on only one criterion, and finding the best solution is straightforward. In multi-objective problems, by contrast, the evaluation of generated solutions is widely considered to be the most critical challenge [3]. To address this challenge, in this paper, we apply fuzzy logic to evaluate the generated solutions in the VM replacement problem. In the rest of this section, we first formulate the VM replacement problem and then explain the proposed fuzzy system. After that, we propose two new algorithms, called Fuzzy-MOVMrB and Fuzzy-RLVMrB. These algorithms are the fuzzy versions of MOVMrB and RLVMrB, respectively.

Problem formulation
In the RLVMrB and MOVMrB algorithms, all VMs in the DC are mapped to HMs. Therefore, the output of these algorithms is a matrix of size m × n, in which m and n represent the number of VMs and the number of HMs in the DC, respectively. Both RLVMrB and MOVMrB are multi-objective, and they share the same objectives. To find the optimum placement, the output matrix in these algorithms is evaluated based on the following criteria:
1. CPU load imbalance between HMs (horizontal load balance)
2. Memory load imbalance between HMs (horizontal load balance)
3. Bandwidth load imbalance between HMs (horizontal load balance)
4. Internal load imbalance in each single HM (vertical load balance)
For each generated mapping matrix in the RLVMrB and MOVMrB algorithms, these four parameters are calculated, and the mapping matrices are evaluated using the non-dominance algorithm. Under the non-dominance algorithm, one solution is considered better than another only when all of its objectives are better than those of the other (that is, one solution dominates the other). Therefore, this algorithm fails to consider a trade-off between objectives.
The four parameters used for evaluating the output matrix can be divided into two general categories: load imbalance among HMs (in terms of processor, bandwidth, and memory) and load imbalance within each HM. These parameters are calculated using Equations 1 and 2, which are based on the coefficient of variation (CV). CV is defined as the ratio of the standard deviation to the mean, and it indicates the degree of dispersion of resource consumption relative to the mean. CV is one of the normalization methods, so the result is a dimensionless number. Table 3 depicts all symbols and notations used in these equations [4][3][20].
Table 3 Symbols and notations
The load of resource (CPU, bandwidth, and memory)
R: The number of resources
The load of resource r in the HM
The CPU, memory, and bandwidth load of the ith VM
L_r: The average load of resource r
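As a minimal sketch of the CV-based criteria described above, the following Python code computes a horizontal and a vertical imbalance value for one placement. It is illustrative only and is not a reproduction of Equations 1 and 2: the function names, the compact placement vector (host index per VM, instead of the m × n matrix), the use of the population standard deviation, and the averaging of per-HM CV values for the vertical criterion are all assumptions made for this example.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 when the mean is 0)."""
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0


def host_loads(placement, vm_loads, num_hosts):
    """Sum the per-resource VM loads on each host.

    placement[i] is the (assumed) host index of VM i;
    vm_loads[i] is a tuple such as (cpu, memory, bandwidth).
    """
    num_resources = len(vm_loads[0])
    loads = [[0.0] * num_resources for _ in range(num_hosts)]
    for vm, host in enumerate(placement):
        for r in range(num_resources):
            loads[host][r] += vm_loads[vm][r]
    return loads


def horizontal_imbalance(loads, resource):
    """CV of one resource's load across all hosts."""
    return coefficient_of_variation([host[resource] for host in loads])


def vertical_imbalance(loads):
    """Assumed aggregation: mean over hosts of the CV across each host's resources."""
    return mean(coefficient_of_variation(host) for host in loads)
```

With four identical VMs split evenly over two hosts, every horizontal value is zero, while the vertical value stays positive whenever a host uses its resources unevenly.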

Fuzzy system
As mentioned in previous sections, a fuzzy system has four functional blocks: fuzzifier, knowledge base, inference engine, and de-fuzzifier [21] [22]. In the following, we will describe each component.

Fuzzifier
The fuzzy system takes numerical data as inputs. As explained earlier, we apply the CV criterion to calculate horizontal and vertical load imbalance. Because CV is a normalization method, all four inputs of the fuzzy system are numbers between zero and one, which should be converted to fuzzy sets. In this regard, for each input, a fuzzy set is defined using a trapezoidal membership function with three linguistic variables: good, medium, and bad. The output of the fuzzy system is also considered as a fuzzy set with a trapezoidal membership function, which has five linguistic variables: very good, good, medium, bad, and very bad. For each numeric input x and each membership function, the membership degree µ(x) is calculated using Equation 3. Parameters a, b, c, and d in Equation 3 are the four scalar parameters corresponding to the four vertices of the trapezoid. Figures 1 and 2 depict the membership functions for the inputs and the output, respectively, where the x-axis represents the input values and the y-axis shows the membership degree µ(x). Because most input values are close to zero, the ranges of the membership functions are asymmetric.
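The fuzzification step can be sketched as follows, using the standard form of a trapezoidal membership function with vertices a ≤ b ≤ c ≤ d. The breakpoints in `INPUT_SETS` are hypothetical values chosen only to illustrate an asymmetric layout concentrated near zero; they are not the parameters shown in Figure 1.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in a trapezoid with feet a, d and shoulders b, c."""
    if b <= x <= c:          # plateau (also covers a == b or c == d shoulders)
        return 1.0
    if a < x < b:            # rising edge
        return (x - a) / (b - a)
    if c < x < d:            # falling edge
        return (d - x) / (d - c)
    return 0.0               # outside the support


# Hypothetical breakpoints for the three input terms (not taken from the paper).
INPUT_SETS = {
    "good": (0.0, 0.0, 0.1, 0.2),
    "medium": (0.1, 0.2, 0.4, 0.5),
    "bad": (0.4, 0.5, 1.0, 1.0),
}


def fuzzify(x, sets=INPUT_SETS):
    """Return the membership degree of x in every linguistic term."""
    return {label: trapezoid(x, *params) for label, params in sets.items()}
```

For example, under these assumed breakpoints an input of 0.15 belongs partly to "good" and partly to "medium", which is what later allows several rules to fire at once.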

Knowledge base
This part of the fuzzy system contains a set of rules of the form "If a and b and c and ... Then z", which map fuzzy inputs to the desired fuzzy output. The number of rules is usually equal to the number of combinations of all the different states of the inputs in the fuzzy system. In our problem, there exist four fuzzy sets, and each set contains three membership functions. Therefore, the number of combinations of input states is 3^4 = 81, and the rule base in our fuzzy system contains 81 rules that do not conflict with each other. Table 4 depicts some of these rules.
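The size of the rule base can be illustrated by enumerating every combination of input terms. The sketch below assigns a consequent to each antecedent with a simple scoring heuristic; this heuristic and the input names are assumptions made purely for illustration and do not reproduce the rules of Table 4.

```python
from itertools import product

# Input names are illustrative labels for the four imbalance parameters.
INPUTS = ("unev_cpu", "unev_ram", "unev_bw", "disequilibrium")
LEVELS = ("good", "medium", "bad")
OUTPUT_LEVELS = ("very good", "good", "medium", "bad", "very bad")


def build_rule_base():
    """Map each of the 3**4 antecedent combinations to an output term.

    Hypothetical heuristic: score each input 0 (good), 1 (medium) or 2 (bad),
    sum the scores (0..8) and map the sum onto the five output terms.
    """
    rules = {}
    for combo in product(LEVELS, repeat=len(INPUTS)):
        score = sum(LEVELS.index(level) for level in combo)
        rules[combo] = OUTPUT_LEVELS[(score + 1) // 2]
    return rules
```

Because each antecedent combination appears exactly once as a dictionary key, no two rules in such a table can contradict each other.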

Inference engine
In this part, the compliance of the fuzzy inputs with the rules defined in the rule base is calculated. The antecedent of a rule consists of smaller statements joined by the 'And' operation. The lowest membership degree (minimum) among the statements of each rule is taken as the output membership degree of that rule. Then, in the aggregation step, the results of all rules must be combined. For this purpose, we have used the maximum S-norm.
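The minimum and maximum operations described above can be sketched in a few lines. The data layout (a dictionary of membership degrees per input and a list of rules) is an assumption of this example, not a structure defined in the paper.

```python
def infer(memberships, rules):
    """Min-max inference.

    memberships: {input_name: {term: degree}} produced by the fuzzifier.
    rules: list of (antecedent, output_term), where antecedent maps
           input_name -> term and all statements are joined by 'And'.
    Returns {output_term: activation degree}.
    """
    activation = {}
    for antecedent, output_term in rules:
        # 'And' of the statements: the lowest membership degree of the rule.
        strength = min(memberships[name][term] for name, term in antecedent.items())
        # Aggregation: keep the maximum strength reached for each output term.
        activation[output_term] = max(activation.get(output_term, 0.0), strength)
    return activation
```

When two rules lead to the same output term, only the stronger one determines that term's activation, which is the effect of using the maximum as the S-norm.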

De-fuzzifier
After the aggregation step, a numeric value is produced as the output of the fuzzy system. For this purpose, we have used the center of gravity method. We then compare the generated matrices based on the output of the fuzzy system: a matrix with a smaller output is more desirable.
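A minimal numerical sketch of center of gravity de-fuzzification is given below. It clips each output term at its activation degree, combines the clipped terms with the maximum, and computes the centroid on a sampled grid over [0, 1]. The breakpoints in `OUTPUT_SETS` are hypothetical (not those of Figure 2), and placing "very good" at the low end of the axis is an assumption made so that a smaller output means a more desirable matrix.

```python
def trapezoid(x, a, b, c, d):
    """Standard trapezoidal membership function with vertices a <= b <= c <= d."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Hypothetical breakpoints for the five output terms (not taken from the paper).
OUTPUT_SETS = {
    "very good": (0.0, 0.0, 0.1, 0.2),
    "good": (0.1, 0.2, 0.3, 0.4),
    "medium": (0.3, 0.4, 0.6, 0.7),
    "bad": (0.6, 0.7, 0.8, 0.9),
    "very bad": (0.8, 0.9, 1.0, 1.0),
}


def defuzzify(activation, output_sets=OUTPUT_SETS, steps=1000):
    """Centroid of the aggregated output set, sampled at steps + 1 points."""
    numerator = denominator = 0.0
    for i in range(steps + 1):
        x = i / steps
        # Clip each term at its activation, then aggregate with the maximum.
        mu = max(
            (min(strength, trapezoid(x, *output_sets[term]))
             for term, strength in activation.items()),
            default=0.0,
        )
        numerator += x * mu
        denominator += mu
    return numerator / denominator if denominator else 0.0
```

Two candidate matrices can then be ranked by comparing their de-fuzzified values directly.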

Fuzzy-RLVMrB
As mentioned earlier, in our previous work (RLVMrB), we applied the reinforcement learning method to the VM replacement problem. In that algorithm, the non-dominance method was used to evaluate the matrix generated in each learning episode, and the reward of the reinforcement learning method was calculated as the weighted sum of the four horizontal and vertical load balance parameters. In the Fuzzy-RLVMrB algorithm, instead of using the weighted sum, the output of the fuzzy system (described in section 3.3) is used to evaluate the generated matrices and reward the agent. Algorithm 1 presents the general procedure of the Fuzzy-RLVMrB algorithm. In each learning episode, one action is first selected from the set of permissible actions (random selection, crossover, and mutation) (line 12). Then a new mapping matrix is generated based on the chosen action (matrix P in line 13). After that, the matrix is examined for feasibility (line 14). In the next step (line 15), the four parameters of vertical and horizontal load imbalance in terms of processor, bandwidth, and memory are calculated for that matrix (using Equations 1 and 2). After the action is selected and a new matrix is generated, the environment enters a new state, which we determine based on the four evaluated parameters (line 16). The values of those parameters are then fed to the fuzzy system to evaluate the generated matrix. The output of the fuzzy system is given as a reward to the learning agent (line 17). The fuzzy output is also used to compare the matrix generated in the current episode with the elite matrix (the best matrix obtained throughout learning up to the current episode). If the generated matrix is better than the elite matrix, it replaces the elite matrix (lines 18, 19, and 20). Finally, the action value table and the state of the environment are updated (lines 21, 22, and 23).
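The episode loop described above can be approximated by the simplified sketch below. It is not Algorithm 1: it uses a stateless action-value table with epsilon-greedy selection, takes the negated evaluation cost as the reward (assuming that a smaller fuzzy output is better), and represents a placement as a list of host indices. All names, parameters, and the three toy action functions are hypothetical.

```python
import random


def random_selection(current, elite, rng, num_hosts=2):
    """Toy action: draw a completely new placement."""
    return [rng.randrange(num_hosts) for _ in current]


def mutation(current, elite, rng, num_hosts=2):
    """Toy action: move one randomly chosen VM to a random host."""
    child = list(current)
    child[rng.randrange(len(child))] = rng.randrange(num_hosts)
    return child


def crossover(current, elite, rng, num_hosts=2):
    """Toy action: combine the head of the current placement with the elite's tail."""
    cut = rng.randrange(1, len(current))
    return list(current[:cut]) + list(elite[cut:])


ACTIONS = {"random": random_selection, "mutation": mutation, "crossover": crossover}


def fuzzy_rl_search(initial, evaluate, is_feasible,
                    episodes=200, epsilon=0.2, alpha=0.1, seed=0):
    """Return (elite placement, elite cost); evaluate() stands in for the fuzzy output."""
    rng = random.Random(seed)
    q = {name: 0.0 for name in ACTIONS}          # stateless action-value table
    current = list(initial)
    elite, elite_cost = list(initial), evaluate(initial)
    for _ in range(episodes):
        if rng.random() < epsilon:               # explore
            name = rng.choice(list(ACTIONS))
        else:                                    # exploit the best-valued action
            name = max(q, key=q.get)
        candidate = ACTIONS[name](current, elite, rng)
        if not is_feasible(candidate):
            continue                             # skip infeasible placements
        cost = evaluate(candidate)
        q[name] += alpha * (-cost - q[name])     # reward = -cost (assumption)
        if cost < elite_cost:                    # keep the best placement so far
            elite, elite_cost = candidate, cost
        current = candidate
    return elite, elite_cost
```

In a fuller version, `evaluate` would chain the imbalance calculation, fuzzification, inference, and de-fuzzification steps, and the action-value table would be indexed by the environment state as well as the action.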

Fuzzy-MOVMrB
The MOVMrB algorithm is a multi-objective algorithm that tries to balance the load horizontally and vertically in terms of processor, memory, and bandwidth in the VM replacement problem. Its authors used the BBO algorithm for this purpose, which is a population-based meta-heuristic algorithm. To achieve these goals, the algorithm produces solutions in two separate population sets: one set to achieve horizontal load balance and one set to achieve vertical load balance. This increases the execution time of the algorithm. MOVMrB also uses the non-dominance algorithm to evaluate the generated solutions. In the fuzzy version of the MOVMrB algorithm, we evaluate and compare the generated solutions using fuzzy logic, as described in section 3.3. Also, because fuzzy logic is close to human decision-making and has great power, in Fuzzy-MOVMrB we can use a single population set instead of creating two separate sets. In Fuzzy-MOVMrB, the generated solutions are simultaneously compared in terms of both horizontal and vertical load balance. Algorithm 2 depicts the general procedure of the Fuzzy-MOVMrB algorithm. First, an initial population consisting of three mapping matrices is generated (line 8). Then two of these matrices are randomly selected, and a crossover is performed between them (line 10). After that, one of the three initial matrices is randomly chosen for mutation (line 11). In the next step, all three matrices are examined for feasibility (line 12).
Then the four parameters of vertical and horizontal load imbalance in terms of processor, memory, and bandwidth are calculated separately for all three matrices (line 13). In the next step, to evaluate the matrices, the values of these four parameters are given to the fuzzy system. The output of the fuzzy system determines the desirability of the matrices (line 14). Then the best matrix of the previous population replaces the worst matrix of the current population (line 15). Finally, all three matrices are compared with the elite matrix. If a matrix is better than the elite matrix, that matrix replaces the elite matrix (lines 16, 17, and 18).
Algorithm 2 (lines 15 to 19):
15 Exchange the worst matrix with the best previous population matrix;
16 Compare(P_j, P*) with Fuzzy system output (Cost);
17 if P_j is better than P* then
18     P* = P_j;
19 return P*;

Experimental results
In this section, we first give an overview of the simulation setting, and then evaluate the proposed algorithms and compare them with the baselines.

Simulation setting
We have simulated four algorithms, Fuzzy-RLVMrB, Fuzzy-MOVMrB, RLVMrB, and MOVMrB, using the CloudSim 3.0.3 simulator. In this regard, we have altered the "optimizeAllocation" function of the "VmAllocationPolicy" class. We considered three scenarios with different real and synthetic datasets and different data center sizes, so that we could evaluate the mentioned algorithms in different situations. Table 5 lists the features of each scenario. In the DC, the scheduling policy is Space Shared. Under the Space Shared policy, when a CPU is assigned to a VM, the VM holds it for as long as it needs the CPU. HMs are considered homogeneous, with 32 GB of RAM and a 32-core processor with 1000 MIPS. To execute the scenarios, a system with an Intel Core i3 processor (maximum frequency of 2.13 GHz), 6 GB of RAM, and the Windows 10 OS was used.

Performance analysis
This section provides the results of a preliminary analysis of the proposed algorithms and a comparison of them with the baselines. We first compare them in terms of horizontal and vertical load balance, and then in terms of execution time. Table 6 depicts the vertical load imbalance (disequilibrium column) and the horizontal load imbalance in terms of processor, memory, and bandwidth (unev_CPU, unev_RAM, and unev_BW columns). The results in Table 6 are averaged over 5 executions, each with 10,000 runs. As Table 6 shows, for the MOVMrB algorithm in the first scenario, the horizontal imbalance in the processor dimension is in an ideal state, and this has caused the non-dominance algorithm to have difficulty in decision-making. For this reason, this matrix has been selected as the final matrix even though its vertical load imbalance is not in good condition. The Fuzzy-MOVMrB algorithm, in contrast, uses a fuzzy system to evaluate the generated matrices. The fuzzy system converts the input values from quantitative to qualitative based on the degree of membership, and calculates and evaluates the output according to the defined rules. Therefore, it can choose a solution in which all objectives are acceptable. Unlike the non-dominance method, the fuzzy system works similarly to human decision-making. The same applies to the RLVMrB and Fuzzy-RLVMrB algorithms: the Fuzzy-RLVMrB algorithm was able to select a matrix that reached an acceptable level of balance in all dimensions. In scenarios two and three, the Fuzzy-RLVMrB and Fuzzy-MOVMrB algorithms also made better choices than the RLVMrB and MOVMrB algorithms.

Execution time
Another important parameter in evaluating algorithms is the execution time. Table 6 presents the execution time of all algorithms. Because the MOVMrB method uses the complex population-based BBO algorithm and examines the horizontal and vertical load imbalance in two different sets, its execution time is significantly longer than that of the other methods. The Fuzzy-MOVMrB algorithm uses only one population set instead of two separate sets, so its execution time is reduced compared to the MOVMrB algorithm. Because the MOVMrB and Fuzzy-MOVMrB algorithms are population-based, they have a higher execution time than the RLVMrB and Fuzzy-RLVMrB algorithms, which apply reinforcement learning.

Conclusion
The performance of cloud DCs is highly correlated with VM placement. Although multi-objective methods have been widely applied in the literature to the optimal placement of VMs, the evaluation method used to compare candidate solutions affects the final results. The non-dominance algorithm is a common method that has been applied in related work to evaluate candidate solutions, but it fails to consider a trade-off between objectives. In this paper, we applied fuzzy logic instead of the non-dominance algorithm to evaluate output matrices in multi-objective VM replacement problems. We proposed the Fuzzy-MOVMrB and Fuzzy-RLVMrB algorithms and compared them with the RLVMrB and MOVMrB algorithms in terms of vertical and horizontal load balance (regarding CPU, memory, and bandwidth) and runtime. The experimental results obtained with the CloudSim simulator show that fuzzy logic, which is similar to human decision logic, outperforms the non-dominance algorithm in the multi-objective VM replacement problem. Indeed, the non-dominance algorithm encounters difficulty in making decisions when one of the parameters has ideal conditions, while fuzzy logic does not suffer from this issue. In addition, the MOVMrB algorithm has a long execution time because it is a population-based algorithm and uses two separate population sets, whereas the Fuzzy-MOVMrB algorithm, which generates only one population set, has a shorter execution time. Finally, the RLVMrB and Fuzzy-RLVMrB algorithms have much shorter runtimes than the MOVMrB and Fuzzy-MOVMrB algorithms because they are not population-based.

Future studies
As future work, we plan to apply fuzzy logic to other multi-objective problems that use evaluation methods other than the non-dominance algorithm, and to evaluate the results. Moreover, we plan to consider DCs with heterogeneous VM configurations.
Funding: Not applicable

Declaration
Conflict of interest: The authors declare that they have no conflict of interest.
Ethical approval: This article does not contain any studies with human participants or animals performed by any of the authors.