Energy and Performance Centric Resource Allocation Framework in Virtual Machine Consolidation using Reinforcement Learning Approach

Virtual machines are deployed to ease the performance, management overhead and access regularities of applications on cloud platforms. Virtual machines are often susceptible to overloading, delays and other hurdles during server consolidation and migration. To regulate energy dissipation and to monitor overloading and under-loading issues, dynamic consolidation processes are introduced to distribute loads across virtual machines. Consolidation demands additional computation and resources to reallocate services from one virtual machine to another, subject to adherence to Service Level Agreements. The proposed methodology advocates a novel architecture for consolidating virtual machines and thus balancing energy and performance parameters. Overall resource requirements and the Performance to Power Ratio (PPR) are the primary factors in the design of the proposed Dynamic Weightage algorithm, together with a clustering approach built on reinforcement learning techniques. Clusters of optimal virtual machines are formed, and resources are allocated according to the expected energy and performance. Resource requests from virtual machines are derived into a matching relationship factor that represents the respective hosts with PPR considerations. These estimations also capture the overall workload incurred during virtual machine consolidation. It is observed that performance throughout the cluster is maintained at a nominal level with minimal trade-off in energy. The architecture is implemented on an offline platform over distributed environments, enabling scalability and improved efficiency of the entire system. The system is validated on the CloudSim simulator with datasets retrieved from PlanetLab. From the results obtained, energy conservation of up to 47% is noted, with promising quality-of-service parameters.
Results are compared with other state-of-the-art algorithms for distributed architectures and heterogeneous environments to demonstrate the efficiency of the system.


Introduction
Recent years have seen tremendous growth in computing, leading to increasingly complex architectures and computations. Such novel architectures demand novel platforms and centralized infrastructure for monitoring, management, resource allocation and energy utilization. All of these functionalities aim to reduce the costs incurred during ownership and maintenance. Cloud computing, with these advancements, has been simplified from its stages of inception and is now familiar in solving the problems of implementation and maintenance. Occupancy on remote servers under a pay-per-usage model has since become cheaper [1], owing to efficient management of space and energy. Renowned giants in cloud computing, such as Amazon, Google, Alibaba and Microsoft, have notably scaled up their establishments by simplifying the processes of data centers universally. An increasing number of data centers brings establishment costs along with increased energy utilization. According to a survey [2], energy expenses accounted for nearly 75% of energy utilization, with a quarter of the contribution coming from IT industries. In contrast to the survey results of 2014, a recent survey reported the dispersal of nearly 200 million metric tonnes of CO2 into the environment, consuming 3% of total energy [3]. Energy consumption is inevitable in cloud computing, but optimal utilization is encouraged throughout every operation involved in managing cloud environments. Comparatively, the energy required for resource management exceeds the electricity required for managing the humongous volumes of resources in data centers worldwide. Inefficient management of resources [4] results in extended energy dissipation when compared to the power utilization of host platforms. Numerically, up to 50% of energy is required by cloud platforms for overall operations, and nearly 70% of energy is required for hosts to transition from idle mode to observe mode. These figures summarize the energy utilization of nearly 5000 cloud platforms over six months [5].
Cloud platforms are managed under two different workload states to facilitate sharing, storing and utilization of resources, namely the under-loaded state and the optimally loaded state. An optimally loaded platform is nearly impossible to sustain, leading to overloaded environments in the majority of cases and increased energy utilization. When cloud platforms are under-loaded, they continue to draw as much as 72% of the power they usually consume during peak utilization [6]. With this power consumption in mind, it is advisable to optimise the utilisation of resources in cloud platforms in order to conserve energy in different scenarios. A significant factor that affects the performance of cloud platforms is the increased workload of a single machine, which in turn affects the overall waiting time and the performance experienced by other users. Quality-of-Service parameters are remarkably affected in cases where some machines are overloaded with large volumes of resources and workload. Service Level Agreements are typically defined with QoS parameters that are promised to end-users and kept on top priority. Dynamic Virtual Machine Consolidation has proven to be an efficient technique to adhere to SLA policies [7] and to meet end-users' requirements for the utmost quality. The process of live migration was introduced to manage and operate data centres with a distributed workload and to ensure minimal or zero downtime even during migration. The content and resources of a physical machine can be transferred to a virtual machine without disturbing the services rendered to end-users, who continue to access the resources normally. Power management is the predominant motivation for dynamic and live migration; resource management and optimizing the storage space while restraining the number of physical servers required are the other common reasons for opting for live migration schemes.
Overloaded servers can thus be balanced with under-loaded servers, thereby maintaining an equivalent workload and top-notch quality of service to users. The significance of live migration and its benefits justified the need for multiple optimization techniques, and efficiency has greatly increased as a result [8]. Any optimized algorithm should focus on the time required for completing the migration process, which indicates the period of unavailability of the virtual server in its previous location.
The number of resources to be migrated from the physical machine to a virtual machine should be considerably small in order to adhere to efficient power utilisation strategies. Total migration time indicates the overall duration required for transferring content from physical to virtual machines. During this tenure, the server should continue rendering services to all its users, without any downtime. Downtime [9] is the time during which the server is inaccessible or unreachable for requests from its registered users. The content to migrate may be in any format, including memory pages, storage setups, processor capacity and state, along with the types of resources handled and the workload. Migrating memory pages from one location to another in the virtual space is a challenging process when compared to the migration of other resources. The architecture of a physical machine or a data centre determines the difficulty of the migration process, as a resource must be readily available for sharing between a source and destination machine at all times. Notably, the size and architecture of the storage space are more complicated than the capacity of a processor, which is predominantly limited by its size. The layers responsible for managing the migration process have evolved to promote live migrations, and VMMs such as VMware, Xen and KVM have achieved live migrations with minimal interventions. Virtual machine migration is the need of the hour in order to improve the utilization ratio and reduce energy wastage during times of excessive resource demand. Live migration consistently monitors the utilization ratio and assures that the workload between servers is evenly maintained; otherwise, the algorithm recommends the transfer of information to under-utilized servers [10]. The vacated machines are henceforth transferred to an idle state with low power consumption, or switched off in some cases. A migration or consolidation algorithm must consider various factors prior to initiating the resource transfer, since it can otherwise cause extreme performance degradation on the servers. Such degradation will directly affect the Service Level Agreements promised to the users, with disruption of services and downtime. On the whole, consolidation and migration processes should be justified with sufficient proof and should aim at promoting the performance of servers. The contributions of the article consolidate the following metrics. Prediction of workload across different time slots through an efficient workload prediction model. The workload model intends to assist in decision making and to monitor the distributed workload in order to detect the need for consolidation and migrations.
Monitoring the number of requests and responses for organising the workload between different servers or virtual machines through an efficient cluster management algorithm.
Depending on the location of servers and virtual machines and the number of requests, the proposed model intends to solve the resource allocation problem, considering the heterogeneous nature of machines and the computational capacity of target machines, with a prioritized energy conservation model.
The remainder of the article is organized into sections with a detailed description of the core concepts of the domain. Section 2 summarizes the background work and related articles in the domain of virtual machine consolidation and energy conservation on cloud platforms. Section 3 identifies and documents the challenges in monitoring the workload, distributing the resources and thus conserving the energy requirement through the proposed approach. Section 4 discusses the environmental setup and functionality of the proposed approach in the simulated environment. Section 5 concludes the article with future directions.

Related Work
Consolidation of virtual machines and migration of resources require standard and effective algorithms for ensuring fail-proof and satisfactory migratory actions. The algorithm defines the strategies for effectively transferring resources dynamically to virtual spaces to conserve the energy needed [11]. The number of hosts is determined by the algorithm based on the information posted on the physical servers, in turn reducing the overall energy requirement of individual servers. The first-ever architecture for managing energy requirements, termed virtual power management (VPM), monitored and consolidated virtual machines based on over-utilization and universal policies pertaining to service level agreements. From a general perspective, the need for virtual consolidation is determined based on the workload of the servers and the selection of the virtual machine, followed by placing them in the right location for easier accessibility [12]. The concept of efficient energy management was advocated in multiple research works, and the primary motive was to adhere to the service level agreements in all cases. In one scenario, two thresholds were introduced for measuring the CPU utilisation of different physical machines in order to monitor the number of requests and responses. Based on the utilisation, if a physical machine is found to be under-utilized, its resources are migrated to a virtual machine and the physical machine is transferred to a sleep mode. During the selection of a virtual machine, its workload and utilisation ratio are compared against the threshold values before the virtual machine is selected. It is also understood that fixed threshold values [13] are not applicable to physical machines with a dynamic workload. Statistical approaches were preferred for determining the dynamic workload of physical machines from the historical information accumulated since the inception of the server.
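As a hedged illustration, the dual-threshold rule described above can be sketched as follows. The threshold values (0.3 and 0.8) and the function name are assumptions for the sake of the example, not figures taken from the cited work.

```python
# Hypothetical sketch of a dual-threshold consolidation rule: a host's
# CPU utilisation ratio is compared against a lower and an upper bound.
# The specific threshold values below are illustrative assumptions.

LOWER_THRESHOLD = 0.3   # below this, the host is considered under-utilized
UPPER_THRESHOLD = 0.8   # above this, the host is considered overloaded

def classify_host(cpu_utilisation):
    """Classify a host by its CPU utilisation ratio in [0.0, 1.0]."""
    if cpu_utilisation < LOWER_THRESHOLD:
        return "underloaded"   # migrate its VMs away, switch host to sleep
    if cpu_utilisation > UPPER_THRESHOLD:
        return "overloaded"    # migrate some VMs to relieve pressure
    return "normal"

print(classify_host(0.15))  # underloaded
print(classify_host(0.95))  # overloaded
```

Under a dynamic workload, fixed bounds of this kind fail, which is exactly the limitation the statistical approaches in the following paragraph attempt to address.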
The future demand for a server was estimated with a linear regression approach in which the previous utilisation and peak-time utilisation were considered. The utilisation of a server was determined with a Grey Markov descent-based regression model for more accurate workload predictions [14]. M-estimator regression is yet another approach that applies machine learning for predicting the server workload. On the other hand, server workloads were also determined without the need for threshold values, based on memory utilisation, bandwidth requirements during normal and peak hours, and disk utilisation.
Resource consolidation is another significant problem in cloud platforms, owing to the balance between the energy required, the exploitation of energy and the performance of the platform. On the whole, the problem is considered a bin packing problem with variable sizes depending on the types of resources, in which the servers are the bins and the virtual machines the items to be packed [15]. The workload of servers is identified with statistical and adaptive heuristic approaches, including median absolute deviation, interquartile ranges, regression techniques and various machine learning algorithms. The location of virtual machines and the placement ideologies depend on random choice, correlation factors and the time taken for migration. All of these algorithms primarily focus on power-aware techniques for addressing the limitations of previous architectures. Increased utilisation of servers leads to increased energy utilisation, thereby demanding that the algorithms frequently migrate resources and consolidate virtual machines. Workload detection is a vital element in determining the energy utilisation, followed by virtual machine selection and placement [16]. Virtual machines with fewer service level agreement violations are considered under-utilized and hence can be transferred into sleep modes. The entire concept was applied as a QoS-aware algorithm for measuring the server utilisation and the relationship between multiple virtual machines at a particular location. A moving average policy was introduced as a new prediction strategy for monitoring the workloads and assisting in localising the right virtual machine for distributing the workload between under-utilised and over-utilised services [17]. The policy assisted in efficient decision-making by including multiple criteria such as the CPU utilisation, bandwidth and RAM required for operations.
Other considerable factors such as temperature, QoS and energy of the virtual machines were tested in an approach in which statistical metrics such as the watts consumed per operation or transaction, the temperatures during normal and peak workloads, and the response time for a transaction were measured. Hardware and ageing factors have to be considered when measuring the temperature of the host machines, as temperature is not a result of excessive workload alone. Temperature in relation to the workload may therefore not be a suitable parameter for temperature-aware virtual machine consolidation. Virtual machine placement and consolidation strategies were later drawn from bio-inspired models and derived into algorithms for improving energy utilisation. Particle Swarm Optimisation, genetic algorithms and ant colony optimisation were inspired by nature and were helpful in including multiple criteria and adhering to multiple objectives [18].
The primary motives behind the nature-inspired algorithms were energy conservation, efficient resource utilisation and preventing violations of the service level agreements [19]. Homogeneous data centre architectures based on ant colony optimisation were introduced into the virtual space and were made to adjust dynamically according to the requirements. The algorithms are termed efficient and intelligent due to the fact that the parameters are highly dynamic and the architectures have to evolve according to the requirements. It is also understood that frequent optimisation of the parameters incurred a high cost, time and energy, owing to the consideration of multiple factors at any given time [20]. This approach has been identified as inferior when an Infrastructure as a Service (IaaS) has to be planned for a very large organisation.
Certain frameworks have concentrated on heterogeneous environments where the resource consolidation problem has been addressed using sequence optimisation techniques. The specific ideology of a look-ahead control approach was implemented to predict the future requirements and workload for consolidating the resources. The configuration of the host machines was altered according to heuristics at any given time, based on learnings obtained from previous scenarios and simulated environments [21]. A dynamic resource allocation scheme was introduced to plan the utilisation of resources across various virtual machines, yet it registered numerous violations of the service level agreements. Another significant drawback of this technique was the inability to predict the dynamic capability of virtual machines, which led to inefficient resource allocation.
The placement of virtual machines is an important issue in resource planning, consolidation and energy conservation. Power-aware techniques have majorly focused on efficient virtual machine placements in order to conserve the energy required by data centres. A migrated resource has to be placed in an appropriate virtual machine to ensure zero or minimal impact on the service level agreements, without affecting or increasing the energy requirements. Previous techniques focused on multiple objectives, multiple criteria or meta-heuristic parameters in order to derive optimal virtual machine placements. Many such techniques did not include the CPU utilisation ratio and instead included various other factors for determining the energy requirements [22]. In large-scale heterogeneous environments, the resource consolidation problem and the performance-to-power ratio have to be treated as primary parameters in determining the energy requirement and the measures to prevent excessive energy utilisation. The proposed framework intends to build a model that adaptively consolidates resources in real time based on the dynamic and rational properties of cloud computing resources. The prediction model enhances the decision-making abilities of the algorithm for efficient VM selection and placement and resource consolidation with a reinforcement learning algorithm. A PSO learning approach [23] is further enhanced for optimal placement of virtual machines based on their dynamic features and attributes. The framework is tested in simulation environments, and the following sections justify the efficiency of the proposed approach.

Proposed Model
The proposed framework is implemented as an Infrastructure as a Service environment constituted by multiple clusters on a shared cloud platform. The architecture is defined to be heterogeneous in order to cater to different types of services and requests. The CPU utilisation of every system in the cluster is set to be equivalent in order to monitor the workload under future conditions. Every CPU is attributed with the computational speed, memory and bandwidth of incoming requests and response rates [24]. The speed of the CPU is measured in millions of instructions processed per second, determining the number of requests the servers can respond to. The ultimate aim of the servers is to ensure the utmost Quality of Service, as defined in the SLAs, for processing the requests from different users. Every service provider is subject to fines and penalties for every violation during runtime. The architecture of the proposed framework is illustrated in Figure 1 as a layered architecture.
The storage space is distributed across all virtual machines, and each user request is processed through an interface between the Cloud Service Provider and its end users. The interface is simplified to provide uniform access to all end-users and to maintain the service level agreements according to their preferences. If the request is legitimate and the service level agreements are maintained, the request is processed by the cloud management portal through the cluster manager interface [25]. The cluster manager is responsible for workload distribution and prediction, resource allocation and consolidation. Resource consolidation is an important process since it measures the workload and deems whether a server is under-loaded or overloaded. The proposed resource allocation algorithm is responsible for optimising the resource allocation across different virtual machines and placing them for effective energy utilisation. The local manager component is responsible for managing the CPU types of individual machines and monitoring the distribution of resources. By considering the real-time requirement, the cloud resources are allocated to different users through the local manager. The learning algorithm [26] estimates the cooperation between the CPU unit, its utilisation ratio and the number of requests to be processed.
The proposed algorithm indicates that the resource consolidation problem has to be addressed by understanding the requirements of every user, and hence cluster managers and local managers should be aware of the total resource utilisation in the cloud platform. A sequential decision-making model is implemented as the controller for allocating resources in order to maintain the performance-to-power ratio under different workload conditions. The resource allocator is also responsible for implementing a learning algorithm that understands the utilisation of resources across heterogeneous requests from thousands of users. At any given point in time, the algorithm should be able to sense the number of hosts and the number of resources subject to utilization across all virtual machines. After detecting overloaded hosts, the resource allocator [27] initiates a request to the resource consolidator for localising the virtual machines that are under-loaded and ready for migration. The heterogeneous property of cloud platforms is a major factor in determining the workload, as is the dynamic ability of such platforms to serve different types of requests. A perfectly accurate workload prediction model cannot be derived, because highly dynamic content has to be managed and allocated for different types of users. There is a significant difference between virtual machine consolidation and resource utilisation, and hence the algorithms should act accordingly to predict the workload at any given time depending on the resource utilisation. The proposed technique includes a model for controlling the resources in an offline approach based on learnings obtained from previous resource allocation and management, energy consumption and the performance-to-power ratio [28] of different services in different scenarios. The computational complexity of the proposed algorithm is reduced by enforcing a learning algorithm, ideally using reinforcement techniques, to promote a lower dependency on computational resources.
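The allocator/consolidator interaction described above can be sketched as follows: scan the hosts, flag the overloaded ones, and pair each with an under-loaded migration target. The host identifiers, thresholds and pairing policy below are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch: the resource allocator detects overloaded hosts
# and asks the consolidator for under-loaded migration targets.
# Thresholds and the simple one-to-one pairing are assumptions.

def plan_migrations(hosts, overload=0.8, underload=0.3):
    """hosts: dict mapping host_id -> CPU utilisation ratio in [0, 1].
    Returns (overloaded_host, target_host) migration pairs."""
    overloaded = [h for h, u in hosts.items() if u > overload]
    # least-loaded under-utilized hosts are preferred as targets
    targets = sorted((h for h, u in hosts.items() if u < underload),
                     key=hosts.get)
    return list(zip(overloaded, targets))

hosts = {"h1": 0.92, "h2": 0.25, "h3": 0.55, "h4": 0.10}
print(plan_migrations(hosts))  # [('h1', 'h4')]
```

In the proposed framework this decision is refined by the learning algorithm rather than made by fixed thresholds alone; the sketch only shows the structural flow of the request from allocator to consolidator.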

Resource Allocation Model
Given that the resources and workload are highly variable in a cloud platform, the variability is measured over time, and workload prediction techniques collect information about resource utilisation in real time. Rational allocation of resources is carried out by understanding the utilisation ratio and future demands well in advance, so that resources can be managed in future cases. From various simulated and real-time environments, it is collectively concluded that host utilisation depends on the time and duration of resource collection and allocation. Multiple workload prediction algorithms include a time-series-based approach for estimating the resource requirements, and the moving average algorithm is a standard benchmark for predicting resource utilisation based on finite impulse filters. The working principle of the moving average algorithm is the determination of a simple linear model of the current utilisation of the resources. The problem with this algorithm is its inability to overcome instantaneous noise/spikes and sudden changes in the requirements, i.e., the dynamic nature of cloud platforms. The proposed technique includes a workload prediction algorithm based on the median absolute deviation of the variances, which is found to perform better than the conventional moving average algorithm. The proposed MAD-based algorithm also assigns a weight to the moving average in order to derive the time-series values of the different types of resources to be processed in the cloud platform.
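A minimal sketch of such a predictor is shown below, assuming linear recency weights for the moving average and using the median absolute deviation (MAD) of the recent window as a robustness margin; the exact weighting scheme of the proposed MAD algorithm is not specified here, so these choices are illustrative.

```python
# Sketch of a MAD-augmented weighted moving-average workload predictor.
# Linear recency weights and the additive MAD margin are assumptions
# made for illustration, not the paper's exact formulation.
import statistics

def mad(values):
    """Median absolute deviation of a utilisation history."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)

def predict_utilisation(history, window=5):
    """Weighted moving average over the last `window` samples,
    with more recent samples weighted higher, widened by the MAD."""
    recent = history[-window:]
    weights = list(range(1, len(recent) + 1))  # 1, 2, ..., window
    forecast = sum(w * v for w, v in zip(weights, recent)) / sum(weights)
    return forecast + mad(recent)  # MAD absorbs spikes/noise
```

Unlike a plain moving average, the MAD term grows with the variability of the recent window, so a noisy trace produces a more conservative (higher) utilisation forecast.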
Once the workload has been predicted successfully, the next important aspect of a cloud platform is to balance its energy consumption and performance. The proposed technique implements a reinforcement learning algorithm for predicting the workload based on historical information, thereby enabling an automated workload prediction approach. It is evident that striking a balance between performance and energy consumption of a cloud platform is challenging, especially in a highly dynamic environment. This approach is facilitated by a reinforcement learning algorithm that assists in sequential decision-making for effective resource allocation and management. Every reinforcement learning algorithm requires the current state, a set of actions and the rewards for learning from successful and unsuccessful actions in the past. The sequential decision-making problem is typically handled by a Markov Decision Process, where the current state is transferred to another state based on the CPU utilisation in different intervals. Depending on the CPU utilisation, the states of the host machine are indicated by the percentage of utilisation, from 0% to 100%. Equation 1 defines the utilisation ratio, which is used to denote the utilisation rate of the host machine. Once the utilisation rate of the server is determined, a set of actions has to be defined for modifying the states according to the level of utilisation. The actions are predominantly focused on resource allocation and consolidation in the current problem statement. Equation 2 indicates that for any host machine h, the state s_i at utilisation u(t) is associated with action a.
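The state model above can be sketched as follows: the utilisation ratio of Equation 1 is discretised into a finite set of MDP states spanning 0% to 100%. The number of discrete levels (10) is an assumption made for the example; the paper only states that states span the full utilisation range.

```python
# Hedged sketch of the MDP state model: CPU utilisation is mapped to a
# finite state index. The choice of 10 levels is an illustrative assumption.

def utilisation_ratio(used_mips, total_mips):
    """Equation-1 style utilisation ratio u(t) of a host."""
    return used_mips / total_mips

def to_state(u, levels=10):
    """Map a utilisation ratio in [0, 1] to a discrete state index."""
    return min(int(u * levels), levels - 1)

u = utilisation_ratio(1150, 2300)  # a host at 50% utilisation
print(to_state(u))  # state index 5
```

Discretisation keeps the state space finite, which is what makes the tabular reinforcement learning formulation in the following paragraphs tractable.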
Since the platform is deemed to be dynamic, an uncertainty factor has to be included to account for the continuity of the action space. An aggregation algorithm is used to resolve this uncertainty by limiting the number of actions to a confined set. The standardized set of actions inside the host machine is defined in Equation 3.
Every reinforcement learning technique incorporates a goal and a reward for achieving the goal as planned. Every reward guaranteed for a successful action is a scalar, reflecting the change from one state to another based on requests, workload, the management portal, etc., in the case of a host machine. The intention of the system is to maintain rewards over a longer duration instead of seeking quick, immediate wins. Long-term rewards are preferred over measuring the success of individual actions, and hence the goal should be comprehensive for the system to manage with respect to energy consumption or the performance-to-power ratio.
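The preference for long-term over immediate rewards is exactly what a discount factor captures. The sketch below shows a generic tabular Q-learning loop over the discretised states and a confined action set; the action names, learning parameters and the use of Q-learning specifically (rather than another RL method) are assumptions for illustration.

```python
# Minimal Q-learning sketch of the reward-driven decision loop described
# above. States are discrete utilisation levels; the action set, reward
# semantics and hyper-parameters are illustrative assumptions.
import random

ACTIONS = ["consolidate", "allocate", "hold"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration

Q = {}  # (state, action) -> estimated long-term discounted reward

def choose_action(state):
    if random.random() < EPSILON:
        return random.choice(ACTIONS)                          # explore
    return max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))  # exploit

def update(state, action, reward, next_state):
    """Q-learning update: the discount GAMMA weights long-term reward
    over the immediate scalar reward of a single action."""
    best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
```

With GAMMA close to 1, the learned values reflect cumulative future reward (e.g. sustained performance-to-power ratio), matching the stated preference for comprehensive long-term goals over individual wins.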

Results And Discussions
The proposed system targets Infrastructure as a Service (IaaS) applications, and the resource allocation algorithms are upgraded to measure and monitor the workload at all times. The performance of the proposed system is measured within a controlled simulated platform, owing to the difficulties of testing the application on a real-time platform. CloudSim is a simulation toolkit dedicated to modelling cloud computing platforms with respect to computational resources, energy utilisation, resource management, service level agreements and various other factors and policies of real-time cloud platforms. The proposed architecture is built up of 400 heterogeneous physical servers combined into a data centre, along with 100 HP ProLiant DL360 Gen9 servers. Each such server possesses 36 cores and 64 GB of RAM with a processing speed of 2300 MHz. The other variants, ProLiant G6 and G7, have 8 cores and 16 cores respectively. These architectures are standard across cloud platforms such as Google Cloud, Amazon Web Services and Microsoft Azure. In order to simulate the real-time workload of a cloud platform, the architecture was tested with real-time data from an experiential dataset. The number of requests across the globe was measured every ten minutes from thousands of virtual machines and is listed in Table 1. The proposed weighted moving average algorithm was implemented for monitoring the workload across different time zones and intervals.
The performance of the proposed technique is compared against other state-of-the-art techniques such as the Local Regression and Minimum Migration Time policy (LR-MMT), the Dynamic Threshold Maximum Fit policy (DTH-MF) and the Power-Aware Best Fit Decreasing (PABFD) algorithm. The components of a cloud platform consume power at all ends, including the CPU, network bridges, interfaces, cooling systems and even disk storage. The power requirements of cloud platforms are typically a linear relationship between the components, their workload and the duration of utilization. The ProLiant G9 and G7 are considered as the two host machines, and their power consumption at different workloads is tabulated in Table 2. The performance-to-power ratio of the proposed algorithm registered the lowest level when compared to other state-of-the-art techniques. The energy utilized was the lowest across the total number of days considered for the simulation. The performance-to-power ratio was measured across all heterogeneous virtual machines and dynamic consolidations, while the number of active and passive hosts and the average energy consumed for every instruction processed by the hosts were monitored. The Dynamic Threshold and Minimum Correlation of Host algorithm considered the temperature of servers and virtual machines as an upper threshold, accounting for hardware and other environmental issues. The proposed approach yielded nearly 16.712 kWh, comparatively better than the other techniques. Figure 2 illustrates the power consumption of the different algorithms and VM consolidation techniques.
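The linear power relationship mentioned above is commonly modelled by interpolating between a host's idle and peak wattage. The sketch below illustrates this and the resulting energy computation; the wattage figures and sampling interval are hypothetical placeholders, not the values from Table 2.

```python
# Illustrative linear power model of the kind described above: power at a
# given CPU load is interpolated between idle and peak wattage.
# The wattages below are hypothetical, not the Table 2 measurements.

def host_power(utilisation, idle_watts=90.0, peak_watts=260.0):
    """Linear power model: P(u) = P_idle + (P_peak - P_idle) * u."""
    return idle_watts + (peak_watts - idle_watts) * utilisation

def energy_kwh(utilisation_trace, interval_seconds=300):
    """Energy over a trace of utilisation samples, one per interval."""
    joules = sum(host_power(u) * interval_seconds for u in utilisation_trace)
    return joules / 3.6e6  # joules -> kWh

print(round(host_power(0.5), 1))  # 175.0 W at 50% load
```

Summing this model over every host's utilisation trace is how simulators such as CloudSim derive the per-day energy figures that the compared policies are ranked on.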

Figure 1: Architecture

Table 2: Power Utilization at different levels of usage