Enhancing QoE based on Machine Learning in Cloud infrastructure


 The cloud computing paradigm has recently attracted considerable attention from both industry and academia. It provides on-demand network access and offers applications, platforms, or access to a shared pool of hardware and software resources. In a traditional deployment, the user reserves the maximum resources expected to be needed. However, this approach does not guarantee an optimal use of resources and is not cost-effective for users. Elasticity, a characteristic feature of cloud computing, gives the cloud the ability to automatically scale resources up or down in proportion to demand. However, classical deployments only trigger resource scaling on alarms and do not consider the quality perceived by the end user. The aim of this paper is to set up a private IaaS cloud infrastructure and complete it with supervision tools, so that we can optimize the management of cloud elasticity from the users' point of view, i.e., QoE. We also use a machine learning algorithm to predict the load of the physical machines of the cloud, so that providers can manage their data centers efficiently.


Introduction
With the number of suppliers present on the market, cloud providers must satisfy their users to remain competitive; they must manage not only the quality of service according to a previously established SLA, but also take into consideration the user's perception of the service, called Quality of Experience (QoE). On the other hand, for green data center managers, improving energy efficiency and the environmental footprint is one of the significant challenges of the green cloud. Minimizing energy consumption (computing resources, cooling systems) can significantly reduce energy bills and thus increase the provider's profit. It is within this context that our study belongs. The objective is to deploy a service architecture on an IaaS cloud platform based on OpenStack and to implement a monitoring module based on the Ceilometer telemetry service and other supervision tools, such as SNMP and PRTG, to manage and monitor the different physical resources used. This will optimize resource usage by avoiding under/over-provisioning in the cloud while respecting a certain QoE on one hand, and put in place a mechanism allowing the cloud provider to minimize its energy consumption while respecting a certain SLA on the other hand. In this work, we propose to study automatic scaling that respects a target QoE, and the prediction of the cloud's load to optimize the distribution of the servers and minimize energy consumption.
Various research works have proposed approaches that enable a cloud infrastructure to automatically and dynamically scale up or down. Some of them deal with the resource allocation problem and virtual machine (VM) management to achieve cost savings through better utilization of computing resources. In [1], cloud resource allocation is considered alongside VM placement and migration based on the VM's CPU, memory, storage, and network bandwidth, along with resource contention. In [2], relevant threshold values of resource usage are used to trigger scaling. In [3], the number of concurrent users and the number of active connections are used to implement dynamic scaling of resources. The research work described in [4] proposes an auto-scaling mechanism based on budget constraints and job execution deadlines. In [5], a pattern-based prediction algorithm that handles the sudden appearance/disappearance of traffic is used to introduce an auto-scaling mechanism. In the above-mentioned research works, resources are allocated to achieve optimum results towards improving Quality of Service, but none of them focuses on QoE.

A. QoS versus QoE
QoS was defined by the ITU as [6] "Totality of characteristics of a telecommunications service that bear on its ability to satisfy stated and implied needs of the user of the service". Another definition is that QoS is a set of techniques that offers applications the service they need, from end to end. The goal of quality of service is to provide network-level guarantees, including dedicated bandwidth, controlled jitter, low latency, and improved loss characteristics, so that service providers can offer the best possible service to their clients.
In other words, Quality of Service is the prioritization of some services using the network, such as VoIP telephony, messaging, video conferencing, or video surveillance. It classifies the different types of applications according to their importance, in order to assign more or less bandwidth, and thus optimize the network.
To evaluate the quality of an offered service, QoS was the only metric used in the past. However, since it relies only on technical measures related to network performance, it does not really reflect users' assessments. For the same service, two users could have different appreciations. This is due to the performance of the device used, the expectations or the mood of the users at that moment, their social and intellectual environment, and their emotional state. Thus, a new metric appeared, named QoE or Quality of Experience.
The International Telecommunication Union (ITU) defines the quality of experience in ITU-T P.10/G.100 [7] as "the overall acceptability of an application or service, as perceived subjectively by the end-user". The European Telecommunications Standards Institute (ETSI) [8] defines QoE as "the performance of a user when using what is presented by a communication service or application user interface".
By definition, quality of service and quality of experience are two performance indicators for a service, but from different perspectives. QoS (Quality of Service) takes into account the network's characteristics and behavior: performance guarantees given by the network provider, based on measurements. QoE (Quality of Experience) considers the impact of network behavior on end users: some imperfections may go unnoticed, while others may render an application useless, and this impact is not captured by network measurements. Table 1 presents a comparison between QoS and QoE. All the definitions presented above consider QoE to be a subjective measurement provided by the end user that reflects the degree of satisfaction with the used service, while QoS is an objective measure obtained by clear measurement methods based on indicators. QoE evaluation incorporates the end-to-end system and, especially, the user's appreciation. This makes its meaning more complete but also exposes it to several factors that may affect the results.

B. QoE measurement factors
QoE is a multidimensional measurement that can be affected by a variety of factors. By definition, the factors that can affect QoE are: "Any characteristic of a user, system, service, application, or context whose actual state or setting may have influence on the Quality of Experience for the user". When using services and applications, the human experience can be influenced by various factors that have an impact on QoE. In this part, we define the general factors that can alter QoE and, among them, the factors specific to our context. Factors that may have an impact on QoE can be classified into four categories, shown in Figure 1:

The context factor
As mentioned in the previous section, QoE represents the end-user's perception of quality. For example, the quality perception of a multimedia service depends strongly on the viewing context in which consumption takes place. The viewing environment (physical setting) has a considerable influence, as it determines lighting conditions, viewing distances, screen quality…

The human factor
This factor can describe the demographic and socioeconomic background, the physical and mental constitution, or the user's emotional state. "QoE is how the user feels about how an application or service was delivered, relative to their requirements" [11]. This is strongly influenced by the user's internal states and predispositions. Common examples of human factors include not only gender, age, and level of expertise, but also the psychological situation when using the service. Indeed, the properties related to the emotional and mental constitution of the user can play a major role in the final assessment of the user.
Because of their complexity and lack of empirical evidence, we still do not know how human factors affect QoE.

The system factor
The properties of the technical system directly influence QoE. The term refers both to the entire chain of communication between the service provider and the end-user (e.g., network, terminal equipment) and to the technical characteristics of the service provided.
We can therefore classify the system sub-factors as follows:
Media factors: When dealing with multimedia content, the configuration of the media source, such as encoding and compression settings, the sampling rate, the resolution of the scene, and the frame rate, has a high impact on the perceived overall quality.
Network factors: These factors refer to the transmission of data over a network and are closely related to network QoS parameters, including packet loss, delay, jitter, bandwidth, and error rate. The effect of these parameters on the perceived quality depends mainly on the type of multimedia application but evolves with time and / or with the location of the user.
Device factors: The performance of the user's device may affect the whole user experience. These factors include, for example, the display resolution, colors, and brightness. For instance, if a high-quality, high-resolution image is displayed on a low-resolution screen with few colors, most of the original intent of the image may be lost.

The content factor
Different types of content have different system requirements. For video or gaming, for example, the amount of movement and the audio bandwidth can influence the overall QoE. The content itself and its type strongly influence the overall QoE of the system, because different content characteristics require different system properties.

C. QoE evaluation models
There are various approaches to quantifying the QoE of a provided service. These approaches are classified according to how the perceived quality is assessed: directly by humans or automatically from technical factors. In the first case, specific evaluation processes called subjective tests are used, while in the second case, mathematical formulas or algorithms called objective models are exploited. There is a third category of QoE evaluation called the "hybrid method", based on the use of an automatic objective estimator that nevertheless relies on available subjective tests. Figure 2 presents the main classification of QoE models.

Subjective methods
Subjective tests are usually based on controlled experiments with human participants who directly evaluate their experience with an application or service. Different techniques can be used for subjective evaluation. For example, users can rate their experience using an absolute rating scale, or they can compare images and/or videos by specifying which is better. In all cases, the results are based on users' opinions, past experiences, expectations, perception, judgment and description skills, etc. One of the most popular subjective assessment methods is the "Mean Opinion Score" (MOS) [13].
This method is based on laboratory tests under specific, controlled conditions, detailed by the ITU in [14]. The quality is then evaluated by the users, based on surveys of the experience, on a qualitative scale (bad, poor, fair, good, and excellent) numbered from (1) to (5), as shown in Figure 3. The MOS is then calculated as the arithmetic mean of all the individual scores given by the test subjects, and the QoE is attributed to this statistical value.
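As a minimal illustration of the computation above, the MOS is simply the arithmetic mean of the individual 1-5 scores; the ratings below are invented for the example, not taken from our survey:

```python
def mean_opinion_score(ratings):
    """Return the MOS for a list of ratings on the 1-5 ACR scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    return sum(ratings) / len(ratings)

# Example: five test subjects rated the same instance configuration.
scores = [4, 5, 3, 4, 4]
print(mean_opinion_score(scores))  # 4.0
```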

Objective methods
According to the ITU, the principle of objective quality assessment is the estimation of subjective quality solely from the measurement of objective quality indices. Depending on the type of input data used for the quality of experience assessment, objective methods are classified as follows:
Media layer models: These models use the multimedia signal to calculate the quality of experience (QoE) through comparisons and do not require any information about the system being tested.
Packet layer models: These models predict QoE only from packet header information and do not have access to the multimedia signals.
Parametric planning models: These models use parameters of quality planning for networks and terminals to predict QoE. They require prior knowledge of the tested system.
Objective measures can also be classified according to the availability of the original signal. Three major model approaches have been identified:
Full reference: The QoE estimation algorithm requires access to both the reference input data and the degraded output data.
No reference: The QoE estimation algorithm requires only access to the degraded output data.
Reduced reference: The QoE estimation algorithm requires access to the degraded output data and to some features of the original signal, which the quality assessment system uses as side information to help evaluate the quality of the degraded output data.

Hybrid methods
The third type of QoE assessment method is the hybrid model, which lies between the subjective and objective categories. It works as an automatic, objective quality estimator that nevertheless relies on the scores resulting from subjective tests carried out previously. These hybrid methods are based on Machine Learning (ML) tools and use subjective test scores as input parameters to train a QoE model. This model then maps network parameters (for example, packet loss rate, delay, jitter, etc.) to MOS values, and offers the possibility of predicting and/or estimating quality in real time.
This type of solution, lying between the subjective and objective methods, presents a significant advantage in the field of predicting and estimating the quality of experience, but it remains very complex to implement, and the learning stage is very long and labor-intensive. Furthermore, the learning stage of the neural network needs a huge amount of data; in our case, we cannot use hybrid models because of this lack of data.

D. QoE in the Cloud
A cloud computing environment must be elastically scalable; in other words, it must have the ability to flexibly expand as the offered load and the business demands change. However, this feature requires the development of a diverse set of algorithms, like those outlined below. The study of elastic scalability and QoE assessment for cloud services are prerequisites for the construction of an intelligent QoE management and control mechanism for cloud resources. Various research works have dealt with the resource allocation problem and VM management to achieve better utilization of computing resources while avoiding overload situations, considering QoE. Many research works have concentrated on measuring the performance of cloud computing through parameters such as availability, reliability, scalability, and response time. Table 2 shows a literature review of different metrics related to IaaS cloud services.
In [15], cloud resource allocation is considered alongside dynamic resource provisioning using a feedback control mechanism on infrastructure-level performance metrics. In [16], the proposed approach considers dynamic service level agreements (SLA). To improve users' perceived QoE, various studies have come up with metrics directly related to the performance of services such as video streaming. The Mean Opinion Score (MOS) that evaluates the user's QoE is calculated in [13] from metrics consisting of network bandwidth, video bit rate, Round Trip Time (RTT), page load times, and video interruptions [17]. S. Dutta et al. in [18] propose a novel scaling method that closely considers users' QoE; their solution uses QoE feedback as a criterion to scale cloud resources up/down, building a QoE-aware resource management of virtual instances to automatically provision and scale network services (NS) in an elastic way. H. Qian et al. in [19] first identify the interactions among the cloud entities and then evaluate the QoE for the end-users in this complicated environment. The work in [20] studied host reliability issues from the perspective of the end-users. The work in [21] proposes a three-step approach to map SLA and QoS requirements of business processes to cloud infrastructures. [22] considers QoE in the cloud together with power management issues, since it studies a service cloud environment with mobile devices. Emmanouil Kafetzakis et al. in [23] propose a unified QoE-aware management framework directly targeting cloud computing environments. In [24], the authors propose a methodology that estimates the QoE from the end-to-end response time and adjusts the estimated score according to the evaluation context. Sunny Dutta et al. propose in [25] an approach that enables a cloud infrastructure to automatically and dynamically scale up or scale down the resources of a virtualized environment, aiming for efficient resource utilization and improved quality of experience within the ETSI NFV MANO framework for cloud-based 5G mobile systems. Commercial solutions, such as Infovista [26], also address QoE monitoring.

Proposed Architecture
In this work, we aim to optimize the use of resources while respecting a certain QoE. Our optimization takes two points of view into consideration: the first is related to the cloud user and his allocated resources; the second is related to the cloud provider's servers (suspending or using all servers, or allocating resources from another competitor if necessary).
We started by conducting subjective tests, then deployed the OpenStack cloud platform and, finally, measured the currently used instances and available resources, before scaling the system up or down according to an optimization algorithm.
The architecture consists of an OpenStack platform with two key added entities: the QoE estimation entity and the orchestration entity. The first is responsible for automatically evaluating the current user's QoE, so that the orchestrator can scale the user's instance up or down to better serve him. The cloud platform we have chosen is an IaaS; therefore, the orchestrator can only manipulate parameters related to the physical characteristics of the instances, i.e., scale the flavors (VCPU, RAM, disk) up/down.
The architecture shown in Figure 4 illustrates the deployed platform.

A. The subjective tests
The type of application impacts the users' MOS, because gamers and developers need instances with higher configurations: a standard user may rate a minimalist configuration as excellent, while a gamer will rate it as bad.
In our work, we consider four user profiles: gamer, developer, 4K/HD video user, and standard user. To realize the survey that allowed us to determine the MOS scores given by the users, we used Google Forms. Various configurations were presented to different categories of users, and they rated the related instance according to their appreciation.
Given the different answers, for every user profile category, we calculated the average MOS. We then established rules to associate a MOS with a certain configuration, knowing the category to which the user belongs. We used Python 3 to implement the MOS calculation algorithm.
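A minimal sketch of such association rules is given below: given a user profile and an instance flavor, the survey-derived average MOS is looked up in a table. The table values and flavor names here are placeholders for illustration, not the actual results of our survey (Tables 4 to 7):

```python
# (profile, flavor) -> average survey MOS; values are illustrative only.
MOS_TABLE = {
    ("gamer",     "small"):  1.5,
    ("gamer",     "medium"): 3.0,
    ("gamer",     "large"):  4.5,
    ("developer", "small"):  2.5,
    ("developer", "medium"): 4.0,
    ("developer", "large"):  4.8,
    ("standard",  "small"):  4.0,
    ("standard",  "medium"): 4.6,
    ("standard",  "large"):  4.8,
}

def estimate_mos(profile, flavor):
    """Return the survey-derived MOS for a profile/flavor pair."""
    try:
        return MOS_TABLE[(profile, flavor)]
    except KeyError:
        raise ValueError(f"no survey data for ({profile}, {flavor})")

print(estimate_mos("gamer", "small"))     # 1.5
print(estimate_mos("standard", "small"))  # 4.0
```

The same small flavor thus maps to very different MOS values depending on the profile, which is exactly why the profile must be an input of the estimator.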

1) Gamer
According to [29], there are three classes of gamers: "Omnipresent" (e.g., real-time strategy games), "Third-Person Avatar" (e.g., role-play games), and "First-Person Avatar" (e.g., first-person shooters) [30]. Table 3 shows three different games, each belonging to a different class. We asked some users to rate their appreciation of these games using different configurations of the cloud instances. Table 4 summarizes the average MOS scores assigned by users to the games for the different configurations.

2) Developer
In this part, we asked developers to rate their assessments of instances with different con gurations for their daily needs. Table 5 summarizes the average MOS scores assigned by developers.

3) Video 4K/HD
In this section, the third category is considered: users of 4K/HD streaming video rated the instances. We do not consider the device used; we assume that all the devices have high-resolution screens. Table 6 summarizes the average MOS scores assigned by these users.

4) Standard users
Finally, standard users rated the instances used for navigation, word processing, etc. The results are shown in Table 7. In our case, we only consider working days during the winter; in future work, we will also distinguish working days/weekends and the seasons (summer/autumn/spring/winter).

B. The QoE estimator
In a classic cloud deployment, users reserve certain resource pools to be allocated to their instances. However, the reserved resources are not always used at their fair value: they are either over-used or under-used.
Our goal is to develop a system for estimating the quality of experience of cloud users, to better serve them and optimize resource usage.
We used Ceilometer for cloud resource measurement. Ceilometer is the monitoring module of the OpenStack platform. The collected data can be sent to different targets:
1. Gnocchi: developed to capture measurement data in a time-series format, to optimize storage and queries.
2. Aodh: the alarm service that sends alerts when the rules set by the user are not respected.
3. Panko: used for event storage, designed to capture document data such as logs and system event actions.
We hereafter propose a method for estimating the quality of experience (QoE) by identifying the most influential parameters. This step began with a selection of the system factors most affecting the QoE perceived by the users. Then, a survey was carried out, varying these parameters, to obtain real scores.
These parameters are subsequently used to estimate the QoE without using the ratings assigned by users each time.
To estimate the user's QoE, our algorithm considers various parameters, as shown in Figure 5:
User profile: gamer, developer, video, or standard.
Reserved and used cloud instance resources, collected by Ceilometer.
A matching is then made with the subjective test results (Tables 4 to 7) to estimate the current user's QoE.

C. Elasticity management by the orchestrator
The capability of a system to automatically scale up or down in proportion to demand is known as autoscaling. Autoscaling is essential for availability and optimal usage of resources. Cloud services specify the metrics to be observed, their threshold values, and alarms.
Whenever an observed metric crosses its threshold value, an alarm is raised, and either new resources are provisioned (scale up) or currently provisioned resources are released (scale down), based on the scaling policy.
In our case, we add a new alarm called MOS to the autoscaling process (Figure 8). The MOS alarm ensures that we can automatically scale when the estimated QoE is under or over a certain threshold, compared to the user's needs (Figure 9). The scaling algorithm notifies the Heat orchestrator when it is necessary to scale.
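The decision logic of such a MOS alarm can be sketched as follows; the threshold values and the action names are our assumptions for illustration, not the exact parameters of the deployed algorithm:

```python
SCALE_UP, SCALE_DOWN, NO_ACTION = "scale_up", "scale_down", "no_action"

def mos_alarm(estimated_mos, low=3.0, high=4.5):
    """Decide a scaling action from the estimated MOS.

    Below `low`: the user is under-served, request a bigger flavor.
    Above `high`: resources are over-provisioned, shrink the flavor.
    (Thresholds are illustrative placeholders.)
    """
    if estimated_mos < low:
        return SCALE_UP
    if estimated_mos > high:
        return SCALE_DOWN
    return NO_ACTION

print(mos_alarm(2.1))  # scale_up
print(mos_alarm(4.0))  # no_action
print(mos_alarm(4.9))  # scale_down
```

The returned action is what would be passed to the Heat orchestrator, which performs the actual flavor change.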
We decided to use recurrent neural networks (RNN) to forecast the load of the cloud (total CPU resource usage), so that the provider can manage its resources. Thanks to our additional tools (PRTG and SNMP), we have the statistical usage of our physical machines. We extracted a CSV file containing the load evolution of our cloud for two months. Since our data are time series, we oriented our choice of machine learning algorithm towards Long Short-Term Memory (LSTM), a variant of the well-known recurrent neural networks. LSTM was designed to model temporal sequences more accurately than conventional RNNs.
The LSTM learns to keep only the information pertinent for making predictions. This is achieved during backpropagation (the training phase). Figure 7 represents the structure of LSTM blocks. We chain several cells together; the number of cells in a network depends on the complexity of the input. Typically, for our time series, 4 to 6 cells are sufficient to give a good performance score.
The key to the LSTM is the cell state (Ct), which enables information to flow along it unchanged. The cell state is regulated by three gates that optionally let information through. The first gate, called the forget gate, controls which elements of the cell state vector Ct−1 will be forgotten. Then, the input gate decides which values will be updated. Finally, the output gate decides what to output through a sigmoid layer.
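For reference, the standard formulation of these gates (with σ the sigmoid function, ⊙ the element-wise product, and [h_{t−1}, x_t] the concatenation of the previous hidden state and the current input) is:

```latex
\begin{align*}
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{C}_t &= \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate state)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(cell state update)}\\
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(C_t) && \text{(hidden state)}
\end{align*}
```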
A data set containing two months of load history, with a step of one hour, is used to predict the load of the infrastructure for the next day and to determine what action to take and when. We divided our data into two sets: a learning set (5/6) and a test set (1/6). The RNN uses the learning set to learn how the total load of the cloud evolves over time, and the test set to validate its predictions.
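The data preparation for this forecaster can be sketched as follows: the hourly load series is turned into sliding windows (the previous `lookback` hours predict the next hour), then split chronologically 5/6 for learning and 1/6 for testing. The window length of 24 hours and the variable names are our assumptions for illustration:

```python
def make_windows(series, lookback=24):
    """Turn a 1-D load series into (window, next_value) training pairs."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])  # past `lookback` hours
        y.append(series[i + lookback])    # the hour to predict
    return X, y

def train_test_split(X, y, train_frac=5 / 6):
    """Chronological split: first 5/6 for learning, last 1/6 for testing."""
    cut = int(len(X) * train_frac)
    return X[:cut], y[:cut], X[cut:], y[cut:]

# Two months of hourly samples: 60 * 24 = 1440 points (synthetic here).
series = [50 + (h % 24) for h in range(1440)]
X, y = make_windows(series, lookback=24)
X_tr, y_tr, X_te, y_te = train_test_split(X, y)
print(len(X_tr), len(X_te))  # 1180 236
```

The resulting pairs are what the LSTM is trained on; keeping the split chronological avoids leaking future load values into the learning set.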
Before starting the learning phase of the LSTM, we went through a data exploration phase, whose goal is to become familiar with the variables used: what the variables are, the different values they can take, handling missing values, removing duplicates, and treating numeric and categorical values separately. In our case, we only have numerical data, summarizing the total cloud usage, with a step of one hour.
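The cleaning part of this exploration can be sketched on an in-memory series as follows; the choice to keep the first occurrence of a duplicate and to fill gaps by linear interpolation is our assumption, not necessarily the exact treatment applied to the exported CSV:

```python
def clean_series(samples):
    """Clean a list of (hour_index, load) pairs that may contain
    duplicate timestamps and gaps.  Duplicates are dropped (first
    occurrence kept); missing hours are filled by linear interpolation.
    Returns a gap-free hourly load series."""
    seen = {}
    for hour, load in samples:
        if hour not in seen:          # keep first occurrence of a duplicate
            seen[hour] = load
    hours = sorted(seen)
    series = []
    for h in range(hours[0], hours[-1] + 1):
        if h in seen:
            series.append(seen[h])
        else:                         # linear interpolation over the gap
            prev_h = max(k for k in seen if k < h)
            next_h = min(k for k in seen if k > h)
            frac = (h - prev_h) / (next_h - prev_h)
            series.append(seen[prev_h] + frac * (seen[next_h] - seen[prev_h]))
    return series

# Hour 1 is duplicated, hour 2 is missing:
print(clean_series([(0, 10.0), (1, 20.0), (1, 99.0), (3, 40.0)]))
# [10.0, 20.0, 30.0, 40.0]
```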
We used an open-source implementation of LSTM, with Anaconda Navigator as the Python distribution and Jupyter Notebook for the code. Several libraries were used, such as Keras and Pandas, with Python 3 for the implementation.
Figure 8 shows the input/output of the elasticity algorithm implemented inside the orchestrator, and Figure 9 shows a part of the scaling algorithm.

A. Evaluation of estimated MOS
For the MOS calculation, in addition to the configuration, for a gamer user we must consider the class of the game (Table 3). In the example above, we consider the first configuration with the second game. To validate our approach, we compared the subjective MOS with the estimated MOS. Figure 10 below represents both the subjective MOS of a gamer playing a game of the second class and the corresponding estimated MOS.

B. Evaluation of prediction
Figure 11 below shows the results of the validation step: the predicted values are in orange and the real ones in blue. We notice that the predicted values are very close to the real values. This figure shows the load evolution over 11 days.
Figure 12 shows the predicted load evolution for a single day.
In this way, given the past evolution of its cloud and depending on the number of users and their profiles, the cloud provider can estimate the load of its data center and optimize it, for example by minimizing the energy cost, while guaranteeing QoS according to the SLA, as well as QoE.