3.1 MDC
This article mainly addresses resource scheduling and allocation in MDCs. Therefore, we first study MDCs and establish the corresponding models. The model assumes a set of CDCs DC=[dc1,dc2,...,dcn], where n is the number of CDCs available to the cloud service provider. Each CDC dci is described by the parameters listed in Table 1.
Table 1
parameter | meaning | unit |
dci.np | Number of PMs | / |
dci.ap | Number of activated PMs | / |
dci.au | Average utilization of PMs | / |
dci.te | Energy consumption of CDC | kW·h |
Similarly, the PMs in a CDC can also be represented as a collection. The specific meanings and units of the parameters are shown in Table 2.
Table 2
parameter | meaning | unit |
pmj.tc | Number of CPU cores | / |
pmj.tm | Memory size | GB |
pmj.uc | CPU core utilization | / |
pmj.um | Memory utilization | / |
pmj.ft | Completion time | / |
pmj.dn | CDC number | / |
The VM is the basis on which cloud service providers deliver cloud services to users. The specific meanings and units of its parameters are shown in Table 3.
Table 3
parameter | meaning | unit |
vmk.tc | Number of CPU cores | / |
vmk.tm | Memory size | GB |
vmk.ft | Completion time | / |
vmk.pn | PM number | / |
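The three tables above can be read as a small data model. The following is a minimal sketch in Python, not the authors' implementation. The field names follow Tables 1 to 3, while the class names, the use of fractions for utilization, and the way `ap` and `au` are derived from the PM list are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VM:
    """Virtual machine; field names follow Table 3."""
    name: str
    tc: int                    # number of CPU cores
    tm: int                    # memory size in GB
    ft: int                    # completion time
    pn: Optional[str] = None   # hosting PM, None until the VM is deployed


@dataclass
class PM:
    """Physical machine; field names follow Table 2."""
    name: str
    tc: int      # number of CPU cores
    tm: int      # memory size in GB
    uc: float    # CPU core utilization as a fraction (0.6 means 60%)
    um: float    # memory utilization as a fraction
    ft: int      # completion time
    dn: int      # number of the CDC that hosts this PM


@dataclass
class CDC:
    """Cloud data center; np, ap and au from Table 1 are derived from its PMs."""
    number: int
    pms: List[PM] = field(default_factory=list)
    te: float = 0.0  # energy consumption in kW·h

    @property
    def np(self) -> int:
        return len(self.pms)

    @property
    def ap(self) -> int:
        # Assumption: a PM counts as activated when its CPU utilization is nonzero.
        return sum(1 for pm in self.pms if pm.uc > 0)

    @property
    def au(self) -> float:
        # Assumption: the average is taken over activated PMs only.
        active = [pm.uc for pm in self.pms if pm.uc > 0]
        return sum(active) / len(active) if active else 0.0
```

Deriving `np`, `ap` and `au` from the PM list keeps the CDC record consistent with its PMs, so these values never need to be updated by hand after a deployment.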
3.2 Natural clustering rule
Once a cloud user submits an application to rent a VM, the scheduling system selects a suitable PM on which to deploy it according to the current operating state of the PMs. We take 9 PMs in three CDCs as an example. They are all homogeneous PMs based on the Intel Xeon E5-2686 v4. Therefore, each PM has 18 CPU cores and 72 GB of memory. Their parameters are shown in Table 4.
Table 4
PM parameters before VM deployment
PM | tc | tm | uc | um | ft | dn |
pm1 | 18 | 72 GB | 60% | 60% | 4000 | 1 |
pm2 | 18 | 72 GB | 50% | 50% | 3600 | 2 |
pm3 | 18 | 72 GB | 10% | 10% | 3700 | 2 |
pm4 | 18 | 72 GB | 40% | 40% | 7000 | 3 |
pm5 | 18 | 72 GB | 60% | 60% | 7200 | 3 |
pm6 | 18 | 72 GB | 45% | 45% | 7300 | 2 |
pm7 | 18 | 72 GB | 60% | 60% | 7500 | 1 |
pm8 | 18 | 72 GB | 40% | 40% | 8000 | 1 |
pm9 | 18 | 72 GB | 45% | 45% | 10800 | 3 |
The entire scheduling process is fast and can adapt dynamically as cloud user requirements change. This article uses the VM parameters in Table 5 and considers both the utilization and the energy consumption of the PM when deploying a VM.
The FF (first fit) algorithm considers only the utilization of the PM, so it scans the PMs in order of PM number and selects the first one that satisfies the VM's requirements. vm1 requires 8 of the 18 CPU cores, about 44% of a PM. The remaining capacity of pm1 is not enough to deploy vm1 (its utilization would reach 104%), so pm2 is considered next. If vm1 were deployed on pm2, the CPU utilization of pm2 would be 94%. To pursue higher efficiency, a scheduling algorithm usually raises PM utilization as much as possible. However, a fully loaded PM degrades VM performance or even causes PM downtime, because the VMs compete for shared resources. Therefore, the upper limit of PM utilization is usually set to 90%, leaving some redundancy to protect VM performance, and vm1 cannot be deployed on pm2. At the same time, studies have shown that even an idle PM consumes 50%-70% of its full-load energy. Therefore, this article sets a lower limit of 60% for PM utilization. Under this setting, pm3 cannot host vm1, because its utilization after deployment would be only 54%. The scheduling algorithm continues and finds that pm4 can host vm1 with a utilization after deployment (84%) that satisfies both limits. Finally, vm1 is deployed on pm4. The FF selection process is shown in Table 6.
Similarly, with the BF (best fit) algorithm, the selected PM should satisfy the upper and lower utilization limits in order to guarantee VM performance and limit PM energy consumption. pm4, pm6, pm8 and pm9 all satisfy these limits, but deploying vm1 on pm6 yields the highest PM utilization (89%). Therefore, the BF algorithm chooses pm6 to deploy vm1. The BF selection process is shown in Table 7.
As this article studies the resource scheduling problem of MDCs, the clustering process can be completed naturally according to the CDC.
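The FF and BF selections described above can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, only CPU utilization is checked (memory utilization follows the same ratio in this example), and ties in BF are broken in favor of the lower PM number.

```python
from dataclasses import dataclass

UPPER, LOWER = 0.90, 0.60  # utilization limits stated in the text


@dataclass
class PM:
    name: str
    tc: int     # number of CPU cores
    uc: float   # CPU utilization as a fraction
    dn: int     # CDC number


# CPU-related columns of Table 4
PMS = [
    PM("pm1", 18, 0.60, 1), PM("pm2", 18, 0.50, 2), PM("pm3", 18, 0.10, 2),
    PM("pm4", 18, 0.40, 3), PM("pm5", 18, 0.60, 3), PM("pm6", 18, 0.45, 2),
    PM("pm7", 18, 0.60, 1), PM("pm8", 18, 0.40, 1), PM("pm9", 18, 0.45, 3),
]


def util_after(pm, vm_cores):
    """CPU utilization of pm if a VM with vm_cores cores were placed on it."""
    return pm.uc + vm_cores / pm.tc


def feasible(pm, vm_cores):
    """True when the utilization after deployment lies within both limits."""
    return LOWER <= util_after(pm, vm_cores) <= UPPER


def first_fit(pms, vm_cores):
    """Return the first PM, in PM-number order, that satisfies both limits."""
    for pm in pms:
        if feasible(pm, vm_cores):
            return pm
    return None


def best_fit(pms, vm_cores):
    """Return the feasible PM with the highest utilization after deployment."""
    candidates = [pm for pm in pms if feasible(pm, vm_cores)]
    # max() keeps the first maximum, so a tie goes to the lower PM number.
    return max(candidates, key=lambda pm: util_after(pm, vm_cores), default=None)
```

For vm1 (8 cores), `first_fit(PMS, 8)` returns pm4 and `best_fit(PMS, 8)` returns pm6, in agreement with Tables 6 and 7.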
Rule 1. Natural clustering rule. Deploying a VM first selects a PM cluster and then selects a PM within the chosen cluster. Therefore, the PMs of the MDC must be clustered. Because the PMs are already divided among different CDCs by geographical location, this division completes the natural clustering of the PMs, and each CDC forms an independent cluster. Next, the KNN algorithm is extended to classify the VMs and deploy them.
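Rule 1 amounts to grouping the PMs by their CDC number. A minimal sketch, assuming the `dn` column of Table 4 is available as a mapping from PM name to CDC number (the names `PM_TO_CDC` and `natural_clusters` are hypothetical):

```python
from collections import defaultdict

# dn column of Table 4: PM name -> CDC number
PM_TO_CDC = {
    "pm1": 1, "pm2": 2, "pm3": 2,
    "pm4": 3, "pm5": 3, "pm6": 2,
    "pm7": 1, "pm8": 1, "pm9": 3,
}


def natural_clusters(pm_to_cdc):
    """Group PM names by CDC number; each CDC forms one cluster."""
    clusters = defaultdict(list)
    for pm, dn in pm_to_cdc.items():
        clusters[dn].append(pm)
    return dict(clusters)
```

With the data of Table 4, cluster 1 contains pm1, pm7 and pm8, cluster 2 contains pm2, pm3 and pm6, and cluster 3 contains pm4, pm5 and pm9. No distance computation is needed, because cluster membership is fixed by the CDC to which each PM belongs.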
Table 5
VM | tc | tm | ft | pn |
vm1 | 8 | 32 GB | 9000 | / |
vm2 | 4 | 16 GB | 7800 | / |
Table 6
selection process of the FF algorithm when deploying vm1
PM | tc | tm | uc | um | ft | dn |
pm1 | 18 | 72 GB | 104% | 104% | 9000 | 1 |
pm2 | 18 | 72 GB | 94% | 94% | 9000 | 2 |
pm3 | 18 | 72 GB | 54% | 54% | 9000 | 2 |
pm4 | 18 | 72 GB | 84% | 84% | 9000 | 3 |
Table 7
selection process of the BF algorithm when deploying vm1
PM | tc | tm | uc | um | ft | dn |
pm4 | 18 | 72 GB | 84% | 84% | 9000 | 3 |
pm6 | 18 | 72 GB | 89% | 89% | 9000 | 2 |
pm8 | 18 | 72 GB | 84% | 84% | 9000 | 1 |
pm9 | 18 | 72 GB | 89% | 89% | 10800 | 3 |
3.3 Dynamic KNN classification rule
Supervised learning is an important part of machine learning, and classification is the core problem of supervised learning. Among classification methods, the KNN algorithm is widely used in various fields. Its basic idea is, when classifying a target object, to select the K objects that meet the conditions, which may come from different categories. The category containing the most of these K objects is then chosen as the target category, and the target object is classified into it. This ensures, as far as possible, that the target object is classified into the appropriate category. The K value also affects the classification result, and in the setting considered here the result becomes more reasonable as the K value increases.
We again use the PMs in Table 4 and the VM parameters in Table 5, with the PMs already clustered by the natural clustering rule. Observation shows that pm4, pm6, pm8, and pm9 all satisfy the upper and lower limits of PM utilization after deploying vm1. Therefore, we take K = 4. According to the KNN algorithm, the number of qualifying PMs in each cluster is shown in Table 8. The third cluster contains 2 qualifying PMs, the most of any cluster, so vm1 is classified into this cluster. Because the PM completion time must not be extended, pm9 is selected: its completion time (10800) already exceeds that of vm1 (9000), whereas that of pm4 (7000) does not. The selection process for this deployment method is shown in Table 9.
Table 8
cluster | number of PM |
1 | 1 |
2 | 1 |
3 | 2 |
Table 9
selection process for deploying vm1, considering both utilization and completion time
PM | tc | tm | uc | um | ft | dn |
pm4 | 18 | 72 GB | 40% | 40% | 7000 | 3 |
pm6 | 18 | 72 GB | 45% | 45% | 7300 | 2 |
pm8 | 18 | 72 GB | 40% | 40% | 8000 | 1 |
pm9 | 18 | 72 GB | 89% | 89% | 10800 | 3 |
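The classification step for vm1 can be sketched as a vote among the qualifying PMs, where each qualifying PM votes for its own cluster. This is an illustrative Python sketch, not the authors' code. The function name `classify` is hypothetical, only CPU utilization is checked, and the handling of tied clusters (the first one encountered wins) is an assumption, since the text does not specify it.

```python
from collections import Counter

UPPER, LOWER = 0.90, 0.60  # utilization limits stated in the text

# (name, CPU cores, CPU utilization, CDC number) from Table 4
PMS = [
    ("pm1", 18, 0.60, 1), ("pm2", 18, 0.50, 2), ("pm3", 18, 0.10, 2),
    ("pm4", 18, 0.40, 3), ("pm5", 18, 0.60, 3), ("pm6", 18, 0.45, 2),
    ("pm7", 18, 0.60, 1), ("pm8", 18, 0.40, 1), ("pm9", 18, 0.45, 3),
]


def classify(pms, vm_cores):
    """Return (K, votes per cluster, winning cluster) for one VM."""
    # The neighbours are the PMs whose utilization after deployment
    # stays within both limits; each contributes its cluster number.
    neighbours = [
        dn for _, tc, uc, dn in pms
        if LOWER <= uc + vm_cores / tc <= UPPER
    ]
    votes = Counter(neighbours)
    cluster = votes.most_common(1)[0][0] if votes else None
    return len(neighbours), votes, cluster
```

For vm1 (8 cores), `classify(PMS, 8)` gives K = 4, one vote each for clusters 1 and 2, two votes for cluster 3, and cluster 3 as the result, which reproduces Table 8.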
After vm1 has been classified and deployed, the parameters of the PMs change, as shown in Table 10. We now classify and deploy vm2. At this point, pm1, pm2, pm4, pm5, pm6, pm7, and pm8 all satisfy the upper and lower limits of PM utilization. This number differs from the number when vm1 was deployed. Therefore, the K value in KNN can be adjusted dynamically to the number of PMs that satisfy the upper and lower utilization limits. This can further improve the accuracy of classification, and on this basis we propose the dynamic KNN classification rule.
Rule 2. Dynamic KNN classification rule. When a new VM is to be deployed, the MDC performs KNN classification on the VM according to the result of the natural clustering rule. The scheduling system uses the number of PMs that satisfy the upper and lower limits of PM utilization as the K value of the KNN algorithm, so the deployment of each VM may produce a different K value. This dynamic change of K ultimately leads to more accurate classification results.
According to the dynamic KNN classification rule, we take K = 7. The number of qualifying PMs in each cluster is shown in Table 11. The first cluster contains 3 qualifying PMs, the most of any cluster, so vm2 is classified into this cluster. Because the PM completion time must not be extended, pm8 is selected: it is the only qualifying PM in the cluster whose completion time (8000) is not shorter than that of vm2 (7800). The selection process for this deployment method is shown in Table 12.
Table 10
parameters after vm1 deployment
PM | tc | tm | uc | um | ft | dn |
pm1 | 18 | 72 GB | 60% | 60% | 4000 | 1 |
pm2 | 18 | 72 GB | 50% | 50% | 3600 | 2 |
pm3 | 18 | 72 GB | 10% | 10% | 3700 | 2 |
pm4 | 18 | 72 GB | 40% | 40% | 7000 | 3 |
pm5 | 18 | 72 GB | 60% | 60% | 7200 | 3 |
pm6 | 18 | 72 GB | 45% | 45% | 7300 | 2 |
pm7 | 18 | 72 GB | 60% | 60% | 7500 | 1 |
pm8 | 18 | 72 GB | 40% | 40% | 8000 | 1 |
pm9 | 18 | 72 GB | 89% | 89% | 10800 | 3 |
Table 11
cluster | number of PM |
1 | 3 |
2 | 2 |
3 | 2 |
Table 12
selection process for deploying vm2, considering both utilization and completion time
PM | tc | tm | uc | um | ft | dn |
pm1 | 18 | 72 GB | 60% | 60% | 4000 | 1 |
pm2 | 18 | 72 GB | 50% | 50% | 3600 | 2 |
pm4 | 18 | 72 GB | 40% | 40% | 7000 | 3 |
pm5 | 18 | 72 GB | 60% | 60% | 7200 | 3 |
pm6 | 18 | 72 GB | 45% | 45% | 7300 | 2 |
pm7 | 18 | 72 GB | 60% | 60% | 7500 | 1 |
pm8 | 18 | 72 GB | 62% | 62% | 8000 | 1 |
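The whole example, from Table 4 through Table 12, can be traced with the following Python sketch of Rules 1 and 2 applied to vm1 and then vm2. It is a sketch under stated assumptions, not the DNSC algorithm itself. The names `table4` and `deploy` are hypothetical, only CPU utilization is tracked, a tie between clusters goes to the first one encountered, and when several PMs in the winning cluster have a sufficient completion time the one with the highest resulting utilization is chosen. The text does not specify the last two choices.

```python
from collections import Counter
from dataclasses import dataclass

UPPER, LOWER = 0.90, 0.60  # utilization limits stated in the text


@dataclass
class PM:
    name: str
    tc: int     # number of CPU cores
    uc: float   # CPU utilization as a fraction
    ft: int     # completion time
    dn: int     # CDC number, which is also the cluster (Rule 1)


def table4():
    """Return a fresh copy of the PM state in Table 4."""
    rows = [("pm1", 0.60, 4000, 1), ("pm2", 0.50, 3600, 2),
            ("pm3", 0.10, 3700, 2), ("pm4", 0.40, 7000, 3),
            ("pm5", 0.60, 7200, 3), ("pm6", 0.45, 7300, 2),
            ("pm7", 0.60, 7500, 1), ("pm8", 0.40, 8000, 1),
            ("pm9", 0.45, 10800, 3)]
    return [PM(name, 18, uc, ft, dn) for name, uc, ft, dn in rows]


def util_after(pm, cores):
    return pm.uc + cores / pm.tc


def deploy(pms, cores, ft):
    """Place one VM and return (K, chosen PM or None)."""
    feasible = [pm for pm in pms if LOWER <= util_after(pm, cores) <= UPPER]
    k = len(feasible)  # Rule 2: K is recomputed for every VM
    if not feasible:
        return k, None
    # Each feasible PM votes for its cluster; the largest group wins.
    cluster = Counter(pm.dn for pm in feasible).most_common(1)[0][0]
    # Keep only PMs whose completion time already covers the VM's,
    # so that no PM has to run longer than it does today.
    hosts = [pm for pm in feasible if pm.dn == cluster and pm.ft >= ft]
    if not hosts:
        return k, None
    chosen = max(hosts, key=lambda pm: util_after(pm, cores))
    chosen.uc = util_after(chosen, cores)  # update the PM state
    return k, chosen
```

Calling `deploy(pms, 8, 9000)` on `pms = table4()` gives K = 4 and places vm1 on pm9, raising its utilization to about 89%. A subsequent `deploy(pms, 4, 7800)` on the same list gives K = 7 and places vm2 on pm8 at about 62%, in agreement with Tables 9 to 12.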
The above example shows that the utilization of the PMs can be improved without prolonging their running time, so that efficiency is improved and energy consumption is reduced at the same time. The DNSC algorithm proposed on this basis is described in the next section.