3.1 CDC architecture
The CDC contains a set of VMs, VM = {vm1, vm2, ..., vmn}, where n is the total number of VMs provided by the CDC. Each VM is expressed as vmi = {ci, mi, si, pi, ni, ti, ei}, where ci, mi, si, pi, ni, ti and ei denote the number of CPU cores, the memory size (GB), the processing speed of the CPU cores (MIPS), the price of the CPU cores included in the VM ($/hour; price_low for low frequency, price_high for high frequency), the number of the CT running on the VM, the running time of the VM (s), and the state of the VM (1: normal, 2: failure, 3: copy replication). The CTs submitted by cloud users likewise form a set CT = {ct1, ct2, ..., ctm}, where m is the number of CTs. Each CT is described as ctj = {lj, mj, dj, tj, pj, fj}, where lj, mj, dj, tj, pj and fj denote the length of the CT (MI), the required memory size (GB), the deadline of the CT (s), the required actual time of the CT (s), the benefit of completing the CT ($), and the cost of failure ($).
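As an illustration only, the two tuples defined above can be written as simple record types. The class and field names below are hypothetical and chosen for readability; the paper's symbols are given in the comments, and the sample values are taken from the first rows of Tables 3 and 4.

```python
from dataclasses import dataclass
from enum import IntEnum


class VMState(IntEnum):
    """State e of a VM, using the numeric codes from the text."""
    NORMAL = 1
    FAILURE = 2
    COPY_REPLICATION = 3


@dataclass
class VM:
    cores: int             # c: number of CPU cores
    memory_gb: float       # m: memory size (GB)
    speed_mips: float      # s: processing speed (MIPS)
    price_per_hour: float  # p: price ($/hour)
    ct_number: int         # n: number of the CT running on the VM
    running_time_s: float  # t: running time of the VM (s)
    state: VMState         # e: state of the VM


@dataclass
class CT:
    length_mi: float       # l: length of the CT (MI)
    memory_gb: float       # m: required memory size (GB)
    deadline_s: float      # d: deadline (s)
    actual_time_s: float   # t: required actual time (s)
    benefit: float         # p: benefit of completing the CT ($)
    failure_cost: float    # f: cost of failure ($)


# First rows of Table 4 and Table 3.
vm1 = VM(2, 8, 4500, 0.086, 1, 22.0, VMState.NORMAL)
ct1 = CT(90000, 8, 22.0, 20.0, 1.72 / 3600, 0.0)
```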
The calculation method of the required actual time of CT j is as follows:
$$\mathrm{ct}.\mathrm{t}_j = \frac{\mathrm{ct}.\mathrm{l}_j}{\left(f/1000\right) \times \mathrm{vm}.\mathrm{c}_i \times 0.9} \tag{1}$$
where f is the CPU frequency of the VM.
CDCs generally keep 10% redundancy because VM performance fluctuates [25]. Therefore, the deadline offered to cloud users is 1.1 times the required actual time. The deadline of CT j is calculated as follows:
$$\mathrm{ct}.\mathrm{d}_j = 1.1 \times \mathrm{ct}.\mathrm{t}_j \tag{2}$$
The benefit of completing CT j is calculated as follows:
$$\mathrm{ct}.\mathrm{p}_j = \mathrm{ct}.\mathrm{d}_j \times \mathrm{vm}.\mathrm{p}_i / 3600 \tag{3}$$
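The three formulas can be checked numerically with a minimal sketch. The function names are hypothetical. The unit of f is not stated in the text; the sketch assumes that the term (f/1000) evaluates to 2500 for a 2.5 GHz VM, which is the value implied by the 4500 MIPS listed for a two-core VM in Table 4.

```python
def actual_time(length_mi, cores, mips_per_core):
    """Eq. 1: required actual time (s).

    mips_per_core stands for the term (f/1000); taking it as 2500 for a
    2.5 GHz VM is an assumption inferred from Table 4.
    """
    return length_mi / (mips_per_core * cores * 0.9)


def deadline(actual_time_s):
    """Eq. 2: deadline offered to the user (s)."""
    return 1.1 * actual_time_s


def benefit(deadline_s, price_per_hour):
    """Eq. 3: benefit of completing the CT ($), exactly as printed."""
    return deadline_s * price_per_hour / 3600


# ct1 on a two-core Low-frequency Universal VM priced at 0.086 USD/h.
t1 = actual_time(90000, cores=2, mips_per_core=2500)
d1 = deadline(t1)
p1 = benefit(d1, 0.086)
print(round(t1, 1), round(d1, 1), p1)
```

For ct1 this reproduces the required actual time and deadline listed in Table 3. The p column of Table 3 appears to correspond to t × price rather than d × price, so the benefit computed from Eq. 3 as printed differs slightly from the tabulated value.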
3.2 Initial mapping of CTs
Cloud users need to rent VMs to complete their CTs in the CDC. Mature CDCs provide VMs with standard machine specifications for users to choose from. For example, Amazon's Eastern United States (Ohio) DC provides four series of conventional VMs. Table 1 lists the smallest model of each series. For example, m5a.large (Low-frequency Universal) is the lowest configuration of its series. The CDC also provides the high-frequency series m5n.large (High-frequency Universal) and m5n.large (High-frequency Computational), which differ from the corresponding low-frequency series only in CPU frequency. In addition, the CDC provides proportionally scaled-up versions of each of these four VMs. Doubling m5a.large (Low-frequency Universal) gives m5a.xlarge (Low-frequency Universal), and m5a.24xlarge (Low-frequency Universal) is the largest VM of this series, with the number of CPU cores and the memory size increased by 24 times. The rental price increases in proportion to the allocated resources.
Table 1
| VM type | Frequency | Number of CPU cores | Memory size (GB) | Price |
| --- | --- | --- | --- | --- |
| m5a.large (Low-frequency Universal) | 2.5 GHz | 2 | 8 | 0.086 USD/h |
| m5n.large (High-frequency Universal) | 3.1 GHz | 2 | 8 | 0.119 USD/h |
| m5a.large (Low-frequency Computational) | 2.5 GHz | 4 | 8 | 0.129 USD/h |
| m5n.large (High-frequency Computational) | 3.1 GHz | 4 | 8 | 0.178 USD/h |
We assume that cloud users submit a collection of CTs, which the CDC needs to map to VMs. We analyze only the first five CTs, whose parameters are shown in Table 2. To meet the memory requirement of each CT at minimum cost, the smallest VM that satisfies the memory requirement is generally selected. We assume that only Low-frequency Universal VMs are selected during the initial mapping. Thus ct1 is mapped to an m5a.large (Low-frequency Universal) VM, and its required actual time, deadline and benefit are calculated according to formulas 1–3, respectively. The other CTs are deployed in the same way. The VMs after the initial mapping are shown in Fig. 1, where each cell represents one VM of the m5a.large (Low-frequency Universal) specification. The resulting parameters of the five CTs are shown in Table 3, and those of the five corresponding VMs in Table 4.
Table 2
CT parameters before CT mapping
| CT | l | m |
| --- | --- | --- |
| ct1 | 90000 MI | 8 GB |
| ct2 | 60000 MI | 16 GB |
| ct3 | 80000 MI | 16 GB |
| ct4 | 100000 MI | 8 GB |
| ct5 | 120000 MI | 8 GB |
Table 3
CT parameters after CT mapping
| CT | l | m | d | t | p | f |
| --- | --- | --- | --- | --- | --- | --- |
| ct1 | 90000 MI | 8 GB | 22.0 s | 20.0 s | (1.72/3600) $ | 0 |
| ct2 | 60000 MI | 16 GB | 7.4 s | 6.7 s | (1.15/3600) $ | 0 |
| ct3 | 80000 MI | 16 GB | 9.7 s | 8.8 s | (1.51/3600) $ | 0 |
| ct4 | 100000 MI | 8 GB | 24.2 s | 22.2 s | (1.91/3600) $ | 0 |
| ct5 | 120000 MI | 8 GB | 29.3 s | 26.6 s | (2.288/3600) $ | 0 |
Table 4
VM parameters after CT mapping
| VM | c | m | s | p | n | t | e |
| --- | --- | --- | --- | --- | --- | --- | --- |
| vm1 | 2 | 8 GB | 4500 MIPS | 0.086 USD/h | 1 | 22 s | 1 |
| vm2 | 4 | 16 GB | 9000 MIPS | 0.172 USD/h | 2 | 7.4 s | 1 |
| vm3 | 4 | 16 GB | 9000 MIPS | 0.172 USD/h | 3 | 9.7 s | 1 |
| vm4 | 2 | 8 GB | 4500 MIPS | 0.086 USD/h | 4 | 24.2 s | 1 |
| vm5 | 2 | 8 GB | 4500 MIPS | 0.086 USD/h | 5 | 29.3 s | 1 |
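The initial mapping described in this section can be sketched as follows. This is a minimal illustration, not the scheduler itself: the function name, the list of available scale factors, and the per-core speed of 2500 (inferred from the 4500 MIPS of a two-core VM in Table 4) are assumptions. The price is taken to grow linearly with the scale factor, as stated in the text.

```python
# Smallest Low-frequency Universal VM (m5a.large) from Table 1.
BASE_CORES = 2
BASE_MEMORY_GB = 8
BASE_PRICE = 0.086      # USD/h
MIPS_PER_CORE = 2500    # assumed value of the (f/1000) term at 2.5 GHz
SCALES = (1, 2, 4, 8)   # assumed set of available scale factors


def initial_mapping(length_mi, memory_gb):
    """Pick the smallest Low-frequency Universal VM that fits the memory
    requirement, then apply Eqs. 1 and 2 to the CT."""
    for k in SCALES:
        if BASE_MEMORY_GB * k >= memory_gb:
            cores = BASE_CORES * k
            speed = MIPS_PER_CORE * cores * 0.9
            t = length_mi / speed
            return {
                "cores": cores,
                "memory_gb": BASE_MEMORY_GB * k,
                "speed_mips": speed,
                "price_per_hour": BASE_PRICE * k,
                "actual_time_s": t,
                "deadline_s": 1.1 * t,
            }
    raise ValueError("no assumed VM size satisfies the memory requirement")


# ct2 from Table 2 needs 16 GB, so the doubled VM is selected.
vm_for_ct2 = initial_mapping(60000, 16)
print(vm_for_ct2["cores"], vm_for_ct2["speed_mips"])
```

Under these assumptions, ct2 is placed on a four-core, 16 GB VM with 9000 MIPS, which agrees with the vm2 row of Table 4.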
3.3 VM failure
Rescheduling technology is commonly used in modern CDCs. A mature cloud service provider usually raises the performance of the new VM to ensure that the deadline is met. According to Amazon's service agreement, cloud users are compensated based on the downtime of the VM. We convert this into compensation for exceeding the deadline. Table 5 shows the compensation ratio as a function of the ratio by which the deadline is exceeded.
Table 5
| Exceeding ratio | Compensation ratio |
| --- | --- |
| 0–1% | 10% |
| 1–5% | 25% |
| > 5% | 100% |
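Table 5 amounts to a step function, sketched below. The function name is hypothetical, and two points are assumptions rather than statements of the text: the exceeding ratio is taken as the overrun divided by the deadline, and each upper bound (1% and 5%) is treated as belonging to the lower tier.

```python
def compensation_ratio(finish_time_s, deadline_s):
    """Map the ratio by which the deadline is exceeded to the
    compensation ratio of Table 5 (boundary handling is assumed)."""
    exceeding = (finish_time_s - deadline_s) / deadline_s
    if exceeding <= 0:
        return 0.0      # deadline met, no compensation
    if exceeding <= 0.01:
        return 0.10
    if exceeding <= 0.05:
        return 0.25
    return 1.0


# A CT with a 22.0 s deadline that finishes at 22.5 s overruns by about 2.3%.
print(compensation_ratio(22.5, 22.0))
```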
3.4 CDC rescheduling
When a VM fails in the CDC, the CT running on it fails as well. To continue executing the FCT, the scheduling system needs to reschedule the CT and activate a new VM for mapping. If a VM with the same configuration is activated, the completion time of the FCT may exceed the deadline, especially when the fault occurs late. This not only incurs higher compensation but also damages the reputation of the cloud service provider. Therefore, the rescheduling system usually expands the VM configuration proportionally to ensure that CTs are completed on time. However, this significantly increases the operating costs of the cloud service provider. An analysis of the VMs offered by cloud service providers such as Amazon and Alibaba Cloud shows that cloud users usually rent Low-frequency Universal VMs. The execution speed can also be increased by using high-frequency or computational VMs, whose prices rise in turn, as Table 1 shows. Based on this, we propose a dynamic classification rule for FCTs.
Rule 1. Dynamic classification rule of FCTs. When remapping an FCT, we can change the type of VM so that the FCT meets its deadline. If the deadline is met with the originally configured VM, the FCT is classified as a Low-frequency Universal CT. If the CPU frequency needs to be increased to 3.1 GHz to meet the deadline, the FCT is classified as a High-frequency Universal CT. If the ratio of the number of CPU cores to the memory size needs to be increased to 4:8 to meet the deadline, the FCT is classified as a Low-frequency Computational CT. If both need to be increased simultaneously to meet the deadline, the FCT is classified as a High-frequency Computational CT. If none of the above meets the deadline, the four VM configurations are expanded by the same proportion, and the lowest-cost VM that meets the deadline is selected for classification and remapping.
We continue with the previous example. Assume that the VMs fail at random times. Table 6 shows the remaining deadline of each FCT; the other parameters remain unchanged. According to the dynamic classification rule of FCTs, ct1 can meet its deadline if it is mapped to a VM with the original configuration, so ct1 is a Low-frequency Universal CT. If ct2 is mapped to a VM with the original configuration, the deadline cannot be met, but it can be met when the CPU frequency is increased to 3.1 GHz, so ct2 is a High-frequency Universal CT. For ct3, the ratio of the number of CPU cores to the memory size needs to be changed to 4:8 to meet the deadline, so ct3 is a Low-frequency Computational CT. By the same method, ct4 is a High-frequency Computational CT. The ct5 lost a lot of time because its VM failed late. At the original VM scale, the deadline cannot be met no matter how the frequency or the ratio is raised. We have to double the VM configuration and use a High-frequency Computational VM to meet the deadline. Therefore, ct5 is also a High-frequency Computational CT.
Table 6
CT parameters before rescheduling
| CT | l | m | d | t | type |
| --- | --- | --- | --- | --- | --- |
| ct1 | 90000 MI | 8 GB | 20.0 s | 20.0 s | Low-frequency Universal |
| ct2 | 60000 MI | 16 GB | 5.4 s | 6.7 s | High-frequency Universal |
| ct3 | 80000 MI | 16 GB | 4.5 s | 8.8 s | Low-frequency Computational |
| ct4 | 100000 MI | 8 GB | 9.3 s | 22.2 s | High-frequency Computational |
| ct5 | 120000 MI | 8 GB | 5.4 s | 26.6 s | High-frequency Computational |
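Rule 1 can be sketched in a few lines, using the four base configurations of Table 1 and the remaining deadlines of Table 6. This is an illustrative reading of the rule, not the authors' implementation. The names are hypothetical, and the following are assumptions: the per-core speeds 2500 and 3100 stand for the (f/1000) term of Eq. 1, the starting scale is the smallest one that satisfies the memory requirement, expansion proceeds by doubling, and prices grow linearly with the scale, so the computed prices need not coincide with every entry of Table 7. Because the four types are listed in order of increasing price, choosing the cheapest feasible type at a given scale is equivalent to testing them in the order given by the rule.

```python
# (type name, base cores, assumed MIPS per core, base price in USD/h), Table 1.
VM_TYPES = [
    ("Low-frequency Universal", 2, 2500, 0.086),
    ("High-frequency Universal", 2, 3100, 0.119),
    ("Low-frequency Computational", 4, 2500, 0.129),
    ("High-frequency Computational", 4, 3100, 0.178),
]
BASE_MEMORY_GB = 8
EPS = 1e-9  # tolerance for floating-point comparison


def classify_fct(length_mi, memory_gb, remaining_deadline_s, max_scale=64):
    """Return (type, cores, speed in MIPS, price) of the cheapest VM that
    lets the FCT finish within its remaining deadline."""
    scale = 1
    while BASE_MEMORY_GB * scale < memory_gb:
        scale *= 2  # original scale: smallest VM that fits the memory
    while scale <= max_scale:
        feasible = []
        for name, cores, mips, price in VM_TYPES:
            speed = mips * cores * scale * 0.9
            if length_mi / speed <= remaining_deadline_s + EPS:
                feasible.append((price * scale, name, cores * scale, speed))
        if feasible:
            price, name, cores, speed = min(feasible)
            return name, cores, speed, price
        scale *= 2  # no type meets the deadline: expand all four equally
    raise ValueError("deadline cannot be met within the assumed maximum scale")


# Rows of Table 6: (length in MI, memory in GB, remaining deadline in s).
fcts = {
    "ct1": (90000, 8, 20.0),
    "ct2": (60000, 16, 5.4),
    "ct3": (80000, 16, 4.5),
    "ct4": (100000, 8, 9.3),
    "ct5": (120000, 8, 5.4),
}
for ct_name, params in fcts.items():
    print(ct_name, classify_fct(*params))
```

Under these assumptions the sketch assigns the same type to each FCT as Table 6, and ct5 is only feasible after the configuration has been doubled.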
With the FCTs classified, the corresponding VM is activated for remapping according to the required VM parameters. The VM parameters after rescheduling are shown in Table 7.
Table 7
VM parameters after rescheduling
| VM | c | m | s | p | n | t | e |
| --- | --- | --- | --- | --- | --- | --- | --- |
| vm1 | 2 | 8 GB | 4500 MIPS | 0.086 USD/h | 1 | 2.0 s | 2 |
| vm2 | 4 | 16 GB | 9000 MIPS | 0.172 USD/h | 2 | 2.0 s | 2 |
| vm3 | 4 | 16 GB | 9000 MIPS | 0.172 USD/h | 3 | 5.2 s | 2 |
| vm4 | 2 | 8 GB | 4500 MIPS | 0.086 USD/h | 4 | 14.9 s | 2 |
| vm5 | 2 | 8 GB | 4500 MIPS | 0.086 USD/h | 5 | 23.9 s | 2 |
| vm6 | 2 | 8 GB | 4500 MIPS | 0.086 USD/h | 1 | 20.0 s | 3 |
| vm7 | 4 | 16 GB | 11160 MIPS | 0.238 USD/h | 2 | 5.3 s | 3 |
| vm8 | 8 | 16 GB | 18000 MIPS | 0.258 USD/h | 3 | 4.4 s | 3 |
| vm9 | 4 | 8 GB | 11160 MIPS | 0.356 USD/h | 4 | 8.9 s | 3 |
| vm10 | 8 | 16 GB | 22320 MIPS | 0.712 USD/h | 5 | 5.3 s | 3 |
The above example shows that all five FCTs meet their deadlines after remapping. The dynamic classification rule of FCTs selects VMs according to the classification of the CTs, which avoids blindly expanding VM capacity. The next part introduces the HRFDC strategy, which is the core of this article.