Offloading Dependent Tasks in MEC-enabled IoT Systems: A Preference-based Hybrid Optimization Method

The rapid development of IoT-based services has resulted in an exponential increase in the number of connected smart mobile devices (SMDs). Processing the massive data generated by these SMDs is becoming a serious problem for mobile devices, servers, and wireless communication channels. The Multi-access Edge Computing (MEC) paradigm partially mitigates this problem by deploying edge server nodes at the edge of wireless networks near SMDs, but the challenge remains because of the limited computation capacity of MEC servers and the limited bandwidth of wireless channels. In addition, the dependency of tasks generated by applications on SMDs increases the complexity of the problem. In this paper, we propose a constrained multiobjective computation offloading optimization solution to resolve the problem of task dependency under limited resources. This solution improves the Quality of Service (QoS) by minimizing the latency, energy consumption, and rate of task failure caused by limited resources. We propose a two-stage hybrid computation offloading optimization method to solve the problem. In the first stage, the computation offloading decisions are made based on the preferences of tasks. Then, in the second stage, globally optimal solutions are sought using the modified Non-dominated Sorting Genetic Algorithm (NSGA-III). The overall efficiency of the proposed method is increased owing to the preference-based algorithm


Introduction
With the fast development of services on Internet of Things (IoT) networks (e.g., smart cities, building and home automation, smart manufacturing, health care, automotive, and wearables), the number of smart mobile devices (SMDs) is increasing exponentially [1,2]. A challenge is how to handle the massive amount of data they generate in a timely manner. Multi-access Edge Computing (MEC) is a promising paradigm in which edge server nodes are deployed at the edge of wireless networks to process the data of nearby SMDs. The MEC paradigm is aimed at ensuring short latency and data privacy because the data is processed within a wireless network area [3]. However, it also creates a local computation burden when multiple SMDs connect simultaneously to a MEC server node: many applications running on SMDs generate computation-intensive and/or data-intensive tasks that require MEC nodes to complete, but the MEC nodes do not have sufficient computing capacity and storage [4]. This dilemma can be mitigated by optimizing the computation offloading of SMDs to the edge server to improve the QoS.
Real-time applications (e.g., mobile games, augmented reality (AR), and autonomous vehicles) require computation tasks to be processed instantly. Therefore, the completion time (latency) of tasks is an important factor for them. Some authors addressed latency as a metric for computation offloading optimization [5][6][7][8][9]. Other applications (e.g., health care sensors, drones, wearables) require a prolonged battery lifetime. Hence, the energy consumption of SMDs is another objective for computation offloading optimization [10][11][12][13]. Some authors considered latency and energy consumption as joint metrics for multiobjective computation offloading optimization [14][15][16][17][18]. However, a review of these existing studies shows that task failure, caused by the limitation of resources, has not been considered as an optimization metric jointly with latency and energy consumption. This metric is important because limited resources lead to task failure when they are overloaded. We should therefore consider the rate of task failure when seeking the optimal solutions that minimize latency and energy consumption. This is particularly challenging in the case of interdependent tasks that must maintain the order of execution.
Partial computation offloading is aimed at avoiding the overload of an SMD by full local execution, or of a wireless channel and a MEC server by full offloading of the tasks. In partial computation offloading, tasks are either executed locally or offloaded onto a MEC server. The latter consists of two processes: transmission of a task through a wireless channel and execution of the task on a MEC server. We can model the local computing, wireless transmission, and edge computing processes as a network of queueing systems using three queueing theory models. The models allow us to evaluate the latency, energy consumption, and probability of task failure caused by limitations of processors or wireless channels. We formulate a constrained multiobjective computation offloading optimization (CMCOO) problem to minimize these objectives. The optimal solution is found by considering the dependencies of the tasks. To achieve this, we propose a two-stage hybrid method. In the first stage, the tasks are classified into computation-intensive or data-intensive tasks. After that, we make offloading decisions in advance for some tasks based on preferences. In the second stage, a modified Non-dominated Sorting Genetic Algorithm (NSGA-III) is used to find the best offloading decisions for the remaining tasks that do not meet the preferences.
Current solutions to computation offloading optimization do not consider the situations of task dependency and failure. To optimize the multiple objectives of latency and energy consumption, some authors assign a weight to each objective and convert the problem to a single-objective optimization problem [19][20][21]. These solutions require rerunning the optimization algorithm whenever the weights of the objectives are changed. Other solutions optimize one objective and treat the other objective functions as constraints [15,22]. Some solutions use the Lagrange multiplier method to solve the CMCOO problem [23,24]. However, when constraints are complex, this approach becomes difficult to use. The Non-dominated Sorting Genetic Algorithm was proposed because it is faster than other multiobjective optimization algorithms [14]. It gives a set of Pareto optimal solutions from which a user can select the solution that suits real-time requirements. The performance of the NSGA-III algorithm depends on the randomly generated initial population. A good initial population can lead to faster convergence of the algorithm.
In this paper, we propose a method that categorizes the computation tasks of SMD applications and makes offloading decisions on the classified tasks in two stages: first, making offloading decisions on some classes of computation tasks based on preferences; and second, using the NSGA-III algorithm to optimize the offloading decisions on the tasks left undecided by the first stage. The initial population of the NSGA-III algorithm is generated considering the decisions of the first stage. Thus, the first stage helps to generate a better initial population. This approach can help improve the performance of the overall optimization process.
The main contributions of this paper are as follows:
1. The local computing, task transmission, and edge computing processes are modeled using queueing theory models that consider the limitations of processors and wireless channels.
2. We optimize task failure rate jointly with latency and energy consumption in a MEC-based IoT system with interdependent tasks. To our knowledge, this is the first work to take task failure and task dependency into consideration.
3. We propose a preference-based hybrid computation offloading multiobjective optimization method to minimize the mentioned objectives.
4. We conduct extensive simulation experiments in various cases with the proposed and existing methods. The results of the experiments show that the proposed method can give better results compared with existing methods.
The rest of the paper is organized as follows. The literature on the computation offloading problem is reviewed in Section 2. We describe our system model and formulate the optimization problem in Section 3. The proposed method is described in Section 4. We present the experiment results and compare our findings to the baseline methods in Section 5. In Section 6, we state our conclusions and remark on possible directions for future research.

Related work
Offloading part of the computation tasks (referred to as tasks hereafter) onto remote resources (cloud/edge computing) is widely used in practice. MEC is suitable for latency-sensitive applications because the edge server nodes are deployed at the edge of a wireless network. MEC has significant advantages in terms of latency and data privacy compared to cloud computing. Computation offloading in MEC, its limitations, and the related issues were studied in [3,25]. Also, Gasmi et al. [26] pointed out several research problems in computation offloading in MEC, including issues related to computation offloading of dependent tasks.
Afrin et al. [14] proposed a computation offloading method using the NSGA-II algorithm to minimize the makespan, energy consumption, and monetary cost in an edge cloud-based multi-robot system. To solve the problem, they proposed a modified NSGA-II algorithm that pre-sorts the initial population based on the task size and processing speed of the resources. Then, to balance the values of all objectives in subsequent generations, it selects the chromosomes whose solutions have the minimum distance from the Pareto front to the origin.
Another example is the method provided by Cui et al. [16] to minimize the latency and energy consumption of SMDs. The authors explained that utilizing a multiobjective optimization algorithm allows for selecting the best solution among the Pareto optimal set and avoids rerunning the algorithm when the conditions change. They used the M/M/1 type of queue for modeling computing processes on SMDs and a MEC server. The latency was considered as the sum of the waiting time in the queue and the execution time of a task. This model works well when tasks are executed on a computer with unlimited resources.
Xu et al. [15] proposed a computation offloading method for cases with multiple computing units, for instance, edge servers with several virtual machines (VMs). They modeled the task execution process on a cloudlet using an M/M/c/∞ queue. A task can be offloaded to an idle one of the c VMs on the cloudlet. The objective of the work was to minimize the energy consumption under deadline constraints in a system that consists of a collaboration of local, edge, and cloud computing resources. They successfully used the NSGA-II algorithm to achieve the goal.
Xu et al. [35] proposed an NSGA-III algorithm-based method to minimize the completion time and energy consumption of IoT devices. They applied simple additive weighting (SAW) and multiple-criteria decision-making (MCDM) techniques to select an optimal schedule strategy. The tasks are executed on the mobile device, on the cloudlet, or on the cloud server. That is, the tasks are offloaded onto the cloud server if the cloudlet is busy. If all VMs are busy, then tasks are lost. However, they did not consider the number of lost tasks in their work.
Mao et al. [36] proposed a Lyapunov optimization-based dynamic computation offloading algorithm. They considered latency and task failure as the performance metrics of the multiobjective optimization problem, then converted it into a single-objective optimization problem. Their proposed algorithm was oriented to the independent task case.
The works mentioned above offer interesting solutions to improve the performance of SMDs in various situations. Two gaps can be observed: one in modeling the task computation or transmission processes and one in the performance metrics. Using the M/M/1 or M/M/c/∞ queueing models, for example, is not appropriate for systems with limited resources. Using these models, we cannot evaluate the number of tasks lost due to the limited resources of the computing unit or wireless channel. Therefore, the latency calculated by those models may differ from the actual value. Also, the performance of the NSGA algorithm depends on the randomly generated initial population.
In this work, we use M/M/c/K-type queueing models to describe the local computing, transmission, and edge computing processes. They allow us to evaluate the number of lost tasks in computing units and wireless channels with limited resources. Thus, we optimize the latency, energy consumption, and task failure as a constrained multiobjective computation offloading optimization problem. To solve the problem, we propose a two-stage preference-based hybrid method. In the first stage, offloading decisions are made based on preferences. In the second stage, the modified NSGA-III algorithm is adopted. The differences between this work and existing works are given in Table 1. The initial population of the NSGA-III algorithm is generated according to the result of the first stage. Thus, we can improve the overall performance of the optimization algorithm. The details are presented in the following sections.

Queueing Models and Problem Formulation
We consider a MEC system consisting of a MEC server, a set of SMDs, and several small cells, as shown in Fig. 1. A small cell eNodeB (SeNB) is connected wirelessly with the SMDs in a small cell. The bandwidth of the wireless channel is assigned equally to the SMDs within the small cell. The SeNBs are connected by wire to the Macro eNodeB (MeNB), where the MEC server is deployed. The tasks of several SMDs may be offloaded onto the MEC server simultaneously. Let the $M$ SMDs be denoted by the set $\mathcal{M} = \{U_1, U_2, \ldots, U_M\}$. On an SMD, an application generates $N$ tasks, denoted by the set $\mathcal{N} = \{\tau_{m,1}, \tau_{m,2}, \ldots, \tau_{m,N}\}$. We assume the rate of generated tasks $\lambda$ (i.e., the number of tasks per time unit) follows the exponential distribution. The tasks are compound, i.e., each task can contain several dependent subtasks. The dependency of subtasks is given by the directed acyclic graph (DAG) $G = (V, E)$, where $V$ denotes the subtasks and $E$ denotes the precedence constraints between subtasks $i$ and $k$. Any subtask may be executed either on the SMD or on the MEC server, except for the first and the last ones, because a task is generated at the SMD and its final result is shown at the SMD. Each subtask $\tau_{m,n,k}$ is defined by two parameters $(d_{m,n,k}, c_{m,n,k})$, where $d_{m,n,k}$ represents the data size (DS) of the subtask and $c_{m,n,k}$ represents the number of cycles required to complete the subtask.
Since subtasks can be executed either locally or on a MEC server, the flow of subtasks is divided into two parts. Assume a $\delta_m$ part of the flow is offloaded through a wireless channel and executed on the MEC server. The remaining $\gamma_m$ part of the flow is executed locally, where $\gamma_m = 1 - \delta_m$. We model these processes by adopting models from queueing theory. In the queueing models, we consider the limitations of the computation capacities of the SMDs and the MEC server and of the bandwidth of the wireless channel. The fraction of offloaded subtasks can be calculated as where $N_m$ is the number of tasks of SMD $U_m$, and $x_{m,n,k} \in \{0, 1\}$ represents an offloading decision: $x_{m,n,k} = 0$ means the subtask is executed locally, and $x_{m,n,k} = 1$ means the subtask is offloaded onto the edge server.
Fig. 3 depicts the network model of the queueing system, which has three queueing system models denoted as QS-1, QS-2, and QS-3. QS-1 models the $\gamma_m$ part of the subtask flow $\lambda_m$ arriving at the CPU of an SMD for processing. An SMD has a single computing unit and a limited buffer $K_l$ to keep the subtasks in a queue. An arriving subtask is lost if the buffer of the SMD is full. Thus, the local subtask execution process is modeled using the M/M/1/$K_l$ type of queueing system with loss. In QS-2, subtask transmission over the wireless channel is modeled using the M/M/1/1 queue. Because we assume the wireless channel does not have any buffer to keep a queue, a subtask is lost immediately if the wireless channel is busy with the transmission of the previous subtask. The probability of losing a task (task failure) in transmission is denoted by $\pi^{tr}_m$. At the MEC server in QS-3, all flows are aggregated, i.e., $\sum_{m=1}^{M} \varpi_m \delta_m \lambda_m$, where $\varpi_m = 1 - \pi^{tr}_m$. A MEC server has several virtual machines (VMs) and a limited buffer. A task is assigned to and executed in a VM randomly. As a result, we model the edge computing process as M/M/c/K, where $c$ is the number of VMs and $K$ is the buffer size of the MEC server. The input flow is the sum of the output flows of the SMDs, which follow the same distribution. The values of $K_l$, $c$, and $K$ are given.
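To make the flow splitting concrete, the following minimal Python sketch computes the offloaded fraction for one SMD and the aggregated input rate of QS-3. It assumes that the offloaded fraction is simply the share of subtasks with decision 1; this is an illustrative reading of the text, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def offloaded_fraction(decisions: Sequence[Sequence[int]]) -> float:
    """decisions[n][k] holds x_{m,n,k} for one SMD.

    Assumption: delta_m is the share of subtasks with x = 1.
    """
    total = sum(len(task) for task in decisions)
    offloaded = sum(sum(task) for task in decisions)
    return offloaded / total


def split_flows(lam: float, delta: float) -> Tuple[float, float]:
    """Return (rate into QS-1, rate into QS-2) using gamma_m = 1 - delta_m."""
    return (1.0 - delta) * lam, delta * lam


def edge_arrival_rate(lams: Sequence[float],
                      deltas: Sequence[float],
                      pi_tr: Sequence[float]) -> float:
    """Aggregated QS-3 input: sum over SMDs of (1 - pi_tr_m) * delta_m * lambda_m."""
    return sum((1.0 - p) * d * l for l, d, p in zip(lams, deltas, pi_tr))
```

For example, an SMD with two four-subtask tasks and three offloaded subtasks has an offloaded fraction of 3/8, so a subtask rate of 4 per time unit splits into 2.5 (local) and 1.5 (offloaded).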

Local Computing
An SMD has a single computation unit, and its capacity is limited. The CPU of the SMD serves only one subtask at a time, and $K_l - 1$ subtasks are held waiting for service in the SMD's buffer. The service time of the computation unit follows an exponential distribution. Therefore, the task execution process on the SMD is modeled by the M/M/1/$K_l$ type of queueing system. The average service time of an SMD can be calculated as where $f_m$ is the computation rate of the SMD, which is given by the number of CPU cycles.
The SMD utilization coefficient is calculated as $\rho_m = \gamma_m \lambda_m b_m$, where $\gamma_m$ represents the QS-1 input flow, defined as $\gamma_m = 1 - \delta_m$. The average waiting time for service of a subtask $\tau_{m,n,k}$ on an SMD can be calculated as [37]: where $K_l$ is the number of subtasks held in QS-1.
The local execution time of a subtask on an SMD is equal to the sum of the service time by the CPU and the average waiting time for service in the queue, and it is calculated as Considering the dependency of subtasks, we define the ready time of a subtask for local execution as follows: where $pred(k)$ is the set of immediate predecessors of the subtask $\tau_{m,n,k}$. The completion time of the subtask $\tau_{m,n,k}$ in local execution can be calculated as The probability of task failure caused by the buffer limitation of an SMD can be calculated as [37] The amount of energy used by an SMD while a subtask is being executed locally is calculated as where $\kappa = 10^{-26}$ is the switched capacitance coefficient, which depends on the chip architecture [11].
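As a rough illustration of the QS-1 quantities, the sketch below evaluates a finite M/M/1/K queue by summing its stationary state probabilities directly. It uses standard textbook results and Little's law, which may differ in form from the closed-form expressions cited from [37]. The energy model $\kappa f^2 c$ is a commonly used form and is an assumption here; all function and parameter names are hypothetical.

```python
from typing import Dict


def mm1k_metrics(lam: float, b: float, capacity: int) -> Dict[str, float]:
    """Stationary metrics of an M/M/1/K queue.

    lam      -- arrival rate into the queue (for QS-1: gamma_m * lambda_m)
    b        -- mean service time
    capacity -- K, the maximum number of subtasks in the system
                (one in service plus K - 1 waiting)
    """
    rho = lam * b
    # Unnormalised state probabilities rho**n; a direct sum also covers rho == 1.
    weights = [rho ** n for n in range(capacity + 1)]
    norm = sum(weights)
    probs = [w / norm for w in weights]
    p_loss = probs[capacity]                      # arriving subtask finds the buffer full
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    lam_eff = lam * (1.0 - p_loss)                # rate of accepted subtasks
    sojourn = mean_in_system / lam_eff            # Little's law
    return {"rho": rho, "p_loss": p_loss, "wait": sojourn - b, "sojourn": sojourn}


def local_energy(cycles: float, f: float, kappa: float = 1e-26) -> float:
    """Assumed energy model: kappa * f**2 * cycles (joules for f in Hz)."""
    return kappa * f ** 2 * cycles
```

With `capacity=1` the waiting time comes out as zero, which matches the statement that a buffer-less system has no queueing delay.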

Communication Model
The subtask uploading transmission rate $R_m$ from SMD $U_m$ to the access point (AP) is calculated according to the Shannon-Hartley theorem where $B$ denotes the bandwidth of a wireless channel, $s$ is the number of channels, and $P_m$ denotes the transmission power of SMD $U_m$. Furthermore, $G_m$ represents the channel gain between SMD $U_m$ and the AP, $\sigma$ represents the background noise power, and $I_m$ is the interference caused by SMDs in other cells on the same channel, which is calculated as follows where $a_{l,m} \in \{0, 1\}$ is a binary variable. If SMD $U_m$ and SMD $U_l$ use the same channel, $a_{l,m} = 1$; otherwise, $a_{l,m} = 0$. The transmission power of SMD $U_l$ is $p_l$, and the channel gain between SMD $U_l$ and the AP is $G_l$. We assume the wireless channel does not have a buffer to store subtask data. Arriving subtasks are dropped when the wireless channel is overloaded. We model this process as an M/M/1/1 type queueing system, which means the system has a single service unit and can handle only a single subtask at a time. When $K_l = 1$, the average waiting time for service is $w = 0$ according to Eq. (3). Therefore, the transmission time of a task from the SMD to the AP is calculated as The probability of task failure due to the bandwidth limitation of a wireless channel can be computed as [37] where $\rho_R = \delta_m \lambda_m b^{tr}$ is the utilization coefficient of a wireless channel. The average service time of a wireless channel is calculated as The energy consumption of an SMD during transmission of a subtask is calculated as
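The communication model can be sketched in a few lines of Python. The sketch assumes the plain Shannon-Hartley form $B \log_2(1 + P_m G_m / (\sigma + I_m))$ for a single channel (how the number of channels $s$ enters is not assumed here), the standard M/M/1/1 blocking probability $\rho/(1+\rho)$, and transmission energy equal to power times transmission time. Function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def interference(m: int, a: Sequence[Sequence[int]],
                 p: Sequence[float], g: Sequence[float]) -> float:
    """Sum of a[l][m] * p[l] * g[l] over all other SMDs l sharing the channel."""
    return sum(a[l][m] * p[l] * g[l] for l in range(len(p)) if l != m)


def uplink_rate(bandwidth: float, p_m: float, g_m: float,
                noise: float, interf: float) -> float:
    """Assumed single-channel Shannon-Hartley rate in bit/s."""
    return bandwidth * math.log2(1.0 + p_m * g_m / (noise + interf))


def channel_loss_probability(offloaded_rate: float, b_tr: float) -> float:
    """M/M/1/1 blocking probability with rho_R = delta_m * lambda_m * b_tr."""
    rho = offloaded_rate * b_tr
    return rho / (1.0 + rho)


def transmission(data_bits: float, rate: float, p_m: float) -> Tuple[float, float]:
    """Return (transmission time, SMD transmission energy); no queueing delay."""
    t = data_bits / rate
    return t, p_m * t
```

For a 200 kbit subtask on a 1 Mbit/s link with 0.5 W transmission power, this gives a transmission time of 0.2 s and an energy of 0.1 J.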

Edge Computing
In edge computing, subtask flows from all SMDs are aggregated. We assume the edge server has $c$ virtual machines and a limited buffer. Because the edge server can only operate on and hold $K$ subtasks, other arriving subtasks are lost. We model the edge computing process as an M/M/c/K queueing system. The average service time of the edge server is calculated as the average workload from all SMDs over the computation rate of the edge server as where $F$ denotes the computation rate of the edge server, which is given by the number of CPU cycles.
The utilization coefficient of an edge server is indicated by $r = \sum_{m=1}^{M} \delta_m \lambda_m b_E$, where $\rho_E = r/c$. The probability that an edge server is idle is defined as [37] The average waiting time of a subtask for service by an edge server is calculated as follows: If $\rho_E = 1$, L'Hôpital's rule is applied twice, as shown in [37].
Thus, the execution time of a subtask on the edge server is equal to the sum of the service time and the waiting time in the queue, and it is calculated as The ready time of the subtask $\tau_{m,n,k}$ on the edge server is calculated as The completion time of a subtask on the edge server is calculated as The probability of task failure caused by overloading of the edge server is calculated as [37] We ignore the energy consumption of the edge server because it is powered by the power network. In edge computing, we consider only the energy consumed by the SMDs while offloading the subtasks onto the edge server.
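The QS-3 quantities can be illustrated with a short numerical sketch of an M/M/c/K queue. It uses the standard birth-death state probabilities and Little's law rather than closed forms, so the case $\rho_E = 1$ needs no special treatment. It assumes that $K$ is the total number of subtasks the server can hold (in service plus waiting, $K \ge c$); the function name is hypothetical.

```python
import math
from typing import Dict


def mmck_metrics(lam: float, b: float, c: int, capacity: int) -> Dict[str, float]:
    """Stationary metrics of an M/M/c/K queue.

    lam      -- aggregated arrival rate at the edge server
    b        -- mean service time of one VM
    c        -- number of VMs
    capacity -- K, assumed to be the total system capacity (K >= c)
    """
    r = lam * b
    weights = []
    for n in range(capacity + 1):
        if n <= c:
            weights.append(r ** n / math.factorial(n))
        else:
            weights.append(r ** n / (math.factorial(c) * c ** (n - c)))
    norm = sum(weights)
    probs = [w / norm for w in weights]
    p_loss = probs[capacity]                       # arriving subtask is rejected
    queue_len = sum((n - c) * probs[n] for n in range(c + 1, capacity + 1))
    lam_eff = lam * (1.0 - p_loss)
    wait = queue_len / lam_eff                     # Little's law on the queue
    return {"p_idle": probs[0], "p_loss": p_loss, "wait": wait, "exec": wait + b}
```

With `c = capacity` the model reduces to a pure loss system and the waiting time is zero, which is a quick sanity check on the sketch.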

Problem Formulation
The overall completion time of any subtask is equal to the sum of the completion times of all subtasks on local computing and edge computing, and it can be calculated as Similarly, the overall energy consumption of any subtask can be calculated as The average task completion time of tasks for all SMDs is calculated as where the completion time of the 10th subtask is the completion time of a task. The average energy consumption of tasks across all SMDs is calculated as The probability of task failure in the whole system is calculated as the sum of the probabilities in all queueing systems as follows: To minimize the energy consumption of SMDs, the average task completion time, and the probability of task failure by finding the optimal computation offloading decisions, a constrained multiobjective optimization problem is formulated as follows: Constraint C1 states that the completion time of each task cannot exceed the deadline given by an application. Constraint C2 states that each offloading decision is binary. Constraint C3 states that optimal offloading decisions are sought for the second to the ninth subtasks of each task. The first and tenth subtasks are executed locally. Constraint C4 states that subtask $k$ cannot be finished before its predecessors.
To solve P1, we propose a preference-based hybrid computation offloading method, which is described in detail in the next section.
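The role of the precedence and deadline constraints can be illustrated with a small sketch. It is a simplification under stated assumptions: each subtask already has a single execution time for its chosen location, the ready time of a subtask is the latest completion time among its immediate predecessors, and subtasks are indexed in topological order. Transfer delays between locations are ignored, and the function names are hypothetical.

```python
from typing import List, Sequence


def completion_times(exec_time: Sequence[float],
                     preds: Sequence[Sequence[int]]) -> List[float]:
    """Completion time of every subtask of one task.

    exec_time[k] -- execution time of subtask k at its chosen location
    preds[k]     -- indices of the immediate predecessors of subtask k
                    (each smaller than k, i.e. topological order)
    """
    finish: List[float] = []
    for k, t in enumerate(exec_time):
        ready = max((finish[i] for i in preds[k]), default=0.0)
        finish.append(ready + t)       # a subtask never finishes before its predecessors
    return finish


def meets_deadline(finish: Sequence[float], deadline: float) -> bool:
    """Deadline check: the last subtask's completion time is the task's."""
    return finish[-1] <= deadline
```

For a diamond-shaped DAG with execution times 1, 2, 3, 1, the task completes at time 5, since the last subtask must wait for the slower of its two predecessors.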

A Hybrid Offloading Method
As a solution to problem P1, we propose a two-stage hybrid computation offloading optimization method. Assume there are $M$ SMDs, each generating $N_m$ tasks, and the total number of tasks is $N$, where each task contains 10 dependent subtasks according to the system model. The goal is to build a computation offloading framework that finds the optimal offloading decision for each subtask and optimizes the objective functions under the given constraints.
Let $\alpha$ be the proportion of subtasks that get their offloading decisions based on preferences. Assume that the initial values of the population size and the number of iterations, $G_i$ and $I_i$, are given. The population size and the number of iterations are reduced to 10 during the process, depending on the number of tasks. We assume these values are enough for the algorithm to converge when the selected number of tasks is at its minimum. To begin with, a task is selected from each SMD, giving a set of $N_s$ tasks. Thus, there are $N_s \times 10$ subtasks in total that wait for offloading decisions. Next, the offloading decisions for some subtasks are made based on preferences. Then, the modified NSGA-III algorithm is adopted to find the optimal decisions for the remaining subtasks. Finally, the selected tasks are removed from the task set, and the next tasks are selected to find the optimal offloading decisions for their subtasks. This process continues until all tasks receive offloading decisions. A general description of the proposed hybrid method is given in Fig. 4, and its pseudo-code is given in Algorithm 1. The following sections show in detail how the proposed method works.
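The outer loop described above can be sketched as follows. This is only a structural illustration: the two stages are passed in as callables, and the names `hybrid_offloading`, `decide_by_preference`, and `optimise_remaining` are hypothetical. The adjustment of the population size and iteration count is omitted.

```python
from typing import Callable, Dict, Hashable, List, Optional, Tuple

TaskKey = Tuple[Hashable, Hashable]      # (SMD id, task id)
Genes = List[Optional[int]]              # None marks a subtask that is still undecided


def hybrid_offloading(
    tasks_per_smd: Dict[Hashable, List[Hashable]],
    decide_by_preference: Callable[[List[TaskKey]], Dict[TaskKey, Genes]],
    optimise_remaining: Callable[[Dict[TaskKey, Genes]], Dict[TaskKey, List[int]]],
) -> Dict[TaskKey, List[int]]:
    """Repeat both stages until every task has offloading decisions."""
    pending = {m: list(ts) for m, ts in tasks_per_smd.items()}
    decisions: Dict[TaskKey, List[int]] = {}
    while any(pending.values()):
        # Select (and remove) one task from each SMD that still has tasks.
        batch = [(m, ts.pop(0)) for m, ts in pending.items() if ts]
        partial = decide_by_preference(batch)          # stage 1
        decisions.update(optimise_remaining(partial))  # stage 2
    return decisions
```

Passing the stages as functions keeps the loop independent of how the preferences or the genetic algorithm are implemented.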

Making Offloading Decisions by Preferences
Algorithm 1
    ...
    Select a task from each SMD
    N_s ← number of selected tasks
    ...
    Make offloading decisions based on preferences (α)
    Make offloading decisions using the NSGA-III (N_s, G, I)
    Remove tasks N_s
    end while
As mentioned in Section 3, a subtask has two parameters: $d_{m,n,k}$, the data size (DS) of the subtask, and $c_{m,n,k}$, the number of required CPU cycles (NRCC) to complete the subtask. Offloading a subtask with a large DS onto the MEC server through a limited wireless channel leads to high latency and more energy consumption. In contrast, if a subtask is computation-intensive, then executing the subtask locally with limited computation capability is also not efficient. Taking the above considerations into account, we define two preferences for making immediate offloading decisions:
1. Preference 1: If a subtask is not computation-intensive but its DS is large, then it is executed locally.
2. Preference 2: If a subtask is computation-intensive and its DS is small, then it is offloaded onto the MEC server.
Note that the preferences are applicable to all subtasks of a task except for the first and last ones. According to the system model, the first and last subtasks are always executed on the SMD.
Thus, we make immediate offloading decisions based on preferences for some subtasks in a set. For the remaining subtasks that do not satisfy the preferences, the offloading decisions are made by using the modified NSGA-III algorithm in stage 2 (see Table 2). The proportion $\alpha$ of subtasks that receive immediate offloading decisions is determined as follows. We intuitively assume that less than 50% of tasks can be offloaded based on preferences ($\alpha < 0.5$). The default value of $\alpha$ is found through trial and error. The immediate offloading decisions based on preferences are made as follows.
1. The weighted values of the two parameters for each subtask are calculated by where $n \in \{1, \ldots, N\}$ is the index of a selected task from the SMD. The maximum values of the weighted DS $d^{w}_{m,\max}$ and the weighted NRCC $c^{w}_{m,\max}$ are determined.
2. Thresholds are used to classify the tasks, and they are determined by $\alpha$, $d^{w}_{m,\max}$, and $c^{w}_{m,\max}$ as where $0 < \alpha < 0.5$.
3. Identify the subtasks that can get immediate offloading decisions. According to the preferences, the subtasks with the largest DS and the lowest NRCC and the subtasks with the smallest DS and the highest NRCC can be found by the following rules:
4. Finally, immediate offloading decisions are made as follows: the subtasks that satisfy the condition in (31) are executed locally, and the subtasks that satisfy the condition in (32) are offloaded onto the MEC server. The subtasks that meet neither condition remain undecided, and their decisions are made in the next stage.
Fig. 5 illustrates how the subtasks get immediate offloading decisions based on preferences. The subtasks are shown in a two-dimensional space (blue points) by their weighted DS and NRCC values. The thresholds (red dotted lines) are drawn using the $\alpha$ value and the maximum values of the weighted DS and NRCC. The subtasks in the squares $S_1$ and $S_2$ are identified as those that receive offloading decisions in advance based on preferences. The subtasks in the square $S_1$ are executed locally, and the subtasks in the square $S_2$ are offloaded onto the MEC server. The process of finding the immediate offloading decisions based on preferences is given in Algorithm 2. The offloading decisions for the remaining subtasks are made by using the modified NSGA-III algorithm in the next stage.
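The first stage can be sketched in Python as below. The threshold form is an assumption made for illustration only: the lower threshold is taken as $\alpha$ times the maximum weighted value and the upper threshold as $(1 - \alpha)$ times the maximum, which produces two corner regions like $S_1$ and $S_2$. The weighted values are taken as given inputs, and the function name is hypothetical.

```python
from typing import List, Optional, Sequence


def preference_decisions(d_w: Sequence[float], c_w: Sequence[float],
                         alpha: float = 0.3) -> List[Optional[int]]:
    """Immediate decisions for the interior subtasks of one task.

    d_w, c_w -- weighted DS and weighted NRCC of each subtask
    Returns 0 (local), 1 (offload), or None (left for the second stage).
    """
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 0.5)")
    d_max, c_max = max(d_w), max(c_w)
    # Assumed thresholds: alpha * max (low) and (1 - alpha) * max (high).
    d_lo, d_hi = alpha * d_max, (1.0 - alpha) * d_max
    c_lo, c_hi = alpha * c_max, (1.0 - alpha) * c_max
    decisions: List[Optional[int]] = []
    for d, c in zip(d_w, c_w):
        if d >= d_hi and c <= c_lo:      # Preference 1: large DS, few cycles
            decisions.append(0)
        elif d <= d_lo and c >= c_hi:    # Preference 2: small DS, many cycles
            decisions.append(1)
        else:
            decisions.append(None)
    return decisions
```

A smaller `alpha` shrinks both corner regions, so fewer subtasks are decided in advance and more are left to the genetic algorithm.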

Find Optimal Offloading Decisions using the Modified NSGA-III Algorithm
Algorithm 2
    ...
    Calculate weighted values of the parameters by Eqs. (28) and (29)
    Calculate thresholds by Eq. (30)
    Identify the tasks which get the offloading decisions based on preferences by Eqs. (31) and (32)
    ...
    x_{m,n,k} = 0
    end if
    ...
    x_{m,n,k} = 1
    end if
    ...
    end while
A variant of the Non-dominated Sorting Genetic Algorithm, NSGA-II, has shown high performance among multiobjective optimization algorithms [14]. However, the crowding distance operation in the NSGA-II algorithm does not work well for many-objective problems [34]. Problem P1 in (27) seeks optimal solutions that minimize three objective functions under complex constraints. That is why we adopted and modified the third version of NSGA to solve P1. Algorithm 2 returns a set of offloading decisions that is incomplete: offloading decisions were defined based on preferences for the first and last subtasks and some subtasks between them. We need to find offloading decisions for the remaining subtasks to solve P1. Usually, the performance of the NSGA-III algorithm depends on the randomly generated population. A good population serves to find an optimal solution faster. We use that property of the NSGA-III algorithm in our hybrid method, i.e., the modified NSGA-III algorithm, which we explain step by step below.
1. Generate reference points: The reference directions are selected as the base model for the optimization. We use the well-known systematic approach of Das and Dennis [38] to determine the set of reference points in each generation, as described in [33]. This approach defines reference points uniformly distributed on the entire normalized hyperplane. There is only one reference point to which all the individuals will be associated. In the reference points array, each row represents a reference line and each column is a variable. The total number of reference points $W$ in a problem with $Y$ objectives is calculated as where $a$ refers to the number of divisions considered along each objective axis.

2. Generate the initial population at random: The diversity of the population should be maintained; otherwise, the algorithm converges prematurely. Conversely, the population size should not be kept very large, as this can slow down a genetic algorithm, while a smaller population is not enough for a good mating pool. The size of the population is given by $G$. We generate a random initial population considering the result of the previous stage. The initial population contains $M$ chromosomes. The number of chromosomes is equal to the number of tasks selected from the $M$ SMDs. Each chromosome represents a set of offloading decisions for a task and consists of 10 genes. Each gene takes a value in $\{0, 1\}$ that represents the offloading decision of a subtask. Note that the initial population is generated for the remaining subtasks, which do not satisfy the preferences in the previous stage. This means some genes on a chromosome are predefined.
3. Evaluate the solutions: We calculate the objective functions using Eqs. (24), (25), and (26). Constraints in (27) are defined as constraint violations (CVs) as follows: The chromosomes that satisfy the constraints in (27) are feasible solutions. In the next step, the best chromosomes are picked out based on the values of the objective functions and the CV.

4. Sorting and selection: The selection phase selects the fittest chromosomes and lets them pass their genes to the next generation. First, the values of the objective functions are normalized (5). Then, a fast nondominated sorting is performed according to Definition 1 (6).
Definition 1: A solution $u_1$ dominates another solution $u_2$ if any one of the following conditions is true:
1. $u_1$ is a feasible solution, and $u_2$ is an infeasible solution.
2. $u_1$ and $u_2$ are feasible and $f(u_1) \preceq f(u_2)$.
3. $u_1$ and $u_2$ are infeasible and $CV(u_1) < CV(u_2)$.
Furthermore, solutions are associated with reference points (7), and the fast nondominated sorting is performed on the updated population (8). Two pairs of chromosomes (parents) are selected randomly according to their fitness scores and CV to produce new offspring using different recombination and mutation operators.
10. Crossover: Crossover is the most significant phase in a genetic algorithm. It creates offspring by combining pairs of parents in the current population during evolution. Two chromosomes are randomly selected from the population. The crossover operation is performed for a certain percentage (pC) of the population as follows: where $u'_1$ and $u'_2$ are the two offspring, and $\vartheta \in \{0, 1\}$ is a random integer variable.
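One common operator that matches this description is uniform crossover, in which a random bit decides, gene by gene, which parent each offspring inherits from. The sketch below assumes that form and assumes that $\vartheta$ is drawn independently for every gene; the exact operator used in the paper may differ, and the function name is hypothetical.

```python
import random


def uniform_crossover(u1, u2, rng):
    """Return two offspring of binary parents u1 and u2.

    Assumed form (not confirmed by the text): for each gene,
    theta is drawn from {0, 1}; offspring 1 takes the gene of u1
    when theta == 1 and of u2 otherwise, offspring 2 the opposite.
    """
    child1, child2 = [], []
    for g1, g2 in zip(u1, u2):
        theta = rng.randint(0, 1)
        child1.append(theta * g1 + (1 - theta) * g2)
        child2.append((1 - theta) * g1 + theta * g2)
    return child1, child2
```

With this form, every gene of both parents survives in exactly one of the two offspring, so genes that both parents share (for example, genes predefined by the preference stage) are never changed by crossover.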
11. Mutation: Certain new offspring are formed by the crossover operation. Then, the algorithm creates mutations by randomly changing the genes of individual parents. The mutation operation is applied with a mutation probability of pM. A high probability of mutation increases the diversity in the population and prevents premature convergence. The values of the crossover and mutation probabilities are given as simulation parameters in the next section.
where $u'_i$ is the offspring of $u_i$, and $i \in [1, 10]$ indexes the genes of the chromosome. In the next step, the parent and offspring populations are combined (12). The combined population is evaluated (13), then sorted and selected (14). According to the sorting result, a new parent population is created (15). This process continues until the stop condition is met (9). The stopping criterion is a given number of iterations. After several iterations, the algorithm returns a Pareto optimal set. All steps of the working process are given in Algorithm 3.
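For binary genes, mutation is usually implemented as a bit flip applied to each gene with probability pM. The sketch below assumes that form; the `fixed` argument, which protects genes predefined by the preference stage, is a hypothetical addition for this example and is not taken from the paper.

```python
import random


def bit_flip_mutation(u, p_m, rng, fixed=None):
    """Flip each gene of chromosome u with probability p_m.

    `fixed` is an optional set of gene indices that must not change
    (an assumption made for illustration).
    """
    fixed = fixed or set()
    return [
        1 - gene if i not in fixed and rng.random() < p_m else gene
        for i, gene in enumerate(u)
    ]
```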

Experiments
In this section, we present several experiments to evaluate the performance of the proposed hybrid method. The evaluation compares this method with existing methods based on Multi-Objective Particle Swarm Optimization (MOPSO), NSGA-II, and random offloading. Experiments were performed on a PC with an Intel Core i5 CPU (3.2 GHz) and 16 GB of RAM, and MATLAB 2021a was used for simulation.

Simulation Environment
Based on the system model, we created a simulation environment that consists of the models of a MEC server, SMDs, and wireless channels. We conducted the experiments in an environment with 5-10 small cells. The number of SMDs in each small cell was in the range of . The number of tasks generated by an SMD was in the range of . According to the statistics by Google, the arrival time of computation tasks follows an exponential distribution [39]. The data size (DS) of a subtask and the number of CPU cycles required to complete it (NRCC) were generated from exponential distributions with average values of 200 kbit and 5 MHz, respectively. Initial values of the size of the population (G) and the maximum number of iterations (P) were given, and they were adjusted during the simulation according to the number of computation tasks. The values of all variables used in the simulation are listed in Table 3.

Algorithm 3 Finding optimal offloading decisions using the NSGA-III algorithm
Require: N, A, $G_i$, $I_i$, pC, pM
Ensure: Offloading decisions
12: Combine parent and offspring populations
13: Calculate objective functions and CV
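The workload generation described in this section can be sketched with Python's standard library. The sketch is illustrative only: it interprets the exponential arrival process as exponentially distributed inter-arrival times, and the `arrival_rate` parameter and the dictionary keys are hypothetical names that do not come from the paper.

```python
import random

MEAN_DATA_SIZE_KBIT = 200.0  # average data size of a subtask
MEAN_CPU_DEMAND = 5.0        # average NRCC, in the unit given in the text
SUBTASKS_PER_TASK = 10


def generate_task(rng, arrival_rate):
    """Draw one task: an inter-arrival time plus 10 random subtasks.

    random.expovariate(lambd) has mean 1 / lambd, so the rate passed
    for each quantity is the reciprocal of its intended average.
    """
    inter_arrival = rng.expovariate(arrival_rate)
    subtasks = [
        {
            "data_size_kbit": rng.expovariate(1.0 / MEAN_DATA_SIZE_KBIT),
            "cpu_demand": rng.expovariate(1.0 / MEAN_CPU_DEMAND),
        }
        for _ in range(SUBTASKS_PER_TASK)
    ]
    return inter_arrival, subtasks
```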

Comparisons on Pareto Optimal Solutions
Multiobjective computation offloading methods return several optimal solutions, called the Pareto optimal set (Fig. 6). When conditions change, the user can select the most suitable solution for the real situation without rerunning the optimization algorithm. Fig. 6(a) shows the results of all four methods together: the three optimization-based methods and the random offloading method. The random offloading method produced only one solution, shown as a triangle in the figures. The solutions of the other three methods are shown with different symbols, explained in the legend. Compared with the other three methods, the random offloading method gives the worst results in all three objectives: latency, task failure, and energy consumption. Among the other three methods, our proposed method produced the best results. Fig. 6(b) compares the proposed method with the random offloading method. Fig. 6(c) compares the proposed method with the NSGA-II-based optimization method. Although both methods produced good results, the NSGA-II-based method also produced solutions with a high probability of task failure. Similar results can be observed in Fig. 6(d), where the MOPSO-based optimization method produced solutions with a high probability of task failure. These comparisons demonstrate that our method is able to reduce task failure.
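Choosing one solution from a Pareto optimal set without rerunning the optimizer can be illustrated with a simple weighted selection. The paper does not specify how the user makes this choice, so the min-max normalization, the weights, and the function name below are all assumptions made for this sketch.

```python
def select_solution(pareto_set, weights):
    """Pick the solution with the lowest weighted, normalized score.

    Each solution is a tuple of objective values to be minimized,
    for example (latency, task failure, energy). Objectives are
    min-max normalized over the set so that the weights are comparable.
    """
    n_obj = len(weights)
    lows = [min(p[k] for p in pareto_set) for k in range(n_obj)]
    highs = [max(p[k] for p in pareto_set) for k in range(n_obj)]

    def score(p):
        total = 0.0
        for k in range(n_obj):
            span = highs[k] - lows[k]
            normalized = (p[k] - lows[k]) / span if span > 0 else 0.0
            total += weights[k] * normalized
        return total

    return min(pareto_set, key=score)
```

Changing only the weights, for instance to favor energy when a device's battery is low, returns a different member of the same set, which is the practical benefit of keeping the whole Pareto optimal set.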

Comparison on Resource Utilization
Resources such as the computation units of SMDs and MEC servers and the bandwidth of wireless channels are limited. Overloading limited resources with task workload leads to long latency or even failure of the task. We tested the proposed and existing methods in cases where the task workload is overloaded. For this purpose, we conducted experiments with varied workloads by increasing the number of tasks and SMDs. In each experiment, the value of the resource utilization coefficient was computed. The resource utilization coefficients of an SMD, the wireless channel, and the MEC server were calculated using $\rho_m = \gamma_m \lambda_m b_m$, $\rho_R = \delta_m \lambda_m b_{tr}$, and $r = \sum_{m=1}^{M} \delta_m \lambda_m b_E$, respectively. The experiments were performed 100 times, and the average and standard deviation of the utilization coefficient were calculated.
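The three utilization coefficients and their summary statistics over repeated runs can be computed as in the sketch below. The function and argument names are hypothetical; they simply mirror the symbols in the formulas above, whose definitions are given in the system model. The use of the sample standard deviation is also an assumption.

```python
import statistics


def smd_utilization(gamma_m, lambda_m, b_m):
    """rho_m = gamma_m * lambda_m * b_m for one SMD."""
    return gamma_m * lambda_m * b_m


def channel_utilization(delta_m, lambda_m, b_tr):
    """rho_R = delta_m * lambda_m * b_tr for the wireless channel."""
    return delta_m * lambda_m * b_tr


def server_utilization(deltas, lambdas, b_e):
    """Sum of delta_m * lambda_m * b_E over all M SMDs (MEC server)."""
    return sum(d * l * b_e for d, l in zip(deltas, lambdas))


def summarize(values):
    """Average and sample standard deviation over repeated experiments."""
    return statistics.mean(values), statistics.stdev(values)
```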
We evaluated the proposed method and the other three methods with the number of computation tasks ranging from 50 to 300, where each task consists of 10 subtasks. The best results of the experiments are highlighted in bold in Table 4. The results show that the number of tasks does not affect resource utilization significantly. This is because the proposed computation offloading framework selects only one task from each SMD to find the optimal decision each time. The next task is selected after the decision on the previous one is made. The highest resource utilization appeared on the wireless channel. Since the resource utilization stays below 85-90%, the probability of losing tasks is low. In the table, we can see that most of the best results were produced by our proposed method.
In the experiments, we also investigated the impact of an increasing number of SMDs on resource utilization by gradually connecting more SMDs to the SeNBs. The nominal bandwidth of the wireless channel was divided equally among the connected SMDs. In addition, the increase in SMDs created more computation workload for the MEC server. When the random computation method was used to make the offloading decisions, the local server resources were less utilized, but the wireless channel was overloaded, resulting in a high probability of task failure. However, the multiobjective optimization methods can better balance the resource utilization by finding optimal solutions for the offloaded tasks. Table 5 shows the experiment results of the four methods on resource utilization. We can see that the proposed method produced good overall results. The increase in the number of SMDs does not affect the utilization of local computing resources. It affects only the utilization of the wireless channels and the MEC server. These results demonstrate that increasing the number of tasks does not affect the utilization of resources significantly when the framework selects one task at a time from each SMD.

Impact of Varied Numbers of SMDs on Performance Metrics
From the previous section, we see that an increase in the number of SMDs significantly affects resource utilization. To understand how the number of SMDs affects the performance of the optimization algorithms, we conducted experiments to find the relation between the elapsed time of the optimization algorithms and the number of SMDs in the SeNBs. The elapsed time was obtained from the internal function of MATLAB. The number of SMDs used in the experiments ranged from 20 to 45. For each given number of SMDs, we used the same input datasets for all methods. However, the datasets were randomly generated on the fly for the experiments. Therefore, the values of the objective functions depend on the generated datasets. The bandwidth and computing capacity of the MEC servers were fixed during all experiments. Table 6 shows the experiment results. The experiment in each setting was executed one hundred times. The average values and their standard deviations are given in the columns. We can see in the bottom row that our proposed method is 58% and 21% faster than NSGA-II and MOPSO, respectively. In addition, it shows better results in optimizing the three objectives. For instance, the latency of our method is reduced by 72% (Random), 18% (NSGA-II), and 30% (MOPSO) compared with the other methods. The probability of task failure is reduced by 84% (Random), 35% (NSGA-II), and 74% (MOPSO). Regarding energy consumption, the proposed method performed better than the random offloading method but worse than the NSGA-II and MOPSO algorithms.
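The measurement procedure described above (repeated runs, average and standard deviation of the elapsed time, then a relative comparison) can be sketched in Python. The paper used MATLAB's internal timing function; `time.perf_counter` is a stand-in here. The percentage formula is an assumption about how the reported reductions were derived, and the function names are hypothetical.

```python
import statistics
import time


def time_runs(fn, repeats):
    """Run fn `repeats` times; return (mean, sample std) of elapsed seconds.

    `repeats` must be at least 2 for the sample standard deviation.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def percent_reduction(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline
```

For example, `percent_reduction(200.0, 150.0)` evaluates to 25.0; the inputs are arbitrary illustrative numbers, not values from Table 6.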

Conclusion
The rapidly increasing number of IoT devices creates new opportunities for users to access more services, but it also causes the problem of processing massive amounts of data. Partially offloading computation tasks from IoT devices onto nearby MEC edge nodes partially solves the problem. Designing a fast and effective computation offloading framework for resource-limited systems remains challenging. A combination of preferences based on logical rules and an optimization tool can help to solve the problem. This paper proposes a hybrid method for fast optimal offloading of dependent computation tasks in a resource-limited IoT-MEC environment. The proposed method achieved better performance than existing methods. In particular, the time to search for optimal offloading decisions is reduced. In future work, the method will be tested in real-world experiments, and a decentralized method of computation offloading will be investigated.

Declarations

Ethical Approval and Consent to Participate
Yes.

Human and Animal Ethics
Not applicable.

Consent for publication
Yes.

Fig. 1 Illustration of the model of a MEC system.

Fig. 4 A general description of the proposed solution.

Fig. 5 Classification of subtasks based on preferences.

Table 1 Comparison of the proposed method to existing computation offloading methods

Table 2 Decisions made using different methods on different conditions

Table 3 Simulation parameters