A heuristic-based package-aware function scheduling approach for creating a trade-off between cold start time and cost in FaaS computing environments

With the migration of enterprise applications to microservices and containers, cloud service providers, starting with Amazon in 2014, announced a new computational model called function-as-a-service. On these platforms, instead of developing coarse-grained software, developers create sets of fine-grained functions with shorter execution times, and the management of system resources and servers is delegated to the cloud service provider. This model has many benefits, such as reducing costs, improving resource utilization, and letting developers focus on the core logic of their applications. However, it still faces many challenges, including balancing cost and usage, optimizing performance, programming models, integration with current development tools, the containers' cold start problem, caching, security and privacy concerns, and scheduling challenges such as execution time prediction. In this paper, we focus on the scheduling and cold start problems. Reducing the cold start time yields better response times and hence a better experience for end-users. A trade-off arises because keeping execution environments warm reduces cold start times but increases resource usage costs. The cold start problem is widely studied; in this work, we propose a novel dynamic waiting-time adjustment approach in which the waiting time of a container changes dynamically according to the situation. Four different types of decisions for changing the waiting time are introduced and investigated. We aim to create a trade-off by applying these decisions at runtime using a heuristic method. The functions' invocation frequency, the function dependency graph, the maximum merging time, correct mergers, cost, and the timeline are considered in the decision-making process. The performed evaluations demonstrate that the proposed approach results in a 32% improvement over the fixed-time method (i.e., the method used by Apache OpenWhisk). This comparison is made from the viewpoint of a cumulative measure that combines response time, turnaround time, cost, and utilization.


Introduction
Cloud computing is one of the most significant and challenging pillars of modern computational domains. This paradigm is shifting computation from locally managed physical hardware and software systems to virtual services hosted in the cloud [1]. Serverless computing seeks to achieve the two primary goals of reducing cost and increasing elasticity in cloud computing systems [2]. Serverless computing, or more specifically function-as-a-service (FaaS) computing, is emerging as a new cloud computing paradigm after microservices and containers became popular in enterprise software systems [3]. In serverless services, the application layer is managed by the cloud provider. Thus, the focus is on software development, and much less time is spent on software configuration and management than in other cloud models. From a cloud provider's point of view, this model provides a significant opportunity to control the software development stack and reduce operating costs through efficient optimizations and resource management. It also provides a platform to promote more services in the provider's ecosystem and reduces the effort needed to manage cloud-scale software [3].
Serverless computing has become an industry term that refers to a specific programming and architecture model in which small pieces of code are executed in a cloud environment without any control over the resources or network nodes. The difference from PaaS is that the executed code is much finer-grained and its management is shifted from the developer/client to the cloud provider. These changes bring various issues and challenges that may not exist in other models, and even common issues manifest themselves differently in this computational model. Some of these challenges are discussed below: (1) Business model: Because the computational model promises to be more cost-effective, the varying effective execution times of the functions and limitations such as the bounded execution time create challenges in the business model and in calculating the services' final cost. (2) Execution isolation: On the one hand, functions are stateless and must be executed in isolated environments. On the other hand, improper isolation may increase the cold start time.
(3) Orchestration and load balancing: Considering that the invocation rate of functions can vary over time, the system should be able to predict the invocation rate fluctuations and the number of containers required to respond to clients optimally at the appropriate times. (4) Scheduling: Function scheduling in this computational model has a significant impact on the final cost, response times, and the scalability of functions.
As mentioned earlier, the scheduling of function execution is very important in the FaaS computational model. Proper scheduling of the available resources, such as execution servers, containers, virtual machines, and the libraries preloaded on these machines, together with the optimal assignment of user functions to their destinations, affects the final cost, quality of service, service level agreement, and scalability of the system. This is especially important when the cloud service provider assumes that it has no knowledge of the content used in a function and does not make decisions based on its source code [4].
Considering the stated challenges, after examining and classifying the scheduling approaches in the FaaS computational model, we introduce a new approach for function scheduling that aims to create a balance between cost and cold start time. The measures considered in the proposed approach are response time, cold start time, computing power, and final cost. Chief among these is the cold start time: the delay incurred when a submitted function must be executed for the first time, or after its container has been recycled because the function remained idle (typically for 5 to 25 min on commercial platforms). This time includes the time it takes to receive the information, build the container, and get started. The ultimate goal is to reduce the cold start time and response time of the proposed method while maximizing utilization, respecting the user's input budget, and reducing costs. In this paper, a method is proposed that reduces the cold start time by evaluating the activation behavior of functions on different days and hours, while respecting the user's budget. The main contribution of this paper is an approach for designing a scheduler that (1) determines which functions should run on which containers based on their similarity with the host container, and (2) decides how long a container should wait before hosting a new function based on a heuristic combination of different parameters of the execution environment (e.g., the cooperation graph of the functions, function dependencies, and previous successful decisions).
The rest of the paper is organized as follows. In Sect. 2, the required background knowledge for a better understanding of the paper is given. In Sect. 3, a review and comparison of the related work of this research are provided. The problem statement is described in Sect. 4. In Sect. 5, the proposed approach is introduced. In Sect. 6, the results of the performed evaluations are given and discussed. Finally, the conclusion of the paper as well as the suggested future work are presented in Sect. 7.

Background knowledge
Since the serverless computing model and, more generally, cloud computing are still emerging in the software industry, new terms and notations specific to these computational paradigms have been introduced. Cloud computing and the novel services built around it, with the promise of optimal resource allocation, agility, economic efficiency, the ability to scale services up and down as traffic increases and decreases, and independence from geographical locations, have become suitable models for today's software and business services. FaaS, one of the business services that cloud providers now deliver in the form of on-demand systems, is one of these new models: it lets developers hand over management of the execution environment, reducing resource allocation efforts and leaving cost management and efficiency to the providers.

Cloud computing's service model
According to the American National Institute of Standards and Technology, "Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (such as networks, servers, storage space, software, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction [5]". Considering this definition, as shown in Fig. 1, cloud computing comes with three different service models: (1) software as a service, (2) platform as a service, and (3) infrastructure as a service. In more recent expansions, container-as-a-service (CaaS) is added above the IaaS layer and FaaS is added above the PaaS layer, making the service model a 5-layer model.
Fig. 1 Traditional cloud computing service model [42] (left) and the new expanded version (right) [43]
Cloud computing was developed with several objectives in mind compared to traditional computing models. The main characteristics of this computing paradigm are as follows.
(1) On-demand self-service: A consumer can unilaterally provision computing capabilities such as server resources and network storage when needed and without requiring human interaction with the service provider. (2) Broad network access: Computing capabilities are available over the network and can be accessed through standard mechanisms that promote use by small and large clients such as mobile phones, tablets, laptops, and workstations. (3) Dynamic resource pooling: The computing resources are pooled to serve multiple consumers through a multi-tenancy model in which physical and virtual resources are dynamically assigned. Usually, the consumer has no control over the location of these resources. (4) Rapid elasticity: Resource capabilities can be provisioned and released dynamically and on demand. In some cases, the scale is automatically increased or decreased to keep response times proportionate to demand. From the consumers' point of view, resources appear unlimited and can be requested at any time and in any amount. (5) Measured service: Cloud systems automatically control and optimize resource usage depending on the type of service. Resource utilization can be measured and controlled, and transparent status reports of the utilized resources can be provided to consumers and providers.

Containers and microservices
Containers are the most recent mechanism for achieving virtualization at the operating system level. This mechanism, which is lighter-weight than virtual machines, uses the host operating system's virtualization technologies and isolates different pieces of software from each other through container technologies such as Docker [8]. As shown in Fig. 2, the applications are placed on a container virtualization system, which itself sits on the host operating system layer. A microservice is an independently deployable component with a defined boundary that supports communication through message-based protocols [6]. As shown in Fig. 3, in a monolithic architecture, all the layers, including the user interface, business logic, and data, are in one place.
In a microservice architecture, by contrast, each part of the software can use different, independent services.

Serverless computing
The terms serverless computing and function-as-a-service are used interchangeably in this paper and in the cloud computing literature in general [7]. In serverless computing, one or more servers are always involved in generating the response; the term serverless does not mean that there are no servers in the computation model. It refers to the fact that users do not deal with configuration, setup, or server maintenance and can focus solely on delivering their services. In other words, all the issues and challenges regarding servers, their operation, and their management are completely transparent from the user's point of view.
As shown in Fig. 4, the serverless computational model is event-based: for a function to be executed, an event must occur that triggers its execution. This trigger activates a function that is itself attached to the backend infrastructure, and executing the function produces output that is returned to the user. Table 1 shows a list of common serverless computing platforms.

Advantages
Below, we discuss some of the advantages and strengths of this computational model: (1) No need to maintain the cloud infrastructure: As mentioned earlier, one of the features of the serverless model is that developers do not need to worry about developing and maintaining the cloud infrastructure and servers, and can focus solely on developing their cloud functions. (2) Cost: Because the business model is pay-as-you-go, users only pay when the service is used, and only for as long as the function runs. This reduces the cost of the serverless computing model. (3) Auto-scaling: This is one of the promises of cloud providers in general and of serverless computing models specifically. FaaS platforms should be able to automatically increase the number of resources in the event of increased traffic, growing function activation rates, or larger resource demands. (4) Loose coupling: Because functions are inherently stateless and can be developed independently, they are much more likely to be reused. Each function can easily be used separately without worrying about dependencies or effects on the execution environment.

Disadvantages
In the following, we discuss some of the disadvantages of this model: (1) Inability to control the infrastructure as needed: One of the reasons for using the serverless model is to avoid the hassles of managing and controlling the infrastructure, but this also means that the infrastructure cannot be tuned or controlled when needed. (2) Limited local testing: While open-source platforms such as Apache OpenWhisk [8] allow functions to be executed in local environments, it is not possible to run functions locally for testing on proprietary cloud platforms such as Amazon Lambda [9] and Azure Functions [10]. (3) Vendor lock-in: The use of these services brings dependency on the provider's ecosystem and the risk of vendor lock-in [3]. (4) Persistence: Serverless functions usually run in stateless containers hosted by the serverless platform, without saving ephemeral state information to persistent storage. Being stateless implies that techniques depending on transferring state information cannot be used efficiently in serverless applications [11].

The cold start problem
FaaS platforms typically rely on container technology, and serverless functions are served by containers. When a request comes in and no idle container can be found for the execution of the requested function, a new container needs to be provisioned [12]. Thus, some initialization is needed before the function execution starts, which is the reason for start-up latency. Start-up latency refers to the period from the function invocation to its execution. If this latency can be reduced, the response times of requests improve significantly and users generally have a better experience with the system; this is the main aim of the current paper. The first step of initialization is to prepare a lightweight isolated execution environment for the function. The isolated environment may be initialized from either a shut-down state or an already-running one; this is the main difference between the cold start and warm start concepts [11]. In serverless computing, the execution times of functions are short, so in many cases the additional cold start latency effectively doubles the latency of function execution from the client's perspective [12]. As cold starts can occur numerous times, especially for growing workloads, they affect the total response latency. As stated in [13], the cold start latency can account for as much as 80% of the total response latency. Minimizing the cold start time is therefore an important problem, known as the cold start problem.

Related work
The introduction of the FaaS model has led to several challenges, ideas, and open issues in this field, making it one of the most interesting topics for cloud computing researchers in the past several years. Given the notable advantages of serverless computing, many cloud providers have proposed their respective frameworks to support a wide range of services such as video processing, machine learning, computer graphics, data analytics, and distributed computing [11].
The statelessness of serverless systems makes them suitable candidates for a wide range of applications. For example, Yan et al. [14] developed a chatbot using the OpenWhisk FaaS service. Real-time data processing has also been performed using a serverless processing model [15,16,17]. In the field of security, Bila et al. [18] used this model to develop a mechanism for securing Linux containers. Although effectively exploiting its advantages still involves complexities, these advantages make the computational model attractive to researchers. To optimize the performance criteria of serverless services (such as response time), various studies have been performed, each improving these criteria using different approaches. In this section, the studies are categorized according to their goals.

Optimization for the use of prerequisite packages
One way to reduce response time by reducing cold start delays is to optimize the package preparation time. After being placed in a virtual machine or container, functions need to load their execution environment and runtime prerequisites.
Harter et al. [19], observing the slowness of Docker containers, made changes to the Linux kernel to share a cache and proposed a model for these containers that identifies and loads the packages needed to run the software; other packages are loaded lazily. With this model, they sped up the provisioning cycle by 5 times and the development cycle by up to 20 times. They argue that the main reason for slow containers is the isolated file system, because isolating other resources such as memory and network is relatively simpler and faster. Their model expects the container image to already be present in the local environment, whereas in a computing environment made of large numbers of nodes located far from each other with limited storage, few images can be stored locally for future use [20]. Oakes et al. [21] proposed SOCK, a package-aware caching system that locally stores interpreters and commonly used libraries in containers to reduce the cold start latency. Cadden et al. [22] presented SEUSS, which deploys functions based on snapshots of an isolated unikernel consisting of the function logic, language interpreter, and library OS, bypassing the high overhead of initialization. Their results show that the deployment time of a function drops from hundreds of milliseconds to under 10 ms. Akkus et al. [23] presented the SAND computing system, which takes advantage of application-level sandboxing and a hierarchical message bus. With these mechanisms, the isolation between functions from the same application is weakened, allowing instances in the same sandbox to share functions and libraries. The results show that SAND can speed up processing time considerably compared to Apache OpenWhisk. All of these studies take advantage of caching techniques, but there are two reasons why caching is far from ideal. First, a single machine is capable of running thousands of serverless functions, so caching all the functions in memory introduces high resource overhead; caching policies are also hard to determine in the real world. Second, caching does not help with tail latency, which is dominated by cold boots in most cases [24]. Another group of researchers has focused on the application level. The main idea of Bermbach et al. [12] is that if a process encounters a cold start in its first step, it is very likely to encounter cold starts for the other steps as well. A hint manager therefore mitigates cascading cold starts by pre-starting workers using hints from previous invocations, but their approach is limited to linear chains. Liu et al. [25] presented FaaSLight, an application-level code analysis approach that efficiently reduces application code loading. It uses a function-level call graph to generate optimized FaaS applications by separating the optional code from the indispensable code segments. The results show that loading only the indispensable code improves cold start performance; their goal is to minimize the cold start latency. Lee et al. mitigated the cold start latency of a workflow using function fusion [26]. The function fusion idea comes from the fact that if two functions are fused into a single function, the cold start of the second function is removed. Besides its advantages, it has the downside that the fused workflow is sequential in nature and cannot be parallelized. Sethi et al.
proposed the Least Recently Used warm Container Selection (LCS) scheduling approach to reduce cold start occurrences by keeping containers alive for a longer period [27]. They compared the approach with MRU container selection, which resulted in about a 48% reduction in cold start occurrences. It would be insightful if more policies were considered; comparing only these two policies leaves it ambiguous whether applying either policy is better than doing nothing at all. Li et al. [28] proposed Pagurus, which re-purposes a warm but idle container from another function by identifying and managing idle containers. Their experiments show that Pagurus alleviates 84.6% of cold start-ups. However, re-purposing a container also needs initialization, which may itself impose extra latency.

Game theoretical-based methods
One of the simplest methods for function scheduling is to use load-balancing techniques when an event is generated to reduce the response time in distributed systems. One of the approaches that various articles have used to solve this type of problem is by using game theoretical-based techniques. Such approaches try to tackle this issue by modeling the scheduling problem as a game along with the competition of the involved players.
Grosu and Chronopoulos [29] presented a game theoretical-based algorithm to optimize load balancing in heterogeneous distributed environments. They used a noncooperative method to solve the load balancing problem, assuming that knowledge or forecasting statistics of the system or the program are available. In this method, there are several decision-makers, and each decision-maker improves her/his response time independently until they all eventually reach a point of equilibrium. In the method proposed in that work, each user begins to collect information about the duration of services and the resource allocation on each host, so that, in response to the resource allocations of the other game players, she/he can calculate the optimal segmentation for the incoming traffic under her/his observation. This type of collaboration, together with the idea proposed by Devan et al. [30] on multi-objective workflow scheduling of bags-of-tasks, eventually led to the development of a game theoretical-based approach to functional systems.

Scientific workflow-based methods
One of the uses of FaaS systems is in scientific workflow execution. Workflow systems are used to combine and execute a set of computation steps within a scientific system [31]. The structure of a scientific workflow can be thought of as a DAG G = ⟨T, E⟩, where T = {t_1, t_2, …, t_n} is a set of tasks and E represents the edges that indicate dependencies. Dependencies determine which tasks in the workflow should start first and which ones should wait until their prerequisites are finished [32]. Malawski et al. [33] examined various cloud service providers to assess their potential for scientific workflows and developed a prototype system running this type of workflow on the providers; Fig. 5 shows the possible options for the execution of scientific workflows in serverless systems [33]. Khochare et al. proposed a design for a serverless scientific-workflow orchestrator that overcomes serverless challenges, including the cold start problem [34]. They propose a pilot invocation, a warm-up NoOp execution of a downstream function in the corresponding directed acyclic graph that activates at least one container for that function without executing its business logic. Unfortunately, this approach only proposes an abstract framework without an actual implementation. Roy et al. [35] proposed DayDream for HPC DAGs, which employs a hot start mechanism that decouples the runtime environment from the component function code to mitigate the cold start overhead. Under the hot start mechanism, DayDream only loads the operating system and language runtime into the memory of a serverless function instance, and phase concurrency prediction is used to determine the number of hot instances. Their goal is to minimize the cold start overhead.

Parallel methods
The FaaS computational paradigm is based on the reactive programming model. This elastic model makes it an ideal candidate for creating software that is inherently parallel, and the need to run parallel functions led to research into orchestration and scheduling systems for parallel operations. In their article, Suresh and Gandhi [36] categorized serverless workloads into two categories: (1) edge-triggered and (2) massively parallel programs. The first group includes software that is generally short-lived or triggered by ephemeral events such as HTTP requests, small IoT tasks, and so on. In the second group, the intensity of resource usage is much higher and the need for parallelism is significantly greater. In addition to this categorization, they designed a function-level scheduler that not only minimizes resource usage from the resource provider's point of view but also meets customer demands from a functional efficiency viewpoint. Reviewing the existing research in the defined categories, it can be concluded that preloading prerequisites is effective in speeding up the process. It was also explained that scientific workloads differ from general workloads and usually contain dependencies, and that functions can be divided into two categories: (1) massively parallel and (2) edge-triggered. Considering these features, we present a new scheduling method that creates a compromise between cold start time and cost. The scheduler is designed to use prerequisite packages in scientific and edge-triggered workflows. The reviewed related work is summarized in Table 2.

Problem statement
The cold start-up latency is more than ten times the warm start-up latency due to container creation, software environment setup, and code initialization. Ideally, all functions would run in warm containers [28], but keeping all containers warm wastes resources. So, the solution is a compromise between the cold start problem and resource utilization. Reviewing the related work shows, first of all, that the concept of cold start in serverless and FaaS environments has received much less attention than in other service-oriented paradigms such as software-as-a-service or microservice architectures in general. In this study, our goal is to reduce the number of cold starts. As the concept of a dynamic waiting time for containers has not been investigated yet, the main idea is to change the waiting time of a container according to the situation and context. In other words, the goal of this study is to find the proper time that a container needs to remain hot after a function execution has finished. Therefore, the focus of this study is to reduce the number of cold starts by dynamically adjusting the waiting times. To find the proper waiting time, a set of hyper-parameters is considered: the frequency of function invocation, the collaboration of functions, the maximum merging time, the score for a correct merge, the budget, and the timeline.

Table 2 A high-level summary of the reviewed related work

Optimization for the use of prerequisite packages:
- Harter et al. [19]: having a shared cache
- Oakes et al. [21]: package-aware caching
- Cadden et al. [22]: creating snapshots
- Akkus et al. [23]: using a sandbox and message bus
- Bermbach et al. [12]: performing pre-start-up calculations for workers
- Liu et al. [25]: performing application-level optimizations
- Lee et al. [26]: performing function fusion
- Sethi et al. [27]: considering LRU container selection
- Li et al. [28]: re-purposing the containers

Game theoretical-based methods:
- Grosu and Chronopoulos [29]: optimizing the load balancing mechanism

Scientific workflow-based methods:
- Malawski et al. [33]: a decentralized architecture for executing scientific workflows with pre-warmups
- Khochare et al. [34]: performing pilot invocation
- Roy et al. [35]: decoupling the runtime environment from the code

Parallelism-based methods:
- Suresh and Gandhi [36]: considering resource consumption and functions' lifetime

The proposed approach
In the proposed approach, the goal is to create a heuristic function that can improve various parameters that play a major role in FaaS computing systems. The cold start problem is one of the factors that significantly slows the initial start-up of cloud functions, and one of our aims is to reduce the cold start time of functions. One way to do so is to use virtual machines and containers that are currently in a hot state, meaning that a function has been executed on them before and the libraries and function prerequisites are therefore already loaded. These containers and virtual machines are especially valuable for functions that suffer from the dependency hell problem, because by using techniques such as grabbing the container process, copying it, or reusing it, the cold start problem can be significantly mitigated. Considering how cloud functions are activated and placed in the desired containers, we want to decide between executing cloud functions by creating a new container or running them on a hot container that already exists. In the simplest case, one container is created per function. In Fig. 6, the horizontal line denotes time, and each box shows the starting and ending times of a function's execution. Looking more closely at the execution of functions on the timeline, a significant part of a function's execution time is spent on starting and launching the execution environment, as well as on finalization and on transferring execution information to the cloud platform for monitoring the service. Figure 7, in addition to the execution step, depicts the steps related to preparing the container and the execution environment, as well as the finalization step. According to this model, a significant part of the system's runtime is allocated to the preparation and finalization of the execution environment instead of running the function. In the proposed approach, the aim is to introduce a heuristic-based integrated mechanism that can use a hot container to run multiple functions.
In Fig. 8, without merging, function #1 is executed on the desired container, and the execution of the second function begins shortly after the container finalizes the first function. In the case of merging containers, the first container completes part of the finalization phase after execution and waits for a variable amount of time to be merged with a function that may need to be executed in the future. Hence, the second function does not need to prepare the environment, or this preparation time is much shorter than in the non-merged case. The waiting time depends on the heuristic parameters. In Fig. 8, the black segment shown on the timeline is the time during which the scheduler keeps the container waiting for a possible merger.
This section discusses the reuse of the execution environments (i.e., containers and VMs) for multiple functions. It describes the general approach for assigning functions to these environments based on their similarities (i.e., using package dependencies). The chosen practical approach for implementing this functionality is not in the scope of the current research. Of course, there are multiple ways to implement this such as using a well-defined cache mechanism for the container and restoring those packages from the local storage of each container. Function specifications/source codes are stored by the user in the system. Every time an invocation event occurs, the scheduler (which itself is an isolated service in the cluster) has the responsibility to assign that function to an agent (e.g., mount the source code), which itself is a container and lives in an agent pool.

The proposed method's attributes
To properly merge functions, we need to keep their execution environments active. Thus, we need to determine how long an environment should remain hot after its function finishes (i.e., Fig. 9). This time is denoted as r.
We use different parameters to find the optimal value for r. Considering these parameters raises several questions that need to be answered; some of the more important ones are given below (a sketch grouping these inputs follows the list): (1) Score from the previous executions: Do the function activation times follow a specific pattern? (2) Collaboration with other functions: Which functions are usually activated when this function executes? (3) Maximum merging time: What is the maximum time an execution environment can wait for a merger? (4) The score of correct mergers: How many of the mergers that the execution environment expected actually occurred in past executions? (5) Budget: What is the user's budget for keeping the execution environment active for merging functions? (6) Timeline: For how long is the information from previous executions valid?
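To make these six inputs concrete, the following minimal sketch groups them into a single structure; all field names are illustrative assumptions rather than the paper's notation.

```python
from dataclasses import dataclass

@dataclass
class MergeHeuristicInputs:
    # Illustrative field names; the paper denotes the resulting waiting time as r.
    previous_execution_score: float  # (1) score from past activation patterns
    cooperation_score: float         # (2) likelihood of follow-up calls from the function DAG
    max_merging_time: float          # (3) r_max, the upper bound on waiting, in seconds
    correct_merge_score: float       # (4) share of expected mergers that actually happened
    budget: float                    # (5) user budget for keeping environments warm, in dollars
    time_window: float               # (6) horizon in seconds beyond which history is discarded
```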

Score from the previous executions
By monitoring the execution of functions at different times of day, it can be predicted on which days of the week and at which times of day the desired function will be activated. This prediction helps us determine the final merger time according to the current time and the period in which the merger takes place. An optimal prediction can help optimize the runtime-related services that execute in the serverless computing environment. For example, consider a video hosting site that offers online streaming services. The broadcast of a popular TV series, which airs from 1 to 3 p.m. on Mondays and Thursdays, can cause the traffic of this website to peak and increase the activation frequency of some of its sub-services. By taking this parameter into account, the proposed heuristic algorithm can merge more functions during peak hours. Different companies publish their users' data traffic at different intervals of the week, which can be used as real data in the evaluations. Figure 10 shows an example of the data collected from the activation frequency of a function at different time intervals during a week.
Fig. 9 Keeping the execution environment active for time r (merging time)

Cooperation score with other functions
Cloud functions, especially in services with complex business logic, can activate other cloud functions and take advantage of the added value. These activations form a directed acyclic graph (DAG). For example, in Fig. 11, the execution of one function brings about the execution of several others. A function may call other functions only in certain situations (dotted lines), or its execution may necessarily imply the execution of another function later in the service. The execution of a function at the root of the graph thus implies the execution of other functions.
One of the parameters for determining the merging time is to monitor the functions that are currently running and to assign a collaboration score to the currently running function in the hot environment according to these functions. This score indicates how likely it is that a function will be activated shortly, given the set of running functions.

Maximum merging time
The algorithm should be able to limit the waiting time for merging with the next function if required. This eliminates the need for additional costs in the case of a series of failures. As can be seen in Fig. 12, the maximum merging time, denoted by r_max, is smaller than the predicted time (r), and this makes the effective time the same as the maximum merging time (r_max).

The score of the correct mergers
Scoring the correct mergers can make it easier to predict when to merge in the future. If the heuristic algorithm succeeds, it can repeat its strategy in the future, and in case of consecutive failures, it must reconsider its merge timing prediction. Figure 13 shows an example of a successful merger and an expectation of a failed merger.

Budget consideration
The cost parameter is one of the key parameters in services based on the serverless computational model. Therefore, this method should take into account the maximum cost increase caused by keeping containers active. For example, if the cost of running a function for an hour in Amazon Lambda's FaaS model is $0.04, and the user gives the algorithm a maximum one-month merging budget of $1, then the algorithm can only increase the total length of the black sections (i.e., the idle times in Fig. 8) to at most 25 h in that month.

Time window (time frame)
There is no need to store and consider all the information collected during the lifetime of the function hosted on the cloud platform. For example, for a video-sharing site, with the end of a popular TV series, the collected time information changes drastically. Hence, a specific time window is necessary when considering information from a recent period, as outdated information should not affect the forecasting process in the long run.

The mathematical model and algorithms
As mentioned earlier, the problem definition is consistent and aligned with an event-based system. In the beginning, the execution environment starts without any processing potential (i.e., without a container). The main system scheduler waits for a request to execute a function to enter the system. Different schedulers can make different decisions at this point: whether to provide a new environment for this new function or to use an old environment. In the proposed method, old containers that meet the required conditions for execution are also considered. There is some wasted time at first, which includes the following key times: (1) The container construction and start-up time (i.e., creation time).
(2) The time for downloading the prerequisites of the desired function (i.e., restoration time).
After spending these times, the function starts to execute. Finally, after the execution is finished, different policies can be considered according to the scheduling strategy: (1) Stop the execution environment without waiting for the new function.
(2) Wait for a fixed time for the next function to execute.
(3) Randomly wait for the next function to execute.
These options create the concept of a policy in the proposed approach. Policies are, in fact, guidelines that, given the state of the environment and its context, select one of the decisions mentioned above. As shown in Fig. 14, the container in the dynamic model begins to make decisions within a short time after completing its work. The probability of waiting further is initially 100%, but as time goes on, this probability changes according to the decisions taken. There is also a fixed cost for each waiting period, regardless of the decision the scheduler makes for the container, which reduces the likelihood of waiting at the next step. The scheduler can compensate for this decrease (denoted by the "+" symbol), reduce the probability further (the "−" symbol), continue with the same probability (the "N" symbol), or end its work (the "T" symbol).
Fig. 14 An example of the proposed dynamic container scheduling. The "+" symbol represents a waiting time increase, the "−" symbol a waiting time reduction, the "N" symbol an unchanged waiting time, and the "T" symbol termination
So that a container does not stay active indefinitely, the maximum waiting probability is also slightly reduced as time goes on.
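The decision loop of Fig. 14 can be sketched as below. This is a minimal illustration, assuming concrete values for the fixed per-step cost and the decay of the maximum waiting probability, which the paper does not specify.

```python
import random
from enum import Enum

class Action(Enum):
    WAIT_MORE = "+"   # compensate the fixed decrease
    WAIT_LESS = "-"   # reduce the waiting probability further
    NEUTRAL = "N"     # keep the probability unchanged
    TERMINATE = "T"   # shut the container down

STEP_COST = 0.05   # assumed fixed probability cost per decision step
CAP_DECAY = 0.02   # assumed decay of the maximum waiting probability

def decision_loop(policy, context):
    """Keep a finished container waiting until a policy or the odds end it."""
    p_wait, p_cap = 1.0, 1.0            # waiting probability starts at 100%
    while True:
        p_wait -= STEP_COST             # every step has a fixed cost ...
        p_cap -= CAP_DECAY              # ... and the ceiling shrinks over time
        action = policy(context)
        if action is Action.TERMINATE or p_wait <= 0 or p_cap <= 0:
            return                      # container is terminated ("T")
        if action is Action.WAIT_MORE:
            p_wait += 2 * STEP_COST     # "+": more than compensate the decrease
        elif action is Action.WAIT_LESS:
            p_wait -= STEP_COST         # "-": decrease further
        p_wait = min(p_wait, p_cap)     # "N" leaves p_wait as-is
        if random.random() > p_wait:    # stop waiting with probability 1 - p_wait
            return
```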
Based on the above explanations, the problem's assumptions in this paper are as follows: (1) There are different users in the system.
(2) Functions are not shared, and each function is called by a specific user. This means that one user cannot see the source code of another user's functions. A summary of the mathematical notations used in this paper is given in Table 3.
The proposed method is a scheduling approach to find the best value for the r parameter by deciding: (1) Which function should be scheduled on which machine? We make this decision based on the similarity of the machine and the function's requirements. This is performed by analyzing the metadata of the packages used by the function, and the packages that are already present on the worker machines, as explained in Algorithm 2. (2) How long an environment should wait to potentially host the next function call? This is essentially the r parameter, decided using a combination of different parameters evaluated according to the current state of the system. These parameters include the cooperation network of the functions, previous executions of the functions, and so on.
Every time an environment finishes executing a function, the scheduler needs to make a decision about the environment's fate by picking one of the available actions: (1) increase the waiting time to keep the environment active (WaitMore), (2) decrease the waiting time (WaitLess), (3) leave the waiting time unchanged (Neutral), or (4) terminate the execution environment (Terminate).
Different policies can be implemented to pick an action depending on the current state of the system. One example policy, sketched in code below, can be defined as follows:
- Pick WaitMore if the merging probability of the system is more than 55%.
- Pick WaitLess if the merging probability of the system is less than 45%.
- Pick Neutral if the merging probability of the system is between 45 and 55%.
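A minimal sketch of this example policy is given below; the 45%/55% thresholds are taken from the text, and the string action names mirror the actions listed above.

```python
def threshold_policy(merge_probability: float) -> str:
    """Map the system's merging probability to an action (thresholds from the text)."""
    if merge_probability > 0.55:
        return "WaitMore"
    if merge_probability < 0.45:
        return "WaitLess"
    return "Neutral"
```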
It is worth mentioning that all these actions have a cost parameter associated with them. So, even if a policy picks the WaitMore action infinitely, the termination of the waiting process is guaranteed. The detailed architecture of the scheduler shown in Fig. 15 provides a schematic view of this process.

Package-awareness
One of the important decisions that must be made for the operation of the algorithm is which functions are considered similar. If the functions are written in the same programming language, only the prerequisite packages determine their similarity: the more prerequisites two functions have in common, the more similar they are. According to Algorithm 1, first, the dependency lists of the two functions are retrieved. Using the metadata of these dependencies, the top ten heavily depended-upon common packages, ordered by size, determine the similarity of the two functions. With the similarity calculated, the decision to assign a container to a function can be made. The constants of the Weibull Stretched normalizer function were chosen because they map any non-negative integer to a probability between 0 and 1, where the probability for input 0 equals 0, and the closer the input gets to the integer 10, the closer the probability gets to 1. The number 10 is the same constant used to select the top similar packages, and it was chosen by trial and error through manual experiments. To summarize, Algorithm 1 indicates how similar two functions are: an output of 0 means no similarity and 1 represents complete similarity. This value is calculated by taking the top ten common dependencies of the two functions and normalizing the result with the Weibull Stretched normalization function, which limits the calculated values to the range between 0 and 1. Based on this similarity, functions are assigned to environments as follows (a sketch of both steps is given after this list):
- If there are any waiting environments, sort them by the similarity between the function f and the last function that ran on each environment. Then, select the best candidate environment for f to be executed on.
- If no such environment exists, launch a new container and execute the function f on that newly created environment.
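The following sketch illustrates Algorithms 1 and 2 as described above. The Weibull Stretched constants (lam, beta) are assumptions chosen so that an input of 0 maps to 0 and an input near 10 maps close to 1; the paper's exact constants were found by trial and error and are not repeated here.

```python
import math

def weibull_stretched(x: float, lam: float = 4.0, beta: float = 1.5) -> float:
    """Normalizer mapping [0, inf) into [0, 1); lam/beta are assumed constants."""
    return 1.0 - math.exp(-((x / lam) ** beta))

def similarity(deps_a: dict, deps_b: dict) -> float:
    """Algorithm 1 sketch. deps_* map package name -> package size."""
    common = set(deps_a) & set(deps_b)
    # Take the (up to) ten heaviest shared packages, ordered by size.
    top = sorted(common, key=lambda p: deps_a[p] + deps_b[p], reverse=True)[:10]
    return weibull_stretched(len(top))  # 0 = no similarity, ~1 = complete similarity

def pick_environment(func_deps: dict, waiting_envs: list):
    """Algorithm 2 sketch. waiting_envs holds (env, deps_of_last_function) pairs."""
    if not waiting_envs:
        return None  # caller must launch a new container for the function
    return max(waiting_envs, key=lambda pair: similarity(func_deps, pair[1]))[0]
```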

Collaboration network of the functions and the collaboration scores
As explained earlier, to measure the degree to which functions depend on each other at runtime, a network of functions is required in which nodes represent functions and edges show the degree and direction of dependencies. This network is therefore directed and weighted. First, we need to create a graph at runtime based on the function calls. The graph generator acts as an auxiliary module next to the timer and monitors the calls. As soon as the function f_i is assigned to the environment e_j, then for every function f_r currently running in the system, if there is an edge from node f_r to f_i, its weight is increased; if there is no such edge, an edge of weight 1 is created. Algorithm 3 depicts this network-creation process. As the call frequencies of these functions might change at different intervals for various internal or external reasons, the values of the weights on these edges certainly change through time (i.e., evaporate). In other words, the values added to these edges are valid only within a specific time window. Algorithm 4 describes how these edges evaporate as time passes. The evaporation frequency can be changed as a hyper-parameter and may vary depending on the frequency with which the functions are called in the desired time window; here, an evaporation frequency of one minute is selected. After building this network, the γ(f_i, f_j) function can easily be calculated. To calculate the cooperation rate, it is enough to find the shortest path between the two functions in the network and calculate the probability that all functions in this chain will be called, as done in Algorithm 5. In this scheduler, the Dijkstra algorithm [37] is used to find the shortest path. Algorithm 3 constructs the cooperation network between the running functions: it listens for the function f that has just been assigned to an environment and then creates an r → f edge from every running function r in the directed network. If such an edge already exists, the constructor increases the weight of the edge without letting it exceed the maxWeight parameter (which in our case is assumed to be 1000). For every function g that was not running while f was assigned to its environment, the weight of the g → f edge is decreased. A sketch of this update is given below.
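The sketch uses networkx for the directed weighted graph; the decrement step for non-running functions is kept symmetrical with the increment, which is our assumption since the paper does not state the exact amount.

```python
import networkx as nx

MAX_WEIGHT = 1000  # cap on edge weights, as stated in the text

def on_function_assigned(G: nx.DiGraph, f: str, running: set) -> None:
    """Algorithm 3 sketch: update the cooperation network when f is assigned."""
    G.add_node(f)
    for r in running:                       # reinforce r -> f for running functions
        if r == f:
            continue
        w = G.edges[r, f]["weight"] if G.has_edge(r, f) else 0
        G.add_edge(r, f, weight=min(w + 1, MAX_WEIGHT))
    for g in list(G.nodes):                 # weaken g -> f for idle functions
        if g not in running and g != f and G.has_edge(g, f):
            G.edges[g, f]["weight"] = max(G.edges[g, f]["weight"] - 1, 0)
```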

Algorithm 4 decreases the weight of the edges in the cooperation network every minute to take this aspect into account: the edge weights should not remain unchanged forever, because a function might no longer call another function, or might stop triggering it indirectly.
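A matching sketch of Algorithm 4 follows; the multiplicative decay rate is an assumption, as the paper only states that weights are decreased every minute.

```python
import networkx as nx

def evaporate(G: nx.DiGraph, rate: float = 0.9) -> None:
    """Algorithm 4 sketch: run once per minute to decay all edge weights."""
    for _, _, data in G.edges(data=True):
        data["weight"] *= rate  # assumed multiplicative decay
```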

Dynamic scheduling policies
As shown in Fig. 14, a dynamic scheduler can have different policies to choose from depending on the situation it observes in the environment. In this paper, in addition to evaluating policies such as non-merger scheduling, fixed-time scheduling, and random-time scheduling, we consider two dynamic scheduling policies proposed in this paper, namely the dynamic neutral scheduler and the dynamic context-aware scheduler. The first always selects the "N" ("indifferent") choice; the second, smarter policy uses the various aspects of the environment described in the previous sections (such as the collaboration parameter). In total, five different scheduling policies are considered: (1) No-merger scheduler: This scheduler shuts down the container after execution and does not wait for another function to execute. (2) Static wait scheduler: This scheduler waits for the next function for a fixed time after the container finishes. In this paper, the waiting time is assumed to be 5 min. (The static scheduler is meant to reference Apache OpenWhisk's scheduler; in OpenWhisk, the waiting time for reusing a container is implemented in the ContainerPool module, as described in [38].) (3) Random wait scheduler: This scheduler waits for the next function for a random time that follows a normal distribution; here, a waiting time with a mean of 240 s and a standard deviation of 50 s is used. (4) Dynamic neutral scheduler: This scheduler waits dynamically for the next function but always chooses the value N in the decision-making process. This is a simpler version of the proposed scheduler. (5) Dynamic context-aware scheduler: This scheduler is dynamic and uses three probabilities for decision-making, namely the previous success rate, the probability of function cooperation, and the similarity of functions. This is the proposed scheduler with a dynamic waiting time.
The formal notations used for modeling this scheduler are given in Table 4. The cooperation probability of the function running in the container is obtained from Eq. (3): for all functions running in the system, the amount of cooperation between two functions is calculated according to Algorithm 5. Then, according to Eq. (4), the similarity of the two functions is obtained based on Algorithm 1. Algorithm 5 calculates the cooperation probability of two functions using the cooperation network constructed by Algorithm 3. It first finds a path between the two functions in the graph using the Dijkstra algorithm. Then, for each edge in the path, it normalizes the weight of the edge using the normalizer function, which maps a number to the [0, 1] range. In the last step, the algorithm multiplies all these normalized values (which are now probabilities) and returns the final accumulated value.
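A sketch of Algorithm 5 under these definitions is given below; the normalizer's constant is an assumption, and networkx's Dijkstra implementation stands in for the paper's shortest-path step.

```python
import math
import networkx as nx

def edge_normalizer(w: float, lam: float = 400.0) -> float:
    """Map an edge weight to [0, 1); the constant lam is an assumption."""
    return 1.0 - math.exp(-w / lam)

def cooperation_probability(G: nx.DiGraph, f_i: str, f_j: str) -> float:
    """Algorithm 5 sketch: multiply normalized weights along the shortest path."""
    try:
        path = nx.dijkstra_path(G, f_i, f_j)   # Dijkstra, as in the paper [37]
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        return 0.0                             # no chain connects the functions
    prob = 1.0
    for u, v in zip(path, path[1:]):
        prob *= edge_normalizer(G.edges[u, v]["weight"])
    return prob
```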
Finally, the probability of merging two functions is obtained from Eq. (5). It is calculated as the probabilistic combination of three criteria: the probability of cooperation, the similarity of the functions, and the probability of a successful merge operation, which is obtained by dividing the number of successful merge actions in the past by the number of all performed merge actions.
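The text fixes the three ingredients of Eq. (5) but not the combination operator; the sketch below uses a noisy-OR style combination as one plausible reading, not necessarily the paper's exact formula.

```python
def merge_probability(p_coop: float, p_sim: float,
                      successful_merges: int, total_merges: int) -> float:
    """Eq. (5) sketch: combine cooperation, similarity, and past merge success."""
    p_success = successful_merges / total_merges if total_merges else 0.5
    # Noisy-OR combination (an assumption): high value if any criterion is high.
    return 1.0 - (1.0 - p_coop) * (1.0 - p_sim) * (1.0 - p_success)
```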
After calculating the merging probability, if the probability is between 45 and 55%, the scheduler is indifferent, meaning that decision N is made. More than 55% means an increase in the waiting time, and less than 45% means a decrease, calculated in proportion to the probability value. No systematic optimization has been performed to find the best values for these hyper-parameters of the decision algorithm, which is why finding the optimal ones is left as future work; the chosen range is based on manual experiments.

Evaluation parameters
Before introducing the evaluation parameters, the recovery time, execution time, and idle time of a container are formulated. As can be seen in Eqs. (6) and (7), the recovery time and execution time of each container are obtained from the sum of all recovery and execution times of the functions executed in the container. Also, according to Eq. (8), the container construction time, the retrieval time for container requirements, and the waiting time of the container to host new functions are considered idle time; the idle time for each container is equal to the sum of these three values. To evaluate the proposed algorithm, we consider the following criteria: response time, cost, and system operating time.
Response time: Response time is the time from the moment a request to execute a function enters the system until the first response is received from the function. Since the system allocates an environment immediately after the request under all of the above schedules, the response time for the first request in the proposed algorithm includes the package recovery time and the environment creation time; for other requests, only the package recovery time is considered. This value is obtained from Eq. (9). As can be seen in this formulation, for the first container the total time of container construction and retrieval of the function packages for that request counts as the first response time; for other containers, only the recovery time of the function counts. The response time of each function is ultimately the sum of the total construction and retrieval times of all functions divided by the number of functions in the container.
Turnaround time: Turnaround time is the time from the moment a request to execute a function enters the system until the execution of the function ends. Since the system allocates the environment immediately after the request under all the mentioned schedules, the turnaround time for the first request in the proposed algorithm includes the total execution time of the function, the package recovery time, and the environment creation time; for other requests, only the recovery time plus the execution time are considered. This value is obtained from Eq. (10). As can be seen in this formulation, for the first container the total container construction time, the function recovery time, and the total function execution time of that request are considered; for other containers, only the total recovery and execution times of the function are counted. The turnaround time of each function is ultimately the sum of the total construction, retrieval, and execution times of all functions divided by the number of functions in the container.
Cost: The heuristic algorithm for merging functions and executing them in a hot environment requires waiting for the next execution. If the activation frequency of the functions is high, this waiting time is small; if the frequency drops, this waiting can increase the cost of keeping the execution environment hot. The cost calculation formula in Eq. (11) is a function of the lifetime of the container. After calculating the container lifetime from Eq. (12), which equals the sum of the container construction time, container package preparation time, container execution times, and container waiting times for executing new functions, we multiply its value by the cost rate. A cost rate of $0.04 per hour is chosen according to the current rate of Amazon Lambda functions [9].
Utilization: Utilization is calculated by dividing the effective runtime duration (i.e., useful running time) by the total runtime duration. In this algorithm, utilization is affected in two ways. On the one hand, waiting for the next function in a hot execution environment (e.g., container) is considered as idle state and this negatively impacts the utilization (i.e., increases total runtime duration). On the other hand, assuming that initialization and finalization steps are not useful, this merge can have a positive effect on the final utilization of the whole system by eliminating the extra initialization time (e.g., downloading package dependencies). The formula for calculating utilization can be seen in Eq. (13). The amount of utilization is calculated by dividing the execution time of the functions in the container by the lifetime of the container as given in Eq. (12).
Cumulative criterion: Besides the individual metrics, we need a cumulative criterion that captures the trade-off between cost, utilization, response time, and turnaround time. These measures are combined into a single value, obtained from Eq. (14), which increases when utilization increases and decreases when response time, turnaround time, or cost grows.
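As one illustrative instance only (an assumption; the exact weighting used in Eq. (14) may differ), any ratio that places utilization in the numerator and the quantities to be minimized in the denominator has the required monotonic behavior:

```latex
% A plausible shape for the cumulative criterion; not necessarily the
% paper's exact Eq. (14).
\aleph \;=\; \frac{U}{\overline{R}\cdot\overline{T}\cdot C}
```

where $U$ is the utilization of Eq. (13), $\overline{R}$ and $\overline{T}$ are the average response and turnaround times of Eqs. (9) and (10), and $C$ is the cost of Eq. (11).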

Evaluations
The introduced algorithms are tested and evaluated using a simulation engine we developed from scratch. A functional programming approach was used to implement the proposed scheduling policies in this simulation. The engine is event-driven: at each point in time it processes an event (such as a request to run a new function) and updates the system context accordingly.

Events
Events are the core of this simulation engine. At each moment in time, the simulator checks whether an event is pending in the system; if so, it processes the event, changes the context and status of the system according to the scheduling policies, and moves on to the next event. This behavior closely mirrors actual cloud function schedulers. The events used in this simulator include:
1. Queue Request: notifies the scheduler that a request to execute a cloud function has entered the system at the specified time. The request carries the function's information, such as its name, its prerequisites, and its execution time.
2. Finish Running Function: informs the scheduler that a running function has finished. The scheduler must now decide whether to wait for the next function to run or to terminate the environment.
3. Mount Wait Timeout: notifies the scheduler that the waiting time allotted for the next function has expired. The scheduler can decide whether to wait longer or to terminate the environment.
4. Evaporate Coop Net Edges: informs the scheduler that it must evaporate the edge weights of the cooperation network between functions.
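A minimal sketch of this event vocabulary as an F# discriminated union follows; the constructor names mirror the events listed above, while the payload fields are illustrative assumptions.

```fsharp
// Illustrative event vocabulary of the simulator; payloads are assumptions.
type Time = float // simulated seconds since 00:00

type Event =
    | QueueRequest of at: Time * name: string * packages: string list * execTime: float
    | FinishRunningFunction of at: Time * containerId: int
    | MountWaitTimeout of at: Time * containerId: int
    | EvaporateCoopNetEdges of at: Time
```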
The tool was developed as a general-purpose framework for evaluating various scheduling models. To establish validity, we performed several sensitivity analyses to ensure that the simulator produces at least face-valid results. In addition, we implemented the approaches of other research works in the simulator, generated results with it (also presented in the evaluation section), and compared them with the actual results reported in those works, thereby validating its correct behavior.

Software package information
In this simulation, real information about prerequisite packages is taken from the npm website, which hosts JavaScript software packages. The data set covers the top 1000 npm packages and includes the following fields:
1. Package name.
2. The number of packages that depend on this package.
3. Package installation and release size.
4. Package HITS rank: a criterion introduced for measuring the hub and authority qualities of a web page. The hub identity captures the quality of the page as a pointer to useful resources, and the authority identity captures the quality of the page as a resource itself; a good hub points to good authorities, while a good authority is pointed to by good hubs [39]. Applied to the package dependency graph, this criterion indicates the importance of a node in the network.
5. Package PageRank: a criterion introduced for measuring the authority of a web page, in which a good authority is pointed to by many good authorities [39]. Applied to the package dependency graph, it likewise indicates the importance of a node in the network.
This data can be used or ignored when generating package information in the simulator.
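The per-package record implied by this data set can be sketched in F# as follows (the field names are illustrative assumptions):

```fsharp
// Illustrative record for one entry of the npm top-1000 data set.
type NpmPackage =
    { Name       : string
      Dependents : int     // number of packages depending on this one
      SizeKB     : float   // installation/release size
      HitsRank   : float   // HITS hub/authority score in the dependency graph
      PageRank   : float } // PageRank score in the dependency graph
```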

Frequency of calling functions in the simulator
One of the important parameters in the simulation is how the information about the requested functions is generated, since this can affect the performance of the scheduler. One such parameter is the frequency at which the functions are called. As mentioned earlier, different functions can have higher invocation frequencies at certain times of the day; for a specific system, for example, 2 a.m. may be the time with the least traffic and 9 p.m. the peak. There are two approaches for generating this information: 1. Ignoring the frequency information, so that the number of functions called at a particular time is completely random (i.e., ignoring frequency data). 2. Considering external data while generating the frequency information, so that more requests are produced at specific times (i.e., using external frequency data).
Both of these approaches are supported by the simulator and are used in the evaluation of the scheduler, as sketched below.
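The following F# sketch illustrates the two generators, assuming an hourly weight profile stands in for the external frequency data; all names are illustrative.

```fsharp
// Illustrative request-count generators for the two approaches.
let rng = System.Random 42

/// IFD: the number of requests per hour is completely random.
let ignoreFrequencyData maxPerHour =
    [ for hour in 0 .. 23 -> hour, rng.Next maxPerHour ]

/// UFD: an external hourly weight profile (e.g., a trough at 2 a.m. and a
/// peak at 9 p.m.) biases the number of requests generated per hour.
let useFrequencyData maxPerHour (hourlyWeight: float[]) =
    [ for hour in 0 .. 23 -> hour, int (hourlyWeight.[hour] * float maxPerHour) ]
```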

Number of packages required by a function
Another piece of information that influences the simulation is the number and type of packages required by a running function. As mentioned earlier, real npm data show that many functions use the lodash package [40], while far fewer use a package such as ansi-colors. Since one of the strengths of the dynamic scheduling approach is that it considers the prerequisites of functions, generating required packages that are close to reality yields more meaningful results for the dynamic approach. Two decisions must therefore be made when generating the required packages: 1. Should real npm data be used while generating the simulation data? 2. Which distribution should be used for the number of prerequisites per function? Both decisions are sketched below.
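A minimal F# sketch of both decisions: popularity-weighted package selection using the npm dependents counts, and a normal-versus-uniform choice for the number of prerequisites. All names are illustrative assumptions.

```fsharp
// Illustrative generators for a function's required packages.
let rng = System.Random()

/// Decision 1: pick a package with probability proportional to its number of
/// dependents, so popular packages (e.g., lodash) are chosen far more often
/// than rarely used ones (e.g., ansi-colors).
let pickWeighted (packages: (string * int) list) =
    let ticket = rng.Next(packages |> List.sumBy snd)
    packages
    |> List.scan (fun (acc, _) (name, w) -> acc + w, name) (0, "")
    |> List.tail
    |> List.find (fun (acc, _) -> ticket < acc)
    |> snd

/// Decision 2: number of prerequisites per function, drawn either uniformly
/// or from a normal distribution (via the Box-Muller transform).
let uniformCount lo hi = rng.Next(lo, hi + 1)
let normalCount mean stdDev =
    let u1, u2 = 1.0 - rng.NextDouble(), rng.NextDouble()
    let z = sqrt (-2.0 * log u1) * cos (2.0 * System.Math.PI * u2)
    max 0 (int (round (mean + stdDev * z)))
```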

Simulation configurations
According to the points mentioned earlier, several different configurations are obtained for the simulations. Each configuration determines how many dependencies a function uses, which packages are depended on more heavily, and the invocation frequency derived from the number of functions called at different times of the day. In this paper, eight configurations are considered for the simulation scenarios; they are listed in Table 5. IFD denotes ignoring frequency data, UFD using frequency data, UA unaware (of real npm data), AW aware, N a normal distribution, and U a uniform distribution.
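The configuration space of Table 5 can be sketched as three independent toggles; the encoding below is an illustrative assumption.

```fsharp
// Illustrative encoding of the eight configurations in Table 5.
type FrequencySource   = IFD | UFD  // ignore vs. use frequency data
type NpmAwareness      = UA  | AW   // unaware vs. aware of real npm data
type CountDistribution = N   | U    // normal vs. uniform dependency count

type Config =
    { Freq : FrequencySource
      Npm  : NpmAwareness
      Dist : CountDistribution }

// The configuration used in the evaluations below.
let ufdAwN = { Freq = UFD; Npm = AW; Dist = N }
```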

Test results
In this section, we examine the metrics described earlier under different parameters.
For this purpose, we created three models to evaluate utilization, response time, and cost, both individually and in aggregate. The simulation engine and its scheduler implementations are completely open-source and publicly available [41]. The engine is written in the functional programming language F# and simulates the FaaS platform by processing 24 hours of events: it starts at time 00:00 and processes each event (e.g., a function run request, evaporating the co-op network edges, etc.) until 23:59. A scheduler for this simulator is a function that receives the current event and the current system state from the simulator and decides the next state of the system based on that event. Figure 16 shows the components of the simulator.
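The scheduler contract just described can be sketched as a pure function folded over the day's time-ordered events. The sketch keeps the event and state types abstract so it stands alone; the names are illustrative.

```fsharp
// Illustrative scheduler contract: (event, state) -> next state.
type Scheduler<'Ev, 'State> = 'Ev -> 'State -> 'State

/// Fold a day's events, sorted by timestamp (00:00 to 23:59), through the
/// scheduler, starting from an initial system state.
let simulateDay (timeOf: 'Ev -> float) (scheduler: Scheduler<'Ev, 'State>) initial events =
    events
    |> List.sortBy timeOf
    |> List.fold (fun state ev -> scheduler ev state) initial
```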
We used the UFD_AW_N configuration for the evaluations because using the frequency data and the dependency information makes the experiments more realistic, even though the frequency data are randomly generated. This configuration is also in line with the motivation of this paper, as it aims to schedule based on similarity and the cooperation network. Eight configurations were defined to show that the simulation engine supports a wide variety of configurations; however, only UFD_AW_N was selected here. As mentioned before, the NoMerge scheduler shuts down the container after execution and does not wait for potential future reuse. The StaticWait scheduler waits a fixed time (i.e., 5 min) for the next function after a function finishes. The RandomWait scheduler waits for the next function for a random time drawn from a normal distribution. The Dynamic Neutral scheduler waits dynamically for the next function but always chooses the neutral value (N) in the decision-making process. The Dynamic Context-aware scheduler uses three probabilities, namely the previous success rate, the probability of function cooperation, and the similarity of functions, in its decision-making. The response time and turnaround time results for Model 1 show that the average turnaround time of RandomWait is the lowest among the approaches, while the average response time of NoMerge is the highest (Fig. 18). The utilization results for Model 1 are shown in Fig. 19: the average utilization of StaticWait is the lowest and that of SAND the highest. The cost results for Model 1 are shown in Fig. 20 and indicate that the SAND algorithm yields the minimum cost.
The Model 1 results are reported in detail in Table 6, where the best result in each column is marked in green; the best utilization and cost are obtained by the SAND scheduler. The Model 2 results are reported in detail in Table 7, again with the best result in each column marked in green. The best utilization is obtained by the SAND scheduler, StaticWait achieves the best response time, and the turnaround time of RandomWait is the lowest. In Model 2, the cost of the Dynamic Neutral scheduler (one of the proposed schedulers) is the lowest. For a fair comparison, the results of average response time, average turnaround time, and average utilization are aggregated into a single criterion denoted by ℵ. From the aggregated results reported in the last column, it can be concluded that the DynamicContext scheduler (the proposed scheduler) has the best overall performance in Model 2.
3. Model 3 (Fig. 25): (a) number of functions: varied from 2 to 15; (b) algorithm: Dynamic Context. (The utilization results for Model 2 are shown in the preceding figure and the cost results in Fig. 24.) In this simulation scenario, we investigate the effect of increasing the number of functions on the proposed algorithm. The experiment was designed around this model to see whether the proposed approach behaves differently with a different number of functions. With real-world data, better results were expected, since DynamicContext considers the cooperation of the functions; however, we did not observe the expected improvement in the final results, because the dependency data of the functions were generated randomly.

Comparisons
The main idea discussed in this paper for scheduling functions is to keep the execution environment active so that subsequent functions can use a warm environment, reducing the cold start time.
To evaluate the proposed algorithm under its two policies, we compared it against the well-known SAND [23] algorithm as well as other algorithms. One of these performs no merge action and, assuming the execution environment is infinitely scalable, always constructs a new container to execute the submitted function. We also considered two algorithms that, like the proposed one, can wait before running a new function, but whose wait is either static (i.e., hard-coded as 5 min) or drawn from a normal distribution (with mean = 240 s and stddev = 50 s). For this purpose, we considered two similar models; the results for the first model are summarized in Table 8, where the top two ratios in each column are highlighted in dark and light green. The proposed method does not perform significantly better than the other methods on these data. However, all the function cooperation data were generated randomly, whereas the approach is designed for real data in which the relations between functions are far less random. In this model, the SAND algorithm performs considerably better than the other approaches: owing to the low number of functions, all of them are executed within the same application and the same container environment, which greatly reduces the container creation time and the package fetching time, so only the forking and process creation times are added to the cold start time. It is worth mentioning that, in general, executing different functions in one container imposes security threats; moreover, functions from different users cannot be executed in one container.
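For reference, the baseline waiting-time policies recalled above can be sketched in F# as follows; the policy encoding itself is an illustrative assumption.

```fsharp
// Illustrative encoding of the baseline waiting-time policies.
type WaitPolicy = NoMerge | StaticWait | RandomWait

let rng = System.Random()

/// Normal sample via the Box-Muller transform.
let sampleNormal mean stdDev =
    let u1, u2 = 1.0 - rng.NextDouble(), rng.NextDouble()
    mean + stdDev * sqrt (-2.0 * log u1) * cos (2.0 * System.Math.PI * u2)

/// Waiting time (in seconds) before shutting a finished container down.
let waitTime = function
    | NoMerge    -> 0.0                                // terminate immediately
    | StaticWait -> 300.0                              // hard-coded 5 minutes
    | RandomWait -> max 0.0 (sampleNormal 240.0 50.0)  // mean 240 s, stddev 50 s
```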
In this model, because the SAND algorithm is assumed to work on a single-user system, it demonstrates the best utilization and cost. After SAND, the proposed approach shows the best trade-off among the considered criteria, mainly due to its consideration of the dependency graph between functions.
For the second model, the comparison in Table 9 is obtained by considering significantly more functions (i.e., an increase from 3 to 50): the best utilization is achieved by Dynamic Context and SAND, and the best response time by Static Wait and SAND. It should be noted that the strength of the context-aware policy of the dynamic approach shows itself when the functions calling each other have an actual logical relationship, in which case the collaboration graph can correctly estimate the behavior of the functions. Since the information in this simulation is generated without bias, the relationships seen among real-world functions do not exist in the randomly created collaboration graph. The SAND algorithm performs worse in this model than in the previous one: with the significant increase in the number of functions, they no longer all belong to the same application, and more containers are needed to execute each application, which greatly increases the container creation and package fetching times. The third model, which varies the number of functions under a fixed Dynamic Context approach, shows that no significant pattern emerges in the evaluation results as the number of functions increases. Based on the evaluation results, a comparative summary of the approaches is given in Table 10.

Conclusions
In this paper, we discussed cloud functions and how they are scheduled and executed in a serverless computing model. In this computational model, users can invoke functions through standard protocols such as HTTP requests. One of the main problems of serverless functions is the cold start problem, and one way to reduce the cold start time is to optimize the preparation of the cloud functions' required packages. To this end, we proposed a model that keeps the execution environment that hosted the previous function active.
At each stage of this model, the algorithm must decide, according to the state of the execution environment, whether to wait longer to host a new function or whether further waiting is pointless and will only increase the cost. The proposed approach can adopt different policies. In this paper, we also implemented a policy that constructs a graph to detect dependencies between functions during the execution of the algorithm. These two policies, called Dynamic Neutral and Dynamic Context, were compared with common approaches such as NoMerge, RandomWait, and StaticWait. The comparisons were measured through various criteria, including cost, turnaround time, response time, utilization, and an aggregation of them, using a simulator written from scratch that can measure these criteria under various profiles and configurations.

This paper investigated the impact of a dynamic waiting time on the cold start problem. The proposed policies showed better results in terms of cost and utilization than the fixed and random waiting approaches; in other words, when cost or system utilization is the priority, dynamic waiting time is one of the available options. However, they performed worst in terms of response time, which should be taken into consideration. Cumulatively, taking all of these indicators into account, the improvement is almost twofold. Reducing cold start times has a significant impact on the overall performance and user experience of FaaS providers, so this research domain is potentially of great importance to enterprise vendors.

The innovations of this research can be summarized as follows: 1. A mathematical model for calculating the probability of calling a function similar to the one in the current hot operating environment. 2. A new algorithm for constructing the function dependency graph. 3. An approach for dynamically changing the waiting times. 4. A simulator written from scratch that can measure different criteria under various profiles and configurations.

Future research includes using a real-world dataset to evaluate the performance of these approaches. The proposed approach uses a model similar to amplification algorithms: over a variable time window, it decides according to the environment whether to decrease or increase the probability of waiting for the next function. Considering more real-world environmental variables and conditions, especially failure information, can therefore yield more accurate prediction and scheduling models. Instead of the two policies introduced in this paper, one could design a policy that uses deep learning to recognize the patterns in the chain of decisions at each stage and feed this sequence of patterns into future policy decisions. The hyperparameters of the introduced approaches can also be tuned using methods such as grid search, e.g., to determine the optimal degree of neutrality.
In the Dynamic Context method, for example, the range between 45 and 55% was used as the neutral range, which is not necessarily the best value for this policy; more extensive evaluations might be needed.