Toward an Intelligent Cache Management: In an Edge Computing Era for Delay Sensitive IoT Applications

The emergence of embedded technologies and the Internet of Things (IoT) has led to a proliferation of devices, ranging from tiny monitoring sensors to mobile devices, wearable devices, and surveillance sensors. Over the past few decades, technological advancements in these devices have enabled applications ranging from home automation to the health care industry. Owing to this rapid growth in technology and newly emerging applications, the number of IoT-connected devices is expected to reach 42.62 billion, with global mobile data traffic of 77.5 exabytes/month, by 2022. Offering the required services with sufficient QoS parameters as per the SLA is therefore a challenging task. In addition, the growing rate of wireless traffic loads the core network and the backhaul connections, and the situation may get even worse with multimedia streaming applications. Fog Caching (FC) is one of the promising solutions for mitigating wireless traffic. In FC, the popular contents in the mobile core network are cached in suitable places. However, identifying popular contents, locating the cache, and replacing the contents of the cache are notable issues in FC. In this paper we propose an Intelligent framework (I-CADET) for efficient CAche management for Delay sensitive IoT applications in the Edge CompuTing era. To locate the cache in the mobile core network, the Connected Dominating Set construction of graph theory is used. The content popularity is predicted by an efficient ML technique. The identified popular contents are then distributed over a constructed semigraph-based connected edge dominating virtual backbone. The efficient distribution of popular content on the constructed semigraph-based virtual backbone (hot spot places) increases the Quality of Experience (QoE). The numerical results reveal that the proposed framework improves the QoE in terms of content delivery and cache hit rate, and reduces the average downloading latency and the backhaul load. Efficient usage of cache and bandwidth is ensured while meeting high QoS.


Introduction
The emergence of ubiquitous computing and the great speed of technological advancement in wireless communications have driven the digital world to be the trailblazer for the forthcoming digital evolutions. The Digital 2020 reports say that more than half of the world's population will use social media by the middle of this year. The use of mobile devices and social media is thus becoming an indispensable part of everyone's life. A survey by The Economic Times says that the number of smartphone users in India will double by 2023. These mobile devices are resource-constrained and need a serving platform such as cloud computing for hosting and running their storage-intensive and computationally intensive applications. Most of the served content belongs to multimedia streaming applications such as video conferencing, gaming, online educational forums, and most entertainment applications.
However, the growing rate of multimedia applications puts an extra burden on the backhaul traffic of cloud computing. Another major issue with the growing rate of applications is increased latency, which degrades Quality of Experience (QoE) parameters such as response time, image quality, video quality, bit rate, buffer fill, and lag ratio at the end-user side. Fog Computing is one of the popular solutions for improving QoE by placing computation and storage near the user premises.
For some delay-sensitive applications, such as healthcare and military surveillance, the response time plays a vital role. Caching is a widely used practice in P2P (Peer to Peer) environments to reduce response time. Fog Caching (FC) is a recent wireless networking paradigm for improving QoE for mobile users, and cache management in such a heterogeneous environment is a challenging issue today. Figure 1 shows the scenario of Fog Computing in the IoT (Internet of Things) environment.
In the proposed work, an efficient cache management technique is presented to reduce the backhaul load and to improve QoE. The cache management consists of two phases, namely popular content identification and efficient node selection to cache the content. To rank the popularity of multimedia content, machine learning techniques are used after preprocessing the input. To find the hot spot locations for cache placement, the connected dominating set concept of graph theory is used. The identified Dominating Set (DS) consists of the nodes with the highest edge degree and the lowest transmission delay. The ranked popular contents are then distributed over the DS.
The contributions of the proposed framework (I-CADET) are listed below.
(i) Ranking the popular contents by the identified features towards placing them in cache
(ii) Cache location prediction by semigraph based Connected Edge Dominating Set (CEDS) of graph theory
(iii) Threshold setting for CEDS by EWMA (Exponentially Weighted Moving Average)
The rest of the paper is organized as follows: Sect. 2 gives a brief account of related work. Section 3 details the groundwork required for the proposed work in terms of methods and materials. Section 4 elaborates the two phases of the proposed work. The simulation results are discussed in Sect. 5. Finally, Sect. 6 concludes the work.

Literature Survey
The authors of [1] analyzed the performance of various caching methodologies in a given network and studied the pros and cons of those methodologies. They further explored the issues involved in the four phases of the caching process. In [2], a smart caching and location prediction technique was proposed to predict user preferences in information-centric networks (ICN) in order to improve the user experience. The authors used machine learning techniques for caching multimedia in ICN and also developed a cache replacement algorithm to improve cache utilization, thereby obtaining low latency and an optimized hit ratio. In [3], a big-data-enabled architecture was designed to obtain 100 percent user satisfaction and 98 percent backhaul offloading by implementing proactive content caching at base stations. In [4], a web proxy cache replacement policy was developed with the help of neural networks in order to reduce the burden of web traffic. The neural networks were trained to classify the objects that need to be cached, based on recency and frequency information. The proposed work was compared with LRU, LFU, and optimal-case algorithms, and the authors obtained an optimal byte-hit rate in the worst and best scenarios. In [5], a new approach to the delivery phase was designed by implementing a pre-fetching strategy and a novel superposition coding technique, taking per-eRRH energy constraints and fronthaul capacity into account. Both the soft-transfer and hard-transfer modes of the fronthaul were used in the delivery phase in a hybrid manner. In [6], proactive caching based on a bidirectional deep recurrent neural network was proposed in order to predict time-series content requests. The proposed work contains three blocks in series, namely a convolutional neural network, a bidirectional RNN, and a fully connected NN, to maximize the hit ratio and the prediction accuracy.
In [7], the Cachier system was proposed, with a unique optimization that balances the load between edge and cloud in order to minimize latency for image recognition applications. The work considers online assessments of network conditions and offline analysis of image recognition applications to leverage the spatio-temporal locality of requests. In [8], an F-RAN-based mobile VR delivery framework was proposed to enrich the distribution of resources (computing, communication, caching) between fog access points and VR devices. The authors formulated the decision problem as a multiple-choice multiple-dimensional knapsack problem (MMKP) and used Lagrangian dual decomposition to solve the MMKP so as to maximize the average tolerant delay. In [9], the content demand ellipses (CDE) method was proposed to characterize network traffic and obtain a better placement policy for communication systems (Information Repeaters, IR) with improved coverage and better capacity. The author places the IRs in regions so as to mitigate network traffic, the average load on the publisher, and delay.
In [10], an optimal distributed caching policy based on a greedy technique was proposed to predict user mobility and content popularity; coded caching was also implemented to deliver contents from the macro base station in order to improve the quality of experience. In [11], a dynamic caching scheme was proposed to predict social-aware user mobility, and a cooperative filtering algorithm was employed to efficiently manage and predict cache updates. In [12], a framework for mobility-aware caching in content-centric wireless networks was presented to predict user mobility patterns. The authors exploited the information contained in user mobility and reviewed approaches to cache the popular contents efficiently in base stations or user equipment, so as to mitigate the backhaul load and lower the deployment cost. In [13], a dynamic programming algorithm was proposed that takes advantage of the inter-contact times in user mobility patterns to obtain a high data offloading ratio. The authors also formulated the problem as a monotone submodular maximization and implemented a greedy algorithm that achieves an approximation ratio of 1/2. In [14], a distributed approximation algorithm was proposed for finding the optimal storage location when delivering a file. The proposed work is based on large deviation inequalities and aims at a minimal probability of service access from the main base station. In [15], a caching judgement strategy called "popcache", which considers content popularity, was proposed for information-centric networking. The performance of "popcache" was also analyzed in terms of server hit rate and expected round-trip time against three other cache judgement strategies.
In [16], a caching strategy was developed in a distributed manner by considering social relationships in device-to-device (D2D) communication. The cache placement strategy in D2D networks is optimized by implementing a decentralized learning automaton, and the proposed solution considers both the social relationships between D2D users and the closeness of their devices. In [17], information-centric-networking-based edge caching mechanisms were presented to utilize bandwidth efficiently. The proposed work places the popular content in intermediate servers to avoid duplicate transmission of the same content to different users. In [18], mobile social device caching was proposed, emphasizing D2D-assisted caching technology to reduce the backhaul load in wireless networks. In [19,20], the author proposes semigraph-based EDS and Total EDS algorithms for optimal routing in wireless sensor networks and shows that the proposed system is energy efficient and thus increases the overall network lifetime. In [21], the author proposes fuzzy simplified optimization for computational offloading, which reduces the total execution time and energy consumption of computationally intensive applications. In [22], the TEFLON framework is proposed for finding trusted fog nodes to execute intensive applications with reduced latency; it also reduces the penetration of malicious nodes in the fog environment.
Although a noteworthy amount of investigation has been done in the field of cache node selection and popular content identification, we propose an efficient semigraph-based hybrid framework, I-CADET, which goes beyond the existing methods. The proposed framework allows the selection of the most appropriate cache nodes based on the CEDS concept of graph theory, the region of interest, latency, and backhaul traffic, which in turn improves the QoE for end users.

Methods and Materials
Fog Caching (FC) is a promising research area that improves the QoE parameters for end users. To identify the popular content to be located in the fog environment, the well-known machine-learning-based K-Means clustering is used. Further, to designate the caching nodes in the fog environment, graph-theoretic concepts, namely the semigraph and the dominating set, are used to construct an overlay network on the existing physical fog network.
The fog network 'F' is assumed to be a semigraph consisting of N fog nodes and the wireless communication links between the nodes within the fog. In the Semigraph Fog Environment (SFE), the fog nodes and communication links are considered as the vertices and edges of the semigraph, respectively.
The fog nodes with maximum edge degree are considered for the construction of the virtual backbone. The general mathematical terms used for the construction are defined below:

Definitions
Semigraph: A graph G = (V, E) consists of a set of vertices V and a set of edges E. A semigraph SG is an ordered pair (V, Y), where V is a set whose elements are called the vertices of SG and Y is a set of n-tuples of distinct vertices, for various n, called the edges of SG. SG must satisfy the following conditions.
(1) Any two tuples in Y have at most one vertex in common.
(2) Two edges $(a_1, a_2, \ldots, a_p)$ and $(b_1, b_2, \ldots, b_q)$ are said to be equal if (i) $p = q$ and (ii) either $a_i = b_i$ for $i = 1, 2, \ldots, p$ or $a_i = b_{p+1-i}$ for $i = 1, 2, \ldots, p$.
Thus the edge $(a_1, a_2, \ldots, a_p)$ is the same as $(a_p, a_{p-1}, \ldots, a_1)$.
The edge degree of an edge $e_i$ is calculated by Eq. (1).
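The two semigraph conditions and the edge-equality rule can be checked mechanically. The following is a minimal Python sketch for illustration, not the authors' implementation; the function names `same_edge` and `is_semigraph` are hypothetical, and the requirement that every edge has at least two vertices is an assumption.

```python
from itertools import combinations


def same_edge(a, b):
    """Edges are equal if they list the same vertices forwards or backwards."""
    a, b = tuple(a), tuple(b)
    return a == b or a == b[::-1]


def is_semigraph(edges):
    """Return True if the vertex tuples in `edges` form a valid semigraph."""
    # Drop repeated listings of the same edge (forward or reversed).
    distinct = []
    for edge in edges:
        if not any(same_edge(edge, kept) for kept in distinct):
            distinct.append(tuple(edge))
    for edge in distinct:
        # Assumption: an edge has at least two vertices, all of them distinct.
        if len(edge) < 2 or len(set(edge)) != len(edge):
            return False
    # Condition (1): any two edges share at most one vertex.
    return all(len(set(e) & set(f)) <= 1 for e, f in combinations(distinct, 2))


# Hypothetical fog topology: three edges over vertices 1..5.
valid = is_semigraph([(1, 2, 3), (3, 4), (4, 5, 1)])
# These two edges share vertices 1 and 3, which violates condition (1).
invalid = is_semigraph([(1, 2, 3), (1, 3, 5)])
```

Here `valid` is `True` and `invalid` is `False`; reading an edge in reverse, as in `same_edge((1, 2, 3), (3, 2, 1))`, yields the same edge.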

Dominating Set: A dominating set (DS) for a graph G = (V, E) is a subset V' ⊆ V such that every vertex in (V − V') is adjacent to at least one vertex in V'.
Edge Dominating Set: An edge dominating set (EDS) for a graph G = (V, E) is a subset U ⊆ E such that every edge in (E − U) is adjacent to at least one edge in U.
Connected Edge Dominating Set: An edge dominating set (EDS) for a graph G is a connected edge dominating set (CEDS) if the edge-induced subgraph is connected.
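The three definitions can be turned into simple membership checks on an ordinary graph. The Python sketch below is illustrative only; the function names are hypothetical, edges are given as vertex pairs, and two edges are treated as adjacent when they share an endpoint.

```python
def is_dominating_set(vertices, edges, chosen):
    """Every vertex is in `chosen` or adjacent to a vertex in `chosen`."""
    chosen = set(chosen)
    neighbours = {v: set() for v in vertices}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return all(v in chosen or neighbours[v] & chosen for v in vertices)


def is_edge_dominating_set(edges, chosen):
    """Every edge is in `chosen` or shares an endpoint with an edge in it."""
    covered = set().union(*chosen)  # all endpoints of the chosen edges
    return all(set(e) & covered for e in edges)


def is_connected_eds(edges, chosen):
    """An EDS whose edge-induced subgraph is connected."""
    chosen = [tuple(e) for e in chosen]
    if not chosen or not is_edge_dominating_set(edges, chosen):
        return False
    # Build the edge-induced subgraph and traverse it from any vertex.
    adjacency = {}
    for u, v in chosen:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == set(adjacency)


# Hypothetical example: a path 1-2-3-4-5-6.
path_vertices = [1, 2, 3, 4, 5, 6]
path_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
```

On this path, `[(2, 3), (4, 5)]` is an edge dominating set but not a connected one, whereas `[(2, 3), (3, 4), (4, 5)]` is a CEDS.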

Proposed System
Nowadays, accessing multimedia content from domain servers with assured quality of experience is a challenging task. To address issues such as latency and backhaul traffic, we propose the framework I-CADET for efficient CAche management for Delay sensitive IoT applications. The proposed framework consists of two phases, namely popular content prediction and virtual backbone construction for cache location. In the first phase, popular contents are predicted by an efficient K-Means clustering. In the second phase, the Connected Dominating Set (CDS) construction of graph theory is used to identify the cache locations.

Predicting the Popular Content
To identify the popular content to be placed in hotspots, a collaborative-filtering-based recommendation system is used. To predict the popular content (the MovieLens dataset is considered here), a utility matrix is created from the users' ratings of movies. The utility matrix may contain sparse (blank) entries, corresponding to the movies that a user has not rated. Hence, the UV decomposition method is implemented to fill the sparse entries with appropriate values. The well-known K-Means technique is then used to cluster similar users based on the ratings given for each movie. Finally, the movie that is most highly rated within a group is suggested to the other members of the same group.
The flow for predicting the popular content is depicted in Fig. 2.
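The clustering and recommendation steps can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: blank entries are filled with each user's mean rating as a stand-in for UV decomposition, the first k users seed the centroids so the run is deterministic, and the toy ratings and all function names are hypothetical.

```python
def fill_missing(matrix):
    """Replace None with the user's mean rating (stand-in for UV decomposition)."""
    filled = []
    for row in matrix:
        known = [r for r in row if r is not None]
        mean = sum(known) / len(known) if known else 0.0
        filled.append([mean if r is None else float(r) for r in row])
    return filled


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20):
    """Lloyd's algorithm; the first k points are used as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every user vector.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def popular_item_per_cluster(matrix, labels, k):
    """For each cluster, pick the item with the highest mean known rating."""
    popular = {}
    for c in range(k):
        rows = [row for row, lab in zip(matrix, labels) if lab == c]
        best_item, best_score = None, -1.0
        for item in range(len(matrix[0])):
            ratings = [row[item] for row in rows if row[item] is not None]
            if ratings:
                score = sum(ratings) / len(ratings)
                if score > best_score:
                    best_item, best_score = item, score
        popular[c] = best_item
    return popular


# Toy utility matrix: 4 users x 3 movies, None marks an unrated movie.
ratings = [[5, 3, None], [1, None, 5], [4, 5, 1], [None, 1, 4]]
labels = k_means(fill_missing(ratings), k=2)
popular = popular_item_per_cluster(ratings, labels, k=2)
```

For this toy matrix, users 0 and 2 fall into one cluster with movie 0 as its most popular item, and users 1 and 3 fall into the other with movie 2. A production pipeline would more likely rely on a numerical library and a proper matrix factorization.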
The identified popular contents should be placed in appropriate cache locations. The hotspot cache places are predicted using a semigraph-based connected edge dominating set, as explained in the following section.

Predicting the Hotspot Places for Cache Location
This section explains the novel semigraph-based virtual backbone construction, which consists of the following three steps. In the target fog environment, fog nodes are considered as vertices and the connections among them are considered as edges of the semigraph SG = (V, Y). The edge degree is then calculated for all edges in the semigraph by Eq. (1). Further, the edges are sorted in descending order, so that the edges with the highest degree are at the top; these are considered for the target backbone construction. The number of edges to be considered for the overlay network construction is calculated by the Exponentially Weighted Moving Average (EA), Eq. (2), where
$EA(t)$ = moving average at time $t$
$w$ = weight parameter, with a value between 0 and 1
$U_{req}(t)$ = number of user requests generated at time $t$
Algorithm 1 explains the generation of the Edge Dominating Set (EDS) U, and Algorithm 2 explains the generation of the Connected Edge Dominating Set (CEDS) CU for the overlay network construction. The semigraph-based overlay network is constructed by these EDS and CEDS generation algorithms. To select the optimal node on which to place popular content in a particular Region of Interest (RoI), the following influence factors are considered: the link strength threshold (measured by the Received Signal Strength Indicator, RSSI) and cache availability.
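As an illustration of the thresholding and ranking steps, the sketch below assumes the standard recursive EWMA form, EA(t) = w · U_req(t) + (1 − w) · EA(t − 1), seeded with the first observation. Both the recursive form and the seed are assumptions, and the names `ewma` and `rank_edges` are hypothetical. How EA(t) is mapped to an edge count is not specified here, so the count is passed in explicitly.

```python
def ewma(requests, w):
    """Exponentially weighted moving average of per-interval request counts.

    Assumed form: EA(t) = w * U_req(t) + (1 - w) * EA(t - 1),
    with EA seeded by the first observation.
    """
    if not 0 <= w <= 1:
        raise ValueError("w must lie between 0 and 1")
    if not requests:
        return []
    average = float(requests[0])
    history = []
    for count in requests:
        average = w * count + (1 - w) * average
        history.append(average)
    return history


def rank_edges(edge_degree, count):
    """Sort edges by descending edge degree and keep the top `count`."""
    ranked = sorted(edge_degree, key=edge_degree.get, reverse=True)
    return ranked[:count]


# Hypothetical request counts for three intervals and hypothetical edge degrees.
history = ewma([10, 20, 30], w=0.5)
top = rank_edges({("a", "b"): 3, ("b", "c"): 5, ("c", "d"): 2}, count=2)
```

With `w = 0.5` the averages are 10.0, 15.0 and 22.5, and the two highest-degree edges are `("b", "c")` followed by `("a", "b")`. A larger `w` makes the threshold react faster to bursts of requests, while a smaller `w` smooths them out.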
The node within the target region with the highest average link strength and good cache availability is considered for popular content placement. The flow of cache node selection is depicted in Algorithm 3.
The identified popular content is placed in the generated Cache Node set (CN). The link strength (RSSI) is calculated periodically to update the cache node selection dynamically. Finally, in the constructed overlay network, popular contents are placed in the hotspot places with high RSSI and good cache size availability.
The next section assesses the performance of the proposed I-CADET framework in terms of throughput, jitter, backhaul traffic, average transmission delay, and average end-to-end delay.
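The selection rule can be sketched as a filter followed by a maximum. This is not Algorithm 3 itself; the data layout, the function name `select_cache_node`, and the threshold values are hypothetical and chosen only for illustration.

```python
def select_cache_node(nodes, rssi_threshold, min_free_cache):
    """Pick the node with the highest average RSSI among eligible nodes.

    `nodes` maps a node name to {"rssi": [samples in dBm], "free_cache": MB}.
    A node is eligible if its average RSSI meets the threshold and it has
    at least `min_free_cache` MB available. Returns None if none qualifies.
    """
    best_name, best_rssi = None, None
    for name, info in nodes.items():
        average_rssi = sum(info["rssi"]) / len(info["rssi"])
        if average_rssi < rssi_threshold or info["free_cache"] < min_free_cache:
            continue
        if best_rssi is None or average_rssi > best_rssi:
            best_name, best_rssi = name, average_rssi
    return best_name


# Hypothetical region of interest with three fog nodes.
region = {
    "f1": {"rssi": [-60, -62], "free_cache": 200},
    "f2": {"rssi": [-50, -54], "free_cache": 50},   # strong link, little cache
    "f3": {"rssi": [-55, -57], "free_cache": 300},
}
chosen = select_cache_node(region, rssi_threshold=-70, min_free_cache=100)
```

Here `f2` has the strongest link but too little free cache, so `f3` is chosen. Re-running the function on fresh RSSI samples corresponds to the periodic update of the cache node selection.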

Result and Analysis
The proposed I-CADET was implemented using NetSim, and various performance metrics were compared against two conventional baselines (Raw benchmark 1 and Raw benchmark 2), Follow me cache [23], and the well-known greedy technique [21].
Raw benchmark 1: The popular content being searched for by the mobile user is loaded directly from the nearby data center.
Raw benchmark 2: The popular content being searched for by the mobile user is loaded from a randomly selected node of the same RoI.
Follow me cache [23]: The popular content being searched for by the mobile user is loaded from a Connected Dominating Set (CDS) based RoI.
Greedy [21]: The popular content being searched for by the mobile user is loaded from a random node by a greedy technique.
The following analysis of the metrics shows that the proposed I-CADET outperforms the above-mentioned benchmarks.
In Fig. 3, the throughput of the proposed I-CADET is measured and compared with Follow me cache [23] and the other benchmarking algorithms. Here, throughput is defined as the number of user requests serviced within a certain period of time. Throughput is measured for an increasing target RoI (Region of Interest) area with a fixed number of users (80). The graph shows that I-CADET serves the largest number of users with good QoE. Throughput is also measured in a particular coverage area (RoI) with a varying number of users, as shown in Fig. 4. The proposed I-CADET gives the best throughput as the number of users on the horizontal axis increases.
The predominant factor for measuring the Quality of Experience (QoE) of mobile end users is delay, where low delay represents high QoE. The average end-to-end delay of the proposed I-CADET is therefore measured and compared with the other benchmarking algorithms (Fig. 5). Here, the end-to-end delay is the elapsed round-trip delay between the service request and the content delivery, expressed in milliseconds.
In Fig. 6, the transmission delay of the proposed I-CADET is measured for various packet lengths. Transmission delay is the time taken to push packets from the node (here, the content provider in the content delivery network) onto the transmission path. The packet length is shown on the horizontal axis and the transmission delay in milliseconds on the vertical axis. The transmission delay is directly proportional to the packet size and inversely proportional to the transmission rate; here the delay is measured at a constant transmission rate. Fig. 6 shows that, for the proposed system, the maximum transmission delay for the 500 Mb packet is 0.007 ms. I-CADET outperforms the other state-of-the-art benchmarking techniques in terms of transmission delay.
Jitter is an important measure in multimedia communication for ensuring user QoE. Jitter represents the variation in the delay of continuous streaming. In Fig. 7, the variation in time delay is plotted on the vertical axis against the sequence of packet transmissions on the horizontal axis. The figure shows that the jitter is almost linear for the proposed I-CADET compared with the other benchmarking techniques.
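The proportionality described here is the usual relation: transmission delay = packet length / transmission rate. The small Python helper below illustrates it; the function name and the example numbers are hypothetical and are not taken from the simulation.

```python
def transmission_delay_ms(packet_bits, rate_bps):
    """Time to push a packet onto the link, in milliseconds."""
    if rate_bps <= 0:
        raise ValueError("transmission rate must be positive")
    return packet_bits / rate_bps * 1000.0


# Illustrative values: a 12,000-bit packet on a 100 Mbit/s link.
delay = transmission_delay_ms(12_000, 100_000_000)
# Doubling the packet length at a constant rate doubles the delay.
doubled = transmission_delay_ms(24_000, 100_000_000)
```

In this example the delay is about 0.12 ms, and `doubled` is about 0.24 ms, which matches the linear growth with packet length at a constant rate.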
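One common way to quantify this variation is the mean absolute difference between the delays of consecutive packets. The Python sketch below uses that definition as an assumption, since the exact jitter formula used in the simulation is not stated; the function name and sample delays are hypothetical.

```python
def mean_jitter(delays_ms):
    """Mean absolute difference between consecutive packet delays (ms)."""
    if len(delays_ms) < 2:
        return 0.0
    differences = [abs(later - earlier)
                   for earlier, later in zip(delays_ms, delays_ms[1:])]
    return sum(differences) / len(differences)


# Hypothetical per-packet delays in milliseconds.
jitter = mean_jitter([20, 22, 21, 25])      # differences are 2, 1 and 4
steady = mean_jitter([20, 20, 20, 20])      # constant delay, no jitter
```

Here `jitter` is 7/3 ms (about 2.33 ms) and `steady` is 0.0, so a stream with nearly constant delay scores close to zero regardless of how large the delay itself is.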

Conclusions
In this research article we have proposed an Intelligent framework (I-CADET) for efficient CAche management for Delay sensitive IoT applications. The main aim of the proposed work is to reduce latency and backhaul traffic in the cloud environment in order to attain good Quality of Experience (QoE) for mobile users. Popular contents are identified by the K-Means clustering algorithm by considering the average rating and the number of ratings for a movie within the same cluster; the semigraph-based Connected Edge Dominating Set algorithm is then used to construct the overlay network. A novel node selection algorithm is then used to select the hotspots in which to place the popular content. The performance of the proposed system is evaluated with metrics such as latency, jitter, and throughput, and compared with other benchmarks and an existing cache system. The results show that the average end-to-end delay is reduced by 44% compared with the baseline cache handling methods, and that the throughput is increased by 38% over the considered scenarios. The simulation results show that the proposed I-CADET outperforms the other techniques and attains the required QoE parameters.

Funding
The authors declare that no funds were received for this research.

Data Availability
The dataset used during this research (the MovieLens dataset) is publicly available. It was preprocessed for the research, and the preprocessed data are available from the corresponding author on reasonable request.

Code Availability
Available from the corresponding author on reasonable request.