In this article, to study and analyze articles at the intersection of IoT and deep learning, we have categorized them based on their main idea. From this perspective, the articles fall into four main categories; Figure 2 shows them. The articles in the 'Data Management' category deal with processing IoT data to prepare it for use in deep models; they typically represent the data in a new way, extract appropriate features, or reduce the data dimensionality. In the 'Network' category, the reviewed articles are divided into three groups. The first group focuses on the network technology used or on changes to the IoT network; we place these articles in the 'Network Technology' branch. Deep learning requires substantial resources, including processing power, battery power, and memory for training, so it is often unsuitable for resource-constrained IoT devices; accordingly, some articles present methods for scheduling tasks on the IoT and adjusting the computational load to improve resource consumption, and these articles form the 'Resource Management' group.

Guaranteeing the privacy and security of information is one of the most important considerations in numerous IoT applications, because IoT data are sent over the Internet for analysis and are therefore exposed worldwide. Although many applications rely on anonymization, attackers may still re-identify the supposedly anonymous data. Further, adversarial attacks, such as injecting incorrect training data or probing the model with crafted test inputs, threaten Deep Learning (DL) models, and several system properties (such as availability, reliability, validity, and certainty) may be put at risk by these attacks. We classify the articles that focus on the privacy and security of information into the 'Security' group.
As shown in Figure 2, the 'Computing Environment' category includes cloud, fog, and edge computing, along with their combinations, as well as big data analysis. Since IoT devices have computational constraints and cannot quickly handle deep learning computations, some articles run their computations in a different environment so that they can apply their proposed model. Articles that use deep learning with IoT to present a new application are classified and reviewed in the 'Application' category.
By searching the mentioned databases, 151 articles were found on the Internet of Things and deep learning that met the criteria stated in Table 1. After screening and exclusion, 32 articles remained: 7 conference papers and 25 journal papers. Of the journal papers, 19 were published in Q1 journals, 3 in Q2 journals, and 3, despite their scientific value, had no quartile recorded on https://www.scimagojr.com/journalrank.php.
Table 2 reports the number of articles reviewed from five databases: ACM, IEEE, Science Direct, Springer, and Wiley. Since the Wiley database contained no article in the IoT and deep learning domain, it is omitted from Table 2.
Table 2. The number of selected articles based on publishers and the year of publication

| Publisher | Year | Number of Articles |
| --- | --- | --- |
| ACM | 2018 | 1 |
| IEEE | 2017 | 2 |
| IEEE | 2018 | 15 |
| IEEE | 2019 | 5 |
| Science Direct | 2018 | 5 |
| Science Direct | 2019 | 3 |
| Springer | 2017 | 1 |
We categorized and analyzed the selected articles in four categories: 'Data Management', 'Network', 'Computing Environment', and 'Application'. Figure 3 illustrates the number of publications in each category and their percentage of the reviewed articles. In the following sections, we review the articles in each category.
3-1. Primary Studies of ‘Data Management’
In this section, the primary studies on IoT and deep learning that focus on data management are investigated; they account for 25% of the primary studies.
Wang and Zhang [18] proposed a tensor DL model for heterogeneous data fusion in IoT. The authors used tensor space to model the strongly non-linear distribution of IoT big data, and introduced a tensor distance and a high-order backpropagation algorithm to extend the model from linear to multilinear space. Finally, the proposed algorithm was compared with the stacked autoencoder and multimodal deep learning on the STL-10 and CUAVE datasets, and the authors reported improved accuracy and data fusion compared with these two methods.
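To make the idea of a tensor distance concrete, the sketch below computes one common formulation from the tensor-learning literature, in which a metric matrix weights pairs of entries by how close their coordinates lie inside the tensor. The Gaussian kernel and the `sigma` parameter are assumptions, and the exact metric used in [18] may differ.

```python
import numpy as np

def tensor_distance(X, Y, sigma=1.0):
    """Tensor distance between two equally shaped tensors X and Y.

    Unlike the Euclidean distance on the flattened vectors, the metric
    matrix G weights pairs of entries by how close their coordinates are
    inside the tensor, so spatial relationships between modes are kept.
    """
    coords = np.array(list(np.ndindex(X.shape)), dtype=float)  # coordinate of every entry
    # G[l, m] decays with the distance between the coordinates of entries l and m
    diff = coords[:, None, :] - coords[None, :, :]
    G = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))
    d = (X - Y).ravel()
    return float(np.sqrt(d @ G @ d))

# toy example on two small 2 x 3 x 4 tensors
a, b = np.random.rand(2, 3, 4), np.random.rand(2, 3, 4)
print(tensor_distance(a, b))
```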
In 2018, Liang [19] proposed a fast and smart data processing scheme to speed up deep learning calculations for real-time applications. The main objective of this study was data pre-processing. Generally, data pre-processing falls into two categories: reducing the data to a subset that keeps its main features, and transforming the data, which can eliminate some of the main features. In this study, the pre-processing phase combined both categories to take advantage of each and consequently preserve the physical properties of the original data in the selected subset. The proposed method was evaluated in two scenarios: large-scale datasets and big data. For the large-scale data scenario, the authors proposed SVD-QR for selecting sub-datasets: the SVD is applied to obtain the singular values and their corresponding singular vectors and to determine the size of the subset from the singular values, and QR is then used to select the data samples that serve as deep learning input. In the big data scenario, Limited Memory Subspace Optimization SVD (LMSVD) is applied; it computes the dominant singular values of large matrices based on Krylov subspace optimization, and the data are then selected by applying QR. The proposed method was simulated on handwriting recognition, which is widely used in IoT applications. After data preparation in the pre-processing phase, the data are fed to a deep feedforward neural network in the two scenarios mentioned. The outcomes demonstrate that the method is a powerful pre-processing technique for deep learning and that SVD-QR effectively reduces both the input data and energy consumption.
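As a rough illustration of how SVD can fix the subset size and QR can pick the concrete samples, the following sketch combines numpy's SVD with SciPy's column-pivoted QR; the energy threshold used to choose the rank and the random data are assumptions, not details taken from [19].

```python
import numpy as np
from scipy.linalg import qr

def svd_qr_select(X, energy=0.95):
    """Select a representative subset of samples from X (samples x features).

    SVD determines how many samples are needed (rank k keeping `energy`
    of the spectral energy); QR with column pivoting on the leading left
    singular vectors then picks k concrete rows of X as the reduced input
    for the deep model.
    """
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(s ** 2) / np.sum(s ** 2), energy)) + 1
    # pivoted QR on the k leading left singular vectors (transposed) ranks the rows of X
    _, _, piv = qr(U[:, :k].T, pivoting=True)
    idx = np.sort(piv[:k])
    return X[idx], idx

X = np.random.rand(200, 64)          # e.g. 200 sensor samples with 64 features
X_sub, idx = svd_qr_select(X, 0.9)
print(X_sub.shape, idx[:5])
```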
Bu et al. [20] proposed a multi-projection deep computation model (MPDCM) for smart data in IoT. The MPDCM generalizes the DPDCM by replacing each hidden layer of the deep computation model with a multi-projection layer. First, the MPDCM maps each multi-modal object to various sub-spaces to reveal the hidden characteristics of the different subspaces; then, the multi-projection autoencoder (MPTSE) learns the interactive intrinsic properties by capturing the correlations obtained from mapping the sub-spaces to the output. A learning algorithm based on back-propagation and gradient descent was designed to train the parameters of the MPDCM. The authors then examined the classification accuracy of their idea on the Animal-20 and NUS-WIDE-14 datasets and compared it with the DPDCM. The results indicated that the MPDCM, with an increasing number of sub-spaces, can achieve higher classification accuracy than the DPDCM, which reflects its ability to learn big data features. Although the algorithm uses more subspaces than the DPDCM, its computational and time complexity are almost the same.
Li et al. [21] proposed a Deep Convolutional Computation Model (DCCM), based on the CNN, for heterogeneous industrial big data. The DCCM extends the CNN from the vector space to the tensor space. This tensor-based model can capture the hidden relationships among the various modalities of big data and represent heterogeneous objects: the tensor represents the structure of heterogeneous data while maintaining the raw data structure. These attributes make it possible to investigate complementarity and mutuality across modalities. In addition, the tensor space avoids several issues of the vector space, including singularity and the dimension disaster, so it can be employed in various applications such as feature extraction, pattern recognition, and data fusion. Another advantage of the tensor space mentioned in this article is that it reduces the number of free variables, which avoids over-fitting and shortens the training time. The authors introduced a high-order back-propagation algorithm to train the parameters of the DCCM in the high-order space. The experiments reported on the three datasets CUAVE, SNAE2, and STL-10 indicated that training the deep convolutional computation model took longer than training the CNN, because it uses more weights, but less time than the DCM, because the pooling and local receptive field strategies in the tensor space efficiently reduce the number of weights.
Mohammadi et al. [8] utilized semi-supervised learning to address the lack of labeled data in IoT. They proposed a deep reinforcement learning (DRL) algorithm for Bluetooth Low Energy (BLE) indoor localization in a smart city. The experimental findings suggested that the semi-supervised DRL model is more efficient than the supervised deep learning model. The model uses a Variational Autoencoder (VAE) as the inference engine to generalize optimal policies. Moreover, the proposed model extends deep reinforcement learning to the semi-supervised setting and provides a general framework for all types of IoT programs.
Yao et al. posed four fundamental questions concerning the interaction between human beings and physical objects with the potential of deep learning: which deep neural network structures can effectively process and combine sensor data across different applications? How can the resource demands of DL models be reduced so that they can run on resource-constrained IoT devices? How can confidence measures be calculated for DL predictions? And how can the need for labeled data be reduced? [22]
The DeepSense framework has been introduced as an effective structure for processing and combining sensor data across different applications with minor changes. The framework contains a recurrent neural network (RNN) and a convolutional neural network (CNN) and divides the sensor data into time intervals in order to process time-series data. DeepSense uses a convolutional network on each sensor to encode local features and efficiently combine sensor data, and an RNN to extract temporal patterns. The framework can be used for both classification and estimation problems. In the article, DeepSense was applied to heterogeneous human activity recognition (HHAR) and to user identification with biometric motion analysis (UserID). By comparing DeepSense with other deep learning designs, the authors showed that this framework is more effective than the other methods outlined in the article. Yao et al. also introduced the DeepIoT compression framework, whose purpose is to discover the optimal dropout rate for each hidden element in the neural network. The evaluation of the DeepIoT compression algorithm suggested that it can decrease the network size, energy consumption, and runtime while preserving accuracy; a comparison of this framework with other deep learning methods is also outlined. Further, a brief introduction is given to RDeepSense, a well-calibrated uncertainty estimation algorithm for MLPs. RDeepSense uses a new, tunable loss function that is the weighted sum of the negative log-likelihood and the mean squared error, and the authors showed that this method provides high-quality uncertainty estimation [22].
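The following is a loose tf.keras sketch of the DeepSense idea for a single sensor: per-window convolutions encode local features and stacked GRUs extract temporal patterns across windows. The window counts, channel layout, and layer sizes are assumptions; the original framework additionally applies separate convolutions per sensor before a merge convolution.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# hypothetical shapes: 10 time windows, 25 samples per window,
# 6 channels (3-axis accelerometer + 3-axis gyroscope), 6 activity classes
T, W, C, NUM_CLASSES = 10, 25, 6, 6

inputs = layers.Input(shape=(T, W, C))
# per-window convolutional encoder (local features inside each time window)
x = layers.TimeDistributed(layers.Conv1D(64, 3, activation="relu"))(inputs)
x = layers.TimeDistributed(layers.Conv1D(64, 3, activation="relu"))(x)
x = layers.TimeDistributed(layers.GlobalMaxPooling1D())(x)
# recurrent layers extract temporal patterns across windows
x = layers.GRU(128, return_sequences=True)(x)
x = layers.GRU(128)(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```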
In another article, Yao et al. introduced a semi-supervised framework for IoT [23]. The proposed framework (SenseGAN) consists of three components: a generator, a classifier, and a discriminator. The generator produces data similar to the sensed data, and the classifier labels these data such that the discriminator cannot distinguish between the real and the generated data; this process is repeated until the classifier is well trained. The framework uses a convolutional structure and DeepSense for the classifier. The authors evaluated the framework on three datasets: user identification with biometric motion analysis (UserID), Wi-Fi signal-based gesture recognition (Wisture), and HHAR. The evaluations revealed that the framework works with both labeled and unlabeled data and can improve the classifier's predictive power without additional runtime or energy consumption.
Moreover, Khelifi et al. [24] proposed several methods based on DL models for IoT applications. They introduced the Edge of Information-Centric IoT to decrease latency for time-critical applications. The authors proposed a fusion method to join ICN, IoT, and edge computing, and applied an RNN to process online data, preserving the history of exchanged data for future predictions. A key advantage of this technique is the reduction in data volume and in the computation and processing of diverse data in a real-time scheme.
The primary studies on DL models and IoT that focus on data are surveyed and analyzed above. Our considerations are outlined in Tables 3 and 4. Table 3 reports the reference and year, the main idea of the study, and its advantages and disadvantages.
Table 3. A review of articles focused on 'Data Management'

| Ref & Year | Main Idea | Advantages | Disadvantages |
| --- | --- | --- | --- |
| [18], 2018 | Applying tensor spaces | - Simulating the nonlinear distribution of big data using tensors - Improving heterogeneous data composition - Better detection accuracy than CDL and MML for big data | In the data integration, all data are considered and redundant data are not deleted |
| [19], 2018 | Preprocessing data using SVD-QR and LMSVD-QR | - Increasing computational speed - Reducing energy consumption | Focuses only on pre-processing to increase the speed of computation |
| [20], 2019 | Applying a multi-projection deep computation model | - High accuracy - Time and computational complexity are the same as DPDCM | The classification results are unstable due to the effect of the initial parameters |
| [21], 2018 | Applying tensor space | - Avoiding singularity and the dimension disaster - Achieving better categorization than the DCM and MDL without a high training cost - Reducing the number of weights in the tensor space | The suggested model contains rather many parameters in the high-order tensor space |
| [8], 2018 | Applying semi-supervised deep reinforcement learning to address the lack of labeled data | - The model learned optimal action policies, resulting in a better estimation of the target locations | Disadvantages are not mentioned |
| [22], 2018 | Introducing a new framework for the purposes mentioned in the article | - Reducing the running time - Reducing the need for labeled data - Increasing the accuracy | Further investigations are necessary to better confirm the applicability of the results |
| [23], 2018 | Introducing the SenseGAN framework | - Training the model with 10% of labeled data | Not optimized for multi-sensor data; only considers classification, not regression; the learning process has tricky computation |
| [24], 2019 | Combining edge computing, ICN, and IoT | - Reducing latency for time-critical applications - Reducing the volume of data - Computing and processing in real time - Improving reliability and performance - Mitigating deployment complexity and enhancing the flexibility of network communication | The training time is high |
Table 4 compares the articles in the 'Data Management' domain. In this table, we compare articles based on their year of publication, experimental type, applied deep learning model, compared models, dataset, comparison criteria, and tools and language. The experimental type specifies the type of the proposed scheme: numerical analysis, implementation, simulation, design, or mathematical proof. The applied deep learning model characterizes the model used in the idea expressed in the article. The compared models column shows the models against which the article's approach was compared, while the dataset column lists the database used by the authors. The comparison criteria column shows the criteria used by the authors to compare the models, and the tool and language column describes the tools and programming language used.
Table 4. A review of articles focused on 'Data Management'

| Ref & Year | Experimental Type | Applied Deep Learning Model | Compared Models | Dataset | Comparison Criteria | Tool & Language |
| --- | --- | --- | --- | --- | --- | --- |
| [18], 2018 | Simulation | TDL | SAE[1], MDL[2] | CUAVE, STL-10 | Performance, detection rate | Matlab, Tensorflow |
| [19], 2018 | Simulation | Feed-forward NN | --- | Ex3data1.mat | Running time, accuracy | Matlab |
| [20], 2019 | Implementation | Autoencoder | DPDCM | Animal-20, NUS-WIDE-14 | Accuracy | Not mentioned in the article |
| [21], 2018 | Simulation | DCCM[3] | DCM[4], MDL | CUAVE, SNAE2, STL-10 | Accuracy | Not mentioned in the article |
| [8], 2018 | Implementation | DRL | Supervised method | Prepared from a real-world deployment of a network of iBeacons | Accuracy, average reward, average distance | Tensorflow, Keras |
| [22], 2018 | Simulation | CNN, RNN | DS-singleGRU, DS-noIndvConv, DS-noMergeConv, SparseSep | HHAR[5], UserID[6] | Running time, accuracy | Not mentioned in the article |
| [23], 2018 | Simulation | GAN | SenseGAN, Semi-RF, S3VM, DeepSense, RF, SVM | HHAR, UserID, Wisture[7] | Run time, energy consumption, accuracy, F1 score | Not mentioned in the article |
| [24], 2019 | Design | CNN, RNN, RL | --- | --- | --- | Not mentioned in the article |
3-3. Primary Studies of Network
In this section, the primary studies on IoT and deep learning that focus on the network are investigated; they account for 31% of the primary studies.
A framework based on Software-Defined Networking (SDN) was presented in [25]. The proposed architecture was scalable, flexible, and secure for IoT. It included an IoT layer and an SDN layer; the SDN layer consisted of a control layer and a forwarding layer and was built around an intrusion detection system, a hardware and software system designed to monitor traffic for attacks. The authors used Restricted Boltzmann Machines (RBM) for intrusion detection. The network operates in two steps, forward and backward. In the forward step, the hidden nodes are a function of the input, weights, biases, and the activation function, with the initial decision being stochastic; the output of this step is a probability vector. In the backward step, samples of the output are drawn and used to reconstruct the input. The authors used the KDD99 dataset to detect four types of attacks and intrusions. They claimed that the accuracy of intrusion detection was far higher than that of other methods, improving it by about 9%.
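A minimal numpy sketch of the RBM forward/backward steps described above, together with one contrastive-divergence (CD-1) update, is given below; the layer sizes and learning rate are assumptions, and bias updates are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 41, 16      # e.g. the 41 features of a KDD99 record, 16 hidden units (assumption)
W = rng.normal(0, 0.01, (n_visible, n_hidden))
b_v = np.zeros(n_visible)         # visible biases
b_h = np.zeros(n_hidden)          # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(v):
    """Forward step: hidden activation probabilities and a stochastic sample."""
    p_h = sigmoid(v @ W + b_h)
    return p_h, (rng.random(p_h.shape) < p_h).astype(float)

def backward(h):
    """Backward step: reconstruct the visible units from a hidden sample."""
    p_v = sigmoid(h @ W.T + b_v)
    return p_v, (rng.random(p_v.shape) < p_v).astype(float)

def cd1_update(v0, lr=0.05):
    """One contrastive-divergence (CD-1) weight update on a batch of records v0."""
    p_h0, h0 = forward(v0)
    p_v1, _ = backward(h0)
    p_h1, _ = forward(p_v1)
    W_grad = v0.T @ p_h0 - p_v1.T @ p_h1
    return W + lr * W_grad / len(v0)

batch = rng.random((32, n_visible))   # stand-in for normalized traffic records
W = cd1_update(batch)
```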
In another article [26], SDN was used to dynamically manage the industrial IoT, and the Software-Defined Industrial Internet of Things (SDIIoT) was introduced. In SDIIoT, a large amount of data and many flows are created by industrial instruments, and the physical controller is distributed but logically centralized. One of the most difficult issues, however, is how to reach agreement between multiple controllers in complex industrial environments. In this article, a consensus protocol based on blockchain (BC) is proposed: for distributed SDIIoT, a BC can act as a trusted, out-of-band third party that coordinates the different SDN controllers in a secure, reliable, and traceable way. The authors used a permissioned BC because of its low cost, lower delay, and low bandwidth requirements. They formulated view changes, access selection, and the allocation of computing resources as a joint optimization problem and, modeling it as a Markov decision process, proposed a novel dueling deep Q-learning method. The simulation results showed the convergence and efficiency of this technique. Finally, the authors argued that measuring the trustworthiness of nodes is an important matter for future studies.
McDermott et al. [27] provided a solution to identify botnet activity on consumer IoT devices and networks. Their method is based on a Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN). Word embedding was used to recognize the text and convert attack packets into a tokenized format. The method was compared with an LSTM-RNN in terms of accuracy and loss and used to identify four attack vectors of the Mirai botnet. The authors created a dataset containing these four attack vectors and then tested and evaluated the model on them: UDP, Mirai, ACK, and DNS. The method worked properly for the UDP, DNS, and Mirai attack vectors, with respective rates of 98%, 98%, and 99%; however, it did not perform as well for the ACK attack vector, as it requires more training data.
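A compact tf.keras sketch of such a pipeline, word embedding followed by a bidirectional LSTM classifier, is shown below; the vocabulary size, sequence length, and layer sizes are assumptions rather than the settings used in [27].

```python
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN, NUM_CLASSES = 5000, 100, 2   # hypothetical tokenizer settings

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, 64),              # word embedding of the tokenized packet text
    layers.Bidirectional(layers.LSTM(64)),         # bidirectional LSTM over the token sequence
    layers.Dense(NUM_CLASSES, activation="softmax"),   # attack vs. benign (or per attack vector)
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```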
In another article [28], a wireless device identification platform using deep learning was introduced to provide security. Radio-frequency (RF) fingerprinting, one of the physical-layer authentication methods, can be employed to detect licensed (allowed) wireless devices, and DL is a suitable way to learn the features of various RF devices from their RF data. The article focused on ZigBee devices (IEEE 802.15.4) in wireless sensor networks, where ZigBee can use multiple network topologies such as star, tree, and mesh to transfer data from the source to the base station or to other peer-to-peer nodes. For training, RF (IQ) signals are collected by a USRP device from multiple ZigBee devices that are already registered on the network and are labeled accordingly. Using these labeled data, the proposed DL-based model is constructed; after training, the model can be applied to identify devices on the network. The approach is transparent and passive to the RF devices, so no additional software or hardware has to be installed on the wireless tools. Six different ZigBee devices were considered, each configured to transmit at five distinct SNR levels. The data collected from these six devices amounted to 300 gigabytes, suitable for training a deep learning model. The authors investigated three deep learning models, namely DNN, CNN, and LSTM. The benefit of DL is its ability to extract features automatically. The classification results can be used for intrusion detection, and a warning can be issued when an unregistered attacker breaches security.
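As an illustration only, the sketch below builds a small 1-D CNN over raw IQ samples (I and Q as two input channels) that classifies which of the six registered ZigBee devices produced a capture; the window length and layer sizes are assumptions, and [28] also evaluates DNN and LSTM variants.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_SAMPLES, N_DEVICES = 1024, 6    # 1024 IQ samples per example is an assumption; 6 ZigBee devices

model = models.Sequential([
    layers.Input(shape=(N_SAMPLES, 2)),            # I and Q as two channels
    layers.Conv1D(64, 7, activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(128, 5, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(N_DEVICES, activation="softmax"),  # which registered device emitted the capture
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```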
In [29], the authors introduced a deep learning-based system for detecting anomalies in the Industrial Internet Control System (IICS). The proposed method consists of two stages, unsupervised and supervised, with the unsupervised stage providing the initial values for the supervised stage. AL-Hawawreh et al. used a Deep Auto-Encoder (DAE) to provide the initial weights of the model parameters in the unsupervised stage; the autoencoder reconstructs its input data at the output and attempts to minimize the reconstruction error. The trained autoencoder parameters then constitute the starting point, i.e., the initial weights and biases, of the Deep Feed-Forward Neural Network (DFFNN). The authors evaluated the model on the UNSW-NB15 and NSL-KDD datasets and compared it with other methods. They found that their method outperformed the others because of its dimension reduction and automatic feature extraction. Further, normal behavior and attacks were easy to detect, as the model was initially trained on a normalized dataset, and the extra training performed on the model also protected the system against complex attacks.
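The two-stage idea, unsupervised autoencoder pre-training followed by transferring the encoder weights into a supervised DFFNN, can be sketched in tf.keras as follows; the feature count and layer widths are assumptions, not the exact configuration of [29].

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_FEATURES, N_CLASSES = 42, 2     # hypothetical feature count after encoding; normal vs. attack

# unsupervised stage: a deep autoencoder learns to reconstruct the traffic records
inp = layers.Input(shape=(N_FEATURES,))
e1 = layers.Dense(32, activation="relu", name="enc1")(inp)
e2 = layers.Dense(16, activation="relu", name="enc2")(e1)
d1 = layers.Dense(32, activation="relu")(e2)
out = layers.Dense(N_FEATURES, activation="linear")(d1)
autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(x_train, x_train, epochs=20)    # train on normalized records

# supervised stage: the encoder weights initialize the DFFNN classifier
clf_inp = layers.Input(shape=(N_FEATURES,))
h1 = layers.Dense(32, activation="relu")(clf_inp)
h2 = layers.Dense(16, activation="relu")(h1)
logits = layers.Dense(N_CLASSES, activation="softmax")(h2)
dffnn = models.Model(clf_inp, logits)
dffnn.layers[1].set_weights(autoencoder.get_layer("enc1").get_weights())
dffnn.layers[2].set_weights(autoencoder.get_layer("enc2").get_weights())
dffnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```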
Ayadi et al. [30] used deep learning in Named Data Networking (NDN) routers so that the router could forward packets intelligently. In the forwarding strategy, the component responsible for selecting the next hop does so according to route cost, forwarding metrics, and local policies. A neural network is one way to predict the drop rate in the network from traffic prediction. The authors considered the predicted overload probability on each link as one of the signals for minimizing the drop rate under the new network status, thereby increasing the forwarding throughput. To design the DNN model, they used a two-layer feed-forward network with a sigmoid activation on the hidden layer and a linear activation on the output layer, trained with the backpropagation algorithm. A dataset of static information for each router was built from the traffic on each link. They used the dataset available from Dongo et al.'s project to evaluate their model and achieved 99.23% accuracy with four hidden neurons and over 70 epochs without over-fitting. Assuming NDN support on IoT devices, Ayadi et al. also applied the proposed method to an NDN-based audio conference.
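A minimal tf.keras version of the described predictor, a two-layer feed-forward network with a sigmoid hidden layer (four neurons, as reported) and a linear output trained by back-propagation, might look like this; the number and meaning of the input features are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_LINK_FEATURES = 4   # assumption: per-link traffic statistics used as input

model = models.Sequential([
    layers.Input(shape=(N_LINK_FEATURES,)),
    layers.Dense(4, activation="sigmoid"),   # hidden layer with four sigmoid neurons, as in [30]
    layers.Dense(1, activation="linear"),    # linear output: predicted overload/drop probability
])
model.compile(optimizer="sgd", loss="mse")   # trained with back-propagation
# model.fit(link_stats, drop_rates, epochs=70)   # hypothetical training data
```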
In [31], the authors studied the large volume of information collected in IoT and balanced the load on the network. They introduced an agent called LoadBot for effective load balancing in the IoT domain; this agent measures the load factor of the network, analyzes the structural configuration under an enormous volume of data and network load, and, by applying deep-belief learning, obtains an effective load balance. They also introduced another agent, called BalanceBot, based on deep Q-learning to predict the network load. They created a grid-structured map using a deep belief network built from RBMs. Reinforcement learning here is done through actions while experiencing the load in a given environment, and a scalar reinforcement value is obtained from evaluating the selected action. The Q-learning method does not compute the desired action from a model of the current state; instead, it learns through trial and error toward optimal operation under the experienced conditions, as sketched below. Kim et al. used the Neural Prior Ensemble method described in [18] to predict the network load: when new data arrive, they accumulate up to a certain amount, are transformed into a new deep belief network, and are saved. The new belief network then receives previous information about the network by combining LoadBot and the Neural Prior Ensemble, learning the entire network load and extracting the weight-change process. By simulating their proposed scheme, they found that as the number of sensors increased, the number of migrations did not increase, and compared to dynamic methods the proposed method was closer to the optimal mode.
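The tabular Q-learning update underlying this trial-and-error learning can be sketched as follows; the state/action discretization and the stand-in reward are assumptions, and [31] additionally feeds the state through a deep belief network.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 50, 5     # hypothetical discretized load states and migration actions
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.9, 0.1

def step(state, action):
    """Stand-in for the environment: returns (reward, next_state).
    In [31] the reward would reflect how well the load is balanced after a migration."""
    return -rng.random(), rng.integers(N_STATES)

state = rng.integers(N_STATES)
for _ in range(1000):
    # epsilon-greedy action selection over experienced load conditions
    action = rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(Q[state]))
    reward, next_state = step(state, action)
    # Q-learning update: learn from trial and error rather than from a model of the environment
    Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
    state = next_state
```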
This study investigated a DL framework for dynamic watermarking in IoT, enabling the IoT-cloud framework to identify cyber-attacks and authenticate reliability. The proposed algorithm used an LSTM to extract stochastic properties such as spectral flatness, skewness, kurtosis, and the central moments of the IoT signals; these properties are watermarked and placed alongside the original signal. Because the features are extracted in the cloud, an attacker cannot read the watermarked data, and an eavesdropping attack will not succeed. The LSTM reduced the complexity and the delay of attack identification compared with other security models. According to the simulation, the attack was detected in less than one second and the IoT signals could be sent with high reliability from the IoT device to the cloud [32].
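The statistical fingerprints mentioned above can be computed directly with numpy and SciPy, as in this sketch; which central moments are used and how the features are embedded as a watermark are assumptions here.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def watermark_features(signal, eps=1e-12):
    """Statistical features of one IoT signal window (see [32]): spectral flatness,
    skewness, kurtosis, and a few central moments."""
    power = np.abs(np.fft.rfft(signal)) ** 2 + eps
    spectral_flatness = np.exp(np.mean(np.log(power))) / np.mean(power)  # geometric / arithmetic mean
    central_moments = [np.mean((signal - np.mean(signal)) ** k) for k in (2, 3, 4)]
    return np.array([spectral_flatness, skew(signal), kurtosis(signal), *central_moments])

window = np.random.randn(256)          # stand-in for one accelerometer window
print(watermark_features(window))
```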
In another article, Ferdowsi et al. [33] combined the aforementioned method with game theory to accelerate the gateway's decision-making for identifying vulnerable IoT devices. They stated that in massive IoT scenarios, verifying all devices at the same time is not possible due to computational resource constraints, and a game-theoretic framework can improve the identification of vulnerable devices. They provided two learning algorithms: a fictitious play algorithm, which integrates the complete information about the state of the IoT devices and converges to a mixed-strategy Nash equilibrium, and a deep reinforcement learning algorithm based on LSTM blocks, which learns the safe state from previous gateway states and can predict the state of the IoT devices when the gateway's information is incomplete. The simulation results showed improved system protection, reducing the number of compromised IoT devices by about 30%.
Zhu and colleagues [34] proposed a novel deep Q-learning-based transmission scheduling mechanism for the cognitive IoT. The mechanism uses a Markov decision process to describe the various transmission states. They proposed a relay equipped with a Q-learning algorithm that transmits packets from other nodes to the sink to improve performance, and used stacked autoencoders to speed up the mapping between states and actions. The model was compared with three algorithms: Strategy Iteration (SI), W-learning (WL), and Random Selection (RS). They demonstrated that their model outperforms Random Selection and W-learning in terms of throughput, packet loss, and system utility, and also outperforms WL in power consumption. Although the proposed algorithm performed below the SI algorithm, its complexity was lower, so it can be applied in practical scenarios. The authors argued that the proposed algorithm could be improved by using multiple relays or cooperative relays.
The primary studies in the field of deep learning and IoT that focus on the network are surveyed and analyzed above. Our considerations are outlined in Tables 5 and 6. Table 5 lists the reference and year, main idea, advantages, and disadvantages.
Table 5. A review of articles focused on 'Network'

| Ref & Year | Main Idea | Advantages | Disadvantages |
| --- | --- | --- | --- |
| [25], 2018 | Proposing an SDN-based framework for intrusion detection | - Scalable - High detection rate | - Requires practical implementation of the proposed architecture |
| [26], 2019 | Dueling deep Q-learning and blockchain | - Increasing throughput - High performance - No need for edge computing | - Does not consider the trust features of nodes and controllers - No comparison with other methods |
| [27], 2018 | Developing a detection model based on BLSTM-RNN | - High accuracy - Low loss | - Does not work well for the ACK attack vector because it requires more training data |
| [28], 2018 | A deep learning model for identifying devices on a wireless network | - High accuracy - Automatic feature extraction - Passive and transparent to RF devices | No comparison with other methods |
| [29], 2018 | Providing a two-stage deep learning method for anomaly detection | - Dimension reduction - Automatic feature extraction - Resistant to complex attacks - High accuracy for attack detection | Selecting the parameters for the pre-processing phase is important, although this may not be a main concern given the current availability of complex and fast equipment |
| [30], 2018 | Deep learning for packet forwarding | - Reducing the complexity of dynamic network prediction - Depending on routing protocols - Avoiding congestion | - Other deep learning models could be used instead - No comparison with other methods - Random initialization of the model |
| [31], 2017 | Proposing LoadBot and BalanceBot for load balancing | - Migrations do not increase as the number of sensors increases - Close to the optimal mode | - Comparison with more models should be done |
| [32], 2018 | Deep learning-based dynamic watermarking | - Identifies the attack in less than 1 s - High reliability - High accuracy - Low complexity | Cannot authenticate all devices simultaneously |
| [33], 2019 | Game theory and deep reinforcement learning to accelerate gateway decision-making | - Suitable for massive IoT - Identifies the attack in less than 1 s - High reliability - High accuracy - Low complexity | |
| [34], 2018 | Using deep learning and Q-learning for a transmission scheduling mechanism | - Low complexity - Applicable to practical applications | - Lower performance than the SI algorithm |
Table 6 functions like Table 4, comparing the articles based on publication year, experimental type, and the other features.
Table 6. A review of articles focused on 'Network'

| Ref & Year | Experimental Type | Applied Deep Learning Model | Compared Models | Dataset | Comparison Criteria | Tool & Language |
| --- | --- | --- | --- | --- | --- | --- |
| [25], 2018 | Simulation | RBM | SVM, PCA | KDD99 | Precision | Tensorflow, Python |
| [26], 2019 | Simulation, theoretical analysis | Dueling deep Q-learning | --- | Simulated in a real environment | Throughput | Tensorflow, Python |
| [27], 2018 | Simulation | BLSTM-RNN | LSTM-RNN | Dataset made by the authors | Accuracy, loss | TensorFlow, Keras, Theano |
| [28], 2018 | Implementation | CNN, DNN, LSTM | --- | Dataset of data from ZigBee devices | Complexity, performance, training time, test time | Tensorflow, Keras |
| [29], 2018 | Simulation | DAE[8], DFFNN | F-SVM, CVT, DMM, TANN, DBN, RNN[9], DNN | NSL-KDD, UNSW-NB15 | Accuracy, detection rate, FPR, ROC, CPU time | R |
| [30], 2018 | Simulation | FFNN[10] | --- | Dataset of Dongo et al. | Average download latency, throughput, cache hit ratio, interest overhead | Python, Numpy, ndnSIM |
| [31], 2017 | Simulation | RBM | --- | No dataset used | Number of migrations | Matlab |
| [32], 2018 | Simulation | LSTM | --- | A real dataset from an accelerometer | Time to detect an attack, accuracy | Not mentioned in the article |
| [33], 2019 | Simulation, mathematical proof | LSTM | --- | A real dataset from an accelerometer | Attack detection | Not mentioned in the article |
| [34], 2018 | Simulation | Stacked autoencoder | Strategy Iteration (SI), W-learning (WL), Random Selection (RS) | No dataset used | Throughput, power, packet loss, system utility | Not mentioned in the article |
3-4. Primary Studies of Computing Environment
In this section, the primary studies on IoT and deep learning that address the computing environment are investigated; they account for 25% of the primary studies.
Xuan et al. [35] compared and evaluated three representative methods, parallel acceleration, quantization, and model pruning, for enabling deep learning on the IoT. They measured the impact of these methods on the Nvidia Tegra X2 platform, which integrates ARM cores and a GPU. Two kinds of methods are used to run DL models at the edge of the IoT: the first is to equip IoT devices with CPUs or ARM processors to enhance their processing power, and the second is to use a middle layer to pre-process and prepare the data for processing on the IoT. The authors used both. For hardware acceleration, they considered multi-core implementation and optimized instructions; for the second method, a lightweight DL model was proposed and model pruning was evaluated. Finally, the quantization method, which concerns optimization at the level of both the hardware and the DL model, was applied. The deep learning model used in this study was a CNN, which was evaluated in different configurations.
Wei et al. [36] pointed to the problems of cloud-based IoT and provided a solution for its long delays and back-haul bandwidth usage. They used fog-based IoT to reduce service delay and conserve back-haul bandwidth. However, the performance of fog-based IoT depends on effective and intelligent management of the network resources, so the joint coordination of storage, communication, and computation is one of the major challenges to be solved. Their study jointly addressed the content storage strategy, the computation offloading policy, and radio resource allocation in order to provide a joint optimization solution based on deep learning for fog-based IoT. Since the service requests and wireless signals exhibit random features, they used an actor-critic reinforcement learning (RL) framework to solve the joint decision problem of minimizing latency. A DNN is employed in both the actor and the critic: in the critic it serves as the approximation function for estimating the value function over the huge state and action space, while in the actor it represents the parametric stochastic policy, which is improved with the critic's help. The authors also used the policy gradient method to avoid converging to a local maximum. The simulation results for offloading indicated that tasks with a low computational load were performed at the edge while those with heavy computational loads were performed in the cloud, which achieved better efficiency. Also, the average end-to-end service latency was reduced by increasing the number of nodes, since more bandwidth and sub-channels could be assigned to each user.
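A bare-bones actor-critic update of the kind described, a softmax policy network trained with the TD error produced by a value critic, is sketched below; the state and action dimensions, network sizes, and the use of negative latency as the reward are assumptions, not details of [36].

```python
import tensorflow as tf

STATE_DIM, N_ACTIONS = 8, 4      # hypothetical: node/channel state features and offloading choices
GAMMA = 0.99

actor = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(STATE_DIM,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_ACTIONS, activation="softmax"),   # stochastic offloading/caching policy
])
critic = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(STATE_DIM,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),                                 # approximate state value
])
actor_opt = tf.keras.optimizers.Adam(1e-3)
critic_opt = tf.keras.optimizers.Adam(1e-3)

def train_step(state, action, reward, next_state, done):
    """One actor-critic update; state/next_state are 1-D feature vectors,
    reward could be the negative observed service latency."""
    state = tf.convert_to_tensor([state], tf.float32)
    next_state = tf.convert_to_tensor([next_state], tf.float32)
    with tf.GradientTape(persistent=True) as tape:
        v = critic(state)[0, 0]
        v_next = tf.stop_gradient(critic(next_state)[0, 0])
        td_error = reward + GAMMA * v_next * (1.0 - float(done)) - v
        critic_loss = tf.square(td_error)
        log_prob = tf.math.log(actor(state)[0, action] + 1e-8)
        actor_loss = -log_prob * tf.stop_gradient(td_error)    # policy-gradient step guided by the critic
    critic_opt.apply_gradients(zip(tape.gradient(critic_loss, critic.trainable_variables),
                                   critic.trainable_variables))
    actor_opt.apply_gradients(zip(tape.gradient(actor_loss, actor.trainable_variables),
                                  actor.trainable_variables))
    del tape
```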
Tang et al. [37] deployed a convolutional neural network on IoT hardware. They applied the SqueezeNet architecture and the ARM Compute Library (ACL) for the implementation, to boost the deep learning processing speed and to overcome the latency of offloading the computation. The authors used the Nanomsg messaging framework to exchange messages between tasks on the IoT device and an NNVM-based compiler to optimize the model. The method was also implemented in Tensorflow with ARM NEON vector optimizations enabled, and NEON-capable building blocks were used in the development of the SqueezeNet engine. By comparing Tensorflow and ACL, they found that despite the higher memory and power consumption of the ACL library, its runtime was better by about 150 milliseconds.
Nowadays, fog computing is popular in IoT because the bandwidth and computational resources of centralized clouds are limited and the cloud alone is not sufficient to process and analyze such large amounts of data. In this regard, Lyu et al. [38] presented a three-level Fog-embedded Privacy-Preserving Deep Learning framework (FPPDL) to address challenges such as privacy, response delay, and computation and communication bottlenecks. In their proposal, computation is performed at fog nodes close to the end devices, and a two-level privacy-preserving mechanism is developed. Experimental results on three benchmark image classification datasets demonstrated that the proposed framework offers good accuracy, provides fault tolerance, and is scalable.
Diro et al. [39] presented a distributed DL model capable of parallelizing the training and sharing the parameters among local fog nodes. They analyzed their model on the NSL-KDD dataset for intrusion detection in computer networks, performing the analysis in two settings: (i) two classes (normal and attack) and (ii) four classes (Normal, DoS, Probe, R2L.U2R). They used test data to detect zero-day attacks, which occur frequently due to the variety of protocols in the IoT. The proposed algorithm pursued two goals: the first was to compare distributed attack detection with a centralized system by implementing the DL model on a single node and on numerous coordinated nodes, and the second was to evaluate the impact of the DL algorithm against shallow learning for detecting attacks on IoT-based systems. Following hyper-parameter optimization, the DL system used 123 input features, with 150 neurons in the first layer, 120 and 50 neurons in the second and third layers, respectively, and a final layer with as many neurons as classes. The model was trained with various batch sizes over 50 epochs and applied dropout to avoid over-fitting.
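The reported architecture maps almost directly onto a tf.keras model, as in the sketch below; the dropout rate, optimizer, and batch size are assumptions, since [39] only reports the layer widths, the 50 epochs, and the use of dropout.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_FEATURES, N_CLASSES = 123, 4   # 123 input features; 4 classes (Normal, DoS, Probe, R2L.U2R)

model = models.Sequential([
    layers.Input(shape=(N_FEATURES,)),
    layers.Dense(150, activation="relu"),
    layers.Dropout(0.2),                  # dropout is used in [39]; the rate here is an assumption
    layers.Dense(120, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(50, activation="relu"),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, batch_size=256, epochs=50)   # 50 epochs as described; batch size varied
```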
Li et al. [40] provided an elastic model compatible with various deep learning models on the IoT with edge computing, as well as an online algorithm for improving service capacity and offloading strategies. Because different DL models produce intermediate data of different sizes and have different pre-processing overheads, this study formulates a scheduling problem: maximizing the number of DL tasks given the limited network bandwidth and service capacity of the edge nodes. Offline and online algorithms were then introduced to solve the problem. The scheduling algorithm was used to reduce the traffic load of the network while transmitting data from the sensors to a cloud server. The article notes the edge server's limitations relative to the cloud server and exploits the fact that deep learning reduces the data size in higher layers, so placing some layers on the edge server can relieve network traffic; nevertheless, the edge server has its own capacity limitations. Li and colleagues first trained the DL network on a cloud server and then divided the network into two parts: the lower layers, close to the input data, and the higher layers, close to the output data. The first part was deployed on the edge server while the second part remained on the cloud server. An algorithm was proposed to schedule the maximum number of tasks in the edge computing structure by placing the deep layers on the IoT edge server, while guaranteeing the transmission delay required by each task. A CNN was used in this article, and a ten-task workload was run with various CNN networks; the number of operations and the amount of intermediate data created in each layer were recorded. Li et al. showed that the input data are reduced by the DL networks and that most of the reduction of intermediate data happens in the lower layers, while the computational overhead grows with additional layers.
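A minimal illustration of this layer-splitting idea is to cut a trained CNN into an edge part and a cloud part and pass only the intermediate tensor across the network; the toy architecture and split point below are assumptions, not the networks evaluated in [40].

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# a small CNN standing in for the model trained in the cloud
full = models.Sequential([
    layers.Input(shape=(64, 64, 3)),
    layers.Conv2D(16, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

SPLIT = 4                              # lower layers (close to the input) run on the edge server
edge_layers = full.layers[:SPLIT]
cloud_layers = full.layers[SPLIT:]

def run(layer_list, x):
    """Apply a list of already-trained layers in sequence."""
    for layer in layer_list:
        x = layer(x)
    return x

x = tf.random.normal((1, 64, 64, 3))   # a sensor image arriving at the edge
intermediate = run(edge_layers, x)     # computed on the edge; smaller than the raw frame
prediction = run(cloud_layers, intermediate)   # only the reduced tensor crosses the network
print(intermediate.shape, prediction.shape)
```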
In another article, Zhang et al. [41] presented an Adaptive Deep Computation Model (ADCM) for learning the characteristics of big data in the industrial IoT. An adaptive dropout rate, determined by an adaptive distribution function, is used to prevent over-fitting and to adjust the activation rate according to the position of the layer. They also used crowdsourcing, which combines human intelligence with machine power to reduce the problem of the availability of training samples, to collect labeled samples for training the model parameters; crowdsourcing combined with cloud computing can enhance the performance of the deep computation model. To obtain labeled training samples, some unlabeled examples are transmitted to the cloud platform, and the labels are then obtained by collecting the responses provided by human workers on the platform; the response collection method, SLME[11], is designed for multiple labeling. The authors simulated their method on the CUAVE and SNAE2 datasets and found that the model is suitable for preventing over-fitting and for providing labeled examples for training the deep computation model. To evaluate the stability of the method, all models were trained five times and the average classification accuracy was used to validate the efficiency of the proposed model. The authors stated that the results could be further improved by working on the initialization of the model.
The authors of [42] presented an edge-based framework to establish a trade-off between communication cost and data freshness. In this framework, the caching of IoT transient data is learned intelligently through deep reinforcement learning (DRL) and a Markov Decision Process (MDP): the caching policy is selected and learned from the history and the current raw observations of the environment. The results suggested that the long-term cost to the user was reduced while the long-term utility of fetching transient data items increased.
In addition, the primary studies in the field of deep learning and IoT that focus on the computing environment are surveyed and analyzed above. Our considerations are outlined in Tables 7 and 8. Table 7 reports the reference and year, main idea, advantages, and disadvantages.
Table 7. A review of articles focused on 'Computing Environment'

| Ref & Year | Main Idea | Advantages | Disadvantages |
| --- | --- | --- | --- |
| [35], 2018 | Comparing representative approaches | DL capability on the IoT edge | Applying hardware and software together can have a good impact on performance, but hardware acceleration is costly |
| [36], 2019 | Jointly using content storage, a computation offloading policy, radio resource allocation, and fog computing | - Minimizing the average end-to-end latency | |
| [37], 2017 | - Using CNN inference engines on IoT devices - Building the inference engine - Reviewing offloading | - Overcoming the latency restrictions of offloading | - ML/DL-based models require computational power, not low power consumption, and sizeable storage to keep the model - Making migration to on-board deployment more challenging |
| [38], 2019 | A fog-based deep learning framework that protects privacy | - Greatly reduced communication and computation costs - A favorable trade-off between privacy and performance | - Results need more study using other datasets |
| [39], 2018 | Introducing a distributed deep learning model to identify attacks | - Better performance than the centralized model | More comparisons could be performed with machine learning algorithms and other datasets |
| [40], 2018 | - Using edge computing along with deep learning - Reducing input data using deep learning | - Better performance when deep learning is used with a lot of data - Automatic feature extraction - Privacy preservation when transferring intermediate data | At the beginning, the proposed algorithm is less efficient than the other algorithms and shows its performance only after a long time |
| [41], 2019 | Adaptive dropout; crowdsourcing method; improved SLME | - Preventing over-fitting - Solving the problem of the lack of labeled data | - The model output depends on the initialization |
| [42], 2019 | Using a DRL model to solve the problem of caching IoT data at the edge without knowledge of the future popularity of IoT data | A trade-off between loss of data freshness and communication cost | Cooperative caching in IoT systems with multiple edge nodes is not considered |
Again, Table 8 offers the same information as Tables 4 and 6, but for the 'Computing Environment' domain.
Table 8. A review of articles focused on 'Computing Environment'

| Ref & Year | Experimental Type | Applied Deep Learning Model | Compared Models | Dataset | Comparison Criteria | Tool & Language |
| --- | --- | --- | --- | --- | --- | --- |
| [35], 2018 | Implementation | CNN | Light-weight CNN, deep CNN | VGGFACE2 | Accuracy, speed-up | ARM platform, Nvidia TensorRT |
| [36], 2019 | Simulation | DNN | --- | Dataset made by the authors | Service latency | Not mentioned in the article |
| [37], 2017 | Implementation | CNN | --- | No dataset used | Execution time, latency | TensorFlow, ACL |
| [38], 2019 | Implementation | MLP | Centralized, Standalone, DSSGD[12] | MNIST, SVHN | Accuracy, fault tolerance, computation, communication bandwidth, response delay | TensorFlow |
| [39], 2018 | Implementation | Multi-layer deep network | --- | NSL-KDD | Accuracy, DR, precision, recall, F-measure | Keras, Theano, Spark |
| [40], 2018 | Simulation | CNN | --- | An open dataset from Kaggle | Reduced data and operations, number of deployed tasks | Caffe, Python, NetworkX library |
| [41], 2019 | Simulation | ADDCM, ADDCM_DDC | DCM, DDCM | CUAVE, SNAE2 | Accuracy | Matlab |
| [42], 2019 | Simulation | DRL | LRU, Least Fresh First (LFF) | No dataset used | Cache hit ratio, freshness, cost | TensorFlow, Python |
3-5. Primary Studies of Application
In this section, the primary studies on IoT and deep learning that focus on applications are investigated; they account for 19% of the primary studies. The articles in this section use standard deep learning models for a particular or novel application, and the authors' aim is to apply and compare different types of applications with common deep learning models.
Sundaravadivel et al. [43] implemented a deep learning system for health monitoring, called Smart-Log. A five-layer DL model based on a perceptron neural network with compact hidden layers was built to manage nutrition across meals. They also introduced a new Bayesian network-based algorithm for determining the nutritional content of foods and suggesting meals and recipes, together with an accuracy analysis of different Bayesian classifiers. The built system had a smart sensor board connected to a mobile application; the board included weight sensors for food, and the food weight was sent wirelessly to the cloud. The nutrition facts were obtained through the smartphone camera via the smartphone application, and the system then provided nutrient values. The user could access the computed nutrient values and predictions through the smartphone application.
To monitor patients' nutrition, Vellappally et al. [44] installed chips in the patients' teeth: an electrochemical sensor that collects information about the consumed food, including salt, fat, sugar, and so forth. The collected data can thus be used to evaluate the quality of the consumed food. The collected information was then processed using bacterial optimization and a DL network, which reviews the IoT information through a self-learning process. According to the analysis, the IoT device in the teeth reduced mastication problems. The IoT-based food quality prediction system was implemented in MATLAB, where data from 53 patients were collected and 15 of them were used to test the model.
In another study, deep learning was used in the medical IoT, which can collect massive medical data from ultrasound images, radiography, and magnetic resonance imaging. In this article, Yao et al. learned features from the input data using a CNN model with the back-propagation learning algorithm, and categorized and analyzed the various types of gallbladder stones [45].
Sun and colleagues [46] employed CNN, RNN, and hashing technology to provide a user interface for natural image and natural language queries. Their proposed architecture consists of four modules: image training, user query and text processing, image retrieval, and data storage. In the proposed architecture, both semantic information and image cognition are important. They evaluated the proposed method on a 4S online store and observed that it could be an effective platform.
In [47], an intelligent agricultural system based on deep learning was presented. In addition to predicting suitable products for the next crop cycle, the article also considered optimizing the field irrigation system. A wireless sensor network was used to collect supervisory data about soil parameters and upload the data to the cloud; after analysis by an LSTM, the results were sent to the user by SMS.
Wang et al. [48] used deep learning for indoor localization, healthcare sensing, and activity recognition. They provided a DL framework for RF sensing, whose deep learning models include an autoencoder, a CNN, and an LSTM. The framework consists of a data collection section, a preprocessing section, an offline section for training the deep learning model, an online section for testing on new data, and finally a conclusion section. The results indicated that the proposed framework was more accurate in the three mentioned applications.
The primary studies in the field of deep learning and IoT that focus on applications are surveyed and analyzed above. Our considerations are outlined in Table 9, which lists the reference and year, the name of the application, the applied deep learning model, and the tool.
Table 9. A review of articles focused on 'Application'

| Ref & Year | Name of Application | Applied Deep Learning Model | Tool |
| --- | --- | --- | --- |
| [43], 2018 | A DL-based automated nutrition monitoring system in the IoT | Bayesian network, perceptron network | Weka |
| [44], 2019 | Nutrition monitoring system | Adaptive deep learning neural network | Matlab |
| [45], 2019 | Predicting the chemical composition of gallstones | CNN | Not mentioned in the article |
| [46], 2018 | Image cognition platform | VGGNet | Caffe |
| [47], 2017 | Smart agriculture | LSTM | Not mentioned in the article |
| [48], 2018 | Indoor localization, activity recognition, healthcare sensing | Autoencoder, CNN, LSTM | Keras, Tensorflow |