A Survey on SDN-based Intrusion Detection Systems on the Internet of Things: Concepts, Issues, and Blockchain Applications

With the accelerated growth in the use of computer networks and in the number of applications running on top of them, network security becomes more significant. An Intrusion Detection System (IDS) is considered one of the essential tools for protecting computer networks and information systems. Software-defined network (SDN) architecture is used to provide a network monitoring and analysis mechanism thanks to the programming environment of the SDN controller. An intrusion detection system, in turn, is developed to monitor incoming traffic to the SDN network; hence it enables SDN to adjust security service insertion. This paper presents a survey of SDN with the Internet of Things (IoT) and its improved versions such as SDN-based IDS and SDN-based IoT. It also discusses the IoT and its problems, especially the security aspects, and solutions to overcome these problems. Finally, it briefly describes the Blockchain concept and how it can be merged with an SDN-based IoT system to further enhance its security.

To maintain a high level of security and network monitoring, machine learning and deep learning (ML/DL) approaches need to be merged with SDN controllers [6].
Due to the continuous rise of cyber-attacks all over the world [1], research on IDS is growing quickly in the academic and industrial communities. Malicious insiders, denial of service, and web-based attacks are the main causes of the more dangerous cybercrimes. These cybercrimes may disrupt a country's critical national infrastructure by giving malicious software the opportunity to creep into the system.
Hence, to avoid unauthorized access, many organizations deploy programs such as a firewall, antivirus software, and an intrusion detection system (IDS) to protect themselves from losing their intellectual property. To respond to cyber-attacks rapidly, the attack process must first be identified early [1] from the network using an IDS. The IDS is then used to identify malicious activities including viruses, worms, and DDoS attacks. Speed of anomaly detection, accuracy, and reliability are the basic success factors for an IDS. Therefore, ML/DL approaches can be merged with SDN-based intrusion detection to introduce several advantages such as high Quality of Service (QoS), security enforcement, and virtual management. Other advantages introduced by SDN are enhanced network security, elimination of hardware dependency, and the flexibility to program network devices [4,5]. Recent development concentrates on using a new network architecture, namely the software-defined network (SDN), to execute IDS with machine learning approaches [6]. A few researchers have studied integrating SDN with IoT, as shown in Table 1.
As services and devices in the network increase, IoT should be scalable and feasible enough to accommodate these changes. An IoT system has limited resources, and hence security mechanisms may not be supportable. The combination of Blockchain (BC) [70] with IoT provides a solution to such difficulties. The advantage of BC is its scalable, distributed, and decentralized nature, which makes it well suited to improving various IoT aspects. This paper presents a review of intrusion detection in software-defined networking and explores the use of Blockchain for SDN security. The contributions of this paper can be summarized as follows:
Studying the concept and architecture of intrusion detection systems.
Exploring the SDN architecture and its applications, along with reviewing IDS for SDN that apply ML/DL.
Discussing open research directions for the subject of this paper.
The rest of this paper is organized as follows. Sect. 2 presents IDS, followed by common datasets used in IDS. Section 3 covers ML approaches and ML/DL-based IDS. In Sect. 4, an outline of SDN architecture and applications is provided, and IDS for SDN that apply ML/DL are surveyed. Section 5 discusses the SDN-based IoT system. A brief description of BC technology, BC-based IoT, and BC-SDN-IoT systems is given in Sect. 6. Section 7 presents open issues and future research directions, while Sect. 8 concludes the paper with future work.

Intrusion Detection Systems
An Intrusion Detection System (IDS) is a critical research achievement in the cybersecurity field; it can recognize an attack, whether ongoing or already completed. Intrusion detection can be treated as a classification problem, either binary or multi-class. In binary classification, the task is to distinguish whether network traffic behavior is normal or anomalous. In multi-class classification, e.g., a five-class problem, the task is to recognize whether the behavior is normal or one of four attack types: DoS (Denial of Service), U2R (User to Root), Probe (Probing), and R2L (Remote to Local). The major objective of intrusion detection is to recognize intrusive behavior successfully and to increase the accuracy of classifiers.
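The relation between the binary and the five-class formulation can be illustrated with a minimal sketch. The mapping below is an assumption made for illustration: it lists only a few attack names from the KDD family of datasets, and the function names are hypothetical.

```python
# Hypothetical, non-exhaustive mapping from raw record labels to the five classes.
ATTACK_CATEGORY = {
    "normal": "Normal",
    "neptune": "DoS",
    "smurf": "DoS",
    "portsweep": "Probe",
    "nmap": "Probe",
    "guess_passwd": "R2L",
    "buffer_overflow": "U2R",
}


def to_multiclass(label: str) -> str:
    """Five-class view: Normal, DoS, Probe, R2L, or U2R."""
    return ATTACK_CATEGORY[label]


def to_binary(label: str) -> str:
    """Binary view: every attack category collapses to 'anomalous'."""
    return "normal" if to_multiclass(label) == "Normal" else "anomalous"
```

The binary problem is thus a coarsening of the multi-class one: the same labeled records can serve both tasks.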

Types of techniques of the Intrusion detection system
Intrusion detection techniques can be divided into two primary types [14], as shown in Fig. 2.

Anomaly Detection
The main idea in anomaly detection is to detect both network and computer intrusions and misuse by monitoring system activity and classifying it as either normal or anomalous. The classification depends on heuristics or rules, as opposed to misuse detection, which depends on patterns or signatures.
Most anomaly detection systems have two phases. The first is the training phase, in which a profile of normal behavior is built. The second is the testing phase, in which current traffic is compared with the profile created in the training phase.
Several methods exist for classifying system activity as normal or anomalous. One is based on artificial intelligence techniques. Another, called strict anomaly detection, uses a strict mathematical model to define what normal usage of the system comprises; any deviation is then considered an attack. Data mining methods, grammar-based methods, and artificial immune systems are further methods used to detect anomalies.
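The two phases can be sketched in a few lines. This is an illustration under stated assumptions, not a method taken from the surveyed works: the profile is a simple mean and standard deviation of a single hypothetical feature (packets per second), and the three-sigma threshold is an arbitrary choice.

```python
from statistics import mean, stdev


def build_profile(normal_values):
    """Training phase: summarize normal behavior as (mean, standard deviation)."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, profile, threshold=3.0):
    """Testing phase: flag values further than `threshold` deviations from the mean."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma


# Hypothetical training data: packets per second observed during normal operation.
profile = build_profile([100, 104, 98, 101, 97, 102, 99, 103])
```

With this profile, a reading of 105 stays within the normal band, while a burst of 400 is reported as an anomaly. A real system would use many features and a richer model, but the train-then-compare structure is the same.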

Misuse Detection
The main idea in misuse detection is to detect computer attacks by first defining abnormal system behavior; all other behavior is then considered normal. It is the opposite approach to anomaly detection: anything not known to misuse detection is considered normal. Misuse detection depends on patterns or signatures of known attacks. Its advantage is the simplicity of adding known attacks to the model, and the term is also used more generally to refer to all kinds of computer misuse. Its main disadvantage is the inability to recognize unknown attacks. Hence, most intrusion detection systems use a combination of the two techniques and are often deployed on the network, on a specific host, or even on an application within a host.
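A minimal sketch of signature matching makes both the advantage and the disadvantage concrete. The signature names and byte patterns below are hypothetical examples, not entries from a real rule set.

```python
# Hypothetical signature database: name -> byte pattern of a known attack.
SIGNATURES = {
    "sql_injection": b"' OR '1'='1",
    "path_traversal": b"../../",
}


def match_signatures(payload: bytes) -> list:
    """Return the names of all known signatures found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


def classify(payload: bytes) -> str:
    """Misuse detection: a payload matching no known signature is treated as normal."""
    return "attack" if match_signatures(payload) else "normal"
```

Adding a newly discovered attack only requires one more dictionary entry, whereas an attack without an entry passes as normal, which is exactly the limitation noted above.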

Types of the Intrusion detection system
Intrusion detection systems can be divided into the following:

Network-Based Intrusion Detection
Network-based intrusion detection systems (NIDS) are devices placed at different points within the network to analyze traffic from all devices on the network, so that attacks can be identified. Once an attack is identified, an alert can be sent to the administrator. A NIDS has two network interfaces. The first listens to network conversations in promiscuous mode, while the second is used for control. A NIDS can compare packets against signatures of known attacks, and packets detected as harmful are flagged or dropped. According to the system interactivity property, there are two main types of NIDS: on-line and off-line, also called inline and tap mode. An on-line NIDS analyzes Ethernet packets on the network in real time to detect an attack, whereas an off-line NIDS stores data and then processes the stored data to decide whether an attack occurred.

Host Based Intrusion Detection
Host-based intrusion detection systems (HIDS) run on individual hosts or devices on the network. They analyze the inbound and outbound packets of the device. If any suspicious activity is detected, the HIDS alerts the user or administrator. Mission-critical machines are a typical example of HIDS usage. The main advantage of HIDS is that they distribute the load associated with monitoring across the available hosts on large networks. The load is thus spread evenly over the network, which is beneficial for cost and performance.
The main disadvantage of HIDS is that they cannot see network-wide traffic, because they are designed to run on a single system.

Common Datasets used in IDS
Many datasets are not openly accessible because of security and privacy concerns. Additionally, the data that are openly accessible are heavily anonymized and do not reflect the variety of present network traffic. The details of different IDS datasets are given in Table 2.
Table 2: Datasets used in IDS
First, in supervised learning, algorithms learn representations from labeled input data in order to predict unknown cases. Examples are the support vector machine (SVM), which is used for classification problems, and the random forest, which is used for classification and regression problems [17]. The main property that makes SVM broadly used in NIDS research and appropriate for high-dimensional data is its strong classification power and computational practicality. A drawback of SVM is the difficulty of choosing a reasonable kernel function, and it is demanding in terms of computational processing units and memory [18]. The random forest algorithm [20], on the other hand, is considered a powerful approach when dealing with imbalanced data, but its shortcoming is that it is prone to overfitting.
Second, in the unsupervised learning scheme, algorithms use unlabeled input data, in contrast to supervised learning. To predict unknown data, unsupervised learning algorithms model the underlying structure or distribution of the data [17]. Principal component analysis (PCA) and the self-organizing map (SOM) are examples of unsupervised learning algorithms. PCA is a feature reduction technique that is used to significantly accelerate unsupervised feature learning [21]. Many researchers use PCA for feature selection before applying classification [22]. The SOM is a clustering technique that has been used to reduce payload in NIDS. K-means and other distance-based learning algorithms are further clustering algorithms used for anomaly detection; however, they are sensitive to initial conditions such as the initial centroids and may produce a high false-positive rate [24].
Finally, semi-supervised learning is considered a kind of supervised learning in which a small amount of labeled data is combined with a large amount of unlabeled data to form the training data, as in photo archives [25]. The semi-supervised support vector machine [26] is another way to improve the accuracy of NIDS [27]. The Spectral Graph Transducer and the Gaussian Fields approach are two examples of semi-supervised classification that are used to detect unknown attacks, while MPCKmeans is an example of a semi-supervised clustering method [28].

Deep learning for network intrusion detection system
With the accelerated development of machine learning, deep learning has become more popular and has been applied to intrusion detection. Recent studies have shown that deep learning has the potential to generate better models than traditional methods. Table 3 shows different datasets used in intrusion detection systems based on a deep learning approach.
Table 3 Comparison of Datasets used in IDS using the DL approach.
Deep learning algorithms can be considered a new version of artificial neural networks that exploit abundant, affordable computation [38]. A deep learning algorithm can learn a representation of data at various levels of abstraction. Object detection, network intrusion detection, and visual object recognition [39] are some applications of deep learning algorithms. A deep learning algorithm can be trained in a supervised or an unsupervised way [12]. The CNN [39] is an example of a deep learning algorithm that is trained in a supervised way. The CNN architecture is generally used in applications such as face recognition [39] and 2D image processing [40].
An autoencoder [41], on the other hand, is an example of a deep learning algorithm that can be trained in an unsupervised way. An autoencoder achieves dimensionality reduction by learning a representation (encoding) for a set of data. A Deep Belief Network (DBN) [42] is another example: it can first be trained in an unsupervised way to learn to reconstruct its inputs and then trained in a supervised way to perform classification. The goal of using DBNs that include restricted Boltzmann machines (RBMs) [43] or autoencoders is to achieve collaborative filtering, topic modeling, feature learning, regression, and dimensionality reduction.
A recurrent neural network (RNN) [44] is an example of a deep learning algorithm that can be trained in a supervised or unsupervised way to process arbitrary sequences of inputs using internal memory. The main property of an RNN is its ability to predict the next character in a text and to learn dependencies and evidence stored over a long time [39]. A typical application of RNNs is speech recognition [45].

Recurrent Neural Network
The recurrent neural network is presented in Fig. 4. It contains three types of units: input units, output units, and hidden units. The hidden units do the most important work, and they also retain the end-to-end information, i.e., the hidden units are the storage of the whole network. The RNN model has a one-way flow of information from the input units to the hidden units. When we unfold the RNN, we find that it embodies deep learning. For supervised classification learning, an RNN-based approach can be used as described in [46]. The fundamental difference between RNNs and traditional feed-forward neural networks (FNNs) is that an RNN introduces a directional loop that retains the preceding information so that it can be applied to the present output. The previous output is thus associated with the present output of a sequence, and the nodes between the hidden layers are also connected. The input of the hidden layer consists of the output of the input layer together with the output of the hidden layer at the previous step.
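The directional loop can be made concrete with a small forward-pass sketch. It assumes the common formulation h_t = tanh(W_x x_t + W_h h_{t-1} + b); the tanh activation, the weight names, and the zero initial state are assumptions for illustration and are not taken from [46].

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: combine the current input with the previous hidden state."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(w_x[i][j] * x[j] for j in range(len(x)))
        total += sum(w_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h.append(math.tanh(total))
    return h


def run_rnn(sequence, w_x, w_h, b):
    """Unfold the loop over a sequence, carrying the hidden state forward."""
    h = [0.0] * len(b)  # initial hidden state (assumed to be all zeros)
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
    return h
```

With a single hidden unit, `run_rnn([[1.0], [0.0]], [[1.0]], [[0.5]], [0.0])` returns a non-zero state even though the last input is zero, because the first input is remembered through the recurrent weight. Setting the recurrent weight to zero removes this memory, which is the feed-forward case.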

Convolutional Neural Network
The convolutional neural network (ConvNet) [47] is a class of deep neural networks applied in image and video recognition, recommender systems, image classification, medical image analysis, and financial time series. A ConvNet consists of an input layer, an output layer, and multiple hidden layers, which comprise a series of convolutional layers together with pooling layers, fully connected layers, and normalization layers. These layers are called hidden layers because their inputs and outputs are masked by the activation function and the final convolution. The most commonly used activation function is the ReLU.
As mentioned earlier, image recognition is one application of ConvNets; Fig. 5 shows an example. First, the feature extraction network extracts feature signals from the input image, and then the classification neural network produces the output from the features of the image.
The feature extraction neural network shown in the figure has many digital filters that transform the image using the convolution operation. The pooling layer, in turn, reduces the dimensions of the image by combining neighboring pixels into a single pixel.
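The convolution, activation, and pooling operations can be sketched on plain nested lists. This is an illustrative sketch only: it assumes a single-channel image, no padding, a stride of one for the convolution, and non-overlapping max pooling, and the function names are hypothetical.

```python
def conv2d(image, kernel):
    """Slide the filter over the image (no padding) and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """ReLU activation: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: each size x size block becomes a single value."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]
```

Applied to a 4 x 4 input, `max_pool` with `size=2` yields a 2 x 2 output, which shows how pooling shrinks the feature map while keeping the strongest response of each region.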

Software-defined Networking (SDN) Based Intrusion Detection System
This section presents the intrusion detection system for SDN.

Overview of Software-Defined Networking
SDN [7] is a new approach to designing, building, and managing networks. Its main property is the separation of the network's control (brains) and forwarding (muscle) planes, which makes it simpler to optimize each. In this environment, a controller acts as the "brains," providing an abstract, centralized view of the overall network. Through the controller, network administrators can quickly and easily make and push out decisions on how the underlying systems (switches, routers) of the forwarding plane will handle the traffic. The most widely used SDN protocol for managing the relation between the controller and the switches is OpenFlow [52,53]. An SDN also depends on Application Programming Interfaces (APIs). These APIs, usually called northbound APIs, enable efficient service orchestration and automation. Accordingly, SDN enables a network to shape traffic and deploy services to address changing business needs without touching each individual switch or router in the forwarding plane. SDN is thus an approach that enables more agile and cost-effective networks. The Open Networking Foundation (ONF) takes the lead in SDN standardization [7]. Figure 6 gives a simple logical representation of the SDN architecture. The ONF/SDN architecture comprises three layers that are accessible through open APIs:
The application layer is dedicated to extending the SDN communication services. The application layer and the control layer are separated by the northbound API.
The control layer is dedicated to overseeing the network forwarding behavior through an open interface by leveraging centralized control.
The infrastructure layer is dedicated to packet switching and forwarding using Network Elements (NEs) and devices.
A software-defined network separates the network control and forwarding functions, so that network control becomes directly programmable [7]. This separation makes network management easier [2] and introduces several advantages. First, it facilitates innovative applications. Second, it enables a new networking paradigm with the ability to implement IDS [8]. To maintain a high level of security and network monitoring, ML/DL approaches need to be merged with SDN controllers [6]. Merging ML/DL approaches with SDN-based intrusion detection introduces several advantages such as high Quality of Service (QoS), security enforcement, and virtual management. Other advantages introduced by SDN are enhanced network security, elimination of hardware dependency, and the flexibility to program network devices.

SDN Applications
There are many applications of SDN. These applications tend to increase the flexibility of a network while reducing both the time to market and the total cost of ownership of future IT.
SDN-Based Cloud: The main advantage of using SDN in cloud techniques is its ability to defeat cloud intrusions. In addition, service scalability in cloud environments increases when the SDN paradigm is combined with cloud techniques [49].
Residential environment: An SDN framework achieves greater visibility for users and service providers in residential and small office networks. In addition, SDN achieves greater accuracy and scalability by using anomaly detection systems in a small office/home office (SOHO) network [4].

SDN Based Intrusion Detection System using ML/DL
The programming environment of the SDN controller allows the SDN architecture to provide a network monitoring and analysis mechanism. An intrusion detection system (IDS), in turn, is developed to monitor incoming traffic to the SDN network; hence it enables SDN to adjust security service insertion. The SDN-based IDS is centered on the SDN controller, as shown in Fig. 7. This IDS recognizes malicious flows by analyzing the traffic destined for the SDN network. After the traffic is analyzed, the SDN switches receive the traffic coming from outside networks. The switches then check the flow table entries, defined by the controller, that apply to the transmitted packets. If no flow entry for a packet exists in the periodically updated flow table, the packet is sent to the controller, which is responsible for setting up the data path for that packet.
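The table-miss behavior described above can be sketched as follows. This is a simplified model, not the OpenFlow protocol itself: the `Controller` and `Switch` classes, the flow key of source and destination address, and the static blocklist standing in for the IDS decision are all assumptions for illustration.

```python
class Controller:
    """Hypothetical controller that decides an action for flows a switch has not seen."""

    def __init__(self, blocked_sources):
        self.blocked_sources = set(blocked_sources)

    def packet_in(self, src, dst):
        # A real deployment would consult the IDS here; this sketch uses a static blocklist.
        return "drop" if src in self.blocked_sources else "forward"


class Switch:
    """Hypothetical switch holding a flow table that the controller populates."""

    def __init__(self, controller):
        self.controller = controller
        self.flow_table = {}  # (src, dst) -> action

    def handle_packet(self, src, dst):
        key = (src, dst)
        if key not in self.flow_table:
            # Table miss: send the packet to the controller and install the returned entry.
            self.flow_table[key] = self.controller.packet_in(src, dst)
        return self.flow_table[key]
```

Only the first packet of a flow reaches the controller; later packets of the same flow are handled by the switch from its flow table, which is what keeps the centralized controller from becoming a per-packet bottleneck.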
The architecture of the system is shown in Fig. 6. First, preprocessing, which includes numericalization and normalization, is applied to the given input [46]. In numericalization, non-numeric features are converted into numeric features by encoding; in normalization, features are scaled, i.e., the value of every feature is mapped to the [0,1] range. In the feature selection step, optimal features are selected and given to the training of the neural network.
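The two preprocessing steps can be sketched per feature column. The text does not specify which encoding is used, so integer label encoding is assumed here (one-hot encoding would be an equally plausible choice), and normalization is taken to be min-max scaling, x' = (x - min) / (max - min).

```python
def numericalize(column):
    """Map each distinct non-numeric value to an integer code (label encoding)."""
    codes = {value: index for index, value in enumerate(sorted(set(column)))}
    return [codes[value] for value in column]


def min_max_normalize(column):
    """Scale numeric values linearly into the [0, 1] range."""
    low, high = min(column), max(column)
    if high == low:
        # A constant feature carries no information; map it to 0 to avoid division by zero.
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]
```

For example, a protocol column `["tcp", "udp", "icmp", "tcp"]` becomes `[1, 2, 0, 1]`, and a duration column `[0, 50, 100]` becomes `[0.0, 0.5, 1.0]`.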

System Analysis
The recurrent neural network is trained using the NSL-KDD, KDDCUP 99, and UNSW-NB15 datasets. The three datasets are available for both binary and multiclass classification. The detailed statistics of these datasets are reported in Table 5 and Table 6.

Table 5 Types of attacks for NSL-KDD and KDDCUP 99 datasets.
Normal: Normal connection records.
DoS: The attacker aims at making network resources unavailable.
Probe: Obtaining detailed statistics of system and network configuration details.
R2L: Illegal access from a remote computer.
U2R: Obtaining root or super-user access on a particular computer.

Table 6 Types of attacks for the UNSW-NB15 dataset.
Backdoors: A mechanism used to access a computer by evading the existing background security.
DoS: The intruder aims at making network resources unavailable, so that the resources are inaccessible to authorized users.
Exploits: An attacker understands a security hole of the operating system or the application software and aims to exploit the vulnerability.
Generic: Attacks related to block ciphers.
Reconnaissance: An attacker observes a target system to gather information about vulnerabilities.
Shellcode: A small piece of program, termed a payload, used in the exploitation of software.
Worms: Worms replicate themselves and spread to other systems through the computer network.

The performance of the model is evaluated using accuracy, which is calculated from the confusion matrix. The accuracy value is taken as the performance indicator of the RNN model. Table 7 shows the accuracy of the RNN with different numbers of features for binary classification on the different datasets. The number of epochs is set to 100.
Table 7 The accuracy of RNN with a different number of features for binary classification with different datasets.
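For reference, accuracy is derived from the binary confusion matrix as (TP + TN) / (TP + TN + FP + FN). The sketch below computes it from label lists; the label strings and the choice of "attack" as the positive class are assumptions for illustration, and the sample data are not results from any of the surveyed works.

```python
def confusion_matrix(y_true, y_pred, positive="attack"):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, prediction in zip(y_true, y_pred):
        if prediction == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all records that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)
```

Because accuracy weighs all records equally, it can look high on datasets dominated by normal traffic even when few attacks are caught, so the individual confusion matrix counts are worth reporting alongside it.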

SDN-based IoT
The concept of SDN has grown quickly in the area of the Internet of Things (IoT). SDN offers solutions to problems concerning IoT security. In the SDN architecture, the controller becomes an intelligent resource in the network. The controller is decoupled from the networking elements and placed in the control plane. This offers several advantages, allowing the network to be designed with less time and fewer resources. Because IoT has many drawbacks, many researchers propose a new architecture that merges SDN with IoT to enhance performance. The combination of SDN with IoT is shown in Fig. 8, where the SDN controller monitors all devices in the network, such as the IoT sink node and the OpenFlow switch, to achieve better performance. The SDN controller can program all devices in the network according to the requirements and hence helps in solving many of the difficulties of IoT. Moreover, the proper installation of OpenFlow switches and SDN controllers helps enhance the reliability of the IoT network. Using the programmable OpenFlow protocol introduces several advantages. First, it helps the IoT system manage its devices more efficiently. Second, it increases the overall network performance in terms of low bandwidth utilization and high throughput. SDN-based IoT can be further enhanced by merging fog computing with it. This allows computations to occur at the edge of the network, so a decentralized computing infrastructure is obtained and the overall computation load is reduced [71]. The architecture of SDN-based IoT with fog computing is shown in Fig. 9, where the data of the IoT application is processed at the edge of the network itself, so that network latency and bandwidth consumption are reduced.

Blockchain
Blockchain (BC) is a new technology that has already been implemented in cryptocurrencies such as Bitcoin and Ethereum. BC technology provides a way to record transactions or any digital interaction securely. Its distributed and decentralized nature makes it a secure solution for IoT and Artificial Intelligence (AI). BC can be classified into three types: public, private, and permissioned [70]. In a public BC, anyone can join the network without the approval of third parties. In a private BC, network access is restricted, so the owner can control which nodes access the network, and only authorized nodes can maintain consensus. Finally, a permissioned BC is a combination of public and private BCs.
As shown in Fig. 10
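How a blockchain records transactions so that later changes become detectable can be illustrated with a minimal hash chain. This sketch is a simplification under stated assumptions: it omits consensus, networking, and signatures, and the block layout and function names are hypothetical rather than those of any specific platform.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder predecessor for the first block


def block_hash(index, transactions, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def add_block(chain, transactions):
    """Append a block that stores the hash of its predecessor."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "transactions": transactions,
        "prev_hash": prev_hash,
        "hash": block_hash(index, transactions, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check every link back to the genesis placeholder."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["transactions"], block["prev_hash"]):
            return False
    return True
```

If the transactions of an earlier block are modified after the fact, `is_valid` returns `False`, because the stored hash no longer matches the recomputed one. In a distributed setting, every node holding a copy of the chain can run the same check independently.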

Blockchain-based IoT
As services and devices in the network increase, IoT should be scalable and feasible enough to cope with these changes. An IoT system has limited resources, and hence security mechanisms may not be supportable. The main advantage of BC-based IoT is that it helps in tracking billions of connected devices, so that transaction processing and coordination between devices can be carried out. Figure 11 shows the architecture of BC-based IoT. The perception layer contains various devices for every application. The local miner in each application enables it to store transactions from the other devices in a local BC [72].

Blockchain-based SDN and IoT architecture
BC can be merged with SDN to improve the overall security and privacy of IoT. The advantage of this architecture is its distributed and decentralized nature, which makes the system more resilient and reduces the impact of attacks. The architecture is shown in Fig. 12, where the edge networks include edge nodes. The blockchain layer contains core miner nodes that provide high computation and high storage resources. Miner nodes are responsible for creating blocks and achieving consensus [73,74].
Although the combination of BC with IoT has many advantages in terms of security and privacy, it also faces many challenges [68]. First, the cryptographic and consensus algorithms used in current BC implementations require significant computational resources, which existing IoT devices cannot provide. Second, BC reduces the need for a server to store transactions, but the size of the global ledger grows as blocks are added. This conflicts with the very low storage capacity of IoT devices. Third, current BC implementations require a growing number of nodes, which demands more scalability, yet SDN and IoT already suffer from scalability issues. The BC-SDN-IoT combination is therefore affected by this problem, which needs to be resolved to improve performance. Finally, consensus protocols used in BC such as PoW and PoS require significant, energy-consuming computation. Researchers are therefore trying to solve these difficulties and to develop improved IoT architectures into which the BC concept can be merged.
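The computational burden of PoW can be made tangible with a toy nonce search. The sketch assumes a simplified puzzle, namely finding a nonce such that the SHA-256 digest of the data followed by the nonce starts with a given number of zero hexadecimal digits; real networks use a different block format and far higher difficulty.

```python
import hashlib


def proof_of_work(data: str, difficulty: int):
    """Try nonces in order until the digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1
```

Each additional zero digit multiplies the expected number of hash evaluations by 16, so the search quickly exceeds what a resource-constrained IoT device can afford, while verifying a found nonce still takes a single hash.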

Open Issues And Future Research Directions
Although SDN has many advantages in terms of its flexibility, features, and suitability for controlling and managing IoT networks, the combination of SDN and IoT has several limitations that need to be solved by researchers. The following research areas need to be addressed, as shown in Fig. 13:
SDN-based IoT controller: The SDN controller's southbound APIs need to be changed to communicate with IoT devices, and this requires considerable effort.
Limited communication traffic between the gateway and the controller: The large number of IoT devices generates huge amounts of traffic, which must be taken into consideration in traffic management to ensure network availability. Hence, new security mechanisms for the SDN-IoT environment should be considered to overcome the problems caused by the huge amount of traffic.
Mobility issues: IoT infrastructure components comprise a variety of smart objects, both static and mobile. Thus, the diversity of mobility patterns should be taken into consideration when addressing the mobility challenge in the process of updating transmission rules.
Interoperability: Given the diversity of existing devices mentioned above, common standards and protocols are required to integrate communication among these devices. The interaction between heterogeneous equipment, dependable transport, and routing are therefore considered challenges.

Conclusion
The usage of IoT systems has increased in recent years due to the need for applications such as smart homes and smart cities. Hence, a suitable protection system is required that can cope with the large amount of data produced, but this conflicts with the limited resources of IoT. The data are vulnerable to various attacks. IDS are intended to identify attacks early. Due to the dynamic nature of attacks, various issues, such as the adaptability of the detection method, should be taken into account when implementing an IDS, and many challenges remain. One is that the dimensionality of the dataset should be reduced, so feature selection methods combined with classification should be developed to classify datasets properly using deep learning techniques. Another challenge is designing a centralized SDN controller that can monitor and implement real-time intrusion detection in high-speed networks. In the SOHO network [39], most malicious activities should be identified, so SDN-based IDS architectures should be developed.
Note that none of the approaches that implement SDN-based IDS have been applied to critical infrastructure or high-speed network infrastructure. SDN can be merged with IoT because SDN provides opportunities to solve issues related to IoT security. Another problem is that IoT devices are resource- and energy-constrained. Hence, implementing traditional security mechanisms on them is very difficult. The combination of BC with IoT can solve this problem thanks to the scalable, distributed, and decentralized nature of BC, and it brings many advantages in terms of security and privacy.