A New DDoS Detection Method in Software Defined Network

Abstract: Software Defined Networking (SDN) is a new network architecture in which network control is decoupled from traffic forwarding and is directly programmable. Any change in network information and configuration can be easily implemented in software through the controller. Although SDN, with its new structure and controller, makes way for new and innovative applications for network administrators, its security challenges and attacks have created problems for these networks. One such malicious attack is the Distributed Denial of Service (DDoS) attack, which aims to make machine and network resources unavailable to their legitimate users. In this paper, we propose a hybrid method for detecting DDoS attacks in SDN networks. The method consists of a statistical method and a machine learning method. The statistical method calculates a new correlation measure among all features together with dynamic thresholds, then extracts the portion of the data that is recognized as attack. This portion is then redirected to the machine learning section to increase the DDoS detection accuracy. The experimental results on the UNB-ISCX, CTU-13 and ISOT datasets show that the proposed method outperforms existing techniques in terms of the accuracy of detecting DDoS attacks in SDN networks.


INTRODUCTION
SDN architecture [1] consists of an application plane, a control plane, and a data plane. The application plane provides several applications, including security monitoring and access control. Fig. 1 shows a simple overview of the network architecture, including the SDN controller, the location of the applications running on the controller, and the OpenFlow switches controlled by the controller through the OpenFlow interface.

Fig. 1 SDN Architecture
The data plane constitutes the underlying network infrastructure, known as the infrastructure layer of the SDN. Since SDN enables new networking applications, security in these networks has become a major concern, as it is not yet an intrinsic feature of the SDN architecture. Research studies [2] show that different security attacks against SDN can be carried out through its different network components. DDoS attacks are among the most serious threats because they degrade network performance, increase delay, and cause legitimate packets to be discarded. For OpenFlow networks, DDoS attacks can be more destructive because there is a constant flow between the controllers and switches. DDoS attacks are easy to launch and hard to defend against. For this reason, detecting DDoS attacks is crucial for the future of SDN networks.
There are several ways to detect DDoS attacks on SDN networks. Previous research studies [3][4][5][6][7] identify several challenges in detecting them: failure to investigate all types of DDoS attack, difficulty in selecting appropriate time intervals for monitoring the traffic in periodic methods, and low detection accuracy, to name a few. Shortcomings and delays in detecting DDoS attacks may lead to the loss of resources such as bandwidth and CPU, deactivation of the controller and switches, and an undesirable increase in response time. Furthermore, maintaining network security can impose the high cost of adding hardware, necessitate major changes in network topology, and require the owner organization to modify its security policies.
In this paper, we propose a method for detecting DDoS attacks in SDN networks. The method consists of a statistical part and a machine learning part. The statistical part calculates a new correlation measure among all features together with dynamic thresholds, then extracts the portion of the data for which the TPR reaches 100% and which is recognized as attack.
This portion is then redirected to the classification section to reduce the FPR. The machine learning section extracts 16 features for the hosts of the same flow and records data samples for incoming packets. The samples are fed into the classification section as training inputs to create models using various classification algorithms such as BayesNet, J48, RandomTree, Logistic regression, REPTree and Naive Bayes. In this technique, periodic monitoring and scheduled traffic screening reduce the workload of the controller and so increase its efficiency. Another benefit is that no custom hardware needs to be added to detect attacks. The technique also increases the accuracy of DDoS detection and is independent of the network topology.
This paper is divided into the following sections. Section 2 presents some methods for DDoS detection. Section 3 proposes a new method to detect DDoS attacks based on a correlation measure and classification algorithms. Section 4 presents the datasets. Section 5 includes performance evaluations of the proposed detection method on the UNB-ISCX, CTU-13 and ISOT datasets. Section 6 compares the method presented in this study against some existing methods. Finally, Section 7 presents a brief conclusion and future work.

RELATED WORK
In the last decade, various studies were conducted to enhance the detection of DDoS attacks in SDN networks. Dhawan [8] introduced a method that uses a flow graph to approximate the actual network operations. This technique dynamically learns new network behaviors and raises an alarm when it detects an attack. In [9], a hybrid diagnostic model using a multidimensional Gaussian method and the Expectation Maximization algorithm is proposed for distinguishing normal and abnormal traffic. In this method, the distance between the parameters is calculated and detection is performed by comparing these distances with a specific threshold. Various features are used in [10] to infer whether an attack has occurred. Since more than one factor is involved in judging DDoS attacks, the major issue is to determine the parameters with significant relevance. For example, the destination IP address is considered one of the relevant parameters in attack detection, so attacks can be identified by calculating an entropy measure on the destination IP address. In [11], an attack detection mechanism is presented for responding as quickly as possible to a DDoS attack and reducing the workload of controllers and switches. This mechanism uses a neural network as the classification model for packet classification.
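The entropy-based idea attributed to [10] can be illustrated with a minimal sketch. It assumes Shannon entropy over the destination IP addresses seen in one monitoring window; the function name, the window contents, and the IP addresses are hypothetical and are not taken from [10]. When traffic concentrates on a single victim, the entropy of the destination distribution drops.

```python
import math
from collections import Counter


def destination_entropy(dst_ips):
    """Shannon entropy (in bits) of the destination-IP distribution in one window."""
    total = len(dst_ips)
    if total == 0:
        return 0.0
    counts = Counter(dst_ips)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical windows: destinations are spread out in the first one and
# concentrated on a single victim in the second one.
normal_window = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
attack_window = ["10.0.0.1"] * 7 + ["10.0.0.2"]

print("normal entropy:", destination_entropy(normal_window))
print("attack entropy:", destination_entropy(attack_window))
```

A detector built on this measure would compare the entropy of each window with a threshold; how that threshold is chosen is outside the scope of this sketch.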
In this study, a combination of a correlation measure between traffic flows and classification algorithms is presented.

THE PROPOSED METHOD
In the present study, a new hybrid method is used for detecting DDoS attacks. The flowchart in Fig. 2 indicates the steps of the presented method. This technique is based on the flows received by the switches and the controller. The controller computes the correlation among all extracted features and generates a normal level during the analysis period. It also generates a test level for the observed traffic using the same correlation measure. If the difference between the normal level and the test level of the observed traffic is higher than the dynamic threshold, an alarm is generated indicating that an attack has occurred, and the alarm rate is increased by one.
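The comparison of a normal level against a test level can be sketched as follows. This is an illustration under stated assumptions, not the paper's measure: the new correlation measure is not specified in this section, so the mean absolute pairwise Pearson correlation between feature columns is used as a stand-in, and the threshold is passed in as a fixed number rather than computed dynamically. All function names and sample values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_level(window):
    """Stand-in level: mean absolute correlation over all pairs of feature columns.

    window is a list of rows, one row of feature values per flow.
    """
    columns = list(zip(*window))
    pairs = list(combinations(columns, 2))
    if not pairs:
        return 0.0
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


def count_alarms(normal_window, observed_windows, threshold):
    """Raise one alarm per window whose level deviates from the normal level."""
    normal_level = correlation_level(normal_window)
    alarms = 0
    for window in observed_windows:
        test_level = correlation_level(window)
        if abs(test_level - normal_level) > threshold:
            alarms += 1  # the alarm rate is increased by one
    return alarms


# Hypothetical feature rows (three features per flow).
normal = [[1, 10, 5], [2, 9, 1], [3, 12, 4], [4, 8, 2]]
suspicious = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12]]

print(count_alarms(normal, [normal, suspicious], threshold=0.2))
```

Only the second observed window deviates from the baseline here, so a single alarm is counted.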

Threshold Calculation
Oshima et al. [12] introduced the dynamic threshold and examined the detection of DDoS attacks with DARPA2000 [13]. Experts note that the DARPA2000 datasets are based on DDoS attack software, so their attacks are simple in structure and type, unlike the complexity of real data. In this study, the threshold was evaluated for DDoS attacks by testing on datasets collected from actual SDN networks. To calculate the dynamic threshold, a computational method based on time sequences was used. The main purpose of using this threshold is the fast detection of DDoS attacks in small time windows. The dynamic threshold calculation includes an experimental parameter whose value has a high impact on the accuracy of attack detection. Since selecting the best value for this parameter is a subjective task and depends on various factors, it is advisable to consider the trade-off between these factors when calculating the best value for each time period. One of these factors is the ability to detect all attacks. The value should also not make the number of time periods vary widely, should impose a low computational burden, and should produce low false alarm rates. For this reason, we take as the best value the one for which all attacks are correctly detected, i.e. the TPR is 100%. In this case, although all attacks are correctly detected, some normal flows may also be mistakenly recognized as attack, which results in an undesirable increase in the FPR, i.e. in false alarms. After the best parameter value is determined for each time period, these optimal values are compared, and the best time period is the one whose FPR is lower than that of the other time periods. Once the best time period and the best parameter value are determined, the part of the flow that is detected as attack is selected and forwarded to the classification step to increase the accuracy of attack detection.
Since this step eliminates the portion of the normal flow that is correctly detected, it balances the number of normal flows and attack flows before they are delivered to the classification step. In the classification step, the classification algorithms further differentiate between actual attacks and false alarms, achieving higher accuracy in detecting attacks. The controller sends this message to all applications that have requested this type of message. Fig. 2 shows the flowchart of the proposed method.
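The selection of the best parameter value and the best time period described above can be sketched as a small search. The sketch assumes that TPR and FPR have already been measured for each candidate (time period, parameter value) pair; when several parameter values reach a TPR of 100% in the same period, it breaks the tie by lowest FPR, which is an assumption rather than something the text states. The function name and all numbers are hypothetical and are not results from the paper.

```python
def best_setting(results):
    """Pick the (period, parameter, FPR) with full detection and lowest FPR.

    results maps a time period to a list of (parameter, tpr, fpr) tuples,
    with tpr and fpr given in percent. Returns None if no candidate
    reaches a TPR of 100%.
    """
    best = None
    for period, rows in results.items():
        # Keep only the parameter values that detect every attack.
        full_detection = [(param, fpr) for param, tpr, fpr in rows if tpr == 100.0]
        if not full_detection:
            continue
        # Assumed tie-break: lowest FPR among full-detection candidates.
        param, fpr = min(full_detection, key=lambda row: row[1])
        # The best period is the one whose best candidate has the lowest FPR.
        if best is None or fpr < best[2]:
            best = (period, param, fpr)
    return best


# Hypothetical measurements, for illustration only.
measurements = {
    5: [(0.5, 100.0, 40.0), (1.0, 100.0, 25.0), (1.5, 90.0, 10.0)],
    10: [(0.5, 100.0, 30.0), (1.0, 95.0, 12.0)],
}

print(best_setting(measurements))
```

The flows flagged under the selected setting are the ones that would be forwarded to the classification step.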

Fig.2 The method presented in this study
The method presented in Fig. 2 consists of several applications. These applications run in the Floodlight controller and work together to detect DDoS attacks. Each section is introduced in the following text.

Feature Extraction
An important challenge in classification tasks is finding effective features that improve the accuracy of the results. In this section, the most relevant features are selected based on the data and the input flow received from the previous stage. Each flow is considered an edge in a graph, and each host, i.e. one of the two ends of a flow, represents a node in the graph. To extract the features, each IP address is first considered a node, then all the connections between those two nodes and other nodes are used to obtain the features. Finally, a weighted directed graph is constructed based on the existing flows. A set of 16 features is extracted for training the classifiers. These features and their explanations are presented in Table 2.
Table 2 The features extracted

Explanations Feature
The ratio of the number of one-way connections in which the host was the transmitter to the total connections of the desired node

In this article, the data samples are divided into normal and attack classes. After the features are extracted, the training samples are given as inputs to the classification algorithms, including the BayesNet, J48, RandomTree, Logistic regression, REPTree, and Naive Bayes classifiers [14], to construct classification models for detecting attacks. By comparing the results, the classification algorithm that best improves the accuracy of attack detection is selected.
The importance of this method is that the correlation value between any two flows can be computed with a small number of parameters.
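The graph construction used for feature extraction in this section can be sketched as follows. This is a minimal illustration that assumes each flow is reduced to a (source IP, destination IP) pair and that the edge weight is the number of flows between the two hosts. It computes only the one feature whose explanation survives in Table 2, under one possible reading of it: the share of a host's peers to which it sends flows without receiving any in return. The function names and host labels are hypothetical.

```python
from collections import defaultdict


def build_flow_graph(flows):
    """Build a weighted directed graph: graph[src][dst] = number of flows src -> dst."""
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst in flows:
        graph[src][dst] += 1
    return graph


def one_way_sender_ratio(graph, host):
    """Ratio of one-way connections where host is the sender to all its connections."""
    out_peers = set(graph.get(host, {}))
    in_peers = {src for src, dsts in graph.items() if host in dsts}
    peers = out_peers | in_peers
    if not peers:
        return 0.0
    # One-way as sender: host sends to the peer, the peer never sends back.
    one_way_out = out_peers - in_peers
    return len(one_way_out) / len(peers)


# Hypothetical flows between hosts A to E.
flows = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C"), ("A", "D"), ("E", "A")]
graph = build_flow_graph(flows)

print(graph["A"]["B"])                   # weight of the edge A -> B
print(one_way_sender_ratio(graph, "A"))  # A sends one-way to C and D, out of 4 peers
```

The remaining features would be computed per host from the same graph and collected into one feature vector per sample.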

THE DATASETS
To evaluate the performance of the proposed method, the well-known UNB-ISCX [15] and CTU-13 [16] datasets were selected and used in the experiments. In addition to these datasets, the ISOT [17] dataset is used for normal traffic.

EVALUATION
In this study, the evaluation results of the proposed method are presented separately for detecting DDoS attacks. The K-Fold cross-validation method [18] was used for training and validating the classification model, with the number of folds set to 10. The performance of the proposed solution was measured by the Accuracy (ACC), Precision (PR), F-Measure (F1), True Positive Rate (TPR) and False Positive Rate (FPR) metrics [19], which are defined in Table 4.
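For reference, the five metrics and the 10-fold split can be sketched as below. The sketch assumes the standard confusion-matrix definitions, with attack as the positive class, and a plain contiguous split without shuffling; the function names and the counts in the example are hypothetical and are not results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics, with attack as the positive class."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    pr = tp / (tp + fp) if tp + fp else 0.0
    tpr = tp / (tp + fn) if tp + fn else 0.0   # also called recall
    fpr = fp / (fp + tn) if fp + tn else 0.0
    f1 = 2 * pr * tpr / (pr + tpr) if pr + tpr else 0.0
    return {"ACC": acc, "PR": pr, "F1": f1, "TPR": tpr, "FPR": fpr}


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical counts, for illustration only.
print(detection_metrics(tp=90, fp=10, tn=85, fn=15))

# Each sample lands in the test fold exactly once across the 10 folds.
for train_idx, test_idx in k_fold_indices(25, k=10):
    print(len(train_idx), len(test_idx))
```

In practice the samples are usually shuffled, and often stratified by class, before being split into folds.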

IMPLEMENTATION ENVIRONMENT AND TOOLS
The experiments were conducted on an ASUS laptop with an AMD FX-9830P (Bristol Ridge) 2.8 GHz processor and 12 GB of RAM.  [21] for network simulation [22].

RESULTS OF THE EXPERIMENTS
The results are described in two parts. In the first part, the correlation value of the observed traffic and the dynamic threshold are calculated for each time period and each specific parameter value. If the difference between this value and the normal level is higher than the threshold, an attack is detected and the alarm rate parameter, which counts the attack alerts, is incremented. The best parameter value was calculated in this experiment and is highlighted in Tables 5 to 8 for each time period. Given the best time period and the best parameter value, which are outlined in Table 9 for each dataset, the part of the flow that is identified as attack is selected.
Table 9 The best parameter value and the best time period for attack detection in different datasets
The results indicated that dynamic thresholding yields a high TPR but also a high FPR. Since a high FPR is undesirable, classification algorithms were used to identify false positive cases and improve the performance of attack detection. The part of the flow identified as attack is forwarded to the classification step to increase the accuracy of attack detection. The part of the dataset detected as attack by the correlation-based section with dynamic threshold was selected as the training set for the classification task for further investigation, while the rest of the dataset was filtered out and not used as input for the classification task. Various classification algorithms such as BayesNet, J48, Logistic regression, RandomTree, and REPTree are used in this paper to model and accurately detect attacks. Most of the parameters of the classification algorithms are set to their defaults. The results in Fig. 3 and Table 11 indicated that tree algorithms gave better results on the dataset under study.
This section compares the method proposed in this study with some existing methods. It should be noted that all these studies aimed at detecting DDoS attacks in the UNB-ISCX and CTU-13 datasets. The comparative results are summarized in Fig.4 and Fig.5.
The main contribution of the current study is to combine statistical methods and machine learning to improve the detection of DDoS attacks in SDN networks. Previous methods did not use the strengths of both techniques, which motivated us to propose an efficient method based on statistical filtering of SDN traffic and supervised learning to achieve higher detection performance. In the statistical step, a correlation measure is utilized to recognize the majority of attacks. It requires low CPU load and is easily developed and implemented by the controller in SDN network environments.
Another advantage of this study over other methods of attack detection in SDN networks is its use of a periodic DDoS attack detection technique. Choosing periods that are neither too short nor too long has a great impact on detecting attacks. Short periods waste computational resources such as network bandwidth and CPU cycles. Long periods, on the other hand, increase response time and lead to late detections, which cause serious damage to controllers, switches and network security. Therefore, a detection method that uses selected periods together with dynamic thresholds, which are independent of the time periods, can increase the speed of attack detection. In addition, it can preserve resources and protect the controller and switches from the damage caused by attacks. Eliminating a portion of the normal flows in the correlation method also balances normal and attack flows, so it acts as a pre-processing step for the machine learning stage.
There is also no need for additional hardware infrastructure to enhance network security, which is another strength of the proposed method. Because the features extracted during machine learning are independent of the speed and type of attack, the proposed method is able to detect both high-rate and low-rate DDoS attacks. Fig. 5 and 6 illustrate the comparative performance of the proposed method against traditional methods on real datasets collected from actual SDN networks. The results show that the proposed method outperforms its existing counterparts in terms of accuracy and efficiency.

Conclusion
Today, SDN has gained considerable popularity among corporate networks due to its flexibility in network management services and reduced operating costs. However, security issues and attacks such as DDoS on these networks are inevitable and must be addressed. To improve the security of SDN networks, this study introduced a new method for detecting DDoS attacks using a combination of statistical and supervised learning techniques. The proposed method was evaluated and analyzed, and its results were investigated in separate sections. The evaluations indicated that the correlation-based section with dynamic threshold does not produce appropriate results according to experiments on different datasets. However, better results were obtained for dynamic thresholding at the cost of a high FPR. To solve this problem, different classification algorithms were used and more accurate results were obtained. Finally, the significance of the proposed method was determined by comparing its accuracy with that of previous studies. The results indicated that the accuracy of the proposed method is higher than that of other similar methods.