Fast Detection and Traceback-based Mitigation of Interest Flooding Attack

NDN Pending Interest Table (PIT) helps NDN by storing the state of a request within the router. This state information helps the router to redirect the data packet towards the requester. However, an attacker can send malicious requests, which could ﬂood the PIT; this attack is known as Interest Flooding Attack (IFA). In our previous work, we have found the most relevant features needed to detect IFA and applied a few machine learning approaches for the oﬄine detection of IFA. In this article, a trained Artiﬁcial Neural Network (ANN) classiﬁer has been deployed on each NDN router for the online detection of IFA. A novel traceback-based mitigation is proposed, which is triggered after the detection. The proposed approach is found better than the previous approach in terms of the satisfaction ratio and throughput of the legitimate consumers. approaches. This paper presents a traceback-based mitigation approach for IFA. Firstly, six features are selected out of eleven using IG-based ranking. These selected features are used for building a trained ANN-based classiﬁer. This classiﬁer is deployed in the NDN routers for online IFA detection. On IFA detection, the router stores malicious interface and preﬁx-list in the m-list. The router sends an alert message to the gateway router to inform it about the attack. The gateway router adds them to its m-list. The gateway router restricts the interest packets which match with the m-list. The proposed approach performs better than the previous approach in terms of the satisfaction ratio and throughput of normal consumers. In the future, we try to adapt our approach to the mitigation of diﬀerent types of IFAs.

Recently NDN evolves as the most prominent candidate among all ICNs [?]. NDN overcomes the problem of content sharing by in-network caching and fetching the content using its name. NDN is more secure than the TCP/IP as the trust model between the producer and consumer ensures the data packet's provenance and integrity. Confidentiality can be ensured by additionally encrypting the data packet. However, NDN is vulnerable to new types of attacks, such as IFA [11], cache privacy attack [4], cache pollution attack [11], content poisoning attack [12].
The attacks mentioned above either degrade QoS or affects the privacy of the user. However, IFA can cease the flow of traffic through the network. This makes IFA more severe than other NDN attacks. The attacker floods PIT by intentionally requesting non-existing contents. Entries get created in the PITs of the in-between NDN routers due to these requests. Thus PIT becomes unavailable for the legitimate users because these malicious entries persist in PIT till timeout.
Most of the previous approaches [1,8,7,20,21,23] detect IFA using one or two features based on static thresholds. However, the features which are necessary for attack detection have not been analyzed in the previous works. In our previous work [17,18], 12 features have been analyzed that get affected due to IFA, out of which 9 features were chosen using information gain [15] based ranking. Four different machine learning approaches were applied out of which two approaches, i.e., Multi-Layer Perceptron (MLP) with BackPropagation (BP) [9,13] and J48 [13], were found better. This paper deals with the online detection and mitigation of IFA. Few changes have been made to our previous work to adapt it for the online IFA detection. The detection is divided into two phases, i.e., malicious interface detection and malicious prefix detection. For the detection of the interface, we have taken only those features which are computed per interface.
After doing the above changes, the feature selection is re-performed using information gain based ranking. Then ANN classifier is trained using the selected features. The trained ANN classifier is deployed on the router for online detection; the outcome of detection is a malicious interface. After detecting the malicious interface, the malicious prefix is detected using a statistical approach. Traceback-based mitigation is applied after the detection of IFA. The proposed approach is compared with [7] approach.
The major contributions of this paper are: -Performing IFA on Tree topology and DFN like topology.
-Selecting the most appropriate features for IFA detection.
-Offline detection of IFA using the ANN classifier.
-Deploying trained ANN detector in the NDN router.
-A traceback-based mitigation approach for IFA.
-Comparison of the proposed approach with the previous approach.
The rest of the paper is organized as follows. Section 2 presents basic NDN architecture. Section 3 presents the work done by researchers for the mitigation of IFA. Section 4 presents the proposed IFA mitigation approach. Section 5 describes the experimental setup and results. Section 6 discusses the conclusion and future work.

Background Details
NDN uses two types of packets for a communication, i.e., interest packet and data packet. The contents are requested using the interest packet. The data packet contains actual content that acts as a reply to the interest packet.
There are three data structures that are used by NDN, i.e., Pending Interest Table (PIT), Forwarding Information Base (FIB), and Content Store (CS) for forwarding interest and data packet. PIT stores the interest packet's name and the interface on which the interest packet is received until the matching data packet is received. The FIB is like the routing table in TCP/IP. It has a list of named prefixes and the interfaces from which the interest packet should be forwarded. The CS caches the contents received by the router.
NDN forwarding pipeline is shown in Figure 1. Upon receiving an interest packet, the router searches CS for the matching data packet; if it is found, it is replied through the interface from which the interest packet is received. Else, a lookup is performed by the router in the PIT for finding the matching entry. If a matching entry is found, the corresponding interface is added to the matching PIT entry's interface list. Otherwise, a new entry is created in the PIT. Additionally, the interest packet is forwarded through the interface given by FIB. When a router receives a data packet, it searches for the matching PIT entry. If a matching entry is found, the data packet is forwarded through interfaces present in the matching PIT entry, and the data packet is cached in the CS. Else, the router will drop the data packet.

Related Work
The DoS attack in NDN was introduced as "interest flooding" by [11]. [11] have given a countermeasure in which the attack is detected using stats such as pending interest packets per outgoing interface, interest packets per incoming interface, and pending interest packets per namespace. The router reduces the PIT quota for the malicious namespace on the malicious interface after the IFA detection. This information is propagated to downstream routers, which applies a countermeasure on the malicious interface. The authors have not provided an implementation or evaluation of their approach.
[1] have proposed three algorithms that apply a limit to interest packets that are forwarded through each interface. These algorithms are token bucket with per-interface fairness, satisfaction-based Interest acceptance, and satisfaction-based pushback algorithm. In the first algorithm, each outgoing interface has a fixed share of PIT entries, which is equally distributed among all the incoming interfaces. In the second algorithm, the router allows interest packets based on the satisfaction ratio of interfaces. In the third algorithm, a limit is given to each incoming interface to satisfy the interest packets. The routers announce limits to downstream neighbors, which are used by the neighbors to set their limits. The limitation of these algorithms is that they restrict every packet on the malicious interface; thus, legitimate interest packets may also suffer. Also, this approach does not stop the attacker from attacking. [8] have used PIT size as a metric for the detection of IFA. When the PIT size increases more than a threshold, the router looks for an unsatisfied interest packet with the longest name. This interest packet is replied with a spoofed data packet, which contains information about malicious namespace. The originating router applies a filter on the malicious interface after receiving the spoofed data packet. [7] have proposed a framework called Poseidon for the detection and mitigation of IFA. It uses two parameters for IFA detection, i.e., the ratio of incoming & outgoing data packets and PIT space per interface. The attack is detected when both the parameters exceed a predefined threshold. The interest packets received on the malicious interface are dropped after the IFA detection, and an alert message is sent through the malicious interface. The router decreases the threshold value of the above two parameters on receiving the alert message. In this approach, the mitigation is applied to an interface; therefore, incoming legal interest packets may also suffer from IFA. [20] have proposed an approach called Disabling PIT Exhaustion (DPE) for IFA mitigation. Each router has a malicious list (m-list) which contains the number of expired interest packets for each namespace as a parameter for IFA detection. When this parameter goes beyond a predefined threshold, then the namespace present in the corresponding m-list is considered malicious. The router will not create an entry for the malicious namespace. The malicious namespaces remain in the m-list till decay time. [21] have proposed a collusive IFA detection approach based on wavelet analysis. This approach uses a namespace frequency distribution called Power Spectral Density (PSD) as a parameter for IFA detection. When the value of PSD low, the collusive IFA is detected. The authors have chosen the detection threshold experimentally. The main drawback of this approach is it is only applicable to collusive IFA.
A detection approach should be accurate, fine-grained, and fast. Most of these approaches rely on the detection or mitigation of IFA using one or two features. The IFA mitigation is done based on a statistical threshold. These approaches are fast, but they have low accuracy, and they are not fine-grained (detect interface as well as a malicious prefix). We have used six different features to detect malicious interfaces. These features are selected from eleven features based on IG-based ranking. These features are used to train ANN, whose accuracy is 98.5%. This approach is faster as time is only consumed in the training phase; once the ANN is trained, the detection takes few milliseconds. This trained ANN is deployed in the NDN router for the online detection of the malicious interface. After the malicious interface detection, the malicious prefix has been detected. After detection, we have applied tracebackbased mitigation. Thus, the proposed approach is accurate, fine-grained, and fast.

Interest Flooding Attack
IFA makes network services unavailable for legitimate consumers. [11] proposed three types of IFA based on interest packets used for the attack, i.e., Type-1 (existing or static), Type-2 (dynamically-generated), and Type-3 (nonexistent). In Type1, the existing set of contents is requested repeatedly. These contents are cached by the nearby routers due to which the requests are satisfied by the CS. Thus, less number of PIT entries are created. In Type2, the attacker generates dynamic contents which are satisfied by the producer on demand. The producer spends some time creating these contents. Thus, this attack affects a producer as well as a router. In Type-3, the attacker requests non-existing contents. These requests create PIT entries that are not satisfied. PIT entries corresponding to the requests remain in the PIT till timeout. Thus, PIT Type-3 is more severe than Type-2 because the PIT entries created due to the Type-3 attack remain in PIT until timeout and timeout duration is large. This article considers only Type-3; in the rest of the paper, IFA means Type-3. Figure 2 demonstrates IFA in NDN on a simple topology. It has one consumer, one attacker, and one producer. The producer publishes content corresponding to prefix "/prefix." The attacker sends a large number of synthetic interest packets by concatenating the prefix with a random string. These interest packets are forwarded to the publisher by the NDN routers. This results in the creation of PIT entries on the routers which are present in the path fol- The following series of processing has been done for the mitigation of IFA.
1. IFA Modeling and Implementation-In this phase, the simulation of IFA is done using ns-3 based ndnSIM [6] simulator. 2. Data Collection-In this phase, the relevant features are collected from the simulator and pre-processing is performed to make them suitable for further processing. 3. Feature Selection-In this phase, Information Gain based feature selection is applied to select the most relevant feature set. 4. IFA Detection-In this phase, the selected features are used for the offline detection of IFA using MLP with BP. After the off line detection, the trained detector is deployed in the NDN routers for the online IFA detection. 5. IFA Mitigation-A traceback-based detection is applied for the mitigation of IFA. These phases are shown in Figure 3 and are described in detail in the following subsections:

IFA Modeling
The ndnSIM network simulator is used for modeling the attack scenario on Tree topology and DFN like topology which is shown in Figure 4(a) and Figure 4(b). The routers, consumers, attackers, and publishers are shown in the figure as R*, C*, A*, and P*, respectively, here * represent a number. The size of the PIT and the expiration time for a PIT entry is taken as 1200KB and 4 seconds for both the topologies. CS uses LRU as replacement policy and can has maximum 100 entries. The length of the queue and the delay for the point-to-point link are taken as 1000 and 10ms, respectively.
All the scenarios for the Tree and DFN like topology run for 300 seconds of simulation time, and the consumers are active in the whole duration. The consumers send interest packets with a frequency of 100 packets, and the attackers request the malicious interest packets with a frequency of 400 packets for the Tree topology. Whereas for the DFN like topology, the consumers send interest packets with a frequency of 200 packets, and the attackers request the malicious interest packets with a frequency of 800 packets. The attackers are active in the duration of 100s-200s of simulation time.

Data Collection
Data corresponding to 11 different features has been collected after every 500ms. These features are InInterest, OutInterest, DropInterest, InData, Out-Data, DropData, InSatisfiedInterests, OutSatisfiedInterests, InTimedOutInterests, OutTimedOutInterests, and PitEntries. The description of these features is given in Tab. 1. This data is used for selecting the most appropriate features from the feature set.

Feature Selection
Information Gain (IG) based feature ranking is used for selecting the most appropriate features for IFA detection. The process of selecting features is as follows: 1. Calculate the entropy H(x) for the data set using the formula given below: Here, x is a random variable which takes values V 1 , V 2 , ..., V n and p 1 , p 2 , ..., p n are respective probabilities of occurrence of V 1 , V 2 , ..., V n . 2. Information gain of a given feature F i can be calculated uisng equation below: 3. The fetuares are sorted according to their IG values. 4. Features with high IG values are selected for IFA detection.
The data pre-processing, feature selection, and classification are done using the Waikato Environment for Knowledge Analysis (WEKA) [2]. It is an open-source Java-based application licensed under the GNU General Public License. It has collection algorithms for data preparation, classification, regression, clustering, association rules mining, and visualization. Tab. 1 shows eleven features which are used for the analysis of IFA with the corresponding IG values. Out of the eleven features, six prominent features are chosen for IFA detection. These features are InInterest, DropInterest, OutData, InSatis-fiedInterests, InTimedOutInterests, and PitEntries. After selecting the most appropriate feature-set the next step is to apply machine learning approach for the classification of traffic.

IFA Detection
In our previous article [18] we have shown that the ANN classifier is more suitable for the detection of IFA as It has higher accuracy, less training time, and it is easy to deploy. Therefore, a ANN classifier is used for the classification of traffic, and the backpropagation algorithm is used for training the ANN. The detection approach is evaluated using four matrices-accuracy, precision, sensitivity (or recall), and specificity. Ten-fold cross-validation is used for the assessment of the classifier.

Artificial Neural Network based Classifier
A single hidden layer ANN classifier is used for the classification of traffic. This ANN classifier has three layers of neurons, i.e., the input, hidden, and output layer. Same number of neurons are taken for the input and the hidden layer. All the weights in the ANN classifier are updated using the backpropagation [13]. The result of detection is given in Tab. 2. This trained detector is deployed on each router for the detection of malicious interface. The proposed detection approach has already been compared with the previous approaches in our previous work [18].

IFA Mitigation
In the previous section, we have seen that the ANN classifier checks all the interfaces of each router for finding the malicious interfaces. After this detection phase, the mitigation phase is triggered to cease the attackers. The proposed approach follows a traceback mechanism that traces an attacker attached to a gateway router and then applies the filter to the interface through which the attacker is connected.

Algorithm 1 Periodic Stats Collection And Detection
procedure PeriodicStatsCollectionAndAttackDetection(k, detectionInterval) for j = 0 ; j < numInt ; j + + do stats = getStats() if it == mList.end() then ml = CreateObject<MList>(); ml.interf ace = j; ml.numP IT = getP IT (j); mList.push b ack(ml); else it.numP IT = getP IT (j); function sendAlertMessage(j, pref ixList) data = Create<Data> ("/alert"); data.append (prefixList); sendData(data, j) A data structure called mList is used for storing information about malicious interfaces. The mList gets updated periodically after the detection interval. The interest packets which are received from the interfaces present in the mList are restricted on the access routers. For the non-access routers, the interest packets are restricted only when their PIT share exceeds a limit. The detail of the mList data structure is given in the Tab. 2.
The proposed approach is described using three algorithms, i.e., Algorithm 1 (Periodic Stats Collection And Detection), Algorithm 2 (OnInterest), and Algorithm 3 (OnData). Algorithm 1 runs on each router periodically after a fixed period called detectionInterval. Algorithm 2 and Algorithm 3 are called when a router receives an interest packet and data packet, respectively. The detailed description of these algorithms is given below.
The Algorithm 1 extracts stats which are discussed in Section 4.2 and passes it to the trained detector to detect whether an interface is malicious or not. On the detection of a malicious interface, the router checks entry corresponding to the interface in the m-List. Else, the router calls a function createN ewEntryOrU pdate, which creates a new entry if the entry does not exist. Otherwise, the router sets numP IT field to number entries that exist in the PIT for the interface. In the newly created entry interf ace field is set to the interface on which the attack is detected, numP IT field is set to the number of entries exist in the PIT for the interface. Next, if the router is an access router, then it removes all the PIT entries corresponding to the interface. Else, the router creates a pref ixList by extracting top k prefixes from the PIT corresponding to the interface and pref ixList. The interface and pref ixList are passed to the sendAlertM essage function, which creates a data packet having name field set to "/alert." The pref ixList is concatenated to the name field of the data packet then the data packet is sent through the interface. After sending the alert message, the router removes all the PIT entries corresponding to the pref ixList and interface.
The OnInterest and OnData functions are called when the interest packet and data packet respectively are received on the router. When an interest packet is received on an interface, then the router first checks the mList for the matching entry. If the entry is found and the router is an access router, then the packet is dropped. Otherwise, if number of PIT entries occupied by the interface is greater than the allowed PIT quota for the interface, then the packet is dropped. If the interest packet is not dropped, then it is processed according to the normal NDN forwarding pipeline.
When the router receives a data packet, it first checks the whether the message is an alert message or not. If it is an alert message, then a function onAlert is called which searches interface (maxP ref ixInt) on which the maximum number of malicious traffic is received. If the router is an access router, then PIT entries corresponding to maxP ref ixInt is removed, and createN ewEntryOrU pdate function is called, which updates the corresponding mList or creates new mList. Else the alert message is forwarded through the maxP ref ixInt.

Experimental Result
Experimental setup has already been described in Section 4.1. The proposed approach has been compared with [7] approach. Following are the different scenarios considered for the evaluation of the proposed approach. Three matrices are used for the comparison, i.e., Satisfaction Ratio (SR), Average PIT Size (APS), and Throughput. The details of these matrices are given below: 1. Satisfaction Ratio: It is the ratio of the number of data packets received to the number of interest packets sent with respect to time. This is the most commonly used metric for the evaluation of the IFA mitigation approach. 2. Average PIT Size: It is the average of PIT size of all the routers with respect to time. 3. Throughput: It is the number of data packets received by the consumers after sending the interest packets with respect to time. Figure 5(a) and Figure 5(b) show the SR of the legitimate consumers with respect to time for Tree and DFN like topology respectively. In the case of the Tree topology, the satisfaction ratio is approximately one when no attack is going on. For the Tree topology, the SR drops to 0.0092, which is 99% drop as compared to the normal traffic in the attack scenario. The Poseidon approach shows a fluctuation of SR between 0.5 to 0.01 during the attack. ANNM shows a sudden drop in SR, which reaches 0.5 at the starting of the A simillar behaviour can be seen in case of DFN like topology. In the case of DFN like topology, the satisfaction ratio is approximately 0.9 when no attack is going on. For the DFN like topology, the SR drops to 0.327 in the attack duration. The Poseidon approach shows a fluctuation of SR between 0.403 to 0.67 during the attack. The proposed ANNM approach shows a minor fluctuation in SR between 0.86 to 0.9. A similar trend can be seen in the Figure 6(a) and Figure 6(b) in the Throughput metric for Tree and DFN like topology, respectively.  Figure 7(b) show the average PIT size with respect to time for the Tree and DFN like topology respectively. For the Tree topology, the PIT size for the normal scenario fluctuates between 28 to 40. In the attack scenario, the size of the PIT fluctuates between 76 to 130 during the attack (100s-200s). The Poseidon approach shows a wide range of fluctuation in the PIT size between 76 to 130. The proposed ANNM approach shows a minor fluctuation in the PIT size between 46 to 108 in the attack duration. For the DFN topology, the PIT size for the normal scenario fluctuates between 46 to 105. In the attack scenario, the size of the PIT fluctuates between 611 to 1215 during the attack (100s-200s). The Poseidon approach shows a wide range of fluctuation in the PIT size between 576 to 1196. The proposed ANNM approach shows a minor fluctuation in the PIT size between 495 to 1027 in the attack duration.

Conclusion
Most of the previous approaches use one or two features for the detection and mitigation of IFA. These approaches were based on a hard threshold. In our previous work, we have shown that these approaches have less accuracy than machine learning-based approaches. This paper presents a tracebackbased mitigation approach for IFA. Firstly, six features are selected out of eleven using IG-based ranking. These selected features are used for building a trained ANN-based classifier. This classifier is deployed in the NDN routers for online IFA detection. On IFA detection, the router stores malicious interface and prefix-list in the m-list. The router sends an alert message to the gateway router to inform it about the attack. The gateway router adds them to its m-list. The gateway router restricts the interest packets which match with the m-list. The proposed approach performs better than the previous approach in terms of the satisfaction ratio and throughput of normal consumers. In the future, we try to adapt our approach to the mitigation of different types of IFAs.

Declarations
Funding: Not Applicable. Conflicts of interest: None. Availability of data and material: Not Applicable. Code availability: Not Applicable.