Support Based Graph Framework for Effective Intrusion Detection and Classification

Intrusion detection has long been an active research area, and many researchers have worked on increasing the efficiency of Intrusion Detection Systems. Nevertheless, many challenges remain in modern Intrusion Detection Systems, and one of the major ones is controlling the false positive rate. In this paper, we present an efficient soft computing framework for classifying intrusion detection datasets that reduces the false positive rate. The proposed processing steps are as follows. The input data is first pre-processed by normalization. Optimal features are then chosen with krill herd optimization to reduce dimensionality; this feature selection is used to improve classification accuracy. A support value is then estimated from the optimally chosen features, and finally a support value-based graph is created to classify the data as intrusion or normal. The experimental results show that the presented technique outperforms existing techniques on performance measures such as execution time, accuracy, and false-positive rate, and that the proposed intrusion detection model increases the detection rate and decreases the false rate.


INTRODUCTION
With the growth of the Internet, security threats to computer systems and networks have increased considerably. These threats affect system security administration. To control them, a number of technologies have been developed and deployed, for example firewalls, anti-virus software, message encryption, and secured software protocols. Intrusion detection is likewise a significant technology that has existed for a long time [1][2][3]. It is one of the significant applications of outlier detection and is used to recognize attacks on a system by adversaries. Intrusion Detection Systems (IDSs) are fundamental to guaranteeing system security [4,5]. The commonly used approaches to intrusion detection are anomaly-based and signature-based [6]. Anomaly-based IDSs learn a baseline of system behavior, and any event that falls outside the accepted behavior is flagged as malicious. Signature-based IDSs learn normal and anomalous events in order to identify attacks together with their types [7]. The IDS is a significant component of secure information systems [8,9]. Intruders in the network attempt to access unauthorized resources in the system [10], so it is essential to monitor and examine user actions and system behavior. Moreover, when the configuration of the system parameters is adjusted, the behavior of the system can become unpredictable. The system must therefore be equipped with features for periodic monitoring and with behavior profiles for both normal and abnormal activities [11][12][13]. Machine learning techniques can be effective in detecting intrusions, and numerous Intrusion Detection Systems based on machine learning have been presented [14].
Machine learning is a general term for a set of optimization and processing techniques that are tolerant of imprecision and uncertainty. Recently, machine learning frameworks have been extended to implement effective intrusion detection systems, and machine learning approaches are very useful in current intrusion detection [15][16][17]. In particular, support vector machines, neural networks, and decision trees play an important role in anomaly detection systems, improving classification performance and speed [18]. The key components of machine learning procedures are Fuzzy Logic (FL), Artificial Neural Networks (ANNs), Probabilistic Reasoning (PR), and Genetic Algorithms (GAs). The idea behind applying soft computing techniques, and particularly ANNs, in implementing IDSs is to include an intelligent agent in the system that can discover the latent patterns in abnormal and normal connection audit records and generalize those patterns to new connection records of the same class [19,20].
Optimal features are selected from the normalized data using krill herd optimization to reduce the feature dimensionality. Effective feature selection using krill herd optimization improves classification accuracy by reducing the false positive rate. A support value is estimated for the selected features. A support-based graph is constructed to classify the data as intrusion or normal.
The rest of the manuscript is organized as follows: Section 2 reviews the literature related to the proposed strategy, Section 3 gives a short discussion of the proposed system, Section 4 examines the experimental results, and Section 5 concludes the paper.

RELATED WORK
Majjed Al-Qatf et al. [21] proposed an effective deep learning method, STL-IDS, based on the self-taught learning (STL) framework. The approach was used for feature learning and dimensionality reduction, and it improved the prediction accuracy of support vector machines (SVM) with respect to attacks. It was built on the sparse auto-encoder, an effective learning algorithm for reconstructing a new feature representation in an unsupervised manner. After the pre-processing step, the new features were fed into the SVM to improve its intrusion prediction capacity and classification accuracy.
Farrukh Aslam Khan et al. [22] presented a novel two-stage deep learning (TSDL) model based on a stacked auto-encoder for efficient network intrusion detection. The model contains two decision steps. The first step classifies network traffic as normal or abnormal using a probability score value. This score was then used as additional information in the final decision step. The presented model was able to learn useful feature representations from large amounts of unlabelled data and to classify them automatically and efficiently.
Chuanlong Yin et al. [23] presented an intrusion detection system based on deep learning, using recurrent neural networks (RNN-IDS). They investigated the performance of the model in binary and multiclass classification, as well as the impact of the number of neurons and of different learning rates on its performance. They compared the presented approach with soft computing techniques proposed by previous researchers on the benchmark dataset.
Jie Gu et al. [24] presented a methodology for intrusion detection using an SVM ensemble with feature augmentation. Specifically, the logarithm marginal density ratios transformation was applied to the original features to obtain new, better-quality transformed training data; the SVM ensemble was then used to build the intrusion detection model. The experimental results show that the presented technique can achieve good and robust performance.
Haipeng Yao et al. [25] presented the MSML framework, which incorporates four modules: pure cluster mining, pattern discovery, fine-grained classification, and model updating. In the pure cluster module, they introduced the concept of a pure cluster and presented a hierarchical semi-supervised k-means algorithm to discover all the pure clusters. In the pattern discovery module, they defined the unknown pattern and applied a cluster-based technique to find such unknown patterns. A test sample is then assigned either to the labeled known patterns or to the unlabeled unknown patterns. The fine-grained classification module then classifies the unknown pattern samples at a fine-grained level.

PROPOSED METHODOLOGY
In this paper, we propose an efficient methodology for classifying intrusion detection data. The input data is first pre-processed by normalization. The best features are then chosen from the normalized data with krill herd optimization to reduce the feature dimensionality. This feature selection improves classification precision by reducing the false positive rate. A support value is then estimated for the chosen features, and a support-based graph is constructed to classify the data as intrusion or normal. The block diagram of the presented methodology is given in figure 1.

DATASET 1: KDD CUP 99 dataset
KDD'99 has been the most widely used dataset for the evaluation of anomaly detection techniques. This dataset is built from the data captured in the DARPA'98 IDS evaluation program. DARPA'98 consists of about 4 gigabytes of packet data from 7 weeks of network traffic, which can be processed into about 5 million connection records, each of about 100 bytes. The two weeks of test data contain about 2 million connection records. The KDD training dataset comprises roughly 4,900,000 single connection vectors, each of which contains 41 features and is labeled as either normal or an attack, with exactly one specific attack type.

DATASET 2: CIC IDS 2017 dataset [27]
CIC IDS 2017 comprises 5 days of data collection with 225,745 packages and more than 80 features, gathered over seven days of network activity. In the CIC IDS 2017 dataset, the simulated attacks are divided into seven classes: Brute Force Attack, Heartbleed Attack, Botnet, DoS Attack, DDoS Attack, Web Attack, and Infiltration Attack.

DATASET 3: ISCX IDS 2012 dataset [28]
The UNB (University of New Brunswick) ISCX 2012 dataset consists of dynamically generated data that reflects network traffic and intrusions. Different multi-stage attack scenarios were carried out to generate the anomalous portion of the dataset. Normal background traffic was produced by executing user profiles that were synthetically generated at random synchronized times, creating profile-based user behavior.

DATASET 4: CICDDOS 2019 dataset [29]
A Distributed Denial of Service (DDoS) attack is a threat to network security that aims to exhaust the target network with malicious traffic. The CICDDoS2019 dataset contains benign traffic and the most up-to-date common DDoS attacks, and it resembles real-world data.

PREPROCESSING

Normalization
Normalization applies a linear transformation to the input data so that it fits into a particular range. Here, min-max normalization is used to normalize the data. It is computed by the following equation,
$$Y' = \frac{Y - \min_Y}{\max_Y - \min_Y}$$
where $Y'$ is the normalized value, $\min_Y$ and $\max_Y$ are the minimum and maximum values in $Y$, and $Y$ is the set of values in the dataset.
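As a minimal sketch of this step, the following Python function rescales each feature column to the range [0, 1]. The function name `min_max_normalize`, the column-wise treatment of features, and the handling of constant columns are assumptions made for illustration; the paper's own implementation was written in MATLAB and is not reproduced here.

```python
import numpy as np

def min_max_normalize(Y):
    """Rescale every column of Y linearly to the range [0, 1]."""
    Y = np.asarray(Y, dtype=float)
    y_min = Y.min(axis=0)                      # per-feature minimum
    y_max = Y.max(axis=0)                      # per-feature maximum
    span = y_max - y_min
    # Assumption: a constant column has zero span, so map it to 0
    # instead of dividing by zero.
    span = np.where(span == 0, 1.0, span)
    return (Y - y_min) / span

# Three records with two features; the second feature is constant.
normalized = min_max_normalize([[1, 10], [2, 10], [3, 10]])
```

Each column is scaled independently so that features with large numeric ranges do not dominate the later support value computation.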

FEATURE SELECTION
Feature selection is a technique in which particular features that discriminate well among class labels are chosen from a larger set of features. It is an important and commonly used method for dimension reduction in many fields. Feature selection is essential both for improving efficiency and for reducing dimensionality. In the proposed technique, krill herd optimization is used for feature selection.

Krill Herd optimization algorithm
Krill herd optimization is an iterative heuristic technique inspired by the natural herding behavior of krill [31]. It is primarily used for solving optimization problems. The pseudo-code of krill herd optimization is given in algorithm 1.

Begin
Define the population size ($S'$) and the maximum number of cycles ($C_{max}$)

The presented krill herd optimization selects features through the following steps:

Step 1
The optimization begins with the initialization of the normalized data.

Step 2
The fitness value is evaluated based on the positions of the krill individuals.

Step 3
Next, the main iteration begins by ranking the krill from the best to the worst individual.

Step 4
Motion updates are then computed for every krill as follows.
a) The foraging (searching) update is computed.
b) The induced motion, which relates to maintaining the density of the data, is computed, where $M_{max}$ denotes the maximum induced speed and the remaining terms are the inertia weight, the local effect that the $z$-th krill individual has on its neighbors, and the best solution of the $z$-th krill.
c) The final movement update is the physical diffusion, which is a random process, where $D_{max}$ denotes the maximum diffusion speed and the direction is given by a random directional vector with values between -1 and 1.

Step 5
Based on the motions described above, and using the different motion parameters over time, the position of the $y$-th krill over the interval from $t$ to $t + \Delta t$ is given by the position update equation, which is used to compute the location of each krill individual. This equation also involves a constant. Using this equation, the position of each krill individual is updated and the best solution is obtained.
Step 6
Finally, the stopping condition is checked after the function evaluations are completed. While the stopping condition is not met, the krill population is again sorted from the best to the worst and the position of the best individual is assessed. The flow chart of krill herd optimization is given in figure 3.
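Steps 1 to 6 can be sketched in Python as shown below. This is a simplified illustration under stated assumptions, not the authors' implementation: the update forms follow the commonly published krill herd formulation (a motion term equals a maximum speed times a direction plus an inertia weight times the previous motion), the induced motion keeps only the pull towards the best krill, and the names `krill_herd_select`, `toy_fitness`, the 0.5 selection cut-off, and all parameter values are hypothetical. The paper does not state its fitness function, so a toy one is used.

```python
import numpy as np

def _unit(v):
    """Scale each row of v to unit length (zero rows stay zero)."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / np.maximum(norm, 1e-12)

def krill_herd_select(fitness, n_features, n_krill=10, n_cycles=30,
                      m_max=0.01, v_f=0.02, d_max=0.005, c_t=0.5, seed=0):
    """Simplified krill herd search; returns (feature mask, best fitness).

    Positions live in [0, 1]^n_features, and a feature counts as selected
    when its coordinate exceeds 0.5 (an assumed encoding).
    """
    rng = np.random.default_rng(seed)
    pos = rng.random((n_krill, n_features))            # Step 1: initialise
    induced = np.zeros_like(pos)
    forage = np.zeros_like(pos)
    fit = np.array([fitness(p > 0.5) for p in pos], dtype=float)  # Step 2
    best_pos, best_fit = pos[fit.argmax()].copy(), float(fit.max())
    dt = c_t * n_features      # constant times the summed bound widths (all 1)
    for cycle in range(n_cycles):
        order = np.argsort(-fit)                       # Step 3: best to worst
        pos, fit = pos[order], fit[order]
        induced, forage = induced[order], forage[order]
        omega = 0.9 - 0.8 * cycle / n_cycles           # decaying inertia weight
        # Step 4a: foraging motion towards a fitness-weighted "food" centre
        weight = fit - fit.min() + 1e-12
        food = (weight[:, None] * pos).sum(axis=0) / weight.sum()
        forage = v_f * _unit(food - pos) + omega * forage
        # Step 4b: induced motion (simplified: pull towards the best krill)
        induced = m_max * _unit(best_pos - pos) + omega * induced
        # Step 4c: physical diffusion with a random direction in [-1, 1]
        diffusion = d_max * (1 - cycle / n_cycles) * rng.uniform(-1, 1, pos.shape)
        # Step 5: position update over the interval t .. t + dt
        pos = np.clip(pos + dt * (induced + forage + diffusion), 0.0, 1.0)
        fit = np.array([fitness(p > 0.5) for p in pos], dtype=float)
        if fit.max() > best_fit:                       # keep the best solution
            best_pos, best_fit = pos[fit.argmax()].copy(), float(fit.max())
    return best_pos > 0.5, best_fit                    # Step 6: stop after n_cycles

# Toy fitness (hypothetical): number of positions that agree with a target mask.
target = np.array([1, 0, 1, 0, 0, 1, 0, 0], dtype=bool)

def toy_fitness(mask):
    return float((mask == target).sum())

mask, score = krill_herd_select(toy_fitness, n_features=8)
```

In practice the fitness would be a measure of classification quality on the selected features; the fixed cycle count stands in for whatever stopping condition is actually used.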

Support value-based graph classification
In this section, a support value-based graph is used to categorize the data as normal or intrusion. Support values are estimated for the chosen feature set, and their average is then computed. The median support value is then kept as the threshold for classifying the data as normal or intrusion.

Support value measure
In this section, the input data is sorted according to the support value of the features. The support value computed from the selected features is given by condition (8), where $f_1, f_2, \ldots, f_n$ denotes the selected optimal feature set and $S_{value}$ denotes the support value.

Median support value
After the support value has been estimated for the chosen features, the median value is determined over all the support values. In the presented classification, this median support value is taken as the threshold for categorizing the data as normal or intrusion. The median support value is computed by condition (9), where $M$ is the total number of support values. The support value-based graph generation depends on the median support value. The proposed support value-based graph classification is given in algorithm 2.

Input: the processed data.
The support value-based graph generation in algorithm 2 yields the support value-based graph. First, for every data item $y_k \in Y$, the support value is determined using condition (8), and the median support value is then computed using condition (9). The median support value is taken as the threshold for classifying the data. A sample representation of the classification of data as normal or intrusion using the support value-based graph is depicted in figure 3.

Figure 3: Sample representation of support value-based graph classification
In this classification, if the support value of the chosen feature value is higher than the threshold value, the data is placed on the left side of the graph, i.e., as intrusion data; if the support value of the selected feature value is lower than the threshold value, the data is placed on the right side, i.e., as normal data.
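The thresholding rule described above can be sketched as follows. Because condition (8) is not recoverable from the text, the function `support_values` uses the mean of the selected normalized features purely as a stand-in; that choice, the dictionary layout of the graph, and the decision to place records whose support equals the threshold on the normal side are all assumptions for illustration.

```python
import numpy as np

def support_values(X, selected):
    """ASSUMED stand-in for condition (8): mean of the selected features."""
    return np.asarray(X, dtype=float)[:, selected].mean(axis=1)

def classify_by_support(X, selected):
    """Split records around the median support value.

    Records above the threshold go to the left branch (intrusion);
    all other records go to the right branch (normal).
    """
    support = support_values(X, selected)
    threshold = float(np.median(support))      # median support value
    graph = {"threshold": threshold, "left_intrusion": [], "right_normal": []}
    for k, value in enumerate(support):
        side = "left_intrusion" if value > threshold else "right_normal"
        graph[side].append(k)                  # store the record index
    return graph

# Five normalized records with three features; features 0 and 1 are selected.
X = [[0.9, 0.8, 0.1],
     [0.1, 0.2, 0.9],
     [0.7, 0.9, 0.5],
     [0.2, 0.1, 0.3],
     [0.5, 0.5, 0.0]]
graph = classify_by_support(X, selected=[0, 1])
```

Here records 0 and 2 have the largest support values and fall on the intrusion side, while the record whose support equals the median stays on the normal side.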

RESULTS AND DISCUSSION
The proposed support value-based graph classification was implemented in MATLAB. This section presents the experimental results obtained for the presented technique. The publicly available KDD-CUP 99, CIC IDS 2017, ISCX IDS 2012, and CICDDOS2019 datasets were used to assess the classification of data as normal or intrusion using support value-based graph classification. The performance of the presented support value-based graph classification is compared with the existing Support vector machine (SVM) [24], Naive Bayes [23], and Random forest [23] classifiers in terms of accuracy, sensitivity, specificity, precision, recall, F-measure, FPR, FNR, Kappa, and Rank sum. Moreover, the presented work is compared with existing optimization techniques such as the genetic algorithm (GA) [18] and particle swarm optimization (PSO) [30]. The statistical measures used to examine the performance of the presented work are given in the following section.

PERFORMANCE ANALYSIS
The statistical measures of sensitivity, specificity, and accuracy can be expressed in terms of the TP, FP, FN, and TN values. The statistical measures used to analyze the performance of the presented work are described in this section.

Accuracy
Accuracy is the number of correct predictions ($TP + TN$) divided by the total number of samples in the dataset ($TP + TN + FP + FN$). It quantifies how accurately the data is classified. Accuracy is computed using condition (10),
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$
where TN is the true negative, TP is the true positive, FP is the false positive, and FN is the false negative.

Sensitivity
Sensitivity is the proportion of true positives that are correctly recognized by a classification test. It shows how good the test is at detecting intrusion data. Sensitivity is computed using condition (11),
$$Sensitivity = \frac{TP}{TP + FN} \tag{11}$$

Specificity
Specificity is the proportion of true negatives that are correctly recognized by a classification test. It indicates how good the test is at identifying normal data. Specificity is computed using condition (12),
$$Specificity = \frac{TN}{TN + FP} \tag{12}$$

False-positive rate (FPR)
FPR is the ratio of the number of negatives incorrectly classified as positives to the total number of actual negatives. The false-positive rate is computed using condition (13),
$$FPR = \frac{FP}{FP + TN} \tag{13}$$

False-negative rate (FNR)
FNR is the proportion of positives that yield negative test outcomes. The false-negative rate is computed using condition (14),
$$FNR = \frac{FN}{FN + TP} \tag{14}$$

F-measure
The F-measure is a measure of a test's accuracy that combines precision and recall. It reaches its best value at 1 and its worst at 0. It is computed using condition (15),
$$F\text{-}measure = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \tag{15}$$
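The measures defined in this section can be computed from the four confusion-matrix counts, as in the short Python sketch below. The function name `confusion_metrics` and the example counts are illustrative assumptions, precision is taken as $TP / (TP + FP)$ and recall as equal to sensitivity, and the sketch assumes every denominator is non-zero.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return the standard measures derived from TP, FP, FN, and TN."""
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "fpr": fp / (fp + tn),            # equals 1 - specificity
        "fnr": fn / (fn + tp),            # equals 1 - sensitivity
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Hypothetical counts: 40 TP, 10 FP, 10 FN, 40 TN.
metrics = confusion_metrics(tp=40, fp=10, fn=10, tn=40)
```

Because FPR and FNR are the complements of specificity and sensitivity, a classifier that lowers the false positive rate necessarily raises specificity by the same amount.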

Receiver Operating Characteristics (ROC) curve
The ROC curve is a graphical tool that plots the intrusion detection rate (TPR) against the FPR. It is regarded as one of the effective metrics for assessing the performance of IDSs. In the ROC curve, the best detection performance is 0% FPR and 100% TPR. Furthermore, the area under the ROC curve reflects detection accuracy.
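A ROC curve and its area can be computed from classifier scores as in the following sketch. It assumes that higher scores indicate intrusion, that all scores are distinct (tied scores would need to be grouped), and that both classes are present; the function name `roc_curve_points` and the example labels and scores are hypothetical.

```python
import numpy as np

def roc_curve_points(labels, scores):
    """Return (fpr, tpr, auc), lowering the threshold one sample at a time."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")     # highest score first
    labels = labels[order]
    # Cumulative true and false positives as the threshold is lowered,
    # starting from the point (0, 0) where nothing is flagged.
    tps = np.concatenate(([0], np.cumsum(labels)))
    fps = np.concatenate(([0], np.cumsum(~labels)))
    tpr = tps / labels.sum()
    fpr = fps / (~labels).sum()
    # Area under the curve by the trapezoidal rule.
    auc = float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))
    return fpr, tpr, auc

# Six records: 1 marks intrusion, 0 marks normal traffic.
fpr, tpr, auc = roc_curve_points([1, 1, 0, 1, 0, 0],
                                 [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
```

In this example 8 of the 9 intrusion and normal pairs are ranked correctly, so the area under the curve is 8/9; a perfect detector would reach the point 0% FPR and 100% TPR and an area of 1.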

DATASET 1: KDD CUP 99 dataset
Table 2 compares the performance of the presented classifier with the existing classifiers on the KDD-CUP 99 dataset. The results of the proposed system are considerably better than those of the existing classifiers in terms of accuracy, sensitivity, specificity, precision, recall, F-measure, FPR, FNR, Kappa, and the Rank sum test. The feature selection of the presented work using krill herd optimization is compared with the existing GA and PSO optimization algorithms and with feature selection without optimization. The comparison in table 3 shows that the proposed feature selection performs better than the existing techniques. The classification performance in figure 4 and the feature selection performance with different optimization techniques in figure 5 show that the accuracy of the proposed work is much higher than that of the existing techniques. The ROC curves of the proposed and existing techniques are compared in figure 6. The ROC curve shows that the detection performance of the proposed intrusion detection is superior to that of the existing techniques.

DATASET 2: CIC IDS 2017 dataset
The classification performance of the proposed methodology and of the existing classifiers on the CIC IDS 2017 dataset is given in table 4, which shows that the presented work improves on the different measures. The feature selection of the proposed work is compared with the different existing optimization techniques on the CIC IDS 2017 dataset in table 5, and the results show that the proposed work outperforms the existing techniques on every performance measure. Figure 7 and figure 8 show that the presented technique is superior to the previous classifiers and optimization algorithms in terms of accuracy. The ROC curve for the CIC IDS 2017 dataset is given in figure 9. It shows that the detection performance of the proposed work is better than that of the existing techniques.

DATASET 3: ISCX IDS 2012 dataset
The performance of the presented work is compared with different existing classification techniques on the ISCX IDS 2012 dataset in table 6, and the results show that the proposed work outperforms the existing techniques on every performance measure. Table 7 shows that the proposed feature selection using krill herd optimization gives better results than the existing techniques on the different performance measures. The accuracy of the proposed work on the ISCX IDS 2012 dataset is compared with the different classifiers and optimization algorithms in figure 10 and figure 11. The figures show that the accuracy of the proposed work on the ISCX IDS 2012 dataset is considerably higher than that of the existing techniques. The performance of the presented system on the ISCX IDS 2012 dataset is examined by the ROC curve in figure 12, which shows that the prediction performance of the proposed strategy is better than that of the existing methods.

DATASET 4: CICDDOS 2019 dataset
Table 8 compares the performance of the proposed classifier with the existing classifiers on the CICDDOS2019 dataset. The proposed technique performs much better than the existing classifiers in terms of accuracy, sensitivity, specificity, precision, recall, F-measure, FPR, FNR, Kappa, and Rank sum measures. The proposed feature selection using krill herd optimization is compared with the existing optimization algorithms in table 9, and the comparison shows that the proposed work performs better than the existing techniques. The classification performance and the performance of the optimization algorithms on the CICDDOS2019 dataset, given in figure 13 and figure 14, show that the accuracy of the proposed work is much higher than that of the existing techniques. The ROC curves of the proposed and existing intrusion detection techniques on the CICDDOS2019 dataset are given in figure 15. They show that the detection performance of the proposed support value-based classification is better than that of the existing classifiers.

CONCLUSION
In this paper, we have presented a support value-based graph classification for categorizing data as normal or intrusion. Optimal feature selection using krill herd optimization yields superior results in choosing the features. In the presented technique, the input data is pre-processed and the features are optimally chosen using the optimization. Finally, the support value-based graph classification categorizes the data as normal or intrusion. The experimental results show that the presented classification outperforms the existing SVM, Naive Bayes, and random forest classifiers on performance metrics such as accuracy, sensitivity, specificity, precision, recall, F-measure, FPR, FNR, Kappa, and Rank sum measures.