Flow-Based Intrusion Detection System with Whale Optimization and Evolutionary Algorithms (EA) for Diversified Traffic Streams


Technological growth and advances in the internet have led to huge volumes of data that networks must be capable of transmitting, and securing this data is a challenging task because the growth of the internet also attracts many attacks. Researchers have proposed several machine learning, deep learning and ANN-based approaches for efficient attack detection. However, these approaches are prone to high false alarm rates and perform poorly on diversified incoming traffic, because they rely on packet-level or transaction-level features; their performance is inversely proportional to the diversity ratio of the packet-level features. To handle this, we introduce a combination of high-performing evolutionary algorithms and neural networks for attack classification at flow level with low false alarm rates and high detection accuracy. A unique set of flow features is defined to handle the traffic at flow level, and optimal features are selected using the Whale Optimization Algorithm (WOA). A combination of gravitational search (GS) and particle swarm optimization (PSO) is used in the attack detection phase to train the ANN, and the resulting model is called GSPSO-ANN with WOA. The performance of the proposed model is evaluated on the NSL-KDD and CSE-CIC-IDS2018 datasets, and the results are compared with other conventional ANN-based methods. The results indicate that the proposed GSPSO-ANN with WOA attains the highest detection accuracy with low false alarm rates and processing time, and maintains consistent performance for diversified traffic.


INTRODUCTION
The technological growth of the internet has led to applications in many areas of human life, such as banking, public networking, online transactions and electronic trade. However, knowledgeable attackers frequently exploit network vulnerabilities through denial of service (DoS) or distributed denial of service (DDoS) attacks [1]. A DDoS attack is similar to a traditional DoS attack, except that a huge amount of traffic is generated from many ends of a distributed environment using a botnet. The DDoS attack denies legitimate users access to the victim system: the attacker floods the victim or target network with a huge number of packets and exhausts victim resources such as bandwidth, disk space and computing power.
Flooding is one of the most powerful threats to the internet. The attacker builds a botnet to generate a huge amount of traffic and floods it towards the victim. Flooding attacks are classified as network/transport level or application level, based on the protocol used to generate and flood the traffic. Network/transport-level DDoS attacks are launched by generating a huge volume of traffic with protocols such as TCP, UDP and ICMP and exploiting their vulnerabilities; UDP flood, TCP flood and ICMP flood [2] are examples. The hypertext transfer protocol (HTTP) is the application-layer protocol most widely used by web servers and applications. The growth in HTTP traffic [3] and the technological development of the internet invite attackers to launch HTTP-based attacks.
In recent studies, rule-based data mining, artificial neural networks (ANN) [4], evolutionary algorithms and swarm intelligence have gained significant importance in addressing application-layer DDoS attacks. However, ANN-based methods have two shortcomings. First, they exhibit low detection rates. Second, the performance of the detection system is unstable for large traffic patterns. One reason for these shortcomings is that ANN-based methods are feature dependent and fail to handle traffic from diversified environments. Another reason is that an ANN gives better results when it is trained with a small number of patterns, whereas an increase in traffic from diversified environments, with diversified feature characteristics, makes the detection performance uncertain.
ANN techniques [5] are prone to overfitting when the training dataset comes from diversified networks and contains diversified feature characteristics describing normal and attack behavior. The attacker launches a DDoS attack with the help of bots or a botnet, which exhibit diversified characteristics and are distributed across diversified environments. As a result, the detection system shows unexpected results in the testing phase, such as unstable or low detection accuracy and high false alarms. One alternative is to use meta-heuristic algorithms: the combination of ANN and meta-heuristic optimization algorithms gives better results in handling application-layer DDoS attacks [6]. The main reasons to consider these meta-heuristic algorithms are: (i) they rely on the features, not on the characteristics or values of those features; (ii) gradient information is not required; (iii) local minima are bypassed easily; (iv) they are well suited to the diversified characteristics of traffic from distributed ends.
Nature-inspired or bio-inspired meta-heuristic algorithms solve optimization problems by imitating biological or physical phenomena of nature. These algorithms can be classified as evolutionary-based algorithms (EA), physical-based algorithms (PA) and swarm-based algorithms (SA). Evolution-based algorithms are inspired by the laws of natural evolution. The search process starts from a randomly generated population, which is evolved in subsequent iterations. The strength of evolution-based algorithms is that they choose the best individuals and combine them to define the individuals of the next iteration. This is achieved through feature optimization over the iterations. Popular evolution-based algorithms are Genetic Algorithms (GA), Evolution Strategy (ES), Genetic Programming (GP), Biogeography-Based Optimizer (BBO) and Population-Based Incremental Learning (PBIL). The process of evolution-based algorithms is shown in figure 1.
Physical-based algorithms mimic the physical rules of the universe. Popular algorithms in this category include Big-Bang Big-Crunch (BBBC), Gravitational Local Search (GLSA), Charged System Search (CSS) and the Gravitational Search Algorithm (GSA).
Swarm-based algorithms mimic the social behavior of groups of animals in nature. Particle Swarm Optimization (PSO) is the most popular swarm-based method; it mimics bird flocking and moves n particles around the search space to find the optimal solution. Other popular examples are Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), the Firefly Algorithm (FA) and the Bat Algorithm (BA). These meta-heuristic approaches are attractive because PSO has proved to be very competitive with evolutionary algorithms, and swarm-based algorithms offer some additional advantages over evolutionary algorithms (EA). For instance, evolutionary algorithms discard the old information when a new population is created, whereas swarm-based methods retain the information for subsequent generations. Swarm-based algorithms are also easier to implement because they use fewer operators than evolutionary approaches.
In this work, the aforementioned problems of ANN are addressed by combining the ANN with evolutionary algorithms (EA). A hybrid approach known as GS-ANN is proposed with the combination of ANN and Gravitational Search (GS) [7]. The literature shows that the swarm-based Particle Swarm Optimization (PSO) algorithm [8] is faster than the GS algorithm. The combination of GS and PSO, known as GSPSO, is therefore used to train the ANN, and the proposed model is named GSPSO-ANN. The GSPSO algorithm has proved competent at training neural networks on various datasets [9] [10] from different applications. The problem that ANN has with the diversified characteristics of incoming traffic is addressed with the proposed flow-based features (Section 3.2), which are independent of the detection model. The Whale Optimization Algorithm (WOA) reduces the dimensionality of the dataset by selecting the relevant features from the flow-based feature set. The performance of the proposed GSPSO-ANN method is validated on the NSL-KDD dataset [11] with standard performance parameters such as detection rate, mean squared error (MSE), decision time and training time, and the results are compared with traditional approaches.
Section 2 reviews flow-based attack detection and swarm and evolutionary algorithms. Section 3 explains the flow-level features, whale optimization and the GSPSO-ANN algorithm. Finally, Section 4 presents the analysis of results, followed by the conclusion.

RELATED WORK
This section reviews flow-based application-layer DDoS attack detection methods and the evolution of swarm and evolutionary algorithms for attack detection. A detailed analysis of these methods is presented at the end of this section.

Review of Flow based intrusion detection
The authors of [12] introduced semi-supervised DDoS attack detection and used entropy calculations to define the flow size in a given time interval. Whenever the distribution of incoming traffic changes rapidly, the traffic is divided into three clusters using a co-clustering mechanism. The information gain ratio is calculated to identify the abnormal cluster and its associated traffic, and the extra-trees algorithm is used to classify the incoming traffic. The drawback of this work is that it relies on many statistical approaches, which cannot handle the dynamic behavior of the traffic.
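The entropy-per-interval idea can be illustrated with a minimal sketch. It is not the method of [12]: the record layout `(timestamp, source)`, the function names and the two-second window are assumptions chosen only to show how the entropy of a traffic attribute can be tracked across fixed time intervals.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_per_window(records, window_seconds):
    """Group (timestamp, source) records into fixed windows and return
    {window_index: entropy of the source distribution in that window}."""
    windows = {}
    for timestamp, source in records:
        windows.setdefault(int(timestamp // window_seconds), []).append(source)
    return {w: shannon_entropy(srcs) for w, srcs in sorted(windows.items())}

# Hypothetical traffic: window 0 has four distinct sources,
# window 1 is dominated by a single source.
records = [(0.1, "a"), (0.5, "b"), (1.2, "c"), (1.9, "d"),
           (2.1, "x"), (2.4, "x"), (3.0, "x"), (3.7, "x")]
print(entropy_per_window(records, window_seconds=2.0))
```

A sharp drop or rise in the per-window entropy is the kind of rapid distribution change that would trigger the clustering step described above.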
The authors of [13] used packet header features, along with the count of packets received in a specific time interval, as key parameters to detect DDoS attacks with Renyi's generalized entropy technique. An entropy index is calculated to discriminate normal traffic from DDoS attacks and flash events. The limitation of this entropy-based detection scheme is that it detects only high-rate DDoS attacks.
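For reference, Renyi's generalized entropy of order alpha is H_alpha = log2(sum_i p_i^alpha) / (1 - alpha), which tends to the Shannon entropy as alpha approaches 1. The sketch below computes it over an empirical distribution; the function name and the choice of inputs are illustrative assumptions, not the feature set of [13].

```python
import math
from collections import Counter

def renyi_entropy(values, alpha):
    """Renyi entropy of order `alpha` (in bits) of the empirical distribution.

    Assumes `values` is non-empty and alpha >= 0. For alpha == 1 the
    Shannon limit is returned.
    """
    counts = Counter(values)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)

# Hypothetical packet sources in one interval: a skewed distribution.
skewed = ["a", "a", "a", "b"]
print(renyi_entropy(skewed, alpha=2), renyi_entropy(skewed, alpha=1))
```

Higher orders weight the dominant sources more heavily, which is one reason generalized entropy is attractive for separating concentrated flood traffic from normal traffic.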
A constraint-based group testing (GT) mechanism [14] is defined to discriminate application-layer DDoS attacks. The authors proposed partial, sequential and non-adaptive detection schemes. The advantage of this model is that the detection schemes effectively handle low-rate and high-rate DDoS attacks using the parameters response time and arrival rate; however, they fail to detect low-rate DDoS attacks in the early stages of transmission. The authors of [15] used generalized entropy to detect various low-rate and high-rate DDoS attacks. Their D-FACE detection method is executed at ISP level and processed at victim level; it computes the generalized entropy index, information distance and traffic rate on captured packet data to discriminate DDoS attacks. The flow features played a major role in detecting low-rate DDoS attacks.
The authors of [16] introduced a new data structure, called a sketch, built on hash tables to detect application-layer DDoS attacks. With a randomized hash function, the distribution of normal traffic is stable. The deviation between sketches is calculated with the Hellinger distance, and Bloom filters are used to discriminate normal traffic from attack traffic. The approach exhibits good detection accuracy for application-layer DDoS attacks but produces high false alarm rates. Punith and Mala [17] proposed a flow-based intrusion detection system that uses flow features instead of request-level parameters. Normal HTTP requests are differentiated from DDoS attacks with flow features such as the count of GET requests, GET request type and service time. Low-rate application-layer attacks are detected successfully with a Support Vector Machine (SVM) model, but this model identifies request flood attacks only.
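The sketch-and-distance step can be illustrated as follows. This is a simplified, single-row sketch under stated assumptions: the `build_sketch` helper, the CRC32 hash and the width of 16 counters are hypothetical choices, and the Bloom filter stage of [16] is not reproduced. The Hellinger distance itself is the standard definition.

```python
import math
import zlib

def build_sketch(keys, width=16):
    """Hash each key into one of `width` counters (a single-row sketch)."""
    counters = [0] * width
    for key in keys:
        counters[zlib.crc32(key.encode("utf-8")) % width] += 1
    return counters

def hellinger_distance(sketch_a, sketch_b):
    """Hellinger distance between two count vectors of equal length.

    Each vector is normalised to a probability distribution first, so both
    must contain at least one count. The result lies in [0, 1].
    """
    total_a, total_b = sum(sketch_a), sum(sketch_b)
    acc = 0.0
    for a, b in zip(sketch_a, sketch_b):
        acc += (math.sqrt(a / total_a) - math.sqrt(b / total_b)) ** 2
    return math.sqrt(acc / 2.0)

# Hypothetical request keys observed in two consecutive intervals.
previous = build_sketch(["u1", "u2", "u3", "u1"])
current = build_sketch(["u9", "u9", "u9", "u9"])
print(hellinger_distance(previous, current))
```

A distance near 0 means the two intervals have similar traffic distributions; a distance near 1 signals a large deviation that a detector could flag.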
An improved bio-inspired cuckoo search algorithm [18] is applied to flow features extracted from the network to classify traffic at flow level as normal or DDoS attack. The traffic flow size is calculated with an absolute time interval, and sessions are described with a unique set of features. The source diversity ratio feature is used to find the diversified sources involved in the time interval. The authors reported good detection accuracy, but the method is applicable only to HTTP flooding attacks.
In [19], detection of application-layer DDoS attacks is implemented with SVM, machine learning (ML), DBSCAN and K-Means algorithms. Browsing characteristics are used to differentiate attacks from normal traffic, and principal component analysis (PCA) is used to validate the HTTP requests in the traffic. The combination of algorithms defined in that paper addresses only flooding attacks. Application-layer GET-flood attacks are discriminated from HTTP traffic with parameters such as response index, repetition index, request index and popularity index in the web browser. In [20], attacks are categorized into constant-rate attacks, repeated attacks, flash attacks and dominant-page attacks. Kim et al. [21] proposed a defense method to mitigate flow-based application-layer DDoS attacks in wireless ad hoc networks. The number of requests sent over a session in a particular time frame is used to validate flooding attacks, and statistical parameters such as variance and standard deviation measure the deviation of the packet count in a time frame. The sources of a flooding attack are detected with two methods, relay-based and originator-based transmission techniques, defined in a blacklist. The method was evaluated with a packet-level simulator and fails to address various types of application-layer DDoS attacks.
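The use of standard deviation to measure packet-count deviation per time frame can be sketched as a simple threshold rule. The rule "mean plus k standard deviations", the value k = 3 and the function names are assumptions for illustration; [21] does not necessarily use this exact rule.

```python
import statistics

def deviation_threshold(baseline_counts, k=3.0):
    """Threshold = mean + k * population standard deviation of the baseline."""
    mean = statistics.mean(baseline_counts)
    stdev = statistics.pstdev(baseline_counts)
    return mean + k * stdev

def flag_frames(observed_counts, threshold):
    """Indices of time frames whose packet count exceeds the threshold."""
    return [i for i, count in enumerate(observed_counts) if count > threshold]

# Hypothetical per-frame packet counts.
baseline = [10, 12, 11, 9, 8]          # mean 10, population stdev sqrt(2)
threshold = deviation_threshold(baseline)
print(flag_frames([11, 40, 9, 15], threshold))
```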
The detection of HTTP-based application-layer DDoS attacks has been implemented with flow features in distributed environments for diversified traffic. The authors of [22] focused mainly on slow-rate DDoS attacks in which HTTP requests target the HTTP server. Bots from the botnet establish connections or sessions with incomplete GET or POST requests to launch the slow-rate attacks. The authors presented various HTTP-based application-layer detection mechanisms with their pros and cons and listed open issues for new researchers. Slow-rate HTTP-based application-layer DDoS attacks [23] are launched with application-specific protocols by exploiting the timeout period at the server, although the attacks themselves are not application specific. The bots target the victim server by establishing multiple sessions from the botnet towards the server, and repeatedly flood it with a huge volume of keep-alive, broken and incomplete requests with few transmitted bytes and low transmission time.
The survey reveals that application-layer DDoS attacks block the availability of the victim server by flooding it with a huge number of requests using HTTP GET/POST methods. The challenges to be addressed in order to increase detection accuracy with low false alarms in diversified traffic flows are listed below.
• Most of the methods in the literature use packet-level or request-level features to calculate the detection metrics. This increases the algorithm complexity and fails to handle a diversified incoming corpus. The behavior of the traffic is classified well using the flow properties of a flow-based intrusion detection system.
• Existing techniques do not address the diversified behavior of network traffic. When the quantity of incoming traffic rises, the diversity among the traffic features also increases. For example, two packets with different feature values may represent the same flooding attack.
• The majority of researchers proposed detection models for network-layer and transport-layer DDoS attacks, and very few focused on application-layer DDoS attacks.
• Many researchers proposed machine learning and metaheuristic algorithms for detecting application-layer DDoS attacks. However, machine learning algorithms rely on the features used to train the system. Researchers did not address the diversity of the features while training the system, even though the diversity of feature values affects the performance of the detection process. Selecting flow-based features rather than request-level features handles this diversity efficiently.

Review of Swarm and Evolutionary algorithms (EA)
The most frequently used evolutionary algorithms (EA) are the Genetic Algorithm (GA) and Genetic Programming (GP), which are motivated by the biological processes of living creatures. Evolutionary algorithms are also known as population-based algorithms. The GA was designed as a computational method that mimics the natural process of generating offspring in order to obtain optimal solutions to complex real-world problems [25]. It is a heuristic algorithm that mimics natural selection to obtain expert solutions for a specific problem context using nature-inspired operators such as mutation and crossover. EAs provide solutions for single-objective, multi-objective, combinatorial and non-deterministic problems [24]. Evaluating the fitness values for high-dimensional and multimodal complex problems is time consuming and computationally very expensive, so genetic algorithms fail to solve such problems efficiently. Intrusion detection systems (IDS) use EAs for feature optimization when defining classifiers to classify attacks.
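The mutation and crossover operators mentioned above can be sketched on binary chromosomes, which is the usual encoding when a GA selects features (one bit per feature). Everything here is an illustrative assumption: one-point crossover, bit-flip mutation, truncation selection, the parameter values and the toy fitness are not taken from any of the cited works.

```python
import random

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    point = rng.randint(1, len(parent_a) - 1)
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def bit_flip_mutation(chromosome, rate, rng):
    """Flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def evolve(fitness, n_genes, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Maximise `fitness` over bit strings; returns the best chromosome seen."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]        # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            for child in one_point_crossover(a, b, rng):
                children.append(bit_flip_mutation(child, mutation_rate, rng))
        population = children[:pop_size]
        best = max(population + [best], key=fitness)
    return best

# Toy fitness (hypothetical): number of selected genes.
print(evolve(sum, n_genes=12))
```

The fitness function is called once per individual per generation, which is where the cost noted above for high-dimensional problems comes from.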
In [26], the GA and Differential Evolution (DE) evolutionary algorithms are proposed for attack detection. The evolutionary algorithm takes the features and calculates a fitness value to determine the significance of each. The classification module is trained with the features selected from the dataset and performs classification with various derived attack patterns. The GNP was designed with fuzzy association rules to handle the continuous and discrete attributes of the attack dataset [27]. The intrusive patterns in the traffic or dataset are detected and extracted by designing the best rules with a directed graph structure. Combinations of GAs and rule-based systems are usually called learning classifier systems [28].
A GA has been used to extract information from network flows and generate attack signatures through network flow analysis. A fuzzy logic classifier, together with the trained classifier pattern, is used to detect whether a specific traffic instance is normal or an attack [29]. Differential Evolution (DE) [30] was introduced to detect anomalies or attacks in the incoming traffic with selected features in an IDS. In DE, candidate solutions are obtained by applying self-mutation to a large population, and new candidate solutions are defined with a weighted difference mechanism among individuals. The authors of [31] presented a comparative analysis of GA, DE and PSO in which traffic classification is implemented with SVM and the three algorithms are used only for feature selection. Of the 41 features in the KDD99 dataset, GA, PSO and DE select 16, 15 and 31 features respectively. DE outperforms PSO and GA in classifying the incoming traffic as attack or normal.
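The weighted difference mechanism of DE can be made concrete with the common DE/rand/1 mutation, v = a + F (b - c), followed by binomial crossover. This is a minimal sketch of the textbook operators; the function names and the scale factor are assumptions, and it is not the configuration used in [30] or [31].

```python
import random

def de_mutant(population, target_index, scale, rng):
    """DE/rand/1 mutation: v = a + scale * (b - c), where a, b and c are
    three distinct individuals other than the target.
    Requires a population of at least four individuals."""
    candidates = [i for i in range(len(population)) if i != target_index]
    ia, ib, ic = rng.sample(candidates, 3)
    a, b, c = population[ia], population[ib], population[ic]
    return [a[d] + scale * (b[d] - c[d]) for d in range(len(a))]

def binomial_crossover(target, mutant, cr, rng):
    """Take each gene from the mutant with probability `cr`; one randomly
    chosen gene always comes from the mutant."""
    forced = rng.randrange(len(target))
    return [mutant[d] if (rng.random() < cr or d == forced) else target[d]
            for d in range(len(target))]

rng = random.Random(0)
population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
mutant = de_mutant(population, target_index=0, scale=0.8, rng=rng)
print(binomial_crossover(population[0], mutant, cr=0.9, rng=rng))
```

In a full DE loop the trial vector replaces the target only if its fitness is at least as good, which is the selection step omitted here.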
A self-organized clustering mechanism [32] is designed as a hybrid of a support vector machine (SVM) and Ant Colony Optimization (ACO). The hybrid benefits from the clustering efficiency of self-organized ACO and the classification efficiency of SVM. The combination of SVM and ACO is also efficient at selecting the optimal features from the original feature set.
Particle Swarm Optimization (PSO) is a global optimization algorithm modeled on bird flocking or fish schooling, in which a group of agents converges on a common goal by perceiving feedback from the others in the swarm [33]. The swarm is a collection of distributed agents with similar behavior that interact with each other to obtain the optimal solution. PSO is normally used to solve non-linear, discontinuous and non-differentiable problems. The Artificial Bee Colony (ABC) algorithm is used in IDS in combination with learning algorithms for better classification of incoming traffic as normal or attack traffic. In [34], the SVM parameters are identified with the standard ABC algorithm and the optimum feature set is obtained using a binary ABC algorithm, with the accuracy rate as the fitness value.
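The way PSO particles use feedback from the swarm can be sketched with the standard velocity update, in which each particle is pulled towards its own best position and the swarm's best position. The inertia weight, acceleration coefficients, bounds and the sphere objective below are illustrative assumptions, not values from the cited works.

```python
import random

def pso_minimize(objective, dim, n_particles=15, iterations=50,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimise `objective` with a basic global-best PSO.
    Returns (best position, best value, best value after each iteration)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull to personal best + pull to swarm best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

def sphere(p):
    """Toy objective (hypothetical): sum of squares, minimum at the origin."""
    return sum(x * x for x in p)

best, value, history = pso_minimize(sphere, dim=3)
print(value)
```

Because each particle keeps its personal best, information from earlier iterations is retained rather than discarded, and no gradient of the objective is needed.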
The Firefly Algorithm (FA) has been used in [35] to select optimized features from the dataset, improving the efficiency of the classifier and thus the accuracy of DDoS attack detection. Accurate detection of network anomalies or attacks is attained by combining FA with the k-harmonic means algorithm, which minimizes the cluster initialization problem. The Bat Algorithm (BA) is inspired by the echolocation behavior of bats and provides efficient solutions to single-objective and multi-objective optimization problems. In IDS, BA is used for feature optimization, for defining the classifier parameters, and for detecting attacks and classifying them into attack classes. In [36], BA is used to optimize the SVM parameters and kernel parameters, and the information gain algorithm is used along with BA to define the optimized feature set and detect DDoS attacks accurately.
In [37], a network anomaly-based IDS is introduced in which an improved version of BA is combined with SVM. The input parameters of the SVM are optimized with the standard BA, and a binary variant of BA serves as a wrapper-based feature selection method. The BA improves the attack detection accuracy in both roles and is applied in the exploitation and exploration phases. The proposed method outperformed SVM, standard BA and PSO in experiments on the NSL-KDD dataset. In [38], a pollination-based optimization method (FPA) is used for feature optimization, and the optimized features of the IDS are treated as the local and global properties of FPA. The dataset is modeled as plants, the population is the number of samples in the dataset, and the features act as pollinators in the feature selection process. Two classifiers, NB and J48, extract the values of the selected features from the incoming traffic. The advantages of EAs are listed below, and the evaluation of various methods is given in Table 1.
• EAs are intrinsically parallel and easily handle large volumes of attack data.
• They provide a population of solutions instead of a single solution, which suits behavior-based IDS where user profiles are considered when detecting attacks or intrusions in the network.
• EAs can be retrained easily and adapt well, so they can be applied under variable environment conditions to detect DDoS attacks.
• The controlling parameters of EAs, such as crossover and mutation probabilities, change over time and are suitable for extracting rules from the detection system.
Although EAs are efficient at DDoS attack detection, they also have some limitations, which are given below.
[39]. Approach: A DE model with binomial crossover, simple selection and a single-vector mutation operator for attack detection and classification. Advantage: The model effectively classifies the attack traffic with appropriate selection of features. Limitation: It failed to validate the traffic for multi-class data and exhibited poor performance.
E.-S. M. El-Alfy [40]. Approach: Run-time analysis and MapReduce implementation of sequential GA and parallel GA. Advantage: The experimental results reveal that the parallel GA is more consistent than the sequential GA. Limitation: The number of parameters and the volume of incoming traffic play a vital role in the performance.
M. G. Raman, N. Somu [41]. Approach: The hybrid approaches GA-SVM and SVM-GA are used for detection on the incoming traffic flow. Advantage: The detection accuracy is improved, with low false alarm rates, by feature optimization with GA. The SMM-SA combination provides better scalability and adaptability. Limitation: The diversity of the feature values is not addressed, and the diversified behavior of the incoming flow prevents the detection method from remaining consistent.
A. H. Hamamoto, L. F. Carvalho [42]. Approach: The hybrid approach Fuzzy logic-GA with binary tournament selection. Advantage: The proposed Fuzzy Logic-GA combination is independent of the detection process and provides better detection accuracy. Limitation: The false alarm rate is very high and the scalability is very low.
H. M. Rais [43]. Approach: The combination ACO-SVM is used for feature selection and attack detection in the network. Advantage: The classification accuracy is improved by feature selection using ACO. Limitation: The process is time consuming and unable to handle diversified traffic.
Y. Wan, M. Wang [44]. Approach: The combination of binary ACO and GA for feature optimization and attack detection. Advantage: The evolutionary fitness curve is used to define the results and address the traffic diversity. Limitation: The false alarm rate is high, and benchmark datasets are not used to evaluate the proposed method.
M. H. Ali, B. A. D. Al Mohammed [45]. Approach: The combination of FLN and PSO is used for feature optimization and attack detection. Advantage: Improved detection accuracy is attained with the FLN-PSO combination; the accuracy is proportional to the number of neurons used in the system. Limitation: When the volume of traffic increases, the performance of the system decreases and it exhibits high false alarm rates.
Achieved high attack detection rate and classification rate using RF and PSO. Improved TPR and FPR values. The computation cost and time are very low, and it also addressed the diversified web traffic. The performance is proportional to the incoming traffic size. The false alarm rate is 0.01.
J. Yang, Z. Ye, L. Yan, W. Gu, R. Wang [48]. Approach: The combination of ABC and MNB is used for feature optimization and application-layer attack detection. Advantage: The processing time is very low because of the ABC algorithm, and the method provides better solutions to the diversity of the traffic features. Limitation: The detection accuracy is low (90%) and the false alarm rate is high (10%).
S. A. R. Shah [49]. Approach: The combination SVM-FA is proposed for feature optimization and attack detection. Advantage: The processing time is very low and there are few false alarms. Limitation: The detection accuracy is poor for high-speed networks, and diversity is not addressed.
B. Selvakumar, K. Muneeswaran [50]. Approach: The combination of C4.5, NB and FA is used for feature optimization and detection of HTTP-based flooding attacks. Advantage: The computational cost and detection accuracy are improved by the feature selection algorithm FA. Limitation: The detection accuracy for HTTP flood attacks is very poor, with high false alarms.
A.-C. Enache [51]. Approach: The combination of C4.5, SVM and binary BA is used for feature optimization and application-layer DDoS attack detection. Advantage: The number of iterations in feature optimization is very low, which minimizes the time and improves the performance of the process. Limitation: False alarms are high and accuracy is poor for large datasets.
W. Park, S. Ahn [52]. Approach: The combination of SVM and FPA is used to detect application-layer DDoS attacks. Advantage: The detection accuracy is improved with FPA for linear and non-linear classes. Limitation: The performance is poor for diversified data, with high false alarms.
Zhang, Y., Li, P., & Wang, X. [53]. Approach: The number of hidden layers and DBN input nodes are defined with GA. Advantage: The detection accuracy is 97% and false alarms are 7%. Limitation: The performance is not reliable; it is inversely proportional to the input volume.

PROPOSED WORK
This section describes the flow-based detection of HTTP-based application-layer DDoS attacks with the Whale Optimization Algorithm (WOA) and the swarm and evolutionary algorithm GSPSO-ANN. The first subsection presents the framework of the proposed application-layer DDoS detection, the second presents the unique set of proposed flow-level features used to handle traffic at flow level, the third gives a detailed analysis of the Whale Optimization Algorithm (WOA), the fourth explains the GSPSO-ANN algorithm for traffic classification, and finally the proposed method is evaluated with experimental results. Figure 1 shows the framework of the application-layer DDoS attack detection. The work of each module in this framework is described in the "Experimental results" section (Sect. 4) for further understanding.
ANN design phase: The network traffic is the input to the ANN training and testing process, at packet level or request level with N attributes. The performance of the attack detection process depends on these packet-level or request-level features. When the traffic volume increases, it is difficult to maintain consistent detection performance with request-level features because of the diversity of the transactions. To overcome these limitations, flow-level features are described in Section 3.2 and their values are extracted from the input traffic.
In this work, the ANN is designed as a multilayered ANN, also known as a Multilayer Perceptron (MLP), which contains at least three layers of nodes: an input layer, a hidden layer and an output layer. The hidden (middle) layers apply the non-linear activation function, and each node is considered one neuron. The inter-layer weights connect the input layer to the hidden layer (V) and the hidden layer to the output layer (W), and the MLP carries M biases. There is a single output node, and during the testing or detection phase a threshold is calculated for the processed input instances to produce the result of the detection phase. This MLP uses a supervised learning approach called backpropagation. MLPs are frequently used for pattern recognition and classification, and they also provide solutions for problems that are not linearly separable.
Training of ANN with dataset: Figure 2 shows the architecture of DDoS attack detection in HTTP flow streams. In the data input step, the network traffic dataset is given as input to the training and testing process of the ANN, and the knowledge (the weights V and W and the biases) is updated with the GSPSO algorithm. The values of the flow features are extracted from the input data, because the proposed approach is flow based and avoids dependency on request-level or packet-level features. The Whale Optimization Algorithm (WOA) is adopted for feature selection because the input dataset is massive. WOA eliminates the unimportant features from the flow feature set and reduces the dimensionality of the dataset to minimize the training and testing time.
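To make the feature-elimination role of WOA concrete, the following is a minimal sketch built on the basic WOA moves (encircling the best whale, exploring around a random whale, and the spiral bubble-net move). Several things are assumptions rather than the configuration of this paper: positions are kept in [0, 1] and thresholded at 0.5 to obtain a binary feature mask, the best whale is updated immediately after each move, and the toy fitness is hypothetical. In practice the fitness would be derived from classifier error on the selected features.

```python
import math
import random

def to_mask(position):
    """A feature is selected when its position value exceeds 0.5 (assumed rule)."""
    return [1 if v > 0.5 else 0 for v in position]

def woa_select(fitness, n_features, n_whales=10, iterations=30, seed=0):
    """Minimise fitness(mask) over binary feature masks with a basic WOA."""
    rng = random.Random(seed)
    whales = [[rng.random() for _ in range(n_features)] for _ in range(n_whales)]
    best = min(whales, key=lambda x: fitness(to_mask(x)))[:]
    best_val = fitness(to_mask(best))
    for t in range(iterations):
        a = 2.0 - 2.0 * t / iterations        # decreases linearly from 2 towards 0
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # encircle the best whale (|A| < 1) or explore around a random one
                ref = best if abs(A) < 1.0 else rng.choice(whales)
                new = [ref[d] - A * abs(C * ref[d] - x[d]) for d in range(n_features)]
            else:
                # spiral (bubble-net) move towards the best whale, shape constant b = 1
                l = rng.uniform(-1.0, 1.0)
                new = [abs(best[d] - x[d]) * math.exp(l) * math.cos(2 * math.pi * l)
                       + best[d] for d in range(n_features)]
            whales[i] = [min(1.0, max(0.0, v)) for v in new]
            val = fitness(to_mask(whales[i]))
            if val < best_val:
                best, best_val = whales[i][:], val
    return to_mask(best), best_val

def toy_fitness(mask):
    """Hypothetical cost: features 0 and 2 are 'relevant'; missing one costs 5,
    and every selected feature costs 1."""
    missing = sum(1 for idx in (0, 2) if mask[idx] == 0)
    return 5 * missing + sum(mask)

print(woa_select(toy_fitness, n_features=6))
```

The returned mask plays the role of the reduced feature set: only the columns with a 1 would be passed on to ANN training.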
The GSPSO algorithm is used in two phases of ANN learning. First, the initial parameters, namely the weights V and W and the biases, are set for each layer; these are then updated in each iteration. The number of nodes in the first layer is determined by the number of features selected in the feature selection phase. In this paper, a multilayer perceptron with K input nodes, N hidden nodes and one output (prediction) node is used for classification. This architecture is denoted K:N:1, and there are K × N weights with M biases in the hidden layer. The preprocessing step is defined as a result of using GSPSO-ANN as the classifier, and two stages of processing are implemented. The sigmoid function is used as the activation function.
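A forward pass through a K:N:1 perceptron can be sketched as below, assuming the standard logistic form sigmoid(x) = 1 / (1 + e^(-x)). The layout of V as a K x N list, one bias per hidden node plus one output bias, the 0.5 decision threshold and the class labels are assumptions made for illustration; training of V, W and the biases (by GSPSO in this work) is not shown.

```python
import math

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def mlp_forward(features, V, hidden_bias, W, output_bias):
    """Forward pass of a K:N:1 perceptron.

    V[i][j] is the weight from input i to hidden node j (K x N),
    W[j] is the weight from hidden node j to the single output node.
    """
    n_hidden = len(W)
    hidden = []
    for j in range(n_hidden):
        s = hidden_bias[j] + sum(features[i] * V[i][j] for i in range(len(features)))
        hidden.append(sigmoid(s))
    return sigmoid(output_bias + sum(hidden[j] * W[j] for j in range(n_hidden)))

def classify(features, V, hidden_bias, W, output_bias, threshold=0.5):
    """Map the output to a label; the threshold and labels are assumed."""
    score = mlp_forward(features, V, hidden_bias, W, output_bias)
    return "attack" if score >= threshold else "normal"

# Hypothetical 2:3:1 network with hand-picked weights.
V = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
W = [1.0, -1.0, 0.5]
print(mlp_forward([1.0, 2.0], V, [0.0, 0.0, 0.0], W, 0.0))
```

A meta-heuristic trainer would flatten V, W and the biases into one position vector and use the network's error on the training set as the fitness of that position.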
The mean squared error (MSErr) is used as the minimization function and the MSE function is given below. Where NoPtt denotes the number of input instances from raining dataset.
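The forward pass of the K:N:1 network and the MSErr objective described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names, the random weight range, the toy input and the 0.5 decision threshold are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, V, b_hidden, W, b_out):
    """Forward pass of a K:N:1 perceptron.

    X: (n, K) inputs, V: (K, N) input-to-hidden weights,
    b_hidden: (N,) hidden biases, W: (N,) hidden-to-output weights,
    b_out: scalar output bias. Returns n outputs in (0, 1).
    """
    hidden = sigmoid(X @ V + b_hidden)
    return sigmoid(hidden @ W + b_out)

def mse(targets, outputs):
    """Mean squared error over all training instances."""
    return float(np.mean((targets - outputs) ** 2))

# Toy data (hypothetical): 5 instances with K = 4 selected features, N = 3 hidden nodes.
rng = np.random.default_rng(0)
K, N = 4, 3
X = rng.random((5, K))
y = np.array([0.0, 1.0, 0.0, 1.0, 1.0])
V = rng.uniform(-1.0, 1.0, (K, N))
b_hidden = rng.uniform(-1.0, 1.0, N)
W = rng.uniform(-1.0, 1.0, N)
b_out = 0.0

outputs = forward(X, V, b_hidden, W, b_out)
labels = (outputs >= 0.5).astype(int)  # assumed threshold: 1 = attack, 0 = normal
print(mse(y, outputs), labels)
```

In this sketch the single output node yields a score per instance, and the threshold converts it into the attack or normal decision mentioned for the detection phase.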
Testing ANN: The ANN classifier is tested after its training with the training dataset is complete. In the testing phase, the predicted output is matched to the closest target class, and the corresponding action is taken for the current instance of the testing dataset based on this matched class.

Defining flow level features
The proposed method of application layer DDoS attack detection is implemented at flow level instead of transaction level. The incoming traffic is processed in flow intervals rather than transactions, and the flow level features are listed below.

Feature | Feature-name | Description
 | Asymmetric flows variation rate in a given time frame | The deviation in asymmetric flow in a time frame.
F5 | Percentage of tiny packet flows | The contribution of small-length flows in a time frame.
F6 | Percentage of client connections | Establishment of a connection with the incoming flow or request.
F7 | Percentage of messages with urgent or keep-alive data | The contribution of data/keep-alive messages in a specific flow.
F8 | Average packet size | The average size of urgent/keep-alive packets in a flow.
F9 | Average interval time between messages | Time elapsed between two consecutive keep-alive or urgent messages in a given time interval.
F10 | Percentage of GET/POST requests | Contribution of GET/POST requests alone in each flow.
F11 | Average of GET/POST interval | Time elapsed between two consecutive GET or POST messages in a given time interval.
F12 | Ratio of incoming requests | The percentage of incoming requests from individual clients.
F13 | Ratio of GET and POST sequence requests | The ratio of GET and POST message combinations in a given flow of requests.
F14 | Client total service time | The amount of time allocated to a specific client by the server.
F15 | Bandwidth consumption | The bandwidth consumed in each session.
F16 | Source Diversity Ratio (SR) | The number of sources involved in generating the incoming traffic in a specific time frame.
F17 | Average server waiting time | The amount of time the server waits to finish accepting a client request.
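To illustrate how a few of the features listed above could be computed for one time frame, a small sketch follows. The record layout (`src`, `method`, `time`, `size`), the 100-byte threshold for a tiny packet, and the definition of source diversity as distinct sources divided by the number of requests are assumptions made for this example only; the feature definitions of this paper are given only in words and operate on flows, whereas this simplification counts individual request records.

```python
from statistics import mean

def flow_features(records, tiny_threshold=100):
    """Compute a few flow-level features for one time frame.

    records: list of dicts with hypothetical keys 'src', 'method', 'time', 'size'.
    tiny_threshold: assumed size (bytes) below which a record counts as tiny.
    """
    n = len(records)
    if n == 0:
        raise ValueError("empty time frame")
    # Timestamps of GET/POST requests, used for the share and the average interval.
    get_post_times = sorted(r["time"] for r in records if r["method"] in ("GET", "POST"))
    gaps = [b - a for a, b in zip(get_post_times, get_post_times[1:])]
    return {
        "pct_tiny": 100.0 * sum(r["size"] < tiny_threshold for r in records) / n,
        "pct_get_post": 100.0 * len(get_post_times) / n,
        "avg_get_post_interval": mean(gaps) if gaps else 0.0,
        "source_diversity": len({r["src"] for r in records}) / n,
    }

# Hypothetical time frame with four requests from three sources.
frame = [
    {"src": "10.0.0.1", "method": "GET", "time": 0.0, "size": 60},
    {"src": "10.0.0.2", "method": "POST", "time": 1.0, "size": 500},
    {"src": "10.0.0.1", "method": "HEAD", "time": 1.5, "size": 40},
    {"src": "10.0.0.3", "method": "GET", "time": 3.0, "size": 300},
]
features = flow_features(frame)
print(features)
```

Because each value is aggregated over the whole frame, no single request determines a feature, which is the property the flow level design relies on.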

Extracting the Flow features
The

Whale Optimization Algorithm (WOA) for feature selection
This section introduces a meta-heuristic optimization algorithm known as the Whale Optimization Algorithm (WOA) [54], which imitates the hunting behavior of humpback whales. The key difference between WOA and the existing bio-inspired meta-heuristic approaches is that WOA obtains its optimization results by simulating this hunting behavior, using the best search agent to chase the prey together with the bubble-net attacking behavior. The bubble-net attacking behavior is simulated with a spiral. The optimization results show that WOA is competitive with existing methodologies.

Motivation of WOA:
Whales are highly intelligent and emotional animals. Spindle cells, which are found in both humans and whales, are responsible for emotions, judgment and social behavior. Whales have twice as many of these cells as an adult human, which accounts for their smartness. It has been shown that whales can learn, think, judge, communicate in their own language and become emotional as humans do, although they exhibit less smartness than humans. The social behavior of whales is also interesting: they live alone or in groups, and some species, such as humpback whales, live as a family for their whole life span. The preferred prey of these whales is krill and small fish. The hunting behavior of humpback whales is of particular interest, because they use a bubble-net feeding strategy. Humpback whales hunt krill or small fish close to the surface by generating distinctive bubbles along a circle or a "9"-shaped path, which is shown in figure 3. In this paper we consider the unique bubble-net feeding behavior of the humpback whale and mathematically model a spiral bubble-net feeding technique to implement optimization.

Mathematical model and optimization algorithm
The hunting process of humpback whales for krill or small fish near the surface of the water consists of three phases: the encircling prey phase, the spiral bubble-net attacking phase and the search for prey phase. This section describes all these phases with their mathematical models and presents the Whale Optimization Algorithm (WOA).
Encircling prey phase: The humpback whales encircle the prey once its location is identified. Because the position of the optimal design in the search space is not known in advance, WOA assumes that the current best solution is the target prey or is close to the optimum. The remaining search agents update their positions towards the selected best search agent. The following equations demonstrate this behavior.
$$\vec{D} = |\vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t)|$$
$$\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D}$$
$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}, \qquad \vec{C} = 2\vec{r}_2$$
Here the current iteration is represented as $t$, $\vec{X}^{*}$ denotes the current best solution and $\vec{X}$ denotes the position vector. The search agents are the whales, which update their positions towards the prey with reference to the best known solution. Updating the vectors $\vec{A}$ and $\vec{C}$ controls the movement of the whales towards the prey in the search space. The parameters $\vec{r}_1$ and $\vec{r}_2$ denote random values from 0 to 1, and the value of $\vec{a}$ is decreased linearly from 2 to 0 over time using the following equation.
$$a = 2 - \frac{2t}{t_{\max}}$$
Here $t_{\max}$ is the maximum number of iterations.
Bubble-net attacking phase: This phase models the bubble-net hunting behavior of the humpback whale and represents the exploitation phase, in which the whales attack the prey around the best solution found so far. To model the bubble-net attacking behavior mathematically, a spiral updating method is deployed alongside a shrinking encircling mechanism.
The shrinking of the encircling behavior is accomplished by decreasing the value of $\vec{a}$ linearly as described above, which also narrows the range of $\vec{A}$. The position of the neighboring search agent is created with the spiral method, which is shown in the following equation.
$$\vec{X}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t)$$
Here $\vec{D}' = |\vec{X}^{*}(t) - \vec{X}(t)|$ is the distance between the whale and the prey, $b$ is a constant that defines the shape of the spiral and $l$ is a random number in $[-1, 1]$. The humpback whales swim around the prey along a spiral path and, in parallel, within a shrinking circle. In the optimization process each of the two mechanisms is selected with a probability of 50%, where $r$ is a random number from 0 to 1. It is represented in the following equation.
$$\vec{X}(t+1) = \begin{cases} \vec{X}^{*}(t) - \vec{A} \cdot \vec{D} & \text{if } r < 0.5 \\ \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t) & \text{if } r \geq 0.5 \end{cases}$$
Search for prey (exploration) phase: In this phase the humpback whales search randomly for prey based on each other's positions. For this, $\vec{A}$ is assigned random values greater than +1 or less than -1, which forces the search agents to move far away from the reference search agent. Hence, a randomly selected agent $\vec{X}_{rand}$ is used to update the position of a search agent instead of the best agent identified so far. This process is mathematically modeled as follows.
$$\vec{D} = |\vec{C} \cdot \vec{X}_{rand} - \vec{X}|$$
$$\vec{X}(t+1) = \vec{X}_{rand} - \vec{A} \cdot \vec{D}$$
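The three mechanisms described above (shrinking encircling, spiral update and random search) can be combined into one position update per whale. The sketch below follows the commonly used WOA formulation and is given only as an illustration: the function name `woa_update`, the use of one scalar coefficient per whale, the spiral constant `b = 1` and the sphere test function are assumptions, not details taken from this paper.

```python
import numpy as np

def woa_update(X, best, population, t, t_max, rng, b=1.0):
    """Return the new position of one whale X at iteration t."""
    a = 2.0 - 2.0 * t / t_max            # decreases linearly from 2 towards 0
    A = 2.0 * a * rng.random() - a       # scalar coefficient in [-a, a]
    C = 2.0 * rng.random()               # scalar coefficient in [0, 2]
    if rng.random() < 0.5:
        if abs(A) < 1.0:
            # Exploitation: shrink the circle around the best agent.
            D = np.abs(C * best - X)
            return best - A * D
        # Exploration: move relative to a randomly chosen agent.
        X_rand = population[rng.integers(len(population))]
        D = np.abs(C * X_rand - X)
        return X_rand - A * D
    # Spiral (bubble-net) move around the best agent.
    l = rng.uniform(-1.0, 1.0)
    D_prime = np.abs(best - X)
    return D_prime * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best

def sphere(x):
    """Toy objective to minimize (assumed for the demonstration)."""
    return float(np.sum(x ** 2))

rng = np.random.default_rng(1)
pop = rng.uniform(-5.0, 5.0, (10, 3))
best = min(pop, key=sphere).copy()
initial_best = sphere(best)
t_max = 50
for t in range(t_max):
    pop = np.array([woa_update(x, best, pop, t, t_max, rng) for x in pop])
    candidate = min(pop, key=sphere)
    if sphere(candidate) < sphere(best):
        best = candidate.copy()
final_best = sphere(best)
print(initial_best, final_best)
```

Early iterations have a large `a`, so `|A|` often exceeds 1 and the whales explore; as `a` shrinks, the update increasingly contracts around the best agent.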
In the Whale Optimization Algorithm (WOA) as applied here, each whale is defined as a randomly selected subset of features, and a learning algorithm is used to evaluate the fitness of each feature subset. The best search agent is the feature subset with the best fitness. This best feature subset is used to update the positions of the other whales with the bubble-net hunting method. In the next iteration the updated feature subsets are used as the positions of the whales, and this process is repeated until the final subset contains the most informative features. The selected features are the input for the detection algorithm, which classifies the incoming traffic as attack prone or normal traffic. The algorithm of WOA is given below.
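How a whale position maps to a feature subset, and how the fitness of that subset might be evaluated, can be sketched as below. The position update itself is omitted here. Everything in this example is an assumption chosen for illustration: the 0.5 threshold for turning a continuous position into a feature mask, the nearest-centroid classifier standing in for the learning algorithm, and the weighted objective that combines the classification error with the fraction of selected features.

```python
import numpy as np

def position_to_mask(position, threshold=0.5):
    """Turn a continuous whale position into a boolean feature mask."""
    mask = position > threshold
    if not mask.any():
        # Keep at least one feature so the classifier has an input.
        mask[np.argmax(position)] = True
    return mask

def subset_fitness(mask, X, y, alpha=0.99):
    """Lower is better: weighted sum of error rate and subset size.

    A nearest-centroid rule evaluated on the same data is used as a
    stand-in learning algorithm (an assumption for this sketch).
    """
    Xs = X[:, mask]
    c0 = Xs[y == 0].mean(axis=0)
    c1 = Xs[y == 1].mean(axis=0)
    closer_to_1 = np.linalg.norm(Xs - c1, axis=1) < np.linalg.norm(Xs - c0, axis=1)
    error = np.mean(closer_to_1.astype(int) != y)
    return float(alpha * error + (1.0 - alpha) * mask.sum() / mask.size)

# Synthetic data: feature 0 separates the classes, the other four are noise.
rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = rng.normal(size=(40, 5))
X[:, 0] = y * 10.0 + rng.normal(scale=0.1, size=40)

good = position_to_mask(np.array([0.9, 0.1, 0.2, 0.1, 0.3]))
bad = position_to_mask(np.array([0.1, 0.9, 0.8, 0.7, 0.6]))
print(subset_fitness(good, X, y), subset_fitness(bad, X, y))
```

The size penalty makes a small informative subset score better than a larger one with the same error, which matches the goal of reducing dimensionality.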

Training of ANN with hybrid algorithms
In this section the combination of GS and PSO is used to train the detection classifier. Artificial Neural Network (ANN) classifiers are generally trained with the back propagation method, and many of them become trapped in local optima. This problem is handled by combining GS and PSO, and the ANN is trained with this GSPSO hybrid method. The GSPSO-ANN algorithm is defined below.
Mathematically, the GSPSO system is similar to the GS algorithm: it is an isolated system of agents that obeys the Newtonian laws of gravity and motion [55]. The mathematical formulas required to define GSPSO-ANN are given as follows.
Let us assume a system with $N$ agents, where the position of the $k$-th agent is defined as $X_k = (x_k^1, x_k^2, \ldots, x_k^d, \ldots, x_k^n)$ for $1 \le k \le N$, and $x_k^d$ denotes the position of the $k$-th agent in the $d$-th dimension.
The gravitational force at time $t$ acting on mass $i$ from mass $j$ is represented as follows. In the above equation (3), $G(t)$ denotes the gravitational constant at time $t$, $M_{pi}$ denotes the passive gravitational mass of agent $i$, and $M_{aj}$ denotes the active gravitational mass of agent $j$. The value of $G$ and the masses of the agents as functions of time are defined using equation (4). The gravitational constant is denoted as $G(t)$, its initial value is denoted as $G_0$, and the maximum number of iterations is denoted as $t_{\max}$.
Here $M_{ii}$ symbolizes the inertial mass of agent $i$ and $M_i$ denotes the gravitational mass of the agent.
In equation (6), the fitness value of agent $i$ at time $t$ is defined as $fit_i(t)$, and $best(t)$ and $worst(t)$ are defined mathematically for a global minimization problem. The Euclidean distance between agents $i$ and $j$ is defined with the following equations. The stochastic characteristic of the GSPSO algorithm is that the total force acting on agent $i$ in dimension $d$ is defined as the randomly weighted sum of the $d$-th components of the forces exerted by the other agents.
The randomized characteristic of the search is obtained using $rand_j$, a random number with values ranging from 0 to 1. The mass of the agent and the force acting on it are used to evaluate the acceleration of each agent at time $t$.
The equations (14)-(16) are used to define the velocity and position of agent $i$. Here the weighting function is denoted as $w$, the velocity of agent $i$ at time $t$ is represented as $V_i(t)$, the acceleration coefficients are $c_1$ and $c_2$, the best solution found so far is denoted as $gbest$, the position of agent $i$ at time $t$ is represented as $X_i(t)$, the acceleration of agent $i$ is symbolized as $ac_i(t)$, and two random numbers in the range [0,1] are denoted as $r_1$ and $r_2$.
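For reference, the quantities described in words above correspond, in the commonly used formulation of the gravitational search algorithm and its PSO hybrid, to the relations below. This is a hedged reconstruction rather than a copy of the numbered equations of this paper: the equation numbers are not reproduced, and the exponential decay of $G$ with a constant $\alpha$, the small constant $\varepsilon$ and the symbol $t_{\max}$ for the maximum number of iterations are assumptions taken from that common formulation. The index $i$ denotes an agent, $d$ a dimension and $N$ the number of agents.

```latex
\begin{align*}
F_{ij}^{d}(t) &= G(t)\,\frac{M_{pi}(t)\,M_{aj}(t)}{R_{ij}(t)+\varepsilon}\,
                 \bigl(x_j^{d}(t)-x_i^{d}(t)\bigr) \\
G(t) &= G_0 \exp\!\left(-\alpha\,\frac{t}{t_{\max}}\right) \\
R_{ij}(t) &= \lVert X_i(t) - X_j(t) \rVert_2 \\
\mathit{best}(t) &= \min_{j} \mathit{fit}_j(t), \qquad
\mathit{worst}(t) = \max_{j} \mathit{fit}_j(t) \\
m_i(t) &= \frac{\mathit{fit}_i(t)-\mathit{worst}(t)}{\mathit{best}(t)-\mathit{worst}(t)}, \qquad
M_i(t) = \frac{m_i(t)}{\sum_{j=1}^{N} m_j(t)} \\
F_i^{d}(t) &= \sum_{j=1,\; j\neq i}^{N} \mathit{rand}_j\,F_{ij}^{d}(t) \\
ac_i^{d}(t) &= \frac{F_i^{d}(t)}{M_{ii}(t)} \\
V_i(t+1) &= w\,V_i(t) + c_1 r_1\,ac_i(t) + c_2 r_2\,\bigl(\mathit{gbest} - X_i(t)\bigr) \\
X_i(t+1) &= X_i(t) + V_i(t+1)
\end{align*}
```

The last two relations are where the hybrid departs from plain GS: the gravitational acceleration takes the place of the personal-best term of PSO, while the attraction towards $gbest$ supplies the social component.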

Algorithm 2 GSPSO-ANN for DDoS attack detection
procedure GSPSO-ANN-Attack detection
The dataset (attack and normal records) is given as input (training and testing).
The input dataset is normalized.
The ANN parameters are initialized with the input data matrix and the weights (V, W and biases).
The following are initialized: the training parameters (gravitational constant and force), the masses of the agents, the inertia weights, the acceleration coefficients and the initial positions of the agents. The quantities that are minimized during training are initialized with large values.
In the GSPSO algorithm, the number of ANN weights is calculated first, and the initial population is generated with agents whose size equals the number of weights calculated in this initial step. Once the first population is generated, the initial parameters are defined randomly for each agent, such as the initial positions, velocities, masses, the force F0 and the initial weights. The fitness of each agent is calculated in the next step. The algorithm continues until it reaches the maximum number of iterations or attains the error threshold. In each iteration, the position, velocity and mass of each agent are updated to generate the new set of agents (solutions), and the fitness of each agent is calculated again for the next iteration. In the next step the global best solution and the best solution of each agent are updated. The formulas for updating the agent positions, velocities and masses, and the other formulas required for GSPSO-ANN, are explained above. The GSPSO method is adopted because it combines the local search ability of GS with the social movement ability of PSO.
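The training procedure described above can be sketched compactly as follows. This is an illustrative sketch under stated assumptions, not the implementation evaluated in this paper: the function names, the parameter defaults (`G0`, `alpha`, `w`, `c1`, `c2`, the number of agents), the synthetic data and the exponential decay of the gravitational constant are all hypothetical choices based on the commonly used GS and PSO hybrid. Each agent encodes all weights and biases of a K:N:1 network, and its fitness is the training MSE.

```python
import numpy as np

def sigmoid(z):
    # Clipping avoids overflow for extreme weight values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60.0, 60.0)))

def mse_fitness(agent, X, y, K, N):
    """Decode one agent into K:N:1 weights and return the training MSE."""
    V = agent[:K * N].reshape(K, N)           # input -> hidden weights
    b_h = agent[K * N:K * N + N]              # hidden biases
    W = agent[K * N + N:K * N + 2 * N]        # hidden -> output weights
    b_o = agent[-1]                           # output bias
    out = sigmoid(sigmoid(X @ V + b_h) @ W + b_o)
    return float(np.mean((y - out) ** 2))

def train_gspso(X, y, N=4, agents=20, max_iter=100, tol=1e-3,
                G0=1.0, alpha=20.0, w=0.7, c1=0.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    K = X.shape[1]
    dim = K * N + 2 * N + 1                   # one dimension per weight or bias
    pos = rng.uniform(-1.0, 1.0, (agents, dim))
    vel = np.zeros((agents, dim))
    gbest, gbest_fit, history = pos[0].copy(), np.inf, []
    for t in range(max_iter):
        fit = np.array([mse_fitness(p, X, y, K, N) for p in pos])
        if fit.min() < gbest_fit:             # update the global best solution
            gbest_fit, gbest = float(fit.min()), pos[fit.argmin()].copy()
        history.append(gbest_fit)
        if gbest_fit < tol:                   # error threshold reached
            break
        best, worst = fit.min(), fit.max()
        m = np.ones(agents) if best == worst else (fit - worst) / (best - worst)
        M = m / m.sum()                       # normalized masses (best agent is heaviest)
        G = G0 * np.exp(-alpha * t / max_iter)
        diff = pos[None, :, :] - pos[:, None, :]   # diff[i, j] = pos[j] - pos[i]
        dist = np.linalg.norm(diff, axis=2)        # Euclidean distances R_ij
        weight = rng.random((agents, agents)) * G * M[None, :] / (dist + 1e-12)
        np.fill_diagonal(weight, 0.0)              # an agent exerts no force on itself
        acc = np.einsum("ij,ijd->id", weight, diff)  # force divided by own mass
        vel = (w * vel + c1 * rng.random((agents, dim)) * acc
               + c2 * rng.random((agents, dim)) * (gbest - pos))
        pos = pos + vel
    return gbest, history

# Synthetic two-class data (hypothetical), 3 features.
data_rng = np.random.default_rng(1)
X = data_rng.random((40, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)
weights, history = train_gspso(X, y)
print(len(weights), history[0], history[-1])
```

Because the global best is only replaced by a strictly better agent, the recorded best error never increases from one iteration to the next.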

NSL-KDD dataset:
NSL-KDD is a commonly used dataset for validating intrusion detection systems, and each record in this dataset is represented with 41 features. The dataset covers 24 types of attacks, which are classified into four classes: Denial of Service (DoS) attacks, Probe attacks, User to Root (U2R) attacks and Remote to Local (R2L) attacks. A DoS attack blocks resources so that they are not available to legitimate users; a Probe attack collects confidential information in order to gain higher level privileges; a U2R attack tries to obtain root access with fake credentials in order to exploit the resources; and an R2L attack tries to obtain local system information in order to access the local system and its resources. The classification of features in NSL-KDD is defined in table 2.
The parameter settings for the ANN and the training algorithms play a vital role in the performance evaluation of the proposed method. The list of parameters used for evaluating the proposed method is given in Table 5.
These parameters are chosen based on the various machine learning applications of GS and PSO reported earlier.

Evaluation metrics:
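The Detection Accuracy: It measures how correctly the incoming traffic is classified, that is, the fraction of instances in the given request set that are correctly detected as attack traffic or normal traffic. The calculation of detection accuracy is given as follows.
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
Here $TP$ and $TN$ denote the numbers of correctly detected attack and normal instances, and $FP$ and $FN$ denote the numbers of normal instances flagged as attacks and of attack instances missed, respectively.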
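The metrics reported in the results (detection accuracy, precision, recall, F-measure and false alarm rate) can be computed from the confusion counts as sketched below. The standard textbook definitions are assumed, since the formulas themselves are not preserved in this text; the function name and the example counts are hypothetical and are not results of this paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion counts (attack = positive class)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # Share of normal traffic that is wrongly flagged as attack.
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "false_alarm_rate": false_alarm_rate,
    }

# Made-up counts used only to exercise the function.
metrics = detection_metrics(tp=90, tn=80, fp=20, fn=10)
print(metrics)
```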

Performance of the Whale Optimization Algorithm (WOA) in feature selection
In the proposed approach, feature optimization is implemented with the Whale Optimization Algorithm (WOA), which selects the necessary features and excludes the unnecessary or unimportant ones to improve the performance of the detection process. The proposed method is evaluated with the NSL-KDD and CSE-CIC-IDS2018 datasets, but these datasets contain records at request level. The NSL-KDD dataset contains records with 41 features, whereas the CSE-CIC-IDS2018 dataset contains records with 83 features. The features of the NSL-KDD and CSE-CIC-IDS2018 datasets do not address the diversity of the features or the diversified characteristics of the traffic, which play a vital role when incoming traffic from distributed environments is evaluated as normal or attack. Hence, flow based features are proposed to overcome this limitation and address the diversified characteristics of the traffic. The advantage of the flow based features is that they are independent of the detection process and improve the performance. The values of the flow features are extracted from the NSL-KDD and CSE-CIC-IDS2018 datasets, and the performance of the detection system is evaluated with the optimized parameters of the flow feature set, which is given in table 6. This section explores the performance of WOA in selecting the important features at request level from the NSL-KDD and CSE-CIC-IDS2018 datasets. The processing time required to evaluate the traffic with WOA for the two datasets and for the flow features is given in figure 4.

Performance of the proposed method with different algorithms
The simulation process is repeated 10 times for all the techniques, and the average results are used for comparison. Statistical metrics such as the mean, standard deviation (std), maximum and minimum are calculated from the ten simulations. The MSE and the training time are considered as the performance parameters, and their detailed evaluation for the NSL-KDD and CSE-CIC-IDS2018 training datasets is given in table 7 and table 8. However, the MSE is still minimized to 0.012 with the proposed approach, and this is further improved with flow level features with WOA for NSL-KDD, to 0.09. Hence, the argument is that the proposed method, GSPSO-ANN with WOA, provides better results with flow level features than with packet or request level features.
The training time of the various methods is evaluated and compared with that of the proposed GSPSO-ANN with WOA method, using the NSL-KDD and CSE-CIC-IDS2018 datasets with the original packet or request level features with WOA and with the flow level features with WOA. The proposed method consumed less training time than the other models. For example, the training time of the proposed model is 81.45 seconds with the optimized NSL-KDD request level features and 82.63 seconds with the optimized CSE-CIC-IDS2018 request level features, whereas with the optimized flow features the corresponding times are 68.3 and 70.123 seconds respectively. This shows that the training time of the proposed method is lower for flow features than for the request level features of the given datasets.
The detection time of the various methods, evaluated and compared with that of the proposed GSPSO-ANN with WOA method using the NSL-KDD and CSE-CIC-IDS2018 datasets with the original packet or request level features with WOA and with the flow level features with WOA, is given in Table 11 and Table 12. The proposed method consumed less detection time than the other models. For example, the detection time of the proposed model is 0.19 seconds with the optimized NSL-KDD request level features and 0.23 seconds with the optimized CSE-CIC-IDS2018 request level features, whereas with the optimized flow features the corresponding times are 0.06 and 0.09 seconds respectively. This shows that the detection time of the proposed method is lower for flow features than for the request level features of the given datasets.
The performance of the proposed work is evaluated with metrics such as detection accuracy, precision, recall and F-measure, which are shown in table 13. The experiments reveal that the proposed method with flow features attains the maximum precision, recall, F-measure and accuracy when compared with the packet level features of the NSL-KDD and CSE-CIC-IDS2018 datasets. The metrics are evaluated separately for the packet level features of the datasets with WOA and for the flow level features of the datasets with WOA. The corresponding detection accuracy is given in figure 5 and figure 6.
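The repeated-run statistics mentioned at the start of this subsection (mean, standard deviation, maximum and minimum over the simulations) can be computed as in the short sketch below. The use of the sample standard deviation is an assumption, since the text does not say which variant was used, and the input values are placeholders rather than measurements from this paper.

```python
from statistics import mean, stdev

def summarize_runs(values):
    """Summary statistics over repeated simulation runs."""
    return {
        "mean": mean(values),
        "std": stdev(values),   # sample standard deviation (assumed)
        "max": max(values),
        "min": min(values),
    }

# Placeholder values only; these are not results from the experiments.
example = summarize_runs([1.0, 2.0, 3.0, 4.0])
print(example)
```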

Comparative analysis of the proposed method with existing works:
The detection rate of all the contemporary models is given in table 14.
The comparison of the precision, recall and F-measure of the proposed method with those of the contemporary methods on the NSL-KDD and CSE-CIC-IDS2018 datasets is given in table 15, figure 7 and figure 8. The feature selection process is employed in the proposed approach with WOA, which reduces the training and testing time as shown in the above tables. The proposed method exhibited better values for precision, recall and F-measure than the contemporary methods.
The proposed method exhibits a low false alarm rate when compared with the contemporary methods. The false alarm rate of the proposed method and its comparison with the existing methodologies are shown in figure 9.

Performance of the proposed method with diversified flows or diversified characteristics
Although the performance of GSPSO-ANN is satisfactory, it fails to maintain that performance for diversified flows from distributed environments, or for distributed characteristics of the features used to represent the transaction or the request. One of the key arguments is that existing methodologies such as DL, GS-ANN, GD-ANN, GA-ANN and GD-PSO perform better on homogeneous data and neglect the diversity of the data. However, most machine learning, deep learning and ANN algorithms rely on the features selected to train the system. The diversity of the data plays a prominent role when the system is trained with the selected features; for example, an attack request can be designed with multiple values for the selected features. The existing methods neglected this diversity and reported their results as efficient. In this paper the diversity of the traffic or data is addressed with flow level features rather than request level features, because request level or transaction level features are always process dependent.
The flow features are independent of the methodology and handle the diversity of the requests efficiently. The source diversity ratio feature evaluates the diversity of the incoming traffic. This section explores the performance of the proposed GSPSO-ANN with WOA for diversified traffic and compares it with the existing methodologies. The comparison of detection accuracy for diversified traffic is given in table 16 and table 17. The performance of the proposed system for various diversity values is given in figure 10 and figure 11.

The process of Evaluation-based algorithms
Comparison of Processing Times of various Data sets and flow features
Figure 5 Detection accuracy of the proposed model with NSL-KDD dataset using packet level and Flow level features with WOA.
Figure 6 Detection accuracy of the proposed model with CSE-CIC-IDS2018 dataset using packet level and Flow level features with WOA.
Comparison of false alarm rates with Contemporary methods.
Figure 10 Comparison of detection accuracy with other methods for diversified NSL-KDD dataset
Figure 11 Comparison of detection accuracy with other methods for diversified CSE-CIC-IDS2018 dataset