Enhancing Intrusion Detection in Wireless Sensor Networks through Deep Hybrid Network Empowered by SC-Attention Mechanism

WSNs are often deployed in unattended or hostile environments, making them vulnerable to various types of attacks. Ensuring the security of WSNs is crucial, especially if the data being monitored is sensitive or critical. An intrusion detection system (IDS) can help detect unauthorized access or malicious activities within the network. In the field of network intrusion detection systems (NIDS), traditional approaches face limitations in effectively detecting evolving threats and unknown attack patterns. To overcome these challenges, this research proposes a novel approach called the Deep Hybrid Network with spatial and channel attention (DHN-SCA) that integrates deep learning techniques with attention mechanisms. The DHN combines convolutional neural networks (CNNs) with a Local Attention Module to enhance the accuracy and efficiency of intrusion detection. The Local Attention Module consists of two sub-modules: Spatial Attention and Channel Attention. Spatial Attention applies average pooling to the feature tensor, while Channel Attention incorporates global average pooling and global max pooling followed by fully connected layers. These sub-modules refine the feature tensor through element-wise multiplication operations with the original features. Experiments and evaluations are conducted on benchmark datasets to assess the performance of the DHN. Evaluation metrics such as accuracy, precision, recall, and F1 score are employed to compare the DHN's effectiveness with existing intrusion detection approaches.


1 Introduction
A Wireless Sensor Network (WSN) refers to a group of resource-constrained sensing devices that are either homogeneous or heterogeneous. Their purpose is to sense physical phenomena in the surrounding environment and transmit the collected data to a central node, often called the sink node or base station, using different communication methods. WSNs have attracted significant attention from researchers due to their ability to provide efficient results in remote and unattended locations [1]. These networks find wide-ranging applications in real-time scenarios, including border surveillance, industrial monitoring, commercial applications, healthcare monitoring, environmental applications, as well as the monitoring of national and international highways. The Internet of Things (IoT) encompasses a diverse array of interconnected smart devices that gather, process, refine, and exchange valuable data via the Internet. Each of these devices is assigned a unique IP address or device identity, enabling them to autonomously transmit and receive data across a network, eliminating the need for human intervention. This technology facilitates seamless communication and data sharing between interconnected objects while leveraging the power of the Internet [2].
WSN and IoT are both catalysts for societal transformation, with the potential to turn the world into a smart planet. While they have distinct applications, IoT networks draw heavily from the concepts of WSNs. However, it is important to note that these terms can sometimes be confused, as they share both similarities and differences. One similarity is that both networks often involve resource-constrained sensing devices, which possess limited processing power, memory, and transmission capabilities. Additionally, both networks are highly effective in real-time applications, such as border area surveillance, where continuous monitoring is essential. In situations where human intervention is impossible, a multitude of sensors can be deployed strategically [3][4]. This requires robust and energy-efficient routing protocols that can rapidly reconfigure the network. However, due to the inherent complexities of WSNs and IoT, they are susceptible to various attacks. These include Denial of Service (DoS), sinkhole, blackhole, greyhole, wormhole, selective forwarding, Sybil, and hello flood attacks, among others.
In the realm of cybersecurity, researchers have recognized the criticality of developing robust and efficient network intrusion detection systems (IDS) to ensure the security of networks. The primary objective of intrusion detection systems is to safeguard networked computers by thwarting unauthorized access and protecting the integrity, confidentiality, and availability of transmitted data. A key aspect of IDS is their ability to detect both known and unknown attacks and threats with utmost precision, while minimizing false alarm rates. These systems play a vital role in safeguarding information and communication systems within a network, contributing to a secure and resilient environment [5].
Artificial Intelligence (AI) has revolutionized the field of intrusion detection systems by enabling computers and machines to learn autonomously from datasets, reducing the need for extensive human intervention. Within AI, both machine learning (ML) and deep learning (DL) are sub-fields that have played significant roles in the creation and advancement of efficient intrusion detection systems. ML-based systems rely on manually extracted features for classifying and detecting network traffic, whereas DL systems utilize neural networks to automatically extract features from datasets and perform classification and detection. By leveraging deep learning techniques, the accuracy of detection models can be significantly improved compared to traditional machine learning approaches [6][7]. In the pursuit of effective intrusion detection systems, researchers have explored various approaches and learning techniques, leading to the development of numerous models. Researchers have developed various deep learning architectures for network intrusion detection, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and their combinations. These models can analyze temporal and spatial dependencies in network traffic, capture long-term dependencies, and make accurate predictions about the presence of intrusions [8]. However, existing models often suffer from poor precision, low detection rates, and high false alarm rates. Despite these challenges, network intrusion detection through deep learning holds great promise in enhancing the security of computer networks, as it enables the development of intelligent systems capable of detecting sophisticated attacks and adapting to evolving threats. Therefore, ongoing research focuses on refining and enhancing these models to address these limitations and create more robust and accurate intrusion detection systems. By harnessing the power of AI, researchers aim to improve the overall effectiveness and reliability of intrusion detection systems, thereby bolstering network security.

Motivation and contribution
The research motivation revolves around the need for more advanced techniques in network intrusion detection systems (NIDS) to combat evolving cyber threats. Traditional rule-based and signature-based approaches are insufficient in detecting unknown attack patterns. Deep learning, known for its ability to learn complex patterns, offers potential for improving NIDS capabilities. However, challenges such as limited labeled training data, interpretability, scalability, and resource requirements must be addressed. The research aims to address these challenges and contribute to NIDS through deep learning by exploring novel architectures, feature representations, and evaluation methodologies. The findings aim to enhance network security and assist organizations in protecting critical assets from emerging threats.
The research contributions of this study on the Deep Hybrid Network (DHN) for network intrusion detection systems (NIDS) lie in several key areas:

Novel Hybrid Architecture:
The proposed DHN introduces a hybrid architecture that combines deep learning techniques, specifically convolutional neural networks (CNNs), with attention mechanisms. This integration allows for enhanced detection capabilities by leveraging the strengths of both approaches.

Local Attention Module:
The DHN incorporates a Local Attention Module between the CNN layers, which introduces channel and spatial attention mechanisms. This attention mechanism helps the model focus on important features within each layer, improving the model's ability to capture relevant information for intrusion detection.

Improved Intrusion Detection Accuracy:
By leveraging deep learning and attention mechanisms, the DHN aims to improve the accuracy of intrusion detection. The model's ability to automatically learn complex patterns and extract high-level features from network data enhances its effectiveness in identifying both known and unknown network intrusions.

This research work is organized as follows. The first section presents the background of WSNs and IoT and their security vulnerabilities, and then highlights the adoption of recent deep learning techniques for intrusion detection along with the problem definition. The second section reviews the existing mechanisms along with their shortcomings. The third section presents the mathematical model of the proposed work along with the algorithm, and the fourth section evaluates the proposed model.

2 Related Work
The rapid growth and wide-ranging use of wireless local area networks [9][10], including Ad Hoc networks and wireless sensor networks, have posed a challenge for traditional intrusion detection systems (IDS) designed for wired networks. Consequently, there is an immediate need for intrusion detection mechanisms that are compatible with wireless sensor networks. Anomaly-based IDSs, which view any departure from normal behavior as a potential intrusion, are becoming increasingly important. Several effective anomaly detection techniques for wireless sensor networks have been proposed, encompassing methods such as artificial immune algorithms, clustering algorithms, machine learning algorithms, and statistical learning models.

A comparative study [11] employed various classifiers, namely Logistic Regression (LR), Multinomial Naive Bayes (MultinomialNB), Gaussian Naive Bayes (GNB), K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), Multilayer Perceptron (MLP), and Gradient Boosting (GB), on the UNSW-NB15 dataset. The metrics used for assessing the binary and multiclass cases included accuracy, precision, and F1-score. The results indicated that the Random Forest classifier outperformed the other models, achieving 87% accuracy, 98% precision, and an 84% F1-score. Another study [12] compared supervised learning models using the NSL-KDD and CICIDS2017 datasets. On the NSL-KDD dataset, the RF and KNN algorithms showed superior performance, with accuracy, recall, F1-score, and precision around 76%, 96%, 77%, and 65%, respectively. On the CICIDS2017 dataset, RF achieved up to 93% accuracy, with recall, F1-score, and precision reaching up to 84%.

A comprehensive review [13] of IDS research with AI used supervised and unsupervised ML algorithms, such as Artificial Neural Network (ANN), DT, KNN, Naive Bayes, RF, Support Vector Machine (SVM), Convolutional Neural Network (CNN), K-Means, Expectation-Maximization (EM), and Self Organizing Map (SOM). The authors utilized the highly imbalanced multiclass CICIDS2017 dataset and found that the KNN, DT, and Naive Bayes models performed best for intrusion detection, with a 99% accuracy rate. An experiment [14] applied ML models such as Stochastic Gradient Descent (SGD), Ridge Classifier (RC), DT, RF, and Extra Trees (ET) to predict DoS, Probe, R2L, and U2R attacks using the NSL-KDD dataset. Following a feature selection process, the DT was found to be a good option for identifying new attacks, with the ET and RF models reaching an accuracy of 99.83% for detecting U2R and DoS attacks using multiclass classification. A proposal [15] used ML models to detect intrusion anomalies in IoT network traffic using the BoT-IoT dataset. DT, GNB, and RF algorithms were chosen, with the GNB algorithm showing efficacy in intrusion detection.
Another study [16] proposed an IDS based on a big data platform that could distinguish the types of network traffic flow produced by IoT devices. The study revealed that ML algorithms outperformed DL algorithms in terms of accuracy and model training time on the Apache Spark platform, using the BoT-IoT real-world network traffic dataset. A novel framework [17] utilized data from the publicly accessible CUPID dataset, annotated with human pentesting activity on the network. Employing supervised algorithms such as RF, KNN, and MLP, among others, this framework allowed for distinguishing between automated and human-initiated attacks at the feature level.
A unique approach [18] to detect injection attacks in IoT applications was proposed, using feature selection and machine learning techniques. The researchers used the public AWID dataset and applied constant deletion and recursive deletion as feature selection techniques. The machine learning algorithms used in their analysis included SVM, Random Forest, and Decision Tree, highlighting that appropriate feature selection can significantly improve a model's attack detection accuracy in IoT applications. An intelligent intrusion detection and defense mechanism [19] was designed that could effectively curtail DoS attacks while maintaining reasonable energy costs. A dedicated dataset for WSN, called WSN-DS, was created to categorize four types of DoS attacks: blackhole, grayhole, flooding, and scheduling attacks. The dataset was used to train an ANN for detecting various attack categories. Another study [20] applied the fundamental principles of artificial immune technology to introduce a danger theory technology to safeguard wireless sensor networks (WSN). This was accomplished by observing WSN parameters such as energy, data volume, and data transmission frequency and generating outputs based on their weights and concentrations. A robust anti-intrusion detection algorithm [21] was proposed for directional sensor networks based on the Most Exposed Path (MEP). The relationship between the ant colony path-planning algorithm and MEP was constructed using a dynamic threshold strategy. To accurately find the MEP, an exposure factor was incorporated into the state transition probability formula of the original ant colony algorithm. Finally, a proposal [22] for a lightweight dynamic autoencoder network (LDAN) method for Network Intrusion Detection (NID) was made. This method realized efficient feature extraction through a lightweight structure design, demonstrating the promise of new techniques in IDS development.

3 Proposed Methodology
To enhance intrusion detection capabilities in Wireless Sensor Networks (WSNs), researchers have explored the utilization of deep hybrid networks empowered by the SC-Attention mechanism. This innovative approach combines deep learning techniques with the SC-Attention mechanism to improve the accuracy and efficiency of intrusion detection systems in WSNs. Deep hybrid networks refer to the integration of multiple deep learning models, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), or their variants, within a unified framework. These models are designed to handle the specific characteristics of WSN data, such as limited resources, heterogeneous sensor nodes, and varying data patterns. The SC-Attention mechanism, short for Spatial and Channel Attention mechanism, is a component incorporated into the deep hybrid network architecture. It aims to selectively focus on relevant spatial and channel information in the sensor data, enhancing the discriminative power of the model. This attention mechanism helps to improve the detection accuracy by prioritizing the most informative features while suppressing noise and irrelevant information. By leveraging deep learning and the SC-Attention mechanism, the intrusion detection system in WSNs can automatically learn and extract meaningful representations from the sensor data. The deep hybrid network architecture enables the model to capture both spatial and temporal dependencies in the data, allowing for accurate detection of intrusion events. The proposed approach offers several advantages. Firstly, it improves the detection accuracy by effectively capturing and utilizing important features from the WSN data. Secondly, it enhances the efficiency of the intrusion detection system by reducing false alarms and effectively utilizing computational resources. Lastly, the SC-Attention mechanism provides interpretability, enabling researchers to gain insights into the decision-making process of the model.

DHN-SCA (Deep Hybrid Network Empowered by SC-Attention Mechanism)
The data is first fetched and preprocessed. During cleaning, each sample is checked for invalid entries, and any sample containing them is discarded, provided the sample size remains sufficient. Because the data is imbalanced, a subset of samples is selected for the subsequent experiments. After the importance of each feature has been analysed, the essential features are arranged in an image frame, as shown in Figure 1. Each convolutional layer is followed by a local attention module, which applies the attention mechanism to the feature tensor output by that layer; the attention parameters are learned from the dataset during training. As shown in Figure 3, the module consists of two sub-modules, the spatial attention module and the channel attention module. They process the feature tensor separately to produce intermediate tensors that capture its essential features, and these are combined with the features by element-wise multiplication to obtain the refined feature tensor.
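As a rough illustration of arranging selected features into an image frame, the sketch below zero-pads a feature vector to the next square size and fills the frame row by row. The function name `to_image_frame`, the zero padding, and the row-major layout are assumptions made for illustration; the text does not specify how the frame is filled.

```python
import math
from typing import List


def to_image_frame(features: List[float]) -> List[List[float]]:
    """Arrange a 1-D feature vector into a square 2-D frame (row-major).

    Assumption: the frame is the smallest square that holds all features,
    and unused cells are filled with zeros.
    """
    if not features:
        return []
    side = math.isqrt(len(features) - 1) + 1  # smallest side with side*side >= n
    padded = list(features) + [0.0] * (side * side - len(features))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Example: a 41-element vector is placed into a 7 x 7 frame.
frame = to_image_frame([float(i) for i in range(41)])
print(len(frame), len(frame[0]))
```

A square frame is chosen here only because it is a convenient input shape for 2-D convolutions; any fixed layout that is applied identically to every record would serve the same purpose.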

Figure 2 DHN-SCA Architecture
In the channel attention module, global average pooling and global max pooling are applied separately to the feature tensor. Each pooled result then passes through a fully connected layer, and the two outputs are combined by element-wise summation to form an intermediate tensor. Finally, the original feature tensor is multiplied by this result to obtain the feature tensor refined by the channel attention mechanism. In the spatial attention module, new features are extracted from the feature tensor by the adjacent convolutional layer, and the resulting tensor is multiplied element-wise with the original features to obtain the feature tensor refined by the spatial attention mechanism.
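The channel attention steps described above can be sketched in plain Python on a small tensor indexed as [channel][row][col]. The helper names, the use of one fully connected layer shared by both pooled vectors, and the sigmoid that turns the summed result into per-channel weights are assumptions; the text only states the pooling, the fully connected layer, the element-wise summation, and the final multiplication.

```python
import math
from typing import List

Tensor3D = List[List[List[float]]]  # indexed as [channel][row][col]


def dense(vec: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Fully connected layer: out[j] = sum_i vec[i] * weights[i][j] + bias[j]."""
    return [
        sum(v * weights[i][j] for i, v in enumerate(vec)) + bias[j]
        for j in range(len(bias))
    ]


def channel_attention(x: Tensor3D, weights: List[List[float]], bias: List[float]) -> Tensor3D:
    # Global average pooling and global max pooling: one value per channel.
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(map(max, ch)) for ch in x]
    # Both pooled vectors pass through the fully connected layer (assumed shared),
    # and the two outputs are combined by element-wise summation.
    summed = [a + m for a, m in zip(dense(avg, weights, bias), dense(mx, weights, bias))]
    # Assumption: a sigmoid maps the sum to per-channel weights in (0, 1).
    scale = [1.0 / (1.0 + math.exp(-s)) for s in summed]
    # Element-wise multiplication with the original feature tensor.
    return [[[v * scale[c] for v in row] for row in ch] for c, ch in enumerate(x)]


# Example with two 2 x 2 channels and an identity fully connected layer.
features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
refined = channel_attention(features, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

The output has the same shape as the input; only the relative scale of each channel changes.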
The implementation logic of the spatial attention mechanism is expressed as an equation, and combining it with the preceding equations yields the multi-layer attention mechanism. In that formulation, one term denotes the variable-dimension output of the last convolutional layer, and the final expression determines the class with the highest probability.

In the proposed model, an input layer receives the input data, such as network traffic or packet information. The data then goes through convolutional layers, where learnable filters extract local features and patterns. Activation functions introduce non-linearity, and pooling layers downsample the feature maps. The attention mechanism computes attention weights for different spatial locations or channels, capturing their relevance. Weighted feature aggregation combines the feature maps based on the attention weights. Fully connected layers capture complex relationships, and an output layer produces the intrusion detection predictions. The model's performance is measured using a loss function, and backpropagation is used to update the model's parameters. The implementation details of the two sub-modules, channel attention and spatial attention, are shown in Figure 3.
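The spatial attention sub-module described in this section can be sketched in the same plain-Python style. Averaging across channels follows the description of spatial attention as average pooling over the feature tensor; the scalar weight `w` and bias `b` (standing in for a 1x1 convolution) and the sigmoid are assumptions made to keep the sketch small.

```python
import math
from typing import List

Tensor3D = List[List[List[float]]]  # indexed as [channel][row][col]


def spatial_attention(x: Tensor3D, w: float = 1.0, b: float = 0.0) -> Tensor3D:
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    # Average pooling across channels: one value per spatial location.
    pooled = [
        [sum(x[c][i][j] for c in range(channels)) / channels for j in range(width)]
        for i in range(height)
    ]
    # Assumption: a 1x1 convolution (scalar w and b) followed by a sigmoid
    # turns the pooled map into attention weights in (0, 1).
    attn = [[1.0 / (1.0 + math.exp(-(w * p + b))) for p in row] for row in pooled]
    # Element-wise multiplication, broadcast over the channel axis.
    return [
        [[x[c][i][j] * attn[i][j] for j in range(width)] for i in range(height)]
        for c in range(channels)
    ]


# Example: two channels of size 1 x 2.
refined_map = spatial_attention([[[2.0, 0.0]], [[-2.0, 0.0]]])
```

In this example the channel average is zero at both locations, so every attention weight is 0.5 and each value is halved.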

Mathematical formulation
The probabilistic model uses a categorical distribution for classification, denoted $p(r \mid s, k)$, and is trained on the data $D = \{(s_i, r_i)\}$. Here $r \in R$ is the target variable, $R$ denotes the set of classes, $s$ denotes the input features, and $k$ denotes the weight parameters. Under this probabilistic model, the likelihood of the data as a function of the parameters $k$ is:

$$p(D \mid k) = \prod_{i} p(r_i \mid s_i, k)$$

Neural networks use a cross-entropy cost function for classification, and minimizing it maximizes this likelihood. The Bayesian posterior is obtained by multiplying the likelihood with the prior belief $p(k)$:

$$p(k \mid D) = \frac{p(D \mid k)\, p(k)}{p(D)}$$

where $p(k \mid D)$ denotes the posterior distribution of the parameters $k$ given $D$. The probability of the data is:

$$p(D) = \int p(D \mid k)\, p(k)\, dk$$

The computation of the posterior in Eq. (5) is the core concept of the Bayesian network. Because the integral in the denominator is intractable to evaluate exactly, variational inference is used to approximate the posterior $p(k \mid D)$ over the parameters $k$. The difference between the true and the approximate posterior is measured through a distance metric, the Kullback-Leibler (KL) divergence:

$$\mathrm{KL}\big(q_\theta(k) \,\|\, p(k \mid D)\big) = \int q_\theta(k) \log \frac{q_\theta(k)}{p(k \mid D)}\, dk$$

Here $q_\theta(k)$ is the approximate posterior over the parameters $k$, obtained with a gap minimization technique that minimizes the distance between the true posterior and the approximate posterior distribution. The cost function for the Bayesian network that minimizes the divergence is:

$$\mathcal{L}(\theta) = \mathrm{KL}\big(q_\theta(k) \,\|\, p(k)\big) - \mathbb{E}_{q_\theta(k)}\big[\log p(D \mid k)\big]$$

Because the prior distribution $p(k)$ of the weight parameters is fixed given the input data, minimizing the divergence amounts to minimizing this cost, which balances closeness to the prior against the expected log-likelihood under $q_\theta(k)$. Table 1 shows the proposed algorithm.
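To make the divergence-based cost more concrete, the sketch below assumes, purely for illustration, that the approximate posterior over each weight is an independent Gaussian and that the prior is a standard normal; the text specifies neither. Under that assumption the KL term has a closed form, and the cost is the KL term minus an estimate of the expected log-likelihood, which the caller supplies.

```python
import math
from typing import Sequence


def kl_gaussian(mu_q: float, sigma_q: float,
                mu_p: float = 0.0, sigma_p: float = 1.0) -> float:
    """Closed-form KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) )."""
    return (
        math.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    )


def variational_cost(mus: Sequence[float], sigmas: Sequence[float],
                     expected_log_likelihood: float) -> float:
    """KL(q || prior) summed over weights, minus E_q[log-likelihood of the data]."""
    kl = sum(kl_gaussian(m, s) for m, s in zip(mus, sigmas))
    return kl - expected_log_likelihood


# Example: two weights; the log-likelihood estimate is a placeholder value.
cost = variational_cost([0.0, 1.0], [1.0, 1.0], expected_log_likelihood=-2.0)
```

In practice the expected log-likelihood is usually estimated by sampling weights from the approximate posterior and averaging the negative cross-entropy over a mini-batch.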

Table 1 Proposed Algorithm

Step 1 Data Pre-processing:
Collect and pre-process the input data, including network traffic data or packet information.
Perform data normalization on it.

Step 2 Feature Extraction and Convolutional Feature Engineering:
Apply convolutional layers to extract local features and patterns from the pre-processed data.
Use a convolutional feature engineering technique to enhance the discriminative power of the extracted features.

Step 3 Local Attention Module:
Employ the local attention module to compute attention weights for different spatial locations or channels, capturing their importance.
Perform weighted feature aggregation based on the attention weights to emphasize relevant features while suppressing irrelevant ones.

Step 4 Bayesian Deep Learning:
Apply a Bayesian deep learning technique to incorporate uncertainty modeling and robustness.
Use Bayesian inference to estimate the posterior distribution of the model parameters, allowing for more reliable predictions and handling of uncertain inputs.

Step 5 Fully Connected Layers and Output Prediction:
Pass the aggregated features through fully connected layers to capture complex relationships.
Employ appropriate activation functions for each layer to introduce non-linearity.
Use the output layer with the softmax activation function.

Step 6 Training and Optimization:
Define the loss function, such as cross-entropy, to measure the discrepancy between predicted and ground truth labels.
Employ backpropagation to compute gradients of the loss function with respect to the model parameters.
Use an optimization algorithm such as stochastic gradient descent (SGD) to update the model's weights and minimize the loss.

Step 7 Evaluation:
Evaluate the trained model's performance.
Perform any necessary fine-tuning or hyperparameter optimization to improve the model's performance.

Step 8 Real-time:
Continuously monitor the system, processing incoming data and making predictions using the trained model.
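The output-prediction and training steps of the algorithm (softmax output, cross-entropy loss, gradient computation, SGD update) can be illustrated on a single linear layer. This is a minimal sketch, not the DHN-SCA network: the linear classifier stands in for the full model, and the function names, learning rate, and toy input are assumptions.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: List[float], label: int) -> float:
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


def sgd_step(weights: List[List[float]], features: List[float],
             label: int, lr: float = 0.1) -> float:
    """One SGD update of a linear softmax classifier.

    weights[c][i] connects feature i to class c. The weights are updated
    in place and the loss measured before the update is returned.
    """
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(logits)
    loss = cross_entropy(probs, label)
    for c, row in enumerate(weights):
        # Gradient of the cross-entropy with respect to logit c.
        grad_logit = probs[c] - (1.0 if c == label else 0.0)
        for i, x in enumerate(features):
            row[i] -= lr * grad_logit * x
    return loss


# Toy run: two classes (for example normal and DoS), two input features.
weights = [[0.0, 0.0], [0.0, 0.0]]
losses = [sgd_step(weights, [1.0, 0.5], label=1) for _ in range(20)]
```

Repeating the update on the same sample lowers the loss at each step, which is the behaviour the training step relies on when applied over mini-batches of real traffic records.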

4 Performance Evaluation
This section evaluates the DHN-SCA model against various existing machine learning and deep learning models on several datasets. The proposed framework is implemented with a deep learning library, using Python as the programming language. Further details of the datasets and the simulation results are discussed in the following sections.

Dataset Details
The experiments in this study utilized three datasets: KDDcup99 [23], NSL-KDD [24], and UNSW-NB15 [25]. The KDDcup99 dataset is a well-known network intrusion detection dataset that includes various Denial of Service (DoS) attack data points, making it particularly suitable for conducting experiments on DoS attack detection. The UNSW-NB15 dataset, created by the Australian Centre for Cyber Security (ACCS) in 2015, offers a more comprehensive range of attacks found in contemporary networks, providing a more realistic reflection of the current network landscape.
Table 2 illustrates the composition of these datasets. The KDDcup99 and NSL-KDD datasets consist of 41 feature types, while the UNSW-NB15 dataset comprises 47 feature types. However, some of these features are redundant, repetitive, or lack significant relevance to the labels, necessitating dimensionality reduction. This reduction process aims to decrease the number of features and eliminate redundancy.
To facilitate experimentation, the KDDcup99, NSL-KDD, and UNSW-NB15 datasets are divided into training and testing datasets. The KDDcup99 and NSL-KDD datasets encompass four attack types, while the UNSW-NB15 dataset contains nine attack types. However, this study focuses exclusively on extracting and analyzing the DoS attack type. Table 2 presents the distribution of normal and DoS attacks across the training and testing datasets.
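Restricting a dataset to normal and DoS traffic can be sketched as a simple filter. The record layout (a dictionary with a `category` field) and the category strings are hypothetical; the actual column names and label values differ between KDDcup99, NSL-KDD, and UNSW-NB15.

```python
from collections import Counter
from typing import Dict, List, Tuple

Record = Dict[str, object]

# Hypothetical mapping from a coarse traffic category to a binary label.
LABEL_MAP = {"normal": 0, "dos": 1}


def select_normal_and_dos(records: List[Record]) -> List[Tuple[Record, int]]:
    """Keep only normal and DoS records, labelled 0 (normal) or 1 (DoS)."""
    selected = []
    for rec in records:
        category = str(rec["category"]).lower()
        if category in LABEL_MAP:
            selected.append((rec, LABEL_MAP[category]))
    return selected


# Toy records; other attack categories are dropped.
records = [
    {"category": "normal"},
    {"category": "DoS"},
    {"category": "Probe"},
    {"category": "R2L"},
    {"category": "DoS"},
]
selected = select_normal_and_dos(records)
distribution = Counter(label for _, label in selected)
```

The resulting counter gives the normal versus DoS distribution of the kind summarized for the training and testing splits.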

Comparative analysis
Upon analyzing the DHN-SCA mechanism's improvements over DCNN across the NSL-KDD, KDDcup99, and UNSW-NB15 datasets, distinct patterns emerge. The KDDcup99 dataset consistently exhibits the most notable enhancements, particularly in precision for the "Normal" class and F1-score for multi-classification. While the NSL-KDD dataset sees significant improvements in recall for the "DoS" class and F1-score for the "Normal" class, UNSW-NB15 mainly benefits in accuracy and recall for binary classification. Overall, DHN-SCA consistently outpaces DCNN, with the degree of advancement varying across datasets and metrics.
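For reference, the metrics used in this comparison (accuracy, precision, recall, and F1-score) can be computed from binary predictions as in the sketch below. The function name and the toy labels are illustrative only and are not results from the experiments; class 1 is taken to be the DoS class.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1-score with class 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, not experimental data.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

For the per-class figures discussed above, the same function can be applied with each class in turn treated as the positive class.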

Figure 1 Proposed Workflow

The data records are converted into an image frame for training and validation purposes. Figure 2 depicts the convolutional neural network, which consists of $n$ convolution layers denoted $C_i$, $i = 1, \dots, n$. Each convolutional layer is followed by a local attention module, which applies the attention mechanism to the output of that layer:

$$x_{i+1} = \mathrm{Att}_i\big(C_i(x_i)\big)$$

$$\text{Multi-Att}(x) = \bigoplus_{i=1}^{n} \mathrm{Att}_i\big(C_i(x_i)\big) \quad (4)$$

Here $x_i$ denotes the input to the $i$-th convolution operation, $C_i$ is the convolution operation at the $i$-th layer, $\mathrm{Att}_i$ is the local attention module attached to that layer, and $\bigoplus_{i=1}^{n}$ denotes that the objects in the range are combined together. The overall representation is given by the following equations:

$$F = C_n(x_n) \quad (5)$$

$$P(x) = \mathrm{softmax}\big(\mathrm{FC}(\text{Multi-Att}(x), F)\big)$$

where $F$ is the output of the last convolutional layer, $\mathrm{FC}$ denotes the fully connected layers, and $P(x)$ gives the class probabilities.

Table 8 Metrics comparison

Although WSNs offer many advantages in terms of data collection and monitoring, they are susceptible to a range of security threats. Implementing a NIDS in a WSN is essential to ensure its security and reliability, but it comes with its own set of challenges due to the unique characteristics of WSNs. The integration of deep hybrid networks with the SC-Attention mechanism presents a promising approach for enhancing intrusion detection in Wireless Sensor Networks. This research direction leverages the power of deep learning and attention mechanisms to improve the accuracy, efficiency, and interpretability of intrusion detection systems, thereby strengthening the security of WSNs. The research evaluated the performance of the DHN-SCA model against traditional and deep learning mechanisms for detecting Denial of Service (DoS) attacks using three datasets. Across all datasets, DHN-SCA consistently achieved superior accuracy, especially on the KDDcup99 and UNSW-NB15 datasets. When compared to the DCNN model, DHN-SCA demonstrated notable improvements in various metrics, with KDDcup99 showing the most significant enhancements. The results underscore the potential of DHN-SCA in network intrusion detection. Overall, the DHN-SCA model emerges as a promising tool in the domain of DoS attack detection.