Hybrid deep-learning model to detect botnet attacks over internet of things environments

In recent years, the use of the internet of things (IoT) has increased dramatically, and cybersecurity concerns have grown in tandem. Cybersecurity has become a major challenge for institutions and companies of all sizes, as threats grow in number and evolve rapidly. Artificial intelligence (AI) can help considerably in meeting this challenge, since it provides a powerful framework that allows organisations to stay one step ahead of sophisticated cyber threats. AI provides real-time feedback, which helps daily alerts to be investigated and analysed, effective decisions to be made and responses to be issued quickly. AI-based capabilities make attack detection and mitigation more accurate, support intelligence gathering and analysis, and enable proactive countermeasures to be taken against attacks. In this study, we propose a robust system designed specifically to detect botnet attacks on IoT devices. The system combines a convolutional neural network with a long short-term memory network (CNN-LSTM) to detect two common and serious IoT attacks (BASHLITE and Mirai) on four types of security camera. The data sets, which contained normal and malicious network packets, were collected from real-time lab-connected camera devices in IoT environments. The experimental results showed that the proposed system performed well according to the evaluation metrics. For detecting botnet attacks on the Provision PT-737E camera, the system gave the following weighted average results: precision 88%, recall 87% and F1 score 83%. For classifying botnet attacks and normal packets on the Provision PT-838 camera, the results were 94% for precision, 89% for recall and 85% for F1 score. The intelligent security system based on this deep learning model successfully detected botnet attacks that infected camera devices connected to IoT applications.


Introduction
Due to an increase in cybercrime, researchers have been focusing on identifying intrusions in networks (Vasilomanolakis et al. 2015). Previously, traditional computer networks and personal computers were the focus of cyberattacks, but communication infrastructure now encompasses the internet of things (IoT) (Assis et al. 2020), the internet of connected vehicles (Alkahtani and Aldhyani 2020), the internet of medical things (Manimurugan et al. 2020) and 5G (Perez et al. 2017), and these have become the targets of many cyberattacks. Older strategies, such as firewalls and antivirus software, are unable to offer solutions to complex cyberattacks (Chung and Wahid 2012). Machine learning and deep learning algorithms are now used to detect intrusion in networks (Aburomman and Reaz 2016; Bijalwan 2020; Alothman et al. 2020). Algorithms can be applied individually or combined in groups. Grouping is a machine learning technique in which several algorithms are combined to solve intrusion detection and prediction problems.
At present, the big data produced by computers and IoT devices strains network traffic and increases the amount of malware that threatens data integrity; these issues cannot be dealt with by current strategies (Mahmood and Afzal 2013). A bot is a networked computer infected with malware that allows attackers to control it. A botnet is a network made up of many such hosts, each running a bot program, connected to the internet and operated by a malicious group (Hoque et al. 2015). Botnets pose a threat to network security because they are used in cybercrime techniques such as distributed denial of service (DDoS). Machine learning algorithms are used to track such attacks in IoT (McDermott et al. 2018; Koroniotis et al. 2019; Yerima et al. 2021). A botnet can be used to direct a distributed denial-of-service attack at any system on the internet so that it cannot properly serve its legitimate customers. Currently, DDoS attacks are performed from botnet platforms; despite their simplicity, they are very effective due to the combined bandwidth of the bots. An intruder can gain illegal access to user data through a large number of attacks. Networks are exposed to a wide variety of attacks, including probing, denial of service, user-to-root and port scanning. To execute these attacks, protocols such as the internet control message protocol (ICMP), user datagram protocol (UDP), transmission control protocol (TCP) and file transfer protocol (FTP) may be used. Network-based intrusion detection systems (NIDS) are among the best means to scan networks and identify attacks (Marir et al. 2018; Azeez et al. 2019). Many machine learning algorithms have been used to detect cyberattacks, but they cope poorly with big data traffic and lack the required optimisation.
Researchers are now able to develop systems to detect and verify botnet data (Tuan et al. 2020). However, high accuracy is not guaranteed for complex data traffic, as the techniques for detecting botnet attacks are constantly changing. Therefore, there is a need for neural-network-based classification techniques that can determine the number of layers and neurons when detecting attacks (Kebande and Venter 2014). When training the network and learning weights, hyperparameters must be set. Signature-based and anomaly-based detection are two of the most important methods of intrusion detection. The former matches traffic against known attack patterns, while anomaly-based detection is used for both known and unknown attack patterns (Ullah and Mahmoud 2020). A NIDS identifies traffic by extracting the most important features from traffic records so that machine learning algorithms can classify them as malicious or normal (Dong and Wang 2016). Network-based and host-based systems are two means to classify network traffic records. In a NIDS, all traffic logs are monitored for intrusions such as DoS (Folorunso et al. 2016). When vast databases are used for anomaly detection, high detection accuracy and a low alert rate can be obtained. Anomaly-based systems are developed in training and testing phases; patterns are identified in the training phase and compared during the testing phase. Signature and heuristic methods cannot detect new and unknown malware variants adequately, so machine learning algorithms are used to solve this problem.
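The difference between the two detection methods can be illustrated with a minimal sketch. It is a toy example under stated assumptions, not the system proposed in this study: the signature names, the packet-rate feature, the sample values and the threshold of three standard deviations are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical signature database: fingerprints of already known attacks.
KNOWN_SIGNATURES = {"mirai-scan-v1", "bashlite-junk-v1"}

def signature_detect(fingerprint):
    """Flag traffic only if its fingerprint matches a known attack pattern."""
    return fingerprint in KNOWN_SIGNATURES

def fit_baseline(normal_rates):
    """Learn the mean and standard deviation of a feature from normal traffic."""
    return mean(normal_rates), stdev(normal_rates)

def anomaly_detect(rate, baseline, threshold=3.0):
    """Flag traffic whose feature deviates strongly from the normal baseline."""
    mu, sigma = baseline
    return abs(rate - mu) / sigma > threshold

# Made-up packets-per-second values observed during normal operation.
baseline = fit_baseline([100, 110, 95, 105, 90])

print(signature_detect("mirai-scan-v1"))  # known pattern is matched
print(signature_detect("new-variant"))    # unknown pattern is missed
print(anomaly_detect(900, baseline))      # unusual rate is flagged
print(anomaly_detect(102, baseline))      # typical rate passes
```

The signature check cannot react to a variant that is absent from its database, whereas the anomaly check flags any strong deviation from the learned baseline, at the cost of possible false alerts.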
Machine learning and deep learning techniques detect attacks without requiring advanced security knowledge (Deng 2014; Berman et al. 2019). The efficiency of intrusion detection systems (IDSs) can be improved through nature-inspired, meta-heuristic data mining, reinforcement learning, grammar-based machine learning and artificial intelligence (Yu et al. 2021; Ahmed et al. 2020; Alauthman et al. 2020). IDS performance can also be improved through the artificial bee colony (ABC) (Mazini et al. 2019), grey wolf optimisation (Al Shorman et al. 2020) and artificial fish swarm (AFS) algorithms (Lin et al. 2014). Rajagopal et al. proposed a group model using a metaclassification method based on stacked generalisation and used two data sets, UGR'16 and UNSW-NB15, which were collected from real and emulated network traffic. The proposed system achieved an accuracy of 97%, as well as 94% with emulated data sets (Rajagopal et al. 2020). Elmasry et al. proposed particle swarm optimisation (PSO) for selecting hyperparameters and the most important features in a single process. Classification was done by three algorithms, namely deep belief networks, deep neural networks and long short-term memory recurrent neural networks (LSTM-RNN) (Elmasry et al. 2020). Tripathi et al. proposed a grasshopper optimisation method for distinguishing between normal and malicious traffic in the CIC-IDS 2017 and KDD Cup 99 data sets. The machine learning algorithms they used determined the type of attack (Dwivedi et al. 2020). Suhaimi et al. proposed a genetic algorithm combined with an immune algorithm to improve attack detection accuracy, and they performed a number of simulations to check the performance of the method. The immune genetic algorithm performed well for predicting network penetration (Suhaimi et al. 2020). Zhou et al. proposed the M-AdaBoost-A algorithm for predicting network intrusion, and they used methods such as PSO to group several classifiers according to M-AdaBoost-A. Their method achieved good performance in detecting intrusions in both wireless and traditional networks (Zhou et al. 2020). Wu et al. presented the SRDLM method, based on deep learning and semantic re-encoding, which took advantage of deep learning methods to enhance classification capabilities. It achieved 99% accuracy in detecting a web injection network attack (Wu et al. 2020). Hajisalem et al. presented a hybrid classification method that combined the AFS and ABC algorithms, and they also used a feature selection algorithm to show the correlation of each feature with others. They applied fuzzy c-means clustering to remove unnecessary features, determine the necessary features and, finally, evaluate the proposed system on the UNSW-NB15 and NSL-KDD data sets (Soe et al. 2020).
The main contribution of this research is an intelligent security system that uses advanced deep learning algorithms to detect and classify one of the most serious intrusions threatening IoT platforms. This study presents the convolutional neural network and long short-term memory (CNN-LSTM) model to detect and classify BASHLITE and Mirai attacks on security camera devices connected to IoT applications. It investigates whether the proposed system achieves superior accuracy compared to existing systems.

Methodology
This section presents an intrusion detection system that identifies and classifies botnet attacks on IoT architecture. The system offers an alternative, AI-based method to secure IoT devices against botnet attacks. A hybrid CNN-LSTM model is proposed to identify botnet attacks on security cameras. The system was examined using real network traffic extracted from four different security cameras that were connected to IoT devices while under attack from the Mirai and BASHLITE botnets. The main objective was to develop a system for building a database that could help to detect zero-day attacks by matching traffic against the botnet attack patterns available in the database. A generic architecture of the proposed system is shown in Fig. 1. The main mechanisms of our security system are described in the following sections.

Network botnet IoT data set
The network botnet IoT (N-BaIoT) network traffic data sets were collected from a machine learning repository and contain 115 features extracted from switch ports connected to IoT environments. The data sets were gathered from four security camera devices, which are shown in Table 1, and contain two botnet attacks, namely Mirai and BASHLITE. The data sets are available at https://archive.ics.uci.edu/ml/datasets/detection_of_IoT_botnet_attacks_N_BaIoT. The laboratory setup used to extract botnet attack traffic from four IoT-connected security cameras, which used Wi-Fi through access point devices, is presented in Fig. 2. Port mirroring was configured on the switch ports to gather and analyse network packets. The extracted data were recorded using the Wireshark tool. BASHLITE and Mirai IoT attacks were injected to extract sub-attacks from the security cameras. These attacks are summarised in Table 2. Various environments, including C programming and Linux systems, were used to collect the intrusion data sets.

Deep learning algorithms
A convolutional neural network (CNN) is a deep learning technique that is well known for capturing specific features in images. The CNN model has two important parts: convolutional and pooling layers. A convolutional layer applies a mathematical operation that passes filters over an input matrix to extract local features from the input data, while the task of pooling layers is to reduce the number of features in the tensor by decreasing its size. This helps the CNN to achieve lower time complexity and computation cost. Figure 3 displays a graphical representation of the convolutional and max-pooling layers in the CNN technique.
In the convolution operation, the feature maps of the preceding layer are convolved with the convolutional kernels (filters), which yields the output feature map j of the convolutional layer. Each output feature map j may combine convolutions over several input feature maps. The convolution layer's formulas are as follows:

\[ y_j^{l} = \sum_{i \in c_i} t_i^{l-1} * w_{ij}^{(l)} + b_j^{(l)} \]

\[ t_j^{l} = f\left(y_j^{l}\right) \]

where $c_i$ represents the set of input feature maps, $b_j^{(l)}$ indicates the bias, $y_j^{l}$ is the output of the convolution and $w_{ij}^{(l)}$ is the convolution kernel; $t_j^{l}$ is the feature map of convolution layer $l$; and $f$ is the activation function, which is a rectified linear unit (ReLU).
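The convolution, activation and pooling steps can be traced on a small one-dimensional input. The following is a minimal sketch in plain Python rather than the implementation used in this study; the feature values, the kernel and the pooling size of 2 are illustrative assumptions. As in most deep learning libraries, the kernel is slid over the input without being flipped.

```python
def conv1d(signal, kernel, bias=0.0):
    """Slide the kernel over the signal (no padding, stride 1) and add the bias."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]

def relu(values):
    """Rectified linear unit: negative responses are set to zero."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size=2):
    """Keep the maximum of each non-overlapping window of the given size."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

features = [1.0, 3.0, 2.0, 0.0, 4.0, 1.0]  # made-up traffic feature values
kernel = [1.0, 0.0, -1.0]                  # hypothetical filter weights

y = conv1d(features, kernel)  # convolution output
t = relu(y)                   # feature map after the activation function
pooled = max_pool1d(t)        # reduced feature map

print(y)       # [-1.0, 3.0, -2.0, -1.0]
print(t)       # [0.0, 3.0, 0.0, 0.0]
print(pooled)  # [3.0, 0.0]
```

The six input values shrink to four after the convolution and to two after pooling, which is the size reduction that lowers the computation cost of the later layers.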

Long short-term memory
The LSTM algorithm is a deep learning model designed for a number of real-life applications in classification and prediction domains. Figure 4 outlines the architecture of the LSTM model, which is one type of recurrent neural network. The main contribution of LSTM is its ability to process sequence data in which the elements are connected, unlike feedforward neural networks (FFNN). The LSTM model differs from a common FFNN in that it has feedback connections and can process not only single data points but also entire sequences of data. It has three significant gates, namely the input, forget and output gates. The input gate is used to manage and store training data in long-term memory (LTM); the LTM state is updated from the current input training data, while short-term memory is carried over from the preceding time step. The input gate employs filters to extract significant information and discard non-useful information. The sigmoid function controls how much information passes from the input gate to the output gate, with values that range between 0 and 1: values near 1 indicate significant information, whereas values near 0 indicate unimportant information. The output data are stored in LTM. The forget gate is a significant gate in the LSTM model and is used to select useful information and remove unnecessary information to enhance the capability of the model. It does so by multiplying the values held in memory by the forget vector values. This information is passed from the forget gate to the output gate when the LSTM model is applied to real-life problems.
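In the standard LSTM formulation, the gates are computed as follows:

\[ f_t = \sigma\left(W_f [h_{t-1}, x_t] + B_f\right) \]

\[ i_t = \sigma\left(W_i [h_{t-1}, x_t] + B_i\right) \]

\[ \tilde{c}_t = \tanh\left(W_c [h_{t-1}, x_t] + B_c\right) \]

\[ c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \]

\[ O_t = \sigma\left(W_o [h_{t-1}, x_t] + B_o\right) \]

\[ h_t = O_t \odot \tanh(c_t) \]

where $i_t$ is the input gate applied to the training data, $W$ is the weight value, $B$ is the bias, $\sigma$ is the activation function, $f_t$ is the forget gate, $O_t$ is the output gate, $c_t$ is the cell state, $\tilde{c}_t$ is the candidate cell state, $\odot$ denotes element-wise multiplication, $x_t$ is the input information and $h_t$ is the output information.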
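A single LSTM time step can be written out directly from the standard gate definitions. This is a minimal scalar sketch, assuming one input feature and one hidden unit; the weight values and the input sequence are hypothetical and are not taken from the proposed system.

```python
import math

def sigmoid(z):
    """Logistic function, which squashes any value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    params maps each gate name to (w_x, w_h, b): hypothetical scalar weights
    for the current input, the previous output and the bias.
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x_t + w_h * h_prev + b

    f_t = sigmoid(pre_activation("forget"))         # how much old memory to keep
    i_t = sigmoid(pre_activation("input"))          # how much new content to store
    c_tilde = math.tanh(pre_activation("candidate"))  # candidate memory content
    o_t = sigmoid(pre_activation("output"))         # how much memory to expose
    c_t = f_t * c_prev + i_t * c_tilde              # updated long-term memory
    h_t = o_t * math.tanh(c_t)                      # new output (short-term memory)
    return h_t, c_t

# Same illustrative weights for every gate, chosen only to keep the example short.
params = {g: (0.5, 0.5, 0.0) for g in ("forget", "input", "candidate", "output")}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.5]:  # a short made-up input sequence
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

The cell state `c` is carried across all three steps, which is what allows the network to relate a packet to the traffic that preceded it.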

CNN-LSTM model
This model has two parts. The first is the CNN, which comprises an input layer that receives the sensor parameters as inputs, numerous hidden layers and an output layer that passes the extracted features to the LSTM. The hidden layers contain convolution layers with ReLU, the activation function located between the layers of the CNN-LSTM network. The CNN extracts the local features, and the second part, the LSTM, captures the temporal features. Through this structure, the neural network 'learns' a weight for each input that determines the output. The hybrid CNN-LSTM is presented in Fig. 5. The significant parameter values for developing the proposed system are presented in Table 3. The convolution kernel size was set to 5 and the number of epochs to 15. A graphical representation of the proposed system is presented in Fig. 6, while a snapshot of it is shown in Fig. 7.

Results
This section presents the empirical results of the intelligent system for detecting and classifying botnet attacks on security camera devices.

Environmental setup of the proposed system
The proposed system was implemented by using different environments. Table 4 lists the hardware and software requirements of the proposed system for protecting cameras from intrusion. These items of hardware and software were appropriate for developing our intelligent security system with advanced deep learning approaches.

Evaluation metrics
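In order to examine and evaluate the proposed system for detecting botnet attacks, the accuracy, precision, recall and F1 score metrics were used. The equations are defined as follows:

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \]

\[ \text{F1 score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]

where TP is true positive, FP is false positive, TN is true negative and FN is false negative.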
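As a worked example, the four metrics can be computed from the four counts with a few lines of Python. The counts below are made up for illustration and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 score from the four counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts: 80 attacks caught, 20 false alerts,
# 90 normal packets passed, 10 attacks missed.
m = classification_metrics(tp=80, fp=20, tn=90, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is 0.85, the precision is 0.80, the recall is about 0.889 and the F1 score is about 0.842, which shows how a system can miss few attacks while still raising a noticeable share of false alerts.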

Results and discussion
In order to evaluate and test the system under development, we used four data sets generated from different security camera devices and conducted four experiments on different IoT platforms. The CNN-LSTM model was implemented to detect botnet attacks by using a network data set extracted from an IoT setup. The data sets were divided into 70% training data and 30% testing data to validate the proposed system. Table 5 shows sample input data for the security cameras. This section presents the best results obtained from the deep learning model for detecting and classifying intrusions on security cameras. In this experiment, we used data sets that were extracted from the camera devices (Provision PT-737E, Provision PT-838, Simple Home XCS7-1.002-WHT, Simple Home XCS7-1.003-WHT and Samsung SNH1011N) to develop a security system for identifying botnet attack patterns. Table 6 shows the results of the hybrid CNN-LSTM model for detecting botnet attacks on security cameras.
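A 70/30 split of this kind can be sketched as follows. The shuffling step, the fixed seed and the toy records are assumptions made for illustration; the study does not state how the records were ordered or sampled.

```python
import random

def train_test_split(records, test_fraction=0.3, seed=42):
    """Shuffle the records reproducibly and hold out a fraction for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy records standing in for labelled traffic rows: (record id, label).
records = [(i, "attack" if i % 2 else "normal") for i in range(10)]

train, test = train_test_split(records)
print(len(train), len(test))  # 7 3
```

Shuffling before the split keeps the test portion from consisting only of the records captured last, which matters when attack traffic was recorded in blocks.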
The true positive, false negative, true negative and false positive counts were used to create confusion matrices for the proposed system and to determine its accuracy in detecting botnet attacks on security camera devices. Figure 8 illustrates the confusion matrices of the trained model for extracting patterns from real network traffic from the Simple Home XCS7-1.002-WHT and Simple Home XCS7-1.003-WHT security cameras. The confusion matrices show the high accuracy of the proposed system. Figure 8 shows the performance of the CNN-LSTM model for identifying botnet attacks on the Provision PT-737E device. The accuracy increased from 80 to 87% and the loss decreased from 0.2 to 0.7; therefore, the performance of the proposed system increased.
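For more than two classes, the confusion counts and the weighted averages reported for the cameras can be derived as in the following sketch. The labels and predictions are invented for illustration, and the weighting by class support is the usual convention, assumed here rather than confirmed by the study.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(y_true, y_pred))

def weighted_precision_recall(y_true, y_pred):
    """Average per-class precision and recall, weighted by class frequency."""
    counts = confusion_counts(y_true, y_pred)
    support = Counter(y_true)    # number of actual samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    total = len(y_true)
    precision = recall = 0.0
    for label, n in support.items():
        tp = counts[(label, label)]
        weight = n / total
        precision += weight * (tp / predicted[label] if predicted[label] else 0.0)
        recall += weight * (tp / n)
    return precision, recall

# Made-up labels for six packets.
y_true = ["normal", "normal", "mirai", "mirai", "bashlite", "bashlite"]
y_pred = ["normal", "mirai", "mirai", "mirai", "bashlite", "normal"]

p, r = weighted_precision_recall(y_true, y_pred)
print(confusion_counts(y_true, y_pred)[("normal", "mirai")])  # normal packets flagged as Mirai
print(round(p, 3), round(r, 3))
```

The weighted recall always equals the overall accuracy, so the two figures should agree whenever both are reported for the same test set.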
The performance of the CNN-LSTM model for detecting Mirai and BASHLITE attacks on the Provision PT-838 security camera is presented in Fig. 9. The accuracy increased from 82% to around 90%, and the validation loss decreased from 0.35 to 0.17.
The accuracy for the Simple Home XCS7-1.002-WHT device increased from 82 to 89%, which shows that the accuracy improved as the number of epochs increased. Figure 9 shows the accuracy and loss of the proposed system for detecting and classifying attacks. The cross-entropy training loss reduced from 0.35 to 0.7 for extracting unknown attacks from the Simple Home XCS7-1.002-WHT device (Fig. 10). Figure 11 shows the performance of the proposed system for detecting attacks on the Simple Home XCS7-1.003-WHT device. The accuracy increased from 78 to 88% over 15 epochs. The loss reduced from 0.40 to 0.20, and the performance of the proposed system increased with the addition of more epochs.

Discussion
A botnet relies on an automated software program that scans for susceptible network devices and, once it locates one, converts it into a bot. Botnet attacks are a common form of malware in which a connected network of compromised victim computers is controlled from a centralised computer by hackers, who can then easily launch cyberattacks across whole networks. Botnets are widely used to mount DDoS attacks and allow cybercriminals to access a system or network that is connected via Wi-Fi. Therefore, developing a deep learning algorithm such as the CNN-LSTM model to detect botnet attacks can protect many companies and enterprises against this type of attack. The deep learning model is used to build a database that contains a number of botnet attack patterns and is installed in the network systems of companies. The system can help to detect botnets by matching various unknown patterns against the signature database. In this study, the proposed system was tested with real data extracted from real IoT camera applications.
To validate the proposed CNN-LSTM system, the receiver operating characteristic (ROC) curve is presented in Fig. 12, where the y-axis represents the recall (true positive rate) for detecting each botnet attack and normal packets. The x-axis represents the false positive rate (1 - specificity) for detecting all botnet attacks on four security camera devices connected to an IoT system.
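The construction of a ROC curve can be sketched in a few lines. By convention the curve plots the true positive rate (recall) against the false positive rate, which equals 1 - specificity. The labels and scores below are invented for illustration, with 1 denoting an attack packet; they are not outputs of the proposed model.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve, computed with the trapezoidal rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )

labels = [1, 1, 0, 1, 0, 0]               # 1 = attack, 0 = normal (made up)
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]   # hypothetical model confidence

points = roc_points(labels, scores)
print(points[-1])  # (1.0, 1.0)
print(auc(points))
```

Lowering the threshold moves along the curve from (0, 0) to (1, 1); a curve that rises steeply near the left edge indicates that most attacks are caught before many normal packets are misclassified.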
The empirical results of the hybrid CNN-LSTM model for classifying botnet attacks show that the system can identify any packet that bears the stamp of a botnet by finding similarities between new network packets and the botnet attacks in signature databases. The simulation results show that our system achieved better performance than the existing system, as shown in Table 7. Overall, the graphical representations and confusion matrices confirm the effectiveness and efficiency of the proposed system for protecting IoT devices from botnet attacks. We compared the proposed results with those reported for the same data sets by Soe et al. (2020) and noted that the proposed model achieved superior accuracy.

Conclusion
Intelligent security systems with advanced AI for detecting serious attacks play an essential role in protecting IoT environments from botnet attacks. The CNN-LSTM algorithm was employed to detect and classify such intrusions. Early detection of a DDoS intrusion can help network administrators to disconnect all the victim IoT devices from Wi-Fi connections and thereby prevent the spread of the botnet attack across the network.
In this study, the proposed system was evaluated and tested on data from four security cameras (Provision PT-838, Simple Home XCS7-1.002-WHT and XCS7-1.003-WHT and Samsung SNH 1011N) subjected to two major IoT attacks (BASHLITE and Mirai). The BASHLITE attacks were divided into sub-attacks (Scan, COMBO, Junk, TCP flood and UDP flood), whereas the Mirai botnet attacks were categorised into Scan, ACK, Syn, Plain UDP and UDP flood. The CNN-LSTM was shown to effectively improve security on camera devices while they were connected to the IoT. The main objective of developing this system was to intelligently detect a serious threat to IoT platforms. The system can help to detect unknown botnet attacks that threaten IoT networking.

Declarations
Conflict of interest The authors declare that they have no conflict of interest.
Ethical standards This article does not contain any studies with human participants or animals performed by any of the authors.