A novel method to detect cyber-attacks in IoT/IIoT devices on the Modbus protocol using deep learning

The dominant intrusion detection models in Internet of Things (IoT) and Industrial Internet of Things (IIoT) cybersecurity use network-based datasets. Cyber-attacks against IoT/IIoT devices have grown to be a major threat in recent years, and the Modbus protocol is one of the most often targeted protocols. Due to the intricacy of the protocol and the quick evolution of cyber threats, detecting these attacks using conventional techniques can be difficult. This paper proposes an architecture that uses an embedding layer in a neural network (NN) to model the register values. The architecture is evaluated on intrusion detection, which includes binary classification of whether an intrusion occurred and multi-class classification of the different types of attacks, and it consistently outperforms the state-of-the-art methods. Three NNs are designed with and without the proposed architecture. The best accuracy results were obtained with a convolutional neural network, with an accuracy of 98.91% on the Modbus binary dataset; a fully connected neural network, with an accuracy of 98.06% in the multi-class classification of the Modbus dataset; and long short-term memory neural networks, with accuracies of 99.97%, 99.7%, and 80.20% in binary, multi-class, and multi-class sub-category classification, respectively. All experiments performed in this paper conclude that the proposed architecture performs consistently better than the control NN. This paper shows that a NN with an embedding function can effectively be used to model whether an attack occurred on a device and the class of attack that occurred. This network can be utilized in the future to lessen DoS attacks and other types of network attacks. The network will be able to protect itself against a lot of damage if attacks can be predicted either before they occur or at the moment they are launched.


Introduction
At this time, the entire world is advancing toward the digitalization of infrastructure and systems. The Internet is used by nearly every system to execute its functions. However, the World Wide Web alone cannot support the present technological advancement of newly invented and implemented innovations. IoT technology is now required in the form of software, hardware, detectors, and devices to provide higher performance and accuracy. The development of the Internet of Things (IoT) has made life simpler and more enjoyable for people, businesses, and governments. The intelligence of machines, electronic devices such as sensors, and technologies is increasing [1].

IoT and IIoT devices
IoT devices contain sensors, some form of processor, and an active wireless connection (whether to the internet or to a private network) [2,3]. They come in the form of lights, doors, watches, speakers, and other devices kept around the home. IoT devices have been specifically designed to make life at home easier [4]. The Industrial Internet of Things (IIoT) is a new subtype of IoT, comprising IoT devices that are used in industrial settings. It is defined as "A system comprising networked smart objects, cyber-physical assets, associated generic information technologies and optional cloud or edge computing platforms, which enable real-time, intelligent, and autonomous access, collection, analysis, communications, and exchange of process, product and service information, within the industrial environment, to optimize overall production value. This value may include; improving product or service delivery, boosting productivity, reducing labor costs, reducing energy consumption, and reducing the build-to-order cycle" [5].

IoT and IIoT as emerging technology
IoT and the Industrial Internet make collecting data and monitoring systems much easier. The IoT is an emerging technology considered a key enabler for next-generation smart cities, industries, security services, and economies [6,7]. The IIoT brought about a significant transformation in the production industry. The idea of the fourth industrial revolution and the IIoT might be closely linked with the goal of boosting the national economy; this theory is swiftly establishing fresh patterns in the creation of company concepts, industrial procedures, logistical support, and a variety of tactical efforts. Hundreds of IoT devices, clever protocols for interaction, and cutting-edge security measures are all present in IIoT systems. The connection of all of these technological devices to the worldwide internet gives business operations a lot of managerial possibilities. Utilizing resources effectively also improves an industry's production and quality.
IoT systems provide more effective, affordable, and quicker IT support. Figure 1 shows the overall design of an IoT system. The following elements must be present for the Internet of Things to function properly: (1) internet and system connection; (2) device interactivity; (3) research; (4) control of the system and its network of connections; (5) privacy; and (6) archiving of data. These tools employ IoT standards and protocols when they need to share information and transfer the data gathered [8].

IoT data sets and protection
The primary goals of the World Wide Web and the Internet of Things are to link node locations and cities with smart networks, platforms, and sensors so that they may communicate, share data, and be controlled. Everything is online, including electronic sensors, smartphone fitness programs, heating and cooling systems, solar panels, and home appliances. As a result of the exponential growth of IoT technology, it is more difficult to protect and preserve IoT data against intruders, hacker groups, unauthorized clients, and unsafe traffic [9].

Cloud-based services and IoT protection
Additionally, a more effective, trustworthy, and economical connection is achieved by using cloud-based services. Public data is encrypted when the cloud service provider (CSP) searches for and shares its contents with the customer using the ciphertext-policy attribute-based system for search terms and record transfer (CPAB-KSDS) approach. This strategy also has the benefit of not requiring a private key generator (PKG) in order to reconfigure the encryption key each time. The authenticity and impartiality attribute-based proxy reconfiguration (VF-ABPRE) technique was put into place to determine whether the information sent by the computer was accurate or contained harmful allegations. The researchers secured the information using VF-CP-ABPRE, which stands for verifiable and fair ciphertext-policy attribute-based proxy re-encryption [10].

Cyber-attacks on IoT devices
The downside is a higher risk of cyber-attacks due to the lack of security measures for IoT and IIoT devices. This exposes the devices to malicious attacks from inside and outside the enterprise networks [11]. Cyber-attacks on IoT/IIoT devices have become a significant concern in recent years, and the Modbus protocol is one of the most commonly targeted protocols. Detecting such attacks using traditional methods can be challenging due to the protocol's complexity and the rapidly evolving nature of cyber threats. However, a novel approach using deep learning has emerged as a promising solution. In this method, deep learning algorithms are trained to detect anomalous behavior in the Modbus protocol, enabling the identification of potential cyber-attacks. This approach has shown great potential in improving the security of IoT/IIoT devices, and its effectiveness continues to be explored and refined. In this article, we explore the concept of detecting cyber-attacks on IoT/IIoT devices using deep learning on the Modbus protocol, examining the benefits and challenges of this approach and discussing its potential impact on the security of IoT/IIoT devices.
One recent example of an IoT cybersecurity vulnerability is a security researcher who tracked an iPhone user (without their consent) using a custom-made AirTag clone. Apple's security measure to detect unwanted AirTags was easily circumvented. The custom-made AirTag also did not have any speakers, so the person who was being stalked did not hear any beeping sounds [12]. Along with the boom in the IoT space, there has also been a boom in the cybersecurity field concerning IoT devices. Kaspersky, the cybersecurity company, reported that in the first half of 2021, the number of breaches in IoT devices doubled to 1.51 billion from 639 million in 2020 [13]. Kaspersky also reported that about 64% of organizations globally use IoT solutions, but 43% do not protect them completely [14]. One of the main reasons for such a high number of successful attacks is that IoT devices are mostly resource-constrained, with limited computation power and storage capacity [15,16].

Ways to mitigate the attacks
The common ways in which computer systems battle cybersecurity concerns are firewalls, encryption, and intrusion detection systems (IDSs), but since IoT devices are fundamentally different from computers, these security measures cannot be directly ported onto IoT devices [17,18,19]. There has also been a boom in IoT devices, with about 23.14 billion connected devices in 2018 and a projected 75.44 billion connected devices in 2025 [20]. IDSs are software applications that detect intrusions, such as policy violations or malicious activities, in a network [21]. IDSs are used as a second line of defense to monitor the network that an IoT device is connected to. An IDS detects malicious activity that successfully evades security perimeters like a firewall [22]. An effective IDS is required to keep all the IoT/IIoT devices connected to the network in good health and free of attacks. Many datasets have been published to help create good intrusion detection systems. Datasets such as [23,24,25] are network-based datasets that contain packet-level information, flow-level information, or a combination of both to detect attacks on the IoT network. The main thing missing from these datasets is the actual readings from the IoT devices. The dataset from [26] is a relatively new dataset that contains sensor data logs. The aim of this study is to use the register values generated by the Modbus communication protocol to detect attacks on the IoT/IIoT network. This paper proposes using an embedding layer in a NN to model the register values, which are then used to detect attacks in the network. There are three basic NN models, namely, multi-layered perceptron (MLP), convolutional neural network (CNN), and recurrent neural network (RNN). All the networks are used to model a binary classification problem of whether an attack occurred and a multi-class classification problem of what type of attack occurred.
The main contributions of this work include: (1) applying an embedding function in a NN to improve its performance in detecting attacks on an IoT/IIoT network; and (2) comparing the performance of simple and lightweight models in the binary and multi-class classification of cyber-attacks on a network.

Literature review
This section introduces embeddings and the datasets, which form the background of the proposed architecture, and reviews related works.

Embeddings
Embeddings are low-dimensional spaces into which high-dimensional vectors can be projected. Embeddings are used in Natural Language Processing (NLP) to represent words in a fixed dimension. In NLP tasks, an embedding provides a dense, low-dimensional vector, with its main benefit being generalization power [27]. Embeddings have been reported to work better than other vectorization methods [28]. The embedding layer in popular deep learning frameworks is a simple lookup table that stores the embeddings of a fixed dictionary and size [29]. In simple terms, it maps an integer index to a row in a table of vectors. An ideal embedding layer in NLP will encode the word's meaning, and similar words will be closer in the vector space [30].
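The lookup-table view of an embedding layer can be illustrated with a minimal NumPy sketch. The dictionary size, embedding dimension, and the names `embedding_table` and `embed` are hypothetical and chosen only for illustration; in a deep learning framework the table would be a trainable weight matrix rather than a fixed random array.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 10   # hypothetical dictionary size
embed_dim = 4     # hypothetical embedding dimension

# The embedding layer is a (vocab_size, embed_dim) table of weights.
# Here it is random; in a framework it would be updated during training.
embedding_table = rng.normal(size=(vocab_size, embed_dim))

def embed(indices, table):
    """Return one row of the table for every integer index."""
    return table[np.asarray(indices)]

tokens = [3, 7, 3]
vectors = embed(tokens, embedding_table)
print(vectors.shape)  # (3, 4)
```

Identical indices always map to the same vector, so the first and third rows of `vectors` are equal.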

ToN IoT Dataset
One of the datasets used in this paper is obtained from the TON_IoT datasets, which were made publicly available by the University of New South Wales at the Australian Defence Force Academy. This group of datasets is described on their website as a new generation of Industry 4.0/IoT and IIoT datasets for evaluating the fidelity and efficiency of different cybersecurity applications based on Artificial Intelligence (AI) [31]. The datasets are labeled into two classes according to whether an attack occurred. The attack class is further divided into sub-classes of attack. There are nine types of cyber-attacks: scanning, denial-of-service (DoS), distributed denial-of-service (DDoS), ransomware, backdoor, data injection, Cross-site Scripting (XSS), password cracking, and Man-in-The-Middle (MITM) attacks.
In this section, we consider DoS and DDoS attacks. DDoS attacks are a type of cyber-attack in which multiple compromised computer systems, also known as botnets, flood a targeted website or server with traffic, overwhelming the system's capacity to handle legitimate requests. DoS attacks are similar to DDoS attacks but are carried out using a single computer or network connection. There have been numerous studies on DoS and DDoS attacks. For instance, "A Survey of DDoS Attack and Defense Mechanisms" by Jiaojiao Jiang and Liang Gui provides an overview of various DDoS attacks, including TCP SYN flood attacks, UDP flood attacks, and HTTP-based attacks [32]. It also discusses different defense mechanisms, such as IP blocking, rate limiting, and traffic filtering. According to Jiang, DDoS attacks are a serious threat to online businesses and organizations, as they can cause significant financial losses and damage to reputation. In addition, no single solution can eliminate the risk of DDoS attacks. Proactive measures (such as network hardening) and reactive measures (such as DDoS mitigation services) are typically both needed to defend against these attacks effectively. Machine learning algorithms can be used to mitigate DDoS attacks. The first step is to identify patterns in network traffic: machine learning can be used to learn patterns of normal network behavior and to detect any anomalies that could indicate an attack. The system can be trained to recognize the difference between normal and malicious traffic. Once an attack is detected, machine learning algorithms can be used to filter out malicious traffic and allow legitimate traffic to continue to flow.
If the traffic rate exceeds a certain threshold, the system can automatically slow down or block the traffic, preventing a DDoS attack from taking down the system. During an attack, the system can be reconfigured dynamically to switch to alternative routes, IP addresses, or other strategies to keep the attack from overwhelming the system.
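The threshold rule described above can be sketched as a sliding-window rate check. This is a minimal illustration under assumed parameters, not a mechanism taken from the cited survey: the class name `RateThreshold`, the limit of 3 requests, and the 1-second window are all hypothetical.

```python
from collections import deque

class RateThreshold:
    """Reject requests once the rate inside a sliding window hits a limit."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()  # arrival times of accepted requests

    def allow(self, now):
        # Forget accepted requests that have left the sliding window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_requests:
            return False  # over the threshold: block (or slow down) the request
        self.timestamps.append(now)
        return True

# Hypothetical traffic: four requests in quick succession, then one later.
limiter = RateThreshold(max_requests=3, window_seconds=1.0)
decisions = [limiter.allow(t) for t in (0.0, 0.1, 0.2, 0.3, 1.05)]
print(decisions)  # [True, True, True, False, True]
```

A fixed threshold of this kind is only a first line of defense; the learned detectors discussed in this paper aim to recognize attacks that stay below such limits.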
Even though integrating the Industry 4.0 concept offers many benefits, it also introduces new security risks that must be addressed to secure IoT production lines and their network components. One vulnerability is a lack of standardization: IoT devices and network components from different manufacturers may not follow the same security protocols, making it difficult to implement a consistent security framework across the production line. In addition, many IoT devices may use default or weak passwords that can be easily hacked, allowing attackers to access the network. Data transmitted between IoT devices and network components may also not be encrypted, making it vulnerable to interception and tampering. Another common problem is that IoT devices may not receive regular updates and patches, leaving them exposed to known security vulnerabilities. Lastly, IoT devices and network components may be sourced from different vendors, making it difficult to track the supply chain and ensure their security.
Out of all the data collected by the lab, this paper focuses only on the Modbus dataset of the IoT/IIoT dataset. The Modbus communication protocol is found in most smart manufacturing and industrial applications. The experimenters extracted the register type features from the Modbus service. The Modbus data that were extracted are described in Table 1.
The Modbus dataset contains six different classes: Backdoor, Injection, Normal, Password, Scanning, and XSS. A brief explanation of each of these cyber-attacks is described below.
• Backdoor refers to any method by which an unauthorized user is able to get around normal security measures and gain high level user access (which is also known as root access) on a computer system, network, or software application [33]. This attack can be used by attackers to steal any type of data (personal, financial) that is stored in the computer system or network. The attackers can also use this attack to further install additional malware, and it can also be used to hijack devices.
• Injection attacks are the type of attack where the attacker injects a code into a program or query, or the attacker injects malware onto a computer in order to execute remote commands that can read or modify a database. It can also be used to change data on a website [34].
• Password cracking attack is an attack where the attacker tries to crack the password of a computer system by either a brute-force attack, or dictionary attacks [35]. This attack can allow the attacker to bypass the authentication procedure and hence, compromise the IoT or IIoT device.
• Scanning Attacks are attacks where the attackers scan the devices to gather network information of the devices before launching sophisticated attacks. Some common scanning techniques include IP address scanning, port scanning, and version scanning [36].
• XSS attacks, also called Cross-Site Scripting attacks, are a type of injection, in which malicious scripts are injected into benign and trusted websites. XSS attacks occur when an attacker uses a web application to send malicious code, generally in the form of a browser side script, to a different end user. This attack is capable of compromising the data and authentication procedures between IIoT devices and a remote Web server.
An overview of the Modbus dataset is shown in Fig. 2. The Modbus dataset is imbalanced, with a ratio of 68:32 for no attack to attack. In real-world applications, the share of attack samples is expected to be even lower. The testbed (shown in Fig. 3) used to collect the dataset has a combination of physical and simulated IoT/IIoT services. The real devices are two smartphones and a smart TV with a dynamic IP address. The testbed also includes a physical ESP8266 weather sensor and 6 simulated IoT services that are developed using JavaScript in the Node-RED API and linked to the public MQTT broker in the cloud layer. To generate the dataset, various IoT/IIoT scenarios that are commonly found in smart homes, smart cities, and smart manufacturing were simulated in the testbed. In this paper, we focus mainly on the Modbus service, which simulates the functionality of the Modbus devices found in many industrial applications, as these devices communicate with each other using the Modbus protocol.

IoTID20 dataset
Another dataset used in this paper is the IoTID20 dataset. The dataset is the primary contribution of the paper [37]. The authors of the paper generated the data from [38]. The dataset is publicly available from [39]. The dataset contains a binary classification task and two multi-class classification tasks. It has 83 features with a total of 625,783 data points. Figures 4, 5, and 6 show an overview of the dataset. Figure 4 shows the main categories of the attacks used; a brief description of the attacks is given below.
• DoS This attack is a type of attack where the attacker tries to overwhelm a host connected to a network by flooding the target with requests. This then overloads the system and causes legitimate requests to get drowned out in the noise. The Syn Flooding subcategory is a part of the DoS attack.
• MITM These attacks install an attacker in between two parties who believe they are directly communicating with each other. ARP Spoofing is a subcategory of MITM that is used in the dataset.
This dataset has fewer "Normal" data points, in contrast to the ToN_IoT dataset. The testbed for the IoTID20 dataset (shown in Fig. 7) contains IoT devices and interconnecting structures. The testbed contains a Wi-Fi camera, an AI speaker, and a smartphone. All devices are connected to a Wi-Fi router. Other smart devices connected to the router include laptops and tablets. Only the AI speaker and the Wi-Fi camera are victims in the network, whereas the laptops, smartphones, and tablets are attackers in the network. To get the CSV files from the Pcap files, the CIC flowmeter application is used.
Table 2 compares the literature of previous work. The paper [40] contains one of the datasets that this paper works with. The paper's authors used various machine learning methods to model their dataset. They used, among others, a logistic regression (LR) model, a linear discriminant analysis algorithm, and a k-nearest neighbor algorithm, but the performance of their multi-class classification model is lacking. The highest accuracy, precision, recall, and f-score obtained were 0.77, 0.77, 0.77, and 0.75, respectively. The paper [41] contains the other dataset used in this paper. The authors modeled their data using various machine learning algorithms, including SVM, Gaussian NB, LDA, LR, DT, RF, and an ensemble model. The DT was the best performing algorithm for the binary, category, and sub-category classification.

Related works
Many papers are available on machine learning based intrusion detection systems. Take, for example, the paper that used the dataset simulated by [42], a simulated dataset of a gas pipeline used to move petroleum products to the market. The authors trained various machine learning models that worked almost perfectly in every situation. There is a fundamental difference between the type of data they used and the type of data used in this paper: their dataset contains many supplementary data, like pump states and solenoid states, which are not applicable to the dataset this paper uses. Another paper [43] also uses machine learning models to detect anomalous data. The dataset they used was presented in [44], where the authors simulated a controller network consisting of several Master Terminal Units (MTU) and several controllers. This paper is interesting because, when performing some preliminary dataset processing, the authors found a loophole: all the anomalous data had a lower packets-per-second rate of data transfer. This made all the algorithms obsolete, because a simple check of the data transfer rate would give a perfect performance. Nonetheless, they trained many different machine learning models, which performed well on the dataset. The authors of [45] simulated their own network traffic dataset and trained a linear NN with an accuracy of 99.9%. Another paper experiments with several machine learning algorithms that can be used in intrusion detection systems. The dataset the authors used is the Bot-IoT dataset [46]. The machine learning methods they experimented with were k-nearest neighbor, SVM, DT, naïve Bayes, RF, artificial neural network (ANN), and LR. They discovered through their experiments that the RF algorithm is the best performing on a non-weighted dataset. ANN performed the best when classifying a binary weighted dataset.
In multiclass classification, the k-nearest neighbor and ANN were the best performing for weighted and non-weighted datasets, respectively. The paper [47] proposes a deep neural network (DNN) for intrusion detection in the MQTT-based protocol. The paper also compares the performance of the DNN with traditional machine learning algorithms. The dataset used is the MQTT-IoT-IDS2020 dataset that features 3 abstraction-level features of MQTT-enabled IoT, including Packet-flow, Bi-flow, and Uni-flow features. The researchers concluded that their proposed network performed better than the traditional machine learning algorithms.
In summary, there are many machine learning based intrusion detection models. They are mostly trained on network data, and almost all data available on this is simulated in a laboratory. There is a lack of actual deep learning models that perform on par or better than traditional machine learning models.

Table 2 Comparison of the literature of previous work
Document | Term | Type | Literature
[18] | Cybersecurity | Technical report | The testbed is designed to be flexible and customizable, allowing researchers and educators to modify and extend the system as needed for their specific purposes. The article highlights the potential of the testbed to facilitate research and experimentation in the area of SCADA cybersecurity, as well as its use in training and educating students in this field.
[25] | Cyber security | Academic journal article | The article provides insights into the effectiveness of different machine learning algorithms for anomaly detection in industrial systems and highlights the challenges and limitations of using machine learning for cybersecurity in industrial control systems. It proposes a new approach for intrusion detection in industrial networks using deep learning and provides a new dataset for testing intrusion detection models in industrial environments.
[28] | (IoT) | Academic journal article | Comparison of the performance of machine learning algorithms with traditional signature-based intrusion detection systems. Identification of the most effective features for attack classification in IoT networks.
[29] | Computer science and engineering | Academic journal article | The development of effective IDS solutions for IoT systems, which are vulnerable to various security threats due to their limited resources and heterogeneous nature.
 |  | Technical report | Provide insights into the development of data-driven intrusion detection systems using this new dataset. The dataset would be useful for researchers and practitioners working in the field of cybersecurity.
[30] | Computer science | Technical report | The contributions of this dataset include providing a standardized benchmark for evaluating intrusion detection systems in the context of IoT and IIoT, enabling researchers and practitioners to develop more effective and accurate intrusion detection systems for these types of devices.
[31] | Computer science | Primary literature | Highlighting the potential benefits and challenges of using machine learning for anomaly detection in SCADA systems. Identifying areas for improvement and further investigation.

The proposed architecture
This section presents the basic architecture of the deep learning networks used in this paper. A NN passes an input through many hidden layers to give an output. There are three broad categories of NN: the fully connected network (FC), the CNN, and the RNN.
The proposed architecture places an embedding function before each NN. This is shown in Fig. 8.
Integrating an embedding function expands the dimensions of an input feature. With more features, the NN will be able to generalize better. The embedding function used in the architecture is a simple lookup table. The lookup table is parametrized with weights, so it is trainable. This simple architecture is already a part of many deep learning frameworks, such as Keras and PyTorch. In this work, the size of the embedding vector is 512. Hence, each value that is processed by the embedding layer is represented by a vector of 512 values. The linear network has 6 fully connected layers. The CNN has 3 convolutional layers and a fully connected layer. Each convolutional layer is followed by a dropout layer and a max-pool layer. The LSTM network has 2 LSTM layers and 2 fully connected layers. The networks with and without the embedding function are otherwise the same, the only difference being the addition of the embedding layer.
The proposed framework for IIoT cyber-attack detection also addresses the problems that previous techniques have with enormous amounts of data, long-term temporal dependence, and data imbalance. The suggested approach combines deep learning based identification with learning-based prediction methods to handle cyber-attack detection in IIoT [48]. The design of the suggested technique is illustrated in Fig. 9, which shows how several procedures were combined to create the general structure of the proposed model. In the suggested framework, balanced data points are created from the original unbalanced datasets, categorized, and then fed into a combined deep learning framework made up of an LSTM auto-encoder (AE), which learns the trends of the gathered data, and a DT architecture, which processes the outputs in order to distinguish anomalous outcomes from normal ones [49].
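The embedding stage described in this section can be sketched as a forward pass in NumPy. This is an illustrative sketch, not the authors' implementation. Several choices are assumptions: a single table shared by all four register features, a table size of 65536 (Modbus registers hold 16-bit values), a ReLU activation, and a two-layer fully connected head instead of the six layers used in the paper. The embedding size is reduced from 512 to 8 to keep the example light.

```python
import numpy as np

rng = np.random.default_rng(0)

num_features = 4      # the four Modbus register features
vocab_size = 65536    # assumption: register values are 16-bit integers
embed_dim = 8         # the paper uses 512; reduced here for illustration
hidden_dim = 16       # hypothetical hidden width
num_classes = 2       # binary classification: attack or no attack

# Lookup table and dense weights (random here, trainable in practice).
embedding = rng.normal(scale=0.1, size=(vocab_size, embed_dim)).astype(np.float32)
w1 = rng.normal(scale=0.1, size=(num_features * embed_dim, hidden_dim)).astype(np.float32)
b1 = np.zeros(hidden_dim, dtype=np.float32)
w2 = rng.normal(scale=0.1, size=(hidden_dim, num_classes)).astype(np.float32)
b2 = np.zeros(num_classes, dtype=np.float32)

def forward(register_values):
    """Embed each register value, flatten, and apply a small dense network."""
    embedded = embedding[register_values]            # (batch, 4, embed_dim)
    flat = embedded.reshape(len(register_values), -1)  # (batch, 4 * embed_dim)
    hidden = np.maximum(flat @ w1 + b1, 0.0)         # ReLU (assumed activation)
    logits = hidden @ w2 + b2
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

# Two made-up samples of four raw register readings each.
batch = np.array([[120, 0, 65535, 1], [7, 42, 300, 0]])
log_probs = forward(batch)
print(log_probs.shape)  # (2, 2)
```

Because the raw integers are used only as row indices, their magnitude never enters the arithmetic directly, which is consistent with the networks that use an embedding function not requiring input normalization.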

Identification of deep anomalies
The basic structure of deep learning based models includes the deep learning method, which works with sequential and time-varying information to learn structures and features. The input, output, and forget gates make up the inner structure of the LSTM cell, which is shown in Fig. 10. Each gate is in charge of a particular part of the information processing [50].
LSTM cells within an encoder-decoder architecture offer the advantages of the two designs for sequential or time-series data. The recently developed LSTM auto-encoder is used in the proposed model because it has various advantages over conventional auto-encoders, including the ability to take sequences (time-series data) as input, whereas a standard AE cannot take sequential examples as input [51].

Estimate of deep anomalies
RNNs and the LSTM structure are two effective methods for estimating and forecasting time-series information in the ML area. Because LSTM units have three gates to operate (an input gate, an output gate, and a forget gate), they can learn when a value has to be discarded from, kept in, or read out of the cell's memory. As a result, the LSTM framework offers, in contrast with traditional models, an efficient method to change the contents of individual memory cells as well as their internal dynamics [52].
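The role of the three gates can be made concrete with one step of the standard LSTM cell, sketched below in NumPy. The parameter layout (gates stacked in the order input, forget, output, candidate) and the dimensions are assumptions made for this illustration; the paper does not specify them.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One time step of a standard LSTM cell.

    W, U, and b hold the stacked parameters of the input gate (i),
    forget gate (f), output gate (o), and candidate values (g).
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b        # shape (4 * n,)
    i = sigmoid(z[0:n])               # how much new information to write
    f = sigmoid(z[n:2 * n])           # how much old memory to keep
    o = sigmoid(z[2 * n:3 * n])       # how much memory to expose
    g = np.tanh(z[3 * n:4 * n])       # candidate memory content
    c = f * c_prev + i * g            # updated cell memory
    h = o * np.tanh(c)                # updated hidden state
    return h, c

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 5          # hypothetical sizes
W = rng.normal(scale=0.1, size=(4 * hidden_dim, input_dim))
U = rng.normal(scale=0.1, size=(4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)

# Run the cell over a made-up sequence of six time steps.
h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for x in rng.normal(size=(6, input_dim)):
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (5,)
```

The forget gate scales the previous memory, the input gate scales what is added to it, and the output gate scales what is read from it, which corresponds to the discard, keep, and read-out behavior described above.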

Experimental setup
To evaluate the performance of the proposed architecture, a series of experiments have been conducted on IoT intrusion detection. This section discusses the different NNs used for the experiments and the preprocessing steps used on the datasets. For the IoTID20 dataset, each network is trained three times: once for binary classification, once for multi-class category classification, and once for multi-class sub-category classification. A manual search for the hyperparameters was carried out on all the networks, and the parameters that work well on all networks were chosen. All networks are trained for 10 epochs with a batch size of 64. The learning rate is set to 1 × 10^-4. The Adam optimizer is used. The loss function used for binary classification is the binary cross-entropy loss, and the loss function used for multi-class classification is the negative log-likelihood loss. The NNs are trained on an Nvidia Tesla P100.
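For reference, the two loss functions named above can be written out in a few lines of NumPy. This is a sketch of the standard definitions, with mean reduction assumed; the function names and the example probabilities are made up for illustration.

```python
import numpy as np

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy for predicted probabilities in (0, 1)."""
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(targets * np.log(probs)
                          + (1 - targets) * np.log(1 - probs)))

def nll_loss(log_probs, targets):
    """Mean negative log-likelihood, given per-class log-probabilities."""
    rows = np.arange(len(targets))
    return float(-np.mean(log_probs[rows, targets]))

# Binary task: predicted attack probabilities and true labels (made up).
bce = binary_cross_entropy(np.array([0.9, 0.2]), np.array([1, 0]))

# Multi-class task: rows are samples, columns are classes (made up).
log_probs = np.log(np.array([[0.7, 0.2, 0.1],
                             [0.1, 0.8, 0.1]]))
nll = nll_loss(log_probs, np.array([0, 1]))
print(bce, nll)
```

The negative log-likelihood loss expects log-probabilities, so a network trained with it typically ends in a log-softmax layer.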

Datasets preprocessing
As mentioned in the previous section, two datasets are used in this paper. The first is the Modbus dataset from the ToN_IoT datasets, and the second is the IoTID20 dataset.

Modbus dataset
The Modbus dataset has 4 features: read input register, read discrete value, read holding register, and read coil. The time stamp, date, and time features have been removed from the dataset. The dataset has already been filtered and cleaned, and it does not need further cleaning because there are no missing values or errors. Batch normalization is performed at each iteration for the NNs that do not have an embedding function. There is no need for normalization for the NNs with an embedding function. The data classes are imbalanced, with about 68.5% of the dataset belonging to one class. Since the dataset is already small, there is no attempt to balance the classes. The dataset is shuffled and split into "train" and "test" datasets: 80% of the dataset is used to train the model, and 20% is used to test the trained model.
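The shuffle-and-split step can be sketched as follows. The helper name `shuffle_split`, the fixed seed, and the toy arrays are assumptions for illustration; the paper does not state how the shuffle was seeded or implemented.

```python
import numpy as np

def shuffle_split(features, labels, train_fraction=0.8, seed=0):
    """Shuffle the rows, then cut them into train and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    cut = int(train_fraction * len(features))
    train_idx, test_idx = order[:cut], order[cut:]
    return (features[train_idx], labels[train_idx],
            features[test_idx], labels[test_idx])

# Toy data: 10 rows with 4 register features each, imbalanced labels.
features = np.arange(40).reshape(10, 4)
labels = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

x_train, y_train, x_test, y_test = shuffle_split(features, labels)
print(len(x_train), len(x_test))  # 8 2
```

Because the split is purely random, the class ratio in the two parts can differ from the ratio in the full dataset, which matters more for small, imbalanced datasets.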

IoTID20 dataset
Out of the total 83 features, only the following 17 features are kept: source port, destination port, flow duration, total forward packets, total backward packets, forward packet length mean, backward packet length mean, flow IAT mean, forward IAT mean, backward IAT mean, packet length mean, packet size average, forward segment size average, backward segment size average, sub-flow forward packets, active mean, and idle mean. Batch normalization is performed at each iteration for the NN that do not have an embedding function, but it is not performed if there is an embedding function. The dataset is shuffled and divided into "train" and "test" datasets in a ratio of 4:1.
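The feature selection and the shuffled 4:1 split described above can be sketched as follows. This is a minimal plain-Python illustration, not the preprocessing code used in the experiments. The column names in KEPT_FEATURES are hypothetical shorthand for the 17 features listed in the text (the actual IoTID20 header names may differ), and the seed value is arbitrary.

```python
import random

# Hypothetical column names for the 17 retained features.
KEPT_FEATURES = [
    "src_port", "dst_port", "flow_duration", "tot_fwd_pkts", "tot_bwd_pkts",
    "fwd_pkt_len_mean", "bwd_pkt_len_mean", "flow_iat_mean", "fwd_iat_mean",
    "bwd_iat_mean", "pkt_len_mean", "pkt_size_avg", "fwd_seg_size_avg",
    "bwd_seg_size_avg", "subflow_fwd_pkts", "active_mean", "idle_mean",
]


def select_features(row, kept=KEPT_FEATURES):
    """Keep only the retained feature columns of one record (a dict)."""
    return {name: row[name] for name in kept}


def shuffle_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and split it into train and test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Toy records for illustration only: every kept column plus one dropped column.
records = []
for i in range(10):
    row = {name: i for name in KEPT_FEATURES}
    row["dropped_column"] = -1
    records.append(row)

reduced = [select_features(r) for r in records]
train_rows, test_rows = shuffle_split(reduced)
```

The same shuffle_split function with train_fraction=0.8 also corresponds to the 80%/20% split used for the Modbus dataset.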

Binary classification
The results of binary classification on the Modbus dataset are shown in Table 3.
The confusion matrices for the binary classification are shown in Figs. 11, 12, and 13.
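For readers who wish to reproduce this kind of evaluation, the following minimal sketch shows how a confusion matrix and the accuracy derived from it can be computed. The labels are toy values chosen for illustration and are not results from this paper; the function names are assumptions.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy labels for illustration only: 0 = normal, 1 = attack.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
acc = accuracy(cm)
```

The same functions apply unchanged to the multi-class tasks by setting num_classes to the number of attack categories.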

Multi-class classification
The results of multi-class classification on the Modbus dataset are shown in Table 4.
The confusion matrices for the multi-class classification are shown in Figs. 14, 15, and 16.

Binary classification
The results of binary classification on the IoTID20 dataset are shown in Table 5.
The confusion matrices of binary classifications are shown in Figs. 17, 18, and 19.

Multi-class category classification
The results of multi-class category classification on the IoTID20 dataset are shown in Table 6. The confusion matrices for the multi-class category classification are shown in Figs. 20, 21, and 22.

Multi-class sub-category classification
The results of multi-class sub-category classification on the IoTID20 dataset are shown in Table 7.
The confusion matrices for multi-class sub-category classification are shown in Figs. 23, 24, and 25.

Comparison with state-of-the-art previous results
The previous state-of-the-art results were recreated and are presented in Table 8. The table shows this paper's best-performing model and the state-of-the-art results.
There are several state-of-the-art models developed to address similar problems in the area of detecting cyberattacks in IoT/IIoT devices. For instance, SVM is a popular machine learning algorithm that has been widely used for intrusion detection in various domains, including IoT/IIoT. The SVM model is based on the principle of finding the optimal hyperplane that separates the normal and attack traffic in the feature space. Compared to the Modbus-based model, SVM requires a smaller dataset for training and can achieve high accuracy for binary classification tasks. However, it may not perform well for complex multi-class classification problems and requires manual feature engineering.
SVM is an implementable alternative; however, the Modbus-based model was chosen for several reasons. First, SVM in its basic form is a linear classifier, which means it can only find a linear decision boundary to separate the normal and attack traffic in the feature space. In contrast, the Modbus-based model can learn non-linear relationships between the Modbus traffic features and the attack types using DNN, which may lead to better performance in detecting complex attacks that cannot be separated by a linear boundary. Second, SVM requires manual feature engineering to extract relevant features from the Modbus traffic data, which can be time-consuming and may not capture all the important characteristics of the traffic. The Modbus-based model can automatically learn the most informative features from the raw traffic data using DNN, which can save time and potentially improve detection accuracy. Lastly, there is the handling of imbalanced datasets. In intrusion detection tasks, the number of normal traffic instances is usually much larger than the number of attack instances, resulting in an imbalanced dataset. SVM may struggle to handle such datasets, which can result in poor performance in detecting rare attacks. The Modbus-based model can mitigate this problem by using techniques such as oversampling, undersampling, or cost-sensitive learning to balance the dataset and improve the detection accuracy for rare attacks.
Albulayhi et al. [53] proposed a feature selection approach using mathematical set theory for machine learning-based Intrusion Detection Systems (IDS) to extract efficient subsets of features. Similarly, Abu Al-Haija et al. [54] employed the AdaBoost machine learning technology combined with Decision Tree (DT) and extensive data engineering techniques to construct a robust classifier for detecting and classifying several cyber-attacks. They presented Boost-Defence, a framework to secure IoT networks from a large vector of cyber-attacks at different IoT layers.
In another study, Abu Al-Haija et al. [55] analyzed the performance of several machine-learning techniques for constructing NIDSs. The authors used six different algorithms: Ensemble Boosted Trees (EBT), Ensemble RUSBoosted trees (ERT), Ensemble Subspace KNN (ESK), Shallow Neural Network (SNN), Bilayered neural network (BNN), and Logistic Regression Kernel (LRK). They tested their NIDS on the ToN_IoT datasets, achieving accurate results with high variance. These methods are grouped based on the computational methods employed, the datasets used, and their results. Another work developed a way of combining several convolutional neural network (CNN) algorithms to identify anomalies in the IoT. The NSL-KDD datasets were used to assess the proposed scheme. The experiments showed that the suggested approach effectively and accurately identified attacks with minimal complexity.
The proposed method demonstrated superior efficacy when compared to cutting-edge techniques. Another study presented an automated learning-based method for identifying security holes within an IIoT system. Using a practical testbed, the analysts were able to distinguish between back-door, conventional treatment, and structured query language (SQL) injection threats. A further work proposed an advanced adaptive learning-based threat identification system for smart cities with IoT capabilities. Experts [56] coupled an unplanned tree with randomized domain learning. The suggested approach is assessed using 15 SCADA network datasets as a malware attack-detecting framework. The AWID database was used to assess the suggested model, which had an accuracy rate of 98%. For IIoT platforms as well, the identification of novel attacks remains difficult. Researchers assembled an ensemble of detectors that included LSTM modules. Applying Modbus network traffic data to assess the success of the suggested method, they found an accuracy rate of 99%. According to the results of the study [57], the suggested approaches increased attack detection efficiency by 6%. The findings of this study might be highly beneficial for creating reliable security mechanisms for IIoT applications.
Fig. 17 Confusion matrices of linear model
Fig. 18 Confusion matrices of LSTM model
The Modbus protocol was chosen for this intrusion detection study because it is one of the most popular communication protocols used in industrial control systems. Therefore, many devices support it, making it easier to implement and use. Since it is a simple protocol, it is easier to understand and troubleshoot. In addition, Modbus is a fast protocol, which makes it suitable for real-time applications. This speed is achieved through its use of binary encoding. Nevertheless, the protocol has notable downsides. Modbus does not provide any authentication mechanisms, which means that it is vulnerable to attacks such as spoofing, replay attacks, and man-in-the-middle (MITM) attacks. It does not provide any encryption mechanisms, so data transmitted over Modbus can be intercepted and read by attackers. Finally, one of its major disadvantages is that Modbus does not provide any integrity protection mechanisms, so data transmitted over Modbus can be modified by attackers.
Several problematic scenarios were encountered during the implementation of the experiments. Although Modbus is a widely used protocol, there are many versions of it, which can cause compatibility issues between devices that use different versions of the protocol. Also, as mentioned earlier, Modbus does not provide any built-in security features, making it vulnerable to attacks such as spoofing, replay attacks, and MITM attacks. Modbus devices also require proper configuration to communicate with each other effectively; incorrect configuration can cause communication failures or other issues. To minimize these potential problems, it is important to thoroughly plan and test the Modbus implementation before deploying it in an industrial control system. Proper configuration, hardware selection, and testing can help ensure reliable and secure communication between Modbus devices (Figs. 26, 27).
The use of machine learning has long been seen as a cutting-edge approach to safeguarding digital information. However, ML tools may be manipulated to introduce a bias or a flaw that reduces the efficacy of their defenses [52]. Attackers are also capable of contaminating a computer network's security with fake datasets using their own ML techniques. DL algorithms process massive amounts of raw data, on which they are automatically trained for cyber security. DL neural networks are designed to operate independently, without supervision or human involvement. Increasingly, DL outperforms ML in terms of speed and accuracy when extracting extremely complex structures from enormous amounts of data [58].
An approach for detecting cyber-attacks in IoT/IIoT systems on Modbus is expected to be relevant to the IoT market because of its rapid growth. Resource-constrained computing forces developers to focus on creating IoT systems, such as those based on Modbus, with the highest level of service, while paying little consideration to cybersecurity [59]. Cyberattacks on the Internet of Things (IoT), however, are not limited to a few discrete machines; they open the gate to many additional linked systems. Because of this, attacking IoT devices is appealing and advantageous to criminals on the Internet, which is why it is crucial to spot an IoT system intrusion as soon as it happens. An IoT system can have hundreds of thousands of connected devices that send and receive enormous volumes of data, making it difficult to spot intrusions. With a standard precision of 93.74%, the novel deep-learning malware detection system reported in this research recognizes the majority of prevalent and frequently launched attacks on IoT systems. The datasets that deep neural networks (DNNs) use are essential [60]. The technique used in this study to address this issue is gathering the dataset using the Modbus system. According to this methodology, the proposed solution is an effective and adaptable intrusion detection system that can gather data from any IoT network and identify attacks continuously [61].

Concluding remarks
The study proposed a new architecture model for enhancing IoT cybersecurity, which employs embeddings as a technique for transforming an index into a vector space for words or characters. This approach is commonly used in the field of Natural Language Processing, where it has proven to be effective. By utilizing embeddings, the proposed model can improve the accuracy of intrusion detection by more effectively capturing the features of network traffic. This approach represents an innovative solution for addressing the challenges posed by the complexity and dynamic nature of cyber threats facing IoT/IIoT devices. As such, this research represents an important contribution to the field of IoT cybersecurity.
The paper employed two datasets, the Modbus dataset sourced from the ToN_IoT datasets, and the IoTID20 dataset. The proposed architecture incorporates an embedding function preceding the neural network, and this architecture was tested using three basic neural networks: the Linear Model, the CNN Model, and the LSTM Model. The neural networks were trained both with and without the embedding function to enable a comparison of their respective performances. By conducting experiments on these datasets, the study sought to evaluate the effectiveness of the proposed architecture model in enhancing the accuracy of intrusion detection. Following training, the neural networks were subjected to binary-classification and multi-class classification tasks using the Modbus dataset, and binary-classification, multi-class classification, and multi-class sub-category classification using the IoTID20 dataset. The study then presented the results obtained for each neural network. Notably, the neural networks that incorporated the proposed architecture outperformed the control neural networks, and performed better than previous state-of-the-art algorithms across all measured metrics.
These findings underscore the effectiveness of the proposed architecture in enhancing the accuracy of intrusion detection for IoT/IIoT devices, thus providing a valuable contribution to the field of cybersecurity.
The study suggests several potential directions for future research. One promising application of the proposed architecture is in mitigating Denial of Service (DoS) attacks, as well as other network attacks. By predicting attacks before they occur, or immediately upon their launch, the network can take proactive measures to protect against potential damages. This approach represents a significant step forward in the field of IoT/IIoT cybersecurity, as it has the potential to significantly enhance the security of these devices against a range of cyber threats.