Hidden Markov Trust for Attenuation of Selfish and Malicious Nodes in the IoT Network

The exposure of IoT nodes to the internet makes them vulnerable to malicious attacks and failures. These failures affect the survivability, integrity, and connectivity of the network. Timely detection and elimination of attacks is therefore important for maintaining network connectivity. Trust-based techniques are used to understand the behavior of nodes in the network. Conventional trust models are power-hungry and demand large storage space. Hidden Markov Models have subsequently been developed to calculate trust, but the network survivability they achieve is low. To improve survivability, the selfish and malicious nodes present in the network need to be treated separately. Hence, in this paper, an improved Hidden Markov Trust (HMT) model is developed, which accurately detects the selfish and malicious nodes that illegally intercept the network. The proposed model comprises a Learning Module, which aims to understand the behavior of nodes and compute trust using HMT with the expected output. The probability parameters of the HMT model are derived from the data flow rate and the residual energy of the nodes. Next, in the Decision Module, the actual nature of the node is obtained from the evaluated likelihood functions of the node. If a selfish node is close to the crashed state, it is isolated from the routing function, while a selfish node with sufficient energy is immediately destroyed from the network. Malicious nodes, on the other hand, are given a time-based opportunity to reset themselves before being removed. If the node is legitimate, it continues to function normally. Finally, the Path Formation Module establishes the trusted optimal routing path.
Further, a comparative analysis is carried out for black-hole, grey-hole, and sink-hole attacks, and the performance parameters are extended to survivability rate, power consumption, delay, and false-alarm rate for different network sizes and vulnerability levels. On average, the simulation results show a 10% higher PDR, 29% lower overhead, and 17% higher detection rate than the Futuristic Cooperation Evaluation Model, Futuristic Trust Coefficient-based Semi-Markov Prediction Model, Opportunistic Data Forwarding Mechanism, and Priority-based Trust Efficient Routing using Ant Colony Optimization trust models presented in the literature.


Introduction
The use of the Internet of Things (IoT) for transmitting and sharing data, resources, and services through the internet has gained popularity, especially in areas such as tracking and monitoring, health services, military applications, agriculture, and vehicular communications. Maintaining data integrity and network resilience is considered key to reliable data communication in IoT. However, the exposure of IoT nodes to the open internet makes them vulnerable to attacks and breaches, which compromise their integrity [1,2]. Moreover, the behavioral transition of nodes towards selfishness affects the connectivity of the network [3,4]. Further, IoT devices suffer from random failures due to their distinctive features such as limited memory, limited energy, limited computational capabilities, and decentralized infrastructure. Therefore, for the feasibility and stability of the IoT network, a collaborative environment free from selfish and malicious acts is favored. Selfish nodes refuse to forward the data packets of their neighboring nodes in order to conserve their own energy, while malicious (or mischievous) nodes manipulate the data to damage the integrity of the IoT network.
Protecting IoT nodes from attacks while keeping network connectivity and integrity intact is a difficult task [5]. Conventional cryptographic methods are not suitable for resource-constrained IoT nodes owing to their code size, processing time, and energy consumption. Therefore, alternative trust-based security mechanisms, such as Bayesian systems and fuzzy logic, have been proposed. The survey shows that these trust evaluation methods depend on external parameters such as the opinions of neighboring nodes and their past behaviors. Incorporating these factors calls for large memory and communication overhead. In addition, these approaches detect compromised nodes but fail to discriminate between the impact of selfish and malicious activity on the IoT network.
Differentiating selfish from malicious nodes is needed because the objective of a selfish node is to make routing impossible by never participating and always dropping the packets of neighbor nodes, whereas malicious nodes always take part in network function but try to damage the flow of data. To counter the effect of these nodes, the immediate reaction is to destroy them from the network. Treating both kinds of node in the same way improves communication among nodes but, directly and indirectly, impacts the survivability of the resource-constrained IoT network. Hence, there is a need for a technique that handles selfish and malicious nodes differently and mitigates the problems they cause, so that the survivability of the network is improved while effective communication is maintained.
In this paper, the dynamic behavior of nodes at any given time is characterized by the probability distribution of possible outcomes such as parameter changes and fault frequency. We use the Hidden Markov-based trust model (HMT) to capture the dynamic behavior of nodes and to forecast the likelihood of a node being in one of the hidden states (or behaviors). The model analyzes a 4-state HMM whose states are named adaptive, greedy, mischievous, and crashed, and whose possible outcomes are derived from the packet transfer information associated with each interaction. In the context of trust-based applications, any routing protocol for Low power and Lossy Networks (LLN), such as RPL, 6LoWPAN, or LOADng, is acceptable. For simplicity, our work focuses on IoT applications where communication among devices is generally peer-to-peer and the node intends to set up a trusted path to a destination. Thus, trusted route discovery is needed, which is triggered by the source node that wants to send the message.

1. Development of a Structured Algorithm: A generalized trust evaluation algorithm is presented in a structured manner for a node to select a trustworthy and reliable path to the destination. We discuss the parameters that affect path trustworthiness, such as available energy, packets dropped, and packets modified.
2. Mathematical Evaluation and Analysis: The maximum likelihood of the node's behavior using the Hidden Markov approach provides a clear mathematical analysis of cooperative node selection that proceeds by combining transition and emission metrics.
3. Simulation Results: Simulation results are provided to evaluate and compare the most suitable trust model for different network sizes (scalability) and different vulnerabilities (in the presence of maliciousness). In addition, the model simulates the black-hole and grey-hole attacks in the form of test cases. It also investigates the more general sink-hole attack and presents a comparative analysis with the conventional schemes. Thus, the analysis covers all three attack types.
4. Performance Evaluations: The potential of the model is estimated in terms of the survivability rate, Packet Delivery Rate (PDR), energy consumption, end-to-end delay, routing overhead, and detection rate, together with false positive and false negative factors.
5. Critical Analysis: The advantages and challenges of providing time-based opportunities to the infected nodes are addressed, and the effect of changing the network size in the presence of malicious attacks is presented.
The paper is structured as follows: Sect. 2 presents the related work on the detection and elimination of misbehaving nodes. Section 3 describes the proposed work with its system modeling and mathematical modeling. The analysis of simulation results and their impact on different performance parameters is provided in Sect. 4. Finally, the conclusion and future scope are given in Sect. 5.

Related Work
The security in IoT is conventionally carried out using cryptographic methods, where public key and symmetric key techniques are adopted [8]. As for sensor nodes, both key operations are expensive in terms of computations and energy consumption. Considering this, the study of trust evaluation mechanisms for IoT security has gained momentum. Trust models provide the benefits of lesser resource consumption, peer-to-peer structure, and compromised node detection [9]. Thus, trust is considered to be an important factor to ascertain the survivability of the network.
In this regard, the authors of [10] have proposed the Priority-based Trust Efficient Routing protocol using the Ant Colony Optimization (PTER-ACO) technique. The authors aim to achieve data privacy and secure transmission over an IoT-based mobile network. For added security, they use the Advanced Encryption Standard (AES) and Rivest Cipher (RC6) encryption techniques. Although this trusted technique performs better than other security measures, the use of AES and RC6 makes the system greedy and power-hungry. Therefore, in recent years, researchers have focused their attention on stochastic Markov models for providing security against attacks in different areas of technology and applications such as communication, defense, and monitoring. In view of this, article [11] presents a quantitative measure of survivability for a clustered network in the presence of DoS attacks. In their work, the authors investigate a Markov chain that categorizes nodes into an active state and a dead state. They incorporate an evaluation of the degree of services to estimate the probability of the node's state. The approach relates the failure rate (active to dead) to the energy consumption of the node and the repair rate (dead to active) to the node density. Specifically, the authors counter the DoS attack by increasing the node density. Although the model increases the survivability rate, increasing the node density affects the node's residual power.
The simple Markov chain model is not always applicable because the transition time from one state to another is a random variable, whereas the former applications assume an exponential distribution for this time. Therefore, the semi-Markov process is used to characterize node state transitions. In this direction, the author of [12] has recommended a FUturistic Cooperation Evaluation Model (FUCEM) for establishing an effective routing path. The work adopts a semi-Markov process for determining node reliability. The author includes cooperative, partial-cooperative, and non-cooperative transition states. The reliability of the node depends on the amount of energy dissipated while transmitting and receiving packets. In addition, the author incorporates link stability based on the mobility of the node. The scheme determines the effective path based on energy level, but attacks are not considered, which results in the loss of data packets. Further, [13] recommends a Position-based OppoRtunistic (POR) and greedy routing scheme for reliable data delivery. The work adopts a two-step process to find the best routing path. In the first step, Geographic Location Service (GLS) [14] and Quorum Based Location Service (QBLS) are incorporated for route discovery, which yields an efficient data transmission rate. In the second step, the behavior of nodes in the selected route is derived using the semi-Markov process. The scheme approximates the network survivability, but the quantitative estimation of the transition probabilities is vague and uncertain.
Advancing further, considering the node-state transition time as a random variable, the authors of [15] incorporated a Discrete-Time Markov Chain (DTMC) process for analyzing the behavior of the node. This mechanism addresses the problem of SMS/MMS-based worms in smartphone applications spread through social media. The node is categorized into susceptible, exposed, infectious, and recovered states. The authors use a real-world data set of cellular networks to estimate effective node behavior. The proposed scheme achieves a good detection rate but is specific to smartphone applications.
Considering network survivability, the authors of [16] have proposed a Futuristic Trust Coefficient-based Semi-Markov Prediction Model (FTCSPM) for mitigating selfish nodes in a compromised network. The model incorporates a non-birth-death process to estimate the trust coefficient. It consists of three states, viz. cooperative, selfish, and failed. The stochastic transition probabilities estimate the selfish behavior of the node. The model also derives the lower and upper bounds of network survivability. However, it considers only the selfish behavior of the node and does not mitigate malicious activity that is independent of selfishness.
Sometimes the state of the system is hidden and the observer has only certain pieces of evidence from which to infer the current state. In such cases, the Hidden Markov Model (HMM) comes into play. In [17], the authors have proposed an HMM-based context-aware trust model to predict the dynamic behavior of an agent. They incorporate entropy-based information theory and multiple key factors to select useful features that define a complete agent profile. The profile details make up the observation matrix and the quality rating of the agent accounts for the hidden state. The behavior of the agent is analyzed by a finite-state HMM. The proposed mechanism is better than traditional HMM but is specific to agent-based systems and does not include the effect of malicious attacks.
Turning to attacks, the authors of [18] have proposed an intrusion detection system in which a two-state HMM is applied to evaluate the reputation of vehicles. Safe and malicious are the hidden states, while send, drop, and forward are the observation states. The authors directly apply random probability values to evaluate the reputation of the vehicle; the extraction of these probabilities and the derivation of reputation are not discussed. For the exploration and extraction of the transition probabilities, the authors of [19] have proposed an IoT monitoring security system that uses HMM to determine the optimal path that an attacker may follow, and thus provides suggestions for subsequent patching and security measures. The authors use the Baum-Welch algorithm to estimate the transition and emission probabilities. The experimental results identify the crucial nodes with high accuracy; however, the proposed system is valid only for trigger-action IoT platforms, where events (legal or illegal) occur only through the creation of a chain of events within the network. Moreover, the Baum-Welch algorithm often trains the model to a local optimum, which does not fit well when logging the alert messages, thereby reducing the detection effectiveness.
To address this issue with the Baum-Welch algorithm, the authors of [20] have proposed a pre-training method for multi-stage attack detection. This method is based on the high semantic similarity of alerts. The approach first clusters the alerts based on their semantic information and pre-classifies the stage of the attack to which each alert belongs. Then, the distance of the alert vector to each attack stage is converted into the probability of generating alerts in each attack stage. The detection accuracy of this approach is better than that of the Baum-Welch-based algorithm. However, this holds only for limited categories of alerts; as the number of alert categories increases, the detection rate decreases.
Investigating the issue of the Baum-Welch algorithm further, the authors of [21] have proposed a multi-HMM model with a Genetic Algorithm (GA). The model eliminates the drawback of converging to local optima that occurs when the Baum-Welch algorithm is used for parameter estimation. The proposed HMM consists of two states, legitimate and malicious. The probability distribution of state transitions is evaluated using the GA, where every gene represents a state transition in the HMM. Since the authors consider multiple HMMs, a dynamic sliding window sequence extractor is used to extract multiple sequences simultaneously. The experimental results achieve 99.89% accuracy but are limited to internal threats only.
Progressing further, a state-based classification model that recognizes multi-stage advanced attacks is proposed in [22]. The proposed model maintains a log of observed activities. Common activities in the log are correlated within a given timeframe into a single event with a weight and hit count. The authors incorporate an adaptive sliding window approach to correlate the data set. The correlated log accounts for the node's behavior, which is examined by an HMM-based model comprising three stages, namely reconnaissance, attack, and stepping stone. The proposed mechanism yields good detection performance but demands heavy storage space and is limited to IP-address attacks.
Next, the authors of [23] have proposed an Opportunistic Data Forwarding Mechanism (OADM) for analyzing the node's behavior. The attacking probability of the node is judged by four parameters, viz. forwarding rate, residual energy, degree of maliciousness, and state history of the node. These parameters are employed in an HMM to yield the current state of the node. For the survivability of the network, an effective relay node is selected to route the packet from source to destination. The mechanism is suitable only for on-off attacks.
For multi-stage attacks, the authors of [24] have proposed a probabilistic intrusion detection system for recognizing malicious events. The work adopts a three-step process for the detection of malicious attacks. In the first step, the temporal relationship between attack phases and intent states is established using HMM. This step measures the deviation of a compromised node from one state to another. Since the available information is sometimes incomplete and unidentified, a rule-based technique is applied in the next step to adjust the parameters at runtime. Finally, to interpret the result, Loopy Belief Propagation (LBP) is applied to reduce the HMM to a single output. The model is suitable for recognizing the cause-and-effect relationship of known planned attack events, but it is not effective for unknown attacks. In addition, the cooperation of nodes in the network is not addressed.
Therefore, for better cooperation, an Alliance-based trust management method is designed in [25]. The model is for the VANET network based on Blockchain. It uses HMM for the evaluation of trust and detects the malicious behavior of vehicles within the network. The model also incorporates the Blockchain-based Hyper-Ledger Fabric, which improves the efficiency of trust updating and query processing and ensures the premises of security. The model shows a promising and feasible aspect of trust management. However, the use of Blockchain technology demands a huge amount of storage and computation memory. Further, the authors of [26] have proposed a prediction mechanism using the KDDCUP'99 network intrusion data set. The authors have incorporated both HMM and Naïve Bayes methods for predicting Multi-Stage Attack (MSA). Though the hybrid form of this model predicts attacks accurately, the redundancy in the KDDCUP'99 dataset results in end-to-end delay.
To reduce transmission delay, the authors of [27] have proposed an intrusion detection system using machine learning, known as the Hidden Markov Bayesian (HMB) model. The model comprises two components: a Naïve Bayesian Hidden (NBH) component, which acts as the decision-maker, and a Knowledge-based Bayesian Hidden System (KBHS), which acts as the intruder detector. Initially, the dataset is preprocessed and trained; it is then passed through NBH for a decision, and finally KBHS detects the intruder, if present. The model claims better performance in terms of accuracy, detection rate, and transmission delay, but the experimental analysis of the performance parameters is missing. Continuing with machine learning, an HMM-based machine learning algorithm is proposed in [28] for dealing with network-based cyber-attacks in an IoT system. The authors use a combination of cyber-threat projection and cyber-agility prediction. In cyber-threat projection, cyber-attacks are deduced from an existing set of facts to give a realistic view of the attack, while cyber-agility refers to the identification of threats within the attacked network. The experiments yield better performance, but there is still a need for greater collaboration on data sets.
The above literature analysis reveals the absence of any standard mathematical model that has successfully incorporated the effect of attacks in a compromised network. Moreover, the existing models demand heavy storage space that affects the residual power of the node, which is not suitable for resource-constrained IoT networks. In addition, the survivability and integrity of the network are rarely investigated. The comparative analysis of these state-of-the-art hidden Markov-based trust models is presented in Table 1.
Further, PTER-ACO [10], FUCEM [12], FTCSPM [16], and OADM [23] discussed above are considered for comparison since they are proven as significant models for efficient and effective evaluation of node trust. In addition, these models predict the node's behavior effectively and improve the performance of the network.

Proposed Work
This section elaborates on the technicalities and the procedure of the proposed work including its system modeling and mathematical modeling. System Modeling states the structure of the proposed model along with its flow of information between different modules. System modeling of our proposed scheme includes network structure and system architecture. Mathematical modeling represents the implementation of the discussed system using mathematical concepts and languages. The model considers the mathematical modeling using Hidden Markov Model (HMM). The proposed work successfully evaluates the trust and addresses the black-hole, grey-hole and sink-hole attacks. Table 2 summarizes the symbols used throughout the paper.

System Model
The purpose of the work is to implement a reliable routing path from source to destination along with a high network lifetime. The section comprises the network structure and the system architecture of the proposed Hidden Markov Trust Model, which attenuates the selfish and malicious nodes in the IoT network.

Network Structure
Here we consider a network formed by a set of randomly deployed sensor nodes SN, where 'N' is the number of sensor nodes. Each node is of one of three kinds: adaptive, mischievous, or greedy, as shown in Fig. 1. An optimal trusted path is established between source and destination using the LOADng protocol. Thereafter, after every time 'T', the behavior of the nodes in the path is self-quantified through the likelihood function (Sect. 3.2) and each node is judged as adaptive, malicious, or greedy. The model provides a Time-to-Reset (TTR) to the mischievous nodes so that they can correct themselves before being withdrawn from the network routing path. During that time, they are allowed to participate in and serve the network. The maximum TTR provided lasts as long as the trust value outweighs the threshold trust. Thus, the involuntary benefit of the malicious nodes taking part in network function can be exploited to improve the survivability of the network for as long as the benefits outweigh the damage to the network. In contrast, the selfish nodes, whose objective function is to drop packets, are immediately isolated and destroyed from the network, as they tend never to participate in the network function; communicating with them is clearly a waste of the resources of other nodes [29]. Eventually, a new trusted routing path, free from greedy nodes and from mischievous nodes whose TTR has expired, is established between source and destination.

Table 1 Comparative analysis of hidden Markov-based trust models
Reference | Objective | Technique | Attack considered | Limitation
 | | | None | Experimental analysis of the performance parameters is missing
[11] | To measure survivability of the clustered network | | DoS attack | There is increase in residual power, thus is not suitable for IoT network
[17] | To investigate the behavior of the agent | HMM model with entropy-based information system | None | Lacks to show the effect of uncooperativeness
[18] | To establish intrusion detection system for VANET | | None | Estimation of probabilities not discussed
[20] | To detect attacks based on alerts | Concept of similarity | Multi-stage attack detection | Is specific for limited categories of alerts
[21] | To detect the internal threats | HMM with Genetic Algorithm | Internal attacks | Limited to internal threats
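The handling policy described in this section can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the names `Behavior`, `Action`, and `decide`, and the parameters `trust`, `trust_threshold`, `elapsed`, and `ttr`, are hypothetical and are not taken from the paper's algorithms. The sketch assumes that a mischievous node stays in the path only while its TTR has not expired and its trust value still exceeds the threshold.

```python
from enum import Enum


class Behavior(Enum):
    ADAPTIVE = "adaptive"
    GREEDY = "greedy"
    MISCHIEVOUS = "mischievous"


class Action(Enum):
    KEEP = "keep in routing path"
    KEEP_ON_PROBATION = "keep while TTR is running"
    REMOVE = "isolate from routing path"


def decide(behavior: Behavior, trust: float, trust_threshold: float,
           elapsed: float, ttr: float) -> Action:
    """Map a judged behavior to a handling action (hypothetical helper)."""
    if behavior is Behavior.ADAPTIVE:
        return Action.KEEP
    if behavior is Behavior.GREEDY:
        # Selfish nodes are removed immediately.
        return Action.REMOVE
    # Mischievous node: tolerated only while the reset window is open
    # and its trust value still outweighs the threshold trust.
    if elapsed < ttr and trust > trust_threshold:
        return Action.KEEP_ON_PROBATION
    return Action.REMOVE


example = decide(Behavior.MISCHIEVOUS, trust=0.7, trust_threshold=0.5,
                 elapsed=3.0, ttr=10.0)
```

In this sketch the greedy branch ignores trust and time entirely, which mirrors the immediate isolation of selfish nodes, whereas the mischievous branch needs both conditions to hold.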

System Architecture
The proposed system is made up of three modules viz. Path formation module, Learning module, and Decision module (Fig. 2). The path formation module evaluates the secured shortest path 'P' to the destination by selecting the trustworthy nodes present in the network and initiates the flow of packets. After every time 'T' we train the nodes in the path 'P' using the proposed HMT model and then analyze the behavior of each node. Once the behavior of nodes is examined, the decision module is activated, which eliminates the unreliable nodes from the network. The upcoming section covers the learning and decision module thoroughly while the path formation module is omitted since the optimal path is established using LOADng routing protocol with trust value as an additional attribute (Algorithm 1).
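The idea of selecting a shortest path through trustworthy nodes only can be illustrated with a small sketch. Note that this is not LOADng: it substitutes a plain Dijkstra search over a known graph and simply skips intermediate nodes whose trust falls below a threshold. The function name `trusted_shortest_path`, the graph layout, the trust values, and the threshold are all assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Optional


def trusted_shortest_path(graph: Dict[str, Dict[str, float]],
                          trust: Dict[str, float],
                          source: str, destination: str,
                          threshold: float) -> Optional[List[str]]:
    """Dijkstra restricted to intermediate nodes with trust >= threshold."""
    def allowed(node: str) -> bool:
        # Source and destination are always allowed; others need enough trust.
        return node in (source, destination) or trust.get(node, 0.0) >= threshold

    best = {source: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destination:
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, weight in graph.get(node, {}).items():
            if not allowed(nbr):
                continue
            new_cost = cost + weight
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    return None  # no trusted path exists


# Hypothetical topology: the cheaper route passes through a low-trust node "a".
graph = {"S": {"a": 1.0, "b": 2.0}, "a": {"D": 1.0}, "b": {"D": 2.0}, "D": {}}
trust = {"a": 0.3, "b": 0.9}
path = trusted_shortest_path(graph, trust, "S", "D", threshold=0.5)
```

With the threshold at 0.5 the search avoids node "a" even though the route through it is cheaper; lowering the threshold below 0.3 would restore the cheaper route.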

Mathematical Modeling
The objective of our model is to ensure the reliable movement of packets from source to destination via intermediate nodes. Since the actual state of a node in an IoT network cannot be directly observed, the proposed model makes use of the Hidden Markov process to observe the behavior of nodes and their attacking probabilities. The probability distribution of the node's state is estimated from the given input and the emitted output visible to the observer. This section covers the mathematical modeling of the Learning module using HMM. It also formulates the proposed decision-making module.

Learning Module: Hidden Markov Trust (HMT) Model
The Learning Module is designed to capture the behavior of the nodes in path 'P'. It is formulated using the concept of HMM. An HMM [30,31] is defined as a quintuple Ħ = (Q, O, π, T, E), where:
Q is the set of distinct states in the Markov process, $\{q_1, q_2, q_3, \ldots, q_n\}$, where 'n' is the number of states.
O is the set of observation symbols, $\{O_1, O_2, O_3, \ldots, O_m\}$, where 'm' is the number of observations.
π is the initial state probability of the node at t = 0.
T is the state transition probability matrix of size |Q| × |Q|, where $T_{ij}$ is the probability of the system moving from state $q_i$ at time 't' to state $q_j$ at time 't + 1', and $\sum_{j=1}^{n} T_{ij} = 1$.
E is the emission probability matrix of size |Q| × |O|, where $E_{jk}$ is the probability of output $O_k$ at time 't' when the system is in state $q_j$ at time 't', and $\sum_{k=1}^{m} E_{jk} = 1$.
Given the HMM 'Ħ' and the observations 'O', the likelihood of the system being in state $q_j$ at time 't' is estimated by the forward probability algorithm and is expressed as:
$$\alpha_t(j) = \sum_{i=1}^{n} \alpha_{t-1}(i)\, T_{ij}\, E_j(O_t) \qquad (1)$$
where $\alpha_{t-1}(i)$ is the previous forward path probability when the system was in state $q_i$ at the previous time step 't − 1', $T_{ij}$ is the transition probability from the previous state $q_i$ to the current state $q_j$, and $E_j(O_t)$ is the emission probability of the observation $O_t$ at time 't' given that the current state is $q_j$.
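The forward probability recursion described above can be sketched in a few lines of Python. The sketch assumes the standard initialization α_1(j) = π_j · E_j(O_1), which the text does not state explicitly, and uses a toy two-state model with made-up numbers; the function name `forward` and all numeric values are illustrative assumptions.

```python
from typing import List, Sequence


def forward(pi: Sequence[float], T: Sequence[Sequence[float]],
            E: Sequence[Sequence[float]], obs: Sequence[int]) -> List[List[float]]:
    """Return alpha, where alpha[t][j] is the forward probability of state j
    after the first t+1 observations (obs holds observation symbol indices)."""
    n = len(pi)
    # Assumed initialization: alpha_1(j) = pi_j * E_j(O_1)
    alpha = [[pi[j] * E[j][obs[0]] for j in range(n)]]
    # Recursion: alpha_t(j) = sum_i alpha_{t-1}(i) * T_ij * E_j(O_t)
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * T[i][j] for i in range(n)) * E[j][o]
            for j in range(n)
        ])
    return alpha


# Toy two-state model with made-up probabilities (rows of T and E sum to 1).
pi = [1.0, 0.0]
T = [[0.9, 0.1],
     [0.2, 0.8]]
E = [[0.8, 0.2],
     [0.3, 0.7]]
alpha = forward(pi, T, E, obs=[0, 1])
likelihood = sum(alpha[-1])  # probability of the whole observation sequence
```

Summing the last row of `alpha` gives the likelihood of the observation sequence, and normalizing that row gives the probability of each state given the observations so far.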
Based on the properties of HMM discussed above, the dynamic trustworthiness of the node in the proposed scheme is modeled by a 4-state HMM, which predicts the probability distribution of the node's next state. The proposed Learning model is represented as M SI (N) and is defined as the quintuple M SI (N) = (S, O, π, P, E) (Fig. 3), where 'S' is the finite state space that defines the behavior of the node. It consists of four states: adaptive (A), greedy (G), mischievous (M), and crashed (C). The adaptive state is also known as the cooperative state; here the node reliably forwards packets from one end to another. Nodes in the greedy state are selfish in nature, so instead of forwarding the packets of other nodes, they drop them to save their energy. Mischievous (malicious) nodes compromise the integrity of the data by tampering with the data packets of other nodes in transit. Nodes in the crashed state are failed nodes that do not take part in routing.
Fig. 2 System architecture. Note: In the Path_Formation_Module, LOADng is a standard protocol described in [30]. It is used unchanged except that the RREQ (Route Request) and RREP (Route Reply) messages include the trust of the node along with the cost of the routing path, from which the optimal trusted route is discovered. Note: '#' represents 'the number of'.
'O' is the set of emitted symbols visible to users during the transmission of data from source to destination. It consists of two symbols at each state: expected output (EO) and unexpected output (UEO).
'π' is the initial state probability of the node at time 't = 0'. When the network is deployed, all nodes are assumed to be adaptive in nature, i.e., π = {1,0,0,0}. The estimation of the stochastic probabilities 'P' and 'E' is discussed in the upcoming sections, and the step-by-step process of the Learning Module is given in Algorithm 2.

a Estimation of state transition probabilities (P)
The state transition probability matrix provides the probability of a node transitioning from one state to another in a single time unit. The transition probabilities of the proposed model are derived from the data flow rate of the packets and the residual energy of the node. As shown in Fig. 3, the stochastic transition probabilities are classified as follows.
AG ⇒ A cooperative node in an IoT environment begins to enter the greedy state when its residual energy runs low. The average lifetime of the node defines the probability of its transition from the adaptive to the greedy state. It is determined as the ratio of the energy consumed by the node in receiving and transmitting packets to the energy left after this interaction,
where $E_c$ is the energy consumed, $E_r$ is the residual energy, and $E_i$ is the initial energy.
From Eq. 2, a node is considered adaptive when the value of AG is low, i.e., when the residual energy of the node is high.
GA ⇒ A greedy node at times attempts to cooperate and adapt itself to the IoT environment by forwarding data packets on behalf of its neighbor nodes. Thus, the probability of a node transiting from the greedy to the adaptive state is the ratio of the number of packets forwarded to the total number of packets received from neighbor nodes,
where $pkts_f$ is the number of packets forwarded and $pkts_r$ is the number of packets received.
AC = GC = MC ⇒ A node in any state tends to enter the crashed state if it starts dropping packets instead of transmitting them. So, the transition probability of a node from any of the states (adaptive, greedy, and mischievous) to the crashed state is given as the ratio of the number of packets dropped to the number of packets received by the node,
where $pkts_d$ is the number of packets dropped and $pkts_r$ is the number of packets received.
AM ⇒ We assume an attack model that compromises the integrity of messages forwarded from source to destination. Therefore, the probability of transition from the adaptive to the mischievous state is the ratio of the number of packets modified to the number of packets received by the node (Eq. 5). Furthermore, to detect which node modifies a packet, we implement a checkpoint after every 'm' hops. A checkpoint declares the point before which all nodes are in a consistent state and have transmitted unmodified packets. This save point is maintained by an edge node. After every 'm' hops, the edge node verifies the forwarded packets by comparing them with the source data packets. If a received packet is damaged, the edge node backtracks through the nodes one by one up to the previous save point. We rely on the fact that nodes store their data until the next checkpoint is administered. Thereby, the number of packets modified by each node is discovered, and the transition probability of the node from the adaptive state to the mischievous state is evaluated.
where $pkts_m$ is the number of packets modified (tampered) and $pkts_r$ is the number of packets received.
MA ⇒ A mischievous node may be released from the influence of an intruder. So, instead of isolating the node immediately, it can be given a few opportunities to correct itself and withdraw from malicious activity. Thus, the rehabilitation probability of the node is given by Eq. 6, where Time-to-Reset (TTR) is the time required for the node to realign itself into an adaptive state.
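The checkpoint-based backtracking described for the AM transition can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `find_modifying_nodes`, the `forwarded` dictionary (the copy each node keeps until the next checkpoint), and the rule that a node is blamed when its forwarded copy differs from what it received are all assumptions made for illustration.

```python
from typing import Dict, List


def find_modifying_nodes(segment: List[str],
                         forwarded: Dict[str, bytes],
                         source_payload: bytes) -> List[str]:
    """Backtrack from the checkpoint to the previous save point.

    segment        -- node ids in forwarding order since the previous save point
    forwarded      -- node id -> payload that node forwarded (hypothetical store)
    source_payload -- payload as verified at the previous save point
    """
    # Checkpoint passes: the packet arriving at the edge node is undamaged.
    if not segment or forwarded[segment[-1]] == source_payload:
        return []

    culprits = []
    # Walk the nodes one by one, from the checkpoint back to the save point.
    for i in range(len(segment) - 1, -1, -1):
        received = source_payload if i == 0 else forwarded[segment[i - 1]]
        # Assumption: a node modified the packet if output differs from input.
        if forwarded[segment[i]] != received:
            culprits.append(segment[i])
    return culprits
```

Counting how often each node appears in the returned list over many packets would give the per-node count of modified packets used in the AM ratio.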
Summarizing the above transition probabilities, the complete state transition probability matrix is given in Eq. 7.
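As a rough illustration of how the ratios described above could be assembled into a 4 × 4 matrix, the following Python sketch may help. Several points are assumptions rather than the paper's Eq. 7: the MA probability is passed in as a parameter `p_ma` because its formula is not reproduced here, transitions not named in the text are set to zero, the crashed state is treated as absorbing, ratios are clipped to [0, 1], and each self-transition takes whatever probability mass remains in its row.

```python
A, G, M, C = range(4)  # adaptive, greedy, mischievous, crashed


def clipped_ratio(num, den, if_zero_den=0.0):
    """Ratio clipped to [0, 1]; a fallback value is used for a zero denominator."""
    if den <= 0:
        return if_zero_den
    return min(max(num / den, 0.0), 1.0)


def transition_matrix(e_consumed, e_residual, pkts_r, pkts_f, pkts_d, pkts_m, p_ma):
    """Build a row-stochastic matrix from the ratios described in the text."""
    ag = clipped_ratio(e_consumed, e_residual, if_zero_den=1.0)  # energy ratio
    ga = clipped_ratio(pkts_f, pkts_r)   # forwarded / received
    xc = clipped_ratio(pkts_d, pkts_r)   # dropped / received (AC = GC = MC)
    am = clipped_ratio(pkts_m, pkts_r)   # modified / received

    P = [[0.0] * 4 for _ in range(4)]
    P[A][G], P[A][M], P[A][C] = ag, am, xc
    P[G][A], P[G][C] = ga, xc
    P[M][A], P[M][C] = p_ma, xc
    P[C][C] = 1.0  # assumption: crashed nodes stay crashed

    for s in (A, G, M):
        off_diagonal = sum(P[s])  # diagonal entry is still zero here
        if off_diagonal > 1.0:
            P[s] = [p / off_diagonal for p in P[s]]  # renormalize the row
        else:
            P[s][s] = 1.0 - off_diagonal  # assumption: remainder stays put
    return P
```

For example, a node that has consumed 2 J with 8 J left, received 100 packets, forwarded 90, dropped 5, and modified 5 would get an A→G entry of 0.25 under these assumptions.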

b Estimation of Emission Probability Matrix (E)
Emission probability, also termed observation output probability, is defined as the probability of a node yielding each output (observation) symbol from every single state in a single time unit, as given in Eq. 8. As discussed, the proposed MSI (N) assumes two observation symbols: expected output and unexpected output. So, the emission probability matrix from Eq. 8 is presented in Eq. 9.

c Evaluation of Trusted Node based on Hidden Markov Model (MSI (N))
Given the learning model MSI (N), the likelihood that a node with the expected output can be trusted in the routing process is estimated using the forward probability algorithm, which is given in Eq. 1. Using Eq. 1 as the base equation, we evaluate the probability of a node being in the adaptive, greedy, and mischievous states.
• The probability of a node being in the adaptive state at time 't' with expected output is given by Eq. 10; expanding Eq. 10 over every state gives Eq. 11.
• Similarly, the probability of a node being in the greedy state with expected output is given by Eq. 12; expanding Eq. 12 over every state gives Eq. 13.
• The probability of a node being in the mischievous state with expected output is given by Eq. 14; expanding Eq. 14 over all states gives Eq. 15.
*The probability of a node being in the crashed state is not evaluated, as nodes in the crashed state are not considered for the routing function.
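The per-state likelihoods described above follow the pattern of the textbook HMM forward recursion, in which the likelihood of state j at time t is the emission probability of the observed symbol in j multiplied by the sum, over all previous states i, of the previous likelihood of i times the transition probability from i to j. The Python sketch below shows that standard recursion only; it is not a transcription of the paper's Eqs. 10 to 15, and the names `forward_step`, `forward`, `EXPECTED`, and `UNEXPECTED` are illustrative.

```python
EXPECTED, UNEXPECTED = 0, 1  # the two observation symbols


def forward_step(alpha_prev, P, E, obs):
    """One forward step: alpha_t(j) = E[j][obs] * sum_i alpha_prev[i] * P[i][j]."""
    n = len(alpha_prev)
    return [E[j][obs] * sum(alpha_prev[i] * P[i][j] for i in range(n))
            for j in range(n)]


def forward(pi, P, E, observations):
    """Run the forward recursion over a non-empty observation sequence."""
    # Textbook initialization: alpha_1(j) = pi[j] * E[j][o_1].
    alpha = [pi[j] * E[j][observations[0]] for j in range(len(pi))]
    for obs in observations[1:]:
        alpha = forward_step(alpha, P, E, obs)
    return alpha
```

With states ordered as adaptive, greedy, mischievous, crashed and `pi = [1, 0, 0, 0]`, the first three entries of the returned list play the role of the adaptive, greedy, and mischievous likelihoods; any matrices passed in are the caller's, since the paper's numeric values are not reproduced here.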

Decision-Making Module
The maximum likelihood of the node's behavior determines whether the node is isolated, and integrity is maintained by using the likelihood of the expected valid output evaluated at the different states. The maximum of t(A), t(G), and t(M) indicates the state of the node.
If the maximum of the three is t(M), the node is mischievous in nature. It is not isolated immediately but is given the TTR to reset and detach itself from the malicious influence. TTR is selected such that the benefit to the network always outweighs the damage caused. From Eq. 10, it is evident that TTR is inversely proportional to trust. Thus, the value of TTR is selected so that the probability of trust always stays above its threshold value. This is done to increase the survivability of the network.
If the maximum is t(G), the node is greedy in nature. Greediness can have two causes. First, a node can have minimal energy and be on the verge of the crashed state. In this situation, the node is only isolated from the routing function and is given a Time-to-Live (TTL) to restore its energy. If the node recovers before the TTL expires, it is brought back into the network; otherwise it is treated as crashed. Second, a node can be selfish in nature: despite having adequate energy, it never intends to participate in the network function. In such a case, the node is immediately isolated and removed from the network. The two cases are distinguished by measuring the energy of the greedy node.
Finally, if the maximum is t(A), the node is trustworthy. In addition, if the trust value exceeds the trust threshold, the node is adaptive in the network environment with valid output; otherwise, it yields invalid output because of mischievous activity along the path. Figure 4 illustrates the decision module of the proposed solution, and Algorithm 3 presents the steps of making a decision.
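The decision rules above can be summarized in a short Python sketch. It is a simplified reading, not Algorithm 3 itself: the function name `decide`, the returned labels, the use of the adaptive likelihood as the trust value, the `energy_threshold` parameter separating low-energy from selfish nodes, and the tie-breaking order (mischievous, then greedy, then adaptive) are all assumptions.

```python
def decide(l_adaptive, l_greedy, l_mischievous,
           residual_energy, trust_threshold, energy_threshold):
    """Map the three state likelihoods of a node to an action label."""
    top = max(l_adaptive, l_greedy, l_mischievous)

    if top == l_mischievous:
        # Not isolated at once: the node gets Time-to-Reset to recover.
        return "grant_ttr"

    if top == l_greedy:
        if residual_energy < energy_threshold:
            # Near the crashed state: leave routing, wait up to Time-to-Live.
            return "isolate_with_ttl"
        # Adequate energy but unwilling to cooperate: selfish node.
        return "remove"

    # Adaptive is the most likely state; check it against the trust threshold.
    if l_adaptive > trust_threshold:
        return "keep"
    return "flag_invalid_output"
```

A caller would typically feed in the first three entries of the forward likelihood vector for each node at the end of every evaluation period.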

Simulation Results and Analysis
The performance of the proposed Hidden Markov Trust model for attenuation of mistrusted nodes in IoT is verified through simulation in a MATLAB R2018 environment. To validate the performance, the proposed Hidden Markov Trust (HMT) model is compared with existing models such as FTCSPM, FUCEM, OADM, and PTER-ACO. A brief discussion of these models is presented in the Related work section of the paper. All these models are compared in the presence of a sink-hole attack. The sink-hole attack is a general and frequently occurring attack in which the attacker aims to drop or tamper with data flowing through the network. Initially, all nodes in the network are adaptive in nature, and the modified LOADng routing path is selected for traffic movement from the source to the gateway. Once the trusted path is selected, traffic continues to move from source to gateway for 'T' sec. After every 'T' sec, the path is analyzed again, mistrusted nodes are isolated, and a new reliable trusted path is selected for traffic movement. This continues until the end of the simulation.

Simulation Setup
All simulations are performed over an area of approximately 500 × 500 m² with 50 nodes, each having a transmission range of 150 m. Nodes are distributed randomly, since this generates a realistic node pattern. The traffic of the simulated model is constant bit rate at 40 pkts/sec. Further, modified LOADng is used as the routing protocol, and the simulation time is set to 400 s. Table 3 presents the simulation parameters for analyzing network performance.
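The simulations reported here were run in MATLAB; purely as an illustration of the setup, the Python sketch below places nodes uniformly at random in the square area and derives which nodes are within transmission range of each other. The function name `deploy_nodes`, the uniform distribution, the fixed seed, and the symmetric-link assumption are illustrative choices, not details taken from the paper.

```python
import math
import random


def deploy_nodes(n_nodes=50, side=500.0, tx_range=150.0, seed=0):
    """Randomly place nodes in a side x side area and list in-range neighbors."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    positions = [(rng.uniform(0.0, side), rng.uniform(0.0, side))
                 for _ in range(n_nodes)]

    neighbors = {i: [] for i in range(n_nodes)}
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            # Assumption: a link exists when the Euclidean distance is in range.
            if math.dist(positions[i], positions[j]) <= tx_range:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return positions, neighbors
```

The neighbor lists could then serve as the candidate next hops over which a routing protocol such as LOADng discovers paths.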

Performance Metrics
Performance metrics are used to collect, analyze, and report information regarding the performance of an individual, group, system, or component. The proposed study categorizes the experimental results into the following performance indices:
• Survivability Rate-It is defined as the capability of the system to fulfill its objective in a timely manner in the presence of attacks, failures, or accidents. It is the ratio between the number of active nodes and the total number of nodes present in the network at a particular instant, where $N_{active}$ is the number of active nodes in the network and $N$ is the total number of nodes.
• Packet Delivery Ratio (PDR)-It is the ratio of the number of packets received by the destination node to the number of packets sent to the destination node.
• Routing Overhead-It is considered as the frequency of discovering routing paths.
• False Positive Rate-It is the ratio of false positives to the total number of negative events, where FP is the number of negative events wrongly categorized as positive and TN is the number of true negative events.
• False Negative Rate-It is the ratio of false negatives to the total number of positive events, where FN is the number of false negatives and TP is the number of true positive events.
• Avg Energy Consumption-It is the total energy consumed by the node during packet transmission and reception.
• Avg End-to-end Delay-It is the average time taken by data packets to reach their destination, including connection establishment and other delays.
• Average Trust value-Trust is defined as an association between two nodes. Average trust is the degree to which a node is collaborative in nature.
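The ratio-based metrics in this list can be written down directly. The Python sketch below uses the verbal definitions given above, with the false-positive rate taken as FP / (FP + TN) and the false-negative rate as FN / (FN + TP); these are the standard forms and are assumed to match the paper's equations. The function names and the choice to return 0.0 for an empty denominator are illustrative.

```python
def survivability_rate(n_active, n_total):
    """Active nodes divided by all nodes at a given instant."""
    return n_active / n_total if n_total else 0.0


def packet_delivery_ratio(pkts_received, pkts_sent):
    """Packets received by the destination divided by packets sent to it."""
    return pkts_received / pkts_sent if pkts_sent else 0.0


def false_positive_rate(fp, tn):
    """Normal nodes wrongly flagged, over all truly normal nodes."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def false_negative_rate(fn, tp):
    """Compromised nodes missed, over all truly compromised nodes."""
    return fn / (fn + tp) if (fn + tp) else 0.0
```

For instance, 45 active nodes out of 50 give a survivability rate of 0.9.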

Simulation of the proposed model in presence of attacks
The model is evaluated by examining the proposed system in the presence of various attacking nodes. Here we observe our model in the presence of Black-hole (Test-case I) and Grey-hole (Test-case II) attacks. A Black-hole (BH) attack is one in which the attacker claims to have the shortest route to the destination node even if it has no route to it. Consequently, all packets pass through it, which enables the attacker node (BH node) to forward or discard packets during data transmission. A Grey-hole (GH) attack is a variant of the BH attack in which GH nodes drop packets with a certain probability. These nodes discard packets for a particular time duration and then switch back to normal behavior, resulting in on-off vulnerability. The simulation analysis of the proposed system in the presence of BH and GH attacks is illustrated with the following test scenarios:
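To make the difference between the two attacker types concrete, the Python sketch below models their forwarding behavior. It is a simplified illustration: the class names, the periodic on/off schedule for the grey-hole node, and the choice to have the black-hole node drop every packet are assumptions, since the exact attacker parameters used in the simulation are not given in this section.

```python
import random


class BlackHole:
    """Attacker that attracts traffic and, in this sketch, drops every packet."""

    def forwards(self, t):
        return False


class GreyHole:
    """Attacker that drops with probability `drop_prob` during 'on' windows."""

    def __init__(self, drop_prob, on_duration, off_duration, seed=0):
        self.drop_prob = drop_prob
        self.on_duration = on_duration
        self.off_duration = off_duration
        self.rng = random.Random(seed)

    def forwards(self, t):
        # Assumption: attack windows repeat with a fixed on/off period.
        phase = t % (self.on_duration + self.off_duration)
        if phase < self.on_duration:
            return self.rng.random() >= self.drop_prob
        return True  # outside the attack window the node behaves normally
```

Because the grey-hole node forwards normally outside its attack windows, part of the traffic still gets through, which is the on-off behavior described above.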

Test-Case I: Simulation of Blackhole Attack with Different Network Size and Adversary
Simulation is carried out in the presence of a varying number of BH nodes, from a sparse (10 nodes) to a dense (50 nodes) sensor network. Comparisons are made for different numbers of blackhole nodes (BH = 0, BH = 2, BH = 4, BH = 6). The performance of the network is depicted in Fig. 5a-e, and the following observations are drawn:
i. As shown in Fig. 5a, PDR is highest in the absence of BH nodes (BH = 0), irrespective of the number of sensor nodes in the network. The plot depicts a decrease in PDR of 18%, 45%, and 75% for BH = 2, BH = 4, and BH = 6 respectively, when the network is neither sparse nor dense (30 nodes). This is because a BH node aims to cut the connection between two communicating nodes and absorb all intercepted packets. As the network progresses from sparse to dense, PDR decreases due to packet collisions during data transmission.
ii. Fig. 5b depicts the average energy consumption of the nodes in the presence of BH nodes. The model shows the highest energy consumption in the absence of BH nodes; as the number of BH nodes increases, the energy consumed by nodes decreases by 8%, 20%, and 40% for BH = 2, BH = 4, and BH = 6 respectively, because packets are dropped by the attacking nodes. Normal nodes therefore tend to remain idle, as they have no packets to forward. In addition, our proposed solution serves to distribute energy among mobile nodes. Consequently, an increase in mobile nodes decreases the energy consumption, which is evident from the graph.
iii. Fig. 5c illustrates the routing overhead of the model in the presence of BH nodes. Isolating BH (selfish) nodes from the network initiates the selection of a new routing path from source to destination. Routing overhead is high for a large number of BH nodes and low in their absence, as the transmission of all packets then takes place in a single run. The overhead increases by 86% for BH = 6 compared to the network without BH nodes. Moreover, a larger number of mobile nodes also increases the routing overhead, because more control packets are required to discover the routes.
iv. Fig. 5d presents the end-to-end delay with varying network sizes and BH nodes. Initially, in a sparse network, the delay in delivering packets from source to destination is low, but as the network size increases, transmission from source to destination slows, resulting in a 47% increase in end-to-end delay, since the packets have to hop through additional nodes. Likewise, an increase in BH nodes also increases the delay, as the compromised nodes restrict data transmission, which results in packets being resent.
v. Fig. 5e depicts the detection rate of the compromised nodes. It is inferred that a larger number of nodes in the network increases the detection rate, on average by 60%. Conversely, for a given network size, a larger number of attackers reduces the detection rate: when BH = 6, approximately 60% of the BH nodes are detected, while when BH = 4, 75% of the BH nodes are discovered. In addition, the position of BH nodes plays an important role: a compromised node located close to the network traffic can be detected more easily than a node located at a distance.

Test-Case II: Simulation of Grey-Hole Attack with Different Network Size and Adversary
This section discusses the impact of GH attacks on IoT networks. We have analyzed the performance metrics discussed above and drawn the following observations (Fig. 6a-e):
i. Fig. 6a demonstrates that the effect of GH nodes on sensor networks is similar to that of BH nodes. PDR decreases with an increase in the number of mobile nodes, due to collisions, and with an increase in GH nodes, which aim to absorb all forwarded packets. The average decrease in PDR is 37% when the number of GH nodes is six, relative to the network without any GH nodes. However, PDR in the presence of GH nodes is approximately 5% higher than with BH nodes. This is because GH nodes switch their behavior after every specified time interval, which allows a partial flow of packets.
ii. Fig. 6b illustrates the decrease in average energy consumption of the nodes with the increase in the number of mobile and GH nodes. On average, energy consumption is reduced to 67% when GH = 6, relative to a network free from GH attack. Energy consumed with GH attackers is 10% more than with BH attackers in sparse networks and 12% more in dense networks, since the switching nature of GH nodes allows some packets to be forwarded.
iii. Fig. 6c represents the routing overhead in the presence of GH nodes. Initially, routing overhead is low for GH = 0 and GH = 2, but as the network becomes dense, routing overhead increases and then stabilizes, because the selection probability of a trusted path is higher than that of an untrusted path. In contrast, a rise in GH nodes escalates the routing overhead of the network, because the selection of a new routing path is initiated every time. The graph depicts 0.6% and 0.8% overhead when GH = 2 and GH = 4, which rises sharply to 6.8% when GH = 6, because each newly initiated path increases the flow of control packets.
iv. The delay in delivering packets from source to destination is highlighted in Fig. 6d. On average, a packet is delayed by 1 s in the absence of GH nodes, and as the attack likelihood increases, the delay rises to 1.3 s. The graph shows small variations because the on-off nature of GH nodes makes them behave like normal nodes part of the time. For the same reason, the overall delay in the presence of GH nodes is less than with BH nodes. In addition, the delay increases with the number of mobile nodes, because packets then have to cover more hops.
v. Fig. 6e depicts an unstable detection rate, because the position and on-off nature of the GH node play a key role in the detection mechanism. Moreover, the average detection rate of BH nodes is 2% higher than that of GH nodes, because the unpredictable nature of GH nodes helps them hide among the normal nodes, thereby making detection difficult.

Comparative Analysis of the Proposed Model with Different Network Size and Adversary
To justify the work, the proposed model is investigated and validated through comparative analysis with the most relevant existing trust models: FTCSPM, FUCEM, OADM, and PTER-ACO. We have evaluated our approach by varying the density of mobile nodes with a fixed percentage of attackers, and by fixing the density of mobile nodes while varying the percentage of attackers. Here the attackers are sink-hole attackers, whose objective is to drop or alter the data packets.

Performance Metrics Versus Varying Number of Mobile Nodes
We present the performance of the proposed approach by varying the number of mobile nodes in the configured environment. The number of nodes is varied from 10 to 50, with 10% of the nodes misbehaving. Figure 7a-d depicts the plots of average energy consumption, packet delivery ratio, average delay, and routing overhead for our approach along with the comparative models.
i. The plot in Fig. 7a shows the node's average energy consumption for the proposed approach and the benchmark systems discussed. In general, energy consumption increases considerably when the number of mobile nodes in the network increases, due to the increase in data flow. However, the proposed approach consumes less energy than the others as the number of mobile nodes grows. This is due to the effectiveness of our approach, which despite the large data flow distributes the energy among the varying mobile nodes during the period of transmission, since energy consumption depends on the distance between two nodes. Although the proposed approach initially shows an increase of 0.1 J, 0.09 J, and 0.121 J when compared with FTCSPM, FUCEM, and OADM respectively, as the number of mobile nodes increases it shows a considerable decrease in energy consumption: 0.03 J, 0.88 J, 0.32 J, and 0.62 J lower at 50 nodes when compared with FTCSPM, FUCEM, OADM, and PTER-ACO respectively.
ii. The plot in Fig. 7b presents the PDR of the proposed approach and the benchmark mechanisms discussed. The figure shows an increase in PDR of 17% to 30% with respect to FTCSPM, 3% to 12% with respect to FUCEM, 0.5% to 1.5% with respect to OADM, and 9% to 16% with respect to PTER-ACO. Overall, the proposed approach shows a significant improvement in PDR of 12.5%.
iii. Further, Fig. 7c plots the average end-to-end delay for the given number of mobile nodes. The graph depicts a significant decrease in average delay, because only the reliable routing path is selected, which prevents unnecessary delay. On average, the proposed approach presents a 90%, 62%, 13%, and 94.4% decrease in delay when compared to FTCSPM, FUCEM, OADM, and PTER-ACO respectively.
iv. For routing overhead (Fig. 7d), it can be concluded that our model shows a significant increase in performance, with an average reduction in overhead of 88%, 92%, 56.6%, and 89.8% relative to FTCSPM, FUCEM, OADM, and PTER-ACO respectively. This is due to the adopted strategy, which selects the routing path with the minimal number of control packets.

Performance Metrics Versus Varying Number of Attacking Nodes
Here the performance of the proposed approach is explored by varying the number of attackers in an IoT environment. The percentage of attackers is varied from 10% to 50% with a total of 50 IoT nodes.
i. The plot in Fig. 8a shows that PDR decreases as the percentage of attackers increases for all the included benchmarks. However, the proposed approach shows considerable improvement in PDR compared to the other mechanisms. On average, the proposed model shows an 8% increase compared to OADM, a 0.7% increase compared to FUCEM, and a 1.96% increase compared to PTER-ACO. Overall, the proposed approach presents a significant improvement because the most reliable and shortest path is selected, which allows a substantial flow of packets.
ii. Further, the plot in Fig. 8b shows routing overhead with a varying number of attackers. The plot indicates that the injection of attackers escalates the routing overhead. However, our approach shows a smaller increase in routing overhead. On average, our approach is 76% better than OADM, 29% better than FUCEM, and 71% better than PTER-ACO. On the whole, our approach is superior because the most trusted routing path is selected, whereas in the other cases some misbehaving nodes are misjudged as collaborating nodes. This results in frequent path discovery and, in turn, increased overhead.
iii. Finally, the plot in Fig. 8c depicts the detection rate of misbehaving nodes. It can be inferred that a larger share of attackers decreases the detection rate. According to the plots, our approach is stronger than the other benchmarks: it recognizes 5.4% more attackers than OADM, 1.5% more than FUCEM, and 43% more than PTER-ACO. This is because the observable symbols in the proposed HMT model immediately identify the state of the node.

Analyzing the Efficacy of the Proposed Model
The principal objective of our proposed approach is to ensure the survivability of the network along with the attenuation of the compromised nodes, which is missing in the existing models. The proposed approach addresses this issue and improves performance in terms of accuracy, false-alarm rate, and survivability rate, thereby increasing the efficiency of the proposed model.

False Alarm Probabilities and the Accuracy of the Model
This section reports the false-positive and false-negative rates along with the accuracy of the proposed solution for 50 nodes, 5 of which are compromised. Figure 9a represents the false-positive and false-negative rates as a function of time. A false positive is the misidentification of a normal node as a bad node. Its effect is normally observed at large times, when the energy of normal nodes is low, which is likely to reduce their trust value. A false negative occurs when a bad node is considered a normal node; its effect appears at small (initial) times, when all nodes in the network are considered trustworthy. The graph in Fig. 9a follows this outline. The false-negative rate is initially high, as all nodes are regarded as trusted and the model is likely to miss the bad nodes. As time progresses, the false-negative rate drops because the proposed solution detects the compromised nodes in the network. In contrast, the false-positive rate increases slowly, since the trust value of normal nodes decreases with time and the system misdiagnoses a normal node as compromised. Additionally, the figure illustrates that, on average, the model is 95% accurate; the accuracy increases to 99.99% at time = 400 s, when the false-positive and false-negative rates are lowest, and then reduces as time advances.
Figure 9b shows the sensitivity of the false alarm rate to the trust threshold, below which a node is considered compromised. It can be inferred that as the trust threshold increases, the false-negative rate decreases while the false-positive rate increases. There exists an optimal threshold at which both the false-positive and false-negative rates are minimized. Here, for time = 400 s, the optimal trust threshold is 0.5, at which both the false-negative and false-positive rates are zero and the accuracy is maximum, higher than 99.99%.
Both graphs indicate that the proposed model is efficient and yields valid outcomes.
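The threshold sensitivity described for Fig. 9b can be reproduced in miniature with the Python sketch below, which flags a node as compromised when its trust value falls below the threshold and then computes the false-positive rate, false-negative rate, and accuracy. The function name `rates_at_threshold` and the toy trust values in the usage note are hypothetical and are not the paper's simulation data.

```python
def rates_at_threshold(trust_values, is_compromised, threshold):
    """Return (false-positive rate, false-negative rate, accuracy)."""
    tp = fp = tn = fn = 0
    for trust, bad in zip(trust_values, is_compromised):
        flagged = trust < threshold  # below threshold => treated as compromised
        if flagged and bad:
            tp += 1
        elif flagged and not bad:
            fp += 1
        elif not flagged and bad:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return fpr, fnr, accuracy
```

With hypothetical trust values `[0.9, 0.8, 0.7, 0.6, 0.3, 0.2]`, of which only the last two nodes are compromised, a threshold of 0.25 misses one bad node, a threshold of 0.75 wrongly flags two normal nodes, and a threshold of 0.5 separates the two groups exactly, mirroring the trade-off described above.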

Impact of TTR on the Survivability of the Network
The use of Time-To-Reset (TTR) for malicious nodes is an important factor that has a large impact on the survivability and overall trust of the network. The simulation results are given in Table 4, which analyses the effect of TTR on the survivability and overall trust of the network, and are graphically represented in Fig. 10. Figure 10a compares the survivability rate against time for TTR = 0, 1, 2, and 3 s. We observe that the survivability rate of the network decreases abruptly for lower values of TTR. The model estimates an average survivability of 93.2% for TTR = 3 s, which drops to 91.6% for TTR = 2 s, followed by 89.6% and 73.6% for TTR = 1 s and TTR = 0 respectively. In contrast, Fig. 10b compares overall trust against time for the same values of TTR. We observe that the trust value decreases gradually with an increase in TTR. The model estimates the probability of trust to be 0.95 when no time-based opportunity is given to nodes (i.e., TTR = 0). The trust value reduces by 4%, 1%, and 0.5% for TTR = 1, 2, and 3 s respectively.
A comparative study of Fig. 10a, b (and Table 4) shows that although the time-based opportunity (TTR) decreases the overall trust, the gain in survivability outweighs this loss and the model gives a better result overall. Therefore, the value of TTR has to be chosen such that the trust value does not drop below the threshold trust. Thus, we can infer that providing a time-based opportunity, i.e., TTR, to nodes in the network plays an important role in the welfare of IoT network communication. In addition, both Fig. 10a and Fig. 10b show a decrease in survivability rate and trust value as time proceeds. This is because the residual energy of the nodes decreases with time.

Conclusion and Future Scope
In this paper, we focused on modeling and analyzing the impact of node behavior on network survivability and integrity, which has rarely been studied. First, the node's behavior is classified into four states: adaptive, greedy, mischievous, and crashed, each with two observable symbols. Then the behavioral model is proposed by employing a Hidden Markov Process. The mobile nodes with expected output change their behavior according to the transition probability matrix and emission probability matrix. Once the likelihood of the node being in each behavior state is obtained, the isolation problem is analyzed. The misbehaving nodes whose objective is to harm the packets are provided with a time-based opportunity (TTR) to reset themselves before permanent isolation. Next, the selfish nodes whose aim is to drop packets are immediately removed from the network, but prior to removal, nodes are checked to see whether they are genuinely selfish or are on the verge of the crashed state. If nodes drop packets due to minimal residual energy, they are not destroyed but are only removed from the routing function and are given TTL time to regain their lost energy. The adopted scheme helps to increase the survivability of the network. Finally, the analytical results are supported by simulation experiments. Besides, our work provides a deeper understanding of network performance evaluation in the presence of misbehaving nodes such as blackhole, greyhole, and sinkhole nodes.
Fig. 9 Accuracy and false alarm rate of the proposed solution (series: false-negative rate, false-positive rate, accuracy)
Fig. 10 Impact of TTR on survivability rate and overall trust of the network
Depending on the application under consideration, the IPv6 Routing Protocol for Low-Power and Lossy Networks (RPL) is advisable for multipoint-to-point traffic. In that direction, our future work is to include the 6LoWPAN and RPL protocols in the proposed HMT model, which can offer customized solutions to a wide range of IoT applications. The proposed model considers only two output symbols as observable states. In the future, the behavioral model can be extended by including more observable symbols, such as residual resource level and degree of connectivity. The model can be further strengthened by validating it against other attacks such as good-mouthing, bad-mouthing, and ballot-stuffing attacks.

Conflicts of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.