A Cognitive Knowledged Energy-Efficient Path Selection Using Centroid and Ant-Colony Optimized Hybrid Protocol for WSN-Assisted IoT

In a WSN-assisted IoT environment, the sensors are resource constrained: the energy, computing and storage resources of the sensors deployed in the sensing area are limited. A hybrid protocol named Energy Efficient Centroid-based Ant colony Optimization (EECAO) is proposed in this paper to improve the performance of the sensor network in a WSN-assisted IoT environment. This protocol uses centroid-based clustering to gather the information of local clusters and ant colony optimization to relay it to the base station. The energy level of the deployed cognitive sensors is the key parameter for defining the position of the centroid in this protocol. The proposed protocol has a new distributed cluster formation design which uses multiple clustering factors such as energy cost, channel consistency and cognitive sensor throughput to select cluster heads. The selection of the super cluster head is based on the energy centroid position for a defined coverage area. The path between the super cluster heads and the base station is optimized using an ant routing model. Our simulation results indicate that the proposed protocol performs better when benchmarked against the existing ETSP and EECRP protocols. It is also well suited to sensor networks that require a long lifetime when the base station is placed at the center, at the border or outside the network.


Introduction
Sensors are dedicated devices that can detect and respond to environmental changes. A wireless sensor network (WSN) is a group of such sensors used for monitoring and recording the physical conditions of the environment and organizing the sensed data. Sensors deployed in a WSN can cooperatively send their data to the base station (BS) [1]. Sensor devices connected to an IoT network can exchange data with users in real time. Sensors performing IoT operations are critical in providing operational efficiency at reduced cost. The base station acts as an interface between the users and the network. A user can retrieve the needed data from the network by sending a query and collecting its result from the base station through internet servers [2]. A simple WSN-assisted IoT environment is shown in Fig. 1. A WSN-assisted IoT network has the advantages of suitable deployment of network devices and good scalability at reduced cost. The major drawback of this network is its limited energy resource. In the rugged environments where sensors operate, it is not easy to provide a continuous energy source to charge them, so the sensor nodes are usually powered by batteries. Replacing the sensors is also not cost effective. Consequently, energy management has become a vital task in WSN-assisted IoT environments, and prolonging the network lifetime and balancing the energy consumption of the sensors inside the sensing area are the key aspects of improving WSN performance.
The concept of a cluster is introduced in sensor networks to simplify network management. The cluster head ($C_H$) is the coordinator of the cluster. The cluster head $C_H$ is responsible for cluster organization, routing table formation, and aggregating and transmitting information. Usually, the energy of the cluster head is dissipated very fast because of the load on it [3]. So, if the information from the cluster head to the base station (BS) is forwarded over multiple hops, the energy dissipation of the cluster head is reduced. In sensor networks, a large communication distance between the sensor nodes and the base station leads to higher energy consumption. If a cluster head selects a leaf node (a node which has no responsibility for aggregating and relaying the sensed data inside a cluster) to relay its information to the base station, its energy dissipation rate can be reduced significantly [4].
Aiming at a high packet delivery ratio and reduced energy consumption, a new hybrid protocol named Energy Efficient Centroid-based Ant colony Optimization (EECAO) is proposed to meet the user demands in WSN-assisted IoT environments. The protocol is termed "hybrid" as it combines two schemes, namely energy centroid based clustering among local clusters and an ANT optimization routing model between the BS and the super cluster heads.

Fig. 1 WSN-assisted IoT environment

Major Contributions
The key contributions of the proposed work in WSN-assisted IoT are as follows:
• A clustering algorithm that operates based on the centroid position of the defined coverage area ($C_R$) for energy management. Its key parameters include channel decision as the energy efficiency metric, channel consistency as the reliability metric and sensor throughput to define the super cluster head.
• An optimization algorithm to reduce the communication distance between the super cluster heads and the base station using ANT routing. Its key parameters include energy probability as the energy efficiency metric, channel consistency as the reliability metric and the routing distance to reach the destination (base station BS).
• A reduction of the average energy consumption of a sensor node without an impact on network lifetime, aiming at a high packet delivery ratio of the sensor network to meet the user demands in a WSN-assisted IoT environment.
In a recent work, a cross-layer variant of AODV (Ad hoc On-Demand Distance Vector routing protocol) is built by substituting the hop count metric with link quality and collision count. To make intelligent routing decisions, the authors' method collects link quality information from the physical layer and collision information from the MAC layer using the Z-score method [5].

Organization of the Paper
The rest of the paper is organized as follows. In Sect. 2, related work is discussed. The proposed EECAO protocol is elaborated in Sect. 3. In Sect. 4, the performance evaluation of the EECAO protocol in comparison with similar existing protocols is presented. Conclusions are drawn in Sect. 5, with future work included. References are provided at the end of the paper.

data under constraints of device capability and network limitation from perception layer to application layer. Finally, the application layer processes the information received from the network layer [6]. WSNs act as "cells" for collecting and distributing the data within IoT and enable the development of smart and context-aware applications. Also, by utilizing different types of power sources and maintaining them for a long time, these sensor network devices are real enablers of IoT in terms of metrics like lifetime, energy efficiency, low cost and interface to resources [7]. Aspects like throughput, cost of communication and energy consumption are considered key parameters in IoT-based WSNs [8].

Related Work on Location of Base Station in WSN-Assisted IoT
The ETSP protocol [9] (Efficient Tree-based Self-organizing Protocol) optimizes the path between the source node and the root node/base station. It shows that a base station placed at the center of the sensing field provides better results than a base station placed outside the sensing field. The ETSP results show that a hop-based sensor network may achieve a longer lifetime, but its throughput and packet delivery ratio (PDR) are low. A distance-reliable network may achieve good throughput and a better PDR, but its network formation time is high and its hop count is larger. A network that uses residual energy as the key performance metric does not provide better results, as its average hop count is quite high compared to the distance-reliable network. The weight of sink node $S_n$ in ETSP is given by Eq. (1).
Here, $W_{S_n}$ is the weight of sink node $S_n$, $C_m$ is the current count of child nodes connected to sink node $S_n$, $R_m$ is the leftover energy of sink node $S_n$, and $dist_m$ is the distance between the current node and sink node $S_n$. The hop of the root node is set to 0 by default. $HoP_m$ is the hop of sink node $S_n$. ρ, ϑ, σ and τ are the standardized parameters of the four variables $dist_m$, $C_m$, $R_m$ and $HoP_m$.
From the source sensor to the base station, the overhead added to the data is high. Since a multihop routing path is constructed between source and destination, a link failure triggers a long process to reset the path. Also, no optimal paths are constructed in the ETSP network routing process.

Related Work on Cognitive Sensors
To overcome the problem of spectrum scarcity in a WSN, the Cognitive Radio Sensor Network (CRSN) is used in the literature. Similar to WSNs, a CRSN has tiny and inexpensive sensors which operate on limited battery energy [10]. In a WSN, each sensor can send or receive information or remain in sleep mode. In a CRSN, however, there is an additional sensing mode in which the cognitive sensors sense the spectrum to find possible opportunities [11]. A cognitive radio (CR) is an intelligent wireless communication system which is aware of its surrounding environment. It can adapt its internal parameters to achieve reliable and energy-efficient communication. Using CR technology, unlicensed users can periodically monitor the spectrum for free channels. Integrating CR with wireless sensors can overcome current WSN challenges [12]. A CR sensor, along with its normal duty of sensing, has the capability to sense and share spectrum. It can predict the incumbents on a channel and the fairness in the distribution of spectrum [13]. Focusing on CRSN-IoT applications, an energy-efficient k-hop clustering scheme has been proposed recently to achieve bichannel connectivity without compromising the lifetime of the sensor network. Parameters like residual energy, spectrum awareness, channel primary user appearance probability, channel quality, robustness on primary user arrival and Euclidean distance between sensor nodes are taken into consideration to select the hop and the common channel among clusters [14].

Related Work on Energy Centroid Based Clustering Approach
EECRP [15,16] (Energy-Efficient Centroid-based Routing Protocol) is a clustering algorithm that operates based on the location of energy centroid and energy level of sensors. In the field of mathematics, the centroid is the center of weight, which is the imaginary point of mass concentration. The reasons for using the term "energy centroid" in the EECRP protocol are as follows.
(1) First, the weight of sensor nodes in the network is meaningless.
(2) Second, the weight centroid of the nodes of the entire cluster is meaningless because node location and weight do not change during the operation of the network.
(3) Finally, in the entire network the energy of the sensor node is the only factor that changes. The energy centroid can intuitively display the distribution of residual energy in the network.
So, the energy level and the location of each sensor node are taken into consideration to calculate the energy centroid. The centroid weight in EECRP is calculated using Eqs. (2) and (3), which give the weight centroid ($X_{WC}$, $Y_{WC}$) in the cartesian co-ordinate system, where $X_{WC}$ and $Y_{WC}$ are the co-ordinates of the weight centroid, from a mathematical perspective, for the EECRP network. Here $C_R$ is the cluster coverage area and $dw$ is the weight differential, which depends on the density of node weight. $dM_y$ and $dM_x$ are the static moment differentials about the y and x axes, respectively. The drawback of the EECRP protocol is its communication path from the candidate cluster head node to the base station. This protocol is designed to send the information of candidate cluster heads directly to the base station. The protocol proved energy efficient when the base station is placed at the center of the sensing area. But if the base station is moved to the border of, or outside, the rectangular sensing area, the energy of the candidate cluster head node depletes very fast, which reduces the network lifetime, as shown in Sect. 4.
An efficient two-fold clustering algorithm for WSNs in IoT, named SFC [17], has been proposed recently. Here the cluster head selection depends on the leftover energy and the degree of connectivity of the zone to ensure uniform energy distribution among the deployed sensor nodes. Nodes isolated from connectivity are clumped together. CH selection using a Genetic Algorithm in a WSN-assisted IoT, incorporating parameters like node density, energy and distance in the fitness function, helps in optimizing the intra-cluster distance, organizing the use of sensor energy inside a cluster, reducing hops and promoting the most capable sensors as cluster heads [18].
An adaptive fuzzy-based energy efficient opportunistic routing protocol (ARFOR) [19] is proposed for sustainable IoT applications based on fuzzy parameters like residual energy, distance and threshold. This protocol has a parent node, which acts as a cluster head to aggregate the packets to the DODAG (Destination Oriented Directed Acyclic Graph) root, and a volunteer node, which acts as a relay to forward the packets to the parent node, with energy threshold limits to increase the lifetime of the network during a data transfer period. A new clustering approach, the Stable Election Algorithm (SEA) [20], is created to minimize the message exchange between the sensor nodes and to reduce the frequent rotation of CHs. This protocol uses a location determination algorithm to find the best position for a sink node to collect the data from the cluster heads, prolonging the lifetime of sensor networks assisting an IoT application. In another article, the authors introduced a DRL (Deep Reinforcement Learning) [21] routing scheme for WSN-assisted IoTN to reduce the network delay significantly and to increase the network lifetime. This scheme partitions the entire sensing field into non-uniform clusters depending on the data load of the sensors to prevent premature shutdown of the network.

Related Work on ANT Routing Approach
The aim of the artificial Ant routing approach in networks is to form a stable optimal path between the source sensors and the base station in sparse sensing environments. The basic Ant Colony Routing (ACR) [22,23] algorithm uses ant behavior in simulating control packets. Ants are simulated as query packets which create valid paths to the base station. The algorithm assumes that the created WSN has only one destination, known as the base station. It classifies ant behavior into two types, namely forward ants and backward ants. Forward ants walk from the source node to the destination, finding new routes and aggregating data. Backward ants, which return from the destination to the source sensor, update their information in the intermediate sensors as they walk (hop). They do so by releasing pheromone on the route they travel to create a path. So, the pheromone value depicts the starting point of the ants at the source sensor and the path they take to reach the destination (prey) along the multi-hop forwarding route. Ant-colony Optimized Self-Organizing Tree-Based (AOSTEB) [24] is an energy balance algorithm proposed for WSNs to discover an efficient route during intra-cluster communication. ANT-Energy Saving Routing (A-ESR) [25], which exploits the basic ACR (Ant-Colony Routing) method, is created to reduce the number of fully loaded active links of a node by forcing the user traffic onto lightly loaded links. Energy management and channel decision probability play a major role in obtaining a stable path using the Ant-Colony optimization algorithm [26]. An improved Ant colony optimization (IACO) [27] algorithm updates the pheromone concentration of the gradient field by broadcasting messages and exchanging hops between neighbor nodes in a sensor network. Here routing is carried out by considering metrics like shortest path and high residual energy. An improved pheromone update scheme explains that the shortest path with fewer hops gets a larger pheromone increment. A path with high average energy attracts more data flow, and the nodes closer to the sink obtain more pheromone. If the weak node on a path has more energy, then the traffic is routed over that path [28].
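The forward/backward ant behavior described above can be illustrated with a minimal sketch of the classic ant colony rules. It is not taken from any of the cited protocols: the function names, the exponents `alpha` and `beta`, the evaporation rate and the deposit rule (inversely proportional to hop count) are all assumptions chosen for illustration.

```python
def next_hop_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Classic ACO rule: p_j is proportional to pheromone_j**alpha * heuristic_j**beta.

    pheromone and heuristic map each neighbor id to a positive value
    (the heuristic could be, for example, residual energy or 1/distance).
    """
    weights = {j: (pheromone[j] ** alpha) * (heuristic[j] ** beta)
               for j in pheromone}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def update_pheromone(pheromone, path, hop_count, evaporation=0.1, deposit=1.0):
    """Evaporate pheromone on every link, then let a backward ant reinforce
    the links of its path; a path with fewer hops gets a larger increment."""
    updated = {link: (1.0 - evaporation) * tau for link, tau in pheromone.items()}
    for link in path:
        updated[link] = updated.get(link, 0.0) + deposit / hop_count
    return updated
```

With equal pheromone on two neighbors, the neighbor with the larger heuristic value is chosen more often, and repeated backward ants gradually concentrate pheromone on short paths.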

System Model
An Energy Efficient Centroid-based Ant colony Optimization (EECAO) network is built in a hierarchical architecture. The network consists of the set of cognitive sensors, the base station BS, internet servers $I_S$, access points $A_p$ and user agents/devices $U_A$. In this network, the data sensed by the sensors is sent to the internet server $I_S$ through the base station BS when a user agent $U_A$ requests an operation/access. The EECAO protocol parameters and their notations are shown in Table 1.

Assumptions
Sensor nodes with cognitive abilities are deployed randomly inside a two-dimensional sensing field, and the BS is placed at a fixed co-ordinate for the entire simulation scenario. Here, cognitive knowledged sensors are used to provide real-time sensed information to the user agent. The location information of each sensor node is preloaded into it at the time of deployment. The BS is considered to be active all the time. The sensor nodes deployed inside the sensing field are classified into networking nodes and non-networking nodes. Networking nodes are the sensor nodes whose residual energy is above the set threshold energy $E_{th}$ and which participate in the data sensing and transfer process. Non-networking nodes are the sensor nodes whose residual energy is less than the set threshold energy and which cannot participate in the data sensing and transfer process. In the EECAO approach, the number of sensor nodes eligible to become cluster heads and super cluster heads is the count of networking nodes inside the sensing field. Once the deployment of the sensor nodes is completed, it is assumed that their locations will not be altered. The sensor node battery cannot be charged or replaced. Every sensor node has a unique identification number (sensor ID) and a certain computing and storage capability. The radio channel is considered symmetric. The MAC layer conditions are ideal: sensed information is considered to be transmitted perfectly, without any collision or interference in the wireless medium. A two-dimensional cartesian co-ordinate system is created for the sensing area with its origin (0,0) located at the lower-left corner. It is also assumed that every sensor node inside the sensing field is preloaded with the location information of the BS. So, every sensor node knows the position of the BS and its own residual energy at any time.
In the proposed EECAO algorithm, the sensor network is built with a few other key assumptions:
• Initially all the sensors are equipped with equal energy levels and each sensor is assumed to have ample energy for communication with other sensors, i.e., all deployed sensor nodes are isomorphic. Sensor nodes can act as transceivers, able to transmit and receive information at the same time.
• The sink node or base station is unique and is placed at the center, at the border or outside of the sensing field.
• The sensor node power is limited and the information needs to be forwarded over multiple hops to reach the BS.
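The split into networking and non-networking nodes described in the assumptions can be sketched as follows. This is only an illustration: the function name and the dictionary layout are hypothetical, and a node whose energy equals the threshold exactly is treated as non-networking here, a case the text does not specify.

```python
def split_nodes(residual_energy, e_th):
    """Split sensor IDs by the energy threshold e_th.

    residual_energy maps sensor ID -> residual energy in joules.
    Nodes strictly above e_th are networking nodes; all others are
    non-networking (the equality case is an assumption of this sketch).
    """
    networking = [sid for sid, e in residual_energy.items() if e > e_th]
    non_networking = [sid for sid, e in residual_energy.items() if e <= e_th]
    return networking, non_networking
```

The length of the first list is the number of nodes eligible to become cluster heads or super cluster heads.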

Energy Model
Energy consumption in a sensor network is mainly due to computation and communication. Communication consumes more energy than computation. So, the most efficient way to reduce the network energy consumption is to reduce the energy consumed by communication. To analyze the energy consumption due to communication, we adopt a simplified network energy consumption model as shown in Fig. 2. In this model, an $l$-bit packet is transmitted over a communication distance $d$.
The energy consumption of the transmitter is the sum of the energy consumed by the transmitter circuit and by the power amplifier circuit, calculated as shown in Eq. (4)

$$E_{Tx}(l, d) = l\,E_{el} + l\,\varepsilon_{amp}\,d^{n} \tag{4}$$

where $E_{el}$ is the energy dissipated per bit to run the transmitter or receiver circuit (in nJ/bit), $\varepsilon_{amp}$ is the power amplification coefficient and $n$ is the path decline index. The value of $n$ depends on the communication distance $d$ and the critical distance $d_c$, which is a constant.
When $d < d_c$, the sensor node uses the free space model. This model assumes an ideal propagation condition with a line-of-sight path between the transmitting node and the receiving node. It represents the coverage range around the transmitting node: the receiving node can receive the information from the transmitting node if it is within this range. This model can be used efficiently for short-distance communication. Usually, the communication distance between the sensor nodes and the cluster head ($C_H$) is small, so energy dissipation follows the free space model in this case. Here $\varepsilon_{amp} = \varepsilon_{fsm}$ and $n = 2$, and the energy dissipated at the transmitter is calculated as shown in Eq. (5)

$$E_{Tx}(l, d) = l\,E_{el} + l\,\varepsilon_{fsm}\,d^{2} \tag{5}$$

When $d \ge d_c$, the sensor node uses the multipath fading model with $\varepsilon_{amp} = \varepsilon_{mpf}$ and $n = 4$, and the energy dissipated at the transmitter is

$$E_{Tx}(l, d) = l\,E_{el} + l\,\varepsilon_{mpf}\,d^{4}$$

The multipath fading model is used when the base station BS is far from the sensor nodes. To receive the information from the transmitter, the energy dissipated by the receiving radio is

$$E_{Rx}(l) = l\,E_{el}$$

The energy dissipated in the data fusion process is expressed in terms of $E_{DA}$, the energy dissipated in aggregating one bit of packet data.
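The transmit and receive costs described above can be sketched in a few lines, assuming the usual first-order radio model (circuit energy plus an amplifier term in $d^2$ or $d^4$). The numeric constants below are commonly used illustrative values, not parameters taken from this paper, and defining the critical distance as the point where the two amplifier terms coincide is likewise an assumption.

```python
import math

# Illustrative constants (assumed, not from the paper).
E_EL = 50e-9          # J/bit, energy to run the transmitter or receiver circuit
EPS_FSM = 10e-12      # J/bit/m^2, free space amplifier coefficient
EPS_MPF = 0.0013e-12  # J/bit/m^4, multipath fading amplifier coefficient

# Assumed critical distance: where EPS_FSM * d^2 equals EPS_MPF * d^4.
D_C = math.sqrt(EPS_FSM / EPS_MPF)


def tx_energy(l_bits, d):
    """Energy to transmit l_bits over distance d (metres)."""
    if d < D_C:
        # Free space model, path decline index n = 2.
        return l_bits * E_EL + l_bits * EPS_FSM * d ** 2
    # Multipath fading model, path decline index n = 4.
    return l_bits * E_EL + l_bits * EPS_MPF * d ** 4


def rx_energy(l_bits):
    """Energy dissipated by the receiving radio for l_bits."""
    return l_bits * E_EL
```

With these assumed values `D_C` is a little under 88 m, so a 4000-bit packet sent over 10 m falls under the free space branch, while a packet sent to a distant base station falls under the multipath branch and costs far more per bit.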

EECAO Algorithm Scheme
The proposed EECAO algorithm is explained in detail in six phases. Phase 1 is the network initialization phase. Phase 2 explains the selection of the initial cluster head $C_H$. Channel selection parameters for the cognitive sensors are given in phase 3. Phase 4 details the selection of the cluster head $C_H$ and its rotation. Super cluster head $S_{CH}$ selection and its rotation using centroid based routing are explained in phase 5. Optimization of the path from the super cluster head $S_{CH}$ to the base station BS using the ANT routing model is illustrated with a routing diagram in phase 6.

Network Initialization
Initially, a LOCATION packet is sent to the BS by the sensor nodes deployed inside the sensing area of 300 sq.m. This packet carries the location information of the sensor nodes inside the sensing field. The packet format includes 5 fields. The first field tells the base station that the packet contains the location information of a sensor node. As every sensor node is deployed with its unique identification number (hereafter the sensor ID), the second field is the sender's sensor ID. The third and fourth fields represent the X-coordinate and Y-coordinate of the sensor node inside the sensing area. The fifth field is the residual energy of the sensor node at the time of packet initiation.
Once the base station (BS) receives this packet from the sensor nodes inside the sensing field, it calculates the distance between itself and each sensor node. Let $D_{BS-SN}$ be the Euclidean distance between the base station and a sensor node. If the cartesian co-ordinates of the base station are ($X_{BS}$, $Y_{BS}$) and those of a sensor are ($X_S$, $Y_S$), then

$$D_{BS-SN} = \sqrt{(X_{BS} - X_S)^2 + (Y_{BS} - Y_S)^2}$$

The base station forms the clusters initially. The node table is updated by the base station with the location information and energy level of each node. The base station then responds with an acknowledgement (ACK) packet to the sensor nodes inside each of the formed clusters separately. This acknowledgement packet has four fields. The first field informs the sensor nodes inside each cluster that this is an acknowledgement message. The second field gives the maximum coverage range of the formed cluster, i.e., the maximum distance between the sensor nodes inside the particular cluster and the base station. The third field shares the initial cluster head ID with the sensor nodes inside that cluster. The initial cluster head is selected by the base station itself depending on the distance $D_{BS-SN}$ between the sensor nodes and the base station and on the energy level of the sensor nodes. The fourth field shares the average energy information of the network with the sensor nodes inside a formed cluster. After this mutual exchange of information between the sensor nodes and the base station, the network initialization is completed. Moreover, the routing table information is updated in real time for better network performance.
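The initialization exchange described above can be illustrated with a small sketch of the LOCATION packet and the distance computation at the base station. The class and function names are hypothetical, and the rule used to pick the initial cluster head (highest residual energy, then shortest distance to the BS) is an assumption: the text only states that both quantities are taken into account.

```python
import math
from dataclasses import dataclass


@dataclass
class LocationPacket:
    packet_type: str        # field 1: marks the packet as location information
    sensor_id: int          # field 2: sender's sensor ID
    x: float                # field 3: X-coordinate inside the sensing area
    y: float                # field 4: Y-coordinate inside the sensing area
    residual_energy: float  # field 5: energy at packet initiation (joules)


def distance_to_bs(pkt, bs_xy):
    """Euclidean distance between the base station and the sending node."""
    return math.hypot(bs_xy[0] - pkt.x, bs_xy[1] - pkt.y)


def pick_initial_cluster_head(cluster, bs_xy):
    """Assumed rule: prefer higher energy, break ties by shorter distance."""
    best = min(cluster,
               key=lambda p: (-p.residual_energy, distance_to_bs(p, bs_xy)))
    return best.sensor_id
```

Since all nodes start with equal energy, under this assumed rule the first selection effectively reduces to choosing the node closest to the base station.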

Initial Cluster Head selection
As mentioned in Sect. 3.2.1, the initial cluster head is determined by the base station itself. The initial round of cluster head ($C_H$) selection is random, as the initial energy level of the sensor nodes is identical. In the initial cluster head selection round, every sensor inside a cluster checks whether its own sensor ID matches the third field of the acknowledgement packet sent by the base station. If the field ID matches its sensor ID, the node prepares itself to fulfill its responsibility as the cluster head. The main responsibility of the cluster head is to aggregate the information from the sensor nodes inside its cluster and to find a path to relay this information to the base station (BS). If the field ID and the sensor ID do not match, the sensor node plays an energy saving role by receiving the information but not transmitting it.

Channel Selection Using Sensors Cognitive Knowledge
Cognitive radio enabled sensor nodes can sense free channels (spectrum sensing) and change the transmission parameters accordingly (spectrum decision). Sensing the channel requires proper channel selection among the multiple channels inside the sensor network [29]. In the proposed approach, depending on the special requirements of the nodes, channels can be assigned to some nodes on a fixed basis; otherwise the routing algorithm decides on the selection of channels.
As the sensors are deployed with cognitive radio knowledge, each sensor can sense a free channel to avoid channel scarcity and channel contention. The cognitive capability of the sensors also helps them to find the qualified channel $C_Q$ and the channel consistency $C_C$. The primary parameter in channel selection is the channel consistency $C_C$, which is calculated from three independent parameters, namely (1) the channel congestion probability $P_{Cc}$, (2) the channel busy probability $P_{Cb}$ and (3) the channel packet dropping probability $P_{Cpd}$. Assume that the source node receives $n$ ECN (Explicit Congestion Notification) feedbacks. The channel congestion probability at the current time $t$ is then calculated from these feedbacks. The weights of the CE bits in the different ACK packets are assigned using an exponentially weighted moving average method. The channel busy probability is the probability that the sensed channel is busy, i.e., there exists at least one node transmitting any type of packet within the carrier sensing range $R_{cs}$. For each access class $i \in \{p, s\}$, an external transmission probability, i.e., the probability of a transmission attempt for access class $i$, and an internal transmission probability are defined. Here $p$ denotes the packet transmitted and $s$ denotes wireless spectrum access (WSA).
The transmission probability of class $i$ is expressed using $P_{p,vc} = 0$ and $P_{s,vc}$, which equals the class-$p$ transmission probability. So, the total transmission probability of the channel is the sum of the class-$p$ and class-$s$ transmission probabilities. Thus, $P_{Cb}$ is calculated as shown in Eq. (11).
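The exponentially weighted moving average mentioned above for weighting the CE bits of successive acknowledgements can be sketched as follows. This is a generic EWMA, not the paper's congestion probability equation: the function name, the smoothing weight and the initial estimate are assumptions.

```python
def ewma_congestion(ce_bits, weight=0.25, initial=0.0):
    """Smooth a sequence of CE marks (1 = congestion experienced, 0 = not).

    Newer feedback gets weight `weight`; the influence of older feedback
    decays geometrically. The result lies in [0, 1] and can be read as a
    running congestion estimate.
    """
    estimate = initial
    for bit in ce_bits:
        estimate = (1.0 - weight) * estimate + weight * bit
    return estimate
```

A larger `weight` makes the estimate react faster to recent marks, at the cost of more jitter.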
Packet dropping means the failure $f$ of a transmitted packet to reach its destination. Packet dropping may occur due to sensor queue limitations, channel inconsistency and heavy network traffic. Heavy traffic is generated in a channel by an increase in the number of packets received from neighboring sensor nodes and in the data observed by the sensors from environmental changes. The probability that a WSA packet is dropped because the retransmission limit $m$ is exceeded is calculated as shown in Eq. (12). Table 2 contains the channel consistency parameters and their values collected from sensor ID 29 when the base station is placed at the center co-ordinates with 100 nodes inside the sensing field.
The channel consistency $C_C$ is computed as shown in Eq. (13). The $C_C$ computation provides the intelligence to choose the best forwarding channel. If the channel consistency is high, there is a high probability of successful data transmission. For example, as displayed in Table 2 (only 2 channels are shown for sensor ID 29 as a sample, although 5 channels are used in the setup), if sensor ID 29 selects channel 2 to transmit its data to its destination, there is a high probability of successful data transmission.
The sensor node with cognitive knowledge operates from the MAC layer to select the primary channel, and the sensor path selection is based on three parameters: (1) the distance between the sensor node and the base station $D_{BS-SN}$ in meters, (2) the consistency $C_C$ of the selected channel and (3) the sensor throughput $S_{Th}$ in bytes/sec. Sensor node throughput relies on the packet receiving capability of the neighbor nodes, the packet arrival rate $P_R$ and the energy level $E_{Le}$ of the sensors after each transmission.
Sample path selection parameter values are shown in Table 3. These results are collected from random sensors when the base station is placed at border co-ordinates with 100 sensor nodes deployed inside the defined sensing area. Each node maintains a neighbor list $N_L$, and all the path selection parameter values are updated and saved by each node after each round. Fig. 3 displays the behavior of the random-access CR-MAC protocol. The main principle used by the CR-MAC protocols in this category is carrier sense multiple access with collision avoidance (CSMA/CA). Each CR node contends for the medium to exchange control information and then switches to a common channel in the FCL (Free Channel List) for subsequent data transmission. No time synchronization among CR nodes is required in this category, but the control channel always suffers from starvation. Each node senses the carrier before transmission. If the channel is sensed idle, the CR node that wants to transmit packets sends an RTS message on the common control channel (CCC). If the corresponding CTS message is received successfully, both the sender and the receiver switch to the data channel that was found to be common during the initial RTS/CTS dialogue. Data packets can then be transmitted on the data channel, followed by an acknowledgement (ACK) message. The energy loss of the sensor depends on the energy cost for the number of environmental observations by the sensor $E_{os}$, the number of transmissions $T_N$ and the reception packet count $R_{pc}$. Let $E_{Tp}$ and $E_{Rp}$ denote the packet transmission energy cost and the reception energy cost of the sensor, respectively. The energy utilization for the packet transceiving process may vary depending on the communication channel chosen. The estimated energy cost of the sensor $E_{cost}$ is computed as shown in Eq. (14).

Fig. 3 The behavior of random-access CR-MAC protocol
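The RTS/CTS dialogue described above ends with both nodes agreeing on a data channel that appears in both free channel lists. A minimal sketch of that agreement step is given below; the function name is hypothetical, and preferring the sender's ordering of channels is an assumption of this sketch.

```python
def negotiate_data_channel(sender_fcl, receiver_fcl):
    """Return the first channel in the sender's free channel list (FCL)
    that the receiver also reports as free, or None if there is none."""
    receiver_free = set(receiver_fcl)
    for channel in sender_fcl:
        if channel in receiver_free:
            return channel
    return None
```

When `None` is returned, no data transfer can follow the RTS/CTS exchange and the sender has to contend for the control channel again later.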
When the BS is placed at border co-ordinates with 100 sensor nodes inside the sensing field, the energy cost of sensor ID 18 is displayed in Table 4 together with its energy probability. The energy probability $E_p$ is the probability that the energy of the sensor node does not fall below the set energy threshold $E_{th}$.
Energy saving is one of the major concerns in sensor communication. The sensors sense environmental changes, convert them into data packets and transmit the packet information to the cluster head $C_H$. The qualified channel $C_Q$ of a cognitive sensor depends on the channel decision value $C_D$ and the channel consistency factor $C_C$. The channel decision value $C_D$ is calculated as shown in Eq. (15), where $E_{DA}$ is the energy dissipated in aggregating one bit of packet data and $E_{OH}$ is the energy cost of transmitting the channel overhead.
The qualified channel $C_Q$ of the cognitive sensor is the channel with a low $C_D$ value and a high $C_C$ value. As the channel bandwidth is partitioned into multiple channel sub-frequencies as shown in Fig. 3, a channel that is not used by a sensor node is added to that node's list of free channels.
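The choice of a qualified channel from its $C_D$ and $C_C$ values can be sketched as below. The text asks for a low $C_D$ and a high $C_C$ but does not say how the two are combined, so ranking channels by the ratio $C_C / C_D$ is purely an assumption of this sketch, as are the function name and the input layout.

```python
def qualified_channel(channels):
    """Pick the channel with the highest C_C / C_D ratio.

    channels maps channel id -> (c_d, c_c), with c_d > 0.
    A low channel decision value and a high channel consistency
    both increase the ratio.
    """
    return max(channels, key=lambda ch: channels[ch][1] / channels[ch][0])
```

For instance, a channel with $C_D = 0.2$ and $C_C = 0.9$ is preferred over one with $C_D = 0.4$ and $C_C = 0.6$ under this rule.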

Cluster Head Selection and its Rotation Phase
To communicate with the base station effectively, the sensing area is divided into coverage regions, each with coverage range $C_R$. The sensor network is divided into coverage regions in order to obtain the position of the centroid, which is calculated from the energy levels of the cluster heads and used for selecting the super cluster head. When the sensors are deployed inside the sensing area, each sensor node falls under a certain coverage region as shown in Fig. 4. At the end of the network initialization process, the base station has the information of the coverage regions based on its location co-ordinates. The base station then forms the clusters inside each coverage region. The initial cluster head of each cluster is chosen by the base station itself. The primary criterion for selecting the cluster head $C_H$ is its energy level $E_{Le}$, which has to satisfy the condition $E_{Le} \ge 2E_{th}$. The other selection criteria include three metrics, namely (1) energy, (2) reliability and (3) throughput, as attributes of the communication between the sensor nodes. The energy metric in the simulation is the channel decision value $C_D$ calculated from Eq. (15), the reliability metric is the channel consistency value $C_C$ computed from Eq. (13), and the throughput is the sensor throughput $S_{Th}$ calculated for every sensor node. The rank used to select a sensor node as cluster head depends on the parameters ($p_1$, $p_2$, $p_3$) shown in Table 5. To obtain a single decision from the assorted decisions, the cluster head selection parameter values are organized in a 3 × 3 matrix as shown in Table 6, where $p$ is the parameter attribute. Table 7 shows the matrix output for sensor ID 45, computed when the base station is placed at border co-ordinates with 100 sensors inside the sensing area.
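The eligibility condition and the 3 × 3 parameter matrix described above can be illustrated with a short sketch. The energy test follows the stated condition $E_{Le} \ge 2E_{th}$; the matrix layout, in which each entry is the ratio of one parameter to another, is an assumption about how the three parameters are arranged, and the function names are hypothetical.

```python
def is_ch_candidate(e_le, e_th):
    """Primary criterion: residual energy at least twice the threshold."""
    return e_le >= 2 * e_th


def ratio_matrix(c_d, c_c, s_th):
    """Assumed layout: entry [row][col] = p[col] / p[row]
    for p = (C_D, C_C, S_Th); the diagonal is therefore all ones."""
    p = (c_d, c_c, s_th)
    return [[p[col] / p[row] for col in range(3)] for row in range(3)]
```

For example, `ratio_matrix(2.0, 4.0, 8.0)` has first row `[1.0, 2.0, 4.0]`, i.e. every parameter divided by $C_D$.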

Energy cost of sensor
To rank the sensor node using the Eigen factor, the 2 × 2 Eigen matrix with Eigen values Ei V collected from the Eigen list has to be prepared. The Eigen list is formed as shown in Table 8. The two roots R 1 and R 2 (each expressed over the denominator 2A) are then obtained from this matrix, and the Eigen factor is max ( R 1 , R 2 ). The Eigen factor values of the sensors inside a cluster are sorted in descending order by the initial cluster head formed by the base station, which then ranks the sensors based on their Eigen factor values. The higher the Eigen factor value of a sensor, the higher the possibility of that sensor becoming the cluster head C H . Once the energy level of the initial cluster head falls to E Le < E 0 ∕3, the sensor node with the best rank is triggered to become the cluster head for the next round of data transmission. From the next round onward, while the network is running, the sensor node with the maximum Eigen factor is automatically triggered to become the cluster head C H .
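The ranking step can be illustrated with a minimal sketch. It assumes that R 1 and R 2 are the two roots of the characteristic equation of the 2 × 2 Eigen matrix and that these roots are real. The function names and the example matrices are hypothetical, since the construction of the matrix from the Eigen list in Table 8 is not shown here.

```python
import math


def eigen_factor(matrix):
    """Return max(R1, R2), the larger root of the characteristic equation
    of a 2x2 matrix [[a, b], [c, d]] (assumed to have real roots)."""
    (a, b), (c, d) = matrix
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("complex roots; this sketch assumes real roots")
    root = math.sqrt(disc)
    r1 = (trace + root) / 2.0
    r2 = (trace - root) / 2.0
    return max(r1, r2)


def rank_sensors(matrices):
    """Sort sensor IDs by descending Eigen factor (best rank first)."""
    factors = {sid: eigen_factor(m) for sid, m in matrices.items()}
    return sorted(factors, key=factors.get, reverse=True)


def next_cluster_head(head_energy, e0, ranked_ids):
    """Return the best-ranked sensor once the current head's energy
    drops below E_0 / 3; otherwise return None (no rotation yet)."""
    if head_energy < e0 / 3.0 and ranked_ids:
        return ranked_ids[0]
    return None


# Hypothetical matrices, for illustration only (not values from Table 7).
example = {
    45: [[2.0, 0.0], [0.0, 3.0]],
    12: [[1.0, 1.0], [1.0, 1.0]],
    7: [[0.5, 0.0], [0.0, 0.25]],
}
ranking = rank_sensors(example)
```

With these illustrative matrices, the sensor with the largest root is ranked first and would take over as cluster head once the energy of the current head falls below E 0 ∕3.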

Super Cluster Head Selection and its Rotation Phase Using Centroid Based Routing
Inside the coverage range C R of the coverage region, cluster heads are selected for each cluster. One among them then has to be elected as the Super Cluster Head S CH to carry the information of all sensor nodes inside the coverage region to the base station (BS). If the Euclidean distance between the selected S CH and the base station, i.e., D BS−S CH , is less than 1.5 d c , then the S CH forwards its aggregated data directly to the BS.
Selection of the Super Cluster Head S CH for the coverage range C R depends mainly on two parameters: (1) the Euclidean distance between the C H and the energy centroid, and (2) the residual energy level of the C H . The number of super cluster heads formed inside the sensing area equals the number of coverage regions created in the sensing field.
Let n be the number of cluster heads formed inside the coverage range C R and C H i be the cluster head of a cluster, where i = 1, 2, …, n. Let ( X C H i , Y C H i ) be the locational co-ordinates of the cluster head C H i , E 0 be the initial energy of the sensor ID when it is deployed inside the sensing area and E Le (i) be the residual energy of C H i . Since all the sensor nodes inside the sensing area are assumed to be deployed with equal initial energies, E 0 will be the same for all the C H i . The energy centroid co-ordinates are denoted ( X ce , Y ce ) and their calculation is shown in Eqs. (16) and (17).
Once the centroid co-ordinates ( X ce , Y ce ) are updated for the coverage range C R , every cluster head C H i finds its Euclidean distance to the centroid. The C H with the smallest Euclidean distance to the centroid becomes the Super Cluster Head S CH for that coverage region and takes the responsibility of transmitting the information in that C R to the BS. Once the energy level of the super cluster head in a coverage region falls below the energy level of any of the cluster heads, the centroid algorithm is retriggered to select the best of the C H i as the S CH . Figure 4 shows an illustration of the simple routing process adopted inside the sensing area.
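The selection of the super cluster head can be sketched as follows. This is a minimal illustration, not the exact form of Eqs. (16) and (17): it assumes the energy centroid is the mean of the cluster head positions weighted by their residual energies, in which case the common factor E 0 cancels. The function names and sample co-ordinates are hypothetical.

```python
import math


def energy_centroid(heads):
    """heads: list of (x, y, residual_energy) tuples, one per cluster head.
    Returns (x_ce, y_ce) as an energy-weighted mean (assumed form)."""
    total = sum(e for _, _, e in heads)
    if total <= 0:
        raise ValueError("no residual energy left among the cluster heads")
    x_ce = sum(x * e for x, _, e in heads) / total
    y_ce = sum(y * e for _, y, e in heads) / total
    return x_ce, y_ce


def select_super_cluster_head(heads):
    """Return the index of the cluster head closest to the energy centroid."""
    cx, cy = energy_centroid(heads)
    return min(
        range(len(heads)),
        key=lambda i: math.hypot(heads[i][0] - cx, heads[i][1] - cy),
    )


def sends_directly(sch_xy, bs_xy, d_c):
    """True if D_BS-SCH < 1.5 * d_c, i.e. the S_CH forwards directly to the BS;
    otherwise the ant routing model is used."""
    dist = math.hypot(sch_xy[0] - bs_xy[0], sch_xy[1] - bs_xy[1])
    return dist < 1.5 * d_c


# Hypothetical cluster heads: (x, y, residual energy in joules).
heads = [(0.0, 0.0, 1.0), (10.0, 0.0, 1.0), (4.0, 0.0, 2.0)]
centroid = energy_centroid(heads)
sch_index = select_super_cluster_head(heads)
```

In this example the cluster head holding the most residual energy pulls the centroid toward itself and is therefore the closest to it, which is the behaviour the centroid rule is meant to encourage.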

Path Optimization Between the Super Cluster Heads and the Base Station Using ANT Routing Model
ANT routing optimization in the proposed work is adopted by the super cluster head S CH when the Euclidean distance between the selected super cluster head and the base station is high, i.e., D BS−S CH ≥ 1.5d c . A simple ant routing model adopted for the proposed work is shown in Fig. 5.
In this case the information aggregated at certain super cluster heads needs to be forwarded over multiple hops to reach the base station BS. Initially the source node (Super Cluster Head S CH ) broadcasts the query packet, simulated as the forwarding ant, to gather the foremost pheromone value. The pheromone value in the proposed work takes the energy, reliability and distance metrics into consideration, so it depends on three attributes: (1) the energy probability E p , which is the probability that the energy of the sensor ID (the intermediate node to be chosen by S CH to reach the BS) does not go below E th ; (2) the sensor channel consistency C C calculated through Eq. (13); and (3) the routing distance probability R Dist of the sensor to the BS. The higher the E p , C C and R Dist of a sensor node, the greater the chance of that sensor ID being selected as an intermediate node.
The energy probability E p is calculated from E cost , the energy cost of the sensor obtained from Eq. (14), and E Le , the current energy level of the sensor. The routing distance probability R Dist is calculated from D BS−SN and D BS−S CH , the Euclidean distances from the base station to the selected intermediate node and to the super cluster head respectively. The pheromone value is then computed from E p , C C and R Dist , each weighted by its own control parameter. The higher the pheromone concentration value, the greater the chance that the path is selected by S CH to route its packet to the BS.
The packet transfer probability from node i to another node j for packet l at time t is then calculated.
Consider a sensing area of 300 sq. m with the base station placed at (150,325) border coordinates and 75 nodes randomly deployed inside the sensing area. In the EECAO network routing process shown in Fig. 6, the entire sensing field is divided into four coverage regions C Rn , where n = 1, 2, 3, 4. Suppose that sensor ID 35 wants to send its information to the base station. The information is sent to the cluster head C H ID 33, which retransmits the data to the super cluster head S CH ID 26 of the coverage range C R2 . S CH ID 26 transmits the data to the base station directly, since the distance between the BS and the S CH is < 1.5d c . Now consider sensor ID 71, which transmits its information to cluster head C H ID 70, from which it is retransmitted to S CH ID 65 of coverage range C R4 .
Here S CH ID 65 is far from the base station, i.e., the distance from S CH ID 65 to the BS is ≥ 1.5d c , and if it tries to transmit information directly to the base station BS, it may soon deplete its energy and die. So S CH ID 65 opts for ANT routing, and at the end of the N iterations of the simulation algorithm, the intermediate nodes to the base station for time t are formed. It can then transmit the information to the base station through sensor IDs 18, 22 and 24 in turn, so ANT routing path 71 is formed to transfer the information of source sensor ID 71 to the base station via S CH ID 65, as shown in Fig. 6. Similarly, ANT routing path 58 is formed to relay the information of sensor ID 58 to the base station via S CH ID 52 with the help of intermediate sensor IDs 41, 49, 50 and 46 in turn.
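The pheromone-based choice of an intermediate node described in this section can be sketched as below. The exact pheromone and transfer probability expressions are not reproduced in the text, so this sketch makes two loudly stated assumptions: the pheromone is the product of E p , C C and R Dist , each raised to its control parameter (named alpha, beta and gamma here, which are hypothetical names), and the transfer probability of a candidate is its pheromone divided by the sum over all candidates. The node IDs and attribute values are illustrative only.

```python
def pheromone(e_p, c_c, r_dist, alpha=1.0, beta=1.0, gamma=1.0):
    """Assumed pheromone form: E_p^alpha * C_C^beta * R_Dist^gamma."""
    return (e_p ** alpha) * (c_c ** beta) * (r_dist ** gamma)


def transfer_probabilities(candidates, **weights):
    """candidates: {node_id: (e_p, c_c, r_dist)}.
    Returns {node_id: probability}, normalised over all candidates
    (assumed form of the packet transfer probability)."""
    tau = {nid: pheromone(*attrs, **weights) for nid, attrs in candidates.items()}
    total = sum(tau.values())
    if total == 0:
        raise ValueError("all candidates have zero pheromone")
    return {nid: t / total for nid, t in tau.items()}


def best_next_hop(candidates, **weights):
    """Pick the candidate with the highest transfer probability."""
    probs = transfer_probabilities(candidates, **weights)
    return max(probs, key=probs.get)


# Hypothetical neighbours of a super cluster head: (E_p, C_C, R_Dist).
candidates = {
    1: (0.9, 0.8, 0.5),
    2: (0.5, 0.5, 0.5),
    3: (0.2, 0.9, 0.1),
}
probs = transfer_probabilities(candidates)
next_hop = best_next_hop(candidates)
```

Repeating this choice hop by hop until a node lies within direct range of the BS would yield a multihop path of the kind shown in Fig. 6. A full ant colony implementation would also sample the next hop probabilistically and update the pheromone over iterations, which this sketch leaves out.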

Simulation Analysis and Discussions
NS2 is a platform that lets users simulate wireless sensor network scenarios. One reason for selecting the network simulator (NS-2) is that it allows advanced functionalities to be added. For the simulation of the proposed work, the cognitive radio sensor network (CRSN) patch is added to the NS-2 platform. This gives the simulation environment multichannel support and primary radio (PR) activity support. The results generated using this simulation platform are accurate and give a real insight into the issues of the network. The network size is taken as a 300 sq.m area in the Network Simulator (NS-2 Version 2.32).
The proposed EECAO network metrics are given in Table 9. Simulation results of the proposed work are compared with the similar existing protocols Efficient Tree based Self-organizing Protocol (ETSP), Energy-Efficient Centroid based Routing Protocol (EECRP) and Basic Ant-Colony Routing algorithm (Basic ACR), explained in detail in Sects. 2.2, 2.4 and 2.5 respectively. The performance metrics considered are (1) the total energy consumption of the network in joules, (2) the average energy consumed by a sensor inside the sensing field in joules, (3) the packet delivery ratio (PDR), (4) the packet dropping probability, (5) the sensor network throughput in bytes/sec and the goodput in bits/sec, and (6) the network delay in seconds, for a simulation time of t = 200 s.
The simulation results are taken from three scenarios, obtained by changing the location of the base station.

Scenario 1
The base station is placed at location co-ordinates (150,150) at the center of the sensing area of 300 sq.m.

Scenario 2
The base station is placed at location co-ordinates (300,300) at the border of the sensing area of 300 sq.m.

Scenario 3
The base station is placed at location co-ordinates (150,325) outside the sensing area of 300 sq.m.

Analysis Based on Total Energy Consumption of Sensor Network
In the proposed approach, the energy consumption of the sensor network is due to the environmental observations by the sensor E os , the packet transmission energy E Tp , the packet reception energy E Rp and the channel overhead E OH . Networking nodes are the sensors with energy level E Le > 0.25 J.
Results are captured from the simulator by considering the number of networking nodes inside the sensing field. When the BS is placed at the centre of the sensing field (Scenario 1, Fig. 7), the total energy consumed by the proposed sensor network remains lower than that of the existing works as the number of networking nodes inside the sensing area increases. When the networking node count inside the sensing area is 200, the proposed network in scenario 1 saves around 37 J, 25 J and 9 J when compared to the existing ACR, ETSP and EECRP respectively. The proposed EECAO protocol shows similar dominant characteristics in scenario 2 (Fig. 8), saving around 36 J, 28 J and 16 J when compared to the existing ACR, ETSP and EECRP respectively. The proposed algorithm also proves energy efficient when the BS is placed outside the sensing field (Scenario 3, Fig. 9), saving around 38 J, 27 J and 17 J when compared to the existing ACR, ETSP and EECRP respectively. Figure 8 indicates that the energy consumed by a sensor ID is lower when the BS is placed at the center of the sensing field, as the BS is then mostly within the reach of the super cluster heads formed. The super cluster heads can therefore send their information directly to the BS without opting for ANT routing and so save their energy.

Analysis Based on Average Energy Consumption of Sensor Network
The rate of energy consumption of the deployed cognitive sensors varies significantly, depending on the protocol used for communication between the sensors. The average energy consumption of a sensor node is calculated as the ratio of the energy consumed by all the sensor nodes inside the sensing field to the number of sensors deployed in the sensing area.
As given in Fig. 10, the average energy consumed by a sensor in the proposed protocol is comparatively less when the base station is placed at the center of the sensing field. When the base station is placed at the co-ordinates (300,300), as given in Fig. 11, the average energy consumed by a sensor under the EECAO protocol is significantly less than under the ETSP and EECRP protocols. The characteristics differ considerably in the analyzed results when the base station is placed at co-ordinates outside the sensing area, as given in Fig. 12.

Analysis Based on Packet Delivery Ratio (PDR) of Sensor Network
The packet delivery ratio, Eq. (22), is the percentage of data packets delivered to the base station relative to those generated by the source sensors that initiate cognitive channel traffic. The packet delivery ratio depends on constraints such as channel overhead and channel traffic [30,31]. In the proposed work, it depends primarily on the channel consistency parameters.
The packet delivery ratio in the proposed protocol also depends on the overhead added to the data and the distance from the sensor node to the base station D BS−SN . As shown in Fig. 13, the proposed EECAO algorithm delivers a higher share of packets than the existing works. Its two way switched multihop routing improves the packet delivery ratio of the sensor network when the base station is placed at the border of the sensing field, as depicted in Fig. 14. Since the existing works have no proper adapting algorithm when the BS is placed outside the sensing field, the proposed work shows similar dominant characteristics in terms of PDR, as seen in Fig. 15. Moreover, the PDR has a large impact on the proposed work, as it is the deciding factor for quality of service (QoS) in WSN-assisted IoT environments to meet the demands of the user agents U A . The higher the PDR, the better the QoS.

Fig. 7 Total energy consumption when BS is at co-ordinates (150,150)
Fig. 10 Average energy consumed by a sensor ID in the sensing field when the base station is placed at coordinates (150,150) in Scenario 1
Fig. 11 Average energy consumed by a sensor ID in the sensing field when the base station is placed at coordinates (300,300) in Scenario 2
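The packet delivery ratio defined in this section, i.e., the percentage of generated data packets that reach the base station, can be computed with a short helper. This is a sketch based on that verbal definition; the function name and the packet counts are hypothetical and are not taken from the simulation results.

```python
def packet_delivery_ratio(delivered, generated):
    """PDR in percent: packets delivered to the base station divided by
    packets generated by the source sensors, times 100."""
    if generated <= 0:
        raise ValueError("at least one packet must have been generated")
    if not 0 <= delivered <= generated:
        raise ValueError("delivered must lie between 0 and generated")
    return 100.0 * delivered / generated


# Hypothetical counts, for illustration only.
pdr = packet_delivery_ratio(delivered=450, generated=500)
```

The complementary share of packets, 100 minus the PDR, corresponds to the packets that were generated but never reached the base station.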

Analysis Based on Packet Dropping Ratio of Sensor Network
The packet dropping ratio is a performance-degrading metric that arises from the wireless transmission nature of the network in the simulated environments. Packet drops may occur in the sensor network due to congested channels, network delay and transmission timeouts. Since the congestion in the proposed work is reduced by using cognitive sensors at the cost

Analysis Based on Sensor Network Throughput and Goodput
In the EECAO network, congestion is avoided by selecting cognitive sensors that can sense the channel spectrum for availability before transmission. Sensor network throughput is defined as the amount of data successfully delivered to the base station from the source sensor in a given time interval t. Throughput is usually measured in bits per second and sometimes in data packets per second. In the simulation, the packet size is set to 64 bytes. Throughput is calculated from Eq. (23), where t first is the time at which the first packet is sent and t end is the time at which the last packet is received [32]. The throughput of a sensor network may be affected by various factors, including the type of transmission medium, the transmitter and receiver power of the sensor, channel consistency, channel capacity and the distance to the base station (destination). When the overhead added to the original data is taken into consideration, the rate of useful data transferred is comparatively less than the maximum achievable throughput. In all the proposed scenarios shown in Figs. 19, 20 and 21, the sensor network throughput in bytes/sec is high compared to the network throughput of the existing protocols.
Goodput is defined as the ratio of the size of the transmitted data packet to the time elapsed in transferring that packet to the base station. The goodput will always be less than the throughput except in an ideal case. Usually each packet sent to the destination includes a header. So, as the overhead added to the packet increases, the goodput eventually reduces, unlike the throughput. Several factors can cause a decrease in goodput, such as network congestion, which leads to collision of data and requires the packet to be resent. Also, if a protocol requires acknowledgement of the packet sent, this adds further overhead to the data transfer process. The goodput characteristics of the proposed EECAO protocol in the simulated scenarios are given in Fig. 22.
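The difference between throughput and goodput can be made concrete with a minimal sketch. It assumes that throughput counts every received byte over the interval from t first to t end , while goodput counts only the payload that remains after a per-packet header is removed. The header size, the packet count and the function names are hypothetical; only the 64-byte packet size and the 200 s simulation time come from the text.

```python
PACKET_SIZE_BYTES = 64  # packet size used in the simulation


def throughput_bytes_per_s(packets_received, t_first, t_end,
                           packet_size=PACKET_SIZE_BYTES):
    """Bytes delivered to the base station per second between the first
    packet sent (t_first) and the last packet received (t_end)."""
    duration = t_end - t_first
    if duration <= 0:
        raise ValueError("t_end must be later than t_first")
    return packets_received * packet_size / duration


def goodput_bits_per_s(packets_received, t_first, t_end,
                       packet_size=PACKET_SIZE_BYTES, header_size=0):
    """Useful (payload) bits delivered per second; header_size is a
    hypothetical per-packet overhead in bytes."""
    duration = t_end - t_first
    if duration <= 0:
        raise ValueError("t_end must be later than t_first")
    payload = packet_size - header_size
    return packets_received * payload * 8 / duration


# Hypothetical run: 1000 packets received over the 200 s simulation time.
tp = throughput_bytes_per_s(1000, t_first=0.0, t_end=200.0)
gp = goodput_bits_per_s(1000, t_first=0.0, t_end=200.0, header_size=16)
```

With a non-zero header, the goodput in bits per second stays below the throughput expressed in the same unit, and the two coincide only in the ideal case of zero overhead.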

Analysis Based on Sensor Network Delay
Network delay is an important Quality of Service parameter for data forwarding in constrained sensor network scenarios. Network delays are caused by packet collisions and network congestion during the data transfer period. The sequencing of data traffic is controlled by the channel access mechanism. The random access cognitive radio mechanism is used in the proposed protocol to avoid packet collisions and network congestion. As a result, the network delay is reduced significantly in all three proposed scenarios, as given in Figs. 23, 24 and 25, when compared to the existing ETSP and EECRP protocols. Once the packet overcomes its starvation for the control channel, the delay is reduced greatly and its

Analyzed Control and Normalized overhead data of EECAO protocol
Control overhead is the quantity of routing packets sent by the source sensors for route discovery and route maintenance to reach the base station. The control overhead therefore depends on the total count of routed packets per hop or the total number of bytes routed per hop. The total count of control overheads added to the data packets in the proposed scenarios when the networking node count in the sensing area varies between 75 and 200 is given in Table 10.
Normalized routing overhead in a simulation is defined as the ratio of total routing related transmissions to the data transmissions. It may also be defined as the total count of transmitted routing packets from all the source sensors inside a sensing field per total number of data packets received by the sink or base station. The total count of normalized overheads in the proposed scenarios when the networking node count in the sensing area varies between 75 ~ 200 is given in Table 11.
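The second definition above, routing packets transmitted by all source sensors per data packet received at the base station, can be written as a one-line calculation. This is a sketch of that definition only; the function name and the counts are hypothetical and are not values from Table 10 or Table 11.

```python
def normalized_routing_overhead(routing_packets_sent, data_packets_received):
    """Routing packets transmitted by all source sensors divided by the
    data packets received at the sink or base station."""
    if data_packets_received <= 0:
        raise ValueError("no data packets were received at the base station")
    return routing_packets_sent / data_packets_received


# Hypothetical counts, for illustration only.
nro = normalized_routing_overhead(routing_packets_sent=150,
                                  data_packets_received=600)
```

A lower value means fewer routing transmissions are spent per delivered data packet.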

Conclusion
In this paper, an Energy Efficient Centroid-based Ant colony Optimization (EECAO) protocol for WSN-assisted IoT environments is proposed. The problem of forming clusters inside the coverage area is solved by selecting cluster heads based on their energy cost, channel reliability and sensor throughput metrics. Eigen value analysis is used to resolve the assorted decisions into a single output result. The forwarder node, termed the super cluster head, is selected based on the distance of the cluster heads to the centroid position of the coverage area. By doing so, the data transmission distance of the cluster heads is reduced significantly. Also, an ant colony optimization algorithm is proposed to optimize the path between the super cluster heads and the BS. This reduces the energy consumption of the super cluster heads and increases their lifetime in the sensing field. The simulation results show that, when the BS is placed at the center, at the border and outside the sensor network, the proposed EECAO protocol can transmit a significant amount of packet data with a high packet delivery ratio and less energy dissipation. Moreover, the overall network throughput of the EECAO protocol is higher than that of the existing ETSP and EECRP. In future work, the protocol can be enhanced by finding a solution for forming the coverage regions in differently shaped sensor fields. Further work on the proposed approach can also provide secure communication among the cognitive sensors deployed in the network area.
Funding None.

Availability of data and material
The authors declare that the data supporting the findings of this study are available within the article along with its supplementary information files included.
Code availability Can be provided if requested personally through corresponding author mail or through a repository created by our institution. All simulation videos are included in the supplementary information file.