A Novel Approach to Reducing Energy Consumption in the IoT by Clustering with a Genetic Algorithm and Sleep Scheduling

The Internet of Things is one of the most important technologies of recent decades and covers various domains, including wireless sensor networks. Wireless sensor networks consist of a large number of sensor nodes that are scattered in an environment, collect information from their surroundings, and send it to a central station. One of the most important problems in these networks is reducing the energy consumption of the nodes and, consequently, increasing network lifetime. Work has been done in various areas to achieve this goal, including clustering and sleep-scheduling mechanisms in wireless sensor networks. In this article, we therefore examine the existing protocols in this field, especially LEACH-based clustering protocols. The proposed method tries to optimize the energy consumption of nodes by using genetic-algorithm-based clustering together with a sleep-scheduling mechanism based on the colonial competition (imperialist competitive) algorithm. The simulation results show that our proposed method improves network lifetime (by 18%) and average energy consumption (by 11%) and reduces latency in these networks (by 17%).


Introduction
The Internet of Things is an emerging technology in which any entity can send and receive data through various communication networks. This technology will have a tremendous impact on various aspects of human life, so it is necessary to select appropriate protocols and technologies for communication between different objects. The IoT is a network of networks that allows objects to perform tasks such as measuring, deciding, and exchanging information. These networks turn traditional objects into their intelligent counterparts using technologies such as ubiquitous and pervasive computing, wireless sensor networks, Internet protocols, and applications. A wireless sensor network is a collection of a large number of small sensor nodes with limited telecommunication and computing capabilities that are used to collect and transmit information from an environment to a user or base station. Recent advances in small-scale integrated circuit technology, on the one hand, and the development of wireless communication technology, on the other, have paved the way for the design of wireless sensor networks [1,2].
Wireless sensor networks have attracted many researchers for applications in various fields such as natural disaster alert systems, environmental monitoring, healthcare, security, surveillance, network intrusion detection, and more. The main problem with wireless sensor networks is the limited and irreplaceable resources of sensor nodes that operate on small batteries. In addition, in many applications it is almost impossible to replace sensor nodes when their energy is running low. Therefore, it is essential to minimize the energy consumption of the nodes and increase the lifetime of the network. Lifetime in wireless sensor networks can be defined by various criteria depending on the application, for example the death of the first node (FND), the death of half of the nodes (HND), or the death of the last node (LND). In heterogeneous wireless sensor networks with different data packets, FND is the most important criterion, because the loss of even one node can cause irreparable damage [3].
Clustering is one of the methods that can be used to address these challenges in wireless sensor networks. Nodes are divided into small groups called clusters, and in each cluster a cluster head is selected. The data collected by the sensors in each cluster is sent to the cluster head; the cluster head then aggregates the data and sends it, directly or with the help of neighboring heads, to the base station or sink. The base station is the data processing center that the end user has access to. Clustering increases network scalability, system longevity, and energy efficiency, and clustering sensor nodes is an effective way to reduce energy consumption and thereby extend the lifetime of the wireless sensor network. However, in a cluster-based wireless sensor network, cluster heads carry an extra burden for activities such as data collection, data aggregation, and transfer of the aggregated data to the base station; balancing the cluster-head load is therefore a challenging issue for long-term wireless sensor network performance. In addition, choosing the right data transmission path through the network based on residual energy helps greatly in keeping the maximum number of nodes alive [3].
In this paper, we introduce an efficient method to increase network lifetime, increase the packet delivery rate, reduce average end-to-end latency, and reduce energy consumption using data propagation and multi-route routing methods. For this purpose, the genetic optimization algorithm, thanks to capabilities such as fast convergence and the power to find global optima, can serve as a suitable tool to increase network lifetime. By applying a clustering method based on the genetic algorithm and selecting the cluster-head node based on maximum residual energy, the energy consumed to reach the base station is minimized. When the energy of the head falls below a certain threshold, the node in the same cluster that has more energy than the threshold is selected as the new head. On the other hand, because sensors in the Internet of Things must wake up to monitor the environment and receive data, the colonial competition algorithm is used for sleep scheduling to put idle nodes into fast and deep sleep, so that the energy nodes consume in wake-up time and constant listening is reduced [4].
The rest of the article is organized as follows: In the second part, a background of the works done in the clustering of wireless sensor networks will be described. The third section contains details of the proposed method. The fourth section will show the results of simulations and evaluations. In the final section, we will present the general conclusion and the main challenges in the field of wireless sensor network clustering as well as future work.

Literature Review
The IoT is a global network in which electronic devices are continuously integrated to form an information network with specific goals, in order to provide advanced and intelligent services to users; it encompasses various areas such as wireless sensor networks [4]. A wireless sensor network is a network of hundreds or thousands of very small devices called sensors, whose main task is to collect data at regular intervals, convert it into an electronic signal, and transmit the signal to a sink or base station via reliable wireless communication media. Each node also contains a processor; instead of sending all the raw information to the center or to the node responsible for processing, it first performs a series of basic, simple operations on the information it obtains and then sends the semi-processed data. In fact, the power of wireless sensor networks lies in the ability to use a large number of small nodes that can be organized and used in many applications, such as simultaneous routing, environmental monitoring, and monitoring the health of structures or the equipment of a system [5].
Node clustering in wireless sensor networks is done for different purposes. Energy conservation is the most important and common of these. Other goals of clustering include scalability, fault tolerance, data aggregation / composition, load balancing, increasing network lifespan, increasing network connections, reducing latency and routing, and avoiding collisions. Clustering methods are classified into two main groups: clustering with equal size and clustering with unequal size [6].
Equal-size clustering methods have been widely proposed by researchers. The main idea of these algorithms is to form clusters of relatively equal size in order to reduce the number of clusters, distribute the clusters evenly across the network, and minimize interference between them. The main problem with this type of clustering is that the distance between the nodes and the central station (BS) does not affect the size of the clusters, so the traffic load is not evenly distributed among the nodes: nodes adjacent to the BS relay more data than nodes that are farther away [6].
The most important algorithm presented in this field is the LEACH algorithm. Its main idea [7] is to rotate the cluster-head role among the nodes in order to balance the load. The operation time in this algorithm is divided into a number of rounds, and each round is divided into two phases, set-up and steady-state, to form clusters and to send data directly from the headers to the BS, respectively. Header selection in LEACH is done with low overhead using a random method, which ensures that every node is selected as a header at least once during the lifetime of the network. Network lifetime in this algorithm depends on the number of nodes and clusters.
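As a concrete illustration of the rotation rule described above, LEACH's election threshold can be sketched as follows. This is a minimal sketch, not the authors' code; the function and parameter names are ours, and membership in the candidate set G (nodes that have not served as header in the current epoch) is modeled with a simple boolean flag.

```python
import random

def leach_threshold(p, r, in_candidate_set):
    """LEACH cluster-head threshold T(n) for round r.

    p: desired fraction of cluster heads. Nodes that already served as
    header in the current epoch (the last 1/p rounds) are excluded, which
    the boolean in_candidate_set encodes.
    """
    if not in_candidate_set:
        return 0.0
    return p / (1 - p * (r % round(1 / p)))

def elect_cluster_head(p, r, in_candidate_set, rng=random.random):
    # A node becomes header when its random draw in [0, 1) falls below T(n).
    return rng() < leach_threshold(p, r, in_candidate_set)
```

At the start of an epoch (r = 0) the threshold equals p; in the last round of the epoch it reaches 1, so every remaining candidate is elected, which is what guarantees each node serves once per epoch.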
In [8], a multi-hop version of LEACH called M-LEACH was presented. This protocol examines the effective performance of multi-hop communication with the BS versus single-hop communication. C-LEACH is a centralized clustering version of LEACH; in [9], the BS is responsible for cluster formation. Initially, each node sends information, including its location and energy level, to the BS; the average energy of the nodes is then calculated, and nodes whose energy in this round is below the average total energy are not selected as CH. When the BS has selected the headers of this round, it distributes their IDs to all network nodes, and nodes not selected as CH join their nearest CH.
In [10], TEEN, a threshold-based clustering protocol for WSNs, was introduced that, unlike LEACH, was designed for reactive, event-driven applications. TEEN's main idea is to use data-driven protocols in a hierarchical structure. TEEN is more energy efficient than previous algorithms because it reduces the number of transmissions to the BS, and the data-driven nature of this protocol makes it suitable for time-sensitive applications.
In [11], a new protocol called APTEEN was introduced that combines the best data acquisition features of TEEN (i.e., responsiveness) and LEACH (i.e., periodic transmission of sensed data). It allows the sensors to send their sensed data at regular intervals, to react to any sudden change in the value of the sensed attribute, and to report the corresponding values to their head. Reference [12] provides an improved LEACH protocol. This greedy protocol pursues two goals: first, to improve network lifetime and even out the energy consumption between nodes; second, using a chain structure, each node sends its data to its nearest neighbor in the chain, and the receiving node then aggregates the received data with its own data and passes it to the next hop. In this algorithm, the node closest to the BS is known as the leader and is responsible for transmitting data to the BS. EEHC is an energy-efficient hierarchical clustering protocol and a randomized, distributed clustering algorithm presented in [13]. EEHC's operation is divided into two stages: in the first stage, the cluster heads collect information from the sensor nodes within their clusters, and in the second stage, a consolidated report is sent to the base station through the cluster hierarchy, with the operation repeated recursively until the data reaches the BS. To select the headers, a probability-based algorithm based on the density of neighboring nodes is used.
A deterministic cluster-head selection algorithm for LEACH was presented in [14]. This algorithm considers the residual energy of the nodes in the CH election while maintaining the simplicity and distributed nature of LEACH. Also, if after a certain number of rounds there are still nodes with enough energy to transfer data to the BS, the network is suspended and becomes temporarily idle.
Typically, in unequal clustering, the size of the clusters varies based on the distance between the nodes and BS. In other words, in multi-step routing, clusters close to BS need to relay more information, so they consume more energy and discharge energy very quickly, so they must be smaller than clusters farther away. This problem is commonly known as the energy hole problem. The smaller the number of members in the cluster, the lower the rate of energy consumption within the clusters, so their headers can store more energy for the information relay received from the farther cluster.
This issue was first discussed by Soro and Heinzelman in 2005. The main idea of UCS in [15] is that the distance of nodes to the BS is used as the criterion to form adaptive clusters. Accordingly, clusters close to the BS are smaller than clusters farther away. In UCS, the header is assumed to be energy-rich (in a heterogeneous network) and located at the center of each cluster, with the BS at the center of the network area. The clusters eventually send their data to the BS via a two-hop path.
In [16], EEUC adjusts cluster sizes according to the distance to the BS. The assumptions in EEUC are more realistic than in UCS: unlike in UCS, the BS is located outside the area and does not need to be at the center, and the selection of headers is based on a competition. EADUC is an unequal, distributed, energy-aware clustering protocol designed in [17] to support both homogeneous and heterogeneous networks. Cluster formation in this protocol is done in three steps: neighbor data collection, CH competition, and cluster formation. To form unequal clusters, the competition range is based on a weight function of the residual energy and the distance to the BS.
In the unequal version of LEACH [18], the BS first distributes to all nodes a distance matrix containing the distance between each pair of nodes in the network. The BS builds this matrix by broadcasting Hello messages within the network and receiving report messages from the nodes. Using this matrix, nodes can adjust their transmission power. The headers are then selected using a modified version of the CH election probability in LEACH that is based on residual energy and distance to the BS.
The version of HEED with unequal clusters proposed in [19], UHEED, tries to prevent the death of clusters close to the BS. UHEED does this by changing the protocol's competition range to achieve unequal clusters. A sensor node usually consists of four basic components: the sensor unit, the processing unit, the radio unit, and the power supply unit [20]. If the minimum energy required for the effective performance of the sensor node is less than the operating capacity of the battery, battery life will be shortened. By using data processing techniques and turning off the sensor node from time to time when there are no processing or communication demands, battery drain can be drastically reduced, thus increasing battery life.
From this point of view, sleep scheduling algorithms can be divided into five categories: energy-efficient TDMA sleep scheduling, balanced-energy sleep scheduling, optimal sleep scheduling, dynamic sleep scheduling, and dynamic sleep scheduling with optimal time delay.
In the energy-efficient TDMA sleep scheduling algorithm, time is divided into two working intervals for each node: in one interval the radio interface is turned on and packets are sent and received, and in the other the radio interface is turned off and no data exchange takes place. Maximizing network lifetime and minimizing the number of packets lost during sleep are among the benefits; the need for more time slots than the network actually requires, which increases latency and reduces channel utilization, as well as the possibility of data overlap, are among the disadvantages of this algorithm [21].
In balanced-energy sleep scheduling, rotating nodes between active and inactive states is used to save energy; battery-powered nodes can manage energy intelligently by taking redundant nodes into account and thus increase their lifespan, using a very low-power timer to wake up at a scheduled time. The required redundancy of sensor nodes and the use of additional sensors during sleep to balance the load are among the disadvantages of this algorithm [22].
In optimal sleep scheduling, nodes sleep periodically, meaning that a node in the studied model is free to turn off its transmitter and receiver for fixed periods of time to save energy. Reducing latency and increasing network lifetime are the advantages of this method, while not evaluating coverage and connectivity is its disadvantage [23]. In dynamic sleep scheduling, intelligent energy saving is very important both during inactivity and during periods when phenomena occur. Reducing channel overhearing is critical: if the nodes can determine the times at which packets are sent or received, they can reduce overhearing. To facilitate energy saving during events, an intelligent scheduler allows nodes to take very short sleeps when a node is neither sending nor receiving a packet [24].
Delay-efficient sleep scheduling algorithms are another class of sleep scheduling algorithms. In sensor networks, network lifetime is expected to increase with efficient energy saving in the small sensors. Therefore, in network applications with a low traffic load, turning off the sensor node's radio is desirable when there is no sending or receiving. The medium access control (MAC) protocol recommends regular, synchronized duty cycles to reduce the cost of idle listening on the channel [25].
In all cases of sleep scheduling algorithms, the decision to schedule and plan sleep cycles and sensor activity is made only on the basis of the information of the node and adjacent nodes, so the use of new methods and techniques to increase network efficiency and effectiveness seems essential. In the next section, we will describe the system model and present the proposed method and its details.

Proposed Method
The rapid growth of population density in urban areas requires modern infrastructure for appropriate services to meet the needs of city residents. Hence the latest advancement in communication technology, such as the internet of things, is the demand for a framework for the development of a smart city [26]. In the system model, we present an environmental monitoring scenario that uses the wireless sensor network as part of the internet of things.
For shorter distances, such as the distance between nodes and headers in the room, the open space model is considered, and for longer distances, such as the distance between headers and the sink node, the multipath fading model is used.
For a symmetric propagation channel, the energy consumed to transmit k bits of data in a packet to a sensor located d meters away is given by Equation (1):

E_TX(k, d) = k * E_elec + k * eps_fs * d^2,  if d < d_0
E_TX(k, d) = k * E_elec + k * eps_mp * d^4,  if d >= d_0

where eps_mp is the amplifier parameter for the multipath fading model, eps_fs is the amplifier parameter for the open-space model, E_elec is the per-bit electronics energy, and d_0 = sqrt(eps_fs / eps_mp) is the crossover distance between the two models. LEACH is a single-hop initial clustering protocol that saves a large amount of energy compared with non-clustered protocols. Once the nodes are deployed, the sensors combine to form clusters, each with a header that collects the data. The protocol is executed in rounds: clusters are formed dynamically and the header is selected randomly. Residual energy is constantly monitored by the sink until the lifetime is over, i.e., all nodes are exhausted. Let there be n sensor nodes and m clusters. In the LEACH algorithm, p is the probability of selecting each node as a header, and before the start of each round, each node draws a random value between 0 and 1. If the number drawn is less than the threshold given by Equation (2), that node is selected as the header for that round:

T(n) = p / (1 - p * (r mod (1/p))),  if n is in G
T(n) = 0,  otherwise

where r is the current round and G is the set of nodes that have not been headers in the last 1/p rounds.
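The first-order radio model behind this energy accounting can be sketched as follows. The constant values are typical illustrative figures from the literature, not parameters taken from this paper's simulation, and the function names are ours.

```python
import math

# Illustrative first-order radio model constants (assumed, not from the paper).
E_ELEC = 50e-9       # J/bit, transceiver electronics energy
EPS_FS = 10e-12      # J/bit/m^2, free-space (open-space) amplifier parameter
EPS_MP = 0.0013e-12  # J/bit/m^4, multipath-fading amplifier parameter
D0 = math.sqrt(EPS_FS / EPS_MP)  # crossover distance between the two models

def tx_energy(k_bits, d):
    """Energy to transmit k_bits over distance d, per Equation (1)."""
    if d < D0:
        return k_bits * (E_ELEC + EPS_FS * d ** 2)   # free-space term, d^2
    return k_bits * (E_ELEC + EPS_MP * d ** 4)       # multipath term, d^4

def rx_energy(k_bits):
    """Energy to receive k_bits: electronics cost only."""
    return k_bits * E_ELEC
```

With these constants the crossover distance d_0 is roughly 87 m, so intra-cluster links typically pay the d^2 cost while long header-to-sink links pay the d^4 cost, which is exactly why clustering saves energy.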
With each round, the header changes based on the probability of selection, which indicates that all nodes in the cluster have the same chance of being selected as the header in the next round, regardless of the remaining energy.
To meet these challenges, an improved algorithm called R-LEACH was proposed. The protocol is a hierarchical clustering algorithm that consists of two stages: the setup stage and the steady-state stage.
At the beginning of the setup stage, the sensor nodes in the network are divided into clusters. In each cluster, the header is responsible for collecting data from the sensor nodes. Actual data routing occurs during the steady-state stage, where the data collected by the header is transferred to the base station.
In the setup step, for the first round, clusters and headers are formed using the LEACH algorithm. After data transfer, each node in the network consumes an amount of energy that differs from node to node and depends on its transmission distance, indicated by the letter d. Hence, for the next round, the header is selected through Equation (3):

T(n) = [p / (1 - p * (r mod (1/p)))] * (E_res / E_0),  if n is in G

where E_res is the residual energy level of each node and E_0 is the initial energy level of the node. The optimal number of clusters is:

k_opt = sqrt(n / (2 * pi)) * sqrt(eps_fs / eps_mp) * M / d_toBS^2

where M represents the dimension of the network area, d_toBS the distance to the base station, and E_0 the initial energy of each node. When the headers are selected for the current round, they send their announcements to the member nodes in the cluster. The sensor nodes compare the signal strength of the request messages and decide which header to join.
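The residual-energy-weighted threshold and the optimal cluster count described above can be sketched as follows. The function names and the default amplifier constants are our assumptions; only the formulas follow the text.

```python
import math

def rleach_threshold(p, r, e_residual, e_initial, in_candidate_set=True):
    """Energy-aware header threshold: LEACH's T(n) scaled by residual energy."""
    if not in_candidate_set or e_initial <= 0:
        return 0.0
    base = p / (1 - p * (r % round(1 / p)))   # plain LEACH threshold
    return base * (e_residual / e_initial)    # favors nodes with energy left

def optimal_cluster_count(n_nodes, m_side, d_to_bs,
                          eps_fs=10e-12, eps_mp=0.0013e-12):
    """Optimal number of clusters for an M x M field (Heinzelman-style)."""
    return (math.sqrt(n_nodes / (2 * math.pi))
            * math.sqrt(eps_fs / eps_mp)
            * m_side / d_to_bs ** 2)
```

Note how a node at half its initial energy sees its election probability halved, which is the load-balancing effect the text describes.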
To prevent collisions, the nodes of each cluster transmit their data in different time slots. This process continues for the remaining rounds until the energy of all the nodes in the network runs out. In each time slot, only the transmitting node is active, and all other nodes in the cluster turn off their radios to save energy. After all the nodes in the cluster have completed their data transfer, the header processes the data and then transmits the aggregated data in a single-hop or multi-hop communication to the base station or sink.
Our proposed method has two phases: (1) a clustering method based on the genetic algorithm, and (2) sleep scheduling with the colonial competition algorithm. The steps of the proposed method are as follows:
Step 1: In this step, the station randomly distributes the nodes in the environment and defines the initial energy and location of the nodes. (The initial population used in the genetic algorithm will have n * m bits.)
Step 2: In this step, based on the initial population, single-point crossover is applied: a separation point is chosen and the parent populations are combined to create a new generation.
Step 3: In this step, we create a new generation and offspring by combining the populations that came from the previous generation.
Step 4: In this step, we select the header. If the selected node is the best header and sends the data to the base station over the best path, it is selected as the main header. Otherwise, the evaluation process continues by mutating a bit of the chromosome with a definite probability.
Population assessment with M chromosomes reproduces offspring by calculating their fitness values, using the assessment mechanism to obtain suitable offspring. If suitable offspring are obtained, the process of creating a new generation is repeated. When the header node is selected, the other nodes fall asleep according to the steps of the colonial competition algorithm in order to conserve their energy; this conservation increases the lifetime of the network. Figure (1) illustrates this process. In the configuration phase, after determining the initial parameters, the possible headers are identified and selected. The procedure for selecting possible headers is as follows: after distributing the nodes, the environment is first divided into a grid; in each cell, the center of gravity of the nodes is calculated, and the node closest to the center of gravity is selected as a possible header. The possible headers are then fed into the genetic algorithm, and after several generations, the definitive headers are selected.
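The genetic steps above (a bit-string population, single-point crossover, bit mutation, and fitness-based selection) can be sketched as a minimal generic GA. This is not the authors' exact implementation; all names are ours, and the fitness function is supplied by the caller (e.g., the network's average energy, to be minimized).

```python
import random

def make_population(size, n_nodes, p_ch=0.1, rng=random):
    """Random bit-string chromosomes: 1 = header candidate, 0 = normal node."""
    return [[1 if rng.random() < p_ch else 0 for _ in range(n_nodes)]
            for _ in range(size)]

def crossover(a, b, rng=random):
    """Single-point combination of two parent chromosomes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(chrom, rate=0.01, rng=random):
    """Flip each bit with a small definite probability."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chrom]

def evolve(fitness, pop, generations=50, rng=random):
    """Minimize `fitness` over the population; keeps the best half (elitism)."""
    for _ in range(generations):
        pop.sort(key=fitness)                 # best (lowest cost) first
        parents = pop[: len(pop) // 2]        # survivors carried over intact
        children = []
        while len(children) < len(pop) - len(parents):
            a, b = rng.sample(parents, 2)
            c1, c2 = crossover(a, b, rng)
            children += [mutate(c1, rng=rng), mutate(c2, rng=rng)]
        pop = parents + children[: len(pop) - len(parents)]
    return min(pop, key=fitness)
```

Because the best half of each generation survives unchanged, the best fitness found can never get worse from one generation to the next.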
Immediately after the setup phase, the steady-state phase begins. In this phase, the initial energy of the nodes decreases based on the transfers that take place. In the proposed method, for normal nodes only the process of receiving information from the environment and sending it to the header is considered, while for header nodes, in addition to receiving information from normal nodes and sending it to the base station, the data aggregation energy is also considered. The following algorithm shows the pseudocode of the proposed method.

Set Up Phase
In the Set Up phase, to determine the headers, we first consider the possible headers as the initial population of the genetic algorithm. In each chromosome, zero bits represent normal nodes and one bits represent headers. Once the initial population is identified, a number of random populations are created from the original population, and the next generations are then obtained through the genetic operators of selection, crossover, and mutation. The fitness function is then applied to the created populations, and the population with a better (lower) fitness than the others is selected as the final population.
The fitness function is the average energy consumption of the entire network. In the proposed method, the fitness function is based on the Heinzelman model. According to Equation (4) of this model, each node consumes the following energy to send L bits of data over a distance d:

E_TX(L, d) = L * E_elec + L * eps_fs * d^2,  if d < d_0
E_TX(L, d) = L * E_elec + L * eps_mp * d^4,  if d >= d_0

Here d_0 is the crossover (shortest transition) distance, E_elec is the energy required to activate the transmitter electronic circuits, and eps_fs and eps_mp are amplifier parameters that depend on the receiver sensitivity and noise figure. The energy required to receive L bits, according to Equation (5), is:

E_RX(L) = L * E_elec

In the proposed method, the fitness function is defined as the sum of the energy required to send a normal node's data to its header, the energy required for the header to receive it from the normal node, and the energy required to send the header's data to the base station.
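Putting the pieces together, a candidate fitness function in the spirit described above (member-to-header transmission, header reception, and header-to-base-station transmission, averaged over the network) might look like this. The geometry helpers, constants, and the nearest-header assignment are our assumptions; the paper does not give its exact code.

```python
import math

# Illustrative radio constants (assumed).
E_ELEC, EPS_FS, EPS_MP = 50e-9, 10e-12, 0.0013e-12
D0 = math.sqrt(EPS_FS / EPS_MP)

def tx(bits, d):
    amp = EPS_FS * d ** 2 if d < D0 else EPS_MP * d ** 4
    return bits * (E_ELEC + amp)

def rx(bits):
    return bits * E_ELEC

def fitness(nodes, ch_flags, bs, bits=2000):
    """Average energy per round for a candidate header assignment.

    nodes: list of (x, y) positions; ch_flags: chromosome bits (1 = header);
    bs: (x, y) of the base station. Each normal node transmits to its nearest
    header (which pays the reception cost); each header forwards to the BS.
    """
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    heads = [p for p, f in zip(nodes, ch_flags) if f]
    if not heads:
        return float("inf")   # invalid chromosome: no header at all
    total = 0.0
    for p, f in zip(nodes, ch_flags):
        if f:
            total += tx(bits, dist(p, bs))           # header -> base station
        else:
            d = min(dist(p, h) for h in heads)
            total += tx(bits, d) + rx(bits)          # member -> header + rx
    return total / len(nodes)
```

Chromosomes with no header are assigned infinite cost, so the GA's selection step automatically discards them.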
One of the main challenges in sleep scheduling is choosing the fewest active sensors with the maximum battery level so as to achieve maximum coverage and the least data redundancy in the network. For this purpose, the proposed method uses the colonial competition algorithm for sleep scheduling; the important parameters of this algorithm are:
1. Data redundancy matrix: To obtain the parameters of this matrix, we first divide the work environment into equal-sized square regions according to the sensor density and then assume that an event occurs at each intersection. For each event, we collect the reports of the nodes and form a data redundancy matrix.
If we assume that each event must be reported by at most two sensors, then with the help of this matrix we can calculate the amount of network data redundancy by computing the standard deviation of the matrix.
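A minimal sketch of this redundancy measure, assuming the matrix simply counts how many sensors report the event at each grid point:

```python
import statistics

def redundancy(matrix):
    """Standard deviation of the event-report counts in the redundancy matrix.

    matrix[i][j] = number of sensors reporting the event at grid point (i, j).
    A perfectly balanced schedule (every event reported by the same number of
    sensors, ideally at most two) gives a standard deviation of zero.
    """
    counts = [c for row in matrix for c in row]
    return statistics.pstdev(counts)
```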

2. Network coverage: Similar to the redundancy calculation, in this step we form a binary coverage matrix and count the number of events not reported by any sensor; the non-coverage of the network is computed from this matrix.
3. Neighborhood disruption (overlap) rate: Although the data redundancy matrix can count the number of sensors that report a phenomenon at one point, it cannot capture the proximity or distance of two nodes, so another matrix, called the neighborhood disruption rate, is defined. Equation (12) calculates this rate, where r is the sensing radius and d is the distance between the two nodes; the overlap matrix is then calculated, and the total overlap of the sensor network is the sum of its entries.
4. BRR: The BRR is equal to the average battery level of the sensor nodes.
The cost function combining the above parameters is:

Cost = w1 * R + w2 * C + w3 * O + w4 * BRR

where R, C, and O are the redundancy, non-coverage, and overlap terms, and w1, ..., w4 are the weights of the parameters of the cost function.
In the proposed method, we have used the genetic algorithm and the colonial competition algorithm for clustering and sleep scheduling of the nodes, which reduces the average energy consumption, end-to-end latency, and average control-packet overhead, and thereby increases the longevity of the network.

Evaluation of the proposed method
In this paper, the NS-2 simulator is used to evaluate and demonstrate the efficiency of the proposed algorithm. NS-2 is today considered the most widely used open-source network simulator; it is designed and implemented using the C++ programming language and the OTcl language. OTcl is a scripting language with a TCL script structure and object-oriented capabilities; the object-oriented extensions to TCL were designed and implemented at MIT.
In table (1), the simulation parameters are introduced and their values are specified. Selecting an appropriate value for a parameter can lead to a good result in the performance of the protocol, and conversely, inappropriate values for the parameters will cause a decrease in performance. In order to evaluate the results and performance of the proposed method, evaluation criteria such as average end-to-end latency, average energy consumption and network life have been used.
The average energy consumed by a protocol is an important criterion in a wireless sensor network and is related to the lifetime of the system: the more energy a protocol consumes, the more likely it is to shorten the network's lifetime. As shown in Figure (2), the average energy consumption of the proposed method has improved significantly compared with the R-LEACH method across different numbers of sensors, taking the simulation error percentage into account. In the proposed method, we were able to create an acceptable level of energy balance in the network by selecting the best cluster head in different rounds as well as by sleep scheduling of the nodes. The percentage improvement of our method over the mentioned method, considering the simulation error, is 11%.
Another important criterion used to evaluate the efficiency of the proposed method in sensor networks is the average time taken to successfully transfer a packet from source to destination, called the average end-to-end delay. The average end-to-end latency of the proposed method and the R-LEACH protocol is shown in Figure (3). To obtain the end-to-end delay, we have used Equation (15).
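We assume Equation (15) is the usual mean of per-packet (receive minus send) times over the successfully delivered packets; a sketch:

```python
def average_end_to_end_delay(packets):
    """Mean end-to-end delay over successfully delivered packets.

    packets: iterable of (t_sent, t_received) pairs. Dropped packets are
    simply excluded from the input, matching the "successfully transferred"
    wording in the text.
    """
    delays = [t_rx - t_tx for t_tx, t_rx in packets]
    return sum(delays) / len(delays)
```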
As shown in Figure 3, the mean end-to-end latency of the proposed method is significantly better than that of the R-LEACH protocol across different numbers of nodes, also taking the simulation error percentage into account.
In the proposed method, by selecting the best candidate header nodes, reducing the number of headers, and using header nodes and multiple paths instead of routing through cluster members, we were able to successfully receive the packets generated at the source node at the base station with less delay than the R-LEACH method. The percentage improvement of our method over the mentioned method is 17%, considering the simulation error.
The network lifetime criterion refers to the time at which a given percentage of network nodes have depleted their energy, which may be half of the nodes or some other fraction; we use this definition as well.

Figure (4): network lifetime
As you can see in Figure 4, the network life of the proposed method has improved significantly compared to the R-LEACH protocol at different times, taking into account the percentage of simulation error.
The increase in network lifetime in our proposed method is due to the appropriate sleep-scheduling algorithm, maintaining load balance in the network, using an appropriate clustering method, and considering the best node for the header role. The percentage improvement of our method over the mentioned method, considering the simulation error, is 18%. By evaluating the performance of the protocol using the NS-2 simulation tool, we showed that the proposed method performs much better than the other methods proposed in this field. With this method, we increase the lifetime of the network and reduce the power consumption and end-to-end latency by obtaining the best headers and the best sleep schedule of the nodes.

Conclusion
In this paper, we first examined the Internet of Things and then focused on an important part of it, the wireless sensor network. Although these networks have a variety of applications, the small size of the sensor nodes imposes processing, computational, energy, and other limitations. Improving the energy consumption of nodes is one of the most important issues in wireless sensor networks, and one of the effective factors in improving it is how information is sent from the nodes to the base station.
In this paper, an attempt was made to provide a method for selecting the header and sleep schedule that, in addition to increasing the life of the network and preventing the unnecessary energy consumption of sensor nodes, also reduces end-to-end latency. For this purpose, we examined the clustering methods in this field. We used a clustering method based on the meta-heuristic genetic algorithm and then applied a sleep timing method with colonial competition algorithm to use energy efficiently.
To review and evaluate our proposed method, we simulated it and compared the simulation results with the paper's baseline method from the perspectives of average node energy consumption, end-to-end latency, and network lifetime. The results showed that, compared to the R-LEACH protocol, our proposed method improved the average energy consumption (by 11%), the average end-to-end latency (by 17%), and the network lifetime (by 18%).
In future work, to extend this method, researchers can further examine how to fragment messages sent from a sensor node to a base station, discover optimal paths using fuzzy algorithms, and cluster the network using intelligent clustering. They can also use more advanced encodings to improve network reliability.