Probabilistic coverage in mobile directional sensor networks: a game theoretical approach

Directional sensor node deployment is indispensable to a large number of applications, including Internet of Things applications. Nowadays, with recent advances in robotic technology, directional sensor nodes mounted on mobile robots can move toward appropriate locations. Considering the probabilistic sensing model along with the mobility and rotatability of directional sensor nodes, area coverage in such a network is more complicated than in a static sensor network. In this paper, we investigate the problem of self-deployment and working direction adjustment in directional sensor networks in order to maximize the covered area. Considering the tradeoff between energy consumption and coverage quality, we formulate this problem as a finite strategic game. Then, we present a distributed payoff-based learning algorithm to achieve Nash equilibrium. The simulation results demonstrate the performance of the proposed algorithm and its superiority over previous approaches in terms of area coverage.


Introduction
Mobile directional sensor networks (MDSNs) consist of directional sensor nodes that can move and rotate on their own and interact with the physical environment. A directional sensor, such as a video, infrared, or ultrasound sensor, is capable of adjusting its working direction and sensing an angular area at each unit of time. Such networks enable a variety of applications in industry and in our daily life; e.g., in IoT applications, sensor nodes collect information from the environment and send it to the sink through the wireless network [1]. Therefore, area coverage is an important and challenging problem due to the energy constraints of sensor nodes [2][3][4].
Most studies on area coverage problems in DSNs adopt the binary sensing model, in which the sensing region is a deterministic sector that is only a coarse approximation of the real sensing region. In this model, an event is detectable by a sensor if and only if it falls into the covered sector. In reality, however, the probability of event detection decreases as the distance between the event and the sensor increases, so the sensing model of a sensor node is practically probabilistic [5][6][7]. In [6], the authors showed through an experimental study that the normal distribution is reasonable for modeling the sensing range of sensor nodes. The authors in [7] considered an exponentially decreasing sensing capability: the sensing capacity is exponentially reduced as the distance between the points and the sensor nodes increases.
In this paper, we propose two algorithms, namely binary coverage based on game theory (BCGT) and probabilistic coverage based on game theory (PCGT). We model the area coverage problem as a finite strategic game in which the utility function is designed to capture the tradeoff between the amount of covered area and the energy consumption due to movement and rotation. Then, we propose a learning algorithm to solve this problem based on the log-linear learning algorithm proposed in [8]. It is proved that in this algorithm, each sensor as a player finally selects the action profile that maximizes the total payoff.
To the best of our knowledge, this is the first work that employs a probabilistic sensing model for mobile directional sensor networks. The contributions of this work are as follows:
• We formulate the problem of determining the direction and location of directional sensor nodes to maximize coverage, for both binary and probabilistic sensing models, as a multiplayer repeated game in which each sensor, as a player, tries to maximize its utility function. The utility function is designed to capture the tradeoff between the worth of the covered area and the energy consumed by movement and rotation.
• We prove that the proposed game is an exact potential game whose potential function corresponds to covering the area with the least energy consumption, thereby maximizing network lifetime.
• We propose a variant of the log-linear algorithm, called the binary log-linear learning algorithm (BLLL), that converges to a pure Nash equilibrium.
The performance of our proposed algorithm is evaluated via simulations and compared to previous approaches. The simulation results show that our proposed algorithm significantly improves the sensing coverage performance.
The paper is organized as follows: Sect. 2 briefly reviews some recent research related to solving the sensor coverage problem. In Sect. 3, first, the proposed approach is formulated for both binary and probabilistic sensing models based on game theory. Then, the binary log-linear learning algorithm is introduced to converge the game into an efficient action profile. In Sect. 4, simulation results are presented through several experiments. Finally, we conclude the paper in Sect. 5.

Related works
In this section, we briefly review the research on coverage in wireless sensor networks. The coverage problem is usually divided into three categories: area coverage, point coverage, and barrier coverage [9][10][11]. The purpose of area coverage is to cover the whole area; point coverage is the coverage of Points of Interest (PoI); and barrier coverage guarantees that every movement crossing a barrier of sensors will be detected. Su et al. [12] proposed a Voronoi-based Optimized Depth Adjustment deployment scheme to deploy sensor nodes in a target water space. In this algorithm, a number of leader sensor nodes are selected to remain on the water surface; then, in order to reduce the coverage overlap between nodes, the remaining nodes calculate their depths, from shallow to deep, based on graph coloring theory.
Alibeiki et al. [13] studied the problem of covering targets with directional sensors. They formulated it as maximum coverage with a minimum number of sensors and proved that it is NP-complete; therefore, a genetic-based algorithm is presented to solve the problem. The main idea is the selection of sensing sectors that cover the maximum number of targets. Mondal et al. [14] proposed greedy-based algorithms for target coverage in directional sensor networks with adjustable sensing ranges. They used both scheduling and sensing-range adjustment techniques to form cover sets that cover all targets in the network and maximize network lifetime.
In [15], the authors provided a GA-based algorithm to solve the MNLAR (Maximum Network Lifetime with Adjustable Ranges) problem. GA-based algorithm forms cover sets of directional sensors with appropriate sensing ranges.
Yu et al. [16] addressed the problem of K-coverage in wireless sensor networks with both centralized and distributed protocols. The protocols introduce a new concept of Coverage Contribution Area (CCA), based on which a lower sensor spatial density is achieved. In addition, the protocols take the remaining energies of the sensors into account and thereby prolong the network lifetime.
In [17,18], a probabilistic coverage-preserving protocol is designed to achieve energy efficiency and to ensure a certain coverage rate. The purpose of the proposed protocol is to select the minimum number of probabilistic sensors so as to reduce energy consumption.
A graph model named Cover Adjacent Net (CA-Net) was proposed by Weng et al. [19] to simplify the problem of k-barrier coverage while reducing the computational complexity. Based on the developed CA-Net, two distributed algorithms, called BCA and TOBA, were presented for the purpose of energy balance and maximum network lifetime.
Mostafaei et al. [20] proposed a distributed boundary surveillance (DBS) algorithm to cover the boundary and reduce energy consumption of sensors. DBS selects the minimum number of sensors to increase the network lifetime using learning automata.
Zarei et al. [21] proposed a new algorithm based on the Voronoi diagram, called prioritized geometric area coverage (PGAC). In the proposed algorithm, in order to maximize the coverage of the desired area, as many Voronoi edges as possible are covered.
In [22], the authors proposed a coverage optimization algorithm using clustering, cluster-head selection, and a sleep/wake phase. Overlap problems are optimized using the sensor nodes' default conditions to achieve maximum coverage within a specific sensing radius. The sleep/wake phase is used to select a small number of nodes for active coverage.
Recently, game theoretic approaches have been taken into consideration to solve the coverage problem in WSNs [23][24][25][26]. In [27], the authors proposed an algorithm based on game theory for the problem of maximizing coverage and reducing energy consumption; they showed that the desired solution in this model is an NE strategy profile. In [28], the authors proposed a game theoretical complete coverage algorithm, which ensures whole-network coverage mainly by adjusting the covering range of nodes and controlling the network redundancy. The game theory control method has many advantages, including robustness to failures and environmental disturbances, reduced communication requirements, and improved scalability. The primary goal of game theory-based approaches is to design rules that guarantee the existence and efficiency of a pure Nash equilibrium [29,30]. Proper utility functions and reinforcement learning methods are designed for the coverage game of WSNs in [23,31]; in these algorithms, each player must have access to the utility values of its alternative actions. In [32], moving isotropic sensors with adjustable sensing ranges are considered, and a new family of provably correct reactive envelope control algorithms is proposed for continuous- and discrete-time sensor dynamics. The proposed coverage control algorithms continuously reconfigure sensor positions and sensing ranges in order to minimize the statistical distance between the event distribution in the environment and the overall event detection probability of the sensors.
In this study, we propose a game theory-based algorithm to optimally cover targets and reduce energy consumption.

The proposed algorithm
In this section, we propose a new game theory-based algorithm for both binary and probabilistic sensing models in mobile directional sensor networks in order to maximize the area coverage. We prove that the proposed method is a potential game and converges to Nash equilibrium using a distributed learning algorithm.

Problem formulation
Suppose that N mobile directional sensor nodes are randomly deployed in a two-dimensional mission space. Figure 1 shows the binary sensing area of a directional sensor, denoted by the tuple (S, R, φ_f, θ_s, v⃗). S is the location of the sensor node on a two-dimensional plane, and R represents the maximum sensing radius. The horizontal orientation of the sensor and the angle of view are indicated by θ_s ∈ (0, 2π] and φ_f, respectively. v⃗ is a unit vector that defines the orientation of the directional sensor. Let q be a point in the area and d⃗ be the distance vector from sensor S to q. Point q is covered by sensor S if

‖d⃗‖ ≤ R  and  d⃗ · v⃗ ≥ ‖d⃗‖ cos(φ_f/2).

The first condition indicates whether q is within the sensing range of S, and the second one examines whether q is in the sensor's angle of view. We assume that the communication range of each sensor, R_c, is at least twice the sensing range (R_c ≥ 2R). Thus, each sensor can transmit its state information to its neighbors.
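The two coverage conditions can be checked directly in a few lines; the following is a minimal sketch (the function and variable names are illustrative, not taken from the paper's implementation):

```python
import math

def covers(s, v, q, R, phi_f):
    """Binary directional coverage test: point q is covered by a sensor at s
    with unit orientation vector v, sensing radius R, and angle of view phi_f."""
    dx, dy = q[0] - s[0], q[1] - s[1]
    dist = math.hypot(dx, dy)
    if dist > R:                      # condition 1: within the sensing range
        return False
    if dist == 0:
        return True                   # the sensor's own location is covered
    # condition 2: angle between d and v is at most phi_f / 2
    cos_angle = (dx * v[0] + dy * v[1]) / dist
    return cos_angle >= math.cos(phi_f / 2)

# A point 2 units straight ahead of a sensor facing +x, with R = 3 and a 90° view:
print(covers((0, 0), (1, 0), (2, 0), 3, math.pi / 2))   # True
print(covers((0, 0), (1, 0), (0, 2), 3, math.pi / 2))   # False: 90° off-axis
```

The dot-product form of the second condition avoids computing an arccosine per point, which matters when every sensor evaluates many lattice points per iteration.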
The two-dimensional mission space is discretized into a square lattice. Each cell of the lattice is 1 × 1 and is represented by the coordinates of its center. Under these assumptions, the problem is to find the appropriate location and orientation of each sensor to maximize the area coverage and minimize energy consumption. We model this problem as an optimization problem. For this purpose, we define the following notation:
• n_q: the number of directional sensors that cover point q ∈ Q.
• x_{n,i,θ_s}: a binary variable that indicates whether sensor n is placed at location i with orientation θ_s.
• y_{i,θ_s,q}: a binary variable that indicates whether a sensor at location i with orientation θ_s covers control point q.
• E_j^move: the energy consumed by sensor j due to movement.
• E_j^rotate: the energy consumed by sensor j due to rotation.
The goal is to maximize the coverage and reduce the energy consumed by the movement and rotation of sensor nodes. To this end, we define the objective function as follows. Equation (4) ensures that only one working direction is assigned to each sensor node, and n_q is calculated by (5). Regarding the objective function, the goal of each sensor is to cover the maximum number of control points in the mission area while decreasing the amount of overlapping area. Using the harmonic term ∑_{j=1}^{n_q} 1/j as the worth of a point covered by n_q sensors, we achieve both goals: the first covering sensor contributes 1, and each additional sensor contributes a diminishing amount.
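The diminishing-returns effect of the harmonic term can be seen numerically; a small sketch (illustrative, not the paper's code):

```python
def point_worth(n_q):
    """Harmonic worth of a point covered by n_q sensors: sum_{j=1}^{n_q} 1/j."""
    return sum(1.0 / j for j in range(1, n_q + 1))

# The first covering sensor adds 1.0, the second only 0.5, the third ~0.33,
# so the objective rewards new coverage far more than redundant overlap:
gains = [round(point_worth(n) - point_worth(n - 1), 2) for n in (1, 2, 3)]
print(gains)  # [1.0, 0.5, 0.33]
```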
In the following subsections, we formulate the utility function for both binary and probabilistic sensor nodes.

Coverage problem as an exact potential game
In our coverage problem, we are concerned with devising motion and orientation laws for repositioning a finite number of mobile directional sensor nodes so that their converged positions correspond to a deployment with desirable coverage performance. In this section, we formulate this problem as a restricted exact potential game G_mc := ⟨V, A, U_mc, F_mc⟩. A brief review of game theory is provided in the Appendix. Here, V is the set of sensor nodes (the players), A is the set of action profiles, where an action a_i = (l_i, θ_i) specifies a location and a working direction, U_mc is the set of utility functions, and F_mc(a_i) ⊆ A_i is the restricted action set: the set of feasible locations that a sensor taking action a_i can move to, with any direction θ_i.

Utility function using binary sensing model
In this section, the directional sensors follow the binary sensing model. As depicted in Fig. 1, the binary sensing model for sensor s_i is defined as:

C_q(a_i) = 1, if ‖d⃗‖ ≤ R and q lies within the angle of view φ_f of s_i; C_q(a_i) = 0, otherwise.

In fact, C_q(a_i) indicates that point q is covered by sensor s_i taking action a_i if the distance from s_i to q is less than the sensing radius R and q is within the viewing angle φ_f of s_i.
Given an action profile a = (a_1, a_2, …, a_N), let D(a_i) be the set of points that s_i can cover, and n_q(a) be the number of sensors covering point q. The profit of covering point q is shared equally by all the sensor nodes that cover it, so the benefit that sensor s_i obtains is

B_i(a) = ∑_{q ∈ D(a_i)} 1/n_q(a).

Due to the energy constraints of sensor networks, we associate an energy cost with the use of each sensor. The energy consumption of sensor s_i due to movement is

E_i^move(a_i) = K_i ‖l_i − l'_i‖,

where K_i > 0 is a coefficient, and l_i and l'_i refer to the present and previous sensor locations, respectively. The energy consumption of sensor s_i due to rotation is

E_i^rotate(a_i) = K'_i |θ_i − θ'_i|,

where K'_i > 0 is a coefficient, and θ_i and θ'_i refer to the present and previous sensor orientations, respectively. Therefore, the utility function of sensor s_i, which captures the above sensing/energy tradeoff under the multiagent action a, is

U_i(a) = ∑_{q ∈ D(a_i)} 1/n_q(a) − E_i^move(a_i) − E_i^rotate(a_i).

The following lemma shows that the defined game is a potential game.
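The utility computation for one sensor can be sketched as follows; the coefficient values and the `n_q` dictionary layout are illustrative assumptions, not the paper's settings:

```python
import math

def utility(covered_points, n_q, loc, prev_loc, theta, prev_theta,
            K=0.01, Kp=0.005):
    """U_i(a): equally shared coverage benefit minus movement and rotation costs.
    covered_points: points in D(a_i); n_q: dict mapping a point to the number
    of sensors covering it; K, Kp: illustrative energy coefficients."""
    benefit = sum(1.0 / n_q[p] for p in covered_points)
    e_move = K * math.dist(loc, prev_loc)       # cost proportional to distance moved
    e_rot = Kp * abs(theta - prev_theta)        # cost proportional to angle turned
    return benefit - e_move - e_rot

nq = {(1, 1): 1, (1, 2): 2}   # one exclusively covered point, one shared point
u = utility([(1, 1), (1, 2)], nq, (0, 0), (3, 4), math.pi / 2, 0.0)
print(round(u, 4))  # 1.5 - 0.01*5 - 0.005*(pi/2) = 1.4421
```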

Lemma 1.
The strategic game G_mc := ⟨V, A, U_mc, F_mc⟩ is an exact potential game with the potential function

Φ(a) = ∑_{q ∈ Q} ∑_{j=1}^{n_q(a)} 1/j − ∑_{i=1}^{N} (E_i^move(a_i) + E_i^rotate(a_i)).

Proof. The strategic game G is an exact potential game with potential function Φ: A → ℝ if for every player i ∈ N, every a_{−i} ∈ A_{−i}, and every pair of actions a_i, a'_i ∈ A_i,

U_i(a_i, a_{−i}) − U_i(a'_i, a_{−i}) = Φ(a_i, a_{−i}) − Φ(a'_i, a_{−i}).

It is proved that any action profile that maximizes the potential function is a Nash equilibrium [33].
We consider two action profiles a_i = (l_i, θ_i) and a'_i = (l'_i, θ'_i), and define Δ_1 = D(a_i)\D(a'_i) and Δ_2 = D(a'_i)\D(a_i). For each q ∈ Δ_1, n_q(a) = n_q(a') + 1, and for each q ∈ Δ_2, n_q(a') = n_q(a) + 1. Thus, we have:

Utility function using probabilistic sensing model
Although coverage in wireless sensor networks using a binary sensing model is very common because of its simplicity, sensor detections are usually imprecise, and the sensing ability may decrease with increasing distance from the sensor. The probabilistic sensing model is a more realistic extension of the binary sensing model, because sensor design, the sensing process, and environmental conditions are all stochastic in nature; e.g., noise and interference in the environment can be modeled by stochastic processes.
In probabilistic sensing models, the probability of point detection is a decreasing function of the sensing distance. As shown in Fig. 2, the probabilistic sensing area in DSNs can be denoted by the tuple (S, R, R_e, φ_f, θ_s, v⃗). As before, S is the location coordinate on a two-dimensional plane; R_e indicates the uncertain sensing range, and R − R_e specifies the maximum certain sensing range. The point q is probabilistically covered if the Euclidean distance d between q and S is in the range (R − R_e, R + R_e). The horizontal orientation of the sensor and the angle of view are indicated by θ_s ∈ (0, 2π] and φ_f, respectively, and v⃗ is a unit vector that defines the orientation of the directional sensor. In DSNs, the probabilistic sensing model for sensor s_i assigns detection probability 1 when d ≤ R − R_e, a probability that decays with distance when R − R_e < d ≤ R + R_e (with q inside the angle of view), and 0 otherwise, where λ and β are parameters that shape the detection probability and vary for different types of sensors.
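As an illustration, such a model can be written in the widely used Elfes-style exponential-decay form; the paper's exact expression is its Eq. (15), so the decay formula below is an assumption, with λ and β values borrowed from the later simulations:

```python
import math

def detect_prob(d, in_view, R=50, Re=15, lam=0.9, beta=0.1):
    """Probabilistic directional sensing sketch (assumed Elfes-style decay):
    certain detection up to R - Re, exponential decay over (R - Re, R + Re],
    zero beyond R + Re or outside the angle of view."""
    if not in_view:
        return 0.0
    if d <= R - Re:
        return 1.0                             # inside the certain sensing range
    if d <= R + Re:
        alpha = d - (R - Re)                   # depth into the uncertain band
        return math.exp(-lam * alpha ** beta)  # assumed decay form
    return 0.0

print(detect_prob(30, True))            # 1.0 (inside the certain range)
print(detect_prob(70, True))            # 0.0 (beyond R + Re)
print(0 < detect_prob(50, True) < 1)    # True (inside the uncertain band)
```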

Definition (Probabilistic Coverage):
The desired area is covered by n sensors with probability P_c if, for each point q in the area, the following holds:

P(q) = 1 − ∏_{i=1}^{n} (1 − C_q(a_i)) ≥ P_c.
According to Eq. (15), C_q(a_i) is the probability of detecting point q by sensor s_i, and 1 − C_q(a_i) is the probability that point q is not covered by s_i. Since the probabilistic coverage of a point by a sensor node is independent of the other sensors, the term ∏_{i=1}^{n} (1 − C_q(a_i)) is the probability that point q is not covered by any sensor. Hence, the expression 1 − ∏_{i=1}^{n} (1 − C_q(a_i)) is the probability that point q is covered by at least one sensor.
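The collective coverage probability is straightforward to compute from the individual detection probabilities; a minimal sketch:

```python
def collective_coverage(probs):
    """P(q) = 1 - prod_i (1 - C_q(a_i)): probability that point q is
    detected by at least one of the sensors, given independent detections."""
    miss = 1.0
    for p in probs:
        miss *= (1.0 - p)   # probability that every sensor misses q
    return 1.0 - miss

# Three sensors, each detecting q independently with probability 0.6:
print(round(collective_coverage([0.6, 0.6, 0.6]), 3))  # 0.936
# Check against a confidence threshold P_c = 0.9:
print(collective_coverage([0.6, 0.6, 0.6]) >= 0.9)     # True
```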
In this problem, the goal is to move and rotate the directional sensor nodes so that the coverage probability of each point q ∈ Q is greater than or equal to P_c. We define the utility function of player i in the probabilistic sensing model as follows: where w_q(a_i) is the contribution of directional sensor s_i to detecting point q under action a_i. E_i^move(a_i) and E_i^rotate(a_i) are defined in Eqs. (9) and (10). The following lemma shows that our defined game is a potential game.

Lemma 2.
The strategic game G_mc := ⟨V, A, U_mc, F_mc⟩ is an exact potential game with a potential function defined piecewise: one expression applies at points where P(q) ≥ P_c, and another otherwise.
Proof: For any agent i = 1, …, N and two consecutive action profiles a_i = (l_i, θ_i) and a'_i = (l'_i, θ'_i), define Δ_1 = D(a_i)\D(a'_i) and Δ_2 = D(a'_i)\D(a_i). For each q ∈ Δ_1, n_q(a) = n_q(a') + 1, and for each q ∈ Δ_2, n_q(a') = n_q(a) + 1.
Thus, we have:

Distributed learning algorithm
In the game theoretical formulation, the sensor nodes play the coverage game G repeatedly, starting from a desired initial configuration. At each time step t ∈ {0, 1, 2, …}, one sensor s_i is randomly selected and plays an action a_i(t), while the other sensors repeat their actions, i.e., a_{−i}(t) = a_{−i}(t − 1). The role of the learning algorithm is to provide an action-updating rule so that the sensor actions converge to a Nash equilibrium.
In order to maximize the potential function and achieve Nash equilibrium, log-linear learning is presented in [34], in which only one player updates its action at each iteration. In log-linear learning, sensors can select suboptimal actions with low probability. Therefore, sensors are allowed to explore the mission space, which results in finding optimal actions and achieving Nash equilibrium. Log-linear learning assumes that players have a constant action set. In general, convergence to the potential maximizer is not guaranteed when the actions available to a player depend on the player's state, i.e., when each player must choose its next action a_i(t + 1) from a set A_i^c(a_i(t)) that depends on its current action a_i(t). A modified version of log-linear learning, called binary log-linear learning, is introduced for the problem of constrained action sets in [35]. The algorithm works as follows. At each time t > 0, one agent i ∈ V is uniformly selected and allowed to alter its current action; all other agents repeat their current actions at the ensuing time step, i.e., a_{−i}(t) = a_{−i}(t − 1). At time t, player i selects one trial action â_i uniformly from its constrained action set A_i^c(a_i(t − 1)) ⊂ A_i. Agent i then plays a strategy p_i(t) ∈ Δ(A_i), where Δ(A_i) denotes the set of probability distributions over the finite set A_i; p_i(t) assigns probability

e^{U_i(â_i, a_{−i}(t−1))/τ} / (e^{U_i(a_i(t−1), a_{−i}(t−1))/τ} + e^{U_i(â_i, a_{−i}(t−1))/τ})

to the trial action â_i and the complementary probability to the current action a_i(t − 1), for temperature τ > 0. The temperature τ determines how likely agent i is to select a suboptimal action. As τ → ∞, agent i selects between the two actions with equal probability; as τ → 0, agent i selects a best response to the action profile a_{−i}(t − 1). In the case of a non-unique best response, agent i selects a best response uniformly at random.
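The switching probability between the current and trial action is a two-point softmax; a sketch (function names are illustrative):

```python
import math

def trial_accept_prob(u_current, u_trial, tau):
    """Binary log-linear rule: probability of switching to the trial action,
    exp(u_trial/tau) / (exp(u_current/tau) + exp(u_trial/tau))."""
    m = max(u_current, u_trial)          # subtract the max for numerical stability
    e_cur = math.exp((u_current - m) / tau)
    e_tri = math.exp((u_trial - m) / tau)
    return e_tri / (e_cur + e_tri)

# Low temperature: almost always adopt the better trial action
print(round(trial_accept_prob(1.0, 1.5, 0.1), 3))   # 0.993
# High temperature: near-uniform exploration
print(round(trial_accept_prob(1.0, 1.5, 10.0), 3))  # ≈ 0.512
```

Subtracting the maximum before exponentiating keeps the computation stable at small τ, where the raw exponents would otherwise overflow.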
Therefore, in the binary log-linear algorithm, the action selection probability is the decisive factor.
Binary log-linear learning converges to the set of potential-maximizing action profiles if the constrained action sets satisfy the following two properties.

Property 1 (Feasibility) For any agent s_i and any action pair a_i(0), a_i(k) ∈ A_i, there exists a sequence of actions a_i(0), …, a_i(k) such that a_i(t) ∈ A_i^c(a_i(t − 1)) for all t ∈ {1, 2, …, k}.

Property 2 (Reversibility) For any agent s_i and any action pair a_i, a'_i ∈ A_i, a'_i ∈ A_i^c(a_i) if and only if a_i ∈ A_i^c(a'_i).
We can easily show that these properties hold under our problem settings. In binary log-linear learning, only one sensor is randomly selected at each time step. The selected sensor, assuming the other sensors are stationary, selects a trial action at random from its constrained action set. The sensor computes the hypothetical utility of playing the trial action and updates its action depending on the current and hypothetical utilities. The proposed BLLL algorithm, which is based on the general binary log-linear learning algorithm [35], is as follows:
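The steps above can be sketched as a loop; the utility and constrained-action-set callables are placeholders for the paper's definitions, so this is an illustrative skeleton rather than the authors' implementation:

```python
import math
import random

def blll(actions, utility, constrained_set, tau=0.1, steps=2000, rng=None):
    """Binary log-linear learning sketch: at each step, one uniformly chosen
    agent compares its current action with one trial action drawn from its
    constrained action set, and switches with log-linear probability.
    actions: list of current actions (one per agent);
    utility(i, actions): payoff of agent i under the joint action profile;
    constrained_set(i, a): list of actions reachable from action a."""
    rng = rng or random.Random(0)
    for _ in range(steps):
        i = rng.randrange(len(actions))                 # uniformly select one agent
        trial = rng.choice(constrained_set(i, actions[i]))
        u_cur = utility(i, actions)
        saved = actions[i]
        actions[i] = trial
        u_tri = utility(i, actions)                     # hypothetical utility
        # probability of adopting the trial action: 1 / (1 + e^{(u_cur - u_tri)/tau})
        z = min((u_cur - u_tri) / tau, 700.0)           # clamp to avoid overflow
        if rng.random() >= 1.0 / (1.0 + math.exp(z)):
            actions[i] = saved                          # keep the old action
    return actions

# Toy check: two agents on a 5-cell line, each paid 1 when they occupy distinct cells.
util = lambda i, a: 1.0 if a[0] != a[1] else 0.0
moves = lambda i, a: [max(0, a - 1), a, min(4, a + 1)]
print(blll([2, 2], util, moves, steps=500))
```

With a low temperature the two toy agents separate with high probability, mirroring how sensors in the coverage game spread out to reduce overlap.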

Simulation results
In this section, we present the simulation results of the proposed algorithms. To evaluate their performance, several experiments have been performed in MATLAB. The simulation results are compared with those of the VDA [36], DVSA [37], EDA-I and EDA-II [38], CPP [39], and RND algorithms, where RND denotes the initial configuration after random deployment of the sensors (random positions and random directions). The algorithms are compared with respect to coverage, measured as the fraction of the area covered by the deployed sensors.

Experiment 1. In the first experiment, we consider an example of applying BCGT in a mobile directional sensor network. We consider a 15 × 15 square sensor field in which 20 directional sensors are initially located in the middle of the area. The sensing range of each sensor is 3, and the angle of view of each sensor is 90°. We have chosen τ = 0.1 in the learning algorithm. Figure 3 demonstrates the stabilized positions of the nodes at iteration 2000.
The evolution of the potential function over the iterations is shown in Fig. 4. This figure shows that the sensor nodes increase their utilities, which corresponds to exploring better locations and orientations. It remains to show that maximizing the potential function leads to maximizing the coverage of the whole area: Fig. 5 displays that the area coverage increases over time.

Experiment 2. The comparison of the proposed BCGT algorithm with the existing deployment algorithms is shown in Fig. 6. As shown in Fig. 6, BCGT performs better than the other algorithms in terms of the coverage criterion. Figure 7 compares the behavior of BCGT with the existing algorithms for N = 200, R = 50, and φ_f = 60°, 90°, 120°, and 180°. According to Fig. 7, BCGT again performs better than the existing algorithms. From these comparisons, we conclude that the proposed BCGT performs very well under different numbers of sensors and angles of view.

Experiment 3. In order to evaluate probabilistic coverage using the proposed PCGT algorithm, we consider a 500 × 500 area. The sensors are probabilistic with parameters (R, R_e, λ, β) = (50, 15, 0.9, 0.1) and a 90° viewing angle. We consider confidence probabilities P_c of 80%, 85%, 90%, and 95%. The sensors are randomly placed in the area. Figure 8 shows the simulation results of the PCGT algorithm for N = 100 to 600 sensors and different confidence probabilities. The results show that at a higher confidence probability P_c, more sensor nodes are needed to fully cover the area.

Experiment 4. In the proposed PCGT algorithm, λ and β are two important parameters in the utility function that determine the actions of the sensor nodes. Therefore, we select two sets of parameters to examine their effect on coverage. From Fig. 9, it can be concluded that the coverage percentage of the area increases with the number of sensors.
Similarly, with increasing λ and decreasing β, the probability of sensor coverage and consequently the coverage percentage of the area increases.
Experiment 5. In this experiment, we compare the performance of the proposed PCGT algorithm with the CPP algorithm. We consider a 500 × 500 area. The sensors are probabilistic with parameters (R, R_e, λ, β) = (50, 15, 0.9, 0.1) and a 120° viewing angle. We consider confidence probabilities P_c of 80%, 85%, 90%, and 95%. The sensors are randomly placed in the area. Figure 10 shows the required number of sensor nodes for each specified P_c. The simulation results show that, for the different confidence probabilities, PCGT achieves a good coverage rate with a minimum number of sensor nodes.

Conclusion
In this paper, we proposed a game theory-based algorithm for deploying and orienting mobile directional sensor nodes, for both binary and probabilistic sensing models, to maximize area coverage. An appropriate utility function was designed for each player to improve coverage quality and reduce energy consumption. We then proved that the designed game is a potential game and used the binary log-linear learning algorithm to converge the game to a Nash equilibrium. The simulation results showed the superiority of our proposed algorithm over previous approaches in terms of coverage rate. In future work, extensions to more realistic sensing models that reflect the anisotropic properties of WSNs will be considered. Network connectivity is another crucial factor that must be taken into account in future solutions. In the aforementioned sensing models, the sensing ability depends only on the distance between the sensor node and the target point; however, the sensing ability of a sensor is non-uniform because of hardware configuration and software implementation. In addition, obstructions in the area, such as buildings and power stations, cause extra power loss and more variation in the received signal power. Under these conditions, a shadow-fading sensing model would be more realistic.

The existence of an NE in an exact potential game is guaranteed [33]. It then follows that any restricted exact potential game has at least one restricted NE.