A Robust Fault-Tolerance Scheme with Coverage Preservation for Planar Topology Based WSN

Maintaining a prolonged service lifetime and adequate quality of sensing coverage are the key challenges in constructing Wireless Sensor Network (WSN) based applications. As such networks usually operate in inhospitable and hostile environments, failures are unavoidable and providing resilience is a necessity. However, it is challenging to satisfy the conflicting goals of enhancing energy efficiency and fault tolerance simultaneously. Fault tolerance is a significant requirement in WSN design. It is crucial to detect failures in advance and take the necessary measures to maintain durable and efficient functioning of the network. Generally, in existing face structured WSNs, node faults and failures can induce the formation of coverage holes, disrupt the face structure and consequently curtail application performance. The coverage quality affects the monitoring effectiveness of tracking applications, e.g., moving target tracking. Moreover, node failures can cause the network to be partitioned, further reducing tracking accuracy. In this paper, we propose a robust fault-tolerance scheme with coverage preservation using a face structured WSN topology ($F_{CAFT}$). The key objective of the proposed $F_{CAFT}$ scheme is to sustain the performance of the network by healing faults in a timely manner, thereby enhancing the durability and reliability of the WSN. Simulation results and comparison with existing methods reveal that $F_{CAFT}$ enhances the service lifetime of the WSN by about 14% and sustains about 96% coverage even when the failure rate exceeds 20%, which is a necessity for critical monitoring and tracking applications of WSNs.


Introduction
Wireless Sensor Network (WSN) is a keystone of the Internet of Things (IoT) technology with diverse applications to accomplish precise real-time monitoring of events [1,2]. Object detection, monitoring and tracking are the crucial tasks in most applications of IoT/WSNs, and improving the service life of such applications through energy conservation, effective resource management and resilience to failures is challenging and of great significance [3][4][5]. As such networks are transforming various aspects of our life such as wildlife monitoring, health monitoring, habitat tracking, military operations, and search and rescue, it is worthwhile to have a reliable and resilient network that addresses the demanding issues of improving energy efficiency and service life while providing sufficient coverage and fault tolerance simultaneously [6,7].
On one hand, the future potential of WSNs enabling useful applications to the real world is practically limitless; but on the other hand, the design is affected by several constraints [2][3][4][6][7][8]. One of the main objectives to satisfy while designing a WSN is to maintain the WSN alive and operational by enhancing the robustness and reliability of the network [9,10]. A key aspect in this context is the way the WSN is formed and sustained. In fact, the network structure is mostly defined according to the application context and environment. The nodes in the WSN must self-organize to deliver the service as long as possible [11,12]. Unfortunately, WSNs due to their inherent characteristics and deployment in hostile environment, are vulnerable to frequent failures that include various reasons like energy depletion, link failure, and so on [6-8, 10, 13]. Hence, fault tolerance is a critical requirement while designing WSN based applications. Consider an event detection application, when a detecting node fails, or the monitoring report is not received due to several reasons, e.g., node/ link failure, the performance and detection accuracy can be greatly affected [14]. Moreover, node failure is a possible cause of sensing coverage loss as it creates coverage holes and can affect the connectivity between nodes in the WSN, which in turn can reduce the monitoring effectiveness of the WSN [15][16][17]. In the worst case, it can cause partitioning of the network, distorting the network structure as well as the information flow, which may put an end to the service life of the WSNs. Therefore, sustaining the performance of the network by timely detection and healing of the failures in the network is of great importance for the efficient functioning of applications, for instance: critical surveillance, monitoring and tracking applications of WSNs [10,[13][14][15]18].
Although fault tolerance in WSNs has been investigated extensively in various aspects by the research community, not much work has been done on planar/face structured WSNs. A fault tolerance mechanism basically follows the stages of fault detection and diagnosis, followed by restoration/repair. In some existing approaches, fault tolerance is managed and accomplished using add-on modules or other evaluation tools, and requires additional hardware [19]. Moreover, many approaches for achieving fault tolerance in WSNs are implemented and controlled centrally, and are also called centralized or sink based approaches [20][21][22]. These approaches require the nodes to send messages to the sink periodically, require a high count of active nodes for monitoring the health of nodes, and perform the monitoring task separately from the application. Such schemes are not practically feasible for resource-constrained, large-scale event-driven WSNs, because event-driven WSNs pose special challenges in this regard [10,23]. Critical monitoring and tracking applications of IoT based WSNs are usually delay and security sensitive and have real-time requirements for delivering the sensed data [23,24]. Failure to satisfy these requirements can have serious consequences. Hence, the fault monitoring task should go hand in hand with the normal functioning of the application through effective network configuration and management of resources, as many such applications demand fast detection and real-time monitoring of events using the underlying WSN.
In general, existing planar topology based WSNs built by generating a planarized graph, such as the Relative Neighborhood Graph (RNG), Voronoi diagram, Delaunay triangulation, Gabriel Graph (GG) and some cross edge removal approaches, do not provide any fault tolerance support on their own. Node failures can cause network partitioning and reduce the application performance [25][26][27]. Some restoration schemes do exist, but they do not effectively consider the network coverage, connectivity and topology quality, which are also crucial to a WSN [6,10,28-30]. In real-time and sensitive applications, this can cause target loss and, consequently, a significant amount of energy is consumed in recovering the missed target. The condition worsens as time progresses. Node/link failures and faults create coverage holes, distort the face structure, eventually cause the WSN to be split into disconnected partitions, and have a negative effect on the service life and application performance. Most research on face structured WSNs is carried out using existing planar topologies and tries to recover from such failures through maintenance performed locally by merging the adjacent faces [10,28-30]. If the target enters and stays in a coverage hole area, it will remain untraced, and hence it is not possible to get any information about the target unless it leaves the hole region and is sensed by a node. Rapid recovery from failures and restoration of coverage and connectivity are important to prevent partitioning of the network and to keep the WSN performing its activity efficiently. Deploying additional nodes to replace failing nodes is a slow process that consumes energy and requires human intervention, and therefore is not well suited for WSNs in harsh and hostile environments [13,14]. Therefore, the network should be self-healing using the existing alive nodes in the network.
Moreover, considering the resource constraints and real-time requirements in WSNs, distributed and energy-efficient methods have become more attractive.
With this in mind, we propose a robust fault-tolerance scheme with coverage preservation using a face structured WSN topology ($F_{CAFT}$). We consider a computational geometry based planar/face structured WSN, which provides effective coverage using a minimized set of working nodes to save energy and extend the service life of the WSN [23]. The face topology creation is performed in a distributed manner. A set of selected working nodes is arranged into faces (ANiT nodes) while the remaining nodes are retained in the sleep state (non-ANiT nodes), so as to minimize the redundancies that may lead to increased energy utilization and cost. However, when a node fails, its edges break and a hole may be created. The main objective of the proposed $F_{CAFT}$ scheme is to sustain the performance of the network by healing faults in a timely manner using the non-ANiT nodes, to ensure robustness and resilience of the WSN. The use of non-ANiT nodes to replace a failed node permits the WSN to self-heal and keep functioning as long as possible. This helps to sustain the quality of coverage by preventing hole creation and to preserve the network structure by restoring connectivity in a distributed and energy efficient way. These are essential requirements for applications like critical target tracking, where, e.g., an enemy could otherwise stay in a hole region without being traced by the nodes in the WSN. The working of $F_{CAFT}$ includes four main phases, namely the initialization, diagnosis, healing and restoration phases, which correspond respectively to face structure construction, node/link failure detection, selection of the most appropriate non-ANiT node as substitute, and recovery by repairing and restoring the face structure of the network. The main contributions of this paper are as follows:
• We investigate the fault tolerance capability of face based WSN and evaluate the robustness of the network.
• We propose a new algorithm for fault tolerance with coverage preservation in face based WSN through the selection of a suitable substitute node to replace the failing node.
• We present a distributed algorithm for failure recovery by repairing and restoring the face structure of the network.
• We evaluate the performance of $F_{CAFT}$ through simulations. The comparison with the existing techniques [10,13] shows the effectiveness of $F_{CAFT}$ in handling failures by restoring the coverage and connectivity of the face structured WSN.
We organize the remainder of this paper as follows: Sect. 2 discusses the related research. The proposed $F_{CAFT}$ scheme is presented in Sect. 3. Sect. 4 discusses the simulation results and performance evaluation of $F_{CAFT}$. Lastly, Sect. 5 concludes the paper.

Related Research
In this section, we provide brief background information on CAFT [23] and then present the related research on fault tolerance in face structured WSNs. The background information is included in Sect. 2.1 and related research on the fault tolerance aspects is given in Sect. 2.2.

Background Information: CAFT
The topology of the network, which defines the organization of nodes in the WSN, has considerable impact on the performance and efficacy of the system. Therefore, it is crucial to have energy efficient and robust schemes that allow proper resource management to guarantee the performance of applications that involve real-time monitoring and detection of events using the underlying WSN [31]. CAFT incorporates concepts of graph theory and computational geometry to construct a new planar topology for WSNs in a distributed manner [23]. The key focus is to meet the connectivity and coverage requirements using a minimized set of nodes organized as faces, in contrast to existing face structured WSNs where all the nodes engage in topology construction and incur more cost in terms of energy, storage, communication, computation, and time. Initially, all the nodes of the WSN are in the active state to prepare and collaborate for the topology creation procedure, and some nodes are then put to sleep, resulting in a planar topology that includes only a reduced count of nodes. The edge creation process depends on the distance and connectivity measures between a node and its neighbors, which are supposed to be the vertices of the adjacent faces (or polygons), to satisfy the coverage and connectivity needs. Ultimately, only a subset of nodes follows the duty cycle mode, while the rest of the nodes stay in sleep mode, leading to active/sleep nodes in the face topology (ANiT/non-ANiT nodes). The nodes that constitute the face topology are called ANiT nodes and the retained sleeping nodes are regarded as non-ANiT nodes. An ANiT node is not put to sleep during the execution of the topology creation process, while a non-ANiT node remains in the sleep state and does not engage in any tasks unless it receives a request to wake up. However, node/link failures can impair the network structure and coverage. In the current work ($F_{CAFT}$), we exploit the existing sleep (non-ANiT) nodes and use them as substitute nodes to replace the failing nodes, contributing towards a substantial improvement in network performance and service life.

Existing Works on Fault Tolerance
Despite the potential and limitless future applications of IoT and WSNs, such networks have some inherent restrictions imposed because of the constrained resources, such as limited power source, reduced bandwidth, low computational ability etc. [11,12]. The major consequences of node or link failures are of great impact as they affect the monitoring efficiency and communication between nodes. The reliability of WSN can be affected by faults that may happen due to numerous reasons such as depletion of energy, environmental hazards or defective hardware [32][33][34]. An early detection of such faults is crucial for the effective functioning of the WSN. Hence, fault-tolerance of a WSN is a general matter of interest in various application fields, and requires increased attention from researchers [16,17]. The strategies in these researches may vary significantly, but essentially within the scope of constructing fault-tolerant WSN structure, resilience and recovery from failures [32,33,35,36]. In recent years, numerous researches have explored various aspects of fault tolerance and management in WSNs, but there still remain concerns to be addressed as not much work has been dedicated to deal with face structured WSNs [37][38][39].
Fault management techniques are classified in various ways. One such classification is centralized and distributed approaches [33,36]. In the former approach, the fault management task is performed by a sink or base station, while the latter allows local detection and recovery from failures. In addition to clustering and tree approaches, the WSN area is divided into regions, cells, grids, and so on to follow a target in a distributed manner [30,33,35,37,40,41]. When failures develop in monitoring nodes, connectivity or coverage problems, or physical impediments appear during tracking, addressing them all at once becomes increasingly difficult. When resource constraints and real-time needs in WSNs are considered, distributed tracking solutions are more appealing. Prior distributed approaches alleviate the issues commonly observed in centralized schemes. The usage of clusters or trees provides real-time processing and collaboration between nodes, as well as decreased data communication during fault management. However, there exist distributed algorithms in which the tracking operation is not evenly dispersed, necessitating central interactions [22,31,35,42]. The main limitations of these works is that they require a large number of active nodes for fault management, and such operations are not performed in conjunction with the normal working of the application. Moreover, in case of dense deployment, the nodes lie close to each other, causing signal interference leading to irregular signal patterns that can affect the accuracy of running application, and can cause issues related to redundancy and radio contention. Consequently, the network has more energy consumption, which can reduce the service life of the WSN. However, we enable the nodes to provide robustness to tolerate the faults in a distributed manner that involves only a reduced number of active nodes.
In [36], a local self-healing fault tolerance scheme is proposed. When a node's battery level falls below a threshold, the node is declared as sleeping and is removed, and the topology is updated. However, as node failures increase, the coverage and connectivity of the network become impaired. Moreover, the chance of a single point of failure is higher: if a head node fails, the fault diagnosis of some other node is halted. As a result, the entire network is at risk of dying prematurely. In [37], a fault detection method based on majority neighbor voting is presented. In this approach, the detection accuracy degrades as the number of node failures increases, and no mechanism is adopted for compensating the coverage loss caused by the failed nodes. In [6], the fault tolerance issue is addressed to provide robustness against node failures caused by battery depletion, using a redundant sleeping node as the replacement node. Even though the authors claim that the technique compensates for coverage loss caused by failing nodes, the paper does not provide enough discussion on the effect of their algorithm on network coverage and application performance. The works in [42,43] achieve failure recovery by using a backup node to replace the faulty node, while those in [10,28-30] perform local maintenance by merging two or more adjacent faces into a single face. However, in these schemes, multiple data exchanges occur between each sensor node and its neighboring nodes in all the adjacent faces during the fault detection process. As a result, substantial additional energy is spent on identifying problems, which significantly reduces the network lifetime, and no mechanism is adopted for compensating the coverage loss caused by the failed nodes.
The work in [13] deals with the faults by allowing the nodes to be turned off arbitrarily and maintains the tracking form on the surviving nodes. It also permits insertion of extra nodes into the network and then locally refines the planar graph and the tracking form to accommodate the inserted node. However, deploying extra nodes instead of failing nodes requires human interference and is a time and energy consuming process, which is infeasible in harsh and challenging environments.
As mentioned above, not much work has been done to handle the fault tolerance aspects of face topology based WSNs, and related works make use of an existing planar topology constructed using RNG, GG, or some cross edge removal approaches [28-31,35], which do not have fault tolerance capabilities on their own and suffer from coverage and connectivity issues with increased node failures, resulting in degraded application performance.
Based on the reviewed literature, the proposed paper is found to vary from the previous papers in different aspects. Generally, the faces constructed initially using existing WSN planarization schemes may not be preserved during tracking and the network performance may degrade over time because of faulty nodes present in the network. Node/link failures and faults create coverage holes, impair the face structure, eventually cause the WSN to be partitioned, and adversely degrade the application performance. When the occurrence of a node or link fault is detected in the network, it is of high significance to restore the coverage as well as connectivity of the WSN topology. E.g., in critical target tracking applications, if the network coverage and connectivity is not successfully restored, it can have serious consequences such as target miss and loss of tracking. A significant amount of energy gets wasted in recovering a lost target. Usually, a target recover mechanism aims to recover the target by gradually incrementing the number of active nodes associated with the adjacent faces surrounding the target lost location [30,[44][45][46]. If the target detection is still unsuccessful, the search space is enlarged by activating more surrounding faces. If the target is still not relocated, the WSN returns to the initial state where all the nodes in the WSN are activated for relocating the target [45]. Past researches on fault tolerance in face structured WSNs are mostly carried out using existing planar topologies where they recover and restore the affected faces using the remaining active nodes, where there is no mechanism for compensating the coverage loss caused by the failed nodes. As a result, when more nodes start to fail, the face structure gets destroyed and quality of sensing coverage gets deteriorated, resulting in degraded performance of the WSN and reduced network lifetime [10,13,[28][29][30]. 
In the worst case, it can cause partitioning of the network, distorting the network structure and data communication, which may put an end to the service life of the WSNs. Therefore, it is necessary that a fault tolerant face based WSN should sustain coverage and connectivity among nodes in an energy efficient manner to preserve the network structure and prolong the service durability of the WSN. While fault tolerance in face based WSN has been investigated by researchers to some extent, much work remains to be done to address the aforementioned concerns.
The above observations motivated us to develop a robust fault-tolerance scheme that self-heal to enhance the durability and reliability of face based WSN. The coverage is preserved and the connectivity is restored in an energy efficient distributed manner, which is essential especially for critical tracking applications of WSN. The sleep (non-ANiT) nodes are used for ensuring failure resilience for better performance and prolonged functioning of the WSN.

Proposed Fault Tolerance Scheme with Coverage: $F_{CAFT}$
In this section, we first provide an overview of the proposed $F_{CAFT}$ scheme for fault tolerance in a planar topology based WSN, and then explain the working of $F_{CAFT}$ in detail. The notations used in this work are summarized in Table 1.
Overview: To start with, the deployed nodes are arranged to form a planar WSN topology following the topology construction process of CAFT [23]. The generated planarized graph contains a reduced count of nodes, called ANiT nodes, and at the same time ensures coverage through the selection of appropriate nodes. The retained sleep (non-ANiT) nodes are used by $F_{CAFT}$ for healing the faults in the network, which in turn ensures robustness and durability in the functioning of the WSN by restoring the coverage and connectivity as long as possible. Figure 1 gives an overview of the responsibilities and goals achieved by blending CAFT and $F_{CAFT}$ together. CAFT constructs a new planar topology for the WSN in a distributed and energy efficient manner. The connectivity and coverage requirements are satisfied using a minimized set of nodes organized as faces, in contrast to existing face structured WSNs where all the nodes engage in topology construction and incur more cost in terms of energy, storage, communication, computation, and time. $F_{CAFT}$ exploits the existing sleep (non-ANiT) nodes to provide fault tolerance with coverage preservation, with the aim of improving the performance and service life of the WSN. The scheme provides distributed failure resilience through the selection of a suitable substitute node. The failing node is replaced by the selected non-ANiT substitute node and the face structure is then restored. This ensures the quality of coverage by preventing hole creation and preserves the network structure by restoring the connectivity in a distributed and energy efficient way.
Robustness to faults and failures: $F_{CAFT}$ addresses the conflicting concerns of enhancing energy efficiency and fault tolerance simultaneously and offers an approach to deliver robustness in face based WSNs by maintaining adequate quality of sensing coverage and a prolonged service lifetime.
Node/link failure diagnosis and repair functions are analyzed in terms of a target tracking application. Figure 2 gives the workflow of $F_{CAFT}$. There are four main phases in the working of $F_{CAFT}$: (i) Initialization phase: node deployment and construction of the face structured WSN are performed in this phase; (ii) Diagnosis phase: this phase corresponds to node/link failure detection; (iii) Healing phase: the most appropriate non-ANiT node is selected as the substitute using a selection function, in order to heal the failure; (iv) Restoration phase: this phase performs the recovery by repairing and restoring the face structure of the WSN.

Initialization Phase
We consider a WSN with a homogeneous set of nodes deployed in the 2D area of interest, which is then converted into a planar face structured WSN using CAFT [23] to prepare for an intended application, e.g., mobile target tracking. The working set of nodes, called ANiT nodes, which are part of the face structure, perform the activities according to the application scenario (see Fig. 3 for an illustration). An ANiT node follows a duty cycle to conserve energy: active, when it engages in associated tasks; awakening, when it wakes for a short span of time; inactive or sleep, when its involvement is not needed. The nodes that are put to sleep during the formation of the face structure, called non-ANiT nodes, do not perform any activity unless a wake-up request is received. When the process has completed, the WSN with non-overlapping polygons (or faces) is ready to run the tasks of the designed application, which we consider here to be a target monitoring or tracking application.
A target, assumed to be present in any of the faces, is surrounded by the edges of the face within which it currently resides. When the target comes within the sensing range of a node, the node can detect the target. Among the nodes that initially detect the target within the face (called the current face), the nearest node is called the beacon node, which acts as a coordinator and takes on the monitoring responsibilities within that face. When the target crosses an edge towards another face, the node nearest to the target takes the role of beacon for that face. All the links (edges) and nodes of the faces associated with the target need to be checked and monitored, meaning that the monitoring is performed in conjunction with the mobility of the target. See Fig. 3 for an example. We can see a set of faces, F = {F1, F2, F3, F4, F5, F6}, out of which F4 is the current face formed by the nodes {n5, n7, n9, n10}. The target entered F4 from F2 by crossing the edge (n5, n7) that is common to both F2 and F4. The node n5 is the current beacon node. As the mobile target is currently in F4, it is surrounded and trapped by the edges (n5, n7), (n7, n9), (n9, n10), (n10, n5) of the face nodes of F4. Therefore, all the face nodes and face edges around the target must be intact for robust operation, and the fault tolerance process should go hand in hand with the mobile target tracking process.
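The beacon rule described above (among the face nodes that detect the target, the nearest one becomes the beacon) can be illustrated with a minimal Python sketch. The function name `select_beacon`, the node coordinates and the sensing range are hypothetical values chosen for illustration; they are not taken from the paper.

```python
import math

def select_beacon(target, face_nodes, sensing_range):
    """Return the id of the detecting node nearest to the target, or None.

    face_nodes maps node id -> (x, y) position (hypothetical layout).
    A node "detects" the target when the target lies within sensing_range.
    """
    detecting = {}
    for node_id, pos in face_nodes.items():
        d = math.dist(pos, target)
        if d <= sensing_range:
            detecting[node_id] = d
    if not detecting:
        return None  # no face node currently senses the target
    return min(detecting, key=detecting.get)

# Assumed coordinates for the current face F4 = {n5, n7, n9, n10}.
f4 = {"n5": (0.0, 0.0), "n7": (10.0, 0.0), "n9": (10.0, 10.0), "n10": (0.0, 10.0)}
beacon = select_beacon((2.0, 1.0), f4, sensing_range=8.0)
print(beacon)
```

With this assumed layout only n5 lies within range of the target, so it is chosen as beacon, which matches the role n5 plays in the Fig. 3 example.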

Diagnosis Phase
Next comes the diagnosis phase. Failures may occur over time during the operation of the WSN. A node can fail due to several reasons such as faults in hardware, faults in software, and degradation with time. Energy depletion is one among the main reasons for node failure. A node failure can also be caused due to faults in its components, such as memory, processor etc. and its capabilities may get affected. For the reason that a node's service lifetime depends on its battery state, a node will have a weak function if its energy level drops below a threshold value. A link may fail due to several reasons such as harsh environmental conditions and the affected nodes may face inability to communicate. Fault or failure detection process can be done through node self diagnosis and collaborative diagnosis. Some of the faults can be identified by a node through self examination based detection, e.g., faults due to depletion of battery can be identified by a node through self-diagnosis. The residual energy of a node can be estimated by monitoring the current level of its battery. Hence, a fault tolerance mechanism can be triggered before the complete death of the node.
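As a rough illustration of the energy-based self-diagnosis described above, the following Python sketch flags a node as failing once its residual energy drops below a threshold, so that recovery can start before the node dies completely. The `Node` class, the energy unit and the threshold value are assumptions made for this example only.

```python
from dataclasses import dataclass

@dataclass
class Node:
    node_id: str
    residual_energy: float  # assumed unit: joules

def self_diagnose(node, energy_threshold):
    """Classify a node from its own battery reading.

    Returns "failing" when the residual energy is below the threshold
    (the node is still alive and can trigger fault tolerance early),
    otherwise "healthy".
    """
    if node.residual_energy < energy_threshold:
        return "failing"
    return "healthy"

# Hypothetical readings; 0.1 J is an assumed threshold, not a value from the paper.
status_n4 = self_diagnose(Node("n4", residual_energy=0.05), energy_threshold=0.1)
status_n5 = self_diagnose(Node("n5", residual_energy=0.80), energy_threshold=0.1)
print(status_n4, status_n5)
```

The point of the threshold is the early warning: a "failing" node still has enough energy left to announce its state and hand over its responsibilities.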
To detect the failure of links, a node checks its links to its direct neighbor ($D_{nbr}$) nodes. A node detects a faulty link to one of its $D_{nbr}$ nodes if it does not receive any message from that $D_{nbr}$ within a certain interval $\Delta t_i$, which varies from milliseconds to seconds based on the WSN application requirements. If a node fails, all of its associated links will break, and the beacon node of the current face will not receive any status or monitoring report. A link (or edge) is considered failed if an edge to or from a node fails, if the node itself is failing, or if the node is out of reach because the beacon node did not receive any status message within a specific period of time. For the fault detection process, we consider a Markov chain model similar to [10,47]. We apply this model for self-diagnosis by nodes as well as for link monitoring. We model each node by an embedded discrete-time semi-Markov chain, which is defined by a set of states and transition probabilities between the states. We focus on the node's active mode operations in a discrete-time manner [10]. For the link monitoring process, a continuous-time Markov chain approach is utilized, where we consider each link as a part of a chain between two nodes [47]. Node self-diagnosis enables each node to examine its residual energy and any fault or fault-to-be, including transmission of abnormal values, to identify anomalies/faults in its own behavior; link monitoring enables each node to probe its $D_{nbr}$ nodes and check the behavior of the links to identify anomalies. Figure 4 depicts a failure scenario where n4 is the failing node. It has n3, n7 and n10 as its $D_{nbr}$ nodes, and F2, F3 and F4 are the associated faces that get affected. The links that will fail include (n4, n3), (n4, n7) and (n4, n10). If a node detects its own faulty behavior, it can infer that it may fail at a later time, and so begins the fault tolerance process.
Moreover, when a failing node is identified, it has to be substituted using a suitable non-ANiT node so that the tasks of the failing node can be reallocated to the selected substitute node for improving the service life and performance.
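The timeout-based link check from the diagnosis phase can be sketched as follows in Python. This is only an illustrative sketch of the silence-interval rule; it does not implement the Markov chain models of [10,47]. The function names, timestamps and the interval value are hypothetical.

```python
def suspected_links(node_id, last_heard, now, delta_t):
    """Links from node_id to direct neighbours that were silent for more than delta_t.

    last_heard maps neighbour id -> time (seconds) of the last received message.
    """
    return sorted(
        (node_id, nbr) for nbr, t in last_heard.items() if now - t > delta_t
    )

def links_of_failed_node(failed, direct_neighbours):
    """All links that break when `failed` dies: one per direct neighbour."""
    return [(failed, nbr) for nbr in direct_neighbours[failed]]

# n3 last heard from n4 at t=0.0 s and from n2 at t=4.5 s (assumed values).
silent = suspected_links("n3", {"n4": 0.0, "n2": 4.5}, now=5.0, delta_t=1.0)

# Failure scenario with n4 failing and direct neighbours n3, n7 and n10.
broken = links_of_failed_node("n4", {"n4": ["n3", "n7", "n10"]})
print(silent, broken)
```

Here `broken` lists the three links (n4, n3), (n4, n7) and (n4, n10), the same set of failing links named for the n4 scenario in the text.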
Based on the evaluation of the existing related works on the fault tolerance aspect of face based WSNs, the proposed recovery scheme focuses on satisfying the following requirements raised in the literature, to enhance the network performance and lifetime. Verification of coverage and connectivity: It is important to verify both coverage and connectivity during the fault recovery process to ensure robust functioning of the WSN, as failures may cause link breakages and coverage holes, and may even split the network into disjoint components, leading to loss of communication. If a node is about to fail, it is necessary to determine and restore its coverage and connectivity, as these functions might be altered after it becomes defective. Coverage verification aims to restore the coverage lost due to failure. Connectivity (or communication) verification aims to verify the communication links. This is discussed in detail in the subsequent sections.

Healing Phase
Upon failure detection, failure maintenance is performed to heal the failure and to reduce the consequences.
Node failure: In the context of a node failure, the loss of coverage and connectivity is significant, as the failure of a node may disconnect it from one or more other nodes and disrupt the face based WSN structure (see Fig. 4 for an example). Hence, a failing node needs to be replaced with a suitable substitute node, and a local recovery scheme is necessary to sustain the network topology. The proposed fault-tolerance scheme invokes two actions. First, the best substitute node is selected among the sleeping non-ANiT nodes in the neighborhood of the failing node, so that the failing node's tasks and responsibilities can be offloaded to it. Then, the edges (or links) are restored to recover the face structure. This helps to maintain the coverage area and restore the network structure. Moreover, if a beacon perceives that one of its F nbr nodes is about to fail, it can select the non-ANiT node nearest to the failing node F n to replace F n , and the selected non-ANiT node heals the coverage and restores the face structure.
Selection of suitable substitute node ( S n ): The key issue in providing robustness to WSN is how to determine the best node from the candidate non-ANiT nodes from the neighborhood of failed node and how to recover the face structure of the network proactively. It is necessary to determine the most appropriate non-ANiT node to replace the failing ANiT node, in terms of lowest loss of coverage and connectivity to its associated faces. In order to measure the destroyed coverage of a failing ANiT node say F n , we first define its relevance using the coverage loss caused by F n . Let O N be the set that contains the opted neighbors of F n . The associated non-ANiT nodes act as candidates ( S cand ) for substituting the F n . More details about O N and ANiT/ non-ANiT nodes can be found in [23].
The procedure for selecting S n is explained as follows.

I ex (i) = { p | ... i) & ([p is covered by i] or [p lies on C s ]) & (p is not covered by ... }
This means that the region R ex formed by these I ex (i) points is under the exclusive coverage of F n , with I ex (i) as the boundary points. The non-ANiT node that covers more coverage points of F n (with reference to R ex ) contributes a higher coverage gain when F n fails. Each candidate node ( s c ∈ S cand ) for substituting F n is marked in accordance with its average coverage with respect to I ex (i) . Hence, a feasible solution has to be determined from the S cand set to satisfy the coverage of R ex . That is, the best node from the S cand set needs to be selected as the substitute node ( S n ) and then activated to replace F n . This contributes to more efficiency in terms of coverage and connectivity towards the O N 's of F n . Such a node should satisfy the coverage of the points that belong to I ex (i) . Thus, the objective to be satisfied when F n is replaced by a suitable S cand node j is that both the difference between C(f) and C(f ∪ j ⧵ i) and the difference between E(f) and E(f ∪ j ⧵ i) should be minimal, where i and j denote F n and S n , C(f) and C(f ∪ j ⧵ i) denote the coverage rate with respect to the associated faces before the node fails and after replacement, and E(f) and E(f ∪ j ⧵ i) denote the communication energy cost before the node fails and after replacement, respectively. Hence, we have formulated the following function ( F rep ) for selecting the S n to replace F n (given by Eq. 1). The higher the value of F rep , the more appropriate a candidate node ( s c ∈ S cand ) is for substitution.
where I ex represents an exclusively covered intersection point of F n , n i denotes the number of points from I ex covered by s c , dist f denotes the distance from s c to F n , and the balancing factor takes a value strictly between 0 and 1. A high value of the balancing factor implies that coverage is considered more important than the distance between s c and F n , which results in decreased connectivity towards the designated O N 's of F n during the restoration process. Conversely, a low value may cause the selection of candidate nodes at a short distance but with less coverage of the intersection points. To obtain a better trade-off between these two factors, we set the balancing factor to 0.4, a value determined experimentally. Consider Fig. 5 for example. Here, the node n5 is the F n with nodes {n4, n9, n3, n6, n8} as its O N 's, and the corresponding non-ANiT nodes are also represented. Among the S cand nodes (denoted by green dots), the best node is selected as S n according to the function given in Eq. (1). The algorithm for 'Fault Tolerance: selection of suitable substitute node' is provided in Algorithm (1).
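The selection step can be sketched as follows. Since the exact form of Eq. (1) is not reproduced here, the scoring function below is only an assumed stand-in: it combines the fraction of exclusive intersection points covered by a candidate with a closeness term (1 - dist f / R c ), weighted by the balancing factor 0.4. The function names, the closeness term and all coordinates are hypothetical; R s = 10 m and R c = 20 m come from the simulation setup.

```python
import math

BALANCE = 0.4  # balancing factor value reported in the text
R_S = 10.0     # sensing range in metres, from the simulation setup
R_C = 20.0     # communication range in metres, from the simulation setup


def score(candidate, failing, exclusive_points, balance=BALANCE):
    """Assumed stand-in for F_rep: weighted coverage gain plus closeness to F_n."""
    # n_i: number of exclusive intersection points the candidate can sense.
    covered = sum(1 for p in exclusive_points if math.dist(candidate, p) <= R_S)
    coverage_term = covered / len(exclusive_points)
    # Closeness decreases linearly with the distance to the failing node.
    closeness_term = max(0.0, 1.0 - math.dist(candidate, failing) / R_C)
    return balance * coverage_term + (1.0 - balance) * closeness_term


def select_substitute(candidates, failing, exclusive_points):
    # The candidate with the highest score is chosen as the substitute S_n.
    return max(candidates, key=lambda name: score(candidates[name], failing, exclusive_points))


# Hypothetical positions: one failing node, three exclusive points, three candidates.
failing = (50.0, 50.0)
exclusive_points = [(45.0, 50.0), (55.0, 50.0), (50.0, 58.0)]
candidates = {"c1": (52.0, 51.0), "c2": (60.0, 50.0), "c3": (40.0, 45.0)}
best = select_substitute(candidates, failing, exclusive_points)
```

Here candidate c1 senses all three exclusive points and is also the closest to the failing node, so it is selected. A larger balancing factor would favor coverage over proximity, in line with the trade-off discussed above.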
Link failure: To heal a link failure, the proposed method makes its decision based on the affected node's connectivity measure. The details of the connectivity measure can be found in [23]. If a node's connectivity measure is greater than the threshold (t = 0.3), the node checks for the nearest O N connection, or for possible links between adjacent O N 's, and restores it to heal the failure; alternatively, a possible link to the nearest F nbr node is restored, as each node already has the information of all its F nbr 's. Otherwise, the two affected faces are combined to form a single face. If more than one edge of a node fails, the node is substituted using the process explained earlier (Algorithm 1). An illustration of a link failure scenario is depicted in Fig. 6. Suppose the link (n6, n8) is faulty. Then the node n6, whose connectivity measure exceeds t, restores the link between itself and its adjacent O N node, i.e., n7, to heal the failure. The algorithm for healing link failure is given in Algorithm (2).
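The decision logic for healing a link failure can be summarized in a short sketch. It is not Algorithm (2) itself: the function name, the return values and the fallback to face merging when no neighbor is reachable are assumptions, while the threshold t = 0.3 and the three outcomes (restore a link, merge faces, substitute the node) follow the description above.

```python
THRESHOLD = 0.3  # threshold t from the text


def heal_link_failure(connectivity, failed_edges, reachable_neighbors):
    """Return the healing action for a node affected by link failure.

    connectivity: the node's connectivity measure (defined in [23]).
    failed_edges: number of the node's edges that have failed.
    reachable_neighbors: candidate O_N / F_nbr nodes, nearest first.
    """
    if failed_edges > 1:
        # More than one failed edge: the node is substituted instead.
        return ("substitute_node", None)
    if connectivity > THRESHOLD and reachable_neighbors:
        # Restore a link to the nearest available neighbor.
        return ("restore_link", reachable_neighbors[0])
    # Otherwise the two affected faces are combined into a single face.
    # (Falling back to merging when no neighbor is reachable is an assumption.)
    return ("merge_faces", None)


# Hypothetical values for node n6 after the link (n6, n8) fails.
action = heal_link_failure(connectivity=0.45, failed_edges=1, reachable_neighbors=["n7", "n9"])
```

With these illustrative inputs the node restores a link to n7, which mirrors the Fig. 6 scenario.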

Restoration Phase
While healing a failure, the face structure has to be restored. Here, we explain how the faces are repaired and restored during recovery from failure. After the successful selection of S n , the F n needs to be replaced with S n . For this, a message M nbr that contains the details of the D nbr 's of F n is sent to S n for link formation and restoration of the face structure. Link formation is a two-step process that performs link validity checking and link correction. In the first step, S n checks the validity of the link with each of the nodes in the M nbr list. The link validity check is performed according to the edge formation process of the initial face structure construction phase [23]. The second step is necessary for the following reason. Earlier, S n was a non-ANiT node, but now it is an ANiT node. Hence, an existing link can become invalid once S n becomes active (ANiT), and some edge corrections may be required while restoring the face structure. We illustrate this scenario using an example in Fig. 7. From the figure, we can see that the D nbr 's of the failing node F n are {n1, n4, n10}. On receiving the M nbr message, the S n node communicates with the D nbr 's of F n to verify the validity of the respective links. The D nbr 's, i.e., the nodes n1, n4 and n10, check the possibility of connecting with S n . The nodes n1 and n10 can successfully verify the validity and connect directly to S n . However, for node n4, its link with node n6, i.e., (n4, n6), becomes invalid as it violates the conditions of the topology construction process. So, the link (n4, n6) is removed, and the nodes n4 and n6 each connect individually to S n . Therefore, the new D nbr 's of S n = {n1, n4, n6, n10}. After identifying the D nbr 's, the faces are updated according to the face exploration process of CAFT. The algorithm for the restoration phase is given in Algorithm 3.
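The two-step link formation can be sketched as below. The actual validity conditions are those of the topology construction process in [23] and are not given here, so this sketch swaps in a Gabriel-graph style test (an edge is valid if no other active node lies inside the disk whose diameter is that edge) purely as a stand-in. The function names and all coordinates are hypothetical and chosen so that the outcome matches the Fig. 7 narrative.

```python
import math


def in_diametral_disk(u, v, w):
    """True if point w lies strictly inside the disk whose diameter is segment uv."""
    center = ((u[0] + v[0]) / 2, (u[1] + v[1]) / 2)
    return math.dist(center, w) < math.dist(u, v) / 2


def restore_links(sub, m_nbr, pos, edges):
    """Return the edge set after linking `sub` and correcting invalidated edges."""
    new_edges = set()
    # Step 1: link validity check against every node listed in M_nbr.
    for n in m_nbr:
        others = [p for name, p in pos.items() if name not in (sub, n)]
        if not any(in_diametral_disk(pos[sub], pos[n], w) for w in others):
            new_edges.add(frozenset((sub, n)))
    # Step 2: link correction for existing edges invalidated by the newly active substitute.
    for edge in edges:
        a, b = tuple(edge)
        if in_diametral_disk(pos[a], pos[b], pos[sub]):
            # The old edge is dropped and both endpoints connect to the substitute.
            new_edges.add(frozenset((sub, a)))
            new_edges.add(frozenset((sub, b)))
        else:
            new_edges.add(edge)
    return new_edges


# Hypothetical positions of the active nodes after S_n replaces F_n.
pos = {"Sn": (50, 50), "n1": (40, 60), "n4": (44, 42), "n6": (58, 46), "n10": (60, 58)}
edges = {frozenset(("n4", "n6")), frozenset(("n6", "n10"))}
result = restore_links("Sn", ["n1", "n4", "n10"], pos, edges)
neighbors = sorted(next(iter(e - {"Sn"})) for e in result if "Sn" in e)
```

In this example the edge (n4, n6) is removed because the substitute falls inside its diametral disk, and the substitute ends up with n1, n4, n6 and n10 as direct neighbors, while the edge (n6, n10) is left unchanged.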

Simulation Results and Evaluation
In this section, we provide the simulation results and evaluate the advantages of F CAFT over existing schemes in face based WSNs. The experiments are performed in Matlab. We consider node deployment in an area of 100 m × 100 m, following a random uniform distribution. The communication range ( R c ) of a node is set greater than or equal to twice its sensing range ( R s ), i.e., R c ≥ 2 R s . In the simulations, R c is set to 20 m and R s to 10 m. All nodes in the WSN synchronize with the sink in the first 1-10 ms.

The number of nodes is varied from 100 to 350 to provide insight into the performance of F CAFT under different node densities. The nodes are time synchronized to coordinate the tasks among themselves. After the topology construction has completed (100 s), the WSN is ready for target tracking. For modeling the energy consumption, we have adopted the CC2420 radio parameters [23,48]. The performance is compared with two existing schemes, namely LoMoM [10] and Forms [13]. For evaluating the performance, we consider faults in the WSN at varying rates. The failure rate ( f R ) represents failures occurring in a random fashion after the construction of the face structured WSN, and is calculated as the ratio of the number of failed nodes to the total number of nodes. The other simulation parameters used are similar to [10,13]. As the target moves in the monitoring area, one or more nodes (of some faces) in the WSN may start failing. The results are averaged over 100 simulation runs for reliability. The main simulation parameters are listed in Table 2.
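The deployment and failure-injection setup described above can be reproduced in a few lines. This is a minimal sketch, not the authors' Matlab code: the function names, the fixed random seed and the rounding of the number of failed nodes are assumptions, while the 100 m × 100 m area and the definition of f R as failed nodes over total nodes come from the text.

```python
import random


def deploy(n, side=100.0, rng=random):
    """Place n nodes uniformly at random in a side x side square (metres)."""
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def inject_failures(n_nodes, failure_rate, rng=random):
    """Pick round(failure_rate * n_nodes) distinct node indices to fail at random."""
    k = round(failure_rate * n_nodes)
    return set(rng.sample(range(n_nodes), k))


rng = random.Random(1)  # fixed seed so the run is repeatable
nodes = deploy(200, rng=rng)
failed = inject_failures(len(nodes), 0.1, rng=rng)
# f_R is the number of failed nodes divided by the total number of nodes.
observed_rate = len(failed) / len(nodes)
```

Repeating such a run with different seeds and averaging the metrics corresponds to the averaging over simulation runs mentioned above.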
Performance metrics used: The following are the metrics used for performance evaluation and comparison:
• Coverage versus different failure rates: measures the variation in coverage according to varying fault occurrences.
• Average involved faces: This metric is measured in terms of faces involved and updated for recovering from failures during tracking.
• Average data delivery rate: When failures occur, the ability to sustain the operation by avoiding any interruption of WSN serviceability assures more reliable transmission of information. This metric that measures the average data delivery indicates the performance of the schemes in reducing the data loss caused by the failures.
• Quality of service: This metric is measured in terms of tracking accuracy, which is estimated with regard to the rate of successful target tracking steps over a total count of events.
• Service Lifetime: This metric estimates the service lifetime of the network.
Next, we present the results of evaluating the performance of F CAFT and comparing it with the existing schemes through extensive simulations. The simulation results are discussed below.

Coverage Versus Failure Rate
We first analyze F CAFT in terms of the network coverage rate it provides under various node densities, for different failure rates. Figure 8 shows the average coverage rate of the network for various numbers of nodes as f R is varied. The number of deployed nodes is varied from 100 to 350. As a result of random deployment, when the WSN is not dense enough, fewer nodes are eligible to substitute the failed node, and therefore a slight deviation in coverage is observed as the failure rate increases. As the number of deployed nodes increases, the coverage is sustained even when failures occur. From Fig. 8, we can see that the quality of coverage is successfully maintained with the increase in the number of nodes under different failure rates. To further analyze the performance of F CAFT , we have varied the sensing range value, and the same behavior is observed. Coverage rate comparison: We have further evaluated F CAFT in terms of the coverage rate for different node failure rates with respect to the existing face based scheme, and a comparison of coverage is depicted in Fig. 9. The results reveal that the proposed F CAFT scheme provides better coverage than the existing face based method. This behavior can be explained by the fact that F CAFT accomplishes healing using the non-ANiT nodes through efficient selection of appropriate substitute nodes, while the existing method does not apply node substitution during failure recovery. From the figure, we can notice that the coverage rate of the existing face based structure decreases as the failure rate increases. As time progresses, the increase in node failures may cause disconnection between nodes and lead to disruption of the network structure, which in turn affects the application performance.
In contrast to this, F CAFT ensures durable and robust functioning of the network by preserving the coverage as well as the face structure of the network.
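For readers who want to reproduce a coverage-rate measurement of this kind, a simple grid-sampling estimate is sketched below. How the coverage rate is computed in the simulations is not stated, so the sampling approach, the function name and the 1 m grid step are assumptions; the 100 m × 100 m area and R s = 10 m come from the simulation setup.

```python
import math


def coverage_rate(active_nodes, r_s=10.0, side=100.0, step=1.0):
    """Fraction of grid sample points within r_s of at least one active node."""
    samples = int(side / step) + 1
    covered = 0
    for i in range(samples):
        for j in range(samples):
            p = (i * step, j * step)
            # A sample point is covered if any active node can sense it.
            if any(math.dist(p, n) <= r_s for n in active_nodes):
                covered += 1
    return covered / (samples * samples)


# Hypothetical layouts: a regular 10 m lattice of nodes versus a single node.
lattice = [(x, y) for x in range(0, 101, 10) for y in range(0, 101, 10)]
full = coverage_rate(lattice)
single = coverage_rate([(50.0, 50.0)])
```

Evaluating this function on the set of surviving (and substituted) nodes after each failure event would give a coverage-versus-failure-rate curve of the kind plotted in Figs. 8 and 9.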

Average Involved Faces
Next, we illustrate the performance of the schemes in terms of the average number of faces involved and updated while recovering from failures during tracking. The initial topology construction process yields faces that are better structured and use fewer nodes, and that are comparatively larger than those of the existing face structured WSN, which includes all deployed nodes in the planar structure. If a node fails, the existing schemes try to recover from the failure by performing maintenance through merging of faces [10] or by updating the differential form [13], and require more faces to be involved. Moreover, due to the smaller faces of the existing schemes, failures can lead to loss of tracking, as the target may escape the face quickly without being detected by any of the nodes in the face, which requires the participation of more faces for updating the tracking information. In addition, coverage holes are created by node failures in the existing schemes, and if the target resides in a coverage hole region, it cannot be detected. In such cases, the target can be relocated only when a node detects its presence after it exits the hole area. Even though Forms mentions the insertion of extra nodes to replace the failing nodes, we believe this is a time- and energy-consuming task and is also infeasible in harsh and challenging environments. A comparison of the average number of involved faces for F CAFT , Forms and LoMoM under various node densities when f R is 0.1 is given in Fig. 10. From the figure, it is clear that the average number of involved faces for F CAFT is lower than that of the others. This is because F CAFT efficiently recovers from failure by restoring the coverage and connectivity of the faces using the non-ANiT nodes, and provides better robustness to failures.

Average Data Delivery Rate
We have evaluated the performance of the network with reference to the average data delivery rate. The results depicted in Fig. 11 show that F CAFT provides better performance than LoMoM and Forms. When failures occur, the ability to sustain the performance by avoiding any interruption in the functionality of the WSN assures more reliable transmission of data. The performance of F CAFT is better than that of the others, indicating that F CAFT reduces the data loss caused by failures through the replacement of failing nodes with non-ANiT nodes. In the existing methods, however, the performance degrades because failures significantly reduce coverage as well as connectivity, which disrupts the network structure as the failure rate increases. In contrast, F CAFT preserves the network structure and maintains the performance effectively through better healing of failures, which ensures coverage and durable functioning of the network.

Quality of Service
To further evaluate the effectiveness of the F CAFT scheme, we observed its performance in terms of quality of service, considering the previously mentioned underlying challenges. We analyzed the overall simulation results under various node densities and compared the performance of F CAFT with that of the other schemes. The results are illustrated in Fig. 12. It is evident that F CAFT gives better performance than the other schemes. Two factors help in continuously tracking the target with fewer nodes and faces: the new topological organization of nodes, which constructs faces that are larger than those of the existing face based WSN, and the recovery process of F CAFT , which uses non-ANiT nodes to minimize the formation of coverage holes and to prevent disruption of the face structure during failures. Therefore, the chances of missing the target are significantly reduced when compared to existing approaches, which involve more faces during tracking and when recovering a missed target, and this consequently affects their ability to maintain the accuracy and quality of the application. Figure 13 provides the results of the target miss rate for the schemes under different node failure scenarios.

Service Lifetime
Next, we examine the service lifetime of the network with and without F CAFT , as shown in Fig. 14. We can see that the lifetime of the network is considerably improved using F CAFT . Moreover, we have evaluated the face based schemes in terms of their lifetimes, and the results are provided in Fig. 15. The results reveal that F CAFT provides a better service lifetime than the others. This is because F CAFT prevents the loss of coverage and connectivity due to failures and improves the durability of the network. In the existing schemes, however, the loss of coverage and connectivity becomes more serious as the failure rate increases, and the performance and service life of the network are reduced significantly. We can see that F CAFT greatly improves the robustness, durability and application performance of the network.

Overall Analysis
Although different aspects of fault tolerant mechanisms have been explored by researchers, most of them have not explored fault tolerance in face topology based WSNs. In addition, the existing works on face based WSNs utilized RNG, GG or some cross edge removal approaches, which do not have fault tolerance capabilities of their own and suffer from coverage and connectivity issues as node failures increase. As evident from Figs. 8 and 9, the quality of coverage is successfully maintained with the increase in the number of nodes under different failure rates. The F CAFT scheme provides more than 96% coverage even when the failure rate is more than 20%. However, in the existing face based method(s), coverage deteriorates below 80% when the failure rate is more than 20%. This behavior can be explained by the fact that F CAFT accomplishes healing of the coverage holes caused by node failures, while the existing face based WSNs have no option of selecting a substitute for failure recovery and are compelled to perform local maintenance through merging of adjacent faces to continue the operation. As time progresses, the increase in node failures may cause disconnection between nodes and lead to disruption of the network structure, which in turn affects the application performance in the existing schemes. The failures can lead to loss of tracking, as the target may escape the face quickly without being detected by any of the nodes in the face, which requires the participation of more faces for updating the tracking information. As observed in Fig. 10, the existing schemes require more than twice the average number of involved faces when compared to F CAFT . The data delivery rate of F CAFT is about 95% even when the failure rate is more than 15%, while the tracking accuracy is above 90% for varying numbers of nodes.
For the other schemes, however, the result is between 60% and 80%, indicating the performance degradation caused by the significant reduction in coverage as well as connectivity due to failures, which results from the disruption of the network structure as the failure rate increases. Missing the target can consequently result in substantial energy being expended to compensate for it, which can shorten the network's service life under the existing models. In contrast, F CAFT ensures durable and robust functioning of the network by preserving the coverage without disrupting the face structure, and exhibits an improvement in service lifetime of about 14% when compared to existing schemes.

Conclusion and Future Work
Here, we have presented a robust fault tolerance scheme that preserves the coverage and connectivity in a face based WSN. In the existing face structured WSNs, node faults and failures result in connectivity and coverage loss, and can have critical consequences, e.g., target loss, which in turn reduces the accuracy of the application. However, the use of non-ANiT nodes for substituting the failing nodes helps in sustaining the quality of sensing coverage and ensures the connectivity of the network, so that the lifetime and application performance are improved. This is quite significant for applications such as critical tracking, where an intruder can hide in a hole region without being detected by the nodes. The simulation results reveal the efficiency of the proposed F CAFT scheme in comparison with existing face structured WSN schemes. The performance of the network is sustained by timely healing of the failures in the network to ensure robustness and resilience of the WSN. The use of non-ANiT nodes in replacing failing nodes, and the preservation of the network structure by restoring the connectivity in a distributed and energy efficient way, permit F CAFT to self-heal and keep functioning as long as possible. We can see that F CAFT greatly improves the robustness, durability and application performance of the network. Investigating the performance by using meta-heuristic or other machine learning algorithms will be our future work. We also consider the adaptability of the proposed method to other types of topologies as a part of future work. In addition, we aim to investigate more practical issues in using the concept of faces, so that it can be extended to various specialized applications of WSNs in both sparse and dense networks with complex scenarios.
Author Contributions All authors contributed to the study conception and design. All authors read and approved the final manuscript.
Funding This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Data Availability
No associated data to share.