In this section, we described the proposed DWCR algorithm in detail. The parameter settings and definitions and the DWCR system model are described in Sections 3.1 and 3.2, respectively. Sections 3.3 and 3.4 present the DWCR’s backup node selection and node relocation procedures as well as the corresponding pseudocode.
3.3 Backup Node Selection Procedure
The proposed algorithm selects a backup node for each critical node because only critical nodes can cause network partition. The purpose of selecting a backup node is to ensure rapid network recovery and to minimize the total moving distance and the number of relocated nodes. Each node sends a broadcast message containing its \(ID\) and local location to all nodes within its communication range to initialize the one-hop neighbor table.
If a critical node i has noncritical nodes as neighbors, it will be selected as a backup node first. This is because the noncritical nodes can act as backup nodes to minimize the number of relocated nodes. Conversely, if a critical node is used for network recovery, the number of affected nodes and the total moving distance is greater. In addition, to minimize the total moving distance, the distance between the faulty node and its neighbor nodes must be considered. The DWCR algorithm also selects nodes with lower connectivity to minimize the number of affected nodes.
Critical node i will preferentially select non-critical nodes from among its one-hop neighbors \({N}_{i}\) as backup nodes. If its neighbors include more than two noncritical nodes, the distance and degree of the nodes are calculated, and the weight \({W}_{a}\) of each node is determined using Formula (1). Because the loss of connection between non-critical nodes and their original neighbors does not cause network partition, the distance factor is weighted more heavily than the degree factor. According to our experimental results, optimal results can be obtained when the distance and degree factors are set to 0.9 and 0.1, respectively.
$${W}_{a}= Distance \times 0.9 + Degree \times 0.1$$
1
\({W}_{b}= Distance \times a + Degree \times b\) , where \(a+b=1\) (2)
The distance factors in formulas (1) and (2) can be calculated using formula (3):
$${d}_{ij}\text{ = }\sqrt{{\left({x}_{i}-{x}_{j}\right)}^{2}+{\left({y}_{i}-{y}_{j}\right)}^{2}} \forall i=\text{1,2},3,\cdots ,n$$
3
Formula (3) uses the coordinates of the critical node and its neighbors; i represents the \(ID\) of the critical node, which is an integer from 1 to n; j represents the \(ID\) of any neighbor node of critical node i; and \({d}_{ij}\)is the distance between critical node i and its one-hop neighbor j.
If the neighbors of critical node i include no non-critical nodes but multiple critical nodes, formula (2) must be used, and the weights of distance and degree factors will be dynamically adjusted as the number of nodes and the communication range change. Generally, when formula (2) is applied in a dense environment, the weight of the distance factor a increases and the weight of the degree factor b decreases. However, as a network becomes sparser and the number of nodes decreases, the weights of the distance and the degree factors decrease and increase, respectively. This is because a critical node with a higher degree in a sparse environment is more likely to have a noncritical neighboring node that can serve as a backup node during the cascading recovery process. Therefore, fewer nodes are affected during the cascading recovery process.
In a dense environment, the probability of a critical node’s neighbors including general nodes is higher, and closer nodes can be directly selected as backup nodes to minimize the total moving distance. Therefore, as a network becomes denser, the weights of the distance factor \(a\) and the degree weight factor \(b\) will be dynamically adjusted upward and downward, respectively. However, parameters a and b must be continually adjusted according to the number and range of actors in different environments to obtain the optimal solution. Factors \(a\) and \(b\) can be calculated using formulas (4) and (5) [33]:
$$a=(640 n+1866.6 r)\times \frac{1}{Area}$$
4
where \(n\) represents the number of nodes, \(r\) represents the communication range, and \(Area\)represents the network size.
Figure 2 presents an example of a network topology including critical nodes. When a noncritical node (e.g., node A5, A6, or A7) is abnormal, initiating the network connectivity recovery process is unnecessary because the abnormality will not cause a network partition. Conversely, an abnormal critical node will lead to network partition; therefore, when a critical node fails, the connectivity recovery process must be initiated. For example, if node A17 is damaged, nodes A6 and A7 cannot communicate with node A18; therefore, A17 is a critical node. Similarly, nodes A0, A8, A9, A10, A13, and A18 are critical nodes. Because node A17 is a critical node, the DWCR algorithm must be used to select a backup node. First, node A18, which is a critical node, is excluded through the backup node selection procedure. Because the neighbors of A17 include general nodes, the weight of the backup node is calculated using formula (1). Although the degrees of nodes A7 and A6 are the same, node A7 is closer to node A17 (63 m < 71 m); therefore, node A7 is selected as the backup node for node A17. By contrast, node A9, another critical node, has many critical nodes but no general nodes as neighbors. In this case, the weights of nodes A8, A10, and A13 are calculated using formula (2), and node A10 is selected as the backup node of node A9 because it outperforms nodes A8 and A13 when the weights of distance and degree are considered.
After the backup node selection procedure is completed, critical node i will periodically send heartbeat information [26] to its backup node. When the backup node does not receive heartbeat information for a designated period of time, node i is determined to be malfunctioning. Accordingly, the backup node will start to relocate to recover network connectivity.
3.4 Node Relocation Procedure
Because only critical node failures can disrupt network connectivity and cause network partition, when the critical node i fails, its backup node j will immediately initiate the node relocation procedure to recover network connectivity. Generally, backup node j will first update the neighbor table \(NEIGH\_TABLE\) of one-hop distance to the closest distance to the faulty node \({F}_{i}\). Backup node j, a critical node, tests whether it is connected to its one-hop neighbor after completing the node relocation procedure. If the network goes down, the node relocation procedure is cascaded until the network connection is restored. After the node relocation procedure is completed, the system redetects whether the adjacent neighbors of the original faulty node and the newly migrated node in the one-hop neighbor table are critical nodes. If a one-hop neighbor is identified as a critical node, the backup node selection procedure is performed to select a suitable backup node.
Figure 3 illustrates the results of applying the DWCR algorithm, specifically the node relocation procedure, to the network topology in Fig. 2. In Fig. 3(a), node A13’s one-hop neighbors, namely nodes A9, A14, and A15, are disconnected because of the failure of node A13. The backup node A15 moves to the position of A13 until it establishes connectivity with nodes A9 and A14, as illustrated in Fig. 3(b). Because node A15 is not a critical node, cascading the node relocation procedure after the relocation of node A15 is unnecessary. The topology of the recovered network after the execution of the DWCR algorithm is illustrated in Fig. 3(c).
The pseudocode of the DWCR algorithm is presented in Fig. 4. When the algorithm is first executed, each node broadcasts a message to notify its neighbors within its communication range and records the \(ID\) and location of its neighbors to construct a neighbor table \(NEIGH\_TABLE\) (line 4). Subsequently, each node determines whether it is a critical node by using \(NEIGH\_TABLE\). If a node is a critical one, it undergoes the backup node selection procedure (line 8).
During the backup node selection procedure, formula (1) is used to select a backup node (lines 16–19) if the neighbors of critical node i include multiple general nodes. Conversely, if the neighbors of critical node i include only critical nodes and no general nodes, formula (2) (lines 20–24) is used to select a backup node. Subsequently, the backup node j can undergo the node relocation procedure to restore connectivity to the one-hop neighbors of the faulty node i.
1. Actor_coord: the coordinate of actors
2. \({R}_{c}\): the communication range for each actor
3. \({N}_{i}\): All neighbors of critical node i
4. Initialize a one-hop neighbor table NEIGH_TABLE
5. for each actor;
6. for every Actor i in a WSAN do
7. if Is_Critical(i) then
8. Backup Node Selection
9. else
10. back_ID = 0; Position =[ inf, inf ];
11. end if
12. if i failure
13. j moves to ∀\({N}_{i}\)={ j }
14. end if
15. Backup Node Selection
16. if \({N}_{i}\) have non-critical node
17. for Neighbor_hop1(i) do
18. \({W}_{a}\) = Distance \(\times\) \(0.9\) + Degree \(\times\) 0.1
19. return node←max_\({W}_{a}\) (Neighbor_hop1(i))
20. else
21. for Neighbor_hop1(i) do
22. \({W}_{b}\) = Distance \(\times\) \(a\) + Degree\(\times b\)
23. return node ← max_\({W}_{b}\) (Neighbor_hop1(i))
24. end if