C3P-RPL: A collaborative and proactive approach for optimal peer to peer path selection and sustenance

Internet of Things is evolving from information gathering platforms into collaborative systems wherein, smart devices actively interact with each other in a seamless manner. For instance, Internet of Robotic Things is envisioned to provide augmented solutions through collaboration of varied smart devices and robots. These visions revolve around the ability of smart devices to directly communicate and cooperate with each other in real time. In this context, this paper is an attempt to study RPL’s (Routing Protocol for Low-power Lossy Networks) point to point routing that creates multi-hop paths between peer nodes. This standard routing protocol is known for robust and failsafe upward paths but its peer to peer (P2P) routes are reported to be suboptimal. This work assesses P2P performance of RPL’s storing mode in a network of new generation devices having higher memory. Further, a Collaborative and Proactive Peer to Peer (C3P) path selection and sustenance approach is proposed where, root node collates incremental topology from collaborative nodes and disseminates optimal single source shortest path trees SPT(n). A progressive node betweenness centrality score ensures spread out paths. Minor topology changes are accommodated through incremental node and edge updates to targeted SPT(n) locally. Storing SPTs in intermediate nodes reduces storage and packet size. Through simulations and testbed experiments, it is proven that C3P-RPL improves simultaneous peer to peer communication between all the nodes. Specifically, the path length is reduced by 30% and subsequently the network latency drops by 65% in an experimental testbed of 47 nodes, making it suitable for collaborations.


Introduction
Internet of Things has become omnipresent as we are embracing it openly for the various sophistication that it offers. Right from homes to public spaces, personal assistance to industrial automation, the type of devices employed and their applications are expanding rapidly. New age microcontrollers are taking IoT to varied fields with their augmented capabilities like bigger memory, higher computational power, faster caches, improved security and their ability for customizable specializations [1]. They facilitate newer solutions in myriad areas like Smart City, automation and e-commerce in Industrial IoT, collaboration in Internet of Robotic Things (IoRT), failsafe communication in Internet of Medical Things (IoMT) etc., [2][3][4]. Point to point communication in an Internet of Things network plays a vital role in bringing efficient collaborations between devices.
RPL is the standard routing protocol that enables a large number of constrained nodes to form a connected network [5], and it finds many applications in home automation [6,7], industrial automation [8] and urban environmental monitoring [9]. It employs ICMPv6 [10] control messages to form a multi-hop mesh network and a trickle timer [11] to control the distribution of these control messages. A central root node initiates the formation of a Destination Oriented Directed Acyclic Graph (DODAG) with a DODAG Information Object (DIO) message. The DIOs trickle down with a rank metric and allow other nodes to form short acyclic paths towards the root node. Though the protocol was designed to cater for multi-point to point (MP2P), point to multi-point (P2MP) and point to point (P2P) communication, its P2P routing paths are known to be suboptimal. Non-storing mode RPL always routes peer to peer packets through the root node, and storing mode RPL routes them through common ancestors (in the worst case, through the root node). Past research papers have flagged the memory requirement of RPL's storing mode operation [12,13] and pointed out that it is not scalable. Hyung-Sin Kim et al., in their survey [14], point out the availability of new generation microcontroller units (MCU) in recent times and prompt the research community to move beyond the boundaries of constrained nodes.
To alleviate this P2P problem, Internet Engineering Task Force (IETF) extended RPL's specification to form point to point routing paths on demand, in a reactive manner [15]. A sender node wishing to find P2P routes, initiates a transitory directed acyclic graph (DAG) discovery by declaring itself as the root node for the DAG. Like the alignment to original root node, all nodes orient themselves towards the starter node. By reversing the routes obtained, the starter node gets best P2P routes to all other nodes in the network. To form a complete P2P mesh, all nodes in the network have to initiate this P2P route discovery. This method incurs huge control overhead when all the nodes want to communicate to all the peer nodes. In contrast to the reactive P2P path discovery, a few recent works propose an alternate centralized approach wherein root node initiates neighbour graph collation [16,17]. Armed with the neighbour graph and their link status, they calculate shortest paths between nodes. A requester node gets the shortest route from the root node, caches the same and sends the P2P packets by attaching source routing headers (SRH). SRH contains the entire route to traverse. This approach is not optimal for seamless P2P communication as each node has to wait for a P2P route before initiating the actual packet delivery. Then the nodes have to store the entire path for each route in cache. The increased packet size of this SRH approach requires more energy in transmitting it and/or limits the payload size of a P2P packet. Though the usage of SRH seems unavoidable for pure non-storing node, the same does not hold good for storing mode or hybrid mode. Even in the non-storing mode, the storage requirement of multiple routes could be optimized.
This paper furthers the centralized scheme of peer to peer route discovery and proposes a simple method to reduce storage space, packet size and delay. A collaborative and proactive P2P RPL (C3P-RPL) is proposed with a centrality assisted path selection algorithm and a distributed approach for route storage and sustenance. The significant characteristics of C3P-RPL are:
• Collaborative: Nodes initiate P2P route discovery by sharing local topology assessment with the root node.
• Proactive: Single source shortest path trees (SPT(n), n ∈ V, rooted at n) computed at the root node are disseminated back. They have optimal routes to all nodes and are installed at each node in a proactive manner.
• Accurate topology: The topology is assessed asynchronously using the received signal strength indicator (RSSI) metric at both ends of an edge.
• Optimal path selection: The best SPT(n) is selected among the available single source shortest path trees (SSSPT) using a progressive node betweenness centrality metric for effective traffic distribution.
• Low memory: Storing the edges of SPT(n) requires less memory [ 2 * (|V| − 1) ] than storing entire routes [ x * (|V| − 1) ], x being the path length of the routes.
• Less overhead: Table based routing at intermediate nodes facilitates a smaller packet size when compared to source routing, which has the entire route in its header.
The rest of the paper is organized as follows. Section 2 briefs the process of P2P route formation in RPL. The next section outlines relevant research work that improves P2P routes. Sections 4 and 5 elaborate on the proposed peer-to-peer path selection in C3P-RPL, detail the evaluations and discuss the results. The final section concludes and highlights important results from the study.

RPL's path selection
RPL supports peer to peer traffic flows in both the storing and non-storing modes of operation. Downward routes are the basic requirement for identifying paths to peer nodes. This section elaborates the nuances of the RPL's peer to peer route discovery in both the modes.

Non-storing mode operation
RPL initiates route formation by propagating a DODAG Information Object (DIO) control message from the root node. The DIO has a rank metric that indicates the closeness of a node to the root. Each node calculates its rank metric by adding a path metric to its parent's rank and advertises the same in its DIO. In a RPL network, rank is smallest for the root node and increases as the distance from root node increases. A node selects a node as its preferred parent whose rank is the smallest among all the neighbours. So the upward route is formed by simply following the preferred parent at each node. This forward route discovery is common for both the mode of operation. RPL utilizes the Destination Advertisement Object (DAO) control message for constructing downward routes. Figure 1a illustrates route formation process in RPL's non-storing mode. When a node selects a parent as preferred parent, it sends a DAO in the upward direction. The difference lies in the addressing of the DAO and its processing. Let node 'a' choose 'c' as parent. The DAO from a has its global address in the target option field and 'c' as parent address in the transit option field. It can also add multiple parents with their preference level. This DAO is unicasted to root node and forwarded through the preferred parent.
None of the parents along the route would process the DAO but simply forwards it upward until it reaches the root. 'c' sends an independent DAO with its address as prefix and 'f' as the parent's address. Upon receiving the DAO's from the nodes, the root constructs the downward routes through the parent. When a node switches its parent, a DAO with the new parent's address is sent to the root node. So, all P2P data packets are forwarded in upward direction using the preferred parent as next hop towards the root node. The root finds the downward route and stores the same in its routing table. When a P2P packet arrives, it appends a Source Routing Header(SRH) containing the entire route and forwards the packet to the next hop in downward direction. Every intermediate node in downward route processes SRH header, retrieves next hop, removes itself from the header and forwards the packet further down until it reaches the destination node.
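The root's role in non-storing mode can be sketched as follows. This is only an illustration under stated assumptions, not the RPL implementation: the `parent_of` table stands in for the (target, transit) pairs learned from DAOs, the node names extend the 'a', 'c', 'f' example with a hypothetical root 'r', and the header is modelled as a plain list of remaining hops rather than the on-wire SRH format.

```python
def build_source_route(parent_of, root, destination):
    """Walk the child -> parent links learned from DAOs, then reverse them."""
    route = [destination]
    while route[-1] != root:
        route.append(parent_of[route[-1]])
    route.reverse()
    return route[1:]  # hops below the root, in downward order


def process_srh(srh):
    """At one hop: read the next hop from the header and strip that entry."""
    return srh[0], srh[1:]


# Hypothetical DAO state at the root: a -> c -> f -> r.
parent_of = {"a": "c", "c": "f", "f": "r"}
srh = build_source_route(parent_of, "r", "a")

# Forward the packet hop by hop until the header is exhausted.
hops = []
while srh:
    hop, srh = process_srh(srh)
    hops.append(hop)
```

The header shrinks by one entry per hop, which is why its initial size grows with the route length.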

Storing mode operation
The storing mode uses the same DAO control message, with a slight variation, for forming downward routes. After parent selection, a node unicasts the DAO to its preferred parent and the parent node stores the route. Fig. 1b illustrates P2P route discovery in an RPL network. For instance, node 'a' selects node 'c' as its preferred parent. It adds its global address as prefix in the target option of the DAO message and then unicasts the DAO to the preferred parent 'c'. Node 'c', on receiving the DAO, knows that a downward route exists to the sender node 'a' and saves the route to 'a' in its route table. Unless 'c' is the root node, on receiving a DAO from a child node it should send an upward DAO with multiple consolidated prefixes or the single received prefix. Since 'c' is not the root, it unicasts a DAO to its parent 'f' with the addresses of 'a' and 'c' in the target option. Subsequently, 'f' knows that a downward route exists for 'c' and also that a route exists for 'a' through 'c'. When the number of DAOs generated in these two modes is compared, it may look like the storing mode contributes more towards the control overhead. But this is not true from the transmission standpoint: in storing mode a DAO is transmitted only to the parent and travels no further, with a new consolidated DAO taking its place. In non-storing mode, however, each DAO needs to be forwarded until it reaches the root, so the same DAO is transmitted at every hop.
Thus, a node receiving a DAO in storing mode, gets the destination address of a route from the target option and the next hop address from the sender address of the DAO packet. All intermediate nodes record the routes and utilize the same for forwarding data packets. When a parent switch happens, the node, switching the parent, sends out an upward no-path DAO to the previously selected parent. It is further forwarded in the upward direction to clear the old route. After the no-path DAO, a normal DAO is scheduled to the newly selected parent. During packet delivery, a node operating in storing mode consults its route table. If an entry is present, the next hop address is fetched and the packet is forwarded to that next hop. If a route is not available for a destination, the node sends the data packet upward (through preferred parent) until it encounters an ancestor node with a route. In worst cases, the packet would go up to the root node as it is the common ancestor for all the nodes. In general, RPL's peer to peer routes are shorter in the storing mode than in the non-storing mode, as all packets need not go through the root node. However, they are not optimal as they are forwarded along the DAG. When all nodes have sufficient memory to store the routes, RPL's delivery reliability is found to be 78.64% in storing mode of operation. Graph in Fig. 2 shows the packet delivery ratio for a RPL network in a testbed having 47 nodes where, all nodes send P2P data packets to all other nodes. The delivery ratio reduces as packets travel along the DAG creating congestion along few branch heads. Though all the nodes have sufficient memory to store routes, packet drops indicate that it is necessary to find short and well distributed P2P routes.
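The forwarding rule described above can be sketched as a table lookup with a fallback to the preferred parent. The topology below is a hypothetical extension of the 'a', 'c', 'f' example (a sibling 'b' under 'f' and a root 'r' are assumptions), and the route tables are assumed to be consistent.

```python
def next_hop(route_table, preferred_parent, destination):
    """Use a stored downward route if present, else go up to the parent."""
    return route_table.get(destination, preferred_parent)


def trace(route_tables, preferred_parent, source, destination):
    """Follow next hops from source to destination and return the path."""
    path = [source]
    while path[-1] != destination:
        if len(path) > len(route_tables):
            raise RuntimeError("routing loop")
        node = path[-1]
        hop = next_hop(route_tables[node], preferred_parent.get(node), destination)
        if hop is None:
            raise LookupError("no route at the root for " + destination)
        path.append(hop)
    return path


# Route tables map destination -> next hop, as learned from DAOs.
route_tables = {
    "r": {"f": "f", "c": "f", "a": "f", "b": "f"},
    "f": {"c": "c", "a": "c", "b": "b"},
    "c": {"a": "a"},
    "a": {},
    "b": {},
}
preferred_parent = {"a": "c", "c": "f", "b": "f", "f": "r"}

path_a_to_b = trace(route_tables, preferred_parent, "a", "b")
```

Here the packet from 'a' to 'b' turns around at the common ancestor 'f' and never reaches the root, yet it still follows the DAG even if 'a' and 'b' happen to be radio neighbours.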

Relevant work in literature
Literature reports that RPL's P2P communication suffers from non-availability of memory in storing mode and long suboptimal routes in non-storing mode [18,19]. According to the testbed experiment from previous section, the availability of memory improves storing mode performance. But, the reliability is found to be low. RPL uses fewer routing paths along the DAG, even though optimal paths are available. In case of simultaneous peer-to-peer packet exchange, the small number of paths cause congestion and their longer length increases latency. Zhao et al., in their survey [20], highlight that the delay in communication and congestion are obstacles for successful node to node communication. Despite being a critical problem, only 11% of the research papers in RPL focuses on downward route formation and much less on the P2P routes [14].
In a bid to reduce congestion, earlier research works proposed multi-path routing [21,22] where, packets are forwarded through multiple parents using a path life cycle index/braided paths. Similarly, reliable path routing [23] uses multiple factors of parent nodes and QU-RPL [24] forwards packets based on the queue availability of parents. Although they are effective in reducing congestion in the network, they end up with increased parent switches. Frequent parent changes are known to increase control overhead and reduce reliability.
To reduce the memory requirement of RPL's storing mode operation, MERPL [25], proposed to decouple a node and the table storage by using the child's resources to store routes. It demonstrates an improvement in performance for a network having nodes with low memory. Another mechanism to improve the scalability of downward routing with limited memory is proposed by DualMOP-RPL [26]. It combined the presence of both storing and non-storing modes in a single instance to cater a large network. However, packets still have to flow along the DAG and in turn, encounter crowding along few nodes.
There are works which go down the broadcast and multicast paths and uncover shorter routes than the DAG path. A DIO overhearing based mechanism, proposed by ERPL [27], improves both the storing and non-storing mode by adding a route to the direct neighbours. Though this reduces the length of P2P routes for a subset of nodes, other nodes follow the original longer paths. D-RPL [28] suggested a multi-cast channel subscription in case of non-availability of routes. Whenever routes are available, it resorts back to the original RPL behavior. OSR [29] is an opportunistic source routing protocol that was suggested as an alternative to RPL's non-storing mode. It improved reliability by identifying alternate downward paths opportunistically with the use of bloom filters. Bloom filters are known for false positives and a broadcast increases the number of nodes processing duplicate packets.
Address based routing or location based routing reduce the need for large memory to store routes. IoTorii [30] proposed a hierarchical addressing scheme to arrive at a local address for nodes in such a way that reflects the topology of the network. The nodes can gather multiple local addresses depending on the position of the node and uses the same for routing a packet to any destination. Though, it reduces storage space with one hierarchical address, it can only reach the nodes en-route the origin. It would need all possible addresses to reach a node through a shortest path. ER-RPL [31] is energy conserving and region-based where, a subset of nodes participates in route discovery as against the traditional method of holistic participation. Yet, it is intense in control overhead and has a route request response model that is not suitable for simultaneous P2P packet transmissions.
As significance of peer to peer communication became prominent, the standardization of RPL is extended to include a reactive P2P route discovery with P2P-RPL in RFC 6997 [15]. Each node initiates optimal route discovery by declaring itself as a transient root node. E. Baccelli et al., and M. Zhao et al., studied the performance of P2P-RPL in simulations and testbed experiments [32,33]. The study confirmed that the P2P-RPL forms shorter point to point paths between nodes than the routes formed by RPL's non-storing mode. However, there is no comparison on the control overhead involved or energy consumption of P2P-RPL. But it is clear that this approach suffers from a huge control overhead.
Unlike the Ad hoc On-demand Distance Vector protocol (AODV), P2P-RPL requires bidirectional links and does not cater to asymmetrical links between nodes. But AODV is not suitable for radio duty cycling nodes. AODV-RPL [34] is an attempt to standardize the AODV route requests, replies and error messages and incorporate them in RPL control packets. In this direction, B. Djamaa et al. proposed an efficient, stateless route discovery process for LOADng and AODV-RPL [35] where unnecessary route requests are curtailed. Z. Sharifian et al. enhance LOADng to check the route request broadcast storm and reduce the overall control overhead in the network [36]. However, their route discovery is reactive in nature and incurs significant control overhead when P2P paths are required between all the nodes. RPL is also generally known to conserve more energy than AODV [37].
Recently, a couple of research works in RPL have improved P2P routes by proposing a centralized scheme. NG-RPL [16] and HRPL [17] propose to collect the neighbor graph from all the nodes of the network, construct shortest paths between nodes and use source routing for data delivery along the shortest P2P routes. The energy conservation reported is significant in non-storing mode, where packets otherwise follow a longer path to the root and then down to the destination. In case of topology changes, their operations require significant work. In addition, a node featuring in multiple shorter paths would experience congestion around it.
The centralized approach of C3P-RPL reduces control overhead when compared to reactive route discovery. Each node is required to send a single control packet with the local topology and receives back a shortest path tree. The SPT has shortest routes to all nodes in the network and hence reduces latency to the maximum possible extent. C3P-RPL would perform better in reducing congestion than other centralized schemes because a node betweenness centrality score minimizes the total number of shortest paths passing through any single node. In addition, the incremental SPT updates would reduce the overhead in addressing topology changes.

Optimal peer to peer path selection in C3P-RPL
The proposed approach offers a couple of advantages over the current centralized approaches. The space required to store the set of P2P routes is made minimal by storing a simple single source shortest path tree (SSSPT) at each collaborative node. Subsequently, by storing SPT(n) at each intermediate node, the need for the SRH header is eliminated. This allows for smaller packet sizes and saves energy when compared to the SRH approach. While forming SSSPTs, a node betweenness centrality score is applied to select a shortest yet distributed peer to peer routing path. The SSSPT of node n is denoted as SPT(n); it is rooted at n, stored at node n and has |V| − 1 edges. In contrast, storing individual routes in cache would require space for x * [|V| − 1] nodes, where x is the average path length of the routes from n. When there are minor changes to the network, the process of collation and dissemination is not repeated. Instead, individual targeted updates are carried out on the stored SSSPTs in an incremental manner to accommodate node and link inclusions to the network.
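As a rough worked comparison for a network the size of the 47-node testbed, counting stored node identifiers and assuming a hypothetical average path length of x = 4 (an illustrative value, not a measured one):

```latex
\begin{align*}
\text{SPT}(n) \text{ as an edge list:} \quad & 2\,(|V| - 1) = 2 \times 46 = 92 \\
\text{individually cached routes:} \quad & x\,(|V| - 1) = 4 \times 46 = 184
\end{align*}
```

Under this counting, the edge list is the smaller of the two whenever x exceeds 2.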
The proposed scheme has three distinct parts, namely:
• Collation of topology and construction of the best possible single source shortest path trees (at the root node).
• Extraction of routes from a shortest path tree (at each individual node).
• Reconstruction of a shortest path tree (at targeted nodes) after new node and link additions.

Collation of network topology and construction of SSSPT
A single source shortest path tree is a subgraph having smallest number of edges connecting the source node to all other nodes. It is formed such that the edges in SPT(n) form shortest paths from the vertex n to all other vertices in the graph G = {V, E} . The graph G is assembled at the root node by collating incremental topology information from the collaborating nodes. It is suitably merged with the root node's routing table. Every collaborative node figures its positioning in the network by inspecting the received signal strength indicator (RSSI) metric for the packets received from neighboring nodes. A neighbour with a good RSSI value is considered as peer node and every collaborative node shares its peer information with the root node. An edge is considered bidirectional if nodes present on both the ends identify each other as peers. Otherwise, it is considered unidirectional. The root node merges its routing table and the collated topology information to arrive at graph G.
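The collation step can be sketched as below. The RSSI threshold of -85 dBm and the function names are assumptions made for illustration; the text only states that a neighbour with a good RSSI value is treated as a peer and that an edge is bidirectional when both ends report each other.

```python
RSSI_THRESHOLD_DBM = -85  # assumed cut-off for a "good" link


def select_peers(rssi_by_neighbour, threshold=RSSI_THRESHOLD_DBM):
    """Local step at a collaborative node: keep neighbours heard well enough."""
    return sorted(n for n, rssi in rssi_by_neighbour.items() if rssi >= threshold)


def collate(peer_reports):
    """Root step: split the reported edges into bidirectional and unidirectional."""
    directed = {(n, p) for n, peers in peer_reports.items() for p in peers}
    bidirectional = {(u, v) for (u, v) in directed if (v, u) in directed}
    return bidirectional, directed - bidirectional


# Hypothetical RSSI readings (dBm) observed at three nodes.
reports = {
    "a": select_peers({"b": -70, "c": -95}),
    "b": select_peers({"a": -72, "c": -60}),
    "c": select_peers({"b": -61, "a": -80}),
}
bidirectional, unidirectional = collate(reports)
```

In this example 'c' hears 'a' well but 'a' does not hear 'c', so the edge (c, a) is kept as unidirectional.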
There may be multiple SSSPTs available for a single source vertex. A progressive node betweenness centrality metric is employed to select a shortest path tree that is spread out whose edges lie in minimal number of shortest routes. This ensures that the P2P data traffic is distributed across different nodes to the maximum possible extent and reduces congestion. Figure 3a displays a Graph, G = (V, E) with |V| = 11 and |E| = 15 . For instance, multiple shortest path trees can be formed for the source node a as shown in Fig. 3b. Though all of them are shortest path trees having same path length, the SPT in 2 of Fig. 3b distributes P2P traffic effectively.
Node betweenness centrality (NBC) of a node is a measure of the number of shortest paths that pass through it. In a Graph having a set of vertices V and edges E, let $\sigma(a, z)$ represent the total number of shortest paths between any two vertices a and z. Let $\sigma(a, x, z)$ denote the number of the shortest paths between a and z that pass through the vertex x. Then the node betweenness centrality for the vertex x is given by,

$$NBC(x) = \sum_{a \neq x \neq z \in V} \frac{\sigma(a, x, z)}{\sigma(a, z)} \tag{1}$$

The summation happens over all the distinct pairs of vertices in the graph. In C3P-RPL, P2P traffic flows only on one chosen path between any two nodes. So essentially $\sigma(a, z)$ is 1 and $\sigma(a, x, z)$ is either 1 or 0, depending on whether the node x lies in the shortest path between a and z. So Eq. 1 becomes

$$NBC(x) = \sum_{a \neq x \neq z \in V} \sigma(a, x, z) \tag{2}$$

This progressive NBC score of a node is updated while forming single source shortest path trees with the number of shortest paths in which it features. When multiple shortest paths with different predecessors are available, the node having a small NBC score is chosen as the predecessor. This ensures that an edge or link is reused minimally. This is important for an RPL network having a radio duty cycling (RDC) layer, which allows a node's radio to sleep for most of the time to conserve energy. Multiple nodes forwarding data traffic through a single node is a major reason for packet loss in an RPL network. Generally, shortest paths are identified with a breadth first search algorithm (BFS) [38]. However, this proposed work creates an optimal shortest path tree for each node using the NBC scores.
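With a single chosen path per node pair, the simplified score reduces to counting the chosen paths in which a node appears as an intermediate hop. A minimal sketch, where the list of paths is hypothetical input rather than output of the paper's algorithm:

```python
from collections import Counter


def simplified_nbc(chosen_paths):
    """Count, per node, the chosen paths that pass through it."""
    score = Counter()
    for path in chosen_paths:
        for node in path[1:-1]:  # endpoints do not count as "passing through"
            score[node] += 1
    return score


# Three hypothetical chosen paths, each listed from source to destination.
paths = [["a", "b", "c"], ["a", "b", "g"], ["a", "j", "i", "h"]]
scores = simplified_nbc(paths)
```

A node that never appears as an intermediate hop, such as 'a' here, keeps a score of zero.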
Fig. 3 Single source shortest path trees. a A directed graph G having 11 vertices and 15 edges b Possible single source shortest path trees rooted at vertex a
Table 1 provides explanations for the notations used in the algorithm. Algorithm 1 combines BFS and NBC scores for selecting a distributed yet shortest route between any two nodes. The algorithm creates an SPT for every node in the List by considering each one of them as a source node. Nodes are assigned level numbers in ascending order, starting from the source node and moving further to their neighbours. The level numbers mark the order of the nodes in the tree and a non-zero level number indicates the presence of that node in the Edgelist. Starting from the source node, neighbour nodes are gathered at each level. A new neighbor is added to the Edgelist as a (parent, neighbor) pair. The NBC score of the parent node is incremented to indicate the passage of a new shortest path through it. The predecessor of the neighbor node is updated with the parent node. As the algorithm progresses to gather neighbors for multiple nodes at the same level, it could encounter a neighbour node with a non-zero level number. This indicates that the neighbor node has been visited through a different parent. In the event of both paths being equally short, the one whose parent has the lowest NBC score is chosen. When the parent changes, the edge with the old predecessor is replaced with the new parent in the Edgelist. Their NBC scores and predecessor values are changed accordingly. The NBC score of a node is carried over throughout the formation of all SPTs, so as to distribute P2P traffic across the entire network. Thereby, this scheme offers shorter paths than RPL's storing mode and saves energy by reducing the number of radio hops required for a packet to reach its peer node.
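The description above can be approximated in a few lines. This is a sketch of the idea (level-order BFS with an NBC tie-break between equally short parents), not a transcription of Algorithm 1: the function names and the five-node graph are hypothetical, and only the immediate parent's score is adjusted, as the text describes.

```python
from collections import defaultdict


def build_spt(adj, source, nbc):
    """Level-order BFS from source; ties between parents go to the lower NBC."""
    level = {source: 0}
    pred = {}
    frontier = [source]
    while frontier:
        next_frontier = []
        for node in frontier:
            for nbr in adj.get(node, ()):
                if nbr not in level:
                    # First visit: record the edge and charge the parent.
                    level[nbr] = level[node] + 1
                    pred[nbr] = node
                    nbc[node] += 1
                    next_frontier.append(nbr)
                elif level[nbr] == level[node] + 1 and nbc[node] < nbc[pred[nbr]]:
                    # Equally short path through a less loaded parent: swap.
                    nbc[pred[nbr]] -= 1
                    pred[nbr] = node
                    nbc[node] += 1
        frontier = next_frontier
    # Edge list as (parent, child) pairs in level order.
    return [(pred[n], n) for n in sorted(pred, key=level.get)]


def build_all_spts(adj):
    """Build one SPT per node, carrying the NBC scores across all trees."""
    nbc = defaultdict(int)
    return {src: build_spt(adj, src, nbc) for src in sorted(adj)}


# Hypothetical graph: 'd' is reachable from 's' through 'a' or 'b'.
adj = {
    "s": ["a", "b"],
    "a": ["s", "c", "d"],
    "b": ["s", "d"],
    "c": ["a"],
    "d": ["a", "b"],
}
demo_nbc = defaultdict(int)
demo_edges = build_spt(adj, "s", demo_nbc)
```

In this example 'd' is first reached through 'a', which already serves 'c', and is then moved to the less loaded parent 'b', so the two branches carry one child each.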
Algorithm Complexity: Let n be the number of nodes in the collated network and e be the number of directed edges in the network. The algorithm traverses all the directed edges for forming a single SPT. All of the outgoing edges of a node have to be processed to compare the NBC scores and select a path with minimum traffic flow. The inner for loops, though nested, run only once for each edge and amount to O(e). The steps for gathering nodes and their neighbors at each level take at most O(n) operations and run once for each level. Therefore, a single SPT needs O(n + e) operations and the entire algorithm takes O(n * (n + e)) for completion. The space requirement is O(n + (n * MAX_NBR)) as each node is allowed to report only MAX_NBR peers to the root node. MAX_NBR is taken as 20 in this study.
Table 1 Notations used in the algorithm
Notation | Description
NBC | Node betweenness centrality score of a node
(n, nbr) | A directed edge in SPT, with n as starting node and nbr as ending node
Edgelist | SPT stored as a list of edges in the level-order
(parent, child) | A single edge in the ordered Edgelist
dist(x,y) | Distance, as number of edges between node x and node y

Implementation specifics
The design of C3P-RPL touches upon multiple aspects of routing to improve P2P routing. First, link local addresses in the format fe80::xxxx are chosen for the nodes, where the last 16 bits of the address are derived from the MAC address of the node. This short addressing mechanism is adopted for ease of information collation and dissemination. The topology of the network is formed from the lists of peers obtained from the collaborative nodes. The topology information object (TIO) is an Internet Control Message Protocol (ICMP6) packet that carries the topology information from the collaborative nodes to the root node. This new control packet makes it possible for a network to have a mix of non-collaborative and collaborative nodes. The collaborative nodes use a new RPL public API to trigger the TIO transmission. The format of a TIO is presented in Fig. 4: the first field holds the total number of peers listed, followed by their short addresses. Construction of SPT(n) for each n occurs at the root node; the trees are essentially minimum spanning trees having the smallest number of edges connecting a source node with all other nodes. Once an SPT(n) is created, it needs to be disseminated to the node n. A P2P Route Information Object (PRIO) ICMP6 message is utilized to send the SPT back to each node n. As an SPT has edge pairs for all the nodes in the network, it is split into multiple messages to comply with the payload size of an IEEE 802.15.4 packet. Figure 5 provides the format of the base object of this ICMP6 control message, which has a sequence number and the number of edges in the SPT, followed by the edge pairs.
The SPT is stored in the respective nodes as a list of edges and is used for deriving next hops. These shorter routes are utilized for forwarding a P2P packet to its destination. The control overhead increases slightly but when compared to the overall DIS, DIO and DAO overhead, this is negligible.
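The two control messages can be illustrated with a byte-level sketch. The field widths are assumptions, since Figs. 4 and 5 are not reproduced here: a one-byte peer count, 16-bit short addresses in network byte order, a one-byte sequence number, a one-byte edge count taken per message, and a hypothetical limit of 10 edges per PRIO fragment.

```python
import struct


def pack_tio(peers):
    """TIO payload: peer count, then the 16-bit short address of each peer."""
    return struct.pack("!B%dH" % len(peers), len(peers), *peers)


def unpack_tio(payload):
    """Inverse of pack_tio, as the root would apply it."""
    (count,) = struct.unpack_from("!B", payload, 0)
    return list(struct.unpack_from("!%dH" % count, payload, 1))


def split_prio(edges, max_edges_per_msg=10):
    """Split an SPT edge list into PRIO payloads that fit a small frame."""
    messages = []
    for seq, start in enumerate(range(0, len(edges), max_edges_per_msg)):
        chunk = edges[start:start + max_edges_per_msg]
        flat = [addr for edge in chunk for addr in edge]
        messages.append(struct.pack("!BB%dH" % len(flat), seq, len(chunk), *flat))
    return messages


tio = pack_tio([0x0001, 0x00A2])

# An SPT for a 47-node network has 46 edges; short addresses here are made up.
spt_edges = [(i, i + 1) for i in range(46)]
prio_messages = split_prio(spt_edges)
```

With these assumed widths each edge costs four bytes, so the 46 edges of a 47-node SPT travel in five fragments of at most 42 bytes each.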

Extraction of routes from a single source shortest path tree
A single source shortest path tree is the most favorable method to store the routes as it occupies less space and can be locally updated for changes in the network. A route can be extracted from a single source shortest path tree by backtracking. It is a repetitive process that starts with a destination node in the SSSPT and traverses back until the path reaches the source node. The next hop node is the one having the source node as its predecessor. The edge list has ordered (parent, child) pairs that represent the edges of the SSSPT in increasing order of path length, starting with the source node. For instance, the ordered list of edges for the SSSPT shown in 2 of Fig. 3b would be (a, b), (a, j), (b, c), (b, g), (j, i), (j, k), (c, d), (g, f), (i, h) and (d, e). Here nodes b and j have the source node a as their predecessor, meaning that they are direct neighbours of a. There is a shortest route to c through b, denoting that the next hop for c is neighbor b.
Each node in the SPT is a destination and the next hop should be identified for each of them. The function GET_HOP is called with the source node and the destination node. It compares the parent node of the destination with the source and returns when there is a match. Otherwise, it recurses with the parent node as the new destination. Starting from a node, it traverses a branch through the ancestor nodes to reach the source node. Algorithm 2 extracts next hops and forwards P2P packets through them. With the extracted next hops, a peer to peer route is available before the start of P2P communication. This is advantageous for simultaneous peer to peer data packet transmission as there is no route request / reply concept.
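The backtracking can be illustrated on the edge list given for the SSSPT in 2 of Fig. 3b. The sketch below follows the recursive description of the next hop function; the dictionary-based parent lookup and the lower-case function name are implementation choices made here, not details taken from Algorithm 2.

```python
def get_hop(parent, source, destination):
    """Return the neighbour of source that leads towards destination."""
    if parent[destination] == source:
        return destination
    return get_hop(parent, source, parent[destination])


# Ordered (parent, child) edge list of the SPT rooted at 'a'.
edgelist = [("a", "b"), ("a", "j"), ("b", "c"), ("b", "g"), ("j", "i"),
            ("j", "k"), ("c", "d"), ("g", "f"), ("i", "h"), ("d", "e")]
parent = {child: par for par, child in edgelist}

# Next hop for every destination in the tree.
next_hops = {dest: get_hop(parent, "a", dest) for dest in parent}
```

Destination 'e', four hops away, resolves to next hop 'b' after walking e, d, c, b, while 'h' resolves to 'j'. The sketch raises a KeyError if the destination is the source itself or is absent from the tree.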
Algorithm Complexity: It takes a node in the SPT and employs recursion to identify an ancestor whose parent is the source node. Let n be the number of nodes in the SPT. Let m represent the level of a node in the SPT, with the level of the source node marked as 0. A node at level one requires only one iteration because its parent is the source node. The time complexity for such nodes is T(1) = 1. It means that the node is a neighbor and the next hop is in the link of the source node. Similarly, a node at level

Reconstruction of SSSPT for new node and edge inclusions
Shortest paths between nodes might change as new nodes are added to the network. When the new node has links to more than one existing node, new paths form between existing nodes through the new node. Hence, reconstruction of an SPT becomes essential during new node inclusion or link formation. Performing the collation and dissemination process for every edge or node inclusion is not feasible as it involves exchanging TIOs and PRIOs. Therefore, a viable alternative is to update the existing SPT stored in an individual node for node and edge inclusions. This approach is equivalent to the incremental shortest path solution, where existing shortest paths are evaluated for impact and updated. G. Ramalingam et al. and A. Slobbe et al. provide bounded ways to incrementally update shortest paths [39,40]. Let G' (Fig. 6a) be the graph after adding a new vertex z and six new edges (a,z), (z,a), (i,z), (z,i), (h,z), (z,h) to the graph G in Fig. 3a. Instead of considering G' as a new graph and constructing SPTs from the start, an alternative is to incrementally update the existing SPTs. In each SPT, the affected shortest paths are identified and the edges are updated accordingly. Figure 6b illustrates the updates to the SPT that is rooted in vertex a. Incremental updating of an SPT is a two-fold process where the first step is to add the new node to the existing shortest path tree at the appropriate position. The next step is to check whether a path shorter than the existing ones now exists. When there is a new shortest path available through the new node, the predecessor of the impacted destination node is changed to the newly added node. Algorithm 3 has two separate functions for node inserts and edge updates.
Add_Node measures the path length between the source node and the neighbors of the new node. It attaches the new node to the neighbor having the smallest path length to the source node. Update_Path checks for the existence of a shorter route between the source node and the neighbors of the new node. It updates the Edgelist for a shorter route. After the incremental update, the SPT has the shortest paths from the source node to all other nodes. When there are multiple node additions, the process is repeated for each node in a sequential manner.
Fig. 6 Incremental Shortest Path. a The graph G' with a new node 'z' and six new edges included to G. b The SPT at node a after the incremental update
Algorithm Complexity: The edge distance from the source node can be calculated while extracting next hops from the SPT. Then, dist(S, nbr) becomes an O(1) operation. Searching for and deleting an edge in the SPT would take at most O(n) time.
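The two-step incremental update can be illustrated with the following Python sketch. It is not the paper's Algorithm 3: the names `hop_distance`, `add_node`, `update_path` and `to_edge_list` are assumptions for this example, path length is assumed to be the plain hop count, and every neighbour of the new node is assumed to be already present in the SPT. For simplicity the distance is recomputed by walking up the tree rather than cached as an O(1) lookup.

```python
# SPT rooted at 'a' as an ordered (parent, child) edge list, from the text.
EDGE_LIST = [
    ("a", "b"), ("a", "j"), ("b", "c"), ("b", "g"), ("j", "i"),
    ("j", "k"), ("c", "d"), ("g", "f"), ("i", "h"), ("d", "e"),
]


def hop_distance(parent_of, source, node):
    """Count hops from `node` up to `source` by following predecessors."""
    hops = 0
    while node != source:
        node = parent_of[node]
        hops += 1
    return hops


def add_node(parent_of, source, new_node, neighbours):
    """Attach `new_node` below the neighbour that is closest to the source."""
    best = min(neighbours, key=lambda n: hop_distance(parent_of, source, n))
    parent_of[new_node] = best


def update_path(parent_of, source, new_node, neighbours):
    """Re-parent any neighbour that becomes closer to the source via `new_node`."""
    for nbr in neighbours:
        if nbr == source:
            continue
        via_new = hop_distance(parent_of, source, new_node) + 1
        if via_new < hop_distance(parent_of, source, nbr):
            parent_of[nbr] = new_node


def to_edge_list(parent_of, source):
    """Rebuild the edge list in increasing order of path length from the source."""
    edges = [(parent, child) for child, parent in parent_of.items()]
    return sorted(edges, key=lambda e: hop_distance(parent_of, source, e[1]))


# New vertex z with links to a, i and h, as in the graph G' of the text.
parent_of = {child: parent for parent, child in EDGE_LIST}
z_neighbours = ["a", "i", "h"]
add_node(parent_of, "a", "z", z_neighbours)
update_path(parent_of, "a", "z", z_neighbours)
UPDATED = to_edge_list(parent_of, "a")
```

Under the hop-count assumption, z attaches directly to the source a, and h is re-parented from i to z because its path shrinks from three hops to two. Node i keeps j as its predecessor since the route through z is not shorter.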

Implementation specifics
Incremental updates to an existing SPT are limited to the nodes that are very near to the newly added node. The rationale is that C3P-RPL nodes have next hops stored at each intermediate node, so updating the nodes closest to the new path is more advantageous than updating nodes that are far off. In addition to sending a TIO to the root node, a node can be configured to broadcast it to its immediate neighbours. When an SPT is already installed in a neighbouring node, that node automatically goes through the incremental update process. The TIO control message provides the incoming and outgoing edges of a new node. Algorithm 3 adds the new node to the shortest path tree and updates existing shortest paths to nodes that are on the outgoing edges of the new node. Upon receiving a new TIO, the root updates the node list, generates a new SPT rooted at the new node and sends PRIO control messages back to the new node. In essence, the root node delivers an optimal SPT to the new node and the existing SPTs in the neighbouring nodes undergo incremental updates locally. As it is not feasible to recreate SPTs frequently for topology changes, this targeted approach is proposed. This selective update approach is found to give good P2P packet delivery in case of small network changes. Complete reconstruction of the entire set of shortest path trees is preferred for a major network revamp or at periodic intervals, and can be handled through configuration settings.
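The root-side handling of a new TIO can be sketched as follows. This is a simplified illustration and not the C3P-RPL implementation: it uses a plain breadth-first search on hop count and ignores the node betweenness centrality score that the proposed approach uses to spread paths. The topology, the function names and the per-message limit `max_edges` are all hypothetical.

```python
from collections import deque


def build_spt(adjacency, source):
    """Breadth-first SPT returned as an ordered (parent, child) edge list."""
    visited = {source}
    queue = deque([source])
    edges = []
    while queue:
        node = queue.popleft()
        for nbr in sorted(adjacency.get(node, ())):
            if nbr not in visited:
                visited.add(nbr)
                edges.append((node, nbr))
                queue.append(nbr)
    return edges


def handle_tio(adjacency, node, outgoing, incoming):
    """Record the edges reported in a TIO and build the SPT rooted at `node`."""
    adjacency.setdefault(node, set()).update(outgoing)
    for nbr in incoming:
        adjacency.setdefault(nbr, set()).add(node)
    return build_spt(adjacency, node)


def split_into_prios(edges, max_edges):
    """Split an edge list into PRIO-sized chunks (max_edges is hypothetical)."""
    return [edges[i:i + max_edges] for i in range(0, len(edges), max_edges)]


# Hypothetical topology already known to the root.
topology = {
    "a": {"b", "j"}, "b": {"a", "c"}, "c": {"b"},
    "j": {"a", "i"}, "i": {"j", "h"}, "h": {"i"},
}

# A new node z reports links to and from a, i and h.
SPT_Z = handle_tio(topology, "z", outgoing={"a", "i", "h"}, incoming={"a", "i", "h"})
PRIOS = split_into_prios(SPT_Z, max_edges=4)
```

Because breadth-first search visits nodes level by level, the resulting edge list is already in increasing order of path length from the new node, which is the ordering the edge list format relies on.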

Control overhead in C3P-RPL
The proposed approach incurs additional control overhead for the network. The collaborative nodes send a Topology Information Object (TIO) message in the upward direction towards the root node. The maximum number of neighbors in the TIO remains the same even when the network scales up. The root node utilizes P2P Route Information Object (PRIO) messages to send back the respective shortest path tree to each collaborative node. There can be multiple PRIOs for each source node depending upon the size of its SPT, and they flow in the downward direction. The peer-to-peer route information is substantial enough to warrant a new type of control packet rather than piggybacking on existing control packets. A new control packet also eases the coexistence of non-collaborative and collaborative nodes in the same network. It should be noted that the proposed approach makes use of local SPT updates for minor topological changes. Therefore, there is no periodic control packet exchange with the root node.
This new control overhead is comparatively very low when there is simultaneous information exchange between all the nodes. The number of P2P data packets would be much larger than the number of TIOs and PRIOs. It should also be noted that the existing standard, P2P-RPL, utilizes DAOs and DIOs to float a DAG for optimal route formation. When all the nodes float a DAG, the number of DIOs and DAOs would be huge and surpass that of the new control packets. The benefits of having proactive shorter paths available in a simultaneous peer-to-peer communication network are likely to outweigh the small control overhead introduced by C3P-RPL.

Experimental evaluation
Both simulation and testbed experiments are performed to verify and quantify the benefits of the proposed strategy. The simulation study is undertaken using the Cooja simulator and the real-time testbed evaluation is performed at FIT IoT-LAB's [41] Lille site.

Network specification
The RPL network is customized to support 50 nodes. All nodes are routers, capable of forwarding data traffic, and are made collaborative to find out the maximum possible benefit. The traffic is point to point and all nodes in the network transmit peer to peer data packets simultaneously to every other node in the network. The implementation of C3P-RPL pre-allocates memory in the order of network_size * Max_neighbours in the root node and network_size in all other nodes for storing the topology and the shortest path tree respectively. Max_neighbours is taken as 20 for this study. The results are labeled as RPL and C3P against the respective protocols. To understand the behavior of the proposed approach under new node additions, the experiments are repeated in the same network with (N − 3) nodes starting at time t and 3 randomly selected nodes starting at t + 12 minutes (N being the number of nodes in the network). The corresponding results are labeled as RPL' and C3P' respectively. The following metrics are collected for evaluation:
• PDR: The packet delivery ratio, expressed as the percentage of peer to peer data packets received relative to the packets sent.
• Latency: The average time taken for a P2P packet to reach the destination node from a sender node.
• Hops: The average peer to peer path length between any two peer nodes in the network.
• Control Overhead: The average number of control messages transmitted by a node. This includes DIS, DIO, DAO, TIO and PRIO.
• Duty Cycle: The percentage of time for which the CPU and radio were active against the total run time of the experiment.
• Traffic distribution: The number of P2P data packets that pass through each node.
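The first three metrics can be derived from per-packet records along the following lines. This is a minimal sketch under assumed inputs: the `PacketRecord` structure, its field names and the sample values are hypothetical and do not reflect the log format used in the experiments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PacketRecord:
    """One P2P data packet (hypothetical record layout)."""
    sent_ms: float
    received_ms: Optional[float]  # None when the packet was lost
    hops: Optional[int]           # None when the packet was lost


def summarise(records: List[PacketRecord]) -> dict:
    """Compute PDR (%), average latency (ms) and average hop count."""
    delivered = [r for r in records if r.received_ms is not None]
    pdr = 100.0 * len(delivered) / len(records) if records else 0.0
    if not delivered:
        return {"pdr": pdr, "latency_ms": None, "hops": None}
    latency = sum(r.received_ms - r.sent_ms for r in delivered) / len(delivered)
    hops = sum(r.hops for r in delivered) / len(delivered)
    return {"pdr": pdr, "latency_ms": latency, "hops": hops}


# Illustrative values only: four packets sent, one lost.
SAMPLE = [
    PacketRecord(sent_ms=0.0, received_ms=100.0, hops=2),
    PacketRecord(sent_ms=10.0, received_ms=210.0, hops=3),
    PacketRecord(sent_ms=20.0, received_ms=320.0, hops=4),
    PacketRecord(sent_ms=30.0, received_ms=None, hops=None),
]
SUMMARY = summarise(SAMPLE)
```

Latency and hop count are averaged over delivered packets only, since a lost packet has neither a reception time nor a completed path.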

Simulation study in cooja
The network chosen for testing is a grid topology having 50 wismote nodes. Wismote nodes have a TI MSP430 series 5 [42] processor and a TI CC2520 [43] 2.4 GHz IEEE 802.15.4 transceiver. ContikiRPL in the Contiki operating system [44] is chosen as the RPL implementation and the C3P-RPL changes are made on top of it. 49 nodes are placed in a 7x7 grid and the root node is placed above this grid at the top middle. The distance between nodes in the grid is 30 m and the transmission range and interference range are the default 50 m and 100 m respectively. Hence, corner nodes have two neighbours, nodes along the rim have three neighbours and all others have four neighbours. The configuration parameters are summarized in Table 2. Each node in the grid transmits a data packet every 30 seconds and the packets are unicast to all other nodes in the network. All nodes send and receive peer to peer data packets simultaneously and the simulation is run for one hour.
Figure 7a highlights the packet delivery ratio of all the nodes in the network. The root node does not send any data packets and facilitates only the topology collation and route dissemination. The average PDR is 88.8% for RPL and improves to 95.8% for the proposed C3P-RPL. In spite of having enough memory to hold routes, the nodes experience packet loss. The probability of packet loss increases with path length, as every link adds a chance of collision and congestion. The maximum number of MAC layer retransmissions is set at 5 for low energy operations and packets are dropped after five failed attempts. In addition, RPL has to send packets upward until they reach a common ancestor, so the nodes at the junction points experience congestion. C3P-RPL reduces packet loss by routing along shortest paths, which minimizes the path length of each route, and also reduces loss due to congestion by distributing traffic. The node betweenness centrality score, coupled with shortest paths, works well in minimizing loss due to congestion.
The overall interquartile range is small for C3P-RPL, which shows that all nodes perform equally well, in contrast to RPL. In a network where node or edge additions happen after the DAG is formed, RPL's reliability is found to be a little lower, at 86.8%, than in the normal network. This corresponds to the overhead increase seen in Fig. 7d, generated by the new nodes joining the DAG.
Figure 7b displays the average path length, or number of hops, between any two peer nodes in the network. The average path length between nodes in the grid is 6.8 for RPL and is reduced to 4.6 when shortest paths are employed for P2P communication by C3P-RPL. There is a reduction of around 31.5% in the path length between peers. Some of the nodes in RPL have a path length greater than ten, as the packets travel along the DAG from one branch of the grid to another. Introducing new nodes after the network has stabilized (12 minutes from the start time) results in slow convergence in RPL. The path length of C3P' after new node inclusion is slightly higher, as the shortest route updates are targeted and limited to the neighbouring nodes. But this strategy keeps the control overhead lower and maintains the reliability, whereas RPL' has a low reliability in the new network.
Figure 7c presents the average time taken for a P2P packet to travel between two peers in the network. The latency is very high for RPL because packets from the nodes farthest from the root node have to cross all intermediate nodes before reaching the destination. With the reduction in path length and congestion, C3P-RPL reduces the travel time from 3219.6 ms to 1048.6 ms. This amounts to a 67.4% drop in network latency. The graph shows a 28.3% increase in latency for C3P' when compared to C3P, because the incremental shortest path updates are performed only in neighbouring nodes and not universally across all the nodes.
There is a trade-off between increasing the control overhead and increasing the path length of a few P2P routes. However, when compared to RPL', there is a considerable drop of 61.1% in latency for C3P', which justifies this targeted update strategy.

Result discussion
The average control overhead of a node is quantified in Fig. 7d and, as expected, it is slightly elevated for the proposed C3P-RPL. The expected transmission count (ETX) metric is recalculated after every successful data packet transmission and a parent switch happens whenever the path metric improves beyond a threshold. C3P-RPL installs new routes through all the neighbours and as a result shows a small increase in DIO generation. However, it can be seen that the newly added TIO and PRIO control messages are minimal and do not add much to the control overhead. The overhead for C3P' is higher than for C3P, as the three newly added nodes transmit more control packets to become part of the existing DAG. The preferred parent mechanism can be fine-tuned for out-of-DAG packet forwards, but this is considered out of scope for this work.
The chart displayed in Fig. 7e gives the percentage of time for which the CPU, radio transmitter and receiver are active against the total running time of the simulation. A drop in the duty cycle is observed for the proposed strategy, as it reduces the number of hops required for the peer to peer routes. Though the computation is higher for the proposed approach, it reduces the number of intermediate nodes that have to process an incoming packet. It is important to note that the power consumption of a radio is much higher than that of the CPU. There are many outliers shown for RPL, which indicates that a few nodes that are junction heads need to forward more packets, resulting in energy depletion. It can be inferred that C3P-RPL is more suitable than RPL for optimal energy utilization.
The number of data packets transmitted through each node directly depicts the P2P traffic distribution and is shown in Fig. 7f. As it is a scatter plot, only the data points belonging to RPL and C3P-RPL are plotted against each other. The blue dots, representing RPL, show an uneven traffic distribution. A few branch heads cater to heavy traffic while others take in less traffic. In contrast, the green dots represent C3P-RPL, where the traffic is evenly distributed. This indicates that C3P-RPL's data distribution using shortest paths and the NBC score provides well-distributed peer to peer shortest paths between nodes.

SPTs under partial collaboration
To understand the effects of partial collaboration in SPT formation, a simulation study is performed in a grid network of 100 peer nodes. The same simulation parameters are employed and the simulations are run for 600 seconds. Figures 8a, b and c show the grid networks with 75, 50 and 25 collaborative nodes respectively. They are labeled as C75, C50 and C25. The yellow nodes are collaborative nodes and send a TIO to the root node. The purple nodes are non-collaborative and do not share any topology information. Along with these three networks, a grid with 100 collaborating nodes and a grid with zero collaborative nodes are also taken up for the study. They are labeled as C100 and C0. The SPTs generated in the root node are recorded and the average P2P path length is derived from them. The SPTs that are formed in the C75, C50 and C25 networks are smaller in size when compared to the fully collaborative network. When a shortest path is not available from the SPT, C3P-RPL falls back to RPL's approach of sending a packet upward, and the DAG path length is considered for such cases. The DAG path length is recorded in the C0 network, with the nodes sharing only the preferred parent in the TIO. The graph in Fig. 9 demonstrates the average P2P path length of routes. The collaboration is most effective in the network where all nodes participate in the process of constructing the network topology. The path length in the fully collaborative grid C100 is 6.61, compared to RPL's average of 12.9, thereby reducing the path length by 48.72%. The reduction in path length is 28.48%, 13.58% and 4.58% for the corresponding C75, C50 and C25 grids. In essence, for an effective collaboration in a C3P-RPL network, more than half of the nodes would have to share the incremental topology.
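The way the average path length is derived under partial collaboration can be sketched as below: for each node pair the SPT path length is used when one exists, and the DAG path length otherwise. The function names, node labels and hop counts are hypothetical values chosen for illustration and are not taken from the simulations.

```python
def effective_path_length(pair, spt_hops, dag_hops):
    """Use the SPT path length when available, else fall back to the DAG path."""
    return spt_hops.get(pair, dag_hops[pair])


def average_path_length(pairs, spt_hops, dag_hops):
    """Average hop count over all (source, destination) pairs."""
    total = sum(effective_path_length(p, spt_hops, dag_hops) for p in pairs)
    return total / len(pairs)


def reduction_percent(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical hop counts along the DAG for three node pairs.
DAG_HOPS = {("n1", "n2"): 6, ("n1", "n3"): 4, ("n2", "n3"): 8}
# Only two of the pairs have a shortest path available from an SPT.
SPT_HOPS = {("n1", "n2"): 3, ("n2", "n3"): 5}

PAIRS = list(DAG_HOPS)
AVG_DAG = average_path_length(PAIRS, {}, DAG_HOPS)          # no collaboration
AVG_PARTIAL = average_path_length(PAIRS, SPT_HOPS, DAG_HOPS)
REDUCTION = reduction_percent(AVG_DAG, AVG_PARTIAL)
```

With an empty SPT table the function reproduces the pure DAG average, which corresponds to the C0 baseline in the study. As more pairs gain an SPT entry, the average moves towards the fully collaborative case.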

Experimentation in FIT IoT-LAB testbed
FIT IoT-LAB provides multiple testbed sites with hundreds of nodes for testing LLNs and multi-hop networks. They offer at least 20 different sets of boards with different MCUs and radios, and support multiple operating systems. For this study, nodes having an ARM Cortex M3 MCU with an Atmel radio [45] are chosen. These nodes have 64 kB of RAM and 256 kB of ROM [46]. The positioning of the chosen 47 nodes in the Lille testbed is shown in Fig. 10.
In order to create a multi-hop network, the transmission power of the M3 nodes is set to a low value so that each node can reach only a small set of neighbours. Specifically, the transmission power is set to -17 dBm and the receiver sensitivity is kept at -72 dBm, to attain a diameter of six hops. The UIP_CONF_MAX_ROUTES parameter is set to the network size in order to provide sufficient space for route table entries. The channel check rate is set to a low 8 Hz so that the radio duty cycle is kept to a minimum to ensure low power operations.
Figure 11a exhibits the packet delivery ratio of all the nodes in the network. The average PDR for peer to peer data packet delivery in RPL is 78.64%, which is lower than that of the grid network shown in Fig. 7a. As all nodes have sufficient memory to hold routes, RPL's delivery does not fail due to a lack of routes. The losses are instead due to multiple nodes using the same route for forwarding P2P packets. Though the path length is smaller in this topology, 73.8% of the data packets move through the root node, indicating that the network has many DAG branches. This congestion reduces the delivery rate. However, C3P-RPL, with the proposed centrality assisted P2P path selection method, provides shorter and distributed routes that improve the delivery to 90.14%. It shows that node betweenness centrality is a good measure for distributing simultaneous P2P traffic.
Figure 11b displays the number of hops needed for a P2P packet to reach its destination. The average path length is 4.2 for RPL's storing mode, whereas it reduces to 2.9 for C3P-RPL with its out-of-DAG shortest paths. C3P-RPL's method of utilizing topology information to derive shorter paths is more effective in reducing the length of a P2P packet's travel. The outliers seen in this graph for the new network, where nodes are added after the DAG creation, represent the longer paths constructed in the DAG prior to the new node additions.
There is no significant increase in overall packet hops, which indicates that C3P-RPL is resilient in dynamic networks too.

Results discussion
The latency metric values, shown in the bar chart of Fig. 11c, attest that the short and distributed P2P routes allow a quick turnaround time for a packet in the network along with better reliability. The average latency is brought down to 429.25 ms in C3P and 408.5 ms in C3P'. The proposed approach reduces the latency by around 65%. This is because, in RPL, packets between nodes belonging to different DODAG branches travel all the way up to a common ancestor node and then down. In the tested network, the number of packets that go through the root node is 73% lower for C3P than for RPL. A lower latency makes it easier for devices to collaborate in real time.
As C3P-RPL introduces two new control messages and all nodes in the network are made to exchange them with the root node, a slight increase in control overhead is expected. Figure 11d reveals an increase in overhead for the proposed scheme, but it is very small when compared to the overall control overhead in the network. Unlike in the grid topology, the control overhead of the proposed approach is similar to that of RPL. This is because of the shorter path length to the root node when compared to the grid. Once the network receives the new routes, the overhead does not increase further.
LLNs are known for low energy operations and the proposed implementation is measured against RPL in this respect. The nodes use ContikiMAC [47], whose effective sleep and wake-up cycles allow the nodes to keep the active time of the radio very low. The radio can be in standby mode for the rest of the time. It can be inferred from the graph in Fig. 11e that the duty cycle of C3P-RPL is better than that of RPL. This corresponds to the reduction in the path length of the peer to peer routes. Also, the overall duty cycle is kept below 2%, which indicates that the new approach is suitable for battery operated nodes. The receiver is relatively more active than the transmitter because the number of neighbours is higher in the testbed network than in the grid. On average, a node in the grid network has only 3 neighbours, whereas a node in the network formed in the testbed has 8 neighbours.
Traffic distribution of the proposed approach can be understood from the graph in Fig. 11f. The extent of the distribution depends on the number of alternate nodes available. Though the average number of neighbours is high in the testbed, a few nodes have only a couple of neighbours, whereas all the nodes in the grid have a similar number of neighbours. Hence, the distribution is better in the uniform grid network than in the testbed. All the metrics indicate that the proposed approach, with node centrality assisted P2P path selection, offers reliable and faster peer to peer communication between all the nodes in the network.

C3P-RPL in a testbed with higher node density
The network in the Lille testbed is expanded to 76 nodes and the experiments are repeated to measure C3P-RPL's performance under a different node density. The area of the network remains the same, with dimensions of 15.8 m x 16 m. The new network is labeled as RPL_76 for RPL and C3P_76 for C3P-RPL. In this new network, every node has an average of 12 neighbors. This is in contrast to the average of 8 neighbors in the earlier network formed with 47 nodes. Figure 12a reveals that the packet delivery ratio for RPL_76 and C3P_76 is lower than in the 47 node network. This is expected because increasing the number of nodes in a small area aggravates congestion. However, C3P-RPL performs better than RPL in this situation by distributing the traffic across multiple nodes, and delivers 75.27% of P2P packets. Ideally, the performance of C3P-RPL should be better, as more alternate paths are available. However, the reception rate of PRIOs is low in C3P_76, as the downward packet flow is impacted heavily in a dense network. Due to this, the full potential of the proposed approach is not realized in the new network.
The graph in Fig. 12b shows an average P2P path length of 3.93 for RPL_76. The decreased path length correlates with the reduced packet deliveries in the network: packets that go through the longer route via the root node are lost. With part of the C3P_76 network using the DAG path, the latency improvement is limited to 39% when compared to RPL_76, as seen in Fig. 12c. It can also be noted that the latency of RPL_76 is lower than that of RPL_47. This again reflects the loss of packets in the former due to the higher node density. Figure 12d presents the comparative control overhead for all four cases. The control overhead of C3P_76 is slightly higher, as it delivers additional downward control packets and encounters losses. Packet drops cause an increase in DAOs in order to choose a different path for fail-safe delivery. The circulation of a larger number of P2P packets in the C3P_76 network causes an increase in its overall radio transmissions. The radio duty cycle shows a corresponding rise in Fig. 12e. With reduced PRIO reception, the C3P_76 network resembles a partial collaboration situation where a few nodes use non-optimal DAG paths. C3P-RPL could perhaps use a receiver-on approach for disseminating SPTs in such dense scenarios. The results suggest that C3P-RPL provides effective improvements in a high density network even with only a subset of the shortest routes.

Conclusion
Real-time device to device collaboration in an IoT network requires a reliable and fast means of communication between devices. Despite having sufficient memory to store routes, RPL's storing mode falls short in providing efficient peer to peer routes. This paper proposes a new approach to select, store and sustain optimal peer to peer routes. The root node collates the topology of the network from the collaborative nodes. A node betweenness centrality score is used to create optimal single source shortest path trees that minimize the number of shortest paths crossing every node. Local incremental updates to an existing shortest path tree sustain shortest routes in neighbouring nodes without having to disseminate the updates to all nodes. The simulations and testbed evaluations show that the proposed approach improves overall P2P packet delivery, reduces the path lengths of the routes and lowers packet delay in the network, while ensuring low energy operations. In a simulation of 100 collaborative nodes, C3P-RPL reduces the average peer-to-peer path length by 48.7%. A testbed having 47 nodes is found to provide a 65% reduction in packet latency, while a dense 76 node testbed offers a 39% improvement. The overhead added by the proposed work is found to be very small when compared to the existing control messages. This makes C3P-RPL suitable for simultaneous P2P communication in a low power network. In future, this work could be extended to study its applicability to dense networks or content centric networks. Collaborative applications in Industry 4.0 and smart cities, where direct communication between devices is essential, stand to gain from it.