Informational Cost and Networks Navigability


 Understanding how information navigates through nodes of a complex network has become an increasingly pressing problem across scientific disciplines. Several approaches have been proposed on the basis of shortest paths or diffusive navigation. However, no existing approaches have tackled the challenges of efficient communication in networks without full knowledge of their global topology under external noise. Here, we develop a first principles approach to determine the informational cost of navigating a network under different levels of external noise. We discover the existence of a trade-off between the ways in which networks route information through shortest paths, their entropies and stability, which define three classes of real-world networks. This approach reveals that environmental pressure has shaped the ways in which information is transferred in bacterial metabolic networks and allowed us to determine the levels of noise at which a protein-protein interaction network seems to work in normal conditions in a cell.

Arguably, complex networks exist to transmit items, generically named here "information", between the entities of a complex system [1][2][3]. Therefore, understanding how information is routed through a network has become a topic of major interest in network science [4][5][6][7][8][9][10][11][12][13]. Most extant approaches to characterizing "information transfer" (IT) in networks rely on the shortest (topological) paths (SP) connecting pairs of nodes [14][15][16][17][18][19]. This assumption has given rise to concepts such as average shortest path length and "small-worldness" [20], closeness and betweenness centrality [21], and global efficiency [22], among others that are now used ubiquitously across disciplines. However, for information to navigate through SPs, its sender needs complete knowledge of the topology of the network, which is rarely the case. Even where such knowledge exists, navigation may occur through different paths in a network [23]. Alternative navigational strategies have been proposed to address the drawbacks of the SP strategies [24], such as the use of random walks [25,26], search information [27,28], communicability [29,30], strategies based on node similarity [4], and others. Although these measures rest on physical grounds (the ubiquity of diffusive processes in nature and society), the majority of them have been introduced in ad hoc ways. This ad hoc nature is revealed, for instance, in the following example. Consider the navigation of information between the nodes labeled v and w on the network illustrated in Fig. 1(a). The SP does not differentiate between the paths $P_1$ and $P_2$ because they have the same topological length. The search information measure [27,28] was proposed to resolve this kind of problem. However, it is also unable to distinguish between the two paths $P_1$ and $P_2$ (see Fig. 1(a)).
Assuming a random walk and calculating the commuting time sorts out the problem and identifies path $P_1$ as shorter than $P_2$ (see explanation in Fig. 1(a)). However, if we enlarge the network as in Fig. 1(b), neither the SP, nor the search information, nor the commuting time differentiates the new paths $P_1$ and $P_2$, although we expect that $P_2$ should be shorter than $P_1$ (see Fig. 1(a)). The fact that one can get lost in space by navigating this way was previously analyzed by von Luxburg [31].
Here we introduce a first-principles physical formalism to analyze IT in networks and conclude that the landscape of IT in these systems is more complex than previously considered. We find a trade-off between purely diffusive and purely deterministic SP routing. Under "normal" operational conditions there are networks behaving like pure quantum states, for which complete knowledge of their topology brings no information about the routes of navigation. Other networks route information through SPs in more classical-like (mixed) states. The zoo is completed by networks in which both mechanisms are mixed. The new formalism allows us to reveal evolutionary fingerprints produced by environmental pressure on bacterial metabolic networks, as well as to gain insights into the level of noise at which a protein-protein interaction network operates in a "normal" cellular environment. Namely, the current work shows that no preconceived notion of navigability is needed to understand the different routing patterns of information in complex networks. Such patterns of information transfer emerge naturally from the topology of the networks. Additionally, we find that these patterns are not rigidly determined by the network connectivity; they can be modified by the levels of noise to which the networks are subjected.

Formalism
We consider here undirected networks $G = (V, E)$ (the directed case is treated in Methods), where the transmission of "information" between two adjacent nodes is determined by the Hamiltonian
$$\hat{H} = \sum_{v} \alpha_v \hat{c}_v^{\dagger}\hat{c}_v + \sum_{(v,w) \in E} \gamma_{vw}\left(\hat{c}_v^{\dagger}\hat{c}_w + \hat{c}_w^{\dagger}\hat{c}_v\right),$$
where $\alpha_v$ is the on-site energy retaining information at the node $v$, and $\gamma_{vw}$ is the energy needed for information to hop from node $v$ to its nearest neighbor $w$. The operator $\hat{c}_v^{\dagger}$ ($\hat{c}_v$) creates (annihilates) a particle at the site $v$. For the sake of simplicity we set $\alpha_v = 0$ and $\gamma_{vw} = -1$. Therefore, $\hat{H} = -A$, where $A$ is the adjacency matrix of $G$. We consider that the process of IT in the network occurs in the presence of external noise, which is introduced into the system by submerging the network into a thermal bath of temperature $T$. Considering the system at equilibrium, we apply the "imaginary time formalism" to transform the propagator $\langle v|\exp(-it\hat{H}/\hbar)|w\rangle$ into the thermal Green's function (TGF) (namely, considering $\beta = 1/T$, where the Boltzmann constant is set to unity) [32]:
$$\Gamma_{vw}(G, \beta) = \langle e_v|\exp(-\beta\hat{H})|e_w\rangle = \langle e_v|\exp(\beta A)|e_w\rangle,$$
where $|e_v\rangle$ is the standard basis vector in $\mathbb{R}^n$ which has 1 in the $v$th entry and zero elsewhere. Let $\Gamma(G, \beta)$ be the matrix whose entries are the corresponding TGF and let $Q(G, \beta) := \mathrm{diag}(\Gamma(G, \beta))$. We define
$$C(G, \beta) := Q(G, \beta)^{-1/2}\,\Gamma(G, \beta)\,Q(G, \beta)^{-1/2}$$
as the matrix whose nondiagonal entries are the normalized TGF. We prove in Methods that $C(G, \beta)$ is a correlation matrix, where its entry $r_{vw}(\beta)$ is the uncentered correlation coefficient between the nodes $v$ and $w$ of $G$. Because $r_{vw}(\beta)$ is a correlation coefficient we can write it as
$$r_{vw}(\beta) = \frac{\mathrm{cov}_{v,w}(\beta)}{\sigma_v(\beta)\,\sigma_w(\beta)},$$
where $\mathrm{cov}_{v,w}(\beta) = \Gamma_{vw}(G, \beta)$ is the covariance of the two nodes, and $\sigma_v(\beta) = \sqrt{\Gamma_{vv}(G, \beta)}$ is the standard deviation of the corresponding node (see Methods for details). The term $\sigma_v^2(\beta)$ accounts for the effects of thermal fluctuations (TF) on the position of node $v$ in the Euclidean space, while $\mathrm{cov}_{v,w}(\beta) = r_{vw}(\beta)\,\sigma_v(\beta)\,\sigma_w(\beta)$ describes the combined effects of TF and IT on the correlated motion of both nodes. We prove in Methods that for simple networks $\sigma_v(\beta)\,\sigma_w(\beta) \ge \mathrm{cov}_{v,w}(\beta)$, which means that IT has a compensatory effect over the TF, i.e., they have different signs.
Therefore, the net effects produced by IT between the two nodes are obtained by the following difference:
$$\sigma_v^2(\beta) + \sigma_w^2(\beta) - 2\,\mathrm{cov}_{v,w}(\beta),$$
which corresponds to the variance of the difference between the two correlated variables, $\sigma_{v-w}^2(\beta)$. It quantifies the cost that IT must pay to overcome the thermal oscillations of the individual nodes due to the temperature of the thermal bath. We prove in Methods that $\sigma_{v-w}(\beta)$ is a Euclidean distance. When $\sigma_{v-w}(\beta)$ is small, $r_{vw}(\beta)$ is large enough to overcome the thermal oscillations at relatively low cost. On the contrary, when $\sigma_{v-w}(\beta)$ is large, the thermal fluctuations of the nodes are too big in relation to $r_{vw}(\beta)$, which implies a high cost for IT. Therefore, we identify this variance with the information cost for IT under given thermal conditions, i.e., $I_{vw}(\beta) := \sigma_{v-w}^2(\beta)$.
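As a minimal numerical sketch of the quantities above, the following Python/NumPy code computes the TGF, the correlation matrix and the informational cost for a small graph. It assumes an undirected, unweighted graph given by a symmetric adjacency matrix, and it takes the cost to be the variance $\Gamma_{vv} + \Gamma_{ww} - 2\Gamma_{vw}$, the form used for the matrix $I(G, \beta)$ in Methods. The function names and the example graph are illustrative only and are not taken from the authors' Matlab code.

```python
import numpy as np

def thermal_green(A, beta=1.0):
    """TGF matrix exp(beta * A) for a symmetric adjacency matrix A."""
    lam, U = np.linalg.eigh(A)            # A = U diag(lam) U^T
    return (U * np.exp(beta * lam)) @ U.T

def correlation_matrix(A, beta=1.0):
    """C = Q^{-1/2} Gamma Q^{-1/2}, with Q = diag(Gamma)."""
    G = thermal_green(A, beta)
    sigma = np.sqrt(np.diag(G))           # standard deviations sigma_v
    return G / np.outer(sigma, sigma)

def informational_cost(A, beta=1.0):
    """Matrix of sigma_v^2 + sigma_w^2 - 2 cov_vw for every pair of nodes."""
    G = thermal_green(A, beta)
    s = np.diag(G)
    return s[:, None] + s[None, :] - 2.0 * G

# Illustrative graph (an assumption): triangle 0-1-2 with a pendant node 3 on node 0.
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)

C = correlation_matrix(A, beta=1.0)
I_cost = informational_cost(A, beta=1.0)
I_hot = informational_cost(A, beta=1e-8)  # high-temperature regime, beta -> 0
```

In the high-temperature regime the off-diagonal costs in `I_hot` approach 2, since $\exp(\beta A) \to I$ as $\beta \to 0$.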

Network geometrization and information paths
Technically, in a network we can only transfer information from one node to another which is adjacent to it. Then, in order to capture this "through-paths" information cost, we perform a geometrization of the graph to define a length space on it [33,34]. This is carried out by considering every edge $e = vw$ in $E$ as a compact 1-dimensional manifold with boundary $\partial e = v \cup w$. Let the edge $e = vw$ be endowed with the metric $I_{vw}(\beta)$. We now extend the metric on the edges of $G$ via infima of lengths of curves in the geometrization of $G$. Then, the network becomes a metric length space, which is locally compact, complete and geodesic [34]. We can therefore define the "path informational cost" of a path $P_k$ between the nodes $v$ and $w$ as
$$C_{P_k,vw}(\beta) := \sum_{(i,j) \in P_k} I_{ij}(\beta),$$
and we call the path of minimum cost between $v$ and $w$ the shortest informational path (SIP). To warm up, we first calculate the informational length of the different paths previously studied in Fig. 1. For the graph in Fig. 1(a) we have $C_{P_1,vw}(\beta = 1) \approx 4.072$ and $C_{P_2,vw}(\beta = 1) \approx 4.112$, which indicates that the first path is shorter than the second. For the graph in Fig. 1(b) the results are $C_{P_1,vw}(\beta = 1) \approx 4.174$ and $C_{P_2,vw}(\beta = 1) \approx 4.158$, indicating that the second path is shorter than the first, as expected.
We now turn our attention to the relation between the informational cost of a path and the shortest topological distance between the two endpoints of that path. For this we consider the case when $\beta \to 0$ ($T \to \infty$). In this case the variance of every node is the maximum, $\sigma_v^2(\beta) \to 1$, and the covariance between every pair is minimum, $\mathrm{cov}_{v,w}(\beta \to 0) \to 0$. This situation corresponds to a clear dominance of the thermal fluctuations over the IT. In this situation the length of every edge is equal to 2, which implies that the informational cost of the SIP between $v$ and $w$ tends to $2\,l_{vw}$, where $l_{vw}$ is the length of the shortest (topological) path between the nodes $v$ and $w$ in $G$. The number of edges included in the SIP is then identical to that in the SP. The main difference between SIP and SP is that the former avoids crossing nodes with high $\sigma_v^2(\beta)$, which in general are the nodes with the highest degree, i.e., the hubs of the network. On the contrary, the SP between two nodes crosses some of the hubs of the network with high probability (see Methods).

Navigational cost
Here we define the global difficulty of navigating a network under given thermal conditions by means of the von Neumann entropy:
$$S(G, \beta) = -\mathrm{Tr}\left(\hat{\rho}\,\ln\hat{\rho}\right),$$
where $\hat{\rho} = \Gamma(G, \beta)/Z(G)$ is the density matrix of the Boltzmann thermal state [35], which is the density matrix with maximum von Neumann entropy, and $Z(G) = \mathrm{Tr}\left(\Gamma(G, \beta)\right)$ is the partition function of the network. In a network having many paths with low informational cost $C_{P_k,vw}(\beta)$, the von Neumann entropy is low, indicating a good global navigability of the network. Indeed, when $S(G, \beta) = 0$, the system operates in pure quantum states. When the only available paths to travel between pairs of nodes are those having large values of $I_{vw}(\beta)$, the von Neumann entropy is large. In this case $\hat{\rho} \to \frac{1}{n} I$, where $I$ is the identity matrix, such that $S(G, \beta \to 0) \to \ln n$, which is the maximum value of the von Neumann entropy. This situation is typical of mixed quantum states. A way to quantify how far a network is from its maximum entropy is by means of
$$\Delta S(G, \beta) = \ln n - S(G, \beta).$$
The purity of the system is characterized by means of
$$\Pi = \mathrm{Tr}\left(\hat{\rho}^2\right),$$
where $\Pi = 1$ represents a pure quantum state and $\Pi < 1$ represents a mixed one. Finally, we introduce the thermal expectation of the informational energy of the network, $\bar{E}$ [35].
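The entropy-based quantities of this section can be sketched numerically as follows. The code assumes a symmetric adjacency matrix, the density matrix $\hat{\rho} = e^{\beta A}/\mathrm{Tr}(e^{\beta A})$, and the standard definition of purity as $\mathrm{Tr}(\hat{\rho}^2)$; this last definition is an assumption of the sketch. Everything is evaluated through the eigenvalues of $A$, so no matrix logarithm is needed. Function names are illustrative.

```python
import numpy as np

def density_eigenvalues(A, beta=1.0):
    """Eigenvalues of rho = exp(beta*A) / Tr(exp(beta*A)) for symmetric A."""
    lam = np.linalg.eigvalsh(A)
    w = np.exp(beta * (lam - lam.max()))  # shift avoids overflow; cancels in the ratio
    return w / w.sum()

def von_neumann_entropy(A, beta=1.0):
    """S = -sum_k p_k ln p_k, with the convention 0 ln 0 = 0."""
    p = density_eigenvalues(A, beta)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def entropy_gap(A, beta=1.0):
    """Delta S = ln n - S."""
    return float(np.log(A.shape[0]) - von_neumann_entropy(A, beta))

def purity(A, beta=1.0):
    """Purity Tr(rho^2) = sum_k p_k^2 (standard definition, assumed here)."""
    p = density_eigenvalues(A, beta)
    return float(np.sum(p ** 2))

n = 30
empty = np.zeros((n, n))                  # no edges: rho = I/n, maximally mixed
complete = np.ones((n, n)) - np.eye(n)    # K_n: nearly pure state for large n

S_empty, Pi_empty = von_neumann_entropy(empty), purity(empty)
S_complete, Pi_complete = von_neumann_entropy(complete), purity(complete)
```

For the empty graph the sketch returns the maximum entropy $\ln n$ and purity $1/n$, while for the complete graph the entropy is close to zero and the purity close to one.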

General real-networks analysis
In Fig. 2(a) we represent the results for the networks analyzed here in the unit informational square formed by the normalized $\Delta S$ and the ratio of SIPs routed through SPs (see Methods). We distinguish three clusters of networks according to their proximity to each of the three vertices of the upper triangle. We prove analytically that the exact vertex (0, 1) is occupied by a directed cycle. In this case, $S(G, \beta = 1) = \ln n$ and the density matrix is proportional to the identity matrix (see Methods). This means that information is completely localized at the nodes of the network, with identical probability at each of them. This is characteristic of mixed quantum states. Indeed, in this case $\Pi \approx 0$. This group of networks includes mainly directed networks such as transcription and metabolic networks, the Internet at the autonomous-system level, software relational networks, power grids and electronic circuits. Let us consider the network representing gates or flip-flops connected by wires in the electronic circuit S240 as a representative of this group. In this case, $S(G, \beta) \approx \ln n \approx 5.4307$, indicating the proximity of the von Neumann entropy to its maximum, and $\Pi \approx 0.005$ indicates that this network operates in a mixed quantum state. Consequently, the density matrix brings very poor information about the navigability of this network (see Fig. 2(b)). However, many of the shortest informational paths (SIP) in these networks coincide with SPs, as can be seen in Fig. 2(e). Therefore, these networks operate in a classical-like way, where information diffusing across the network is routed "naturally" through the SPs. Consequently, we can determine the routes of navigation with high probability from the topological information of the network. The class of networks close to the (1, 0)-vertex is mainly formed by large social collaboration networks, protein-protein interaction networks, the neural directed network of C. elegans, and the USA air transport network.
We consider the last one as a representative of this class. Suppose that infective (viral) particles diffuse across this network. In this case such infective particles diffusing through the airports are routed naturally through the SPs in only 36.6% of the almost 55 thousand pairs of nodes. Thus, in the majority of cases, knowledge of the topology of this network is useless to understand the propagation process. However, the von Neumann entropy is close to zero, $S(G, \beta) \approx 1.02 \times 10^{-9}$, and the value of $\Pi \approx 1.000$ indicates that this network operates like a pure quantum state. Therefore, we can gain insights about the viral propagation from the quantum information of the system through its density matrix. The prototype of this class of networks is a star graph to which random edges are added (see Methods).
Finally, the class of (1, 1)-networks is mainly formed by small food webs and small face-to-face social networks. They operate in nearly pure quantum states, but most of the information diffusing across the networks is routed through SPs. For instance, the food web "El Verde" has $S(G, \beta) \approx 6.44 \times 10^{-2}$, $\Pi \approx 0.988$, and the SIPs coincide with the SPs in 83.6% of the cases. Therefore, we can gain insights about IT either from the classical (network topology) or from the quantum (density matrix) representation of the system. As can be seen in Fig. 2(a), the networks in this class tend to display relatively large edge densities. In Methods we prove that the vertex (1, 1) is occupied by fully connected networks. These networks pay an energy cost for routing information through SPs. For instance, they have $\langle\bar{E}\rangle \approx -7.24$, which is significantly smaller in magnitude than that of the (1, 0)-networks, which are pretty stable with $\langle\bar{E}\rangle \approx -16.65$. The (0, 1)-networks pay a large cost in terms of stability for routing most of the information through SPs and working in mixed states, i.e., here $\langle\bar{E}\rangle \approx -0.92$.

Applications
First we consider 116 metabolic networks of bacteria living in different environments. Following Parter et al. [36], these environments are classified into the following groups (in nondecreasing order of their variability): Obligate (O), Specialized (S), Aquatic (Aq), Facultative (F), Multiple (M) and Terrestrial (T). The role played by environmental pressure in shaping the modularity of these metabolic networks has been widely discussed [37][38][39][40]. However, how this environmental variability has changed the way in which information is transmitted across these networks is unknown. Then, let $G(\beta = 1)$ be a metabolic network operating under "normal" conditions in a given cell. We calculate $\Delta S = \ln n - S(G, \beta = 1)$ as well as the proportion of SIPs that coincide with the SPs for the networks in each environmental class. As can be seen in Fig. 3(a), $\Delta S$ increases steadily with the increase of the environmental variability from obligate to terrestrial bacteria. This means that when the number of environmental challenges increases, as in the case of terrestrial bacteria, the metabolic network has to evolve in the direction of purer states (notice that there is an increase of 47% in $\Delta S$ from obligate to terrestrial environments). This implies that the metabolic networks depart from the more classical-like deterministic behavior which they enjoy in obligate bacteria towards more diffusive-like states in the terrestrial ones. As can be seen in Fig. 3(b), in terrestrial bacteria fewer diffusive paths are routed through SPs in comparison with obligate bacteria. The cost that bacteria have to pay for making their metabolic networks more diffusive is reflected in the length of their genomes. In Fig. 3(c) we observe the negative trend between the use of SPs and the length of the genome.
In those bacteria with more diffusive-like IT processes, the genomes are more than 3 million base pairs longer than in those with more classical-like routing through SPs (data from [41]).
Now we analyze the protein-protein interaction (PPI) network of S. cerevisiae (yeast). About a quarter (26.4%) of all 2224 proteins in this PPI network are essential, i.e., their knock-out produces cell death. Since the seminal paper of Jeong et al. [42], which determined that the most connected proteins have more chances of being essential, several methods have been designed to identify those proteins from the topological information of the network [43][44][45]. We perform here an experiment with a two-fold objective. First, we are interested in generalizing the popular centrality measure known as "closeness" and checking how it performs at the task of identifying essential proteins in the yeast PPI network. Second, we are interested in investigating whether we can recover the temperature at which the corresponding network is operating. Let us define the closeness informational centrality as $CC_v(G, \beta)$. We then identify the percentage of essential proteins (EP) in the top x proteins ranked according to $CC_v(G, \beta)$ for $0 < \beta \le 1$. The results, illustrated in Fig. 4(a), indicate that the largest percentages of EPs are identified for values of $\beta$ around 0.6. We notice that the percentages of EPs identified with $CC_v(G, \beta)$ are always significantly larger (up to 20% larger) than those identified with the classical closeness centrality, which never exceeds 42.5%. We then recover this value of $\beta = 0.6$ as the inverse temperature at which this PPI network is "normally" operating in the cell. If we compare the two regimes, $\beta = 1.0$ and $\beta = 0.6$, we can observe significant differences in the functioning of this network. First, the purity of the states drops from $\Pi(G, \beta = 1.0) \approx 0.92$ to $\Pi(G, \beta = 0.6) \approx 0.57$, indicating that the network is working in a more classical-like (mixed) state than it could according to its topology. This implies that the information sent diffusively from the nodes can now be routed through the SPs 45% of the time, in contrast with only 27% of the cases when $\beta = 1.0$.

Conclusions
In this work we propose a formalism for IT on networks under external noise. Using a tight-binding Hamiltonian for a network submerged into a thermal bath, we define a correlation matrix whose entries quantify the combined effects of thermal fluctuations and IT on the coordinated perturbations of nodes in a network. The net effects of IT are recovered as the variance of the difference between the correlated variables, which represent position vectors of the nodes in a Euclidean space. This variance is a squared Euclidean distance between the nodes. Then, via geometrization of the graph we recover SIPs between pairs of nodes, which naturally avoid nodes of high degree when navigating a network. We discover the existence of three main classes of networks: (i) networks operating in mixed states where the diffusing information is naturally routed through shortest paths; (ii) networks operating in pure quantum states where most of the information navigates in a purely diffusive way; (iii) networks operating in pure states but where information is routed through SPs. Networks in the first class operate at a high energy cost, while those in the second one are pretty stable. We reveal that environmental pressure has shaped IT in bacterial metabolic networks, making them operate in more quantum-like states. We also show that the yeast PPI network seems to be operating at a temperature that makes it function in a more classical-like state, allowing larger than expected routing of information through SPs. Finally, we would like to remark that the current approach can be exploited to design networks with IT patterns à la carte.

General definitions
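We consider here directed or undirected graphs $G = (V, E)$. Let $A$ be the adjacency matrix of $G$. A walk of length $k$ starting at node $v$ and ending at node $w$ is a sequence of $k$ consecutive (not necessarily different) edges. The number of such walks is given by $(A^k)_{vw}$. A path is a walk without repetition of nodes or edges. The entries of the TGF, $\Gamma_{vw}(G, \beta) = \left(\exp(\beta A)\right)_{vw}$, count the walks of any length from $v$ to $w$, giving more weight to the shorter than to the longer ones:
$$\Gamma_{vw}(G, \beta) = \sum_{k=0}^{\infty} \frac{\beta^k}{k!}\left(A^k\right)_{vw}.$$
In the case of directed networks the Hamiltonian is non-Hermitian and we need to symmetrize it in order to define metric properties. Otherwise, there are violations of the symmetry axiom of these metrics. Therefore, for these properties we work with a symmetrized adjacency matrix $\tilde{A}(G)$, as for instance in [46]. When the graph is undirected, $\tilde{A}(G) = A(G)$. We will recover the directionality of the networks later on. In the case of the symmetrized matrix $\tilde{A}$ we can always write $\tilde{A} = U \Lambda U^T$, where $U$ is the matrix whose columns are orthonormalized eigenvectors of $\tilde{A}$ and $\Lambda$ is the diagonal matrix of eigenvalues. Let $|\varphi_v\rangle$ be the vector formed by the $v$th column of $U^T$. Then,
$$\tilde{\Gamma}_{vw}(G, \beta) = \langle\varphi_v|e^{\beta\Lambda}|\varphi_w\rangle = \langle x_v(\beta)|x_w(\beta)\rangle, \qquad |x_v(\beta)\rangle := e^{\beta\Lambda/2}|\varphi_v\rangle,$$
so that $\tilde{\Gamma}(G, \beta)$ is a Gram matrix, where $|x_v(\beta)\rangle$ and $|x_w(\beta)\rangle$ are vectors positioning the corresponding nodes at the surface of a Euclidean $n$-dimensional sphere [47,48].
We now prove that $C(G, \beta)$, defined as the normalized $\tilde{\Gamma}(G, \beta)$, is a correlation matrix. For that we will prove that its nondiagonal entries $r_{vw}(\beta)$ are uncentered correlation coefficients between the position vectors $|x_v\rangle$ and $|x_w\rangle$. It is easy to check that
$$r_{vw}(\beta) = \frac{\tilde{\Gamma}_{vw}(G, \beta)}{\sqrt{\tilde{\Gamma}_{vv}(G, \beta)\,\tilde{\Gamma}_{ww}(G, \beta)}},$$
which implies that $r_{vw}(\beta)$ represents the ratio of the information transferred between the two nodes, $\tilde{\Gamma}_{vw}(G, \beta)$, to the information returned to each of the two sources, $\tilde{\Gamma}_{vv}(G, \beta)$ and $\tilde{\Gamma}_{ww}(G, \beta)$. Then, because $\tilde{\Gamma}(G, \beta)$ is a Gram matrix, we have that $\tilde{\Gamma}_{vw}(G, \beta) = \langle x_v|x_w\rangle$ and $\tilde{\Gamma}_{vv}(G, \beta) = \|x_v\|^2$. Thus, it is straightforward to realize that
$$r_{vw}(\beta) = \frac{\langle x_v|x_w\rangle}{\|x_v\|\,\|x_w\|},$$
which is the classical interpretation of an uncentered correlation coefficient. We now prove that, for simple (directed or undirected) graphs, $0 \le r_{vw}(\beta) \le 1$. The upper bound follows from the Cauchy-Schwarz inequality. The lower bound requires $\tilde{\Gamma}_{vw}(G, \beta) \ge 0$, which is always true for non-weighted graphs where $A$ contains no negative values. Finally, we prove that $\sigma_{v-w}^2(\beta)$ is a squared Euclidean distance. Let us write
$$\sigma_{v-w}^2(\beta) = \tilde{\Gamma}_{vv}(G, \beta) + \tilde{\Gamma}_{ww}(G, \beta) - 2\,\tilde{\Gamma}_{vw}(G, \beta) = \langle x_v - x_w|x_v - x_w\rangle = \|x_v - x_w\|^2,$$
where $\|x_v - x_w\|$ is the length of the chord connecting the nodes $v$ and $w$ at the surface of the embedding $n$-sphere.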
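The Gram-matrix argument can be checked numerically with the short sketch below. It assumes an undirected graph, so that the symmetrized matrix equals $A$, and it builds the position vectors as $|x_v(\beta)\rangle = e^{\beta\Lambda/2}|\varphi_v\rangle$, with $|\varphi_v\rangle$ the $v$th row of the eigenvector matrix $U$. The example graph and the names used are illustrative.

```python
import numpy as np

def position_vectors(A, beta=1.0):
    """Return X whose v-th column is x_v = exp(beta*Lambda/2) phi_v."""
    lam, U = np.linalg.eigh(A)                    # A = U diag(lam) U^T
    return np.exp(0.5 * beta * lam)[:, None] * U.T

# Illustrative graph (an assumption): triangle 0-1-2 with a pendant node 3 on node 0.
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
beta = 1.0

X = position_vectors(A, beta)
Gamma = X.T @ X                                   # Gram matrix of the x_v
norms = np.linalg.norm(X, axis=0)                 # ||x_v|| = sqrt(Gamma_vv)
r = Gamma / np.outer(norms, norms)                # cosine of the angle between x_v and x_w

# Squared chord length ||x_v - x_w||^2 for every pair of nodes.
chord_sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)

# The same quantity written as Gamma_vv + Gamma_ww - 2 Gamma_vw.
s = np.diag(Gamma)
variance_of_difference = s[:, None] + s[None, :] - 2.0 * Gamma
```

Here `Gamma` should agree with the walk-counting series $\sum_k \beta^k A^k / k!$, and `chord_sq` with `variance_of_difference`.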

Shortest informational paths
To find the SIP in a network we use the following procedure. Let $I(G, \beta) := |s(G, \beta)\rangle\langle 1| + |1\rangle\langle s(G, \beta)| - 2\,\tilde{\Gamma}(G, \beta)$ be the matrix whose $(v, w)$ entry is the informational distance between the nodes $v$ and $w$ based on the symmetrized adjacency matrix, where $|s(G, \beta)\rangle$ is the vector of diagonal entries of $\tilde{\Gamma}(G, \beta)$ and $|1\rangle$ is the all-ones vector. We then obtain the weighted (directed) adjacency matrix of the geometrized graph as $W(G, \beta) := I(G, \beta) \circ A$, where $\circ$ is the entrywise product (also known as Hadamard or Schur product), and $A$ is the unsymmetrized adjacency matrix. Finally, we find the shortest (directed) weighted paths between every pair of nodes using the matrix $W(G, \beta)$. A SIP between the nodes $v$ and $w$ is then the path that minimizes the informational distance among all paths connecting the corresponding nodes.
We define here the probability that a piece of information traveling between two nodes separated by a shortest (topological) path of length $L$ goes through a SIP of length $K$. When $K = L$ this measure indicates the probability of traveling through the SP of given length in that network. For calculating these probabilities we first obtain the $p \times q$ bivariate histogram $N$ for the $p$ different lengths of existing SPs and the $q$ different lengths of existing SIPs between every pair of nodes in $G$. The $(r, s)$ entry of $N$ counts the pairs of nodes whose SP has the $r$th of these lengths and whose SIP has the $s$th. Then, we obtain the $p \times p$ diagonal matrix $J = \mathrm{diag}(N|1\rangle)$ containing in its main diagonal the total number of SPs of every length in $G$. The probabilities that the length of a SIP coincides with that of a SP are given by the entries of the matrix $R = N^T J^{-1}$.
The entry $R(1, 1)$ is always unity because a SIP of length one always coincides with the SP. This is a direct consequence of the triangle inequality. That is, let $(v, w) \in E$; then $I_{vw}(\beta) \le I_{vr}(\beta) + I_{rw}(\beta)$, which means that the informational path of length one, $I_{vw}(\beta)$, is no longer than the path of length two, $I_{vr}(\beta) + I_{rw}(\beta)$, which proves the result. We provide an example that clarifies many of the concepts used here in the Supplementary Information accompanying this work.
Here we show the main difference between a SP and a SIP. A SP between two non-nearest neighbors tends to cross the nodes with the largest degree of the network. The reasoning is as follows. The number of SPs between two non-nearest neighbors $v$ and $w$ that cross the node $i$ is at least $k_i(k_i - 1)/2 - t_i$, where $k_i$ is the degree and $t_i$ the number of triangles at the node $i$. Therefore, the larger the degree $k_i$, the higher the probability that the SP crosses that node. In contrast, the SIP between two non-nearest neighbors $v$ and $w$ avoids the nodes with the largest values of $\tilde{\Gamma}_{ii}(G, \beta)$ because, by definition, these terms increase the informational distance. For a path $v$-$i$-$w$, for instance,
$$I_{vi}(\beta) + I_{iw}(\beta) = \tilde{\Gamma}_{vv} + 2\,\tilde{\Gamma}_{ii} + \tilde{\Gamma}_{ww} - 2\left(\tilde{\Gamma}_{vi} + \tilde{\Gamma}_{iw}\right).$$
The term $\tilde{\Gamma}_{ii}(G, \beta)$ is a weighted sum of closed walks that includes the degree in its first nontrivial term: $\tilde{\Gamma}_{ii}(G, \beta) = 1 + \frac{\beta^2}{2!}k_i + \frac{2\beta^3}{3!}t_i + \cdots$, since $(A^2)_{ii} = k_i$ and $(A^3)_{ii} = 2t_i$. Therefore, the SIPs avoid as much as possible the nodes with the largest degree in the network.
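The procedure just described can be sketched as follows for the undirected case, where the symmetrized and unsymmetrized adjacency matrices coincide (the symmetrization used for directed graphs is not reproduced here). The all-pairs search uses a vectorized Floyd-Warshall relaxation, which is a choice made for brevity in this sketch and not necessarily the algorithm used in the authors' Matlab code. All names are illustrative.

```python
import numpy as np

def informational_distance(A, beta=1.0):
    """I = |s><1| + |1><s| - 2*Gamma, with Gamma = exp(beta*A) and s = diag(Gamma)."""
    lam, U = np.linalg.eigh(A)
    G = (U * np.exp(beta * lam)) @ U.T
    s = np.diag(G)
    return s[:, None] + s[None, :] - 2.0 * G

def shortest_informational_paths(A, beta=1.0):
    """Return (W, D, H): edge weights, SIP costs and number of edges in each SIP."""
    A = np.asarray(A, dtype=float)
    I = informational_distance(A, beta)
    W = np.where(A > 0, I * A, np.inf)     # Hadamard product; non-edges are unreachable
    np.fill_diagonal(W, 0.0)
    D = W.copy()
    H = (A > 0).astype(int)                # hop counts of the current best paths
    for k in range(A.shape[0]):            # Floyd-Warshall: allow k as intermediate node
        cand = D[:, [k]] + D[[k], :]
        hops = H[:, [k]] + H[[k], :]
        better = cand < D
        D = np.where(better, cand, D)
        H = np.where(better, hops, H)
    return W, D, H

# Illustrative example (an assumption): the path graph 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

W, D, H = shortest_informational_paths(A, beta=1.0)
_, D_hot, _ = shortest_informational_paths(A, beta=1e-8)
```

On the path graph there is a single route between the end nodes, so `D[0, 3]` is the sum of the three edge weights and `H[0, 3]` equals 3; in the high-temperature regime `D_hot[0, 3]` approaches $2 \times 3 = 6$.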
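The histogram construction can be sketched as below. The function takes, for every pair of nodes, the number of edges in the SP and in the SIP, and returns $R = N^T J^{-1}$. The input lists in the example are made-up illustrative values, not data from the paper, and the function name is hypothetical.

```python
import numpy as np

def sip_given_sp_probabilities(sp_len, sip_len):
    """Return (sp_vals, sip_vals, R), with R[s, r] = P(SIP has s-th length | SP has r-th length)."""
    sp_len = np.asarray(sp_len)
    sip_len = np.asarray(sip_len)
    sp_vals = np.unique(sp_len)            # the p distinct SP lengths
    sip_vals = np.unique(sip_len)          # the q distinct SIP lengths
    N = np.zeros((sp_vals.size, sip_vals.size))
    r = np.searchsorted(sp_vals, sp_len)   # row index of each pair
    s = np.searchsorted(sip_vals, sip_len) # column index of each pair
    np.add.at(N, (r, s), 1)                # bivariate histogram
    J = np.diag(N.sum(axis=1))             # J = diag(N|1>): number of SPs of each length
    R = N.T @ np.linalg.inv(J)
    return sp_vals, sip_vals, R

# Made-up lengths for six node pairs (illustrative only).
sp_len = [1, 1, 2, 2, 2, 3]
sip_len = [1, 1, 2, 3, 2, 3]
sp_vals, sip_vals, R = sip_given_sp_probabilities(sp_len, sip_len)
```

Each column of `R` sums to one. In this toy input two of the three pairs at topological distance 2 have a SIP with two edges, so the corresponding entry is 2/3.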
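The relation between the diagonal TGF entries and the local structure of a node can be verified on a small example. The sketch uses the standard walk-counting identities $(A^2)_{ii} = k_i$ and $(A^3)_{ii} = 2t_i$ for simple undirected graphs, so that at $\beta = 1$ the sum of the first terms, $1 + k_i/2 + t_i/3$, is a lower bound for $\Gamma_{ii}$. The example graph is an illustrative assumption.

```python
import numpy as np

# Illustrative graph (an assumption): triangle 0-1-2 with a pendant node 3 on node 0.
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)

degree = A.sum(axis=1)
closed2 = np.diag(A @ A)              # closed walks of length 2 = degree
closed3 = np.diag(A @ A @ A)          # closed walks of length 3 = 2 * triangles
triangles = closed3 / 2.0

# Diagonal of exp(A) from the spectral decomposition A = U diag(lam) U^T.
lam, U = np.linalg.eigh(A)
gamma_diag = np.einsum('ij,j,ij->i', U, np.exp(lam), U)

# Truncation of the series after the triangle term (beta = 1).
lower_bound = 1.0 + degree / 2.0 + triangles / 3.0
```

In this example node 0, the node of highest degree, has the largest $\Gamma_{ii}$, which is why a SIP would tend to avoid it.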

Informational unit square
The informational square is defined as the unit square $[0, 1] \times [0, 1]$ with axes $\Delta S_{norm}(G, \beta) := \Delta S(G, \beta)/\ln n$ and $SP_P(G, \beta)$, which is the ratio of SIPs that coincide with SPs. Both indices are bounded between zero and one. Then, every network is represented as a single point with coordinates $x = \Delta S_{norm}(G, \beta)$ and $y = SP_P(G, \beta)$ in this unit square. The vertex (0, 0) is occupied by an empty graph $\bar{K}_n$ of any size. In this case we have that $\Gamma(\bar{K}_n, \beta) = I$ and therefore $S(\bar{K}_n, \beta) = \ln n$, so that $\Delta S_{norm}(G, \beta) = 0$. In this case $SP_P(G, \beta) = 0$ because there are no paths between any pair of nodes.
The vertex (0, 1) is occupied by a directed cycle $C_n$ of very large size, where $\hat{\rho}_{vv}(C_n, \beta) = 1/n$ and $\hat{\rho}_{vw}(C_n, \beta) = (1/n)\left(1/(l_{vw})!\right)$. That is, $S(C_n, \beta) = \ln n$ (notice that $\ln e^{\beta\tilde{A}} = \beta\tilde{A}$). In this graph there is a unique path between every pair of nodes, which implies that the SIP will always coincide with the SP. The vertex (1, 1) is occupied by the complete graph $K_n$ of large size. In this case, for $\beta = 1$,
$$\hat{\rho}(K_n, \beta = 1) = \frac{1}{e^n + n - 1}\left(I + \frac{e^n - 1}{n}E\right),$$
where $E$ is the all-ones matrix. For sufficiently large $n$ we have $\hat{\rho}(K_n, \beta) \approx (1/n)E$, which has eigenvalues $\lambda_1(\hat{\rho}) = 1$ and $\lambda_j(\hat{\rho}) = 0$ for all $j \ge 2$. Thus, $S(K_n, \beta) = -\sum_{k=1}^{n} \lambda_k(\hat{\rho}) \ln \lambda_k(\hat{\rho}) = 0$, assuming as usual that $0 \ln 0 = 0$. Furthermore, in $K_n$ every pair of nodes is connected, and because the SIP of length one coincides with the SP we have that it has $\left(\Delta S_{norm}(G, \beta), SP_P(G, \beta)\right) = (1, 1)$.
The last vertex of the informational unit square is the (1, 0) one. A network having $\Delta S_{norm}(G, \beta) = 1$, which implies $S = 0$, needs to have a large spectral gap $(\lambda_1 - \lambda_2) \gg 1$. When $(\lambda_1 - \lambda_2) \gg 1$, $\Gamma(G) \cong |\psi_1\rangle e^{\lambda_1}\langle\psi_1|$. Therefore, $Z(G, \beta) \cong e^{\lambda_1}$ and $\hat{\rho} \cong |\psi_1\rangle\langle\psi_1|$, which characterizes pure quantum states. In the USA air transportation network, for instance, $(\lambda_1 - \lambda_2) \approx 23.92$ and $\lambda_1 \approx 41.23$. However, this is not enough. For the network to have $SP_P(G, \beta) = 0$: (i) it cannot be regular, i.e., it has to contain some "high" degree nodes that the SIP should avoid instead of the low degree ones; (ii) it should be sparse, otherwise it will appear higher up along the y-axis; (iii) the number of nodes of degree one should be relatively small, otherwise every two of these pendant nodes connected to the same hub will be connected through a SP, increasing $SP_P(G, \beta)$. We create here a prototype of this class of networks as follows. Let $G(n, p)$ be an Erdős-Rényi graph with $n$ nodes and connection probability $p \ll 1$. Pick one node at random and connect it to the rest of the nodes of the graph. For instance, a graph created this way with $n = 2000$ nodes and edge density $\delta \approx 0.0035$ displays $\Delta S_{norm}(G, \beta) \approx 1$ and $SP_P(G, \beta) \approx 0.0246$. The main spectral gap is $(\lambda_1 - \lambda_2) \approx 42.33$ and $\lambda_1 \approx 47.34$. More details and an example are provided in the Supplementary Information accompanying this work.
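The prototype construction can be sketched as follows. The code builds an Erdős-Rényi graph, connects one randomly chosen node to all the others, and evaluates the spectral gap and the normalized entropy difference at $\beta = 1$. It does not compute $SP_P$. The size $n = 200$, the probability $p = 0.01$ and the seed are illustrative choices for a quick check and are not the values used in the paper.

```python
import numpy as np

def star_plus_random_graph(n, p, seed=0):
    """Erdos-Renyi graph G(n, p) in which one random node is joined to all others."""
    rng = np.random.default_rng(seed)
    upper = np.triu(rng.random((n, n)) < p, k=1)   # independent edges above the diagonal
    A = (upper | upper.T).astype(float)
    hub = int(rng.integers(n))
    A[hub, :] = 1.0                                 # connect the hub to every node
    A[:, hub] = 1.0
    A[hub, hub] = 0.0                               # no self-loop
    return A, hub

def normalized_entropy_gap(A, beta=1.0):
    """(ln n - S) / ln n, with S the von Neumann entropy of exp(beta*A)/Z."""
    lam = np.linalg.eigvalsh(A)
    w = np.exp(beta * (lam - lam.max()))
    prob = w / w.sum()
    prob = prob[prob > 0]
    S = -np.sum(prob * np.log(prob))
    return float(1.0 - S / np.log(A.shape[0]))

A, hub = star_plus_random_graph(n=200, p=0.01, seed=0)
lam = np.linalg.eigvalsh(A)                         # eigenvalues in ascending order
spectral_gap = lam[-1] - lam[-2]
dS_norm = normalized_entropy_gap(A, beta=1.0)
```

Because the star is a subgraph, the largest eigenvalue is at least $\sqrt{n - 1}$, which in a sparse random graph opens the large spectral gap required for a nearly pure state.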

Network datasets
We consider a dataset of 58 real-world networks representing a variety of biological, social, technological, infrastructural, and ecological complex systems, of which 26 are undirected networks and 32 are directed ones. A description of every network and appropriate references can be found in the Supplementary Information. The dataset of metabolic networks was compiled by Parter et al. [36] (see also Supplementary Files). These networks consist of directed graphs of different sizes representing the metabolism in 116 bacterial species. The PPI network of yeast is based on the data compiled by Bu et al. [49], who focused on 11,855 interactions between 2617 proteins with high and medium confidence in order to reduce the interference of false positives. They reported a network consisting of 2361 nodes and 6646 links, from which we analyzed the main connected component consisting of 2224 proteins and 6609 interactions. The data about the essentiality of proteins was taken from the work of Jeong et al. [42].

Code availability
The Matlab code used is given in the Supplementary Information accompanying this work.

Data availability
All the network datasets used in this paper are freely and publicly available. The descriptions and references are given in the Supplementary Information. The data supporting the results in the manuscript are also provided in the Supplementary Files accompanying this work.

Fig. 1: Predicting navigation routes. a, Network with two paths $P_1$ and $P_2$ connecting the nodes v and w, which are not differentiated by SP as they have the same topological length of 3 edges. Information diffusing through the route $P_1$ can be diverted towards the node C, but it can be routed directly to the next node in the triangle to complete the route in the best case in 4 steps. Information diffusing through the route $P_2$ can be diverted towards the pendant nodes. Then, it has to come back again to the root of the pendant node to continue with the route. Thus, in the best case it can complete the diverted route in 5 steps. The search information measure, which is based on node degrees [27,28], gives the same value for both paths. The sum of the commuting times (which is a distance known as resistance distance) for every edge in the corresponding paths gives 2.235 for $P_1$ and 2.471 for $P_2$, indicating that a random walker will arrive at its destination sooner through $P_1$ than through $P_2$. b, An analysis of this graph indicates that the commuting time through both paths is 2.2083. Therefore, neither SP, search information nor commuting time differentiates between the two routes.

Fig. 2: Classes of networks according to the IT processes taking place on them. a, Unit informational square formed by the normalized von Neumann entropy and the ratio of SIPs routed through SPs. The four vertices of the square are occupied, from (0, 0) to (0, 1) clockwise, by empty graphs, a directed cycle, complete graphs and special graphs $S_n + G(n, p)$, where $S_n$ is a star graph and $G(n, p)$ is an Erdős-Rényi random graph (see Methods). All connected real-world networks are located mainly in the upper triangle of this unit square. The nodes are drawn with size and color proportional to the edge density. b-d, Density matrices of networks close to the (0, 1) (an electronic circuit), (1, 0) (USA air transportation network) and (1, 1) (food web of "El Verde" rain forest) vertices. The color map represents the density. e-g, Contour plots of the probability that a SIP is routed through a SP for every length of the SPs in the three networks analyzed in (b-d). The color map represents the probability.

Fig. 4: a, For each value of $0 < \beta \le 1$ we ranked the proteins according to $CC_v(G, \beta)$ and counted the number of essential proteins in the groups formed by the top y proteins, where y = 10, 20, ..., 200. The color bar indicates the probability of identifying essential proteins. CC is the classical closeness centrality. b, Ratio of SIPs routed by SPs in the yeast PPI network operating at $\beta = 1$. c, The same plot as in b, for the value of $\beta = 0.6$, which was found as the one at which the largest percentage of essential proteins was detected by $CC_v(G, \beta)$.