GHDC: a dual-centric data center network architecture by using multi-port servers with greater incremental scalability

As the volume of data keeps growing rapidly, more and more storage devices, servers, and network devices are continuously added into data center networks (DCNs) to store, manage, and analyze the data. Industry experience indicates that, instead of adding a huge number of servers to a DCN at once, the DCN can also be expanded gradually by adding a small number of servers from time to time. This paper proposes a new type of dual-centric data center network structure, called GHDC (Generalized Hypercube Data Center network architecture), which is constructed by using commodity switches and multi-port servers. After analyzing the shortest distance between any two vertices, a routing algorithm for GHDC is developed. To achieve incremental scalability, two incomplete GHDC structures are proposed; a small number of servers can be added into the incomplete GHDC structures while their topological properties are maintained. The analysis and experiment results show that GHDC significantly outperforms the other DCN structures, such as FatTree, BCube, Platonica, FCell, FSquare, and FRectangle, in terms of incremental scalability and robustness. The average throughput of GHDC is approximately comparable to that of FatTree, BCube, and FSquare, and is higher than that of Platonica, FCell, and FRectangle by about 130.2%, 17.45%, and 25.5%, respectively. Compared with FatTree, BCube, FCell, FRectangle, and FSquare, GHDC reduces the cost by about 68.49%, 78.04%, 10.84%, 22.85%, and 29.55%, and reduces the maximum energy consumption by about 69.45%, 34.48%, 11.55%, 24.31%, and 29.58%, respectively. The actual energy consumption of GHDC is much lower than that of FatTree, BCube, FCell, FRectangle, and FSquare, and is only slightly higher than that of Platonica.

The main goal of this study is to tackle the following technical problem: can we construct dual-centric DCN structures with incremental scalability, high bandwidth, low cost, and robustness by using multi-port servers and commodity switches? The potential benefits of this work are multifaceted. First, using commodity switches keeps the cost of the DCN low. Second, the communication efficiency between servers is improved, which greatly reduces the communication latency between servers.
In this paper, we propose a novel dual-centric DCN structure, called GHDC, which is constructed by using low-cost commodity switches and multi-port servers. After calculating the shortest distance between any two vertices, we develop a routing algorithm for GHDC. The diameter and bisection bandwidth of GHDC are also analyzed. In addition, based on the incomplete hypercube structure [24-26], two kinds of incomplete GHDC structures are proposed to achieve incremental scalability. Any number of servers can be added into the incomplete GHDC structures while their topological properties are preserved.
The main contributions of this paper include:
1. We propose a novel dual-centric DCN structure, named GHDC, constructed on top of the hypercube by using commodity switches and multi-port servers. In addition, we propose two kinds of incomplete GHDC structures to achieve incremental scalability. Any number of servers can be added into the incomplete GHDC structures while their topological properties are preserved.
2. We analyze the topological properties of GHDC, including the diameter, the bisection bandwidth, the shortest distance between any two servers, and robustness. The cost and energy consumption of GHDC are also analyzed. With incremental scalability, GHDC is more cost-efficient and energy-efficient than the other data center structures. Compared with FatTree, BCube, FCell, FRectangle, and FSquare, GHDC reduces the cost by about 68.49%, 78.04%, 10.84%, 22.85%, and 29.55%, and reduces the energy consumption by about 69.45%, 34.48%, 11.55%, 24.31%, and 29.58%, respectively.
3. We use a simulator and construct seven different network structures, including FatTree, BCube, GHDC, FCell, FRectangle, FSquare, and Platonica, to evaluate their throughput. The simulator treats the data center network as a graph in which the capacity of each edge can be customized. It formalizes each flow as a four-tuple of source host, target host, start time, and flow size, and estimates the delay caused by forwarding, queuing, transmission, and processing by assigning a fixed round-trip time (RTT) to each flow. The experimental results show that the average throughput of GHDC is approximately comparable to that of FatTree, BCube, and FSquare, and is higher than that of Platonica, FCell, and FRectangle by about 130.2%, 17.45%, and 25.5%, respectively.
The rest of this paper is organized as follows. Section 2 introduces the related work of DCN architectures. Section 3 proposes the definitions of the complete GHDC structure and the incomplete GHDC structures, and the topological characteristics of these two kinds of structures are also analyzed. In Sect. 4, we compare GHDC with other dual-centric DCNs. Finally, Sect. 5 concludes the paper.

Related work
This section mainly introduces four dual-centric DCN structures: FCell, FRectangle, FSquare, and Platonica.

FCell FCell [18] is a dual-centric structure built on a folded Clos topology and is constructed by using 2-port servers and n-port switches. FCell(n) is composed of n^2/2 + 1 blocks, and each block includes two layers of switches. Each n-port switch at layer 1 connects to n/2 servers with n/2 of its ports and connects to the n/2 layer-2 switches within the block with the remaining n/2 ports. Each block contains a total of n^2/2 servers, and one port of each server is used to connect to another block. Therefore, the total number of servers in the FCell(n) structure is n^4/4 + n^2/2, and the total number of switches is 3n^3/4 + 3n/2. Figure 1 shows the topology of an FCell(4) structure. The FCell structure is cost-efficient; however, it reduces the overall cost at the expense of network performance.
FRectangle FRectangle [18] is constructed along two dimensions. Each column of FRectangle is a block of the FCell structure. In each row of the FRectangle structure, n switches are connected to n^2 servers, and the interconnection model of each row is chosen from the following two types. Type A interconnections: for servers a_{i,j} in odd rows, a_{i,j} is connected to the kth switch. The scalability of FRectangle is good, which means it can be used to build a large-scale DCN. However, the incremental scalability of FRectangle is extremely poor: once the FRectangle structure is deployed, it is difficult to add new servers to it, even a small number. Figure 2 shows the topology of an FRectangle(4) structure.
FSquare Similar to FRectangle, FSquare [18] is constructed along two dimensions, where each column and each row of FSquare form a block of the FCell structure. Figure 3 shows the topology of an FSquare(4) structure. FSquare exhibits good performance in data processing. Nevertheless, this structure uses numerous switches and redundant links, which imposes a large cost and energy consumption burden. Furthermore, the incremental scalability of FSquare is inferior to that of GHDC.
Platonica Platonica [19] is constructed from fault-tolerant building blocks inspired by the edge connection patterns of the five Platonic solids, namely the Tetrahedron, Hexahedron (the 3-dimensional hypercube), Octahedron, Dodecahedron, and Icosahedron, which makes the architecture flexible. Each variety of building block Platonium_x(n) is used to build a different type of data center network, the Platonica_x(n) structure. Built with n-port switches, a Platonica_x(n) structure consists of S_0 + 1 Platonium_x(n) blocks, where S_0 is the number of servers in each building block. To form a Platonica_x(n) structure, each Platonium_x(n) is treated as a logical node, and all logical nodes are connected in the form of a complete graph. For example, Platonium(8) is constructed based on the Hexahedron (3-dimensional hypercube), as shown in Fig. 4, and Platonica(8) is constructed from 41 Platonium(8) blocks. Platonica can be gradually expanded by adding a few blocks.

The GHDC structure
The GHDC structure is constructed by using low-cost commodity switches and multi-port servers and can be divided into two categories: complete structures and incomplete structures. In this section, we first define the complete GHDC structure. After analyzing the topological properties of the complete GHDC structure, we propose two kinds of incomplete GHDC structures, into which servers can be gradually added. The topological properties of the incomplete structures are then also derived.

Complete GHDC structure
Since the GHDC structure is constructed based on the hypercube, we first give the definition of the m-dimensional hypercube as follows.

Definition 1
In the m-dimensional hypercube H_m, each vertex is labeled by an m-bit binary string x_{m−1}⋯x_0, and two vertices are connected by an edge if and only if their labels differ in exactly one bit. The hypercube H_3 is shown in Fig. 5.
Since the hypercube is highly scalable and highly symmetric, it can be expanded incrementally while its topological properties are maintained. In this paper, the hypercube is used as the underlying interconnection network of the DCN structure, and gradually scalable schemes are proposed to give the DCN the property of incremental scalability.

Definition of complete GHDC structure
Based on the m-dimensional hypercube H_m, GHDC_m(k) is constructed by using (2m − 1)-port switches and (k + 2)-port servers, where m represents the dimension of GHDC. GHDC is a multi-layer structure that contains k + 1 layers. GHDC_m(k) is defined as follows.

Here, x̄_i denotes the complement of x_i, and l represents the layer of the switches and servers.

Constructed by using (k + 2)-port servers, GHDC_m(k) is a multi-layered structure that contains k + 1 GHDC_m(0) structures. Each GHDC_m(0) structure can be identified by l, where 0 ≤ l ≤ k. Every server in GHDC_m(k) uses one port to connect to a switch, one port to connect to a server in the same layer, and k ports to connect to servers in the other k layer structures. Since there are 2^m blocks in each layer structure, GHDC_m(k) contains (k + 1)2^m blocks, and each block contains m switches and m(m − 1) servers linked to those switches.
For example, the structure GHDC_3(0) is constructed based on the three-dimensional hypercube H_3. Each vertex of H_3 in Fig. 5 is replaced by the block structure shown in Fig. 6, and the connection model of H_3 is retained; the resulting 1-layer structure GHDC_3(0) is shown in Fig. 7. According to Definition 2, GHDC_3(2) is constructed by using 4-port servers and contains 3 GHDC_3(0) structures. As shown in Fig. 8, each server in GHDC_3(2) uses 2 ports to connect to the servers in the other 2 layers. For simplicity, we only show the links from (000; 021) to the two servers (000; 102) and (000; 202) in the other layers.
According to Definition 2, we can get the following theorem.

The shortest internode distance
The way the path length between a source server and a destination server is calculated differs among switch-centric, server-centric, and dual-centric structures. For a switch-centric structure, the path length is the number of links in the path [18, 27]. For a server-centric structure, the path length is the number of servers between the source server and the destination server [8-10]. For dual-centric structures, the path length is the total number of servers and switches in the path [18]. Since the GHDC structure is dual-centric, the shortest distance between the source and destination servers is the length of the path with the minimum number of servers and switches. For servers a = (x_{m−1}⋯x_0; k_1 y_1 y_0) and b = (u_{m−1}⋯u_0; k_2 z_1 z_0) in GHDC_m(k), let dis(a, b) denote the shortest distance between them, and let d be the Hamming distance between x_{m−1}⋯x_0 and u_{m−1}⋯u_0. Then, we can get the following theorem.
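The quantity d above is simply the Hamming distance between the two hypercube addresses. A minimal sketch, assuming the block addresses are encoded as equal-length bit strings (the encoding is our illustrative choice):

```python
def hamming(x: str, u: str) -> int:
    """Hamming distance between two equal-length binary block addresses,
    i.e., the number of hypercube dimensions that must be traversed."""
    assert len(x) == len(u)
    return sum(a != b for a, b in zip(x, u))

# Blocks 000 and 110 of H_3 differ in two dimensions.
print(hamming("000", "110"))  # 2
```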

Theorem 2 For the source server a = (x_{m−1}⋯x_0; k_1 y_1 y_0) and the destination server b = (u_{m−1}⋯u_0; k_2 z_1 z_0), dis(a, b) is determined by a case analysis on y_1, z_1 and y_0, z_0, as follows. We first construct the shortest path in each case.

• Case 2: y_1 < z_1.
  - Case 2.1: y_0 ≠ z_1. The length of the shortest path follows from the method in Case 1.1.
  - Case 2.2: y_0 = z_1. The length of the shortest path follows from the method in Case 1.2.

Diameter
The diameter of an interconnection network is defined as the maximum distance between any pair of vertices in it. Since the diameter is a measure of network communication delay, a low diameter is one of the desirable properties of an interconnection network.

Theorem 3 The diameter of GHDC_m(k) is m + 5.
Proof Since we have analyzed the shortest internode distance through six subcases in Theorem 2, we can find the longest path in each subcase. By comparing the six longest paths, we obtain the diameter of GHDC_m(k).
The maximum value of d in each case is m. In the cases whose path length is d + 3, the maximum is therefore m + 3; in the case whose path length is d + 5, the maximum is m + 5, attained when d = m; and in the case whose path length is d + 2, the maximum is m + 2. Therefore, we get diam(GHDC_m(k)) = m + 5. ◻

Bisection bandwidth
The bisection bandwidth refers to the minimum number of edges that must be removed to divide the network into two equal parts. A large bisection bandwidth implies a high network capacity and a structure more resilient to failures.

Theorem 4 The bisection bandwidth of GHDC_m(k) is not larger than m(k + 1)2^{m−1}.
Proof According to the definition of GHDC_m(k), all servers can be divided into two disjoint vertex sets V_1 and V_2 such that the number of servers in V_1 is equal to that in V_2. The vertex sets V_1 and V_2 each contain (k + 1)2^{m−1} blocks, and there are m links between each block in V_1 and its corresponding block in V_2. Therefore, the bisection bandwidth of GHDC_m(k) is not larger than m(k + 1)2^{m−1}. ◻

Routing algorithm
For GHDC, we propose a load-balanced routing algorithm, shown as Algorithm 1. The algorithm is fully distributed: it can be executed on any vertex, which can detect its neighbor vertices and quickly determine the next vertex on the path to the destination. The function getDifferentBit returns a position p at which (x_{m−1}⋯x_0; k_1 y_1 y_0) and (u_{m−1}⋯u_0; k_2 z_1 z_0) differ, and the function swap swaps the values of two variables. If the vertex (x_{m−1}⋯x_0; k_1 y_1 0) is congested, the function isValid(x_{m−1}⋯x_0; k_1 y_1 0) returns false; otherwise it returns true.
When getDifferentBit(A, B) = 0 (Line 2), meaning y_0 ≠ z_0, y_0 is transformed into z_0 through the edges between servers and switches (Lines 3-9). We then transform x_{m−1}⋯x_0 into u_{m−1}⋯u_0 according to Case 1.1 in the proof of Theorem 2 (Lines 11-20). If y_1 ≠ z_1 (Line 21), y_1 is turned into z_1 through the edges between servers (Lines 22-32 or 33-43), and then k_1 is transformed into k_2 (Line 45). When y_1 = z_1 and k_1 ≠ k_2, k_1 is transformed into k_2 directly (Lines 48-50).
When some servers or switches are congested during transmission, Algorithm 1 can effectively avoid the congested device and reselect a non-congested device to transmit the packet. When the next server on the path is congested, Algorithm 1 transmits the packet to a non-congested neighbor server in another layer (Line 25 or Line 36). When the next switch on the path linked to server A is congested, Algorithm 1 calls the function FaultTolerantRouting(A, B) to transmit the packet to a neighbor server in another layer (Lines 52-57). If the link between the switches is congested, Algorithm 1 re-obtains the value of p to change the routing path (Lines 12-14).
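The order of transformations described above can be sketched as follows. This is a simplified, congestion-free illustration of the bit-fixing order only, not Algorithm 1 itself; the tuple encoding of addresses and the returned move labels are our illustrative assumptions:

```python
def route_moves(src, dst):
    """Sketch of the transformation order: fix y0 via a switch, then the
    hypercube bits of x, then y1 via an intra-layer server link, and
    finally the layer index k. Returns a list of symbolic moves."""
    (x, k1, y1, y0) = src
    (u, k2, z1, z0) = dst
    moves = []
    if y0 != z0:                       # through a server-switch edge
        moves.append("switch: y0 -> z0")
    for i, (a, b) in enumerate(zip(x, u)):
        if a != b:                     # one hop per differing hypercube bit
            moves.append(f"hypercube bit {i}: {a} -> {b}")
    if y1 != z1:                       # through an intra-layer server edge
        moves.append("server link: y1 -> z1")
    if k1 != k2:                       # through an inter-layer server edge
        moves.append(f"layer: {k1} -> {k2}")
    return moves

print(route_moves(("000", 0, 2, 1), ("110", 2, 0, 2)))
```

The real algorithm additionally checks isValid at each step and falls back to a neighbor in another layer when the preferred next hop is congested.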

Incomplete GHDC structure
In practice, instead of constructing a complete GHDC_m(k) structure, which contains millions of servers, we can use m-port switches to construct an incomplete GHDC structure whose number of servers is less than m(m − 1)(k + 1)2^m. When we need to increase capacity, we can add any number of servers into the incomplete GHDC structure while its topological properties are preserved. In this section, we propose two kinds of incomplete GHDC structures.

The connection models of GHDC_m(k; n) and GHDC_m(k) are the same, and each switch in GHDC_m(k; n) connects to m − 1 servers. Thus, Algorithm 1 can also be used in the GHDC_m(k; n) structure. The GHDC_3(0; 2) structure is shown in Fig. 9. According to the definition of GHDC_m(k; n), we can draw the following theorem.
Theorem 5 For the GHDC_m(k; n) structure, the number of servers is m(m − 1)(k + 1)2^n and the number of switches is m(k + 1)2^n.

Theorem 7
The diameter of GHDC_m(k; n) is n + 5.

Theorem 8
The bisection bandwidth of GHDC_m(k; n) is not larger than m(k + 1)2^{n−1}.
Since the connection model of GHDC_m(k; n) is the same as that of GHDC_n(k), the proofs of Theorems 7 and 8 are the same as those of Theorems 3 and 4.
The GHDC_m(k; n, t) structure is a subgraph of GHDC_m(k; n + 1). When n > t, we can add another GHDC_m(k; t) into a GHDC_m(k; n, t) to construct a GHDC_m(k; n, t + 1). Besides, if we add a GHDC_m(k; m − 1, m − 2) into a GHDC_m(k; m − 2), we get a GHDC_m(k). As shown in Fig. 10, GHDC_3(0; 2, 1) is constructed from one GHDC_3(0; 2) and one GHDC_3(0; 1). According to Definition 4, we get the following theorem.

Theorem 9 In the GHDC_m(k; n, t) structure, the number of servers is m(m − 1)(k + 1)(2^n + 2^t) and the number of switches is m(k + 1)(2^n + 2^t).

The connection models of GHDC_m(k; n, t) and GHDC_m(k) are the same, so Algorithm 1 can be used in the GHDC_m(k; n, t) structure.
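The counts stated in Theorems 5 and 9 are straightforward to tabulate; a quick sketch (the function names are ours):

```python
def servers_incomplete(m: int, k: int, n: int) -> int:
    """Number of servers in GHDC_m(k; n) (Theorem 5)."""
    return m * (m - 1) * (k + 1) * 2 ** n

def switches_incomplete(m: int, k: int, n: int) -> int:
    """Number of switches in GHDC_m(k; n) (Theorem 5)."""
    return m * (k + 1) * 2 ** n

def servers_composite(m: int, k: int, n: int, t: int) -> int:
    """Number of servers in GHDC_m(k; n, t) (Theorem 9)."""
    return m * (m - 1) * (k + 1) * (2 ** n + 2 ** t)

# GHDC_3(0; 2, 1) is built from one GHDC_3(0; 2) and one GHDC_3(0; 1),
# so the server counts simply add up: 24 + 12 = 36.
print(servers_composite(3, 0, 2, 1))  # 36
```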

Theorem 10
The diameter of GHDC_m(k; n, t) is n + 6.
Proof For the source server (0x_{n−1}⋯x_0; k_1 y_1 y_0) and the destination server

Theorem 11

The bisection bandwidth of GHDC_m(k; n, t) is not larger than m(k + 1)(2^{n−1} + 2^{t−1}).

Proof
GHDC_m(k; n, t) is constructed from one GHDC_m(k; n) and one GHDC_m(k; t). All servers in GHDC_m(k; n) can be divided into two disjoint sets V_1 and V_2, where V_1 = {(0x_{n−1}⋯x_{t+1}0x_{t−1}⋯x_0; ly_1y_0)} and V_2 = {(0x_{n−1}⋯x_{t+1}1x_{t−1}⋯x_0; ly_1y_0)}. All servers in GHDC_m(k; t) can be divided into two disjoint sets V_3 and V_4, where V_3 = {(10^{n−t+1}x_{t−2}⋯x_0; ly_1y_0)} and V_4 = {(10^{n−t}1x_{t−2}⋯x_0; ly_1y_0)}. The numbers of servers in V_1 ∪ V_3 and V_2 ∪ V_4 are equal. The number of edges between V_1 and V_2 is m(k + 1)2^{n−1}, and the number of edges between V_3 and V_4 is m(k + 1)2^{t−1}. Hence, the number of edges between V_1 ∪ V_3 and V_2 ∪ V_4 is m(k + 1)(2^{n−1} + 2^{t−1}), which means the bisection bandwidth of GHDC_m(k; n, t) is not larger than m(k + 1)(2^{n−1} + 2^{t−1}).

Characteristics comparison
In this section, we compare the GHDC structure with the other typical DCN structures, namely FatTree, BCube, FCell, FRectangle, FSquare, and Platonica. The topological properties of BCube, FatTree, FCell, FRectangle, FSquare, and Platonica, such as network diameter, bisection bandwidth, and scalability, have been fully analyzed in [1, 8, 18], and [19]. Since the characteristics chosen for the comparison have significant impacts on the performance of a DCN structure, this section mainly compares FatTree, BCube, FCell, FRectangle, FSquare, Platonica, and GHDC in terms of diameter, bisection bandwidth, incremental scalability, robustness, throughput, cost, and energy consumption. Table 1 presents a comparison of GHDC against the other well-known DCN structures in terms of the number of switches, the number of wires, the diameter, and the bisection bandwidth.

Diameter
Since the diameter is a measure of network communication delay, a low diameter is one of the desirable properties of an interconnection network. As shown in Table 1, the diameters of FCell, FRectangle, and FSquare are 9, 10, and 8, respectively. The diameter of Platonica equals twice the diameter of its building block plus one; since the diameter of Platonium is 5, the diameter of Platonica is 11. Table 1 shows that the diameter of FatTree is 2 log_2 N, which equals twice the height of the FatTree. The diameter of BCube equals the level of the BCube plus one. Because a low-dimensional GHDC can hold a large number of servers (e.g., GHDC_7(2) contains 16128 servers and GHDC_10(2) contains 276480 servers), the diameters of GHDC_7(2) and GHDC_10(2) are only 12 and 15, respectively. GHDC can thus effectively support applications with real-time requirements. Furthermore, even though the diameters of FCell, FRectangle, and FSquare are all small, their incremental scalabilities are extremely poor.
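The two example sizes quoted above follow directly from the block counts of Sect. 3; a quick check (the function names are ours):

```python
def servers_complete(m: int, k: int) -> int:
    """N = m(m-1)(k+1)2^m: (k+1)2^m blocks, each with m(m-1) servers."""
    return m * (m - 1) * (k + 1) * 2 ** m

def diameter(m: int, k: int) -> int:
    """diam(GHDC_m(k)) = m + 5 (Theorem 3)."""
    return m + 5

# The paper's examples: GHDC_7(2) and GHDC_10(2).
assert servers_complete(7, 2) == 16128 and diameter(7, 2) == 12
assert servers_complete(10, 2) == 276480 and diameter(10, 2) == 15
```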

Bisection bandwidth
The bisection bandwidths of FatTree, BCube, FCell, FRectangle, FSquare, and Platonica are N/2, N/2, N/4, N/4, N/2, and 4 + (N − 8(n − 3))/4, respectively, where N denotes the number of servers. The bisection bandwidth of GHDC_m(k) is not larger than N/(2(m − 1)), where m ≥ 2 and N = m(m − 1)(k + 1)2^m. Thus, the bisection bandwidth of GHDC_m(k) is quite large when we choose large (k + 2)-port servers to construct GHDC_m(k) based on a high-dimensional hypercube H_m. This means that there are many paths between a pair of servers in the GHDC structure. Therefore, GHDC is intrinsically fault-tolerant, and it provides the possibility to design multi-path routing on top of it.
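The relation between the bound of Theorem 4 and the server count N can be checked mechanically; a small sketch under the paper's formulas:

```python
def servers(m: int, k: int) -> int:
    """N = m(m-1)(k+1)2^m servers in a complete GHDC_m(k)."""
    return m * (m - 1) * (k + 1) * 2 ** m

def bisection_bound(m: int, k: int) -> int:
    """Upper bound m(k+1)2^(m-1) from Theorem 4."""
    return m * (k + 1) * 2 ** (m - 1)

# m(k+1)2^(m-1) * 2(m-1) = m(m-1)(k+1)2^m = N, i.e. the bound is N/(2(m-1)).
for m in range(2, 13):
    for k in range(5):
        assert bisection_bound(m, k) * 2 * (m - 1) == servers(m, k)
```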

Incremental scalability
All structures shown in Table 1 are highly scalable, which means they are all suitable for constructing large-scale DCNs. As the volume of data keeps growing rapidly, more and more storage devices, servers, and network devices are continuously added into data centers. The incremental scalability of a DCN structure refers to the gradual expansion of the data center, where the expansion process has little impact on the topological properties of the structure. To incrementally expand a DCN structure, three critical points should be considered: (1) no rewiring costs, (2) no software modification, and (3) no hardware replacement. These requirements ensure that existing applications are not compromised and can keep running in the expanded structure. The evaluations of the incremental scalabilities of FatTree, BCube, FCell, FRectangle, FSquare, Platonica, and GHDC are listed in Table 2. By constantly adding pods into the FatTree architecture, we can construct a data center network with an arbitrary number of servers. However, the number of servers in FatTree, which equals n^3/4, is small, so large-port switches must be adopted to build a large-scale DCN structure. Since the price of large-port switches is very high, this leads to a high cost of building the DCN; thus, FatTree is not suitable for building large-scale data center networks.
To achieve incremental scalability, a partial BCube structure was proposed in [8]. A partial BCube_k can be constructed from a number of BCube_{k−1} structures connected to the full set of layer-k switches; thus, a small number of BCube_{k−1} structures can be gradually added into the partial BCube_k. The routing algorithm of the partial BCube is the same as that of the complete BCube. However, an apparent disadvantage of this approach is that the switches in layer k are not fully utilized, which leads to a waste of resources. Furthermore, when additional servers are added into a complete BCube, an extra NIC port must be added to every server in the structure.
FCell supports two ways of expanding its size gradually. By connecting the first m + 1 (m < n^2/2) blocks, an incomplete FCell structure can be constructed. A few switch ports in the incomplete FCell structure are reserved for future expansion, and blocks can be added into the incomplete structure. When all the reserved ports of the layer-2 switches are fully occupied, the incomplete FCell finally becomes a complete structure. When adding new blocks into the complete FCell structure, the connection model of each block must be changed, and the routing algorithm needs to be redesigned. FRectangle and FSquare are both constructed along two dimensions, and each column of these two structures is the basic building block. When a small number of blocks are added into incomplete FRectangle and FSquare structures, the connection model of each row must be changed, and the routing algorithm must be changed too.
The complete Platonica_x(n) structure consists of S_0 + 1 Platonium_x(n) blocks, where S_0 is the number of servers in each building block. To form a Platonica_x(n) structure, each Platonium_x(n) is treated as a logical node, and all logical nodes are connected in the form of a complete graph. The incremental scalability of Platonica is good: a number of Platonium structures can be used to construct an incomplete Platonica structure, and a small number of Platonium structures can be gradually added into it while the connection model of the incomplete structure is preserved.
For GHDC, a small-scale data center network can be built according to the incomplete GHDC structures. Servers and switches can then be gradually added into the structure according to real requirements. Since the topological properties of the incomplete GHDC structures are similar to those of the complete structure, we can add any number of servers into an incomplete GHDC structure while all the topological properties are maintained. Like Platonica, the incremental scalability of GHDC is outstanding.
The scalabilities of FatTree, FCell, FRectangle, FSquare, Platonica, and GHDC are all limited by the number of switch ports: no server can be added into the complete structures once all the switch ports are fully occupied, and the switches must be replaced by large-port ones for further expansion. Furthermore, the connection models and routing algorithms must be changed when the complete FCell, FRectangle, and FSquare structures need to be expanded.
However, the number of servers in GHDC_m(k) is m(m − 1)(k + 1)2^m. Thus, the number of servers in the complete GHDC_m(k) structure is very large when we choose a high-dimensional hypercube H_m and (k + 2)-port servers to construct the GHDC structure. As shown in Table 3, the numbers of servers in complete GHDC structures constructed with large-port servers are very large. We can employ large-port switches to construct an incomplete GHDC architecture and then add servers into it; as more and more servers are added, the incomplete GHDC structure finally becomes a complete structure. Since the number of servers in the complete GHDC structure is very large, as in GHDC_48(0) and GHDC_48(2), the switches in GHDC need not be replaced.

Robustness
Robustness in complex networks is the ability of a network to survive component failures and malicious attacks [28]. As the scale of a DCN increases, failures occur frequently and have a significant impact on the running applications [29]. Spectral radius λ_1: the largest eigenvalue of the network adjacency matrix.
In this paper, we only consider the targeted attack in which high-level switches are removed to disconnect the network, and the vertex failure rate ranges from 1% to 6%. The comparison results of GHDC with the other structures are presented in Table 4. Since BCube, FatTree, and GHDC are multi-layered structures, their average shortest-path lengths are smaller than those of the other structures. Although the λ_{|V|−1} of BCube and the λ_1/(D + 1) of FSquare are the smallest among these seven structures, the corresponding values of GHDC are only slightly larger. The λ_1 values of BCube and GHDC are comparable; they are smaller than those of FatTree, FSquare, and Platonica, and larger than those of FCell and FRectangle. Furthermore, the metrics max(v) and ⟨d⟩ of GHDC are the smallest among these seven structures. According to the analysis above, the robustness of GHDC and BCube is comparable and is generally higher than that of FatTree, Platonica, FCell, FSquare, and FRectangle.
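The spectral radius used in this comparison can be computed directly from the adjacency matrix. A sketch using the underlying hypercube as a toy graph (NumPy assumed available; the full DCN graphs would be built analogously):

```python
import numpy as np

def hypercube_adjacency(m: int) -> np.ndarray:
    """Adjacency matrix of the m-dimensional hypercube H_m."""
    n = 2 ** m
    A = np.zeros((n, n))
    for v in range(n):
        for i in range(m):
            A[v, v ^ (1 << i)] = 1  # flip bit i to get a neighbor
    return A

def spectral_radius(A: np.ndarray) -> float:
    """Largest eigenvalue of a symmetric adjacency matrix."""
    return float(np.max(np.linalg.eigvalsh(A)))

# H_m is m-regular, so its spectral radius is exactly m.
print(round(spectral_radius(hypercube_adjacency(3)), 6))  # 3.0
```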

Throughput
In order to evaluate the throughput of the different network structures, we use a flow-level simulator, called mtCloudSim [31], to estimate real-world data flow behaviors based on the method proposed in [32]. We generate a synthetic flow workload according to the characteristics summarized in [33], which contains 80,000 flows with a total size of 4 TB. The maximum flow size is 1 GB, while the minimum is 1 KB. Furthermore, the source host and destination host of each flow are randomly selected. Therefore, the workload used in the evaluation is a good representative of data center traffic.
The throughput of Platonica(11) is the smallest, and its highest throughput is about 220 Gbps. It takes the longest time, 343 s, for Platonica(11) to finish the flow transmission, mainly because there are fewer switches and links in Platonica(11). Owing to their numerous switches and redundant links, FatTree and BCube achieve the highest bandwidth, over 280 Gbps. The throughputs of GHDC_7(0) and FSquare are approximately comparable to those of FatTree and BCube, and are higher than those of FCell and FRectangle by about 17.45% and 25.5%, respectively. The layer-2 switches of FCell and FRectangle become congested and act as bottlenecks during data flow transmission, so their flow completion times are longer than those of GHDC and FSquare. Compared with the other structures, there are multiple available paths between servers in the GHDC architecture, which achieves a good balance of traffic workload.
The throughputs of GHDC_7(0), GHDC_9(0; 6), and GHDC_8(0; 6, 4) are depicted in Fig. 11 and Table 5. The highest throughputs of the incomplete and complete GHDC structures all exceed 270 Gbps. Furthermore, as the topological properties of the incomplete GHDC structures are similar to those of the complete structure, they finish the data transmission almost simultaneously. Thus, any number of servers and switches can be continually added into the GHDC structure according to real requirements while the throughput performance is maintained.

Cost and energy consumption
In this section, we compare the cost and energy consumption of GHDC with those of the other DCN structures when they are used to construct a DCN containing the same number of servers. Since the total cost and energy consumption of the servers are the same, we only evaluate the cost and energy consumption of switches and NICs. For a fair evaluation, FCell, FRectangle, FSquare, Platonica, and GHDC are constructed by using 16-port switches and 2-port servers. Since a FatTree built with n-port switches contains n^3/4 servers, 48-port switches are adopted to build its data center. BCube is a recursively constructed structure: when k > 1, BCube_k is constructed from n BCube_{k−1} structures and n^k n-port switches, and the number of server ports in BCube_k is k + 1. In the evaluation, we construct a BCube_3 structure by using 16-port switches and 4-port servers. In order to analyze the cost and energy consumption intuitively, we construct DCNs with 1024, 2048, 4096, and 8192 servers, respectively. Table 6 lists the prices and maximum power consumption of the switches and NICs used to construct the DCNs [8, 34]; the prices of switches and NICs may vary in practice. As shown in Table 7, Figs. 12, and 13, the cost and maximum energy consumption of Platonica are the lowest among these DCN structures, mainly because the number of switches in Platonica is lower than in the other DCN structures.
Since the FatTree in the evaluation is constructed with 48-port switches and the BCube with 4-port servers, the cost and max energy consumption of FatTree and BCube increase rapidly as the DCN scales up. The cost and max energy consumption of GHDC are much lower than those of FatTree, BCube, FCell, FRectangle, and FSquare: GHDC reduces the cost by about 68.49%, 78.04%, 10.84%, 22.85%, and 29.55%, and the max energy consumption by about 69.45%, 34.48%, 11.55%, 24.31%, and 29.58%, respectively.
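The reduction percentages are computed relative to each baseline structure. A small sketch of that arithmetic (the dollar figures below are purely illustrative, not the paper's cost data):

```python
# Sketch: reduction percentage relative to a baseline structure, i.e.
# (baseline - GHDC) / baseline * 100.

def reduction_pct(baseline: float, ghdc: float) -> float:
    """Percentage by which GHDC improves on a baseline value."""
    return (baseline - ghdc) / baseline * 100.0

# Illustrative: if a baseline DCN cost $100k and the GHDC build $31.51k,
# the reduction would be 68.49%.
print(round(reduction_pct(100.0, 31.51), 2))  # 68.49
```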
Since the energy consumption of a switch is roughly proportional to its workload, we also propose a more precise model to evaluate the actual energy consumption (AEC) of every switch. According to [35], each port of a switch consumes 150 mW to stay active; that is, the no-load energy consumption of an n-port switch in standby is (150 × n) mW. We assume that the AEC of an n-port switch, denoted E_a, can be calculated as

E_a = E_nl + (E_fl − E_nl) × (B_ave / B_max),

where E_nl and E_fl are the no-load and full-load energy consumption of the switch, respectively, and B_ave and B_max are its average and max bandwidth. We assume the max bandwidth of every switch port is 1 Gbps, so the max bandwidth of an n-port switch is n Gbps. Based on the throughput evaluations in Sect. 4.5, the AECs of the different DCN structures are shown in Table 8 and Fig. 15. The AEC of the switches in Platonica is the lowest among these seven DCN structures, mainly because the no-load energy consumption of its switches is much lower than that of the other structures. However, as analyzed in Sect. 4.5, the throughput of Platonica is the worst among the compared structures.
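The linear AEC model above can be sketched as follows; the 150 mW per-port figure is from the text, while the full-load value used in the example is a hypothetical parameter:

```python
# Sketch of the linear AEC model: interpolate between no-load and full-load
# energy consumption by the switch's bandwidth utilization.

def aec_mw(n_ports: int, e_full_load_mw: float, b_ave_gbps: float) -> float:
    """AEC of an n-port switch in mW:
    E_a = E_nl + (E_fl - E_nl) * (B_ave / B_max)."""
    e_no_load = 150.0 * n_ports        # 150 mW per active port [35]
    b_max = float(n_ports)             # 1 Gbps per port -> n Gbps total
    utilization = b_ave_gbps / b_max
    return e_no_load + (e_full_load_mw - e_no_load) * utilization

# A 16-port switch with a hypothetical 6 W full-load draw, averaging 8 Gbps
# (50% utilization): halfway between 2400 mW and 6000 mW.
print(aec_mw(16, 6000.0, 8.0))  # 4200.0
```

At zero average bandwidth the model reduces to the standby figure of 150n mW, and at full utilization it reaches E_fl, matching the two anchor points of the model.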
Since the total number of switch ports in FatTree and FSquare is larger than in the other five DCN structures, the no-load energy consumptions of FatTree and FSquare are higher than those of the other structures, and thus so are their AECs. The AECs of the switches in BCube and FRectangle are comparable, and lower than those of FatTree and FSquare. The AECs of the switches in GHDC and FCell are higher than that of Platonica but much lower than those of the other five DCN structures; between the two, the AEC of the switches in GHDC is lower than that of FCell.
Since wires and cables are required to deploy DCNs, we also list the number of links used in these network structures. As shown in Fig. 14, Platonica uses the fewest wires, which is why its throughput is the worst among the compared DCN structures. The number of links in GHDC is less than those of FatTree, BCube, FCell, FRectangle, and FSquare.
According to the above analysis, the cost, energy consumption, and number of wires of GHDC are much less than those of FatTree, BCube, FCell, FRectangle, and FSquare. As shown in Figs. 12, 13, and 14, the cost, max energy consumption, and number of links of GHDC grow slowly as the DCN continuously expands. Thus, the GHDC structure is more suitable for building large-scale DCNs.

Conclusion
Based on the hypercube, this paper proposes a new dual-centric DCN structure, called GHDC, which is constructed from low-cost commodity switches and multi-port servers. A routing algorithm is developed, and the shortest distance between any two vertices is analyzed. We also propose two kinds of incomplete GHDC structures, based on the incomplete hypercube structure, to achieve the incremental scalability of GHDC, and we analyze the diameter, bisection bandwidth, incremental scalability, cost, and energy consumption of GHDC. Compared with FatTree, BCube, FCell, FRectangle, and FSquare, GHDC reduces the cost by about 68.49%, 78.04%, 10.84%, 22.85%, and 29.55%, and the max energy consumption by about 69.45%, 34.48%, 11.55%, 24.31%, and 29.58%, respectively. The actual energy consumption of GHDC is much lower than that of FatTree, BCube, FCell, FRectangle, and FSquare, and only slightly higher than that of Platonica. Therefore, the GHDC structure is more suitable for building a large DCN with low cost, low energy consumption, robustness, high throughput, and great incremental scalability.