Table 1 presents the preliminary descriptive statistics. The mean age of cases is 35.90 years, with a median of 34 years. The mean degree centrality is 0.74, and the maximum number of contacts is 43. The mean PageRank score is 0.0005. Males are slightly more numerous, accounting for 54.54% of cases. Most cases (94.19%) have mild symptoms, and eight cases are asymptomatic infections.
Table 1
Descriptive statistics of Xi'an COVID-19 cases
| Variables | N | Mean | Median | S.D. | Min | Max |
|---|---|---|---|---|---|---|
| Age | 2,049 | 35.90 | 34 | 17.70 | 4 days | 94 |
| Degree centrality | 2,050 | 0.741 | 1 | 1.379 | 0 | 43 |
| PageRank score | 2,050 | 0.0005 | 0.0004 | 0.0006 | 0.0001 | 0.0165 |
| Interval 1 | 1,483 | -3.912 | -3 | 3.427 | -20 | 1 |
| Interval 2 | 730 | 4.199 | 3 | 3.129 | 0 | 21 |
| Interval 3 | 339 | 2.976 | 2 | 3.131 | -8 | 17 |
| Gender | | Male | Female | | | |
| | 2,050 | 1,118 (54.54%) | 932 (45.46%) | | | |
| Disease severity | | Asymptomatic infections | Moderate | Mild | | |
| | 2,048 | 8 (0.39%) | 111 (5.42%) | 1,929 (94.19%) | | |
Network Visualization
Figure 2 shows the contact network of COVID-19 cases in Xi'an from December 9, 2021 to January 18, 2022. The nodes represent confirmed cases, and the edges represent the contact relationships between them. A larger node indicates a case with more contacts. The network has 2,050 cases but only 759 edges, so it is very sparse, with a density close to 0. More than 900 components consist of a single node.
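As a rough check on the sparsity claim, the density of an undirected simple graph with n nodes and m edges is 2m / (n(n - 1)). The sketch below is illustrative only: it assumes the contact network is treated as undirected (the text does not say so explicitly), and the toy node and edge lists are hypothetical.

```python
def density(n_nodes: int, n_edges: int) -> float:
    """Density of an undirected simple graph: 2m / (n * (n - 1))."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def count_isolates(nodes, edges):
    """Number of nodes with no incident edge (single-node components)."""
    touched = set()
    for u, v in edges:
        touched.add(u)
        touched.add(v)
    return sum(1 for node in nodes if node not in touched)


# Figures reported in the text: 2,050 cases and 759 contact edges.
reported_density = density(2050, 759)
print(f"{reported_density:.6f}")  # 0.000361

# Hypothetical toy network, not taken from the study data.
toy_nodes = ["a", "b", "c", "d", "e"]
toy_edges = [("a", "b"), ("b", "c")]
toy_isolates = count_isolates(toy_nodes, toy_edges)
```

Under the undirected assumption, the reported counts give a density of roughly 0.0004, which agrees with the description of a network whose density is close to 0.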
Figure 3 shows the largest component of the overall network, which is the largest transmission chain of the epidemic. Nodes are numbered in order of detection, so a smaller number means the case was detected earlier. This component contains 64 confirmed cases (3.12% of all cases) and 63 contact relationships (8.30% of all contact relationships). The longest chain of infection in the network has six steps; one such chain is marked in red. The node with the highest degree centrality is a staff member of a local university who was diagnosed on December 18, 2021. This case visited several shopping malls in Xi'an, triggering a mass transmission.
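One way to extract a largest component and measure its longest chain is sketched below. It is a minimal illustration, not the authors' procedure: it assumes an undirected network, relies on the fact that a connected component with k nodes and k - 1 edges (such as 64 cases and 63 relationships) is a tree, and uses the standard two-pass breadth-first search to find the longest path in a tree. Whether the paper counts "steps" along the direction of transmission is not stated. The toy data are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_distances(adj, start):
    """Shortest-path distance (in edges) from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def connected_components(nodes, adj):
    """List of components, each a set of nodes."""
    seen, components = set(), []
    for node in nodes:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            components.append(component)
    return components


def tree_longest_path(adj, component):
    """Longest path (in edges) of a tree component, via two BFS passes."""
    first = bfs_distances(adj, next(iter(component)))
    farthest = max(first, key=first.get)
    return max(bfs_distances(adj, farthest).values())


# Hypothetical toy network, not taken from the study data.
toy_nodes = [1, 2, 3, 4, 5, 6, 7, 8]
toy_edges = [(1, 2), (2, 3), (3, 4), (2, 5), (6, 7)]
adj = build_adjacency(toy_edges)
largest = max(connected_components(toy_nodes, adj), key=len)
edges_in_largest = sum(1 for u, v in toy_edges if u in largest)
longest_chain = tree_longest_path(adj, largest)
```

In the toy network the largest component has 5 nodes and 4 edges, so it is a tree, and its longest path has 3 steps.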
Network Indicators
Figure 4 shows the distribution of degree centrality for all cases in Xi'an, which is right-skewed. Most cases have very few contacts; only three cases are super-spreaders (more than 10 transmissions).
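The degree distribution and the super-spreader count can be tabulated as in the sketch below. It assumes that degree (number of contact edges) stands in for the number of transmissions when applying the ">10" threshold from the text; the toy network is hypothetical.

```python
from collections import Counter


def degree_counts(nodes, edges):
    """Degree of every node in an undirected edge list (isolates get 0)."""
    degree = {node: 0 for node in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def super_spreaders(degree, threshold=10):
    """Nodes whose degree is strictly greater than the threshold."""
    return sorted(node for node, d in degree.items() if d > threshold)


# Hypothetical toy network: one hub with 11 contacts, one pair, one isolate.
contacts = [f"c{i}" for i in range(11)]
toy_nodes = ["hub", "x", "y", "z"] + contacts
toy_edges = [("hub", c) for c in contacts] + [("x", "y")]

degree = degree_counts(toy_nodes, toy_edges)
histogram = Counter(degree.values())  # degree value -> number of nodes
spreaders = super_spreaders(degree)
```

A histogram dominated by degrees 0 and 1 with a few large values is the right-skewed shape described for Figure 4.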
Figure 5 shows the triad census. Triads with no contact relationship among the three actors account for 25.99% (n=545,154), triads with exactly one relationship account for 73.93% (n=1,550,928), triads with two relationships account for 0.08% (n=1,752), and fully connected triads account for 0% (n=0).
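For an undirected network, a triad census classifies every set of three actors by how many of the three possible ties are present (0, 1, 2, or 3). The sketch below counts these classes from degrees and shared neighbours rather than by enumerating all triples. It assumes an undirected simple graph with no self-loops, which may differ from the census definition used for Figure 5; the toy data are hypothetical.

```python
from math import comb


def triad_census(nodes, edges):
    """Count triads of an undirected simple graph by number of ties (0-3)."""
    adj = {node: set() for node in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n = len(adj)
    unique_edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    m = len(unique_edges)
    # Each triangle is found once per edge, hence the division by 3.
    three = sum(len(adj[u] & adj[v]) for u, v in map(tuple, unique_edges)) // 3
    # Pairs of edges sharing a node; every triangle contains 3 such pairs.
    two = sum(comb(len(nb), 2) for nb in adj.values()) - 3 * three
    # Each edge pairs with n - 2 third nodes; remove the denser triads.
    one = m * (n - 2) - 2 * two - 3 * three
    # All C(n, 3) triads minus those with at least one tie.
    zero = comb(n, 3) - one - two - three
    return {0: zero, 1: one, 2: two, 3: three}


# Hypothetical toy network: a triangle a-b-c, a pendant d, and an isolate e.
toy_nodes = ["a", "b", "c", "d", "e"]
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
census = triad_census(toy_nodes, toy_edges)
shares = {k: v / sum(census.values()) for k, v in census.items()}
```

The `shares` dictionary gives the same kind of percentages as those reported in the text, one per triad class.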
Intervals Analysis
Figure 6 shows the distribution of the interval between a case's diagnosis time and isolation time (interval 1). The mean is -3.91 days, with a minimum of -20 days and a maximum of 1 day.
Figure 7 shows the distribution of the interval between the diagnosis time of the primary case and the diagnosis time of the secondary case (interval 2). The values range from 0 to 21 days. The mean is 4.20 days, meaning that an infectee is diagnosed, on average, about 4 days after the infector was diagnosed.
Figure 8 shows the distribution of the interval between the isolation time of the primary case and the isolation time of the secondary case (interval 3). The values range from -8 to 17 days, with a mean of 2.98 days, indicating that a case is isolated, on average, about 3 days after his or her infector was isolated.
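The three intervals can be computed from case dates as sketched below. The sign convention for interval 1 (isolation date minus diagnosis date, so that isolation before diagnosis gives a negative value) is inferred from the reported negative mean and is an assumption, as are the field names and the two hypothetical case records.

```python
from datetime import date

# Hypothetical records; field names are illustrative, not from the study data.
cases = {
    "A": {"diagnosed": date(2021, 12, 20), "isolated": date(2021, 12, 19)},
    "B": {"diagnosed": date(2021, 12, 24), "isolated": date(2021, 12, 21)},
}
infector_of = {"B": "A"}  # secondary case -> primary case


def interval_1(case):
    """Isolation minus diagnosis, in days (negative: isolated first)."""
    return (case["isolated"] - case["diagnosed"]).days


def interval_2(primary, secondary):
    """Secondary diagnosis minus primary diagnosis, in days."""
    return (secondary["diagnosed"] - primary["diagnosed"]).days


def interval_3(primary, secondary):
    """Secondary isolation minus primary isolation, in days."""
    return (secondary["isolated"] - primary["isolated"]).days


i1 = {name: interval_1(case) for name, case in cases.items()}
i2 = {s: interval_2(cases[p], cases[s]) for s, p in infector_of.items()}
i3 = {s: interval_3(cases[p], cases[s]) for s, p in infector_of.items()}
```

Intervals 2 and 3 exist only for cases with a known infector, which is consistent with their smaller sample sizes in Table 1.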
Figure 9 shows the trend of the daily average intervals of confirmed cases. Interval 1, the mostly negative difference between diagnosis and isolation dates, trends downward. Interval 2 trends upward, indicating a gradual increase in the interval between the infector's and the infectee's diagnosis times. Interval 3 follows the same trend as interval 2.
We further use t-tests to compare intervals 1, 2, and 3 before and after the city lockdown. Interval 1 declines significantly after the lockdown (-1.83 days before vs. -4.06 days after, p<0.001). Interval 2 increases significantly (1.77 days vs. 4.34 days, p<0.001), as does interval 3 (1.58 days vs. 3.04 days, p<0.05). These results hold after we exclude cases on the day of the city lockdown and on the first and second days after it, which suggests that the findings are robust.
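A two-sample comparison of this kind can be sketched as follows. The text does not say which t-test variant was used, so the example assumes Welch's unequal-variance test and computes only the t statistic and the Welch-Satterthwaite degrees of freedom. The two samples are hypothetical values of interval 1, not study data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean A
    vb = variance(sample_b) / nb  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical interval 1 values (days) before and after a lockdown.
before = [-1, -2, -2, -3, -1, -2]
after = [-4, -5, -3, -6, -4, -5]

t_stat, dof = welch_t(before, after)
```

A positive statistic here means the "before" mean is higher (less negative) than the "after" mean. Obtaining a p-value requires the t distribution; SciPy's `scipy.stats.ttest_ind(before, after, equal_var=False)` computes the same statistic together with a p-value.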
Table 2 shows the results of the OLS regressions in which the number of contacts (degree centrality) is the dependent variable. Model 1 is the baseline model with only control variables, while models 2, 3, and 4 add interval 1, interval 2, and interval 3, respectively. Among the three intervals, only interval 1 has a significant positive effect on degree centrality; the coefficients of intervals 2 and 3 are insignificant. Because the values of interval 1 are mostly negative, the positive coefficient indicates that the longer a case was isolated before diagnosis, the fewer people he or she infected.
Table 2
OLS regression results, DV = degree centrality
| Variables | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| Age | 0.000146 | -0.00000354 | 0.0000555 | 0.00119 |
| | (0.00176) | (0.00131) | (0.00146) | (0.00110) |
| Gender (male=0) | 0.106 | 0.0485 | 0.0591 | 0.00657 |
| | (0.0621) | (0.0461) | (0.0556) | (0.0409) |
| Moderate disease (0=Asymptomatic) | 0.815* | 0.615 | 1.263 | 0 |
| | (0.363) | (0.441) | (0.817) | (.) |
| Mild disease (0=Asymptomatic) | -0.114 | 0.143 | 0.297 | -0.00711 |
| | (0.351) | (0.431) | (0.809) | (0.117) |
| Interval 1 | | 0.0446*** | | |
| | | (0.00812) | | |
| Interval 2 | | | -0.0233 | |
| | | | (0.0129) | |
| Interval 3 | | | | 0.0120 |
| | | | | (0.00800) |
| Residential areas | controlled | controlled | controlled | controlled |
| Constant | 0.107 | 0.465 | -0.279 | 0.941* |
| | (1.051) | (0.964) | (1.142) | (0.374) |
| N | 2,036 | 1,480 | 727 | 337 |
| R² | 0.029 | 0.062 | 0.085 | 0.073 |
Standard errors in parentheses, * p < 0.05, ** p < 0.01, *** p < 0.001
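To make the regression setup concrete, the sketch below fits an ordinary least squares model by solving the normal equations in plain Python. It is an illustration under stated assumptions: the data are hypothetical and noiseless, only two regressors are included (interval 1 and a gender dummy with male=0), and the residential-area controls and standard errors of the reported models are omitted.

```python
def ols(X, y):
    """Least-squares coefficients for y = X b via the normal equations."""
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data: degree = 3 + 0.5 * interval_1 + 1.0 * female, no noise.
interval_1 = [-6, -4, -3, -2, -1, 0, 1]
female = [0, 1, 0, 1, 1, 0, 1]
degree = [3 + 0.5 * a + 1.0 * b for a, b in zip(interval_1, female)]

X = [[1.0, a, b] for a, b in zip(interval_1, female)]  # leading 1 = intercept
intercept, b_interval, b_female = ols(X, degree)

# Reading the coefficient reported in Table 2, Model 2 (0.0446 per day):
# an interval 1 that is 5 days lower predicts a degree lower by 5 * 0.0446.
predicted_change = 0.0446 * -5
```

With the reported coefficient, a case isolated five days earlier relative to diagnosis has a predicted degree centrality about 0.22 lower, holding the other variables fixed.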
Table 3 presents the results of the OLS regressions in which the PageRank score is the dependent variable. Model 1 is the baseline model with only control variables, and models 2, 3, and 4 add interval 1, interval 2, and interval 3, respectively. The effect of interval 1 on the PageRank score is significantly positive, indicating that the longer a case is isolated before diagnosis, the fewer direct and indirect contacts he or she has. The effect of interval 2 is significantly negative, indicating that the longer the interval between the diagnosis times of a case and his or her infectee, the lower the PageRank score of the case. The coefficient of interval 3 is positive but insignificant.
Table 3
OLS regression results, DV = PageRank scores
| Variables | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| Age | 0.000000474 | -0.000000252 | 0.000000316 | 0.000000313 |
| | (0.000000738) | (0.000000598) | (0.000000574) | (0.000000560) |
| Gender (male=0) | 0.0000436 | 0.0000218 | 0.0000231 | 0.0000185 |
| | (0.0000260) | (0.0000211) | (0.0000219) | (0.0000209) |
| Moderate disease (0=Asymptomatic) | 0.000398* | 0.000320 | 0.000621 | 0 |
| | (0.000194) | (0.000240) | (0.000373) | (.) |
| Mild disease (0=Asymptomatic) | -0.0000190 | 0.000105 | 0.000340 | 0.0000545 |
| | (0.000188) | (0.000235) | (0.000369) | (0.0000695) |
| Interval 1 | | 0.0000208*** | | |
| | | (0.00000308) | | |
| Interval 2 | | | -0.0000110** | |
| | | | (0.00000356) | |
| Interval 3 | | | | 0.00000225 |
| | | | | (0.00000335) |
| Residential areas | controlled | controlled | controlled | controlled |
| Constant | 0.000440 | 0.000587 | 0.000283 | 0.000785*** |
| | (0.000613) | (0.000446) | (0.000408) | (0.000187) |
| N | 2,036 | 1,480 | 727 | 337 |
| R² | 0.027 | 0.064 | 0.081 | 0.106 |
Standard errors in parentheses, * p < 0.05, ** p < 0.01, *** p < 0.001
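For readers unfamiliar with the dependent variable in Table 3, the sketch below computes PageRank by power iteration. The damping factor of 0.85, the uniform redistribution of rank from nodes without outgoing edges, and the treatment of each contact as a pair of directed edges are conventional choices assumed here; the text does not state the settings actually used. The toy network is hypothetical.

```python
def pagerank(nodes, edges, damping=0.85, tol=1e-12, max_iter=200):
    """PageRank by power iteration; edges are directed pairs (u, v)."""
    nodes = list(nodes)
    n = len(nodes)
    out = {node: [] for node in nodes}
    for u, v in edges:
        out[u].append(v)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no outgoing edge is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out[u])
        base = (1 - damping) / n + damping * dangling / n
        new = {node: base for node in nodes}
        for u in nodes:
            if out[u]:
                share = damping * rank[u] / len(out[u])
                for v in out[u]:
                    new[v] += share
        delta = sum(abs(new[x] - rank[x]) for x in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy network: a hub with three contacts plus one isolate.
toy_nodes = ["hub", "a", "b", "c", "z"]
contacts = [("hub", "a"), ("hub", "b"), ("hub", "c")]
# Treat each contact as undirected by adding both directions.
toy_edges = contacts + [(v, u) for u, v in contacts]

scores = pagerank(toy_nodes, toy_edges)
mean_score = sum(scores.values()) / len(scores)
```

Because the scores sum to 1, their mean is always 1/n. For 2,050 cases this is about 0.0005, which matches the mean PageRank score reported in Table 1.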