We employ quantile regression (47, 48) to test the impact of globalisation, settlement characteristics and population characteristics on the cumulative total of confirmed COVID-19 cases per one million inhabitants over a six-week period from the 10th week (ending March 4th) until the 15th week of 2020 (ending April 8th). Figure 1 shows the distribution of cases over the study period.
Table 1 lists the variables in the model, with the source, units and year of each.
Table 1
List of independent variables to explain the diffusion of confirmed COVID-19 cases.
Variable Description | Category | Units (Transformation) | Source | Year |
Interpersonal Globalisation | Globalisation | Index Value (100 Point Scale) | Swiss Economic Institute (KOF) | 2019 |
Trade Globalisation | Globalisation | Index Value (100 Point Scale) | Swiss Economic Institute (KOF) | 2019 |
Financial Globalisation | Globalisation | Index Value (100 Point Scale) | Swiss Economic Institute (KOF) | 2019 |
Urbanisation Rate | Settlement | National (Percent) | World Bank | 2018 |
Population Density | Settlement | Log transformed value of Inhabitants per square kilometre | World Bank | 2018 |
Urban Density | Settlement | Inhabitants per square kilometre in Densest Metropolitan Area | Demographia | 2020 |
Areal Accessibility | Settlement | Area-weighted average driving time to a location with at least 1,500 inhabitants per square kilometre | Weiss et al. (2018) | 2018 |
Human Development | Population | Index Value | United Nations Development Programme | 2018 |
Population aged 65 and over | Population | Percent Age 65+ | United Nations, Department of Economic and Social Affairs Population Division | 2019 |
Household Size | Population | Mean Number of Household Members | United Nations, Department of Economic and Social Affairs Population Division | 2019 |
Population | Population | Total population | United Nations | 2019 |
During the six-week study period, the number of cases increased by 1433 per cent and the number of countries and territories affected more than doubled, counting those enumerated within the COVID-19 Data Repository by the Center for Systems Science and Engineering at Johns Hopkins University (JHU) (14). Figures 2 and 3 show the geographical (Fig. 2) and temporal (Fig. 3) spread of COVID-19.
The dependent variable in the quantile regression model is the cumulative total of confirmed COVID-19 cases per one million inhabitants (log-transformed) by country (or territory) and by week. The denominator for the dependent variable is the 2019 mid-year population by country, drawn from the United Nations World Population Prospects (49). A total of 84 countries had consistently available data for the duration of the study period and were therefore included in the model. Data on national COVID-19 cases were extracted from the JHU repository on May 13th 2020.
Quantile regression allows us to go beyond the mean relationship between the response and the predictor variables to reveal statistical relationships at different quantiles of the distribution (47, 48, 50, 51). In this way we can discuss in detail how globalisation, settlement characteristics and population characteristics affect the global diffusion of COVID-19 cases across its entire distribution. The technique captures the differential effects that socio-spatial factors have at different points along the distribution, which mean models cannot account for; in this instance it can identify the factors that contribute to COVID-19 diffusion at either end of the pandemic spectrum.
Although mean regression models are highly sensitive to outliers, quantile estimates can also be influenced by outliers at different locations (quantiles) of the distribution (52, 53). For example, at the 50th quantile in the last three weeks of the study, China, Iran and Japan stand out as influential observations that may have unduly influenced the significance of each variable.
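The construction of the dependent variable can be illustrated with a minimal sketch. The function name, the example figures and the choice of a base-10 logarithm are assumptions made for illustration; the text states only that the variable is log-transformed, and it does not describe how zero counts were handled (here they are simply rejected).

```python
import math


def log_cases_per_million(cumulative_cases, population):
    """Log-transformed cumulative confirmed cases per one million inhabitants.

    Assumption: a base-10 logarithm is used; the source does not state the base.
    """
    if cumulative_cases <= 0 or population <= 0:
        # Illustrative choice: a log cannot be taken of zero or negative values.
        raise ValueError("cases and population must be positive")
    per_million = cumulative_cases / population * 1_000_000
    return math.log10(per_million)


# Hypothetical country (not from the study): 5,000 cumulative cases,
# 50 million inhabitants, i.e. 100 cases per million.
value = log_cases_per_million(5_000, 50_000_000)
print(round(value, 2))
```

In practice the same calculation would be repeated for each country and each of the six weekly cut-off dates.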
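The idea behind quantile regression can be sketched with its loss function. Where mean regression minimises squared residuals, quantile regression minimises an asymmetric "check" loss that weights positive and negative residuals by the quantile level tau. The example below is an intercept-only illustration on hypothetical values, not the model estimated in this study: it shows that the constant minimising the check loss is the corresponding sample quantile. All function names are illustrative.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(values, candidate, tau):
    """Sum of check losses when every value is predicted by one constant."""
    return sum(check_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Intercept-only quantile fit: the observed value with the lowest loss."""
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


# Hypothetical log case rates for nine countries (not data from the study).
rates = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3, 2.9, 3.4, 4.1]

for tau in (0.25, 0.5, 0.75):
    print(tau, best_constant(rates, tau))
```

In a full model the constant is replaced by a linear combination of the predictors, and the coefficients are estimated separately for each tau, typically with a library routine (for example `QuantReg` in the Python package statsmodels). The source does not state which software was used.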
To understand the role of globalisation in COVID-19 diffusion, we test three variables from the KOF globalisation index (38, 54, 55): de facto interpersonal globalisation, de facto financial globalisation and de facto trade globalisation. These sub-indices proxy migration, tourism and business flows, which have been positively associated with outbreaks of infectious diseases by exposing countries to the outside world (33–35, 40, 56–58). Globalisation variable 1 is de facto interpersonal globalisation, a KOF sub-index of social globalisation that includes indicators of international traffic, transfers, international tourism, international students and migration (38). An early study of COVID-19 spatial diffusion (29) shows that the volume of migration flows has been a strong indicator of the international spread of the pandemic. Globalisation variable 2 is de facto trade globalisation, a KOF sub-index of economic globalisation that reflects trade in goods and services as well as trade partner diversity (38). Globalisation variable 3 is de facto financial globalisation, also a KOF sub-index of economic globalisation. It comprises measures of foreign direct investment, portfolio investment, international debt, international reserves and international income payments (38).
To understand the role of settlement characteristics in COVID-19 diffusion, we test four variables that measure various national-scale dimensions: urbanisation rate, population density, maximum urban population density, and areal accessibility (which measures the average drive time of the national population from smaller to larger settlements (59)). These represent human interaction within national boundaries, with recent publications demonstrating that diffusion happens more rapidly in cities that are dense, well-connected and accessible (11, 29, 42–44). Settlement variable 1 is urbanisation rate, defined as the proportion of a national population located in cities or metropolitan regions (national definitions vary). We selected this variable because cities are more prone to early disease diffusion than rural areas, owing to the higher concentration of interaction and movement in urban areas (42). COVID-19 has been preliminarily found to diffuse faster in more populous urban areas in the United States (60). Settlement variable 2 is population density, defined as the population per square kilometre across a national territory. Population density proxies the intensity of human interaction, which makes disease transmission more likely. While a previous study (29) found no significant relationship between population density and total confirmed COVID-19 cases, a broader literature shows a significant association between population density and outbreaks of infectious diseases (44).
Settlement variable 3 is urban density [maximum], defined as the population per square kilometre of the densest city in a country. This variable was selected on the basis of previous studies that documented a higher sensitivity of large cities (global cities) to the spread of infectious diseases (11, 31). Settlement variable 4 is areal accessibility, defined as an area-weighted average of driving time to locations with at least 1,500 inhabitants per square kilometre (59). This variable was selected on the basis of a previous study (43) in which the authors argue that extended urbanisation may result in increased vulnerability to the spread of infectious disease. Areal accessibility captures the variations in suburbanisation and peri-urbanisation across countries.
To understand the role of national population characteristics in COVID-19 diffusion, we employ the Human Development Index (HDI), population age structure (65+), mean household size, and population size. Research suggests that COVID-19 is more likely to spread in more-developed countries with higher levels of international migration than in countries with lower levels of development and migration (33). Affluent, healthy and educated populations (HDI) are more likely to be highly mobile. Although larger household sizes and national populations have also been shown to increase COVID-19 cases, these are not clear-cut relationships (8). Older populations or populations with higher mortality rates are more likely to get tested than younger populations that may be asymptomatic (46, 61). Population variable 1 is HDI, which captures a holistic picture of individual countries and has been used as an indicator of the macro environment in a previous study (29) written in the early period of the pandemic. That study found that each unit increase in the HDI score is associated with five more confirmed COVID-19 cases. Populations in countries with higher HDI are more affluent, healthier and better educated, meaning that their overall mobility potential would be higher. Population variable 2 is population aged 65 and over (%), the proportion of the population aged 65 years and over. We hypothesise that in the early stages of the pandemic, case detection is higher in countries with older populations because of the higher burden of mortality among older adults (46). COVID-19 transmission may remain undetected longer in younger populations (61). Population variable 3 is household size (mean), the average number of people per dwelling. Individuals in larger households interact with more people, including once stay-at-home measures are applied.
An analysis of demographic and socioeconomic determinants of COVID-19 testing in New York shows a very strong correlation between cases of infection in the population and household size (8). Population variable 4 is population (n), a demographic variable directly related to the size of the pool of potentially infected people. Population size was considered as a moderating variable in a previous study (29) that found that “a one person increase in population size indicates over 1.6 more COVID-19 cases” (p. 385); thus more populous countries have greater potential for exposure. Even when normalised on a per capita basis, the likelihood of new cases is still higher in large countries than in small countries. Table 2 provides summary statistics for the dependent variable and for the globalisation, settlement and population variables.
Table 2
Descriptive Summary of Dependent and Independent Variables
Variable | Median | Mean | St. Dev. | Min | Max |
Confirmed cases per million by March 4th [log] | 0.71 | 0.83 | 1.10 | -1.29 | 3.25 |
Confirmed cases per million by March 11th [log] | 1.59 | 1.60 | 1.00 | -0.83 | 3.52 |
Confirmed cases per million by March 18th [log] | 2.28 | 2.23 | 0.91 | -0.01 | 3.97 |
Confirmed cases per million by March 25th [log] | 2.77 | 2.69 | 0.88 | 0.50 | 4.29 |
Confirmed cases per million by April 1st [log] | 3.15 | 3.04 | 0.86 | 0.70 | 4.53 |
Confirmed cases per million by April 8th [log] | 3.38 | 3.32 | 0.83 | 0.89 | 4.76 |
Interpersonal Globalisation [index] | 68.50 | 64.80 | 20.70 | 22.70 | 96.50 |
Trade Globalisation [index] | 62.80 | 57.80 | 21.50 | 21.20 | 99.20 |
Financial Globalisation [index] | 72.70 | 69.20 | 19.10 | 21.30 | 97.30 |
Urbanisation [rate] | 72.00 | 68.60 | 19.60 | 18.50 | 100.00 |
Population Density [log] | 1.97 | 1.95 | 0.57 | 0.31 | 3.90 |
Urban Density [maximum] | 5650 | 7686 | 6251 | 1300 | 41000 |
Areal Accessibility [mean] | 111 | 158 | 116 | 30 | 577 |
Human Development [index] | 0.80 | 0.79 | 0.12 | 0.43 | 0.95 |
Aged over 65 [%] | 11.00 | 11.40 | 6.70 | 1.09 | 27.60 |
Population [million] | 18 | 76 | 219 | 1 | 1432 |
Household Size [mean] | 3.36 | 3.78 | 1.16 | 2.05 | 8.66 |
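The five statistics reported in Table 2 can be reproduced for any single variable with a short sketch such as the one below. The input values are hypothetical, the function name is illustrative, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form was used.

```python
import statistics


def describe(values):
    """Return the five summary statistics reported for each variable."""
    return {
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        # Assumption: sample standard deviation (n - 1 denominator).
        "st_dev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical mean household sizes for five countries (not study data).
household_size = [2.0, 3.0, 4.0, 5.0, 6.0]
summary = describe(household_size)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```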