Culture and Coronavirus Disease Statistics: Public Health Through the Lens of Hofstede’s Cultural Dimensions, A Multiple Regression Analysis

Background: Hofstede’s Culture Dimensions (HCD) are the most prevalent metrics with which social scientists distinguish cultural differences between countries. In this study, we examine the relationships between HCD and the COVID-19 pandemic. In particular, we investigate how differences in COVID-19 infection, death and recovery between countries correlate with differences in individualism (IDV), indulgence (IVR) and power distance index (PDI).
Method: We use multiple linear regressions to interpret statistical and economic significance.
Results: IDV is found to be significantly associated with death rate and recovery rate globally, while IVR and PDI do not seem to be significantly relevant. None of the three dimensions is significantly related to the global infection rate.
Conclusions: These results have implications for the design of public health campaigns on preventing COVID-19 infection and compliance with vaccination campaigns. We propose some practical strategies for public health officials to help mitigate the spread of COVID-19.


Introduction
There is ample literature on the relationship between infectious disease and natural factors such as environmental conditions, biological bases and comorbidities. Few studies investigate sociological or economic attributes, and even fewer attempt to relate pandemics to culture dimensions. This paper aims to fill the gap by linking the spread of the ongoing COVID-19 pandemic to social factors proxied by Geert Hofstede's Culture Dimensions (HCD), which include PDI, IVR and IDV [24].
COVID-19 has spread globally and evolved into a pandemic with far-reaching impacts. The Coronavirus Resource Center of Johns Hopkins University of Medicine reports over 45 million confirmed infection cases worldwide and over one million deaths as of October 2020 (JHU 2020), and these figures may be underestimates (Vogel 2020) [46]. The recommended safety policies result in social isolation and significant national, organizational and individual economic disruption. According to the IMF, global growth suffers a 3% decrease in 2020. The International Labor Organization estimates that 6.7% of working hours were wiped out in the second quarter of this year, equivalent to 195 million full-time workers, and the estimated number of unemployed is 30 million, compared with 25 million during the 2008 financial crisis [25,48]. The cumulative global GDP loss over 2020 and 2021 from this pandemic could amount to around 9 trillion dollars, although global growth is projected at 5.8% next year [22]. Other social issues include increased bankruptcies and a higher unemployment rate. Wang, Yang, Iverson and Kluender (2020) find that Chapter 11 filings, a form of bankruptcy record, by large corporations increased by almost 200% from January to August this year compared to last year [47]. Meanwhile, data from the U.S. Bureau of Labor Statistics show that the nationwide civilian unemployment rate has risen significantly since March, peaking at around 15% one month later [14]. It is reported that public health cooperation at the global level can effectively alleviate such issues, yet due to conflicting interests, politicians keep ignoring possible cooperation and intensifying contradictions among countries (McKibbin and Fernando 2020) [32].
Recovery from a pandemic is accelerated by compliance with protocols recommended by public health authorities such as the World Health Organization, the Centers for Disease Control and Prevention, and the relevant institutes within the National Institutes of Health. However, culture influences compliance, which is why pandemic "hot spots" are often localized regions that are culturally homogeneous and culturally different from their surroundings, i.e., geographies whose health and cultural measures are easily segmented from their surroundings. Acknowledging this, public health researchers are realizing the importance of accounting for culture in public health campaigns (Munodawafa, Ford, Oni, & Agyemang, 2020; Van Bavel et al., 2020) [1,8]. But how do we measure culture? Geert Hofstede's seminal work on culture (Hofstede 1984) provides the most prevalent metrics with which social scientists distinguish cultural differences between countries. For example, the Global Entrepreneurship Monitor (GEM) uses HCD to evaluate the state of entrepreneurship in countries across the world (150,000 participants in 50 economies). Since 1998, policy makers have used the annual GEM data to advance the regional economic impact of entrepreneurship (Bosma et al., 2020) [11]. In general, cultures with high individualism and low uncertainty avoidance exhibit the strongest entrepreneurial activity (Mueller and Thomas, 2001) [35].
While HCD is most frequently used in economic studies, we propose that it might also provide insight into the impact of culture on compliance with public health recommendations and help explain disparities between the COVID-19 statistics of different countries. In particular, we investigate how PDI, IDV and IVR of a country correlate with the country's COVID-19 infection rate, death rate and recovery rate.
Our contribution to the literature is twofold. First, this study bridges the gap between the culture literature and pandemic research, as there are few works exploring how culture and pandemic response are related, and none that use the HCD so common in economic research. We also find moderate to strong correlations among some Hofstede dimensions and COVID-19 national health statistics.
Second, the study has social, policy and economic implications. The study informs policy makers on their country's cultural attributes that impact compliance. Just as policy makers use the HCD to tailor economic policy, public health officials can also use HCD to tailor public health campaigns. The goal is similar in both cases: increase participation and compliance with recommendations that appeal to regional culture.
We find that IDV and death rate are positively related. While people with higher IDV may experience more "freedom" by not caring about anyone's health but their own, this freedom comes with a higher regional mortality rate. As the world enters a third wave of COVID-19 casualties, there is both fatigue with the pandemic and a call for increased sacrifice, especially for Americans to come to terms with their individualism, which appears unsuitable for pandemic response (O'Rourke 2020) [36]. In addition to saving lives, Thunström et al. (2020) estimate that the benefits of effective social distancing are worth $5.16 trillion for the U.S., which accounts for more than 24% of its GDP last year [42].
The remainder of the paper is organized as follows. Section 2 reviews the literature and develops our hypotheses. Section 3 describes our data and summary statistics. Section 4 presents the empirical results, Section 5 shows the robustness checks, and Section 6 concludes.
Literature Review And Hypothesis Development
as Italy and Spain suffer more in this pandemic, while it is easier for Japan to adopt social distancing due to the lack of close contact in Japanese culture [6]. It is also not surprising that East Asian culture emphasizes collectivism, under which policies that infringe on individual freedom are more prevalent, while in western culture individualism usually dominates collectivism even in a crisis (An and Tang 2020) [5,23].
In one of the few studies involving COVID-19 and HCD, Messner (2020) finds that societies with high IDV and high PDI experience a slowed rate of pathogen multiplication [34]. His model, and that of An and Tang (2020), focus on institutional constructs. In contrast, we are interested in public health constructs.
While HCD is prevalent in cross-culture research, many have also pointed out its limitations. Some scholars argue that Hofstede's uni-level analysis neglects interactions between macroscopic and microscopic cultural levels (Baskerville 2003, McSweeney 2002) [33]. Others fault the theory for an "ecological fallacy", whereby correlations are inconsistent across the national (ecological), individual and organizational levels (Brewer and Venaik 2014) [12]. In other words, even if HCD accurately describe a nation's culture, it does not mean that the citizens of that country behave exactly as the theory suggests. In addition, dividing cultures into "stereotypes" may be misleading (Jain 2020) [26]. A more refined methodology is needed for measuring and analyzing culture.
Hofstede, aware of this limitation (and abuse) of his model of culture, reminds HCD users that "the concept of a common culture applies to societies, not to nations" (Hofstede, Hofstede et al. 2010, p. 21), and "one of the weaknesses of much cross-cultural research is not recognizing the difference between analysis at the societal level and at the individual level" (Hofstede 2011, p. 6).

Hypothesis Development
The work of Messner, Baniamin et al., and An and Tang (2020) implies that culture difference may influence the intensity of the pandemic. Anecdotal observation of civilian responses to pandemic public health recommendations, such as parades in defiance of stay-at-home orders and anti-vaccination campaigns, further motivates us to suspect that individualism is relevant to the severity of this pandemic.
Because of culture and de-centralized governance, states in the U.S. balance the trade-off between public health protection and citizens' freedom differently (Calandrillo 2004) [13]. Rothstein (2020) points out that it is questionable whether Americans will comply with a government quarantine decision, due to their ingrained values, including adamant individualism, self-reliance, nonconformity and independence, and that it is unimaginable for them to experience a cordon sanitaire like the one imposed in Wuhan, China [40]. According to the Time website, the U.S. infection curve has fluctuated since early April: the number of new daily cases fell from roughly 9.5 to 6 per 100,000 people, compared with a net increase of 9.5 per 100,000 people from early March to late April; it then surged to more than 20 per 100,000 people in late July, followed by a decline through mid-August [44]. Meanwhile, more behaviors exhibit collectivist qualities rather than resolute individualism, and there are fewer protests or disruptive actions and more proponents of community-oriented activities (Rothstein 2020) [48]. While we cannot say for sure that these two phenomena have a causal relation, it makes sense to link them together.
The literature suggests that IVR is related to other infectious diseases and health outcomes. Mackenbach (2014) finds that, among the HCD, IVR is the dimension most strongly correlated with both health outcomes and health behaviors: for health outcomes, 25 of 35 cases have |r|>0.4 and 14 have |r|>0.6; for health behaviors, 16 of 35 cases have |r|>0.4 and 5 have |r|>0.6. Messner (2020) finds that hedonistic values with a focus on indulgence and avoidance of restraints are positively related to COVID-19 outbreaks (p value = 0.001), and that it is such indulgence that makes social distancing restraints difficult and therefore facilitates the outbreaks. Similarly, people in countries with higher IVR scores, such as the U.S. and other western countries excluding Germany, tend to resist stay-at-home orders, requiring authorities to enforce them, while Chinese culture, which scores about a quarter of the U.S. IVR, values endurance and patience, a constructive response to the pandemic (Travica 2020) [43].
Apart from IVR and IDV, other elements in HCD might also affect the impact of infectious diseases. For example, PDI is a measure of deference to authority figures, and might thus impact compliance with centralized public health guidance. There can also be a feedback loop between policy and compliance: to avoid losing popularity, some politicians in low PDI cultures might advocate policies they believe will be well received by the majority of their constituents even if the policies diverge from those recommended by healthcare professionals. Borg (2014) shows that PDI, uncertainty avoidance (UAI) and masculinity (MAS) are also substantially correlated with a few key performance indicators relevant to infection prevention and control and antibiotic management [9]. UAI (0.365), MAS (0.214) and long-term orientation (LTO, -0.281) are significantly associated with Methicillin-resistant Staphylococcus aureus (MRSA) in Europe, with UAI being the strongest culture element relevant to MRSA prevention (Borg et al. 2012) [10]. In addition, a high PDI may prevent patients or their families from speaking up even if they see healthcare providers fail to wash their hands before direct contact, and a nurse may not be able to stop a central line insertion even if the doctor is seen using the wrong technique; both cases may result in hospital-induced infections (Saint 2017) [41].
We expect that at least one of PDI, IDV and IVR is associated with at least one of the essential COVID-19 statistics (confirmed case rate, death rate and recovery rate) with statistical significance, which is the alternative hypothesis below. The null hypothesis is that the culture dimensions and the COVID-19 statistics are not related.
H0: None of IDV, IVR or PDI in HCD is significantly related to the confirmed case rate, death rate or recovery rate in COVID-19 (β_IDV = 0, β_PDI = 0, β_IVR = 0 in each regression).
Our null hypothesis assumes that these culture dimensions have no correlation with the COVID-19 pandemic. In other words, they are jointly insignificant to the COVID-19 statistics. The null hypothesis is counterintuitive based on our life experience, and it runs against the factual records in the similar literature mentioned above. We specify that the least stringent significance cutoff is 0.1.
Ha: At least one element among IDV, IVR and PDI is significantly related to the confirmed case rate, death rate or recovery rate in COVID-19.
For the alternative hypothesis, we posit that PDI, IDV and IVR are jointly linked to the current pandemic due to the cited anecdotal observations and extant literature. We perform the F test to determine if PDI, IDV and IVR are jointly significant for each regression.

Data
Because there is no consolidated data source that integrates COVID-19 health statistics and HCD values, this study combines three data sources. The first data source is the most recent (December 2015) HCD values from Hofstede's website (Hofstede 2016). There are two issues with the dataset. First, some countries are not uniquely presented. For example, Canada has two versions, a "traditional" Canada, and French Canada composed of the provinces that predominantly speak French. Belgium and Switzerland are similarly subdivided in the HCD dataset.
Second, Hofstede's website defines some countries in vague geographical terms. For example, the dataset cites Africa East, Africa West, and Arab countries. We complete this dataset with data from the Country Comparison tool on Hofstede Insights (Hofstede 2017), and for countries where the HCD values reported by these two sources conflict, we use the Country Comparison values because they are more recent [23].
The third dataset is the Coronavirus Resource Center of Johns Hopkins University of Medicine (JHU, 2020), which provides COVID-19 infection, death and recovery data as of 7/3/2020 [17]. Any data missing from this third dataset is obtained from Google News [15]. We also record the population of each country or district for standardization purposes [9].
The GDP per capita is estimated by the International Monetary Fund [29]. The physician density is defined as the number of medical doctors (physicians), including generalist and specialist medical practitioners, per 1000 population, according to the World Health Organization. The recorded date does not strictly match the other information mentioned above; it generally comes with a time lag of 6 to 11 years. The median age for each country is also available from Wikipedia [30]. After removing incomplete observations (countries), 75 remain. Environmental performance, or equivalently, the water and sanitation index, is also considered as a control variable in our model [21].

Methodology
The main statistical tool used in this study is a multiple linear regression model. The three HCD used in our regressions are PDI, IDV and IVR. The significance levels are set at 1%, 5% and 10%. Since each country/district has a different population, we divide the total number of infections by the population. Technically, this quantity is the confirmed case rate, but in this paper we use "infection rate" and "confirmed case rate" interchangeably. The definitions of variables are listed in Table 2.
Possible explanatory control variables include GDP per capita, physician density, life expectancy and net migration rate. According to the CDC, age, race and gender may also influence infection [16]. Dangi and George (2020) discuss other factors including humidity, temperature, water, sanitation, population density, and job satisfaction [18]. However, some of these control variables, including job satisfaction, are not available for every country. Urbanization is also considered an environmental determinant of infectious diseases (Eisenberg et al. 2007) [20].
We also include the legal origin as a control variable because of its significance to a country's economy (Porta et al. 2007), which in turn may be associated with COVID-19 statistics [39]. In fact, legal origin turns out to be statistically significant to the economic outcome, with common law being more economically promising than French civil law (Porta et al. 2007) [47]. The legal origin is divided into four groups in terms of country and region: UK, France, Germany and Scandinavia. To avoid the dummy variable trap, we set UK as the base group in our regression analysis. In Table 2 we list the available control variables used in this study.
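As an illustration of the standardization and estimation steps described above, the following is a minimal sketch, not the code used in this study. The function names `confirmed_case_rate` and `fit_ols` are hypothetical, the numbers are made up, and the use of NumPy least squares is an assumption about one reasonable way to fit a multiple linear regression.

```python
import numpy as np

def confirmed_case_rate(confirmed, population):
    """Confirmed cases divided by population, element-wise per country."""
    confirmed = np.asarray(confirmed, dtype=float)
    population = np.asarray(population, dtype=float)
    return confirmed / population

def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    X has one row per country and one column per regressor
    (for example IDV, IVR, PDI and the controls).
    Returns the coefficient vector with the intercept first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Made-up figures for three hypothetical countries (not data from this study).
rates = confirmed_case_rate([1000, 250, 40], [1_000_000, 500_000, 200_000])
```

In practice the same `fit_ols` call would be repeated with the death rate and the recovery rate as the dependent variable, one regression per outcome.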
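The base-group treatment of legal origin can be sketched as follows. This is an illustrative example under our own assumptions: the helper `legal_origin_dummies` is hypothetical, and the sample list of origins is made up. It shows why one group must be omitted: with an intercept and all four dummies, the design matrix loses full column rank.

```python
import numpy as np

LEGAL_ORIGINS = ["UK", "France", "Germany", "Scandinavia"]

def legal_origin_dummies(origins, base="UK"):
    """Return (column_names, 0/1 matrix) with the base group omitted."""
    unknown = set(origins) - set(LEGAL_ORIGINS)
    if unknown:
        raise ValueError(f"unknown legal origin(s): {sorted(unknown)}")
    levels = [g for g in LEGAL_ORIGINS if g != base]
    matrix = np.array([[1.0 if o == g else 0.0 for g in levels] for o in origins])
    return levels, matrix

# Made-up sample covering all four groups (not data from this study).
sample = ["UK", "France", "Germany", "Scandinavia", "UK"]
intercept = np.ones((len(sample), 1))

# All four dummies plus an intercept: the dummies sum to the intercept column,
# so the 5-column design matrix only has rank 4 (the dummy variable trap).
all_four = np.array([[1.0 if o == g else 0.0 for g in LEGAL_ORIGINS] for o in sample])
rank_trapped = np.linalg.matrix_rank(np.hstack([intercept, all_four]))

# Omitting the UK base group restores full column rank (4 columns, rank 4).
names, dummies = legal_origin_dummies(sample, base="UK")
rank_ok = np.linalg.matrix_rank(np.hstack([intercept, dummies]))
```

With UK as the base group, each remaining dummy coefficient is read as the difference from UK-origin countries, holding the other regressors fixed.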

Empirical Results
Tables 4, 5, and 6 present the results of regressions with IDV, IVR, and PDI as independent variables (individually and combined), the control variables in Table 2, and confirmed case rate, death rate and recovery rate as dependent variables, respectively. First and most importantly, IDV is statistically significant only for death rate and recovery rate globally. Second, IVR and PDI have no statistical significance for any rate at any level specified. Third, none of PDI, IDV and IVR is significant for the global infection rate at any of the specified levels. Fourth, among the three dependent variables, only confirmed case rate and recovery rate are very strongly correlated (Table 3 Panel B).
Table 1 presents the data used in the regressions. 75 entries remain after eliminating entries with any missing data. As can be seen, most attributes are either dependent variables or control variables, and the last three columns are the independent variables of interest. Table 2 explains each variable in the regressions. Note that the GDP per capita is measured at purchasing power parity rather than in nominal terms, which is a better reflection of reality. We use the logarithm of GDP per capita, as GDP per capita tends to grow nonlinearly [23]. Table 3 Panel A shows that COVID-19 is highly infectious but less lethal, and that the majority of the infected recover. Table 3 Panel B indicates that some variables have moderate to high correlations, but most correlations are safely under 0.8.
The adjusted R-squared values roughly equal 0.33 for the four infection rate regressions (Table 4). The adjusted R-squared values range from 0.3 to 0.39 in the four death rate regressions (Table 5) and from 0.24 to 0.29 in the four recovery rate regressions (Table 6). An increment in IDV correlates with a decrease of 1.6% in infection rate, with other variables fixed. This counter-intuitive result disputes the notion that IDV is positively related to COVID-19. From the t statistics, IDV and death rate correlate with very strong statistical significance, but an increment in IDV correlates with a relatively small 0.4% increase in death rate. The estimated recovery rate decreases by a noteworthy 3.5% for every increment in IDV. The IDV t statistics of -2.021 and -2.347 both suggest statistical significance for the recovery rate, although it is not as strong as in the death rate regressions.
We also perform the F test to see if PDI, IDV and IVR are jointly significant, which leads to a conclusion similar to the one above. The p values corresponding to the F statistics for infection rate, death rate and recovery rate equal 0.5983, 0.0135 and 0.1018, respectively. Therefore, we conclude that PDI, IDV and IVR are jointly significant only for the death rate.
Messner (2020) found that societies with high IDV and high PDI experience a slowed rate of pathogen multiplication, while population density is negatively related to outbreaks [34]. These results contradict our findings, although Messner's study does support a positive relation between IVR and outbreaks. While HCD cover more than 100 countries, our study used the 75 that had the necessary records, compared to 96 used by Messner. In addition to using more recent COVID-19 data, we use three dependent variables (infection rate, death rate and recovery rate), instead of only the growth rate used by Messner (2020). Moreover, whereas we apply economic explanatory and control variables, Messner applies political variables, namely functioning political institutions and education system quality, with the former being negatively associated with the outbreak and the latter positively associated.
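The joint significance test mentioned above compares a restricted regression (controls only) with an unrestricted one (controls plus PDI, IDV and IVR). The sketch below is our own illustration of that standard F statistic, F = ((SSR_r - SSR_u) / q) / (SSR_u / (n - k - 1)); the function names are hypothetical and this is not the code used in the study.

```python
import numpy as np

def sum_squared_residuals(X, y):
    """SSR from an OLS fit of y on X with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

def joint_f_statistic(X_culture, X_controls, y):
    """F statistic for H0: all culture coefficients equal zero.

    Returns (F, q, df_resid), where q is the number of restrictions and
    df_resid = n - k - 1 for the unrestricted model with k regressors.
    """
    X_culture = np.asarray(X_culture, dtype=float)
    X_controls = np.asarray(X_controls, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    q = X_culture.shape[1]
    X_full = np.column_stack([X_controls, X_culture])
    k = X_full.shape[1]
    ssr_unrestricted = sum_squared_residuals(X_full, y)
    ssr_restricted = sum_squared_residuals(X_controls, y)
    df_resid = n - k - 1
    f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df_resid)
    return f_stat, q, df_resid
```

If SciPy is available, the corresponding p value can be obtained as `scipy.stats.f.sf(f_stat, q, df_resid)` and compared against the 1%, 5% and 10% levels.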

Robustness Checks
We apply two robustness techniques to our study. First, to avoid collinearity, we discard any variable whose correlation exceeds 0.8 in the correlation matrix. We check the correlations of IDV, IVR and PDI with the remaining non-control variables, and find that no correlation exceeds the cutoff. Second, we set a variance inflation factor (VIF) threshold of 5 and check only IDV, PDI and IVR, since a high VIF among the control variables does not matter. These criteria mitigate multicollinearity [4]. The VIFs for regressions with regressands IDV, IVR and PDI are 2.78, 2.20 and 2.62, respectively, using the formula 1/(1 - R²), all of which are below the threshold. Therefore, our model should be robust.
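The VIF check can be sketched as follows: each variable of interest is regressed on the remaining regressors, and the resulting R² is plugged into 1/(1 - R²). The function name `variance_inflation_factor` below is our own hypothetical helper written with NumPy, and the sample matrix is made up; it is an illustration rather than the study's code.

```python
import numpy as np

def variance_inflation_factor(X, j):
    """VIF of column j: regress it on the other columns and return 1 / (1 - R^2)."""
    X = np.asarray(X, dtype=float)
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(len(target)), others])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ beta
    total = np.sum((target - target.mean()) ** 2)
    r_squared = 1.0 - (resid @ resid) / total
    return 1.0 / (1.0 - r_squared)

VIF_THRESHOLD = 5.0  # threshold stated in the text

# Made-up regressors: the two columns are nearly collinear,
# so their VIF lands well above the threshold.
X_demo = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 6]], dtype=float)
flagged = variance_inflation_factor(X_demo, 0) > VIF_THRESHOLD
```

A VIF of 1 means the variable is uncorrelated with the other regressors; values such as 2.78, 2.20 and 2.62 indicate moderate overlap that stays under the chosen threshold.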

Conclusion
For over two decades, economic leaders have been using the Global Entrepreneurship Monitor (GEM), which is built on the HCD, to guide regional economic policy (Al-Kadi, 2017) [2], and the latest GEM report presents the entrepreneurship policy roadmap for each of the fifty participating economies (Bosma et al., 2020) [11]. We recommend a similar culture-specific policy roadmap for pandemic response. Because political borders are more porous to pandemic effects than to economic effects, such roadmaps may be even more important to public health policy than to economic policy.
IDV is statistically significant to death rate and recovery rate at a global level.
This study has several limitations:
1. The actual number of infections is very likely to be underestimated in many countries. Likewise, the numbers of deaths and recoveries may not be entirely reliable. Due to omitted cases, different ways used to calculate these statistics, and even intentionally misleading information, it is difficult to ascertain the accuracy of the raw data.
2. Due to the lack of data availability and completeness, many countries or districts are not covered. This may lead to bias in our estimations.
3. Some attributes do not match exactly in time. For example, the population and pandemic statistics are measured at different times. The date of the webpage displaying country population is 2020; that is not necessarily the date of the population measurements, and the pandemic statistics are mostly as of 7/3/2020. However, since a country's population does not usually change rapidly within a few months, this should not cause severe bias in this study. In addition, HCD values lag by a few years, while the COVID-19 statistics are collected in 2020. This issue applies to some control variables as well.
4. Some HCD scores may become obsolete in the future without updates. While the data in this study are collected after 2015, Zhao et al. (2016) discover that PDI, IDV and UAI have changed over time, from 1970 to 2010 [49]. This suggests that the same results may not hold in the long term, and that the research will need to be repeated.

While Hofstede's culture dimensions theory is prevalent, HCD has drawn some criticism and may not be a perfect measure for our study. However, if the bias is within some tolerance, we can still argue that our findings are valid.
Because IDV is statistically associated with death rate and recovery rate globally, they may have a causal relationship. Changing some behaviors related to individualism may help reduce the spread of COVID-19. Public health officials may find it helpful to practice the following strategies:
1. It is not wise to challenge one's individualistic belief directly when imposing public health policy. Subtly adjusting risky individualistic behaviors contributing to a public health crisis is more feasible.
2. For many people, words or behaviors from their idols usually override opinions from the rest. Public health marketing could thus cooperate with supportive and diverse celebrities including movie, music and sports stars, political and business leaders, and other social idols, to control pandemic spread.
3. Public health researchers should further investigate the drivers of individualism and incorporate them in policy making.

Declarations
Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Availability of data and materials
The datasets generated and/or analyzed during the current study are available from the following websites: