Data
The data were taken from several sources. Data on COVID-19 cases, deaths and tests came from ourworldindata.org, in combination with GitHub, and cover the period up to 7 May 2020. For health indicators, the OECD and WHO databases were consulted. In each case, the most recent data available were used.
For the cross-section models, the countries included are those that reported a 3-day average of 3 new deaths on at least one day. This criterion removes from the sample the countries in which COVID-19 has not yet spread widely. It yielded a sample of 71 countries, whose full list is given in the additional files (see Additional file 4). A subsample of OECD countries was also built. Not all OECD members were included, either because of a lack of information or because they do not meet the above-mentioned criterion on COVID-19 deaths. For the panel data analysis, all available information was used; however, because many countries do not report daily figures, or their figures do not change over time, the sample is smaller, at 66 countries. The full list of countries used in each model is presented in the additional files (see Additional file 4).
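The inclusion criterion above can be illustrated with a minimal Python sketch. The function name, the inclusive threshold and the sample counts are assumptions made for illustration; the text does not state how the 3-day average was computed in practice.

```python
from typing import Sequence


def meets_death_criterion(
    new_deaths: Sequence[float], window: int = 3, threshold: float = 3.0
) -> bool:
    """Return True if any `window`-day average of new deaths reaches `threshold`."""
    if window <= 0:
        raise ValueError("window must be positive")
    for end in range(window, len(new_deaths) + 1):
        average = sum(new_deaths[end - window:end]) / window
        if average >= threshold:  # assumption: the threshold is inclusive
            return True
    return False


# Hypothetical daily new-death counts, for illustration only.
included = meets_death_criterion([0, 1, 2, 4, 3, 1])  # days 2, 4, 3 average 3
excluded = meets_death_criterion([0, 1, 2, 2, 1])     # no 3-day window reaches 3
```

A country whose series is shorter than the window is never included under this reading.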
Ordinal Probit model specification
An Ordinal Probit model allows an ordinal list, which can be numeric or categorical, to be used as the dependent variable. The model was estimated with Stata. The dependent variable for this model is the CFR ranking, which takes values from 1 to N, where 1 is assigned to the country with the lowest CFR.
Estimating the CFR is difficult for several reasons. The first is the universe of confirmed cases. Testing criteria differ widely, and in most countries tests are administered only to people presenting symptoms (at least fever) or requiring hospitalisation. The universe of cases is therefore considerably underestimated. There is still no agreement on the likely size of this underestimation: depending on the study, asymptomatic cases are estimated at between 5% and 80% (Heneghan, Brassey and Jefferson, 2020). For instance, Iceland is the country with the most tests applied per million inhabitants, owing to a massive testing strategy, and 50% of its positive cases were identified as asymptomatic (Heneghan, Brassey and Jefferson, 2020). On the Diamond Princess cruise ship, by contrast, the proportion of asymptomatic cases among all infected was estimated at 17.9% (Mizumoto et al., 2020). The second is differences in registration. Some countries count as COVID-19 deaths those of suspected cases, that is, people who lived with or were closely related to a deceased COVID-19 patient, whereas other countries count only confirmed cases. The third is timing. As with other viruses, once a person is infected it can take up to two weeks to develop symptoms. When symptoms do develop, they may amount to a mild flu-like illness; the first Chinese analysis estimated the proportion of such mild cases at up to 81% (Novel Coronavirus Epidemiology Response, 2020). Patients who progress to severe or critical states may be hospitalised, and several days pass before a fatality occurs. Consequently, a CFR obtained as the ratio of current deaths to current cases is misleading, since the deaths arising from current cases will only be reported later (Battegay et al., 2020).
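As a minimal sketch of how such an ordinal dependent variable could be built, the Python snippet below ranks countries from the lowest to the highest CFR. The function name, the tie-breaking rule (alphabetical order) and the sample values are assumptions for illustration, not the procedure used for the estimation in Stata.

```python
from typing import Dict


def rank_by_cfr(cfr_by_country: Dict[str, float]) -> Dict[str, int]:
    """Assign rank 1 to the lowest CFR and rank N to the highest."""
    # Assumption: ties are broken by country name to keep the ranks unique.
    ordered = sorted(cfr_by_country.items(), key=lambda item: (item[1], item[0]))
    return {country: rank for rank, (country, _) in enumerate(ordered, start=1)}


# Hypothetical CFR values for three placeholder countries.
ranks = rank_by_cfr({"A": 0.05, "B": 0.01, "C": 0.10})
```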
Following the recommendation by Battegay et al. (2020), the third problem has been addressed by estimating the CFR as follows:

This measure is larger than a contemporaneous indicator, yet it might be more accurate. Fig. 2 shows three different CFRs over time for the world. It is clear that the larger the lag in total cases, the larger the CFR becomes. It is noticeable, however, that they tend to converge.
Table 1 shows the values at the beginning and the end of the period. For the three indicators, the CFR is higher at the end of the period, and the difference among them has diminished.
Table 1 CFR for the world. Source: Own estimation with data from ourworldindata.org
Date | CFR_0 | CFR_5 | CFR_7 | CFR_10
2020-01-11 | 1.7% | 1.7% | 1.7% | 3.7%
2020-05-07 | 7.1% | 7.8% | 8.2% | 8.8%
It is also worth noting that the first reported death came on the 12th day after the first case was registered. A lagged number of cases is therefore needed for a better estimate.
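The effect of lagging the case count can be sketched in Python as follows. This assumes that a lagged CFR of the kind labelled CFR_0, CFR_5, CFR_7 and CFR_10 in Table 1 divides total deaths on day t by total cases on day t minus the lag; the function name and the sample series are hypothetical.

```python
from typing import List, Optional, Sequence


def lagged_cfr(
    total_deaths: Sequence[float], total_cases: Sequence[float], lag: int = 7
) -> List[Optional[float]]:
    """Total deaths on day t divided by total cases on day t - lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if len(total_deaths) != len(total_cases):
        raise ValueError("series must have the same length")
    result: List[Optional[float]] = []
    for t in range(len(total_deaths)):
        # Undefined until `lag` days have passed or while no cases exist.
        if t < lag or total_cases[t - lag] <= 0:
            result.append(None)
        else:
            result.append(total_deaths[t] / total_cases[t - lag])
    return result


# Hypothetical cumulative series with cases doubling every day.
cases = [1, 2, 4, 8, 16, 32]
deaths = [0, 0, 0, 1, 2, 4]

cfr_no_lag = lagged_cfr(deaths, cases, lag=0)[-1]    # 4 / 32
cfr_lag_2 = lagged_cfr(deaths, cases, lag=2)[-1]     # 4 / 8
cfr_lag_4 = lagged_cfr(deaths, cases, lag=4)[-1]     # 4 / 2
```

With fast-growing case counts, a longer lag shrinks the denominator and raises the ratio, and a long enough lag can push it above 1.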
The model used is as follows:

Where CFR_i is the Case Fatality Rate ranking for country i (for the full CFR per country, see Additional file 1), and X_i is a vector of health indicators, covering both infrastructure and population health, which could help explain the differences in CFR across countries, such as obesity, diabetes, the presence of elderly people, and others. Not all the variables are included in the models at the same time, in order to prevent biases arising especially from the correlation among health expenditure, infrastructure and population health indicators.
The number of tests per million inhabitants is also included, since it has been claimed that the only way to decrease the CFR in the long term is to scale up testing massively (OECD, 2020). Finally, since quarantine measures have been considered a determining factor for the fatality rate, the Stringency Index by Thomas et al. (2020) is also added as an explanatory variable. This index is a broad indicator of the social measures taken by governments to slow the spread, such as school closures, cancellation of public events, border closures, etc. It is available daily for several countries. It assigns a weight to each measure taken, and the highest level for any given country is 100.
Cross-Section models specification
These models are estimated by Ordinary Least Squares (OLS) in Stata. The first model uses total cases per million inhabitants as the dependent variable, and the second uses total deaths per million inhabitants. The aim of these models is to show a robust statistical correlation between cases and deaths, on the one hand, and the explanatory variables that were statistically significant in the first model, on the other. The models are specified as follows:

Panel Fixed Effects models
Finally, a group of panel data estimations was carried out to assess the robustness of the models specified above. Panel data models can potentially include a larger number of observations by combining cross-section and time-series analysis. The cross-section models, by taking one static picture of the data, made it possible to link the dependent variables, which vary daily, to annual variables. For the panel analysis, by contrast, only data that vary daily are used: cases, tests, deaths and the Stringency Index. Given the type of data, these models allow the use of dynamic variables. Thus, first differences of the dependent variables are employed. Natural logarithms are used to obtain elasticities.
The models are specified as follows:

All the models have two explanatory variables: the 7th lag of new tests per million inhabitants and the square of the Stringency Index. The seventh lag of new tests per million is used given the claims that early testing reduces the chances of greater infections (OECD, 2020). At the same time, as with the CFR, the time the virus takes to develop is considered; for instance, a person who is asymptomatic today might develop symptoms within a week. Mizumoto et al. (2020) estimated a range of 5.5 to 9.5 days for incubation, yet this is still uncertain. There are cases in which people show symptoms and die within a few days. Given the difficulty of determining the best lag, two choices are shown, the 7th and the 15th. Regarding quarantine measures, many countries converge to similar levels of the index at the end of the period, yet squaring the variable makes it possible to model the fact that the index has a maximum and that its marginal effect becomes smaller over time. In addition, countries taking early measures should be able to contain the spread to a larger extent; this is modelled through the larger initial marginal effect of a squared variable on the dependent variables.
In Eq. 5, the dependent variable is the natural logarithm of the first difference in CFR. In Eq. 6, the dependent variable is the natural logarithm of new COVID-19 cases per million (the first difference of total COVID-19 cases per million); in a similar fashion, the natural logarithm of new deaths per million (the first difference of total COVID-19 deaths per million) is also used. Expressing the variables per million inhabitants addresses the differences in population size across countries.
All the variables and their summary statistics are shown in Table 2.
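The variable construction described above can be sketched in Python for a single country's daily series. All function and key names are hypothetical, the sample data are invented, and two points are assumptions rather than statements about the original estimation: non-positive first differences are treated as missing because their logarithm is undefined, and the regressors are left in levels.

```python
import math
from typing import Dict, List, Optional, Sequence


def first_difference(series: Sequence[float]) -> List[Optional[float]]:
    """Daily change of a cumulative series; the first day has no predecessor."""
    if not series:
        return []
    return [None] + [b - a for a, b in zip(series, series[1:])]


def lagged(series: Sequence[float], k: int) -> List[Optional[float]]:
    """Value observed k days earlier, or None when it does not exist yet."""
    return [series[t - k] if t >= k else None for t in range(len(series))]


def log_or_none(x: Optional[float]) -> Optional[float]:
    """Natural log; assumption: zero or negative values are treated as missing."""
    return math.log(x) if x is not None and x > 0 else None


def build_rows(
    total_cases_pm: Sequence[float],
    new_tests_pm: Sequence[float],
    stringency: Sequence[float],
    test_lag: int = 7,
) -> List[Dict[str, Optional[float]]]:
    """One row per day: ln(new cases), lagged tests, squared stringency."""
    if not (len(total_cases_pm) == len(new_tests_pm) == len(stringency)):
        raise ValueError("series must have the same length")
    new_cases = first_difference(total_cases_pm)
    tests_lag = lagged(new_tests_pm, test_lag)
    return [
        {
            "ln_new_cases": log_or_none(new_cases[t]),
            "tests_lag": tests_lag[t],
            "stringency_sq": stringency[t] ** 2,
        }
        for t in range(len(total_cases_pm))
    ]


# Hypothetical nine-day series for one country.
rows = build_rows(
    total_cases_pm=[0, 1, 3, 3, 10, 20, 40, 80, 160],
    new_tests_pm=[10, 20, 30, 40, 50, 60, 70, 80, 90],
    stringency=[0, 0, 10, 20, 40, 60, 80, 80, 80],
)
```

Passing `test_lag=15` gives the alternative lag mentioned in the text; the same helper functions apply to deaths per million.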
Table 2 Summary statistics. Source: Own elaboration
Variable | Mean | Maximum | Minimum | Standard Deviation
Panel data
CFR | 0.0683694 | 9.5 | 0 | 0.1837786
New cases per million | 12.49621 | 4944.376 | -139.488 | 66.70643
New deaths per million | 0.5867564 | 200.04 | 0 | 3.860438
New tests per million | 325.8418 | 7285 | 0 | 566.0734
Stringency Index | 32.84637 | 100 | 0 | 37.00693
Cross-section
CFR | 0.0633442 | 0.2009389 | 0.0084971 | 0.0438073
Total tests per million | 14153.18 | 80726.73 | 0 | 16803.75
Health expenditure as GDP percentage (%) | 6.869014 | 17.1 | 2.3 | 3.380769
Stringency Index | 79.54732 | 97.14 | 0 | 20.52645
Total deaths per million | 85.62903 | 719.523 | 0.788 | 155.176
Total cases per million | 1274.181 | 9719.796 | 34.875 | 1664.223
As the table above shows, the mean CFR is similar for both datasets (0.0683694 and 0.0633442), which implies that the CFR keeps its trend over the period analysed. This is not the case for the coefficient of variation, which is larger for the panel data (268.80) than for the cross-section (69.15); this is explained by the different results across countries over the period.
It is also worth noting that the maximum CFR in the panel data can exceed 1. The reason is that, in countries with very explosive growth, the total cases confirmed in one week are fewer than the total deaths occurring the following week, by which time the confirmed cases have grown exponentially.
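The coefficients of variation quoted above can be checked from the means and standard deviations in Table 2, assuming the usual definition (standard deviation divided by the mean, expressed as a percentage). The function name below is hypothetical.

```python
def coefficient_of_variation(mean: float, std: float) -> float:
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100.0 * std / mean


# Mean and standard deviation of CFR taken from Table 2.
panel_cv = coefficient_of_variation(mean=0.0683694, std=0.1837786)
cross_section_cv = coefficient_of_variation(mean=0.0633442, std=0.0438073)
```

Under this definition the two values come out at roughly 268.8 and 69.2, in line with the reported figures up to rounding.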