COVID19 pandemic: how effective are interventive control measures and is a complete lockdown justified? A comparison of countries and states

Background To fight the COVID-19 pandemic, countries used control measures of different severity, from ‘relaxed’ to lockdown. Drastic lockdown measures are considered more effective, but also have a negative impact on the economy. When comparing the financial value of lost lives to the losses of an economic disaster, the better option seems to be lockdown measures. Methods We developed a new parameter, the effectiveness of control measures, calculated from the 2nd time derivative of daily case data, and normalised to the average of the daily case data during the effective phase; and also calculated from the derivative of the logarithm of the reproductive number. We calculated this parameter and two associated parameters, i.e. effectiveness, duration of effective phase, and ratio of the 2 former parameters, for 92 countries, states and provinces, whose effective phase ended on 15 May 2020 at the latest. We compared these effectiveness parameters, and also the mortality during and after the effective phase, between countries with and without lockdown. Results We did not find any statistically significant difference in the three effectiveness parameters between countries with and without lockdown (p > 0.76; very small effect size). There was also no significant difference in mortality during the effective phase (p > 0.1; very small to small effect sizes), but there was a significant difference after the effective phase, with higher mortality for lockdown countries. The effectiveness parameter derived from the daily case data correlated well with the parameter derived from the reproductive number (R² = 0.9480). The average duration of the effective phase was 17.3 ± 10.5 days. Conclusions The results indicate that lockdown measures are not necessarily superior to relaxed measures, which in turn are not necessarily a recipe for failure. Relaxed measures are, however, more economy-friendly.
The higher mortality of lockdown countries is explained by the fact that, in our database, more lockdown countries than no-lockdown countries already had a higher mortality during the effective phase, a difference that became significant only after the effective phase.


1) Rationale of the method
If the number of infected people grows naturally, exponentially or sub-exponentially [20], then the slope of the daily case numbers becomes steeper with time. If this development is interrupted by effective control measures, then the slope flattens. The more effective the measures, the quicker the slope flattens and eventually becomes negative, resulting in a deceleration and decline of the daily case numbers. The higher the daily case numbers, the steeper are their slopes. It is therefore evident that any effectiveness index should be independent of 'numbers' (daily case numbers in this context). Our proposed method, and the derivation of the effectiveness, hinges on normalising the time derivatives of daily case numbers such that the effectiveness is independent of scaling factors. If the daily case curves of two countries have the same geometrical shape and differ only by their scaling factors, then the effectiveness of their control measures is identical. The 'force' required for interrupting the natural growth and for bending the slope is supposed to be generated by control measures, at least in the early stages of an epidemic. As this 'force' is applied daily over a certain period, it is more appropriate to refer to the 'force rate' (force per unit time). It will be shown subsequently that the outcome of this force rate is directly related to the effectiveness of control measures.
The term 'effectiveness' used in this study stems from the two different types of intervention studies, where "efficacy can be defined as the performance of an intervention under ideal and controlled circumstances, whereas effectiveness refers to its performance under 'real-world' conditions" [21,22].

2) Effectiveness parameters, mathematical derivation and terminology
Most commonly, confirmed cases are reported and visualised as cumulative cases, CC, which approximately follow an S-shaped curve between 2 constant values, 0 and the maximum number of cases Cmax.
The speed of the increase in cases, velocity v, corresponds to the daily case count (unit: cases per day, c/d). It has to be noted that the (numerical) integral of v does not result in CC, but rather in CI, as CC is a summation. Thus, $v = dC_I/dt$.
The acceleration, a, of the spreading disease is the (numerical) time derivative of v, namely $a = dv/dt$ (unit: c/d²). In mechanical terms, the 'force' mentioned above equals the acceleration of an object times its mass, whereas the 'force rate' equals the jerk times the mass of the object.
The jerk (or jolt), j, of the spreading disease is the (numerical) time derivative of a, namely $j = da/dt$ (unit: c/d³). The jerk j is positive or negative if the acceleration increases or decreases, respectively. The major decrease of the acceleration (i.e. the major transition from acceleration to deceleration) is denoted the effective phase or period, TE (measured in days). During TE, j is negative on average.
As will be shown later, the higher Cmax, the larger is the absolute jerk, |j|. This relationship prevents the direct comparison of the j-data of different countries. Therefore, for comparative reasons, j has to be normalised to v. This normalisation process has 3 advantages:

The dynamics and principles of the effectiveness are explained with a Gaussian function (bell curve) of the daily cases,

$$v = b\,e^{-(t-m)^2/s^2} \qquad (1)$$

where b is a multiplier (proportional to Cmax), t is the time (in days), m is the day where v reaches its maximum and s represents the width of the bell curve. The Gaussian function is symmetrical about m.
Note that the standard structure of a Gaussian involves a multiplier of 2 in the denominator of the exponent, which is omitted here for simplification purposes.
Also, note that the Gaussian function is only one out of other suitable models and will by no means be used as a fit function applied to actual daily cases data. It simply serves to understand the dynamics and principles of effectiveness.
When simplifying Eqn (1) and setting m to 0, t = 0 occurs at the velocity peak, and the time has a negative sign before the v-peak:

$$v = b\,e^{-t^2/s^2} \qquad (2)$$

- ratio of the average effectiveness $\bar{E}$ to the duration of the effective phase TE, i.e. $\bar{E}/T_E$. This ratio combines the opposite trends of $\bar{E}$ and TE in a single parameter.
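For orientation, the Gaussian model of Eqn (2) can be differentiated explicitly. The block below is a sketch derived only from Eqn (2) and from the definitions used in this section (a = dv/dt, j = da/dt, effective phase between the acceleration peaks, effectiveness normalised to the average v over TE). The original equation numbers are not reproduced, and the constant 1.418 is an evaluation made under these assumptions rather than a value quoted from the text.

```latex
\begin{align*}
v &= b\,e^{-t^2/s^2} \\
a &= \frac{dv}{dt} = -\frac{2bt}{s^2}\,e^{-t^2/s^2} \\
j &= \frac{da}{dt} = \frac{2b}{s^2}\left(\frac{2t^2}{s^2}-1\right)e^{-t^2/s^2} \\
j = 0 \;&\Rightarrow\; t = \pm\frac{s}{\sqrt{2}}, \qquad T_E = \sqrt{2}\,s \\
-\frac{j}{v} &= \frac{2}{s^2}\left(1-\frac{2t^2}{s^2}\right) \\
\bar{E} &= \frac{a_{max}-a_{min}}{T_E\,\bar{v}_E}
        = \frac{2\,e^{-1/2}}{\sqrt{\pi/2}\;\operatorname{erf}\!\left(1/\sqrt{2}\right)}\,s^{-2}
        \approx 1.418\,s^{-2}
\end{align*}
```

The multiplier b cancels in both -j/v and in the average effectiveness, which is the scale independence required in the rationale of the method.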
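Taking the logarithm of Eqn (2) gives

$$\log v = \log b - \frac{t^2}{s^2}$$

where log denotes the natural logarithm. The time derivative of this function is the logarithmic growth rate

$$\frac{d \log v}{dt} = -\frac{2t}{s^2}$$

Note that the multiplier b drops out, which makes the gradient independent of the actual number of daily cases, as already seen in Eqns (12)-(14). This fact shows that R can be calculated from underestimated data, as long as the underestimation acts as a constant multiplier, which stands in contrast to the criticism by Leung. The effective reproductive number is

$$R_{eff} = \exp\left(SI\,\frac{d \log v}{dt}\right) = \exp\left(-\frac{2\,SI\,t}{s^2}\right) \qquad (18)$$

where SI is the serial interval.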
As 0 ≤ Reff < ∞, and as the transition from epidemic to endemic occurs at Reff = 1, taking the logarithm of Eqn (18) puts this transition at log(Reff) = 0:

$$\log R_{eff} = -\frac{2\,SI\,t}{s^2} \qquad (19)$$

The time derivative of this function is

$$\frac{d \log R_{eff}}{dt} = -2\,SI\,s^{-2} \qquad (20)$$

This principle establishes the relationship between Reff, or more precisely the derivative of log(Reff), and the effectiveness E of preventive control measures, which is also a function of s⁻², according to Eqn (14). Thus, the steeper the gradient of log(Reff), i.e. -2 SI s⁻², the more effective are the control measures.
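Normalising Eqn (20) to SI, if the average or median SI is a COVID-19 associated constant, delivers the effectiveness calculated from R, denoted ER:

$$E_R = -\frac{1}{SI}\,\frac{d \log R_{eff}}{dt} = 2\,s^{-2} \qquad (21)$$

From Eqns (14) and (21),

$$\frac{E_R}{\bar{E}} \approx 1.41 \qquad (22)$$

This constant applies to Gaussian models only.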
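The constant of about 1.41 for Gaussian models can be checked numerically. The following is a minimal sketch, assuming that the average effectiveness is the difference between the acceleration peaks divided by TE and by the mean of v over TE, and that log(Reff) = -2 SI t / s². The function names and the parameter values (b = 1000, s = 10 days, SI = 4 days) are hypothetical and chosen for illustration only.

```python
import math


def gaussian_v(t, b, s):
    """Daily cases of the simplified Gaussian model, v = b * exp(-t^2 / s^2)."""
    return b * math.exp(-(t / s) ** 2)


def gaussian_a(t, b, s):
    """Analytical acceleration a = dv/dt of the Gaussian model."""
    return -2.0 * b * t / s ** 2 * math.exp(-(t / s) ** 2)


def average_effectiveness(b, s, n=20000):
    """Assumed definition: (a_max - a_min) / (T_E * mean of v over T_E)."""
    t1 = -s / math.sqrt(2.0)  # acceleration maximum (j = 0)
    t2 = s / math.sqrt(2.0)   # acceleration minimum (j = 0)
    t_e = t2 - t1
    delta_a = gaussian_a(t1, b, s) - gaussian_a(t2, b, s)
    h = t_e / n
    # Midpoint rule for the mean of v over the effective phase.
    mean_v = sum(gaussian_v(t1 + (i + 0.5) * h, b, s) for i in range(n)) / n
    return delta_a / (t_e * mean_v)


def effectiveness_from_r(s, serial_interval):
    """E_R = -(1/SI) * d(log Reff)/dt, with d(log Reff)/dt = -2 * SI / s^2."""
    gradient = -2.0 * serial_interval / s ** 2
    return -gradient / serial_interval


# Hypothetical parameter values, for illustration only.
ratio = effectiveness_from_r(s=10.0, serial_interval=4.0) / average_effectiveness(b=1000.0, s=10.0)
print(round(ratio, 2))
```

Both the multiplier b and the serial interval cancel in this ratio, so the result depends on the Gaussian shape alone.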

Data processing of real-world data
The processing procedure started with daily cumulative case data, CC, commonly reported on websites as further specified below. The daily case data, v, were determined from CC. Due to the noisy nature of the original v-data, they were pre-filtered with a double running average filter (1st-order Savitzky-Golay filter) with a window width of 3 data. The major data fit for identifying the trend was performed with a running quadratic filter (2nd-order Savitzky-Golay filter) over a window of 13 data. The filter method, specifically the window width of 13 data, was obtained from a convergence test. In principle, the absolute peak data (min and max) of a and j become smaller, and may finally asymptote, as the window width increases (e.g. from 5 to 23 data). At smaller windows, the magnitude of the peak data is greater for two reasons: the slope of the filter data is steeper, and the local noise (data fluctuations) is more pronounced. Consequently, the data fluctuations were assessed by means of a randomness index RI (RI-p-ap method [23]; 0 = perfectly correlated, 0.5 = perfectly random; 1 = perfectly anticorrelated). The smaller RI, the less the data fluctuate. The RI-data of a and j asymptoted at an average window width of 13 (range: 11 to 15). Using the quadratic filter without the preceding double average filters would require a wider window than 13 data to achieve the same RI effect, but would result in smaller peak data.
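The filter chain described above can be sketched as follows. This is a minimal pure-Python illustration, not the original implementation: the function names are hypothetical, the handling of the edges (shrinking window for the running average, no output for the first and last 6 points of the quadratic filter) is an assumption, and the quadratic filter uses the standard closed-form Savitzky-Golay smoothing weights for a symmetric window.

```python
def running_average(data, width=3):
    """Centred running average; the window shrinks at the edges (assumption)."""
    half = width // 2
    out = []
    for i in range(len(data)):
        lo = max(0, i - half)
        hi = min(len(data), i + half + 1)
        out.append(sum(data[lo:hi]) / (hi - lo))
    return out


def quadratic_filter(data, width=13):
    """Running quadratic fit (2nd-order Savitzky-Golay smoothing).

    Returns the midpoint of the local quadratic fit for interior points and
    None where the window does not fit (assumption about edge handling).
    """
    m = width // 2
    norm = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    offsets = range(-m, m + 1)
    weights = [3 * (3 * m * m + 3 * m - 1 - 5 * i * i) / norm for i in offsets]
    out = [None] * len(data)
    for k in range(m, len(data) - m):
        out[k] = sum(w * data[k + i] for w, i in zip(weights, offsets))
    return out


def smooth_daily_cases(cumulative):
    """Cumulative cases -> daily cases -> double running average -> quadratic filter."""
    daily = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
    pre_filtered = running_average(running_average(daily, 3), 3)
    return quadratic_filter(pre_filtered, 13)
```

For a window of 13 data the weights reduce to (25 - i²)/143 for offsets i = -6 to 6, so an exactly quadratic trend passes through the filter unchanged.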
The resulting dataset of filtered v-data served two purposes:
- Each filtered v-datum corresponds to the midpoint of a quadratic fit curve over 13 pre-filtered v-data. The residuals between filtered v-data and original v-data were used to calculate the confidence interval of each filtered v-datum. The residual standard deviation of each filtered v-datum was divided by √13 to obtain the standard error, which was multiplied by the critical value of the t-distribution for the corresponding degrees of freedom.
- The filtered v-data, including their 95% confidence interval data, were numerically differentiated twice by calculating the slope over 3 data points to obtain a and j.
Finally, E was computed from -j/v. The effective phase TE was defined as the time between an amax and an amin, where amax was positive and amin was negative, and

$$\Delta a = a_{max} - a_{min} \qquad (23)$$

where Δa was the greatest of the entire dataset. amax and amin were determined visually, according to the aforementioned guidelines.
The impulse of the jerk over the effective phase was

$$\int_{t_{E1}}^{t_{E2}} j\,dt = a_{min} - a_{max} = -\Delta a$$

The duration of the effective phase, TE, was determined from its boundaries tE1 and tE2, where tE corresponded to the intersections of the j-data and the zero line (intersection of a straight line between 2 consecutive data points, one positive and one negative), one intersection at amax and one at amin. Note that TE is usually non-integer.
The average velocity $\bar{v}_E$ was calculated by averaging the v-data over TE.
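The steps above can be sketched in code. This is an illustration under stated assumptions: the acceleration peaks are picked automatically (global maximum of a and the subsequent minimum) instead of visually, the one-sided differences at the ends of the series are a convenience, and the average effectiveness is taken as Δa divided by TE and by the mean of v over TE. All function names are hypothetical.

```python
import math


def slope3(x):
    """Slope over 3 consecutive data points; one-sided at the ends (assumption)."""
    n = len(x)
    out = [0.0] * n
    for k in range(1, n - 1):
        out[k] = (x[k + 1] - x[k - 1]) / 2.0
    out[0] = x[1] - x[0]
    out[-1] = x[-1] - x[-2]
    return out


def zero_crossings(j):
    """Times at which the straight line between consecutive j-data crosses zero."""
    times = []
    for k in range(len(j) - 1):
        if j[k] == 0.0:
            times.append(float(k))
        elif j[k] * j[k + 1] < 0.0:
            times.append(k + j[k] / (j[k] - j[k + 1]))
    return times


def effective_phase(v):
    """Return (T_E, average effectiveness) from filtered daily case data v."""
    a = slope3(v)
    j = slope3(a)
    k_max = max(range(len(a)), key=lambda k: a[k])          # positive peak of a
    k_min = min(range(k_max, len(a)), key=lambda k: a[k])   # subsequent negative peak
    crossings = zero_crossings(j)
    t_e1 = min(crossings, key=lambda t: abs(t - k_max))     # j = 0 at a_max
    t_e2 = min(crossings, key=lambda t: abs(t - k_min))     # j = 0 at a_min
    t_e = t_e2 - t_e1                                       # usually non-integer
    delta_a = a[k_max] - a[k_min]
    i1, i2 = math.ceil(t_e1), math.floor(t_e2)
    mean_v = sum(v[i1:i2 + 1]) / (i2 - i1 + 1)
    return t_e, delta_a / (t_e * mean_v)


# Usage with simulated Gaussian daily cases (hypothetical b = 1000, m = 40, s = 10).
v_sim = [1000.0 * math.exp(-((t - 40) / 10.0) ** 2) for t in range(81)]
t_e_sim, e_bar_sim = effective_phase(v_sim)
```

For the simulated Gaussian, TE comes out close to √2·s and the average effectiveness close to 1.418/s², with small deviations caused by the finite differences over daily data.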
The parameters obtained from Eqns (23) - (27) were determined from the filtered v-data and their confidence bounds (lower and upper). Note that after differentiating v with time, once and twice, the resulting confidence bounds for the parameters obtained from Eqns (23) - (27) are not necessarily lower and upper anymore, as their value depends on the instantaneous slope of the v-curve, and of the a-curve.
The profile or shape of the velocity curve was assessed theoretically and practically. It is evident that real-life velocity data (daily cases data) do not necessarily follow a Gaussian function. Practically, the different shapes were determined from the deviation from the initial theoretical Eqn (2), for which all TE and $\bar{E}$ data follow a Gaussian function. As such, the s-parameter was determined from TE and from $\bar{E}$ using Eqns (5) and (14), respectively, yielding sT and sE. If the TE and $\bar{E}$ data follow a Gaussian function, then sE = sT, and the ratio sE/sT must be unity.
The ratio, when expressed as log(sE/sT), determines the shape of the velocity profile, where log(sE/sT) > 0 is more triangular, log(sE/sT) < 0 is more trapezoidal, and log(sE/sT) = 0 is bell-shaped (Gaussian function).
For constructing an isoline with a fixed value of the ratio sE/sT, $\bar{E}$ was calculated as a function of TE. The principle of the average effectiveness $\bar{E}$ and the effective phase TE is shown in Figure 1, calculated from simulated daily case numbers (velocity v), and their consecutive time derivatives, acceleration a and jerk j.
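The shape factor and its isolines can be illustrated with a short sketch. It assumes the Gaussian relations TE = √2·s and average effectiveness ≈ 1.418/s², which follow from the Gaussian model under the definitions used here; whether these coincide exactly with Eqns (5) and (14) is an assumption, and the function names are hypothetical.

```python
import math

# Gaussian constant of the average effectiveness (assumed form of Eqn (14)), about 1.418.
GAUSS_E_CONST = 2.0 * math.exp(-0.5) / (math.sqrt(math.pi / 2.0) * math.erf(1.0 / math.sqrt(2.0)))


def shape_factor(t_e, e_bar):
    """log(sE / sT): > 0 more triangular, < 0 more trapezoidal, 0 Gaussian."""
    s_t = t_e / math.sqrt(2.0)               # from T_E = sqrt(2) * s
    s_e = math.sqrt(GAUSS_E_CONST / e_bar)   # from E-bar = const / s^2
    return math.log(s_e / s_t)


def isoline_effectiveness(t_e, ratio):
    """Average effectiveness along an isoline with fixed ratio sE / sT."""
    return 2.0 * GAUSS_E_CONST / (ratio ** 2 * t_e ** 2)
```

With a ratio of 1, `isoline_effectiveness` traces the hypothetical Gaussian curve that separates triangular from trapezoidal profiles in a plot of average effectiveness against TE.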
The average effectiveness derived from Reff, i.e. $\bar{E}_R$, was calculated from

$$\bar{E}_R = \frac{\Delta \log R_{eff}}{SI \cdot T_E}$$

where Δlog Reff denotes the decrease of log Reff during the effective phase. $\bar{E}_R$ was correlated to $\bar{E}$ and to the shape factor log(sE/sT). From the three values of the coefficient of determination (R²) of multiple and single regressions, the combined influence was calculated from the sum of the R² of the single regressions minus the R² of the multiple regression. The individual influences (semi-partial correlations) of $\bar{E}$ and log(sE/sT) were calculated from the single regression R² minus the combined influence. The influences were expressed as a percentage, resulting from 100·R².
This correlation exercise served to prove practically that $\bar{E}_R$ is directly related to $\bar{E}$, a proof that was already established theoretically in Eqn (22) for a Gaussian v-profile, and to cross-validate the two different methods. The drawback of the 2nd method, however, namely calculating the effectiveness ($\bar{E}_R$) from Reff, is that the start and end of the effective phase have to be predetermined from the first method, i.e. from the positive and negative peak data of the acceleration a (Figure 1). Taking the steepest gradient of log(Reff), or the gradient at Reff = 1, provides only a local maximum or value, respectively, of ER, instead of an average $\bar{E}_R$ across the effective phase.
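The decomposition into combined and individual influences is simple arithmetic on the three R² values. The sketch below follows the rule stated above; the function name and the dictionary keys are hypothetical, and the input values are the three R² values reported in the results (94.80%, 22.61% and 96.19%).

```python
def influence_breakdown(r2_effectiveness, r2_shape, r2_multiple):
    """Split the explained variance into individual and combined influences.

    combined   = sum of the single-regression R^2 minus the multiple-regression R^2
    individual = single-regression R^2 minus the combined influence
    """
    combined = r2_effectiveness + r2_shape - r2_multiple
    return {
        "individual_effectiveness": r2_effectiveness - combined,
        "individual_shape": r2_shape - combined,
        "combined": combined,
        "unexplained": 1.0 - r2_multiple,
    }


# R^2 values reported in the results, as fractions.
breakdown = influence_breakdown(0.9480, 0.2261, 0.9619)
percentages = {name: round(100.0 * value, 2) for name, value in breakdown.items()}
```

By construction, the two individual influences and the combined influence add up to the R² of the multiple regression.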

3) Data sets of real-world data
We collected publicly available data of cumulative and daily cases [1] reported for the countries, states and provinces listed in Table 1. We analysed the daily case data, calculated from the cumulative data, of 92 countries, states and provinces using the method described above. The countries, states and provinces were selected based on the following inclusion criteria: TE ending on 15 May 2020 at the latest, and CC ≥ 250 at this date. We took the cumulative case data and cumulative death data from several websites that provide databases for different countries, states and provinces [1].
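The two inclusion criteria can be expressed as a small filter. This is only a sketch; the function name, the argument names and the example values are hypothetical.

```python
from datetime import date

CUTOFF_DATE = date(2020, 5, 15)   # latest permitted end of the effective phase
MIN_CUMULATIVE_CASES = 250        # minimum CC at the cut-off date


def meets_inclusion_criteria(effective_phase_end, cumulative_cases_at_cutoff):
    """True if TE ended on or before 15 May 2020 and CC >= 250 at that date."""
    return (effective_phase_end <= CUTOFF_DATE
            and cumulative_cases_at_cutoff >= MIN_CUMULATIVE_CASES)


# Hypothetical example regions.
included = meets_inclusion_criteria(date(2020, 4, 20), 1200)
excluded_late = meets_inclusion_criteria(date(2020, 5, 28), 5000)
excluded_small = meets_inclusion_criteria(date(2020, 4, 20), 120)
```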

4) Classification of intervention measures
There are already some webpages [24] available that provide (at least partially) information on lockdowns and restrictions related to the COVID-19 pandemic. We used this information and associated references found on these webpages for assigning countries, states and provinces to 2 groups (lockdown and no lockdown) according to the definition below, and to the best of our knowledge, belief and understanding, when compiling all the information found on the internet. We excluded the following countries from this classification: Russia, as there was not a nationwide lockdown but only lockdowns in some cities and regions; USA, which was treated by states; and China, which was treated by provinces.
Defining a 'lockdown' for decision making is a subjective process, more exclusive rather than inclusive, mostly by judging what it is not.
For this study, we define a lockdown as a) a nationwide (state-wide / territory-wide) compulsory stay-home order for 24 hours per day and at least for 14 days, b) enforced by law, police and by penalties in case of infringement, c) with very few exceptions that allow people to leave their home (e.g. essential work and study, shopping for essential goods, medical care, exercise, etc.).
It is evident that this compulsory stay-home order further, but not necessarily, implies: the closure of schools and universities, non-essential businesses, and places for public gathering such as restaurants and entertainment facilities; prohibition of visiting friends and relatives (indoor gatherings) and of outdoor gatherings; abiding by social distancing rules and mask orders; etc.
What a 'lockdown', by our definition, is not includes: any of the implications, single or combined, that arise from a compulsory stay-home order, in the absence of the cardinal compulsory stay-home order; voluntary stay-home orders, where people are advised or directed to stay home (e.g. Florida; people 'shall stay home' or 'shall limit their movements' [25] instead of must); age-dependent compulsory stay-home orders (e.g. elderly citizens); curfews for less than 24 hours such as during night time (e.g. Serbia); compulsory stay-home orders of less than 14 days (e.g. Israel); movement control orders (e.g. Malaysia); fines for breaching the physical distancing rule, in the absence of a compulsory lockdown (e.g. Netherlands).
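The definition and its exclusions can be read as a conjunction of conditions. The sketch below encodes them as a predicate; the class, its field names and the example orders are hypothetical and only illustrate how the criteria combine, not how individual countries were actually classified.

```python
from dataclasses import dataclass


@dataclass
class StayHomeOrder:
    region_wide: bool         # nationwide / state-wide / territory-wide
    compulsory: bool          # 'must' stay home, not advised or directed
    hours_per_day: float      # 24 for a full-day order
    duration_days: int        # length of the order
    legally_enforced: bool    # law, police and penalties
    whole_population: bool    # not restricted to an age group


def is_lockdown(order):
    """Apply criteria a) and b) of the lockdown definition and its exclusions."""
    return (order.region_wide
            and order.compulsory
            and order.hours_per_day >= 24
            and order.duration_days >= 14
            and order.legally_enforced
            and order.whole_population)


# Hypothetical examples.
full_order = StayHomeOrder(True, True, 24, 21, True, True)
night_curfew = StayHomeOrder(True, True, 10, 30, True, True)
short_order = StayHomeOrder(True, True, 24, 10, True, True)
advisory = StayHomeOrder(True, False, 24, 30, False, True)
```

Criterion c), the short list of permitted exceptions, is a matter of judgement and is not modelled here.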

5) Statistics
From the daily case data, the following parameters were calculated by using the method described above.
The mortality was determined from the number of deaths in a population normalised to the size of the population of the countries/states/provinces listed in Table 1 at a specific point in time during the COVID-19 pandemic (data at the beginning, middle and end of the effective phase; and as of 26/06/2020). We compared the mortality of countries with and without lockdown with the Mann-Whitney test.
The effectiveness was visualised in Matlab (Release 2018b, The MathWorks, Inc., Natick, Massachusetts, United States) for European countries by colour-coding the parameters TE, $\bar{E}$, the ratio $\bar{E}/T_E$, and the shape parameter log(sE/sT).
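The mortality normalisation and the group comparison can be sketched as follows. This is a pure-Python illustration with hypothetical function names; the p-value uses the normal approximation without tie or continuity correction, which is an assumption, as the exact variant of the Mann-Whitney test is not specified here.

```python
import math


def deaths_per_million(deaths, population):
    """Mortality as deaths per one million population."""
    return deaths * 1e6 / population


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test; returns (U, z, p) via normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    z = (u - n1 * n2 / 2.0) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, z, p
```

For small groups, exact tables or a statistics package would be preferable to the normal approximation used in this sketch.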

1) Practical explanation of effectiveness parameters
Very efficient preventive measures are associated with a short TE and a great $\bar{E}$ and $\bar{E}/T_E$. Table 1 shows the data of countries and states whose effective phases ended before 16 May 2020. In general, long trapezoidal plateaus are subject to fluctuations, which render the plateau alternately effective and ineffective. In the short trapezoidal plateau of New Zealand (Figure 2c), the plateau is almost flat, which makes the effectiveness profile double-humped, with zero effectiveness between the two humps.

2) Interrelationship of effectiveness parameters
Figure 3 shows the relationship between $\bar{E}$ and TE, with respect to hypothetical data of a Gaussian function (separating triangular and trapezoidal v-profiles). The point map and its power-law fit function (R² = 0.8337) deviate from, and cross over, the hypothetical Gaussian function data. The data can be divided into three areas: velocity data profiles ranging between Gaussian and triangular with high effectiveness (green area); profiles ranging from triangular over Gaussian to trapezoidal with medium effectiveness (yellow area); and profiles ranging from Gaussian to trapezoidal with low effectiveness (pink area). Figure 3 shows that there are no ineffective triangular velocity profiles and no highly effective trapezoidal profiles.
Figure 4 shows the effectiveness parameters: the ratio of $\bar{E}$ to TE (the greater, the more effective), and the shape parameter, which is associated with the shape of the velocity profile and indicates the transition from a triangular velocity profile over Gaussian to a trapezoidal profile.
$\bar{E}$ is directly related to $\bar{E}_R$, the effectiveness calculated from Reff, as shown in Figure 5a. The slope of the regression line is 1.2656 rather than the 1.41 predicted by Eqn (22), which is applicable to Gaussian functions only. The intercept of the regression function is very close to 0.
In Figure 5b, the regression slope of $\bar{E}_R$ vs $\bar{E}$ is plotted against the averages of log(sE/sT), showing that the slope decreases as log(sE/sT) increases. The intercept of the regression function is 1.4003, which corresponds to the slope predicted at log(sE/sT) = 0 and is close to the predicted multiplier of 1.41 according to Eqn (22). Figure 5 shows that $\bar{E}$ and $\bar{E}_R$ are comparable and complementary measures.
The magnitude of $\bar{E}_R$ is 94.80% explained by $\bar{E}$ (100·R²; Figure 5a), and 22.61% by log(sE/sT). The multiple regression dependency of $\bar{E}_R$ on $\bar{E}$ and log(sE/sT) is 96.19%; only 3.81% of the dependency of $\bar{E}_R$ remains unexplained. The individual influences (semi-partial correlations) of $\bar{E}$ and log(sE/sT) on $\bar{E}_R$ were 73.58% and 1.39%, respectively, and the combined influence of $\bar{E}$ and log(sE/sT) on $\bar{E}_R$ was 21.22%. The semi-partial correlations revealed that almost any influence of log(sE/sT) on $\bar{E}_R$ happened only in combination with $\bar{E}$. The reason for this could be that log(sE/sT) is 43.41% influenced by $\bar{E}$, and even 44.66% when $\bar{E}$ < 0.03. More efficient countries tend to have a triangular v-profile, whereas less efficient countries are characterised by a more trapezoidal v-profile.

3) Timeline graphs of the effectiveness
Figure 6 shows the timeline of the effectiveness. The first cluster was China and its provinces, followed by South Korea about a month later. Eight days after Korea left the effective phase, Malaysia started its effective phase, followed by Uruguay, Iceland, Italy, Thailand and Switzerland. The first countries that left the effective phase were: Lebanon; followed by Jordan, Massachusetts and Taiwan on the same day; followed by Andorra. The 5 most effective countries and states with the greatest effectiveness ($\bar{E}$) were Montenegro, Idaho, South Korea, Malta and Mauritius.

Table 2

5) Mortality rate
Comparing the mortality of lockdown and no-lockdown countries at the beginning, middle and end of the effective phase did not reveal any significant difference (Table 3). The only significant difference was found when comparing the mortality data of 26/06/2020, with a medium effect size and a higher mortality for countries with lockdown (Table 3). To investigate this result further, we divided the mortality data in the middle of the effective phase into two groups (greater or smaller than 50 deaths per one million population). The associated mortality data of 26/06/2020 are significantly different between, and directly correlated with, these two groups (Table 3). This result indicates that the lockdown countries in the higher mortality group (as of the middle of the effective phase) were not able to flatten the mortality curve better than the non-lockdown countries despite lockdown measures. The significantly higher mortality rate of lockdown countries as of 26/06/2020 is explained simply by the fact that more countries with lockdown than without lockdown already had a higher mortality during the effective phase.

DISCUSSION:
The objective of our study was to develop a method for measuring the effectiveness of control measures in decreasing the transmission of SARS-CoV-2.
The most striking result of our study was that no significant difference in terms of the effectiveness of control measures could be found between countries (and states / provinces) with and without lockdown measures. This result has serious implications for the management of control measures.
First, our study provides the necessary evidence for Anders Tegnell's comment that 'lockdown' has no 'historical scientific basis' (for being efficient).
Second, the 'fine balance' [2] (between saving lives and saving the economy) is not immediately valid anymore, as the allegedly efficient lockdown is supposed to save lives, but also, as a side effect, brings the economy down. Countries with lockdown are as efficient as countries without, on average (based on the comparisons of medians). There was also no evidence that lockdown measures manage the mortality (deaths per population) better than measures without lockdown.
The statements 'the case for shutdown clear' and 'the shutdown wins' [4] are therefore no longer valid either, as the losses from COVID-19 casualties per population depend on the mortality during the effective phase rather than on lockdown measures themselves, and the burden of the economic downturn after lockdown still affects these very countries. It must be emphasised, though, that these conclusions are valid across a range of countries in terms of average or median data, whereas individual countries will respond differently to lockdown or no lockdown, in terms of effectiveness.
Other noteworthy results are that the time between the first day of drastic lockdown measures and the first day of the effective phase was 5.6 d on average, which is slightly longer than the serial interval of [27]. Typical lockdown measures are part of only the 3rd strategy, as one extreme of a wide range of measures. The minimum requirement of strategy 3 was addressed by Chu et al. [27], who found in a meta-analysis review of physical distancing, face masks and eye protection that 'no intervention, even when properly used, was associated with complete protection from infection.' These findings [27] seem to support the decision of many countries, such as Canada, South Korea, the Czech Republic and Austria, to advise or require the wearing of face masks in public.
Why do countries respond differently to strategies and measures for controlling COVID-19? Why do some countries suffer from long plateaus of daily case data despite control measures and even lockdown measures?
The effectiveness of physical distancing and wearing personal protective equipment depends on the compliance of the citizens. Being compliant with control measures, however, is not just an expression of personal protection, but even more so 'shifts the focus … to altruism, actively involves every citizen, and is a symbol of social solidarity in the global response to the pandemic' [29]. Compliance is mainly driven by the 'duty to obey authorities and personal morality' rather than by 'perceived risk of legal sanctions and perceived risk of the virus' [30]. A lockdown is nothing but enforcing compliance, specifically by rules, law, police, and fines, which in turn risks 'increasing non-compliance (or so-called lockdown fatigue [31])'. A compliance study [30] that investigated the compliance of Australian citizens during the (first) lockdown revealed that the participants' non-compliance was approximately 50% for each of the following reasons: "socialised in person with friends/relatives" they did not live with and "left the house without a really good reason". This striking behaviour of non-compliance with strict lockdown rules and measures resulted in a medium effectiveness of control measures, close to the averages of the data shown in Table 1.
Another important driver of compliance, often overlooked and underestimated, is 'that persuasion and education encourage normative compliance with rules and laws, because they promote a sense that people should comply with laws because it is the right thing to do' [32]. Proper education is facilitated by leadership by example, exhibited by leading politicians and scientists on the mass media and in public [33], e.g. by wearing masks, demonstrating good physical distancing practice, information sessions led by epidemiologists, etc. Leaders should refrain from conveying anecdotal evidence, such as which medication or personal measures could protect from contracting COVID-19.
A favourable approach for both controlling a virus outbreak and saving the economy of a country would involve 1) an early start with control measures, even before case zero; 2) improvement of compliance by thorough education, information and explanation of the restrictions and appealing to the solidarity and morality of the people; and 3) adopting economy-friendly but outbreak-preventive control measures in the absence of lockdown rules.
It will take time and further research to align the hitherto contrary priorities (saving lives and preserving the economy) and to provide a comprehensive and holistic strategy based on lessons learnt from the COVID-19 crisis. Considering that we have to deal with such a pandemic for the first time in 100 years, the inexperience in fighting this battle contributes decisively to the varying effectiveness of control measures.

CONCLUSIONS
In this study, we provided three new epidemiological parameters, related to the effectiveness of controlling a highly contagious disease, as well as a method for calculating these parameters. These data were determined for 92 different countries, states and provinces. Comparing effectiveness data of countries with and without lockdown revealed that there was no statistically significant difference in the effectiveness between lockdown and 'relaxed' measures. Furthermore, there was also no statistically significant difference in the mortality during the effective phase between lockdown and 'relaxed' measures. These results do not provide any evidence that, on average, lockdown measures are more efficient and that the number of casualties per population is less.
The implications of this study are that there is neither a guarantee that lockdown measures are successful, nor certainty that relaxed measures lead to failure. The advantage of relaxed control measures is that they are economy-friendly and prevent associated effects on mental health such as 'lockdown fatigue'. Instead of (late) lockdown measures, any control measures should start very early, accompanied by improving the compliance of citizens through thorough education in epidemiologically necessary changes (i.e. behavioural and others), and appealing to the solidarity and morality of the people.

Consent for publication
Not applicable

Availability of data and materials
All data analysed during this study are available from various web sources (see Reference [1] for web links) whose data are completely open. All data generated during this study are included in this published article.