Assessing the efficiency of COVID-19 NPIs in France: a retrospective study using a novel methodology

After an initial phase of low reactivity from the French public health authorities in the face of the emergence of SARS-CoV-2 in February 2020, various non-pharmaceutical interventions (NPIs) were put in place (strict stay-at-home orders, followed by mandatory mask wearing in public places, curfews, partial lockdowns, etc.). To our knowledge, no study has assessed their respective effectiveness, either independently or synergistically. Using data from metropolitan France, we retrospectively studied (from 03/01/2020 to 01/30/2021) the association strength (using normalized mutual information) as well as the linear correlation (using Pearson's correlation) between more restrictive NPIs (mrNPIs) and epidemiological markers of COVID-19. All mrNPIs were moderately associated with a decrease in the viral reproduction rate but were associated neither with a decrease in COVID-19 daily hospitalizations nor with COVID-19 daily ICU admissions. This paper is only for academic discussion; its conclusions need to be confirmed by further research. Data and code are available at http://gitlab.com/covid-data-2/lockdown-and-curfew.


Introduction
The management of epidemics has not changed for centuries: "detect, isolate, treat" has almost always been the creed of health authorities, and remains so. Furthermore, until the last century, patient isolation was selective, not to say individual. For instance, lazarettos, i.e. prison-like places equipped with infirmaries, were broadly used to keep ship passengers or other sick patients in strict quarantine 1; in 17th-century London, only infected families were isolated in their homes, their doors being marked with red crosses 2 in order to protect the healthy. Countless such examples taken from history reveal that broad-based, general lockdowns, applied to the healthy or asymptomatic as well as to the sick, have been a very uncommon practice. Therefore, the decision to impose mrNPIs, as has been the case in metropolitan France, with mandatory "stay-at-home" orders ("lockdowns") and subsequently curfews applied indiscriminately to entire populations, without historical precedent or scientific basis, was made because limiting social control to the sick appeared unacceptable to presumably well-meaning policy makers. More recently, a curious political measure, without any epidemiological hindsight or data, was added to this panel: the "week-end lockdown".
The impact of such "stay-at-home" mandates on COVID-19 epidemiological outcomes is subject to debate: many studies report positive effects, while others evince complete uselessness or even negative results. A broad survey 3 of lockdown and other NPI policies suggested that lockdowns were globally useless with regard to COVID-19 mortality, and sometimes even with regard to SARS-CoV-2 transmission.
The challenges posed to French public health authorities have been compounded by the fact that established epidemiological models have shown their limitations during this pandemic 4,5. Furthermore, it is commonly accepted that multi-agent, compartmental, or Markov-process-based models should be used to assess the impact of public health measures on an epidemic. At the very least, a method adapted to time series should be employed 6. Ideally, rigorous uncertainty quantification should be performed in order to provide credibility bounds for the prospective results. Meanwhile, we have observed that epidemiological models (such as the MRC/Cambridge one) are sometimes revised and show reproduction rates lower than 1, despite the emergence of new variables: e.g., cumulative attack rates are dramatically revised upwards, thereby drastically reducing IFR estimates 7. Consequently, we endeavoured to bring methodological innovation to overcome the various biases and flaws that exist in the epidemiological models used so far in the assessment of public health policies 8,9.
So as not to wallow in idle criticism of the aforementioned policies without solid arguments, we assessed the impact of mrNPIs (lockdowns and curfews) on intensive care unit (ICU) admissions as well as on daily hospitalizations for COVID-19. In doing so, we recognized that the inter-dependency between two variables (here, an mrNPI and a COVID-19 epidemiological marker) can be studied not only with linear correlation, traditionally used in the field of epidemiology, but also with other association-evincing techniques apparently unknown to this field. To this end, we retrospectively studied (from 03/01/2020 to 01/30/2021), taking metropolitan France as an example, the strength of the association and the correlation between NPIs and epidemiological markers of COVID-19, using both classical Pearson correlation and the innovative Shannon mutual information.

Pearson method
With a time lag of 0 days, Pearson's correlation matrix (Figure 2) reveals no significant correlation between COVID-19 ICU admissions and either a curfew starting at 6 p.m. (0.04) or one starting at 8 p.m. (0.06). However, there was a minor positive correlation between COVID-19 hospitalizations and the 8 p.m. curfew (0.11), and a smaller one with the 6 p.m. curfew (0.09). Lastly, regarding the viral reproduction rate R(t), there was no significant correlation for either type of curfew (-0.03). Furthermore, concerning general lockdowns, we noticed a modest positive correlation with COVID-19 hospitalizations (0.29), a slightly stronger positive correlation with COVID-19 ICU admissions (0.31), and no significant correlation with the viral reproduction rate (0.09).
Finally, the closure of "non-essential" businesses (bars, restaurants, non-food stores, theaters, etc.) was modestly yet noticeably positively correlated with both COVID-19 hospitalizations (0.36) and ICU admissions (0.32), but was not correlated with a change in the viral reproduction rate (0.02).
The most homogeneous NPI in terms of correlation coefficient remained mask wearing (without, however, exhibiting a strong correlation): for instance, mandatory mask wearing in public transit was slightly negatively correlated with both COVID-19 ICU admissions (-0.16) and the viral reproduction rate (-0.22), but not significantly with COVID-19 hospitalizations (-0.05). An even smaller effect was observed for mandatory mask wearing in indoor public venues, with a minimally significant negative correlation with the viral reproduction rate (-0.10) and no significant correlation with either COVID-19 ICU admissions (-0.02) or hospitalizations (0.02). Even less convincing results were found for mandatory mask wearing at the workplace, with a borderline insignificant negative correlation with the viral reproduction rate (-0.10), no significant correlation with COVID-19 ICU admissions (0.08), and even a slightly positive correlation with COVID-19 hospitalizations (0.13).
Importantly, we obtained similar findings when using a non-zero time lag of 15 days between NPI introduction and hospital/ICU admissions and viral reproduction rate. It is therefore futile to time-consolidate the data in the hope of evincing more favorable relationships between NPIs and epidemic outcomes, as typically accepted time lags (in the order of 2 weeks) yield identical series of Pearson coefficients. The association of NPIs with the various epidemiological indices studied is significant (p-value < 0.05) (Figure 3).

Normalized mutual information method
These results were confirmed by the normalized mutual information method (Figure 4). No association was found between the NPIs and hospitalizations or ICU admissions, apart from a negligible association with the closure of non-essential stores (0.13 and 0.10). All NPIs were moderately associated (0.1 < x < 0.3) with the viral reproduction rate; however, the association of mask wearing in public transport with the viral reproduction rate is not significant (p-value = 0.45) (Figure 5). The association between ICU admissions and hospitalizations is well captured (0.35), as it is by the Pearson correlation matrix. Our classical information entropy method also showed a modest association between French "départements" and hospitalizations or ICU admissions (> 0.1), reflecting the disparity in severity across the territory, which the Pearson correlation matrix did not capture.

Pearson method
Regarding the synergistic evaluation of NPIs (Figure 6), all of them lowered the viral reproduction rate, but only to a very small extent; however, they had no significant negative influence on ICU admissions or hospitalizations. The most effective measure was the combination of mask wearing in public transport and the closure of non-essential stores (-0.14). Other combinations also had low but significant effectiveness (-0.11): mask wearing in enclosed places combined with the closure of non-essential businesses; mask wearing in companies combined with the closure of non-essential businesses; and mask wearing in companies combined with the closure of non-essential businesses and lockdowns. All correlation results between NPIs and epidemiological trackers were significant (p-value < 0.05) (Figure 7).

Normalized mutual information method
The Pearson-based correlation findings above were confirmed by our novel approach based on normalized mutual information (Figure 8): all NPIs were only modestly associated with the mean viral reproduction rate R(t) (0.1 < x < 0.3). Specifically, the strongest association between an NPI and epidemiological outcomes was observed for the synergistic effect of mandatory mask wearing in public transit combined with the closure of "non-essential" stores (0.25). However, it is important to acknowledge that no NPI exhibited any association with either COVID-19 daily hospitalizations or COVID-19 daily ICU admissions. Furthermore, all association results between NPIs and epidemiological markers were found to be statistically significant (p-value < 0.05) (Figure 9).
Finally, it is interesting to notice that, where linear Pearson correlation was not able to evince a strong association between "département" number and COVID-19 epidemiological markers (e.g. only 0.16 for the viral reproduction rate and even lower for other variables), the approach based on Shannon mutual information evinces a strong association (e.g. 0.64 for the viral reproduction rate). This is easily explained by the fact that the numerical IDs of the French "départements" are assigned purely alphabetically and therefore carry no numerical meaning, only a categorical one. As a result, Pearson correlation is in this case used outside of its operating assumptions. In contrast, Shannon mutual information, being intrinsically agnostic to variable type, is able to retrieve the expected association, reflecting the observed substantial differences in epidemic severity across the "départements" of metropolitan France. This demonstrates the superiority of this novel methodology for those types of analyses and underlines the relevance of our approach.

Discussion
All considered NPIs were moderately associated with a decrease in the viral reproduction rate but were undoubtedly associated neither with a decrease in COVID-19 daily hospitalizations nor with COVID-19 daily ICU admissions. Furthermore, this lack of association was clearly observed using two fundamentally different approaches, resting on wholly different assumptions and principles: one with classical Pearson correlation, the other with Shannon mutual information, which is novel in this field. We also observed that these two approaches did not contradict but rather nicely complemented each other, in that Pearson correlation made it possible to study the sign of the correlation (positive vs. negative), while normalized mutual information made it possible to assess the association strength in a finer way, independent of the assumption of monotonicity required by linear correlation.
We surmised that the main criticism of our work would most likely be grounded in temporality, in that a time lag is to be expected between the inception of a public health policy and its observable effects on epidemiological markers. In order to alleviate such concerns, we carried out additional analyses by inserting 15-day time lags between policy inception variables on the one hand and epidemiological variables on the other. However, and quite remarkably, no variation in the observed coefficients or in the p-values was observed with such time-consolidation. From a methodological point of view, it should be specified that the algorithm with this 15-day period must be run twice: once to assess the association between X and Y with a 15-day lag between X and Y, then a second time to assess the association between X and Y with a minus 15-day lag between Y and X.
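As an illustration of the two-pass lagged analysis described above, the following is a minimal sketch only, not the code from the study's repository. The helper names (`pearson`, `lagged_pairs`) and the toy series are hypothetical; the sketch assumes two equal-length daily series and shows one way to pair X with Y shifted by +15 and by -15 days before computing a correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def lagged_pairs(x, y, lag):
    """Pair x[t] with y[t + lag]; a negative lag pairs x[t] with y[t - |lag|]."""
    if lag >= 0:
        return x[: len(x) - lag], y[lag:]
    return x[-lag:], y[: len(y) + lag]

# Hypothetical toy data: a policy indicator (0/1) and an outcome that
# responds to the policy exactly 15 days later.
LAG = 15
policy = [0] * 20 + [1] * 30 + [0] * 20
outcome = [100 - 50 * policy[t - LAG] if t >= LAG else 100
           for t in range(len(policy))]

r_zero = pearson(policy, outcome)                         # no lag
r_plus = pearson(*lagged_pairs(policy, outcome, LAG))     # X leads Y by 15 days
r_minus = pearson(*lagged_pairs(policy, outcome, -LAG))   # Y leads X by 15 days
```

In this toy case the +15-day pass recovers a perfect negative correlation that the unlagged pass only partly captures, which is the kind of difference the two lagged runs are meant to detect.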
We also want to acknowledge that a recent geo-epidemiological study concerning France was published 10 which, in contrast to our analysis, found that the delay between the first COVID-19-associated death and the onset of a lockdown appeared to be positively associated with in-hospital incidence, mortality, and case fatality rates. However, the same authors also found, like us, a decrease in viral incidence associated with the imposition of lockdowns. Unlike us, they studied only the first lockdown period, whereas our study analyzed a longer period. Neither the effect of curfew measures in France nor the effect of mask wearing in the general population across metropolitan France had been studied in any other study. The main limitation of our study therefore remains the assessment of the population's compliance with these NPIs: mobility data make it possible to assess compliance with curfews and lockdowns, but no index available as open data allows us to assess compliance with mask wearing (and its proper use).

Methods
We used two different methods in order to evince association, or lack thereof, between public health policies implemented in metropolitan France (mandatory mask wearing, closing of "non-essential" businesses, lockdowns, nightly curfews starting at 6 PM or 8 PM) and the daily number of people hospitalized or admitted to the ICU for COVID-19, as well as the effective reproduction rates, including appropriate time lags. These two methods were, on the one hand, the classical Pearson correlation and, on the other hand, normalized Shannon mutual information. A STROBE checklist was completed for this cohort study (Supplementary Table S1).

Settings
We used historical epidemiological data, spanning from 03/01/2020 to 01/30/2021, by extracting the metropolitan "département" (the second-tier administrative subdivisions of France, below the regional level, of which there are 96 in metropolitan France) data on 01/31/2021. Specifically, the following study variables were extracted: number of hospitalized COVID-19 patients per day, and number of COVID-19 ICU admissions per day.
For this same period, we also considered the presence or absence of the various COVID-19-related NPIs: curfew starting at 8.00 PM; curfew starting at 6.00 PM; general lockdown; mandatory mask wearing in public transportation and indoor businesses; "non-essential" stores and restaurants closures; and the closure of performance venues (concert halls, movie theaters, etc.).

Variables
For each method, we have studied the following variables: 'CF8pm' curfew at 8. Raw data for the independent assessment of NPI efficiency are compiled in our GitLab repository 11. Time-consolidated data (whose aim is to take into account time lags between measures and outcomes, described below) and data for the synergistic assessment of NPI efficiency are available online 11.
In order to evaluate the NPIs in a synergistic way,

Data sources
Epidemiological data were extracted from the official site of "Santé Publique France". NPI application dates were followed closely: mask wearing in public transit has been recommended since 05/11/20, mask wearing in indoor places since 07/20/20, and in businesses since 09/01/20; the closure of "non-essential businesses" (bars, restaurants...) and of performance halls has been complete since October 2020. All data sources are accessible in our metadata file 12.

Statistical methods
Obviously, correlation is not causation; but absence of correlation implies absence of causality. Correlation (which might be negative as well as positive) is therefore a key component of the scientific process, for it evinces collections of variables that may interact with each other, thereby warranting further study. Conversely, this methodology also allows for the early dismissal of unwarranted hypotheses regarding such interplay between variables. This is why we used this approach, as our goal was not to propose new NPIs, but rather to discover whether the often-claimed benefits thereof were warranted or not.
The first method we used is based on the standard Pearson correlation matrix [12], whose computation was performed with the Python NumPy library and which we checked against two Pearson formulas, one for discrete series and one for continuous series. Specifically, the function we used takes the following elements:
• df is the dataframe of the data
• cols is the list of columns used for the matrix
• T is the transposition function
Concerning Pearson correlation, it is a commonly formulated criticism that one may not establish a linear correlation between a series of quantitative variables (e.g., the daily number of admitted patients) and a series of qualitative variables (e.g., the presence or absence of a curfew). However, this concern is misguided in the case of dichotomous variables (i.e., those taking on binary values), for this correlation can be legitimately established using the point biserial correlation coefficient [13][14][15]. We therefore performed these calculations for all dichotomous series (lockdown, curfew), and it is interesting to note that we obtained the same coefficients as those directly obtained with Pearson. This method has previously been used to study the relationship between COVID-19 mortality and containment 16.
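The equivalence between the point biserial coefficient and Pearson's coefficient on a dichotomous series can be checked numerically. The sketch below is illustrative only and is not the function from the study (which reports using NumPy); it is written in dependency-free Python, and the helper names and the toy `curfew` and `icu` series are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def point_biserial(binary, values):
    """Point biserial coefficient: (M1 - M0) / s_n * sqrt(p * q),
    where s_n is the population standard deviation of `values`."""
    n = len(values)
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    m1, m0 = sum(group1) / len(group1), sum(group0) / len(group0)
    mean = sum(values) / n
    s_n = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    p, q = len(group1) / n, len(group0) / n
    return (m1 - m0) / s_n * math.sqrt(p * q)

def correlation_matrix(columns):
    """Pairwise Pearson matrix for a dict mapping column name -> series."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical toy data: a curfew indicator and daily ICU admissions.
curfew = [0, 0, 0, 1, 1, 1, 1, 0]
icu = [12, 15, 11, 20, 18, 25, 22, 14]

r_pearson = pearson(curfew, icu)
r_pb = point_biserial(curfew, icu)
matrix = correlation_matrix({"curfew": curfew, "icu": icu})
```

The two coefficients agree up to floating-point rounding, which is consistent with the observation that the point biserial calculation returns the same values as Pearson on binary NPI indicators.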

The second method we used was based on mutual information entropy 17,18, allowing us in particular to free ourselves from the limiting assumption of monotonicity required by linear correlation. It is a measure of the quantity of information (in the sense defined by Claude Shannon in 1948 19) that two distributions share. In other words, it is a measure of the association ("clustering") between two variables: it is important to stress that this approach is NOT linear correlation, but classical information entropy. As this approach has so far apparently been ignored in the field of public health, and given our familiarity with it in other contexts such as cyber-security 20, we endeavoured to harness its power in support of medical studies. In this approach, entirely novel to the field to our knowledge, we compute a dimensionless quantity generally expressed in units of bits 21, which may be thought of as the reduction in uncertainty about one random variable given knowledge of another. For instance, high mutual information indicates a large reduction in uncertainty about one variable given the other, whereas low mutual information indicates a small reduction in this uncertainty; and of course zero mutual information between two random variables entails no association between the two distributions. Furthermore, Shannon's source coding theorem establishes strict bounds on how far a data series can be compressed, which in turn tells us to what extent one variable might be a proxy for another without data loss. More broadly, Shannon information entropy has been demonstrated to be especially efficacious for evaluating algorithmic complexity when used with the Block Decomposition Method 22,23. Moreover, according to N. N. Taleb, entropy metrics solve practically all correlation paradoxes (or rather, pseudo-paradoxes) in the field of social sciences 24.
Another important example of the relevance of this technique is that of mother wavelet selection, where it demonstrated superior sensitivity in quantifying changes in signal structure compared with the classical mean-squared error and correlation coefficient 25.
Therefore, it seemed especially relevant to us to assess the impact of these NPIs on the daily number of people hospitalized and on the daily number of ICU admission using each of these 2 methods.
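To make the mutual information measure concrete, here is a minimal sketch for discrete series. It is not the study's implementation: the normalization used (mutual information divided by the arithmetic mean of the two entropies) is an assumption, since the text does not state which normalization was applied, and continuous series would first need to be binned. The function names and toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(xs):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def normalized_mi(xs, ys):
    """MI divided by the mean of the two entropies (assumed normalization)."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0  # a constant series carries no information
    return mutual_information(xs, ys) / ((hx + hy) / 2)

# Hypothetical toy data: region IDs are categorical labels, and the
# severity level depends on the region but not on the ID's numeric value.
ids = [1, 1, 2, 2, 3, 3, 4, 4]
level = ["high", "high", "low", "low", "high", "high", "low", "low"]

# Relabeling the IDs arbitrarily leaves the association unchanged.
relabel = {1: 40, 2: 7, 3: 99, 4: 12}
relabeled = [relabel[i] for i in ids]

nmi_original = normalized_mi(ids, level)
nmi_relabeled = normalized_mi(relabeled, level)
```

Because the measure depends only on the joint frequencies of the labels, it is unaffected by how categorical identifiers are numbered, unlike a linear correlation coefficient.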

Biases and their reduction
The first bias is that the number of hospitalizations at a given time depends on the number of daily hospitalizations, which is a time series, and that the curfew decision taken by the political authorities is highly dependent on the daily and cumulative number of hospitalizations.
The second bias is the temporal auto-correlation between daily ICU admissions and daily hospitalizations. To overcome this, we used time lags in the observations to construct the joint distribution. For example, for ICU admissions, we took the CDC figures 26: 5 days of incubation + 10 days of illness before arrival in the ICU. Temporality was thus taken into account, since each observation is a tuple whose elements are not measured at the same time t. It is, in effect, a joint distribution integrating time lags (the values of which can of course legitimately be discussed). Accordingly, part of the code evaluated the auto-correlation relationship for ICU admissions 27: the auto-correlation was calculated from a time-consolidated file summing the patients in the ICU for each day, followed by an auto-correlation test with a 95% confidence interval. Concerning the temporal auto-correlation bias described above, the data on ICU admissions over the 320 days fell within the confidence cone, indicating that there was no auto-correlation between daily hospitalizations and daily ICU admissions, as can be seen in Figure 1 and in Supplementary Table S2. Although this bias therefore seems to have little impact, we performed one analysis with the initial data and another with normalized data.
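A simple version of such an auto-correlation check can be sketched as follows. This is an illustrative sketch, not the code referenced above: it assumes a constant 95% band of ±1.96/√n (the white-noise approximation) rather than a cone that widens with the lag, and the function names and the toy weekly series are hypothetical.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative `lag`."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

def white_noise_bound(n, z=1.96):
    """Approximate 95% band for the autocorrelation of white noise."""
    return z / math.sqrt(n)

def lags_outside_band(series, max_lag):
    """Lags (1..max_lag) whose autocorrelation exceeds the 95% band."""
    bound = white_noise_bound(len(series))
    return [k for k in range(1, max_lag + 1)
            if abs(autocorrelation(series, k)) > bound]

# Hypothetical toy series with a weekly pattern over 70 days.
weekly = [10 + (5 if day % 7 < 2 else 0) for day in range(70)]
flagged = lags_outside_band(weekly, max_lag=14)
```

A series whose autocorrelations all stay inside the band would return an empty list; the toy weekly pattern is flagged at lag 7, as expected for a 7-day cycle.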
A third bias, which modulates the evaluation of the effectiveness of NPIs, is the population's compliance with them 28. To gauge the adherence of the French population to these NPIs, one could study the number of fines issued for non-compliance with health rules from 03/01/2020 to 01/30/2021. However, no open-access data are available concerning the number of fines per day and per "département" for non-compliance with sanitary rules.
A fourth bias lay in problems intrinsic to the different variables: the notion of "phantom beds" and therefore the representativeness of the basic data 29. This is why we were interested in the number of daily ICU admissions and the number of daily hospitalizations, and not in occupancy rates.
A fifth bias concerned the demographic differences between French "départements": 200 ICU admissions in a sparsely populated "département" do not have the same impact as in a heavily populated "département" equipped with resuscitation beds accordingly. To overcome this, we looked at an index with good external validity from a demographic point of view: R(t). R(t) cannot be modified through vaccination or other changes in population susceptibility; it can vary based on a number of biological, socio-behavioral, and environmental factors 30 and can also be modified by physical distancing and other public policy or social interventions. This indicator was therefore undoubtedly the most important in our study. To estimate R(t), we used the Cori method, which we adapted to 2 datasets: emergency room visits for suspected COVID-19 and positive PCR tests for SARS-CoV-2. The calculation is based on the average of the Cori method 31 [R(t) = A / B, where A is the number of people admitted to the emergency room for COVID-19 (over 7 days) and B is the number of people admitted to the emergency room for COVID-19 7 days earlier (over 7 days)] and the PCR-positive method [same calculation but with a different variable: the daily number of positive SARS-CoV-2 PCR tests]. Our calculation results are available 32.
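The ratio R(t) = A / B described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's code: it assumes daily count series with no missing days, interprets "7 days earlier" as the 7-day window ending 7 days before the current one, and uses hypothetical function names and toy data.

```python
def rolling_sum(series, window=7):
    """Sum of the last `window` values ending at each index
    (None until a full window is available)."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - window: i + 1]))
    return out

def ratio_r(series, window=7, shift=7):
    """R(t) = A / B, with A the current 7-day sum and B the 7-day sum
    ending `shift` days earlier (None when undefined)."""
    sums = rolling_sum(series, window)
    result = []
    for i, a in enumerate(sums):
        b = sums[i - shift] if i - shift >= 0 else None
        result.append(a / b if a is not None and b else None)
    return result

def averaged_r(er_visits, pcr_positives):
    """Average of the ER-based and PCR-based ratios, day by day."""
    r_er, r_pcr = ratio_r(er_visits), ratio_r(pcr_positives)
    return [(a + b) / 2 if a is not None and b is not None else None
            for a, b in zip(r_er, r_pcr)]

# Hypothetical toy data: ER visits double from one week to the next,
# while positive PCR tests stay flat.
er_visits = [10] * 7 + [20] * 7
pcr_positives = [50] * 14
r_mean = averaged_r(er_visits, pcr_positives)
```

With these toy inputs the ER-based ratio on the last day is 2.0 and the PCR-based ratio is 1.0, so the averaged estimate is 1.5; the first 13 days are undefined because two full 7-day windows are not yet available.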
The last but not least bias concerns the synergy between NPIs and the resulting epidemiological effect. To overcome this, we studied the NPIs with each method first separately and individually, and then in a synergistic manner. Thus, we carried out 4 runs of the programs: one without the time-consolidated ICU data and with an independent evaluation of the NPIs; one with the time-consolidated ICU data and an independent evaluation of the NPIs; one without the time-consolidated ICU data and with a synergistic evaluation of the NPIs; and one with the time-consolidated ICU data and a synergistic evaluation of the NPIs.

Data availability
All data generated or analysed during this study are included in this published article (and its repository http://gitlab.com/covid-data-2/lockdown-and-curfew).