COVID-19
Worldwide and European data on COVID-19 cases and deaths are retrieved from the Worldometers platform20. Greek data on cases, deaths, and active cases are retrieved from the same platform, while data on the geographical distribution of cases and deaths in Greece are retrieved from the COVID-19 page of the official governmental platform21. Maps were created with Chartsbin and Mapcharts22-23. The selected timeframe is from February 26th to April 24th, as these two dates mark important milestones: the first confirmed case and the end-of-quarantine announcement, respectively.
Non-Pharmaceutical Interventions (NPIs)
Since no official COVID-19 cure exists, and as a vaccine is not yet available and could take more than a year to be ready for public distribution, NPIs such as house quarantine or international flight restrictions are required in order to monitor and minimize the spread of the disease. The timely adoption of such measures is what made the Greek case so successful, and the statistical significance of the days of adoption is explored in order to elaborate on the importance of NPIs.
Greek officials acted immediately to adopt preventive measures against the spread of the disease. The first NPI was announced on March 11th, the day WHO declared COVID-19 a pandemic, and ordered the closing of all schools and universities, effective as of March 12th24. Retail shops, gyms, cafes, restaurants, etc., followed over the next days25-26, while less than two weeks later, on March 23rd, strict movement measures took effect, including house quarantine and special forms and SMS for all forms of movement27. On March 12th and March 25th, two villages in the Kozani Prefecture and one in the Xanthi Prefecture, respectively, were put under complete lockdown28-29, while the two major cities close by were also put under special quarantine measures (but not complete lockdown) on March 31st.30-31 Said cities exhibit increased COVID-19 cases and deaths compared to other Prefectures.
Google Trends
Several approaches to monitoring and analyzing epidemiological characteristics of the virus using Google Trends data have been recorded up to this point.32-35 Google Trends is an open online tool that provides information on the behavior towards selected topics and keywords. Such infodemiology variables can accurately measure the users’ online search patterns, and, in this case, assist with exploring the public’s perception and interest towards COVID-19 over the examined period.
In this paper and in line with the Google Trends Methodology Framework in Infodemiology and Infoveillance36, normalized Google Trends data were retrieved on a sequence of Google queries from February 26th, which is the date on which the first confirmed COVID-19 case in the country was recorded, up to April 24th, which is the day before the announcement of the softening of the quarantine measures.
Google Trends is not case sensitive, but it does take into account accents, special characters, and misspellings. Greek is a rather complicated language in terms of accents and spelling, and the spelling of the word for coronavirus had not yet been settled. To that end, and to ensure that the majority of coronavirus searches were included in the analysis, the following procedure for the selection of the examined keywords was followed.
At first, there existed four differently spelled terms to express coronavirus, i.e., “Κορωνοιός”, “Κορωναιός”, “Κορονοιός”, and “Κοροναιός”, with all terms including accents. Each term was compared to itself without the accent in Google, and all cases exhibited non-significant results for the terms with the accents. As the terms “Κορωναιός” and “Κοροναιός”, though used during the first days of the epidemic, were quickly “abandoned” by the experts, media, and public, they exhibit significantly less interest, and were thus not included in further analysis. Therefore, the Greek terms “Κορωνοιος” and “Κορονοιος” were selected at this stage and, in order to also include searches conducted in English, “Coronavirus (search term)” was also added in the analysis, i.e., data for the “κορωνοιος+κορονοιος+coronavirus” sequence of search terms were retrieved for the Google query data variable.
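The keyword preparation described above (dropping the accent, ignoring case, and joining the retained terms with “+”) can be illustrated with a minimal Python sketch. The function names `strip_accents` and `build_query` are hypothetical and are not part of any Google Trends tooling; the sketch only reproduces the query string quoted in the text and does not contact Google Trends.

```python
import unicodedata


def strip_accents(term: str) -> str:
    """Remove combining accent marks (e.g. the Greek tonos) from a term."""
    decomposed = unicodedata.normalize("NFD", term)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def build_query(terms):
    """Join unaccented, lower-cased terms with '+' into one query string."""
    return "+".join(strip_accents(t).lower() for t in terms)


# The two retained Greek spellings plus the English term, as in the text.
query = build_query(["Κορωνοιός", "Κορονοιός", "Coronavirus"])
print(query)
```

Lower-casing is harmless here because Google Trends is not case sensitive, whereas removing the accent does change the term that is searched, which is why the accented and unaccented variants had to be compared first.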
Predictability analysis
In this paper, a predictability analysis for the COVID-19 Daily Deaths and Daily Cases ratio in Greece is performed; the prediction model is based on an autoregressive model with heterogeneous explanatory variables (𝐴𝑅 − 𝛨𝑋). The proposed model is constructed in order to incorporate and study the short-term and long-term effects of predictors that are crucial for the assessment of the duration and the effectiveness of an intervention policy. In spite of the simplicity of the model, we find that it succeeds in predicting the COVID-19 Daily Deaths and Daily Cases ratio.
In particular, let 𝑦(𝑑) be the dependent variable, constructed as the ratio between Daily Deaths and Daily Cases, and let 𝑥ᵢ,ₜ, with 𝑖 = 1, 2, 3, denote the explanatory variables, where 𝑡 = 1, … , 𝑇 and 𝑇 is the respective number of observations. The dependent variable exhibits a series of statistical properties that pose serious challenges to standard statistical models (e.g. autoregressive fractionally integrated moving average models). For example, the autocorrelations of the squared and absolute values of the Daily Deaths and Daily Cases ratio display long memory that lasts for long periods of time (e.g. weeks), while it is expected that its determinants will influence it after a long and random period of time.
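As an illustration of the long-memory property mentioned above, the sample autocorrelations of the absolute and squared values of a series can be computed as in the following sketch. The function `sample_acf` and the synthetic `ratio` series are hypothetical stand-ins; they are not the study's data or the authors' code.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = x @ x
    return np.array([1.0 if h == 0 else (x[h:] @ x[:-h]) / denom
                     for h in range(max_lag + 1)])


# Synthetic placeholder for the Daily Deaths / Daily Cases ratio.
rng = np.random.default_rng(0)
ratio = np.abs(rng.normal(loc=0.05, scale=0.02, size=60))

acf_abs = sample_acf(np.abs(ratio), max_lag=14)   # absolute values
acf_sq = sample_acf(ratio ** 2, max_lag=14)       # squared values
```

Autocorrelations that decay slowly over lags of a week or more would be consistent with the long memory described in the text; the synthetic series here is independent noise and is only meant to show the mechanics.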
Although the models widely used in the existing literature impose infinite-dimensional restrictions on an infinite number of variable lags in order to obtain long memory, many observations are lost because of the time-consuming build-up period of the fractional difference operator.37 However, in cases such as the one explored in our study, the number of observations is limited, as it is crucial that a statistical understanding be developed over a restricted short-term period. In a sense, delays could be measured in number of casualties. What is more, such models capture the so-called unifractal scaling behavior and not the multiscaling behavior (i.e., when the data exhibit patterns that are repeated at different time scales or scaling laws). Such scaling-type regularities can provide useful information for modeling and forecasting a phenomenon.
The model developed during the implementation of this study has a simple autoregressive structure, yet with the feature of considering explanatory variables over different interval sizes. The heterogeneous 𝐴𝑅(𝑘) − 𝛨𝑋 model is given by:
[Please see the supplementary files section to view the equation.]
where 𝑐 is the constant term, (𝑑) denotes the data frequency (i.e., daily data frequency) and 𝐷 is the dummy variable that is equal to one (1) for the day on which an event occurs (e.g. a restriction is imposed), and zero otherwise, while (𝑤) denotes longer aggregation periods. The selection of (𝑤) is data driven. In this predictability analysis, we make use of aggregation periods longer than one day, as we allow (𝑤) to vary over a fixed number of lags. This data-driven method allows us to find the optimal number of days passed for assessing past events that may influence the dependent variable in the future. It allows us to examine not only the effectiveness of the imposed measures via a strict statistical analysis, but also the optimal intervention framework, so that such situations can be predicted as well.
In a sense, explanatory variables viewed over different time horizons are considered, which, in turn, permits direct comparison among quantities defined over various time horizons. This is of high significance, as it indicates how much time (in this case, days) policy makers have at their disposal in order to determine the last time-point that will allow them to act, how long the imposed measures should be in place, how the latter will be evaluated, etc. In fact, the explanatory variables are multiperiod quantities that are normalized sums of the one-period quantities (i.e., a simple average of the daily quantities).
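The idea of averaging daily quantities over a window of (𝑤) days and letting the data choose (𝑤) can be sketched as follows. Because the model equation is only available in the supplementary files, the regressor layout below (a constant, 𝑘 own lags, the lagged daily predictors, their lagged 𝑤-day averages, and the dummy 𝐷) is an assumption, as are ordinary least squares estimation and the choice of (𝑤) by the smallest residual sum of squares. All function names are hypothetical and the data are synthetic.

```python
import numpy as np


def trailing_mean(x, w):
    """Simple average of the last w daily values, aligned to the current day."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)            # first w-1 days lack a full window
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[w - 1:] = (csum[w:] - csum[:-w]) / w
    return out


def fit_ar_hx(y, X, D, k, w, start):
    """OLS fit on days start..T-1; returns (coefficients, residual sum of squares).

    Assumed regressors for day t: constant, y[t-1..t-k], X[t-1],
    the w-day mean of X ending at t-1, and the dummy D[t].
    """
    y, X, D = (np.asarray(a, dtype=float) for a in (y, X, D))
    Xw = np.column_stack([trailing_mean(X[:, j], w) for j in range(X.shape[1])])
    rows = [np.concatenate(([1.0], y[t - k:t][::-1], X[t - 1], Xw[t - 1], [D[t]]))
            for t in range(start, len(y))]
    A, b = np.array(rows), y[start:]
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    resid = b - A @ beta
    return beta, float(resid @ resid)


def select_w(y, X, D, k, candidates):
    """Pick the window w with the smallest RSS on a common estimation sample."""
    start = max(k, max(candidates))           # same rows for every candidate w
    rss = {w: fit_ar_hx(y, X, D, k, w, start)[1] for w in candidates}
    return min(rss, key=rss.get)


# Synthetic data in which y depends on the 5-day average of one predictor.
rng = np.random.default_rng(0)
T = 120
X = rng.normal(size=(T, 1))
D = np.zeros(T)
D[30] = 1.0                                   # one hypothetical event day
m5 = trailing_mean(X[:, 0], 5)
y = np.zeros(T)
y[5:] = 0.5 * m5[4:-1] + 0.001 * rng.normal(size=T - 5)

best_w = select_w(y, X, D, k=1, candidates=range(2, 11))
print(best_w)
```

All candidate windows are compared on the same rows, so every fit has the same number of parameters and observations; under that design, ranking by residual sum of squares gives the same ordering as ranking by an information criterion such as AIC.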
Table 2 describes the dependent and explanatory (independent) variables used in this study.
Table 2. Descriptions of the dependent and independent variables.
Variable    Description
𝑦           Ratio between Daily Deaths and Daily Cases
𝑥₁          Ratio between Deaths and Cases*
𝑥₂          Ratio between Recovered and Active**
𝑥₃          Google Trends
𝐷           Dummy for restrictive measures
* Refers to total deaths and total cases
** Refers to total recovered and total active
Notes. In order to explore the relationship between the dependent and the independent variables and to avoid spurious regression results with non-stationary time series, the Augmented Dickey-Fuller (ADF) test38-39 and the Phillips–Perron (PP) test40 were used. In the case where the null hypothesis of non-stationarity (i.e., the series has a unit root) cannot be rejected, the first differences of the series are constructed.
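The test-then-difference step described in the notes can be sketched as below. For self-containment the sketch uses a plain (non-augmented) Dickey-Fuller regression with a constant, which is a simplification and not the full ADF or PP tests used in the study. The function names are hypothetical, and the threshold of -2.86 is the asymptotic 5% critical value for the constant-only case, so it is only approximate in short samples.

```python
import numpy as np

DF_CRIT_5PCT = -2.86  # asymptotic 5% critical value, regression with a constant


def dickey_fuller_stat(series):
    """t-statistic on gamma in: diff(y)_t = c + gamma * y_{t-1} + e_t."""
    y = np.asarray(series, dtype=float)
    dy = np.diff(y)
    A = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(A, dy, rcond=None)
    resid = dy - A @ beta
    sigma2 = resid @ resid / (len(dy) - 2)    # residual variance
    cov = sigma2 * np.linalg.inv(A.T @ A)     # OLS coefficient covariance
    return float(beta[1] / np.sqrt(cov[1, 1]))


def difference_if_needed(series):
    """Return (series, False) if the unit root is rejected, else (diff, True)."""
    y = np.asarray(series, dtype=float)
    if dickey_fuller_stat(y) < DF_CRIT_5PCT:
        return y, False
    return np.diff(y), True


# Synthetic examples: white noise and a random walk built from it.
rng = np.random.default_rng(0)
noise = rng.normal(size=200)
walk = np.cumsum(noise)

noise_out, noise_differenced = difference_if_needed(noise)
walk_out, walk_differenced = difference_if_needed(walk)
```

In practice a library implementation that selects the number of augmentation lags, such as `adfuller` in `statsmodels.tsa.stattools`, would be preferable to this simplified regression.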
By studying the interrelations of the dependent variables measured over different time horizons, the dynamics of the different components of a system can be revealed. It is expected that interventions or infections over longer time intervals have a stronger influence on the Daily Deaths and Daily Cases ratio than those over shorter time intervals. Furthermore, the interpretation of the proposed model is much simpler than that of an autoregressive model with non-heterogeneous explanatory variables taking a very large number of lags. As already mentioned, standard models employed in the literature, while possibly effective in modelling the evolution of a phenomenon that develops in an endogenous system, are not able to capture exogenous effects that took place a long time ago (e.g. weeks or months).