Data sources
We linked three different nationally representative data sources, including district data on SARS-CoV-2 incidence and deaths, population statistics, and cartographical data. Daily data on SARS-CoV-2 incidence and associated deaths (28/01/2020 to 04/05/2020) was obtained from the Robert-Koch Institute, which provides the national surveillance system of infectious diseases in Germany, stratified by sex and age groups (0–4, 5–14, 15–34, 35–59, 60–79, 80+) [10]. We used most recent district-level population statistics (2019) from a database of the system of social reporting in official statistics to calculate expected incidences and deaths [11]. Cartographical data were taken from the Federal Agency of Cartography and Geodesy [12].
Statistical analysis
We first assessed the distribution of SARS-CoV-2 incidence and associated deaths by calculating weekly standardised incidence (SIR) and mortality ratios (SMR) at district level stratified for sex and age groups (0–4, 5–14, 15–34, 35–59, 60–79, 80+) and on federal states level. For the calculation of SIR and SMR, and corresponding Poisson 95% confidence intervals (95%-CI), the ratio of observed versus expected counts was calculated [13, 14].
We then used a Bayesian spatiotemporal model, fitted by the integrated nested Laplace approximation (INLA) approach [15], to assess the spatial risk of SARS-CoV-2 incidence. For this analysis, we used daily notifications from 28/01/2020 to 04/05/2020. In order to select the likelihood distribution for the model that fit the count data best (e. g. Poisson, zero-inflated Poisson, negative binomial or zero-inflated negative binomial), four intercept-only models were specified at first. Using the Watanabe-Akaike information criterion (WAIC), the likelihood of the intercept-only model with lowest WAIC was selected for fitting two spatiotemporal models, which included a parametric and a non-parametric time trend, respectively. Based on lowest WAIC, we then selected the model that fit the data best for further calculation (see Additional file 1: Table S1 and Table S2) [14].
A zero-inflated negative binomial spatiotemporal model fit the data best. The model included a Besag-York-Mollié (BYM) model and a non-parametric dynamic time trend. For districts and notification days the rate of observed SARS-CoV-2 incidence and expected number of cases integrated as offset was modelled on a logarithmic scale:
.
Where is the intercept, is the BYM-model with as spatially structured random effect modelled using intrinsic conditional auto-regression (neighbouring structure of districts) and as spatially unstructured effect modelled exchangeable among the districts (independent and identically distributed, iid). The dynamic temporal trend incorporates a temporal structured random effect modelled dynamically using a random-walk model of order two, and an unstructured temporal effect modelled using the iid [14, 16].
The district-specific risk ratios (RR) and corresponding 95% credibility intervals (95%-CrI) are calculated by combining the spatial structured and unstructured random effects, e. g. [16]. For assessing the uncertainty associated with the , we compute the posterior probability of exceeding an threshold greater than 1, 2 and 3, respectively. Following the Richardson criterion, an exceedance probability of greater than or equal to 80% determines a high certainty of exceeding one of the mentioned thresholds in the respective district [17]. We further assess the posterior temporal mean trend at national level to provide an estimate for the time-dependent change in the risks of SARS-CoV-2 incidences in Germany. Therefore, we combined the temporal effects trough a linear combination and calculated time-specific risk ratios () and 95%-CrI [14].
The analysis was conducted using the R-programming language for statistical computing (V. 3.6.3). The R-INLA package [18] was used to fit the Bayesian models, the tmap [19] and the leaflet [20] packages were used for generating the maps.
Calculation of expected values
The expected SARS-CoV-2 incidence for district and notification day used for the Bayesian spatiotemporal models were calculated on the basis of the cumulative incidence rate of SARS-CoV-2 (with observed incidence ) until the current notification day weighted by the respective district population size (formula 1) [14, 16]:
.
The calculation of the weekly expected incidence and deaths used for estimating SIR and SMR is based on formula 1. The cumulative incidence rate for the most recent notification week with was calculated on the basis of the respective preceding notification weeks (formula 2):
.
The same calculation procedure was applied for estimating stratified SIR and SMR for age groups (0–4, 5–14, 15–34, 35–59, 60–79, 80+) and sex, and for the 16 federal states in Germany.
Approach for ongoing monitoring
In order to ensure an ongoing daily monitoring, we implement the statistical approach explained above in an automatic process and generate interactive maps for SIR, SMR and district specific risk estimates for SARS-CoV-2 incidence and mortality on a daily basis. The data management and data analysis is carried out with the R programming language, and is automatically started when the database of the Robert-Koch Institute is updated for the previous notification day at 00:00. The output of the analysis contains maps grouped by calendar weeks for daily and weekly SIR and SMR estimates (including age group and sex stratified estimates and counts for SARS-CoV-2 morbidity and mortality), as well as a daily updated map for district specific risk ratios based on the model configuration and selection process explained above. The daily updated interactive maps are then automatically integrated in an interactive web surface, the “Covid-19 Small Area Monitor” (CoV-19 SAM) [21], which is programmed using JavaScript and the JavaScript library jQuery.