Model Description

The model is an extension of the SIR model (20). A full description is given in (7); only the essential aspects are summarized here. The model assumes a constant population (P) divided into the following compartments: uninfected (U), infected but pre-symptomatic (I), symptomatic (S), seriously sick (SS), recovering or “better” (B), recovered (R), and deceased (D). Transitions between these stages occur, each with its own rate constant *k*, as shown in Figure 3.

The first transition (U → I) results from four parallel processes: U + I → 2I; U + S → I + S; U + SS → I + SS; U + B → I + B. The transition U + I → 2I is assumed to occur with a pseudo-first-order rate constant, *k*11. This rate constant is obtained by correcting its initial value, *k*11,0, by a time-dependent factor that incorporates the effect of non-pharmaceutical interventions (NPIs). At the time of each NPI, a smooth transition of *k*11 from its pre-NPI to its post-NPI value is introduced (see (7) for details). All rate constants are given in (7). When the mortality rate is known, only *k*11,0 and its corrections through NPIs are adjustable. The model allows for spikes in the infection rate, but spikes were not used in the analysis of any of the Spain or New York data. The mortality is set by adjusting the rate constant *k*4 of the process SS → D. The default value, *k*4 = 0.01223 day⁻¹, corresponds to a mortality rate of 1.5 %. The simulation was developed in MATLAB. The source code is given in the Supplementary Information.

Data Analysis

The Spain death data up to and including July 18, 2020 were collected from the Worldometer website (11) on July 19, 2020 and are shown in Supplementary Table S1. Cumulative daily death data for New York up to and including July 27, 2020 were collected from the New York Times COVID-19 repository (13) on July 28, 2020 and are shown in Supplementary Table S2.

These data were used to calibrate the model (7) summarized above. The calibration was done manually by trial and error. The model incorporated multiple NPIs, with a reopening represented as an NPI with negative effectiveness. The starting date of each intervention was based on media reports of country-wide or state-wide initiatives to contain the epidemic. For the Spain model, the chosen dates were March 15 and April 1, 2020 for the NPIs and June 21, 2020 for the reopening. An additional NPI with positive effectiveness, dated May 12, 2020, was tested as a way to improve the fit. It did not improve the agreement with the reported death data, so its effectiveness was set to 0. For the New York model, the chosen dates were March 13, March 22, and March 28, 2020 for the NPIs and May 15, 2020 for the reopening. An additional NPI with negative effectiveness, dated April 20, 2020, was needed to obtain a good fit with the data.

The starting date of the simulation (*t* = 0) is set to February 1, 2020. The initial conditions are 100 infected, 10 sick, and 1 seriously sick, each multiplied by an adjustable factor labeled Correction in Tables 1 and 2. The remaining adjustable parameters are the basic infection rate in the absence of NPIs, *k*11,0; the effectiveness of each NPI, expressed as a fraction of *k*11,0; and the true mortality of the disease, *f*mort.
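The way NPIs modulate the infection rate constant can be illustrated with a short Python sketch (the authors' own implementation is in MATLAB and is not reproduced here). It assumes that each NPI lowers *k*11 by its effectiveness times *k*11,0 and that the smooth transition is a logistic ramp; the actual functional form is given in (7) and may differ. The function names, the ramp width, the value of `K11_0`, and the effectiveness values are hypothetical placeholders. Only the dates, the *t* = 0 convention, and the zero effectiveness of the May 12 NPI come from the text.

```python
import math
from datetime import date

T0 = date(2020, 2, 1)  # t = 0 of the simulation, as stated in the text


def day_index(d):
    """Days elapsed since the start of the simulation."""
    return (d - T0).days


def k11(t, k11_0, npis, width=3.0):
    """Infection rate constant on day t.

    npis  : list of (start_day, effectiveness); effectiveness is a fraction
            of k11_0 and is negative for a reopening.
    width : ramp width in days (hypothetical; the actual smoothing is in (7)).
    """
    reduction = 0.0
    for start, eff in npis:
        # Logistic ramp from 0 to 1 around the start day, written with tanh
        # so that it cannot overflow for large |t - start|.
        ramp = 0.5 * (1.0 + math.tanh((t - start) / (2.0 * width)))
        reduction += eff * ramp
    return k11_0 * (1.0 - reduction)


# Spain intervention dates from the text. Effectiveness values are
# placeholders, except for May 12, which the text sets to 0.
spain_npis = [
    (day_index(date(2020, 3, 15)), 0.40),
    (day_index(date(2020, 4, 1)), 0.30),
    (day_index(date(2020, 5, 12)), 0.00),
    (day_index(date(2020, 6, 21)), -0.10),  # reopening: negative effectiveness
]

K11_0 = 0.5  # day^-1, placeholder value

for t in (0, 43, 60, 141, 200):
    print(t, round(k11(t, K11_0, spain_npis), 4))
```

Expressing the reopening as a negative effectiveness lets a single loop handle both tightening and relaxation of measures, which mirrors how the text describes the calibration.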

Effect of False Positives

For the purpose of this calculation, it is assumed that the testing produces no false negatives. The effect of false negatives is evaluated separately.

In this calculation, *x* is the number of actual positive cases as a fraction of the population, *y* is the number of false positives as a fraction of the actual negative cases, and *z* is the number of apparent positive cases as a fraction of the population, i.e. the sum of actual and false positives. Since the actual negatives make up a fraction 1 − *x* of the population, the false positives make up a fraction *y*(1 − *x*) of the population. Hence:

$$z = x + y\,(1 - x)$$

Solving for *x* (actual positives) leads to:

$$x = \frac{z - y}{1 - y}$$

Solving for *y* (false positives) leads to:

$$y = \frac{z - x}{1 - x} \qquad (4)$$
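The relations between *x*, *y*, and *z* follow from the definitions above: the apparent positive fraction is the actual positive fraction plus the false positive rate applied to the actual negatives, *z* = *x* + *y*(1 − *x*). The Python sketch below encodes this relation and its two inversions. The function names and the numerical values are hypothetical and are used only to check that the three expressions are mutually consistent.

```python
def apparent_fraction(x, y):
    """z: apparent positives as a fraction of the population."""
    return x + y * (1.0 - x)


def actual_fraction(z, y):
    """x: actual positives, given apparent positives z and false positive rate y."""
    if y >= 1.0:
        raise ValueError("y must be below 1")
    return (z - y) / (1.0 - y)


def false_positive_rate(z, x):
    """y: false positives as a fraction of actual negatives."""
    if x >= 1.0:
        raise ValueError("x must be below 1")
    return (z - x) / (1.0 - x)


# Illustrative values only: 10 % actual positives, 2 % false positive rate.
x, y = 0.10, 0.02
z = apparent_fraction(x, y)
print(z, actual_fraction(z, y), false_positive_rate(z, x))
```

Recovering the inputs from *z* in a round trip is a quick check that the algebra was inverted correctly.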

For the Spain data, the authors of (9) provided a range of real positives based on the results obtained with two different tests. Their numbers were used without further processing.

The simulation of the New York data was run with different values of the intrinsic mortality, *f*mort. For each value of *f*mort, the model was fitted to the data and the number of SARS-CoV-2 positive people as of April 20, 2020 was calculated. For each simulation, the proportion of false positives consistent with the simulation result was calculated using equation (4). The resulting relationships are plotted in Figure 4. Simulations were made for the entire state of New York, for New York City, and for Monroe County, NY (representing upstate New York). The lines representing the three sets of simulations intersect at very nearly the same point, indicating that a single combination of *f*mort and the proportion of false positives is consistent with all three data sets.

Effect of Age Distribution

The age distributions of Spain and New York state were taken from https://www.populationpyramid.net/spain/2019/ (18) and https://www.censusscope.org/us/s36/chart_age.html (19), respectively. For the New York data, the highest age category is 85+ years, whereas for the Spain data it is 100+ years. For consistency, all categories of 85 years and older in the Spain data were combined into a single group and assigned a representative age of 90 years. For Spain, this led to the same result as using the entire distribution with a representative age of 102 years for the 100+ years category. Based on (16) and (17), the mortality in each age group is estimated to be proportional to exp(0.1*a*), where *a* is the age in years. Hence, the fraction of the total population in each age group was multiplied by exp(0.1*a*) and the results were summed. The sums were 653 and 398 for Spain and New York, respectively. These numbers are proportional to the overall mortality rate. Hence, a ratio of 653/398 = 1.64 is expected between the mortality rates of Spain and New York.
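The age-weighted sum described above can be sketched in Python as follows. The function name and the toy age distribution are hypothetical; the real distributions come from (18) and (19) and are not reproduced here, and the representative ages used for the lower age groups are not stated in the text. Only the exponent of 0.1 per year, the representative age of 90 years for the 85+ group, and the sums 653 and 398 are taken from the text.

```python
import math


def age_weighted_sum(age_fractions, b=0.1):
    """Sum over age groups of (population fraction) * exp(b * representative age).

    age_fractions maps a representative age in years to the fraction of the
    population in that group; the fractions must add up to 1.
    """
    total = sum(age_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("population fractions must sum to 1")
    return sum(frac * math.exp(b * age) for age, frac in age_fractions.items())


# Toy distribution for illustration only (not the Spain or New York data).
toy = {10: 0.25, 30: 0.30, 50: 0.25, 70: 0.15, 90: 0.05}
print(age_weighted_sum(toy))

# Ratio of the sums reported in the text for Spain and New York.
print(round(653 / 398, 2))
```

Because the weights grow exponentially with age, the oldest groups dominate the sum even when they hold a small share of the population, which is why the treatment of the 85+ category matters.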

The estimation of the infection mortality rates for Spain and the state of New York, constrained by this ratio, is given in the Discussion section. Values of 1.94 ± 0.43 % and 1.18 ± 0.26 % were found for Spain and New York, respectively, i.e. fractions of 0.0194 ± 0.0043 and 0.0118 ± 0.0026. Dividing these fractions by 653 and 398, respectively, gives a universal pre-exponential factor of (3.0 ± 0.5)×10⁻⁵ for the age-dependent infection mortality rate. This leads to the equation:

$$f_{\mathrm{mort}} = A \exp(B a)$$

where *f*mort is the mortality rate as a fraction, *A* = (3.0 ± 0.5)×10⁻⁵, *B* = 0.1 year⁻¹, and *a* is the age in years.
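As a numerical check of the pre-exponential factor, the Python sketch below assumes the exponential form *f*mort = *A* exp(*B* *a*) implied by the definitions of *A* and *B* above. The function names are hypothetical. The constants and the age-weighted sums (653 and 398) are those quoted in the text.

```python
import math

A = 3.0e-5  # pre-exponential factor, central value from the text
B = 0.1     # per year, from the text


def f_mort(age, a_pre=A, b=B):
    """Age-dependent infection mortality rate as a fraction."""
    return a_pre * math.exp(b * age)


def population_mortality(weighted_sum, a_pre=A):
    """Overall mortality: A times the age-weighted sum of exp(B * a)."""
    return a_pre * weighted_sum


# Back-calculate A from the fitted mortalities and the age-weighted sums.
print(0.0194 / 653, 0.0118 / 398)

# Forward check: overall mortality implied by the central value of A.
print(population_mortality(653), population_mortality(398))
```

Both back-calculated values round to 3.0×10⁻⁵, and the forward calculation returns overall mortalities within the quoted uncertainties of the fitted values for Spain and New York.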