Predicting mortality for Covid-19 in the US. Delayed Elasticity Method

The evolution of the pandemic caused by COVID-19, its high reproductive number and the associated clinical needs are overwhelming national health systems. We propose a method for predicting the number of deaths that will enable the health authorities of the countries involved to plan the resources needed to face the pandemic as many days in advance as possible. We perform the econometric estimation by OLS and use the RMSE, MSE, MAPE and SMAPE forecast performance measures to select the best lagged predictor of both dependent variables. Our objective is to estimate a leading indicator of clinical needs. Having a forecast model available several days in advance can enable governments to face the gap between needs and resources triggered by the outbreak more effectively, and thus reduce the deaths caused by COVID-19.

Predictive models for Covid-19 focus on predicting the evolution of infections through exponential adjustment, without predefined models 1,2, or with predefined models of infections 3 and of the number of deaths 4. These models can be highly complex 3, are based on very detailed epidemiological features of the disease studied, and can allow clinical strategies to be analyzed, yet their predictive accuracy is low 4. This makes them ineffective for prediction purposes and prevents health authorities from having reliable predictions in advance.

Methodology
We propose the following method (Delayed Elasticity Method, DEM). Using officially published data from the Johns Hopkins University CSSE 5, we econometrically estimate the following equation:

$$\ln(D_t) = \alpha + \beta_i \ln(C_{t-i}) + \varepsilon_t$$

where $i = 1, 2, \dots, 10$ is the number of delays (in days) of the explanatory variable, $D_t$ is the total number of deaths up to day $t$, and $C_t$ is the number of cases detected up to day $t$.
The coefficient $\beta_i$ on the lagged explanatory variable is what in economics is called an elasticity. It represents the relationship between the relative variations of the dependent variable (total deaths, $D_t$) and the independent variable (detected cases $i$ days earlier, $C_{t-i}$):

$$\beta_i = \frac{\partial \ln(D_t)}{\partial \ln(C_{t-i})} \approx \frac{\Delta D_t / D_t}{\Delta C_{t-i} / C_{t-i}}$$

After estimating the equation with each of the lags, we select the one with the best forecast performance, that is, the one that minimizes forecasting errors. We calculate RMSE 3 as an indicator of predictive accuracy. Other indicators can be used, such as MAE, MAPE or SMAPE. RMSE is defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\hat{y}_t - y_t\right)^2}$$

where $N$ is the number of out-of-sample observations, which we use to measure the forecast performance of our estimate, $\hat{y}_t$ is the estimated value of the dependent variable, and $y_t$ is the actual value. Finally, we select the estimate with the lagged explanatory variable that shows the lowest value of this indicator, and we make the corresponding prediction in total values.
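The lag-selection procedure described above can be sketched in a few lines. This is a minimal illustration, not the original implementation: it assumes a log-log specification fitted by OLS, computes RMSE on the log scale of the dependent variable, and runs on synthetic data (not the Johns Hopkins series) built so that deaths follow cases with a seven-day delay and an elasticity of 0.8. The function names `fit_loglog_ols` and `select_best_lag` are hypothetical.

```python
import numpy as np

def fit_loglog_ols(cases, deaths, lag, n_holdout):
    """Fit ln(deaths_t) = a + b * ln(cases_{t-lag}) by OLS.

    The last n_holdout observations are kept out of the fit and used
    to compute the out-of-sample RMSE (on the log scale, an assumption).
    Returns (a, b, rmse).
    """
    x = np.log(cases[:-lag])   # ln C_{t-lag}
    y = np.log(deaths[lag:])   # ln D_t, aligned with the lagged cases
    x_train, y_train = x[:-n_holdout], y[:-n_holdout]
    x_test, y_test = x[-n_holdout:], y[-n_holdout:]
    design = np.column_stack([np.ones_like(x_train), x_train])
    (a, b), *_ = np.linalg.lstsq(design, y_train, rcond=None)
    forecast = a + b * x_test
    rmse = float(np.sqrt(np.mean((forecast - y_test) ** 2)))
    return float(a), float(b), rmse

def select_best_lag(cases, deaths, lags=range(1, 11), n_holdout=3):
    """Estimate one model per lag and return the lag with the lowest RMSE."""
    results = {lag: fit_loglog_ols(cases, deaths, lag, n_holdout) for lag in lags}
    best = min(results, key=lambda lag: results[lag][2])
    return best, results

# Synthetic cumulative series (illustrative only): polynomial growth in cases,
# deaths driven by cases seven days earlier with elasticity 0.8.
days = np.arange(-7, 40)
cases_full = 50.0 * (days + 8) ** 3
deaths = np.exp(-3.0) * cases_full[:-7] ** 0.8   # days 0..39
cases = cases_full[7:]                           # days 0..39

best_lag, results = select_best_lag(cases, deaths)
print("best lag:", best_lag)
print("estimated elasticity:", round(results[best_lag][1], 4))
```

Because the synthetic deaths are an exact function of cases seven days earlier, the seven-day model fits without error and is selected. With real data the RMSE differences between lags are smaller, and the choice of hold-out window matters.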

Results
The estimation sample spans from March 4 to March 29, 2020. We left March 30, March 31 and April 1 as out-of-sample observations in order to measure the forecast performance of the estimated model. Estimation is performed with the OLS estimator. We select the model with a delay of seven days, since it shows the lowest RMSE value. The model that shows the best forecast performance is the following:

$$\widehat{\ln(D_t)} = \hat{\alpha} + 0.8227 \, \ln(C_{t-7})$$

where $D_t$ is the total number of deaths up to day $t$ and $C_{t-7}$ is the number of cases detected up to seven days earlier. The delayed elasticity, $\hat{\beta}_7 = 0.8227$, implies that a 1% increase in the number of infected cases predicts a 0.82% increase in the number of deaths seven days later. The estimate presents a high goodness of fit, with an R-square of 0.99. Table 1 shows the number of actual and estimated deaths, as well as the errors for each time period. To obtain the total number of deaths, we carry out the following transformation:

$$\hat{D}_t = \exp\left(\widehat{\ln(D_t)}\right)$$
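The elasticity reading of the coefficient can be checked with a little arithmetic. The sketch below assumes a log-log specification, so that multiplying cases by a given ratio multiplies predicted deaths by that ratio raised to the elasticity, and it assumes that forecasts on the log scale are converted to totals by exponentiation. The function names are illustrative and do not come from the original study.

```python
import math

def implied_pct_change(elasticity, pct_change_cases):
    """Percentage change in predicted deaths implied by a percentage
    change in lagged cases, under a log-log model (an assumption)."""
    ratio = 1.0 + pct_change_cases / 100.0
    return (ratio ** elasticity - 1.0) * 100.0

def to_level(log_forecast):
    """Convert a forecast of ln(deaths) into a forecast of total deaths."""
    return math.exp(log_forecast)

beta = 0.8227  # delayed elasticity reported in the text

small = implied_pct_change(beta, 1.0)    # effect of a 1% rise in cases
large = implied_pct_change(beta, 10.0)   # effect of a 10% rise in cases
print(f"1% more cases  -> {small:.2f}% more deaths seven days later")
print(f"10% more cases -> {large:.2f}% more deaths seven days later")
```

For a 1% change the result is about 0.82%, matching the interpretation in the text. For a 10% change it is about 8.2%, slightly below 10 times the elasticity, because the elasticity is a local approximation that is exact only for small changes.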

Discussion
The result of the DEM is a model with very high predictive accuracy, offering a seven-day advance forecast of deaths in the US. This advance forecast can allow the authorities to improve resource planning in order to close the gap between clinical needs and health resources caused by the epidemic.
The DEM can be replicated for any country, state, region, city or hospital area. It is also applicable to clinical outcomes other than deaths and can be used to anticipate health needs (ICUs, hospital beds…).
The DEM does, nevertheless, have certain shortcomings that must be taken into account and that basically concern the available data. It uses only one independent variable: detected cases. If the number of tests for possible Covid-19 infected cases were to be increased, the model would tend to overestimate future deaths and would need to be recalibrated. The same effect would be caused by applying new clinical strategies that reduce the fatality ratio. These effects can be corrected by performing a re-estimation when the model loses its predictive accuracy.

References
5. Dong, Ensheng, Hongru Du, and Lauren Gardner. "An interactive web-based dashboard to track COVID-19 in real time." The Lancet Infectious Diseases (2020).