Background
Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM).
Methods
Using COVID-19 data for India from March 15 to June 18 to train the models, we generate predictions from each of the five models from June 19 to July 18. To compare prediction accuracy with respect to reported cumulative and active case counts and cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models.
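As a minimal sketch of the accuracy metric, the function below computes SMAPE under one common definition: the mean over the test period of the absolute error divided by the average of the absolute observed and predicted values, expressed as a percentage. The exact variant used in the study is not stated here, so this form, the function name `smape`, and the treatment of days where both values are zero are assumptions made for illustration.

```python
def smape(observed, predicted):
    """Symmetric mean absolute prediction error, in percent.

    Assumed definition (not confirmed by the source):
        100 / T * sum(|P_t - O_t| / ((|O_t| + |P_t|) / 2))
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one time point is required")

    total = 0.0
    for o, p in zip(observed, predicted):
        denom = (abs(o) + abs(p)) / 2
        if denom == 0:
            # Both values are zero: treat the day as a perfect prediction.
            continue
        total += abs(p - o) / denom
    return 100 * total / len(observed)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the study).
    obs = [1000, 1100, 1250]
    pred = [980, 1150, 1200]
    print(round(smape(obs, pred), 2))
```

Under this definition the metric is bounded between 0 and 200, and it does not depend on the scale of the counts, which makes it usable across cumulative cases, active cases and deaths.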
Results
For active case counts, SMAPE values are 0.72 (SEIR-fansy) and 33.83 (eSIR). For cumulative case counts, SMAPE values are 1.76 (baseline), 23. (eSIR), 2.07 (SAPHIRE) and 3.20 (SEIR-fansy). For cumulative death counts, the SMAPE values are 7.13 (SEIR-fansy) and 26.30 (eSIR). For cumulative cases and deaths, we compute Pearson’s and Lin’s correlation coefficients to investigate how well the projected and observed reported COVID-19 counts agree. Three models (SAPHIRE, SEIR-fansy and ICM) also return total counts (the sum of reported and unreported cases). We compute underreporting factors as of June 30 and note that the SEIR-fansy model yields the highest underreporting factor for active cases (6.10) and cumulative deaths (3.62), while the SAPHIRE model yields the highest underreporting factor for cumulative cases (27.79).
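The agreement measures and the underreporting factor mentioned above can be sketched as follows. This is an illustrative implementation, not the study's code: Lin's concordance correlation coefficient is written in its usual population-moment form, and the underreporting factor is assumed to be the ratio of total (reported plus unreported) counts to reported counts, which the text does not define explicitly. All function names are hypothetical.

```python
import math


def _moments(x, y):
    """Return means, population variances and covariance of two series."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("series must be non-empty and of equal length")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, vx, vy, cov


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    _, _, vx, vy, cov = _moments(x, y)
    return cov / math.sqrt(vx * vy)


def lin_ccc(x, y):
    """Lin's concordance correlation: agreement with the 45-degree line."""
    mx, my, vx, vy, cov = _moments(x, y)
    return 2 * cov / (vx + vy + (mx - my) ** 2)


def underreporting_factor(total, reported):
    """Assumed definition: total counts divided by reported counts."""
    if reported <= 0:
        raise ValueError("reported count must be positive")
    return total / reported


if __name__ == "__main__":
    # Hypothetical series: projections shifted upward by a constant.
    observed = [1.0, 2.0, 3.0]
    projected = [2.0, 3.0, 4.0]
    print(pearson(observed, projected), lin_ccc(observed, projected))
```

The example illustrates why both coefficients are useful together: a projection that tracks the observed trend but is systematically offset keeps a Pearson correlation of 1, while Lin's coefficient drops because it also penalizes departures in location and scale.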
Conclusions
In this comparative paper we describe five different models used to study the transmission dynamics of SARS-CoV-2 in India. While simulation studies are the only gold-standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case counts against observed data on a test period. Predictions of the daily number of active cases vary appreciably across models. The largest variability across models is observed in predicting the “total” number of infections, including both reported and unreported cases. The degree of underreporting has been a major concern in India.