According to the World Health Organization (WHO), as of October 19, 2021, there have been more than 240 million confirmed cases of COVID-19 globally, causing the loss of more than 4.9 million lives [1]. The wide and rapid spread of the SARS-CoV-2 virus precipitated a global health and economic crisis: the WHO reported a disruption to essential health services that affected 90% of countries between March and June 2020 [2], and the World Bank estimates that the global economy has experienced the deepest recession since the Second World War and that an upsurge in extreme poverty is to be expected [3, 4].
Governments have introduced physical distancing policies, hygiene protocols, and total or partial lockdowns to control virus transmission, but many countries that eased lockdown restrictions saw a resurgence of COVID-19 cases. This made the development of a vaccine to combat COVID-19 one of the most urgent public health endeavors in modern history. As of October 19, 2021, there was one vaccine approved and three approved for emergency use in the United States, four approved for emergency use in the European Union [5], and 127 candidate vaccines in clinical evaluation and 194 in preclinical studies [6], with more vaccines on track to complete clinical trials and possibly receive regulatory authorization for use.
To evaluate a vaccine candidate’s efficacy and safety, it is critical that vaccine trials are conducted in the right populations at the right time, both to increase the statistical power to estimate the product’s true efficacy accurately and to reduce the required sample size. Ideally, randomized controlled trials for vaccines are conducted in sites and population groups where there is a high incidence rate and where a significant proportion of the population remains susceptible to infection. Infectious disease models can support decision-making in vaccine trial site selection by predicting which settings and populations are most likely to have high incidence rates during the periods when trials are scheduled to occur. In addition, models can help predict the expected rate of infection endpoints, which is needed to calculate the required study sample size. Between June and August 2020, we used an agent-based model, Covasim [7], to project COVID-19 incidence and diagnosis rates for 72 locations across Australia (AU), Belgium (BE), Brazil (BR), France (FR), Italy (IT), Mexico (MX), the Netherlands (NL), South Africa (ZA), Spain (ES), and the United States (US). Covasim was one of multiple models used to project the incidence rate of infections and of diagnosed (i.e., tested positive) cases for the 6-week window of 15 September to 31 October 2020, which was 2–3 months after the projections were made. This window was the period during which the Janssen COVID-19 vaccine was scheduled to start enrolment in efficacy clinical trials. Projections from these models (including Covasim and others, e.g., the MIT model) were considered in the selection of clinical trial sites alongside logistics, feasibility, time to enrolment, and other factors.
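The dependence of required sample size on incidence can be illustrated with a short calculation. The sketch below is not the method used for the trial described here; it assumes an event-driven design with a fixed target number of infection endpoints, a constant incidence rate over follow-up, no dropout, and a vaccine that scales the infection hazard by (1 - efficacy). The function name and all parameter values are hypothetical.

```python
import math


def participants_needed(target_events, incidence_per_person_year,
                        follow_up_years, vaccine_efficacy,
                        fraction_vaccinated=0.5):
    """Approximate enrolment needed to observe `target_events` endpoints.

    Assumptions (illustrative only): constant hazard, no dropout, and a
    vaccine that multiplies the hazard by (1 - vaccine_efficacy).
    """
    if not 0.0 <= vaccine_efficacy <= 1.0:
        raise ValueError("vaccine_efficacy must be between 0 and 1")
    if not 0.0 < fraction_vaccinated < 1.0:
        raise ValueError("fraction_vaccinated must be strictly between 0 and 1")
    if incidence_per_person_year <= 0 or follow_up_years <= 0:
        raise ValueError("incidence and follow-up must be positive")

    # Probability that one participant becomes a case during follow-up.
    p_placebo = 1.0 - math.exp(-incidence_per_person_year * follow_up_years)
    p_vaccine = 1.0 - math.exp(
        -incidence_per_person_year * (1.0 - vaccine_efficacy) * follow_up_years
    )

    # Expected endpoints contributed by one randomly allocated participant.
    events_per_participant = (
        (1.0 - fraction_vaccinated) * p_placebo
        + fraction_vaccinated * p_vaccine
    )
    return math.ceil(target_events / events_per_participant)


if __name__ == "__main__":
    # Hypothetical inputs: 150 target endpoints, 6 months of follow-up,
    # 60% assumed efficacy, compared at two annual incidence rates.
    for incidence in (0.02, 0.10):
        n = participants_needed(150, incidence, 0.5, 0.6)
        print(f"incidence {incidence:.2f}/person-year -> about {n} participants")
```

Under these assumptions, a higher projected incidence at a site reduces the enrolment needed to reach the same number of endpoints, which is why incidence projections matter for site selection.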
Infectious disease models, no matter how sophisticated, are inherently limited by the data used to calibrate them and the assumptions they make about future conditions. Model validation is an important mechanism for understanding which inputs and assumptions play the greatest role in determining model accuracy. For the 72 settings modelled, the projection period has now passed, so it is possible to compare model outcomes with actual outcomes in each setting and to determine any correlates of accuracy. This information is valuable for improving the future reliability of infectious disease models, as well as for improving our fundamental understanding of the COVID-19 pandemic.
In this study, we assess three things: whether the actual data for each of the 72 settings fell within the projected confidence limits of 5 different scenario analyses; the a priori accuracy of the ranking of projected COVID-19 incidence for potential trial sites; and the usefulness of a statistical regression model to identify policy, socioeconomic, and other factors as predictors of model accuracy.
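The first two assessments can be sketched in a few lines of code. The example below is only an illustration of the general idea and not necessarily the metrics used in this study: it computes the fraction of settings whose observed value falls inside the projected limits, and uses Spearman's rank correlation (an assumed choice of ranking metric) to compare projected and observed site rankings. All function names and numbers are hypothetical.

```python
def coverage(observed, lower, upper):
    """Fraction of settings whose observed value lies within [lower, upper]."""
    inside = [lo <= obs <= hi for obs, lo, hi in zip(observed, lower, upper)]
    return sum(inside) / len(inside)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Hypothetical incidence values for four sites (not data from the study).
    projected = [12.0, 30.0, 8.0, 21.0]
    lower = [6.0, 18.0, 3.0, 15.0]
    upper = [20.0, 45.0, 14.0, 28.0]
    observed = [17.0, 50.0, 5.0, 19.0]

    print("coverage:", coverage(observed, lower, upper))
    print("rank correlation:", spearman(projected, observed))
```

A rank-based comparison is relevant to site selection because a model can misjudge absolute incidence and still order candidate sites correctly.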