In the first scenario, the use of the natural log of U.S. COVID-19 total confirmed cases as the dependent variable, rather than the number of observed cases directly, was a novel approach in COVID-19 modeling. This approach was useful in determining the accuracy of the b1 and b3 coefficients: b1 is the peak of the natural log of cases, and b3 is the inflection point midway along the S-curve, which corresponds to the day number at which growth in the natural log of cases begins to decelerate. The small variability in both coefficients compared to models from earlier in the epidemic, especially for the inflection point, which varied by less than 1 day, has pragmatic importance. In a real-world application, this use of a three-parameter logistic model shows highly reliable predictability in the day number representing the point of deceleration for case growth, whether the model is fitted earlier in the epidemic or at later points. Estimation of peak cases (b1) is discussed as a possible limitation later in this paper.
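As an illustration of the roles of b1 and b3 described above, the following is a minimal sketch that assumes the common parameterization y(t) = b1 / (1 + exp(-b2 (t - b3))), which may differ from the exact form used in the analysis. The value of b1 is the one reported in the Limitations section; the values of b2 and b3, and the function name `three_param_logistic`, are hypothetical and chosen only for illustration.

```python
import math

def three_param_logistic(t, b1, b2, b3):
    """Assumed form: b1 = upper asymptote, b2 = growth rate, b3 = inflection day."""
    return b1 / (1.0 + math.exp(-b2 * (t - b3)))

b1 = 14.5026   # peak of ln(cases), as reported in the Limitations section
b2 = 0.08      # hypothetical growth rate
b3 = 30.0      # hypothetical inflection day number

# At the inflection day the curve sits exactly halfway to its asymptote.
at_inflection = three_param_logistic(b3, b1, b2, b3)

# Long after the inflection day the curve approaches b1.
late_value = three_param_logistic(365, b1, b2, b3)

print(f"ln(cases) at day b3: {at_inflection:.4f} (b1 / 2 = {b1 / 2:.4f})")
print(f"ln(cases) at day 365: {late_value:.4f} (b1 = {b1:.4f})")
```

Under this assumed form, b3 fixes where deceleration begins independently of the asymptote, which is consistent with b3 remaining stable even when the estimate of b1 shifts.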
The independent t-test analyses carried out in the first scenario also have practical applications. For example, a statistically significant difference between corresponding coefficients from different models indicates that there are improvements in different parameters of the epidemic. On the other hand, a lack of statistical significance between the respective coefficients signifies that case numbers are stabilizing to the point that no statistically discernible differences appear in these parameters: peak cases, growth rate, or time to deceleration. Both results signify improvements in the epidemic that are not immediately obvious from the visual models or statistical analyses alone. Therefore, this is one measure for assessing the effectiveness of epidemiologic public health measures when they are being adhered to. It also has implications as a tool for public health leaders, to help in their communication with the general public and to encourage adherence to proposed public health safety measures.
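The comparison of a coefficient across two models can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the exact test statistic, so the sketch assumes the difference between two estimates is divided by the square root of the sum of their squared standard errors, and it uses a normal approximation for the two-sided p-value instead of a t distribution with explicit degrees of freedom. The function name `compare_coefficients` and all numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist

def compare_coefficients(est_a, se_a, est_b, se_b):
    """Return (statistic, two-sided p) for est_a - est_b.

    Assumes independent estimates and a normal approximation.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    stat = (est_a - est_b) / se_diff
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    return stat, p_value

# Hypothetical inflection-point estimates (day numbers) from two model runs.
stat, p_value = compare_coefficients(87.2, 0.3, 87.9, 0.4)

if p_value < 0.05:
    print(f"statistic={stat:.2f}, p={p_value:.3f}: coefficients differ")
else:
    print(f"statistic={stat:.2f}, p={p_value:.3f}: no discernible difference")
```

With these made-up inputs the difference is not significant at the 0.05 level, which in the framing above would be read as the parameter stabilizing between model runs.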
In the second scenario, a first-derivative model generated from a base three-parameter logistic model was useful as another method to objectively gauge control of the epidemic by forecasting CFR trends, specifically during times when public health measures had already been instituted. This approach would have been useful when cases of the COVID-19 epidemic were beginning to flatten in the United States, such as at the time this scenario’s model was generated (April 27). At such a time, the general public would be concerned with how long it would be until restrictions were eased. This model would have provided encouraging information that peak mortality had passed, and it would have had utility in predicting a suitable future date to begin easing certain restrictions, based on an effective CFR of near zero. Given the individualistic culture of the United States, such information can help the general public understand that adherence to certain public health safety measures is necessary for a short while longer.
The third and final scenario has high-yield utility at a time when the COVID-19 epidemic in the United States is experiencing accelerated growth after initially plateauing for a brief interval. With a growing number of Americans choosing not to wear masks, a topic of public debate [11], a model such as this could serve as a tool to guide public health leadership in making individualized epidemiologic recommendations that take into account the non-uniformity of American culture. The first-derivative model in this scenario proved to be a novel approach to examining actual daily changes in mortality versus daily changes in case rates, a marker of the COVID-19 epidemic’s impact as cases continue to increase.
With these models, a decreasing CFR relative to increasing total cases of COVID-19 may indicate that aggressive treatments are being readily used in hospitalized patients, that those contracting the virus are not dying from it, or that the most vulnerable are being better protected. For example, it is known that older adults and individuals with certain chronic medical conditions are most susceptible to severe illness or death from COVID-19 [12]. If CFR continues a downward trend, that may indicate that more young individuals with mild or no symptoms are contracting the virus and not dying from it. On the other hand, if according to this model CFR starts to increase relative to accelerated case increases, then that may indicate that the young and healthy are starting to transmit the virus back to older individuals such as parents, relatives, and neighbors. This type of conclusion is not only a critical objective measure of the current course of the virus in the population but can also serve to guide public health safety recommendations that are focused, specific, and individualized to help control the epidemic. Such recommendations may target the individuals most at risk of death from COVID-19 and would also better take into account the current political and cultural climate in the United States.
Limitations
The accuracy of three-parameter logistic models in predicting peak cases depends on several factors remaining constant in the United States, such as the rate of testing, the rate of infection, the rate of recovery, and a predictable, constant rate of change in deaths. Any easing of restrictions, enactment of new policies or public safety measures, or improvement in testing and/or treatments will affect all of these factors, and consequently affect the growth rate and predicted peak cases. This is especially applicable to models that use the actual observed case numbers as the dependent variable, where significant differences in growth rate and peak cases are seen depending on the point in the epidemic at which the model was calculated. A similar limitation is discussed in other papers. In a study using the three-parameter logistic model to forecast COVID-19 cases in China, the authors noted that the model failed to estimate case numbers in the early stages of the epidemic, and was only accurate when cases were approaching an apparent maximum [5]. In another study using a five-parameter logistic model to estimate COVID-19 cases in the United States, data from March 21 to April 4 were used, estimating a peak of about 800,000 cases [6]. As is evident from the present number of cases, the model did not accurately predict present case numbers. The authors note that the model’s long-term predictions of future new cases may not be accurate, and that it was limited to the data collected over the short interval in the study [6]. The study conducted in this paper found limitations similar to those discussed in these two studies [5, 6] in the use of three-parameter logistic models to model COVID-19 data.
Of note, in the three-parameter logistic model study for China [5], the authors tested for heteroskedasticity in the data and examined residuals, two methods that were not part of this paper. Also, unlike a different study discussing use of the logistic model for cases in China [3], this paper did not examine applicability to other countries.
In the first scenario, the estimated b1 coefficient, the peak of the natural log of U.S. COVID-19 cases, is calculated at 14.5026. This gives an estimate of 1,987,921 peak cases when e^b1 is calculated. When the same model was run with the observed case data as the dependent variable, the predicted peak was 3,414,134 cases, which is closer to the present number of U.S. COVID-19 cases as noted in the Introduction. The best use of the suggested three-parameter logistic model in the first scenario, with the natural log of cases as the dependent variable, would therefore be to estimate the inflection point coefficient b3, since it showed minimal variability. This is the point at which the growth of the natural log of cases would start to decelerate, and it is effectively the same point of deceleration for the exponential growth of the actual observed case data.
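The back-transformation from b1 to peak cases can be checked with a short calculation. The sketch below uses only the two figures reported above; the variable names are illustrative, and the final sensitivity line is an added observation that follows from the exponential itself, not a result from the analysis.

```python
import math

b1_log = 14.5026                  # reported peak of ln(cases)
peak_from_log_model = math.exp(b1_log)

peak_from_raw_model = 3_414_134   # reported peak from the observed-case model
ratio = peak_from_raw_model / peak_from_log_model

# Exponentiation magnifies small errors in b1:
# a shift of 0.1 in b1 scales the implied peak by exp(0.1).
scale_per_tenth = math.exp(0.1)

print(f"Peak cases implied by b1: {peak_from_log_model:,.0f}")
print(f"Raw-model peak / log-model peak: {ratio:.2f}")
print(f"Peak multiplier for a 0.1 shift in b1: {scale_per_tenth:.3f}")
```

Because a shift of only 0.1 in b1 changes the implied peak by roughly 10 percent, the log-scale model is better suited to estimating b3 than to estimating peak cases.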
The first-derivative models have few limitations, chiefly because they track changes in actual rates, whether in cases or in mortality, as discussed in the second and third scenarios. The peak of a first-derivative model corresponds to the inflection point of its base three-parameter logistic model, the point at which rate deceleration begins. Their utility in estimating “zero rate,” however, is limited by the same factors that affect peak-case estimates in logistic models.
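The correspondence between the peak of the first-derivative curve and the inflection point of the base model can be verified numerically. The following is a minimal sketch assuming the base model has the form y(t) = b1 / (1 + exp(-b2 (t - b3))), whose derivative is b1 b2 exp(-b2 (t - b3)) / (1 + exp(-b2 (t - b3)))^2. All parameter values and the function name `logistic_rate` are hypothetical.

```python
import math

def logistic_rate(t, b1, b2, b3):
    """First derivative of the assumed three-parameter logistic curve."""
    z = math.exp(-b2 * (t - b3))
    return b1 * b2 * z / (1.0 + z) ** 2

# Hypothetical coefficients: asymptote, growth rate, inflection day.
b1, b2, b3 = 60000.0, 0.12, 45

# Scan whole day numbers and pick the day with the highest daily rate.
days = range(0, 121)
peak_day = max(days, key=lambda t: logistic_rate(t, b1, b2, b3))
peak_rate = logistic_rate(peak_day, b1, b2, b3)

print(f"Day of peak daily rate: {peak_day} (b3 = {b3})")
print(f"Peak daily rate: {peak_rate:.1f} (b1 * b2 / 4 = {b1 * b2 / 4:.1f})")

# The rate only approaches zero asymptotically, so any "zero rate" date
# depends on a chosen threshold and on the estimated asymptote b1.
tail_rate = logistic_rate(120, b1, b2, b3)
print(f"Daily rate at day 120: {tail_rate:.2f}")
```

Under this form the peak day depends only on b3, while the height of the peak and the timing of any near-zero threshold scale with b1, which is why the zero-rate estimate inherits the peak-case limitation.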