Nonlinear time‐series forecasts for decision support: short‐term demand for ICU beds in Santiago, Chile, during the 2021 COVID‐19 pandemic

Abstract In Chile, due to the explosive increase of new Coronavirus disease 2019 (COVID‐19) cases during the first part of 2021, the ability of health services to accommodate new incoming cases was jeopardized. It has become necessary to be able to manage intensive care unit (ICU) capacity, and for this purpose, monitoring both the evolution of new cases and the demand for ICU beds has become urgent. This paper presents short‐term forecast models for the number of new cases and the number of COVID‐19 patients admitted to ICUs in the Metropolitan Region in Chile.


Introduction
Coronavirus disease 2019 (COVID-19) (Andersen et al., 2020) spread worldwide in a few months since its identification in December 2019. Causal theories of accidental release from a lab and zoonotic spillover both remain viable (Bloom et al., 2021), and it is an ongoing pandemic at the time of writing. Like all viruses, coronaviruses mutate over time. Although most mutations have little to no impact on the virus' properties, some mutations may affect the virus's properties, such as how easily it spreads, the associated disease severity, or the performance of vaccines, therapeutic medicines, diagnostic tools, or other public health and social measures (World Health Organization, 2022).
This research aims to provide a framework for decision support for policymakers who need to know in advance how much intensive care unit (ICU) bed capacity needs to be provided to fulfill demand for COVID-19-related ICU hospitalizations via time-series-based forecasts of both the number of new COVID-19 cases and the corresponding ICU bed utilization.
We illustrate our proposed methodology with the case of the Metropolitan Region of Santiago, the capital city of Chile, with data collected during the second wave of the pandemic, the initial four months of 2021. We further implemented our proposed method for the third wave of the pandemic, covering September to November 2021, to verify the robustness of our proposal.

Context
The global healthcare disaster caused by the COVID-19 pandemic has endangered and disrupted the lives of billions around the world, resulting in illnesses and deaths (Nagurney et al., 2021). By late July 2020, with declining rates of new cases, Chile had achieved a partial control of the growth rates of COVID-19. However, with the policy of indiscriminate issuance of Summer Vacation Permits by the government, as announced on December 30, 2020 (Chilean Ministry of Health, 2020) and continuing until late February 2021, the spread of the virus during the first quarter of 2021 became more pronounced than during its first wave in 2020. Subsequent lockdown policies were announced in March and April 2021 but have not been effectively enforced by the authorities. Moreover, as Joshi and Musalem (2021) found, lockdown policies have seen their desired effect (in reduced mobility) sliced in half after a month, and the empirical evidence from Blanco et al. (2021) suggests that although confinement measures worldwide have been mostly effective in reducing the rate of COVID-19 contagion and deaths, the effectiveness of the containment measures has been mixed, depending on factors such as timeliness, enforcement by the authorities, compliance of the population, and the length of the application of said measures.
Santiago is the capital city of Chile, the largest city of the country with a population of around seven million (Carranza et al., 2022), located in the central zone of the country, and located in a mediterranean valley surrounded by mountains which affect its climate, with four clearly distinguishable seasons. Seasonality is relevant for the spread of COVID-19, as cold and temperate climates are more favorable to the spread of the virus, as compared to warmer climates (Araújo and Naimi, 2020).
With the ability of health services to accommodate new incoming cases in jeopardy, it has become an urgent necessity to be able to manage ICU capacity, and for this purpose to obtain accurate forecasts of COVID-19 demand for ICU beds. Thankfully for Chile in general, and Santiago in particular, authorities have been very proactive in building ICU bed capacity (Contardo and Costa, 2022) within a fully integrated (public-private) health sector, whenever occupancy trespassed any dangerous thresholds, and well beyond the original capacity of beds of any type. While this policy was executed considering any budgetary constraints as secondary, it also became very important to be able to forecast the needed capacity to avoid waste in building unneeded capacity if possible.
With this reality in mind, and understanding the importance of modeling/computational techniques in order to improve healthcare delivery and informatics (Gupta and Sharda, 2013;Tang et al., 2022), we have set ourselves to forecast and model the short-term demand for ICU beds in Santiago, Chile.

Literature review and modeling approaches
A natural inclination for modelers attempting to predict demand for COVID-19 ICU beds consists in using variations of the so-called susceptible-infectious-recovered (SIR) model (Keeling and Danon, 2009;Papo et al., 2020;Rahimi et al., 2021) or the susceptible-exposed-infectious-removed (SEIR) model (Aron and Schwartz, 1984;Kuniya, 2020;Liu et al., 2021;Rahimi et al., 2021;Tagliazucchi et al., 2020). However, as different authors have remarked (Gitto et al., 2021;Manca et al., 2020), while SIR/SEIR-type models can be useful for running parametric predictive scenarios, they are based on many assumptions and hypotheses that are quite sensitive to the selection of proper values of the adaptive parameters and functional description, which might not be as appropriate when real-time data are incomplete.
Similarly, other compartment models to study ICU beds utilization (see Goic et al., 2021 as an example) require accurate information on the count of infectious individuals who show COVID-19 symptoms, critically ill people who need an ICU bed, and discharged patients from the ICU, and if such data is not available, an alternative approach becomes necessary.
A key input variable for the short-term forecast of demand for ICU COVID-19 beds is the count of newly infected cases, making it necessary to also model those cases in order to use them for the demand forecasts. Some authors (e.g., Meo et al., 2020) have produced forecasts using standard linear regression models, which are less than ideal given the nature of discrete count data.
We have, thus, decided to use phenomenological models to forecast the count of newly infected cases and, with that input, obtain ICU bed demand forecasts. Phenomenological models (e.g., Chowell, 2017;Chowell et al., 2019;Wang et al., 2012;Roosa et al., 2020b;Chowell et al., 2016, among others) have been previously applied to various infectious disease outbreaks including other respiratory illnesses. This family of models can produce good forecasts using small samples, even in the absence of hypotheses about the population, the rate of diffusion, and the upper limit of the infection curve (Gitto et al., 2021;Ma et al., 2014). Real-time short-term forecasts generated from such models are useful to guide the allocation of resources that are critical to bring the epidemic under control. For instance, Maier and Brockmann (2020) fitted a subexponential growth curve for China, whereas Remuzzi and Remuzzi (2020) used exponential growth models to predict the early propagation of COVID-19 in Italy. Sadly, exponential growth models are unrealistic in scenarios where additional information is available. Several authors (e.g., Aviv-Sharon and Aharoni, 2020; Vicuña et al., 2021;Roosa et al., 2020a;Torrealba-Rodríguez et al., 2020;Zreiq et al., 2020;Chen et al., 2020, among others) propose instead to use generalized variations of the Richards/logistic curve model (Richards, 1959) to generate forecasts of COVID-19 cases in several countries. Gitto et al. (2021) were the first to propose the use of Harvey's model (Harvey, 1984) to forecast the diffusion of an epidemic precisely in the context of bed occupancy in Italy during the first wave of COVID-19. Moreau (2020) used Weibull curves to estimate new cases of COVID-19 in Brazil.
To conduct our own forecasts, we model the evolution of new cases using three phenomenological curves (Richards/logistic, Harvey, Weibull), with different distributional assumptions on the error term. In Section 2, we describe our modeling strategy in detail, in Section 3 we present and discuss our empirical results for the second wave of the pandemic in Santiago de Chile, in Section 4 we present a robustness analysis for the third wave of the pandemic in the second half of 2021, and in Section 5 we conclude.

Data
Data used in this work have been retrieved from the epidemiological reports from the Chilean Ministry of Health (CMoH). These epidemiological reports are updated by the independent expert panel working with the CMoH, and are updated regularly to adjust for errors and misreports. These most accurate counts are collected on the Chilean Ministry of Science (CMoSc) website at https://www.minciencia.gob.cl/covid19. These are the data that other published studies, such as Vicuña et al. (2021); Canals et al. (2020); Tariq et al. (2021), have utilized for the Chilean case.
The dataset includes the total count of confirmed cases in the Metropolitan Region of Santiago, according to the polymerase chain reaction test (PCR) prognosis notification date (as registered by the physician on the CMoH surveillance system) as well as the daily demand for ICU beds for COVID-19 patients. Fortunately, since the COVID-19 response system has integrated the bed availability network, observed demand is not censored: Every time that capacity has approached 100% occupancy, the system has been able to open new slots by reconverting beds to ICU use as needed.
Since our study is based on secondary data from the CMoH's official daily public reports as published by the CMoSc, it did not require approval from an ethics committee.

Modeling strategy
Our modeling strategy follows a two-stage model to forecast the number of COVID-19 ICU beds demand per day. For the first stage, we forecast the number of new COVID cases, which is a crucial input for the second stage, particularly for our out-of-sample, short-term forecasts of the demand for ICU COVID-19 beds. In the second stage, we forecast the number of beds using the lagged measure of the number of cases the previous week as well as the demand for ICU beds the day before.
It is possible for somebody to think that, because there exists a time lag of approximately one week between a patient being infected and the development of severe/critical symptoms that can lead him/her to an ICU, it might be unnecessary to model the first stage for the number of new cases. However, in order to build capacity beyond the seven-day window it takes for new patients to demand ICU beds (Faes et al., 2020), it becomes crucial to use a forecast for the new cases to be able to have estimates of ICU utilization for days beyond the average incubation period.

First stage: new cases
To model new COVID-19 cases, we use phenomenological models. Many functional forms have been used in epidemic forecasting research. We focus on three popular nonlinear specifications: Richards/logistic (Richards, 1959), Weibull (Weibull, 1951), and Harvey (Harvey, 1984). In terms of notation, let t be a daily time index; X t the observed number of new cases in Santiago on day t; The logistic curve is given by Equation (1), whereas the Weibull curve for accumulated cases corresponds to Equation (2): Given that the epidemiological curves in (1) and (2) are used to model the cumulative number of cases, in order to make a daily forecast of new cases, we use for estimation the first derivatives, λ t , yielding a measure of the daily amount of new cases: Harvey (1984)'s curve, displayed as Equation (5), provides a direct forecast for the count of new cases during a pandemic. It includes a deterministic time trend and an autoregressive term to account for the previously accumulated cases: In order to be able to estimate these curves, we have rewritten the expressions for λ t for new cases as a function of parameters ϑ j to be estimated, according to expressions (6)-(8): We thus use these specifications for the rate of new cases as parametric inputs for two probability models: (quasi-)Poisson and Gaussian. We use Gaussian for benchmarking purposes, keeping in mind that our data constitute discrete (and not continuous) variables.
Since we denoted {X t } as the number of confirmed COVID-19 cases at time t, let now X t−1 ≡ {X t−1 , X t−2 , . . .} be a collection of all past realizations of X t . Then, the Poisson model, used traditionally to analyze count data (see, e.g., Agosto and Giudici, 2020;Barroilhet et al., 2022;Biswas et al., 2020 ), has a probability function given by The key handicap of the traditional Poisson model, in this context, comes from its assumption that the mean of the daily number of cases is identical to its variance, that is, The quasi-Poisson model (as used by Biswas et al. (2020) and Vicuña et al. (2021) among others), instead, offers a generalization allowing for an overdispersion factor φ: As mentioned before, for benchmark/comparison purposes, we also fit a Gaussian (normal) model, with mean E(X t |X t−1 ) = λ t and variance Var(X t |X t−1 ) = σ 2 t . Its density is given by In this first stage, our objective is to obtain the rate of new COVID-19 cases via the conditional means λ t specified through Equations (6)-(8), as well as other covariates. Specifically, (i) the variable WeekDay j,t is an indicator variable for each day of the week, capturing the effect of the reduced capacity in the release of PCR test results during weekends (see, e.g., Vicuña et al., 2021;Nakagawa and Kanatani, 2021); (ii) the variable I t is an indicator capturing a full lockdown policy intervention in Santiago (in our sample, between March 27 and April 28, 2021); (iii) the parameter δ 0 is an offset (intercept) value which captures the fact that the sample under analysis did not start from zero cases. Therefore, the conditional mean used for estimation is given by Equation (9): For the quasi-Poisson case, the parameters to be estimated for the proposed curves (all collected in a vector called θ ) are estimated by maximizing the quasi-likelihood function (given by expression (10)): Similarly, for the Gaussian case, the parameters θ are estimated by maximizing a log-likelihood function as in expression (11): These results allow us to obtain an estimated value for the expected rate of arrival of new cases in day t in Santiago, λ t , which is the key input for our second-stage model to estimate the demand for ICU beds.

Second stage: demand for COVID-19 ICU beds
To model the short-term daily demand for COVID-19 ICU beds in Santiago, we propose to fit generalized Poisson and quasi-Poisson time series models. The variables chosen as predictors are (i) the (lagged) count of new COVID-19 diagnosed cases and (ii) an autoregressive term accounting for the utilization of ICU beds by COVID-19 patients in the immediately preceding days.
It is important to remember that, as mentioned earlier, in the case of Santiago (and Chile in general), utilization of ICU beds by COVID-19 patients equals demand, since there has been no demand censoring effect: thus far, whenever utilization has approached 100% of available ICU beds, the integrated health system for COVID-19 has ensured that new capacity has been created for incoming COVID-19 patients needing ICU care. Therefore, the terms utilization of and demand for ICU beds by COVID-19 patients can be used indistinctly in this case.
Denote as {Y t } the stock of utilized ICU beds for COVID-19 cases in Santiago on day t. As before, let {X t } correspond to the count of new COVID-19 cases diagnosed on day t.
. .}, a collection of all realizations of {X t } and {Y t } from the beginning of the sample until t − 1. Then, the Poisson time series assumes that Y t conditional on F t−1 is well described by the following probability function: Here, the rate μ t = g(F t−1 , α) is a function of the covariates associated with the response variable Y t and parameters that need to be estimated.
The proposed model for the case of Santiago considers the following specification for the mean demand μ t : where Y t is the utilization of COVID-19 ICU beds on day t; μ t is the mean (demand) rate of the process on day t; covariate X 1,t ≡ ln( t−1 i=1 Y i ) corresponds to a Harvey-like (Harvey, 1984) growth curve component, capturing the similarities between the use of beds and the count of new cases; and covariate X 2,t ≡ X t−τ = 1 7 t−τ i=t−τ −6 X i is a seven-day moving average of the daily count of new cases (based on the average time it takes from COVID-19 onset to hospitalization; see Faes et al. 2020) with an out-of-phase lag of τ days, capturing the effect of a full week of newly infected cases on the future utilization of beds τ days ahead.

Computational implementation
To estimate the parameters of the generalized linear model given by expression (9), we used the software R. The gnm library includes the function gnm(). The iterative algorithm requires starting values for the parameters, which were obtained through the function nls() in the nls library. To estimate the corresponding model (12) for the second stage, we utilized the tscount library for discrete-time-dependent variables, in combination with the generalized linear models library glm.
To obtain confidence intervals for out-of-sample predictions, we approximated the quasi-Poisson likelihood with negative binomial distributions via a bootstrap of size 10,000, using the ciTools library in R, as described by Haman (2017). The intervals are, thus, built as follows: 1. The model in Equation (9) is fitted to obtain estimates θ and Cov( θ ). The number of simulations is set at 10,000. 2. Simulate 10,000 draws of the coefficients θ * ∼ N( θ, Cov( θ )). 3. Simulate Y * |F t−1 from the response distribution using the following approximation: 4. Determine quantiles α/2 and (1 − α/2) of the simulated conditional responses.
For the case where the out-of-sample predictions follow a Gaussian distribution, we approximated the confidence intervals via the Delta method. For the point estimate, we use a first-order Taylor expansion: where θ 0 = plim θ and ∇λ t (θ 0 ) is the gradient of the new cases curve λ t evaluated at θ 0 . The variance, then, can be approximated with Therefore, the resulting (1 − α)% confidence interval is given by

Forecast accuracy evaluation
We are interested in comparing the forecast accuracy of these models. In order to choose models, it is common practice (see, e.g., Gitto et al., 2021;Zreiq et al., 2020;Ribeiro et al., 2020 ) to identify a training sample, where the training data are used to estimate the parameters of a forecasting method, and a test sample, that is used to evaluate its accuracy. Two popularly used scale-dependent measures are the mean absolute error (MAE) and the root mean square error (RMSE), whereas the most common measure of percentage errors (thus, unit-free) is the mean absolute percentage error (MAPE): where y t+ j denotes its jth predictor based on information up to period t.
From these measures, we can choose the estimates that yield their lowest values in the out-ofsample forecasting window. To better validate the estimates and to evaluate the flexibility of the proposed models, we considered two different cutoff points for the training sample (April 15, 2021, andMay 15, 2021), and three different corresponding forecasting horizons (3, 7, and 14 days ahead).

Results and discussion
In our analyses, we first considered data from January 1 (the day after the announcement of the issue of the vacation permits, which itself marked the start of the second wave of the pandemic in Chile) until April 15, 2021, to make forecasts from April 16 onward. Subsequently, we repeated the same analyses, now using a sample including one extra month, until May 15, 2021, to make forecasts from May 16 onward. Our rationale was that, for each of the two cutoff points, given the nonlinear dynamics of the disease, the performance of each model would be different as well. As such, this would highlight the complementary aspect of each modeling approach. To generate forecasts using time series, one stands at the cutoff point in each case and uses observed values up until that date and predicts values for future, not-yet-observed values. A useful pedagogical reference on how to compute forecasts in a recursive iterative manner (to distinguish what is observable and what is not in any given period of time) can be found in Liboschik et al. (2017), section 4.
The observed count of new daily cases is displayed in Fig. 1, and the utilization of COVID-19 ICU beds is shown in Fig. 2. The vertical lines in both cases denote the two cutoff points considered. Figure 1 shows that the number of new cases peaks slightly after the first cutoff point, stabilizing shortly after. On the other hand, we can observe in Fig. 2 that the utilization of ICU beds (in spite B.F. Quiroga et al. / Intl. Trans. in Op. Res. 0 (2022)   of being a stock variable, not a flow variable) also can be modeled as a logistic growth curve, having reached a peak in late April, slowly decreasing after that date.

First-stage results
We fit the different curves we proposed above to predict the new cases of COVID-19 in Santiago, Chile. Tables 1 and 2 display the statistical significance of the estimated parameters for each of the parameters considering both cutoff points.
Looking at Table 1, under the April 15 cutoff scenario, it is plausible to observe that for a 5% significance level, the lockdown intervention parameter, ψ, is not statistically significant for the quasi-Poisson logistic curve, whereas the offset parameter δ 0 is statistically not different from zero in the Harvey curve cases. From Table 2, using the May 15 cutoff, it is possible to see that ψ is statistically insignificant for the Gaussian Harvey and Weibull cases. In the case of the two Richards/logistic curves, the ψ parameter was not properly estimated, because the effect of the entire term exp (ψI t ) was estimated to be statistically zero for those curves (meaning that the corresponding estimated value of ψ would be as close to −∞ as possible). We display our estimated epidemiological curves in Figs. 3 and 4 (quasi-Poisson and Gaussian, respectively, for the April 15 cutoff) and Figs. 5 and 6 (quasi-Poisson and Gaussian, respectively, for the May 15 cutoff). Additionally, Tables 3 and 4 contain our forecasting accuracy measures for each respective cutoff.
Analyzing the out-of-sample performance of these estimates, for the April 15 cutoff, the model that best performs is the Gaussian Richards/logistic curve. This curve yields the lowest values of MAE, MAPE, and RMSE considering forecasting horizons of 3, 7, and 14 days ahead. The results of the quasi-Poisson Richards/logistic curve are not qualitatively dissimilar. The Harvey and the     Weibull curves, instead, performed substantially worse, in that they overestimated the forecasts compared to the actual values. For the May 15 cutoff, our results are in a way opposite to what we obtained for the first cutoff: the Richards/logistic curves continued estimating a downward trajectory, yielding systematic underestimations of the count of new cases for this new period. The Gaussian Harvey and Weibull curves also underestimated the new cases compared to the actual realizations for the forecasting horizon. As one can see in Fig. 5, the quasi-Poisson Weibull and Harvey curves yield reasonable estimates for the forecasting horizon. This is also supported by the performance measures: the best MAE/MAPE/RMSE was obtained with the quasi-Poisson Weibull curve when considering a 3-to-7 day horizon, whereas the best measures for the 14-day horizon were obtained with the quasi-Poisson Harvey curve.
Tables 5 and 6 show our forecasts for some specific dates, between April 16 and 29 for the first cutoff, and between May 16 and 29 for the second, including 95% confidence forecasting intervals for each curve. From Table 5, with the April 15 cutoff, we can see that the short-term forecast intervals using the quasi-Poisson (up to three days ahead) and Gaussian (up to two days ahead) logistic curves contain the true values, underestimating the counts in the longer run, whereas the Harvey and Weibull curves always overestimate the counts. For the May 15 cutoff, we can see from Table 6 that the quasi-Poisson forecasts using the Harvey and Weibull cases perform well, containing the true values in the seven-and 14-days-ahead forecast intervals, while the logistic curves underestimate in about 60% of the true values in the 14-day forecast horizon.

Second-stage results
Now, we move onto our main variable of interest: the daily utilization of ICU beds with COVID-19 patients. We forecast this variable by estimating a mean demand model specified through Equation (12). In choosing our best forecasting model, we face a trade-off between improving the predictive ability of the model and minimizing its residual correlation. We analyzed several models, considering different choices of the corresponding lag lengths p and q and the parameter τ , and we obtained the best results by parameterizing Equation (12) with the specification presented in Equation (14): where, as before, μ t is the mean demand rate of the process on day t; Y t is the utilization of COVID-19 ICU beds on day t; covariate X 1,t = ln( t−1 i=1 Y i ) is a Harvey-like component measuring an accumulation of past demands; and covariate X 2,t = 1 7 t−1 i=t−7 X i is the average of the daily count of new cases over the past seven days. Figure 7 contains the fitted curve overlaid on the actual data. Panel (a) shows the fit of the model for the April 15 cutoff and panel (b) does the same for the May 15 cutoff. In both cases, the model captures remarkably well the empirical dynamics of the utilization of ICU beds. Tables 7 and 8  display the estimated parameters of the model for each respective cutoff point, along with standard deviations and p-values. All estimated parameters are statistically different from zero at the 5% significance level. Figures 8 and 9 display the model forecasts two weeks after the April 15 cutoff date, totaling six estimated curves from the first stage, working as inputs for the second stage. Figures 10 and 11 show the analogous results for the May 15 cutoff date. Tables 9 and 10 show the performance measures (MAE, MAPE, RMSE) for each estimated model, considering each cutoff point.
We can see that, as expected, the performance of each model in this second stage is directly affected by the performance of the first-stage input models. From Figs. 8 and 9 from the April cutoff point, we can observe that when the model for new cases only slightly overestimates the count compared to the actual results (as it is the case of the logistic curve), the ICU beds demand model provides an appropriate fit even for the 14-days-ahead forecast hori-   zon. On the other hand, the corresponding forecasts using Weibull's or Harvey's curves, where the count of new cases is substantially overestimated, resulting in an also excessively high forecast of ICU beds demand. Table 9 also supports the aforementioned graphical analysis, with better goodness-of-fit indicators for the quasi-Poisson logistic curves, followed by the Gaussian logistic case.
Using now the May 15 cutoff point, Figs. 10 and 11 display that the best out-of-sample predictions for new cases are obtained using a Gaussian Harvey and a quasi-Poisson Weibull curve, with the Gaussian Harvey curve for new cases generating 95% forecasting intervals for the demand for ICU beds that ex post contain the true values. On the other hand, the Gaussian Weibull and both logistic curves yielded excessively low forecasts of the new cases count compared to the true values, which in turn systematically underestimated the demand for ICU beds. This underestimation result, from a policy standpoint, is much more problematic when compared to overestimation: when one overestimates demand, the system builds capacity unnecessarily, whereas, when demand is underestimated, we risk facing the inability to save lives. Note that the goodness-of-fit measures collected in Table 10 do not take that differential cost into consideration, demonstrating how using goodness-of-fit measures by themselves, without taking into account the graphical representation

Defining the third wave
To rigorously identify the third wave, we decided to use a formal definition of what constitutes a COVID-19 wave. For this effect, we followed the method proposed by Ayala et al. (2021) that defines new waves through the following criteria: Crit. 1: End of the previous wave: the first week (since the starting date of the wave) fulfilling: (a) weekly incidence rate lower than 70 cases per 100,000 people (70/100,000 cases); (b) negative growth incidence rate for at least two consecutive weeks. Crit. 2: Start of a new wave: the first week (since the end of the previous wave) fulfilling: (a) weekly incidence rate higher than 70/100,000 cases; (b) positive growth incidence rate in at least one week in which the municipality presented over 70/100,000 cases. Crit. 3: Average threshold: the midpoint week between the end of the previous wave and the start of the new one.
The identification strategy of the Ayala et al. (2021) method contains criteria on both levels and changes. The criterion of 70 cases per 100,000 people was established following the government's strategy to define geographical COVID-19 measures, the so-called step-by-step plan (Vicuña et al., 2021), in which geographically determined restrictions were set according to epidemiological indicators. According to this strategy, a daily incidence rate of 10 cases per 100,000 people was considered as positive, triggering advances in terms of the stringency of circulation. In terms of changes, the use of more than one week with low/high cases allows a trend to be identified, avoiding defining a wave based on outliers.
Using their method, we were able to determine that the end of the second wave and the start of the third corresponded to July 18 and October 22, 2021, respectively, with the average threshold corresponding to the September 4, 2021, midpoint. Figure 12 shows these thresholds graphically.
This third wave is also an interesting period to analyze for Chile. As Contardo and Costa (2022) observe, Chile had reached by July 2021 a 75% of fully vaccinated subjects, meaning that subjects had received either a single-dose vaccine or both doses of a two-dose vaccination scheme. According to Tetteh et al. (2021), it is necessary for a given population to obtain herd immunity, a threshold that only 15 other countries worldwide had reached even by late September of that year (Lazcano and Yáñez, 2021). However, the spread of the disease did not decrease at the same pace. Contardo and Costa (2022) argue that one of the main factors to explain the continuing increase in hospitalizations, in spite of the success of the vaccination campaign, is the rapid relaxation of social distancing measures by the local health authorities, much like it had happened with the issuance of vacation permits at the start of 2021.

Replicating our analyses for the third wave
Having defined the third wave, we again use the models described in Section 2 for the third wave, using the data from September 4, 2021, until November 25 as a training sample, to make forecasts for the period spanning from November 26 to December 9, 2021 (14 days ahead).   Figure 13 shows the fit of the models for new cases for quasi-Poisson (upper half) and Gaussian (lower half) processes, using Weibull, Richards/logistic, and Harvey specifications, respectively. The 95% confidence intervals for the 14-days out-of-sample forecasts are displayed in gray shading.
With the results from the first stage, we can now estimate the demand for ICU beds during the second stage, in a similar fashion as before. The results are presented in Fig. 14.
The out-of-sample performance of these forecasts, for this new wave period under consideration, is captured in Table 13. Analyzing the out-of-sample performance of these estimates, the model that best performs is the Weibull curve using the quasi-Poisson distribution, irrespective of which performance metric (MAE, MAPE, or RMSE) is used. This curve yields the lowest values of MAE, MAPE, and RMSE considering forecasting horizons of 3, 7, and 14 days ahead. The results of the Weibull curve using a Gaussian distribution are not qualitatively dissimilar. The Richards/logistic curves do not perform that much worse, and qualitatively their results are similar (if less precise) than their Weibull counterparts. The Harvey curves, instead, performed substantially worse for this time period, in that they again overestimated the forecasts when compared to the actual values. These results are very useful to realize that this approach of using different epidemiological curves can yield different qualitative results. When the recent trend is upwards, Harvey's curve will predict a continuation of that trajectory, whereas both the Weibull and the logistic curves tend to capture that downward trajectory much quicker. Therefore, the quality of the performance of each alternative will depend crucially on the level of the wave the pandemic is at, with Harvey's curve tending to follow the upward trajectory, and the other two tending to react quicker to recent changes. As such, each set of forecasts by themselves can provide useful best-and-worst case scenarios (or, alternatively, upper and lower bounds) for a policymaker who intends to decide on how much capacity of ICU beds needs to be supplied to satisfy demand.

The effect of age on a risk group
In Chile, during the first wave of the pandemic in 2020, people older than 60 years of age represented 16% of the total cases, but 51% of ICU hospitalizations and 85% of deaths . In our forecast implementations thus far, for both the second and third waves, we used the set of variables that had been made publicly available at the time of each cutoff. However, the CMoH released retrospectively in their public datasets a metric which, had it been made available contemporaneously (instead of retrospectively) could have captured that incidence rate for individuals aged 60 years old and above.
To incorporate the effect of age, therefore, we consider the age risk group as defined by the CMoH as the population whose age is 60 years or higher. For this group, the CMoH reports the epidemiological week average of the incidence rate of contracting COVID-19 in Santiago per 100,000 inhabitants. We call the new variable X 3,t , the incidence rate of the group aged > 60 years the epidemiological week available prior to day t.
Since the incidence rate of the age risk group, X 3,t , is highly correlated with the seven-day moving average variable X 2,t ≡ X t−1 = 1 7 t−1 i=t−7 X i as used in Equation (12) (sample correlation 0.95054), Table 14 Estimated parameters, COVID-19 ICU beds utilization model, November 25, 2021, cutoff, using seven-day moving average of age incidence rate X 3,t instead of X 2,t . we used the new incidence rate variable instead of X 2,t . This new specification of Equation (12) was estimated using X 3,t , yielding the results summarized in Table 14.
In order to make predictions with this new model, it is required to have a forecast for the incidence rate of the age group > 60. For this, we used a bottom-up forecasting strategy (see Fig. 15 for implementation): 1. According to data up to the cutoff date of November 25, 2021, 21% of the infected people belonged to the age group of 60 years and above. 2. Hence, we fitted the six epidemiological models we have used throughout the paper for the first stage and obtained estimates X t for future new cases in Santiago. 3. Next, we multiply X t and π = 0.21 to obtain a proxy for the number of new daily cases among the elderly. 4. Finally, the forecast X 3,t+k , for the average weekly incidence rate of people over 60 years old measured k days ahead, is calculated as per Equation (15): where n r,t is the total population of Santiago reported by the CMoH at date t.
Quite fascinatingly, observing Table 15, we can note that, irrespective of the metric or the forecasting horizon, the performance differences in these models for each case are negligible. However, compared to the results in Table 13, where we used the seven-day moving average of past new cases instead of the elder age incidence rate, we observe that using the latter variable does not necessarily improve (and in quite a few cases, worsens) the predictions of the Weibull and logistic curves. At the same time, the inaccuracies observed in the Harvey case were substantially corrected, particularly for the longer forecasting horizon. This situation can also be observed in Fig. 16.

Conclusion
Our ability to understand the COVID-19 epidemic is crucial in order to curb its spread. In strategic health planning, the use of operations research and problem-structuring methods has huge value-generating potential (Thunhurst, 2003), and in the phases of detailed design work and implementation, appropriate decision support forecasting tools become fundamental in the scenario of a pandemic like this. The main contribution of our study lies in the provision of an important and simple framework to inform public health decision-making designed to build ICU capacity in a timely manner and, in the meantime, also illustrating the usefulness of nonlinear modeling approaches to forecast the evolution of the disease when availability of data is limited. Limitations of the study include the fact that our forecasts are based on three growth models with two different distributional assumptions (six estimated growth curves used as inputs for the ICU beds model), as it would be interesting to compare forecasts from several different alternative methods. More crucially, though, note that for our illustrations here we used publicly available data. Other important covariates, and/or additional variables with higher granularity, would in all likelihood provide more accurate forecasts and results. Also, the idiosyncratic situation of each country, territory, or region will require different modeling choices based on data availability as well as specific contexts.
Additionally, it is important to remember that the forecasting methods here presented for both the demand for ICU beds and the number of new daily cases are designed to work for a mediumto-short-term. In the longer run (i.e., beyond two weeks ahead), the forecasts obtained from these models could differ in a substantial way from the true observed data. However, as a decision support strategy, we have provided a useful, easy-to-understand decision support tool, that does not require extremely detailed data to provide reasonable short-term forecasts, enabling authorities and decision-makers to make informed decisions.