Application of disease model to ascertain risk-mitigation strategies during COVID-19 pandemic in 2020, an example from resource-constrained urban settings in Pakistan

Most COVID-19 affected resource-constrained settings have not ascertained their disease prevalence, which poses a risk for global health. In the wake of limited diagnostics and research capacity in such settings, this disease forecasting model provides an example that can be adapted for evidence-based response efforts. Using officially reported data, this model forecasted COVID-19 prevalence in cosmopolitan cities. Several risk-mitigation strategies were analyzed for effectiveness in controlling disease incidence. Moreover, the reproduction rates (to ascertain transmission), the herd-immunity threshold, and the number of performed versus required laboratory tests were studied. Severe-critical cases were relatively few owing to the large young population. Under any risk-mitigation strategy, at the end of the first wave a susceptible population remained at risk of recurrent COVID-19 transmission. The herd-immunity threshold was in accordance with global estimates but would need careful monitoring based on the adopted risk-mitigation strategy and variation in vaccines’ efficacy. A test gap between performed and required laboratory tests led to several cases being missed. The first of its kind, this study estimates sub-national COVID-19 prevalence in dense urban living in low-middle income settings. Future response policies should consider such evidence to prevent recurrent waves of COVID-19 transmission. Unless sustained herd immunity is achieved by effective immunization, the risk of re-introduction to vulnerable populations will remain.


Background
COVID-19 has affected a large proportion of the global population at a swift pace. On 30 January 2020, the World Health Organization (WHO) declared COVID-19 a Public Health Emergency of International Concern (PHEIC) [1]. The international spread of COVID-19 remains pronounced, as countries that brought ongoing transmission to a low level remain at risk of re-introduction. For example, parts of Europe, New Zealand, Vietnam, and Singapore had lowered disease spread, but re-introduction into previously unaffected regions coupled with autochthonous transmission led to an increase in COVID-19 prevalence [2][3][4][5]. Unless addressed globally, the risk looms for any susceptible population. Recent alerts highlighting the second wave of COVID-19 transmission and its potential to become established as an endemic disease have shown the serious nature of the perceived risk [6,7]. Therefore, it is important to ascertain disease incidence promptly to achieve evidence-based risk mitigation and response. To date, this need has been answered by forecasting models in high-income countries, while it remains largely unaddressed in most low-middle income countries (LMICs). Resource-poor countries therefore remain at disproportionately higher risk of disease spread, internally or overseas. These countries have limited diagnostic and research capacity that could help in timely estimation of COVID-19 incidence [8,9]. Such a knowledge base is vital to guide policy makers in understanding the potential impact of risk-mitigation strategies and planning accordingly.
Since 31 December 2019, when COVID-19 cases were first reported in Wuhan, China, there has been an obvious global spread of the disease [10]. COVID-19 is caused by the newly identified virus SARS-CoV-2, the seventh member of the coronavirus family (together with MERS-CoV and SARS-CoV) that can infect humans [11,12]. The majority of reported cases (80%) have no or mild symptoms and may not realize that they have acquired the illness [13,14]. Others have symptoms that include (but are not limited to) fever, cough, sore throat, shortness of breath, myalgia, fatigue, and diarrhea [15]. A smaller proportion of cases may have severe to critical infection leading to pneumonia and even death [15]. The average incubation period is 5-6 days and may last up to 14 days [16]. The reported disease transmission is through respiratory droplets, airborne particles, close contact among humans or, less commonly, contaminated surfaces [16]. The respiratory modes of transmission are known to play a role in the rapid global spread of SARS-CoV-2.
Owing to a range of factors, COVID-19 has established global transmission while the prevalence of the disease is not fully ascertained. For example, a large proportion of cases are not identified because of indefinite disease presentation, subclinical infections, and/or initially limited diagnostic capacity [15,17,18]. To overcome these challenges, the burden of disease is estimated by disease modelling. However, most model estimates addressed situations in high-income countries [19][20][21]. To date, limited disease modelling and research capacity has prevented estimation of prevalence in LMICs [8,9]. In the absence of available evidence or scientific forecasting, resource-limited or resource-constrained settings adopt developed countries' strategies that might not be suitable for them. Occasionally, this has led to reversal of ongoing response efforts and suboptimal control of disease transmission, with both medical and socioeconomic impact at community level [22,23]. Timely information on disease transmission is important both for priority setting and for prompt response actions. To fill this gap, we initiated this study to develop a forecasting model using scenarios in two major cosmopolitan cities of Pakistan. Such a model from a low-middle income country can be adopted elsewhere and serves as an example for a large proportion of the global population living in near-similar conditions.

Study objective
The objective of this study was to apply an SIR model to ascertain COVID-19 disease prevalence, to help policy makers plan response and mitigation efforts to contain the ongoing COVID-19 spread and recurrent outbreaks affecting densely populated communities in low-middle income settings. The study aims to estimate COVID-19 disease incidence, cumulative case count, performed vs required RT-PCR laboratory tests, reproduction rates and the herd immunity threshold under specific current prevention measures.

Methods and Materials
a) Study settings and Data sources
The model was customized with the officially reported cumulative COVID-19 data for Karachi and Lahore, the two most populous cities in Sindh and Punjab provinces of Pakistan, respectively [24,25]. The two settings were selected for the following main reasons. Primarily, since the first reported cases in March 2020, the brunt of COVID-19 disease prevalence was borne by these two cities. This situation needed urgent attention, as conducive conditions such as dense populations propagated the spread of COVID-19 and led to high disease incidence [25]. Moreover, the study settings provided a good match for dense urban living in many cities of LMICs, which would increase the generalizability of this model. Secondarily, COVID-19 diagnostics improved over time in the public sector; therefore, case detection increased. In addition, the quality of data for COVID-19 cases was relatively better for these two urban settings, which could support extrapolation by a forecasting model. For example, during a population-level representative survey in Lahore, one of our study settings, the government collected specimens to determine COVID-19 prevalence [26]. The specimens were tested by RT-PCR, and laboratory-confirmed positives were found in both workplaces and residential areas all over the city. This was valuable evidence for deeming the at-risk population across the study settings susceptible to acquiring SARS-CoV-2 infection.
This model was set to project the COVID-19 pandemic situation in these two settings under the scenario of "current measures" for transmission control and prevention. While estimating the potential disease prevalence, it was important to determine the susceptible population. The available data showed that age groups below ten years and above 80 years had negligible COVID-19 incidence. This finding is explained by the early closure of schools and parks, which restricted children's mobility, and by the relatively small elderly population in Pakistan.
Therefore, the susceptible population in this model did not include children under ten years of age (24% of the total population), who were mainly confined to homes [24]. However, the elderly age group was included in the model because this group, although small, had the potential to contribute cases with severe disease, a known challenge for any health system.
This study used open-source primary data reported by both national and sub-national ministries of health, Pakistan [26][27][28]. The number of laboratory tests performed to diagnose SARS-CoV-2 infection was available only for each province as a whole; no city-specific information was found [27]. However, the city-specific case count was available for both study settings as well as their respective provinces. To estimate the city-specific number of tests, we assumed that the share of tests performed in a city was equal to the proportion of the city's cases among the total cases in its province. This gave an approximate estimate of the daily number of laboratory tests performed to confirm cases in both of our study settings.
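The proportional allocation of provincial tests to a city described above can be illustrated with a minimal Python sketch. The function name and the daily figures are hypothetical and serve only to show the arithmetic; they are not taken from the study data.

```python
def estimate_city_tests(province_tests, city_cases, province_cases):
    """Allocate provincial tests to a city in proportion to its share of cases.

    Assumes, as stated in the text, that the share of tests performed in a
    city equals the city's share of the province's confirmed cases.
    """
    if province_cases <= 0:
        raise ValueError("province_cases must be positive")
    if not 0 <= city_cases <= province_cases:
        raise ValueError("city_cases must lie between 0 and province_cases")
    return province_tests * city_cases / province_cases


# Hypothetical daily figures, for illustration only.
city_tests = estimate_city_tests(
    province_tests=10_000, city_cases=600, province_cases=1_000
)
print(f"Estimated city tests: {city_tests:.0f}")
```

The estimate is only as good as the proportionality assumption; a city that tests more intensively than the rest of its province would be underestimated by this approach.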

b) Time-dependent Susceptible-Infected-Recovered Model
A susceptible-infected-recovered (SIR) model was developed to forecast COVID-19 disease prevalence. The total population was divided into three compartments. The first was the susceptible (S) population, vulnerable to COVID-19, presented in equation (1). The second was the infected (I) population that acquired the disease, and the third was the removed (R) population, who either recovered or died from infection; these are shown in equations (2) and (3), respectively. The SIR model was based on the following assumptions: for the recovered population, there was no risk of reinfection. There was no change in population from other causes of morbidity, mortality, or migration during the timeframe of this model. Moreover, all susceptible and infected individuals had an equal chance of interacting with each other.
To customize the model output with the daily officially reported COVID-19 incidence, the unit of time (t) was taken as one day. The data collation and analyses were run in Microsoft Excel and StataSE version 16.
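The compartment structure can be sketched in Python as follows. This is a minimal illustration assuming the standard textbook SIR equations, dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I and dR/dt = gamma*I, advanced with a one-day forward Euler step. The names `beta` and `gamma` are the conventional symbols for the infection and removal rates, and the parameter values are hypothetical. The study itself was run in Microsoft Excel and Stata, so this is not the authors' implementation.

```python
def simulate_sir(population, beta, gamma, days, initial_infected=1.0):
    """Simulate a closed-population SIR model with a daily time step.

    beta  : infection rate per day (assumed constant here)
    gamma : removal rate per day (recovery or death)
    Returns a list of (S, I, R) tuples, one per day, including day 0.
    """
    s = population - initial_infected
    i = initial_infected
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # Random mixing: new infections scale with S * I / N.
        new_infections = beta * s * i / population
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical parameters, for illustration only.
history = simulate_sir(population=1_000_000, beta=0.27, gamma=0.1, days=300)
peak_day = max(range(len(history)), key=lambda day: history[day][1])
print(f"Peak of infections on day {peak_day}")
print(f"Susceptible fraction at the end: {history[-1][0] / 1_000_000:.2f}")
```

Because no births, deaths from other causes, or migration are modelled, S + I + R stays equal to the initial population throughout, which matches the closed-population assumption stated above.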

c) Method of parameter estimation
The change in the number of infected cases per unit time depended on the infection rate and removal rate parameters. Using the officially reported epidemiological data, approximate values of both parameters were set for the model run.
Over time, disease transmission in our study settings was influenced by several levels of lockdown, mass [...]
The SIR model assumed a homogeneously spread vulnerable population with random mixing patterns, leading to an equal level of individual susceptibility to infection in both study settings. Following this assumption, the reproduction rates were estimated [29]. The basic reproduction number (R0) is the average number of infections caused by a typical infected individual in a completely susceptible population, as shown in equation 8. The effective reproduction number, a parameter that helps determine the potential for epidemic spread over time, is the number of secondary infections when the population is only partly susceptible; it was estimated per equation 9, using the existing rates of infection and removal for that specific time. This study also aimed to ascertain the role of immunization against COVID-19 in control and prevention efforts. Mass vaccination would help achieve herd immunity, defined as the indirect protection from infection at a given contact rate when a sufficiently large proportion of immune individuals exists in a population [30]. To reach herd immunity, a population has to pass through the herd immunity threshold, the point at which the proportion of susceptible individuals in a population falls below the level needed for disease transmission [30]. The mathematical derivation of the herd immunity threshold is presented in equation 10.
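The three quantities defined above can be sketched in Python using their standard textbook forms for an SIR model: the basic reproduction number as the ratio of infection to removal rate, the effective reproduction number as the basic number scaled by the susceptible fraction, and the herd immunity threshold as 1 - 1/R0. These forms are assumed here and may differ in notation from equations 8 to 10; the function names are illustrative.

```python
def basic_reproduction_number(beta, gamma):
    """Basic reproduction number: infection rate divided by removal rate."""
    return beta / gamma


def effective_reproduction_number(r0, susceptible, population):
    """Effective reproduction number when only part of the population is susceptible."""
    return r0 * susceptible / population


def herd_immunity_threshold(r0):
    """Immune fraction above which the effective reproduction number falls below 1."""
    if r0 <= 1:
        return 0.0  # transmission cannot be sustained, so no threshold is needed
    return 1.0 - 1.0 / r0


# Basic reproduction numbers reported in the Results for the two cities.
for city, r0 in {"Karachi": 2.68, "Lahore": 2.65}.items():
    print(f"{city}: herd immunity threshold = {herd_immunity_threshold(r0):.0%}")
```

Under this standard form, the basic reproduction numbers reported in the Results (2.68 and 2.65) correspond to thresholds of about 63% and 62%, in line with the values reported for Karachi and Lahore.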

d) Scenario analysis for risk mitigation strategies
In addition to the projection for "current measures", scenario analyses were conducted for the implementation of Non-Pharmacological Interventions (NPIs), based on physical distancing and wearing face masks, as the main risk mitigation strategies. For all these scenario analyses, the "current measures" scenario was taken as the baseline.
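One way to picture a scenario analysis against a baseline is to scale the infection rate by a contact multiplier and compare the resulting epidemic curves. The Python sketch below assumes a standard SIR model with a daily step; the multipliers, rates and population size are hypothetical and are not the values used in this study.

```python
def run_scenario(beta, gamma, population, days, contact_multiplier=1.0):
    """Run a daily-step SIR model with the infection rate scaled by a multiplier.

    Returns (peak number of infected, susceptible fraction at the end).
    """
    s, i = population - 1.0, 1.0
    peak_infected = i
    for _ in range(days):
        new_infections = contact_multiplier * beta * s * i / population
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        peak_infected = max(peak_infected, i)
    return peak_infected, s / population


# Hypothetical multipliers relative to the "current measures" baseline.
SCENARIOS = {
    "current measures": 1.0,  # baseline
    "easing measures": 1.2,   # more contact than baseline
    "strict measures": 0.8,   # less contact than baseline
}

results = {
    name: run_scenario(0.268, 0.1, 1_000_000, 600, multiplier)
    for name, multiplier in SCENARIOS.items()
}
for name, (peak, susceptible_left) in results.items():
    print(f"{name}: peak infected = {peak:,.0f}, "
          f"susceptible at end = {susceptible_left:.0%}")
```

In such a sketch, stricter measures lower the peak but leave a larger susceptible fraction at the end of the wave, which is the trade-off examined in the scenario analyses.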

e) Assumptions for scenario analyses
The scenario analyses carried the following assumptions.
i. Preliminary results from renowned studies suggested that 80% of COVID-19 cases were either mild or asymptomatic [31]. With less pronounced or absent symptoms, this proportion of cases would not usually report to health facilities. As a result, such cases were bound to miss diagnostic confirmation, as most low-middle income settings did not have the capacity for seroprevalence surveys to ascertain them. Therefore, the officially reported figures account only for the symptomatic cases that visited health facilities (moderate-severe cases), representing not more than 20% of the total COVID-19 cases [31,32]. If these factors were not considered, the model would underestimate the true size of the pandemic in the study settings.
ii. Since the onset of the pandemic, the diagnostic capacity for COVID-19 case confirmation improved in our study settings. This also led to a gradual improvement in case detection and reporting. However, per WHO-suggested criteria, this was subject to the test positivity rate (suggested to be kept below 10%, preferably less than 3%) and the number of tests performed per confirmed case (preferably 10-30) [33]. If the rate of performed laboratory tests did not increase over time, there would be a risk of failing to diagnose a large proportion of COVID-19 cases [26].
In addition to the factors above, the official COVID-19 case count was also affected by three latent factors that could not be measured during this modelling process. The first two factors were fear of stigma and lack of trust in the strained health system [34]. These factors made symptomatic COVID-19 cases reluctant to self-report to healthcare. Third, in a health system like Pakistan's, not all COVID-19 cases were part of the official reports, especially those managed in the private sector, asymptomatic cases, and those with mild symptoms. Considering these salient factors, as well as recommendations to isolate at home and the global evidence of a smaller proportion of moderate-severe cases (15-20%) attending health facilities, it was anticipated that the officially reported case count in our study settings represented 15% of the total disease prevalence [14][35][36][37].
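The reporting assumption above implies a simple scaling from the official case count to the total number of infections. The Python sketch below illustrates that arithmetic; the function name and the reported count are hypothetical.

```python
REPORTED_FRACTION = 0.15  # share of all infections assumed to appear in official reports


def estimate_total_cases(reported_cases, reported_fraction=REPORTED_FRACTION):
    """Scale officially reported cases up to an estimate of total infections."""
    if not 0 < reported_fraction <= 1:
        raise ValueError("reported_fraction must be in (0, 1]")
    return reported_cases / reported_fraction


# Hypothetical reported count, for illustration only.
reported = 3_000
total = estimate_total_cases(reported)
print(f"Reported: {reported:,}; estimated total: {total:,.0f}; "
      f"unreported: {total - reported:,.0f}")
```

The estimate is sensitive to the assumed fraction: at the upper bound of 20% cited above, the same reported count would imply a smaller total than at 15%.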

Results
The model simulation, initiated with a single case, depicted the introduction and spread of COVID-19 in two major cities of Pakistan. The disease transmission models for the two settings were independent of each other. Local population mobility patterns, influenced by levels of lockdown (restricted movement of the population), and the resulting encounters among susceptible and infected individuals led to the spread of disease at community level [38,39]. Therefore, the influence of lockdowns on the rate of infection was estimated and customized with the available officially reported data (appendix). An overall lockdown was in place from 25 March to 13 May 2020 [40]. The available data suggested that the effects of this lockdown were pronounced from 24 April to 13 May, during the first 20 days of the month of Ramadan (30 days in total, 24 April to 23 May 2020). From 14 to 24 May 2020, the lockdown was relaxed closer to the Eid celebration (festive season from 22 to 24 May), which led to mass movement and dense gatherings in local marketplaces [41,42]. Although large-scale population movement decreased by the end of May 2020, the aftermath of the earlier relaxation of lockdown was evident as spikes of new infections about two weeks later (around 5-10 June 2020). This was followed by provincial administrations issuing a set of protocols for overall controlled movement at public places, wearing face masks, and regular lockdowns during weekends. Therefore, from 8 June 2020 onwards, the rate of infection might have been lower, but it remained relatively higher than during the overall lockdown phase. The removal rate of COVID-19 cases was estimated from the officially reported data customized for our model (appendix).
Under the "current measures" scenario, the model forecasted the peak of COVID-19 transmission by the second and third week of August 2020 in Lahore and Karachi, respectively (Fig 1). Our model followed evidence that, of the total COVID-19 cases, nearly 15% and 2.5% had moderate and severe symptoms, respectively. Similarly, a proportion of cases (15% of total cases) with moderate-severe symptoms attended health facilities and were reported as part of the official database. As per observed trends for disease transmission, 1% required critical-care support and were extrapolated by the model (Fig 2). Increased population mobility and the resulting interaction around the Eid festival of 22-24 May led to a relative increase in the number of cases reported during early June. As of end-June 2020, before isolation at home was introduced, the officially reported daily cases were within the 95% confidence interval (CI) of the cases forecasted by our model to attend health facilities (15% of total case count) (Fig 2). The proportion of critically ill patients by age group could not be extrapolated owing to the lack of similar baseline data.
[...] transmission would have affected 51% and 50% of the susceptible population in Karachi and Lahore, respectively (Fig 3). This wave of the pandemic will be 95% over around 24 September and 11 October in Lahore and Karachi, respectively. Similarly, by 15 October and 4 November, it will be 99% over in Lahore and Karachi, respectively.
By the tipping point (when the cumulative count of removed cases surpasses total infections), about 49% and 50% of the susceptible population would be free from infection in Karachi and Lahore, respectively (Fig 3). The model showed that, when disease transmission was close to its end, there would be a remaining susceptible population of approximately 23% and 22% in Karachi and Lahore, respectively (Fig 3). This proportion of the population would be vulnerable to any second wave of disease transmission. Moreover, the 24% of the total population younger than 10 years was assumed not to be at risk by being at home. However, this proportion of the population might also be exposed to COVID-19 if all control and prevention measures are lifted, including school opening, before community transmission is controlled at a low level. To extrapolate disease prevalence by age group, an age-specific incidence was used [27]. The results showed that the 40-59 years age group bore the brunt of cases, followed by the 60-69 years group, while the 10-19 years age group was the least affected of the susceptible population, followed by the 20-39 years group (Fig 4). The frail elderly population (above 70 years old) had a disease prevalence in between the other age groups. The lower number of elderly people affected in these study settings might be due to their smaller proportion of the total population.

Fig 4 New and Cumulative extrapolated COVID-19 cases by age groups in Karachi (a), (b) and Lahore (c), (d) respectively
To determine the epidemic potential of COVID-19 in the two study settings, the effective reproduction number was estimated by the model. Following the peak of the pandemic in our study settings, the effective reproduction number declines below 1, at which point propagated disease transmission would not be possible and the COVID-19 spread would die out, provided the population contact rate did not increase (Fig 5). Under the "current measures" scenario, R0 values of 2.68 and 2.65 were estimated for Karachi and Lahore, respectively. Assuming a vaccine with 100% efficacy, the herd immunity threshold was ascertained at 63% and 62% for Karachi and Lahore, respectively. However, there is no available evidence to suggest long-lasting immunity against re-acquiring SARS-CoV-2 infection. Therefore, if natural immunity after this infection wanes over ten months or two years, there might be yearly or two-yearly outbreaks, respectively.
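Since the thresholds above assume a vaccine with 100% efficacy, it can help to see how the required vaccination coverage changes for a less effective vaccine. The Python sketch below uses the standard relation, coverage = (1 - 1/R0) / efficacy, which is an assumption introduced here for illustration and is not stated in the study; the efficacy values are hypothetical.

```python
def required_vaccination_coverage(r0, efficacy):
    """Fraction of the population to vaccinate so that the immune share reaches 1 - 1/R0.

    A result above 1 means the threshold cannot be reached by vaccination alone
    with a vaccine of this efficacy.
    """
    if not 0 < efficacy <= 1:
        raise ValueError("efficacy must be in (0, 1]")
    if r0 <= 1:
        return 0.0
    return (1.0 - 1.0 / r0) / efficacy


# Basic reproduction number reported for Karachi; efficacy values are hypothetical.
for efficacy in (1.0, 0.7, 0.6):
    coverage = required_vaccination_coverage(2.68, efficacy)
    status = "reachable" if coverage <= 1 else "not reachable by vaccination alone"
    print(f"efficacy {efficacy:.0%}: required coverage {coverage:.0%} ({status})")
```

This illustrates why the threshold needs monitoring as vaccines of differing efficacy become available: the lower the efficacy, the larger the share of the population that must be vaccinated.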

Scenario analyses
In addition to the projections under "current measures", the following scenario analyses were conducted to ascertain the impact of NPIs (physical distancing and wearing face masks at public places) as risk mitigation strategies.
i. The "relative measures" scenario depicted a situation in which, during the "current measures", local population movement would increase in both study settings during the festive season (end of July 2020). During this time there would be an imminent risk of population gatherings both indoors and outdoors. This scenario projected 6,000-12,000 extra infections compared to the "current measures" scenario and a spike of cases one average incubation period (5-7 days) after the festive holidays (Fig 6). This scenario would cause a peak of disease transmission by late August 2020, and the cumulative case count would increase relatively more in Karachi than in Lahore (Fig 7). Similarly, at the end about 21-22% of the unaffected population remained at risk of a recurrent wave of disease transmission in both study settings (Fig 7).
ii. According to the "easing measures" scenario, if the implementation of NPIs as control and prevention measures had been eased from mid-April 2020, the peak of the ongoing disease spread would pass earlier than anticipated in both settings. However, this would come at the cost of 18,000-40,000 extra infections during the peak of disease transmission, which was projected around early to mid-August 2020 in Lahore and Karachi, respectively (Fig 8). This scenario would have an additional 5% of cases in total compared with the "current measures" scenario. Moreover, 15% of the population would remain susceptible in each of the study settings for any subsequent wave of disease transmission (Fig 9).
iii. The next scenario analysis considered "strict measures" for NPI implementation on a continued basis from mid-April 2020. At the peak of disease transmission, the daily number of estimated cases was 30,000-50,000 lower than under the "current measures" scenario in both study settings (Fig 10). Similarly, the cumulative case count under the "strict measures" scenario would have been less pronounced than under the "current measures" scenario. The "strict measures" would have prevented 30-32% of the population from getting infected compared to the "current measures" scenario (Fig 11). These prevented cases could also translate into lives saved and a lower burden on the health system, but also into more susceptible individuals remaining for a potential recurrent wave of disease transmission. Primarily, this scenario would have slowed disease spread and delayed the peak of transmission to around early to mid-November 2020 in Karachi and Lahore, respectively (Fig 10).
iv. Finally, the number of diagnostic tests conducted in relation to COVID-19 disease transmission in both study settings was analyzed. As COVID-19 cases had varied disease presentation, the identified cases depended on the number of laboratory-confirmed positives. Therefore, the test positivity rate was studied using the data for the number of laboratory tests conducted and the confirmed positives among them, against the WHO's suggested 10% cut-off (Fig 12). Since April 2020, the test positivity rates in both study settings remained above the 10% threshold until recently, when there were indications of a downward trend (Fig 12).

Discussion
With the ongoing COVID-19 spread and the absence of any vaccine, the resulting disease burden is a strain on any health system, especially the less developed ones in LMICs. Considering the modes of disease transmission and international travel, a high incidence in one country is known to spread quickly to others. Forecasting the number of patients in need of critical care can guide resource allocation and preparedness so that mortality rates can be lowered.
Therefore, for prompt response efforts it was important to bring forward this study estimating disease prevalence at sub-national level. An added value of this model lies in estimating COVID-19 disease prevalence for several scenarios of NPI implementation in populous cosmopolitan cities. Such estimates are key to developing policies, especially for resource-constrained settings, to improve priority setting and lessen the impact of COVID-19 on vulnerable communities.
Importantly, community transmission of COVID-19 is an interplay between many factors. This becomes more complex when both symptomatic and asymptomatic cases reportedly have a role in disease transmission. It is worrisome from a global health viewpoint and makes controlling disease progression challenging, because tracing asymptomatic disease spreaders in communities is not a straightforward task. The situation is worse in densely populated urban areas with a large susceptible population, which become perfect breeding grounds for rapid disease progression, as studied in this model. A similar situation is found in major urban settings of the USA and Europe and in dense living areas in India [43]. The proven solution to slow the spread of COVID-19, as demonstrated in parts of Europe and Asia, lies in frequent testing of suspected cases, isolation of confirmed cases, contact tracing and seroprevalence surveys to monitor disease transmission in communities.
Due to the overall challenges in diagnostic capacity, especially in low-income settings, main response plans [...] place. Therefore, a recurrent wave of disease transmission after relaxation of lockdowns could be worse, with more cases in less time during dense school gatherings. This shows that without proper measures in place, including vaccines, prompt testing, safe distancing, and isolation, lockdowns alone cannot prevent the susceptible population from getting COVID-19 in the long run [46]. Otherwise, the health system will be stretched by a high case count that would adversely affect the diagnosis of new cases and the management capacity of the health system: a vicious circle.
To accomplish disease control and prevention, monitoring COVID-19 epidemiology is crucial. For example, the effective reproduction number relies on the rates of infection and removal, reflecting the mechanisms involved in acquiring infection and survival. Therefore, periodic estimation of the effective reproduction number can be used to gauge pandemic response efficacy at national or sub-national level. The herd immunity threshold estimated by our model is in accordance with that reported elsewhere. However, this threshold may vary based on disease transmission in a population and the efficacy of the available vaccine. In future, the numerous upcoming vaccines will vary in their efficacy, which will eventually lead to variation in the herd immunity threshold. The multiple scenario analyses in this model show that at the end of a disease transmission wave, there will always be a vulnerable population at risk of acquiring the disease. Therefore, to prevent resurgence of COVID-19, a highly effective vaccine is the only reliable solution.

Study limitations
Although this modelling study is the first of its kind to bring forward thorough scenario analyses as an example for low-middle income settings, it is not free from limitations. Given the modes of transmission involved in the spread of SARS-CoV-2, population density is an important factor in disease transmission across communities. This limits the generalizability of the modelling study to urban settings, while rural areas with scattered populations will have different dynamics of disease spread. Rural areas were not included in this model mainly because of limitations in the available data. We assumed a homogeneously spread susceptible population, although there would be some variation; this is a common limitation of disease forecasting models. Owing to the limited availability of seroprevalence studies and health facility data, we relied on the available literature for the assumption that mainly patients with moderate-severe symptoms were seeking healthcare. Age-specific data for critical patients and mortalities were not available and were therefore not extrapolated.

Conclusion
The ongoing spread of COVID-19 is a major challenge for any health system. In the absence of vaccines, the only factor that appears to slow the pace of disease progression is a reduction in the transmission rate of infection. While NPIs can slow the rate of infection, they have limitations in preventing infection in the long run. Therefore, the only reliable solution for risk reduction is vaccination of the susceptible population, with priority given to high-risk and frail groups. However, a forthcoming challenge for vaccine implementation will be to carefully monitor efficacy and safety while in use, as well as communities' willingness to be vaccinated. Future studies should also focus on latent factors involved in COVID-19 spread, including the proportion of cases reluctant to attend health facilities due to fear of stigma and lack of trust in the health system, the role of the private sector in case management, and seroprevalence surveys to estimate community transmission.