According to the NCDC report, the cumulative number of COVID-19 cases in Kano stood at 73 from 22 to 24 April 2020 and at 77 from 25 to 27 April 2020 (after the addition of four new cases). That is, no new case was reported within 22 to 24 April or within 25 to 27 April 2020, which appears implausible given the rapid rise of the outbreak curve since the index case in Kano on 11 April 2020 (NCDC, 2020). We therefore presume that COVID-19 cases in Kano were probably under-ascertained from 22 to 27 April 2020. In this paper, we estimate the number of under-ascertained cases and the basic reproduction number (R0) of COVID-19 in Kano, Nigeria, from 22 to 27 April 2020, based on the data available during the early phase of the epidemic.
We used the time series of cumulative confirmed cases compiled by the NCDC (2020) from 11 April to 30 April 2020. All cases were laboratory confirmed according to the NCDC definition of COVID-19 cases, which is available at https://covid19.ncdc.gov.ng/report/. We restricted the data to 11 to 30 April 2020, rather than extending it to the present date, because diagnostic testing improved significantly towards the end of April 2020 and sufficient personal protective equipment (PPE) was provided to frontline health workers.
We suspected that a number of COVID-19 cases, denoted by $\xi$, were under-ascertained, likely from 22 to 27 April 2020. The cumulative total number of cases on the $i$-th day since 11 April 2020, denoted by $C_i$, is the sum of the cumulative number of reported (ascertained) cases, denoted by $c_i$, and the cumulative number of under-ascertained cases, denoted by $\Xi_i$. Thus, following previous studies (Trotter et al., 2005; Zhao et al., 2020b), the relation used to compute the expected number of under-ascertained cases is $C_i = c_i + \Xi_i$, where $c_i$ is observed from the data, and $\Xi_i$ is 0 for $i$ before 22 April and $\xi$ for $i$ after 27 April 2020.
Following the approach of previous works (Zhao et al., 2020a; Zhao et al., 2020b; De Silva, 2009), we modelled the outbreak curve, i.e., the $C_i$ series, as an exponentially growing Poisson process. The data from 22 to 27 April 2020 appear constant, probably because of poor testing facilities and other unknown reasons; these data were therefore excluded from the exponential growth fitting. Both $\xi$ and the intrinsic growth rate of the exponential growth, denoted by $\gamma$, were estimated by maximising the log-likelihood, $\ell$, under a Poisson likelihood framework for the number of cases. We estimated the 95% confidence interval (95% CI) of $\xi$ by the profile likelihood technique, with the cutoff threshold determined by a Chi-square quantile, $\chi^2_{pr = 0.95,\, df = 1}$ (Fan & Huang, 2005).
We obtained $R_0$ from the estimate of $\gamma$, following an approach similar to that of Zhao et al. (2020a), Zhao et al. (2020b) and Musa et al. (2020). Specifically, as in Zhao et al. (2020b), the basic reproduction number is given by $R_0 = 1/M(-\gamma)$, with 100% susceptibility presumed at the early stage of the COVID-19 outbreak. Here $M(\cdot)$ is the Laplace transform (i.e., the moment generating function) of the probability density function $h(k)$ of the serial interval (SI) of COVID-19, and $k$ denotes the SI.
We note that this formula has been derived theoretically as well as adopted in previous studies (Zhao et al., 2020a; Zhao et al., 2020d; Wallinga & Lipsitch, 2007).
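The estimation of the under-ascertained count and the growth rate by profile likelihood can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: it assumes that each cumulative total is Poisson distributed with mean exp(alpha + gamma * day), it scans the under-ascertained count over an integer grid, and it uses synthetic counts generated inside the block rather than the NCDC series. All function and variable names are hypothetical.

```python
import math

CHI2_95_DF1 = 3.841458820694124  # 0.95 quantile of the chi-square distribution, df = 1


def poisson_loglik(alpha, gamma, days, counts):
    """Poisson log-likelihood of counts with mean exp(alpha + gamma * day)."""
    total = 0.0
    for day, y in zip(days, counts):
        eta = alpha + gamma * day
        total += y * eta - math.exp(eta) - math.lgamma(y + 1.0)
    return total


def fit_exponential(days, counts, max_iter=200):
    """Maximise the log-likelihood over (alpha, gamma) by damped Newton steps."""
    n = len(days)
    logs = [math.log(y) for y in counts]
    d_bar, l_bar = sum(days) / n, sum(logs) / n
    # Starting values: ordinary least squares on the log counts.
    gamma = (sum((d - d_bar) * (l - l_bar) for d, l in zip(days, logs))
             / sum((d - d_bar) ** 2 for d in days))
    alpha = l_bar - gamma * d_bar
    ll = poisson_loglik(alpha, gamma, days, counts)
    for _ in range(max_iter):
        mu = [math.exp(alpha + gamma * d) for d in days]
        g0 = sum(y - m for y, m in zip(counts, mu))
        g1 = sum((y - m) * d for y, m, d in zip(counts, mu, days))
        h00 = sum(mu)
        h01 = sum(m * d for m, d in zip(mu, days))
        h11 = sum(m * d * d for m, d in zip(mu, days))
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det
        step_g = (h00 * g1 - h01 * g0) / det
        scale = 1.0
        while True:  # halve the step until the log-likelihood does not decrease
            try:
                new_ll = poisson_loglik(alpha + scale * step_a,
                                        gamma + scale * step_g, days, counts)
            except OverflowError:
                new_ll = -math.inf
            if new_ll >= ll or scale < 1e-8:
                break
            scale /= 2.0
        if new_ll < ll:
            break  # no improving step found; keep the current estimate
        alpha, gamma = alpha + scale * step_a, gamma + scale * step_g
        gain, ll = new_ll - ll, new_ll
        if gain < 1e-10:
            break
    return alpha, gamma, ll


def profile_xi(days, reported, gap, xi_grid):
    """Profile log-likelihood of xi; days inside the gap are left out of the fit."""
    first, last = gap
    keep = [i for i, d in enumerate(days) if d < first or d > last]
    fit_days = [days[i] for i in keep]
    rows = []
    for xi in xi_grid:
        # Total = reported + xi after the gap, reported only before it.
        totals = [reported[i] + (xi if days[i] > last else 0) for i in keep]
        _, gamma, ll = fit_exponential(fit_days, totals)
        rows.append((xi, gamma, ll))
    return rows


def estimate_xi(rows):
    """Grid maximiser of xi with its 95% profile likelihood interval."""
    xi_hat, gamma_hat, ll_max = max(rows, key=lambda r: r[2])
    inside = [xi for xi, _, ll in rows if ll >= ll_max - CHI2_95_DF1 / 2.0]
    return xi_hat, gamma_hat, (min(inside), max(inside))


# Synthetic illustration only (NOT the NCDC data): day 0 = 11 April, day 19 = 30 April.
days = list(range(20))
gap = (11, 16)  # 22 to 27 April
true_total = [round(10.0 * math.exp(0.2 * d)) for d in days]
reported = [true_total[d] if d < 11 else true_total[10] if d <= 16
            else true_total[d] - 100 for d in days]
xi_hat, gamma_hat, xi_ci = estimate_xi(profile_xi(days, reported, gap, range(0, 401)))
print(xi_hat, round(gamma_hat, 3), xi_ci)
```

In this synthetic setting the interval for the under-ascertained count is wide, because only three post-gap observations inform it; the same limitation would apply to any short series.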
Since the transmission chain of COVID-19 in Africa has not yet been fully uncovered, we adopted the SI information of COVID-19 from previous works (Du et al., 2020; Nishura et al., 2020). The SI density $h(k)$ was modelled as a lognormal distribution with a mean of 5.0 days and a standard deviation (SD) of 1.9 days (Du et al., 2020; Nishura et al., 2020). We note that slightly changing the SI information may not affect our main results and conclusions. In this work, we also aimed to evaluate the trend of the daily number of cases, denoted by $\varepsilon_i$ for the $i$-th day, where $C_i = C_{i-1} + \varepsilon_i$. A simulation algorithm based on a previous study (Zhao et al., 2020b) was formulated in which $\varepsilon_i$ is drawn iteratively from a Poisson distribution with mean $E[\varepsilon_i]$, where the function $E[\cdot]$ denotes the expectation. For details of the simulation framework, see Zhao et al. (2020b).
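To make the link between the growth rate and the reproduction number concrete, the sketch below evaluates the relation of Wallinga & Lipsitch (2007), R0 = 1/M(-gamma), where M is the moment generating function of the SI distribution, for the lognormal SI with mean 5.0 days and SD 1.9 days. The lognormal has no closed-form moment generating function, so the integral is approximated numerically; the integration scheme, the function names and the growth rate of 0.2 per day are illustrative assumptions, not values or code from the study.

```python
import math


def lognormal_params(mean, sd):
    """Convert the mean and SD of a lognormal variable to its log-scale mu and sigma."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)


def si_laplace_transform(gamma, mean=5.0, sd=1.9, n=4000, z_max=8.0):
    """Approximate M(-gamma) = E[exp(-gamma * k)] for a lognormal SI k.

    Uses the substitution k = exp(mu + sigma * z) with z standard normal and
    the trapezoid rule on [-z_max, z_max].
    """
    mu, sigma = lognormal_params(mean, sd)
    h = 2.0 * z_max / n
    total = 0.0
    for j in range(n + 1):
        z = -z_max + j * h
        weight = 0.5 if j in (0, n) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * density * math.exp(-gamma * math.exp(mu + sigma * z))
    return total * h


def basic_reproduction_number(gamma, mean=5.0, sd=1.9):
    """R0 = 1 / M(-gamma), assuming 100% susceptibility."""
    return 1.0 / si_laplace_transform(gamma, mean, sd)


# Hypothetical growth rate of 0.2 per day, chosen only for illustration.
r0_example = basic_reproduction_number(0.2)
print(round(r0_example, 2))
```

A growth rate of zero returns a reproduction number of one, and by Jensen's inequality the result can never exceed exp(gamma * mean SI), which gives two quick sanity checks on any implementation.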
Furthermore, we employed the Autoregressive Integrated Moving Average (ARIMA) model, a time series model that has been used in a previous study for short-term prediction (Maleki, 2020). The model has three key parameters: p (the autoregressive order), d (the degree of differencing) and q (the moving average order). We assigned different values to these parameters to form candidate models. To fit a model with given values of p, d and q, the conditional-sum-of-squares method is used to find starting values of the other parameters, which are then passed to the maximum likelihood method to calculate the final estimates (Shephard, 1997). Because Akaike’s Information Criterion (AIC) (Akaike, 1998) is widely used to select the best model among alternatives (Wang et al., 2018; Ömer, 2010; Mondal et al., 2014), we calculated the AIC value of each candidate model and chose the model with the lowest AIC. The 95% confidence intervals are obtained under the assumption that the residuals of the model are normally distributed. Because of this normality assumption, some predictions could fall below zero; in keeping with the actual situation, we kept only the non-negative predicted values.
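The model selection and the handling of negative predictions can be illustrated with a short sketch. It does not fit ARIMA models: the log-likelihoods and parameter counts below are made-up placeholders standing in for the output of a fitting routine. Truncating negative values at zero is one possible reading of keeping only the non-negative predicted values, adopted here as an assumption. All names are hypothetical.

```python
from statistics import NormalDist


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 * log-likelihood."""
    return 2.0 * n_params - 2.0 * log_likelihood


def select_lowest_aic(candidates):
    """Return the (p, d, q) key with the lowest AIC.

    candidates maps (p, d, q) to a (log_likelihood, n_params) pair.
    """
    return min(candidates, key=lambda order: aic(*candidates[order]))


def forecast_interval(point, std_error, level=0.95):
    """Normal-theory interval for a forecast, truncated at zero (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower, upper = point - z * std_error, point + z * std_error
    return max(0.0, point), max(0.0, lower), max(0.0, upper)


# Placeholder values, NOT results from the study.
candidates = {
    (1, 1, 0): (-52.3, 2),
    (0, 1, 1): (-51.9, 2),
    (1, 1, 1): (-51.7, 3),
}
best_order = select_lowest_aic(candidates)
point, lower, upper = forecast_interval(2.0, 3.0)
print(best_order, point, lower, round(upper, 2))
```

In the placeholder example the order (1, 1, 1) has the highest log-likelihood but is not selected, because the AIC penalises its extra parameter.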
The R statistical software (version 3.5.1) was used for the simulations in this study.