ASYMPTOTIC ANALYSIS OF THE SIR MODEL. APPLICATIONS TO COVID-19 MODELLING

. The SIR (Susceptible-Infected-Removed) model can be very useful in modelling epidemic outbreaks. The present paper derives the parametric so- lution of the model in terms of quadratures. The paper demonstrates a simple analytical asymptotic solution for the I-variable, which is valid on the entire real line. Moreover, the solution can be used successfully for parametric estimation either in stand-alone mode or as a preliminary step in the parametric estimation using numerical inversion of the parametric solution. The approach is applied to the ongoing coronavirus disease 2019 (COVID-19) pandemic in three European countries – Belgium, Italy and Sweden.


Introduction
The ongoing coronavirus 2019 (COVID-19) pandemics became a global emergency in 2020. Since the pandemic is still ongoing new epidemiological tools, which can accurately predict the time course of different outbreaks, can be of considerable public utility. While sophisticated mathematical models may be desirable for simulating different control strategies, it is difficult to accurately define all necessary inputs. Therefore, relatively simple epidemiological models can turn out also to be of comparable merits. Very recently, some authors have convincingly demonstrated that the SIR (Susceptible-Infected-Removed) epidemiological model can successfully model the short-term dynamics of COVID-19 for a number of countries [1,2,3,4,5,6].
The SIR model was introduced by Kermack and McKendrick in 1927 but it still remains a cornerstone of mathematical epidemiology [7]. The model can be extended in two directions -either by adding a final state, e.g. "dead" individuals -D; or by adding one or more intermediate non-observable populations -e. g. "exposed" E individuals. Very recently, it was also demonstrated [9] that COVID-19 can be approximated by a SIRD model. This result is supported by the findings in [10] where the deceased population could also be fitted by the SIR model.
The analytical solution of the SIR model was derived only relatively recently [11] as an equation of state between the phase space variables. As an alternative approach, Barlow and Weinstein derived a series for the S-variable and introduced rational convergents having better regions of convergence [5]. A different approximation scheme was introduced in [12]. Very recently, an inverse parametric solution for the I-variable has been also obtained [6,10,13].
The exact analytical solution of the SIR model was obtained by numerical inversion of a non-elementary integral equation for the I-variable by a Newton iteration (NI) scheme [10]. However, the numerical stability of the iteration scheme could not be thoroughly established, since the asymptotic analysis of the model is non-trivial.
The objective of the present work is to demonstrate an approximation scheme in terms of standard transcendental functions: exponents and gamma incomplete functions and to compare them with the previously introduced NI scheme. Moreover, the presented results demonstrate that the approximation scheme also provides asymptotic solutions for the incidence variable, which are valid on the entire real line.

Preliminaries of the SIR model
The SIR model has been typically used to study the spread of various infectious diseases (see the monograph of Martcheva [8] or [14]). The SIR model is formulated in terms of 3 populations of individuals [8]. The S population consists of all individuals susceptible to the infection of concern. The I population comprises the infected individuals. These persons have the disease and can transmit it to the susceptible individuals. The R poulation cannot become infected and the individuals cannot transmit the disease to others. The model comprises a set of three ordinary differential equations ODEs:Ṡ By construction, the model assumes a constant overall population N = S(t)+I(t)+ R(t) [7]. The interpretation of the parameters is that a disease carrier infects on average β individuals per day, for an average time of 1/γ days. The β parameter is called disease transmission rate, while γrecovery rate. The average number of infections arising from an infected individual is then modelled by the number R 0 = β γ , the basic reproduction number. Typical initial conditions are S(0) = S 0 , I(0) = I 0 , R(0) = 0 [7]. A phase diagram of the model is plotted in Fig. 1.
For simplicity of the presentation, the SIR model can be re-parametrized using normalized variables asṡ subject to normalization s + i + r = 1 and time rescaling as τ = βt.
Eq. 5 has a zero at s = g. Moreover, it is a maximum sincë and all variables are positive defined.

Solution procedure
Since there is a first integral by construction, the system can be reduced to two differential equations in the phase planes (i, s) and (i, r), respectively: In order to solve the model we will consider the two equations separately. Direct integration of the equation 7 gives where the constant c is a first integral that can be determined, for example, from the initial conditions.
where the signs denote the two different real-valued branches of the function. Note, that both branches are of interest since the argument of the Lambert W function is negative. Therefore, eq. 5 can be reduced to the first-order autonomous systeṁ which can be solved implicitly as This is, in fact, the solution obtained in [11].
3.3. The r-variable. The r variable can also be conveniently expressed in terms of i. For this purpose we solve the differential equation Since (s 0 , 0, 0) is a stable point it follows that Therefore, 4. The parametric solution The implicit solution eq. 12 can be computed as a definite integral, however this requires the computation of the Lambert W function on every integration step, which does not seem to be efficient. Alternatively, the solution can be computed more efficiently by change of variables by Prop. 2: This equation involves quadrature of only elementary functions, which are present in any numerical package. The solution is plotted in Fig. 2.
A B Figure 2. Parametric solution i(τ ) Parametric plot of (τ (i), i) for parameter values Ac = 6.5, g = 1.0 and Bc = 6.5, g = 2.0; LB denotes the left branch involving W − (x), RB denotes the right branch involving W + (x), eq. 18. Plots were produced using the quad qags Maxima numerical integration command.

4.2.
The s-variable. The parametric solution can be computed from eq. 9. It follows that where the domain of s is

The r-variable.
To solve for τ (r) we differentiate eq. 15 by s. From where for the c-parametrization. The domain of r is Remarkably, all of the involved integrals are non-elementary as demonstrated in [10]. Plots of the solutions are presented in Fig. 3.

4.4.
Initial value parametrization. For parametrization, based on an initial value, the indeterminate constant c can be eliminated using the initial condition as Moreover, the peak i m can be also determined from the initial conditions as For this case, the following autonomous differential equation can be formulated: where the negative branch is taken for τ < τ m and the positive branch is taken in the alternative case.
It is easy to re-paramterize the solution for given initial values as this amounts to computing the value of c from the initial conditions and adding a time offset to the integrals, amounting to the time to the peak. The time to peak can be calculated from eq. 19 as considering that the upper pole of integration is the initial condition for s 0 and the lower pole is g, corresponding to the maximum of the i-variable. In this case, the parametric model can be formulated as (τ (s) + τ m , s), (τ (i) + τ m , i), (τ (r) + τ m , r). This can be demonstrated for the initial value problem in Fig. 4.
A B Figure 3. Parametric solutions s(τ ), r(τ ) Parametric plots of (τ (s), s) and (τ (r), r) for parameter values Ac = 6.5, g = 1.0 and Bc = 6.5, g = 2.0; S denotes τ (s), eq 19. R denotes τ (r), eq 20. Plots were produced using the quad qags Maxima numerical integration command. 4.6. Computational aspects of the parametric solution. All three integrals can be efficiently computed by numerical quadratures. Reference implementation in the Computer Algebra System Maxima has been developed and the code is available through the Zenodo repository [16]. Maxima incorporates an efficient numerical integration routine ported from QUADPACK [17]. The integration functions compute a result to a user specified accuracy. Parametric plots of the parametric solution are exhibited in Fig. 2, 3 and 4. The numerical integrals were computed with relative precision 10 −8 .

Asymptotic analysis of the SIR model
Since the SIR solution is non-singular anywhere in R one can make use of the Banach Fixed-Point Theorem. Notably, one can use the non-linear approximation scheme of Daftardar-Gejji-Jafari (DJM method) for solving the equivalent integral equations [18]. If we treat the equations of the SIR model as independent we can formally solve eq. 4 as On the other hand, eq. 5 can be transformed formally aṡ Therefore, formally, Starting from the 0 th order approximation i (0) ≈ i 0 , it follows that However, this does not guarantee convergence of the iteration. To establish convergence we observe that s = g is a fixed point of eq. 7 since di/ds = 0 for this point and, therefore, Banach theorem can be applied. Therefore, we must take s 0 = g as an initial condition. This corresponds to the peak-value parametrization so i 0 = i m , i (0) = 0 and can be formulated as a functional integral equation. From this, the 1 st order approximation for the i-variable becomes where Γ is the upper incomplete Euler's gamma function. Therefore, Matching the limit at t → ∞ results in the equation which can also be used for fitting purposes. Then Matching the initial condition gives the value of the constant Therefore, Finally, following the same procedure for the i-variable we obtain where a = g im . The asymptotics of the i-variable are shown in Fig. 5. A -the asymptotic is computed from eq. 25; B -the asymptotic is computed from eq. 28. Plots were produced using the quad qags Maxima numerical integration command.
Therefore, for t → ∞ and we can conclude that i(t) dominates i (1) (t) as t grows from 0. The same argument holds also for t → −∞, however in this case sine the nominator is bonded while the denominator grows exponentially. The sign of q is again positive since W − (z) decreases away from −1 for z ∈ [−1/e, 0) while the sign of the denominator is negative.

Datasets
The COVID datasets were downloaded from the European Centre for Disease Prevention and Control (ECDC) website: https://opendata.ecdc.europa.eu/ covid19/casedistribution/csv. The downloadable data file was updated daily until 14 Dec 2020 and contains the latest available public data on COVID-19 aggregated per country. The data collection policy is available from https://www. ecdc.europa.eu/en/covid-19/data-collection.

Data Processing
The data were imported in the SQLite https://www.sqlite.org database, filtered by country and transferred to MATLAB for parametric fitting using native routines. Quadratures were estimated by the default MATLAB integration algorithms. Estimated parameter values were stored in the same database. The processing is described in [10], The parametric fitting was conducted using least-squares constrained optimization algorithm.
The least-squares constrained optimization was preformed using the fminsearchbnd routine [20]. The fitting equation is given by where I t is the observed incidence or case fatality, respectively. For numerical stability reasons the time variable was rescaled by a factor of 10.

Case Studies
The asymptotic approach is exemplified with data from ECDC for several European countries in the period Jan 2020 -Dec 2020. The data analysis is limited to 14 Dec due to reporting reasons [10].  Figure 6. Case fatality parametric fitting for Belgium and Sweden Parametric fitting of the case fatality data of Belgium (A) and Sweden (B); 'asymp' refers to parametric fit using the asymptotic formula eq. 25, 'sir a ' refers to the i-variable computed by numerical inversion using the parameters estimated by eq. 25 The most direct and "brute-force" approach is to directly fit the time series using the numerical inversion scheme. Such an approach has been followed in [10]. This incurs a relatively high computational cost associated with computation of integrals, followed by Newtonian iteration. Alternatively, the parameters can be fitted using the asymptotic eq. 25 only (see Fig. 6). Moreover, the asymptotic parameter estimation demonstrated faithful representation of the exact model (see Fig. 6). The numerical inversion approach is illustrated in Fig. 7 and compared with the asymptotic fitting in the subsequent tables. The raw series for the incidence demonstrated pronounced weekly variation. However, this did not preclude successful parameter estimation.
The optimal fitting scheme involved three steps (1) initial estimation from the raw data of the peak, the time to the peak and assuming g 0 = 1. (2) asymptotic estimation of i m , g, and T (3) refined estimation of i m , g, and T , involving numerical inversion of eq. 18.
Case fatality parameters for the first and second waves are presented in Tables  1 and 2. Incidence parameters are presented in Tables 3 and 4. From the data it is apparent that the new fitting procedure did not change parameter estimates for the countries with pronounced outbreaks. The estimates obtained using the method in [10] are marked with asterisks. In contrast, variation in the fitting could be observed for the first wave in Sweden. There the method did not converge well for the incidence and the data are not presented.  Figure 7. Parametric fitting of the first and second waves in Italy A, B -case fatality fitting; C, D -incidence fitting; 'asymp' refers to parametric fit using the asymptotic formula eq. 25, 'sir' refers to fitting the i-variable computed by numerical inversion using the parameters estimated by eq. 25. Note the pronounced weekly variation of the reported numbers.
The data demonstrate very good numerical (Tables 1, 2, 3, 4) and graphical agreement of both methods between themselves. Moreover, there is also an excellent agreement with the raw time series.

Discussion
The present results can be discussed in two directions.  Table 3. Incidence parameters, first wave * -fitting method as used in [10]. T is given in weeks and refers to the time passed since 1 st Jan 2020.  Table 4. Incidence parameters, second wave * -fitting method as used in [10]. T is given in weeks and refers to the time passed since 1 st Jan 2020. 9.1. Analytical implications. The asymptotic analysis of the SIR model is important in terms of the Newtonian approximation scheme since incorrect initial guess will converge the iteration to the wrong branch of the Lambert W function. Therefore, one needs an initial function, which is dominated by the i-variable on the entire real line. The previous publication using Newtonian iteration [10], did not address the asymptotic analysis of the SIR model. Instead, the original asymptotic of Kermack and McKendrick was used. However, this did not work for all combinations of parameters. Originally, Kermack and McKendrick proposed hyperbolic cosine, while Barlow and Weinstein [5] used Padé approximation scheme. An interesting result, related to the present asymptotic scheme, is the emergence of two fundamental time scales -a longer one determined by g and a shorter one determined by i m . Such phenomenon is not apparent from the form of the differential equations only, as these explicate only the longer time scale g. Therefore, the appearance of the shorter scale is a truly emergent phenomenon. It will be interesting to investigate such emergence also other epidemiological models derived from SIR.

Epidemiological implications.
A key finding of the presented results is that simple models are very useful in studying epidemic outbreaks also in the quantitative sense. This comes in contrast to some shared beliefs that models should be made more complicated to tackle real world epidemics [21]. Presented results demonstrate the utility of the SIR model for estimating the basic reproduction number and the peaks of the infections, respectively case fatalities in the COVID-19 pandemic outbreaks. There is a renewed interest in the applications of the SIR model in view of the COVID-19 pandemics [22,1,2,3,23,4,24]. In contrast to the approach in [24], here the asymptotics are derived from first principles, that is the non-linear algorithm of [18].
The new asymptotic approach simplifies and stabilizes the data fitting procedure compared to [10]. The employed numerical procedures also demonstrate increased robustness to fluctuations.

Funding
No specific funding available.

Conflicts of interest
The author declares no conflict of interest.

Availability of data and material
The COVID datasets were downloaded from ECDC:https://opendata.ecdc. europa.eu/covid19/casedistribution/csv. The analysis pertains to the version from 14 Dec 2020.

Code availability
Reference implementation in the Computer Algebra System Maxima has been developed and the code is available through the Zenodo repository [16].