Inference of an Epidemic Driven by Random Transmission Rate: Explaining the Dynamics of COVID-19 in Bogotá

The epidemic disease model is often used to predict an epidemic’s characteristics and help plan an eﬀective control strategy. Motivated by recent COVID-19 outbreaks, we develop and mathematically analyze a simple stochastic model that considers random perturbations in transmission rates and shows its validity using data from the Colombian city of Bogota. The stochastic epidemic model is stratiﬁed in the Susceptible-Exposed-Infectious-Recovered, SEIR , type compartmental model with the randomness depicting the impact of stochastic variations due to external factors such as changes in environmental conditions and human behaviors that may impact the dynamics of an infectious disease. The analysis resulted in derivation of approximate distribution of eventual extinction of infection state , persistence of infection in the mean , and the quasi-stationary infectious state ). Finally, we illustrate inferences and model parameters using reported COVID-19 epidemic data from the Colombian city of Bogotá. The outbreak in Bogota is divided into six distinct periods, with transmission rates high during the initiation of the epidemic and even much higher during the last period, potentially due to the rapid spread of the Omicron variant in Colombia still has a low vaccination rate.


Introduction
Studying the spread and transient behaviors of an infectious disease outbreak over time is vital for all public health departments.It allows health departments to identify disease control thresholds and help design timely and effective intervention measures such as successfully implementing vaccine-and treatment-related policies.In the last decades, there have been large global outbreaks of transmitted infectious diseases such as Zika, Chikungunya, Dengue, Leishmaniasis, influenza H1N1, SARS, Ebola, and more recently, SARS-CoV2 (Table 1).Epidemiology studies incorporate everything from geographical characteristics to complex mathematics to understand the disease distribution changes.When diseaserelated data is unknown or scarce due to resource limitations, mathematical modeling and analysis can help quantify some important basic epidemiological metrics that can help gain insights into the spread of the diseases and preventive measures to control it.A widely used models for epidemic studies are the deterministic mathematical models that use the system of coupled ordinary differential equations and are often referred as SIR-(for its acronyms, susceptible, infected and recovered ), SIS-(for its acronyms, susceptible and infected ) and SEIR-( by its initials, susceptible, exposed, infected and recovered ) type models.For example, the structure of the SEIR model (Figure 1) can be as follows: Infectious disease dynamics offer a wide variety of variations and unknown phenomena related to its spread.One of the critical aspects that are poorly understood is the role of seasonality on the transmission dynamics, which could drastically be different from one disease to another or between regions.There are numerous mechanisms in the literature that have been examined for their possible links with seasonal patterns (or seasonality).Seasonal host health and host nutrient intake, crowding, ambient temperature, population movement patterns, and humidity level are some of these mechanisms that govern the seasonality in directly transmitted diseases (see Table 1 for some examples).Hence, understanding the role of these factors can provide greater insight into all aspects of the dynamics of transmission and maintenance of this disease in humans.
One of the complexities in epidemic models is including appropriate process-stochasticity so that the role of randomness does not diminish as the population size increases.The stochasticity is typically either due to inherent features of the system, referred to as demographic stochasticity, or due to environmental factors external to the system.Generally, these stochasticity are captured using discrete-time Markov Chains (DTMCs), continuous-time Markov chains (CTMCs) and stochastic differential equations (SDEs) type models.In the models presented in this study, the variations, in human behaviors and/or other environmental variables as a factor that randomly modifies the infectivity rate (β) of a disease, is captured (as been suggested by some empirical studies [44]).Thus, referring to the model as a stochastic model with environmental stochasticity.Here, we construct a epidemic model with random perturbation using framework of a SEIR-type deterministic epidemic model and describing it in a form of a system of SDEs.As a case study, we attempt to capture COVID-19 dynamics using data from Bogota and various approximate distributions describing epidemic characteristics.Table 2 summarizes this article's mathematical analysis and numerical results.
2 Stochastic Model Description

Randomness in environmental conditions
Stochastic modeling of epidemics is critical when random variations are observed in processes such as transmission, births, or deaths impacting epidemic outcomes.The random variations due to population characteristics are referred to as demographic stochasticity, whereas the randomness associated with the environmental conditions is referred to as environmental stochasticity.These two types of stochasticity can impact any of the epidemiological or demographical processes (transmission, births, or deaths).However, appropriate stochasticity should be considered based on the disease and the affected region.In this study, we consider a case of a directly transmitted disease in which environmental factors such as changes in human behaviors due to an outbreak are the only cause of randomness.This stochasticity is caused by various causes, such as changes in contact patterns of social behavior and changes due to the Seasonal host births and flocking increase transmission during fall and winter, partial immunity is also important [21]

Rabies
An SEIR model with a incubation period drawn from a Poisson distribution Autumn peak in eastern US; early spring peak in central US Dispersal of juveniles during fall could increase contact rates with raccoons; early spring peak coincides with breeding activity [36] control measures taken during an outbreak of a disease.In particular, some papers as [19,38,39,47,52] suggest that population density, air pollution, poverty, and even land use can affect the transmission coefficient of the COVID-19.Beyond this, [52] emphasizes "distancing and travel restriction in reducing the COVID-19 pandemic".
he influence of environmental factors on infectious disease dynamics has been subject to study in recent years (see [37] for review).We hypothesis that the number of newly effective contacts per unit time per person (represented by β in the deterministic SEIR model) changes randomly (leading to model with perturbations) in dt time units due to environmental variations and is captured by βdt, a random variable with an average of βdt.Defining the random variable β i as number of individual infected during the interval t + i−1 n T, t + i n T for all i = 1, . . ., n with T > t ≥ 0. Suppose further that random variables {β i } n i=1 are independent and equally distributed.By the central limit theorem if n → +∞ then the the SIS model with random perturbations [18] and SIR model with saturated incidence [34].

Infection
Extinction SIS model with random perturbations [18] and SIR model with saturated incidence [34].
SEIR model with random perturbations

Mean Infection Persistence
SIS model with random perturbations [18] and SIR model with saturated incidence [34].
SEIR model with random perturbations.A implication of the persistence in the mean on this model, respect to the extinction

Stationary Prevalence
SIS model with random perturbations, mean and variance of this distribution [18].SIR model with saturated incidence [34].
Existence of the stationary distribution for the SEIR model with random perturbations.

Scheme of SDE
Euler-Maruyama method to a n−dimensional SDE system (with n volatility parameters) and its likelihood function [1].Example: a stochastic predatorprey model Euler-Maruyama method to a n−dimensional SDE system (with less than n volatility parameters) and its likelihood function.Example: SEIR model with random perturbations 4.1

Statistical Inference
Estimation of the volatility parameters using the maximum likelihood estimation [1].Computational algorithms for minimizing the sum of squares [2,8,10] Computational minimizing of the sum of squares, with some initial parameters given by a data update parameter estimation.Once known the deterministic model parameters, the volatility parameter is estimated by maximum likelihood estimation.

COVID data
Bogotá COVID-19 data 4.2 total number of individual in the interval [t, T ), denoted by β t,T is equal to β 1 + . . .+ β n , which has normal distribution with parameters nµ 0 and nσ 2 0 , where µ 0 and σ 2 0 are respectively the mean and the variance of each random variable β i [18].If the average number of infected during the period [t, T ) is denoted by β = 1 n β t,T , its distribution will be normal with parameters µ 0 and σ 2 0 /n.Given dt = [t, T ) then β ≈ βdt, therefore, it can be considered that the random variable βdt has normal distribution.
Our model considers environmental variations in the infectivity term caused by variations in contact rates due to fear and/or outbreak-related change in human behaviors.However, for simplicity, we do not consider the impact of the variations into taken into the birth and death rates, even though they might have some effect.The environmental variations are modeled via the following equation: where β and σ are positive constants that denote, respectively, the mean per capita infectivity rate,, and standard deviation of the Brownian motion, {B (t)} t≥0 respectively.Note that the differential of β + σB (t) is given by Since we assume that The studies of epidemic models with random perturbations recently gained greater attention (see for example, [14,20,30,25,53] ).Also, theoretical studies of epidemic models with random perturbations [6,26,24,33,48,49,51].In this model, we assume that the infectivity rate is affected by the changes in human behaviors.Similar to the System (2.5), we consider a system of stochastic differential equations with an incubation state known as the SEIR model with random perturbations (the flowchart of the model is shown in Figure 2) as follows:  3 Stochastic SEIR Model Analysis

Infection Extinction
Existence and uniqueness are shown in the appendix B.1, which means that the infected population is always defined on the SEIR model with random perturbations.On the other hand, the extinction determines under what conditions, concerning the parameters, the infectious disease will disappear almost surely.That is The proof of the theorem 3.1 is based [34], the proof uses the following lemma 3.1 [35].
Theorem 3.1.Let (S (t) , E (t) , I (t)) be the solution of the SEIR model with random perturbations and initial condition (S (0) , E (0) , I (0)).It satisfies the following: ) then the SEIR model with random perturbations has extinction.
Proof : (i ) From the stochastic system (2.6), it observe that E (t) is given by By the Itô ′ s formula applied to the function ln E (t), it notes that Rewriting ln E (t) as a integral, it has the equation (3.7) by the lemma 3.1 tanking δ = 2, v n = b > 0 and τ n = n, for all t such as 0 ≤ t ≤ n and all n ≥ η (ω), it notes that for all t with 0 ≤ t ≤ n and all n ≥ η (ω).
By the inequality (3.8), it observe that Suppose that for any t ≥ 0, L (t) tends to infinite, case which E (t) tends to 0. Under this limit we have that the previous inequality is maintained.In that way, by the inequality (3.9) and the previous inequality we observe that for all t with 0 ≤ t ≤ n, all n ≥ η (ω) and almost all ω ∈ Ω, taking t → +∞ and therefore n → +∞, for almost all ω ∈ Ω, then for almost all ω ∈ Ω. Taking n such as n − 1 ≤ t < n by the L'Hôpital ′ s rule it has that like b > 0 is arbitrarily less than 1 then taking b tending to zero, then (3.10) is given by for almost all ω ∈ Ω .
(ii ) By the inequality (3.11) is clear that taking the sequence E 1/n (n) n∈N by the root test ([42, pp.65-66]) the sequence converge to a point a, that is, On the other hand, by the stochastic system (2.6) is noted that d (S (t) + I (t)) is given by solving the equation (3.12), taking y (t) = S (t) + E (t), g (t) = N η − υE (t) and p (t) = µ for all t ≥ 0, according to [4, pp.66] the solution is being a indeterminate limit of the kind ∞ ∞ which by L'Hôpital ′ s rule corresponds to Since lim t→+∞ E (t) = 0 then lim t→+∞ S (t) = η µ N .
(iii ) By the stochastic system (2.6): whose solution with respect to y (t) = S (t) + E (t) + I (t) is given by analogously to the reasoning done in previously, note that On the other hand, according to then lim t→+∞ I (t) = 0, almost sure.That is, the SEIR model with random perturbations has extinction.■

Infection persistence in the mean
This section discusses persistence in the mean for the infective population.In general, a model is said persistent in the mean if only in average it always will have an infected population in a lengthy interval of time.In the stochastic model The following theorem gives sufficient conditions to have the persistence in the mean in the SEIR model with random perturbations.
then the SEIR model with random perturbations is persistent in the mean.
Proof : According with the system (2.6) integrating from 0 to t and dividing by the length of the time t note that that is where On the other hand, the system (2.6) also is possible observes that where ϱ (t) = 1 υ (I (t) − I (0)).By using Itô ′ s formula about ln E (t) note that For bounding β (S (t) I (t)) /E (t) when t tends to infinite it does , thus when t tends to infinite, it has that E (t) and I (t) in average are E * and I * ( [32]), respectively, so that on the other hand, It is possible notes the next inequality taking the integral from 0 to t, note that where κ (t) = 1 t (ln E (t) − ln E (0)).By the equation (3.14) it is equal to by the equation (3.15), note that by the law of large numbers for martingales ( [35][pp.12]), then we have that lim t→+∞ L(t) t = 0, the next inequality if r 0 > 1 then the SEIR model with random perturbations is persistent in the mean.■ When t is very large as in the theorem 3.2 is shown (under certain sufficient conditions) when the SEIR model with random is persistent in the mean, actually the theorem gives a lower bound for infected individual.Now it will give a upper bound for the susceptible population.According with the theorem 3.2 that is, for all ϵ > 0 and for all ω ∈ Ω exist a positive constant T (ω) such as on tge other hand, from the equations (3.14) and (3.15) note that so that, for all t > T (ω), A deterministic model is infection persistent in the mean when the basic reproduction number is greater than one, that is, when the model there is no extinction of the infected population (see [5]).In this way, r 0 defined by (3.13) can be considered as the basic reproduction number of the SEIR model with random perturbations.In [40] this concept is discussed, defining the basic reproduction variable by is the basic reproduction number of the SEIR model given by R SEIR 0 = ηυβN µ(µ+υ)(µ+γ) [12].The value of RSEIR 0,E in [40] is the value under which the SEIR model with random perturbations is asymptotically stable when RSEIR 0,E < 1. Preposition 3.1.On the SEIR model with random perturbations, persistence in the mean implies β 2 ≥ 2σ 2 (υ + µ).
whose unique critical point is given by: newly, deriving the function r with respect to N We wish to observe that the persistence in the mean implies a unique stationary distribution if 4 Numerical Results

Discretization and Parameter Estimation Method for the Stochastic Model
We now focus our attention for estimating parameters of the proposed model and simulating the paths by using the Euler-Maruyama Method based on [1].For estimating the values of S(t j+1 ), E(t j+1 ), I(t j+1 ) and R(t j+1 ) we define recursively the estimators Ŝ(t j+1 ), Ê(t j+1 ), Î(t j+1 ) and R(t j+1 ), by the discretized version of the stochastic system (2.6) given by: where S(0), E(0), I(0) and R(0) are defined by the initial conditions since that the epidemic gets to the study region.The objective in this section is to determine the estimators β, γ, η, μ,, υ and σ for the parameters in the SEIR model with random perturbations.The idea is to use the discrete approximations given by the Euler-Maruyama Method to estimate model parameters (procedure similar to one given in [1]).To carry out the procedure, we need the data corresponding to the infected, susceptible, exposed and recovered population, that is, the values for S (t i ) , E (t i ) , I (t i ) and R (t i ) when i = 0, . . ., n, t 0 = 0 and t n = n.
Note that I(t j+1 ) and R(t j+1 ) are random variables with mean µ Ij+1 and µ Rj+1 , respectively.However, the variance is zero, since in the system (2.6)I (t) and R (t) do not have an stochastic constant.
Therefore, the conditional probability density function is given by As det Σ = 0, the function is not defined, which is why, a "near" probability density function is defined, which corresponds to where Σ + j is the Moore-Penrose generalized inverse of the matrix Σ j .Λ j is the matrix with the eigenvalues different to zero of the matrix Σ j (desintegration theorem in [46, pp. 4-5]).In our case is For determining the Moore-Penrose inverse the process given by ([45, pp.302−303]) is followed, obtaining the matrix Σ + j given by then, the approximate probability density function is given by On the other hand, by the definition of conditional density probability function, it has that If we assume (S (t) , E (t) , I (t) , R (t)) T t≥0 is a Markov process, then S (t j+1 ), E (t j+1 ), I (t j+1 ) and R (t j+1 ), depend only on the values of S (t j ), E (t j ), I (t j ) and R (t j ).The likelihood function is given by and the log-likelihood function is For estimating the parameters, the log-likelihood function is maximized.First, we maximize for σ, having taking dl (θ) /dθ = 0, the estimator for σ2 is given by In the appendix C for the goodness bit for the simulated data of the volatility parameter estimation is presented.

An application to actual data: COVID-19 data from Bogotá
Now, we apply this theory for COVID-19 data in Bogotá during the first 385 days (from march 6 rd /2020 to march 29 th /2021), the daily reported database available in the Instituto Nacional de Salud de Colombia 2 .The database also has the recovered data corresponding to those who tested positive for COVID-19.
The figure 4 shows the infected and recovered data along with the smoothed data using the function frfast of the package npregfast in R. We analysed six time periods based on strain variant and infections spread.See the figure 4 the vertical lines corresponds to points where the infection conditions changed (see [11]) and points where the path for the infected population changes.As the infection conditions for each intervals are different, we estimate the parameters β, υ and γ for each interval.The other parameters are ηN ≈ η N = 73660.35such as N is the average of the individuals entering the Bogotá for daily work, η = 9.42/1000 and µ = 4/1000.N , η and µ are taken from the Colombian Official Statistics Office (DANE) 3 .The estimation is based on the sum of squares only for the infected data (see [2,8,10] where computational algorithms are applied for minimizing that function).The function is by minimizing the equation where I(t i ) is the data for the time t i and ϕ(t) is the solution for the infected population of the system 1.1 (see [29]), which is approximated by the function ode from the package deSolve in R. Given that we only have infected and recovered data (there are not susceptible and exposed data), we propose to estimate γ by solving this parameter of the system 4.19 with σ = 0, that is then we take γI = γj I = 1 n n−1 j=0 γ j for each interval I = I j , with j = 1, ..., 6, shown in the figure 4. Note that the data has a great variability, therefore, we suggest to smooth the infected and recovered population.This means that R(t j ) and I(t j ) are the smoothed values.Once we have the values of γI , for each interval I we estimate by least squares the parameters β I and υ I .As we pretend to use the function optim in R, it is important to give the initial values β 0 and υ 0 .For that, we graph nδ 2   2   for different values of β I and υ I having the figure 10 (see appendix D) to establish good initial parameters.
According to the figure 10, we can note special issue for determining the values of β whose value seems not to have minimum value (see figure 10, appendix D).For this reason, it is convenient to take values of β for which the basic reproduction number is 0 and not significantly big (for example, values greater than 1000).Concerning υ, some intervals do not have a minimum value for positive values, as can be seen in the figures 10b, 10c and 10e, behavior which coincides when the infected population does not increases significantly.Another issue that we find with the intervals [152, 205], [205, 265] and [306, 363] is that there is not a good fit when we take γj I as the initial value for the γ I .In this way, we had to do an eye approach for these intervals, obtaining positive parameters and a good fit, as seen in the figure 5.The values of β I and υ I , where the sum of squares are minimum in the figures 10a to 10e (for positive values of these parameters), and γI (or γ I0 doing the eye approach) are taken as the initial parameters of β I , υ I and γ I , respectively, inside of the function optim in R.
In the table 5 (appendix D) are given the final parameters (solution after applying optim).It is to important to say that the initial conditions for using optim (that is, S(0), E(0), I(0) and R(0) for each I) are A S I1 (0) = B, E I1 (0) = 0, I I1 (0) and R I1 (0), where B is the total projected population in march 6 rd /2020 in Bogotá according to Colombian National Office of Statistics (DANE) 4 .I I1 (0) and R I1 (0) are the smoothed infected and recovered data for the first day, respectively.
B S Ii (0) and E Ii (0) are the solutions for the corresponding time by using the function ode with parameters given in the table 4 for the previous interval I i−1 , for all i = 2, ..., 6.I Ii (0) and R Ii (0) are the smoothed infected and recovered data for the first day of each interval I i , respectively.
In the figure 5 are graphed the infected data, with the smoothed infected population, in addition the solutions by using ode for the infected population whose parameters are β I , υ I and γ I estimated previously and initial values given by the steps A and B. We also graph the values of µ Ij+1 given by the equation 4.20.Note that using the proposed metodology, there is a good fit for the infected population.
Additionally, the values of µ Ij+1 are close to the solutions by using ode.
We also calculate R0 I = υ I β I ηN µ(γ I +µ)(υ I +µ) , for each I = I 1 , ..., I 6 , the estimated basic reproduction number of the SEIR model (system 1.1).Following the interpretation of [16], as R 0 > 1 then "each infected individual produces, on average, more than one new infection, and the disease can invade the population", this means that for each interval the disease always was endemic.
For estimating σ by σ (equation 4.21) for each interval I, we need to have the data S (t j+1 ) and E (t j+1 ).These values must be estimated for which we use S (t j+1 ) and Ȇ (t j+1 ) given by where S (t j ) and Ẽ (t j ) are the solutions of the system 1.1 approximated by the function ode of the package deSolve with initial values given by the items A and B, and parameters β I , υ I and γ I given in the table 4 for each interval.In addition, the values of I (t j ) are the smoothed infected data.The reason why we consider I (t j ) into the estimation of the susceptible and exposed population is to involve the data on the estimation of the volatily parameter.Values of Ȋ(t j+1 ) (equation 4.24) are graphed in the figure 5. Inside the estimation of σ (equation 4.21), µ Sj+1 , and µ Ej+1 are the solutions of the system 1.1 approximated by the Euler Method for t j + 1, whose equation is 4.20, with initial values given by the items A and B. The values of Ŝ(t j ) and Î(t j ) in the equation 4.21 are the approximations by the Euler Method for t j .In the figure 5 also are graphed 100 paths of the stochastic model 2.6 approximated by the Euler-Maruyama Method (system 4.19).According to the figure 5 we can conclude that due to the estimated standard deviation given in the table 3 are small, then the stochastic paths are all too close to the deterministic solution using the function ode, as previously we described.
In the figure 6 we graph the estimations of the susceptible and exposed population given by the values of S(t j ) and Ȇ(t j ) of the equation 4.24.The values of I(t j ) are the smoothed infected data in this equation.In the same graph, we do 100 stochastic paths of the system 4.19 with parameters given by β I , υ I and γ I , and initial values according to the steps A and B. Given that the values of σI are all too small, then the stochastic paths approximated by 4.19 are too close to the solution of system 1.1 by using the function ode.Note that depending if the infected population increases or not, then the susceptible population will increase or not.Actually, in the first interval, it is observed that once gets the maximum susceptible population, this decreases and does not get the maximum value again.The exposed population initially observed a significant increase in behavior that was not newly observed in the rest of the intervals.This means that a rapid increase in susceptible and exposed populations is only seen during the first days of the epidemic in Bogotá.
In the table 3 we show the values of the estimators of σ by σ, the values of β 2 I /(2σ 2 I ) − (υ I + µ) for seeing the extinction of the infected population (theorem 3.1( ii )) and the values of r0 I for seeing the persistence  4 for each interval I.Estimated infected are the values of Ȋ(t j ) calculated using the equation 4.24 with I(t j ) being the smoothed infected data in t j .Euler infected approximation are the values of Î(t j ) of the system 4.20.Stochastic paths of the infected population are the approximations by the Euler-Maruyama Method (system 4.19) of the system 2.6 with σ estimated by 4.21 and taking S(t j+1 ) and E(t j+1 ) by S(t j+1 ) and Ȇ(t j+1 ), respectively, according to the equation 4.24.
Figure 6: Euler susceptible and exposed approximation are the values of S(t j ) and Ȇ(t j ), respectively, calculated using the equation 4.24 with I(t j ) being the smoothed infected data in t j .Estimated susceptible and exposed are the values of Ŝ(t j ) and Ê(t j ) of the system 4.20.Stochastic paths of the susceptible and exposed population are the approximations by the Euler-Maruyama Method (system 4.19) of the system 2.6 with σ estimated by 4.21 and taking S(t j+1 ) and E(t j+1 ) by S(t j+1 ) and Ȇ(t j+1 ), respectively, according to the equation 4.24.
in the mean (theorem 3.2).In addition, we determinate the distribution of the transmission coefficient which is normal with mean β Ij and variance σ Ij , for each interval I j , j = 1, ..., 6.In this way, we estimate the variance by σIj whose calculus is made by (4.21).Note that all the values of r0 I are greater than one, even when the infected population decreases, which implies persistence in the mean, which means that in average, for enougth large times always will have infected population with COVID-19 in Bogotá.
In this way, we rule out extintion of the infected population.In addition by the preposition 3.1 and the theorem B.2 the stochastic process has an unique stationary distribution for the infected population.On the other hand, the values of RSEIR 0,E in the table 3 indicate that the model considered is unstable, which is why, little variations on the initial values drive to modify drastically the solution of the model.Table 3: σI are the estimators of σ for each interval I.If β 2 I /2σ 2 I − (υ I + µ) < 0 then for the interval I there is extinction of the infected population (theorem 3.1).The values of r0 I can be interpreted as the estimated basic reproduction number for each I.If RSEIR 0,E > 1 (basic reproduction number of the SEIR deterministic model) then the deterministic model is unstable.In the table 4 are given the parameters estimated by using the function optim in R. The initial values for applying that function are taken by doing an eye approach.
Table 4: Parameter estimators of the corresponding intervals for each I after applying the function optim, values noted by β I , υ I and γ I .The values of R0 I is the estimated basic reproduction number for each I.

Conclusions
In this article, we have designed and analytically studied the dynamic properties of the stochastic SEIR model with random perturbations and used reported COVID-19 data from Bogota to infer characteristics of the outbreak over data-driven six distinct periods.The mathematical analysis of this stochastic model results in approximate distribution of mean endemic infection level and probability of extinction of infection.A procedure for estimating model parameters is also developed and used to study relevant epidemic-related quantities captured by parameters.
The analysis of the stochastic model suggests: • The extinction of the disease (that is, the number of infected individuals tending to 0 almost surely) in distribution, when t tends to +∞, is possible when the following condition is met, • If β 2 2σ 2 ≥ (υ + µ), then the disease will persist in the mean and will have the stationary distribution for the infectious population.
-A weaker condition (for disease persistence in the mean) than the previous one is The r 0 , referred as the basic reproduction number (conditions for infectious population persistence almost surely) of the SEIR model with random perturbations, can be interpreted in a similar to the deterministic model as the average number of secondary infections by a single infected individuals.
-When σ = 0, the stochastic system (2.6) corresponds to the equation of the deterministic SEIR model with demography.Its basic reproduction number is R 0 := υβηN µ(γ+µ)(υ+µ) , therefore, if R 0 > 1, then the infectious population in the model persist in the mean.
• We have also discussed the estimation of the model parameter σ by using the discrete approximations of the model given by the Euler-Maruyama Method and derived the log-likelihood function as Our study makes a valuable contribution to understanding the disease extinction and persistence in the presence of environmental stochasticity.Finally, we propose a methodology for estimating the volatility of the transmission coefficient (σ) using reported case incidence data from Bogota, Colombia, as a case study.
Suppose by contradiction that τ * < +∞ thus τ * is finite with Probability 1, that is, it exits T 2 > 0 finite such as Defining the function On the other hand, as max t≥0 S (t) is less than the total population (which is finite) so

B.2 Stationary distribution of the infected population
In this section of the appendix, we study the stationary distribution for the infected population.In our model I (t) is a random variable for each fixed t with distribution determined by: where is the distribution function of I (t) for each t ≥ 0.
The stochastic process {I (t)} t≥0 is defined (as long as, it exists) the stationary distribution as the random variable I (+∞) whose distribution is given by , for all x a continuous point of F , where F I(+∞) (x) is the distribution function of the random variable I (+∞) ( [18,22]).
The stationary distribution for infected population allows us to determinate the asymptotic behaviour of that population.In the proof of the theorem presented in this section, we use the lemma (B.1) given in [28], which uses the definitions of diffusion matrix and the mean time at which a path starting from a state x reaches a set U .
which is symmetric.By the Rayleigh ′ s principle, for all X (t) and ζ ∈ R 4 , it exists the smallest eigenvalue λ such as ζ T A (Z (t)) ζ ≥ λ ∥ζ∥, therefore the condition (i') is satisfied.
In this study, we show when the infected population will have a stationary distribution.To show this statement, we use the (ii') condition given in the following note.
Theorem B.2.The SEIR model with random perturbations has an unique stationary distribution, if Proof : By the Itô ′ s formula applied to the function ln E (t), it notes that Considering that R 0 > 1 then when t tends to infinite, it has that S (t) and E (t), in average, are S * and E * [32], respectively, so that Taking the function g (I (t)) given by one of the solutions is given by if β 2 > 2σ 2 (υ + µ) then the solution I 1 is well-defined and is positive between 0 and N .Note that g ′ (I (t)) = −σ 2 a 2 I (t) + aβ, then g ′ (0) > 0, that is, g is increasing in 0.Besides g (0) = − (υ + µ) < 0, furthermore, the other solution I 2 is between 0 and I 1 .
As N ≥ E (τ ∧ t) for all t ≥ 0 then ln (N ) ≥ ln (E (τ ∧ t)).On the other hand, the integral guaranteeing for both of those that E (τ ) < +∞ when t tends to +∞.
Case (iii) 0 < N ≤ I 2 .In this case, note that g (N ) ≥ g (x) for all 0 < x < N and similarly to the case (i), for I (0) ∈ (0, d), the next equation is obtained That means that if infected, susceptible and exposed population have stationary distribution, then the mean of each one depend on I (+∞).
1 For pair values of susceptible, exposed, infected and recovered data generated by the function ode we plus 1, 0.5, 0.2 and 0.2, respectively.
2 For odd values of susceptible, exposed, infected and recovered data generated by the function ode we less 1, 0.5, 0.2 and 0.2, respectively.

D Complements of the estimation based on the SEIR model with random perturbations
In the figure 10    values of the sum of squares is minimum are negative (intervals I 2 , I 3 and I 5 ).For β I there are not a unique value for which the sum of squares is minimum for all I j with j = 1, ..., 6.
In the table 5 are given the parameters estimated by using the function optim in R join with the estimated basic reproduction number for each interval.The initial values for applying that function are taken by doing an eye approach.In the figure 11 we graph the distribution of the transmission coefficient for each interval I j with j = 1, ..., 6.

Figure 2 :
Figure 2: Flowchart of SEIR model with random perturbations, the stochasticity is represented by the fluctuating line.

Figure 3 :
Figure 3: Simulations showing extinction for 4 different values of β

Figure 4 :
Figure 4: Infected and recovered COVID-19 data from Bogotá of the first 385 days.

Figure 5 :
Figure 5: Infected data are the data of daily infected cases by COVID-19 in Bogotá during the first 385 days according to Instituto Nacional de Salud Colombiano.Smoothed infected data is the smoothness of the infected data by using the function frfast.Solution by using ode is the solution of the system 1.1 with initial values given by the steps A and B, and parameters β I , υ I and γ I from the table4for each interval I.Estimated infected are the values of Ȋ(t j ) calculated using the equation 4.24 with I(t j ) being the smoothed infected data in t j .Euler infected approximation are the values of Î(t j ) of the system 4.20.Stochastic paths of the infected population are the approximations by the Euler-Maruyama Method (system 4.19) of the system 2.6 with σ estimated by 4.21 and taking S(t j+1 ) and E(t j+1 ) by S(t j+1 ) and Ȇ(t j+1 ), respectively, according to the equation 4.24.

Note B. 1 .
Taking only the stochastic process {I (t)} t≥0 in the SEIR model with random perturbations (that is, not considering the other populations to determine the existence of the stationary distribution), it has that the condition (ii) corresponds to the condition (ii') given by (ii') It exists a 1 , ..., a k and b 1 , ..., b k with a i < b i , where i = 1, ..., k, such as for all x ∈ R⧹U with U

gE
(N ) E (τ ∧ t) < ln I (0) − E (ln (0)) = +∞.(B.37)Additionally, it is clear that E (τ ) given I (0) ∈ (d, e) ∪ (b, c) is equal to 0, according to the definition given by τ (the minimum time to go from I (0) to (d, e) ∪ (b, c) is 0).■ If the stationary distribution is defined for the infected population, so by the equation (3.15) [S (+∞)] + E [E (+∞)] + E [I (+∞)] = η µ N .On the other hand, by the equation (3.15) note that lim t→+∞ are graphed the sum of squares in function of β I and υ I , with γ I = 1 n n−1 j=0 γ j (γ j defined by 4.23) for each interval I.Note that there are intervals where the value of υ I for which the

Figure 9 :
Figure 9: Simulated data and some trends of the model estimation using the purposed estimator of σ.

Table 1 :
Examples of modeling studies of directly transmitted infectious diseases for which seasonal drivers have generated significant longer-term variations in incidence and disease burden

Table 2 :
Differences in mathematical and numerical analyses of model in this present study with other studies in literature t) , E (t) , I (t))For any η and µ, given that 1 > 1/E(t) if only if −1 < −1/E(t) and as max t≥0 E (t) is less than the total population (which is finite) then Case 2: If 1 < E(t) for some t ≥ 0: