The Stochastic Modi�ed Lundqvist-Korf Diffusion Process: Statistical and Computational Aspects and Application to Modeling of the CO2 Emission in Morocco

The main goal of this paper is to study the possibility of using a stochastic non-homogeneous (without exogenous factors) diﬀusion process to model the evolution of CO 2 emissions in Morocco and concretely using a new process, in which the trend function is proportional to the modiﬁed Lundqvist-Korf growth curve. First, the main characteristics of the process are studied, then we establish a computational statistical methodology based on the maximum likelihood estimation method and the trend functions. When we are estimating the parameters of the process, a non-linear equation is obtained and the simulated annealing method is proposed to solve it after bounding the parametric space by a stagewise procedure. Also, to validate this methodology, we include the results obtained from several examples of simulation. Finally, the process and the methodology estab-lished are applied to real data corresponding to the evolution of CO 2 emissions in Morocco.


Introduction
Stochastic diffusion processes (SDPs) play an efficient role in modeling several phenomena in various disciplines. We will quote, for example, physics, biology, economy and finance, radiotherapy, chemotherapy, energy consumption, and others. In the same perspective, many types of stochastic diffusion processes have been defined, such as the stochastic Gompertz diffusion process [9,11], the stochastic Gamma diffusion process [12,25], the stochastic Weibull diffusion process [23], and others.
The Lundqvist-Korf growth curve belongs to the smooth sigmoidal functions arising from tree growth and forest populations. It was developed by Korf (1939). In this respect, Zarnovican [30] introduced a brief analysis of Korf's mathematical formula. It can be applied to forest mensuration, such as the relation between height and age for three black spruce stands. Nafidi and El Azri [24] proposed a new non-homogeneous SDP in which the trend function is proportional to the growth curve of the Lundqvist-Korf. Crescenzo et al. [5] and [6] introduced a new deterministic growth model which captures certain features of both the Gompertz and Korf laws. We also perform a comparison between the Logistic growth process and other models, such as the Gompertz model, the Korf model, and the modified Lundqvist-Korf model.
The evolution of CO 2 emissions is currently one of the most significant subjects in environmental science and climate change. In this context, various SDPs have been applied to describe and forecast this issue. In this perspective, see, for example, Gutiérrez et al. [14] in the study of the SDP with cubic drift with application to a modeling of the global CO 2 emissions in Spain. Then, Gutiérrez et al. [13] in the case of the non-homogeneous (with exogenous factors) stochastic Vasicek diffusion process in the case of CO 2 emission in Morocco. Note that the non-homogeneity in this process is obtained by adding exogenous factors such as external variables that affect the drift of the homogeneous Vasicek model. Moreover, Gutiérrez et al. [10] proposed the bivariate stochastic Gompertz diffusion model as the solution for a system of two Itô's stochastic differential equations (SDEs). The drift and diffusion coefficients are similar to those considered in the univariate stochastic Gompertz diffusion process. This stochastic model is applied to the modeling of the gross domestic product and CO 2 emissions in Spain. Moreover, Abbass et al. [1] presented a systematic review of two decades of research from 1995 to 2017 for CO 2 emissions and economic growth. Magazzino and Cerulli [22] examined the relationship among CO 2 emissions, GDP, and energy in the Middle East and North Africa countries using a responsiveness scores approach. Then, Solaymani [28] treated the CO 2 emissions patterns in seven top carbon emitter economies in the case of the transport sector. Furthermore, several techniques, different from the one we proposed in this work, have been used to study CO 2 emissions in several countries and different geographic regions of the world, we will cite, for example, regression models, see Wang et al. [29], Lin and Xu [21], and Hosseini et al. [16], and temporal series models see Nguyen and Le [26].
The question of statistical inference in SDPs has received considerable attention in recent years, both when the process is observed continuously and when it is observed discretely. The estimation of the parameters in stochastic models, in general, is not direct, except in simple cases and one possible methodological approach is based on approximating the maximum likelihood function. In the same context, various methods addressing this question have been developed, and many papers have been published on this subject, focusing on several variants of approximate likelihood methodology, the general case of this methodology can be consulted in Prakasa-Rao [27], Bibby et al. [4], Ait-Sahalia [2] and Egorov et al. [8], and in the case of particular diffusions, the following can be seen, for example, Gutiérrez et al. [9,15] and others. However, the approach of maximum likelihood estimation of the parameters using likelihood equations can be problematic, which is why we propose the use of the simulated annealing (SA) method. This algorithm is a method for solving unconstrained and bound-constrained optimization problems developed by Kirkpatrick et al. [18]. Duflo [7] used for optimization in continuous spaces the method models the physical process of heating a material and then slowly lowering the temperature to decrease defects, thus minimizing the system energy. Recently, many works have used the SA algorithm for estimating the parameters in the stochastic diffusion process (see, for instance, Nafidi and El Azri [24] and Nafidi et al. [23]).
The aim objective of this article is to introduce a new non-homogeneous stochastic modified Lundqvist-Korf diffusion process (SMLKDP), in our work, the nonhomogeneity in the process is of nature, that is to say, that in the definition of the process its drift is dependent on time, then in the future, it is possible to introduce exogenous factors in this process to obtain a version of double non-homogeneity which presents a trend function that is proportional to the modified Lundqvist-Korf growth curve in Eq. (2) and we can apply this process to fit and forecast real data. The rest of this paper is organized as follows: In section 2, we presented an overview of the Lundqvist-Korf growth curve and we define the proposed model in terms of the SDE. We then determine the explicit expression of the solution to the SDE, the transition probability density function (TPDF) and the trend functions. In section 3, we discuss the parameter estimation using the maximum likelihood method (MLM) on the basis of discrete sampling of the process. Since the closed form of the MLM estimators cannot be given because the system of likelihood equations does not have an explicit form, numerical methods are needed, we deal with this problem through the application of SA method: First, a brief summary of the algorithm is provided, and then their adaptation to the problem at hand is presented. Some strategies are suggested for bounding the space of solutions, and a description is provided for the application of the algorithms selected. In Section 4, we present the results obtained from the examples of simulation, and we illustrate the predictive study by fitting the SDP to simulated data. In Section 5, we present an application to real data by considering the evolution of global CO 2 emissions in Morocco. This subject is the primary driver of global climate change. It's widely recognized that to avoid the worst impacts of climate change. Finally, in the last section, we recapitulate the main conclusions from this study.

The model and its characteristics
This section provides an overview of the modified growth of the Lundqvist-Korf curve and defines the proposed process in terms of SDE. We then determine the explicit expression of the solution of the SDE, the TPDF, and the trend functions.

The modified Lundqvist-Korf growth curve
The most commonly used expression of the Lundqvist-Korf curve is: where K is the upper bound for the studied variable, that can only be reached after infinity time and is the scale parameter. If in the Lundqvist-Korf curve Eq.
(1), we replace t by 1 + t, then we obtain the modified Lundqvist-Korf growth curve: We impose that x(t 1 ) = x 1 > 0, hence K = x 1 exp β (1 + t 1 ) −α . Thus, we reach: This curve is a sigmoidal strictly increasing curve showing an inflection point at t I = (αβ/ (α + 1)) 1/α − 1, its value is In addition, t I > t 1 (that is, the inflection can be visualized) if and only if (2) when t tend to ∞. Its asymptote is dependent on the initial value.

The Stochastic modified Lundqvist-Korf diffusion process
To model the modified Lundqvist-Korf type behaviors from a stochastic point of view, our contribution is to consider a SDP whose trend function has the expression given in the curve in Eq. (2). Now, starting from the curve in Eq. (2), one is lead to considering the ordinary differential equation: where h(t) = αβ (1 + t) −(α+1) and the parameters α, β and σ are all positive. Hence Eq. (3) can be viewed as a generalisation of the Multhusian growth model with time dependent fertility rate h(t). Note that h(t) is a decreasing continuous positive and bounded function and has a horizontal asymptote at y = 0. The form of the proposed one-dimensional SMLKDP is defined as a diffusion process {x(t); t ∈ [t 1 , T ], t 1 > 0}, taking values on (0, ∞) and characterized by these infinitesimal moments (drift and diffusion coefficient): with initial distribution x(t 1 ) = x 1 . Alternatively, the above process can be defined as the unique solution of the following Itô's SDE: where w(t) is a standard Wiener process and x 1 is a positive random variable, independent on w(t) for t ≥ t 1 .
Taking into account that A 1 (t, x) and A 2 (t, x) specified in Eq. (4) satisfy the Lipschitz and the growth conditions for the existence and uniqueness of the solution to the SDEs (see, Kloeden et al. [19]). Thus, there exists a non negative constant C = αβ (1 + t 1 ) −(α+1) + σ, such that for all x and y of R + and t of [t 1 , T ], we have: Then, the SDE Eq. (5) has a unique solution {x(t) : t ∈ [t 1 , T ], t 1 > 0} continuous with probability 1, and satisfies the initial condition x(t 1 ) = x 1 .

Probability distribution of the process
By means of the appropriate transformation of the form y(t) = log(x(t)), and by using the Itô rule, the SDE Eq. (5) becomes: by integrating both sides yields, Finally, we have: ) .
(6) The process y(t) is a gaussian process if and only if log(x 1 ) is constant or normally distributed (see, for instance, Arnold [3]). In such a case, the process x(t) is a Lognormal process. That is, the TPDF: where µ(s, t, y) = log (y)

The moments of the process
From the properties of the Lognormal distribution, the r-th conditional moment of the process is: Then, for r = 1, the conditional trend function of the process is: In addition, taking into the initial condition P (x(t 1 ) = x 1 ) = 1, the trend function of the process is given by:

Statistical inference of the model
This section examines the MLM estimators of the parameters α, β and σ 2 of the model using discrete sampling. Then by Zehna's theorem [31], we obtain the corresponding estimated trend functions of the process.

Parameters estimation
Let us consider a discrete sampling of the process, based on d sample paths, for fixed times t ij , (i = 1, ···, d, j = 1, ···, n i ) with t i1 = t 1 , i = 1, ···, d. That is, we observe the variables x(t ij ) whose values provide the basic sample for the inference process. In addition, we assume t ij − t i,j−1 = h, for i = 1, ..., d; j = 2, ..., n i . Let x = {x ij } i=1,···,d;j=1,···,n i , be the observed values of the sampling. The likelihood function depends on the choice of the initial distribution.
Henceforth, we will consider the case when the initial distribution is lognormal. Denoting n = d i=1 n i , from Eq. (7), the log-likelihood function of the sample is: By deriving the log-likelihood function in Eq. (10) with respect to the parameters µ 1 and σ 2 1 , we obtain the ML estimators of µ 1 and σ 2 1 arê The estimation of the parameters α, β and σ 2 poses some difficulties. The resulting system of equations is exceedingly complex and does not have an explicit solution. Therefore numerical procedures must be employed. To address this problem, we propose the use of the simulated annealing algorithm to maximize the likelihood function or, equivalently, its logarithm.

Estimated trend functions
By using Zehna's theorem [31], the estimated trend functions of the process is obtained by replacing the parameters in Eq. (8) and Eq. (9) by their estimators, then the estimated conditional trend function (ECTF) is given by the following expression:Ê Taking into account the initial condition that is P (x(t 1 ) = x 1 ) = 1, the estimated trend function (ETF) of the process is given by:

Confidence bounds
The confidence bounds of the process are obtained by using the procedure described in Katsamaki and Skiadas [17]. Let v(s, t) = x(t)|x(s) = x s . Since the variable w(t) − w(s) is Gaussian with the mean equal to zero and the variance t − s for t ≥ s. Therefore, the random variable z is given by A 100(1 − κ)% conditional confidence bound for z is given by P (−ξ ≤ z ≤ ξ) = 1 − κ. From this, we can obtain a confidence bound of v(s, t) and v upper (s, t) = exp µ(s, t, x s ) + ξσ (t − s) , with ξ = F −1 N(0,1) (1 − κ/2) and where F −1 N(0,1) is the inverse cumulative normal standard distribution.
By substituting the parameters by theirs estimators in the equations Eq. (13) and Eq. (14), we obtain the estimated lower boundv lower (t) and the estimated upper boundv upper (t): andv

The algorithm
SA algorithm is a metaheuristic algorithm to approximating the solution of optimization problems of the type min θ∈Θ g(θ), where g is the objective function to be optimized and Θ is the solution space. It was developed by Kirkpatrick et al. [18]. Under this algorithm, in each iteration, let x be a current solution, x ′ be a new value selected in a neighborhood of x in the next iteration, and the objective difference is δ = g(x ′ ) − g(x). If δ ≤ 0, then x ′ is selected as the new solution. Otherwise, it could be accepted with probability p = exp (−δ/T ), where T is a scale factor called temperature.
The general application of the SA algorithm depends on the definition of the several parameters 1. Initializing the parameters of the algorithm such as the initial solution θ 0 , the initial temperature T 0 , the final temperature T F , the chain length for each application of the Metropolis algorithm L and the cooling procedure, and the stopping condition.
2. Apply the selection procedure for a new solution L times.
3. Verifying the stopping condition. If it is not verified, decreasing the temperature and returning to the previous step.
In our case, the problem becomes maximizing the function ln L x µ 1 , σ 2 1 , α, β, σ 2 . Since the algorithm aforementioned is usually formulated for minimization problems, then, from Eq. (10), the objective function is a function of the parameters α, β and σ 2 and has the following form: and the parametric space Θ linked to the objective function Eq. (17), on which the selected algorithms must operate, is continuous and unbounded. Consequently The drawback is that the solution space might not be explored with enough depth. This requires us to find arguments for bounding said space. The following are some strategies to this end.

Bounding the solution space
Regarding the parameter σ, when it has high values it leads to sample paths with great variability around the mean of the process. Thus, excessive variability in available paths would make modified Lundqvist-Korf-type modeling inadvisable (see Fig. 1). Some simulations performed for several values of σ have led us to consider that 0 < σ < 0.1, so that we may have paths compatible with the modified Lundqvist-Korf-type growth. On the other hand, there does not seem to be an upper bound for α and β. To this end, we will use the following reformulation: Setting b = exp(−β) and a = 1/α. This reformulation leads to the condition 0 < b < 1 and does not seem to be an upper bound for a. Nevertheless, the modified Lundqvist-Korf curve is sigmoidal and has an inflection point at t I = (− ln (b) / (a + 1)) a − 1, which is higher than t 1 if and only if The parameter a can be bounded taking into account the information provided by the sample paths and the asymptote of the curve verifying and the inflection point of the curve at From these expressions, and if we denote by k i the maximum value of the i-th sample path, the following expression holds ) and x 1 is the initial value of the sample path. Thus, the solution space, which is obtained numerically for a, b and σ is bounded and takes the form Once the solution space has been bounded, we specify the choice of the initial parameters of the algorithms and the stopping conditions. Let consider: 1. The initial solution is chosen randomly in the bounded subspace 2. The initial temperature should be high enough such that in the first iteration of the algorithm, the probability of accepting a worse solution is at least of 80% (see, Kirkpatrick et al. [18]). For this, we assume the initial temperature of 10.
3. For the cooling process we have considered a geometric scheme in which the current temperature is multiplied by a constant γ (0 < γ < 1), i.e. 5. The stopping criterion defines when the system has reached a desired energy level (freezing temperature). Equivalently it defines the total number of solutions generated, or when an acceptance ratio (ratio between the number of solutions accepted and the number of solutions generated) is reached. The application of the algorithm will be limited to 1000 iterations.
The coding is performed using Matlab R2018a computer software and is run on a Core(TM) i3-2348M CPU 2.30 GHz processor with 3.86 Go of RAM.

Simulation study
This section presents some simulated sample paths of the SMLKDP and demonstrates the performance of the proposed procedure using simulation examples, and validates the predictive study by fitting this process to simulated data.

Simulated sample paths of the process
This section presents some simulated sample paths of the SMLKDP. The trajectory of the model is obtained by simulating the exact solution of the SDE (6). We obtain the simulated trajectories of the model by considering the time discretization of the interval [t 1 , T ], with time points t i = t i−1 + (i − 1)h, i = 2, .., N and h = (T − t 1 ) /N is the descritization step size for the sample size N . The random variable σ (w(t) − w(t 1 )) in the Eq. (6) is distributed as one-dimensional normal distribution N 1 0, σ 2 (t − t 1 ) . This simulation is based on 25 sample paths with t 1 = 0, T = 40, x 1 ∼ Λ 1 (1, 0. 16), and N = 250. Fig. 1 shows some simulated sample paths of the SMLKDP and its trend function for several values of α, β and σ.

Estimation of drift parameters and the diffusion coefficient
This section presents several examples to evaluate the performance of the estimation procedure previously developed. To this end, Eq. (6) was simulated 25 times under the following assumptions t 1 = 0, h = 1, x 1 ∼ Λ 1 (1, 0.16) and the samples of sizes N = 100, 250 and 500 are used to investigate the effects of the sample size on the performances of the estimation procedure. In order to make the subsequent inference we have considered, in each example the 25 sample paths with t i = t i−1 + (i − 1)h, i = 2, ..., N .
The empirical mean, the standard deviation (std), and the coefficient of variation (CV) for a, b and σ (functions of parameters α, β and σ which are the arguments of the objective function) are defined in Table 1. Then, Table 2 shows the results obtained for calculating the latter measures. The results obtained show the performance of the methodology. Table 1 The empirical mean, the standard deviation and the coefficient of variation for a, b and σ (M is the sample paths). Table 2 Estimation values, the standard deviation (std) and the coefficient of variation (CV) of a, b and σ for several values of σ (a = 2, b = 0.02).

Predicted data using ETF and ECTF,
In this section, we have considered the predictive study based on fitting the diffusion process to simulated data in which N = 25, t i = t i−1 +(i−1)h, i = 2, ..., N starting at t 1 = 1, taking the step size h = 0.05 and x 1 = 2.6574. First, we use the first 22 data to estimate the parameters a, b and σ 2 of the process by SA. Moreover, we obtain the corresponding ETF and ECTF values given by the expressions (12) and (11). For the three last data, we predict the corresponding values using the ETF and ECTF. Also, we give the results attached to a 95% estimated confidence bound (ECB) and a 95% estimated conditional confidence bound (ECCB) of the process (see, the expressions (15) and (16)). Finally, to illustrate the performance of procedure, the results according to the one-step-ahead mean absolute error (MAE), the root mean square error (RMSE) and the mean absolute percentage error (MAPE), given by Table 3. According to Lewis [20], we deduce that the accuracy of the forecast can be judged from the MAPE result in Table 6. Table 7 shows the results of the simulated values with the ETF, ECTF, ECB and ECCB. Table 4 shows the values obtained from the estimation of the parameters of the SMLKDP. Table 5 shows the values obtained after the calculation of the three measures of the goodness of fit. Then the accuracy of the forecast can be judged from the MAPE, in other words if the value of the MAPE is less than 10%, the forecast is highly accurate. Fig. 2 and Fig. 3 illustrate the performance of the SMLKDP for forecasting using the trend function and the conditional trend function. Table 3 The MAE, the RMSE, and the MAPE.

Application to real data
This section presents an application of the proposed process and the computational statistical methodology described above to fits and forecast the global CO 2 emissions in Morocco using the ETF and ECTF. Let us the x(t), to be CO 2 total emissions from the Consumption of Energy and t i = t i−1 + (i − 1)h; for i = 2, ..., N starting at t 1 = 1, taking the step size h = 0.01, N = 32 and x 1 = 19.2.
The real data for these variables are annual values and correspond to the period 1987-2018; CO 2 emissions is expressed in Million Metric Tons of Carbon Dioxide. These data are published by Atlas Mondial de Données (Maroc-Environnement) at https://knoema.fr/atlas/Maroc/%C3%89mission-de-C02-kt.
The series of observations considered from 1970 to 2018 are used to estimate the parameters of the model by SA. The values of the estimators of the parameters areâ = 0.258755,b = 2.214039 × 10 −19 andσ = 0.099976. For the years 2019 and 2020, we predict the corresponding values using the ETF and ECTF. Also, we give the results attached to a 95% ECB and a 95% ECCB of the process. Finally, to illustrate the performance of procedure, the results according to the one-stepahead MAE, the RMSE and the MAPE. Then, the values of the MAE and RMSE are respectively 0.95229 and 1.28012. The MAPE is 2.48211%, so we conclude the the forecast is highly accurate. Table 8 shows the results of the observed values, the ETF, ECTF, ECB and ECCB. Fig. 4 and Fig. 5 illustrate the fits and forecast of the SMLKDP using the ETF and ECTF.

Conclusions
In this study, we defined a new non-homogeneous SDP related to the modified Lundqvist-Korf growth curve. Then, its distribution and main characteristics were analyzed including its trend function and its conditional trend function, which were found to be proportional to the modified Lundqvist-Korf growth curve. This process is advantageous over the deterministic modified Lundqvist-Korf growth curve. We have also applied the simulated annealing algorithm in order to solve inference problems.
Furthermore, an application to simulated data of the proposed model showed its usefulness in practice and demonstrated that the strategy used for bounding the parametric space behaves well. Finally, the process was applied to study the total emission of CO 2 in Morocco. By fitting the SMLKDP to the real data from the period 1987 to 2018, a good description of the series and good short-medium term forecasts (2019-2020) is obtained. The description and forecast using the conditioned trend function are considerably better than those based on the trend function alone, although they are only optima in the short term. Figure 1 Simulated trajectories of the SMLKDP and its trend function for several values of σ with (α = 0.5; β = 4).     Fits and forecast using the ETF and ECB.

Figure 5
Fits and forecast using the ECTF.