Spatio-temporal prediction of the population size at Tehran urban areas using BirnbaumSaunders Markov random elds

For implementing an intelligent urban management, it is necessary and inevitable to know the population size in different areas of Tehran city and identify how they change over time. Population and housing census data can determine the spatio-temporal distribution of the people and how it evolves. However, census will not be implemented across the country in 2021, like a previous manner, due to issues related to the coronavirus pandemic and its prevalence. From this perspective, the prediction of the population size is significant in different regions of Tehran for the year 2021. This paper focuses on developing a statistical learning model for predicting the population growth rate of Tehran urban areas. To be more specific, since the growth rates data are spatially correlated, skewed to the right, and positive, a Birnbaum- Saunders Markov random field (BSMRF) is proposed. We adopt a Bayesian framework for the parameter estimation and spatial prediction and use Markov chain Monte Carlo methods to sample from the posterior distribution. Then, the results are compared with the traditional models.


Introduction
One of the most critical indicators of Tehran's relations is the spatial distribution of the population and how it has evolved; this is also known as population foresight or future foresight. Tehran has become the 24th most populous city in the world. A city currently faces many biological problems and faces various crises such as pollution, climate, traffic, etc. This city involves 22 urban areas. The placement of four new cities on the edge of Tehran has created the need to provide more than 15 million people. Table 1 shows the population of Tehran and the annual population growth in 1986-2016 based on censuses. According to Figure 1, Tehran's population growth had declined sharply from 2.9% in 1986, and this amount has reached its minimum in recent years. Tehran is one of the most populous metropolitans in Tehran. One of the reasons for this decrease in population in Tehran could be the high cost of living in this city, especially the high cost of housing. The excess population of this city migrated to the suburbs. In 1956, about 95% of the urban population of Tehran province lived in Tehran. While in 2016, the city of Tehran accommodates about 70% of the province's urban population. The 1956 census is the first general census of the country's population, in which all people in the country were counted. The year 1956 can be considered the beginning of systematic and scientific population censuses in the country. In 2006, according to the Tehran city council, the gap between the implementation of general censuses of population and housing decreased from ten years to five years. Therefore, according to the last census conducted in 2016, 2021 is the available census of population and housing. Unfortunately, due to coronavirus issues, there will be no census in 2021. Coronavirus pandemic has affected Iran extensively.  This main motivation of this study is to provide spatio-temporal prediction of the population size at Tehran urban areas for the year 2021. According to some reports and empirical researches, the population growth rate between 2016 and 2021 can be considered relatively equal to that between 2011 and 2016. The main contribution of this paper is to develop a new Markov random field based on the Birnbaum-Saunders distribution for the population growth rate between 2011 and 2016. The motivating feature of this modeling framework is that it incorporates positive, skewed to the right and spatially correlated data, as needed in many empirical circumstances. Hence, the methodology presented here makes new research for modeling skewed spatial data.
A spatial process observed over a network is usually modeled using a traditional Gaussian Markov random field (GMRF) where the spatial dependence between data is captured based on a precision matrix corresponding a labelled graph. A nice feature of the GMRF's is conditional independence property. This feature brings sparsity of the precision matrix enabling the network to benefit from computation. Accordingly, this technique has an important role in machine learning. The required background and application in machine learning contexts, are given, for example, in Gelfand et al. (2010), Banerjee et al. (2014), Siden and Lindsten (2020), Perdikaris et al. (2015). However, Gaussianity assumption might be overly restrictive to represent the data. The real data could be highly non-Gaussian and may show features like skewness. In the case of point reference data, various approaches have been provided in the literature to handle this problem, including transformed Gaussian random fields (De Oliveira, et al. 1997), skew-Gaussian processes (Zhang and El-Shaaravi, 2010;Zareifard and Jafari Khaledi, 2013;Rimstad andOmre, 2014, Zareifard et al., 2018), Tukey g-and-h process (Xu and Genton, 2017;Yan and Genton, 2017), Birnbaum-Saunders (BS) model (Garcia-Papani et al. 2017, copula models (Bárdossy and Li, 2008; Kazianka and Pilz, 2009;Pilz et al., 2008;Kazianka and Pilz, 2010;Krupskii and Genton, 2017;Prates et al., 2015;Krupskii et al., 2018) and convolution of log-Gaussian and Gaussian processes (Zareifard et al., 2018(Zareifard et al., , 2019. However, there is little work to be done in skewness modeling for spatial areal data. To reduce unrealistic assumptions for conditional autoregressive models, Prates et al. (2014) considered a non-Gaussian distribution for the random effects in spatial generalized linear mixed models where incorporates skewness as well as heavy tail behavior of the data. However, although the covariance matrix of their random effects depends on its neighbors, the Markov property does not supported. Also, for the class of autoregressive and moving average (ARMA) models, Pourahmadi (2007) studied the construction of stationary skew normal ARMA models for colored skew normal noise.
The aim of this paper is to introduce a novel non-Gaussian Markov random field using the Birnbaum-Saunders distribution Saunders, 1969a and1969b). After investigating the main probabilistic features involving the conditional independence of the introduced random field, the focus of the paper is on the Bayesian approach to carry out statistical inference. We also develop and implement a Markov chain Monte Carlo (MCMC) sampling strategy for posterior computations, which is found to work quite well. There are strong connections between our approach and the method in Garcia-Papani et al. (2017 and. In particular, these works introduce a Birnbaum-Saunders spatial regression model, while we develop a new Markov random field for the areal data. However, their framework can only be applied for point-referenced data settings.
The rest of the paper is organized as follows. In the following section, a brief description and summary of the data, along with the motivation of considering a skewed spatial model, is provided. Section 3 introduces the new class of skewed spatial models and also discusses the Bayesian analysis of the model. Application to the data from Tehran is presented in Section 4. Finally, in Section 5 we present some conclusions and final remarks.

Population growth rate data
The data used for this study is collected from Statistical Centre of Iran and provided from Center for Urban Statistics and Observatory. Our data include the population growth rate between 2011 and 2016 at 22 Tehran urban regions. From the population division below area i in the last census to the population of the same area in the previous census, the dependent variable is made. The last census is the 2016 census, and the pre-last census is the 2011 census. The evidence of spatial correlation is confirmed through Figure 2 where we show the population growth rate spread in all the urban regions. The map depicts that the regions with high rate are closer together and same can be said about regions with low rate. This gives us strong evidence for spatial correlation in the population growth rate data. Given this, the use of classical models for data analysis does not seem justified. According to the figure, the region 22 has the highest, and the region 13 has the lowest population growth rate between 2011 and 2016. For further research, the Moran's I test is used to examine spatial correlation. The value of this statistic was 0.46, which indicates a high level of spatial correlation between the data.  Therefore, the data can be modeled using a Gaussian-Markov random field model. However, the histogram of data ( Figure 3) suggests that the data have a right-skewed distribution and hence motivates us for creating an appropriate skewed spatial model. With regard to use of covariate information can significantly increase prediction accuracy, after studying different variables and studying their correlation with the response variable, four covariates including death count, number of families, number of elderly, and children in different regions were selected and considered in the model in a regression framework.
Then, each / 0 marginally has a Birnbaum-Saunders distribution BS(1, 2) with the probability density function The shape parameter 2 controls the skewness coefficient of the distribution. We call . a Birnbaum-Sanders Markov random field and denoted by BSMRF(2, Σ ). It is completely specified by the shape parameter 2 and a correlation matrix Σ as follows: where / ( 0]) is . without the ith and jth observations. The sparseness of the precision matrix Q completely specifies the graph of the BSMRF and, hence, its conditional dependence structure. For convenience, we denote a BSMRF with the shape parameter 2 and precision matrix Q in the original scale by BSMRF(2, Q). Note that matrix Q is not the precision matrix of . and that the BSMRF is not affected by the scales of the original GMRF.

Bayesian analysis
In the following, the Bayesian inference of the proposed model will be offered. We complete the hierarchy of the model by specifying the prior distribution for the vector of the model parameters. The components of the parameter vector were initially assumed to be independent a priori. Then, proper prior distributions were assigned to the parameters. Accordingly, the joint posterior density of " and | is: (7) In our study, the prior distributions of ' and s 6 were set to be conjugate normal and inverse gamma distributions, respectively. The hyperparameters are chosen in the way that the corresponding prior distributions to be vague. The prior distribution of the shape parameter 2 was set to be Γ (κ1, κ2), a gamma distribution with shape κ1 and scale κ2. The hyperparameters were set to be κ1 = 0.01 and κ2 = 100. These priors were chosen to be proper but vague to allow the posterior estimates to be mainly data driven. A uniform prior over (0, 1) was put on the dependence parameter to ensure positive spatial dependence as intuitively expected (Banerjee et al.,2014). A general Gibbs sampling algorithm with Metropolis-Hasting update can be devised to draw from π(", ||"). Since the quantile function of the log-Birnbaum-Saunders distribution can be written in BUGS, an implementation of the proposed model would be very easy.

Analysis of experimental data
We fitted three models to the population growth data: the GMRF model, the log-GMRF model, and BSMRF model. An intercept model with the available covariates were fit for each model where any two regions within 10 km were considered to be neighbors. For each model, two chains with 10,000 iterations each were generated. We discarded the first 5,000 iterations as burn-in and thinned the rest by 5, resulting in 1,000 posterior samples. Convergence was verified using Gelman and Rubin (1992) criteria. To compare different models for the same data, we propose to use the conditional predictive ordinate (CPO) criterion (e.g., Gelfand et al.,1992;Dey et al.,1997). The summary statistic is the logarithm of the pseudo-marginal likelihood (LPML), which is the summation of the log density of leave-one-out marginal predictive posterior distribution. Notice that a higher value of the LPML indicates a better fit of the model.  Table 2-Posterior point estimates, standard deviations (SD) and 95% confidence intervals of the parameters and the spatial latent variables in the BSMRF regression model. The regression coefficients are in the order of intercept, the rate of deaths, the rate of elderly, the rate of households, and the rate of children below 15. intervals of the parameters and the spatial latent process from the BSMRF model are summarized in Table 2. Neither covariates was found to have a significant effect on the population growth rate. Table 3 shows the predicted values of the population in 22 urban regions. Figure 4 also displays the prediction map of the population size in 2021.

Discussion and Conclusion
In spatial random effects models, the random effects are priori assumed to have a normal distribution, which may not be a valid assumption. Although various approaches have been provided in the literature to handle this problem, we provided a new approach to deal with skewed spatial areal data. A Gibbs algorithm was developed for the Bayesian inference. The applied example illustrated the methodology and its utility in practice. According to the findings, the proposed model has an appropriate ability to provide a more precise prediction than the GMRF model. The LMPL shows that the predictive performance of the BSMRF model compares favorably with GMRF and Log-GMRF models.
It is also worth noting that the model considered in this research has the potential to be used in other modelling frameworks such as the generalized linear spatial models. To save time, our proposed approach can also be extended to provide fully model-based inference large datasets. These issues will be studied in future works.