Urban Growth Modeling Considering Educational Institutions In Qazvin, Iran: An Investigation Into The Performance of Three ANFIS Methods

Due to increasing urbanization, the rapid expansion of urban spaces has become a major environmental concern over the last few decades. Therefore, modeling the urban expansion as a complex system has been scrutinized in recent years; however, determining the rules that lead to the expansion of urban areas has always been a challenging factor in this field, especially for disaggregated models like cellular automata (CA). To overcome this issue, in this research, an Adaptive Network-based Fuzzy Inference System (ANFIS) is proposed to enhance the simulation of urban growth through the automatic production of transition rules. The ANFIS can be associated with several inputs division methods, such as ANFIS accompanied by grid partitioning (ANFIS-GP), subtractive clustering (ANFIS-SC), and fuzzy c-means clustering (ANFIS-FCM). Hence, twenty-two ANFIS models based on Landsat images for the time interval from 2000 to 2010 and using different division methods were trained to investigate their effect on the efficiency of ANFIS in urban growth modeling. To examine the efficiency, the Cellular Automata-based Markov Chain (CA-MC) as a popular method was developed, and the simulation accuracy of CA-MC and the most accurate ANFIS models were obtained through comparison with observed data. The most accurate ANFIS-SC model had a Kappa of 0.76 and an overall accuracy of 93.41% for the 2019 simulated map. The results from this study reveal that the ANFIS model is effective at simulating urban expansion and the ANFIS-SC is superior to CA-MC, ANFIS-GP, and ANFIS-FCM models in urban expansion modeling.


Introduction
The proportion of the world's population that resides in urban areas has grown from approximately 5% to over 50% in recent centuries (Ezeh et al. 2017), and it is expected that this proportion will rise to 70%, or even more, by 2050 (Broere 2016;Hegazy et al. 2017;Kabisch and Haase 2011).This issue causes a more serious problem to emerge, that is, the accelerating rate of urban expansion in most parts of the world, insofar as it is predicted that the world's urban areas will have tripled between 2000 and 2030 (Khamis et al. 2018;McDonnell and Hahs 2015;Southon et al. 2017).The rapid urban expansion has led to a variety of complex problems, such as increasing informal settlements, air pollution, deforestation, soil erosion, and climate change (Kamusoko and Gamba 2015;Wang et al. 2020;Xu et al. 2019).
Additionally, the rapid growth of the cities has strained their capacity to provide services, such as education, health care, and transportation (Ewing et al. 2002).Moreover, the level of urbanization in Iran increased from 33.7% in 1960to 74.9% in 2018(World Bank 2018), and this high rate of urbanization has led to the fast expansion of built-up areas in recent years (Bihamta et al. 2015;Taravat et al. 2017).Such issues can be prevented from occurring by modeling and studying the future urban expansion (Xu et al. 2019).Models provide a better understanding of the mechanisms of urban expansion and enable urban planners and policymakers to manage the continued growth of cities (Li et al. 2017;Samardžić-Petrović et al. 2017).
Over the last three decades, many models have been developed to simulate urban growth processes and land use/land cover (LULC) changes (Karimi et al. 2019;van Vliet et al. 2016).
Three typical classes of urban growth models have been introduced so far: the land use transport interaction (LUTI), the cellular automata (CA), and the agent-based models (ABMs) (Li and Gong 2016).Among these models, the CA is a popular (Liu et al. 2017;Musa et al. 2017) and strong technique for simulating the pattern of built-up areas (Aburas et al. 2016;Mustafa et al. 2018).Nevertheless, determining the transition rules, which leads to the calibration of the CA model, has always been a key challenge in this model because of the complex relationship between urban expansion and its multiple driving factors (Berberoğlu et al. 2016;Blecic et al. 2015;Li et al. 2017).To handle this challenge, Artificial Neural Networks (ANN) have been extensively utilized in urban growth/LULC change simulation (Lu et al. 2019) due to automatic determination of transition rules and less computation time (Mondal et al. 2020).Li and Yeh (2001), Li and Yeh (2002), and Li et al. (2013) automatically determined the transition rules of CA using neural network.Furthermore, Xu et al. (2019) integrated ANN with the Cellular Automata-based Markov Chain (CA-MC) model.In these studies, due to associating the ANN with CA, the capability of the CA model has improved in handling nonlinear transition rules (Yang et al. 2008).Nevertheless, neural networks have some disadvantages, such as difficult calibration, the tendency to overfitting (Kamusoko and Gamba 2015), and incompatibility with the uncertainty of transition rules (Azari et al. 2016).Azari et al. (2016) utilized a fuzzy set theory to tackle the uncertainty of transition rules.However, fuzzy models require a justifiable approach for calibrating (finding locations of the Membership Functions (MFs) for the model inputs) their parameters (Al-Ahmadi et al. 2009).
Several works have demonstrated that the weaknesses of fuzzy and ANN can be overcome by using an Adaptive Network-based Fuzzy Inference System (ANFIS).An ANFIS is a hybrid intelligent system that carries the benefits and learning potential of neural networks and fuzzy logic (Parvinnezhad et al. 2019) and eliminates many limitations of these two algorithms (Ahmadlou et al. 2019).Mohammady et al. (2013)  Correct Match (PCM) for model evaluation, which obtained 64% and 94%, respectively.Mohammady and Delavar (2016) and Mohammady (2016) have demonstrated that the ANFIS model has more accurate results in the simulation of urban growth compared with the ANN and Logistic Regression (LR).
The ANFIS model employs and trains a Fuzzy Inference System (FIS) with distinct division methods, such as Grid Partitioning (GP), Subtractive Clustering (SC), and Fuzzy C-means (FCM) to divide the input data space.It has been reported in some scientific studies that clustering type can play an important role in ANFIS model performance, such as Adedeji et al. (2020) and Moradi et al. (2019), whereas there is a paucity of high-quality research investigating this subject in urban development modeling.It is explicit that calibrating various kinds of ANFIS with urban parameters, comparing the accuracy of trained models, and finally, employing the most well-trained ANFIS model, lead to more precise transition rules and, as a result, more accurate forecasting of LULC changes.This study, therefore, aims to implement ANFIS accompanied by GP (ANFIS-GP), SC (ANFIS-SC), and FCM (ANFIS-FCM) models to assess the effect of clustering methods and their parameters on the accuracy of ANFIS in urban growth modeling.Additionally, it aims to simulate the future built-up areas and warn about the possible harmful effects of spontaneous urban growth on the environment.To fulfill these aims, we executed multiple scenarios for the ANFIS training to achieve the most accurately calibrated model.Additionally, we developed the CA-MC (Rimal et al., 2018;Rimal et al., 2017;R. Wang et al., 2018), a popular (Wang et al., 2020) existing model that integrates the CA and Markov chain analysis.Afterward, we compared the simulation results of CA-MC with the most acceptable ANFIS model for the 2019 simulated map.We finally simulated the upcoming urban expansion of Qazvin city for the year 2029, utilizing the ANFIS model, for the use of urban planners and policymakers.

Urban expansion
The rate of urbanization has significantly increased over the last five hundred years and by 30% in 2010 (Gross and Ouyang 2021).The increase in the level of urbanization, especially in less developed countries (Angel et al. 2016), has been recognized as one of the main reasons of the rapid growth of urban areas.As we mentioned in the introduction, the rapid urban expansion with no efficient planning has posed some serious risks to the environment and residents.Therefore, recognizing the nature and types of urban expansion can be useful to handle its negative consequences and achieve sustainable urban development.Generally, there are three spatial patterns of urban expansion, namely the edge-expansion, infilling, and outlying (Xian et al. 2019;Zhang et al. 2020).The urban edge-expansion denotes the development along or approximately in parallel to the board strip, the infilling has been defined as the development of vacant parcels located inside of built-up areas and eventually the outlying refers to development of areas placed outside of existing urban areas (Jiao et al. 2018).There is another distinct expansion pattern called sprawl, which in previous studies is commonly known as scattered development (Artmann et al. 2019).As a complex pattern of urban development, sprawl generates large and low-density suburban areas associating with some hazards to human health, such as air pollution (Frumkin 2002).
Regardless of the type of urban expansion, when a city is developing rapidly, the requirement of careful urban planning and the design of development policies to provide facilities for citizens become essential.Precise urban planning can greatly reduce the disadvantages of some of the mentioned urban expansions.Therefore, this study does not seek to distinguish diverse types of urban development; however, it aims to determine any urban expansion that has occurred through LULC.

Study area
Qazvin Province, with a top urbanization rate of 73.1%, is located in the northern half of Iran, as one of the 31 provinces of the country.Qazvin city, as the central city of this province, lies at an altitude of 1278 m (4193 ft.) above sea level.The study area of this research includes the Qazvin city and its suburbs, which locate between 36.2266° and 36.3604° N, 49.9614° and 50.0555° E (128.63 km2).Fig. 1 represents the Qazvin city and its position in the county and country.
The most significant reason for choosing this city as our study area was that the city is suffering from fast growth in both population and urban areas.According to the census of the Statistical Center of Iran, the population of Qazvin city was 402,748 in 2016, whereas in 2006 this number had been 349,821 (Statistical Center of Iran 2019).This increase in population has led to the rapid outward city growth.The low population-density zones and some formal and informal settlements have appeared surrounding the city in the last decades.The emergence of these areas causes some serious damage to the environment and natural resources.Consequently, an investigation of the rapid growth of this city appears to be vital for urban planners.

Layers and architecture of ANFIS
ANFIS was developed by Jang (1993) as an integrated hybrid neuro-fuzzy model that employs the Takagi-Sugeno fuzzy inference system (Abraham 2005).To put it more simply, ANFIS carries the advantages of both ANN and fuzzy reasoning at the same time (Zhou et al. 2017) by training a FIS model.As it is displayed in Fig. 2, this model is composed of five layers.
The parameters of the first and fourth layers are recognized as premise and consequence, respectively.Unlike the parameters of the other layers, they are adjusted throughout the training process.The first layer has come to be known as fuzzification layer.Nodes in this layer take the numeric inputs and express the membership grade of the inputs in a fuzzy set as the output by fuzzy Linguistic terms.For example, for the input x and fuzzy set Ai, the output of the first layer is defined as follows.
Where,    () represents the membership grade of the input x in fuzzy set Ai.The second layer is known as the rule layer.The general structure of a fuzzy If-Then rule in the ANFIS network is as follows.
Rule i: IF x is A i AND y is B i THEN F i = p i x + q i y + r i , i = 1: number of rules (2) Where, x and y are ANFIS inputs, A and B are fuzzy sets, Fi is an output function, and pi, qi, and ri are consequence parameters belonging to the output function.Any Triangular norm (Tnorm) operator that performs fuzzy AND belonging to the IF part of the fuzzy rule can be deployed in the second layer as the fuzzy implication function.Hence, this layer is utilized the minimum T-norm and multiplies the membership grades coming from the first layer in order to obtain the weight of each rule.Eq.3 represents the functionality of this layer for two x and y inputs.
Where, wi is the weight of ith rule and    () and    () represent the membership grades of the x and y.The third layer computes the normalized firing strength of ith rule ( ̅  ) using Eq.4.
, i = number of rule and j = total number of rules (4) The fourth layer, which is denoted as the defuzzification layer, multiplies the normalized weights from the previous layer by Then part of fuzzy If-Then rule (output function).Eq.5 exhibits the operation of this layer.
The only node in the fifth layer finally adds the outputs of the previous layer and generates the conclusive output of ANFIS (Karaboga and Kaya 2019).
Generally, there are two learning algorithms for ANFIS training, namely back-propagation and hybrid in order to adjust the premise and consequence parameters in every epoch.The hybrid method consists of two steps, forward and backward pass.The consequent parameters are specified by least-squares method over the forward pass and the premise parameters are adjusted by gradient descent over the backward pass.

Division methods in ANFIS
ANFIS can be categorized into four types, namely ANFIS-GP, ANFIS-SC, ANFIS-FCM, and ANFIS accompanied by a context-based fuzzy c-means (ANFIS-CFCM) according to the division method of inputs (Yeom and Kwak 2018).The division methods, such as GP and SC cluster the input data to fuzzy clusters.Afterward, an MF is allocated to each cluster and a FIS is generated (Adeleke et al. 2020).
The GP method divides the entire input data space into rectangular partitions and every grid is used to produce fuzzy rules.The accuracy of clustering in this method is greatly relying on the size of inputs and grids (Benmouiza and Cheknane 2019).The SC method, as the first step, calculates the potential of being the center of cluster based on the density of the data surrounding it for each data and selects the data with the greatest potential as the center of the first cluster.Afterward, all data that falls within the center's range of influence of the first cluster is deleted.This process iterates for the remaining data so that all data are clustered.The FCM clustering assigns each data to clusters using a membership grade.As the first step of this algorithm, the number of clusters is determined and the initial locations of clusters centroids are discovered by an initial guess.Afterward, the membership grades of each data for being in each cluster are calculated.The FCM algorithm adjusts the centroid location and membership grades in an iterative process.

Markov chain analysis
The Markov chain analysis (MCA) is a foreseeing approach that event occurring probabilities are varying depending on the values at the previous time interval (Fan et al. 2008).In fact, the future values of a particular variable depend only on its current (not on the past) value.
Generally, according to the Markov chain theory, if the sequence of the random variables of xn takes the distinct values of a1, a2, …, an, then the Markov chain can be represented by Eq.6 (Fan et al. 2008).
With this assumption that the values of the variables were changed from time t to time t-1, the transition probability in time t can be computed by the following relation (Balzter 2000).
The term p in Eq.7, which is named transition probability matrix, represents the probabilities of transitioning.Assuming that the states are 1, … , m, the p can be displayed as the following matrix.
Where, the p1m represents the probability of transition from state 1 to state m.

Driving factors
Up to this time, various driving factors of urban expansion, such as distance to primary and/or secondary roads, distance to previous urban, slope, elevation, population, and access to jobs have been considered in diverse investigations (Mustafa et al. 2017;Pijanowski et al. 2014;Radwan et al. 2019;Shafizadeh-Moghadam, Asghari, Taleai, et al. 2017;Shafizadeh-Moghadam, Asghari, Tayyebi, et al. 2017).In determining driving variables, it is highly important to pay attention to the particular characteristics of the local area.Qazvin, as our study area, possesses several colleges and universities at the present.Most of these institutions have been built outside of the city limits, which makes the value of nearby land grow and the chance of LULC change increase.Additionally, the past researches have not focused on the distance from educational institutions as a driving force in urban growth or sprawl modeling (Aburas et al. 2016;Hosseini and Hajilou 2019;Santé et al. 2010), except for a very limited number of studies such as Triantakonstantis et al. (2014).Hence, distance to the universities by five other spatial variables, namely slope, aspect, distance to the urban areas, distance to the urban transportation network, and distance to the urban squares, were selected as the driving factors of this study.

Methodology
Satellite imageries are qualified primary data sources for studying LULC changes.Hence, for this research, we adopted the Landsat images (from 2000, 2010, and 2019) due to their accessibility and appropriate special resolution.and Root Mean Square Error (RMSE), and the most appropriate one was selected.In the third part, a CA-MC model with WLC for the generation of transition suitability images and AHP for factor weighting was developed.In the prediction phase, the built-up areas of 2019 were predicted using the CA-MC and most well-trained ANFIS models.Eventually, the predicted map of CA-MC was evaluated using overall accuracy, Kstandard (Kappa standard), Kno (Kappa for no information), Klocation (Kappa for location), and KlocationStra (Kappa for stratum-level location), and the predicted map of ANFIS was assessed utilizing overall accuracy and Kappa.

Data sources and imageries preparation
Landsat images have been largely used for urban planning due to aspects of appropriate spatial resolution and shorter revisit time (S.Wang et al. 2018) Before utilizing the satellite images, pre-processing steps must be performed on the imageries.
Pre-processing refers to those processes that apply to the raw satellite images to reduce or eliminate the errors of sensor and platform and includes radiometric and geometric corrections (Asubonteng et al. 2018).The obtained images have georeferenced and orthorectified with high geometric accuracy (fewer than 6.8m RMSE), and no geometric corrections were required to add to the imageries accordingly.
Atmospheric correction, as one of the most important steps in pre-processing, eliminates the influences of Atmospheric phenomena, such as absorption and scattering (Jha et al. 2021).
Hence, we applied an absolute atmospheric correction to all imageries by using the fast lineof-sight atmospheric analysis of hypercubes (FLAASH) algorithm in the ENVI platform.
The Scan Line Corrector (SLC) of Landsat 7 ETM+ has failed since May 31, 2003, which made a loss of about 22% of data in the SLC-off images (Masoumi et al. 2017).Therefore, in order to fill the gap areas of the image of 2010, we used the Local Linear Histogram Matching (LLHM) approach by adding an add-on to the ENVI software and adopting another Landsat 7 image (image on June 19, 2010) from the same area of the initial image.
We selected the suitable three-band combination by applying the Optimum Index Factor (OIF) technique for improvement in classification accuracy (Patel and Kaushal 2011).Table 1 displays the implementation results of OIF method in the ILWIS software.As a preferable combination, the one with the highest OIF represents the most details in the images.we collected the land use of random sample points by GPS and generated a confusion matrix.
An overall accuracy of 80% and a kappa coefficient of 0.72 were received for the classification.
According to Pijanowski et al. (2005), the accuracy of classification was acceptable.

Generation and preparation of Input layers
In this study, six driving factors were selected and, as a result, six raster layers related to them were created in the ArcMap environment.Two layers of slope and aspect were achieved using Aster Digital Elevation Model (DEM) v.3.Four distance raster layers were generated by Euclidean distance ArcMap tool.Furthermore, the layer of areas with unchangeable LULC, such as military bases and parks, were generated by the LULC map.The value of these pixels remained 0 (unchangeable land use) in all study stages.All accomplished raster layers for 2019 with 30m pixel resolution and UTM 39N coordinate system are presented in Fig. 4.
Since pixels in every layer had a distinct range of quantities and units, we normalized all of them in the range of 0 to 1 by the Eq.8 in the MATLAB (version 2016b) software.
X(i , j) = ((X(i , j) − min(X(i , j))) / (max(X(i , j)) − min(X(i , j))) Where, X(i , j) is the normalized value of each pixel, min(X(i , j) is the minimum value of all pixels in a specific layer, and max(X(i , j) is the maximum value of them.

ANFIS models implementation
As the common ANFIS methods, we implemented ANFIS-GP, ANFIS-SC, and ANFIS-FCM using GenFIS1 (generate initial FIS), GenFIS2, and GenFIS3 MATLAB functions respectively.All models were accomplished with various input parameters and Gaussian MF.We predicted the map of built-up areas for 2019 by the most accurate ANFIS model in the prediction phase, in order to comparison of simulation accuracy with the CA-MC model.
Finally, we computed the kappa coefficient and overall accuracy for the predicted map by comparing this map with the classified land use map that we had already obtained.

Training dataset
Regarding there is no accepted rule in ANFIS to determine the number of training dataset, we randomly created a 100000*7 training dataset matrix using 100000 pixels (from a total of 142926) with nonresidential land use from the map of the year 2000.We used 70% of these pixels for training and the other 30% to validate the quality of the training process.The matrix below illustrates our training dataset matrix acceptably.
The first six columns were created by normalized input layers, whereas the seventh column was the ANFIS output illustrating the pixel development position in the map of the year 2010.
Zero indicates undeveloped and one indicates developed pixel in this column.

CA-MC model implementation
As a stochastic model, the Markov chain can express the probability of the LULC changing in simulating land use changes (Wu et al. 2019); however, it cannot obtain the spatial distribution of projected LULC (Nguyen et al. 2019).In order to resolve this issue, the CA-MC model, The CA-MC model with an iterative process was employed a kernel size of 5 × 5 as a filter.
With every iteration of the model, every land use map is weighted again through the filter to finally simulate the transferred areas of each land use for the 2019 year.

CA-MC model Validation
We assessed the CA-MC model performance by comparing the output map of the model and the land use map of 2019 and obtaining the kappa coefficients in the Idrisi.To fulfill this aim, we computed various Kappa coefficients, namely Kstandard, Kno, Klocation, and KlocationStra to validate the model in both quantification and location.The standard Kappa (denoted as kstandard) reveals the ability of classification to attain perfect classification and uses for quantity (not location) assessment (Pontius Jr 2000).As Eq.9 indicates, Kstandard signifies the ratio of the categories that correlated correctly to categories that correlated correctly by chance (Khwarahm et al. 2021).
Where, M(m), N(m), and P(p) represent the resemblance between the reference map and: the simulated map, a map with random sites of the cells, and a map that has perfect data, respectively.klocation (kappa for location) and klocationstra (Kappa for stratum-level location) reveal what extent the cells are located correctly on the landscape and within the strata, respectively.Eq.10 and Eq.11 calculate these two coefficients.
Where, H(m), P(m), and K(m) reveal the resemblance between the reference map and: a map with random sites of the cells in each stratum, a map with regenerate sites of the cells as well as possible in the entire map, and a map with regenerate sites of the cells as well as possible in each stratum, respectively.Finally, Eq.12 states the formula for calculating the kno (kappa for no information) that is defined as the ratio of pixels with correct classification to expected correct classification with no ability to specify quantity or location (Pontius Jr 2000).The N(n) in Eq.12 can be defined as accidental agreement.
All mentioned kappas can vary from zero (completely classified by chance) to one (perfect agreement between maps) (Christensen and Jokar Arsanjani 2020).
In addition to the kappa coefficients, we obtained the overall accuracy in MATLAB to compare the results of the two models extensively.

Results and discussion
All input parameters and execution results of ANFIS-GP and ANFIS-SC models, executed in the first part of the calibration section, are provided in table 2. According to this table, although the RMSE of ANFIS-GP was approximately equal to the RMSE of ANFIS-SC, its MSE was greater than the MSE of ANFIS-SC.Besides, it is apparent that the execution time of the ANFIS-GP was longer than ANFIS-SC due to more number of generated rules.On the other hand, ANFIS-SC was able to approximately obtain an equal accuracy to ANFIS-GP with more MFs and fewer rules.More MFs and fewer generated rules lead to more coverage of the input space and less execution time of the model, respectively.

Table 2 execution results of ANFIS-GP and ANFIS-SC in the first part of calibration
Additionally, the execution results and characteristics of ANFIS-SC and ANFIS-FCM in the second part are presented in tables 3 and 4, respectively.The stopping criteria was considered for algorithm as the maximum execution time of 50000 seconds.All models were successfully executed within the time limit, except for the ANFIS-SC with the radius of 0.   The line charts in Fig. 6 reveal the relation between the RMSE error of model training and the number of clusters/center's range of influence in both back-propagation and hybrid methods.
From these line charts it can be observed that the accuracy of training in ANFIS-FCM has not necessarily increased with the increase of clusters; however, the accuracy of ANFIS-SC has increased with decreasing radius (increasing clusters).Furthermore, it is quite obvious that the models that have used the hybrid optimization algorithm have always had less error than the back-propagation algorithm.Hence, utilizing an ANFIS-SC with a hybrid algorithm and smaller radius of influence can lead to more accurate modeling of urban expansion.
As it is indicated in Fig. 7, the target-output plot, errors plot, and the error histogram chart of the ANFIS-SC with the radius of 0.2 were generated in order to further examine the outputs of ANFIS model.The target-output plot represents the ideal output of ANFIS (0 and 1) as target and the predicted output as output at the same time, whereas the errors plot reveals the difference between them, and the error histogram chart illustrates the distribution of errors.
Looking at the error histogram chart, it is apparent that more errors were close to zero, which means the predicted outputs of the ANFIS-SC have been close to the ideal output.
Consequently, the results of the model have been satisfactory.
The predicted map of the ANFIS-SC (radius of 0.2) model for 2019 had the Kappa coefficient of 0.76 and overall accuracy of 93.41, whereas the CA-MC model had the Kappa of 0.66 and overall accuracy of 77.96.The kno of 0.66, Klocation of 0.79, and KlocationStra of 0.79 for the CA-MC model indicated that this model can have satisfactory results; however, the ANFIS model had much better results than CA-MC due to better Kappa and overall accuracy.Besides, the Kappa between 0.6 and 0.8 is considered a very good kappa (Pijanowski et al. 2005).Hence, as it is presented in Fig. 8, the predicted map for 2029 was achieved by ANFIS-SC model.
Comparing this map with the 2019 land use map, it is apparent that the projected development will occur in the north more than in other directions.A possible explanation for this may be the existence of two large universities by enrollment in the north of the area in addition to the suitability of other driving factors.These universities with more than 36000 enrolled students cause a large number of trips to the north of the city during the day.The examination of the road land use changes over the last few years reveals that urban decision-makers have improved the road transport network in the northern areas to meet the requirements of the students.The built-up roads to the universities have improved access into the northern barren lands; consequently, they have increased the probability of LULC change of these lands.Hence, large educational institutions can generally play an important role in increasing residential land use in the region, especially in small and medium-sized cities.

Conclusion
This study was established to develop a powerful model based on ANFIS, and to investigate the role of clustering method in the accuracy of ANFIS model training and simulation power in urban expansion modeling.The most obvious finding to emerge from this study was that the proposed ANFIS model had a high accuracy in modeling the complex phenomenon of urban expansion and was able to be an acceptable alternative to the traditional model of cellular automata.Besides, according to the results of this study, the ANFIS model offered a more precise simulation of LULC in comparison with the CA-MC model.According to the results, we can infer that the ANFIS-SC model will lead to more precise urban expansion modeling in comparison with the ANFIS-FCM.Moreover, although the ANFIS-GP has precise and acceptable results, it is not a great option for modeling urban expansion due to longer execution time and inability to execute and high complexity in handling high-dimensional problems (Adedeji et al. 2020).
An important aim of the present research is to warn about the harmful effects of spontaneous urban expansion on the environment, using a functional model and predicting the land use changes from a ten-year perspective.The valuable finding of our study is that the city expansion will occur to the north toward the barren lands, and expansion to the other directions will not approximately appear.The sprawl development in these barren lands can pose a significant threat to the environment of this area that is a habitat for various animals.Contrary to expectations, the predicted map of 2029 indicates that agricultural lands that are located in the south, west, and southwest of the city, will not be damaged in the future due to land use changes.This finding is likely to be attributed to the lower price of barren land compared with agricultural land, the fertility of agricultural land, and supportive government policies protecting agricultural land, and it will be of broad use to policymakers.
The most important limitation of this study lies in the fact that we are not able to utilize all driving factors influencing the expansion of the city due to some factors like the price of land parcels of the area in the past is not available now.Notwithstanding this limitation, this research employs as many important factors as possible to study urban growth.Another limitation of this study is the need for computers with powerful hardware to run more complex ANFIS models, especially the ANFIS-GP or ANFIS-SC with a smaller radius of influence.According ANFIS models were implemented based on cellular spaces in this study.Hence, it might be possible to use a parcel-based land use classification in further investigation, and instead of using cells in the model, parcels be used as the smallest unit for studying land use changes.
Additionally, the center's range of influence between 0.2 and 0.8 in ANFIS-SC was investigated due to the mentioned limitations, whereas it revealed that the model results improved with decreasing radius of clusters.Hence, further studies need to be carried out to achieve a more accurate training of the model using a radius smaller than 0.2, and cautiously investigate the relationship between decreasing radius and increasing model accuracy.
Moreover, a further study on implementing ANFIS-CFCM in urban growth modeling and comparing the results with the other three ANFIS models are suggested.

Supplementary Files
This is a list of supplementary les associated with this preprint.Click to download. graphicalabstract.tif utilized ANFIS for urban growth modeling from 2000 to 2006 in Sanandaj, Iran.They employed Figure Of Merit (FOM) and Percent Fig.3provides a general overview of the principal performed steps of this study.As it can be observed from this figure, image preprocessing was performed on the data because the received images were accompanied by errors.Afterward, images were classified and the six driving factors of urban growth as input layers of the ANFIS and CA-MC models were generated.This study developed and implemented a two-step model that comprised two phases of calibration for 2000 to 2010 and prediction for 2010 to 2019.The calibration phase consisted of three parts.In the first and second parts, ANFIS-GP, ANFIS-SC, and ANFIS-FCM models were calibrated and compared in terms of performance and accuracy of training in pairs using Mean Squared Error (MSE)

Fig. 5 ,
Fig.5, which achieved by ANFIS editor Graphical User Interface (GUI), represents our ANFIS-SC structure with the six driving factors of this study as inputs of the model and one output representing the development probability of each pixel.
which integrates the CA with the MCA model, has been employed in various urban growth and LULC changes studies.This integrated model includes a spatial dimension considering the effects of neighboring cells by CA neighborhood.To implement the CA-MC mode, we accomplished the MCA model in the Idrisi software by introducing land use maps of 2000 and 2010 and two time intervals from 2000 to 2010 and 2010 to 2019 into the model.As a result, the transition probability matrix (reveals the probabilities of transitioning from one land use to another one) was obtained.One of the inputs of CA-MC is the transition suitability image which reveals the suitability of converting to another land use.We used weighted linear combination (WLC) as one of the multicriteria evaluation (MCE) methods to obtain these maps.To achieve this aim, first, we obtained transition potential images (express the potential of change in the specific time) for three urban, agricultural, and barren land uses and then converted all three maps to transition suitability images in the Idrisi.It is worth mentioning that we normalized all pixels between 0 to 255 by fuzzy reasoning due to differences in units and pixel values, and utilized the AHP for factor weighting in the MCE method.
1.By comparing all RMSE and MSE errors of the training and test data, it was found that the ANFIS-SC with the radius of influence of 0.2 (the hybrid one) was the most accurate for training data.The ANFIS-SC with the radius of 0.4 had the least MSE for test data; however, it had more RMSE and MSE for training data in comparison with the ANFIS-SC with the radius of 0.2.The results were generally indicated the superiority of the ANFIS-SC to the ANFIS-FCM and ANFIS-modeling.This finding is consistent with the results ofAdedeji et al. (2020),Moradi et al. (2019), andAkkaya (2016) studies accomplished on different areas.
to the results of training and test dataset, the key strength of this study has been high-precision training of the model (without overfitting) by a high number of training dataset (100000 pixels out of a total of approximately 142000 pixels).

Fig. 1
Fig.1 the map and satellite image of the study area: a location of Qazvin province in Iran, b location of Qazvin city in the county, and c satellite image of Qazvin city (Source: Landsat 8 satellite imagery)

Figure 4 raster
Figure 4 . Hence, for this research, we adopted the Landsat 7 ETM+ (Enhanced Thematic Mapper) images onMay 22, 2000, and June 3, 2010,and Landsat 8 OLI/TIRS (Operational Land Imager/Thermal Infrared Sensor) image on June 20, 2019.The images were downloaded from the United States Geological Survey (USGS) database (https://earthexplorer.usgs.gov/)as the primary data sources of this study.As each sensor had a distinct spatial resolution, all images were resampled to 30m resolution, and then they projected to Universal Transverse Mercator (UTM) 39 N coordinate system.

Table 1
the OIF values belonging to the best band combinations of satellite images Afterward, we carried out a supervised method, namely maximum likelihood, to classify the imageries in the ArcMap software.Urban, agricultural, barren, and transport were selected as the land use classes based on study objectives.For assessing the classification performance,

Table 3
execution results of ANFIS-SC in the second part of calibration

Table 4
execution results of ANFIS-FCM in the second part of calibration