Analyzing Temporal Changes in an Urbanized Area Using Densely Stacked Image Classification and Multinomial Logistic Regression (MLR) Technique

Monitoring the transformation of non-built-up area into urban spread through densely stacked Land-Use-Land-Cover (LULC) classification offers a catalogue of spatio-temporal statistics for evaluating the discrepancies instigated by transition factors. The impacts of the major transition mechanisms that drive the haphazard urbanization pattern are evaluated for Vellore as a contribution to the Smart City project. The implications of three causative factors, i) population density, ii) proximity to the rail-road network, and iii) proximity to commercial areas, are scrutinized with respect to the urbanization upsurge. Multivariate correlation is established using trend analysis and the Multinomial Regression (MLR) technique for individual and combined sets of these factors. The resulting equations are used to detect the closeness of urban extent to several landscapes. The results show that built-up straggling occurs from 30 to 232 m along the landscapes, with a maximum of 336 m. The approach can also be applied to assess various social and economic causative factors of urbanization in other smart cities.


Introduction
In India, the increasing pace of urbanization over the last two decades has made it necessary to identify the causative factors that lead to drastic changes in natural landscapes. Thus, in order to find an appropriate solution that minimizes the damage to the flora and fauna inhabiting the affected landscapes, trends in the transition factors need to be evaluated. Swelling population density and outspreading urban sprawl, chiefly in metropolitan municipalities, intensify the demand for natural resources such as water, energy and land, which in turn results in deforestation, desertification and severe loss of agricultural land. These land parcel variations jointly contribute to changes in the global environment and near-surface temperatures (Osgouei et al., 2019). Urbanization and changing life patterns during the last three decades have required urban planners to devise an effective methodology for estimating the spatial extent of urbanization. The irregular shapes and sizes of urban features hinder precise evaluation of the urban extent and its causative factors. In India, evolving urbanized areas resemble fallow farmland because of their equivalent reflectance values (Long et al., 2009; Webster, 2001). Densely time-stacked image analysis addresses this problem by separating urban and fallow farmland features appropriately. With saturated development in the urban core, pre-urbanized segments located far from the city premises (the range can be estimated using threshold proximity analysis) experience urban sprawl. VHR (1-4 m) imagery is used to closely inspect the urban changes occurring far from the peripheries of densely populated areas (Ban et al., 2010; Del Frate et al., 2007).
Accuracy in urban area identification and modelling is of substantial interest to municipal authorities for urban planning applications such as resource allocation, management and distribution, facility provision and promotional policies (Jat et al., 2008). Non-parametric techniques such as machine learning classification, decision tree algorithms and knowledge-based classifiers are used extensively to classify Landsat imagery (Osgouei et al., 2019). Analysis and prediction modelling of impervious area using classification techniques consume considerable computational power and time. An alternative way to demarcate urban areas is point sampling combined with a supervised, unsupervised or knowledge-based systematic learning technique (Bradley, 1997) [...] algorithms to be used to classify images, and what dataset is used (multi-spectral or multi-temporal or multi-fusor). Typically, the accuracy of classifying satellite images into three major classes (vegetation, water and urban) is high (over 85%), but identifying a larger number of features becomes difficult and time consuming (Herold et al.) [...] Even though urbanized areas are characterized, the question of isolating the reasons for increasing urban sprawl remains unanswered. Evaluating the numerous factors that contribute to urban sprawl plays a vital role in helping the various planning authorities devise development strategies for easier resource allocation and assured future supplies of natural resources (Jat et al., 2008) [...] (Mustafa et al., 2018). In India, marketing strategies and religious parameters show a complex association with built-up expansion because of asymmetrical commercialization and religious belief. Traditional methods of evaluating these factors include manual mapping, which requires manpower, time and large investment.
Remote sensing combined with regression provides a competent practice that is not time consuming and offers more accuracy for long-term outcomes (Haack and Rafter, 2006; Sudhira et al., 2004; Yang and Liu, 2005). Regression analysis of the causative factors against increasing urban spread using remote sensing is therefore pursued in this study. An effort is made to examine spatial and temporal applications of GIS and remote sensing to classify the built-up area in Vellore city and adjoining areas. Population density and Euclidean distance from various features were considered as the causative factors for urban sprawl. For this purpose, cloud-free time series Landsat images from 2002 to 2020 were obtained from the USGS database for Landsat and Sentinel images. Remote sensing and GIS techniques were used to obtain data for the land parcels occupied by impervious extent. Initially, unsupervised classification algorithms (either ISO or K-means) were applied to each Landsat image, after enhancement through PCA or ICA, to ensure operator precision before selecting distinctive pixel distributions. Training data in the order of 10 times the number of bands (10n) were carefully chosen by means of an image-to-image comparison technique (comparing the classified image with the historical archive of Google Earth images) and ground truth data obtained from toposheets and survey maps. Further, MLC or SVM approaches for supervised classification were applied to the enhanced images, and the urban area was validated through Shannon entropy or a patchiness change matrix at landscape level. Statistical analyses such as MLR and trend analysis were performed to identify the relationship between urban spread and the causative factors.

Study area
Vellore city (boundary obtained from Google Maps) has an area of 98.3 km² and is located in the north-eastern part of Tamil Nadu. The city experiences a tropical savanna (semi-arid) climate with high temperatures ranging from 29 to 40 °C and a water shortage for almost the entire summer season from March to July.
Vellore lies on the banks of the Palar river, which is an underground river opening at Bethamangala town. The major sources of water for the city are groundwater (the current groundwater level ranges from 0.3 m to 8 m bgl) and overhead tanks provided by the municipal Palar and Karungamputhur water works. As shown in Fig. 1, the city lies between 12°53'30" and 12°57'30" N latitude and 79°3'30" and 79°10'30" E longitude, 220 m above mean sea level. Vellore has been identified as one of the 27 upcoming Smart Cities by the Government of India. Its proximity to two major metropolitan cities of India, Chennai and Bangalore, makes Vellore susceptible to urbanization. The VIT campus and CMC are the major urbanized or commercial regions in the city. Vellore is regarded as a highly spiritual place, with more than seven temples, three churches and three mosques, which attract tourism and increase urban sprawl.
The other land use features comprise vegetative areas, open grounds and water bodies (permanent and perennial). As a designated Smart City, Vellore is prone to increasing development, which triggers cumulative urban sprawl and property demand for residential and commercial purposes. To fulfil this land requirement, vegetative lands are converted to built-up areas, thereby encroaching on pervious land use.

Image processing and causative factor identification:
To model the spatial and temporal changes in the aforementioned features with respect to increasing urban extent, densely time-stacked cloud-free Landsat images (Landsat MSS, TM and ETM+) from 2002 to 2020 are used extensively to develop LULC maps. Preliminary image corrections (radiometric and geometric) and enhancement procedures such as Principal Component Analysis (PCA) and Independent Component Analysis (ICA) are performed.
Various learning techniques, both supervised and unsupervised, are applied to the enhanced images to cluster spectrally similar pixels with a high degree of objectivity (Yang & Lo, 2002(a)). To ease the selection of training samples and cluster similar spectral signatures, unsupervised classification (k-means and ISODATA) is performed first. An image-to-image comparison is then carried out in which the classified image is contrasted with the Google Earth archive of historical images. A multiband image is obtained and each unknown feature is assigned to one of the four classes (water, built-up, vegetation and barren land). Various indices such as NDWI, NDVI, NDBI, BI and DBI are computed to cross-validate the features obtained from unsupervised classification. Signature files (training datasets) are generated from Google Earth and ground reference data using the signature editor in the ERDAS Imagine tool. Training samples are selected in the order of 10n (where n is the number of classes identified), along with ground truth data obtained from survey maps, toposheets from SOI and city development plans, as shown in Table 1. SVM classification is then performed for each set of images using the training samples obtained in the preceding phase. The SVM tool typically incorporates the variance and covariance of the different signatures while assigning the feature category. The classified images are then subjected to accuracy assessment, which involves selecting ground samples for each feature to compare with the classified pixels, mathematically represented as Eq. 1. Accuracy assessment is done for each classified image using 263 sample points. Figure 2 shows the stepwise methodology for extracting built-up pixels from the densely time-stacked series of classified images and the effect of different causative factors on urbanization change.
In the next step, the built-up feature for each year was extracted, and landscape indices such as Shannon's entropy, sparseness index and built-up map density were assessed using demographic and built-up area statistics from the data sources listed in Table 1. Data for the temporal causative factors were obtained from OSM layers (roads and railways), Census reports (annual population data) and toposheets (urban core identification). Classified images are converted into polygons, and the Euclidean distance of urban features from the national highways (NHs), state highways (SHs), railway lines, kaccha (unpaved) roads and commercial land use layers is calculated using the near-distance feature. The classified images for each year are shown in Fig. 3.
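The PCA enhancement step can be illustrated with a minimal sketch. It assumes the bands are held in a NumPy array of shape (bands, rows, columns); the function name `pca_transform` and the synthetic scene are hypothetical and stand in for the software workflow used in the study, which is not reproduced here.

```python
import numpy as np

def pca_transform(bands: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Project a (n_bands, rows, cols) stack onto its principal components.

    Returns the component images (same shape as the input) and the fraction
    of total variance explained by each component, largest first.
    """
    n_bands, rows, cols = bands.shape
    pixels = bands.reshape(n_bands, -1).astype(float)    # one column per pixel
    centered = pixels - pixels.mean(axis=1, keepdims=True)
    cov = np.cov(centered)                                # band-to-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)                # eigenvalues ascending
    order = np.argsort(eigvals)[::-1]                     # reorder: largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs.T @ centered
    return components.reshape(n_bands, rows, cols), eigvals / eigvals.sum()

# Synthetic 4-band, 50 x 50 scene (illustrative only, not Landsat data):
# four noisy, scaled copies of one underlying pattern.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 50))
stack = np.stack(
    [base * w + rng.normal(scale=0.1, size=(50, 50)) for w in (1.0, 0.8, 0.6, 0.4)]
)
pcs, explained = pca_transform(stack)
print(np.round(explained, 3))
```

Because the synthetic bands are strongly correlated, nearly all of the variance falls into the first component, which is the property that makes PCA useful for decorrelating multispectral bands before classification.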
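The accuracy assessment described above can be sketched as follows. Since Eq. 1 is not reproduced in the text, the sketch assumes the conventional overall accuracy and Cohen's kappa derived from a confusion matrix, which may differ from the authors' formulation. The class order and the ten check points are hypothetical.

```python
import numpy as np

CLASSES = ["Water", "Built-up", "Vegetation", "Barren"]

def confusion_matrix(reference, predicted, n_classes):
    """Rows are reference (ground) classes, columns are classified classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for ref, pred in zip(reference, predicted):
        matrix[ref, pred] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of check points whose classified label matches the reference."""
    return np.trace(matrix) / matrix.sum()

def cohen_kappa(matrix):
    """Agreement corrected for the agreement expected by chance."""
    total = matrix.sum()
    observed = np.trace(matrix) / total
    expected = (matrix.sum(axis=0) * matrix.sum(axis=1)).sum() / total**2
    return (observed - expected) / (1.0 - expected)

# Hypothetical class indices for 10 check points (the study uses 263).
reference = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3]
predicted = [0, 0, 1, 1, 3, 2, 2, 1, 3, 3]

cm = confusion_matrix(reference, predicted, len(CLASSES))
oa = overall_accuracy(cm)
kappa = cohen_kappa(cm)
print(cm)
print(f"overall accuracy = {oa:.2f}, kappa = {kappa:.3f}")
```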
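As an illustration of the landscape indices mentioned above, the following sketch computes Shannon's entropy of the built-up distribution across zones. It assumes the commonly used form H = -Σ p_i ln p_i with p_i the share of built-up pixels in zone i, normalized by ln n; the paper does not state its exact formula, and the zone counts are hypothetical.

```python
import numpy as np

def shannon_entropy(built_up_per_zone):
    """Return (H, H / ln n) for built-up counts in n zones (n > 1)."""
    counts = np.asarray(built_up_per_zone, dtype=float)
    share = counts / counts.sum()          # proportion of built-up in each zone
    nonzero = share[share > 0]             # 0 * ln(0) is taken as 0
    entropy = -np.sum(nonzero * np.log(nonzero))
    return entropy, entropy / np.log(len(counts))

# Hypothetical built-up pixel counts in four buffer zones around the core.
compact = shannon_entropy([900, 60, 30, 10])
dispersed = shannon_entropy([260, 250, 245, 245])
print(f"compact: {compact[1]:.3f}, dispersed: {dispersed[1]:.3f}")
```

A relative entropy close to 1 indicates built-up area spread evenly over the zones (sprawl), while a value close to 0 indicates concentration in a few zones.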

Image Classification:
Georeferenced urban features for Vellore city were extracted using toposheets, Google Earth, manual survey, city master maps and municipality water distribution maps. As per the 2011 census, the rate of urbanization in Vellore was about 43.2%, which is slightly below the average rate of urbanization for the state of Tamil Nadu. A total of 78 Landsat images from 2002 to 2020 (preferably from cloud-free days with no rainfall), with band descriptions as shown in Table 1, are used to develop an urban growth model that describes the spatial and temporal variation of urban features and predicts the impact on future locations, the characteristics of commercial structures and the consequences of an increasing growth rate. Each image is tested for signature verification, to ensure accurate pixel representation, using the following three measures: i) histograms of standard deviation, to confirm unimodality; ii) transformed divergence (TD) comparison; and iii) a contingency matrix, preferred during accuracy assessment. Necessary merging and redefinition operations are executed to maintain the unimodality of the histograms. As shown in Table 2, TD values above 1900 represent good separation, and the band composition for the Landsat archive is done using the bands with higher TD values. For urban features, the rendering mechanism performed fairly poorly because of the similar reflectance values of urban settlement, exposed rocky structures and wet alluvial soil. The variation in reflectance values across the series of densely stacked cloud-free images, collected at a cycle gap of 16 days, helped separate urban settlement from features with analogous characteristics. Image enhancement such as PCA and ICA, performed in ERDAS and ENVI, improved the visual quality and equalized the band histograms of the satellite images. The enhanced images (as shown in Fig. 4a) are subjected to two unsupervised classification techniques, K-means and ISODATA, to cluster the pixels into a user-specified number of classes (shown in Fig. 4b).
Selected training samples were verified using Google Earth, SOI toposheets (Toposheet No.: D44N4, D44N8 and D44T5), archaeological and historical backgrounds of buildings, survey sheets, expert opinions and proposed maps of the features from government and private organizations (available at the VIT estate office and municipal authorities). The pixel variance and covariance of the signatures obtained for the supervised classifiers were revised to evaluate the image enhancement procedure. Almost 263 training samples were obtained for 4 features (Water, Built-up, Vegetation and Barren lands) and later subjected to supervised learning techniques (a sample image for the year 2020 is shown in Fig. 4c). The classified images were then exported to a knowledge-based system in which ancillary information about various features from the DEM, soil maps, municipal boundaries and water bodies was integrated. This technique proved most effective for resource allocation and accuracy assessment studies in a remote sensing environment (2). Validation datasets were obtained using stratified random sampling from Google Earth, ground truth data and toposheets.
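The transformed divergence check used for signature verification can be sketched as below. The sketch assumes Gaussian class signatures (a mean vector and covariance matrix per class) and the usual 0 to 2000 scaling, which is consistent with the 1900 threshold quoted above; the function name and the two-band signatures are hypothetical.

```python
import numpy as np

def transformed_divergence(mean_i, cov_i, mean_j, cov_j):
    """Transformed divergence between two Gaussian signatures (0 to 2000)."""
    mean_i, mean_j = np.asarray(mean_i, float), np.asarray(mean_j, float)
    cov_i, cov_j = np.asarray(cov_i, float), np.asarray(cov_j, float)
    inv_i, inv_j = np.linalg.inv(cov_i), np.linalg.inv(cov_j)
    diff = (mean_i - mean_j).reshape(-1, 1)
    # Divergence: a covariance-difference term plus a mean-difference term.
    divergence = (
        0.5 * np.trace((cov_i - cov_j) @ (inv_j - inv_i))
        + 0.5 * np.trace((inv_i + inv_j) @ diff @ diff.T)
    )
    # Saturating transform so that well-separated classes approach 2000.
    return 2000.0 * (1.0 - np.exp(-divergence / 8.0))

identity = np.eye(2)
td_same = transformed_divergence([0, 0], identity, [0, 0], identity)
td_close = transformed_divergence([0, 0], identity, [1, 1], identity)
td_far = transformed_divergence([0, 0], identity, [10, 10], identity)
print(round(td_same, 1), round(td_close, 1), round(td_far, 1))
```

Under these assumptions, only the widely separated pair exceeds 1900, which is the criterion used above for accepting a band combination.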

Causative factors and Built-up area extraction:
The number of urban cells is calculated by merging the supplementary features into one non-built-up class. The average percentage increase in built-up area was 51.45% from 2002 to 2019, with an annual increasing pattern as shown in Table 3. The core urbanization of Vellore province increased due to tourist attraction and commercialization surrounding the NHs and SHs. Growing population trends were also observed in the zones near institutional and market locations. Based on expert opinions and a literature survey covering various parts of India (Ajmer, Pune, Chennai, Delhi, Bangalore), population density and distance from roads, railways and commercial regions were selected as the causative factors contributing to the increase in urbanization. Population data for the study area are acquired from the Census data and an online portal (www.population.cit/India/Vellore), showing a growth rate of +2.28% from 2001 to 2011. Decadal census statistics and the population growth were plotted for trend analysis in a regression model, and a 3-degree polynomial proved to be the best fit compared with linear, exponential, logarithmic and power distributions. Eq. 2 gives the population of the study area as a function of decade, with a correlation coefficient of 0.99, where P is the population in thousands and x is the time in decades. The percentage increase in population from 2002 to 2020 is estimated to be 53.8%, which suggests that by 2030 the increase in population may cross 67% (as reported in the Deccan Chronicle, dated July 4, 2017). The logarithmic regression equation for the population data gives the lowest correlation, with an R² value of 0.91, whereas for the other distributions (exponential, linear, quadratic polynomial and power) the values are 0.9929, 0.9981, 0.9985 and 0.9761 respectively.
Eq. 2 was further used to estimate the population density and growth rate for each year. The population (P, in thousands) and the number of built-up pixels for each year are shown in Table 3. The built-up areas extracted from the classified images were then correlated with the population density, and a regression model was formulated using Excel as shown in Eq. 3, where P_B is the number of built-up pixels in hundreds. The built-up area increased by approximately 80.82% from 2002 to 2020. In addition, pixel transformation analysis of the built-up areas in the classified images is done to study their distribution along urban land use such as roads, railways and commercial areas. Urban sprawl, population density and economic indices grew rapidly alongside the commercial land use compared with the far-flung ranges, as further validated using the Census of India report, 2011. Layers of urban/non-urban variation, proximity to permanent commercial features, proximity to roads and railways, and population density are prepared at local scale (as shown in Fig. 4d). On the other hand, factors associated with social, cultural and political influence on urbanization change cannot be evaluated quantitatively. The impact of population change is also analyzed against the commercial regions and found to be positively correlated. Figure 3c shows the calculation of the distance of various features from roads, railways and commercial regions. From the proximity analysis, the distances from national highways, state highways and railways are found to be the major factors for population increase and urbanization change, compared with proximity to township roads and commercial regions.
In the proximity relationships, D_r is the distance from roads and railways and D_B is the distance from tourist and commercial areas. Increasing built-up areas in the study area were analyzed for their proximity to roadways and railways by treating the decision criteria as inert features responsible for their correlation. The inert features considered for this study are permanent water bodies, existing landmarks or developed areas, national and state highways, railways, and pilgrimage and tourist places. A slope of more than 15% is not considered for urban sprawl. Proximity to settlements and tourist places such as Vellore Fort and the Golden Temple was considered as a constraint responsible for the conversion of barren or vegetative spaces to urban features and for the upsurge of urbanization around the landscapes. The second causative factor is proximity to the road and railway network, which was responsible for transforming forest and barren area into new impervious surfaces, leaving the restricted spaces aside. The constraint on proximity to the road and rail network is that areas falling between the equidistant lines at 30 and 232 m are more prone to urbanization. Commercial regions such as university campuses, hospitals and market areas made a major contribution to the increasing sprawl, with development levelling off approximately 336 m away from these areas. In some places, land parcels adjacent to the national highways are provided with small shrubberies, aiding sustainability and smart city requirements. Altogether, the factors considered in the study help to develop an MLR (shown in Fig. 4e) for analyzing the change in urban parcels due to the combined effect of increasing population and proximity to roads, railways and commercial areas. Table 4 shows the combined MLR models for four combinations of causative factors vs. the built-up pixel variation, with SE and NSE calculations.
The first correlation, of population and proximity to roads, railways and commercial regions with built-up pixels, gives the maximum NSE value and can be selected for future studies on identifying the increase in urban extent over time.
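The trend comparison described above can be illustrated with a short sketch that fits polynomials of increasing degree and compares their R² values. The population series is hypothetical (it is not the census data used in the study), and `numpy.polyfit` stands in for whichever fitting tool the authors used.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination of fitted values y_hat against y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical decadal series: x in decades, P in thousands.
decade = np.array([0, 1, 2, 3, 4, 5], dtype=float)
population = np.array([110.0, 135.0, 170.0, 178.0, 186.0, 240.0])

fits = {}
for degree in (1, 2, 3):
    coeffs = np.polyfit(decade, population, degree)   # least-squares fit
    fits[degree] = r_squared(population, np.polyval(coeffs, decade))

best_degree = max(fits, key=fits.get)
for degree, r2 in fits.items():
    print(f"degree {degree}: R^2 = {r2:.4f}")
```

A higher-degree polynomial always fits the sample at least as well as a lower-degree one, so an in-sample R² comparison of this kind favours the cubic by construction; checking the fit on held-out years would be a reasonable complement.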
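The proximity calculation can be sketched as a nearest-distance query from built-up pixel centres to a road or railway polyline. The sketch assumes coordinates in metres in a projected coordinate system and no zero-length segments; the function name, the road geometry and the sample points are hypothetical and stand in for the GIS near-distance tool used in the study.

```python
import numpy as np

def distance_to_polyline(points, vertices):
    """Shortest Euclidean distance from each point to a polyline.

    points   : (N, 2) array of x, y coordinates
    vertices : (V, 2) array of polyline vertices, V >= 2
    """
    points = np.asarray(points, float)
    vertices = np.asarray(vertices, float)
    starts, ends = vertices[:-1], vertices[1:]
    seg = ends - starts                                  # (S, 2) segment vectors
    seg_len2 = np.einsum("ij,ij->i", seg, seg)           # squared segment lengths
    rel = points[:, None, :] - starts[None, :, :]        # (N, S, 2)
    # Projection of each point onto each segment, clamped to the segment ends.
    t = np.clip(np.einsum("nsj,sj->ns", rel, seg) / seg_len2, 0.0, 1.0)
    nearest = starts[None] + t[..., None] * seg[None]    # closest point per segment
    return np.linalg.norm(points[:, None, :] - nearest, axis=2).min(axis=1)

road = [(0, 0), (100, 0), (100, 100)]                    # hypothetical road, metres
built_up = [(50, 30), (130, 50), (-30, -40)]             # hypothetical pixel centres
d_r = distance_to_polyline(built_up, road)
in_band = (d_r >= 30) & (d_r <= 232)                     # the 30 to 232 m band
print(d_r, in_band)
```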
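A combined model of this kind can be sketched as follows. Because the fitted equations of Table 4 are not reproduced in the text, the sketch assumes an ordinary least-squares multiple linear regression of P_B on P, D_r and D_B as a stand-in for the paper's MLR model, and assumes that NSE denotes the Nash-Sutcliffe efficiency. All data are synthetic and the coefficients used to generate them are arbitrary.

```python
import numpy as np

def fit_linear_model(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... ; returns (coeffs, fitted)."""
    design = np.column_stack([np.ones(len(y)), X])       # prepend intercept column
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs, design @ coeffs

def standard_error(y, y_hat, n_params):
    """Residual standard error with n - n_params degrees of freedom."""
    return np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - n_params))

def nash_sutcliffe(y, y_hat):
    """1 minus residual variance over observed variance (1 is a perfect fit)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic yearly series, one value per year from 2002 to 2020.
rng = np.random.default_rng(42)
n_years = 19
P = np.linspace(180, 280, n_years)                       # population, thousands
D_r = rng.uniform(30, 232, n_years)                      # distance to roads/rail, m
D_B = rng.uniform(50, 336, n_years)                      # distance to commercial, m
P_B = 5.0 + 0.9 * P - 0.05 * D_r - 0.02 * D_B + rng.normal(scale=2.0, size=n_years)

coeffs, fitted = fit_linear_model(np.column_stack([P, D_r, D_B]), P_B)
se = standard_error(P_B, fitted, n_params=len(coeffs))
nse = nash_sutcliffe(P_B, fitted)
print(np.round(coeffs, 3), round(se, 3), round(nse, 3))
```

Repeating the fit for each subset of predictors and comparing SE and NSE mirrors the four-combination comparison summarized in Table 4.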

Conclusion
One of the potential challenges for environmental protection developers, designers and planning agencies is to optimally utilize and allocate natural resources in developing parcels. Decision-making in infrastructure initiatives also depends on the potential harm to the environment caused by the type of land transformation. Thus, identifying the trend in urbanization and its causative factors is of prime importance for various authorities. The study therefore deals firstly with the identification of urban areas from satellite images using supervised and unsupervised learning techniques. The urban pixel count was extracted from densely stacked cloud-free Landsat images. The annual comparison of the classified images showed an almost 51% increase in urban pixel count from 2002 to 2020, while official reports affirm an urban growth of 43.2% till 2017. The major reasons for the increase in urban area were recognized as increasing population, resident requirements and commercialization. In this study, the major conversion from vegetative to built-up expanse was apparent on the peripheries of the road-rail networks and commercial or income-generating areas. Pixel-wise analysis of the classified images, and comparison of the classified images with the historical archive of Google imagery, showed that the vegetative and barren feature classes were converted to settlements. The proximity matrix and increasing population were evaluated as the causative factors, leaving aside social and political factors (because a quantitative investigation of these parameters is unattainable). The population growth rate was obtained for Vellore city, and an average growth rate of 2.87 was observed with a polynomial regression variation (R² = 0.99, SE = 2.57).
Furthermore, various regression techniques (linear, exponential, logarithmic, polynomial and power) were performed for each causative factor, of which the polynomial exhibited the minimum standard error (SE = 2.072, NSE = 0.87, R² = 0.96 for P_B vs. P; SE = 2.61, NSE = 0.54, R² = 0.94 for P_B vs. D_r; and SE = 22.94, NSE = 0.81, R² = 0.88 for P_B vs. D_B). Individual analysis of each causative factor showed that the distance from major roads and railways plays the most important role in the increase in urbanization, which can also be directly correlated with population change. To analyze the combined effect of the causative factors on urbanization change and to develop a set of transition rules for prediction models, MLR for P, D_r and D_B was adopted. This multivariate relationship developed for Vellore city is useful for the local municipal authorities to readily quantify the areas requiring resource allocation, daily needs scheduling, land acquisition, fund allocation for zonal or ward-wise distribution, regional planning, design of stormwater and sewage drainage systems and so on. Uncertainty analysis of causative factors such as tourism, religious places, public-government partnerships, physical barriers, political domiciles and proper drainage systems may help to improve urban growth modelling, provided that data availability is not a limiting factor.

Declarations
Availability of data and materials: Not applicable.

Figure 1
Vellore city and adjoining areas most likely influenced by urbanization
Figure 2
Detailed methodology chart classified as preprocessing of raw satellite images, image processing and identification of causative factors
Figure 3
Year-wise classified Landsat images for four major features
Figure 4
Stepwise procedure to evaluate the quantitative effect of transition factors