Assessment and Predicting of LULC by Kappa Analysis and CA Markov model using RS and GIS Techniques in Udham Singh Nagar District, India


 This study generates LULC maps and performs change detection analysis over a period of 22 years using Landsat satellite images of 1994, 2000, and 2016, and predicts the LULCC for 2016-2032 using the CA Markov model in Udham Singh Nagar district, Uttarakhand. Satellite images from the Landsat 5 TM, Landsat 7 ETM+, and Landsat 8 OLI sensors, with a nominal spatial resolution of 30 m, were used. Supervised image classification with the parallelepiped algorithm was applied. The Cellular Automata Markov model was used to predict the future LULC of 2032 (16 years ahead). The estimation uses two modules, MARKOV and CA-MARKOV, to predict the future land use pattern of the study area. The accuracy of the classification results was assessed by error matrix calculation. The overall change detection indicates that agriculture, forest, water body, and fallow land decreased by 121.75 km² (14%), 44.70 km² (5%), 38.91 km² (4.5%), and 230.71 km² (26.5%), respectively, while settlement and river sand increased by 379.89 km² (44%) and 56.18 km² (6%). The study has an overall classification accuracy of 76.84% and a standard kappa coefficient (K) of 0.722. The model predicts future shares of 32% for agriculture, 38% for forest, 5% for fallow land, 20% for settlement, 3% for water body, and 2% for river sand. The approach is effective for future LULC prediction, which is helpful in urban development planning and in the management of natural resources.

Landsat TM data in Twin cities metropolitan area. Rawat and Manishkumar (2015) studied the LULCC that has taken place in Lagos over the last two decades due to rapid urbanization.
Accuracy assessment is an important final step in the processing of remote sensing data. Accuracy determines the information value of the resulting data to a user. Many recent investigations that apply accuracy assessment use kappa coefficient (K) based indices and overall accuracy as an indication of the validity of the classification algorithm. However, recent developments in accuracy assessment methodology have pointed out the limitations of the kappa indices (Pontius and Millones 2011). Numerous authors have reviewed classification accuracy assessment (Congalton 1991; Janssen and van der Wel 1994). The kappa coefficient is a measure of overall accuracy derived from the error matrix that also uses the information in the non-diagonal elements. Kappa analysis is established as a powerful method for analyzing a single error matrix and comparing the differences

Accuracy Assessment Estimator and Kappa coefficient
Accuracy assessment is the last and one of the most important stages in the image classification process (Foody 2002). It quantitatively assesses how effectively the pixels were assigned to the correct land cover classes. Among the different accuracy assessment models, the error matrix (confusion matrix) has become the most widely used method for deriving the accuracy of classifications obtained from remotely sensed data (Congalton 1991; Congalton and Mead 1983; Sanjoy Roy et al., 2015). The accuracy assessment methods included the standard kappa coefficient, overall accuracy, producer's accuracy, and user's accuracy. The overall accuracy is the proportion of pixels classified correctly in the image. The image was classified into six classes: settlement, water body, forest, fallow land, river sand, and agriculture (Table 3). The user's accuracy measures how often the class on the map is actually present on the ground. The producer's accuracy measures how often pixels that belong to a class on the ground are correctly assigned to that class. An extensive field survey was performed and Landsat OLI images were used to collect ground truth (validation) data for 2016. For the accuracy assessment, a total of 285 ground truth points were generated over the study area using the stratified random sampling method for the 2016 LULC image. Table 3 shows the relationship between the ground truth data and the corresponding classified data obtained through the confusion matrix (Pontius and Millones 2011). The kappa statistic for the stratified random sample was calculated using the following equation (Petropoulos et al. 2015):
$$K = \frac{(T \times C) - G}{T^{2} - G} \qquad (1)$$
where T is the total number of test pixels, C is the number of correctly classified pixels, and G is the sum of the products of the row and column totals of the error matrix.
The overall accuracy (total accuracy) is computed by dividing the sum of the values along the major diagonal by the total number of reference pixels. The traditional accuracy assessment measures include the standard kappa coefficient, followed by overall accuracy (eq. 2), producer's accuracy (eq. 3), and user's accuracy (eq. 4).
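The four measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' workflow: the function name `accuracy_metrics`, the row/column orientation (rows are classified classes, columns are reference classes), and the 3-class error matrix are all assumptions made for the example, and the kappa expression is the standard form written with the symbols T, C, and G.

```python
import numpy as np

def accuracy_metrics(matrix):
    """Return overall, producer's and user's accuracy plus kappa.

    Assumption: rows are classified (map) classes, columns are reference classes.
    """
    m = np.asarray(matrix, dtype=float)
    total = m.sum()                  # T: all test pixels
    correct = np.trace(m)            # C: correctly classified pixels (diagonal)
    row_totals = m.sum(axis=1)       # pixels mapped to each class
    col_totals = m.sum(axis=0)       # reference pixels in each class
    chance = (row_totals * col_totals).sum()   # G: sum of row total x column total
    return {
        "overall": correct / total,
        "producers": np.diag(m) / col_totals,
        "users": np.diag(m) / row_totals,
        "kappa": (total * correct - chance) / (total ** 2 - chance),
    }

# Hypothetical 3-class error matrix, for illustration only
example = [[45, 4, 1],
           [6, 38, 6],
           [2, 3, 45]]
result = accuracy_metrics(example)
print(result)
```

For this hypothetical matrix, 128 of 150 pixels lie on the diagonal, so the overall accuracy is about 0.853 and kappa is 0.78.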

Cellular Automata (CA) model
The prediction model used in the present study is the cellular automata (CA) model. It is broadly applied to the simulation and monitoring of complex systems, for instance urban growth modelling and ecological modelling. The prediction is expressed as
$$S(t, t+1) = P_{ij} \times S(t) \qquad (5)$$
where S(t) is the system status at time t, S(t, t+1) is the system status at time t + 1, and P_ij is the transition probability matrix.
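The relationship in equation (5) between the system status and the transition probability matrix can be illustrated with a short Python sketch. It assumes the status is a row vector of class areas that is multiplied by a row-stochastic matrix; the function name `project`, the three classes, and all numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def project(state, transition, steps=1):
    """Apply the Markov step repeatedly: next status = current status x P."""
    p = np.asarray(transition, dtype=float)
    if not np.allclose(p.sum(axis=1), 1.0):
        raise ValueError("each row of the transition matrix must sum to 1")
    s = np.asarray(state, dtype=float)
    for _ in range(steps):
        s = s @ p          # row vector of class areas times transition matrix
    return s

# Hypothetical areas (km2) for three classes and a hypothetical transition matrix
areas = [500.0, 300.0, 200.0]
P = [[0.90, 0.02, 0.08],
     [0.05, 0.90, 0.05],
     [0.00, 0.00, 1.00]]
projected = project(areas, P)
print(projected)
```

Because every row of the matrix sums to 1, the total area is conserved from one step to the next; only its distribution among the classes changes.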
The error matrix makes it possible to derive a range of accuracy metrics from the data. In this study, thematic accuracy was measured using overall accuracy and error. For Markov chain analysis, the imagery is separated into two time periods: the base imagery and a second image on which the prediction is based. The first-order Markov model (Usher 1992) assumes that the state of the system at time t+1 can be predicted from its state at time t. The core of the Markov model is the transition matrix P, which summarizes the probability that a cell of cover type i changes to cover type j during a time step. The equations for the Markov model are given below (eq. 6, 7, 8 & 9).
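One common way to obtain such a transition matrix is to cross-tabulate the two classified maps cell by cell and normalise each row. The Python sketch below illustrates that idea under stated assumptions: classes are coded as integers 0..n-1, both maps share the same grid, and the function name `transition_matrix` and the tiny 3 x 3 maps are hypothetical.

```python
import numpy as np

def transition_matrix(map_t0, map_t1, n_classes):
    """Estimate P[i, j]: share of cells in class i at t0 that are class j at t1."""
    a = np.asarray(map_t0).ravel()
    b = np.asarray(map_t1).ravel()
    counts = np.zeros((n_classes, n_classes), dtype=float)
    np.add.at(counts, (a, b), 1)               # cross-tabulate the two maps
    row_sums = counts.sum(axis=1, keepdims=True)
    # Classes absent from the earlier map keep an all-zero row
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical classified maps for two dates (class codes 0, 1, 2)
earlier = [[0, 0, 1],
           [0, 1, 1],
           [2, 2, 2]]
later = [[0, 2, 1],
         [0, 1, 2],
         [2, 2, 2]]
P_est = transition_matrix(earlier, later, n_classes=3)
print(P_est)
```

In this toy case two of the three class-0 cells stay in class 0 and one changes to class 2, so the first row is (2/3, 0, 1/3).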

Classification of imagery (1994, 2000 & 2016)
The imagery was classified by supervised image classification using a parametric rule and the parallelepiped algorithm, as it was the best-suited algorithm for this study. The total areas of the land-use pattern in Udham Singh Nagar district for the year 1994 are presented in figure 3. The agricultural land category includes land under crops, fallow, plantations, and aquaculture. Agriculture covered 1019.14 km² (39%), the largest share of the area in 1994. Multi-storeyed and deciduous forests, with a number of mature trees in the upper canopy, and less dense forests that shed their leaves during the dry season are present in this area. Forest covered 806.34 km² (31%). Fallow land is cultivated land that is not sown for one or more growing seasons. Fallow land covered 417.08 km² (16%), as shown in figure 7. Of the remaining feature classes, settlement covered 196.89 km² (8%); it includes urban and rural settlements, transportation, communication, and recreational utilities. The water bodies group comprises areas with surface water in the form of ponds, lakes, drains, canals, etc. The total area covered by water bodies in 1994 was 106.63 km² (4%). River sand is a natural material made up of sandy soils with some portion of silt, clay, and organics. River sand covered 40.31 km² (2%), as shown in Table 2.
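The parallelepiped decision rule can be sketched in Python as follows. The study does not state how the class boxes were defined, so this example assumes one common choice: a box of mean plus or minus k standard deviations per band, built from training samples, with overlapping boxes resolved by first match. The function names, the two-band training values, and the class codes are hypothetical.

```python
import numpy as np

def train_boxes(samples_by_class, k=2.0):
    """Per class, build a box of mean +/- k standard deviations in every band."""
    boxes = {}
    for label, samples in samples_by_class.items():
        x = np.asarray(samples, dtype=float)   # shape (n_samples, n_bands)
        mean, std = x.mean(axis=0), x.std(axis=0)
        boxes[label] = (mean - k * std, mean + k * std)
    return boxes

def classify(pixels, boxes, unclassified=-1):
    """Assign each pixel to the first class whose box contains it in all bands."""
    x = np.asarray(pixels, dtype=float)
    out = np.full(x.shape[0], unclassified, dtype=int)
    for label, (low, high) in boxes.items():
        inside = np.all((x >= low) & (x <= high), axis=1)
        out[inside & (out == unclassified)] = label
    return out

# Hypothetical two-band training samples: class 0 and class 1
training = {
    0: [[10, 5], [12, 6], [11, 4]],
    1: [[40, 80], [42, 85], [38, 78]],
}
boxes = train_boxes(training)
classified = classify([[11, 5], [41, 81], [200, 200]], boxes)
print(classified)
```

Pixels that fall outside every box stay unclassified (code -1 here), which is a known property of this decision rule.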

Overall Change detection and Kappa coefficient
In the overall change detection, agricultural land decreased by 121.75 km² (14%), forest area declined by 44.70 km² (5%), fallow land reduced by 230.71 km² (26.5%), and water body decreased by 38.91 km² (4.5%), while settlement increased by 379.89 km² (44%) and river sand increased by 56.18 km² (6%) over the period of 22 years (Table 2). Table 4 shows the relationship between the ground truth (validation) data and the corresponding classified data obtained through error matrix analysis. The accuracy assessment shows an overall accuracy of 76.84% for the 2016 image, obtained from the random sampling process. User's accuracy ranged from 71.42% to 83.05%, while producer's accuracy ranged from 75% to 77.7%. Producer's accuracy reveals how accurately a particular class was predicted. User's accuracy reveals the reliability of the class to the user.
It is the more relevant measure of the classification's actual utility in the field. Fallow land was found to be more reliable, with 77.08% user's accuracy. The study has an overall classification accuracy of 76.84%, and the kappa coefficient (k) was found to be 0.7203.
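The reported change figures can be checked for internal consistency with a few lines of Python. The sketch below uses the values from this study, taking the water body loss of 38.91 km² as stated in the abstract. It assumes that the percentages express each class's share of the total absolute change, which is an interpretation rather than something the text states.

```python
# Net change per class, 1994 to 2016, in km2 (negative = loss)
changes_km2 = {
    "agriculture": -121.75,
    "forest": -44.70,
    "fallow land": -230.71,
    "water body": -38.91,     # value as given in the abstract
    "settlement": 379.89,
    "river sand": 56.18,
}

losses = -sum(v for v in changes_km2.values() if v < 0)
gains = sum(v for v in changes_km2.values() if v > 0)
total_change = losses + gains

# Assumed definition: share of each class in the total absolute change
shares = {name: round(100 * abs(v) / total_change, 1)
          for name, v in changes_km2.items()}
print(round(losses, 2), round(gains, 2), shares)
```

Losses and gains both sum to 436.07 km², as expected for a fixed study area, and the computed shares round to the percentages quoted in the text.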

Future change detection for 2032
The change detection pattern for the next 16 years (to 2032), in both area and percentage, is presented in table 3. The results show that both increases and decreases occur in the land use/cover pattern of the study area. During the last three decades, agriculture, forest, fallow land, and water body in the study area have decreased, while settlement and river sand have increased. The analysis reveals that agriculture declines by 847.78 km² (33%), forest increases by 1021.78 km² (39%), and fallow land decreases by 22.08 km² (1%), while settlement by 533.87 km² (21%), water body by 55.93 km² (4%), and river sand by 55.93 km² (2%) increase in the future change detection, as shown in figure 6. The study aims to understand the future changes for the different LULC categories using the different Landsat datasets.
The error matrix was used to assess the accuracy of the classified map of 1994. Landsat 5 TM imagery of 1994 was used for accuracy assessment, and 285 stratified random points across the classes were created (Table 4). The LULC maps produced from satellite images for the study area consist of six thematic land cover classes. For accuracy assessment, a total of 285 reference sites were used to validate the land-cover types. These reference sites comprised agriculture (63), forest (55), fallow land (48), settlement (39), water body (44), and river sand (36). The reference sites were subsequently compared to the classified results created from the satellite images.
The diagonal elements of the error matrix represent areas that were correctly classified and are indicative of the classification accuracy. In this study, out of the 63 agriculture reference sites, only 49 were correctly identified in the classified imagery. Similarly, 42 of 55 forest reference sites, 37 of 48 fallow land reference sites, 30 of 39 settlement reference sites, 33 of 44 water body reference sites, and 28 of 36 river sand reference sites were correctly identified (Table 4). The off-diagonal elements indicate how the remote sensing classification can be improved, and time has to be spent examining these errors to determine where most errors occurred in the classification.
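The counts above are enough to reproduce the per-class and overall accuracy figures. The following Python sketch uses only the reference-site and correctly-identified counts reported in this section; the variable names are illustrative, and the off-diagonal cells of the error matrix are not needed for these ratios.

```python
# Reference sites per class and how many were correctly identified (this study)
reference = {"agriculture": 63, "forest": 55, "fallow land": 48,
             "settlement": 39, "water body": 44, "river sand": 36}
correct = {"agriculture": 49, "forest": 42, "fallow land": 37,
           "settlement": 30, "water body": 33, "river sand": 28}

# Share of reference sites correctly identified, per class (percent)
per_class = {name: round(100 * correct[name] / reference[name], 2)
             for name in reference}

# Overall accuracy: all correct sites over all reference sites (percent)
overall = round(100 * sum(correct.values()) / sum(reference.values()), 2)
print(per_class, overall)
```

The 219 correct sites out of 285 give the reported overall accuracy of 76.84%. The per-class ratios run from 75.0% (water body) to 77.78% (agriculture and river sand), which agrees with the range quoted for producer's accuracy.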

Conclusion
The study concludes that settlement will grow at a rapid rate in the near future. Migration and rapid industrialisation are acting as major factors in the development of the district. Forest cover will increase in the future, but the area of agricultural land will decrease compared to other areas. The predicted map shows an overall accuracy of approximately 77% with the help of the CA MARKOV model, using the Markov and CA Markov modules.
The classification results clearly demonstrate that satellite images are very useful for extracting LULC for change detection at the level of classification considered. It is worth stating that spatial resolution (and spectral resolution as well) has a great impact on any kind of remote sensing (RS) and GIS application. This impact of resolution was also evident in this study. The classification accuracy was related to the resolution of the image. A good accuracy level was also achieved during classification: 88% for the 1994 imagery, 84% for 2000, and 86% for 2016.
Once the classification was done, the accuracy of each classified image was checked to establish how good the classification was. Change detection was then applied to observe the changes between the three images.
Udham Singh Nagar district was chosen as a study area to monitor land use/land cover dynamics over a period of 22 years. For 1994 to 2016, the study area was divided into six major categories: settlement, forest, agriculture, river sand, water body, and fallow land. The CA Markov chain method is very effective for future LULC prediction, which is helpful in the field of natural resource management.