Study Area
The study was conducted at two sites in Maharashtra, India: Nanded district (Site 1) and the Purna region of Parbhani district (Site 2), as depicted in Fig. 1. Both sites exhibit diverse land covers, which can be broadly grouped into four major classes: vegetation, bare land, built-up area, and water.
Satellite Data
The image processing for this research was conducted using satellite imagery from the European Space Agency (ESA) Sentinel-2 satellite. Sentinel-2 carries 13 multispectral bands: four at a maximum spatial resolution of 10 meters, with the remaining bands at 20 or 60 meters (Drusch et al., 2012). In this study, all bands were resampled to a spatial resolution of 10 meters using bilinear interpolation.
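The resampling itself was performed in ArcMap, but the underlying bilinear interpolation can be sketched as follows; the grid values and sizes here are illustrative, not taken from the study imagery.

```python
# Minimal sketch of bilinear resampling: upsampling a coarse band
# (e.g. a 20 m band) onto a finer grid (e.g. 10 m). Real workflows
# would use a GIS or raster library; values here are illustrative.

def bilinear_resample(band, out_rows, out_cols):
    """Resample a 2D list `band` to (out_rows, out_cols) by bilinear interpolation."""
    in_rows, in_cols = len(band), len(band[0])
    out = []
    for r in range(out_rows):
        # Map the output pixel back to fractional input coordinates.
        y = r * (in_rows - 1) / (out_rows - 1) if out_rows > 1 else 0.0
        y0 = min(int(y), in_rows - 2)
        fy = y - y0
        row = []
        for c in range(out_cols):
            x = c * (in_cols - 1) / (out_cols - 1) if out_cols > 1 else 0.0
            x0 = min(int(x), in_cols - 2)
            fx = x - x0
            # Weighted average of the four surrounding input pixels.
            v = (band[y0][x0] * (1 - fy) * (1 - fx)
                 + band[y0][x0 + 1] * (1 - fy) * fx
                 + band[y0 + 1][x0] * fy * (1 - fx)
                 + band[y0 + 1][x0 + 1] * fy * fx)
            row.append(v)
        out.append(row)
    return out

coarse = [[0.0, 2.0], [2.0, 4.0]]        # a 2x2 "coarse" band
fine = bilinear_resample(coarse, 3, 3)   # upsampled "fine" grid
print(fine[1][1])                        # centre pixel -> 2.0
```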
The band arithmetic formulations proposed in this work are based on the spectral reflectance curve. To derive the spectral reflectance curve, Landsat-8 imagery from January 12, 2018 was used.
Land Cover Reference Data
Two methods were employed to obtain the ground truth information: site visits and visual interpretation of historical images from Google Earth. The use of Google Earth historical images for reference data extraction has been widely applied in previous studies (Kim et al., 2018; Yu et al., 2012; Magidi, 2021). Four land cover classes were considered for classification, namely Vegetation, Bare Land, Water, and Built-Up Area. For Site 1, about 70% of the collected data was used for training, and pixels from the remaining data were used for testing. Data from Site 2 was used only for testing and was not involved in training, as presented in Table 1.
Table 1: Details of training and testing data
| Class No. | Class Name    | Training Area in m² (Site 1) | Test Pixel Count (Site 1) | Test Pixel Count (Site 2) |
| 1         | Vegetation    | 856800.44                    | 50                        | 50                        |
| 2         | Bare Land     | 69120.31                     | 50                        | 50                        |
| 3         | Water         | 1406201.67                   | 50                        | 50                        |
| 4         | Built-Up Area | 168915.79                    | 50                        | 50                        |
|           | Total         | 2501038.2                    | 200                       | 200                       |
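The roughly 70/30 train/test split described above for Site 1 can be sketched as follows; the sample identifiers and the random seed are placeholders, since the study's actual sampling units were field-visit and Google Earth polygons/pixels.

```python
# Hedged sketch of a ~70/30 random split of reference samples.
import random

samples = list(range(100))     # placeholder sample identifiers
random.seed(42)                # fixed seed so the split is reproducible
random.shuffle(samples)

split = int(0.7 * len(samples))
train_ids, test_ids = samples[:split], samples[split:]
print(len(train_ids), len(test_ids))   # 70 30
```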
Data Processing
The Sentinel-2 satellite imagery is obtained from the European Space Agency (ESA), while the Landsat-8 imagery is obtained from the United States Geological Survey (USGS). Part of the reference data is collected from Google Earth and processed with Elshayal Smart GIS to create high-resolution, geo-referenced imagery (Malarvizhi et al., 2016). This imagery is used to generate the shapefiles subsequently employed for training.
ESRI ArcMap 10.7 is utilized for the image processing tasks, including image registration, filtering, resampling, classification, and creation of index-derived images. Data analysis, accuracy assessment, and the spectral reflectance curves are produced using Microsoft Excel.
Spectral Reflectance Curve
The process outlined in Fig. 2, as described by Kale (2021), is used to derive spectral reflectance curves from the Landsat-8 imagery. The resulting curves from the imagery captured on January 12, 2018 are shown in Fig. 3, and the data analysis and interpretation are based on these curves.
Novel Class Exclusion and Reassembling Technique
In image classification using the band arithmetic technique, not all classes can be accurately classified with a single arithmetic combination. A particular formula may improve the separability of one or more classes while performing poorly for others, thereby reducing the overall accuracy of the classification. To address this, the classes are classified individually, one by one: a band arithmetic formulation that works well for a particular class is applied, and the pixels of that class are then excluded from further classification. This process is repeated for the remaining classes, using different strategies or formulations as needed. Careful attention must be paid to accuracy at each stage, since errors in an excluded class carry through to the final result. Once all classes have been classified individually, the results are reassembled into a final classified image, as shown in Fig. 5.
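The staged exclusion idea can be sketched as follows; the pixel values, threshold rules, and class order here are invented placeholders, not the paper's actual indices or thresholds.

```python
# Sketch of class-by-class exclusion: each stage labels one class with
# its own rule, removes those pixels from further processing, and the
# per-stage results are reassembled at the end.

def staged_classify(pixels, stages, default):
    """pixels: dict pixel_id -> index value (placeholder feature).
    stages: ordered list of (class_name, rule) pairs.
    default: class assigned to whatever remains after the last stage."""
    labels = {}
    remaining = dict(pixels)
    for class_name, rule in stages:
        hit = [pid for pid, v in remaining.items() if rule(v)]
        for pid in hit:
            labels[pid] = class_name   # preserve this stage's result
            del remaining[pid]         # exclude from later stages
    for pid in remaining:              # leftover pixels get the final class
        labels[pid] = default
    return labels

# Toy example: one "index" value per pixel; thresholds are invented.
pixels = {"p1": 0.9, "p2": 0.5, "p3": 0.1, "p4": 0.05}
stages = [("Water", lambda v: v > 0.8),        # Stage 1
          ("Vegetation", lambda v: v > 0.4)]   # Stage 2
result = staged_classify(pixels, stages, default="Built-Up/Bare")
print(result["p1"], result["p2"])   # Water Vegetation
```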
Stage 1: Identification of Water
As per Kale (2021), water is the easiest class to classify, so the classification strategy begins by identifying and excluding the Water class. The WaterI1 index (Eq. 1) is applied to the satellite imagery dated 22nd January 2018 to create a formulated raster image.
After classifying the image for all four classes, the accuracy of the Water class was assessed. Water was classified with 100% accuracy for Site 1; for Site 2, accuracy was not checked at this stage, as it is evaluated only at the end. The parts of the satellite images classified as Water were then excluded from further processing for both sites.
Stage 2: Identification of Vegetation
Since Water has been classified with 100% accuracy, the focus shifts to the remaining three classes, which can be broadly grouped into vegetation and non-vegetation. The classification strategy is therefore aimed at accurately identifying the Vegetation class, for which the following indices have been implemented.
The raster images obtained by applying NDVI, Index1, and Index2 are combined into a composite image. After classification, the Vegetation class is excluded from the imagery of both sites, and the remainder is carried forward to Stage 3.
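Of the three indices, NDVI is the standard normalized difference vegetation index; for Sentinel-2 it is computed from the NIR (B8) and red (B4) bands as (NIR − Red)/(NIR + Red). A minimal sketch, with illustrative reflectance values rather than values from the study imagery:

```python
# NDVI per pixel from NIR and red reflectances. Dense vegetation
# reflects strongly in NIR and absorbs red, giving values near 1.

def ndvi(nir, red):
    denom = nir + red
    return (nir - red) / denom if denom else 0.0   # guard divide-by-zero

print(round(ndvi(0.45, 0.05), 2))   # dense vegetation -> 0.8
print(round(ndvi(0.20, 0.18), 2))   # sparse cover -> 0.05
```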
Stage 3: Identification of Built-Up Area and Bare Land
After excluding the Vegetation class, the image is left with only two classes: Bare Land and Built-Up Area. To separate these two classes, the following indices have been designed:
The composite image is obtained by combining three raster images: the coastal aerosol (Band 1) raster and the formulated rasters of Index4 and Index5. It is then classified to distinguish the remaining two classes, Bare Land and Built-Up Area.
The exclusion of one class is carried out at the end of each stage: the excluded part is converted to a shapefile and preserved for reassembly, while the remainder is also converted to a shapefile and used to clip the imagery (excluding the class) for the next stage. At the end of Stage 3, the preserved shapefiles representing the individual classes are reassembled into a single shapefile covering all four classes, which is then converted to a raster image, the final classified image. The same process is followed for Site 2, except that the accuracy analysis is carried out only at the end rather than at every stage, since Site 2 is a test site. The classified images of both sites are shown in Fig. 5.
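At the array level, the final reassembly can be sketched as follows; the class codes and pixel masks are illustrative, since the study performs this step on shapefiles in ArcMap rather than on raw arrays.

```python
# Merge per-stage class masks back into one classified raster.
# 0 marks pixels not yet assigned; earlier stages take precedence.

CODES = {"Water": 1, "Vegetation": 2, "Bare Land": 3, "Built-Up Area": 4}

def reassemble(shape, stage_masks):
    """stage_masks: list of (class_name, set of (row, col)) in stage order."""
    rows, cols = shape
    out = [[0] * cols for _ in range(rows)]   # 0 = unclassified
    for class_name, mask in stage_masks:
        code = CODES[class_name]
        for r, c in mask:
            if out[r][c] == 0:                # keep earlier stage's label
                out[r][c] = code
    return out

masks = [("Water", {(0, 0)}),
         ("Vegetation", {(0, 1), (1, 0)}),
         ("Built-Up Area", {(1, 1)})]
print(reassemble((2, 2), masks))   # [[1, 2], [2, 4]]
```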
A Random Forest (RF) supervised classifier is used for the classification. The model is trained on the data collected from Site 1 and applied to the images of both sites, with accuracy assessment carried out using the individual test data from each site. The complete workflow is systematically presented in the block schematic in Fig. 4.
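A minimal scikit-learn sketch of the train-on-Site-1, test-on-both-sites workflow; the feature vectors and class labels below are synthetic stand-ins for the band/index values and the four land cover classes, not the study's data.

```python
# Hedged Random Forest sketch: train on "Site 1" samples, predict
# on unseen "Site 2" pixels. Features and labels are synthetic.
from sklearn.ensemble import RandomForestClassifier

# Synthetic training data: [feature1, feature2] per sample,
# labels 1..4 standing in for the four land cover classes.
X_train = [[0.1, 0.8], [0.2, 0.7], [0.8, 0.1], [0.9, 0.2],
           [0.5, 0.5], [0.4, 0.6], [0.7, 0.3], [0.6, 0.4]]
y_train = [1, 1, 4, 4, 2, 2, 3, 3]

rf = RandomForestClassifier(n_estimators=50, random_state=0)
rf.fit(X_train, y_train)                 # train on "Site 1" samples only

X_test = [[0.15, 0.75], [0.85, 0.15]]    # unseen "Site 2" pixels
print(list(rf.predict(X_test)))
```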