Unsupervised learning framework for region-based damage assessment on xBD, a large-scale satellite imagery dataset

Population-based damage assessment is crucial for providing timely aid after natural hazards such as floods, tsunamis, and hurricanes. One of the quickest ways to perform this task is to employ remote sensing data and observe the damage. Most previous works have restricted themselves to damage detection and classification at the level of individual buildings. However, no framework exists that draws a conclusion about the overall damage done to an area affected by the hazard. In this work, we present an unsupervised density-based clustering algorithm that automatically forms spatial groups of affected regions and assigns each region a label based on its degree of damage. The algorithm automatically selects the optimum number of clusters based on the spatial distribution of the data and works well with any shape of hazard-affected region. A demographic estimate for the affected region is then derived from the area of the region and census data. Navigation information is provided with the aid of Google Maps, depicting the overall damage along with transportation feasibility. For evaluation of the framework, we employ xBD, the largest annotated building damage classification data set to date. The results correctly identify the regions and perform extremely well on the silhouette score and the DB index.


Introduction
The impact of disasters such as earthquakes, floods, and fires can be quantified using remote sensing images. The basic methodology for this task is to detect changes in building structures before and after the disaster and then classify the building damage according to a predefined scale (Bouchard et al. 2022). Several recent works have shown remarkable precision in determining the degree of damage to buildings (Li et al. 2022; Gilani et al. 2018).
Many databases relevant to building damage detection with ground truth have been released recently (Gupta et al. (2019); Weber et al. (2022)). However, there is no framework that classifies regions or geographic areas based on the scale of damage. This is the crucial missing step in providing field assistance according to the severity of the damage. A simple and practical example illustrates this point: in an observed region, there may be a single structurally weak building that gets wrecked in the disaster, while the others suffer at worst minor damage. This might lead to confusion when observing remote sensing images and deciding on human aid. By analyzing satellite images visually, the size of affected areas and the level of damage can often be determined. Accordingly, using census data and other government data, the required resources such as human aid, medicine supplies, and rescue services can be arranged (Hu et al. 2012).
Therefore, it is highly desirable to have an index of damage for a location that provides clues about the density of damaged buildings in that area (Theilen-Willige and Wenzel 2019). We accomplish this task by clustering damaged buildings according to their proximity to each other and using the clusters to determine the degree of damage done. The clustering algorithm proposed by us is novel in combining both proximity and damage level. We present the criteria for automatically selecting the right number of clusters from the satellite images and labeling each cluster with a level of damage. We design density-based clustering algorithms to accomplish this task and compare them to the standard clustering algorithm, K-Means.
An effective response to natural disasters requires a precise understanding of the damaged structures in the affected area and the population impacted. Current response methods necessitate conducting damage evaluations within 24-48 h after the disaster (Karki et al. 2022). In this paper, we demonstrate the application of our work in combining raw maps of damaged locations with other sources of demographic information to ease the task of decision making when providing aid. During disasters, a priority-based service is needed that takes into account the number of affected people, especially children and the elderly. We use a data analytics step to combine demographic information with location.
We also demonstrate another application of our work in providing navigation information to reach disaster-affected areas. Ertugay et al. (2016) present the difficulties of accessing the disaster region during earthquakes and other disasters due to possible road closures. Navigation and visualization are done by integrating satellite images with Google Maps. Not all routes to a location remain available in a disaster such as a flood, volcano, or earthquake, as some roads might be blocked by flood water or building debris. Satellite images can be used to identify such blockages and generate a path to reach the disaster-affected areas.
For experimental purposes, we use the xBD database given in Gupta et al. (2019). The full xBD data set comprises satellite images from 19 different natural hazards spanning 22,068 images, with a total of 850,736 building polygons. The image database encompasses an area of 45,361.79 km². It includes pre-event and post-event satellite imagery for a variety of disaster events, along with building polygons, damage level labels, and the relevant satellite metadata.
The paper is organized as follows. Section 2 gives a literature survey, while Sect. 3 details the experimental database. Section 4 presents our framework. Unsupervised learning framework is presented along with the results in Sect. 5. Section 6 presents a discussion on demographic estimation, while Sect. 7 is about navigation. Finally, the conclusions are presented in Sect. 8.

Literature review
Damage assessment has been divided into two distinct tasks: detecting buildings and classifying damage (Huang and Zhang 2012). Building classification in satellite images also has various uses, such as damage assessment, distributing resources, and determining the population size. The method used by Dimassi et al. (2021) relies exclusively on colored satellite images and employs a two-stage deep learning approach. The first stage utilizes a semantic segmentation model to extract building footprints. Then, the second stage classifies the cropped images. They also perform residential/non-residential building classification. Hamaguchi and Hikosaka (2018) use the information of size of building as well as context information such as roads to improve the accuracy of calculating the footprints of the buildings. Liu et al. (2019) use multiple view satellite imagery for the extraction of the built-up area. Their system is built with physical rule-based methods combining multiple clues for buildings.
The availability of the xBD data set has led to increased research in the field. Shao et al. (2020) examined the usage of pre-and post-disaster images, along with various loss functions, to perform building segmentation. Gupta and Shah (2021a), on the other hand, employs per-pixel classification models with multitemporal fusion. Weber and Kané (2020) processed the pre-and post-disaster images separately by passing them through a CNN with common weights. The features were then combined before semantic segmentation in the final layer. Hao et al. (2020) incorporated a self-attention mechanism to assist the model in capturing long-range information. Shen et al. (2020) evaluated the complex combination of pre-and post-disaster feature maps, introducing a cross-directional fusion strategy to analyze the relationships between pre-and post-disaster images. Boin et al. (2020) proposed to address the challenge posed by the strong class imbalance problem in the xBD data set. They proposed a new approach that increases the number of samples in the minority classes and includes three additional network architectural enhancements to improve the network performance. Pi et al. (2020) applied drone images and videos as input to a sequence of convolutional neural network (CNN) models to detect ground objects from aerial views of the disaster. Xiong et al. (2020) utilized a three-dimensional (3D) building model, aerial images, and drone camera data as input to a CNN. After that, a building image segmentation approach was proposed using the 3D building model as a georeference, allowing the acquisition of multiview segmented building images. Rudner et al. (2018) combine multi-resolution, multi-sensor, and multi-temporal satellite images in a CNN to identify building footprints. Li et al. (2019) use tweets with images posted by first-hand observers of a disaster to locate building damage in a disaster image. Duarte et al. 
(2018) suggested a CNN framework that employs residual connections and dilated convolutions, applied to manned and unmanned aerial image samples, to classify building damage in satellite images. Weber et al. (2022) introduced a huge labeled data set of hazard images from natural disasters. It comprises 977,088 images and does not assume at most one incident and place category per image. Instead, it includes multiple labels per image, such as building collapsed and building on fire. However, xBD offers a distinct advantage to research in annotating the buildings with polygon boundaries, thus making it the largest available annotated database of disasters.
In other recent work, hazards have been detected and assessed using different techniques. León-Cruz and Castillo-Aja (2022) assessed the hazard of tornado disasters by combining historical tornado reports with potentially severe convective environments determined from the ERA5 data set. They addressed tornado vulnerability by creating socioeconomic indicators and applying a multivariate statistical method. Bahadori et al. (2022) proposed CrowdBIG, a crowd-sourcing-based architecture for acquiring information from the disaster environment, arguing for the pervasiveness of mobile phones, which have a camera, processor, and communicator. Their goal is to quickly obtain the necessary information on the damage caused by the earthquake. Tehrani et al. (2022) applied machine learning algorithms for landslide detection, such as detecting landslide debris and destroyed tracks. Gupta and Shah (2021b) proposed RescueNet, a model that segments buildings and also assesses the damage levels of individual buildings. They use a pixel-level segmentation-based approach, with multi-scale, temporal (capturing before and after damage) features to detect damage in the segmented buildings. However, building localization and damage assessment are left as future work in the paper. Bai et al. (2020) proposed a concurrent learned attention network that can be trained end-to-end to perform joint localization of buildings and classification of damage. They adopt a semi-Siamese strategy, enabling collective learning based on pixel-level segmentation along with the utilization of residual blocks (RBs) featuring dilated convolution and squeeze-and-excitation (SE) blocks. Wang et al. (2022) presented a two-step solution with building localization and damage classification to handle the extremely imbalanced distributions of building damage. A set of images from three different disaster events, namely, "Mexico-earthquake", "Midwest-flooding", and "Palu-tsunami", is utilized.
Subsequently, the model is tested on a separate dataset comprising randomly selected images from the aforementioned historical events. During the testing phase, various statistics regarding the model's prediction performance are recorded for each event. Additionally, relevant factors such as building density and building damage distribution are taken into account to quantify the model's performance during testing. Qiang and Xu (2020) proposed a conceptual framework to assess the resilience of a road network from the accumulated accessibility reduction during a hazard. The utility of this approach is demonstrated in a case study of a winter storm. Ishmam et al. (2023) detect rice field damage from natural disasters in Bangladesh using high-resolution satellite imagery. NDVI differences before and after the disaster are calculated to identify possible crop loss. Areas at or above the 0.33 threshold are marked as crop loss areas, as significant changes are observed there. However, the work cannot be extended to other natural disasters affecting human populations.
Remote sensing of artificial night lights provides an indicator of human activity, and nighttime lights (NTL) are correlated with population density and with economic activity. The study by Levin (2023) confirmed the dimming of nighttime illumination captured by the VIIRS sensor by cross-referencing it with data from the new SDGSAT-1 Glimmer multispectral nighttime sensor. Furthermore, the researchers compared the changes in nighttime lights with reports on damaged buildings. The swift and accurate mapping of affected regions from space using nighttime lights plays a crucial role in effectively prioritizing and guiding emergency and rescue operations worldwide.
The framework by Rao et al. (2023) integrates high-resolution building inventory data with earthquake ground shaking intensity maps and surface-level changes detected by comparing pre- and post-event InSAR (interferometric synthetic aperture radar) images. They attempt both binary and multi-class damage classification for four recent earthquakes using ensemble models in a machine-learning approach.
The literature review reveals several key points regarding the detection and assessment of building damage using various imaging technologies. The following conclusions can be drawn:
• Progress in detecting footprints of buildings: The literature suggests that significant advancements have been made in detecting building footprints using satellite imagery, drone imagery, and other remote sensing techniques. These technologies have proven effective in identifying and mapping built structures in different geographical areas.
• Lack of a comprehensive framework for region-based assessment: Despite the progress made in detecting building footprints, there is a gap in the existing frameworks when it comes to region-based assessment of building damage. Most approaches focus on identifying the presence of buildings but do not provide a comprehensive assessment of the severity of damage in a given area.
• Importance of considering severity of building damage: The literature highlights the need for a framework that takes into account the severity of building damage. Simply detecting the presence of buildings is not sufficient; understanding the extent and severity of damage is crucial for effective disaster response and recovery efforts.
• Comparative effectiveness of different approaches: While approaches like landslide detection or flood mapping provide valuable information about the spread of a disaster, assessing damage to man-made structures is of utmost importance. Buildings and infrastructure are critical assets, and evaluating their condition can aid in prioritizing response efforts and allocating resources more efficiently.
Overall, the literature review emphasizes the progress made in detecting building footprints and the need for a comprehensive framework that considers the severity of building damage in a region-based assessment. Understanding the extent of damage on man-made structures is essential for effective disaster management and recovery.

Experimental database
Many labeled building data sets based on satellite images have only two classes, namely, "damaged" and "undamaged". This is done primarily to reduce the time required by experts to annotate the building data manually. However, damage is not a two-level classification. xBD was therefore proposed to discern between multiple levels of damage (Gupta et al. 2019). The distinctions between damage levels are often subtle. xBD focused on obtaining high-quality images so that multilevel classification of building damage can be done by inspecting the images. To meet this requirement, xBD satellite imagery consists primarily of imagery with a ground sample distance (GSD) of less than 0.8 m.

xBD uses a discrete level scale ranging from no damage (0) to destroyed (3). This is presented in Table 1. The table shows an assessment of damage to buildings from satellite imagery across multiple types of disasters. The number of classes was selected in due consultation with hazard experts, balancing utility and ease of labeling.
xBD is the largest building damage assessment data set available so far, with 850,736 building annotations across 45,362 km² of imagery. It was created with the assistance of worldwide experts and specialists in different types of disasters, such as fires, earthquakes, and floods, to develop a damage scale that precisely portrays natural damage situations. Table 2 shows the details of the disaster data set that we have used in our work.
The images for xBD were obtained from the Maxar/DigitalGlobe Open Data Program, which releases an image database for ongoing disaster events in 14 countries around the world. It provides high-resolution imagery from many regions for varied hazards.
To create the annotation for every image, separate buildings were detected, and then an expert manually drew polygons around the visible building footprints. An example of these building footprints is shown in Fig. 1. In total, xBD contains 850,736 annotated building polygons.
Figure 2 gives an overview of our framework. In the first step, the centroid of each building is calculated from the polygon representing its footprint. These centroids are passed to the DBSCAN clustering algorithm, which groups them according to density. The output of this clustering step is a set of regions. These regions are then represented as convex hulls, and their areas are classified according to the priority of aid required. Through projection onto the Earth's surface, separations between (longitude, latitude) points are converted to land distances, and we obtain the area of the convex hull. This is used along with demographic statistics to compute an estimate of the population within any region. The convex hull is also projected onto the Earth using Google Maps, and various images can be generated for authorities to visualize the degree of damage. In addition, navigation clues to the damaged region are generated.

Our framework
The steps outlined above are explained in more detail in the subsequent sections.

Centroid of a building polygon
The building footprint is given in terms of the points at its corners. Each point has the form (LONG, LAT), where LONG is the longitude and LAT is the latitude. The centroid (C_x, C_y) of a building polygon defined by n vertices (x_0, y_0), (x_1, y_1), ..., (x_{n-1}, y_{n-1}) taken in order (clockwise or anticlockwise) is given by the following equations (Bourke 1988):

C_x = \frac{1}{6A} \sum_{i=0}^{n-1} (x_i + x_{i+1})(x_i y_{i+1} - x_{i+1} y_i)

C_y = \frac{1}{6A} \sum_{i=0}^{n-1} (y_i + y_{i+1})(x_i y_{i+1} - x_{i+1} y_i)

where indices are taken modulo n and A is the interior (signed) area of the polygon,

A = \frac{1}{2} \sum_{i=0}^{n-1} (x_i y_{i+1} - x_{i+1} y_i)
The vertices of building are given in ordered fashion traversing the perimeter of the building. For each building in an image, the centroid now represents a single point which can be passed to clustering algorithms.
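As an illustration, the centroid formulas above transcribe directly into a few lines of Python (the function name `polygon_centroid` is our own; vertices are assumed to be ordered along the perimeter, as in xBD):

```python
def polygon_centroid(vertices):
    """Centroid of a simple polygon given ordered (x, y) vertices (Bourke 1988)."""
    n = len(vertices)
    signed_area = 0.0
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]   # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        signed_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    signed_area *= 0.5
    cx /= 6.0 * signed_area
    cy /= 6.0 * signed_area
    return cx, cy
```

Because the same signed area appears in both numerator and denominator, the result is independent of whether the vertices are listed clockwise or anticlockwise.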

DBSCAN clustering algorithm
Clustering is a technique used in data analysis and machine learning to group similar data points together. The idea behind clustering is to identify patterns or structures in a data set without prior knowledge of the groupings. There are different types of clustering algorithms, but many of them work by iteratively reassigning each data point to the cluster whose mean (or median) is closest to it. This process is repeated until the assignments no longer change or a stopping criterion is met. The end result is a set of clusters, where each cluster is a group of data points that are similar to each other.
Clustering is a form of unsupervised machine learning where the model is trained using unlabeled data. The model is not given any specific input-output pairs to learn from; instead, it must find the underlying structure or pattern in the data on its own. Unsupervised learning is useful when the goal is to discover hidden patterns or features in the data or to summarize the data in a more compact form.
Clustering has been used for wide range of applications (Xu and Wunsch 2005) such as segmenting a customer base into different groups with similar characteristics, which can be used for targeted marketing campaigns or social network analysis where clustering can be used to group people with similar interests and characteristics, which can be used to identify communities or groups within a social network. Clustering can be used to identify data points that do not fit well with the rest of the data, which can be used to detect anomalies or outliers.
The motivation for the clustering algorithm comes from observation of plots depicting damages on the longitude and latitude axes. Figures 3, 4, and 5 show the plot of some of the disasters. It can be clearly observed from the figure that the damage is not distributed continuously in spatial manner. Instead, the regions of severe damage are separated by regions of little or no damage. It would be very useful for aid supply agencies to identify the regions and get an estimate in terms of the number of affected people.
There are different classes of clustering algorithms (Xu and Tian 2015). The K-means clustering algorithm is the most basic unsupervised learning algorithm. It groups similar data points together by finding K cluster centroids, where K is a user-specified parameter, and assigning each data point to the cluster whose center is closest to it. The algorithm starts with an initial set of K cluster centroids, which can be chosen randomly or in some other way. At each step, it assigns each data point to the cluster whose center is closest, by computing the distance between the data point and each cluster centroid. It then recalculates each cluster centroid as the average of all the data points in that cluster. These steps are repeated until the cluster assignments no longer change or a stopping criterion is met. Though simple and easy to implement, K-Means has limitations: it assumes that clusters are roughly spherical and of similar size, and its performance is highly dependent on the initial centroid selection. K-Means also fixes the number of clusters to K regardless of the distribution of the data.
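The assign-and-update loop just described can be sketched in pure Python for 2-D points (the function name, the fixed iteration cap, and the random initialization are our own illustrative choices):

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate point assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # initial centroids chosen at random
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins the cluster with the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: (p[0] - centroids[c][0]) ** 2
                                + (p[1] - centroids[c][1]) ** 2)
            clusters[j].append(p)
        # Update step: recompute each centroid as the mean of its cluster.
        new = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
               if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:                     # converged: assignments stable
            break
        centroids = new
    return centroids, clusters
```

This makes the stated limitation concrete: K must be fixed in advance, whatever the spatial distribution of the damaged buildings.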
This led to the development of density-based clustering algorithms (Xu and Tian 2015). Density-based clustering groups together data points that are closely packed. It clusters regions of high density together and separates regions of low density. Essentially, the algorithm identifies areas that are densely populated with data points and defines them as clusters. Such clustering is efficient and suitable for data of arbitrary shape, which is the case with disaster regions. DBSCAN is one of the most powerful density-based algorithms.
DBSCAN (density-based spatial clustering of applications with noise) (Ester et al. 1996) is a clustering algorithm that creates clusters of dense regions in space, which are separated by regions of low density of data points. The best part of DBSCAN is that it can cluster real-life data that may be of arbitrary shape and is very robust against outliers (Schubert et al. 2017).
It is imperative to understand the details of the DBSCAN algorithm in order to set its parameters for our problem. The algorithm requires the following two parameters: (a) Epsilon (eps): the radius of the circle used to check whether a point lies in the neighborhood of a particular point. (b) Minimum points (n, often denoted minPoints): the least number of points required to lie within the epsilon distance of a core point. It has to be at least three.
The steps of DBSCAN in simplified form can be stated as follows:
1. Consider any unexplored point P in the dataset.
2. Mark P as explored.
3. Compute the neighbors of P within eps distance.
4. If the number of neighbors of P is less than n, mark P as noise; otherwise, form a cluster around P.
Since our dataset has centroids in the form of longitude and latitude with several decimal places, the default value of eps = 0.5 did not give good results. Clearly, the eps value should be very small and n should be large, since in the event of a disaster many houses are affected in a region. For our purpose we have taken eps = 0.05, since it was observed that two points at a distance of 0.05 are close to each other when projected on the Earth's surface. Moreover, eps = 0.05 gave the best results among the various values tried. The parameter n indicates the number of houses affected in a region, and we chose n = 50. In case of a natural hazard, it is important to first identify the areas of significant damage, and a region with more than 50 damaged buildings should be given preference.
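The DBSCAN procedure can be sketched as follows. This is a brute-force, pure-Python illustration with our own function names; a production system would use an indexed implementation such as scikit-learn's `DBSCAN(eps=0.05, min_samples=50)`:

```python
from collections import deque

def dbscan(points, eps=0.05, min_pts=50):
    """Minimal DBSCAN: label each point with a cluster id, or -1 for noise."""
    def neighbors(i):
        xi, yi = points[i]
        # Brute-force range query; includes the point itself.
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1                       # provisionally noise
            continue
        labels[i] = cluster_id                   # i is a core point: start a cluster
        queue = deque(nbrs)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster_id           # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:           # j is also a core point: expand
                queue.extend(j_nbrs)
        cluster_id += 1
    return labels
```

Note that, unlike K-Means, the number of clusters is not supplied: it emerges from the density of the (longitude, latitude) centroids and the two parameters eps and n.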

Results and comparison of clustering algorithms
To evaluate the performance of clustering algorithms on our problem, we use two frequently used evaluation metrics: the silhouette score and the Davies-Bouldin index. The silhouette score is a metric used to evaluate the performance of a clustering algorithm (Rousseeuw 1987). It measures the closeness of an object to its own cluster compared to other clusters. The score ranges from −1 to 1, where a higher score indicates that the object is more similar to its own cluster than to other clusters. The silhouette score is calculated for each object in the data set, and the average score over the entire data set can also be used to choose the optimal number of clusters. For data point i, let a(i) be the average distance between i and all other points in the same cluster, and let b(i) be the smallest average distance from i to all points in any other cluster. The silhouette value sv(i) of data point i is defined as follows (Rousseeuw 1987):

sv(i) = \frac{b(i) - a(i)}{\max(a(i), b(i))}

From this definition, it is clear that −1 ≤ sv(i) ≤ 1. The silhouette score is computed by taking the average of sv(i). A negative value or −1 indicates that points have been allocated to incorrect clusters. A value of 0 indicates a high degree of overlap between clusters and bad clustering performance. The ideal value is 1, indicating well-formed, non-overlapping clusters.
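The silhouette definition transcribes directly into code. The following pure-Python sketch (function name ours; it assumes distinct point tuples) evaluates the mean silhouette for 2-D points:

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette value over all points (Rousseeuw 1987)."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)

    scores = []
    for p, l in zip(points, labels):
        same = [q for q in clusters[l] if q is not p]
        if not same:                      # singleton cluster: convention sv = 0
            scores.append(0.0)
            continue
        # a(i): mean distance to the rest of p's own cluster.
        a = sum(dist(p, q) for q in same) / len(same)
        # b(i): smallest mean distance to any other cluster.
        b = min(sum(dist(p, q) for q in qs) / len(qs)
                for l2, qs in clusters.items() if l2 != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

For two tight, well-separated groups the mean silhouette approaches 1, matching the interpretation given above.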
The Davies-Bouldin index (DBI), introduced by Davies and Bouldin (1979), is another metric for evaluating clustering algorithms. DBI measures the similarity between each cluster and its most similar cluster, defined via the ratio of within-cluster scatter to between-cluster separation. It ranges from 0 to infinity, with a lower value (closer to 0) indicating a better clustering solution. It is commonly used in evaluating the performance of clustering algorithms, particularly in image and text analysis.
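One common formulation of the DBI, using centroid-based scatter, can be sketched as follows (function name ours; library implementations such as scikit-learn's `davies_bouldin_score` follow the same definition):

```python
import math

def davies_bouldin(points, labels):
    """Davies-Bouldin index for 2-D points: lower is better."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    # Centroid of each cluster.
    cent = {l: (sum(x for x, _ in ps) / len(ps), sum(y for _, y in ps) / len(ps))
            for l, ps in clusters.items()}
    # Scatter: mean distance of cluster members to their centroid.
    scatter = {l: sum(math.hypot(p[0] - cent[l][0], p[1] - cent[l][1])
                      for p in ps) / len(ps)
               for l, ps in clusters.items()}
    ids = list(clusters)
    total = 0.0
    for i in ids:
        # Worst-case (largest) similarity ratio against any other cluster.
        total += max((scatter[i] + scatter[j]) /
                     math.hypot(cent[i][0] - cent[j][0], cent[i][1] - cent[j][1])
                     for j in ids if j != i)
    return total / len(ids)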
Tables 3 and 4 present the comparison for the various disasters mentioned in Table 2. Table 3 presents the comparison between DBSCAN and K-Means on the silhouette score for different regions. As mentioned above, the score lies between −1 and 1, with 1 representing a perfect score. For comparison purposes, K-Means was run with K = 3 and K = 5, representing three and five clusters, respectively. From the table, it can be observed that DBSCAN has a better score than K-Means (K = 3) and K-Means (K = 5) for all disasters. Referring to Figs. 6, 7, 8, 9 and 10, which plot the clusters formed by DBSCAN and K-Means for these disasters, we can observe that a predefined number of clusters fails in one case or another. Consider the case of Midwest flooding, where the DBSCAN silhouette score almost reaches perfection, but the K-Means clusters can be seen to be mixed. From the figures, we can observe that with K = 5, K-Means unnecessarily breaks a contiguous region into two or more regions to obtain more clusters, as in the case of Hurricane Harvey, while with K = 3, K-Means merges regions into one even though they may be quite distant, as in the case of Hurricane Matthew. Table 4 presents the comparison between DBSCAN and K-Means on the DBI for different regions. A DBI score close to 0 means good cluster formation, and it can be observed from the table that DBSCAN has significantly better results than K-Means.
Das et al. (2013) suggested evaluating the impact of the disaster and identifying appropriate measures to reduce its effects in order to provide humanitarian aid such as supplies, shelter, rescue operations, and long-term services. Mitigation steps, including the provision of temporary housing and essential items like food, water, and medicine, are crucial in the immediate aftermath of a disaster. The success of these efforts is heavily dependent on the accurate estimation of the affected population and the classification of their needs, such as those who are mildly or severely injured.

Convex hull computation and classification of region
A convex hull is the smallest convex polygon that encloses all the data points. There are various techniques to find the convex hull systematically; it is also a natural way to showcase the size and spread of a cluster by drawing an outline around it. We use a convex hull algorithm that combines the two-dimensional Quickhull algorithm and the general-dimension Beneath-Beyond algorithm (Barber et al. 1996).
The steps of the algorithm can be simplified as follows:
1. Select the data points with the smallest and largest x-coordinates (A and B, respectively). They will always be part of the convex hull. In case of a tie, choose the ones with minimum/maximum y.
2. Partition the remaining points into two subgroups using the line formed by the two points.
3. Compute the distance of all points from the line; on each side of the line, choose the point C with the largest distance from the line and construct the triangle ABC.
4. The points inside the triangle ABC cannot be part of the convex hull and can thus be disregarded in the following phases.
5. Repeat the previous two steps on the two lines (i.e., AC and BC) formed by the triangle. Iterate until no more points are left.
6. Output the selected points constituting the convex hull.
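The steps above can be sketched in pure Python for the 2-D case (function names ours; the recursion uses the signed triangle area both as a side test and as a distance proxy, since for a fixed base line the signed area is proportional to the distance):

```python
def quickhull(points):
    """Convex hull of 2-D points via Quickhull; returns hull vertices (unordered)."""
    def side(a, b, p):
        # Twice the signed area of triangle (a, b, p): > 0 means p lies left of a->b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

    def find_hull(pts, a, b, hull):
        # pts holds only points strictly left of the line a->b.
        if not pts:
            return
        c = max(pts, key=lambda p: side(a, b, p))   # farthest point from the line
        hull.append(c)
        # Points inside triangle a-c-b are discarded automatically:
        # they are left of neither a->c nor c->b.
        find_hull([p for p in pts if side(a, c, p) > 0], a, c, hull)
        find_hull([p for p in pts if side(c, b, p) > 0], c, b, hull)

    a = min(points)          # leftmost point (ties broken by smallest y)
    b = max(points)          # rightmost point (ties broken by largest y)
    hull = [a, b]
    find_hull([p for p in points if side(a, b, p) > 0], a, b, hull)
    find_hull([p for p in points if side(b, a, p) > 0], b, a, hull)
    return hull
```

In practice, library routines such as `scipy.spatial.ConvexHull` (which wraps Qhull, the implementation of Barber et al. 1996) would be used; the sketch is only meant to make the recursive pruning concrete.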
Before we can compute the area of a region, we need to project longitude and latitude onto the Earth's surface to obtain actual land distances in meters. We use the equirectangular projection defined by (Snyder 1987):

x = R (LA − LA_0) \cos φ,  y = R φ

where φ is the latitude in radians, LA is the longitude of the point, LA_0 is the longitude of the central meridian, and R is the radius of the Earth.
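A small helper implementing this projection is sketched below. The spherical Earth radius constant and the use of the point's own latitude for the cosine factor are our assumptions; Snyder's general equirectangular formulation uses the latitude of a standard parallel, which coincides with this for points near that parallel:

```python
import math

EARTH_RADIUS_M = 6371000.0   # mean Earth radius in meters (spherical assumption)

def project(lon_deg, lat_deg, lon0_deg):
    """Equirectangular projection of (longitude, latitude) in degrees to meters."""
    lam = math.radians(lon_deg)
    lam0 = math.radians(lon0_deg)
    phi = math.radians(lat_deg)
    x = EARTH_RADIUS_M * (lam - lam0) * math.cos(phi)
    y = EARTH_RADIUS_M * phi
    return x, y
```

At the equator this maps one degree of longitude to roughly 111 km, which is why a raw (longitude, latitude) separation of 0.05 used as eps corresponds to points that are genuinely close on the ground.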
The Shoelace algorithm, which calculates the area of a simple polygon whose vertices are specified by their Cartesian coordinates in the plane, is used to calculate the convex hull area (Braden 1986). Traversing the vertices in order and accumulating the cross-products of consecutive coordinate pairs gives the area as

A = \frac{1}{2} \left| \sum_{i=0}^{n-1} (x_i y_{i+1} - x_{i+1} y_i) \right|

where indices are taken modulo n.
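The Shoelace computation is a one-pass loop over the projected hull vertices (function name ours; vertices must be in traversal order, which the convex hull construction guarantees):

```python
def shoelace_area(vertices):
    """Area of a simple polygon from ordered (x, y) vertices (Braden 1986)."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]   # wrap around to close the polygon
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0
```

Applied to hull vertices projected to meters, this yields the region area in square meters, which feeds the demographic estimate in the next section.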

Estimating the number of people affected
In the immediate aftermath of a disaster, such as a tsunami or flood, government officials must determine the number of households that have been affected. While current census data would be ideal for this purpose, it may be outdated or unavailable. In these cases, data on the location and concentration of buildings can serve as a useful alternative source of information. Chen et al. (2018) examined techniques for selecting urban resources, determining the extent of the affected area and population, and forecasting the need for shelter using GIS in a multi-hazard setting, with the city of Guangzhou as a case study. To create a map for random sampling of households, Pearson et al. (2015) utilized geospatial mapping of residential roofs on satellite images within a desired region. However, being sampling-based, their method does not comprehensively capture the overall effect. Lowther et al. (2009) take a similar approach for areas with limited resources and inadequate maps, census data, and infrastructure, or when surveying populations that could be affected in unstable regions.
Based on our estimate, we consider roofs with a polygonal area between 16 and 150 m^2 to be households, which is consistent with the findings of a previous study that used overhead imagery for household enumeration. In the study by Kamanga et al. (2015), structures within the range of 9-330 m^2 were selected for examination. They concluded that a roof area greater than 150 m^2 generally belongs to non-residential structures, such as places of worship, community centers, grocery stores, healthcare facilities, educational institutions, government buildings, law enforcement offices, industrial buildings, storage facilities, agricultural buildings, vineyards, and performance venues. Switchenko et al. (2021) showed that the average number of persons per household is a good indicator for assessing damage in a region; they used a house-index of 4.8 persons per household for their study. It should be noted that, depending on the part of the world where the disaster occurs, the authorities can set the house-index value appropriately using census data. Table 5 shows the estimate of the number of persons affected by Hurricane Harvey. The number of persons per household is readily available census data; for example, Texas, USA, where Hurricane Harvey had an effect, has 2.76 persons per household (Quickfacts 2022). The accuracy of census data has always been a subject of debate; however, in case of disaster, an overall estimate is required for a region rather than an accurate count. We use this census data for our purposes together with the longitude and latitude of the region. Figure 7a shows the regions for which estimates are presented for DBSCAN.

Results
The usefulness of the framework can be understood from Table 6, which gives the detailed distribution of destroyed roofs and an estimate of the people affected in the region marked as cluster 2. The authorities can very quickly judge the degree of damage as well as the amount of aid required. Kafi and Gibril (2016) show that GPS can be used to locate disasters such as earthquakes; however, the visuals provided through satellite images and Google Maps for the region make it easier to judge the priority of providing aid. Zakia et al. (2016) suggested a method to guide rescuers to victims during rescue operations by creating a network that can be initiated by either the victim or the rescuer. Their system accounts for the possibility that the mobile network may be malfunctioning or disrupted during a disaster, and therefore uses device-to-device communication within LTE technology even in the absence of a network.

Navigation and prioritization of services
During a disaster, not only is there damage to structures, but environmental factors such as smoke, fire, flooding, volcanic ash flows, and lava can also make accessing affected areas difficult. When the regions are marked on maps derived from satellite images, they are visible along with the pathways to the affected regions. Figures 11 and 12 show the navigation. The classification of regions is carried out using two threshold parameters, θ1 and θ2. We develop the following criteria. Let S_i be the sum of destroyed and major-damage houses in a region i. If S_i is greater than θ2, then the region has the highest priority, that is, priority 1. If S_i is less than θ2 but greater than θ1, then region i is of priority 2. If S_i is less than θ1, then it is of priority 3. The priority P_i of region i is therefore given by: P_i = 1 if S_i > θ2; P_i = 2 if θ1 < S_i ≤ θ2; P_i = 3 if S_i ≤ θ1. We have set θ1 = 100 and θ2 = 500. From the figures we can see that there can be more than one region per priority level. For example, Fig. 11 shows two regions of priority 1 for Hurricane Harvey, while Fig. 12 shows four regions of priority 2 for the Indonesia (Palu) tsunami. Two views are provided so that environmental factors can be seen in one view and accessible roads can be discerned in the other.
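The prioritization rule is a simple piecewise function of S_i; a minimal sketch with the paper's thresholds of 100 and 500 as defaults (the function name is our own):

```python
def priority(s_i, theta1=100, theta2=500):
    # s_i: count of destroyed + major-damage houses in region i.
    if s_i > theta2:
        return 1   # highest priority
    if s_i > theta1:
        return 2
    return 3       # lowest priority
```

A region with 600 severely damaged houses gets priority 1, one with 300 gets priority 2, and one with 50 gets priority 3.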

Conclusion
In this work, a framework is presented based on an unsupervised clustering algorithm that groups regions based on density. The algorithm is compared with the standard and popular K-Means algorithm and is shown to perform far better than K-Means in terms of cluster plots, Silhouette score, and DB index. After the regions are identified, a methodology is presented to obtain a demographic estimate as well as the priority of the aid required for each region. Accessibility to regions can also be visualized through maps.
The overall goal of this paper is to enable quick and effective aid in case of a disaster. Detecting a damaged building and observing satellite images manually can certainly provide useful clues. However, interpretation through a systematic algorithmic framework as presented here can be a useful tool for assessing damaged regions, since resource planning is done in terms of areas, locations, and regions.