One of the most popular computer vision algorithms is the You Only Look Once (YOLO) object detector. YOLO is a deep neural network that divides the input image into a grid and predicts bounding boxes and class probabilities for each cell; its outputs are bounding boxes, object classes, confidence scores, and class probabilities. Because of its speed and precision, YOLO is used in self-driving cars, security systems, and drone surveillance. Several research hurdles remain before YOLO becomes more powerful and adaptable: the algorithm struggles to detect small objects and can produce false positives when objects overlap.

The YOLO algorithm has gone through several versions, each with different features and enhancements. YOLOv2 introduced anchor boxes to better handle varying object scales and aspect ratios. YOLOv3 employed a feature pyramid network (FPN) to detect objects of varying sizes. YOLOv4 added a spatial pyramid pooling (SPP) module, a new activation function (Mish), and new data augmentation techniques such as CutMix. These enhancements made YOLOv4 one of the most accurate and efficient object detection algorithms of its time. Although YOLO has improved with each release, further study is needed. Attention mechanisms are being investigated for object detection, since they can help the network focus on specific image regions and detect objects more reliably. YOLO-based algorithms for 3D point clouds, used in autonomous driving and robotics, are also of interest.

YOLOv8 provides an improved architecture and developer experience. YOLOv8 detects objects in images using convolutional neural networks (CNNs). The CNN extracts features such as edges and textures from the input image, and YOLO performs object classification and bounding-box regression with a single network. This makes object detection faster and more efficient than methods with multiple stages or task-specific models.
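The grid-based prediction scheme described above can be illustrated with a minimal decoding sketch. This is not the actual YOLOv8 head (which is anchor-free and multi-scale); it is a simplified, hypothetical decoder for a single S × S grid, with the tensor layout and function name chosen here for illustration:

```python
import numpy as np

def decode_grid(pred, img_size=640, conf_thresh=0.5):
    """Decode a YOLO-style grid of predictions into boxes.

    pred: array of shape (S, S, 5 + C) -- per cell:
          (x, y, w, h, objectness) followed by C class scores,
          with x, y relative to the cell and w, h relative to the image.
    Returns a list of (x1, y1, x2, y2, confidence, class_id).
    """
    S = pred.shape[0]
    cell = img_size / S                     # cell size in pixels
    boxes = []
    for row in range(S):
        for col in range(S):
            obj = pred[row, col, 4]
            if obj < conf_thresh:           # skip low-confidence cells
                continue
            x, y, w, h = pred[row, col, :4]
            cx = (col + x) * cell           # box centre in pixels
            cy = (row + y) * cell
            bw, bh = w * img_size, h * img_size
            cls = int(np.argmax(pred[row, col, 5:]))
            boxes.append((cx - bw / 2, cy - bh / 2,
                          cx + bw / 2, cy + bh / 2, float(obj), cls))
    return boxes
```

In a full detector, the surviving boxes would additionally be filtered with non-maximum suppression to remove overlapping duplicates.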
YOLO trains its CNN on a large collection of annotated images with object labels and bounding boxes, which lets it detect objects of varied sizes, shapes, and orientations. With rapid and precise detection, YOLO's CNNs have transformed real-world object detection. Figure 1 illustrates CNN-based object detection.

To begin our research, we conducted a thorough review of the available images related to our study topic, carefully analyzing each one and noting the features and patterns that could inform our understanding of the subject matter. We then labeled all available images and classified them into three distinct classes. Labeling and classification was a rigorous process requiring close attention to detail: we considered the attributes of each image, including its size, color, and content, to assign it to the appropriate category. Our method detects buildings with no damage, moderate damage, and catastrophic loss in post-disaster images.
3.1 Dataset
In this section, we describe how we leveraged the Rescue-Net dataset, a high-resolution UAV semantic segmentation benchmark, to evaluate the extent of natural disaster damage. We used Roboflow for image annotation and sorted the images into three categories: no damage, moderate damage, and destruction, as shown in Fig. 2. Our dataset comprised 13,702 raw images, which we processed with pre-processing and augmentation steps to improve our method's accuracy. The outcomes demonstrate that our strategy is highly effective at precisely evaluating natural disaster damage. With Rescue-Net and Roboflow, we were able to process large image datasets more efficiently and accurately, indicating that this method holds promise as a tool for future disaster-response efforts.
3.2 Image Pre-Processing
To improve the efficiency of the YOLOv8 method on this dataset, we resized the images from their original dimensions of 3000 by 4000 pixels to 1280 by 1280 pixels. This change was made to minimize computational demands. Our experiments show that the resizing caused no notable decrease in YOLOv8's accuracy on the Rescue-Net dataset, while also reducing training time and memory usage.
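The resizing step can be sketched with Pillow (the paper does not state which library was used, so the function name and resampling choice here are illustrative):

```python
from PIL import Image

def resize_for_yolo(img, target=(1280, 1280)):
    """Downscale a high-resolution UAV image to the model input size.

    Squashing a 3000 x 4000 frame into a square changes the aspect
    ratio, so any bounding-box labels must be rescaled by the same
    per-axis factors.
    """
    return img.resize(target, Image.BILINEAR)

# Synthetic stand-in for a 3000 x 4000 pixel UAV frame.
frame = Image.new("RGB", (3000, 4000))
small = resize_for_yolo(frame)
```

Bilinear resampling keeps the operation cheap; a letterbox resize (padding to preserve aspect ratio) is an alternative when distortion is a concern.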
3.3 Image Augmentation
Image augmentation is a widely used technique in machine vision for improving the accuracy of computer vision models. It creates new images from existing ones by applying transformations such as flipping, rotation, zooming, brightness adjustment, and contrast modification. This considerably enlarges the dataset and helps avoid overfitting, improving the model's generalization. In this work, we applied several augmentation methods to improve detection accuracy: 90° rotations, clockwise and counterclockwise rotations within the range of −15° to +15°, horizontal and vertical shearing of up to 15°, and brightness adjustments within the range of −25% to +25%. These techniques produced a final dataset of 6265 images containing variations of the originals, improving its diversity. Figure 3 displays sample images after augmentation. Our experimental results showed that the accuracy of our model improved considerably when trained on this augmented dataset, demonstrating that image augmentation is a vital technique for improving the performance of computer vision models.
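A Pillow-based sketch of these augmentations follows. The actual pipeline in this work was built with annotation tooling rather than hand-written code; the ranges below mirror the ones stated above, while the function name and sampling details are illustrative:

```python
import math
import random
from PIL import Image, ImageEnhance

def augment(img, seed=0):
    """Produce variants of one image using the transformation ranges
    described above: a 90 deg rotation, a rotation drawn from
    [-15, +15] deg, a shear of up to 15 deg, and a brightness change
    in [-25%, +25%]."""
    rng = random.Random(seed)
    variants = [
        img.rotate(90, expand=True),        # 90 deg rotation
        img.rotate(rng.uniform(-15, 15)),   # small random rotation
    ]
    # Horizontal shear: each output pixel samples x_in = x + shear * y.
    shear = math.tan(math.radians(rng.uniform(-15, 15)))
    variants.append(img.transform(img.size, Image.AFFINE,
                                  (1, shear, 0, 0, 1, 0)))
    # Brightness factor 1.0 leaves the image unchanged.
    factor = 1 + rng.uniform(-0.25, 0.25)
    variants.append(ImageEnhance.Brightness(img).enhance(factor))
    return variants
```

For object detection, geometric transforms (rotation, shear) must also be applied to the bounding-box annotations, which augmentation tooling normally handles automatically.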
3.4 Models
We utilized the latest version of the YOLO algorithm, YOLOv8, for building damage assessment and object detection. YOLOv8, developed by Ultralytics, builds on earlier YOLO versions with a more efficient backbone architecture and an anchor-free detection head, among other modifications that improve object detection accuracy.
By using the full range of YOLOv8 model scales, we can achieve even better results in object detection tasks. We trained and evaluated YOLOv8 on this dataset using its different variants, YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, and YOLOv8x, which trade model size and inference speed against accuracy.
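Training each variant follows the same recipe and differs only in the weights loaded. A minimal Ultralytics-style dataset description for our three classes might look like the following (the paths and class names here are placeholders, not the actual dataset layout):

```yaml
# data.yaml -- dataset file consumed by YOLOv8 (paths illustrative)
path: rescue-net          # dataset root
train: images/train
val: images/val

names:
  0: no-damage
  1: moderate-damage
  2: destruction
```

Each variant can then be trained with the Ultralytics CLI, e.g. `yolo detect train data=data.yaml model=yolov8s.pt imgsz=1280`, substituting `yolov8n.pt`, `yolov8m.pt`, `yolov8l.pt`, or `yolov8x.pt` for the other scales.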