As outlined in the introduction, our purpose is to create a dataset that supports the development and evaluation of methods for detecting water bodies in Unmanned Aerial Vehicle (UAV) imagery. The dataset must therefore satisfy several conditions. First, the images should include various water body types, especially potential arbovirus vector breeding sites, and should capture the environmental characteristics and diverse habitats of the Vietnamese region, including suburban and rural areas. Second, the backgrounds should vary widely, and the associated ground-truth annotations should be comprehensive enough to facilitate developing, training, and testing different computer vision algorithms. Lastly, the dataset should be copyright-free or permissible for use within the research community, considering the high cost usually associated with aerial imagery.
Drone and Camera
This research used the DJI Phantom 4 Multispectral (P4M) UAV and DJI GS Pro software for agricultural and environmental analysis. Figure 1 gives an overview of the drone and its multispectral camera. This UAV carries a high-quality multispectral camera system comprising five spectral bands: Red, Green, Blue, NIR, and Red-edge. Integration with a Real-Time Kinematic positioning system enables highly accurate positioning, streamlining automated flight planning and control. DJI GS Pro software is essential for efficient multispectral image acquisition, contributing to highly accurate mapping and data collection. Before each flight, we calibrated the spectral imagery by capturing images of MicaSense Calibrated Reflectance Panels; after the flight, these reference images were used to calibrate the collected imagery so that the spectral values are as accurate as possible.
Collection, Annotation, and Construction
The study was conducted in Ben Cat town, Binh Duong province, an area in southern Vietnam with high exposure to and prevalence of arbovirus diseases. We purposively selected regions to maximize the variety of water body conditions in rural and peri-urban areas, including reservoirs, ponds, temporary water pools, and road puddles. The dataset was collected across 16 flights from August to November 2023, during the wet season, which is crucial for maintaining consistency between successive images, and we meticulously planned the flight paths. The detailed map of these locations is marked by map pins in Fig. 1: four locations are on the Vietnamese-German University campus, two are in a residential quarter of Thoi Hoa town, and ten are in the rural area of An Dien town, consisting of fields and rural regions.
In this study, we implemented a comprehensive process for collecting and annotating unmanned aerial imagery, primarily to facilitate the training of different algorithms. Our aerial coverage typically encompassed areas of 100 m x 100 m to 200 m x 200 m (up to 4 ha), using a flight speed of 5 m/s at an altitude of 120 m above the ground. We also ensured sufficient image overlap (around 70%) for orthomosaic construction. In the annotation stage, we employed two methods tailored to the specific requirements of each algorithm. Bounding-box annotations, essential for object detection tasks, were created with the LabelImg tool [15] by manually drawing rectangular boxes around objects of interest in the images. For segmentation tasks, which require a more granular approach, we used the CVAT tool [16] to brush precise masks over every object in the images.
Our final dataset contains 1,013 images across five spectral bands: RGB (three bands), NIR, and Red Edge. Each image has a resolution of 0.0265 cm per pixel and a size of 1600 x 1300 pixels. Figure 2 visualizes some RGB and NIR image examples from the WaterMAI dataset. The color images, consisting of three 8-bit channels (R-G-B), were used exclusively for training the RGB model, while the additional NIR spectral band from the P4M camera was pivotal for training multispectral models. We also utilize the NDWI band, with the methodologies and experimental setups detailed in later sections of this article. Furthermore, to maintain the quality of our dataset, we filtered out images of subpar quality, particularly those affected by poor weather conditions such as darkness or cloud cover, as these factors significantly hinder the clarity of manual annotations.
After filtering out low-quality and poorly exposed images, the WaterMAI dataset contains 870 six-band images for training and validation. For the testing dataset, we collected images in two distinct areas: one similar to the training set and the other a completely new area around Ben Cat town. We nevertheless kept the distribution of the testing data similar to that of the training dataset. The number of images in each split is given in Table 1.
Table 1
Distribution for training, validation, and testing purposes

| Purpose | Number | Modality | Wavelength range | Resolution |
| --- | --- | --- | --- | --- |
| Training and validation | 870 images per band | RGB | 450 nm − 590 nm | 0.0265 cm/pixel |
| | | NIR | 790 nm | |
| Testing | 143 images per band | RGB | 450 nm − 590 nm | 0.0353 cm/pixel |
| | | NIR | 790 nm | |
To process the multispectral image data from the P4M, we employed Pix4Dmapper to generate orthomosaic images from the UAV imagery. Initially, raw images are imported into Pix4Dmapper with key settings such as the WGS 1984 coordinate system. The process involves several stages: initial image alignment, point cloud creation, 3D mesh creation, and development of a digital surface model from which the orthomosaic images are constructed. This method allows seamless integration with Geographic Information Systems, enhancing spatial analysis capabilities. Figure 3 shows a sample orthomosaic image, demonstrating the detailed and geographically accurate representation of the surveyed area, which is useful in applications such as environmental monitoring.
One of the primary objectives of this paper is to provide the computer vision community with resources for water-body detection from multispectral aerial imagery. This section benchmarks several deep learning architectures, described below, by training and evaluating them on our collected WaterMAI dataset to facilitate a comprehensive comparison.
Water Bodies Detection Algorithm
You Only Look Once version 7 (Yolov7) [17] is an advanced object detection model known for its speed and accuracy, applying a trainable bag-of-freebies approach. Yolov7 combines innovative techniques such as mosaic augmentation, self-adversarial training, and a CSPNet backbone to improve performance during training without significantly increasing computational complexity or inference time. These strategies aim to enhance the model's robustness, generalization, and efficiency in detecting objects in images or videos. Similarly, DocF, proposed by Fang Qingyun [18], maximizes the potential of the different modalities of multispectral images by utilizing a Cross-Modality Fusion Transformer, a straightforward yet powerful method. Unlike previous approaches relying on Convolutional Neural Networks (CNNs), the Transformer-inspired network captures long-distance dependencies and incorporates broader contextual details during feature extraction. The network seamlessly combines information within and between modalities, effectively capturing interactions between RGB and other multispectral domains. Comprehensive experiments on multiple datasets confirm the effectiveness of the proposed scheme in achieving state-of-the-art detection results.
In contrast, the Multispectral Semantic Segmentation Network (MSNet) [19] is a deep convolutional segmentation network that has made significant strides in remote sensing. The model splits multispectral bands into visible and invisible groups, namely RGB and NIR, to fully exploit the multispectral information when distinguishing features such as water and vegetation. MSNet uses ResNet-50 for feature extraction and cascaded upsampling to increase resolution, fusing multi-scale image and spectral features through a feature pyramid structure. Experiments carried out by MSNet's authors showed competitive performance compared with similar methods. Yuxiang Sun's RTFNet [20] likewise leverages visible and invisible spectral bands, combining an encoder-decoder design that uses ResNet for feature extraction with a new decoder that restores feature map resolution. Finally, the U-Net architecture [21], a traditional yet powerful semantic segmentation model, is also evaluated on the collected WaterMAI dataset for benchmarking purposes.
Experimental Setup
Multispectral imagery is a powerful remote sensing tool that leverages diverse channel combinations to extract features from the Earth's surface. Each combination serves a unique purpose, offering insights into various environmental and geographical phenomena. In this experiment, we utilize the three channel combinations described below.
RGB + NIR
RGB provides basic color information for distinguishing vegetation, soil, and water. NIR, on the other hand, is strongly reflected by vegetation but absorbed by water, making it excellent for differentiating water bodies from vegetated areas. This combination can enhance the model's ability to detect water bodies by contrasting them against land and vegetation.
NDWI + NIR + Green
NDWI is specifically designed to enhance the presence of water bodies in multispectral imagery. Combining the NDWI and NIR bands has the potential to maximize the response of water while minimizing the response of vegetation. Moreover, adding the green band to this combination can further improve water body detection, as water reflects relatively strongly at green wavelengths.
Red + NDWI + Blue
This combination can effectively identify water bodies, even in areas with diverse land features, since NDWI is specifically designed to detect water and the red and blue bands offer additional distinct water signatures. Furthermore, research by Bulent Ayhan [22] indicates that a similar combination (NDVI, Green, and Blue) achieved the highest accuracy for chlorophyll-rich vegetation detection on a public dataset. This finding encourages the exploration of similar spectral band combinations in this paper.
The NDWI highlights open water features in a satellite image, allowing a water body to stand out against soil and vegetation. It is calculated from the Green-NIR (visible Green and Near-Infrared) combination, with a small constant ε added to the denominator to prevent division by zero. The NDWI is defined as
NDWI = \(\frac{Green - NIR}{Green + NIR + \epsilon }\) (1)
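As a minimal sketch of how Eq. (1) and the three channel combinations above can be prepared as model inputs (assuming the calibrated bands are available as NumPy arrays; the function and variable names here are illustrative, not from the paper):

```python
import numpy as np

def compute_ndwi(green: np.ndarray, nir: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """NDWI = (Green - NIR) / (Green + NIR + eps), as in Eq. (1)."""
    green = green.astype(np.float32)
    nir = nir.astype(np.float32)
    return (green - nir) / (green + nir + eps)

def build_inputs(red, green, blue, nir):
    """Stack the three channel combinations used in the experiments.

    Returns arrays of shape (H, W, C); band names follow the paper,
    but this helper itself is an illustrative assumption.
    """
    ndwi = compute_ndwi(green, nir)
    rgb_nir = np.stack([red, green, blue, nir], axis=-1)   # RGB + NIR
    ndwi_nir_g = np.stack([ndwi, nir, green], axis=-1)     # NDWI + NIR + Green
    r_ndwi_b = np.stack([red, ndwi, blue], axis=-1)        # Red + NDWI + Blue
    return rgb_nir, ndwi_nir_g, r_ndwi_b
```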
The deep learning experiments are conducted on a GPU server with one CPU (12th Gen Intel i9-12900K), 64 GB of RAM, and one GPU (Nvidia RTX 3090) with 24 GB of VRAM. For the primary training parameters, the batch size is 8 to maximize memory utilization, and the initial learning rate is 1e-2. We apply a dynamic linear learning rate scheduler from [23], defined by a linear decay function that decreases the learning rate linearly from the initial value to a set minimum as the epochs progress. This scheduling strategy has been shown to improve training accuracy and lead to better learning performance. The optimizer is momentum-based Stochastic Gradient Descent [24], the Binary Cross-Entropy loss is selected for the binary water body detection task, and the number of training epochs is 200 to ensure convergence. Additionally, we apply mixed precision training [25], which uses a combination of single-precision and half-precision floating-point numbers to speed up training while maintaining accuracy and reducing memory consumption.
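A minimal PyTorch sketch of this training configuration is shown below; `model`, `train_loader`, the momentum value, and the minimum learning rate `lr_min` are illustrative placeholders, as the paper does not specify its exact implementation.

```python
import torch
from torch import nn

def train(model: nn.Module, train_loader, epochs: int = 200,
          lr0: float = 1e-2, lr_min: float = 1e-4, device: str = "cuda"):
    # Illustrative placeholders: `model` and `train_loader` stand in for the
    # actual network and the WaterMAI dataloader.
    model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr0, momentum=0.9)
    # Linear decay from lr0 down to lr_min over the training epochs.
    schedule = lambda e: (1 - e / epochs) + (e / epochs) * (lr_min / lr0)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=schedule)
    criterion = nn.BCEWithLogitsLoss()    # binary water / non-water task
    scaler = torch.cuda.amp.GradScaler()  # mixed precision training

    for epoch in range(epochs):
        for images, masks in train_loader:
            images, masks = images.to(device), masks.to(device)
            optimizer.zero_grad()
            with torch.cuda.amp.autocast():  # half-precision forward pass
                loss = criterion(model(images), masks)
            scaler.scale(loss).backward()    # scaled backward pass
            scaler.step(optimizer)
            scaler.update()
        scheduler.step()                     # per-epoch linear LR decay
```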
Evaluation metrics
The evaluation algorithm analyses a set of ground truth images and predictions from a test fold, where the predictions represent the anticipated positions of the target within the test images. Given an image I and a chosen threshold t, detections in I whose score surpasses t are considered valid detections, while others are disregarded. In practice, each prediction of a computer vision model falls into one of four outcomes: the number of correct positive predictions matching the ground truth is denoted TP(I,t); detections wrongly classified as positive are counted as FP(I,t); instances improperly classified as negative are designated FN(I,t); and the number of correct negative predictions matching the ground truth is TN(I,t). To assess the performance of the models in identifying water bodies within a specific test fold, precision and recall are defined as
precision(t) = \(\frac{\sum_{I \in fold} TP(I,t)}{\sum_{I \in fold} TP(I,t) + \sum_{I \in fold} FP(I,t)}\) (2)

recall(t) = \(\frac{\sum_{I \in fold} TP(I,t)}{\sum_{I \in fold} TP(I,t) + \sum_{I \in fold} FN(I,t)}\) (3)
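A minimal sketch of these fold-level metrics for binary water masks follows; it assumes predictions are per-pixel scores in [0, 1] and ground truths are binary masks, and all names are illustrative.

```python
import numpy as np

def confusion_counts(pred_scores: np.ndarray, gt_mask: np.ndarray, t: float):
    """Per-image TP, FP, FN at threshold t, per the definitions above."""
    pred = pred_scores > t  # keep only detections scoring above threshold t
    gt = gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    return tp, fp, fn

def precision_recall(fold, t: float):
    """Fold-level precision and recall, Eqs. (2) and (3).

    `fold` is an illustrative iterable of (pred_scores, gt_mask) pairs.
    """
    tps, fps, fns = 0, 0, 0
    for pred_scores, gt_mask in fold:
        tp, fp, fn = confusion_counts(pred_scores, gt_mask, t)
        tps, fps, fns = tps + tp, fps + fp, fns + fn
    precision = tps / (tps + fps) if (tps + fps) else 0.0
    recall = tps / (tps + fns) if (tps + fns) else 0.0
    return precision, recall
```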
The F1 score is computed as the harmonic mean of the precision and recall scores, reflecting their relative contributions. It is defined as
F1 = \(2 \times \frac{precision \times recall}{precision + recall}\) (4)
The F1 score ranges from 0 to 1, where 0 signifies a complete inability to detect any observation correctly and 1 indicates a perfect match of every observation with the ground truth. In addition, the Dice Score at threshold t, DSC(t), is a widely used metric in segmentation tasks. It measures the overlap between the predicted segmentation X and the ground truth Y, normalized by the total size of both the predicted and actual segmentations. A Dice Score of 1 indicates perfect detection, whereas a score of 0 indicates no overlap.
DSC(t) = \(\frac{2 \times \left|X\cap Y\right|}{\left|X\right| + \left|Y\right|}\) (5)
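Building on the sketch above, F1 (Eq. 4) and the Dice score (Eq. 5) can be computed as follows, again with illustrative names:

```python
import numpy as np

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, Eq. (4)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def dice_score(pred_scores: np.ndarray, gt_mask: np.ndarray, t: float) -> float:
    """DSC(t) = 2|X ∩ Y| / (|X| + |Y|), Eq. (5), for binary masks (illustrative)."""
    x = pred_scores > t        # predicted segmentation X at threshold t
    y = gt_mask.astype(bool)   # ground truth segmentation Y
    denom = x.sum() + y.sum()
    return 2 * np.logical_and(x, y).sum() / denom if denom else 0.0
```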