As a ubiquitous geological hazard, landslides occur frequently worldwide and seriously endanger human lives, infrastructure, and property (Cigna et al. 2018). A reliable landslide inventory is therefore of critical importance for timely quantitative hazard assessment, disaster relief, and subsequent risk governance, especially in mountainous regions with complex environments and poor transport access (Galli et al. 2008).
Traditionally, on-site manual investigation was the most commonly used approach to landslide detection; it required experts and investigators to reach the landslide area and evaluate the landslides' magnitude, position, and hazard mechanisms (Ji et al. 2020). However, this is a time-consuming, laborious, and sometimes risky task (Li et al. 2022). With the rapid development of remote sensing technology and the improved capacity to obtain geospatial information on a regional scale, remote sensing has been extensively employed in various fields, such as hydrology (Brinkhoff et al. 2022; Rajesh et al. 2021), geology (Barak et al. 2021), soil (Yu et al. 2022; Lourenco et al. 2022), environment (Petaja et al. 2022; Wang et al. 2021; Wiggins et al. 2021), and forestry (Eskandari et al. 2020; Altamirano et al. 2020). These works inspired the application of remote sensing technology to landslide detection. When high-resolution images are available, visual interpretation is a reliable method: interpretation markers are established to show the boundaries of landslides by considering the landslides’ colour, texture, shape, position, and other factors contained in the remote sensing images. Experts then use these markers to further analyse and confirm the landslide area (Tanoli et al. 2017). The basic principle of using remote sensing images for landslide interpretation can be summed up as the spectral difference caused by the loss of vegetation and the exposure of fresh soil and rock (Li et al. 2014).
Although visual interpretation has been shown to produce good results in landslide identification (Xu et al. 2015), it is usually inefficient and may suffer from interpreters' subjectivity (Fan et al. 2019). Thus, several computer vision and machine learning based interpretation methods (Li et al. 2022; Chen et al. 2020; Merghadi et al. 2020; Nhu et al. 2020; Han et al. 2019; Tien Bui et al. 2019; Fan et al. 2019; Sun et al. 2017; Li et al. 2016; Behling et al. 2016; Li et al. 2014), including linear regression (LR), support vector machine (SVM), random forest (RF), k-nearest neighbour (KNN), and thresholding methods, have been developed to address these issues. These studies laid a solid foundation for landslide research. However, they generally spend considerable time on feature analysis, which consists of enumerating and selecting landslide conditioning factors (e.g., NDVI, NDWI, NDBI, TWI). Meanwhile, some crucial parameters in these methods (e.g., the kernel function in SVM, the maximum tree depth in RF, the k value in KNN) require empirical tuning, which somewhat limits the methods' flexibility. For instance, as mentioned in our previous studies (Han et al. 2019; Li et al. 2014), the grey-level threshold is crucial for panchromatic image binarization in landslide detection, but how it should be determined is still debated.
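To illustrate why the choice of grey-level threshold matters, the following minimal sketch binarizes a small toy patch at three different thresholds and counts the pixels flagged as landslide. The pixel values, the threshold values, and the helper names `binarize` and `count_foreground` are hypothetical and chosen only for illustration; the sketch also assumes that exposed soil appears brighter than the vegetated background, which need not hold for every sensor or scene.

```python
def binarize(image, threshold):
    """Mark pixels at or above the grey-level threshold as 1, others as 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def count_foreground(mask):
    """Count the pixels flagged as candidate landslide (value 1)."""
    return sum(sum(row) for row in mask)


# Hypothetical 4x4 panchromatic patch (grey levels 0-255); the bright
# centre stands in for exposed soil, the dark border for vegetation.
patch = [
    [40, 55, 60, 45],
    [50, 130, 170, 58],
    [48, 150, 200, 120],
    [42, 52, 110, 47],
]

# Illustrative thresholds only; none of them is a recommended value.
counts = {t: count_foreground(binarize(patch, t)) for t in (100, 125, 160)}
print(counts)  # {100: 6, 125: 4, 160: 2}
```

On this toy patch the detected area shrinks from six pixels to two as the threshold rises from 100 to 160, which shows how strongly the delineated landslide region can depend on an empirically chosen parameter.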
Recently, convolutional neural networks (CNNs) have shown significant learning capacity in a variety of image-based tasks, including image classification (He et al. 2016; Simonyan and Zisserman 2014), object recognition (Ren et al. 2017), and semantic segmentation (Chen et al. 2018; Ronneberger et al. 2015). Many CNN architectures have been proposed and implemented during the last two decades (Howard et al. 2017; Chollet 2017; He et al. 2016; Huang et al. 2016; Simonyan and Zisserman 2014), each based on a different rationale. For landslides, CNNs were used to extract the hidden features of landslides from remote sensing images, which helped to determine whether a landslide exists at a given location (Ding et al. 2016). Subsequently, CNNs have been widely used to retrieve images containing landslides from image datasets. Many studies (Yu et al. 2017; Li et al. 2014) used specific algorithms, such as automatic thresholding and region growing algorithms, to delineate the landslide region.
To date, various CNN structures, with different convolutional layers, sizes, and input data channels, have been examined to analyse the impact of network settings on accuracy (Ghorbanzadeh et al. 2019). In addition, multi-channel inputs combining spectral and topographic features have been used to build CNN models (Su et al. 2021; Meena et al. 2021), which improved the accuracy of landslide detection. The convolution kernels of a CNN can automatically learn effective image features, which enables the semantic information of objects to be understood without the empirical determination of complex features (Lecun et al. 2015). Additionally, through network training, the convolution kernels can further emphasise the distinction between landslide and background, which is essential for landslide detection. Thus, CNNs are well suited to image segmentation and inspection of landslides.
Among CNN methods, the UNet model has proven to be an efficient method for landslide detection (Meena et al. 2022; Yu et al. 2021; Prakash et al. 2020; Lei et al. 2019). With a succession of convolution, pooling, and deconvolution layers, this method extracts the required information from remote sensing images (Badrinarayanan et al. 2017; Shelhamer et al. 2017). However, landslides appear at different scales in remote sensing images. For tiny and slender landslides, excessive convolution and pooling may cause a loss of texture information (Li et al. 2019; Gu et al. 2019). For instance, a \(25\times 25\) pixel landslide target shrinks to less than one pixel (\(25/2^{5}<1\)) after five pooling operations, so its texture information disappears from the feature map (Fig. 1). The lost texture information may lead to subsequent misjudgment of the landslide target. As a result, to successfully detect landslides, especially in a remote sensing image covering a large area, the UNet model needs to be revised so that it can detect landslide boundaries precisely.
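The arithmetic behind this example can be checked with a short sketch. It assumes that each pooling operation halves the side length with a stride of 2 and rounds down, as in unpadded \(2\times 2\) pooling; the function name `pooled_sizes` is hypothetical and is not taken from any library.

```python
def pooled_sizes(size, n_pool, stride=2):
    """Return the side length of a target after each successive pooling step.

    Assumes every pooling layer divides the side length by `stride`
    and rounds down (no padding).
    """
    sizes = [size]
    for _ in range(n_pool):
        size = size // stride
        sizes.append(size)
    return sizes


# Side length of a 25-pixel-wide target across five pooling operations.
footprint = pooled_sizes(25, 5)
print(footprint)    # [25, 12, 6, 3, 1, 0]

# Ratio without rounding, matching the inequality in the text.
print(25 / 2 ** 5)  # 0.78125
```

Under these assumptions the target occupies a single cell after four pooling operations and none after the fifth, which is consistent with the texture loss described above.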
In this paper, to address the potential loss of texture information caused by convolution and pooling in conventional CNN models, a reversed image pyramid feature (RIPF) boosted UNet algorithm for landslide detection is presented. In this method, the original UNet structure is used as the backbone network, while the reversed image pyramid features are employed to augment the features in the network's decoding phase and thereby increase the model's capacity to differentiate landslide areas against a complicated background. The RIPF-Unet model is expected to better distinguish tiny and slender landslides from other exposed surfaces. To verify its performance on RGB remote sensing images, the proposed RIPF-Unet model is trained and tested on an open-source dataset, and the adaptability and effectiveness of the method in a real-world area are further tested in the Longtoushan region after the 2014 Ms 6.5 Ludian earthquake.