Since the 1950s, submarine landslides in the northern Gulf of Mexico (GoM) have been studied with an interest in protecting offshore structures for petroleum production (i.e., Shepard 1955; Coleman et al. 1978). With offshore energy infrastructure initially placed in shallow, nearshore waters off the GoM coast, early submarine landslide studies focused on the MRDF (i.e., Coleman et al. 1978, 1980). Since then, offshore projects have explored deeper waters to access additional resources with deepwater (water depth > 1,000 feet) and ultra-deepwater (water depth > 5,000 feet) activities (Bureau of Ocean & Energy Management 2008). With ongoing dependence upon marine infrastructure to support energy production, reliance on marine economies, and the likely advent of carbon storage projects, there is a need to map the potential of submarine landslides in remote areas of the seafloor. Currently, no basin-scale LSM applications for the northern GoM have been published. This is likely due to limited studies and data availability. However, with recent advances in sensor and seismic technology, open-sourced high-resolution bathymetry and spatial seafloor hazard data have become available, and there is an opportunity to perform LSM for offshore regions.
3.1 Study area
The case study area includes a portion of the northern GoM, parts of which are a hot spot for petroleum production in the U.S. Exclusive Economic Zone (EEZ). The specific boundary of the study area is the U.S. EEZ in the GoM where the water depth is greater than 120 meters (Fig. 1). This region extends out to 200 nautical miles offshore and covers a total area of 386,753 km2. Notable regions within the study area include the Texas (TX)-Louisiana (LA) Slope, the Sigsbee Escarpment, the Mississippi Canyon, the De Soto Canyon, and the Florida Escarpment. The GoM basin is the result of an extinct extensional regime that split the previously emplaced Louann salt sheet into the northern GoM salt basin and the Campeche salt basin (Galloway 2008). The Louann salt sheet forms the continental slope of the northern GoM before the Sigsbee Escarpment descends into the abyssal plain. Recent (late Neogene-present) sedimentation rates are greatest along the central GoM coast margin, with the heaviest sedimentation supplied by the Atchafalaya and Mississippi Rivers. The resulting progradation supplies massive amounts of sediments to the GoM basin that are concentrated in the Mississippi Canyon and then redistributed through geomorphologic processes along the continental shelf and onto the abyssal plain. Due to these high sedimentation rates, the seabed sustains a significant load that affects its stability. Further, the Louann salt sheet deforms and migrates continuously due to heavy clastic sedimentation rates that were initiated in the Late Jurassic-Early Cretaceous and continue to load the salt sheet and drive gravity tectonics in modern times (Galloway 2008). Salt diapirism is responsible for the many salt-withdrawal mini-basins that introduce high variability to the bathymetry, the differential accumulations of sediment, and much of the subsurface to seafloor structural complexity (i.e., faults and fractures). Dense structural complexity indicates a greater potential for fluid migration that utilize the high permeability provided by fractures to relieve fluid or gas in pressurized subsurface reservoirs. Where these fluid migration pathways breach the seafloor is indicated by seeps, mud volcanoes, hydrates, and chemosynthetic communities.
Four regions where historic landslides have been digitized were used for analysis (Dyer et al. 2022). Each region is labeled A-D from west to east across the study area, as shown in Fig. 1. Region A has an area of 8,508 km2 and is located within the TX-LA Slope salt basin. Region B has an area of 3,575 km2 and is also located within the TX-LA Slope salt basin. Region C has an area of 17,577 km2 and primarily covers a portion of the northern GoM salt basin and the western flank of the Mississippi Canyon. Region D has an area of 5,389 km2 and is located southwest of the De Soto Canyon.
3.2 Materials and methods
3.2.1 GIS feature database
A spatial database was curated containing 20 features relating to factors that are known to affect submarine landslide potential in the northern GoM. Topographic features include elevation, aspect, slope, and curvature (IOC et al. 2003). Geomorphology features include basins, canyons, and escarpments (Harris et al. 2014)as well as channels (Bureau of Ocean & Energy Management 2016a). Geological factors include faults (Diegel et al. 1995; United States Geological Survey 2004a), salt diapirism (United States Geological Survey 2004b), sediment thickness (Twichell et al. 1995), and sediment type (Buczkowski et al. 2020). Lastly, geochemical factors include gas presence, pockmarks, mud volcanoes, and seeps (Bureau of Ocean & Energy Management 2016a) as well as hydrates (Twichell et al. 1996; Majumdar et al. 2017). All spatial features that are represented as points, lines, or polygons were converted to a continuous surface by using the Euclidean Distance tool provided by ArcGIS Pro (Environmental Systems Research Institute 2022). Derivatives of elevation including aspect, slope, and curvature were created using the Surface Parameters tool in ArcGIS with a 3,000 meter search (Environmental Systems Research Institute 2022). The sediment type data acquired from the usSEABED database (Schweitzer et al. 2020) was broken up into features depicting the percent coverage of sand, mud, gravel, and rock. All rasters are sampled to a 500-meter spatial resolution. Source information and maps of each feature can be found in Table S1 and Fig. S1, respectively (Online Resource 1).
It is important to note that even though ocean surface waves can trigger landslide events in water depths below approximately 120 meters (Henkel 1970; Maloney et al. 2020), this factor is not included in this study. Therefore, analysis for this case study was limited to water depths greater than 120 meters in the U.S. EEZ. Furthermore, visual inspection of regional sediment accumulation rate predicted by Restreppo et al. (2020) in Fig. S2 (Online Resource 1) shows that there is little variation in sediment accumulation rate for the study region. For that reason, sediment accumulation rate is not included in this case study.
The historic submarine landslide inventory utilized in this study as observational data was curated by Dyer et al. (2022). This dataset represents boundaries of where sediment volume was lost (also referred to as the depletion area) of historic submarine landslides in four regions within the study area (Fig. 2), offering a dataset with minimal false negatives (i.e., non-identified landslides). These landslide observations were added to the GIS database by rasterizing to the same grid as the input features. Region 1 had 34,049 observations with 1,674 (4.92%) of them having a positive landslide class. Region 2 had 14,161 observations with 892 (6.3%) of them having a positive landslide class. Region 3 had 70,301 observations with 4,342 (6.18%) of them having a positive landslide class. Lastly, region 4 had 21,527 observations with 714 (3.32%) of them having a positive landslide class.
3.2.2 Mutual information
To assess the statistical relationships between the landslide and non-landslide groups, Mutual Information (MI) is performed on each input feature with the landslide class as the target variable. MI is a grounded concept in information theory that is used to estimate the dependency or relatedness between two discrete groups (Ross 2014). Here, we measured the information content between two continuous groups (landslide and non-landslide) by performing an entropy estimation using a nearest-neighbors method designed by Kraskov et al. (2004) and Ross (2014). By calculating the MI of each input feature between the landslide and non-landslide classes, the value that each feature may have in discerning between the two classes was estimated. MI values closer to zero indicate that the feature is independent of the two landslide classes, and values closer to one indicate that the feature has a high dependency between the two classes. Therefore, features with higher MI values will likely provide more valuable information for classification modelling. This functionality is provided by scikit-learn (Pedregosa et al. 2011).
3.2.3 Collinearity
A Pearson correlation matrix was created between all the input variables to assess the amount of collinearity between the input features. Values range from − 1 to 1. Here, it was assumed that any two variables with a correlation equal to or greater than 0.8 as well as equal to or lower than − 0.8 are highly correlated and it is unwise to include both features in the ML models. Between two highly correlated features, the feature with the lowest MI score was dropped from analysis.
3.2.4 Frequency analysis
Frequency analysis was completed for each of the remaining features to provide statistical results that may indicate higher or lower landslide susceptible groups of feature values. For each feature, the continuous values are split up into a set of predefined bins. For each bin, the total and percent of pixel values within that range bin is given, as well as for the total and percent of positive landslide classes that fall within that range bin. The frequency analysis is provided in Table S2 (Online Resource 1).
3.2.5 Predictive modelling
This study utilizes a GBDT model to predict the binary target (landslide or non-landslide) and forecast landslide susceptibility. This type of ML model is an ensemble method which combines several weak prediction models (decision trees) to create a highly accurate model. GBDTs are robust to outliers and can model non-linear feature interactions (Friedman 2002; Elith et al. 2008). GBDTs have been shown to perform with high accuracy in LSM studies (Song et al. 2018; Sahin 2020). Here, the eXtreme Gradient Boosting (XGBoost) algorithm was utilized as the GBDT model, provided by T. Chen and Guestrin (2016). Additionally, LR has been utilized for LSM (Raja et al. 2017) and is reported as a baseline and is provided by scikit-learn (Pedregosa et al. 2011).
To assess the predictive accuracy of a classifier and its ability to distinguish between landslide and non-landslide observations, a permutation method was used. This approach performs training and testing on three ratios of training-testing sets: 1:1 (one training region, one testing region), 2:1 (two training regions, one testing region), and 3:1 (three training regions, one testing region). For each ratio, all permutations of the possible arrangements for training and testing sets using the four training-testing regions were completed. This approach measures how the amount of training data available affects accuracy and offers an assessment of accuracy over all four training-testing regions.
The following ML workflow (Fig. S3, Online Resource 1) was performed on each permutation of training and testing sets. A randomized search method was used to tune several parameters of each model (Fig. S3, Fig. S4, Online Resource 1), which executes 10 parameter permutations and evaluated model performance using stratified k-fold cross validation (CV) on the training set. With the tuned parameters, an optimized model pipeline was fitted to the training set. For the GBDT model, the testing set was used as early stopping criteria during the optimized model phase with training ceased once the Area Under the Receiver Operating Characteristic Curve (ROC AUC, but hereby referenced as AUC) score did not increase after 10 training iterations (epochs). Early stopping can be a critical part of GBDT models due to the potential of overfitting to the training set.
Prior to model fitting throughout the ML workflow, the training and testing sets were sent through a data processing pipeline. First, all input features are scaled from 0 to 1. Next, all missing values are filled using k-nearest neighbor (KNN) imputation with five neighbors (Cover & Hart 1967; Triguero et al. 2019). Lastly, due to an imbalance in the two target classes, under-sampling is performed on the training set to reduce the number of non-landslide observations until equal to the number of landslide observations.
Model performance was evaluated with the testing set of each permutation using standard metrics for a binary classification problem including accuracy, precision, recall, and AUC. An accuracy score measures the proportion of correctly classified observations. A precision score measures the proportion of positive predictions (landslide) that were actually positive. A recall score measures the proportion of the positive observations that were correctly classified. Lastly, the AUC score compares the true positive rate (sensitivity) and the false positive rate (specificity) at various thresholds. The AUC score measures the ability of a classifier to distinguish between two classes. For all metrics, values range from 0 to 1, where lower values indicate poor performance and higher values indicate good performance.
3.2.6 Feature attribution
SHapely Additive exPlanations (SHAP) (Lundberg & Lee 2017) was used to assess the importance that each feature has in landslide classification. The SHAP method is an application of game theory and can be used to estimate Shapely values (Shapley 1997) which provide consistent and accurate feature attribution values. TreeSHAP, a SHAP method designed for tree-based models, is utilized to estimate SHAP values and enables fast computation speeds (Lundberg et al. 2018). Here, SHAP values reported are the absolute weighted average of Shapely values for each tree in a GBDT model and represent the contribution that a feature has in determining the outcome (prediction) of the model. SHAP values are estimated using the testing set for each permutation, allowing for the assessment of the distribution of feature attribution over the four training-testing regions.
3.2.7 Landslide susceptibility mapping
A landslide susceptibility map was produced for the entire study area to visualize the spatial patterns of submarine landslide susceptibility. The final map was created by training each model with the input data from all four training-testing regions and then predicting the probability of the landslide class from 0 to 1 for each pixel over the study area. Since the landslide observations from the dataset by Dyer et al. (2022) represent a variety of types, including slumps and flows, the predictions represent the potential for the mass transport of sediment that is inclusive of all types of transport mechanisms. Additionally, because the landslide boundaries represent the depletion areas, the landslide susceptibility maps will be applicable to predicting potential areas for landslide initiation and source sediment. The Jenks Natural Breaks Classification method is used to classify the landslide susceptibility prediction into risk classes representing very low, low, medium, high, and very high (Jenks 1967), which has been utilized to visually represent the degree of landslide risk (Chen et al. 2017; Song et al. 2018; Shahri et al. 2019; Wang et al. 2020). A cumulative density function (CDF) plot is additionally supplied for each map to provide context for the distribution of values within each risk class.