Of the 1193 records (grid cells) observed by HyBIS in Area 1, 231 (19.4%) contained at least one Sarostegia oculata. In Area 2, Shinkai observed S. oculata in 29 of 255 records (11.4%). The hexactinellid was more abundant near the rift walls in both areas and was found only on hard substrates, most of which had a ferromanganese crust.
Model performance was good for every metric calculated, for both validation and test data. AUCROC values (Fig. 4a) were > 0.9, except for ANN, which performed slightly worse. All models had good AUCPRG scores > 0.8 (Fig. 4b), and the RF, BRT, MaxEnt, GAM, and Ensemble models had values close to one for the test data, indicating a good ability to detect presence records. Similarly, all models showed high Sensitivity (Fig. 4c) and Specificity (Fig. 4d) for the validation data, meaning that a high proportion of observed presences and absences, respectively, were correctly predicted. The RF, BRT, MaxEnt, and Ensemble models had higher sensitivity for the test data than for the validation data, whereas the GAM and ANN models performed worse. Specificity was also high for the test data, although slightly lower for the RF model. For the TSS metric, the ANN model had the worst overall performance, while the BRT, MaxEnt, and Ensemble models had the highest values for the test data (Fig. 4e). Friedman’s Aligned Rank test showed a significant difference between models for the five statistics. The post-hoc test showed that the most significant differences occurred when comparing GAM and ANN with the other models (Appendix D). The threshold values used to discriminate between presence and absence were 0.248, 0.126, 0.304, 0.121, 0.187, and 0.225 for the RF, BRT, MaxEnt, ANN, GAM, and Ensemble models, respectively.
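To make the threshold-dependent metrics concrete, the following minimal sketch shows how Sensitivity, Specificity, and TSS (Sensitivity + Specificity - 1) can be derived once a threshold converts predicted likelihoods into presence/absence. The observations and predictions are hypothetical, and treating values at or above the threshold as presences is an assumption; only the threshold of 0.225 (Ensemble model) is taken from the text.

```python
def confusion_counts(observed, predicted_prob, threshold):
    """Count TP, FP, TN, FN after thresholding predicted likelihoods."""
    tp = fp = tn = fn = 0
    for obs, prob in zip(observed, predicted_prob):
        # Assumption: a likelihood at or above the threshold counts as presence.
        pred = 1 if prob >= threshold else 0
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 0 and pred == 1:
            fp += 1
        elif obs == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity_tss(observed, predicted_prob, threshold):
    """Return (sensitivity, specificity, TSS) for one threshold."""
    tp, fp, tn, fn = confusion_counts(observed, predicted_prob, threshold)
    sensitivity = tp / (tp + fn)  # proportion of observed presences predicted
    specificity = tn / (tn + fp)  # proportion of observed absences predicted
    return sensitivity, specificity, sensitivity + specificity - 1.0


# Hypothetical records: 1 = S. oculata present, 0 = absent.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [0.90, 0.40, 0.10, 0.30, 0.05, 0.20, 0.02, 0.15]

sens, spec, tss = sensitivity_specificity_tss(observed, predicted, threshold=0.225)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} TSS={tss:.3f}")
```

Because TSS combines both error types, a model can score high on one of Sensitivity or Specificity and still obtain a modest TSS.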
The calibration plots show that the predictions of the five models are poorly calibrated against the true probability of presence (Fig. 5). The ideal curve (dotted line) lies below the lower confidence limit of the fitted calibration curve, indicating that the true probability of presence is much larger than the estimate given by the models. Only for the RF (Fig. 5a) and MaxEnt (Fig. 5c) models did the ideal curve lie above the upper confidence limit at low probability values. These results indicate that the models have high discrimination power, i.e. the ability to correctly distinguish between occupied and unoccupied sites, but that the model output should not be interpreted as an estimate of the conditional probability of presence.
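The idea behind a calibration check can be illustrated with a simple binned comparison of mean predicted likelihood against observed frequency of presence. This is a simpler stand-in for the fitted calibration curves with confidence intervals shown in Fig. 5, not the method used to produce them, and all data below are hypothetical.

```python
def calibration_table(observed, predicted_prob, n_bins=5):
    """Return (mean predicted, observed frequency, count) per equal-width bin."""
    bins = [[] for _ in range(n_bins)]
    for obs, prob in zip(observed, predicted_prob):
        # Clamp so that a prediction of exactly 1.0 falls in the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        bins[idx].append((prob, obs))
    rows = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for p, _ in members) / len(members)
        obs_freq = sum(o for _, o in members) / len(members)
        rows.append((mean_pred, obs_freq, len(members)))
    return rows


# Hypothetical predictions and observations (1 = present, 0 = absent).
predicted = [0.05, 0.10, 0.15, 0.25, 0.30, 0.35, 0.45, 0.50]
observed = [0, 0, 1, 1, 0, 1, 1, 1]

rows = calibration_table(observed, predicted)
for mean_pred, obs_freq, count in rows:
    print(f"predicted={mean_pred:.2f} observed={obs_freq:.2f} n={count}")
```

In a well-calibrated model the two columns would roughly agree. In this toy example the observed frequency exceeds the mean prediction in every bin, which is the same direction of miscalibration described above.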
Generally, all models predicted suitable habitat along the full extent of both the NE and SW rift borders in Area 1, and along the NE rift border in Area 2. For Area 1 (Fig. 6), the RF model predicted a more extensive distribution, with a relatively higher likelihood on the bottom of the rift than in the middle of the plateaus. The other models had a relatively low (< 0.1) likelihood on the bottom of the rift, on top of the plateaus, in the south canyon, and in the lower plane regions to the southwest. All models predicted a larger area of high likelihood around the north end of the NE rift border. However, only the RF and ANN models extended this area across the small terrace on the NE plateau. Models predicted high suitability in the area between the east side of the canyon and the SW plateau. The predicted likelihood extended to the other side of the canyon for the MaxEnt model, and even further around the inner SW plateau for the RF and BRT models. The south slide of the SW plateau showed high predicted suitability as well, except for the GAM model. For Area 2 (Fig. 7), models predicted a high likelihood near the slope of the NE rift border, between 700 and 1000 m depth. All models except RF showed low (< 0.1) suitability at the rift bottom and on top of the plateau.
The suitable habitat predicted by the ensemble model reflected the average of the five individual models. The predicted distribution followed the environmental variables included in the model, namely depth and slope. The region with bottom depths between 700 and 1000 m and nearby slopes > 20 degrees contained a continuous band of high predicted suitability. Areas of low modeled uncertainty corresponded to the main areas predicted as highly suitable on the rift borders in both Area 1 and Area 2 (Fig. 8), together with the majority of the SW plateau and the area around the inner SW plateau. High uncertainty was obtained for the rift bottom, the canyon, the lower plane regions, and some areas on top of the plateaus.
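As a rough illustration of how an ensemble prediction and its uncertainty can be derived per grid cell, the sketch below averages the five model outputs and uses their standard deviation as a measure of disagreement. The unweighted mean and the standard deviation are both assumptions, since the text does not specify how the ensemble was weighted or how uncertainty was computed. The cell names and likelihood values are hypothetical.

```python
from statistics import mean, pstdev


def ensemble_cell(predictions):
    """Return (ensemble mean, between-model standard deviation) for one cell.

    predictions maps a model name to its predicted likelihood for the cell.
    """
    values = list(predictions.values())
    return mean(values), pstdev(values)


# Hypothetical likelihoods for two grid cells.
cells = {
    "rift_border_cell": {"RF": 0.82, "BRT": 0.78, "MaxEnt": 0.80, "GAM": 0.76, "ANN": 0.84},
    "rift_bottom_cell": {"RF": 0.40, "BRT": 0.05, "MaxEnt": 0.08, "GAM": 0.03, "ANN": 0.30},
}

summary = {name: ensemble_cell(preds) for name, preds in cells.items()}
for name, (avg, spread) in summary.items():
    print(f"{name}: mean={avg:.3f} sd={spread:.3f}")
```

In this toy example the models agree on the rift border cell (small spread) and disagree on the rift bottom cell (large spread), mirroring the pattern of low uncertainty on the rift borders and high uncertainty on the rift bottom.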
The importance of the environmental variables varied across the modeling algorithms, but depth, fine BPI, and northness had a high influence in most models (Table 2). For the RF model, all variables showed a low importance index (< 0.1), indicating that this model uses all variables to predict the presence of S. oculata and that changing a single variable has little effect on its output. Only depth and fine BPI had higher importance relative to the other variables. For the BRT, MaxEnt, and GAM models, depth, fine BPI, northness, and curvature had a high influence on the likelihood of S. oculata. However, depth had a higher influence for BRT and GAM than for MaxEnt, and northness had a higher influence for MaxEnt and GAM than for BRT. For the ANN model, most variables showed a high influence on the output, except for broad BPI and eastness.
Table 2
Mean index of the importance of each predictor variable across 100 permutations in the training dataset, for the Random Forest (RF), Boosted Regression Trees (BRT), MaxEnt, Generalized Additive Models (GAM), and Artificial Neural Networks (ANN) models.
Variables | RF | BRT | MaxEnt | GAM | ANN |
depth | 0.09 | 0.388 | 0.113 | 0.572 | 0.553 |
slope | 0.035 | 0.009 | 0.034 | 0.051 | 0.183 |
broad BPI | 0.016 | 0.002 | 0.000 | 0.000 | 0.066 |
fine BPI | 0.072 | 0.296 | 0.441 | 0.588 | 0.44 |
rift distance | 0.049 | 0.008 | 0.004 | 0.000 | 0.349 |
northness | 0.044 | 0.112 | 0.435 | 0.762 | 0.338 |
eastness | 0.031 | 0.003 | 0.007 | 0.000 | 0.081 |
rugosity | 0.036 | 0.011 | 0.008 | 0.000 | 0.19 |
curvature | 0.037 | 0.046 | 0.106 | 0.094 | 0.304 |
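A permutation-based importance index of the kind reported in Table 2 can be sketched as follows. One common formulation, assumed here and not confirmed by the text, shuffles one variable at a time and takes one minus the Pearson correlation between the original and the permuted predictions, averaged over the permutations. The toy model, variable ranges, and random data are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant predictions")
    return cov / math.sqrt(vx * vy)


def permutation_importance(model, rows, variable, n_permutations=100, seed=0):
    """Mean of 1 - correlation(original, permuted predictions) for one variable."""
    rng = random.Random(seed)
    baseline = [model(row) for row in rows]
    scores = []
    for _ in range(n_permutations):
        shuffled = [row[variable] for row in rows]
        rng.shuffle(shuffled)
        # Replace only the chosen variable; all other variables stay in place.
        permuted_rows = [dict(row, **{variable: v}) for row, v in zip(rows, shuffled)]
        permuted = [model(row) for row in permuted_rows]
        scores.append(1.0 - pearson(baseline, permuted))
    return sum(scores) / len(scores)


def toy_model(row):
    """Hypothetical model: strong depth effect, weak slope effect, no eastness effect."""
    depth_term = 1.0 if 700 <= row["depth"] <= 1000 else 0.1
    slope_term = min(row["slope"] / 40.0, 1.0)
    return 0.8 * depth_term + 0.2 * slope_term


data_rng = random.Random(1)
rows = [
    {"depth": data_rng.uniform(500, 1500),
     "slope": data_rng.uniform(0, 40),
     "eastness": data_rng.uniform(-1, 1)}
    for _ in range(50)
]

importance = {name: permutation_importance(toy_model, rows, name)
              for name in ("depth", "slope", "eastness")}
for name, value in importance.items():
    print(f"{name}: {value:.3f}")
```

Under this formulation, a variable the model ignores yields an index of zero, because shuffling it leaves the predictions unchanged. Low indices for every variable, as for RF, mean that no single shuffled variable changes the predictions much.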
In general, models showed similar response patterns across the gradient of each environmental variable. Depth (Fig. 9a): the MaxEnt, GAM, and ANN models showed a low response at depths greater than ~ 1000 m and shallower than ~ 700 m, with a peak in the predicted likelihood of S. oculata within ~ 700–1000 m. For the RF and BRT models, the response remained high from deeper sites up to ~ 800 m, where it peaked, and was low at shallower depths. Slope (Fig. 9b): models had a low response at flat sites, which increased on steeper slopes. The largest variation in the response to slope was found in the ANN model. Broad BPI (Fig. 9c): the response was high around 70 for the RF, BRT, and MaxEnt models, whereas the likelihood predicted by GAM and ANN was higher at lower broad BPI values (< 0) and lower at broad BPI > 100. Fine BPI (Fig. 9d): RF, BRT, and ANN showed a lower response at fine BPI < 0 and a higher response at values > 0, with the ANN model producing the largest variation. The MaxEnt response was higher at negative fine BPI, while GAM was unresponsive to this variable, as shown by a flat horizontal line in the plot. Rift distance (Fig. 9e): all models except ANN showed a similar pattern, with peaks in response near the rift (< 2000 m) and from 4500 to 7000 m, and a low response between 2000 and 4500 m and in regions more distant than 7000 m. The ANN model differed, with a high response near the rift that decreased steadily with distance up to 10,000 m. The MaxEnt, GAM, and ANN models had very low predicted outputs far from the rift (> 20,000 m). Northness (Fig. 9f): the RF, BRT, and MaxEnt models produced peaks in response at sites facing north and south. GAM generated a slightly higher response at sites facing north than south, whereas ANN generated a higher response at sites facing south than north. Eastness (Fig. 9g): models predicted a slightly higher response at sites facing either east or west. However, GAM was unresponsive to this variable as well. Rugosity (Fig. 9h): models had a low response at sites with low rugosity, which increased in areas with higher values. Curvature (Fig. 9i): only the ANN model showed a large variation in response to curvature, with a peak at sites with curvature close to zero. RF and BRT, in contrast, showed a smaller response at these sites. MaxEnt and GAM were unresponsive to this variable.
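Response curves such as those in Fig. 9 are typically built by varying one variable along its observed range while the others are held fixed. The sketch below holds the remaining variables at their mean, which is an assumption, as the text does not state how the curves were generated. The toy model and data are hypothetical.

```python
import math


def response_curve(model, rows, variable, n_points=5):
    """Predict along a gradient of one variable, holding the others at their mean."""
    names = rows[0].keys()
    reference = {name: sum(r[name] for r in rows) / len(rows) for name in names}
    low = min(r[variable] for r in rows)
    high = max(r[variable] for r in rows)
    step = (high - low) / (n_points - 1)
    curve = []
    for i in range(n_points):
        value = low + i * step
        # Override only the variable of interest in the reference record.
        curve.append((value, model(dict(reference, **{variable: value}))))
    return curve


def toy_model(row):
    """Hypothetical unimodal depth response, scaled by slope."""
    depth_effect = math.exp(-((row["depth"] - 850.0) / 150.0) ** 2)
    slope_effect = min(row["slope"] / 20.0, 1.0)
    return depth_effect * slope_effect


# Hypothetical records (depth in m, slope in degrees).
rows = [
    {"depth": 500.0, "slope": 5.0},
    {"depth": 850.0, "slope": 25.0},
    {"depth": 1200.0, "slope": 15.0},
]

curve = response_curve(toy_model, rows, "depth")
for depth, response in curve:
    print(f"depth={depth:.0f} m response={response:.3f}")
```

A model that is unresponsive to a variable, as described for GAM and eastness, would return the same value at every point of the gradient, giving a flat line.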