Acoustic tracking
Data collection was undertaken in the BIOT MPA between February 2013 and April 2019. Throughout the archipelago there have been situated up to 93 permanent and temporary acoustic receivers (VR2W, VR4-UWM, VR4G and VR2AR receivers, Vemco Inc, Nova Scotia, Canada), as configured in Fig. 1. The BIOT MPA is characterised by numerous small islanded atolls with submerged banks and reefs, with depths of 1,000 m or more separating each atoll or reef system (38). As such, acoustic receivers in the BIOT MPA are mainly based on areas accessible to divers, such as coral reef systems, with few receivers covering the deep pelagic waters of the region. This array was initially started in 2013, expanded throughout subsequent years [for more information see Carlisle et al. (6)], covering a perimeter of 700 km and an area of 25,500 km2 within the MPA, for the detection of acoustically tagged marine fauna. Of the 93 receivers, 82 are in depths 45 m or less. All receivers were situated far enough apart to avoid overlap in their detection range, with mean distance to closest receiver 2.15 km, with a range of 0.55 – 4.57 km. The frequency distribution of inter-receiver distances can be found in Additional file 1: Figure S1. Although range testing has not been undertaken for this array due to financial and logistical constraints of vessel time in the BIOT MPA, other studies conducted around coral atolls in the Indian Ocean using the same or similar equipment have found detection ranges between 300 and 500 m (14, 39-41). Maps of receivers at the different reef systems within BIOT, and an estimated 500 m detection range, can be found in Additional file 1: Figure S2-4.
We tagged 102 grey reef and 76 silvertip sharks with acoustic transmitters across nine different locations following the methodology described by Carlisle et al. (6). Of tagged grey reef sharks, 76 were female and 26 were male. Of the tagged silvertip sharks, 45 were female and 31 were male. As in previous studies (7, 42), silvertip sharks (mean total length, TL = 123.56 cm ± S.D. 19.14) were on average slightly larger than grey reef sharks (mean TL = 119.15 cm ± S.D 18.07). Detailed metadata for each tagged individual can be found in Additional file; Table S1. Tags acoustically transmit a unique ID code with a nominal delay of 60-180 secs for the duration of their battery life (~10 years), providing a long-term time series of detection data. Receivers were downloaded and serviced annually at the same time each year (March-May).
Only complete years were included in the analysis, with the final data set covering shark detections from 2014 – 2018. Prior to analysis, data from the seamounts were removed. Seamounts in the BIOT MPA differ ecologically from the rest of the archipelago and recent analyses suggest that they are potentially important drivers of silvertip shark movement (unpublished data). While we acknowledge the ecological importance of seamounts, we only had acoustic equipment on two of these features in the south of the MPA. Despite the small number of receivers on these sites, a significant number of detections were produced (1,165,546 from 2014-2018) that could significantly impact our analyses. As such, in order to robustly investigate segregation of reef sharks in coral reef habitat, seamount data were excluded.
Movement classification
Using a movement network approach, a ‘transition’ was deemed to have occurred when an individual left one receiver and arrived at another in a new location (26). Alternatively, an animal might leave a receiver, move out of detection range and then return to the same location, a ‘self-loop’ in network parlance, but here called a ‘recursion’. Classification of ‘restricted’ and ‘out of range’ activity was, therefore, inferred based on the duration of these two different types of movement (hereafter transitions and recursions) before being used as a binary response variable in subsequent models (Fig. 2).
All analyses were conducted in R version 3.6.0 (43). Classification of ‘restricted’ and ‘out of range’ movements was conducted using an optimal classification method, where similar data values are placed in the same class by minimising an objective measure of classification error (44). For recursions, detection gaps of less than six minutes (minimum of two detections) were removed from the data as an initial filter to ensure a recursion had taken place, rather than an animal had stayed in the same location but a detection had been missed. Time differences for recursive movements per species were log transformed to normalise the data, and the ‘classIntervals’ function in the classInt package (45) used to calculate thresholds between ‘restricted’ and ‘out of range’ movements. The Fisher algorithm was used, which determines thresholds by minimising the sum of absolute intra-class mean variance, as well as maximising inter-class mean variance (44, 46). This resulted in a threshold of 91 minutes for grey reef sharks and 64 minutes for silvertip sharks for ‘restricted’ activity, beyond which it was assumed that the shark had conducted an ‘out of range’ movement.
Transitions were subject to a separate filtering process. Unlike recursions, no initial filter was required for transitions as the detection of an individual on one followed by another receiver is immediately indicative of a movement from one location to another. Temporal gaps in the detection data for any given pair of receivers were informed by both the distance and species‑specific minimum sustainable swim speeds (0.69 m/s for grey reef sharks and 0.73 m/s for silvertip sharks) (47). For example, the predicted transition duration of a direct movement of a shark between two receivers, without deviation, would be the ratio between distance and speed. As such, by first calculating expected time for a transition using swim speeds and distance between receivers, the Relative Deviation from this Expected Time (RDET) between any pair of receivers was determined by dividing the expected transition time by the observed transition time. RDET values of > 1 were movements faster than expected, and values of < 1 slower/more tortuous than expected.
For transitions, log transformed RDET values were calculated for both species, and, as with recursions, the same optimal classification method for determining thresholds was used (44). Movement values of greater than the threshold value of 0.164 for grey reef sharks and 0.128 for silvertip sharks were determined as ‘restricted’, with values less than the thresholds determined as ‘out of range’. Animals rarely travel in straight lines and often vary in their tortuousity depending on factors such as resource use, habitat quality, competition and predation (48-50). These thresholds of 0.164 and 0.128 are, therefore, very conservative to allow for a tortuous movement to occur and still be classified as ‘restricted’ in each species. Finally, recursions and transitions were combined so that every movement was categorised as a binary response (restricted = 0, out of range = 1) (Fig. 2).
Data analysis
To explore the influence of explanatory variables on ‘out of range’ movements, an information theoretic approach was taken, which accounts for model selection uncertainty (51, 52). In recent years, information theoretic approaches have become a staple for modelling ecological systems, particularly those where explanatory models describing the system may have similar complexity and fit the data equally well, such as understanding the spatial distribution (53-55), behaviour (56, 57), and anthropogenic impact on survival of wildlife populations (58, 59). To limit exploratory analyses, and prevent model overfitting, an a priori selection of variables and interactions based on previous research and theory was conducted (52, 60, 61). Explanatory variables included in the model were ‘species’, ‘sex’, ‘size’, ‘season’ (wet/dry) and ‘diel period’ (day/night) (7, 17, 62-64). As size had a non-normal distribution it was log transformed. The BIOT MPA is located near the equator and has a roughly 12-hour day/night cycle. As such, day was designated from 0700 to 1900 and night from 1900 to 0700 following sunrise and sunset times obtained from https://www.timeanddate.com. The MPA experiences distinct Indian Ocean wet and dry seasons with wet season running from October to March, and dry season from April to September (65). Seasonal variability is often greater than monthly variability in tropical ocean systems (66, 67), and, therefore we deemed season a more biologically relevant driver of shark movement.
All variables used in the model were assessed for multicollinearity. Multicollinearity, which occurs when predictors in a multiple regression are highly correlated (68), was assessed by producing a variance inflation factor (VIF) using the ‘check_collinearity’ function in the performance package in R (69). VIF measures the degree of multicollinearity in a regression model by providing an index of how much the variance of the model variables increases due to collinearity (70). No evidence of collinearity was found, with all variables having a VIF ≤ 1.05, less than the critical threshold of 5.0 (see Additional file 1: Table S2) (68, 71). As such, all a priori selected explanatory variables were included in the global model.
A global model was subsequently created using a generalised linear mixed model (GLMM) (family = binomial, link = logit) in the glmmTMB package (72). To explore putative spatial and temporal segregation between grey reef and silvertip sharks, ‘species’ was included as interaction term with all explanatory variables and individual ID as a random factor. As the likelihood of a movement between locations decays as a function of distance (D. Jacoby unpublished data), receiver location was also included as an independent random factor. Residuals of the global model were checked for heteroscedasticity, autocorrelation and errors were checked for binomial distribution using the functions ‘resid’, ‘fitted’, and ‘acf’ from the stats package (43) and found free from autocorrelation and heteroscedasticity of residuals (Additional file 1: Figure S5).
To generate the model set from the global model, the ‘dredge’ function from the MuMIn package was used (73). Models in the set were ranked by Akaike Information Criterion (AIC) values. If no single parsimonious model results from the set and the weight of the best model is less than 0.9, model averaging is recommended over model comparison (61), and a confidence set, most likely to represent the system, selected. Only models with DAIC <2 were chosen for the confidence set (51, 74). A model averaging approach was then undertaken on the confidence set (61) using the ‘model.avg’ function in the MuMIn package (73). This function calculates Akaike weights based on the confidence set (61). Model averaged estimates and confidence intervals (97.5%) for each explanatory variable and interactions were calculated (52). The relative importance of each predictor variable (relative to other variables in the confidence set) was calculated by summing Akaike weights for all confidence set models containing them (51).
Model averaged estimate values indicate the probability of observing an ‘out of range’ movement as the value for continuous predictor variables increase. Categorical predictor variables were compared to the categorical variable level used as the model baseline. Positive estimates indicate an increased probability of ‘out of range’ and a decreased probability of ‘restricted’ movements; negative estimates, the opposite. It is important to note that using this method it is possible that some levels of categorical predictors may display a high relative importance value but show no significant result in the model averaged estimates, as these are dependent on the baseline chosen (75). Therefore, both the relative importance and model averaged estimate results should be considered in combination (56, 61). Marginal R2 (R2m) and conditional R2 (R2c) values were calculated, using ‘r.squaredGLMM’ in the MuMIn package (73, 76), to assess variance in the fixed effects and the combination of fixed and random effects, respectively (76, 77).
Model cross-validation
To assess the predictive capabilities of our final model, analysis was conducted on 80% of the data. Cross-validation of the model averaged estimate values from the confidence set of models was conducted on the remaining 20% of data as confirmatory analysis to quantify how well the selected model performed (52). The ‘predict’ function in the glmmTMB package was used to validate the expected outputs of the multi-model inference on the observed values from the reserved 20% of the data. Area under the receiver operating characteristic curve (AUC) values designate the probability that positive and negative instances are correctly classified (78). As such, AUC was calculated using the pROC package (79), as a threshold‑independent method to check the robustness of the model. An AUC value of greater than 0.7 indicates better than random performance and that the model is a good representation of the system being evaluated (78, 80, 81).