2.1 Study location and fishery data
This study focused on the Indian Ocean from May to July of 2000 to 2016, because these three months were the most productive in our previous study (Mondal et al. 2021). Fishery data were collected from 0° to 50°S and 20°E to 120°E. Taiwan’s Overseas Fisheries Development Council (OFDC) supplied the albacore tuna fishing data. The initial data set comprised 433,876 records on a 1° × 1° spatial grid. Each record consisted of the year, month, latitude, longitude, catch number, effort (number of hooks), and weight (the logbook did not distinguish dry from wet weight). Monthly nominal CPUE (N. CPUE) was calculated as the number of individuals caught per 10³ hooks.
$$\mathrm{CPUE} = \frac{C}{E} \times 10^{3}$$
where CPUE is the N. CPUE (individuals per 10³ hooks) in each 1° × 1° grid cell, C is the number of individuals caught, and E is the effort (in number of hooks). Fishery data at 1° × 1° resolution resolve habitat features more finely than 5° × 5° data. This study also applied the commonly used generalized linear model (GLM) to standardize the N. CPUE and remove bias resulting from spatial and temporal factors. A model with four factors, namely year, month, latitude, and longitude, was used to construct the GLM for standardization as follows:
$$\log(\mathrm{CPUE} + c) = \mu + \mathrm{Year} + \mathrm{Month} + \mathrm{Latitude} + \mathrm{Longitude} + \varepsilon$$
where CPUE is the N. CPUE; c is a constant equal to 0.1% of the mean nominal catch rate, as commonly used for standardization (Su et al. 2008; Lan et al. 2018); µ is the intercept; and ε is a normally distributed error term with a mean of zero.
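The nominal CPUE and the response variable of the GLM can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the three example records are hypothetical, and the factor of 1,000 assumes that CPUE is expressed per 10³ hooks while effort is recorded in hooks.

```python
import math


def nominal_cpue(catch_number, hooks):
    """Individuals caught per 1,000 hooks for one 1° x 1° monthly record."""
    if hooks <= 0:
        raise ValueError("effort must be positive")
    return catch_number / hooks * 1000.0


def log_response(cpue_values, fraction=0.001):
    """Return (log(CPUE + c) for each value, c).

    c is a fixed fraction of the mean nominal CPUE; fraction=0.001
    mirrors the 0.1% stated in the text.
    """
    c = fraction * sum(cpue_values) / len(cpue_values)
    return [math.log(v + c) for v in cpue_values], c


# Hypothetical records: (catch in individuals, effort in hooks)
records = [(12, 3000), (0, 2500), (45, 9000)]
cpue = [nominal_cpue(catch, hooks) for catch, hooks in records]
y, c = log_response(cpue)
```

Adding c keeps zero-catch records in the model, because log(0) is undefined. The values in `y` would then be regressed on year, month, latitude, and longitude as categorical factors.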
2.2 Environmental data
2.2.1 Historical environmental data
Historical environmental data, namely sea surface temperature (SST) and sea surface chlorophyll with a 2-month lag (SSC2), were collected from May to July of 2000 to 2016 from 0° to 50°S and 20° to 120°E (Table 2). Monthly SST data in °C were collected from the HadISST data set (https://coastwatch.pfeg.noaa.gov/erddap/griddap/erdHadISST.html), which is produced at the Met Office Hadley Centre and has a spatial resolution of 1° × 1°. Monthly SSC2 data in mg m−3 were collected from the global ocean biogeochemical hindcast (https://resources.marine.copernicus.eu/products), which is produced at Mercator Ocean (Toulouse, France) and has a spatial resolution of 0.25° × 0.25°. Because the fishery and SST data were on a 1° spatial grid, the SSC2 data were rescaled to the same grid using MATLAB software (version 2019a).
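The rescaling from 0.25° to 1° can be illustrated with simple block averaging. The text does not state which resampling method was applied in MATLAB, so the approach below is an assumption, and `block_average` and the example patch are hypothetical.

```python
def block_average(grid, factor=4):
    """Average non-overlapping factor x factor blocks of a 2-D list.

    Cells set to None (e.g. land or cloud) are skipped; a block with
    no valid cells yields None.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            cells = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)
                     if grid[i][j] is not None]
            row.append(sum(cells) / len(cells) if cells else None)
        coarse.append(row)
    return coarse


# Hypothetical 4 x 8 patch of 0.25° chlorophyll (mg m^-3) -> 1 x 2 cells at 1°
fine = [[0.1] * 4 + [0.3] * 4 for _ in range(4)]
fine[0][0] = None  # one masked cell
coarse = block_average(fine)
```

Four 0.25° cells span 1° in each direction, which is why the default factor is 4.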
Table 2. Environmental data sources and their descriptions.

| Environmental data | Temporal resolution | Spatial resolution | Source | Time period |
|---|---|---|---|---|
| Sea surface temperature | Monthly | 1° × 1° | AQUAMODIS | 2000–2016 |
| Sea surface chlorophyll (2 months lag) | Monthly | 0.25° × 0.25° | COPERNICUS | 2000–2016 |

2.2.2 Projected climate change data
All projected environmental data were obtained from the IPCC climate models under RCP scenarios 2.6, 4.5, and 8.5 and were downloaded from the Earth System Grid Federation. The RCPs are named for their approximate radiative forcing in 2100: RCP 2.6 is a peak-and-decline pathway, RCP 4.5 is a stabilization-without-overshoot pathway to 4.5 W m−2, and RCP 8.5 is a rising radiative forcing pathway leading to 8.5 W m−2 (approximately 1370 ppm CO2 equivalent). RCP 4.5 involves long-term global emissions of greenhouse gases, short-lived species, and land use and land cover, and it stabilizes radiative forcing at 4.5 W m−2 (approximately 650 ppm CO2 equivalent) in the year 2100 without ever exceeding that value. We used the average of the different IPCC climate models to reduce the bias that can result from relying on a single climate model (Table 3). The data were collected for May to July of 2040, 2070, and 2100 to analyze the short-term, medium-term, and long-term effects of projected climate change on the distribution pattern of immature albacore tuna under all three RCP scenarios. All climate model data were interpolated to a 1° spatial grid using MATLAB software (version 2019a) to match the fishery data. The six atmosphere–ocean general circulation models (AOGCMs) used in this study are listed in Table 3.
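Interpolating a climate model field onto a 1° grid point can be sketched with bilinear interpolation. The interpolation scheme used in MATLAB is not specified in the text, so bilinear weighting is an assumption here, and `bilinear` and the example values are hypothetical.

```python
from bisect import bisect_right


def bilinear(field, lats, lons, lat, lon):
    """Bilinearly interpolate field[i][j], defined at (lats[i], lons[j]).

    lats and lons must be strictly increasing and must bracket the target.
    """
    if not (lats[0] <= lat <= lats[-1] and lons[0] <= lon <= lons[-1]):
        raise ValueError("target point lies outside the source grid")
    # Index of the lower-left corner of the cell containing the target
    i = min(bisect_right(lats, lat) - 1, len(lats) - 2)
    j = min(bisect_right(lons, lon) - 1, len(lons) - 2)
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    low = field[i][j] * (1 - tx) + field[i][j + 1] * tx
    high = field[i + 1][j] * (1 - tx) + field[i + 1][j + 1] * tx
    return low * (1 - ty) + high * ty


# Hypothetical model SST (°C) on a 1.5° grid, sampled at a 1° grid point
lats = [-30.0, -28.5]
lons = [60.0, 61.5]
sst = [[18.0, 19.5], [19.5, 21.0]]
value = bilinear(sst, lats, lons, lat=-29.0, lon=61.0)
```

Applying the same function at every node of the 1° target grid, for each model in turn, puts all models on a common grid so that they can be averaged cell by cell.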
Table 3. Sources of projected environmental data from different climate models.

| Institute | Code | Resolution | Parameters |
|---|---|---|---|
| Institut Pierre-Simon Laplace | IPSL | 1° × 1° | SST, SSC2 |
| Geophysical Fluid Dynamics Laboratory | GFDL | 0.31° × 1° | SST, SSC2 |
| Commonwealth Scientific and Industrial Research Organization | CSIRO | 1.5° × 1° | SST, SSC2 |
| Hadley Centre Global Environment Model | HadGEM | 0.31° × 1° | SST, SSC2 |
| Max Planck Institute for Meteorology | MPI | 1° × 1° | SST, SSC2 |
| Canadian Earth System Model | CanESM | 1.5° × 1° | SST, SSC2 |

2.3 Model development
Suitability index (SI) curves were constructed for each parameter with the standardized immature albacore CPUE by using smoothing spline regression to identify the relationship between immature albacore relative abundance and their varying environmental preferences (Haghi et al. 2013). This approach elucidated the relationship between standardized CPUE (S. CPUE) and environmental variables. In the regression analysis, the S. CPUE was the dependent variable, and all environmental parameters were the explanatory variables. The SI for albacore was established by applying the S. CPUE and all environmental variables and was then normalized as follows (Tian et al. 2009; Lee et al. 2020):
$$SI = \frac{Y - Y_{\min}}{Y_{\max} - Y_{\min}}$$
where $Y_{\max}$ and $Y_{\min}$ are, respectively, the maximum and minimum observed values of the S. CPUE or environmental variables, and $Y$ is the predicted value lying between $Y_{\min}$ and $Y_{\max}$; thus, the SI values can range only from zero to one.
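The normalization to the zero-to-one range can be sketched as a min-max rescaling of the spline-predicted values. The function name and the example values are hypothetical.

```python
def normalize_si(values):
    """Min-max scale values to [0, 1]; a constant series maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical spline-predicted S. CPUE at the midpoints of five SST classes
predicted = [0.8, 2.4, 4.0, 3.2, 1.6]
si = normalize_si(predicted)
```

The class with the highest predicted S. CPUE receives an SI of one and the class with the lowest receives zero.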
The SI value was calculated using the summed frequency distribution of the S. CPUE of each class, and SI values were assumed to be between zero and one. The midpoints of each environmental variable class interval were used as the observed values to fit the SI models, and, finally, the relationship between the SI and environmental variables was calculated using the following formula (Chen et al. 2009; Lee et al. 2018; Lee et al. 2019):
$$SI = \exp\left[\alpha (m + \beta)^{2}\right]$$
where $m$ denotes the environmental variable, and $\alpha$ and $\beta$ are estimated by nonlinear least squares to minimize the residual between the observed SI and the SI function.
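The least-squares estimation of α and β can be sketched as follows, assuming the Gaussian-type form SI = exp[α(m + β)²] that is common in the cited HSI literature. An exhaustive grid search stands in here for a proper nonlinear least-squares solver, and the function names, candidate grids, and synthetic observations are all hypothetical.

```python
import math


def si_curve(m, alpha, beta):
    """Assumed SI function: exp(alpha * (m + beta)^2), with alpha < 0."""
    return math.exp(alpha * (m + beta) ** 2)


def fit_si_curve(midpoints, si_obs, alphas, betas):
    """Return (alpha, beta, sse) minimizing the sum of squared residuals."""
    best = None
    for a in alphas:
        for b in betas:
            sse = sum((s - si_curve(m, a, b)) ** 2
                      for m, s in zip(midpoints, si_obs))
            if best is None or sse < best[0]:
                best = (sse, a, b)
    return best[1], best[2], best[0]


# Synthetic SI observations at SST class midpoints (°C), peaking at 18 °C
midpoints = [14.5 + k for k in range(8)]
si_obs = [si_curve(m, -0.05, -18.0) for m in midpoints]

alphas = [-0.001 * k for k in range(1, 201)]        # -0.001 to -0.2
betas = [-(10.0 + 0.1 * k) for k in range(0, 201)]  # -10 to -30
alpha, beta, sse = fit_si_curve(midpoints, si_obs, alphas, betas)
```

With this form, −β is the value of the environmental variable at which the SI equals one, and α controls how quickly suitability falls away from that optimum.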
After construction of the SI curves, an arithmetic mean model (AMM)-based HSI was constructed from the SIs of SST and SSC2 as follows:
$$\mathrm{HSI}_{\mathrm{AMM}} = \frac{SI_{\mathrm{SST}} + SI_{\mathrm{SSC2}}}{2}$$
An HSI greater than 0.6 was considered high (Mondal et al. 2021).
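The AMM and the 0.6 threshold can be sketched as below, assuming that the SST and SSC2 terms in the AMM are the suitability indices of each variable rather than the raw values. The function names and the two example cells are hypothetical.

```python
def hsi_amm(si_sst, si_ssc2):
    """Arithmetic mean of two suitability indices, each in [0, 1]."""
    for v in (si_sst, si_ssc2):
        if not 0.0 <= v <= 1.0:
            raise ValueError("SI values must lie in [0, 1]")
    return (si_sst + si_ssc2) / 2.0


def is_high_suitability(hsi, threshold=0.6):
    """True when the HSI exceeds the threshold used in the text."""
    return hsi > threshold


# Hypothetical (SI_SST, SI_SSC2) pairs for two grid cells
cells = {"cell_a": (0.9, 0.5), "cell_b": (0.4, 0.6)}
hsi = {name: hsi_amm(*pair) for name, pair in cells.items()}
high = [name for name, value in hsi.items() if is_high_suitability(value)]
```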
2.4 Ensemble HSI forecasting
Consensus forecasting does not assume an expected frequency distribution for the combined forecasts; instead, ensemble forecasting produces a measure of central tendency (e.g., the mean or median) to reduce the uncertainty in future habitat projections under climate change (Araujo et al. 2005; Araujo and New 2007). Because the mean is more sensitive to outliers than the median, consensus forecasting was applied to the AMM with SST and SSC2 to combine the habitat predictions derived from the six AOGCMs, and the median of the ensemble forecasts was used to reduce the inter-model variance (Lefebvre and Goosse 2008) in the projected climate change impact on the habitat of immature albacore tuna in the Indian Ocean. Ensemble prediction was performed using the caretEnsemble package in R (version 1.2.1335). The monthly HSI predictions were mapped on a 1° × 1° spatial grid using ArcGIS (version 10.2) for all three scenarios and all three time periods (2040, 2070, and 2100).
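The effect of taking the median rather than the mean across models can be illustrated for a single grid cell. This sketch uses Python's standard library rather than the R package named in the text, and the HSI values are hypothetical.

```python
from statistics import mean, median


def ensemble_summary(hsi_by_model):
    """Return (median, mean) of the HSI predicted by each model for one cell."""
    values = list(hsi_by_model.values())
    return median(values), mean(values)


# Hypothetical HSI for one grid cell from six models, one of them an outlier
cell_hsi = {"IPSL": 0.62, "GFDL": 0.65, "CSIRO": 0.60,
            "HadGEM": 0.64, "MPI": 0.63, "CanESM": 0.10}
med, avg = ensemble_summary(cell_hsi)
```

In this example the single low prediction pulls the mean below the 0.6 threshold, whereas the median stays close to the other five models.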
2.5 Identification of potential habitat hotspots (PHH)
The locations of habitat hotspots, that is, the high-probability areas for immature albacore tuna in the Indian Ocean, were determined using environmental probability indices. SST and SSC2 were used to predict the potential habitat hotspots (PHHs). Probabilities were estimated from histograms relating total S. CPUE and total fishing effort to each environmental variable. First, for each variable, a probability index was calculated by dividing the total S. CPUE in each histogram interval by the maximum total S. CPUE; the index for fishing effort was calculated in the same way. For a given interval, the maximum probability value reflects the highest fishing effort frequency and total S. CPUE and therefore indicates the best environmental conditions (hotspots). Finally, the probability indices from the SST and SSC2 interval ranges were averaged, and this average was used to generate a probability map for all interval ranges of the oceanographic conditions. A probability index of more than 0.7 indicated a habitat hotspot.
$$PI_{\mathrm{S.CPUE}} = \frac{1}{n}\sum_{i=1}^{n}\frac{\mathrm{S.CPUE}_{ij}}{\mathrm{S.CPUE}_{i\max}}, \qquad PI_{F} = \frac{1}{n}\sum_{i=1}^{n}\frac{F_{ij}}{F_{i\max}}$$
where PHI is the probability index used to identify the PHHs; $PI_{\mathrm{S.CPUE}}$ is the mean probability index for immature albacore tuna based on the relationship between the S. CPUE and the two environmental parameters for each interval; $PI_{F}$ is the mean probability index based on the relationship between fishing frequency and the environmental parameters; $\mathrm{S.CPUE}_{ij}$ is the value of the S. CPUE in relation to environmental parameter $i$ for class interval $j$; $\mathrm{S.CPUE}_{i\max}$ is the maximal S. CPUE for each environmental parameter; $F_{ij}$ is the value of fishing frequency in relation to environmental parameter $i$ for class interval $j$; $F_{i\max}$ is the maximal fishing frequency for each environmental parameter; and $n$ is the total number of variables.
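The probability indices can be sketched as follows: each class-interval total is divided by the maximum total for that variable, and the resulting indices are averaged over the variables. The function names and histogram totals are hypothetical, and the same calculation would apply to fishing frequency in place of S. CPUE.

```python
def probability_indices(totals_by_variable):
    """Divide each class-interval total by that variable's maximum total."""
    indices = {}
    for variable, totals in totals_by_variable.items():
        peak = max(totals)
        if peak <= 0:
            raise ValueError("maximum total must be positive")
        indices[variable] = [t / peak for t in totals]
    return indices


def mean_index(indices, interval_of):
    """Average the indices over variables, given each variable's interval."""
    picked = [indices[var][j] for var, j in interval_of.items()]
    return sum(picked) / len(picked)


# Hypothetical total S. CPUE per class interval for each variable
scpue_totals = {"SST": [10.0, 40.0, 20.0], "SSC2": [5.0, 25.0, 50.0]}
pi = probability_indices(scpue_totals)

# A grid cell whose SST and SSC2 values both fall in class interval 1
cell_pi = mean_index(pi, {"SST": 1, "SSC2": 1})
is_hotspot = cell_pi > 0.7  # threshold stated in the text
```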