Target species and presence records
In our study, we selected five EHs, along with the Crested Ibis, based on their popularity, coexistence, and the availability of occurrence data. These EHs include the Little Egret, Grey Heron, Chinese Pond Heron, Eastern Cattle Egret, and Black-crowned Night Heron. Field observations were conducted in Yangxian County and the surrounding 18 counties (32°02′-34°23′N, 106°11′-109°53′E) in Hanzhong and Ankang City, Shaanxi Province, China, from 2017 to 2020 (Fig. 1). To obtain specific geographic locations of nesting and foraging sites of the Crested Ibis, we conducted on-site fieldwork and utilized annual reports from the Shaanxi Hanzhong Crested Ibis National Nature Reserve. With the compilation of a comprehensive dataset, we collected a total of 238 occurrence points for the wild Crested Ibis. Presence records for the EHs were obtained from the China Bird Report database, available at http://www.birdreport.cn/. In total, we gathered 263 georeferenced occurrence records, including 68 records for the Little Egret, 36 for the Grey Heron, 42 for the Chinese Pond Heron, 62 for the Eastern Cattle Egret, and 55 for the Black-crowned Night Heron (Fig. 1). To address spatial autocorrelation, we applied a filtering process where we randomly selected one occurrence point within each 300 300m grid, corresponding to the maximum cell size of the environmental variables.
Environmental variables
To comprehensively examine the habitat preferences, environmental niches, and distribution dynamics of the Crested Ibis and the EHs coexisting in the study area, our model incorporated a wide range of environmental factors. These factors included climate conditions, topographical attributes, human impacts, and the availability of water and food resources. To incorporate plausible variations in future climate, we utilized climate projections for the years 2041-2060 (referred to as 2050) from the BCC-CSM2-MR, MIROC6, and CMCC-ESM2 Global Circulation Models (GCMs) featured in the sixth assessment report of the Intergovernmental Panel on Climate Change (IPCC6). We considered the Shared Socioeconomic Pathways 2-45 (SSP2-45) scenario for the year 2050, averaging three GCMs. The SSP2-45 scenario is characterized by a “middle of the road” development pathway, with moderate economic growth, a growing global population, and a moderate level of environmental protection. In this scenario, there is a gradual transition towards a more sustainable environmental condition, but it is not as rapid or comprehensive as in some other scenarios 48.
To obtain the bioclimatic variables, we sourced data from WorldClim, which has a spatial resolution of 30 arc-seconds (http://www.worldclim.org). The LULC variables were obtained from GLOBELAND30 (http://www.globeland30.org/), while the topographical variables were sourced from Geospatial Data Cloud (http://www.gscloud.cn). Lastly, the human influence variables were obtained from the Resource and Environmental Science and Data Center (https://www.resdc.cn/). We calculated the distance to variables using the Euclidean Distance tool in the Spatial Analyst in ArcGIS 10.8.
To reduce multicollinearity between the variables, we performed a subset selection of the variable set using Pearson's correlation coefficient. We selected the variables with a coefficient of |r| < 0.75, ensuring independence among the selected variables. Ultimately, we retained the ten most independent and ecologically significant environmental factors to build the niche models. These factors included Aspect, Precipitation of the coldest quarter (Bio19), Isothermality (Bio3), Temperature annual range (Bio7), Elevation, Farmland, LULC, Nightlight, Slope, and Distance to water (Table 1).
LULC simulation
To understand the dynamics of land in the study area and forecast its future trends, we conducted a LULC simulation using an advanced model known as the Patch-generating Land Use Simulation (PLUS) model 49. The PLUS model extends the well-recognized cellular automata (CA) model to capture intricate spatio-temporal interactions and the underlying rules governing land use changes 50,51. It incorporates a novel rule-mining strategy and a patch-generating mechanism, significantly enhancing its capability to represent the non-linear nature of LULC changes. This model is particularly suitable for simulating complex geographical evolution processes and has been widely employed in studies related to land use simulation 52-54. To implement the PLUS model in this study, we utilized two historical LULC maps as the fundamental input for future simulations. Additionally, we incorporated a total of 15 driving factors, including ten socioeconomic, two climatic, and three natural environment factors, which collectively contribute to the dynamics of land type transition (Table 1). The PLUS model consists of two main modules: the Land Expansion Analysis Strategy (LEAS) and the CA Model based on Multi-type Random Patch Seeds (CARS). We established specific parameters for the PLUS model as follows: (1) LEAS module: consisting of 20 regression trees, a sampling rate of 0.05, and mTry set to 15; (2) CARS module: with a neighborhood size of 3, patch generation rate of 0.7, expansion coefficient of 0.5, seed percentage of 0.0001, and neighborhood weights allocated as follows: 0.103 (Cropland), 0.495 (Forest), 0.151 (Grassland), 0.056 (Shrubland), 0.012 (Wetland), 0.039 (Water), and 0.376 (Urban).
To evaluate the accuracy of the model, we initially simulated the LULC for the year 2020 based on the development probability derived from the land type transition that occurred between 2000 and 2010. Subsequently, we employed the validation function within the PLUS model to compare the simulated LULC data for 2020 with the actual contemporaneous data obtained from GlobeLand30-2020 55. For this comparison, we utilized the Confusion Matrix calculation method, with a selected sampling rate of 0.05. To evaluate the model’s performance, we computed two key metrics: the overall accuracy (OA) 49 and the Kappa coefficient 56. Higher values of OA and the Kappa coefficient indicate greater accuracy, with a score above 0.8 generally considered to indicate statistically satisfactory model performance 57,58. Finally, we projected the LULC for the year 2050 to conduct future ecological niche analysis.
Ecological niche modeling
In our study, we utilized an ensemble modeling approach to enhance the calibration of ENMs for both the Crested Ibis and sympatric EHs. To create robust ENMs, we combined three widely used algorithms: Maximum Entropy (MaxEnt), Generalized Linear Model (GLM), and Random Forest (RDF). This ensemble approach allows us to capitalize on the strengths of each algorithm and improve the overall predictive performance.
To train the ENMs, we randomly divided the available data into two sets using the bootstrap random partition method. We allocated 80% of the data for model generation, and the remaining 20% was reserved for assessing the predictive accuracy of each model. The modeling process was implemented using the R package ENMTML 59, and the significance of environmental variables was assessed using "imp_var" function. To allocate pseudo-absence points within the background area, we used the "GEO_ENV_KM_CONST" method. This method combines environmental and geographical approaches with a k-mean non-agglomerative clustering procedure to distribute homogeneously on environmental space 60.
To evaluate the performance of each algorithm, we considered six different evaluation metrics: True Skill Statistic (TSS) 61, Boyce 62, Kappa 56, Sorensen 63, Jaccard 63, and Area Under the Curve (AUC) 64. Higher values for these metrics indicate greater accuracy and reliability of the individual ENMs. To investigate the potential distribution of the selected species, we constructed a final model by combining the results of all the algorithms using a weighted mean approach 65. This approach calculates suitability values by considering the performance of the algorithms. The weight of each algorithm was determined using the TSS values, as illustrated in the following equation:
present timeframe and the projected future scenarios for the 2050s. Finally, we used SDM toolbox 2.0 67 to estimate species richness, which represents the total number of species in a grid cell. We also calculated the areas encompassing various classes of richness that include the habitat of the Crested Ibis.
Quantification of niche similarities
To assess and compare the environmental niches of the Crested Ibis and EHs, we employed ENMTools 1.4.3 68, a valuable tool for automating ENM creation, calculating similarity measures, and conducting statistical comparisons of habitat distributions. To quantify the niche overlap between the species, we calculated two similarity metrics: Schoener’s metric D and Hellinger’s index I, derived from the ecological environmental niche space 69,70. Schoener's D considers the probability distributions of species occurrence across different regions or cells, measuring the overlap of niches based on species abundance in those locations. Hellinger's I, on the other hand, relies solely on probability distributions without assuming species abundance patterns. Both metrics range from 0 to 1, where 0 indicates no overlap or complete niche divergence, and 1 indicates complete overlap or identical niches. To assess the niche breadth for each species, we calculated the Levins' inverse concentration metric 68,71. Niche breadth quantifies the range of habitats occupied by a species. A value of 0 indicates that the species is restricted to a single grid cell, while a value of 1 suggests an equally suitable environmental distribution across all grid cells. Therefore, species with a wider environmental distribution have higher niche breadth values. Finally, to explore the critical environmental variables associated with the niche similarity between the Crested Ibis and EHs, we employed the Kernel density estimation method. If the Kernel density plots for each factor show significant overlap, it implies that the environmental preferences of the species share similar values or exhibit similar patterns in their distribution. This indicates a potential correlation or dependency between the factors for each species. Conversely, if the plots do not overlap much and are distinct, it suggests that the factors are independent of each other 72.