Climatic data provide clues on the geographic origin of a spider invasion in the Americas

Identifying the source population of an invasive species is important for assessing its distribution and potential effects in the invaded area. The araneid spider Cyrtophora citricola is widely distributed in Europe, Asia, and Africa; however, in the last twenty years, it has been reported in several countries across the Americas. To date, the geographic origin of the populations established in America remains unclear, but considering the successful colonization after its recent arrival, a high environmental similarity between the invaded and native geographic distributions is expected. In this study, we compared the environmental characteristics of two possible native regions (southern Africa and the Mediterranean) and the invaded region (America) to determine the more likely origin of the populations established in America. We found that the South African populations of C. citricola occupy environments with climatic conditions similar to those of the American populations, and these similarities are greater than those shared with the Mediterranean populations. Therefore, our results support a southern African, rather than a Mediterranean, origin for the populations established in America. In addition, our results show that populations in America are expanding into environments that differ from those of the native populations. Further studies assessing intrinsic (e.g. physiological tolerances, plasticity, behavior, reproduction) and extrinsic (e.g. physical barriers, predator release) factors could help disentangle the mechanisms behind this expansion.
We would like to thank Dr. Angela Chuang and Dr. Ren Chu Cheng for providing information about the occurrence of this species in the United States and Africa, Dr. Ingi Agnarsson for sharing the preliminary results of the phylogenetic analysis of Cyrtophora, and Dr. Yael Lubin for guidance and valuable information about this spider's natural history. We also thank the Dirección General de los Asuntos del Personal Académico (DGAPA), National Autonomous University of Mexico, for supporting AGR as a postdoctoral fellow at the Instituto de Biología, UNAM; and the Vicerrectoría de Investigación of the Universidad de Costa Rica (project C0-252) for providing financial support to GB.


Introduction
The arrival and establishment of any species in a recipient area is a process that depends primarily on the species' dispersal capacity, the propagule pressure (Simberloff 2009 Biological invasions are common in nature, but long-distance dispersal events are rare, because they are often restricted by physical and climatic barriers (Diamond 1984 , which is the focus of the present study (Levi 1997). Cyrtophora citricola is reported as native to the Old World (Levi 1997), ranging from Africa and Europe to part of Asia, with extensive desert regions separating the groups of populations between Europe and Africa. This species has recently invaded the American continent, with the first report in 1996 (Levi 1997; Víquez 2007). However, the specific region from which this species migrated to America remains unknown. Given the environmental differences between the regions of its native distribution (Cowling et al. 1996; Köppen et al. 2011; Peel et al. 2007), identifying the native region from which the New World populations of C. citricola most likely originated provides information for a better understanding of its expansion and impact in the Americas.
Considering the disjunct native distribution of this species, we hypothesize the southern African region and the Mediterranean region as the two possible origins of the American invasion. Preliminary molecular phylogenetic analyses suggest a closer relationship between the American and the South African populations than between the American and the Mediterranean populations, although this remains unresolved (Agnarsson et al. unpubl.). The Mediterranean origin is considered as another possible scenario given the higher similarity in morphology and behavior between Mediterranean and American populations than between South African and American populations (Y. Lubin pers. comm.). Additionally, maritime commercial trade is more intense between the Mediterranean region and America (38542 Twenty-foot Equivalent Units (TEU) in 2016; American Association of Port Authorities, AAPA) than between South Africa and America (2770 TEU, ~7% of the Mediterranean trade). Considering that many populations of C. citricola in the Mediterranean and southern African regions occur near coastal areas (Blanke 1972), and that this species has a long-lasting web, there is a high chance that this species was introduced to America via merchant vessels.
In this study, we first assessed the set of environmental variables that best explains the distribution of C. citricola in the Mediterranean and southern African regions. Then, to provide support for the geographic origin of the invasion of C. citricola in the Americas, we applied different geographic and environmental approaches to test which of these two sets of climatic information (Mediterranean or southern African) predicts the current distribution of this spider in America more precisely. Considering that the invasion of C. citricola in America is very recent, we expect that established populations occupy habitats with environmental conditions similar to those of the native region (Peterson 2003; Peterson 2011). Therefore, the environmental variables of the region occupied by C. citricola in America should overlap more with the set of variables from the region of origin than with those from the other candidate region.

Species occurrences
We compiled 2795 geo-referenced occurrence points of C. citricola from five different sources. We obtained 258 data points from the Global Biodiversity Information Facility (GBIF.org; accessed on March 29th, 2018; https://doi.org/10.15468/dl.hi6ahq), 18 from SpeciesLink (http://splink.cria.org.br/, accessed on April 4th, 2018), and 662 from the Royal Museum of Central Africa database. We also obtained 78 records from different literature sources (Online Resource 1) and collected 13 points in the field in Costa Rica that we geo-referenced using Google Earth. Additionally, our colleague Angela Chuang kindly provided 1574 data points from the USA, collected as part of her own research.
We removed duplicated and inaccurate data points (e.g. points that fell in the ocean) from the database prior to conducting the analyses by projecting all points on a global map. Then we filtered the remaining data using the R package spThin (Aiello-Lammens et al. 2015) to remove all data points located less than 5 km from any other point and to guarantee a maximum of one record per cell at the resolution of our climatic layers (Online Resource 2). With this procedure, we generated 32 data points for South Africa and the southeastern part of Mozambique (hereafter, the South African region), 108 for the Mediterranean region, and 122 for America (Fig. 2).
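The distance-based filtering described above can be illustrated with a minimal sketch. It is not the spThin algorithm itself (which uses a randomized procedure); it is a simple greedy filter written only to show what a 5 km thinning distance means. The function names and the toy coordinates are hypothetical.

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def thin_points(points, min_km=5.0):
    """Greedy thinning: keep a point only if it lies at least min_km
    from every point already kept (assumed simplification, not spThin)."""
    kept = []
    for p in points:
        if all(haversine_km(p, q) >= min_km for q in kept):
            kept.append(p)
    return kept

# Toy records (hypothetical coordinates): the second lies about 1 km from the first.
records = [(9.93, -84.08), (9.94, -84.08), (10.50, -84.08)]
thinned = thin_points(records)
```

Because the filter is greedy, the result depends on the order of the records; a randomized procedure repeated several times avoids that dependence.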

Environmental information
To quantify environmental conditions throughout C. citricola's distribution range, we used the 19 WorldClim bioclimatic variables for all analyses. We used data from the WorldClim database Version 2 (Fick and Hijmans 2017, http://www.worldclim.org), at a 2.5 arcmin resolution (approximately 20 km² near the Equator). Additionally, we constructed three further environmental grids using information on monthly precipitation, minimum temperature, and wind speed from the same climatic database (Fick and Hijmans 2017). We obtained the mean value from the monthly precipitation layers and then calculated the Average Annual Precipitation (hereafter AAP). We extracted the minimum temperature value for each raster cell across the 12 monthly layers of minimum temperature to obtain the Minimum Annual Temperature (hereafter MinAT). Finally, we constructed a Maximum Annual Wind Speed (hereafter MWS) layer by extracting the maximum value for each cell across the monthly layers of wind speed. We included this last variable because newborn spiderlings of this species are known to disperse by wind (Johannesen et al. 2012).
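The three derived layers amount to cell-wise summaries over the 12 monthly rasters. A minimal sketch follows, assuming each variable has already been loaded as a NumPy array of shape (12, rows, cols); the function name and the toy stack are hypothetical, and reading the AAP as the cell-wise mean of the monthly layers is our interpretation of the description above.

```python
import numpy as np

def derive_layers(prec, tmin, wind):
    """Collapse monthly stacks of shape (12, rows, cols) into single layers."""
    aap = prec.mean(axis=0)    # mean of the 12 monthly precipitation layers
    min_at = tmin.min(axis=0)  # lowest monthly minimum temperature per cell
    mws = wind.max(axis=0)     # highest monthly wind speed per cell
    return aap, min_at, mws

# Toy 12-month stack on a 2 x 2 grid (arbitrary values, not WorldClim data).
stack = np.arange(48, dtype=float).reshape(12, 2, 2)
aap, min_at, mws = derive_layers(stack, stack, stack)
```

Real rasters usually contain no-data cells (e.g. ocean); in that case the NaN-aware variants `np.nanmean`, `np.nanmin`, and `np.nanmax` would be the safer choice.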
For the analyses described below, we created a subset of environmental variables. For each Species Distribution Model (SDM) and the respective multivariate environmental similarity surface (MESS) analyses (Online Resource 3), we selected the subset of variables with the lowest collinearity. For this, we used the step-wise approach coded by M. W. Beck and available at https://gist.github.com/fawda123/4717702, which calculates the Variance Inflation Factors (VIF).
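The idea behind a step-wise VIF selection can be sketched as follows. This is not the code from the gist cited above; it is an independent minimal illustration in which the variable with the largest VIF is dropped until all remaining VIFs fall below a threshold. The threshold of 10 and all names are assumptions, since the cutoff used in the study is not stated here.

```python
import numpy as np

def vif(X):
    """VIF of each column, taken as the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def stepwise_vif(X, names, threshold=10.0):
    """Drop the variable with the largest VIF until all VIFs are <= threshold.
    The threshold value is an assumption, not taken from the study."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        values = vif(X)
        worst = int(np.argmax(values))
        if values[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return names

# Toy data: 'c' is almost an exact linear combination of 'a' and 'b'.
rng = np.random.default_rng(42)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + b + rng.normal(scale=0.05, size=200)
kept = stepwise_vif(np.column_stack([a, b, c]), ["a", "b", "c"])
```

The VIF of a variable equals 1 / (1 - R²), where R² comes from regressing that variable on all the others; the diagonal of the inverse correlation matrix gives the same quantity without fitting each regression separately.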

Model calibration
To identify the possible native region of the American C. citricola, we created three sets of candidate models: one set for each of the two potential native regions (South Africa, Mediterranean) and one for the invaded region (America). We generated the sets of candidate SDMs with different parameterizations using the ENMeval R package (Muscarella et al. 2014). We generated each model using variations of two different parameters: (1) regularization multipliers that generate penalty values which help to select simpler models (see Elith  We calibrated all candidate models by delimiting an area of 1000 km around the presence points of each region, so that the calibration area would include enough background area containing both environments where the species could be present and environments where the species is likely absent. To avoid spatial autocorrelation between testing and training points, we used two different data partitioning methods implemented by Muscarella et al. (2014). For the South African model, we used the Blocks method, which divides the data into four bins with an equal number of occurrences but allows bins to vary in geographic size (Muscarella et al. 2014). This method has been recommended in cases where spatial extrapolation is needed (Muscarella et al. 2014), as in this region, which has few occurrences that also tend to be grouped (Fig. 2). For the Mediterranean and American models, we used the Checkerboard 2 partitioning method, which, like the Blocks method, divides the data into four bins, but facilitates the inclusion of isolated occurrences without altering the geographic size of the bins (Muscarella et al. 2014). Therefore, we considered this method appropriate for the scattered occurrences we have for these two regions.
We selected the model with the best evaluation metrics from each set of candidate models, and this was the model used for the projections onto other regions. To determine how accurately each native model predicted the known occurrences of spiders in the invaded region, we projected each of the two native models onto America. Then we created a third model, calibrated on the environmental conditions of the invaded region, and projected it onto the two possible native regions to cross-validate the accuracy of our native models.
To evaluate the performance of each set of models and select the best-fitted model for each region, we used four selection criteria in the following priority order: (i) the lowest 'Minimum training presence' omission rate (OR_MTP), (ii) the highest Area Under the Curve (AUC_TEST), (iii) the lowest 10% training omission rate (OR_10), and (iv) the lowest number of parameters. For details regarding these criteria, see Muscarella et al. (2014). The model that best fitted these criteria was selected to run the subsequent projections and analyses.
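A priority-ordered selection of this kind can be expressed as a lexicographic comparison. The sketch below is only an illustration: the dictionary keys, the candidate names, and the metric values are hypothetical, and treating each criterion as a strict tie-breaker for the previous one is an assumption about how the priority order was applied.

```python
def select_best(models):
    """Pick the model with the lowest OR_MTP; ties are broken by the highest
    AUC_TEST, then the lowest OR_10, then the fewest parameters."""
    return min(
        models,
        key=lambda m: (m["or_mtp"], -m["auc_test"], m["or_10"], m["n_params"]),
    )

# Hypothetical candidate models (values invented for illustration only).
candidates = [
    {"name": "A", "or_mtp": 0.02, "auc_test": 0.85, "or_10": 0.12, "n_params": 9},
    {"name": "B", "or_mtp": 0.02, "auc_test": 0.88, "or_10": 0.15, "n_params": 12},
    {"name": "C", "or_mtp": 0.05, "auc_test": 0.93, "or_10": 0.10, "n_params": 5},
]
best = select_best(candidates)
```

Here models A and B tie on the first criterion, so the second criterion (higher AUC_TEST) decides in favor of B, even though C has the highest AUC overall.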

Geographic and environmental similarity
We extracted the suitability values assigned by each selected native model to the occurrences of C. citricola in America. These values were generated by each model for each cell in the region where it was projected, after correlating and fitting the environmental variables to the occurrences included for the region used for calibration (Warren and Seifert 2011). We compared the suitability values of both native models using a t-test.
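The comparison of suitability values can be sketched as follows. The fractional degrees of freedom reported in the Results suggest the unequal-variance (Welch) form of the t-test, but that is an inference on our part rather than something stated in the text. The function name and the toy samples are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with possibly unequal variances."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny  # squared standard errors
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Toy samples (not suitability values from the study).
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

The sketch stops at the statistic and its degrees of freedom; obtaining the p-value additionally requires the cumulative distribution of Student's t, which a statistics library would normally supply.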
To analyze the climatic niche overlap among each native region and the invaded region, we used Schoener's D index (Schoener 1968

Results
The parameters and evaluation metrics of the three selected models showed an overall good performance, with omission rates below 0.09 and 0.13, and AUC above 0.8 (Table 1). The projection of the South African model predicted a larger environmental suitability for C. citricola across America than the projection of the Mediterranean model (Fig. 3). The suitability values obtained for the occurrence points in America from the South African model were slightly higher (0.86 ± 0.10 SD) than those obtained from the Mediterranean model (0.82 ± 0.32 SD); however, this difference is not statistically significant (t = -1.53, df = 141.38, p = 0.13, Fig. 4a). It is worth noting that the MESS map analysis showed that the South African model produces higher extrapolation values than the Mediterranean model when projected onto America (Online Resource 3, Fig. S.2a). The variables that had the highest contribution to each model are shown in Table S1 (Online Resource 4).
Nonetheless, the projections of the American model onto each native region (Figs. 4-5) show that the suitability values assigned by the American model to the South African occurrences were significantly higher than those assigned to the Mediterranean occurrences (t = -4.01, df = 31.19, p = 0.0003, Fig. 4b), with an average suitability value of 0.13 (±0.16 SD) for the South African occurrences and 0.01 (±0.02 SD) for the Mediterranean occurrences. The extrapolation values of the American model projected onto the two native regions did not differ markedly from each other (Online Resource 3, Fig. S.2b).
The climatic niche overlap between the South African and American populations is moderate to low (D = 0.29, I = 0.44), but is higher than the overlap between the Mediterranean and American populations (D = 0.12, I = 0.20). Additionally, according to the PCA-env analysis, the environmental characteristics of the geographic distribution occupied by C. citricola in South Africa had a greater overlap (Stability 58%) with those of its distribution in America than did those of the Mediterranean distribution (Stability 20%) (Fig. 6). This result also indicates that this spider has occupied new environmental combinations (Expansion) in America relative to the conditions existing in the native regions: at least 42% based on the South African model, and 80% based on the Mediterranean model. The first axis of the PCA-env comparing the Mediterranean with America (Fig. 6) was mostly explained by precipitation variables (Precipitation of the driest month = 19.91%, Precipitation of the warmest quarter = 17.80%, and Precipitation of coldest quarter = 17.12%), while the second axis was represented by annual climatic variation (Precipitation seasonality = 51.09%, Temperature annual range = 15.59%, and Maximum Wind Speed = 15.21%). In the PCA-env illustrating the relationship between South Africa and America (Fig. 6), the variables with the highest contribution to the first axis were Precipitation of the driest month (20.67%), Mean diurnal range (19.47%), and Precipitation of coldest quarter (18.88%). For the second axis, the variables with the highest contribution were the same as for the Mediterranean (Precipitation seasonality = 32.43%, MWS = 28.91%, and Temperature annual range = 27.42%).
The supplementary analyses also showed greater similarities in environmental conditions between the American and the South African regions than between the American and the Mediterranean regions. For instance, the overlap in the distribution of each of the 22 variables (Online Resource 3, Fig. S.3-4), as well as the overlap obtained with the PCA (Online Resource 3, Fig. S.5), showed a larger similarity of America with the South African region than with the Mediterranean region.
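For readers unfamiliar with the overlap indices reported above, the following sketch shows how D and I can be computed from two occupancy density grids defined over the same environmental space. We assume here that I denotes the Hellinger-based index that is commonly reported alongside Schoener's D; the function name and the toy densities are hypothetical.

```python
import numpy as np

def niche_overlap(z1, z2):
    """Schoener's D and the Hellinger-based I for two non-negative
    density grids defined over the same cells."""
    p1 = np.asarray(z1, dtype=float).ravel()
    p2 = np.asarray(z2, dtype=float).ravel()
    p1 = p1 / p1.sum()  # normalize each grid so its cells sum to 1
    p2 = p2 / p2.sum()
    d = 1.0 - 0.5 * np.abs(p1 - p2).sum()
    i = 1.0 - 0.5 * ((np.sqrt(p1) - np.sqrt(p2)) ** 2).sum()
    return d, i

# Toy densities over three environmental cells (not data from the study).
d_val, i_val = niche_overlap([1, 1, 0], [0, 1, 1])
```

Both indices range from 0 (no overlap) to 1 (identical densities), so values such as D = 0.29 and D = 0.12 can be compared directly on the same scale.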

Discussion
In this study, we focused on identifying the possible geographic origin of the invasion of C. citricola in America by determining which of two possible native regions shares more environmental characteristics with the invaded region. Our results indicate that the invasive populations of C. citricola in America inhabit environments more similar to those of the South African region than to those of the Mediterranean region. When we projected the American model onto the native regions, we obtained a higher suitability for the South African region. In addition, both the D and I indices, the PCA-env, the density curves, and the PCA also showed that the environmental conditions occupied by C. citricola in America are more similar to those in the South African region than to those occupied by this species in the Mediterranean region.
The successful establishment of a given species in a new geographic area is largely determined by the environmental features of the recipient area and species-specific life-history traits, such as dispersal capability, demographic structure (e.g., sex ratio), and adaptability to different environmental conditions and to novel biotic interactions (e.g., a new set of predators and parasites) (Brown et al. 1996; Guisan and Zimmermann 2000). After arrival, the environmental conditions could play a fundamental role in species establishment (Nuñez and Medley 2011; Peterson 2003; van Wilgen and Richardson 2012). Species arriving in places with conditions similar to those of the native area are more likely to succeed in their establishment than species arriving at sites with a different combination of environmental conditions (Peterson 2003). Given that the invasion of C. citricola is recent, the significantly higher similarity of the environmental conditions between American and South African populations supports the hypothesis that this species dispersed from the South African region to America. This is congruent with preliminary genetic results, which also suggest a South African origin for the American populations (Agnarsson et al. unpubl.), but we cannot confidently discard different origins for the recent invasion of C. citricola in America until a more extensive phylogenetic study includes populations from a wide range of the American distribution.
Our analyses also showed that, during the last two decades, C. citricola has occupied in America a new set of environmental conditions not present in the native regions analyzed (Fig. 6). This is also evident in the lack of reciprocity between invaded and native regions, which resulted in the low suitability values obtained for the native occurrences when the invaded model was projected onto both native regions (Figs. 4b-5). Two possible processes could explain these results. First, C. citricola in America may not be facing the same environmental and biological constraints as in its native regions (Blanke 1972; Brown et al. 1996). In America, C. citricola could be exploiting resources and tolerating conditions different from those present in the native regions, likely because those conditions are still within the species' physiological tolerance thresholds. However, such conditions may not be within the geographic reach of the species in the native regions due to external constraints (e.g. mountain ranges, extensive desert areas, parasites, predators) (Broennimann et al. 2007). Second, the populations of C. citricola in America may have rapidly changed their physiological tolerance thresholds compared to those of native populations and, therefore, adapted to this new set of environmental conditions due to a change in intrinsic (rather than extrinsic) constraints (Yoshida et al. 2007). However, these two possibilities remain to be examined in further detail, as there are no studies on these issues for this species.
Like other invasive species, C. citricola has several traits that facilitate its expansion. This spider is a generalist predator, so diet does not represent a major limitation (Chauhan et al. 2009). It has a high reproductive rate: one female can produce several egg sacs during a single reproductive season (Chauhan et al. 2009; Leborgne et al. 1998). It has a dispersal method (ballooning) that allows rapid expansion into new areas (Teruel et al. 2014), and the species is also highly tolerant of disturbed environments, which favors its establishment in open areas around cities (Nedvěd et al. 2011; Sánchez-Ruiz and Teruel 2006; Teruel et al. 2014). However, dense tropical forested areas apparently limit the expansion of this species. Several non-systematic samplings conducted over a three-year period in forested areas close to sites where the species has been observed showed that the species is absent in dense rain forests and very rare at the edge of tropical dry forests (Sandoval and Barrantes unpubl.).
The present study is also, to our knowledge, the first to use different methods comparing environmental conditions to assess the origin of an invasive species. Even though this idea has been previously proposed (Steiner et al. 2008), this is the first study that deduces the more likely invasion source by comparing environmental similarities between the possible native regions and the invaded region. However, this approach has some constraints. A general limitation of our results is that the predictions are based on relatively static climatic conditions (exemplified by the environmental trait means collected in a determined period), which might not be representative of the niche for dynamic populations (Elith et al. , and, in particular, the projections of our SDMs based on the native regions need to be interpreted with caution. Thus, although the South African model assigns higher (though non-significant) suitability values to the occurrences in America, its evaluation metrics are lower than those of the other models, and its predictions tend to overestimate suitability (as indicated by the MESS analysis). This is also the region with the fewest occurrences recorded, so the distribution of C. citricola in South Africa might be undersampled in comparison with the other regions. Hence, our results are not entirely consistent in indicating the most likely native region for the American populations. However, we consider that using different analyses provides additional information that can indicate the native region with greater confidence when most of these analyses converge on similar conclusions. In this study, most analyses indicate that South Africa is the region of origin of this recent C. citricola invasion, since South Africa shows a greater environmental similarity with the invaded region.
The analyses conducted here support the hypothesis that C. citricola populations in America are more closely related to those in South Africa than to those in the Mediterranean. Our results also provide evidence of the expansion of C. citricola into a new set of environmental conditions in America, as a result of either plasticity allowing its rapid adaptation (Yoshida et al. 2007) or the absence of biological or physical constraints present in its native range (Broennimann et al. 2007; Roy et al. 2011). Further studies focusing on physiological performance, adaptation strategies, and biological constraints for the species in both native and invasive populations may help to better understand the processes driving its rapid expansion in the tropical areas of the invaded region.
Predictions of the distribution of Cyrtophora citricola in America projected from the two native niche models
Figure 4 a. Suitability values obtained by projecting the environmental conditions of the native regions onto America. There is no significant difference between the suitability values of the two regions (t = -1.53, df = 141.38, p = 0.13). b. Suitability values obtained by projecting the American environmental conditions onto each native region. Suitability is significantly higher for the South African region (t = -4.01, df = 31.19, p = 0.0003). The asterisk indicates the significant difference in suitability values between the regions.