A Validation Procedure for Landscape Connectivity Approaches: Evaluation of the Accuracy of Ecological Corridor Locations

Context Connectivity conservation analysis is based on a wide range of approaches designed to locate key ecological corridors in order to maintain multispecies flows. However, the lack of validation procedures based on accessible data makes it difficult to select the most accurate approach. Objectives We propose a new validation procedure to evaluate the accuracy of ecological corridor locations in landscape connectivity approaches. We applied this procedure to compare three modelling approaches and select the most accurate. Methods Our procedure validates ecological corridor tracks with independent presence-only data, under the hypothesis that species should be present within or near the corridors (according to resistance distance) if the corridors have been correctly modelled. We applied Maxent and circuit theory to locate ecological corridors for forest bird species in a rural landscape following three approaches based on land cover, umbrella species and multispecies presence data. We compared the proportion of overlaid corridors among the three approaches, and selected the most accurate one according to our validation index. Results The corridors modelled from species presence data (umbrella and multispecies) were more consistent with each other than with the habitat-based (land cover) corridors. The multispecies approach was the most accurate and the habitat approach the least accurate in locating ecological corridors.


Introduction
The increase in anthropogenic pressures has greatly affected species movement worldwide (Tucker et al. 2018). The dispersal of individuals among populations is essential to counterbalance the negative effects of habitat isolation and to maintain population and community persistence and dynamics (Hanski 1998). Connectivity conservation planning aims to maintain or restore ecological corridors for multiple target species, but this generalized goal comes up against several limits. The application of graph and circuit theory, or even individual-based models, requires species observation data and is computationally intensive. Furthermore, combining the results for a wide range of species creates theoretical problems such as how to generalize a patch-corridor-matrix model for species with different ecological requirements (Forman 1995; Sahraoui et al. 2017). A first option is to use land cover and expert opinion as a proxy for species presence and landscape resistance to movement to inform on potential multiple-species movements in the matrix (e.g. Tannier et al. 2012). A second option is to produce a habitat suitability map for each species by relating species-presence data with environmental predictors, then to transform the habitat suitability map into matrix resistance, and finally to combine the maps for several species (e.g. Petsas et al. 2020). A third option relies on the concept of umbrella species, which "confer a protective umbrella to numerous co-occurring species" (Fleishman et al. 2000). Despite some progress, no single method has yet emerged as a reference (Diniz et al.). A validation procedure capable of determining the most relevant landscape connectivity approach could fill this knowledge gap. estimated from presence-only data, telemetry and genetic data. However, in a review of the use of biological data in combination with landscape graphs, Foltête et al. (2020) found that telemetry and molecular data represent only 8.4% of the studies and are much less used than presence data, which can be easily extracted from existing databases, especially in studies with operational objectives. Therefore, further validation procedures are required. Foltête et al. (2020) found few papers that used biological data (regardless of data type) both a priori and a posteriori, whereas the classical statistical validation procedure consists in calibrating a model with one subsample of the data and validating it with another (Fielding and Bell 1997). The a priori use of biological data informs landscape connectivity approaches of species requirements (e.g. Duflot et al. 2018). The a posteriori use of biological data is based on the statistical inference of biodiversity indices such as species abundance (Pinaud et al.). Validation procedures that can be routinely applied with easily accessible independent data are thus critically needed. Taking advantage of the classical data partitioning procedure used in statistics to calibrate and validate landscape connectivity approaches with independent species presence data could fill this gap.
In this study, we propose a new validation procedure based on independent species-presence data to evaluate the accuracy of the locations of ecological corridors predicted by landscape connectivity models. We applied the procedure to forest birds in two forested mountain ranges in the French Alps, separated by an agricultural landscape. This setting is well suited to identifying dispersal corridors for forest species. Our validation method is based on the hypothesis that the species concerned are more likely to be found along or near their predicted corridors, provided these corridors were properly modelled. We applied this validation procedure to a set of four forest bird species.
We compared three modelling approaches based on circuit theory to design multispecies corridors and selected the most valid among them. The three approaches were: (i) a habitat approach that relies upon a land cover map used as a proxy for species presence; (ii) an umbrella-species approach based on presence-only data of Sitta europaea, an umbrella species for woodpeckers; and (iii) a multispecies approach based on presence-only data of Sitta europaea and three woodpecker species: Dendrocopos major, Dryocopus martius and Dendrocopos minor.

Study area and land cover map
The study area covers 755 km² and is located in the northern French Alps (latitude: 45.223826, longitude: 5.453339). Two mostly forested mountain massifs are located along the western and the eastern edges of the study area: the Chambarans Range to the west and the Vercors Range, which is also a Regional Natural Park, to the east (Fig. 1). These ranges were digitized from aerial photography (2015) of the study area. Both massifs are considered core areas of biodiversity (124 km²). They are separated from each other by a rural matrix (631 km²), which concentrates several potential dispersal barriers along the Isère River valley. Several kilometres along both sides of the river are occupied by urban areas linked with transport infrastructure, a concentration of intensive walnut orchards and agricultural fields. Land cover (Fig. 1; Supplementary Information, Appendix S1) is mainly composed of forests (patch area > 0.5 ha, 32%) and agriculture (33%), followed by orchards (14%), urban areas (11%), small woodlands (patch area ≤ 0.5 ha) and hedgerows (7%) and water (streams and water bodies, 2%). (3) the potential dispersal barriers were spatially concentrated within a width of a few kilometres along the Isère Valley (Fig. 1).

Landscape connectivity approaches and assumptions
First, we created resistance surfaces defined according to (1)
Approach 1, hereafter called "habitat", was based on the following assumptions: (1) land cover is a good proxy for species presence, and (2) land cover transformation, from expert opinion and the literature, is a good proxy for matrix resistance. Following previous parameterizations applied to forest mammals, resistance values were fixed to 1 for forest, 10 for small woodlands and hedgerows, 300 for orchards, 700 for agriculture, 900 for water and 1000 for urban areas (Verbeylen et al. 2003; Avon and Bergès 2016).
These resistance values were also considered valid for forest birds.
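As an illustration of the habitat approach, the reclassification of a land cover raster into a resistance surface can be sketched as below. This is a minimal Python sketch, not the authors' workflow: the class labels, the `reclassify` helper and the toy grid are hypothetical, and only the six resistance values come from the text.

```python
# Resistance values quoted in the text for Approach 1 ("habitat").
# The dictionary keys are hypothetical labels chosen for this sketch.
RESISTANCE_BY_LAND_COVER = {
    "forest": 1,
    "small_woodland_hedgerow": 10,
    "orchard": 300,
    "agriculture": 700,
    "water": 900,
    "urban": 1000,
}


def reclassify(land_cover_grid):
    """Map each land cover label of a 2D grid to its resistance value."""
    return [
        [RESISTANCE_BY_LAND_COVER[cell] for cell in row]
        for row in land_cover_grid
    ]


# Hypothetical 2 x 3 toy raster (not data from the study area).
toy_grid = [
    ["forest", "orchard", "urban"],
    ["small_woodland_hedgerow", "agriculture", "water"],
]
resistance_grid = reclassify(toy_grid)
print(resistance_grid)
```

Keeping the values in a single lookup table makes it straightforward to test alternative expert parameterizations against the same validation points.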
Approach 2, hereafter called "umbrella species", consisted in modelling functional connectivity with Maxent from presence-only data of Sitta europaea to predict a habitat suitability index (HSI). S. europaea is a secondary cavity nester and an umbrella species for woodpeckers (Kotaka et al. 2002). This approach was based on the following assumptions: (1) umbrella-species habitat suitability is a good proxy for other species' presence, and (2) transformation of umbrella-species habitat suitability is a good proxy for matrix resistance. We calibrated the habitat suitability model with the Maxent software using 198 presence points (50% of the data). The landscape metrics were derived from the land cover map and calculated with the Chloe-4.0 software (Boussard and Baudry 2017). We calculated the proportion of each land cover class using sliding windows (buffer radius of 100 m, 5 m between two adjacent windows). The windows were also used to calculate a landscape heterogeneity index: the Shannon diversity index or SHDI (Boussard and Baudry 2017). Buffer sizes were defined to prevent the windows around the points from overlapping. Checking for correlations among the landscape metrics indicated that they were not excessively correlated (Pearson's correlation coefficient < 0.70). The importance of the individual landscape metrics and uncertainties were assessed with a Jackknife test based on three random selections of background points (or pseudo-absence; Phillips et al. 2009).
If HSI ≥ threshold (in species habitat): resistance was assigned to 1.
If HSI < threshold (in matrix): we used Eq. 1 to assign resistance.
Resistance is inversely proportional to the permeability of the matrix. The function (Eq. 1) assigned a resistance value of 1000 when HSI equalled 0 and of 1 when HSI was greater than or equal to the HSI threshold defined by Liu et al. (2013).
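The exact form of Eq. 1 is not reproduced in this text, so the following Python sketch only illustrates one function that satisfies the two stated end points (resistance of 1000 at HSI = 0 and of 1 at or above the threshold). The linear decrease between these end points, the function name and the example threshold of 0.4 are assumptions made for illustration, not the authors' transformation.

```python
def hsi_to_resistance(hsi, threshold, r_max=1000.0, r_min=1.0):
    """Convert a habitat suitability index (0..1) into a resistance value.

    ASSUMPTION: resistance decreases linearly from r_max (HSI = 0) to
    r_min (HSI = threshold). Only the two end points come from the text.
    """
    if not 0.0 <= hsi <= 1.0:
        raise ValueError("hsi must lie between 0 and 1")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must lie in (0, 1]")
    if hsi >= threshold:
        # Cell is classified as species habitat.
        return r_min
    # Cell is classified as matrix: interpolate between the end points.
    return r_max - (r_max - r_min) * (hsi / threshold)


# Hypothetical threshold of 0.4, for illustration only.
for value in (0.0, 0.2, 0.4, 0.9):
    print(value, hsi_to_resistance(value, threshold=0.4))
```

Any other monotonically decreasing function with the same end points (for example a negative exponential) could be substituted and compared using the corridor score.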
Approach 3, hereafter called "multispecies", consisted first in modelling functional connectivity separately for the three woodpecker species Dendrocopos major, Dryocopus martius and Dendrocopos minor using 50% of the presence data (respectively 332, 21 and 15 presence points) following the method presented in approach 2, then in combining the three single-species connectivity maps (see next paragraph) and the Sitta europaea map to obtain a multispecies connectivity map. Single-species connectivity maps were standardized to give the same weight to each of the four species. Approach 3 was based on the assumption that transformation of species habitat suitability is a good proxy for matrix resistance. The validity of the Maxent models used in approaches 2 and 3 was evaluated using the AUC index (Area Under the Curve; Fielding and Bell 1997).
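The combination step of the multispecies approach can be sketched as follows. The text states that single-species maps were standardized to give each species the same weight but does not specify how; the min-max rescaling and the cell-wise mean used here, as well as the function names and toy values, are assumptions for illustration.

```python
def min_max_standardise(grid):
    """Rescale a 2D grid of current values to the range 0..1.

    ASSUMPTION: min-max rescaling; the standardisation actually used
    by the authors is not specified in the text.
    """
    values = [v for row in grid for v in row]
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lowest) / (highest - lowest) for v in row] for row in grid]


def combine_maps(grids):
    """Average standardised maps cell by cell (equal weight per species)."""
    standardised = [min_max_standardise(g) for g in grids]
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    return [
        [
            sum(g[r][c] for g in standardised) / len(standardised)
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# Two hypothetical single-species current maps on a 2 x 2 grid.
map_a = [[0, 10], [5, 10]]
map_b = [[2, 4], [2, 3]]
multispecies_map = combine_maps([map_a, map_b])
print(multispecies_map)
```

Standardising before averaging prevents a species with larger raw current values from dominating the combined map.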
For the three approaches, we used the ArcMap v. . We adjusted the parameters as follows: the intensity of the reservoirs was set at 1 Ampere, the current was calculated from the "pairwise" mode, and the two maps produced from the injection of current into the core areas were cumulated. This method is based on the assumption that species randomly move across the landscape and that any of the possible paths between the two biodiversity reservoirs could be taken (Dickson et al. 2019).

Validating the location of the ecological corridors
To validate the location of the corridors, the cumulative current flow resulting from each of the three approaches was vectorised with three thresholds (at the 55th, 65th and 75th quantiles) in order to calculate (i) the proportion of overlaid corridors among approaches, as a measure of consistency, and (ii) the resistance distance between independent presence points and the closest corridors, as a validation index. Only the polygons that continuously linked (i.e. with no breaches) the two core areas were considered corridors. The validation index, hereafter called the corridor score, was calculated as:

$$\text{corridor score} = \frac{1}{N}\sum_{i=1}^{N}\frac{\text{dist.moy.random.}i - \text{dist.moy.observed.}i}{\text{dist.moy.random.}i}$$

where N is the number of random draws i, dist.moy.observed.i is the average cost distance to the corridors from the validation points for draw i, and dist.moy.random.i is the average cost distance to the corridors from randomly selected points in the landscape for draw i. The corridor score index was obtained by averaging 100 random draws (N=100) from a number of random points equal to the number of validation points. An index close to 1 means the validation points are close to the corridor in terms of resistance distance, which validates the hypothesis that the corridor, or its nearby environment, concentrates the potential dispersal flows. An index close to 0 means the approach is no better than a null model and invalidates the above hypothesis. A negative index means that the matrix, not the corridor, concentrates the potential dispersal flows. As the difference between dist.moy.random.i and dist.moy.observed.i is sensitive to the distribution of resistance range values, we divided this difference by dist.moy.random.i to calculate the corridor score. As a consequence, the corridor score is robust to the transformation function used to convert the HSI into resistance values (Supplementary Information, Fig. S1). The validation index was calculated for each species, approach and threshold of cumulative current. We tested the sensitivity of the validation index to these three parameters with ANOVA, for simple effects and all dual interactions.
All statistical analyses were performed with R software v.3.6.0.
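The corridor score described above can be sketched in a few lines. This Python sketch is illustrative and is not the authors' R code: it assumes that the cost distances to the nearest corridor have already been extracted for the validation points and for a pool of candidate landscape cells, and the function name, arguments and toy distances are hypothetical.

```python
import random
from statistics import mean


def corridor_score(observed_distances, landscape_distances, n_draws=100, seed=0):
    """Average relative gain of validation points over random points.

    observed_distances: cost distances from validation points to corridors.
    landscape_distances: cost distances for all candidate landscape cells,
        from which random points are drawn (ASSUMPTION: precomputed).
    """
    rng = random.Random(seed)
    n_points = len(observed_distances)
    mean_observed = mean(observed_distances)
    ratios = []
    for _ in range(n_draws):
        # Draw as many random points as there are validation points.
        mean_random = mean(rng.sample(landscape_distances, n_points))
        if mean_random == 0:
            raise ValueError("random points all lie inside the corridors")
        ratios.append((mean_random - mean_observed) / mean_random)
    return mean(ratios)


# Hypothetical cost distances, for illustration only.
landscape = [5, 8, 12, 20, 3, 9, 15, 30, 7, 11]
validation = [2, 4, 3]
print(corridor_score(validation, landscape))
```

In this sketch the observed mean is constant across draws because the validation set is fixed; only the random comparison points change from one draw to the next.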

Habitat suitability model
The AUC indices ranged between 0.70 and 0.72, except for D. martius, for which Maxent model validity was rather low (AUC=0.63). Variable importance ranking showed that, for the four species, the two most important landscape metrics included the proportion of forest or water, or the Shannon diversity index (Supplementary Information, Fig. S2). The Shannon diversity index always had either a null or positive effect on species presence, indicating that the species required multiple land covers such as forest and water. The proportions of forest and water were positively correlated with species presence, except when these proportions were very high (with a few exceptions depending on species, Supplementary Information, Fig. S2). The effect of the proportion of small woodlands and hedgerows was not consistent among species, while the effect of the proportion of orchards and urban areas was generally negative. The response of Dendrocopos minor differed from that of the other three species: for this species, the effect of forest proportion did not show a clear trend and the proportion of small woodlands and hedgerows had a negative effect.

Consistency among methods
The multispecies and umbrella-species approaches showed globally similar locations for the highest current values but the habitat approach differed (Fig. 2). From the highest (75th) to the lowest (55th) quantile, the proportion of corridors shared between the multispecies and the umbrella-species approaches ranged from 0.74 to 0.83, between the multispecies and the habitat approaches from 0.44 to 0.84, and between the umbrella-species and the habitat approaches from 0.44 to 0.60 (Supplementary Information, Fig. S3). An example of the consequences of these differences on corridor design is provided in the Supplementary Information (Appendix S3).
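The consistency measure can be illustrated with a small sketch that thresholds two current maps at a given quantile and computes the share of corridor cells they have in common. Several details are assumptions: the text does not give the exact definition of the proportion of overlaid corridors, so intersection over union is used here; the quantile is computed with a simple nearest-rank rule; and the requirement that corridors continuously link the two core areas is not checked. All names and values are hypothetical.

```python
import math


def corridor_mask(current_values, quantile):
    """Flag cells whose current exceeds the given quantile (nearest rank)."""
    ordered = sorted(current_values)
    rank = max(0, math.ceil(quantile * len(ordered)) - 1)
    threshold = ordered[rank]
    return [value > threshold for value in current_values]


def overlap_proportion(mask_a, mask_b):
    """Shared corridor cells divided by cells flagged in either map.

    ASSUMPTION: intersection over union; the authors' exact definition
    is not given in the text.
    """
    shared = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    either = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    return shared / either if either else 0.0


# Two hypothetical flattened current maps covering the same eight cells.
current_a = [1, 2, 3, 4, 5, 6, 7, 8]
current_b = [1, 2, 3, 4, 5, 8, 6, 7]
for q in (0.50, 0.75):
    mask_a = corridor_mask(current_a, q)
    mask_b = corridor_mask(current_b, q)
    print(q, overlap_proportion(mask_a, mask_b))
```

In this toy case the overlap falls as the quantile rises, because a more selective threshold keeps fewer cells and the two maps rank their top cells differently.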

Discussion
We propose a new procedure to validate ecological corridor locations, based on independent presence-only data, under the hypothesis that species should occur within or near the corridors. The corridor score can be computed from Euclidean or resistance distances and is robust to the distribution of the resistance value range. For example, the score can be used to select the best way to classify resistance values or to transform habitat suitability into resistance. Here, we illustrated the validation procedure by comparing different landscape connectivity approaches used to map multispecies corridors and selecting the most accurate approach. Our results have important theoretical and practical implications for managers and stakeholders concerned with connectivity conservation and restoration planning, who wish to validate connectivity approach assumptions and define corridor locations.

Species responses to landscape metrics
Our results showed that species presence could benefit from a diversity of land covers including forest, water, and small woodlands and hedgerows. This complementation effect indicates that these land covers have a positive effect as long as they are not too dominant in the local environment (Dunning et al. 1992). This result relates to the importance of riparian forests, which allow species to move along streams. The role of riparian forests as ecological corridors as well as their use by woodpeckers have . D. minor's differing response was expected given that the species is more generalist than the other three and is not rare in open habitats (Rassati 2015).

Consistency in ecological corridor location among methods
The ecological corridors were much more concordant between the multispecies and the umbrella-species approaches than between these two approaches and the habitat approach. This may be explained by the similarity between the first two methods, which both integrate species data related to landscape metrics. Meurant et al. (2018) also compared different selections of surrogate species for spatial prioritization with reference to the full set of species. In contrast with our results, they concluded that umbrella species are poor indicators of many species' requirements and that an indirect approach based on habitat is preferable even if important areas for landscape connectivity conservation may not be identified. In our case, the concordance between the multispecies and the umbrella-species approaches was likely due to an appropriate choice of the umbrella species, since Sitta europaea is a good surrogate for related woodpeckers (Kotaka et al. 2002). In contrast, the umbrella species selected by Meurant et al. (2018) is only a moderately effective surrogate, though for a wider range of species (14 species).

Validation of ecological corridor locations
The corridor score was higher for the multispecies and the umbrella-species approaches and lower for the habitat approach. The ability of the multispecies approach to accurately locate ecological corridors for several species is in accordance with previous findings (Carroll et al. 2001; Roberge and Angelstam 2004). We concluded that the assumption underlying the multispecies approach is the most realistic: species HSI are good proxies for matrix resistance when identifying multispecies ecological corridors. The gap between the multispecies approach and the habitat approach could be explained by the fact that multispecies presence data reflect reality better than habitat data do. This difference is also partly due to how the landscape is modelled: either locally for the habitat approach, or accounting for the surrounding landscape context for the multispecies and the umbrella-species approaches. Indeed, the relative importance of local and landscape-scale factors can depend on landscape context and species (Howell et al.).

Influence of species on the degree of validation
We also tested corridor score sensitivity and performance against species and the current threshold we used to delineate corridors. The corridor score was the highest for Sitta europaea, most likely because the presence data of this species were used for calibration in two of the three models (umbrella and multispecies models). The median corridor score was the lowest for Dendrocopos minor. The median corridor score for the habitat approach was negative for this species, indicating that the ecological corridor locations were less accurate than in a null model, and that our hypotheses on matrix resistance were globally inconsistent. This result can be explained by the fact that the proportion of water in the local environment (and the Shannon diversity index), rather than the proportion of forest, was the main driver of environmental suitability.
This illustrates the ability of our validation procedure to detect a misclassification of matrix resistance values, even with only a few validation points. Interestingly, the multispecies approach performed similarly to the umbrella-species approach for Sitta europaea, indicating that including data from other species did not decrease the accuracy of the multispecies approach for this species. Both methods worked equally well for Sitta europaea and Dendrocopos major.
The multispecies model, followed by the umbrella-species model, performed better than the habitat model for D. martius and D. minor. For D. martius, the habitat approach had a high corridor score only when the discrimination of ecological corridors from the matrix was not very selective (i.e. when the 55th quantile of current values was used to delineate the corridors). Overall, the accuracy of the different models varied according to the species used for validation. This result might be partly due to differences in the number of presence points and in the validity of the Maxent models. Interestingly, the AUC index only showed a lower validity of the habitat suitability model for D. martius, while the corridor score showed more contrasted results among species. Beyond the validity of habitat suitability models, we highlight the need to also validate the landscape connectivity models according to their final outputs (ecological corridor locations). Finally, implementing a multispecies approach should guarantee maximal prediction accuracy in most cases.

Limits, advantages and perspectives
Our study did not overcome the limitations recently debated concerning the link between presence data, dispersal movement and the effective reproduction of migrants (Robertson et al. 2018; Fletcher et al. 2019). Presence data relate both to habitat suitability and species movements, and both processes are merged when movements decrease due to mortality risk or avoidance of less suitable types of land cover. Fletcher et al. (2019) question the consequences of neglecting to distinguish between these two processes, though they do recognize that the focus on matrix resistance has led to major advances in the understanding of connectivity. They propose a new connectivity model that separates mortality and movement behaviour. We acknowledge that presence data are indirectly related to movement and do not make it possible to discriminate between foraging, dispersal and migration, or to quantify the successful reproduction of migrants, with its consequences on genetic population diversity (Jeltsch et al. 2013).
However, these limits are not specific to our study and our validation procedure still has several theoretical and practical advantages: (i) it provides a validation procedure based on easily accessible data; (ii) it can be used to select the most realistic landscape connectivity approach; (iii) it can detect a misclassification of matrix resistance values from only a few validation points and can help select the best function to transform habitat suitability into resistance values; (iv) it is independent of the modelling framework (e.g. expert opinions, least-cost paths or circuit theory) because it evaluates one final output of a given landscape connectivity approach (i.e. ecological corridor locations) and not only an intermediate output (such as resistance values) that implies the use of a specific framework (circuit theory); and (v) it is easy to calculate after using software designed for connectivity conservation planning.
To reinforce our validation procedure for landscape connectivity approaches, we need to evaluate how the corridor score ranks different corridor design methods with more robust validation indices based on telemetry or genetic distances (Coulon et al. 2015). These objectives are within the reach of both empirical and theoretical studies that use in silico controlled experiments. Finally, further work is required to assess whether our validation procedure could dispense with the costly use of telemetry and genetic data.

Declarations
Funding This work was supported by the FEDER project "Trame verte forestière" (contract n°RA0017232).

Figure 1
Study area with the land cover in the matrix and the core areas of biodiversity (determined by orthoimagery). Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.

Figure 2
Comparison of the current maps for the three approaches used to define ecological corridors.

Figure 3