4.1 Field Samples
Bat data were gathered in the context of Chirotteri Monitoring project of Foreste Casentinesi National Park, by acoustic transect using a Pettersson D240X bat detector (Pettersson Elektronik AB, Uppsala, Sweden) in time-expanded mode. We identified 25 transects (min 7 km, max 36 km) in the whole Park, homogeneously distributed within the Park (Fig. 1), in order to guarantee a satisfactory level of coverage of the different environmental conditions (altitude, exposure, land use). Transects, traveled only by car to minimize double or multiple counts (Braun de Torrez et al. 2017), both along paved and dirt roads, at an average speed of 30 km/h, were made across four years (2014 and 2016–2018), always between 16th August and 15th September, without fixed time slots. Every year we repeat the same transects, covering an average of 585 km (Table 2); the differences between the years are related to problems of practicability of some stretches (landslides, fallen trees, etc...). The location of each record was registered with a GPS device.
Acoustic records were analyzed using BatSound software (Pettersson Elektronik AB); for species identification we followed Russo and Jones (2002), Pfalzer & Kusch (2003), Obrist et al. (2004) and Barataud (2015).
Table 2
Monitoring effort of the four census years. For each year we report the number of transects carried out, the total amount of kilometers covered and the days effort.
year
|
n. of transect
|
km
|
days effort
|
2014
|
25
|
608
|
9
|
2016
|
25
|
562
|
9
|
2017
|
25
|
567
|
10
|
2018
|
25
|
598
|
11
|
4.2 Data analysis
The goal of this study is to evaluate the importance of conifer forests for the bat fauna in the Foreste Casentinesi National Park. For each species examined, we built two different ecological models using a step procedure. We first built a generic habitat suitability model, then a second species-specific habitat suitability model, which we used to test the effect of conifer forests.
The models were built using MaxEnt (Phillips et al. 2006; Phillips & Dudík 2008), a frequently used method in ecological research and conservation planning since it provides reliable results even with small samples (Papeş & Gaubert 2007; Pearson et al. 2007). One of MaxEnt’s advantages over other methods of analysis is that it is able to account for uneven levels of sampling (Fourcade et al. 2014); nevertheless, the presence of strong biases in the distribution of samples may negatively affect the accuracy of the models, and thus the ability to interpret the results obtained (Kramer-Schadt et al. 2013, Syfert et al. 2013). To reduce the effects that may arise from uneven sampling – since some transects were surveyed fewer times than others, often due to roads being impassable – we used a bias file (Fourcade et al. 2014), an information layer similar to those normally used in ecological analyses (e.g. soil use types, DEM), whose geo-referenced information concerns the sampling intensity for each spatial unit under consideration. The bias file was also built with MaxEnt, by designing a model in which the sampling points - which in this case were obtained by subdividing the transects in points located at a distance of 100 meters from one another – are considered in terms of presence data and are compared with the background, which is the National Park as a whole. The habitat types that will most contribute to defining the environmental characteristics of the sampling points will be those sampled most extensively. To build the model, we used all of the variables in Table 2, which we had previously transformed in PCA, computed with RSToolbox package in R.
As noted above, the analyses were performed using a step procedure. For each selected species, we first built a basic suitability model, testing the effects of certain generic variables identified according to current knowledge (Table 3). We used climatic variables such as ombrotype (the higher the value, the more mesophylic the conditions, see Blasi et al. 2004); morphological ones such as the standard deviation of elevation as a measure of the morphological variability of the landscape; anthropic ones, specifically distances from single houses and human settlements; and explicitely environmental ones, such as woodland surface area. As specific data on forest age were not available, we used an indirect estimate (proxy), namely presence in one of the five State Reserves located within the National Park. These reserves, which have been state property since the early 20th century (Chiari 2014) or earlier in the case of the Camaldoli forest (Vazzano et al. 2011), had long been managed in a sustainable matter, albeit with productive goals, and in more recent decades in a very conservative manner. Compared to surrounding areas, State Reserves now host extensive areas of mature topsoils, often centuries old, with some of the oldest and best-preserved forests in the entire Apennines. They are thus an excellent proxy for the degree of forest development, at least at a relatively vast scale.
To deal with potential problems of correlations between variables, we transformed them using the PCA analysis, again computed with RSToolbox package in R. By using PCA we lose the ability to assess the contribution of each single variable in detail. However, the purpose of this study is not to analyze the ecology of individual species, but rather to evaluate whether conifer forests have a possible effect on their presence. In this case, PCA is thus a worthwhile method of analysis.
Table 3
List of parameters used to define the environmental preferences of the species analyzed, divided between those used for the basic models and those for the specific models.
Factor
|
description
|
Base model
|
|
Ombrotype
|
Rainfall at the site, based on Blasi et al.(2004)
|
D_House
|
Distance from the nearest house
|
D_Villages
|
Distance from the nearest 100 houses
|
Alt
|
Altitude
|
SD_Alt300
|
Standard deviation of altitude calculated within a circular buffer of a diameter of 300 meters centered on presence data
|
SD_Alt600
|
Standard deviation of altitude calculated within a circular buffer of a diameter of 600 meters centered on presence data
|
Wood_100
|
Forest surface area calculated within a circular buffer of a diameter of 100 meters centered on presence data
|
Wood_300
|
Forest surface area calculated within a circular buffer of a diameter of 300 meters centered on presence data
|
Wood_600
|
Forest surface area calculated within a circular buffer of a diameter of 600 meters centered on presence data
|
Reserves
|
Variable set at 1 if the record is within a State Reserce, 0 if outside it
|
Specific model
|
|
Con_100
|
Surface area of conifer forests calculated within a circular buffer of a diameter of 100 m centered on presence data
|
Con_300
|
Surface area of conifer forests calculated within a circular buffer of a diameter of 300 m centered on presence data
|
Con_600
|
Surface area of conifer forests calculated within a circular buffer of a diameter of 600 m centered on presence data
|
The best approximating model was selected based on the information theoretic approach (ITA) (Burnham & Anderson, 2002), using the Akaike information criterion corrected for small samples (AICc, Burnham & Anderson 1998), testing the effect of different combinations of feature types and regularization coefficients (Campedelli et al. 2012).
Once we defined the best base models, we tested the effect of the presence of conifers (Table 3). We first looked at the correlation level between the PCAs used for the base models and for the Con_ variables (parameters related with coniferous surface area; see Table 3), and between the latter. Spearman analysis showed a lack of correlation in the former case, while in the latter correlation was very high, as expected (always > 0.97). On the basis of these results, we thus separately tested the effect of each of the three “conifer forest” variables, once again using the Akaike information criterion (AICc) to select the best model.
The models were built using a subset of the available data, selected randomly and amounting to 50% of the total; the rest of the data was used as a test for validation. We believe that the decision to use a higher percentage of the data than usual for validation, sacrificing part of it to build the model, is a justifiable compromise in a situation with a limited number of localizations, so as to both have a sufficiently high number of observations to build effective models and to be able to validate them with a sufficient amount of data. Performing validations with a lower percentage of the dataset, as would normally happen with larger datasets, would have increased the likelihood of including aberrant data in the test and of excluding important parameters once the procedure was complete. Unlike what was proposed by Warren and Styres (2011), validation was not carried out on the entirety of the available sample, but only on the test, thus privileging the independence of the process.
At the end of the process, we compared the two best models for each species, choosing from the base models and from the models with the “conifer forest” variables. The best model was chosen by means of the ΔAICc and the Akaike weights (wAIC). In the former case, the more efficient models are those in which the addition of the conifer forest variables leads to a reduction in the AICc value. The Akaike weights are instead a measure derived from the AICc that can give the weight of evidence in favour of each model among a set of competing ones (Wagenmakers & Farrell 2004). Since the Akaike weight is a measure that also depends on the number of models compared, for an unbiased comparison it is necessary to have an equal number of models for each comparison step, in our case the best base model and the best specific model.
Models were evaluated using validation datasets. Model evaluation should deal with two aspects, the performance and the significance of the model (Peterson et al. 2011; Tarjuelo et al. 2014). Model performance shows how well the model classifies presence and absence of the species. Omission error rate (OER, the proportion of presence occurrence records of the evaluation dataset that fall in an area predicted as unsuitable for the species) was used as a measure of model performance, expecting low omission rates for good models. This measure of model performance was selected because it does not require true absence records. It is also necessary to assess model significance, i.e. whether the model predicts presence/occurrence records from the evaluation dataset better than expected under random predictions. Thus, one-tailed binomial tests (one per model) were performed to evaluate whether the proportion of correctly classified occurrences differs from the proportion of area predicted as presence by the model.
A procedure such as this one has the advantage of making it possible to evaluate, step-by-step and in an objective manner, the contribution of each individual variable, which in this case have profoundly different ecological meanings.
Analyses were performed on all species with a minimum of 60 occurrences, and on the number of feeding buzzes, considering all species, that are the sequences of sounds that bats emit when they approach their prey.