Exploring biodiversity challenges in Europe: Completeness, geography and environmental representativeness

doi:10.21203/rs.3.rs-4251904/v1

Research Article

This work is licensed under a CC BY 4.0 License

Abstract

Biases and gaps in biodiversity data lead to significant disparities in knowledge of species descriptions and distributions among taxonomic groups. These gaps could be addressed by utilizing predictive models, but this requires ensuring that the available information is environmentally representative. In this study we utilize data from GBIF to investigate geographical biases, gaps and spatial completeness patterns concerning species distribution for the main classes of terrestrial organisms in Europe. By identifying the spatial units with comprehensive inventories for each class, we offer insights into their quantity, distribution, and ability to capture the environmental variability of the European subcontinent. The results clearly demonstrate high spatial heterogeneity and variability between taxa in the number of well-surveyed spatial units, showing that the units with high completeness for vertebrates and vascular plants are several times more numerous than those available for invertebrates and mosses. Regarding the environmental variability represented by the available data, the results demonstrate the uncoordinated and contingent character of the accumulation of biodiversity information and the need for extra effort, which should be more intense in those taxa whose data have lower geographical coverage. These challenges raise doubts about the reliability of these data in providing a comprehensive understanding of biodiversity distribution, and they hinder model estimations. Extra compilation efforts should be mainly directed towards those spatial units capable of improving the environmental representation currently provided by the well-surveyed spatial units, in order to reach a representative sample capable of producing effective interpolations and reliable predictions of species distributions.

Keywords: Biodiversity inventories; Environmental coverage; Completeness of inventories; Geographical biases; MESS

Introduction

Compiled data about biodiversity are plagued by biases and deficiencies, leading to a significant disparity in knowledge among taxonomic groups regarding the proportion of species described or their distributional ranges (see Meyer et al. 2016; Titley et al. 2017; Hughes et al. 2021 or García-Roselló et al. 2023, among others). When examining databases containing information on species distribution, it is common to find some locations that appear relatively well surveyed, while others have insufficient data or none at all (Lomolino 2004). These gaps in distributional information could be addressed by utilizing predictive models capable of managing spatial, environmental, or biotic predictors to forecast the probable occurrence or abundance of species in those unsurveyed localities (Guisan et al. 2017). The statistical theory behind such modelling procedures indicates that they should be based on a randomly obtained sample or set of observations that equitably represents the main characteristics of the entire population (Soley-Guardia et al. 2024). In the case of biodiversity data, this requirement implies that a sample dataset of a given taxonomic group needs to reflect, at the very least, the different environmental characteristics of the geographic area occupied by this group of species. Consequently, the existence of gaps in the geographic distribution of available data for a group of species, while still important, is not the primary concern for generating reliable model predictions. Instead, the focus is on ensuring that this limited set of data is environmentally representative.

In this study, we utilize data sourced from GBIF (https://www.gbif.org/), focusing on the main classes of terrestrial organisms in Europe. Our aim is to investigate geographical biases and information gaps concerning species distribution. We also explore the consistency of spatial completeness patterns across different taxonomic groups. By identifying the most likely spatial units with comprehensive inventories for each organism class under consideration, we offer insights into their quantity, distribution, and ability to capture the environmental variability of the European subcontinent. Ultimately, our goal is to evaluate the adequacy of the information compiled thus far for predictive modelling purposes, as well as to recommend regions where it will be necessary to incorporate extra data for each taxonomic group.

Methods

Used data

Both taxonomic and occurrence data used in this study were exclusively sourced from GBIF, as this database aggregates a significant portion of occurrence data available in other biodiversity databases (Feng et al. 2022). GBIF Backbone Taxonomy (GBIF Secretariat, 2022; see https://hosted-datasets.gbif.org/datasets/backbone/) was employed to establish a comprehensive taxonomy of the Animalia and Plantae kingdoms, utilizing only species labelled as “accepted” in this database (accessed March 18, 2023). The class rank was selected as the taxonomic category to group data, given its widespread use and balanced representation of terrestrial taxa (see Ruggiero et al. 2015). The same 12 classes of Animalia and 8 of Plantae used in García-Roselló et al. (2023) were chosen. This selection excludes classes in which 50% or more of the species inhabit marine biomes according to the World Register of Marine Species (WoRMS Editorial Board, 2022) and the Interim Register of Marine and Nonmarine Genera (Rees, 2022). Only georeferenced occurrence records labelled as “observations”, “human observations” or “preserved specimens” were extracted from GBIF when located in terrestrial areas of the European subcontinent. Subsequently, a basic cleaning process was implemented. Among the downloaded occurrences, those meeting any of the following criteria were excluded: i) occurrences with identical latitude and longitude, ii) occurrences with a latitude or longitude of 0°, and iii) occurrences in habitats other than terrestrial or freshwater ecosystems (see García-Roselló et al. 2014 for details). For this study, the European subcontinent is considered to extend to the Urals as the eastern limit and to Turkey as the southern limit, including the Middle East, and to Iceland as the western limit, but excluding Greenland, smaller islands below 34° of latitude (e.g. the Canary Islands), and those west of longitude −24° (e.g. the Azores).

The occurrences within this subcontinent total 682 million records belonging to 115,170 different species (see Table 1). Complete datasets are available for retrieval from GBIF.org, 2023a,b, and can be imported into ModestR (García-Roselló et al. 2013; see https://www.modestr.es).
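The three exclusion criteria of the basic cleaning process can be illustrated with a short Python sketch. This is not the ModestR routine actually used in the study: the `Occurrence` record, its `habitat` field and the habitat labels are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical minimal occurrence record (not the GBIF schema)."""
    species: str
    latitude: float
    longitude: float
    habitat: str  # assumed labels: "terrestrial", "freshwater", "marine"


# Assumption: habitats are already assigned to each record upstream.
ALLOWED_HABITATS = {"terrestrial", "freshwater"}


def keep_occurrence(occ: Occurrence) -> bool:
    """Return True when a record passes the three exclusion criteria."""
    if occ.latitude == occ.longitude:             # i) identical coordinates
        return False
    if occ.latitude == 0 or occ.longitude == 0:   # ii) a coordinate of 0 degrees
        return False
    if occ.habitat not in ALLOWED_HABITATS:       # iii) non-terrestrial, non-freshwater
        return False
    return True


def clean(records):
    """Keep only the records that pass every criterion."""
    return [r for r in records if keep_occurrence(r)]
```

In practice the habitat criterion depends on how each record is matched to an ecosystem layer, which this sketch leaves out.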

Table 1

Number of GBIF records in Europe (Records) for each of the considered classes of organisms, total species in GBIF with occurrences in the European subcontinent (Species), number of 30 arcminute cells occupied by the species of each class (Cells), and number of cells with completeness values equal to or higher than 90% (C90%) together with their corresponding MESS value (MESS%).
Kingdom	Phylum	Class	Records	Species	Cells	C90%	MESS%
Animalia	Annelida	Clitellata	175,120	75	1,758	150	21.32
Animalia	Arthropoda	Arachnida	3,341,408	6,832	4,692	69	23.98
Animalia	Arthropoda	Branchiopoda	131,999	189	1,472	124	37.57
Animalia	Arthropoda	Chilopoda	98,033	48	2,365	93	27.56
Animalia	Arthropoda	Collembola	188,920	909	1,627	37	29.96
Animalia	Arthropoda	Insecta	92,958,044	57,646	6,038	30	17.69
Animalia	Chordata	Amphibia	2,246,053	184	5,258	1,586	80.85
Animalia	Chordata	Aves	318,198,812	1,663	6,740	2,917	83.13
Animalia	Chordata	Mammalia	13,320,866	595	5,548	1,282	79.42
Animalia	Chordata	Reptilia	1,492,802	371	4,771	1,380	69.95
Animalia	Mollusca	Bivalvia	344,941	875	3,154	205	40.57
Animalia	Mollusca	Gastropoda	1,946,575	4,301	4,365	320	48.01
Plantae	Bryophyta	Bryopsida	6,771,091	1,722	4,915	277	22.30
Plantae	Marchantiophyta	Jungermanniopsida	1,354,944	489	3,480	228	66.72
Plantae	Marchantiophyta	Marchantiopsida	135,452	94	3,060	201	58.44
Plantae	Tracheophyta	Liliopsida	52,222,164	5,919	6,578	1,648	67.98
Plantae	Tracheophyta	Lycopodiopsida	501,040	83	3,929	1,194	73.16
Plantae	Tracheophyta	Magnoliopsida	176,528,550	32,111	6,718	1,404	62.72
Plantae	Tracheophyta	Pinopsida	4,287,183	310	5,058	911	76.93
Plantae	Tracheophyta	Polypodiopsida	5,969,427	754	5,555	1,765	84.60

Completeness estimations

Accumulation curves for each animal and plant class were calculated in each European terrestrial cell of 30 arcminutes (approximately 55 x 55 km at the equator; n = 7,274 cells) using the exact estimator proposed by Ugland et al. (2003). Arctic cells permanently covered by ice, as identified in the ISRIC database (https://data.isric.org), were excluded from the analysis. The number of occurrences for each species was considered as a surrogate for survey effort, following the approach outlined by Lobo (2008) and Lobo et al. (2018). Subsequently, the obtained accumulation curves were fitted to the rational function described by Flather (1996), and the extrapolated asymptotic values were used to estimate the probable number of species in each cell. The proportion of observed species compared to those estimated by the asymptotic value was then considered as the completeness of each cell. This entire process was conducted using the freely available ModestR software (www.modestr.es; García-Roselló et al. 2013 and 2023), which integrates an optimized version of the R package KnowBR (Lobo et al. 2018; Guisande and Lobo, 2019). A detailed explanation of how to perform this process in ModestR can be found at https://www.modestr.es/sweb/documents/tutorial_stepbystep/PermalinkTutorials.php?tutorial=26.
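The completeness calculation can be sketched in a few lines of Python. The sketch assumes the three-parameter rational form S(n) = (a + b·n) / (1 + c·n), whose asymptote is b/c. It fits the curve by a simple linearised least-squares step, which is a simplification chosen for brevity and is not the fitting routine of KnowBR or ModestR. All function names are hypothetical.

```python
def rational(n, a, b, c):
    """Assumed rational accumulation function: S(n) = (a + b*n) / (1 + c*n)."""
    return (a + b * n) / (1.0 + c * n)


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, 4):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = aug[r][3] - sum(aug[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = s / aug[r][r]
    return x


def fit_rational(effort, richness):
    """Linearised least squares, using S = a + b*n - c*(n*S)."""
    rows = [(1.0, float(n), -float(n) * s) for n, s in zip(effort, richness)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * s for r, s in zip(rows, richness)) for i in range(3)]
    return solve3(ata, aty)  # returns [a, b, c]


def completeness(observed_species, b, c):
    """Observed richness as a percentage of the asymptote b/c."""
    return 100.0 * observed_species / (b / c)
```

With noisy accumulation curves a non-linear fit would normally be preferable; the linearised step is used here only to keep the example free of external dependencies.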

Climatic representativeness

The multivariate environmental similarity surface (MESS) metric (see Elith et al. 2010) was utilized to evaluate the level of environmental representativeness of the cells within the European area under consideration. MESS is a method designed to quantify the environmental similarity of a site in relation to a set of spatial units acting as reference sites. The maximum MESS value for a site is 100, indicating that the site is the most typical within the environmental range of the reference sites. Lower values, ranging down to zero, suggest more atypical environments. MESS can also assume negative values, signifying that at least one variable of a site falls outside the range of values of the reference sites. In such cases, the site should be considered as representing a novel environment, inadequately captured by the reference sites. Therefore, a reasonable criterion would be to consider only those sites with non-negative MESS values as having environments acceptably represented by the reference sites.
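The behaviour described above (a maximum of 100, and negative values when a variable falls outside the reference range) follows from the per-variable similarity defined by Elith et al. (2010). A minimal pure-Python sketch is given below. It is an illustration of the published formula, not the modEvA implementation, and the function names are hypothetical.

```python
def variable_similarity(value, reference_values):
    """Similarity of one variable at a site relative to the reference sites."""
    lo, hi = min(reference_values), max(reference_values)
    if hi == lo:
        raise ValueError("reference values must span a non-zero range")
    # f: percentage of reference sites with a smaller value than the site
    smaller = sum(1 for r in reference_values if r < value)
    f = 100.0 * smaller / len(reference_values)
    if f == 0:
        return 100.0 * (value - lo) / (hi - lo)   # at or below the minimum
    if f <= 50:
        return 2.0 * f
    if f < 100:
        return 2.0 * (100.0 - f)
    return 100.0 * (hi - value) / (hi - lo)       # above the maximum


def mess(site, reference_sites):
    """MESS of a site: the minimum similarity across all variables.

    site: sequence of variable values for the focal cell.
    reference_sites: list of sequences with the same variables in the same order.
    """
    return min(
        variable_similarity(site[i], [ref[i] for ref in reference_sites])
        for i in range(len(site))
    )
```

Because the result is the minimum over variables, a single variable outside the reference range is enough to make the MESS value of a cell negative.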

We used 36 environmental variables to characterize the environmental conditions of each cell: the 19 bioclimatic variables and elevation data, freely available in the WorldClim 2.1 dataset (Fick and Hijmans, 2017), as well as the 16 bioclimatic variables from ENVIREM (Title and Bemmels, 2018). All these variables were resampled from their original resolution of 5 arcminutes to a resolution of 30 arcminutes to match the resolution of the completeness estimations. Subsequently, MESS values were calculated for each cell across the entire study region for each considered class of Animalia and Plantae. Reference sites were defined as those cells with completeness percentages equal to or higher than a certain threshold, with threshold values ranging from 50% to 95% at intervals of 5%. The R package modEvA was used to calculate MESS values (Barbosa et al. 2013). Finally, the percentage of cells with non-negative MESS values relative to the total number of considered cells was computed for each class and completeness threshold, resulting in 200 MESS percentage values (20 classes x 10 completeness thresholds; hereafter referred to as %MESS). As the number of environmental variables used could influence %MESS values, we compared the values obtained using the 36 variables with those calculated using the scores of the six components with eigenvalues higher than one after submitting all these variables to a Principal Component Analysis. As the correlation between the %MESS values obtained for each taxonomic group using the two sets of environmental variables is positive and highly significant (Pearson r = 0.958; p < 0.001; n = 20), we decided to maintain the use of all the environmental variables in MESS calculations.
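The %MESS summary described in this paragraph reduces to a simple proportion once MESS values are available for every cell. The sketch below assumes those values have already been computed. The function names and the dictionary layout are hypothetical and serve only to illustrate the bookkeeping.

```python
# Completeness thresholds used to define reference cells: 50, 55, ..., 95 (%)
THRESHOLDS = list(range(50, 100, 5))


def reference_cells(completeness_by_cell, threshold):
    """Cells whose completeness (%) is equal to or higher than the threshold."""
    return [cell for cell, c in completeness_by_cell.items() if c >= threshold]


def percent_mess(mess_values):
    """Percentage of cells whose MESS value is non-negative."""
    if not mess_values:
        raise ValueError("no cells supplied")
    covered = sum(1 for v in mess_values if v >= 0)
    return 100.0 * covered / len(mess_values)
```

Applying `percent_mess` once per class and per threshold gives the 20 x 10 = 200 values analysed in the text.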

Statistical methods

The completeness values for all European terrestrial cells of 30 arcminutes, pertaining to the 20 considered classes, underwent a Cluster Analysis to identify the primary groups of organisms based on the extent and spatial distribution of their completeness. The recommended Ward’s method was chosen as linkage rule, and Squared Euclidean distances were utilized as the measure of dissimilarity, providing a progressively greater weight to those classes that were further apart (Legendre and Legendre, 2012).

A General Linear Model was used to relate %MESS values obtained from the 200 occasions (response variable) with the 20 different taxonomic classes (categorical predictor), while using the number of database records as a covariate. This was done to examine whether the environmental variability represented by the data from different groups differs independently of the quantity of data available. If the environmental representativeness of the different taxonomic classes were unrelated when the effect of the quantity of records remains constant, it would be expected that a specific pattern would have conditioned the survey of the different groups.

Results

Completeness patterns

The quantity of data (number of GBIF records in Europe) and the geographical range they represent (number of occupied 30 arcminute cells) differ significantly between classes (Table 1). These two characteristics are positively correlated with each other (Pearson r = 0.58; p < 0.01), as well as with the number of cells with completeness values equal to or higher than 90% (r = 0.62 and 0.70; p < 0.01). However, some classes, such as Insecta, which harbour a high number of species and records, have only about thirty cells that could be considered well-surveyed. A similar disproportion between information quantity and the number of well-surveyed cells can be found in Arachnida or Gastropoda, albeit to a lesser extent. Conversely, Aves, with moderate species richness, account for 46% of all records and reach completeness values equal to or greater than 90% in 40% of the cells (Table 1).
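The percentages quoted for Aves and Insecta can be recomputed directly from Table 1 and from the 7,274 cells analysed. The following Python sketch only repeats that arithmetic; the variable names are illustrative.

```python
# Records column of Table 1 (GBIF records in Europe per class)
RECORDS = {
    "Clitellata": 175_120, "Arachnida": 3_341_408, "Branchiopoda": 131_999,
    "Chilopoda": 98_033, "Collembola": 188_920, "Insecta": 92_958_044,
    "Amphibia": 2_246_053, "Aves": 318_198_812, "Mammalia": 13_320_866,
    "Reptilia": 1_492_802, "Bivalvia": 344_941, "Gastropoda": 1_946_575,
    "Bryopsida": 6_771_091, "Jungermanniopsida": 1_354_944,
    "Marchantiopsida": 135_452, "Liliopsida": 52_222_164,
    "Lycopodiopsida": 501_040, "Magnoliopsida": 176_528_550,
    "Pinopsida": 4_287_183, "Polypodiopsida": 5_969_427,
}
TOTAL_CELLS = 7_274                        # European 30 arcminute cells analysed
C90 = {"Aves": 2_917, "Insecta": 30}       # cells with completeness >= 90%

total_records = sum(RECORDS.values())
aves_record_share = 100 * RECORDS["Aves"] / total_records
aves_cell_share = 100 * C90["Aves"] / TOTAL_CELLS
insecta_cell_share = 100 * C90["Insecta"] / TOTAL_CELLS
```

The shares come out at roughly 46.6% of records and 40.1% of cells for Aves, and about 0.41% of cells for Insecta, in line with the rounded figures in the text.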

The geographical variation of completeness values does not follow a homogeneous pattern among the different classes; instead, these patterns can be grouped into two (Fig. 1A and Fig. S1 in Supplementary Material). One group comprises vascular plants (Lycopodiopsida, Polypodiopsida, Pinopsida, Magnoliopsida and Liliopsida), Insecta, and Chordata classes, characterized by a high number of database records (on average 66.7 million records) and of cells with completeness values equal to or higher than 90% (on average 1,412 cells). In contrast, the other group consists of mosses, liverworts, and invertebrate classes, which harbour a lower number of database records (on average 1.4 million) and of cells with completeness values of at least 90% (on average 170 cells). The classes of organisms with the most information exhibit high completeness values in the northern and central regions of Western Europe or along the entire western part of the continent (Fig. S1 in Supplementary Material). However, the high completeness values of liverworts and mosses (Marchantiopsida, Jungermanniopsida and Bryopsida) only appear in the United Kingdom, Benelux and the south of Sweden, while the completeness pattern in invertebrate classes is scattered, as seen in Collembola, Chilopoda and Branchiopoda, or limited to a few places in the northern and central regions of Western Europe (Fig. S1 in Supplementary Material). This inequality between classes in the distribution of completeness values means that no European 30 arcminute cell can be considered well-surveyed for all twenty classes simultaneously (Fig. 1B).

Environmental representativeness

The number of cells decreases linearly as completeness values increase (Fig. 2A), and this trend ranges from −76.9 cells per 1% increase in completeness in the case of Magnoliopsida to −13.25 cells in Branchiopoda (mean ± 95% CI: −46.44 ± 8.9; n = 20). %MESS values also decrease with the increase in completeness, but in a much more pronounced and heterogeneous manner (Fig. 2B); the rates of linear decrease in %MESS values per 1% increase in completeness range from −2.10 in Insecta to −0.34 in Amphibia (−1.02 ± 0.25 for the 20 classes). Although both the number of cells and their environmental coverage measured by MESS diminish with the increase in completeness values, the relationship between these two parameters is clearly curvilinear (Fig. 2C). The relationship between the geographic and the environmental spaces represented by the cells follows a monotonically increasing tendency until reaching an inflection point, after which the curve enters a stationary phase approaching an asymptotic final value. Consequently, the environmental representativeness of the cells rapidly increases with the number of cells (Fig. 2C), although this increase is consistently lower than that obtained with the same number of cells selected at random (Fig. 2C). On average, a 1% increase in the number of cells implies a 2.6% increase in environmental representativeness.

The rise in %MESS values with the increase in the number of cells is, however, not homogeneous among the different taxonomic classes. The rate of increase in %MESS values with the number of cells is high in some invertebrate classes such as Collembola, Chilopoda, Branchiopoda, Arachnida or Clitellata, and low in vertebrate and vascular plant classes (Fig. 3). These rate variations seem to be related to the differing amount of data held by each class. Thus, this rate of increase is negatively correlated both with the logarithm of the number of database records of each class (Pearson r = -0.655; p = 0.002) and with the logarithm of the number of European cells in which each class is present (r = -0.884; p < 0.001). As a consequence, the lower the number of records and occupied cells, the greater the increase in environmental space that each cell represents. When the influence of the number of cells is held constant, %MESS values continue to differ between classes (F_{19, 179} = 3.06; p < 0.001; Fig. 3), being minimum in Aves and maximum in the liverwort classes (Marchantiopsida and Jungermanniopsida).

Discussion

In a region characterized by a long taxonomic and naturalistic tradition, the analyses provided clearly demonstrate high heterogeneity in the spatial distribution of completeness percentages, as well as variability between taxa in the number of spatial units that can be considered well surveyed. These findings are not novel, as geographical and taxonomic shortcomings and biases in biodiversity databases have been acknowledged on numerous occasions and across various organisms and data sources (e.g. Dennis and Thomas, 2000; Meyer et al. 2016; Titley et al. 2017; Troudet et al. 2017; Hughes et al. 2021; García-Roselló et al. 2023). These challenges raise doubts about the reliability of these data in providing a comprehensive understanding of biodiversity distribution, which is crucial for establishing effective conservation strategies (Rocchini et al. 2023). Regarding geographical inequalities, our European data show a distinct latitudinal gradient in the occurrence of probable well-surveyed cells. This pattern runs in parallel with the southward increase in biological diversity (Myers et al. 2000) and the northward growth in taxonomic resources and task forces (dos Santos et al. 2020). However, a more pronounced and steeper longitudinal gradient in cell completeness values is apparent, likely due in part to the limited focus of GBIF’s objectives in Eastern Europe (Gaiji et al. 2013).

When examining the disparities among different taxonomic groups, we observe that the number of 30 arcminute cells with completeness values equal to or higher than 90% ranges from 40% of the total in Aves to 0.4% in Insecta (29.3% and 0.1% respectively for completeness values equal to or higher than 95%). The completeness patterns of the 20 studied classes can be divided into two groups clearly discriminated by the amount of information available about them. The data available for vertebrates and vascular plants are nearly fifty times more abundant, and the number of cells with completeness values of at least 90% is eight times greater, than those available for invertebrates and mosses. While the information is taxonomically biased because, in general, a smaller part of the data corresponds to the more diversified groups such as invertebrates, this gradient in the amount of information is also manifested geographically. Thus, the worst-surveyed groups only exhibit high completeness values in some places in the northern and central regions of Western Europe, while this pattern widens in the better-surveyed groups, tending to encompass the entire western part of the subcontinent. Nevertheless, the conclusion is that even a region with a prolonged taxonomic tradition shows high heterogeneity in the taxonomic and geographic distribution of its completeness values. This allows us to examine how this inequality influences the capacity to obtain a reliable sample capable of representing the environmental variability of the territory.

The results show that a small number of 30 arcminute cells may cover a large part of the environmental variability of the territory, as measured by %MESS. Thus, only 5% of the European 30 arcminute cells, selected at random, is sufficient to represent approximately 92% of the total environmental variability. However, in cells with completeness values equal to or higher than 90% (around 10.8% of total cells, on average), the mean represented environmental variability barely reaches 54% (minimum = 17%, maximum = 85%, depending on the taxonomic class). The mean %MESS value is even lower (34%) if the cells with completeness values equal to or higher than 95% are selected (5.9% of total). This result demonstrates the uncoordinated and contingent character of the accumulation of biodiversity information and the need for extra effort, which should be more intense in those taxa whose data have lower geographical coverage.

We can consider the random selection of spatial units as a relatively efficient manner of obtaining the environmental representation of a territory. Thus, the difference between the %MESS value of the cells with completeness equal to or higher than 90% and that of a similar number of cells randomly selected across Europe could be considered a measure of how far the data of a taxonomic group are from an adequate environmental coverage. In the case of Europe, this difference is negatively correlated with the number of European cells in which each class is present (r = -0.525; p < 0.02). Thus, the larger the spatial coverage of a group's data, the more efficient the environmental representativeness of its data. Non-vascular plants and invertebrates, as they are under-surveyed, showed much smaller %MESS values (35.8% on average; maximum = 66.7%, minimum = 17.7%) than vertebrate and vascular plant classes (75.4% on average; maximum = 84.6%, minimum = 62.7%). However, because of the shape of the growth curve relating environmental representativeness to the addition of spatial cells, these much less surveyed taxa showed a potentially higher rate of increase in their environmental representativeness with the addition of new data. Although the differing amount of data available for each class would be the main factor explaining the degree of environmental representativeness of the data of each group, our results suggest that some class-specific attributes of the compiled information and/or of their distribution and environmental adaptations could also play a role.
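The group summaries in this paragraph can be recomputed from the MESS% column of Table 1. The sketch below is a minimal check, assuming that Insecta is counted with the invertebrates and that the liverwort and moss classes form the non-vascular plants; the set and function names are illustrative.

```python
from statistics import mean

# MESS% for cells with completeness >= 90%, copied from Table 1
MESS_90 = {
    "Clitellata": 21.32, "Arachnida": 23.98, "Branchiopoda": 37.57,
    "Chilopoda": 27.56, "Collembola": 29.96, "Insecta": 17.69,
    "Amphibia": 80.85, "Aves": 83.13, "Mammalia": 79.42, "Reptilia": 69.95,
    "Bivalvia": 40.57, "Gastropoda": 48.01, "Bryopsida": 22.30,
    "Jungermanniopsida": 66.72, "Marchantiopsida": 58.44,
    "Liliopsida": 67.98, "Lycopodiopsida": 73.16, "Magnoliopsida": 62.72,
    "Pinopsida": 76.93, "Polypodiopsida": 84.60,
}

# Assumed grouping: four vertebrate classes plus five vascular plant classes
VERTEBRATES_AND_VASCULAR = {
    "Amphibia", "Aves", "Mammalia", "Reptilia",
    "Liliopsida", "Lycopodiopsida", "Magnoliopsida", "Pinopsida",
    "Polypodiopsida",
}


def summarise(values):
    """Return (mean rounded to one decimal, minimum, maximum)."""
    return round(mean(values), 1), min(values), max(values)


well_surveyed = [v for k, v in MESS_90.items() if k in VERTEBRATES_AND_VASCULAR]
under_surveyed = [v for k, v in MESS_90.items() if k not in VERTEBRATES_AND_VASCULAR]

under_summary = summarise(under_surveyed)
well_summary = summarise(well_surveyed)
```

Under this grouping the under-surveyed classes average 35.8% (range 17.69 to 66.72) and the vertebrate and vascular plant classes average 75.4% (range 62.72 to 84.60).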

Species Distribution Models based on correlations may be used to forecast the occurrence of a taxon in the absence of exhaustive information (Guisan et al. 2017). Unfortunately, the lack of information in some localities and the low level of completeness in others give rise to an unknown number of false absences, which hinders model estimations (Lobo, 2016). Another requirement in these modelling exercises is that the response variable should be well distributed across the gradient of environmental conditions existing in the selected region. When this does not occur, model results will extrapolate beyond the range of environmental conditions observed during model building (Jiménez-Valverde et al. 2013; Yackulic et al. 2013). Our study indicates that the lack of completeness is widespread across many groups and regions in Europe. Evidently, there is much more information available than what is present in GBIF. However, most of this information is not freely accessible and remains hidden (Hochkirch et al. 2021). Consequently, the use of the available information on the identity and distribution of organisms in biodiversity assessments and conservation efforts still requires strategic sampling approaches and additional efforts to make the currently hidden biodiversity data accessible to the public (Jetz et al. 2012). Furthermore, these extra compilation efforts should be mainly directed towards those spatial units capable of improving the environmental representation currently provided by the spatial units considered well-surveyed. This is the only way to obtain a representative sample capable of producing effective interpolations and reliable predictions of species distributions.

Considering the magnitude and speed of the biodiversity crisis (Glaubrecht, 2023), it seems reasonable that humankind should not wait for reliable data to plan and implement conservation measures but should instead facilitate the necessary actions to obtain the required data for those groups and regions capable of mitigating the existing biased picture of biodiversity information.

Declarations

Competing interests

The authors declare no competing interests.

Author Contribution

JML and EGR conceptualised and designed the study. JML led the writing of the manuscript and the analysis of the data. EGR and JGD developed the software, and all authors have read and approved the manuscript.

Acknowledgement

We are indebted to the naturalists who, over the span of several decades, have compiled the data studied herein.

Data Availability

All the data used are included in the provided GBIF links.

References

Barbosa AM, Real R, Muñoz AR, Brown JA (2013) New measures for assessing model equilibrium and prediction mismatch in species distribution models. Divers Distrib 19:1333-1338. https://doi.org/10.1111/ddi.12100
Dennis RLH, Thomas CD (2000) Bias in butterfly distribution maps: the influence of hot spots and recorder’s home range. J Insect Conserv 4:73–77. https://doi.org/10.1023/A:1009690919835
dos Santos JW, Correia RA, Malhado ACM, Campos-Silva JV, Teles D, Jepson P, Ladle RJ (2020) Drivers of taxonomic bias in conservation research: a global analysis of terrestrial mammals. Anim Conserv 23:679-688. https://doi.org/10.1111/acv.12586
Elith J, Kearney M, Phillips S (2010) The art of modelling range-shifting species. Methods Ecol Evol 1:330-342. https://doi.org/10.1111/j.2041-210X.2010.00036.x
Feng X, Enquist BJ, Park DS, Boyle B, Breshears DD, Gallagher RV, Lien A, Newman EA, Burger JR et al (2022) A review of the heterogeneous landscape of biodiversity databases: Opportunities and challenges for a synthesized biodiversity knowledge base. Global Ecol Biogeogr 31:1242–1260. https://doi.org/10.1111/geb.13497
Fick SE, Hijmans RJ (2017) WorldClim 2: new 1km spatial resolution climate surfaces for global land areas. Int J Climatology 37: 4302-4315. https://doi.org/10.1002/joc.5086
Flather CH (1996) Fitting species-accumulation functions and assessing regional land use impacts on avian diversity. J Biogeogr 23:155–168. https://doi.org/10.1046/j.1365-2699.1996.00980.x
Gaiji S, Chavan V, Ariño AH, Otegui J, Hobern D, Sood R, Robles E (2013) Content assessment of the primary biodiversity data published through GBIF network: Status, challenges and potentials. Biodiversity Informatics 8:94-172. https://doi.org/10.17161/bi.v8i2.4124
García-Roselló E, González-Dacosta J, Lobo JM (2023) The biased distribution of existing information on biodiversity hinders its use in conservation, and we need an integrative approach to act urgently. Biol Conserv 283:110118. https://doi.org/10.1016/j.biocon.2023.110118
García-Roselló E, Guisande C, González-Dacosta J, Heine J, Pelayo-Villamil P, Manjarrés-Hernández A, Vaamonde A, Granado-Lorencio C (2013) ModestR: A software tool for managing and analysing species distribution map databases. Ecography 36:1202–1207. https://doi.org/10.1111/j.1600-0587.2013.00374.x
García-Roselló E, Guisande C, Heine J, Pelayo-Villamil P, Manjarrés-Hernández A, González Vilas L, González-Dacosta J, Vaamonde A, Granado-Lorencio C (2014) Using ModestR to download, import and clean species distribution records. Methods Ecol Evol 5:708–713. https://doi.org/10.1111/2041-210X.12209
GBIF.org (2023a) GBIF Occurrence Download. https://doi.org/10.15468/dl.7dq5d9. Accessed 09 May 2023.
GBIF.org (2023b) GBIF Occurrence Download. https://doi.org/10.15468/dl.jnwejp. Accessed 09 May 2023.
Glaubrecht M (2023) On the end of evolution - Humankind and the annihilation of species. Zool Scr 52 (3):215-225. https://doi.org/10.1111/zsc.12592
Guisande C, Lobo JM (2019) KnowBR. Discriminating well surveyed spatial units from exhaustive biodiversity databases. R Package Version 2.0. http://cran.r-project.org/web/packages/KnowBR
Guisan A, Thuiller W, Zimmermann NE (2017) Habitat suitability and distribution models, with applications in R. Cambridge, UK: Cambridge University Press https://doi.org/10.1017/9781139028271
Hochkirch A, Samways MJ, Gerlach J, Böhm M, Williams P, Cardoso P, Cumberlidge N, Stephenson PJ, Seddon MB, Clausnitzer V, Borges PAV, Mueller GM, Pearce-Kelly P, Raimondo DC, Danielczak A, Dijkstra K-DB (2021) A strategy for the next decade to address data deficiency in neglected biodiversity. Conserv Biol 35:502-509. https://doi.org/10.1111/cobi.13589
Hughes AC, Orr MC, Ma K, Costello MJ, Waller J, Provoost P, Yang Q, Zhu C, Qiao H (2021) Sampling biases shape our view of the natural world. Ecography 44:1259-1269. https://doi.org/10.1111/ecog.05926
Jetz W, McPherson JM, Guralnick RP (2012) Integrating biodiversity distribution knowledge: toward a global map of life. Trends Ecol Evol 27 (3):151-159. https://doi.org/10.1016/j.tree.2011.09.007
Jiménez-Valverde A, Acevedo P, Barbosa AM, Lobo JM, Real R (2013) Discrimination capacity in species distribution models depends on the representativeness of the environmental domain. Global Ecol Biogeogr 22:508–516. https://doi.org/10.1111/geb.12007
Legendre P, Legendre L (2012) Numerical Ecology (3rd ed., p. 990). Elsevier
Lobo JM, Jiménez-Valverde A, Hortal J (2010) The uncertain nature of absences and their importance in species distribution modelling. Ecography 33:103-114. https://doi.org/10.1111/j.1600-0587.2009.06039.x
Lobo JM, Hortal J, Yela JL, Millán A, Sánchez-Fernández D, García-Roselló E, González-Dacosta J, Heine J, González-Vilas L, Guisande C (2018) KnowBR: an application to map the geographical variation of survey effort and identify well-surveyed areas from biodiversity databases. Ecol Indic 91:241-248. https://doi.org/10.1016/j.ecolind.2018.03.077
Lomolino MV (2004) Conservation biogeography. In: Frontiers of Biogeography: new directions in the geography of nature. Lomolino MV and Heaney LR (eds.). Sinauer Associates, Sunderland, Massachusetts. https://doi.org/10.2980/1195-6860(2006)13[424:FOBNDI]2.0.CO;2
Meyer C, Weigelt P, Kreft H (2016) Multidimensional biases, gaps and uncertainties in global plant occurrence information. Ecol Lett 19:992–1006. https://doi.org/10.1111/ele.12624
Myers N, Mittermeier RA, Mittermeier CG, da Fonseca GAB, Kent J (2000) Biodiversity hotspots for conservation priorities. Nature 403:853–858. https://doi.org/10.1038/35002501
Rees T (2022) The Interim Register of Marine and Nonmarine Genera. Available from https://www.irmng.org at VLIZ. Accessed 2022-8-16.
Rocchini D, Tordoni E, Marchetto E, Marcantonio M, Márcia Barbosa A, et al (2023) A quixotic view of spatial bias in modelling the distribution of species and their diversity. npj Biodivers 2:10. https://doi.org/10.1038/s44185-023-00014-6
Ruggiero MA, Gordon DP, Orrell TM, Bailly N, Bourgoin T, Brusca RC, Cavalier-Smith T, Guiry MD, Kirk PM (2015) A higher level classification of all living organisms. Plos One 10:e0119248. https://doi.org/10.1371/journal.pone.0119248
Soley-Guardia M, Alvarado-Serrano DF, Anderson RP (2024) Top ten hazards to avoid when modeling species distributions: a didactic guide of assumptions, problems, and recommendations. Ecography 2024 (4):e06852. https://doi.org/10.1111/ecog.06852
Title PO, Bemmels JB (2018) ENVIREM: an expanded set of bioclimatic and topographic variables increases flexibility and improves performance of ecological niche modeling. Ecography 41:291–307. https://doi.org/10.1111/ecog.02880
Titley MA, Snaddon JL, Turner EC (2017) Scientific research on animal biodiversity is systematically biased towards vertebrates and temperate regions. Plos One 12:e0189577. https://doi.org/10.1371/journal.pone.0189577
Troudet J, Grandcolas P, Blin A, Vignes-Lebbe R, Legendre F (2017) Taxonomic bias in biodiversity data and societal preferences. Sci Rep-UK 7: 9132. https://doi.org/10.1038/s41598-017-09084-6
Ugland KI, Gray JS, Ellingsen KE (2003) The species-accumulation curve and estimation of species richness. J Anim Ecol 72:888–897. https://doi.org/10.1046/j.1365-2656.2003.00748.x
WoRMS Editorial Board (2022) World Register of Marine Species. Available from https://www.marinespecies.org at VLIZ. Accessed 2022-8-16.
Yackulic CB, Chandler R, Zipkin EF, Royle JA, Nichols JD, Campbell Grant EH, Veran S (2013) Presence-only modelling using MAXENT: when can we trust the inferences? Methods Ecol Evol 4:236–243. https://doi.org/10.1111/2041-210x.12004

Supplementary Material is not available with this version.

Reviewers agreed at journal
04 May, 2024
Reviewers agreed at journal
28 Apr, 2024
Reviewers agreed at journal
27 Apr, 2024
Reviewers invited by journal
24 Apr, 2024
Editor assigned by journal
12 Apr, 2024
Submission checks completed at journal
12 Apr, 2024
First submitted to journal
11 Apr, 2024

Status:

Version 1