4.1. Study extent
This study was carried out in Benin, where almost all the sacred forests were surveyed and inventoried. The dataset published on the GBIF website constitutes part of the data collected within the framework of the PIFSAP project. Species were collected in forests located in all the departments of Benin.
4.2. Sampling description
An inventory of species in the sacred forests and reserves of Benin was carried out, and species occurrences were systematically collected. A field sheet was designed for this purpose to record the key information (scientific name; day, month and year of collection; geographical coordinates; description of the locality; and any other information deemed important). The field equipment used during the inventories consisted of GPS (Global Positioning System) receivers for geo-referencing (longitude and latitude were recorded), a compass to set the walking direction and for orientation, and digital cameras for taking photographs.
4.3. Quality control
Herbarium specimens of the species collected in the field were sent to the National Herbarium of Benin (HNB) for identification and validation of scientific names. The information recorded for each species was entered into a spreadsheet (an Excel sheet containing the standard fields for publishing data on the GBIF website) in accordance with the Darwin Core (DwC) Standard (Darwin Core Task Group, 2009). Finally, the dataset was cleaned and prepared for publication on the GBIF website.
Data cleaning is a process used to improve data quality by correcting detected errors and omissions so that the data are fit for use. The data collected were therefore cleaned according to the following steps: define and determine the types of errors; search for and identify occurrences of errors; correct the errors; document error cases and typical errors; and modify the data entry process to reduce future errors. Taxonomic, spatial and temporal errors were thus corrected in the essential attributes of the primary biodiversity data collected in Benin.
4.4. Step description
The information collected from the inventory of species in the sacred forests and reserves of Benin was used to complete the spreadsheet. The spreadsheet contains Darwin Core fields (Wieczorek et al., 2012), including the following: «eventID», «occurrenceID», «institutionID», «basisOfRecord», «eventDate», «year», «month», «day», «kingdom», «phylum», «class», «family», «genus», «subgenus», «specificEpithet», «infraspecificEpithet», «scientificName», «scientificNameAuthorship», «taxonRank», «continent», «waterBody», «countryCode», «country», «stateProvince», «locality», «decimalLatitude», «decimalLongitude», «coordinatePrecision», «geodeticDatum», «minimumElevationInMeters», «maximumElevationInMeters», «minimumDepthInMeters», «maximumDepthInMeters», «habitat», «fieldNumber», «individualCount», «organismQuantityType», «occurrenceStatus», «samplingProtocol», «eventRemarks», «recordedBy».
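As an illustration of how a spreadsheet header could be checked against a list of expected Darwin Core fields, the following Python sketch reports missing, unexpected and repeated column names. It is not the procedure used for this dataset: the function name `check_header` and the shortened `EXPECTED_FIELDS` list are assumptions chosen for the example.

```python
# Illustrative subset of the Darwin Core fields listed in the text
# (assumption: a real check would use the full list).
EXPECTED_FIELDS = [
    "occurrenceID", "basisOfRecord", "eventDate", "scientificName",
    "countryCode", "stateProvince", "decimalLatitude",
    "decimalLongitude", "geodeticDatum", "fieldNumber",
]


def check_header(header, expected=EXPECTED_FIELDS):
    """Compare spreadsheet column names with the expected field names.

    Surrounding whitespace is stripped before comparison, so a column
    typed as " fieldNumber" is treated as "fieldNumber".
    """
    cleaned = [name.strip() for name in header]
    missing = [f for f in expected if f not in cleaned]
    unexpected = sorted({f for f in cleaned if f not in expected})
    duplicated = sorted({f for f in cleaned if cleaned.count(f) > 1})
    return {"missing": missing, "unexpected": unexpected, "duplicated": duplicated}
```

A misspelled column therefore shows up twice: the expected name appears under `missing` and the name as typed appears under `unexpected`.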
Taxonomic, temporal and spatial errors were corrected (Ganglo et al., 2017). For taxonomic errors, the Catalogue of Life and the Taxonomic Name Resolution Service (TNRS) (Boyle et al., 2013) were used. These online resources are useful for correcting spelling errors (misspelled names) and format errors (binomial nomenclature), and for replacing synonyms with accepted names. For spatial errors, the geographical coordinates (longitude and latitude) were plotted in QGIS Desktop version 2.18.4 (Sutton and Dassau, 2015) and those falling outside the study area (outliers) were deleted (Ganglo et al., 2017). Coordinates outside the expected range, as well as coordinates at 0.00, were removed. GEOLocate and EarthExplorer were used to derive geographical coordinates from administrative subdivisions or from the description of the collection locality. Occurrences with no information on the collection location were removed. In addition, the Canadensys coordinate conversion tool and speciesLink were used to convert geographical coordinates (longitude and latitude) into decimal degrees, the appropriate format for publishing data on the GBIF website. For temporal errors, inconsistent dates such as 13-13-2050 (errors made during data entry) were corrected in the eventDate field, and feedback was given to the data provider accordingly. Moreover, the eventDate field was formatted according to the standard accepted for publication on the GBIF website: year, month and day.
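The spatial and temporal checks described above were performed with QGIS and online tools. The Python sketch below only illustrates the underlying logic under stated assumptions: the bounding box is a rough approximation of Benin's extent chosen for the example, not the boundary used in the study, and the function names are hypothetical.

```python
from datetime import date

# Rough bounding box for Benin in decimal degrees (assumed values,
# for illustration only; the study used QGIS and the actual study area).
BENIN_BBOX = {"min_lat": 6.0, "max_lat": 12.5, "min_lon": 0.7, "max_lon": 3.9}


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes and seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal degrees.
    return -value if hemisphere.upper() in ("S", "W") else value


def coordinate_is_plausible(lat, lon, bbox=BENIN_BBOX):
    """Reject placeholder 0.00 values and points outside the bounding box."""
    if lat == 0.0 or lon == 0.0:
        return False
    return (bbox["min_lat"] <= lat <= bbox["max_lat"]
            and bbox["min_lon"] <= lon <= bbox["max_lon"])


def parse_event_date(day, month, year, latest=None):
    """Return the date as YYYY-MM-DD, or None if it is impossible or in the future.

    A None result flags the record for correction; it does not fix it.
    """
    latest = latest or date.today()
    try:
        parsed = date(year, month, day)
    except ValueError:  # e.g. month 13 or day 31 in a 30-day month
        return None
    if parsed > latest:
        return None
    return parsed.isoformat()
```

For example, `parse_event_date(13, 13, 2050)` returns `None` because month 13 does not exist, so a record with that date would be flagged for correction with the data provider.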
Finally, duplicate occurrences were removed by considering the fields scientificName, decimalLatitude and decimalLongitude (fields relating to the essential attributes of primary biodiversity data), so that each occurrence is unique.
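A minimal Python sketch of this deduplication step is given below. It assumes that each record is a dictionary keyed by Darwin Core field names and that the first record of each duplicate group is kept; the text does not state which copy was retained, and the function name is hypothetical.

```python
def drop_duplicate_occurrences(
    records,
    keys=("scientificName", "decimalLatitude", "decimalLongitude"),
):
    """Keep the first record for each combination of the key fields."""
    seen = set()
    unique = []
    for record in records:
        # The tuple of key values identifies an occurrence.
        signature = tuple(record.get(key) for key in keys)
        if signature in seen:
            continue  # same name at the same coordinates: drop it
        seen.add(signature)
        unique.append(record)
    return unique
```

Records that share a scientific name but differ in either coordinate are kept, since they represent distinct occurrences.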