4.1-Plant taxonomic group completeness, species richness patterns, and the way forward for new data collections to more efficiently contribute to data use and biodiversity conservation in Africa
With respect to taxon rank completeness, the results showed that 12.66% of the plant records were not determined to the species level. Because these records are incomplete, they cannot be used in fine-scale analyses such as ecological niche modeling or species-level completeness analysis. We suggest referring to the specimen images, where available, to complete the species determinations.
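As an illustration of how such a share can be derived, the sketch below counts records by taxon rank. It assumes each record carries a `taxonRank` field with values such as `SPECIES` or `GENUS`, as is usual in GBIF occurrence downloads; the function name, the set of ranks treated as species level, and the sample records are hypothetical and are not taken from this study.

```python
# Ranks treated as "determined to species level or below".
# This set is an assumption of the sketch, not the study's definition.
SPECIES_LEVEL_RANKS = {"SPECIES", "SUBSPECIES", "VARIETY", "FORM"}


def share_not_to_species(records):
    """Return the percentage of records whose rank is coarser than species."""
    if not records:
        return 0.0
    undetermined = sum(
        1
        for rec in records
        if (rec.get("taxonRank") or "").upper() not in SPECIES_LEVEL_RANKS
    )
    return 100.0 * undetermined / len(records)


# Hypothetical toy records, for illustration only.
sample = [
    {"taxonRank": "SPECIES"},
    {"taxonRank": "GENUS"},
    {"taxonRank": "SPECIES"},
    {"taxonRank": "FAMILY"},
    {"taxonRank": "SUBSPECIES"},
    {"taxonRank": "SPECIES"},
    {"taxonRank": "SPECIES"},
    {"taxonRank": "SPECIES"},
]
print(f"{share_not_to_species(sample):.2f}% not determined to species level")
```

Records with a missing or empty rank are counted as undetermined here, which is a conservative choice; a real analysis might prefer to report them separately.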
Out of the 12 plant classes published on the GBIF site from Africa, two dominate the occurrence records, together accounting for approximately 95%: Magnoliopsida (79.62%) and Liliopsida (15.10%). The other 10 plant classes share the remaining 5% of records, which clearly indicates large data gaps across plant classes. The same applies to the species richness pattern, where Magnoliopsida and Liliopsida accounted for 90.01% of the total number of plant species derived from the data records we downloaded. Collecting data in the deficient classes should therefore be among the priorities for new collections. Indeed, many medicinal plants, microalgae, and phytoplankton belong to the underrepresented plant classes. For example, the class Ginkgoopsida, which contains the sole species Ginkgo biloba, has only 63 records, although the medicinal properties of that species are recognized worldwide (Singh et al., 2008; Belwal et al., 2019). In the class Pyramimonadophyceae, phytoplankton are quite important in the marine food web (Jónasdóttir, 2019), while microalgae belonging to other underrepresented classes (Ulvophyceae, Chlorophyceae, Trebouxiophyceae, etc.) play an important role in the food web of streams (Hodač et al., 2015). Collecting more data records in those marginally represented classes would enable more efficient data use and contribute to the conservation and sustainable use of their biodiversity. New data collections to fill data gaps across plant taxonomic groups should also address the representativeness of plant families. For example, the Meliaceae, Ebenaceae, and Moraceae families are rich in valuable timber species; the Lamiaceae, Combretaceae, Sapotaceae, and Annonaceae families contain many medicinal plants and/or bear fruits of commercial interest (Britannica, 2008; 2015; 2021; Houéssou et al., 2012; Luteyn et al., 2021). These families should also be among the priorities for new data collections across Africa.
Data published from Africa on the GBIF site are still scarce (less than 2% of the records). However, the total number of plant species we derived from these few records was 72,991, which is greater than the estimated number (40,000 to 60,000) of plant species in tropical Africa (Lebrun & Stork, 1991–1997; Küper et al., 2004). Likewise, the total number of plant species derived from the data published from Benin is 5,013, as opposed to the 2,807 species reported in the Flora of Benin by Akoègninou et al. (2006). GBIF-mediated data should therefore provide insights, in the near future, for the revision of the Flora of tropical Africa and of the floras of many African countries. The same is likely to apply to countries on other continents, confirming GBIF as a leading worldwide infrastructure and data repository, and a unique reservoir and driver of research and knowledge on biodiversity.
4.2-Data completeness with respect to the basis of records
The basis of record refers to the type, nature, and method of data collection (GBIF, 2022). In the plant data records of Africa, two bases of record were dominant: preserved specimens (75.49%) and human observation (18.60%). Data from sampling events represent 1.89% of the records. Overall, at the African level, efforts must be made to publish more preserved specimen data and sampling event data. Indeed, human observation data are difficult to validate in the absence of vouchers (Chapman, 2005). The pattern of the basis of record, however, varies from country to country. For example, in Benin, human observation data represent 77.70% of the records published at the GBIF site, preserved specimen data 4.81%, and sampling event data 4.00%. By contrast, in Kenya, preserved specimen data represent 77.37% of the records, human observation 10.40%, and sampling event data 10.81%. In Madagascar, preserved specimens dominate the published data, accounting for 99.44%. In Tanzania, preserved specimen data represent 84.87% of the records, human observation 0.37%, and sampling event data 2.57%. In South Africa, preserved specimens dominate the published data (98.40%), followed by fossil specimens (1.37%). We therefore deduce that the effort needed to increase preserved specimen and sampling event data will vary from country to country.
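The per-country breakdown described above can be sketched as a simple tally. The example assumes records are available as (country code, basis of record) pairs; the labels, the function name, and the toy records are hypothetical and do not reproduce the percentages reported in this study.

```python
from collections import Counter, defaultdict


def basis_shares_by_country(records):
    """Map each country to {basis of record: percentage of that country's records}."""
    counts = defaultdict(Counter)
    for country, basis in records:
        counts[country][basis] += 1
    shares = {}
    for country, counter in counts.items():
        total = sum(counter.values())
        shares[country] = {
            basis: round(100.0 * n / total, 2) for basis, n in counter.items()
        }
    return shares


# Hypothetical toy records as (country code, basis of record) pairs.
toy_records = (
    [("BJ", "HUMAN_OBSERVATION")] * 3
    + [("BJ", "PRESERVED_SPECIMEN")] * 1
    + [("KE", "PRESERVED_SPECIMEN")] * 4
    + [("KE", "HUMAN_OBSERVATION")] * 1
)

for country, shares in sorted(basis_shares_by_country(toy_records).items()):
    print(country, shares)
```

Computing the shares within each country, rather than over the whole continent, is what makes contrasting patterns such as those of Benin and Kenya visible.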
4.3-Data completeness across subregions of the continent
The geographic patterns of data completeness show that plant data gaps are quite large across the continent at either spatial resolution. Data completeness is highest in West Africa (Benin and southern Liberia, Côte d’Ivoire, Ghana, Togo, and Nigeria). In Central Africa, plant data completeness is greatest in Cameroon and Gabon. In southern Africa, the data gap is smallest in South Africa and Mozambique. In East Africa, Tanzania, Kenya, Uganda, Rwanda, and Ethiopia have achieved the greatest data completeness. Madagascar also shows high data completeness. Data gaps are quite large in the DRC and in Sahelian and North African countries. In all sub-regions of Africa, efforts are therefore still needed across countries to fill data gaps. Our results agree with those of Asase and Peterson (2016), who found that data completeness was greater in the southern parts of Ghana. Plant data completeness was also analyzed in Benin by Ganglo and Kakpo (2016), who pointed out that data gaps were mostly found in the northernmost departments of the country. The results of this study suggest that, since then, appreciable data publication efforts have been made to fill the gaps identified at that time in Benin.
4.4-Impact of accessibility and protected areas on data completeness
Sampling bias is commonly addressed in plant and animal inventories (Kadmon et al., 2004). In our results, accessibility by roads or waterways significantly affected data completeness, and the coarser the resolution of the spatial grid cells, the stronger the correlation between the length of roads or waterways and the number of records. Our results are supported by those of Ballesteros-Mejia et al. (2013), who reported positive effects of traffic access, road density, and tourism on the inventory completeness of tropical insects in Sub-Saharan Africa. They are also supported by a previous study by Ganglo and Kakpo (2016) on plant data completeness in Benin, which found that data completeness was positively affected by the density of roads and waterways. Our results are likewise in line with those of Kadmon et al. (2004) and Souza-Baena et al. (2013), whose respective studies on plant species data of Israel and plant data completeness of Brazil found that road bias affected the distribution of data. Our results further showed that protected areas affect data completeness and that protected areas are less well inventoried at coarser resolutions than at finer resolutions (≤ 2500 km²). This agrees with Ganglo and Kakpo (2016), who found that plant data completeness in Benin was greater in smaller protected areas. Another dimension of bias in data completeness, worth considering, is underlined by Meyer et al. (2015) in their gap analysis of digitally accessible information on terrestrial vertebrates; they found that data completeness depends on distance to researchers, local research funding, and data publication rather than on transportation infrastructure. The lesson to draw from these possible biases is that new data collections must prioritize areas far from roads (beyond 500–2000 m; Kadmon et al., 2004) and waterways, as well as the remote parts of the largest protected areas (> 2500 km²). Opening new roads that better represent the ecological conditions of the landscapes across African countries can significantly contribute to less biased data collections. More incentives for data publication and closer connections between researchers can also contribute to filling gaps in information (Meyer et al., 2015).
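The relationship between accessibility and the number of records at different grid resolutions can be explored along the following lines. This is a minimal sketch that assumes a Pearson correlation coefficient and a regular grid whose cells each hold a road length and a record count; the statistic actually used in this study is not restated here, and the function names and the toy grid are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / sqrt(var_x * var_y)


def coarsen(cells, factor):
    """Sum (road_km, n_records) of fine cells into blocks of factor x factor cells."""
    blocks = defaultdict(lambda: [0.0, 0])
    for (row, col), (road_km, n_records) in cells.items():
        key = (row // factor, col // factor)
        blocks[key][0] += road_km
        blocks[key][1] += n_records
    return {key: tuple(vals) for key, vals in blocks.items()}


def road_record_correlation(cells):
    """Correlate road length with record count across all cells."""
    roads = [road_km for road_km, _ in cells.values()]
    records = [n_records for _, n_records in cells.values()]
    return pearson(roads, records)


# Hypothetical 4 x 4 grid of (road length in km, number of records).
grid = [
    [(5.0, 12), (1.0, 0), (0.0, 3), (2.0, 1)],
    [(3.0, 4), (0.0, 2), (4.0, 9), (1.0, 0)],
    [(0.0, 0), (2.0, 5), (6.0, 10), (0.0, 1)],
    [(1.0, 3), (0.0, 0), (2.0, 2), (3.0, 7)],
]
fine_cells = {(r, c): cell for r, row in enumerate(grid) for c, cell in enumerate(row)}
coarse_cells = coarsen(fine_cells, 2)

print("fine resolution r:", round(road_record_correlation(fine_cells), 3))
print("coarse resolution r:", round(road_record_correlation(coarse_cells), 3))
```

With real data, the same comparison would be repeated for each grid resolution and for waterways, and the coefficients would be tested for significance; the toy grid above is far too small for that purpose.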