Montane forest is the rarest vegetation type on the African continent (White 1983). However, this biologically rich and unique “Afromontane” ecosystem is increasingly threatened by deforestation, fragmentation, and degradation (Balmford et al. 2001; Brooks et al. 2002; Mittermeier et al. 2004; Burgess et al. 2007; Cordeiro et al. 2007; Plumptre et al. 2007), while climate change has added new threats (Parmesan 2006; Sala et al. 2000; Carr et al. 2013; Ayebare et al. 2018; Wright et al. 2022). Many species confined to montane forest are highly susceptible to the impacts of climate change and human influences (Carr et al. 2013; Ayebare et al. 2018; Wright et al. 2022). This calls for prioritization of conservation actions to protect exceptional montane forest species and habitats that are prone to extinction driven by anthropogenic disturbance and climate change. Mapping of forest vegetation is one approach that can help to assess, plan, and guide conservation management in addressing these challenges (Zimmermann and Kienast 1999; Dias et al. 2004; Fjeldsa 2007; Brinkmann et al. 2011).
Predictive vegetation mapping is defined as predicting the vegetation composition across a landscape from mapped environmental variables (Franklin 1995). It is normally done using a combination of field data with digital maps of topography, as well as climatic limiting factors, facilitated by flexible, algorithmic modeling approaches and is driven by the need to map vegetation over large areas for conservation planning (Pfeffer et al. 2003; Evans and Cushman 2009). Using non-parametric machine learning data-mining techniques can readily uncover complex patterns in species data and relationships between species and physical features of the environment (Brieman 2001; Iverson et al. 2004; Prasad et al. 2006; Cutler et al. 2007; Evans and Cushman 2009).
The aim of the study was to predict and map the distribution of tree communities across Bwindi Impenetrable National Park, (Bwindi), an Afromontane forest in southwest Uganda. Because Bwindi is primarily forest (Howard 1991), we focus on trees to determine differences in the vegetation. There is a dearth in recent information on the distribution of tree species and communities across the Park. Accurate maps of vegetation can aid management and contribute to various research as well as identifying restricted and vulnerable communities and habitats to support priority conservation planning (McDonald and Boucher 1999; Pfeffer et al. 2003; Brinkmann et al. 2011). For example, the mountain gorilla (Gorilla beringei beringei) is a highly endangered flagship subspecies, and the population in Bwindi is one of just two in Africa (McNeilage et al. 1997). Despite their conservation (Butynski 1985) and economic (Macfie and Williamson 2010) importance, the present distribution patterns of gorillas related to forest composition are poorly understood because there is no up-to-date map of the forest vegetation (Guschanski et al. 2008).
Mountains create complex environmental gradients often covering broad range in temperatures, terrains, moisture, soils, illumination, and human impacts. Many environmental factors vary with elevation, such that the conditions required for species to persist are restricted to specific elevations leading to some characteristic zonation along elevation gradients (Hamilton 1975; Richards 1996; Lovett et al. 2001; Eilu et al. 2004; Ghazoul and Sheil 2010; Schmitt et al. 2010). Plant communities are further structured at a local scale by topography, reflecting differences in soil depth, structure, moisture, or nutrients (Leggat and Osmaston 1961; Hamilton 1969; Ghazoul and Sheil 2010; Eilu et al. 2004). Tropical species are known to be highly sensitive to disturbance and shifts in environmental conditions (Sheil 2016; Ssali et al. 2017a). For example, human disturbance has been shown to relax the constraints imposed by competition and extend effective elevation ranges of some species, particularly those in secondary forest, to warmer and cooler climates (Muñoz Mazón et al. 2019). Thus, montane vegetation spatial patterns are a result of complex interactions between topography, past and recent natural and human disturbance factors as well as interactions among species such as competition (Hoersch et al. 2002). This makes the understanding of relationships between physical features of the environment and vegetation a great challenge. Given the wide variation in elevation, topography and past human disturbance across Bwindi Forest and evidence from aforementioned studies, we hypothesized that elevation, topography, and human disturbance, either acting in isolation or in combination, greatly influence the composition and distribution of tree species and communities.
Study area
Bwindi Impenetrable National Park (henceforth “Bwindi”) is located in SW Uganda (Fig. 1), in the Kigezi Highlands at the eastern edge of the Albertine Rift (latitude 0º53´- 0º8´S and longitude 29º35´- 29º50´E). Covering an area of 331 km2, Bwindi lies at the north west end of the Kigezi Highlands which are associated with the up-warping and faulting during the formation of the Albertine Rift and are underlain by Precambrian shale, phyllite, quartz, quartzite, schist and granite of the Karagwe-Ankolean System (Leggat and Osmaston 1961). The Atlas of Uganda (1967) classified the soils of Bwindi as belonging to the “non-differentiated humic ferrallitic” types and having been derived from the foregoing geological formation, and they comprise of mainly tropical red earths with an overlying layer of brown to black spongy humus. The erosive action of the numerous rivers in their youth stage within the Highlands has caused the topography of the park to be extremely rugged with narrow, steep-sided valleys and deep gullies that run in various directions, bound by emergent hill crests lying between 1,190 m in the northwest and 2,607 m a.s.l. in the southeast (Howard 1991). The forest has been classified as a ‘moist lower montane forest’ (Hamilton 1982), ranging from near the upper boundary of lowland forest to montane forest (Hamilton 1975). It is one of the few forests in all of Africa where lowland and montane forests are in a continuum (Hamilton 1989). This exceptional feature is very important for studies of elevational variations in rainforest ecosystems in tropical Africa. Hamilton (1982, 1989) postulated that the Bwindi was richer in tree species than the other forests of western Uganda because of proximity to the Pleistocene refuge. However, recent studies attribute the rich tree diversity of Bwindi to high rainfall and soil characteristics (Eilu et al. 2004), highlighting the importance of studying shifts in floristic diversity, composition, and distribution along environmental gradients. The park lies in a region with the highest rural human population density in Africa (Codeiro et al. 2007). Because of this, little natural forest persists outside the boundaries of the park (Twongyirwe et al. 2011). Before it became a park, the forest was subjected to various forms of human disturbance with pit-sawing and poaching occurring throughout the park (Albrechet 1976; Harcourt 1981; Butynski 1984; Howard 1991; McNeilage et al. 1998), while mining (gold and tungsten) was concentrated in specific locations (Harcourt 1981; Butynski 1984). Human-induced wildfires impacted the forest periphery (Leggat and Osmaston 1961; Butynski 1984), and numerous human trails and two public roads traverse the forest (Harcourt 1981; Butynski 1984). Ongoing disturbance processes include localized landslides, elephants (Loxodonta africana), occasional wildfires following long periods of drought, harvesting of poles and wire snaring (Babaasa 2000; Babaasa et al. 2004; McNeilage et al. 2001; McNeilage et al. 2006; Olupot et al. 2009; Ssali et al. 2012; Hickey et al. 2019), and a public road cutting through the narrow forest corridor and high elevation zone (Barr et al. 2015). This complexity of factors contributes to making Bwindi a challenging site for investigations of plant communities and consequently attempts to map Bwindi vegetation have been limited.
Information on Bwindi’s vegetation cover is scarce and limited. A forest type map prepared by Cahusac (1961) for managing timber harvesting operations is now outdated given the management history of Bwindi that included over 40 years of intense and forest-wide timber harvesting (Butynski 1984; Howard 1991), whose long-term impact was the creation of a diverse patch mosaic of vegetation types differing in age, hence succession, and broken canopy cover (Babaasa et al. 2004; Sheil 2012). Incessant, heavy human and natural disturbances have prevented reclosure of the forest canopy (Ssali et al. 2017a, b). Previous descriptions of the tree flora of Bwindi were limited to a small area of high elevation in the south-east (Leggat and Osmaston 1961; Hamilton 1969, 1974; Lind and Morrison 1974; Howard 1991) or low elevation in the north (Eilu et al. 2004). Later work on trees (Davenport et al. 1996), though covering more representative areas of the forest, recorded only the species encountered but did not geo-reference the sample sites, preventing spatial analyses and extrapolation. Until now, it had remained unclear how forest-wide floristic patterns were spatially structured in relation to the environment. The results of this study were urgently needed by park management for conservation planning of Bwindi.
Study design
To account for the different environmental conditions, we employed a stratified random sampling approach to guide sampling of tree species. The park was divided into five strata based on geological formations visible on the Digital Elevation Model (DEM) of Bwindi and the starting points on the boundary of the DEM in each stratum selected randomly with the random point function within ArcGIS (version 10.2; ESRI, Redlands, CA, USA). Line transects were purposely drawn on the DEM in each stratum from the random starting points to traverse the topographic positions of the ridges (Fig. 1; Table 1). The number and length of the transects selected varied with area, accessibility and shape of the strata. We then superimposed the transect drawings on high-resolution (0.5m) true color, digital aerial photographs of Bwindi. The aerial photos were visually interpreted along the transects by drawing polygons around areas perceived to be of uniform tree community structure based on differences in tone and texture. This allowed the sample plots to be placed in what we perceived to be different tree communities.
Tree species sampling
We carried a printed copy of the digitized polygons, overlaid with a coordinate grid, and used a hand-held Global Positioning System (GPS) device to locate the digitized polygons in the field. A single random point within each digitized polygon was selected for tree species sampling. At each random sample point, we used the point-to-tree distance technique or plotless sampling method to sample trees (Hall 1991; Sheil et al. 2003; Klein and Vilcko 2006). This technique involved selecting the nearest 15 trees (≥20 cm dbh) around the random center-point. The selected trees were identified to species level and we measured the diameter-at-breast-height (dbh) of each individual. We named the tree species following nomenclature used in Kalema and Hamilton (2020). The distance from the center-point to the 15th farthest tree was measured and regarded as the radius of circular plot. This procedure is especially suitable for rapid and robust assessments of tropical rain forest composition compared to fixed area plot methods (Sheil et al. 2003; Klein and Vilcko 2006). At each center-point, eight environmental attributes were recorded: aspect – as the compass direction facing down slope; and steepness of the slope using a clinometer. Untransformed aspect and slope are poor for quantitative analysis, so slope was transformed to a more suitable index by taking the sine of the slope in degrees; aspect was also transformed into a suitable index by taking the negative cosine of the angle in degrees minus 35 (McCune and Grace 2002; Cushman and Wallin 2002). Four physiographic positions of valley, hillside, ridge tops and gully were simply recorded as “1” if the plot was in that physiographical class and “0” otherwise. The final recorded site characteristics consisted of spatial variables – the Universal Transverse Mercator (UTM) coordinates of easting and northing (datum WGS 84) using a hand-held GPS unit, standardized to zero mean and unit variance.
Biophysical predictor variables
We acquired twenty-one digitally mapped biophysical variables (Table 2). These were: the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM) 30×30 m Version 2 (http://gdem.ersdac.jspacesystems.or.jp/), and nineteen 1-km-scale (30 sec) bioclimatic variables, that include 11 temperature and 8 precipitation variables from WorldClim version 2 (http://worldclim.org/version2; Fick and Hijmans 2017). We clipped the ASTER GDEM and the 19 bioclimatic rasters to a window covering Bwindi Forest only. Lastly, the human disturbance across the park was based on the combined relative encounter of human activity signs that were likely to have an impact on vegetation – wood cutting, bee hives, old pitsaw sites, disused mineral extraction pits, snares, and fireplaces (McNeilage et al. 1998). We used the geo-references of all the 289 plots to extract site values from the predictor rasters. We constructed an environmental matrix of extracted variable values together with site measurements (aspect, slope, topographic position and x,y point coordinates) and tested for pairwise collinearity and one of any pair of highly correlated variables (Pearson r ≥0.75) discarded (Table 2). We ended with an environmental matrix of 289 plots by 12 least correlated environmental variables.
Data analyses
Based on field data, a plots-versus-species matrix (using basal area values of each tree species relative to the area of the plot [m2 ha-1]) was created. We only considered tree species occurring in more than four plots, resulting in a 289 plot by 89 tree species matrix. The data were natural log-transformed following the generalized procedure (McCune and Grace 2002) to minimize the influence of large trees. We subjected the data matrix to a varius multivariate techniques to identify associations and spatial patterns among the tree species using R software (version 4.2.1; R Core Team 2022). A polythetic, agglomerative hierarchical clustering (HC) using the flexible beta linkage method (β = -0.25; Lance and Williams 1966; Legendre and Legendre 1998) with the agnes() function and Bray-Curtis as a distance measure was used as a method for cluster formation. We applied the Indicator Species Analysis (ISA), described below, to determine the optimal level of pruning the HC dendrogram. Because we had numerous plots (n = 289) we simplified the presentation using composite plots from the original plots by computing the centroid of each of the cluster formed by HC analysis as the mean of the basal area/ha of each tree species per cluster.
We used Multi-Response Permutation Procedures (MRPP) and a Mantel’s test to test for differences in composition between the HC cluster groups. We also utilized ISA, a method that combines frequency and mean abundance tables, to identify the characteristic tree species for each cluster. Statistical significance of the indicator tree species was determined by random permutations for each species at p<0.05 significance level. Indicator values vary from 0 (no indication) to 1 (perfect indication).
A Kruskal’s non-metric multidimensional scaling (NMDS) based on Bray-Curtis coefficient was used to evaluate the ecological tendencies reflected in the HC dendrogram and the relationship between plots, clusters and environmental variables (Table 2). The plots were ordinated, then overlaid with the cluster centroids from HC analysis with surrounding confidence ellipses at ±2 SD from the mean (enclosing approximately 95% of plots within each cluster). Lastly, 12 least correlated site environmental variables (Table 2) were fitted onto the ordinations using 1,000 permutations.
We used Random Forests technique (Breiman 2001; Liaw and Wiener 2002) to predict and map the spatial distribution of the HC clusters. The Random Forests algorithm has been shown to produce very accurate predictions without overfitting models to the data (Breiman 2001). In Random Forests, bootstrap samples are drawn to construct multiple trees. Each tree is grown with a randomised subset of predictors and a large number (500-2,000) of trees are grown (hence 'forest' of trees). The trees are grown to maximum size without pruning and aggregation takes place by averaging the trees. It is not possible to interpret each of the trees in Random Forests, but the procedure does provide tables on relative importance among predictor variables. The proportion of the data not drawn (out-of-bag or OOB) in each bootstrap replicate is used to calculate an unbiased error rate and variable importance, eliminating the need for a test set or cross-validation. Since our response variable, the clusters, was a factor (categorical), we performed a classification procedure in the analysis. Six least correlated variables (Pearson’s r <0.75) that are digitally mapped for the whole park: Isothermality, temperature seasonality, Minimum temperature of the coldest month, Annual precipitation, Mean temperature of the coldest quarter and elevation were tested as predictors for cluster distribution.