Predictive mapping of tree species assemblages in an African montane rainforest

Conservation of mountain ecosystems can benefit from knowledge of habitats and their distribution patterns. This benefit is particularly true for diverse ecosystems with high conservation values such as the “Afromontane” rainforests. We mapped the vegetation of one such forest: the rugged Bwindi Impenetrable Forest, Uganda, a World Heritage Site known for its many restricted‐range plant and animal taxa, including several iconic species. Given variation in elevation, terrain and human impacts across Bwindi, we hypothesized that these factors influence the composition and distribution of tree species. To test this, detailed surveys were carried out using stratified random sampling. We established 289 georeferenced sample sites (each with 15 trees ≥20 cm dbh) ranging from 1320 to 2467 m a.s.l. and measured 4335 trees comprising 89 species that occurred in four or more sample sites. These data were analyzed against 21 digitally mapped biophysical variables using various analytical techniques including nonmetric multidimensional scaling (NMDS) and random forests. We identified six tree species assemblages with distinct compositions. Among the biophysical variables, elevation had the strongest correlation with the ordination (r² = 0.5; p < 0.001). The “out‐of‐bag” (OOB) estimate of the error rate for the best final model was 50.7%, meaning that nearly half of the variation was accounted for using a limited set of variables. We demonstrate that it is possible to predict the spatial pattern of such a forest based on sampling across a highly complex landscape. Such methods offer accurate mapping of composition that can guide conservation.

Many environmental factors vary with elevation, leading to some characteristic zonation patterns (Hamilton, 1975; Hedberg, 1951; Schmitt et al., 2010). Plant communities are further structured at a local scale by topography, reflecting differences in soil depth, structure, moisture, nutrients, exposure and aspect that contribute to local variations in suitability for different plants to establish and persist (Eilu et al., 2004; Ghazoul & Sheil, 2010; Lovett et al., 2001). Tropical species are known to be sensitive to disturbance and shifts in environmental conditions (Sheil, 2016; Ssali et al., 2017). For example, disturbance (the availability of open areas where trees can establish) has been shown to relax the constraints imposed by competition and extend the effective elevation ranges of some species that are not constrained by short-range dispersal, particularly those in secondary forest, to warmer and cooler climates (Muñoz Mazón et al., 2019). Thus, montane vegetation spatial patterns are a result of complex interactions between the terrain, local conditions and histories (Hoersch et al., 2002), as well as interactions among species such as competition (Pfeffer et al., 2003). This makes the understanding of relationships between physical features of the environment and vegetation challenging.
Montane forest is the rarest vegetation type on the African continent (Linder, 2014; White, 1983), and also among the least studied (Desalegn & Beierkuhnlein, 2010; Sainge et al., 2019). This biologically rich and unique "Afromontane" ecosystem is increasingly threatened by deforestation, degradation and defaunation (Cordeiro et al., 2007; Plumptre et al., 2007), while global climate change has added new threats (Ayebare et al., 2013, 2018; Wright et al., 2022). Deforestation is threatening the high aboveground carbon stock of montane forests (Cuni-Sanchez et al., 2021). This calls for prioritization of conservation actions to identify and protect vulnerable species and habitats. Mapping of forest vegetation can help to assess, plan, and guide conservation management in addressing these challenges by clarifying what taxa typically occur where and how these distributions are determined (Brinkmann et al., 2011; Fjeldsa, 2007). For example, variation in plant species composition along an elevation gradient is highly correlated with dietary variability (Elgart, 2010; Ganas et al., 2004) and variation in the female population genetic structure (Guschanski et al., 2008) of mountain gorillas (Gorilla beringei beringei).
Predictive vegetation mapping is defined as predicting the vegetation composition across a landscape from mapped environmental variables (Franklin, 1995). It is normally done by combining field data with digital maps of topography and climatic limiting factors, is facilitated by flexible, algorithmic modeling approaches, and is driven by the need to map vegetation over large areas for conservation planning (Evans & Cushman, 2009; Hoersch et al., 2002; Pfeffer et al., 2003). The availability of biophysical predictor variables such as elevation, temperature and precipitation mapped at relevant spatial scales, together with advances in species distribution modeling (Elith & Leathwick, 2009; Franklin, 2010; Hastie et al., 2009), provides opportunities to quantitatively analyze, predict and map the flora of biodiversity hotspots, provided the models are based on field data, or are at least validated or informed by field data. This increases our understanding of the vegetation composition and plant community patterns in relation to environmental factors.
Various approaches are available for predictive mapping of species distributions (Elith et al., 2006; Hijmans & Elith, 2016). We highlight three general approaches ranked according to their "function complexity" (Pearce & Ferrier, 2000). First are the earliest and simplest methods, which include Bioclim, Domain and Mahalanobis. These methods do not consistently perform well when compared with other approaches (Elith et al., 2006). Second are the generalized regression models: the Generalized Linear Models (GLM; McCullagh & Nelder, 1989) and Generalized Additive Models (GAM; Hastie & Tibshirani, 1986). Though more complex than the first group, the generalized regression models provide mixed results because the nature of the relationships with the predictor variables may vary across the range of each species (Franklin, 1995; Moore et al., 1991). Third are the nonparametric machine learning techniques that can reveal very complex patterns between species and physical features of the environment and can yield good predictive models when data are sufficient (Breiman, 2001; Prasad et al., 2006; Cutler et al., 2007). The earliest approaches were the Artificial Neural Networks (ANN) and Classification and Regression Trees (CART; Breiman et al., 1984). Later models include MaxEnt (Maximum Entropy; Phillips et al., 2006), Random Forests (RF; Breiman, 2001), Boosted Regression Trees (Elith et al., 2009), Multivariate Adaptive Regression Splines (MARS; Friedman, 1991) and Support Vector Machines (SVM; Guo et al., 2005). Comparisons among such options found that RF performs best in mapping current and future species distributions (Prasad et al., 2006).
Many machine-learning-based vegetation mapping studies have focused on temperate forests, but such methods have rarely been applied to explore the complex, highly mixed and species-rich rainforests in the tropics (Lin et al., 2020). Yet accurate maps of vegetation can aid management, contribute to various research, and identify restricted and vulnerable communities and habitats to support priority conservation planning (Brinkmann et al., 2011; Sainge et al., 2019). While numerous studies have focused on large geographical areas, fewer studies have identified and mapped vegetation communities along environmental gradients in smaller landscapes or mountains (Latt & Parker, 2022). However, local studies contribute to prioritization of areas in need of urgent conservation action, particularly in biodiversity hotspots at high risk of habitat degradation and loss (Fjeldsa, 2007; Seddon et al., 2020).
Bwindi Impenetrable Forest in SW Uganda (henceforth "Bwindi") is well suited, as a challenging case, for evaluating the performance of predictive mapping. It is one of the few forests in all of Africa where lowland and montane forests are in a continuum (Hamilton, 1974, 1975). The forest has a richer tree diversity compared to other forests in the ecoregion, attributed to high rainfall and soil characteristics (Eilu et al., 2004). The terrain of Bwindi is extremely rugged with high topographic diversity (Howard, 1991; Leggat & Osmaston, 1961). This limits conventional vegetation mapping approaches such as aerial photo interpretation or data derived from satellite images, as their spatial resolution is often insufficient for mapping vegetation in specific areas such as gullies and steep slopes where topographic shadows mask the vegetation (Pfeffer et al., 2003).
The forest also lies in an area with one of the highest rural human population densities in Africa (Cordeiro et al., 2007). Because of this, little natural forest persists outside the boundaries of the park (Twongyirwe et al., 2011). Heavy past human disturbance within the forest greatly modified the vegetation structure and ecology (Babaasa et al., 2004; Sheil, 2012; Ssali et al., 2017). Before it became a park, the forest was subjected to a broad gradient of human disturbance, with pit-sawing and poaching being the most widely distributed in the forest (Howard, 1991; McNeilage et al., 1998), while mining (gold and tungsten) was concentrated in specific locations (Butynski, 1984; Harcourt, 1981). Human-induced wildfires and harvesting of non-timber products impacted the forest periphery (Butynski, 1984; Cunningham, 1996), while numerous human trails and two public roads traversed the forest (Butynski, 1984; Harcourt, 1981). Human disturbance across the forest varied greatly in intensity, distribution and time periods (McNeilage et al., 2001).
Ongoing disturbance processes include localized landslides, forest elephants (Loxodonta cyclotis), occasional wildfires following long periods of drought, harvesting of poles and wire snaring (Babaasa, 2000; Hickey et al., 2019; Olupot et al., 2009), and a public road cutting through the narrow forest corridor and high elevation zone (Barr et al., 2015). This combination of factors makes Bwindi a challenging site for investigations of plant communities.
Given the wide variation in elevation, topography and past and current human disturbances across Bwindi Forest and evidence from elsewhere, we hypothesized that elevation, topography, and human disturbance, either acting in isolation or in combination, would play major roles in determining the composition and distribution of tree species and communities. To test this hypothesis, we carried out extensive ground surveys followed by multivariate statistical techniques to evaluate a variety of environmental variables as potential predictive attributes.

| Study area
Bwindi Forest is located in SW Uganda (Figure 1), in the Kigezi Highlands at the eastern edge of the Albertine Rift (latitude 0°53′-0°8′ S and longitude 29°35′-29°50′ E). Covering an area of 331 km², Bwindi lies at the northwestern end of the Kigezi Highlands, which are associated with the up-warping and faulting during the formation of the Albertine Rift and are underlain by Precambrian shale, phyllite, quartz, quartzite, schist and granite of the Karagwe-Ankolean System (Leggat & Osmaston, 1961). The Government of Uganda (1967) classified the soils of Bwindi as belonging to the "non-differentiated humic ferrallitic" types derived from the foregoing geological formation; they are composed mainly of tropical red earths with an overlying layer of brown to black spongy humus.
The erosive action of the numerous rivers in their youth stage within the Highlands has caused the topography of the park to be extremely rugged with narrow, steep-sided valleys and deep gullies that run in various directions, bound by emergent hill crests lying between 1190 m in the northwest and 2607 m a.s.l. in the southeast (Howard, 1991). The forest has been classified as a 'moist lower montane forest' (Hamilton, 1994), ranging from near the upper boundary of lowland forest to montane forest (Hamilton, 1975).
Information on Bwindi's vegetation cover is limited. A forest type map prepared by Cahusac (1961) for managing timber harvesting operations is now outdated given the management history of Bwindi, which includes more than four decades (from 1947 to 1991) of intense, forest-wide timber harvesting (Howard, 1991; McNeilage et al., 1998) whose long-term impact was the creation of a diverse patch mosaic of vegetation types differing in age, hence succession, and broken canopy cover (Babaasa et al., 2004; Sheil, 2012).
Incessant, heavy human and natural disturbances have prevented reclosure of the forest canopy (Ssali et al., 2017). Previous descriptions of the tree flora of Bwindi were based on general tree species inventories and limited to a small area of high elevation in the southeast (Hamilton, 1969; Howard, 1991; Leggat & Osmaston, 1961) or low elevation in the north (Eilu et al., 2004). Later work on trees (Davenport et al., 1996), though covering more representative areas of the forest, recorded only the species encountered and did not georeference the sample sites, preventing spatial analyses and extrapolation. Until now, it had remained unclear how forest-wide floristic patterns were spatially structured in relation to the environment.

| Study design
To account for the different environmental conditions, we employed a stratified random sampling approach. The park was divided into five strata based on geological formations visible on the Digital Elevation Model (DEM) of Bwindi, and the starting points on the boundary of the DEM in each stratum were selected randomly with the random point function within ArcGIS (version 10.5; ESRI, Redlands, CA, USA). Line transects were drawn on the DEM in each stratum from the random starting points to traverse the topographic positions of the ridges (Figure 1; Table 1). The number and length of the transects selected varied with area, accessibility and shape of the strata. We then superimposed the transect drawings on high-resolution (0.5 m) true color, digital aerial photographs of Bwindi.
The aerial photos were visually interpreted along the transects by drawing polygons around areas perceived to be of uniform tree community structure based on differences in tone and texture. This allowed the sample sites to be placed in what we perceived to be distinct tree communities.

| Tree species sampling
We carried a printed copy of the digitized polygons, overlaid with a coordinate grid, and used a hand-held Global Positioning System (GPS) device to locate the digitized polygons in the field.
A single random point within each digitized polygon was selected for tree species sampling. At each random sample point, we used the point-to-tree distance technique or plotless sampling method to sample trees (Hall, 1991; Klein & Vilcko, 2006; Sheil et al., 2003). This technique involved selecting the nearest 15 trees (≥20 cm dbh) around the random center-point. The selected trees were identified to species level and we measured the diameter at breast height (dbh) of each individual. We named the tree species following the nomenclature used in Kalema and Hamilton (2020). The distance from the sample site center-point to the 15th (farthest) tree was measured and regarded as the sample site radius. This procedure is suitable for rapid and robust assessments of vegetation where tree density varies, such as in patchy and disturbed tropical forest (Klein & Vilcko, 2006; Sheil et al., 2003). At each center-point, eight environmental attributes were recorded. The first two were aspect, as the compass direction facing downslope, and steepness of the slope, measured with a clinometer.
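The plotless sample described above can be turned into a per-hectare basal area for each site, the quantity later used in the sites-versus-species matrix. The following is a minimal sketch in Python (the study itself used R), assuming the site is treated as a circle whose radius is the distance to the 15th tree; the function name and the example values are hypothetical, and no bias correction for distance-based density estimation is applied.

```python
import math

def basal_area_per_ha(dbh_cm, radius_m):
    """Basal area (m^2 per ha) of the trees in a circular sample site."""
    # Cross-sectional area of each stem in m^2; dbh is converted from cm to m.
    stem_areas = [math.pi * (d / 100.0 / 2.0) ** 2 for d in dbh_cm]
    # Area of the circle reaching the farthest sampled tree, in hectares.
    site_area_ha = math.pi * radius_m ** 2 / 10_000.0
    return sum(stem_areas) / site_area_ha

# Hypothetical site: 15 trees of 40 cm dbh, farthest tree 20 m from the centre.
example = basal_area_per_ha([40.0] * 15, 20.0)
print(round(example, 3))  # about 15 m^2 per ha for these made-up values
```

Summing the stem areas by species instead of over all 15 trees would give one row of a site-by-species table.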
TABLE 1 Sampling design for trees (≥20 cm dbh) for floristic gradient modeling in Bwindi Impenetrable National Park, Uganda.
Untransformed aspect and slope are poorly suited to quantitative analysis, so slope was transformed to a more suitable index by taking the sine of the slope in degrees; aspect was likewise transformed by taking the negative cosine of the angle in degrees minus 35 (McCune & Grace, 2002). Four physiographic positions (valley, hillside, ridge top and gully) were simply recorded as "1" if the sample site was in that physiographic class and "0" otherwise. The final recorded site characteristics were spatial variables: the Universal Transverse Mercator (UTM) easting and northing coordinates (datum WGS 84), obtained with a hand-held GPS unit and standardized to zero mean and unit variance. All the data were collected at 289 sample sites spread across the forest (Figure 1).
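The site-level transformations can be sketched as follows. This is an illustrative Python reading of the text, not the authors' code: "negative cosine of the angle in degrees minus 35" is interpreted here as −cos(aspect − 35°), the standardization uses the sample standard deviation (n − 1), and all names and example values are hypothetical.

```python
import math

POSITIONS = ("valley", "hillside", "ridge_top", "gully")

def slope_index(slope_deg):
    """Sine of the slope angle (slope given in degrees)."""
    return math.sin(math.radians(slope_deg))

def aspect_index(aspect_deg, offset_deg=35.0):
    """Negative cosine of (aspect - 35 degrees); an assumed reading of the text."""
    return -math.cos(math.radians(aspect_deg - offset_deg))

def position_dummies(position):
    """1 for the physiographic class of the site, 0 for the others."""
    return {name: int(name == position) for name in POSITIONS}

def standardize(values):
    """Rescale to zero mean and unit variance (sample SD, n - 1; an assumption)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]

# Hypothetical site: 30 degree slope facing 215 degrees, on a ridge top.
site = {
    "slope": slope_index(30.0),
    "aspect": aspect_index(215.0),
    **position_dummies("ridge_top"),
}
eastings = standardize([810200.0, 812450.0, 815900.0, 819300.0])  # made-up UTM eastings
```

Under this reading the aspect index runs from −1 for slopes facing 35° to +1 for slopes facing 215°, so the circular compass bearing becomes a linear gradient.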

| Biophysical predictor variables
We acquired 20 digitally mapped biophysical variables summarized in  , 1998). We used the georeferences of all 289 sample sites to extract site values from the predictor rasters. We constructed an environmental matrix of the extracted variable values together with the site measurements (aspect, slope, topographic position and x, y point coordinates) and tested for pairwise collinearity; one variable of any highly correlated pair (Pearson r ≥ 0.75) was discarded (Table S1).
2) were fitted onto the ordinations using 1000 permutations.
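The pairwise collinearity screen can be illustrated with a short Python sketch. Two details are assumptions, since the text does not state them: the threshold is applied to the absolute value of Pearson's r, and of any correlated pair the variable listed first is the one retained. The variable names and values are invented for the example.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_collinear(table, threshold=0.75):
    """Keep variables in listed order, skipping any with |r| >= threshold
    against a variable that has already been kept."""
    kept = []
    for name in table:
        if all(abs(pearson_r(table[name], table[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Made-up values at six sites; mean temperature falls almost linearly with elevation.
predictors = {
    "elevation": [1400, 1600, 1800, 2000, 2200, 2400],
    "mean_temp": [21.0, 19.9, 18.7, 17.6, 16.2, 15.1],
    "slope_index": [0.3, 0.8, 0.2, 0.7, 0.4, 0.6],
}
kept = drop_collinear(predictors)
```

In this toy table the temperature variable is dropped because it is nearly a linear function of elevation, while the slope index is retained.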

| Data analyses
We applied RF techniques to predict and map the spatial distribution of the clusters (Evans & Cushman, 2009; Lin et al., 2020; Prasad et al., 2006). RF is a data mining technique that should produce accurate predictions without overfitting the data (Breiman, 2001; Breiman & Cutler, 2003). In R software, it is implemented in the function 'randomForest' in the package of the same name (Liaw & Wiener, 2002). In RF, bootstrap samples are drawn to construct multiple trees; each tree is grown with a randomized subset of pre- (that is, the true error of the population as opposed to the training error only), which means that no overfitting is possible, a useful feature for prediction. By growing each tree to maximum size without pruning and selecting only the best split among a random subset at each node, RF maintains some ability for prediction (Breiman, 2001).
Random predictor selection diminishes correlation among unpruned trees and reduces bias; by taking an ensemble of unpruned trees, variance is also moderated. RF provides several metrics that aid in interpretation. The importance of each predictor variable is evaluated based on how much worse the prediction would be if the data for that predictor were permuted randomly. Since our response variable, the clusters, was a factor (categorical), we performed a classification procedure in the analysis. The six least correlated variables (Pearson's r < 0.75) that are digitally mapped for the whole park (isothermality, temperature seasonality, minimum temperature of the coldest month, annual precipitation, mean temperature of the coldest quarter and elevation) were tested as predictors of cluster distribution.
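To make the out-of-bag idea concrete, here is a deliberately simplified Python sketch. It is not the 'randomForest' implementation used in the study: each "tree" is a single-split stump, the random predictor subset is drawn once per tree rather than at every node, and the data are synthetic. It only illustrates how sites left out of a bootstrap sample are used to estimate the error rate.

```python
import random
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(X, y, features):
    """Best single split (lowest weighted Gini) among the candidate features."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # split halfway between neighbouring values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:  # no split possible: always predict the majority class
        label = majority(y)
        return (features[0], float("inf"), label, label)
    return best[1:]

def predict_stump(stump, row):
    f, t, vote_left, vote_right = stump
    return vote_left if row[f] <= t else vote_right

def oob_error(X, y, n_trees=100, mtry=2, seed=1):
    """Share of sites misclassified by the trees that never saw them."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]       # bootstrap sample
        features = rng.sample(range(len(X[0])), mtry)     # random predictor subset
        stump = fit_stump([X[i] for i in boot], [y[i] for i in boot], features)
        for i in set(range(n)) - set(boot):               # out-of-bag sites only
            votes[i][predict_stump(stump, X[i])] += 1
    scored = [i for i in range(n) if votes[i]]
    wrong = sum(votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return wrong / len(scored)

# Synthetic data: one informative predictor ("elevation") and two noise predictors.
data_rng = random.Random(0)
X, y = [], []
for label, lo, hi in (("low", 1300, 1700), ("high", 2000, 2400)):
    for _ in range(30):
        X.append([data_rng.uniform(lo, hi), data_rng.random(), data_rng.random()])
        y.append(label)
err = oob_error(X, y)
```

Because the two synthetic classes are cleanly separated by one predictor, the OOB error here is close to zero; real assemblage data with overlapping classes would give a much higher value.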

| RESULTS
We identified six clusters based on ISA (Figure 3). The six groups were further grouped in three pairs, with clusters 1 and 5 being most similar in composition, while clusters 2 and 6 were more distinct.
The MRPP test showed that the clusters were significantly different from random association (observed delta = 0.79, expected delta = 0.89, p < .001). The Mantel test (r = 0.32, p < .001) also indicated that the six clusters differed significantly in composition.
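The MRPP statistics reported above can be read through the chance-corrected within-group agreement, A = 1 − observed delta / expected delta, which for the reported values is about 0.11. The sketch below, in Python rather than the R used in the study, shows how delta and a permutation p-value arise from Bray-Curtis distances; the group weighting n_g / N is an assumption (other weightings exist), and the eight-site data set is invented.

```python
import itertools
import random

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0

def mrpp_delta(sites, groups):
    """Weighted mean within-group distance (weights n_g / N; an assumption)."""
    n = len(sites)
    delta = 0.0
    for g in sorted(set(groups)):
        members = [s for s, lab in zip(sites, groups) if lab == g]
        pairs = list(itertools.combinations(members, 2))
        if pairs:
            mean_d = sum(bray_curtis(a, b) for a, b in pairs) / len(pairs)
            delta += len(members) / n * mean_d
    return delta

def mrpp(sites, groups, n_perm=999, seed=1):
    """Observed delta, permutation mean (expected) delta, agreement A, p-value."""
    rng = random.Random(seed)
    observed = mrpp_delta(sites, groups)
    shuffled = list(groups)
    perm = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        perm.append(mrpp_delta(sites, shuffled))
    expected = sum(perm) / n_perm
    p = (1 + sum(d <= observed + 1e-12 for d in perm)) / (n_perm + 1)
    return observed, expected, 1.0 - observed / expected, p

# Invented abundances of three species at eight sites in two clusters.
sites = [[8, 1, 0], [7, 2, 0], [9, 1, 1], [8, 2, 0],
         [0, 1, 8], [1, 2, 7], [0, 1, 9], [0, 2, 8]]
groups = ["a"] * 4 + ["b"] * 4
observed, expected, A, p = mrpp(sites, groups)

# Agreement statistic implied by the deltas reported in the text.
reported_A = 1.0 - 0.79 / 0.89
```

A small positive A with a very low p-value, as reported for Bwindi, indicates clusters that are statistically distinct but still heterogeneous within themselves.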
In total, 63 indicator tree species proved significant for the six clusters (summarized in Table 3 and detailed in Table S2). All the species, with the exception of Gambeya albida in cluster 4, had indicator values of less than 50%, implying that they also occurred, at lower abundance and frequency, in other clusters. The clusters had numerous indicator species, with clusters 2, 5 and 6 having the largest number (14, 14 and 18, respectively), while clusters 1, 3 and 4 had the fewest (6, 7 and 4, respectively). Twenty-four species were pioneers (early successional) and mid-successional species (nonpioneer light demanders, NPLDs), spread across the clusters.
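An indicator value below 50% can arise in two ways: the species is also abundant in other clusters, or it is absent from many sites of its own cluster. The Python sketch below assumes the common Dufrêne and Legendre formulation (relative mean abundance multiplied by relative frequency); the text does not state which formulation was used, and the abundances are invented.

```python
def indicator_values(abundance, groups):
    """Indicator value (0 to 1) of ONE species for each cluster.

    abundance: abundance of the species at each site
    groups:    cluster label of each site
    """
    labels = sorted(set(groups))
    mean_ab, freq = {}, {}
    for g in labels:
        vals = [a for a, lab in zip(abundance, groups) if lab == g]
        mean_ab[g] = sum(vals) / len(vals)            # mean abundance in cluster g
        freq[g] = sum(v > 0 for v in vals) / len(vals)  # share of sites occupied
    total = sum(mean_ab.values())
    return {g: (mean_ab[g] / total if total else 0.0) * freq[g] for g in labels}

# Invented basal areas of one species at four sites in each of two clusters.
abundance = [4, 2, 0, 2, 1, 0, 0, 1]
groups = ["A"] * 4 + ["B"] * 4
indval = indicator_values(abundance, groups)
```

For these made-up values the species holds 80% of its summed mean abundance in cluster A but occurs at only three of the four A sites, giving an indicator value of about 0.6 for A and about 0.1 for B.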
NMDS resulted in a 3-axis optimal solution and a good fit, with a clear positive relationship between observed community dissimilarity and ordination distances. The NMDS ordination diagram produced a dense cluster of sample site points and HC analysis confidence ellipses that greatly overlapped on all three axes (Figure S1). The sample site points that were positioned where the ellipses intersected did not belong exclusively to one cluster but had compositions indicative of two or more clusters. The overall patterns show a complex range of variables with consistent and nonrandom relationships to species composition and distribution (p < .001). The strongest correlations were observed for elevation and longitude, but ridge-top, precipitation, temperature and human disturbance factors also showed a nonrandom role (Figure S2). Because the elevation and longitude arrows were close together and pointed in the same direction, the two variables are positively correlated, implying that clusters 2, 3, 4 and 6, positioned along the direction of the arrows, lie at high elevation in the east of the park.
The RF predictive map for the six clusters (Figure 4) and the description of the clusters (Table 3 and Table S2) revealed distinct vegetation geographically arranged along the north-west and south-east latitudinal and longitudinal axes of Bwindi. Clusters 1 (white) and 5 (light green) were largely in areas north and west of Bwindi, while clusters 2 (pink) and 6 (dark green) were primarily in the extreme southeasterly areas. Clusters 3 (yellow orange) and 4 (yellow green) were in the middle of the forest. However, in some areas, the clusters formed mosaics, with patches of cluster 6 within clusters 2 and 5, while those of cluster 2 were within cluster 5, and those of cluster 1 within clusters 2 and 5. We found no evidence of any of our six clusters being more or less associated with forest edges or large gaps located a distance from the forest edges.
The estimated importance of the variables in predicting cluster composition and spatial distribution revealed that elevation had the greatest importance in terms of both accuracy and Gini score (Figure 5). Only two other variables had importance of nearly similar magnitude for predictive accuracy: precipitation of driest month and annual precipitation. The OOB estimate of the error rate was 50.7% (Table 4).

| DISCUSSION
Our study shows that the methods we used can be useful for ecological understanding of mountain habitats. We demonstrate that the RF model, supported by multivariate statistical techniques, was effective in delineating tree communities and predictively mapping them in response to complex topographic gradients, compounded by vegetation that is highly mixed and species-rich, with a heavy and long disturbance history. For the final result to be optimal, all our data were used in training the model instead of forfeiting them for an independent validation. Thus, we chose to use the OOB estimate of the error rate provided in RF. Our OOB error statistic was moderately high. Plant species are tolerant to a wide range of varying conditions, and therefore, there can never be a perfect species-environment correlation. Moreover, species data tend to be redundant and noisy (Kent, 2012; Ter Braak, 1995), making it difficult to completely explain how a rich community of vegetation is determined by a limited set of interacting factors (Pfeffer et al., 2003). Nonetheless, the most influential variables indicated by the RF model are those expected to influence large scale patterns of montane vegetation (Hedberg, 1951; Lieberman et al., 1996; Eilu et al., 2004; Schmitt et al., 2010; Sainge et al., 2019; Lin et al., 2020). This, together with the fact that nearly half of the tree community variation was accurately predicted using a limited set of predictor variables, permits us to predictively map the vegetation at unsampled locations with reasonable confidence.
Our approach to vegetation mapping in mountains provides an alternative to the sometimes limited approaches provided by remote sensing (Pfeffer et al., 2003). Remote sensing is poor at mapping plant communities that merge along elevational gradients (Singh et al., 2001; Townsend & Walsh, 2001). In addition, remote detection of specific plant species is most likely to be viable if the target species possess unique growth forms, phenology or other readily detectable characteristics (He et al., 2015). Vegetation types with the same overall physiognomy or plant form, such as trees, but with varied floristic composition are difficult to differentiate using both spectroscopy and remote sensing, resulting in misinterpretation and misclassification of remote sensing images (Townsend & Walsh, 2001).
Several other factors that can influence vegetation composition of communities were not considered in our model. Inclusion of geology, edaphic conditions, hydrological regime, interspecific interactions, dispersal ability, and biotic or abiotic interactions (Godsoe & Harmon, 2012) might enable more accurate description of the vegetation pattern if suitable data were available.
FIGURE 4 Predictive map of tree species assemblage spatial patterns in Bwindi Impenetrable National Park, Uganda. Numbers in the legend correspond to the clusters in Tables 3 and 4, Figure 2; Appendix S2 and S3.
A combined HC analysis and ISA identified six tree communities across Bwindi, each with its own indicator species. The numerous indicator species for each tree community, without a single species being dominant, is indicative of a large proportion of the forest being secondary vegetation communities. In many tropical forests, single-species dominance is a characteristic of mature (climax) communities (Eggeling, 1947; Connell & Lowman, 1989; Sheil & Burslem, 2003). Furthermore, close to 40% of the indicator species, spread across the tree communities, were pioneers such as Macaranga barteri at low elevation and Polyscias fulva and Neoboutonia macrocalyx at high elevation, and mid-successional or non-pioneer light demanders (NPLDs) such as Gambeya albida, Newtonia buchananii, and Entandrophragma excelsum. This is further evidence that considerable areas of the forest are in early succession stages. Human disturbances of the past, particularly pit-sawing, were spread throughout the forest (Howard, 1991; McNeilage et al., 1998). However, the disturbances varied in intensity and were concentrated in different areas at different time periods (McNeilage et al., 2001). This led the forest vegetation to be structurally and compositionally complex, with a mosaic of patches of disturbed and regenerating forest, differing in size and succession stage, being superimposed on differences caused by topography and possibly soil characteristics. It could be a very long time before the forest regenerates to the primary (climax) stage. Aside from the past disturbances, there are additional reasons why much of Bwindi's vegetation remains in early successional stages. Ssali et al. (2017) found that forest regrowth in the many gaps dominated by bracken fern scattered across the forest is impeded in multiple ways, including repeated damage and high seed predation.
The prevalence of pioneer species and NPLDs we found across the forest likely reflects similar influences (Chapman & Chapman, 1997).
The observed species have longer seed dormancy, greater growth potential and are less constrained by dispersal and seed predation (Muñoz Mazón et al., 2019; Sheil, 2016). Late successional species may take long periods to reestablish due to the scarcity of remaining seed sources and limited dispersal (Ssali et al., 2017).
Ordination analysis demonstrated that the tree communities were mainly arranged along two collinear gradients: elevation and the sample site's position in the forest landscape (i.e., longitude).
Bwindi is elongated from north-west to south-east, exhibiting a gradual rise in mean elevation in this direction. Although elevation had the strongest relationship with tree composition, additional variability was explained by topographic position and climate (temperature and precipitation). Topography has multiple potential mechanisms through which to impact vegetation: for example, the soils of slopes and lower valleys are generally moister, better structured, and richer in plant nutrients than the more drought-prone and better illuminated ridge tops (Ghazoul & Sheil, 2010; Jucker et al., 2018). For example, before it was sought out and harvested in accessible areas, the slow-growing and shade-tolerant Podocarpus latifolius was common on Bwindi's hill ridges >2000 m a.s.l. (Hamilton, 1994). Our results, as expected, show that temperature tends to decline as elevation increases (Lieberman et al., 1996), with tolerance of low temperatures often offset by lower competitive ability under warmer conditions (Sheil, 2016; Slik et al., 2009). Our RF model indicated that annual precipitation and precipitation of driest month also contribute to differences in tree composition. While many variables are clearly related to tree community patterns, these are often correlated and cannot be unambiguously assigned to distinct mechanisms, so their individual contributions could not be analytically separated.
Whilst the influence of elevation on tree community composition and distribution was systematically identified and statistically demonstrated, the tree communities in Bwindi do not form distinct zones along the elevation gradient. This accords with results from other montane forests such as Udzungwa Mountains National Park, Tanzania (Lovett et al., 2006). There are multiple possible reasons for the lack of elevation zoning. We highlight two. First, Bwindi is one of the few forests in Africa where lowland and highland forests are in a continuum (Hamilton, 1974, 1975). In a rather limited area, there is intermingling of lowland tree species that reach their upper boundary and highland tree species at their lower boundary. Our results showed an overlap of the cluster confidence ellipses on all three dimensions of the ordination diagram, with numerous sample sites and species positioned where the ellipses intersect, indicating that they can reasonably belong to more than one community (Kent, 2012; Townsend & Walsh, 2001).
In addition, the lack of perfect indicator species (having indicator value ≤0.5) implies that numerous tree species are spread widely but are most frequent and abundant within the tree community they indicate, presumably representing their ecological optimum (Austin, 2013; Mueller-Dombois & Ellenberg, 2002). The continuous nature of change of plant species in moist forest communities is related to a continuous change in environmental variables (i.e., precipitation and temperature) along the elevation gradient (Lovett & Lindberg, 1993). Secondly, the intense forest-wide human disturbance of the past, which involved mainly large-scale species-selective cutting and removal of large hardwoods, could have relaxed the constraints imposed by competition, making it possible for some tree species to colonize the disturbed areas, resulting in extension of their elevation range limits (Sheil, 2016; Muñoz Mazón et al., 2019). In addition to natural disturbances such as drought and occasional landslides, human disturbances such as clearing, timber cutting, fires and many others have occurred at different times in different parts of the forest, leading to spatial patterns that obscure what would be expected in a system lacking human impacts (Chapman & Chapman, 2004).

| CONCLUSIONS
Prediction of vegetation across an extensive mountain landscape using mapped environmental variables offers a potential complement to remote sensing. Remote sensing is challenging in species-rich mountainous regions like Bwindi with considerable topographic complexity and a multifaceted history of natural and human impacts.
Efforts should be geared towards accessing and refining predictor variables at the appropriate scales so as to further improve such mapping.
We see that Bwindi is a mosaic of patches of different successional stages resulting from a complex history. Recovery to late successional forest is slow, likely due to seed limitation and continued disturbances. Small island forests such as Bwindi, surrounded by dense human populations, are under immense pressure and their long-term viability and survival remain in doubt.
Mapping tree species composition and patterns helps understand

FIGURE 1 Location of Bwindi Impenetrable National Park, Uganda, and of tree sampling transects and sample sites (red dots) among the strata, superimposed on a Digital Elevation Model of the study area.
Based on field data, a sample sites-versus-species matrix (using basal area values of each tree species relative to the area of the sample site [m² ha⁻¹]) was created. We only considered tree species occurring in four or more sample sites, resulting in a 289 sample site by 89 tree species matrix. The data were natural log-transformed following the generalized procedure (McCune & Grace, 2002) to minimize the influence of large trees. We subjected the data matrix to various multivariate techniques to identify associations and spatial patterns among the tree species using R software (version 4.2.1; R Core Team, 2022). To describe the different tree species assemblages within our study area, we clustered sample sites with similar tree species composition using polythetic, agglomerative hierarchical clustering (HC) with the flexible beta linkage method (β = −0.25; Lance & Williams, 1966; Legendre & Legendre, 1998), using the agnes function with Bray-Curtis as the distance measure. The clustering results were portrayed by a dendrogram or clustering tree. We determined the optimal number of clusters using the Indicator Species Analysis (ISA) procedure described below. Because we had numerous sample sites (n = 289), we simplified the presentation by building composite sample sites from the original sample sites, computing the centroid of each cluster as the mean basal area per hectare of each tree species in that cluster. We used Multi-Response Permutation Procedures (MRPP) and a Mantel test to test for differences in composition between the clusters. We also utilized ISA, a method that combines frequency and mean abundance tables, to identify the characteristic tree species for each cluster. Statistical significance of the indicator tree species was determined by random permutations for each species at the p < .05 significance level. Indicator values vary from 0 (no indication) to 1 (perfect indication). Kruskal's nonmetric multidimensional scaling (NMDS) based on the Bray-Curtis coefficient was used to evaluate the ecological tendencies reflected in the cluster dendrogram and the relationship between sample sites, clusters and environmental variables. The sample sites were ordinated, then overlaid with the cluster centroids from the cluster analysis, each surrounded by a confidence ellipse at ±2 SD from the mean (enclosing approximately 95% of sample sites within each cluster). Lastly, 12 least correlated site environmental variables (Table
recorded 4335 individual trees (≥20 cm dbh) comprising 121 species in 51 families from 289 sample sites. The sample sites spanned an elevation of 1320 to 2467 m a.s.l., broadly representing Bwindi's elevation range and topographic variation. Just 89 tree species from 47 families occurred in four or more sample sites and were included in the subsequent analyses. The two richest sample sites each contained 12 species, while the 11 poorest sites each had four species. The commonest tree was Strombosia scheffleri, which occurred in 130 (45%) sample sites, while the nine least common species (Anthocleista vogelii, Antiaris toxicaria, Casearia battiscombei, Celtis durandii, Dichaetanthera corymbosa, Ficus sur, Hannoa longipes, Memecylon myrianthum and Pauridiantha callicarpoides) each occurred in only four (1.4%) sample sites. The cluster dendrogram (Figure 2) grouped the sample sites into six clusters. This cluster stage yielded the lowest average summed p-values and the highest number of significant indicator species based
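The clustering and indicator-value steps described in the methods can be illustrated with a minimal sketch. It is written in plain Python rather than R and is not the authors' code. The toy matrix, the function names and the choice of two clusters are hypothetical. The flexible beta step uses the Lance-Williams update with α = (1 − β)/2, and the indicator value is assumed to be the widely used product of relative abundance and relative frequency, since the text does not give the formula. The log transformation and the permutation test are omitted.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def flexible_beta_clusters(sites, k, beta=-0.25):
    """Merge sites into k clusters using the flexible beta update:
    d(h, i+j) = alpha*d(h, i) + alpha*d(h, j) + beta*d(i, j),
    with alpha = (1 - beta) / 2."""
    alpha = (1.0 - beta) / 2.0
    clusters = {i: [i] for i in range(len(sites))}
    dist = {frozenset(p): bray_curtis(sites[p[0]], sites[p[1]])
            for p in combinations(range(len(sites)), 2)}
    next_id = len(sites)
    while len(clusters) > k:
        pair = min(dist, key=dist.get)      # closest pair of clusters
        i, j = tuple(pair)
        d_ij = dist.pop(pair)
        merged = clusters.pop(i) + clusters.pop(j)
        new_dist = {}
        for h in clusters:                  # update distances to the merged cluster
            d_hi = dist.pop(frozenset((h, i)))
            d_hj = dist.pop(frozenset((h, j)))
            new_dist[frozenset((h, next_id))] = (
                alpha * d_hi + alpha * d_hj + beta * d_ij)
        dist.update(new_dist)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


def indicator_values(sites, labels, species):
    """Indicator value of one species in each group, between 0 and 1:
    (share of summed group mean abundance) * (fraction of group sites occupied)."""
    groups = sorted(set(labels))
    mean_ab, freq = {}, {}
    for g in groups:
        vals = [row[species] for row, lab in zip(sites, labels) if lab == g]
        mean_ab[g] = sum(vals) / len(vals)
        freq[g] = sum(v > 0 for v in vals) / len(vals)
    total = sum(mean_ab.values())
    return {g: (mean_ab[g] / total if total else 0.0) * freq[g] for g in groups}


# Hypothetical toy matrix: 6 sample sites (rows) x 3 species (columns).
sites = [
    [5.0, 1.0, 0.0],
    [4.0, 2.0, 0.0],
    [6.0, 1.0, 0.0],
    [0.0, 1.0, 5.0],
    [0.0, 2.0, 4.0],
    [0.0, 1.0, 6.0],
]
clusters = flexible_beta_clusters(sites, k=2)
labels = [0] * len(sites)
for group, members in enumerate(clusters):
    for site in members:
        labels[site] = group
indval = [indicator_values(sites, labels, sp) for sp in range(3)]
print(clusters)
print(indval)
```

In this toy case the first and third species are each confined to one group and behave as perfect indicators, whereas the second species is equally abundant in both groups and scores only 0.5 in each. A widespread species of this kind is what the low indicator values discussed for Bwindi would look like.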

FIGURE 2 Polythetic, agglomerative, hierarchical clustering dendrogram depicting the relationships between the composite clusters. Numbers correspond to the clusters in Table
Brinkmann et al. (2011) and Lin et al. (2020) attribute inaccuracies of predicted maps to mapped predictor variables being too coarse to provide exact variable estimates for each sample site, especially in hilly and topographically diverse areas, or to noninclusion of causal factors (Barbet-Massin & Jetz, 2014; Bedia et al., 2013). But more important could be the observation of Volkov et al. (2003) that the distribution of plant abundances in natural communities at local scales is (or can be considered in effect to be) largely random, with species being ecologically equivalent, or structured more by dispersal than by differences in abiotic conditions. Pfeffer et al. (2003) suggest that most
FIGURE 3 Changes in summed p-values and number of indicator tree species with p ≤ .05 from randomization tests across 2-16 clusters.
TABLE 3 Indicator tree species for the clusters in Bwindi Impenetrable National Park, Uganda.
forest to the southeast of the park 18 5
the extent, variation and ecological impacts of past and ongoing disturbance and other factors. Such information can assist protected area management to direct conservation efforts so as to avoid or reverse further degradation of the fragile ecosystem and to monitor restoration programmes. The information can also be used to model wildlife abundance and distribution, to estimate forest on-ground carbon stocks and to examine the consequences of various climate change scenarios. Predicting changes in composition is instrumental in informing adaptive management strategies and conserving biodiversity.

Stratum Transect elevation range (m a.s.l.) No. of sample points
Table 2 and detailed in Appendix S1. These were: [...] and bioclimatic variables, which include 11 temperature and eight precipitation variables from WorldClim version 2 (http://worldclim.org/version2.1; 1970-2000; Fick & Hijmans, 2017). We clipped the ASTER GDEM and the 19 bioclimatic rasters to a window covering Bwindi Forest only. Lastly, the human disturbance across the park was based on the combined relative encounter of human activity signs that were likely to have an impact on vegetation: wood cutting, bee hives, old pitsaw sites, disused mineral extraction pits, snares, and fireplaces (McNeilage et al.