Study site
The study was conducted in 2014 in Kihansi gorge forests located at 8°35’S, 35°51’E along the eastern escarpment of the southern Udzungwa Mountains (Fig. 1). The area is distanced about 600 km East-south from Dar es Salaam with an elevation of 340 m.a.s.l. to 1240 m.a.s.l. (this study). The annual rainfall ranges between 1000 mm and 3200 mm (LKEMP 2004) and the mean daily minimum and maximum temperatures is 13°C and 25°C respectively (Cordeiro et al. 2006). The vegetation types comprise montane forest; miombo woodland and wetland spray meadow with various species composing each type (Lovett et al. 1997; Quinn et al. 2005). The Kihansi gorge forest harbours several endemic species of plant and animals including the globally endemic and extinct in the wild amphibian species, Kihansi spray toad.
Sampling of butterfly community
Butterfly species were surveyed using cylindrical butterfly traps of 90 cm height and 30 cm diameter fitted with a round 30 cm diameter plastic bait plate placed at 3 cm below the net (Austin and Riley 1995). In total, 41 traps were set at 600 m interval along nine transects of various sizes (0.9–5 Km) measuring a total of 25 km long. Due to limited accessibility to some parts of the Kihansi gorge (Poynton et al. 1998) available paths and trails were used to establish transects for this study. The survey transects intercepted three habitat types; woodland, montane forest and bushed grassland in the Kihansi gorge ecosystem. The habitat types were used as strata for later comparison of the butterfly distribution and abundance. Three transects (of 3–5 km long) were laid along the woodland habitat, two transects (each of 3 km long) in the bushed grassland and four transects (0.9–3 km long) were used in the montane forest cutting through the gorge spray wetlands (Rija et al. 2011). Each trap was placed on a randomly selected tree c. 3–4 meters (depending on the local terrain) above the ground (Bouyer et al. 2007) and on alternate sides of the transects at each trapping point. Depending on the length of transect, a total of 18, 13 and 10 traps were placed in the woodland, montane forest and the wooded grassland, respectively.
To attract fruit feeding butterflies, traps were baited with fermented banana prepared 48 hrs prior to the trapping (Alexander and DeVries 2012). This technique is very effective in surveying tropical butterflies and can greatly minimize errors of identification than when species are identified in flight (Sparrow et al. 1994; Walpole and Sheldon 1999). Traps were left for fourteen successive days, thereafter checked and emptied between 12.00 hr and 14.00 hr local time every day. Each time checking of traps started from the other end of the transect to avoid potential bias and to ensure independence of the data collected at individual trap level, allowing data pooling during analysis. The bait was inspected and replaced every day to prevent the bait from over-drying and replacing consumed or rain-washed bait (Houlihan et al. 2013). Trapped butterflies were removed, identified and marked with non-toxic silver markers, allowing identification of subsequent recaptures which were later excluded during analysis. After recording all captures were released in the wild. Where species were not identifiable in the field, voucher specimens were collected and later identified in the laboratory. Voucher specimens of all trapped species are stored in the Zoology laboratory at Sokoine University of Agriculture.
To understand species distribution across the habitats and altitudes, GPS coordinates of each trapping point were recorded using hand-held Garmin GPS 20. Two butterfly trapping events were conducted during late dry season (September-October 2014) and during mid-rain season (April 2015) and measured habitat cover at trap location to assess the influence of these generic variables on the butterfly richness and diversity. Vegetation cover (%) at ground, low (2 m above ground), and understorey and canopy cover were estimated and recorded. Other variables measured included tree diameter at breast height (DBH), plot tree density, distance to nearby water source (measured using GPS, Garmin eTrex 10), trap elevation above sea level and habitat type where a trap was set. The habitat characteristics were used to find out their influence on the abundance, species richness and diversity of butterfly in the study area.
Data Analyses
Prior to analysis, butterfly abundance data were pooled from individually sampled days for each trap and habitat characteristics recorded for each sampling station. First, to understand how species-rich and diverse the study area was, butterfly richness and diversity were analysed using Chao1 estimator and Shannon diversity index respectively under program EstimateS (Colwell 2013). To evaluate species richness and diversity indices, the numbers of runs were set to 100 to get smoothed curves at all levels of species accumulation (Gotelli and Colwell 2001). At this stage, randomization with sample replacement was checked, ensuring a target diversity index is selected to obtain richness and diversity estimates. Species richness and diversity data were further compared between habitat types and altitudinal gradients using Kruskal Wallis tests after confirming non-normal distribution (Shapiro test, p > 0.05).
Second, to understand how butterfly populations organize across the surveyed habitat types, a cluster analysis was performed on a Bray-Curtis similarity matrix of grouped butterfly samples under program Primer ver.6 (Gorley and Clarke 2006). To do this, butterfly abundance data were first square root transformed for normality, reducing the effect of high abundance species (Clarke et al. 2014). Further, non-metric multidimensional scaling (NMDS) and cluster plots were used to understand similarity in butterfly communities between habitat types. When clustering was evident a similarity profile (SIMPROF) test was used to examine if there was statistically significant clustering in the habitat sites (Gorley and Clarke 2006). To find out significant differences in butterfly communities collected from the surveyed habitat types, analysis of similarity (ANOSIM) test (i.e. ANOSIM is similar to classical analysis of variance-ANOVA test) was used between pairs of butterfly communities (Clarke et al. 2014). This test produces a global R-statistic which ranges from 0 to 1, indicating relative significant difference between the butterfly community groups being compared (Clarke 1993).
Third, to investigate how local habitat and environmental characteristics (henceforth termed as variables) affect butterfly diversity, abundance and species richness in the Kihansi gorge forest, the data were analysed using generalised linear models (GLM) under R-software ver. 4.0.5 (R-core Team, 2019). Prior to model fitting, correlation analysis was conducted to check for potential collinearity between independent variables. For the highly correlated variables (r > 0.5), only one was included in the model due to redundancy. Three models, one for each dependent variable (i.e. species diversity, abundance and species richness) were built incorporating eight explanatory variables: habitat type, tree density, tree DBH, distance to water point, ground cover, and herbaceous cover at 2m height, canopy cover and elevation. The GLM model with Poisson family and log link function was used to fit the species diversity data while abundance and species richness data were fitted with quasipoisson family and log link functions after confirming model overdispersion (theta ≥ 1.4). Model validation was conducted through examining deviance residuals distribution which looked fine. To evaluate the relative effect of each variable on the dependent variable, a backwards stepwise single deletion of non-significant terms was performed. For each deletion, model significance was tested using F-test for the model assessing diversity and Chi-test for models on abundance and richness (Kamil 2018). For any two competing models observed during the stepwise deletion of non-significant terms, analysis of variance (Anova) test was used to select the most parsimonious model based on the model showing the smallest residual deviance. Furthermore, prediction models were built for the final significant model terms using the “predict function” in the package “ggplot2” to understand how well the independent variables predicted the dependent variables. Finally, explained deviance for each model was calculated as pseudo R2.