Study Area
The study area is located in the Panamá Oeste Province in central Panama, straddling La Chorrera and Capira districts and covering all or part of seventeen corregimientos (sub-districts) (Figure 2). Guided by satellite imagery and the expert regional knowledge of members of our team, we selected this location due to its known history of Chagas disease transmission, abundance of A. butyracea palms, and adequate representation of each category of our disturbance gradient. Because landscape disturbance is by no means the only variable associated with A. butyracea prevalence, we chose an area with minor variation in other environmental variables known to influence palm distribution: elevation, precipitation, soil type and temperature (6, 38-40).
Data Collection and Pre-processing
Satellite Imagery
To conduct our remote identification of A. butyracea palm locations, we obtained a high-resolution WorldView2 (WV2) satellite image in both multispectral and panchromatic format covering 300.1 km2 of our region of interest (bounding coordinates NW: -80.013, 8.94; NE: -79.854, 8.93; SE: -79.855, 8.785; SW: -80.013, 9.793) (DigitalGlobe, Inc., 2017). Our imagery was collected February 1, 2013 at an off-nadir angle of 16.19°, producing a spatial resolution of 2.03 m for multispectral bands and 0.51 m panchromatic. To remove distortions due to image tilt and region topography, the imagery was orthorectified with a NASA Shuttle Radar Topography Mission (SRTM) 30-m digital elevation model of Panama, obtained through the Smithsonian Tropical Research Institute’s (STRI) GIS OpenData Portal.
Palm Collection and Validation
The palm data used in this study were obtained through a combination of manual field sampling in our study area in 2016 and 2017 and remote palm coordinate collection via visual interpretation of the WV2 satellite imagery. Our fieldwork was authorized by the Panama Ministry of the Environment (MiAmbiente), the Gorgas Memorial Institute for Health Studies (ICGES), and STRI.
We recorded the field locations of an initial convenience sample of 131 A. butyracea palms in our study region in July 2016. This palm dataset consisted primarily of easily accessible roadside palms, comprising both freestanding individuals and those located within a contiguous forest. We collected coordinate points of each palm with a Garmin Oregon 550T handheld GPS unit and photos of each palm crown as a record of species identification; coordinates and photos were linked with a unique identifier. We converted the palm coordinates to a point shapefile and overlaid these on our WV2 image, where each point was manually assessed to confirm visibility of a corresponding palm crown within the satellite image. We randomly selected 30% (n=39) of our palm sample to reserve as a validation subset; the remaining 70% (n=92) of the sample was used for training to manually digitize A. butyracea crowns across the remainder of the WV2 image.
For the methodological remote detection of A. butyracea palm crowns in our study area, we created a grid across the imagery’s extent, consisting of 535 0.75-km-by-0.75-km cells, which was generated by hollow square tessellation in ArcMap 10.5.1 software (ESRI, Redlands, California). Each one of these 0.56 km2 cells was carefully analyzed for palm presence using both panchromatic and pan-sharpened (i.e., a fused 2-m multispectral and 0.5-m panchromatic image having both high spatial resolution and four color bands) WV2 imagery. Mature palm crowns are readily distinguished on remotely sensed imagery, due to their distinctive starburst shape (Figure 3).
However, to increase palm visibility, each color band of the pan-sharpened image was enhanced with high contrast settings and brightened. The pansharpened imagery was set to a true color display for visualization of palm crowns, which was useful for palm species differentiation, along with scale and texture, (Figure 3a). Texture showed more clearly on the panchromatic imagery, which was especially useful in recognizing palm crowns in dense, contiguous canopy (Figure 3b). Beginning with our training dataset and then moving systematically through the grid cells, a trained analyst identified all large-crowned (7-12 m diameter) palms either fully or partially visible that matched the A. butyracea training set’s shading and crown shape/texture characteristics. Palm locations were delineated as points centered on the palm crown and recorded in a point shapefile. Our final dataset totaled 50,955 possible mature A. butyracea palms.
We tested the accuracy of this large dataset both against the reserved validation sample of known A. butyracea palm locations and against an additional field sample of 86 A. butyracea and non-A. butyracea palms collected expressly for this purpose in our study region in late 2017. To ensure an adequately sized and novel sample for the 2017 palm validation dataset, we randomly selected three of our grid cells for in-field data collection after excluding those with fewer than 30 identified palms and those that contained any of the 2016 convenience palm locations (Figure 4). The grid cells in our resulting validation pool subset (n=439) were assigned unique numerical IDs. We randomly chose three using an online random number generator. In each of the three 0.56 km2 grid cells, we field-collected locational coordinates and photos of approximately 30 large-crowned palms, comprising both A. butyracea and non-A. butyracea palm species. Though our field sites were randomly selected, the palms we sampled within them were not. Logistical (e.g., natural and manmade terrain barriers, inability to locate some landowners to request permission to access property) and time constraints made either a truly random sample or a full census of palms in these regions infeasible. Instead, led by the remotely identified palm dataset, we prioritized data collection in accessible areas that contained a combination of pasture and contiguous canopy and were removed from major roadways.
From our 2017 field validation sample, we calculated positive and negative predictive values, as well as sensitivity (true positive rate) and specificity (true negative rate), in order to quantify our ability to remotely detect and properly identify palm species, and to assess likelihood of Type 1 (false positive) and Type II (false negative) errors. We further compared the accuracy of our A. butyracea identification stratified by contiguous canopy forest and open pasture. Accurately identified A. butyracea (true positives) were palms identified remotely via WV2 imagery that were field-verified as A. butyracea. Accurately identified non-A. butyracea (true negatives) were field-verified non-A. butyracea palms that were ignored in our remote sensing effort as “other” species.
Disturbance Gradient
We derived our natural and anthropogenic landscape disturbance gradient using spatial data from two official Panamanian government datasets: 2012 land cover data and 2010 census data. Both datasets were obtained directly from the ministries responsible for their creation. The 2012 Panama land cover data were generated by Panama’s Ministry of the Environment (MiAmbiente; formerly ANAM) in collaboration with the Food and Agricultural Organization of the United Nations (FAO) (41). These data were compiled using 5-m spatial-resolution satellite images collected by the RapidEye sensor between January 1, 2011 and April 30, 2012 to form a seamless, 5-m resolution land cover dataset for the entire extent of Panama (41, 42). Panamanian census data are collected on a decadal basis by the National Institute of Statistics and Census (INEC). The census spatial data used in this study includes administrative boundaries of La Chorrera and Capira districts and sub-districts, as well as the locations of buildings, rivers and streams within this region.
The 2012 land cover dataset contains 24 land cover classifications, 11 of which comprise our study region: mixed mature broadleaf forest, mixed mature secondary broadleaf forest, conifer planted forest, hardwood planted forest, new growth/bushes, corn, pineapple, pasture, surface water, populated areas, and infrastructure. We extracted land cover data within the boundaries of our satellite image extent and consolidated similar land cover types to form a 5-category disturbance gradient (Table 1). From least- to most-altered, this disturbance gradient consists of mature forest, secondary and planted forest, pasture, planted food crops, and the built environment. The built environment is a term that encompasses anthropogenic structures and their supporting infrastructure (e.g. roads), to support human activity spaces for living, working, and recreation (43).
Table 1. 2012 Panama Land Cover Categories
Original Assigned Category*
|
Consolidated
|
Total Area
|
Masked Area†
|
Palms
|
2012
|
|
|
km2
|
%
|
km2
|
%
|
#
|
%
|
Bosque latifoliado mixto maduro
|
Mixed Mature Broadleaf Forest
|
Mature Forest
|
0.67
|
0.22
|
0.53
|
0.23
|
125
|
0.25
|
Bosque latifoliado mixto secundario
|
Mixed Secondary Broadleaf Forest
|
Secondary/Planted Forest (established)
|
78.72
|
26.18
|
59.50
|
25.30
|
16521
|
32.42
|
Bosque plantado de coníferas
|
Conifer Planted Forest
|
Secondary/Planted Forest (established)
|
0.03
|
0.01
|
0.02
|
0.01
|
3
|
0.01
|
Bosque plantado de latifoliadas
|
Hardwood Planted Forest
|
Secondary/Planted Forest (established)
|
1.04
|
0.35
|
0.70
|
0.30
|
94
|
0.18
|
Rastrojo y vegetación arbustiva
|
Vegetation Regrowth and Bushes
|
Secondary/Planted Forest (new growth)
|
6.52
|
2.17
|
5.51
|
2.34
|
1593
|
3.13
|
Maíz
|
Corn
|
Food Crops
|
0.12
|
0.04
|
0.12
|
0.05
|
1
|
0.00
|
Piña
|
Pineapple
|
Food Crops
|
6.31
|
2.10
|
3.79
|
1.61
|
298
|
0.58
|
Pasto
|
Pasture
|
Pasture
|
164.02
|
54.56
|
130.42
|
55.45
|
24169
|
47.43
|
Superficie del agua
|
Surface Water
|
Secondary/Planted Forest (riparian)
|
33.45
|
11.13
|
26.43
|
11.24
|
7126
|
13.98
|
Área poblada
|
Populated Area
|
Built Environment
|
9.39
|
3.12
|
7.86
|
3.34
|
1016
|
1.99
|
Infraestructura
|
Infrastructure
|
Built Environment
|
0.37
|
0.12
|
0.34
|
0.14
|
9
|
0.02
|
|
|
Total:
|
300.65
|
|
235.22
|
|
50955
|
|
*Landcover categories assigned by the Panamanian Mistery of the Environement (MiAmbiente), formerly La Autoridad Nacional de Ambiente de Panama (ANAM) †There was no significant difference between masked area and total area in any category
|
A comparison of the 2012 land cover data and our 2013 satellite imagery indicated high consistency between the two datasets. However, we observed poor representation in the land cover data of considerable contiguous tree cover associated with the riparian zones of minor streams and rivers, which may serve as important wildlife and vector movement corridors. To spatially define these riparian zones for inclusion in our disturbance database, we applied buffers to the river and stream locations contained in the 2010 census data, which were more complete than the surface water estimates within the 2012 land cover dataset. We applied a 25-m buffer to all rivers and streams within our study area, based on an average of riparian area widths measured in the satellite image. The land cover within this 50-m wide area was reclassified “riparian zone” and included in the secondary forest category of our disturbance gradient.
Due to the heterogeneity of land cover classes that comprise the secondary and planted forest classification, for some analyses we further stratified this category into riparian zone, recent growth (< 5 years), and established forest (> 5 years). Recent growth corresponds to the official Panama land cover classification of “rastrojo”, which refers to the initial stages of secondary forest at five years of age or less (42). Established secondary and planted forest is a combination of mixed secondary broadleaf forest, and both coniferous and deciduous planted forests.
Due to cloud cover and cloud shadow, which covered approximately 6% of our satellite image surface, certain regions of the imagery were obscured to remote palm extraction. To accurately calculate the density of our palm sample stratified by our disturbance gradient, it was necessary to mask the land cover data to replicate the visible satellite surface. We first generated a shapefile of the obscuring cloud cover/cloud-shadow using object-based image analysis (OBIA) software, eCognition Developer 8.1 (Trimble Inc., Sunnydale, CA). OBIA is an image classification method that transforms high-resolution pixels into meaningful objects, based on user-defined combinations of size, shape, and band metrics; pixels are first grouped, then joined, into desired categories through user manipulation (44). This shapefile was used to remove obscured regions in our study area for both the original 2012 land cover data and consolidated disturbance dataset, resulting in two masked subsets that exactly matched the visible satellite imagery extent (see Figure 4). Using z-tests for proportion comparisons, we assessed whether the masked datasets were representative samples of the total study region for each dataset. For each classification scheme, we found no significant difference in land cover distribution between the original dataset and its corresponding masked subset. We conclude that the masked region used for palm extraction and density analyses is representative of the entire study area; no significant bias was introduced to the analysis by disregarding the approximately 6% of the satellite image obscured by clouds or shadow.
Spatial Analysis of Palm Data
Natural and anthropogenic landscape disturbance is linked with increased A. butyracea propagation at local scales (6). To assess the relationship between landscape disturbance and palm propagation at a regional scale, we compared palm distribution, density, and proximity to key features across the disturbance gradient within our 300-km2 study area. Because landscape disturbance is also associated with increased likelihood of A. butyracea forming monospecific stands (10, 18, 26), we assessed the spatial relationship of both individual palms and statistically significant clusters of mature palms, a proxy for monospecific stands, for all spatial analyses. All geospatial analyses were conducted using ArcGIS 10.5.1 software (ESRI, Redlands, California) and all statistical analyses were conducted using SAS 9.4 software (SAS Institute Inc., Cary, NC).
Palm Clustering
We assessed our A. butyracea palm dataset for spatially clustered groupings of points that may indicate a monospecific stand, or cluster, of this palm species. Our palm data consisted only of x,y coordinate information, and the very large sample size and narrow distance increment of interest for clusters (~30-m) made traditional point pattern analysis of clustering (e.g. Ripley’s K analysis) prohibitively computationally intensive. Instead, we aggregated palm points to the cells of a 50x50-m grid vector shapefile overlaid on our study region, which provides a uniform and comparable measure of palm density across our sample area. We tested grid overlays at spatial resolutions of 5-m (an exact replicate of the underlying landscape raster), 30-m, 50-m, and 100-m; however, the 50-m spatial resolution best characterized palm distribution without including too much “empty” space (100m & 250m), or creating too many “islands” of non-contiguous occupied cells (30m). Additionally, A. butyracea propagation literature suggests an average of <10-m seed migration (by predators who feed on the surrounding fleshy mesocarp) from the parent tree, with occasional migration up to 30 meters (45, 46). We anticipate a cell size of 50-m spatial resolution is large enough to pick up clusters of related palms within a single cell, or among neighboring cells. We filtered unoccupied grid cells from the dataset to limit our assessment of palm clustering solely within the observed palm distribution area.
We assessed overall presence or absence of palm clustering in our study area using a global Moran’s I spatial analysis. To reduce the bias of edge effects in our Moran’s I analysis, we row-standardized the spatial weighting scheme, which proportionally controls the weighting of cells with unequal numbers of neighbors (47). Specific clusters of occupied cells were identified using the local Getis-Ord Gi* analysis, or commonly called hotspot analysis, with an inverse-distance weighting scheme. Local Getis-Ord Gi* is a popular type of Local Indicator of Spatial Autocorrelation (LISA). Given our large sample size, we applied a false discovery rate (FDR) correction to the hotspot analysis, which applies a more conservative threshold to cluster significance in order to reduce Type-1 errors associated with multiple testing and spatial dependency (48).
Palm Distribution
Palms and significant palm clusters were assigned a disturbance gradient category based on their location and surrounding land cover. To adjust for any minor spatial disagreement between our land cover data and satellite imagery due to limitations of positional error of the two data sets, which might introduce error when overlaying the palm coordinate data with level of disturbance, we assigned palms a corresponding gradient based on the disturbance category that comprised a majority of area within a 10-m buffer zone surrounding each palm point. Palm clusters were assigned the disturbance category that comprised the majority of their area. We used a Pearson’s chi-squared test of homogeneity to analyze whether our observed palm point and cluster distributions statistically deviated from an expected distribution. Because A. butyracea palms are ubiquitous across central Panama and our study area controls for known drivers of A. butyracea palm distribution (i.e. soil, temperature, precipitation), we expect our palm and cluster distribution to be generally evenly distributed across the disturbance gradient categories if no relationship exists between landscape disturbance and palm presence at a regional scale. We employed pairwise z-tests for proportions to identify the disturbance category or categories driving statistically significant variances from this expected distribution, where applicable.
We also assessed a random, stratified sample of 150 palm trees (thirty palms from each of our five disturbance gradient categories) to test whether average crown diameter, a proxy of palm age, differed significantly across the disturbance gradient. We used an analysis of variance (ANOVA) test to assess whether average palm diameter differed significantly by disturbance type.
Palm Density
Using the disturbance gradient assigned to each palm and palm cluster, we assessed average density of palms and clusters stratified by disturbance gradient. Palm density was measured as number of palms per square kilometer disturbance gradient. Cluster density was measured as average cluster hectare per square kilometer disturbance gradient.
Palm Proximity to Key Features
We assessed the distance of each palm and palm cluster to the nearest feature of interest in each of three categories: buildings, rivers, and other palms/clusters. Palm distance was measured from the center point of each crown. Cluster proximity was calculated between the feature of interest and the cluster’s boundary. Proximity to buildings is used as a proxy of Chagas disease risk to human populations, given concerns of crossover between sylvatic and domestic transmission cycles. Distance to riparian areas and other palms/clusters were considered as potential pathways of either vector or reservoir species’ movement (49, 50).
We considered palm proximity to all buildings within our study region, as well as a household subset. The Panama census building database does not specify residential buildings, but does specify fincas or occupied ranches, as well as many commercial, administrative and agricultural building types (e.g., schools, supermarkets, henhouses). Through process of elimination, we considered fincas and all buildings not otherwise labeled to be households. To further exclude buildings improbably large or small for residences, we further refined this potential household dataset to structures between 50-275 m2 in size which we derived spatially from the building’s footprint.
All distances were stratified by the originating palm/cluster’s corresponding disturbance gradient and averaged. To statistically compare stratified average proximity of palms and clusters to our key features of interest, we employed Kruskall-Wallace analysis of variance by ranks tests (51). Palms and clusters tended to be quite close to features of interest, with a minority of longer-distanced outliers. This skewed distribution violates assumptions associated with the more common statistical test to compare averages, the Analysis of Variance (ANOVA); the Kruskal-Wallace test is the non-parametric equivalent. Where necessary, we followed this test with a pairwise Dunn’s test for non-parametric post-hoc analysis, using a SAS macro developed by Elliott and Hynan (52).