Predicting Wetland Occurrence in the Arid to Semi-Arid Interior of the Western Cape, South Africa, for Improved Mapping and Management

As for drylands globally, there has been limited effort to map and characterize wetlands in the arid to semi-arid Western Cape interior of South Africa. This study therefore assessed how wetland occurrence and type in the arid to semi-arid interior of the Western Cape relate to key biophysical drivers and, through predictive modelling, sought to contribute towards improved accuracy of the wetland map layer. Field-verified test areas were selected to represent the aridity gradient, rainfall seasonality, hydrogeomorphic (HGM) types and physiographic zones encompassed in the study area. The arid areas of the Karoo physiographic zones had: (1) a low (<1%) proportional area of wetland; (2) an almost complete absence of seepage slope wetlands; (3) ephemeral depressions, all non-vegetated; and (4) much of the wetland associated with valley bottoms confined within a channel. The less arid mountain zones had: (1) a much higher (>3%) proportional area of wetland; and (2) wetlands that were predominantly hillslope seepages, but also included valley bottom wetlands.

A spatial probability surface of wetland occurrence was generated based on the statistical relationship of verified wetland presence and absence data points with a range of catchment-scale predictor variables, including topographic metrics and hydrological/climatic variables. This layer was combined with raster images of the most likely HGM type to yield a product of wetland occurrence, attributed by HGM type. Vulnerabilities of the wetlands were identified based on key attributes of the different wetland types, and recommendations were provided for refining the wetland map for the Western Cape.

Introduction

Globally, much of the effort to map and characterize wetlands at a landscape scale has been in humid temperate areas, where wetlands tend to be more extensive than in drylands (Hu et al. 2017).

... the field appear to have the same HGM type and a similar landscape setting and spectral signature to a comparable wetland which has been observed in the field (and from which inferences can be drawn). Given that the resources for collecting primary data were limited, the test areas were derived as far as possible from existing sources. However, where major gaps were found in the representation of key environmental gradients in the study area (in particular, mean annual rainfall and rainfall seasonality), test sites were generated from scratch by collecting primary data.

The test sites were selected using the following criteria:

• The spread of sites should achieve good representation of the aridity gradient in the province, particularly along the drier end of the continuum, which covers much of the interior of the province and has thus far received relatively little attention in terms of wetland mapping effort.
• The spread of sites should achieve good representation of the main rainfall seasons (aseasonal, mainly in the west vs. late summer rainfall, mainly in the east) in the study area.
• The spread of sites should achieve good representation of the main hydrogeomorphic types in the province.
• Sites with relatively low levels of wetland loss/alteration/creation are preferable to those where such levels are high, because such impacts "interfere" with the predictive relationship between wetland occurrence and biophysical drivers.
• Sites involving a shorter travel distance and with wetlands which are generally more accessible are preferable to those with longer distances and lower accessibility.
• Sites with existing information on the wetlands are preferable to those without.

All of the datasets outlined above were used, to varying degrees, to inform probability models of wetland occurrence and HGM type. These rely on binary data (e.g. wetland presence or absence) correlated with associated environmental predictor data (for example, elevation). The ... and use of variables specific to regions or model output. The study included 13 test areas (Table 2) covering a total of 2 520 km².

Predictor variables and spatial datasets

Raster layers for nine predictor variables were used for the wetland models (Table 3). The raster layers were either primary data or DEM-derived.
The digital elevation model (DEM) was obtained by first downloading the appropriate 3 arc-second SRTM (~90 m resolution) data tiles from the USGS website (USGS 2021), and then merging these into a single image. Data sinks were filled before the DEM-derived surfaces were calculated: the GRASS r.flow routine was used for the flow accumulation surface, and the SAGA Terrain model roughness routine for the roughness index. All raster layers were standardised to the spatial extent of the study region.
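As an illustration of how a DEM-derived predictor surface can be computed, the following minimal sketch calculates slope in degrees from an elevation grid using finite differences. It is not the GRASS or SAGA workflow used in the study: the function name `slope_degrees`, the synthetic grid and the fixed 90 m cell size (an approximation of the 3 arc-second SRTM spacing, which ignores map projection) are assumptions made for this example only.

```python
import numpy as np


def slope_degrees(dem, cell_size=90.0):
    """Return slope in degrees for each cell of a DEM array.

    Assumes square cells of `cell_size` metres (hypothetical value chosen
    to approximate 3 arc-second SRTM data) and elevations in metres.
    """
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), cell_size)
    # Slope is the arctangent of the magnitude of the elevation gradient
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


# Synthetic DEM: a plane that rises 90 m for every 90 m cell eastwards,
# so the slope of every cell should be 45 degrees.
dem = np.tile(np.arange(5) * 90.0, (4, 1))
slope = slope_degrees(dem)
print(slope)
```

In a real workflow the DEM would first be reprojected to a metric coordinate system so that horizontal and vertical units match.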

Model development
Using field-verified datasets of wetland presence and absence across these landscapes, a spatial probability surface of wetland occurrence was generated for the study area. This was based on the statistical relationship of wetland presence and absence data points with a range of catchment-scale predictor variables. These included topographic metrics (elevation, slope, terrain roughness) and hydrological/climatic variables (mean annual precipitation, flow accumulation, aridity index). This layer was combined with raster images of the most likely HGM type within the landscape.

A coverage of random points at a fixed density per sub-catchment was generated using the "random point inside polygons (fixed)" tool in QGIS. Various point densities were evaluated, with the best option being a density of ~0.28 points km⁻², or approximately 20 points per sub-catchment. Next, a subset of these points, coincident with where wetland data were used, was selected using the "select by location" tool.

... raster images were classified into binary layers: "high" elevation > 500 m; "shallow" groundwater < 8 m below ground; "steep" slopes > 5°. To spatially represent where seeps would be likely to occur, the raster images were multiplied together using the raster calculator. This output image was multiplied with the wetland occurrence probability map to indicate probabilities of wetland seep occurrence.
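The raster-calculator steps described above can be sketched with NumPy arrays standing in for the raster layers. The three thresholds (elevation > 500 m, groundwater < 8 m, slope > 5°) come from the text; everything else is an assumption for illustration. In particular, the logistic form of the occurrence model, the function names, the coefficient values and the 2 × 2 example grids are hypothetical and are not the fitted model reported in the study.

```python
import numpy as np


def logistic_probability(predictors, coefficients, intercept):
    """Per-cell probability from a logistic model (assumed model form).

    `predictors` maps a variable name to a raster array; `coefficients`
    maps the same names to hypothetical regression coefficients.
    """
    first = np.asarray(next(iter(predictors.values())), dtype=float)
    z = np.full_like(first, intercept, dtype=float)
    for name, beta in coefficients.items():
        z = z + beta * np.asarray(predictors[name], dtype=float)
    return 1.0 / (1.0 + np.exp(-z))


def seep_probability(elevation_m, groundwater_depth_m, slope_deg, wetland_prob):
    """Multiply the three binary layers together, then by the probability map."""
    high = (elevation_m > 500).astype(float)           # "high" elevation
    shallow = (groundwater_depth_m < 8).astype(float)  # "shallow" groundwater
    steep = (slope_deg > 5).astype(float)              # "steep" slopes
    seep_mask = high * shallow * steep                 # 1 only where all three hold
    return seep_mask * wetland_prob


# Hypothetical 2 x 2 rasters; only the top-left cell meets all three criteria.
elevation = np.array([[600.0, 400.0], [800.0, 900.0]])
groundwater = np.array([[5.0, 5.0], [10.0, 3.0]])
slope = np.array([[6.0, 6.0], [6.0, 2.0]])

# Hypothetical coefficients, for illustration only.
coefficients = {"elevation": 0.001, "slope": -0.1}
prob = logistic_probability(
    {"elevation": elevation, "slope": slope}, coefficients, intercept=0.0
)
seep = seep_probability(elevation, groundwater, slope, prob)
print(seep)
```

Because the binary layers are multiplied, a cell that fails any one criterion receives a seep probability of zero regardless of its overall wetland probability.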

HGM type vulnerabilities
A rating of the vulnerabilities of wetland types in the study area was undertaken based on: (1) the wetlands' key biophysical features and the likely key water inputs sustaining these wetlands; and (2) current impacts on the different wetland types. The level of vulnerability to a specific threat was based on a joint consideration of: (1) the intensity of the anticipated impact if a specific threat were to occur; and (2) the observed current extent of the threat. For example, while the anticipated impact intensity of sewage/effluent/pollution on hillslope seeps in the TMG (Table ... this wetland type subject to sewage/effluent/pollution is extremely low, and therefore the overall level of threat for this wetland type and threat type combination is low.

Results
Wetland presence

Based on the field data collected, the arid areas of the Great Karoo and the Little Karoo physiographic zones had the following key results: (1) a similarly very low (0.7%) proportional area of wetland; (2) an almost complete absence of seepage slope wetlands ( ...

The PCA indicated strong correlation between predictor variables and wetland presence (Figure 2; Table 4). The optimal model for predicting wetland presence was based on four variables (A-pan evaporation, roughness, slope, elevation), with all model parameters being significant (p < 0.1; Table 5 and Equation 1). The equation was translated into a spatial product using the raster calculator (Figure 4). This output indicates distinct regions where wetland occurrence, irrespective of HGM type, is likely to be high. The variable with greatest leverage was slope, followed by roughness (Table 5) ...

Wetlands of all HGM types which occur in the mountain zones are generally inaccessible and inherently unsuitable for intensive use, e.g. for cultivation (Table 6). In addition, the Cape fold mountains have sandstone-derived soils with generally very low nutrient status, resulting in a low value for livestock grazing and little grazing taking place. In contrast, the Great Escarpment mountains, with their predominantly dolerite-derived soils of a higher nutrient status and their grassy vegetation, are subject to much higher levels of livestock grazing, and are therefore more vulnerable to livestock-related impacts. Furthermore, most of the Great Escarpment areas are privately owned by livestock farmers, whereas extensive areas of the Cape fold mountain zones fall within formally protected nature conservation areas. For these wetlands, the most immediate threat is from invasive alien plants, which are already infesting many of the valley bottom wetlands in this zone and, to a lesser extent, the hillslope seep wetlands.
Another important threat to the mountain zone wetlands is climate change. It is anticipated that some wetlands, hillslope seeps in particular, which are close to a threshold of occurrence in terms of minimum MAP to PET ratio could be lost entirely, even with a modest decrease in the MAP to PET ratio, as predicted with climate change. The coarse resolution of the climate data used in the study did not allow such specific thresholds to be identified. However, as one follows the orographic rainfall gradient downwards from the relatively high MAP areas at high altitudes to the lower portions, some of the last hillslope wetlands encountered before entering the Karoo zone (where hillslope seep wetlands are almost entirely absent) are anticipated to be amongst those close to this threshold.

... low. Both approaches constitute ancillary data that will support the national wetland mapping process (Figure 7). In practical terms, polygons defined as potential wetlands which have a high probability of occurrence would be confidently categorized as wetland areas. Conversely, polygons defined as potential wetlands which have a low probability of occurrence are unlikely to be wetlands and are more likely to be non-wetland drainage lines.

Principal components analysis of wetland presence/absence relative to predictor variables (see Table 2 ...
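The practical rule for potential-wetland polygons, in which a high probability of occurrence supports categorization as wetland and a low probability points to a non-wetland drainage line, could be sketched as follows. The cut-off values of 0.7 and 0.3, the intermediate "needs verification" class and the function name are assumptions for illustration only; the text does not specify thresholds.

```python
def classify_polygon(mean_probability, high=0.7, low=0.3):
    """Label a potential-wetland polygon from its mean occurrence probability.

    `high` and `low` are hypothetical cut-offs, not values from the study.
    """
    if not 0.0 <= mean_probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if mean_probability >= high:
        return "wetland"
    if mean_probability <= low:
        return "non-wetland drainage line"
    # Intermediate probabilities are flagged rather than forced into a class
    return "needs verification"


# Hypothetical polygons with their mean modelled probabilities
polygons = {"A": 0.85, "B": 0.10, "C": 0.50}
labels = {name: classify_polygon(p) for name, p in polygons.items()}
print(labels)
```

Keeping an explicit intermediate class avoids overstating confidence for polygons whose modelled probability is neither clearly high nor clearly low.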