Assessing Integrity Using Vegetation Structure and Composition

Context The draft post-2020 Global Biodiversity Framework aims to achieve a 15% net gain in the area, connectivity and integrity of natural systems by 2050. First, we analyse the complexity (foliage cover) and composition (native species richness) of 6 plant functional groups relative to their empirically defined benchmark. Second, we extrapolate the spatial patterns in foliage cover and species richness to predict where different plant functional groups are above or below benchmark as spatially-explicit, continuous characteristics across the landscape. Methods We assess the integrity of vegetation relative to a numerical benchmark using the log of the response ratio (LRR) to reflect the proportional change in the response variable. We use ensembles of artificial neural networks to build spatially-explicit, continuous, landscape-scale models of cover and species richness to assess locations where functional groups meet or exceed benchmarks.


Introduction
The draft post-2020 Global Biodiversity Framework embarks on an ambitious global agenda to preserve and protect biodiversity (CBD 2020). The Framework aims for a "net gain in the area, connectivity and integrity of natural systems of at least 5%" by 2030 and "an increase of at least 15%" by 2050. Furthermore, the Framework aims to ensure that "at least 20% of degraded ecosystems are under restoration". Elements like area, connectivity and configuration are critical for conservation; for some species, however, the quality, condition or integrity of patches within the landscape is more important than their geometry (Fleishman et al. 2002; Robles and Ciudad 2012). Yet assessing and enhancing integrity is multifaceted. Often, historical reference states are regarded as the baseline from which we assess integrity. However, Rohwer and Marris (2021) state "there is no objective, empirically measured property of integrity in ecosystems that can be shown to be high in 1850 and low in 2020" (where the year 1850 is flagged as a pre-colonisation historical reference state in North America).
We address three shortfalls in assessing and subsequently enhancing integrity. First, we assess integrity using values such as structure (e.g., foliage cover) and composition (e.g., native species richness) (Rohwer and Marris 2021) so that enhancing integrity can be focused and targeted. Second, we quantify integrity relative to a measurable and falsifiable reference state. Third, we map areas that meet or exceed benchmarks as potential areas to secure a post-2020 net gain in integrity (Clements et al.). We treat vegetation as discrete plant functional groups defined by native species within 6 growth forms (tree, shrub, grass and grass-like (hereafter referred to as grass), forb, fern and remaining 'others'). Our first objective was to analyse complexity (as summed foliage cover of each functional group) and composition (native species richness of each functional group) relative to their empirically defined benchmark. The second objective was to extrapolate the spatial patterns in foliage cover and species richness to predict where different plant functional groups are above or below benchmark across the landscape as spatially-explicit and continuous characteristics. These data-driven models show where and how much of the landscape can be regarded as being above or below benchmark, as opposed to inferential models that rely on an indirect interpretation of disturbance patterns such as land use (e.g., Newbold et al. 2016) or human footprint (e.g., Venter et al. 2016).
The status of different plant functional groups within ecosystems offers a multi-species specificity that we argue cannot be achieved with existing approaches to vegetation extent mapping, community type mapping or inferential habitat integrity or vegetation condition modelling. The latter attempt to simplify complex multi-attributes into a single index and in so doing can obscure important structural and compositional patterns within and among ecosystems across the landscape. Maps of vegetation cover and richness can also provide a direct measure of the intrinsic quality of landscapes, as opposed to extrinsic approaches proposed by Cook et al. (2019) which consider the shape, size, funding for management, pressures such as human footprint and adjacent land uses. When plant functional groups are considered relative to their benchmarks, these estimates of vegetation state provide a comprehensive and consistent approach to mapping current integrity, quality or condition. Spatial maps of the integrity, quality or condition of ecosystems will make an important contribution to assessing the protected area network to support global efforts to conserve and restore biodiversity post-2020.

Study area
The 11 519 052 hectare study area is located in north-eastern New South Wales (NSW), Australia. The area is dominated by privately owned land used for agriculture (52% land used for grazing and 23% used for cultivation, including irrigated cropping). Less than 10% of the study area is within the protected area network (757 217 ha) or travelling stock reserves (215 852 ha). Vegetation formations are diverse, ranging from Closed Rainforests in the eastern, elevated (1 510 m) regions to Arid Shrublands and Grasslands on the drier, western plains (Keith 2004).

Environmental And Disturbance Variables - The Predictor Surfaces
Our choice of potential predictors was guided by ecologically informed expert judgement. We prepared 29 potential surfaces that described patterns in macroclimate, soils and geology, topography, landscape modification and remote sensing (Table 1).

Model assessment
For cover variables, R² for the hold-out subset ranged from 0.88 (integrity of forb cover) to 0.79 (integrity of fern cover) and R² for richness variables ranged from 0.80 (integrity of tree and forb richness) to 0.57 (integrity of fern and other richness) (Table 2). The RMSD for the models of cover condition ranged from 2.50 (fern cover) to 1.48 (forb cover); for the condition of functional group richness, RMSD ranged from 1.50 (condition of tree and grass richness) to 2.32 (condition of fern richness) (Table 2). MAE for cover condition models ranged from 0.81 (condition of tree cover) to 1.77 (condition of fern cover) and for richness condition attributes MAE ranged from 0.85 (tree richness) to 1.88 (condition of fern richness) (Table 2).
Sensitivity analysis for the ensembles of ANN models indicates land use, great soil group and foliage projective cover were the three most important predictors for informing the multi-target cover models; and land use, great soil group and % clay were the three most important predictors for informing the multi-target richness models (Supplementary Material S4). Estimates of Moran's Index suggested model residuals were not spatially autocorrelated for the structure models (Supplementary Material S5). However, residuals for the models of tree richness were dispersed and those for other richness were clustered.
The predictive performance of each ensemble model was evaluated by calculating the coefficient of determination (R²), which we used to assess the strength of the relationships between predicted and observed values for the training, testing and hold-out subsets for each modelled condition attribute. It is important to note that model performance was judged by determining how well the model performed when applied to new data (the out-of-sample, validation or hold-out subset). Parity between the R² for the training and hold-out subsets indicates how well the model has been trained. We used the root mean squared deviation (RMSD) and the mean absolute error (MAE) to quantify the deviation from the 1:1 line (see Supplementary S3). Both error estimates report errors on the same scale as the input data. We calculated Moran's Index from the model residuals to determine whether residuals were spatially autocorrelated (implemented in the Spatial Statistics toolbox, ArcGIS v10.4) (Figure 1 - Step 3).

Predicting condition across the whole landscape
We used ensembles of 25 ANN models to predict spatial patterns of vegetation condition across the entire landscape. Trained models were deployed to every grid cell in the prediction matrix to derive an estimate of vegetation condition in previously unobserved locations. These analyses produced 25 prediction models. The predicted outputs for each grid cell were averaged to create a single ensemble model for each functional group.
We used spatial analysis to identify pixels at benchmark (LRR = 0) or above benchmark value (LRR > 0) for each functional group. We then calculated the coincident area where LRR equalled or exceeded benchmark value for all functional groups, with separate values for structure and composition. Finally, we calculated the area and locations where vegetation was at or above benchmark values for both structure and composition and interrogated how much of this area is located within a protected area (Figure 1 - Steps 4 and 5).

Numbers in bold highlight results for the hold-out subset.
The root mean squared deviation (RMSD) and mean absolute error (MAE) estimates show the mean deviation of predicted cover with respect to the observed cover and of predicted richness values with respect to the observed richness values. Number of observations: cover models, n = 3 021; richness models, n = 9 268.
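As a rough illustration of how the hold-out statistics in Table 2 might be computed, the sketch below derives R², RMSD and MAE from paired observed and predicted values. It is not the authors' code: the function name `holdout_metrics` is hypothetical, and R² is assumed here to be the squared Pearson correlation between predicted and observed values, which is only one common reading of "strength of the relationship".

```python
import numpy as np


def holdout_metrics(observed, predicted):
    """Return R^2, RMSD and MAE for paired observed and predicted values."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    residuals = pred - obs

    # Assumption: R^2 is the squared Pearson correlation between
    # predicted and observed values (not 1 - SS_res / SS_tot).
    r = np.corrcoef(obs, pred)[0, 1]

    # Both error estimates are on the same scale as the input data.
    rmsd = np.sqrt(np.mean(residuals ** 2))
    mae = np.mean(np.abs(residuals))
    return {"r2": float(r ** 2), "rmsd": float(rmsd), "mae": float(mae)}
```

Applying the same function to the training and hold-out subsets gives the parity check described in the text: similar R² values in both subsets suggest the model has not simply memorised the training data.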

Extrapolating Spatial Patterns Across The Landscape
Table 3 shows the area for each separate functional group where the LRR equalled or exceeded the benchmark value for structure and composition. Notably, the area where vegetation meets or exceeds benchmark is far greater for structure than for composition (Table 3; Figure 2 and Figure 3); however, the relative integrity of these different functional groups does not always meet or exceed the benchmark (Panel c of Figure 2 and Figure 3). In addition, mapped results indicate locations outside the protected area network where the composition and structure of each growth form exceed benchmark value.
When all functional groups are interrogated together, our ensemble models predicted 111 691 ha (approximately 1% of the study area) where the structure of all six functional groups was conterminously at or above benchmark value (in the same location). Of the six functional groups, fern cover is at or above benchmark value for approximately half of the study area. In contrast, these analyses show that the composition of different functional groups meets benchmark value in far fewer locations. Our ensemble models predicted that all six functional groups were at or above the composition benchmarks for only 10 371 ha (< 0.1% of the study area), although both grass richness and forb richness met benchmark values in approximately 10% of the study area. When we interrogated the areas where all 12 structure and composition attributes were predicted to be at or above benchmark value, 2 227 ha coincided and the largest contiguous area was < 20 ha (Table 3).
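The coincident-area figures above can be thought of as a cell-by-cell operation on the ensemble surfaces. The following sketch is illustrative only and rests on assumptions the text does not state: that each ensemble is held as a NumPy array of LRR predictions with shape (members, rows, columns), and that each grid cell covers 1 ha (a 100 m by 100 m cell). All function names are hypothetical.

```python
import numpy as np

CELL_AREA_HA = 1.0  # assumption: 100 m x 100 m grid cells, i.e. 1 ha each


def ensemble_mean(member_predictions):
    """Average member predictions of shape (members, rows, cols) cell by cell."""
    return np.mean(np.asarray(member_predictions, dtype=float), axis=0)


def at_or_above_benchmark(lrr_surface):
    """Boolean mask of cells with LRR = 0 (at benchmark) or LRR > 0 (above)."""
    return np.asarray(lrr_surface, dtype=float) >= 0.0


def coincident_area_ha(lrr_by_group, cell_area_ha=CELL_AREA_HA):
    """Area where every functional group is at or above benchmark in the same cell."""
    masks = [at_or_above_benchmark(surface) for surface in lrr_by_group.values()]
    all_groups = np.logical_and.reduce(masks)  # True only where all masks agree
    return float(all_groups.sum() * cell_area_ha)
```

Passing the six cover surfaces gives a structure total, the six richness surfaces a composition total, and all 12 surfaces together the area where both are met. Because a logical AND can only remove cells, the combined area can never exceed that of the most restrictive single attribute.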
Summarising patterns in the predicted Best-on-Offer reference state across tenure
Relating spatial patterns in integrity to tenure highlighted that protected area status is a coarse indicator for predicting where foliage cover or species richness might exceed benchmark (Table 4). Of the 2 227 ha where vegetation met or exceeded benchmark values for both structure and composition, approximately 75% is within a protected area, and the remaining 25% occurs outside a protected area (Table 4), scattered across the study area.
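A tenure summary of this kind amounts to overlaying the at-or-above-benchmark mask on a protected-area mask. The sketch below is a minimal example under stated assumptions: both inputs are boolean rasters on the same grid, and the name `share_by_tenure` is hypothetical rather than taken from the study.

```python
import numpy as np


def share_by_tenure(benchmark_mask, protected_mask):
    """Fraction of at-or-above-benchmark cells inside and outside protected areas."""
    mask = np.asarray(benchmark_mask, dtype=bool)
    protected = np.asarray(protected_mask, dtype=bool)

    total = int(mask.sum())
    if total == 0:
        # No cell meets benchmark, so there is nothing to apportion.
        return {"protected": 0.0, "unprotected": 0.0}

    inside = int(np.logical_and(mask, protected).sum())
    return {"protected": inside / total, "unprotected": (total - inside) / total}
```

The result is a proportion of cells, which equals a proportion of area only when all cells are the same size.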

Discussion
Globally, conservation efforts need to be more targeted and more effective. Spatially explicit estimates of vegetation integrity are an essential tool to inform ecosystem conservation and restoration planning at a landscape scale. Yet ecological integrity is an opaque concept that is difficult to assess and enhance (Brown and Williams 2016; Rohwer and Marris 2021). To disentangle integrity, we demonstrate how tangible attributes of vegetation cover and richness can be assessed relative to their transparently defined, contemporary reference states (McNellie et al. 2020). Moreover, we deliver models that provide continuous gridded surfaces at fine scales (100 m) commensurate with the key ecological processes that shape landscape patterns. These models are focussed and data-driven, as opposed to inferential models that rely on disturbance-driven patterns (such as land use or human footprint) which aim to assess and enhance integrity based on notions of pristine, intact or pre-intensification ecosystems. However, our results show that land use was substantially more influential than other environmental or disturbance variables, suggesting that land use might be an appropriate proxy in some cases. Given our assessment of integrity can be sourced from existing data, and assessed against measurable and falsifiable benchmarks, we believe our approach could be applicable to terrestrial landscapes at a global scale.
Previous landscape-scaled assessments of integrity have used national conservation reserves (e.g., Harwood et al. 2016) or distance from human populations or infrastructure (e.g., Allan et al. 2019; Watson et al. 2018) as proxies to identify reference states from which to assess condition. However, protected areas are not always minimally disturbed and are often not representative of a majority of ecosystems (Joppa and Pfaff 2009). We found that approximately 85% of vegetation that meets or exceeds the cover benchmarks is outside the current protected area network. In contrast, 85% of the vegetation that meets or exceeds the richness benchmark is contained within the protected area network.
When using a contemporary reference frame, data-based models provide useful intrinsic measures of integrity, quality or condition. Here, we show that some areas with the greatest potential biodiversity value are not included in the protected area network (Archibald et al. 2020; Clements et al. 2019; De Vos and Cumming 2019). Targeting these areas may yield disproportionate benefits for conservation strategies because unprotected areas potentially face some of the greatest threats (Myers et al. 2000).
Inspection of the predicted patterns shows that, across the landscape, there is a larger area that meets or exceeds benchmark values for cover (approximately 112 000 ha) than for richness (approximately 10 000 ha). In addition, we identified areas where condition exceeds benchmark values outside the existing protected area network. Protected area status appears to be a limited indicator of vegetation integrity, especially when structural and compositional attributes are considered simultaneously. This may indicate that the composition of functional groups is more degraded and that efforts to restore landscapes will need to target species composition.
Here we have highlighted that when we use complex, multi-attribute information based on species' observations, it is difficult for all attributes to meet their benchmark simultaneously. This paints a picture of poor or incomplete integrity when assessing the contemporary landscape. This may contrast with common perceptions of integrity, especially in protected areas. However, people, including experts in vegetation assessment, use different cues to perceive integrity. When assessing vegetation communities, experts may tend to be biased towards dominant woody species (Dorrough et al. 2021), rather than overall structure or composition (such as foliage cover and native species richness among functional groups).
One of the benefits of the models presented here is that individual attributes have not been combined into an index of structural or compositional integrity. When used as standalone attributes, each can be used to inform different aspects of landscape conservation. An additional benefit is that log response ratios can be summed. As a result, different functional groups can be summed to assess the integrity of combined attributes, such as non-woody (grasses, forbs, ferns and 'others') or woody (trees and shrubs) components of the landscape. Furthermore, the data are on a continuous scale and can be interrogated to find patterns in vegetation integrity across the landscape for operational and practical land management and conservation, such as active restoration. Modelled surfaces can identify where in the landscape most functional groups meet benchmark levels and, therefore, can inform targeted restoration.
For example, our results indicate those areas where both shrub and tree functional groups meet or exceed benchmark levels, yet active restoration may be necessary to improve the structure and composition of forbs and grasses.
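The additivity of log response ratios noted above can be written out explicitly. The notation below is an assumption for illustration, not the authors' own: x_g is the observed cover or richness of functional group g, b_g is its benchmark, G is the set of groups being combined (for example, trees and shrubs for the woody component), and the logarithm is taken to be natural. Any adjustment used to handle zero values is not shown.

```latex
\[
\mathrm{LRR}_g = \ln\!\left(\frac{x_g}{b_g}\right),
\qquad
\sum_{g \in G} \mathrm{LRR}_g
  = \ln\!\left(\prod_{g \in G} \frac{x_g}{b_g}\right)
\]
```

Under this reading, LRR_g = 0 when a group sits exactly at benchmark, which matches the thresholds used for mapping. A summed value is the log of the product of the individual ratios, so a shortfall in one group can be offset by a surplus in another; it is not, in general, the same quantity as the LRR of pooled cover or richness.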
Given systematic conservation planning is an inherently spatial undertaking (Pressey et al. 2000), planning can be improved when the integrity of different functional groups is considered. The maps generated here incorporate the relationship of observed states relative to contemporary reference states, which is likely to improve systematic conservation planning by identifying locations that simultaneously protect species from multiple functional groups. These maps show different degrees of change in integrity across the landscape and focus on a multi-species approach with an emphasis on structural and compositional attributes. Areas that could potentially be used to augment and extend the conservation network will inevitably be located on private lands. Refocusing the Global Biodiversity Framework post-2020 could concentrate new conservation efforts on creating protected areas that are tailored to meet future biodiversity targets. We have shown that some functional groups have large contiguous areas that facilitate well-connected landscapes within functional groups.

Declarations
Acknowledgements
We are grateful to T. Eyre, M. Lewis and R. Peet for their comments on an earlier version and we thank anonymous referees for all their valuable comments on the manuscript.
Funding or Competing Interests - The authors declare that they have no known competing financial interests, competing interests or personal relationships that could have appeared to influence the work reported in this paper.
Contribution: conceived the ideas and designed methodology - ALL; proposed log response ratio (LRR) method - JD, JY; extracted the case study data and prepared LRR data - MJM; prepared all figures, tables and maps - MJM; led the writing of the manuscript - MJM; contributed critically to drafting and revising the manuscript and gave final approval for publication - ALL

Data Availability
Input data and maps will be made available via publicly archived datasets