Ecosystem services from coffee agroforestry in Central America: estimation using the CAF2021 model

The goal of sustainable coffee production requires multiple functions from agroforestry systems. Many are difficult to quantify and data are lacking, hampering the choice of shade tree species and agronomic management. Process-based modelling may help quantify ecosystem services and disservices. We introduce and apply coffee agroforestry model CAF2021 (https://doi.org/10.5281/zenodo.5862195). The model allows for complex systems with up to three shade tree species. It simulates coffee yield, timber and fruit production by shade trees, soil loss in erosion, C-sequestration, N-fixation, -emission and -leaching. To calibrate the model, we used multivariate data from 32 different treatments applied in two long-term coffee agroforestry experiments in Costa Rica and Nicaragua. Without any further calibration, the model was then applied to agroforestry systems on 89 farms in Costa Rica and 79 in Guatemala where yields had been reported previously in farmer interviews. Despite wide variation in environmental and agronomic conditions, the model explained 36% of yield variation in Costa Rica but only 15% in Guatemala. Model analysis quantified trade-offs between yield and other ecosystem services as a function of fertilisation and shading.

may thus reduce coffee yields, but they also provide income (Rice 2008) and, by improving the microclimate, stabilise yields in the face of environmental change as discussed in a recent review by van Noordwijk et al. (2021).
An important issue in the evaluation of coffee agroforestry (CAF) systems is that many ecosystem services are difficult to quantify, which hampers the choice of shade tree species and agronomic management. This is a particular problem in Central America, where mountainous topography with young volcanic soils leads to strong spatial variation in weather patterns and soil fertility. Together with a wide variation in management choices, this has led to strong variability in productivity (Haggar et al. 2021). It is likely also causing strong variation in environmental ecosystem services, but not many data are available to estimate these.
Here we used a combination of data analysis and process-based modelling to quantify ecosystem services from coffee agroforestry on 89 farms in Costa Rica and 79 in Guatemala (Fig. 1). The same farms were studied previously by Haggar et al. (2021) who found that coffee productivity has no simple relationship with shade. Low-productivity farms tended to have either low or high shade. The measurements at these farms did not include ecosystem services related to carbon and nitrogen biogeochemistry. Here we estimated these ecosystem services using the new coffee agroforestry model CAF2021, which is freely available online (https:// doi. org/ 10. 5281/ zenodo. 58621 95). The model differs from its predecessors CAF2007 (Van Oijen et al. 2010a) and CAF2014 (Rahn et al. 2018;Ovalle-Rivera et al. 2020) mainly in that it allows for more complex agroforestry systems with up to three different shade tree species. The model simulates a wide range of ecosystem services: coffee yield, timber and fruit production by shade trees, soil loss by erosion, soil carbon sequestration, nitrogen fixation, loss of nitrogen in atmospheric emissions and leaching. Other process-based models of coffee agroforestry have been developed in recent years (Charbonnier et al. 2017;Vezy et al. 2019). These models focus on plant physiology, and they simulate agronomy and biogeochemistry in less detail than CAF2021, which makes them less suitable to assess a wide range of ecosystem services.
Long-term field experiments that measure a broad suite of variables are required to parameterise and evaluate complex process-based vegetation models. To calibrate CAF2021, we used multivariate data from a total of 32 different treatments applied in two long-term coffee agroforestry experiments in Costa Rica and Nicaragua. A subset of these data was used earlier in the parameterisation of CAF2014 by Ovalle-Rivera et al. (2020). These authors only used data from 10 of the treatments because their model did not simulate agroforestry systems with more than one shade tree species. We used the same method of model parameterisation, i.e. Bayesian calibration, an increasingly common method for calibration and uncertainty quantification of process-based models (Van Oijen 2020).
The application of the model to the farms in Costa Rica and Guatemala constitutes a test of the predictive capacity of the model against independent data as farm information was not used in the calibration. The farm data were less comprehensive than the experimental data, so we also compare model output against the rich literature on the impact of shading and fertilisation on delivery of ecosystem services from coffee agroforestry systems in Central America (Soto-Pinto et al. 2000;Cannavo et al. 2011;Meylan et al. 2013Meylan et al. , 2017Jha et al. 2014;Goodall et al. 2015;Sauvadet et al. 2019;Haggar et al. 2021;De Leijster et al. 2021).
The goals of this study were threefold: • 1: Identifying the drivers of observed differences in productivity between farms and CAF types.   Haggar et al. (2011), Noponen (2012, Noponen et al. (2013), Sepúlveda et al. (2016) and Ovalle-Rivera et al. (2020). We refer to these publications and to the Supplementary Information (S2) for further details.

Data from farms in Costa Rica and Guatemala
We used data from 89 coffee agroforestry (CAF) farms in Costa Rica and 79 farms in Guatemala collected by the SEACAF project (Haggar et al. 2021).
In the present study, these data are used for independent testing of CAF2021 without any farm-specific model calibration. The farm locations are shown in Fig. 1. Table 1 summarises the environmental conditions at the on-farm CAF systems grouped according to the typology of Haggar et al. (2021). For running CAF2021 full time series of daily weather from 2005 to 2020 were derived from the WorldClim database (Fick and Hijmans 2017) with additional information for recent years from weather stations run by Icafe and Instituto Meteorológico Nacional in Costa Rica and by Anacafe in Guatemala. Soil water holding capacity was derived from soil texture information (bulk density, silt, clay) using a pedotransfer function tested for tropical soils (Tomasella et al. 2003;Tomasella and Hodnett 2004). Soil carbon and nitrogen Table 1 Average environmental conditions for farms grouped by CAF system type in Costa Rica and Guatemala. The first digit of CAF type indicates relative productivity, the second shading level, both from 1 = low to 4 = high. Variables: altitude (m), slope (degrees), rain (mm y −1 ), temperature (degrees C), soil water holding capacity (-), soil carbon and nitrogen stocks (kg m −2 ). stocks measured in the top 26 cm were extrapolated to 1 m depth assuming the top 26 cm contained 80% of the stocks. Table 2 with on-farm CAF system properties shows management practices for fertilisation and shading, and coffee yield. Some farms had reported implausibly high fertilisation rates which we capped at 500 kg N ha −1 y −1 . We allocated farm-specific tree density data to the six species-types listed in the table (see also Table 3). Tree pruning and thinning regimes were not reported but had to be specified for running CAF2021. Trees were assumed to be pruned once per year, targeting each farm's reported shade level ( Fig. 4), except for Erythrina which in Costa Rica tends to follow a prescribed twice-yearly 90% pruning regime. Tree thinning was only applied to Cordia following a prescribed calendar. Pruned branches were removed from the field in Guatemala while remaining on the field in Costa Rica. Dry coffee fruit production was calculated as one third of reported fresh yield (Sepúlveda et al. 2016).
CAF2021 structure CAF2021 is a process-based model for the biogeochemistry of coffee agroforestry systems. We give a  CAF2021 simulates a coffee agroforestry system where the coffee plants are shaded by up to 3 tree species. The model allows for two lower-layer tree species, but no more than one upper-layer tree species. In typical use of the model the upper layer would be occupied by timber trees and the lower layer by service and fruit trees. The simulated tree species can have different initial planting densities and thinning regimes. The model simulates the biogeochemistryflows of carbon, nitrogen and water-of the soil-coffee-trees system. State variables are pools of carbon, nitrogen and water as well as coffee developmental stage. Height and crown area growth are simulated for each of the tree species. The model uses a daily time step and therefore requires daily weather data (light, temperature, rain, wind, humidity) as input. Further required inputs are atmospheric [ CO 2 ], soil properties (slope, water retention parameters, initial contents of C and N) and management calendars for fertilisation, pruning and thinning. The pruning and thinning calendars can be specified separately for each plant species.
The outputs from the model are time series with daily values of state variables and processes including ecosystem services such as productivity of coffee and trees, N-leaching, N-emission, erosion in the form of soil organic matter loss (C & N) in runoff, C-sequestration in the soil.
The model is implemented in FORTRAN but called from a user-interface written in R.
CAF2021 differs from its predecessor models in the following respects: • There can be three tree species present rather than just one. All tree variables are implemented as dynamic arrays whose length equals the number of tree species. Users with experience in programming can therefore relatively easily extend the number of shade tree species beyond three. • Fruit production can be simulated. • Lower-layer trees do not overlap, so the number of ground cover conditions (vertical vegetation profiles) in any simulated field is at most six: coffee shaded by (1) no trees, (2) service trees, (3) fruit trees, (4) timber trees, (5) service + timber trees, (6) fruit + timber trees. The model keeps track of the fractional ground area covered by each of these six categories (see Supplementary Information S1 for an example). • The relative heights, leaf area indices (LAI) and light extinction coefficients of the two competing tree species within ground cover categories (5) and (6) determine their access to light. • Morphology (tree height as a function of stem biomass and crown width as a function of branch biomass) is simulated using allometric equations but for each species a genetic or management-induced maximum height can be specified. • Shade management can be fully specified using calendars for dates and intensities of pruning and thinning. Alternatively, it may be simulated as being goal-directed toward a specified shade level.
Time series for selected output variables from CAF2021 are shown in the Supplementary Information (S1 and S3).

Bayesian calibration of CAF2021
We calibrated the parameters of CAF2021 using a Bayesian approach to allow for uncertainty quantification. Data from the two long-term experiments in Turrialba and Masatepe that we described above were used for this purpose. These measurements were suitable for calibration of a complex model such as CAF2021 because they comprised a wide range of coffee, tree and soil variables. This reduced the risk of tuning the model to one variable such as coffee yield production at the cost of poor simulation of other system processes.
For the calibration of CAF2021, we used data from 18 Turrialba treatments and 14 Nicaragua treatments. Some of these data had been used for Bayesian calibration of the predecessor model CAF2014 by Ovalle-Rivera et al. (2020), but that model was limited to treatments with no more than one tree species, so only six of the treatments from Turrialba and four from Masatepe could be used by them. In other respects we followed their implementation of Bayesian calibration.
We carried out a single multi-site, multi-treatment Bayesian calibration of CAF2021, using Markov Chain Monte Carlo sampling (MCMC) with a chain length of 60,000. At each iteration of the chain the model was run for all 32 treatments, so a total of 1.92 million runs was carried out during the MCMC. The calibration provided samples from the posterior distribution for 65 of the model's parameters: • universal parameters (n = 45) that were not allowed to vary between the treatments, • site-specific parameters (n = 5) that were allowed to differ between Turrialba and Masatepe, • tree-species specific parameters (n = 15).
The universal parameters included all 23 calibrated coffee parameters, thus ignoring any differences between cultivar Caturra in Turrialba and Pacas in Masatepe. Also treated as universal were 15 tree parameters and 7 soil parameters. The motivation for treating so many parameters as universal (which in reality may show variation) was to maintain a degree of universal applicability of the model, and to constrain the need for site-or tree-specific information in future applications of the model. The 5 site-specific parameters were soil parameters governing nitrogen leaching, organic matter composition and turnover. Making these parameters site-specific allowed simulation of the greater capacity of soils in Masatepe to stabilise organic matter and minerals (Noponen 2012). The 15 tree-species specific parameters were predominantly parameters for carbon allocation and morphology. All parameters were a priori assigned wide beta probability distributions, reflecting large prior uncertainty about plausible parameter values for the largely new model CAF2021.
Bayesian calibration generates a sample from the joint posterior probability distribution for all calibrated parameters. The MAP is the 'Maximum A Posteriori' parameter vector, i.e. the model parameterisation that achieves the highest value for the product of prior (beta distributions for parameters) and likelihood (fit to data) (Ovalle-Rivera et al. 2020; Van Oijen 2020). Results reported in this paper are for the MAP parameter vector.

CAF2021 application to farms Costa Rica and Guatemala
We used CAF2021 to simulate measured and unmeasured ecosystem services for each of the 168 farms in Costa Rica and Guatemala, using each farm's environmental conditions and CAF management choices as model inputs. All parameters for coffee and for Erythrina and Inga trees were left at the values determined by Bayesian calibration using data from the long-term experiments at Turrialba and Masatepe. Tree species not present in the experiments were grouped according to type and management (Table 2). Parameter values for the most abundant tree species in each group (listed as the 'Type species' in Table 3) were taken from the literature, specifically for banana (Musa spp.) (Schaffer et al. 1999;Chaves et al. 2009 Planting dates were not recorded in the farms database but coffee stands were generally mature and we simulated the years 2005-2020 for each farm. The only differences between Costa Rica and Guatemala in our model set-up were conform local practice: (1) pruned branches were removed in Guatemala but left on the field in Costa Rica, (2) the height of some tree species was differently managed.

Analysis of sensitivity to management
Besides the regular simulations for each farm, which used the observed and reported local conditions, we made two additional model runs to test farm sensitivity to management choices. In the first of these, we halved each farm's fertilisation rate. In the second, we halved tree densities and shade target.

Bayesian calibration on data from experiments in Turrialba and Masatepe
The left panel of Fig. 2 shows observed coffee productivity in the 32 treatments against the yields simulated by the calibrated model. Posterior parameter uncertainty was small, with an average coefficient of variation of 5%. However, each model output depends on multiple parameters, and output uncertainty for individual variables (coffee yield, soil carbon sequestration, nitrogen leaching) was 30-36%. These are predictive uncertainties for model application to a single set of conditions and do not apply to trends across treatments because any systematic effects of parameterisation on outputs then cancel out. CAF2021 outputs for different coffee agroforestry systems in Costa Rica and Guatemala CAF2021 accounted for 36% of observed variation in coffee yields for the on-farm CAF systems in Costa Rica (middle panel of Fig. 2), which is less than the variation accounted for in the Costa Rican long-term experiment at Turrialba (left panel), but compares well given the fact that the farm simulations were not calibrated. The results for the Guatemalan on-farm CAF systems were less good, with only 15% of yield variation accounted for (right panel). Table 4 summarises the on-farm simulations for coffee yield and seven other ecosystem services grouped by the CAF system types of Haggar et al. (2021). The model correctly identified CAF types 1.2/3 and 2.4 as much lower-yielding in both   ing Tables 2 and 4). However, the model was unable to explain the very high yield difference that was observed between the two highest-yielding types in Guatemala, which were 2334 and 5433 kg ha −1 y −1 of bean dry matter for CAF type 3.3 and 4.3 respectively, whereas the simulations were 2963 and 2785 kg ha −1 y −1 . The very similar yields in these simulations are consistent with these CAF types having similar environmental conditions (Table 1), shading and fertilisation ( Table 2). The CAF2021 simulations of Table 4 suggest that there were trade-offs between coffee yield and other ecosystem services. In Costa Rica, CAF types with high coffee yield also had high fruit yields but produced less timber and had higher levels of nitrogen leaching and emission. In Guatemala, the trade-offs were less marked apart from high N-emission rates in high-yielding CAF types.
To determine whether the trade-offs were driven by management choices we plotted the simulated ecosystem services at each farm against its level of N-fertilisation (Fig. 3) and shading (Fig. 4). The results suggest that ecosystem services other than timber production were more strongly determined by level of fertilisation than by level of shading. Fertilisation had the expected effects of increasing leaching and emission but it also stimulated vegetation cover and thus led to less erosion as measured by soil carbon loss in runoff. Coffee yields were more related to fertilisation level in Costa Rica than in Guatemala. This reflects the wider range and twice as high average level of fertilisation in Costa Rica (Table 2).

Sensitivity to management
The analyses of Figs. 3 and 4 show generally fairly weak correlations of the driving variables with the ecosystem variables. This was partly because of confounding of the many environmental and management differences between the farms. To highlight the effects of fertilisation and shading, we therefore carried out a sensitivity analysis where for each farm all conditions were kept the same apart from halving Fig. 3 Simulations of ecosystem services at farms in Costa Rica and Guatemala. Units as in Table 4. All plots are against N-fertilisation level (kg N ha −1 y −1 ) either fertilisation or shading. The results of this analysis are shown in Fig. 5. This shows that halving fertilisation is expected to reduce coffee yields, carbon sequestration, N-leaching and emission, while increasing (in Costa Rica) erosion through runoff. Halving shading also increases runoff but otherwise has very different effects: it increases coffee yield and reduces ecosystem services closely associated with trees (production of timber and fruit, N-fixation).

Bayesian calibration on data from experiments
We carried out a single Bayesian calibration-for all treatments in Turrialba and Masatepe simultaneously-to derive parameter estimates. All coffee parameters were treated as being universally applicable, and common tree species were also assumed to have the same properties in both experiments. The only experiment-and thus country-specific parameters in the calibration were five soil properties. Variation within experiments was ignored altogether. In reality, there may have been genetic and environmental variation in and between the experiments. Fig. 4 Simulations of ecosystem services at farms in Costa Rica and Guatemala. Units as in Table 4. All plots are against observed shade level (-) Fig. 5 Changes in ecosystem services when fertilisation or shading is halved. Units as in Table 4. All changes are normalised to the country average Disregarding such plot and block effects may have hampered model fit but ensured that the model was not overfit to noise. Moreover, the calibration was conducted simultaneously for multiple variables, so model outputs were not optimised for any single variable such as coffee yield. Despite all these restrictions, the calibrated model accounted for 74% of observed coffee yield variation between treatments at Turrialba and 63% at Masatepe. This compares favourably with linear regression of yield on fertilisation level in the experiments for which the r 2 was only 0.23. It shows that models for ecosystem service evaluation in coffee agroforestry systems need to integrate the effects of not only management but also weather and soil conditions.
The results suggest that CAF2021 can be a useful tool for analysing differences in system performance between different environmental conditions. However, this does not imply that the model is a good tool for forecasting the benefits of agroforestry at any individual site, as posterior output uncertainty (driven by posterior parameter uncertainty) was high at 30-36%.

Farm simulations
After the Bayesian calibration on the data from the two experiments, no further calibration of CAF2021 was carried out. The application to farms in Costa Rica and Guatemala thus constituted an independent test of the predictive capacity of the model. This test was restricted to coffee yield as data from other ecosystem services were not available. Coffee yields on Costa Rican farms were simulated reasonably well ( r 2 = 0.36 ), but yields on Guatemalan farms were not ( r 2 = 0.15).
One possible explanation for the mixed performance of the model is the use of parameter values calibrated on data from just two experiments. The experiments may have been more representative of plant properties and environmental conditions in Costa Rica than in Guatemala. It would have been possible to calibrate the model on a subset of reported yields but the model would then be tuned as a coffee yield prediction tool to the possible detriment of predictive capacity for other ecosystem services. It is therefore of importance to find rich local data sets, with measurements of a variety of coffee, tree and soil variables, before adding a second Bayesian calibration.
A second possible explanation for uneven model performance is the greater variation in environmental conditions in Guatemala compared to Costa Rica, as measured by the larger standard deviations for temperature, precipitation and altitude (Table 1). In the simulations, all parameters for coffee and tree morphology and physiology were kept at the same values for all 168 farms (89 in Costa Rica and 79 in Guatemala). Any genetic differences between coffee-and tree-varieties were thus ignored. The only tree-parameter differences accounted for were the farm-specific initial densities of shade trees. Disregarding interfarm variability in this way may have affected simulations in Guatemala the most because of its greater variation in growing conditions. A third possible explanation for poor predictive capacity of the model is errors in data. In particular surprising was the observation that coffee yields in Guatemala were on average 2.3 times higher on farm with CAF-type 4.3 than on those with type 3.3, despite similar environmental conditions and levels of fertilisation. The farms with CAF-type 4.3 had low densities of Inga trees (Table 2), so it is unlikely that nitrogen fixation explained their yield advantage. The soils of these farms also did not have higher nitrogen contents (Table 1) and their mineralisation rates were lower at 12.2 mg N kg −1 soil d −1 compared to 17.7 in CAF type 3.3 (Büchi et al., in prep.). The observed superior yield of CAF type 4.3 could not be reproduced by the model without reparameterisation. However, it must be noted that there were only 8 farms with this CAF type who reported high yields, and they also reported high variance in N application rates from over 1000 kg N ha −1 y −1 to almost zero. As stated in Haggar et al. (2021), there was a small but significant group of farms that report high yields but use alternative low N input management and which require further study.

Analysis of ecosystem services
Except for CAF type 4.3, discussed in the preceding paragraph, the observed and simulated coffee yields were higher in Costa Rica than in Guatemala, which matches the about twice as high average fertilisation rates in Costa Rica (Tables 2 and 3). The simulations suggest that the higher fertilisation in Costa Rica further stimulated carbon sequestration, nitrogen leaching and emission (Fig. 4). A positive effect of fertilisation on carbon sequestration was also shown by Körschens (2021), in their review of long-term experiments worldwide and in several empirical studies (Virto et al. 2012;Triberti et al. 2016). Other differences between the countries, such as production of more fruit but less timber in Costa Rica, were likely the result of different choices of shade tree species and planting density rather than differential impacts of fertilisation on tree species (Tables 2 and 3).
The model points to inevitable trade-offs between productivity of coffee, productivity of trees and environmental ecosystem services, similar to those reported by Meylan et al. (2017) for coffee agroforestry systems in Costa Rica. Changing the level of fertilisation or shading can benefit some ecosystem services but always to the detriment of others. This was apparent from the farm simulations shown in Figs. 3 and 4, but was highlighted most clearly in the sensitivity analysis presented in Fig. 5. The benefits from fertilisation for yield and carbon sequestration must be balanced against the costs associated with increased nitrogen leaching and emission. Likewise the benefits for coffee yield of removing shading (consistent with literature summarised by Meylan et al. (2017) but not confirmed by Soto-Pinto et al. (2000) for an experiment in southern Mexico) must be balanced against the associated losses in tree production (fruit, timber) and carbon sequestration as well as the reduced protection against erosion. Analysis of the actual yield from the CAF systems here modelled concluded that farms with moderate shade levels were the most productive, and there was no yield advantage to further reduction in shade, but yields were suppressed by higher shade levels over 60% cover (Haggar et al. 2021). In the simulations, the effect of shade-reduction on coffee yield varied between farms and was on average much smaller in Costa Rica than in Guatemala. De Leijster et al. (2021) summarised the literature on this and reported similarly inconsistent relationships between coffee yields and shade cover because of differences in local conditions and management. Vaast et al. (2008) concluded that moderate shading may not hamper coffee production in hot areas with suboptimal growing conditions.
Halving shading significantly reduced carbon stocks in the simulations (Fig. 5), and this is consistent with the finding of Goodall et al. (2015) that ongoing reductions in shade tree density on farms in Nicaragua are leading to loss of soil carbon stocks. Meylan et al. (2013) and Jha et al. (2014), in their reviews of impacts of shade in coffee agroforestry, also concluded that shading increases above-and belowground organic carbon sequestration. In some contrast to this, only a minor impact of shading on carbon sequestration was reported for the Turrialba experiment by Sauvadet et al. (2019). Protection against erosion by high canopy cover, as simulated here, has repeatedly been observed in coffee agroforestry systems (Meylan et al. 2017;De Leijster et al. 2021 and references therein). Cannavo et al. (2011) found that runoff in unshaded coffee in Costa Rica was over 50% higher than runoff in coffee shaded by Inga densiflora. Our sensitivity analysis found a similar effect where halving shading led to 10-15% increase in carbon lost in runoff (Fig. 5).
The focus of CAF2021 is on biogeochemistry, and the effects of fertilisation on processes in soils, coffee plants and trees are simulated in considerable detail. However, the effects of shading on the coffee agroforestry system are manifold and complex, and are not simulated fully by CAF2021. The model does account for shade trees lowering light intensity and temperature below them (impacting coffee plant physiology) as well as competing with the coffee plants for light, water, and nitrogen. But the model does not represent other organisms and thus does not simulate how shade supports biodiversity (e.g. Goodall et al. 2015) and constrains pests and diseases (e.g. Jha et al. 2014). Decision support for such non-biogeochemical ecosystem services would therefore require a different model.

Conclusions
Returning to the three goals of this paper, we conclude: • Variation in fertilisation and shading drives differences in productivity between farms and between CAF types. • The modelled ecosystem services indicate tradeoffs between ecosystem services that imply no single CAF type can maximize all services. The trade-offs were not only between provisioning and regulatory services, but also between provisioning services from coffee vs. those from fruit and timber trees. • CAF2021 can be used as a tool for studying biogeochemistry-related ecosystem services in complex coffee agroforestry systems, but the availability of multivariate calibration data from long-term field experiments remains critical.