Predicting Phenological Anomaly: a Case Study of Yucca in the Southwestern United States

Yucca in the American desert Southwest typically owers in early spring, but a well-documented anomalous bloom event occurred during an unusually cold and wet late fall and early winter 2018–2019. We used citizen science photographs as a means to generate owering presence and absence data. We t phenoclimatic models to determine which climate variables are explanatory for normal owering, and then we tested if the same conditions that drive normal blooming also drove the anomalous blooming event. Flowering for Y. brevifolia and Y. schidigera is driven by complex, nonlinear interactions between daylength, temperature, and precipitation. To our surprise, early-season owering odds are highest in colder and drier conditions, especially for Joshua trees, but increase with precipitation late-season. However, the models used to t normal blooming overpredicted the number of anomalous blooms compared to what was actually observed. Thus, predicting anomalous owering events remains a challenge for quantitative phenological models. Because our model overpredicted the number of anomalous blooms, there are likely other factors, such as biotic interactions or other seasonal factors, which may be especially important in controlling what is presumed to be rare, out-of-season owering in desert-adapted Yucca.


Introduction
In the winter of 2018-2019, park rangers and visitors to Joshua Tree National Park were surprised to see Joshua tree (Yucca brevifolia Engelm.) and Mojave yucca (Yucca schidigera Roezl ex Ortgies) in bloom. These two species normally bloom in spring from March to May 1,2 , and this anomalous bloom was unexpected and novel -so much so that this blooming event made local and national news 3 . While these reports anecdotally pointed to an unusually wet and cold fall as the reason for these Yucca being "fooled" into blooming, little is known about which environmental cues, and how cues may interact, to drive onset of blooming in Yucca or other arid-adapted plants.
Plant phenology in desert environments has been less comprehensively studied compared with temperate forests or even grasslands and prairies 4 . As might be expected, plants in arid regions have been shown to be particularly reliant on winter precipitation for plant growth during spring [5][6][7][8] . More formal phenological modeling of desert-adapted woody perennial species is surprisingly limited. Nevertheless, previous work has pointed to the potential for complex context-dependence between photoperiod, temperature, and precipitation. For example, 9 hypothesized that the onset of owering in saguaro cacti (Carnegiea gigantea (Engelm.) Britton & Rose) may be triggered by rainfall followed by favorable temperature conditions. Speci cally within Yucca, previous work by multiple authors has focused on local or, at best, regional examination of desert Yucca phenology, but without a strong statistical framework for examining drivers of owering onset 6,10−14 . Yucca elata (Engelm.) and Y. baccata (Torr.) have been shown to have increased reproductive output with increased winter and spring precipitation, with the slowest growth in early to mid-summer during what is often the driest season 10,11 . However, these studies did not focus on how environmental cues determine bloom timing. Ackerman (1980) tracked phenological timing of buds, open owers and fruits in populations of Y. schidigera and Y. brevifolia, along with many other species, in southern Nevada over two years in relation to climate but did not develop formal models. He suggested that owering in Y. schidigera may be driven by photoperiod. In Y. brevifolia, it has been shown that increasing temperatures may increase owering density and fruit mass, but may also decrease the possibility of successful establishment of new plants 13 .
Formal modeling approaches that can test the relative importance of temperature, precipitation, and photoperiod as drivers of timing of Yucca phenology are critically missing. Such models are particularly important, because species of Yucca have highly speci c, co-evolved obligate pollinators and herbivores-the yucca moths in the family Prodoxidae-and thus, their phenologies must synchronize in order to set fruit 15 . Given the constraints of arid environments and on Yucca in particular with obligate moth pollinators, our hypothesis was that there are tight environmental controls on blooming. However, previous evidence for synchrony between moth pollinators and Yucca owering has been mixed, with strong synchrony demonstrated in some 16 but not all Yucca 17 . Determining environmental factors affecting Yucca blooming is a key step towards understanding the possibility for phenological mismatches with their obligate pollinators. Models are also essential for assessing whether anomalous climate events are responsible for atypical timing of Yucca owering, such as that which occurred in 2018. If anomalous blooms become more common due to changing environmental conditions or more extreme climate variation, this could have negative consequences for Yucca, given its specialized pollination system.
The biggest bottleneck for creating predictive models of phenology is availability of data across the range of a species and environmental gradients, but new data resources hold great promise for overcoming the data gaps. Most knowledge of desert plant phenology is derived from population studies at a limited number of sites 6,9,14,18−21 . While these studies are of high value and often provide detailed knowledge, they do not often provide enough variation to capture the full range of conditions needed to test environmental drivers. Fortunately, availability of citizen science photographs of plants are growing exponentially and provide key phenological trait information at broad scales and with increasing data density. For example, as of December 2020, the citizen science platform iNaturalist has provisioned over 35,000 research-grade records of Yucca species. Yucca are also often photographed with a whole plant in view, providing crucially needed information about ower absence.
Here we scored the presence and absence of closed and open owers of Y. schidigera and Y. brevifolia -the two species with documented anomalous blooms in fall-winter 2018 -using best practice approaches for community science photographs described in 1. We linked these georeferenced and time-stamped records with key climate covariates including growing degree days (GDDs), precipitation and photoperiod. We considered two different GDD and precipitation accumulation periods, 30 and 120 previous days (before the observation date of an open ower), to determine if there are lags in response given the presumed importance of fall and winter rains on spring phenologies.
We utilized a general linear modeling framework that included careful attention to model calibration and validation to address the following questions: 1. What are the key climatic drivers for the owering phenology of Y. brevifolia and Y. schidigera, and how important are interactions between drivers and non-linear responses?
2. Can we use phenology models that are known to be performant in predicting normal owering period to accurately predict the observed anomalous blooms in fall and winter of 2018-2019?
3. Are models that take into account longer periods of heat or precipitation more accurate than those that use shorter-term accumulations?

Results
Summary of spatial distribution of records and shape of seasonal phenology curves We scored a total of 1,375 presence/absence observation records for Y. brevifolia and 1,637 for Y. schidigera ranging from 2009-2020. Figure 1 shows the spatial distribution of normal and anomalous owering for both species. The anomalous owering was mostly spatially localized to the Joshua Tree National Park area. Figure 2  0.89528, AUC test : 0.87614 for 120-day). The most common best model for Yucca schidigera, especially for 120-day climate accumulations, also included a three-way interaction among climate variables and a second-order polynomial term for daylength. Based on these results, we opted to t nal models using the full dataset for the 120day climate accumulations, given slightly superior performance based on AUC statistics. After another model selection step during model tting, the nal model for both species also included a three-way interaction and a nonlinear response to daylength, with results summarized in Table 1. Table 1 Final best-t models for each species. Occurrence records with incomplete climate data were ltered from the datasets used to t these models. The pluses and minuses indicate the general direction of the coe cient for each variable. The best model includes a three-way interaction between growing degree days, precipitation and photoperiod. A star is used to denote p < 0.05 and a double star indicates p < 0.01. The effects plots shown in Fig. 3 show the interrelationships found in our nal 120-day climate accumulation models. We found that these two Yucca species bloom when it is cooler, even if it is dry. Later in the season, high temperatures tend to decrease the probability of owering unless conditions are wet. For Y. brevifolia, medium daylength, low temperatures, and high precipitation increase the probability of blooming. For Y. schidigera, high precipitation and higher temperatures increase the probability of blooming if daylengths are short. Y. schidigera, which blooms later (Fig. 2) is more likely to bloom in longer daylengths, unlike Y. brevifolia, which only blooms lateseason if cold and wet.

Anomalous bloom prediction
Using the nal 120-day climate accumulation models for both species, we generated prediction probabilities for observed presences and absences from the 2018-2019 winter season anomalous bloom. These probabilities were converted into predicted presences (1) and absences (0) using MaxKappa, LPT, and LPT 5% values as thresholds, and then the accuracy of these predictions were assessed ( Table 2). For both species, we found higher commission (false positive) error rates than omission (false negative) error rates. The LPT threshold generated the highest commission error rates and the MaxKappa and LPT 5% thresholds were roughly equivalent in their commission error rates for the anomalous bloom season. We found generally low omission rates across all thresholds.

Discussion
We assembled a unique dataset of range-wide observations of Yucca available from iNaturalist. These provided the bases for determining presence and absence of owers for our two focal species, and while owering Yucca are likely to be more photographed, we still had su cient absences for these common and iconic species to use in downstream models. These models were effective at predicting phenology of the two focal species with generally high accuracy during the normal owering season. Model selection strongly favored the inclusion of all three predictors, and particularly the interactions among them, suggesting the critical importance of climate context for owering. This importance for arid-adapted Yucca is in line with other studies of phenology of desert plants, from Saguaro in the desert Southwest of North America 9 to lilies in arid environments in Africa 28 . While we expected strong interactions, it is more surprising that owering of Yucca does not necessarily always rely on increased precipitation. We expected precipitation would be a critical limiting factor in desert environments, and for Mojave yucca in particular, precipitation is a strong driver positively in uencing the odds of owering during spring and summer. However, for both species, owering odds are also high during cold and dry conditions earlier in the season (Fig. 3). These results may relate to precipitation falling as snow rather than rain during late winter.
Joshua trees and Mojave yucca have different growth forms and are of vastly different sizes at maturity, and therefore, may be expected to react differently to climatic drivers. However, our ndings indicate that the same interacting climate variables drive owering phenology for both species, and the overall shape of their seasonal phenology curves are similar (Fig. 2). The main differences in our models are likely attributed to adapted differences in overall bloom timing; in particular, Y. schidigera generally blooms later in the year than Y. brevifolia (Fig. 2). For example, Y. brevifolia only blooms under highest daylength conditions if it is unusually cold and wet, while Y. schidigera owering odds can often exceed 50% under warm and wet conditions when daylength is long (Fig. 3). Conversely, while Y. brevifolia has better odds of blooming in cold, dry conditions early in the year, even under wet conditions it can bloom with odds above 25%. In contrast, Y. schidigera is rarely in bloom in colder, wetter, early season conditions.
We also note the importance of including a polynomial term for daylength (Table 1), which always dramatically improved models ( Table 1). The outcome, clearly visible in Fig. 3, is that responses of phenology are strongly nonlinear across gradients. In sum, our work corroborates the importance of context dependence, nding that daylength, temperature, and precipitation interact in complex, nonlinear ways to in uence owering times.
A key question we sought to answer in this work is whether we predict anomalous owering events. Accelerating climate change means that species will experience conditions outside the range experienced for centuries. How species respond phenologically to these novel conditions is an area of active research 29 , but the focus has predominantly been on using yearly anomaly data, e.g. warmest years on record or via warming experiments 30 . Our efforts here are trying to predict a seasonal anomaly, where plants seasonally owered far outside of their presumed normal periods (e.g. in fall rather than spring). A key question of interest was whether the fall-winter bloom in 2018-2019 was itself triggered by anomalous climate conditions mirroring those of the usual bloom period.
We examined this question by testing whether models, which were t using data from years with a known normal blooming period, were able to predict presences and absences during the 2018-2019 fall-winter season. Our results show that we can predict absences with low error rates (4.7-6.7%, Table 2). However, these models had much higher rates for false positives (32.1-50.2%, Table 2). Our model predicts more anomalous blooming than actually observed. This suggests that, while Yucca might have been triggered to bloom by atypical cooler and wetter conditions, there are still factors not included in our models that limited the extent of anomalous blooming.
It remains possible that co-evolution between Yucca and their obligate pollinator or orivore community 31 may extend to how phenology is cued. It may also be that Yucca are responding not only to instantaneous climate conditions, such as mean photoperiod, but whether days are shortening or lengthening, and if so, it may be that the modeling approach used here is not sensitive enough to capture these types of more dynamic seasonal cues. Our work may also point to out-of-normal season blooming simply being more common and widespread than previously suspected, given broadly suitable climate conditions. A next step is to use growing citizen-science reporting of Yucca plants in ower to determine the rate of seasonally anomalous owering from dense, range-wide citizen science observations enabled via resources such as iNaturalist.
Finally, we note the value of examining accuracy of models using climate measurements over shorter and longer temporal windows. While these different climate accumulation windows are by nature highly autocorrelated, we found that data from the longer temporal window generally led to better models based on AUC statistics. It is likely that the longer temporal window captures more information about GDD and overall water input in the environment. For example, a classic paper by Beatley (1974) that focused on shrub phenology in the Mojave showed fall and winter rains were precursor triggers of phenological events in spring. We also note congruence with the ndings of Clair and Hoines (2018), who showed strong positive correlations with the 30-year averages of temperature and precipitation and fruit and seed mass in Joshua trees. This nding should not necessarily be extrapolated more broadly for arid-adapted plants, and traits such as perenniality or woody versus herbaceous habit with associated differences in costs for growth and reproduction, may condition thresholds for needed accumulation of heat or water.
We close by noting that phenology modeling is often treated as a one-off exercise where models are built, and results shared. We argue that the accelerating growth of new data resources and exible modeling frameworks provide a means for models to iteratively improve. One key step towards this goal is faster annotation of phenology state. Here we hand-coded two key states in Yucca photographs 1 . These carefully vetted classi cations can now provide the basis for more automated approaches for annotating photographs, e.g. via machine learning 32 . These new results can be fed into current models to test and improve model performance.
Expanding data resources for modeling ower presence is one key step, but the development of phenology models that include more tness-relevant responses is also important, such as number of owers or fruits, potentially in relation to vegetative biomass. Individual yucca plants, for example, do not bloom annually even in favorable conditions, because vegetative growth must precede production of a heavy, high-cost in orescence 10 . More sophisticated species-level models that link the full range of environmental conditions populations experience across their range with seasonal vegeative and reproductive biomass proxies are uncommon, mostly due to data limitations. Rather, studies typically focus on single, local areas or transect approaches across broad-scales 33 , with associated limitations for further prediction or forecasting.
We argue that the ability to develop range-wide models are in reach, using the same citizen science photographs that so far have only been used to generate simple states such as open ower presence. Such models hold promise in helping to provide a basis for improved detection of anomalous blooming events and their consequences. For example, we don't yet know if anomalous blooms produce fewer or greater ower numbers, as compared to normal periods. Do these blooms ultimately lead to the production of fruit and, if so, how much? Such next-step approaches are particularly critical and necessary, because as we experience more unusual weather phenomena and novel conditions, phenology prediction and understanding the consequences of phenological changes becomes even more challenging. As weather forecasting was improved by assimilating more data and building better process parameters, our hope is that similar methods with richer data types can improve the most di cult phenology prediction challenges.

Observation record accumulation
We chose to focus on Y. brevifolia and Y. schidigera, because original exploratory analyses of the Yucca dataset showed that some populations of these species were in bloom at abnormal times in the winter of 2018-2019. We used iNaturalist records for these two species, following the scoring methodology developed by Barve and colleagues 1 . We downloaded photographs from research-grade observations on iNaturalist for both species and used the software package ImageAnt (https://gitlab.com/stuckyb/imageant) to annotate if owers (open or closed) and/or open owers were present in the images. We note that there is debate regarding Joshua tree taxonomic delimitation with some considering eastern and western forms to be different species (Y. brevifolia and Y. jageriana) or subspecies. For the purposes of this work, we have kept these lumped. Future effort may be warranted to examine any subspecies or species-speci c phenology differences.
A key annotation component was whether the whole plant or only part of a plant was visible in the image, which is essential for scores of " owers absent". If an image covers part of a plant and has no ower, it is possible that the plant either actually does not have owers or may have owers on its other parts not captured in the image. In cases where a scorer could not determine whole plant status from a photograph, we utilized an "uncertain" category. Full details of scoring best practices, which were followed here, is available in 1. These best practices included having at least two independent volunteers annotate each image. In cases of con ict, annotations were checked by L.B., V.B., and R.P.G. and a nal decision was made. Initial work from Barve

Climate data collection
We used PRISM daily data, including both stabilized "recent years" layers, as well as the provisional "previous 6 month" data. For each Yucca record, we used the date and location to extract accumulated measures of growing degree days (GDD) and overall precipitation. Growing degree days and average precipitation were assembled for two different time spans: 30 days and 120 days before the observation. For GDD, we used a base temperature of 5 ℃ and no upper limit. In any cases where GDD values were less than zero, we assigned a value of zero. Finally, we assembled daylength (or photoperiod) for each record based on location and date of collection using the "daylength" function in the R package geosphere 22 .

Modeling framework
To test if owering phenology models of Y. schidigera and Y. brevifolia performed better when using a shorter or longer temporal window from the date of observation, we assembled a total of four datasets -two per species with climate data from 30 days prior to the date of observation and 120 days prior to observation. Before model tting, initial phenology annotations were processed to determine a nal scoring of con rmed absences and presences of owers based on the whole plant annotation of an image. Con rmed absences required that a whole plant was visible in the image and lacked owers. Cases where only a part of a plant was visible and owers were absent were not used further in analyses. We coded both ower presence and open-ower presence, which provide a means to infer unopened owers (cases where a ower is present but open owers are not). However, cases of con rmed unopened owers were uncommon, and we therefore focus on the broad category of ower presence/absence rather than sorting into narrower phenophases. We also eliminated records that did not have values for all climate variables used for modeling. This led to a nal set of 1,336 records available for model tting for Y. brevifolia and 1,589 cases for Y. schidigera. This total included both normal and anomalous owering scoring, which was further subset below and used as the response variable in generalized additive models (GAMs) and generalized linear models (GLMs).
To visualize the general shape of seasonal owering phenology curves, we t a GAM for both species separately where the response variable was ower presence or absence and the predictor was julian day of the year using the mgcv and mgcViz R packages 23,24 . We used a logistic function (family = binomial()) and a cubic cyclic regression spline smoother. This smoother constrained the start and end of each phenology curve to match in value and rst derivative, which is appropriate for cyclic biological phenomena 25 . GAM tting showed expected owering during late winter-late spring during cooler and wetter periods. We tested if indeed higher precipitation, cooler temperatures and shorter daylength interact to predict normal owering, and if these predictors also successfully predict anomalous blooming. Our approach is as follows. First, we separated the anomalous bloom period of fall 2018 to winter 2019 from the complete datasets. Anomaly data was not used to train or validate the models. The remaining data were randomly split with 75% of records utilized for model training and 25% for model testing/validation. We then used a repeated cross-validation approach, randomly splitting our data into testing and training datasets ten times. This process was repeated for all four datasets (two species x two climate accumulation periods) independently, and these training datasets were used to calibrate models, discussed below, for each species with the 30-day and 120-day climate data. We used the glm() function with a binomial logit link function to test the ability of GDD, average_precip (average precipitation), and daylength to predict the probability of owers for Y. schidigera and Y. brevifolia. All predictor variables were scaled to have mean zero and standard deviation of one before model tting.
There is no reason to expect purely linear responses of owering to climate drivers, and one predictor may interact with another. Therefore, we explored a suite of models that included second order polynomials for each predictor, as well as interactions. These more complex models often lead to high variance in ation, so we checked for such issues using the vif() function in R and did not consider models with predictors where VIFs > 5. We thus examined multiple models, ranging from purely linear effects with no interactions to more complicated ones with more interactions and polynomial terms. We used model selection approaches to determine the best models for those that passed screening via the Akaike's information criteria (AIC) and report the ΔAIC and Akaike weights for that model (see below).

Model validation and ability of models to predict anomalous blooming
After calibrating models, we tested their performance on the withheld test data. To evaluate model performance for both training and test data, we used the R package ROCR 26 to calculate area under the curve (AUC) scores. We considered models to be performant in cases where AUC scores were high (> 0.8), and where calibrated models show ability to generalize. We evaluated ability to generalize by determining if test AUCs were much lower than model calibration AUCs. Cases where training and testing AUC were both high were considered to be best-case scenarios. Metrics were calculated for all repeated cross-validations.
Finally, we examined model validation statistics for 30 and 120-day climate accumulation models across all repeated cross validations for each species, with the goal of determining if one or the other climate accumulation dataset was more predictive. While both climate accumulation models were performant, the 120-day models consistently outperformed the 30-day ones for both species (see Results below). We thus used all the data for both species to t nal models for 120-day climate accumulations and selected the best models based on AIC.
Using the predict() function in R, we used these nal models to calculate the probability of owering for records collected during the anomalous bloom period. We used thresholds based on model calibrations, including the maximum kappa (MaxKappa), least presence threshold (LPT), and 5% least presence threshold (5% LPT), to convert those probabilities into predicted presence or absence. These are commonly applied thresholds in the literature 27 , with MaxKappa representing the maximum of observed versus expected accuracy, and LPT thresholds representing minimum values where all or 95% of the plants in ower are correctly classi ed as such. Then, we determined the true and false positive and negative rates for the model predictions compared to the actual known owering state of the records collected during the anomaly. If unusual fall climate conditions mimicked spring conditions known to drive owering, we expected calibrated models to be useful for predicting anomalous owering. The spatial distribution of occurrence records used in the analyses. Colored dots indicate owering presences, and gray-scale dots indicate owering absences.