Epidemiological Dynamics of SARS-CoV-2 in White-tailed Deer

Abstract


32
Starting in 2020, SARS-CoV-2 was found in white-tailed deer [WTD; 1, 2]. By 2021, there was evidence 33 of regional transmission in WTD through a combination of ongoing deer-to-deer and human-to-deer trans-34 mission [2][3][4][5]. Endemic transmission of SARS-CoV-2 in WTD could position these populations as reservoir 35 hosts, posing risk for variant persistence [4, 6], evolution of new variants [7,8], and spillback into human 36 populations [7-9]. Cross-species transmission can have global health impacts during a pandemic. For exam- 37 ple, pandemic H1N1 Influenza virus (H1N1pdm09) was introduced into humans from animals [10], resulting 38 in ongoing animal-human transmission and viral evolution globally [11][12][13]. These patterns highlight the 39 need to better understand drivers of zoonotic pathogens establishing and persisting in new species to inform 40 science-based One Health decisions, improve risk assessment, and plan effective surveillance, early response, 41 and mitigation strategies. 42 Chandler et al. [1] initially documented the presence of virus neutralizing antibodies to SARS-CoV-2 in 43 free-ranging WTD. Experimental infection studies demonstrated the susceptibility of WTD to SARS-CoV-2 44 infection, shedding of live virus primarily through oral/nasal secretions, deer-to deer transmission, and the 45 persistence of neutralizing antibodies for up to several weeks following infection [14,15]. A later study 46 revealed that WTD can maintain antibodies for at least 13 months [16]. Deer-to-deer transmission may 47 occur within wild populations beyond the initial introduction period. Surveillance studies in WTD have 48 shown persistence of SARS-CoV-2 alpha and gamma variants in populations 4-6 months after their peak 49 circulation and extinction in the human population [4,5,17,18]. These data describing susceptibility and 50 transmission capabilities of SARS-CoV-2 in WTD and the fact that WTD are abundant and share a variety 51 of habitats with humans across the United States [19], highlight the need to understand potential infection 52 dynamics of SARS-CoV-2 in WTD across the range of the species. 53 Until recently, SARS-CoV-2 surveillance in WTD has only been conducted and described at the scale of 54 a single state, province, or region [1,2,4,7], with sampling occurring during a 3 to 4-month window. This 55 made it challenging to understand how widespread SARS-CoV-2 is in WTD, its epidemiological dynamics, 56 and associated ecological drivers. Recently, the United States Department of Agriculture (USDA) has been 57 working with state wildlife agencies in conducting widespread surveillance of SARS-CoV-2 in WTD to address 58 these gaps [5,17]. Here, we use data from a surveillance program of SARS-CoV-2 in WTD [17] to estimate 59 the spatial distribution of true prevalence, infection rates and their relationship to specific risk factors, and 60 the temporal dynamics of initial invasion into WTD across their range over 13 months.

61
The national-scale surveillance data are collected by opportunistically sampling hunter-harvested deer 62 and through targeted agency management. Different sub-populations are sampled unequally in space, time, 63 and overall numbers (e.g., female vs. male WTD, or WTD in suburban/urban or rural human populations).

64
Sample composition tends to be more challenging to account for using ( the opportunistic sample collection. There were similar numbers of samples collected from both sexes (males 82 = 5,076, females = 5,141), but SARS-CoV-2 viral RNA was detected more often in males (15%) relative 83 to females (11%). Adults (8,000 samples) were more heavily sampled than juveniles (2,217 samples), but 84 detection rates were similar in both groups (13% vs. 12%). Nasal swabs (9,343 samples) were collected more 85 often than oral swabs (364 samples), and 510 samples had missing data describing swab type. Infection rates 86 (i.e., proportion positive) appeared higher in oral and unknown swabs (16% and 17%, respectively) relative 87 to nasal swabs (12%). For management method, hunter-harvest samples were most common (4,577 samples 88 with 17% positive), followed by samples collected from USDA removal and management purposes (agency 89 management; 3,866 samples with 11% positive) or other mortalities (e.g., roadkill; 1,774 samples with 6%   A calibration curve showed that p k predicted positive and negative test outcomes well (Figure 1), and that 99 estimates of p k are close to apparent prevalence (observed data) with underprediction in regions with high 100 predicted prevalence. The model fit builds confidence that the method can estimate prevalence in counties 101 and at times when direct surveillance through sampling was not possible due to limited resources. The model 102 fit also builds confidence that the method can also estimate epidemiological characteristics of SARS-CoV-2 103 in WTD, such as peak prevalence and outbreak timing across counties. For sample-level factors, we found a significant interaction of sex and management method, but no significant 106 effect between oral and nasal swab types or age class (Table 1). Male deer samples collected by agency 107 management tended to have higher infection than female deer (Row a 2 in Table 1  For county-level effects, there is some evidence for weak relationships between local epidemiological reproduc-113 tion rate R ℓ and covariates. The effects of deer habitat (a proxy for deer abundance) and human population 114 density are positive but marginally not significant (Rows b 2 and b 3 in Table 1, Figure 3). Predicted prevalence 115 in WTD increased from 10% when human population density was 10 people per sq. km. to 15% when human  The model estimates that SARS-CoV-2 prevalence in WTD tends to increase with SARS-CoV-2 infection 125 in humans. The model uses the human SARS-CoV-2 death rate as a lagged, proxy-indicator for infection.

126
The model estimates the odds of WTD prevalence increases by 13% for every additional 11 human deaths 127 per 100,000 county residents (logistic regression parameter interpretation for row a 8 in Table 1; 95% highest 128 posterior density interval (HPDI) spans from 1% decrease to 31% increase). The model also estimates that,  Estimates of the local epidemiological reproduction number R ℓ were greater than 1 in nearly all counties in 133 states where samples were collected and ranged up to 2.5 in some counties ( Figure 4A). However, there is 134 also large uncertainty in R ℓ estimates in states where few samples were collected such that R ℓ could have 135 been less than 1 for many Mid-and South-western counties ( Figure 4). 5B; the proportion of positive test results per county) was more extreme (i.e., higher or lower) than time-141 averaged estimates in counties with low sample sizes ( Figure 5D). Importantly, uncertainty in time-averaged 142 prevalence estimates ( Figure 5C) was also higher in counties with low sample sizes. Predicted peak prevalence 143 varied spatially across the range of WTD studied. Peak prevalence occurred later in counties in the Midwest and Southeast ( Figure 6A). However, there was 146 local variation across counties within a state. In New York, peak prevalence is predicted to have occurred 1-3 147 months earlier in the western counties compared to the eastern counties ( Figure 6A). However, uncertainty 148 in predicted timing is higher in the eastern counties of New York compared to the western counties ( Figure   149 6B). Examination of SARS-CoV-2 prevalence in WTD over time predicted outbreak start, peak prevalence, 150 and prevalence decline occurred earlier in Onondaga County, New York than in Cuyahoga County, Ohio; 151 the two most intensively sampled counties in our study ( Figure 7). Comparison to human death rate data 152 illustrates how SARS-CoV-2 in humans is not necessarily a primary driver for SARS-CoV-2 prevalence in 153 WTD, but can prolong the duration of an outbreak in WTD. Large-scale surveillance data provided evidence for substantial deer-to-deer transmission in addition to 156 widespread spillover from human-to-deer populations [5]. High estimates for the SARS-CoV-2 local epi-157 demiological reproduction number R ℓ (i.e., up to 2.5) suggested that sustained deer-to-deer transmission 158 was probable following disease introduction to local deer populations (Figure 4). Model estimates suggested 159 SARS-CoV-2 in WTD was present in most counties that can support WTD populations (i.e., WTD habitat) 160 across the Conterminous United States (CONUS; Figure 5). Estimates for the time at which peak infection 161 prevalence occurred are similar for many counties sampled, which provided indirect evidence for widespread 162 transmission via human-to-deer spillover ( Figure 6). By comparison, in the absence of spillover, disease trans-163 mission across space should primarily be driven by diffusion between neighboring counties, which would yield 164 a smoother spatial pattern in disease timing [30]. Data indicates that differences between peak infection in 165 humans and WTD can differ as well (Figure 7). Different timing for disease in humans and WTD might 166 suggest concurrent widespread deer-to-deer transmission in addition to human-to-deer spillover.

167
The number and rate of disease transmission events from humans to deer are difficult to estimate. (1) can be used to estimate feasible ranges for the relative frequency of deer-to-deer vs. human-to-deer 174 transmission events. The modeling framework statistically attributes "excess infection" relative to the SIR 175 model's baseline infection process equation (2) as potential spillover pressure.

176
Posterior summaries for the risk factors identified in Table 1 suggest potential implications for general 177 knowledge of SARS-CoV-2 in WTD and ongoing surveillance programs. The model suggested that male 178 deer were infected at higher rates than female deer. Similar sex-related differences occur for chronic wasting 179 disease (CWD), bovine tuberculosis, and other infectious diseases [31][32][33]. Sex-related differences are often 180 explained by hypotheses that male deer have different social behavior, larger home range sizes, and breeding 181 season movements, leading to wider and more frequent contact with other groups of deer than females [33].

182
Although descriptive summaries of the raw data suggested that prevalence differed for management 183 method (i.e., Hunter vs. Agency) and swab type (i.e., Oral vs. Nasal), the model did not find strong evidence 184 for this pattern once the imbalanced sampling design factors were accounted for together, suggesting that 185 surveillance data collected from these different sources and methods can be analyzed together.  The model further suggested that SARS-CoV-2 local epidemiological reproductive rate increased with 206 the proportion of a county's land that supports WTD populations, albeit weakly. In lieu of using WTD 207 density estimates, we used the proportion of a county's land that WTD can inhabit (i.e., WTD habitat) to 208 approximate where WTD might be more densely populated. We chose this approach because WTD density 209 information is limited to small-scale studies due to the difficulty of collecting this data [34], and methods 210 for state-level abundance estimation vary across states, which introduces additional variation. Increased 211 habitat suitability is tied to increased incidences of CWD in WTD [35], with the supporting hypothesis that 212 suitable habitat supports higher density of WTD. The effect seen here might suggest infection reproduction 213 is facilitated through deer-to-deer contact. However, finer scale WTD density information or habitat data 214 that more closely informs WTD density would provide further insight to this relationship.

215
Quantifying disease dynamics requires intensive data distributed throughout time and space. In this 216 study, we used an opportunistic sampling design, which incurred temporal and spatial data gaps. These data 217 gaps propagated uncertainty in our estimates of SARS-CoV-2 prevalence in WTD ( Figure 5C). Uncertainty 218 in these estimates could be reduced through continued sampling in counties where long-term sampling 219 has already taken place. Furthermore, new sampling in counties that do not currently have data and 220 are distant from well-sampled counties (e.g., represent different values in of covariates such as proportion 221 of land inhabitable to WTD, human density, human case rates, or other potential risk factors that have 222 yet to be explored) would bolster the confidence of these estimates. However, requirements for reducing 223 estimate uncertainty can change over time, and would be best addressed using an adaptive sampling design.

224
Another limitation of this study is the spatial resolution at which disease dynamics could be described (i.e.,      253 We estimated human density for each county using 2020 Census Bureau population data, which includes 254 information about county area [36]. We divided the population of each county by the county area (km2). 255 We calculated the proportion of each county's land that can support WTD populations (i.e., WTD habitat)    The model's response variable Y k encodes the binary rRT-PCR test results for the kth sample such that Y k = 1 for positive results and Y k = 0 for negative results. The model treats Y k as a Bernoulli random variable with probability p k of being positive. We interpret p k as the individual infection risk or prevalence of SARS-CoV-2 for the kth animal's group, time, and location. The model uses the regression function specified via

County-level covariates
to link rRT-PCR test results to county-level SIR curves and sample-level covariates and external conditions 281 (e.g., age, sex, human death rate). The a j and z kj terms specify sample-level coefficients and covariates that 282 adjust the baseline infected compartmet i ℓ k (·) of the SIR curve for county ℓ k at time t k based on group-level 283 characteristics and external conditions for sample k (see Table 1 for detailed covariate listing).

284
The SIR curve we propose models the proportion of susceptible s ℓ (t), infected i ℓ (t), and recovered r ℓ (t) individuals in county ℓ at time t via spatially and temporally correlated systems of differential equations.
The SIR system of differential equations for each county specified via uses a population-level recovery parameter γ and spatially varying deer-to-deer contact rate β ℓ . Each county's 285 SIR curve is modeled with a local outbreak time t 0,ℓ and common initial conditions s ℓ (t 0,ℓ ) = s * 0 , i ℓ (t 0,ℓ ) = i * 0 , 286 and r ℓ (t 0,ℓ ) = r * 0 . Modeling SIR parameters and initial conditions with respect to spatial random effects and 287 covariates accounts for spatial and temporal similarities in SIR curves between counties. 288 We model the county-level contact rate β ℓ relative to the recovery rate γ scaled by a SARS-CoV-2 local epidemiological reproduction number R ℓ for each county, such that β ℓ = γR ℓ . The local epidemiological reproduction number quantifies the number of WTD to which a single infected WTD can be expected to transmit SARS-CoV-2 to naïve contacts. Covariates and spatially correlated random effects influence R ℓ via to link R ℓ to county-level covariates that can influence deer-to-deer contact rates (e.g., habitable area and 289 human population density). The link function g(·) is an exponentially smoothed ramp that is linear for 290 0.1 < R ℓ < 10 and decays to a low of R ℓ = 0 and a high of R ℓ = 15 (additional details in Supplement). The 291 b j and x ℓj terms specify county-level effects and covariates, and η ℓ specifies a spatially correlated random 292 effect for each county (see Table 1 for detailed covariate listing). A conditional autoregressive (CAR) process 293 model uses county adjacency reference information to model spatial connection and correlation for η ℓ [37,294 Chapter 4]. The CAR model requires a spatial precision parameter τ ℓ and a spatial range parameter γ ℓ , 295 both of which are estimated from data. 296 We also use a CAR process to model the local outbreak time t 0,ℓ . Like η ℓ , the CAR model for t 0,ℓ requires 297 a spatial precision parameter τ t0 and spatial range parameter γ t0 . In conjunction with the other SIR curve 298 parameters, the local outbreak time t 0,ℓ influences the time at which peak prevalence occurs.

299
All continuous covariates are scaled and shifted to have mean 0 and unit variance before model fitting.

303
Most of the spatial random effects for each county that support WTD populations across CONUS will not 304 be precisely estimated because data is only collected from 589 of the 2,893 modeled counties. However, the 305 model includes spatial random effects for all counties to inform spatial relationships between counties, and 306 support risk mapping.

316
The prevalence p Gℓt can be aggregated across both time and space, independently or together.

317
The time-averaged prevalence p Gℓ for demographic group and sample type G in county ℓ is the average 318 of the weekly prevalences p Gℓ1 , p Gℓ2 , p Gℓ3 , . . . . Maps of p Gℓ can illustrate where disease tended to be more 319 widespread across the study period. Composition sampling, again, propagates uncertainty and dependence 320 from estimates of parameters to estimates of p Gℓ .

321
The space-averaged prevalence p GAt for demographic group and sample type G in area A summarizes all prevalence estimates p Gℓt for G at time t in area A. The summary p GAt is a flexible weighted average specified via where w Aℓ is the relative weight (or contribution) of county ℓ to area A at time t. The authors declare no competing interests. The complete dataset analysed in this study is not publicly available due to sensitive sample-level collection 361 information, such as detailed sample collection locations and dates, but can potentially be made available 362 from the corresponding author on reasonable request. 363 Figure 1: Calibration curve, comparing in-sample predictions of infection risk p k (i.e., predicted prevalence) to observed outcomes against a 1:1 reference line (dotted line). Apparent prevalence (proportion of positive test results per group) is computed for model-fit diagnostic groups formed by binning predicted prevalence into 10 ranges. Error bars depict standard, frequentist 95% confidence intervals for each apparent prevalence group.   Estimates for local epidemiological reproduction number R ℓ and B) uncertainty (posterior probability that R ℓ < 1). States that did not participate in the study are greyed out. Counties estimated through the GAP WTD species distribution model to not support WTD populations are also greyed out.