Queen surveys
For Survey 1, we utilized previously published data36, which included N = 105 queens (excluding imported queens, which were biased to smaller ovary sizes due to a longer caging duration). N = 93 of these queens had associated viral data. For the ovary mass comparison across surveys, we retained only queens from producers that met the following criteria: the producer collected both failed and healthy queens, queens were delivered to the laboratory in one shipment, and the producer provided n > 2 failed and healthy queens each. These criteria were applied because in the 2019 survey, producers from many different locations shipped queens to the laboratory by different methods and with collection schedules; therefore, queens spent different lengths of time in transit and in cages prior to transit, which we know from previous work impacts ovary mass. Applying these criteria reduced our sample size for the ovary mass comparison but removed these extraneous effects.
For Survey 2, N = 22 queens were collected from colonies in 2018 from three different apiaries in southeastern Pennsylvania. Healthy and failed queens were age- and apiary-matched in order to eliminate sampling bias. The queens were frozen on dry ice immediately in the field and shipped frozen to the University of British Columbia, at which time the queens were thawed on ice and dissected.
For Survey 3, queens were collected from local collaborating beekeepers as they became available during the summer of 2020. Apiary location, colony symptoms (if failed), genetic background (i.e., local or imported stock), and age were recorded. The beekeepers considered the queens to have ‘failed’ if they either displayed spotty brood patterns, were drone layers, or if the adult population rapidly dwindled, including the presence of chalkbrood infections that the colony could not resolve on their own. Beekeepers considered the queens to be ‘healthy’ if they produced robust brood patterns, the colony did not have signs of appreciable brood infections, and the adult population was strong. Failed and healthy queens were caged with candy and attendants then kept in a queen bank with plentiful nurse bees if they could not be immediately analyzed in the laboratory. The queens were transported to the University of British Columbia, where they were anesthetized with carbon dioxide, decapitated, and their spermathecae were removed for sperm viability analysis. The remaining queen bodies were stored at -70°C until further dissection to determine ovary mass.
Sperm viability, sperm counts, and ovary mass
Sperm viability measurements for Survey 3 queens were obtained essentially as previously described36, following the methods of Collins and Donaghue72. Briefly, we burst the spermatheca with tweezers in 100 µl of Buffer D (17 mM D-glucose, 54 mM KCl, 25 mM NaHCO3, 83 mM Na3C6H5O7) and gently agitated the solution until sperm cells were homogeneously dispersed. We then stained the sperm using SYBR-14/propidium iodide dual fluorescent staining following the manufacturer’s protocol (Live/Dead sperm viability kit, Thermo). We dispensed 5 µl of the stained sperm into a well of a 24-well plate, over which we placed a round glass cover slip. We imaged the sperm (three fields of view per queen, average of ~100 sperm cells per image) using a Cellomics ArrayScan XTI (Thermo) and automatically counted live (green) and dead (red) channels using ImageJ v1.52a. Relative sperm abundances were obtained by counting the cumulative number of sperm across three fields of view. Every sample was analyzed in the same style of plate and with the same sized cover slip, so measuring the total sperm in a given area is proportional to total sperm in the sample. We later dissected the ovaries from the queen remains and weighed them using an analytical balance. Data for Survey 1 is already published and we direct the reader to McAfee et al.36 for the methods regarding the collection and processing of those queens.
Viral quantification for queen surveys
Viral data (SBV, DWV, and BQCV ) for Survey 1 were obtained by RT-qPCR exactly as previously described and are previously published data36. Briefly, samples were submitted to the National Bee Diagnostics Center, where total RNA was extracted from the head using a guanidinium isothiocyanate extraction buffer, cDNA was synthesized using the iScript cDNA synthesis kit (Bio-Rad Laboratories, Hercules, USA), and virus RNA was quantified by qPCR with previously published primers36. Standard curves for each virus amplicon were generated using serially diluted plasmids containing the target sequence. RP49 was used as the honey bee reference gene, enabling the viral RNA copy number per queen head to be calculated. Copy numbers per head of all three viruses were added together to generate the ‘Total counts’ variable. Queens from Survey 2 were not analyzed for viral infections because the sample size (22 queens) is too small to yield sufficient statistical power linking viral abundance to ovary size.
Queens from Survey 3 were processed by the Honey Bee Queen and Disease Clinic at North Carolina State University using methods as previously described73 with minor adjustments. For this sample set, we analyzed thoraxes because we have observed that other components of the head can sometimes interfere with PCR. We also tested for a wider range of viruses, including DWV-A, DWV-B, ABPV, SBV, IAPV, LSV, BQCV, and CBPV. Thoraxes were pulverized by bead beating and total RNA was extracted via the Zymo Research DirectZol RNA miniprep kit and Trizol reagent. RNA was quantified and verified for quality by NanoDrop Spectrophotometry and diluted to 200 ng/µL in RNAse/DNAse free water. cDNA was synthesized using the BioBasic All-In-One RT MasterMix. qPCR was performed on a QuantStudio 5 thermocycler using PowerUp Sybr Green chemistry. Standard curves were generated using serial dilutions of known quantities of custom plasmids containing the target sequences. Actin, 28s ribosomal subunit, and GapDH were used as reference genes. Final copy numbers were normalized to reference gene values using GeNorm software74.
Queen stress experiments
Italian queens were imported from California for the pesticide stress experiment, whereas the queens for temperature stress were grafted from existing colonies, reared using standard techniques75 at the University of British Columbia and mated locally. The temperature stress experiments were conducted in June 2019 and the pesticide stress experiments were conducted in August 2019. Queens were introduced to established 3-frame nucleus colonies and allowed to freely lay eggs for 2 weeks prior to exposing them to stressors. Colonies were fed protein patties (Global) and sugar syrup for the duration of the experiment. Prior to stressing, queens were caged with six attendant workers and candy, transported to the laboratory, anesthetized with carbon dioxide (5 min), and weighed on an analytical balance.
Once revived, queens were temperature stressed by placing them and their attendants in a 42°C incubator, humidified with a water pan, for 2 h. Control queens were placed in a 33°C incubator for 2 h. For the pesticide stress experiment, queens were administered 2 µl of a pesticide cocktail, diluted in acetone to a hazard quotient of 3,500, directly on the thorax while still comatose. We have previously used this pesticide cocktail at a dose of 2 µl diluted to a hazard quotient of 511 in order to investigate queen detoxification response38, but here we used a hazard quotient of 3,500 because this is the approximate contamination level in wax that has been previously associated with ‘queen events’42. The cocktail is composed of coumaphos, tau-fluvalinate, 2,4-dimethylphenyl formamide (an Amitraz degradate), chlorothalonil, chloropyrifos, fenpropathrin, pendimethlalin, atrazine, and azoxystrobin mixed in the same relative proportions as McAfee et al38. Solvent control queens received acetone alone, and handling controls received no treatment.
The stressed queens were then transported back to the apiary and reintroduced to their respective hives (2 d re-acclimation, then release). They were allowed to freely lay until two weeks after the stress event, then they were again caged and transported to the laboratory for euthanasia and dissection. Fertility metrics were measured as described above.
Statistical analyses of phenotypic data
All statistical analyses were performed in R v3.5.1. For Survey 3 data, we first checked if ovary mass, sperm count, and sperm viability were normally distributed and had equal variance between failed and healthy queens using the Shapiro and Levene test, respectively. The ovary mass and sperm viability passed these tests (p > 0.05); therefore, a linear mixed model (lme) was used for both, including health status (failed vs. healthy) and age (0, 1, or 2 years) as fixed effects and location (9 levels) as a random effect. The sperm viability data, however, was not normally distributed, so we analyzed these data with a generalized linear mixed model with a logit link function (glmer, family = binomial) as previously described33, constructed using the same fixed and random factors, but where sperm viability was represented as counts of ‘successes’ (live) and ‘failures’ (dead). Ovary mass data from three independent queen surveys were analyzed using a linear mixed model (lme) with health status as a fixed effect and queen source location as a random effect.
To determine the link between ovary mass and virus infection, we first tested used total viral loads as raw counts and tested the appropriateness of zero-inflated regression models using a Poisson distribution and negative binomial distribution but found that residuals suffered from over dispersion (dispersion parameter >> 1). We repeated these tests using log2 transformed viral data, but residuals were still over-dispersed. Therefore, we binned total viral loads into “low” (virus not detected), “medium” (0 < log10(total virus +1) ≤ 5), and “high” (log10(total virus +1) ≥ 5) infection groups to simplify the statistical analysis. This yielded n = 31, 56, and 19 queens in each group, respectively. Ovary mass was modeled against queen health status (healthy or failed), sperm viability, sperm counts, and total viral load using a linear mixed model, with queen source (apiary) as a random effect. Sperm viability and sperm count variables were not significant and were dropped from the final model. Total viral counts of Survey 3 queens were also binned into three infection categories, but with different cut-offs owing to different abundance distribution (likely owing to samples being analyzed by a different lab and from a different tissue). Survey 1 virus data was obtained from the head, whereas Survey 3 viral data was obtained from the thorax. For Survey 3, the bins were: low (virus not detected), medium (0 < log10(total virus +1) ≤ 1.5), and high (log10(total virus +1) ≥ 1.5), yielding n = 24, 14, and 16 queens. In the higher range of viral loads, the data are broadly dispersed (see Figure 2e) and setting a higher threshold for the “high” bin yielded too few queens in this category. We analyzed the data with a linear mixed model with queen age (0, 1, or 2 y old), health status (healthy or failed) and total viral load (low, medium, and high) as fixed effects and queen source (apiary) as a random effect.
For the comparison of weights and ovary masses of queens exposed to either heat stress or pesticide stress, we first checked for data normality and equal variance as described above. The data passed these tests (p > 0.05); therefore, we performed statistical analysis on pesticide-stressed and heat-stressed queens separately using a simple linear model (lm) with treatment group as a fixed effect. For the short-term heat stress experiments, ovary masses were first mean-centered by experiment, then analyzed using a simple linear model with treatment as a fixed effect.
Ovary proteomic sample preparation and statistical analysis
Ovary samples from Survey 1 (N = 88, 5 samples from the original 93 queens were lost during processing) were prepared for mass spectrometry essentially as previously described37. One ovary from each queen was homogenized in a separate 2-ml screw cap tube containing 300 µl of lysis buffer (6 M guanidinium chloride, 100 mM Tris, pH 8.5) and four ceramic beads. The homogenizer (Precellys 24, Bertin Instruments) was set to 6,500 s−1 for 30 s: samples were homogenized three times and placed on ice for 1 minute between each. Then samples were centrifuged (16,000 relative centrifugal force, 10 min, 4°C) to remove debris. 100 µL of supernatant was transferred to a new tube and diluted 1:1 with distilled H2O. Protein was precipitated by adding four volumes of ice-cold acetone and incubating overnight at −20°C. The precipitated protein was pelleted by centrifugation (6,000 relative centrifugal force, 15 min, 4°C) and the supernatant was discarded. The protein pellet was washed twice with 500 µl of 80% acetone, then the pellet was allowed to air dry (~5 min) before solubilization in 100 µl of digestion buffer (6 M urea, 2 M thiourea). Approximately 20 µg of protein was reduced (0.4 µg dithiothreitol, 20 min), alkylated (2 µg iodoacetamide, 30 min, dark), and digested (0.4 µg Lys-C for 2.5 h, then 0.5 µg trypsin overnight). Digested peptides were acidified with one volume of 1% trifluoroacetic acid and desalted with high-capacity STAGE tips as previously described76. Eluted samples were dried (SpeedVac, Eppendorf, 45 min) and resuspended in Buffer A (0.1% formic acid, 2% acetonitrile), then peptide concentrations were determined using a NanoDrop (Thermo, 280 nm). LC–MS/MS data acquisition was performed as previously described37.
MS data was searched using MaxQuant (v1.6.8.0) with the parameters and database as previously described37. In Perseus (v 1.5.5.3), the resulting protein groups were filtered to remove reverse hits, contaminants, proteins identified only by site, and proteins identified in fewer than 10 samples. Normalized LFQ intensity was then log2 transformed. Subsequent analyses were performed in R (v3.6.3). Differential expression analysis was performed using the limma package in R77 with ovary mass, status (failed vs. healthy), and total viral counts (log10 transformed of combined RT-qPCR counts, +1 to avoid undefined terms) included as fixed effects. Empirical Bayes moderation of the standard errors was then performed on the resulting linear model, and finally the false discovery rate was controlled to 5% (for the status coefficient) or 10% (for the ovary mass coefficient) using the Benjamini-Hochberg correction. GO term enrichment analysis was performed on the proteins differentially expressed among status groups using Ermine J as previously described38, though no terms were significant. We used a linear model with ovary mass, status, and total viral counts as fixed affects to analyze the expression of ApoLp-I/II and ApoLp-III.
Statistical analysis of spermathecal proteomic data
We analyzed previously published proteomics data using the limma R package essentially as previously described for ovaries. Proteomics data were first groomed to remove samples that did not have associated viral abundance data. We included log10 transformed total viral load (+1 to avoid undefined terms), health status (failed or healthy), and sperm counts as fixed effects in the statistical model. We used a Benjamini-Hochberg correction to control the false discovery rate to 10%. We direct the reader to McAfee et al. for details on proteomics sample preparation and upstream data processing.
IAPV experimental infections for measuring ovary mass
Mated queens of the same age were obtained in a single shipment from California. Queens were anesthetized with CO2 for 6 minutes before being injected between the abdominal tergites with approximately 4nL of IAPV-inoculum (n=10) or phosphate buffer saline (PBS) as sham (n=10). Injections were performed using the FemoJet Microinjector (Eppendorf) with hand-pulled glass needles. The queens were then kept in wooden cages with 5 attendant workers and provided with sugar candy in the incubator at 34°C with a water pan for humidity. After 65 hours the queens were sacrificed, and the wet ovary masses measured on an analytical balance. The dilution of purified IAPV used to inject the queens was determined after measuring survivorship in workers injected with a dilution series. In this experiment, attendants in the cages with IAPV-injected queens were beginning to show symptoms of paralysis after 2.5 days (presumably from contact with the infected queens).
IAPV experimental infections for expression analysis
Three large groups of sister queens were produced from a healthy mother queen to study the direct effect of virus infection on queens of three different ages following honey bee queen-rearing technique75. Briefly, young larvae (12 – 24 hours old) were grafted into the artificial queen cups and placed in a populous but queen-less nurse colony to develop. Prior to grafting, the donor colony was visually determined to be free of symptomatic diseases (Nosemosis, Varroasis, European and American Foulbrood, and Chalkbrood). After 6 days, sealed queen cells were transferred to an incubator (35°C, 65% humidity) and treated as follow:
The first group was used for pupal infections. Two days before emergence, queen pupae were gently removed from their capped cells. These pupae were either injected with 2 µL of IAPV-inoculum containing 5.24x103 virus particles (n= 34), or Phosphate buffered Saline (PBS) as a sham control (n= 31). The injection was performed using a Nanojet™ syringe pump (Chemix, USA) with an infusion flow rate of 0.1 µL/s, where the needle was inserted into the first abdominal segment located immediately behind the metathorax. Each queen then was placed individually in a 15 mm well in the 24 well glass bottom cell culture plate and kept in the incubator (35°C, 65% humidity). While in the incubator, development of the virus injected queens was compared with the PBS injected queens. Our observation confirmed a complete cessation of development in IAPV injected queens 20 hours post injection, not observed in the control group. Therefore, the experiment was terminated, and the abdomen of each queen was then separated, placed in an Eppendorf tube, and stored in a freezer at -80°C until RNA extraction.
The second group of queens was used for infecting newly emerged queen in the pre-reproductive stage. Two days after emergence, queens were either injected with the IAPV-inoculum (n= 30), or PBS as shame control (n= 30) as explained above. Each queen was accompanied with 10-15 worker honey bees from the donor colony in a single-use plastic cup, provided with water and sugar candy. Cups of IAPV-inoculated and PBS-injected groups were maintained separately in the incubator (35°C, 65% humidity). The IAPV-inoculated queens started showing paralysis symptoms around 20-24 hours post injection. The experiment was terminated by the onset of symptoms, and after anesthetization using CO2 the queens’ abdomens were sampled and stored as described above.
For infection of the last group of reproductive queens, right after emergence, queens were caged and banked in a donor colony for one week. Then, they were taken out and treated with CO2 two times a week later and return to the donor colony for one more week to stimulate ovary development without mating (Cobey et al., 2013). Thereafter, queens were either injected with the IAPV-inoculum (n= 34), or PBS as sham control (n=31) and maintained with attendant workers in the incubator (35°C, 65% humidity), as described above. Paralysis symptoms started in the IAPV-inoculated queens 40-44 hours post injection, when the experiment was terminated. Abdomens were sampled as described above.
RNA isolation, cDNA Synthesis, and Quantitative PCR (qPCR) on experimentally infected queens
Each abdomen was transferred into a 2 mL bead ruptor homogenizing tubes containing 5 ceramic beads. The samples were then homogenized using an automated Bead Ruptor Elite (Omni International, Kennesaw, GA, USA) at speed of 5 meter per second (5 m/s) for 20 s. Followed by homogenization, total RNA was extracted using an established TRIzol™ (Invitrogen, Carlsbad, CA, USA) protocol. The concentration and purity of the extracted RNA samples were measured using a Nanodrop OneC Microvolume UV-Vis Spectrophotometer (Thermo Fisher Scientific, MA, USA). The total RNA concentration was adjusted to 20 ng/µL in molecular grade water (Fisher Scientific, Fair Lawn, NJ, USA). cDNA was synthesized using the High-Capacity cDNA Reverse-Transcription Kit (Applied Biosystems, Foster City, CA, USA). Ten microliters of the RNA template (200 ng) were added to 10 µL of the provided cDNA master mix, followed by an incubation period as recommended by the manufacturer; 10 min at 25°C, 120 min at 37°C, and 5 min at 85°C. Resulting cDNA solution then diluted 10-fold in molecular grade water to serve as template in subsequent qPCR.
We quantified the IAPV intensity, expression level of three genes of interest including vitellogenin, protein lethal(2)essential for life-like, heat shock protein 70 cognate 4 (see Supplementary Table S7 for primer sequences). Three housekeeping genes including actin, glyceraldehyde 3-phosphate dehydrogenase (GapDH) and ribosomal protein S5 (RPS5) were also amplified as internal control and for relative quantification.
The qPCR was conducted in duplicate using 384-well plates on a QuantStudio™ 6 cycler (Thermo Fisher Scientific). The reactions were performed using unlabeled primers and SYBR Green DNA binding dye (Applied Biosystems) with a volume of 12 µL and the final primer concentrations of 0.4 µM. We added RNase-free water as template for the NTC and no reverse transcriptase for the NRT controls. The thermal cycling conditions were 10 min at 95°C, followed by 40 cycles consisting of a denaturing stage at 95°C for 15 s and as annealing/extension stage at 60°C for 1 min. This procedure was followed by a final melt-curve dissociation analysis to confirm the specificity of the products. For IAPV, the Ct values were determined at the same fluorescence threshold (0.05) for all plates, and a Ct value of 35 or lower was recorded as positive amplification. For three genes of interest and housekeeping genes we collected Ct values based on the default threshold as determined by the QuantStudio™ 6 for each target gene.
We quantified immune-gene expression for each sample using relative quantification. We calculated geometric mean of the Ct value from three reference genes (RPS5, actin, and GapDH) to confirm amplification. We subtracted the Ct value of geometric mean from target gene for each sample (∆Ct= Ct(gene of interest) – Ct(GMean of reference genes)). To calculate the ∆∆Ct for each target gene, we subtracted the average of the ∆Ct values across the control samples at each age from the ∆Ct for each target gene. Fold gene expression was then calculated using the 2−(∆∆Ct) method. Pairwise comparisons of gene expression were evaluated using the wilcox.test function in the ggpubr package (v0.4.0). Data are plotted in the form of log10(fold gene expression) for clarity.
Data Availability Statement
All data underlying the figures in this manuscript are supplied as supplementary information. The previously published (spermathecal fluid) raw data are publicly available on the MassIVE archive (www.massive.ucsd.edu, accession MSV000085428). The ovary mass spectrometry data are available on the MassIVE archive (www.massive.ucsd.edu, accession MSV000087150). Source code underlying figures and data analysis are available freely upon request.