Protocols for assessing the distribution of pathogens in individual Hymenopteran pollinators

Joachim R de Miranda (joachim.de.miranda@slu.se), Swedish University of Agricultural Sciences, https://orcid.org/0000-0002-0335-0386
Ivan Meeus, University of Gent
Orlando Yañez, University of Bern
Niels Piot, University of Gent
Laura Jara, Universidad de Murcia
Piero Onorati, Swedish University of Agricultural Sciences
Pilar De la Rúa, Universidad de Murcia
Anders Nielsen, Norwegian Institute of Bioeconomy Research
Ivana Tlak-Gajger, University of Zagreb
Erkay Özgör, Cyprus International University
Anne Dalmon, Institut National de Recherche pour l'Agriculture, l'alimentation et l'Environnement
Anna Gajda, Warsaw University of Life Sciences
Peter Neumann, University of Bern
Guy Smagghe, University of Gent
Robert J Paxton, Martin-Luther-Universität Halle-Wittenberg


Introduction
The main objective of COST-Action FA1307: Sustainable pollination in Europe - joint research on bees and other pollinators (Super-B) was to coordinate research, outreach and policy towards sustainable pollination services in Europe 1-3. One of the Super-B working groups, WG4, addressed the possible drivers for the decline in bee numbers and diversity in Europe, one of which was identified as the transmission of parasites and pathogens between bee species 4,5. The increasing homogenization of the agricultural landscapes across Europe is reflected in the loss of pollinators in these landscapes [6][7][8][9], with most of the pollination services provided by a few dominant bee species 6,8. The key question is whether, and to what extent, such changes in the communities of pollinators are related to the distribution and transmission of parasites and pathogens between bee species [10][11][12][13][14][15]. The protocols do not specifically distinguish between replicating and non-replicating microbial agents, but the material collected allows for such a determination.
The experimental design resides largely in the sample collection strategy. The design consists of collecting, as they are encountered, thirty bees of the dominant bee species (the putative driver of transmission); thirty bees of all other species/targets of transmission, and fifteen additional specimens of the most common of the target bee species.

Expertise required
All parts of the protocol can be easily implemented by trained students and technicians, with knowledge of basic molecular techniques. Due to the detailed and easy implementation of the protocol, it will be the preferred standard technique for future studies of this nature.

Limitations
The main limitations of the protocol, and source of methodological variation in the data, concern the early stages of the sample processing and nucleic acid extraction, particularly the choices made for sample homogenization and early sample incubations. These affect not only the nucleic acid's extraction efficiency (which can be corrected using the exogenously added reference nucleic acids [19][20][21]) or its integrity (which can be corrected using internal reference standards 17,18,41,42), but also its composition 24, which is much more difficult to correct. The best approach for guaranteeing comparability and minimizing bias between individual samples is high standardization in these early steps 24. Since these limitations involve the very first steps in sample processing, their consequences persist throughout all subsequent stages, irrespective of which alternative approaches are implemented later in the protocol. It is therefore crucial to ensure these early steps are as tight and controlled as possible.

Reagents
Below (Table 1) is a list of essential reagents and equipment, i.e. those that are critical for the performance of the protocol. Non-essential reagents and equipment, such as collecting vessels, transport options or dissection tools, are given in the procedures with a note defining those features that are important.

Equipment Set-up
1. Bead-mill (TissueLyser II) homogenization: Add three 3 mm steel beads and one 5 mm steel bead to the sample and buffer in an appropriate tube for bead-milling. Shake for 2 minutes at 30 Hz followed by 2 minutes at 20 Hz.

2. ThermoCyclers: Set up thermocycling profiles using the appropriate software for each thermocycler.

Procedure
The full procedure is divided into four distinct protocols, concerning (A) the collection of the bee samples; (B) the processing and management of the samples; (C) the pathogen and parasite assays, and (D) the genetic identification of the bee species. Many different types of data are collected throughout the procedure, some for use in data analysis and others as intermediate stages in the conversion and normalization of the raw data into processed data that is suitable for use in statistical analyses. A metafile describing in detail each of these primary and derived parameters is given as a Supplementary Table at the end of the procedure.

A. SAMPLE COLLECTION PROTOCOL
The field sampling protocol below is based on honey bees (Apis mellifera) as a driver of pathogen distribution in wild bees, but this can be substituted for any other biological driver, e.g. a particular species of bumblebee. The speed at which the bees are collected, recorded by time, can be used as a proxy for the absolute density of bees in the area. The order in which different bee species are collected (up to their maximum) can be used as a proxy for the relative density of the different bee species. Bee foraging behavior is affected by weather conditions 43 and species of flower [44][45][46], and differs furthermore between bee species [44][45][46][47][48][49]. Since bee pathogens can be transmitted through floral visitation networks 11-15, and are also directly affected by temperature 50, weather conditions and the floral character of the landscape are important metadata to collect.
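The two density proxies described above can be computed directly from the field records. The following is a minimal sketch only, assuming each capture is logged as a (minutes since start, species) pair; the function names, the record layout and the use of mean capture rank as the "order" statistic are illustrative assumptions, not part of the published protocol.

```python
from collections import defaultdict

def collection_rate(records, session_minutes):
    """Bees collected per minute: a rough proxy for absolute bee density.

    records: list of (minutes_since_start, species) tuples (assumed layout).
    session_minutes: total duration of the collecting session.
    """
    if session_minutes <= 0:
        raise ValueError("session_minutes must be positive")
    return len(records) / session_minutes

def mean_capture_rank(records):
    """Mean position of each species in the capture sequence.

    A lower mean rank means the species was encountered earlier, which is
    used here (as an assumption) as a proxy for higher relative density.
    """
    ranks = defaultdict(list)
    ordered = sorted(records, key=lambda rec: rec[0])
    for rank, (_, species) in enumerate(ordered, start=1):
        ranks[species].append(rank)
    return {species: sum(r) / len(r) for species, r in ranks.items()}

# Hypothetical field log: four bees caught during a 20-minute session.
log = [
    (2.0, "Apis mellifera"),
    (5.0, "Bombus terrestris"),
    (6.0, "Apis mellifera"),
    (12.0, "Osmia bicornis"),
]
rate = collection_rate(log, session_minutes=20)
order = mean_capture_rank(log)
```

Other summaries of capture order (for example, the time to first capture of each species) would serve equally well; the choice should be fixed before data collection so that sites remain comparable.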
The sample processing is designed to allow the following two aims:
1. To determine the prevalence of known (honey)bee pathogens across bee species
2. To identify and characterize yet unknown, novel pathogens in wild bees
The strategy for both aspects is to prepare a primary homogenate in a neutral buffer, extract nucleic acids from a small amount of extract for the "current-pathogen" analyses (aim 1) and retain the rest for "new-pathogen" prospecting analyses (aim 2), which may involve additional steps prior to nucleic acid extraction.
There is strong emphasis in the sample collection protocol on minimizing cross-contamination between insects, in order to avoid misclassifying the pathogen status of individual bees. Aside from sampling artefact, the larger purpose of this is to attempt to distinguish between those microbial agents that are infectious to the bee tissues, and are therefore a potential health threat; those that are part of the bee microbiome, both internal and external, and those that are passively associated with the bee but not infectious to bee tissues. We have therefore also included several simple processing and assay strategies that maximize our ability to distinguish between infectious and non-infectious agents. One means to do this is to separate the body parts of the bees:
a) Abdomen, where nearly all internal bee pathogens replicate, in the bee tissues, and shed their replicative propagules (spores/oocysts/virus particles etc.) in the gut lumen for voiding into the environment, with the faeces. The abdomen also contains passively acquired, non-infectious agents, both internally in the gut and externally on the exoskeleton.
b) Head, where many of the viral pathogens actively replicate, and occasionally shed particles into the salivary and hypopharyngeal glands, especially in honey bees.
c) Wings & Legs, containing pollen baskets. Pathogens can be shared between bees through flower-visitor networks, especially on pollen.
Since the abdomen contains by far the highest concentrations of pathogens, there is little loss of detection sensitivity from excluding the head, thorax, legs and wings, which can be retained separately for other studies. For example, the DNA/RNA from legs, wings and thoraxes contains very little contaminating microbial nucleic acid and is therefore optimally suited for possible host bee genetic analyses. In order to allow for the widest possible types of additional, future analyses on the material, it is best to prepare the primary homogenate in a neutral, aqueous buffer and only extract RNA and DNA from an aliquot of primary homogenate, with the remainder stored at -80 °C. There is no loss in detection sensitivity of the pathogens (or host mRNAs) from such a neutral primary extract as long as the extract is either frozen or added to the nucleic acid extraction buffers within 5 minutes. There are also several options at the assay level to distinguish active and passive infections, vide infra. Such measures are not entirely foolproof, but greatly improve the chances of finding truly infecting pathogens.
The protocols described below are loosely based on the COLOSS BeeBook chapters "Standard methods for virus research in Apis mellifera" 18, "Standard methods for research on Apis mellifera gut symbionts" 37 and "Standard methods for molecular research in Apis mellifera" 17.

C. PATHOGEN ASSAYING PROTOCOL
The presence and abundance of a range of bee microorganisms, whether detrimental or beneficial, is determined by quantitative PCR of the cDNA and DNA templates using broad-range primers (i.e. those encompassing several strains or species within a complex). The reason for this is in part efficiency (fewer assays to run) and in part to avoid false negative results due to assay insufficiencies (Type-II errors).
Type-I errors (false positive results) are mostly due to trace contamination and are easily avoided by restricting the number of amplification cycles to 35, rather than 40 17,42,51. For accurate quantification we recommend buying synthetic external quantification standards (e.g. ThermoFisher), based on the product sequences (Supplementary Information 3), rather than home-made standards from either purified plasmid clones of the fragment, or purified PCR product, primarily to guarantee uniformity of standards and quantification between different labs, as well as better absolute quantification with synthetically produced, and accurately quantified, standards.
A key element of these quantitative assays is the use of the passive external reference nucleic acids, RNA250 and pJET1.2. These were added in exact known quantities at the start of the homogenization and nucleic acid extraction protocol and the amounts of these left in each cDNA and DNA template can be measured exactly through qPCR, similar to how the pathogen amounts are measured. The ratio of RNA250/pJET measured by qPCR in a standard template volume to the known amount originally added prior to extraction is therefore a simple, one-step conversion factor for all the individual, sample-specific methodological errors and losses incurred from homogenization through qPCR, for accurate absolute quantification of the amounts of each target in the original bee. It will reduce, if not eliminate, random methodological noise from the dataset and improve the chances of detecting true biological differences between the samples.
Another key feature is that the qPCR assays are all designed to work with the same thermocycling profile, to enable different assays to be run in the same thermocycling run, and produce similar medium-sized PCR products, to facilitate distinguishing true product from illegitimate secondary products through either Melting Curve analysis or agarose gel electrophoresis.
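The conversion factor based on the exogenous reference nucleic acids can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the function name and the example copy numbers are hypothetical, and it is assumed that RNA targets are corrected with RNA250 and DNA targets with pJET1.2, each measured in the same template volume as the target.

```python
def copies_per_bee(target_copies_per_reaction,
                   reference_copies_per_reaction,
                   reference_copies_added):
    """Convert a qPCR estimate per reaction into an estimate per original bee.

    The recovery of the exogenous reference (copies measured in the reaction
    divided by copies added before homogenization) captures all sample-specific
    losses and dilutions; dividing the target estimate by this recovery scales
    it back to the amount present in the original sample.
    """
    if reference_copies_added <= 0:
        raise ValueError("reference_copies_added must be positive")
    if reference_copies_per_reaction <= 0:
        raise ValueError("reference not detected; conversion factor undefined")
    recovery = reference_copies_per_reaction / reference_copies_added
    return target_copies_per_reaction / recovery

# Hypothetical numbers: 1e9 copies of RNA250 added before homogenization,
# 2e5 copies of RNA250 and 4e3 copies of a viral target measured per reaction.
estimate = copies_per_bee(
    target_copies_per_reaction=4e3,
    reference_copies_per_reaction=2e5,
    reference_copies_added=1e9,
)
```

With these illustrative numbers the recovery is 2e5 / 1e9 = 0.0002, so the 4e3 target copies measured in the reaction correspond to roughly 2e7 copies in the original sample. A sample in which the reference is not detected cannot be corrected and is better flagged for re-extraction than assigned a value.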
11. qPCR amplification:
a) Add 2 µL diluted DNA or cDNA template to 18 µL qPCR reaction mixture (e.g. BioRad EvaGreen) containing 0.2 µM each of forward and reverse primers for the assay being run (Supplementary Table 1

Some bees may be difficult to identify in the field, either because they are cryptic (i.e. with morphological features that overlap with other bee species), dirty or damaged. Occasionally a species may lack a morphological identification key. In these cases, it is usually possible to identify the specimen through DNA barcode analysis, i.e. by sequencing one of several well-established barcoding genes in the bee nuclear or mitochondrial genomes. Most animals, particularly insects, are barcoded using a ca. 650 bp fragment of the mitochondrial Cytochrome Oxidase I (cox1) gene, which is sufficiently variable to be able to uniquely distinguish even closely related species 25,26. A critical requirement is that the DNA template is sufficiently pure to only amplify the cox1 region of the bee, and not of any contaminating DNA from eukaryotic parasites or plants. This effectively rules out using the purified DNA from bee abdomens for use in barcoding analysis, since this DNA consists of the genomes of all organisms residing in the bee gut, as well as some of the bee. The simplest approach is to amplify the cox1 directly from a snippet of leg, wing or thorax tissue 27. Each bee cell contains many mitochondria, each of which contains numerous copies of the mitochondrial genome 52,53, so that even the tiniest trace amount of tissue contains thousands of copies of the mitochondrial genome, more than enough for PCR amplification. In fact, there is greater danger that too much template is added to the PCR reaction, which can inhibit the PCR and the production of sufficient PCR product for sequencing 51

Table 3. Those steps where the most methodological variability is introduced, i.e. the homogenization and nucleic acid extraction steps in protocol B 24, are largely accounted for through the use of exogenous reference nucleic acids and individual conversion factors for each sample based on these reference nucleic acids. The main form of troubleshooting concerns Type-I and Type-II errors, i.e. false positive or false negative results. These are generally discovered during the assaying protocol, even though they may arise during earlier stages. The solution to these is generally to discard the results, or the contaminated nucleic acid samples, and repeat the entire protocol from Protocol B onwards. There are checks and controls throughout the protocol, such as determining the nucleic acid concentrations (steps 8i and 9c) and the inclusion of positive and negative controls during PCR (steps 11b, 11e, 11f) that enable the detection of a problem, and the subsequent action required.
These are standard troubleshooting controls and checks that are also included in the various commercial kits used in the protocols, and so are automatically included in the use of these kits.

Time Taken
By far the most time-consuming part of the overall protocol, per individual bee, is protocol B (Sample Processing) and in particular steps 3 through 6, when each bee is individually weighed, measured, dissected and homogenized. The next most time-consuming section is the production of DNA, RNA and cDNA template for assaying (steps 7-10). If extraction robots are used then part of this time can be allocated to other tasks. The least time-consuming part of the protocol, per individual bee/pathogen combination, is protocol C (Pathogen Assaying). Below is a breakdown of the amount of time that should be budgeted for each part of the overall protocol, summarized in Table 4.

Protocol A -Sample Collection
The sample collection normally takes about 3-4 hours, but this depends very much on the circumstances and environmental conditions when the collection takes place, some of which may be beyond our control due to the nature and design of the experiment. The primary way this protocol can be made more efficient, if the experiment allows this, is to plan the collection around when and where the bees are most

The laboratory part of the barcoding protocol is similar to running a single pathogen assay, and takes about 4 hours. If the product sizes are checked on a gel prior to submitting for sequencing (step 18a), then another hour is added to the protocol. The most time-consuming stage is actually waiting for the sequencing results and analyzing the data, which is separate to these protocols. Time can be saved by:
a) Preparing the agarose gel and sample tubes while waiting for the incubation to finish

Anticipated Results
The results of the various sub-protocols are collected in a data spreadsheet, with numerous columns for the various data items associated with each sample. These include purely administrative items, such as the sample ID (and perhaps various sub-IDs) for tracing the samples, as well as dates, times, procedural information and for coordinating the sampling, processing and assaying between participating partners.
A second major group is the data collected during processing, for administrative record keeping and help with data processing. A third major group is the data collected during sampling, concerning the location, date, time and other interesting features of where the bees were collected, which can be used in analyses to explain the biological data. Then the final major group is the biological data obtained from each sample, concerning the identity of the bee and the presence and amounts of various pathogens. An example is given in Supplementary Table S1, as a metafile with some typical examples of the type of data obtained and how this is registered.

List of essential reagents for the protocols, together with their suppliers, the company websites and the catalog numbers.

Figure 2
List of essential equipment for the protocols, together with their suppliers, the company websites and the catalog numbers.

Table describing the major risks associated with the protocols, the steps involved, and how to troubleshoot these.