Study Area
The Conwy Catchment is a 678 km2 river drainage in north Wales that encompasses a wide range of habitats including forest, moorland, agriculture, light urbanization, and acid grasslands (Fig. 1). The area experiences rapid climatic shifts, particularly during winter months due to its mountainous terrain and porous rock foundations, which facilitates flash flooding, making it susceptible to periodic disturbance 36. The Conwy Catchment area exhibits four distinct seasons that correspond to expected life cycles of EPT and Chironomidae larval emergences throughout the year.
Sampling
Fourteen sites across 5 landuse types (acid grassland, agriculture, forest, urban, moorland) were sampled once per season (spring, summer, fall, winter), during 2017 (5 landuse types x14 sites x4 seasons). In total 168 eDNA samples and 56 traditional kick-net samples were taken over the course of the study. Sampling for each season occurred over two consecutive days. For all sampling events, streams were sampled for both eDNA and macroinvertebrate community composition during the same day. Water samples for eDNA analysis (1 L) were collected in triplicates from each stream with plastic bottles that had been cleaned prior using a 10% bleach solution. These were then filtered through 0.22 μm SterivexTM filter units (EMD Millipore Corporation, Billerica, USA) using a Geopump TM Series II peristaltic pump (Geotech, Denver, USA). The filters were immediately preserved in ATL lysis buffer (Qiagen, Venlo, The Netherlands) and stored at room temperature during transit to the laboratory for further processing. Macroinvertebrate communities were sampled using a standardized 3-minute kick sampling protocol, with a 500 μm mesh gauge kick/hand net. Both bank margins and riffle habitats were sampled during this timed sampling period. Macroinvertebrates were preserved in absolute ethanol (99.8%; VWR International, Lutterworth, UK) on collection. Upon return to the laboratory the macroinvertebrates were cleared of other collected material and identified to the lowest practical taxonomic level. Environmental (abiotic) data including pH, conductivity, depth, moss coverage, algal coverage, plant coverage boulder coverage, gravel coverage and sand coverage, were collected for each site following UK Environment Agency site assessment protocols 37. Three field blanks consisting of deionized water were prepared in Nalgene sampling bottles and were kept with the other sampling gear throughout each sampling event. One blank was processed after every fourth sampling site (3 per sampling event; 12 blanks total) and treated the filters in the same manner as the other samples throughout the subsequent extraction and sequencing protocols.
Extraction and Sequencing
DNA was extracted from the filters using a modified QIAGEN DNA blood and tissue extraction protocol 38. In short proteinase K was added directly to the filters and incubated overnight in a rotating hybridization chamber. Then, the lysate was extracted and the full volume was filtered through a spin column tube, after which point the standard extraction protocol was continued. Extracts were then cleaned for impurities using QIAGEN Power Clean kit and frozen at -20°C for subsequent analyses. Sequencing libraries were created using a two-step protocol (see Bista et al. 2017), using matching dual end index tags (IDT) and the following COI gene region primers for the first round of PCR (PCR1): m1COIintF (5′-GGWACWGGWTGAACWGTWTAYCCYCC-3′) and jgHCO2198 (5′-TAIACYTCIGGRTGICCRAARAAYCA-3′) 39. Libraries were created at Bangor with the assistance of a Gilson pipette max liquid handler before being shipped to University of Birmingham’s Genomic sequencing facility for quality control and sequencing. Round 1 amplification (PCR1) with the COI primers was performed in triplicates, which were then pooled and index labeled during Round 2 PCR step (PCR2). PCR1 utilized Thermo Scientific's Ampli-gold mastermix Due to the high number of inosine in the COI primer pair, and for PCR2 we utilized New England Biolab's Q5 mastermix. All PCR1 and PCR2 reactions were run in 25 μL volumes. PCR1 amplicons were generated using a reaction mix of 12.5 μL mastermix, 2 μL DNA template, 1 μL of each primer and 8 μL nuclease free water and amplified using an initial 95 °C for 5 min then 25 cycles of 95 °C for 30 s, 54 °C for 30 s and 72 °C for 60 s followed by a 72 °C final annealing for 10 min. PCR2 amplicons were generated using a reaction mix of 12.5 μL mastermix, 2 μL DNA template, 1 μL of each primer and 8 μL nuclease free water and amplified using an initial 98 °C for 30 s then 15 cycles of 98 °C for 10 s, 55 °C for 30 s and 72 °C for 30 s followed by a 72 °C final annealing for 10 min. The, PCR2 amplicons were purified using High Prep PCR magnetic beads (Auto Q Biosciences) and quantified using a 200 pro plate reader (TECAN) with the QubitQubit dsDNA HS kit (Invitrogen). The final amplicons were pooled in equimolar quantities (at a final concentration of 12 pmol) using a Biomek FXp liquid handling robot (Beckman Coulter). Pool molarity was confirmed using a HS D1000 Tapestation ScreenTape (Agilent). Sequencing was performed on an Illumina HiSeq platform 250bp Paired-End, with an intended coverage of 100,000 reads per sample.
Bioinformatics
Bioinformatic processing up to taxonomic assignment was performed by University of Birmingham. In short, per base quality trimming was performed on demultiplexed reads using SolexaQA++ v.3.1.7.1 (Cox, Peterson, & Biggs, 2010) and paired end reads were merged using Flash v.1.2.11 40, with default parameters. Primer sequences were removed with TagCleaner v.0.16 41 allowing up to 3 mismatches per primer sequence. Only sequences with both forward and reverse primers were retained for further analyses. Operational Taxonomic Units (OTUs) were obtained via Usearch at 97% similarity threshold, and denoising with the -unoise3 algorithm. Taxonomy was assigned to representative OTU with BLAST against OTUs using the non-redundant nucleotide database of NCBI.
After stringent filtering and quality control, 12,592,362 reads were obtained with an average of 74,954 (SD = 31,050) reads per sample. Negative controls, while showing no bands on agarose gels post library preparation, generated 676 reads across all blanks (N=12). Of the negative control reads, 411 were unknown bacteria, 3 reads were associated with three genera of Rhodophyta (red algae), 2 reads were linked to unknown fungi, and 260 reads were linked to a single Dipteran OTU across four blanks.
Metacommunity diversity measures
All statistical analyses were performed using R version 3.6.1 42. Sequence reads were rarified for each set of replicates to the lowest replicate level. Mean number of reads for each OTU were calculated across the rarified data before being matched to their taxonomic identifier. Genera richness was calculated as the number of unique genera per site. We further divided richness into unique EPT genera for EPT richness, and Chironomidae richness as the number of unique Chironomidae genera. Functional richness was calculated as the number of unique functional groups per sample, following the designation of functional groups in Moog (2017). These groups reflect the species’ trophic function, divided into 8 categories, including; grazer/scrapers, xylophagous, shredders, gatherers/collectors, active filter feeders, passive filter feeders, predators and other. We refined these groups to shredders, grazers, and collectors whereby collectors were the summation of gather/collectors and filter feeding groups to simplify the functional groups to those used across wider studies. Beta diversity was calculated as nestedness and turnover components following Baselga (2010).
Statistics
We used linear regression, as implemented using the lm function in R, to assess the statistical relationships between (1) community, (2) EPT, (3) Chironomidae, and (4) functional richness (each as a separate response variable) and all two-way interactions of the explanatory variables, including sampling method (eDNA or traditional), season (spring, summer, fall and winter) and landuse gradient (see below for description). We further used linear regression to assess the statistical relationships between nestedness and turnover for the community, EPT and Chironomidae groupings (as response variables) against a set of explanatory variables, including the landuse gradient, sampling method, and season, including all possible two-way interactions for each model. The landscape gradient was calculated as the first principle component of a PCA derived from non-covarying environmental variables, where non-covariance was assessed via visualization of the pairwise correlation between all measured environmental variables (Fig S1) (Borcard, Gillet, & Legendre, 2011). Backward model selection was performed to find the most parsimonious model using Akaike information criterion (AIC) to determine the best model fit 43. Model assumptions, including normality and heteroscedasticity using model diagnostic plots were implemented in R. To assess the potential of spatial autocorrelation in the data, we calculated Moran’s I for each subset of the data used to perform the above-mentioned regressions. Moran’s I is a measure of spatial autocorrelation and was calculated via the ape package in R 43.