Archaeological context and site information
Twenty-six dental calculus samples from the Jewbury archaeological site were divided into five groups for analysis. These dental calculus samples ranged from (1170–1290 CE), and like most ancient calculus specimens, were too small to subsample. Consequently, different individuals had to be used to complete this analysis. This Medieval cemetery in northern England, UK was excavated in 1983 28. Archaeological examination of the human remains ended following reburial requests from the Jewish community. However, analyses on the site and skeletons were recorded 29 along with dental calculus samples which were collected for future investigations. All experimental protocols and analyses were performed and completed under ethical approval to study ancient human dental calculus (University of Adelaide: H-2012-108). All methods were performed in accordance with the guidelines outlined Cooper and Poinar (2000) 24. The population consisted of individuals with middle-to-poor socio-economic standing, with 98.1% buried in single graves. Dental caries were observed in 59.5% of individuals, and periodontal disease was present in > 80% 28. Calculus was sampled on-site at the time of excavation and stored in glass vials until transport. Samples were transported to the aDNA facility based in the Australian Centre for Ancient DNA (ACAD), Adelaide, Australia for processing. Previous aDNA analysis of calculus from this site has revealed that oral microbial communities were not statistically different between samples and were distinct from other cultures based on microbial composition alone 16.
Decontamination protocols
Each of the five groups underwent a different decontamination protocol prior to DNA extraction. The protocols are summarized in Fig. 1, and were as follows: untreated controls (n = 5); EDTA treatment (n = 5) 17; UVB treatment (n = 5) 16; UV treatment (n = 5); and bleach treatment (n = 6). For the EDTA treatment, calculus fragments were submerged in 1 mL 0.5M EDTA for 1 hour 17. The UVB treatment exposed dental calculus fragments to UV radiation for 30 minutes on each side, followed by a submersion in 3 mL of 5% bleach in a sterile petri dish for 3 minutes 16. The individual UV and bleach treatments used the respective element of the UVB protocol. Following the decontamination protocols, all samples were washed in 1 mL of sterile 80% ethanol for one minute to remove residual chemicals (i.e., EDTA or bleach) prior to extraction. The ethanol washes from each sample (n = 26), as well as control ethanol samples (n = 3), were evaporated, and the resulting DNA was suspended in TLE buffer (500 µl Tris HCL (1M), 10 µl EDTA (0.5M), and 50 ml dH2O) 30.
DNA extraction
All samples, except for the ethanol washes, underwent an in-house, silica-based DNA extraction, as previously described 31. To account for small sample sizes, total volumes of lysis and guanidinium DNA binding buffer were reduced as follows: 1.8 mL lysis buffer (1.6 mL 0.5 M EDTA (0.5M), 200 µL SDS (10%), and 20 µL proteinase K (20 mg/ml)) and 3 mL guanidinium DNA binding buffer. Two extraction blank controls were included for every seven samples for the amplicon process, while one was performed for shotgun.
16S rRNA Amplicon Approach
A 289 base pair stretch from the V4 region of the 16S ribosomal RNA (rRNA) encoding gene (position 515–806 of the E. coli reference genome) was amplified in triplicate from all samples (dental calculus, resuspended ethanol washes, and extraction blank controls) alongside an additional PCR negative control using a universal forward primer and sample specific, barcoded reverse primer 32. This amplicon was previously targeted in the study of ancient dental calculus by Adler et al. (2013) and is used to obtain sufficient resolution of bacterial taxonomy to allow analysis of the microbial community 16. Each PCR reaction contained: 17.5 µL sterile H20, 1 µL of DNA extract, 0.25 µL of Hi-Fi taq (Life Technologies), 2.5 µL of 10X Hi-Fi reaction buffer, 1 µL MgCl2 (25 mM), 0.2 µL dNTPs (10 mM), and 1 µL each of the forward and reverse primers. Samples were amplified using the following conditions: initial denaturing (95 °C, 6 minutes), followed by 37 cycles of denaturing (95 °C, 30 seconds), annealing (50 °C, 30 seconds), and elongation (72 °C, 30 seconds), and finally adenylisation (60 °C, 10 minutes). High numbers of cycles can cause inflation of diversity estimates 33. However, the high cycle number is normal for ancient DNA where low concentrations of input DNA are experienced 16,17. Following amplification, the triplicate reactions were pooled, and PCR products were visualized by electrophoresis on a 2.5% agarose gel. Samples were quantified (Qubit 2.0, Life Technologies) before being pooled at equimolar concentrations and purified (Ampure, Agencourt Bioscience). The pooled sample (i.e., DNA library) was quantified using the Tapestation and the KAPA SYBR Fast Universal master mix qPCR assay (Geneworks). DNA sequencing was completed using the Illumina MiSeq 150 bp paired end chemistry (Illumina, San Diego, CA, 34USA) at the Australian Genome Research Facility Ltd (AGRF), Adelaide. Sequencing data can be
Amplicon bioinformatic analyses
Sequences were demultiplexed and quality filtered in QIIME (V1.8)35 using the split_libraries_fastq.py script with parameters: barcode error = 0 and quality score > 20. Operational taxonomic unit (OTU) picking was completed against GreenGenes (V13.8)36 with 97% similarity using both closed and open reference methods. The closed reference OTU dataset only includes sequences that match references within the GreenGenes database; the open reference dataset also included OTUs without reference matches. To remove contaminant DNA introduced through laboratory processing, OTUs identified in negative controls and as common laboratory contaminants23 were removed from dental calculus samples processed in the same batch. Ethanol washes, which were not expected to have a high biomass or necessarily to be representative of human microbiota, were filtered by the OTUs present in the control ethanol samples. Finally, singletons (OTUs present only once) were removed from the data.
Following filtering, bioinformatics analyses for the 16S rRNA amplicon dataset were conducted within QIIME (V1.8). To examine differences in diversity between the different decontamination steps, a variety of analyses were performed. Alpha diversity (observed species) was calculated for each treatment group at rarefaction levels from 0 to 2,000 (in intervals of 10) using closed and open reference datasets in QIIME. A Goodness of fit test (G-test) was applied to detect significant differences in genus-level taxa between untreated samples and each of the decontamination protocols. To reduce false positives generated by rare taxa, OTUs below 0.1% of the total taxa present were removed before performing the G-test. To identify the environmental taxa impacted by the decontamination protocol, genus-level taxa that were significantly different (p < 0.001) were classified as environmental or oral based on their presence or absence (respectively) in the Human Oral Microbiome Database (HOMD) (homd.org).
Several statistical assessments were performed to identify taxa that were significantly altered by the different treatments. First, a one-way ANOVA was applied to test if the average frequency of OTUs in each protocol group had altered in relation to the untreated group. Next, for each sample, OTUs identified in the G-test analysis were ranked as increasing or decreasing relative to the untreated proportion. A one-way ANOVA was performed to identify taxa that significantly differed between the four treatment groups. Finally, taxa released into the ethanol washes were classified as environmental or oral using HOMD, and the ratio of oral to environmental taxa was assessed.
16S rRNA amplicon SourceTracker analysis
SourceTracker (V0.9.6)50 was used to identify the proportions of endogenous and contaminant signal in each sample. SourceTracker differentiates the community profiles in the ‘source’ to those of the ‘sink’ (i.e. the sample), using Bayesian methods to associate the extent of contribution of each source to a sink. Comparative data included dental calculus from modern (n = 6) and Industrial Revolution (n = 3) individuals accessed from the Online Ancient Genome Repository (OAGR)51. Additionally, preprocessed 16S rRNA gene datasets were downloaded from the Qiita database for comparison (qiita.microbio.me) and included: human skin samples (n = 11) 38 and environmental samples from agricultural soil (n = 8), temperate soil (n = 4), forest soil (n = 4), tropical soil (n = 5), and park soil (n = 6) (Study IDs: 232, 808, 846, and 1674. qiita.microbio.me). Specifically, a subset of skin samples was used to reduce bias from unevenly sized reference groups (samples: P15024, P15268, P15733, P16107, P16187, P16199, P16304, P16320, P16393, P16399, and P16562 were used). SourceTracker was run with default parameters (1,000 subsampling, 10 iterations per sink sample) in R version 3.1.0.39 using the QIIME wrapper. The suitability of the source populations was confirmed using the “take-one-out” method, which demonstrated that the samples within each reference group were more similar to one another than samples in any other group.
Shotgun Metagenomic Approach
Shotgun metagenomic library preparation and sequencing
Metagenomic shotgun libraries were constructed as previously described34 without the enzymatic damage repair, as used by other ancient dental calculus studies 40. Briefly, 20 µL of DNA extract underwent enzymatic polishing to produce blunt ended fragments, before the ligation of truncated 7-bp forward and reverse uniquely barcoded Illumina adaptors, and finishing by filling in the gaps between the adaptor sequences and the DNA sequence. MinElute Reaction clean-ups (Qiagen) were completed after both enzymatic polishing and barcode ligation steps. Libraries were amplified in triplicate by PCR for 13 cycles with Illumina amplification primers 34. Each PCR reaction contained: 13.25 µL sterile H20, 5 µL of library DNA, 0.25 µL of Hi-Fi taq (Life Technologies), 2.5 µL of 10X Hi-Fi buffer, 1.25 µL MgSO4 (50 mM), 0.25 µL dNTPs (100 mM), and 1.25 µL each of the forward and reverse primers. Cycling conditions were as follows: 94 °C for 12 minutes; 13 cycles of 94 °C for 30 seconds, 60 °C for 30 seconds, 72 °C for 40 seconds (increasing increments of 2 seconds per cycle); and ending at 72 °C for 10 minutes. PCR products were pooled and cleaned with AxyPrep magnetic beads (Axygen Scientific, Inc.), and then re-amplified with GAII Indexed Illumina primers 34, using the above cycling conditions and a modified PCR reaction: 12.75 µL sterile H20, 2 µL of purified Library DNA, 0.25 µL of AmpliTaq Gold (Life Technologies), 2.5 µL of 10X Gold buffer, 2.5 µL MgCl2 (25 mM), 0.625 µL dNTPs (10 mM), and 1.25 µL Illumina amplification primer, and 1.25 µL GAII Ilumina indexed adaptor. Libraries were purified again prior to quantification using TapeStation (Aligent), and pooled for a final 4 nmol/L DNA concentration, before sequenced with Illumina NextSeq 500, Mid Output 150 cycles (Illumina, San Diego, CA, USA) at the Australian Genome Research Facility Ltd. (AGRF), Adelaide.
Shotgun metagenomic bioinformatic and statistical analyses
Shotgun metagenomic sequencing data was converted into FASTQ format using Illumina’s bcl2fastq software before being trimmed, collapsed by overlapping paired sequences, and demultiplexed by unique combinations of P5/P7 barcodes using AdapterRemoval2 41. Collapsed reads were then aligned against a reference database containing 47,713 bacterial and archaeal RefSeq genome assemblies 42 using MALT v 0.3.8 43,44 with default settings. The resulting blast-text files were converted into RMA files via the blast2rma script45 with the following LCA (Lowest Common Ancestor) parameters: --minScore = 42; --maxExpected = 0.01; --minSupportPercent 0.1; --lcaAlgorithm = weighted; --lcaCoveragePercent = 80. Data was compared in MEGAN6CE (version 5.11.3)45. A biom table was exported from MEGAN6CE into QIIME2 (v. 2019.10)46. Species identified in EBCs were completely removed from the taxonomic table using the ‘qiime feature-table filter-samples' quality control plug-in 47.
A feature table was generated using the ‘qiime feature-table summarize’ plugin in QIIME2 (v. 2019.10). Microbial alpha and beta diversities were also calculated and analyzed using the ‘qiime diversity alpha-group-signfiance' and ‘qiime diversity beta-group-significance' plugins. For these analyses, OTUs observed in the extraction blank were removed using the ‘qiime feature-table filter-samples' plugin. Next, samples were rarefied to 70, 601 (the lowest amount of high-quality reads observed in a sample, sample ID 8833). Alpha (Shannon’s diversity, Simpson’s index) and beta (ecological distance based on Bray-Curtis dissimilarity) diversity differences among groups were tested using Kruskal-Wallis analysis of variance. Ordination of Bray-Curtis distances was done using principal coordinate analysis (PCoA) and visualized using Emperor in QIIME2 48. In addition, we performed a G-test to identify differences in OTU abundance among the different sample groups using the group_signficance.py script in QIIME and the Bonferroni method to correct for multiple comparisons. The Kruskal-Wallis test for all the comparisons were statistically insignificant with p-values greater than 0.05.
Shotgun metagenomics SourceTracker analysis
For the shotgun analysis, “source” contributions to samples were predicted from rarefied species-level bacterial and archaeal taxonomic frequency tables from MALT using SourceTracker (V0.9.8)49. Comparative data included gut (n = 19)50–52, plaque (n = 10)53, saliva (n = 10)54,55, skin (n = 6)56,57, and soil samples (n = 6)58,59 (Supplemental Information). The raw shotgun sequences for these source communities were downloaded and processed in the same fashion as the samples in the current study. A species-level table, filtered of taxa matching those identified in the EBCs, was used as input. SourceTracker was ran using the default settings, except with a rarefaction depth of 55,833 - the lowest number of species in the biom table. The predicted percentage contributions of each potential source to the sinks were visualized using the following R packages: ggplot260, dplyr61, tidyr62, and reshape263.