Evaluation of Methods to Optimise Shotgun Metagenomic Sequencing-Based Analysis of the Microbiome of Milk

Min Yap Teagasc Food Research Centre Moorepark Conor Feehily Teagasc Food Research Centre Moorepark Calum J. Walsh Teagasc Food Research Centre Moorepark Mark Fenelon Teagasc Food Research Centre Moorepark Eileen F. Murphy Precision Biotics, Cork, Ireland Fionnuala M. McAuliffe National Maternity Hospital, Dublin, Ireland Douwe van Sinderen University College Cork National University of Ireland Paul W O’Toole University College Cork National University of Ireland Orla O’Sullivan Teagasc Food Research Centre Moorepark Paul Cotter (  Paul.cotter@teagasc.ie ) Teagasc https://orcid.org/0000-0002-2577-9028


Background
Milk is an important source of nutrition for both humans and animals. Human breast milk contains bioactive compounds that contribute to a child's development and growth [1]. With the development of culture-independent methods for studying the microbiome, increasing interest has been devoted to understanding the interplay between breast milk and the infant gut microbiome [2]. Indeed infants primarily fed human breast milk within the rst few months of life have been shown to harbour a distinct microbiome from that of non-breast-fed infants [3]. 16S rRNA gene amplicon-based analysis of the human milk microbiome has also improved the understanding of the bacterial community related to mastitis. Despite these advances, the application of shotgun metagenomic approaches has the potential to provide even greater insights into the milk microbiome to help with prevention or treatment of lactational mastitis and further promote infant and maternal health [4].
Besides human milk, animal milk from cows, goats, sheep, and buffalo and the resultant dairy products, have proven to be versatile and valuable foods commonly consumed by humans [5][6][7]. The bovine milk microbiome has been intensely studied, with a view to assessing and improving animal health and ensuring product quality and safety for human consumption [8][9][10]. As with human mastitis, culturebased microbiology methods have had variable success in the diagnosis of bovine mastitis due to their inability to detect low abundance pathogens and because of challenges associated with subclinical infection of unknown aetiology [11]. Sequencing-based approaches have also addressed this issue to some degree [12]. In addition to animal health, research on the bovine milk microbiome has been performed motivated by the near-universal use of milk as an ingredient in many dairy products worldwide. Although certain bacteria, when present in milk products, can improve the resulting sensory and avour properties [13], other microorganisms can negatively in uence milk by deteriorating its quality, causing spoilage, or making it unsafe for consumption [14].
High-throughput sequencing technologies, both targeted amplicon sequencing, e.g. 16S rRNA gene, and shotgun metagenomics, have been widely adopted for the study of microbiomes, including relatively highly diverse and species-rich microbiomes such as the human gut [15][16][17], human oral cavity [18], and soil [19,20]. For environments with a low microbial load and diversity such as milk, amplicon sequencing has been the primary tool used. For bovine milk analysis, this has provided insights into the compositional changes in raw milk due to environmental factors, processing or the production of cheese [1,10,[21][22][23]. Notwithstanding the value of data already generated, extraction methodologies have been identi ed as having an impact on downstream sequencing data from 16S rRNA sequencing studies [24][25]. Furthermore, although these amplicon sequencing-based approaches have provided valuable insights, shotgun metagenomic sequencing can provide greater taxonomic insights, particularly with regard to elucidating the functional potential of the microbes present and the ability to generate metagenome-assembled genomes (MAGs) [26]. These MAGs provide essentially-complete genome information for key taxa present in the sample, revealing their functional and safety related properties [27]. However, shotgun sequencing, as an untargeted approach, sequences all the DNA in the sample, including that from the human or animal host, which in the case of milk represents a considerable majority (up to 95%) of the DNA present [28]. To address this challenge, previous shotgun metagenomicbased studies of the human milk microbiome [1,29] carried out sequencing at great depth, followed by post-processing of data to remove non-microbial reads using bioinformatic approaches. In addition to generating large numbers of reads that need to be discarded, the limited number of microbial reads generated prevents the detection of low abundance species [30].
Many recent studies involving clinical samples with high levels of host DNA, such as saliva, sputum and joint uid, have investigated depleting host DNA through commercially available kits or other chemical methods [31][32][33][34]. However, such approaches have not been applied to milk. To address this technological caveat, the aim of this study was to evaluate methods of host DNA depletion/microbial DNA enrichment to optimise shotgun sequencing-based analysis of the milk microbiome.

E ciency of methods for improving the proportion of microbial reads
We evaluated different approaches to optimise shotgun metagenomic sequencing-based analysis of the microbiomes of milk. Seven approaches were provisionally tested to assess their suitability for extracting DNA from low volumes of material (Additional le 1: Figure S1). From these, only the DNeasy PowerSoil Pro (PS) and MolYsis complete5 (ML) kits yielded su cient DNA concentrations. Therefore, the three methods presented in this study that were used for further evaluation with improved starting volumes were the (i) PS kit, which is not speci cally designed to enrich microbial DNA or deplete host DNA, (ii) the PS approach but used in combination with the NEBNext Microbiome Enrichment kit (NEB) and (iii) ML kit, designed to decrease the amount of host DNA present.
Microbial DNA was extracted from each of six bovine and six human milk samples using the three different methods (Fig. 1). The extracted DNA was subjected to shotgun metagenomic sequencing, generating an average of 3,972,773 (± 383,371) high quality paired end reads per sample (Additional le 2: Table S1), Although the DNA extraction approach taken did not signi cantly impact on the total number of reads per sample, the proportions of host to microbial reads differed considerably (Fig. 2a). More speci cally, the ML kit yielded a signi cantly higher (P < 0.05) % of microbial reads (Fig. 2b), i.e., 38.31% average (range 2.01 -93.12%), relative to the other two methods, i.e., NEB kit 12.45% average, (range 1.03 -41.63%) and PS kit 8.54% average (range 1.22 -30.28%).
Differential performance of taxonomic classi cation algorithms on spiked samples Bovine and human milk samples (BMS12 and HMS12) were spiked with a mock community of ten strains, present at equal proportions, to investigate the potential for the introduction of bias due to differences in extraction e ciencies from different bacteria and to inform the choice of bioinformatic classi cation tools. Negative extraction controls were also included to investigate the potential for kit contamination.
After sequencing and quality control of the sequences, taxonomic classi cation was performed using three different tools, Kaiju, Kraken2, and MetaPhlAn2. Overall, the top 20 species detected accounted for 48.7% of the species identi ed by Kaiju, 98.3% of the species identi ed by Kraken2, and 61.3% of the species identi ed by MetaPhlAn2 (Fig. 3a). The mock community accounted for an average relative abundance of 29.2% of the species identi ed by Kaiju, 96.6% of the species classi ed by Kraken2 and 58.1% of the species identi ed by MetaPhlAn2. All classi ers detected the 10 mock community species, although at varying relative abundances. In general, all three tools showed an under-representation of Bi dobacterium adolescentis (0.03 -0.49%) and Clostridium beijerinckii (0.08 -1.04%) in samples and an over-representation of Deinococcus radiodurans (12.29 -38.16%) (Fig. 3a, 3b). Both Kraken2 and MetaPhlAn2 produced similar mock community composition pro les, albeit with Kraken2 assigning a much higher % of reads. Both Kaiju and Kraken2 assigned reads within the negative control samples, while MetaPhlAn2 only assigned reads from one of the three negative control samples. The taxa identi ed were distinct from those detected in milk samples (without spiked controls) (Additional le 3: Table S2).
When the abundances of the 10 mock community strains were investigated in greater detail, Kraken2 and MetaPhlAn2 were found to assign the strains at relative abundances closer to that expected for the mock community than Kaiju. Kraken2 identi ed Escherichia coli, Enterococcus faecalis, D. radiodurans, and Bacillus cereus at a higher abundance than expected, while MetaPhlAn2 appeared to over-represent both E. faecalis and D. radiodurans (Fig 3b). When the Bray-Curtis dissimilarity of all mock community samples were plotted in individual PCoA plots by taxonomic rank, the overall dissimilarity increased further down the taxonomic ranks, but communities classi ed with Kraken2 consistently clustered closer to the mock community from phylum to species (Fig. 3c). At the highest rank (phylum), Kraken2 clustered closest to the mock community and signi cant differences were observed between the Kraken2 and MetaPhlAn2 communities (P = 0.012). Ultimately, Kraken2 was selected for further downstream analysis as it performed best in terms of overall correct assignment of mock community DNA.
The spiked mock community samples were additionally compared using Kraken2-classi ed metagenomic reads with 16S rRNA amplicon reads. The determined community composition, using these two different sequencing approaches, was similar, with small differences in relative abundances (Fig. 3d).
Additional genera, including Acinetobacter and Lactococcus were consistently detected in bovine milk samples regardless of sequencing method.

Analysis of bovine and human milk samples -community structure and taxonomic and functional pro les
The species-level composition of bovine and human milk microbiome samples as assigned by Kraken2 was compared across the different extraction methods used (Fig. 4a). In bovine milk, aliquots of the same samples that were extracted using different methods clustered closely together and, in general, generated similar pro les, albeit with slightly varying relative abundances. However, ML extraction from BM7 led to the additional assignment of reads to Lactococcus lactis, Pseudomonas alkylphenolica and Acinetobacter baumannii at low abundances (<0.1%). For sample BM8, L. lactis was detected at higher relative abundances in the sample extracted with the ML kit (compared to the PS and NEB kit).
With regard to the human milk samples (Fig. 4b), with the exception of HM7, aliquots of the same sample that were extracted by different methods did not cluster together. HM8_ML was found to be more similar in species composition to HM10_NEB and HM11_ML, re ecting the detection of Staphylococcus epidermidis in these samples. Similarly, HM9_PS clustered with HM8_NEB, HM8_PS, HM11_NEB and HM11_PS re ecting the high relative abundance of Yersinia enterocolitica detected in these samples.
Despite the taxonomic compositional differences, no signi cant dissimilarity was observed between the methods for either bovine (PERMANOVA: P=0.997, R 2 =0.02362) or human (PERMANOVA: P=0.749, R 2 =0.08218) samples (Fig. 4c). The overall community diversity was not impacted by any method used, implying that there was no community-level bias introduced by any of the methods.
In terms of functional pro ling, the predicted microbiome function in bovine or human milk samples using the three sub-domains of gene ontology (cellular components, biological processes and molecular functions) was not signi cantly impacted by extraction method used (Additional le 4: Figure S2).

Characterization of the milk microbiome
Extraction of DNA from milk samples using the ML kit resulted in greater apparent species richness from an overall higher detection of unique observed species. This increase was signi cant (P < 0.05) among bovine milk samples (Fig. 5a). A total of 84 metagenome-assembled genomes (MAGs) were recovered across all samples, of which 19 were considered high quality, using the cut-off of > 90% completeness and < 5% contamination (Fig. 5b, Table 1). From among these 19, 11 were assembled with sequence data derived from DNA extracted using the ML kit, while only 4 high-quality MAGs were recovered from DNA extracted from each of the other two methods. The majority of the high-quality MAGs recovered were from milk samples with the spiked mock community but Corynebacterium pyruviciproducens and E. faecalis MAGs were also recovered from the HM10 sample extracted using the ML kit. b Contamination: ratio of observed single-copy marker genes in ≥2 copies to total single-copy marker genes in the chosen marker gene set [46] Discussion Shotgun metagenomic sequencing is a powerful method for the study of many different microbiome types. In order to maximise the potential of this method for characterizing speci c microbial populations, the removal of host DNA is necessary to allow for a greater sequencing depth of the microbial populations of interest. The present study evaluated three different approaches for their e ciency in host DNA depletion and, ultimately, their effectiveness with respect to studying the milk microbiome. Results showed that the ML kit gave signi cantly higher percentages of microbial reads when compared with the NEB kit and PS kit. The ML kit uses a chaotropic buffer that selectively lyses host cells, followed by the degradation and removal of the released host DNA using DNAse. The microbial DNA in the samples remains intact before being subjected to DNA extraction. This contrasts with the NEBNext Microbiome DNA Enrichment kit, which exploits the methylation of DNA at CpG sites in eukaryotic cells, using magnetic beads to selectively bind and remove CpG-methylated host DNA. Studies using the NEB kit for analysis of other microbiomes have shown varying results [31,33,34]. The protocol recommends an input of high molecular weight gDNA (≥15 kb for optimal performance) and this was not possible to achieve in the milk samples due to the limited sample volume, and their lower relative microbial abundance when compared to samples such as human stool. Thus, the ndings provided here relate to this method's performance in relation to milk samples only and should not be extrapolated to other sample types. The PS kit and similar kits have been widely used in various microbiome studies [10,22,35,36]. The PS kit uses chemical and mechanical lysis before elution using a silica membrane spin column. Although it has no speci c step for the depletion of host DNA, it has been found to be effective in terms of total DNA yield, which was evident in this study (Additional le 2: Table S1) [35].
In this study, we found that the choice of bioinformatics tool for classi cation had a greater in uence on community composition than the method of DNA extraction, as determined with the mock community analysis ( Fig. 3 and 4). Kraken2 and Kaiju are taxonomic binning tools, based on alignments with reference genomes. Kraken2, which was the tool that was ultimately chosen for downstream analysis, is a nucleotide sequence classi er that assigns taxonomic labels to DNA sequences. It uses an algorithm that relies on exact k-mer matches that records the species identi er for every k-mer in every genome. To classify a sequence, each k-mer in the sequence is mapped to the lowest-common ancestor (LCA) of the genomes that contain that k-mer in a database [37]. Kaiju works similar to Kraken2 except by classifying reads using a reference database comprising the annotated protein-coding genes of a set of microbial genomes. However, unlike Kraken2, Kaiju performed poorly in classifying the mock community samples in this study. MetaPhlAn2, unlike Kraken2 and Kaiju, is based on the alignments with species-speci c marker gene sequences. Quality-controlled reads are mapped to a database of unique clade-speci c marker genes with high discriminatory power, which are more optimized for human-associated gut microbial communities [38]. The relative abundances of each clade in the samples are then estimated with species-level resolution [39]. As MetaPhlAn2 is based on marker genes, considerable microbial sequencing depth is needed to be able to detect marker genes from low abundance organisms [40]. Although MetaPhlAn2 has been found to be sensitive for low complexity food microbiomes [41], its performance in this study to characterize the milk microbiome was not as e cient as Kraken2, particularly in terms of total assignment, which was lower for samples classi ed using MetaPhlAn2 compared to Kraken2. Taken together, these results con rm that the choice of bioinformatic tool has an important in uence on taxonomic composition, particularly in low abundance environments. The impact of choice of classi cation tools have already been reported in other studies [41].
The comparison of the methods used in this study showed that there were no biases in taxonomic composition and community structure when any method was used. Although the ML kit identi ed more unique species from the milk samples compared to the other two methods, there were no signi cant differences in beta diversity (Fig. 4c). Similar results were found in the analysis of functional data of the samples, where no particular method generated functional pro les that were distinct or signi cantly different (Additional le: Figure S1). In addition, we showed that metagenomic sequencing of milk samples is not limited in identi cation of taxa compared to the standard method of amplicon sequencing (Fig. 3d).
The ability to track microbes at strain level is of great importance and interest both in the food industry and in uncovering mother-to-infant microbial transfer [22,29]. The ability to generate high-quality MAGs from the ML processed samples is promising when considering this challenge. Apart from recovery of mock community strains, the increased microbial sequencing depth allowed for the assembly of highquality Corynebacterium pyruviciproducens and Enterococcus faecalis genomes from the metagenome of a human milk sample. This suggests that with greater microbial sequencing depth, the recovery of high-quality MAGs can uncover species not previously identi ed in the milk microbiome, particularly for species that are di cult to culture.
This study aimed to evaluate three different methods for improved microbial read sequencing depth, however certain limitations remain. The study was performed using a small number of samples, yet the current sample size was su cient to allow for the identi cation of signi cant differences between methods. Additionally, the microbial or host DNA concentration was not absolutely quanti ed, which could provide a greater insight into the ratios of host to microbial DNA found in the milk samples.

Conclusion
Overall, this evaluation has addressed two important problems in metagenomic sequencing of low diversity environments, speci cally poor sequencing economics and poor microbial sequence depth. The results show that the host depletion approach of the ML kit performed better than the enrichment or direct sequencing alternatives by providing the potential for deeper strain level analysis without an observable bias. Our study additionally highlighted the importance of the correct downstream bioinformatic tool selection to accurately characterize the milk microbiome.

Methods
Collection, storage and initial treatment of milk samples Five bovine raw milk samples and ve human milk samples that were previously frozen at -80 o C were thawed for use in this study. Bovine raw milk samples were obtained from farms across Ireland, taken from bulk milk tanks and transported to the laboratory on ice where a subsample was stored at -80 o C. Human milk samples were collected as part of the Microbe Mom study, following informed approved consent of mothers. These samples were collected within the rst month after delivery. The maximum possible volume of available samples was used in the study, which was 30 ml of bovine milk and 4 ml of human milk. For one bovine and one human milk sample, an additional aliquot was taken for use as positive controls through spiking with a mock community. Milk samples were prepared as follows: bovine milk and human milk samples were centrifuged at 4,500 x g for 20 minutes at 4 o C. After centrifugation, the cream and supernatant were discarded and the remaining pellets were subjected to two washing steps, where the pellets were resuspended in sterile PBS and centrifuged at 13,000 x g for 1 minute, after which the supernatant was discarded. After the washing steps, the pellets were resuspended in 2 ml sterile PBS (Fig. 1). For the positive control milk samples, the pellets were resuspended in 1.8 ml of sterile PBS and then spiked with 200 µl of a mock community (see below for further details) and mixed thoroughly to ensure homogenization. All 12 samples were divided into two 1.5 ml tubes with 1 ml each for use in the subsequent DNA extractions (Fig. 1).
Description of mock community (positive control) and negative controls utilised . Metagenome-assembled genomes (MAGs) were assembled using MEGAHIT, followed by binning using MetaBAT2 and quality-checking using checkM [46]. Kaiju was used in the taxonomic classi cation of MAGs. High-quality bins were classi ed as >90% complete with <5% contamination as suggested by Bowers et al. [27]. For 16S rRNA amplicon sequencing, data was processed as previously described [47] and taxonomy was assigned using BLAST against the SILVA SSURef database release 132. All further bioinformatics analyses, data visualisation and statistics were done in R using vegan, ggplot2 and pheatmap packages. Figure 1 Experimental procedure used for the study. Process for skimming both bovine and human milk samples is indicated on the left. Direct extraction, host depletion, or microbial enrichment approaches are indicated on the right. Comparison of the mean microbial reads of all samples within one extraction method to another. Asterisk denotes signi cant differences of p value < 0.05 as determined using Student's t-test.   Bovine and human milk samples had similar species level composition and community diversity. Top 20 species composition of all test bovine (a) and human (b) milk samples using all three extraction methods and clustered based on Bray-Curtis dissimilarity. (c) Bray-Curtis dissimilarity plots for both bovine (left) and human (right) milk samples for microbial communities determined using all three methods.

Figures
Statistical community dissimilarities were calculated using ADONIS. The number of total and high-quality MAGs recovered from sequenced samples using either method. High-quality MAGs (hq) were greater than 90% completeness and less than 5% contamination.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.