Comparative Investigation of the Suitability of DNA Preparation Methods for Microbiome Analysis of Complex Clinical Samples

Efficient DNA preparation is essential for accurate and reproducible microbial data acquisition. In this study, an in-house method (NWU lysis system) involving a lysis micro tube (LMT) was evaluated for its suitability for safe, rapid and accurate characterisation of the bacterial microbiome associated with clinical samples, in comparison with two commercially available DNA extraction kits. The experiments confirmed that the LMT is suitable for downstream bacterial microbiome analyses based on short read sequencing; however, because the DNA it produces is fragmented, its suitability is highly dependent on the sequencing technology applied downstream. The study also demonstrated that shearing of DNA can have a significant impact on downstream microbiome analyses, depending on the sequencing technology used, and that care must be taken, especially if long read sequencing is to be employed.


Introduction
Since the onset of next generation sequencing (NGS) the field of clinical genetics has been rapidly changing (Vrijenhoek et al., 2015). The swift progression of NGS technology and its use in clinical laboratories has allowed for remarkable progress in the genetic diagnosis of both inherited disorders and infectious diseases, as well as the provision of methods for studying accompanying risk factors such as [...] Sequencing of the 16S gene is a commonly employed method to evaluate the microbial community in clinical samples (Pollock et al., 2018). Previous studies have emphasized that effective DNA preparation is crucial for the accurate characterisation of the complex bacterial populations often present in clinical samples, and that lysis efficiency is a major limitation, especially with samples containing hard to lyse bacteria (Lim et al., 2018). Several methods are currently available for effective DNA preparation prior to 16S sequencing, in the form of various commercial kits. The approaches to cellular lysis used by these commercial DNA preparation methods differ, and may rely on heat, chemical, enzymatic or mechanical lysis. The specific DNA isolation techniques also differ: some kits use spin columns or magnetic beads, while other approaches opt for direct DNA amplification of crude lysates, forgoing conventional DNA purification (Videvall et [...]
The NWU-TB test is a sensitive and specific molecular TB diagnostic tool. The NWU-TB system consists of three sequential steps: cell lysis in a LMT (Supplementary Fig. 2) using an automated lyser device (NWU lysis system; Supplementary Fig. 3), which completely inactivates TB within 7 minutes, followed by multiplex PCR within 25 min. The lysis step of the NWU-TB test is a highly efficient and cost effective process that is based on the use of heat, chemical and mechanical means to achieve cellular lysis.
In addition to its lysis efficacy, the lysis process is also extremely safe, with lysis occurring within the confines of a lysis micro tube, offering the potential for this system to be used with decreased biosafety measures (Mutingwende et al., 2015).
Authors of recent studies have emphasized that the inclusion of mechanical lysis in conjunction with other lysis methods is essential to minimize possible biases due to some microbial cells being more resistant to lysis than others (Lim et al., 2018; Pollock et al., 2018). The decision was thus made to evaluate DNA obtained by means of the NWU lysis method as a preferential tool for downstream microbiome analyses on two sample types: a clinical sputum sample, representing a lower complexity sample, evaluated with long read sequencing technology (ONT MinION); and a more complex gut sample evaluated with short read sequencing technology (Illumina MiSeq). The evaluation focussed on 16S amplicon-based sequencing, in comparison with DNA obtained using various commercially available kits.

Materials And Methods
Samples and DNA preparation
DNA obtained by means of the NWU lysis method (method H) was compared to DNA obtained by means of two available commercial kits on both the caecum contents from an available mouse study (representing a gut sample) and a previously collected, less complex clinical sputum sample, thus representing two sample types of varying complexity. The two commercial kits selected for comparison were the QIAamp DNA microbiome kit (Qiagen, Germany) and the GenElute™ Stool DNA Isolation Kit (Sigma, USA), labelled as methods Q and S respectively. These kits were selected on the basis that both are bead-beating kits, and thus include both chemical and mechanical lysis to ensure that difficult to lyse bacteria are properly lysed during the DNA preparation process.

Sputum sample
For the evaluation of a low complexity clinical sample, a sputum sample of sufficient volume was collected from a concurrently running study. The sample was initially collected, snap frozen and stored at -80°C until processing. The selected sample was divided into 12 x 250 µl aliquots before further processing. Aliquots were extracted in triplicate using either method Q or S, while lysate was prepared with the NWU in-house cell lysis method (H) with and without additional purification (carried out with the spin column technology of method Q and labelled as HQ).
Extractions with methods Q and S were done according to the manufacturers' instructions. Freeze-thaw cycles may compromise bacterial integrity, and the benzonase treatment used during the host DNA removal protocol of method Q may lead to a loss of exposed bacterial DNA. The decision was thus made to omit the host DNA removal step from method Q, given that the samples had previously been frozen. Preparation of the crude lysate using the NWU in-house cell lysis method was done according to previously described methods (Mutingwende et al., 2015). For the NWU cell lysis method, lysis was carried out using the NWU lyser device: 250 µl of sample was mixed with 250 µl of a proprietary lysis buffer. The LMT was placed on the pre-set (95°C and 3600 rpm) lyser device for 7 min. Bacterial cell lysis was concurrently achieved through chemical, thermal and mechanical means (Mutingwende et al., 2015).
DNA concentration was assessed using the Qubit 4 Fluorometer (Thermo Fisher Scientific, USA) along with the Qubit BR assay kit (Thermo Fisher Scientific, USA), while quality was determined by spectrophotometry on a NanoDrop One (Thermo Fisher Scientific, USA). The integrity of extracted genomic DNA was evaluated by visualization on a 1.5% (w/v) agarose gel using GelRed® dye (Biotium, USA), after electrophoresis in the presence of a 1 kb ladder as size reference standard.

Representative gut sample
The caecum content from a concurrently running mouse study was investigated as a complex sample. As part of the initial study the caecums were dissected, snap frozen and stored at -80°C prior to further analysis. To generate sufficient sample volume, three caecums from C3HeB/JeF mice were cut open and the content was added to phosphate buffered saline (PBS). The mixture was vortexed at high speed for 2 minutes to generate a homogenized sample; 250 µl of this mix was then used for DNA extraction and lysate preparation in triplicate. DNA was extracted and/or prepared and subjected to quality control, and samples were labelled as described under Sect. 2.1.1.
Amplicon library and flow cell preparation
Two sequencing runs were carried out: one for the sputum sample and one for the gut sample. For the sputum sample a 16S rRNA sequencing library was constructed according to the 16S Barcoding Kit (SQK-RAB204) protocol (Oxford Nanopore Technologies, Oxford, UK) for sequencing on the ONT MinION platform. Library construction for the gut sample was performed according to the 16S metagenomics sequencing library preparation protocol (Illumina, San Diego, CA, USA) for sequencing on the Illumina MiSeq platform.

Sputum sample
Sequencing of the sputum sample was carried out at the North-West University using the ONT 16S Barcoding Kit (SQK-RAB204) according to the ONT protocol, with the only difference being the use of an inhibitor tolerant high-fidelity polymerase. Polymerase chain reaction (PCR) barcoding amplification was conducted on a C1000™ Thermal Cycler (Bio-Rad, US). A 50 µl reaction volume was prepared for each sample, consisting of 1 µl (10 µM) of 16S barcode primer (Oxford Nanopore Technologies, Oxford, UK), 25 µl of Invitrogen Platinum SuperFi DNA Polymerase master mix (Thermo Fisher Scientific, USA), 10 µl of GC enhancer (Thermo Fisher Scientific, USA), 13 µl nuclease-free water (Sigma-Aldrich, USA) and 1 µl of template DNA (10 ng/µl). In the case of the lysate produced by the NWU in-house cell lysis method, 1 µl of sample was added to the reaction mixture. PCR cycling conditions were set at 95°C for 1 min, followed by 25 cycles of denaturation at 95°C for 20 s, annealing at 55°C for 30 s and extension at 65°C for 2 min, and a final extension step of 65°C for 5 min before holding at 4°C. PCR products were cleaned using AMPure XP beads (Beckman Coulter, USA) and eluted in 10 µl of a buffer containing 10 mM Tris-HCl pH 8.0 and 50 mM NaCl. Following PCR, 1 µl of eluted sample was quantified using a Qubit fluorometer in order to pool the barcoded DNA libraries at an equal ratio. All barcoded libraries were pooled in the desired ratios to a total of 50-100 fmoles in 10 µl of 10 mM Tris-HCl pH 8.0 and 50 mM NaCl. Platform quality control (QC) was carried out using MinKNOW™ on 2 new R9.4.1 chemistry MinION™ flow cells before the flow cell was primed. In total 75 µl of sequencing mix consisting of the DNA library, sequencing buffer and library loading beads was prepared according to the ONT protocol and added in a drop-wise fashion via the SpotON sample port. The standard 48 h sequencing script was chosen with 1D live base calling.
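The equimolar pooling step described above can be illustrated with a short Python sketch. It is not taken from the ONT protocol: the function names and the example concentrations are hypothetical, the amplicon length of 1500 bp is assumed from the full-length 16S amplicon, and the conversion uses the common approximation of 660 g/mol per base pair of double stranded DNA.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (common approximation, assumed here)


def ng_per_ul_to_fmol_per_ul(conc_ng_per_ul, amplicon_bp):
    """Convert a mass concentration (ng/µl) into a molar concentration (fmol/µl)."""
    # 1 ng = 1e-9 g and 1 mol = 1e15 fmol, hence the factor 1e6.
    return conc_ng_per_ul * 1e6 / (amplicon_bp * AVG_BP_MASS)


def equimolar_volumes(concs_ng_per_ul, amplicon_bp, total_fmol):
    """Volume (µl) to take from each library so that all contribute equal fmol."""
    per_library_fmol = total_fmol / len(concs_ng_per_ul)
    return [
        per_library_fmol / ng_per_ul_to_fmol_per_ul(conc, amplicon_bp)
        for conc in concs_ng_per_ul
    ]


if __name__ == "__main__":
    # Hypothetical Qubit readings (ng/µl) for three barcoded libraries.
    readings = [12.0, 8.5, 15.2]
    for conc, vol in zip(readings, equimolar_volumes(readings, 1500, 100)):
        print(f"{conc:5.1f} ng/µl -> {vol:.2f} µl")
```

In practice the pooled volume would still need to be adjusted to the 10 µl stated in the text, for example by diluting or concentrating the pool; that step is not modelled here.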

Representative gut sample
The V3 and V4 regions of the 16S rRNA gene from the isolated microbiome were amplified on the C1000™ Thermal Cycler (Bio-Rad, US). A 25 µl reaction volume was prepared for each sample, consisting of 1 µl (5 µM) of forward primer, 1 µl (5 µM) of reverse primer, 12.5 µl of Invitrogen Platinum SuperFi DNA Polymerase master mix (Thermo Fisher Scientific, USA), 5 µl of GC enhancer (Thermo Fisher Scientific, USA), 4.5 µl nuclease-free water (Sigma-Aldrich, USA) and 1 µl of template DNA (20 ng/µl). In the case of the lysate produced by the NWU in-house cell lysis method, 1 µl of sample was added to the reaction mixture. The following PCR conditions were used during amplification: initial denaturation at 98°C for 30 s, followed by 25 cycles of 98°C for 10 s, 55°C for 10 s and 72°C for 30 s, and a final elongation step at 72°C for 300 s. After amplification, PCR products were purified using the Agencourt AMPure XP PCR Purification kit

Bioinformatics analysis
Fast5 files generated from the sputum sample sequencing runs on the MinION™ were base called and demultiplexed, followed by removal of adapter and primer sequences using ONT's Guppy™ sequencing software (version 3.2.4). The resultant fastq files were then filtered to remove reads with a Phred score below 7, and reads shorter than 1200 bp or longer than 1500 bp, using NanoFilt (https://github.com/wdecoster/nanofilt) (De Coster et al., 2018). Taxonomy was assigned to sequences using the sintax command from Usearch (Edgar, 2018), with a sintax cut-off of 0.8. Sequences were classified using the RDP 16S training set v16 (RTS) database comprising 13,212 sequences belonging to 2,126 genera. Following classification, the output sintax files were processed with in-house R scripts to produce OTU tables and Excel summary files. The OTU table along with a mapping file were then fed into the MicrobiomeAnalyst online software suite for further evaluation, which included alpha and beta
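The read filtering criteria stated above (Phred score of at least 7, length between 1200 and 1500 bp) can be sketched in Python as follows. This is an illustration only and not the NanoFilt implementation: the function names are hypothetical, records are assumed to be plain four-line FASTQ entries with Phred+33 encoding, and the mean quality is assumed to be obtained by averaging the per-base error probabilities.

```python
import math


def mean_phred(quality_string, offset=33):
    """Mean read quality, averaged over per-base error probabilities (assumption)."""
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def parse_fastq(lines):
    """Yield (header, sequence, quality) from well-formed four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = next(it).strip()
        next(it)  # the '+' separator line
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_reads(records, min_q=7, min_len=1200, max_len=1500):
    """Keep reads whose length and mean quality fall inside the stated limits."""
    for header, seq, qual in records:
        if min_len <= len(seq) <= max_len and mean_phred(qual) >= min_q:
            yield header, seq, qual


if __name__ == "__main__":
    # "reads.fastq" is a hypothetical input file name.
    with open("reads.fastq") as handle:
        kept = list(filter_reads(parse_fastq(handle)))
    print(f"{len(kept)} reads passed the filter")
```

The length window reflects the expected size of a full-length 16S amplicon, so reads outside it are more likely to be truncated products or chimeras than genuine 16S sequences.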

Statistical analysis
To evaluate the influence of each extraction method on DNA quantity/yield and quality (A260/A280 and A260/A230), one-way analysis of variance (ANOVA) was employed with Tukey's post-hoc test for multiple pairwise comparisons [15]. Statistical testing was carried out using Statistica (v13. [...] whereas the Chao1 index is a qualitative measure which, besides species richness, also takes into account the ratio of singletons, and hence gives more weight to rare species. The Shannon diversity index, on the other hand, is a measure of both the richness and the evenness of the microbes in a given sample, thus addressing the question of whether the main genera/species found in the sample are evenly distributed or dominated by a few taxa (Xia & Sun, 2017).
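The alpha-diversity measures discussed above can be made concrete with a minimal Python sketch. It is not the MicrobiomeAnalyst implementation: the function names and the example counts are hypothetical, Chao1 is written in its classic form with the bias-corrected variant used when no doubletons are present, and the Shannon index uses the natural logarithm.

```python
import math


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness estimate from singleton (F1) and doubleton (F2) counts."""
    s_obs = observed_richness(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    otu_counts = [120, 45, 3, 2, 1, 1, 0]  # hypothetical OTU counts for one sample
    print(observed_richness(otu_counts), chao1(otu_counts), round(shannon(otu_counts), 3))
```

Because Chao1 adds a term driven by singletons and doubletons, two samples with the same observed richness can receive different estimates when one of them contains more rare taxa.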
Beta-diversity, which describes the difference in microbial community composition between samples, was visualized using principal coordinate analysis (PCoA) plots. Beta-diversity PCoA plots were based on Bray-Curtis distances, and compared using the nonparametric analysis of similarities (ANOSIM) test. Heat maps of the most abundant genera classified to the genus level were generated
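As an illustration of the Bray-Curtis distance on which the PCoA plots are based, a minimal Python sketch is given below. The function names and the example count vectors are hypothetical, and the sketch works on raw counts; any normalisation applied by MicrobiomeAnalyst beforehand is not modelled.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equally long abundance vectors."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Hypothetical genus-level counts for three samples.
    samples = [[6, 4, 0], [2, 8, 0], [0, 1, 9]]
    for row in distance_matrix(samples):
        print([round(d, 2) for d in row])
```

The resulting matrix ranges from 0 (identical composition) to 1 (no shared taxa) and is the input for both the PCoA ordination and the ANOSIM test.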

Yield and quality of prepared nucleic acids
The concentration and quality of the prepared nucleic acids were compared for the various methods used for both sample types, as summarised in Table 1. Method H without the addition of a purification step yields a crude lysate and thus no quality assessments could be obtained. Again all had acceptable A260/A280 absorbance ratios between 1.7 and 2.0, while only method Q had A260/A230 ratios within the acceptable range. DNA integrity was assessed by agarose gel electrophoresis and revealed similar results to those obtained with the selected sputum sample.
NGS analyses of bacterial diversity and composition of samples from various DNA preparation methods
NGS analyses of bacterial diversity and composition from various DNA preparation methods were carried out on sequencing reads produced for both sample types.

Sputum sample
A single sequencing run on a MinION™ flow cell of R9.4 chemistry produced 5 169 844 reads from 12 DNA barcodes. Platform quality control (QC) analysis preceding the sequencing run revealed a total of 1356 available pores, split into 4 groups. The mean read length was 1352 base pairs (bp), with an average Phred score of 10.01. The alpha diversity based on the number of observed species, Chao1 and Shannon diversity indices was compared according to the DNA preparation method (Fig. 1A). As seen in Fig. 1A, there was no significant difference between methods HQ, Q and H based on the Shannon, Chao1 and observed indices, while method S ranked significantly higher than any of the other methods for all three alpha diversity measures. Clustering analyses were performed by means of a hierarchical clustering heat map (Fig. 2), which revealed three clusters: one consisting of methods H and HQ, and one each for methods Q and S. In samples extracted with method S, Gram-negative bacterial genera were more abundant, while samples extracted with method Q showed an increased abundance of Gram-positive bacterial genera.
Principal coordinate analysis of beta-diversity yielded similar results, showing significant clustering according to the method. Methods H and HQ clustered together, while methods Q and S formed unique clusters (Supplementary Fig. 1(A)). Analyses of similarity confirmed significant variability among the preparation methods ([ANOSIM] R: 0.74691; p-value < 0.001).
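To clarify how an ANOSIM R value such as the one reported above is obtained, the statistic can be sketched in Python. This is a generic illustration and not the MicrobiomeAnalyst code: the function names and the example distance matrix are hypothetical, tied distances are assumed to receive averaged ranks, and the p-value is estimated by permuting the group labels.

```python
import random
from itertools import combinations


def average_ranks(values):
    """1-based ranks of the values, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    m = len(pairs)  # M = n(n-1)/2 pairwise distances
    return (sum(between) / len(between) - sum(within) / len(within)) / (m / 2)


def anosim_p(dist, groups, permutations=999, seed=0):
    """Permutation p-value: share of label shuffles with R at least as large."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Hypothetical distances for four samples from two preparation methods.
    dist = [[0, 1, 5, 6], [1, 0, 7, 8], [5, 7, 0, 2], [6, 8, 2, 0]]
    groups = ["H", "H", "S", "S"]
    print(anosim_r(dist, groups), anosim_p(dist, groups))
```

An R close to 1 indicates that samples within a method are more similar to each other than to samples from other methods, whereas values near 0 or below indicate no grouping by method.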

Gut sample
The resulting run on the Illumina MiSeq platform produced 4 582 278 reads for 12 samples, with an average quality of 32.58 and an average length of 453 bp. Again the alpha diversity based on the number of observed species, Chao1 and Shannon diversity indices was compared according to the DNA preparation method (Fig. 1B). For the representative gut sample, there was no significant difference between methods H, HQ, Q and S based on the Shannon, Chao1 and observed indices. Clustering analyses performed by means of a hierarchical clustering heat map revealed random clustering among samples, also suggesting no statistically significant difference based on the various preparation methods (Fig. 3). This was again confirmed by principal coordinate analysis of beta-diversity (Supplementary Fig. 1(B)), which revealed no statistically significant difference between samples based on preparation method. Analyses of similarity confirmed no significant variability among the preparation methods ([ANOSIM] R: -0.11728; p-value < 0.759).

Discussion
The A260/A280 ratio is viewed as the primary measure of DNA purity, and DNA with a ratio between 1. [...] Fragmentation of DNA was assessed by agarose gel electrophoresis. The DNA extracted with methods S and Q yielded high molecular weight DNA. Method H was originally designed to form part of a TB diagnostic system, and as such the lysis procedure is incredibly robust, primarily producing sheared single stranded DNA. The method was originally incorporated with real time qPCR, involving the amplification of short fragments of DNA, and the potential impact of shearing on larger PCR products has not been previously investigated (Kennedy et al., 2009).
NGS analyses of bacterial diversity and composition were compared between different DNA preparation methods and with two sequencing approaches. Evaluation of a representative gut sample with short read sequencing on the Illumina MiSeq platform indicated no statistically significant difference between any of the DNA preparation methods in the resulting microbiome analyses.
Analysis of the sputum sample with long read sequencing, on the other hand, showed a statistically significant difference when compared to one of the commercial methods (S). When comparing methods H and HQ it is evident that purification of the DNA obtained with method H did not have any significant impact on the end results, and that the differences observed were the result of the level of integrity of the DNA and the sequencing method applied. The DNA obtained with methods H, HQ and Q was consistently sheared, which may not be a problem when sequencing short fragments; but for single molecule or long-read sequencing technologies, particularly those from Pacific Biosciences (PacBio) and ONT, this can be a major issue (Lim et al., 2018). The MinION 16S workflow incorporates full length 16S amplicons of around 1500 bp, and it would thus follow that a workflow incorporating larger amplicons will fail to capture the true diversity of a sample if there is a low amount of intact DNA, as illustrated by the current results.
A limitation of the present study was the use of only one representative sample for each evaluation. It was therefore not possible to determine whether the variation between the preparation methods was less than the inter-subject variation. According to previous studies the inter-subject variation generally tends to be greater than the technical variation. Hence, it is expected that analysis of additional samples would be in accordance with those of prior accounts ([...] Various studies tend to use mock microbial samples of known composition to evaluate new methods, but these mock samples also have their limitations: they are often presented in a liquid or broth and as such lack the complexity of real biological samples such as sputum, in which the viscosity of the sample is a major stumbling block. Furthermore, mock microbial communities tend to have a lower degree of diversity when compared to actual clinical samples, which harbour a much more diverse bacterial population. Given the growing significance of the microbiome, new products and tools are consistently being developed to study the human microbiome and its various relations to health and disease (Mohajeri et al., 2018). Despite the mentioned limitations, the current study provides useful information, describing the potential of a LMT to carry out downstream microbiome analyses with improved speed and safety.

Conclusion
To conclude, the NWU lysis system is a promising prospect for clinical microbiome studies due to its lysis efficiency. In addition to lysis efficiency, the system also provides improved safety because lysis of clinical specimens takes place within a closed container. In its current form, though, the lysis method is only compatible with short read sequencing because of the sheared nature of the DNA produced, and further optimisation will be required before the lysis step can be incorporated into a portable sequencing technology such as the ONT MinION. In addition, this study highlights the importance of evaluating the integrity of DNA, as shearing may greatly impact the final composition of the microbiome depending on the sequencing technology employed.

Declarations
Funding