Samples and DNA preparation
DNA obtained with the NWU lysis method (method H) was compared to DNA obtained with two commercially available kits. The comparison was made on caecum contents from an available mouse study (representing a gut sample) and on a previously collected, less complex clinical sputum sample, thus covering two sample types of differing complexity. The two commercial kits were the QIAamp DNA Microbiome Kit (Qiagen, Germany) and the GenElute™ Stool DNA Isolation Kit (Sigma, USA), labelled method Q and method S respectively. Both are bead-beating kits that combine chemical and mechanical lysis, which helps ensure that difficult-to-lyse bacteria are properly lysed during DNA preparation.
Sputum sample
For the evaluation of a low-complexity clinical sample, a sputum sample of sufficient volume was obtained from a concurrently running study. The sample had been snap frozen after collection and stored at -80°C until processing. The selected sample was divided into 12 x 250 µl aliquots before further processing. Aliquots were extracted in triplicate using either method Q or method S, while lysate was prepared with the NWU in-house cell lysis method with additional purification (carried out with the spin-column technology of method Q and labelled HQ) and without it (labelled H).
Extractions with methods Q and S were done according to the manufacturers’ instructions, with one exception. Freeze–thaw cycles may compromise bacterial integrity, and the benzonase treatment used in the host DNA removal protocol of method Q may then lead to a loss of exposed bacterial DNA. Because the samples had previously been frozen, the host DNA removal step was omitted from method Q. The crude lysate was prepared with the NWU in-house cell lysis method as previously described (Mutingwende et al., 2015). Lysis was carried out on the NWU lyser device: 250 µl of sample was mixed with 250 µl of a proprietary lysis buffer, and the LMT was placed on the lyser device, pre-set to 95°C and 3600 rpm, for 7 min. Bacterial cell lysis was thereby achieved concurrently through chemical, thermal and mechanical means (Mutingwende et al., 2015).
DNA concentration was measured using the Qubit 4 Fluorometer (Thermo Fisher Scientific, USA) with the Qubit BR assay kit (Thermo Fisher Scientific, USA), while quality was determined by spectrophotometry on a NanoDrop One (Thermo Fisher Scientific, USA). The integrity of the extracted genomic DNA was evaluated by electrophoresis on a 1.5% (w/v) agarose gel, visualised with GelRed® dye (Biotium, USA), with a 1 kb ladder as size reference standard.
Representative gut sample
The caecum content from a concurrently running mouse study was investigated as a complex sample. As part of the initial study, the caeca had been dissected, snap frozen and stored at -80°C prior to further analysis. To generate sufficient sample volume, three caeca from C3HeB/JeF mice were cut open and the content was added to phosphate buffered saline (PBS). The mixture was vortexed at high speed for 2 min to generate a homogenised sample; 250 µl of this mix was then used for DNA extraction and lysate preparation in triplicate. DNA extraction or lysate preparation, quality control and sample labelling were carried out as described under Sect. 2.1.1.
Amplicon library and flow cell preparation
Two sequencing runs were carried out: one for the sputum sample and one for the gut sample. For the sputum sample, a 16S rRNA sequencing library was constructed according to the 16S Barcoding Kit (SQK-RAB204) protocol (Oxford Nanopore Technologies, Oxford, UK) for sequencing on the ONT MinION platform. Library construction for the gut sample was performed according to the 16S metagenomics sequencing library preparation protocol (Illumina, San Diego, CA, USA) for sequencing on the Illumina MiSeq platform.
Sputum sample
Sequencing of the sputum sample was carried out at the North-West University using the ONT 16S Barcoding Kit (SQK-RAB204) according to the ONT protocol, the only difference being the use of an inhibitor-tolerant high-fidelity polymerase. Polymerase chain reaction (PCR) barcoding amplification was conducted on a C1000™ Thermal Cycler (Bio-Rad, USA). For each sample, a 50 µl reaction was prepared consisting of 1 µl (10 µM) of 16S barcode primer (Oxford Nanopore Technologies, Oxford, UK), 25 µl of Invitrogen Platinum SuperFi DNA Polymerase master mix (Thermo Fisher Scientific, USA), 10 µl of GC enhancer (Thermo Fisher Scientific, USA), 13 µl of nuclease-free water (Sigma-Aldrich, USA) and 1 µl of template DNA (10 ng/µl). In the case of the lysate produced by the NWU in-house cell lysis method, 1 µl of the lysate was added to the reaction mixture. PCR cycling conditions were 95°C for 1 min, followed by 25 cycles of denaturation at 95°C for 20 s, annealing at 55°C for 30 s and extension at 65°C for 2 min, with a final extension step at 65°C for 5 min before holding at 4°C. PCR products were cleaned using AMPure XP beads (Beckman Coulter, USA) and eluted in 10 µl of a buffer containing 10 mM Tris-HCl pH 8.0 and 50 mM NaCl. Following clean-up, 1 µl of each eluted sample was quantified using a Qubit fluorometer so that the barcoded DNA libraries could be pooled at an equal ratio. All barcoded libraries were pooled in the desired ratios to a total of 50–100 fmoles in 10 µl of 10 mM Tris-HCl pH 8.0 and 50 mM NaCl. Platform quality control (QC) was carried out using MinKNOW™ on two new R9.4.1 chemistry MinION™ flow cells before priming. In total, 75 µl of sequencing mix consisting of the DNA library, sequencing buffer and library loading beads was prepared according to the ONT protocol and added drop-wise via the SpotON sample port. The standard 48 h sequencing script was chosen with 1D live base calling.
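The equimolar pooling step described above can be sketched as follows. This is a minimal illustration, not the calculation used in the study: it assumes an average mass of 650 g/mol per base pair of double-stranded DNA and an amplicon length of about 1500 bp for the full-length 16S product, and the function names, barcode labels and concentrations are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per bp of dsDNA (common approximation, assumed here)


def ng_to_fmol(mass_ng, length_bp):
    """Convert a dsDNA mass in ng to fmol for a fragment of length_bp."""
    return mass_ng * 1e6 / (length_bp * AVG_BP_MASS)


def equimolar_volumes(conc_ng_per_ul, length_bp, total_fmol):
    """Return the volume (µl) of each library that contributes an equal
    share of total_fmol to the pool."""
    per_library_fmol = total_fmol / len(conc_ng_per_ul)
    volumes = {}
    for name, conc in conc_ng_per_ul.items():
        fmol_per_ul = ng_to_fmol(conc, length_bp)
        volumes[name] = per_library_fmol / fmol_per_ul
    return volumes


# Hypothetical Qubit readings (ng/µl) for three barcoded libraries
example = {"BC01": 40.0, "BC02": 20.0, "BC03": 80.0}
vols = equimolar_volumes(example, length_bp=1500, total_fmol=75)
for name, vol in vols.items():
    print(f"{name}: {vol:.2f} µl")
```

A library at half the concentration needs twice the volume, and the summed volumes would still have to fit the 10 µl pool described above.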
Representative gut sample
The V3 and V4 regions of the 16S rRNA gene from the isolated microbiome were amplified on the C1000™ Thermal Cycler (Bio-Rad, USA). For each sample, a 25 µl reaction was prepared consisting of 1 µl (5 µM) of forward primer, 1 µl (5 µM) of reverse primer, 12.5 µl of Invitrogen Platinum SuperFi DNA Polymerase master mix (Thermo Fisher Scientific, USA), 5 µl of GC enhancer (Thermo Fisher Scientific, USA), 4.5 µl of nuclease-free water (Sigma-Aldrich, USA) and 1 µl of template DNA (20 ng/µl). In the case of the lysate produced by the NWU in-house cell lysis method, 1 µl of the lysate was added to the reaction mixture. The following PCR conditions were used: initial denaturation at 98°C for 30 s, followed by 25 cycles of 98°C for 10 s, 55°C for 10 s and 72°C for 30 s, and a final elongation step at 72°C for 300 s. After amplification, PCR products were purified using the Agencourt AMPure XP PCR Purification kit (Beckman Coulter, USA) and transported to the Agricultural Research Council Biotechnology Platform (ARC-LNR), Pretoria, South Africa, for sequencing on the Illumina MiSeq platform according to the standard protocol.
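As a quick check of the reaction composition above, the per-reaction volumes can be summed and scaled into a master mix. The sketch below is illustrative only: the 10% pipetting overage, the choice of 12 reactions and the function name are assumptions and do not come from the protocol.

```python
# Per-reaction volumes (µl) taken from the 25 µl reaction described above
PER_REACTION_UL = {
    "forward primer (5 µM)": 1.0,
    "reverse primer (5 µM)": 1.0,
    "SuperFi master mix": 12.5,
    "GC enhancer": 5.0,
    "nuclease-free water": 4.5,
    "template DNA": 1.0,
}


def master_mix(n_reactions, overage=0.10):
    """Scale every component except the template for n reactions.

    The overage (assumed 10%) compensates for pipetting losses; template
    is left out because it is added to each tube separately.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in PER_REACTION_UL.items()
        if name != "template DNA"
    }


total_per_reaction = sum(PER_REACTION_UL.values())
mix = master_mix(n_reactions=12)  # hypothetical number of reactions

print(f"Volume per reaction: {total_per_reaction} µl")
for name, volume in mix.items():
    print(f"{name}: {volume} µl")
```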
Bioinformatics analysis
FAST5 files generated from the sputum sample sequencing runs on the MinION™ were base called and de-multiplexed, followed by removal of adapter and primer sequences, using ONT’s Guppy™ sequencing software (version 3.2.4). The resultant FASTQ files were then filtered with NanoFilt (https://github.com/wdecoster/nanofilt) (De Coster et al., 2018) to remove reads with a mean Phred quality score below 7 and reads shorter than 1200 bp or longer than 1500 bp. Taxonomy was assigned to sequences using the sintax command from USEARCH (Edgar, 2018), with a sintax cut-off of 0.8. Sequences were classified using the RDP 16S training set v16 (RTS) database, comprising 13,212 sequences belonging to 2,126 genera. Following classification, the output sintax files were processed with in-house R scripts to produce OTU tables and Excel summary files. The OTU table and a mapping file were then loaded into the MicrobiomeAnalyst online software suite for further evaluation, which included alpha and beta diversity determination, rarefaction curve generation and clustering analyses (Dhariwal et al., 2017; Chong et al., 2020).
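The quality and length filter described above can be illustrated with a short sketch. It is not NanoFilt's code: the function names are hypothetical, and the mean quality is computed by averaging per-base error probabilities rather than raw Phred scores, which is an assumption about how the read-level score is derived.

```python
import math


def mean_phred(quality_string, offset=33):
    """Mean read quality, averaged over error probabilities (not raw scores)."""
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def keep_read(sequence, quality_string, min_q=7, min_len=1200, max_len=1500):
    """Apply the stated thresholds: mean Q >= 7 and 1200 <= length <= 1500 bp."""
    if not (min_len <= len(sequence) <= max_len):
        return False
    return mean_phred(quality_string) >= min_q


def filter_fastq(lines):
    """Yield (header, sequence, quality) for reads that pass the filter.

    Expects plain 4-line FASTQ records (header, sequence, '+', quality).
    """
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, seq, _, qual = records[i:i + 4]
        if keep_read(seq, qual):
            yield header, seq, qual
```

In practice `filter_fastq` would be fed the lines of an open FASTQ file; the thresholds are exposed as parameters so that other cut-offs can be tried.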
FASTQ files produced by the Illumina MiSeq platform were processed and analysed using the mothur (version 1.42.3) shell program (Schloss et al., 2009). The sequences were aligned against the SILVA-based bacterial reference alignment (version July 2019). Chimeric reads were removed with the VSEARCH command in mothur, using the more abundant sequences as reference (dereplicate = t). Host DNA sequences and other undesirable sequences (chloroplasts, mitochondria, archaea and eukaryotes) were removed with the remove.lineage command. Sequences that shared a minimum of 97% pairwise nucleotide identity were clustered into operational taxonomic units (OTUs). OTUs were classified up to species level using the 16S rRNA RDP reference (version 16) and the classify.otu command in mothur. The abundance file obtained with the make.shared command and the taxonomy file obtained with the classify.otu command were exported for further analyses (Schloss et al., 2009).
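The effect of the lineage removal step can be illustrated with a small stand-in. This sketch is not mothur's remove.lineage implementation: it assumes taxonomy strings of the form "Rank(confidence);Rank(confidence);", and the sequence identifiers and lineages are made up for the example.

```python
# Taxa treated as unwanted, following the description above
UNWANTED = ("chloroplast", "mitochondria", "archaea", "eukaryota")


def is_unwanted(taxonomy):
    """True if any rank in a ';'-separated lineage matches an unwanted taxon."""
    ranks = [rank.split("(")[0].strip().lower() for rank in taxonomy.split(";")]
    return any(rank in UNWANTED for rank in ranks if rank)


def remove_lineages(assignments):
    """Keep only sequences whose lineage contains no unwanted taxon."""
    return {
        seq_id: taxonomy
        for seq_id, taxonomy in assignments.items()
        if not is_unwanted(taxonomy)
    }


# Hypothetical classifications for three sequences
example = {
    "seq1": "Bacteria(100);Firmicutes(100);Clostridia(99);",
    "seq2": "Bacteria(100);Cyanobacteria/Chloroplast(100);Chloroplast(100);",
    "seq3": "Archaea(100);Euryarchaeota(98);",
}
kept = remove_lineages(example)
print(sorted(kept))
```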
Statistical analysis
To evaluate the influence of each extraction method on DNA quantity/yield and quality (A260/A280 and A260/A230), one-way analysis of variance (ANOVA) was employed with Tukey’s post-hoc test for multiple pairwise comparisons [15]. Statistical testing was carried out using Statistica (v13.1) (Statsoft, Inc., USA) and visualised in GraphPad Prism (v8) (GraphPad, Inc., USA). Statistical analyses and visualisation of microbiome data were done with MicrobiomeAnalyst on default settings unless specified otherwise (Dhariwal et al., 2017; Chong et al., 2020). Data were normalised using total sum scaling (TSS). The alpha diversity of samples was measured by the observed, Chao1 and Shannon diversity indices (Corcoll et al., 2017). The observed and Chao1 indices act as measures of species richness. The observed species index measures the number of distinguishable taxa in every sample, whereas the Chao1 index is a qualitative measure which, besides species richness, also takes into account the ratio of singletons and hence gives more weight to rare species. The Shannon diversity index, on the other hand, measures both the richness and the evenness of the microbes in a given sample, and thus addresses whether the main genera/species in the sample are evenly distributed or dominated by a few taxa (Xia & Sun, 2017).
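The normalisation and alpha diversity measures named above can be written out compactly. The sketch below uses the textbook definitions and is not MicrobiomeAnalyst's code; in particular, the bias-corrected form of Chao1, which uses both singleton and doubleton counts, is an assumption, and the example counts are invented.

```python
import math


def tss_normalise(counts):
    """Total sum scaling: divide each count by the sample total."""
    total = sum(counts)
    return [c / total for c in counts]


def observed(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singletons and doubletons."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon index H' = -sum(p * ln p) over taxa with p > 0."""
    return -sum(p * math.log(p) for p in tss_normalise(counts) if p > 0)


# Hypothetical read counts per taxon for one sample
example = [10, 5, 2, 2, 1, 1, 1, 0]
print(observed(example), chao1(example), round(shannon(example), 3))
```

Chao1 exceeds the observed richness whenever a sample contains more than one singleton, which is how rare taxa gain weight in the estimate.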
Beta-diversity calculations, which describe the differences in microbial community composition between samples, were visualised using principal coordinate analysis (PCoA) plots. Beta-diversity PCoA plots were based on Bray–Curtis distances and compared using the nonparametric analysis of similarities (ANOSIM) test. Heat maps of the most abundant genera were generated using complete-linkage hierarchical clustering by Euclidean distance (Dhariwal et al., 2017; Xia & Sun, 2017; Chong et al., 2020). These tests were performed at the feature level with only the higher-abundance taxa (> 0.1% of total).
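The Bray–Curtis distance and the abundance filter can be sketched as follows. This is an illustration rather than the MicrobiomeAnalyst implementation: the 0.1% threshold is interpreted here as a taxon's share of all reads across samples, which is an assumption, and the sample names and counts are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Pairwise Bray-Curtis distances keyed by (sample_a, sample_b)."""
    names = list(samples)
    return {(a, b): bray_curtis(samples[a], samples[b])
            for a in names for b in names}


def abundant_taxa(samples, threshold=0.001):
    """Indices of taxa whose share of all reads exceeds the threshold."""
    n_taxa = len(next(iter(samples.values())))
    totals = [sum(s[i] for s in samples.values()) for i in range(n_taxa)]
    grand_total = sum(totals)
    return [i for i, t in enumerate(totals) if t / grand_total > threshold]


# Hypothetical read counts for four taxa in three samples
samples = {
    "Q1": [600, 300, 100, 0],
    "S1": [500, 400, 100, 0],
    "H1": [100, 200, 699, 1],
}
distances = distance_matrix(samples)
kept = abundant_taxa(samples)
print(distances[("Q1", "S1")], kept)
```

The resulting distance matrix is the input that an ordination such as PCoA or a test such as ANOSIM would take; both are left to dedicated software here.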
Ethics
This study was approved by the North-West University Research Ethics Committee under ethics numbers: NWU-00127-18-A1 and NWU-00584-19-A5.