Human sample collection and processing for viral and microbial metagenomics
Saliva samples (≈ 5 mL) were obtained from 26 volunteers: 16 healthy controls, 7 IgA-deficient patients and 3 patients with common variable immunodeficiency (CVID; deficiency of IgA, IgG and/or IgM). All volunteers signed an informed consent form indicating their willingness to participate in this study. None of the controls or patients had periodontal disease. The patients were randomly selected (53% male and 47% female; age range 11–65) from a cohort of patients with a confirmed diagnosis of immunodeficiency at the Primary Immunodeficiency Unit of the Hospital 12 de Octubre (Madrid, Spain). Healthy control samples were collected on 18 January 2016, while samples from IgA-deficient and CVID patients were collected on 31 January 2017. In all cases, saliva was collected in the morning, before breakfast and oral hygiene, and immediately brought to the laboratory on ice. Drinking was not permitted before collection, so that microbial and viral abundance could be assessed under the same conditions. Samples were vortexed for 2 minutes at maximum speed and then processed for microbial and viral metagenomics and fluorescence microscopy. For microbial metagenomics, 450 µl of each sample was stored at -80ºC until use. DNA extraction was performed with the MasterPure™ Complete DNA and RNA Purification Kit (Epibio) following the manufacturer's protocol.
For viral metagenomics, DNA extractions were performed on the same day as sample collection, as follows. Saliva was centrifuged at 5000 g for 10 min at 4ºC and the supernatant was sequentially filtered through 0.45 µm and 0.2 µm PES filters (Millipore). Free DNA present in saliva was removed using 50 U of Turbo DNase I (Ambion, Invitrogen) at 37ºC for 1 h. Finally, viral nucleic acids were extracted from 500 µl of DNase-treated saliva with the QIAamp® UltraSens® Virus Kit (Cat. Nº 53704, QIAGEN) according to the manufacturer’s protocol.
The quality and quantity of DNA extracted from viral and microbial samples were checked by fluorimetry in a Qubit instrument (Invitrogen) and by gel electrophoresis.
Sequencing, assembly, annotation and metagenome analyses
Microbial and viral metagenomes were sequenced by Illumina technology using the Nextera XT DNA library kit (ref. FC-131-1024, Illumina) in a MiSeq sequencer (2x250, paired-end) according to the manufacturer’s protocol. Raw reads were quality filtered using the prinseq-lite program 24 with the following parameters: min_length: 50, trim_qual_right: 20, trim_qual_type: mean, and trim_qual_window: 20. Assembly was then performed with SPAdes version 3.6.1 25 using the metaSPAdes option and applying the following parameters: -k 33,55,77,99,127.
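As a rough illustration of the filtering and assembly settings listed above, the following Python sketch builds the two command lines without running them. The file names and output locations are hypothetical placeholders. The option spellings (for example `-min_len` for the minimum length and `--meta` for the metaSPAdes mode) are assumed from the prinseq-lite and SPAdes command-line interfaces and are not necessarily the exact invocations used in this study.

```python
import shlex


def prinseq_command(fastq_r1, fastq_r2, out_prefix):
    """Build a prinseq-lite call with the thresholds given in the text."""
    return [
        "prinseq-lite.pl",
        "-fastq", fastq_r1,
        "-fastq2", fastq_r2,
        "-min_len", "50",            # min_length: 50
        "-trim_qual_right", "20",    # trim 3' ends below quality 20
        "-trim_qual_type", "mean",   # use the mean quality of the window
        "-trim_qual_window", "20",   # window size of 20 bases
        "-out_good", out_prefix,     # hypothetical output prefix
    ]


def spades_command(fastq_r1, fastq_r2, out_dir):
    """Build a SPAdes call in metagenomic mode with the k-mer list from the text."""
    return [
        "spades.py",
        "--meta",
        "-1", fastq_r1,
        "-2", fastq_r2,
        "-k", "33,55,77,99,127",
        "-o", out_dir,               # hypothetical output directory
    ]


# Placeholder file names, for illustration only.
print(shlex.join(prinseq_command("sample_R1.fastq", "sample_R2.fastq", "sample_filtered")))
print(shlex.join(spades_command("filtered_R1.fastq", "filtered_R2.fastq", "sample_assembly")))
```

Building the commands as argument lists keeps the parameters in one place and makes them easy to log alongside the results.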
General automated annotation was done at the IMG-JGI bioinformatic platform 26. Manual annotation was also done in house by comparing proteins predicted with the Prodigal program 27 against the NR database (NCBI) with BLAST version 2.5.0+ 28 and against the Pfam database using the HMMER package 29. PCA plots of COG, Pfam and KO clustering between samples were done with the publicly available bioinformatic tools at IMG-JGI 26.
The Metafast program was used with default parameters to compute pairwise distances between raw reads from unassembled metagenomes 30. Best-score hit analysis with BLAST 28 of predicted viral genes against > 700,000 viral genomes available in the IMG/VR database 31 was performed in house. Genes belonging to COG3583 from the metagenomes were downloaded from the IMG database and compared against the MEROPS database available at EMBL-EBI (https://www.ebi.ac.uk/merops/).
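The best-score hit step can be sketched as follows, assuming the BLAST results are in the standard 12-column tabular format (`-outfmt 6`) and that "best" means the highest bit score per query gene. Both points are assumptions, since the output format and tie-breaking rule are not stated here, and the gene and virus identifiers in the example are invented.

```python
import csv
import io

# Column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle):
    """Return {query: (subject, bitscore)} keeping the top bit score per query.

    On ties, the first hit encountered is kept.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        record = dict(zip(BLAST6_COLUMNS, row))
        query = record["qseqid"]
        score = float(record["bitscore"])
        if query not in best or score > best[query][1]:
            best[query] = (record["sseqid"], score)
    return best


# Invented example rows: two hits for gene_1, one for gene_2.
EXAMPLE = (
    "gene_1\tvirus_A\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-40\t150\n"
    "gene_1\tvirus_B\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t180\n"
    "gene_2\tvirus_C\t80.0\t90\t18\t0\t1\t90\t5\t94\t1e-20\t95.5\n"
)

for gene, (virus, score) in best_hits(io.StringIO(EXAMPLE)).items():
    print(gene, virus, score)
```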
qPCR experiments of COG3583
Specific primers for the detected genes belonging to COG3583 (n = 254) in microbial metagenomes were designed with the Primer3 program implemented in the Geneious bioinformatic package 32. Two primer sets targeting different gene variants of COG3583 observed after manual alignment were used: primer set 1, 197F (5´CAGTCATGGCTGATGGTGCA3´) and 544R (5´CAGTATTCACTGCACGGCT3´); and primer set 2, 60F (5´AACAGCTGTAACTATGACAGGT3´) and 143R (5´GTTCTAACAGTGTGTGCTGC3´). Real-time PCR conditions were as follows: 25 µl final reaction volume with 12.5 µl of 2X Master Mix Power SYBR Green I (Applied Biosystems), 9.5 µl of sterile mQ water, 1 µl of forward primer (10 µM), 1 µl of reverse primer (10 µM) and 1 µl of DNA template (concentration of 5 ng/µl). The same amount of DNA (5 ng in total) was added from all samples to the qPCR experiments, allowing cross-comparison of Ct values. Thermal conditions were as described in the manufacturer's protocol.
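The reaction set-up above can be checked with a few lines of arithmetic. The sketch below confirms that the listed volumes add up to 25 µl and derives the final primer concentration and template mass per reaction. The `fold_difference` helper is an illustrative assumption only: it supposes perfect doubling per cycle (amplification efficiency of 2), which is not stated in the text, and the Ct values passed to it are invented.

```python
# Volumes per reaction, in microlitres, as listed in the text.
REACTION_UL = {
    "master_mix_2x": 12.5,
    "water": 9.5,
    "forward_primer": 1.0,
    "reverse_primer": 1.0,
    "template": 1.0,
}
PRIMER_STOCK_UM = 10.0      # primer stock concentration (µM)
TEMPLATE_NG_PER_UL = 5.0    # DNA template concentration (ng/µl)


def total_volume_ul(reaction):
    """Sum of all component volumes."""
    return sum(reaction.values())


def final_concentration(stock, volume_ul, total_ul):
    """Dilution formula: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


def fold_difference(ct_a, ct_b, efficiency=2.0):
    """Abundance of sample A relative to B, ASSUMING equal input DNA
    and a constant per-cycle amplification factor (2.0 = perfect doubling)."""
    return efficiency ** (ct_b - ct_a)


total = total_volume_ul(REACTION_UL)
print("total volume (µl):", total)
print("final primer conc. (µM):",
      final_concentration(PRIMER_STOCK_UM, REACTION_UL["forward_primer"], total))
print("template per reaction (ng):", TEMPLATE_NG_PER_UL * REACTION_UL["template"])
# Invented Ct values, for illustration only.
print("fold difference:", fold_difference(20.0, 23.0))
```

Because every reaction receives the same 5 ng of template, a lower Ct in one sample can be read as a higher relative abundance of the target gene, under the efficiency assumption noted above.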
16S rRNA gene sequencing and analysis
PCR of the V4 region of the 16S rRNA gene and subsequent sequencing were carried out according to the Earth Microbiome Project's standard protocol with primer set 515F/806R 33. Sequencing was performed in a MiSeq sequencer according to the manufacturer’s protocol (paired-end, 2x300) at the FISABIO Genomics Center (Valencia, Spain). The sequence data were quality filtered using prinseq-lite with the following parameters: min_length: 50, trim_qual_right: 30, trim_qual_type: mean, trim_qual_window: 20, and then joined with FLASH 34 using default parameters. The primers were removed with the cutadapt program, and the cleaned merged reads were analyzed with QIIME2.2020 35. Low-quality reads were eliminated with quality-filter q-score. Deblur 36,37 was used to trim the sequences at position in order to remove low-quality regions.
Diversity was studied using the QIIME2 plugin q2-diversity. Specifically, alpha-diversity was evaluated with Pielou’s Evenness, Shannon’s Diversity index and Faith’s Phylogenetic Diversity index and compared with the non-parametric Kruskal-Wallis test. Beta-diversity was studied using PERMANOVA with the Bray-Curtis distance, Jaccard distance, and weighted and unweighted UniFrac distances. PCoAs (--p-metric seuclidean) were performed to represent beta-diversity at all taxonomic levels, which had previously been collapsed. Taxonomy was assigned with the pre-formatted SILVA 138 database (reproducible sequence taxonomy reference database management for the masses). Relative abundances of taxa were compared with ANCOM 38 to find differentially abundant features.
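For readers less familiar with these indices, the non-phylogenetic metrics named above can be written out in a few lines of plain Python. This is a minimal sketch of the textbook definitions, not the q2-diversity implementation. The base-2 logarithm for Shannon's index is an assumption, the count vectors are invented, and Faith's PD and UniFrac are omitted because they require a phylogenetic tree.

```python
import math


def shannon(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over observed features."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)


def pielou_evenness(counts):
    """Pielou's J = H / ln(S), with H in natural log and S observed features."""
    observed = sum(1 for c in counts if c > 0)
    if observed < 2:
        return float("nan")  # evenness is undefined for fewer than two features
    return shannon(counts, base=math.e) / math.log(observed)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    return sum(abs(a - b) for a, b in zip(u, v)) / sum(a + b for a, b in zip(u, v))


def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - |shared| / |union|."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    return 1.0 - len(present_u & present_v) / len(present_u | present_v)


# Invented feature counts for two samples.
sample_a = [10, 10, 10, 10]
sample_b = [37, 1, 1, 1]
print("Shannon:", shannon(sample_a), shannon(sample_b))
print("Pielou:", pielou_evenness(sample_a), pielou_evenness(sample_b))
print("Bray-Curtis:", bray_curtis(sample_a, sample_b))
print("Jaccard:", jaccard_distance(sample_a, sample_b))
```

The two invented samples contain the same four features, so their Jaccard distance is zero even though their evenness differs sharply, which is why presence/absence and abundance-based distances are reported side by side.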
Fluorescence microscopy and microbial abundance
For microbial counts (see Fig. S1), the saliva sample (50 µl) was fixed with glutaraldehyde (0.5% final concentration w/w) at 4ºC for 30 min and stored at -80ºC until use. DAPI staining was performed as described. Briefly, the sample was diluted with sterile 1X PBS buffer up to 1 ml and then filtered through 0.2 µm GTTP membrane filters (Millipore). For viral counts (see Fig. S1), fixation was as above and the sample was stored at -80ºC until use. SYBR Gold staining was performed as described 39,40.
All methods were carried out in accordance with relevant Spanish and European guidelines and regulations. All experimental protocols were approved by the Ethical Committee of the University of Alicante.
Public access to metagenomic data and 16S rRNA gene sequences
Detailed information on the accession numbers for all sequencing data generated in this study is provided on pages 8 and 9.