3.1 Comparison of 16S rRNA, metagenome shotgun, and bacterial single-cell sequencing on the human salivary microbiome
The results of the 16S rRNA sequencing, metagenome shotgun sequencing, 48-well single-cell isolation, and short-read genome sequencing of the inactivated and culturable samples are shown in Figure S1. Taxonomic bar plots at the genus level based on 16S rRNA analysis revealed that the inactivated and culturable samples had similar microbiome structures. In both samples, Streptococcus was the predominant genus, followed by Prevotella, Neisseria, and Veillonella (Figure 1A and Table S1). For bacterial single-cell analysis, genome sequences were obtained from 43 of 48 wells for the OMNIgene-preserved samples and from 45 of 48 wells for the glycerol stock samples. Genomic completeness greater than 80%, assessed against known genomic sequences, was achieved in 17 wells for the OMNIgene-preserved samples and in 24 wells for the glycerol stock samples (Figure 1B and Table S2). Consistent with the 16S rRNA sequencing results, single-cell sequencing showed that Streptococcus was the most abundant genus in the samples, followed by Prevotella. By contrast, the proportion of Neisseria, which was high in the 16S rRNA sequencing, was low in the single-cell data, and Veillonella and Alloprevotella were not detected. In addition, 60 bacterial genera were detected using 16S rRNA sequencing, whereas only 17 genera were detected using single-cell sequencing (Figures 1A and 1C, and Table S2).
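As an illustrative aid, the well-level recovery figures reported above can be expressed as percentages of the 48 wells per sample. The following is a minimal sketch: the counts are those stated in the text, whereas the dictionary layout and the function name `recovery_summary` are assumptions introduced only for this example and do not correspond to the analysis pipeline used in the study.

```python
# Counts are taken from the text; the data layout and names are illustrative only.
WELLS_PER_SAMPLE = 48

samples = {
    "OMNIgene-preserved": {"genomes": 43, "high_completeness": 17},
    "glycerol stock": {"genomes": 45, "high_completeness": 24},
}


def recovery_summary(samples, wells=WELLS_PER_SAMPLE):
    """Return, per sample, the percentage of wells that yielded a genome
    and the percentage that yielded a genome with >80% completeness."""
    summary = {}
    for name, counts in samples.items():
        summary[name] = {
            "genome_recovery_pct": round(100 * counts["genomes"] / wells, 1),
            "high_completeness_pct": round(
                100 * counts["high_completeness"] / wells, 1
            ),
        }
    return summary


# Total number of single-cell genomes across both samples.
total_genomes = sum(c["genomes"] for c in samples.values())

if __name__ == "__main__":
    print("total genomes:", total_genomes)
    for name, stats in recovery_summary(samples).items():
        print(name, stats)
```

The total of 43 + 45 genomes corresponds to the 88 single-cell genomes referred to in the subsequent analyses.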
The total raw read counts for the metagenomic shotgun and single-cell analyses were 61,126,868 and 55,918,930, respectively (Tables S2 and S3). Metagenomic shotgun sequencing showed an average contamination rate of 81.6%, indicating the difficulty of separating bacterial DNA from human saliva-derived specimens (Figure 1D and Table S3). By contrast, bacterial single-cell sequencing showed a much lower average contamination rate of 10.4% per genome because of the single-cell separation process (Figure 1D and Table S2). Metagenome binning yielded nine bins from the metagenome assemblies, eight of which were identified by metagenome shotgun sequencing and GTDB-Tk analysis, whereas 44 strains were identified at the species level by bacterial single-cell sequencing and GTDB-Tk analysis (Tables 1 and S2).
3.2 Detection of antimicrobial resistance genes and virulence factor genes from metagenome shotgun and bacterial single-cell sequencing
Metagenome shotgun sequencing revealed that cfxA, which encodes a β-lactamase, and the erythromycin resistance genes ermF and ermX were present as complete sequences, whereas the other AMR genes were fragmented (Figure 2). Single-cell sequencing showed that four of the 88 isolates harbored cfxA and four harbored ermF. For tetracycline resistance, nine isolates carried tet32, tetM, tetO, or tetQ (Figure 3).
Both analyses detected fewer genes encoding virulence factors than AMR genes. Metagenome shotgun sequencing revealed intact gapA, an essential gene encoding a glycolytic enzyme, whose ortholog inhibits the biological effects of C5a on human neutrophils [36]. Bacterial single-cell sequencing showed that, in addition to gapA, the two samples contained 12 intact genes encoding virulence factors derived from S. pneumoniae (Figure 4). SPH0456+ (cpsB), cps4J, cps4K, and cps4L are pneumococcal polysaccharide capsule synthesis genes; lmb and pavA encode pneumococcal adhesins; piuA and psaA_1 encode pneumococcal transporters; nanB encodes sialidase; slrA encodes peptidyl-prolyl cis-trans isomerase; srtA encodes sortase A; and tig_ropA_2 encodes a trigger factor. STER1442, STER1444+, epsB+, and tig_ropA_5 were derived from Streptococcus thermophilus; fbpA_6, psaA_3, and rfbB_1+ from Streptococcus gordonii; and ctrC_1 and lbpA from Neisseria. Among pneumococcal virulence factors, genes encoding pneumococcal cell surface proteins such as choline-binding proteins (cbpD, cbpG, lytA, and lytB) and cell wall anchoring proteins (iga, nanA, pavB/pfbB, pfbA, zmpB, and zmpC) were also detected as fragments. We previously reported that the orthologs of iga, nanA, pfbA, zmpB, and zmpC are distributed among closely related streptococcal species, including oral Streptococcus, which is consistent with our results [37-39].
Although the host species of several of the detected pneumococcal virulence factors remained unclear, several of these factors were harbored by oral streptococci. Neisseria ctrC was detected in a single-cell isolate, OSU002-0007, which was predicted to be Neisseria mucosa. Another single-cell isolate containing Neisseria lbpA could not be identified at the species level. Bacterial single-cell sequencing allowed us to determine which bacteria carry specific genes, which is difficult to achieve with metagenomic shotgun sequencing.
In the MiGA ANI analysis, 10 of the 88 genomes met the criterion of >95% ANI for species identification (Table 2). Although several genomes were predicted to be S. pneumoniae through GTDB-Tk taxonomy analysis (Table S2), MiGA ANI analysis showed that S. pneumoniae had the highest ANI value (91.4%) in only one genome (OSU002-0038; Table 2). Despite the high genomic completeness of 99% determined by CheckM, the ANI value was less than 95%, indicating that effectively no bacteria in the samples were identified as S. pneumoniae. S. pneumoniae belongs to the mitis group of oral Streptococcus and cannot be distinguished from S. mitis or S. oralis through 16S rRNA sequencing, and some strains are difficult to identify even by biochemical tests [29, 40, 41]. Therefore, care must be taken when distinguishing closely related bacteria at the species level. ANI analysis also suggested that five single-cell-isolated bacteria are potential novel species (Table 2). Bacterial single-cell sequencing could be a powerful tool for searching for novel bacterial species in the microbiome.
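The ANI-based decision described above can be illustrated with a short sketch. It assumes only the >95% ANI criterion stated in the text; the function name `call_species`, the input format, and the second example genome are hypothetical and are not part of MiGA or of the workflow used in this study.

```python
# Species-level ANI criterion as stated in the text (>95%).
SPECIES_ANI_THRESHOLD = 95.0  # percent


def call_species(best_hit, ani_pct, threshold=SPECIES_ANI_THRESHOLD):
    """Return the best-hit species name if its ANI exceeds the threshold,
    otherwise None (no species-level assignment)."""
    if ani_pct > threshold:
        return best_hit
    return None


# OSU002-0038: values from the text (best hit S. pneumoniae, ANI 91.4%).
# "example-genome": a hypothetical entry added purely for illustration.
genomes = {
    "OSU002-0038": ("Streptococcus pneumoniae", 91.4),
    "example-genome": ("Streptococcus oralis", 97.2),
}

calls = {
    genome_id: call_species(hit, ani)
    for genome_id, (hit, ani) in genomes.items()
}

if __name__ == "__main__":
    for genome_id, species in calls.items():
        print(genome_id, "->", species or "no species-level assignment")
```

Under this criterion, OSU002-0038 receives no species-level assignment despite its high completeness, because a best-hit ANI of 91.4% falls below the threshold.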