Genomic features of strains used in the study
A total of 138 high-quality genomes were used in the present study, with genome sizes ranging from 1.5 Mb to 2.8 Mb and GC content of 33–46%. All the species of the genus Streptococcus, including representative and undefined species, are listed with their genomic attributes (number of CDS, genome size, GC content, isolation source, etc.) in Supplementary Table 1. Genomes of species such as S. danieliae, S. symci and S. devriesei were excluded from the study because their contamination or completeness values fell outside the permissible range, or because the 16S rRNA gene was absent from the genome.
Phylogenomics resolved intermingled clades and species groups within the genus Streptococcus
The 16S rRNA-based phylogeny of all the type strains, representative and undefined species of the genus Streptococcus showed that several previously defined species groups were intermingled (Fig. 1). For example, some type strains of the Mitis-Suis clade, namely S. suis, S. azizii, S. marmotae, S. himalayensis, S. cuniculi, S. varani, S. minor, S. ovis and S. merionis, clustered with the Pyogenes-Equinus-Mutans clade. Likewise, S. downii of the Pyogenes-Equinus-Mutans clade was placed within the Mitis-Suis clade. These observations indicate that 16S rRNA-based phylogeny does not provide sufficient evidence to support the species groups within the genus Streptococcus.
Moreover, the current phylogeny of the genus Streptococcus is based on MLST or CSI and includes a maximum of 70 species. In the present analysis, the inclusion of 115 type strains along with 23 unclassified strains at the whole-genome level provided a robust phylogenetic grouping of the genus Streptococcus (Fig. 2). We obtained two phylogroups in accordance with the earlier clades of the genus, i.e., Mitis-Suis and Pyogenes-Equinus-Mutans. However, there were some exceptions: S. acidominimus, S. porci and S. plurextorum belong to the Mitis-Suis clade and S. pharyngis belongs to the Pyogenes-Equinus-Mutans clade, contrary to their previous phylogenetic placement. Additionally, we could assign the 23 unclassified strains of the genus Streptococcus to their respective clades (14 to the Mitis-Suis clade and 9 to the Pyogenes-Equinus-Mutans clade).
Overall, several ambiguities in subclade-level grouping could not be resolved by either the 16S rRNA-based or the whole genome-based phylogeny. For instance, members of subclade Halotolerans (S. acidominimus, S. hyovaginalis, S. pluranimalium, S. halotolerans and S. thoraltensis) and subclade Entericus (S. entericus, S. marimammalium) of the Pyogenes-Equinus-Mutans clade had intermingled positions in both phylogenies. Interestingly, of these seven members of the Pyogenes-Equinus-Mutans clade, S. acidominimus now belongs to the Mitis-Suis clade. To further examine these ambiguities in the genus Streptococcus, we performed a genome similarity assessment.
Taxonogenomics suggests the presence of several genomospecies and subspecies reclassification
The current taxonomy of the genus Streptococcus is largely based on biochemical and physiological methods. In this study, several genome similarity assessment methods helped to identify novel genomospecies and to demarcate the species and subspecies of the genus Streptococcus (Fig. 3, Data sheet 1). Genomotaxonomy revealed twelve novel genomospecies amongst the unclassified species: S. sp. HKU75 (GS1), S. sp. zq-86 (GS2), S. sp. KS 6 (GS3), S. sp. LPB0220 (GS4), S. sp. A12 (GS5), S. sp. oral taxon 431 F0610 5-114 (GS6), S. sp. oral taxon 061 F0704 (GS7), S. sp. 116-D4 (GS8), S. sp. oral taxon 064 W10853 (GS9), S. sp. NPS 308 (GS11), S. sp. 1643 (GS12), and the three strains S. sp. CNU G3, S. sp. CNU G2 and S. sp. CNU 77-61 (GS15). Further, three of the subspecies of S. oralis (S. oralis subsp. oralis NCTC 11427 (T), S. oralis subsp. dentisani CECT 7747 (T) and S. oralis subsp. tigurinus AZ 3a (T)) and S. equi subsp. ruminatorum CECT 5772 (T) were found to be novel species and were designated GS10, GS13, GS14 and GS16.
Further, the species status of some members of the genus Streptococcus needs to be investigated in detail, for example: S. ilei I-G2 (T) and S. koreensis JS71 (T) (ANI: 96.32; dDDH: 68.8); S. ratti FA-1 (T) and S. ursoris DSM 22768 (T) (ANI: 98.6; dDDH: 88); and S. bovis ATCC-33317 (T) and S. equinus NCTC 12969 (T) (ANI: 96.89; dDDH: 72.7). In the present study, we could also assign nine unclassified species to already known species of the genus Streptococcus: S. sp. DAT741 to S. ruminantium GUT-187 (T) (ANI: 99; dDDH: 90.4); S. sp. NSJ-72 to S. constellatus subsp. pharyngis CCUG-46377 (T) (ANI: 97.73; dDDH: 78.3); S. sp. ZB199 to S. parasanguinis ATCC 15912 (T) (ANI: 96.78; dDDH: 70.8); S. sp. FDAARGOS 192 to S. salivarius NCTC 8618 (T) (ANI: 96.08; dDDH: 66.5); S. sp. NCTC 11567 to S. dysgalactiae subsp. dysgalactiae NCTC 13731 (T) (ANI: 96.02; dDDH: 67.4); and S. sp. FDAARGOS 522, S. sp. group B FDAARGOS 229, S. sp. FDAARGOS 521 and S. sp. FDAARGOS 520 to S. agalactiae NCTC 8181 (T) (ANI: 98.7–99.8%; dDDH: 89.6–98.6%).
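The pairwise values reported above can be read against species-level cut-offs. The following is a minimal sketch, not the method used in this study: it assumes the commonly cited boundaries of 95% ANI and 70% dDDH (this section does not state which cut-offs were applied), and the function name and labels are hypothetical.

```python
# Assumed species-level cut-offs (commonly cited values; NOT taken from this study).
ANI_SPECIES_CUTOFF = 95.0   # percent
DDDH_SPECIES_CUTOFF = 70.0  # percent


def classify_pair(ani, dddh):
    """Label a genome pair by which of the assumed species cut-offs it meets."""
    ani_ok = ani >= ANI_SPECIES_CUTOFF
    dddh_ok = dddh >= DDDH_SPECIES_CUTOFF
    if ani_ok and dddh_ok:
        return "same species (both metrics)"
    if ani_ok or dddh_ok:
        return "borderline (metrics disagree)"
    return "different species (both metrics)"


# (ANI, dDDH) values reported in the text for the three type-strain pairs.
pairs = {
    ("S. ilei", "S. koreensis"): (96.32, 68.8),
    ("S. ratti", "S. ursoris"): (98.6, 88.0),
    ("S. bovis", "S. equinus"): (96.89, 72.7),
}

results = {pair: classify_pair(ani, dddh) for pair, (ani, dddh) in pairs.items()}
for (first, second), label in results.items():
    print(f"{first} vs {second}: {label}")
```

Under these assumed cut-offs, S. ratti / S. ursoris and S. bovis / S. equinus meet both criteria, whereas S. ilei / S. koreensis exceeds the ANI cut-off but falls just below the dDDH cut-off, which is why such pairs call for closer investigation.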
In the present study, we obtained 16 novel genomospecies within the genus Streptococcus. These genomospecies were previously classified either as subspecies of already reported species or as unclassified species. Further, the data suggest a possible merger of a few already defined species: S. ilei and S. koreensis, S. bovis and S. equinus, and S. ratti and S. ursoris.
Genus-wide resistome analysis
Infections caused by several species of the genus Streptococcus, such as S. pneumoniae and S. pyogenes, are a major cause of community-acquired respiratory infections worldwide (Appelbaum 2002). Antimicrobial resistance in the bacterial population raises concern over the application of microbes in industry and agriculture. Resistome analysis of the genus Streptococcus using high-stringency cut-offs (see Methods) identified several classes of drug resistance genes (Fig. 2). The resistome gene profile included: AAC(6')-Ie-APH(2'')-Ia, ANT(6)-Ia, APH(3')-IIIa and aad(6) for aminoglycosides; dfrF for diaminopyrimidines; patA, patB and pmrA for fluoroquinolones; ErmB, RlmA(II), cfr(D), lmrP, lnuA, lnuB, lnuC, lsaC, lsaE and mel for lincosamides; ErmB, RlmA(II), lmrP, lsaC, lsaE and mel for macrolides; SAT-4 for nucleosides; cfr(D), lsaC, lsaE and mel for oxazolidinones; S. agalactiae mprF for peptide antibiotics; Enterococcus faecalis chloramphenicol acetyltransferase, Lactobacillus reuteri cat-TC, catQ, cfr(D), lsaC, lsaE and mel for phenicols; lsaC, lsaE and mel for pleuromutilins; ErmB, cfr(D), lmrP, lsaC, lsaE and mel for streptogramins; and lmrP, lsaC, lsaE, mel, tet(L), tet(W/N/W), tetA(46), tetM and tetO for tetracyclines (Data sheet 2).
Most of these resistance genes were present in S. porci DSM 23759, S. gallolyticus subsp. pasteurianus NCTC 13784, S. hyovaginalis DSM 12219, S. parasuis H35, S. plurextorum DSM 22810, S. orisratti DSM 15617, S. pluranimalium TH11417, S. sp. FDAARGOS 522 and S. vaginalis P1L01 (Fig. 2). Further, 75 of the 138 genomes harboured fluoroquinolone resistance genes (patA and patB). Interestingly, well-curated antibiotic resistance genes were not found in 42 genomes in the present study. These genomes were mostly located in the Pyogenes-Equinus-Mutans clade, in a phylogroup containing strains of probiotic importance such as S. thermophilus NCTC 12958 (T) and S. salivarius NCTC 8618 (T). Further, the Mitis-Suis clade also had 14 genomes without perfect hits for resistance genes (Fig. 2).
However, low-stringency cut-offs (see Methods) resulted in a highly enriched resistome profile for all the species of the genus Streptococcus (Data sheet 3). A total of 61,274 antibiotic resistance-related genes were identified using the standalone RGI module of CARD. Of these, 15,976 were efflux pump-related genes. Antibiotic resistance genes related to β-lactams were detected in all the strains in large numbers (up to 178 copies). Genes related to aminoglycoside and glycopeptide antibiotics were also present in large numbers in all the strains. A few strains of the genus harbour genes related to ethionamide, fluoroquinolone, aminoglycoside, isoniazid, triclosan, mupirocin, polyamine, prothionamide, pyrazinamide and sulfone antibiotics.
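The contrast between the two stringency levels can be illustrated with a small tally over an RGI-style tab-delimited report. This is a sketch under stated assumptions rather than the pipeline used here: the column names (Cut_Off, Best_Hit_ARO, Resistance Mechanism) follow the RGI report layout as generally documented, the sample rows are hypothetical, and treating the high-stringency setting as "Perfect" hits only and the low-stringency setting as "Perfect", "Strict" and "Loose" hits together is an assumption, since the cut-offs are defined in the Methods rather than in this section.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the style of an RGI tab-delimited report (NOT data from this study).
SAMPLE_REPORT = (
    "ORF_ID\tCut_Off\tBest_Hit_ARO\tResistance Mechanism\n"
    "orf1\tPerfect\ttetM\tantibiotic target protection\n"
    "orf2\tStrict\tpatA\tantibiotic efflux\n"
    "orf3\tLoose\tlmrP\tantibiotic efflux\n"
    "orf4\tLoose\tvanR\tantibiotic target alteration\n"
)


def summarize(report_text, allowed_cutoffs):
    """Return (hits per cut-off, efflux-annotated hits, total hits) for the allowed cut-offs."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    rows = [row for row in reader if row["Cut_Off"] in allowed_cutoffs]
    by_cutoff = Counter(row["Cut_Off"] for row in rows)
    efflux_hits = sum("efflux" in row["Resistance Mechanism"] for row in rows)
    return by_cutoff, efflux_hits, len(rows)


# Assumed mapping of the two stringency levels to RGI cut-off labels.
high_stringency = summarize(SAMPLE_REPORT, {"Perfect"})
low_stringency = summarize(SAMPLE_REPORT, {"Perfect", "Strict", "Loose"})
print(high_stringency)
print(low_stringency)

# Share of efflux pump-related genes, using the totals reported in the text.
efflux_share = 15976 / 61274
print(f"{efflux_share:.1%}")
```

With the totals reported above, efflux pump-related genes account for roughly a quarter of all hits obtained at the low-stringency setting.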