Genome features of Vibrio alginolyticus isolated from the Kiel-Fjord
We sequenced the genomes of nine Vibrio alginolyticus strains, previously isolated from pipefish in the Kiel-Fjord (28), using a combination of PacBio long-read and Illumina short-read technology. The assembly resulted in eight closed genome sequences and one draft genome (strain K09K1), in which both chromosomes were assembled into a single contig because of multiple copies of an integrated filamentous phage. The replicon boundaries could not be resolved experimentally by PCR; strain K09K1 has therefore been assigned a “permanent draft” status. All V. alginolyticus genomes contain a ~3.47 Mbp chromosome 1 and a ~1.88 Mbp chromosome 2, with a GC content of around 44% (Table 1), which is typical for the genus Vibrio as well as for the species V. alginolyticus (45, 46). We found extra-chromosomal replicons, including plasmids and filamentous phages, in seven isolates (Table 1).
Species definition and phylogenetic relationship of the Kiel-Fjord isolates
All nine strains isolated from the Kiel-Fjord share more than 98% average nucleotide identity (ANI) with other V. alginolyticus strains. This supports their assignment to the species V. alginolyticus, as previously suggested based on a multi-locus sequencing approach (MLSA) (23). All V. alginolyticus strains share ~92% ANI with two closely related Vibrio species, V. diabolicus and V. antiquarius (Fig. 1). This value is below the species threshold of ANI = 95% (47, 48), indicating that V. alginolyticus is a distinct species.
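The species call above reduces to comparing a pairwise ANI value against the 95% threshold. A minimal sketch, assuming the approximate ANI values quoted in the text; the function and variable names are illustrative and are not part of the original analysis:

```python
# Species threshold cited in the text (47, 48), in percent.
SPECIES_ANI_THRESHOLD = 95.0


def same_species(ani_percent: float, threshold: float = SPECIES_ANI_THRESHOLD) -> bool:
    """Return True when a pairwise ANI meets or exceeds the species threshold."""
    return ani_percent >= threshold


# Approximate values taken from the text ("more than 98%" and "~92%").
comparisons = {
    "Kiel isolates vs. other V. alginolyticus": 98.0,
    "V. alginolyticus vs. V. diabolicus / V. antiquarius": 92.0,
}

calls = {pair: same_species(ani) for pair, ani in comparisons.items()}

for pair, is_same in calls.items():
    print(f"{pair}: {'same species' if is_same else 'distinct species'}")
```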
The Vibrio alginolyticus pan/core genome
Pangenome: To compare the gene repertoire of the nine V. alginolyticus strains from the Kiel-Fjord with the global V. alginolyticus gene repertoire, we calculated their pangenome, including the number of core and accessory genes as well as strain-specific singletons, i.e. genes unique to single strains (49). The analysis included a total of 73,277 protein sequences encoded on chromosomes and MGEs, including plasmids and extra-chromosomal phage replicons. When we included all 15 strains from diverse habitats, the pangenome of V. alginolyticus increased strongly (from 4997 to 8843 gene clusters) with the sequential addition of each new genome (Fig. 2a). In contrast, we observed only a slight increase in the pangenome (from 5044 to 5679 gene clusters) when we included only the nine V. alginolyticus isolates from the Kiel-Fjord (Fig. 2b). An increase in the pangenome results from new strain-specific genes found in each newly analyzed isolate. The much larger increase across all available V. alginolyticus isolates, relative to the increase among the Kiel-Fjord isolates, suggests that the genomic diversity of V. alginolyticus within the Kiel-Fjord is very limited, whereas the species V. alginolyticus as a whole has a vast reservoir of genomic diversity.
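The terms used above can be made concrete with a small sketch that labels gene clusters as core, accessory, or singleton and tracks how the pangenome grows as genomes are added. The strain and cluster names are hypothetical toy data, and treating singletons as a class separate from the accessory genome is an assumption made here for illustration:

```python
from typing import Dict, List, Set


def classify_clusters(presence: Dict[str, Set[str]], n_strains: int) -> Dict[str, str]:
    """Label each gene cluster by the number of strains that carry it."""
    labels = {}
    for cluster, strains in presence.items():
        if len(strains) == n_strains:
            labels[cluster] = "core"        # present in every strain
        elif len(strains) == 1:
            labels[cluster] = "singleton"   # unique to a single strain
        else:
            labels[cluster] = "accessory"   # shared by some, but not all, strains
    return labels


def pangenome_curve(genomes: List[Set[str]]) -> List[int]:
    """Cumulative number of distinct gene clusters after adding each genome in order."""
    seen: Set[str] = set()
    sizes = []
    for genes in genomes:
        seen |= genes
        sizes.append(len(seen))
    return sizes


# Hypothetical toy data: three strains (A, B, C) and three gene clusters.
presence = {"g1": {"A", "B", "C"}, "g2": {"A", "B"}, "g3": {"C"}}
labels = classify_clusters(presence, n_strains=3)

# The same toy data viewed per genome, in the order A, B, C.
curve = pangenome_curve([{"g1", "g2"}, {"g1", "g2"}, {"g1", "g3"}])
print(labels, curve)
```

In this toy case the curve only grows when strain C is added, because C is the only strain that contributes a cluster not seen before.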
We extrapolated both the within-Kiel and the global V. alginolyticus pangenome by fitting a least-squares curve based on Heaps’ Law (1) and found that the within-Kiel pangenome is closed (α = 1.12), whereas the global V. alginolyticus pangenome is open (α = 0.58). This means that, within the global V. alginolyticus pangenome, each newly sequenced V. alginolyticus isolate will reveal a unique set of genes, irrespective of the number of strains included in the present analysis. This open V. alginolyticus pangenome reflects the diversity of habitats in which this species exists. These distinct environments probably require different adaptations and/or promote high levels of HGT. In contrast, a closed pangenome, as detected for the within-Kiel V. alginolyticus pangenome, suggests that the number of new genes obtained from any newly sequenced isolate will converge to zero. We therefore assume that at least the sequenced pipefish-associated V. alginolyticus strains within the Kiel-Fjord contain the major part of the gene repertoire that is required to adapt to their habitat. Thus, the genomes can be expected to be highly similar, potentially as a result of niche adaptation and strong selection. Strong positive selection of such adaptive genes might have led to a clonal expansion of the Kiel-alginolyticus ecotype. Indeed, we found no sequence divergence based on core-genomic signatures among all nine isolates from Kiel (Fig. 2c), indicating a clonal expansion of this ecotype. It remains to be investigated whether free-living V. alginolyticus strains or isolates from other eukaryotic hosts in the Kiel-Fjord share the same gene pool or are more divergent from the pipefish-associated strains.
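The open/closed distinction can be illustrated with a minimal sketch. It assumes the common Heaps' law parameterization in which the number of new gene clusters contributed by the N-th genome follows n = κ · N^(−α), with α > 1 read as closed and α ≤ 1 as open. The text does not spell out the parameterization or the fitting routine, so this log-log linear regression is a stand-in for the least-squares fit mentioned above, and the input data are synthetic (generated from the power law itself, not taken from the study):

```python
import math
from typing import List


def fit_heaps_alpha(genome_counts: List[int], new_genes: List[float]) -> float:
    """Estimate alpha in n = kappa * N**(-alpha) by least squares on log-log values."""
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(g) for g in new_genes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return -slope  # the slope of log(n) against log(N) equals -alpha


def pangenome_state(alpha: float) -> str:
    """Interpretation assumed here: alpha > 1 is closed, otherwise open."""
    return "closed" if alpha > 1 else "open"


# Synthetic data: new gene clusters for genomes 2..9, generated from the power law
# with a hypothetical kappa and the alpha values reported in the text.
genome_counts = list(range(2, 10))
kappa = 500.0  # hypothetical scaling constant
closed_alpha = fit_heaps_alpha(genome_counts, [kappa * n ** -1.12 for n in genome_counts])
open_alpha = fit_heaps_alpha(genome_counts, [kappa * n ** -0.58 for n in genome_counts])

print(round(closed_alpha, 2), pangenome_state(closed_alpha))
print(round(open_alpha, 2), pangenome_state(open_alpha))
```

Because the synthetic inputs follow the power law exactly, the fit simply recovers the α used to generate them; real new-gene counts would be noisy and would typically be averaged over many genome orderings before fitting.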
Core-genome and singletons: We observed a stronger decrease in the core genome when we included all V. alginolyticus isolates than when we restricted the analysis to the Kiel-alginolyticus ecotype. When we included only the nine Kiel strains, the core genome (4708 gene clusters) was more than four times larger than the accessory genome (971 gene clusters). However, when we extended the analysis to all 15 V. alginolyticus strains, the core genome (3876 gene clusters) became smaller than the accessory gene pool (4967 gene clusters). In other words, within the Kiel-Fjord, V. alginolyticus isolates have a large core genome (83% of the pangenome) and a limited accessory gene pool (17% of the pangenome). In contrast, although all strains are members of the same species, the global accessory gene pool is highly variable (56% of the pangenome) and the V. alginolyticus core genome constitutes only 44% of the pangenome.
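The percentages above follow directly from the reported cluster counts, with the pangenome taken as the sum of core and accessory clusters. A short sketch of that arithmetic (the function name is illustrative):

```python
from typing import Tuple


def genome_shares(core: int, accessory: int) -> Tuple[int, int, int]:
    """Return pangenome size and the rounded core and accessory shares in percent."""
    pan = core + accessory
    return pan, round(100 * core / pan), round(100 * accessory / pan)


# Gene-cluster counts reported in the text.
kiel = genome_shares(core=4708, accessory=971)
all_strains = genome_shares(core=3876, accessory=4967)

print(kiel)         # (5679, 83, 17)
print(all_strains)  # (8843, 44, 56)
```

The resulting pangenome sizes (5679 and 8843 gene clusters) match the end points of the accumulation curves reported for the Kiel-only and the 15-strain analyses.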
Habitat specific chromosomal regions
The acquisition of entire gene blocks (genomic islands, plasmids, prophages) by HGT can rapidly alter the lifestyle of a bacterium in quantum leaps (50). This mechanism seems to be particularly important for bacterial adaptation to new ecological niches, but also for how bacteria diverge from each other, forming ecotypes and ultimately new species (51). Genomic islands, which encode specific functions allowing for niche adaptation and maybe even speciation events, are common within the genus Vibrio (52). For the Kiel-alginolyticus ecotype, we identified 19 chromosomal genomic regions (GRs), most of which are unique to the strains isolated from the Kiel-Fjord (Fig. 3, Table S2). Overall, these 19 GRs encode a total of 487 genes, of which 305 have only been found within the Kiel-alginolyticus ecotype. Five of these 19 GRs could be assigned to integrated prophages. GR 13 and GR 14 correspond to Vibrio phage VALGΦ2/2b and Vibrio phage VALGΦ6, which are unique to the isolates from the Kiel-Fjord and have not been found elsewhere (19). GR 3 corresponds to Vibrio phage VALGΦ1, which has so far only been detected in V. alginolyticus ATCC33787, with a relatively low query cover of 57%, thus making it a unique region for Kiel. In contrast, GR 5, which corresponds to the filamentous phage Vibrio phage VALGΦ8 on chromosome 1, is absent in most Kiel strains (except K10K4 and K05K4) but present in some non-Kiel strains, such as ATCC17749 and FDAARGOS_114, suggesting that GR 5 is not unique to the Kiel isolates. Similarly, GR 15, which corresponds to a multiphage cassette consisting of multiple repeats of a combination of the two filamentous phages Vibrio phage VALGΦ6 and Vibrio phage VALGΦ8 (19), is absent in most Kiel strains (except K04M3 and K04M5) but partly present in ATCC17749, and is thus not unique to the Kiel system.
Of the remaining 14 GRs, which do not correspond to integrated phages, we identified four as genomic islands (GIs): GR 2, GR 6, GR 7 and GR 8 (> 10 kb, presence of an integrase/transposase, different GC content). According to functional COG categories, these islands contain a variety of proteins, most of them involved in replication, recombination, and repair [L] and transcription [K] (see Table S3 for functional annotation of all GRs). The high prevalence of those maintenance genes is, however, not surprising and might result from an identification bias, as on MGEs they are usually better annotated than accessory genes that would provide a selective advantage to the host. Accordingly, many proteins were classified as phage integrases, transposases or other proteins involved in viral integration into host DNA or in DNA repair, suggesting that HGT might have played an important role during the acquisition of these GIs. GR 6 encodes the multi-drug transporter subunit MdtN and two other loci that were assigned to COG category [V] and predicted to be involved in defense mechanisms. This suggests that GR 6 might play a role in resistance. None of the other genomic islands could be identified as a pathogenicity, metabolic or resistance island. All other GRs were either smaller than 10 kb or did not contain integrases or transposases and are thus referred to as genomic regions instead of genomic islands. Among these GRs, GR 16 encodes three proteins associated with the type VI secretion system and GR 18 encodes a beta-lactamase, suggesting an adaptive role in virulence and resistance, respectively. The acquisition of these unique GRs might have allowed the Kiel-alginolyticus ecotype to invade a new niche, which was then followed by clonal expansion of this ecotype. Clonal expansion has been observed for several pathogens, in particular within the genus Vibrio. The best characterized example of clonal expansion upon acquisition of virulence genes is V. cholerae.
Other Vibrio pathogens, for instance V. parahaemolyticus, have also been shown to experience similar evolutionary dynamics (for a review see (53)).
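The criteria used in this section to separate genomic islands from other genomic regions (> 10 kb, presence of an integrase or transposase, GC content differing from the genome) can be sketched as a simple filter. The text does not state how large a GC difference was required, so the `min_gc_deviation` cutoff below is an assumption, and the two example regions are hypothetical rather than taken from Table S2:

```python
from dataclasses import dataclass


@dataclass
class GenomicRegion:
    name: str
    length_bp: int
    has_integrase_or_transposase: bool
    gc_percent: float


def is_genomic_island(
    region: GenomicRegion,
    genome_gc_percent: float,
    min_length_bp: int = 10_000,    # "> 10 kb" criterion from the text
    min_gc_deviation: float = 1.0,  # ASSUMPTION: the text only says "different GC content"
) -> bool:
    """Apply the three criteria: size, mobility gene, and deviating GC content."""
    return (
        region.length_bp > min_length_bp
        and region.has_integrase_or_transposase
        and abs(region.gc_percent - genome_gc_percent) >= min_gc_deviation
    )


GENOME_GC = 44.0  # approximate genome-wide GC content reported in the text

# Hypothetical regions, for illustration only.
candidate_island = GenomicRegion("example_island", 25_000, True, 41.0)
short_region = GenomicRegion("example_region", 6_000, True, 41.0)

island_call = is_genomic_island(candidate_island, GENOME_GC)
region_call = is_genomic_island(short_region, GENOME_GC)
print(island_call, region_call)
```

A region failing any single criterion, like the short example above, stays a genomic region rather than a genomic island, which mirrors the naming used in the text.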
Genomic differences within Kiel-alginolyticus ecotype
To investigate genomic differences between the nine strains from the Kiel-Fjord, we focused on those gene clusters from the Roary analysis that were found only within the Kiel-alginolyticus ecotype and were absent in all non-Kiel isolates. We found that all Kiel-specific core genes (n = 412) were located exclusively on one of the two chromosomes (Fig. 4). In contrast, the majority of the Kiel-specific accessory gene clusters (89%) were encoded on mobile genetic elements, in particular plasmids. These results support the current view that accessory genes are predominantly located on MGEs and shared by horizontal gene transfer (HGT) (5). We found 490 gene clusters with no ortholog in any other Kiel strain, i.e. singletons, of which 40% were located on MGEs, in particular extrachromosomal replicating elements (170 on plasmids and 27 on extrachromosomal phages), and 60% (n = 293) were chromosomal (Fig. 4). All Kiel-specific V. alginolyticus gene clusters were further assigned to putative functional categories using the Clusters of Orthologous Groups of Proteins (COG) database (34) (Table S4). Even though a large fraction of the gene clusters either could not be assigned to a COG or were poorly characterized, we found differences in the relative distribution of functional COG super-categories between core genome, accessory genome and singletons. The largest fraction of the singletons (37%) was predicted to be dedicated to cellular processes/signaling, while relatively small proportions (10% and 16%) belong to information storage/processing and metabolism. In contrast, the largest proportion of core gene clusters was predicted to belong to information storage/processing (24%), and only 16% and 13% of core gene clusters belong to cellular processes/signaling and metabolism.
Among the gene clusters encoded on the accessory genome, 22% could be assigned to information storage/processing, another 22% to cellular processes/signaling, and only 6% to metabolism (Fig. 4). Within the accessory genome, most of the genes are involved in replication, recombination, and repair (COG [L], Table S4). These mainly include proteins involved in HGT, such as transposases, integrases, transferases and recombinases, as well as proteins involved in immunity, such as the CRISPR-associated helicase Cas3 and restriction-modification methylases. This extensive representation of HGT-related proteins on the accessory genome suggests that HGT was potentially one of the driving evolutionary mechanisms underlying the diversification of the nine V. alginolyticus strains from the Kiel-Fjord.
The majority of the accessory gene pool within the Kiel-alginolyticus ecotype is located on plasmids (Fig. 4). We identified four different plasmid types among the nine Kiel V. alginolyticus isolates. Three plasmid types were unique to specific strains, with sizes of 0.9 to 2.9 kbp (Table 1). The fourth plasmid type was characterized as a mega-plasmid (Fig. 5a), which ranged between 280 and 300 kbp in size and was present in six of the nine strains (Fig. 5b). The six mega-plasmids share 296 core genes and encode 129 accessory genes and between 5 and 26 singletons per plasmid (Fig. 5b). Together with the three small plasmids, plasmid-encoded singletons make up 34.5% of all 486 Kiel-specific singletons. The majority of the plasmid-specific singletons encode hypothetical proteins. The remaining singletons include AAA proteins (K04M1_pL280), PFAM phosphoadenosine phosphosulfate reductase and ATPases (K04M5_pL294), spore germination and type IV secretion system proteins (K5K4_pL289), an endonuclease and a site-specific methyl-transferase, potentially forming a restriction-modification system (K06K5_pL291), and a DNA polymerase (K08M3_pL300). V. alginolyticus ATCC 33787 also contains three plasmids, including plasmid pMBL287, which is similar in size to the Kiel mega-plasmids (46). However, a comparison with the ATCC 33787 plasmids revealed no sequence similarity to any of the plasmids from the Kiel strains.
Only 6.5% of the 870 Kiel-specific accessory genes and singletons are located on extrachromosomal phages. The within-Kiel variation caused by these phages can be explained by the absence of extrachromosomal replicons of Vibrio phage VALGΦ8 in all but two strains (K04M1 and K05K4). However, four strains (K04M3, K04M5, K05K4, and K10K4) had an intra-chromosomal version of this phage, while strains K01M1, K06K5 and K08M3 contained neither an intra- nor an extrachromosomal version of Vibrio phage VALGΦ8 (for a detailed analysis of Kiel vibriophages see (19)). Parts of Vibrio phage VALGΦ8 could be identified on both chromosomes of ATCC 17749T and on chromosome 1 of FDAARGOS_114, but not in any other non-Kiel Vibrio. Genome analyses of Vibrio phage VALGΦ8 revealed no significant virulence-associated genes or any other genes that could be associated with a habitat-specific adaptation (19).
Virulence and resistance of the Kiel-Fjord V. alginolyticus isolates
We found an identical virulence and resistance profile among the nine V. alginolyticus isolates from the Kiel-Fjord. The Kiel-Fjord isolates encode a total of 17 homologous resistance genes, of which the majority (n = 13) are located on chromosome 1 and four are encoded on chromosome 2 (Fig. 6, Table S5). In comparison with other V. alginolyticus strains (~25–130 resistance genes on chromosome 1 and 11–12 resistance genes on chromosome 2), isolates from the Kiel-Fjord have significantly fewer resistance genes (t-test: t = 3.51, df = 8.85, p = 0.007; Fig. 6) and lack in particular genes conferring resistance to tetracycline, aminoglycosides, and quinolones.
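The non-integer degrees of freedom reported above (8.85) are consistent with Welch's unequal-variance t-test, although the text does not name the variant used. A minimal sketch of how the Welch statistic and its Welch-Satterthwaite degrees of freedom are computed; the gene counts below are hypothetical placeholders, not the data behind Fig. 6:

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the mean of group a
    vb = variance(b) / len(b)  # squared standard error of the mean of group b
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# HYPOTHETICAL resistance-gene counts per genome, for illustration only.
other_strains = [30.0, 42.0, 55.0, 61.0]
example_group = [17.0, 18.0, 16.0, 17.0, 19.0]

t_stat, dof = welch_t(other_strains, example_group)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

The p-value would then come from the t distribution with `dof` degrees of freedom, which is left out here to keep the sketch within the standard library.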
Compared with non-pathogenic vibrios, such as the mutualist V. fischeri with 112 virulence genes, the Kiel-Fjord isolates encode more virulence genes (150 virulence genes; exception: strain K05K4 with 149 virulence genes). This number is lower than what has been found for human pathogenic vibrios, such as V. cholerae (~165 virulence genes) and V. parahaemolyticus (162 virulence genes), but similar to what has been found in other V. alginolyticus strains (except strain FDAARGOS_114: 165 virulence genes). Unique to the Kiel-Fjord isolates is the absence of genes involved in vibriobactin biosynthesis, which are present in almost all other V. alginolyticus strains. Similarly, in contrast to other V. alginolyticus isolates, the strains from the Kiel-Fjord lack the gene flaC, which is involved in the regulation of stringent-response-induced toxin production (54), and vscP, a gene involved in the type III secretion system. Conversely, the Kiel-alginolyticus ecotype encodes type IV secretion system effectors as well as the phytotoxin coronatine and the thermolabile hemolysin (tlh), two toxins that could not be found in any other V. alginolyticus strain. This unique virulence profile separates the Kiel-Fjord isolates from other Vibrio species, including other V. alginolyticus isolates, but also from another strain from the Kiel-Fjord: V. typhli K08M4 (55) (Fig. 6). This clear separation was further supported by a hierarchical cluster analysis (Fig. 6), which indicates that the Kiel-alginolyticus ecotype can be distinguished from all other V. alginolyticus strains sequenced to date based solely on the presence/absence of virulence genes. The unique resistance and virulence profile of the Kiel-Fjord V. alginolyticus isolates might be a further indication of niche adaptation followed by clonal expansion of this ecotype.