Pseudomonas sp. KRP1 belongs to the P. aeruginosa species
In silico assembly of the de novo sequenced Pseudomonas sp. KRP1 resulted in two circular contigs of 6,162,740 bp and 575,136 bp, with G+C contents of 66.41% and 64.79%, respectively. Synteny comparisons between the initial in silico assembly and closely related P. aeruginosa strains showed multiple rearrangements of the ORFs encoded on the putative mega plasmid. In P. aeruginosa PA14, the sequence corresponding to the 575,136 bp contig is located between two homologous large ribosomal RNA clusters. These clusters are known hotspots of intragenomic rearrangement within the P. aeruginosa species [4, 17]. The KRP1 homologue of PA14_61200 was also part of the putative mega plasmid and was the origin of a second reorganization event between PA14 and KRP1. As this large gene contains eleven 243 bp tandem repeats, it has previously proven hard to resolve by whole-genome sequencing of a P. aeruginosa strain [18]. Therefore, PCR was used to investigate the DNA sequence surrounding the ribosomal RNA clusters on the main chromosome and on the potential mega plasmid. This resulted in a redefined genome structure of KRP1 with one circular chromosome containing 6,301 protein-coding genes. The smaller contig was manually integrated at its correct position within the in silico sequence.
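As a plausibility check on the figures above, the length-weighted G+C content of the two contigs can be compared with the overall value listed for KRP1 in Table 1. The following is a minimal sketch that uses only the contig lengths and G+C percentages quoted in the text; the function name is illustrative, and the plain sum of the contig lengths is the pre-integration total, which need not equal the final chromosome length exactly.

```python
# Contig lengths (bp) and G+C percentages as reported for the initial assembly.
contigs = [
    (6_162_740, 66.41),  # larger circular contig
    (575_136, 64.79),    # smaller contig, initially taken for a mega plasmid
]


def combined_gc(contigs):
    """Return the length-weighted mean G+C percentage over all contigs."""
    total_len = sum(length for length, _ in contigs)
    weighted = sum(length * gc for length, gc in contigs)
    return weighted / total_len


total_length = sum(length for length, _ in contigs)
gc = combined_gc(contigs)
print(f"pre-integration total: {total_length:,} bp, combined G+C: {gc:.1f}%")
```

The weighted value rounds to 66.3%, which agrees with the GC-content given for KRP1 in Table 1.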
To clarify which species of the Pseudomonas genus KRP1 belongs to, the average nucleotide identity (ANI) was calculated with respect to 105 fully sequenced P. aeruginosa strains and 8 other Pseudomonas species. For all P. aeruginosa strains, the calculated values are well above the accepted species threshold of 95-96% (Figure 1, for exact values see Table S1). For the eight other closely related Pseudomonas species, ANI values range from 74.4% (P. psychrotolerans PRS08) to 80.4% (P. citronellolis P3B5) (for all exact values see Table S1). While ANI comparisons are based on the nucleotide sequence, P. aeruginosa KRP1 was also compared to the other strains and species on the basis of the amino acid sequences of the encoded proteins. The tree was built from a core of 1,537 genes per genome, comprising 532,537 amino acid residues per genome (Figure S1). For better visualization, a reduced version of the tree containing only the eight non-aeruginosa species and six P. aeruginosa strains is shown (Figure 2). The phylogenetic analyses clearly mark strain KRP1 as a representative of the species P. aeruginosa and show a clear distinction between this strain and other members of the same genus.
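The species assignment by ANI amounts to a simple threshold test. The sketch below applies the lower bound of the 95-96% range quoted above to a few ANI values taken from the text and from Table 1; the dictionary and function names are illustrative and not part of any ANI software.

```python
# Lower bound of the 95-96% species threshold quoted in the text.
SPECIES_THRESHOLD = 95.0

# ANI values relative to KRP1, taken from the text and Table 1.
ani_to_krp1 = {
    "P. aeruginosa PAO1": 99.24,
    "P. aeruginosa PA14": 98.36,
    "P. aeruginosa FA-HZ1": 99.98,
    "P. citronellolis P3B5": 80.4,
    "P. psychrotolerans PRS08": 74.4,
}


def same_species(ani, threshold=SPECIES_THRESHOLD):
    """True if the ANI value reaches the species threshold."""
    return ani >= threshold


calls = {name: same_species(value) for name, value in ani_to_krp1.items()}
for name, is_same in calls.items():
    print(f"{name}: {'same species' if is_same else 'different species'}")
```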
P. aeruginosa KRP1 relation to closely related P. aeruginosa strains
The phylogenetic trees in Figure 2 and Figure S1 are based on amino acid sequences and therefore reflect only non-synonymous nucleotide substitutions. For a more in-depth investigation of KRP1, its genome was compared to the type strain PAO1, the highly researched strain PA14 and the two strains FA-HZ1 and W45909, to which KRP1 clusters most closely in the phylogenetic analyses (Table 1).
Table 1: Genomic overview of different P. aeruginosa strains used in this study.
P. aeruginosa strain | Total length [bp] | GC-content [%] | Number of predicted genes | ANI with KRP1 [%] | Comment | Reference
KRP1 | 6,737,396 | 66.3 | 6,301 | - | - | This study
PAO1 | 6,264,404 | 66.6 | 5,700 | 99.24 | type strain | [17]
PA14 | 6,537,648 | 66.3 | 6,177 | 98.36 | common research strain | [4]
LESB58 | 6,601,757 | 66.3 | 6,135 | 98.81 | hyper virulent strain | [16]
FA-HZ1 | 6,866,790 | 66.2 | 6,389 | 99.98 | closest sequenced relative to KRP1 | [21]
W45909 | 6,777,566 | 66.2 | 6,475 | 99.96 | 2nd closest sequenced relative to KRP1 | [22]
Regarding the synteny between KRP1 and the other P. aeruginosa strains, it becomes apparent that the type strain P. aeruginosa PAO1 (AE004091) shows a large-scale inversion with respect to the other five P. aeruginosa strains tested, which affects roughly 70% of the genome (Figure 3). This inversion of PAO1 with respect to other P. aeruginosa strains has been observed before and is linked to an intra-chromosomal rearrangement at two of the four copies of a large ribosomal RNA cluster [4, 17]. In its overall genome arrangement, KRP1 shows a high degree of synteny with the strains FA-HZ1, W45909 and PA14 across the whole genome rather than only in certain parts.
The genome of P. aeruginosa has a mosaic-like structure, built of a conserved core (~90% of the total genome), which is interrupted by genomic islands containing variable accessory genes [7]. The numerical distribution of genes between the core and the accessory genome of the six P. aeruginosa strains (KRP1, PAO1, PA14, LESB58, FA-HZ1 & W45909) was analyzed using EDGAR, and the results are summarized in Figure 4. The core genome shared by KRP1 and the two predominantly researched strains PAO1 and PA14 consists of 5,278 genes (Figure 4 A). This is equivalent to between 83.8% (KRP1) and 92.6% (PAO1) of all genes annotated in the respective genomes (Table 1). KRP1 and PA14 contain orthologues of 216 genes that are not part of the genome of the species type strain PAO1 (Figure 4 A, Area I). Additionally, there are 583 genes in KRP1 for which orthologues are not found in either of the two other strains (Figure 4 A, Area II). Thus, the environmental isolate KRP1 encodes a substantially higher number of singletons than PAO1 or PA14. The overlap in the accessory genome of KRP1 is more pronounced with the FA-HZ1 and W45909 strains of P. aeruginosa (Figure 4 B, Area III). Both strains also cluster as the closest relatives of KRP1 when only the core genome is taken into the calculation (Figure 2 & Figure S1). The three strains share a total of 5,667 genes, which corresponds to 89.94% of all predicted KRP1 ORFs. The clinical isolate W45909 and the environmental isolate KRP1 both contain five genes related to inorganic ion transport and metabolism that are not present in any of the other analyzed strains. All of these genes encode proteins related to copper metabolism and might relate to common environmental challenges faced by the strains in their natural habitat. With the highly virulent LESB58 strain, KRP1 shares a total of 5,503 genes (Figure 4 B, core + Area IV and V).
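The percentages quoted above follow directly from the shared gene counts and the gene totals in Table 1. A minimal sketch of this arithmetic, using only numbers stated in the text and in Table 1 (variable names are illustrative):

```python
# Predicted gene totals from Table 1.
predicted_genes = {"KRP1": 6301, "PAO1": 5700, "PA14": 6177}

# Shared gene counts stated in the text.
core_krp1_pao1_pa14 = 5278
shared_krp1_fahz1_w45909 = 5667


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Fraction of each genome covered by the KRP1/PAO1/PA14 core.
core_share = {
    strain: round(share(core_krp1_pao1_pa14, total), 1)
    for strain, total in predicted_genes.items()
}
# Fraction of KRP1 ORFs shared with FA-HZ1 and W45909.
krp1_share = round(share(shared_krp1_fahz1_w45909, predicted_genes["KRP1"]), 2)

print(core_share)
print(krp1_share)
```

The same calculation gives a core share of about 85.4% for PA14, which lies within the range bounded by KRP1 and PAO1.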
In an inter-strain comparison of these four strains (LESB58, FA-HZ1, KRP1 & W45909; Figure 4 B), the KRP1 genome encodes the lowest number of singletons (Figure 4 B, Area VI). Of these 102 genes, ~78% did not yield a BLAST hit within the COG database, highlighting that most of the genes of this area encode novel or hypothetical proteins (Figure 5, Table S2). This high proportion of unclassified genes was typical for all overlap areas investigated in more detail, except for the overlap of the KRP1 strain with LESB58 and W45909 (Figure 4 B, Area IV). Here, the majority of the genes have a metabolic function and ~27% are related to cellular processes and signaling (Figure 5, Table S3).
The accessory genome of P. aeruginosa KRP1
In P. aeruginosa, genes of the accessory genome tend to cluster into genomic islands (GIs) [7]. Therefore, the genome of KRP1 was analyzed to detect putative GIs. Since the existing software programs differ in sensitivity and have different shortcomings, multiple programs were used: SIGI-HMM [23], IslandPath-DIMOB [24], PHASTER [25] and GIPSy [26]. The screening was complemented by manual mining for known P. aeruginosa GIs. A list of the initial 28 GIs can be found in Klockgether et al. [10]. This list was extended by the findings of Silveira et al. [11], Hong et al. [12] and Jani et al. [13]. Over the years, different nomenclatures were established, naming the islands PAPI-X (P. aeruginosa pathogenicity island), PAGI-X (P. aeruginosa genomic island) and LESGI-X (Liverpool Epidemic Strain genomic island). It is important to note that no direct correlation between PAGI and LESGI exists and that the respective islands are not exclusive to the PA or LES strains of P. aeruginosa. With the approach used here, a total of 25 putative GIs scattered throughout the KRP1 genome were detected (Table 2, Figure 6). The loci of the majority of these GIs could be clearly assigned to specific “regions of genomic plasticity (RGPs)” [27] (Table 2), which mark locations where integration of foreign DNA into the P. aeruginosa genome has previously been reported to occur with increased frequency. In general, GIs can be assigned a proposed function based on the encoded genes. The GIPSy software [26] categorizes GIs into i) pathogenicity islands (PIs), which contain an increased number of genes related to pathogenicity factors, ii) resistance islands (RIs) for genes related to antibiotic resistance, iii) metabolic islands (MIs) with genes known to be related to the biosynthesis of (secondary) metabolites, and iv) symbiotic islands (SIs), which mainly contain genes related to a host-bacterium symbiotic relationship. In P. aeruginosa KRP1, all of these GI classes are found except for MIs (Table 2 & Figure 6), and some GIs are placed in more than one category. For a GI to be classified into one of these categories, it is not necessary that every gene of the GI falls into that category.
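When several prediction programs are run on the same genome, their overlapping calls have to be consolidated into one list of islands. How this was done for KRP1 is not described here, so the following is only a sketch of one common approach, an interval merge that records which tools support each merged region. The coordinates in the example are hypothetical and the function name is illustrative.

```python
def merge_predictions(predictions, max_gap=0):
    """Merge overlapping (start, stop, tool) predictions.

    Returns a list of (start, stop, tools) tuples, where tools is the set
    of programs that contributed to the merged interval. Intervals closer
    than max_gap bp are merged as well (assumption: 0 means true overlap
    or direct adjacency only).
    """
    merged = []
    for start, stop, tool in sorted(predictions):
        if merged and start <= merged[-1][1] + max_gap:
            last_start, last_stop, tools = merged[-1]
            merged[-1] = (last_start, max(last_stop, stop), tools | {tool})
        else:
            merged.append((start, stop, {tool}))
    return merged


# Hypothetical coordinates, for illustration only.
example = [
    (100, 500, "IslandPath-DIMOB"),
    (400, 900, "SIGI-HMM"),
    (2000, 2600, "GIPSy"),
]
islands = merge_predictions(example)
for start, stop, tools in islands:
    print(start, stop, sorted(tools))
```

Keeping the set of supporting tools per region mirrors the "Prediction Method" column of Table 2, where most islands are backed by more than one method.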
The genome of KRP1 was also analyzed to identify which versions of the known replacement islands are encoded, as these traits represent critical determinants for the fitness and virulence of an individual P. aeruginosa strain [7] (Table 3). The four replacement islands contain the same functional content and nearly always occupy the same genomic loci within the P. aeruginosa core genome. Intriguingly, the specific genetic sequence of each island is highly diverse between strains [28, 29]. Going by the RGP assignments in Table 2 and Table 3, the gene loci of the O-antigen gene cluster (RGP31) and the flagellin glycosylation replacement island (RGP9) are part of PI/SI 7 and RI/SI 16, respectively. The genes encoding pyoverdine production and pilin/pilin modification are not identified by the different genomic island detection programs. The pyoverdine locus is located between PI/RI 12 and GI 13, while the pilin modification genes are situated between PI 19 and GI 20.
Of the GIs recognized by the prediction software packages, PI/RI 1, GI 3, PI/RI 12 and GI 17 share a large portion of their nucleotide sequence with the other investigated P. aeruginosa genomes (i.e. with PA14: 50%, 80%, 80% and 90%, respectively). On the other hand, unique putative genes within them are assigned to only one of the analyzed strains and their integration into the core genome could be traced to a specific known RGP (Table 2). This classifies them as valid regions of the accessory genome of P. aeruginosa.
GIs are usually elements of foreign DNA and frequently represent phage genomes that have integrated into the host genome. The PHASTER software [25] and its predecessor PHAST [30] are designed to rapidly identify and annotate prophage sequences within a bacterial query genome. The software detected seven prophages throughout the KRP1 genome (Table S4). All of the detected sequences can be assigned to specific GIs and were therefore also recognized by the other genomic island detection programs tested. PHASTER classified four of these seven prophages as intact, meaning that their genomes contain all the parts necessary to form a complete phage and therefore to leave the host genome again at a given point in time. Incomplete phages have likely lost essential parts of their genome, binding them permanently within the host's DNA. One of the intact prophages is part of GI 23. The majority of its 51 putative ORFs show significant sequence similarity to the bacteriophage JBD93 [31] (92% identity over 86% of the query length) (Table S4). A highly similar prophage, termed LES-prophage 4, is integrated into the genome of the P. aeruginosa LESB58 strain [16]. The intact prophage genome is inverted within KRP1, and its ORFs are clustered into genes encoding proteins for tail assembly, genes facilitating head assembly, genes used for integration, and genes responsible for lysogeny and lysis. GI 23 is also the only GI for which no homolog could be detected in any of the other five investigated P. aeruginosa strains, and it contains most of the detected singleton genes of KRP1 (Figure 6). Its integration disrupts the MdlC benzoylformate decarboxylase locus (PA14_64770), which has not been recognized as an RGP in P. aeruginosa before.
Table 2: Summary of genomic islands predictions in P. aeruginosa KRP1.
Genomic Island | Start position [bp] | Stop position [bp] | Size [bp] | KRP1 locus tag (number of ORFs) | RGP* | Prediction Method
PI/RI 1 | 40,389 | 61,808 | 21,419 | KRP1_00205 – KRP1_00235 (7) | RGP46 | 2 & 4
GI 2 | 285,777 | 298,203 | 12,426 | KRP1_01295 – KRP1_01335 (9) | RGP2 | 5
GI 3 | 671,911 | 697,058 | 25,147 | KRP1_03145 – KRP1_03300 (32) | RGP3/4 | 1 & 3
PI 4 | 1,063,975 | 1,085,974 | 21,999 | KRP1_05045 – KRP1_05145 (21) | RGP88 | 1, 2, 4 & 5
GI 5 | 1,222,896 | 1,230,252 | 7,356 | KRP1_05780 – KRP1_05780 (1) | RGP89 | 5
GI 6 | 1,302,346 | 1,320,820 | 18,474 | KRP1_06140 – KRP1_06225 (17) | RGP36 | 1 & 2
PI/SI 7 | 1,973,098 | 1,991,464 | 18,366 | KRP1_09255 – KRP1_09325 (54) | RGP31 | 2 & 4
PI 8 | 2,424,758 | 2,470,270 | 45,512 | KRP1_11590 – KRP1_11760 (36) | RGP28 | 1, 2, 3, 4 & 5
GI 9 | 2,533,287 | 2,538,355 | 5,068 | KRP1_12025 – KRP1_12060 (8) | - | 2 & 5
GI 10 | 2,556,402 | 2,564,724 | 8,322 | KRP1_12155 – KRP1_12190 (8) | RGP71 | 5
PI/RI/SI 11a | 2,632,036 | 2,744,677 | 112,641 | KRP1_12500 – KRP1_13040 (109) | RGP27 | 1, 2, 4 & 5
GI 11b | 2,751,082 | 2,753,517 | 2,435 | KRP1_13080 – KRP1_13095 (4) | RGP27 | 2
PI/RI 12 | 2,895,779 | 2,921,721 | 25,942 | KRP1_13740 – KRP1_13765 (7) | RGP25 | 4 & 5
GI 13 | 3,221,391 | 3,272,809 | 51,418 | KRP1_14895 – KRP1_15110 (44) | RGP23 | 1, 2, 4 & 5
GI 14 | 3,577,280 | 3,579,282 | 2,002 | KRP1_16510 – KRP1_16510 (1) | RGP52 | 5
GI 15 | 3,769,299 | 3,777,692 | 8,393 | KRP1_17360 – KRP1_17400 (9) | - | 5
RI/SI 16 | 4,485,821 | 4,496,553 | 10,732 | KRP1_20830 – KRP1_20870 (9) | RGP9 | 4
GI 17 | 4,592,095 | 4,616,393 | 24,298 | KRP1_21355 – KRP1_21485 (27) | RGP7 | 1 & 5
GI 18 | 4,762,338 | 4,768,531 | 6,193 | KRP1_2211 – KRP1_22245 (8) | RGP6 | 5
PI 19a | 4,867,542 | 4,906,902 | 39,360 | KRP1_22720 – KRP1_22960 (49) | RGP5 | 1, 3 & 4
PI 19b1 | 4,906,929 | 4,925,297 | 18,368 | KRP1_22965 – KRP1_23060 (20) | RGP5/41 | 1 & 4
PI 19c | 4,925,522 | 4,955,315 | 29,793 | KRP1_23065 – KRP1_23155 (19) | RGP41 | 2 & 4
PI 19b2 | 4,955,299 | 4,983,156 | 27,857 | KRP1_23160 – KRP1_23310 (31) | RGP5/41 | 1, 2 & 4
PI 19d | 4,983,197 | 5,009,461 | 26,264 | KRP1_23315 – KRP1_23425 (23) | RGP5 | 1, 2, 3 & 4
GI 20a | 5,366,804 | 5,428,778 | 61,975 | KRP1_25040 – KRP1_25385 (70) | RGP41 | 1, 2, 3, 4 & 5
GI 20b | 5,455,015 | 5,464,821 | 9,807 | KRP1_25540 – KRP1_25565 (6) | RGP41 | 2, 4 & 5
GI 21 | 5,615,479 | 5,626,409 | 10,930 | KRP1_26250 – KRP1_26310 (13) | - | 5
GI 22 | 5,700,164 | 5,727,413 | 27,249 | KRP1_26660 – KRP1_26790 (7) | - | 5
GI 23 | 5,875,381 | 5,911,730 | 36,349 | KRP1_27515 – KRP1_27765 (51) | - | 1, 3, 4 & 5
GI 24a | 6,203,408 | 6,209,504 | 6,096 | KRP1_29015 – KRP1_29040 (6) | - | 5
PI 24b | 6,209,865 | 6,225,427 | 15,562 | KRP1_29045 – KRP1_29090 (10) | RGP62 | 1, 2, 4 & 5
GI 24c | 6,225,700 | 6,237,151 | 11,451 | KRP1_29095 – KRP1_29150 (12) | - | 5
PI 24d | 6,239,438 | 6,281,035 | 41,597 | KRP1_29155 – KRP1_29380 (46) | RGP87 | 1, 3, 4 & 5
GI 24e | 6,281,429 | 6,299,576 | 18,147 | KRP1_29385 – KRP1_29450 (14) | - | 5
GI 25 | 6,397,652 | 6,402,302 | 4,650 | KRP1_29920 – KRP1_29930 (3) | - | 2
* Reported regions of genomic plasticity - RGPs 1–62: [27]; RGPs 63–80: [33]; RGP 81-86: [16]; RGP 87-89: [34]; RGP 90-97: [35].
Prediction method: 1: IslandPath-DIMOB [24]; 2: SIGI-HMM [23]; 3: PHASTER [25]; 4: GIPSy [26]; 5: manual blast against previously described P. aeruginosa GIs.
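The Size column of Table 2 can be re-derived from the start and stop positions. The sketch below checks a few rows copied from the table and reports whether the listed size matches the plain difference of the coordinates or the inclusive span (difference plus one); the function name and convention labels are illustrative.

```python
# (start, stop, listed size) for a few rows copied from Table 2.
rows = {
    "PI/RI 1": (40_389, 61_808, 21_419),
    "GI 14": (3_577_280, 3_579_282, 2_002),
    "GI 20a": (5_366_804, 5_428_778, 61_975),
    "GI 20b": (5_455_015, 5_464_821, 9_807),
}


def size_convention(start, stop, size):
    """Report which coordinate convention reproduces the listed size."""
    if size == stop - start:
        return "stop - start"
    if size == stop - start + 1:
        return "stop - start + 1"
    return "other"


conventions = {name: size_convention(*row) for name, row in rows.items()}
for name, convention in conventions.items():
    print(f"{name}: {convention}")
```

For the rows checked here, the sizes of GI 20a and GI 20b correspond to the inclusive span, while PI/RI 1 and GI 14 correspond to the plain difference, so island sizes taken from the table may differ by one base pair depending on the row.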
Table 3: Replacement islands in P. aeruginosa.
Replacement island | Subgroups | RGP* | PAO1 | PA14 | LESB58 | FA-HZ1 | W45909 | KRP1
O-antigen biosynthetic locus | 20 [36] | RGP31 | O5 [37] | O10 [4] | O6 [38] | O1 (this study) | O1 (this study) | O1 (this study)
Pyoverdine locus | 3 [39] | RGP73 | type I [28] | type I [28] | type III [13] | type I (this study) | type I (this study) | type I (this study)
Pilin and pilin modification genes | 5 [40] | RGP60 | group II [40] | group III [40] | group I [41] | group I (this study) | group I (this study) | group I (this study)
Flagellin glycosylation island | 2 [42] | RGP9 | b-type [42] | b-type [43] | b-type [44] | a-type (this study) | a-type (this study) | a-type (this study)
* RGPs 1–62: [27]; RGPs 63–80: [33].
Besides GI 23, PI 24d is a predicted intact prophage (Table S4). The same φCTX phage was previously detected within the P. aeruginosa PSE9 strain [34], where it was termed PAGI-6. Interestingly, the φCTX phage integrated into the KRP1 genome more than once. It is also encoded within the first part of PI 19. In both loci, the prophage lacks the φCTX cytotoxin gene ctx. While the PI 24d copy of the prophage contains two pseudogenes at the ctx position, the PI 19a copy does not contain this 1,589 bp stretch of DNA. Hence, this locus marks an RGP of the prophage and further highlights that the ctx gene is not part of the ancestral φCTX genome but rather was incorporated by horizontal gene transfer later on, as has been previously suggested in the literature [31, 45, 46]. A second RGP within the φCTX genome of PI 19a spans the 5 ORFs KRP1_22865 - KRP1_22885, which encode proteins of unknown function that differ from PAGI-6 and the known φCTX genome. Another difference between PI 24d and PI 19a is that two recombinases (KRP1_29355 & KRP1_29375) and an integrase (KRP1_29370) are encoded only in PI 24d, while PI 19a encodes a different integrase (KRP1_22960), which shows sequence similarity to KRP1_12505, the integrase encoded at the start of PI/RI/SI 11a. This is hardly surprising, since both of these integrases facilitate integration downstream of a tRNAGly, while PI 24d is integrated downstream of a tRNAThr.
GI integration downstream of a tRNA is a well-studied phenomenon [47, 48]. The 3’-ends of tRNAs carry attB sites, which are recognized and used for site-specific recombination between an integrative and conjugative element (ICE) and the main chromosome. Overall, the integration of PI 8, 19 and 24b&d, 16 and 17 as well as GI 17, 11, and 20 occurred just downstream of specific tRNAs within the KRP1 genome. Of these islands, GI 17, PI 19 and GI 20 belong to the same family of P. aeruginosa GIs, which are marked by their bipartite structure. While the first segment, downstream of the tRNA, contains strain-specific cargo ORFs, the second part shows a high degree of sequence similarity between the strains [16, 47] and mainly encodes structural and mobility-related genes, as well as genes for conjugal transfer [10]. Due to their conserved structure, islands of this family and similar islands in other β- and γ-proteobacteria are likely to be derived from one common ancient ancestor [49]. Additional, previously detected and analyzed GIs of this family include PAGI-2, PAGI-3 and LESGI-3 [16, 47]. Cargo genes of GI 8 include heavy metal resistance genes, genes for metabolic enzymes and enzymes used for the formation and alteration of nucleic acids, transcription regulators, a two-component system as well as an antibiotic resistance gene. While the cargo genes analyzed here are KRP1-specific with respect to PAGI-2, PAGI-3 and LESGI-3, they share 99% sequence identity with 13 of the 105 P. aeruginosa isolates used for phylogenetic comparison (Table S1). These include the previously mentioned FA-HZ1 and W45909 strains. We hypothesize that this set of cargo genes forms a unit that contributes to the successful survival of P. aeruginosa in certain habitats.
PI 19 is the structurally most complex genomic island of KRP1. PI 19b shows substantial sequence similarity to the second half of GI 11 and to LESGI-3 [16] (Figure 7). PI 19b mainly encodes proteins that relate to bacterial conjugation, including part of a tra-like operon (KRP1_22995, KRP1_23010 & KRP1_23030) known to facilitate plasmid transfer in Gram-negative bacteria [51]. Instead of an array of cargo genes, as in GI 11, PI 19 has the integrated prophage PI 19a (φCTX phage) upstream of the conserved block. The tra-like transfer operon of PI 19b is interrupted by another section of DNA, described as PI 19c, which is likely of foreign origin. The G+C content of 62.32% in PI 19c differs noticeably from that of the surrounding PI 19b parts, with G+C contents of 68.21% and 67.17%, respectively. The enclosed genes have their closest sequence homologues outside the Pseudomonas genus and include i) genes associated with the genomic repair system of the cell (DNA repair proteins (KRP1_23085 & KRP1_23105) and RNA polymerase-associated proteins needed for RNA polymerase recycling (KRP1_23090 & KRP1_23110)) and ii) a cluster of metabolic enzymes (thiamine biosynthesis protein (KRP1_23140), a cyclic AMP-GMP synthase (KRP1_23145) and a patatin-like phospholipase (KRP1_23150)). Interestingly, no genes associated with the relocation of genomic DNA are encoded in this region of the island. This suggests either that the integration into PI 19b occurred a long time ago and was followed by partial deletion of the respective genes, or that PI 19c integrated by an as yet unknown mechanism. The tra-like transfer operon of PI 19b continues at the KRP1_23175 ORF, highlighting the interruptive nature of PI 19c within PI 19b. The rest of PI 19b contains mainly genes for proteins of unknown function.
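A deviating G+C content, as reported for PI 19c, is a common indicator of foreign DNA. The following sketch shows how the G+C percentage of a sequence can be computed and how a region can be flagged when it differs from both flanking regions. The 3 percentage point cut-off is an arbitrary assumption chosen for illustration, not a value from this study, and the short test sequence is a toy example.

```python
def gc_content(seq):
    """Return the G+C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def deviates(region_gc, flank_gcs, min_diff=3.0):
    """True if the region differs from every flank by at least min_diff
    percentage points (min_diff is an illustrative assumption)."""
    return all(abs(region_gc - flank) >= min_diff for flank in flank_gcs)


# Toy sequence: 6 of 8 bases are G or C.
toy_gc = gc_content("GGCCATGC")

# G+C values reported in the text for PI 19c and the two PI 19b parts.
pi19c_flagged = deviates(62.32, [68.21, 67.17])
print(toy_gc, pi19c_flagged)
```

With the reported values, PI 19c differs from its flanks by 5.89 and 4.85 percentage points, so it is flagged under this cut-off.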
The last 23 ORFs of PI 19 likely form an additional separate sub-section PI 19d, as they contain multiple elements related to transposable rearrangement of DNA (KRP1_23345 - KRP1_23360) and other phage related proteins (Table S4). Other cargo genes of this sub-island include multiple proteases (KRP1_23370, KRP1_23380, KRP1_23425 & KRP1_23430) and genes linked to different stress responses like acid tolerance (KRP1_23390 & KRP1_23395) and phosphate starvation (KRP1_23420). This last part of PI 19 represents classical strain specific cargo genes, which are typical for the previously explained PAGI-2 and PAGI-3 like family of P. aeruginosa GIs [10].
Genomic resemblance of KRP1 to highly virulent P. aeruginosa strains
P. aeruginosa KRP1 contains an array of genomic elements that are found in the highly virulent strains PSE9 [14, 34] and LESB58 [13, 16, 52] (Table 4). Unfortunately, no complete genome sequence is available yet for PSE9, so it could not be included in the full genome comparison. However, some of the shared GIs have been shown to be the source of the strain-dependent virulence within the P. aeruginosa species [14-16]. KRP1 encodes all seven genomic islands found in the clinical isolate PSE9 [14, 34] (Table 4). PAGI-9 (GI 5) and PAGI-11 (GI 14) were not detected by the GI prediction tools used in this study, but their presence within the KRP1 genome was manually verified (Table 2). PAGI-9 consists of 6,581 bp and one large ORF, which was identified as an Rhs (rearrangement hot spot) element [34]. The nucleotide sequence of these elements generally has a bipartite structure composed of a long G+C-rich core and a relatively G+C-poor tip sequence. While the core sequence is highly conserved within and between species, the tip is rather variable. Rhs elements in P. aeruginosa are often linked to, and translocated via, type VI secretion systems [53-55]. Other members of the Rhs element family have been shown to exhibit bacteriocin properties, highlighting their use in inter-prokaryotic competition [56]. Similarly, PAGI-10 is an Rhs element of PSE9, which is also found within KRP1 (PI/RI 9). The fact that the strains PSE9 and KRP1 show sequence identity over the entire length of the ORF, and not only in the conserved core, shows the close genomic relationship between the hyper virulent PSE9 and KRP1.
PAGI-11 of PSE9 (GI 14 in KRP1) is only 2,003 bp long and located at RGP 52 (Table 4). While Battle et al. [34] did not find any ORFs within it, the Prokka pipeline [57], used on the KRP1 genome, predicts the hypothetical protein KRP1_16515. The G+C content of just 43.19% is far below the average of the KRP1 genome (66.3%). Other strains are known to contain larger GIs that encode mobile element-related genes at this specific genomic locus [27]. Therefore, PAGI-11 might have been a larger genomic island in the past, which was partially lost over time in PSE9 and KRP1. The PSE9 strain was isolated from a patient with ventilator-associated pneumonia at a hospital in Barcelona, Spain, between May 1993 and October 1997 [58]. It was found to be the most virulent of 35 strains in a mouse model of acute pneumonia [59]. So far, two studies have been able to link the increased virulence of PSE9 directly to PAGI-5 and PAGI-9 [14, 15]. Since KRP1 contains both of these islands, an increased virulence similar to that of PSE9 can be anticipated. Further, PSE9 and KRP1 share the same O-antigen type O1 (Table 3). The O-antigen type of the outer membrane lipopolysaccharide (LPS) layer has been previously linked to the virulence of P. aeruginosa, but most studies have focused on the serotype of the type strain PAO1 (type O5) [60]. Both strains are also exoS positive and exoU negative, a genotype that has been linked to an invasive phenotype [61]. Since no full genome sequence of PSE9 is available so far, a deeper in silico comparison between both strains is currently impossible.
Besides PSE9, the P. aeruginosa strain KRP1 shows substantial similarities in its accessory genome with the LESB58 strain, an aggressive pathogen isolated from a cystic fibrosis patient in Liverpool in 1988 [13, 16, 52] (Table 4). The majority of the shared GIs were found via manual search rather than by the applied software programs. LESGI-6 to LESGI-17 were first detected by Jani et al. [13]. The authors used a genome segmentation approach to identify genomic regions of foreign origin within the LESB58 strain. This technique differs from the ones used in this study, and therefore different putative GIs were detected. The authors showed that these GIs encode additional virulence factors (LESGI-6, -8, -13, and -15) as well as drug and metal resistance cassettes (LESGI-12 and -17). LESGI-9, -16, and -17 add versatility to the LESB58 metabolic repertoire [13]. Since KRP1 encodes all of these GIs as well, it is very likely that it employs their functions and therefore shows an increased virulence potential, similar to the LESB58 strain.
Table 4: Genomic Islands in different P. aeruginosa strains. GIs of strain PSE9 and selected GIs of strain LESB58 and their corresponding GI as well as sequence similarity within the strain KRP1.
Genomic island | Location within KRP1 | Sequence identity (Query length)
PSE9 GIs | |
PAGI-5 | GI 20 | 99.98% (98%)
PAGI-6 | PI 24d | 99.98% (100%)
PAGI-7 | PI 4 | 100% (100%)
PAGI-8 | PI 24b | 99.99% (100%)
PAGI-9 | GI 5 | 100% (100%)
PAGI-10 | PI/RI 9 | 99.97% (100%)
PAGI-11 | GI 14 | 100% (100%)
LESB58 GIs | |
LESGI-1 | PI 8 | 98.62% (96%)
LESGI-3 | GI 11 | 99.54% (65%)
LESGI-4 | GI 13 | 99.61% (98%)
LESGI-6 | GI 2 | 99.36% (100%)
LESGI-8 | GI 9 | 99.41% (100%)
LESGI-9 | GI 10 | 99.75% (100%)
LESGI-12 | GI 15 | 99.55% (100%)
LESGI-13 | GI 17 | 99.60% (100%)
LESGI-14 | GI 18 | 99.56% (100%)
LESGI-15 | GI 21 | 99.73% (100%)
LESGI-16 | GI 22 | 99.62% (100%)
LESGI-17 | GI 24 | 99.59% (96%)
LES-prophage 4 | GI 23 | 89.31% (73%)
In contrast, the two strains showing the highest ANI and the closest phylogenetic relationship with KRP1 are the P. aeruginosa strains FA-HZ1 and W45909 (Figure 1 and Figure 2). FA-HZ1 is a dibenzofuran-degrading isolate from China [21], while W45909 is a clinical isolate from the USA [22]. Nearly all GIs identified in KRP1 are also present in these two most closely related strains; the exceptions are PI 8, PI 19 & GI 23 for W45909 and GI 23 for FA-HZ1. This provides circumstantial evidence that the genomic repertoire of P. aeruginosa KRP1 is likely to sustain a pathogenic as well as an apathogenic lifestyle in nature. While the genetic information of both strains is available, no further studies have been performed with either of them, but we expect that, like KRP1, they will also show an increased virulence.