Induction profiles of identified, intact prophages
Our bioinformatics analysis identified a sum of 29 intact prophages in the genome of the 12 P. aeruginosa clinical isolates. Of these, 22 intact prophages seemed to be longitudinally frequent when scanning the genomes of isolates from the same CF lung. Prophage Hunter generally outperformed PHASTER in predicting prophage completeness (Supplementary Table 3). Double agar overlay assays were the first tests undertaken to identify free phages resulting from our induction experiments. However, these assays alone would have been insufficient to distinguish whether lawn lysis stems from phage rather than bacteriocin killing. Such level of detail was possible thanks to the sequencing analysis. By mapping lysate reads to their corresponding host genome we could identify induced prophage locations with high border accuracy. Looking back at the results of the double agar overlay assays (Fig. 1a), we noted that clearing zones and plaques did always result from prophage induction, as confirmed from the sequencing (Fig. 1b).
In total, 15 predicted-as-intact prophages were found to exist as free particles by lysate sequencing. While we did predict genes encoding a major capsid and other structural proteins for the remaining 14 prophages (Supplementary Table 4), no or scarce reads corresponding to them were captured (Fig. 1b). Given that lysate DNA was extracted from at least two different time points per isolate and type of induction, it is unlikely that our method missed induced dsDNA prophages. Explanations for why the remaining 14 prophages were not induced can be attributed to various factors. For example, lysogenic to lytic conversion for these prophages may be triggered by conditions different to the ones tested [60], or they may belong to a non-inducible class of temperate phages (such as Escherichia virus P2) [61]. Another improbable scenario is that, while these prophages excised, our DNA filtration and Illumina sequencing methods selected against them because of their ssDNA genome [62].
Of the induced prophages some self-induced, while others excised due to one or both antibiotics used (mitC or cipro; Fig. 1b). Under the tested in vitro conditions, self-induction was common and occurred for almost 50% of the prophages sequenced. This aligns with multiple studies on P. aeruginosa clinical isolates from CF patients, such as [63, 64], which reported frequent in vitro prophage self-induction. Cipro and mitC caused additional prophages to excise from the 12 isolates (Fig. 1b). In particular, cipro induction patterns corroborated and occasionally expanded those of mitC (Fig. 1a). Ciprofloxacin, a fluoroquinolone commonly prescribed to CF patients including the Copenhagen CF Center patients, was previously shown to trigger high rates of in vitro prophage induction in clinical P. aeruginosa [65]. Considering these in vitro observations, we can extrapolate that in vivo P. aeruginosa-infected CF lungs often contain high titres of excised prophages, as already showcased for isolate LESB58’s prophages [66]. This extrapolation can be further supported by the regularity of prophages in genomes of P. aeruginosa CF lung isolates. Our results support findings in other studies that most clinical P. aeruginosa are mono- or polylysogenic [63, 64], because except F004 all isolates harboured prophages, with five of twelve being polylysogens harbouring two to eight intact prophages (Fig. 1b). It is tempting to hypothesize that in the CF lung environment, lysogeny (especially polylysogeny) contributes to P. aeruginosa persisting longer than its nonlysogenic or prophage-poor counterparts. This would be due to prophage moron genes and/or superinfection exclusion.
In the next sections, we present and discuss our results which support this hypothesis, i.e. that intact prophages favour P. aeruginosa persistence in the CF lung. More and more studies offer evidence that justifies this assumption. For instance, P. aeruginosa prophages DMS3 and pp3 were shown to exclude phages that require type IV pilus as a receptor and to assist in host adaptation by promoting biofilm formation, respectively [67, 68]. In other studies, the acquisition of prophages decreased antibiotic susceptibility and virulence of clinical P. aeruginosa and increased biofilm formation [69]. Moreover, Burns et al. [71] demonstrated that in mixed infections PAO1 polylysogens used phage predation to prevail over their isogenic prophage-free competitors. Intriguingly, monolysogeny here was indeed almost exclusively associated to isolates from monoclonal infections, except for isolate F002 (Fig. 1b). Yet, the CT of F002 (DK12) was reported to occur in multiple patients [72] likely implying its direct acquisition from a polyclonal CF lung environment.
To explore possible correlations between polylysogeny of an isolate and a higher permissiveness to phage invasion, we performed in silico predictions of CRISPR-Cas systems. In line with others [73], our study revealed that isolate genomes containing functional CRISPR-Cas systems harboured one or zero prophages. (Table 1). Despite the reported strong association between type I-F systems and self-targeting [73, 74], we detected no spacers targeting those native intact prophages. Conclusive answers to the above would necessitate experimental validation of the predicted CRISPR-Cas systems, and of any phage countermeasures.
Table 1
List of studied isolates predicted to contain functional or orphan CRISPR-Cas systems as compared to the number of intact prophages harboured by each isolate.
Isolate | Nr. of Cas Operons | Predicted Cas Subtypes | Nr. of Associated CRISPR Arrays | Nr. of Intact Prophages | Native Prophage (self-) Targeting? |
37 | 1 | I-F | 2 | 1 | No |
135 | 1 | I-E | 2 | 1 | No |
199 | 1 | I-F | 2 | 1 | No |
F004 | 1 | I-F | 2 | 0 | No |
F023 | 1 | I-F | 2 | 1 | No |
F056 | 1 | I-F | 2 | 1 | No |
LRJ32 | - | orphan | 1 | 7 | - |
Genomic features and gene predictions of induced prophages
Fourteen of 29 intact prophage genomes could not be resolved with high-border accuracy as they remained uninduced. For this reason, this section focuses on the 15 induced prophages. In Table 2, we present the names and key genomic characteristics of these 15 prophages discovered here. Marin, Shamal, Haboob and Sirocco genomes were found to be terminally redundant with terminal repeats of 61, 64, 60 and 46 bp, respectively (Supplementary Table 4). Marin, Shamal and Haboob were also found to encode three, one and four tRNAs, respectively, with Marin and Haboob displaying an almost identical tRNA gene arrangement, despite their overall nucleotide dissimilarity (94% identity over 46% query cover by BLASTn). This finding aligns with the concept that prophages can have maximum four tRNAs and that phage genomes of length longer than the average (here 42,061 bp) are more prone to harbouring tRNA genes [75]. Notably, these same genomes had the highest %GC content disparity as compared to P. aeruginosa’s genome, which averages 66.6% [76]. In general, it is hypothesized that phages carry selected tRNA genes either to compensate for any codon usage differences with their host or to overcome host tRNA-depleting anti-phage strategies [75–77]. Our observation regarding the %GC content disparity would be better explained by the former theory. Collectively considered, tRNA presence and %GC content disparity could imply that Marin, Shamal and Haboob encountered various hosts throughout their evolutionary history.
Functional annotations could averagely be assigned to one-third of the predicted Open Reading Frames (ORFs; Table 2). Annotated ORFs were typically associated with morphogenesis (portal, capsid and tail genes), DNA packaging (terminase subunits) and lysogeny-lysis (repressor/antirepressor, integrase, transposase, endolysin, holin) modules due to these proteins being more conserved in phages. Transposition was predicted for prophages Bise, Etesian, Gregale, Meltemi, Marin and Rashabar and agrees with early reports on the wide distribution of transposable phages in P. aeruginosa [78]. By combining functional annotations and synteny maps to related phages, we predicted that the lysis-lysogeny switch was regulated by repressor CI and either Cro- or Ner-like antirepressors for most prophages (Table 2). Nevertheless, a combined repressor/antirepressor tree (Fig. 2b) failed to cluster prophages according to their induction patterns, suggesting that repressor/antirepressor genes are not alone in governing lysogeny-lysis decisions.
Besides the repressors/antirepressors, we were particularly interested in identifying morons. Moron proteins are recognised as important contributors to the fitness of P. aeruginosa and other bacteria through, for instance, antimicrobial resistance or virulence regulation [79]. Depending on their role, some could hence support persistence of the prophage host in the CF lung. Our screening against the CARD database located a single gene of prophage Ostro (peg.66) encoding a protein with 92.5% identity to a bicyclomycin resistance protein (NCBI: ALV80601.1). Our search against PHI-Base located a considerable number of proteins with similarity to moron virulence factors, and which were encoded by the genomes of Ostro, Alize, Haboob, Riah, Solano. Ostro harboured two candidate virulence genes, pegs.69–70. Peg.69 encoded a protein 100% identical to CcoN4 (PHI:7765), an orphan cbb3-type cytochrome-c oxidase subunit, which was shown promote pathogenicity and biofilm growth for strain PA14 in a Caenorhabditis elegans infection model [80]. The peg.70-encoded protein had 50% identity to the GntR family protein YdcR (PHI:7225), whose gene knock-outs led to dampened pathogenicity of Salmonella enterica in a murine infection model, while the protein itself seemed to directly regulate virulence factor SrfN [81]. Taken together the results of CARD and PHI-Base strongly suggest that Ostro greatly assists the fitness of its host, isolate 382. Solano and Alize harboured the homologous candidate virulence genes peg.55 and peg.10, respectively, which encoded a protein 50% identical to the Arc family DNA-binding protein AmrZ of P. syringae (PHI:5372). AmrZ acts as a cellulose biosynthesis repressor, and amrZ mutants displayed a hypovirulent phenotype implying that AmrZ may regulate up to several virulence factors [82], like also proven for P. aeruginosa [83]. Gene peg.67 of Haboob encoded a protein with 34% identity to AlgR, a LytTR DNA-binding domain-containing protein of P. aeruginosa (PHI:8158). AlgR activates the biosynthetic pathway of alginate, which leads to mucoidity and immunoevasion and eventually persistence of P. aeruginosa in CF [84]. Moreover, AlgR was found to contribute to virulence via the regulation of type IV pili, which are essential for P. aeruginosa mammalian cell colonization, competence and pathogenesis [85]. Lastly, Riah’s genome contained gene peg.58 that encoded a protein with 43% identity to DprA from Streptococcus pneumoniae (PHI:11059). Pneumococcal DprA's central role in virulence was demonstrated through targeted deletion, where its deficiency led to attenuated virulence in a bacteremia mouse model infection [86]. However, a connection of DprA to P. aeruginosa virulence is yet to be established.
Table 2
Names and key genomic characteristics of the induced P. aeruginosa prophages. The ends of all genomes were defined based on their integration in the host chromosome.
Prophage | Genome Size (bp) | Lysogenized Isolate | ORFs with Assigned Function | %GC Content | tRNA Genes | Repressor, Antirepressor |
Rashabar | 38,722 | 37 | 16/55 | 63.3 | | unclear, Ner-like |
Riah | 44,821 | 20 | 24/59 | 62.1 | | CI-like, Cro-like |
Alize | 38,819 | F002 | 27/65 | 61.9 | | CI-like, Cro-like |
Haboob | 50,033 | 382 | 26/85 | 59.7 | tRNAMet tRNAGly tRNAAsn tRNAThr | CI-like, Cro-like |
Marin | 51,941 | F002 | 29/84 | 59.9 | tRNAGly tRNAAsn tRNAThr | CI-like, Cro-like |
Shamal | 44,124 | F002 | 19/76 | 57.3 | tRNASer | CI-like, Cro-like |
Gregale | 36,566 | F002 | 15/53 | 64.2 | | CI-like, Ner-like |
Solano | 40,334 | LRJ32 | 24/62 | 62 | | CI-like, Cro-like |
Meltemi | 36,609 | LRJ32 | 15/54 | 64.5 | | CI-like, Ner-like |
Lodos | 44,613 | 135 | 24/55 | 63.9 | | CI-like, unclear |
Bise | 38,458 | 199 | 30/55 | 66 | | CI-like, Ner-like |
Turba | 42,092 | F038 | 22/51 | 63.8 | | CI-like, unclear |
Ostro | 48,846 | 382 | 27/70 | 62.4 | | CI-like, Cro-like |
Etesian | 37,990 | 382 | 20/58 | 64.3 | | CI-like, Ner-like |
Sirocco | 36,945 | 188 | 35/52 | 62.5 | | CI-like, Cox-like |
The genomes of identified intact prophages are highly diverse
To explore how diverse the 29 prophages were as compared to an external -yet related- prophage population, we built a database of previously published dsDNA genomes belonging to 26 P. aeruginosa prophages proven to exist as active particles (Supplementary Table 2). Intergenomic similarity scoring for all 55 genomes was performed with VIRIDIC following the current International Committee on Taxonomy of Viruses thresholds for species (< 95%) and genus (~ 70%) demarcation [56]. As anticipated by this threshold’s stringency, only two species clusters were formed, grouping prophage Etesian together with database prophage JBD5. JDB5 possesses two genes encoding anti-CRISPR (Acr) proteins, AcrIF3 (NCBI:YP_007392740.1) and AcrIE1 (NCBI: YP_007392738.1) encoded by genes peg.25 − 6 in Etesian. These proteins should enable Etesian’s spread to other P. aeruginosa that, unlike studied host 382, carry active type I-F and I-E CRISPR-Cas systems [87]. Along with Gregale and Meltemi, Etesian was also part of the genus cluster represented by database prophage D3112. Other formed genera clusters were represented by database prophages JBD25 and JBD67. JBD25 cluster included the intact but uninduced prophages Caju and Notus, while JBD67 cluster contained only Rashabar. Lastly, phiCTX’s cluster included prophage Sirocco and intact but uninduced prophage Harmattan, but neither of them encoded a cytotoxin related to phiCTX’s (NCBI: NP_490598.1). Sirocco’s clustering with prophages of the P2 non-inducible class matches our observations on the inconsistency of Sirocco’s induction. Like with phiCTX, eventual lysis may occur by mutations in a lysogeny-associated gene [88]. For even broader comparisons, this study’s prophage genomes were BLASTn-compared to the viral nucleotide collection database (Supplementary Table 5).
The remaining intact prophages formed clusters with reduced or no similarity to database prophages. Overall, our prophages were distributed across the VIRIDIC heatmap. To assess the level of diversity among these prophages, we compared results yielded by VIRIDIC with a ViPTree-built phylogeny. ViPtree constructs trees based on tBlastX similarities, hence increasing resolution of genomic relations. The derived tree (Fig. 3b) validated VIRIDIC and BLASTn results, highlighting the clade of active prophages Lodos-Turba-Riah and the monotypic taxon Bise for their very low similarity to any known prophages. Other active prophages with reduced similarity to known prophages were Haboob, Marin, Shamal, Alize, Ostro, and Solano. Notably, phylogenetically related prophages (Fig. 3b) were not restricted to phylogenetically related hosts (Fig. 3a). This, combined with the 29 prophages shared ecological niche, corroborates prior findings on the remarkable versatility and diversity of P. aeruginosa prophages [90].
The genomes of longitudinally frequent prophages remain virtually unchanged over long evolutionary times
Each of the 12 P. aeruginosa isolates that harbour the 29 intact prophages represents a patient persistently infected by P. aeruginosa. To gather more evidence on whether any of these prophages favours host fitness and persistence, we gauged how frequently they appear longitudinally. Per patient environment, sequenced longitudinal isolates spanned a period of at least four years (Fig. 4). All same-patient longitudinal isolates were BLASTn-compared to the complete genomic sequence of each induced prophage to confirm earlier bioinformatics-determined frequencies. Here we focus on the 15 intact and induced prophages, as complete genomes of the 14 uninduced prophages could not be verified with lysate sequencing. This approach is taken, because sole reliance upon low-contiguity Illumina-based assemblies for prophage discovery could lead to false-negative low prophage frequencies.
We found all 15 induced prophages to be present in at least 80% of the dates when a persistent CT was isolated, including the most recent date. Hence, all 15 induced prophages were deemed longitudinally frequent, or else persistent. Figure 4 presents the CT and longitudinal isolates scanned per patient. Notably, we observed minimal alterations in all induced prophages’ genomes throughout the extensive evolutionary timeframe of four to nine years that we considered. Figure 5 displays two cases of this genomic conservation; a monoclonal infection exemplified by the synteny map of prophage Alize and a polyclonal infection exemplified by that of Meltemi. The genome of Alize matched by 93% overall nucleotide similarity a region detected eight years later in isolate 32V99 of the same CT (Fig. 5a). Similarly, the genome of Meltemi matched by 82% overall nucleotide similarity a region detected nine years later in isolate 23V71 of the persistent CT DK67 (Fig. 5b). Remaining synteny maps are available in Supplementary Fig. 1.
Meltemi and its cohabiting prophage Solano constituted the only cases of a persistent prophage likely moving from a transient (DK18) to the persistent CT (DK67). We wanted to validate this hypothesis, versus the possibility that Meltemi and Solano were inherent features of DK67’s backbone. Indeed, the two phages were not found in 36V42, an isolate of DK67 which only transiently infected a patient. We extended this strategy and searched for additional CTs that persisted in one patient but transiently infected another, and found examples for DK12 and DK26. DK12 persisted in PID76609 and was transient for patient PID08309 (Fig. 4). We screened the genomes of isolates F042, F043 and 93 from transient DK12 against the genomes of persistent prophages Alize, Marin, Shamal, Gregale of isolate F002 (DK12). Again, we found no homologous sequences, which indicated that these prophages do not belong to DK12’s backbone. Another CT, DK26, was persistent in patients PID42824 and PID08309, which were represented by isolates 382 and 188. We found none of the four induced and two uninduced prophages from these isolates to be shared except in one case; Sirocco had 76% overall nucleotide similarity to a region in the genome of 382. The latter corresponded to the predicted as intact but uninduced prophage Khazri (Fig. 3b), which was found to be a longitudinally frequent element of 382 (Supplementary Table 4). To further clarify the case of Sirocco, we looked at the genomes of isolates F011, F044 and 24V78. F011 -sampled in 2002- and 24V78 -sampled in 2021- originated from the same CF patient and a region of 99% overall nucleotide similarity to Sirocco was traced in both. The infection by DK26 in this patient seemed to prevail for almost 19 years, but we did not have sequences of intermediate isolates to validate its persistence. However, F044 from a CF patient transiently infected by DK26, also harboured a region with 99% overall nucleotide similarity to Sirocco. This identified ubiquity of Sirocco-type prophages in DK26 isolates combined with the general non-inducibility of this prophage lineage (Fig. 3b) suggests that Sirocco is an inherent part of DK26’s backbone.
Active prophages are bacterial parasites that burden the host in a dual manner; they are metabolically costly and can revert to the lytic cycle resulting in host death. Thus, the trend would be that they undergo mutational degradation along with other parts of the bacterial genome and become grounded over time [89], unless they confer ecological and evolutionary benefits to their host [90]. Conversely, a prophage carrying a moron gene that provides fitness benefits unmatched by the host genome is predicted to persist within the host, even amidst conflicting selection pressures [91]. Stability in the host environment is estimated to further prolong maintenance of such prophages [91]. Indeed, stability is a trait that characterises Pseudomonas-dominated CF lungs [92], and likely creates a positive feedback loop between beneficial prophage maintenance and host persistence. Considering all these, we expect that, excluding Sirocco, all other induced prophages enhance their hosts’ fitness and persistence in the CF lung. Suggestive to this role are their high genomic conservation within the persistent CT and their long-term longitudinal persistence (in contrast to transient examples of the same CT). The facil inducibility of all but Sirocco, albeit supportive to host survival via competing CT exclusion, may also be viewed as evidence of prophage selfishness and οpportunism. We propose that reported behaviours of intact prophages, whether as selfish genetic elements pursuing their own survival and proliferation or as nearly domesticated elements contributing to host adaptation and fitness, should not be regarded as mutually exclusive. Instead, they should be recognized as interconnected aspects of the nature of intact prophages, used to aid their survival. This idea is supported by Quistad et al. [93], who reason that genetic elements (e.g. intact prophages) with genes that promote host survival should be called “mobile” rather than “selfish”.