Four novel Curtobacterium phages isolated from environmental samples

Despite Curtobacterium spp. often being associated with the plant phyllosphere, i.e., the aerial region of different plant species, only one phage targeting a member of the genus Curtobacterium has been isolated so far. In this study, we isolated four novel plaque-forming Curtobacterium phages, Reje, Penoan, Parvaparticeps, and Pize, with two novel Curtobacterium strains as propagation hosts. Based on the low nucleotide intergenomic similarity (≤32.4%) between these four phages and any phage with a genome sequence in the NCBI database, we propose the establishment of four genera, “Rejevirus”, “Pizevirus”, “Penoanvirus”, and “Parvaparticepsvirus”, all in the class Caudoviricetes.

alignment fraction (AF) ≥89% to the closest hit. The phages Reje, Penoan, and Pize were isolated on strain AM11 from common household organic waste (HCS A/S, Glostrup, DK, 2020), and Parvaparticeps was isolated on strain AM15 from water from a small lake (Copenhagen, DK). The phage isolation, DNA extraction, sequencing library, sequencing, and genome assembly were done as reported previously [11,12]. The ends of the phage genomes were determined using the read start coverage tool in CLC Genomics Workbench v 12.0.3. Identification of putative genes and annotation were done as described previously [11,13]. A tRNA search was performed with tRNAscan-SE [14]. Transmission electron microscopy (TEM) images were obtained as described previously [15]. Phylogenetic analysis (MEGA X v 10.2.5, ClustalW alignment with neighbor-joining, 1000 bootstrap replications) based on predicted amino acid sequences of the terminase protein was done with the five closest relatives identified by BLASTp (identity, ≥30.7%; query coverage, ≥91%) for all four phages [16]. Nucleotide intergenomic similarity (NIS) values were calculated using VIRIDIC with the closest relatives (identity, ≥63.5%; query coverage, ≥4%) of each phage identified by BLASTn [17]. Proteomic tree analysis was carried out using the ViPTree server, using standard settings [18].
Phage characteristics, including genome size, GC content, and the number of predicted coding DNA sequences, are summarized in Table 1.
In TEM micrographs, Reje and Pize showed podovirus morphology, with a tail stub masked by whiskers with distal globular appendages attached to the collar structure and an isometric head. Penoan showed siphovirus morphology, with an approximately 70-nm tail and a baseplate without additional appendages. Parvaparticeps also showed siphovirus morphology, with a tail approximately 128 nm in length and a baseplate with thin appendages (Fig. 1A). The following taxonomic classification is based on current International Committee on Taxonomy of Viruses (ICTV) guidelines and database. A complete overview of the BLASTn results and ViPTree analysis is available in Online Resources 1 and 2.
Phylogenetic analysis based on predicted amino acid sequences of the large terminase subunit encoded by terL showed that Penoan and Parvaparticeps did not cluster with any of their closest relatives (bootstrap value, 100). Pize and Reje formed a monophyletic clade together with the only other Curtobacterium phage, Ayka (ON381767; bootstrap value, 100; Fig. 1C).
The closest relatives of Penoan are the three Arthrobacter phages Muttlie (KU160658.1), Yank (KU160674.1), and LouisXIV (MN813695.1), with NIS values of 11.6%, 10.2%, and 10.1%, respectively, which are well below the 70% genus demarcation limit [17]. These distant relatives of Penoan are all lytic and are classified as members of the genus Decurrovirus in the class Caudoviricetes, without an assigned family. The phages in the genus Decurrovirus, like Penoan, have an average genome size of 15,527 bp, a GC content of 60.1%, and no tRNAs [19]. The highest translated sequence similarity (S_G) of 25.45% was found between Penoan and the zetavirus zeta1847, and these two isolates formed a monophyletic clade in a proteomic tree made using ViPTree (Supplementary Table S2.1 and Supplementary Fig. S2.1). However, zeta1847 did not align with the whole genome (BLASTn) or terL protein sequence (BLASTp) of Penoan.
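The genus-level reasoning for Penoan can be sketched as a simple threshold check. The snippet below is an illustrative sketch only, not part of the analysis pipeline used in this study: it takes the three NIS values reported above and tests them against the 70% genus demarcation limit. The function and variable names are hypothetical.

```python
# Illustrative sketch: compare reported NIS values with the 70% genus
# demarcation limit. Names are hypothetical; values are those given in the text.
GENUS_NIS_THRESHOLD = 70.0  # percent

# NIS (%) between Penoan and its three closest relatives
penoan_nis = {"Muttlie": 11.6, "Yank": 10.2, "LouisXIV": 10.1}


def shares_genus(nis_percent, threshold=GENUS_NIS_THRESHOLD):
    """Return True if an NIS value meets the genus demarcation limit."""
    return nis_percent >= threshold


def genus_candidates(nis_by_relative, threshold=GENUS_NIS_THRESHOLD):
    """List relatives whose NIS to the query phage meets the limit."""
    return [name for name, nis in nis_by_relative.items()
            if shares_genus(nis, threshold)]


candidates = genus_candidates(penoan_nis)
print(candidates or "no relative meets the genus demarcation limit")
```

An empty candidate list corresponds to the conclusion drawn in the text: none of the closest relatives reaches the limit, so Penoan cannot be placed in their genus.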
Pize and Reje were found to be most closely related to each other (NIS 40.5%), followed by Ayka (NIS 30.6-32.4%), although all three apparently belong to different genera (Fig. 1B) [6,17]. They have the same two other closest relatives: the Arthrobacter phages Anjali (NC_048739.1) and Mendel (NC_048740.1), with NIS values of 9.1-10.2%. Anjali and Mendel belong to the genus Anjalivirus, class Caudoviricetes [20]. Proteomic tree analysis showed that the S_G between Reje and Pize was 47.31%, the S_G to Ayka was 38.36% (Reje) and 37.97% (Pize), and the S_G to Anjali and Mendel for both was <23%.
Parvaparticeps had an NIS of 18.9-19.5% to its closest relatives, which were the Microbacterium phages Cicada (MT498057.1), Johann (MK016497.1), and Zanella (MN369765.1). These three Microbacterium phages are all goodmanviruses, as they formed a genus cluster with the goodmanviruses Microbacterium phage Goodman (NC_048101.1) and Rasovi (MT310855.1), with an NIS of ≥70%. However, this cluster did not include Parvaparticeps (Fig. 1B). The genus Goodmanvirus belongs to the class Caudoviricetes. The highest S_G for Parvaparticeps was with the phage Zanella (24.09%), followed by Cicada (24.02%), but Parvaparticeps was separate from them in the proteomic phylogenetic analysis (Supplementary Table S2.4 and Supplementary Fig. S2.3).
The closest relatives have not been assigned to any existing families or subfamilies (NIS ≤70%, Fig. 1B; nucleotide similarity 2.56-27.6%, Supplementary Table S1.1-S1.4). The criteria for intermediary-level classification into a family or order are still being defined, but it has been suggested that they should be based on complete viral proteomes [21]. Hence, this taxonomic analysis shows that Pize, Reje, Penoan, and Parvaparticeps currently cannot be classified into any existing family or subfamily, but they all belong to the class Caudoviricetes.
Penoan could potentially belong to the same undefined family as zeta1847, based on the S_G of 25.45% and the fact that they formed a monophyletic group in the proteomic analysis. Reje, Pize, and Ayka might also belong to the same undefined family, given their NIS ≥30%, S_G ≥37%, and grouping in the proteomic analysis.
No bacterial virulence or antibiotic resistance genes were detected using the Virulence Factor Database [22], VirulenceFinder 2.0 [23], or ResFinder 4.0 [24], all using standard settings. Furthermore, no integrase or integrase-related genes were identified, indicating a lytic lifestyle for all four phages. A full list of putative gene functions for all four phages is available in Online Resource 3.
Reje has a large cluster of genes encoding structural proteins, including major capsid (CDS6), major tail (CDS7), upper collar (CDS8), and lower collar (CDS9) proteins. The genome organization of Pize is very similar to that of Reje, with a structural cluster of genes encoding major capsid, major tail, upper collar, and lower collar (CDS6-9) proteins, as well as an SSB protein (CDS1) and a DNA polymerase (CDS3). The predicted tail-related protein (CDS10) of Reje has 72% amino acid sequence identity to the hypothetical CDS10 of Pize, but not to any of the closest relatives. However, the predicted tail-related protein has synteny with the predicted tail-related proteins of Ayka, RHph_N3_8, Mendel, and Anjali (grey and yellow dot with asterisk in Fig. 2A). In the pairwise genome comparison analysis of Reje and Pize, these phages shared gene synteny, but the average amino acid sequence identity between Reje and Pize measured using Clinker [25] was only 52% (Fig. 1B and Fig. 2A).
Penoan has the smallest genome and has a cluster of genes encoding predicted structural proteins, including tape measure (CDS12), major tail (CDS16), head-tail connector (CDS18), major capsid (CDS19), and portal (CDS20) proteins. Besides these, only two other CDSs had predicted functions, TerL (CDS21) and an endolysin (CDS9), and all but CDS9 were found to have synteny with the three closest relatives (Fig. 2B).
The Parvaparticeps genome encodes a TerL (CDS50) protein, a polymerase (CDS2), an adenylosuccinate synthetase (CDS4), a deoxyuridine triphosphatase (CDS7), endo- and exonucleases (CDS11, 12, and 13), a recombinase (CDS18), a primase (CDS21), a helicase (CDS23), and a lysin (CDS29). The adenylosuccinate synthetase (CDS4) could potentially be involved in DNA modification, which has been observed previously in other phages as a defense mechanism, but it might also be involved in purine nucleotide synthesis [26,27]. The putative structural proteins of this phage include minor tail, tail-related, and tape measure (CDS32-34) proteins, a major tail protein (CDS36), three minor capsid proteins (CDS37, 38, and 46), a major capsid protein (CDS42), a scaffolding protein (CDS43), and a portal protein (CDS49). Synteny is conserved for the predicted genes, and there is also amino acid sequence similarity between CDS32-34 and the corresponding proteins of phages Johann, Cicada, and Zanella (green dots with asterisks in Fig. 2C). Cicada and Johann are predicted to be lytic [28]. The presence of a gene encoding a predicted recombinase (CDS18) suggests a temperate lifestyle, but since there is no indication of a putative integrase, CDS18 could have a function common to lytic phages [29]. Parvaparticeps causes turbid plaques. In the process of growing Parvaparticeps to a high titer, a clear-plaque mutant appeared twice independently. Sequencing of one of the clear-plaque mutants revealed a single nucleotide polymorphism (SNP) at position 21,059 (located in CDS31, directly downstream of the minor tail protein gene). The independent recurrence of the clear-plaque mutant indicates that it has a selective advantage over the wild type when grown under laboratory conditions. CDS31 had hits in HHpred to a receptor-binding protein from the Lactococcus phage 1358 but no significant hits in BLASTn or RAST.
The SNP was identified as a transition from a T to a C, which changes the corresponding amino acid from an aspartic acid (D) to a glycine (G). This SNP likely improves binding to the receptor, resulting in clear plaque morphology.
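The relationship between the reported nucleotide change and the amino acid change can be checked against the standard genetic code. The sketch below is an illustration, not part of the authors' analysis: it enumerates every single-base substitution that turns an aspartic acid codon into a glycine codon and shows the equivalent change on the complementary strand. The strand on which CDS31 is encoded is not stated in the text, so reading the T-to-C change as the complement of a coding-strand change is an assumption. All names in the code are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def single_base_changes(aa_from, aa_to):
    """Return (codon, mutant_codon, ref_base, alt_base) for every single-base
    substitution that changes an aa_from codon into an aa_to codon."""
    changes = []
    for codon, aa in CODON_TABLE.items():
        if aa != aa_from:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa_to:
                    changes.append((codon, mutant, codon[pos], base))
    return changes


asp_to_gly = single_base_changes("D", "G")
for codon, mutant, ref, alt in asp_to_gly:
    # Coding-strand change, then the same change read on the opposite strand.
    print(f"{codon}->{mutant}: {ref}>{alt} on the coding strand, "
          f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]} on the complementary strand")
```

Under the standard code, both possible aspartic acid to glycine substitutions are A-to-G changes at the second codon position, which read as T-to-C on the complementary strand. This is consistent with the reported transition if the SNP is given relative to the strand opposite the CDS31 coding strand.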
In summary, including the four phages presented in this study, the genome sequences of only five phages targeting members of the genus Curtobacterium have been published in the NCBI database. This brief report highlights the diversity of phages against a single bacterial genus, despite three of the phages being isolated from the same sample. Based on the limited NIS to other phage genomes (≤32.4%) and to each other (NIS, 40.5%; S_G, 47.31%; average amino acid sequence identity, 52%), the newly isolated phages Reje and Pize both represent new genera but cannot be assigned to any existing family or subfamily. They were both distinct from their closest relative Ayka (NIS ≤32.4% and S_G ≤38.36%) in all analyses. Hence, we suggest the creation of the genera "Rejevirus" and "Pizevirus", potentially in the same undefined family as the proposed genus "Aykavirus". Parvaparticeps has limited NIS to its closest yet distant relatives (≤19.5%) and was separated from them in the phylogenetic analyses (based on both terL and the ViPTree proteome). Hence, it cannot be assigned to any existing genus, and we therefore suggest the creation of the genus "Parvaparticepsvirus". Lastly, Penoan had limited NIS (≤11.6%) to its closest relatives, but it formed a monophyletic clade with the zetavirus zeta1847 (S_G 25.45%) in the proteome analysis. Hence, we suggest creation of the genus "Penoanvirus", potentially in the same undefined family as the existing genus Zetavirus. All four phages belong to the class Caudoviricetes.
Fig. 1 (A) TEM pictures of the four phages Reje (dark green), Pize (green), Penoan (blue), and Parvaparticeps (brown). Appendages are indicated by asterisks in the pictures. (B) NIS, genome length ratio, and aligned genome fraction (VIRIDIC). Phages with an asterisk were identified based on terL sequences, and those in italics were identified by whole-genome sequencing. (C) Phylogenetic analysis based on the predicted amino acid sequences of TerL proteins.