Genome Sequence Analysis of Cronobacter Phage PF-CE2 and Proposal of a New Species in the Escherichia Virus RB16 Genus

The genome of phage PF-CE2, which infects Cronobacter sakazakii M1, was characterized in this work, and a new species named Cronobacter virus PF-CE2 was established in the Escherichia virus RB16 genus of the subfamily Tevenvirinae of the family Myoviridae. The Gp190 gene of phage PF-CE2 was first proposed to encode a bacteriophage-borne glycanase capable of degrading fucose-containing exopolysaccharides produced by C. sakazakii M1. Further, the taxonomic status of eight additional phages was modified according to average nucleotide identity analysis. These findings provide a theoretical basis for subsequent heterologous expression of the phage PF-CE2 glycanase and an important reference for the preservation and sharing of these phages.


Main Text
Cronobacter sakazakii is a facultatively anaerobic, Gram-negative bacterium that occurs widely in various foods and raw materials [1,2]. In recent years, C. sakazakii has emerged as a foodborne pathogen that is commonly found in formula milk powder and is associated with several infectious diseases, including meningitis, necrotizing enterocolitis, and sepsis [3,4]. Viruses are ubiquitous in nature, and bacteriophages, which are viruses of bacteria, are effective tools to kill pathogenic bacteria [5]. The genetic diversity of C. sakazakii poses a challenge for the use of phages to control microbial contamination in food processing environments; therefore, it is necessary to isolate and identify new phages targeting C. sakazakii [6].
Fucose-containing exopolysaccharides (FcEPSs) are a promising source of fucosylated oligosaccharides and fucose. Cronobacter spp. typically have the capacity to produce fucose-rich FcEPSs [7]. Bacteriophage-borne glycanases extracted from phages are effective tools for degrading FcEPSs [8]. A previous study found that a phage isolated from sewage was capable of degrading the FcEPS produced by C. sakazakii M1 [9]. In this study, the bacteriophage targeting C. sakazakii M1 was purified by more than 10 rounds of plaque assay. The genome of phage PF-CE2 was sequenced and functionally characterized, and the results were compared with those of homologous phages. In addition, the gene encoding the bacteriophage-borne glycanase was predicted.
Phage isolation and purification were performed as described previously [9]. Briefly, prior to phage PF-CE2 DNA isolation, DNase I (10 µg/mL, Sigma-Aldrich) and RNase A (20 µg/mL, Sigma-Aldrich) were added to a purified suspension of PF-CE2 and incubated at 37°C for 1 h to digest bacterial DNA and RNA. DNA isolation and purification of phage PF-CE2 were carried out using the E.Z.N.A® Viral DNA Kit (OMEGA). The sequencing library was generated using the Illumina TruSeq DNA Nano Sample Preparation Kit (Illumina). One microgram of DNA was sheared into 300-500 bp fragments using an M220 Focused-ultrasonicator (Covaris). Following PCR amplification, the target bands were recovered by gel excision. A TBS-380 Micro-Fluorometer (Turner BioSystems) and PicoGreen® (ThermoFisher Scientific) were used for quantitative analysis, and clusters were generated by bridge PCR amplification on a cBot 2 system (Illumina). The genome of phage PF-CE2 was sequenced using the Illumina HiSeq system with a 2 × 150 bp paired-end run. To ensure the accuracy and reliability of the sequencing results, quality control of the raw data was performed as follows: (1) adapter sequences were removed from reads, (2) reads containing bases other than A, G, C, or T at the 5' end were removed, (3) the ends of reads whose sequencing quality was less than Q20 were removed, (4) reads whose N proportion was more than 10% were removed, and (5) fragments shorter than 50 bp were discarded. Following quality control, clean data were obtained; detailed information is shown in Table S1. The reads were assembled using ABySS (v2.0.2) assembly software, and GapCloser (v1.12) software was used to carry out gap filling and base correction.
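The five quality-control rules listed above can be illustrated with a minimal Python sketch. This is not the pipeline used in this work, whose software is not named in the text. The function name `clean_read`, the exact-match adapter search, the reading of rule (2) as rejection of the whole read, and the reading of rule (3) as trimming from the 3' end only are all assumptions made for illustration. Only the thresholds (Q20, 10% N, 50 bp) come from the text.

```python
from typing import List, Optional, Tuple

MIN_QUALITY = 20        # Q20 threshold stated in the text
MAX_N_FRACTION = 0.10   # reads with more than 10% N are removed
MIN_LENGTH = 50         # fragments shorter than 50 bp are discarded


def clean_read(
    seq: str, quals: List[int], adapter: str
) -> Optional[Tuple[str, List[int]]]:
    """Apply the five QC rules to one read (hypothetical helper).

    Returns the cleaned (sequence, qualities) pair, or None if rejected.
    """
    seq = seq.upper()

    # (1) Adapter removal. Assumption: cut at the first exact match;
    #     real trimmers also tolerate mismatches and partial adapters.
    pos = seq.find(adapter)
    if pos != -1:
        seq, quals = seq[:pos], quals[:pos]

    # (2) Assumption: reject the read if its 5' base is not A, C, G, or T.
    if not seq or seq[0] not in "ACGT":
        return None

    # (3) Assumption: trim bases below Q20 from the 3' end only.
    end = len(seq)
    while end > 0 and quals[end - 1] < MIN_QUALITY:
        end -= 1
    seq, quals = seq[:end], quals[:end]

    # (4) Reject reads in which more than 10% of the bases are N.
    if seq and seq.count("N") / len(seq) > MAX_N_FRACTION:
        return None

    # (5) Discard fragments shorter than 50 bp after trimming.
    if len(seq) < MIN_LENGTH:
        return None

    return seq, quals
```

For example, a 60-bp read whose last five bases fall below Q20 would be returned as a 55-bp read, whereas a 40-bp read would be discarded under rule (5).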
High-throughput sequencing showed that the reads of phage PF-CE2 were assembled at 88-fold coverage into a genome of 178,248 bp in length, with a G + C content of 44.8%, and that genes made up 95.87% of the genome. The genome sequence of phage PF-CE2 was compared with that of other phages using the standard nucleotide BLAST in the NCBI database (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Table S2 shows the basic characteristics of eight selected phages that are similar to phage PF-CE2 in length and G + C content: the Citrobacter phages Margaery (unpublished) and Maroon [10]; the Cronobacter phages vB_CsaM_Cronuts (unpublished), vB_CsaM_GAP161 [11], vB_CsaM_leB [1], vB_CsaM_leE [1], and vB_CsaM_leN [1]; and Enterobacter phage vB_EkoM5VN (unpublished). tRNAscan-SE 2.0 (http://lowelab.ucsc.edu/tRNAscan-SE/) was used to identify possible tRNAs in the genome, and rRNA was predicted using the RNAmmer 1.2 Server (http://www.cbs.dtu.dk/services/RNAmmer/) [12-14]. As shown in Fig. 1, the genome consisted of several clusters, including structural proteins, DNA replication and transcription proteins, nucleotide metabolism and biosynthesis proteins, lysis proteins, and DNA packaging proteins. Although most proteins encoded by phage PF-CE2 were similar to those of the above-mentioned Cronobacter phages, there were slight differences between them. For instance, Gp202 and Gp262 were genes unique to phage PF-CE2 (Fig. 1). HHpred predicted with a probability of 96.23% that Gp202 encodes an HNH endonuclease, and the similarity between

Declarations
Nucleotide sequence accession number
The genome sequence of phage PF-CE2 was deposited in the GenBank database under the accession number MW629017.