Complete Genome Sequence of Bacteriophage EFC1 Infecting Enterococcus faecalis From Chicken

In response to Enterococcus faecalis infections of chicken origin, a multi-host lytic phage, EFC1, was isolated and characterized. Its double-stranded circular DNA genome is 56099 bp long and contains 89 predicted protein-coding genes as well as 2 tRNAs. The predicted genes are involved in intron function, structure, transcription, packaging, DNA replication, modification, and lysis. Electron microscopy and comparative phylogenetic analysis of the terminase large subunit showed that phage EFC1 is a new member of the Siphoviridae that is relatively distantly related to the phages with which it shares the highest similarity. Phage EFC1 carries no known virulence genes or antibiotic resistance genes.


Introduction
Enterococcus faecalis is part of the normal flora of the intestinal tract of humans and animals and is an important opportunistic pathogen. E. faecalis can cause infections in a variety of animals (pigs, sheep, cattle, donkeys, chickens, ducks, turkeys, pigeons, and geese). Chickens are the most susceptible, and infection manifests mainly as amyloid arthritis and the death of 1-week-old chicks. Amyloid arthritis often occurs at 5-6 weeks of age; affected chickens show lameness and hindered growth and development [2]. In chicks, infection manifests as umbilical inflammation, yolk sac infection, and sepsis, resulting in the deaths of 0.3-11.7% of 1-week-old chicks [20] and causing severe economic losses to the chicken industry.
The increasing drug resistance of E. faecalis, driven by the heavy and inappropriate use of antimicrobials, has attracted great attention worldwide. In particular, the emergence of vancomycin-resistant, highly drug-resistant E. faecalis has become a public health concern [12]. In addition, the abuse of antibiotics can cause problems such as drug residues, reduced food safety and quality, and imbalance of the bacterial flora in the breeding environment, which seriously threatens the ecological environment on which human and animal health depend [25]. The development of novel antibiotic alternatives for the treatment of multidrug-resistant bacterial infections has become one of the current research hotspots.
Bacteriophages are viruses that specifically lyse and kill their host bacteria, including drug-resistant strains [3]. In recent years, a variety of phages that lyse chicken pathogenic bacteria have been isolated, purified, and applied, but there are no reports of the isolation and purification of bacteriophages that lyse chicken-derived Enterococcus faecalis [8, 13, 18]. The E. faecalis phages isolated so far mainly lyse human pathogens, with few from animal sources and few reports from chicken sources [16, 19, 21]. Because bacteriophages are highly specific, phages of other origins are not necessarily applicable to E. faecalis of chicken origin. Therefore, in this study, a lytic bacteriophage against multidrug-resistant E. faecalis of chicken origin was isolated and screened from farm sewage, and its whole genome was sequenced, analyzed, and deposited in GenBank to provide reference information for the subsequent development of bacteriophages as new antimicrobial agents.

Bacterial strains and culture conditions
E. faecalis strain L-9, isolated from a chick, was used as the host strain for phage isolation. The strain was cultured in Brain Heart Infusion (BHI; Becton, Dickinson and Company, MD, USA) at 37℃ with shaking at 160 rpm under aerobic conditions.

Phage morphology observation by electron microscopy
Phage morphology was examined by transmission electron microscopy (TEM; Hitachi HT7800, Tokyo, Japan). The phage was tadpole-shaped, with an icosahedral head 110~120 nm in length and 50~70 nm in width and a tail 130~200 nm in length and 5~7 nm in width (Figure 1).
Bacteriophage EFC1 belongs to the family Siphoviridae according to the classification criteria of the International Committee on Taxonomy of Viruses (ICTV) [6].

Whole Genome Sequencing and Bioinformatic Analysis
The bacteriophage was isolated from farm sewage in Handan, China. Genomic DNA was extracted from concentrated phages using a viral genome DNA extraction kit (Solarbio, China) according to the manufacturer's instructions. Whole genome sequencing was performed by Suzhou GENEWIZ Biotechnology Co. Ltd. on the Illumina MiSeq platform. Clean data for downstream analysis were generated by removing adapters and low-quality sequences from the raw sequencing data with cutadapt (v1.9.1). Coding sequences (CDSs) were predicted with Prodigal (v3.02) [10].
For similarity analysis, the whole genome sequence of EFC1 was aligned online against phage whole genome sequences in the Nr database using NCBI BLASTN. Enterococcus phage vB_EfaS_IME198 showed the highest percent identity to Enterococcus faecalis phage EFC1 (96.66%) and the highest score (34204). The enterococcal bacteriophages vB_EfaS_IME198, vB_EfaS_HEf13 [16], EF-P29 [4], EF-P10 [5], Entf1, vB_EfaS_PHB08 [28], IME-EF1 [31], SAP6 [17], and others with high percent identity to the complete EFC1 genome all belong to the family Siphoviridae, and all have dsDNA genomes.
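To make the percent identity figure concrete, the following minimal Python sketch computes identity over a pairwise alignment, counting gap columns as mismatches, which is a common convention in BLAST-style reports. The function name and the toy sequences are illustrative assumptions; they are not taken from the EFC1 analysis.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues.

    Both inputs must come from the same pairwise alignment, with '-'
    marking gaps. Gap columns count toward the alignment length but
    never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == b and a != "-":
            matches += 1
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical, for illustration only).
toy_query = "ACGT-ACGTA"
toy_subject = "ACGTTACGAA"
toy_identity = percent_identity(toy_query, toy_subject)
```

In the toy alignment, 8 of the 10 columns are identical, so the function returns 80.0.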
The genome of phage EFC1 is circular double-stranded DNA with a length of 56099 bp and a G + C content of 39.96% (Figure 2). The protein sequences of the predicted coding genes were compared with protein sequences in the Nr database using NCBI BLASTP, and the best match was used for gene annotation. Phage EFC1 has 89 predicted protein-coding regions, of which 27 show homology to functionally annotated proteins in the NCBI database and the rest show homology to hypothetical proteins. The shortest CDS is 126 bp and the longest is 3993 bp. Two start codons are used: 2 of the 89 CDSs start with TTG and the rest start with ATG. The CDSs are encoded in two directions: 2 CDSs lie on the negative strand and the rest lie on the positive strand.
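Summary statistics of this kind (G + C content, start codon usage, strand distribution, shortest and longest CDS) can be sketched in a few lines of Python. The `CDS` record type, the function names, and the toy data are hypothetical and are used only for illustration; they do not reproduce the EFC1 annotation.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class CDS(NamedTuple):
    """Hypothetical minimal record for one predicted coding sequence."""
    start_codon: str  # e.g. "ATG" or "TTG"
    strand: str       # "+" or "-"
    length_bp: int


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def summarize_cds(cds_list: List[CDS]) -> Dict[str, object]:
    """Count start codons and strands and report the length extremes."""
    if not cds_list:
        raise ValueError("no CDS records given")
    return {
        "count": len(cds_list),
        "start_codons": Counter(c.start_codon for c in cds_list),
        "strands": Counter(c.strand for c in cds_list),
        "shortest_bp": min(c.length_bp for c in cds_list),
        "longest_bp": max(c.length_bp for c in cds_list),
    }


# Toy data (not the real EFC1 annotation).
toy_cds = [CDS("ATG", "+", 300), CDS("TTG", "-", 126), CDS("ATG", "+", 3993)]
toy_summary = summarize_cds(toy_cds)
toy_gc = gc_percent("GGCCAATT")
```

For a real genome, the sequence string would be read from the FASTA record and the CDS records from the gene prediction output.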

Intron and structure module
Intron-related protein: the HNH homing endonuclease, which has site specificity and sequence tolerance, invades the translational organization of the ribonucleotide reductase operon (NRD) and cleaves small fragments of double-stranded DNA [9]. Structure-related proteins: the tail fiber protein can bind to cell surface receptors as the receptor-binding protein of the phage, and its C-terminal domain may be a determinant of host specificity [15]. Most tailed dsDNA bacteriophages assemble the capsid from multiple copies of the capsid protein. The head morphogenesis protein, capsid protein, and connector protein together form the icosahedral head structure, ensuring correct assembly of the head and attachment of the phage head to the tail [1]. The head-tail connector family protein forms the interface between the head and the tail of the phage and keeps the viral DNA in the correct position so that its release can be triggered by the interaction of the phage particle with the bacterial receptor [24].

Transcription and packaging module
Transcription-related proteins: the transcriptional regulator regulates phage gene expression. Adenylate kinase catalyzes the reversible ATP-dependent phosphorylation of AMP to ADP and of dAMP to dADP, and can also catalyze the conversion of nucleoside diphosphates to the corresponding triphosphates [11]. The main role of cytidine deaminase is to participate in the recycling of free pyrimidines [7]. Nucleoside triphosphate pyrophosphohydrolase hydrolyzes nucleoside triphosphates to nucleoside monophosphates and PPi. Glutaredoxin, which functions similarly to thioredoxin, is an effective reducing agent for disulfide bonds in low molecular weight substrates and proteins and is a hydrogen donor for ribonucleotide reductase [22].
Packaging-related proteins: the terminase is a packaging protein that inserts a single phage genome into an empty capsid and typically consists of one large and one small subunit. The small subunit specifically recognizes viral DNA and recruits the large subunit for initial cleavage; the large subunit has ATPase activity and provides energy for the initiation of phage packaging [26]. The phage portal protein forms a channel through which DNA is injected into the host and participates in the connection between the phage head and tail [29].

Replication and modification module
Replication-related proteins initiate DNA synthesis, unwind the DNA double strand, repair DNA gaps, and take part in other DNA replication processes. Modification-related protein: DNA methyltransferase, which modifies DNA by catalyzing the transfer of methyl groups.

Lysis module
Lysis-related proteins: holin is a small bacteriophage-encoded protein that accumulates during the infection cycle until, at a critical concentration, it abruptly triggers pore formation in the cell membrane and alters its permeability; it is also described as performing functions similar to those of signal peptides and as taking part in genome injection into host cells for propagation [27]. Hyaluronidase hydrolyzes hyaluronic acid, promoting the diffusion of water and other substances through the extracellular matrix. Chitinase cleaves β-1,4 glycosidic bonds to degrade chitin and acts to degrade the cell wall. N-acetylmuramoyl-L-alanine amidase contains three domains: an SH3 peptidoglycan-binding domain, a PGRP superfamily conserved domain, and an N-terminal CHAP endopeptidase domain. It is dedicated to lysis, and holin activates the amidase at a precisely defined time [30]. The lysis module of phage EFC1 contains holin and N-acetylmuramoyl-L-alanine amidase; most phages, including Enterococcus phages, use holin and endolysin to lyse host cells synergistically. Compared with other bacteriophages, phage EFC1 carries two additional lysis-related proteins, hyaluronidase and chitinase. Phage EFC1 also has a wider host spectrum, which may be related to the lysis of bacteria by these two proteins.
The predicted open reading frames (ORFs) were screened for antibiotic resistance genes and virulence factors using the Comprehensive Antibiotic Resistance Database (CARD, https://card.mcmaster.ca/) and the Virulence Factors of Pathogenic Bacteria database (VFDB, http://www.mgc.ac.cn/cgi-bin/VFs/v5/main.cgi). No host virulence genes or antibiotic resistance genes were found in bacteriophage EFC1, and the phage was not affected by antibiotics or by the virulence of Enterococcus faecalis. Phage EFC1 is therefore not expected to increase the resistance or toxicity of host bacteria during infection and lysis, and it may be used as a candidate phage, alone or in combination with antibiotics, in cases of E. faecalis infection.

Phylogenetic analysis
A phylogenetic tree was constructed from the amino acid sequences encoded by the terminase large subunit genes of selected phages, using NCBI blastp alignments and the ClustalW program in MEGA X [14]. Nine phages whose terminase large subunit protein sequences showed high homology to that of bacteriophage EFC1 were selected; node values represent credibility and branch lengths represent genetic distance (Figure 3). The results showed a relatively distant relationship between phage EFC1 and the phages with which it shares high similarity. This suggests that, upon infection of the host, phage EFC1 has difficulty exchanging genetic material with other phages, and that the conserved terminase large subunit gene may serve as a signature for each Enterococcus phage.
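As an illustration of how branch lengths relate to genetic distance, the sketch below computes the p-distance (the proportion of differing sites) between aligned protein sequences. The source does not state which distance model or tree-building method was used in MEGA X, so p-distance is an assumption chosen because it is the simplest measure; the function names and toy sequences are likewise hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Proportion of differing residues over columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def pairwise_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """p-distance for every unordered pair of sequences in an alignment."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Toy alignment (hypothetical names and residues, for illustration only).
toy_alignment = {"phage_A": "MKTLE", "phage_B": "MKSLE", "phage_C": "MRSL-"}
toy_distances = pairwise_distances(toy_alignment)
```

A distance matrix of this kind is the input to distance-based tree methods such as neighbor joining; larger values correspond to longer paths between two phages in the tree.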

Nucleotide sequence accession number
The GenBank accession number for phage EFC1 is MW677132.

Declarations
Funding
This study was funded by the Special Project for Tackling Key Technical Problems of the Science and Technology Department of Hebei Province (19226609D).

Conflict of Interest
The author has no conflict of interest to disclose.

Figure 2. Complete genome map of phage EFC1. The outermost arrows indicate the direction of gene transcription, and the two inner circles indicate GC content (black) and GC skew (+, green; -, purple). Predicted coding genes are grouped by functional annotation into lysis, structure, transcription, replication, modification, packaging, intron, and hypothetical protein categories. The function of hypothetical proteins is unknown or differs from the categories above.
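The GC skew track in a genome map of this kind is conventionally computed per window as (G - C) / (G + C). The following minimal Python sketch shows that calculation; the window and step sizes are arbitrary assumptions, the function name is hypothetical, and the sequence is treated as linear for simplicity even though the EFC1 genome is reported as circular.

```python
from typing import List, Tuple


def gc_skew_windows(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (window start, GC skew) pairs along a linear sequence.

    GC skew is (G - C) / (G + C) within each window; a window with no
    G or C is assigned a skew of 0.0.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) > 0 else 0.0
        results.append((start, skew))
    return results


# Toy sequence (illustrative only): G-rich first half, C-rich second half.
toy_skew = gc_skew_windows("GGGGCCCC", window=4, step=4)
```

Positive values correspond to the "+" regions of the skew track and negative values to the "-" regions.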