E. faecium R.A73 genome annotation
Genome content
The present draft genome comprises 2,935,283 bases with a GC content of 38.0% and was assembled into 28 scaffolds. Genome annotation identified a total of 2,884 genes, corresponding to 2,834 coding sequences (CDSs) and 50 RNA genes: single predicted copies of the 16S, 23S, and 5S rRNA genes and 47 predicted tRNAs (Fig. 1). A total of 342 RAST subsystems were identified. Many features belong to the carbohydrates category, including genes involved in central carbohydrate metabolism, amino sugars, di- and oligosaccharides, one-carbon metabolism, organic acids, fermentation, sugar alcohols, polysaccharides, and monosaccharides. Many features also belong to the amino acids and derivatives category, including those for lysine, threonine, methionine, and cysteine.
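As an illustration of how summary figures like these are derived, the following minimal sketch computes scaffold count, total length, and GC percentage from a set of sequences, and checks that the reported gene totals add up. The function name `assembly_stats` and the toy sequences are hypothetical; they are not the R.A73 scaffolds or the pipeline used in this study.

```python
from typing import Dict


def assembly_stats(scaffolds: Dict[str, str]) -> Dict[str, float]:
    """Return scaffold count, total length, and GC percentage of an assembly."""
    total_bases = 0
    gc_bases = 0
    for sequence in scaffolds.values():
        seq = sequence.upper()
        total_bases += len(seq)
        gc_bases += seq.count("G") + seq.count("C")
    gc_percent = round(100.0 * gc_bases / total_bases, 1) if total_bases else 0.0
    return {
        "scaffolds": len(scaffolds),
        "total_bases": total_bases,
        "gc_percent": gc_percent,
    }


# Toy input (hypothetical sequences, for illustration only).
toy_scaffolds = {"scaffold_1": "ATGCATAT", "scaffold_2": "GGCCATATAT"}
stats = assembly_stats(toy_scaffolds)

# Gene totals as reported in the text: CDSs plus RNA genes give all genes,
# and three rRNA genes plus 47 tRNAs give the 50 RNA genes.
cds_count, rna_count = 2834, 50
rrna_count, trna_count = 3, 47
total_genes = cds_count + rna_count

print(stats, total_genes)
```

For a real assembly, the dictionary would be filled from the scaffold FASTA file rather than typed in by hand.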
Functional annotation
A total of 2,063 protein-coding genes (72.58% of the total protein-coding genes) were assigned a putative function based on COGs. Genes associated with carbohydrate transport and metabolism (294 ORFs), translation (206 ORFs), and transcription (205 ORFs) were the most abundant COG functional categories. The distribution of genes among COG functional categories is summarized in Fig. 2.
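The ranking described above can be reproduced from the per-category counts listed in Table 2. The sketch below is illustrative only: the helper names are hypothetical, and the denominator of 2,884 (all annotated genes) is an assumption, since the basis of the percentages is not stated explicitly.

```python
# Gene counts per COG category, copied from Table 2 (zero-count categories omitted).
cog_counts = {
    "J": 206, "K": 205, "L": 109, "D": 29, "V": 62, "T": 80, "M": 120,
    "N": 12, "U": 17, "O": 63, "C": 73, "G": 294, "E": 145, "F": 75,
    "H": 72, "I": 73, "P": 91, "Q": 18, "R": 140, "S": 183,
}

# Assumption: percentages are expressed relative to all 2,884 annotated genes.
TOTAL_GENES = 2884


def rank_categories(counts, top_n=3):
    """Return the top_n (category, count) pairs, largest count first."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


def percent_of_total(count, total=TOTAL_GENES):
    """Express a gene count as a percentage of the total, to two decimals."""
    return round(100.0 * count / total, 2)


top3 = rank_categories(cog_counts)
for code, count in top3:
    print(code, count, percent_of_total(count))
```

Under this assumption the three most abundant categories come out as G, J, and K, in line with the text.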
Bacteriocin and antibacterial peptide production genes
Several genes involved in the production of bacteriocins and antibacterial peptides were identified in the Enterococcus faecium R.A73 genome. These include genes encoding a colicin V CvpA family protein (DTX73_02515), genes annotated for bacteriocin production and antibacterial peptide synthesis, a bacteriocin-associated protein (DTX73_02925), bacteriocin immunity proteins (DTX73_04250, DTX73_06025, DTX73_06505), bacteriocins (DTX73_04255, DTX73_06475, DTX73_06480, DTX73_07350, DTX73_09680, DTX73_09715), an EntF family bacteriocin induction factor (DTX73_06500), the ThmB bacteriocin enhancer peptide (DTX73_09690), the ThmA bacteriocin (DTX73_09695), an ABC-type bacteriocin/lantibiotic exporter containing an N-terminal double-glycine peptidase domain (DTX73_09710), and a class IIb bacteriocin of the lactobin A/cerein 7B family (DTX73_09720). Other genes identified have different roles and encode amidophosphoribosyltransferase (EC 2.4.2.14) (DTX73_12820), the acetyl-coenzyme A carboxyl transferase beta chain (EC 6.4.1.2) (DTX73_04140), dihydrofolate synthase (EC 6.3.2.12) (DTX73_05510), rRNA pseudouridine synthase A (EC 4.2.1.70) (DTX73_10445), and the bifunctional folylpolyglutamate synthase/dihydrofolate synthase (EC 6.3.2.17) (DTX73_05510). Furthermore, the genome contains one gene encoding an enterocin (DTX73_06510).
Antibiotics resistance and virulence genes
We used the ResFinder-2.1 server [10] available at cge.cbs.dtu.dk/services/ResFinder/ in combination with PGAAP and RAST server annotation [11] to investigate genes involved in resistance to antibiotics and toxic compounds in the E. faecium R.A73 genome.
Two genes involved in resistance to antibiotics and toxic compounds were identified. They correspond to a homolog of aac(6')-Ii, involved in aminoglycoside resistance (% identity: 98.36; Query/HSP length: 549/549; Accession number: L12710), and a homolog of msr(C), involved in macrolide, lincosamide, and streptogramin B (MLS) resistance (% identity: 97.70; Query/HSP length: 1479/1479; Accession number: AF313494). In addition, the PGAAP and RAST annotation systems detected 52 other genes potentially involved in virulence, disease, and defense mechanisms. These genes, found in the HG937697 genome, are presented in Table 2.
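To make the hit criteria concrete, the sketch below shows one way such hits could be screened by percent identity and by the fraction of the query covered by the HSP. The two records are the values reported above. The cut-offs of 90% identity and 60% coverage are assumptions chosen for illustration, not settings documented in this study, and the function names are hypothetical.

```python
# Hits as reported in the text (gene, class, % identity, query and HSP length, accession).
hits = [
    {"gene": "aac(6')-Ii", "resistance": "aminoglycoside",
     "identity": 98.36, "query_len": 549, "hsp_len": 549, "accession": "L12710"},
    {"gene": "msr(C)", "resistance": "macrolide, lincosamide, streptogramin B",
     "identity": 97.70, "query_len": 1479, "hsp_len": 1479, "accession": "AF313494"},
]


def coverage(hit):
    """Percentage of the query sequence covered by the aligned segment (HSP)."""
    return 100.0 * hit["hsp_len"] / hit["query_len"]


def filter_hits(candidates, min_identity=90.0, min_coverage=60.0):
    """Keep hits that meet both thresholds (threshold values are assumed)."""
    return [
        hit for hit in candidates
        if hit["identity"] >= min_identity and coverage(hit) >= min_coverage
    ]


accepted = filter_hits(hits)
for hit in accepted:
    print(hit["gene"], hit["identity"], coverage(hit), hit["accession"])
```

Both reported hits span the full query length, so they would pass any coverage cut-off up to 100%.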
Phylogeny and classification
Based on 16S rDNA sequences, the phylogenetic tree showed that strain R.A73 is more closely related to E. faecium LMG 11423 and E. durans NBRC 100479 than to other Enterococcus species (Fig. 3).
This analysis, combined with the in silico DNA-DNA hybridization (DDH) method, confirmed the identification of the strain as E. faecium. DNA-DNA hybridization is considered the best indicator for distinguishing species. The probabilities of a DDH value above 70%, estimated through logistic regression under three formulae, indicate that E. faecium R.A73 differs from all other species of the genus except Enterococcus faecium. A DDH value > 96% was found in the comparison against E. faecium T110 (Supplementary data Table S1).
Comparative genomics
Comparative analysis of genome sequences
Comparative genomics helps to clarify several aspects related to pathogenicity, antibiotic resistance, and probiotic characteristics.
The Enterococcus faecium R.A73 protein sequences predicted by the PGAAP annotation system were retrieved and compared with the protein sequences of 14 related organisms: Enterococcus sp. 7L76 uid197170, Enterococcus casseliflavus EC20 uid55693, Enterococcus faecalis 62 uid159663, Enterococcus faecalis D32 uid171261, Enterococcus faecalis OG1RF uid54927, Enterococcus faecalis Symbioflor 1 uid183342, Enterococcus faecalis V583 uid57669, Enterococcus faecium AUS0004 uid87025, Enterococcus faecium AUS0085 uid214432, Enterococcus faecium DO uid55353, Enterococcus faecium NRRL B-2354 uid188477, Enterococcus hirae ATCC 9790 uid70619, Enterococcus mundtii QU 25 uid229420, and Enterococcus faecium T110.
The comparative proteome analysis against the 14 other Enterococcus genomes (Table 1) showed that HG937697 is highly similar to the genome of E. faecium T110, with 2,318 shared orthologous genes (80.37%). We also identified 208 protein-coding genes specific to the E. faecium R.A73 strain. The BRIG tool confirmed this finding, showing very high similarity between the Enterococcus faecium R.A73 and Enterococcus faecium T110 genomes (Fig. 4).
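The shared and strain-specific gene counts above are, in essence, set operations over ortholog assignments. The following sketch illustrates the idea on toy ortholog-group identifiers (hypothetical, not real assignments). It also checks the reported percentage against a denominator of 2,884 total genes, which is an assumption that happens to reproduce the stated 80.37%.

```python
def shared_and_unique(query_groups, reference_groups):
    """Split the query's ortholog groups into shared with, and absent from, the reference."""
    shared = query_groups & reference_groups
    unique = query_groups - reference_groups
    return shared, unique


def percent_shared(n_shared, n_total):
    """Percentage of genes with an ortholog in the reference, to two decimals."""
    return round(100.0 * n_shared / n_total, 2)


# Toy ortholog-group identifiers (hypothetical, for illustration only).
query = {"OG1", "OG2", "OG3", "OG4", "OG5"}
reference = {"OG1", "OG2", "OG3", "OG4", "OG9"}
shared, unique = shared_and_unique(query, reference)

# Reported values: 2,318 orthologs shared with T110; the denominator of
# 2,884 (all annotated genes) is an assumption.
reported_percent = percent_shared(2318, 2884)

print(sorted(shared), sorted(unique), reported_percent)
```

Strain-specific genes in a multi-genome comparison would be those left over after subtracting the union of all reference sets, rather than a single reference as in this pairwise toy case.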
Comparative analysis of virulence genes
We investigated the presence of virulence-related genes in Enterococcus faecium R.A73. Among the Enterococcus virulence genes available in the Virulence Factor Database, VFDB (http://www.mgc.ac.cn/VFs/), 30 genes, including the gene for enterococcal surface protein (esp), were absent from Enterococcus faecium R.A73, whereas EbpA (DTX73_01685), EbpB (DTX73_01690), EbpC (DTX73_01695), srtC (DTX73_01700), EcbA (DTX73_00685), and EfaA (DTX73_03830) were present.
Table 1 Genome size and gene count of the 14 pathogenic and probiotic Enterococcus strains used in the comparative genome study
Species | Genome size (Mb) | Gene count
Enterococcus sp. 7L76 uid197170 | |
Enterococcus casseliflavus EC20 uid55693 | |
Enterococcus faecalis 62 uid159663 | 3.13 | 3158
Enterococcus faecalis D32 uid171261 | 3.06 | 3174
Enterococcus faecalis OG1RF uid54927 | 2.73 | 2676
Enterococcus faecalis Symbioflor 1 uid183342 | 2.81 | 2761
Enterococcus faecalis V583 uid57669 | 3.35 | 3412
Enterococcus faecium AUS0004 uid87025 | 3.01 | 3118
Enterococcus faecium AUS0085 uid214432 | 3.23 | 3318
Enterococcus faecium DO uid55353 | 3.05 | 3209
Enterococcus faecium NRRL B-2354 uid188477 | |
Enterococcus hirae ATCC 9790 uid70619 | 2.85 | 2752
Enterococcus mundtii QU 25 uid229420 | 3.35 | 3229
Enterococcus faecium T110 | |
Table 2 Number of genes in the Enterococcus faecium R.A73 genome associated with the general COG functional categories
Code | Value | % of total features | Description
J | 206 | 7.14 | Translation
A | 0 | 0.0 | RNA processing and modification
K | 205 | 7.10 | Transcription
L | 109 | 3.77 | Replication, recombination, and repair
B | 0 | 0.0 | Chromatin structure and dynamics
D | 29 | 1.00 | Cell cycle control, mitosis and meiosis
Y | 0 | 0.0 | Nuclear structure
V | 62 | 2.14 | Defense mechanisms
T | 80 | 2.77 | Signal transduction mechanisms
M | 120 | 4.16 | Cell wall/membrane biogenesis
N | 12 | 0.41 | Cell motility
Z | 0 | 0.0 | Cytoskeleton
W | 0 | 0.0 | Extracellular structures
U | 17 | 0.58 | Intracellular trafficking and secretion
O | 63 | 2.18 | Posttranslational modification, protein turnover, chaperones
C | 73 | 2.53 | Energy production and conversion
G | 294 | 10.19 | Carbohydrate transport and metabolism
E | 145 | 5.02 | Amino acid transport and metabolism
F | 75 | 2.60 | Nucleotide transport and metabolism
H | 72 | 2.49 | Coenzyme transport and metabolism
I | 73 | 2.53 | Lipid transport and metabolism
P | 91 | 3.15 | Inorganic ion transport and metabolism
Q | 18 | 0.62 | Secondary metabolites biosynthesis, transport and catabolism
R | 140 | 4.8 | General function prediction only
S | 183 | 6.34 | Function unknown
- | 821 | 28.48 | Not in COGs