Review of the Historical Data for Y. enterocolitica and Related Species
The strains were obtained from five prefectures in Ningxia Hui Autonomous Region [Yinchuan (n = 112), Shizuishan (n = 7), Wuzhong (n = 1), Guyuan (n = 3), and Zhongwei (n = 148)], from 2007 to 2019 (Fig. 1 and Table S1).
In this study, 271 isolates of Y. enterocolitica and related species from animals, foods, and human clinical samples were analyzed. In total, 208 (76.75%) were of animal origin, 50 (18.45%) were of food origin, and 13 (4.80%) were of patient origin. The animal samples were mainly feces (n = 102), pharyngeal swabs (n = 39), anal swabs (n = 7), and intestinal contents (n = 59). The food samples were mostly meat, and the patient samples were feces.
Using traditional phenotypic methods, 130/271 (47.97%) isolates were serotyped; the most common serotypes were O:3 (n = 73), O:5 (n = 30), O:8 (n = 14), and O:9 (n = 9). In total, 148 isolates were reported as O: unidentifiable because the O-antigen reacted with more than one antiserum or with none of the antisera. Animal hosts included pig (n = 150), sheep (n = 32), rat (n = 15), cattle (n = 6), chicken (n = 3), and hamster (n = 2). Food isolates came from meat products, comprising beef (n = 25), pork (n = 11), chicken (n = 9), lamb (n = 3), and fish (n = 2); of these 50 samples, 19 were fresh meat and 31 were frozen meat. Human samples were of fecal origin, and most were from children (n = 9) (Table S1).
Of the 187 Y. enterocolitica isolates, 81.28% (n = 152) were of animal origin, 12.30% (n = 23) were of food origin, and 6.42% (n = 12) were of patient origin. Isolates of animal origin comprised 42.76% biotype 1A (n = 65), 50.00% biotype 4 (n = 76), 2.63% biotype 3 (n = 4), and 4.61% biotype 5 (n = 7). All biotype 5 isolates were from sheep, and all animal-origin biotype 4 isolates (n = 76) were from pigs. The dominant serotype among isolates of animal origin was O:3. Food-derived isolates were 86.96% biotype 1A (n = 20) and 13.04% biotype 2 (n = 3). Patient-origin strains comprised 33.33% biotype 1A (n = 4) and 66.67% biotype 4 (n = 8) (Table 1).
Table 1
Serotype and biotype distribution of Y. enterocolitica isolates

| | Animal (n = 152) | | | | | Food (n = 23) | Human (n = 12) |
|---|---|---|---|---|---|---|---|
| | Pig (n = 121) | Sheep (n = 17) | Cattle (n = 6) | Rat (n = 5) | Chicken (n = 3) | | |
| Biotype | | | | | | | |
| 1A | 41/33.88% | 10/58.82% | 6/100% | 5/100% | 3/100% | 20/86.96% | 4/33.33% |
| 2 | - | - | - | - | - | 3/13.04% | - |
| 3 | 4/3.31% | - | - | - | - | - | - |
| 4 | 76/62.81% | - | - | - | - | - | 8/66.67% |
| 5 | - | 7/41.18% | - | - | - | - | - |
| Serotype | | | | | | | |
| O:3 | 72/59.50% | - | - | - | - | - | 1/8.33% |
| O:5 | 17/14.05% | 2/11.76% | 2/33.33% | - | 2/66.67% | 3/13.04% | - |
| O:8 | 1/0.83% | 8/47.06% | - | - | - | 4/17.39% | 1/8.33% |
| O:9 | 3/2.48% | 2/11.76% | - | - | - | 2/8.70% | 1/8.33% |
| O:53 | - | - | - | - | - | 1/4.35% | - |
| O:1,2,5 | - | - | - | - | - | - | - |
| O:5,8,9 | - | 1/5.88% | - | - | - | - | 1/8.33% |
| NA | 28/23.14% | 4/23.53% | 4/66.67% | 5/100% | 1/33.33% | 13/56.52% | 8/66.67% |

Note: NA, not applicable and nonagglutinative.
Average Nucleotide Identity (ANI) Estimation
The 271 Yersinia genomes were evaluated against the suggested species boundary of 95–96% ANI [34]. Twelve species were delineated using a 95% ANI cut-off value: Y. enterocolitica, Y. intermedia, Y. massiliensis, Y. mollaretii, Y. pekkanenii, Y. proxima, Y. alsatica, Y. frederiksenii, Y. kristensenii, Y. hibernica, Y. canariae, and Y. rochesterensis.
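As a rough illustration of how a 95% ANI cut-off can delineate species, the sketch below assigns a query genome to the reference species with the highest ANI, provided that value reaches the cut-off. The function name, the data layout, and the ANI values are hypothetical and are not taken from this study; the ANI software and reference set actually used are not described in this section.

```python
from typing import Dict, Optional

ANI_CUTOFF = 95.0  # lower end of the 95-96% species boundary cited in the text


def assign_species(ani_to_refs: Dict[str, float],
                   cutoff: float = ANI_CUTOFF) -> Optional[str]:
    """Return the reference species with the highest ANI, or None if below cutoff."""
    if not ani_to_refs:
        return None
    best_ref = max(ani_to_refs, key=ani_to_refs.get)
    return best_ref if ani_to_refs[best_ref] >= cutoff else None


# Hypothetical ANI values (%), for illustration only.
example = {"Y. enterocolitica": 98.7, "Y. proxima": 93.1, "Y. artesiana": 92.4}
print(assign_species(example))
```

A genome whose best ANI falls below the cut-off is left unassigned (None) rather than forced into the nearest species.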
Biotype identification
The biotypes of Y. enterocolitica were identified by traditional biochemical tests, which revealed biotype 1A (n = 89), biotype 2 (n = 3), biotype 3 (n = 4), biotype 4 (n = 84), and biotype 5 (n = 7). No biotype 1B isolates were found (Table S3).
Clustering Analyses
After filtering and screening, we constructed a maximum likelihood (ML) tree based on 1,563,073 SNPs. The best-fitting model for the ML tree was GTR + F + ASC + G4, chosen according to the Bayesian information criterion (BIC). The ML tree clearly resolved the 271 isolates into 187 Y. enterocolitica (69.00%), 31 Y. intermedia (11.44%), 30 Y. massiliensis (11.07%), 7 Y. mollaretii (2.58%), 5 Y. pekkanenii (1.85%), 4 Y. proxima (1.48%), 2 Y. alsatica (0.74%), 1 Y. frederiksenii (0.37%), 1 Y. kristensenii (0.37%), 1 Y. hibernica (0.37%), 1 Y. canariae (0.37%), and 1 Y. rochesterensis (0.37%) isolates (Fig. 2a).
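To make the BIC-based model choice concrete, the following minimal sketch computes BIC = k ln(n) - 2 ln(L) for candidate substitution models and picks the lowest score. The log-likelihoods and parameter counts are invented for illustration, and treating the number of SNP sites as the sample size n is an assumption; this is not the model-selection output of the study.

```python
import math
from typing import Dict, Tuple


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(fits: Dict[str, Tuple[float, int]], n_sites: int) -> str:
    """Pick the model name with the lowest BIC from {name: (lnL, n_params)}."""
    return min(fits, key=lambda name: bic(fits[name][0], fits[name][1], n_sites))


# Hypothetical log-likelihoods and parameter counts, for illustration only.
fits = {
    "JC+ASC": (-1_000_500.0, 540),
    "GTR+F+ASC+G4": (-1_000_000.0, 549),
}
print(best_model(fits, n_sites=1_563_073))
```

The penalty term k ln(n) means that a richer model is preferred only when its gain in log-likelihood outweighs the cost of its additional parameters.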
Notably, the 4 isolates closest to the Y. proxima reference sequence and the Y. artesiana reference sequence clustered together with Y. enterocolitica. Y. artesiana and Y. proxima are subspecies of Y. enterocolitica, first identified by Savin et al. (2019) and named NEW3 and NEW4, respectively. In this study, Y. enterocolitica and the reference genome (GCA_000009345.1) refer to Y. enterocolitica subsp. enterocolitica, whose ANI values with both Y. artesiana and Y. proxima were < 95%.
The ML tree showed clear separation of the Yersinia species, consistent with the results of the ANI analysis and with the separation into 12 distinct species determined by BAPS (Fig. 2a). Before whole-genome sequencing, the identification of Yersinia species was largely based on limited biochemical data. The ML tree revealed the inadequate resolution of biochemical tests and emphasized the need to use modern molecular methods for the classification of bacterial species.
For Y. enterocolitica, the ML tree was broadly divided into two clades. Biotypes 1A and 4 formed discrete clusters, whereas biotypes 2 and 3 consisted of closely related but distinct lineages, as confirmed by BAPS clustering. Biotype 5 formed a separate branch relative to biotype 1A and biotype 4. The reference sequence Y. enterocolitica 8081 (biotype 1B) also formed a separate branch, as there was no biotype 1B among the 187 Y. enterocolitica isolates. The distribution of the strains was not directly related to the time, place, or host of isolation; instead, the ML tree showed that clustering followed the distribution of the biotypes. This result agreed with the findings of Reuter et al. (2014), who classified Y. enterocolitica into two species clusters (SC): SC6 (biotypes 1A and 1B) and SC7 (biotypes 2, 3, 4, and 5). Biotype 4 was correlated with serotype O:3, and biotype 1A was associated with serotypes O:5, O:8, and O:9. Biotypes 3, 4, and 5 displayed tight clusters with short terminal branches compared with biotypes 1A and 2. One possible explanation is that they were the product of one or more recent population bottlenecks or population expansions (Fig. 2b).
Virulence Profiles
In total, 99 virulence genes in 4 categories (motility, invasion, immune modulation, and effector delivery system) were annotated in the 187 Y. enterocolitica isolates (Fig. 3). As shown in Fig. 3, the distribution of virulence genes was closely related to the biotype. Notably, biotypes 3, 4, and 5 carried virulence genes associated with the type III secretion system (T3SS), in contrast to biotype 1A.
Type III secretion system
The effector delivery system was T3SS, which includes Ysc T3SS and Ysa T3SS. T3SS is a pathogenic strategy common to numerous Gram-negative pathogens of plants and animals, used to inject virulence proteins into the cytoplasm of target eukaryotic cells [37]. When members of Yersinia cause disease in humans, they invariably employ the type III secretion machinery to inhibit the primary line of immunological defense in vertebrates, the professional phagocytic cells [38].
Yersinia Outer Proteins and Other Proteins Encoded by Low-Calcium Response Stimulons
In very early work on Yersinia physiology, a phenomenon known as the low-calcium response (LCR) was identified, whereby bacteria grown in rich media at elevated temperatures (37°C) would show a growth defect when calcium ions were chelated. However, at 26°C, growth would continue logarithmically regardless of the calcium content, indicating that LCR regulation depends strongly on temperature. Y. enterocolitica contains a highly conserved region of low calcium response-stimulating proteins (LCRS). Virulence genes in the LCRS region include the lcrV gene encoding the V antigen (LcrV) [39] and yadA [40]. These genes are most abundantly expressed at low calcium concentrations, resulting in growth restriction in vitro [38]. The secretion of Yersinia outer proteins (Yops) involves up to 22 proteins (YscA-L, YscN-U, LCRD, LCRP/YscM) [41]. Yops are especially important in enteropathogenic Yersiniae and require effective bacterial adherence to target host cells for expression and functional deployment [41].
Heat-stable enterocolitica toxin
The heat-stable enterocolitica toxin (Yst) is chromosomally mediated [42] and acts by stimulating guanylate cyclase in intestinal epithelial cells [43]. Delor and Cornelis [44] demonstrated that Yst may be an important factor involved in Y. enterocolitica-associated diarrhea in young rabbits.
FliA
Motility genes are involved in flagellar assembly. Before Y. enterocolitica establishes intimate contact with the intestinal epithelium, flagella and motility play an important role in initiating host cell invasion [37]. The flagellar regulatory genes flhDC and fliA are both required for the expression of motility. The fliA gene, also present in E. coli and Salmonella typhimurium, encodes an alternative sigma factor, σ28, also known as RpoF [45]. Inhibition of motility at 37°C involves changes in DNA topology [42]. Iriarte et al. demonstrated loss of motility but no effect on pathogenicity in fliA mutants [45].
InvA
The invasion factor is encoded by the inv gene, which plays a crucial role in the initial stages of intestinal mucosal invasion. It encodes the 103 kDa Inv protein, invasin [46]. Invasin, an outer membrane protein that coats the bacteria, binds directly to integrin receptors on the cell surface and mediates cell entry and internalization of the bacterium. This process has been likened to a “zipper mechanism” [46]. Upon binding, the invasin ligand may induce a conformational change in the receptor that initiates a cascade of intracellular signals leading to uptake [47]. In addition, invasin promotes bacterial attachment to the extracellular matrix proteins fibronectin and collagen by binding with high affinity to β1 integrin [40]. Moreover, integrin receptors appear to be ideally suited for the polymerization of microfilaments during the entry process.
Immune modulation
The major immune modulation factor is the O antigen, the primary component of the lipopolysaccharide in the outer membrane of Gram-negative bacteria, which is required for the correct expression or function of other outer membrane virulence factors [48].
ST Typing
Following WGS, the STs were derived from the genome data. Sequence typing data were available for all 187 Y. enterocolitica isolates, among which multilocus sequence typing detected 54 STs. The most common STs were ST429 at 42.25% (79/187); ST3 at 4.81% (9/187); ST13 at 3.74% (7/187); ST278 at 3.21% (6/187); and ST178 and ST637 at 2.67% each (5/187) (Table S3). As shown in Fig. 4, the minimum spanning tree (MST) presented two main centers, ST429 and ST13. This was consistent with the results of the Y. enterocolitica ML tree. In addition, biotype 1A was more polymorphic than biotype 4. ST429 included biotype 4 (n = 79) and biotype 3 (n = 2) and was closely related to serotype O:3. ST3 (n = 9), ST278 (n = 6), ST178 (n = 5), ST637 (n = 5), ST640 (n = 4), ST643 (n = 4), and ST216 (n = 3) were biotype 1A and were associated with serotypes O:5, O:8, and O:9. ST13 (n = 7) was biotype 5, again associated with serotypes O:5, O:8, and O:9. These results were consistent with the findings of Hunter et al. (2019). In a survey of Y. enterocolitica and Y. pseudotuberculosis isolated from clinical specimens in the UK from 2004 to 2018, Hunter et al. (2019) found that certain STs correlated with specific serogroups, including ST18/O:3, ST12/O:9, ST184/O:6,30, and ST192/O:8. ST12 and ST4 were the two sources in the MST constructed from strains from the Ningxia region and those publicly available in the database (Table S2 and Fig. 4b). Bioserotypes 1A/O:6,30 and 2,3,4/O:9 were the biotypes and serotypes corresponding to these two sources. The strong correlation between phylogenetic signals and serotypes may suggest that the early evolution of Y. enterocolitica was dominated by ecological specialization.
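For readers unfamiliar with how an MST is derived from MLST data, the sketch below builds one with Prim's algorithm, using the number of differing loci between allelic profiles as the edge weight. The 7-locus profiles, the ST labels, and the allele numbers are hypothetical, and the tie-breaking rule (alphabetical) is an assumption; dedicated typing software may apply different priority rules, so this is not a reconstruction of Fig. 4.

```python
from typing import Dict, List, Tuple

Profile = Tuple[int, ...]


def allelic_distance(a: Profile, b: Profile) -> int:
    """Number of loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles: Dict[str, Profile]) -> List[Tuple[str, str, int]]:
    """Prim's algorithm over the complete graph of STs; returns (u, v, distance) edges."""
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges: List[Tuple[str, str, int]] = []
    while len(in_tree) < len(names):
        # Cheapest edge from the current tree to a node outside it
        # (ties broken by name for reproducibility).
        d, u, v = min(
            (allelic_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        in_tree.add(v)
        edges.append((u, v, d))
    return edges


# Hypothetical 7-locus profiles; the ST labels and allele numbers are made up.
profiles = {
    "ST_A": (1, 1, 1, 1, 1, 1, 1),
    "ST_B": (1, 1, 1, 1, 1, 1, 2),
    "ST_C": (1, 1, 1, 1, 1, 3, 2),
    "ST_D": (5, 6, 7, 8, 1, 1, 1),
}
for u, v, d in minimum_spanning_tree(profiles):
    print(f"{u} - {v}: {d} locus difference(s)")
```

In such a tree, STs that differ at a single locus sit next to each other, so a densely connected node can be read as a center of closely related STs.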
Core Genome MLST
A neighbor-joining (NJ) tree and an MST based on cgMLST analysis of the 187 Y. enterocolitica isolates were constructed. The 1,553 cgMLST target genes were randomly distributed across the genome, encoding functional enzymes and proteins. The Y. enterocolitica isolates were divided into 125 cgMLST types (CTs) (Table S3 and Fig. 5a). cgMLST analysis revealed core genome diversity among strains with the same ST, ranging from 0 to 84 allelic differences. The same cgMLST distribution was present between the different STs. The NJ tree of the 187 Y. enterocolitica isolates from Ningxia resolved two microclades of the HC1490 cluster (Fig. 5a), with HC1490_10 and HC1490_2 as the primary microclades. These two microclades were consistent with the results of the Y. enterocolitica ML tree. HC1490_2 was strongly associated with biotypes 3, 4, and 5; HC1490_10 was closely related to biotypes 1A and 2.
Hierarchical clustering (HC) of cgMLST (HierCC) defines clusters based on cgMLST. Distances between genomes are calculated using the number of shared cgMLST alleles, and genomes are linked using a single-linkage clustering criterion. These clusters are assigned stable cluster group numbers at different, fixed cgMLST allele distances; for Yersinia, the cut-offs include 0, 2, 5, 10, 20, 50, 100, and 200. After comparison of the clustering at HC50, HC100, HC200, HC400, and HC600, HC100 was chosen as the criterion for CT clustering. HC100 indicates that the clusters included all strains with links no more than 100 alleles apart. The 125 CTs present in the 187 isolates from the Ningxia region clustered into 54 microclades of HC100. Of these, HC100_2571, HC100_150, HC100_1273, and HC100_4570 were the principal microclades. The NJ tree constructed from strains from the Ningxia region and those publicly available in the database showed that several microclades of the HC100 cluster were significantly associated with serotypes, such as HC100_397/O:5,27; HC100_111/O:9; HC100_4570 and HC100_466/O:8; and HC100_150/O:5. O:3 was related to three microclades of the HC100 cluster: HC100_406, HC100_7, and HC100_2571 (Fig. 5b). Biotypes 1B and 5 were located in relatively separate microclades. Biotypes 2, 3, and 4 were divided into several microclades. Interestingly, individual microclades were host-specific, for instance, HC100_4570/sheep, HC100_111/pig and human, HC100_2571/pig and human, and biotype 1B/human. The NJ tree suggested that cgMLST significantly improved the phenotyping and identification of Y. enterocolitica compared with ST typing, and allowed more accurate discrimination of sample information such as the source.
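The single-linkage criterion described above can be sketched as follows: two genomes fall in the same cluster whenever a chain of pairwise links, each no more than a given number of allele differences, connects them. This is a simplified illustration using union-find, not the HierCC implementation itself; the function names, the toy 4-locus profiles, and the handling of missing loci (skipped in the comparison) are assumptions.

```python
from typing import Dict, List, Optional, Sequence

Alleles = Sequence[Optional[int]]  # None marks a missing locus call


def allele_differences(a: Alleles, b: Alleles) -> int:
    """Count loci called in both genomes where the allele numbers differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def single_linkage_clusters(profiles: Dict[str, Alleles], max_diff: int) -> List[List[str]]:
    """Group genomes linked by chains of pairwise distances <= max_diff (union-find)."""
    parent = {name: name for name in profiles}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    names = sorted(profiles)
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if allele_differences(profiles[u], profiles[v]) <= max_diff:
                parent[find(u)] = find(v)

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Toy 4-locus profiles (hypothetical); a real scheme would use 1,553 loci and max_diff=100.
toy = {"g1": (1, 1, 1, 1), "g2": (1, 1, 1, 2), "g3": (1, 1, 3, 2), "g4": (9, 9, 9, 9)}
print(single_linkage_clusters(toy, max_diff=1))
```

In the toy data, g1 and g3 differ at two loci yet still share a cluster at a threshold of 1, because both link to g2; this chaining is characteristic of single linkage.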