Phylogenetic analysis
At least 47,061,190 of the paired-end reads processed had a Phred score of 30, indicating high sequence quality. The overall GC content ranged from 39.06% to 40.26%. A total of 71,368 core-genome SNPs were identified across all strains by kSNP3. The phylogenetic tree based on these core-genome SNPs indicated high genetic diversity, and the isolates clustered into three primary lineages (LI, LII, and LIII; threshold = 0.1) based on the structure of the phylogenetic tree and the number of virulence genes (Fig. 1). The largest primary lineage was LIII, containing 49.5% (n = 108) of the study isolates. LI and LII contained 13.8% (n = 30) and 36.2% (n = 79) of the isolates, respectively. Each lineage contained sublineages, of which only sublineage LIIIA is highlighted because of its unique characteristics (Fig. 2).
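For readers less familiar with the two read-level metrics quoted above, the sketch below shows how they are conventionally defined. It is illustrative only and is not the pipeline used in this study; the function names are hypothetical. A Phred score Q corresponds to a base-call error probability of 10^(-Q/10), so Q = 30 means roughly one expected error per 1,000 bases.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q to the error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among the unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100 * gc / acgt


# Toy inputs, not study data.
q30_error = phred_to_error_probability(30)
toy_gc = gc_content("ATGCGC")
```

Ambiguous bases such as N are ignored in the denominator here; other tools may count them differently.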
Serotype, Sequence Type, And Clonal Complex
Among the field isolates (n = 218), the in silico whole-genome serotyping pipeline identified 11 of the 15 known G. parasuis serotypes (Fig. 3); serotypes 9, 10, and 11 were not detected. A total of 6.4% (14/218) of isolates were untypable. The Simpson’s index of diversity for serotyping was estimated as 0.82. The most frequently detected serotypes were 7 (32.6%, 71/218), 13 (16.1%, 35/218), 2 (14.7%, 32/218), 4 (14.2%, 31/218), and 5/12 (7.3%, 16/218). Each lineage had its own predominant serotype(s). Within LI, LII, and LIII, serotypes 5/12 (50%, 15/30), 7 (75.9%, 60/79), and 13 (27.8%, 30/108), respectively, were the most frequently detected (Table 1). Consistent with this distribution of serotypes within lineages, statistical analysis showed that serotype 5/12 was associated with LI, serotype 7 was associated with LII, and serotypes 2, 4, and 13 were associated with LIII (p < 0.05). In addition, LI and LII isolates were significantly associated with respiratory and systemic isolates, respectively (p < 0.05). A total of 80 isolates were recovered from tissues for which histopathological data were available. Of these, 8 came from tissues without detectable lesions and were of serotypes 4, 7, and 13; the majority of these pigs had lesions at other sites from which Streptococcus suis was isolated. Among the remaining 72 isolates, the following lesions were detected: polyserositis (n = 27), bronchopneumonia/pneumonia (n = 26), epicarditis/myocarditis (n = 9), arthritis/synovitis (n = 4), meningitis/encephalitis (n = 3), pleuritis (n = 2), and splenitis (n = 1) (Additional file 3).
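The text does not name the statistical test behind the reported associations. As an illustration only, the sketch below assumes a two-sided Fisher's exact test on a 2x2 contingency table and applies it to the serotype 7 and LII counts stated above (60 of 79 LII isolates and 71 of 218 isolates overall were serotype 7). The function name is hypothetical, and using all 218 field isolates as the total is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def pmf(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    observed = pmf(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(p for p in map(pmf, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


# Rows: LII vs. other isolates; columns: serotype 7 vs. other serotypes.
lii_sero7 = 60
lii_other = 79 - 60
rest_sero7 = 71 - 60
rest_other = 218 - 79 - rest_sero7  # assumes all 218 field isolates are counted
p_value = fisher_exact_two_sided(lii_sero7, lii_other, rest_sero7, rest_other)
```

Under these assumptions the p-value falls far below 0.05, in line with the reported association; a chi-square test or a multiple-testing correction could give different values for rarer serotypes.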
Table 1
Characteristics of isolates in the specific phylogenetic lineages
Lineage | Number of isolates | Number of flows | Country (n) | Serotype (%)b | ST | Average number of group 1 vtaAs | Putative virulence factors (%)b | AMR genes (%)b |
I | 30 | 12 | Canada (n = 7), Mexico (n = 1), USA (n = 22) | 2 (3.3%), 4 (10%), 13 (13.3%), 14 (10%), 5/12 (50%), NT (13.3%) | 6, 414, 427, 442, 446, 454, 455, 469, 472, 474, 551, 629 | 6.8 | bmaA4, bmaA6, pilA, pilC (63.3%), bmaA1 (33.3%), bmaA5, pilB (96.7%), lsgB (50%), siaB (93.3%), ompP2 (66.7%), ompP5 (100%) | bcr, ksgA, sul2 (100%), aph(3'')-Ib, bacA (99.1%), blaZ (13.3%), spc (13.3%), ermA and tetB (3.3%), tetM (6.7%) |
II | 79 | 25 | Canada (n = 6), Chile (n = 6), Mexico (n = 5), Peru (n = 5), USA (n = 57) | 2 (8.7%), 4 (6.3%), 6 (6.3%), 7 (75.9%), 5/12 (1.3%), 13 (1.3%) | 157, 403, 416, 422, 447, 452, 454, 458, 459, 462, 478, 526, 624 | 6.2 | bmaA1, lsgB (3.8%), bmaA4, bmaA6, pilA, pilB, pilC (40.5%), bmaA5 (62%), ompP2 (53.2%), ompP5 (100%), siaB (93.7%) | bcr, aph(3'')-Ib, ksgA, sul2 (100%), bacA (98.7%) |
III | 108 | 40 | Canada (n = 7), Chile (n = 3), Denmark (n = 4), Mexico (n = 19), Peru (n = 1), USA (n = 74) | 1 (3.7%), 2 (22.2%), 3 (0.9%), 4 (21.3%), 7 (10.2%), 8 (2.7%), 13 (27.8%), 15 (1.9%), NT (9.3%) | 116, 239, 299, 401, 402, 404, 406, 408–413, 415, 417–421, 423, 424–426, 430, 432–434, 437, 438, 440, 442, 443, 445, 448, 451, 452, 471, 473, 476, 498, 548, 549, 552, 553, 617, 618, 632 | 5.02 | bmaA1 (9.3%), ompP2, pilC (56.5%), bmaA4, bmaA6 (55.6%), bmaA5 (85.2%), ompP5 (100%), pilB (72.2%), siaB (75%), pilA (62%) | bcr, aph(3'')-Ib, ksgA, sul2 (100%), bacA (98.1%), qnrB, aph3''Ia (0.9%), blaROB−1, spc (3.7%), strA, ermA (2.8%), tetJ (2.7%), tetM (5.6%), tetB (4.6%) |
b Represents the percentage of a given serotype, putative virulence factor, or AMR gene within a lineage. NT, non-typable.
Multilocus sequence typing (MLST) analysis revealed 72 STs among the 218 field isolates, of which 66 were novel, representing 92.9% of the total STs identified; ST454 was the most frequently detected ST (Fig. 3, Additional file 3). The Simpson’s index of diversity (SDI) for MLST was 0.95. All reference strains typed as previously known STs, except the serotype 10 reference strain (Additional file 3). In contrast with the serotyping data, the majority of the STs clustered within the same lineage in the whole-genome SNP phylogenetic analysis. The majority of isolates belonged to a limited number of clonal complexes (CCs): CC157 (24.7%, n = 54), CC478 (8.3%, n = 18), CC413 (4.6%, n = 10), CC417 (3.7%, n = 8), CC56 (2.8%, n = 6), CC92 (2.8%, n = 6), CC452 (2.8%, n = 6), CC442 (2.3%, n = 5), CC438 (1.4%, n = 3), CC245 (0.9%, n = 2), CC96 (0.9%, n = 2), CC433 (0.9%, n = 2), and CC246 (0.5%, n = 1). The most common CCs were CC157 and CC478. CC157 comprised ST157, ST416, ST454, ST458, ST462, and ST526, and CC478, the second most frequently detected clone, contained ST459, ST478, and ST624 (Additional file 1). ST478 is a single locus variant (SLV) of ST157 but formed its own clone (Additional file 1). Based on the group definition criterion, 43.4% (95/218) of the isolates belonged to singleton STs. ST6, ST454, and ST478 were associated with serotypes 5/12, 7, and 2, respectively (p < 0.05). However, ST6 also included serotype 13 strains, ST454 included strains of serotypes 4, 13, and 14, and ST478 included serotype 7 strains. High ST diversity within serotypes was observed. For example, 19 different STs were identified among serotype 7 strains. Similar findings were observed for serotypes 13 and 2. The untypable strains belonged to different sequence types.
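The Simpson's index of diversity reported for serotyping (0.82) and MLST (0.95) can be computed from the number of isolates assigned to each type. The sketch below assumes the finite-sample form commonly used in typing studies, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)); the text does not state which formulation was used, and the example labels are hypothetical rather than study data.

```python
from collections import Counter
from typing import Hashable, Iterable


def simpsons_index_of_diversity(types: Iterable[Hashable]) -> float:
    """Probability that two isolates drawn without replacement differ in type."""
    counts = Counter(types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two isolates are required")
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n * (n - 1))


# Hypothetical toy assignment: three isolates of one ST, two of another, one of a third.
toy_sts = ["ST454"] * 3 + ["ST6"] * 2 + ["ST478"]
toy_sdi = simpsons_index_of_diversity(toy_sts)
```

Values close to 1 indicate that two randomly chosen isolates are very likely to belong to different types, which is why the MLST value exceeds the serotyping value when STs split the collection more finely than serotypes do.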
Within the dataset, the three dominant STs were ST454 (19.3%, 42/218), ST478 (5.5%, 12/218), and ST6 (5%, 11/218). The remaining STs were each represented by ≤ 7 isolates, and 33 of the 72 STs were represented by a single isolate. The two novel STs (ST454 and ST478) are single locus variants (SLVs) of each other at the infB locus. Within LI and LII, ST6 (36.7%, 11/30) and ST454 (51.9%, 41/79) were the dominant STs, respectively, while LIII showed high diversity, with 49 different STs. The six isolates in sublineage LIIIA (Fig. 2) had novel STs, except for two serotype 8 isolates that were both ST299. ST6 was associated with LI (p < 0.05), and ST454 and ST478 were significantly associated with LII (p < 0.05). Among the cases with histopathology data, most ST454 and ST478 isolates originated from polyserositis cases.
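A single locus variant relationship, such as the one described between ST454 and ST478, means that two allelic profiles differ at exactly one MLST locus. The sketch below illustrates the comparison. The seven locus names are an assumption about the scheme in use, and the allele numbers are hypothetical placeholders, not the real profiles of these STs.

```python
from typing import Dict, List

# Assumed locus names for a seven-gene MLST scheme (not confirmed by the text).
LOCI = ("atpD", "infB", "mdh", "rpoB", "6pgd", "g3pd", "frdB")


def differing_loci(profile_a: Dict[str, int], profile_b: Dict[str, int]) -> List[str]:
    """Return the loci at which two allelic profiles carry different alleles."""
    return [locus for locus in LOCI if profile_a[locus] != profile_b[locus]]


def is_single_locus_variant(profile_a: Dict[str, int], profile_b: Dict[str, int]) -> bool:
    """True when the two profiles differ at exactly one locus."""
    return len(differing_loci(profile_a, profile_b)) == 1


# Hypothetical allele numbers chosen only to illustrate an infB difference.
profile_x = {"atpD": 1, "infB": 5, "mdh": 2, "rpoB": 3, "6pgd": 4, "g3pd": 1, "frdB": 2}
profile_y = dict(profile_x, infB=9)
changed = differing_loci(profile_x, profile_y)
```

Grouping STs into clonal complexes typically builds on this pairwise comparison, although the exact group definition criterion applied here is not restated in this section.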
Distribution Of Virulence-associated Genes
A total of 212 (97.2%) isolates possessed at least one of the group 1 vtaA genes. Within group 1, vtaA5 (88.1%, 192/218), vtaA6 (84.7%, 185/218), vtaA3 (67.4%, 147/218), vtaA8 (66.5%, 145/218), and vtaA2 (65.1%, 142/218) were the most frequently detected (Additional file 3). The remaining group 1 vtaA genes, vtaA1, vtaA9, vtaA4, and vtaA7, were present in 58.7% (128/218), 53.2% (116/218), 43.1% (94/218), and 30.7% (67/218) of isolates, respectively. LI isolates carried an average of 6.8 (min: 2, max: 9) group 1 vtaA genes, LII isolates an average of 6.2 (min: 4, max: 9), and LIII isolates, excluding sublineage LIIIA, an average of 5.02 (min: 1, max: 9). Sublineage LIIIA isolates lacked all vtaA genes (Fig. 2). The genes vtaA3, vtaA8, and vtaA9 were significantly associated with LI and LII (p < 0.05). The genes vtaA1, vtaA4, and vtaA7 were significantly associated with LI, and vtaA6 was associated with LIII.
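The per-lineage averages above summarize how many of the nine group 1 vtaA genes (vtaA1 to vtaA9) each isolate carries. A minimal sketch of that summary, starting from a gene presence list per isolate, is shown below. The records are hypothetical and the function names are illustrative; this is not the analysis code used in the study.

```python
from statistics import mean
from typing import Dict, Iterable, Set, Tuple

GROUP1_VTAA = tuple(f"vtaA{i}" for i in range(1, 10))  # vtaA1 ... vtaA9


def group1_count(genes_present: Set[str]) -> int:
    """Number of group 1 vtaA genes detected in one isolate."""
    return sum(gene in genes_present for gene in GROUP1_VTAA)


def summarize_by_lineage(
    isolates: Iterable[Tuple[str, Set[str]]]
) -> Dict[str, Tuple[float, int, int]]:
    """Map each lineage to (mean, min, max) group 1 vtaA genes per isolate."""
    per_lineage: Dict[str, list] = {}
    for lineage, genes in isolates:
        per_lineage.setdefault(lineage, []).append(group1_count(genes))
    return {lin: (mean(c), min(c), max(c)) for lin, c in per_lineage.items()}


# Hypothetical records: (lineage, genes detected). vtaA10 belongs to group 2
# and is therefore not counted.
toy_isolates = [
    ("LI", set(GROUP1_VTAA) | {"vtaA10"}),
    ("LI", {"vtaA1", "vtaA2", "vtaA3", "vtaA4", "vtaA5"}),
    ("LIII", {"vtaA6"}),
]
toy_summary = summarize_by_lineage(toy_isolates)
```

Isolates that lack all vtaA genes would contribute a count of zero, so whether they are included (as with sublineage LIIIA, which was excluded from the LIII average) changes the mean and minimum.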
A total of 23 isolates, predominantly within LI and LII, carried all nine group 1 vtaA genes. These isolates belonged to serotype 7 (ST454, ST624, and ST632), serotype 5/12 (ST6, ST427, and ST474), serotype 2 (ST478 and ST548), serotype 4 (ST549 and ST629), serotype 13 (ST6), serotype 14 (ST442), and NT (ST427). Of these 23 isolates, six also carried all the group 2 and group 3 vtaA genes and belonged to serotype 7 (ST454 and ST624) and serotype 14 (ST442). Some isolates of the same ST carried different numbers of group 1 vtaA genes (Fig. 4).
Group 2 vtaA genes were found in the following proportions: vtaA10 (34.4%, 74/218) and vtaA11 (23.4%, 51/218). Within group 3, vtaA12 was found in 56.4% (123/218) and vtaA13 in 61.5% (134/218) of isolates.
Within serotypes, vtaA1 was associated with serotype 5/12, vtaA2 and vtaA7 were associated with serotype 2, and vtaA5, vtaA6 and vtaA8 were associated with serotype 7 (p < 0.05). Among the dominant STs, vtaA1, vtaA4, vtaA7, and vtaA9 were associated with ST6, and vtaA8 and vtaA9 were associated with ST454 (p < 0.05).
The prevalence of other putative virulence genes was further determined using whole-genome sequencing analysis. All isolates carried the porin protein gene ompP5, and 56.9% (124/218) of the isolates were also positive for the porin protein gene ompP2. The fimbrial genes pilA, pilB, and pilC and the siaB gene were present in 54.6% (119/218), 69.3% (151/218), 51.8% (113/218), and 84.4% (184/218) of isolates, respectively. Among the monomeric autotransporter genes, bmaA5 (78.4%, 171/218), bmaA4 (51.3%, 112/218), and bmaA6 (51.3%, 112/218) were the most prevalent, while bmaA1 (10.6%, 23/218) was the least prevalent. The lsgB gene was the least prevalent of the selected putative virulence genes, with only 8.3% (18/218) of isolates being positive; all isolates in LIII lacked this gene. Almost all lsgB-positive isolates (15/18) belonged to LI (Table 1) and were predominantly serotype 5/12.
Identification Of Antimicrobial Resistance Genes
Fourteen AMR genes were detected among the 218 genomes, encoding resistance to antibiotics of nine different classes: 2,5-diketopiperazines, aminoglycosides, beta-lactams, lincosamides, macrolide-lincosamide-streptogramin B (MLS), polypeptides, quinolones, sulfonamides, and tetracyclines. Among these, bcr, ksgA, bacA, sul2, and aph(3'')-Ib were detected in almost all isolates in the collection. The remaining genes were present in the following proportions: blaZ (6.9%, 15/218), tetM (3.7%, 8/218), spc (3.7%, 8/218), tetB (2.8%, 6/218), blaROB−1 (1.8%, 4/218), ermA (1.8%, 4/218), strA (1.4%, 3/218), qnrB (0.5%, 1/218), and aph3''Ia (0.5%, 1/218).
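The proportions above are simple ratios of positive genomes to the 218 genomes analyzed. As a small worked check, the sketch below recomputes them for the less common genes whose counts are given over 218; the helper name is hypothetical, and rounding to one decimal place is assumed from the way the values are reported.

```python
TOTAL_GENOMES = 218

# Positive-genome counts as stated in the text (subset reported over 218).
amr_counts = {
    "blaZ": 15,
    "tetM": 8,
    "spc": 8,
    "tetB": 6,
    "blaROB-1": 4,
    "ermA": 4,
    "strA": 3,
    "qnrB": 1,
}


def prevalence_percent(positives: int, total: int = TOTAL_GENOMES) -> float:
    """Percentage of genomes carrying a gene, rounded to one decimal place."""
    return round(100 * positives / total, 1)


amr_prevalence = {gene: prevalence_percent(n) for gene, n in amr_counts.items()}
```

The same helper applies to any of the gene or serotype frequencies in this section, provided the denominator matches the set of isolates actually tested.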
Relationships Between Strains And Geographical Location, Tissue, And Disease
The distribution of G. parasuis isolates was further analyzed geographically. A total of 76% (19/25) of isolates from Mexico, 48% (74/154) from the USA, and 35% (7/20) from Canada were grouped into LIII; 20% (5/25), 37% (57/154), and 30% (6/20) of the isolates from Mexico, the USA, and Canada, respectively, clustered in LII. The remaining 4% (1/25), 14.2% (22/154), and 35% (7/20) of the isolates from Mexico, the USA, and Canada, respectively, were grouped into LI. A few isolates from Chile, Peru, and Denmark clustered in specific lineages (Additional file 3). Lung (n = 86) and systemic (n = 89) strains accounted for 80.3% of the isolates. The remaining isolates were from the upper respiratory tract (URT; n = 8) or of unknown origin (n = 35). In general, no tissue-specific distribution was observed within the lineages.