A genome-wide association study of emm89 Streptococcus pyogenes identifies genetic variations contributing to severe invasive infections

doi:10.21203/rs.3.rs-3896691/v1

This work is licensed under a CC BY 4.0 License

Version 1

Streptococcus pyogenes causes mild human infections as well as life-threatening invasive diseases. Since the mutations known to enhance virulence to date account for only half of the severe invasive infections, additional mechanisms/mutations need to be identified. Here, we conducted a genome-wide association study of emm89 S. pyogenes strains to comprehensively identify pathology-related bacterial genetic factors (SNPs, indels, genes, or k-mers). Japanese (n=311) and global (n=666) cohort studies of strains isolated from invasive or non-invasive infections revealed 17 and 1,075 SNPs/indels and 2 and 169 genes, respectively, that displayed associations with invasiveness. We validated one of them, a non-invasiveness-related point mutation, fhuB T218C, by structure predictions and introducing it into a severe invasive strain and confirmed that the mutant showed slower growth in human blood. Thus, we report novel mechanisms that convert emm89 S. pyogenes to an invasive phenotype and a platform for establishing novel treatments and prevention strategies.

Biological sciences/Microbiology/Microbial genetics

Biological sciences/Microbiology/Bacteriology

Health sciences/Diseases/Infectious diseases/Bacterial infection

Streptococcus pyogenes is a human-restricted gram-positive pathogen associated with a wide spectrum of diseases. While S. pyogenes often causes non-invasive diseases including pharyngitis and impetigo in children, it is also involved in life-threatening invasive diseases, such as necrotizing fasciitis and streptococcal toxic shock syndrome (STSS)^1,2. The reported incidence of invasive infections continues to increase globally¹. In cases of severe infection, rapid bacterial growth and profound metabolic acidosis necessitate urgent surgical inspection and extended debridement with empiric antibacterial chemotherapy³. However, even with proper treatment, the mortality rate of patients with S. pyogenes infections is still 23%–81%⁴. Although several protective vaccine candidates against S. pyogenes exist, no safe and effective commercial vaccine has yet been licensed for human use^2,5.

S. pyogenes has been classified into at least 240 emm types based on the emm gene hypervariable region sequence³. Since the mid to late 2000s, emm89 strains have been increasingly isolated from samples obtained from patients with invasive diseases and become one of the most frequently identified lineages in developed countries, including Japan and the UK^6,7.

S. pyogenes emm89 strains have been genetically sub-clustered into three clades according to the nga promoter region patterns. Clade 3 is distinct from clades 1 and 2 in two features: the overexpression of the virulence factors NAD glycohydrolase (NADase) and streptolysin O (SLO), caused by mutations in the promoter region of the nga-ifs-slo operon, and the lack of a hyaluronan capsule^7-9. Although clade 3 strains have frequently been isolated from invasive diseases, we previously reported no difference in the isolation frequencies of clade 3 strains between invasive and non-invasive diseases, at least in Japan, and concluded that the mutations in clade 3 are not responsible for the gain of invasiveness¹⁰. Therefore, other genetic features within the emm89 strains are likely to determine their phenotypes.

During infections caused by emm1 S. pyogenes, mutations in the two-component system, CovR/S, trigger high virulence¹¹. These mutations cause upregulation of virulence factors such as hyaluronan capsule, NADase, and SLO as well as downregulation of cysteine protease SpeB and streptolysin S (SLS)^12,13. The resulting mutants can prevent neutrophil-killing and subsequently promote tissue destruction and systemic infections. Epidemiologically, Ikebe et al. reported that nonsense mutations in covR and/or covS were present in 46.3% of S. pyogenes strains isolated from severe invasive infections in Japan but only 1.69% of isolates from non-severe ones¹⁴. Furthermore, the study indicated that the covR/S mutation is not responsible for all invasive clinical strains. Thus, we hypothesize that other mechanisms are involved in the development of invasiveness.

To explore novel hypervirulence mechanisms of S. pyogenes, we performed a genome-wide association study (GWAS). We collected and sequenced emm89 clinical strains isolated in Japan and combined them with public emm89 genome sequences. Using these sequences, we investigated the bacterial factors associated with severe invasive infections in Japan and globally. For a comprehensive analysis, we constructed the core genome and pan-genome of emm89 S. pyogenes strains and evaluated the effect of single-nucleotide polymorphisms (SNPs) in core genes and accessory clusters of orthologous genes (COGs). Additionally, we performed a k-mer-based GWAS to detect variants in the intergenic regions and multiple mutations. Based on bacterial protein structural predictions, we then selected candidates with high potential relevance to the phenotype. Finally, we introduced an SNP related to non-invasiveness into a clinical strain isolated from a severe invasive infection and examined the alteration of the bacterial phenotype using an ex vivo infection assay.

Collection of emm89 S. pyogenes clinical isolates in Japan and construction of cohorts

We collected clinical S. pyogenes strains isolated between 2016 and 2021 from patients with non-invasive infections and STSS in Japan. STSS was diagnosed according to the clinical criteria of the Ministry of Health, Labour and Welfare of Japan (Extended Data Table 1)¹⁵. Phenotypes of strains from STSS patients were determined as ‘severe invasive’.

Concerning the emm89 clinical isolates, we collected T serotype TB3264 and untypable strains in addition to emm genotype-identified strains. The T serotype TB3264 corresponds to the genotype emm89 or emm94¹⁶. A total of 207 clinical isolates were collected with the cooperation of the National Institute of Infectious Diseases and ten public health institutes nationwide (Extended Data Tables 2, 3). We performed draft genome sequencing of the strains and identified their emm types. In total, 150 of these were determined as emm89. To focus on the pathogenic mechanisms underlying severe invasive infections in the emm89 cohort, 150 emm89 strains were used for subsequent analyses (Fig. 1a and Extended Data Table 2). We previously determined the draft genome sequences of 161 emm89 strains isolated in Japan between 2011 and 2019 and determined their phenotypes using the same criteria (Extended Data Table 2)¹⁰. We combined these two sets and finally considered a total of 311 emm89 strains, including 135 severe invasive and 176 non-invasive isolates, as the Japanese cohort.

We also collected public genome sequences of emm89 S. pyogenes strains isolated from nine countries to further characterize the genetic properties of the Japanese cohort (Extended Data Table 2)^17-19. In this study, the phenotypes of these strains were considered invasive if the diagnoses included severe infections, STSS, invasive infections, necrotizing fasciitis, bacteremia, or sepsis, and isolation sites were described as normally sterile sites. Consequently, we identified 666 strains in the global cohort, including 420 isolates from invasive cases and 246 from non-invasive ones (Extended Data Table 2).

Pan-genome and phylogenetic analyses revealed both shared and distinct features in the Japanese and global cohorts

To determine the core genes and gene distribution in both cohorts, we performed pan-genome analyses. In the Japanese cohort, 1,417 core genes possessed by >99% of all isolates were determined out of the 3,334 different genes detected within the 311 strains. In contrast, the global cohort was more diverse, with 4,743 different genes, out of which 1,327 were core genes (Fig. 1b).

Next, the phylogenetic relationships of the core gene sequences were calculated (Fig. 1c and Extended Data Fig. 1). The tree for the global cohort branched into four clusters: A, B1, B2, and B3. Cluster B3 includes 640 genetically similar strains isolated mainly from Europe, North America, and Japan (Fig. 1c). The phylogenetic tree for the Japanese cohort could be clustered in the same way as that for the global cohort, and there was no significant difference in the proportions of strains classified into each cluster (chi-square test, p=0.13; Supplementary Table 4); thus, we concluded that the overall phylogenetic features of emm89 strains were distributed quite similarly in Japan and other areas. Within cluster B3, we discovered a non-invasive strain from Japan whose nga promoter pattern was not identical to any of the reported variations²⁰ (Fig. 1c and Extended Data Fig. 1). This pattern is presumed to be a subtype of clade 3 because it shares the haplotype A_–27G_–22T_–18, which is distinctive of clade 3, but has a mutation in the –10 box (Extended Data Fig. 2)²⁰. Thus, we named this novel nga promoter variation type 3.4²⁰. Further details about the phylogenetic analysis are given in Supplementary Note 1. Overall, using phylogenetic approaches, we found that most strains from Japan and countries in Europe and North America share genetically close relationships.

GWASes detected SNPs/indels associated with invasiveness that were both common and specific to Japan and other countries

To discover all types of genetic variants within emm89 S. pyogenes that were associated with (severe) invasiveness, we applied pan-genome analysis and performed three types of independent GWASes targeting SNPs in core genes, the presence or absence of all genes, and other variants located in intergenic regions spanning several nucleotides.

We extracted 24,627 and 47,060 SNPs/indels from core gene alignments in the Japanese and global cohorts, respectively. Subsequent GWASes identified SNPs/indels associated with severe invasiveness in the Japanese cohort and with invasiveness in the global cohort. To control for population bias, we calculated pairwise distance matrices and chose seven and three dimensions for the analyses of the Japanese and global cohorts, respectively (Extended Data Fig. 3a, b). The GWAS of the Japanese cohort detected 17 SNPs/indels in 13 core genes (Fig. 2a and Extended Data Table 2). Of the 17 significant variants, there were seven single-nucleotide deletions (SNDs), seven SNPs causing non-synonymous amino acid substitutions, and three SNPs causing synonymous substitutions. The covS gene, encoding a sensor kinase of the two-component system CovR/S, contains four SNDs with the lowest p-values (p=1.16×10^–7 for the 39th, 40th, and 46th nucleotides, and p=1.15×10^–6 for the 125th nucleotide). These four deletions were associated with severe invasive infections.

We also performed a GWAS for the global cohort and detected 1,075 SNPs/indels significantly related to invasive infections among the 360 core genes (Fig. 2b, c and Supplementary Table 3). Among the significant SNPs/indels, 725 caused synonymous substitutions and 319 caused non-synonymous substitutions or frameshift mutations. Nineteen SNPs induced nonsense mutations, whereas the effects of 12 SNPs/indels were unpredictable because of a lack of reference sequences (Supplementary Table 3). Notably, 96 SNPs/indels accumulated in a single gene, murJ, which is involved in peptidoglycan biosynthesis (Extended Data Fig. 3c). The SNP with the lowest p-value (p=1.35×10^–14) was located in lacE, which encodes the EIICB component of the lactose-specific phosphotransferase system (Fig. 2c). The mutation is associated with an invasive phenotype and is mainly observed in strains isolated in the US. Of the 17 significant SNPs/indels in the Japanese cohort, 10 were also detected in the global cohort, including four SNDs in covS and one SNP each in six loci (Fig. 2c). Deletions at the covS locus are common among strains from several countries, including Japan. In contrast, SNPs found at six loci, gatA, group_1102, group_647, iscS_1, recU, and fhuB, existed exclusively in Japan (Fig. 2c and Supplementary Table 3). These results suggest that several bacterial mechanisms cause severe invasive S. pyogenes infections, and some prevail worldwide, whereas others are specific to Japan.

The GWAS on COGs detected 2 and 169 genes associated with severe invasiveness in the Japanese cohort and invasiveness in the global cohort, respectively

Next, we examined the associations of accessory COGs with severe invasiveness in the Japanese cohort and with invasiveness in the global cohort. Two significant genes were detected in the GWAS for the Japanese cohort: group_184, which encodes a hypothetical protein, and divIC, which encodes a septum formation initiator protein (p=8.81×10^–6 and p=6.72×10^–6, respectively; Fig. 3a and Supplementary Table 4). Although analysis of the global cohort revealed 169 genes that were significantly related to invasiveness, none was identical or homologous to the two genes detected in the Japanese cohort (Fig. 3b, c and Supplementary Table 5). Among the 169 genes, 25 were phage-related and 14 encoded proteins of mobile genetic elements (MGEs), such as transposase, integrase, and recombinase. Overall, genes associated with invasiveness were found to encode MGEs and transporters, whereas major virulence factors were not significantly associated with invasiveness.

The k-mers-based GWAS detected both distinctive and identical variants compared to the SNPs- and COGs-based GWASes

To detect SNPs/indels and multiple mutations in the entire genome, we extracted 31-nt-length k-mers from whole genomes and performed a GWAS. The k-mers-based GWAS can handle polymorphisms spanning more than one base, such as indels, inversions, and translocations, in both the coding and non-coding regions.
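The idea behind k-mer extraction can be illustrated with a minimal sketch. It enumerates the 31-nt k-mers of each sequence and records which strains carry them. This is not the DBGWAS implementation, which compacts k-mers into a de Bruijn graph before testing; the function names and the toy sequences are hypothetical, and reverse-complement (canonical) k-mers are ignored for brevity.

```python
from collections import defaultdict


def kmers(sequence, k=31):
    """Return the set of all k-nt substrings of a DNA sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def kmer_presence(genomes, k=31):
    """Map each k-mer to the set of strains whose sequence contains it."""
    presence = defaultdict(set)
    for strain, sequence in genomes.items():
        for kmer in kmers(sequence, k):
            presence[kmer].add(strain)
    return dict(presence)


# Hypothetical 40-nt toy sequences that differ by a single substitution.
reference = "ATGGCTAAAGTTCTTGACCGTATTGGCAAACCGGAAGTTC"
variant = reference[:20] + "A" + reference[21:]

presence = kmer_presence({"strain_ref": reference, "strain_var": variant})
# k-mers carried only by the variant strain mark the substituted position.
unique_to_variant = sorted(k for k, s in presence.items() if s == {"strain_var"})
```

Because every k-mer overlapping a variant differs between carriers and non-carriers, a single substitution or indel is captured by up to k presence/absence patterns, in coding and non-coding regions alike.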

The set of connected nodes and edges comprising each de Bruijn graph is called a complex (Fig. 4a–f). In the Japanese cohort, the k-mers-based GWAS detected two complexes containing causative variants associated with severe invasiveness (Extended Data Table 4). The complex comprising the nodes with the lowest q-value (p=1.49×10^–2) was Comp_11 in the covS locus (Extended Data Table 4). The causative mutation in covS was an SND (Fig. 4a). This deletion resulted in a frameshift mutation that shortened the length of CovS from 500 to 35 amino acids, leading to increased invasiveness, as previously reported in other emm types¹³. Another complex significantly associated with severe invasiveness was Comp_2 (p=4.22×10^–2; Extended Data Table 4). Comp_2 is a highly variable region containing eight hypothetical protein-coding genes, with high similarity within the first 75 bp. Significant k-mers were also mapped to the first 26 bp of group_184 and 20 bp upstream (Fig. 4b). Together with the genes-based GWAS result, this finding suggests that group_184 contributes to severe invasiveness not only through its presence but also through its upstream region.

Next, we analyzed the global cohort. We identified mutations that were significantly associated with invasiveness in five regions (Extended Data Table 5). The mutation with the lowest q-value (p=1.90×10^-2) was identical to the SND in covS found in the k-mers-based GWAS of the Japanese cohort (Fig. 4a).

Two significant k-mers were present in Comp_6, which was found to be an intergenic region of 270 bp (p=1.90×10^–2; Extended Data Table 5 and Fig. 4c). In Comp_24, a highly variable region containing genes encoding transposases, the presence of a 281-bp sequence consisting of several k-mers was significantly associated with the invasive phenotype (p=1.90×10^–2; Extended Data Table 5 and Fig. 4d). n27458, located in Comp_10 at the sagG locus, which encodes the ATP-binding protein of the SLS efflux transporter, was significantly correlated with the non-invasive phenotype and corresponded to a synonymous mutation (p=2.40×10^–2; Extended Data Table 5 and Fig. 4e). The other significant mutation was located in the fhuB locus, encoding a putative ferrichrome transport system permease (p=2.40×10^–2; Extended Data Table 5). The mutation was identical to SNP T218C detected in the SNP/indels-based GWAS (Fig. 4f).

The k-mer approach identified multiple variants, including the mutation identified in the SNPs/indels-based GWAS, fhuB SNP T218C. Additionally, the mutation detected in covS was different from that detected in the SNPs/indels-based GWAS, but both caused frameshift mutations.

AlphaFold-based prediction of the impact of the identified mutations on function

To assess the impact of mutations on protein function, we predicted the protein structure using AlphaFold²¹. Here, we present structural predictions for three representative proteins: CovS, whose invasion-related deletions prevailed worldwide (Fig. 5a, b); FhuB, which carries a prominent mutation in the Japanese cohort and is associated with non-invasiveness (Fig. 5c–e and Supplementary Note 2); and LacE, whose mutation was observed mainly in invasive strains from the US (Supplementary Note 3).

We predicted a homodimerized CovS model using AlphaFold because the CovS of S. pyogenes forms homodimers²². SOSUI predicted that CovS has two transmembrane regions (Fig. 5a). Mutations detected in the SNP/indels- and k-mers-based GWASes were predicted to shorten the CovS protein to 35 and 45 amino acids, respectively. Since the intracellular domain of CovS is in the C-terminal region and is involved in the phosphorylation of the transcriptional regulator CovR, frameshift mutations leading to CovS truncation would inactivate the protein, and thus, CovR function (Fig. 5b).

The SNP fhuB T218C substitutes the 73rd valine of FhuB with alanine. FhuB is a component of an ATP-binding cassette transporter system that utilizes ferrichrome, which is one of the carriers of Fe³⁺. FhuB is predicted to localize to the cell membrane and form a channel with FhuG (Fig. 5c). The FhuBG complex can bind to one molecule of the extracellular ferrichrome-binding lipoprotein, FhuD, and two molecules of the intracellular ATP-binding protein, FhuC. Therefore, we constructed a structural model of the FhuBCCDG complex (Fig. 5d). This implied that the 73rd residue of FhuB exists in a region adjacent to FhuD. The hydrophobicity of the side chain was attenuated by the mutation, which potentially affected ferrichrome transport (Fig. 5e).
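The correspondence between nucleotide position 218 and residue 73 can be checked with a short calculation. The sketch below is only an illustration: it assumes positions are counted from the first base of the coding sequence, and because the actual valine codon in fhuB is not given here, it tests all four valine codons of the standard genetic code. The helper name `codon_position` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_position(nucleotide_position):
    """Convert a 1-based CDS position to (codon number, position in codon)."""
    codon_number = (nucleotide_position - 1) // 3 + 1
    offset = (nucleotide_position - 1) % 3 + 1
    return codon_number, offset


# Nucleotide 218 is the second base of codon 73.
codon_number, offset = codon_position(218)

# Replacing the second base T with C in any valine codon (GTN) gives GCN.
valine_codons = [c for c, aa in CODON_TABLE.items() if aa == "V"]
mutated = {c: CODON_TABLE[c[0] + "C" + c[2]] for c in valine_codons}
```

Every entry of `mutated` is alanine, so a T-to-C change at the second codon position yields V73A regardless of which valine codon the gene uses.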

The fhuB T218C mutation represses the growth of a severe invasive strain in human blood

Based on the GWAS results and predicted protein structures, we focused on the SNP fhuB T218C. To further investigate its potential role in virulence, we constructed a mutant strain carrying the SNP. We selected the strain TK02, which carries the wild-type (WT) allele T218 in fhuB and was originally isolated from a patient with severe invasive infection in Japan. We used TK02', a derivative of TK02 obtained after several passages, as the WT strain and introduced the SNP fhuB T218C into it via allelic exchange mutagenesis with a thermo-sensitive shuttle vector. We then confirmed by whole-genome resequencing that there were no unintended differences between the WT and fhuB T218C strains.

To reveal the effects of the SNP on invasiveness, we performed transcriptomic analysis of the WT and fhuB T218C strains in THY broth and human blood. Principal component analysis revealed that the differences in the overall transcriptional profiles between the strains were more pronounced in blood than in THY (Fig. 6a, b and Supplementary Tables 6–9). We found that the expression of CovR-regulated genes, including speB, the nga-ifs-slo operon, and the sag operon, was upregulated in blood compared with that in THY in both strains (Fig. 6c). In human blood, the mutant strain downregulated mga and emm and upregulated genes encoding virulence factors, such as speC, scpA/B, endoS, ska, and sfbX, compared with the WT strain (Supplementary Table 12). Although Mga regulates the expression of surface and secreted molecules important for colonization and immune evasion²³, no strong expression changes were observed in the Mga regulon, except for the emm gene. Notably, fhuB, fhuC, and fhuD were upregulated in the fhuB mutant in both environments (Supplementary Tables 9 and 12).

To determine whether the fhuB T218C mutation affects iron transport, we measured the intracellular free ferric ion concentration in each strain and observed no difference between the strains in human blood (Fig. 7A). The upregulation of fhuBCD may compensate for the function impaired by the SNP.

Next, to investigate the effects of the SNP on bacterial survival in human blood, we performed a bactericidal assay using human blood. After 3 h of incubation, the fhuB T218C mutant strain showed a significantly lower survival rate than the WT strain (Fig. 7B). To determine which blood components account for the attenuated survival of the mutant, we compared bacterial survival rates in erythrocytes, polymorphonuclear cells (PMNs), plasma, heat-inactivated plasma, and brain heart infusion broth (Fig. 7C–7G). Notably, after incubation with erythrocytes, PMNs, or plasma, the mutant strain showed a significantly lower survival index than the WT strain, as observed in whole blood (Fig. 7C–7E and 7G). However, there were no significant differences between the survival rates of the WT and mutant strains in heat-inactivated plasma, suggesting that the mutant strain is susceptible to complement (Fig. 7F). Overall, the polymorphism T218C in fhuB impaired the survival of a severe invasive strain in human blood through interactions with erythrocytes, PMNs, and complement.

This study focused on emm89 S. pyogenes, which causes globally expanding invasive infections. We constructed a workflow for bacterial GWAS and explored the bacterial genetic factors related to severe or invasive infections to reveal plausible mechanisms of pathogenesis. Several GWASes have been conducted on S. pyogenes to date; however, no studies have described the combined analysis of SNPs, genes, and k-mers exclusively using emm89 strains^19,24. We independently performed SNPs/indels-, genes-, and k-mer-based GWASes for Japanese and global cohorts to investigate the associations of all types of variants with strictly defined phenotypes in severe invasive S. pyogenes infections and with phenotypes in a broader context of invasive infections.

Spontaneous mutations in the covR/S genes potentiate the transition from localized to invasive infection by M1T1 S. pyogenes¹³. We showed that some SNDs in the covS locus were significantly related to the phenotypes. Sumby et al. suggested that systemic invasive infections are caused by the overexpression of virulence factors through the inactivation of the transcriptional regulator CovR, as a result of the introduction of a depletive mutation in the covS locus after infection^11,12. Several mutation sites causing such overexpression have been reported, such as the insertion at the 877th nucleotide¹¹. In line with a previous Japanese study¹⁴, the shortening of CovS that potentially occurred due to frameshifts caused by SNDs was found to be significantly associated with invasive infections in our study, which could contribute to the upregulation of virulence factors and invasiveness.

In contrast, several factors associated with invasiveness were found to be independent of covR/S mutations. Based on conformational predictions, we selected SNPs with non-synonymous substitutions that were likely to affect protein function. Our transcriptomic analysis suggests that the Japan-specific fhuB mutation contributes to the growth rate of S. pyogenes in human blood by adapting to the environment. Additionally, emm89 clade 3 carries the identical promoter region pattern of the nga operon as emm1 strains, and the pattern conferred similarly high expression of nga and slo^7,8,20. Our RNA-seq data showed that a severe invasive strain without covS mutations increased the expression of speB, nga, slo, and genes in the sag operon in the blood. Although covR/S mutations downregulate the expression of SpeB and SLS, SpeB and SLS act as virulence factors, allowing S. pyogenes to invade host tissues^25-27. These data suggest that severe invasive infections have multiple gene expression profiles in addition to covR/S mutation-induced profiles within a single lineage, emm89, and that the synergy between optimizing bacterial survival in human blood and upregulating multiple virulence factors contributes to the development of severe invasive infections.

We propose two possible roles of the FhuB V73A mutation in the pathogenesis of severe invasive infections. First, mutation of FhuB could increase bacterial susceptibility to free radicals generated in the presence of ferric ions in the blood. Catalyzed by ferric ions, the Fenton and Haber–Weiss reactions generate hydroxyl radicals in erythrocytes²⁸. Previously, we provided evidence that the iron in erythrocytes partially inhibits pneumococcal growth via a free radical-based mechanism⁷. Although no significant difference was observed in the intracellular ferric ion concentration in our study, the structural changes in FhuB may have caused an increased generation of free radicals in bacterial cells and the prevention of bacterial survival in human blood.

Second, the FhuB mutation could affect bacterial transcriptional profiles and result in reduced fitness for survival in human blood. We also observed a lower survival rate of the mutant strain with PMNs. Lower emm transcription in the mutant may lead to attenuated immune evasion. The M protein binds C4-binding protein and factor H and inactivates the deposited C4b and C3b, leading to limited surface opsonization²⁹. Furthermore, complement itself could have inhibited the proliferation of the mutant strain, as suggested by the activation of the membrane attack complex (Fig. 7D–7E). Although the mechanisms by which the FhuB V73A mutation affects the interaction with each blood component must be further elucidated, our results indicate that the 73rd residue of FhuB is a key factor for bacterial survival in human blood during systemic infection.

Our genes-based GWAS showed no significant correlation with the distribution of major virulence factors. This finding minimizes the possibility that invasive infections are the result of the acquisition of virulence factors by non-invasive strains and indirectly supports the hypothesis that changes in gene expression profiles cause invasive infections. Given the possibility that the pathogen has multiple gene expression patterns, even within a single lineage, it may be difficult to create a universal vaccine with a single antigen, and a vaccine containing multiple antigens may be effective.

Our study had two limitations. First, because we sequenced bacterial genomes using short-read sequencing, we could not detect large bacterial genome rearrangements by comparing complete genome sequences. Thus, it is not possible to investigate the effects of long genomic structural dynamics on pathogenesis. Second, the clinical information associated with the isolates was limited; therefore, our analyses did not reflect host information, such as age, sex, underlying health conditions, and genome. A combined GWAS of the host and pathogen in S. pyogenes infection would highlight the relationship between host risk factors and bacterial genetic variants.

In this study, we revealed genotype-phenotype associations not only for known factors, represented by covS, but also for previously unreported factors related to invasiveness, including fhuB. Moreover, we experimentally validated the contribution of the fhuB mutation to bacterial survival in human blood. This study demonstrates the potential of our genomic statistical approach for elucidating the pathogenesis of invasive infections. Further analyses of the invasiveness-related factors identified in this study should provide a platform for establishing novel treatments and preventive strategies against invasive infections.

1. Clinical isolates in Japan

Clinical isolates were collected from public health institutions in Tokyo, Osaka, Yamaguchi, Fukushima, Kobe, Kyoto, Amagasaki, Sapporo, and Niigata, Japan. We defined the strains collected as STSS according to the Infectious Diseases Control Law in Japan. Non-invasive strains were defined based on diagnostic names, including asymptomatic, pharyngitis, tonsillitis, or non-invasive infections. Strains with no diagnostic names for the non-STSS strains were defined as non-invasive based on the isolate sites. Information on all the strains included in this study is presented in Extended Data Table 3.

The collected S. pyogenes strains were cultured at 37°C in an atmosphere containing 5% CO₂, in Todd Hewitt broth supplemented with 0.2% yeast extract (THY; both from BD Biosciences, Franklin Lakes, NJ, USA) and stored in THY broth with 30% glycerol (Nacalai Tesque, Kyoto, Japan), at –80°C.

2. Genomic DNA sequencing of the clinical isolates

The S. pyogenes strains were cultured until the exponential growth phase (OD₆₀₀=0.3–0.4). The bacterial cells were lysed with T10E1N100 buffer (10 mM Tris-HCl buffer, 100 mM sodium chloride, and 1 mM EDTA), 10 units/mL mutanolysin (Sigma, St. Louis, MO, USA), 10 mg/mL lysozyme (Fujifilm Wako Pure Chemical Corporation, Osaka, Japan), 0.5 mg/mL achromopeptidase (Fujifilm Wako Pure Chemical Corporation), and 0.3 mg/mL RNase A (Promega, Madison, WI, USA). Next, genomic DNA was extracted from each lysate using a Maxwell® RSC instrument (Promega), according to the manufacturer’s instructions. Subsequently, 250 bp paired-end libraries were generated from the extracted DNA using a Nextera XT DNA Kit (Illumina, San Diego, CA, USA). Libraries were sequenced using a NovaSeq 6000 system (Illumina) at the Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Osaka, Japan. On average, the number of reads was 5,433,301 (range 3,437,124–9,117,301).

3. Collection of published genome sequences

We previously sequenced the draft genomes of 161 emm89 clinical isolates collected in Japan between 2011 and 2019¹⁰. We defined strains derived from STSS as ‘severe invasive’, and those obtained from pharyngitis, tonsillitis, and superficial skin lesions as ‘non-invasive’ phenotypes.

To obtain public genome sequences of emm89 strains isolated from other countries, we downloaded sequence data in FASTQ format from the National Center for Biotechnology Information (NCBI) database, using Fasterq-dump v.2.9.6. The phenotype of each strain, whether invasive or non-invasive, was defined according to the definitions in the respective references that reported the strains^17-19.

4. Genomic data processing and pan-genome analysis

All processes and analyses were performed using the National Institute of Genetics (NIG) supercomputer and SQUID at the Cybermedia Center of Osaka University (Osaka, Japan). We constructed a workflow for bacterial GWAS and other bioinformatic processes (Supplementary Fig. 5). All collected sequences were subjected to quality checks using Fastp v.0.20.1³⁰. For newly isolated strains in Japan, emm typing was performed using the emm-typing-tool v.0.0.1, and only sequences of strains determined as emm89 were used for the following analyses³¹. All emm89 sequence data were then subjected to de novo assembly using SKESA v.2.4.0, with default parameters³². Next, multilocus sequence typing (MLST) of each sequence was performed using MLST v.2.19.0^33,34. Clade typing was performed using BLAST v.2.13.0, with reference to the three nga promoter region sequences⁹. After the genes were annotated with Prokka v.1.14.5, the pan-genome of all sequences was calculated using Roary v.3.12.0, with the parameters ‘-e --mafft -r -qc -cd 99’^35,36. Roary generated a core gene alignment and the distribution of all genes among the strains. To extract SNPs/indels from core genes, including single-nucleotide indels, snp-sites v.2.5.1 with the option ‘-v’ was used. The output files were further processed using bcftools v.1.9, with the parameter ‘norm -m -’, so that multiallelic sites could also be handled in the GWAS analysis. In parallel, k-mers were extracted using DBGWAS v.0.5.3, and the length of k-mers was set as 31 nt, with the parameter of DBGWAS ‘-k 31’³⁷.
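The core-gene criterion used in the pan-genome step (genes carried by 99% of isolates, corresponding to the ‘-cd 99’ setting) can be sketched as follows. This is a simplified illustration rather than Roary's implementation: the function name, the presence/absence data, and the inclusive treatment of the 99% boundary are assumptions.

```python
def core_genes(presence, n_isolates, min_percent=99):
    """Return genes carried by at least min_percent % of isolates.

    presence maps a gene name to the set of isolates carrying that gene.
    Integer arithmetic avoids floating-point issues at the threshold.
    """
    return sorted(
        gene for gene, isolates in presence.items()
        if len(isolates) * 100 >= min_percent * n_isolates
    )


# Hypothetical toy pan-genome of 200 isolates.
isolates = [f"isolate_{i}" for i in range(200)]
toy_presence = {
    "geneA": set(isolates),        # 200/200 isolates: core
    "geneB": set(isolates[:198]),  # 198/200 = 99%: core
    "geneC": set(isolates[:197]),  # 197/200 = 98.5%: accessory
}

core = core_genes(toy_presence, len(isolates))
accessory = sorted(set(toy_presence) - set(core))
```

In this toy data set, geneA and geneB are classified as core genes and geneC as an accessory gene; accessory genes are the ones tested in the genes-based GWAS.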

5. Phylogenetic analysis

Phylogenetic relationships were calculated from the core gene alignment, using IQ-TREE v.1.16.12³⁸, based on maximum likelihood. The substitution model was automatically selected considering the Akaike and Bayesian information criteria, by setting the IQ-TREE parameter ‘-m MFP’³⁹. Phylogenetic trees were visualized using iTOL v.6.6⁴⁰. The similarity of clustering in the two phylogenetic trees was statistically examined using Pearson’s chi-square test in R v.4.0.3⁴¹; if p<0.05, this was followed by post-hoc residual analysis with Holm’s adjustment.
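
The post-hoc step can be sketched in Python as follows. This is a minimal illustration, not the R code used in the study: it computes adjusted standardized residuals for a contingency table, converts them to two-sided p-values under a normal approximation, and applies Holm's step-down adjustment. The function names and the example table are hypothetical, and the sketch assumes that every row and column total is positive.

```python
import math
from typing import List


def adjusted_residuals(table: List[List[float]]) -> List[List[float]]:
    """Adjusted standardized residuals of an r x c contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            variance = expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n)
            out_row.append((observed - expected) / math.sqrt(variance))
        result.append(out_row)
    return result


def two_sided_normal_p(z: float) -> float:
    """Two-sided p-value of a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2))


def holm_adjust(pvalues: List[float]) -> List[float]:
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


# Hypothetical 2 x 2 table of cluster membership counts.
residuals = adjusted_residuals([[10, 20], [20, 10]])
raw_p = [two_sided_normal_p(z) for row in residuals for z in row]
print(holm_adjust(raw_p))
```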

6. GWAS

To investigate the associations between phenotypes and genotypes, including SNPs/indels and genes, we performed a GWAS using pyseer v.1.3.4⁴². The VCF file of SNPs/indels or the gene distribution matrix was designated as the genotype. To remove biases derived from lineages, we added information on the genetic distances between all pairs of strains as covariates using mash v.2.3⁴³. Briefly, de novo assemblies were compressed by converting them into minimum hash values using the command ‘mash sketch -s 10000’, following which the commands ‘mash dist’ and ‘square_mash’ provided us with the genetic distance matrix expressed with Jaccard coefficients⁴³. The obtained matrix underwent eigenvalue decomposition, and from a plot of the eigenvalues against their contribution ratios, we visually determined the number of eigenvalues to use for multidimensional scaling. We then supplied the number of eigenvalues and the distance matrix to pyseer as parameters. The pyseer calculation was iterated 1,000 times with randomized phenotypes, and the 5th percentile of the minimal p-values from these iterations was set as the significance threshold. Using R and the package ggplot2, the results were plotted as a Manhattan plot for the SNPs/indels-based GWAS and as a volcano plot for the genes-based GWAS⁴⁴. Heatmaps of the strains possessing significant variants were generated using Excel (v.16.66.1; Microsoft, Redmond, WA, USA).
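
The permutation-based significance threshold described above can be sketched as follows. This is an illustrative Python outline, not the pyseer workflow itself: the association test is passed in as a caller-supplied function (`assoc_pvalue`, hypothetical), phenotypes are shuffled with a seeded random generator, and the threshold is the 5th percentile of the per-permutation minimum p-values. The linear-interpolation quantile is an assumption; other quantile definitions would give slightly different values.

```python
import random
from typing import Callable, List, Sequence


def empirical_quantile(values: Sequence[float], q: float) -> float:
    """q-quantile (0 <= q <= 1) with linear interpolation between order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def permutation_threshold(
    variants: Sequence[Sequence[int]],
    phenotypes: Sequence[int],
    assoc_pvalue: Callable[[Sequence[int], Sequence[int]], float],
    n_perm: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> float:
    """Threshold = alpha-quantile of the minimum p-value over permuted phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    min_pvalues: List[float] = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real genotype-phenotype link
        min_pvalues.append(min(assoc_pvalue(v, shuffled) for v in variants))
    return empirical_quantile(min_pvalues, alpha)
```

In practice, `assoc_pvalue` would be replaced by the association model actually used, including the population-structure covariates; the sketch only shows how the threshold is derived from the permuted runs.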

The k-mers-based GWAS was carried out using DBGWAS³⁷. K-mers were considered significant at a false discovery rate (q-value) of <0.05. For the DBGWAS-calculated complexes (regions encompassing the k-mers) that were significantly related to pathology, we generated de Bruijn graphs. The sequences of the k-mers were also obtained as output and mapped onto a reference sequence using Geneious Prime v.2022.0.1 (Biomatters, Auckland, New Zealand), to identify the mutations indicated by the k-mers. As the reference strain, we adopted MGAS27061, which was isolated from an invasive case in the USA and whose complete chromosomal sequence has been used as the reference sequence of emm89 clade 3⁴⁵.
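
As an illustration of filtering at a false discovery rate of <0.05, the following Python sketch converts raw p-values into q-values. It assumes a Benjamini-Hochberg adjustment; the exact procedure applied inside DBGWAS is not restated here and may differ in detail. The function name and the example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values for four k-mers.
pvals = [0.01, 0.04, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.05]
print(qvals, significant)
```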

7. Protein structure prediction

Significant variants found in the GWAS that caused non-synonymous substitutions in proteins were identified by converting nucleotide sequences into amino acid sequences using EMBOSS Transeq v.6.6.0.0⁴⁶. To assess whether these mutations affected protein function, protein structure prediction models were constructed using AlphaFold v.2.2.2²¹. The calculations were performed five times for each model, by setting the option multimer_predictions_per_model=5. We predicted multimer models using the option --model_preset=multimer if a protein was reported or anticipated to form a multimer. For each monomer, we selected the model with the best predicted local distance difference test (pLDDT; an indicator of local structural accuracy) score⁴⁷. For each multimer, the model confidence calculated by AlphaFold was expressed as a weighted combination of the interface-predicted TM and predicted TM scores (ipTM + pTM). pTM is a metric of overall topological accuracy, and ipTM measures the structural accuracy of the protein-protein interface⁴⁸. The transmembrane regions of the proteins were predicted using SOSUI (https://harrier.nagahama-i-bio.ac.jp/sosui/mobile/)⁴⁹. The structures of the obtained models were visualized using PyMOL v.2.5 (Schrödinger, LLC., New York, NY, USA).
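
The model-selection criteria can be sketched in Python as below. The weights of 0.8 for ipTM and 0.2 for pTM follow the ranking confidence reported for AlphaFold-Multimer and are not stated in the text above, so they should be treated as an assumption of this sketch; the function names, model names, and score values are invented for illustration.

```python
from typing import Dict, Tuple


def multimer_confidence(iptm: float, ptm: float, w_iptm: float = 0.8) -> float:
    """Weighted combination of ipTM and pTM (weights are an assumption here)."""
    return w_iptm * iptm + (1.0 - w_iptm) * ptm


def best_multimer(scores: Dict[str, Tuple[float, float]]) -> str:
    """Name of the multimer model with the highest combined confidence.

    `scores` maps a model name to its (ipTM, pTM) pair.
    """
    return max(scores, key=lambda name: multimer_confidence(*scores[name]))


def best_monomer(mean_plddt: Dict[str, float]) -> str:
    """Name of the monomer model with the highest mean pLDDT."""
    return max(mean_plddt, key=lambda name: mean_plddt[name])


# Invented scores for two hypothetical multimer models and two monomer models.
print(best_multimer({"model_1": (0.5, 0.9), "model_2": (0.7, 0.6)}))
print(best_monomer({"model_1": 85.2, "model_2": 91.0}))
```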

8. Construction of the fhuB T218C mutant strains

We used TK02', a derivative of S. pyogenes strain TK02 obtained after several passages, as the wild-type (WT) strain. The TK02 strain was originally isolated from a patient with severe invasive infection¹⁰. The whole genome of TK02' was sequenced. A point mutation, fhuB T218C, was introduced using the temperature-sensitive shuttle vector pSET4s, as reported previously^50-52. Sanger sequencing confirmed the presence of the point mutation. Additionally, we resequenced the draft genome of the mutant, as described above, to confirm that there were no differences apart from the point mutation. The generated FASTA files for the TK02' WT and mutant strains are available in Supplementary Data 1 and 2, respectively.

The bacterial strains, primers, and plasmids used in this study are listed in Supplementary Tables 10 and 11. Escherichia coli strain DH5α (Takara Bio, Shiga, Japan) was used as a host for the plasmid derivatives. All E. coli strains were cultured in Luria Bertani broth, at 37°C, with agitation. For selection and maintenance of mutants, antibiotics were added to the media at the following concentrations: carbenicillin (Nacalai Tesque), 100 µg/mL for E. coli; spectinomycin (Fujifilm Wako Pure Chemical Corporation), 100 µg/mL for E. coli and S. pyogenes.

9. Transcriptomic analysis

The fhuB WT and mutant strains were harvested in 30 mL of THY broth, of which 1 mL was dispensed into 10 mL of THY and the remainder was centrifuged and resuspended in 2 mL of heparinized human blood. Bacterial mixtures with THY or blood were dispensed into three aliquots and incubated at 37°C for 3 h. THY samples were centrifuged and resuspended in RNA Shield (Zymo Research, Irvine, CA, USA). For each blood sample, 2 volumes (1 mL) of RNAprotect Bacteria Reagent (Qiagen, Venlo, The Netherlands) were added. L5, included in the PureLink™ Total RNA Blood Purification Kit (Thermo Fisher Scientific, Waltham, MA, USA), was added to remove erythrocytes. The bacterial cell wall was mechanically lysed in Lysing Matrix B using a MagNA Lyser (Roche, Basel, Switzerland). After centrifugation, the total bacterial RNA was extracted using a Quick-RNA™ Miniprep Kit (Zymo Research), according to the manufacturer’s instructions. Full-length cDNA was generated using the SMART-Seq® HT Kit (Takara Bio), according to the manufacturer’s instructions. Paired-end libraries were generated using a Nextera XT DNA Kit and sequenced using a NovaSeq 6000 system (both from Illumina, San Diego, CA, USA). Sequenced data were preprocessed using Trimmomatic v.0.33 and FastQC v.0.12.1. The reads were mapped to the complete MGAS27061 genome (NCBI reference sequence: NZ_CP013840.1) using STAR v.2.7.0a. After a second quality check using FastQC, read counting was performed using featureCounts v.1.5.2⁵³. Differentially expressed genes were identified using iDEP v.0.96, and gene annotations from NCBI and Prokka were combined⁵⁴. Plots were created using iDEP and the R package ggplot2.

10. Intracellular ferric ion assay

Human plasma was obtained by centrifugation of heparinized human blood, after 30 min of incubation at 37°C. The WT and fhuB mutant strains were harvested at the exponential phase, resuspended in 1 mL of THY or serum, and then incubated at 37°C for 3 h. Viable bacterial cells were counted as colony-forming units (CFUs) by plating the diluted samples onto THY agar plates. Intracellular ferric ions were measured using a QuantiChrom™ Iron Assay Kit (BioAssay Systems, Hayward, CA, USA), according to the manufacturer’s instructions. Briefly, 50 µL of standards or samples in 96-well plates were mixed with 200 µL of QuantiChrom™ Working Reagent and incubated at room temperature for 1 h. The optical density at a wavelength of 600 nm was measured using an Infinite® 200 Pro F Plex Instrument (TECAN, Männedorf, Switzerland). The assay was repeated three times and the results of the respective experiments were combined. Statistical analyses were performed using the Mann–Whitney U test.
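
For readers unfamiliar with the Mann–Whitney U test, the following Python sketch computes the U statistic and an exact two-sided p-value by enumerating every possible split of the pooled measurements. It is a minimal illustration for small samples, not the procedure implemented in the statistics software used in the study; ties are counted as 0.5, and the function names and example values are hypothetical.

```python
from itertools import combinations
from typing import Sequence, Tuple


def u_statistic(x: Sequence[float], y: Sequence[float]) -> float:
    """Number of (x, y) pairs with x > y, counting ties as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def mann_whitney_exact(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U for x, exact two-sided p-value) by full enumeration.

    Only practical for small groups: the number of splits grows as C(n+m, n).
    """
    pooled = list(x) + list(y)
    n, total = len(x), len(x) + len(y)
    center = n * len(y) / 2  # expected U under the null hypothesis
    observed = u_statistic(x, y)
    observed_dev = abs(observed - center)
    extreme = 0
    count = 0
    for idx in combinations(range(total), n):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(total) if i not in chosen]
        if abs(u_statistic(group_x, group_y) - center) >= observed_dev:
            extreme += 1
        count += 1
    return observed, extreme / count


# Hypothetical measurements from two groups of three replicates.
print(mann_whitney_exact([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

With three replicates per group, the smallest attainable two-sided p-value is 2/20 = 0.1, which is one reason results from repeated experiments are often pooled before testing.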

11. Bactericidal assay

The bactericidal assay was performed as described previously, with minor modifications^55-58. Briefly, whole blood was collected from healthy adults. Human neutrophils and erythrocytes were prepared using PolymorphPrep™ (Serumwerk Bernburg, Bernburg, Germany), according to the manufacturer’s instructions. Heparinized human blood was centrifuged at 500 × g for 30 min, to isolate erythrocytes and polymorphonuclear cells, which were then suspended in Roswell Park Memorial Institute (RPMI)-1640 medium containing L-glutamine and Phenol Red (Fujifilm Wako Pure Chemical Corporation). Heat-inactivated plasma was prepared by heating at 56°C for 30 min. Following that, 195 µL of heparinized human whole blood, erythrocytes in RPMI-1640, polymorphonuclear leukocytes in RPMI-1640, plasma, heat-inactivated plasma, or brain heart infusion broth (BD Biosciences), and 5 µL of early exponential phase bacteria with 0.9–2.0×10⁴ CFUs/well were mixed in 96-well plates and incubated at 37°C, in an atmosphere containing 5% CO₂, for 1, 2, and 3 h. Viable bacterial cells were counted as CFUs by plating the diluted samples onto THY agar plates. The growth index was calculated as the number of CFUs at a specified time point divided by the number of CFUs in the initial inoculum. The assay was repeated three times and the results of the respective experiments were combined. Statistical analyses were performed using the Mann–Whitney U test in Prism v.7.0c (GraphPad, La Jolla, CA, USA), and differences were considered statistically significant at p<0.05.
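
The growth index defined above is a simple ratio, sketched here in Python for one well sampled at several time points. The function names and the CFU counts are invented for illustration only.

```python
from typing import Dict


def growth_index(cfu_at_time: float, cfu_inoculum: float) -> float:
    """CFUs at a given time point divided by CFUs in the initial inoculum."""
    if cfu_inoculum <= 0:
        raise ValueError("the inoculum CFU count must be positive")
    return cfu_at_time / cfu_inoculum


def growth_curve(cfu_by_hour: Dict[int, float], cfu_inoculum: float) -> Dict[int, float]:
    """Growth index at each sampled hour, ordered by time."""
    return {hour: growth_index(cfu, cfu_inoculum)
            for hour, cfu in sorted(cfu_by_hour.items())}


# Invented counts for a single well sampled at 1, 2, and 3 h.
print(growth_curve({1: 2.0e4, 2: 4.0e4, 3: 1.0e5}, cfu_inoculum=1.0e4))
```

A growth index above 1 indicates net growth relative to the inoculum, and a value below 1 indicates net killing.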

12. Ethics statement

Studies involving human participants were reviewed and approved by the Institutional Review Board of Osaka University Graduate School of Dentistry (approval nos. H26-E43, H29-E16-2, and R5-E23). The donors provided written informed consent to participate in the human blood bactericidal assay. With respect to the S. pyogenes collection, since we retrospectively obtained clinical isolates of S. pyogenes, we utilized an opt-out consent procedure instead of obtaining written informed consent from the patients.

13. Data availability

Data for the 207 sequenced S. pyogenes genomes were deposited in the DDBJ Sequence Read Archive under BioProject PRJDB16457 (DRR run numbers DRR511668–DRR511874).

Acknowledgments

We would like to thank the NGS core facility of the Genome Information Research Center at the Research Institute for Microbial Diseases of Osaka University for their support in the DNA sequencing and data analysis and the Bioinformatic Research Unit of Osaka University Graduate School of Dentistry for their support in the bioinformatics analysis. This study was partially performed on the NIG supercomputer at the Research Organization of Information and Systems National Institute of Genetics. This study was partly achieved using SQUID at the Cybermedia Center, Osaka University, under the “Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN)” in Japan (Project ID: EX22701 and jh230035). Masayuki Ono and Kotaro Higashi were recipients of the Iwadare Scholarship from the Iwadare Scholarship Foundation. We wish to express our gratitude to Mami Tateshita (Sapporo City Institute of Public Health) as well as the medical institutions that participated in the collection of clinical strains.

Funding

This study was partly supported by AMED (JP19fk0108044, JP22fk0108130, and 22wm0325001), the Japan Society for the Promotion of Science KAKENHI (grant numbers 20KK0210, 22H03262, 22K19618, 22K19619, 23H03073, and 23K19687), the Takeda Science Foundation, Naito Foundation, and Joint Research Program of the Research Center for GLOBAL and LOCAL Infectious Diseases, Oita University (2022B05). This study was conducted as part of ‘The Nippon Foundation - Osaka University Project for Infectious Disease Prevention’. This study was supported by JST SPRING (grant number JPMJSP2138). The funders had no role in the study design, data collection or analysis, decision to publish, or preparation of the manuscript.

Competing interests

The authors declare no competing interests.

Author contributions

M.Y., M.O., and S.K. designed the study. M.O., M.Y., and K.H. designed and performed bioinformatics analysis. D.M. performed the next-generation sequencing. M.O. performed the in vitro and in vivo experiments. M.Y., Y.H., T.S., and S.K. contributed to the setup of the experiments. R.O., T.Y., R.K., H.O., N.N., Y.K., C.N., R.Y., H.S., and Y.M. collected clinical strains. M.O. and M.Y. wrote the original manuscript. All authors reviewed and edited the manuscript.

Craik, N. et al. Global Disease Burden of Streptococcus pyogenes. in Streptococcus pyogenes: Basic Biology to Clinical Manifestations (eds. Ferretti, J.J., Stevens, D.L. & Fischetti, V.A.) (Oklahoma City (OK), 2022).
Brouwer, S. et al. Pathogenesis, epidemiology and control of Group A Streptococcus infection. Nat. Rev. Microbiol. 21, 431–447 (2023).
Stevens, D.L. & Bryant, A.E. Severe Group A Streptococcal Infections. in Streptococcus pyogenes: Basic Biology to Clinical Manifestations (eds. Ferretti, J.J., Stevens, D.L. & Fischetti, V.A.) (Oklahoma City (OK), 2016).
Walker, M.J. et al. Disease manifestations and pathogenic mechanisms of Group A Streptococcus. Clin. Microbiol. Rev. 27, 264–301 (2014).
Yamaguchi, M. et al. Streptococcus pneumoniae invades erythrocytes and utilizes them to evade human innate immunity. PLOS ONE 8, e77282 (2013).
Ikebe, T. et al. Molecular characterization and antimicrobial resistance of group A streptococcus isolates in streptococcal toxic shock syndrome cases in Japan from 2013 to 2018. Int. J. Med. Microbiol. 311, 151496 (2021).
Turner, C.E. et al. Emergence of a new highly successful acapsular group a Streptococcus clade of genotype emm89 in the United Kingdom. mBio 6, e00622 (2015).
Zhu, L. et al. A molecular trigger for intercontinental epidemics of group A Streptococcus. J. Clin. Invest. 125, 3545–3559 (2015).
Zhu, L., Olsen, R.J., Nasser, W., De La Riva Morales, I. & Musser, J.M. Trading Capsule for Increased Cytotoxin Production: Contribution to Virulence of a Newly Emerged Clade of emm89 Streptococcus pyogenes. mBio 6, e01378-15 (2015).
Hirose, Y. et al. Genetic Characterization of Streptococcus pyogenes emm89 Strains Isolated in Japan From 2011 to 2019. Infect. Microbes Dis. 2, 160–166 (2020).
Walker, M.J. et al. DNase Sda1 provides selection pressure for a switch to invasive group A streptococcal infection. Nat. Med. 13, 981–985 (2007).
Sumby, P., Whitney, A.R., Graviss, E.A., DeLeo, F.R. & Musser, J.M. Genome-wide analysis of group a streptococci reveals a mutation that modulates global phenotype and disease specificity. PLoS Pathog 2, e5 (2006).
Cole, J.N., Barnett, T.C., Nizet, V. & Walker, M.J. Molecular insight into invasive group A streptococcal disease. Nat. Rev. Microbiol. 9, 724–36 (2011).
Ikebe, T., Ato, M., Matsumura, T., Hasegawa, H. & Sata, T. Highly Frequent Mutations in Negative Regulators of Multiple Virulence Genes in Group A Streptococcal Toxic Shock Syndrome Isolates. PLOS Pathog. 6, e1000832 (2010).
Ministry of Health, Labour and Welfare. Severe Invasive Group A Streptococcal Infections.
Katsukawa, C., Tamaru, A., Morikawa, Y. & Oda, K. M protein gene (emm) typing of Streptococcus pyogenes. Kansenshogaku Zasshi 76, 238–245 (2002).
Beres, S.B. et al. Genome sequence analysis of emm89 Streptococcus pyogenes strains causing infections in Scotland, 2010–2016. J. Med. Microbiol. 66, 1765–1773 (2017).
Chochua, S. et al. Population and Whole Genome Sequence Based Characterization of Invasive Group A Streptococci Recovered in the United States during 2015. mBio 8, e01422-17 (2017).
Davies, M.R. et al. Atlas of group A streptococcal vaccine candidates compiled using large-scale comparative genomics. Nat. Genet. 51, 1035–1043 (2019).
Turner, C.E. et al. The Emergence of Successful Streptococcus pyogenes Lineages through Convergent Pathways of Capsule Loss and Recombination Directing High Toxin Expression. mBio 10, 1–20 (2019).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Jain, I., Danger, J.L., Burgess, C., Uppal, T. & Sumby, P. The group A Streptococcus accessory protein RocA: regulatory activity, interacting partners and influence on disease potential. Mol. Microbiol. 113, 190–207 (2020).
Ribardo, D.A. & McIver, K.S. Defining the Mga regulon: Comparative transcriptome analysis reveals both direct and indirect regulation by Mga in the group A streptococcus. Mol. Microbiol. 62, 491–508 (2006).
Kachroo, P. et al. Integrated analysis of population genomics, transcriptomics and virulence provides novel insights into Streptococcus pyogenes pathogenesis. Nat. Genet. 51, 548–559 (2019).
Terao, Y. et al. Group A streptococcal cysteine protease degrades C3 (C3b) and contributes to evasion of innate immunity. J. Biol. Chem. 283, 6253–6260 (2008).
Sumitomo, T. et al. Streptolysin S contributes to group A streptococcal translocation across an epithelial barrier. J. Biol. Chem. 286, 2750–2761 (2011).
Honda-Ogawa, M. et al. Cysteine proteinase from Streptococcus pyogenes enables evasion of innate immunity via degradation of complement factors. J. Biol. Chem. 288, 15854–15864 (2013).
Cimen, M.Y. Free radical metabolism in human erythrocytes. Clin. Chim. Acta. 390, 1–11 (2008).
Laabei, M. & Ermert, D. Catch Me if You Can: Streptococcus pyogenes Complement Evasion Strategies. J. Innate. Immun. 11, 3–12 (2019).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Kapatai, G., Coelho, J., Platt, S. & Chalker, V.J. Whole genome sequencing of group A Streptococcus: development and evaluation of an automated pipeline for emm gene typing. PeerJ 5, e3226 (2017).
Souvorov, A., Agarwala, R. & Lipman, D.J. SKESA: strategic k-mer extension for scrupulous assemblies. Genome Biol. 19, 153 (2018).
Jolley, K.A. & Maiden, M.C. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics 11, 595 (2010).
Maiden, M.C.J. et al. MLST revisited: the gene-by-gene approach to bacterial genomics. Nat. Rev. Microbiol. 11, 728–736 (2013).
Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
Page, A.J. et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31, 3691–3693 (2015).
Jaillard, M. et al. A fast and agnostic method for bacterial genome-wide association studies: Bridging the gap between k-mers and genetic events. PLOS Genet. 14, e1007758 (2018).
Nguyen, L.T., Schmidt, H.A., von Haeseler, A. & Minh, B.Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
Kalyaanamoorthy, S., Minh, B.Q., Wong, T.K.F., von Haeseler, A. & Jermiin, L.S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).
R Core Team. R: A language and environment for statistical computing. (R Foundation for Statistical Computing, Vienna, Austria, 2022).
Lees, J.A., Galardini, M., Bentley, S.D., Weiser, J.N. & Corander, J. pyseer: A comprehensive tool for microbial pangenome-wide association studies. Bioinformatics 34, 4310–4312 (2018).
Ondov, B.D. et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132 (2016).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis, (Springer-Verlag New York, 2016).
Beres, S.B. et al. Transcriptome remodeling contributes to epidemic disease caused by the human pathogen Streptococcus pyogenes. mBio 7, 1–14 (2016).
Rice, P., Longden, I. & Bleasby, A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).
Mariani, V., Biasini, M., Barbato, A. & Schwede, T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29, 2722–2728 (2013).
Yin, R., Feng, B.Y., Varshney, A. & Pierce, B.G. Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants. Protein Sci. 31, e4379 (2022).
Hirokawa, T., Boon-Chieng, S. & Mitaku, S. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 14, 378–379 (1998).
Takamatsu, D., Osaki, M. & Sekizaki, T. Thermosensitive suicide vectors for gene replacement in Streptococcus suis. Plasmid 46, 140–148 (2001).
Nakata, M. et al. Assembly mechanism of FCT region type 1 pili in serotype M6 Streptococcus pyogenes. J. Biol. Chem. 286, 37566–37577 (2011).
Yamaguchi, M. et al. Evolutionary inactivation of a sialidase in group B Streptococcus. Sci. Rep. 6, 28852 (2016).
Liao, Y., Smyth, G.K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Ge, S.X., Son, E.W. & Yao, R. iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data. BMC Bioinformatics 19, 534 (2018).
Terao, Y. et al. Group A streptococcal cysteine protease degrades C3 (C3b) and contributes to evasion of innate immunity. J. Biol. Chem. 283, 6253–6260 (2008).
Lancefield, R.C. Differentiation of group A streptococci with a common R antigen into three serological types, with special reference to the bactericidal test. J. Exp. Med. 106, 525–44 (1957).
Yamaguchi, M. et al. Identification of evolutionarily conserved virulence factor by selective pressure analysis of Streptococcus pneumoniae. Commun. Biol. 2, 96 (2019).
Takemura, M. et al. Pneumococcal BgaA Promotes Host Organ Bleeding and Coagulation in a Mouse Sepsis Model. Front. Cell. Infect. Microbiol. 12, 844000 (2022).

SupplementaryData1.txt
Supplementary Data 1
SupplementaryTables.xlsx
Supplementary Tables
Supplementaryinformation.pdf
Supplementary information
ExtendedDataFig1.pdf
ExtendedDataFig2.pdf
ExtendedDataFig3.pdf
ExtendedDataFig4.pdf
ExtendedDataTable1.pdf
ExtendedDataTable2.pdf
ExtendedDataTable3.pdf
ExtendedDataTable4.pdf
ExtendedDataTable5.pdf
