Strain, isolation, and serotyping. A fecal sample from a patient with diarrhea was plated on MacConkey agar, either directly or after enrichment in trypticase soy broth containing vancomycin (Sigma Chemicals Co., St. Louis, MO). Candidate colonies were then plated on trypticase soy agar and biochemically characterized using the API20E system (bioMérieux, Marcy l’Etoile, France). For O-antigen determination, we used the method described by Guinee et al., using all available O (O1 to O181) antisera. Antisera were absorbed with the corresponding cross-reacting antigens to remove non-specific agglutinins. The O antisera were produced at the Laboratorio de Referencia de E. coli (Lugo, Spain [http://www.lugo.usc.es/ecoli]). This research was approved by the Research Ethics Committee of the Korea Centers for Disease Control and Prevention, and written informed consent was obtained from the patient. The isolated strain was deposited at the National Culture Collection for Pathogens (NCCP) at the Korea National Institute of Health under the accession number NCCP14539. EHEC O157:H7 str. EDL933 was used as the reference strain. In addition, we included previously published EHEC strains isolated in Korea (NCCP15736, NCCP15737, NCCP15738, and NCCP15739) for comparative genomic analysis (Table 1).
Library preparation and whole-genome sequencing. The genomic DNA of NCCP14539 was extracted from a culture of candidate colonies using a Wizard Genomic DNA purification kit (Promega, USA) according to the manufacturer’s instructions. Genomic DNA yield, purity, and concentration were evaluated using 0.8% agarose gel electrophoresis and a Nanodrop 1000 spectrophotometer (Thermo Fisher Scientific, USA). Three kinds of libraries were constructed for three platforms (DNALink, Seoul, Republic of Korea). First, a mate-paired library with 1 to 10 kb insert sizes was constructed for the IonTorrent platform. Second, PacBio RS II libraries with 8 to 20 kb insert sizes were constructed and sequenced. Third, a fosmid library with a 30 kb average insert size (CopyControl Fosmid library production kit, Epicentre, Madison, WI) was constructed and used as a template for the construction of the physical map. In total, 325,481,384 raw reads (39,050,715,080 bp) were produced with IonTorrent and 52,853 raw reads (482,609,715 bp) with PacBio RS II.
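As a quick sanity check on the sequencing yield, the mean read length and raw fold coverage can be derived from the read and base counts given above together with the 5,315,402 bp chromosome length reported for the final assembly. The following is a minimal sketch only; the function and constant names are illustrative and not part of the original pipeline, and the values are raw depths before any quality filtering.

```python
# Read and base counts as reported in the text (raw, unfiltered).
ION_READS, ION_BASES = 325_481_384, 39_050_715_080
PACBIO_READS, PACBIO_BASES = 52_853, 482_609_715
# Chromosome length reported for the final NCCP14539 assembly.
GENOME_SIZE_BP = 5_315_402


def mean_read_length(total_bases: int, n_reads: int) -> float:
    """Average read length in bp."""
    return total_bases / n_reads


def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Raw sequencing depth: total sequenced bases per genome base."""
    return total_bases / genome_size


if __name__ == "__main__":
    platforms = [
        ("IonTorrent", ION_READS, ION_BASES),
        ("PacBio RS II", PACBIO_READS, PACBIO_BASES),
    ]
    for name, reads, bases in platforms:
        print(
            f"{name}: mean read length "
            f"{mean_read_length(bases, reads):.1f} bp, "
            f"raw coverage {fold_coverage(bases, GENOME_SIZE_BP):.1f}x"
        )
```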
Genome assembly and annotation. A hybrid assembly was generated using the SPAdes assembler (version 3.1) and confirmed with a fosmid end sequence (FES) map. The sequence gaps between contigs were filled with Sanger sequencing data after PCR amplification. The final assembly was corrected with proovread (version 2.12), as the number of frameshifted genes was greater than 5% of the called genes. The genome was determined to be composed of a 5,315,402 bp circular chromosome. To predict and annotate ORFs, we used the RAST (version 4.0) server pipeline. To identify the virulence factor genes in NCCP14539, we performed a Basic Local Alignment Search Tool (BLAST) search of NCCP14539 ORFs against the virulence factor genes listed in VFDB, with an e-value cutoff of 1e-5.
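The e-value filtering step can be illustrated with a short sketch. It assumes the BLAST results were exported in the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the text does not state which BLAST program or output format was used, whether the cutoff was inclusive, or how multiple hits per ORF were resolved, so keeping the highest-bitscore hit is an assumption made here for illustration.

```python
from typing import Dict, Iterable, Iterator, NamedTuple

EVALUE_CUTOFF = 1e-5  # cutoff stated in the text


class Hit(NamedTuple):
    query: str      # ORF identifier
    subject: str    # VFDB entry identifier
    pident: float   # percent identity
    evalue: float
    bitscore: float


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column tab-separated BLAST output, skipping comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Hit(f[0], f[1], float(f[2]), float(f[10]), float(f[11]))


def best_hits(lines: Iterable[str],
              cutoff: float = EVALUE_CUTOFF) -> Dict[str, Hit]:
    """Keep, per ORF, the highest-bitscore hit with e-value <= cutoff.

    Assumption: one best hit per ORF is wanted; the source does not say.
    """
    best: Dict[str, Hit] = {}
    for hit in parse_hits(lines):
        if hit.evalue > cutoff:
            continue
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best
```

In practice the `lines` argument would be an open file handle over the BLAST output; any identifiers passed in are whatever the ORF caller and VFDB provide.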
Phylogenetic analysis and comparative genomic analysis. A total of 20,603 E. coli genomes were downloaded from the National Center for Biotechnology Information (NCBI) and curated to create a final collection of 2,204 EHEC genomes with Shiga-toxin genes. To infer the evolutionary history of NCCP14539, we performed a multiple sequence alignment of the multilocus sequence typing (MLST) genes using MEGA X (version 10.0.5). The MUSCLE algorithm was used for multiple sequence alignment, with the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) as the clustering method. Multiple sequence alignment results were saved in MEGA format. Maximum parsimony was used as the statistical method for estimating phylogenetic trees. The phylogeny was tested using the bootstrap method with 1,000 replicates. The substitution model was maximum composite likelihood and included both transitions and transversions. The results were saved in Nexus format. The tree was visualized with FigTree (version 1.3.1) (http://tree.bio.ed.ac.uk/software/figtree/). To exclude the effect of horizontal gene transfer (HGT) on our phylogenetic analysis, we used the multi-locus sequence analysis method [40, 41]. Seven housekeeping genes (adk, fumC, gyrB, icd, mdh, purA, and recA) from 2,204 E. coli strains were retrieved and concatenated. A phylogenetic tree of MLST genes was created using the method employed for MLST-based phylogenetic analysis. From a comparative genomic study using SEED Viewer version 2.0, we identified a syntenic region that aligned with the reference genome. Unaligned genes were defined as unique genes of NCCP14539.
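The concatenation of the seven housekeeping genes can be sketched as follows. This is an illustrative outline rather than the MEGA X workflow itself: it assumes each gene has already been aligned so that all strains share the same alignment length per gene, and it uses a simple p-distance (proportion of differing sites) as a stand-in for the maximum composite likelihood model named in the text. The data layout and function names are hypothetical.

```python
from typing import Dict

# Gene order used for concatenation (genes as listed in the text).
MLST_GENES = ("adk", "fumC", "gyrB", "icd", "mdh", "purA", "recA")


def concatenate(alignments: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments into one sequence per strain.

    alignments[gene][strain] is the aligned sequence of that gene.
    Every gene must cover the same strains with equal aligned length.
    """
    strains = set(alignments[MLST_GENES[0]])
    for gene in MLST_GENES:
        seqs = alignments[gene]
        if set(seqs) != strains:
            raise ValueError(f"strain set differs for gene {gene}")
        if len({len(s) for s in seqs.values()}) != 1:
            raise ValueError(f"unequal alignment lengths for gene {gene}")
    return {
        strain: "".join(alignments[gene][strain] for gene in MLST_GENES)
        for strain in sorted(strains)
    }


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring columns with a gap."""
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else float("nan")
```

Fixing the gene order in a single tuple keeps the concatenated columns comparable across all strains, which matters when the resulting alignment is passed to a tree-building program.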
Gene-based subtyping of EHEC strains in relation to LGI. During infection, EHEC strains attach to host intestinal cells through the binding of adhesins to host receptors such as glycans. Bacterial adhesins such as lectins are potential therapeutic targets and can contribute to an understanding of bacterial evolution. In our previous study, we constructed the LGI network of EHEC and developed the e-Membranome database. Using the e-Membranome pipeline, we predicted putative adhesins from EHEC genomes and deposited them in a public database. Among the putative adhesins, we selected the fimH gene and performed gene-based subtyping of EHEC strains. The nucleotide sequences of the fimH gene were retrieved from 1,482 EHEC strains, including NCCP14539. We performed a phylogenetic analysis of the fimH gene using MEGA X (version 10.0.5), as described above.
T7 phage display. The biopanning and phage propagation procedures were carried out according to the manufacturer's instructions. We performed differential biopanning with negative and positive selection using pre-immune rabbit serum and rabbit polyclonal antibody, as indicated by the manufacturer (TB178 T7Select® System; Novagen). Twenty-five microliters of protein G + agarose beads were washed in phosphate-buffered saline (PBS) and blocked in 1% bovine serum albumin (BSA) for 1 hour. After two hours of incubation with pre-immune rabbit serum at a dilution of 1:20, the beads were washed three times with PBS and then incubated for another two hours with the T7 phage display human brain cDNA library. This subtractive biopanning step is required to eliminate proteins that react with pre-immune IgGs. The supernatant containing the T7 phage cDNA library was then incubated overnight at 4°C with polyclonal antibodies immobilized on protein G + agarose beads. After three washes with 1× PBS, the bound T7 phage cDNA library was eluted with 1% sodium dodecyl sulfate. The eluate was amplified using E. coli strain BLT5616 for the subsequent cycle of biopanning. After four rounds of biopanning, the selected phage library was used for immunoscreening.
Docking simulation. The three-dimensional (3D) structure of FimH (PDB ID: 6GTY; sequence: MKRVITLFAVLLMGWSVNAWSFACKTANGTAIPIGGGSANVYVNLAPAVNVGQNLVVDLSTQIFCHNDYPETITDYVTLQRGAAYGGVLSSFSGTVKYNGSSYPFPTTSETPRVVYNSRTDKPWPVALYLTPVSSAGGVAIKAGSLIAVLILRQTNNYNSDDFQFVWNIYANNDVVVPTGGCDVSARDVTVTLPDYPGSVPIPLTVYCAKSQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVSLGLTANYARTGGQVTAGNVQSIIGVTFVYQ) was produced by homology modeling using SWISS-MODEL. The docking simulations were conducted using the 3D structures of GM1 (PDB ID: 4ZH1) and Gb3 (PDB ID: 6F4C). ChemOffice (http://www.cambridgesoft.com, version 7.0) was used for energy minimization. AutoDock Vina 1.1.2 was used for the docking simulations. Possible hydrogen bonds and hydrophobic interactions were identified from the docking results using LigPlot, with HBPLUS and the non-bonded contact parameters at their default settings.
Plaque lift assay. EHEC bacteria in log-phase growth were infected with various dilutions of phage and grown on agar plates. About 1,000 well-dispersed individual phage clones from the agar plates were transferred to nitrocellulose membranes using the plaque-lift technique. A nitrocellulose membrane (82-mm diameter HATF filter, Millipore Corp.) was overlaid onto the agar and incubated at room temperature for 10–15 min. The membrane was then carefully lifted without disrupting the agar and plaques. The immunostaining procedure included an initial incubation with the binding glycan (GM1 and Gb3, respectively)-EHEC for 1–2 h. The membranes were then rinsed, and the detecting reagents were applied: horseradish peroxidase-conjugated goat anti-mouse IgG, followed by color development using 3,3′-diaminobenzidine (DAB) and 0.01% H2O2.
Solid phase peptide synthesis. The initial step in synthesizing peptides on a resin was to attach the C-terminal amino acid to the resin. Temporary protecting groups on the alpha-amino group and the reactive side chains prevented polymerization. The resin was then filtered and washed to remove byproducts and excess reagents. The N-alpha protecting group was removed, and the resin was washed again to remove byproducts and excess reagents. The next amino acid was then coupled, and the cycle was repeated until the peptide sequence was complete. Finally, the protecting groups were removed and the peptide was released from the resin. The mannose-6-phosphonate-conjugated P1, P2, and P3 were synthesized following the same protocol. The Gb3-like peptides, P1 (CGTVLTRNETHATYS), P2 (CQCKQDFNITDISLL), and P3 (CYATPSSNATDPLKY), were similarly synthesized.
Peptide binding assays. The mannose-6-phosphonate-conjugated P1, P2, and P3 peptides were coated onto the wells of the plate at 25 ng per well. After blocking, EHEC was added to each well for 2 hours at room temperature, and the wells were washed five times with 1% BSA. Next, the plate was incubated at 4°C for 15–16 hours. After another wash, the plate was stained with fluorescein isothiocyanate-conjugated staining reagent at a 1:10,000 dilution. After 2 hours at room temperature, the plates were washed, and fluorescence was detected using a microplate reader.
Statistical analysis. All experiments were carried out at least three times, and representative results are shown. Data were analyzed using one-way analysis of variance (ANOVA), followed by a post hoc Bonferroni test to determine significance. Differences were considered statistically significant at p < 0.05. Asterisks indicate significance levels: *p < 0.05 and **p < 0.01. The differences between the two figures are indicated in the figure legends.
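The two statistical steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the software actually used for the analysis (which the text does not name): `one_way_anova_f` computes only the F statistic and its degrees of freedom, and converting F to a p-value requires the F distribution (for example `scipy.stats.f.sf`, assuming SciPy is available). `bonferroni` applies the standard Bonferroni adjustment to a list of pairwise p-values supplied by the caller.

```python
from statistics import fmean
from typing import List, Sequence, Tuple

ALPHA = 0.05  # significance threshold stated in the text


def one_way_anova_f(
    groups: Sequence[Sequence[float]],
) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Raises ZeroDivisionError if there is no within-group variance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need >= 2 groups and more observations than groups")
    means = [fmean(g) for g in groups]
    grand = fmean([x for g in groups for x in g])
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def is_significant(p_adjusted: float, alpha: float = ALPHA) -> bool:
    """Apply the p < alpha decision rule to an adjusted p-value."""
    return p_adjusted < alpha
```

Adjusting the p-values rather than dividing the threshold by the number of comparisons gives the same decisions and keeps the reported values directly comparable with the 0.05 and 0.01 levels.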