Comparative proteomic analysis of Leptospira interrogans serogroup Icterohaemorrhagiae human vaccine strain and epidemic isolate from China

Leptospira interrogans serogroup Icterohaemorrhagiae is the predominant pathogen causing leptospirosis in China and is still used as the vaccine strain for the current human inactivated vaccine. Unlike the clade ST17, which is distributed worldwide, ST1 is the most prevalent in serogroup Icterohaemorrhagiae in China. To further characterize leptospiral pathogens, isobaric tags for relative and absolute quantitation and parallel reaction monitoring were used to analyze differences at the proteomic level between serogroup Icterohaemorrhagiae vaccine strain 56001 (ST1) and circulating isolate 200502 (ST17) from different periods. Two hundred and eighty-one proteins were differentially expressed between the circulating isolate and vaccine strain, of which 166 were upregulated (> 1.2-fold change, P < 0.05) and 115 were downregulated (< 0.8-fold change, P < 0.05). Function prediction revealed that nine upregulated proteins were outer membrane proteins, including several known immunogenic and/or virulence-related proteins, such as OmpL1, LipL71, and LipL41. Furthermore, important expression differences in carbohydrate, amino acid, and energy metabolism and transport proteins were identified between the two strains from different clusters, suggesting that these differences may reflect metabolic diversity and the potential of the pathogens to adapt to different environments. In summary, our findings provide a better understanding of the component strains of the Chinese human leptospirosis vaccine at the proteomic level. Additionally, these data facilitate the evaluation of the mechanisms by which pathogenic Leptospira species adapt to the host environment.


Introduction
Leptospirosis is one of the most important but neglected zoonotic diseases caused by pathogenic Leptospira species (Bharti et al. 2003). It is estimated that there are approximately 1 million human leptospirosis cases worldwide each year, with 58,900 deaths (Costa et al. 2015). In China, leptospirosis has been a reportable disease since 1955. Ten large outbreaks of leptospirosis, with an incidence of over 10 cases per 100,000 people, have occurred since 1990 (Shi and Jiang 2000). In addition to improving sanitation and water conservation, vaccination of at-risk populations should be conducted to prevent and control leptospiral infections. It has been reported that leptospirosis incidence has been maintained at less than 1 case per 100,000 people since 1997, with only 214 and 297 cases reported in 2019 and 2020, respectively. Regardless, leptospirosis remains endemic and small local outbreaks still occur in the southern provinces of China (Li et al. 2013; Liu et al. 2012; Wang et al. 2013).
A multivalent, inactivated leptospirosis vaccine is currently used to immunize high-risk populations in endemic regions in China. The vaccine contains L. interrogans serogroups Icterohaemorrhagiae, Canicola, Grippotyphosa, Autumnalis, Pomona, Australis, and Hebdomadis, covering more than 80% of the serogroups of the endemic strains in China (Shi and Jiang 2000; Xu and Ye 2018). Because the inactivated leptospirosis vaccine induces serogroup-specific immunity, it is important to monitor pathogenic Leptospira epidemiology to guide vaccine production (Xu and Ye 2018). There is evidence that Icterohaemorrhagiae has been the predominant serogroup in China in recent history (Yan et al. 2006). For example, of the 341 Leptospira strains isolated from 2002 to 2015 in China, 61.1% belonged to serogroup Icterohaemorrhagiae (Zhang et al. 2019a, b). Although the predominant serogroups have been consistent in China (Zhang et al. 2019a, b), it is unknown whether the current epidemic isolates differ from the vaccine strains of the same serogroup that were isolated in the 1950s.
Indeed, recent studies have reported the predominance of the clade ST1 in the serogroup Icterohaemorrhagiae in China in the past decades (Zhang et al. 2015). In contrast, ST17 has been the most prevalent clone globally (Guernier et al. 2017). It is hypothesized that in addition to genetic differences, the current predominant cluster ST17 strains have increased protein expression, providing them with increased fitness and selective advantage (Zhang et al. 2019a, b). Furthermore, it has been reported that leptospirosis-associated severe pulmonary hemorrhagic syndrome (SPHS) associated with fatality rates of > 50% in Brazil was caused by L. interrogans serogroup Icterohaemorrhagiae serovar Copenhageni, which also caused non-SPHS leptospirosis cases in China (Gouveia et al. 2008). This indicates that the appearance of SPHS may be associated with the introduction or emergence of a clone that has enhanced virulence within this serovar.
Therefore, this study aimed to determine the differences at the proteomic level between two strains of Leptospira serogroup Icterohaemorrhagiae from different periods: 200502, a cluster CC17 strain (ST17) isolated in Sichuan, China in 2005 and 56001, a cluster CC1 strain (ST1) isolated in Sichuan in 1958 used as one of the vaccine strains. Isobaric tags for relative and absolute quantitation (iTRAQ) and parallel reaction monitoring (PRM) were used to compare protein expression differences in the global whole-cell proteome to shed light on the adaptation of current epidemic strains.

Bacterial culture and sample preparation
Two Leptospira strains from two distinct phylogenetic clusters, namely circulating isolate 200502 (ST17) from cluster CC17 and vaccine strain 56001 (ST1) from cluster CC1, were selected as representative strains for comparative proteomics. To preserve the virulence, vaccine strain 56001 was propagated in guinea pigs. The Leptospira strains were grown in Ellinghausen-McCullough-Johnson-Harris medium (BD Biosciences, San Jose, CA, USA) at 28 °C until they reached the log phase. The cell suspension was centrifuged at 10,000×g for 30 min. Next, the supernatant was removed, and the cell pellet was washed thrice with normal saline and re-centrifuged at 10,000×g for 30 min. Thereafter, disruption buffer (8 M urea, 30 mM HEPES, 1 mM PMSF, 2 mM EDTA, and 10 mM DTT) was used to resuspend the cell pellet. Subsequently, whole-cell samples were sonicated at 180 W for 5 min, and the resulting lysate was centrifuged at 10,000×g for 30 min to remove cellular debris. The proteins in the supernatant were quantified using the Bradford assay (Bio-Rad Laboratories, Hercules, CA, USA). Three biological replicates were performed for the two strains.

iTRAQ labeling and SCX fractionation
Two iTRAQ experiments were performed for the two selected Leptospira strains, with three biological replicates for each strain. For each sample, 100 μg of proteins were digested overnight with 3.3 μg of trypsin (Promega, Madison, WI, USA) at 37 °C, followed by freeze-drying. Afterward, the peptides of each sample were reconstituted in 30 μL of 0.5 M TEAB and labeled with the iTRAQ8plex kit (SCIEX, Framingham, MA, USA) according to the manufacturer's instructions. Samples were labeled with the iTRAQ tags as follows: three replicates of ST1 were labeled with 113, 114, and 115 iTRAQ tags, and three replicates of ST17 were labeled with 116, 119, and 121 iTRAQ tags. After incubation at room temperature for 2 h, the labeled samples were mixed and dried by vacuum centrifugation.

HPLC-MS/MS
The SCX fractions were first separated by nano-HPLC and then analyzed by tandem mass spectrometry (MS/MS).
Briefly, each fraction was re-dissolved in solvent A (2% ACN, 0.1% FA), followed by centrifugation at 20,000×g for 10 min. Subsequently, using a Dionex ultimate 3000 nano-LC system (Thermo Fisher Scientific, Waltham, MA, USA), 10 µL of the peptide sample was loaded onto a 2-cm C18 trap column and then gradient eluted by solvent B (98% ACN, 0.1% FA) on a 15-cm analytical C18 column (inner diameter 75 µm) as per the following protocol: flow rate 0.4 µL/min; 5% solvent B for 10 min, a gradient from 5 to 80% solvent B for 38 min, and maintenance at 80% for 7 min. Finally, the system returned to 5% solvent B in 3 min and was maintained at 5% for 7 min.
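The elution program described above can be written out as a small table to check the total run time and solvent consumption. The following is a minimal sketch, assuming that solvent B changes linearly within each ramp segment (the text gives only the endpoints); the names GRADIENT and percent_b are illustrative and are not part of the instrument software.

```python
# Gradient program taken from the text: (duration in min, %B at start, %B at end).
GRADIENT = [
    (10, 5, 5),    # hold at 5% solvent B
    (38, 5, 80),   # ramp from 5% to 80% solvent B
    (7, 80, 80),   # hold at 80% solvent B
    (3, 80, 5),    # return to 5% solvent B
    (7, 5, 5),     # hold at 5% solvent B
]
FLOW_UL_PER_MIN = 0.4  # flow rate stated in the text


def percent_b(t_min):
    """Return %B at time t_min, assuming a linear change within each segment."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, b_start, b_end in GRADIENT:
        end = start + duration
        if t_min <= end:
            fraction = (t_min - start) / duration
            return b_start + fraction * (b_end - b_start)
        start = end
    raise ValueError("time is beyond the end of the gradient program")


total_min = sum(duration for duration, _, _ in GRADIENT)   # total run time per fraction
total_volume_ul = total_min * FLOW_UL_PER_MIN              # mobile phase used per run

print(f"run time: {total_min} min, volume: {total_volume_ul:.1f} uL")
print(f"%B at 29 min: {percent_b(29):.1f}")
```

Under these assumptions each fraction occupies the column for 65 min at 0.4 µL/min.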
The LC-eluted peptides were then subjected to mass spectrometry (Q-Exactive MS; Thermo Fisher Scientific) in positive ion mode with data-dependent acquisition, using a full MS scan range from 350 to 2000 m/z, full scan resolution of 70,000, MS/MS scan resolution of 17,500, a minimum MS/MS signal threshold of 1E+5, and an isolation width of 2 Da. To evaluate the mass spectrometric performance of the iTRAQ-labeled samples, two MS/MS acquisition modes and higher-energy collisional dissociation (HCD) were employed. Furthermore, to optimize the MS/MS acquisition efficiency of HCD, normalized collision energy was systematically examined and stepped by 20%.

Validation of differentially expressed proteins using PRM
The changes in protein abundance obtained using the proteomic analysis were confirmed by an LC-MS-based PRM assay. In proportion to the ratio of upregulated to downregulated proteins in the iTRAQ experiment, 30 upregulated and 20 downregulated proteins were selected as preset targets for the PRM experiment. PRM methods were established for these 50 proteins, and the 13 proteins that met the PRM detection criteria were ultimately used to verify the differentially expressed proteins. PRM method construction, optimization, and data processing were performed using Skyline software (v3.7.0.11317, downloaded from the MacCoss Laboratory at the University of Washington).
Proteins (30 μg) from the whole-cell lysate of the two samples were separately prepared and digested with trypsin following the protocol for tandem mass analysis. The obtained peptides from the two samples were then mixed equally. Next, 2-µg peptide mixtures were introduced into a Q-Exactive MS via a C18 trap column (0.15 × 20 mm; 5 μm; 100 Å) and then via a C18 column (0.75 × 150 mm; 5 μm; 300 Å). The generated raw data were then analyzed using Proteome Discoverer 1.4 (Thermo Fisher Scientific). The false discovery rate (FDR) was set to 0.01 for the proteins and peptides. Data processing and proteomic analysis were performed using Skyline 3.7 software.

Bioinformatic analysis
The raw data files generated from the iTRAQ experiments were converted into MGF format files using Proteome Discoverer 1.2 (PD 1.4; Thermo Fisher Scientific). Mascot software (version 2.3.0; Matrix Science Inc., Boston, MA, USA) was used for protein identification by searching against the UniProt-Leptospira_171 database (Number of sequences: 438047), with an FDR of < 1%. The key search parameters were set as follows: (I) type of search, MS/MS ion search; charge states of peptides, + 2 and + 3; (II) the enzyme specificity of trypsin; (III) max missed cleavages, 1; (IV) parent ion mass tolerance, 10 ppm; fragment ion mass tolerance, 0.5 Da; (V) potential variable modifications, Gln > pyro-Glu (N-term Q); oxidation (M), deamidated (NQ); (VI) fixed modifications, carbamidomethyl (C); iTRAQ8plex (N-term), iTRAQ8plex (K). To reduce the probability of false peptide identification, only peptides with significance scores ≥ 20, FDR < 1%, and protein probability > 99.0% were accepted. Each confidently identified protein included at least one unique peptide. For protein quantitation, Student's t test was used for statistical analysis, and the relative quantitation of a given protein was reported as the median ratio in Mascot; p < 0.05 was considered statistically significant.
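The parent ion tolerance above is relative (ppm), whereas the fragment ion tolerance is absolute (Da). As a minimal sketch of what a 10 ppm tolerance means in practice, the helper functions below convert it into an m/z window; ppm_window and within_ppm are hypothetical names used for illustration only and do not correspond to any Mascot function.

```python
def ppm_window(theoretical_mz, tol_ppm=10.0):
    """Return the (low, high) m/z bounds for a relative tolerance given in ppm."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


def within_ppm(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


# Illustrative values only: a precursor at m/z 800 has a window of about +/- 0.008.
low, high = ppm_window(800.0)
print(f"10 ppm window at m/z 800: {low:.4f} to {high:.4f}")
print(within_ppm(800.004, 800.0))  # about 5 ppm off
print(within_ppm(800.020, 800.0))  # about 25 ppm off
```

Because the window scales with m/z, a 10 ppm tolerance is far stricter than the fixed 0.5 Da fragment tolerance across the 350 to 2000 m/z scan range.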
Up- and downregulated proteins were defined as having fold changes (FCs) > 1.2 and < 0.8, respectively. A two-tailed Student's t test with FDR correction using the Storey and Tibshirani method was performed, with P < 0.05 and q < 0.05 considered statistically significant. Functional categories were assigned based on KEGG tools (Kanehisa et al. 2016), and functional analyses for enriched categories were performed using Fisher's exact test.
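The decision rule and the enrichment test described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: P and q values are taken as already computed, a protein is treated as significant only when both are below 0.05 (one reading of the criterion above), and the enrichment test is written as a one-sided Fisher's exact test, since the text does not state the sidedness. The function names are hypothetical.

```python
from math import comb

UP_FC, DOWN_FC, ALPHA = 1.2, 0.8, 0.05  # thresholds stated in the text


def classify(fold_change, p_value, q_value):
    """Label a protein 'up', 'down', or 'unchanged' (circulating isolate vs. vaccine strain)."""
    if p_value >= ALPHA or q_value >= ALPHA:
        return "unchanged"          # not significant (assumes both P and q must pass)
    if fold_change > UP_FC:
        return "up"
    if fold_change < DOWN_FC:
        return "down"
    return "unchanged"              # significant but within the 0.8 to 1.2 band


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact P value for over-representation.

    k: differentially expressed proteins in the category
    n: all differentially expressed proteins
    K: all background proteins in the category
    N: all background proteins
    Computed as the hypergeometric upper tail P(X >= k).
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


print(classify(5.95, 0.01, 0.02))   # fold change above 1.2 and significant
print(classify(0.45, 0.01, 0.02))   # fold change below 0.8 and significant
print(fisher_enrichment_p(5, 5, 5, 10))  # toy counts, not data from this study
```

The counts passed to fisher_enrichment_p here are toy values; in practice they would come from the KEGG category assignments of the quantified proteins.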

Comparison of whole-cell proteins of the circulating isolate and vaccine strain
The iTRAQ assay identified 2134 proteins from the two strains. Identified proteins that were common between the strains were used for quantitative analysis. As a result, 281 proteins were found to be differentially expressed, of which 166 were upregulated and 115 were downregulated in the circulating isolate 200502 compared with the vaccine strain 56001 (Supplementary Tables 1 and 2). The iTRAQ results for the three biological replicates for each strain had good repeatability (R² > 0.8) (Supplementary Fig. 1).
Of the 166 upregulated proteins (fold change > 1.2, P < 0.05), 33 had a fold change > 2 in the circulating isolate compared with that in the vaccine strain (Supplementary Table 1). The five proteins with the largest increases in abundance in the circulating isolate included lipoproteins, fadD (UniProt ID Q72RV8), 50S ribosomal protein L17, and two uncharacterized proteins (UniProt IDs M3HFM4 and M6HIJ0), and their fold increases were more than eight (Supplementary Table 1). Of the 115 downregulated proteins (fold change < 0.8, P < 0.05), 26 had a fold change < 0.5 in the circulating isolate compared with that in the vaccine strain, and the five largest decreases included long-chain fatty acid-CoA ligase, acetyltransferase component of the pyruvate dehydrogenase complex, transketolase, glycine dehydrogenase, and an uncharacterized protein (UniProt ID A0A0F6HXF1; Supplementary Table 2).

Functional roles of proteins with differential expression
For the functional categories, all 281 differentially expressed proteins were analyzed for functional annotations based on the KEGG database. As a result, 129 of the 281 proteins were annotated successfully, although the functions of most proteins (152 of 281) remained unknown. The annotation results of 79 upregulated and 50 downregulated proteins are shown in Fig. 1. Carbohydrate metabolism, genetic information processing and signaling, and cellular processing constituted the highest proportion (50.6%) of the annotated upregulated proteins (Supplementary Table 1). In addition, several outer membrane proteins (Omps), lipoproteins, and other putative virulence factors, such as Bordetella resistance to killing B (BrkB), were upregulated in the circulating isolate (Supplementary Table 1). A previous study showed that BrkB plays an essential role in resistance to complement-dependent killing by serum in other bacteria (Fernandez and Weiss 1994). Unlike the upregulated proteins, the downregulated proteins were mainly related to environmental information processing and energy metabolism, while other metabolism-related proteins were fewer; for instance, 19 vs. 6 for carbohydrate metabolism and 8 vs. 3 for amino acid metabolism in the circulating isolate and vaccine strain, respectively (Fig. 1 and Supplementary Table 2).
To further understand the differentially expressed proteins, subcellular protein location and secretion type were predicted bioinformatically. It was shown that 57.2% of upregulated proteins were located in the cytoplasmic membrane, while 21.1% were in the cytoplasmic space. Furthermore, 5.4%, 3.0%, and 2.4% of upregulated proteins were localized in the outer membrane and extracellular and periplasmic spaces, respectively; the location of 10.8% of the proteins was unknown (Supplementary Table 1). Regarding secretion pathways of upregulated proteins, 27 proteins (16.3%) were predicted to be secreted by the classical pathway, while 15 proteins (9.0%) were predicted to be non-classically secreted, and most proteins (74.7%) were assigned to an undefined secretion type (Supplementary Table 1). Unlike upregulated proteins, most downregulated proteins (76.5%) were located in the cytoplasmic space, with none found in the periplasmic space. Additionally, 17.3%, 3.4%, and 1.7% of proteins were found in the cytoplasmic membrane, extracellular space, and outer membrane, respectively (Supplementary Table 2). Similar to upregulated proteins, most downregulated proteins (68.7%) were predicted to have an undefined type of secretion, whereas 17.4% and 13.9% of proteins were classically and non-classically secreted, respectively (Supplementary Table 2).
Fig. 1 Functional classes of the differentially expressed proteins between str.200502 (ST17) and str.56601 (ST1). The red bar represents upregulated proteins (fold change > 1.2, P < 0.05), while the green bar represents downregulated proteins (fold change < 0.8, P < 0.05)

PRM confirmation of selected proteins
PRM assays, as a confirmatory technique, were designed to quantitate proteins that were determined to be differentially expressed by iTRAQ between the two strains. As described in the Materials and Methods section, 13 differentially expressed proteins were selected as candidates (Fig. 2 and Table 1). PRM results showed that 10 of the 13 proteins, including OmpL1, LipL71, peptidase M75, FAD-binding protein, signal-peptide peptidase (SppA), flagellin, bifunctional purine biosynthesis protein (PurH), putative lipoprotein, and one uncharacterized protein, had a fold change trend consistent with that in the iTRAQ assay. In contrast, 3 of the 13 proteins, namely cysteine synthase (cysK), sigma factor regulatory protein (FecR/PupR), and transcriptional regulator protein (TetR family), had a fold change trend inconsistent with that in the iTRAQ assay (Table 1).
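The comparison of trends between the two assays can be sketched as a simple direction check. The code below is illustrative only: it assumes that a trend is "consistent" when both fold changes lie on the same side of 1.0, which is one plausible reading because the text does not define the criterion, and the fold change values in the example are invented placeholders rather than values from Table 1.

```python
def same_trend(itraq_fc, prm_fc):
    """True if both fold changes point the same way relative to 1.0 (assumed criterion)."""
    return (itraq_fc - 1.0) * (prm_fc - 1.0) > 0


def concordance(fold_changes):
    """Return (number of proteins with the same trend, number of proteins compared)."""
    agree = sum(1 for itraq_fc, prm_fc in fold_changes.values()
                if same_trend(itraq_fc, prm_fc))
    return agree, len(fold_changes)


# Hypothetical (iTRAQ, PRM) fold changes; not data from this study.
example = {
    "protein_A": (2.4, 3.1),   # up in both assays
    "protein_B": (0.6, 0.7),   # down in both assays
    "protein_C": (1.5, 0.9),   # opposite directions
}
agree, total = concordance(example)
print(f"{agree} of {total} hypothetical proteins agree")

# Agreement rate implied by the counts reported in the text (10 of 13 proteins).
reported_rate = 10 / 13
print(f"reported agreement: {reported_rate:.1%}")
```

With the counts reported above, the agreement between PRM and iTRAQ corresponds to roughly 77% of the verified proteins.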

Discussion
Leptospirosis incidence has significantly decreased in the past 20 years; regardless, the disease is still considered an important reportable zoonotic disease in China (Shi and Jiang 2000). Several studies demonstrated that serogroup Icterohaemorrhagiae is the most predominant in China, being responsible for over 60% of leptospirosis cases, although the predominant serogroups or STs differ geographically by country and region (Zhang et al. 2019a, b). Recently, 17 different STs were identified in 120 Chinese Leptospira serogroup Icterohaemorrhagiae strains collected from 1958 to 2008, including 69 ST1, 18 ST17, 18 ST128, 9 ST143, and 2 ST209 strains, which differed from those isolated in other countries, such as Argentina, Russia, and Brazil where ST17 is predominant (Zhang et al. 2015). It is speculated that these predominant isolates may have adaptive selective advantages in the environment or in maintenance hosts.
Previous studies have demonstrated that altered expression or mutations in critical genes after many passages in vitro are closely associated with the virulence attenuation of pathogenic Leptospira (Toma et al. 2014). Therefore, Chinese human leptospirosis vaccine strains isolated 50 years ago were often passaged in animal models after 3-6 passages in vitro to maintain the virulence, which was closely associated with efficacy (Zhong et al. 2011). In the present study, a number of proteins differed at the proteomic level between the vaccine strain and circulating isolate among serogroup Icterohaemorrhagiae strains from different periods, though the expression of most proteins showed no difference. There were nine outer membrane proteins whose expression increased in the circulating isolate, including the known proteins OmpL1 (Dong et al. 2008; Fernandes et al. 2012), LipL41 (Lin et al. 2011; Shang et al. 1996), and LipL71 (Verma et al. 2005; Zhang et al. 2013). OmpL1, a 31-kDa leptospiral transmembrane protein, was found to be upregulated with a 5.95-fold increase in ST17. Previous studies demonstrated that recombinant OmpL1 induced partial immunoprotective capacity in an animal model (Kanehisa et al. 2016). Additionally, OmpL1 was found to be synergistically immunoprotective in combination with LipL41 in a Golden Syrian hamster model challenged with pathogenic Leptospira, indicating that OmpL1 and LipL41 are important determinants of immunoprotection (Haake et al. 1999). The expression differences of immunogenic proteins between the vaccine strain and the circulating isolates may imply the need to select new vaccine strains. Therefore, these data can provide critical insights for developing new whole-cell and recombinant leptospirosis vaccines.
Table 1 Thirteen targeted upregulated (fold change > 1.2, P < 0.05) and downregulated (fold change < 0.8, P < 0.05) proteins from the iTRAQ assay, which also met the PRM detection criteria, were selected. Proteins labeled with pentacles indicate that results between the two assays were inconsistent
On the other hand, bacterial OMPs, as the first points of interaction with a host, play crucial roles in attaching to host cells by acting as receptors for various host molecules, functioning as porins, and/or acting as bactericidal antibody targets. These properties contribute to bacterial pathogenesis by helping bacteria evade the immune response or adhere to tissues (Pinne and Haake 2009). OmpL1, a surface-exposed protein, exhibits typical receptor-ligand interaction by binding laminin and plasma fibronectin. Furthermore, the interaction of OmpL1 with plasminogen can produce plasmin, facilitating host cell membrane degradation by bacteria (Fernandes et al. 2012). It has been suggested that OmpL1 may promote the attachment of leptospires to mammalian hosts and help the bacteria disseminate during the infection process. These differences in OmpL1 expression may contribute to fitness differences between the two strains. Consistent with this assumption, LipL71, also known as LruA, a lipoprotein encoded by the LIC_11003 gene that contains a LysM domain (residues 401 to 461) associated with peptidoglycan binding in bacteria, was also upregulated in ST17 (Verma et al. 2005). It was found that LruA shares immuno-relevant epitopes with eye proteins, suggesting that cross-reactive antibody interactions with eye antigens may play an important role in Leptospira-associated recurrent uveitis (Verma et al. 2010). Furthermore, it was demonstrated that the 232 amino acid deletion at the C terminus of LipL71 (covering the LysM domain) resulted in attenuation of leptospiral virulence by modulating cellular interactions with serum protein ApoA-I, suggesting that LipL71 is a surface-exposed virulence factor that is closely associated with leptospiral pathogenesis.
To advance the understanding of leptospiral pathogenesis, further genetic manipulation should be performed in future studies to confirm that the differentially expressed proteins identified in this study are key contributors to the fitness difference between the two clusters.
It is generally accepted that transporter proteins may be closely associated with the metabolic diversity of bacterial pathogens (Zavala-Alvarado et al. 2020), and L. interrogans has over 250 transporter proteins (Buyuktimkin and Saier 2016). We found that four transporter proteins were upregulated and two transporters were downregulated in the current circulating strain from the ST17 cluster. This differential expression of various transporters suggests that the two strains differed metabolically, with the likelihood of ST17 being more metabolically efficient than ST1. Furthermore, several carbohydrate and amino acid metabolism-related proteins, such as peptidase M75 and pectin acetylesterase, were highly expressed in the circulating strain, implying a higher ability to acquire essential nutrients and providing further evidence that the circulating strain was metabolically more efficient. This reflects the metabolic diversity and potential of the pathogens to adapt to different environments. We suggest that these changes might be common across current circulating ST17 strains globally, contributing to their expansion. In addition to the proteins mentioned above, we speculate that many key proteins among the 281 differentially expressed proteins identified in this study have not yet been functionally characterized. Future research will aim to identify more proteins closely related to bacterial function.
As a targeted proteomics technology based on high-resolution, high-precision mass spectrometry, PRM can provide absolute quantification of target proteins and peptides (Peterson et al. 2012). This quantitative technique has been demonstrated to be better than western blotting, which is semiquantitative, less sensitive, and requires antibodies that may not be available (Saleh et al. 2019; Stolze and Nakagami 2020). In the present study, significant differences in iTRAQ results were confirmed using PRM. There was a strong correlation between PRM and iTRAQ results, although 3 out of 13 proteins exhibited inconsistent fold change trends. The minor lack of conformity may be attributed to the different working principles of the two methods. PRM targets specific peptides and transitions, which may be missed during iTRAQ as in shotgun proteomics, where more peptides are eluted than can be detected at once (Stolze and Nakagami 2020; Rauniyar and Yates 2014). Interestingly, the fold change of three upregulated proteins, which were A0A098MZ52 (peptidase M75), A0A161QCM2 (transcriptional regulator), and M3HFM4 (uncharacterized protein), was significantly higher in PRM than in iTRAQ. This inconsistency, which had also been reported by a previous study (Luu et al. 2018), may be associated with the sensitivity difference for certain proteins between the two methods.
Furthermore, to eliminate as much as possible the interference of other epidemiological factors, we compared two representative strains, 200502 and 56601, from different clusters that shared the same host (human), species (L. interrogans), and isolation site (Sichuan Province). As only one strain from each cluster was selected to explore the differences at the proteome level, further comparative studies should be conducted with a large number of L. interrogans serogroup Icterohaemorrhagiae strains from different hosts and geographical regions in China.

Conclusions
In conclusion, a total of 281 differentially expressed proteins were identified between the vaccine strain and circulating isolate in serogroup Icterohaemorrhagiae strains from different periods using comparative proteomics technology. Upregulation of some important proteins, including several known immunogenic and/or virulence-related OMPs and lipoproteins, was found in the most prevalent circulating ST17 strain, providing critical insights for the development of whole-cell and recombinant leptospirosis vaccines. Furthermore, the differential metabolism-related proteins identified may be key contributors to the fitness difference between the two ST clusters.