TMT-based Proteomic Analysis and Protein Expression During Excystation of Cryptosporidium andersoni

Background: Cryptosporidium andersoni (C. andersoni) initiates infection by releasing sporozoites through excystation. However, the proteins involved in excystation remain unknown. Identifying the proteins that participate in the excystation of C. andersoni oocysts will fill the gap in our understanding of the excystation system of this parasitic pathogen. Methods: In this study, C. andersoni oocysts were collected and purified from the feces of naturally infected adult cows. Tandem mass tag (TMT) labeling coupled with liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to investigate the protein expression profile of C. andersoni oocysts during excystation. Results: Our proteomic analysis identified a total of 1586 proteins, of which 17 were differentially expressed proteins (DEPs), with 10 upregulated and 7 downregulated. Each of those 17 proteins had multiple biological functions associated with control of gene expression at the level of transcription and with biosynthetic and metabolic processes. Quantitative real-time PCR of eight selected genes validated the proteomic data. Conclusions: Our findings provide new information on the protein composition of C. andersoni oocysts as well as possible excystation factors. These data may help us to better understand the pathology of C. andersoni and thus may be useful in diagnosis, vaccine development, and immunotherapy for Cryptosporidium.

species such as Cryptosporidium andersoni infecting the abomasum of cattle [6]. In this study, the proteomes of non-excysted and excysted cow-derived C. andersoni oocysts were compared in order to screen for proteins associated with excystation and to find potential anti-cryptosporidiosis targets.


Protein extraction and SDS-PAGE separation
Lysis buffer (pH 8.5) containing 7 M urea, 2 M thiourea, 65 mM Tris, 2% dithiothreitol (DTT), 4% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), 0.2% IPG buffer (GE Amersham, USA), and 0.1% v/v protease inhibitor cocktail (Merck, USA) was added to the tubes containing excysted or unexcysted oocysts [14]. Oocysts were disrupted by sonication at 80 W (100 pulses of 3 s each, at intervals of 10 s). The debris was removed by centrifugation at 12,000 × g at 4°C for 10 min; the supernatant was transferred to a new centrifuge tube, and the protein concentration was determined using a BCA Protein Assay kit (Beyotime Biotechnology, China) according to the manufacturer's instructions. The oocysts were stored at −80°C until the following proteomic analyses.

Trypsin Digestion, TMT Labeling, and HPLC Fractionation
TMT tagging and analysis were performed as described previously [15]. For digestion, the protein solution was reduced with 5 mM dithiothreitol for 30 min at 56°C and alkylated with 11 mM iodoacetamide for 15 min at room temperature in darkness. The protein sample was then diluted with 100 mM triethylammonium bicarbonate (TEAB) to bring the urea concentration below 2 M. Finally, trypsin was added at a 1:50 trypsin-to-protein mass ratio for the first digestion overnight and at a 1:100 trypsin-to-protein mass ratio for a second 4-h digestion. After trypsin digestion, the peptides were desalted using a Strata X C18 SPE column (Phenomenex) and vacuum-dried. Peptides were reconstituted in 0.5 M TEAB and processed according to the manufacturer's protocol for a TMT kit. Briefly, one unit of TMT reagent was thawed and reconstituted in acetonitrile. The peptide mixtures were then incubated for 2 h at room temperature and pooled, desalted, and dried by vacuum centrifugation. The tryptic peptides were fractionated by high-pH reverse-phase HPLC using an Agilent 300 Extend C18 column (5 μm particles, 4.6 mm ID, 250 mm length). Briefly, peptides were first separated into 60 fractions with a gradient of 8%-32% acetonitrile (pH 9.0) over 60 min. Then, the peptides were combined into 18 fractions and dried by vacuum centrifugation.
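
The dilution and enzyme-ratio arithmetic in this step can be sketched as follows. This is an illustrative calculation only: it assumes the sample is still at the 7 M urea of the lysis buffer when TEAB is added, and the function names and the example volume and protein mass are hypothetical, not values from the study.

```python
def min_diluent_volume_ul(sample_volume_ul, urea_start_m=7.0, urea_target_m=2.0):
    """Volume of urea-free buffer that brings urea exactly to the target.

    The protocol requires a concentration below the target, so in practice
    slightly more than this boundary volume would be added.
    Assumption: the starting concentration equals the 7 M of the lysis buffer.
    """
    return sample_volume_ul * (urea_start_m / urea_target_m - 1.0)


def trypsin_mass_ug(protein_ug, ratio):
    """Trypsin mass for a 1:ratio trypsin-to-protein mass ratio."""
    return protein_ug / ratio


# Hypothetical example: 100 uL of lysate containing 100 ug of protein.
boundary_volume = min_diluent_volume_ul(100.0)   # 250.0 uL of 100 mM TEAB
first_digest = trypsin_mass_ug(100.0, 50)        # 2.0 ug (overnight digestion)
second_digest = trypsin_mass_ug(100.0, 100)      # 1.0 ug (second 4-h digestion)
```

A 7 M to 2 M dilution corresponds to a 3.5-fold increase in total volume, which is why the boundary volume is 2.5 times the sample volume.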

LC-MS/MS analysis and Database Search
The tryptic peptides were dissolved in 0.1% formic acid (solvent A) and directly loaded onto a home-made reversed-phase analytical column (15-cm length, 75 μm i.d.). The gradient comprised an increase from 6% to 23% solvent B (0.1% formic acid in 98% acetonitrile) over 26 min, 23%-35% in 8 min, increasing to 80% in 3 min, and then holding at 80% for the last 3 min, all at a constant flow rate of 400 nL/min on an EASY-nLC 1000 UPLC system [16].
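
The gradient program described above can be written as a small lookup function. This is a minimal sketch that assumes each segment is a linear ramp (the text gives only the segment endpoints and durations); the name `percent_b` is hypothetical.

```python
# Breakpoints as (time in min, % solvent B), taken from the segment
# durations in the text: 26 + 8 + 3 + 3 = 40 min in total.
GRADIENT = [(0.0, 6.0), (26.0, 23.0), (34.0, 35.0), (37.0, 80.0), (40.0, 80.0)]


def percent_b(t_min):
    """Return % solvent B at time t_min, assuming linear ramps between breakpoints."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the 0-40 min gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable for in-range times")
```

For example, `percent_b(13.0)` falls halfway through the first ramp and `percent_b(38.0)` lies on the final 80% hold.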
The peptides were subjected to a nanospray ionization (NSI) source, followed by tandem mass spectrometry (MS/MS) in a Q Exactive™ Plus (Thermo) coupled online to the UPLC. The electrospray voltage applied was 2.0 kV. The m/z scan range was 350-1800 for full scan, and intact peptides were detected in the Orbitrap at a resolution of 70,000. Peptides were then selected for MS/MS using the NCE setting of 28, and the fragments were detected in the Orbitrap at a resolution of 17,500. A data-dependent procedure that alternated between one MS scan followed by 20 MS/MS scans with 15.0 s dynamic exclusion was used. Automatic gain control (AGC) was set at 5E4. The fixed first mass was set as 100 m/z [17].
The resulting MS/MS data were processed using the MaxQuant search engine (v.1.5.2.8). Tandem mass spectra were searched against the Proteome Cryptosporidium database (28217 sequences) concatenated with the reverse decoy database. Trypsin/P was specified as the cleavage enzyme, allowing up to two missed cleavages. The mass tolerance for precursor ions was set as 20 ppm in the first search and 5 ppm in the main search, and the mass tolerance for fragment ions was set as 0.02 Da. Carbamidomethyl on Cys was specified as a fixed modification, and oxidation on Met was specified as a variable modification. The FDR was adjusted to < 1%, and the minimum score for peptides was set as > 40.
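
To make the search settings concrete, the sketch below shows how a ppm mass error is computed and how a reversed decoy database supports a simple target-decoy FDR estimate. It is a simplified illustration under stated assumptions, not MaxQuant's implementation (which differs in detail); all function names and the example matches are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million relative to the theoretical value."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def reverse_decoy(sequence):
    """Build a decoy entry by reversing a protein sequence."""
    return sequence[::-1]


def score_cutoff_at_fdr(matches, max_fdr=0.01):
    """Lowest score cutoff at which decoys / targets stays within max_fdr.

    matches: iterable of (score, is_decoy) pairs.
    Assumption: FDR is estimated as the decoy count divided by the target
    count among all matches scoring at or above the cutoff.
    Returns None if no cutoff satisfies the limit.
    """
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in sorted(matches, key=lambda m: m[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical matches: three target hits and one decoy hit.
example = [(90, False), (80, False), (70, True), (60, False)]
strict_cutoff = score_cutoff_at_fdr(example, max_fdr=0.01)   # 80
```

With the 1% limit, the cutoff stops at the last score reached before the first decoy hit enters the accepted set.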

Bioinformatic analysis
Multiple bioinformatics tools were employed to analyze the proteins. Gene Ontology (GO) annotation of the proteome was derived from the UniProt-GOA database (http://www.ebi.ac.uk/GOA/), which classifies proteins into three categories: biological process, cellular component, and molecular function. The Clusters of Orthologous Groups for eukaryotic complete genomes (KOG) database was used for functional classification statistics of differentially expressed proteins (DEPs). The Kyoto Encyclopedia of Genes and Genomes (KEGG) database was used to annotate protein pathways. For GO, KOG, and KEGG enrichment analyses, a two-tailed Fisher's exact test was applied to test DEPs against all identified proteins, and a corrected p value < 0.05 was considered significant. Functional descriptions of identified protein domains were annotated by InterProScan (a sequence analysis application) based on the protein sequence alignment method, using the InterPro domain database (http://www.ebi.ac.uk/interpro/). Subcellular localization was predicted with WoLF PSORT (https://www.genscript.com/wolf-psort.html), an updated version of PSORT/PSORT II for the prediction of eukaryotic sequences.
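
The enrichment test described above can be sketched with the standard library alone. The two-tailed Fisher's exact test follows the text; the Benjamini-Hochberg procedure is an assumption, since the text says only that p values were "corrected" without naming the method. Function names are hypothetical.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: DEPs annotated with the term       b: DEPs without the term
    c: other identified proteins with it  d: other identified proteins without it
    The p value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values (assumed correction method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one table would be built per GO, KOG, or KEGG term, with the 17 DEPs split across `a` and `b`, and the resulting p values passed together to the correction step.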

Validation of differentially expressed genes by quantitative real-time PCR
qRT-PCR was used to determine and verify the gene expression levels of eight DEPs during excystation of C. andersoni oocysts [18]. Total RNA of each sample was extracted from excysted and unexcysted C. andersoni oocysts (with three biological replicates for each group) using TRIzol™ Reagent (Invitrogen, USA). RNA purification and reverse transcription were performed using a Reverse Transcriptase M-MLV Kit with gDNA Eraser (Takara, Japan) according to the manufacturer's instructions. The quantity of RNA was analyzed using a NanoDrop One (Thermo Fisher Scientific, USA). For every sample, 1 μg of total RNA was treated with 1 μL of gDNA Eraser at 42°C for 2 min. First-strand cDNA was synthesized using 1 μL of Oligo(dT) and 1 μL of Random Primers and RNase-free dH2O (up to 10 μL) at 70°C for 10 min and at 4°C for 2 min. Second strand cDNA was synthesized using 4 μL of 5 × M-MLV buffer, 1 μL of dNTP, 0.5 μL of RI, 0.5 μL of M-MLV, 4 μL RNase-free dH2O, and 10 μL of first-strand cDNA. The resulting products were used as templates for qRT-PCR. Gene-specific qRT-PCR primers were designed with Premier 5.0 software (Premier Biosoft International, Palo Alto, CA, USA).
Oligonucleotide sequences of target and reference (18S) genes for qRT-PCR are listed in Supplementary Table S1. The qRT-PCR reaction was composed of 2 μL of cDNA, 0.4 μM (final concentration) of each primer, and 5 μL of 2 × SYBR qPCR mix (Takara, Japan). PCR reactions were performed in duplicate using the qTOWER 3 G IVD (Analytik Jena AG, Germany) with the following cycles: one cycle of denaturation at 95°C for 30 s, followed by 40 cycles of 95°C for 5 s, 55°C for 10 s, and 72°C for 15 s. One melting-curve cycle was added to all reactions to verify product specificity, with 95°C for 15 s and 65°C for 60 s. Relative expression levels were normalized to a housekeeping gene, 18S rRNA. The 2^(-ΔΔCT) method was used to determine the fold change of gene expression levels. GraphPad Prism V 8.0 (https://www.graphpad.com/) was used to analyze and plot the data.
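
The 2^(-ΔΔCT) calculation can be sketched as below. This is a minimal illustration that assumes replicate CT values are averaged before the subtraction; the function name and the CT values in the example are hypothetical and are not data from this study.

```python
from statistics import mean


def fold_change_ddct(ct_target_excysted, ct_ref_excysted,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCT) method.

    Each argument is a list of CT values from replicate reactions.
    dCT  = CT(target gene) - CT(reference gene, here 18S rRNA)
    ddCT = dCT(excysted) - dCT(non-excysted control)
    """
    dct_excysted = mean(ct_target_excysted) - mean(ct_ref_excysted)
    dct_control = mean(ct_target_control) - mean(ct_ref_control)
    return 2.0 ** -(dct_excysted - dct_control)


# Hypothetical CT values: the target amplifies two cycles earlier in the
# excysted sample while the reference gene is unchanged.
example_fold = fold_change_ddct([22.0, 22.0], [12.0, 12.0],
                                [24.0, 24.0], [12.0, 12.0])   # 4.0
```

A fold change above 1 indicates higher transcript abundance in excysted oocysts, and a value below 1 indicates lower abundance.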

Purification and excystation of Cryptosporidium andersoni oocysts
Highly purified oocysts of C. andersoni were obtained from the feces of adult cows via sucrose density gradient centrifugation and cesium chloride density gradient centrifugation (Figure 2). The excystation rate of C. andersoni was 82% after a 3-h incubation in a 37°C thermostatic water bath. Non-excysted oocysts, excysted oocysts, and sporozoites are shown in Figure 3. Most oocysts were round or oval, with a transparent wall around the oocyst, a bright residual body, and four sporozoites inside the oocyst. After excystation, the sporozoites swam in a snakelike manner or oscillated violently, and residual bodies either remained in the oocyst or prolapsed outside the oocyst wall.

Protein identification and quantification
To understand how the proteome changes with Cryptosporidium andersoni oocyst excystation, a TMT-based labeling approach coupled with high-performance liquid chromatography (HPLC) and LC-MS/MS was employed to identify the proteome of excysted and non-excysted oocysts (Fig. 1). A total of 106840 MS/MS spectra were obtained by mass spectrometry analysis. Of these, 20541 spectra were matched after searching against the theoretical protein database, giving a spectrum utilization rate of 19.2%. A total of 10789 peptides were identified by spectral analysis, of which 10213 were unique peptides, and 1786 proteins were identified (Fig. 4). In addition, the mass errors of most peptides ranged from -5 to 10 ppm, indicating good reliability of the TMT data in this study. The statistical analysis of protein identification is provided in Table S1. Our dataset comprised 46.1% of the C. andersoni predicted proteome (3876). Among those proteins, 44.3% (791/1786) were uncharacterized proteins, and 4.4% (78/1786) were ribosomal proteins.
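
The percentages reported in this paragraph follow directly from the stated counts, as the short check below shows. The helper name `percent` is hypothetical; all counts are taken from the text.

```python
def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


total_spectra = 106840       # MS/MS spectra acquired
matched_spectra = 20541      # spectra matched in the database search
identified_proteins = 1786
predicted_proteome = 3876    # predicted C. andersoni proteins

spectrum_utilization = percent(matched_spectra, total_spectra)        # 19.2
proteome_coverage = percent(identified_proteins, predicted_proteome)  # 46.1
uncharacterized_share = percent(791, identified_proteins)             # 44.3
ribosomal_share = percent(78, identified_proteins)                    # 4.4
```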

Validation of the transcription of DEPs by real-time quantitative PCR
To investigate the excystation mechanism, the transcription levels of DEPs were verified by qRT-PCR. Nine DEPs were selected randomly for qRT-PCR analysis to characterize gene expression. The qRT-PCR results indicated that the transcription levels of most genes were consistent with expression levels according to TMT data (Fig. 7).

Discussion
To obtain a better understanding of the molecular mechanisms and proteome changes occurring when C. andersoni excysts in vitro, a TMT-based quantitative proteomics analysis was performed to directly screen differentially abundant proteins involved in the excystation of this parasite. In this study, 1786 proteins were identified in the excysted oocyst/sporozoite of C. andersoni, approximately 46.1% of the predicted proteome (3876). The number of proteins identified during the excystation of C. andersoni was greater than in previous proteome studies of oocysts and sporozoites of C. parvum: in 2007, 303 proteins were identified during sporozoite excystation of C. parvum [19]; in 2008, 1237 nonredundant proteins (approximately 30% of the predicted proteome) of excysted C. parvum oocyst/sporozoite were identified using LC-MS/MS analysis [20]; and in 2013, 33 separate C. parvum sporozoite proteins were identified from 135 protein hits using SDS-PAGE with subsequent LC-MS/MS analysis [21]. Among the proteins identified in this study, many have been identified as putative virulence factors by immunological and molecular methods [1]; for example, serine protease and aminopeptidase are associated with excystation; P23 and P30 are associated with adhesion; TRAP and thrombospondin-related proteins are involved in parasite gliding motility and cell penetration; Cp2, Cpa135, and secretory phospholipase are associated with invasion; and HSP70 and HSP90 are associated with stress protection [1].
Using a comparative proteomic approach, we identified 17 DEPs between the excysted and non-excysted C. andersoni oocysts. Further analysis of these DEPs, especially the upregulated proteins, was performed to understand the mechanism of C. andersoni excystation. Compared with non-excysted C. andersoni oocysts, RCC1 (regulator of chromosome condensation) increased by a factor of about 3.5 in the excysted oocysts. Histone H2A and an unidentified protein (protein accession: A0A1J4MP75) containing high mobility group box (HMG-box) domains were also increased after excystation. RCC1 is bound to chromatin and confined to the nucleus. The RCC1 β-propeller domain binds the histone H2A/H2B dimer component of the histone octamer, which can regulate the concentration gradient of Ran-GTP around the chromosomes to mediate nucleocytoplasmic transport and mitotic spindle assembly [22,23,24]. An RCC1 mutant of Toxoplasma gondii showed defects in nuclear trafficking and growth impairment under nutrient limitation, demonstrating that the rate of nuclear transport is a critical factor affecting growth in low-nutrient conditions [25]. High mobility group box (HMGB) proteins have been reported in many apicomplexan parasites, including Plasmodium falciparum [26], Toxoplasma gondii [27], and Babesia bovis [28]. PfHMGB1 and PfHMGB2 are potent inducers of two important mediators of inflammation, TNFα and iNOS, suggesting that these proteins may have immunomodulatory roles in the pathophysiology of P. falciparum infection [26]. TgHMGB1 was implicated in transcriptional regulation and most likely acts as an activator of many virulence factors in T. gondii [27]. Based on these findings, we hypothesized that RCC1, H2A, and the unidentified protein were involved in regulating the gene expression of C. andersoni excystation factors. Phosphatidylethanolamine (PtdEtn) increased by a factor of 1.258 after excystation of C. andersoni.
PtdEtn is located at the plasma membrane and is enriched in a variety of biosynthetic and metabolic processes, including lipid biosynthetic and metabolic processes, organophosphate biosynthesis and metabolism, phospholipid metabolism, and organic substance metabolism. The increase of PtdEtn may be associated with energy acquisition in the excystation of C. andersoni oocysts. PtdEtn is one of the most abundant phospholipids in prokaryotes and eukaryotes; it contributes to membrane integrity, membrane fusion/fission, protein stabilization, and autophagy events [29]. PtdEtn is the second major phospholipid class in T. gondii. Phosphatidylserine decarboxylase (PSD) mediates the decarboxylation of phosphatidylserine (PtdSer) to form PtdEtn and displays much higher (10-fold) activity in T. gondii tachyzoites compared with yeast and mammalian cells [30]. In addition, choline kinase inhibitors inhibit the ethanolamine kinase activity of P. falciparum choline kinase, leading to a severe decrease in the phosphatidylethanolamine levels within P. falciparum, which explains the resulting growth phenotype and parasite death [31].
The B6AJJ3 protein is an uncharacterized protein that contains a WD40-repeat-containing domain and a YVTN-type repeat domain. The expression level of the protein increased after excystation of C. andersoni oocysts. During the Cryptosporidium life-cycle, sporozoites slip out of the oocyst and infect the host cells, and this mainly occurs in the intestinal lumen or stomach. This uncharacterized protein may be associated with environmental stress and the host immune system. WD40 proteins are much more abundant in eukaryotic organisms, where they participate in a diverse set of functions, including signal transduction, cell division, cytoskeleton assembly, chemotaxis, and RNA processing [32]. The WD40-repeat protein-like protein PfWLP1 may support the stability of adhesion protein complexes of the plasmodia blood stages [33]. The YVTN-type repeat domain is found in archaeal surface layer proteins that protect cells from extreme environments [34].
In addition to the DEPs, many known and potential virulence factors were identified in the oocysts of C. andersoni. Although there were no significant differences in the expression of those virulence factors in the excysted and non-excysted C. andersoni oocysts, their role in sporozoite adhesion and invasion of host cells is still worthy of study. The Cpa135 protein was localized in the apical complex of the sporozoite and in the parasitophorous vacuole (PV) during the intracellular stages. In fact, in the oocyst-sporozoite, both the Cpa135 mRNA and the Cpa135 protein are present, and the protein rapidly increases during the excystation process of C. parvum [35]. This study identified a variety of heat shock proteins (HSPs), including HSP10, HSP70, and HSP90. HSPs are involved in maintaining cell homeostasis [36]. Synthesis of HSPs, especially HSP70, increases dramatically under stressful conditions such as a sudden temperature shift, changes in concentrations of glucose and calcium, or in response to immune effectors [37]. Previous studies have identified two HSPs in Cryptosporidium, HSP70 and HSP90 [38,39]. Differences in HSP expression in T. gondii correlate with parasite virulence in the immunocompetent host [40]. The relationship between the expression levels of these HSPs and Cryptosporidium virulence is worthy of further study.

Conclusion
TMT-based proteomics technology provides a new method for the identification of proteins involved in the excystation of C. andersoni oocysts. The proteomes of excysted and non-excysted C. andersoni oocysts were compared, and multiple proteins, including RCC1, histone H2A, PtdEtn, and two uncharacterized proteins (protein accession numbers: B6AJJ3 and A0A1J4MP75), had increased expression in the excysted C. andersoni oocysts; these proteins may be key regulatory factors involved in the excystation of C. andersoni.

played no role in the study design, in the collection, analysis, or interpretation of the data, in writing the report, or in the decision to submit the article for publication.

Availability of data and materials
The datasets supporting the conclusions of this article are included within the article and its additional files. The mass spectrometry proteomics data have been deposited at the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD028423. All analyzed data are available from the corresponding author upon reasonable request.

Figure 1
To understand how the proteome changes with Cryptosporidium andersoni oocyst excystation, a TMT-based labeling approach coupled with High Performance Liquid Chromatography (HPLC) and LC-MS/MS was employed to identify the proteome of excysted and non-excysted oocysts (Fig. 1).

Figure 2
Highly purified oocysts of C. andersoni were obtained from adult cow's feces via sucrose solution density gradient centrifugation and cesium chloride density gradient centrifugation (Figure 2).

Figure 3
The excystation rate of C. andersoni was 82% using 3-h incubation in a 37°C thermostat water bath. Non-excysted oocysts, excysted oocysts, and sporozoites are shown in Figure 3.

Figure 7
To investigate the excystation mechanism, the transcription levels of DEPs were verified by qRT-PCR. Nine DEPs were selected randomly for qRT-PCR analysis to characterize gene expression. The qRT-PCR results indicated that the transcription levels of most genes were consistent with expression levels according to TMT data (Fig. 7).
