Transcriptomic analysis of the Echinococcus granulosus protoscolex in the encystation process

Back ground: Echinococcosis (CE) is a zoonosis and in humans it occurs as a result of infection by the larva of Echinococcus granulosus . CE is seriously affects the development of animal husbandry and endangers human health. Due to the lack of in-depth understanding of the cystic fluid formation pathway, prevention and treatment of CE have been lack of innovative methods. Result: High throughput RNA-sequencing (RNA-seq) of protoscoleces (PSCs) in the encystation process of total three biological replicates for each period on 0d, 10d, 20d, 40d and 80d were analyzed. The results demonstrated, a total of 32,401 transcripts and 14,903 genes, including numbers new genes, new transcript, stage-specific genes and differently expression genes (DEGs). Genes encoding proteins involved in several signaling pathways, such as putative G-protein coupled receptor (GPCR), tyrosine kinases and serine/threonine protein kinase were predominantly up-regulated during encystation process of PSCs. Moreover, three major antioxidant proteins of PSCs were identified, and these proteins demonstrated have a high expression level, including cytochrome c oxidase, thioredoxin glutathione, and glutathione peroxidase. Intriguingly, The Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis suggested that upregulated DEGs involved in the vasopressin-regulated water reabsorption metabolic pathway might play important roles in the protein, carbohydrate, and other substances transport. Conclusions: The present study carried out the transcriptomic analysis of the encystation process of E. granulosus PCSs, which provide valuable information for the mechanism of cystic fluid formation during the encystation process. These results provide a basis and reference for further studies for investigation of the molecular mechanisms involved in PSC growth and development. In summary, through RNA-seq, we presented a transcriptomic analysis during the encystation process of PSCs. The study revealed numbers of new genes, new transcripts, DEGs, stage-specific genes and the key metabolic pathways. These data provide valuable information to understand the growth and development mechanisms of PSCs, which can be used as a base for further studies on new drugs and vaccine targets.

process, Differentially expressed genes, Protoscolex, RNA-seq Background Cystic echinococcosis (CE), also called hydatid disease, is a chronic neglected zoonotic disease caused by the larvae of Echinococcus granulosus which endangers human health and causes huge economic losses in the animal husbandry [1]. The recent epidemiological studies showed that at least 270 million people (58% of the total population) are at risk of CE in Central Asia, including areas of Mongolia, Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, Uzbekistan, Iran, Pakistan and western China [2,3]. The life cycle of E. granulosus characterized by long-term growth of larval stages (hydatid cysts) in the internal organs of humans and other intermediate hosts, especially in the liver and lungs [4]. Although benzimidazoles have been widely used to treat CE, the main disadvantages were the poor absorption and high hepatotoxicity [5,6]. Thus, there remains great interest in developing new chemotherapies against CE.
The hydatid cysts is composed of cyst wall and contents, including cystic fluid, protoscoleces (PSCs) and brood capsules. The cyst wall mainly consists of the outer protective acellular mucin-rich laminated layer [7] and the inner layer of germinal cells (blastoderm) [8]. The germinal layer has many cellular nuclei, which can produce groups of vesicles into the cyst lumen, these vesicles become brood cysts after cell vacuolation and then develop into the initial PSCs [9]. Secondary hydatid cysts are formed by encystation during both in vivo development and in vitro culture of the PSCs [10].
However, despite it is clear that during the process of encystation, the volume of cysts gradually increases, as well as the amount of cystic fluid, and the echinococcus cyst fluid is the internal environment for the growth and development of germinal cells and PSCs, the functional genes and the key metabolic pathways linked to this encystation process are still unknown.
Recently, the second-generation high-throughput RNA sequencing technology is mainly used to detect the expression levels of all genes or transcriptomes in a sample under specific physiological conditions, and it plays an important role in gene expression and transcriptome regulation [11]. While whole-genome sequencing, proteomic analyses, and transcriptomic investigations recently been carried out to identify the DEGs and proteins between the different life stages (adult, oncosphere, hydatid cyst wall, larval worms, and pepsin/H+-activated PSCs) of E.granulosus [12][13][14][15]. An unexplored issue associated with the parasites survival in its host is what is the key DEGs and main metabolic pathways that used by PSCs to produce the energy, formation cyst fluid and transport molecular needed to maintain its growth and development during the encystation process.
Here, for the first time, we reported there were 1,991 and 2,517 up-and down-regulated DEGs, respectively, during encystation process in five different stages (0d, 10d, 20d, 40d, and 80d) of PSCs by using RNA seq analysis method. Through enrichment analysis of DEGs, it was found that PSCs were mainly involved in vasopressin-regulated water reabsorption during the process of encystation. Our study identified a range of potential target genes of drugs and vaccines treatment and provided metabolism pathways that are primarily involved in the PSCs encystation process. These findings could facilitate the development of new intervention tools for the treatment and control of CE.

Materials and Methods
Protoscolex collection and RNA extraction E. granulosus hydatid cysts were collected from a naturally infected sheep liver, which freshly obtained from a slaughterhouse in Urumqi, Xinjiang, China. The hydatid fluid was aspirated aseptically using a 50 ml syringe, and then the PSCs were collected from hydatid fluid under aseptic conditions and washed 15 times with sterile PBS. Afterward, the PSCs were digested in 1% (W/V) pepsin for 30 min at 37 °C to release PSCs from the brood capsules. During the digestion, the samples were agitated every 5 minutes and observed under an inverted microscope until the tissues were digested completely.
Subsequently, the collected PSCs were washed with a solution of PBS that contained penicillin and streptomycin antibodies (Sigma, USA), and stained with 0.4% trypan blue until the viability rate was more than 90%. PSCs culture conditions were referred to Liu et al., 2018 [16]. The PSCs were cultured in RPMI 1640 medium (Gibco, USA) (pH 7.2), with 10% fetal bovine serum and 1% penicillin-streptomycin. The average number of PSCs in monophasic culture medium was 4000 per bottle, and the incubator conditions were 37 °C and 5% CO 2 . The culture medium was changed every 2 to 3 days. Different stages in culture medium were isolated based on the morphological classification described by Elissondo et al [17]. The isolated of PSCs on different stages were stored in liquid nitrogen for RNA extraction. PSCs tissues were collected from 15 batches of five different culture periods (three replicates each): EgPSC_0d, EgPSC_10d, EgPSC_20d, EgPSC_40d, and EgPSC_80d. Total RNA was extracted from all samples by using TRIZOL-Reagent kit (Invitrogen, CA, USA). The extraction was carried out according to the instructions, and the concentration and integrity of RNA was detected by ultraviolet spectrophotometer and agarose gel electrophoresis.

Construction of strand-specific libraries and RNA-seq
Conversion of mRNA from total RNA in each sample to obtain a strand-specific cDNA library was carried out according to the manufacturer's instructions, using the Illumina ® TruSeq ® RNA Sample Preparation Kit v2 (Illumina, CA, USA) and Oligo (dT) bead enriched PSCs mRNA. Combined different cDNA libraries according to the effective concentration and target data volume, before undertaking Illumina sequencing (Mig, Shanghai, China).

Data assembly and annotation
The raw reads sequences including linker sequences, low-quality sequences, long sequences, short length sequences and ambiguous N (unknown nucleotide). Thus the software SeqPrep (https://github.com/jstjohn/SeqPrep) was used to perform quality control of the raw data and obtain high-quality data (clean data) before analyzing [

Inter-samples correlation analysis
RSEM (http://deweylab.github.io/RSEM/), as a quantitatively analyzed the expression levels of genes and transcripts software, could be used to further analyzed the DEGs between different samples, and to reveal the regulation mechanism of genes by combining sequence function information [20]. Besides, RSEM distinguished which transcripts were different subtypes of the same gene through established the maximum likelihood abundance estimation model [21]. Venn diagram analysis was performed to display genes that specific and co-expression between samples.

Differential expression analysis
Since there were three biological replicates per period, the raw data were statistically analyzed directly using the DESeq2 software, which based on the negative binomial distribution [22]. EgPSC_0d was used as the control group, while other groups were the experimental groups, and a differential expression analysis of the genes from the samples was performed. Differentially expressed genes (DEGs) was threshold settings as previous studies (p-adjust <0.05&| log2FC | > = 1), transcript expression folds greater than 2 per group, and p-value less than 0.05, after multiple-test corrections, were selected [23].

Verification of RNA sequencing data by Q-PCR
12 DEGs were selected to verify the accuracy of the Illumina His seq X (Illumina, CA, USA) sequencing data, including six up-regulated DEGs and six down-regulated DEGs, respectively. The Primer 5.0 and Oligo 7.0 software were used to design and evaluate specific primers (Table 1). According to the previous study, Actin II was employed as an endogenous reference gene [24], and these primers were synthesized by Tsingke Biotech Company (Beijing, China). ALL 15 RNA templates that were used for RNA-seq were reverse transcribed to cDNA using Prime Script TM RT reagent Kit (Takara, Dalian, China) according to the manufacturer's instructions, individually. The final cDNA products were diluted 10 times with nuclease-free water before QPCR. QPCR was performed using SYBR Premix Ex Taq GC kit (Takara, Dalian, China). The reaction system comprised 10µl SYBR QPCR mix

RNA extraction and RNA-seq quality assessment
At the beginning of the in vitro culture, PSCs were invaginated, and the structure of the calcareous corpuscles and the apical hook could be observed under the inverted microscope (Fig. 1A). After 10 days of incubation, the morphology of the PSCs changed greatly, the apical hook and the sucker were evaginated (Fig. 1B). On day 20, the microcysts appeared with a thin laminated layer, and the small hook of scolex moved toward the central position of these micro-cysts, moreover, the calcareous corpuscles of the micro-cysts were observed clearly (Fig. 1C). After 40 days the micro-cysts and laminated layer were observed clearly, the apical hooks and sucker were not completely degraded ( Fig. 1D). On 80 days, the apical hook and sucker disappeared or degraded, the calcareous corpuscles decreased, and a considerable increment in the size of the cysts could be observed (Fig. 1E). Samples from three batches of PSCs specimens for five different culture periods (EgPSC_0d, EgPSC_10d, EgPSC_20d, EgPSC_40d, and EgPSC_80d) were collected. Total RNA extraction results showed that the RNA bands was slightly degraded, with no pigment contamination, obvious protein content, sugar, or other impurities.

Sequence assembly and annotations
Through the second-generation high throughput sequencing Illumina platform, the cDNA library forms each sample was constructed and sequenced, which average generated 51,300,044 clean reads representing a total of 7,597,846,677 (7.5 Gb) nucleotides of each sample ( Table 2). The GC content is one of the important characteristics of the genome base sequence, which can reflect the structure, function and evolutionary information of the gene [25]. Interestingly, the average GC content of the E. granulosus transcriptional level (46.44%) was similar to both its genome (42.1%) and coding regions (49.3%) [26].
Early studies estimated the S. mekongi, B. malayi and S. mansoni had a GC content of 34.1%, 30.5% and 35.3%, respectively [27,28], indicated that compared with other parasite species, the GC content of E.granulosus was higher. gene databases. Furthermore, some genes or transcripts could not be annotated and were defined as hypothetical protein, which suggested that the group of identified transcriptome dataset is sufficiently reliable for further characterization [29]. The length of the transcript was widely distributed, and the majority of the transcripts longer than 1,800 bp (34.65%) in length. In contrast, the transcripts length between 0 and 200 bp were the least (1.34%) (Fig. 2). The obtained transcriptome raw reads dataset has been submitted to the NCBI Short Read Archive (SRA) database under the accession number: SRP172517.

Veen diagram analysis and differential expression analysis between samples
Based on the expression matrix, inter-sample Venn diagram analysis was performed to obtain the co-expression and specific expression genes (Fig. 3A) or transcripts (Fig. 3B) between groups. Intriguingly, although a total of 14,903 genes and 32,401 transcripts were obtained by transcriptome sequencing at different developmental stages of PSCs in the encystation process, these genes and transcripts did not always present in every period of the cysts formation process. Conversely, the genes and transcripts were specifically expressed in different stages utilizing inter-sample Veen diagram analysis, especially at EgPSC_0d, the number of stage-specific genes and transcripts was the largest, 1079 and 1990, respectively. Secondly, the stage-specific genes and transcripts at the 20d stage were 277 and 704, respectively. While EgPSC_0d were the period when the original PSCs was just removed from the fresh sheep liver, and EgPSC_20d was the initial period of cyst formation, both of which were key periods of the encystation process.
In the previous transcriptome sequencing of the E. granulosus indicated that the presence of stage-specific genes among the cyst wall, PSCs and pepsin/H + -activated PSCs [12], which indicated that the stage-specific genes exist not only in different stages of the E. granulosus development but also in different periods of development.
For the DEGs analysis of each sample, EgPSC_0d was used as the control group compared with other groups (EgPSC_0d vs 10d, EgPSC_0d vs 20d, EgPSC_0d vs 40d and EgPSC_0d vs 80d). The parameter set as P-value cutoff of < 0.05 and an FC cutoff of > 2 to identify the significant DEGs, including 1,991 up-regulated and 2,517 down-regulated DEGs, respectively. A set of gene which composed of 1,991 up-regulated DEGs was established to provide the functional classification compared with the COG database. The result showed that most of the genes (25%) were classified into the poorly characterized category, which described as function unknown. Secondly, 144 genes (7.2%) were classified as cellular processes and signaling, which described as intracellular trafficking, secretion, and vesicular transport (Additional file 1: Table S1). Given that high levels of expression often indicate a fundamental role, these represent interesting goals for further research. The finding can open the way to the further studies.
Based on Veen diagram analysis, the total of 1,991 gene were up-regulated and 2,517 were down-regulated DEGs, we found that 52 genes were consistently on the rise during the process of encystation (Fig. 4A), while 152 genes have continued to decline (Fig. 4B).
Through cluster heat map analysis of 52 consistently up-regulated DEGs (Fig. 4C), showing that hypothetical protein (EGR_10197) and (EGR_05435) increased significantly at the late stage of encystation, and both of them were annotated as a hypothetical protein. Besides, gene1117 described as an integral component of the membrane and classified into cellular component term of the GO database. Moreover, 13 genes annotated to COG database among 52 DEGs and served many important biological functions, such as amino acid transport and metabolism, intracellular trafficking, secretion, and vesicular transport (Table 3). After analyzed the correlation coefficients between the 52 DEGs with FDR< 0.05 (Fig. 4D), a visualization network was obtained, which showed that cornifelin (EGR_03608), platelet glycoprotein (EGR_09887), potassium voltage-gated channel subfamily C member 3 (EGR_05864) and others are related to the expression of many other genes.
Another interesting found was that none of the genes remained up-regulated during the encystation process when we analyzed by Veen diagram analysis of the gene sets among EgPSC_0d vs 10d, EgPSC_10d vs 20d, EgPSC_20d vs 40d and EgPSC_40d vs 80d (Additional file2: Figure S1). Only ten genes exist at most of four gene sets simultaneously, and two genes of them continued up-regulated from EgPSC_10d to 80d, EGR_05443, and EGR_03870, respectively, and the other 8 genes were continuously up-regulated from EgPSC_0d to 10d and EgPSC_20d to 80d. In addition, of the 10 genes, except EGR_09887 and EGR_09469, which described as platelet glycoprotein and dynein light chain, respectively, the remain genes were described as hypothetical proteins (Table 4).

GO and KEGG enrichment analysis of DEGs
GO was a comprehensive database that aggregates all gene-related research results from around the world and it could standardize biological terms for genes and gene products from different databases, providing a uniform definition and description of gene and protein functions [30]. To obtain a better understanding of the enriched functions of 1,991 DEGs in the process of encystation of E. granulosus, GO term enrichment analyzed was performed. According to the GO classification of DEGs, there were 1094, 148 and 263 genes assigned to cellular component (CC), biological process (BP) and molecular function (MF) categories, respectively, of which 709, 0 and 122 genes were significantly DE (FDR value < 0.05) (Additional file 3: Table S2). The top three predominant terms for BP were endoplasmic reticulum unfolded protein response, O-glycan processing, and ion transport; for CC they were integral component of membrane, intrinsic component of membrane, and membrane part; and for MF they were phosphoric ester hydrolase activity, hydrolase activity, acting on ester bonds and beta-1,3-galactosyltransferase activity (Fig. 5).
KEGG is a large-scale database for systematic analysis of gene function, genomic information, and functional annotation. By analyzing the results using the information in  Table S3) (Fig. 6). Analysis of the top 20 most significantly enriched pathways showed that the map 04962, which described as vasopressin-regulated water reabsorption metabolic pathway was the most significantly enriched pathways (FDR value < 0.05) (Fig. 7), followed by pantothenate and CoA biosynthesis (map00770) and Hippo signaling pathway (map04391).
Since the morphological changed of the PSCs were greatest in the first 10 days, we performed GO and KEGG enrichment chord analysis the top ten of GO term and KEGG pathway with FDR value < 0.05 on the DEGs between EgPSC_0d and EgPSC_10d. After analyzed, 43 and 61 specific genes were obtained, respectively (Additional file 5: Figure   S2, Additional file 6: Figure S3).

Real-time quantitative PCR analysis
To validate the RNA-seq data and the analyses derived from them, 12 DEGs were selected to measure their relative expression level by QPCR, including six up-and six downregulated DEGs. The inter-sample statistical analysis was performed using T-test. 0.01<P<0.05 was considered significant (*), P<0.01 (**) and P<0.001 (***) were considered extremely significant. The QPCR data were compared with the relative expression level of the gene in the RNA-seq TPM value to confirm the reliability of the transcriptome sequencing results (Fig. 8).

Discussion
The parasite E. granulosus is a cestode tapeworm that acts as the causative agent of CE, one of the 17 neglected tropical diseases to be recently prioritized by the World Health Organization [15]. During its life cycle, hydatid cysts produce the pre-adult form, which has the ability to either differentiate into an adult worm (strobilation) in the terminal host (dog) or dedifferentiate into a secondary hydatid cyst in the intermediate host (human and livestock) [31]. Furthermore, the cysts may be developed in almost any internal organ or tissue by hematogenous dissemination, such as heart, bone and nervous system [32]. The early infection is asymptomatically, with the cyst volume increases, eventually causing physical damage result from mechanical compression of the surrounding tissues and organs [33]. The major sources of morbidity and mortality are pressure effects from cyst size, location in a sensitive organ, or cyst rupture caused by a spontaneous or external force (traumas or surgery) with subsequent anaphylaxis or dissemination of the secondary infection [34]. Thus, novel prevention and control strategies need to be developed based on basic and applied E. granulosus biology, especially genomic, transcriptomic and proteomic analysis. However, our understanding of key DEGs and the main metabolic pathways of PSCs in the encystation process is limited.
The previous study found that PSCs encystation generally appeared during the first week of incubation in vitro, and on day 20, some micro-cysts with a complete laminated layer were observed. Between day 38 and day 42, micro-cysts completely developed could be Here, we obtained a total of 32,401 transcripts and 14,903 genes, identified numbers stage-specific genes, and demonstrated that 1,991 and 2,517 genes were significantly upregulated and down-regulated, respectively. Genes encoding proteins involved in several signaling pathways, such as putative G-protein coupled receptor (GPCR), tyrosine kinases, and serine/threonine protein kinase were predominantly upregulated during encystation process of PSCs. These signaling pathways play major roles in key functions like movement, development, and reproduction in parasites [40]. Our findings are paralled to the previous studies of S. mekongi [27] and S.japonicum [41]. A previous study classified more than 60 putative GPCRs of E. granulosus as potential anthelminthic drugs target [14]. In this study, we filtered out coding putative GPCRs genes with low expression and un-expression through transcript sequencing, narrowed down to six genes (EGR_01295, EGR_00873, EGR_07838, EGR_08773, EGR_00585 and EGR_06296), in particular, EGR_00873 was significant up-regulate during EgPSC_0d to EgPSC_20d. In addition, EGR_10296, and EGR_09273, which encodes the tyrosine kinase and serine/threonine protein kinase, respectively, did not increase during the first 10 days of PSCs in vitro culture but showed a significant up-regulate trend at the later stage. Members of these pathways characterized here might represent novel drug targets for anti-parasitic intervention and require further study.
The vast majority of invertebrate species have shown an immune response that can inactivate and eliminate penetrating parasites, especially involves the formation of reactive oxygen species (ROS) [42]. ROS is a key trigger of the inflammatory activation of macrophages in malaria, simultaneously, it can damage proteins, carbohydrates, and DNA, which is harmful to the survival of parasites [43]. In order to neutralize the host's immune response and oxidative damage caused by oxygen free radicals, protective antioxidant proteins are produced by the body of parasites (such as schistosomiasis) [44]. In the present study, we identified three major antioxidant proteins of PSCs, including cytochrome c oxidase, thioredoxin glutathione, and glutathione peroxidase. Cytochrome c oxidase (EGR_08027), an up-regulated DEGs, has the molecular function of binding to heme and transporting protein in the mitochondrial electron-transport chain according to the COG database annotation, consistent with previous observations that it can regulate mitochondrial oxidative metabolism of E. granulosus [45]. Previous reports suggested that cytochrome c oxidase dysfunctions is associated with increasing mitochondrial ROS production as well as stimulating inflammation and apoptosis [46,47]. Hinted that in the face of external environmental pressure, cytochrome c oxidase of PSCs might play essential roles in the antioxidant defenses during the encystation process. Another wellcharacterized antioxidant defense is the thioredoxin glutathione system, which plays a critical role in maintaining the redox balance in the parasite, such as S. japonicum [48], E.
granulosus [12], T. brucei [49], and S. mansoni [44]. In this study, we filtered out three relative proteins through differential expression analysis, including thioredoxin (EGR_6727), thioredoxin related transmembrane protein (EGR_01550) and thioredoxin domain containing protein (EGR_09201). Lastly, glutathione peroxidase (EGR_06748), which presented continuous high expression of PSCs during the encystation process, involved in clearance of cytotoxic and genotoxic compounds and protection against oxidative damage [50]. Studies have reported that glutathione peroxidase of E. granulosus displayed the ability to bind non-substrate molecules, particularly anthelmintic drugs [51], and it acts as an essential enzyme for the survival of the schistosomiasis in the redox environment has been actively explored as a potential drug target [52].
Here, we analyzed the key genes previously identified by genomic [26], proteomic [53] and transcriptomic [12] studies of E. granulosus, and found that some genes were expressed significant differences in the encystation process of PSCs. For example, Heat shock protein, which has important roles in posttranslational modification, protein turnover, and ATP binding, was reported that coding gene EGR_09650 was only expressed in adult worms, while EGR_10561 was highly expressed in the hydatid cyst membrane [26]. However, in this sequencing, we found that both of them are present in the PSCs. It is worth noting that EGR_10561 is the gene with the consistently highest expression among all genes coding heat shock proteins during the encystation process of PSCs.
Cadherin, which has important roles in cell adhesion and recognition, overexpression of cadherin induced parasite aggregation in T. vaginalis [54].
Although more than forty genes were found encoding cadherin by sequencing, all the genes except EGR_06182 showed low transcription expression levels.
Glycosylphosphatidylinositol (GPI)-anchored wall transfer protein, despite it is important and exclusive to E. granulosus, the expression level of the encoding gene (EGR_06221) is low during the PSCs encystation. One possible reason for this is the heterologous proteins is covalently anchored to the PSCs cell wall by fusing them with the anchoring domain of GPI-anchored cell wall transfer protein [55], but in this study, PSCs was cultured in vitro and there was no heterogeneous source protein. Furthermore, tetraspanins are transmembrane proteins previously described as potential vaccine candidates for other helminth infections and are also found in the membranes of the tegument and extracellular vesicles of O. viverrini [56]. In some organisms, their role is associated with virulence and pathogenesis [57]. A wide variety of molecules have been found to be released by E. granulosus, such as tetraspanin proteins, channel protein that transport lipids, and nucleic acids [58,59]. Here, we identified two encoding tetraspanin genes (EGR_11042 and EGR_06311), showed a strikingly up-regulate trends during the encystation of PSCs, we speculated that the overexpression of genes related to membrane protein transport in PSCs might be a compensatory mechanism by these worms to adapt to the external environments to satisfy their own continuously encystation development processes.
The most highly enriched KEGG pathway involved in 1,991 up-regulated DEGs of PSCs during the encystation process was vasopressin-regulated water reabsorption metabolic pathways (map 04962), which include a total of 24 DEGs (Additional file 7: Table S4).
Dynein light chain (DLC), ras-related protein, synaptobrevin-like protein and aquaporin 4 (AQP 4) was specifically upregulated in the encystation process of PSCs. The vasopressinregulated water reabsorption metabolic pathways have been reported to play a critical role in regulating water, urea, and sodium transport to keep the water balance of body [60]. However, due to the special structure and life history of the parasite, there is no aquaporin 2, vasopressin and its corresponding V2 receptor (V2R) genes in PSCs. Previous reports indicated that DLC is involved in the process of protein, carbohydrate and other substances transport and retrograde vesicle transport [61]. Ras-related protein are involved in the process of vesicle transport after activation, parasites can secrete extracellular vesicles to achieve intercellular communication, and transfer biologically active molecules to the host cells in order to modulate host immune response [62]. Hence, vesicle transport is considered an important pathway for substances transport and information transmission between parasites and external environment [53]. In this study, we detected the presence of AQP4 in the cell membrane of PSCs, and eukaryotes usually permeate water and regulate water reabsorption by activating AQPs, such as S. japonicum

Acknowledgments
We are grateful to the editor and anonymous reviewers for their helpful comments and constructive suggestions.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and material
RNA-Seq reads have been deposited in the National Center for Biotechnology Information (NCBI) Short Read Archive (SRA) database under the accession number: SRP172517.    Table 2 Sequencing data statistics  Table 3 COG classification statistics of the consistently up-regulated differential expressed genes  Table 4 Statistics on the expression of 10 consistently up-regulated genes   Assembled transcript length distribution of E. granulosus protoscoleces.   Gene Ontology (GO) transcript classification statistics. The ordinate represents the GO term, and the abscissa above represents the number of genes compared to the GO term, which corresponding to the different points on the broken line.
The lower abscissa represents the significance level of enrichment, corresponding to the height of the column. Among them, the smaller FDR is, the larger the -log10 (FDR) value, and the more significant the GO term is.  Vasopressin-regulated water reabsorption. All the products with color background in the figure belong to KEGG annotation results of this sequencing, in which the yellow background represents the reference gene, and the red border indicates that the genes in the analyzed gene set are annotated.