Sequencing and assembly
Illumina sequencing of the nine D. marginatus adult female tick samples resulted in 460,171,414 raw reads (Supplementary Table 2). After removing low quality reads, 449,349,046 valid reads accounted for 97.68% of the raw reads. De novo transcriptome assembly was performed using Trinity software, resulting in a total of 100,644 putative transcripts (N50 1,383), clustered into 30,251 unigenes. A summary of the assembly is shown in Table 1. Regarding sequence length, all unigenes are longer than 200 bp. The longest is 24,550 bp. Fifteen percent of assembled unigenes were above 2,000 bp, and many of the unigenes were between 300 and 1,000 bp (39.04%) (Fig. 2A). Principal component analysis (PCA) of the assembly revealed biological replicates grouped together separately showing acceptable variation within groups (Fig. 2B). It is notable that this is the first transcriptome analysis of D. marginatus produced using RNA-Seq. The assembly results of this study have augmented sequence information of D. marginaus. The transcriptome data used in this analysis have been submitted to GEO (Gene Expression Omnibus, http://www.ncbi.nlm.nih.gov/geo/) database of NCBI under accession number GSE151667.
Functional annotation of the assembled unigenes were performed by matching the 30,251 unigene sequences against 6 databases. Nr database hit the highest match up to 32%, while Swissprot database hit 22.56%, relatively less than the other database (Supplementary Table 3). Among the 9,672 annotated unigenes, 58.9% of them showed homology with sequences of tick species, and over half of the unigenes matched I. scapularis sequences (Fig. 3). The remaining 20,579 unigenes (68.03%) were classified as unknown because they did not produce matches within the searched databases, indicating that many new genes and non-coding RNA sequences had been identified [37, 46]. Genome data is very important to tick transcriptome studies. Recently published tick genome study described 6 tick species that is common in the Old Continent . These genome data are valuable in studying tick gene function, system evolution, and symbiotic relationship of the pathogen in different tick species. Among the genome data, Dermacentor silvarum Olenev is a closely related tick species with D. marginatus. We tried to use the D. silvarum genome as reference to analyze the RNA-seq data. To our surprise, the ratio of genomic alignments is low, thus the current de novo transcriptome analysis is more suitable for the study (Supplementary Table 4). The reason that most of hits matched with I. scapularis rather than D. silvarum which was the Dermacentor tick species sequenced probably because genomic annotations of I. scapularis sequences was available in the NCBI database , and most of the conceptually annotated gene sequences of D. silvarum were not available yet. For identifying potential tick proteins in the unknown unigenes, the SignalP prediction was performed and the results indicated that 320 out of 30,251 have signal peptide sequence, while 150 were among the unknown unigenes. The SignalP prediction results indicated that many potential tick proteins encoding genes are still waiting to be uncovered. The annotation table with signal peptide prediction results was uploaded as Supplementary Table 5.
Gene Ontology (GO) classification
The annotated sequences were next functionally characterized using the information available in GO classification that considers 3 categories. Initially, genes were classified according to their molecular function, cellular component, and biological process using the GO terms from the GO database (Fig. 4). Only the categories represented by more than 30 genes are depicted, each including the number of the genes assigned.
The most abundantly annotated functions were binding activity, such as protein binding activity (n=211), ATP binding activity (n=185) and metal ion binding (n=183) which coincided with transcriptomes of other ixodid tick species [47, 48]. The remaining annotated functions were as follows: structural constituent of ribosome (n=210), RNA binding (n=182), oxidoreductase activity (n=170), zinc ion binding (n=125), DNA binding (n=121), hydrolase activity (n=106), nucleic acid binding (n=98).
Gene classification according to cellular component resulted in up to 15 categories represented by more than 100 genes. In the cellular component category, the most abundant GO term was cytoplasm, assigned to 17.1% (n=664) of the total GO annotated unigenes. Many GO terms related to nucleus were abundant either (n=555). Membrane (n=327) and integral component of membrane (n=390) were next following.
Gene classification based on biological process resulted in 25 categories. Notably, the most abundant (n=171) was translation process followed by categories corresponding to oxidation-reduction process (n=141), proteolysis (n=120), regulation of transcription (n=91), and metabolic process (n=90). The remaining annotations of biological process that contained less unigenes were distributed into twenty categories. The GO terms for molecular function, cellular compartment and biological process assigned to all the genes annotated are showed in Supplementary Table 6.
Differential gene expression profiles
Blood-feeding and long-term starvation caused significant differential gene expression of D. marginatus. The volcano plots were applied to demonstrate the overall gene expression profiles (Fig. 5). Differentially expressed genes (DEGs) were almost one third of the overall unigenes and the down-regulated unigenes slightly outnumbered up-regulated unigenes between each compared group (Supplementary figure 1) indicating the three different states of D. marginatus female adults were experiencing significant physiological function shift which was in line with many Ixodidae transcriptome studies [18, 23, 49]. It is considered that H tick representing starved ticks experienced 6 months non-feeding period exhausting its energy sources of lipid, carbohydrate, and protein, whereas M tick stands for newly molted ticks of which physiological status was at its prime with many of the energy metabolism-related genes down-regulated . After mating, a D. marginatus female adult could ingest over 100-fold on its body weight in blood . The blood meal was digested and the nutrients were transported and mostly used for embryogenesis . Both blood-feeding and long-term starvation would cause significant differential gene expressions.
For further interpretation of systemic functions, the D. marginatus dataset was mapped to KEGG pathways. The 15 pathways most enriched in specific stages of female adult ticks were presented in Fig. 6. The pathways enriched significantly consisted of protein processing, glycan biosynthesis, lipid metabolism, detoxification, and defense responses.
Currently, proteins in blood meal was the main source of nutrient for oviposition and the energy source for resisting long-term starvation [18, 19, 50]. Up-regulated genes in the three stages potentially involved in protein metabolism are listed in Table 2 and Supplementary Table 5. Protein families of proteases, such as cysteine peptidases, aspartic endopeptidases, metallopeptidases and serine peptidases were significantly up-regulated in F tick group (Fig. 6A) suggesting that the proteolytic system was active . The results were quite similar to other transcriptome studies [22, 37]. The intracellular digestion of the proteins ingested with blood is performed by a group of lysosomal proteolytic enzymes that act sequentially and have been studied in detail in the family Ixodidae [27, 29, 51, 52]. It has been demonstrated that haemoglobin is the source of haem group for the development of the embryo in ixodids, and it is essential for a successful hatching of eggs spawned by engorged female ticks . In H tick group proteolysis enhanced, and pathways associated with diseases were underlined (Fig. 6C). In starvation, there was an up-regulation of pathways related to transcription and translation processes that are energetically expensive. The result is in consistence with a previous study on Dermacentor variabilis Say . It was probably caused by excessive consumption of nutrient substances so as to expend proteins in its cellular structure to maintain basic life support . In M tick group, pathways of steroid biosynthesis, protein export and N−Glycan biosynthesis were significantly enriched (Fig. 6B). New adult ticks emerging from engorged nymphs have consumed their last meal taken in their nymphal state. Host blood components, especially hemoglobin, were digested, and subsequently biosynthesized for new proteins acting on hormones and chitin biosynthesis . As tick experience long-term starvation, the energy sustaining life comes mainly from consumption of proteins in tick midgut [18, 54, 55].
In invertebrates, lipids perform several functions including serving as constituents of cellular structures, hormones, energy, and play roles in egg production and metamorphosis [56, 57]. Lipid metabolism is of crucial importance in D. marginatus, approximately 150 unigenes annotated were identified with a putative function in lipid metabolism. We selected 49 unigenes that are differentially expressed in blood feeding and starvation (Table 3, Supplementary Table 5). The KEGG pathway enrichment revealed that sphingolipid metabolism, synthesis and degradation of ketone bodies, steroid hormone biosynthesis, arachidonic acid metabolism were related to lipid metabolism after blood feeding. The unigene annotated as vitellogenin (DN57762) in F tick group was significantly up-regulated compared both to H tick and M tick group. Vitellogenin is essential for egg development and oviposition, and has been shown to play a role in heme sequestration [58, 59]. In D. variabilis, it has been demonstrated that Vg expression increases after engorgement and Vg is exclusively expressed in fat body and gut cells of vitellogenic females but not in the ovary . Beside vitellogenin, several apolipoproteins belong to the low-density lipoprotein family were up-regulated. Corresponding to the lipoprotein, the low-density lipoprotein receptor (LDLR) were up-regulated as well during blood feeding, which are the major cholesterol-carrying lipoprotein of plasma, acting to regulate cholesterol homeostasis in cells . In ticks the LDLR binds LDL and transports it into digestive cells . It has been considered that arthropods lack the set of enzymes to synthesize cholesterol, so they are obliged to obtain cholesterol from their hosts . The unigenes involved in cholesterol metabolism were insulin induced protein (DN55975), high-density lipoprotein (DN55629), Niemann-Pick type C1 (DN34326) and Niemann-Pick type C2 (DN53284) during blood feeding, and their expression profile indicated lipid metabolism was very active in newly molted ticks. In non-ruminants the INSIG1 gene modulates cholesterol metabolism, lipogenesis, and glucose homeostasis with a cellular localization on endoplasmic reticulum membrane . High density lipoprotein transports diglyceride from the fat body cell into the hemolymph, and is involved in the transport of cholesterol from the gut into the hemolymph in insects [65, 66]. Niemann-Pick type C1 encodes a large membrane glycoprotein with mostly a late endosomal localization. Niemann-Pick type C2 encodes a small soluble lysosomal protein with high affinity binding to cholesterol. Both proteins are in intracellular regulation of cholesterol metabolism, and their efficiency determine the processing and utilization of endocytosed cholesterol . Many unigenes related to sphingolipid metabolism were up-regulated after blood feeding (Table 3). Sphingolipids are one of the major classes of eukaryotic lipids and were appreciated as components of the plasma membrane and as modulators of cell–cell interactions and cell recognition . Other up-regulated unigenes involved in lipid transport were the oxysterol-binding protein, hemelipoglycoprotein, fatty-acid binding protein, microsomal triglyceride transfer protein (Table 3), in which several were up-regulated after long-term starvation. Unlike absorption of lipids during blood feeding, the female adult ticks undergoing long-term starvation were in steady consumption of lipid reserves . Compared to newly molted and engorged ticks, starved ticks presented most of the genes down-regulated regarding lipid metabolism, whereas, a number of genes related to hormone synthesis and wax formation were up-regulated in H tick and M tick groups indicating these genes might play a role in survival of D. marginatus during off-host period.
Carbohydrates as proteins and lipids ingested with host blood are the essential nutrients for ticks. A recent study demonstrated the major pathways involved in carbohydrate metabolism with details indicating blood glucose is an important nutrient I. scapularis . In the D. marginatus transcriptome analyzed in this study, several carbohydrases, carbohydrate transporters and cuticular proteins have been identified (Table 4, Supplementary Table 5). Up to 55 unigenes on carbohydrate metabolism were differentially expressed (35 in F tick group, 9 in H tick group and 11 in M tick group), covering biological process such as carbohydrate phosphorylation, protein glycosylation, glycogen catabolic process, galactose metabolic process. The results indicated that the up-regulated unigenes encoding carbohydrate-metabolizing enzymes were similar with the genes identified in the transcriptomes of Ixodes ricinus Linnaeus , Haemaphysalis flava Neumann , D. variabilis , and in Argasidae of Ornithodoros moubata Murray  and Ornithodoros erraticus Neumann . According to the annotations, the up-regulated unigenes were associated with the metabolism and transport of several carbohydrates including glucose, fructose, mannose, galactose, maltose, idose, malate, and chitin (Table 4).
In F tick group most up-regulated unigenes on carbohydrate metabolism were glycosyl hydralase (DN53826), mannose-binding lectin (DN36723), beta-galactosidase (DN53710), and α-L-fucosidase (DN57768). Glycosyl hydralase is a kind of enzyme that hydrolyzes glycosidic bonds, and plays an important role in the hydrolysis and synthesis of glycoconjugates and glycosidic compounds . In invertebrate mannose-binding lectin was related to specific pattern recognition of the complement system via activation of the lectin pathway [71, 72]. β-galactosidase is widely found in various animals, plants and microorganisms that hydrolyses the β-glycosidic bond formed between a galactose and its organic moiety [20, 63]. The data on a digestive α-L-fucosidase activity of Amblyomma cajennense Fabricius is described in a study and a plausible inference was present that the up-regulation of the gene was probably related to removal of fucose produced by microorganisms in tick gut [73, 74]. In our study, the α-L-fucosidase was over-expressed both in F tick and H tick group.
In H tick group, most up-regulated unigenes on carbohydrate metabolism were galectin (DN57588), glucose 6-phosphate dehydrogenase (DN56097) and 6-phosphofructo-2-kinase/fructose 2,6-bisphosphatase (DN57996). Galectins are a glycan-binding superfamily proteins (lectins) that plays an important role in insect and tick development, interaction with pathogens, and the innate immune system by recognizing repeating saccharide units found on microbial surface glycoproteins in immunity, respectively [75, 76]. Glucose 6-phosphate dehydrogenase is a NADPH-producing enzyme that was of critical role in the oxidative stage of the pentose phosphate pathway . A study on Rhipicephalus microplus G6PDH showed four transcripts differentially expressed in engorged and/or unfed adults in which G6PDH-D showed up-regulation as the one found in this study . The 6-phosphofructo-2-kinase/fructose 2,6-bisphosphatase (PFK-2/ FBase-2) is a bifunctional enzyme that catalyzes either the synthesis (PFK-2 activity) or the degradation (FBase-2 activity) of fructose-2,6-bisphosphate . The results revealed that up-regulation of PFK-2/ FBase-2 in H tick after long-term starvation might be related to fructose-2,6-bisphosphate.
In M tick group most of the unigenes related to carbohydrate as energy were not significantly up-regulated, while unigenes on chitin synthesis and binding were significantly up-regulated (Table 4). In the transcriptome of D. marginatus, up to 30 unigenes were matched with PF00379 domain and another 30 unigenes were matched with PF01607 in the Pfam database. These two chitin binding domains were generally found in arthropods . The majority of them annotated as structural constituent of cuticle, peritrophic membrane chitin binding protein, and conserved hypothetical protein were up-regulated in M tick group, and the highly expressed unigenes were cuticular proteins indicating that D. marginatus female adult shortly after completion of metamorphism were highly expressed, while the functional annotations of these proteins were inadequate. The chitin binding peritrophic-A domain (PF01607: CBM14) may be involved in formation and maintenance of peritrophic membrane  (Table 4, Supplementary Table 5). Peritrophic membrane is considered having the function of protecting microvilli of midgut epithelial cells from mechanical damage, pathogens and toxic substances, and acts as a semipermeable barrier allowing transport of small molecules and nutrients . The structure of tick peritrophic membrane has been described in I. scapularis , I. ricinus , Haemaphysalis longicornis Neumann  and Ixodes dammini Spielman . The chitin binding cuticular protein (PF00379) includes an amino acid motif which functions to bind chitin, but the protein shows no sequence similarity to the known chitin-binding domain (PF01607: CBM14) found in chitinases and some peritrophic membrane proteins . Cuticular protein comprises highly organized structural products that is formed as a layered and extracellularly secreted protein from the epidermis of insects . The up-regulated unigenes encoding cuticular proteins in newly molted ticks could be related to sclerotization and melanin formation of the tick surface cuticle, as a previous report has described at length that the two process may occur in concert in insects .
Ticks own effective defense mechanisms to protect from pathogenic microorganisms and to maintain the intestinal microbiota at a tolerable level . In the D. marginatus transcriptome 13 up-regulated unigenes were among 27 unigenes annotated with immune functions putatively encoding defensins, lysozyme, alpha-macroglobulin, and hemolectin (Table 5, Supplementary Table 5). Regarding defensins, there were up-regulations shown in all three states of D. marginatus transcriptome data. The most highly expressed defensin was DN36307 (TPM: 2888.17) in H tick group. Defensin is an antibacterial peptide that is the major components of innate immunity in ticks and protect ticks from gram-negative, gram-positive bacteria, and plasmodium [90, 91, 92]. Lysozymes are effective peptides which plays a role in carbohydrate digestion suppressing the proliferation of Gram-negative bacteria in the midgut . Alpha-macroglobulin plays an important role in phagocytosis of microbes, and may specifically be involved in phagocytosis of a certain genus of microbes in ticks . Several α-macroglobulins were up-regulated in F tick group after blood feeding indicating the ticks were trying to control microbial populations in its system. Besides α-macroglobulins, a hemolectin was up-regulated in F tick group which is a major clotting factor in insects [95, 96], and also has a function help insect immune system against bacterial infection .
Ingestion of large amounts of host blood and the digestion of the blood meal increases redox pressure in ticks. Therefore, detoxifying mechanisms are required for counteracting redox pressure in ticks . There were 64 up-regulated transcripts coding for proteins with oxidoreductase activities, chaperone proteins, and proteins involved in iron and hemoglobin metabolism (Table 5). Most of the up-regulated unigenes were in F tick group. In the differentially expressed unigenes, a glutathione S-transferase (GST) gene (DN56344) was highly expressed both in F tick (TPM: 2285.92) and H tick (TPM: 1869.90) groups. GSTs are known as genes of a superfamily that are involved in the detoxification of endogenous and xenobiotic compounds . Blood-feeding increased cellular stresses, and the majority of the GSTs were up-regulated except two omega class GSTs (GSTO) that were up-regulated in M tick group (DN51816; DN50592). The GSTOs were characterized in insects, and are over-express under stress response and at the presence of insecticides [100, 101]. Other highly expressed antioxidant proteins included catalase (DN57352; TPM: 624.90), protein disulfide isomerase (DN52494; TPM: 781.58) and glutathione peroxidase (DN40806; TPM: 1434.27).
Heat shock proteins (HSP) work as molecules involved in correction of protein folding which were over-expressed under heat shock and other stress responses as ticks ingesting host blood with elevated temperature, dealing with environmental stress, and neutralizing damage caused by toxic substances ingested with blood meal or produced during digestion of blood meal [102, 103]. We selected 16 differentially expressed HSPs including HSP90, HSP70, HSP60 and HSP20 (small HSPs) (Table 5). After blood feeding, The HSP90 (DN49937; TPM: 312.01) over-expressed in F tick group which was 5.8 folds the expression of the gene in H tick group and 3 folds the expression of the gene in M tick group. Notably a HSP20 (DN58874) was exclusively expressed in H tick group. Two highly differentially expressed HSP70s (DN38789 and DN48195) were in F tick group. Significantly up-regulated and highly expressed HSP20s were also found in F tick group implying up-regulation of these genes may induced by feeding and blood meal digestion , while up-regulation of unigenes including HSP60 (DN55315 and DN33794) and HSP20 (DN49326, DN46199 and DN44693) in H tick group were likely caused by exhaustion of energy substance and environmental stress after long-term starvation [104, 105].
Ticks acquire iron and heme from host blood, while ticks can’t utilize heme for bioavailable iron . In the iron metabolism ticks would have to bear redox pressure from acquiring iron from host transferrin . In the transcriptome of D. marginaus two unigenes annotated on iron metabolism were significantly up-regulated in F tick group including ferritin 2 (DN51538) and transferrin receptor (DN54992) (Table 5). Tick ferritin 2 is a unique secreted protein that is conserved in ticks , and tick ferritin 2 is a gut-specific protein which is secreted into the hemolymph transporting bioavailable iron in between organs . The iron in host transferrin was released by lysosome in tick midgut through endocytosis that was triggered when transferrin receptor conjugated with host transferrin [106, 108], but further detailed research is needed to characterize molecular function of tick transferrin receptors. Generally, feeding and digestion of haemoglobin in blood meal released large amounts of heme and subsequently caused the up-regulation of antioxidant genes, whereas exhaustion of energy reserve would increase reactive oxygen species, which in turn would exacerbate starvation-induced autophagy .
RT-qPCR Validation of RNA-seq data
In order to validate RNA-seq results, the expression level of six selected up-regulated genes were assessed by RT-qPCR, using ef-1α (DN43634) as reference. The selected genes encoded the following proteins: ferritin 1, ferritin 2, HSP70, galectin, GST, and Dm86 (Fig. 7). For checking the suitability of the primer pairs, the amplification products were electrophoresed in agarose gel, and the density of the bands of expected size were measured for all genes, including the housekeeping genes. The comparison between the RT-qPCR and RNA-seq results confirmed that all genes showed a similar trend in transcriptional expression changes, the normalized fold change values of all unigenes by ef-1α (DN43634) were recorded in Supplementary table 5. Transcripts encoding ferritin 1, HSP70, GST, galectin and Dm86 showed rather similar expression patterns, while, ferritin 2 did not show differential expression in RT-qPCR (Fig. 7). The less consistent result obtained from RT-qPCR and RNA-seq may be due to several reasons including variability in the expression of the reference gene in different biological samples or differences in data standardization method between RT-qPCR and RNA-Seq analyses .
The biological significance of the gene expression profiles is important, but the gene expression alone could not characterize them, as further investigations are required. In total, the differences among the gene expression profiles indicate that the transcriptome of D. marginatus female ticks are informative, which may provide data for further research on this tick species.