Genome-Wide DNA Methylation Variations Between Grass Carp with Different Ages

Background: Grass carp is an important farmed sh in China that infected by many pathogens, especially grass carp reovirus (GCRV). Notably, grass carp showed age-dependent susceptibility to GCRV, while the mechanism remains unclear. Herein, we performed a genome-wide survey of differences in DNA methylation and gene expression between ve months old grass carp (FMO, sensitive to GCRV) and three years old grass carp (TYO, resistant to GCRV) aim to uncover the mechanism. Results: Colorimetric quantication revealed global methylation level of TYO sh was higher than that of FMO sh. Whole-genome bisulte sequencing (WGBS) of two groups revealed 6,214 differentially methylated regions (DMRs) and 4,052 differentially methylated genes (DMGs), with most of DMRs and DMGs showed hypermethylation patterns in TYO sh. Correlation analysis indicated that DNA hypomethylation in promoter negative correlated with gene expression, whereas positive correlation was found between gene-body DNA hypermethylation and gene expression. Enrichment analysis revealed that promoter hypo-DMGs in TYO sh were signicant enriched in pathways involved in immune response while gene-body hyper-DMGs in TYO sh were signicant enriched in terms related to RNA transcription, biosynthetic, and energy production. RNA-seq indicated these terms or pathways involved in immune response, biosynthetic, and energy production also signicant enriched for the up-regulated genes in TYO sh. Conclusions: Collectively, these results revealed the genome-wide DNA methylation variations between grass carp with different ages. DNA methylation and gene expression variations in genes involved in immune response, biosynthetic, and energy production may contributed to the age-dependent susceptibility to GCRV in grass carp. Our results will provide important information for the disease-resistant breeding programs of grass carp and may also benet to the research of age-dependent diseases in human. resulting single-strand DNA fragments were PCR amplied using KAPA HiFi HotStart Uracil + ReadyMix. Library concentration was quantied by Qubit® 2.0 Flurometer and quantitative PCR, and the insert size was assayed on Agilent Bioanalyzer 2100 system. Libraries were sequenced on an Illumina Hiseq X Ten platform and 150 bp pair-end reads were generated.


Background
Grass carp (Ctenopharyngodon idellus) is an important farmed sh in China, which accounts for more than 18% of total freshwater aquaculture production in this country. Production of grass carp reached 6.01 million tons in 2016, implying the great commercial value of this sh [1]. Nevertheless, the grass carp hemorrhagic disease that caused by grass carp reovirus (GCRV) is an important threat for the aquaculture of grass carp [2]. Consequently, grass carp and the virus are of particular interest to sh breeding geneticists aiming to identify strategies for disease-resistant breeding [3][4][5][6][7]. Our previous study revealed that age is an important factor determined the susceptibility to GCRV infection.
Grass carp no more than one year old was sensitive to GCRV, while the sh over three years old was resistant to the virus [9]. Understanding the mechanism will be of bene t to grass carp disease-resistant breeding programs.
In mammals, age is also a crucial factor that affects disease outcomes [10]. Pseudorabies virus (PRV) infection caused more severe clinical disease in newborns than in older individual [11]. Infants and young children are more sensitive to the reovirus induced encephalitis than adults [12]. Mice display an age-dependent acceleration of mortality to In uenza virus In uenza virus (IAV) infection and are useful to model human aging and the outcomes to IAV infection [13]. Further research suggested that the maturation of the innate immune system and the expression of type I IFN genes may contribute to age-related susceptibility to virus infection [14], whereas why the these genes existed different expression patterns between young and adults remain unclear.
The age-dependent susceptibility to virus infection was also reported in several sh. Nervous necrosis virus (NNV) infection in barramundi (Lates calcarifer) caused nervous necrosis at 3 and 4 weeks of age, whereas developed subclinical symptom when sh age was more than 5 weeks [15]. Spring viremia of carp virus (SVCV) challenge in north American sh species revealed that sh of younger age classes were more vulnerable to SVCV infection than older sh [16]. Nevertheless, the mechanism in sh remains largely uncovered.
DNA methylation is an important mediator of gene expression regulation and many genes were reported to be in uenced by DNA methylation at different ages. TAP1 promoter methylation level was decreased while mRNA expression was increased in Sutai piglets from birth to weaning age (8, 18, 30, and 35-days old), which may contribute to the piglets resistant to Escherichia coli F18 at 35 days of age [17]. Genome-wide DNA methylation analysis of CD4 + and CD8 + T cells from younger and older individuals revealed increased number of methylation changes and higher methylome variation in CD8 + T cells with age, implying the link between age-related epigenetic changes and impaired T cell function [18]. Genome-wide DNA methylation survey of skeletal muscle between young and old healthy human showed a dynamic inter-relationship between DNA methylation, gene expression, age, and exercise [19]. Therefore, it seems like that DNA methylation play an important role in agerelated gene expression and is thought to underlie many age-related physiologic phenomena, for example, the age-dependent susceptibility to virus infection.
In the study, we performed a genome-wide survey of differences in DNA methylation and gene expression in spleen tissues between two age stages in grass carp: ve months old grass carp (sensitive to GCRV) and three years old grass carp (resistant to GCRV). Global DNA methylation patterns and gene expression pro les were investigated between two sh with different ages. The differentially methylated regions (DMRs) and differentially expressed genes (DEGs) were identi ed between two groups. Moreover, the correlation between DNA methylation and gene expression were investigated. We believe our results will provide important information for the disease-resistant breeding of grass carp and may also bene t to the research of age-dependent diseases in human.

Results
Global methylation pro le and whole-genome bisul te sequencing Epigenetic modi cations, such as DNA methylation level, have been shown to progressively accumulate during growth and then decreased in the ageing process [20]. Therefore, we analyzed the global DNA methylation status of spleen samples from two groups by colorimetric quanti cation of 5-methylcytosine. As shown in Fig. 1a, the global methylation level of TYO sh was higher than that of FMO sh, although this difference is not signi cantly. The gradually increase of global DNA methylation level in TYO sh was in accordance with the reports in human, implying the potential role of DNA methylation in age-related gene regulation.
To further reveal the mechanism underlying age-dependent susceptibility to GCRV in grass carp, we performed whole-genome bisul te sequencing (WGBS) on spleen samples collected from FMO and TYO sh at base-pair resolution. Three duplicates were processed for each group, yielding a total of 6 libraries. All libraries were sequenced on an Illumina Hiseq X Ten platform to generate 150 bp pair-end reads. Each library yielded > 33.68 GB clean base and > 23.06 × mean coverage depth. The bisul te conversion rates for all libraries were over 99.88%.
Moreover, for all libraries, more than 82.78% genomic sites were covered by at least ve unique reads and more than 70.70% sites were covered by at least ten unique reads (Table 1). Pearson correlation analysis and principal component analysis of the 6 libraries showed that samples from each group clustered together (additional le 1: Figure S1). Collectively, these results con rmed the high quality and repeatability of the WGBS data and suitability for further analysis. The sequencing data have been deposited in the Sequence Read Archive (SRA) at the National Center for Biotechnology Information (NCBI) (accession number: PRJNA638254). while the methylation rates of the cytosines in CHG and CHH contexts (where H is A, C, or T) were on more than 0.10% and 0.09%, respectively (additional le 2: Table S1). Among the mC sites that identi ed, more than 98% were in the CG contexts, while no more than 0.4% and 1.5% in CHG and CHH contexts (Fig. 1b). The genome sequence preference proximal to the sites of methylated CG (mCG), methylated CHG (mCHG), and methylated CHH (mCHH) contexts were also analyzed. It was found that no sequence preference in the mCG-anking regions or upstream of mCHG and mCHH contexts; however, the base following mCHG and mCHH contexts was almost always an adenine, following by thymine, while cytosine was observed less often (Fig. 1c). Moreover, the methylation level (ML), de ned as the proportion of reads covered each mCs relative to the total reads covered the sites, was calculated. Results showed that more than 80% of the mC sites displayed high ML (ML ≥ 70%) ( Fig. 1d).

Gene methylation pro le
To characterize methylation pro le of grass carp genes, the relative ML in the contexts of gene regions and of their upstream and downstream regions was calculated. In general, the relative ML of mCGs in the gene-body regions was higher than in the 5 ' upstream and 3 ' downstream regions in both group, while the ML in mCHG and mCHH was more complex (Fig. 2a). Interestingly, a sharp decrease of ML was observed across the boundaries of gene-body regions and upstream or downstream regions (Fig. 2a). We further divided the genome into different elements that including promoter, 5' untranslational regions (UTR), exon, intron, 3' UTR, and repeat. Results suggested that promoter and 5' UTR had relatively low ML in mCG, following by the exon, while the repeat had the highest ML (Fig. 2b, 2c). Moreover, the heat map of genomic elements suggested that genomic elements from the same group had the same methylation trends and a clear separation was observed between two groups, further implying the repeatability of duplicates and the distinction between sh with different ages (Fig. 2c).

DMRs between FMO and TYO sh
To further investigate the difference between two sh, the data of TYO sh group were compared with FMO sh group, and the DMRs were identi ed by using the DSS software. Due to most of mC sites were occurred in CG contexts, thereby only DMRs in CG contexts was considered for further study (Fig. 1b). A total of 6,214 DMRs were identi ed between the two sh groups, including 5,261 hyper-DMRs (hypermethylation in TYO sh compared with FMO sh) and 953 hypo-DMRs (hypomethylation in TYO sh compared with FMO sh) (additional le 3: Table S2). Apparently, the number of hyper-DMRs was more than that of hypo-DMRs. Interestingly, we also found that the mean ML of DMRs in TYO sh was higher than FMO sh (additional le 4: Figure S2a).The heat map of DMRs also showed similar result (Fig. 3a). The length distribution of DMRs was analyzed, and results showed that most of them were shorter than 400 bp (additional le 4: Figure S2b). Circos analysis in a circular layout revealed that the DMRs distributed uniformly in the genome, except the supercontigs C101000030 (Fig. 3b), in which more DMRs were identi ed. We identi ed a total of 4,052 DMGs, including 2,855 DMGs that harbored DMRs in gene-body regions (from TSS to TES) and 1,197 DMGs contained DMRs in promoter regions (additional le 5: Table S3).

Functional enrichment analysis of DMGs
To investigate the possible role of genes that showed differential methylation status, GO enrichment analysis was performed. For the genebody hyper-DMGs, the signi cant enriched GO terms were related to "regulation", such as regulation of transcription (DNA-templated), regulation of nucleic acid-templated transcription, regulation of RNA biosynthetic process, and regulation of RNA metabolic process, implying the DMGs involved in RNA transcription and biosynthesis. Whereas the gene-body hypo-DMGs were enriched in GO terms related to modulation of cell apoptosis or cell death, but the enrichment are not signi cantly. The promoter hyper-DMGs were also enriched in "regulation" related terms, such as regulation of RNA biosynthetic process, regulation of RNA metabolic process, and regulation of nucleobase-containing compound metabolic process, while the promoter hypo-DMGs were not signi cant enriched in any GO terms. The top ve signi cant enriched GO terms were listed in Table 2 and the detail information was shown in additional le 6: Table S4.  In addition, KEGG enrichment analysis was also carried out. Focal adhesion is the most signi cant enriched pathway of gene-body hyper-DMGs, following by ECM-receptor interaction, melanogenesis, dorso-ventral axis formation, and adherens junction. For the gene-body hypo-DMGs, Melanogenesis, GnRH signaling pathway, ubiquitin mediated proteolysis, vascular smooth muscle contraction, ubiquinone and other terpenoid-quinone biosynthesis were the top ve enriched pathways. Interestingly, immune related pathways, such as herpes simplex infection, cytokine-cytokine receptor interaction, and RIG-I-like receptor signaling pathway, were signi cant enriched in the promoter hypo-DMGs, whereas the biosynthesis and metabolism related pathways were the predominant enriched pathways in the promoter hyper-DMGs (Fig. 4). The detail information of the KEGG enrichment analysis was shown in additional le 7: Table S5.

Gene expression pro les between two groups
The gene expression pro les between two groups were investigated by RNA-seq using the same samples that used for WGBS. Three duplicates were performed for each group and then obtained a total of 6 libraries. All libraries were sequenced on an Illumina Hiseq X Ten platform to generate 150 bp pair-end reads. Each library generated ≥ 7.5 GB clean data and showed Q30 ≥ 94% (additional le 8: Table S6), implying the su cient quality of RNA-seq data and suitability for further analysis. The sequencing data have been deposited in the SRA at the NCBI (accession number: PRJNA634937). A total of 9,319 DEGs (additional le 9: Table S7) were identi ed when data of two groups were compared, including 4,639 up-regulated and 4,680 down-regulated genes. The DEGs were selected for annotation in order to assign the potential function.
For the up-regulated DEGs, GO enrichment analysis revealed that catalytic activity was the most signi cant enriched GO terms and also had the most number of enriched genes. In addition, many terms belonged to the cellular component category, such as intracellular part, cell part, and cytoplasmic part, were signi cant enriched. The down-regulated DEGs were enriched in GO terms related to signal transduction, enzyme activity, and cell response to stimulation (additional le 10: Table S8). KEGG enrichment analysis showed that metabolism, biosynthesis, and energy production related pathways, such as oxidative phosphorylation, proteasome, ribosome biogenesis in eukaryotes, and metabolic pathway, were signi cant enriched in the up-regulated DEGs. Moreover, some pathways participated in host defense (lysosome and phagosome), were also enriched. Focal adhesion, ECM-receptor interaction, phosphatidylinositol signaling system, vascular smooth muscle contraction, and adherens junction were the top ve enriched pathways in the down-regulated DEGs, following by several growth and development related signaling pathways, such as calcium signaling pathway, mTOR signaling pathway, and VEGF signaling pathway (additional le 11: Table S9 and additional le 12: Figure S3).

Correlation between DNA methylation and gene expression
To analyze the correlation between DNA methylation and gene expression, the DNA methylation level and gene expression level of the genebody regions, 5' upstream, and 3' downstream in two sh groups were analyzed. As shown in additional le 13: Figure S4, the DNA methylation and gene expression of gene-body regions, 5' upstream, and 3' downstream in both groups were presented. Furthermore, we divided genes into four groups according to the mRNA expression level: none, low, medium, and high, and the DNA methylation level in different groups were compared. In general, the methylation level in the four groups began to decrease from 2 kb upstream of the TSS of the genes while was increased downstream the TSS. In the 5' upstream, the genes with high mRNA expression level showed lowest DNA methylation level, whereas the unexpressed genes displayed the highest methylation level (Fig. 5a and additional le 14: Figure S5a). In contrast, the correlation between gene-body or 3' downstream methylation and mRNA expression was more complex. In addition, we also classi ed genes into ve categories based the methylation level, from the bottom 20% to the top 20%, corresponding to the 1st to 5th groups, respectively, and mRNA expression level of these groups were investigated. It could be seen that 1st group genes in the promoter regions showed the highest expression level, whereas mRNA expression of other groups was complex ( Fig. 5b and additional le 14: Figure S5b). Meanwhile, the 5th genes in the gene-body regions presented the highest expression level, while genes in other groups shared similar expression patterns ( Fig. 5c and additional le 14: Figure S5c). Therefore, based on the above ndings, it could be speculated that DNA hypomethylation in promoter regions and DNA hypermethylation in gene-body regions promoted gene expression.

DNA methylation and mRNA expression of DEGs and DMRs
The DEGs that obtained by RNA-seq were chose for further analysis to reveal the DNA methylation level of DEGs between two sh group. As shown in Fig. 5d, the DNA methylation level of gene-body regions of all DEGs in TYO sh group was slightly higher than that in FMO sh group, while DNA methylation level of promoter of DEGs in two groups was similar (Fig. 5e). Furthermore, the DMRs that obtained by WGBS were selected and the relationship between DNA methylation level and mRNA expression level were analyzed. For the DMRs in the promoter regions, both hyper-DMRs and hypo-DMRs displayed slightly negative correlation between DNA methylation level and mRNA expression level, and this correlation was more evidently in the promoter hypo-DMRs ( Fig. 6a and additional le 15: Figure S6a). However, for the DMRs in the gene-body regions, only the hyper-DMRs showed slightly positive correlation (r = 0.22, p < 0.05) (Fig. 6b), while no correlation was observed in the hypo-DMRs (p > 0.05) (additional le 15: Figure S6b). These ndings further suggested that hypomethylation in promoter regions and hypermethylation in gene-body regions involved in transcriptional activation.
Genes involved in age-dependent susceptibility to GCRV Enrichment analysis revealed gene-body hyper-DMGs were enriched in terms or pathways related to RNA transcription, biosynthesis, and energy metabolism. Moreover, it was proposed that gene-body hypermethylation promoted gene expression (Fig. 5c). We then investigated the mRNA expression pattern of genes involved in these terms or pathways to further con rmed the standpoint. As expected, for genes involved in protein biosynthesis and energy production, most of them were up-regulated in TYO sh (Fig. 6c, 6d). The promoter hypo-DMGs were enriched in the classical innate immune pathways and hypomethylation in promoter regions was thought to involve in transcriptional activation (Fig. 5b). We therefore analyzed the expression pro les of immune-related genes, and results showed that most of them presented up-regulation patterns in TYO sh (Fig. 6e). Genes involved in growth and development was also investigated, while most of them showed the opposite trend, downregulated in TYO sh group (Fig. 6f), implying the FMO sh was concentrated in cell growth and development. It is well known that immune response play an important role in host defense against pathogen. Protein synthesis and energy production are also bene t to organism resistant to virus infection. Therefore, the DNA methylation and gene expression variations of genes in these pathways may responsible for the age-dependent susceptibility to GCRV in grass carp with different ages.

Discussion
DNA methylation, the most widely and well-understood type of epigenetic modi cation, was reported to play a crucial role in transcriptional regulation [21,22]. In general, DNA hypomethylation in gene promoters is usually linked to transcriptional activation while hypermethylation in these regions is considered to correlate with transcriptional repression [23,24]. In sh, DNA methylation was reported to participate in skin color variations in crucian carp [25], phenotypic differences, environmental adaptation, and sex chromosome evolution in threespine stickleback [26][27][28], behavioral effects in zebra sh after bisphenol A exposure [29], sex determination in half-smooth tongue sole [30], and sex-speci c phenotypes in hybrid tilapia [31]. Moreover, it was also considered that DNA methylation was correlated with the disease resistance trait in grass carp and Chinese tongue sole [32,33]. Grass carp showed age-dependent susceptibility to GCRV. Therefore, we hypothesized that DNA methylation may involve in the disease resistance ability of three years old sh. Here, we reported for the rst time a genome-wide survey of DNA methylation status of spleen DNA from ve months old grass carp versus three years old grass carp and revealed the correlation between DNA methylation and mRNA expression. We believed that our results will bene t to the disease-resistant breeding of grass carp and may also provide useful information the research of age-dependent diseases in human.
We performed WGBS on spleen DNA from FMO and TYO grass carp. To con rm the reliability and repeatability of sequencing data, three duplicates were performed for each group. Therefore, the size and coverage depth of sequencing data obtained in our study was more than that reported in chicken, pigs, and Chinese tongue sole [24,33,34]. Interestingly, we found that general DNA methylation pattern of grass carp is consistent with other species [19,35,36]. For example, most of mC sites were occurred in the CG contexts, the relative low ML in the promoter or 5' UTR regions and high ML in repeat regions, negative correlation between promoter or 5' UTR regions methylation and gene expression. These results implied the conserved role of DNA methylation during evolution. The speci c roles of DNA methylation in gene-body regions remain unclear [37,38]. Interestingly, we showed genes with highest methylation level in the gene-body regions presented the highest expression level in both groups ( Fig. 5c and Figure S5C), suggesting that hypermethylation in gene-body regions positive correlated with mRNA expression. Similar results of positive correlation were also observed in arabidopsis, chicken, and human [24,39,40]. However, further study of DMRs revealed that negative correlation in promoter regions and positive correlation in gene-body regions are not obviously, probably due to the DNA methylation is only one of the factors that affect gene expression [34,35].
Colorimetric quanti cation of 5-mc showed global methylation level of TYO sh was higher than that of FMO sh (Fig. 1a). Moreover, we identi ed a total of 6,214 DMRs, in which most of them were hypermethylated in TYO sh. Meanwhile, the ML of DMRs in TYO sh was higher than FMO sh ( Figure S2A). These results may suggest that global DNA hypermethylation in TYO sh compared with FMO sh. Previous study showed that DNA methylation variations occur throughout the lifetime. Global DNA methylation levels increased over the rst few years of life and then decreased beginning in late adulthood [41,42]. Grass carp have a lifespan more 20 years and sexual maturity period of grass carp is about 4 ~ 5 years. The TYO grass carp is like the adolescent of human while the FMO sh is similar the infant or childhood of human beings. Therefore, it is understandable that DNA methylation level was higher in TYO sh. Similar result also reported in mouse, which blood showed a general pattern of epigenome-wide hypermethylation with age [43]. The DNA hypermethylation in TYO sh may be involved in the epigenetic reprogramming during development, which is important in regulate gene expression [44,45].
To reveal the potential role of DMGs, GO and KEGG analysis were performed. Results showed gene-body hyper-DMGs enriched in terms related to RNA transcription, biosynthesis, and metabolism, while gene-body hypo-DMGs were enriched in pathways involved in cell growth, proliferation, and differentiation (such as melanogenesis, GnRH signaling pathway, and calcium signaling pathway). Considering the DNA methylation in gene-body regions, especially DNA hypermethylation, positive correlated gene expression level, therefore, it could be speculated that TYO sh had superiority in biosynthesis and energy production, while the FMO sh were concentrated in cell growth and development.
Similar results of enrichment were also presented in the DEGs that obtained by RNA-sEq. The gene-body hyper-DMGs or up-regulated DEGs involved in RNA transcription, biosynthesis, and energy production suggested that the TYO sh is vigorousness, which may be bene t to resistant to virus infection [46,47]. The promoter hypo-DMGs were signi cant enriched in classical immune related pathways, such as herpes simplex infection, cytokine-cytokine receptor interaction, apoptosis, and RIG-I-like receptor signaling pathway. Regarding the DNA methylation in promoter regions, particularly DNA hypomethylation, negative correlated with mRNA expression level, we then supposed that the immunologic function of TYO sh was more perfect than FMO sh, which is important in defense against pathogen infection. Coincidently, RNA-seq also revealed that some immune related terms were also signi cant enriched in the up-regulated DEGs but not in the down-regulated DEGs.

Conclusion
In conclusion, global DNA methylation patterns and gene expression pro les between two sh groups with different ages were investigated. We

Methods
Experimental sh and sample collection A total number of 100 sh, which contained 50 ve months old (FMO) and 50 three years old (TYO) grass carp, on behalf of the sensitive and resistant sh to GCRV infection, respectively, were used in the study. All the sh were obtained from the Guan Qiao Experimental Station, Institute of Hydrobiology, Chinese Academy of Sciences (CAS), and acclimatized in aerated fresh water at 26-28 °C for one week before processing. Fish were fed with commercial feed twice a day and water was exchanged daily. If no abnormal symptoms were observed, the sh were selected for further study. In each group, 15 sh that represented three biological duplicates (n = 5 for each duplicate) were random collected, the sh were anesthesia by MS-222 at the concentration of 100 mg/L, and then sh were dissected and spleens were removed rapidly for analysis. The obtained samples were stored at − 80 °C until DNA or RNA extraction. After sample collection, sh were euthanized by immediately cutting off the spinal cord adjacent to the head. All of the experiments involving animals were carried out in accordance with the guide for the care and use of laboratory animals (Ministry of Science and Technology of China), and the protocol was approved by the committee of the Institute of Hydrobiology, CAS. All surgery was performed under MS-222 anesthesia, and all efforts were made to minimize suffering of sh.
Whole-genome bisul te sequencing Genomic DNA was extracted from kidneys using a DNeasy Blood& Tissue Kit (Qiagen, Hilden, Germany). DNA purity and concentration was measured using the NanoPhotometer® spectrophotometer (IMPLEN, USA) and Qubit® 2.0 Flurometer (Life Technologies, USA). DNA of su ciently high quality was used in library construction. A total of 5.2 µg DNA spiked with 26 ng λDNA were fragmented by sonication to 200-300 bp with Covaris S220, followed by end repair and adenylation. Cytosine-methylated barcodes were ligated to sonicated DNA as per manufacturer's instructions. Then these DNA fragments were treated twice with bisul te using EZ DNA Methylation-Gold™ Kit (Zymo Research, USA), and the resulting single-strand DNA fragments were PCR ampli ed using KAPA HiFi HotStart Uracil + ReadyMix. Library concentration was quanti ed by Qubit® 2.0 Flurometer and quantitative PCR, and the insert size was assayed on Agilent Bioanalyzer 2100 system. Libraries were sequenced on an Illumina Hiseq X Ten platform and 150 bp pair-end reads were generated.

Data analysis
Raw data reads in fastq format were initially processed using in-house perl scripts. In this step, clean data (clean reads) were obtained by removing adapter, poly-N and poor quality data. The Q20, Q30, and GC contents of the clean data were calculated, and all downstream analysis was performed using the clean high quality data. The reference genome of grass carp and clean reads were transformed into bisul teconverted sequences (C-to-T and G-to-A converted). Then, bismark software (version 0.16.3) was used to perform alignments of bisul tetreated reads to the reference genome [48]. Sequence reads that produce a unique best alignment from the two alignment processes (original top and bottom strand) are then compared to the normal genomic sequence and the methylation state of all cytosine positions in the read was inferred. The same reads that aligned to the same regions of genome were regarded as duplicated ones. The sequencing depth and coverage were summarized using deduplicated reads. The methylation extractor results were transformed into bigWig format for visualization using the IGV browser. The sodium bisul te non-conversion rate was calculated as the percentage of cytosines sequenced at cytosine reference positions in the lambda genome.
In order to calculate the methylation level of the sequence, we divided the sequence into multiple bins, with size is 10 kb. The sum of methylated and unmethylated read counts in each window/bin was calculated. Methylation level (ML) for each window or C site shows the fraction of methylated Cs (mC), and is de ned as: ML = reads (mC)/reads (mC + umC), where umC is the unmethylated Cs. Calculated ML was further corrected with the bisul te non-conversion rate according to previous studies [45].

Differentially methylated regions analysis
Differentially methylated regions (DMRs) analysis of two groups/conditions was identi ed using the DSS software [49], in which a slidingwindow method was used. The window was set as 1,000 bp and step length to 100 bp, and at least 10 informative CpGs in windows were considered. Fisher exact test was used to detect the DMRs. Differentially methylated genes (DMGs) were identi ed as genes whose gene-body regions (from transcription start site (TSS) to transcription end site (TES)) or promoter regions (upstream 2 kb from the TSS) have an overlap with the DMRs. GO and KEGG enrichment analysis of the DMGs were implemented by the GOseq R package and KOBAS software [50,51].

RNA -seq
RNA of spleens that collected above was isolated using TRizol reagent (Invitrogen, USA) according the manufacturer's protocol. RNA of su ciently high quality was used in library construction. Sequencing libraries were generated using the NEBNext Ultra RNA library prep kit for Illumina (New England Biolabs, USA) following the manufacturer's protocol. Libraries were sequenced on an Illumina Hiseq X Ten platform and 150 bp pair-end reads were generated. The output raw data reads were processed as described previously in order to get the clean data [4]. The clean reads were mapped to the reference genome of grass carp by Hisat2 software [52] and gene expression levels were calculated by FPKM (expected number of Fragments Per Kilobase of transcript sequence per Millions base pairs sequenced) method [53]. Differential expression analysis of two groups/conditions was performed using the DESeq package [54]. Genes with an adjusted P-value < 0.05 (q value < 0.05) in DESeq analysis were assigned as differentially expressed genes (DEGs). GO and KEGG enrichment analysis of DEGs were performed by GOseq R package and KOBAS software [50,51].
Global DNA methylation measurement and bisul te sequencing PCR Genomic DNA from the spleens of two groups was extracted using the traditional phenol-chloroform protocol with RNase treatment. Global DNA methylation level was measured with the MethylFlash™ Global DNA Methylation (5-methylcytosine, 5-mC) ELISA Easy Kit (Epigentek, USA) according to the manufacturer's protocol. The amount of input DNA for each assay was 100 ng to ensure optimal quanti cation. Three duplicates were performed for each group. Data were shown as the mean ± standard deviation of three replicates.

Consent for publication
Not applicable.

Availability of data and materials
The WGBS and RNA-seq data reported in the study have been deposited in the Sequence Read Archive (SRA) at the National Center for Biotechnology Information (NCBI) (accession number: PRJNA638254 and PRJNA634937). Other data generated or analysed during this study are included in additional les.
Competing interests