Ketogenic diet-mediated steroid metabolism reprogramming improves the immune microenvironment and myelin growth of spinal cord injury rats through gene analysis and co-expression network analysis

Background The ketogenic diet has been widely used in the treatment of various nervous system and metabolic-related diseases. Our previous research found that a ketogenic diet exerts a protective effect and promotes functional recovery after spinal cord injury. However, the mechanism of treatment is still unclear. In this study, different dietary feeding methods were used to detect myelin expression and gene level changes between different groups. Result We established 15 RNA-Seq cDNA libraries and divided them into 4 groups (SCI_KD, SCI_SD, Sham_KD and Sham_SD), and a total of 32,883 genes were detected by RNA-seq. First, KEGG pathway enrichment of upregulated differentially expressed genes (DEGs) in the SCI_KD and SCI_SD groups and GSEA analysis of the two groups found that a ketogenic diet signicantly improved the steroid anabolic pathway in rats with spinal cord injury. Through cluster analysis, PPI analysis and visualization of iPath metabolic pathways, Sqle, Sc5d, Cyp51, Dhcr24, Msmo1, Hsd17b7, and Fdft1 changed signicantly in the pathway. Second, through WGCNA analysis, all samples are placed in a gene network to analyse the correlation between gene modules and phenotypes. After module analysis, GO function and KEGG analysis showed that rats fed a ketogenic diet showed signicantly reduced immune-related pathways, including those associated with immune and infectious diseases. Conclusion

. For example, Tetzlaff et al. found that KD improves forelimb motor function after SCI [14]. Zhu et al. revealed that a KD exerted neuroprotective effects by attenuating oxidative stress in SCI rats [18][19].
Finally, our study recently reported that a ketogenic diet improved motor function by activating Nrf2 and suppressing NF-κB signalling pathways to attenuate oxidative stress and in ammation after spinal cord injury [17]. However, the underlying mechanisms by which a KD provides neuroprotection after SCI still not fully understood.
Based on the above questions, in this study, we performed RNA sequencing to examine the changes in gene expression of spinal cord tissue in Sprague-Dawley (SD) rats treated with different methods. We screened out differentially expressed genes and comprehensively detected genes with important biological signi cance through gene set enrichment analysis (GSEA) at the overall level [20]. Weighted gene co-expression network analysis (WGCNA) was used to establish a gene co-expression network to explore the relationship between the gene network and the KD diet phenotype [21]. These data promote further understanding of the new genes and related signal pathways and how KD treatment improves the recovery of motor function in SCI rats, providing new guidance for clinical treatment.

Experimental procedures and animal conditions
The entire experimental process is shown in Fig. 1A. To established a C7 spinal cord hemi-contusion model, unilateral C7 laminectomies were performed using a Precision Systems & Instrumentation-IH Impactor according to the manufacturer's instructions (Additional le 1: Fig. S1A). SD rats were anaesthetized with sodium pentobarbital to expose the spinal cord. We xed the C6-T1 vertebrae rigidly in a frame tilted at a 25.0° angle and then used the In nite Horizon impactor to deliver a set contusion force of 150 k dyne (Additional le 1: Fig. S1B); the contusion forces were monitored by computer (Additional le 1: Fig. S1C). An average of 150 k dyne contusion force was obtained for each sample. These results indicated that the C7 spinal cord hemi-contusion model was successful. Using myelin staining, the myelin sheath area in the SCI_KD group was signi cantly larger than in the SCI_SD group ( Fig. 1B-C), indicating that SCI rats protected or promoted myelin sheath growth at the injured site after feeding with a ketogenic diet.

Differentially expressed mRNAs
To further understand the effects and mechanism of the ketogenic diet on SCI, RNA-seq was performed on each group with 5 biological replicate samples, in which 5 samples were degraded and eliminated. To explore transcription factor expression in SD rats treated with KD under spinal cord injury, we established 15 RNA-Seq cDNA libraries divided into 4 groups (SCI_KD, SCI_SD, Sham_KD, and Sham_SD). A summary of the sequence assembly after Illumina sequencing is presented in Additional le 2: Table S1. The average raw reads was 23295214.7, the Q30 basic mass fraction of the sample was 100%, and the error rate (%) was 0, meeting the requirements for subsequent analysis. Using a comparison of the transcriptome and reference genome, greater than 82% of reads were uniquely mapped; these result meet the subsequent analysis requirements (Additional le 3: Table S2). In short, the sequencing data is suitable for subsequent data analysis.
A total of 32,883 genes were detected by RNA-seq, and Venn analysis between samples showed the number of expressed genes in each group ( Fig. 2A). We rst used an unsupervised classi cation methodprincipal component analysis (PCA) to characterize the differences between the gene expression pro les of the 15 samples. Fig. 2C shows that the biologically replicated samples can be divided into the following 4 groups: Sham_SD1-3, Sham_KD1-4, SCI_SD1-3 and SCI_KD1-5. The PCA model shows that the maximum variation in PC1 can explain 16.92% of the variation, while principal component 2 explains 9.85% of the variation [11]. In Fig. 2B, the scatter plot of differential genes between the SCI_KD and SCI_SD groups shows that 463 genes are differentially expressed between the two groups, among which 184 differentially expressed genes are upregulated and 279 differentially expressed genes are downregulated. The top 50 upregulated and downregulated genes are listed in Additional le 4: Table S3 and Additional le 5: Table S4, respectively.

Reprogramming Steroid Biosynthesis and Metabolism in SCI Rats on KD
Compared with the SCI_SD group, the KD diet group upregulated DEGs in many enrichment pathways, such as the steroid synthesis pathway in lipid metabolism (path ID: map00100, q-value = 5.82043E-08), biosynthesis of unsaturated fatty acid, and terpenoid backbone biosynthesisand, etc. A bubble chart of the differential genes in the top 20 most enriched pathways is provided (Fig. 3A). The downregulated DEGs from the KD diet group were signi cantly enriched in ribosomes, lysosomes, osteoclast differentiation, proteasome, tuberculosis, and antigen processing and display. Fig. 3B shows the KEGG enrichment distribution of the differentially expressed genes.
To further understand the biological functions of the SCI_SD and SCI_KD DEGs, we performed gene set enrichment analysis (GSEA) of the sample sets based on the C2 MSigDB database, which includes the Broad Institute recombined gene sets. The selected gene set was tested, and it was found that SCI_KD was signi cantly negatively correlated with in ammatory cytokines and their receptors (NES values are -1.44 and -1.46, respectively; Additional le 6: Fig. S2), suggesting that a ketogenic diet reduces the lethal function of spinal cord injury, which echoes the results of our previous studies. Similarly, based on the KEGG functional annotation, we found that SCI_KD was signi cantly positively correlated with the steroid synthesis pathway (NES is 1.62; Figure 3C) by GSEA analysis, which further shows that the ketogenic diet is involved in the metabolic reprogramming of steroids in SCI rats, which is an improvement by the KD. An important mechanism of SCI function may be related to reducing metabolic responses. iPath3.0 was used to visually analyse the enriched steroid metabolism-related genes, as shown in Additional le 7: Fig.  S3, which shows that the steroid metabolism DEGs in the biological system transmit information.

WGCNA and Hub module identi cation and analysis
According to the KEGG enrichment analysis of gene differential expression and the gene set enrichment analysis, the mean values showed that SCI rats in the ketogenic diet group signi cantly promote steroid anabolism, which fully indicates that the ketogenic diet may be protective by mediating steroid metabolism reprogramming. Perhaps by changing some physiological functions or pathways, we can analyse the SCI_KD, SCI_SD, Sham_KD, and Sham_SD results in a gene network (WGCNA), and fully understand the roles of a ketogenic diet and normal diet on genetic changes in SD rats. Genes that are commonly expressed in the four groups are placed in the same gene network through the WGCNA algorithm. A total of 32,883 mRNAs were screened by WGCNA, and a total of 5,163 mRNAs were used for subsequent analysis after pre-processing and ltering, such as excluding deletions and outliers. Fig. 4A and 4B show the scale-free tness curve and the average connectivity curve, respectively. The scale-free tness curve is in a smooth position and the power exponent is a weighted β value; thus, the soft threshold value is set to 6 and the scale-free topological index is 0.9. Therefore, the network conforms to the power law distribution and is closer to the actual biological network state. The resulting gene tree diagram and the colours in the corresponding modules are shown in (Additional le 8: Fig. S4A-B), with the number of mRNAs for each module shown in Attached le 9: Table S5.
After the data was pre-processed, the genes were classi ed, and genes with similar expression patterns are divided into categories. These genes categories are called modules. We performed module clustering and module correlation analysis for the obtained modules, as shown in Fig. 4D. Interaction relationships were analysed in the six different modules, and a network heat map of the correlation between the modules and the phenotype was drawn (Fig. 4C). The results show that the modules are independent of one another, which indicates the relative independence of gene expression in different modules. Here, we mainly analysed the correlation modules between the ketogenic diet-and standard diet-fed rats. Among them, the MEgreen module is highly related to KD and the MEblue module is highly related to SD. We then calculated the correlation coe cient between each gene in the module and speci c phenotype data to obtain the gene signi cance value (GS); this was combined with the MM value of each gene in the module (i.e., kME, eigengene-based connectivity) to obtain the MM-GS scatter plot. The stronger the correlation, the closer the relationship between the module and the phenotype; moreover, correlations with a lower p-value are more reliable. This analysis shows that MEgreen is an important module in the KD diet, while MEblue is an important module in the SD diet (Additional le 8: Fig. S4C-F).

Functional annotation analysis of hub modular genes
To explore the biological functions of the KD diet and SD diet modules (green and blue, respectively) and compare the differences in their biological functions, we performed GO and KEGG enrichment analysis with annotation and visualization. GO function annotation analysis classi es DEGs using different perspectives, such as the biological processes involved, the components that constitute cells, and the molecular functions achieved. Figure 5A shows the GO annotation functions of the MEgreen module and Figure 5B shows the GO annotation functions of the MEblue module. Different from other groups, there was a signi cant positive correlation between SCI rats with normal diet and MEblue module. GO function analysis immune system processes were expressed in biological processes, which indicated that immune system signi cantly affected SCI rats, and immune system processes included all process of in ammation-related pathways, congenital and acquired immune expression changes.
Similarly, KEGG annotation analysis was also performed on the green and blue modules. The results show that the KEGG annotation molecule in the MEblue module signi cantly expresses immune system, immune system and infectious diseases: viruses and other immune disease pathways among subtypes.
This module mainly focuses on rats with normal diet of SCI, which is consistent with the GO function analysis results. Compared with MEblue module, in terms of expression ratio, immune system, immune system and infectious diseases: the sub-categories of viruses and other immune disease pathways are signi cantly down-regulated, and the module is related to the ketogenic diet (Additional le 10. Fig. S5).
Multi-gene set KEGG enrichment analysis and gene interaction network analysis The Immune system and Infectious diseases: Viral and Immune diseases pathways were used as pathways of interest for multi-gene KEGG enrichment analysis, and Fig. 6A shows the KEGG enrichment results of the MEgreen module and Fig. 6B shows the KEGG enrichment results of the MEblue module. The MEgreen module is primarily a module closely related to the ketogenic diet rats while the MEblue module is mainly a module closely linked to the normal diet rats, especially the normal diet injury rats.
The results of the KEGG enrichment signal pathways and intersection of the top 20 pathways for the two modules show that only the Viral myocarditis and Phagosome pathways are present in both modules. The MEgreen module mainly activates innate immune-related pathways, such as the Toll-like receptor signalling pathway and the NOD-like receptor signalling pathway. These results indicate that MEgreen and MEblue act on different immune pathways. The WGCNA and functional analysis showed that rats fed with a ketogenic diet showed a reduction in immune system pathway processes, including immune diseases, infectious diseases and other immune microenvironments.

Discussion
The ketogenic diet has been widely used in the treatment of a variety of neurological and metabolicrelated diseases, especially epilepsy [22][23]. In recent years, some studies, including previous studies performed by our team, have shown that the ketogenic diet has a protective effect after spinal cord injury by increasing the spontaneous movement and ne manipulation of SCI animal models [17,[24][25]. However, the mechanism of treatment is still unclear. In this study, myelin expression and transcriptome level changes between different groups were detected in rats under different dietary feeding methods.
This is the rst study to use RNA-seq to understand and identify differentially expressed genes in SCI rats fed a ketogenic diet and then examine which biological signal pathways can explain the protective effects of a KD diet on SCI rats.
In this study, we used a variety of analytical methods to mine and interpret entire transcriptome RNA sequencing data, including differentially expressed genes and functional enrichment analysis between groups, gene set enrichment analysis between groups, iPath metabolic pathway analysis, gene cluster analysis, WGCNA and module analysis, module function analysis and gene interaction network analysis. From single gene expression analysis to changes in the overall gene network, a comprehensive analysis of the changes in gene expression patterns under a ketogenic diet can provide a deeper understanding of changes in gene transcription and signal pathways by a KD for the treatment of spinal cord injury.
From a local perspective, we used KEGG pathway enrichment of upregulated DEGs in the SCI_KD and SCI_SD groups and GSEA analysis of these groups to reach a consistent conclusion that KD signi cantly increased the steroid anabolic pathway in rats with spinal cord injury. Through cluster analysis, PPI analysis and the iPath metabolic pathway visualization, we concluded that Sqle, Sc5d, Cyp51, Dhcr24, Msmo1, Hsd17b7, and Fdft1 changed signi cantly in the pathway. Steroid biosynthesis appears to play a pivotal role in SCI.
The ketogenic diet is a high-fat, low-carbohydrate and moderate protein diet. Its purpose is to reduce carbohydrate intake and replace it with lipids while ensuring a su cient protein supply [26]. This programme was proposed by Wilder in 1921 for the treatment of epilepsy. It was later used for a variety of neurological diseases, such as Alzheimer's disease, Parkinson's disease, etc. [27][28]. Although it is known that a KD promotes high ketone body levels, high fat, low carbohydrates, low calories and other "immediacy" properties [29][30], this study suggests that the ketogenic diet may mediate steroid metabolism reprogramming to treat spinal cord injury rats This study is the rst to reach this conclusion. Steroids include sterols (such as cholesterol, lanosterol, sitosterol, stigmasterol, and ergosterol), bile acids and bile alcohols, steroid hormones (such as adrenal cortex hormones, androgens, and oestrogen), insect ecdysone, cardiac glycosides, saponins, and toad poison [31][32][33]. Eating a high-fat/low-carbohydrate ketogenic diet can cause the liver to produce ketones. In the brain, ketone bodies (such as βhydroxybutyrate) contribute to sterol synthesis [34], which is essential for the growth of myelin membranes [55 (35)]. Studies have shown that diabetic patients can signi cantly increase HDL cholesterol and lower blood pressure through a 90-day KD diet, while LDL cholesterol shows no signi cant difference [36]. However, long-term ketogenic diet increases the risk of type 2 diabetes and metabolic syndrome. This study explored the relevant mechanism of short-term KD treatment in SCI rats [37]. Additionally, studies have found that HDLc showed small increases after ketogenic phases, but there was no signi cant change over 12 months [38].
This study also found that myelin areas were signi cantly larger in SCI rats after being fed a KD than SCI rats in the normal diet group. A recent study by Stumpf found that a ketogenic diet can improve axon defects and promote myelination in Pelizaeus-Merzbacher disease. A high-fat/low-carbohydrate ketogenic diet can restore the integrity of oligodendrocytes and increase CNS myelin. Lipids (such as ketone bodies) easily enter the CNS without destroying the blood-brain barrier or blood-spinal cord barrier [39]. Ketone bodies produced by a normal diet are not able to enter the CNS to enhance the myelin membrane. These conclusions indicate that a ketogenic diet mediates steroid metabolism reprogramming after spinal cord injury, facilitates myelin membrane synthesis, promotes myelin sheath regeneration, plays a unique role in neuroprotection, and does not destroy the blood spinal cord barrier function. It is well known that the blood spinal cord barrier is an important barrier for the spinal cord to maintain neuroimmune balance.
This study combined 4 groups of rat genes into the same gene network through WGCNA analysis to analyse the correlation between gene modules and phenotype. Through module identi cation and analysis as well as GO function analysis and KEGG pathway analysis, we showed that rats fed a ketogenic diet displayed signi cantly reduced immune-related pathways, including immune diseases and infectious diseases. In addition, examining the SCI_KD and SCI_SD groups data and existing MSigDB analysed by GSEA, we concluded that SCI_KD is signi cantly negatively correlated with in ammatory cytokines and their receptors. These results indicate that KD treatment can improve the body's immunity level and reduce infection and in ammation.
In SCI secondary injury, in ammation is an important and main pathophysiological process. After SCI, ischaemia-reperfusion injury, endothelial cell injury and homeostasis disorders jointly initiate the in ammatory cascade reaction, and spinal cord innate immune cells (microglia and astrocytes) [40] and in ltrating white blood cells (neutrophils and macrophages) are activated [40][41]. The activated in ammatory cells release a large number of in ammatory factors to further expand the damage and increase the time and space of the damage response. Our research studies have shown that KD reduces or inhibits neutrophils in the plasma of SCI rats and inhibits the classic NF-κB pro-in ammatory signalling pathway at the injury site, which corresponds to the results of this study [17]. Steroid synthesis is also closely related to SCI neuroin ammation, and steroid synthesis hormones [42], oestrogen [43][44], saponin [45], tauroursodeoxycholic acid [46] and other steroid components in the spinal cord injury all have anti-in ammatory properties. Fig. 8A-B show the KEGG signalling pathway at the intersection of the immune microenvironment of a normal diet and a ketogenic diet, respectively. The normal diet mainly activates innate immune-related pathways, such as the Toll-like receptor signalling pathway and NOD-like receptor signalling pathway. This study cannot directly prove a correlation between ketogenic dietmediated reprogramming of steroid metabolism and neuroimmunity, which is the limitation of the study. The signi cantly expressed genes in the steroid synthesis pathway need to be further examined in future studies to determine if they are related to SCI e cacy.

Conclusions
This study is the rst to demonstrate that the ketogenic diet signi cantly promotes the reprogramming of steroid metabolism in the treatment of SCI al the transcriptional level. These changes are bene cial for myelin sheath growth and promote nerve function. Moreover, a comprehensive analysis of the ketogenic diet through WGCNA improves the SCI immune microenvironment and reduces in ammation. Therefore, we believe that the ketogenic diet may improve the immune microenvironment and myelin growth in rats with spinal cord injury through reprogramming of steroid metabolism.

Animals and diets
All animal experiments conformed to the Guide for the Care and Use of Laboratory Animals (NIH Publications No. 8023, revised 1978). All procedures and euthanasia performed in this study were approved by the Ethics Committee for Animal Experiments of the Department of Laboratory Animal Science, Peking University Health Science Center (NO. LA201456, 24 February 2014).
We randomly divided 32 8-week-old male Sprague-Dawley rats (300-360 g) into the following four groups (n = 8 in each group): the control group was undamaged and divided into the standard diet group (Sham_SD) and the ketogenic diet group (Sham_KD), while the SCI rats received either SD (SCI_SD group) or KD (SCI_KD group), 20 rats were used for RNAseq and 12 rats were used for histopathological examination of xed specimens. All rats were reared at 21°C with a light/dark cycle of 12/12 hours, and food and water were provided freely. Rats in the SCI_KD group were fed a KD (70 g/kg body weight/day), and the ratio of fat to carbohydrate and protein was 4:1 (fat: 50.5 g/100 g KD; carbohydrate: 23.6 g/100; four hours after SCI, protein content: 5.5 g/100 g KD; dietary bre: 16.1 g/100 g KD (Shenzhen Zeneca Inc., Shenzhen, China)). The rats in the Sham_SD group and SCI _SD group were fed normal carbohydrates. All groups had free water supply.

C7 spinal cord semi-contusion
A C7 hemi-contusion was performed in the two SCI groups as described previously for a different site of contusion [20]. Brie y, rats were anaesthetized with sodium pentobarbital after a unilateral (left or right side, chosen randomly) C7 laminectomy, and the C6-T1 vertebrae were rigidly xed in a frame tilted at a 25.0° angle. We then used the In nite Horizon impactor to deliver a set contusion force of 150 kdynes [21]. The C7 hemi-contusion is hereafter described as 'SCI.' Rats were fed for four weeks according to the above grouping and diet plan, and spinal cord tissue was sampled after four weeks.

EC myelin stain
Spinal cord tissue was placed in a fully automatic tissue dehydrator for an overnight dehydration programme (xylene, alcohol isocratic dehydration), embedded in para n, and cut it into 5-µm continuous sections with a microtome after cooling overnight at 40°C. The tissues were soaked in xylene I and II for 30 minutes each, after which they were soaked in 100%, 100%, 95%, 95%, and 80% alcohol for 5 minutes each and then rinsed twice with double-distilled water. The tissues were then stained with triochrome cyanine (EC) for 15-20 minutes before undergoing ferric chloride (5.6%) differentiation for a speci c time under a microscope. We made sure not to over-differentiate, as the staining would be too light. Afterwards, the samples were dehydrated and mounted with neutral gum.

RNA isolation, RNA-Seq Library Construction and Sequencing
We randomly blindly divide 20 SD rats into 4 groups, namely SCI_KD, SCI_SD, Sham_KD and Sham_SD. Total RNA was extracted from a 1.6-cm spinal cord segment at the centre of the impact site with the Trizol reagent (Invitrogen, Thermo Fisher Scienti c Inc., USA) and subjected to RNA-seq analysis to assess gene expression (n = 5). Genomic DNA was removed using DNase I. Agarose gels (1%) were used to monitor RNA degradation and contamination. A NanoPhotometer® spectrophotometer (IMPLEN, CA, USA) was used to check RNA purity. The Qubit® RNA Assay Kit was used to measure the RNA concentration on a Qubit® 2.0 Fluorometer (Life Technologies, CA, USA). The RNA Nano 6000 Assay Kit was used to assess the RNA integrity on a Bioanalyser 2100 system (Agilent Technologies, CA, USA). RNA sequencing of 5 biological replicate samples was performed for each group, and 5 samples were degraded and eliminated. Therefore, 15 RNA-Seq cDNA libraries were established. 1) We utilized two methods to treat the total RNA. Oligo(dT) magnetic beads were used to select mRNAs with polyA tails. rRNAs were hybridized with DNA probes and then digested to make DNA/RNA hybrid strand, followed by DNase I reaction to remove DNA probes. The target RNAs were then obtained by puri cation. 2) The target RNA was fragmented and then reverse transcribed into double-stranded cDNA (dscDNA) with N6 random primers. 3) The dscDNA was end-repaired with phosphates at the 5' end and sticky 'A's at the 3' end, after which the sequence was ligated to an adaptor with sticky 'T's at the 3' end of the dscDNA. 4) Two speci c primers were used to amplify the ligation product. 5) The PCR products were denatured with heat and the single strand DNA was cyclized with a splint oligo and DNA ligase. 6) Finally, the prepared library was sequenced with the Illumina HiSeq xten (2 × 150-bp read length, Illumina, San Diego, CA, USA). The sequencing programmes were performed by BGI Company (China, Shenzhen).

Quality control and expression level
The raw paired-end reads were trimmed and subjected to quality control by SeqPrep (https://github.com/jstjohn/SeqPrep) and Sickle (https://github.com/najoshi/sickle) with default parameters. The paired-end clean reads were aligned to the Rattus_norvegicus reference genome (Rnor_6.0) using the default parameters in TopHat (http://tophat.cbcb.umd.edu/, version2.1.1) [47]. The reference genome and gene model annotation les were directly downloaded from genome website (http://www.ensembl.org/Rattus_norvegicus/Info/Index). RSEM (http://deweylab.biostat.wisc.Edu/rsem/) [48] was used to quantify the gene abundances. To identify DEGs between two different samples, the expression levels of each transcript were calculated based on the number transcript reads per million readings (TPM, per million transcript reads). The reads were then normalized using DEseq2 v. 1.24.0 (http://bioconductor.org/packages/stats/bioc/DESeq2/) [49] and the statistical signi cance of the differentially expressed genes was assessed. The following default parameters and criteria were used here: Benjamini & Hochberg (BH) p-adjust< 0.05 and |log2FC|≥1 after multiple comparisons. In the initial data exploration, we performed principal component analysis (PCA) to directly calculate the coe cient of variation between the groups in R .
Gene function annotation analysis The genes and transcripts are aligned to six databases (NR, Swiss-Prot, Pfam, EggNOG, GO, and KEGG) to obtain comprehensive annotation information. The DIAMOND software (https://github.com/bbuch nk/diamond) [50]was used to sequence the genes and transcripts with the NR (ftp://ftp.ncbi.nlm.nih.gov/blast/db/), Swiss-Prot (ftp://ftp.uniprot .org/pub/databases), and EggNOG (http://eggnogdb.embl.de/#/app/home) databases [51]. Sequence alignment was performed with the BLAST2GO and GO database, followed by alignment with the Pfam database using the HMMER software (ftp://selab.janelia.org/pub/software/hmmer3/3.0/hmmer-3.0.tar.gz) [52]. GO (Gene Ontology, http://www.geneontology.org/) is a database established by the Gene Ontology Federation. Its purpose is to standardize the biological terminology of genes and gene products in different databases and to determine gene and protein functions. Using the GO database, the genes in the selected gene set can be classi ed according to the biological process (BP) they participate in, the cell component (CC), and the molecular function (MF) they achieve. The results of KEGG Orthology (Kyoto Encyclopaedia of Genes and Genomes, http://www.genome.jp/kegg/) were obtained using KOBAS2.1. KEGG is a knowledge base that systematically analyses gene functions and links genomic information and functional information. Using the KEGG database, the genes in the gene set can be classi ed according to the pathways involved or the functions they perform [53].

KEGG pathway enrichment analysis
The KEGG PATHWAY enrichment analysis was performed on the gene set using an R script in KOBAS (http://kobas.cbi.pku.edu.cn/home.do) [53]. The BH method was used to calculate the corrected p value. GSEA is performed on a prede ned set of genes, usually from functional annotations or the results of previous experiments (included in MSigDB, version 6.2, http://software.broadinstitute.org/gsea/downloads.jsp), and organizes the genes according to two sample types. The GSEA sorts the degree of differential expression and then checks whether the gene set clusters at the top or bottom of the ranking list. the rst round of enrichment is based on C2, which includes curated gene sets (from online pathway databases, publications in PubMed, and knowledge of domain experts). The second enrichment is based on the results of KEGG functional annotation in this study. For the analysis results, we generally think that gene sets with the following characteristics are meaningful: |NES|>1, NOM p-val<0.05, and FDR q-val<0.25.

Protein-protein interaction analysis (PPI)
We used an online search tool (STRING database, version 10.5; http://string-db.org/) to search for interacting genes to construct a protein-protein interaction (PPI) network based on the extensive functions of the genome-wide data [55].
Weighted Gene Co-expression Network Analysis (WGCNA) The co-expression network is constructed using the WGCNA (v1.47) package in R [21]. WGCNA is an undirected, weighted network. "Unweighted network" means that the correlation between genes can only be 0 or 1. 0 means that the two genes are not related, while 1 means they are related. A "weighted network" and genes are not only related or not but also their correlation value is recorded. The value is the weight (correlation) of the connection between genes. The soft threshold (β power) in the co-expression network is the expression of each gene at a speci c time or space and is regarded as a node. To obtain the correlation between genes, it is necessary to calculate the correlation coe cient (Person Coe cient) any two genes; this allows us to know whether the expression pro les of two genes are similar. In this study, the β power=6.
1) The module refers to the grouping of genes with similar expression patterns. Genes of a speci c type below to a module.
2) An eigengene is de ned as the rst principal component of a given module. It is the most representative expression pattern of the module. It is an indicator that approximately represents the expression status of all genes in a module but is not a real gene.
3) A hub module refers to a module that is highly related to phenotypic data or closely related to research.
4) Gene signi cance (GS), or a measure of gene importance, is the correlation between gene expression patterns and the characteristics of a sample. Generally, the larger the correlation value, the more important the gene, which can be evaluated as the log conversion of the p-value. The correlation index between a gene and a speci c pathway can also be the correlation between a gene and a speci c phenotypic feature.

5) Module membership (MM)
, also known as the eigengene-based connectivity (kME), is the correlation between the expression pattern of a gene and a characteristic gene in the module. The larger the value, the greater the possibility that the gene belongs to the module.
Genes with the highest MM and the highest GS are genes with high signi cance (hub genes). All these analyses were performed using commands implemented in the WGCNA software package with the following parameters: minModuleSize = 30, minKMEtoStay = 0.3, and mergeCutHeight = 0.2.

Modular gene expression correlation analysis
After a gene of interest is obtained based on the correlation of gene expression, the correlation coe cient between genes is determined using the spearman correlation algorithm. The correlation coe cient threshold is 0.5 and the signi cance level is the p value after BH correction (Corrected p-value)≤0.05, and these values were drawn into a visual network diagram. In the gure, the nodes represent genes and the connections between nodes represent the correlation between gene expression. The larger the node, the greater the expression correlation between the genes.

Multi-gene set KEGG pathway enrichment analysis
After performing KEGG analysis on the different gene sets, we utilized the intersection of the selected gene sets for the display and adjusted the signi cance level to the corrected p-value ≤0.5 as the threshold to display the top 20 pathways.

Statistical Analysis
The data are expressed as the mean ± standard deviation (mean ± SD). The statistical analysis was performed with GraphPad Prism 7.0 (GraphPad Software Inc., San Diego, CA). The Student's t-test was used for comparisons between two groups. A p value<0.05 was considered statistically signi cant. * p <0.05, ** p <0.01, *** p <0.001, and **** p <0.0001.