Animals and diets
All animal experiments conformed to the Guide for the Care and Use of Laboratory Animals (NIH Publications No. 8023, revised 1978). All procedures and euthanasia performed in this study were approved by the Ethics Committee for Animal Experiments of the Department of Laboratory Animal Science, Peking University Health Science Center (NO. LA201456, 24 February 2014).
We randomly divided 32 8-week-old male Sprague-Dawley rats (300-360 g) into the following four groups (n = 8 in each group): the control group was undamaged and divided into the standard diet group (Sham_SD) and the ketogenic diet group (Sham_KD), while the SCI rats received either SD (SCI_SD group) or KD (SCI_KD group), 20 rats were used for RNAseq and 12 rats were used for histopathological examination of fixed specimens. All rats were reared at 21°C with a light/dark cycle of 12/12 hours, and food and water were provided freely. Rats in the SCI_KD group were fed a KD (70 g/kg body weight/day), and the ratio of fat to carbohydrate and protein was 4:1 (fat: 50.5 g/100 g KD; carbohydrate: 23.6 g/100; four hours after SCI, protein content: 5.5 g/100 g KD; dietary fibre: 16.1 g/100 g KD (Shenzhen Zeneca Inc., Shenzhen, China)). The rats in the Sham_SD group and SCI _SD group were fed normal carbohydrates. All groups had free water supply.
C7 spinal cord semi-contusion
A C7 hemi-contusion was performed in the two SCI groups as described previously for a different site of contusion [20]. Briefly, rats were anaesthetized with sodium pentobarbital after a unilateral (left or right side, chosen randomly) C7 laminectomy, and the C6-T1 vertebrae were rigidly fixed in a frame tilted at a 25.0° angle. We then used the Infinite Horizon impactor to deliver a set contusion force of 150 kdynes [21]. The C7 hemi-contusion is hereafter described as ‘SCI.’ Rats were fed for four weeks according to the above grouping and diet plan, and spinal cord tissue was sampled after four weeks.
EC myelin stain
Spinal cord tissue was placed in a fully automatic tissue dehydrator for an overnight dehydration programme (xylene, alcohol isocratic dehydration), embedded in paraffin, and cut it into 5-µm continuous sections with a microtome after cooling overnight at 40°C. The tissues were soaked in xylene I and II for 30 minutes each, after which they were soaked in 100%, 100%, 95%, 95%, and 80% alcohol for 5 minutes each and then rinsed twice with double-distilled water. The tissues were then stained with triochrome cyanine (EC) for 15-20 minutes before undergoing ferric chloride (5.6%) differentiation for a specific time under a microscope. We made sure not to over-differentiate, as the staining would be too light. Afterwards, the samples were dehydrated and mounted with neutral gum.
RNA isolation, RNA-Seq Library Construction and Sequencing
We randomly blindly divide 20 SD rats into 4 groups, namely SCI_KD, SCI_SD, Sham_KD and Sham_SD. Total RNA was extracted from a 1.6-cm spinal cord segment at the centre of the impact site with the Trizol reagent (Invitrogen, Thermo Fisher Scientific Inc., USA) and subjected to RNA-seq analysis to assess gene expression (n = 5). Genomic DNA was removed using DNase I. Agarose gels (1%) were used to monitor RNA degradation and contamination. A NanoPhotometer® spectrophotometer (IMPLEN, CA, USA) was used to check RNA purity. The Qubit® RNA Assay Kit was used to measure the RNA concentration on a Qubit® 2.0 Fluorometer (Life Technologies, CA, USA). The RNA Nano 6000 Assay Kit was used to assess the RNA integrity on a Bioanalyser 2100 system (Agilent Technologies, CA, USA). RNA sequencing of 5 biological replicate samples was performed for each group, and 5 samples were degraded and eliminated. Therefore, 15 RNA-Seq cDNA libraries were established.
1) We utilized two methods to treat the total RNA. Oligo(dT) magnetic beads were used to select mRNAs with polyA tails. rRNAs were hybridized with DNA probes and then digested to make DNA/RNA hybrid strand, followed by DNase I reaction to remove DNA probes. The target RNAs were then obtained by purification. 2) The target RNA was fragmented and then reverse transcribed into double-stranded cDNA (dscDNA) with N6 random primers. 3) The dscDNA was end-repaired with phosphates at the 5' end and sticky 'A's at the 3' end, after which the sequence was ligated to an adaptor with sticky 'T's at the 3' end of the dscDNA. 4) Two specific primers were used to amplify the ligation product. 5) The PCR products were denatured with heat and the single strand DNA was cyclized with a splint oligo and DNA ligase. 6) Finally, the prepared library was sequenced with the Illumina HiSeq xten (2 × 150-bp read length, Illumina, San Diego, CA, USA). The sequencing programmes were performed by BGI Company (China, Shenzhen).
Quality control and expression level
The raw paired-end reads were trimmed and subjected to quality control by SeqPrep (https://github.com/jstjohn/SeqPrep) and Sickle (https://github.com/najoshi/sickle) with default parameters. The paired-end clean reads were aligned to the Rattus_norvegicus reference genome (Rnor_6.0) using the default parameters in TopHat (http://tophat.cbcb.umd.edu/, version2.1.1) [47]. The reference genome and gene model annotation files were directly downloaded from genome website (http://www.ensembl.org/Rattus_norvegicus/Info/Index). RSEM (http://deweylab.biostat.wisc.Edu/rsem/) [48] was used to quantify the gene abundances. To identify DEGs between two different samples, the expression levels of each transcript were calculated based on the number transcript reads per million readings (TPM, per million transcript reads). The reads were then normalized using DEseq2 v. 1.24.0 (http://bioconductor.org/packages/stats/bioc/DESeq2/) [49] and the statistical significance of the differentially expressed genes was assessed. The following default parameters and criteria were used here: Benjamini & Hochberg (BH) p-adjust< 0.05 and |log2FC|≥1 after multiple comparisons. In the initial data exploration, we performed principal component analysis (PCA) to directly calculate the coefficient of variation between the groups in R `prcomp`.
Gene function annotation analysis
The genes and transcripts are aligned to six databases (NR, Swiss-Prot, Pfam, EggNOG, GO, and KEGG) to obtain comprehensive annotation information. The DIAMOND software (https://github.com/bbuchfink/diamond) [50]was used to sequence the genes and transcripts with the NR (ftp://ftp.ncbi.nlm.nih.gov/blast/db/), Swiss-Prot (ftp://ftp.uniprot .org/pub/databases), and EggNOG (http://eggnogdb.embl.de/#/app/home) databases [51]. Sequence alignment was performed with the BLAST2GO and GO database, followed by alignment with the Pfam database using the HMMER software (ftp://selab.janelia.org/pub/software/hmmer3/3.0/hmmer-3.0.tar.gz) [52]. GO (Gene Ontology, http://www.geneontology.org/) is a database established by the Gene Ontology Federation. Its purpose is to standardize the biological terminology of genes and gene products in different databases and to determine gene and protein functions. Using the GO database, the genes in the selected gene set can be classified according to the biological process (BP) they participate in, the cell component (CC), and the molecular function (MF) they achieve. The results of KEGG Orthology (Kyoto Encyclopaedia of Genes and Genomes, http://www.genome.jp/kegg/) were obtained using KOBAS2.1. KEGG is a knowledge base that systematically analyses gene functions and links genomic information and functional information. Using the KEGG database, the genes in the gene set can be classified according to the pathways involved or the functions they perform [53].
KEGG pathway enrichment analysis
The KEGG PATHWAY enrichment analysis was performed on the gene set using an R script in KOBAS (http://kobas.cbi.pku.edu.cn/home.do) [53]. The BH method was used to calculate the corrected p value. The threshold of the corrected p value (Corrected P-Value) is 0.05. The KEGG pathway that meets the conditions showing that the pathway was significantly enriched for the gene set.
iPath metabolic pathway analysis
iPath3.0 (http://pathways.embl.de) was used to visually analyse the metabolic pathways and facilitate observation of the metabolic pathway information of differentially expressed genes in the entire biological system. Nodes represent various biochemical molecules and lines represent biochemical reactions. The overall metabolism of the biological system includes 146 metabolism-related KEGG pathways, 22 regulatory-related KEGG pathways, and 58 KEGG pathways involved in the biosynthesis of secondary metabolites. The nodes in the figure represent different compounds and the boundaries represent different enzymatic reactions. iPath3.0 can outline the biosynthesis and important regulatory pathways of secondary metabolites and easily identify complex metabolic pathways [54].
Gene Set Enrichment Analysis (GSEA)
GSEA is performed on a predefined set of genes, usually from functional annotations or the results of previous experiments (included in MSigDB, version 6.2, http://software.broadinstitute.org/gsea/downloads.jsp), and organizes the genes according to two sample types. The GSEA sorts the degree of differential expression and then checks whether the gene set clusters at the top or bottom of the ranking list. the first round of enrichment is based on C2, which includes curated gene sets (from online pathway databases, publications in PubMed, and knowledge of domain experts). The second enrichment is based on the results of KEGG functional annotation in this study. For the analysis results, we generally think that gene sets with the following characteristics are meaningful: |NES|>1, NOM p-val<0.05, and FDR q-val<0.25.
Protein-protein interaction analysis (PPI)
We used an online search tool (STRING database, version 10.5; http://string-db.org/) to search for interacting genes to construct a protein-protein interaction (PPI) network based on the extensive functions of the genome-wide data [55].
Weighted Gene Co-expression Network Analysis (WGCNA)
The co-expression network is constructed using the WGCNA (v1.47) package in R [21]. WGCNA is an undirected, weighted network. "Unweighted network" means that the correlation between genes can only be 0 or 1. 0 means that the two genes are not related, while 1 means they are related. A "weighted network" and genes are not only related or not but also their correlation value is recorded. The value is the weight (correlation) of the connection between genes. The soft threshold (β power) in the co-expression network is the expression of each gene at a specific time or space and is regarded as a node. To obtain the correlation between genes, it is necessary to calculate the correlation coefficient (Person Coefficient) any two genes; this allows us to know whether the expression profiles of two genes are similar. In this study, the β power=6.
1) The module refers to the grouping of genes with similar expression patterns. Genes of a specific type below to a module.
2) An eigengene is defined as the first principal component of a given module. It is the most representative expression pattern of the module. It is an indicator that approximately represents the expression status of all genes in a module but is not a real gene.
3) A hub module refers to a module that is highly related to phenotypic data or closely related to research.
4) Gene significance (GS), or a measure of gene importance, is the correlation between gene expression patterns and the characteristics of a sample. Generally, the larger the correlation value, the more important the gene, which can be evaluated as the log conversion of the p-value. The correlation index between a gene and a specific pathway can also be the correlation between a gene and a specific phenotypic feature.
5) Module membership (MM), also known as the eigengene-based connectivity (kME), is the correlation between the expression pattern of a gene and a characteristic gene in the module. The larger the value, the greater the possibility that the gene belongs to the module.
Genes with the highest MM and the highest GS are genes with high significance (hub genes). All these analyses were performed using commands implemented in the WGCNA software package with the following parameters: minModuleSize = 30, minKMEtoStay = 0.3, and mergeCutHeight = 0.2.
Modular gene expression correlation analysis
After a gene of interest is obtained based on the correlation of gene expression, the correlation coefficient between genes is determined using the spearman correlation algorithm. The correlation coefficient threshold is 0.5 and the significance level is the p value after BH correction (Corrected p-value)≤0.05, and these values were drawn into a visual network diagram. In the figure, the nodes represent genes and the connections between nodes represent the correlation between gene expression. The larger the node, the greater the expression correlation between the genes.
Multi-gene set KEGG pathway enrichment analysis
After performing KEGG analysis on the different gene sets, we utilized the intersection of the selected gene sets for the display and adjusted the significance level to the corrected p-value ≤0.5 as the threshold to display the top 20 pathways.
Statistical Analysis
The data are expressed as the mean ± standard deviation (mean ± SD). The statistical analysis was performed with GraphPad Prism 7.0 (GraphPad Software Inc., San Diego, CA). The Student's t-test was used for comparisons between two groups. A p value<0.05 was considered statistically significant. * p <0.05, ** p <0.01, *** p <0.001, and **** p <0.0001.