Establishment of NTD mouse models
The brain of mouse embryos in the control group was completely closed, with a full and smooth appearance, visible optic and otic vesicles, and intact spinal surface without laceration (Fig. 1A). In the NTDs group, the stillbirths of mouse embryos were increased; some typical morphological malformations were observed, including cracks in the top of the skull, abnormality of the hindbrain and face, and spina bifida (Fig. 1B).
Results of lncRNA sequencing and RRBS
A total of 945,603,356 raw reads were detected using lncRNA sequencing. After quality control, 939221946 clean reads were obtained, including 476,257,370 reads in the control group and 462,964,576 reads in the NTD group (Table S2). Furthermore, 95.23% of the reads matched the Mus musculus reference genome, indicating that our data were of high quality and suitable for subsequent analysis.
In the RRBS, 188,563,561 raw data were obtained and 182,999,476 clean reads remained after filtration (90,565,820 reads in the control group and 92,433,656 reads in the NTD group, Table S3). In the control group, the percentages of CG, CHG, and CHH were 47.57%, 0.13%, and 0.05%, respectively. In the NTDs group, the distribution of C sites was 47.15% CG, 0.13% CHG, and 0.05% CHH, respectively.
Identification of methylation-driven DEGs and DElncRNAs
After differential expression analysis, 639 (26 upregulated and 613 downregulated) DElncRNAs and 1132 (618 upregulated and 514 downregulated) DEGs between the NTD and control groups were screened (Fig. 2A). The heatmap for DElncRNAs and DEGs showed that these genes could significantly distinguish the NTD group from the control group (Fig. 2B and 2C). After integration with RRBS data, 8 DElncRNAs and 213 DEGs with abnormal methylation levels were identified. The top five DElncRNAs and DEGs are listed in Table 1.
Table 1
Top five DElncRNAs and DEGs with abnormal methylation level.
|
Gene ID
|
Gene name
|
Log2FC
|
P value
|
Padjust
|
Regulate
|
lncRNA
|
ENSMUSG00000113105
|
Gm47371
|
-1.16756
|
8.29E-16
|
2.62E-12
|
down
|
ENSMUSG00000115536
|
Gm32857
|
-1.23758
|
1.57E-15
|
3.73E-12
|
down
|
ENSMUSG00000104512
|
Gm37165
|
-1.7099
|
1.78E-15
|
4.00E-12
|
down
|
ENSMUSG00000098066
|
Gm26944
|
-23.4564
|
9.22E-15
|
1.37E-11
|
down
|
ENSMUSG00000093675
|
Gm20618
|
-1.45402
|
2.15E-14
|
2.60E-11
|
down
|
mRNA
|
ENSMUSG00000053838
|
Nudcd3
|
-1.04612
|
1.43E-32
|
6.79E-28
|
down
|
ENSMUSG00000035696
|
Rnf38
|
-2.22203
|
2.05E-21
|
3.24E-17
|
down
|
ENSMUSG00000041258
|
Zfp236
|
-9.70547
|
3.20E-20
|
3.79E-16
|
down
|
ENSMUSG00000017639
|
Rab11fip4
|
-27.541
|
8.97E-20
|
8.49E-16
|
down
|
ENSMUSG00000021144
|
Mta1
|
-25.9946
|
8.74E-18
|
5.91E-14
|
down
|
LncRNA–mRNA co-expression network
According to the PCC score between DElncRNAs and DEGs, 99 co-expression pairs were obtained, including 6 DElncRNAs and 28 DEGs. The co-expression network is shown in Fig. 3. The degree of each node was calculated. Nodes with a degree greater than 5 are listed in Table 2. We observed that lncRNA Gm4681 had the highest degree, followed by lncRNAs 4933406K04Rik, Gm13974, Gm40638, and Gm15521.
Table 2
The nodes with degree larger than 5 in the co-expression network.
Name
|
Degree
|
Betweenness
|
Closeness
|
Log2FC
|
Padjust
|
Regulate
|
type
|
Gm4681
|
23
|
280.2948
|
0.297297
|
-1.4828
|
3.50E-09
|
down
|
lncRNA
|
4933406K04Rik
|
20
|
167.5543
|
0.282051
|
-1.1272
|
0.022408
|
down
|
lncRNA
|
Gm13974
|
20
|
131.6011
|
0.282051
|
-1.03811
|
9.26E-04
|
down
|
lncRNA
|
Gm40638
|
19
|
186.1886
|
0.277311
|
-1.28794
|
1.21E-05
|
down
|
lncRNA
|
Gm15521
|
16
|
56.36119
|
0.264
|
-1.24576
|
2.00E-04
|
down
|
lncRNA
|
Usp46
|
5
|
8.605687
|
0.264
|
-1.73308
|
0.008168
|
down
|
mRNA
|
Trerf1
|
5
|
8.605687
|
0.264
|
-1.17501
|
0.014953
|
down
|
mRNA
|
Tdrd9
|
5
|
8.605687
|
0.264
|
-1.17423
|
3.14E-04
|
down
|
mRNA
|
Syn3
|
5
|
8.605687
|
0.264
|
-1.37689
|
8.94E-04
|
down
|
mRNA
|
Styx
|
5
|
8.605687
|
0.264
|
-1.11316
|
0.002523
|
down
|
mRNA
|
Rpgr
|
5
|
8.605687
|
0.264
|
-1.94071
|
4.18E-06
|
down
|
mRNA
|
Ppp1r9a
|
5
|
8.605687
|
0.264
|
-1.54827
|
6.20E-06
|
down
|
mRNA
|
Lrp1
|
5
|
8.605687
|
0.264
|
-1.05218
|
0.026977
|
down
|
mRNA
|
Klhl29
|
5
|
8.605687
|
0.264
|
-1.53648
|
6.98E-04
|
down
|
mRNA
|
Dapk1
|
5
|
8.605687
|
0.264
|
-1.95977
|
0.030613
|
down
|
mRNA
|
Cdk8
|
5
|
8.605687
|
0.264
|
-1.18471
|
0.047516
|
down
|
mRNA
|
Camta1
|
5
|
8.605687
|
0.264
|
-1.27631
|
0.003395
|
down
|
mRNA
|
Auts2
|
5
|
8.605687
|
0.264
|
-1.58481
|
0.009469
|
down
|
mRNA
|
Functional enrichment analysis of nodes in the co-expression network
To further explore the biological functions of the DElncRNAs and DEGs, functional enrichment analysis was conducted. The results showed that these DElncRNAs were significantly enriched in 1409 GO-BP terms and 19 KEGG pathways. In particular, 4933406K04Rik, Gm15521, Gm4681, Gm13974, and Gm40638 were mainly enriched in several GO-BP terms, such as positive regulation of cell projection organization, righting reflex, and reflex (Fig. 4A), whereas 4933406K04Rik, Gm13974, Gm15521, and Gm40638 were primarily involved in the axon guidance pathway (Fig. 4B).
For the lncRNA target genes, 327 GO-BP terms and four KEGG pathways were identified. We observed that these genes were significantly enriched in GO-BP terms, such as positive regulation of cell projection organization (Fig. 4C), and KEGG pathways, such as axon guidance (Fig. 4D).
PPI network analysis
A PPI network was constructed based on the PPI pairs between DEGs. As illustrated in Fig. 5, the PPI network was composed of 22 nodes and 25 edges. All DEGs were downregulated.
Establishment of lncRNA-miRNA-mRNA regulatory network
A total of 584 miRNA-mRNA pairs (408 miRNAs and 22 mRNAs) were predicted. Four common DElncRNAs and 70 miRNAs were identified in this study. After integrating the lncRNA–miRNA and miRNA–mRNA pairs, a lncRNA-miRNA-mRNA network containing 91 nodes (70 miRNAs, 4 lncRNAs, and 17 mRNA) with 237 edges was constructed (Fig. 6). Nodes with a degree greater than 10 are summarized in Table 3. It was demonstrated that lncRNA 4933406K04Rik had the highest degree in the network, followed by lncRNAs Gm13974 and Gm15521. These nodes may be hub abnormally methylated genes involved in the pathogenesis of NTDs.
Table 3
The nodes with degree greater than 10 in ceRNA network
name
|
Degree
|
Betweenness
|
Closeness
|
Log2FC
|
Padjust
|
Regulate
|
type
|
4933406K04Rik
|
44
|
2791.1
|
0.647482
|
-1.1272
|
0.022408
|
down
|
lncRNA
|
Gm13974
|
43
|
2495.893
|
0.625
|
-1.03811
|
9.26E-04
|
down
|
lncRNA
|
Gm15521
|
23
|
845.0656
|
0.494505
|
-1.24576
|
2.00E-04
|
down
|
lncRNA
|
Ppp1r9a
|
22
|
673.2955
|
0.552147
|
-1.54827
|
6.20E-06
|
down
|
mRNA
|
Sema5a
|
19
|
515.323
|
0.471204
|
-1.11811
|
0.0155
|
down
|
mRNA
|
Itsn1
|
18
|
545.051
|
0.514286
|
-1.4752
|
7.32E-06
|
down
|
mRNA
|
Klhl29
|
17
|
779.9576
|
0.542169
|
-1.53648
|
6.98E-04
|
down
|
mRNA
|
Zfhx3
|
16
|
269.1233
|
0.454545
|
-1.09265
|
8.83E-09
|
down
|
mRNA
|
Magi1
|
11
|
239.4414
|
0.481283
|
-1.28005
|
0.001264
|
down
|
mRNA
|
Camta1
|
10
|
133.242
|
0.511364
|
-1.27631
|
0.003395
|
down
|
mRNA
|
Validation analysis by qRT-PCR
To validate the accuracy of the sequencing analysis results, the expression levels of the hub lncRNAs were evaluated using qRT-PCR. The results indicated that the expression levels of lncRNAs Gm15521, Gm4681, and Gm13974 were significantly lower in the NTD group than in the control group (p < 0.05; Fig. 7), which was consistent with the results of bioinformatics analysis.
Animal models and samples
Eighty Kunming mice (9–10 weeks old) were purchased from Caven Biogle (Suzhou) Model Animal Research Co. Ltd (Suzhou, Jiangsu, China). The mice were placed in cages with equal proportions of males and females at 18:00, and then mated overnight. The vaginal plug was examined the following morning. Female mice with vaginal plugs at 8:00 were designated as embryonic days (E0d), whereas those at 16:00 were regarded as E0.5d. Finally, 30 pregnant mice were obtained, which were divided into NTD and control groups, with 15 mice in each group.
The NTD model was established as described previously [14]. On E7.0d-7.25d, mice in the NTD group received a one-time administration of 50 mg/kg (body weight) retinoic acid (RA) (R2625; Sigma) dissolved in corn oil by gavage, whereas the control group was administered an equal amount of corn oil. On E16.5d, the pregnant mice were killed by cervical dislocation. Embryos were dissected from the uteri and examined using a dissecting microscope to confirm the occurrence of NTDs. The brain tissues (the anterior end of the neural tube) of control and NTD embryos (n = 5 per group) were collected and frozen at -80°C until further sequencing. The animal experiments were approved by the Animal Ethics Committee of Kunming Medical University, and were conducted in accordance with the guidelines of Institutional Animal Care and Use.
LncRNA sequencing
Total RNA was extracted from each tissue using TRIzol reagent (Invitrogen, Carlsbad, CA, USA), and the concentration and purity of RNA were detected using Nanodrop2000 (Thermo Fisher Scientific, USA). After rRNA was removed, an RNA library was constructed using the RNA Library Prep Kit for Illumina® (NEB, UK) and then sequenced on the Illumina HiSeq 2500 platform with 2×150 bp paired-end reads to obtain raw data.
To ensure the accuracy of further bioinformatics analysis, the raw reads were filtered by deleting low-quality reads to obtain clean reads. Next, the clean reads were mapped to the Mus musculus (GRCm39) reference genome using the HISAT2 software [15]. Read counts of lncRNAs and mRNA were generated using RSEM, and their expression levels were normalized using fragments per kilobase of transcript per million mapped reads (FPKM).
RRBS analysis
Genomic DNA was extracted from each sample and digested with the methylation-insensitive enzyme MspI. DNA fragments sized 150–300 bp were selected by bisulfite conversion and PCR amplification to obtain a DNA library, followed by sequencing on the Illumina HiSeq 2500 Platform (Illumina, San Diego, CA, USA).
To obtain clean reads, we used the trimming method to remove sequencing adapters and low-quality reads from raw data. Next, the clean reads were mapped to the Mus musculus (mm10) reference genome using the Bsmap software. Finally, methylation sites were detected using the Bsmap software.
Differential expression analysis
To further explore the genes or methylation involved in the pathogenesis of NTDs, a series of differential expression analyses between NTD and control samples was performed. Based on RNA sequencing data, DEGs and DElncRNAs between NTD vs. control samples were screened using the DESeq2 software, with the criteria of an adjusted p-value < 0.1 and |log2FoldChange (FC)| > 1. The p-value was adjusted using the negative binomial distribution test. In addition, differentially methylated regions (DMR) between NTD and control samples were identified based on RRBS data using the Metilene software. DMRs with an adjusted p-value < 0.05 were considered significant.
Integrative analysis of DMR and DElncRNAs/DEGs
Data from RNA sequencing and RRBS sequencing analyses were integrated, and genes annotated in DMR anchor regions (such as Promoter, Exon, Intron, CGI, CGI Shore, Repeat, TSS, and TES) were obtained. Next, these genes were compared with DEGs and DElncRNAs to identify methylation-driven DEGs and DElncRNAs, respectively.
Construction of DElncRNA-DEG co-expression network
The Pearson correlation coefficient (PCC) between methylation-driven DEGs and lncRNAs was calculated using the cor function. Pairs with PCC > 0.7 and p-values < 0.05 were selected. A co-expression network was constructed via Cytoscape (version 3.6.1) [16]. The topological properties of nodes in this network were analyzed using CytoNCA (version 2.1.6) [17].
Functional enrichment analysis
To further explore the biological function of nodes in the co-expression network, we used the clusterprofiler package (version 3.16.0) [18] to assess the GO-BP terms and KEGG pathways of DEGs and DElncRNAs. DEGs and DElncRNAs were considered significantly enriched if the adjusted p-value < 0.05.
Construction of protein-protein interaction (PPI) network
The relationship between the target genes of the lncRNAs was predicted using the STRING (version 11.0) software [19]. In brief, the species was set as mice, and PPIs with a combined score > 0.15 were screened to generate a PPI network.
Prediction of miRNA
Furthermore, miRWalk 2.0 [20] was used to predict miRNAs in the DElncRNA–DEG co-expression network. miRNA-DEG interactions were verified using seven databases: miRWalk, Microt4, miRanda, miRDB, miRNAMap, RNA22, and Targetscan. Pairs appearing in at least five databases were selected to construct the miRNA-DEG network.
Construction of lncRNA–miRNA–mRNA network
In addition, miRNA-related lncRNAs were predicted using the DIANA-LncBase v.2 [21] database, and the lncRNAs appearing in the DElncRNA–DEG network were screened to obtain DElncRNA–miRNA pairs. Based on the identified DElncRNA–miRNA and miRNA–DEG pairs, a lncRNA–miRNA–mRNA network was constructed.
qRT-PCR validation
To validate the accuracy of the bioinformatics analysis results, the expression levels of several identified lncRNAs (Gm15521, Gm4681, and Gm13974) were evaluated using qRT-PCR. Total RNA was isolated from tissue samples using TRIzol® reagent (Invitrogen) and reverse-transcribed to cDNA using the PrimeScript™ 1st Strand cDNA Synthesis kit (Takara, China). qRT-PCR was carried out using TaKaRa Ex Taq® (Takara) on the 7900HT Fast qPCR System. The relative mRNA expression of lncRNAs was normalized to that of GAPDH using the 2−ΔΔCt method. The primers used for qRT-PCR are listed in Table S1.