3.1. Data Quality Assessment and Preprocess
Regression analysis of raw data was performed by using affyPLM package in R. The relative log expression(RLE) plot revealed that gene expressions levels in GSE74530 were consistent with the median approaching 0(Fig. 1A), indicating that the quality of expression data was reliable. As for the RNA degradation plot, it showed that the RNA integrity is in good quality(Fig. 1B). Therefore, all 12 samples can be used for further analysis. There were a total of 16434 mRNAs and 1210 lncRNAs that were identified in microarray data using Human Comprehensive gene annotation from GENCODE. As shown in Fig. 1C, there is no obvious outliers in spread and location in boxplot and small discrepancies can be sufficiently removed by normalization. The median of 12 samples were almost at the same level after normalization (Fig. 1D), which corrected the systematic differences between chips effectively.
3.2. The Screening Results Of DELs And DEMs
We identified the 34 DELs and 1641 DEMs by comparing the tumor groups with the adjacent normal tissue group through limma package. The expression of 34 DELs and all DEGs were shown in a heatmap (Fig. 2) and a volcano plot(Fig. 3) respectively. We showed in. The miRNA targets of lncRNAs were predicted using the miRcode database in R. The target mRNAs of these miRNAs were obtained by three highly reliable microRNA target prediction databases (miRTarBase, TargetScan, and miRDB) and the result was intersected with the DEMs mentioned above.
After integrating the two methods above, we got a total of 2137 reliable miRNA-mRNA pairs and 162 predicted lncRNA- miRNA pairs (including 11 lncRNAs, 51 miRNAs and 851 mRNAs) for further analysis.
3.3. Construction Of CeRNA Network In Oral Cancer
To reduce false positives, we calculated the PCC between 34 DELs and 1,641 DEMs and co-expressed pairs with the top five percent PCC value(PCC > 0.886) were defined as the significant co-dysregulated competing pairs. Then the results were merged with 11 lncRNAs, 51 miRNAs and 851 mRNAs. By taking the intersection, a total of 238 potential co-dysregulated competing triples were selected and the full list was shown in Additional file 1.
To go deep into the functional study of lncRNAs acting as miRNA sponge in oral cancer, we built a ceRNA network and applied Cytoscape to perform visualization(Fig. 4A). In the network, there were 10 lncRNA nodes, 41miRNA nodes, 122 mRNA nodes and 238 edges.
3.4. Functional Enrichment Analysis
We speculated on the possible function of each lncRNAs through functional enrichment analysis of their linked mRNAs. All of the mRNA nodes were performed to analyze their function via the GO analysis and the GO interaction network was constructed by BinGO(Additional file 2). Eighteen GO terms were significantly enriched (Fig. 4B and Additional file 3). Out of these, the top three enriched terms were collagen binding, extracellular matrix binding and cell adhesion molecule binding all of which belonged to molecular function. Interestingly, extracellular matrix(ECM) binding was related to the proliferation of OSCC cells(29) and ECM played an important role in the growth and survival of oral cancer cells(30). It was demonstrated that cell adhesion molecules, together with tumor-associated matrix molecules, function in the progression of oral cancer(31). What’s more, discoidin domain receptor-1 (DDR1) could be activated by the specific binding with collagens (II,III)(32) and activation of DDR1 have been reported in oral cancer(33). Ten KEGG pathway terms were shown in Fig. 4C, including ECM-receptor interaction, small cell lung cancer, PI3K-Akt signaling pathway, Focal adhesion, TGF-β signaling pathway and more (Additional file 4). Among these ten pathways, ECM-receptor interaction(34), PI3K-Akt signaling pathway(35), TNF signaling pathway(36) and TGF-β signaling pathway(37) were OSCC-related pathways.
3.5. Topological Analysis Of The CeRNA Network
To identify the hub genes in the oral cancer related lncRNA-miRNA-mRNA network, we computed the node degrees. In the study of Han et al.(27), they defined nodes with degree greater than 5 as a hub. Basing on this research, a total of 42 nodes could be chosen as hubs, including 10 lncRNAs, 28 miRNAs, and 4 mRNAs.(Table 1 and Additional file 5) In addition, BC was also calculated as a measure to select the hubs(38)(Table 2). The higher BC value a node had, the more important it would be in the regulatory network(39).
Table 1
The list of differentially expressed genes(node degree > 5).
Number | Degree | Name | Type |
1 | 27 | hsa-miR-17-5p | miRNA |
2 | 24 | HCP5 | lncRNA |
3 | 21 | AGAP11 | lncRNA |
4 | 19 | hsa-miR-24-3p | miRNA |
5 | 18 | hsa-miR-27a-3p | miRNA |
6 | 17 | hsa-miR-1297 | miRNA |
7 | 17 | hsa-miR-23b-3p | miRNA |
8 | 16 | hsa-miR-140-5p | miRNA |
9 | 16 | hsa-miR-129-5p | miRNA |
10 | 15 | CRNDE | lncRNA |
11 | 15 | hsa-miR-22-3p | miRNA |
12 | 15 | hsa-miR-20b-5p | miRNA |
13 | 14 | HCG22 | lncRNA |
14 | 13 | EPB41L4A-AS1 | lncRNA |
15 | 13 | hsa-miR-507 | miRNA |
16 | 13 | hsa-miR-107 | miRNA |
17 | 13 | hsa-miR-216b-5p | miRNA |
18 | 12 | SOX21-AS1 | lncRNA |
19 | 12 | hsa-miR-3619-5p | miRNA |
20 | 11 | UCA1 | lncRNA |
21 | 11 | hsa-miR-761 | miRNA |
22 | 9 | hsa-miR-490-3p | miRNA |
23 | 9 | hsa-miR-125b-5p | miRNA |
24 | 9 | hsa-miR-1244 | miRNA |
25 | 9 | hsa-miR-139-5p | miRNA |
26 | 7 | YOD1 | mRNA |
27 | 7 | LINC00491 | lncRNA |
28 | 7 | hsa-miR-125a-5p | miRNA |
29 | 7 | hsa-miR-425-5p | miRNA |
30 | 7 | hsa-miR-363-3p | miRNA |
31 | 7 | hsa-miR-876-3p | miRNA |
32 | 7 | hsa-miR-455-5p | miRNA |
33 | 7 | hsa-miR-217 | miRNA |
34 | 6 | NEK6 | mRNA |
35 | 6 | MFHAS1 | mRNA |
36 | 6 | RORA | mRNA |
37 | 6 | hsa-miR-33a-3p | miRNA |
38 | 6 | LINC00515 | lncRNA |
39 | 6 | AC073321 | lncRNA |
40 | 6 | hsa-miR-146b-5p | miRNA |
41 | 6 | hsa-miR-508-3p | miRNA |
42 | 6 | hsa-miR-135a-5p | miRNA |
We found that three lncRNAs(HCP5, AGAP11, and HCG22) not only had higher node degrees, but also had a greater BC, suggesting that they may be potential key regulators controlling the oral cancer related ceRNA network.
Table 2
The list of top 15 high betweenness centrality genes.
Number | Type | name | BetweennessCentrality |
1 | lncRNA | HCP5 | 0.26 |
2 | lncRNA | AGAP11 | 0.16 |
3 | miRNA | hsa-miR-17-5p | 0.14 |
4 | miRNA | hsa-miR-24-3p | 0.13 |
5 | miRNA | hsa-miR-27a-3p | 0.10 |
6 | miRNA | hsa-miR-1297 | 0.10 |
7 | miRNA | hsa-miR-22-3p | 0.09 |
8 | lncRNA | CRNDE | 0.09 |
9 | miRNA | hsa-miR-140-5p | 0.09 |
10 | miRNA | hsa-miR-129-5p | 0.08 |
11 | lncRNA | HCG22 | 0.08 |
12 | miRNA | hsa-miR-216b-5p | 0.07 |
13 | miRNA | hsa-miR-507 | 0.07 |
14 | miRNA | hsa-miR-23b-3p | 0.06 |
15 | miRNA | hsa-miR-107 | 0.06 |
3.6. Key LncRNA-miRNA-mRNA Sub-network
Based on the above analysis, we obtained three key lncRNAs(HCP5, AGAP11, and HCG22) that may play a role in oral cancer. Therefore, three more specific functional lncRNA-associated sub-networks with the hub lncRNAs and their linked miRNAs and mRNAs in the regulatory network was constructed. There were 1 lncRNA, 19 miRNAs, 21 mRNAs and 48 edges in AGAP11-associated sub-network(Fig. 5). As shown in Fig. 6, the subnetwork of HCP5 consisted of 1 lncRNA, 23 miRNAs, 53 mRNAs and 110 edges. HCG22 interacted with 14 miRNAs and 34 mRNAs(Fig. 7).
Accordingly, to further understand the biological functions of these three lncRNAs, we performed GO functional enrichment analysis and KEGG pathways analysis for each hub-associated subnetwork. The results of functional enrichment analysis revealed 14 GO terms and 11 pathways terms in the AGAP11-associated sub-network(Fig. 8A). As for HCP5, 15 GO terms and 6 KEGG terms were identified(Fig. 8B). There were 10 differentially enrichment GO terms and 10 pathways related to HCG22(Fig. 8C).