DEGs and PPI networks in ADM-PanIN-PDAC
In the GSE40895 dataset, we identified 1658 DEGs in ADM (972 upregulated, 686 downregulated), 557 DEGs in PanIN (254 upregulated, 303 downregulated) and 705 DEGs in PDAC (355 upregulated, 350 downregulated) relative to matched non-precursor lesions or non-tumor tissue samples. A heatmap of DEG expression was generated for the ADM, PanIN and PDAC groups (Figure S1), and the genes used for the heatmap are listed in Tables S1-S3. To investigate the regulatory relationships among the DEGs in ADM, PanIN and PDAC, a PPI network of the DEGs was constructed for each stage. Of the 1658 DEGs in ADM, 935 were retained in the PPI network (Fig. 1), which comprised 935 nodes and 6278 edges. Node size indicates the degree of each gene. The 10 largest nodes in ADM were Ehmt1, Hdac3, Cecr2, Cdk6, Ranbp2, Bptf, Actg1, Acta1, Ncbp1 and Cul3. In PanIN, we identified 639 PPI pairs among 261 filtered DEGs and found 10 nodes with a higher degree, including Rhob, Hist1h3g, Hist1h4j, Hist4h4, Hdac11 and Cdk17 (Fig. 1c). Of the 705 DEGs in PDAC, 320 were retained in the PPI network, which comprised 320 nodes and 777 edges. Rhoq, Actr8, Frk, Zap70, Mras, Akt3, Lyn, Rab25, Pik3c3 and Jup were the 10 largest nodes in the PPI network of PDAC.
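The degree-based ranking of nodes described above can be illustrated with a minimal sketch. It assumes the PPI network is available as an undirected edge list; the function name `top_degree_nodes` and the placeholder gene labels are hypothetical and are not taken from the study's pipeline or data.

```python
from collections import Counter


def top_degree_nodes(edges, k=10):
    """Return the k highest-degree nodes of an undirected edge list.

    Self-loops and duplicate edges (in either orientation) are ignored.
    """
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions
        key = frozenset((a, b))
        if key in seen:
            continue  # skip duplicates such as (b, a) after (a, b)
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    # Sort by degree (descending), then by name so ties are deterministic.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy edge list with placeholder labels (hypothetical, not study data).
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g2", "g1")]
print(top_degree_nodes(toy_edges, k=2))
```

In a network visualization, the returned degree values would correspond to the node sizes.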
Significant overlapping gene modules in ADM-PanIN-PDAC
In total, 26, 7 and 19 gene modules were identified in the PPI networks of ADM, PanIN and PDAC, respectively, using the MCODE plugin of Cytoscape. Using hypergeometric tests on the module genes, we found one pair of modules, ADM module-4 and PanIN module-1, that shared 4 significantly overlapping genes (Fig. 2, P=0.012). Detailed statistical information on the overlapping genes is shown in Table S4. By GO analysis, DEGs in ADM module-4 were enriched in nucleosome assembly, positive epigenetic regulation and initiation of DNA-templated transcription. Interestingly, DEGs in PanIN module-1 were enriched in similar functions, including epigenetic regulation of gene expression, DNA-templated transcription and DNA methylation. Of all the genes annotated in both the ADM and PanIN modules, 75% were members of the histone family, strongly suggesting a role for this family in promoting the transition from ADM to PanIN. Consistent with this, all 4 overlapping genes (Hist1h2an, Hist1h4c, Hist1h4m and Hist4h4) belong to the core histone families H2A and H4, suggesting that they may play critical roles in promoting ADM to PanIN. To further understand the major functions of the overlapping genes, we performed a GO analysis based on the DAVID database (Table 1). No significant overlapping genes were found between PanIN and PDAC modules. This result indicates that overlapping genes are mainly involved in the initial stages of PDAC development (ADM and PanIN) rather than in the PanIN-PDAC stage. However, whether overlapping genes exist between PanIN and PDAC in other databases remains to be explored. Therefore, the mechanisms regulating ADM-PanIN-PDAC progression may rely mostly on crosstalk, especially at the PanIN-PDAC stage.
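As an illustration of the hypergeometric test for module overlap, the sketch below computes the upper-tail probability of observing at least a given number of shared genes between two modules. The background size and module sizes in the example are hypothetical placeholders, since they are not stated in the text, so the printed value is not expected to reproduce the reported P=0.012.

```python
from math import comb


def hypergeom_overlap_pvalue(population, size_a, size_b, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, size_a, size_b).

    population: number of background genes (assumption: chosen by the analyst)
    size_a, size_b: number of genes in the two modules
    overlap: number of genes shared by the two modules
    """
    max_k = min(size_a, size_b)
    total = comb(population, size_b)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(size_a, k) * comb(population - size_a, size_b - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Hypothetical sizes for illustration only (not taken from the study).
p = hypergeom_overlap_pvalue(population=1000, size_a=20, size_b=15, overlap=4)
print(f"P(overlap >= 4) = {p:.3g}")
```

The choice of background population strongly affects the resulting P value, so it should match the gene universe from which the modules were drawn.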
Crosstalk gene modules in ADM-PanIN-PDAC
Crosstalk modules between ADM and PanIN
Three pairs of PPI sub-networks with significant crosstalk were found between ADM and PanIN. There were 63, 35 and 20 crosstalk pairs between ADM module-1 and PanIN module-4 (Fig. 3A, P=0.012), ADM module-4 and PanIN module-4 (Fig. 3B, P<0.0001), and ADM module-5 and PanIN module-6 (Fig. 3C, P<0.0001), respectively. Consistent with the overlapping gene modules between ADM and PanIN, members of the histone family accounted for the majority of crosstalk genes. Hist1h3g and Hdac11 in PanIN module-4 had crosstalk not only with Hdac3, Hist1h4h, Hist1h2ad and Hist1h2bn in ADM module-1 and module-4, but also with Hdac6 in PDAC module-9 (Fig. 3F, P=0.002). Notably, all the crosstalk genes between ADM module-5 and PanIN module-6 (P<0.0001) were from the olfactory receptor (OR/OLFR) superfamily.
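One way to read the crosstalk pair counts above is as the number of PPI edges that connect a gene in one module to a gene in another. The sketch below counts such edges under that assumption; the function name `crosstalk_pairs` and the placeholder labels are hypothetical, and the sketch does not implement the significance test behind the reported P values, which is not specified in this section.

```python
def crosstalk_pairs(ppi_edges, module_a, module_b):
    """Return the set of PPI edges linking a gene in module_a to a gene in module_b.

    Edges are treated as undirected; self-loops and duplicates are ignored.
    """
    a, b = set(module_a), set(module_b)
    pairs = set()
    for u, v in ppi_edges:
        if u == v:
            continue  # skip self-interactions
        if (u in a and v in b) or (u in b and v in a):
            pairs.add(frozenset((u, v)))
    return pairs


# Toy modules and edges with placeholder labels (hypothetical, not study data).
module_a = {"a1", "a2"}
module_b = {"b1", "b2"}
toy_edges = [
    ("a1", "b1"),  # cross-module edge
    ("a2", "b1"),  # cross-module edge
    ("a1", "a2"),  # within module_a, not counted
    ("b2", "a1"),  # cross-module edge
    ("b1", "a1"),  # duplicate of the first edge
    ("b1", "b2"),  # within module_b, not counted
]
print(len(crosstalk_pairs(toy_edges, module_a, module_b)))
```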
Crosstalk modules between PanIN and PDAC
Seven crosstalk modules between PanIN and PDAC were uncovered. PanIN module-4 had crosstalk with three different PDAC modules, indicating close interaction between these two stages (Fig. 3D-3F, P=0.014, P=0.040 and P=0.002, respectively). The OR family also linked PanIN module-6 and PDAC module-1 (Fig. 3G, P=0.003). ORs are involved in cell differentiation and carcinogenesis[22]; however, their roles in the pancreas are unclear. PanIN module-3 interacted with both PDAC module-13 (Fig. 3H, P<0.001) and PDAC module-10 (Fig. 3J, P<0.0001). An interaction between PanIN module-1 and PDAC module-9 (Fig. 3I, P=0.036) was also uncovered. These crosstalk modules may provide new insights into how PanIN gradually progresses to PDAC.
Crosstalk modules in ADM-PanIN-PDAC
Finally, we combined all 10 crosstalk modules described above to identify crosstalk spanning the three stages (ADM, PanIN and PDAC). Two crosstalk sub-networks of ADM-PanIN-PDAC were discovered (Fig. 3K and Fig. 3L). Detailed statistical information on the combined crosstalk genes is shown in Table S6. Interestingly, most genes involved in the crosstalk of ADM-PanIN-PDAC are from the histone family. As predicted by GO analysis (Table 2), mutation and modification of histones may promote ADM progression to PDAC through multiple pathways, including nucleosome assembly, microtubule-based processes and chromatin modification. The 10 genes with the highest degree in the ADM-PanIN-PDAC crosstalk network were Chd1, Bptf, Smarca1, Cecr2 and H2afz in ADM module-1, Smarca2 in PanIN module-4, and Frk, Clip3, Kcnmb2 and Ccng1 in PDAC module-9 (Fig. 3K). The other crosstalk network centers on the OR superfamily (Fig. 3L). GO analysis showed that these genes mainly function through the G-protein coupled receptor (GPCR) signaling pathway. So far, the role of ORs as GPCRs has been underappreciated. Some reports have shown that certain ORs inhibit the proliferation of lung and prostate cancer cell lines[23, 24].
Importantly, we mapped all the significant overlapping and crosstalk genes to their human homologs to investigate their expression levels in PDAC and their potential relationships with prognosis using the GEPIA tool. We found that SMARCA1, SMARCA2, CLIP3, BPTF and H2AFZ were significantly overexpressed in PDAC compared with normal pancreatic tissue (Fig. 4A). KCNMB2, H2AFP (the human homolog of mouse Hist1h2an), SMARCA1, CLIP3 and SMARCA2 are putative prognostic factors for PDAC (Fig. 4B). Information on the expression and prognostic value of the other overlapping and crosstalk genes is provided in Fig. S1.
Non-coding RNAs regulating ADM-PanIN-PDAC
Finally, we predicted the lncRNAs and microRNAs that regulate the overlapping and crosstalk genes during ADM-PanIN-PDAC progression with the miRDB database. In total, 44 lncRNAs and 56 microRNAs were predicted to regulate the overlapping genes (Fig. 5). Statistical information on these ncRNAs is shown in Table S5. As the node degree increases, the relationship between the non-coding RNAs and their related genes becomes more significant. The top five microRNAs by node degree were mmu-miR-335-5p, mmu-miR-669n, mmu-miR-7646-5p, mmu-miR-1191b-3p and mmu-miR-1224-5p. These miRNAs target two overlapping genes: Hist1h4m and Hist4h4. The top 10 lncRNAs by node degree were C530005A16Rik, 1810062O18Rik, 2010204K13Rik, 2810002D19Rik, 4732463B04Rik, H19, Pvt1, Gm14207, Gas5 and Mir17hg. They mainly target the overlapping gene Hist1h2an.
A total of 572 microRNAs and 55 lncRNAs were predicted to significantly regulate the crosstalk genes (Table S6). The top 5 microRNAs with the greatest node degree were mmu-miR-325-3p, mmu-miR-7661-5p, mmu-miR-590-3p, mmu-miR-743a-3p and mmu-miR-743b-3p. Gm14207, H19, 2610203C20Rik, C130021I20Rik, 1700086L19Rik, 2810002D19Rik, BC006965, D630029K05Rik, Meg3 and A330023F24Rik were the top 10 lncRNAs with the greatest node degree (Fig. 6). 2810002D19Rik, H19 and Gm14207 were predicted to regulate both crosstalk and overlapping genes.
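The lncRNAs shared between the two analyses can be checked with a simple set intersection. The sketch below uses only the two top-10 lncRNA lists quoted in this section; the variable names are hypothetical, and the full lists in the supplementary tables may contain additional shared ncRNAs that this example does not cover.

```python
# Top 10 lncRNAs predicted to regulate the overlapping genes (as listed in the text).
overlap_top_lncrnas = {
    "C530005A16Rik", "1810062O18Rik", "2010204K13Rik", "2810002D19Rik",
    "4732463B04Rik", "H19", "Pvt1", "Gm14207", "Gas5", "Mir17hg",
}

# Top 10 lncRNAs predicted to regulate the crosstalk genes (as listed in the text).
crosstalk_top_lncrnas = {
    "Gm14207", "H19", "2610203C20Rik", "C130021I20Rik", "1700086L19Rik",
    "2810002D19Rik", "BC006965", "D630029K05Rik", "Meg3", "A330023F24Rik",
}

# lncRNAs appearing in both top-10 lists, sorted for a stable display order.
shared = sorted(overlap_top_lncrnas & crosstalk_top_lncrnas)
print(shared)  # → ['2810002D19Rik', 'Gm14207', 'H19']
```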