(1) Identification of DEGs in thyroid cancer
As shown in Table 2, there are 340 DEGs in DTC-NTH; 162 DEGs in PDTC-NTH; 279 DEGs in PDTC-DTC; 484 DEGs in ATC-NTH; 5 DEGs in ATC-DTC; 276 DEGs in ATC-PDTC, respectively. The DEGs list of DTC-ALL is equal to that of DTC-NTH. The DEGs list of PDTC-ALL is merged by that of PDTC-NTH and PDTC-DTC. And the DEGs list of ATC-All is merged by ATC-NTH, ATC-DTC, and ATC-PDTC. There are 340, 427, and 595 DEGs in DTC-ALL, PDTC-ALL, and ATC-ALL, respectively. Furthermore, in Fig. 1, it is found that 17 cross-expressed genes are identified in the occurrence of thyroid tumors (including DTC, PDTC, ATC) : ADH1B AGR3, C7, FAM3B, GPM6A, IPCEF, KLHDC8A, LONRF2, LRP1B, OGDHL, PLA2R1, PTX3, SMOC2, TENM1, TFF3, THRSP, TMEM139. Moreover, as shown in Figure S1 of the overlap gene lists circle figure, Blue represents the DEGs of DTC, and Green represents the DEGs of PDTC; Red represents the DEGs of ATC. The greater the number of purple links and the longer the dark orange arcs implies greater overlap among the input gene lists. It can be seen that PDTC owns longer dark orange color arc and more purple links, which means that there are more identical genes in the DEGs in development PDTC. We will further study the functional enrichment of DEGs of DTC, PDTC and ATC.
Table 2
The DEGs information of DTC-ALL, PDTC-ALL, and ATC-ALL
Tumor type
|
DTC-ALL
|
PDTC-ALL
|
ATC-ALL
|
DTC-NTH
|
PDTC-NTH
|
PDTC-DTC
|
ATC-NTH
|
ATC-DTC
|
ATC-PDTC
|
Numbers of DEGs
|
340
|
162
|
279
|
484
|
5
|
276
|
340
|
427
|
595
|
(2)Functional enrichment analysis of DEGs in DTC, PDTC, and ATC
As shown in Fig. 2, it could be seen that the molecular function of enriched DEGs of DTC, PDTC, and ATC have a significant difference. Extracellular Matrix Organization is significantly enriched in DTC. While the following items in the PDTC are significantly enriched: Supramolecular Organization Fiber, Regulation of Cell adhesion, Positive Regulation of Cell Death, Regulation of Cascade of MAPK, negative Organization Regulation of Cellular Component. The enriched items in the ATC are: NABA MATRISOME ASSOCIATED, the PID INTEGRIN1 PATHEWAY, Extracellular Matrix Organization, Cell Regulation of adhesion, Blood Vessel Development, Regulation of positive Locomotion, Junction Cell Organization, Response to wounding, ossification, negative Regulation of Cell adhesion. While the following items are significantly enriched both in the PDTC and ATC: Supramolecular Fiber Organization, Regulation of Cell adhesion, Blood Vessel Development, Regulation of positive Locomotion, Signaling by Receptor Tyrosine Kinases. Moreover, functional enriched gene biological processes of DTC, PDTC, and ATC have obvious differences. Cellular component organization or biogenesis is obviously enriched in DTC. The biological adhesion, cellular component organization or biogenesis, biological regulation, signaling, negative regulation of biological process, cellular process, positive regulation of biological process, metabolic process, localization, regulation of biological process has obvious enrichment in PDTC. While the following items are enriched in ATC: developmental process, response to stimulus, biological adhesion, locomotion, cellular component organization or biogenesis, positive regulation of biological process, metabolic process, immune system process, multicellular organismal process. And biological adhesion, cellular component organization or biogenesis, positive regulation of biological process, the metabolic process are enriched in both PDTC and ATC.
Gene list enrichments are identified in the following ontology categories: TRRUST, DisGeNET, PaGenBase. The top few enriched clusters (one term per cluster) are shown in the Figure S2-4. As can be seen in Figure S2, the transcription factor (TF) enriched in ATC is significantly more than that in PDCT and DTC. And the transcription factor (TF) enriched in PDCT is more than that in DTC. The transcription factor (TF) of DTC is obviously enriched in VHL. The enriched transcription factor (TF) of PDTC: NFKB1, 53, 2F3, NF24, RELB. The transcription factor of ATC is significantly enriched in: NFKB1, SP1, RELA, TP53, JUN, ETS1; The common transcription factors (TF) enriched in PDTC and ATCare: NFKB1, SP1, STAT3, RELA, TP53, HDAC1, TFAP2A, E2F1, JUN.
Figure S5 shows the PPI network of DTC, which is enriched in cell morphogenesis involved in differentiation, chemotaxis, and taxis, with a significant difference. And the PPI network of DTC has 4 MCODEs. And in Figure S6., it can be seen that the enrichment of PPI network in PDTC are Regulation of Cytoskeleton Organization, Cell Division, positive kinase Activity Regulation, with significant difference. And the PPI network of PDTC has 4 MCODEs. Also, the Figure S7. Shows that the enrichment of PPI network in the ATC are extracellular matrix organization, extracellular structure organization, Extracellular matrix organization. And the PPI network of ATC has 11 MCODEs. The Table 3 summarizes the enrichment items of MCODEs in DTC, PDTC, and ATC, which could be found that chemotaxis may function in both DTC and PDTC. Moreover, the MCODE_2 in PDTC reveal sthat epidermal growth factor receptor signaling pathway might play an important role in development of PDTC.
Table 3
Summary of the enrichment items of MCODEs in DTC, PDTC, and ATC
|
DTC
|
PDTC
|
ATC
|
Enrich-ment items of PPI
|
differentiation
|
Regulation of Cytoskeleton Organization
|
extracellular matrix organization
|
chemotaxis
|
Cell Division
|
extracellular structure organization
|
taxis
|
positive kinase Activity Regulation
|
Extracellular matrix organization
|
Numbers of MCODE and Enrich-ment items
|
4
|
MCODE_1
|
G alpha (i) signalling events,
GPCR ligand binding,
positive regulation of chemotaxis
|
4
|
MCODE_1
|
negative regulation of dendritic cell apoptotic process,
positive regulation of chemotaxis,
regulation of dendritic cell apoptotic process
|
11
|
MCODE_1
|
Chemokine receptors bind chemokines,
chemokine-mediated signaling pathway,
Chemokine signaling pathway
|
MCODE_2
|
GO in Resolution of Sister Chromatid Cohesion,
Amplification of signal from the kinetochores,
Amplification of signal from unattached kinetochores via a MAD2 inhibitory signal
|
MCODE_2
|
epidermal growth factor receptor signaling pathway,
ERBB signaling pathway,
Clathrin-mediated endocytosis
|
MCODE_3
|
ADORA2B mediated anti-inflammatory cytokines production,
Anti-inflammatory response favouring Leishmania parasite infection,
Leishmania parasite growth and survival
|
MCODE_4
|
GO is in Cargo concentration in the ER,
COPII vesicle coating,
vesicle targeting,
rough ER to cis-Golgi
|
MCODE_3
|
Constitutive Signaling by Aberrant PI3K in Cancer,
PI3K/AKT Signaling in Cancer,
PI5P/ PP2A and IER3 Regulate PI3K/AKT Signaling
|
MCODE_3
|
ADORA2B mediated anti-inflammatory cytokines production,
Anti-inflammatory response favouring Leishmania parasite infection,
Leishmania parasite growth and survival
|
MCODE_5
|
Classical antibody-mediated complement activation,
Creation of C4 and C2 activators,
Initial triggering of complement
|
MCODE_6
|
RHO GTPases in the Activate Oxidases of NADPH,
Antimicrobial Peptides,
Antimicrobial Response humoral
|
MCODE_4
|
Amplification of signal from unattached kinetochores via a MAD2 inhibitory signal,
Amplification of signal from the kinetochores,
Cell Cycle, Mitotic
|
MCODE_7
|
NCAM1 interactions,
NCAM signaling for neurite out-growth,
ECM proteoglycans
|
MCODE_10
|
PID INTEGRIN1 PATHWAY
|
(3)Functional enrichment analysis of DEGs in PDTC
As shown in Figure 3.A, it can be seen that PDTC-NTH and PDCT-DTC functional enrichment gene molecular functions are quite different. The enrichment of functional enrichment gene molecular functions in PDTC-NTH are: regulation of plasma membrane bounded cell projection organization, NABA CORE MATRISOME. While the following items are enriched PDTC-DTC: regulation of cytoskeleton organization, negative regulation of cellular component organization, regulation of mitotic cell cycle, regulated exocytosis, Hemostasis, positive regulation of cell death, positive regulation of protein kinase activity, nuclear division, cell division, Signaling by Receptor Tyrosine Kinases, negative regulation of cell cycle, neural precursor cell proliferation, response to oxidative stress.
The functional enriched gene biological processes of PDTC-NTH and PDCT-DTC have obvious differences and similarities, as shown in Figure 3.B. In PDTC-NTH, cellular process, cellular component organization or biogenesis, immune system process are significantly enriched. While the functional enriched gene biological processes of PDCT-DTC is enriched in cellular process, cellular component organization or biogenesis , cell proliferation, regulation of biological process, signaling, negative regulation of biological process, cellular process, positive regulation of biological process, metabolic process, biological regulation, localization , the reproductive process.
When multiple gene lists are provided, all gene lists are merged into one list called "PDTC-ALL". PPI network and MCODE components are identified in the "PDTC-ALL"gene lists. All lists merged are colored by Counts (Figure S8.A) and Cluster (Figure S8.B), respectively. The PPI network of PDTC-ALL is enriched in Regulation of Cytoskeleton Organization, Cell Division, positive kinase Activity Regulation. And the Figure 4 show that PPI network of PDTC-ALL have 5 MCODEs. The enrichment of MCODE_1 in PDTC-ALL is in negative regulation of dendritic cell apoptotic process, positive regulation of chemotaxis, regulation of dendritic cell apoptotic process. While the following items are enriched in MCODE_2 in PDTC-ALL: epidermal growth factor receptor signaling pathway, ERBB signaling pathway, Clathrin-mediated endocytosis. And the enrichment of MCODE_3 in PDTC-ALL is in Post NMDA receptor activation events, Activation of NMDA receptors and postsynaptic events, glycolytic process. The MCODE_4 in PDTC-ALL is enriched in Amplification of signal from the kinetochores, Amplification of signal from unattached kinetochores via a MAD2 inhibitory signal, Cell Cycle, Mitotic.
Table 4 shows the core genes of PDTC-ALL in different MCODEs. The core genes of MCODE_1 in PDTC-ALL are: CXCL12, CCL21, CCL19, LPAR1, GABBR2, APP, MAT2A, MYH10, RAD2, ACTB, TUBB2B, KPNA5, EEF1A1, YWHAE, PAK3. The core genes of MCODE_2 in PDTC-ALL are: TGFB1, CTNNB1, GSK3B, MAP, GAP1, TWF1, EGF, HIP1, VAMP3, CD3D, SH3GL2. The core genes of MCODE_3 in PDTC-ALL are: TPR, PRKAA2, STIP1, PGM5, OGDHL, CFL1, CAMK1, UBA1, RAB11B, RAB5C, PRKAR1A. The core genes of MCODE_4 in PDTC-ALL are: AURKA, CENPA, NUF CLP1. The core genes of MCODE_5 in PDTC-ALL are: MEOX2, STX11, RUNX1T1.
Table 4
Summary of the core genes of PDTC-ALL in different MCODEs
Number of MCODES in PDTC-ALL
|
Enrichment items of MCODEs in PDTC-ALL
|
Core genes of MCODEs in PDTC-ALL
|
MCODE_1
|
negative regulation of dendritic cell apoptotic process,
positive regulation of chemotaxis,
regulation of dendritic cell apoptotic process
|
CXCL12, CCL21, CCL19, LPAR1, GABBR2, APP, MAT2A, MYH10, RAD2, ACTB, TUBB2B, KPNA5, EEF1A1, YWHAE, PAK3
|
MCODE_2
|
epidermal growth factor receptor signaling pathway,
ERBB signaling pathway,
Clathrin-mediated endocytosis
|
TGFB1, CTNNB1, GSK3B, MAP, GAP1, TWF1, EGF, HIP1, VAMP3, CD3D, SH3GL2
|
MCODE_3
|
Post NMDA receptor activation events,
Activation of NMDA receptors and postsynaptic events,
glycolytic process
|
TPR, PRKAA2, STIP1, PGM5, OGDHL, CFL1, CAMK1, UBA1, RAB11B, RAB5C, PRKAR1A
|
MCODE_4
|
Amplification of signal from the kinetochores,
Amplification of signal from unattached kinetochores via a MAD2 inhibitory signal,
Cell Cycle,
Mitotic
|
AURKA, CENPA, NUF CLP1
|
MCODE_5
|
|
MEOX2, STX11, RUNX1T1
|
(4)Comprehensive co-Expression Network Analysis and Candidate Biomarker in PDTC
In order to identify which gene may potentially play a biomarker in the development of PDTC, all significantly DEGs are used to construct gene co-expression network, as shown in Figure 5. The degree of a node describes the number of links one gene with others in the gene network. In the PDTC co-expression network, we founded the gene-expressions in PDTC are largely consistent with the previous study, such as the expression of CCND1 and KRAS. Some novel observations are also made in our study. Interestingly, the central of this network are CCNA2, EGF, which directly interacted with more than 20 neighboring genes. Furthermore, as shown in Figure 18, the top40 Hub genes are confirmed with MNC algorithm: CCNA2, EGF, AURKA, CCND1, CDC20, CD44, CTNNB1, ASPM, KIF4A, KIF20A, RACGAP1, KRAS, CKS2, DLGAP5, ACTB, PBK, NCAPG, RAD51AP1, NUSAP1, KIF15 , NUF2, APP, KIAA0101, DEPDC1, MELK, ANLN, CDCA2, KIF14, SHCBP1, FANCI, OIP5, HGF, CXCL12, FAM83D, CDCA3, E2F8, CCNE2, STAT1, CD40, BCL2L1.
(5)Relationship between the Expression of Hub genes and Clinical Features of thyroid cancer
Further the top 40 Hub gene expression and survival analysis in thyroid cancer are checked in GEPIA. As shown in Figure 6, the expression of EGF, CCND1, DEPDC1, ANLN, HGF, and BCL2L1 in thyroid tissue and thyroid cancer tissue are significantly different, with a certain impact on the survival of thyroid tumors. The Figure shows the survival analysis curve of EGF, CCND1, DEPDC1, ANLN, HGF, and BCL2L1 in patients with thyroid cancer. Among them, the HR of BCL2L1 and CCND1 are less than 1.0, which are 0.32 and 0.26 respectively. The HR of EGF, DEPDC1, ANLN and HGF are greater than 1.0, which are 5, 2.69, 2.99, 3.19 respectively.
By searching the human protein atlas database for evaluation of Hub gene protein expression, it was found that the protein expression of CCND1, DEPDC, ANLN, HGF, BCL2L1 in thyroid cancer and thyroid tissue had a certain difference, as shown in the Figure 8. The data of EGF was not retrieved, and EGFR was further searched and shown.
(6) GSEA reveal a close relationship between hub genes and tumor proliferation
Further the associated regulatory pathway analysis of DEGs of PDTC-NTH and PDTC-DTC are conducted in GSEA JAVA. There are 89/174 gene sets upregulated in phenotype PDTC-NTH. Two gene sets are enriched significantly at p-value < 1%, including KEGG_THYROID_CANCER and KEGG_GLIOMA. And 11 gene sets are significantly enriched at p-value <5%. The p-value of KEGG_P53_SIGNALING_PATHWAY is equal to 0.51. The Figure 9 shows the enrichment plot of these 12 KEGG pathway in PDTC_NTH.
There are 58/174 gene sets upregulated in phenotype PDTC_DTC. And 2 gene sets are significantly enriched at p-value < 5%, including KEGG_TRYPTOPHAN_METABOLISM and KEGG_STARCH_AND_SUCROSE_METABOLISM. The Figure 10 shows the enrichment plot of these 2 KEGG pathway in PDTC_DTC.