Bioinformatic analysis and identication of potential hallmarks in poorly differentiated thyroid cancer

Purpose: The accumulation of malignant tumor gene mutations makes differentiated thyroid cancer (DTC) gradually dedifferentiated to poorly differentiated thyroid cancer (PDTC) or anaplastic thyroid cancer (ATC). This study analyzed the gene expression prole in the process of dedifferentiation of thyroid cancer, aiming to explore the molecular mechanism of PDTC and ATC. Methods: Eight series from GEO database are collected for differentially expressed genes (DEGs) of DTC, PDTC and ATC. Then, GO analysis, KEGG pathway analysis, and functional path enrichment analyses are performed by MATESCAPE. Hub-genes are identied by using the method of MNC in protein interaction networks. Moreover, the expression of hub-genes and hub-gene related survival in thyroid cancers are analyzed by GEPIA2. Finally, the functional path enrichment pathways of PDTC are performed by GSEA Java. Results: There are obvious differences among DEGs of DTC, PDTC, and ATC, and 17 cross gene of DEGs were found. The DEGs in PDTC were mainly concentrated in regulation of cytoskeleton organization, cell division, positive regulation of kinase activity. While the DEGs in ATC were mainly in extracellular matrix organization, extracellular structure organization. Furthermore, 6 hub-genes were obtained in the development of PDTC by using Cytoscape: EGF, CCND1, DEPDC1, ANLN, HGF, BCL2L1. The high expression levels of the genes of EGF, DEPDC1, ANLN, HGF were associated with the poor survival of thyroid cancer. While the high expression levels of CCND1 and BCL2L1 were benecial to the survival of patients with thyroid tumor. Finally, GSEA revealed that two gene sets are signicantly enriched, including KEGG_THYROID_CANCER and KEGG_GLIOMA. And CCND1 and EGF have a core gene position in KEGG_GLIOMA. Conclusions: The molecular function (MF), biological processes (BP) and KEGG pathways have changed during the process of malignant transformation of thyroid cancer. Especially, the activation of EGF-EGFR pathway promotes the malignant transformation of thyroid tumor. Furthermore, six hub-genes, especially CCND1 and EGF, could become potential biomarkers of PDTC. This study can serve as a reference to understand the malignant transformation of thyroid cancer. co-expression analysis. Cytoscape (https://cytoscape.org/) is an open-source software platform for visualizing complex networks and integrating networks from the attributing data. We visualized the results of the protein interaction network of STRING through Cytoscape. In the present study, the hub genes are calculated by the method of MNC. of mitotic cell regulated exocytosis, Hemostasis, positive regulation of cell death, positive regulation of protein kinase activity, nuclear division, cell division, Signaling by Receptor Tyrosine Kinases, negative regulation of cell cycle, neural precursor cell proliferation, response to oxidative stress. EGF-induced nuclear localization of SHCBP1 activates β-catenin signaling and promotes cancer progression, Suppression of Wnt/β-catenin signaling by EGF receptor is required for hair follicle development , The critical role of EGF-β-catenin signaling in the epithelial-mesenchymal transition in human glioblastoma , EGF Inhibits Wnt/β-Catenin-Induced Osteoblast Differentiation by Promoting β-Catenin Degradation, FBXW2 suppresses migration and invasion of lung cancer cells via promoting β-catenin ubiquitylation and degradation, β-Catenin nuclear localization positively feeds back on EGF/EGFR-attenuated AJAP1 expression in breast cancer, TMEPAI/PMEPA1 inhibits Wnt signaling by regulating β-catenin stability and nuclear accumulation in triple negative breast cancer cells, Elevated β-catenin pathway as a novel target for patients with resistance to EGF receptor targeting drugs, Epidermal growth factor can signal via β-catenin to control proliferation of mesenchymal stem cells independently of canonical Wnt signalling, WNT7A Promotes EGF-Induced Migration of Oral Squamous Cell Carcinoma Cells by Activating β-Catenin/MMP9-Mediated Signaling.


Introduction
In recent years, the incidence of patients with thyroid tumors has shown a continuous and obvious increasing trend worldwide [1][2][3]. According to "Cancer Statistics, 2018", the global incidence of thyroid tumor patients has risen to the fth place among female systemic malignancies [4]. Based on the degree of differentiation, thyroid cancer could be classi ed as differentiated thyroid carcinoma (DTC), poorly differentiated thyroid carcinoma (PDTC) and anaplastic thyroid cancer (ATC). Basic research has found that changes of MAPK signaling pathway and PI3K-AKT signaling pathway are mainly involved in the formation of papillary thyroid cancer and follicular thyroid cancer, respectively. As the grade of thyroid tumor improves, the signal molecules of the two pathways become activated more frequently [5][6][7].
And when genes in the MAPK signal pathway are mutated, it will drive thyroid cells to papillary thyroid cancer; while the PI3K-AKT signal pathway is activated, it will drive the occurrence of follicular thyroid cancer [5].
Several rare types of BRAF mutations found in papillary thyroid cancer mainly affect the nucleotides around codon 600 and activate BRAF kinase. The activated BRAF kinase promotes the high expression of its downstream signal transduction pathways of MAPK, leading to excessive proliferation of thyroid cancer cells [8,9]. Studies have found that BRAF V600E is closely related to the malignant pathological manifestations of papillary thyroid cancer, including aggressive pathological features, increased recurrence, and loss of radioactive iodine a nity [10,11]. Animal experiments have proved that the expression of the mutant BRAF-V600E in transgenic rats can promote goiter enlargement and produce invasive growth of papillary thyroid carcinoma [12]. In addition to the BRAF-V600E mutation, RAS mutations are also common mutations in thyroid cancer, which encodes the protein of p21. When it is mutated and binds to GTP, RAS is activated. There are three hotspots of RAS mutations: HRAS, KRAS and NRAS. Among them, NRAS mutation is the most common RAS mutation in thyroid tumors, mainly involving codons 12 and 61 [13]. RAS mutations are a kind of classic dual activator of MAPK and PI3K-AKT signaling pathways, but RAS mutations in thyroid tumors seem to preferentially activate PI3K-AKT signaling pathways. RAS mutations are priority to combine with phosphorylated AKT in thyroid cancer [12,14].Studies have found that with the accumulation of mutated genes and the activation of two pathways, differentiated thyroid cancer can continue to progress to poorly differentiated thyroid cancer (PDTC) and anaplastic thyroid cancer (ATC) [15]. And gene mutations (including TP53, CTNNB1 and ALK mutations) further accelerate the occurrence and development of this process [16][17][18].
However, poorly differentiated thyroid cancer and anaplastic thyroid cancer are not easy to detect and the results of surgery are dissatisfactory, and many anaplastic thyroid cancer ptietients even lose the opportunity for surgery. The treatment of 131 I is even more unsatisfactory, and the prognosis is signi cantly poor. According to statistics, the 5-year survival rate of poorly differentiated thyroid cancer is only 50%, which is far lower than 95% of differentiated thyroid cancer. And the average survival time of anaplastic thyroid cancer is only 3-6 months [19,20]. Early and accurate diagnosis of poorly differentiated thyroid cancer and undifferentiated thyroid cancer is particularly important. But the molecular mechanism of PDTC is not well understood, and there are relatively few studies about hallmarks of PDTC.
Recently with the development of microarray and next-generation sequencing technologies, a growing number of microarray and sequencing data have been explored to seek novel biomarkers and therapeutic targets for thyroid cancer [21]. However, small sample sizes in individual studies and different Technological platforms create substantial inter-study variability and di cult statistical analyses. To solve this problem, integrated bioinformatics meta analysis have been utilized in various cancer studies [22]. This research aims to explore the key hallmark genes in the process of dedifferentiation of thyroid cancer. And we will further explore the molecular mechanism of PDTC and ATC, and provide a theoretical basis for its early accurate diagnosis and precise treatment.

Methods
(1) Identi cation of DEGs in thyroid cancer As shown in Table 1, eight series (including GSE3467, GSE3678, GSE27155, GSE53072, GSE65144, GSE76039, GSE82208, GSE53157) are obtained by searching GEO database with keyword of thyroid cancer, and limited the species as Homo sapiens. The data of mRNA expression are extracted and class ed as DTC, PDTC, ATC, normal thyroid. And they are grouped into: DTC-NTH (Normal thyroid); PDTC-NTH, PDTC-DTC; ATC-NTH, ATC-DTC, ATC-PDTC. The raw expression data are processed and normalized by R languae with package of "Affy" and "limma". The missing values are calculated by the method of KNN. The inter-batch differences (combat) are removed through R package of "sva".
The DEGs expression pro le is calculated by Bayesian test with R package of "futile.logger", with parameters of p-value < 0.05 & abs (log2 fold change difference) > 0.5. And then the heatmap of DEGs are conducted and clustered by using R package of "gplots".

GSEA reveal a close relationship between hub genes and tumor transformation
Gene set enrichment analysis (GSEA) is a computational method that determines whether the members of a gene set are randomly distributed throughout the entire reference gene list or are found primarily at the top or bottom of gene list.
We performed GSEA using the Java GSEA implementation (supported by Java 8) to validate the enrichment analysis.

Results
(1) Identi cation of DEGs in thyroid cancer As shown in Table 2 Figure S1 of the overlap gene lists circle gure, Blue represents the DEGs of DTC, and Green represents the DEGs of PDTC; Red represents the DEGs of ATC. The greater the number of purple links and the longer the dark orange arcs implies greater overlap among the input gene lists. It can be seen that PDTC owns longer dark orange color arc and more purple links, which means that there are more identical genes in the DEGs in development PDTC. We will further study the functional enrichment of DEGs of DTC, PDTC and ATC.   Figure S5 shows the PPI network of DTC, which is enriched in cell morphogenesis involved in differentiation, chemotaxis, and taxis, with a signi cant difference. And the PPI network of DTC has 4 MCODEs. And in Figure S6., it can be seen that the enrichment of PPI network in PDTC are Regulation of Cytoskeleton Organization, Cell Division, positive kinase Activity Regulation, with signi cant difference. And the PPI network of PDTC has 4 MCODEs. Also, the Figure S7. Shows that the enrichment of PPI network in the ATC are extracellular matrix organization, extracellular structure organization, Extracellular matrix organization. And the PPI network of ATC has 11 MCODEs. The Table 3 summarizes the enrichment items of MCODEs in DTC, PDTC, and ATC, which could be found that chemotaxis may function in both DTC and PDTC. Moreover, the MCODE_2 in PDTC reveal sthat epidermal growth factor receptor signaling pathway might play an important role in development of PDTC.   In order to identify which gene may potentially play a biomarker in the development of PDTC, all signi cantly DEGs are used to construct gene co-expression network, as shown in Figure 5. The degree of a node describes the number of links one gene with others in the gene network. In the PDTC co-expression network, we founded the gene-expressions in PDTC are largely consistent with the previous study, such as the expression of CCND1 and KRAS. Some novel observations are also made in our study. Interestingly, the central of this network are CCNA2, EGF, which directly interacted with more than 20 neighboring genes. Furthermore, as shown in Figure 18 By searching the human protein atlas database for evaluation of Hub gene protein expression, it was found that the protein expression of CCND1, DEPDC, ANLN, HGF, BCL2L1 in thyroid cancer and thyroid tissue had a certain difference, as shown in the Figure 8. The data of EGF was not retrieved, and EGFR was further searched and shown. The p-value of KEGG_P53_SIGNALING_PATHWAY is equal to 0.51. The Figure 9 shows the enrichment plot of these 12 KEGG pathway in PDTC_NTH.
There are 58/174 gene sets upregulated in phenotype PDTC_DTC. And 2 gene sets are signi cantly enriched at p-value < 5%, including KEGG_TRYPTOPHAN_METABOLISM and KEGG_STARCH_AND_SUCROSE_METABOLISM. The Figure 10 shows the enrichment plot of these 2 KEGG pathway in PDTC_DTC.

Discussion
In recent years, studies have shown that thyroid cancer is a continuous process. With the accumulation of mutant genes, DTC can continue to progress into PDTC and ATC. In this study, it could be noted that Extracellular Matrix Hub genes in thyroid cancer were analyzes in GEPIA. As shown in Figs. 5 and 6, the expression of EGF, CCND1, DEPDC1, ANLN, HGF, and BCL2L1 in thyroid tissue and thyroid cancer tissue are signi cantly different, with a certain impact on the survival of thyroid tumors. EGF is a strong cell division promoting factor, which is closely related to the occurrence and development of tumors. The activation of EGF-EGFR are accelerating cell proliferation, involved in the regulation of autophagy activity. In the meantime it will promote the malignant transformation of tumors by regulating the cell adhesion through epithelial-mesenchymal cell turnover (EMT) [25]. Ota et.al [26] have found that heparin-binding EGFlike growth factor activity enhanced invasion and metastasis of thyroid cancer; the study of Xue indicates MiR-200 mediated by EGF/EGFR signaling regulates epithelial-mesenchymal transition in anaplastic thyroid cancer [27]. EGFRmediated downstream signaling pathways mainly include PI3K/AKT/mTOR signaling pathway, JAK/STAT signaling pathway and MAPK/P38 signaling pathway. Hepatocyte growth factor (HGF) is ligand of tyrosine kinase receptor (epithelial mesenchymal conversion factor). And the combination of the two can induce the activation of carcinogenic pathways, promote angiogenesis and tumor metastasis [28]. Knauf et al. [29] found that HGF/Met activation mediates the resistance of mouse anaplastic thyroid cancer to BRAF inhibition, but its mechanism of action is not yet clear. In addition, both EGF and HGF can induce the activation of β-catenin and promote cell migration [30,31], thereby activating the Wnt/β-catenin signaling pathway.. The abnormal activation of Wnt /β-catenin signaling pathway is usually caused by somatic mutations, and its hot-spot mutation genes include: CTNNB1, KRAS, BRAF, NRAS, ERBB2, MET, PIK3CA, JAK1, etc [32]. CTNNB1 also plays an important role in epidermal growth factor receptor signaling pathway. However, the mechanism of EGF-EGFR and HGF in the malignant transformation of thyroid tumors needs further study.
Moreover, tumors are a type of diseases that cause uncontrolled cell growth and cell cycle disorders due to multi-gene mutation. Among them, a series of factors jointly regulate the cell cycle, and the role of cyclin is the most important, which is close related to regulate the growth of tumors. Cyclin 1 (CCND1) can combine with cyclin-dependent kinases (CDKs), such as CDK4 or CDK6, to form a complex and promote the cell cycle from G1 to S Phase transformation [33]. Sun et al. found that the silencing of E2F8 signi cantly inhibited the proliferation of PTC cells in vivo and in vitro by down-regulating CCND1, leading to G1 phase arrest [34]. In addition, CCND1 also plays a pivotal role in promoting tumor invasion and metastasis. Guo et al. [35] found that long-chain non-coding RNA NR2F1-AS1 promotes the proliferation and migration of thyroid cancer cells by regulating the miRNA-338-3p/CCND1 axis. Furthermore, anillin (ANLN) is a gene encoding actin binding protein, which is considered to be an important part of cell division. Recent studies have found that ANLN is involved in the occurrence and development of a variety of malignant tumors, and can promote tumor cell growth, migration and invasion. In bladder cancer, down-regulation of ANLN expression signi cantly reduces cell proliferation, migration and invasion capabilities [36]. Wang et al. [37] found that ANLN is involved in the progression of pancreatic cancer by regulating the EZH2/miR-218-5p/LASP1 signal axis. Although there have been ndings pointing to a contribution of ANLN to the development of cancer, the association of ANLN and thyroid cancer remains not understood. Recently, the impact of DEPDC1 on the biology of various malignant tumors has attracted the attention of many scholars. Human-derived DEPDC1 is enhanced by mitotic cell cycle thereby resulting in cause -related genes tumorigenesis [38]. And BCL2L1 belongs to BCL2 Family. It can form heterodimers with Bax and Bak in cells, thereby inhibiting the opening of VDAC channels and the occurrence of apoptotic reactions [39]. Therefore, BCL2L1 is an