Integrated bioinformatics analysis and screening hub genes in papillary thyroid cancer

Abstract


Abstract Background
With the increasing incidence, papillary thyroid cancer (PTC) is receiving more and more attention, but the pathogenesis of which is still not completely elucidated. The purpose of this study was to explore key biomarkers and new therapeutic targets in PTC. GEO2R and Venn online software were used for differential gene screening analysis. Hub genes were screened via STRING and Cytoscape, following Gene Ontology and KEGG enrichment analysis. Finally, survival analysis and expression validation were performed via UALCAN online software and immunohistochemistry.

Results
We screened 334 consistently differentially expressed genes (DEGs), composed of 136 upregulated genes and 198 downregulated genes. Gene ontology enrichment analysis suggested that DEGs mainly enriched in the cancer-related pathways and functions. PPI network visualization was performed to select 17 upregulated and 13 downregulated DEGs. Finally, the expression veri cation and overall survival analysis conducted in the Gene Expression Pro ling Interactive Analysis Tool (GEPIA) and UALCAN showed that LPAR5, TFPI and ENTPD1 were related to the development of PTC and the prognosis of PTC patients, and the expression of LPAR5 was veri ed by tissue chip.

Conclusions
In summary, the hub genes and pathways identi ed in the present study not only provided new biomarkers for PTC, but also will be useful for elucidating the pathogenesis of PTC.

Background
For the past few years, more and more attention has been paid to thyroid diseases and the incidence of thyroid cancer has signi cantly increased [1]. The thyroid cancer is divided into ve types according to histogenesis and morphology, including anaplastic, Hurthle cell, follicular, medullary and papillary thyroid cancer in which papillary thyroid cancer (PTC) accounts for 80% of the overall incidence [2]. With the widespread use of neck ultrasound and puncture, the number of mortality due to thyroid cancer is signi cantly reduced [3]. However, survival rate is affected by many factors, the prognosis of PTC is still very poor despite the use of multiple treatments such as thyroidectomy, radio-iodine treatment and chemistry-therapy [4]. Therefore, the early prevention and diagnosis of thyroid cancer is still an urgent problem for doctors and scientists, and it is very meaningful to explore potential key biomarkers and novel therapeutic targets in PTC for doctors and patients.
With the large-scale application of high-throughput screening technology, we have found many novel genes associated with disease initiation and progression and had a deep understanding of the molecular mechanisms of various tumor development [5][6][7][8]. In present study, we used the bioinformatics methods to mine the microarray data from GEO, analyzed the DEGs between PTC and normal tissues. Then, we built the protein-protein interaction (PPI) network, performed the analysis of functions enrichment and survival analysis, and identi ed 3 hub genes and discovered the biological processes and signaling pathways associated with PTC. In conclusion, the integrated analysis provided some new biomarkers for PTC, which could be valuable for the further research on the mechanism and the clinical application of diagnosis, prognosis and therapy of PTC.

Acquisition of microarray data
We obtained the high-throughput gene expression pro les of PTC and normal thyroid tissues from the GEO database. The independent datasets of GSE3678, GSE33630 and GSE53157 were selected and all of them were based on the GPL570 Platformsincluding 7 PTC tissues and 7 normal tissues, 49 PTC tissues and 45 normal tissues and 7 PTC tissues and 3 normal tissues, respectively.

Screening of DEGs
The DEGs between PTC tissues and normal tissues were screened via GEO2R tool. The cut-off criteria of logFC ≥1 and adjust P-value<0.05 were considered statistically signi cant. Then, the DEGs in three datasets were screened out by Venn software online. The DEGs with logFC ≥1 were considered as upregulated genes, while DEGs with logFC <1 were considered as down-regulated genes.

Enrichment analysis via GO and KEGG pathway
In order to characterize the functional roles of the DEGs, we used the Database for Annotation, Visualization, and Integrated Discovery (DAVID, version 6.8) [9] for GO enrichment analysis which included biological process (BP), cellular component (CC), molecular function (MF), and KEGG pathways analysis with a cut-off of P-value<0.05.

Construction of PPI network and analysis of module
The PPI network was built via the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database to construct to uncover the relationships of DEGs based on a minimum required interaction score = 0.4 [10]. PPI network was displayed and analyzed by Cytoscape (version 3.6.1) software [11]. In addition, the core modules of the PPI network were screened via the MCODE (node score cutoff = 0.2, degree cutoff = 2, k-score = 2, and max. Depth = 100).
2.5 Survival analysis and validation of the hub genes expression UALCAN was used for analyzing the relationship between key genes expression and survival of patients with PTC, which is a good resource for analyzing transcriptome data of cancers from The Cancer Genome Altas (TCGA) [12]. The Gene Expression Pro ling Interactive Analysis tool (GEPIA) [13] was used for analyze RNA expression data on the basis of thousands of samples from TCGA and the GTEx projects. P-value<0.05 was considered statistically signi cant.

Tissue samples and immunohistochemistry (IHC)
Human papillary thyroid cancer tissue microarray sections (HThyP120CS02) was obtained from Shanghai Outdo Biotech Co. Ltd. (Shanghai, China). The tissue samples were procured from 62 patients with papillary thyroid cancer which consisted of 58 cancer tissues and 58 para-cancerous tissues as well as 3 normal thyroid tissues and 1 chronic lymphocytic thyroiditis. The two-step EnVision method was used to perform immune histochemical experiments, along with different primary antibodies against LPAR5 (1:50). The study was approved by the Ethics Committee of Shanghai Outdo Biotech Company (Shanghai, China). The slides were analyzed using the Image-Pro PLUS software program (Media Cybernetics, Inc. USA).

Statistical analysis
Statistical analysis was performed via SPSS 22.0 and GraphPad Prism 8.0 software. Results were presented as the mean ± standard deviation. The statistically signi cant changes were indicated with asterisks and the P-values were calculated via a Student's t-test, wherein *, **, and *** represented P < 0.05, P < 0.01 and P < 0.001, respectively.
The top 10 GO terms were shown in Fig. 2. Furthermore, KEGG pathway analysis showed that the DEGs were enriched in 8 pathways including pathways in cancer, Rap1 signaling pathway, transcriptional misregulation in cancer, insulin resistance, small cell lung cancer, adipocytokine signaling pathway, complement and coagulation cascades and tyrosine metabolism (Fig. 2).

Construction of PPI network and analysis of module
By STRING and Cytoscape analysis, we drew the PPI network complex constructed by 241 nodes and 442 edges, including 101 upregulated and 137 downregulated genes. Then we made further analysis by applying Cytoscape MCODE plus and found 30 central nodes including 17 upregulated genes and 13 downregulated genes (Fig. 3, 4 & Table 2).

Selection of hub genes and Validation of the expression level
We used GEPIA to identify 30 central genes survival data and found that the expression levels of 4 hub genes were signi cantly associated with survival of thyroid cancer patients which included ENTPD1, LPAR5, AKR1C3 and TFPI (P<0.05, Fig. 5). Furthermore, GEPIA was applied to validate the expression level of 4 hub genes in PTC and normal tissue respectively. Results showed that the expression of all the four genes were remarkably different in PTC samples contrasted to normal samples. ENTPD1 and LPAR5 expression levels were signi cantly upregulated while the expression levels of AKR1C3 and TFPI were remarkably downregulated in PTC samples than normal samples (Fig. 6). Moreover, UALCAN was used to analyze the expression levels of the four hub genes. We found that the expression levels of ENTPD1, LPAR5 and TFPI were remarkably different in PTC compared to normal tissue except for AKR1C3 (Fig. 7).
In order to verify the reliability of the prediction, we selected LPAR5 protein and analyzed its expression by using tissue microarray. The results indicated that LPAR5 expression appeared to be remarkably higher in papillary thyroid cancer tissues than in the adjacent tissues (Fig. 8).

Discussion
In this study, we analyzed gene expression pro les from three GEO datasets including GSE 3678, GSE 33630 and GSE 53157 by GEO2R and identi ed 136 upregulated and 198 downregulated genes, for a total of 334 DEGs. In particular, the data sets we selected are from the same data platform, and the purpose is to ensure data uniformity and reliability. 63 papillary thyroid cancer specimens and 55 normal thyroid specimens were enrolled in this research. Gene Ontology and KEGG pathway enrichment analysis suggested that these DEGs were remarkably enriched in some pathways including pathways in cancer, TGFβ receptor signaling pathway, growth factor binding and closely associated with BMP, SMAD, MAPK signal activity. Then, we constructed PPI network via STRING and Cytoscape software and screened 30 vital hub genes from the PPI network complex via MCODE module analysis. Furthermore, we found that 4 of 30 genes were closely associated with survival by GEPIA analysis. Expression validation from GEPIA and UALCAN showed that there was a signi cant difference in the expression of ENTPD1, LPAR5 and TFPI in PTC and normal tissue, which suggested that these genes might be critical in the tumorigenesis and development of PTC.
Previous research had indicated that the three genes screened by our study were involved in the initiation and development of tumors. Lysophosphatidic acid (LPA) is a biologically active mediator that affects cellular functions such as regulating cell proliferation, transcellular migration, differentiation, morphogenesis, and preventing apoptosis [14,15]. LPAR5 is a member of G protein-coupled transmembrane receptors which interact with LPA [16,17]. The cells treated by 12-Otetradecanoylphorbol-13acetate (TPA) showed higher motile activity than control cells, while LPAR5 knockdown reversed the phenomenon [18]. Recent studies on PTC showed that LPAR5 was associated with progression and overall survival in thyroid cancer, which is consistent with our study [19][20][21]. Tissue factor pathway inhibitor (TFPI), is the endogenous inhibitor of tissue factor induced blood coagulation, and its expression had been demonstrated in smooth muscle cells, monocytes, plate lets and several breast cancer cell lines [22][23][24]. Wang et al. suggested that down-regulation of TFPI may result in the tumor cell growth and migration, the suppression of TFPI by hypoxia microenvironment might be one of the supervise mechanisms by which hypoxia promotes the angiogenesis process and tumor growth [25]. miR-500 inhibition could surpress the proliferation and invasion prostate cancer cell and tumorigenicity in vivo, while TFPI knockdown reversed the effects [26]. The level of TFPI in luminal-A breast cancer patients is signi cantly lower than healthy volunteers [27]. ZARYCHTA et al. claimed that TF seemed to be a tumor-promoting factor while TFPI exerted tumor suppressor properties [28]. Cihangir et al. found that a signi cant decrease in TFPI levels in hyperthyroidism patients [29]. ENTPD1, which encodes for the protein CD39, is very important to the production of immunsuppressive adenosine [30]. Eman et al. con rmed that the level of ENTPD1 on CD4 positive T helper cells in chronic lymphocytic leukemia (CLL) patients was signi cantly higher compared to the controls and ENTPD1 and CD4 were remarkably expressed in high risk CLL patients [31]. Cai et al. found that overexpression of ENTPD1 is a predictor of poor prognosis for GC patients after radical gastric cancer (GC) resection [32]. The pathological research showed that the expression of ENTPD1 in squamous cell carcinoma of the head and neck (HNSCC) is positively correlated with tumor stage, which may predict a poor prognosis [33]. Interestingly, Bastid et al. found that ENTPD1 deletion promoted development of both induced and spontaneous autochthonous liver cancer in mice [34]. All these suggest that the immunotherapy treatment targeting CD39 may be promising. So far, there are few reports on the relationship between ENTPD1 and thyroid cancer. Thus, the speci c functions of ENTPD1 in thyroid cancer were worth to be further explored. However, further experiments will be necessary to explore pathogenesis basis and molecular mechanism of these hub genes in PTC.

Conclusions
In summary, our integrated bioinformatics study presented 3 hub genes (LPAR5, TFPI and ENTPD1) and pathways associated with PTC, which may be a reliable and potential biomarker and provide new insights into the diagnosis, prognosis and target therapy for PTC.

Declarations
Ethics approval and consent to participate All work were conducted in accordance with the Declaration of Helsinki (1964). The experiment was conducted with the human subjects' understanding and consent and was approved by the Ethics Committee of Tianjin Xiqing hospital. This study has obtained written informed consent from the participants Consent for publication The work described has not been published before. It is not under consideration for publication elsewhere. Its publication has been approved by all co-authors. Its publication has been approved by the responsible authorities at the institution where the work is carried out.