Expression and prognosis analyses of PEA3 subfamily in head and neck squamous cell carcinoma via integrated bioinformatics

Backgroud: The PEA3 subfamily belongs to the rst category of the ETS family, and it consists of three family members, namely, ETV1, ETV4, and ETV5. These members were implicated in the progression of many cancers. But their prognostic roles in head and neck squamous cell carcinoma (HNSCC) remain to be characterized. Methods: Data from multiple open databases were used to analyze the expression, regulation, mutations, immune inltration, and functional networks of the PEA3 subfamily in patients with HNSCC. Results: The results demonstrated that these patients showed downregulated ETV1 and upregulated ETV4/5 compared with healthy subjects. Furthermore, high expression of ETV4/5 and low expression of ETV1 were linked to the shortened overall survival in patients with HNSCC. The immune inltration in B cells stimulated the prognosis, and it may have a prognostic medical signicance linked to the PEA3 subfamily in HNSCC. These results showed that data mining eciently revealed the expression of the PEA3 subfamily in HNSCC, but this study requires further large-scale genomic and basic research. Conclusions: This study provided new insight into the prognostic roles of PEA3 subfamily in HNSCC with potential mechanistic values.


Introduction
Head and neck squamous cell carcinomas (HNSCCs) are the seventh most common cancer worldwide, occurring in more than half a million new patients every year, which have seriously endangered the lives and health of the population [1,2] . The 5-year overall survival rate (OS) of HNSCC patients is poor, and the survival rate of patients with local recurrence and metastasis is even lower [3,4] . Therefore, in order to better prognosis patients with head and neck squamous cell carcinoma, it is necessary to determine new indicators for prognostic evaluation and targeted therapy of HNSCC.
The E26 transformation speci city (ETS) family comprises the most prominent signal-dependent transcription factor families, consisting of 28 protein-coding genes in the human genome [5] . ETS transcription factors participate in tumorigenesis and development by regulating various biological processes, such as cell proliferation, migration, apoptosis, aging, angiogenesis, and stem-cell development [6] . They are divided into several different subfamilies based on the ETS domain's high amino acid conservation and the subgroup-speci c amino acid sequence [7] . The PEA3 subfamily belongs to the rst category of the ETS family, and it consists of three family members, namely, ETV1, ETV4, and ETV5, which were implicated in the progression of many cancers [8] . ETV1 is often deregulated in prostate cancer [9] and expressed explicitly in most gastrointestinal stromal tumors [10] . The ETV4 transcription factor was frequently activated in gastric cancer [11] , lung cancer [12] , hepatocellular carcinoma [13] , and ovarian cancer. ETV5 was implicated in endometrial cancer progression [14] , thyroid cancer [15] , and ovarian cancer [16] .
Given the extensive carcinogenic effects of the PEA3 subfamily, ETV1, ETV4, and ETV5 have been proposed as prognostic markers for patients with cancer [17] . In our study, the mutation signatures and expression levels of the PEA3 subfamily in HNSCC datasets were explored through integrated bioinformatics analysis to provide new insights into their potential functions, molecular mechanisms, and prognostic signi cance.

Oncomine database
The expression data of PEA3 mRNAs in HNSCC and standard samples were obtained from Oncomine (https://www.oncomine.org/ ) [18] , a database that currently includes 715 gene expression datasets and 86,733 samples, and p < 0.05 was used as the cut-off criterion.
Gene Expression Pro ling Interactive Analysis (GEPIA) dataset and TNM Database GEPIA ( http://gepia.cancer-pku.cn/ ) [19] is a newly created interactive online database that allows users to nd RNA-seq expression data or samples based on Genotype-Tissue Expression projects (GTEx) and The Cancer Genome Atlas (TCGA). The database is based on a criterion processing pipeline and offers customizable functions, such as pro ling on pathological stages, cancer types, differential expression analysis, survival analysis, correlation analysis, and similar gene detection. TNMplot ( https://www.tnmplot.com/) [20] is an integrated database that uses available transcriptome-level datasets to compare typical tumors and metastatic data across all genes in real-time. In our study, this database was used to compare the expression of the PEA3 subfamily in normal and tumor tissues. UALCAN analysis UALCAN (http://ualcan.path.uab.edu) [21] was used to analyze, integrate and discover transcriptomic cancer data and for in-depth analyses of TCGA gene expression information. The website could perform in-silico analysis of potential candidate genes of interest to assess the expression in various subgroups, such as age, gender, race, and grade.

Kaplan-Meier plotter
Kaplan-Meier (K-M) plotter (http://kmplot.com/analysis/) [22] was used to analyze the prognostic signi cance of the mRNA expression of different PEA3 subfamilies in HNSCC. The patients were divided into low-and high-expression groups in accordance with each PEA3 subfamily's median values and assessed using K-M survival plot.
cBioPortal and Genemania database The cBioPortal portal (http://cbioportal.org) [23] encompassing datasets of 245 cancer studies was used to analyze PEA3 subfamily mutations. The GeneMANIA database( https://genemania.org/ ) [24] is an online database used for searching known protein interaction relationships. It was used in the present study to explore the interaction network of the PEA3 subfamily, and the co-expressed genes were analyzed. GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) function analyses of the PEA3 subfamily and the co-expressed genes were performed on R software.
Tumor Immune Estimation Resource (TIMER) Analysis TIMER (https://cistrome.shinyapps.io/timer/) [25] is a comprehensive asset used to systematically evaluate tumor in ltration in CD8+ T cells, B cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells and their correlation with the expression of the PEA3 subfamily in HNSCC. The survival module was used to determine immune in ltrates and the PEA3 subfamily to obtain the survival differences, in which p < 0.05 was considered statistically signi cant.

Expression of PEA3 family members in HNSCC
The mRNA and protein expression of the PEA3 subfamily was investigated using the ONCOMINE and GEPIA datasets. As shown in Fig. 1A and 1B, the expression of ETV1 was higher in the normal tissues than in the cancer tissues. The expression of ETV4 and ETV5, on the contrary, was higher in the cancer tissues. The TNM plot dataset was also used to compare the PEA3 expression amongst cancer and normal tissues. The results indicated that ETV4/5 was signi cantly higher in HNSCC tissues than in normal tissues (p < 0.05, Fig. 1C).

Clinical subgroup analysis
A subgroup analysis of multiple clinical-pathological features was conducted using the TCGA database. Subgroup analysis by age, metastasis, status, gender subgroup, and cancer stages indicated that the transcriptional levels of ETV1 were lower in patients with HNSCC than in healthy individuals. Besides, subgroup analysis by age, cancer stages, metastasis status, HPV status, gender subgroup, and tumor grade demonstrated that the transcriptional levels of ETV4 and ETV5 were signi cantly higher in patients with HNSCC across all subgroups (Fig. 2).

Prognostic analysis
The prognostic signi cance, which was obtained from publicly available online datasets, of the PEA3 subfamily, was explored in patients with HNSCC. As shown in Fig. 3, the increased expression of ETV1 (HR = 0.73, 95% CI: 0.56-0.96, p = 0.021) was related to prolonged OS. The low expression levels of ETV4 (HR = 1.5, 95% CI: 1.14-1.96, p = 0.0031) and ETV5 (HR = 1.51, 95% CI: 1.15-1.99, p = 0.0026) were also related to prolonged OS. These results suggested that the levels of PEA3 may play an important role in HNSCC prognosis.
Predicted functions and pathways of PEA3 in HNSCC The mutation patterns and functional networks of PEA3 were analyzed using cBioPortal and GeneMANIA further to explore the biological role of PEA3 in HNSCC. Amongst the 827 HNSCC tumor samples sequenced, genetic alterations were detected in 132 samples, with a mutation rate of 16%. ETV5 was ranked as the most mutated gene, with a mutation rate of 16%. The network for PEA3 and 22 most frequently altered neighboring genes was also shown (Fig. 4).
The functions of PEA3 and these 22 genes were analyzed using GO and KEGG enrichment (Fig. 5). GO analysis indicated that the changes in biological processes included aging, positive regulation of hair cycle, extracellular matrix disassembly, replicative senescence, and negative regulation of metallopeptidase activity. The molecular function was mainly enriched in metalloendopeptidase inhibitor activity, transcription coactivator binding, mitogen-activated protein kinase binding, protein serine/threonine kinase activity, and transcription cofactor binding. Changes in cell components were primarily enriched in secretory granule lumen, cytoplasmic vesicle lumen, vesicle lumen, protein-lipid complex, and speci c granule lumen. According to KEGG analysis results, pathway enrichment analysis was mainly enriched in transcriptional misregulation in cancer, prostate cancer, cellular senescence, cholesterol metabolism, VEGF signaling pathway, MAPK signaling pathway, and FoxO signaling pathway.

Immune in ltrates in correlation with PEA3 in HNSCC
A statistically signi cant correlation was found between PEA3 expression in HNSCC and immune in ltrates (p < 0.05, Fig. 6). The expression levels of ETV1 and ETV5 were positively correlated with the immune in ltration levels in HNSCC, HNSCC-HPV-pos, and HNSCC-HPV-neg, whereas that of ETV4 was negatively correlated. We hypothesize that in ltration of immune cells signi cantly affect prognosis. Therefore, we explored the difference in cumulative survival between HNSCC, HNSCC-HPV-pos, and HNSCC-HPV-neg. The results showed that B cells, CD8+ T cells, and neutrophils of immune in ltrates statistically signi cant to PEA3 subfamily levels in HNSCC-HPV-pos subgroup, indicating that these immune cell in ltrations signi cantly affect prognosis. (Fig. 7).

Discussion
In this study, we explored the expression levels and mutational landscape of various PEA3 subfamilies in multiple HNSCC datasets and to elucidate their exact function in HNSCC. The results showed that the expression levels of speci c PEA3 subfamilies were correlated with prognosis and immune in ltration.
A growing number of studies have shown that the PEA3 subfamily promotes cell migration and invasion, thereby contributing to tumor progression and metastasis. Several studies have reported that the expression of ETV1 was important in muscle organ and cerebellar circuit development [26,27] . Lunardi et al. [28] demonstrated that ETV1 was a negative regulator of checkpoint kinase 1 (CHK1) in prostate cancer. ETV1 overexpression could inhibit CHK1, leading to the accumulation of DNA damage and the promotion of invasive tumorigenesis. MMPs are common genes related to cell migration and invasion.
Many studies have reported that ETV1 could stabilize β-catenin and directly bind to MMP1/7, resulting in increased accumulation of β-catenin and MMP1/7 and induced migration and invasion of prostate cancer cells [9,29,30] . A recent study has shown that the expression of ETV4 was activated by PI3K/Akt signaling in clear-cell RCC and ETV4 promoted cell migration by directly binding to its downstream promoter FOSL1 [31] . Moreover, in advanced prostate cancer, ETV4 has been reported to be overexpressed under the combinatorial activation of the PI3-kinase and RAS signaling pathways, indicating that ETV4 may represent a valuable therapeutic target in metastatic prostate cancer [32] . ETV5 was a target gene of the Src family kinase, and it promoted female germline stem cells [33] . Besides, ETV5 has been thought to be an obesity-related gene that was important for regulating energy balance and metabolism, including insulin secretion and glucocorticoid circulation [34,35] . Another study has also found that ETV5 acted as a transcription factor related to papillary thyroid cancer cell growth by targeting Twist1 to trigger the EMT process [17,36] .
The present study showed that ETV4 and ETV5 were higher in the cancer tissues than in the normal tissues. High ETV4 and ETV5 were also found in patients with HNSCC, which was correlated with a poor prognosis. Contradictory to previous studies, this study revealed that ETV1was lowly expressed in HNSCC tissues, and a low level of ETV1 could lead to poor prognosis, thereby warranting further exploration of its functions. The function and pathways of the PEA3 subfamily and their frequently altered neighboring genes were signi cantly enriched in aging, extracellular matrix disassembly, transcription coactivator binding, and other cell functions. KEGG analyses showed enrichment in the VEGF signaling pathway, MAPK signaling pathway, cellular senescence, and transcriptional misregulation in cancer. The MAPK signaling pathway is a crucial player in the initiation and progression of many cancers.
The role of the PEA3 subfamily and tumor-in ltrating immune cells were also explored. In the HNSCC-HPV-pos group, a signi cantly increased in ltration of B cells, CD8+ T cells, and neutrophils were found (p < 0.05), which was su ciently related to the expression of the PEA3 subfamily. This nding indicated that immune cells might have a signi cant effect on HNSCC prognosis. Therefore, the potential role of the PEA3 subfamily must be investigated further.
This study aimed to investigate the clinical value of the PEA3 subfamily in HNSCC and the molecular mechanism from big data. The ndings may offer new perspectives in future research and clinical applications for patients with HNSCC. However, this study had several limitations. All the data analyzed were retrieved from online databases. Precise clinical information was not provided. Thus, the ndings must be validated by further studies. Finally, additional research attention should be paid to the speci c functions and mechanisms of the action of the PEA3 subfamily in HNSCC.

Authors' contributions
Mei Wei carried out data analysis and drafted the manuscript; Wei Wang and Guangjian Ni participated in study design and data collection. All authors read and approved the nal manuscript.

Availability of data and materials
The online databases used in the study:       Correlation between PEA3 subfamily in HNSCC, HNSCC-HPV-pos, and HNSCC-HPV-neg expression. The abundance of immune in ltrates was statistically signi cant (p < 0.05, TIMER database).