Glutathione peroxidase 3 (GPX3) expression predicts the prognosis of numerous malignant tumors: A pan-cancer Analysis

The relationship between GPX3 expression and MSI or TMB was proposed for the first time. The expression of GPX3 in most tumors was positively correlated with the infiltration level of 6 classic immune cells. Meanwhile, the expression of GPX3 in some tumors is related to the level of immune infiltration of CAFs. GPX3 participates in tumorigenesis by regulating the AMPK signaling pathway. After functional validation and prognostic modeling, GPX3 was found to be a prognostic marker for STAD. This study provides blueprint data concerning the correlation between GPX3 and human tumors. The results of this analysis therefore expand our understanding of the role and mechanism of GPX3 in tumorigenesis.


Background
Malignant tumor is a general term for the uncontrolled growth of cells. Such cells do not respond to normal regulatory signals, grow and differentiate abnormally, and display local tissue invasion and distant metastasis [1]. There are currently about 260 types of tumors in humans. According to the latest global cancer statistics by the World Health Organization, in 2020 alone, there were 19.29 million new cancer cases worldwide, resulting in 9.96 million deaths [2]. Cancers present a huge economic burden to patients, their families and even the nation, threatening both social and economic development. Cancer is highly fatal and seriously threatens human health and life. Even with the recent advances in diagnosis and treatment, the incidence and mortality of malignant tumors are still increasing. Biomarkers are potentially accurate and effective diagnostic and therapeutic targets for malignant tumors.
Glutathione peroxidase (GPX) is an important protein that scavenges reactive oxygen species (ROS) in organisms [3]. Glutathione peroxidase 3 (GPX3) is the only known extracellular glycosylase in the glutathione peroxidase family and contains selenocysteine residues. It defends against cellular stress signals and reactive oxygen species, thereby maintaining cellular genetic integrity [4]. Moderate GPX3 expression is necessary for maintaining normal metabolism and physiology and is involved in the induction of important pathological changes in certain organs. However, abnormal expression of GPX3 contributes to the occurrence and development of various malignant tumors. For instance, recent studies have demonstrated abnormal expression of GPX3 in esophageal cancer [5], melanoma [6], colon cancer [7], gastric cancer [8], ovarian cancer [9] and other malignant tumors. Even so, studies on the role of GPX3 are relatively few and have assessed it in only a small number of malignant tumors.

The Pan-Cancer Analysis Project is a collaborative initiative that integrates, analyzes and interprets The Cancer Genome Atlas (TCGA) data of different malignant tumors from different platforms [10,11]. Pan-cancer analyses can not only reveal common phenotypic characteristics of malignant tumors, but can also unravel molecular events underlying the development of tumors and the corresponding internal regulatory mechanisms. These studies are also important in unraveling the complexity of tumorigenesis as well as possible molecular targets and genes relevant to the clinical prognosis of cancers. However, even with the large amount of clinical data, there is no pan-cancer evidence on the relationship between GPX3 and several tumor types.
Herein, we first analyzed the expression of GPX3 in 33 different malignant tumors using RNA-seq data. Then we analyzed the correlation between GPX3 expression and tumor stage. The effect of GPX3 on the survival and prognosis of malignant tumors was also investigated. We explored the relationship between GPX3 expression and infiltration of immune cells. We also performed bioinformatics analyses to explore the potential mechanisms linking GPX3 expression and the development of human malignant tumors. Finally, we verified the expression pattern and prognostic significance of GPX3 in STAD and COAD, and proposed a prediction model for STAD. The findings of this research will deepen our understanding of the role of GPX3 in the development, regulation and prognosis of malignant tumors. They may also uncover potential biomarkers for early diagnosis, prevention and treatment of malignant tumors.

Expression of GPX3 in normal human tissues
The Human Protein Atlas (HPA) (https://www.proteinatlas.org/) is a comprehensive repository for protein expression profiles in tissues, cells and blood and their metabolic and pathologic roles in the body. The expression of GPX3 in normal tissues was analyzed using HPA RNA-seq tissue data [12]. The expression of GPX3 protein in the main cancer tissues (colorectal cancer, prostate cancer, breast cancer, lung cancer, liver cancer) and in normal tissues (normal kidney tissues) was analyzed using immunohistochemical (IHC) tissue images in HPA.

Gene expression analysis
The expression of GPX3 mRNA in 33 different malignant tumors and corresponding normal tissues was assessed using RNA-seq data from the TCGA and GTEx databases [13]. Differential gene expression between cancerous and corresponding normal tissues was analyzed using t-tests, and the differences between the tissue sets were presented as violin plots. Before plotting, the expression data were first transformed to log2(TPM + 1), where TPM is transcripts per million. P < 0.05 was considered statistically significant.
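The transformation and comparison described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline used in the study. The TPM values are hypothetical, and the choice of Welch's unequal-variance form of the t-test is an assumption, since the text only states that t-tests were used. Turning the statistic into a P value would additionally need a t-distribution routine, which is omitted here.

```python
import math
from statistics import mean, variance

def log2_tpm(tpm_values):
    """Apply the log2(TPM + 1) transform to a list of TPM values."""
    return [math.log2(v + 1.0) for v in tpm_values]

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical TPM values, for illustration only (not data from the study).
tumor_tpm = [3.0, 7.0, 1.0, 15.0]
normal_tpm = [63.0, 31.0, 127.0, 255.0]

tumor = log2_tpm(tumor_tpm)
normal = log2_tpm(normal_tpm)
t_stat, df = welch_t(tumor, normal)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A negative t statistic here means lower transformed expression in the tumor group than in the normal group.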
The degree and nature of abnormal expression of GPX3 protein between cancer and adjacent normal tissues was based on Z-values, with the median protein expression levels used as reference points.
GPX3 expression at different cancer stages was analyzed using the "Expression DIY" module of the GEPIA2 platform (http://gepia2.cancer-pku.cn/#index). The corresponding violin plots were constructed after transformation of the expression data to log2(TPM + 1). Expression of GPX3 was compared across cancer stages to understand the role of the protein in cancer pathology [15].

Prognostic utility of GPX3
We assessed the predictive potential of GPX3 for Overall Survival (OS), Disease-Specific Survival (DSS), Disease-Free Interval (DFI) and Progression-Free Interval (PFI) in different tumors in the TCGA database. The median GPX3 expression was used as the cutoff between high and low expression. The predictive utility of GPX3 for OS, DSS, DFI and PFI of cancer patients was assessed using the log-rank test and Kaplan-Meier curves.
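To make the median split and the Kaplan-Meier estimate concrete, the following minimal sketch uses only the Python standard library. It is not the code used in the study. The sample values are hypothetical, the rule that samples exactly at the median count as "low" is an assumption, and the log-rank test is not implemented.

```python
from statistics import median

def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median expression."""
    cutoff = median(expression)
    # Assumption: samples exactly at the median are labelled 'low'.
    return ["high" if x > cutoff else "low" for x in expression]

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event was observed and 0 if the sample was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

# Hypothetical expression values and follow-up data (not from the study).
expression = [2.1, 5.4, 3.3, 8.0, 1.2, 6.7]
times = [5, 8, 12, 12, 20, 25]
events = [1, 0, 1, 1, 0, 1]

groups = split_by_median(expression)
curves = {}
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier([times[i] for i in idx],
                                 [events[i] for i in idx])
print(curves)
```

Each group's curve drops only at observed event times, while censored samples simply leave the risk set.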
Further analyses were performed to assess the epidemiological implications of GPX3 expression in 33 tumor types in the TCGA database. The effect of GPX3 expression on Overall Survival (OS), Disease-Free Survival (DFS), Progression-Free Survival (PFS) and Disease-Specific Survival (DSS) for different cancers was assessed using R software V. 4.0.3. The relationship between GPX3 expression and OS, DFS, PFS and DSS was analyzed using univariate Cox regression, reporting hazard ratios (HR) with 95% confidence intervals (CI) and log-rank P values, with statistical significance set at P < 0.05 [16].
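For readers less familiar with Cox regression output, the sketch below shows how a hazard ratio and its 95% confidence interval follow from a fitted coefficient and its standard error (HR = exp(beta), CI = exp(beta ± 1.96 × SE)). The coefficient and standard error are hypothetical values chosen for illustration. Fitting the Cox model itself was done in R in the study and is not reproduced here.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Convert a Cox coefficient and its standard error to HR and 95% CI."""
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return hr, lower, upper

# Hypothetical coefficient and standard error (not results from the study).
hr, lower, upper = hazard_ratio_ci(beta=0.25, se=0.10)
print(f"HR = {hr:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

An interval that lies entirely above 1 corresponds to a higher hazard with increasing expression at the 5% significance level.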

Genetic alteration in tumor cells
RNA-seq data for patients with 33 cancer types in the TCGA database were downloaded from the Genomic Data Commons (GDC) portal (https://portal.gdc.cancer.gov/). Tumor Mutation Burden (TMB), defined as the number of mutations (insertion/deletion) per megabase in the exon coding region of a gene, was analyzed as previously described [17].
The TMB is directly proportional to the expression of neoantigens recognizable by T cells, which influences the immune response. Microsatellite Instability (MSI) is any change in microsatellite length caused by insertion or deletion of a repeat unit in a tumor tissue relative to normal tissue [18], which generates new microsatellite alleles. TMB and MSI are often used in assessing prognosis and the effect of immunotherapies. The association between GPX3 expression and TMB and MSI in cancerous tissues was assessed. Relevant data were obtained from the TCGA database, and the analysis was performed using R software V. 4.0.3, with statistical significance set at P < 0.05.

Infiltration of immune cells
Tumor Immune Estimation Resource 2 (TIMER2) is a database for the systematic analysis of immune infiltrates (B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages and dendritic cells) across different cancer types. In this study, infiltrating immune cell scores for 33 cancers were downloaded from the TIMER2 database. Spearman correlation analysis was used to evaluate the correlation between GPX3 expression and the scores of B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages and dendritic cells [19].
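As a reminder of what the Spearman coefficient measures, the following sketch ranks both variables (averaging ranks for ties) and computes the Pearson correlation of the ranks. It uses only the Python standard library. The expression values and infiltration scores are hypothetical, and no purity adjustment or P value is computed.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical GPX3 expression and immune cell scores (not study data).
gpx3 = [1.0, 2.5, 3.1, 4.8, 6.0]
cd8_score = [0.10, 0.30, 0.20, 0.50, 0.40]
rho = spearman(gpx3, cd8_score)
print(f"rho = {rho:.2f}")
```

Because only ranks enter the calculation, the coefficient is unaffected by monotone transformations such as log2(TPM + 1).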
The relationship between GPX3 expression and infiltration levels of cancer-associated fibroblasts (CAFs) was analyzed using the TIMER2 platform (http://timer.cistrome.org/). CAFs regulate the functioning of immune cells in the tumor microenvironment (TME). The infiltration of immune cells in the TME was estimated using the EPIC, MCPCOUNTER, XCELL and TIDE algorithms. Since most immune cell types are negatively correlated with tumor purity, we obtained P-values and correlation coefficients by Spearman's rank correlation test after purity adjustment. The above relationship was presented using a heat map and a scatter plot. The scatter plot was constructed for the cells exhibiting the strongest correlation with tumor (P < 0.05) [15].

Enrichment analysis
The top 100 genes associated with GPX3 expression in the TCGA and GTEx databases were identified based on Pearson's correlation coefficient (PCC). The correlations between GPX3 and the top 3 most dysregulated genes were also assessed using the GEPIA2 module. A scatter plot for the top 3 most dysregulated genes was also constructed.
A protein-protein interaction (PPI) network in tumor tissues associated with GPX3 expression was constructed using the STRING platform (https://string-db.org/). The minimum required interaction score was set to low confidence (0.150), and the maximum number of interactors to show was set to no more than 50 interactors in the 1st shell. Finally, the available experimentally determined GPX3-binding proteins were obtained.
KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of 150 genes, which combined the top 100 GPX3-similar genes and the 50 GPX3-interacting genes, was performed to identify pathways regulated by the proteins. The gene list was uploaded to the DAVID database with the identifier set to "OFFICIAL_GENE_SYMBOL" and the species set to "Homo sapiens". GO (Gene Ontology) enrichment analysis for the Biological Process (BP), Cellular Component (CC) and Molecular Function (MF) terms associated with the dysregulated genes was also performed, and the results were plotted using the cnetplot function (circular = F, colorEdge = T, node_label = T). KEGG and GO analyses were performed using R software. Statistical significance for both analyses was set at two-tailed P < 0.05 [15].

Construction and validation of the nomogram of GPX3 for STAD
The results above indicated that GPX3 expression had an important impact on the survival prognosis of numerous malignant tumors, such as BLCA, COAD, PAAD and STAD. OS, DSS, PFS, DFS, DFI and PFI all strongly supported that the prognosis of STAD worsens as the GPX3 level rises. Therefore, a nomogram of GPX3 for STAD was established and verified to further analyze the predictive significance of GPX3 for the OS of patients with STAD. First, univariate and multivariate Cox regression analyses were used to identify all independent factors for STAD, displayed as hazard ratios (HR) with the corresponding 95% confidence intervals (CI). Then, according to the results of the multivariate Cox regression model, a prognostic nomogram was established to predict the 1-, 2-, 3- and 5-year OS probability of STAD patients in the TCGA training dataset using the rms package in R software. The concordance index (C-index), which ranges from 0.5 (poor) to 1.0 (perfect), was employed to assess the performance of the nomogram. Briefly, the higher the C-index, the better the prognostic accuracy. Finally, to ensure the nomogram's accuracy, calibration and validation of the nomogram were performed using the R packages "rms" and "cmprsk". P < 0.05 was considered statistically significant [20,21].
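The C-index mentioned above can be illustrated with a direct pairwise computation. The sketch below implements Harrell's concordance index in plain Python for right-censored data. It is not the rms implementation used in the study, and the follow-up times, event indicators and risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before subject j's follow-up time.
    The pair is concordant when subject i also has the higher risk score;
    tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up times, event indicators and model risk scores.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risk)
print(f"C-index = {c_index:.2f}")
```

In this toy example four of the five comparable pairs are ordered correctly by the risk score, so the C-index is 0.8.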

Expression of GPX3 in normal human tissues
The GPX3 gene is strongly expressed in kidney, thyroid gland and adipose tissues, among others (Fig. 1A). GPX3 protein was detected in carcinoid, colorectal cancer, prostate cancer, renal cancer, skin cancer and lymphomas (Fig. 1B). The IHC images for GPX3 protein in colorectal cancer, prostate cancer, breast cancer, lung cancer, liver cancer and normal kidney tissues are shown in Fig. 1C-H. Detailed clinical information on the tissue donors for the IHC analyses is summarized in Table 1.

... analyses were performed at statistical significance of P < 0.01 for CHOL and P < 0.001 for the rest of the tumors. Expression of the GPX3 gene was relatively higher in Lymphoid Neoplasm Diffuse Large B-cell Lymphoma (DLBC), Glioblastoma multiforme (GBM), Glioma (LGG) and Liver hepatocellular carcinoma (LIHC) (all at P < 0.001) (Fig. 2).
As shown in Fig. 3A, analysis of CPTAC data revealed that, compared with normal tissues, GPX3 was underexpressed in breast cancer, ovarian cancer, colon cancer, clear cell RCC, UCEC and lung adenocarcinoma (P < 0.001).
Further analyses revealed that GPX3 expression levels correlated with the pathological stages of eight cancers: ACC, BLCA, KIRC, KIRP, LIHC, Pancreatic adenocarcinoma (PAAD), READ and THCA (all at P < 0.05) (Fig. 3B). In ACC, the expression of GPX3 was highest in stage I and lowest in stage II. In BLCA, GPX3 expression was highest in stage III, lowest in stage II and moderate in stage IV. In KIRC, there were no significant differences in GPX3 expression between stages I and III, or between stages II and IV. In KIRP, GPX3 expression increased gradually from stage I to stage III but decreased slightly in stage IV. In LIHC, GPX3 expression was highest in stage IV and lowest in stage III. In PAAD, GPX3 expression was highest in stage III and lowest in stage II, whereas in READ, GPX3 expression increased from stage I to stage IV. In THCA, GPX3 expression was highest in stage II and lowest in stage IV.

Prognostic value of GPX3
Cancer patients were divided into high and low GPX3 expression groups. GPX3 expression predicted the OS of four tumor types. Among them, high GPX3 expression conferred longer OS in patients with PAAD (P = 0.0031), whereas low GPX3 expression was linked to longer OS in patients with BLCA (P = 0.0049), COAD (P = 0.0078) and STAD (P = 0.00041) (Fig. 4A). GPX3 expression also predicted the DSS of four tumor types: high GPX3 expression conferred longer DSS in patients with PAAD (P = 0.0091), whereas low GPX3 expression was linked to longer DSS in patients with BLCA (P = 0.1), COAD (P = 0.00026) and STAD (P = 0.00028) (Fig. 4B). GPX3 expression predicted the DFI of three tumor types. Low GPX3 expression was linked to longer DFI in patients with BRCA (P = 0.012), LUAD (P = 0.004) and STAD (P = 0.00016) (Fig. 4C). Meanwhile, high expression of GPX3 was associated with longer PFI in PAAD (P = 0.0037). Conversely, low expression of GPX3 was associated with longer PFI in patients with COAD (P = 0.0036) and STAD (P < 0.0001) (Fig. 4D).
Overall, these findings demonstrate that GPX3 expression levels influence the prognosis of several tumors, particularly STAD, COAD and PAAD. Therefore, we focused on these 3 tumor types in subsequent analyses, especially STAD and COAD, due to their larger sample sizes.

... for STAD, CHOL and KICH (all Ps < 0.05) (Fig. 6B).

GPX3 expression and infiltration of immune cells
We investigated whether the expression of GPX3 in 33 tumors from the TIMER2 database was related to the level of immune infiltration. The results showed that the expression of GPX3 in PAAD and COAD was significantly positively correlated with B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages and dendritic cells. In STAD, the expression of GPX3 was significantly positively correlated with the infiltration levels of CD4+ T cells, CD8+ T cells, neutrophils, macrophages and dendritic cells, but not with B cells (Fig. 7A).

Key pathways linked to GPX3 expression
The top 3 genes related to GPX3 expression based on the TCGA data were further analyzed. GPX3 expression correlated positively with the expression of the MOCS1 (R = 0.28), TNS2 (R = 0.2) and FZD4 (R = 0.27) genes (Fig. 8A). Further analyses identified 24 GPX3-binding proteins. The interaction network of these proteins is shown in Fig. 8B. KEGG analyses revealed that the involvement of GPX3 in tumor pathogenesis was related to the AMPK signaling and fructose and mannose metabolism pathways (Fig. 8C). The GO enrichment analysis further linked GPX3 to angiogenesis, vascular morphogenesis, vasculature development and tube morphogenesis, among others. It was also linked to intrinsic components of the plasma membrane as well as to phosphatase and phosphoric ester hydrolase activities.

Validation analysis
In the above prognosis analysis, GPX3 showed significant effects on STAD, COAD and PAAD, especially on STAD and COAD, in OS, DFS, PFS, DFI and PFI. To verify the expression pattern and prognostic significance of GPX3 in STAD and COAD, we further retrieved two datasets from Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/) and TCGA with the accession numbers GSE44861 and GSE29272 [22,23].

The results showed that the expression level of GPX3 in STAD and COAD tumor tissues was significantly lower than that in normal tissues, which was consistent with the results from the TCGA datasets. The AUCs (areas under the ROC curves) for COAD were 0.6698 (95%CI: 0.5660-0.7736, P = 0.002) in the GEO dataset and 0.9909 (95%CI: 0.9862-0.9956, P < 0.001) in the TCGA dataset (Fig. 9A). The AUCs for STAD were 0.8406 (95%CI: 0.7865-0.8946, P < 0.001) in the GEO dataset and 0.9599 (95%CI: 0.9464-0.9734, P < 0.001) in the TCGA dataset (Fig. 9B). These results suggest that the diagnostic value of GPX3 for STAD and COAD was consistent between the GEO and TCGA datasets and that GPX3 showed good diagnostic ability for both diseases, which supports the reliability of the results described above.
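The AUC values above can be read as the probability that a randomly chosen tumor sample is ranked above a randomly chosen normal sample by the marker. The sketch below computes the AUC in this pairwise (Mann-Whitney) form using plain Python. It is an illustration under stated assumptions, not the method used in the study, which does not describe how the AUCs were computed. The expression values are hypothetical, and expression is negated so that lower GPX3 expression scores as "more tumor-like".

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical log2(TPM + 1) expression values (not data from the study).
tumor_expr = [2.0, 3.5, 1.0, 4.2]
normal_expr = [5.0, 4.0, 6.1]

# Lower expression is treated as evidence for tumor, so negate the values.
tumor_scores = [-x for x in tumor_expr]
normal_scores = [-x for x in normal_expr]
auc_value = auc(tumor_scores, normal_scores)
print(f"AUC = {auc_value:.4f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of tumor and normal samples.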

Construction and verification of nomogram
Independent prognostic factors, including pTNM stage, age, radiation therapy and GPX3 expression, were included to create a prognostic nomogram for the OS of STAD patients. The nomogram showed that pTNM stage had the greatest influence on prognosis, followed by age and radiation therapy (Fig. 10A-B). In addition, the nomogram was validated using the C-index and calibration. The C-index of the nomogram was 0.69 (95%CI: 0.627-1; P < 0.001) (Fig. 10C). The calibration curve showed good concordance between the predicted and observed survival (Fig. 10D). These results suggest that the prognostic nomogram established in the present study can effectively predict the OS probability of patients with STAD.

Discussion
Inactivation of GPX3 leads to the accumulation of ROS, which have been found to induce oxidative deoxyribonucleic acid (DNA) damage. The resultant genetic changes lead to the development of cancers [24]. Lou W et al. [25] reported that GPX3 participates in the growth and metastasis of breast cancer. In a related study, Noci S et al. [26] found that GPX3 expression influences the incidence, survival rate and recurrence of colorectal cancer. However, the roles of GPX3 in different tumor types are still unclear. Moreover, to the best of our knowledge, there are few pan-cancer studies on the role of GPX3 in various cancer properties. Therefore, we evaluated GPX3 with respect to normal human tissues, gene expression, protein expression, prognosis, gene mutation, infiltration of immune cells, associated pathways and a prognostic model. Data for 33 different tumor types were extracted from the TCGA, GTEx, CPTAC, HPA and other databases.
We found that GPX3 expression was altered in most tumors. In one study, underexpression of GPX3 was associated with metastasis of thyroid cancer. Moreover, expression levels of GPX3 corresponded with the stage of the cancer [27]. In a related study [28], underexpression of GPX3 was associated with larger tumor volume, more nodules, worse clinical stage and poor prognosis in hepatocellular carcinoma (HCC). Both in vivo and in vitro studies [29] demonstrated that underexpression of GPX3 participated in the invasion and metastasis of gastric cancer. In this study, we found dysregulated GPX3 expression in 26 of 33 tumor types. In particular, GPX3 expression was downregulated in 22 and upregulated in 4 of the tumor types. Further analyses revealed underexpression of GPX3 protein in all 6 tumors in the CPTAC database. We also found that GPX3 expression correlated with the pathological stages of tumors. Overall, GPX3 expression significantly affects the progression of several tumors.
GPX3 expression levels influence the prognosis of most tumors. However, reports differ on the prognostic role of high or low GPX3 expression in different tumors. Several studies have shown that loss of GPX3 expression in tumor tissues is associated with poor prognosis and chemotherapy resistance in patients [30,31].
Low expression of GPX3 can also predict patient prognosis. For instance, Caroline C et al. [32] found that underexpression of GPX3 strongly correlated with a low survival rate in lung adenocarcinoma and low-grade gliomas.
However, GPX3 expression is elevated in other tumor tissues [33,34,35], where high expression of GPX3 is associated with poor prognosis, for example in gastric cancer and lung squamous cell carcinoma. Herein, tumor patients were divided into high and low GPX3 expression groups. We found that low expression of GPX3 resulted in a better prognosis for patients with BLCA, COAD and STAD, but a poor prognosis for patients with PAAD. Meanwhile, low expression of GPX3 was linked to longer DFI and PFI in patients with BRCA, LUAD, STAD and COAD but shorter PFI in PAAD patients. Further univariate Cox regression analyses revealed that GPX3 expression levels predicted the OS, DFS, PFS and DSS of patients with several cancers such as PAAD, STAD, BRCA, LUAD and COAD. It can be concluded that, for OS, DFS and PFS, PAAD with low GPX3 expression has a worse prognosis, whereas COAD and STAD with low GPX3 expression have a better prognosis. When GPX3 expression was low, DSS was longer in STAD and BRCA, whereas in LUAD it was shorter. That is, although Fig. 4 and Fig. 5 are based on different algorithms, both illustrate the prognostic value of GPX3 in different tumors. GPX3 therefore appears to play a dual role in different tumor types, acting both as a tumor suppressor protein and as a survival-promoting protein. However, further molecular evidence is needed on how high or low expression of GPX3 specifically affects the scavenging of oxidants and redox signaling in the microenvironment of tumor cells.
This is the first report of the association between GPX3 expression and TMB/MSI, and some tumor types that may benefit more from immunotherapy were identified. The potential relationship between infiltration of immune cells and GPX3 expression in different tumor types was also explored. This study also showed that GPX3 expression positively correlated with infiltration of CAFs in BLCA, BRCA, BRCA-Basal, etc. In addition, GPX3-binding components and GPX3 expression-related genes in all TCGA tumors were integrated for enrichment analysis, which found that the "AMPK signaling pathway" and "Fructose and mannose metabolism" had an influence on the etiology or pathogenesis of tumors. The nomogram was constructed and validated to provide a prediction of 1-, 2-, 3- and 5-year survival in patients with STAD. It showed good performance in applicability and accuracy, which also supported the relationship between GPX3 expression and STAD described above. Combined with existing research results, we speculate that GPX3 plays an important role in the occurrence and development of STAD, and it is expected to become a new target for the diagnosis and treatment of STAD. However, due to the small sample size in this study, more experiments are needed to explore the specific mechanism of GPX3 in the development of STAD and other cancers.

The relationship between GPX3 expression and MSI or TMB was proposed for the first time. The expression of GPX3 in most tumors was positively correlated with the infiltration level of 6 classic immune cells. Meanwhile, the expression of GPX3 in some tumors is related to the level of immune infiltration of CAFs. GPX3 participates in tumorigenesis by regulating the AMPK signaling pathway. After functional validation and prognostic modeling, GPX3 was found to be a prognostic marker for STAD. This study provides blueprint data concerning the correlation between GPX3 and human tumors.
The results of this analysis therefore expand our understanding of the role and mechanism of GPX3 in tumorigenesis.

Figure 5 Correlation of GPX3 gene expression with patients' OS, DFS, PFS and DSS in different cancer types. The forest plots with the hazard ratios (HR) and 95% confidence intervals for OS, DFS, PFS and DSS in different cancers show the survival advantage and disadvantage of low expression of GPX3. (HR > 1 indicates that high GPX3 expression predicts worse prognosis compared to low GPX3 expression, whereas HR < 1 indicates that high GPX3 expression predicts better prognosis than low GPX3 expression.)

Figure 6 Mutation profiles of GPX3 in different cancers from the TCGA (A) Spearman correlation analysis showing the association between TMB and GPX3 gene expression (B) Spearman correlation analysis showing the correlation between MSI and GPX3 gene expression. (The X-axis represents the correlation coefficient between genes and TMB/MSI, while the Y-axis represents different tumors. The size of the dots in the figure represents the correlation coefficient, and different colors represent the significance of the P value. In the schematic diagram, the redder the color, the smaller the P value.)

Figure 7 Correlation analysis between GPX3 expression and immune infiltration (A) Correlation between GPX3 expression and the level of STAD, PAAD and COAD immune infiltration (B) Correlation between the expression of GPX3 and immune infiltration of CAFs.
Figure 9 Independent validation of the differential expression and prognostic significance of GPX3 in GEO and TCGA datasets (A) The differential expression and prognostic significance of GPX3 in COAD (B) The differential expression and prognostic significance of GPX3 in STAD.

Figure 10 The nomogram predicting the survival probability of patients with STAD (A, B) Univariate and multivariate Cox proportional risk analyses of clinical parameters and risk scores in patients with STAD in the TCGA training cohort and CGGA validation cohort (C) Prognostic nomogram including age, pTNM stage and radiation therapy assessing the survival probability at 1, 2, 3 and 5 years (D) The calibration curves of the nomogram-predicted survival in patients with STAD.