PTPN2 Expression in Ovarian Cystadenocarcinoma and Its Clinical Value Based on the TCGA Database

Ovarian serous cystadenocarcinoma (OV) is a malignant tumor that often has a poor prognosis because of its late detection. The expression of PTPN2 is associated with a variety of tumors, but its effect on OV is not well understood. Therefore, we analyzed the relationship between PTPN2 and the prognosis of OV. Analysis of patients with OV using The Cancer Genome Atlas revealed an association between PTPN2 expression and the prognosis of OV. We modeled the relationship between these factors by logistic regression, which showed a significant correlation between tumor grade and decreased expression of PTPN2. Kaplan-Meier survival analysis showed that low PTPN2 expression was associated with poor overall survival. Further analysis of immune cells in OV using ssGSEA revealed a significant correlation between the expression level of PTPN2 in OV and the numbers of mast cells and gamma delta, helper, and central memory T cells. We also found differences between the phenotypic pathways associated with low PTPN2 expression and pathways of genes and proteins that determine epithelial-mesenchymal transition. Finally, a network diagram of protein-protein interactions was drawn using the STRING database, which showed that PTPN2 was closely related to the signal transducer and activator of transcription (STAT) family and the Janus kinase (JAK) family. Thus, PTPN2 shows potential for use as a prognostic biomarker in OV and is associated with immune infiltration.


Introduction
OV is a common gynecological malignancy 1 . Because it is asymptomatic in its early stages, OV is difficult to detect; as a result, many patients are initially diagnosed with advanced disease (stage III or IV), and the prognosis is poor 2; 3 . The unsatisfactory prognosis of OV also reflects the limitations of current treatment strategies. Identifying an effective prognostic marker is therefore an urgent need and an active area of research. Although various biomarkers, such as carbohydrate antigen 125 and carbohydrate antigen 19-9, are currently considered relevant in clinical practice, these markers have become controversial 4 . Therefore, new prognostic molecular markers for early diagnosis and new treatment regimens are needed to improve patient survival.
Phosphorylation of tyrosine kinases is an important cell signaling mechanism in tumorigenesis. Protein tyrosine phosphatases regulate phosphorylation by removing phosphate groups and returning tyrosine kinases to their original state 5 . PTPN2, also known as TCPTP, is a widely expressed intracellular nontransmembrane phosphatase 6 .
In this study, we assessed the prognostic value of PTPN2 expression in human OV by analyzing data available from The Cancer Genome Atlas (TCGA). Additionally, biological pathways related to PTPN2 in OV were investigated by gene set enrichment analysis (GSEA).

Data acquisition
We downloaded the dataset of patients with OV from TCGA 7 . This dataset included 71 normal tissues and 263 tumor tissues. We then excluded cases with missing data such as age and survival time.
RNA sequencing data were transformed into transcripts per million reads (TPM) for subsequent analyses.
Tumor tissues were divided into two groups according to the expression level of PTPN2.
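As a rough illustration of the two preprocessing steps above, the following Python sketch converts raw counts to TPM and splits samples into high and low groups at the median. It is not the pipeline used in this study. The gene names other than PTPN2, the counts, the transcript lengths, the sample labels, and the rule that samples exactly at the median fall into the low group are all assumptions made for the example.

```python
import statistics


def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to transcripts per million (TPM).

    counts: dict mapping gene -> raw read count
    lengths_bp: dict mapping gene -> transcript length in base pairs
    """
    # Reads per kilobase: divide each count by the transcript length in kb.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    per_million = sum(rpk.values()) / 1e6
    if per_million == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: value / per_million for gene, value in rpk.items()}


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median expression.

    Samples exactly at the median are placed in the 'low' group (an assumption).
    """
    cutoff = statistics.median(expression.values())
    return {s: ("high" if v > cutoff else "low") for s, v in expression.items()}


# Hypothetical counts and transcript lengths for a single sample.
counts = {"PTPN2": 500, "GENE_A": 1000, "GENE_B": 250}
lengths = {"PTPN2": 2000, "GENE_A": 4000, "GENE_B": 1000}
tpm = counts_to_tpm(counts, lengths)

# Hypothetical PTPN2 TPM values across four tumor samples.
groups = split_by_median({"S1": 3.0, "S2": 8.0, "S3": 5.0, "S4": 12.0})
print(tpm)
print(groups)
```

Because TPM values within a sample always sum to one million, they are comparable across samples with different sequencing depths, which is what makes a single median cut-off across samples meaningful.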

Expression analysis by UCSC Xena
We downloaded TCGA and GTEx RNA-seq data in TPM format from UCSC Xena (https://xena.ucsc.edu/). RNA-seq data of TCGA and GTEx were processed using Toil software 8 . PTPN2 expression data from normal samples in GTEx and TCGA were combined and compared with OV samples from TCGA using the Wilcoxon rank sum test. A plot with pathological stage as the variable was drawn to compare PTPN2 expression across pathological stages. A boxplot using the disease state (tumor or normal) as the variable was drawn to show the differential expression of PTPN2.
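The Wilcoxon rank sum comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: it uses the normal approximation with a tie correction and no continuity correction, so P values may differ slightly from those reported by statistical packages, and the expression values are made up.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (U statistic for x, z score, two-sided P value).
    """
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    tie_counts = {}
    for v in combined:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 0.0, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2.0) / math.sqrt(var_u)
    return u1, z, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical log-scale expression values for tumor and normal samples.
tumor = [1.0, 2.0, 3.0, 4.0, 5.0]
normal = [6.0, 7.0, 8.0, 9.0, 10.0]
u, z, p = rank_sum_test(tumor, normal)
print(u, z, p)
```

A rank-based test is a reasonable choice here because expression values are often skewed, and the test makes no assumption of normality.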

Survival analysis
A Kaplan-Meier plot was drawn using the survminer package to evaluate the prognostic value of PTPN2 in terms of overall survival (OS) of patients with OV 9 . Samples were divided into high and low expression groups according to the median gene expression value. Hazard ratios (HRs) with 95% confidence intervals and log-rank P values were also calculated.
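The Kaplan-Meier estimate and the log-rank comparison between two expression groups can be sketched as follows. This Python version is only an illustration of the calculations, not the R workflow used in the study; the follow-up times and event indicators are invented, and hazard ratios (which require a Cox model) are not computed here.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def log_rank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, P value) with 1 d.f."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e, _ in pooled if e == 1}):
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)
        d_a = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 0)
        d = sum(1 for u, e, _ in pooled if u == t and e == 1)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical follow-up times (months) and death indicators.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
chi2, p = log_rank([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
print(curve, chi2, p)
```

Censored patients leave the risk set without lowering the survival estimate, which is why the second time point in the example produces no step in the curve.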

Statistical analysis
Data obtained from TCGA were statistically analyzed using R software (version 3.6.3). The correlation between PTPN2 expression and clinical data was analyzed by logistic regression, and the influence of other clinical factors on survival was evaluated by multivariate Cox analysis. We also analyzed the expression of PTPN2 in various tumors and the correlations between PTPN2 expression and the levels of 24 immune cell types.
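The figure legends of this paper indicate that the Spearman method was used for the immune cell correlations. A minimal Python sketch of that coefficient is given here for illustration only; it computes the Pearson correlation of average ranks, and the expression and enrichment values are invented.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical PTPN2 expression and immune cell enrichment scores.
ptpn2 = [1.0, 2.0, 3.0, 4.0, 5.0]
cell_score = [2.0, 4.0, 6.0, 8.0, 100.0]
rho = spearman_rho(ptpn2, cell_score)
print(rho)
```

Because only ranks enter the calculation, the outlying last value in the example does not weaken the correlation, which is the main reason to prefer this measure for skewed enrichment scores.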

Gene set enrichment analysis
GSEA assesses the distribution of genes from predefined gene sets within a ranked gene list to determine their contribution to a phenotype 10; 11 . GSEA was conducted using the gseGO and gseKEGG functions to identify potential functional associations with differential expression of PTPN2 based on RNA-seq data from the TCGA database, and HTSeq-counts data were analyzed with the DESeq2 package. There were 52 differentially expressed molecules with |log fold change (FC)| > 2 and padj < 0.05. The expression level of PTPN2 was considered the phenotype, and enriched pathways for each phenotype were identified using an adjusted P-value < 0.05, false discovery rate (FDR) q-value < 0.25, and normalized enrichment score |NES| > 1 12 .
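To make the idea of an enrichment score concrete, the following Python sketch computes the classic running-sum statistic for one gene set over a ranked gene list. It is a simplified illustration under stated assumptions: the gene names and ranking metric are invented, the weighting exponent of 1 is assumed, and no permutation-based normalization (NES) or P value is computed, so it does not reproduce the output of the R functions named above.

```python
def enrichment_score(ranked_genes, metric, gene_set, exponent=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: genes sorted by decreasing ranking metric
    metric: ranking metric values in the same order
    gene_set: set of genes whose enrichment is tested
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(m) ** exponent for m, h in zip(metric, hits) if h)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")
    running = 0.0
    best = 0.0
    for m, h in zip(metric, hits):
        if h:
            # Step up in proportion to the gene's weight in the ranking.
            running += abs(m) ** exponent / hit_total
        else:
            # Step down by a constant amount for each gene outside the set.
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list: positive metric means higher in one group.
genes = ["A", "B", "C", "D"]
metric = [2.0, 1.0, -1.0, -2.0]
es_top = enrichment_score(genes, metric, {"A", "B"})
es_bottom = enrichment_score(genes, metric, {"C", "D"})
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking gives a score near 1, and a set concentrated at the bottom gives a score near -1; the sign therefore indicates which expression group the pathway is associated with.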
Analysis of immune infiltration of PTPN2 by ssGSEA
Immune infiltration was analyzed by single-sample gene set enrichment analysis (ssGSEA) in R.

Protein-protein interaction
STRING (https://string-db.org) is a comprehensive database of protein interactions that enables searches for known protein interactions and prediction of new ones. These interactions include both direct physical interactions and indirect functional associations between proteins. To further explore the interactions of PTPN2, a protein-protein interaction (PPI) network of PTPN2 was constructed using the STRING database. For each PPI pair, the database generates a combined score between 0 and 1; a higher score indicates a more reliable PPI relationship. The commonly used combined score threshold is 0.4. In this study, we used an interaction score > 0.4 as the cut-off criterion.
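The score cut-off described above amounts to a simple filter over interaction pairs. The short Python sketch below illustrates it with placeholder partner names and made-up scores; it does not query the STRING database and does not represent real interaction data.

```python
def filter_interactions(edges, min_score=0.4):
    """Keep interaction pairs whose combined score is strictly above min_score."""
    return [(a, b, score) for a, b, score in edges if score > min_score]


# Placeholder partners and invented combined scores on a 0 to 1 scale.
edges = [
    ("PTPN2", "PARTNER_A", 0.95),
    ("PTPN2", "PARTNER_B", 0.40),
    ("PTPN2", "PARTNER_C", 0.12),
]
kept = filter_interactions(edges)
print(kept)
```

Note that a pair scoring exactly 0.40 is excluded here because the criterion is written as a strict inequality.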

Patient characteristics and multivariate analysis
In June 2020, we obtained clinical and gene expression data from 376 primary tumors from TCGA (Table ). According to the Kaplan-Meier plot (Fig. 1a), high PTPN2 expression was associated with better OS in patients with OV. In addition, PTPN2 expression in tumor tissues was significantly lower than that in normal tissues. In the Cox regression model (Table ), variables with P < 0.1 in univariate Cox regression were included in multivariate Cox regression. Variables satisfying this threshold included primary therapy outcome (P < 0.001), tumor residual (P < 0.001), age (P = 0.017), race (P = 0.047), tumor status (P < 0.001), and PTPN2 (P = 0.017). Multivariate Cox regression showed that primary therapy outcome (P < 0.001), age (P = 0.017), tumor status (P < 0.001), and PTPN2 (P = 0.036) were independent prognostic factors of OS (P < 0.05). PTPN2 expression was significantly correlated with tumor grade: the expression of PTPN2 decreased with increasing tumor grade.

Expression difference analysis
TCGA OV samples were divided into high and low PTPN2 expression groups. There were 52 differentially expressed molecules with |logFC| > 2 and padj < 0.05, among which eight were high-expression genes and 44 were low-expression genes (Fig. 1b). We also constructed a heat map (Fig. 2).
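The thresholds above can be illustrated with a short Python sketch that adjusts P values with the Benjamini-Hochberg procedure and then classifies genes as up- or down-regulated. This is a simplified stand-in for the differential expression software, which applies further steps such as independent filtering; the gene names, fold changes, and P values are invented, and the fold change is assumed to be on a log2 scale.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_genes(results, lfc_cutoff=2.0, padj_cutoff=0.05):
    """Split genes passing both thresholds into up- and down-regulated lists.

    results: dict mapping gene -> (log fold change, adjusted P value)
    """
    up, down = [], []
    for gene, (lfc, padj) in results.items():
        if padj < padj_cutoff and abs(lfc) > lfc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical genes with invented log2 fold changes and raw P values.
genes = ["G1", "G2", "G3", "G4"]
lfc = [2.5, -3.0, 1.0, -2.2]
raw_p = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw_p)
up, down = classify_genes({g: (l, q) for g, l, q in zip(genes, lfc, padj)})
print(padj, up, down)
```

In this toy input, G3 passes the adjusted P value threshold but not the fold change threshold, which shows why both criteria are needed to arrive at a short list of strongly changed genes.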

Relationship between PTPN2 expression and clinicopathological variables
A total of 376 OV samples from TCGA were analyzed (Table ), including PTPN2 expression data for all patient characteristics. Univariate logistic regression of the clinicopathological features of OV against PTPN2 expression (TPM) showed that PTPN2 was significantly correlated with FIGO stage (P = 0.001), tumor residual (P = 0.038), and tumor status (P = 0.041). These results suggest that endothelial cells with low PTPN2 expression are more likely to progress to a later stage and distant metastasis than those with high PTPN2 expression. Kaplan-Meier survival analysis also showed that low PTPN2 expression was associated with poor OS (HR = 0.73, 95% confidence interval 0.56-0.94).
We evaluated the diagnostic efficacy of PTPN2 for OV by receiver operating characteristic curve analysis, with the false-positive rate on the horizontal axis and the true-positive rate on the vertical axis (Fig. 3c). The area under the curve for PTPN2 was 0.710, suggesting that PTPN2 may have diagnostic value.
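The area under the curve has a simple interpretation that can be shown in Python: it is the probability that a randomly chosen positive sample scores higher than a randomly chosen negative sample. The sketch below computes it directly from that definition, counting ties as one half. The scores and labels are invented; because PTPN2 is lower in tumors in this study, the score passed to such a function would need to be oriented so that higher values indicate tumor (an assumption about usage, not a description of the analysis performed).

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    labels: 1 for positive (e.g., tumor) and 0 for negative (e.g., normal).
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores and true labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = roc_auc(scores, labels)
print(auc)
```

An area of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale against which a value such as 0.710 can be read.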
A nomogram was also used to present the prognostic model (Fig. 3b). Tumor status, primary therapy outcome, age, histologic grade, tumor residual, and PTPN2 were included in the model shown in Fig. 3b. The C-index of the model was 0.740 (0.720-0.760).
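The C-index reported for the model measures how often a patient with a higher predicted risk dies earlier. A minimal Python sketch of Harrell's concordance index is shown below for illustration; the survival times, event indicators, and risk scores are invented, pairs with tied survival times are ignored for simplicity, and the sketch does not reproduce the nomogram itself.

```python
def concordance_index(times, events, risk):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i has an observed event
    strictly before patient j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times, death indicators, and model risk scores.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
c_good = concordance_index(times, events, [4, 3, 2, 1])
c_bad = concordance_index(times, events, [1, 2, 3, 4])
print(c_good, c_bad)
```

As with the area under the ROC curve, 0.5 indicates no discrimination and 1.0 indicates that predicted risk orders every comparable pair correctly.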

Relationship between PTPN2 expression and tumor-infiltrating immune cells
To determine the difference in PTPN2 expression between tumor and normal tissues, the expression levels of PTPN2 mRNA in normal tissues and multiple tumor types were analyzed using the TCGA and GTEx databases (Fig. 3a). The results showed that PTPN2 was significantly differentially expressed (P < 0.  (Fig. 3d).

Differential expression of PTPN2
GSEA of differences between the low and high PTPN2 expression data sets was performed to identify key signaling pathways associated with PTPN2. The results showed significant enrichment of several Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) terms (FDR < 0.05, NOM P < 0.05) 14 . GO analysis showed that genes related to PTPN2 mostly act on mRNA, including in mRNA 5′-splice site recognition and mRNA splice site selection. A histogram of the GO enrichment analysis results is shown in Fig. 5a.
Based on the OV expression matrix from TCGA, GSEA was carried out on the low-expression and high-expression PTPN2 groups using the clusterProfiler package [PMID:22455463] 12 . The h.all.v7.symbols.gmt [Hallmarks] and c5.all.v7.gmt (GO) collections 14; 15 from MSigDB Collections were selected as reference gene sets. An FDR < 0.25 and adjusted P < 0.05 were used to define significant enrichment. Genes involved in epithelial-mesenchymal transition were significantly enriched (Fig. 5b).

Protein-protein interaction
Using the STRING search tool (Fig. 5c), we found that this non-receptor type tyrosine-specific phosphatase dephosphorylates receptor protein tyrosine kinases including CSF1R, INSR, PDGFR, and EGFR. PTPN2 also dephosphorylates non-receptor protein tyrosine kinases including Src, JAK1, JAK2, and JAK3 family kinases, and STAT1, STAT3, and STAT6 in either the cytoplasm or the nucleus. It also negatively regulates numerous signaling pathways and biological processes such as inflammatory response, hematopoiesis, glucose homeostasis, and cell proliferation and differentiation 16 .

Discussion
Here, we showed that the prognosis of OV was correlated with the expression level of PTPN2, with high PTPN2 expression related to a favorable prognosis. In addition, our study showed that PTPN2 expression was associated with different sets of immune markers and immune infiltration. Therefore, PTPN2 may affect tumor immunity. We also found that PTPN2 expression differed between OV tumor tissue and normal tissue and was correlated with tumor grade. Multivariate analysis showed that PTPN2 expression was an independent prognostic factor in patients with OV. Therefore, PTPN2 is a promising tumor biomarker.
PTPN2 is an enzyme that removes phosphate groups from its substrates. Phosphatases play an important role in maintaining cell homeostasis. The subcellular localization of PTPN2 is variable, and under certain conditions, the protein shuttles between the nucleus and cytoplasm. PTPN2 is widely expressed in adult cells and plays an important role in regulating cellular activities. Therefore, abnormal PTPN2 protein can cause specific diseases. PTPN2 is closely related to the immune system and affects most cells in the immune system. PTPN2 is closely related to JAK1, JAK3, STAT1, and STAT3, as observed in the PPI network. Expression of PTPN2 was decreased in some breast cancers, and reintroduction of PTPN2 inhibited the growth of breast cancer cells. This regulation may occur via inhibition of the transforming growth factor (TGF)-β signaling pathway. Therefore, it is important to study the relationship between the TGF-β signaling pathway and PTPN2 from a physiological and pathological perspective.
The TGF-β signaling pathway is a signal transmission process mediated by transforming growth factor 17 . This pathway plays a key role in the growth, development, and differentiation of cells and tissues, as well as an important regulatory role in cell proliferation, interstitial generation, differentiation, apoptosis, embryonic development, organ formation, immune function, inflammatory response, and wound repair. TGF-β expression and signal transduction disorders are associated with the development of many diseases, such as cancer, fibrosis, hereditary hemorrhagic telangiectasia, familial primary pulmonary hypertension, and many other genetic diseases. TGF-β is associated with the occurrence, progression, and metastasis of tumors.
The classic TGF-β signaling pathway comprises TGF-β receptors I and II and their downstream Smad proteins, including R-Smads and the Co-Smad Smad4, with Smad4 forming complexes with phosphorylated R-Smad in the nucleus to induce transcription. Thus, Smad4, although not directly regulated by TGF-β activity, is a key factor in the entire TGF-β signaling pathway 18 .
In some cancers, the TGF-β signaling pathway is inhibited by continuously activated kinase NPM-ALK 19 , which continuously phosphorylates Smad4 and in turn prevents Smad4 from forming a complex with phosphorylated R-Smad in the nucleus for transcription.
PTPN2 has a clear dephosphorylation effect on Smad4 phosphorylated by NPM-ALK. NPM-ALK can dimerize and become activated by autophosphorylation, thus showing sustained kinase activity 20 . Studies have shown that PTPN2 also dephosphorylates NPM-ALK, thus reducing its activity. Therefore, PTPN2 acts by reducing the activity of NPM-ALK and thereby affecting Smad4.
The TGF-β signaling pathway also crosstalks with many other pathways, including the JAK/STAT pathway 21 . PTPN2 can mediate the dephosphorylation of pSTAT3(705) and negatively regulate the transcriptional activity of STAT3, whereas STAT3 can bind to Smad3 to inhibit the TGF-β signaling pathway. Therefore, high expression of PTPN2 inhibits STAT3 activity, preventing the binding of STAT3 to Smad3 and thereby restoring the TGF-β signaling pathway. In summary, the effect of PTPN2 on the TGF-β signaling pathway through other pathways should be further studied.
Smad4, a tumor suppressor gene, contains mutations or deletions in many cancer cells 22; 23 . Our analysis suggests that PTPN2 may restore the activity of the TGF-β signaling pathway, which would be of high clinical value, and a safe, reliable, and effective PTPN2 agonist may be applicable in the clinical treatment of OV.

Figure 1
a. Kaplan-Meier plot drawn using the survminer package to evaluate the prognostic value of PTPN2 in predicting the overall survival of patients with OV. b. In the differential expression analysis, there were 52 differentially expressed molecules with |logFC| > 2 and Padj < 0.05.

Figure 2
In the differential expression analysis, TCGA patients with OV were divided into high and low PTPN2 expression groups, showing the co-expression differences of genes.

a. Correlation between PTPN2 expression and 24 kinds of immune cells. b. The Spearman correlation method was used to analyze the correlation between PTPN2 expression and Tcm, Th2 cells, Tgd, and mast cells.