Mining hub genes correlated with immune infiltrating level across multiple tumors microenvironment

Background Previous studies revealed that cancer-associated differentially expressed genes (DEGs) in an independent cancer type are rarely related to the tumorigenesis and metastasis, while the common DEGs across multiple types of cancer may be proved as potential oncogenes or tumor suppressors and extend our understanding. Although tumor-infiltrating lymphocytes (TIL) have been reported to be associated with prognosis in multiple types of cancer, the hub genes regulating immune cells function in different cancer types remain unclear. Methods To screen for the hub genes regulating immune infiltrating level across multiple tumors microenvironment, the raw data containing RNA sequencing and clinical information from TCGA database and immune scores from ESTIMATE website across 25 cancer types were obtained. Results Based on the immune scores, all cases were categorized into high-score and low-score groups. Kaplan–Meier survival analysis demonstrated that a strong correlation was found in six cancer types. The functional enrichment analysis of common DEGs revealed that infection and immune response are the most prominent biological characteristics. Subsequently, the twelve common DEGs with prognostic value were identified as candidate hub genes and were adopted to construct the PPI network. Because of highly interconnected with other hub genes, protein tyrosine phosphatase non-receptor type 6 (PTPN6) was selected as the real hub gene across the six immune-specific tumors. Conclusion Due to the significant correlation between PTPN6 with tumor-infiltrating immune cells in multiple cancers, PTPN6 may well play a vital role in regulating immune response for tumor development.


Abstract
Background Previous studies revealed that cancer-associated differentially expressed genes (DEGs) in an independent cancer type are rarely related to the tumorigenesis and metastasis, while the common DEGs across multiple types of cancer may be proved as potential oncogenes or tumor suppressors and extend our understanding. Although tumorinfiltrating lymphocytes (TIL) have been reported to be associated with prognosis in multiple types of cancer, the hub genes regulating immune cells function in different cancer types remain unclear.
Methods To screen for the hub genes regulating immune infiltrating level across multiple tumors microenvironment, the raw data containing RNA sequencing and clinical information from TCGA database and immune scores from ESTIMATE website across 25 cancer types were obtained.
Results Based on the immune scores, all cases were categorized into high-score and lowscore groups. Kaplan-Meier survival analysis demonstrated that a strong correlation was found in six cancer types. The functional enrichment analysis of common DEGs revealed that infection and immune response are the most prominent biological characteristics.
Subsequently, the twelve common DEGs with prognostic value were identified as candidate hub genes and were adopted to construct the PPI network. Because of highly interconnected with other hub genes, protein tyrosine phosphatase non-receptor type 6 (PTPN6) was selected as the real hub gene across the six immune-specific tumors.
Conclusion Due to the significant correlation between PTPN6 with tumor-infiltrating immune cells in multiple cancers, PTPN6 may well play a vital role in regulating immune response for tumor development.

Background 3
A key factor for tumor cells to evade antitumor immune surveillance lies in the complex immunosuppressive microenvironment is formed through downregulation of tumor antigens and production of immunosuppressive mediators [1,2]. Great progress has been made in modern immunotherapy by recognizing immune checkpoint [3][4][5] and adopting tumor vaccine [6][7][8], which suggested that elucidating the antitumor immune-related mechanisms is crucial to initiate a clinically meaningful antitumor immune response.
However, the cancer-associated differentially expressed genes (DEGs) only in an independent cancer type are rarely reported to be related to the initiation and development of cancer [9]. By contrast, tumor samples from different tissues have been reported to share some common features and may act as oncogenes or tumor suppressors across multiple cancer types [10,11]. Therefore, mining hub genes correlated with immune infiltrating level in multiple cancer types may provide novel immune targets for future immunotherapy.
In the tumor microenvironment, immune cells are the principal types of non-tumor components and have been proposed to be significantly correlated with tumor prognosis [12,13]. ESTIMATE algorithm is based on single sample Gene Set Enrichment Analysis (GSEA) and generates immune scores via gene expression data from The Cancer Genome Atlas (TCGA) database [14]. To date, immune scores based on ESTIMATE algorithm have been widely adopted to quantify the immune components in multiple types of tumor tissues, including colon cancer [15], breast cancer [16], prostate cancer [17] and glioblastoma multiforme [18]. However, the immune-related genes in previous studies are various and rarely have been confirmed to be associated with immune infiltrating level in tumor microenvironment.
In this study, we systematically analyzed the correlation between immune scores and survival prognosis across 25 types of cancer and six cancer types were selected as 4 immune-specific cancers. Subsequently, the common DEGs with prognostic value were identified as candidate hub genes. Protein-protein interaction (PPI) network were then used to screen the real hub gene. Finally, we investigated the relationship between the real hub gene protein tyrosine phosphatase non-receptor type 6 (PTPN6) and immune infiltrating level in tumor microenvironment.

Identification of immune-specific tumors
To single out the immune-associated tumors, immune scores across 25 types of cancer and their clinical information from TCGA were obtained. All cases were divided into high  (Fig. 2F)). For the convenience of subsequent description, we name these two groups positive and negative subgroups, respectively.

Identification of the common DEGs in immune-specific tumors
Based on the criterion of |log2(FC)| > 2 and adj. P value < 0.05, we compared gene

Identification of hub genes
All DEGs in six cancer types were subjected to univariate Cox proportional hazards regression analysis and a total of 131, 214, 445, 238, 454 and 295 DEGs in LUAD, CESC, SKCM, BRCA, LGG and KIRC were recognized to be associated with their survival prognosis.
Because of highly interconnected with other nodes, PTPN6 was selected as the real hub gene, which has a significant correlation with the KLRC1, JAK2, RASAL3, DBNL, GRAP2 and CORO1A (Fig. 4D).

Validation of the real hub gene
To validate the differential expression of the real hub genes on translational level in immune-specific tumors, the immunohistochemistry staining was obtained from HPA database. The results demonstrated that the expression level of PTPN6 was significantly upregulated in tumors compared to the normal samples ( Fig. 5A-F). In addition, the prognostic role of PTPN6 was further confirmed using TCGA data. Kaplan-Meier survival analysis showed that high expression level of PTPN6 was significantly correlated with better overall survival in LUAD (P = 0.0303), CESC (P = 0.0386), SKCM (P 0.0001) and BRCA (P = 0.0164), but with poorer prognosis in LGG (P=0.0060) and KIRC (P = 0.0116) ( Fig. 6A-F). These findings were consistent with the correlation between immune scores and survival prognosis.

Correlation analysis between PTPN6 and immune infiltrating level in tumor microenvironment
To quantify the interactive relationships between PTPN6 and immune infiltrating level in tumor microenvironment, the correlation based on the coefficient of correlation R was divided into three types as follows: (1)  To investigate the interplay between PTPN6 and tumor infiltrating immune cells, the coexpression between PTPN6 and immune marker sets of diverse immune cells was further analyzed after the correlation adjustment by purity. As is shown in table 1, PTPN6 expression was significantly correlated with most immune marker sets of diverse immune cells. the immune marker highly correlated with PTPN6 in various specific immune cells was further explored. The results show that PTPN6 was significantly correlated with CD19 and CD79A in B cell, CD8A and CD8B in CD8+ T cell, TBX21, STAT6 and IFNG in CD4+ T cell, CD68 and IL-10 in tumor-associated macrophage, ITGAM and CCR7 in neutrophils, ITGAM and HLA-DPB1 in dendritic cell (Fig.8A-F). Interestingly, PTPN6 was found to be highly interplayed with PD-1 in T cell exhaustion across six cancer types, which suggests 8 that PTPN6 may play a vital role in regulating immune checkpoint.

Discussion
Great success has been made in applying immunotherapy to hematologic malignancies [19], while unique challenges are faced in solid tumors because of the immunosuppressive tumor microenvironment [20]. The tumor microenvironment contains multiple components (such as suppressive cells, inhibitory ligands and soluble factors) that facilitate evasion of antitumor immune responses in solid tumors. Previous reports have provided elegant analysis on how the activation of tumor-intrinsic genes shapes tumor microenvironment [21,22]. With the rapid development of third-generation sequencing technology, an increasing number of available databases have been used to mine candidate genes regulating immune infiltration in an independent cancer type, while immune-associated genes in multiple cancer types have not been reported [10,11]. In this study, we focused on genes characteristic of tumor microenvironment, which in turn affect the development of cancer and hence contribute to patients' overall survival.
Six selected cancer types (including LUAD, CESC, SKCM, BRCA, LGG and KIRC) were found to be significantly correlated with immune infiltrating level in tumor microenvironment.
Twelve common DEGs with prognostic value were adopted to construct PPI network and identify potential hub genes that are associated with tumor infiltrating immune cells across multiple cancer types. The results show that most of them are crucial to immune cell signaling and immune responses. For example, GRAP2 [23], DBNL [24], TBC1D10C [25], PTPN6 [26,27] and SOCS1 [28] has been reported to be involved in the process of proliferation and activation, cytokine secretion and cell cycle regulation in T cells. In addition, data from experimental mice and clinical observations have unraveled that KLRC1 [29], RASAL3 [30] and MAP4K1 [31] may play a vital role in controlling the magnitude of inflammatory responses through multiple signaling events in autoimmune diseases or 9 tumorigenesis, and JAK2 [32,33] and Coro1A [34] represent important novel players with key functions in polymorphonuclear neutrophils trafficking and PD-1 ligand transcription in innate and adaptive immunity. Although there have been no studies to provide evidence for TCEB1 and ARHGAP9 with immune-infiltrating level in tumor microenvironment, they exhibit an accordant up-expression pattern in the cancer tissues in our study.
Because of highly interconnected with other nodes in the PPI network, PTPN6 was considered to be the most valuable one related to immune response. A systematic correlation analysis uncovered that the expression level of PTPN6 was significantly associated with immune infiltrating immune cells and marker sets of diverse immune cells, among which CD19 and CD79A in B cell, CD8A and CD8B in CD8 + T cell, TBX21, STAT6 and IFNG in CD4 + T cell, CD68 and IL-10 in tumor-associated macrophage, ITGAM and CCR7 in neutrophils, ITGAM and HLA-DPB1 in dendritic cell, PD-1 in T cell exhaustion were found to be significantly co-expressed with PTPN6. Previous reports have disclosed the role of PTPN6 in regulating CD8 + T cell activation and responsiveness by inhibiting immune checkpoint and increasing T cell receptor affinity to enhance T cell-mediated immunity [26,35]. Moreover, Zawada et al. performed a GO enrichment analysis of monocyte heterogeneity, which suggests that PTPN6 was significantly associated with monocyte activation [27]. However, the relationship between PTPN6 and other marker sets of diverse immune cells remains unclear. Therefore, our results may provide additional data in decoding the complex interaction of tumor and tumor microenvironment.
In conclusion, our results demonstrated that immune scores are significantly correlated with overall survival in six cancer types, including LUAD, CESC, SKCM, BRCA, LGG and KIRC. No significant differences between positive and negative subgroups were observed in the functional enrichment analysis of common DEGs. Subsequently, the twelve common DEGs with prognostic value were found to be significantly correlated with immune infiltrating level in tumor microenvironment. Finally, PTPN6 was selected as the hub gene regulating tumor infiltrating immune cells and marker sets of diverse immune cells by highly co-expressing with KLRC1, JAK2, RASAL3, DBNL, GRAP2 and CORO1A. In summary, the further investigation of these genes may lead to novel insights into the potential association of tumor microenvironment with tumors and provide novel targets for tumor immunotherapy.

Dataset
A workflow of our study is presented in Figure 1. Immune scores based on ESTIMATE algorithm across 25 types of cancer were downloaded from ESTIMATE website (https://bioinformatics.mdanderson.org/public-software/estimate/). The raw data containing RNA sequencing and clinical information were obtained from The Cancer Genome Atlas (TCGA) repository website (https://portal.gdc.cancer.gov/). This study was approved by the ethics committee of Wuhan union hospital.

Screening for immune-specific cancers
All patients in an independent cancer type were divided into high and low-score groups according to immune scores. Kaplan-Meier survival analysis was done to screen for tumors that are significantly correlated with survival prognosis.

Identification of DEGs in immune-specific cancers
The R 'limma' Bioconductor package was used to screen the DEGs between tumor and normal samples under the threshold of |log2(FC)| > 2 and adj. P value < 0.05.

Functional enrichment analysis of DEGs
DEGs were uploaded to the Database for Annotation, Visualization and Integrated Discovery (DAVID) (https://david.ncifcrf.gov/) to gain insight into the biological themes , particularly GO categories and KEGG pathways. Adj. P < 0.01 was set as the cut-off criterion.

The identification and validation of hub gene
Venn diagram of DEGs across six types of cancer was performed to screen for the common DEGs. Subsequently, the R 'survival' package was adopted to perform Kaplan-Meier survival analysis according to the expression levels of DEGs. The common DEGs with prognostic value were selected as candidate hub genes. The PPI network of common DEGs was then retrieved from STRING database and reconstructed using Cytoscape software.
The node that is highly interconnected with other candidate hub genes were considered to be the real hub gene. The Human Protein Atlas (HPA) (https://www.proteinatlas.org/) were used to validate the differential expression of hub gene between tumor and normal samples on translational level.

Correlation analysis between hub genes and immune infiltrating level in tumor microenvironment
The correlation between hub genes and immune infiltrating level in tumor microenvironment was analyzed in TIMER website (https://cistrome.shinyapps.io/timer/). In addition, we further assessed the correlation between hub gene and immune marker sets of various immune cells.

Statistical Analysis
Statistical analysis and visualization. Statistical analysis was performed using R-3.6.1.