The Analysis of Tumor Microenvironment of Prostate Cancer Identies Prognostic Signatures

Background: Tumor microenvironment (TME) is an essential part of tumor tissue, and increasing references suggested that TME has clinicopathological signicance in predicting prognosis and therapeutic ecacy. However, little ecacy has been demonstrated in prostate cancer. Methods: The cohort TCGA-PRAD (n=477) was used in this study. Based on the proportion of 22 types of immune cells calculated by CIBERSORT, TME was classied by K-means Clustering and differentially expressed genes (DEGs) were determined. Then TMEscore was calculated based on cluster signature genes, which were obtained from DEGs by random forest method, and the samples were classied to two subtypes. We performed somatic mutation and copy number variation analysis to identify the genetic characteristics of the two subtypes. Correlation analysis was performed to explore the correlation between TMEscore and the tumor response to ICIs as well as the prognosis of PCa. Results: Based on the proportion of immune cells, we constructed the TMEscore model and classied PCa samples into TMEscore high and TMEscore low groups. The results of survival analysis suggested that the TMEscore high group had signicantly better survival outcome than the TMEscore low group. The correlation analysis showed a signicantly positive correlation between TMEscore and the known prognostic factors of tumors. Conclusion: our study indicates that the TMEscore may be a potential prognostic biomarker in PCa. A comprehensive description of the characteristics of TME may help to explain the response to therapies for PCa patients and provide the new strategies for treatment.


Background
Prostate cancer (PCa) remains the most frequently diagnosed non-cutaneous malignancy among men, and the second most common cause of cancer death worldwide [1,2]. Prognosis and treatment decisions of PCa are based on the malignant grade using the Gleason sum, the clinical stage using the tumor, node, metastasis system, and a patient's serum PSA level [3,4]. Despite these well-used clinical predictive and prognostic factors, drastically variable outcomes are observed between PCa patients with similar stages and malignant grades [5]. Therefore, it has highlighted the importance of identifying novel biomarkers for the PCa prognostic assessment.
The tumor microenvironment (TME) is the cellular environment where the tumor located, mainly consisting of immune cells, mesenchymal cells, endothelial cells, in ammatory mediators, and extracellular matrix (ECM) molecules [6]. In the TME, the immune and stromal cells are the two major types of non-tumor components [7]. It is reported that the activated immune cells in the TME can promote tumor growth and progression [8]. Besides, the increasing references suggest that the TME cells have clinicopathological signi cance in predicting prognosis and therapeutic e cacy of tumors [4,5]. In various malignancies, including gastric cancer and glioma, the gene signatures from the TME signi cantly associated with outcomes of patients [6,9]. Although the landscape of TME has been widely focused across a range of tumor types, little e cacy was demonstrated in PCa. Researchers found the TME in PCa is complex and dynamic [8]. Recent studies revealed a signi cant role of the TME cells in the initiation and progression of PCa [10,11]. Importantly, a series of cell intrinsic changes in combination with distinct changes in the TME involve in the initiating events in PCa. The experimental data has uncovered the relationship between the TME cells and malignant tumor cells in which early changes in normal tissue microenvironment can promote tumorigenesis and in turn tumor cells can promote further pro-tumor changes in the microenvironment [2]. Moreover, it's worth noting that the abundance of immune cells and other cells in the TME can be estimated using computational methods [12,13]. Several studies using these methodologies have explored the clinical utility of TME in ltrates [9,14], and several mechanisms associated with the role of TME in immunotherapy response and resistance have been experimentally identi ed for some tumor types. However, the comprehensive pro le of the TME in ltrating cells and TME characterization has not yet been elucidated in PCa.
In the present study, we estimated the TME in ltration patterns of PCa patients and constructed the TMEscore to predict their prognosis. We systematically analyzed the TMEscore-related phenotypes with genomic and clinical features of PCa. As a result, we identi ed a list of microenvironment-related signatures from the high/low TMEscore subtype that can predict survival outcomes pretty well in PCa.
Our results demonstrate that the TMEscore could be a robust prognostic biomarker and predictive factor for the developing new diagnosis strategies in PCa.

Prostate cancer data collection
The transcriptional sequencing data of PCa patients (TCGA-PRAD) were separately downloaded from the TCGA database (https://xenabrowser.net/datapages) and the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). The clinical information including gender, age, and survival time was also downloaded from TCGA and GEO. After removing the PCa samples without survival data or clinical data, 477 transcriptome samples from TCGA were selected as test sets, and 111 samples from GEO were used as validation sets ( Table 1). The SNP data of these 477 samples were obtained from the TCGA database. Among the 477 samples, the SNP6 copy number segments of 471 samples were available from http:// rebrowse.org/.

TME analysis
The proportion of immune cells was calculated using the CIBERSORT algorithm with the leukocyte gene signature matrix (LM22) as the reference and with 1000 permutations [12]. TME clusters were determined using the ConcensusClusterPlus R package based on the proportion of immune cells [15]. The optimum clusters number K was estimated by algorithms elbow and gap statics. The limma R package was used to analyze the differentially expressed genes (DEGs) among the TME clusters, with thresholds of p < 0.05 and |logFC| > 0.58. Cluster signature genes were obtained by random forest method, and functional enrichment analysis was performed using ClusterPro ler R package. Cluster signature genes were classi ed to two categories according to cox coe cient (positive or negative) based on cox regression model [16]. TMEscore was calculated as follows: X represents the expression value of cluster signature genes with a positive cox coe cient, and Y represents the expression value of cluster signature genes with a negative cox coe cient.
The maxstat R package was used to de ne the optimal breakpoint for TMEscore, thus the 477 PCa samples were classi ed to high TMEScore and low TMEScore group. The TMEsore model was validated in Cohort GSE70770 from GEO.

Mutation spectrum analysis
We analyzed the mutational spectrum and mutational signatures of 477 samples from TCGA-PRAD by maftools and SomaticSignatures R packages. Copy number variable regions (CNVRs), including chromosomal CNV regions and minimal common regions (MCRs), were detected by GISTIC. Based on the results of CNVs, tumor purity and ploidy were estimated by ABSOLUTE R package.

Correlation analysis
We used Kaplan-Meier method to illustrate the relationship between the survival of PCa samples and the genetic signatures, such as miRNA, mRNA and the methylation site. We also explored the correlation between TMEscore and tumor response to ICIs in PCa. The tumor response to ICIs was evaluated by Tumor Immune Dysfunction and Exclusion (TIDE) scoring system, in which the higher the TIDE score, the worse the tumor response to ICIs and the worse the prognosis. Statistical signi cance was de ned as two-tailed P values < 0.05.

Results
TME cell in ltrating patterns and DEGs associated with TME We calculated the proportion of immune cells in 477 PCa samples, according to the previous ndings that immune cells, especially tumor in ltering lymphocytes, are more related to the prognosis and immunotherapy effect of patients than the stromal cells in the TME [6,17]. We depicted a comprehensive pro le of the immune cell interactions, lineages and their effects on the OS of PCa patients, and obtained four distinct patterns of TME cell in ltration (Fig. 1A). The cell cluster A was characterized by the in ltration of the macrophages (M0), B cells (naïve), mast cells (resting) and so on. The cell cluster B was characterized by the in ltration of mast cells (activated), neutrophils, NK cells (activated) and so on. The cell cluster C and cluster D was characterized by the in ltration of the CD4 + T cells (memory resting) and the macrophages (M2), respectively. According to the proportion of immune cells, the PCa samples can be divided into three TME clusters through consensus clustering analysis (Figure S1A-C). Interestingly, by mapping the TME clusters to the proportion of immune cells, we found signi cant differences in the distribution of immune cells among different TMEcluster samples, and TMEcluster 1 and 2 were quite different from TMEcluster 3 (Fig. 1B). The survival analysis showed that the survival was not signi cantly different among the three TME clusters (Fig. 1C). Unexpectedly, the TME clusters directly classi ed by the proportion of immune cells cannot distinguish the prognosis of PCa. It is probably due to the redundant information caused by LM22 gene. In order to reduce redundant information interference, we used an unsupervised clustering method to re-cluster 477 PCa samples based on the expression pro le of TMEin ltrating phenotype related differential genes, and a total of 3637 DEGs were obtained. After dimensionality reduction by random forest algorism, the most relevant signature genes (n = 104) were used to classify the 477 samples into two clusters (Fig. 1D). These genes are mainly enriched in the pathways involved in immune activation, such as neutrophil chemotaxis, positive regulation of alphabeta T cell activation, regulation of alpha-beta T cell differentiation, neutrophil migration and so on (Fig. 1E).

Construction of the TMEscore model for PCa samples
Based on the re-clustering model, we evaluated the correlation between DEGs and prognosis. The Cox regression model was used to evaluate the relationship between the DEGs and OS of PCa samples.
According to the coe cient value of genes, the 477 samples were classi ed into two groups with the TMEscore high (n = 308) and TMEscore low (n = 169) phenotype (x=-0.31).
As shown in Fig. 2A, the prognosis of the TMEscore high group was signi cantly better than that of the TMEscore low group. A comprehensive comparison of different TME clusters (Fig. 2B) showed that cluster samples based on immune cell components and proportions were notably correlated with the DEGs and TMEscore. Consistent with these ndings, the TMEscore of the 111 PCa samples from the GEO can also re ect the prognosis of samples ( Fig. 2C and 2D).
Correlation of the mutational signatures with the TMEScore phenotypes in PCa The common genomic mutations, such as SNPs, have been reported to be used as predictors of aggressive prostate cancer. The exploration of characteristic SNPs would enable more accurate risk strati cation, allowing for tailored management of individual prostate cancer [18]. Therefore, we further explored the relationship between the genomic mutations and TMEScore. The somatic mutation analysis showed that the most common mutation in PCa is the missense mutation, mainly caused by SNP, and C > T is the most common type of SNP mutations ( Figure S2).The frequently mutated genes in TMEscore high group and TMEscore low group were presented in Fig. 3A and 3B, respectively, and their variant allele fractions (VAFs) were different between two groups (Fig. 3C). The contribution of 96 base substitution types was presented in Figure S3A and B. The mutational signature analysis showed that TMEscore high group was mainly related to signature1, signature3 and signature5 (Fig. 3E), and TMEscore low group was mainly related to signature1, signature5 and signature6 (Fig. 3F). It can be seen that there are signi cant differences in mutation characteristics between TMEscore high group and TMEscore low group.
MSI is an emerging biomarker used to predict the outcome of cancer treatment, and patients with MSI-H usually have better prognosis than those with MSI-L/MSS. The correlation analysis between TMEScore and MSI showed that the high TMEscore was signi cantly associated with MSI-H (Fig. 3D), and was associated with a good prognosis.

Correlation of copy number variants with the TMEScore phenotypes in PCa
We analyzed the correlation between the TMEscore and the characteristic of the chromosome CNV in PCa. The results showed that there were signi cant differences in chromosome copy number ampli cations or deletions between the two TMEscore related phenotypes. The CNVs analysis by GISTIC showed that ampli cations of chromosomal arms 8p, 8q, 7p and 7q, and deletions of chromosomal arms 8p, 18p, 18q and 16q frequently occurred in TMEscore-high subtype (Fig. 4A, C); ampli cations of chromosomal arms 8p, 8q, 7p and 7q, and deletions of chromosomal arms 8p and 18q frequently occurred in TMEscore-low subtype (Fig. 4B, D). There were signi cant differences in MCRs between TMEscore high group and TMEscore low group. In the TMEscore high group, 14 ampli cations and 30 deletions were detected, in which 3q26.2 ampli cation and 8p21.3 deletion were most signi cant (Fig. 4E); while in the TMEscore low group, 22 ampli cations and 24 deletions were detected, in which 8q24.21, 8q21.13 ampli cations and 6q14.3, 13q14.13 deletions were most signi cant (Fig. 4F). We also analyzed the correlation of tumor purity and tumor ploidy with the TMEScore related phenotypes, and the results showed no signi cant correlations ( Figure S3C, D).

Comprehensive analysis of genomic signatures associated with the TMEscore in PCa
Based on the TMEscore phenotypes, we conducted a comprehensive analysis of the genomic signatures associated with the prognosis of PCa. DEG analysis obtained 5 differentially expressed miRNAs and 127 differentially expressed mRNAs between TMEscore-high and TMEscore-low subtypes, with the threshold P < 0.05 and |logFC|>1 for miRNA, and |logFC|>1 and P < 0.05 for mRNA, respectively. Among them, hasmir-133b (Fig. 5A) and FMOD (Fig. 5B) were most signi cantly correlated with survival of PCa. As DNA methylation status is closely related to tumor progression and prognosis, we performed differential methylation analysis between the TMEscore high and TMEscore low groups, and 38 signi cantly differential methylation sites were obtained, among which cg03804126 was most signi cantly correlated with OS (Fig. 5C). Moreover, a comprehensive genomic landscape of PCa was presented in Fig. 5D. We found 12 survival-related genes, including CD38, PROK1, SRD5A2, TMEM35, DPT,FAM107A, SPOCK3, MT1G, SLC22A3, ANO7, MYLK and PTN.

Discussion
In recent years, TME signatures representing TME status have been identi ed, and their potential clinical relevance in several cancers has been evaluated [9,18,19]. In the present study, we focused on the TME signatures that contribute to survival of PCa samples from the TCGA database and the GEO database. Based on the TME in ltration pattern analysis of 477 PCa patients, our observations suggested that the in ltrating immune cells such as macrophages (M2), CD4 + T cells (memory resting), T cells (regulatory), Plasma cells and T cells (follicular) were closely related to the survival outcome of PCa patients. Interestingly, the results of our bioinformatic ndings are consistent with some of the experimental studies. For example, in prostate cancer, most (but not all) pathological studies found that the greater the number of tumor-associated macrophages, the worse the prognosis [20,21]. Additionally, the M2 macrophages were reported to be associated with higher stage and higher Gleason scores of tumors [22].
Researchers also found that regulatory T cells were elevated in the circulation of patients with PCa, and the elevated number of regulatory T cells was associated with worse prognosis [23].
Most importantly, we constructed the TMEscore model that is related to prognosis of PCa. Consistent with the results of previous studies on other types of tumors [6,9], PCa patients with the TMEscore high phenotype have obviously better prognosis. Furthermore, the e cacy of TMEscore on prognosis was compared with other common prognostic factors. As described in the previous report, the common genetic variants SNP may act as predictors of aggressive prostate cancer, and would enable more accurate risk strati cation in PCa [24]. Meanwhile, it is reported that MSI did not correlate with clinical stage, but might play a role in the development of a subset of prostate cancers. The MSI-H/dMMR molecular phenotype is uncommon in prostate cancer, but it has therapeutic signi cance [25]. Here, we found a strong positive correlation between TMEscore and SNP or MSI-H in PCa. The results indicated that the TMEscore may be useful in developing new diagnostic strategies for PCa.
In order to explore other potential prognostic signatures, we integrated the genomic, clinical and pathological features of PCa samples. Expectantly, we obtained a list of prognostic signatures, including miRNAs, mRNA and methylation sites. As well known, miRNAs are short, endogenous RNAs in cells, promoting translational repression and/or destabilizing target mRNAs to optimize protein levels in numerous biologic processes [26]. The miRNA expression pro les of PCa have been reported [27], and miRNAs can act as oncomiRs or tumor suppressors in a multitude of cancers [28]. Numerous studies have been published, showing that the alterations in miRNAs are associated with initiation and progression of PCa [27]. Moreover, a list of miRNAs, such as miR-133b and miR-1, are novel biomarkers with prognostic and diagnostic value for recurrence of PCa and have been identi ed by miRNA expression pro ling [29]. Consistent with these ndings, we identi ed miR-133b with the most signi cant prognostic value from the TMEscore model in the PCa. In addition, FMOD has been reported as a potential biomarker for PCa [30]. We also identi ed the FMOD with the most signi cant prognostic value from the mRNA pro le between TMEscore high/low groups. These results highlight the important role of TMEscore in recognizing prognostic signatures. Although the TMEscore phenotype of PCa can provide an alternative set of biomarkers with good prognostic ability, these results alone are not su cient for clinical application. Hence, further validation, especially genetic and experimental studies involving larger samples, should be needed in the future.

Conclusions
In conclusion, the comprehensive assessment of the cellular, molecular and genetic factors associated with TMEscore in PCa patients, has provided several important insights into how prostate tumors respond to survival outcomes and may guide the development of new diagnostic strategies.