Immune- and Stromal scores correlate with breast cancer subtypes
We found significant correlations between Immune score and Stromal score with breast cancer subtypes. In this study, we obtained information about 1,040 breast cancer patients from the TCGA database, including their gene expression signatures and clinical profiles. The average age of the patients was 58.34 years. Using the PAM50 classification method, we used the genefu package [28] in R language to divide the breast cancer subtypes into luminal A, luminal B, Her2+, and Basal. The number of patients with luminal A was 286, the number of patients with subtype luminal B was 427, the number of patients with Her2+ was 128, and the number of patients with Basal was 199. According to the calculation results of the ESTIMATE and maftools algorithms [29], as shown in Fig.1 A to F, the Immune score, Stromal score, Estimate score, Tumor purity, TMB, and MATH are significantly correlated with different breast cancer subtypes. Among the Immune score, the basal subtype had the highest average Immune score, followed by the Her2+ subtype, LumA third, and LumB the lowest (P = 2.7e-12); Similarly, in breast cancer subtypes, the order of Stromal score from high to low is LumA > Her2+ > LumB > Basal (P < 2.2e-16); In Estimate score, the score from high to low is HER2+ > Basal > LumA > LumB (P = 4.2e-10); In Tumor purity, the score from high to low is LumB > Basal > LumA > Her2+ (P = 4.2e-10), In TMB, the score from high to low is Basal > Her2+ > LumB > LumA (P < 2.2e-16), In MATH, the score from high to low is Basal > LumB > Her2+ > LumA (P = 2.4e-15).
Next, to explore the correlations between Immune/Stromal scores and breast cancer prognosis, we constructed survival curves of patients by classifying them into high and low score groups based on their gene expression profiles. We found that patients with higher Immune score have longer survival rates than those with lower Immune score (P < 0.048). Similarly, patients with high Stromal score have better life expectancy than patients with low Stromal score, although the data are not statistically significant (Fig.1G-H).
Significant correlation between Immune/Stromal scores and T stage of breast cancer
TNM staging system is a globally recognized standard for classifying the extent of spread of cancer into nearby tissue. Given the correlations of both Immune- and Stromal scores with breast cancer subtypes, we investigated their connection with TNM stages of breast cancer. We noted that the Immune score, Stromal score and Estimate score decrease following the T stage progression (Fig.2A-C). In line with our observation in Figure 1G, the higher the Immune score, the better the prognosis. Consistently, the Tumor purity, TMB, and MATH increase with the progression of T stages, correlating well with the degree of malignancy of breast cancer (Fig.2D-F). The M and N stages of the tumor are only correlated with TMB, but not related to the Immune score, Stromal score, Estimate score, tumor purity, and MATH (supplementary Fig.1).
Analysis of differentially expressed genes
To determine the correlation between the overall gene expression profile and the Immune/Stromal scores, we evaluated the sequencing data of 1,040 breast cancer patients in the TCGA database. In this analysis, immune-related genes were divided into the high-score and the low-score groups. We then used the limma package of R language to analyze the differentially expressed genes. Heatmaps in Figure 3A-B showed distinct gene expression profiles of DEGs corresponding to the high vs. low Immune score/Stromal score groups. 859 genes were upregulated and 40 genes were downregulated in the group of high Immune score (FC > 2, P < 0.05). Similarly, there were 1011 upregulated genes and 5 downregulated genes in the high Stromal score group (FC > 2, P < 0.05) (Fig.3C-D).
We found that 280 genes were upregulated in the groups with both high Immune score and high Stromal score (Fig.3E) and only 1 common gene was downregulated (Fig.3F). Besides, the Stromal score was not significantly related to the prognosis of breast cancer patients (Fig.1G-H). Therefore we decided to exclude Stromal score from our analysis when verifying gene modules associated with Immune score.
Functional enrichment analysis reveals FPR3 as a key immune-related gene
Using WGCNA, we sought to identify gene modules that are associated with the Immune score. Among the eight modules, the MEblue was highly correlated with the Immune score (R2 = 0.975), Stromal score (R2 = 0.538), Estimate score (R2 = 0.885), TMB (R2 = 0.043). MEgreen module showed a higher correlation with Stromal score (R2 = 0.887; Fig.4A). Since the Immune score impacts the survival time of patients (Fig.1G), we identified 388 genes that were overlapped with WGCNA analysis and the high Immune score group (Fig.4B). To postulate the underlying mechanism for the upregulation of these identified genes, we first used the STRING tool to perform a functional analysis of the PPI network, and applied the MCODE plugin in Cytoscape to obtain the two most significant modules. The first MCODE module contains 53 genes, including almost all the established immune checkpoint genes, for example, CD274, PDCD1, CTLA4 and LAG3, etc. (Fig.4C). Intriguingly, many of the immune checkpoint genes predict a better prognosis for breast cancer (supplementary Fig.2), providing a molecular explanation for the limited effects of targeting immune checkpoint in treatment of breast cancer.
To identify more attractive prognostic targets in immune microenvironment, we performed a survival analysis for these 388 genes and found that a total of 101 genes showed a statistical difference in patient survival (supplementary Fig.3). We observed that only one particular gene named FPR3 predicts poor prognosis under higher expression, which was included in MCODE module 2 (Fig.4D). To evaluate if FPR3 is an independent prognostic factor in immune microenvironment, multivariate Cox analysis was used to analyze the hazard ratio (HR) of FPR3 and some known immune checkpoints (Fig.4E). The known six immune checkpoints don’t influence the survival rate of patients (Fig.4E). Only FPR3 falls to the right of the invalid line (hazard ratio > 1), which means that the higher the FPR3 expression, the worse the patient's prognosis. The results above demonstrated that FPR3 may represent an independent risk factor for breast cancer.
Furthermore, we analyzed the functions of genes in module 2 using gene enrichment approach. A total of 46 genes are involved in leukocyte differentiation, immune response-activating cell surface receptor signaling pathway and positive regulation of cell activation (Fig.5A), with their molecular functions relating to protein tyrosine kinase activity, peptide binding and amide binding (Fig.5B). These genes are predicted to localize to the plasma membrane (Fig.5C) and required for Th1 and Th2 cell differentiation, Th17 cell differentiation and Epstein-Barr virus infection, as indicated by KEGG pathway (Fig.5D). These results indicate that FPR3, as a component of module 2, may cooperate with other partners in immune-based biological pathways.
FPR3 expression in breast cancer and the pathways enrichment determined by GSEA
We used TIMER to evaluate the expression of FPR3 in pan-cancer. As displayed in Fig.6A, FPR3 is highly expressed in a variety of cancers including diverse breast cancer subtypes, COAD cancer and HNSC cancer. As shown in both Oncomine and GEPIA database, the expression of FPR3 in breast cancer is significantly higher than the adjacent normal tissue (Fig.6B, C). The kaplan-Meier survival curve shows that the expression of FPR3 in breast cancer is negatively related to the survival of patients (Fig.6D). These data indicate that inhibiting FPR3 expression may be useful as an intervention strategy.
To reveal the functional role of FPR3, GSEA (gene set enrichment analysis) was used to analyze the gene expression matrix acquired from TCGA. The samples were divided into high and low expression groups according to the median expression level of FPR3. Top two enriched pathways in high expression group of FPR3 were “pathways in cancer” (Fig.7A) and “cytokine-cytokine receptor interaction” (Fig.7B), implying that FPR3 upregulation may lead to alterations in cancer-related pathways and cytokine-based immune regulations. We also performed a correlation analysis of FPR3 expression on the GSEA-enriched pathways. In the “pathways in cancer”, PIK3R5, SPI1, and CSF1R are most relevant to FPR3 expression (Fig.7A1-A3). In the “cytokine-cytokine receptor interaction pathway”, the most relevant genes to FPR3 expression are CCR1, IL10, and IL10RA (Fig.7B1-B3). FPR3 may directly or indirectly cooperate with these genes in promoting tumorigenesis. For instance, high correlation of FPR3 with PIK3R5 implies FPR3 may promote tumorigenesis through the PIK3R5-mediated G-protein coupled receptor activation [30]. As annotated in the “pathways in cancer”, high FPR3 expression actually associates with various pathways that promote tumorigenesis and development, such as VEGF signaling pathway, MAPK signaling pathway and PI3K-AKT signaling pathway. Due to FPRs belong to the classic chemotactic GPCR subfamily, we propose that FPR3 is involved in the G protein receptor coupled pathway to promote carcinogenesis, as shown in the schematic diagram (Fig.7C). Through unknown ligands, FPR3 may regulate breast tumorigenesis through G-protein coupled PI3K or MAPK signaling cascades, participating in a series of oncogenic processes, such as enhanced proliferation, sustained angiogenesis and apoptosis evasion.