CLEC10A is a Potential Prognostic Biomarker and Immune Regulator of Tumor Microenvironment in Breast Cancer

The communication between tumor cells and immune cells inuences the ecology of the tumor microenvironment in breast cancer, as well as the disease progression and clinical outcome. The aim of this study was to investigate the prognostic value of the immunomodulatory factor CLEC10A in breast cancer. We applied the CIBERSORT and ESTIMATE calculation methods to calculate the proportion of tumor-inltrating immune cells (TICs) and the amount of immune and stromal components in 1053 BRCA cases from The Cancer Genome Atlas (TCGA) database. The differentially expressed genes (DEGs) were analyzed by COX regression analysis and protein-protein interaction (PPI) network construction. Then, CLEC10A was identied as a prognostic factor by the intersection analysis of univariate COX and PPI. Further analysis revealed that CLEC10A expression was negatively correlated with the clinical pathologic characteristics (age, clinical stage) and positively correlated with survival of BRCA patients. Gene set enrichment analysis (GSEA) showed that genes in the high CLEC10A expression group were mainly enriched in immune-related activities. Genes in the low CLEC10A expression group were enriched in biochemical functions. CIBERSORT analysis of the proportion of TICs revealed that Macrophages M1, B cells memory, B cells naive, T cells CD4 + memory activated, T cells CD8 + , and T cells gamma delta were positively correlated with CLEC10A expression, and Macrophages M0, Macrophages M2, Neutrophils, and NK cells resting were positively correlated with CLEC10A expression was negatively correlated, suggesting that CLEC10A may be an important factor in the immune regulation of the tumor microenvironment, especially in mediating the anti-tumor immune response of tumor-inltrating immune cells at the tumor initiation stage. Therefore, CLEC10A expression may contribute to the prognosis of BRCA patients and provide a new idea for the immunotherapy of BRCA.


Introduction
Breast cancer is the most common type of cancer in women and the leading cause of cancer death [1] .
Although advances in diagnosis and treatment have improved the survival rate of patients with earlystage breast cancer, the survival rate of patients with advanced stages remains poor [2] . There is increasing amount of evidence indicate that the tumor microenvironment (TME), especially tumorin ltrating immune cells (TICs), plays an important role in tumor development and is closely related to patient prognosis [3,4] . The co-evolution of the interaction between cancer cell and the tumor microenvironment leads to the malignant phenotype of cancer, such as malignant proliferation, antiapoptosis and immune surveillance evasion. Immune cells play an important anti-cancer role, although some cancer cell evade immune surveillance [5,6] . Therefore, a comprehensive assessment of the tumor microenvironment and its level of immune in ltration, and further screening for prognostic markers associated with tumor immune in ltration can help predict the prognosis of breast cancer patients and guide individualized immunotherapy.
Breast cancer is generally considered to be non-immunogenic, so its response to immunotherapy is considered limited. However, recent reports have found that breast cancer also has a rich immune microenvironment, which is signi cantly related to the prognosis of patients. Tumor in ltrating lymphocytes (TILs) are the most common mononuclear immune cells, which have been reported in most breast cancer tissues. Study have shown that high CD8 + T cell ratio and high CD8/FOXP3 ratio in residual cancer after neoadjuvant chemotherapy can accurately predict the prognosis of patients with triple negative breast cancer (TNBC) [8] . Another study showed that macrophages can affect tumor progression and may change the proliferation activity of cancer cells in the metastatic microenvironment [9] . These studies have revealed the potential prognostic value of immune cells in breast cancer.
In the present study, we analyzed differentially expressed genes (DEGs) generated by comparing immune components and stromal components in BRCA samples from The Cancer Genome Atlas (TCGA) database, and identify the prognostic biomarker C-type lectin domain containing 10A (CLEC10A), also known as macrophage galactose-type lectin (MGL). Furthermore, we calculated TICs proportions and proportions of immune and stromal components of BRCA samples from The Cancer Genome Atlas (TCGA) database by applying ESTIMATE and CIBERSORT computational methods, and revealed that CLEC10A is a potential biomarker of TME immunomodulation in BRCA.

Raw data
Transcriptomic RNA-seq data of 1164 breast invasive carcinoma (BRCA) cases, all female, containing 111 normal and 1053 tumor samples and their corresponding clinical data were downloaded from the TCGA database (https://portal.gdc.cancer.gov/).

Generation of Immunescore, StromalScore and ESTIMATEScore
The scores of the immune, stromal components of the TME were estimated for each sample by the ESTIMATE algorithm by using the R software loaded with the ESTIMATE package (version 4.0.2), presented in the form of three scores: ImmuneScore, StromalScore and ESTIMATEScore, respectively, with immune, stromal and the sum of both positively correlated, which means that the higher the corresponding score, the greater the proportion of the corresponding component in the TME.

Survival Analysis
Survival curves were plotted by using the sursvive and survminer packages in R software. Survival analysis was performed by Kaplan-Meier method on 1034 samples with detailed survival times out of 1053 tumor samples downloaded from TCGA database. Survival curves were compared using log-rank test. The test criterion was α = 0.05.

Generation of DEGs in ImmuneScore and StromalScore
The 1034 tumor samples were labeled as high score or low score by comparison with the median score of ImmuneScore and StromalScore, respectively. Differential analysis of gene expression was performed with package limma, and then DEGs was generated by comparing high score and low score samples. log2FC|>1 and FDR < 0.05 were the inclusion criteria (FC = high score group/low score group).

GO and KEGG enrichment analysis of shared DEGs and Heatmaps
GO and KEGG enrichment analysis was performed on 224 shared DEGs by using the clusterPro ler, richplot and ggplot2 packages in R software. Only enrichment terms with p < 0.05 & FDRq-value < 0.05 were statistically signi cant. Heatmaps of DEGs were produced using R software and the package pheatmap.

Correlation analysis of clinicopathological parameters
Clinical data of BRCA cases were downloaded from the TCGA database and correlations between clinicopathological parameters and TME scores or CLEC10A expression levels were analyzed using R software with the Wilcoxon test or Kruskal-Wallis test. The Bonferroni method was used to correct for signi cance levels.

PPI network construction
The PPI protein interaction network was constructed from the STRING database, followed by the use of nodes with protein interaction relationship con dence greater than 0.70 for reconstructing the network with Cytoscape version 3.6.1.

COX regression analysis
Univariate COX regression analysis was performed using R software and showed the top 16 signi cant genes.

Gene set enrichment analysis
The C2 KEGG and C7 gene sets v7.2 sets were downloaded from the Molecular Signatures database as target sets for GSEA,which were performed using the gsea_4.1.0 software downloaded from the Broad Institute.

About TICs
The proportion of TICs in all tumor samples was estimated using the CIBERSORT calculation method, and only 854 tumor samples with p < 0.01 were selected for the following analysis.

Analysis process of this study
The transcriptome RNA sequence data of 1153 cases of BRCA downloaded from the TCGA database was calculated by CIBERSORT and ESTIMATE algorithms to estimate the proportion of TICs in the sample and the amount of immune and stromal components. Protein-protein interaction (PPI) networks and univariate COX regression analyses were constructed by using the DEG shared by ImmuneScore and StromalScore. Then, Intersection analysis was performed by using the core nodes in the PPI network and the signi cant genes obtained from the univariate COX regression analysis, resulting in 20 genes containing CLEC10A and others. We focused on the subsequent analysis of CLEC10A, including survival analysis and clinicopathological characteristics correlation analysis, gene set enrichment analysis (GSEA), and correlation with TICs.

Immunoscore correlates with prognosis of breast cancer patients
To determine the correlation between the proportion of immune and stromal components and survival, Kaplan-Meier survival analysis was performed on ImmuneScore, StromalScore and ESTIMATEScore, respectively. Higher scores on the ImmuneScore or StromalScore indicated a greater amount of the immune or stromal component in TME. ESTIMATEScore was the sum of ImmuneScore and StromalScore, which represents the combined proportion of these two components in TME. The results showed that although StromalScore and ESTIMATEScore were not signi cantly correlated with overall survival, the proportion of immune components was positively correlated with overall survival (Fig. 1).
These results suggested that the immune component of TME may be a valuable prognostic indicator for BRCA patients.

Tumor microenvironment score correlates with clinicopathological characteristics of breast cancer patients
We downloaded the corresponding clinical information of breast cancer cases from the TCGA database and analyzed the correlation between the proportion of immune and stromal components in TME and clinicopathological characteristics. As shown in Fig. 2 and Supplementary Table 2, ImmuneScore was negatively correlated with age (p = 0.045); StromalScore was negatively correlated with age (p = 0.033), T classi cation (p = 0.002), and TMN stage (p = 0.015), and ESTIMATEScore decreased signi cantly with the depth of tumor in ltration (p = 0.022). These results suggested that the proportion of stromal components is associated with BRCA progression, including invasion and metastasis, while age may be an in uential factor for the immune component to affect tumor prognosis.
2.4 Enrichment analysis of differentially expressed genes shared by immune score and stromal score DEGs were obtained by comparing high score samples to low score samples in the immune and stromal components, which were grouped into by comparison with the median. The results showed 818 DEGs were obtained from ImmuneScore, of which 728 genes were up-regulated, while 90 genes were downregulated (Fig. 3A, C, D). Similarly, 782 DEGs were obtained from StromalScore, including 658 up-regulated and 124 down-regulated genes (Fig. 3B-D). ImmuneScore and StromalScore were crossed by Venn diagram to obtain 203 up-regulated genes, and 21 down-regulated genes. These DEGs (224 genes in total) shared by both in ImmuneScore and StromalScore may be in uential factors for TME status. The results of gene ontology (GO) enrichment analysis showed that DEGs are mainly involved in immunerelated biological functions, such as T cell activation (Fig. 3E). The Kyoto Encyclopedia Genes and Genomes (KEGG) enrichment analysis revealed that the molecular mechanisms involved in DEGs include chemokine signaling pathways, cytokine-cytokine receptor interactions, and hematopoietic cell lineage (Fig. 3F). Therefore, the function of DEG seems to be related to immune-related activities, which means that the participation of immune factors may have an important impact on the formation of TME in BRCA.

Intersection of PPI network and univariate COX regression analysis (key genes)
We constructed a PPI network based on the STRING database using Cytoscape software to further explore the underlying mechanisms. Figure 4A shows the interactions between the 224 differential genes, and the bar graph demonstrates part of the genes ranked by the number of interacting nodes (Fig. 4B).
We performed a univariate COX regression analysis of survival in BRCA patients and identi ed the top 30 prognostically signi cant genes among the 224 DEGs (Fig. 4C). Then, the intersection analysis between key genes in the PPI network and univariate COX regression analysis of signi cant genes was performed, and the analysis identi ed 20 genes including CLEC10A (Fig. 4D).

The Correlation of CLEC10A Expression with Prognosis and Clinicopathological Characteristics of Breast Cancer PatientsCLEC10A
CLEC10A plays a key role in tumor-in ltrating lymphocyte-mediated antitumor immune responses. CLEC10A is an antigen-presenting cell (APC)-speci c antigen recognition receptor, which mediates immune-related tissue remodeling [10]. In this study, all BRCA samples were grouped into CLEC10A high expression and CLEC10A low expression groups by comparison with the median CLEC10A expression values. Survival analysis showed that BRCA patients in the CLEC10A high expression group survived longer than those with low CLEC10A expression (Fig. 5C). Subsequently, Wilcoxon rank sum test showed that the expression of CLEC10A was signi cantly lower in tumor samples than in normal samples (Fig. 5A). Similar results were observed in paired analysis between normal and tumor tissues from the same patient (Fig. 5B). The above results clearly indicated that CLEC10A expression is positively correlated with the prognosis of BRCA patients.
To further investigate the signi cance of CLEC10A in the development of breast cancer, we performed a correlation analysis with clinicopathological characteristics (Supplementary Table 1), which showed that CLEC10A expression was negatively correlated with age, TNM stage, and T classi cation of BRCA patients (Fig. 5D-H), especially CLEC10A expression decreased with the progression of TNM stage.

CLEC10A is a potential immunomodulatory factor in breast cancer
Given that the expression level of CLEC10A was negatively correlated with survival and TNM stage in BRCA patients, we applied the high and low expression groups compared with the median value of CLEC10A expression to GSEA enrichment analysis to study the gene function of CLEC10A, respectively. As shown in Fig. 6A, genes in the high CLEC10A expression group were mainly enriched in immunerelated activities such as antigen processing and presentation, B-cell signaling pathway, Fc gamma Rmediated phagocytosis, and hematopoietic cell lineage. As for the genes in the low CLEC10A expression group were enriched in biologic recognition pathways such as glycosylphosphatidylinositol (Fig. 6B). For the set of C7 immunologic signatures genes de ned by MSigDB, multiple immune function gene sets were enriched in the high CLEC10A expression group (Fig. 6C and Supplementary Table 4). In the low CLEC10A expression group, there was almost no gene set enrichment. These results suggest that CLEC10A may be is a potential immune regulator of TME in breast cancer.

Correlation of CLEC10A with the proportion of TICs
We analyzed the proportion of tumor-in ltrating immune cell subpopulations by using the CIBERSORT algorithm to further ascertain the correlation between CLEC10A expression and the immune microenvironment, and constructed 22 kinds of immune cell pro les in BRCA samples (Fig. 7). The results of the difference and correlation analysis showed that a total of 12 TICs were correlated with CLEC10A expression (Fig. 8). Among them, eight kinds of TICs were positively correlated with CLEC10A expression, including memory B cells, naive B cells, resting dendritic cells, M1 macrophages, activated memory CD4 + T cells, resting memory CD4 + T cells, CD8 + T cells, and γδ T cells; four kinds of TICs were negatively correlated with CLEC10A expression, including M0 macrophages, M2 macrophages, neutrophils, and resting NK cells. These results further supported that the level of CLEC10A in uences the immune activity of TME and suggested that CLEC10A is an important immune regulator of TME in breast cancer.

Discussion
In the present study, we identify TME-related genes from the TCGA database that contribute to the evaluation of survival in BRCA patients. CLEC10A is considered to be involved in the immune activity of the breast cancer tumor microenvironment. Importantly, a series of analyses suggested that CLEC10A may be a potential indicator of TME immune activity evaluation in BRCA patients. TME plays a crucial role in the initiation and progression of tumorigenesis. Exploring potential therapeutic targets can help improve TME remodeling and the prognosis of BRCA patients. A large number of studies have revealed the importance of the immune microenvironment in tumorigenesis. The results of our transcriptomic analysis of BRCA data from the TCGA database suggested that the immune component of TME contributes to patient prognosis, although the stromal component may dominate BRCA progression (e.g., invasion). These results highlighted that the exchange between cancer cells, in ltrating immune cells and stromal cells affects the ecology of the breast cancer microenvironment, in uencing disease progression and clinical outcome [11] . Immune in ltration plays an important role in breast cancer patients, especially in triple-negative breast cancer (TNBC) or her2-positive breast cancer patients [12] . New studies have con rmed that an increase in the percentage of tumor in ltrating lymphocytes (TIL) is associated with survival in patients with smaller volumes of primary breast cancer, with survival rates approaching 100% in patients with a TIL score of 30%, even in the absence of chemotherapy [13] . The e cacy of pembrolizumab monotherapy in patients with metastatic TNBC con rms that enrichment of CD8 + T cells predicts response to antitumor therapy [14] . Based on the transcriptomic analysis of BRCA in the TCGA database, we further found that the decreased expression of CLEC10A is closely related to the clinicopathological characteristics and poor prognosis of breast cancer. Therefore, it suggests that CLEC10A may be a potential prognostic marker and therapeutic target of TME in BRCA.
Protein glycosylation is known to have important effects in cell biological processes (such as proliferation, adhesion, differentiation, cell-cell interaction and immune response), and abnormal glycosylation is considered to be a hallmark of cancer [15] . Certain glycans with increased expression in tumor tissues compared to normal tissues are known as tumor-associated carbohydrate antigens (TACA) [16] . During tumor development, immune cells can detect the presence of TACA, which can enhance or weaken the immune response depending on the nature of the interaction [17] . The recognition of TACA is mediated by glycan binding protein, also known as lectin, whose binding speci city depends on their carbohydrate recognition domain (CRD). Macrophage galactose-type lectin (MGL; also known as CLEC10A or CD301), a member of the C-type lectin receptor (CLR) family [18] , recognizes glycan structures in a Ca2+-dependent manner through the carbohydrate recognition domain (CRD) and plays an important role in both innate and adaptive immune responses [6] . CLEC10A is mainly expressed by antigenpresenting cells (APCs), such as tumor-associated dendritic cells (DCs) and macrophages [19] , and which can speci cally recognize terminal GalNAc residues (α-or β-linked) found in the (S)Tn antigens and LacdiNAc epitopes. [20] . It was found that the tumor-associated mucin MUC1 (MUC1-Tn) containing Tn antigen is frequently overexpressed in malignantly transformed epithelial adenocarcinomas (such as breast, pancreatic, colon), and binding to lectin receptors on mature dendritic cells activates T-cell activity to initiate anti-tumor immune responses, however, binding to immature dendritic cells may lead to immunosuppressive effects [21] . Our results suggest that CLEC10A expression is downregulated and associated with poor disease prognosis in the advanced stage of BRCA patients. The same results have been reported in hepatocellular carcinoma and colon cancer [22] [23] . Interestingly, in the MGL1 (human CLEC10A homolog) de cient mouse strain, mouse mite induced dermatitis is exacerbated [24] and tissue remodeling associated with chronic immune responses is almost completely absent [10] , indicating that CLEC10A mediated immune in ltration may be involved in regulating stromal remodeling. Therefore, we further analyzed the relationship between CLEC10A expression and TME, and the GSEA results showed that the CLEC10A high expression group was signi cantly enriched in immune-related signaling pathways, such as antigen processing and presentation, B-cell signaling pathway, complement signaling pathway, and hematopoietic cell lineage. In the CLEC10A low expression group was enriched in glycosylphosphatidylinositol. All these results implied that CLEC10A may be involved in the regulation of immune in ltration in TME, especially the antitumor immune response. Importantly, the receptor ligand svL4 of CLEC10A (a peptide ligand mimic of MGL2) as a monotherapy for ovarian cancer has been proven to cause multiple-fold expansion of mature immune cell populations in the abdominal cavity and effectively inhibit the development of ascites in a mouse ovarian cancer model [25] . This may be an effective anti-cancer therapy.
In this article, CIBERSORT analysis of the proportion of TICs showed that activated memory CD4 + T cells, resting memory CD4 + T cells, and CD8 + T cells were positively correlated with CLEC10A expression in BRCA patients. In general, CD8 + cytotoxic T cells have been recognized as a key component of effective antitumor immunity, and breast cancer patients with higher CD8 + T cell in ltration have better survival [26] . However, lacking the help of su cient CD4 + T cells, CD8 + T cells often fail to fully exploit their antitumor immunity in vivo [27] . As the main role of the immune system, CD4 + T cells play a vital role in many aspects of their recruitment, activation and regulation of adaptive immune responses. Their auxiliary functions in B cell and CD8 + cytotoxic T cell-mediated immune responses has been fully proven.
However, tumor regulatory T cells (Treg) eliminate the early anti-tumor immune response induced by tumor-speci c CD4 + T cells and CD8 + T cells and suppress multiple immune cells, such as CD4 + and CD8 + T cells, B cells, NK cells, NKT cells, dendritic cells (DCs) and macrophages [28] . Therefore, we predict that the downregulation of CLEC10A expression in advanced breast cancer tumors may be caused by the reduction of APC induced by regulatory T cells, which often leads to tumor cell immune tolerance. related studies are in progress.
In conclusion, our research indicates that CLEC10A is a valuable prognostic biomarker and immunotherapeutic target for breast cancer, and reveals that it may play an important role in the initiation of tumor immune response. With the development of sequencing technology, we believe that our model has great potential in clinical practice, thus further validating our results with clinical samples.