A Pan-Cancer Study: The Immunological and Prognostic Significance of Aberrant SPP1 Expression on Tumors

Background: Secreted phosphoprotein 1 (SPP1) is a glyco-phosphoprotein that is widely expressed in a variety of cancer cells. Current studies have identified that SPP1 is differentially expressed in a variety of cancer cell types. However, there are few studies on the level of SPP1 expression in different types of cancer and its clinical significance. Methods: In this study, we analyzed SPP1 levels and their significance in 33 different cancer types by using The Cancer Genome Atlas (TCGA) database. We also analyzed the correlation between SPP1 expression and tumor immunity. Results: SPP1 was aberrantly expressed at the transcript level in most tumors. Univariate Cox analysis showed that SPP1 was strongly associated with overall survival in multiple tumor types. We also found that SPP1 was significantly correlated with the tumor immune microenvironment, tumor immune cells, and tumor-infiltrating lymphocyte markers. The correlation of SPP1 with tumor mutational burden (TMB) and microsatellite instability (MSI) also suggests a role in assessing the efficacy of immunotherapy. Gene set enrichment analysis of 33 cancer types provided further evidence for the relationship between SPP1 levels and cancer progression and immune cell infiltration.

perspective in a comprehensive manner. Moreover, few studies have addressed the immunological aspects of SPP1 across multiple tumors.
Pan-cancer analysis, which investigates the similarities and differences among cancer types, is an ideal approach. It reveals not only mechanisms shared by different cancer types, but also the significance of specific genes and signaling pathways in cancer (7). The Cancer Genome Atlas (TCGA) provides a comprehensive record of genetic data from tumor patient samples, including their DNA sequences, transcriptional information, epigenetic modifications and related information (8). TCGA currently brings together more than 10,000 samples from 33 cancers with genomic, epigenomic and proteomic data, among others (9). Our study explores the relationship between SPP1 and different types of tumors by pan-cancer analysis. Moreover, we explore the importance of SPP1 at the level of tumor immunity from the perspective of MSI, TMB, tumor microenvironment (TME), immune cell infiltration and gene co-expression. Finally, gene set enrichment analysis (GSEA) was used to explore the potential pathways of SPP1 action in different tumor types.

Differential expression analysis of SPP1
The results of SPP1 differential expression analysis of 33 cancers in the TCGA database are shown in

Relationship between SPP1 and clinical staging of tumours
We analyzed the relationship between SPP1 and clinical stage in different tumor types, using P < .050 as the threshold for statistical significance (Fig. 7A-M). The results showed that SPP1 was differentially expressed across tumor stages. In most of the tumors, the expression level of SPP1 at stage III differed significantly from that at stage I or stage II (P < .050). There were also significant differences in SPP1 expression between stage I and stage II in most tumors (P < .050).

TME and SPP1 correlation analysis
The ESTIMATE package in R software was used to calculate the immune score and stromal score. A higher ImmuneScore or StromalScore indicates more immune or stromal components in the TME. We then analyzed the correlation between SPP1 expression and the immune score and stromal score in each tumor type. Fig. 8A-N and Fig. 9A-P show the tumors in which SPP1 correlated significantly with the immune score and the stromal score, respectively (P < .001).
Correlations between SPP1 and 22 immune cell types were analyzed across the 33 TCGA cancer types with the CIBERSORT package. SPP1 expression was associated with B lymphocytes in 17 tumors, with T lymphocytes in 28, with macrophages in 29, with neutrophils in 15, and with eosinophils in 12. BRCA was selected as an example for visualization of the output (Fig. 10A-I).

Discussion
Effective cancer biomarkers will contribute to the development of tumor precision medicine. The role of SPP1 in tumorigenesis and development has been studied extensively in a variety of malignant tumors, and considerable progress has been made. However, because the cancer types studied so far are limited and the study populations are insufficient, whether SPP1 can serve as a suitable cancer predictor remains an open question. As immunotherapy becomes more common in cancer treatment, identifying anomalous changes in SPP1 during the pathological process of cancer will not only help to understand the mechanism of cancer lesions, but also be of great help for individualized treatment. TCGA contains 33 types of cancer and covers multiple data types such as genome, transcriptome, epigenetic and proteome data, which can support cancer research and improve cancer prevention, diagnosis and treatment.
Therefore, based on the TCGA database, we used pan-cancer analysis methods to reveal the functional significance of SPP1 in cancer.
Identifying abnormal gene expression in tumors is very important for individualized treatment, as it can markedly improve clinical outcomes (10). Our differential analysis suggests that SPP1 expression is upregulated in most tumors, and that high expression is associated with poor prognosis and can be used as a prognostic risk factor. It is worth noting that SPP1 is down-regulated in some other tumors. Studies have found that SPP1 in colorectal cells can negatively regulate T cell activation by binding to CD44, thereby promoting cancer progression (11). However, some studies have also shown that the overexpression of SPP1 and the down-regulation of CD44 can inhibit the pathological process of colon cancer (12,13). This suggests that SPP1 has multiple effects on tumors and may function in multiple ways. The results of the GSEA also support this view.
Survival analysis is an important method for clinical evaluation of tumor diagnosis, treatment effect and prognosis. In our study, the K-M analysis of the relationship between SPP1 and survival gave different results in different cancers. In most tumors, including CESC, GBM, HNSC, LGG, LIHC, PAAD, SKCM, SARC, LUSC, LUAD, and COAD, up-regulated SPP1 was associated with an adverse prognosis: the higher the expression, the lower the survival. In UVM, however, high expression was associated with a better prognosis. This suggests that SPP1 has both tumor-promoting and anti-tumor effects, which may be attributed to different subtypes (SPP1a, SPP1b, SPP1c) in different tissue locations, different molecular functions, and the different signal transduction pathways that are activated (13-15). In addition, mining the potential pathways of SPP1 through GSEA is expected to provide a basis for future research.
Our study found that SPP1 is also related to the clinical stage of the tumor. Notably, in COAD, ESCA, and READ, SPP1 expression in stage I differs significantly from that in the other stages. Moreover, most tumors show different levels of SPP1 expression at different stages. These results indicate that SPP1 may become an early diagnostic biomarker for cancer, and could even serve as an indicator for predicting the stage of a specific tumor and assessing the degree of deterioration.
TME refers to the cellular environment in which tumors exist, including blood vessels, immune cells, fibroblasts, other cells, signaling molecules and extracellular matrix (16). Our study supports an immune correlation between SPP1 and tumors through CIBERSORT analysis of immune cell infiltration and correlation analysis. In the TME, eosinophils can influence local immunity in the disease process, and there is growing evidence that they can exert anticancer effects through multiple pathways. Cytokine signaling and epigenetic signaling in the microenvironment, among others, induce neutrophils to polarize into anti-tumor N1 tumor-associated neutrophils and pro-tumor N2 tumor-associated neutrophils. Tumor-associated macrophages (TAMs) and M2 type macrophages can promote the occurrence and development of tumors. Zhang Yan et al. found that SPP1 is highly expressed in TAMs in human lung adenocarcinoma, and that tumor cells can induce M2 type macrophages through SPP1 overexpression and thereby participate in tumor progression (4). This study further verified that SPP1 is closely related to neutrophils, eosinophils and M2 macrophages, and we speculate that SPP1 promotes the infiltration of these immune cells into tumors during the pathological process.
The mechanism by which cancer evades immunity is the target of immunotherapy. Gene co-expression analysis found that SPP1 may participate in tumor immunity through interactions with immune-related genes (CD86, HAVCR2, LAIR1, NRP1, LGALS9, etc.). In addition, it has been reported that SPP1 is related to resistance to immunotherapy in breast cancer and lung adenocarcinoma. Together, these findings suggest that SPP1 may be a predictive target of immune resistance, but more in-depth research is required to confirm this.
Gene mutations are one of the main causes of tumors (17). TMB refers to the number of somatic mutations in the tumor genome after germline mutations have been removed. A higher TMB tends to yield more neoantigens; the more neoantigens a tumor produces, the more easily it is recognized by the immune system, and the better the immunotherapy effect (18). MSI reflects a defect in DNA mismatch repair. The higher the MSI, the more somatic mutations, the more neoantigens produced, and the better the efficacy of immunotherapy (19). Both TMB and MSI are important predictors of immunotherapy response. In this study, we found that SPP1 correlated with TMB in ACC, THYM, STAD, SARC, PRAD, OV, LGG, KIRC, DLBC and COAD, and with MSI in SARC, LUSC, LUAD, GBM and COAD. SPP1 may therefore be useful as an indicator of the efficacy of immunotherapy in a variety of tumors.
In summary, our study shows that SPP1 is an effective prognostic biomarker in some cancer types. Its expression is immune-related, which may provide a basis for developing new targets for cancer diagnosis, treatment and prognostic assessment. However, our study still has some limitations. Because of the wide range of cancer types studied, the findings lack adequate support from actual clinical evidence and will need sufficient time to be validated.

Conclusions
In this study, we used a pan-cancer analysis to examine the differential expression, survival prognosis and immune correlation of SPP1 in multiple cancers. We further explored the correlation between differential SPP1 expression and tumor stage, TME and the effect of tumor immunotherapy, and used GSEA to explore the potential pathways of SPP1 action in tumor cells. These results suggest that SPP1 plays a very important role in the development and progression of a variety of tumors. We also found that SPP1 is a promising indicator for assessing the clinical staging, therapeutic effect and prognosis of various tumors. Therefore, an in-depth understanding of the mechanism of SPP1 in tumors may open new directions for the diagnosis and treatment of tumors in the future.

Correlation between SPP1 and survival
Transcriptome data from 33 cancer types were extracted from TCGA. SPP1 expression in cancer tissue samples and non-cancerous tissue samples was extracted using Perl software (version 5.8.3). The Wilcoxon test was used to analyze the difference in SPP1 expression between cancer and non-cancerous tissue samples. P < 0.05 was considered statistically significant.
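The tumor-versus-normal comparison described above can be illustrated with a minimal sketch of a two-sided Wilcoxon rank-sum test. This is not the authors' Perl/R pipeline: the function names are hypothetical, and the p-value uses a tie-corrected normal approximation without continuity correction, which is an assumption and may differ slightly from the output of R's `wilcox.test`.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(tumor, normal):
    """Two-sided Wilcoxon rank-sum test.

    Returns (U statistic of the tumor group, approximate p-value).
    Assumption: normal approximation with tie correction.
    """
    n1, n2 = len(tumor), len(normal)
    pooled = list(tumor) + list(normal)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, p_value


# Illustrative (made-up) expression values, not TCGA data.
u_stat, p = rank_sum_test([5.1, 6.3, 7.2, 8.0, 6.8, 7.5],
                          [2.1, 3.0, 2.7, 3.5, 2.9, 3.2])
significant = p < 0.05
```

For small samples an exact test would be preferable to the normal approximation used here.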
Clinical data such as survival time and survival status of each sample were extracted using R (version 3.6.1). The samples were divided into a high expression group (> median) and a low expression group (< median) according to the expression of SPP1, and we chose overall survival (OS), disease-free survival (DFS), disease-specific survival (DSS) and progression-free survival (PFS) as the outcome indicators for the next step of the analysis. The survival curves of the high and low SPP1 expression groups were constructed by Kaplan-Meier analysis using the survival package and compared with the log-rank test. Curves with statistical significance were screened at P < 0.05. A univariate Cox proportional hazards regression model was used to analyze the relationship between SPP1 and patient prognosis. HR > 1 suggested that the higher the SPP1 expression, the worse the prognosis of the tumor; HR < 1 suggested that the higher the SPP1 expression, the better the prognosis. P < 0.05 was considered statistically significant.
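The median split, Kaplan-Meier estimate and log-rank comparison described above can be sketched as follows. This is an illustrative reimplementation in plain Python, not the R survival package the study used. All function names are hypothetical, and the Cox regression step is not covered.

```python
import math
from statistics import median


def median_split(expression):
    """Label samples 'high' (> median) or 'low' (< median).

    Samples exactly equal to the median get None, mirroring the strict
    inequalities in the text.
    """
    cut = median(expression)
    return ['high' if x > cut else 'low' if x < cut else None
            for x in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event (e.g. death) was observed, 0 if censored.
    """
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 df, p-value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({t for t, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for ti in times_a if ti >= t)
        n_b = sum(1 for ti in times_b if ti >= t)
        d_a = sum(1 for ti, e in zip(times_a, events_a) if ti == t and e)
        d_b = sum(1 for ti, e in zip(times_b, events_b) if ti == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

In use, the labels from `median_split` would select the survival times and event indicators passed to `kaplan_meier` for each group and to `logrank_test` for the comparison.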

Correlation between SPP1 and tumor stage
The tumor TNM stage was obtained from TCGA clinical data, and the limma package was used in R software to explore the correlation between SPP1 and tumor stage. The Wilcoxon rank sum test was used to compare two groups, and the Kruskal-Wallis test was used to compare three or more groups. P < .05 was considered statistically significant.
Correlation between SPP1 and forecast index
TMB reflects the number of mutations carried by the tumor cell genome, and is defined as the total number of somatic gene coding errors, base substitutions, and gene insertion or deletion errors detected per million bases. We used Perl to calculate TMB values for the mutation data of each tumor sample, and the Spearman correlation test was then performed to analyze the correlation between SPP1 expression levels and TMB.
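As a rough illustration of the TMB definition above and the rank correlation that follows it, the sketch below computes mutations per megabase and a Spearman coefficient. The function names are hypothetical. The covered region size is an input the source does not specify, and the 38 Mb figure in the example is only an assumed exome size.

```python
import math


def tmb_per_megabase(somatic_mutations, covered_bases):
    """Somatic mutation count normalized per million covered bases."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return somatic_mutations / (covered_bases / 1_000_000)


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Assumption: about 38 Mb of covered exome; the mutation count is made up.
example_tmb = tmb_per_megabase(76, 38_000_000)
```

The same `spearman_rho` call would apply to any per-sample score paired with SPP1 expression. Computing a p-value for the coefficient is left out of this sketch.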
MSI is any change in microsatellite length in a tumor, compared with normal tissue, that is caused by insertion or deletion of repeat units and gives rise to new microsatellite alleles. MSI is considered a predictor of immunotherapy response in clinical practice: the higher the value, the better the effect of immunotherapy. The Spearman correlation test was performed to analyze the correlation between SPP1 expression levels and MSI in specific tumor types, with P < .05 indicating a significant correlation.

Correlation between SPP1 and tumor immunity
The ESTIMATE package of the R software was used to assess the tumor microenvironment, including immune cells and stromal cells, generating three measures (immune score, stromal score, and ESTIMATE score). The Spearman correlation test was then performed to analyze the correlation between SPP1 expression levels and the tumor microenvironment. The levels of 22 infiltrating immune cell types in the dataset were calculated using the CIBERSORT package, and the Spearman correlation test was performed to analyze the correlation between SPP1 and these cells. Results were screened for statistical significance at P < .001, and the significant results were used to visualize the output.
Gene set enrichment analysis
GSEA is an analysis method for genome-wide expression profiling microarray data in which genes are classified by comparison with previously validated gene sets. In this study, we first downloaded the KEGG function set (c2.cp.kegg.v7.1.symbols.gmt) from the GSEA online database (http://www.gsea-msigdb.org/gsea/index.jsp). We then performed GSEA in the R environment to identify potential pathways and finally visualized the output.
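To make the core GSEA statistic concrete, the following is a minimal sketch of the running-sum enrichment score for one gene set over a ranked gene list. It is not the study's R workflow. The function name, the gene labels and the ranking metric are hypothetical, and the permutation-based significance and normalization steps of a full GSEA are omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted running-sum enrichment score for one gene set.

    ranked_genes: gene labels ordered by descending ranking metric.
    scores: the ranking metric for each gene, in the same order.
    gene_set: set of gene labels (e.g. one pathway).
    Returns the running-sum value with the largest absolute deviation
    from zero; positive means enrichment near the top of the list.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for s, is_hit in zip(scores, hits):
        if is_hit:
            running += abs(s) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                  # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical gene labels and metric values, for illustration only.
genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
top_set_score = enrichment_score(genes, metric, {"G1", "G2"})
bottom_set_score = enrichment_score(genes, metric, {"G5", "G6"})
```

A gene set concentrated at the top of the ranking yields a score near +1 and one concentrated at the bottom yields a score near -1, which is the basis for reading a pathway as associated with high or low SPP1 expression.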

Statistical analysis
All data were processed using the R software (version 4.0.3). Data extraction was performed using Perl (version 5.8.3). Network plots were generated using Cytoscape (version 3.7.2). P-values (two-sided test) < .05 were considered significant.

Correlation of SPP1 with stromal scores in ESTIMATE. (A-P) ESTIMATE prediction of the relationship between SPP1 expression levels and stromal scores. The horizontal axis represents the sample stromal score; the vertical axis represents the SPP1 expression level.