Diagnostic and Prognostic Value of SPARC in Gastric Carcinoma: Database Mining of TCGA

Gastric carcinoma (GC) continues to show high incidence and mortality in both developed and developing countries. SPARC is an extracellular non-structural matrix glycoprotein. Previous studies have closely associated it with bone disease; however, the role of SPARC in GC remains largely unclear. In this study, we explored the diagnostic value, prognostic value and pathway enrichment of SPARC in GC. Using data from The Cancer Genome Atlas (TCGA), we applied receiver operating characteristic (ROC) curve analysis to estimate the diagnostic value of SPARC expression, univariate and multivariate analyses to assess prognosis, and gene set enrichment analysis (GSEA) to identify enriched signaling pathways. SPARC expression was significantly higher in GC tissue samples, and GC patients with high SPARC expression had a worse prognosis. GSEA showed enrichment of gene sets related to signaling pathways including the transforming growth factor (TGF) beta signaling pathway, pathways in cancer, the Wnt signaling pathway and the mitogen-activated protein kinase (MAPK) signaling pathway. In brief, these results suggest that SPARC can serve as a potential diagnostic and prognostic biomarker for GC.


Introduction
Gastric cancer (GC) is one of the five most common deadly cancers in the world [1]. In developed countries, gastric cancer has become one of the cancers with the highest mortality among adults [2]. GC is also one of the five most commonly diagnosed cancers among Chinese people, with incidence and mortality rates that correlate with age [3]. In 2020, 27,600 new cases of gastric cancer and 11,010 deaths were estimated, with a 5-year survival rate of only 32% (2010-2016) [4]. Early surgical treatment can achieve a 5-year survival rate of 85-95% in stage I GC patients, so surgery is still the preferred treatment for gastric cancer [5]. However, the cost-effectiveness of colonoscopy and serology-based preventive screening and the inconspicuous early symptoms limit the early diagnosis of GC [6,7], which worsens the prognosis of GC. Thus, biomarkers for the early and accurate diagnosis of gastric cancer would help improve the prognosis of these patients.
Secreted protein acidic and cysteine rich (SPARC), also known as osteonectin (ON) or BM-40, is an extracellular non-structural matrix glycoprotein that was first isolated and purified from human and foetal bovine bone [8]. Regarding its function, studies have shown relationships between SPARC expression and tumorigenesis. One study indicated that SPARC promotes pancreatic cancer proliferation [9]. A prognostic report indicated that SPARC mRNA expression was a negative predictor of pathological complete response (pCR) following neoadjuvant nab-paclitaxel (nab-PTX) therapy regardless of breast cancer subtype [10]. Emerging evidence suggests that SPARC expression may be a potential therapeutic target or a potential clinical marker for survival in GC [11][12][13]. However, whether SPARC also has diagnostic and prognostic value in GC remains unclear.
Thus, the present study aimed to evaluate the diagnostic and prognostic value of SPARC expression in human GC based on data obtained from TCGA. GSEA was performed to further understand the biological pathways involved in the SPARC regulatory network related to GC pathogenesis.

RNA-sequencing data collection
The gene expression data (407 samples, including 32 normal samples; workflow type: HTSeq-FPKM) and corresponding clinical information were downloaded from the TCGA Genomic Data Commons (GDC) data portal (https://portal.gdc.cancer.gov/repository). RNA-Seq gene expression data and clinical data for 375 patients were retained and further analyzed (Table 1).
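As a minimal sketch of how tumor and normal samples could be separated before analysis, the Python snippet below classifies TCGA barcodes by their sample type code (01-09 for tumor, 10-19 for normal) and finds patients with both sample types. The barcodes are hypothetical placeholders and the function names are illustrative; the original analysis was carried out in R, and its exact filtering steps are not described in this section.

```python
from collections import Counter


def sample_class(barcode: str) -> str:
    """Classify a TCGA barcode as 'tumor', 'normal', or 'other'.

    The fourth dash-separated field starts with a two-digit sample type
    code: 01-09 denote tumor samples and 10-19 denote normal samples.
    """
    fields = barcode.split("-")
    if len(fields) < 4 or not fields[3][:2].isdigit():
        return "other"
    code = int(fields[3][:2])
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    return "other"


def paired_patients(barcodes):
    """Return patient IDs that have both a tumor and a normal sample."""
    tumor, normal = set(), set()
    for barcode in barcodes:
        patient_id = "-".join(barcode.split("-")[:3])
        cls = sample_class(barcode)
        if cls == "tumor":
            tumor.add(patient_id)
        elif cls == "normal":
            normal.add(patient_id)
    return tumor & normal


# Hypothetical barcodes for illustration only (not taken from the study).
barcodes = [
    "TCGA-AA-0001-01A-11R-0001-13",
    "TCGA-AA-0001-11A-11R-0001-13",
    "TCGA-BB-0002-01A-11R-0002-13",
]
counts = Counter(sample_class(b) for b in barcodes)
print(counts["tumor"], counts["normal"], sorted(paired_patients(barcodes)))
```

Applied to the full download, counts of this kind would be expected to correspond to the tumor and normal sample totals reported above.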

Gene set enrichment analysis (GSEA)
GSEA is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states; it is intended to detect modest but coordinated changes in the expression of functionally related genes [14,15]. In our study, datasets and phenotype label files were generated and uploaded into the GSEA software. GSEA was carried out to explore the significant survival difference observed between the high- and low-SPARC groups of GC patients from TCGA. Gene set permutations were performed 1000 times for each analysis. Pathways enriched in each phenotype were selected using a nominal p-value (NOM p-val) < 0.05 and a false discovery rate q-value (FDR q-val) < 0.5.
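To illustrate the idea behind the enrichment score and gene set permutation, the following Python sketch computes a weighted running-sum statistic over a ranked gene list and a simplified one-sided permutation p-value. It is a toy reimplementation under stated assumptions, not the GSEA software used in the study: the gene names, scores, weight of 1 and random seed are all hypothetical, and the normalization and FDR steps of GSEA are omitted.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted running-sum enrichment score (ES) for one gene set.

    ranked_genes must already be sorted by descending score; scores[i]
    is the ranking metric of ranked_genes[i].
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = hits.count(False)
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if n_miss == 0 or hit_total == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")
    running, es = 0.0, 0.0
    for s, h in zip(scores, hits):
        # Step up at a set member (weighted by its score), step down otherwise.
        running += abs(s) ** weight / hit_total if h else -1.0 / n_miss
        if abs(running) > abs(es):
            es = running  # keep the maximum deviation from zero
    return es


def permutation_p_value(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Simplified one-sided nominal p-value from random same-size gene sets."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    size = sum(g in gene_set for g in ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_genes, size))
        null_es = enrichment_score(ranked_genes, scores, null_set)
        if observed >= 0 and null_es >= observed:
            extreme += 1
        elif observed < 0 and null_es <= observed:
            extreme += 1
    return extreme / n_perm


# Hypothetical ranked list: ten genes ordered by a made-up ranking metric.
genes = [f"g{i}" for i in range(1, 11)]
metric = [5, 4, 3, 2, 1, -1, -2, -3, -4, -5]
top_set = {"g1", "g2", "g3"}
es_top = enrichment_score(genes, metric, top_set)
p_top = permutation_p_value(genes, metric, top_set)
print(round(es_top, 3), p_top)
```

A gene set concentrated at the top of the ranking yields an ES near +1, and one concentrated at the bottom yields an ES near -1; the permutation step asks how often a random gene set of the same size does at least as well.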

Statistical analysis
Relationships between clinicopathologic features and SPARC expression were analyzed with the Wilcoxon signed-rank test and logistic regression. Clinicopathologic characteristics associated with overall survival (OS) in TCGA patients were evaluated using Cox regression and the Kaplan-Meier method. Receiver operating characteristic (ROC) curve analysis was performed with the survivalROC package, using the Wilson method and percentage results. Univariate logistic regression was used to assess whether SPARC expression was associated with clinicopathologic characteristics. Univariate and multivariate Cox analyses were used to compare the influence of SPARC expression on survival along with other clinical characteristics (age, stage, grade, distant metastasis status, lymph node status, etc.). The median SPARC expression value was used as the cutoff to divide patients into two groups. All statistical analyses were conducted in R (v.3.6.3).
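The median split and the Kaplan-Meier estimate can be sketched in a few lines of Python. This is an illustration only: the study used R, the expression values and follow-up times below are invented, and assigning values equal to the median to the low group is an assumption, since the handling of ties is not stated.

```python
from statistics import median


def split_by_median(values):
    """Label each value 'high' if above the median, otherwise 'low'.

    Assumption: values equal to the median fall in the 'low' group.
    """
    cutoff = median(values)
    return ["high" if v > cutoff else "low" for v in values]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event (death) was observed at times[i] and
    0 if the observation was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical data for illustration only.
expression = [1.0, 2.0, 3.0, 4.0]
groups = split_by_median(expression)
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
print(groups, curve)
```

In practice one curve would be estimated per expression group and the two curves compared, for example with a log-rank test.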

SPARC expression and its diagnostic value
A total of 407 samples with SPARC expression data from TCGA were analyzed. As shown in Fig. 1A, SPARC expression was significantly higher in tumor samples than in normal samples (p = 2.017e-12). SPARC expression also differed significantly between tumor tissues and adjacent tissues in 27 paired samples (p = 1.197e-05, Fig. 1B). To assess the diagnostic efficacy of SPARC, a receiver operating characteristic (ROC) curve was constructed using the expression data from 375 tumor samples and 32 normal samples. The area under the ROC curve was 0.874 [95% confidence interval (CI), 0.8216-0.9021; Fig. 1D].
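The area under the ROC curve has a simple interpretation: it is the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample, with ties counted as one half. The Python sketch below computes this rank-based AUC on invented values; it does not reproduce the survivalROC analysis or the confidence interval reported above.

```python
def roc_auc(positive_scores, negative_scores):
    """Rank-based AUC: P(positive > negative) + 0.5 * P(positive == negative)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical expression values for illustration only.
tumor_expr = [3.0, 4.0, 5.0]
normal_expr = [1.0, 2.0, 3.0]
auc = roc_auc(tumor_expr, normal_expr)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to no discrimination between the groups and 1.0 to perfect separation.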

Associations between SPARC expression and clinicopathologic parameters
Clinicopathologic data of 375 GC patients from TCGA were analyzed, including gender, grade (G) classification, metastasis (M) stage, tumor (T) size, lymph node (N) metastasis, stage classification and age at diagnosis (age). As shown in Fig. 2(A-G), increased expression of SPARC was notably associated with T size (p = 8.184e-04, Fig. 2D) and G classification (p = 0.023, Fig. 2B).

Survival outcomes and multivariate analysis
According to the Kaplan-Meier survival analysis, among the 375 patients, those with high SPARC expression had a worse prognosis (Fig. 1C, P = 0.009). The univariate Cox analysis revealed that high SPARC expression correlated significantly with poor overall survival (OS) [hazard ratio (HR) = 1.300, 95% CI = 1.090-1.543, p = 0.003]. Other clinicopathologic variables associated with poor survival included age, advanced stage and TNM classification (Table 3).
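The reported hazard ratio, confidence interval and p-value can be checked against one another. Assuming a Wald-type interval, exp(beta ± 1.96 × SE), the standard error can be recovered from the interval width on the log scale and a two-sided normal p-value derived from it. The Python sketch below performs this check; the function name is illustrative and the Wald assumption is ours, as the interval type is not stated.

```python
import math


def wald_check(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, SE, z and a two-sided p-value from an HR and its 95% CI.

    Assumes the interval is exp(beta +/- z_crit * SE) (Wald interval).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Values taken from the univariate Cox result reported above.
beta, se, z, p = wald_check(hr=1.300, ci_low=1.090, ci_high=1.543)
print(round(beta, 3), round(se, 3), round(z, 2), round(p, 4))
```

Under these assumptions the derived p-value is close to 0.003, in line with the reported figure.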

GSEA identifies SPARC-related signaling pathways
To identify signaling pathways that are differentially activated in GC, we used GSEA to compare the high and low SPARC expression groups, divided by the median expression level. GSEA revealed significant differences (NOM p-val = 0.05 and FDR q-val = 0.05) in the enrichment of the MSigDB collection (c2.cp.kegg.v7.1.symbols.gmt). Table 4 shows the 20 items of the GSEA analysis. As shown in Fig. 4, the enriched gene sets were related to the transforming growth factor (TGF) beta signaling pathway, pathways in cancer, the Wnt signaling pathway, the mitogen-activated protein kinase (MAPK) signaling pathway, focal adhesion, cell adhesion molecules (CAMs), melanogenesis and small cell lung cancer, all of which are tumor-associated.

Discussion
GC has long been one of the world's major cancers and remains one of the major causes of cancer morbidity and mortality [16]. Evidence has shown that SPARC has a crucial function in tumorigenesis, but this study is the first to perform a bioinformatic analysis of SPARC in GC using TCGA data.
According to our study, SPARC expression was significantly higher in GC tissue samples than in control samples or paired adjacent samples, which suggests that up-regulation of SPARC expression may be related to the development of GC.
Moreover, the clinical diagnostic and prognostic value of SPARC expression was examined in our study of GC patients. First, we found that SPARC expression was significantly associated with clinical grade and T classification. Second, Kaplan-Meier curves for OS revealed that high expression of SPARC was associated with poor outcomes in GC patients. The area under the ROC curve indicated that elevated SPARC expression has diagnostic value. Further, univariate logistic analysis indicated that SPARC expression was related to T classification. Univariate and multivariate Cox analyses showed that SPARC expression may be a potential independent marker of poor prognosis in GC patients. The multivariate Cox analysis also revealed that age was an independent risk factor for OS in GC patients. In general, these findings suggest that high expression of SPARC could serve as an indicator for diagnosis and of poor prognosis in GC patients, and that SPARC might be a pivotal target gene involved in GC cell growth and metastasis.
In this study, we observed that the SPARC high-expression phenotype was associated with the TGF beta signaling pathway, pathways in cancer, the Wnt signaling pathway, the mitogen-activated protein kinase (MAPK) signaling pathway, focal adhesion, cell adhesion molecules (CAMs), melanogenesis and small cell lung cancer. The TGF beta signaling pathway is instrumental in mammalian development and has a pivotal role in many mechanisms of breast cancer [17], lung cancer [18] and other cancers [19][20][21]. The Wnt signaling pathway is required for adult tissue maintenance, and perturbations in Wnt signaling promote human cancer [22,23]. The MAPK signaling pathway is activated during the differentiation of myogenic cell lines [24]. Which is
