Prognostic value of biglycan in gastric cancer

Background: Biglycan (BGN) is a member of the small leucine-rich repeat proteoglycan family. High expression of BGN may enhance invasion and metastasis in some tumor types. Here, the prognostic significance of BGN was evaluated in gastric cancer. Material and Methods: Two independent Gene Expression Omnibus (GEO) gastric cancer microarray datasets (n=64 and n=432) were collected for this study. Kaplan-Meier analysis was applied to evaluate whether BGN affects the outcomes of gastric cancer. Gene set enrichment analysis (GSEA) was used to explore the association between BGN and cancer-related gene signatures. Putative transcription factors of BGN were predicted by bioinformatic analysis. Results: In gastric cancer, the mRNA expression level of BGN in tumor tissues was significantly higher than that in normal tissues. Kaplan-Meier analysis showed that higher expression of BGN mRNA was significantly associated with shorter recurrence-free survival (RFS). GSEA results suggested that high BGN expression was significantly enriched for metastasis and poor-prognosis gene signatures, indicating that BGN might be associated with cell proliferation, poor differentiation, and high invasiveness of gastric cancer. Meanwhile, putative transcription factors, including AR, E2F1, and TCF4, were predicted by bioinformatic analysis and were significantly correlated with BGN mRNA expression. Conclusion: High expression of BGN mRNA was significantly related to poor prognosis, suggesting that BGN is a potential prognostic biomarker and therapeutic target in gastric cancer.


Introduction
Gastric cancer is the sixth most common malignant tumor and the second leading cause of cancer-induced death in the world (1). In East Asia (China, Japan, and Korea), the incidence of gastric cancer is higher than in other areas of the world (2). It was estimated that about one million new cases of gastric cancer were diagnosed globally in 2018, and about half of these occurred in China (3). Although numerous new treatments have been introduced, including but not limited to chemotherapy, targeted therapy, and immunotherapy, the 5-year overall survival rate of gastric cancer is only 20-30% because of cancer progression (4). For early gastric cancer, however, the 5-year overall survival rate is more than 90% (5).
Unfortunately, early-stage gastric cancer usually has no symptoms or only non-specific symptoms. Once apparent symptoms appear, the disease is usually advanced. Gastroscopy is a routine screening method for gastric cancer, but it is not widely accepted because it is invasive (6). Currently, several tumor markers are used in the clinic for early detection of gastric cancer. These markers include carcinoembryonic antigen (CEA), pepsinogen, α-fetoprotein (AFP), and the carbohydrate antigens (CA) CA72-4, CA125, and CA24-2. However, the sensitivity and specificity of these serum indicators are poor (7). Thus, novel biomarkers for early diagnosis and prognosis prediction in gastric cancer are urgently needed.
Biglycan (BGN) is a member of the small leucine-rich repeat proteoglycans (SLRPs), which are characterized by a core protein with leucine-rich repeats (8). Initially, BGN was considered only as a component maintaining the structural integrity of the extracellular matrix (ECM), involved in the regulation of the inflammatory response, skeletal muscle development, and regeneration (9,10). Over the past decade, BGN has been found to be a signaling molecule that plays an essential role in angiogenesis, cell proliferation, differentiation, and migration (11)(12)(13). In recent years, BGN has been found to be highly expressed in various malignant tumors, such as endometrial cancer (14), ovarian cancer (15), pancreatic adenocarcinoma (16), esophageal squamous cell carcinoma (17), colorectal cancer (18), and gastric cancer (19), suggesting an essential role of BGN in the pathogenesis and progression of cancer. In some of these cancers, high expression of BGN enhances the invasion and metastasis of tumor cells (18)(19)(20) or contributes to poor prognosis (16,17,21,22). Therefore, BGN is closely related to the occurrence and development of a variety of tumors and is a potential target molecule for tumor treatment.
However, there have so far been few studies on BGN in gastric cancer, especially on its prognostic value. Thus, the purpose of the present study was to verify BGN expression and the prognostic value of BGN in gastric cancer. In this study, we investigated the prognostic value of BGN in gastric cancer in an external transcriptome data set from the TCGA database.
To understand the role of BGN in gastric cancer, we also analyzed BGN expression by immunohistochemistry in our own tissue microarray, which included 125 cases of gastric cancer.

Materials And Methods
Worldwide microarray gene expression datasets
Two independent Gene Expression Omnibus (GEO) gastric cancer microarray datasets (total n=496) were collected for this study. The GSE26253 dataset (23) contained 432 gastric cancer patients from South Korea, all of whom had clinical and follow-up annotations. The GSE65801 dataset (24) contained 64 Chinese patients but had no follow-up annotations. Detailed information about the two downloaded datasets is listed in Table I. To normalize the mRNA expression levels in the GSE26253 dataset, we restratified BGN scores into four grades (Q1, Q2, Q3, and Q4) based on percentiles. Samples were also divided by the median value of gene expression into BGN-low (Q1+Q2) and BGN-high (Q3+Q4) groups.
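The quartile and median stratification described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the expression values are hypothetical, the quartile cut points use the "inclusive" interpolation of Python's standard library (the paper does not state which percentile method was used), and values equal to the median are assigned to BGN-high, as in the stratified survival analysis.

```python
import statistics

def stratify_by_quartile(values):
    """Label each expression value Q1..Q4 using the sample quartile cut points."""
    # quantiles(n=4) returns the 25th, 50th and 75th percentile cut points.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    labels = []
    for v in values:
        if v < q1:
            labels.append("Q1")
        elif v < q2:
            labels.append("Q2")
        elif v < q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels

def median_split(values):
    """BGN-high if the value is at or above the median, otherwise BGN-low."""
    med = statistics.median(values)
    return ["BGN-high" if v >= med else "BGN-low" for v in values]

# Hypothetical log2 expression values for eight samples (not taken from GSE26253).
bgn = [5.1, 6.3, 7.8, 8.2, 9.0, 9.7, 10.4, 11.9]
quartiles = stratify_by_quartile(bgn)
groups = median_split(bgn)
for value, q, g in zip(bgn, quartiles, groups):
    print(value, q, g)
```

With strict "less than" comparisons at each cut point, Q1+Q2 coincides with BGN-low and Q3+Q4 with BGN-high, so the two groupings stay consistent even when a sample sits exactly on the median.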
The recurrence-free survival (RFS) period was defined as the time from initial surgery until tumor recurrence. Kaplan-Meier survival plots were used to display the proportion of the population that remained recurrence-free by length of follow-up.
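For readers unfamiliar with the Kaplan-Meier estimator, a minimal sketch is given here. It assumes follow-up times in months and an event indicator of 1 for observed recurrence and 0 for censoring; the five patients are hypothetical, and the study itself used R and JMP rather than this code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with at least one event.

    times  : follow-up time for each patient (e.g. months since surgery)
    events : 1 if recurrence was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # recurrences observed exactly at time t
        n_leaving = 0  # everyone (events + censored) leaving the risk set at t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve

# Hypothetical cohort: recurrence at 6, 12 and 20 months, censoring at 12 and 30 months.
km_curve = kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0])
for t, s in km_curve:
    print(f"month {t}: RFS = {s:.2f}")
```

Censored patients contribute to the risk set until their last follow-up but never cause a drop in the curve, which is why the estimate stays usable when follow-up is incomplete.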
Gene set enrichment analysis (GSEA)
The GSEA software v3.0 was downloaded from www.broad.mit.edu/gsea and run on the JAVA 8.0 platform (25). Dataset (.gct) and phenotype label (.cls) files were created and loaded into the GSEA software, and gene sets were updated from the above website. The detailed protocol is described in our previous publication (26). Here, the number of permutations was 1,000, and the phenotype label was ILMN_2206746 (BGN).
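As an illustration of the two input files, the sketch below builds a small .gct expression table and a continuous-phenotype .cls file in which the BGN probe (ILMN_2206746) serves as the phenotype. The file layouts follow our understanding of the Broad Institute format descriptions and should be checked against the current GSEA documentation; the second probe identifier, the sample names, and all values are hypothetical.

```python
def make_gct(probe_ids, sample_ids, matrix):
    """Build GCT v1.2 text: version line, dimensions line, header row, one row per probe."""
    lines = ["#1.2", f"{len(probe_ids)}\t{len(sample_ids)}"]
    lines.append("\t".join(["NAME", "Description"] + list(sample_ids)))
    for probe, row in zip(probe_ids, matrix):
        # "na" is used as a placeholder description for every probe.
        lines.append("\t".join([probe, "na"] + [f"{v:g}" for v in row]))
    return "\n".join(lines) + "\n"

def make_continuous_cls(label, values):
    """Build a continuous-phenotype CLS text: '#numeric', '#<label>', then one value per sample."""
    return "#numeric\n" + f"#{label}\n" + " ".join(f"{v:g}" for v in values) + "\n"

# Hypothetical three-sample example; PROBE_X is a made-up second probe.
samples = ["S1", "S2", "S3"]
probes = ["ILMN_2206746", "PROBE_X"]
expression = [[8.1, 9.4, 10.2],
              [6.0, 5.5, 7.3]]

gct_text = make_gct(probes, samples, expression)
cls_text = make_continuous_cls("BGN", expression[0])
print(gct_text)
print(cls_text)
```

Writing the two strings to files with the .gct and .cls extensions would give inputs of the kind loaded into the GSEA desktop application.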

Data management and statistical methods
Student's t-test, one-way ANOVA, and non-parametric tests were used to test differences among subgroups for continuous data. Pearson chi-square and likelihood ratio tests were used for categorical data. Kaplan-Meier analysis was used to estimate the proportion of the population that remained recurrence-free by length of follow-up in months. Hazard ratios (HR) with 95% confidence intervals (CI) were calculated using Cox proportional hazards regression analysis. Two-sided P-values less than 0.05 were considered statistically significant. R and JMP statistical software were used for the above analyses unless otherwise noted.
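For reference, the standard form of the Cox proportional hazards model and the resulting hazard ratio can be written as below. The binary coding of the covariate (x = 1 for BGN-high, x = 0 for BGN-low) is an assumption made for illustration, since the coding used in the analysis is not stated.

```latex
h(t \mid x) = h_0(t)\,\exp(\beta x), \qquad
\mathrm{HR} = \frac{h(t \mid x = 1)}{h(t \mid x = 0)} = \exp(\beta), \qquad
95\%\ \mathrm{CI} = \exp\!\bigl(\hat{\beta} \pm 1.96\,\mathrm{SE}(\hat{\beta})\bigr)
```

Here h_0(t) is the unspecified baseline hazard, and an HR above 1 means a higher instantaneous risk of recurrence in the BGN-high group.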

Immunohistochemistry (IHC) assays of tissue microarray
The protocol for the use of human tissues was approved by the Institutional Review Board (IRB) of the Affiliated Dongyang People's Hospital of Wenzhou Medical University (Zhejiang, China). Before the study, all patients gave written informed consent for the use of leftover tissue samples in scientific research. All eligible participants had received radical or palliative gastrectomy. The primary tumor samples were obtained from surgical specimens. The exclusion criteria were: 1) no informed consent obtained, and 2) multiple cancers. A total of 125 pairs of gastric cancer specimens, each including cancerous and adjacent tissue, from patients who underwent surgery in 2018 were eventually enrolled. These specimens were made into a tissue microarray. The BGN antibody was purchased from Abcam (cat# ab209234). Deparaffinization and immunohistochemistry (IHC) staining of the tissue microarray followed previously described protocols (27).

Results
The prognostic significance of BGN for gastric cancer
In the GSE65801 dataset, the BGN mRNA level was higher in tumor tissues than in normal tissues (Figure 1A). Kaplan-Meier analysis showed that higher expression of BGN was significantly associated with poorer recurrence-free survival (RFS) in gastric cancer patients. In the GSE26253 dataset, samples were divided into four subgroups (Q1, Q2, Q3, and Q4) according to the expression level of BGN. BGN mRNA levels were negatively correlated with the RFS of gastric cancer patients (Figure 1B). Therefore, the BGN expression level was negatively correlated with the prognosis of gastric cancer patients in a dose-dependent manner. For a survival analysis stratified by pathological stage, samples were re-stratified as BGN-high (equal to or greater than the median) and BGN-low (less than the median) according to the expression level of BGN mRNA. The HR values for high BGN expression were 1.44 (95% CI 1.02-2.06, p=0.038) and 2.16 (95% CI 1.22-3.87, p=0.007) in stage I-III (n=365) and stage IV gastric cancer patients, respectively (Figure 1C and 1D). These results suggested that high BGN mRNA levels were significantly related to poor prognosis in gastric cancer.
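As a rough consistency check on the hazard ratios reported above, the standard error of log(HR) can be recovered from the 95% CI and turned into a Wald z statistic and a two-sided p-value. This sketch assumes the intervals are symmetric on the log scale, as is usual for Cox regression output. It is not expected to reproduce the reported p-values exactly, since the test that produced them (Wald, likelihood ratio, or log-rank) is not stated.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), the Wald z statistic and a two-sided p-value from an HR and its 95% CI."""
    log_hr = math.log(hr)
    # A 95% CI on the log scale spans 2 * z_crit standard errors.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values taken from the stratified analysis reported above.
stage_1_3 = wald_from_ci(1.44, 1.02, 2.06)
stage_4 = wald_from_ci(2.16, 1.22, 3.87)
for name, (se, z, p) in [("stage I-III", stage_1_3), ("stage IV", stage_4)]:
    print(f"{name}: SE = {se:.3f}, z = {z:.2f}, Wald p = {p:.3f}")
```

Both recovered p-values fall below 0.05 and are of the same order as the reported ones, which suggests the published HRs, intervals, and p-values are mutually consistent.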

Bioinformatics analysis for the genes and proteins interaction network of BGN
To understand the biological functions of BGN, we conducted a bioinformatics analysis of genes coexpressed with BGN on Oncomine, using the Chen Gastric dataset (28). We screened out more than 10 genes with a strong correlation with BGN, such as THBS2, ARHGAP5, FN1, INHBA, and CDH11 (Figure 2A). Meanwhile, the protein-protein interaction network was analyzed with the STRING database (www.string-db.org). Figure 2B shows the protein-protein interaction network of BGN: more than a dozen genes, such as VCAN, TLR4, HSPG2, TGFB1, and GPC1, were reported through text mining to interact with BGN. Most of the above genes are involved in cell growth, cell communication, signal transduction, and cell adhesion (Figure 2C), all of which are closely related to tumorigenesis.

GSEA of BGN in gastric cancer
To explore the cancer-related gene signatures of BGN, we performed GSEA on the GSE26253 dataset, a downloaded microarray dataset of 432 gastric cancer cases. The expression of BGN was significantly associated with the following gene sets: Park hsc VS multipotent progenitors UP (Figure 3A), Nakamura metastasis model DN (Figure 3B), IVANOVA Hematopoiesis Stem Cell Long Term (Figure 3C), and RICKMAN Tumor Differentiated Moderately VS Poorly UP (Figure 3D). These results suggested that high BGN expression was significantly enriched for metastasis and poor-prognosis gene signatures, indicating that BGN might be associated with proliferation, poor differentiation, and high invasiveness of gastric cancer.
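The enrichment score behind such results is a weighted running-sum statistic computed over genes ranked by their correlation with the phenotype, here BGN expression. The sketch below shows that core calculation only. It omits the permutation-based significance testing and score normalization that the GSEA software performs, and the gene names and correlation values are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Maximum deviation from zero of the weighted running sum over a ranked gene list.

    ranked_genes : genes ordered from most positively to most negatively correlated
    scores       : the matching correlation values, in the same order
    gene_set     : set of gene names whose enrichment is being tested
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = in_set.count(False)
    hit_total = sum(abs(s) ** weight for s, hit in zip(scores, in_set) if hit)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it entirely")
    running = 0.0
    best = 0.0
    for s, hit in zip(scores, in_set):
        if hit:
            running += abs(s) ** weight / hit_total  # step up, weighted by correlation
        else:
            running -= 1.0 / n_miss                  # uniform step down for non-members
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical ranked list: correlation of six genes with BGN expression.
genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
corr = [0.9, 0.7, 0.3, -0.1, -0.5, -0.8]
es_top = enrichment_score(genes, corr, {"G1", "G2"})
es_bottom = enrichment_score(genes, corr, {"G5", "G6"})
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom yields a negative score, which is how the "UP" and "DN" signatures above are interpreted relative to BGN expression.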

Prediction of putative transcription factors of BGN by bioinformatic analysis
To further understand the carcinogenic mechanism, it is essential to explore the upstream regulation of BGN in gastric cancer. The BGN promoter region was predicted using http://gtrd.biouml.org/ and http://alggen.lsi.upc.es/cgi-bin/promo_v3/promo/promoinit.cgi?dirDB=TF_8.3. As shown in Figure 4A, the promoter region of BGN likely lies around the H3K27Ac signal, which is located around the first exon and partially overlaps a CpG island. Meanwhile, the potential transcription factor binding sites (TFBSs) were predicted with the Gene Transcription Regulation Database (GTRD) and ALGGEN-PROMO. Co-expression analysis identified the eligible transcription factors, including AR, CEBPA, CEBPB, E2F1, ELF1, GATA1, MAZ, PAX5, RXRA, SP1, STAT5A, TCF4, TP53, and YY1. Figure 4B shows the location of these TFBSs on the BGN promoter. Linear regression analysis indicated that BGN expression was significantly and positively associated with the TCF4 level, and negatively associated with the AR and E2F1 levels (Figure 4C).
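The linear regression step can be illustrated with an ordinary least-squares fit of BGN expression on the expression of one transcription factor. The sketch below is not the authors' analysis: the expression values are hypothetical and chosen only to show a positive association (as reported for TCF4) and a negative one (as reported for AR).

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x, plus the Pearson correlation r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical expression values for five samples.
bgn_expr = [2.1, 3.9, 6.2, 7.8, 10.1]
tcf4_expr = [1.0, 2.0, 3.0, 4.0, 5.0]   # rises with BGN
ar_expr = [5.0, 4.0, 3.0, 2.0, 1.0]     # falls as BGN rises

tcf4_fit = linear_fit(tcf4_expr, bgn_expr)
ar_fit = linear_fit(ar_expr, bgn_expr)
print("TCF4: slope = %.2f, r = %.3f" % (tcf4_fit[0], tcf4_fit[2]))
print("AR:   slope = %.2f, r = %.3f" % (ar_fit[0], ar_fit[2]))
```

The sign of the slope gives the direction of the association. A significance test on the slope, as reported in the study, would additionally require its standard error.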

BGN expression in protein level of gastric cancer tissue
In gastric cancer, the protein expression level of BGN in tumor tissues was significantly higher than that in normal tissues (Figure 5). Unfortunately, BGN was mainly expressed in the extracellular matrix rather than intracellularly, which made quantitative analysis difficult.

Discussion
A study has shown that the reproducibility of preclinical cancer research is only 11 percent (29). Therefore, third-party verification plays an essential role in proving the reliability of research results. This kind of third-party evaluation may also reduce potential biases, including observer bias and publication bias.
Fortunately, open public microarray databases provide objective data with which researchers can conduct such studies. Here, we acquired a public gastric cancer microarray dataset, GSE26253, containing 432 cases.
BGN, a member of the family of small leucine-rich repeat proteoglycans (SLRPs), was initially considered only as a component maintaining the structural integrity of the extracellular matrix, involved in the regulation of the inflammatory response, skeletal muscle development, and regeneration (9,10). In recent years, BGN has been found to be closely related to the occurrence and development of various malignant tumors, such as endometrial cancer (14), ovarian cancer (15), pancreatic adenocarcinoma (16), esophageal squamous cell carcinoma (17), colorectal cancer (18), and prostate cancer (21). In some malignant tumors, higher expression of BGN predicts greater invasiveness and worse prognosis (16,17,21,22). Therefore, it is valuable to re-evaluate the prognostic significance and clinical meaning of BGN in other cancers.
A previous study has shown that BGN promotes tumor invasion and metastasis of gastric cancer both in vitro and in vivo, and is associated with TNM stage. BGN plays an oncogenic role by activating the FAK signaling pathway in gastric cancer (19). In this study, through analysis of public datasets (Figure 1A) and immunohistochemical analysis of tissue arrays (Figure 5), we confirmed that BGN expression was higher in tumor tissues than in non-tumor tissues. Unfortunately, since BGN was mainly distributed in the extracellular matrix, its protein expression could not be analyzed quantitatively. In addition, we acquired a public microarray dataset, GSE26253, containing 432 gastric cancer cases. Kaplan-Meier analysis of RFS revealed that a higher BGN expression level portended a poorer prognosis in gastric cancer patients (p=0.03). Stratified analysis showed that BGN was significantly associated with RFS in both stage I-III (p=0.038) and stage IV (p=0.007) gastric cancer patients (Figure 1). Meanwhile, to explore the cancer-related gene signatures of BGN, we performed GSEA on the GSE26253 dataset, which revealed that BGN might be associated with proliferation, poor differentiation, and high invasiveness of gastric cancer. We also predicted the potential transcription factors of BGN by bioinformatic analysis.
This study has several limitations. 1) The protein expression levels of BGN could not be evaluated quantitatively by immunohistochemistry (IHC). In gastric cancer tissue samples, the BGN protein signal was seen only in the extracellular matrix rather than intracellularly (Figure 5), which made quantitative analysis difficult. 2) The mechanisms linking BGN to aggressiveness and poor outcome in gastric cancer have not yet been clarified. 3) Whether BGN is a therapeutic target needs to be further validated by experimental studies.
Taken together, high BGN expression was enriched for gene signatures of proliferation, poor differentiation, and high invasiveness. Kaplan-Meier analysis revealed that overexpression of BGN was significantly associated with poorer RFS in a dose-dependent manner in gastric cancer, in both stage I-III and stage IV patients. Therefore, BGN may be a potential prognostic biomarker and therapeutic target in gastric cancer.