COL6A2/3 can serve as potential therapeutic targets and prognostic biomarkers for clear cell renal cell carcinoma


 Background: Growing evidence has shown that the type VI collagen alpha chain (COL6A) family involved in the tumorigenesis and progression of diverse malignancies; however, its biological roles and potential mechanisms in clear cell renal cell carcinoma (ccRCC) remain unknown. The study was designed to explore the potential mechanisms and functions of COL6As in ccRCC.Methods: ONCOMINE and GEPIA databases were used to compare the transcriptional expression data of COL6As in ccRCC samples and normal renal samples. UALCAN database was utilized to determine the association between clinicopathological features and COL6As expression. Kaplan–Meier method was employed to determine the prognostic value of COL6As mRNA expression in ccRCC. CBioPortal database was used to investigate the genetic alterations of COL6As in ccRCC. Co-expression analyses, functional enrichment analyses, and gene set enrichment analysis (GSEA) were utilized to explore the potential action mechanisms of COL6As in ccRCC. Finally, we estimated the relationship between COL6As expression with immune cell infiltrates.Results: Upregulated transcriptional COL6A2/COL6A3 expression was observed in ccRCC specimens by comparison with noncancerous renal specimens. Patients with increased COL6A2/COL6A3 mRNA expression have a poor clinical outcome and unfavorable prognosis. Gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and GSEA analyses showed that COL6A2/COL6A3 might promote the tumorigenesis and progression of ccRCC by involving in several cancer-related pathways, such as axon guidance, focal adhesion, ECM receptor interaction. Besides, we found that COL6A2/COL6A3 expression was significantly associated with immune infiltration levels in ccRCC.Conclusions: COL6A2 and COL6A3 could act as candidate prognostic biomarkers and therapeutic targets in ccRCC. However, further experimental work was required to validate the conclusions.


Introduction
Clear cell renal cell carcinoma (ccRCC) is one of the most common urological malignancies worldwide [1] . Over 30% of the patients are diagnosed with advanced disease because of the lack of speci c clinical symptoms in the early stage [2] . However, the crucial question worried us is that these patients have missed the opportunity for radical surgery and responded poorly to routine chemoradiotherapy [3,4] . Furthermore, ccRCC has been clearly considered as a heterogeneous disease, leading to distinct tumor biologic traits, treatment response, and clinical outcomes [5] . Although improvement has been achieved in the clinical treatment of ccRCC, patients with recurrence and metastasis still show poor survival rates [3,6] . Therefore, it is particularly urgent to nd novel diagnostic and prognostic biomarkers and targets for ccRCC patients.
The type VI collagen alpha chain (COL6A) family is a major structural component of micro brils that contain six family members, including alpha1(VI) to alpha6(VI), also known as COL6A1 to COL6A6. COL6A4 is a pseudogene that does not encode protein among them. All other members encode primary extracellular matrix proteins that could prohibit cell invasion and metastasis [7] . It is thus not surprising that aberrantly deregulated COL6As contribute to the tumorigenesis and progression of human tumors.
Accumulating evidence has revealed that COL6As were involved in the occurrence and development of various cancers; for example, COL6A1 was highly expressed in pancreatic cancer samples than in adjacent control tissues and upregulation of COL6A1 indicates a poor clinical outcome of predicts unfavorable prognosis in patients with pancreatic cancer [8] . Evidence derived from bioinformatics analyses showed that collagen members COL6A1, COL6A2, and COL6A3 were increased in bladder cancer and negatively associated with patient prognosis [9] . High COL6A3 expression in gastric cancer promotes proliferation, migration, invasion and prohibits apoptosis of tumor cells through the PI3K-Akt signaling pathway [10] . A recent research showed that COL6A5 rs13062453 and rs1497305 were associated with the susceptibility of lung adenocarcinoma [11] . Besides, a previous study showed that COL6A6 was upregulated in pituitary adenoma tissues than in normal tissues [12] .
However, the associated with the COL6A gene family and ccRCC have rarely been recorded. A systematic functional study of COL6As will help us clearly understand their roles in malignancies. The study was aimed to probe into the roles of COL6As in ccRCC. In the current study, several available databases were employed to perform the expression patterns, clinicopathological analysis, prognosis value assessment, gene alterations exploration, functional enrichment annotate, gene set enrichment analysis and immune in ltration of COL6As in ccRCC. In conclusion, our study indicates that COL6A2 and COL6A3 could serve as potential biomarkers for diagnosis and prognosis of ccRCC patients.

ONCOMINE database
The gene expression array datasets of ONCOMINE (https://www.oncomine.org) are an online cancer microarray database [13] . In this study, the ONCOMINE database was used to analyze mRNA levels of COL6As in different malignancies. The mRNA expressions of COL6As in diverse cancer samples were compared with that in normal controls, using a students' t-test to obtain a P value. The cut-off of P value and fold change were de ned as 0.001 and 2, respectively.
GEPIA provides customizable functions such as tumor/normal differential expression analysis, pro ling according to cancer types or pathological stages, patient survival analysis, similar gene detection, correlation analysis and dimensionality reduction analysis. In the study, GEPIA was used to validate COL6As mRNA expression in ccRCC specimens. Furthermore, GEPIA database was employed to assess the association between COL6As mRNA expression and the survival rate of ccRCC patients. The 516 ccRCC patients with available overall survival (OS) and disease-free survival (DFS) time data were split into low-and high-expression groups by the median transcripts per kilobase million (TPM), a logarithmic rank P < 0.05 was considered signi cant. UALCAN database UALCAN (http://ualcan.path.uab.edu) includes TCGA grade 3 RNA-seq and clinical data of 31 human cancer types, which the platform can be used to analyze the relationship between gene expression and clinicopathological features [15] . In our study, UALCAN was utilized to compare differential mRNA levels of COL6As between paired AJCC stages or ISUP grades.

cBioPortal database
The cBioPortal for cancer genomics (http://www.cbioportal.org/) is an open-source database, which can provide multidimensional cancer genomics and clinical data retrieved from the TCGA database [16] . Here, three kidney renal clear cell carcinoma (KIRC) dataset, including data from 1,485 cases with pathology reports, was selected for further analyses of COL6As using cBioPortal database. The cBioPortal Cancer Genomics resource was used to assess the alteration frequency of the COL6As in ccRCC cases. STRING and GeneMANIA database The Search Tool for the Retrieval of Interacting Genes (http://string-db.org) [17] online database was utilized to identify ten co-regulated proteins; an interaction with a combined score > 0.4 was regarded as statistically signi cant.
The GeneMANIA (http://genemania.org/) [18] online database was utilized to identify co-regulated partners at the gene level. Furthermore, a protein-protein and gene-gene interaction networks for the COL6As and their functional partners were constructed using the STRING and GeneMANIA database. Correlation and Functional Enrichment analyses LinkedOmics (www.linkedomics.org/login.php) is a publicly available portal that provides a unique platform for biologists and clinicians to access, analyze and compare cancer multi-omics data within and across tumor types [19] . The COL6As co-expression analysis was statistically using Pearson's correlation coe cient through the LinkedOmics database. Correlated genes with a Pearson's correlation coe cient |0.5| and FDR < 0.05 were selected to perform functional enrichment analysis. Metascape (http://metascape.org) is a web-based portal designed to perform functional enrichment, interactome analysis, gene annotation, and membership search [20] . In the current study, Metascape was used to analyze the enrichment of Gene ontology (GO) for COL6As and its correlated genes, gene terms with P-value < 0.01, minimum count of 3, and enrichment factor of > 1.5 were regarded signi cantly. Additionally, the clusterPro ler package, which could automate biological terminology classi cation and gene cluster enrichment analysis processes [21] , was utilized to analyze the Kyoto Encyclopedia of Genes and Genomes (KEGG) for COL6As and its correlated genes, signal pathway enrichment analysis was conducted with the criterion both P and FDR < 0.05.
Gene set enrichment analysis To better understand the potential functions of COL6As in ccRCC, we conducted GSEA using transcriptional sequences from the TCGA database. In this study, gene expression pro les of 530 ccRCC patients were sorted into two subgroups (high-expression subgroup and low-expression subgroup) based on the median value of expression of COL6As. The "c2.cp.kegg.v7.2.symbols.gmt"gene sets from the Molecular Signatures Database (MSigDB) were analyzed using GSEA 3.0 software. Enrichment analysis was carried out using the default weighted enrichment statistics method. The number permutations of was set to 1,000 times for each analysis. Normalized enrichment scores (NES) were utilized to identify the signi cance of signaling pathways enriched in the high KIF4A expression subgroup. Gene sets with both a P-value of < 0.05 and a false discovery rate (FDR) of < 0.05 were considered as signi cantly enriched gene sets.

TIMER database
The Tumor Immunity Resource (TIMER) (https://cistrome.shinyapps.io/timer/) database contains mRNA sequencing data for 10,897 samples of 32 human cancers from the Cancer Genome Atlas (TCGA) [22] . It is an ideal tool for systematic analysis of immune in ltration in various cancer types. In this study, we assessed the COL6As expression with the abundance of different immune in ltrating cells.

Results
Increased COL6A2 and COL6A3 expression in ccRCC Using ONCOMINE database, we examined the differences of COL6As expression between ccRCC samples and normal renal samples. Results showed that COL6A1, COL6A2 and COL6A3 mRNA expression was highly expressed in neoplastic samples by comparison with normal renal specimens in the datasets (Fig. 1A). As indicated in Fig. 1B-C, in Beroukhim Renal dataset, COL6A2 mRNA expression was signi cantly upregulated in ccRCC tissues compared with the noncancerous tissues (Hereditary ccRCC vs. Normal, p = 4.90E-11, fold change = 3.482; Non-Hereditary ccRCC vs. Normal, p = 6.46E-8, fold change = 3.209), in Gumz Renal dataset, COL6A2 mRNA expression was highly expressed in ccRCC tissues than in noncancerous tissues (ccRCC vs. Normal, p = 1.93E-5, fold change = 3.356). In addition, increased COL6A3 mRNA expression was observed in ccRCC specimens based on data from Gumz Renal dataset (ccRCC vs. Normal, p = 3.06E-4, fold change = 4.420) (Fig. 1D). GEPIA database was used to further validate COL6As mRNA expression in ccRCC. Elevated transcription levels of COL6A2 and COL6A3 were found in ccRCC tissues compared to adjacent normal tissues based on the RNA-seq data from TCGA and the Genotype-Tissue Expression (GTEx) projects (Fig. 2).

Elevated COL6A2 and COL6A3 expression signi cantly associated with advanced clinicopathological features and unfavorable prognosis in ccRCC patients
Using the UALCAN database, we found that COL6A2/COL6A3 expression was signi cantly associated with clinicopathological parameters of patients with ccRCC. As can be seen from Fig. 3A-D, patients who were in more advanced stages or more advanced grade scores tended to express higher mRNA expression of COL6A2/COL6A3. Given the relationship between COL6A2/COL6A3 expression and clinicopathological features, survival analysis in Kaplan-Meier method was conducted further to evaluate the prognostic value of COL6A2/COL6A3 in ccRCC patients. Results revealed that COL6A2/COL6A3 expression was remarkably correlated with the overall survival (OS) and disease-free survival (DFS). As indicated in Fig. 4A-D, patients with the upregulated COL6A2/COL6A3 mRNA expression were likely to be predicted as having weak OS and DFS.

COL6A2/COL6A3 genetic alteration analysis and co-regulated network construction
Alteration frequency of COL6A2/COL6A3 mutations in ccRCC was analyzed using cBioPortal database. A total of 1,485 patients from three datasets of KIRC were analyzed. The analysis of genetic variations in COL6A2/COL6A3 reported in different datasets was showed in Fig. 5A, genetic alterations were 4.06 4.31% and 4.9%, respectively. The variations frequency included ampli cation (red), deep deletions (blue), mutation (green). Additionally, overview of the analyses of genetic variations in COL6A2/COL6A3 were showed in Fig. 5B, COL6A2/COL6A3 were altered in 66/1,485 (4.5%) of ccRCC patient samples. Furthermore, in our study, protein-protein network and gene-gene network were constructed using the STRING and GeneMANIA database, respectively (Fig. 5C-5D).

Correlation and Functional Enrichment analyses of COL6A2/COL6A3
Genes co-expressed with COL6A2/COL6A3 were analyzed using the linkedomics database. As shown in Fig. 6A and Fig. 6D, the dark red dots show a signi cantly positive correlation with COL6A2/3, whereas dark green dots reveal a signi cantly negative correlation with COL6A2/3. The top 50 signi cant genes positively and negatively correlated with COL6A2/3, as shown in the heat maps (Figs. 6B, C, E, F). According to the criteria: Pearson's correlation coe cient |0.5| and FDR < 0.05.

Discussion
Here, we analyzed the expression patterns and biological functions of six COL6A gene members in ccRCC. Our results show that COL6A2/COL6A3 was signi cantly elevated in ccRCC. Further research suggests that upregulated COL6A2/COL6A3 expression was markedly correlated with advanced clinical stage and higher tumor grade. Additionally, we found that patients with COL6A2/COL6A3 overexpression have an unfavorable prognosis. All of this evidence indicates that COL6A2/COL6A3 might serve as oncogene in ccRCC. Thus, we carry out functional explorations, including GO, KEGG and GSEA analyses, to explore the functional mechanism of COL6A2/COL6A3 in ccRCC. Mechanically, COL6A2 and COL6A3 were both found to involve in several tumorigenesis and tumor progression pathways, such as Focal adhesion, ECM − receptor interaction, Proteoglycans in cancer, Relaxin signaling pathway and Axon guidance. These pathways have been proven to associate with the occurrence and progression of ccRCC and several other types of cancer. Previous studies have shown that Focal adhesion played a vital role in tumor development, growth, in ltration and spread [24] , and many studies have shown activation of Focal adhesion-related genes in ccRCC [25][26][27] . The extracellular matrix (ECM) plays crucial role in the maintenance of normal structure and function of cells [28] . Deregulation ECM − receptor interaction contributes to the out of control of numerous cellular activities such as adhesion, migration, differentiation, proliferation, and apoptosis, and eventually affecting tumor progress [29] . proteoglycans have been demonstrated to be critical molecules in the tumor microenvironment, which contributing to the biological behavior of several malignancies, including proliferation, adhesion, angiogenesis and metastasis, resulting in tumor development and progression [30,31] . Several proteoglycans have been proven to promote tumor progression and metastasis in ccRCC [32,33] . Previous reports have indicated that relaxin could affect cell growth, cell migration, angiogenesis and extracellular matrix remodeling, targeting Relaxin signaling pathway has gradually become a promising therapeutic way in cancer treatment [34,35] . Axon guidance is integral to several basic cellular processes, including organogenesis, regeneration, wound healing.
A global genomes analysis has uncovered the potential involvement of axon guidance signaling pathway related genes in pancreatic carcinogenesis [36] . In conclusion, Focal adhesion, ECM − receptor interaction, Proteoglycans in cancer, Relaxin signaling pathway and Axon guidance may be signi cantly associated with COL6A2/COL6A3 expression in ccRCC.
Increasing studies have demonstrated that tumor-in ltrating immune cells (TICs) could interact with tumor cells and in uence the clinical outcome and immunotherapy [37,38] .
Recent studies have indicated that RCC tumors are more frequently in ltrated with lymphocytes than other malignancies [39] . In this work, we rst found the relationship between COL6A2/COL6A3 expression and TICs in ccRCC. The overexpression COL6A2/COL6A3 resulted in an increased in ltration levels of various immune cells. Among them, we found that the COL6A2/COL6A3 expression level was remarkably positively associated with in ltrating levels of CD4 + T cells, recent research has shown that CD4 + T cells could promote RCC cell proliferation