Bioinformatic gene analysis for potential biomarkers and therapeutic targets of kidney calculi-related renal cell carcinoma

Kidney calculi (KC) are considered a potential cause of renal cell carcinoma (RCC) because of urinary retention, hydronephrosis, pyelonephritis, and carcinoma of the renal pelvis. We searched for co-expressed genes to explore the relationship between KC and RCC and to reveal potential biomarkers and therapeutic targets of KC-related RCC. KC-related differentially expressed genes (DEGs) were identified via bioinformatic analysis of the Gene Expression Omnibus (GEO) datasets GSE73680 and GSE117518. RCC-related DEGs were likewise identified via bioinformatic analysis of the GEO datasets GSE14994 and GSE40435. Subsequently, co-DEGs of KC-related RCC were identified, and target prediction and network analysis methods were used to assess protein-protein interaction (PPI) networks, Gene Ontology (GO) terms, and pathway enrichment for the DEGs; the co-DEGs and their corresponding predicted miRNAs involved in KC and RCC were assessed as well. We identified 832 DEGs in KC and RCC samples. The co-DEGs VIM, DCN, WNK1, and PXDN, coupled with their corresponding predicted miRNAs, especially miR-181c-5p and miR-181d-5p, may be significantly associated with KC-related RCC. The co-DEGs VIM, DCN, WNK1, and PXDN link KC and RCC. Finally, the top 5 miRNAs for each co-DEG, especially miR-181c-5p and miR-181d-5p, may point to potential signaling pathways for KC-related RCC. Therefore, there is an association between KC and RCC, and expression of the VIM, DCN, WNK1, and PXDN genes may favor KC-related RCC.


Introduction
Kidney calculi (KC) are one of the most common urological conditions. Data from the National Health and Nutrition Examination Survey (NHANES) reported in 2012 showed a KC prevalence of 10.6% in men and 7.1% in women 1 . Globally, the incidence and prevalence of KC have increased over the years 1-3 . Although technological advances in the surgical management of KC have significantly reduced patient morbidity and recovery time, new stone formation and recurrence remain significant health issues. The recurrence rate after the first appearance of symptomatic stones is reported to be 30-50% within 10 years 1 . Research on the mechanisms underlying the occurrence and development of KC has therefore been growing steadily.
Like that of kidney stones, the incidence of renal cell carcinoma (RCC), the most common type of renal parenchymal tumor, accounting for 90% of malignant neoplasms arising in the kidney in adults, continues to grow 4,5,8 . Although the best treatment for RCC is resection 6 , patients with advanced disease can only be treated with chemoradiotherapy and other comprehensive methods 7 . If RCC could be predicted, it could be detected early and removed, which would improve the prognosis.
Although there are many risk factors for kidney cancer, such as smoking, obesity, and hypertension 8 , a Dutch team concluded that early diagnosis of KC (age < 40) was significantly associated with an increased risk of later diagnosis of kidney tumors (age ≥ 40) 9 . They found that KC was associated with an increased RCC risk (HR: 1.39, 95% CI 1.05-1.84) 9 . Previous studies have also assessed the relationship between kidney stones and both RCC and upper tract urothelial carcinoma (UTUC) 10-12 . In addition, a meta-analysis based on eight case-control studies and one retrospective cohort study found an increased risk of RCC and of both ureter and renal pelvis cancer in patients with KC 13 .
The association between kidney cancer and kidney stones is understandable, given the many risk factors they share, such as obesity, diabetes mellitus, and several dietary factors. Since RCC is itself a multigene-related tumor with an extremely complex pathogenesis, we speculated that there may be some genetic association between KC and RCC. In this study, we identified co-expressed differentially expressed genes (co-DEGs) of KC and RCC and examined the molecular mechanisms and pathology of KC-related DEGs (KC-DEGs) and RCC-related DEGs (RCC-DEGs). Finally, we provide a bioinformatic analysis of DEGs and predicted microRNAs (miRNAs) for KC patients prone to developing RCC.

Methods
The datasets for KC (GSE73680 and GSE117518) were downloaded from GEO (http://www.ncbi.nlm.nih.gov/geo/) 14 . Expression profiling arrays of GSE73680 were generated using GPL17077 Agilent-039494 SurePrint G3 Human GE v2 8 × 60K Microarray 039381 (Probe Name version), while expression profiling arrays of GSE117518 were generated using GPL21827 Agilent-079487 Arraystar Human LncRNA microarray V4 (Probe Name version). The data were then quantile-normalized and annotated using GeneSpring GX v12.1 software, and differential genes were screened with the limma package in R. Similarly, the datasets for RCC (GSE14994 and GSE40435) were downloaded from GEO. Expression profiling arrays of GSE14994 were generated using GPL3921 [ht_hg-u133a] Affymetrix HT Human Genome U133A Array and normalized by the RMA method, while expression profiling arrays of GSE40435 were generated using GPL10558 Illumina HumanHT-12 V4.0 expression beadchip, with the corresponding data obtained by quantile normalization and annotation. These data were also screened for differential genes with the limma package in R. The selection criteria were p < 0.05 and fold change > 2 or < 0.5.
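The screening criteria stated above (p < 0.05, fold change > 2 or < 0.5) can be illustrated with a short sketch. It is written in Python rather than R purely for illustration and is not the pipeline used in this study. It assumes the fold change is available on the log2 scale, which is how limma reports it, so that a fold change of 2 corresponds to |log2FC| = 1. The record type, function names, and probe identifiers are hypothetical.

```python
from dataclasses import dataclass
from math import log2
from typing import Optional


@dataclass(frozen=True)
class ProbeResult:
    probe_id: str    # placeholder identifier, not a real array probe
    log2_fc: float   # log2 fold change (assumed input scale)
    p_value: float   # p-value from the differential expression test


def classify_probe(result: ProbeResult,
                   p_cutoff: float = 0.05,
                   fc_cutoff: float = 2.0) -> Optional[str]:
    """Return 'up', 'down', or None under the stated criteria."""
    if result.p_value >= p_cutoff:
        return None
    threshold = log2(fc_cutoff)  # fold change 2 -> 1.0 on the log2 scale
    if result.log2_fc > threshold:
        return "up"      # fold change > 2
    if result.log2_fc < -threshold:
        return "down"    # fold change < 0.5
    return None


def split_degs(results):
    """Group the probes that pass the criteria by direction of change."""
    groups = {"up": [], "down": []}
    for result in results:
        direction = classify_probe(result)
        if direction is not None:
            groups[direction].append(result.probe_id)
    return groups


# Toy values for illustration only; they are not taken from the datasets.
toy_results = [
    ProbeResult("probe_A", 1.8, 0.001),   # passes, up-regulated
    ProbeResult("probe_B", -1.3, 0.020),  # passes, down-regulated
    ProbeResult("probe_C", 2.5, 0.200),   # fails the p-value cutoff
    ProbeResult("probe_D", 0.4, 0.001),   # fails the fold-change cutoff
]
toy_degs = split_degs(toy_results)
```

Note that a probe with exactly a 2-fold change is excluded here because the criterion is a strict inequality.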
GO analysis annotated the functions of the differential genes based on the GO database to obtain all functions in which each gene is involved, and then calculated the significance level (p-value) and false discovery rate (FDR) of each function by Fisher's exact test and multiple-comparison correction. KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database that systematically analyzes the relationships, functions, and genomic information of genes and their coding products. It helps researchers study genes and expression information as a whole network. The integrated metabolic pathways provided by KEGG, including the metabolism of carbohydrates, nucleosides, and amino acids and the biodegradation of organic compounds, not only cover all possible metabolic pathways but also give a comprehensive overview of the enzymes that catalyze each step. KEGG is a powerful tool for metabolic analysis and network research in vivo. Currently, KEGG Pathway is divided into eight categories: the overall network, metabolic process, genetic information transmission, environmental information transmission, intracellular biological processes, biological systems, human diseases, and drug development. Based on the KEGG database, pathway analysis used Fisher's exact test and the chi-square test to assess the significance of pathways involving the target genes. The screening criterion for both analyses was P < 0.05. PPI networks of KC-DEGs and RCC-DEGs were analyzed using the Search Tool for the Retrieval of Interacting Genes (STRING database, V11; http://string-db.org/), which predicts protein functional associations and protein-protein interactions. Subsequently, after downloading the STRING results with a combined_score ≥ 0.9, Cytoscape software (V3.5.1; http://cytoscape.org/) was used to visualize and analyze the biological networks and node degrees.
Finally, the Comparative Toxicogenomics Database (CTD) was used to find genes related to RCC or KC. The database scores gene-disease relationships indirectly, and we selected the four top-scoring genes. In addition, we applied the online prediction tools microRNA Data Integration Portal (mirDIP) (http://ophid.utoronto.ca/mirDIP) 15 and DIANA Tools (http://diana.imis.athena-innovation.gr/DianaTools/) 16 to predict potential miRNA targeting. Subsequently, we used mirDIP and DIANA Tools to predict which of the selected miRNAs could target the co-DEGs. We determined the top 5 candidate miRNAs based on higher predicted scores.
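The PPI step described above (keep STRING interactions with combined_score ≥ 0.9, then inspect node degrees) can be sketched as follows. This is an illustrative Python sketch and not the Cytoscape workflow itself. It assumes the edges are available as (protein, protein, score) tuples with scores on a 0 to 1 scale; some STRING exports use a 0 to 1000 scale instead, in which case the cutoff would need rescaling. The protein names and scores are placeholders.

```python
from collections import Counter


def node_degrees(edges, min_score=0.9):
    """Count interaction partners per node after score filtering.

    edges: iterable of (protein_a, protein_b, combined_score) tuples,
    with combined_score assumed to lie between 0 and 1.
    """
    seen_pairs = set()
    degree = Counter()
    for protein_a, protein_b, score in edges:
        if score < min_score or protein_a == protein_b:
            continue  # drop low-confidence edges and self-loops
        pair = frozenset((protein_a, protein_b))
        if pair in seen_pairs:
            continue  # an undirected edge may be listed in both directions
        seen_pairs.add(pair)
        degree[protein_a] += 1
        degree[protein_b] += 1
    return degree


# Placeholder network for illustration only; not taken from STRING.
toy_edges = [
    ("P1", "P2", 0.95),
    ("P2", "P1", 0.95),  # same edge, reversed
    ("P1", "P3", 0.92),
    ("P3", "P4", 0.40),  # below the cutoff
    ("P2", "P3", 0.99),
]
toy_degree = node_degrees(toy_edges)
hub_ranking = [name for name, _ in toy_degree.most_common()]
```

Nodes left without any edge above the cutoff, such as P4 in the toy network, do not appear in the result, so the node count of a filtered network can be smaller than the number of input genes.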
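The selection of the top 5 candidate miRNAs per gene can be pictured with the minimal Python sketch below. It assumes that each prediction is a (gene, miRNA, score) record and that a higher score means a stronger predicted interaction; how the mirDIP and DIANA scores were actually combined is not specified here, so the sketch ranks a single score column. All names and scores are placeholders.

```python
from collections import defaultdict
from heapq import nlargest


def top_mirnas_per_gene(predictions, n=5):
    """Return, for each gene, the n miRNAs with the highest scores.

    predictions: iterable of (gene, mirna, score) tuples.
    """
    hits_by_gene = defaultdict(list)
    for gene, mirna, score in predictions:
        hits_by_gene[gene].append((mirna, score))
    return {
        gene: [mirna for mirna, _ in nlargest(n, hits, key=lambda h: h[1])]
        for gene, hits in hits_by_gene.items()
    }


# Placeholder predictions for illustration only; not real tool output.
toy_predictions = [
    ("GENE_X", "miR-a", 0.91),
    ("GENE_X", "miR-b", 0.87),
    ("GENE_X", "miR-c", 0.99),
    ("GENE_X", "miR-d", 0.55),
    ("GENE_X", "miR-e", 0.73),
    ("GENE_X", "miR-f", 0.62),
    ("GENE_Y", "miR-a", 0.80),
    ("GENE_Y", "miR-g", 0.95),
]
toy_top = top_mirnas_per_gene(toy_predictions)
```

A gene with fewer than five predicted miRNAs simply returns all of them, ordered by score.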

Identification of DEGs
We identified 15,015 differentially expressed probes in GSE73680, of which 10,468 were up-regulated and 4,547 were down-regulated (Figure 1a). Similarly, we identified 2,105 probes in GSE117518, of which 936 were up-regulated and 1,169 were down-regulated (Figure 1b). The DEGs of the two datasets were intersected, yielding 343 shared genes (Figure 1c). In GSE14994, we identified 2,332 probes, of which 1,430 were up-regulated and 902 were down-regulated, while in GSE40435 we identified 1,764 probes, including 922 up-regulated and 842 down-regulated probes (Figure 1d). The DEGs of the two datasets were intersected, yielding 832 shared genes (Figure 1e, f).
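The intersection step reported above amounts to mapping each dataset's significant probes to gene symbols and keeping the symbols found in both datasets. The Python sketch below illustrates this under the assumption that a probe-to-gene annotation table is available as a dictionary; the probe identifiers and gene names are placeholders, and the handling of unannotated probes (dropped here) is also an assumption.

```python
def probes_to_genes(probe_ids, annotation):
    """Map probe identifiers to gene symbols, dropping unannotated probes."""
    return {annotation[p] for p in probe_ids if annotation.get(p)}


def intersect_degs(*gene_sets):
    """Return the genes present in every supplied DEG set."""
    if not gene_sets:
        return set()
    return set.intersection(*(set(genes) for genes in gene_sets))


# Placeholder annotations and probe lists for illustration only.
annotation_a = {"a1": "GENE1", "a2": "GENE2", "a3": "GENE3", "a4": ""}
annotation_b = {"b1": "GENE2", "b2": "GENE3", "b3": "GENE4"}

degs_a = probes_to_genes(["a1", "a2", "a3", "a4"], annotation_a)
degs_b = probes_to_genes(["b1", "b2", "b3"], annotation_b)
shared = intersect_degs(degs_a, degs_b)
```

Because several probes can map to the same gene, the number of shared genes is generally smaller than the probe counts of either dataset.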

Function enrichment in GO database
Based on the significant functions of the DEGs, the number of genes they contain, and their degree of enrichment in the database, a graph of significant functions was drawn covering biological process, molecular function, and cellular component. Figure 2a illustrates the significant functions of KC-DEGs, and heatmaps of KC-DEGs related to interleukin-6 secretion and negative regulation of chemokine production were generated from gene expression values; these data appear in [...] (p-value: 1.59E-02), TGF-beta signaling pathway (p-value: 2.20E-02), C-type lectin receptor signaling pathway (p-value: 3.22E-02), and Adipocytokine signaling pathway (p-value: 2.98E-02). In contrast, the KEGG terms Phagosome (p-value: 2.68E-13), Valine, leucine and isoleucine degradation (p-value: 2.84E-13), Carbon metabolism (p-value: 1.18E-08), Complement and coagulation cascades (p-value: 2.39E-08), and Glycolysis / Gluconeogenesis (p-value: 3.82E-08) were enriched in RCC-DEGs.
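Enrichment p-values of this kind come, per the Methods, from Fisher's exact test. A minimal Python sketch of the one-sided (over-representation) form is given below, together with a Benjamini-Hochberg adjustment as one common way to obtain an FDR; the choice of Benjamini-Hochberg is an assumption, since the correction method is not named in this paper. Function names and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    N: genes in the background, K: background genes annotated to the term,
    n: DEGs, k: DEGs annotated to the term. Returns P(X >= k) under the
    hypergeometric distribution.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy counts for illustration only: 10 background genes, 4 annotated to a
# term, 3 DEGs, 2 of which carry the annotation.
toy_p = enrichment_p_value(k=2, n=3, K=4, N=10)
toy_fdr = benjamini_hochberg([0.01, 0.04, 0.03])
```

For the toy counts the tail probability is (36 + 4) / 120 = 1/3, which shows that two annotated genes among three DEGs is unremarkable when 40% of the background carries the annotation.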

PPI network analysis
We identified 358 and 445 nodes in the PPI networks of KC-DEGs and RCC-DEGs, respectively; these data appear in Figure 3c.

Identification of functional and pathway enrichment among predicted miRNAs and Co-DEGs
The CTD database showed that the co-DEGs targeted KC and RCC. Because the database scores genes indirectly, the inclusion criterion was a score > 10. Four genes were thus selected: vimentin (VIM), decorin (DCN), WNK lysine deficient protein kinase 1 (WNK1), and peroxidasin (PXDN). Figure 4 shows the relationships between these genes and diseases. Among renal carcinomas, the disease most associated with DCN is RCC, while among calculi it is KC (Figure 4a). The same holds for VIM, WNK1, and PXDN (Figure 4b-d). Prediction analysis using the mirDIP and DIANA bioinformatic tools identified the top 5 miRNAs targeting each co-DEG involved in KC-related RCC; these data appear in Table 1. These data help us understand how the predicted miRNAs relate to the progression of KC-related RCC.

Discussion
Given the high incidence of KC, its consequent harm, including the potential to progress to RCC, deserves close attention. In clinical practice, early and comprehensive screening of KC patients is essential to prevent the occurrence of RCC. Although the etiology of RCC is not clear, it is widely believed that RCC is closely related to KC, obesity, and hypertension. Although the co-occurrence of KC and RCC may be influenced by a series of confounding factors, there is still a close relationship between these two diseases. Previous studies have suggested that KC can be regarded as a risk factor for postoperative recurrence in KC patients with RCC. Jeroen's group 9 reported that nearly half of all RCC cases in their study could be attributed to KC, based on the population attributable fraction using a multivariable-adjusted HR of 3.08. Given such a high risk, the process by which KC-related RCC arises and develops is worth exploring.
KC patients usually have hydronephrosis, which can increase the risk and progression of nephropathy and significantly affect prognosis. At present, however, few studies have focused on the relationship between KC and RCC.
In our study, the major co-DEGs were VIM, DCN, WNK1, and PXDN, which are associated with both KC and RCC. VIM had a high influence score in both RCC and KC, especially in RCC. This result is consistent with previous research on the invasion and metastasis of RCC via alteration of miR-490-3p/vimentin signals 17 . VIM plays an important role in a variety of tumors, not only RCC but also breast carcinoma 18 , nasopharyngeal carcinoma 19,20 , glioblastoma 21 , and gastric cancer 22 . In addition, Yu's group found that extracellular vimentin modulates human dendritic cell activation, preventing tissue damage from contributing to the development of autoimmunity 23 . Similarly, Santos et al. suggested that VIM may be a key regulator of the NLRP3 inflammasome 24 . Both findings indicate that VIM is closely associated with inflammation. This may imply that patients with KC are at risk of developing RCC through altered VIM expression.
In our results, DCN is mainly associated with kidney neoplasms, RCC, KC, renal colic, and chronic kidney disease-mineral and bone disorder. Xu and colleagues showed that DCN deficiency is significantly associated with RCC growth and metastasis 25 . DCN is also related to inflammation 26 . Additionally, WNK1 may promote renal tumor progression according to Kim's study 28 . Some researchers also consider it to be related to immunity and inflammation 30 . More interestingly, PXDN is considered to be associated with renal fibrosis in the murine unilateral ureteral obstruction model 29 . Thus, KC and RCC may be related in that both may arise from mutations at these loci or from gene variants.
Although our study may reveal some potential relationships between KC and RCC, it cannot establish that a biological mechanism necessarily links KC to RCC. In general, previous studies have proposed that chronic irritation and infection recruit inflammatory cells, which secrete cytokines and chemokines. In turn, free radical species of oxygen and nitrogen are produced, facilitating the onset of cancer by, among other effects, increasing cell proliferation 31-33 . The co-DEGs in our study are compatible with this supposition.
Finally, the results of the KEGG and GO pathway analyses of the co-DEGs suggest some possible pathways through which to investigate the effect of KC on RCC. We found that miR-181d-5p and miR-181c-5p were predicted to target the co-DEGs and may be involved in potential signaling pathways of KC-related RCC. In previous studies, miR-181c-5p was reported to promote an inflammatory response leading to disease progression 34-35 , while miR-181d-5p was considered to induce inhibition of the oxidative stress response 36 . This again coincides with the supposition above.

Conclusion
The co-DEGs VIM, DCN, WNK1, and PXDN link KC and RCC. Finally, the top 5 miRNAs for each co-DEG, especially miR-181c-5p and miR-181d-5p, may point to potential signaling pathways for KC-related RCC. Therefore, there is an association between KC and RCC, and expression of the VIM, DCN, WNK1, and PXDN genes may favor KC-related RCC.

Limitation
This study is a microarray analysis, and all results are based on gene expression values. Validation should therefore be carried out in vitro, in vivo, and in clinical trials. The genes and potential pathways we found do not appear to have been reported in previous studies of KC-related RCC. Thus, our study may provide some directions for further research.

Figure 1 Identification of DEGs in the KC datasets GSE73680 (a) and GSE117518 (b) and their intersection (c), and in the RCC datasets GSE14994 and GSE40435 (d) and their intersection (e, f).