2.1 DEG identification in different GEO data sets
In this study, gene expression characteristics of two datasets GSE6344 and GSE22459 were searched by NCBI-Geo database. In these Data sets, GSE6344 Microarray Data contained 20 samples, and further analysis of the specific composition revealed that it contained 10 renal cell carcinoma groups and 10 normal tissue samples. At the same time, GSE22459 Microarray Data contains 65 samples, which are specifically grouped into 24 renal fibrosis samples group and 25 normal controls group. We used GEO2R online tool to analyze the data obtained from GEO database. Finally, we found the genetic data of 2755 differentially expressed genes from GSE6344 dataset (Figure 1D). Its the cut-off criteria were |log FC| ≥ 0.5and P < 0.05; We found from GSE22459 data set 2552 differentially expressed genes of genetic data (figure 1E), its the cut - off criteria were |log FC|≥1 and P < 0.05. A total of 347 DEGs were detected (Figure 1A), of which 152 were down-regulated genes (Log FC ≤ 1) (Figure 1B) and 72 DEGs were up-regulated genes (Log FC ≥ 1) (Table 1 and Figure 1C).
Table 1
347 DEGS WERE IDENTIFIED FROM TWO PROFILE DATASETS
|
DEGs
|
Gene Name
|
Up
|
CRTAM、ANXA1、DOCK2、FPR3、PLEK、RUNX3、SLCO2B1、RAC2、CCL19、LAMP3、FYB、RHOH、VASH1、HLA-DQB1、GZMK、TRAT1、NKG7、XCL2///XCL1、PVRIG、LYST、SLA、ARHGAP25、SEZ6L2、LST1、MELK、SLC4A7、GPR18、RGS1、PTPRC、GNLY、ITGA4、C1S、GZMH、NLGN4X、PML、ADGRE5、CCL18、ASPM、GRIN2D、HLA-DPB1、CORO1A、CD52、TOP2A、LOC389906、PLEC、ISG20、HRH1、SLC17A2、REG1A、NR1H4、SDC3、DLGAP5、IL7R、SERPINE1、VCAN、SLC1A3、CCL5、DPEP2、SLC38A1、LYZ、CSF2RA、PON2、KIF20A、IGSF6、LY9、TRBC1、LOC101928916///NNMT、NOD2、APBB1IP、IGK///IGKC、APOC1、NUSAP1
|
Down
|
DMP1、EPB41L4B、SLC26A1、COL4A5、TCL6、TRPM6、GNAS、RPL21P28///SNORA27///SNORD102///RPL21、PDZRN3、FKRP、TF、ZNHIT2、SPTBN1、TACR1、GRIK2、CCDC181、HABP4、MFAP5、CHI3L1、PLA2R1、CEL、CBARP、TRPV5、IL5RA、NDNF、PRDM12、PROX1、PPM1H、PACSIN3、EGF、MYOZ2、TNNT2、APOA2、BAIAP3、GRM1、CLIC5、KMT2B、MFAP3L、NR4A3、MAGI1、CR1、SLC22A3、SYNPO2L、MAGEA8、ASB4、LINC00652、U2AF2、ACR、PPFIA2、CELA3B、TSC22D2、POU6F2、EYA3、TACSTD2、PHF2、STRA6、WISP1、HHLA3、MINA、PDE4DIP、GAS1、TNNI1、TRPM3、CA8、KCNN2、MBP、GHRHR、CD22、RPARP-AS1、SUCLG2、HSPB7、TSHR、CTTN、REEP2、GNRH2、PTGER3、TGFB2、CHRDL1、ATXN7L1、ASCL3、MIR124-1///LINC00599、LOC55338、BCAN、ARPP21、ATOH1、ADAM5、CFAP46、SCIN、PCDHB8、HAO1、TNFSF15、SOS2、CHGA、PAPPA、LOC100289518、TNNC1、CTAG2、F11、CYP2A6、LINC01361、GABRB1、MPO、NEU3、SLC26A4、USP2、NDOR1、PBOV1、CELF3、TYRO3、PEX6、KCNA10、VIPR2、DUOX1、PCDHA10、PTPRS、PTPN1、PLCG2、ZNF428、NUP188、SSX2B///SSX2、NLRP2、EXPH5、MAG、ATP6V1H、RPL3L、PMP2、EDDM3A、TRPM1、CCKAR、DDN、AVPR1A、ANKRD2、HOXB6、MC5R、METTL7A、CADM1、LMO3、CR2、MYH8、ADH1B、RASAL2、MLANA、NPY2R、PAX3、EYA4、RFX4、ANKRD1、EFCAB1、TPSD1、FTCD、SMG7-AS1、GPLD1
|
2.2 GO and KEGG Enrichment Analyses
Then, we used the online tool DAVID software to conduct GO analysis and KEGG analysis for the 72 up-regulated DEGs and 152 down-regulated DEGs mentioned above. These DEGs can be divided into the following two basic types: Cellular Components (CC) and Biological Processes (BP). Functional enrichment analysis of up-regulated genes showed that (Figure 2A), Genes in Cell Component (CC) mainly involve plasma membrane, membrane, integral component of plasma membrane, extracellular region and extracellular space. In Biological Processes (BP), the main function of these different genes is immune response and signal transduction(Table 2-1); Functional enrichment analysis of down-regulated genes showed that (Figure 2B), Genes in Cell Component (CC) mainly involve integral component of plasma membrane, plasma membrane, extracellular region and extracellular space (Table 2-2). The cut-off criteria were Count≥10 and P<0.05. KEGG pathway enrichment analysis showed that a total of 12 important signaling pathways were identified in up-regulated genes (Figure 2C). Including Cell adhesion Molecules (CAMs), Staphylococcus aureus Infection, Chemokine Signaling Pathway, Intestinal immune network for IgA production, Cytokine-cytokine receptor interaction, Viral myocarditis, Inflammatory bowel disease (IBD), Leishmaniasis, Influenza A, Tuberculosis, Herpes simplex infection and Fc gamma r-mediated phagocytosis (Table 2-3); A total of 6 important signaling pathways of down-regulated genes were identified (Figure 2D). Including Neuroactive ligand-receptor interaction, Calcium signaling Pathway, B Cell receptor signaling Pathway, and Dilated Cardiomyopathy, Hematopoietic cell lineage and Gap junction (Table 2-4). The filter criteria are PValue<0.05.
Table 2-1
GO Enrichment analysis of up-regulated DEGs
Category
|
Term
|
Count
|
PValue
|
GOTERM_BP_DIRECT
|
GO:0006955~immune response
|
13
|
6.39E-08
|
GOTERM_CC_DIRECT
|
GO:0005886~plasma membrane
|
35
|
4.37E-07
|
GOTERM_CC_DIRECT
|
GO:0016020~membrane
|
22
|
2.43E-05
|
GOTERM_CC_DIRECT
|
GO:0005887~integral component of plasma membrane
|
16
|
1.53E-04
|
GOTERM_BP_DIRECT
|
GO:0007165~signal transduction
|
12
|
0.004665
|
GOTERM_CC_DIRECT
|
GO:0005576~extracellular region
|
13
|
0.013557
|
GOTERM_CC_DIRECT
|
GO:0005615~extracellular space
|
11
|
0.024663
|
Table 2-2
GO enrichment analysis of down-regulated DEGs
Category
|
Term
|
Count
|
PValue
|
GOTERM_CC_DIRECT
|
GO:0005887~integral component of plasma membrane
|
28
|
4.31E-06
|
GOTERM_CC_DIRECT
|
GO:0005886~plasma membrane
|
48
|
8.05E-04
|
GOTERM_CC_DIRECT
|
GO:0005576~extracellular region
|
20
|
0.02846
|
GOTERM_CC_DIRECT
|
GO:0005615~extracellular space
|
17
|
0.04093
|
Table 2-3
KEGG pathway enrichment analysis of up-regulated DEGs
Term
|
Count
|
PValue
|
hsa04514: Cell adhesion molecules (CAMs)
|
7
|
2.24E-05
|
hsa05150: Staphylococcus aureus infection
|
4
|
0.001446
|
hsa04062: Chemokine signaling pathway
|
5
|
0.007241
|
hsa04672: Intestinal immune network for IgA production
|
3
|
0.016498
|
hsa04060: Cytokine-cytokine receptor interaction
|
5
|
0.018014
|
hsa05416: Viral myocarditis
|
3
|
0.023734
|
hsa05321: Inflammatory bowel disease (IBD)
|
3
|
0.029443
|
hsa05140: Leishmaniasis
|
3
|
0.035642
|
hsa05164: Influenza A
|
4
|
0.035903
|
hsa05152: Tuberculosis
|
4
|
0.037487
|
hsa05168: Herpes simplex infection
|
4
|
0.040764
|
hsa04666: Fc gamma R-mediated phagocytosis
|
3
|
0.048353
|
Table 2-4
KEGG pathway enrichment analysis of down-regulated DEGs
Term
|
Count
|
PValue
|
hsa04080: Neuroactive ligand-receptor interaction
|
12
|
3.01E-05
|
hsa04020: Calcium signaling pathway
|
8
|
0.001056
|
hsa04662: B cell receptor signaling pathway
|
4
|
0.023973
|
hsa05414: Dilated cardiomyopathy
|
4
|
0.039671
|
hsa04640: Hematopoietic cell lineage
|
4
|
0.043308
|
hsa04540: Gap junction
|
4
|
0.044557
|
2.3 PPI Network Analysis
We further introduced the above-mentioned up-regulated DEGs and down-regulated genes into the online analysis tool STRING, and obtained two different PPI networks to explore the interaction of up-regulated and down-regulated DEGs. In the PPI network of up-regulated differential genes, in addition to the disconnected nodes, it was also verified that 67 nodes (proteins) and 546 PPI edges (interactions) were in the PPI network constructed by up-regulated DEG (Figure 3A). The PPI network is imported into Cytoscape software, and the cytoHubba plug-in is used to determine hub genes (Figure 3C and Table 3), including PTPRC, PLEK, CD52, FYB, NKG7, CORO1A, LYZ, RAC2, DOCK2, NOD2. Using the same method, we obtained the PPI network (Figure 3B) and hub genes (Figure 3D and Table 3) that down-regulated differential genes, including ANKRD1, MYH8, TNNI1, SYNPO2L, TNNC1, MYOZ2, TNNT2, HSPB7, GRM1, CELF3. These hub genes may serve as promising biomarkers for renal fibrosis and renal clear cell carcinoma, and may be involved in the treatment of RF and KIRC.
Table 3
Top ten hub genes with higher Maximal Clique Centrality (MCC) of connectivity
DEGs
|
Rank
|
Name
|
Score
|
Up
|
1
|
PTPRC
|
7.64E+09
|
2
|
PLEK
|
7.64E+09
|
3
|
CD52
|
7.62E+09
|
4
|
FYB
|
7.53E+09
|
5
|
NKG7
|
7.34E+09
|
6
|
CORO1A
|
7.31E+09
|
7
|
LYZ
|
7.26E+09
|
8
|
RAC2
|
7.09E+09
|
9
|
DOCK2
|
6.94E+09
|
10
|
NOD2
|
6.92E+09
|
Down
|
1
|
ANKRD1
|
5936
|
2
|
MYH8
|
5869
|
3
|
TNNI1
|
5827
|
4
|
SYNPO2L
|
5800
|
5
|
TNNC1
|
5774
|
6
|
MYOZ2
|
5762
|
7
|
TNNT2
|
5162
|
8
|
HSPB7
|
5047
|
9
|
GRM1
|
3516
|
10
|
CELF3
|
3277
|
2.4 Verification of TCGA Data Set
First, we downloaded the KIRC and normal sample group data sets in TCGA and GTEx, and then used the GEPIA database to verify the expression levels and predicted values of hub genes. The results showed that the mRNA expression levels of a total of 13 central genes were significantly higher than that of normal kidneys organization (P<0.05) (Figure 4). These findings are consistent with the microarray data obtained.
2.5 mRNA-miRNA regulatory network
We reversely predicted the targeted miRNA regulated by 13 hub genes, and screened out 6 hub miRNAs through cytoHubba plug-in in the Cytoscape software (Figure 5B), including has-miR-3183, has-miR-4646- 5p, has-miR-520a-5p, has-miR-4723-3p, has-miR-4649-3p and has-miR-873-3p, which plays an important role in regulating gene expression, cell cycle and developmental sequence of organisms. Then establish a mutual assistance network between mRNA and miRNA (Figure 5A). At the same time, four genes related to hub miRNAs were screened out, namely TNNI1, LYZ, DOCK2 and PLEK.
2.6 Hub genes associated with prognosis
In the previous step, we screened out four hub genes, and then we used GEPIA online database to analyze whether these central genes were associated with prognosis and delineate the survival curve. Analysis showed that only LYZ gene was associated with the overall survival rate of patients with KIRC(P<0.05) (Figure 6). Meanwhile, we also noticed that the upstream miRNA directly associated with LYZ gene were has-miR-4649-3p and has-miR-873-3p, so we further screened out our core miRNA: has-miR-4649-3p and has-miR-873-3p.
2.7 The expression of LYZ is related to immune cell infiltration in KIRC
We obtained the final core gene LYZ through the previous step. Through prognostic analysis, we know that this gene is a protective gene (patients with high LYZ expression will survive for a long time), so we need to pick out immune cells with anti-tumor potential, including B cells and CD4+ T cells, CD8+ T cells, neutrophils, macrophages and dendritic cells. Then, the online analysis tool TIMER was used to analyze the correlation between multi-sample phasing and somatic cell copy number changes in KIRC samples and the above-mentioned immune cell infiltration. Our data showed that somatic copy number changes were significantly associated with the infiltration of CD8 + T cells (p <0.05) and macrophages (p <0.05) (Figure 7A). As shown in Figure 7B, the analysis showed that LYZ expression in KIRC was significantly correlated with tumor purity (cor = −0.241; p = 1.67e − 07), B cells (cor = 0.429; p = 5.57e−22), CD8 + T cells (cor = 0.383; p = 9.88e −17), CD4+ T cells (cor =0.283; p = 6.33e −10), macrophages (cor = 0.592; p = 1.18e −43), neutrophils (cor = 0.606; p = 2.83e −47) and Dendritic Cell (cor = 0.635; p = 1.11e − 52). According to the above results, the expression of LYZ was significantly positively correlated with these immune Cell infiltrates, and LYZ had a great influence on the Dendritic Cell infiltration of KIRC patients. Therefore, it was speculated that LYZ affected the survival of KIRC patients by affecting the degree of Dendritic Cell infiltration.
2.8 Rich analysis of LYZ functional networks in KIRC
The phenotype we need has been found above. Next, we need to explain what mechanism LYZ may influence the Dendritic Cell infiltration level of KIRC patients. We divided the high and low expression of LYZ in KIRC patients into two groups, and then conducted GSEA analysis to find out the pathway of immune cell infiltration regulation. The combined results showed that LYZ was found to be involved in both RF progression and KIRC. We used the functional module of LinkedOmics to analyze the LYZ mRNA sequencing data of 533 KIRC patients in TCGA. As shown in volcano diagram (FIG. 8A), LYZ was positively correlated with 4386 genes (dark red dots) and negatively correlated with 3166 genes (dark green dots) (FDR <0.01). 50 significant genes positively correlated with LYZ are shown in the heat map (Figure 8B), and 50 significant gene sets negatively correlated with LYZ are shown in the heat map (Figure 8C). They are associated with carcinogenicity of cancer, progression of fibrosis, development of kidney disease and inflammation. KEGG pathway analysis of GSEA is shown in Figure 8D. LYZ-related genes are detected in Leishmaniasis (FDR=0, P value=0), Th1 and Th2 cell differentiation (FDR=0, Immune network for IgA production (FDR=0, P value=0), Influenza A (FDR=0, P value=0), NF-Kappa B Signaling Pathway (FDR=0, P value=0), Ribosome (FDR=0, P value=0), Vibrio Cholerae infection (FDR=0, P value=0), Ubiquinone and other terpenoid-Quinone biosynthesis (FDR=0, P value=0) and Alzheimer disease (FDR=0, P value=0) (Figure 8E-P). Therefore, LYZ may be used as a diagnostic and therapeutic target for RF and KIRC by regulating immune cell infiltration through the above-mentioned pathways.
2.9 Verifying the HPA database
Finally, the protein expression level of LYZ was analyzed in HPA online database. Forty-four partial histological images of KIRC and normal renal tissues were analyzed. The results showed that LYZ gene protein was mainly expressed in renal tubules, and the expression of LYZ gene in KIRC was significantly increased compared with normal tissues (Figure 9), which was consistent with the above verification of mRNA levels.