Identification of prognostic m5C-Related LncRNAs in ccRCC
Using |R| >0.5 and P < 0.001 as filtering criteria, LncRNAs with co-expression relationship were identified from 16,876 LncRNAs and 31 MRGs, and a total of 422 co-expression LncRNAs were obtained. These results were visualized in Sankey diagram (Fig. 2A). Using P < 0.05 as the filtering criterion, the training group was firstly subjected to univariate COX analysis, and 36 m5C-related LncRNAs related to prognosis were screened out. Then lasso regression analysis was performed on the results of univariate analysis, and cross-validation was used to screen out the point with the smallest error (Fig. 2B-D). Then the COX model was constructed to obtain the optimal model for the 9 m5C-related LncRNAs screened by lasso regression. Correlation heat map was used to show the relationship between MRGs and m5C-related LncRNAs constructed by the model, in which red represented positive correlation and blue represented negative correlation (Fig. 2E). Risk score = (-0.732105208538704*AP004609.3) + (1.24333216856082*AC004837.2) + (1.25215186631223*AL133255.1) + (1.09407448306765*AC044781.1) + (-1.72051844326631*GNG12-AS1) + (-2.27930443465364*AP002755.1) + (-2.47688775750224*AC083805.2) + (-4.36659077891361*ERI3-IT1) + (4.50355534403254*AC078795.1). Through the obtained model formula, we could calculate the risk score of each sample in the training and testing groups. According to the median value of risk score, we divided the samples into high and low risk groups.
Survival analysis of the signature
In order to compare the survival of patients between high and low risk groups, survival analyses were performed on the training group, testing group and all groups. We found that the OS and PFS of patients in high risk group were significantly shorter than that in low risk group (Fig. 3D-E). Through the risk curve, we could observe the differences in the risk score and survival time of patients in high and low risk group (Fig. 3A-B). The heat map showed the expression level of 9 m5C-related LncRNAs in high and low risk groups. Among them, AC004837.2, AL133255.1, AC044781.1 and AC078795.1 were high-risk LncRNAs, while AP004609.3, GNG12-AS1, AP002755.1, AC083805.2 and ERI3-IT1 were low-risk LncRNAs (Fig. 3C).
Independent prognostic analysis
Univariate and multivariate independent prognostic analyses were performed to evaluate whether the prognostic model could be used as an independent prognostic factor characteristics. The results showed that our model was relevant to the prognosis of patients in both univariate and multivariate analyses. In univariate COX regression analysis, HR(hazard ratio) (95% CI) of risk score was 1.054 (1.041–1.067) with P < 0.05 (Fig. 4A). In multivariate COX regression analysis, the HR (95% CI) of risk score was 1.034 (1.020–1.048) with P < 0.05 (Fig. 4B). Then, we constructed ROC curve to judge the accuracy of our model in predicting patients’ survival. And the results showed that the AUC value of risk score (0.769) was significantly higher than that of age (0.652), gender (0.505) and grade (721) (Fig. 4D). In addition, the AUCs of one-year, three-year and five-year risk scores were 0.769,0.718 and 0.724, respectively, indicating that the model had reliable accuracy in predicting the survival of ccRCC patients (Fig. 4C).
Construction of nomogram and PCA analysis
We used age, sex, grade, stage, T-stage and risk score to construct a nomogram for predicting 1 -, 3 -, and 5-year survival (Fig. 5A). For example, if the patient has a score of 133, the probability of survival over one year is 0.896, over three years is 0.725, and over five years is 0.578. The calibration curve showed that the nomogram had good reliability in predicting the survival of patients at 1, 3, and 5 years (Fig. 5B). In addition, we constructed the C-index curve and found that the C-index value of the prognostic model was significant (Fig. 5C), indicating that the accuracy of the prognostic model was reliable. Patients were divided into subgroups according to clinical characteristics (gender, grade, stage, and T-stage), and then survival analyses were conducted on these subgroups based on risk scores. It was found that the model was applicable to patients in all subgroups (Fig. 5D-G). Finally, we performed PCA on the model LncRNAs, m5C-related LncRNAs, MRGs, and all genes (Fig. 6). We observed that the LncRNAs in our model could well distinguish patients with high and low risk groups, indicating the prognostic model was reliable.
Enrichment analysis and immune-related function analysis
GO enrichment analysis of DEGs between high and low risk groups showed that the biological process (BP), cellular component (CC) and molecular function (MF) were mostly enriched (Fig. 7A). The main components enriched in BP were phagocytosis and recognition, complement activation, and humoral immune response mediated by circulating immunoglobulin; the main enriched components in CC were immunoglobulin complex, circulation of immunoglobulin complex and external side of plasma membrane; and the main components enriched in MF were antigen binding, immunoglobulin receptor binding and endopeptidase inhibitor activity (Fig. 7A). KEGG analysis of these DEGs also suggested that these m5C-related LncRNAs might be associated with complement and coagulation cascades, viral protein interaction with cytokine receptor, and NF − kappa B signaling pathway (Fig. 7B). These results suggested that these LncRNAs might be involved in tumor development. Further analysis of immune-related functions showed that, except for major histocompatibility complex class I molecules (MHC_class_I), other types of immune responses were significantly higher in the high risk group than low risk group (Fig. 7C). This might explain why the high risk patients had worse survival compared with low risk patients.
TMB analysis and TIDE analysis
By comparing the mutation frequency between high and low risk groups, it could be observed that the top ten genes with high mutation frequency in high risk group were VHL, PBRM1, TTN, SETD2, BAP1, MUC16, MTOR, DNAH9, KDM5C, DST, HMCN1, LRP2, CSMD3, FBN2, KMT2C. TTN, SETD2, BAP1, MTOR, HMCN1 and CSMD3 had a higher mutation frequency in the high-risk group (Fig. 8A), while VHL, PBRM1, DNAH9, KDM5C, DST and KMT2C had high mutation frequency in low risk group (Fig. 8B). In general, there was no significant difference in TMB between the two groups (Fig. 8C). In addition, we performed survival analyses on patients with high and low TMB, and observed that the OS of patients with high TMB was significantly longer than patients with low TMB. And this result was also true in the case of a more detailed division of the groups (Fig. 8D-E). We then scored TIDE for each sample expressing the input file through the online website (http://tide.dfci.harvard.edu/), and observed that patients in high risk group had a greater potential for immune escape than low risk group, indicating that they had a worse response to immunotherapy (Fig. 8F).
Drug sensitivity analysis
According to the expression file and drug susceptibility file of the database, we scored each sample of ccRCC for drug susceptibility. A lower score indicated more susceptible to this drug. Then we could obtain sensitivity scores of each sample for 198 drugs. Using P < 0.01 as a filter, we found that the IC50 (half maximal inhibitory concentration) values of Entinostat and SB216763 were higher in high risk group compared with low risk group (Fig. 9A-B), indicating patients in low risk group were more sensitive to these two drugs. In addition, the higher IC50 value of Sapitinib in low risk group suggested that patients in high risk group were more sensitive to Sapitinib (Fig. 9C).
Role of lncRNA GNG12-AS1 in KIRC cells
Since 9 related LncRNAs were identified, the expression profiles of these 9 lncrnas in normal and tumor tissues were analyzed. Six lncrnas (AC004837.2, AL133255.1, AC044781.1, AC078795.1, GNG12-AS1, ERI3-IT1) were found to have statistically significant expression differences, and only GNG12-AS1 expression was down-regulated in tumor tissues (Fig. 10A). Therefore, we hypothesized that GNG12-AS1 may play an important role in the occurrence and development of KIRC. We then performed survival analysis of GNG12-AS1 expression and patient survival using the online database GEPIA2, and found that among KIRC patients, those with high GNG12-AS1 expression had better overall survival (Fig. 10B). GNG12-AS1 may act as a tumor suppressor gene and play an important role in the occurrence and development of KIRC. We performed a series of experiments to further explore how GNG12-AS1 regulates ccRCC progression in vitro. The plasmid overexpressing GNG12-AS1 was transfected into 786-O, 769-P, OS-RC-2 cells. The qPCR results showed good transfection efficiency (Fig. 10C). The results of wound healing assay showed that the scratch healing rates of 786-O, 769-P and OS-RC-2 cell lines overexpressing GNG12-AS1 were significantly lower than those of the control group. Transwell assays also showed that higher GNG12-AS1 expression reduced the migratory capacity of 786-O, 769-P, and OS-RC-2 cell lines. In the cell viability assay, the OD value at 450 nm of the GNG12-AS1 overexpressing cell line was lower than that of the control group at all time points (Fig. 11A-C).