Identification of genes and lncRNAs associated with disulfidptosis
The workflow of our analysis is presented in Figure 1. We collected a total of 11 genes associated with disulfidptosis from the aforementioned study [6]. We compared the expression of these genes between tumor and normal samples; NDUFS1, NDUFS2, NDUFC1, and EPAS1 were downregulated in CRC, whereas SLC7A11, OXSM, LRPPRC, NDUFA11, NUBPL, and CNOT1 were upregulated (Figure 2A). Using Pearson correlation analysis, we identified 253 lncRNAs associated with the disulfidptosis genes (Figure 2B).
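The correlation screen can be illustrated with a minimal sketch. It assumes expression values are held as plain lists per gene or lncRNA and keeps any lncRNA whose absolute Pearson coefficient with at least one disulfidptosis gene reaches a cutoff. The cutoff of 0.4, the function names, and the toy data are assumptions for illustration; the thresholds actually used in this study are not stated in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: correlation undefined, treated as none
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)


def screen_lncrnas(lncrna_expr, gene_expr, min_abs_r=0.4):
    """Keep lncRNAs with |r| >= min_abs_r against at least one gene.

    min_abs_r is a hypothetical cutoff, not the one used in the study.
    """
    hits = []
    for name, values in lncrna_expr.items():
        if any(abs(pearson_r(values, g)) >= min_abs_r for g in gene_expr.values()):
            hits.append(name)
    return hits


# Toy expression values across five samples (invented for illustration).
genes = {"SLC7A11": [1.0, 2.0, 3.0, 4.0, 5.0]}
lncs = {
    "lnc_up": [2.1, 3.9, 6.2, 8.0, 9.8],    # tracks the gene closely
    "lnc_flat": [3.0, 1.0, 3.0, 1.0, 3.0],  # uncorrelated with the gene
}
selected = screen_lncrnas(lncs, genes)
```

In practice a significance filter on the correlation p-value is usually applied alongside the coefficient cutoff; that step is omitted here to keep the sketch dependency-free.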
Construction of a prediction model for disulfidptosis-associated lncRNAs
Next, we randomly divided all tumor patients into a training group (n = 205) and a test group (n = 204) at a 1:1 ratio. There was no statistically significant difference in clinical characteristics between the two sets (Table 1). We then combined the expression of the disulfidptosis-associated lncRNAs with survival data in the training set and performed univariate Cox regression to assess the correlation of each lncRNA with prognosis. A total of 7 lncRNAs were significantly correlated with prognosis (Figure 3A). LASSO-Cox regression analysis was then conducted on the training set to determine the optimal prediction score (Figure 3B, C).
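The 1:1 random split can be sketched as follows. The patient identifiers, the seed, and the function name are hypothetical; the sketch only shows how 409 patients yield groups of 205 and 204 when the odd patient is assigned to the training set.

```python
import random


def split_half(patient_ids, seed=0):
    """Shuffle patients reproducibly and split them 1:1.

    When the total is odd, the extra patient goes to the training set.
    The seed value is an arbitrary choice for reproducibility.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = (len(ids) + 1) // 2
    return ids[:n_train], ids[n_train:]


# Placeholder identifiers standing in for the 409 tumor patients.
patients = [f"P{i:03d}" for i in range(409)]
train, test = split_half(patients)
```

Fixing the seed keeps the partition reproducible, which matters because the downstream Cox and LASSO-Cox steps are fitted on the training set only.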
The model equation is as follows:
Risk score = (AP003555.1 * 0.68564179981818) + (LINC00235 * 0.618608868499485) + (AF241728.2 * 1.65058793376611) - (SNHG16 * 0.660861460044132) - (HAS2-AS1 * 0.874173133799882)
As indicated above, SNHG16 and HAS2-AS1 are considered protective factors, while the remaining 3 lncRNAs are considered risk factors. We then examined the expression of these lncRNAs in normal and tumor samples and found that all 5 lncRNAs included in the model were upregulated in tumors. The correlation between these 5 lncRNAs and the disulfidptosis genes is shown in Figure 3E. They are positively correlated with SLC7A11, NUBPL, NDUFS1, LRPPRC, and CNOT1, but negatively correlated with NDUFS2 and NDUFA11.
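As an illustration, the risk score equation above can be written as a weighted sum of expression values. The coefficients are copied from the equation; the function name and the example expression values are hypothetical.

```python
# Coefficients copied from the risk score equation; negative signs mark
# the two protective lncRNAs (SNHG16 and HAS2-AS1).
COEFFICIENTS = {
    "AP003555.1": 0.68564179981818,
    "LINC00235": 0.618608868499485,
    "AF241728.2": 1.65058793376611,
    "SNHG16": -0.660861460044132,
    "HAS2-AS1": -0.874173133799882,
}


def risk_score(expression):
    """Weighted sum of lncRNA expression values for one patient.

    `expression` maps each model lncRNA name to its expression value;
    a missing lncRNA raises KeyError rather than silently counting as 0.
    """
    return sum(coef * expression[name] for name, coef in COEFFICIENTS.items())


# Invented expression profile in which every lncRNA has value 1.0, so the
# score equals the sum of the coefficients.
example = {name: 1.0 for name in COEFFICIENTS}
score = risk_score(example)
```

Because the weights on SNHG16 and HAS2-AS1 are negative, raising either value lowers the score, which matches their description as protective factors.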
Validation of the prediction index
To assess the model's ability to distinguish between patients, we scored each patient using the formula and categorized patients into high- and low-risk groups according to the median risk score. t-SNE and principal component analysis (PCA) demonstrated that the model lncRNAs were able to separate patients with different risk profiles (Figure 4A, B). To further test the predictive capability of the model for colorectal cancer, we evaluated it separately in the TCGA set, the training set, and the testing set. The overall survival curves showed that in all three cohorts the low-risk group had a significantly better prognosis than the high-risk group (Figure 4D-F). In addition, in each cohort, survival in the high-risk group continued to deteriorate as the disease progressed, compared with the low-risk group (Figure 4G-I). We also analyzed the expression of the model lncRNAs in the high- and low-risk groups and found that AP003555.1 and AF241728.2 were downregulated in the high-risk group, whereas the remaining lncRNAs were upregulated in the high-risk group (Figure 4J-L, Supplementary Figure 1). Finally, we plotted progression-free survival curves, which showed better survival in the low-risk group (Figure 4B).
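The median-based grouping can be sketched as below. The handling of scores exactly equal to the median (assigned to the low-risk group here) and the option of passing a fixed cutoff are assumptions; this section does not state how ties were treated or whether each cohort used its own median.

```python
from statistics import median


def assign_risk_groups(scores, cutoff=None):
    """Label each patient 'high' or 'low' relative to a cutoff.

    `scores` maps patient id -> risk score. When `cutoff` is None the
    median of the supplied scores is used. Scores equal to the cutoff are
    labelled 'low', which is an assumption of this sketch.
    """
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Invented scores for four patients; the median is (0.9 + 1.5) / 2 = 1.2.
toy_scores = {"a": 0.2, "b": 1.5, "c": 0.9, "d": 2.4}
groups = assign_risk_groups(toy_scores)
```

Passing an explicit `cutoff` allows a threshold derived from one cohort to be reused in another, should that design be preferred.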
Assessment of the prognostic value of the lncRNA-associated index
To validate the predictive capability of the model we constructed, Cox regression analyses were performed. The regression analyses showed that the model-derived risk score was an independent risk factor, with hazard ratios of 1.164 (1.113-1.218) and 1.153 (1.096-1.210), respectively. The ROC curves showed that the risk score was the best predictor of prognosis (Figure 5C). Time-dependent ROC analysis also indicated good predictive ability of this signature for survival at different years in all three cohorts (Figure 5D-F).
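For readers less familiar with how such values arise, a hazard ratio and its interval follow from a Cox coefficient and its standard error as HR = exp(beta) with bounds exp(beta ± z × SE). The sketch below assumes the reported intervals are 95% confidence intervals (z = 1.96), which this section does not state explicitly, and back-calculates a standard error from the first reported interval purely as a consistency illustration. The function names are hypothetical.

```python
import math


def hazard_ratio_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Back-calculate the standard error of beta from an HR interval.

    Assumes the interval is symmetric on the log scale.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Illustration using the first reported hazard ratio, 1.164 (1.113-1.218),
# under the assumption that the interval is a 95% confidence interval.
beta = math.log(1.164)
se = se_from_ci(1.113, 1.218)
hr, lo, hi = hazard_ratio_ci(beta, se)
```

The reconstructed bounds agree with the reported interval to within rounding, as expected for an interval that is symmetric on the log scale.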
Correlation analysis of clinical characteristics and construction of a nomogram
We examined the correlation between risk scores and individual clinical characteristics (Figure 6A). The expression of the lncRNAs used to construct the model did not differ significantly by gender. AP003555.1 expression was significantly increased in patients younger than 65 years of age. SNHG16 was upregulated in patients with T1-2 tumors, whereas AP003555.1 was upregulated in patients with T3-4 tumors. In patients with N1-2 disease, AP003555.1 and LINC00235 were upregulated. HAS2-AS1 and AF241728.2 were significantly upregulated in patients with M1 disease, and AP003555.1, LINC00235, and AF241728.2 were upregulated in patients with stage III-IV disease. In addition, the model predicted survival well for patients at different risk levels in both stage I-II and stage III-IV disease (Supplementary Figure 2).
The preceding analyses suggested that the risk score has good predictive ability for prognosis. We therefore integrated the risk score with clinical information to construct a nomogram (Figure 6B). For a randomly selected patient in the cohort, the nomogram gave predicted 1-, 3-, and 5-year survival rates of 0.978, 0.942, and 0.9, respectively. The nomogram was validated using a calibration plot, which showed good predictive ability with a c-index of 0.792 (Figure 6C). The ROC curves also showed that the nomogram, which combines the risk score with other clinical information, had better predictive value (Figure 6D).
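The c-index reported above measures how often a model ranks a shorter-surviving patient as higher risk. The following is a minimal sketch of Harrell's concordance index, not the implementation used in this study: a pair is counted as comparable only when the patient with the shorter follow-up time had an observed event, pairs with equal times are skipped as a simplification, and the data are invented.

```python
def concordance_index(times, events, risk):
    """Harrell's c-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    risk   : predicted risk (higher means worse expected survival)
    Pairs with tied times are ignored; tied risks count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented data: four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risk)
```

In this toy example five pairs are comparable and four are ranked correctly, giving 0.8; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.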
Gene ontology and gene enrichment analysis
We further explored the molecular mechanisms of the disulfidptosis-associated lncRNAs and their impact on colorectal carcinogenesis and progression. GO analyses showed that the differentially expressed genes were enriched for collagen-containing extracellular matrix at the cellular component level and for extracellular matrix structural constituent at the molecular function level (Figure 7A). KEGG enrichment analyses showed that these genes were enriched in pathways associated with focal adhesion and ECM-receptor interaction (Figure 7B).
Consistently, GSEA indicated that the high-risk group was significantly enriched for extracellular matrix structural constituent and collagen trimer (Figure 7C). In terms of pathways, the high-risk group was enriched in ECM glycoproteins and extracellular matrix organization pathways (Figure 7E). In the low-risk group, functions related to the cytosolic small ribosomal subunit and the small ribosomal subunit were notably enriched (Figure 7D). Additionally, pathways such as ribosome and cytoplasmic ribosomal proteins were significantly enriched in the low-risk group (Figure 7F).
Overall analysis of the immune profile and immunotherapy
GO and KEGG analyses showed that the two risk groups differ in the extracellular matrix, which is an important component of the tumor microenvironment. We therefore scored the tumor microenvironment of each patient and found that the high-risk group had a higher stromal score (Figure 8A). The high-risk group also had higher immune scores and ESTIMATE scores, which reflect tumor purity. To further clarify the immune alterations in colorectal cancer, we first analyzed the relationship between immune cell infiltration and risk scores using different immune analysis platforms, which showed that most immune cells, such as B cells, CD4+ effector T cells, CD8+ T cells, and NK cells, were significantly positively correlated with the risk score (Figure 8B). Next, we analyzed the alterations in immune function in patients at different risk levels and found that most immune functions were significantly upregulated in high-risk patients (Figure 8C). In addition, we examined differences in 47 immune checkpoints, and almost all of them showed increased expression in high-risk patients (Figure 8D). We used tumor immune dysfunction and exclusion (TIDE) to estimate the likelihood of immune escape and found that the high-risk group was more prone to immune escape than the low-risk group (Figure 8E).
In tumor patients, a higher degree of microsatellite instability (MSI) provides a stronger rationale for immunotherapy, and in colorectal cancer immunotherapy is usually used in patients with high microsatellite instability (MSI-H). We therefore analyzed the relationship between MSI-H and risk scores and found that the proportion of patients with MSI-H was greater in the high-risk group and that patients with MSI-H had higher risk scores (Figure 8F, G).
Drug sensitivity evaluation based on the prediction model
To evaluate the clinical value of the model and to identify effective drugs, we analyzed sensitivity to several drugs commonly used in colorectal cancer using the GDSC and CTRP databases. The results showed differences in drug sensitivity among patients at different risk levels. High-risk patients were more sensitive to 5-fluorouracil, oxaliplatin, paclitaxel, and fluorouracil, whereas low-risk patients were more sensitive to carboplatin and regorafenib (Figure 9).