1. Investigation Procedure of This Research
The workflow chart of data preparation, processing and analysis in the study was exhibited in Figure1.To calculate the DEGs of stromal and immune element in OC samples, data from the transcriptome and clinical data from 379 cases were obtained from the TCGA database and then calculated using the ESTIMATE methods. WGCNA analysis was performed built on these phenotypes, and the key module-trait “blue” and hub genes within the module were obtained. Finally,312 DEGs related to OC immune microenvironment were obtained to construct the prognostic prediction model of OC. Through multivariate and univariate Cox regression study, a 9-genes model was created. Depending on the appearance levels of these 9 genes, we established a risk score pattern, and then we concentrated on the risk score for the following series of analyses, containing correlation analysis riskscore and clinical and pathological features, immune cells,and immune checkpoints,and GSEA, as well as nomogram analysis.
2. Stromal and Immune Score Shared DEGs
According to the comparison study between low and high-score categories, a total of 1149 DEGs were screened from ImmuneScore, consisting of 629 rise genes and 520 genes that have been down-regulated. Similarly, 1132 DEGs (734 genes were found to be raised, while 398 were found to be down-regulated) were acquired from StromalScore in the study. To understand the exact distribution of differential gene profile on the whole,a visual analysis for DEGs was displayed by volcano plots(Figure2 A,C). Heatmaps of the top 20 up-and down-regulated DEGs in immunological and stromal elements were shown in Figure 2B and Figure 2D,respectively.
3. Identification of WGCNA
In this study,a WGCNA was performed with the 379 OSC samples identified above, leading to the hierarchical cluster tree(Hclust value=30000 and removed 4 significant outliers)(Figure3A).With a soft threshold power=4 and scale-free R2=0.9,the gene distribution conforms to a scale-free topology model fit(Figure3B). Hierarchical clustering results divided genes into e WGCNA modules(Figure.3C).The module-trait relationships revealed that module-trait ‘blue’ was was strongly linked with both ImmuneScore(r=0.58, p=1e-35) and StromalScore (r = 0.9, p= 2e-137).Therefore, the module ‘Blue’ was considered the key gene module.A total of 357 hub genes in immune score group(cor=0.97,p<1e-200) and 150 hub genes in stromal score(cor=0.85,p<1e-200) group were screened from blue module by setting GS>0.5 and MM>0.6(Figure. 3D,E).
4. Identification of DEGs in OSC
The intersection analysis showed by Venn plots between DEGs in ImmuneScore and the immne-regulated hub genes screened from WGCNA was implemented,and 311 genes were carried out(Figure 4A).A total of 126 stromal-regulated genes shared by stromal groups were displayed as well(Figure 4B).
5. Advancement and Validation of an Ovarian Cancer Prognostic Risk Model
5.1 A Prognostic Risk Model and Risk Score Assessment In Training Cohort
First,312 DEGs were carried out from the union of the above 311 immune-connected genes and 126 stromal associated genes. According to the expression of DEGs and corresponding survival information of 374 OC patients(excluding five samples with no information about survival) obtained from the TCGA database, we acquired a total of 28 DEGs correlated with prognosis through a univariate cox regression study (Figure5A).By using the multivariate COX regression model for the above single prognostic factor study, a successive deterioration method was applied and an optimal model including 9-prognostic genes(GIMAP7,HTRA4,CCL5,ICOS,CD40LG,CD3G,VSIG4,CD2,ANKRD22) was obtained. The risk score was considered created on the linear part of the multivariate Cox regression model of these 9 genes, and the formula was:
riskScore=(expression of GIMAP7)×(-0.169715998)+(expression of HTRA4)×(-1.744715582)+(expression of CCL5)×(-0.014980592)+(expression of ICOS)×(-0.806180272)+(expression of CD40LG)×(-1.998681563)+(expression of CD3G)×(-0.841718491)+(expression of VSIG4)×(0.050245582)+(expression of CD2)×(0.433625656)+(expression of ANKRD22)×(-0.403295818)
depending on the appearance of these DEGs, we considered the risk score of every case in the training group. Compared to the middle, individuals were categorized into high-(n=132) and low-risk categories (n=132). We mapped the patient’s risk curve(red for high-risk values, green for low-risk values)and the existing state diagram(green for alive, red for death) according to the risk model (Figure5B,C). These risk graph results demonstrated that the greater the risk, the greater the number of fatalities (Figure5D).
5.2 Authentication of the Prognostic Risk Model In Training Group
The forecast model was validated, and the overall survival curve was plotted depending on the risk categories as previously stated. It was evident that the greater-risk category had a significant correlation with a lesser prognosis (P< 0.05) (Figure5E).To clarify the fundamental role of the 9 genes in the overall survival of OC,the survival analysis was performed with an online tool: gene expression profiling interactive analysis(GEPIA2).As shown in Figure S1, patients who had a higher level of expression of CD2 , HTRA4,ICOS,CD40LG,and CD3G displayed better survival, whereas the other's expression of four signatures had no statistically significant relationship with overall survival(OS). The preceding outcomes showed a likely result in bias when the research was based on single genes rather than multiple genes. Next, the ROC curve analysis was acquired by showing sensitivity and accuracy,and the AUC for the 1- ( 0.728),3- (0.662) and the 5-year (0.727) survival was treated as excellent for predictions(Figure5F).
5.3 Authentication of the Prognostic Risk Model In Trial Cohort
In the internal validation investigation of the TCGA data, we plotted risk curve(Figure6A,B),model gene expression heatmap(Figure6C),survival curve(Figure6D) and ROC curve(Figure6E) sequentially.Similarly, the greater-risk group had lesser survival rates and the AUC analysis at 1-,3- and the 5-year survival were >0.6 in the testing cohort.
6 Tumor Micro-Environment Related Study
6.1 Differential Investigation of Immune Cells
To explore the alterations in traits of immune-related TME between the low and high-risk categories, ssGSEA was implemented to evaluate penetrating immunocytes. As presented in Figure 7A, the scores of 17 immune cells were determined to be a significant difference,such as Th1 cells,Tfh cells, and inflammation-promoting cells.
6.2 Differential Analysis of Immune Checkpoint
Combined risk score and immune checkpoint genes expressions investigation displayed that 17 genes was considerably changed between low and high-risk categories (Figure 7B).The correlation between special immunological checkpoint molecules PDCD1(PD-1) and 9 model genes were displayed in Figure8. The findings revealed that PD-1 was positively associated with 9 hub genes of the training cohort(P<0.01).Except for CD40LG [R=0.58,P<2.2e-16](Figure 8A),HTRA4 [R=0.55,P<2.2e-16](Figure 8B) and VSIG4[R=0.34,P<6.8e-12](Figure 8C),the higher expression levels of CCL5[R=0.74,P<2.2e-16](Figure 8D),CD2[R=0.79,P<2.2e-16](Figure 8E),CD3G[R=0.72,P<2.2e-16](Figure 8F),ICOS[R=0.70,P<2.2e-16](Figure 8G),GIMAP7[R=0.67,P<2.2e-16](Figure 8H),ANKRD22[R=0.66,P<2.2e-16](Figure 8I) were treated as excellent relation with R>0.6.
6.3 GSEA Enrichment Analysis
Given that the risk score was linked to OC patient survival, GSEA was carried out by “clusterProfiler” R package in the low-risk and high-risk clusters compared with the average risk score. As shown in Figure 9A, the DEGs in the low-risk group were strengthened in immune-related processes, including such autoimmune thyroid disease, antigen processing and presentation, and the intestinal immune network, IGA is produced. However, the DEGs were enriched in only one pathway for the high-risk group(no display). For C2 collection defined by MSigDB,the DEGs were mostly enriched in multiple immune cells events in the low-risk group, including adaptive immune response, positive regulation of immune response, and regulation of T cell activation and so on(Figure 9B). As the high-risk group,the gens improved mRNA translation and protein synthesis(Figure 9C).These outcomes recommended that the risk model might be a possible sign of the status of TME in OC patients.
7 Assessment of Independent Prognostic of Risk Model
By including age,race, tumor stage, tumor grade, tumor residual disease and riskScore in the predictive risk model, we looked into the relationship between the prognosis and clinical and pathological features of OC patients using the Univariate Cox regression.The outcomes presented that only the P-value of tumor residual disease and riskScore was less than 0.05(Figure 10A).These 2 clinical factors were consequently incorporated in the predictive model for multivariate Cox regression study, and the P-value of riskScore was identified as a prognostic significance still less than 0.05(Figure 10B).Therefore,the above analysis demonstrated that the prognostic value of riskScore was unaffected by other clinicopathological features.
8 Nomogram Construction for Survival Prediction
According to the total scores, we performed a prognostic nomogram with riskScore to predict the 01,03 and 05-year OS. The greater the score was, the lesser the overall survival rate was(Figure 11A).The calibration curves were plotted for predicting OC patient survival probabilities at 01,03 and 05-year in the training group (Figure 11B,C,D).The closer the slope was to 1,the higher the prediction accuracy was. The results for forecasting 01, 03, and 5-year OS displayed that the accuracy of forecasting prognosis models was very high, indicating that the prediction model could be used as a valid model.