1. Clinicopathological Characteristics of Enrolled OC Patients
The study flowchart is shown in Fig. 1. A total of 335 EOC methylation-related data samples were obtained from the UCSC Xena website. The median age of the patients was 59 years (30 to 87 years), while the median overall survival was 1039 days (31–5481 days). There were 20 cases of stage I-II, 315 cases of stage III-IV, and 83 cases of lymphatic metastasis. The clinicopathological features of the 335 patients are listed in Table 1.
Table 1
The clinicopathological features of patients (N = 335)
characteristics
|
number (%)
|
Age
|
|
≥ 65
|
105(31.3)
|
༜65
|
230(68.7)
|
Race
|
|
White
|
296(88.4)
|
Black and African American
|
17(5.1)
|
Others
|
22(6.5)
|
FIGO stage
|
|
I-II
|
20(6.0)
|
III- IV
|
315(94)
|
Histologic grade
|
|
1–2
|
53(12.8)
|
3–4
|
292(87.2)
|
Lymphatic invasion
|
|
NO
|
252(75.2)
|
YES
|
83(24.7)
|
Tumor residual
|
|
NO*
|
60(17.9)
|
1 ~ 2cm
|
23(6.9)
|
> 2cm
|
216(64.5)
|
others
|
36(10.7)
|
* No Macroscopic disease |
2. Dna Mdgs In Oc
The MethylMix R package identified 111 DNA MDGs of OC, including 50 hypermethylated and 61 hypomethylated genes (Table S1).
3. Enrichment analysis
We screened 111 MDGs (P-value < 0.05), and the level of gene methylation was visualized via heatmap (Fig. 2A). The “clusterProfiler” and “GOplot” package were used to perform GO enrichment analysis to determine the function of the MDGs in OC. The identified MDGs were found to be closely related to “CELLULAR RESPONSE TO NICOTINE”, “POSITIVE REGULATION OF ION TRANSMEMBRANE TRANSPORTER ACTIVITY”, “TRANSEPITHELIAL WATER TRANSPORT”, “GRANULOCYTE MACROPHAGE COLONY − STIMULATING FACTOR BIOSYNTHETIC PROCESS”, etc (Fig. 2B). The top 20 clustered KEGG pathway is shown in Fig. 2C and was converted into a network based on Kappa-statistical similarities and visualized by Cytoscape (v.3.1.2) (Fig. 2D). The MDGs were found to play an important role in the “HALLMARK ESTROGEN RESPONSE LATE”, “IL-18 signaling pathway”, “PID PDGFRB PATHWAY”, “MAPK signaling pathway” and other important pathways involved in tumorigenesis and development.
4. Prognostic Value Of Dna Mdgs In Eoc
Univariate analysis was used to screen for survival-related MDGs, and a total of 14 MDGs were identified, including MARVELD1, MFAP4, EMILIN1, PI3, SLC9A1, IL18BP, FBLN2,SCTA5A, SCNN1A, GSTP1, RPL39L༌ MSX1, CD3G, ACTA2 (P < 0.05) (Table S2). KM analyzed the correlation between the expression of the 14 genes and OS of EOC patients (Figure S1A). LASSO regression was used to narrow down the range of target genes (Figure S1B). A total of twelve DNA MDGs (EMILIN1, FBLN2, STAT5A, SCNN1A, GSTP1, CD3G, SLC9A1༌MARVELD1, PI3, MFAP4, MSX1, IL18BP) were identified as independent predictive factors (Figure S1C). COX multivariate analysis was performed to determine the correlation between the mRNA expression of the twelve genes and OS of EOC patients, as well as the correlation between the level of DNA MDGs and OS of EOC patients. Finally, six DNA MDGs (SLC9A1༌MARVELD1, PI3, MFAP4, MSX1, IL18BP) were used to construct the risk score model (Table 2), among them SLC9A1༌MARVELD1༌PI3 and MFAP4 were highly expressed and hypomethylated, while MSX1, IL18BP were lowly expressed, and their hypermethylation closely related to poor prognosis of EOC (Fig. 3A, C). The DNA methylation level was found to be closely related to mRNA expression (|R|>0.3, P < 0.05) (Fig. 3B).
Table 2
Results of the six genes in the multivariable Cox regression analysis.
Genes
|
Coefficient
|
HR
|
HR.95L
|
HR.95H
|
P value
|
SLC9A1
|
0.063
|
1.065
|
1.024
|
1.108
|
0.002
|
MARVELD1
|
0.024
|
1.025
|
1.000
|
1.049
|
0.046
|
PI3
|
0.013
|
1.014
|
1.007
|
1.020
|
2.511E-05
|
MFAP4
|
0.013
|
1.013
|
1.001
|
1.025
|
0.031
|
MSX1
|
-0.015
|
0.985
|
0.973
|
0.997
|
0.017
|
IL18BP
|
-0.105
|
0.900
|
0.844
|
0.960
|
0.001
|
HR: hazard ratio; L: low; H: high. |
5. Construction Of A Prognostic Risk Score Model For Eoc
Complete clinical and expression data of the six genes (SLC9A1,MARVELD1, PI3, MFAP4, MSX1, IL18BP) in EOC patients were used to calculate the risk score. The X-tile chart was used to determine the optimal cutoff value. KM analysis revealed that the OS of the high-risk score group was remarkably shorter than that of the low-risk score group (Fig. 4A). The distribution of the six gene signature risk scores is shown in Fig. 4B and a heatmap was used to show the expression profiles of the six genes and the related risk scores (Fig. 4D). Univariate and multivariate analyses of age, race, FIGO stage, lymphatic invasion, histologic grade, tumor residual, and risk score were performed. Univariate analysis showed that age, FIGO stage, and risk score were independent risk factors for poor prognosis of EOC patients, while multivariate analysis showed that age and risk score were independent risk factors for poor prognosis of EOC patients (Fig. 4C).
6. Construction And Evaluation Of The Nomogram
The area under the curve (AUC) for 3, 5, 10-years OS demonstrated the prognostic power of the risk score signature. The AUCs for 3-, 5-and 10-year survival times were 0.673, 0.696, and 0.801, respectively, and were significantly higher than the prognostic power of age and FIGO stage (Fig. 5A). The time-dependent AUC of the risk score was found to be significantly higher than that of age and FIGO stage for the 10 years OS prediction (Fig. 5B),indicating favorable discrimination based on the risk score. We constructed a nomogram for predicting the individualized probability of EOC survival times in clinical practice, taking into consideration, age, FIGO stage, and risk score (Fig. 5C), then the clinical model was assessed by DCA. The results showed that the nomogram performed better compared with the age or FIGO stage model. Therefore, the constructed nomogram can be used by clinicians as a prognostic guide to make accurate clinical decisions for EOC patients (Fig. 5D). The calibration curve of the nomogram at 3, 5, 10 years OS were close to 45°line༌thus, demonstrated high predictive accuracy (Fig. 5E).
7. External Validation of the nomogram
For further external validation, we validated the constructed 6-methylation-driven gene-based risk score prediction model in the GSE26193 dataset. The results of KM analysis suggested that the high-risk group had a poor prognosis (p < 0.05, Fig. 6A). The AUCs for 0.5-, 1-and 3-year survival times in GSE26193 were 0.873, 0.852, and 0.715 (Fig. 6B). Compared with stage and grade, risk score had the highest AUC value at three years (0.5, 1, 3 year), an indication that the best performing model was based on the risk score model in the validation dataset (Fig. 6C). In the GSE53963 dataset, KM analysis revealed that EOC patients with high-risk scores or high nomogram risk score (P < 0.01) had a poor prognosis (Fig. 6D, E). Furthermore, there were significant differences in prediction outcomes between FIGO staging, histologic grade, and risk-score (Fig. 6F). The AUCs of the risk-score and nomogram score at 0.5-, 1-, 5-and 10-year survival times were 0.744/0.645, 0.691/0.645, 0.685/0.645, and 0.746/0.645, respectively, which were significantly higher than age and histologic grade (Fig. 6G), similarly to the time-dependent AUC (Fig. 6H). Based on the validation results, the nomogram was found to be more accurate in predicting short-term and long-term prognosis of EOC than risk score, age, and FIGO staging. Therefore, the constructed nomogram can be effective in making accurate clinical decisions and follow-up for EOC patients.
8. Cnv, Mutation Characteristics And Pathway Enrichment Analysis
To study the effect of methylation on the expression of the selected genes, CNV analysis was performed to determine the factors affecting gene transcription. The results showed that transcription of the six genes (SLC9A1,MARVELD1, PI3, MFAP4, MSX1, IL18BP) was affected by gene amplification, deletion, and mutation. After eliminating these effects, changes in the degree of methylation had a significant effect on the transcription and expression level of the 6 candidate genes. These results showed that DNA MDGs play a key role in EOC. GSEA was used to investigate the signaling pathways associated with the 6 MDGs in the two risk score model groups. In the high-risk score group, the top five signaling pathways (risk score > 0.314, Fig. 7B) were “ADHERENS JUNCTION”, “NOTCH SIGNALING PATHWAY”, “ERBB SIFNALING PATHWAY”, “FOCAL ADHESION”, “HEDGEHOG SIGNALING PATHWAY”, while in the low-risk score group, the top five signaling pathways (risk score < 0.314, Fig. 7C) were “PROTEASOME”, “OXIDATIVE PHOSPHORYLATION”, “AUTOIMMUNE THYROID DISEASE”, “PRIMARY IMMUNODEFICIENCY” and “PARKINSONS DISEASE”. Most of these signaling pathways are closely related to tumor invasion and metastasis and promote the development of tumors.
9. mRNA Expression of the six genes in OVCAR3 cells after DAC treatment
Studies have shown that the six genes (SLC9A1,MARVELD1, PI3, MFAP4, MSX1, IL18BP) play an important role in the development of EOC. To determine the effect of methylation on the expression of the six genes, EOC cell line OVCAR3 was treated with DAC. DAC was found to inhibit methylation in the six genes, thus, increasing the mRNA expression levels(Fig. 8).
10. Immune Assessment Of Two Risk Score Model Groups
Our analysis found that the TMB level of the high risk group was lower than that of the low risk group (P < 0.049), which was negatively correlated with the risk score (Fig. 9A, B). The attached figure shows the results of somatic mutations in the high risk group (Figure S2A) and the low-risk group (Figure S2B). Further analysis showed that DNAH11, CHD4, and CREBBP have a higher mutation frequency in the low risk group, while BRCA1, MYH1, and MDN1 have a higher mutation frequency in the high risk group (Figure S2C). We use heat maps to show the analysis of TIMER, CIBERSORT, CIBERSOTR-ABS, QUANTISEQ, MCPCOUNTER, XCELL, EPIC of high-risk groups and low-risk groups (Fig. 9C). Among them, 16 groups with significant differential expression in TIMER, CIBERSORT and XCELL were show in Fig. 9D. Figure 9E, F shows the correlation between MDGs and Immune function and Check point. We found that compared with the high risk group, MHC-1 factor is high expressed in low risk group while the expression of tumor immune factors LAG3, CD274 (PD-L1) increased significantly. It shows that the risk score model we have established can effectively reflect the immune environment and immunotherapy response of ovarian cancer.