Differentially expressed ARGs
We analyzed the expression of 232 ARGs in 552 endometrial cancer tissues and 35 non-tumor tissues using the Wilcoxon signed-rank test in R. We obtained 37 differentially expressed ARGs, according to the criteria of |log2FC| > 1 and FDR < 0.05. These ARGs include 19 up-regulated genes (PARP1, PRKCD, CTSB, APOL1, ATG4D, BNIP3, BAK1, P4HB, ERBB2, ERO1A, GAPDH, IKBKE, TP63, EIF4EBP1, SERPINA1, IFNG, PTK6, BIRC5, and CDKN2A) and 18 downregulated genes (ITPR1, FOS, GRID1, HSPB8, NRG3, NRG2, DLC1, BCL2, FOXO1, CCL2, PRKN, TUSC1, CDKN1B, GABARAPL1 ST13, RAB33B, CALCOCO2, and MYC). The volcano map (Fig. 1A), heatmap (Fig. 1B), and boxplot (Fig. 1C) were visualized for these ARGs(Supplementary Table S1).
Go Enrichment Analysis Of Differentially Expressed Args
GO functional enrichment analysis of these 37 differentially expressed ARGs was performed(Supplementary Table S2), and the enrichment results were visualized to understand the biological functions of these genes. The results showed these ARGs were primarily involved with the intrinsic apoptotic signaling pathway, processes utilizing the autophagic mechanism, the cellular response to oxidative stress, integral components of the mitochondrial outer membrane, autophagosome membrane, protease binding, and heat shock protein binding(Fig. 2A, 2B).
Kegg Enrichment Analysis Of Differentially Expressed Args
The results of the KEGG pathway enrichment analysis indicated that the differentially expressed ARGs were related to autophagy, apoptosis, the ErbB signaling pathway, the HIF-1 signaling pathway, cellular senescence, the AGE-RAGE signaling pathway in diabetic complications, protein processing in the endoplasmic reticulum, endometrial cancer, the FoxO signaling pathway, and the estrogen signaling pathway (Table 1). In endometrial cancer, three differentially expressed ARGs (ERBB2, BAK1 and MYC), which are closely related to the occurrence of endometrial cancer, were increased. The estrogen signaling pathway, which is highly associated with the development of endometrial cancer, showed enrichment of four differentially expressed ARGs (ITPR1, FOS, PRKCD, and BCL2).
Table 1
KEGG enrichment results for differentially expressed ARGs.
ID
|
Description
|
P
|
geneID
|
hsa04140
|
Autophagy - animal
|
7.80E − 06
|
ITPR1/CTSB/PRKCD/
ATG4D/GABARAPL1/
BNIP3/RAB33B/
BCL2
|
hsa04210
|
Apoptosis
|
4.99E − 05
|
PARP1/ITPR1/FOS/CTSB/
BAK1
/BCL2/BIRC5
|
hsa04012
|
ErbB signaling pathway
|
4.99E − 05
|
CDKN1B/NRG2/ERBB2/
MYC/EIF4EBP1/NRG3
|
hsa04066
|
HIF-1 signaling pathway
|
1.62E − 04
|
CDKN1B/ERBB2/IFNG/
BCL2/GAPDH/EIF4EBP1
|
hsa04621
|
NOD-like receptor signaling pathway
|
1.94E − 04
|
ITPR1/CTSB/PRKCD/
GABARAPL1/CCL2/
BCL2/IKBKE
|
hsa01522
|
Endocrine resistance
|
8.02E − 04
|
CDKN1B/FOS/ERBB2/
CDKN2A/BCL2
|
hsa04933
|
AGE-RAGE signaling pathway in diabetic complications
|
8.02E − 04
|
CDKN1B/PRKCD/FOXO1/
CCL2/BCL2
|
hsa04137
|
Mitophagy - animal
|
1.84E − 03
|
CALCOCO2/PRKN/
GABARAPL1/BNIP3
|
hsa01521
|
EGFR tyrosine kinase inhibitor resistance
|
2.62E − 03
|
NRG2/ERBB2/BCL2/
EIF4EBP1
|
hsa04215
|
Apoptosis - multiple species
|
2.62E − 03
|
BAK1/BCL2/BIRC5
|
hsa04218
|
Cellular senescence
|
3.49E − 03
|
ITPR1/FOXO1/CDKN2A/
MYC/EIF4EBP1
|
hsa04141
|
Protein processing in endoplasmic reticulum
|
3.57E − 03
|
ERO1A/PRKN/BAK1/
BCL2/P4HB
|
hsa05213
|
Endometrial cancer
|
9.22E − 03
|
ERBB2/BAK1/MYC
|
hsa04068
|
FoxO signaling pathway
|
1.00E − 02
|
CDKN1B/FOXO1/
GABARAPL1/BNIP3
|
hsa04915
|
Estrogen signaling pathway
|
1.17E − 02
|
ITPR1/FOS/PRKCD/BCL2
|
Survival-related Args And The Prognostic Model
We performed univariate Cox regression of 37 differentially expressed ARGs, and 9 ARGs associated with endometrial cancer prognosis were obtained, including ERBB2, CDKN2A, BAK1, GRID1, NRG3, PTK6, DLC1, P4HB, and BIRC5 (P < 0.05). Seven prognosis-related ARGs (ERBB2, CDKN2A, BAK1, GRID1, NRG3, PTK6, and BIRC5) were considered risk factors (the minimum value of the 95% CI was greater than 1), and their high expression indicates a poor prognosis. Conversely, the high expression of the remaining two genes (DLC1 and P4HB) indicates better survival. Results were visualized using a forest plot (Fig. 4A).
LASSO regression analysis was then conducted to exclude genes that may be highly correlated with other genes. The complexity degree of LASSO regression is determined by the parameter lambda (λ). The larger the λ, the greater the penalty for the linear model with more variables. A model with fewer variables should be selected. We obtained 8 candidate genes (CDKN2A, ERBB2, GRID1, NRG3, PTK6, DLC1, P4HB, and BIRC5) by LASSO regression (Fig. 4B-C).
Multivariate Cox regression analysis of the training cohort indicated that CDKN2A, ERBB2, PTK6, and BIRC5 were independent prognostic factors according to the HR values (CDKN2A: 1.571; ERBB2: 2.310; PTK6: 1.355; and BIRC5: 2.375; P < 0.05)(Table2). Therefore, these genes were used to establish a prognostic model risk score = (0.45 ∗ CDKN2A expression) + (0.84 ∗ ERBB2 expression) + (0.30 ∗ PTK6 expression) + (0.86 ∗ BIRC5 expression).
Table 2
The results of univariate and multivariate Cox regression analysis.
Gene
|
Univariate
|
Multivariate
|
HR(95%CI)
|
P
|
HR(95%CI)
|
P
|
ERBB2
|
2.746(1.616–4.665)
|
0.001
|
2.310(1.084–4.923)
|
0.030
|
CDKN2A
|
1.27(1.130–1.427)
|
0.000
|
1.571(1.290–1.914)
|
0.000
|
BAK1
|
1.496(1.109–2.019)
|
0.008
|
|
|
GRID1
|
1.296(1.022–1.644)
|
0.037
|
|
|
NRG3
|
1.328(1.127–1.564)
|
0.000
|
|
|
PTK6
|
1.27(1.073–1.504)
|
0.006
|
1.355(1.062–1.730)
|
0.015
|
DLC1
|
0.550(1.367–0.822)
|
0.004
|
|
|
P4HB
|
0.626(0.426–0.921)
|
0.017
|
|
|
BIRC5
|
2.132(1.046–4.346)
|
0.033
|
2.375(1.147–4.917)
|
0.020
|
Regression coefficients, P values, HRs, and 95% confidence intervals of the prognosis-related autophagy genes are shown. |
Validation Of The Prognostic Model
We used the median risk value to divide the training set and the verification set into a high-risk group and a low-risk group. Kaplan–Meier plotter results showed that survival rates for high-risk patients in the training set for one, three, and five years were 76.0%, 30.1%, and 13.2%, respectively and survival rates for low-risk patients in the training set for one, three, and five years were 91.0%, 45.6%, and 27.9%, respectively (Fig. 5A). To evaluate the predictive accuracy of the prognostic model, we also plotted a ROC curve, where the area under the curve (AUC) was 0.75 for one-year survival, 0.79 for three-year survival, and 0.80 for five-year survival (Fig. 5B).
Similarly, we conducted a survival analysis in the verification set, and the results showed that survival rates for high-risk patients in the verification set for one, three, and five years were 82.2%, 27.4%, and 13.3%, respectively and survival rates for low-risk patients in the verification set for one, three, and five years were 88.1%, 452.2%, and 25.2%, respectively (Fig. 5C). To evaluate the predictive accuracy of the prognostic model, we also plotted a ROC curve, where the AUC was 0.700 for one-year survival, 0.836 for three-year survival, and 0.820 for five-year survival (Fig. 5D).
In addition to using survival curves to validate our prognostic model, we plotted risk curves for patients with endometrial cancer in the training set and the verification set. In both sets, as the patient's risk value increased, patient mortality increased significantly. A heatmap showed expression of risk genes was up-regulated in the high-risk groups. The results from the training and verification sets were internally consistent (Fig. 6A-F).
Validation Of Risk Genes At The Protein Level
Immunohistochemistry revealed ISH scores for four risk ARGs (CDKN2A, ERBB2, PTK6, and BRIC5) were significantly higher in endometrial cancer tissue than in healthy endometrial tissue, which suggested that these genes are highly expressed in endometrial cancer tissues. We also found that CDKN2A was mainly located in the cytoplasm, membrane, and nucleus, while ERBB2, BRIC5, and PTK6 were mainly located in the cytoplasm and membrane. All immunohistochemical results were derived from the HPA database (Fig. 7).