Gastric cancer (GC) is one of the most common malignant tumors worldwide, with high morbidity and mortality. The commonly used TNM staging system and common biomarkers such as CEA, CA 12-5, CA 19-9 and CA 72-4 have some value in predicting the prognosis of GC patients, but they increasingly fail to meet clinical demands because of their time lag and the large differences in prognosis between individuals. Therefore, constructing a prospective, accurate and individualized prognostic prediction model for GC is one of the most urgent clinical tasks at present.
After differential expression analysis and univariate and multivariate Cox regression analysis, we screened out 5 model genes (F5, MTTP, SERPINE1, CYP19A1 and SLC52A3). In a previous study, Liu Y found that F5 is significantly upregulated in GC tissues and may be a potential prognostic biomarker for GC. SERPINE1 acts as a tumor-promoting gene in gastric cancer; its high expression is significantly associated with poor prognosis in GC patients, and it can be used on its own as a biomarker for the diagnosis and prognosis of GC. CYP19A1 has been found to correlate with poor prognosis in lung cancer and colon/rectal cancer, but it has not yet been studied in GC.
After calculating the risk score and constructing the prognostic prediction model, we assessed its effectiveness and stability by first using the TCGA-STAD testing cohort for internal validation and then three independent datasets for external validation. All of these results suggest that patients in the high-risk group had worse OS. We also evaluated the impact of each of the 5 model genes on patients' OS individually; all results were consistent with our risk score formula.
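As a minimal sketch of how a risk score of this kind is typically computed, the Python example below forms a linear combination of the 5 model genes' expression values and splits patients into high- and low-risk groups. The coefficient values, their signs, the median cutoff and the function names are all assumptions made for illustration; they are not the fitted values or the exact procedure of this study.

```python
from statistics import median

# HYPOTHETICAL coefficients for illustration only; these are NOT the
# multivariate Cox coefficients fitted in this study.
COEFFICIENTS = {
    "F5": 0.10,
    "MTTP": 0.20,
    "SERPINE1": 0.30,
    "CYP19A1": 0.40,
    "SLC52A3": -0.50,
}


def risk_score(expression):
    """Linear predictor: sum over model genes of coefficient * expression."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def assign_risk_groups(patients, cutoff=None):
    """Score every patient and label them 'high' or 'low' risk.

    patients: dict mapping patient id -> dict of gene expression values.
    cutoff:   score threshold; defaults to the cohort median (an assumption,
              the source does not state which cutoff was used).
    """
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    if cutoff is None:
        cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, groups, cutoff
```

When a validation cohort is scored, the cutoff derived from the training cohort can be passed in explicitly through `cutoff` so that both cohorts are split on the same threshold.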
To study whether differences in patients' clinical characteristics affect the predictive performance of the model, we further divided the entire TCGA-STAD cohort into subgroups according to age, gender, tumor grade, clinical stage and TNM stage. The KM analysis results were all in line with expectations, showing that our model is robust to these clinical characteristics and suggesting that it has broad clinical predictive value.
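The subgroup KM analysis described above can be sketched as follows. This is an illustrative pure-Python Kaplan-Meier estimator applied within each combination of clinical subgroup and risk group; the record field names (`time`, `event`, `risk_group`) and both function names are hypothetical and do not come from the study's code.

```python
from collections import defaultdict


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times:  follow-up durations.
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve


def km_by_subgroup(records, clinical_key):
    """Estimate one KM curve per (clinical subgroup, risk group) pair.

    records: list of dicts with 'time', 'event', 'risk_group' and the
             clinical field named by clinical_key (hypothetical schema).
    """
    buckets = defaultdict(lambda: ([], []))
    for rec in records:
        times, events = buckets[(rec[clinical_key], rec["risk_group"])]
        times.append(rec["time"])
        events.append(rec["event"])
    return {key: kaplan_meier(t, e) for key, (t, e) in buckets.items()}
```

Calling `km_by_subgroup(records, "gender")`, for example, would return separate high- and low-risk curves for each gender, which can then be compared within each subgroup.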
To further analyze the specificity and sensitivity of the model, 32 TCGA tumor types were selected for external validation. KM analysis showed that a higher risk score was associated with lower OS in 22 of these tumor types. Our explanation for this result is that these tumors may have similar metabolic characteristics and share some nodes in their metabolic pathways. Identifying these nodes may contribute to more in-depth study of the mechanisms of these tumors and to updating existing prognostic prediction models.
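A comparison of OS between high- and low-risk groups in each tumor type is commonly accompanied by a two-group log-rank test. The source does not state which test was used, so the sketch below is an assumption: a standard log-rank statistic with a chi-square (1 degree of freedom) p-value, written in pure Python with a hypothetical function name.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*:  follow-up durations for each group.
    events_*: 1 if death was observed, 0 if censored.
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)
    all_pairs = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({t for t, e in all_pairs if e})

    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Deaths occurring exactly at time t.
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value
```

Running such a test once per tumor type and counting the tumor types in which the high-risk group has significantly worse OS would give a tally of the kind reported above.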
In addition, GO and KEGG analyses were performed to further explore the potential biological functions of the LM-DEGs, and all enriched terms were found to be associated with cellular energy metabolism. To study these biological functions in more depth, we performed GSEA to compare pathway enrichment between the high- and low-risk groups in the entire TCGA-STAD cohort. Interestingly, all the "mitochondrial-related" pathways were more enriched in the low-risk group, suggesting that the mitochondrial function of GC cells in the high-risk group was inhibited to some extent. This is consistent with the fact that, even when oxygen is sufficient, GC cells still take up large amounts of glucose and metabolize it into lactate.
Importantly, tumor-infiltrating immune cells appear to have increasingly significant prognostic value in tumors[29, 30], including GC. In our study, activated mast cells were more enriched in the high-risk group and were associated with poor prognosis. Previous studies have found that mast cells enriched in tumor tissue can produce large amounts of pro-angiogenic factors, which promote angiogenesis and lead to poor prognosis[32, 33]. Sammarco found the same phenomenon in GC. M1 macrophages, activated NK cells, CD8 T cells and follicular helper T cells were more abundant in patients in the low-risk group, and the latter three significantly affected patients' OS. Husain found that tumor-derived lactate inhibits NK cell function. Fischer found that tumor cells can increase the surrounding lactate concentration and thereby inhibit the function of cytotoxic T cells. Interestingly, M1 and M2 macrophages were found to have almost opposite effects on patients' prognosis, and the former have been shown to improve patients' OS.
As for the tumor microenvironment analysis, Baumann found that lactate induces TGF-β expression, and other researchers have shown that TGF-β promotes GC malignancy and metastasis, which can be suppressed by attenuating TGF-β. Notably, in our study SERPINE1 and CYP19A1, which may act as oncogenes, were found to be associated with Immune C6 (TGF-β dominant), which is consistent with the above research. SLC52A3 was found to be associated with Immune C2 (IFN-γ dominant). Intriguingly, IFN-γ was once recognized as a tumor-killing cytokine, but its role has become controversial in recent years and its mechanism is not clear. Based on the ESTIMATE algorithm, patients in the high-risk group had significantly higher ESTIMATE Scores and Stromal Scores, indicating a correlation with poor prognosis. Higher expression of SLC52A3 was associated with a lower ESTIMATE Score, indicating a correlation with better prognosis. Some of the above results, as well as the results for immune checkpoint molecules, were not in line with expectations, which we think may be related to the complexity and polyvalence of metabolic pathways. On the whole, however, these results accord with our expectations.
According to our drug sensitivity analysis, GC patients in the high-risk group showed lower sensitivity to Camptothecin, Cisplatin, Mitomycin C and Vinblastine than those in the low-risk group, whereas sensitivity to Docetaxel, Doxorubicin, Erlotinib, Paclitaxel and Sorafenib was not significantly reduced. Meanwhile, expression of the 5 model genes was found to be related to increased or decreased sensitivity to a number of chemotherapeutic drugs. These results can provide a new basis for clinicians to formulate individualized chemotherapy regimens for GC patients in the high-risk group based on our prognostic prediction model.
However, our study has some limitations. Most of our data were downloaded from public databases, the clinical information of some patients was incomplete, and only a small number of cases came from our center. In addition, because this is a retrospective study, information bias is inevitable, so the clinical predictive value of the model needs to be confirmed by more convincing prospective studies in future clinical practice.