1. Clinical characteristics of patients
In the training cohort, 1191 patients with gastric cancer underwent laboratory testing at first diagnosis, including platelet count, white blood cell count, lymphocyte count, and tumor markers such as CEA, AFP and CA125. During the 5-year follow-up, there were 1095 cases of GC and 96 cases of GCLM (gastric cancer liver metastasis). Of the 309 cases in the external validation cohort, 281 were GC and 28 were GCLM (see Table 1).
2. Identification of independent prognostic factors for GCLM
Univariate logistic regression was performed on the clinicopathological indicators. Tumor site (OR = 0.50, P = 0.010), CEA (OR = 1.00, P < 0.001), CA125 (OR = 1.02, P < 0.001), CA242 (OR = 1.02, P < 0.001), CA199 (OR = 1.00, P < 0.001), CA724 (OR = 1.02, P < 0.001), OPNI (OR = 0.90, P < 0.001), N stage (OR = 5.32, P < 0.001), age (OR = 1.03, P = 0.012), gender (OR = 0.48, P < 0.001), total protein (OR = 0.97, P = 0.033), WBC (OR = 1.09, P = 0.042) and Hb (OR = 0.98, P < 0.001) were all significantly associated with liver metastasis of gastric cancer, whereas tumor T stage, AFP, PLR, NLR, RBC and PLT were not (Table 2). The 13 indicators that were significant in the univariate analysis were then entered into multivariate logistic regression. Tumor site (OR = 0.55, P = 0.046), CEA (OR = 1.00, P = 0.018), CA242 (OR = 1.01, P = 0.006), CA724 (OR = 1.01, P = 0.016), OPNI (OR = 0.95, P = 0.041), N stage (OR = 4.95, P = 0.004), gender (OR = 0.40, P = 0.001), WBC (OR = 1.13, P = 0.024) and Hb (OR = 0.98, P < 0.001) were independent risk factors for liver metastasis of gastric cancer (see Table 2).
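Several of the reported odds ratios lie close to 1 yet are statistically significant. One possible reading, assuming the continuous indicators were entered per unit of measurement (the units are not stated in this section), is that a per-unit OR compounds over a larger change in the indicator. The sketch below illustrates that arithmetic only; the function name and the increments are hypothetical and are not taken from the study.

```python
import math

def scaled_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Odds ratio implied by a change of `delta` units, given a per-unit OR.

    In a logistic model the log-odds are linear in the predictor, so the
    per-unit OR compounds multiplicatively: OR(delta) = OR(1) ** delta.
    """
    if or_per_unit <= 0:
        raise ValueError("an odds ratio must be positive")
    return math.exp(delta * math.log(or_per_unit))

# Per-unit ORs are the multivariate values quoted in the text;
# the increments (+5, +20) are illustrative assumptions, not study data.
print(round(scaled_odds_ratio(1.13, 5), 2))   # WBC, hypothetical +5 units
print(round(scaled_odds_ratio(0.98, 20), 2))  # Hb, hypothetical +20 units
```

Under this assumption, an OR reported as 1.00 after rounding may still correspond to a meaningful effect across the wide range that tumor markers such as CEA can span.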
3. Development of a prognostic nomogram in the training cohort
Based on the univariate and multivariate logistic regression results, the 9 independent risk factors for gastric cancer liver metastasis were used to build a risk prediction model combining hematological indicators, demographic information and the primary site of the gastric cancer. Entering these 9 indicators into R yielded the nomogram shown in Figure 1, which displays the risk factors for liver metastasis more intuitively. The nomogram lists the score corresponding to each group of every independent risk factor. Summing a patient's scores across the factors gives a total score, from which the probability of liver metastasis is read; the higher the total score, the higher the risk of developing liver metastasis. The C statistic is generally used to evaluate the predictive accuracy of a model, and the closer it is to 1, the better the model predicts. The C statistic of this model was 0.887 (95% CI: 0.857-0.916). Internal validation used the bootstrap method, with 1,000 resamples drawn with replacement from the study sample, and produced the calibration curve shown in Figure 1. The closer the prediction curve lies to the standard (ideal) curve, the better the predictive ability of the model. The nomogram established above was then applied to the external validation cohort to generate calibration curves, and achieved a C statistic of 0.914 (95% CI: 0.864-0.963).
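To make the C statistic concrete, the following is a minimal sketch of how it can be computed for a binary outcome, together with a percentile bootstrap interval. It is not the authors' R code, and the method behind the reported confidence intervals is not stated in the text. The function names and the toy scores and labels are hypothetical.

```python
import random

def c_statistic(scores, labels):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted score; tied scores count as half a concordant pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C statistic (resampling with replacement)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one outcome class
            stats.append(c_statistic([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data (assumed, not from the study): predicted risks and observed outcomes.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(c_statistic(toy_scores, toy_labels))
print(bootstrap_ci(toy_scores, toy_labels))
```

For a binary outcome the C statistic equals the area under the ROC curve, which is why the same value can be reported under either name.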
4. ROC analysis and decision curve analysis (DCA) for predicting the probability of GCLM in the training and external validation cohorts
ROC analysis was performed to compare the combined nomogram model with single risk factors (see Figure 2). The model achieved an AUC of 0.887 (95% CI: 0.857-0.916), outperforming each single variable: OPNI (AUC = 0.648), PLR (AUC = 0.610), NLR (AUC = 0.676), CEA (AUC = 0.685), AFP (AUC = 0.699), CA125 (AUC = 0.695), CA242 (AUC = 0.651), CA199 (AUC = 0.643) and CA724 (AUC = 0.619). Decision curve analysis (DCA) also showed that the nomogram provided greater diagnostic value than any single variable (see Figure 3).
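Decision curve analysis compares models by their net benefit across threshold probabilities. As a rough sketch of the standard net-benefit formula, TP/n - (FP/n) * pt/(1 - pt), the code below evaluates a model against the "treat all" strategy at one threshold. The function names and the toy data are hypothetical, and this is not the software used to produce Figure 3.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every patient as high risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy data (assumed, not from the study): predicted risks and observed outcomes.
toy_probs = [0.05, 0.10, 0.20, 0.30, 0.60, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
pt = 0.25  # illustrative threshold probability
print(net_benefit(toy_probs, toy_labels, pt))
print(net_benefit_treat_all(toy_labels, pt))
```

A decision curve repeats this calculation over a range of thresholds; a model is preferred where its curve lies above both the "treat all" and "treat none" (net benefit 0) references.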