Construction of a Lymph Node Metastasis Nomogram Prediction Model Based on Dual-Energy CT Radiomics of Gastric Cancer Lesions

Hong-li CUN, The Third Affiliated Hospital of Kunming Medical University: Yunnan Cancer Hospital
Qian-ting DUAN, The Third Affiliated Hospital of Kunming Medical University: Yunnan Cancer Hospital
Ying-ying DING, The Third Affiliated Hospital of Kunming Medical University: Yunnan Cancer Hospital
Da-fu Zhang, The Third Affiliated Hospital of Kunming Medical University: Yunnan Cancer Hospital
Ling YANG, The Third Affiliated Hospital of Kunming Medical University: Yunnan Cancer Hospital
Hua-xiu LI, The Third Affiliated Hospital of Kunming Medical University: Yunnan Cancer Hospital
Na WANG, The Third Affiliated Hospital of Kunming Medical University: Yunnan Cancer Hospital
Lei SHI, Siemens Healthcare GmbH
Guan-shun Wang (wgsh602@vip.163.com), The Third Affiliated Hospital of Kunming Medical University: Yunnan Cancer Hospital

the IC and nIC were higher in the patients with LNM than in those without LNM (P<0.05). A random forest (RF) model based on the radiomics features extracted from the GC lesions had high diagnostic value for predicting whether the lymph nodes of patients with GC were metastatic. The AUC of the RF model was 0.959 for the training set and 0.977 for the test set. The AUC of the nomogram for predicting LNM was 0.996 for the training set and 0.976 for the test set.
Conclusion Models based on preoperative serum tumor markers (CA125, CA199, and CEA) in patients with GC, quantitative dual-energy CT parameters of the lesions (AP and VP IC and nIC), and radiomics have high diagnostic value for LNM. The nomogram that combines these parameters has higher diagnostic value for LNM and can provide a reliable basis for its preoperative evaluation.

Background
Gastric cancer (GC) is one of the most common malignant tumors of the digestive system and ranks third in cancer-related deaths worldwide [1]. The most important factor affecting the prognosis of GC patients is lymph node metastasis (LNM) [2]. The detection of metastasis in the lymph nodes also plays a vital role in the choice of treatment [3,4]. Accurate determination of lymph node status is therefore one of the key factors affecting the treatment strategy and prognostic evaluation of GC. The American Joint Committee on Cancer gastric cancer staging system recommends endoscopy and CT as the first choice for the evaluation of GC [5]. Dual-energy CT (DECT), moreover, can be used not only for preoperative staging of GC and the evaluation of neoadjuvant efficacy, but also for predicting the status of the lymph nodes around the stomach [6][7][8]. The newly released Japanese gastric cancer guidelines (ver. 5) state that if the risk of lymph node metastasis is < 0.01, endoscopic resection has the same effect as surgery, and that neoadjuvant therapy is recommended for locally advanced patients only if there is a large number of lymph node metastases [9]. According to the Japanese GC guidelines (ver. 4) published in 2014, endoscopic resection is the best choice for early-stage GC without LNM because D2 gastrectomy has higher postoperative complication and mortality rates [10]. Accurate preoperative assessment of lymph node status is therefore very important for planning the next stage of clinical treatment.
During a DECT scan, the high and low tube voltages generate X-rays simultaneously to obtain two independent sets of images. An iodine map is then calculated through post-processing, from which parameters such as the iodine concentration (IC), the normalized iodine concentration (nIC), the virtual flat (non-contrast) scan, and the slope of the energy spectrum curve can be measured. IC is closely related to microvessel density (MVD) and vascular endothelial growth factor and can reflect tumor angiogenesis, which is essential for evaluating tumor progression and metastasis.
Radiomics uses a combination of artificial intelligence and machine learning to extract large amounts of data from medical images and performs high-throughput quantitative analysis to obtain high-fidelity information for the comprehensive evaluation of various malignant tumors. These features can be analyzed together with traditional imaging, molecular biology, molecular pathology, and other data to improve the reliability and accuracy of preoperative tumor diagnosis, efficacy evaluation, and prognosis prediction, and to support individualized treatment for patients.
The purpose of this study was to combine radiomics of the primary GC lesion with dual-energy CT parameters and preoperative laboratory examinations to build a more reliable model for predicting LNM in GC. Such a model can support more personalized treatment by predicting patient outcomes and can guide the clinical development of appropriate treatment plans.

Materials And Methods
1.1 Patient information: Imaging data were collected from 177 patients with GC who underwent a dual-energy CT scan in our hospital from October 2017 to April 2019. The study population included 113 males and 64 females with an average age of 57.36±10.87 years (range 27-83 years).
1.2 Inclusion criteria: (1) diagnosis of GC by gastroscopy and postoperative pathological biopsy; (2) no treatment received before the first consultation; (3) no contraindications for CT examination; (4) D2 gastrectomy performed within 1 week after the dual-energy CT scan, with pathological results available; (5) serum tumor marker test performed within 2 days of admission.
1.3 Exclusion criteria: (1) refusal to undergo a dual-energy CT scan, or contrast allergy; (2) serious impairment of liver or kidney function; (3) poor patient cooperation resulting in image quality that still did not meet the requirements after administration of anisodamine (654-2). This study was reviewed and approved by the ethics committee of our hospital, and all patients signed an informed consent form before examination.
1.4 CT scanning method: A third-generation dual-source CT scanner (SOMATOM Force, Siemens Healthcare, Forchheim, Germany) was used for all examinations. The patients fasted for 6-8 hours before the CT examination as per standard gastric CT protocols. Anisodamine (654-2; 10 mg) was administered intramuscularly 15-20 min before scanning to inhibit gastrointestinal motility. Before the scan started, the patients took 2 bags of gas-generating powder (25 g/bag) and were trained to hold their breath under the guidance of a technician. All patients underwent a plain CT scan and a three-phase enhanced scan. The contrast agent for the enhanced scan was ioversol (320 mgI/mL). The amount of contrast medium was calculated according to the patient's body weight (2 mL/kg) and injected at a flow rate of approximately 3.5-4.0 mL/s. The arterial phase was acquired using bolus tracking with a trigger CT value of 100 HU, and the parenchymal phase was scanned after a 60 s delay. The tube voltages and currents were: tube A, 100 kV and 350 mAs; tube B, Sn150 kV and 233 mA. The other scan parameters were: real-time dynamic exposure dose adjustment (CARE Dose 4D; CARE, combined application reduced exposure), collimation 128 mm × 0.6 mm, pitch 0.7, and tube rotation time 0.25 s. After processing, the two data sets were fused into a 125 kV image (fusion coefficient of 0.5, that is, a 50% 100 kV to 50% Sn150 kV data ratio) using the following parameters: reconstruction matrix 512×512, 0.25 s/rotation, reconstruction layer thickness 0.75 mm, and layer spacing 0.5 mm.
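The fusion step can be illustrated with a minimal sketch, assuming that the fused image is a simple voxel-wise weighted average of the two co-registered data sets. The scanner's actual reconstruction is not described in this paper, and the function name `blend_dual_energy` and the sample values are hypothetical.

```python
import numpy as np


def blend_dual_energy(low_kv, high_kv, weight_low=0.5):
    """Voxel-wise linear blend of co-registered low- and high-kV images (HU).

    weight_low is the fraction taken from the low-kV data (assumed 0.5 here,
    matching the fusion coefficient quoted in the text).
    """
    low_kv = np.asarray(low_kv, dtype=float)
    high_kv = np.asarray(high_kv, dtype=float)
    if low_kv.shape != high_kv.shape:
        raise ValueError("images must have the same shape")
    if not 0.0 <= weight_low <= 1.0:
        raise ValueError("weight_low must be in [0, 1]")
    return weight_low * low_kv + (1.0 - weight_low) * high_kv


# Hypothetical 2x2 patches of HU values, for illustration only.
low = np.array([[120.0, 80.0], [60.0, 40.0]])
high = np.array([[80.0, 60.0], [40.0, 20.0]])
blended = blend_dual_energy(low, high, weight_low=0.5)
print(blended)
```

Changing `weight_low` to 0.6 would correspond to a 60%/40% mix of the low- and high-kV data.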
1.5 Results of pathological biopsy: During the operation, the lymph nodes removed from around the stomach were recorded. Using the surgical pathology results as the gold standard, patients were placed in the metastatic group if at least 1 lymph node was diagnosed as metastatic and in the non-metastatic group if no lymph node was classified as metastatic.
1.6 CT image post-processing analysis: A radiologist transferred the dual-energy data (100 kV, Sn150 kV) to the prototype software (eXamine: DE Tumor Evaluation, Siemens Healthineers, Forchheim, Germany). The software automatically creates three image sets: arterial phase and venous phase blended images with a mixed ratio of 0.6 (60% 100 kV and 40% tin-filtered 150 kV), virtual non-contrast images, and material density iodine images. Image analysis was performed on the mixed images. One fellowship-trained body radiologist (6 years of gastric imaging experience) identified the GC in the mixed images and independently drew a volume of interest (VOI) across the maximum dimension of the tumor in any plane (axial, coronal, or sagittal). The software then automatically segmented the entire tumor volume and calculated the tumor volume and maximal diameter. In addition, the mean HU of the mixed-energy images was displayed. If the tumor was not correctly segmented at the first attempt, the tumor margin was manually edited with reference to the arterial phase and venous phase images. A region of interest (ROI) was manually drawn in the aorta to calculate the normalized iodine concentration (nIC: iodine concentration in the lesion / iodine concentration in the abdominal aorta at the same level) to minimize inter-subject variation. The CT images were imported into Siemens syngo.via Frontier Radiomics software. The volume of the lesion was outlined semi-automatically, and a total of 1672 radiomics features were extracted from each lesion.
1.7 Statistical analysis: SPSS 25.0, GraphPad Prism 8.0, Python 3.7 with scikit-learn, and R 3.5.2 were used for statistical analysis. Measurement data that conformed to a normal distribution were expressed as mean ± standard deviation (SD) and compared between groups using the t-test; measurement data that did not conform to a normal distribution were expressed as median and interquartile range and compared using the Wilcoxon rank-sum test. Count data were expressed as number of cases and percentage and compared using the chi-square test. Missing values were imputed with the mean value. A P value <0.05 was considered statistically significant. ROC curves were drawn to evaluate the effectiveness of the radiomics model for the diagnosis of LNM, and ROC analysis was used to determine the optimal IC threshold for judging whether LNM was present. The feature extraction algorithm of Siemens syngo.via Frontier Radiomics software was used to extract the radiomics features, Python scikit-learn was used to construct a random forest (RF) model, and R was used to construct a clinical-radiology nomogram.
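The nIC normalization from section 1.6 and the ROC-based cut-off from section 1.7 can be sketched as follows. This is an illustrative sketch on synthetic numbers, not the authors' code. The paper does not state which criterion was used to pick the optimal threshold, so the Youden index (sensitivity + specificity - 1) is an assumption here, and the helper names are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve


def normalized_iodine_concentration(ic_lesion, ic_aorta):
    """nIC = IC in the lesion / IC in the abdominal aorta at the same level."""
    ic_lesion = np.asarray(ic_lesion, dtype=float)
    ic_aorta = np.asarray(ic_aorta, dtype=float)
    if np.any(ic_aorta <= 0):
        raise ValueError("aortic iodine concentration must be positive")
    return ic_lesion / ic_aorta


def youden_threshold(y_true, scores):
    """Return (cut-off, sensitivity, specificity) maximizing the Youden index."""
    fpr, tpr, thresholds = roc_curve(y_true, scores)
    best = int(np.argmax(tpr - fpr))
    return thresholds[best], tpr[best], 1.0 - fpr[best]


# Synthetic example values (mg/mL); not data from this study.
ic_lesion = np.array([1.0, 1.2, 1.5, 2.4, 2.8, 3.0])
ic_aorta = np.array([10.0, 10.0, 10.0, 10.0, 10.0, 10.0])
lnm = np.array([0, 0, 0, 1, 1, 1])  # 1 = metastatic group

nic = normalized_iodine_concentration(ic_lesion, ic_aorta)
auc = roc_auc_score(lnm, nic)
cutoff, sensitivity, specificity = youden_threshold(lnm, nic)
print(auc, cutoff, sensitivity, specificity)
```

The synthetic groups are perfectly separated, so the example only demonstrates the mechanics; real data would give an AUC below 1 and a trade-off between sensitivity and specificity.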

Patient clinical features
The clinical features of the 177 patients with GC enrolled in this study are shown in Table 1. All patients were tested for 7 serum tumor markers within 2 days of admission (Table 2). As shown in Table 3, the arterial and venous phase IC and nIC of the GC lesions were higher in the LNM group than in the non-LNM group, and the difference between the 2 groups was statistically significant (P<0.05). As shown in Figure

Sensitivity and specificity of multiple parameters for predicting LNM in patients with GC
The markers CA125, CA199, and CEA, the arterial and venous phase IC and nIC, and the RF model (training set and test set) were each used separately to predict lymph node status. The corresponding AUC value, sensitivity, and specificity of each parameter are shown in Table 4. In both the training set and the test set, the RF model had higher AUC values, sensitivities, and specificities than CA125, CA199, CEA, and the arterial and venous phase IC and nIC.
2.6 Comparison of the nomogram diagnosis of GC between the metastatic lymph node group and the non-metastatic lymph node group
The available clinical variables, including age, gender, AIC, nAIC, VIC, nVIC, and the various serum tumor markers, were analyzed using the rank-sum test. R software was used to establish a nomogram (Figure 7) by combining the variables with a univariate P<0.05 with the radiomics features. The data were randomly divided into a training set and a test set at a 7:3 ratio.
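The 7:3 random split and the AUC evaluation of a random forest can be sketched with scikit-learn as follows. The feature matrix is synthetic, and the number of trees, the stratified split, and the random seeds are assumptions, since the paper does not report its hyperparameters or feature-selection steps.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a radiomics feature matrix (177 patients, 50 features).
X, y = make_classification(
    n_samples=177, n_features=50, n_informative=8, random_state=0
)

# 7:3 split; stratification keeps the LNM ratio similar in both sets (assumed).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical hyperparameters, chosen only for illustration.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)

# AUC from the predicted probability of the positive (metastatic) class.
train_auc = roc_auc_score(y_train, rf.predict_proba(X_train)[:, 1])
test_auc = roc_auc_score(y_test, rf.predict_proba(X_test)[:, 1])
print(f"train AUC = {train_auc:.3f}, test AUC = {test_auc:.3f}")
```

A random forest usually fits its own training data almost perfectly, so the training-set AUC tends to be optimistic and the test-set AUC is the more informative of the two.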

Discussion
We constructed and internally validated a radiomics nomogram for personalized prediction of LNM in patients with GC prior to surgery. The nomogram includes the following preoperative parameters: serum tumor markers, IC and nIC of GC lesions in the arterial and venous phases, and radiomics scores. The nomogram successfully stratified the patients according to their LNM risk, and it can also be used to judge the prognosis and net benefit of the patients. Integrating the serum tumor markers of patients with GC, the IC and nIC of GC lesions in the arterial and venous phases, and the radiomics score into an easy-to-use nomogram can help to make individualized predictions of LNM prior to surgery in patients with GC.
This study analyzed the ability of serum tumor markers to predict LNM in patients with GC and identified CA125, CA199, and CEA as statistically different between the LNM and non-LNM groups (P < 0.005). Gastric cancer serum tumor markers are not useful for predicting early cancer, but are useful for detecting recurrence and distant metastasis, for predicting patient survival, and for postoperative monitoring. In particular, the 3 serum markers CEA, CA199, and CA724 are significantly related to the stage of the tumor and the survival rate of the patient. The level of these markers may be increased in different stages of GC [11]. For example, CEA is associated with the T, N, and M stages, and an increased CEA level is an independent risk factor for predicting liver metastases [12]. An analysis of CEA levels in 221 cases of highly differentiated GC found that CEA-positive tumors are larger, are more likely to break through the serosa, and are more likely to involve lymphatic and blood vessels [13]; these patients are also more likely to have lymph node and liver metastases. CA199 is associated with tumor invasion depth, lymph node and peritoneal metastasis, and TNM staging; in addition, CA199 is commonly reported to be associated with LNM [11]. Kodera et al. [14] found that increased levels of CA199 are closely related to liver metastases. The positive rate of CA724 is higher than that of CEA and CA199. CA724 is also related to the depth of tumor invasion, lymph node and peritoneal metastasis, and TNM staging, and the positive rate of CA724 in patients with peritoneal metastasis is significantly higher than that of CEA. CA724 is also related to the Borrmann classification: in patients with type 2, 3, and 4 tumors, the positive rate of CA724 is higher than that of CEA. Bai et al. [15] found that CA724, CA199, CA242, and CEA have clinical value in the diagnosis of LNM in patients with GC.
In this study, the serum tumor markers CA125, CA199, and CEA were also statistically different between the LNM and non-LNM groups (P < 0.005), with AUC values of 0.62, 0.65, and 0.69, respectively. Although the CA724 level in the LNM group was slightly higher than that in the non-LNM group, there was no statistical difference between the 2 groups, and the sample size needs to be expanded to further analyze and verify the diagnostic efficacy of CA724.
In addition, this study used dual-energy CT (DECT) quantitative imaging parameters of GC lesions, including arterial and venous phase IC and nIC, and radiomics features to predict LNM, and there was a statistical difference between the metastatic and non-metastatic lymph node groups (P < 0.005). Li et al. [16] found that the VIC and nVIC of GC lesions can predict LNM and that these values are higher in the metastatic group than in the non-metastatic group. These results are consistent with this study. However, this study found that both the arterial phase and the venous phase IC and nIC were statistically different between GC lesions with and without LNM (P < 0.001), with AUC values of 0.83 for AIC, 0.79 for nAIC, 0.91 for VIC, and 0.87 for nVIC, which indicates a certain effectiveness in predicting LNM. This may be because the IC measurements in the arterial and venous phases in this study covered at least 3/4 of the area of the lesion and of the aorta at the same level, which is more representative of the tissue features. In this study, the features of GC lesions were used to predict LNM. The AUC value, sensitivity, and specificity of the RF model were higher than those of other parameters reported in the literature. Wang et al. [17] used a GC lesion radiomics model to predict LNM, and the AUC values were 0.844 for the training set and 0.837 for the test set. In our study, both the training set and the test set had higher diagnostic efficiency (AUCs: training set 0.996 and test set 0.976), mainly because this study extracted the radiomics features of the entire GC lesion, which can more fully reflect the tissue features of the lesion.
In addition, this study combined serum tumor markers from patients with GC, DECT quantitative imaging parameters of GC lesions (arterial and venous phase IC and nIC), and radiomics to construct a nomogram. The AUC value, sensitivity, and specificity of the multiparameter nomogram were higher than those of any single parameter. A study that used the gender and tumor features of patients with gastric signet ring cell carcinoma to construct a nomogram for predicting LNM [18] also showed good prediction performance, and a radiomics-based nomogram for predicting LNM has also been reported [17]. In this study, the RF model established from the radiomics features, which were randomly divided into a training set and a test set at a ratio of 7:3, resulted in AUC values of 0.977 and 0.959, respectively. The diagnostic efficacy of this study is higher than that of other studies for several reasons. First, this study extracted the radiomics features of the entire GC lesion, which is more representative and reflects the tissue features of the entire lesion. Second, this study is a prospective study, and all patients were scanned on the same machine. Third, this study used a combination of the radiomics score, arterial and venous phase IC and nIC, and serum tumor markers to construct the nomogram. Very good values (AUC, specificity, and sensitivity) were observed for both the training and test sets. Since the nomogram in this study includes more parameters, the combined use of serum tumor markers, quantitative imaging parameters, and radiomics features of GC patients to construct an LNM prediction model may more fully reflect the pathophysiology of GC lesions and lymph nodes. Together, these features result in a diagnostic efficacy that is higher than that of other studies in the current literature.
In addition, this study constructed the calibration curves and decision curve analysis (DCA) curves of the nomogram for the training set and the test set to provide the clinic with a more comprehensive model for personalized preoperative prediction of LNM in GC.
The study has the following shortcomings. First, it focused only on whether LNM was positive or negative and did not study the role of the nomogram in predicting the N stage (N1-N3b) or metastasis at each lymph node station (16 stations); this will require further study. Second, this study conducted only internal validation and lacked external validation, so the robustness of the model needs further verification.

Conclusion
In summary, this study provides a radiomics nomogram that combines preoperative serum tumor markers, radiomics features, and dual-energy CT parameters of patients with GC, and that can be conveniently used for individualized preoperative prediction of LNM in patients with GC.

Box plots of the arterial phase and venous phase IC and nIC of GC lesions in the LNM group and the non-LNM group.

Figure 4
ROC curves of the arterial phase and venous phase IC and nIC of GC lesions for distinguishing the LNM group from the non-LNM group.

Figure 6
The ROC curve of the RF model test set using 1672 radiomics features from patients with GC.

Figure 7
The nomogram for predicting LNM in patients with GC. The value of each variable on the second to ninth lines corresponds vertically to a score on the first line, "Points". The points from the second to ninth lines are added to obtain a total score, which is then located on the "Total Points" axis on the tenth line. Finally, each patient's "Total Points" corresponds vertically to the "Risk" of LNM on the eleventh line, which gives the probability of LNM for that patient.
The ROC curve of the nomogram for predicting LNM in patients with GC in the test set.
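The way a nomogram is read can be sketched in code under a loudly stated assumption: that the nomogram is a graphical form of a logistic regression model, which the paper does not confirm. The intercept, coefficients, and ranges below are hypothetical, and all coefficients are assumed positive for simplicity.

```python
import math

# Hypothetical logistic-regression model; not the coefficients from this study.
INTERCEPT = -6.0
PREDICTORS = {
    # name: (coefficient, minimum, maximum of the plotted axis)
    "venous_ic": (1.5, 0.0, 4.0),
    "cea": (0.1, 0.0, 20.0),
}

# The predictor with the largest effect over its range spans 0-100 points.
MAX_EFFECT = max(beta * (hi - lo) for beta, lo, hi in PREDICTORS.values())


def points(name, value):
    """Points for one predictor, read off the common 'Points' axis."""
    beta, lo, hi = PREDICTORS[name]
    clipped = min(max(value, lo), hi)
    return 100.0 * beta * (clipped - lo) / MAX_EFFECT


def risk_from_total_points(total):
    """Map 'Total Points' back to a probability through the logistic function."""
    baseline = INTERCEPT + sum(beta * lo for beta, lo, _ in PREDICTORS.values())
    linear_predictor = baseline + total * MAX_EFFECT / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


patient = {"venous_ic": 3.0, "cea": 10.0}
total_points = sum(points(name, value) for name, value in patient.items())
risk = risk_from_total_points(total_points)
print(round(total_points, 1), round(risk, 3))
```

Under this assumption the point scale is only a rescaling of the linear predictor, so summing points and reading the risk axis gives the same probability as evaluating the model directly.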

Figure 10
The calibration curve of the nomogram for the training set. The calibration curve verifies the nomogram: the abscissa is the probability of LNM predicted by the nomogram, and the ordinate is the actual probability of LNM. The red line on the diagonal indicates the ideal situation, in which all probabilities predicted by the nomogram agree with the true probabilities. Whether the confidence interval is 80% or 95%, it lies near the diagonal and the deviation is small, indicating that the prediction of the nomogram is relatively accurate.

Figure 11
The calibration curve of the nomogram for the test set. The calibration curve verifies the nomogram: the abscissa is the probability of LNM predicted by the nomogram, and the ordinate is the actual probability of LNM. The red line on the diagonal indicates the ideal situation, in which all probabilities predicted by the nomogram agree with the true probabilities.
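The idea behind a calibration curve can be sketched with a binned reliability curve from scikit-learn. This is a simpler construction than the curves with confidence intervals described in the captions, and the predictions below are simulated rather than taken from the nomogram.

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Simulated, well-calibrated predictions: each outcome is drawn with the
# predicted probability, so the curve should lie close to the diagonal.
rng = np.random.default_rng(0)
predicted = rng.uniform(0.0, 1.0, size=2000)
observed = (rng.uniform(0.0, 1.0, size=2000) < predicted).astype(int)

# Observed event fraction (ordinate) vs. mean predicted probability (abscissa).
frac_positive, mean_predicted = calibration_curve(observed, predicted, n_bins=5)

# Largest vertical distance from the ideal diagonal across the bins.
max_gap = float(np.max(np.abs(frac_positive - mean_predicted)))
print(np.round(mean_predicted, 2), np.round(frac_positive, 2), round(max_gap, 3))
```

A small `max_gap` corresponds to a curve that stays near the diagonal, which is the visual criterion used in the captions.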

Figure 12
Decision curve analysis (DCA) of the nomogram and the RF model for the training set. The horizontal axis represents the threshold probability, and the vertical axis represents the net benefit. The solid red line indicates the nomogram model, the solid blue line indicates the RF model, the black curve represents the assumption that all patients have LNM, and the black horizontal line represents the assumption that no patient has LNM. The threshold probability is the probability at which the expected benefit of performing treatment is equal to the expected benefit of not performing treatment.
As can be seen from the DCA, for threshold probabilities ranging from 0.01 to 0.98 in the training set and from 0.01 to 0.90 in the test set, using either the nomogram model or the RF model brings a net benefit to the patient compared with intervening in all patients or in none. Furthermore, the net benefit of the radiomics nomogram model is higher than that of the RF model.

Figure 13
Decision curve analysis (DCA) of the nomogram and the RF model for the test set. The horizontal axis represents the threshold probability, and the vertical axis represents the net benefit. The solid red line indicates the nomogram model, the solid blue line indicates the RF model, the black curve represents the assumption that all patients have LNM, and the black horizontal line represents the assumption that no patient has LNM. The threshold probability is the probability at which the expected benefit of performing treatment is equal to the expected benefit of not performing treatment.
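The net benefit plotted in the decision curves can be sketched with the standard definition, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The paper does not give its formula, so this definition is an assumption, and the labels and probabilities below are made up for illustration.

```python
import numpy as np


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    n = y_true.size
    treat = y_prob >= threshold
    true_pos = np.sum(treat & (y_true == 1))
    false_pos = np.sum(treat & (y_true == 0))
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference curve: every patient is assumed to have LNM and is treated."""
    prevalence = float(np.mean(y_true))
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Made-up labels (1 = LNM) and predicted probabilities.
y = np.array([1, 1, 1, 0, 0, 0, 0, 1])
p = np.array([0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.6, 0.4])

nb_model = net_benefit(y, p, 0.5)
nb_all = net_benefit_treat_all(y, 0.5)
nb_none = 0.0  # treating no one has zero net benefit by definition
print(nb_model, nb_all, nb_none)
```

Evaluating these functions over a grid of thresholds and plotting the results gives the model curve, the treat-all curve, and the horizontal treat-none line.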