Use of Neural Network in Predicting Gastric Cancer with Para-aortic Lymph Node Metastasis in a Hospital Population Within the Last Two Decades

Background: In clinical practice, the accurate prediction of para-aortic lymph node status and the selection of appropriate surgical methods can significantly affect the prognosis of patients with gastric cancer. In the present study, we reviewed the data of patients who underwent radical gastric cancer surgery with dissection of para-aortic lymph nodes (PANs) within the last 20 years and assessed the possible independent predictors of PAN status. Methods: We included 308 patients with gastric cancer who fulfilled the inclusion criteria, and logistic regression and a neural network were used to identify the possible independent predictors of PAN status. Results: Logistic regression analysis showed that male sex, Borrmann types III and IV, T4 stage, and pancreatic or splenic metastases were significant risk factors for PAN metastasis after adjusting for other factors, and the accuracy of this model was 71.8%. After inputting all parameters into the neural network, the accuracy was 98%. Conclusion: The neural network has significant benefits in predicting PAN status in patients with gastric cancer. The findings of this study may be useful in predicting PAN metastasis.


Introduction
The prevalence of gastric cancer is high in China and in other countries worldwide [1,2]. Due to its adverse biological behavior, gastric cancer-related mortality is high [3,4]. The number of lymph node metastases is the main factor affecting the prognosis of gastric cancer [5,6]. Moreover, lymph node staging based on the number of lymph node metastases is an important method for determining prognosis and treatment strategies [7,8]. Across the different stages of gastric cancer, surgery remains the most important treatment method for surgically resectable disease [9,10].
Therefore, effectively predicting the location and number of lymph node metastases before and during surgery is key to making surgery-related decisions and improving surgical outcomes.
Previous studies have shown that when patients present with para-aortic lymph node metastasis, dissection of para-aortic lymph nodes has survival benefits [11,12]. In the past, para-aortic lymph node metastasis was defined as distant metastasis (M1) in TNM staging. In Japan, when D2 radical surgery with N1 and N2 lymph node dissection is carried out, routine dissection of para-aortic lymph nodes is also performed [13,14]. Furthermore, in some medical centers, para-aortic lymph node dissection is also used as a routine surgical procedure to improve the prognosis of advanced gastric cancer [15,16]. However, it is often challenging to accurately determine the metastasis status of para-aortic lymph nodes, and an effective prediction system has not been established. Thus, we can only perform retrospective analysis to identify factors associated with metastasis, which introduces some bias in the selection of predictors. In our follow-up patient population, the proportion of patients who underwent routine para-aortic lymph node dissection is not high, which limits the possibility of using a large sample size to improve prediction sensitivity.
In the last decade, we carried out a retrospective analysis of risk factors that may affect para-aortic lymph node metastasis and found that gender, tumor site, and gross appearance were independent predictors. However, we did not carry out modeling and validation of the prediction system. Since then, additional patients have undergone para-aortic lymph node dissection. Recently, we identified the patients who fulfilled the inclusion criteria and used two SPSS software packages, with logistic regression and a neural network, to assess the possible independent predictors of para-aortic lymph node (PAN) status. Furthermore, we conducted a preliminary validation, because sufficiently reliable statistical results would have clinical application value. In the following sections, we present the methods and results of these two parts separately.

Materials And Methods

Patients
A total of 308 patients were enrolled. In this group, PANs were dissected from the level of the celiac trunk down to the root of the inferior mesenteric artery (station nos. 16a2 and 16b1). The inclusion criteria were as follows: 1) patients with histologically confirmed gastric cancer, 2) those who underwent D2 plus para-aortic nodal dissection (PAND), 3) those with complete medical records, 4) roughly equal numbers of patients across diagnosis periods and surgeons, and 5) those who never received neoadjuvant therapies.
All patients were followed up via mail or telephone interviews. The last follow-up was conducted in December 2018. Clinical, surgical, and pathological findings and all follow-up data were collected and recorded in the database.
The study protocol was approved by the ethics committee of The First Hospital of China Medical University, and informed consent was obtained from all participants. All methods were performed in accordance with the relevant guidelines and regulations.

Endpoints And Follow-up
Overall survival time was calculated from the date of surgery until the date of death or last follow-up contact. Data for patients who were still alive were censored at the last follow-up. Follow-up assessments were conducted every 6 months for the first 5 postoperative years and every 12 months thereafter until death.
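The survival-time definition above can be illustrated with a minimal Python sketch. The function name `overall_survival`, its arguments, and the dates used are hypothetical and serve only to show how an observed death and a censored observation differ; they are not taken from the study database.

```python
from datetime import date


def overall_survival(surgery_date, last_contact_date, died):
    """Return (days, event_observed) for one patient.

    surgery_date      -- date of surgery (start of follow-up)
    last_contact_date -- date of death, or date of last follow-up contact
    died              -- True if the patient died, False if still alive

    A patient who is still alive at last contact is censored:
    the time is still counted, but no event is recorded.
    """
    days = (last_contact_date - surgery_date).days
    if days < 0:
        raise ValueError("last contact cannot precede surgery")
    return days, bool(died)


# Hypothetical example: a patient alive at the December 2018 follow-up
# contributes a censored observation (event_observed is False).
censored = overall_survival(date(2018, 1, 1), date(2018, 12, 1), died=False)

# Hypothetical example: a patient who died contributes an observed event.
observed = overall_survival(date(2018, 1, 1), date(2018, 3, 1), died=True)
```

The pair of values (time, event flag) is the usual input format for survival methods such as Kaplan-Meier estimation.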

Statistical analysis
The clinicopathological parameters that could be identified pre- or intraoperatively as the indication for PAND were compared between patients with and without PAN metastasis. Fisher's exact test or the χ² test was used to assess the differences in the proportion of patients. To assess the association between various factors and PAN metastasis, binary logistic regression analysis was carried out (Table 1).

Table 2: Odds ratios (OR) for histological metastasis of para-aortic lymph nodes (PAN): univariable and multivariable analysis (n = 308)

PAN metastasis was histologically found in 105 (34.1%) of 308 patients. The association between the possible risk factors and PAN metastasis is shown in Table 2. After adjusting for other variables, male sex, Borrmann types III and IV, T4 stage, and metastasis to the pancreas or spleen identified during surgery were the significant risk factors for PAN metastasis (Table 2). The overall accuracy of the multivariate logistic regression was 71.8% (Fig. 1), and the area under the ROC curve was 0.749 (Fig. 2).
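As an illustration of the two performance measures reported for the logistic regression, the following minimal Python sketch computes overall accuracy and the area under the ROC curve from predicted probabilities. The function names, the 0.5 classification threshold, and the toy data are assumptions made for the example; the source does not state the cutoff used, and the values below are not study data.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of patients classified correctly at the given cutoff.

    The 0.5 threshold is an assumption for illustration.
    """
    predictions = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(pred == truth for pred, truth in zip(predictions, y_true))
    return correct / len(y_true)


def roc_auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen PAN-positive patient
    receives a higher predicted probability than a randomly chosen
    PAN-negative patient (ties count as one half).
    """
    positives = [p for p, t in zip(y_prob, y_true) if t == 1]
    negatives = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    score = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                score += 1.0
            elif pos == neg:
                score += 0.5
    return score / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = PAN metastasis, 0 = no PAN metastasis.
toy_labels = [1, 0, 1, 0]
toy_probabilities = [0.9, 0.2, 0.4, 0.6]
toy_accuracy = accuracy(toy_labels, toy_probabilities)
toy_auc = roc_auc(toy_labels, toy_probabilities)
```

Accuracy depends on the chosen cutoff, whereas the area under the ROC curve summarizes discrimination across all cutoffs, which is why the two measures are reported separately.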
Fig. 3 shows the neural network calculation steps, arranged from left to right. The patient data were entered first. The data were divided into the training and validation groups (green frame), which accounted for 67% and 33% of all data, respectively. The type of data was defined, and the variables for calculation (red frame) were selected. Variables such as sex, age, and tumor site were included, and the number of para-aortic lymph node metastases was set as the target variable. The calculation parameters (pink frame) were set, and the output calculation results (blue frame) were examined. From left to right, the output presented the overall efficacy rate (99%), the prediction strength of the various variables, the overall structural map of the neural network, and the accuracy of the training and validation groups (Fig. 3).
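The 67%/33% partition into training and validation groups can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure of the software used in the study: the function `stratified_split` is hypothetical, the split is stratified by PAN status (the source does not say whether the partition was stratified), and the target is simplified to a binary positive/negative label.

```python
import random


def stratified_split(labels, train_fraction=0.67, seed=0):
    """Split patient indices into training and validation groups.

    Indices are shuffled within each class so that both groups keep
    roughly the same proportion of PAN-positive patients.
    Stratification and the fixed seed are assumptions for illustration.
    """
    rng = random.Random(seed)
    indices_by_class = {}
    for index, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(index)

    train, validation = [], []
    for indices in indices_by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        validation.extend(indices[cut:])
    return sorted(train), sorted(validation)


# Class sizes taken from the text: 105 PAN-positive and 203 PAN-negative
# patients (308 in total). The ordering of the labels is arbitrary.
labels = [1] * 105 + [0] * 203
train_idx, validation_idx = stratified_split(labels)
```

Keeping the validation group separate from the data used for fitting is what allows the reported validation accuracy to be read as an estimate of performance on patients the model has not seen.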

Discussion
In our study, several factors were not uniformly distributed between the para-aortic lymph node metastasis and non-metastasis groups. With regard to sex, the proportion of male participants with metastasis was higher. Among patients with tumors in the middle and upper parts of the stomach or tumors involving the entire stomach, the proportion with metastasis was significantly higher. In patients with T4 stage (i.e., invasion of the serosa) or Borrmann type IV, the proportion with metastasis was also higher. The presence of PAN metastasis was additionally associated with a higher N stage, a larger extent of surgery, a lower degree of radical treatment, and a higher proportion of organ metastasis.
In the multivariate regression analysis, male sex, Borrmann types III and IV, T4 stage, and pancreatic or splenic metastases were factors that can be identified pre- or intraoperatively and can inform treatment decisions; they can therefore be used as independent predictors of para-aortic lymph node metastasis.
Experience gained from caseload, surgical skill, and case selection are all extremely important.
Physicians from different hospitals may have individual surgical habits and judgments about disease conditions. By contrast, physicians from the same department usually have similar treatment ideas due to long-term preoperative discussions, perioperative ward rounds, and interactions between mentor and student. A retrospective study conducted at a single center can minimize subjective differences between hospitals. Even so, selection bias in retrospective analysis cannot be avoided. The patient's condition, as determined from preoperative and intraoperative findings, and even the patient's financial status are factors affecting intraoperative treatment decisions.
Future prospective clinical studies should address two questions: which clinicopathological factors are associated with para-aortic lymph node metastasis, and which specific lymph node sites are associated with it. In addition, studies are needed to determine whether the metastasis status of specific lymph node sites should be included as a predictor and whether sentinel lymph nodes exist for the para-aortic lymph nodes. A previous article has shown that station No.7 was the only significant indicator of PAN metastasis after adjusting for other variables. The diagnostic sensitivity and specificity of station No.7 for PAN metastasis were high, which is clinically useful, and this station may be a convenient diagnostic indicator of PAN metastasis.
Following the surgeons' decision-making principles, we did not include N stage as a variable because it is challenging to accurately determine N stage before and during surgery. After introducing peritoneal metastasis status as a variable, we found that the accuracy improved significantly. The improvement in accuracy also depended on the ability to use continuous variables, such as tumor size and age, directly, without the manual grouping required in the past. The predictors extracted by the two software packages were different. However, the accuracy of the modeler was higher, and its operation was simpler. The accuracy can still be improved. In the future, more data analysis methods could be introduced, including decision trees, C5.0, and other methods that are beginning to be used in medical data analysis, to better determine the optimal analytical method for different data types. In addition, clinical experience must also be incorporated to further optimize the accuracy of model prediction, and more data must be generated for validation.
In conclusion, sex, Borrmann type, T stage, and combined organ metastasis were associated with PAN metastasis. Neural networks can be a good predictive model.

Fig. 1: Overall accuracy calculated using multivariate logistic regression.

Fig. 2: Receiver operating characteristic curve calculated using multivariate logistic regression.

Fig. 3: Calculation steps of the neural network, presented from left to right: the patient data were entered, the data were divided into the training and validation groups (green frame), the type of data was defined and the variables were selected for calculation (red frame), the calculation parameters were set (pink frame), and the calculation results were output (blue frame).