A practical nomogram predicting the prognosis of hepatocellular carcinoma patients with lymph node metastasis: A population-based analysis

Background: Lymph node (LN) metastasis is associated with poor survival outcomes in patients with hepatocellular carcinoma (HCC). Because LN metastasis is reportedly uncommon, research into the prognosis of such patients is difficult to conduct. In this study, we aimed to develop a nomogram model to predict the prognosis of HCC patients with LN metastasis. Methods: HCC patients diagnosed with LN metastasis from 2010 to 2015 were enrolled from the Surveillance, Epidemiology, and End Results (SEER) database. Univariate Cox regression and lasso regression were used to screen prognostic factors. Multivariate Cox analysis was used to identify the independent risk factors for survival. We developed a prognostic nomogram using these independent risk factors. The predictive performance of the nomogram was evaluated with the concordance index (C-index) and calibration curves. The net clinical benefit was assessed via decision curve analysis (DCA). Patients were divided into risk groups according to the model. Survival curves were drawn using the Kaplan-Meier method and compared with the log-rank test. Results: There were 944 patients in the training cohort and 402 patients in the validation cohort. Grade, T stage, surgery to the liver, chemotherapy, radiation recode, AFP, fibrosis score, tumor size group and M stage were selected as independent prognostic factors, and we developed the nomogram using these variables. The calibration curves for the probability of survival showed good agreement between the model's predictions and actual observations in both the training and validation groups. DCA indicated that the nomogram had a positive net benefit. Conclusions: The nomogram can accurately predict the prognosis of HCC patients with LN metastasis and provide a reasonable basis for treatment.


Introduction
Hepatocellular carcinoma (HCC) is the seventh most prevalent tumor worldwide, with 841,080 new cases occurring every year [1,2]. The dominant pathogenic factors that contribute to the incidence of HCC vary by country and region, and include hepatitis B infection in China [3], hepatitis C infection in Africa and alcohol intake in Western countries [4]. Extrahepatic metastasis occurs in 30%-50% of patients during the course of the disease [5]. The lymph nodes are the second most common site of extrahepatic metastasis in HCC [6]. Studies with large sample sizes report an incidence of LN metastasis ranging from 1.23% to 7.5% [7][8][9]. Other research has shown that the incidence might reach approximately 30% on average [10,11]. Although a large proportion of these data are derived from autopsies, they suggest that the occurrence of LN metastasis is underestimated and that more patients have LN metastasis than is clinically recognized.
According to the Barcelona staging system, patients with LN metastasis are assigned to stage C [12], and systemic therapy is regarded as the primary treatment. The same approach is found in other systems, such as the American Joint Committee on Cancer (AJCC) staging system and the NCCN guidelines [13]. The main reason is that HCC patients with LN metastasis have a poor prognosis. A recent study showed that the median progression-free survival (PFS) time after surgery was 16.3 months for HCC patients without nodal involvement, but only 5.8 months for those with LN metastasis [7]. LN metastasis is clearly a poor prognostic factor for HCC [9]. However, with the development of various treatments and drugs in recent years, the prognosis of HCC patients with LN metastasis has been improving [14][15][16]. Previous studies have shown that patients diagnosed with stage IV disease had differing prognoses, indicating that hereditary factors could contribute to the prognosis of such patients [16]. In one study, the prognosis of HCC patients with LN metastasis differed even though they were treated with similar external beam radiotherapy [17]. The selection of appropriate treatment should be based on accurate identification of different prognostic groups, so it is important to distinguish between patients with different prognoses. Because LN metastasis is reportedly uncommon, grouping such patients requires a large sample size, which increases the difficulty of these studies. As far as we know, no study constructing a prognostic model for the risk assessment of HCC patients with LN metastasis has yet been reported. Therefore, the purpose of our study was to distinguish the different prognostic groups of HCC patients with LN metastasis and to assist and guide clinicians in making treatment decisions.
The nomogram is an effective statistical tool that provides a graphical representation of a predictive model, and it can accurately predict the outcomes of individual patients [18]. As previously mentioned, this type of study is difficult because LN metastasis is reportedly uncommon. To expand the sample size and fully identify the factors affecting the prognosis of HCC patients with LN metastasis, we analyzed medical records from the Surveillance, Epidemiology, and End Results (SEER) database. The SEER database collects several types of cancer patient data from electronic pathology reports and is an authoritative source of information on cancer, covering approximately 34.6% of the U.S. population (http://seer.cancer.gov/). In the present study, we downloaded data on HCC patients with LN metastasis from the SEER registry between 2010 and 2015. We then divided these patients into a training group and a validation group. A nomogram was constructed using the training group, and the validation group was used to evaluate its ability to predict patient survival.

Patient selection
The data were acquired from the SEER database of the US National Cancer Institute (SEER 18 Regs Custom Data (with additional treatment fields), Nov 2018 sub (1973-2016 varying)). The data were obtained via the SEER*Stat software (version 8.3.6; http://seer.cancer.gov/seerstat/). Because some important prognostic factors were not available before 2010, patients diagnosed with HCC with LN metastases between 2010 and 2015 were finally included in our research.
The inclusion criteria were as follows: (1) hepatocellular carcinoma patients diagnosed from 2010 to 2015, for whom the site recode ICD-O-3 was liver; (2) patients with lymph node metastasis according to the 7th edition of AJCC TNM staging; (3) patients older than 18 years; and (4) follow-up data were available. The exclusion criteria were as follows: (1) follow-up time information was missing; (2) demographic and treatment information was incomplete; (3) data on vital prognostic factors and tumor staging information were missing; and (4) the other cause of death classification was not the first tumor. The patients were randomized into the training group (70%) and the validation group (30%) using R software. The caret package was used, and the seed was 1988. The selection process is shown in Figure 1.
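The 70/30 randomization described above can be illustrated with a minimal Python sketch. The original analysis used the caret package in R with seed 1988; the function name `split_train_validation` and the use of Python's `random` module are assumptions made for illustration only. This sketch does not reproduce the authors' exact partition, and because it uses simple rounding without any stratification, the group sizes can differ slightly from the reported 944 and 402.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=1988):
    """Randomly split patient IDs into training and validation lists.

    A fixed seed makes the split reproducible within Python; it will NOT
    match a split produced by R, which uses a different random generator.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical IDs 0..1345 standing in for the 1346 enrolled patients.
train_ids, validation_ids = split_train_validation(range(1346))
print(len(train_ids), len(validation_ids))
```

Fixing the seed is what makes the partition repeatable, so the same training and validation groups are obtained on every run.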

Statistical analysis
The patient data from the SEER database comprised sex, age group, race, grade, T stage, diagnostic confirmation information, surgery to the liver, surgery to LN, bone metastasis, brain metastasis, liver metastasis, pulmonary metastasis, chemotherapy, radiation, insurance, marital status, AFP, fibrosis score, tumor size group and M stage. Patients who were alive or had died of other causes were censored. Categorical variables were compared using the chi-square test and Fisher's exact test. Numerical variables were compared using the Mann-Whitney U test. Kaplan-Meier survival curves were generated, and survival distributions were compared by the log-rank test. Univariate and multivariate Cox proportional hazards models were used to identify factors associated with survival in the training group. According to the results of the multivariate analysis, we constructed a nomogram using R software (version 3.4.3, https://www.r-project.org/), which was internally validated by bootstrapping with 1000 bootstrap samples. The C-index was used to compare the discriminative ability of the nomogram and the AJCC 7th edition staging system (IVA/IVB) in the training and validation groups. Calibration curves were created to assess the predictive accuracy in the two groups [19]. Then, we calculated the risk score of each training group patient according to the Cox regression model. The X-tile program (version 3.6.1) was used to select the optimal cutoff for the risk score to distinguish differences in patient survival [20,21]. The training and validation group patients were then divided into high-risk and low-risk groups according to the risk score. Survival curves were plotted according to the Kaplan-Meier method, and the log-rank test was employed for survival analysis. Statistical analyses were performed with SPSS software, version 25 (IBM) and R software (version 3.3.4). All of the tests were two-sided, and p<0.05 was considered statistically significant.
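To make the discrimination measure concrete, the following is a minimal Python sketch of Harrell's C-index for right-censored data. It is not the authors' implementation (the analysis was run in R), the function name and the four example patients are hypothetical, and pairs with tied survival times are simply ignored here for brevity.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i has the shorter follow-up
    time and experienced the event. The pair is concordant when subject i
    also has the higher predicted risk; tied risks count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: follow-up in months, 1 = died, 0 = censored.
times = [2, 5, 7, 12]
events = [1, 1, 0, 1]
risk_scores = [2.1, 1.4, 1.6, 0.3]
c_index = harrell_c_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the nomogram and the AJCC staging system are compared.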

Training and validation group patients' characteristics
As shown in Figure 1, a total of 40,173 patients diagnosed with HCC from 2010 to 2015 were included in our research, of which 2,662 cases (6.6%) had LN metastasis. According to the above exclusion criteria, 1346 patients were finally enrolled. We allocated 944 patients into the training group and the others into the validation group. The clinical details of the patients are shown in Table 1.
Of all the patients, 81.7% were male, and 69.5% were white. Patients older than 60 years accounted for 54.3%. The AFP level was elevated in 74.82% of patients, and 14.0% of patients were grade III/IV according to the Edmondson-Steiner classification. Most patients (86.3%) did not receive radiation treatment. The same trend was observed for surgery to the liver/LN. The median (Q1-Q3) follow-up time of the patients was 5.00 (2.00-12.00) months. There was no significant difference in most demographic or baseline features between the training group and the validation group. The incidence of pulmonary metastasis differed between the two groups: it was higher in the training group than in the validation group, but there was no significant difference between the training group and the overall population.

Prognostic factors for HCC patients with lymph node metastasis
As shown in Table 2, in the univariate analysis, grade, T stage, surgery to the liver, surgery to LN, bone metastasis, brain metastasis, pulmonary metastasis, chemotherapy, radiation recode, insurance recode, AFP, tumor size group, fibrosis score and M stage were associated with overall survival (OS).
To reduce the risk of over-fitting our model, we applied the lasso regression method, which can shrink some regression coefficients to zero [22]. The glmnet package in R software was used. Ten-fold cross-validation was applied to search for the minimum partial likelihood deviance, which reflects the complexity of the model. The variables selected at the minimum partial likelihood deviance (lambda=-4.37) were age group, grade, T stage, surgery to the liver, surgery to LN, bone metastasis, brain metastasis, pulmonary metastasis, intrahepatic metastasis, chemotherapy, radiation recode, insurance recode, AFP, tumor size group, fibrosis score and M stage. Combining these with the results of the univariate Cox analysis, we removed age group and intrahepatic metastasis, and 14 variables were included in the multivariate analysis. For more details, see Figure 2a-b.
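How a lasso penalty sets some coefficients exactly to zero can be illustrated with the soft-thresholding operator, the basic building block of coordinate-descent lasso solvers. The sketch below is a simplified illustration, not the Cox lasso fit performed with glmnet. The coefficient values are hypothetical, and reading the reported value of -4.37 as the natural logarithm of the penalty (so that the penalty itself is about 0.0127) is an assumption, made because a lasso penalty cannot be negative.

```python
import math


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# ASSUMPTION: the reported -4.37 is log(lambda), not lambda itself.
log_lambda = -4.37
lam = math.exp(log_lambda)

# Hypothetical unpenalized coefficients, for illustration only.
coefficients = {
    "strong_factor": 0.45,
    "weak_factor": 0.01,
    "protective_factor": -0.30,
}

shrunk = {name: soft_threshold(beta, lam) for name, beta in coefficients.items()}
selected = [name for name, beta in shrunk.items() if beta != 0.0]
print(selected)
```

Variables whose effect is smaller than the penalty drop out of the model entirely, which is why the lasso can serve as a variable screening step before multivariate Cox regression.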
In the multivariate analysis, grade, T stage, surgery to the liver, chemotherapy, radiation recode, AFP, fibrosis score, tumor size group and M stage remained independently related to OS. The details are summarized in Table 3.

Construction and validation of the nomogram
A nomogram was formulated based on the results of the multivariate Cox regression analysis, as shown in Figure 3. In the training group, Harrell's C-index for OS prediction was 0.70 (95% CI, 0.68 to 0.72), and the area under the ROC curve (AUC) at 1 and 2 years was 0.76 and 0.80, respectively. In the validation group, Harrell's C-index for OS prediction was 0.73 (95% CI, 0.70 to 0.76), and the AUC at 1 and 2 years was 0.79 and 0.75, respectively. In contrast, the C-index of the AJCC staging system was only 0.58 (95% CI, 0.56 to 0.60) in the training group and 0.59 (95% CI, 0.56 to 0.62) in the validation group. The nomogram model therefore showed better discrimination.
We further assessed the accuracy of our model predictions by calibration plot. The calibration plot for the probability of survival at 1 and 2 years showed good agreement between the prediction by nomogram and actual observation. See further details in Figure 4a-d.
Furthermore, decision curve analysis (DCA) was performed to assess the clinical benefit to patients. The DCA indicated that our nomogram had a positive net benefit across a wide range of threshold probabilities in both the training group and the validation group. See further details in Figure 5a-d.
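The net benefit plotted in a decision curve can be sketched for the simplified case of a binary outcome at a fixed time point with no censoring: net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability. The Python sketch below uses this standard definition with hypothetical predictions and outcomes; DCA for survival data, as used in the study, additionally has to account for censoring, which is not handled here.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcome)
    pairs = list(zip(predicted_risk, outcome))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks of death and observed outcomes (1 = died).
predicted_risk = [0.9, 0.8, 0.6, 0.4, 0.2]
outcome = [1, 1, 0, 0, 0]

model_nb = net_benefit(predicted_risk, outcome, threshold=0.5)
treat_all_nb = net_benefit_treat_all(outcome, threshold=0.5)
print(model_nb, treat_all_nb)
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.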

The establishment of different risk groups according to the model
According to the Cox regression model, the risk score of each training group patient was calculated. The X-tile program was used to select the optimal cutoff for the risk score to distinguish differences in patient survival. The cutoff was 1.12, which divided the training group into a low-risk group (score≤1.12) and a high-risk group (score>1.12). The corresponding total points were 261. Survival curves were drawn for the training group and the validation group, and the log-rank test yielded p<0.001. See further details in Figure 6a-b.
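The grouping step can be sketched in Python as follows: patients are assigned to a risk group using the reported cutoff of 1.12, and a Kaplan-Meier curve is estimated for each group. The patient records are hypothetical, the function names are illustrative, and the log-rank comparison between groups is not implemented in this sketch.

```python
CUTOFF = 1.12  # risk score cutoff reported in the text


def assign_risk_group(score, cutoff=CUTOFF):
    """Low risk when score <= cutoff, high risk otherwise."""
    return "high" if score > cutoff else "low"


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical patients: (risk score, follow-up in months, 1 = died, 0 = censored).
patients = [(0.40, 14, 0), (0.95, 9, 1), (1.12, 12, 1),
            (1.30, 3, 1), (1.80, 2, 1), (2.10, 5, 0)]

groups = {"low": [], "high": []}
for score, months, died in patients:
    groups[assign_risk_group(score)].append((months, died))

curves = {
    name: kaplan_meier([m for m, _ in rows], [d for _, d in rows])
    for name, rows in groups.items()
}
print(curves)
```

Censored patients leave the risk set without lowering the survival estimate, which is why the curve only steps down at times when a death occurs.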

Discussion
According to existing reports, the incidence of lymph node metastasis during the treatment of liver cancer is 1.6% to 5.9%, while it is 25.5% on autopsy, indicating that lymph node metastasis might be overlooked [11]. In our research, 6.6% of all HCC patients had LN metastasis, which was consistent with the results reported in the previous study. Regarding the prognosis of such patients, the emergence of novel treatments, including radiation, ablation, interventional therapy, and sorafenib, has improved the prognosis [15,[23][24][25]. Our study also showed that HCC patients with LN metastasis can benefit from radiation and chemotherapy. In terms of surgical treatment, our study showed that patients with LN metastasis did not benefit from lymphadenectomy, and previous studies have shown similar results [26][27][28]. However, our analysis showed a benefit for surgery at the primary site, the liver. We consider three possible reasons for this finding. First, more than 38% of patients included in our study were diagnosed without histology or cytology. Because the diagnosis of primary liver cancer can be made by clinical diagnosis, imaging and laboratory examinations [29][30][31], we included patients whose diagnoses were established according to clinical diagnostic criteria. In this case, the diagnosis of lymph node metastasis in many patients was based on clinical and radiographic findings, and many patients might not actually have had cancerous lymph node metastasis. HCC patients often have chronic inflammation of the liver, such as hepatitis B, hepatitis C and non-alcoholic fatty liver disease. Inflammation of the liver can also cause enlarged lymph nodes, and one study showed that the proportion of enlarged lymph nodes in hepatitis B virus-infected patients reached 9.4% [32].
According to the above discussion, in the absence of a pathological diagnosis of the lymph node, HCC patients classified as having LN metastasis may have either cancerous metastasis or benign perihepatic lymph node enlargement (PLNE). A study showed that PLNE was an independent positive prognostic factor that might improve the prognosis of HCC patients [33]. In this case, some patients might benefit from surgery on the primary site. Second, only a few patients underwent liver surgery, which might have affected the results of our statistical analysis. Third, patients undergoing surgery might have better baseline indicators, such as performance status and liver function, than non-surgical patients. These factors might affect the outcome. In summary, cytological or histological confirmation is recommended to determine whether lymph node metastasis is truly present, especially in patients with hepatitis, and treatment should be chosen more carefully for HCC patients without a pathological diagnosis of LN metastasis.
The Edmondson-Steiner classification indicates a pathological grade. A higher grade indicates a lower degree of differentiation and a higher degree of malignancy. Some reports in the literature have shown a relationship between the Edmondson-Steiner classification and the prognosis of HCC patients, with a higher grade indicating that the prognosis is likely to be worse [34,35]. Zhang et al reviewed how the degree of cirrhosis affects the prognosis of patients and found that the histological severity of cirrhosis is a vital adverse factor that affects the long-term outcomes of HCC patients undergoing liver surgery [36]. Similar results have been found in many reports [37][38][39]. The effect of AFP levels on patient prognosis is controversial. Some studies have found that elevated AFP levels could worsen the prognosis of HCC patients [40][41][42]. Other scholars have found that AFP has no significant effect on patient survival [43], perhaps because the studies are from different populations. In our study, we found that an elevated AFP level is an adverse prognostic factor.
With respect to pathological stage and tumor characteristics, T stage, M stage and tumor size group were associated with the prognosis of HCC patients with LN metastasis. Wu et al found that tumor size could be used as an independent risk predictor associated with survival in HCC [44]. In combination with T stage, we grouped the patients according to different tumor sizes and obtained similar conclusions. The effect of M stage on patient prognosis is not in doubt. The previous literature has shown that the prognosis of HCC patients differs according to metastatic site [45,46]. Therefore, we included the following prognostic indicators: bone metastasis, brain metastasis, intrahepatic metastasis, pulmonary metastasis and M stage. Of these factors, we found that only M stage was an independent risk factor for HCC patients with LN metastasis. This outcome suggests that differences in metastatic site might not be as important in such patients as in those with HCC without LN metastasis. T stage was also an independent risk factor for worse prognosis. In general, the prognosis of patients became worse as T stage increased. However, in our analysis, patients with stage T3 had a higher risk of death than patients with stage T4. The 7th edition AJCC stage was used by SEER in the original data. According to the 7th edition of AJCC staging [47], T stage includes not only information about tumor size but also about vascular invasion and the number of hepatic tumors. T3a indicates multiple tumors, at least one of which is >5 cm. T3b indicates that the tumor has invaded the main trunk of the portal vein and/or hepatic vein, which would lead to a worse prognosis [48,49], and it has been included in stage T4 by the 8th edition of the AJCC cancer staging manual. In the 7th edition of the AJCC staging manual, T4 is defined as a tumor invading adjacent organs other than the gallbladder or penetrating the serous membrane.
Therefore, for patients with liver cancer with lymph node metastasis, the prognostic significance of the number of liver tumors and vascular invasion might be greater than that of the invasion of adjacent organs.
As previously mentioned, the prognosis of HCC patients with LN metastasis is improving, and some studies have shown that stage IV HCC patients demonstrated a different prognosis [14][15][16] . Therefore, it is important to distinguish the difference in prognosis in such patients.
After identifying the risk factors, we built and validated the nomogram model. The model could distinguish well between the prognoses of HCC patients with LN metastasis and could provide a basis for the choice of treatment for such patients. A reasonable treatment plan can be put forward only when patients with different prognoses are reasonably differentiated. To the best of our knowledge, no similar studies have yet been performed. Compared with other models that require specialized technologies, our model uses a number of clinically accessible indicators. We evaluated predictors that were clinically relevant, so that the model can be easily applied in clinical practice.
However, our research still has some shortcomings. First, bias is inevitable in this type of retrospective study. For example, we removed many patients with LN metastasis whose important clinical data were unknown. A number of important prognostic factors were also missing for the enrolled patients. Some significant prognostic variables are not recorded in the SEER database, such as liver function tests, hepatitis B or hepatitis C infection, and details of chemotherapy, radiation therapy and surgery. Second, only internal validation was used for the model, which might affect its accuracy in the general population of HCC patients with LN metastasis. Next, we should further validate this model with our own clinical data. Prospective, randomized, controlled studies are also needed.

Conclusion
In conclusion, we showed that grade, T stage, surgery to the liver, chemotherapy, radiation recode, AFP, fibrosis score, tumor size group and M stage are independent risk factors for survival of HCC patients with LN metastases. We established a nomogram to distinguish between patients with a good prognosis and those with a poor prognosis. Internal validation demonstrated the accuracy and usefulness of the nomogram. A reasonable treatment could be devised according to different risk scores. Further studies are warranted.

Abbreviations
HCC, hepatocellular carcinoma; LN, lymph node; SEER, Surveillance, Epidemiology, and End Results; C-index, concordance index; DCA, decision curve analysis; AFP, alpha-fetoprotein; AJCC, American Joint Committee on Cancer; PFS, progression-free survival; OS, overall survival; PLNE, perihepatic lymph node enlargement.

Figure 1 The flow diagram of patient selection.
Figure 2 Lasso regression search for the optimal coefficient. (a) Lasso regression search for the optimal coefficient when the lambda was -4.37. (b) Ten-fold cross-validation was applied to search for the lambda at which the partial likelihood deviance was the least.