Construction of a Nomogram to Predict Overall Survival for Patients With Lung Large Cell Neuroendocrine Carcinoma: Analysis of the SEER Database

Background: Lung large cell neuroendocrine carcinoma (L-LCNEC) has a poor prognosis, with a lower survival rate than other types of non-small cell lung cancer (NSCLC). Estimating survival for an individual patient is difficult. The main purpose of this study was to establish a more accurate model to predict the prognosis of L-LCNEC. Methods: Patients aged 18 years or older with L-LCNEC were identified from the Surveillance, Epidemiology and End Results (SEER) database from 2004 to 2015. Cox regression analysis was used to identify factors associated with survival time. The results were used to construct a nomogram to predict 1-year and 3-year survival probability in L-LCNEC patients. Overall survival (OS) was compared between the low-risk and high-risk groups by Kaplan-Meier analysis. Results: A total of 3216 patients were included in the study. We randomly divided all included patients into training and validation groups at a ratio of 7:3. In multivariable analysis of the training cohort, age at diagnosis, sex, stage of tumor, surgical treatment, radiotherapy and chemotherapy were independent prognostic factors for OS. All these factors were incorporated to construct a nomogram, which was tested in the validation cohort. Conclusions: We constructed a visual nomogram prognosis model, which had the potential to predict the 1-year and 3-year survival rate of L-LCNEC patients and could be used as an auxiliary prediction tool in clinical practice.


Introduction
Lung cancer is one of the malignant tumors with high incidence and mortality worldwide [1]. Lung large cell neuroendocrine carcinoma (L-LCNEC) is a rare and invasive type of non-small cell lung cancer. L-LCNEC, which is characterized by the morphological differentiation of both neuroendocrine tumors and large cell lung cancer, is a highly malignant neuroendocrine tumor. The 2015 World Health Organization Classification of Lung Tumors reclassified L-LCNEC into the category of neuroendocrine tumors (NETs). L-LCNEC is identified as an independent solid tumor with clinical and biological characteristics similar to those of small cell lung cancer (SCLC) [2]. L-LCNEC accounts for 15% of lung NETs and 2.0%-3.5% of resected lung cancer cases [3,4].
Compared with other NSCLC histological types, L-LCNEC has a poor prognosis, with a lower survival rate over the same period, even for patients with early-stage disease or those who underwent surgery [5][6][7][8]. Several studies have addressed the prognosis of L-LCNEC, but most were retrospective analyses of small samples. At present, there is no reliable predictor of prognosis for L-LCNEC. Independent prognostic factors can be screened with the traditional Kaplan-Meier method and the Cox proportional hazards model, but these analyses alone do not yield a survival prediction for an individual patient. The existing TNM staging does not fully integrate the clinical prognostic information of patients, so it is necessary to develop a new prognostic model that can be combined with TNM staging to predict the risk of death more accurately and reliably.
At present, nomogram models are widely used in tumor survival prediction research [9,10]. Compared with traditional TNM staging, a nomogram integrates more individual characteristics, including personal information, disease stage and treatment method, and it has gradually become a new type of prognosis evaluation tool [11][12][13]. Our large-scale, population-based retrospective cohort study is the first attempt to establish a predictive nomogram for 1-year and 3-year survival probability in L-LCNEC.

Data Sources
The data for this retrospective study came from the Surveillance, Epidemiology and End Results (SEER) database, an authoritative cancer statistics database in the United States. The SEER database records the incidence, mortality and disease status of millions of cancer patients. At present, the number of registries has been expanded to eighteen. Data from the registries are submitted to the NCI twice a year for classification, statistics and aggregation, and cancer information on the covered population is disseminated to the United States and the world. We obtained the clinical information of patients diagnosed with L-LCNEC in 2004-2015 from the SEER database, with reference number 14026-nov2018, using SEER*STAT 8.3.6 software.

Inclusion and exclusion criteria
We extracted the data of L-LCNEC patients registered from 2004 to 2015. Patients who met the following criteria were included in this study: (1) adults aged 18 years or older; (2) pathologically confirmed L-LCNEC; (3) no history of other primary tumors.
Patients with the following conditions were excluded from this study: (1) incomplete registration of the TNM stage information (AJCC, 6th Edition) required by the research; (2) survival time of less than one month.

Statistical analysis
We randomly divided all included patients into training and validation groups at a ratio of 7:3. Multivariable Cox proportional hazards models were used to identify independent prognostic factors in the training group and to estimate the β regression coefficient for each selected prognostic predictor. The risk score of each patient in the training cohort was generated by combining the β regression coefficients with the patient's individual information. Patients whose risk score was higher than the median value were defined as the high-risk group; those whose risk score was lower than or equal to the median value were classified as the low-risk group.
A nomogram prediction model was built to predict the prognosis of individual L-LCNEC patients. The C-index and calibration curves were used to evaluate the accuracy of the model. The predictive value of the model was assessed with the area under the receiver operating characteristic curve (AUROC).
The model was validated in the validation group. To simplify the analysis, all variables were converted to categorical variables. Overall survival was defined as the time from diagnosis until death from any cause or until the last follow-up date. Survival analysis was performed using Kaplan-Meier curves, with P values determined by the log-rank test. Hazard ratios were determined by univariate and multivariate Cox proportional hazards models. All statistical tests were two-sided, and a P value < 0.05 was considered statistically significant. Univariate and multivariate Cox analyses were carried out using SPSS 19.0 software; other statistical analyses were conducted using R (version 3.6.0).
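The risk-score step described above can be illustrated with a minimal Python sketch. The coefficient values, the three dummy-coded covariates and the function names are hypothetical, chosen only for illustration; they are not the fitted values of this study. The score is taken to be the Cox linear predictor (the sum of each β multiplied by the patient's covariate value), and ties at the median are assigned to the low-risk group, as stated in the text.

```python
from statistics import median


def risk_scores(covariates, betas):
    """Linear predictor per patient: sum of beta_j * x_ij."""
    return [sum(b * x for b, x in zip(betas, row)) for row in covariates]


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the median, else 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical coefficients and dummy-coded covariates (illustration only),
# e.g. columns = [age > 60, male sex, stage IV].
betas = [0.5, 0.25, 1.0]
patients = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]

scores = risk_scores(patients, betas)
groups = split_by_median(scores)
print(list(zip(scores, groups)))
```

Because patients at exactly the median value fall into the low-risk group, the two groups need not be of equal size.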
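For readers unfamiliar with the Kaplan-Meier estimator used for the survival curves, the following is a minimal sketch in plain Python. It is not the R code used in this study; the function name and the toy follow-up data are assumptions made for illustration. At each distinct death time the running survival estimate is multiplied by (1 - deaths / number at risk), and censored patients leave the risk set without changing the estimate.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Toy data (hypothetical): five patients, two of them censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in km_curve:
    print(f"t = {t} months, S(t) = {s:.2f}")
```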

Characteristics of patients
According to the inclusion criteria, 3216 patients were included in this study. The mean age at diagnosis was 66.17 ± 10.45 years, and the median age was 67 years (range, 18 to 94). There were 1725 males and 1491 females, a male-to-female ratio of 1.16:1. We randomly divided all included patients into training and validation groups at a ratio of 7:3: 2252 patients were in the training cohort and 964 were in the validation cohort. The characteristics and treatment measures of patients are shown in Table 1.
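A random 7:3 split of this kind can be sketched as follows. This is an illustration only: the seed, the function name and the use of integer patient identifiers are assumptions, and the actual randomization procedure used in the study is not described. Rounding the training size up is also an assumption; it happens to give the reported cohort sizes (2252 and 964).

```python
import math
import random


def split_7_3(patient_ids, seed=0):
    """Shuffle the identifiers and cut them at 70% (rounded up)."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # fixed seed only for reproducibility
    n_train = math.ceil(0.7 * len(ids))
    return ids[:n_train], ids[n_train:]


train_ids, valid_ids = split_7_3(range(3216))
print(len(train_ids), len(valid_ids))
```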

Independent prognostic factors in the training cohort
In the training cohort, the clinical characteristics and treatment measures of patients were included in the univariable analysis to explore the factors that may influence overall survival (OS). All significant factors in the univariable analysis and the variables that may be related to OS were entered into the multivariable analysis based on Cox regression. The multivariable analysis indicated that age at diagnosis, sex, stage of tumor, surgical treatment, radiotherapy and chemotherapy were independent prognostic factors for OS. The univariable and multivariable analyses are shown in Table 2. In the training cohort, there were 1143 patients in the low-risk group and 1109 patients in the high-risk group. The risk scores and survival status of the low-risk and high-risk groups are shown in Fig. 1A and Fig. 1B. By the end of follow-up, the survival rate of the low-risk group was 39.1%, while that of the high-risk group was 9.5%. The median survival time of the high-risk group was significantly shorter than that of the low-risk group (7 months vs 28 months, P < 0.001, Fig. 1C).

Nomogram construction
The significant independent factors from the multivariable analysis, namely age at diagnosis, sex, stage of tumor, surgical treatment, radiotherapy and chemotherapy, were incorporated to establish the nomogram for predicting 1-year and 3-year survival probability. In the nomogram, each value of a selected variable is assigned a score (range 0-100) by drawing a vertical line from that value up to the points scale at the top of the nomogram. The total score is obtained by adding the scores of all variables, and the probability of 1-year and 3-year survival of an individual patient is then estimated by drawing a vertical line from the total score down to the prediction lines at the bottom of the nomogram (Fig. 2).
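The way a nomogram converts Cox coefficients into points can be sketched as below. All numbers here are hypothetical: the coefficients are not the fitted values from Table 2, the variable levels are simplified, and the baseline 12-month survival is an invented constant. The sketch uses the common convention that the variable with the widest coefficient range spans 0-100 points, and that predicted survival is S0(t) raised to the power exp(linear predictor); whether the published nomogram follows exactly this convention is an assumption.

```python
import math

# Hypothetical log hazard ratios per category (reference level = 0.0).
COEFS = {
    "age": {"<=60": 0.0, ">60": 0.3},
    "stage": {"I": 0.0, "II": 0.7, "III": 0.9, "IV": 1.4},
    "surgery": {"none": 0.0, "lobectomy": -0.9},
}
BASELINE_SURVIVAL_12M = 0.8  # hypothetical S0(12 months) at linear predictor 0


def build_points(coefs):
    """Rescale betas so the widest-ranging variable spans 0-100 points."""
    offsets = {var: min(levels.values()) for var, levels in coefs.items()}
    widest = max(max(lv.values()) - offsets[var] for var, lv in coefs.items())
    scale = 100.0 / widest  # points per unit of linear predictor
    points = {
        var: {lvl: (beta - offsets[var]) * scale for lvl, beta in lv.items()}
        for var, lv in coefs.items()
    }
    return points, scale, sum(offsets.values())


def predict_survival(patient, points, scale, offset, baseline):
    """Total points -> linear predictor -> S(t) = S0(t) ** exp(lp)."""
    total = sum(points[var][lvl] for var, lvl in patient.items())
    linear_predictor = total / scale + offset
    return total, baseline ** math.exp(linear_predictor)


points, scale, offset = build_points(COEFS)
patient = {"age": ">60", "stage": "IV", "surgery": "none"}
total, s12 = predict_survival(patient, points, scale, offset, BASELINE_SURVIVAL_12M)
print(f"total points = {total:.1f}, predicted 12-month survival = {s12:.3f}")
```

Reading the printed nomogram by hand performs the same two steps: summing the per-variable points, then mapping the total onto the survival axis.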

Validation of the Nomogram
Harrell's C-index of the established nomogram for predicting OS was 0.75 in the training cohort and 0.76 in the validation cohort.
ROC curves and AUC were used to evaluate the ability of the nomogram to predict the 1-year survival rate (AUC = 0.815) and the 3-year survival rate (AUC = 0.838) of patients with L-LCNEC in the training cohort (Fig. 3A, 3B). The nomogram also showed good predictive ability in the validation cohort for the 1-year survival rate (AUC = 0.812) and the 3-year survival rate (AUC = 0.86) (Fig. 3C, 3D).
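As an illustration of what Harrell's C-index measures, the sketch below counts, among all comparable patient pairs, the fraction in which the patient who died earlier was assigned the higher predicted risk. The function name and toy data are hypothetical, and this simple O(n²) version is not the R implementation used in the study. A pair is treated as comparable when the patient with the shorter follow-up time had an observed death, and ties in predicted risk count as half concordant.

```python
def harrell_c_index(times, events, risk):
    """Fraction of comparable pairs ordered correctly by predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: patient i died strictly before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
c_index = harrell_c_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risk=[0.9, 0.4, 0.5, 0.1],
)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported values of 0.75 and 0.76.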

Calibration of the Nomogram
The calibration plots showed good agreement between the predictions and the actual observations for 1-year and 3-year survival in the training and validation cohorts (Fig. 4A-4D).

Discussion
L-LCNEC is a rare subtype of lung cancer. According to existing reports, the average age of L-LCNEC patients is 60 years and most are male [14], which is similar to our results. The survival rate of L-LCNEC varies greatly with the age of the selected patients, tumor stage and treatment [15][16][17][18][19][20][21][22][23]. Clinicians' predictions of survival probability in L-LCNEC are often based on large-population studies, but estimating the survival of an individual patient is difficult. Therefore, it is necessary to establish a more accurate model to predict the prognosis of L-LCNEC and to support more effective treatment for individuals. Based on a retrospective study of high-quality, population-based data, our study summarized the clinical characteristics of patients with L-LCNEC, analyzed the factors related to prognosis and established a more reasonable and effective nomogram model for predicting the survival rate of L-LCNEC.
Our results indicated that the independent factors influencing the prognosis of L-LCNEC were age at diagnosis, sex, stage of tumor, surgical treatment, radiotherapy and chemotherapy.
Compared with patients under 60 years old, patients over 60 years old had an increased risk of death. The increased risk in elderly patients may be due to poorer physical function, intolerance of or insensitivity to surgery, chemotherapy, radiotherapy or other treatments, and multiple comorbidities. In terms of sex, the prognosis of male patients was worse than that of female patients, which may be related to factors such as a higher prevalence of smoking among male patients or physiological differences between the sexes. As in other tumors, TNM staging is an independent prognostic factor in L-LCNEC patients. Compared with stage I patients, the risk of death increased by 1.06 times in stage II patients, by 1.35 times in stage III patients and by 3.02 times in stage IV patients, suggesting that the risk of death increases with tumor stage. A possible reason is that, with advancing tumor stage, patients' physical function worsens and more complications appear, so that they are unable to receive comprehensive treatment.
There is no standard treatment for L-LCNEC. As in other subtypes of lung cancer, combined therapy including surgery, chemotherapy and radiotherapy has been used. The beneficial effect of surgery on L-LCNEC patients has been reported previously [24,25]. At present, surgery-predominant therapy should be the principal method for treating early-stage L-LCNEC. Our study indicated that surgery was an independent factor affecting the prognosis of L-LCNEC. However, the effect of surgery on prognosis was also related to the surgical method. Compared with patients who did not undergo surgery, the mortality risk of patients who received local tumor destruction was reduced by 29.4%, but this was not statistically significant (P = 0.729), while the mortality risk of those who received sublobar resection, lobectomy and pneumonectomy was reduced by 47.1% (P < 0.001), 61.3% (P < 0.001) and 61.2% (P < 0.001), respectively. Therefore, surgical treatment may not need to be limited to stage I L-LCNEC patients, but could be extended to resectable patients after adjuvant chemotherapy. Lobectomy or pneumonectomy could be conducted as soon as possible if the patient's physical condition allows, because the cancer cells grow rapidly and patients may lose the chance of surgery within a few months.
L-LCNEC is aggressive, with a high potential to metastasize, and readily recurs after surgery, so surgery alone is not enough to treat the disease effectively. L-LCNEC patients treated by surgery alone were rarely cured even in the early stage [26], which has led more investigators to consider chemotherapy or radiotherapy [6,18,27,28]. Two retrospective analyses showed that, compared with surgery alone, platinum-based neoadjuvant chemotherapy or postoperative adjuvant chemotherapy could delay tumor recurrence and significantly benefit the long-term survival of patients with early-stage disease [29,30]. Chemotherapy regimens for small cell lung cancer and non-small cell lung cancer have been used to treat L-LCNEC, but most of the data came from single-center, small-sample or retrospective studies, and the results were controversial [7,31,32]. For patients with advanced L-LCNEC, chemotherapy could significantly improve overall survival [33][34][35][36]. Our study also showed that chemotherapy may confer a survival advantage. Since the SEER database does not contain records of chemotherapy regimens, our study did not address which chemotherapy regimen is better for L-LCNEC patients.
Because of the low incidence, there are few reports on radiotherapy for L-LCNEC. Mackley suggested that adjuvant radiation should be recommended if gross residual disease is present after surgery, and that adjuvant radiotherapy could benefit local control and reduce the risk of local recurrence [37]. Rieber concluded that patients with incomplete resection showed a survival benefit from adjuvant radiotherapy [38]. Our results also supported the prognostic benefit of radiotherapy.
By establishing the nomogram, we could predict the individualized prognosis of patients according to their scores, in order to select an appropriate treatment strategy for each individual.
There were some limitations in our study. This was a retrospective study, so our results are inevitably subject to selection bias. In addition, because of the limited clinical factors included in the SEER database, this study did not analyze other factors possibly related to the prognosis of the disease, such as smoking history, gene status and chemotherapy regimens. Therefore, a prospective evaluation of the nomogram is necessary before clinical application.

Conclusion
In this study, our analysis indicated that age at diagnosis, sex, stage of tumor, surgical treatment, radiotherapy and chemotherapy were independent prognostic factors for overall survival of L-LCNEC patients. On this basis, we constructed a visual nomogram prognosis model, which had the potential to predict the 1-year and 3-year survival rate of L-LCNEC patients and could be used as an auxiliary prediction tool in clinical practice.