As shown in Fig. 1, 4432 small-cell lung cancer patients were retained after data cleaning, of whom 1776 were assigned to the training cohort and 2656 to the validation cohort. The clinical and pathological characteristics of patients in the training cohort, the validation cohort, and the pre-cleaning group are shown in Table 1.
Univariate Cox analysis of factors influencing survival
Univariate Cox regression analysis was performed on the 1776 small-cell lung cancer patients in the training cohort. The results showed that age, race, sex, degree of differentiation, N stage, M stage, total stage, surgery of the primary site, tumor size, extension, lymph node involvement, metastasis, death from cancer and non-cancer causes, sequence of the primary tumor, tumor number, and age at diagnosis were associated with survival time and survival status (P < 0.05). T stage, whether or not surgery was performed, marital status, and laterality were not associated with survival time or survival status (P > 0.05) (Table S1).
Multivariate Cox analysis of factors influencing survival
The variables that were statistically significant in the univariate Cox regression analysis were included in the multivariate analysis. The results showed that age 65-69, 75-79, 80-84, and >=85 years, white race, male sex, total stage II, and the extent of tumor invasion were correlated with survival time and survival status (Table 2).
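The screening step described here (carrying variables with P < 0.05 from the univariate analysis into the multivariate model) can be illustrated with a minimal sketch. It assumes a two-sided Wald test on each Cox coefficient; the variable names, coefficients, and standard errors below are hypothetical and are not taken from Table S1 or Table 2.

```python
import math

def wald_summary(beta, se, z_crit=1.96):
    """Hazard ratio, 95% CI and two-sided Wald p-value for one Cox coefficient."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under a standard normal
    return {
        "hr": math.exp(beta),
        "ci_low": math.exp(beta - z_crit * se),
        "ci_high": math.exp(beta + z_crit * se),
        "p": p,
    }

# Hypothetical univariate results: name -> (coefficient, standard error)
univariate = {
    "sex_male": (0.18, 0.05),
    "marital_status": (0.03, 0.06),
    "age_ge_85": (0.75, 0.12),
}

# Keep only variables significant at the 0.05 level for the multivariate model
selected = [name for name, (b, s) in univariate.items()
            if wald_summary(b, s)["p"] < 0.05]
print(selected)
```

In this toy input only the first and third variables pass the threshold, so they alone would be entered into the multivariate model.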
Building the nomogram model
Based on the results of the multivariate analysis, a Cox regression model was refitted using the significant variables above. These variables were included in the nomogram, and each was assigned points according to the results of the Cox regression model. The total score is obtained by adding the points of the 5 variables. The total score corresponds to values on the survival axes, from which the 1-year, 3-year, and 5-year survival rates of small-cell lung cancer patients can be predicted. The higher the total score, the higher the survival rate, and vice versa. Detailed results are shown in Fig. 2.
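The point assignment and summation can be sketched as follows. This is an illustration only: the coefficients and category labels are hypothetical rather than the fitted values behind Fig. 2, and the scaling rule (the variable with the widest coefficient range spans 0 to 100 points) is a common convention assumed here, not one stated in the text. The sketch stops at the total score and does not reproduce the survival axes.

```python
# Hypothetical Cox coefficients (log hazard ratios) per category; NOT the fitted values.
COEFS = {
    "age": {"<65": 0.0, "65-69": 0.2, "75-79": 0.4, "80-84": 0.5, ">=85": 0.8},
    "sex": {"female": 0.0, "male": 0.2},
    "race": {"other": 0.0, "black": 0.1, "white": 0.2},
    "stage": {"I": 0.0, "II": 0.4},
    "extension": {"localized": 0.0, "extended": 0.4},
}

def build_point_table(coefs):
    """Scale coefficients so the widest-ranging variable spans 0 to 100 points."""
    max_range = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    table = {}
    for var, levels in coefs.items():
        low = min(levels.values())
        table[var] = {name: round(100 * (b - low) / max_range, 1)
                      for name, b in levels.items()}
    return table

def total_points(patient, table):
    """Sum the points of every variable for one patient."""
    return sum(table[var][patient[var]] for var in table)

table = build_point_table(COEFS)
patient = {"age": "65-69", "sex": "male", "race": "white",
           "stage": "II", "extension": "extended"}
print(total_points(patient, table))
```

Because the points in this sketch are scaled from log hazard ratios, the direction in which the total score maps onto survival depends on how the survival axes are constructed, which the sketch leaves out.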
Testing the nomogram model with the training cohort
The C-index was used to evaluate the agreement between the model's predictions and the observed outcomes in the training cohort. The C-index was 0.6817, indicating that the model had acceptable predictive ability.
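For readers unfamiliar with the C-index, the following is a minimal sketch of Harrell's concordance index for right-censored data. The toy data are invented for illustration, and the text does not state which software computed the reported value.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the patient who died
    earlier also had the higher predicted risk (ties in risk count as 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: survival in months, event flag (1 = death, 0 = censored), risk score
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(times, events, risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which a value such as 0.6817 is read.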
The results of the model evaluation using calibration plots are shown in Fig. 3A~C. The sample was divided evenly into three groups, and the fitting coefficient was set to 100, to calculate the actual 1-year, 3-year and 5-year survival rates corresponding to those predicted by the nomogram. The calibration curves for 1-year, 3-year and 5-year survival prediction in the training cohort were close to the ideal 45° dashed line, indicating good agreement between the predicted and the observed values.
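The grouped comparison behind a calibration plot can be sketched as below. The sketch assumes that patients are sorted by predicted survival probability and split into three groups of equal size, and that the observed survival in each group is the Kaplan-Meier estimate at the chosen time point. It omits any resampling, and the data are hypothetical.

```python
def km_survival_at(times, events, t):
    """Kaplan-Meier estimate of the survival probability at time t."""
    surv = 1.0
    event_times = sorted({u for u, e in zip(times, events) if e == 1 and u <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        deaths = sum(1 for x, e in zip(times, events) if x == u and e == 1)
        surv *= 1.0 - deaths / at_risk
    return surv

def calibration_points(pred, times, events, t, n_groups=3):
    """Return (mean predicted, observed) survival at time t for each group."""
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    size = len(order) // n_groups
    points = []
    for g in range(n_groups):
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        idx = order[g * size:end]
        mean_pred = sum(pred[i] for i in idx) / len(idx)
        observed = km_survival_at([times[i] for i in idx],
                                  [events[i] for i in idx], t)
        points.append((mean_pred, observed))
    return points

# Hypothetical predicted 12-month survival, follow-up in months, event flags
pred = [0.1, 0.2, 0.5, 0.6, 0.9, 0.8]
times = [3, 6, 8, 14, 20, 24]
events = [1, 1, 1, 0, 0, 1]
for mean_pred, observed in calibration_points(pred, times, events, t=12):
    print(round(mean_pred, 2), round(observed, 2))
```

Plotting the observed values against the mean predicted values gives the calibration curve; points lying on the 45° line indicate perfect calibration.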
ROC curves were also used to evaluate the model. The significant variables in the multivariate analysis (age, sex, race, total stage, and extension) were used to calculate a risk score for each patient. ROC curves were drawn from the risk scores and the AUC was calculated (Fig. 4A~C). In the training cohort, the AUC was 0.733 for 1-year survival prediction, 0.754 for 3-year survival prediction, and 0.743 for 5-year survival prediction. All three values fall within the range of moderate accuracy, indicating good agreement between the predicted and the actual values.
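As an illustration of how an AUC is obtained from risk scores at a fixed time horizon, the sketch below uses a simplified approach: patients who died within the horizon are positives, patients followed beyond the horizon are negatives, and patients censored before the horizon are dropped. This simplification is an assumption; a time-dependent ROC method that handles censoring differently may have been used for Fig. 4. The data are hypothetical.

```python
def status_at_horizon(times, events, horizon):
    """Indices kept and binary labels (1 = died within the horizon).
    Patients censored before the horizon have unknown status and are dropped."""
    keep, labels = [], []
    for i, (t, e) in enumerate(zip(times, events)):
        if t <= horizon and e == 1:
            keep.append(i)
            labels.append(1)
        elif t > horizon:
            keep.append(i)
            labels.append(0)
    return keep, labels

def auc_from_scores(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical follow-up in months, event flags and risk scores
times = [4, 9, 10, 15, 30, 40]
events = [1, 1, 0, 1, 0, 1]
risk = [2.1, 1.8, 1.0, 1.9, 0.5, 0.7]
keep, labels = status_at_horizon(times, events, horizon=12)
auc_1y = auc_from_scores(labels, [risk[i] for i in keep])
print(round(auc_1y, 3))
```

Repeating the calculation with horizons of 36 and 60 months would give the 3-year and 5-year counterparts.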
Verifying the prediction model with the validation cohort
In the validation cohort, the multivariate Cox regression model was built first, and the C-index obtained in the training cohort was then verified. The C-index in the validation cohort was 0.6778, which is consistent with the predictive accuracy of the model in the training cohort and indicates that the model has predictive ability.
Calibration plots were then used to verify the calibration observed in the training cohort. The results for the validation cohort are shown in Fig. 3D~F. The calibration curves for 1-year, 3-year and 5-year survival prediction in the validation cohort were similar to those of the training cohort and close to the ideal 45° dashed line, indicating that the calibration in the training cohort was relatively accurate.
ROC curves were used to verify the results obtained in the training cohort. The risk score of each patient in the validation cohort was calculated, and the ROC curves and the AUC were then computed from the risk scores (Fig. 4D~F). The 1-year, 3-year, and 5-year survival predictions in the validation cohort all fell within the range of moderate accuracy, indicating that the ROC evaluation of the model in the training cohort was relatively accurate.
Overall survival evaluation in the total sample
All patient data from the training and validation cohorts were pooled to obtain risk scores for the total sample. Survival curves for the high-risk and low-risk groups were plotted (Fig. 5A). Based on the Cox regression model established above, the effects on patient survival of the 5 significant variables from the multivariate analysis (age, sex, race, total stage, and extension) were analyzed, and a survival curve was plotted for each variable (Fig. 5B~F).
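The grouping and survival curves can be sketched as follows. The text does not state how the high-risk and low-risk groups were defined; splitting at the median risk score is an assumption made for this sketch, and the data are hypothetical.

```python
from statistics import median

def split_by_median_risk(risk_scores):
    """Label each patient 'high' or 'low' relative to the median risk score
    (the median cutoff is an assumption, not taken from the text)."""
    cutoff = median(risk_scores)
    return ["high" if r > cutoff else "low" for r in risk_scores]

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    curve = []
    surv = 1.0
    for u in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for t in times if t >= u)
        deaths = sum(1 for t, e in zip(times, events) if t == u and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((u, surv))
    return curve

# Hypothetical pooled data: risk score, follow-up in months, event flag
risk = [0.2, 1.5, 0.7, 2.0, 0.4, 1.1]
times = [30, 5, 18, 3, 25, 9]
events = [0, 1, 1, 1, 1, 0]

groups = split_by_median_risk(risk)
curves = {}
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
print(curves)
```

The same `kaplan_meier` helper can be applied to the subgroups defined by each of the 5 variables to obtain per-variable curves.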
In the survival analysis, the median survival time of small-cell lung cancer patients was 7 (0-71) months, and the mean survival time was 11.26±13.09 months. The 1-year survival rate decreased gradually with increasing age, from 49.2% to 19.67%, except that the rate in the 50-54 age group (49.7%) was slightly higher than that in the preceding age group. The 3-year survival rate also declined gradually, from 16.3% to 5.5%, except in the 65-69 and 50-54 age groups, where it was slightly higher than in the preceding age group. Although data on 5-year survival were incomplete, the rate showed a general decline. The survival rate of black patients (41.14%) was higher than that of white patients (36.79%) and patients of other races (32.61%). The survival rate of women with small-cell lung cancer (40.53%) was higher than that of men (33.32%). By tumor stage, the survival rate generally decreased as stage increased, from 76.3% to 22.59%, although the rates for stage IIA (73.0%), stage III (66.7%) and stage IIIB (52.6%) were slightly higher than those of the preceding stage. By degree of extension, the survival rate decreased gradually as the extent increased, from 44.4% to 32.0% (Table S2).