Conditional Survival Probability of Non-Small Cell Lung Cancers, Based on The SEER Database

Background: This study explores the dynamic survival probability of lung cancer patients after resection, conditioned on their having already survived several years, in order to provide more precise monitoring and treatment information for patients with non-metastatic non-small cell lung cancer (NSCLC). Materials and Methods: From the Surveillance, Epidemiology, and End Results (SEER) database (2000-2016), 95,531 eligible patients with non-metastatic NSCLC who had undergone surgery were enrolled, and their TNM stage was reclassified. Conditional survival probability (CS) and actuarial overall survival (OS) were used to explore the relationship between clinicopathological characteristics and cancer prognosis. Results: The 1-, 3-, 5- and 10-year OS of the included patients were 83.6% (95% CI: 83%-84%), 62.9% (95% CI: 62.6%-63.1%), 50.8% (95% CI: 50.6%-51.0%) and 33.1% (95% CI: 32.7%-33.6%), respectively. For patients who had already survived 1, 2, 3, 4 and 5 years after diagnosis, the probability of surviving an additional 3 years was 67%, 71%, 73%, 75% and 77%, respectively. The enrolled population was reclassified into 9 cohorts (T1aN0, T1bN0, T1cN0, T2aN0, T2bN0, T3N0, T4N0, T1-4N1 and T1-4N2) according to the 8th edition of the TNM staging system. Patients with an unfavorable tumor stage at initial surgical diagnosis showed a significant improvement in CS over time. Analyses based on other clinical features led to a similar conclusion: the poorer the initial diagnosis, the more significant the benefit in conditional survival over time. Conclusion: The worse the patient's initial prognosis, the more significant the benefit in time-dependent conditional survival probability; long-term cancer survivors may have a better prognosis.


Introduction
Survival estimates for cancer patients have traditionally been based on TNM stage at the time of diagnosis or after treatment [1], which answers the prognostic questions that many cancer patients care about. This approach makes the 5-year survival rate of cancer patients a fixed value, which can be understood as a "static survival estimate" [2]. However, for patients who have already survived several years after diagnosis, the survival probability established at the time of diagnosis may no longer apply, because the overall survival rate covers both the patients who died within the first few years and those who lived through them. For patients who have lived through the first few years, the pressing question may be: "If I have already lived x years after diagnosis, what is the probability that I will survive another y years?"
In the past few years, conditional survival (CS) probability has emerged as a direct answer to this question [3]. CS refers to the probability of surviving another n years given that a patient with a chronic disease has already been alive for m years after diagnosis or treatment; the concept is derived from conditional probability in biostatistics [4][5][6]. It can provide a dynamic and more precise survival estimate for cancer patients [7]. The prognosis of cancer patients who survive the first few years is known to be better, because the impact of death risk factors on prognosis gradually weakens over time [8][9][10]; likewise, for patients who have lived through the first few years after radiotherapy or chemotherapy, the adverse effects of these treatments on the body gradually weaken. Conditional survival means that, on average, the prognosis of long-term cancer survivors is better than that of newly diagnosed patients [11][12][13]. This study aims to explore postoperative conditional survival in patients with non-metastatic non-small cell lung cancer, in order to give doctors more useful information for formulating treatment and monitoring plans and to give patients and doctors a new understanding of cancer prognosis.

Data Source
The patients in this study were selected from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database [14]. The raw data were downloaded from the SEER web site (https://seer.cancer.gov/data/) via SEER*Stat in client-server mode after we submitted a request for access and signed the SEER research data agreement.

Study Population
In this study, patients with non-metastatic NSCLC from 18 registries in the SEER database were included. Because SEER is a public database, analysis of these lung cancer patients does not require informed consent or institutional review. The clinicopathological characteristics of the patients were screened. The inclusion criteria were: 1) patients aged 15 years or older diagnosed with lung cancer between 2000 and 2016; 2) definite pathological diagnosis of NSCLC; 3) single primary tumor; 4) complete follow-up data (patients who died within 1 month after diagnosis or who lacked a specific follow-up time were excluded); 5) complete clinicopathological characteristics (such as age, tumor size, surgery status, and TNM stage). The exclusion criteria were: 1) metastatic lung cancer; 2) missing T or N stage; 3) tumor size unavailable, or histological type or grade unclear; 4) diagnosis based on autopsy or death certificate only; 5) death within one month of diagnosis or missing follow-up data. TNM stage was reclassified according to tumor size based on the eighth edition of the American Joint Committee on Cancer (AJCC) staging system. Details are shown in Figure 1.
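The size-based reclassification described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `t_category_by_size` and `cohort` are hypothetical, the cutoffs are the tumor-size criteria of the AJCC 8th edition T categories, and invasion-based criteria (which can also raise the T category) are deliberately ignored.

```python
def t_category_by_size(size_cm: float) -> str:
    """Return the AJCC 8th edition T category implied by tumor size alone.

    Assumption: only the size criteria are applied; local invasion,
    separate nodules, etc. are not considered in this sketch.
    """
    if size_cm <= 0:
        raise ValueError("tumor size must be positive")
    # (upper bound in cm, category); each bound is inclusive
    cutoffs = [
        (1.0, "T1a"),
        (2.0, "T1b"),
        (3.0, "T1c"),
        (4.0, "T2a"),
        (5.0, "T2b"),
        (7.0, "T3"),
    ]
    for upper, label in cutoffs:
        if size_cm <= upper:
            return label
    return "T4"  # larger than 7 cm


def cohort(size_cm: float, n_category: str) -> str:
    """Map a patient to one of the nine study cohorts (hypothetical helper)."""
    if n_category == "N0":
        return t_category_by_size(size_cm) + "N0"
    if n_category in ("N1", "N2"):
        # Node-positive patients are pooled across T categories.
        return "T1-4" + n_category
    raise ValueError("expected N0, N1 or N2 for non-metastatic disease")


print(cohort(2.5, "N0"))  # T1cN0
print(cohort(6.0, "N1"))  # T1-4N1
```

Pooling node-positive patients into T1-4N1 and T1-4N2 mirrors the nine cohorts named in the text; any finer split would be a different design.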

Statistical Analyses
The analysis proceeded in two steps. First, TNM stage was reclassified based on tumor diameter according to the 8th edition of the American Joint Committee on Cancer (AJCC) staging system (T1aN0, T1bN0, T1cN0, T2aN0, T2bN0, T3N0, T4N0, T1-4N1, and T1-4N2). Clinicopathological characteristics, such as surgical status, positive lymph nodes, tumor grade, tumor size and patient age, were also stratified (details in Table 1). Subsequently, the Kaplan-Meier method was used to estimate actuarial survival rates, such as the 5-year and 10-year survival rates. Most records with missing values were excluded.
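As an illustration of the Kaplan-Meier step, the following minimal sketch computes the product-limit estimate from follow-up times and event indicators. It is not the authors' implementation (the study used GraphPad Prism and R); the function names and the toy data are assumptions made only for this example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of S(t).

    times  : follow-up time of each patient (e.g. months)
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # deaths plus censorings at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t are still counted as at risk at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def survival_at(curve, t):
    """Read S(t) off the step function returned by kaplan_meier."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Toy data (hypothetical, six patients, times in months)
toy_times = [6, 12, 12, 24, 36, 60]
toy_events = [1, 1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
print(survival_at(toy_curve, 30))  # 4/9, about 0.444
```

The actuarial 5-year or 10-year OS quoted in the text corresponds to reading such a curve at 60 or 120 months.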
The second statistical method used in this study is the conditional survival probability (CS). To illustrate how conditional survival estimates are obtained from cumulative survival estimates, suppose we are interested in the population's 5-year lung cancer survival probability conditioned on having already survived 5 years. The estimate is obtained by dividing the cumulative survival at 10 years by the cumulative survival at 5 years. Similarly, the 1-year survival estimate conditioned on having already survived 5 years after diagnosis is obtained by dividing the cumulative survival at 6 years by the cumulative survival at 5 years. Subtracting this probability from 1 gives the probability of dying in that year, conditioned on having already survived 5 years. Mathematically, CS can be expressed as CS(n | m) = S(n) / S(m), with m < n, where CS(n | m) is the probability of surviving to n years given that the patient has already survived m years after diagnosis. In this study, we estimated the additional 5-year conditional survival probability of patients who had already survived x years as CS(x+5 | x) = OS(x+5) / OS(x). Finally, the differences between the actuarial OS and the CS of the population were compared and analyzed. All statistical analyses were performed with GraphPad Prism version 8.0.2 and R version 3.6.3. All statistical tests were two-sided, and a P value < 0.05 was considered statistically significant.
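The ratio that defines conditional survival can be made concrete with a short sketch. The helper `conditional_survival` and the dictionary `os_reported` are hypothetical names introduced for illustration; the OS values are the whole-cohort estimates reported in this article (1-, 3-, 5- and 10-year OS), and the resulting figures are worked arithmetic, not results reported by the study.

```python
def conditional_survival(os_by_year, survived, additional):
    """CS(survived + additional | survived) = OS(survived + additional) / OS(survived).

    os_by_year : dict mapping years since diagnosis to cumulative OS (0-1);
                 both required time points must be present (KeyError otherwise)
    survived   : years the patient has already survived
    additional : further years of survival being asked about
    """
    s_now = os_by_year[survived]
    s_later = os_by_year[survived + additional]
    if not 0 < s_now <= 1 or not 0 <= s_later <= s_now:
        raise ValueError("cumulative survival must be non-increasing and in (0, 1]")
    return s_later / s_now


# Whole-cohort OS as reported in the abstract; OS at time 0 is 1 by definition.
os_reported = {0: 1.0, 1: 0.836, 3: 0.629, 5: 0.508, 10: 0.331}

# Probability of surviving 5 more years after already surviving 5 years.
cs_5_after_5 = conditional_survival(os_reported, survived=5, additional=5)
print(round(cs_5_after_5, 3))  # 0.331 / 0.508, about 0.652

# Probability of dying within those 5 years, given survival to year 5.
print(round(1 - cs_5_after_5, 3))  # about 0.348
```

Because only the cumulative OS values are needed, the same calculation applies to any subgroup once its Kaplan-Meier curve is available.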
Results
At the last follow-up, the median follow-up time was 62 months and the number of deaths or events was 51,175. The 1-, 3- and 5-year OS were 83.6% (95% CI: 83%-84%), 62.9% (95% CI: 62.6%-63.1%) and 50.8% (95% CI: 50.6%-51.0%), respectively. Figure 2(A) shows the rapid decline in overall survival during the first three years. Although the overall survival of lung cancer patients is unsatisfactory, for those who live several years after diagnosis the mortality rate gradually decreases (Figure 2(B)) and the probability of surviving an additional 5 years steadily increases (Figure 2(C-D) and Table 2). Figure 3 shows the decrease in actuarial survival rates over time and the increase in estimated CS(8) at 1-5 years for all patients, demonstrating that as survival time increases, the gap between the overall survival rate and the conditional survival rate becomes more significant.

Pathological types
Other prognostic factors were also explored in this study. Analysis by pathological type showed that the prognosis of patients with adenocarcinoma was better than that of patients with other types, including squamous cell carcinoma (SCC), adenosquamous cell carcinoma (ACC), and large cell carcinoma (LCC) (Figure 4(A)). Among patients who survived the first few years after diagnosis, the worse the prognosis of the pathological type (such as SCC, ACC, and LCC), the more significant the conditional survival benefit. For example, in the adenocarcinoma cohort (n = 55,592), the 8-year OS was 46.3% (95% CI: 45%-47%), but the 8-year survival rate for those who had survived 5 years after diagnosis was 79% (95% CI: 78%-80%), a difference of 22.7%. In the SCC cohort (n = 24,062), CS(8) increased from 30.2% (95% CI: 29%-33%) at baseline to 70% (95% CI: 67%-74%) at 5 years of follow-up; in the ACC cohort (n = 2,188), it increased from 29.3% (95% CI: 27%-31%) at baseline to 72% (95% CI: 67%-76%) at 5 years of follow-up; and in the LCC cohort (n = 3,712), it increased from 29.4% at baseline to 74% at 5 years of follow-up (Figure 4(B) and Table 4). This indicates that the longer a cancer patient survives, the greater the improvement in prognosis and the less significant the influence of pathological type on prognosis.

Patient age
The analysis showed that chemotherapy, radiotherapy, more than 16 positive lymph nodes, regional SEER stage, Caucasian race, and a primary tumor located in the whole lung or bronchus were associated with lower 5-year actuarial survival (Table 4). Time-dependent conditional survival probability was also explored. For example, patients in the female group had an actuarial 5-year OS of 57.4%, while the 5-year survival probability of those who had already survived 2 years (CS(5 | 2)) was 74%. The 5-year OS in the male cohort was 44%, and the CS(5 | 2) was 67% (Figure 5(B)). Similarly, the 5-year OS in the chemotherapy cohort was 42.1% and the CS(5 | 2) was 62%; the corresponding difference in the non-chemotherapy cohort was 19.6% (Figure 5(D)). Patients in the radiotherapy cohort had an actuarial 5-year OS of 30.2% and a CS(5 | 2) of 54%. The detailed overall survival rates and conditional survival probabilities for other prognostic factors are shown in Table 4.

Discussion
The observation that the risk of tumor-specific death decreases with the length of postoperative survival underlies the concept of conditional survival. Conditional survival (CS) means that, on average, long-term cancer survivors have a better prognosis than newly diagnosed individuals [15][16][17]. Most patients who survive the first few years are those who responded well to treatment: their condition has been alleviated, complications have been controlled, adverse reactions caused by surgery, radiotherapy and chemotherapy have gradually weakened, and the threat of death-related risks has gradually decreased [4].
The 5-year overall survival rate of the enrolled patients was 51%. For those who had survived 1, 2, 3, 4 or 5 years after diagnosis, the probability of surviving another 3 years was 67%, 71%, 73%, 75% and 77%, respectively, demonstrating that the survival probability increases gradually as patients survive longer.
The results showed that the 5-year OS decreased gradually from T1aN0 to T1-4N2, ranging from 55% for T1aN0 to 17% for T1-4N2, a discouraging result for patients initially diagnosed with advanced disease. However, CS improved significantly as patients survived longer, especially for those with advanced disease. For example, the difference between the 5-year actuarial OS and CS(5 | 3) was 15% in the T1aN0 cohort, 27% in T2aN0, 35% in T3N0, and 39% in T4N0 (Figure 4D). For long-term survivors, the differences in conditional survival probability between T and N stages also gradually narrowed and tended to converge. For patients who had survived 5 years after diagnosis, the probability of still being alive in the tenth year was 76%, 63%, 64%, 67% and 60% in the T1aN0, T2aN0, T3N0, T4N0 and T1-4N1 cohorts, respectively. This shows that prognosis is related not only to clinicopathological factors but also to the time survived after surgery; as survival time lengthens, prognosis becomes increasingly time dependent.
The age-based grouping shows that prognosis varies greatly between age groups: the 5-year OS was lowest in elderly patients (32.4%) and highest in young patients (46.5%). In the cohort aged 15-45 years, the 3-year conditional survival probability increased from 83% at baseline to 93% at 60 months; in patients older than 70 years, the 3-year CS increased from 56% at baseline to 65% at 60 months. Compared with young and middle-aged patients, the improvement in conditional survival in elderly patients was not significant, which may be due to their longer smoking history, the high incidence of cardiovascular and respiratory diseases, or poor physical performance.
Other characteristics, including poor histological grade, larger tumor size, adenosquamous carcinoma, lymph node involvement, male sex, Caucasian race, tumor location in the middle lobe, and SEER stage, were associated with poor prognosis (Figure 6). However, from the perspective of long-term survival, it may be more meaningful to examine the conditional survival probability and to compare actuarial OS with CS. Although unfavorable clinicopathological features are associated with poor 5-year OS, the conditional survival benefit becomes very significant as survival time increases. For example, in the cohort with more than 16 lymph node metastases, patients who survived the first three years after diagnosis (CS(5 | 3)) had a better 5-year survival rate than the 5-year OS of all patients in the cohort, an increase of 16% (from 55% to 71%), while patients without lymph node metastasis gained only 4% (from 75% to 79%). Similarly, CS(8 | 5) increased by 26% in patients with larger tumor diameters but by only 5% in patients with smaller tumor diameters (Figure 3). Patients with adverse clinical factors who do not pass the most critical period die within the first few years; as a given patient's years of survival increase, these adverse factors become increasingly irrelevant to prognosis. Therefore, our data suggest that CS may be a more valuable tool late in the postoperative period for estimating the prognosis of patients who were predicted to die based on initial actuarial estimates.
Some limitations of this study should not be overlooked. First, this is a retrospective study, so bias is inevitable in the collection of clinicopathological characteristics, diagnosis and treatment data. Second, the patients in this study were diagnosed over a long time span (2000-2016), and the relationship between the period of diagnosis and prognosis has not been explored. Third, multiple primary tumors were excluded to eliminate interference, but errors were unavoidable in reclassifying the T and N stages. Nonetheless, this study proposes a dynamic assessment of the conditional survival probability of lung cancer patients, thereby allowing the predicted survival time after lung resection to be adjusted. The tool may prove useful to patients, doctors and researchers, and may guide dynamic and personalized clinical management decisions.

Conclusion
The worse the initial diagnosis of a cancer patient, the more significant the benefit in time-dependent conditional survival probability; long-term cancer survivors may have a better prognosis.

Availability of data and materials
The datasets supporting the conclusions of this article are included within the article.

Conflict of interest
The authors declare no conflict of interest.

Abbreviation:
a The conditional 5-year survival probability for those who had already survived 2 years in the T1aN0 cohort: if patients had survived 2 years after diagnosis, the probability of still being alive in the seventh year after diagnosis is 74% (purple background).
b At the beginning of the interval.
c Relative to the previous year.
d For example, the 5-year overall survival rate in the T1aN0 cohort is 73%, and the 10-year overall survival rate in the T1aN0 cohort is 55% (red background).