Mathematical Models for Intraoperative Prediction of Metastasis to Lymph Nodes in the Hilar-intrapulmonary or the Mediastinal Region in Patients With Clinical Stage I Non-small Cell Lung Cancer: A Retrospective Cohort Study

Yue Zhou, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Junjie Du, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Changhui Ma, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Fei Zhao, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Hai Li, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Guoqiang Ping, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Wei Wang, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Jinhua Luo, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Liang Chen, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Kai Zhang, Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital
Shijiang Zhang (shijiangzhang@hotmail.com), Jiangsu Province Hospital and Nanjing Medical University First Affiliated Hospital, https://orcid.org/0000-0001-6209-918X


Background
Non-small cell lung cancer (NSCLC) is among the deadliest malignancies in the world [1]. Lobectomy plus complete lymph node (LN) dissection, with removal of all ipsilateral hilar and mediastinal lymphatic tissue, remains the standard surgical procedure for resectable lung cancer [2]. Over the last decade, more early-stage lung cancer has been diagnosed, partly thanks to the development of computed tomography (CT) screening [3,4]. In the current era of value-driven healthcare, it is important to consider novel approaches that maintain the curative intent of pulmonary surgery while reducing unnecessary removal of surrounding healthy tissue. Tailoring LN dissection during surgery for early-stage NSCLC has become a particularly attractive target for value optimization, because limiting LN harvest may avoid unnecessary intraoperative injury, shorten the operative time, reduce postoperative morbidity, and improve cost-effectiveness [5][6][7][8][9].
A variety of techniques, including radiologic imaging and endoscopic and surgical techniques, have been developed to determine the clinical N category [10]. However, it remains difficult to find a high-quality, cost-effective investigation that accurately determines the pathological N stage of early NSCLC [11][12][13].
Therefore, there is no universally accepted method of LN dissection for this patient population [14,15].
Quick and accurate prediction of the presence and precise region of LN metastasis in clinical stage I NSCLC, before or during the operation, would help surgeons choose the optimal surgical approach. We previously explored the clinicopathologic factors affecting regional LN metastasis in clinical stage I NSCLC [16]. In this study, we developed mathematical models to predict regional LN status in patients with clinical stage I NSCLC by studying the relationship between clinicopathologic variables and hilar-intrapulmonary nodal metastasis (HNM) or mediastinal nodal metastasis (MNM), with the aim of helping surgeons make reasonable decisions about LN dissection.

Study Population
The institutional ethics committee approved the study with a waiver of consent (approval no. 2017-SR-097). The study was registered at www.chictr.org.cn (registration number ChiCTR2000031620). Clinical data of consecutive patients with primary lung cancer who underwent video-assisted thoracoscopic surgery at our hospital from January 2017 to September 2019 were collected and reviewed retrospectively.
The enrollment criteria of this study were as follows: (1) diagnosed as having clinical stage I NSCLC based on the new International Staging System for NSCLC (National Comprehensive Cancer Network Guidelines Version 3.2014: Staging Non-Small Cell Lung Cancer); (2) underwent lobectomy with postoperative pathological confirmation of NSCLC and had complete LN dissection.
Patients who exhibited any one of the following conditions were excluded from this study: (1) preoperative tumour size > 4 cm on CT imaging; (2) preoperative LN > 1 cm at the shortest diameter on CT imaging; (3) had evidence of distant metastasis; (4) had preoperative chemotherapy or radiotherapy; (5) had previous or coexistent tuberculosis or malignant diseases; (6) had LN dissection that did not meet the current standards of complete LN dissection (i.e., all LN stations, including stations 10-14, right-hand stations 2-4 and 7-9, and left-hand stations 4-9); (7) had synchronous lung cancers or multiple primary cancers; (8) postoperative pathological diagnosis revealed special types of pulmonary infection or other LN diseases; or (9) had incomplete clinical data.
Eligible patients who underwent surgery before December 2018 were included in the derivation cohort to establish the models, and patients who underwent surgery after December 2018 were entered into the validation cohort.
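The temporal split described above can be sketched as follows. This is a minimal illustration only: the text says "before" and "after December 2018" without giving an exact boundary, so the cut-off date of 1 December 2018, the function name `assign_cohort`, and the example records are all assumptions.

```python
from datetime import date

# Assumed boundary: the text gives only "before/after December 2018".
CUTOFF = date(2018, 12, 1)

def assign_cohort(surgery_date):
    """Return 'derivation' for surgery before the cutoff, else 'validation'."""
    return "derivation" if surgery_date < CUTOFF else "validation"

# Hypothetical patient records, not study data.
patients = [
    {"id": 1, "surgery_date": date(2017, 3, 14)},
    {"id": 2, "surgery_date": date(2019, 6, 2)},
]
cohorts = {p["id"]: assign_cohort(p["surgery_date"]) for p in patients}
```

Splitting by calendar time rather than at random means the validation cohort contains only patients treated after the models' derivation period.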

Clinicopathological Variables
Trained chart abstractors collected clinical variables such as age, gender, smoking history, family history, tumour markers in blood, the identity of the lobe, tumour size, and tumour location within the lobe. The chart abstractors also collected the postoperative pathological results for lymph node metastasis from the conventional sections described below. Two clinicians (YZ and CM) verified the accuracy of the predictor variables.
For histopathologic assessment, part of the pulmonary tumour was processed for rapid frozen sections during the operation. The remaining tumour tissue and all lymph nodes were fixed in 10% formalin, embedded in paraffin and conventionally sectioned. Both the intraoperative rapid frozen sections and the postoperative conventional sections are kept in the department of pathology in our hospital for 10 years.
All rapid frozen sections in this study were reviewed and assessed independently by two experienced pathologists. The classification of pathological variables such as histological tumour type and grading, lymphatic vessel invasion status, bronchial mucosa and cartilage invasion status, visceral pleural invasion status and nerve invasion status was based on the consensus of these two pathologists, who were blinded to the results of the postoperative formalin-fixed paraffin-embedded sections.

Model Development
The method for developing the mathematical models for predicting HNM and MNM has been described in detail previously [17]. For clinical use, two nomograms were constructed by proportionally converting each regression coefficient in the multivariate logistic regression to a 0- to 100-point scale, using the rms package in R 3.3.1 (www.r-project.org).
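The proportional conversion of regression coefficients to points can be illustrated with a short sketch. It is not the rms implementation: it assumes that every predictor is binary with a positive coefficient, so that the largest coefficient maps to 100 points. The coefficient values, the intercept, and the names `nomogram_points` and `predicted_risk` are hypothetical and are not taken from the study.

```python
import math

def nomogram_points(coefs):
    """Scale binary-predictor coefficients so the largest maps to 100 points."""
    largest = max(abs(b) for b in coefs.values())
    return {name: round(100 * abs(b) / largest, 1) for name, b in coefs.items()}

def predicted_risk(intercept, coefs, present):
    """Logistic probability for the predictors listed in `present`."""
    linear_predictor = intercept + sum(coefs[name] for name in present)
    return 1 / (1 + math.exp(-linear_predictor))

# Hypothetical coefficients, not values from the study.
example = {"tumour_size_above_cutoff": 1.2, "pleural_invasion": 0.6, "cea_elevated": 0.3}
points = nomogram_points(example)
risk = predicted_risk(-3.0, example, ["tumour_size_above_cutoff", "cea_elevated"])
```

Because the points are a linear rescaling of the coefficients, a patient's total points and the underlying logistic model rank patients identically; the nomogram only makes the sum easy to read at the bedside.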

Statistical Analysis
All statistical analyses were performed using SPSS 18.0 (IBM, Armonk, NY). Odds ratios (ORs) with 95% confidence intervals (CIs) were calculated. In all analyses, p < 0.05 was considered statistically significant.
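For readers less familiar with how an OR and its CI follow from a logistic regression coefficient, a minimal sketch is given here. It assumes the usual Wald interval, exp(beta ± 1.96 × SE); the text does not state which interval was used, and the function name `odds_ratio_ci` and the input values are hypothetical.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, not values from the study.
odds_ratio, lower, upper = odds_ratio_ci(beta=0.9, se=0.3)
```

An interval that excludes 1 corresponds to p < 0.05 for the Wald test of that coefficient.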

Patients' characteristics
During the study period, 3765 consecutive patients with primary lung cancer underwent video-assisted thoracoscopic surgery. Of these, 1003 patients who met the inclusion criteria were enrolled and divided into the derivation and validation cohorts (585 and 418 patients, respectively). The clinicopathologic characteristics of the patients are listed in Table 1. The clinicopathological factors of patients in the derivation cohort with and without HNM were evaluated. To yield a simple risk score model, continuous variables were converted into categorical variables, with the best cut-off value taken at the maximum Youden index in receiver operating characteristic (ROC) curve analysis (Supplementary table 1

Validation tests of the prediction models
The fitted models from the derivation cohort were applied to the validation cohort to estimate the risk of HNM or MNM. In the validation cohort, the area under the ROC curve (AUC) of the estimated risk was 0.872 (95% CI, 0.831-0.913) for HNM and 0.823 (95% CI, 0.766-0.879) for MNM, demonstrating good discriminatory power (Fig. 1).
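The two ROC-based quantities used in this study, the AUC and the Youden index, can be sketched in plain Python. This is an illustration under stated assumptions, not the SPSS procedure: the AUC is computed as the proportion of positive-negative pairs ranked correctly, a score at or above the cut-off is treated as positive, no confidence interval is computed, and the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Share of positive/negative pairs where the positive scores higher (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j

# Hypothetical predicted risks and outcomes (1 = nodal metastasis present).
scores = [0.05, 0.10, 0.12, 0.30, 0.25, 0.40, 0.80]
labels = [0, 0, 0, 0, 1, 1, 1]
area = auc(scores, labels)
cutoff, j = youden_cutoff(scores, labels)
```

In this toy data set, 11 of the 12 positive-negative pairs are ranked correctly, and the Youden index peaks at a cut-off of 0.25.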

Definition of low and high risks of LN metastasis
According to the maximized Youden's index (sensitivity + specificity - 1), the optimal clinically applicable cut-off value of the estimated risk was 0.209 for HNM and 0.132 for MNM (Supplementary table 5). Therefore, we defined the groups with different predicted risks by using a priori

Discussion
The incidence of LN metastasis in patients with clinical stage I NSCLC is significantly lower than that in patients with advanced lung cancer. The randomized ACOSOG Z0030 trial concluded that systematic mediastinal lymph node dissection did not improve survival in patients with early-stage NSCLC in whom pre-resection sampling had confirmed the absence of positive lymph nodes in the mediastinum and hilum [18]. However, other authors have pointed out that, on postoperative pathology, even small lung cancers (< 2 cm) had hilar and mediastinal node metastasis with an incidence of up to 20% [14,19]. Furthermore, patients with positive MNM exhibited a 20-38% incidence of skip metastasis, a phenomenon in which MNM occurs without HNM [20,21]. Therefore, accurate lymph node staging in patients with clinical stage I NSCLC is important in guiding the choice of surgical treatment.
In this study, we identified six clinical variables as independent predictors common to lymph node metastasis in both the hilar-intrapulmonary and the mediastinal regions, which is consistent with the literature [22][23][24]. However, visceral pleural invasion was associated only with MNM. This finding suggests that HNM and MNM should be considered separately. We believe it reflects the anatomy of the pulmonary LN system. Lymph nodes involved in cancer metastasis are conventionally labelled with numbered stations and named according to their anatomical locations: the hilar-intrapulmonary lymph nodes (groups 10-14) and the mediastinal lymph nodes (groups 2R, 3, 4R, 7, 8 and 9R on the right; groups 4L, 5, 6, 7, 8 and 9L on the left). In central NSCLC, LN metastasis generally proceeds step by step along the bronchial tree, from the intrapulmonary region to the hilum and then to the mediastinal region.
In contrast to central NSCLC, peripheral NSCLC that invades the visceral pleura may skip to the mediastinal nodes without HNM, because lymphatic capillaries drain directly from the pleura to the mediastinal lymph nodes [25,26].
Rapid intraoperative pathological results help surgeons select the right procedure for the patient, for instance wedge resection, segmentectomy, or lobectomy [27]. However, the complicated lymphatic spread of lung cancer makes it difficult for surgeons to choose a pattern of lymph node dissection. Therefore, we developed two practical prediction models, based on logistic regression, for regional LN metastasis in patients with clinical stage I NSCLC. In terms of clinical relevance, the models provide more precise intraoperative estimates of regional LN metastasis for individual patients with clinical stage I NSCLC. In particular, the model for MNM is a useful supplement to surgeons' decision-making about systematic LN resection.
Several published models have been developed to assess lymph node disease in lung cancer [28][29][30][31][32]. However, some focus only on MNM, and others include only preoperative variables, whereas appropriate candidates for limited LN resection should have pathological confirmation of negative LNs in both the hilar and the mediastinal regions. For these reasons, none of these models is commonly employed in clinical practice. In our models, by contrast, all variables, from the radiographic tumour size to visceral pleural invasion, are available before or during the operation through routine clinical measures, without consuming extra resources or time. The associated nomograms make the models even easier to use in clinical practice, guiding surgeons to choose the optimal method of LN dissection rapidly.
In a case of clinical stage I NSCLC, when intraoperative rapid pathological results reveal invasive lung cancer, standard lobectomy is performed, with the following recommendations for LN dissection based on the patient's risk category in the prediction models (Table 4). (1) When both the HNM risk and the MNM risk are low, LN dissection is unnecessary, or regional LN sampling alone is adequate. (2) When the HNM risk is high and the MNM risk is low, the tumour is usually a central tumour involving the bronchial tree. In this case, complete dissection of the hilar-intrapulmonary LNs is required, but the mediastinal lymph nodes need not be dissected, or regional LN sampling in the mediastinum is sufficient. (3) When the HNM risk is low and the MNM risk is high, this usually indicates skip metastasis from a peripheral lung cancer invading the visceral pleura. In this case, complete mediastinal LN dissection is required, but hilar-intrapulmonary LN dissection is unnecessary, or regional LN sampling in this region is sufficient. (4) When both the HNM risk and the MNM risk are high, systematic LN dissection is required, including the hilar-intrapulmonary region and the mediastinal region.
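The four recommendations above amount to a two-by-two lookup on the predicted risks. The sketch below encodes that lookup using the cut-off values reported in this study (0.209 for HNM and 0.132 for MNM). Whether a risk exactly equal to the cut-off counts as high is not stated in the text, so the `>=` comparison is an assumption, as are the function names and the wording of the recommendation strings.

```python
HNM_CUTOFF = 0.209  # cut-off for HNM reported in the text
MNM_CUTOFF = 0.132  # cut-off for MNM reported in the text

# Keys are (HNM risk group, MNM risk group).
RECOMMENDATION = {
    ("low", "low"): "no LN dissection, or regional LN sampling only",
    ("high", "low"): "complete hilar-intrapulmonary dissection; mediastinal sampling at most",
    ("low", "high"): "complete mediastinal dissection; hilar-intrapulmonary sampling at most",
    ("high", "high"): "systematic dissection of both regions",
}

def risk_group(probability, cutoff):
    """Assumption: a predicted risk at or above the cut-off is 'high'."""
    return "high" if probability >= cutoff else "low"

def recommend(p_hnm, p_mnm):
    """Map the two predicted risks to a risk category and its recommendation."""
    key = (risk_group(p_hnm, HNM_CUTOFF), risk_group(p_mnm, MNM_CUTOFF))
    return key, RECOMMENDATION[key]

# Hypothetical patient with a high HNM estimate and a low MNM estimate.
category, advice = recommend(0.30, 0.05)
```

Keeping the thresholds and the recommendation table as separate constants makes it straightforward to substitute locally recalibrated cut-offs without changing the logic.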
Given the retrospective nature of this single-institution study, selection bias is inherent in our study population. Although we validated the models in a separate cohort from our institution, they still need to be validated in patients from other centres. Serum CEA and CYFRA 21-1 status were predictive factors in our models, but the tumour markers measured may differ between hospitals, which limits the applicability of the models. Moreover, the interval between the tumour marker blood test and surgery was not standardized, which may have an uncertain effect on serum status. Because our study did not include tumour recurrence or survival in this patient population, the relationship between the predicted values and survival is unknown [33]. Finally, the intraoperative rapid pathological assessment of risk-predicting variables such as visceral pleural invasion, bronchial mucosa and cartilage invasion, and vascular invasion is inaccurate to some extent. This may improve in the near future with the development of advanced tools for pathological diagnosis.

Conclusions
Taken together, the mathematical models for predicting regional lymph node metastasis were accurate and easy to use. Based on patients' clinicopathologic variables available before and during the operation, these models can support surgical decision-making about LN dissection in patients with clinical stage I NSCLC.