Patient selection
This retrospective analysis included 802 patients (802 cases) with lung adenocarcinoma (diameter ≤ 3 cm) treated at the Shanghai Chest Hospital from February 2016 to January 2021. All tumors and lymph nodes (no fewer than 6 groups with N1 or N2 dissection) were confirmed by pathological examination. The inclusion and exclusion criteria are illustrated in Supplementary Figure S1.
Pathological evaluation
The histology of all tumors was evaluated by two pathologists specializing in lung tumors, following the 2015 WHO classification of lung adenocarcinoma. Lymph node staging followed the eighth edition of the TNM staging system. Lymph nodes were divided into groups 1 to 14. N1 lymph nodes were the ipsilateral intrapulmonary or hilar lymph nodes (groups 11–14), and N2 lymph nodes were the ipsilateral mediastinal lymph nodes (groups 2–9).
Clinical information of the selected lung adenocarcinoma patients
The clinical information collected for these patients included age, gender, smoking history, and the tumor marker carcinoembryonic antigen (CEA). CT features included lung tumor location, lobulation sign, burr sign, pleural traction, and solid component size. The PET parameter was SUVmax. The clinical information of the enrolled cases is summarized in Table 1. The size of the solid component of a pulmonary nodule, including mixed ground-glass nodules, was defined as the average of the longest diameter of the solid component on its largest cross-section and the diameter perpendicular to it, measured on the lung window [16].
Table 1
Clinical features of 528 patients enrolled in this study
| Clinical characteristics | Training (n = 371): Lymph node (-) | Training (n = 371): Lymph node (+) | Training: P value | Testing (n = 157): Lymph node (-) | Testing (n = 157): Lymph node (+) | Testing: P value |
|---|---|---|---|---|---|---|
| Age, years (median; IQR) | 62; 55 ~ 67 | 62; 55 ~ 68 | 0.62 | 63; 56 ~ 69 | 60; 50 ~ 70 | 0.166 |
| Gender: Male | 103 | 46 | 0.369 | 48 | 23 | 0.555 |
| Gender: Female | 163 | 59 | | 65 | 21 | |
| Smoke: Yes | 84 | 36 | 0.617 | 38 | 15 | 0.958 |
| Smoke: No | 182 | 69 | | 75 | 29 | |
| Location: Upper lobe, right | 109 | 30 | 0.024 | 48 | 12 | 0.101 |
| Location: Middle lobe, right | 46 | 18 | | 21 | 10 | |
| Location: Lower lobe, right | 13 | 8 | | 5 | 1 | |
| Location: Upper lobe, left | 60 | 29 | | 26 | 15 | |
| Location: Lower lobe, left | 38 | 20 | | 13 | 6 | |
| CEA, ng/ml (median; IQR) | 2.32; 1.53 ~ 3.93 | 3.67; 2.11 ~ 8.50 | 0.273 | 2.21; 1.53 ~ 3.52 | 3.59; 2.01 ~ 6.27 | 0.251 |
| Lobulation (+) | 260 | 103 | 0.836 | 112 | 44 | 0.279 |
| Lobulation (-) | 6 | 2 | | 1 | 0 | |
| Burr (+) | 249 | 102 | 0.176 | 112 | 44 | 0.688 |
| Burr (-) | 17 | 3 | | 1 | 0 | |
| Pleural traction (+) | 45 | 53 | < 0.001 | 15 | 18 | < 0.001 |
| Pleural traction (-) | 221 | 52 | | 98 | 26 | |
| Solid components, cm (median; IQR) | 0.85; 0.50 ~ 1.3 | 2.05; 1.65 ~ 2.45 | < 0.001 | 0.95; 0.55 ~ 1.4 | 2.1; 1.65 ~ 2.49 | < 0.001 |
| SUVmax (median; IQR) | 3.68; 2.1 ~ 7.45 | 11.11; 8.92 ~ 15.31 | 0.145 | 3.57; 2.08 ~ 6.99 | 10.25; 7.1 ~ 13.36 | 0.279 |
PET/CT scan procedures
All patients were examined under the same scanning conditions on the same device (Siemens Biograph MCT-S PET/CT), whose CT component was a 64-slice spiral CT. 18F-FDG was provided by Shanghai Atom Kexing Pharmaceuticals Co., Ltd. All patients fasted (no food or drink) for more than 6 hours before PET/CT and had a blood glucose level below 150 mg/dL. Each patient was injected with 18F-FDG at 5 MBq/kg (± 10%) of body weight and then rested for 60 minutes. The PET scan covered 5 or 6 bed positions, with about 2 minutes per bed position. CT data were used for attenuation correction of the PET images, and TrueX + TOF was used to reconstruct the PET images. The PET and CT slice thickness was 5 mm for all patients. The matrix size of all PET reconstructions was 200 × 200, and the anisotropic voxel size was 4.07 × 4.07 × 3.0 mm³. After the regular PET and CT scans, a 1 mm breath-hold lung CT scan was added. CT images were reconstructed with a conventional algorithm, and PET images with an iterative method.
PET/CT image processing and analysis
The 5 mm PET and 1 mm CT images of all patients were exported from the PACS workstation in DICOM format and imported into ITK-SNAP software (version 3.8.0-beta, www.itksnap.org) to outline the lung cancer in 3D mode. All delineation was completed by two medical imaging physicians, each with at least 12 years of experience, and neither knew the patients' pathological results. For CT image delineation, lesions were observed on the lung window (window width 1600 HU, window level −600 HU).
The two physicians delineated the primary tumor on the PET images, using a threshold of 40% of SUVmax to define the volume of interest (VOI) [17–21]. To avoid including physiologic uptake in the VOI, the CT and PET scans were read in combination [19–21]. An example of VOI delineation is shown in Fig. 1.
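The thresholding rule can be illustrated with a minimal sketch. It assumes the SUV values of the voxels in a region around one lesion are available as a flat list. The function name `threshold_voi`, the sample values, and the choice to include voxels exactly at the cutoff are assumptions made for illustration, and the sketch does not reproduce the manual, CT-guided exclusion of physiologic uptake described above.

```python
def threshold_voi(suv_values, fraction=0.40):
    """Return (suv_max, mask); mask marks voxels at or above fraction * SUVmax."""
    suv_max = max(suv_values)
    cutoff = fraction * suv_max
    mask = [value >= cutoff for value in suv_values]
    return suv_max, mask


# Hypothetical SUV values for the voxels around one lesion.
suv = [0.8, 1.5, 3.9, 4.5, 7.2, 10.0, 6.1, 2.2]
suv_max, mask = threshold_voi(suv)
voi_voxels = sum(mask)  # number of voxels retained in the VOI
```

With these sample values the cutoff is 40% of the maximum uptake, so only the four voxels with the highest uptake remain in the VOI.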
Image pre-processing
The original 5 mm PET images, the 1 mm breath-hold thin-slice CT images (DICOM format), and the outlined VOI of each lung tumor were imported into the IBSI-compatible Artificial Intelligence Kit software (AK analysis kit, GE Healthcare, 3.2.2) for preprocessing [20–24]. The µ ± 3σ method was used to normalize image brightness by removing data whose brightness lay more than 3σ from the mean [20, 21, 25]. The images were resampled to 1 × 1 × 1 mm³ using linear interpolation to improve image resolution. The pre-processed images were imported into ITK-SNAP to delineate the VOI.
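A minimal sketch of the µ ± 3σ step follows. It is not the AK software implementation: it assumes that "removing" means discarding values outside the interval [µ − 3σ, µ + 3σ], uses the population standard deviation, and works on a hypothetical flat list of voxel intensities. The function name `remove_outliers_3sigma` is illustrative.

```python
import statistics


def remove_outliers_3sigma(intensities):
    """Keep only the values that lie within mean ± 3 standard deviations."""
    mu = statistics.fmean(intensities)
    sigma = statistics.pstdev(intensities)  # population SD (an assumption)
    lower, upper = mu - 3 * sigma, mu + 3 * sigma
    return [value for value in intensities if lower <= value <= upper]


# Hypothetical voxel intensities: 20 typical values plus one extreme value.
voxels = [100.0] * 10 + [110.0] * 10 + [1000.0]
kept = remove_outliers_3sigma(voxels)
```

In this example the single extreme value lies more than 3σ above the mean and is dropped, while the 20 typical values are kept.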
Segmentation, feature extraction, feature selection, radiomic model construction, and validation
To evaluate intra- and inter-observer consistency, 50 cases were randomly selected from the enrolled cases. Two observers (A and B), each with more than 10 years of experience in PET and CT applications, delineated the VOIs. Observer A delineated the VOIs on the CT and PET images twice at an interval of 4 weeks, and the intra-observer consistency coefficients (ICCs) of the extracted features were calculated between these two delineations. Observer B delineated the VOIs once, independently, and the inter-observer ICCs were calculated between the radiomics features extracted from observer A's first delineation and those from observer B's delineation. An ICC > 0.75 was taken to indicate good agreement. Observer A then completed the remaining delineations. Based on the VOIs outlined by observer A on the CT and PET images, 402 radiomics features were extracted from each image using AK software: 42 histogram features, 154 gray level co-occurrence matrix (GLCM) features, 15 form factor features describing the shape of the VOI, 180 run length matrix (RLM) features, and 11 gray level size zone matrix (GLSZM) features. The bin width was set to 25 during feature extraction.
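The ICC-based screening can be sketched as follows. The text does not state which ICC form was used, so this example assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement). The feature names and values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a list of rows: one row per case, one column per delineation."""
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of delineations per case
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical feature values from two delineations of five cases.
features = {
    "feature_stable": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 4.9]],
    "feature_unstable": [[1.0, 3.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0], [5.0, 4.0]],
}
retained = [name for name, values in features.items() if icc_2_1(values) > 0.75]
```

Only features whose ICC exceeds 0.75 are carried forward, which in this toy example keeps `feature_stable` and drops `feature_unstable`.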
The patients were randomly assigned to a training group (371 cases) and a test group (157 cases) at a ratio of 7:3 using stratified sampling, which kept the proportions of positive and negative samples balanced between the two groups [17, 20–25]. In the training set, the minimum redundancy maximum relevance (mRMR) and least absolute shrinkage and selection operator (LASSO) methods were applied to the features with ICC > 0.75 to select the most valuable radiomics features for predicting lymph node metastasis of lung adenocarcinoma. Three multivariate logistic regression models, based on PET/CT, CT, and PET respectively, were then established in the training group.
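A stratified 7:3 split can be sketched as below. This is an illustration only, not the procedure used to produce the study groups: the function name `stratified_split`, the seed, the rounding rule, and the 100 hypothetical labels (70 negative, 30 positive) are all assumptions.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split case indices into train and test sets, preserving the class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize within each class
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical labels: 0 = lymph node negative, 1 = lymph node positive.
labels = [0] * 70 + [1] * 30
train, test = stratified_split(labels)
train_positive = sum(labels[i] for i in train)
test_positive = sum(labels[i] for i in test)
```

Because each class is split separately, the positive fraction is the same (30%) in both the training and test sets of this example.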
The radiomic score of each patient was calculated as a combination of the retained features weighted by their LASSO logistic regression coefficients (Supplementary Methods). The area under the curve (AUC) was used to evaluate the diagnostic efficacy of the three radiomics models in predicting LLNM of lung adenocarcinoma, and this efficacy was then evaluated in the test group. The DeLong test was used to compare the performance of the three models based on PET/CT, CT, and PET and to identify the most predictive model. To verify the reliability of the model, cross-validation was repeated 100 times. The workflow of the radiomic analysis is shown in Fig. 1.
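The score calculation and the AUC can be illustrated with a short sketch. The coefficients, intercept, feature values, and labels below are hypothetical; the actual coefficients are those reported in the Supplementary Methods. The AUC is computed here from its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
def rad_score(features, coefficients, intercept):
    """Linear combination of the selected features weighted by model coefficients."""
    return intercept + sum(c * x for c, x in zip(coefficients, features))


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical coefficients and feature values, for illustration only.
coefficients = [0.8, -0.5]
intercept = -0.2
cases = [[1.2, 0.3], [0.1, 0.9], [2.0, 0.1], [0.4, 0.4], [0.2, 1.5]]
labels = [1, 0, 1, 0, 1]

scores = [rad_score(case, coefficients, intercept) for case in cases]
model_auc = auc(scores, labels)
```

In this toy data set, four of the six positive/negative pairs are ranked correctly, giving an AUC of 2/3.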
Construction of the radiomic nomogram
Univariate analysis was first applied to the clinical factors. Factors with p < 0.1 were then analyzed using univariate logistic regression to identify whether they were discriminative (p < 0.05) between the groups. Multivariate logistic regression was applied to these discriminative clinical features to construct a clinical model, and the clinical features were integrated with the radiomics score to establish a predictive nomogram. In addition, the variance inflation factor (VIF) was used for collinearity analysis, and factors with VIF > 10 were eliminated. All models were constructed in the training group and then validated in the test group.
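The collinearity check can be sketched as follows. The example relies on the standard identity that the VIF of each predictor equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. The factor names and values are hypothetical, and the sketch only flags factors with VIF > 10; it does not reproduce the elimination order used in the study.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [sum(x * x for x in col) ** 0.5 for col in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    size = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]


# Hypothetical factors: factor_b is almost a copy of factor_a.
names = ["factor_a", "factor_b", "factor_c"]
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.1, 2.1, 2.9, 3.9, 5.0, 6.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
]
vifs = vif(columns)
flagged = [name for name, value in zip(names, vifs) if value > 10]
```

Here the two nearly identical factors receive very large VIF values and are flagged, while the third factor stays well below the cutoff of 10.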
Statistical analysis
R software (version 3.5.1) was used for statistical analysis in this study. For clinical data, the chi-square test was applied to normally distributed features, which are given as mean ± SD, and the Wilcoxon test was applied to non-normally distributed features, which are given as median (lower quartile, upper quartile). Decision curve analysis (DCA) was used to evaluate the clinical utility of the PET/CT molecular radiomics-clinical model for predicting lymph node metastasis of lung adenocarcinoma in the test group.
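The quantity plotted in a decision curve is the net benefit at each threshold probability, which can be sketched as below. The study's analysis was run in R; this Python sketch is illustrative only, and the predicted probabilities and labels are hypothetical. Net benefit is defined as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating cases whose predicted probability reaches the threshold."""
    n = len(labels)
    pairs = list(zip(probabilities, labels))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every case."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted probabilities and true labels (1 = metastasis).
probs = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1, 0.7, 0.3]
labels = [1, 1, 1, 0, 0, 0, 1, 0]

nb_model = net_benefit(probs, labels, 0.5)
nb_all = net_benefit_treat_all(labels, 0.5)
```

At a threshold of 0.5, the hypothetical model yields a net benefit of 0.25, compared with 0 for the treat-all strategy. A decision curve repeats this comparison across a range of thresholds.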