Patients’ Clinical Characteristics
The patient screening flow chart is shown in Fig. 2. A total of 212 patients with IPF were screened, and 192 of them met the inclusion criteria [1] and were included to validate the CTPF comprehensive staging method. Of the 192 included patients, 86 survived; 74 died; 32 were lost to follow-up; 15 patients underwent lung transplantation. The patients’ general clinical characteristics are shown in Table 3. The mean age was 64.1 ± 7.7 years and the mean survival time was 28.1 ± 19.5 months. The majority of the patients were men (183/192, 95.3%) and had a history of smoking (138/192, 71.9%). All patients had a CT-based fibrosis stage of II to IV (Table 3).
Table 3
Patients’ General Clinical Characteristics
Patient Data | Values |
Mean age (years) | 64.1 ± 7.7 |
Male/female | 183/9 |
Smokers/Never-smokers | 138/54 |
Survival time (months) | 28.1 ± 19.5 |
SpO2% | 95.4 ± 3.2 |
FVC% pred | 72.6 ± 20.3 |
FEV1% pred | 75.4 ± 20.6 |
DLco% pred | 52.3 ± 28.8 |
FEV1/FVC% | 83.5 ± 7.8 |
CT Score values by Reviewer 1 | 24.4 ± 14.1 |
CT Score values by Reviewer 2 | 24.7 ± 14.4 |
CT-based stage I/II/III/IV/V | 0/107/72/13/0 |
PF-based grade a/b/c | 86/77/29 |
GAP stage I/II/III | 97/65/30 |
CPI | 52.3 ± 18.4 |
Notes: Measurement data are presented as mean ± standard deviation (SD). Count data are presented as percentage or proportion. |
SpO2%: oxygen saturation of peripheral blood. SpO2 is the resting arterial oxygen saturation measured at fingertips. FVC: forced vital capacity. FVC% pred: the percentage of the actual FVC over the predicted FVC. FEV1: forced expiratory volume in one second. FEV1% pred: the percentage of the actual FEV1 over the predicted FEV1. DLco: diffusing capacity of the lung for carbon monoxide. DLco% pred: the percentage of the actual DLco over the predicted DLco. FEV1/FVC%: the percentage of FEV1 over FVC. CT Score values by reviewer 1 and CT Score values by reviewer 2 were the scores from the two radiologists using the “4-section honeycomb lung percentage” method to score patients’ HRCT imaging results. CT-based stage: The stage was determined by using the average score of the two radiologists and following the criteria described in Table 2. PF-based grade: The grade was determined by using the pulmonary function and physiological parameters (age, gender, FVC%pred, DLco%pred, and SpO2%) and following the description in Table 2. The grade was defined as: mild (a), moderate (b), and severe (c). GAP (gender, age, and physiologic variables) stage followed the recommendation by Brett Ley, and a higher stage represented a greater death risk. CPI: composite physiologic index. In 2002, Athol U. Wells and colleagues proposed to use CPI, which combined chest CT and pulmonary functional parameters, to assess the severity of interstitial lung diseases (ILDs). A higher CPI represents a more severe ILD. |
The Relationship Between CT-based Stage/PF-based Severity and Pulmonary Function and Death Risk
The mean CT scores assigned to the 192 patients by the two radiologists using the “4-section honeycomb lung percentage” method were 24.4 ± 14.1 and 24.7 ± 14.4, respectively; the highest scores were 67 and 65, and the lowest scores were 1 and 3, respectively (Table 3). The intraclass correlation coefficient between the two radiologists’ scores was 0.95 (P < 0.05). For each patient, the mean of the two radiologists’ scores was used as the final CT score. Spearman correlation analysis was then used to assess the correlation between the final CT scores and the pulmonary function parameters (Fig. 3). The CT scores correlated negatively with FVC%pred (rs = -0.47, P < 0.01, Fig. 3A), DLco%pred (rs = -0.66, P < 0.01, Fig. 3B), and SpO2% (rs = -0.40, P < 0.01, Fig. 3C) and positively with CPI (rs = 0.63, P < 0.01, Fig. 3D), which represents ILD severity. These data support the view that the “4-section honeycomb lung percentage” scoring method reflects the severity of pulmonary fibrosis.
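The averaging and rank-correlation steps described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names `average_ranks` and `spearman_rs` are hypothetical, and the five-patient input is made-up toy data, not study data.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rs(x, y):
    """Spearman's rs computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

if __name__ == "__main__":
    # Hypothetical toy scores from two reviewers (NOT study data).
    reviewer1 = [8, 22, 28, 42, 48]
    reviewer2 = [12, 18, 32, 38, 52]
    # Final CT score = mean of the two reviewers' scores for each patient.
    final_ct = [(a + b) / 2 for a, b in zip(reviewer1, reviewer2)]
    fvc_pred = [90, 85, 70, 75, 50]  # hypothetical FVC%pred values
    print(round(spearman_rs(final_ct, fvc_pred), 2))
```

A negative rs, as reported for FVC%pred, DLco%pred, and SpO2%, means that patients ranked higher on CT score tend to rank lower on the pulmonary function parameter.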
To analyze the correlation between CT-based stage and death risk, we performed Fine–Gray univariate regression (Fig. 4A) and multivariate regression to eliminate the potential confounding effects from the PF-based grade (Fig. 4B). Both analyses revealed that CT stage positively correlated with death risk. Similarly, both Fine–Gray univariate regression (Fig. 4C) and multivariate regression to eliminate the potential confounding effects from the CT-based stage (Fig. 4D) found that PF-based grade also positively correlated with death risk.
CTPF Stage
HRCT images of two representative cases are displayed in the Additional file (Figure S1). Example A in Figure S1 shows a patient with CT-based stage III and PF-based grade c, and thus CTPF stage III c. This patient developed an IPF exacerbation and died 23 months after the clinical data used for the assessment in this study were acquired. Example B in Figure S1 shows a patient with CT-based stage II and PF-based grade a, and thus CTPF stage II a; this patient was still alive at the 39-month follow-up visit.
Table 4 displays the results from four Fine–Gray competing-risk regression prediction models. The predictive factors of the four models were CT-based stage, PF-based grade, CTPF comprehensive stage, and GAP stage, respectively. The CT model, PF model, and GAP model demonstrated that CT-based stage, PF-based grade, and GAP stage were risk factors for death from IPF. The CTPF model showed that CT-based stage and PF-based grade were independent predictors of death from IPF regardless of the type (univariate or multivariate) of the analysis.
Table 4
Fine–Gray Death Risk Regression Analysis Results From 4 Prediction Models
| Hazard Ratio (HR) | P-value | 95% CI |
Model CT | | | |
CT II | referent | | |
CT III | 2.22 | 0.001 | 1.36 to 3.63 |
CT IV | 5.32 | 0.001 | 1.97 to 14.39 |
Model PF | | | |
PF(a) | referent | | |
PF(b) | 1.99 | < 0.001 | 1.18 to 3.34 |
PF(c) | 4.39 | < 0.001 | 2.22 to 8.70 |
Model CTPF | | | |
CT II | referent | | |
CT III | 1.76 | 0.039 | 1.03 to 3.00 |
CT IV | 3.10 | 0.059 | 0.96 to 10.04 |
PF(a) | referent | | |
PF(b) | 1.68 | 0.066 | 0.97 to 2.92 |
PF(c) | 2.79 | 0.011 | 1.27 to 6.13 |
Model GAP | | | |
GAP I | referent | | |
GAP II | 2.30 | 0.002 | 1.37 to 3.87 |
GAP III | 3.31 | < 0.001 | 1.71 to 6.43 |
Notes: CI: confidence interval. Model CT: CT-based stage was used in the univariate Fine–Gray death risk regression analysis. Model PF: PF-based grade was used in the univariate Fine–Gray death risk regression analysis. Model CTPF: CTPF comprehensive stage was used in the multivariate Fine–Gray death risk regression analysis. Model GAP: GAP stage proposed by Brett Ley was used in univariate Fine–Gray death risk regression analysis. CT II: Honeycomb lesion area was < 25% of the entire lung. CT III: Honeycomb lesion area was 25%-49% of the entire lung. CT IV: Honeycomb lesion area was 50%-75%. PF-based grade was determined by assessing the scores of age, gender, FVC%pred, DLco%pred, and SpO2% according to the criteria in Table 2 and adding the scores. PF (a): score 0–3. PF(b): score 4–6. PF(c): score 7–10. GAP I: score 0–3. GAP II: score 4–5. GAP III: score 6–8. |
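The cut-offs listed in the notes above can be written as a small lookup. This is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, only CT stages II to IV are handled (the criteria for stages I and V are in Table 2 and are not reproduced here), and values between 49% and 50% are assigned to stage III by assumption.

```python
def ct_stage(honeycomb_pct: float) -> str:
    """Map honeycomb lesion area (% of the entire lung) to CT stage II-IV.

    Assumption: stages I and V are not handled because their criteria are
    not given in the table notes; very low values are treated as stage II.
    """
    if not 0 <= honeycomb_pct <= 75:
        raise ValueError("outside the range covered by stages II-IV")
    if honeycomb_pct < 25:
        return "II"
    if honeycomb_pct < 50:  # notes give 25%-49%; the 49-50 gap is assumed III
        return "III"
    return "IV"

def pf_grade(pf_score: int) -> str:
    """Map the summed PF score (0-10) to grade a, b, or c."""
    if not 0 <= pf_score <= 10:
        raise ValueError("PF score must be between 0 and 10")
    if pf_score <= 3:
        return "a"
    if pf_score <= 6:
        return "b"
    return "c"

def gap_stage(gap_score: int) -> str:
    """Map the GAP score (0-8) to GAP stage I, II, or III."""
    if not 0 <= gap_score <= 8:
        raise ValueError("GAP score must be between 0 and 8")
    if gap_score <= 3:
        return "I"
    if gap_score <= 5:
        return "II"
    return "III"

def ctpf_stage(honeycomb_pct: float, pf_score: int) -> str:
    """Combine CT-based stage and PF-based grade, e.g. 'III c'."""
    return f"{ct_stage(honeycomb_pct)} {pf_grade(pf_score)}"

if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the cut-offs.
    print(ctpf_stage(30, 8))
    print(gap_stage(4))
```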
The AUC versus time plot from the bootstrap cross-validation model is displayed in Fig. 5. Compared with the other three prediction models (CT model, PF model, and GAP model), the CTPF model had the highest AUC values; both the one-year and the two-year AUC values of the CTPF model were > 75%. Figure 6A is the nomogram for CTPF-based death risk prediction, which was constructed from the multivariate Fine–Gray regression coefficients of CT-based stage and PF-based grade. Figures 6B, 6C, and 6D show the calibration curves of the four prediction models after bootstrap cross-validation. The CTPF model showed the best stability. The one-, two-, and three-year cumulative death risks of patients at different CTPF stages are displayed in Table 5. When patients had the same CT-based stage, their cumulative death risk increased as their PF-based grade increased. When patients had the same PF-based grade, their cumulative death risk increased as their CT-based stage increased. Thus, the combination of CT-based stage and PF-based grade could improve the accuracy of death risk prediction.
Table 5
CTPF Model-Predicted One-, Two-, and Three-Year Cumulative Death Risk of Patients at Different CTPF Stages
CTPF stage | 1-y Cumulative mortality % | 2-y Cumulative mortality % | 3-y Cumulative mortality % |
II a | 4.81 | 13.07 | 17.50 |
II b | 7.95 | 20.98 | 27.63 |
II c | 12.84 | 32.34 | 41.51 |
III a | 8.29 | 21.82 | 28.67 |
III b | 13.54 | 33.88 | 43.33 |
III c | 21.44 | 49.65 | 61.02 |
IV a | 14.18 | 35.25 | 44.94 |
IV b | 22.66 | 51.84 | 63.32 |
IV c | 34.70 | 70.23 | 81.05 |
Notes: CTPF stage: CTPF-based comprehensive stage. |
II a: CT stage II and PF grade a; II b: CT stage II and PF grade b; II c: CT stage II and PF grade c; III a: CT stage III and PF grade a; III b: CT stage III and PF grade b; III c: CT stage III and PF grade c; IV a: CT stage IV and PF grade a; IV b: CT stage IV and PF grade b; IV c: CT stage IV and PF grade c. |
CT II: Honeycomb lesion area was < 25% of the entire lung. CT III: Honeycomb lesion area was 25%-49% of the entire lung. CT IV: Honeycomb lesion area was 50%-75%. PF-based grade was determined by assessing the scores of age, gender, FVC%pred, DLco%pred, and SpO2% according to the criteria in Table 2 and adding the scores. PF (a): score 0–3. PF(b): score 4–6. PF(c): score 7–10. |
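As a worked check of how the Table 5 values relate to the Table 4 coefficients, the sketch below applies the standard Fine–Gray relation between a baseline cumulative incidence and a subdistribution hazard ratio, CIF(t | x) = 1 - (1 - CIF0(t))^HR. Two assumptions are made here and are not confirmed by the text: stage II a is taken as the baseline, and the rounded hazard ratios from Model CTPF are used, so small rounding differences from Table 5 are expected. The function name `predicted_mortality` is hypothetical.

```python
# Baseline (stage II a) cumulative mortality in percent, copied from Table 5.
BASELINE_IIA = {1: 4.81, 2: 13.07, 3: 17.50}

# Subdistribution hazard ratios from Model CTPF in Table 4 (referent = 1.0).
HR_CT = {"II": 1.0, "III": 1.76, "IV": 3.10}
HR_PF = {"a": 1.0, "b": 1.68, "c": 2.79}

def predicted_mortality(ct: str, pf: str, year: int) -> float:
    """Cumulative mortality (%) under the Fine-Gray model.

    Assumes the CT and PF effects multiply on the subdistribution hazard
    scale, as in a multivariate model without interaction terms.
    """
    hr = HR_CT[ct] * HR_PF[pf]
    baseline = BASELINE_IIA[year] / 100.0
    return 100.0 * (1.0 - (1.0 - baseline) ** hr)

if __name__ == "__main__":
    # Print the reconstructed grid for comparison with Table 5.
    for ct in HR_CT:
        for pf in HR_PF:
            row = [round(predicted_mortality(ct, pf, y), 2) for y in (1, 2, 3)]
            print(ct, pf, row)
```

Under these assumptions the reconstructed values appear to agree with Table 5 to within the rounding of the reported hazard ratios, which illustrates why the risk rises along both the CT-based stage and the PF-based grade.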
Comparison Between CTPF Stage and GAP Stage
For the CTPF model, which combined CT-based stage and PF-based grade, the AUC values for predicting one-, two-, and three-year cumulative death risk were 78.6 (95% CI: 58.6–93.0), 77.8 (95% CI: 58.8–83.4), and 73.4 (95% CI: 57.5.1–85.1), respectively. The corresponding AUC values for the GAP model were 73.9 (95% CI: 58.4–87.7), 72.3 (95% CI: 58.8–83.4), and 70.8 (95% CI: 57.8.1–82.9), respectively. These data and the calibration curves after cross-validation (Fig. 6) suggest that the CTPF stage is more accurate for predicting death risk than the other three models.