Patient Characteristics of the Study Cohorts
The preoperative clinical-radiologic characteristics of the PCN patients did not differ significantly between the development and test cohorts (Table 1). Between February 2016 and December 2020, a total of 143 patients with pancreatic cystic neoplasms (PCNs) were enrolled, comprising 45 with MCN, 47 with SCN and 51 with IPMN. The male-to-female ratio in the MCN, SCN and IPMN groups was 1:6.5, 1:2.9 and 1:0.4, and the median age (interquartile range) was 60 (47.75, 68.25) years, 55 (44.00, 62.00) years and 66 (61.00, 71.00) years, respectively. Comparison of the preoperative radiological diagnosis with the postoperative pathological diagnosis showed that the accuracy and precision of preoperative imaging findings in PCN patients were relatively low. The rate of accurate diagnosis was 16.67%, 10.64% and 56.86% for MCN, SCN and IPMN, respectively. The percentage of patients with an ambiguous diagnosis was 73.33%, 70.21% and 39.22% (MCN, n = 33; SCN, n = 33; IPMN, n = 22). In our institution, the misdiagnosis rate for MCN, SCN and IPMN patients was 10.00%, 19.15% and 3.92%, respectively.
Table 1
Clinical and imaging characteristics of PCN patients in the development and test cohorts
Characteristic | Development cohort (n = 102) | Test cohort (n = 41) | P value
Age | 54.5 (43.0, 66.0) | 61.0 (42.0, 67.0) | 0.449
Sex | | | 0.132
  Female | 59 (57.8%) | 18 (43.9%) |
  Male | 43 (42.2%) | 23 (56.1%) |
Primary tumor type | | | 0.670
  MCN | 35 (34.3%) | 10 (24.4%) |
  SCN | 33 (32.4%) | 14 (34.1%) |
  IPMN | 34 (33.3%) | 17 (41.5%) |
Chronic pancreatitis | | | 0.558
  Present | 7 (6.9%) | 4 (9.8%) |
  Absent | 95 (93.1%) | 37 (90.2%) |
Abdominal symptom | | | 0.855
  Present | 53 (52.0%) | 22 (53.7%) |
  Absent | 49 (48.0%) | 19 (46.3%) |
History of diabetes | | | 0.300
  Present | 7 (6.9%) | 5 (12.2%) |
  Absent | 95 (93.1%) | 36 (87.8%) |
Serum AFP (ng/ml) * | 2.0 (1.3, 3.1) | 1.8 (1.3, 2.6) | 0.115
Serum CA19-9 (U/ml) * | 10.2 (6.3, 15.5) | 10.4 (6.5, 10.4) | 0.641
Serum CEA (ng/ml) * | 0.8 (0.5, 1.8) | 1.0 (0.5, 2.0) | 0.250
Serum CA125 (U/ml) * | 8.7 (5.4, 11.1) | 9.3 (6.8, 11.6) | 0.507
Serum CA72-4 (U/ml) * | 1.5 (1.0, 2.7) | 1.7 (1.1, 3.1) | 0.809
Serum CA242 (U/ml) * | 4.3 (3.0, 7.6) | 4.0 (2.3, 6.1) | 0.724
Serum ALT (U/L) * | 15.9 (11.9, 22.4) | 18.7 (15.2, 23.9) | 0.554
Serum AST (U/L) * | 18.7 (15.2, 23.9) | 18.8 (16.0, 24.0) | 0.746
Serum ALB (g/L) * | 40.1 (39.5, 42.1) | 40.8 (38.6, 42.5) | 0.465
Lesion location | | | 0.370
  Head and neck | 33 (32.4%) | 19 (46.3%) |
  Body and tail | 50 (49.0%) | 16 (39.0%) |
  Uncinate | 12 (11.8%) | 2 (4.9%) |
  Diffuse | 7 (6.9%) | 4 (9.8%) |
Tumor diameter (cm) * | 2.83 (1.85, 4.63) | 2.81 (2.17, 4.15) | 0.857
Tumor number | | | 0.400
  Solitary | 79 (77.5%) | 29 (70.7%) |
  Multiple | 23 (22.5%) | 12 (29.3%) |
Mean CT value (HU) | 37 (-8, 64.2) | 26 (-10, 59) | 0.381
Calcification | | | 0.595
  Present | 16 (15.7%) | 5 (12.2%) |
  Absent | 86 (84.3%) | 36 (87.8%) |
Pancreatic duct dilatation | | | 0.06
  Present | 55 (53.9%) | 30 (73.2%) |
  Absent | 47 (46.1%) | 11 (26.8%) |
Bile duct dilatation | | | 0.808
  Present | 14 (13.7%) | 5 (12.2%) |
  Absent | 88 (86.3%) | 36 (87.8%) |
Note: Except where indicated, data are numbers of patients, with percentages in parentheses. AFP = alpha-fetoprotein, CA19-9 = carbohydrate antigen 19-9, CEA = carcinoembryonic antigen, CA125 = carbohydrate antigen 125, CA72-4 = carbohydrate antigen 72-4, CA242 = carbohydrate antigen 242, ALT = alanine aminotransferase, AST = aspartate aminotransferase, ALB = albumin.
*Data are medians, with interquartile ranges in parentheses.
Univariate And Multivariate Analysis Of Clinical-radiologic Parameters
Table 1 summarizes the retrospectively collected baseline characteristics that were considered clinically relevant to the diagnosis. Univariate logistic regression analysis of the clinical data and radiological features indicated that age, sex, abdominal symptom, serum tumor markers [alpha-fetoprotein (AFP), carcinoembryonic antigen (CEA)], serum alanine aminotransferase (ALT), serum aspartate aminotransferase (AST), tumor diameter, calcification, bile duct dilatation and lesion location differed significantly (P < 0.1) among the three subtypes in the development cohort. These variables were then entered into multivariate logistic regression to identify risk factors for the diagnosis of PCNs. The full results of the multivariate analysis are shown in Table 2. Age (P = 0.001, 0.005 and 0.041), sex (P = 0.005, 0.041 and 0.017) and tumor diameter (P < 0.001, P = 0.020 and P = 0.021) were independent risk factors for the differential diagnosis between MCN and SCN, MCN and IPMN, and SCN and IPMN, respectively.
Table 2
Multivariate logistic regression analysis of risk factors in the development cohort
Variable | MCN vs SCN, OR (95% CI) | P | MCN vs IPMN, OR (95% CI) | P | SCN vs IPMN, OR (95% CI) | P
Age | 0.863 (0.792, 0.941) | 0.001* | 0.598 (0.418, 0.857) | 0.005* | 0.693 (0.487, 0.985) | 0.041*
Sex | 0.122 (0.010, 1.549) | 0.005* | 32.017 (1.564, 65.520) | 0.041* | 262.102 (9.525, 725.121) | 0.017*
Abdominal symptom | 1.353 (0.296, 6.180) | 0.696 | 0.042 (0.000, 4.422) | 0.182 | 0.031 (0.000, 2.951) | 0.135
CEA | 3.660 (0.831, 16.112) | 0.086 | 2.236 (0.089, 56.235) | 0.625 | 0.611 (0.026, 14.307) | 0.759
AFP | 2.261 (0.958, 5.336) | 0.062 | 1.266 (0.429, 3.730) | 0.669 | 0.560 (0.192, 1.629) | 0.287
ALT | 0.828 (0.668, 1.026) | 0.084 | 1.022 (0.720, 1.450) | 0.905 | 1.233 (0.893, 1.703) | 0.202
AST | 1.151 (0.914, 1.448) | 0.232 | 0.779 (0.484, 1.254) | 0.303 | 0.677 (0.415, 1.103) | 0.118
Tumor diameter | 1.3104 (1.160, 11.126) | <0.001* | 33.913 (1.732, 66.388) | 0.020* | 33.446 (1.709, 645.413) | 0.021*
Calcification | 3.234 (0.534, 19.577) | 0.201 | 1.463 (0.000, 30.172) | 0.994 | 45.249 (0.000, 94.340) | 0.994
Bile duct dilatation | 2.316 (0.075, 60.672) | 0.657 | 0.368 (0.004, 34.070) | 0.665 | 0.172 (0.003, 11.115) | 0.408
Lesion location | 2.136 (0.075, 60.672) | 0.657 | 17.143 (0.001, 214.380) | 0.242 | 69.886 (0.001, 403.815) | 0.483
Note: Numbers in parentheses are the 95% confidence interval. OR: odds ratio; CI: confidence interval; *, P < 0.05.
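As a reading aid for Table 2, the following sketch shows how an odds ratio and its 95% confidence interval relate to a Wald P value. It assumes that the interval was computed on the log-odds scale as exp(beta ± 1.96 × SE), which the text does not state; the function name `wald_p_from_or_ci` is illustrative and is not part of the study's software.

```python
import math


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided Wald P value from an OR and its 95% CI.

    Assumption: the CI was built as exp(beta +/- z_crit * SE) on the
    log-odds scale, so SE can be recovered from the CI width.
    """
    beta = math.log(odds_ratio)                      # regression coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se                                    # Wald statistic
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# Age, MCN vs SCN (values taken from Table 2).
beta, se, z, p = wald_p_from_or_ci(0.863, 0.792, 0.941)
print(f"beta={beta:.3f}  SE={se:.3f}  z={z:.2f}  P={p:.4f}")
```

Under this assumption, the back-calculated P value for age in the MCN vs SCN comparison is roughly 0.001, in line with the reported value.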
Radiomic Feature Selection And Signature Construction
Of the 1218 radiomics features extracted from the CT images, 866 were retained after evaluation of the intraclass correlation coefficient (ICC > 0.75). The Pearson correlation test was then used to exclude 722 features with high correlation coefficients, leaving 144 features.
The Boruta algorithm then selected 13 features, which were used to construct a radiomics signature by random forest (RF) analysis. The important features selected by the Boruta algorithm are presented in Figure 4. The radiomics signature demonstrated good predictive ability, with an out-of-bag (OOB) error of 0.317 and a C index of 0.772 in the test cohort; its diagnostic performance is summarized in Table 3.
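The first two filtering steps described above (ICC threshold, then removal of highly correlated features) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the ICC values are taken as precomputed inputs, the Pearson cutoff `r_max = 0.9` is a hypothetical value (the cutoff used in the study is not reported), the toy data are invented, and the subsequent Boruta step is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filter_features(features, icc, icc_min=0.75, r_max=0.9):
    """Keep reproducible features (ICC > icc_min), then greedily drop any
    feature whose |r| with an already kept feature exceeds r_max.

    features: dict mapping feature name -> list of values (one per patient)
    icc:      dict mapping feature name -> precomputed ICC
    r_max:    hypothetical cutoff, not taken from the study
    """
    reproducible = [name for name in features if icc[name] > icc_min]
    kept = []
    for name in reproducible:
        if all(abs(pearson(features[name], features[k])) <= r_max for k in kept):
            kept.append(name)
    return kept


# Toy data, made up for illustration only.
toy_features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 3.9, 6.2, 8.0, 9.9],  # almost proportional to f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "f4": [0.3, 0.9, 0.1, 0.7, 0.5],  # fails the ICC filter
}
toy_icc = {"f1": 0.92, "f2": 0.88, "f3": 0.81, "f4": 0.40}

selected = filter_features(toy_features, toy_icc)
print(selected)
```

The greedy order-dependent removal is one common choice; other strategies (for example, dropping the member of each correlated pair with the lower ICC) would be equally consistent with the description in the text.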
Table 3
Diagnosis performance of the radiomics signature in the development and test cohort
Development cohort
P/T | SCN | MCN | IPMN | Pre | Rec | F1
SCN | 23 | 5 | 4 | 0.719 | 0.697 | 0.708
MCN | 7 | 29 | 1 | 0.784 | 0.829 | 0.806
IPMN | 3 | 1 | 29 | 0.879 | 0.853 | 0.866
Total | 33 | 35 | 34 | OA | 0.794 |
Test cohort
P/T | SCN | MCN | IPMN | Pre | Rec | F1
SCN | 6 | 0 | 2 | 0.750 | 0.429 | 0.546
MCN | 2 | 9 | 2 | 0.692 | 0.692 | 0.692
IPMN | 6 | 1 | 13 | 0.650 | 0.765 | 0.698
Total | 14 | 13 | 17 | OA | 0.683 |
Note: T, True type; P, Predicted type; Pre, Precision; Rec, Recall; OA: Overall accuracy. Precision = True Positive/(True Positive + False Positive); Recall = True Positive/(True Positive + False Negative); F1-score = 2 × Precision × Recall/(Precision + Recall).
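The definitions in the note can be applied directly to the confusion matrix. The sketch below recomputes precision, recall, F1-score and overall accuracy from the development-cohort counts of Table 3, with rows taken as predicted type and columns as true type; the function name `per_class_metrics` is illustrative.

```python
def per_class_metrics(matrix, labels):
    """matrix[i][j] = cases predicted as labels[i] whose true type is labels[j]."""
    n_total = sum(sum(row) for row in matrix)
    results = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(matrix[i])               # TP + FP (row sum)
        actual = sum(row[i] for row in matrix)   # TP + FN (column sum)
        precision = tp / predicted
        recall = tp / actual
        f1 = 2 * precision * recall / (precision + recall)
        results[label] = (precision, recall, f1)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    return results, correct / n_total


labels = ["SCN", "MCN", "IPMN"]
development = [
    [23, 5, 4],   # predicted SCN
    [7, 29, 1],   # predicted MCN
    [3, 1, 29],   # predicted IPMN
]
metrics, overall_accuracy = per_class_metrics(development, labels)
for name, (pre, rec, f1) in metrics.items():
    print(name, pre, rec, f1)
print("OA", overall_accuracy)
```

The unrounded values agree with the development-cohort entries of Table 3 to within rounding at the third decimal place.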
Prediction Models Development And Validation
The radiomics-based models were built in the development cohort by combining the radiomics signature with the three clinical-radiologic parameters that were significant in the multivariate analysis.
In the multi-class prediction model, the classification error (out-of-bag estimate) stabilized at a minimum of 19.61% once the number of trees exceeded 500, with three variables tried at each split. Figure 5 illustrates the relationship between the error rate and the number of trees during construction of the multi-class model. In the development dataset, the multi-class radiomics model had an overall accuracy of 0.804 and a precision of 0.800, 0.727 and 0.929 for SCN, MCN and IPMN, respectively (Table 4). In the test dataset, the overall accuracy for classifying the three tumor types was 0.707, and the precision for SCN, MCN and IPMN was 0.750, 0.667 and 0.722, respectively.
Table 4
Diagnosis performance of the multi-class prediction model in the development and test cohort
Development cohort
P/T | SCN | MCN | IPMN | Pre | Rec | F1
SCN | 24 | 2 | 4 | 0.800 | 0.727 | 0.762
MCN | 8 | 32 | 4 | 0.727 | 0.914 | 0.810
IPMN | 1 | 1 | 26 | 0.929 | 0.765 | 0.839
Total | 33 | 35 | 34 | OA | 0.804 |
Test cohort
P/T | SCN | MCN | IPMN | Pre | Rec | F1
SCN | 6 | 0 | 2 | 0.750 | 0.429 | 0.546
MCN | 3 | 10 | 2 | 0.667 | 1.000 | 0.800
IPMN | 5 | 0 | 13 | 0.722 | 0.765 | 0.743
Total | 14 | 10 | 17 | OA | 0.707 |
Note: T, True type; P, Predicted type; Pre, Precision; Rec, Recall; OA: Overall accuracy.
Table 5A. Diagnosis performance of the SCN-MCN model in the development and test cohort
Development cohort
P/T | SCN | MCN | Pre | Rec | F1
SCN | 27 | 4 | 0.871 | 0.818 | 0.844
MCN | 6 | 31 | 0.838 | 0.886 | 0.861
Total | 33 | 35 | OA | 0.853 |
Test cohort
P/T | SCN | MCN | Pre | Rec | F1
SCN | 8 | 0 | 1.000 | 0.571 | 0.727
MCN | 6 | 10 | 0.625 | 1.000 | 0.769
Total | 14 | 10 | OA | 0.750 |
Table 5B. Diagnosis performance of the MCN-IPMN model in the development and test cohort
Development cohort
P/T | MCN | IPMN | Pre | Rec | F1
MCN | 33 | 3 | 0.917 | 0.943 | 0.930
IPMN | 2 | 31 | 0.939 | 0.912 | 0.925
Total | 35 | 34 | OA | 0.928 |
Test cohort
P/T | MCN | IPMN | Pre | Rec | F1
MCN | 10 | 3 | 0.769 | 1.000 | 0.869
IPMN | 0 | 14 | 1.000 | 0.823 | 0.903
Total | 10 | 17 | OA | 0.889 |
Table 5C. Diagnosis performance of the SCN-IPMN model in the development and test cohort
Development cohort
P/T | SCN | IPMN | Pre | Rec | F1
SCN | 29 | 5 | 0.853 | 0.879 | 0.866
IPMN | 4 | 29 | 0.879 | 0.853 | 0.866
Total | 33 | 34 | OA | 0.866 |
Test cohort
P/T | SCN | IPMN | Pre | Rec | F1
SCN | 11 | 2 | 0.846 | 0.786 | 0.828
IPMN | 3 | 15 | 0.833 | 0.882 | 0.857
Total | 14 | 17 | OA | 0.839 |
Note: T, True type; P, Predicted type; Pre, Precision; Rec, Recall; OA: Overall accuracy.
The binary-class radiomics models comprised three classification models that distinguish between SCN and MCN, MCN and IPMN, and SCN and IPMN. The SCN-MCN, MCN-IPMN and SCN-IPMN models showed overall accuracies of 0.853, 0.928 and 0.866 in the development cohort, and 0.750, 0.889 and 0.839 in the test cohort, respectively. In the development cohort, the precision was 0.871 (SCN) and 0.838 (MCN) for the SCN-MCN model (Table 5A), 0.917 (MCN) and 0.939 (IPMN) for the MCN-IPMN model (Table 5B), and 0.853 (SCN) and 0.879 (IPMN) for the SCN-IPMN model (Table 5C).
The binary-class prediction models generally showed a higher overall accuracy and F1-score than the multi-class prediction model in both the development and test cohorts. In particular, the model for distinguishing MCN from IPMN yielded favorable predictive performance. ROC curve analysis in the test cohort showed that the multi-class radiomics model, which integrates radiomic and clinical-radiologic features, improved diagnostic accuracy compared with the radiomics signature (Figure 6), with AUC values of 0.850 and 0.772, respectively. However, the binary-class radiomics models showed the best discriminatory ability, with AUC values of 0.914 for SCN versus MCN, 0.863 for SCN versus IPMN and 0.926 for MCN versus IPMN in the test cohort.
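For readers less familiar with the AUC, the sketch below computes it for a binary model as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The scores and labels are invented toy values, not study data, and the way the AUC was obtained for the three-class model is not described in the text, so only the binary case is illustrated.

```python
def auc_from_scores(scores, labels):
    """AUC via pairwise comparison of positive and negative cases.

    Each (positive, negative) pair counts 1 if the positive case has the
    higher score, 0.5 if the scores are tied, and 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy predicted probabilities and true labels (1 = positive subtype).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
auc = auc_from_scores(toy_scores, toy_labels)
print(auc)
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly.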
The calibration curves demonstrated good agreement between the model-predicted subtype and the pathologically confirmed subtype for the binary-class radiomics models (Figure 7). In decision curve analysis, the three binary-class prediction models provided a substantial net benefit across a suitable range of threshold probabilities in the test dataset (Figure 7). As shown in Figure 8, a nomogram was constructed to visualize the binary-class radiomics models and to provide the predicted probability of each tumor subtype for individual patients.
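The net benefit plotted in a decision curve is commonly defined, at threshold probability pt, as TP/n − (FP/n) × pt/(1 − pt). The sketch below evaluates this quantity for a binary model; it illustrates the standard definition rather than the software used in this study, and the probabilities and labels are invented toy values.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of classifying as positive every case whose predicted
    probability is at least `threshold` (0 < threshold < 1)."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every case as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predicted probabilities and true labels (1 = positive subtype).
toy_probabilities = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_outcomes = [1, 1, 0, 1, 0, 0]
nb_model = net_benefit(toy_probabilities, toy_outcomes, threshold=0.5)
nb_all = net_benefit_treat_all(toy_outcomes, threshold=0.5)
print(nb_model, nb_all)
```

A decision curve repeats this calculation over a range of thresholds and compares the model with the classify-all and classify-none (net benefit of zero) strategies.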