Baseline characteristics
Based on the inclusion and exclusion criteria, data from 333 patients in the TCGA database were included. The cut-off value for CD38 expression was calculated as 0.757, and the patients were accordingly divided into a CD38high group (145 cases) and a CD38low group (188 cases). The baseline characteristics of the included patients are shown in Table 1. Except for tumor residual disease (P=0.024), there were no significant differences in clinical features between the two groups. After screening the TCIA database, 56 TCGA-TCIA intersecting cases were finally included. The inclusion-exclusion flowchart is shown in Supplementary Figure 1.
Table 1 Clinical characteristics of the EOC patients with high and low CD38 expression
| Variables | Total (n=333) | Low (n=188) | High (n=145) | P |
|---|---|---|---|---|
| Age, n (%) | | | | 0.513 |
| <60 | 171 (51) | 100 (53) | 71 (49) | |
| ≥60 | 162 (49) | 88 (47) | 74 (51) | |
| FIGO stage, n (%) | | | | 0.324 |
| I/II | 18 (5) | 8 (4) | 10 (7) | |
| III | 265 (80) | 148 (79) | 117 (81) | |
| IV | 50 (15) | 32 (17) | 18 (12) | |
| Histologic grade, n (%) | | | | 0.143 |
| G1/G2 | 41 (12) | 28 (15) | 13 (9) | |
| G3/G4 | 292 (88) | 160 (85) | 132 (91) | |
| Tumor residual disease, n (%) | | | | 0.024 |
| No macroscopic disease | 58 (18) | 34 (18) | 24 (17) | |
| 1-10 mm | 161 (48) | 95 (50) | 66 (46) | |
| ≥11 mm | 84 (25) | 50 (27) | 34 (23) | |
| Unknown | 30 (9) | 9 (5) | 21 (14) | |
| Lymphatic invasion, n (%) | | | | 0.127 |
| Yes | 90 (27) | 45 (24) | 45 (31) | |
| No | 39 (12) | 27 (14) | 12 (8) | |
| Unknown | 204 (61) | 116 (62) | 88 (61) | |
| Venous invasion, n (%) | | | | 0.510 |
| Yes | 58 (18) | 30 (16) | 28 (19) | |
| No | 31 (9) | 20 (11) | 11 (8) | |
| Unknown | 244 (73) | 138 (73) | 106 (73) | |
| Chemotherapy, n (%) | | | | 0.871 |
| Yes | 312 (94) | 177 (94) | 135 (93) | |
| No | 21 (6) | 11 (6) | 10 (7) | |
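To illustrate how a between-group P value such as the one for tumor residual disease can be obtained from the counts in Table 1, the following is a minimal sketch. It assumes a Pearson chi-square test of independence; the text does not state which test was used, so this is an assumption rather than the authors' procedure. The function names are illustrative, and the closed-form tail probability applies only to 3 degrees of freedom (in practice a statistics library such as `scipy.stats.chi2_contingency` would normally be used).

```python
import math

def chi_square_test(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

def chi2_sf_3dof(x):
    """Upper-tail probability of a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Counts from Table 1, tumor residual disease (columns: CD38low, CD38high)
residual_disease = [
    [34, 24],  # no macroscopic disease
    [95, 66],  # 1-10 mm
    [50, 34],  # >= 11 mm
    [9, 21],   # unknown
]

chi2, dof = chi_square_test(residual_disease)
p_value = chi2_sf_3dof(chi2)  # valid here because dof == 3
print(f"chi2 = {chi2:.2f}, dof = {dof}, P = {p_value:.3f}")
```

Under this assumption the resulting P value should land close to the reported 0.024, with most of the statistic contributed by the "Unknown" category.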
Expression level, immune infiltrates and gene enrichment analysis of CD38
CD38 expression was significantly upregulated in EOC compared with normal tissues (p < 0.01), as shown in Figure 1-A.
Differences in the degree of immune cell infiltration were observed between the CD38high and CD38low groups. Specifically, the CD38high group showed significantly increased infiltration of several immune cell types, including natural killer (NK) cells, dendritic cells (DCs), and gamma-delta T (γδT) cells, compared with the CD38low group (p < 0.001), as shown in Figure 1-B.
GSEA revealed significant enrichment of the genes differentially expressed between the CD38high and CD38low groups in multiple signaling pathways, mainly the mitogen-activated protein kinase (MAPK) signaling pathway, chemokine signaling pathway, apoptosis, and PI3K-AKT-mTOR signaling, as shown in Figure 1-C and D. CD38 expression was significantly positively correlated (p < 0.001) with the CSF2RB, PIK3CG, and FASLG genes, as shown in Supplementary Figure 2.
CD38-related survival analysis
Figure 2-A shows the Kaplan-Meier survival curves for the CD38high and CD38low groups. The median survival time was 52.77 months in the CD38high group and 42.13 months in the CD38low group, indicating a significant association between high CD38 expression and prolonged overall survival (OS) (P < 0.001).
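For readers unfamiliar with how a median survival time is read off a Kaplan-Meier curve, the following is a minimal sketch of the product-limit estimator. The follow-up times and event indicators are hypothetical toy values, not TCGA data, and the function names are illustrative only.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate; returns [(time, survival)] at each observed event time.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up

# Hypothetical toy cohort (months, event indicator)
toy_times = [5, 8, 12, 12, 20, 25, 30, 40]
toy_events = [1, 0, 1, 1, 0, 1, 1, 0]

toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print("median survival:", median_survival(toy_curve))
```

Censored patients leave the risk set without lowering the curve, which is why the median is defined by the curve crossing 0.5 rather than by the median of the raw follow-up times.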
Univariate Cox regression analysis indicated that high CD38 expression (HR=0.554, 95% CI: 0.415-0.740, P < 0.001) and chemotherapy (HR=0.470, 95% CI: 0.288-0.767, P = 0.003) were protective factors for OS. Multivariate Cox regression analysis demonstrated that high CD38 expression (HR=0.540, 95% CI: 0.400-0.730, P < 0.001) and chemotherapy (HR=0.377, 95% CI: 0.226-0.627, P < 0.001) were independent protective factors for OS, as shown in Figure 2-B. In addition, we performed Cox subgroup analysis on the main variable CD38, which confirmed that there was no significant interaction between the covariates and the association between CD38 and OS. Details are shown in Supplementary Figure 3.
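The relationship between a hazard ratio, its 95% CI, and its P value can be checked with a short calculation. The sketch below assumes Wald-type intervals, i.e. CI = exp(β ± 1.96·SE) with β = ln(HR), which is the usual convention for Cox models but is not stated explicitly in the text; the function name is illustrative.

```python
import math

def wald_from_hr(hr, ci_low, ci_high, z_crit=1.96):
    """Recover the log-hazard coefficient, its SE, the z statistic and the
    two-sided P value from a hazard ratio and its Wald-type 95% CI."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p

# Univariate estimates reported in the text
for name, hr, lo, hi in [
    ("CD38 high", 0.554, 0.415, 0.740),
    ("Chemotherapy", 0.470, 0.288, 0.767),
]:
    beta, se, z, p = wald_from_hr(hr, lo, hi)
    print(f"{name}: beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

Under this assumption the back-calculated P values are consistent with the reported ones (P < 0.001 for CD38 and roughly 0.003 for chemotherapy), with small differences attributable to rounding of the published estimates.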
Radiomics feature extraction and selection
The 56 TCGA-TCIA cases were divided into CD38high (29 cases) and CD38low (27 cases) groups based on the cut-off value of 0.757. A total of 107 radiomics features were extracted from manually delineated tumor regions and standardized. The median ICC was 0.963: 101 features had an ICC ≥ 0.80, 5 features had 0.5 ≤ ICC < 0.80, and 1 feature had an ICC < 0.5. Features with an ICC ≥ 0.80 were retained for subsequent feature selection. Finally, five radiomics features (glcm_ldn, glrlm_RunLenthNonUniformity, gldm_DependenceVariance, glszm_LargeAreaHighGrayLevelEmphasis, and gldm_SmallDependenceLowGrayLevelEmphasis) were selected by mRMR and stepwise regression.
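The ICC-based reproducibility filter can be sketched as follows. The text does not specify which ICC form was used, so this example assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement); the feature names and values are hypothetical toy data, with each row holding one lesion's feature value from two repeated delineations.

```python
def icc_2_1(ratings):
    """ICC(2,1) for an n-subjects x k-raters table of measurements."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)              # between-subject mean square
    ms_cols = ss_cols / (k - 1)              # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical toy features: 5 lesions, 2 delineations each
toy_features = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 5.1]],
    "unstable_feature": [[1.0, 3.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0], [5.0, 3.5]],
}

ICC_THRESHOLD = 0.80  # threshold stated in the text
kept = [name for name, r in toy_features.items() if icc_2_1(r) >= ICC_THRESHOLD]
print(kept)
```

Only features whose values are reproducible across delineations pass the threshold and move on to mRMR and stepwise selection.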
Construction and evaluation of LR and SVM models
In the LR model, the overall importance of the selected radiomic features is illustrated in Supplementary Figure 4. The formula for predicting the probability of CD38 expression level, represented by the Rad_score, is as follows:
Rad_score = 0.220
 - 0.547*(glcm_ldn)
 - 0.706*(glrlm_RunLenthNonUniformity)
 - 0.566*(gldm_DependenceVariance)
 + 1.787*(glszm_LargeAreaHighGrayLevelEmphasis)
 - 1.181*(gldm_SmallDependenceLowGrayLevelEmphasis)
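The formula above can be evaluated directly for a given case. The sketch below assumes that the inputs are the standardized feature values and that, as in a standard logistic regression, the linear combination is the log-odds, so that a probability is obtained by applying the logistic function. The text does not spell out this last step, so it is an assumption, and the function names are illustrative.

```python
import math

INTERCEPT = 0.220
COEFFICIENTS = {
    "glcm_ldn": -0.547,
    "glrlm_RunLenthNonUniformity": -0.706,
    "gldm_DependenceVariance": -0.566,
    "glszm_LargeAreaHighGrayLevelEmphasis": 1.787,
    "gldm_SmallDependenceLowGrayLevelEmphasis": -1.181,
}

def linear_predictor(features):
    """Intercept plus the weighted sum of the standardized feature values."""
    return INTERCEPT + sum(coef * features[name] for name, coef in COEFFICIENTS.items())

def predicted_probability(features):
    """Logistic transform of the linear predictor (assumed to be the log-odds)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(features)))

# A hypothetical case with every standardized feature at its mean (0)
average_case = {name: 0.0 for name in COEFFICIENTS}
print(linear_predictor(average_case), predicted_probability(average_case))
```

Because the features are standardized, each coefficient can be read as the change in the linear predictor per one standard deviation of the corresponding feature.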
The SVM model was also constructed using the aforementioned five radiomics features. The importance of each radiomics feature in the model is shown in Supplementary Figure 5.
The predictive performance of the LR and SVM models was evaluated. The LR model achieved a ROC-AUC of 0.739 (95% CI: 0.609-0.870), and after 5-fold cross-validation its ROC-AUC was 0.732 (95% CI: 0.595-0.868), as shown in Figure 3-A and B. The SVM model achieved a ROC-AUC of 0.741 (95% CI: 0.608-0.874) and a 5-fold cross-validated ROC-AUC of 0.700 (95% CI: 0.555-0.835), as shown in Figure 4-A and B. The Hosmer-Lemeshow goodness-of-fit test demonstrated comparatively good consistency between the predicted probabilities and the actual values for both the LR (P=0.838) and SVM (P=0.074) models. The corresponding calibration curves are shown in Figure 3-C and Figure 4-C. The AUC for the PR curve was 0.760 for the LR model (Figure 3-D) and 0.721 for the SVM model (Figure 4-D). The DCA curves showed that, for risk threshold probabilities between 30% and 80%, both models had a superior net benefit (Figure 3-E and 4-E). The DeLong test indicated no statistically significant difference between the AUC values before and after cross-validation for either model (P=0.961 for the LR model, P=0.703 for the SVM model), indicating the stability of the model fitting.
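Two of the metrics reported above, the ROC-AUC and the net benefit used in DCA, follow from simple definitions. The sketch below uses the pairwise-ranking definition of the AUC and the standard net benefit formula, TP/n - (FP/n)·pt/(1 - pt), at a threshold probability pt. The labels and scores are hypothetical toy values, and the function names are illustrative rather than the authors' implementation.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def net_benefit(labels, scores, threshold):
    """Net benefit of treating every case whose predicted probability is at
    least `threshold` (0 < threshold < 1)."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Hypothetical toy predictions (1 = CD38high, 0 = CD38low)
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]

print("AUC:", roc_auc(toy_labels, toy_scores))
for pt in (0.3, 0.5, 0.8):
    print(f"net benefit at pt={pt}:", net_benefit(toy_labels, toy_scores, pt))
```

A decision curve is obtained by evaluating the net benefit over a range of thresholds and comparing it with the "treat all" and "treat none" strategies.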
The LR and SVM models both output the predicted probability of CD38 expression level, referred to as the Rad_score. There was a significant difference in the distribution of the Rad_score between the CD38high and CD38low groups (p<0.01), with the CD38high group exhibiting higher Rad_score values (Figure 3-F and 4-F).