As shown in Fig. 1, our research is divided into three stages. To estimate the proportion of TICs in LGG samples and tumor purity, transcriptome RNA-seq data from 516 patients were downloaded from TCGA; then ESTIMATE and CIBERSORT algorithms were performed. DEGs shared by ImmuneScore and StromalScore were used to construct a PPI network. Significant hub genes in the PPI network were evaluated using univariate Cox regression cross-analysis. Meanwhile, we selected a qualified data set from the GEO database and conducted a difference analysis to obtain clinical-related DEGs; then the association between all the DEGs and the survival of LGG patients were evaluated and screened. Next, CD3E was identified and validated as the most relevant gene after combination of the two datasets of DEGs. Further studies focused on impact of CD3E on survival, GSEA and correlation with TICs. Functional annotations of neighbor genes and clinical validation of CD3E was elaborated. Finally, we put the research conclusions in our own AHYMUN center for clinical cohort study.
TME-related scores are related to survival of LGG patients
In order to confirm whether the proportion of cells in TME and tumor purity will affect the survival time of LGG patients, we calculated ImmuneScore, StrromalScore and ESTIMATEScore, and drew a Kaplan-Meier survival curve. The higher the Score, the higher the proportion of the corresponding component in TME. The sum of ImmuneScore and StromalScore is ESTIMATEScore, which also reflects tumor purity from the side. In Fig. 2, TME scores are related to overall survival. ImmuneScore (P = 0.003), StromalScore (P < 0.001) and ESTIMATEScore (P = 0.006) were positively correlated with OS. These results show that we can infer the prognosis of LGG patients based on the proportion of immune cells in TME and formulate personalized treatment plans.
TME-related scores are related to the Clinical features of LGG Patients
We combined the corresponding clinical information of TCGA's LGG patients with the above calculated scores to determine whether the LGG's TME and tumor purity are related to the patient's clinical characteristics. ImmuneScore positively correlated to high grade of LGG (Fig. 3C, P < 0.001); StromalScore was positively correlated to high grade of LGG (Fig. 3F, P < 0.001), and ESTIMATEScore accompanied with high grade of LGG (Fig. 3I, P < 0.001). These results indicate that tumor purity and the ratio of immune/stromal cells in TME are related to the deterioration of LGG. The higher the ratio of immune/stromal cells in TME, the lower the purity of the tumor, and the worse the prognosis of LGG patients.
The Enrichment Analyses of Immune-Related DEGs
In order to determine the exact changes in the genetic profiles of immune and matrix components in TME, we compared high- and low-scoring samples based on the median. We got 297 DEGs through ImmuneScore, 201 genes were upregulated, and 96 genes were downregulated (Fig. 4A, C, D). We also got 518 DEGs from StromalScore, which contained 461 upregulated genes and 57 downregulated genes (Fig. 4B-4D). We found through Venn diagram that 199 upregulated genes with high score and 19 downregulated genes with low score were both in ImmuneScore and StromalScore. These 218 immune-related DEGs may play a decisive role in LGG's TME. We found through GO enrichment analysis and KEGG analysis that the biological functions of these genes are mainly related to immunity. (Fig. 4E-F).
Identify key Immune-Related genes
In order to further study the underlying mechanism of the above genes and find the key genes, We drew the PPI network diagram through String. The interaction between the genes is shown in Fig. 5A. We selected the top 30 genes ranked by the number of nodes and plotted them into a bar graph (Fig. 5B). We performed univariate COX regression analysis on the survival of Immune-Related DEGs and LGG patients to determine which genes are at high risk for LGG patients and which are low risk. (Fig. 5C). Finally, we combined the main nodes in the PPI and the top 75 genes ranked by the p value to analyze them, we have obtained 30 intersecting genes. (Fig. 5D).
Filter clinical-related DEGs and Lock the Target Gene
We use the R language package to screen all the genes that affect survival in three GSE sets. We screened 114 clinical-related DEGs (P < 0.001) that were significantly related to survival from 13299 related genes, and compared them with the previous immune-related DEGs to obtain 7 genes: CD3E, TLR2, CCR5, CXCL9, CXCL10, FCGR2A, and ITGAL (Fig. 5E). We mapped the PPI network for these 7 genes (Fig. 5F). 78.89% terms were in co-expression (lavender line), 7.65% terms were shared protein domains (yellow line), 7.11% terms were in co-localization (deep blue line), and 7.11% terms were predicted (khaki line). We also performed GO and KEGG pathway analyses on these 7 genes, finding that the genes were related to immune diseases and inflammatory response (Fig. 5G). Based on the hazard ratio (HR) value of each gene and the survival-related p value, we targeted CD3E for further study.
Identification of Clinical-Related DEGs
According to the median of CD3E expression in the sample, we divided the data set into a high and a low expression groups and screened using "log fold change = 1, and P < 0.05". A total of 114 related differential genes were obtained. The 15 genes with the most significant up-regulation and the 11 genes with the most significant down-regulation were selected for further analysis (Table 1), which were visualized by volcano map (Fig. 6A) and heat map (Fig. 6B).
Table 1
The 15 genes with the most significant up-regulation and the 11 genes with the most significant down-regulation were selected for further analysis
id
|
logFC
|
AveExpr
|
t
|
P.Value
|
adj.P.Val
|
B
|
CD3E
|
0.503296
|
8.649803
|
14.40339
|
1.42E-32
|
1.79E-28
|
61.13156
|
WDR3
|
0.540363
|
10.02952
|
7.664638
|
8.06E-13
|
5.09E-09
|
18.49936
|
CRABP2
|
0.574073
|
9.887544
|
6.228971
|
2.80E-09
|
2.95E-06
|
10.83195
|
MED25
|
0.573891
|
11.43372
|
5.924861
|
1.38E-08
|
8.78E-06
|
9.336745
|
SMAD6
|
0.550032
|
10.46
|
5.684687
|
4.69E-08
|
2.19E-05
|
8.192201
|
E2F2
|
0.693043
|
10.64766
|
5.252812
|
3.88E-07
|
8.76E-05
|
6.219642
|
GINS2
|
0.533677
|
11.25321
|
5.242128
|
4.08E-07
|
9.05E-05
|
6.172292
|
IGSF5
|
0.516926
|
10.51717
|
5.105327
|
7.78E-07
|
0.000138
|
5.57241
|
KCNIP2
|
-0.53684
|
11.71034
|
-4.94083
|
1.66E-06
|
0.000225
|
4.867026
|
LILRB4
|
0.506478
|
10.80824
|
4.702932
|
4.82E-06
|
0.000444
|
3.878589
|
DCT
|
0.502274
|
9.540816
|
4.647528
|
6.15E-06
|
0.000529
|
3.653898
|
ODF3L2
|
0.576656
|
10.64615
|
4.512396
|
1.10E-05
|
0.000801
|
3.114778
|
TIMP4
|
-0.51087
|
13.07861
|
-4.48168
|
1.26E-05
|
0.000864
|
2.994023
|
CCDC102A
|
0.509778
|
9.87002
|
4.438659
|
1.51E-05
|
0.000969
|
2.825997
|
OGDHL
|
-0.63799
|
10.63297
|
-4.37422
|
1.98E-05
|
0.001135
|
2.5768
|
SLC15A3
|
0.503854
|
11.32423
|
4.240455
|
3.43E-05
|
0.001631
|
2.069005
|
UPK1A
|
0.538824
|
9.495527
|
4.145965
|
5.03E-05
|
0.002087
|
1.718163
|
REM1
|
0.608585
|
10.15157
|
3.694196
|
0.000286
|
0.006442
|
0.133289
|
SYN2
|
-0.54007
|
11.02732
|
-3.67165
|
0.000311
|
0.006821
|
0.05829
|
RIT2
|
-0.5668
|
9.842119
|
-3.65283
|
0.000333
|
0.007176
|
-0.00398
|
P2RY1
|
-0.51652
|
11.04178
|
-3.59096
|
0.000416
|
0.008445
|
-0.20681
|
CAMK4
|
-0.53104
|
11.17374
|
-3.53893
|
0.000501
|
0.00946
|
-0.37504
|
KCNC2
|
-0.55738
|
9.807489
|
-3.53712
|
0.000505
|
0.00949
|
-0.38087
|
SNCB
|
-0.51772
|
11.36075
|
-3.43972
|
0.000711
|
0.011889
|
-0.68982
|
CALY
|
-0.6215
|
11.73797
|
-3.33767
|
0.001011
|
0.015188
|
-1.00539
|
ALDH1A3
|
0.520521
|
10.04314
|
3.110935
|
0.002143
|
0.025217
|
-1.67609
|
ZFR2
|
-0.54908
|
11.97667
|
-2.95741
|
0.003483
|
0.034902
|
-2.10604
|
Correlation Analyses of Clinical-Related DEGs
As illustrated in Fig. 6C, gene-gene interaction between Clinical-Related DEGs and related genes was performed. 95.20% terms were in co-expression (lavender line), and 4.80% terms were in co-localization (deep blue line). In Fig. 6D-6F, We conducted a biological function enrichment analysis of DEGs. The results showed that enrichments of biological processes were positive regulation of voltage-gated potassium channel activity, positive regulation of potassium ion transmembrane transporter activity and regulation of pry-miRNA transcription by RNA polymerase II (Fig. 6D); enrichments of cellular components were ion glutamatergic synapse, apical plasma membrane and apical part of cell (Fig. 6E); enrichments of molecular functions were oxidoreductase activity, calmodulin binding and copper ion binding (Fig. 6F). Enrichments in KEGG pathway were glioma, tyrosine metabolism and citrate cycle (Fig. 6G).
We correlated the 20 most significantly up-regulated genes and the 20 most significantly down-regulated genes with CD3E. Red for positive correlation, and green represents a negative correlation. The deeper the color, the greater the relevance. CD3E is positively correlated with LILRB4, UPK1A, and REM1, negatively correlated with RIT2, OGDHL, and KCNC2 (Fig. 6H).
CD3E Expression is Negatively Related to the Survival of LGG Patients
CD3E is an epsilon subunit of T cell antigen receptor. According to the median of CD3E expression, all LGG samples were divided into CD3E high, median and low expression groups. Survival analysis showed that in TCGA (P = 0.0011; Fig. 7A) and GSE (P < 0.001; Fig. 7B), the survival rate of LGG patients with high CD3E expression was lower than that of CD3E low expression. Similarly, in GEPIA, the OS of the CD3E high expression was lower than that of the low expression (P < 0.001; Fig. 7C)
CD3E is a Potential Indicator of TME Modulation
Considering that CD3E expression is negatively correlated with the survival rate of LGG patients, we performed GSEA analysis on the high expression group. We found that the genes in the CD3E high expression group mainly participated in immune-related activities, such as B cell receptor signaling pathway, chemokine signaling pathway and T cell receiver signaling pathway (Fig. 7D). Furthermore, CD3E was positively related to glioma and immune cell response. These results suggest that CD3E may be a potential indicator of TME status for LGG.
Correlation of CD3E With the Proportion of TICs
We used the CIBERSORT algorithm to analyze the proportion of TICs of 22 immune cells in LGG to further study the correlation between CD3E and the immune microenvironment of LGG. (Fig. 8). We found that the expression of CD3E is related to the TIC of 10 LGG (Fig. 9).Seven kinds of TICs were positively correlated with CD3E expression, including macrophages M0, macrophages M1, mast cells resting, NK cells resting, T cells CD4 memory activated, T cells CD8 and T cells regulatory; three kinds were negatively correlated with CD3E expression, including eosinophils, monocytes and NK cells activated. These results prove that CD3E is related to the immune activity of TME, thereby affecting the tumor purity of LGG.
Clinicopathological features related to CD3E expression
To verify CD3E expression in LGG, we performed immunohistochemistry (IHC) (Fig. 10A-10B). The scatter plot of the IHC scores revealed that CD3E expression increased in LGG tissues in the AHYMUN cohort (P < 0.01). In Table 2, we found that higher CD3E expression is with patients’ age (P = 0.027), grade (P < 0.001), microvascular invasion (P = 0.009), history of epilepsy (P < 0.001) and Karnofsky score (P = 0.002). This seems to indicate that the higher the expression of CD3E in patients, the worse the prognosis.
Table 2
Clinicopathological characteristics in relation to CD3E expression level in AHYMUM cohort.
Characteristics
|
AHYMUM cohort
|
CD3E expression
|
χ2
|
P
|
(N = 100)
|
Low IHC score
|
High IHC score
|
|
(N = 50)
|
(N = 50)
|
N (%)
|
|
|
|
|
|
Age
|
|
|
|
4.889
|
0.027
|
༜60 years
|
55(0.55)
|
33(0.60)
|
22(0.40)
|
|
|
≥ 60 years
|
45(0.45)
|
17(0.38)
|
28(0.72)
|
|
|
Gender
|
|
|
|
0.271
|
0.603
|
Male
|
82(0.82)
|
40(0.49)
|
42(0.51)
|
|
|
Female
|
18(0.18)
|
10(0.56)
|
8(0.44)
|
|
|
Grade
|
|
|
|
14.924
|
< 0.001
|
G1
|
69(0.69)
|
39(0.57)
|
30(0.43)
|
|
|
G2
|
31(0.31)
|
11(0.35)
|
20(0.65)
|
|
|
Seizure history
|
|
|
|
12.148
|
< 0.001
|
yes
|
61(0.61)
|
39(0.64)
|
22(0.36)
|
|
|
no
|
39(0.39)
|
11(0.28)
|
28(0.72)
|
|
|
Microvascular invasion
|
|
|
|
6.828
|
0.009
|
Absent
|
55(0.55)
|
34(0.62)
|
21(0.38)
|
|
|
Present
|
45(0.45)
|
16(0.36)
|
29(0.64)
|
|
|
Capsular invasion
|
|
|
|
1.961
|
0.161
|
Absent
|
51(0.51)
|
29(0.57)
|
22(0.43)
|
|
|
Present
|
49(0.49)
|
21(0.43)
|
28(0.57)
|
|
|
Karnofsky score
|
|
|
|
9.180
|
0.002
|
≥ 80
|
61(0.61)
|
36(0.59)
|
21(0.41)
|
|
|
༜80
|
39(0.39)
|
14(0.36)
|
29(0.64)
|
|
|
Cox regression analysis
We use univariate Cox regression analysis to show the relationship between CD3E and AHYMUN patients, we found that CD3E is not very relevant to age and gender. (Fig. 10C). In the multivariate model, we also found that patients in the high expression group had worse OS (HR = 3.22; P = 0.001). Moreover, in the AHNTU cohort, the microvascular invasion (HR = 1.52; P = 0.024), the presence of capsular infiltration (HR = 1.63; P = 0.016), and the Karnofsky scores (ref < 80) (HR = 1.46; P = 0.023) were associated with low OS (Table 3).
Table 3
Multivariate Cox regression analysis of DFS and OS in AHYMUM cohorts (DFS: disease-free survival; OS: overall survival)
Covariates
|
OS
|
|
DFS
|
HR
|
95% CI
|
P value
|
|
HR
|
95% CI
|
P value
|
Grade (ref. G1)
|
1.97
|
2.25–3.68
|
0.043
|
|
2.31
|
1.94–4.02
|
0.037
|
Microvascular invasion (ref. Absent)
|
1.52
|
1.61–2.54
|
0.024
|
|
1.98
|
1.73–3.64
|
0.031
|
Capsular invasion (ref. Absent)
|
1.63
|
2.17–3.21
|
0.016
|
|
1.54
|
2.31–3.16
|
0.017
|
Karnofsky score (ref. ༞80)
|
1.46
|
2.31–3.27
|
0.023
|
|
1.56
|
1.66–2.64
|
0.044
|
CD3E expression (ref. low)
|
3.32
|
2.48–9.91
|
0.001
|
|
4.33
|
2.64–12.21
|
< 0.001
|
We found that the patient’s gender and epilepsy history were not related to DFS (Fig. 10D). Multivariate We found through Cox analysis that the high expression of the CD3E gene caused a significant decrease in OS (HR = 4.33; P < 0.001) (Table 3). Including grade, capsular infiltration, microvascular invasion and Karnofsky scores are related to OS (P < 0.05). In Fig. 10E-F, the higher the CD3E expression level, the lower the OS and DFS of LGG patients.