Identification of differentially expressed DEIGs
TCGA LUAD dataset legacy-archive (hg19) was downloaded, including four datasets: RNA count, miRNA count, somatic mutation and somatic CNV (Table S1). Gene expression data of “Primary solid Tumor” and “Solid Tissue Normal” samples were accessed by R package “TCGA biolinks”.The DEIGs were identified by obtaining the overlapping subset of a list of immune-related genes from InnateDB and the list of different expressed genes. A total of 7477 immune-related genes were downloaded from InnateDB. The differentially expressed gene analysis was performed by edgeR and DESeq2, and only DEIGs detected by both methods were included. A total of 1359 differentially expressed genes were identified, 713 up-regulated and 646 down-regulated (Table 1). As expected, functional enrichment analysis indicated that the cytokine-cytokine receptor interactions and the inflammatory pathways were among the top terms enriched by DEIGs. (Figure1)
Table 1
A total of 1359 differentially expressed genes were identified, 713 up-regulated and 646 down-regulated
|
DEmRNA
|
DElncRNA
|
DEmiRNA
|
|
Immune
|
Others
|
Immune
|
Others
|
|
Tumor-up
|
713
|
1723
|
3
|
18
|
163
|
Tumor-down
|
646
|
1095
|
1
|
6
|
55
|
Construction of mRNA-miRNA-lncRNA Network
Cytoscape was employed to build a mRNA-miRNA-lncRNA network using differentially expressed lncRNA, miRNA and immune-related target mRNA (Figure 2). In total, the ceRNA network consisted of 8 lncRNA (AGAP11, CASC2, GAS5, MIAT, PVT1, SNHG1, SNHG12, SNHG3),7 miRNA(hsa-mir-1276, hsa-mir-133b, hsa-mir-137, hsa-mir-451a, hsa-mir-543, hsa-mir-551a, hsa-mir-577) and 117 target mRNA. (Table 2)
Table 2
The network was conducted by DElncRNA, DEmiRNA and immune-related DEmRNA with cytoscape
RNAtype
|
lncRNA
|
miRNA
|
mRNA
|
No.
|
8
|
7
|
117
|
Characteristics of target genes
As expected, the DEIGs were mainly involved in immune and inflammatory responses by using gene functional enrichment analysis. The most frequent biological terms among biological processes, cellular components, and molecular functions were “myeloid cell differentiation,” “cytoplasmic region” and “protein heterodimerization activity”(Figure 3A). For KEGG enrichment, MAPK signaling pathway was the most often enriched by differentially expressed target genes (Figure 3B). The 117 target genes were used to construct a protein-protein interaction network. Two subnetworks were extracted via MCODE algorithms. The first MCODE network had six proteins, SH3GL2, TFRC, SYT1, FZD4 and REPS2. The second MCODE network had five components, PDIA6, CDH2, CKAP4, CP and SPP1(Figure S1). The mutations of those 117 target genes were investigated in 569 tumors. The top 50 genes were visualized using OncoPrint. Different mutation type was indicated by different color, and the missense mutation is the most common type (Figure S2).
Evaluation of clinical outcomes
The optimized model consisted of the following genes: ASPH, CAV1, FKBP4, GRIK2, FURIN, SLC6A8, FSCN1, CKAP4, HAPLN2, IL22RA2, which could serve as prognostic marker for LUAD patients. The area under curve of the receiver operating characteristic (ROC) curve was 0.699 for 3years, 0.627 for 5 years, 0.681 for 10years, indicating the prognostic model based on DEIGs has definite potential in survival monitoring. The heatmaps showed that the gene expression profiles were different between the cases of the high and low risk score groups (Figure 4). Univariate Cox regression analysis suggested that the prognostic signature, age, tumor stage, pathologic stage and metastasis status are all associated with prognosis(Table 3).The prognostic model based on DEIGs was identified as an independent predictor using multivariate cox regression analysis after the adjustment of other parameters(Figure 5). We validated our model and selected genes using two additional datasets from GEO database. We obtained significant results in GSE72094(Figure S3).
Table 3
Univariate cox regression analysis
|
HR
|
95%CI
|
p.value
|
RiskScore
|
12
|
(6.4-24)
|
7.9e-14
|
TMB
|
0.97
|
(0.9-1)
|
0.46
|
T
|
1.5
|
(1.3-1.8)
|
7.8e-06
|
M
|
2.2
|
(1.3-3.7)
|
0.0047
|
N
|
1.7
|
(1.4-2)
|
1.9e-09
|
Age
|
1
|
(0.99-1)
|
0.48
|
Stage
|
1.7
|
(1.5-1.9)
|
4.9e-13
|
Smoke
|
0.92
|
(0.67-1.3)
|
0.59
|
Gender
|
1.1
|
(0.79-1.4)
|
0.68
|
Correlation between Prognostic Signature and Immune Infiltration
We analyzed the relationship between model predicted risk score and immune cell infiltration to verify the function of immune related genes in tumor immune microenvironment. As shown in the figure 6, to some extent, the risk score of our model is inversely related to the infiltration of the immune cells including B cell (R= -0.32, p=2.1e-13), CD4+ T cell (R= -0.15, p=0.00067) and dendritic cell (R= -0.11, p=0.014) (Figure 6). The higher the risk score, the lower the amount of these immunoinfiltrating cells. The correlation was more significant in B cells as well as the cell markers, the markers such as CD1D, CD5, CD19, MS4A1, CD27, CD40LG, CR2 were all negatively correlated with the model predicted risk score, the negative correlation of CD40LG is the most obvious (R= -0.46, p<2.2e-16) (Figure S4-S7).
Expression of ASPH and FSCN1 in different LUAD cell lines
RT-qPCR and Western blot were used to detect the relative expression of ASPH, FSCN1 in NSCLC cell lines (H1299, PC9, A549). Results Compared with normal BEAS-2B cells, all 3 NSCLC cell lines showed high FSCN1 expression, the expression of ASPH was the highest in A549 cells and lowest in H1299 cells ( P< 0. 01). (Figure 7).
Effect of the expression of the immune-related genes and B cell markers on OS of LUAD patients
We finally examined the expression of ASPH, FSCN1, MS4A1 and CD40LG in 60 lung adenocarcinoma tissues by immunohistochemistry. 78.33% (47/60) of LUAD patients tissue samples had positive expression of ASPH, 70.00% (42/60) of FSCN1, 21.67% (13/60) of MS4A1, 28.33% (17/60) of CD40LG (Figure 8). Based on the result of IHC of ASPH, FSCN1, MS4A1 and CD40LG, the patients were divided into 2 groups (negative group and positive group); Table 4 showed the characteristics of both groups. We found that the positive expression of ASPH, FSCN1, MS4A1 and CD40LG had a correlation with the TNM stage, cellular differentiation and the lymph node metastasis (p<0.05). But were not associated with the age and sex(p>0.05). Then The effect of the immune related genes on prognosis of LUAD patients was verified by Kaplan–Meier. Univariate Cox regression analysis suggested that the expression of ASPH (HR=0.608; 95% CI, 0.421~0.795), FSCN1 (HR=0.684; 95% CI, 0.482~0.897), MS4A1 (HR=0.741, 95%CI=0.624~0.891) and CD40LG (HR=0.653, 95%CI=0.437~0.833) were significantly associated with overall survival (OS) (Figure 9).
Table 4
Baseline characteristics of patients.
Clinical pathological characteristic
|
N
|
ASPH
|
P
|
FSCN1
|
P
|
MS4A1
|
P
|
CD40LG
|
P
|
Positive
|
Negative
|
Positive
|
Negative
|
Positive
|
Negative
|
Positive
|
Negative
|
Age(t/yr)
|
|
|
|
|
|
|
0.3561
|
|
|
0.4411
|
|
|
0.2017
|
≥60
|
38
|
29
|
9
|
0.3132
|
26
|
12
|
9
|
29
|
10
|
28
|
<60
|
22
|
18
|
4
|
16
|
6
|
4
|
18
|
7
|
15
|
Sex
|
|
|
|
|
|
0.5112
|
|
|
0.3371
|
|
|
0.2146
|
Male
|
33
|
22
|
11
|
0.4172
|
24
|
9
|
8
|
25
|
9
|
23
|
Female
|
27
|
20
|
7
|
18
|
9
|
5
|
22
|
8
|
19
|
Histological differentiation
|
|
|
|
|
|
|
0.0339
|
|
|
0.0277
|
|
|
0.02309
|
High or middle differentiation
|
21
|
13
|
7
|
0.0266
|
13
|
8
|
8
|
13
|
9
|
12
|
Low or no differentiation
|
39
|
34
|
5
|
29
|
10
|
5
|
24
|
8
|
31
|
TNM stage
|
|
|
|
|
|
0.0291
|
|
|
0.0433
|
|
|
0.0451
|
I-Ⅱ
|
35
|
27
|
8
|
0.0328
|
23
|
12
|
8
|
27
|
8
|
27
|
Ⅲ-Ⅳ
|
25
|
20
|
5
|
19
|
6
|
5
|
20
|
9
|
16
|
Lymph node metastasis
|
|
|
|
|
|
|
0.0346
|
|
|
0.0183
|
|
|
0.0147
|
No
|
26
|
19
|
7
|
0.0224
|
18
|
8
|
8
|
18
|
11
|
15
|
Yes
|
34
|
28
|
6
|
24
|
10
|
5
|
29
|
6
|
28
|