Patient Characteristics
We retrieved clinical and RNA-sequencing data for 594 patients from The Cancer Genome Atlas (TCGA), including 58 patients with matched normal tissue samples. The clinicopathological characteristics of these patients are shown in Additional file 1. To increase the sample size, we also obtained gene expression data for normal lung tissues (n = 288) from the Genotype-Tissue Expression (GTEx) project. Two further datasets comparing normal and LUAD tissues were taken from the Gene Expression Omnibus (GEO) database: GSE10072, which contains normal lung tissues (n = 48) and LUAD tissues (n = 58), and GSE115002, which contains normal lung tissues (n = 51) and LUAD tissues (n = 52).
Low Expression of KLF4 in LUAD
According to the pan-cancer analysis, KLF4 expression was generally low in cancers including ACC, BLCA, BRCA, CESC, COAD, HNSC, KICH, LUAD, LUSC, OV, READ, SKCM, THCA, UCEC and UCS (Fig. 1A). Specifically, KLF4 expression was significantly lower in LUAD samples than in normal lung tissues (p < 0.001) (Fig. 1B). This trend was also observed in the 58 paired LUAD tissues (p < 0.001) (Fig. 1C). Similarly, KLF4 expression was significantly lower in LUAD than in normal tissues in the two GEO datasets: GSE10072 (Fig. 1D) and GSE115002 (Fig. 1E).
To further investigate KLF4 expression, immunohistochemical (IHC) staining was performed on a cohort of 25 LUAD tissues paired with noncancerous tissues, collected at our hospital (details in Additional file 2). Of the 25 samples initially collected, 2 were subsequently excluded because the paraffin-embedded blocks contained no cancer tissue. The final cohort consisted of 23 patients with a mean age of 50.4 years (range 26–78 years). Consistent with the analyses above, KLF4 staining was weaker in LUAD than in the noncancerous tissues, indicating significant downregulation (P = 0.001) (Fig. 2).
Association of KLF4 Expression Levels with the Clinical Characteristics
We used the TCGA database to analyze how KLF4 transcription varied with the pathologic features of LUAD samples. As shown in Table 1 and Fig. 3, KLF4 expression was consistently and significantly lower in LUAD patients than in controls regardless of subgroup. The subgroups examined were pathologic stage, T stage, N stage, M stage, age, gender, ethnicity, smoking status, residual tumor, OS event, DSS event, and PFI event (Fig. 3). Moreover, univariate logistic regression showed that high KLF4 expression was significantly associated with T stage (T3 & T4 vs. T1 & T2; OR = 1.875, 95% CI = 1.117–3.209) and gender (male vs. female; OR = 1.538, 95% CI = 1.094–2.167) (Table 2).
Table 1
Clinicopathological characteristics of high- and low-KLF4 expression groups.
| Characteristics | Low expression of KLF4 | High expression of KLF4 | p |
|---|---|---|---|
| n | 267 | 268 | |
| Age, median (IQR) | 65 (59, 71) | 67 (59, 73) | 0.183 |
| Age, n (%) | | | 0.376 |
| <=65 | 135 (26.2%) | 120 (23.3%) | |
| >65 | 127 (24.6%) | 134 (26%) | |
| T stage, n (%) | | | 0.041 |
| T1 | 91 (17.1%) | 84 (15.8%) | |
| T2 | 151 (28.4%) | 138 (25.9%) | |
| T3 | 15 (2.8%) | 34 (6.4%) | |
| T4 | 10 (1.9%) | 9 (1.7%) | |
| N stage, n (%) | | | 0.508 |
| N0 | 171 (32.9%) | 177 (34.1%) | |
| N1 | 50 (9.6%) | 45 (8.7%) | |
| N2 | 34 (6.6%) | 40 (7.7%) | |
| N3 | 2 (0.4%) | 0 (0%) | |
| M stage, n (%) | | | 0.257 |
| M0 | 180 (46.6%) | 181 (46.9%) | |
| M1 | 9 (2.3%) | 16 (4.1%) | |
| Pathologic stage, n (%) | | | 0.234 |
| Stage I | 156 (29.6%) | 138 (26.2%) | |
| Stage II | 57 (10.8%) | 66 (12.5%) | |
| Stage III | 41 (7.8%) | 43 (8.2%) | |
| Stage IV | 9 (1.7%) | 17 (3.2%) | |
| Primary therapy outcome, n (%) | | | 0.13 |
| PD | 30 (6.7%) | 41 (9.2%) | |
| SD | 22 (4.9%) | 15 (3.4%) | |
| PR | 1 (0.2%) | 5 (1.1%) | |
| CR | 169 (37.9%) | 163 (36.5%) | |
| Gender, n (%) | | | 0.017 |
| Female | 157 (29.3%) | 129 (24.1%) | |
| Male | 110 (20.6%) | 139 (26%) | |
| Race, n (%) | | | 0.762 |
| Asian | 3 (0.6%) | 4 (0.9%) | |
| Black or African American | 30 (6.4%) | 25 (5.3%) | |
| White | 203 (43.4%) | 203 (43.4%) | |
| Residual tumor, n (%) | | | 0.386 |
| R0 | 183 (49.2%) | 172 (46.2%) | |
| R1 | 5 (1.3%) | 8 (2.2%) | |
| R2 | 1 (0.3%) | 3 (0.8%) | |
| Histological type, n (%) | | | 0.623 |
| Lung Acinar Adenocarcinoma | 9 (1.8%) | 9 (1.8%) | |
| Lung Adenocarcinoma Mixed Subtype | 58 (11.8%) | 51 (10.4%) | |
| Lung Adenocarcinoma-NOS | 170 (34.6%) | 170 (34.6%) | |
| Lung Bronchioloalveolar Carcinoma Mucinous | 1 (0.2%) | 4 (0.8%) | |
| Lung Bronchioloalveolar Carcinoma Nonmucinous | 11 (2.2%) | 8 (1.6%) | |
| Anatomic neoplasm subdivision, n (%) | | | 0.473 |
| Left | 98 (18.8%) | 107 (20.6%) | |
| Right | 162 (31.2%) | 153 (29.4%) | |
| Anatomic neoplasm subdivision2, n (%) | | | 0.53 |
| Central Lung | 27 (14.3%) | 35 (18.5%) | |
| Peripheral Lung | 63 (33.3%) | 64 (33.9%) | |
| Number of pack-years smoked, n (%) | | | 0.28 |
| <40 | 85 (23%) | 103 (27.9%) | |
| >=40 | 93 (25.2%) | 88 (23.8%) | |
| Smoker, n (%) | | | 0.422 |
| No | 41 (7.9%) | 34 (6.5%) | |
| Yes | 218 (41.8%) | 228 (43.8%) | |
| OS event, n (%) | | | 0.063 |
| Alive | 182 (34%) | 161 (30.1%) | |
| Dead | 85 (15.9%) | 107 (20%) | |
| DSS event, n (%) | | | 0.023 |
| Alive | 202 (40.5%) | 177 (35.5%) | |
| Dead | 49 (9.8%) | 71 (14.2%) | |
| PFI event, n (%) | | | 0.271 |
| Alive | 161 (30.1%) | 148 (27.7%) | |
| Dead | 106 (19.8%) | 120 (22.4%) | |

IQR, interquartile range; NOS, Not Otherwise Specified
Table 2
Associations of KLF4 expression with clinicopathological characteristics of patients (n = 535).
| Characteristics | Total (N) | Odds ratio (95% CI) | P value |
|---|---|---|---|
| T stage (T3&T4 vs. T1&T2) | 532 | 1.875 (1.117–3.209) | 0.019 |
| N stage (N2&N3 vs. N0&N1) | 519 | 1.106 (0.679–1.806) | 0.685 |
| M stage (M1 vs. M0) | 386 | 1.768 (0.776–4.275) | 0.185 |
| Pathologic stage (Stage III&Stage IV vs. Stage I&Stage II) | 527 | 1.253 (0.823–1.915) | 0.294 |
| Primary therapy outcome (SD&PR&CR vs. PD) | 446 | 0.697 (0.415–1.161) | 0.168 |
| Gender (Male vs. Female) | 535 | 1.538 (1.094–2.167) | 0.014 |
| Race (Asian&Black or African American vs. White) | 468 | 0.879 (0.512–1.501) | 0.636 |
| Age (>65 vs. <=65) | 516 | 1.187 (0.840–1.678) | 0.331 |
| Residual tumor (R1&R2 vs. R0) | 372 | 1.951 (0.726–5.768) | 0.198 |
| Anatomic neoplasm subdivision (Right vs. Left) | 520 | 0.865 (0.608–1.230) | 0.419 |
| Anatomic neoplasm subdivision2 (Peripheral Lung vs. Central Lung) | 189 | 0.784 (0.423–1.441) | 0.434 |
| Number of pack-years smoked (>=40 vs. <40) | 369 | 0.781 (0.518–1.175) | 0.236 |
| Smoker (Yes vs. No) | 521 | 1.261 (0.773–2.070) | 0.354 |
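As a rough cross-check of Table 2, the odds ratio from a univariate logistic regression with a single binary predictor equals the cross-product ratio of the corresponding 2x2 table, so the T stage and gender estimates can be approximated from the counts in Table 1. The sketch below is illustrative only: the function name `odds_ratio_wald` is hypothetical, and the Wald interval it computes is an assumption that need not match the confidence intervals reported in Table 2, which may have been obtained by a different method.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald-type 95% CI for a 2x2 table.

    a: exposed, high KLF4      b: exposed, low KLF4
    c: unexposed, high KLF4    d: unexposed, low KLF4
    """
    or_ = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation (assumed method)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(or_)
    return or_, math.exp(log_or - z * se), math.exp(log_or + z * se)


# Counts read from Table 1: T3 & T4 (exposed) vs. T1 & T2 (unexposed)
t_stage = odds_ratio_wald(a=34 + 9, b=15 + 10, c=84 + 138, d=91 + 151)
# Counts read from Table 1: male (exposed) vs. female (unexposed)
gender = odds_ratio_wald(a=139, b=110, c=129, d=157)

for name, (or_, lo, hi) in {"T stage": t_stage, "Gender": gender}.items():
    print(f"{name}: OR = {or_:.3f}, Wald 95% CI = {lo:.3f}-{hi:.3f}")
```

The point estimates reproduce the odds ratios of 1.875 and 1.538 given in Table 2, while the interval bounds should be read as approximate.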
Identification of DEGs in LUAD and Protein-protein Interaction (PPI) Network Analysis
By comparing the gene expression profiles of LUAD samples with high and low KLF4 levels, we identified a total of 1582 differentially expressed genes (DEGs), consisting of 1100 (69.53%) upregulated genes and 482 (30.47%) downregulated genes (adjusted p value < 0.05, |log2 FC| > 1) (Fig. 4A and Additional file 3). The relationships between KLF4 and the top 10 DEGs (REG4, TFF2, MUC17, ANXA10, DPPA2, UPK1B, ONECUT3, TM4SF20, MUC5AC, and PSCA) are presented in Fig. 4B. We used the online STRING tool to construct a PPI network to investigate possible interactions among the top 50 DEGs, and identified a set of hub genes from the network with the same tool. The network exhibited a high degree of complexity, and the top 10 hub genes were SPRR1B, SPRR3, SPRR2A, LCE3E, CASP14, PI3, RPTN, LCE3A, SPRR2E, and TCHH (Additional file 4).
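The DEG thresholds stated above (adjusted p value < 0.05 and |log2 fold change| > 1) can be written as a simple classification rule. The following is a minimal sketch, not the pipeline used in the study: the function name `classify_deg` and the toy gene rows are hypothetical, and only the cutoffs and the reported counts come from the text.

```python
def classify_deg(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as up, down, or not significant using the stated cutoffs."""
    if adj_p >= p_cutoff or abs(log2_fc) <= fc_cutoff:
        return "not significant"
    return "up" if log2_fc > 0 else "down"


# Hypothetical rows: (gene label, log2 fold change, adjusted p value)
toy_rows = [
    ("gene_a", 2.3, 0.001),   # passes both cutoffs, positive change
    ("gene_b", -1.4, 0.02),   # passes both cutoffs, negative change
    ("gene_c", 0.6, 0.0001),  # significant p but small fold change
    ("gene_d", 3.0, 0.2),     # large fold change but p above cutoff
]
labels = {gene: classify_deg(fc, p) for gene, fc, p in toy_rows}

# Shares of up- and downregulated genes, from the counts reported in the text
n_up, n_down = 1100, 482
n_total = n_up + n_down
share_up = round(100 * n_up / n_total, 2)
share_down = round(100 * n_down / n_total, 2)
print(labels, n_total, share_up, share_down)
```

The last lines confirm that the reported percentages follow from the counts of 1100 and 482 out of 1582 DEGs.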
Functional Enrichment Analysis Including GO, KEGG, and GSEA Analysis
We performed GO enrichment analysis for biological processes, cellular components, and molecular functions. The DEGs were enriched in GO terms such as epidermis development, cornified envelope, and peptidase inhibitor activity (Fig. 4C and Additional file 5). KEGG pathway analysis showed that the pathways significantly enriched in DEGs included neuroactive ligand-receptor interaction, retinol metabolism, and chemical carcinogenesis (Fig. 4D and Additional file 5).
Subsequently, GSEA was performed using GSEA/MSigDB, and the results indicated significant enrichment in immune-related biological processes, including immunoregulatory interactions, humoral immune response, and immunoglobulin production. These results suggest that KLF4 may be associated with an enhanced immune phenotype in LUAD (Fig. 5 and Additional file 6).
Correlation Between Methylation and Expression of KLF4
To elucidate the mechanisms underlying aberrant KLF4 expression in LUAD tissues, we investigated the correlation between KLF4 expression levels and methylation status using online tools. First, using the UALCAN database, we found that the KLF4 promoter had a significantly lower level of DNA methylation in LUAD tissues than in normal lung tissues (P = 0.0208) (Fig. 6A). Most of the methylation sites in the KLF4 DNA sequence were hypomethylated in LUAD (Fig. 6B). Notably, methylation levels at these sites correlated significantly with patient outcomes: patients with low KLF4 methylation had poorer overall survival than those with high KLF4 methylation (Fig. 6C, D).
Correlation Between KLF4 Expression and Immune Infiltration
To examine the correlation between KLF4 expression and immune infiltrates in LUAD, we used TIMER2.0. KLF4 expression correlated positively with the levels of various infiltrating immune cells, including neutrophils, macrophages, monocytes, myeloid dendritic cells, mast cells, eosinophils, hematopoietic stem cells, NK cells, and CD4+ T cells, and negatively with B cells and Tregs (Additional file 7). Next, using single-sample GSEA, we observed a positive correlation between KLF4 expression and the levels of neutrophils (r = 0.370, P < 0.001), mast cells (r = 0.250, P < 0.001), eosinophils (r = 0.246, P < 0.001), and NK CD56bright cells (r = 0.188, P < 0.001) (Fig. 7A, B). The enrichment scores for these four cell types were also significantly higher in the high KLF4 expression group than in the low KLF4 expression group (all P < 0.001) (Fig. 7C). Moreover, we examined the relationship between KLF4 expression and immune biomarkers, including tumor mutational burden (TMB) and immune modulators. TMB scores were negatively correlated with KLF4 expression in LUAD (r = -0.13, P = 0.003) (Fig. 8A). Additionally, KLF4 expression correlated positively with the immunoinhibitors IL10, PDCD1LG2, and CD274 (Fig. 8B) and the immunostimulators IL6 and C10orf54, and negatively with TNFRSF18 and TNFRSF25 (Fig. 8C).
Prognostic Value of KLF4 in LUAD
KLF4 expression showed strong power to distinguish LUAD from normal tissues, with an area under the ROC curve (AUC) of 0.963 (95% confidence interval [CI] = 0.951–0.975) (Fig. 9A). We then used the Kaplan-Meier method to estimate the relationship between KLF4 expression and the survival of patients with LUAD. The median value of KLF4 expression was used to stratify patients into high and low expression groups. Compared with the low expression group, the high KLF4 expression group had a significantly worse prognosis for overall survival (OS), disease-specific survival (DSS), and progression-free interval (PFI) (OS: hazard ratio [HR] = 1.45, 95% CI = 1.09–1.94, P = 0.011; DSS: HR = 1.65, 95% CI = 1.14–2.39, P = 0.008; PFI: HR = 1.36, 95% CI = 1.04–1.77, P = 0.023), as shown by the corresponding Kaplan-Meier curves (Fig. 9B–D).
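To make the two steps above concrete, the sketch below shows a median split into high and low groups and a rank-based AUC computed on toy values. Everything here is illustrative: the function names and expression values are hypothetical, and assigning values equal to the median to the high group is an assumption (it would be consistent with the 267/268 split in Table 1 but is not stated in the text). Because KLF4 is lower in tumours, the negated expression is used as the score for the tumour class.

```python
from statistics import median


def median_split(values):
    """Assign 'high' to values at or above the median, 'low' otherwise (assumed tie rule)."""
    cut = median(values)
    return ["high" if v >= cut else "low" for v in values]


def auc_rank(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical expression values, not taken from the study
tumour_expr = [2.1, 3.0, 2.5, 4.2]
normal_expr = [5.0, 4.8, 3.9, 6.1]

groups = median_split(tumour_expr)

# Lower KLF4 indicates tumour, so negate expression to get a tumour score
auc = auc_rank([-x for x in tumour_expr], [-x for x in normal_expr])
print(groups, auc)
```

With these toy values, 15 of the 16 tumour-normal pairs are ranked correctly, which is why the AUC is below 1.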
Next, univariate and multivariate Cox regression analyses were performed to identify prognostic factors. The multivariate analysis showed that KLF4 expression (adjusted HR = 1.867, 95% CI = 1.265–2.755, P = 0.002), T3 stage (adjusted HR = 3.487, 95% CI = 1.555–7.822, P = 0.002), and N1 stage (adjusted HR = 2.159, 95% CI = 1.035–4.503, P = 0.04) were independent factors for OS in patients with LUAD (Fig. 9E and Additional file 8). Similarly, for DSS, KLF4 expression (adjusted HR = 1.845, 95% CI = 1.086–3.133, P = 0.024), and R1&R2 tumor (adjusted HR = 3.318, 95% CI = 1.115–9.871, P = 0.031) were identified as prognostic indicators (Additional file 9). For PFI, KLF4 expression (adjusted HR = 1.547, 95% CI = 1.057–2.265, P = 0.025), T3 stage (adjusted HR = 2.584, 95% CI = 1.106–6.038, P = 0.028), and R1&R2 tumor (adjusted HR = 3.196, 95% CI = 1.420–7.196, P = 0.005) were identified as prognostic indicators (Additional file 10). Subsequently, we also assessed the prognostic value of KLF4 expression in different subgroups. The high expression of KLF4 was consistently linked to unfavorable outcomes in various subgroups based on OS, DSS, and PFI, including T2-T4 stage, N0-N2 stage, M0 stage, stages I-III, male gender, white race, and age ≤ 60 years (all P < 0.05) (Fig. 10 and Additional file 11).
Construction and Validation of a Nomogram Based on the Independent Factors
A nomogram was developed from the independent factors for OS to estimate the prognosis of LUAD patients. On the nomogram, a higher total number of points was associated with a worse prognosis (Fig. 11A). Calibration curves were used to assess the prediction efficacy of the nomogram (Fig. 11B–D). The bootstrap-corrected C-index of the nomogram was 0.692 (95% CI = 0.667–0.716), indicating moderate predictive accuracy for OS in patients with LUAD. Furthermore, a time-dependent ROC curve was used to evaluate the discriminative ability of KLF4 expression (Additional file 12). Overall, the results suggest that the nomogram is appropriate for predicting the prognosis of LUAD patients.
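For readers unfamiliar with the C-index, the sketch below computes a plain Harrell-type concordance index on toy data. It is a simplified illustration under stated assumptions: the function name, follow-up times, event indicators, and risk scores are all hypothetical, and no bootstrap correction is applied, so it does not reproduce the bootstrap-corrected value reported above.

```python
def harrell_c_index(times, events, risk):
    """Fraction of comparable pairs in which the earlier failure has the higher risk.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risk:   predicted risk score (higher means worse expected prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if patient i had an observed event
            # strictly before patient j's follow-up time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable


# Hypothetical data, not taken from the study
times = [5, 8, 12, 20, 25]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.5, 0.3, 0.1]

c_index = harrell_c_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.692 is described as moderate.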