Screening for DEGs
The mRNA expression data for AML were obtained from TCGA Acute Myeloid Leukemia Project. The patients were divided into long- and short-survival groups according to the length of survival. We identified 1347 DEGs between the groups, compared to the bone marrow tissue of the long-survival group, with 802 upregulated and 545 downregulated DEGs in the short-survival group (Fig. 1. A). According to the cytogenetic risk, the patients were divided into favorable and intermediate group and poor groups. We identified 5614 DEGs between the groups, compared to the bone marrow tissue of the favorable and intermediate group, with 4129 upregulated and 1485 downregulated DEGs in the poor group (Fig. 1. B). The two classification methods yielded 535 common upregulated DEGs (Fig. 1. C) and 381 common downregulated DEGs (Fig. 1. D).
Two datasets, GSE107465 and GSE34714, were downloaded from the GEO database and used to identify the differences in DEGs between patients with AML and normal individuals. In the GSE107465 dataset, we identified 1119 DEGs, including 355 upregulated and 764 downregulated DEGs, in the bone marrow tissue of patients with AML compared to the bone marrow tissue of the normal group (Fig. 2. A). In the GSE34714 dataset, we identified 531 DEGs, among which 69 were upregulated and 462 were downregulated, in the bone marrow tissue of patients with AML compared to the bone marrow tissue of the normal group (Fig. 2. B). The heatmap shows the expression levels of the top 40 DEGs (Fig. 2. C-D). In the two datasets, there were 3 common upregulated DEGs (Fig. 2. E) and 24 common downregulated DEGs (Fig. 2. F).
Screening for Hub Genes
To further screen the hub genes that affect the prognosis of patients with AML, we imported the selected genes from the DEGs and GEO databases into the STRING database for PPI analysis. In the analysis of DEGs from TCGA database, we obtained Fig. 3. A, with an interaction score of > 0.4. MCODE was used to identify the top 10 hub genes (THBS1, EGF, IL10, CTGF, HGF, MMP7, MMP2, FGF13, CCL2, and IGF1; Fig. 3. B).
For the analysis of DEGs in the GEO database, we obtained Fig. 3. C, with an interaction score of > 0.4. A total of 9 hub genes were identified (THBS1, BMP2, POSTN, TIMP3, HGF, MMP7, MMP2, FGF13, and MMP8). For further research, we selected THBS1, which was identified to have a key role in both analyses.
To further search for genes that interact with THBS1, we divided TCGA database into high- and low-expression groups according to the expression level of THBS1. We found 1037 DEGs between the high- and low THBS1 expression groups, with 673 upregulated and 364 downregulated DEGs (p < 0.05, |Log2-FC |>1) (Fig. 3. D). The relationships between the top 12 DEGs (DKK1, IGSF1, PRKG1, SLC10A2, PCDH15, PRG3, DEFA3, SNAP25, GABRG2, HMSD, PHACTR3, and LAMP5) and THBS1 are shown in Fig. 3. E.
GO and KEGG Enrichment Analyses of DEGs
To further study the biological roles of the identified DEGs, we used the "clusterProfiler" package in R for GO and KEGG enrichment analysis. In the analysis of DEGs in TCGA database, we obtained a total of 338 association pathways and selected the top 20 pathways with high correlation and THBS1 correlation (Table 1). We found that these pathways were mostly related to cellular immunity, intracellular environment, and cell division. Subsequently, by applying GSEA between the high- and low THBS1 expression groups, we found that most biological processes related to the cell microenvironment were significantly enriched in the low-expression group, suggesting that THBS1 expression affects the cell microenvironment (Fig. 3. F).
Table 1
GO and KEGG enrichment analysis in TCGA
ONTOLOGY | ID | Description | GeneRatio | BgRatio | P-value | p.adjust | Q-value |
BP | GO:0034329 | cell junction assembly | 23/441 | 420/18800 | 0.000160489 | 0.018387259 | 0.015783937 |
BP | GO:0050673 | epithelial cell proliferation | 23/441 | 443/18800 | 0.000344579 | 0.027396351 | 0.023517496 |
BP | GO:0050678 | regulation of epithelial cell proliferation | 22/441 | 382/18800 | 0.000108483 | 0.016745406 | 0.014374543 |
BP | GO:0031589 | cell-substrate adhesion | 21/441 | 364/18800 | 0.000151098 | 0.018310515 | 0.015718059 |
BP | GO:0033674 | positive regulation of kinase activity | 21/441 | 476/18800 | 0.004423447 | 0.081675031 | 0.070111243 |
BP | GO:0050878 | regulation of body fluid levels | 18/441 | 382/18800 | 0.004170148 | 0.078119423 | 0.067059049 |
BP | GO:0007162 | negative regulation of cell adhesion | 17/441 | 305/18800 | 0.000923638 | 0.040396048 | 0.034676658 |
BP | GO:1901888 | regulation of cell junction assembly | 15/441 | 201/18800 | 8.38004E-05 | 0.016745406 | 0.014374543 |
BP | GO:0033002 | muscle cell proliferation | 15/441 | 233/18800 | 0.000420126 | 0.030023444 | 0.025772638 |
BP | GO:0007596 | blood coagulation | 14/441 | 221/18800 | 0.000758321 | 0.036944257 | 0.031713582 |
BP | GO:0097191 | extrinsic apoptotic signaling pathway | 14/441 | 221/18800 | 0.000758321 | 0.036944257 | 0.031713582 |
BP | GO:0050817 | coagulation | 14/441 | 226/18800 | 0.000942389 | 0.040396048 | 0.034676658 |
CC | GO:0062023 | collagen-containing extracellular matrix | 38/446 | 429/19594 | 9.02792E-13 | 3.25908E-10 | 2.91744E-10 |
CC | GO:0034774 | secretory granule lumen | 18/446 | 322/19594 | 0.000451573 | 0.019544296 | 0.01749555 |
CC | GO:0060205 | cytoplasmic vesicle lumen | 18/446 | 325/19594 | 0.00050377 | 0.019544296 | 0.01749555 |
CC | GO:0031983 | vesicle lumen | 18/446 | 327/19594 | 0.000541393 | 0.019544296 | 0.01749555 |
MF | GO:1901681 | sulfur compound binding | 21/407 | 267/18410 | 5.58565E-07 | 5.53911E-05 | 4.7919E-05 |
MF | GO:0005539 | glycosaminoglycan binding | 20/407 | 234/18410 | 2.77541E-07 | 3.30274E-05 | 2.85721E-05 |
MF | GO:0008201 | heparin binding | 17/407 | 168/18410 | 2.01437E-07 | 2.99638E-05 | 2.59218E-05 |
KEGG | hsa04151 | PI3K-Akt signaling pathway | 19/194 | 354/8164 | 0.000729217 | 0.057851189 | 0.055522813 |
BP,Biological Process; CC,Cellular Component; MF,Molecular Function; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; Gene sets with p-value < 0.05 and q-value < 0.25 are considered as significant. |
In the analysis of DEGs in the GEO database, we identified 15 associated pathways (Table 2) and found that these pathways were mostly related to cellular immunity and intracellular environment.
Table 2
GO and KEGG enrichment analysis in GEO
ONTOLOGY | ID | Description | GeneRatio | BgRatio | P-value | p.adjust | Q-value |
BP | GO:0043312 | neutrophil degranulation | 9/24 | 485/18670 | 4.63e-09 | 1.45e-06 | 9.93e-07 |
BP | GO:0002283 | neutrophil activation involved in immune response | 9/24 | 488/18670 | 4.88e-09 | 1.45e-06 | 9.93e-07 |
BP | GO:0042119 | neutrophil activation | 9/24 | 498/18670 | 5.82e-09 | 1.45e-06 | 9.93e-07 |
BP | GO:0002446 | neutrophil mediated immunity | 9/24 | 499/18670 | 5.93e-09 | 1.45e-06 | 9.93e-07 |
BP | GO:0006898 | receptor-mediated endocytosis | 5/24 | 316/18670 | 4.39e-05 | 0.009 | 0.006 |
CC | GO:0030667 | secretory granule membrane | 6/25 | 298/19717 | 1.58e-06 | 1.29e-04 | 6.80e-05 |
CC | GO:0101002 | ficolin-1-rich granule | 4/25 | 185/19717 | 8.13e-05 | 0.003 | 0.002 |
CC | GO:0062023 | collagen-containing extracellular matrix | 5/25 | 406/19717 | 1.37e-04 | 0.004 | 0.002 |
CC | GO:1904813 | ficolin-1-rich granule lumen | 3/25 | 124/19717 | 5.05e-04 | 0.010 | 0.005 |
CC | GO:0042581 | specific granule | 3/25 | 160/19717 | 0.001 | 0.016 | 0.008 |
MF | GO:0004859 | phospholipase inhibitor activity | 2/24 | 12/17697 | 1.15e-04 | 0.007 | 0.004 |
MF | GO:0005041 | low-density lipoprotein particle receptor activity | 2/24 | 12/17697 | 1.15e-04 | 0.007 | 0.004 |
MF | GO:0030228 | lipoprotein particle receptor activity | 2/24 | 15/17697 | 1.83e-04 | 0.007 | 0.004 |
MF | GO:0055102 | lipase inhibitor activity | 2/24 | 17/17697 | 2.37e-04 | 0.007 | 0.004 |
MF | GO:0008329 | signaling pattern recognition receptor activity | 2/24 | 20/17697 | 3.30e-04 | 0.007 | 0.005 |
Gene sets with p-value < 0.05 and q-value < 0.25 are considered as significant. |
Relationship between THBS1 and Infiltration of Immune Cells
To further explore the role of THBS1 in cellular immunity, we conducted an immune correlation analysis using TCGA database. We investigated the correlation between THBS1 expression and the relative abundance of 24 immune cells (Fig. 4. A) and found that the enrichment scores of activated dendritic cells (aDCs), B cells, CD8 + T cells, and Th1 helper cells in the high THBS1 expression group were significantly higher than those in the low THBS1 expression group (Fig. 4. B-E). THBS1 expression was significantly and positively correlated with the infiltration level of immune cells in aDCs (r = − 0.196, p = 0.016), B cells (r = 0.165, p = 0.043), CD8 + T cells (r = 0.270, p < 0.001), and Th1 helper cells (r = − 0.286, p < 0.001; Fig. 4. F-I).
Construction and Validation of a Nomogram Based on the Independent Factors
To predict the prognosis of AML, a nomogram based on the independent factors of OS was generated. In the nomogram, a higher total number of points was associated with worse prognosis (Fig. 5. A). We found that in addition to clinically recognized cytogenetic risk, patient age had the greatest impact on prognosis. In addition, we used 1-year, 3-year, and 5-year survival calibration curves to evaluate the predictive effect of the nomogram (Fig. 5. B-D). These results indicated that the nomogram was appropriate. Thus, the age of patients with AML has a significant impact on their prognosis.
Clinical Correlation Analysis of THBS1 Expression
We conducted a pan-cancer analysis in TCGA database and found that THBS1 expression was significantly different between healthy people and patients with most types of cancer (Fig. 6. A). The mRNA expression in the tissues of patients with AML was significantly higher than that in healthy people (p < 0.001; Fig. 6. B). The baseline clinical data are presented in Table 3. Clinical data showed that THBS1 expression was positively correlated with patient age (p = 0.025; Fig. 6. C). Through survival analysis, we found that the OS of patients in the high THBS1 expression group was significantly lower than that of patients in the low THBS1 expression group (p = 0.01; Fig. 6. D). We plotted an ROC curve to validate the diagnostic value of THBS1, and the results showed that THBS1 upregulation had a high diagnostic value in AML (area under the curve = 0.949, 95%CI: 0.921–0.977) (Fig. 6. E).
Table 3
Clinicopathological characteristics of TCGA(n = 151).
Characteristics | Total(N) | Odds Ratio(OR) | P value |
Gender (Male vs. Female) | 151 | 1.269 (0.668–2.421) | 0.467 |
Race (White vs. Asian&Black or African American) | 149 | 1.997 (0.655–6.788) | 0.236 |
Age (> 60 vs. <=60) | 151 | 2.240 (1.163–4.383) | 0.017 |
WBC count(x10^9/L) (> 20 vs. <=20) | 150 | 0.525 (0.272-1.000) | 0.051 |
BM blasts(%) (> 20 vs. <=20) | 151 | 0.466 (0.237–0.902) | 0.025 |
PB blasts(%) (> 70 vs. <=70) | 151 | 0.601 (0.314–1.142) | 0.122 |
Cytogenetic risk (Intermediate&Poor vs. Favorable) | 149 | 3.103 (1.354–7.635) | 0.010 |
FLT3 mutation (Positive vs. Negative) | 147 | 0.957 (0.473–1.933) | 0.901 |
IDH1 R132 mutation (Positive vs. Negative) | 149 | 0.407 (0.106–1.314) | 0.150 |
IDH1 R140 mutation (Positive vs. Negative) | 149 | 0.310 (0.067–1.089) | 0.089 |
RAS mutation (Positive vs. Negative) | 150 | 0.583 (0.116–2.468) | 0.472 |
NPM1 mutation (Positive vs. Negative) | 150 | 2.415 (1.094–5.597) | 0.033 |
Cytogenetics (+ 8&del(5)&del(7)&inv(16)&t(15;17)&t(8;21)&t(9;11)&Complex vs. Normal) | 135 | 0.534 (0.268–1.053) | 0.072 |
Bold values denote p < 0.05. |
Univariate Cox regression analysis showed that age (p < 0.001), cytogenetic risk (intermediate) (p = 0.002), and cytogenetic risk (poor) (p < 0.001) were associated with OS in patients with AML (Table 4). By incorporating the results of the univariate Cox regression analysis into the multivariate Cox regression analysis, we found that the risk of disease increased with age (p < 0.001, 95%CI: 1.783–4.818).
Table 4
Univariate and Multivariate analysis associations with overall survival and clinicopathologic characteristics in TCGA patients(using Cox regression.n = 151).
Characteristics | Total(N) | Univariate analysis | | Multivariate analysis |
Hazard ratio (95% CI) | P value | Hazard ratio (95% CI) | P value |
Gender | 140 | | | | | |
Female | 63 | Reference | | | | |
Male | 77 | 1.030 (0.674–1.572) | 0.892 | | | |
Age | 140 | | | | | |
<=60 | 79 | Reference | | | | |
> 60 | 61 | 3.333 (2.164–5.134) | < 0.001 | | 3.175 (1.956–5.154) | < 0.001 |
Race | 138 | | | | | |
Black or African American | 10 | Reference | | | | |
White | 127 | 1.382 (0.506–3.776) | 0.529 | | | |
Asian | 1 | 2.954 (0.328–26.603) | 0.334 | | | |
WBC count(x10^9/L) | 139 | | | | | |
<=20 | 75 | Reference | | | | |
> 20 | 64 | 1.161 (0.760–1.772) | 0.490 | | | |
BM blasts(%) | 140 | | | | | |
<=20 | 59 | Reference | | | | |
> 20 | 81 | 1.165 (0.758–1.790) | 0.486 | | | |
PB blasts(%) | 140 | | | | | |
<=70 | 66 | Reference | | | | |
> 70 | 74 | 1.230 (0.806–1.878) | 0.338 | | | |
Cytogenetic risk | 138 | | | | | |
Favorable | 31 | Reference | | | | |
Intermediate | 76 | 2.957 (1.498–5.836) | 0.002 | | 1.745 (0.769–3.959) | 0.183 |
Poor | 31 | 4.157 (1.944–8.893) | < 0.001 | | 2.131 (0.822–5.520) | 0.119 |
FAB classifications | 136 | | | | | |
M0 | 14 | Reference | | | | |
M1 | 30 | 1.100 (0.497–2.433) | 0.814 | | 1.701 (0.742–3.899) | 0.210 |
M2 | 34 | 1.051 (0.481–2.299) | 0.900 | | 1.355 (0.588–3.124) | 0.476 |
M3 | 15 | 0.295 (0.090–0.963) | 0.043 | | 0.805 (0.198–3.278) | 0.762 |
M4 | 28 | 1.173 (0.533–2.581) | 0.691 | | 1.695 (0.722–3.979) | 0.225 |
M5 | 15 | 1.797 (0.723–4.464) | 0.207 | | 3.123 (1.189–8.201) | 0.021 |
Bold values denote p < 0.05. |
We plotted the multivariate Cox regression analysis on a forest map (Fig. 6. F), and the results showed that age is the only independent risk factor for the prognosis of patients with AML. Among these, THBS1 is a risk factor and patients with high THBS1 expression have a poorer prognosis than those with low THBS1 expression.
Survival analysis revealed that the OS of elderly patients with AML in the high THBS1 expression group was significantly lower than that of elderly patients with AML in the low THBS1 expression group (p = 0.038; Fig. 6. G). Therefore, THBS1 can be used as an age-related factor and plays a role in predicting the prognosis of patients with AML.
Association of THBS1 Mutations and Methylation Levels with Age in AML
AML progression and immune tolerance are both affected by genomic and epigenetic changes, and THBS1 expression varies significantly among different tumors[39–42]. To further investigate the cause for this difference, we used cBioPortal to explore the mutation status of THBS1 in different tumors (Fig. 7. A). The results showed that THBS1 had a relatively high frequency of mutations in melanoma, lung cancer, and esophageal cancer.
We further explored the effect of DNA methylation in the promoter region on the expression of THBSs family (Fig. 7. B). We found that THBS1 and THBS2 have high methylation levels in various tumors. THBS1 showed high methylation levels in AML (Fig. 7. C); abnormal methylation levels of THBS1 in AML may affect the expression of THBS1 protein.
To further determine the correlation between THBS1 methylation and age in AML, we used the UALCAN database. We found that THBS1 methylation levels decreased with age (Fig. 7. D). THBS1 methylation levels in patients aged > 60 years were significantly lower than those in patients aged ≤ 60 years (p = 0.01). This indicates a significant correlation between THBS1 methylation levels and age.
Clinical Validation of the Correlation between THBS1 and Age Factors in AML
To further explore the correlation between THBS1 and age of patients with AML, we collected the clinical data and bone marrow tissues from 10 healthy individuals and patients with AML for clinical analysis. As shown in Table 5, we analyzed the sex, age, peripheral blood leukocytes, peripheral blood primitive cells, and bone marrow primitive cells of patients with AML. The results showed that only age and cytogenetic risk were associated with prognosis (p = 0.004).
Table 5
Associations of THBS1 expression with clinicopathological characteristics (n = 20).
Characteristics | Intermediate | Favorable | Poor | P value |
n | 4 | 6 | 10 | |
Gender, n (%) | | | | 0.850 |
Male | 2 (10%) | 2 (10%) | 5 (25%) | |
Female | 2 (10%) | 4 (20%) | 5 (25%) | |
Age, n (%) | | | | 0.004 |
<=60 | 3 (15%) | 6 (30%) | 2 (10%) | |
> 60 | 1 (5%) | 0 (0%) | 8 (40%) | |
WBC counts(x10^9/L), n (%) | | | | 0.850 |
> 20 | 2 (10%) | 2 (10%) | 5 (25%) | |
<=20 | 2 (10%) | 4 (20%) | 5 (25%) | |
BM blasts(%), n (%) | | | | 0.395 |
> 20 | 4 (20%) | 4 (20%) | 9 (45%) | |
<=20 | 0 (0%) | 2 (10%) | 1 (5%) | |
PB blasts(%), n (%) | | | | 1.000 |
> 70 | 3 (15%) | 4 (20%) | 6 (30%) | |
<=70 | 1 (5%) | 2 (10%) | 4 (20%) | |
Bold values denote p < 0.05. |
The mRNA level in the bone marrow tissue of healthy people was significantly lower than that in the poor group (p = 0.0007), whereas the mRNA level in the bone marrow tissue of the favorable and intermediate group was significantly lower than that in the poor group (p = 0.0017; Fig. 8. A). The higher the THBS1 mRNA expression, the higher the cytogenetic risk of the patient. However, after testing the THBS1 protein levels in the three groups using ELISA, no significant difference was found among the groups (Fig. 8. B). This may be related to a mutation in THBS1.
In this study, the recurrence-free survival (RFS) of the high THBS1 expression group was lower than that of the low THBS1 expression group (p = 0.03; Fig. 8. C). Although the OS of the two groups was not significantly different, the survival curve of the high THBS1 expression group was lower than that of the low THBS1 expression group (Fig. 8. D).
We divided the patients into a younger group (age ≤ 60 years; n = 11) and an older group (age: >60; n = 9) to analyze their RFS and OS. The results showed that the RFS of the younger group was higher than that of the older group (p = 0.02; Fig. 8. E), as was the OS (p = 0.029; Fig. 8. F). Thus, increased age is an unfavorable factor for the prognosis of AML.
To verify the correlation between THBS1 and age of patients with AML, we further evaluated the difference in THBS1 expression between the younger and older groups. THBS1 mRNA expression in the bone marrow tissue of the younger group was significantly lower than that of the older group (p < 0.001; Fig. 8. G), whereas the protein expression in the bone marrow tissue of the younger group was lower than that of the older group (p = 0.0214; Fig. 8. H).
These results confirm the correlation between THBS1 and age of patients with AML, which can predict the prognosis of these patients.