Clinical analysis of D1 (GSE50948)
The differential expression analysis of breast cancer subtypes (ER-negative vs positive, PR negative vs positive, and HER2 negative vs positive) was performed, and a total of 2547 differentially expressed genes (p = < 0.05) were obtained. The differentially expressed genes were matched with lipid metabolism genes (n = 1035) and screened 273 common genes (Figure 2a). After processing, the mRNA expression values of 273 lipid metabolism genes were evaluated against all the available clinical features. Among these 273 genes, 114 lipid metabolic genes were significantly associated with most of the clinical parameters. Out of 114 genes, 53 genes showed overexpression in ER-negative samples, while 51 showed downregulation. In HER2 negative samples, 18 genes were upregulated, and 16 genes showed downregulation. Similarly, 39 lipid metabolic genes were overexpressed, and 15 genes showed downregulation in PR negative samples. Furthermore, most genes were significantly associated with grade, tumor size, age, residual, and menopausal status, detailed in additional file 2, Table S1. On subsequent analysis, 26 genes (8 downregulated, 18 upregulated) were significantly associated with tumor grade and other clinical factors, (Additional file 1, table S3).
Clinical analysis of D2 (GSE20194)
Like cohort 1 the differential expression analysis of breast cancer subtypes (ER-negative vs positive, PR negative vs positive, and HER2 negative vs positive) was performed, and a total of 3801 differentially expressed genes (p = < 0.05) were obtained. The differentially expressed genes were matched with lipid metabolism genes (n = 1035) and screened 510 common genes Figure 2a. Among 510 lipid metabolism genes, 114 genes were common with cohort 1 and were significantly associated with different clinical parameters. Genes significantly associated with breast cancer subtypes ER, HER2, and PR were 112, 54, and 89, respectively. In ER-negative sample, 62 genes were downregulated, and 51 genes showed upregulation, while in HER2 negative class 33 lipid genes were downregulated and 20 genes were upregulated. Moreover, 48 genes were downregulated, and 41 lipid genes showed upregulation in PR-negative patients. On subsequent analysis, 96 genes were associated with tumor grade whereas 21 genes were common with cohort 1. According to the clinical analysis results, high expression of 15 genes (ALDH3A2, PTDSS1, CERS6, ITPR1, IRS1, ABAT, ADRA2A, FOXA1, AREG, CCND1, IGF1R, MYB, CSAD, ESR1, ERLIN2), and low expression of 6 genes (SRD5A1, PSAT1, CXCL9, RARRES1, PDXK, UGP2) were significantly (P > 0.05) associated with early tumor grade (Additional file 1, table S4). Detailed association with other clinicopathological parameters is available in Additional file 2, Table S2.
Clinical analysis of D3 (GSE53031)
After differential expression analysis, 4100 differentially expressed genes were identified and matched with 1035 lipid metabolism genes. After processing, 476 lipid metabolism genes were identified and subjected to clinical analysis. Moreover, the genes signature of 114 lipid metabolism genes, common in cohorts 1 and 2 were also significantly correlated with different clinicopathological parameters in discovery cohort 3. Briefly, genes significantly (p = <0.05) correlated with ER, HER2, and PR subtypes are 110, 20, and 106 respectively. Out of 114 genes, 55 lipid metabolic genes showed overexpression, and 55 genes were downregulated in ER-negative samples. In HER2 negative samples, 4 genes were downregulated, and 16 genes were overexpressed. Similarly, in PR negative samples 59 genes were downregulated and 47 genes showed overexpression. Moreover, 45 genes were downregulated, and 48 genes showed overexpression in early tumor grade (Additional file 2, Table S3). Furthermore, genes commonly significant in discovery cohorts 1 and 2 also showed significant association with tumor grade and other clinicopathological factors in discovery cohort 3 (Additional file 1 table S5. The genes commonly associated with clinical features in all discovery cohorts were subjected to Kaplan Meier survival analysis. Survival analysis results revealed that overexpression of only ABAT (P = 0.0089), was associated with better relapse-free survival (RFS) figure 2b.
Cox regression survival analysis of discovery cohort 3
Furthermore, univariate, and multivariate cox regression survival analysis was conducted to analyze the relevant hazard ratio, 95% confidence interval of HR, and p-value. According to the univariate analysis of discovery cohort 3, relapse-free survival was positively associated with ABAT (HR 0.420, CI 0.215-0.822, p-value=0.011, while RFS was negatively associated with grade (HR 2.357, CI 1.148-4.840, p-value= 0.020), and node classification (HR 1.974, CI 1.047-3.723, p-value= 0.036), shown in table 2. Moreover, the corresponding multivariable analysis showed that FRS was also significantly associated with tumor grade (HR 5.082, CI 1.089-23.709, p-value = 0.039), and node classification (HR 3.721, CI 1.266-10.936, P-value = 0.017).
Table 2: univariate and multivariate analysis of discovery cohort 3
Factors
|
Univariate analysis
|
Multivariate analysis
|
HR
|
95% CI
|
P
|
HR
|
95% CI
|
P
|
Lower
|
Upper
|
Lower
|
Upper
|
ABAT
|
0.420
|
0.215
|
0.822
|
0.011
|
0.313
|
0.072
|
1.349
|
0.119
|
Grade
|
2.357
|
1.148
|
4.840
|
0.020
|
5.082
|
1.089
|
23.709
|
0.039
|
N classification
|
1.974
|
1.047
|
3.723
|
0.036
|
3.721
|
1.266
|
10.936
|
0.017
|
Trimester
|
.962
|
0.562
|
1.646
|
.887
|
0.819
|
0.431
|
1.556
|
0.531
|
Age
|
.902
|
0.492
|
1.654
|
.739
|
2.043
|
0.683
|
6.107
|
0.683
|
Tumor size
|
1.170
|
0.624
|
2.193
|
0.625
|
1.459
|
1.459
|
0..496
|
0.496
|
Survival analysis of validation cohort 1 (GSE42568)
Kaplan-Meier survival analysis of validation cohort 1 (GSE42568), revealed several genes of 21 genes signature were associated with RFS and OS (Additional file 1, figure S1a, S1b). Expression of ABAT, PSAT1, and SRD5A1 was evaluated against both RFS and OS. Kaplan Meier survival analysis revealed that high expression of ABAT (P = 0.0001) (figure 2c) was associated with better RFS while overexpression of SRD5A1 (P = 0.023) (Figure 2d) and PSAT1 (P = 0.0034) (Figure 2e) was associated with poor RFS. Moreover, overexpression of ABAT (P = 0.00015) (figure 2f), was significantly associated with good OS whereas overexpression of SRD5A1 (P = 0.015) (figure 2g) PSAT1 (P = 0.037) was associated with poor OS (figure 2h).
Cox regression survival analysis of validation cohort 1
Univariable survival analysis of validation cohort 1 showed that overall survival was positively associated with ABAT (HR 0.529, CI 0.373-0.752, P = 0.000), and negative association with SRD5A1 (HR 2.277, CI 1.155-4.492, P = 0.018), PSAT1 (HR 1.417, CI 1.015-1.979, P = 0.041), grade (HR 3.960, CI 1.849-8.485, P = 0.000), node classification (HR 3.987, CI 1.739-9.143, P = 0.001), tumor size (HR 2.028, CI 1.030-3.996, P = 0.041) with 95 CI and hazard ratio presented in table 3. Furthermore, multivariable analysis also showed that ABAT (HR 0.574, CI 0.371-0.888, p = 0.013), grade (HR 2.440, CI 1.064-5.596, P= 0.035), and node classification (HR 3.036, CI 1.272-7.244, P = 0.012) was significantly correlated with overall survival.
Table 3: univariate and multivariate analysis of validation cohort 1
Factors
|
Univariate analysis
|
Multivariate analysis
|
HR
|
95% CI
|
P
|
HR
|
95% CI
|
P
|
Lower
|
Upper
|
Lower
|
Upper
|
ABAT
|
0.529
|
0.373
|
0.752
|
0.000
|
0.574
|
0.371
|
0.888
|
0.013
|
SRD5A1
|
2.277
|
1.155
|
4.492
|
0.018
|
1.334
|
0.656
|
2.714
|
0.426
|
PSAT1
|
1.417
|
1.015
|
1.979
|
0.041
|
0.854
|
0.564
|
1.304
|
0.472
|
Grade
|
3.960
|
1.849
|
8.485
|
0.000
|
2.440
|
1.064
|
5.596
|
0.035
|
N classification
|
3.987
|
1.739
|
9.143
|
0.001
|
3.036
|
1.272
|
7.244
|
0.012
|
Tumor size
|
2.028
|
1.030
|
3.996
|
0.041
|
1.521
|
0.750
|
3.083
|
0.245
|
Survival analysis of validation cohort 2 (TCGA-PAN Cancer)
Kaplan Meier survival analysis of TCGA data showed that several genes were associated with disease-free survival and overall survival (Additional file 1, figure S2a, S2b). Consistent with discovery and validation cohort 1, overexpression of ABAT (P = 0.0066) was also significantly associated with better disease-free survival (DFS) in validation cohort 2 (Figure 2i). Furthermore, overexpression of SRD5A1 (P = 0.019) was correlated with poor DFS (Figure 2j). Similarly, overexpression of PSAT1 (P = 0.039) were associated with poor OS (Figure 2k).
Cox regression survival analysis of validation cohort 2
According to the univariate analysis of validation cohort 2, ABAT (HR 0.744, CI 0.599-0.923, P= 0.007), SRD5A1 (HR 1.663, CI 1.082-2.556, P = 0.020), PSAT1 (HR 1.182, CI 1.008-1.387, P= 0.040), tumor classification (HR 0.760, CI 0.633-0.912, P= 0.003), node classification (HR 0.679, CI 0.560-0.823, P= 0.000), metastasis (HR 0.473, CI 0.366-0.612, P= 0.000), age (HR 1.794, CI 1.299-2.479, P= 0.000), and tumor stage (HR 0.609, CI 0.516-0.720, P= 0.000) was significantly associated with overall survival, shown in (Table 4). Moreover, multivariable analysis revealed that PSAT (HR 1.743, CI 1.142-2.658, P= 0.010), metastasis (HR 0.646, CI 0.481-0.867, P= 0.004), age (HR 2.075, CI 1.454-2.962, P= 0.000), and tumor stage (HR 0.325, CI 0.174-0.610, P= 0.000), are the independent prognostic factors associated with overall survival in the validation cohort 2.
Table 4: univariate and multivariate analysis of validation cohort 2 (TCGA data)
Factors
|
Univariate analysis
|
Multivariate analysis
|
HR
|
95% CI
|
P
|
HR
|
95% CI
|
P
|
Lower
|
Upper
|
Lower
|
Upper
|
ABAT
|
0.744
|
0.599
|
0.923
|
0.007
|
0.979
|
0.637
|
1.504
|
0.921
|
SRD5A1
|
1.663
|
1.082
|
2.556
|
0.020
|
1.210
|
0.814
|
1.798
|
0.346
|
PSAT1
|
1.182
|
1.008
|
1.387
|
0.040
|
1.743
|
1.142
|
2.658
|
0.010
|
T classification
|
0.760
|
0.633
|
0.912
|
0.003
|
1.108
|
0.820
|
1.872
|
0.700
|
N classification
|
0.679
|
0.560
|
0.823
|
0.000
|
1.138
|
0.634
|
2.043
|
0.665
|
Metastasis
|
0.473
|
0.366
|
0.612
|
0.000
|
0.646
|
0.481
|
0.867
|
0.004
|
Age
|
1.794
|
1.299
|
2.479
|
0.000
|
2.075
|
1.454
|
2.962
|
0.000
|
Tumor stage
|
0.609
|
0.516
|
0.720
|
0.000
|
0.325
|
0.174
|
0.610
|
0.000
|
In-vitro validation of ABAT, SRD5A1, and PSAT1
ABAT, SRD5A1, and PSAT1's relative mRNA expression was validated in MCF10A (normal breast epithelial cells) and SK-BR3, MDA-MB-231, as well as BT549 breast cancer cell lines by quantitative polymerase chain reaction. The mRNA expression of ABAT was significantly downregulated, while PSAT1 and SRD5A1 mRNA expression was significantly upregulated in all tested BC cells compared to control (MCF10A) (Figure 3). This data further suggested the role of ABAT as a good prognostic marker, whereas SRD5A1 and PSAT1 overexpression suggest their key role as poor prognostic markers.
Functional enrichment and pathway annotation
GO enrichment and pathway enrichment analysis were performed to better understand the function of genes common in all cohorts. In the GO biological process, lipid metabolic process, response to oxygen, cellular response to lipid, lipid biosynthetic process, response to inorganic compounds, and lipid metabolic process were the most enriched terms (figure 4a). KEGG pathway analysis showed that multiple genes were enriched in pathways in cancer, metabolic pathways, microRNAs in cancer, HIF-1 signaling, prostate, and breast cancer, (figure 4b). In Reactome pathway analysis, the metabolism of lipids and several oncogenic pathways were enriched, (figure 4c). Furthermore, in molecular function analysis, the top enriched terms were kinase binding, protein kinase, transcription factor, insulin-like growth factor-1, and cholesterol binding were the top enriched terms (figure 4d).