3.1. Understanding the role of APOBEC3B and the associated genes and pathways in breast cancer
To achieve the goal stated in the previous section, we collected large expression and mutation datasets from GEO, OncoLnc, and TCGA and analyzed them in silico; the overall workflow is summarized in Fig. 1a. We performed Cox regression analysis of APOBEC3B across different cancer types (Table 1). Using a protein-protein interaction database, we identified the genes directly associated with APOBEC3B and their respective pathways, and we present the combined network of these genes and pathways in Fig. 1b. A network of genes co-expressed with APOBEC3B and their associated pathways is presented in Fig. 1c. Fig. 1b and 1c indicate that p53 signaling, cell cycle, oocyte meiosis, major cancer signaling, ubiquitin-mediated proteolysis, TLR signaling, chemokine signaling, antigen processing and presentation, regulation of actin cytoskeleton, neurotrophin, MAPK, BCR signaling, a number of metabolism-associated signaling pathways, and calcium signaling are potentially affected by alterations in APOBEC3B expression, mutation, or both. Previous work has shown that, in a number of normal and malignant cells, an increased GSH level is associated with a proliferative response and is essential for cell cycle progression; that the regulation of actin cytoskeleton pathway plays a role in cancer cell migration and invasion of the ECM; and that the p53, MAPK, neurotrophin, chemokine, and calcium signaling pathways, when altered in cancer cells, are involved in tumor initiation, angiogenesis, progression, and metastasis. These pathways are known to play pivotal roles in cancer cell migration and proliferation, and we observed them to be strongly associated with APOBEC3B.
Table 1
Cox regression analysis for APOBEC3B.
| Cancer | Cox Coefficient | P-Value | FDR Corrected | Rank | Median Expression | Mean Expression |
|---|---|---|---|---|---|---|
| BLCA | 0.004 | 9.60E-01 | 9.82E-01 | 15906 | 362.19 | 603.54 |
| BRCA | 0.112 | 2.10E-01 | 5.89E-01 | 5758 | 162.02 | 321.98 |
| CESC | -0.608 | 1.30E-05 | 2.55E-02 | 8 | 755.45 | 924.17 |
| COAD | -0.094 | 3.40E-01 | 7.24E-01 | 7557 | 214.67 | 245.31 |
| ESCA | -0.048 | 7.20E-01 | 9.84E-01 | 12129 | 239.99 | 314.7 |
| GBM | 0.004 | 9.60E-01 | 9.92E-01 | 16161 | 67.15 | 86.21 |
| HNSC | -0.028 | 7.10E-01 | 8.90E-01 | 13181 | 467.44 | 691.26 |
| KIRC | 0.19 | 2.40E-02 | 6.18E-02 | 6421 | 61.4 | 80.76 |
| KIRP | 0.59 | 5.50E-04 | 6.78E-03 | 1328 | 50.53 | 73.29 |
| LAML | -0.018 | 8.70E-01 | 9.58E-01 | 13732 | 160.13 | 206.39 |
| LGG | 0.216 | 2.00E-02 | 4.32E-02 | 7720 | 17.17 | 25.49 |
| LIHC | 0.093 | 2.90E-01 | 5.53E-01 | 8176 | 83.16 | 161.41 |
| LUAD | -0.015 | 8.40E-01 | 9.23E-01 | 15194 | 135.46 | 255.45 |
| LUSC | -0.117 | 7.80E-02 | 6.10E-01 | 2146 | 491.14 | 651.6 |
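As a reading aid for Table 1, a Cox coefficient can be converted into a hazard ratio with HR = exp(coefficient): a negative coefficient corresponds to HR below 1 (higher APOBEC3B expression associated with lower hazard), a positive one to HR above 1. The minimal sketch below applies this to a few rows copied from Table 1 and flags those whose FDR-corrected p-value falls below 0.05. The 0.05 cutoff and the function names are illustrative assumptions and are not stated in the analysis above.

```python
import math

# (cancer, Cox coefficient, FDR-corrected p-value), copied from Table 1
TABLE1_ROWS = [
    ("BRCA", 0.112, 5.89e-01),
    ("CESC", -0.608, 2.55e-02),
    ("KIRC", 0.19, 6.18e-02),
    ("KIRP", 0.59, 6.78e-03),
    ("LGG", 0.216, 4.32e-02),
]

FDR_CUTOFF = 0.05  # assumed significance cutoff, not taken from the paper


def hazard_ratio(coef):
    """Hazard ratio implied by a Cox regression coefficient."""
    return math.exp(coef)


def summarize(rows, cutoff=FDR_CUTOFF):
    """Return (cancer, hazard ratio, passes FDR cutoff) for each row."""
    return [(cancer, hazard_ratio(coef), fdr < cutoff) for cancer, coef, fdr in rows]


for cancer, hr, significant in summarize(TABLE1_ROWS):
    flag = "FDR < 0.05" if significant else "not significant"
    print(f"{cancer}: HR = {hr:.3f} ({flag})")
```

The hazard ratio is expressed per unit of whatever expression scale the regression used, so it should be compared across cancers only with that caveat in mind.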
Extending our analysis, we classified the co-expressed genes as positively or negatively co-expressed, applying a correlation threshold of greater than +0.5 or less than −0.5. Such positive and negative correlations in expression provide important information about the dependence of the genes' expression on each other.
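The thresholding step can be sketched as follows. This is a minimal illustration that assumes Pearson correlation (the correlation measure is not specified above) and uses toy expression values with placeholder gene names; `pearson` and `classify_coexpression` are hypothetical helper names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sd_x * sd_y)


def classify_coexpression(target, others, threshold=0.5):
    """Split genes into positively and negatively co-expressed sets.

    A gene is kept only if its correlation with the target profile is
    greater than +threshold or less than -threshold.
    """
    positive, negative = {}, {}
    for gene, values in others.items():
        r = pearson(target, values)
        if r > threshold:
            positive[gene] = r
        elif r < -threshold:
            negative[gene] = r
    return positive, negative


# Toy expression profiles across five samples (hypothetical values).
apobec3b = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "GENE_UP": [2.0, 4.1, 5.9, 8.2, 10.0],
    "GENE_DOWN": [9.0, 7.2, 5.1, 2.8, 1.0],
    "GENE_FLAT": [3.0, 1.0, 3.0, 1.0, 3.0],
}

pos, neg = classify_coexpression(apobec3b, candidates)
print("positively co-expressed:", sorted(pos))
print("negatively co-expressed:", sorted(neg))
```

Genes whose correlation falls between −0.5 and +0.5, such as the flat profile here, are dropped from both sets.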
3.2. Mutational profiling and their functional impact in breast cancer
In the previous section, we explored the genes and pathways fundamentally associated with APOBEC3B. Here, we performed mutational profiling of breast cancer genes, together with the respective enriched pathways, for two large clinical datasets from the TCGA database. The top-ranked mutated genes of the two datasets share a large number of common genes (Fig. 2a), and the number of common genes decreases further down the ranking. The commonly enriched pathways are presented with their respective p-values (Fig. 2b), and a Venn diagram is shown for both the genes and the enriched pathways (Fig. 2c). PIK3CA, TP53, MUC16, TTN, AHNAK2, SYNE1, CDH1, KMT2C, GATA3, and others are among the top-ranked genes, and MAPK, calcium signaling, cAMP, PI3K-AKT, focal adhesion, adrenergic signaling, thyroid hormone, oxytocin, ErbB, ubiquitin, apelin, tight junction, GnRH, Ras, cGMP-PKG, cell cycle, and pluripotency of stem cells are among the commonly enriched pathways. Overall, 42 mutated genes and 18 enriched pathways were common to both datasets; 131 mutated genes and 41 enriched pathways were specific to dataset 1, and 188 mutated genes and two enriched pathways were specific to dataset 2. The pathways enriched in dataset 1 include a number that belong directly to the immune system: TCR, BCR, TLR, NK cell-mediated cytotoxicity, TNF, TGF, cytokine-cytokine receptor interaction, and leukocyte transendothelial migration. Ubiquitin-mediated proteolysis is common to both mutational datasets.
In Fig. 2c, we present the detailed analysis of the clinically significant top 100 genes and their associated biological functions, where we observe that CD40LG is directly associated with 10 pathways, most of which are part of the immune system and are known to control a number of leading human diseases, including cancers. CD40LG (CD40 ligand) is a protein-coding gene. The diseases associated with CD40LG are mainly immunodeficiency with hyper-IgM, type 1, and toxoplasmosis. It acts as a ligand for integrins, mainly ITGA5:ITGB1 and ITGAV:ITGB3; both the integrins and the CD40 receptor are required for activation of CD40-CD40LG signaling, which has cell-type-dependent effects such as B-cell activation, NF-kB signaling, and anti-apoptotic signaling.
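The common and dataset-specific counts reported above correspond to the three regions of a two-set Venn diagram. A minimal sketch of that partition is given below; the dataset-specific gene names are placeholders, and `venn_partition` is a hypothetical helper, not the tool used in the study.

```python
def venn_partition(items_a, items_b):
    """Partition two collections into common and set-specific members."""
    a, b = set(items_a), set(items_b)
    return {
        "common": a & b,   # present in both datasets
        "only_a": a - b,   # specific to the first dataset
        "only_b": b - a,   # specific to the second dataset
    }


# PIK3CA and TP53 are named in the text as top-ranked genes; the
# dataset-specific entries are placeholders for illustration only.
dataset1_genes = ["PIK3CA", "TP53", "GENE_D1_A", "GENE_D1_B"]
dataset2_genes = ["PIK3CA", "TP53", "GENE_D2_A"]

regions = venn_partition(dataset1_genes, dataset2_genes)
for name, members in regions.items():
    print(name, len(members), sorted(members))
```

If the reported counts describe such a partition, they imply 42 + 131 = 173 mutated genes considered in dataset 1 and 42 + 188 = 230 in dataset 2.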
Furthermore, we analyzed the APOBEC3B mutational signature (C->T) in breast cancer and observed that this mutation pattern occurs predominantly in the missense and exon categories, followed by upstream, synonymous, intron, 3’ UTR, and 5’ UTR (Supplementary data S2).
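A tally of this kind can be sketched as follows, assuming each mutation record carries a reference allele, an alternate allele, and a variant category. The field names (`ref`, `alt`, `category`) and the toy records are hypothetical and do not reflect the format of Supplementary data S2.

```python
from collections import Counter


def count_c_to_t(variants):
    """Count C->T substitutions per variant category, most frequent first.

    Only records reported as C->T are counted; the complementary G->A
    change on the opposite strand is deliberately not folded in here.
    """
    counts = Counter()
    for variant in variants:
        if variant["ref"] == "C" and variant["alt"] == "T":
            counts[variant["category"]] += 1
    return counts.most_common()


# Toy records for illustration only (not real calls).
toy_variants = [
    {"ref": "C", "alt": "T", "category": "missense"},
    {"ref": "C", "alt": "T", "category": "missense"},
    {"ref": "C", "alt": "T", "category": "synonymous"},
    {"ref": "G", "alt": "A", "category": "intron"},
    {"ref": "C", "alt": "G", "category": "missense"},
]

for category, n in count_c_to_t(toy_variants):
    print(category, n)
```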
3.3. Functional and clinical significance of APOBEC3B and the associated genes in breast cancer
After analyzing the APOBEC3B-related genes and pathways, including the mutated genes and the functions altered by mutation, we analyzed the overall list of genes that show clinical significance in terms of overall patient survival (Fig. 3). The top 100 genes are presented with their respective p-values (Fig. 3a), followed by the PANTHER protein classes of these top-ranked genes (Fig. 3b) and a network of the top-ranked genes and their associated pathways (Fig. 3c). In terms of overall survival, MCTS1, OVOS2, MAPT-IT1, ATG4A, SLC16A2, SLC35A2, VDAC1, TBC1D24, RP11-214F16.8, and PDP1 show extremely high significance. According to the PANTHER protein classification, the majority of these top-ranked genes belong to the metabolite interconversion enzyme, transporter, protein modifying enzyme, defense/immunity protein, gene-specific transcriptional regulator, and membrane traffic protein classes; metabolite interconversion enzyme contains the highest number of genes, followed by transporter and protein modifying enzyme (Fig. 3b). In the network of these top-ranked genes and their associated functions, CD40LG appears to control 10 pathways, most of which belong to immune signaling. The majority of the pathways associated with these top-ranked genes are well established as controlling major human diseases, including multiple types of cancer (among them breast cancer), neurodegenerative diseases, diabetes, and infectious diseases (Fig. 3c). This leads to the conclusion that the immune system and its critical components are mainly affected in breast cancer. The clinical details are provided in Supplementary data S3.
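The text does not state which test produced the survival p-values. As one common possibility, the sketch below implements a two-group log-rank test, for example comparing patients with high versus low expression of a gene. The grouping, the toy follow-up times, and the function name `logrank_test` are assumptions made for illustration only.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*  : follow-up times for each patient in the group
    events_* : 1 if death was observed, 0 if the patient was censored
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Deaths occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy follow-up data in months (hypothetical): a "high expression" group
# with early deaths and a "low expression" group with later deaths.
high_times, high_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
low_times, low_events = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]

chi2, p = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Ranking genes by such a p-value would give an ordering comparable in spirit to the top-100 list in Fig. 3a, although a Cox model with covariates, as used for Table 1, could give different results.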