3.1 Gene expression analysis data
In this study, we aimed to investigate the oncogenic role of human NF-κB1 (NM_001165412 for mRNA, NP_001158884.1 for protein; Fig. S1a). As shown in Fig. S1b, the structure of the NF-κB1 protein is conserved among different species (such as H. sapiens, P. troglodytes, and M. mulatta) and usually comprises the Death_NF-κB1_p105 (cd08797) domain, the RHD-n (cl08275) domain, the DD (cl14633) domain, and the Ank_2 (pfam12796) domain. The phylogenetic tree (Fig. S2) shows the evolutionary relationships of NF-κB1 proteins among different species.
We first analyzed the expression profile of NF-κB1 in non-tumor tissues and different cell types. Based on the combined HPA (Human Protein Atlas), GTEx, and FANTOM5 (Functional Annotation of the Mammalian Genome 5) datasets, the expression of NF-κB1 was highest in lymph nodes, followed by bone marrow, appendix, and thymus (Fig. S3a). However, NF-κB1 was expressed in all the tested tissues (all with a normalized expression value >1), showing low RNA tissue specificity. When NF-κB1 expression was analyzed in different blood cells, low RNA blood cell type specificity was also observed in the HPA/Monaco/Schmiedel datasets (Fig. S3b).
We analyzed the expression status of NF-κB1 in different TCGA cancer types using the TIMER2 tool. As shown in Fig. 1a, NF-κB1 expression levels in the tumor tissues of BLCA (Bladder Urothelial Carcinoma), BRCA (Breast invasive carcinoma), CHOL (Cholangiocarcinoma), COAD (Colon adenocarcinoma), HNSC (Head and Neck squamous cell carcinoma), HNSC-HPV (Head and Neck squamous cell carcinoma-human papillomavirus), LUSC (Lung squamous cell carcinoma), SKCM (Skin Cutaneous Melanoma), STAD (Stomach adenocarcinoma), THCA (Thyroid carcinoma), and UCEC (Uterine Corpus Endometrial Carcinoma) (P<0.001); ESCA (Esophageal carcinoma) (P<0.01); and GBM (Glioblastoma multiforme), KIRC (Kidney renal clear cell carcinoma), LUAD (Lung adenocarcinoma), PCPG (Pheochromocytoma and Paraganglioma), PRAD (Prostate adenocarcinoma), READ (Rectum adenocarcinoma), and PAAD (Pancreatic adenocarcinoma) (P<0.05) were higher than those in the corresponding control tissues.
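The tumor-versus-control comparison described above can be illustrated with a rank-based two-group test. The following is a minimal sketch, assuming a two-sided Wilcoxon rank-sum (Mann-Whitney U) test with a normal approximation and no tie correction of the variance; the text does not state which test the tool applies, and the function name `rank_sum_test` and the expression values are hypothetical.

```python
import math

def rank_sum_test(tumor, normal):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie
    correction of the variance). Returns (U statistic for tumor, p-value)."""
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, "tumor") for v in tumor] + [(v, "normal") for v in normal])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Tied values share the average of the ranks they span.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(tumor), len(normal)
    rank_sum_tumor = sum(r for r, (_, g) in zip(ranks, pooled) if g == "tumor")
    u_tumor = rank_sum_tumor - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_tumor - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of N(0, 1)
    return u_tumor, p_value

# Hypothetical log2 expression values, for illustration only.
tumor_expr = [5.1, 6.3, 5.8, 6.9, 7.2, 6.1]
normal_expr = [3.2, 4.1, 3.8, 4.5, 3.9, 4.4]
u, p = rank_sum_test(tumor_expr, normal_expr)
print(f"U = {u:.1f}, p = {p:.4f}")
```

For small groups an exact test would be preferable to the normal approximation used here.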
After including normal tissues from the GTEx dataset as controls, we further observed differences in NF-κB1 expression between normal and tumor tissues for CHOL (Cholangiocarcinoma), DLBC (Lymphoid Neoplasm Diffuse Large B-cell Lymphoma), GBM (Glioblastoma multiforme), LAML (Acute Myeloid Leukemia), PAAD (Pancreatic adenocarcinoma), and THYM (Thymoma) (Fig. 1b, P<0.05). However, we did not observe significant differences for other tumors, such as BLCA (Bladder Urothelial Carcinoma), COAD (Colon adenocarcinoma), ESCA (Esophageal carcinoma), LIHC (Liver hepatocellular carcinoma), KICH (Kidney Chromophobe), and PCPG (Pheochromocytoma and Paraganglioma), as shown in Fig. S4a.
Results from the CPTAC dataset showed that total NF-κB1 protein was expressed at higher levels in the primary tumor tissues of breast cancer, UCEC (Uterine Corpus Endometrial Carcinoma), ovarian cancer, LUAD (Lung adenocarcinoma), colon cancer, and clear cell RCC than in normal tissues (Fig. 1c, P<0.001).
We also used the GEPIA2 “Pathological Stage Plot” module to examine the correlation between NF-κB1 expression and tumor pathological stage, and observed a correlation in BRCA (Breast invasive carcinoma) and KIRC (Kidney renal clear cell carcinoma) (Fig. 1d, P<0.05).
3.2 Survival analysis data
We divided tumor cases into high- and low-expression groups according to the expression level of NF-κB1 and studied the correlation between NF-κB1 expression and the prognosis of patients with different tumors, mainly using TCGA and GEO datasets. As shown in Fig. 2a, high expression of NF-κB1 in TCGA was associated with poor overall survival (OS) for CESC (P=0.02), LGG (P=0.019), LUSC (P=0.033), and OV (P=0.026). Disease-free survival (DFS) analysis (Fig. 2b) showed that high NF-κB1 expression was not associated with poor prognosis in any tumor type. In addition, low NF-κB1 expression was associated with poor OS for ACC (P=0.037), KIRC (P=0.00004), and READ (P=0.035) (Fig. 2a, P=0.012) and with poor DFS for KIRC (P=0.0014) (Fig. 2b, P=0.014).
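The grouping and comparison described above can be sketched as a cutoff-based split followed by a two-group log-rank test. This is a minimal illustration under stated assumptions: the median is used as the cutoff (the text does not specify one), the function names `split_by_median` and `logrank_test` are hypothetical, and the survival data are invented for demonstration.

```python
import math
from statistics import median

def split_by_median(expression):
    """Label each case 'high' or 'low' relative to the median expression
    (the median cutoff is an assumption, not taken from the text)."""
    cutoff = median(expression)
    return ["high" if x > cutoff else "low" for x in expression]

def logrank_test(times, events, groups):
    """Two-group log-rank test. events: 1 = event observed, 0 = censored.
    Returns (chi-square statistic with 1 degree of freedom, p-value)."""
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [(g, tt, e) for g, tt, e in zip(groups, times, events) if tt >= t]
        n = len(at_risk)
        n_high = sum(1 for g, _, _ in at_risk if g == "high")
        d = sum(1 for _, tt, e in at_risk if tt == t and e == 1)
        d_high = sum(1 for g, tt, e in at_risk if tt == t and e == 1 and g == "high")
        # Expected events in the high group under the null hypothesis.
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value

# Hypothetical cohort: expression, follow-up time (months), event indicator.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
times = [20, 22, 25, 30, 2, 3, 4, 5]
events = [1, 1, 1, 1, 1, 1, 1, 1]
groups = split_by_median(expression)
chi2, p = logrank_test(times, events, groups)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A different cutoff (for example, quartiles or an optimized threshold) would change group membership and may change the resulting P value.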
Moreover, survival analysis using the Kaplan-Meier plotter tool showed that low expression of NF-κB1 was correlated with OS (Fig. S7a, P=0.0000092), DMFS (distant metastasis-free survival) (P=0.000035), and RFS (relapse-free survival) (P<0.001) prognosis for breast cancer. However, in breast cancer cases stratified by ER status-IHC (positive and negative), ER status-array (positive and negative), HER2 status (positive and negative), Grade 2, intrinsic subtype (Basal, Luminal A, Luminal B), and Pietenpol subtype (basal-like 1, luminal androgen receptor), high expression of NF-κB1 was associated with poor OS, RFS, and DMFS prognosis (Table S1, P<0.05). Additionally, a low NF-κB1 expression level was associated with PFS (progression-free survival) prognosis for ovarian cancer (P=0.0005) (Fig. S5b). In contrast, high expression of NF-κB1 was related to poor OS (P=0.019) and PPS (post-progression survival) (P=0.0019) prognosis for lung cancer (Fig. S5c), FP (first progression) (P=0.00083) prognosis for gastric cancer (Fig. S5d), and PFS (P=0.01), RFS (P=0.0046), and DSS (disease-specific survival) (P=0.035) prognosis for liver cancer. We also carried out subgroup analyses using selected clinical factors and observed different conclusions (Tables S1-S5). The above data suggest that NF-κB1 expression is differentially associated with the prognosis of patients with different tumors.
3.3 Genetic alteration analysis data
We examined the genetic alterations of NF-κB1 in different tumor samples from the TCGA cohort. As shown in Fig. 3a, the highest alteration frequency of NF-κB1 (>6%) appeared in uterine tumor patients, with “mutation” as the primary type. The “amplification” type of CNA was the dominant type in Pheochromocytoma and Paraganglioma cases, with an alteration frequency of about 1% (Fig. 3a). Fig. 3b further shows the types, sites, and case numbers of NF-κB1 genetic alterations. We found that missense mutation was the main type of NF-κB1 genetic alteration. A520T/V alterations in the Ank_2 domain were detected in 1 UCEC case, 1 BRCA case, and 1 COAD case (Fig. 3b); these missense alterations change residue 520 of the NF-κB1 protein from A (alanine) to T/V (threonine/valine). The A520T/V site can be observed in the 3D structure of the NF-κB1 protein (Fig. 3c). In addition, we explored the potential association between NF-κB1 genetic alterations and clinical survival outcomes in patients with different types of cancer. The data in Fig. 3d show that UCEC cases with NF-κB1 alterations tended to have better disease-specific survival (P=0.0811) and disease-free survival (P=0.096) than cases without NF-κB1 alterations, although neither difference reached statistical significance at the 0.05 level. Additionally, we analyzed the association between NF-κB1 expression and TMB (tumor mutation burden)/MSI (microsatellite instability) across all TCGA tumors. As shown in Fig. S6, NF-κB1 expression was negatively correlated with TMB in LUAD, BLCA, LIHC, BRCA, THCA, and UVM, and positively correlated with TMB in UCEC, COAD, STAD, and LGG. NF-κB1 expression was also negatively correlated with MSI in PAD, BRCA, SKCM, HNSC, and DLBC (P<0.05), and positively correlated with MSI in COAD, KIRC, and LAML (P<0.05) (Fig. S7). These results deserve further investigation.
3.4 DNA methylation analysis data
Using TCGA data, we applied the MEXPRESS tool to investigate the potential relationship between NF-κB1 DNA methylation and the pathogenesis of different tumors. For TGCT cases, we observed a significant negative correlation between DNA methylation and gene expression in the non-promoter region of NF-κB1, for example at cg06501333 (P<0.001, R=0.527), as shown in Fig. S8.
3.5 Protein phosphorylation analysis data
We also compared the phosphorylation levels of NF-κB1 between normal tissues and primary tumor tissues. CPTAC datasets were used to analyze five types of tumors (breast cancer, ovarian cancer, LUAD, UCEC, and clear cell RCC). Fig. 4a summarizes the NF-κB1 phosphorylation sites and their significant differences. Compared with normal tissues, S893 within the DEATH domain of NF-κB1 showed higher phosphorylation levels in all primary tumor tissues (Fig. 4a-g, all P<0.05), followed by an increased phosphorylation level at the S892 site within the DEATH domain in breast cancer (Fig. 4b, P=0.00000049), clear cell RCC (Fig. 4c, P=0.1), and LUAD (Fig. 4d, P=0.033). We also analyzed the CPTAC-identified phosphorylation of NF-κB1 using the PhosphoNET database and found that phosphorylation of NF-κB1 at S893 in the cell cycle was supported by a published article (21).
3.6 Immune infiltration analysis data
As an important part of the tumor microenvironment, tumor-infiltrating immune cells are closely related to tumor occurrence, progression, and metastasis (22, 23). Cancer-associated fibroblasts in the tumor microenvironment have been reported to be involved in modulating the function of various tumor-infiltrating immune cells (24, 25). Here, we used the TIMER, CIBERSORT, CIBERSORT-ABS, QUANTISEQ, XCELL, MCPCOUNTER, and EPIC algorithms to investigate the potential relationship between the infiltration levels of different immune cells and NF-κB1 gene expression in different types of TCGA tumors. Through a series of analyses, we found that the immune infiltration of CD8+ T cells was negatively correlated with the expression of NF-κB1 in THYM (Thymoma) (Fig. S9a-b). In addition, we observed that the expression of NF-κB1 was positively correlated with the estimated infiltration value of cancer-associated fibroblasts in the TCGA tumors LGG, LIHC, LUSC, RAAD, and TGCT (Fig. 5). Scatter plots for the above tumors, each generated with one algorithm, are shown in Fig. 5 and Fig. S9. For instance, based on the MCPCOUNTER algorithm, the expression level of NF-κB1 in TGCT is positively correlated with the infiltration level of cancer-associated fibroblasts (Fig. 5, Rho= -0.621, P=5.12E-17).
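The Rho values reported for expression versus infiltration level are rank correlation coefficients. A minimal sketch of how a Spearman coefficient can be computed is given below, assuming an unadjusted correlation (any purity adjustment applied by the analysis tool is not reproduced here); the function names and the data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)

# Hypothetical per-sample values, for illustration only.
expression = [2.1, 3.4, 1.8, 4.0, 2.9]
infiltration = [0.10, 0.22, 0.05, 0.31, 0.18]
print(f"rho = {spearman_rho(expression, infiltration):.3f}")
```

A positive coefficient indicates that samples with higher expression tend to have higher estimated infiltration, and a negative coefficient indicates the opposite.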
3.7 Enrichment analysis of NF-κB1-related partners
To better understand the molecular mechanism of the NF-κB1 gene in tumorigenesis, we screened for NF-κB1-binding proteins and NF-κB1 expression-correlated genes for a series of pathway enrichment analyses. Using the STRING tool, we obtained a total of 50 NF-κB1-binding proteins supported by experimental evidence. The interaction network of these proteins is shown in Fig. 6a. Using the GEPIA2 tool in combination with the expression data of all TCGA tumors, we obtained the top 100 genes correlated with NF-κB1 expression. As shown in Fig. 6b, the expression level of NF-κB1 was positively correlated with that of the UBE2D3 (R=0.63), IRF2 (R=0.63), ELF1 (R=0.58), ERAP1 (R=0.56), and SMNDC1 (R=0.56) genes (all P<0.001). The corresponding heat map also showed that NF-κB1 was positively correlated with the above five genes in most individual cancer types (Fig. 6c). An intersection analysis of the two groups identified six common members, namely UBE2D3, SP1, STAT3, TNFAIP3, CNOT6L, and RIPK1 (Fig. 6d).
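The intersection of the two groups amounts to a set operation on the two gene lists. The sketch below is illustrative only: the lists are truncated placeholders built from gene names mentioned in the text plus hypothetical filler entries, not the full 50-protein and 100-gene lists, and the function name `common_members` is hypothetical.

```python
def common_members(binding_proteins, correlated_genes):
    """Return the gene symbols present in both lists, sorted alphabetically."""
    return sorted(set(binding_proteins) & set(correlated_genes))

# Truncated placeholder lists; PLACEHOLDER_* entries are hypothetical fillers.
binding_proteins = ["UBE2D3", "SP1", "STAT3", "TNFAIP3", "CNOT6L", "RIPK1",
                    "PLACEHOLDER_A", "PLACEHOLDER_B"]
correlated_genes = ["UBE2D3", "IRF2", "ELF1", "ERAP1", "SMNDC1", "SP1",
                    "STAT3", "TNFAIP3", "CNOT6L", "RIPK1", "PLACEHOLDER_C"]

print(common_members(binding_proteins, correlated_genes))
```

Gene symbols are compared as exact strings, so both lists need consistent naming (for example, official symbols in upper case) before intersecting.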
We combined these two datasets for KEGG and GO enrichment analyses. The KEGG data in Fig. 6e suggest that “endoplasmic reticulum protein processing” and “metabolic pathway” may be involved in the effect of NF-κB1 on tumor pathogenesis.