Prognostic and Immunological Characterization of NSD3 Evaluated through an Integrative Pan-Cancer Analysis


Background: Nuclear receptor binding SET domain protein-3 (NSD3), a histone lysine methyltransferase, has been reported to be a crucial regulator of carcinogenesis in multiple cancer types. However, the underlying mechanisms have not been clearly delineated. Therefore, we aimed to investigate the expression pattern, prognostic value, and potential function of NSD3 in 33 types of human cancer. Methods: The potential roles of NSD3 were explored using The Cancer Genome Atlas (TCGA) pan-cancer dataset and an array of bioinformatics methods, including analyses of the relationship between NSD3 expression and prognosis, tumor mutational burden (TMB), microsatellite instability (MSI), DNA amplification, and immune cell infiltration across 33 cancer types. Results: Many types of cancer are characterized by dysregulation of NSD3, which is associated with the pathological stage of the cancer. Patients with higher NSD3 levels, which were attributed to NSD3 copy number amplification, generally experienced shorter survival. Additionally, NSD3 expression was associated with TMB and MSI in 10 different cancer types. The top five cancers whose NSD3 expression correlated with immune scores were further analyzed. The levels of immune cell infiltration differed significantly between high and low NSD3-expressing samples in each of the five cancer types. Functional enrichment of the NSD3 co-expressed genes indicated a role for NSD3 in the regulation of immune responses and tumorigenesis. Conclusions: Our study revealed that NSD3 can function as a prognostic marker in various cancers due to its role in tumorigenesis and tumor immunity.


Background
Cancer is the leading cause of death worldwide, and its incidence is increasing (1). Despite advances over the last few decades in the diagnosis and treatment of cancers, a large proportion of cancer patients are refractory to clinical therapy. Moreover, patients with cancer often face a heavy economic burden that severely affects their daily lives (2). As a result, novel diagnostic and therapeutic methods are urgently needed, including the discovery of predictive biomarkers (3)(4)(5). Recently, increasing evidence has shown that cells in the tumor microenvironment (TME), mainly comprising stromal and immune cells, are involved in tumorigenesis and progression (6-9). Some patients with solid tumors respond to immune checkpoint blockade (ICB), such as antibodies against programmed cell death protein 1 (PD-1) (10) and cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) (11), which extends their survival; however, a significant proportion of patients do not achieve remission. Accordingly, there remains a lack of biomarkers that can be used to reliably predict an immunotherapy response.
Nuclear receptor binding SET domain protein-3 (NSD3) is a member of the histone lysine methyltransferase family and is involved in the regulation of gene expression, cell cycle progression, and chromatin remodeling (12,13). Aberrant NSD3 expression is associated with many pathophysiological processes, including cardiac hypertrophy (14) and antiviral immune responses (15). Although accumulating evidence indicates that NSD3 affects tumorigenesis and metastasis by regulating the Wnt, Notch, and other signaling pathways (16-18), the role of NSD3 in tumors has not been fully elucidated.
Most studies of NSD3 to date have been limited to a specific type of cancer, and pan-cancer studies of the association between NSD3 and various cancers remain unavailable. Therefore, in the present study, we used data from The Cancer Genome Atlas (TCGA) database to evaluate NSD3 expression and its relationship with prognosis in different types of cancers. We also explored the potential associations between NSD3 expression and microsatellite instability (MSI), tumor mutational burden (TMB), DNA methylation, and immune infiltration across 33 types of cancer. In addition, we conducted co-expression analyses of immune-related genes with NSD3 and functional enrichment analyses to investigate the biological functions of the NSD3 gene in tumors. Our results suggest that NSD3 can be used as a prognostic factor for a variety of tumors and plays an important role in tumor immunity by affecting immune cell infiltration, TMB, and MSI, shedding new light on the role of NSD3 in tumor immunotherapy.

NSD3 expression pattern across different cancer types
The expression pattern of NSD3 was investigated in different types of cancer using TCGA data. NSD3 expression was compared between cancer specimens and corresponding normal samples. As shown in Fig. 1a, a total of 14 types of cancers showed abnormal NSD3 expression. Among them, NSD3 was highly expressed in cholangiocarcinoma (CHOL), esophageal carcinoma (ESCA), head and neck squamous cell carcinoma (HNSC), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), pheochromocytoma and paraganglioma (PCPG), and stomach adenocarcinoma (STAD). Low NSD3 expression was observed in glioblastoma multiforme (GBM), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), prostate adenocarcinoma (PRAD), and thyroid carcinoma (THCA). For a few cancer types, such as sarcoma (SARC) and cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), the differences were not statistically significant; this lack of significance was likely due to the very small number of normal samples available for comparison.
Next, we analyzed NSD3 expression levels in various tumors and ranked them from high expression to low expression (Fig. 1b). All the cancers expressed NSD3, with the highest levels in acute myeloid leukemia (LAML) and the lowest in LIHC. We then assessed the 33 cancer types for differential expression of NSD3 based on the cancer stage. Of the 33 types, 10 showed a correlation between NSD3 expression and stage: colon adenocarcinoma (COAD), HNSC, KICH, KIRC, LUAD, pancreatic adenocarcinoma (PAAD), skin cutaneous melanoma (SKCM), STAD, testicular germ cell tumors (TGCT), and THCA (Fig. 2a-2j).
Notably, the majority of significant differences in NSD3 expression were observed between stage I and other tumor stages. These results suggest that many types of cancers characteristically exhibit NSD3 dysregulation, which is associated with the pathological stage of the cancer.
Amplification of chromosomal region 8p11-12, where NSD3 is located, is a common genetic alteration that has been implicated in the etiology of cancers. Accordingly, we explored the correlation between mRNA expression and DNA copy number variation of NSD3 across the 33 tumor types. Interestingly, a positive association was detected for all the cancer types, with the top five being LUSC, breast invasive carcinoma (BRCA), rectum adenocarcinoma (READ), ESCA, and COAD (Fig. 3), indicating that the high expression level of NSD3 in cancers is supported by DNA amplification.

Association of NSD3 expression with cancer patient survival
To explore the prognostic value of NSD3 in cancers, we performed a patient survival analysis, including overall survival (OS) and disease-specific survival (DSS), for each of the 33 cancer types. As shown in Fig. 4a, analysis using the Cox proportional hazards model revealed that higher NSD3 expression levels were significantly associated with shorter OS of patients with adrenocortical carcinoma (

Correlation of NSD3 with TMB, MSI, and immune checkpoint genes across cancer types
We next analyzed the correlations of NSD3 expression with TMB, MSI, and expression of immune checkpoint genes, all of which have essential connections with ICB sensitivity. Our results indicated that NSD3 expression was positively associated with TMB in six cancer types (ACC, uterine corpus endometrial carcinoma (UCEC), STAD, LUAD, LAML, and HNSC) and negatively associated with TMB in five cancer types (THCA, LIHC, KIRP, KIRC, and BRCA) (Fig. 6a). NSD3 expression was related to MSI in 10 cancer types (Fig. 6b). As cancer cells can escape immune surveillance by regulating immune checkpoint genes such as cytotoxic T-lymphocyte-associated protein 4 (CTLA4), we also calculated the Pearson correlation coefficient between NSD3 expression and immune checkpoint genes across all 33 cancers (Fig. 6c). NSD3 was clearly co-expressed with the majority of immune checkpoint genes in most of the cancers, suggesting a vital role of NSD3 in the regulation of immune checkpoints.

Relationship between NSD3 expression and TME
Another essential factor that affects ICB sensitivity is the TME. Accordingly, we used the ESTIMATE algorithm to calculate immune and stromal scores for each tumor type and then assessed the relationships between NSD3 expression and these two scores. Our results revealed that NSD3 expression negatively correlated with the immune scores of many cancer types, with the top five being GBM, brain lower grade glioma (LGG), SARC, PCPG, and ovarian serous cystadenocarcinoma (OV) (Fig. 7a). NSD3 expression also negatively correlated with stromal scores in the pan-cancer analysis, except for positive correlations in PRAD and KIRC (Fig. 7b). Infiltrating immune cells are an important component of the antitumor immune response. Consequently, we investigated the correlation between NSD3 expression and immune infiltrates in the five cancer types noted above for which NSD3 expression was highly correlated with the immune scores. As shown in Fig. 8, the number of infiltrating immune cells differed significantly between the high and low NSD3 expression populations across these cancer types. NSD3 expression positively correlated with the levels of infiltrating regulatory T cells (Tregs), activated mast cells, and M1 macrophages in GBM, LGG, and SARC tumor specimens, but negatively correlated with the levels of infiltrating CD8+ T cells and M2 macrophages. Similarly, NSD3 expression negatively correlated with levels of infiltrating activated NK cells, memory B cells, CD8+ T cells, and activated CD4+ memory T cells in PCPG and OV tumor specimens.

Functional annotation of NSD3
To identify the potential biological function of NSD3 in cancers, we performed functional enrichment analyses with NSD3-related genes using gene set enrichment analysis (GSEA) and gene set variation analysis (GSVA) algorithms. Enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and gene ontology (GO) terms for each cancer type are shown in Fig. 9. In diverse cancers, NSD3 was associated with immune-related pathways and glucose metabolism pathways, including antigen processing and presentation, the RIG-I-like receptor signaling pathway, the Toll-like receptor signaling pathway, and pentose and glucuronate interconversions (Fig. 9a). NSD3-related genes in each cancer type were mainly enriched in GO terms associated with artery morphogenesis, cell cycle, cell motility, immune response, and inflammatory response (Fig. 9b). GSVA was performed to further determine the potential function of NSD3. As shown in Fig. 10, NSD3 expression was positively associated with several immune cell-related and histone methylation-related pathways. In contrast, NSD3 expression negatively correlated with cell metabolism-related, drug transport-related, and drug metabolism-related pathways.

Discussion
Although ICB therapy is being extensively applied for the treatment of various malignant tumors, a significant portion of patients do not respond (26-28). This lack of response underscores the need for research aimed at discovering biomarkers that can be used to screen for and identify sensitive patients.
Accumulating evidence has indicated that NSD3 plays a vital role in tumorigenesis, metastasis, and treatment sensitivity in various cancer types (17,29). However, a comprehensive pan-cancer analysis of NSD3 remains absent. Therefore, we conducted such an analysis to evaluate the expression pattern, prognostic value, and potential function of NSD3 in various types of cancers.
The nuclear receptor-binding SET domain protein (NSD) family is a group of histone lysine methyltransferases consisting of three members, NSD1, NSD2, and NSD3 (18). The NSD3 gene, also known as WHSC1L1, has been characterized in recent years as a tumorigenesis-related gene (30,31). As reported previously, NSD3 triggers Notch signaling to promote epithelial-mesenchymal transition (EMT) by increasing H3K36 methylation during tumorigenesis and tumor progression (17). Mahmood et al. have indicated that NSD3 is a driver of several cancers, including breast cancer, pancreatic adenocarcinoma, and lung cancer (32). However, previous studies have been limited to specific cancer types. In the present study, we analyzed 33 different cancer types and found that NSD3 expression was highly heterogeneous across the different cancers. In line with previous studies, NSD3 expression was upregulated in LUSC, ESCA, LIHC, and HNSC (33)(34)(35). Some studies have also reported that NSD3 expression is increased in BLCA and COAD (45,46), which is inconsistent with our current results. The heterogeneity of the study populations may have at least partly contributed to this discordance. Nevertheless, our study showed that high levels of NSD3 expression indicated poor prognosis for patients with PAAD, KIRP, and UVM, which is consistent with previous studies (32,36,37). A recent study concluded that ablation of NSD3 in a mouse model of LUSC attenuates tumor growth and results in prolonged survival (16). Surprisingly, our results indicated that NSD3 was a protective factor for patients with LUSC and negatively correlated with cell metabolism-related, drug transport-related, and drug metabolism-related pathways. One possible explanation for this apparent contradiction is that NSD3 exerts different regulatory roles in tumorigenesis and disease progression of LUSC.
Accumulating evidence supports TMB and MSI, as well as the expression of some immune checkpoint genes such as PD-1 and programmed death-ligand 1 (PD-L1), as promising biomarkers for predicting responses to ICB drugs in various solid cancers (38-40). Our current results revealed that NSD3 expression correlated with TMB and MSI levels and that there was a significant correlation between NSD3 expression and immune checkpoint genes in different cancer types. Infiltrating immune cells in the TME also play vital roles in the immune response and immune escape during tumorigenesis and progression (41). We found that NSD3 expression negatively correlated with the level of infiltration of diverse immune cells in a portion of cancers.
To further investigate the potential roles of NSD3 in cancers, we performed functional enrichment analyses. We found that NSD3-related genes were mainly enriched in pathways that are dysregulated during the pathophysiological process of cancers, including pathways related to immunity, cell metabolism, cell cycle, cell motility, drug transport, and drug metabolism. A previous study indicated that NSD3 upregulation activates the Notch signaling pathway and promotes breast cancer cell EMT by facilitating H3K36 methylation (17). In addition, NSD3 knockdown in lung cancer and bladder cancer cells induces cell cycle arrest at the G2/M phase (35). Results from these studies support the findings of our current study. Interestingly, while one study revealed that NSD3 promotes innate immunity during viral infection (42), no study has reported the role of NSD3 in the regulation of cancer immunity. Our results strongly suggest that NSD3 exerts immune regulatory effects in cancers, but this needs to be verified with future experimental evidence.
In conclusion, we evaluated the expression profile, prognostic value, and potential function of NSD3 in a pan-cancer analysis. Our results indicated that NSD3 may be a promising biomarker for predicting patient response to ICB. They also provide new information that will benefit further exploration and confirmation of NSD3 functions in cancer development and progression and may help identify new therapeutic approaches.

Methods
Data download and processing
TCGA gene expression data, copy number variation, and clinical data of the patients were downloaded from the UCSC Xena database (http://xena.ucsc.edu/). Strawberry Perl, version 5.32.1 (http://strawberryperl.com/) was used to extract the NSD3 expression data to generate a data matrix.

Analysis of the expression pattern of NSD3
NSD3 expression was compared between tumor specimens and their corresponding normal tissue samples across 33 cancer types using a t-test, with a P-value < 0.05 considered statistically significant. Subsequently, patients with each cancer type were categorized into three or four groups according to pathological stage and the NSD3 expression pattern was determined. To further investigate the cause of aberrant NSD3 expression in the different cancer types, correlations between expression levels and gene copy number variation were evaluated. The R package "ggpubr" version 0.4.0 (https://cran.r-project.org/packages=ggpubr) was used to visualize the results.
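The tumor-versus-normal comparison described above can be illustrated with a minimal Python sketch. The text does not state which t-test variant was used, so the unequal-variance (Welch) form is assumed here; the function name `welch_t` and the expression values are hypothetical and are not taken from the study, which used R.

```python
import math
from statistics import mean, variance


def welch_t(tumor, normal):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Assumption: unequal-variance (Welch) t-test; the paper only says "t-test".
    """
    n1, n2 = len(tumor), len(normal)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two samples")
    # Squared standard error of each group mean (sample variance / n).
    se1, se2 = variance(tumor) / n1, variance(normal) / n2
    if se1 + se2 == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(tumor) - mean(normal)) / math.sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical log2-scale NSD3 expression values for one cancer type.
tumor_expr = [5.1, 5.6, 6.0, 5.8, 6.3]
normal_expr = [4.2, 4.5, 4.1, 4.8]
t_stat, dof = welch_t(tumor_expr, normal_expr)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

The P-value would then be read from a t distribution with `dof` degrees of freedom, which a statistics library can supply; it is omitted here to keep the sketch dependency-free.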

Univariate survival analysis
Patient survival and pathological stage information were selected for prognosis analyses. NSD3 data were subjected to OS and DSS analyses using the univariate Cox regression model. The OS and DSS analyses were performed using the R packages "survival" version 2.43-3 (CRAN.R-project.org/package=survival) and "forestplot" version 1.10.1 (https://cran.r-project.org/packages=forestplot), with P-values < 0.05 considered statistically significant (19). For each cancer type in which NSD3 expression was found to correlate with OS or DSS in the Cox regression model, patient survival was visualized as Kaplan-Meier curves using the "survminer" package, version 0.4.9 (https://cran.r-project.org/packages=survminer).
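The Kaplan-Meier curves mentioned above rest on the product-limit estimator, which can be sketched in a few lines of Python. This is an illustrative re-implementation, not the "survminer" code used in the study; the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time of each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up (months) for a high-NSD3 group.
high_times = [5, 8, 12, 12, 20]
high_events = [1, 1, 0, 1, 1]
for t, s in kaplan_meier(high_times, high_events):
    print(f"t = {t:>2}  S(t) = {s:.3f}")
```

Running the same function on the low-expression group gives the second curve; comparing the two curves formally would require a log-rank test, which is not shown.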
Correlation of NSD3 expression with TMB, tumor MSI, and immune checkpoint genes
TMB, MSI, and the expression of immune checkpoint genes are commonly used biomarkers that correlate with responses to ICB treatment (20,21). In the current study, TMB scores were calculated using a Perl script and normalized to the total length of the exons. MSI scores were calculated for each sample based on somatic mutation data downloaded from TCGA (https://tcga.xenahubs.net). Correlations of NSD3 expression with TMB and MSI were subsequently analyzed by calculating Spearman's rank correlation coefficient. The results are presented as radar charts prepared using the "fmsb" package, version 0.7.0 (https://cran.r-project.org/packages=fmsb). Co-expression of NSD3 with immune checkpoint genes was analyzed and a heatmap of the results was generated using the "pheatmap" package, version 1.0.12 (https://cran.r-project.org/packages=pheatmap).
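As an illustration of the two calculations described above, the following Python sketch normalizes a mutation count to exome length and computes Spearman's rank correlation with tied values given average ranks. It is not the Perl or R code used in the study; the function names, the 38 Mb exome length, and the sample values are assumptions made for the example.

```python
import math
from statistics import mean


def tmb_per_mb(n_mutations, exome_length_bp):
    """Mutations per megabase of exonic sequence."""
    return n_mutations / (exome_length_bp / 1e6)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho = Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values for one cancer type.
EXOME_BP = 38_000_000  # assumed exome length, not stated in the paper
mutation_counts = [76, 190, 38, 304, 114]
nsd3_expr = [4.9, 5.8, 4.1, 6.2, 5.0]
tmb = [tmb_per_mb(n, EXOME_BP) for n in mutation_counts]
print(f"rho = {spearman(nsd3_expr, tmb):.3f}")
```

Because Spearman's coefficient depends only on ranks, dividing every count by the same exome length does not change the correlation; the normalization matters for comparing TMB values across samples sequenced with different target sizes.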
Analysis of the relationship between NSD3 and TME
ESTIMATE is an important tool for calculating immune and stromal scores using transcriptomic data. Here, we evaluated the immune and stromal scores for each sample using the "estimate" package and assessed the correlation between NSD3 expression and the scores for each cancer type. The online analytic tool CIBERSORT (https://cibersort.stanford.edu/) was used to evaluate the levels of infiltrating immunocytes in each tumor sample (22). The samples for each cancer type were stratified into two groups, high NSD3 expression and low NSD3 expression, based on the median expression level of NSD3. Differences in infiltrating immune cells between the two groups were compared and visualized using "ggpubr".
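The median-based stratification described above can be sketched as follows. This is a minimal Python illustration, not the study's code; the sample identifiers, expression values, and CD8+ T cell fractions are hypothetical, and assigning samples that fall exactly on the median to the low group is an assumption, since the text does not specify how such samples were handled.

```python
from statistics import mean, median


def split_by_median(expression):
    """Split samples into (high, low) groups by median expression.

    expression : dict mapping sample ID -> NSD3 expression
    Assumption: samples equal to the median go to the low group.
    """
    cutoff = median(expression.values())
    high = [s for s, v in expression.items() if v > cutoff]
    low = [s for s, v in expression.items() if v <= cutoff]
    return high, low


# Hypothetical NSD3 expression and CD8+ T cell fractions per sample.
nsd3 = {"S1": 4.1, "S2": 6.3, "S3": 5.0, "S4": 5.9, "S5": 4.6, "S6": 6.8}
cd8_fraction = {"S1": 0.12, "S2": 0.04, "S3": 0.10,
                "S4": 0.05, "S5": 0.11, "S6": 0.03}

high, low = split_by_median(nsd3)
print("high:", high, "mean CD8 fraction:", mean(cd8_fraction[s] for s in high))
print("low: ", low, "mean CD8 fraction:", mean(cd8_fraction[s] for s in low))
```

In practice the per-cell-type fractions of the two groups would be compared with a statistical test rather than by group means alone.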
Functional enrichment analysis
GSEA and GSVA, two different approaches, were used to explore the biological functions of NSD3 in cancers (23,24). Data regarding the genes co-expressed with NSD3 were collected for each cancer type. GO and KEGG gene sets were downloaded from the GSEA website (http://www.gsea-msigdb.org/gsea/index.jsp). Functional enrichment analyses were conducted using the R packages "org.Hs.eg.db," "clusterProfiler," (25) and "enrichplot." The GSVA gene sets were obtained from the Molecular Signatures Database (MSigDB) v7.2 (https://www.gsea-msigdb.org/gsea/msigdb/index.jsp), which was updated September 2020. GSVA scores were calculated using the "GSVA" package. Correlation of NSD3 expression with signaling pathways in each cancer type was investigated and the top 15 most significant pathways with either positive or negative correlations are presented.

Relationship between NSD3 expression and DNA amplification. The cancer types with the top five greatest correlation coefficients between NSD3 expression and NSD3 copy number variation are shown.