Characterization of tumor microenvironment and Immune inltration in head and neck squamous cell carcinoma

Background: Head and neck squamous cell carcinoma (HNSCC), is a frequent malignant tumor of the head and neck. Most HNSCC arise in the lip, oral cavity, paranasal sinuses, oropharynx, larynx, nasopharynx, and the pharynx. HNSCC is a heterogeneous disease entity, and the role of immune cells in HNSCC, particularly in immunotherapy has not yet been extensively studied. Method: In this study, we applied CIBERSORT to infer the inltration of 22 subsets of TIICs using gene expression data. This was to determine the relationship of the immune subpopulation, patients’ survival, function, and expression differences to reveal potential targets and biomarkers for immunotherapy. Results: Somatic mutations were the most common and the variant classication was missense mutation. The most common DNA sequence polymorphism type was SNP and the most common single nucleotide variants (SNV) class was C>T. The median number of variants per sample was 78 in HNSCC patients. The top 10 mutated genes related to TMB were TP53, TTN, FAT1, MUC16, CDKN2A, CSMD3, SYNE1, LRP1B, NOTCH1, and PIK3CA. We portrayed the immune scene in detail, uncovering the awesome immune inltration styles of various subtypes in HNSCC. Conclusion: The intricate connection between TIIC, TMB, and genomic alterations was additionally set up. Our paintings advance the information of immune response and offer signicant assets for research to enhance immunotherapy.


Background
Head and Neck Squamous Cell Carcinoma (HNSCC) is a head and neck malignancy affecting the lips, mouth, paranasal sinuses, larynx, nasopharynx, and other pharyngeal organs 1 . It is the sixth most common type of malignant tumor, accounting for more than 655,000 new cases and 90,000 deaths every year 2 . Smoking, drinking, and human papilloma virus infection are the risk factors for occurrence and prognosis of HNSCC 3 . Even with early detection, less than 50% of individuals with HNSCC survive for 5 years. The survival rate is further reduced to 35% in individuals with local recurrence and metastasis 4 .
Occurrence and development of HNSCC was a complex process involving in multiple molecules. The malignant phenotypes of cancers are characterized by intrinsic activities of tumor cells as well as by the immune cells activated and recruited at the site of infection 5 . Although it's a heterogeneous disease, the role of tumor-related immune cells in HNSCC, particularly in the current immunotherapy, is not well understood Tumor microenvironment (TME) comprises of many types of cells such as immune, broblasts and endothelial cells as well as extracellular molecules such as cytokines, extracellular matrix and growth factors, all fed by a vascular network 6 . Tumor-in ltrating immune cells (TIIC) in TME, plays an indispensable role the initiation, progression and even metastasis of tumors and affects the e cacy of therapeutic interventions 7 . TIIC comprise of macrophages (M0/M1/M2 macrophages), 7 types of T-cell types (resting memory CD4+ T cells, T follicular helper [Tfh] cells, activated memory CD4+ T cells, γδ T cells, Tregs, CD8+ T cells and naïve CD4+ T cells), resting natural killer cells, resting/activated mast cells, activated NK cells, resting dendritic cells, memory B cells, activated DC, monocytes, naïve B cells, plasma cells, eosinophils and neutrophils. TIIC at TME are targets for drugs in managing progression of cancers, and have shown prognostic value 8 . There is a complex immune interaction, which has to balance between over stimulation and controlling tumor progression 9 . Tumor cells avoids host immunosurveillance by upregulating expression of TIICs inhibitors, impacting on tumor mutational burden (TMB) [10][11] . TMB is the total number of mutations per megabase of a tumor tissue. It is an indicator of immunological reaction and tumor behavior [12][13] such as formation of mutations through activation or inactivation of related genes and pathways. As such, novel peptides that can animate immune reaction can be created 14 . High TMB may be driven by other underlying risk factors. Immune checkpoint inhibitors and has been appeared to be more signi cantly connected with reaction to PD-1 and PD-L1 blockade immunotherapy than PD-1 or PD-L1 expression 15 . Previous attempts to assess TIIC using ow cytometry and immunohistochemistry was limited by the quantity of uorescent channels accessible as well as few types of immune cell available for immediate assessment [16][17] . In this study, we explored the relationship between TME, TMB and the concentration of 22 TIICs based on Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) database on HNSCC, in relation to molecular subpopulation, survival, function and expression difference to reveal potential targets and biomarkers for immunotherapy.

Data acquisition.
Freely access data on gene expression pro les and corresponding visualization were identi ed and downloaded. Crude information from the microarray data produced by Affymetrix was downloaded from the GSE6631 using RMA algorithm for background adjustment. Information on gene expression, somatic mutations, copy number and corresponding clinical information were downloaded from level 4 geneexpression data (FPKM normalized) of HNSCC-TCGA cohort. RNA-sequencing data (FPKM values) from this set were changed into transcripts per kilobase million values, to create uniformity with subsequent microarrays data, to ease comparison between samples 18  CIBERSORT metagene tool utilizes data on deconvolution of mass gene expression, whose reliability has been validated by uorescence activated cell sorting. CIBERSORT is used to estimate the abundances of individual cell types in a mixed cell population, using data on gene expression and the LM22 gene signature was used for deconvolution. This enables large-scale analysis of mixed RNA for cellular biomarkers and therapeutic targets, especially in noise, unknown mixture content and closely related cell types [19][20] . Thus, we used CIBERSORT to evaluate TIIC in TME and in measuring the proportions of immune cells and cell types in HNSCC heterogeneous samples.

Assessment of tumor mutational burden.
TMB is the total number of mutations per megabase of tumor tissue. In general terms, it's the mutation density of a tumor gene, including the total number of gene coding errors, base substitutions, gene insertions or deletion errors. The larger the TMB, the easier it is to be detected by immune cells, and the easier it is to be targeted for tumor immunity, thus the more likely it is to be contained by treated with the immune checkpoint inhibitors 21 . We described the copy number and characteristics of somatic mutations using TCGA data base. We also analyzed the effect of TMB on survival. Type of TCGA work ow based on VarScan2 Variant Aggregation and Masking. Differentially expressed genes (DEGs) among TMB were controlled using signi cance standards as applied in the R package limma 22 . Adjusted P value for multiple testing was calculated using the Benjamini-Hochberg correction 23 . Function of the identi ed DEGs and biological processes were analyzed using GO (cellular components, biological processes and molecular functions) while enrichment and KEGG pathway analyses were performed using R software. Finally, we evaluated the relationship between tumor immune cells-in ltration and TMB.

Statistical analysis.
Statistical analyses have been conducted the use of R (version 3.5.3) and R Bioconductor software packages. Each dataset was handled in a weighted average approach to evaluate the differences in the composition of TIIC. Differences were illustrated using histograms, heatmap, corHeatmap and violin plot. Overall survival (OS) was characterized as the time interval from the date of diagnosis to the date of death samples with missing data were excluded from the study. Wilcoxon Rank Sum and Signed Rank Tests was performed to evaluate statistical differences in expression of immune genes for checkpoint molecules, clinical information, and DEGs on TMB between tumor and normal tissues in HNSCC-TCGA or GSE6631 (symmetric distribution). Mann-Whitney U test was used to explored the 22 TIICs proportions between low TMB and high TMB based on HNSCC-TCGA database. An adjusted P-value<0.05 was considered as signi cant when identi ed differentially expressed (such as DEGs) in HNSCC-TCGA or GSE6631. For other statistical analysis, a P-value<0.05 was viewed as signi cant.

Results
Distribution of TIICs based on GSE6631 and HNSCC-TCGA.
Histograms and heatmap showed the cluster information of the 22 TIICs subsets in each sample based on GSE6631 Figure 1A, 1C. Next, co-expression analysis was used in order to clari ed the potential immune cell subpopulations that may affect each other. CorHeatmap revealed that M1 macrophages and resting memory CD4 +T cells had the highest interaction (0.86) and activated mast and CD8 +T cells had the lowest co-expression coe cient (-0.71) in GSE6631 patients as shown in Figure 1B. But, the immune regulatory network was complexity and affected by many factors, our study may provide useful information for in depth researches.
Histograms and heatmap showed the cluster information of the 22 TIICs subsets in each sample based on HNSCC-TCGA Figure 2A, 2B. Similarly, the co-expression of T cells CD4 memory activated and T cells CD8 was highest in tumor patients (0.57) while the co-expression of M0 macrophages and T cells CD8 was lowest (-0.59) in HNSCC-TCGA cohort as shown in Figure 2C.
Missense mutation was the most common somatic genetic variation, while SNP was the most common DNA sequence polymorphism, with C>T being the most common single nucleotide variants class. The median variation per HNSCC-TCGA tissue was 78. Top 10 genes mutations contributing to TMB were TP53, TTN, FAT1, MUC16, CDKN2A, CSMD3, SYNE1, LRP1B, NOTCH1 and PIK3CA as shown in Figure 3A. CorHeatmap analysis revealed a co-expression relationship among many mutated genes Figure 3B. TTN gene had the most mutations, with alterations observed in 478 (94.47%) of 506 samples as shown in Figure 3C-3D. There was no association between TMB and OS (P=0.87), which may be due samples size bias.
TMB and differentially expressed genes in HNSCC-TCGA.
To evaluate the underlying biological characteristics of TMB based on HNSCC-TCGA, we divided individuals into high (185 samples) and low (189 samples) tumor mutational burden based on TMB median value, and analyzed 249 DEGs using limma 24 . GO enrichment and KEGG pathway analyses were performed using R clusterPro ler. GO analysis showed that biological processes in DEGs were signi cantly enhanced in response to acid chemical, connective tissue and cartilage development. Molecular function was mainly enhanced in structural constituent of extracellular matrix. DEGs cellular components were elevated in extracellular matrix, presynapse and endoplasmic reticulum lumen. KEGG analysis revealed that DEGs were enhanced in glutathione metabolism. Violin plot revealed differences between the 22 TIICs in TMB subgroups. Here, the expression of NK cells resting (P=0.032) and Eosinophils (P=0.019) was signi cantly different, and thus may play an important role in regulating tumor cells Figure 4 and Table 2.

Discussion
In this study, we used CIBERSORT to assess TME, TMB and characterized 22 subsets of TIICs based on GSE6631 and HNSCC-TCGA database in individuals with HNSCC, to explore molecular subpopulation, survival, function and expression difference of these cells in tumor conditions. We found that the composition of TIIC subtypes differs substantially in different HNSCC. Tumor mutations generates novel epitopes. Understanding the immunogenicity of these epitopes may deepen our knowledge on the clinical immunology and prognosis of HNSCC tumors. Based on TCGA data TME was associated with TMB, supporting the hypothesis that TMB is a strong proxy for TME. This is an interesting revelation, particularly in the face of immunomodulatory therapies, and may uncover potential biomarkers for immunotherapy or prognosis of HNSCC tumors.
In ltration of immune cells has been postulated to be a signi cant factor in uencing the prognosis of tumors. In this study, we rst explored the distribution of TIICs in HNSCC patients based on GSE6631 and HNSCC-TCGA datasets. Macrophages which mediate phagocytosis and in ammation aid in controlling tumor progression, by facilitating invasion of tumor tissues and angiogenesis [25][26][27] . Macrophages are mainly classi ed into M0/M1/M2 Under speci c conditions, M0 macrophages can be polarized to M1/M2 types, Liu et al 28 found that RhoA pathway could obstruct elongation (hummingbird phenotype) of M0 macrophages, suggesting that M0 may be important in bone-marrow-derived macrophages. As subpopulations of T cells, resting memory CD4+ T cells can differentiate to perform other functions (separate to those mediated by memory CD8 + T cells, but aiding in suppressing growth of tumors) [29][30] .
Resting memory CD4 + cells must be activated by IL-7 and IL-15, yet not MHC class II, for their survival and intermittent homeostatic proliferation 31 . A recent study suggested that the homologous interaction between tumor antigen-speci c CD4 + Th1 cells and tumor-associated macrophages may shift the M1/M2 ratio within the tumor to M1 32 . So, in ltration of immune cells plays a critical role in HNSCC and therefore more research was needed. Meanwhile, we also analyzed the interaction between the 22 different TIICs and elucidated on co-expression of immune cells in HNSCC, to open up future research on the same. This analysis though, may be subject to biasness, because the interactions may be in uence by the other factors in the TME.
The role of varying TMB in HNSCC-TCGA was also investigated. Missense mutation we found to be the most common somatic alterations, while SNP was the most common DNA sequence polymorphism with C>T single nucleotide substitution being the most frequent. Changes that modify amino acid sequences are known as Missense mutations. These changes may in uence the structure and stability of proteins, which may interfere their interactions with other biomolecules, translation of functional proteins and enhance progression of tumours 33 . For example, BRAF mutation in melanoma and KRAS G12D mutations in colorectal cancer [34][35] . Establishing impacts of cancer missense mutations may help in identifying drivers of mutations as well as illustrating molecular mechanisms for developing HNSCC. SNP have been identi ed in nearly all types of cancers, such as colorectal 36 , breast 37 , prostate 38 , and HNSCC 39 . SNP is a third-generation genetic marker for detecting numerous phenotypic differences in human, and susceptibility to diseases and drugs may be associated with SNP 40 . Top 10 mutated genes that related to TMB were TP53, TTN, FAT1, MUC16, CDKN2A, CSMD3, SYNE1, LRP1B, NOTCH1 and PIK3CA. Kubesova et al 41 reported that TP53 mutations with low variant allele frequency irrespective of disease subtype, driver gene status and cytoreduction during myeloproliferative neoplasm. Mukhopadhyay et al 42 suggested that ESR2-mutant TP53 combination prognosticates survival in triple negative breast cancer. But in this study, there was no statistical difference between TMB and clinical survival time which may have been in uenced by too small samples size in HNSCC-TCGA database. We nally divided TMB into high and low groups to analyze the functions of DEGs. Analysis of biological processes performed by DEGs suggested that these genes may play a signi cant role in tissue development and response to acidic chemicals. Molecular function was enhanced in structural constituents in the extracellular matrix, and affected cellular structure as well. KEGG analysis revealed that for most part, metabolism of glutathione is enhanced in DEGs. Glutamine is a signi cant metabolite utilized in the development of malignant cells.
Reducing the level of GLN through chemotherapy and radiotherapy has been found to restores diminished glutathione, enhancing the recovery of intestine epithelium and immunological system 43 . Our results also suggested that NK cells resting and Eosinophils was differently expressed between high and low TMB group indicating these genes are essential in tumor immune response.
Meanwhile, our study had several limitations. First, our results were not validated in clinical settings and may not provide precise data. We also analyzed samples from relatively few patients. Second, due to multiple types of histology and anatomical sites for HNSCC, tumor-in ltrating immune cells may vary widely. Finally, the interacting of different immune cells is affected by environmental conditions, thus ours corHeatmap analysis may be subject to bias. Meanwhile, these particular tumors may lack substantial tumor in ltrating immune cells (immune deserts) and needs more related studies conduct. We will do follow-up studies to develop new, accurate interventions for cancer medicine. Here, we described TME in relation to immune system, revealing variable in ltration of various HNSCC subtypes. The intricate connection between TIIC, TMB and genomic alterations was also elucidated the mechanism of immune response revealed in this study provides a strong foundation for future research that can enhance tumor immunotherapy.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Competing interests
The authors declare that they have no competing interests.

Funding
This work is not supported by grants.     Violin plot of the 22 TIICs subsets between low TMB group and high TMB group in HNSCC-TCGA database.