Publicly available datasets with gene expression profiles and corresponding visualization data were identified and downloaded. Raw data from the Affymetrix microarray datasets were downloaded from the Gene Expression Omnibus (GEO) and processed using the RMA algorithm for background adjustment. Gene expression, somatic mutation, copy number and corresponding clinical data were identified and downloaded from the level 4 gene-expression data (FPKM normalized) of The Cancer Genome Atlas (TCGA) HNSCC cohort. Because HNSCC comprises multiple histological types and anatomical sites, we mainly downloaded larynx data from TCGA. For the TCGA dataset, RNA-sequencing data (FPKM values) were converted into transcripts per kilobase million (TPM) values, which are more similar to those obtained from microarrays and more readily comparable between samples20. The clinicopathological data collected included sex, age, stage, grade, T-stage, N-stage, M-stage, survival status and survival duration in days. Data were analyzed with R (version 3.5.3) and R Bioconductor packages. We used Perl to filter the immune cell matrix according to P less than 0.5.
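The FPKM-to-TPM conversion mentioned above can be illustrated with a minimal sketch. It assumes the standard rescaling, in which each sample's FPKM values are divided by their sum and multiplied by one million; the function name and the gene labels are hypothetical and are not taken from the study's own scripts.

```python
def fpkm_to_tpm(fpkm_by_gene):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm_by_gene.values())
    if total == 0:
        # A sample with no measured expression stays at zero everywhere.
        return {gene: 0.0 for gene in fpkm_by_gene}
    return {gene: value / total * 1e6 for gene, value in fpkm_by_gene.items()}


# Hypothetical FPKM values for a single sample (illustrative only).
sample_fpkm = {"GENE_A": 10.0, "GENE_B": 30.0, "GENE_C": 60.0}
sample_tpm = fpkm_to_tpm(sample_fpkm)
```

Because every sample is rescaled to the same total, TPM values can be compared across samples more directly than FPKM values.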
Assessment of immune infiltration.
To infer the infiltrating cells in the TME and to measure the proportions of immune cell types in the heterogeneous HNSCC samples, we used CIBERSORT, a metagene tool that applies a deconvolution algorithm to bulk gene expression data and has been validated by fluorescence-activated cell sorting (FACS). These TIICs comprised macrophages (M0/M1/M2 macrophages), 7 T-cell types (resting memory CD4+ T cells, T follicular helper [Tfh] cells, activated memory CD4+ T cells, γδ T cells, Tregs, CD8+ T cells and naïve CD4+ T cells), resting natural killer (NK) cells, activated NK cells, resting/activated mast cells, resting dendritic cells (DC), activated DC, memory B cells, naïve B cells, plasma cells, monocytes, eosinophils and neutrophils.
Assessment of tumor mutational burden.
TMB is the total number of mutations per megabase in tumor tissue. In general terms, it is the mutation density of tumor genes, that is, the average number of mutations in the tumor genome, including gene coding errors, base substitutions, and gene insertions or deletions. The higher the TMB, the more easily the tumor is recognized by immune cells and becomes a target of anti-tumor immunity, and thus the more likely immunotherapy is to be effective. We described the copy number and somatic mutation characteristics based on the TCGA database and then analyzed the association between TMB and patients’ clinical factors, including age, gender, grade and TNM stage; we also analyzed the effect of TMB on survival. The TCGA workflow type was VarScan2 Variant Aggregation and Masking. Differentially expressed genes (DEGs) associated with TMB were identified using the R package limma21, which implements an empirical Bayesian approach to estimate gene-expression changes using moderated t tests. DEGs associated with TMB were determined according to the significance criteria implemented in the R package limma. The adjusted P value for multiple testing was calculated using the Benjamini–Hochberg correction22. To examine the function of the identified DEGs, functional analyses were performed using GO (cellular components, biological processes and molecular functions) enrichment and KEGG pathway analysis via an R package. Finally, we evaluated the relationship between tumor-infiltrating immune cells and TMB.
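As an illustration of the definition above, per-sample TMB can be sketched as the number of somatic mutation calls divided by the size of the sequenced region in megabases. The 38 Mb exome size used here is an assumption (a commonly used approximation), not a value stated in the text, and the function name and sample identifiers are hypothetical.

```python
from collections import Counter

# Assumption: approximate exome size in megabases; not stated in the text.
EXOME_SIZE_MB = 38.0


def tumor_mutational_burden(mutation_sample_ids, exome_size_mb=EXOME_SIZE_MB):
    """Return mutations per megabase for each sample.

    mutation_sample_ids holds one sample identifier per somatic mutation
    call, e.g. the sample column of a mutation annotation file.
    """
    counts = Counter(mutation_sample_ids)
    return {sample: n / exome_size_mb for sample, n in counts.items()}


# Hypothetical mutation calls: 76 in sample S1 and 19 in sample S2.
calls = ["S1"] * 76 + ["S2"] * 19
tmb = tumor_mutational_burden(calls)
```

Samples with no mutation calls do not appear in the input and would need to be added with a TMB of zero before any group comparison.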
Statistical analyses were conducted using R version 3.5.3 and Bioconductor. For comparisons of two groups, statistical significance for normally distributed variables was evaluated by unpaired Student t tests, and non-normally distributed variables were analyzed by Wilcoxon tests. Each dataset was processed by a weighted average approach to compare the differences in the composition of TIICs, and boxplot, heatmap, corHeatmap and vioplot were used to visualize the differences between normal and tumor samples. Overall survival (OS) was defined as the time interval from the date of diagnosis to the date of death. Listwise deletion was used to handle missing data, which excluded the entire sample from the analysis if any single value was missing. The Wilcoxon test was performed to evaluate the differences between tumor and normal tissues not only in the gene expression of immune checkpoint molecules but also in TMB clinical information. To identify differential genes in the DEG analysis, we applied the Benjamini–Hochberg method to convert the P values to FDRs. For all statistical analyses, a P value < 0.05 was considered significant.
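The Benjamini–Hochberg conversion of P values to FDRs described above can be sketched as follows. This is a plain-Python illustration of the standard step-up procedure, not the implementation used in the study (which relied on R); the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down to the smallest so that the
    # adjusted values never increase as the raw P value decreases.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P values from four gene-level tests.
pvals = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(pvals)
```

Each raw P value is multiplied by the number of tests divided by its rank, and the running minimum caps the result at 1 and keeps the adjusted values monotone.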