Development of an immunogenomics-based prognostic index model of laryngeal squamous cell carcinoma

Background: The immune system greatly affects the prognosis of various malignancies. Studies on differentially expressed immune-related genes (IRGs) in the immune microenvironment of laryngeal squamous cell carcinoma (LSCC) have rarely been reported. Methods: In this paper, the prognostic potentials of IRGs in LSCC were explored. The RNAseq dataset containing differentially expressed IRGs and corresponding clinical information of LSCC patients was obtained from The Cancer Genome Atlas (TCGA). A total of 371 up-regulated and 61 down-regulated IRGs were identied. Subsequent functional enrichment analysis revealed that the pathway of IRGs was mainly enriched in the cytokine-cytokine receptor interaction. Then, 30 IRGs with prognostic potentials in LSCC were screened out, and the regulatory network induced by relevant transcription factors (TFs) were constructed. Results: Finally, multivariate Cox regression analysis was conducted to assess the prognostic potential of 15 IRGs after adjustment of clinical factors and LSCC patients were classied into 2 subgroups based on different outcomes. The gene expression of the model was veried by other independent databases. Nomogram including the 15 IRGs signature was established and shown some clinical net beneft. Intriguingly, B cells were signicantly enriched in the low-risk group. Conclusion:These ndings may contribute to the development of potential therapeutic targets and biomarkers for the new-immunotherapy of LSCC.


Background
Laryngeal cancer, a most common malignant tumors in the head and neck, ranks the eleventh among the most common malignant tumors in men [1]. Notably, patients with invasive and metastatic laryngeal cancer have a poor prognosis. More than 95% cases of laryngeal cancer are laryngeal squamous cell carcinoma (LSCC). Despite great strides on LSCC treatment in recent years, the mortality rate of LSCC remains high [2]. Due to the lack of symptoms, LSCC in the early stage is often neglected. Furthermore, because of the screening and diagnostic limitations, the detective rate of LSCC, especially early-stage LSCC, is relatively low. Therefore, identifying effective biomarkers and target genes is of vital importance in improving diagnostic and therapeutic e cacies for LSCC [3].
Previous studies have con rmed the signi cance of tumor immunoreaction in the tumor microenvironment. Prognostic signatures based on immune-related genes (IRGs) have been proposed for nonsquamous non-small cell lung cancer [4] and papillary thyroid cancer [5]. Tumor-in ltrating immune cells with different densities, localizations and types have been identi ed as prognostic factors in lung cancer [6], colorectal cancer [7] and breast cancer [8]. Nevertheless, the clinical relevance and prognostic signi cance of IRGs in LSCC are yet to be elucidated.
Herein, we aimed to explore potential therapeutic targets and biomarkers for immunotherapy of LSCC.
TCGA datasets containing IRG expression pro les and clinical information of LSCC patients were rst analyzed. An immunogenomics-based prognostic index of LSCC was developed and the potential mechanism was further discussed.

Clinical samples and data acquisition
We downloaded the RNAseq dataset with mRNA level 3 and corresponding clinical information from The Cancer Genome Atlas (TCGA)-the largest cancer gene information database to collect relevant data, including the expression pro les and clinical information of 12 laryngeal tissues and 111 LSCC samples, from different microarray platforms [9] for further analyses. We also obtained a list of IRGs via the Immunology Database and Analysis Portal (ImmPort) database (https://www.immport.org) [10]. At the same time,the list of tumor-related transcription factors (TFs) was obtained from Cistrome Cancer (http://cistrome.org/) [11] a comprehensive resource of transcriptional and epigenetic factors regulating abnormal gene expression patterns in cancers. Finally, the abundance of six types of tumor-in ltrating immune cells in LSCC was retrieved from Tumor IMmune Estimation Resource (TIMER) (https://cistrome.shinyapps.io/timer/) [12].

Identi cation of immune-related genes (IRGs)
IRGs were then identi ed from the intersection between all differentially expressed genes and immunerelated genes. To explore the potential molecular mechanisms of the IRGs, we performed Gene Ontology(GO)enrichment analysis and the Kyoto Encyclopedia of Genes and Genomes (KEGG) signaling pathway analysis on these differentially expressed IRGs using package Cluster pro ler of R.

Survival analysis and construction of regulatory network
Prognostic IRGs were selected by univariate Cox analysis using the package survivial in R. At the same time, TFs were obtained from Cistrome. Then, we screened differentially expressed TFs from these TFs by package limma in R software with adj.p < 0.05 and |log(FC)|≥1. The regulatory network was constructed based on clinically relevant TFs and ltered TFs with a standard of correlation coe cient more than 0.3 and p < 0.05.The results of the interaction were imported into Cytoscape 3.7.1 [14] to establish a regulatory network to explore role of TFs in regulating these prognostic IRGs.

Construction of a prognostic index model of LSCC based on immune-related genes
We constructed a prognostic index model based on the result of the multivariate Cox regression analysis of prognostic IRGs.

External validation of the gene expression
We collected high-throughput gene expression dataset, numbered GSE59102 ,from the GEO database [15].Background correction, standardization and expression value calculation were performed on the original dataset using package Impute and Limma of R software and adj.p-val < 0.05 were de ned as the screening criteria.The protein expression of the genes in the prognostic gene signature was explored in the Human Protein Atlas database [16].

Veri cation the prognostic capacity of prognostic index
Patients were divided into the high-risk group and the low-risk group based on the median risk score value. ROC curves were constructed via the package survival ROC in R to validate the performance of prognostic index [17]. Survival analysis of two groups with a threshold of p < 0.05 was carried out by package survival and survminer in R to verify the clinical e cacy. In addition, we observed the prognostic value of prognostic index by introducing clinical factors like age and grade.Clinical survival analysis based subgroup was also conducted and a value of p < 0.05 was considered signi cant statistically.

Building and validating a predictive nomogram
We took the all independent prognostic factors identi ed by multivariate Cox regression analysis into a nomogram to investigate the likelihood of 1-OS, 3-OS and 5-OS for LSCC. The effectiveness of the nomogram was evaluated by discrimination and calibration. Finally, we plotted the calibration curve of the nomogram by package(rms) of R to observe the relationship between the predicted probability of the nomogram and the observed rate.

Functional enrichment analysis
We used Gene Set Enrichment Analysis (GSEA) [18] to identify consistent differences between high-risk and low-risk groups and to explore possible biological processes. In the gene list of KEGG, p < 0.05 were considered to be statistically signi cant screening criteria.
2.10 Differential expression of tumor-in ltrating immune cells between high-risk and low-risk groupsWe obtained immune in ltrate levels of LSCC patients from TIMER and calculated differential expression of tumor-in ltrating immune cells between high-risk and low-risk groups using two-sample T-test.

Analyses on DEGs
Analyses on 5483 DEGs were performed in R, and a total of 4611 up-regulated and 872 down-regulated DEGs were identi ed (Fig. 1).

Identi cation of IRGs
A total of 371 up-regulated and 61 down-regulated IRGs were identi ed from the DEGs obtained.
Cluster analyses on IRGs were presented in Fig. 2A. As expected, GO enrichment analysis (Fig. 2B) indicated that leukocyte migration was the most signi cantly affected biological process (BP). The external side of plasma membrane was the most signi cantly affected cell composition (CC). Meanwhile, the receptor ligand activity was the most signi cantly affected molecular function (MF). Subsequently, the KEGG signaling pathway analysis (Fig. 2C) found that the cytokine-cytokine receptor interaction pathway was most frequently implicated. Taken together, these results suggested that differentially expressed IRGs were of signi cance in the development of LSCC.

Survival analysis and construction of regulatory network
After integrating clinical information from TCGA and analyzing using the univariate Cox regression, 30 prognostic IRGs were identi ed. Forest plots (Fig. 3) showed that 11 IRGs may be the protective factors of LSCC, while the remaining 19 IGSs could be risk factors of LSCC. To investigate the regulatory mechanisms of the prognostic IRGs, expression levels of 318 TFs were examined. There were 65 differentially expressed TFs between LSCC and normal samples (Fig. 4). A regulatory network was constructed to depict these IRGs with relevant TFs (Fig. 4C).

Construction of a prognostic index model
Based on the results of the multivariate Cox regression analyses on the 15 prognostic IRGs, a prognostic index model was constructed (Table 1) The mRNA expression of genes of the model was consistent with the results in the TCGA cohort (Table 2),yet the upregulation of PAEP and CYSLTR2 was not found in GSE59102 (Table 3).We further explored the protein expression of the genes in the Human Protein Pro es ,which was shown in Fig. 5. However, we did not nd the protein expression of BTC FCGR3B and TLR2 in THPA.
3.6 Veri cation the prognostic capacity of the prognostic index model LSCC patients were separated into the high-risk group and the low-risk group based on the median level of risk score (Fig. 6A-C). Survival analysis showed that the survival rate in the high-risk group was remarkably lower than that in the low-risk group (p = 8.64e − 10, Fig. 6D). The area under curve (AUC) of the receiver operating characteristic (ROC) curve was 0.916 (Fig. 6E), suggesting a great performance of the prognostic index model in predicting the prognosis of LSCC patients.Compared with the other clinical factors including only age, gender, grade,stage or TNM, the prognostic index model shown the largest AUC for 1-year, 3-year and 5-year OS (Fig. 7). Moreover, univariate and multiple regression analyses ( Fig. 8) suggested that the prognostic signature could become an independent predictor after adjustment of age, gender, tumor grade, tumor stage, location of the primary tumor(T), distant metastasis(M) and lymph node metastasis status(N) in LSCC patients. In addition, the relationship between IRG expressions and clinical factors of LSCC patients was explored (Fig. 9). In order to assess the prognostic capacity of prognostic index more comprehensively, we conducted a strati ed analysis of clinical factors.
Interestingly, we found that high risk patients in nearly all the subgroups were inclined to have unfavorable overall survival (Fig. 10).

Building and validating a predictive nomogram
We used a number of independent prognostic factors (including age, gender, grade,stage,TMN and risk scores) to establish a nomogram to predict 1-year ,3-year and 5-year OS in 88 LSCC patients (Fig. 11).These results suggested that the advantage of a nomogram constructed using a combinatorial model is that it can better predict short-term survival (1-year ,3-year and 5-year OS) compared to a nomogram constructed using a single prognostic factor. This might be helpful for clinical management.

Identi cation of the related biological processes and pathways
We used risk score to classify the entire data set to determine the underlying pathway behind these 15 genes by using the Java software GSEA. The results showed that the high-risk group was more abundant in the "glycosaminoglycan biosynthesis keratankeratan sulfate"and"pathogenic escherichia coli infection"gene sets when the low-risk group was more abundant in the "arachidonic acid metabolism","butanoate metabolism","fatty acid metabolism","propanoate metabolism"and "selenoamino acid metabolism"gene sets (Fig. 12).

Tumor-in ltrating immune cells in LSCC patients of the high-risk and low-risk groups
To explore the relationship between the immunogenomics-based prognostic index model and tumor immune microenvironment, we compared the in ltration of immune cells in different risk groups. It is shown that B cells were signi cantly enriched in the low-risk group compared to those in the high-risk group (Fig. 13).

Discussion
LSCC, a most common tumor of head and neck [19], is prone to recurrence and metastasis [20]. Patients who suffer from recurrent or metastatic LSCC and those with a poor response to platinum-based chemotherapy have a low survival rate [21]. Since the immune system plays a vital role in cancer development, immunotherapy is now extensively applied to counteract the immune escape against malignant cancer cells through regulating the key signaling pathways in the host immune system. In particular, cancer immunotherapy shows potentials of durable responses with fewer adverse effects than conventional treatments [22]. The rst cancer immunotherapy drug approved by the Food and Drug Administration (FDA) in 2011 was ipilimumab, a cytotoxic T-lymphocyte antigen 4 (CTLA4)-blocking monoclonal antibody (mAb) for metastatic melanoma.
Although the prognostic models of LSCC for predicting overall survival are constantly updated [23,24], immune-related prognostic index models of LSCC have rarely been reported. In this study, we rst identi ed 371 up-regulated and 61 down-regulated IRGs of LSCC, and the prognostic IRGs were subsequently screened out. Through establishing a prognostic index model, LSCC patients were classi ed into the high-risk and the low-risk groups. Our ndings demonstrated the great performance of the prognostic index model in predicting the prognosis of LSCC patients as revealed by the ROC curves.
To further explore the biological functions of IRGs in the development of LSCC, pathway enrichment analysis was conducted to depict the regulatory network. The KEGG analysis showed that prognostic IRGs were mainly enriched in the cytokine-cytokine receptor interaction. Immune cells and a network of pro-in ammatory and anti-in ammatory cytokines collaborate in cancer development and progression [25]. Cytokines are a heterogeneous group of soluble, small polypeptides or glycoproteins involved in virtually every aspect of immunity and in ammation [26]. It is believed that an environment rich in in ammatory cells, cytokines and activated stroma potentiates and/or promotes neoplastic risk [27]. Our study also found that the relationships between TFRC, PPARG, AHNAK, TRBC1 and their surrounding TFs were more complex in the regulatory network. TFRC is differentially expressed in tumors [28,29]. PPARG is considered as a prognostic marker [30,31]. Our results con rmed that AHNAK acted as a potential tumor suppressor and could be a reliable clinical prognostic indicator [32,33]. Nevertheless, the potential role of TRBC1 in cancer development remains unclear.
In the constructed prognostic model, the following IRGs were subjected to the calculation of risk score: RBP1, TLR2, PAEP, AHNAK, AQP9, CCL2, PPARG, CYSLTR2, FPR2, BTC, EPO, STC2, TNFRSF4, FCGR3B, PLCG1, AHNAK and PPARG. More frequent methylation of RBP1 is found in the esophageal squamous cell carcinoma than in the normal esophagus [34]. TLR2 may be relevant to susceptibilities of oral squamous cell carcinoma and oral lichenoid lesions [35]. Aquaporins (AQPs), a family of small membrane transport proteins, assist the transportation of water and small solutes such as glycerol [36]. It has been reported that AQP9 is up-regulated in human glioma tissues [37]. In astrocytoma, AQP9 promotes cancer cell invasion and motility via the AKT pathway [38]. Ferreira et al. suggested that CCL2 promoted the spread of oral cavity squamous cell carcinoma to lymph nodes and the macrophage in ltration might play a role in less aggressive behaviors [39]. Cysteinyl-LTs participate in the pathogenesis of several chronic in ammatory diseases [40]. Previous studies found that FPR2-induced paracrine might contribute to the proliferation and metastasis of LSCC [41], which is consistent with our ndings. Betacellulin (BTC), a member of the EGF family, acts as a potent mitogen for various cell types [42]. BTC is frequently reported to be up-regulated in human tumors [43][44][45]. Seibold ND et al. demonstrated that EPO expression in locally advanced squamous cell carcinoma of the head and neck was an independent prognostic factor for locoregional control, metastases-free survival and overall survival [46]. Protein level of STC2 in tumor tissues is associated with invasiveness in the thyroid cartilage, T-stage, lymph node metastasis, clinical stage and pathological differentiation of LSCC. In addition, protein level of STC2 is an independent prognostic factor for overall survival of LSCC [47][48][49]. Zhu et al. suggested that phospholipase C gamma 1 (PLCG1) could be used as a prognostic biomarker for patients with advanced oral squamous cell carcinoma [50]. So far, the involvement of PAEP, TNFRSF4 and FCGR3B in LSCC or head and neck tumors remains unclear, which requires further exploration.
T cells contribute to oncological immune defense and B cells basically belong to the adaptive immune system. Currently, most immunotherapies are based on T cells [51]. In this study, we found that B cells were signi cantly enriched in the low-risk group compared to the high-risk group, suggesting that B cell in ltration might be a good prognostic signal for LSCC patients. B lymphocytes are the effector cells of humoral immunity and can terminally differentiate into antibodies that secrete plasma cells upon stimulation. Moreover, B cells contribute to cellular immunity by acting as antigen-presenting cells (APCs) and/or providing costimulatory signals to T cells [52,53]. The role of B lymphocytes in tumor immunity is still controversial. On the one hand, antigen-presenting B cells are able to activate tumor-speci c cytotoxic T cells [54]. B cell de cient mice exhibit signi cantly reduced tumor-speci c T cell immunity [55]. On the other hand, B cell antibody response may potentiate chronic in ammation and thus enhance tumor development [56,57]. It is reported that tumor in ltrating B-cells are correlated with a good prognosis for head and neck cancer [58], which is consistent with our ndings.
There are still some de ciencies in our study. First of all, we constructed a unique prognostic model of LSCC by analyzing IRGs. However, in-depth analyses on clinical data of LSCC patients are needed to con rm our ndings. Secondly, we did not monitor the relative expressions of the selected 15 IRGs in LSCC patients to assess their prognostic values. Thirdly, in vivo and in vitro functional experiments are needed to further validate our ndings.

Conclusions
We developed a novel IRGs-based prognostic index model for LSCC patients, an interpretation of the misregulated tumor immune microenvironment. Also, these IRGs could be the potential therapeutic targets for LSCC patients.  Tables   Due to technical limitations the Tables are available as downloads           Nomogram predicting overall survival for LSCC patients. A: For each patient, several lines are drawn upward to determine the points received from the predictors in the nomogram. The sum of these points is on the "total point" axis. Then a line is drawn downward to determine the possibility of 1-,3-and 5-year overall survival of LSCC. B:The calibration plot for internal validation of the nomogram. The Y-axis represents actual survival, and the X-axis represents nomogram-predicted survival.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download. Table1.xlsx