An immune-associated lncRNA signature predicts the survival of patients with head and neck squamous cell carcinoma

Recent researches have established that lncRNAs (long non-coding RNAs) could be exploited as new signatures for head and neck squamous cell carcinoma (HNSCC) diagnosis, prognosis, and treatment. Herein, HNSCC transcriptome data was abstracted from the Cancer Genome Atlas (TCGA) data resource, and uncovered immune-linked lncRNAs through co-expression analysis. Besides, univariate along with Lasso penalty regression were employed to determine immune-linked lncRNA pairs with different expressions. We then compared area under the curve, calculated the Akaike information criterion (AIC) value of the receiver operating characteristic curve for 5 years, determined cutoff points, and established an optimal predictive model for identifying high- and low-risk HNSCC patients. Overall, we identified 40 differentially expressed immune-linked lncRNA pairs, 17 of which were incorporated in the Cox regression model. Using this model, we can more effectively stratify patients based on poor survival results, positive clinicopathological features, specific tumor immune invasion status, low chemotherapy responsivity, and high expression of immunosuppressive biomarkers. Our data illustrated that the immune-linked lncRNA pairs signature have clinical prediction value for HNSCC.


Introduction
Head and neck squamous cell carcinoma (HNSCC) is a class of tumors affecting the mouth, oropharynx, and larynx [1], with > 650,000 new cases along with 330,000 deaths worldwide each year [2]. HNSCC predisposing factors include tobacco use, human papilloma virus infection, alcohol use, and changes in populations of immune cells, tumor microenvironment, long with immune checkpoints that create a conducive environment for tumor immune evasion [3].
Considerable progress has been made in HNSCC treatment, including using immune checkpoint inhibitors (ICIs). Inhibitors against the programmed cell death protein 1 (PD1) receptor or its ligand programmed cell death 1 ligand 1 (PD-L1) using nivolumab and pembrolizumab have demonstrated remarkable efficacy in a subgroup of HNSCC patients [3,4]. Nivolumab is used for individuals with relapsing or metastatic HNSCC that have failed platinum therapy. The Checkmate 141 trial established that the nivolumab-treated group achieves sustained survival relative to the standard treatment group (docetaxel, methotrexate or cetuximab), reducing risk of death by 32% and increasing 2-year survival 3 times (16.9%), while reducing serious adverse events in grade 3-4 disease by 60% [5]. Pbrobrolizumab, a PD-L1 monoclonal antibody, has good activity in HNSCC and results from the KENOTE-048 trial indicate that Pembrolizumab chemotherapy should be first-line treatment for relapsed or metastatic HNSCC, and first choice for PDL1-positive patients [6].
Long noncoding RNAs (lncRNAs) are a novel class of RNA molecules with >200 nucleotides [7,8] and are involved in regulating cellular processes like proliferation, differentiation, development, immunity, metabolism, and signal transduction at epigenetic, transcriptional, and posttranscriptional levels [9,10]. LncRNAs have tumor suppressor or carcinogenic functions via interaction with proteins, RNA, and DNA to regulate gene expression [11]. Recent evidences show that lncRNAs influence cancer through direct modulation of the cell cycle, proliferation, microbiome balance, and immune cell differentiation and they function by affecting dendritic cell activity, metabolism, and T-cell ratio [12].
Additionally, lncRNAs directly regulate expression of genes involved in immune cell activation, leading to tumor immune cell infiltration [13].
In diagnosis, evaluation, and treatment of cancer, tumor immune infiltration signatures may predict prognosis [14][15][16]. Additionally, immune-associated lncRNAs among other molecules are employed in the development of these signatures. Ma et al [17] constructed and verified a robust 8 immune-linked lncRNAs signature for predicting breast cancer survival. Wang et al [18] established a prognostic signature for anaplastic glioma using 9 immune-linked lncRNAs. Khadirnaikar et al [19] identified lncRNAs associated with KIRC prognosis. Shen et al [20] established a novel predictive signature on the basis of 11 immune-linked lncRNAs related to immune cell invasion and independent of multiple clinicopathological parameters.
Numerous studies propose that lncRNAs can be utilized as clinical signatures for HNSCC diagnosis, treatment, or prognosis [21,22]. Nevertheless, the role of immune-linked lncRNA in the HNSCC prognosis is unclear. Here, we developed an immune-linked lncRNAs signature with prognostic, biomarker, and therapeutic potential.

Acquisition of transcriptome data and differential expression analysis
HNSCC RNA-seq data were abstracted from the TCGA data resource (https://portal.gdc.cancer.gov/). The GTF annotation file was abstracted from Ensembl data resource (http://asia.ensembl.org) and employed to distinguish between mRNA and lncRNA in the TSGA-HNSCC RNA-seq data. A list of immune-linked genes was abstracted from ImmPort data resource (http://www.immport.org) and immune-linked lncRNAs identified via co-expression strategies. Correlation assessment was conducted between immune genes and all lncRNAs. The immune gene correlation coefficients >0.4 and p<.001 were considered immune-linked lncRNAs. The limma R package was employed to perform differential analysis of all immune-linked lncRNAs to determine the immune-linked lncRNAs which are expressed differentially. The threshold criteria included a log fold change (lgfc) of >2 along with false discovery rate (FDR) of <.01).

Construction of differential immune lncRNA pairs
The identified differentially-expressed immune-linked lncRNAs were used to develop differential immune-linked lncRNA pairs. The differential immune-linked lncRNA were cyclically singly paired.
In the sample, when the level of expression of the first lncRNA was higher relative to the second, it was marked as 1, otherwise, it was marked as 0, generating an immune-linked lncRNA matrix comprised of 0 or 1. This model only needs to consider the comparison of lncRNA between the data, and does not need to perform the correction of the lncRNAs between the data so that it is applicable on other types of clinical data like microarray and PCR data. Matches were considered effective when the expression level of 0 or 1 lncRNA pairs were between 20% and 80% of the total pairs.

Download and organize clinical data
HNSCC patients clinical were downloaded from the TCGA-HNSCC project and organized into a matrix with row names as sample names and column names as various clinical data. Data lacking associated survival information were excluded.

Development of a risk model to assess risk score
All immune-linked lncRNAs were analyzed by univariate analysis, and immunorelated lncRNAs cycled. One immunorelated lncRNA obtained each time was compared with survival time along with the survival status with p<.01 indicating prognosis-linked lncRNAs. Next, LASSO regression analysis was employed to perform 10-fold cross-validation to identify immune gene pairs and delete gene pairs with higher correlation to prevent over-fitting of the model. The LASSO regression model was cross-validated to identify immune-linked lncRNAs combinations with the least error. Optimal immune-linked lncRNAs combination data were used for Cox proportional hazard regression and model construction. The AUC values for each model were calculated and plotted as curves. If the curve reached peak, illustrating the maximum AUC value, the computation was stopped and the model considered as the optimal candidate. Plot the one-year, three-year, and five-year ROC curves of the model. ROC curves were assessed to determine the maximum inflection point, which was regarded as the cut-off point to distinguish high or low risk scores.

Validation of the developed risk model
Kaplan-Meier analyses were carried out explore survival difference for patients in the high-vs low-risk group. The R packages, such as survival-ROC, survminer, and glmnet were employed to perform Kaplan-Meier analyses and revealed survival differences between the high-risk patients and low-risk patients. A survival curve visualization was employed to verify this cut-off point. Clinical correlation analysis was used to identify risk score differences of patients in different clinical groups and to confirm the clinical utility value of the developed model. To validate if the model could be applied as an independent clinical predictive model, we conducted single-factor and multi-factor Cox regression assessment between risk score and the clinicopathological characteristics and displayed the results on box and forest charts, using the R packages survival, pHeatmap, along with ggupbr. 6

Evaluation of tumor-invading immune cells
To evaluate the association of the risk with immune-cell features, we employed the currently recognized approaches to determine the immune invasion status in TCGA HNSCC samples. In this respect, TIMER, QUANTISEQ, CIBERSORT, MCPcounter, XCELL, EPIC, along with CI-BERSORT resources were employed to identify the immune invading cells. Afterwards, the Wilcoxon signed-rank test was implemented to determine differences in immune cell invasion levels between the high-risk patients and low-risk patients and the data illustrated in a box chart. The association of the risk score values and the immune invading cells was explored with spearman correlation analysis. A lollipop plot was employed to display the correlation coefficients data. p<.05 served as the significance threshold cutoff.

Assessment of the value of the model in clinical treatment
To explore the clinical model of HNSCC treatment, we determined the IC50 of commonly used chemotherapy drugs in the TCGA HNSCC dataset. Commonly used HNSCC anti-tumor drugs like docetaxel, paclitaxel, doxorubicin, cisplatin, and methotrexate were selected for analysis, and the IC50 difference in the high-and low-risk group compared using the Wilcoxon signed rank test.

Analysis of immunosuppressive molecules linked to ICI
To elucidate the relationship of the model with the expression levels of ICI-linked genes (BTLA, CD27, CD28, CD226, CDK8, CTLA4, HAVCR2, LAG3, PDCD1 ), the ggstatsplot package and violin plots were visualized.

Evaluation of differentially expressed immune-linked lncRNAs
A HNSCC transcriptome analysis dataset comprised of 44 normal and 501 tumor samples was downloaded from TCGA and annotated as per the Ensembl's gene transfer format (GTF) file followed by co-expression analysis of known immune-linked genes and INCRNAs. Overall, 809 immune-linked lncRNAs were identified (Table S1), of which, 77 were differentially-expressed (70 upregulated and 7 downregulated, (Figure 1A-B). We collected data from 545 HNSCC patients from TCGA data resource and determined the risk scores for all patients and employed the identified cutoff point to re-differentiate the high-and low-risk patients in the data set for verification.

Clinical assessment by risk evaluation model
Based on the confirmed cutoff point, 248 and 250 individuals with HNSCC were classified into the high-and low-risk groups, respectively. The risk score and survival of each patient ( Figure 3A-B) revealed HNSCC patients' risk score distribution and the association of the risk score with survival time. Kaplan-Meier data exhibited that the low-risk patients had a longer survival time in contrast with the high-risk group patients (p<.001, Figure 3C). Chi-square analysis of the relationship of HNSCC risk with clinicopathological features found that T stage, survival status, N stage, and clinical stage, were remarkably linked to the risk ( Figure 4A-E Figure 5B).

Evaluation of tumor-invading immune cells and immunosuppressive molecules with risk evaluation model
We investigated if the model based on differential immune-linked lncRNA is linked to the tumor immune microenvironment. The high-and low risk groups were correlated with tumor invading immune cells like macrophages, CD8+ T-cells, along with CD4+ T-cells ( Figure S1). Spearman correlation was carried out and the ballon plot of immune cell correlation analysis obtained ( Figure 6A).
In clinical use of ICIs to treat HNSCC, we studied if the risk model is linked to ICI-associated biomarkers and found that high risk scores were positively linked to high CDK8 expression (p<.001, Figure 6B) and negatively correlated with BTLA , LAG3 and PDCD1 (p<.001, Figure 6C-E).

Analysis of association of the risk model with chemotherapeutics
Next, we investigated the association between risk and efficacy of general chemotherapeutic agents in the TCGA HNSCC dataset. This analysis found that high-risk scores correlated with lower IC50 for chemotherapeutics like Docetaxel (p<.01, Figure 7A), indicating that this model can predict Docetaxel sensitivity and had no significant correlation with the lower IC50 for paclitaxel, cisplatin, doxorubicin, and methotrexate ( Figure 7B-E).

Discussion
Head and neck squamous cell carcinoma (HNSCC) is the 6th commonest cancer in the world [23].
While marked advancement has been made in diagnosis along with treatment, nearly 50% of patients still die within 5 years [24] and HNSCC prognosis remains dismal. Deep sequencing data have laid the foundation for identifying genes associated with cancer prognosis [25]. Currently, numerous studies are focusing on cancer prognosis-associated coding and non-coding RNA signatures [26][27][28]. However, most of these signatures require expression level quantification29. We previously created a signature that combines 2 lncRNA for HNSCC prognosis prediction without the need to precisely quantify their expression [29].
Here, we abstracted and sorted raw lncRNAs data from a TCGA HNSCC dataset and carried out differential co-expression analysis to stratify differentially expressed immune linked lncRNAs. The lncRNA pairs were validated using an improved cyclic single pair method with 0 or -1 matrix. We then performed a single factor analysis, along with a modified LASSO penalty regression, along with cross-validation, multiple repetitions, and random stimulation procedures to determine differentially expressed, immune linked lncRNA pairs. Next, we calculated each AUC value to obtain an optimal model and determined the optimal AUC cutoff point to identify high-and low-risk HNSCC patients. Finally, the model was evaluated in a variety of clinical factors, including survival, clinicopathological features, tumor-invading immune cells, chemotherapy, as well as checkpoint-linked biomarkers.
Numerous studies have documented that the tumor microenvironment influences tumorigenesis.
Immune system disorders may be a primary cause of cancer initiation and immunotherapy is a prospective cancer treatment strategy [30]. Many recent studies show that lncRNAs have fundamental roles in regulating the immune system [13,31]. For example, the LINK-A-dependent oncogenic lncRNA, down-regulates presentation of cancer cell antigen and intrinsic tumor suppression [32].
LncRNA NKILA could interacts with NF-κB and regulates the T-cell responsivity to activation-induced cell death (AICD) through repressing NF-κB activity [33]. LncRNAs interact with immune-linked genes and may modulate immune microenvironment remodeling or immune cells activation.
Immune-linked lncRNAs have therapeutic target and prognostic potential. Xia et al [34] created an immune-linked lncRNA signature for estimating glioma patient survival rate. Research by Chen et al showed that a 6-lncRNA immune prognostic signature predicts cervical cancer prognosis and guides immunotherapy strategies [35]. We find that some differentially expressed immune linked lncRNAs that had been identified during the modeling process have important roles in the malignant phenotype of various cancer types, including LINC02454 [36], HOXC13-AS [37,38], HCG22 [39,40], especially for HNSCC. Song et al [41] revealed that high MIAT expression correlates with poor laryngeal squamous cell carcinoma (LSCC) prognosis and that it may act as a carcinogenic lncRNA, promoting LSCC progression [42].
Tumor invading immune cells influence tumor spread, recurrence, metastasis, and immunotherapy response. For instance, Liu et al [43] found that the deficiency of memory B-cells or elevated M0 macrophage levels correlate with poor LUAD prognosis, while in LUSC, T follicular helper cells and elevated neutrophils correlate with good and poor prognosis, respectively. Increased tumor infiltrating lymphocyte (TLS) levels like CD4+, as well as CD8+ T-cells are linked to better laryngeal cancer prognosis and TILs along with PD-L1 expression may predict response to immune checkpoint inhibitors [44]. We employed seven common approaches to estimate immune invading cells, including XCELL [45,46], TIME [47], QUANTISEQ [48], MCPCOUNTER [48], EPIC [49], CIBERSORT [50] and CIBERSORT-ABS [51] and to examine the relationship of the risk score with the tumor invading immune cells. Our findings illustrated that tumor-invading immune cells like cancer associated fibroblasts and M0 macrophages are positively and negatively correlated with CD8 T-cells, myeloid dendritic cells, M1 macrophages, and plasmacytoid dendritic cells. Wang et al [52]. documented that immune score based on immune genomic analysis is a potential biomarker for forecasting overall breast cancer survival, and may predict overall survival and guide breast cancer treatment. Our model showed that the high risk group was sensitive to docetaxel, but not to cisplatin, paclitaxel, doxorubicin, or methotrexate. The signature positively correlates with checkpoint-linked biomarkers like CDK8 and BTLA, and negatively correlates with LAG3, CD28, and PDCD1. This may be due to differences between diverse immune cells and immune-linked phenotypes and specific biomarkers, and warrants further study. However, this study has some shortcomings: this is a retrospective study and is considered inferior to prospective randomized controlled clinical trials. We analyzed TCGA HNSCC data but did not evaluate similar datasets from other databases. We did not externally verify the constructed model. The gene expression level of each sample is different, which may make the final model less reliable.

Conclusion
In summary, we established a model that can evaluate the immune microenvironment along with survival outcome of head and neck squamous cell carcinoma patients based on differentially immunocorrelated lncRNAs. Our results offer promising prospects for identifying innovative molecular targets of immunotherapy and to improve therapeutic approaches for head and neck squamous cell carcinoma patients. Further studies are needed to explore the clinical potential of this feature in head and neck squamous cell carcinoma prognosis.