Three Autophagy-Related lncRNA are Prognostic Biomarkers for Patients with Head and Neck Squamous Cell Carcinoma

Background: Currently, no systematic analysis has been conducted to assess the potential of multiple autophagy-related long non-coding RNAs (lncRNA) to predict the prognosis of head and neck squamous cell carcinoma (HNSCC). we investigated the prognostic potential of autophagy-related long non-coding RNAs (lncRNA) in HNSCC patients. Methods: Patient information and Autophagy-associated genes were obtained from The Cancer Genome Atlas (TCGA) and Human Autophagy data resource. Autophagy-related lncRNAs were determined through Lasso and Cox regression analyses. Then, on the basis of autophagy- related lncRNAs, a risk score and a nomogram were constructed for estimation of prognostic outcomes for HNSCC patients. These models were veried internally using the TCGA and. Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) were used for gene functional analyses. Results: Three autophagy-related lncRNAs (AC002401.4, AC245041.2 and TMEM44-AS1) that are associated with HNSCC were identied. Univariate and multivariate Cox regression analyses revealed that the risk score is an independent prognostic indicator (p ≤ 0.001), with its ability to predict prognosis being higher than that of other clinicopathological indicators (AUC=0.732). Concordance index of the nomogram was 0.712, and AUC values for one-year, three-year and ve-year survival rates were 0.730, 0.745 and 0.728, respectively. Internal verications revealed that this nomogram had a good ability to predict prognosis. Functional analysis showed that the genes were mostly enriched in autophagy and tumor-related cascades. Conclusion: The autophagy-related lncRNAs model can predict the prognosis of patients with HNSCC. a CCRCC prediction signature that included AC245041.2. They reported that elevated AC expression levels indicate poor prognosis for CCRCC patients. To establish the biological behaviors of these three autophagy-related lncRNAs, we constructed a co-expression network of mRNA and lncRNAs, and performed functional enrichment analysis. Functional enrichment analysis revealed multiple GO terms and signaling pathways associated with autophagy and tumor development, such as regulation of autophagy, HPV, apoptosis, and the PI3K-Akt signaling pathway. The roles of these three lncRNAs in HNSCC will be evaluated in detail in our further research.


Introduction
Globally, head and neck cancer is the sixth most cancer type [1]. Annually, about 500,000 patients are diagnosed with head and neck cancer, with more than 300,000 mortalities [1]. About 95% of head and neck cancer cases are squamous cell carcinoma. Although the clinical diagnosis and treatment of head and neck squamous cell carcinoma (HNSCC) has greatly improved, its ve-year survival rate is still about 50% [2][3][4]. High mortality rates are associated with late diagnosis and lack of effective prognostic biomarkers. Currently, the main method for predicting HNSCC prognosis is the tumor staging system, which is characterized by a poor predictive accuracy. Therefore, it is important to establish effective biosignatures for prognostic prediction and for the development of molecular targeted therapy for HNSCC.
Autophagy is the process through which intracellular organelles and proteins are degraded and reabsorbed by lysosomes. Autophagy plays a signi cant role in biological development and maintenance of body homeostasis [5]. In addition, autophagy is involved in several tumors [6]. Autophagy inhibits cancer cell formation during early tumor stages [7]. Besides, autophagy promotes cancer cell invasion and growth during tumorigenesis [8,9]. Studies have investigated the mechanisms through autophagy is involved in HNSCC. Autophagy inhibitors work synergistically with chemotherapeutic drugs to enhance cancer cell apoptosis, thereby inhibiting HNSCC growth [10,11]. In addition, excess autophagy in HNSCC cells activate autophagic cancer cell death [12,13]. Studies have shown that autophagy-related signatures can be used as emerging biomarkers for effectively predicting clinical prognoses of various cancer types, including HNSCC. Therefore, establishing the mechanisms through which autophagy is involved in the HNSCC occurrence, metastasis and drug resistance will help in the identi cation of new prognostic markers and therapeutic targets. In cancer, autophagy is regulated by many factors. At the genetic level, in addition to the classic messenger RNA (mRNA) and microRNA (miRNA), the role of long noncoding RNA (lncRNA) cannot be ignored.
Studies have investigated the roles of lncRNAs, a class of non-coding RNAs that are involved in transcriptional regulation, intracellular material transport and chromosome remodeling [14]. By regulating autophagy, lncRNAs are involved in invasion, migration, apoptosis and proliferation of HNSCC cells [15,16]. Moreover, lncRNAs are effective biomarkers for cancer prognosis prediction and diagnosis [17][18][19]. In view of these lncRNA characteristics, we postulated that autophagy-related lncRNAs may be effective for evaluating the prognostic outcomes of HNSCC patients.
A limited number of studies have evaluated the role of autophagy-related lncRNAs in HNSCC, and they all focused on a single gene [20][21][22]. Comprehensive multi-gene studies on the role of autophagy-related lncRNAs in the prediction of HNSCC are lacking. Therefore, we identi ed 3 autophagy-related lncRNAs from the TCGA database and established a risk score signature and a nomogram model for predicting the prognosis of HNSCC patients. Figure 1 shows the schematic process involved in this study. Transcriptome and clinical data of HNSCC patients were retrieved from the Cancer Genome Atlas (TCGA) web data resource (https://portal.gdc.cancer.gov/). Autophagy-associated genes (ATGs) were obtained from the human autophagy data resource (HADb; http://autophagy.lu/). All data sets met the following criteria: i. Patients diagnosed with HNSCC and ii. Each sample had complete clinical information and lncRNA data.

Information Collection
A total of 9 HNSCC and 9 paired adjacent normal tissues (ANT) were collected at Beijing Stomatological Hospital A liated to Capital Medical University from March 2015 to June 2018. Every patient involved in this study signed an informed consent form and agreed to the use of their clinical information and tissues in scienti c research. Ethical approval for this study was obtained from the Ethics Committee of Capital Medical University.
2.2 Screening and identi cation of differentially expressed autophagy-related lncRNAs Total RNA expression data (TPM) were standardized by log2 conversion. The edgeR package in R software was used to screen differentially expressed lncRNAs between HNSCC (n = 501) and adjacent normal tissues (n = 44) in TCGA. Speci c screening conditions were |log 2 fold change (FC) |> 0.5 and adjusted p < 0.05.
Pearson's correlation coe cient analysis was performed to determine the correlation between lncRNA and ATG. Autophagy-related lncRNAs were identi ed by a square of correlation coe cient greater than 0.3 (|R| 2 > 0.3) and p < 0.001).
A total of 498 TCGA-HNSC patients met the inclusion criteria, and were named the "entire cohort". Then, lncRNAs selected in the previous step were combined with clinical data of the entire cohort. Subsequently, the caret package in R software was used to randomly assign the entire cohort into training and test cohorts. The training cohort was used to construct a risk signature and a nomogram. The testing, entire cohorts were used to verify the robustness and effectiveness of the models.

Development of a Prognostic Signature
In the training cohort, autophagy-related lncRNAs with a prognostic value were analyzed using Kaplan-Meier (K-M) overall survival assessment and univariate Cox regression (p < 0.01). Least absolute shrinkage and selection operator (LASSO) regression were used for further screening. Then, multivariate Cox regression was performed to determine whether each autophagy-related lncRNA is an independent prognostic factor for predicting patient survival. A total of 3 autophagy-related lncRNAs were selected for the development of a prognostic signature. "Survival", "glmnet" and "survminer" packages in R were respectively applied in the above steps.
The cytoscape software (version 3.8.1, http://www.cytoscape.org/) was used to visualize the coexpression network of lncRNA and mRNAs. Moreover, the ggalluvial package in R (version 4.0.2) was used to generate a Sankey diagram to further explore the relationship between mRNA, lncRNA and risk types.
On the basis of the 3 identi ed autophagy-related lncRNAs, a risk score was developed for each patient. The risk score algorithm was as follows: , where "y" represents the risk score, "coef" represents the regression coe cient, and "x" represents the gene expression value.

Independent Prognostic Analysis and Nomogram Development
In the training cohort, based on the median risk scores, HNSCC patients were clustered into high as well as low-risk groups. K-M survival curves were used to analyze OS rates of individuals with HNSCC. After integrating the risk score with other clinical variables, Cox regression as well as receiver operating characteristic (ROC, R package "survival ROC") curves were used to analyze if the risk score can be utilized as an independent predictor of prognosis for HNSCC patients.
Risk scores and independent clinical variables were used to create a nomogram for further analysis of one-year, three-year and ve-year survival rates of HNSCC patients (R "survival", "rms", "Hmisc", "lattice", "Formula" and "ggplot2" packages). Concordance index (C-index, package "survcomp"), ROC curve as well as calibration curves (package "rms") were used to evaluate accuracy of the nomogram.

Veri cations of the Prognostic Signature and Nomogram
To test the robustness and effectiveness of the prognostic signature and the nomogram, the same risk score calculation formula was used for veri cation in the testing cohort (n = 248), the entire cohort (n = 498).

Relationships between the Prognostic Signature and Clinical Variables
We evaluated the relationship between risk scores and clinical variables. In different cohorts, patients were clustered into two groups on the basis of age, T stage, gender, stage, and N stage. Then, mean risk scores for each group of patients were calculated. Student's t-tests were then used to investigate differences within each group using p < 0.05.

Functional Analysis
Bioinformatics analysis of mRNAs associated with lncRNAs were performed to assess main biological attributes of autophagy-related lncRNAs. Gene Ontology (GO) analysis was performed to determine the biological processes (BP), molecular functions (MF), and C (CC) in which lncRNAs were enriched. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis was used to investigate potential signaling pathways involved in lncRNAs.

Quantitative Reverse Transcription (qRT)-PCR
Total RNA were extracted from tissues and cells using the TRIzol reagent (CWbiotech, China) according to the manufacturer's instructions. The Prime Script Reverse Transcription Kit (Takara, Japan) was then used to synthesize cDNA. Real-time PCR was performed to quantitate lncRNAs using the SYBR Green PCR Kit (Takara, Japan). Relative expressions of lncRNAs were determined by the 2-△△Ct method. Sequences of primers used in this study are presented in Table 1.

Differentially Expressed Autophagy-related lncRNAs with Prognostic Signi cance
A total of 14142 lncRNAs were extracted from RNA-seq data of TCGA-HNSC samples. Among them, 870 lncRNAs were found to be differentially expressed between head and neck cancer and adjacent normal tissues (|log 2 FC| > 0.05, adjusted p < 0.05, Fig. 2a, Table S1). Further, 232 ATGs were retrieved from the HADb (Table S2). Out of the 232 ATGs, 210 were expressed in HNSCC. The pearson correlation coe cient was used to establish the correlation between lncRNA and autophagy-related mRNAs with |R| 2 > 0.3 and p < 0.001. A total of 269 differentially expressed autophagy-related lncRNAs were identi ed (Table S3).
In the training cohort (n = 250), K-M survival and univariate Cox analyses of these lncRNAs combined with their respective clinical data resulted in the identi cation of 3 autophagy-related lncRNAs that had prognostic values (Table S4, p < 0.001). Moreover, 3 candidate lncRNAs were determined through LASSO regression analysis ( Figure S1). Furthermore, 3 representative autophagy-associated lncRNAs (AC002401.4, AC245041.2 and TMEM44-AS1) were obtained through multivariate Cox analysis. These lncRNAs exhibited a hazard ratio (HR) > 1 and were considered to play a risk role in HNSCC (Fig. 2b, Table 2). Co-expression networks of ATGs and lncRNAs and Sankey diagram indicating risk functions of lncRNAs are shown in Fig. 3. Each lncRNA exhibited a high predictive value for overall survival rates of HNSCC patients in all cohorts ( Figure S2). Based on the median risk score, patients were clustered into high-risk and low-risk groups. K-M survival curve assessment revealed that the OS rate of the high-risk group was remarkably low when compared to that of the low-risk group (p = 4.619e-05) (Fig. 4a). TCGA-HNSC patients were ranked based on risk scores (Fig. 4b). Patients with higher risk scores exhibited lower survival time (Fig. 4c). Expressions of protective genes decreased with increasing risk scores, whereas expression of risk genes increased with increasing risk scores (Fig. 4d). These ndings show that the risk score accurately re ects patient prognosis.
Univariate and multivariate Cox regression analyses were used to assess clinical signi cance of autophagy-related lncRNAs risk signatures. In both univariate and multivariate Cox regression analyses, the risk score was found to be an independent prognostic indicator ( Fig. 4e-f). ROC curve analysis combined with other clinical indicators revealed that AUC (0.732) of the risk score was greater, compared to AUC of other clinical indicators (Fig. 4g). These ndings imply that the risk biosignature of autophagyassociated lncRNAs is an independent prognostic indicator for survival of HNSCC patients.
To verify the robustness of this risk signature, we performed internal veri cation in the testing cohort (n = 248) and the entire cohort (n = 498) using the same risk score formula. Patients in the high risk score group exhibited signi cantly low OS when compared to the low score group (Fig. 5a, 6a). The trend in risk scores and patient prognosis for these two cohorts were comparable to those of the training group ( Fig. 5b-d, 6b-d). Findings from cox regression analysis con rmed that the risk score, age and N stage were independent prognostic factors ( Fig. 5e-f, 6e-f). In addition, AUC values of risk scores for testing and entire cohorts were 0.725 and 0.724, respectively, and were higher than those of other clinical indicators (Fig. 5g, 6g).

Establishment and Veri cation of the Nomogram
In the training cohort, risk scores, age and N stage, which were independent prognostic factors, were used to construct the nomogram for accurate prediction of three-year and ve-year survival rates of HNSCC patients (Fig. 7a). The C-index of this model was 0.712. In addition, AUC values for one-year, three-year and ve-year survival rates were 0.730, 0.745 and 0.728, respectively (Fig. 7b). Calibration curves for one year, three-year and ve-year survival rates and their reference lines were also established ( Fig. 7c-e).
In the testing cohort, AUC values for one-year, three-year, and ve-year survival rates were 0.726, 0.693, and 0.618, respectively (Fig. 7f). In the entire cohort, they were 0.725, 0.725 and 0.681, respectively (Fig. 7j). Calibration curves of the two groups were close to their respective reference lines (Fig. 7g-i, k-m). Therefore, the nomogram was accurate for the prediction of one-year, three-year and ve-year survival rates of HNSCC patients.

Correlation Analysis of the Prognostic Signature and Clinical Indicators
In the training cohort, after excluding patients with incomplete clinical information, a total of 198 patients were included in the subsequent analysis. Then, we performed a correlation assessment of the prognostic risk score and clinical indicators (Table 3). M stage was not evaluated as a variable because half of the clinical samples did not have data for M stage. Mean risk score for patients above 65 years old was greater than that for patients below 65 years of age. Increasing T and N stages were correlated with increasing mean risk scores. The same trends were re ected in the testing and entire cohorts (Table S5-6). These results imply that autophagy-related lncRNAs may be involved in tumor development.

Functional Characteristics Analysis
BP, CC and MF results from GO enrichment analysis were obtained (Fig. 8a). Genes in the BP category were enriched in cellular responses to external stimulus and in responses to endoplasmic reticulum stress. In the CC category, related genes were mainly enriched in integrin and protein complexes involved in cell adhesion. In the MF category, related genes were mainly enriched in heat shock protein binding, chaperone and Hsp70 protein binding, as well as in molecular adaptor activity. Top 30 enrichments for autophagy-related genes identi ed through KEGG analysis are shown in Fig. 8b. Genes were found to be mainly associated with "Autophagy-animal", "Human papillomavirus infection", "Pathways of neurodegeneration-multiple diseases", "Small cell lung cancer", and "PI3K-Akt signaling pathway" cascades.

Veri cation of Expression Levels of 3 lncRNAs in Cells and Tissues
According to results from TCGA-HNSC difference analysis, expressions of AC002401.4, AC245041.2 and TMEM44-AS1 in tumor tissues were higher than those in ANT (Fig. 9a). We measured relative expressions of 3 lncRNAs in 9 pairs of HNSC and matched ANT samples. According to qRT-PCR results, compared to paired ANTs, expression levels of 3 lncRNAs in HNSCC tissues were signi cantly elevated, consistent with results in TCGA-HNSC (Fig. 9b-d).
In addition, expression levels of 3 lncRNAs in HNSCC (CAL27 and SCC4) and human oral epithelial (HOEC) cell lines were determined. Compared to HOEC, the three types of lncRNAs were highly expressed in both CAL27 and SCC4 (Fig. 9e-g).

Discussion
Autophagy plays a signi cant role in tumorigenesis, invasion and resistance to radiotherapy and chemotherapy. In HNSCC, autophagy is associated with risk factors such as tobacco, alcohol and HPV [23][24][25]. Autophagy-related genes are potential biomarkers and therapeutic targets for HNSCC [26][27][28]. LncRNAs, a class of non-coding RNAs, are involved in autophagy regulation, and serve a pivotal role in tumor development [29]. Dysregulation of lncRNA can promote tumorigenesis and cancer progression [30]. Moreover, lncRNAs can act as oncogenes or tumor suppressors. Compared to genes encoding proteins, expressions of lncRNAs have higher disease and tissue speci cities [31]. Therefore, we used autophagy-related lncRNAs to establish a prognostic signature and a nomogram.
Individual prognoses for HNSCC patients signi cantly vary, thus, application of a single gene is inaccurate in prediction of patient prognosis. Therefore, to construct risk scores for patients with HNSCC, we screened 3 lncRNAs related to autophagy. The risk score revealed that OS rates for patients in the high-risk group were signi cantly low compared to those of the low-risk group. Univariate and multivariate Cox analyses revealed that the risk score is an independent prognostic factor for HNSCC patients. In addition, compared to other clinical indicators, ROC curve analysis revealed that the risk score was more accurate in predicting patient survival. These ndings show that the risk score model can accurately predict the prognosis of HNSCC patients. In addition, a nomogram was constructed to investigate the prediction of survival rates of HNSCC patients. We veri ed the models using two internal veri cation cohorts and an external cohort. It was found that the risk signature and nomogram had good robustness and repeatability. In summary, the prognostic biosignature, which was based on autophagyrelated lncRNAs, was accurate in predicting the prognosis of HNSCC patients, and has potential clinical application values.
Analysis of the relationship between risk scores and clinical indicators showed that the mean risk score for patients aged over 65 years were higher compared to those of patients aged below 65 years. This nding was consistent with results from previous studies, indicating that age affects the prognosis of HNSCC patients [28,32]. In addition, increasing T and N stages were correlated with increasing mean risk scores. This nding implies that autophagy-related lncRNAs play a role in lymph node metastasis and development of HNSCC.
A limited number of few studies have reported on the mechanisms of autophagy-related lncRNAs in HNSCC. Yang et al. [33] reported that lncRNA CASC 9 is a marker for prognosis and targeted therapy of oral squamous cell carcinoma (OSCC). Expressions of CASC 9 were found to be signi cantly up-regulated in OSCC, and were correlated with tumor size, stage as well as lymph node metastasis. Furthermore, CASC 9 inhibits the apoptosis of autophagy-related OSCC cells through AKT/mTOR axis. Downregulation of LINC00460 elevates the expression of miRNA-206, resulting in down-regulation of STC2, thereby promoting HNSCC cell autophagy and apoptosis [21]. Moreover, lncRNA HOX transcript antisense RNA (LncRNA HOTAIR) was found to be highly expressed after silencing HOTAIR, thereby elevating the expression of mTOR, inhibiting OSCC cell autophagy, migration and proliferation [34]. The expression of lncRNA-growth arrest-speci c 5 (GAS5) in laryngeal squamous cell carcinoma (LSCC) has been reported to be suppressed [35]. Overexpression of GAS5 in AMC-HN-8 cells elevates the expression of autophagyrelated proteins and activates cell autophagy. LncRNA-GAS5 play a tumor suppressor role in LSCC by regulating the miR-26a-5p/ULK2 axis, and is a potential new therapeutic target. More studies should aim at determining the roles of autophagy-related lncRNAs in HNSCC.
Among the 3 identi ed lncRNAs that were associated with autophagy, only AC245041.2 has been previously implicated in clear cell renal cell carcinoma (CCRCC). Wang et al.
[36] developed a CCRCC prediction signature that included AC245041.2. They reported that elevated AC expression levels indicate poor prognosis for CCRCC patients. To establish the biological behaviors of these three autophagyrelated lncRNAs, we constructed a co-expression network of mRNA and lncRNAs, and performed functional enrichment analysis. Functional enrichment analysis revealed multiple GO terms and signaling pathways associated with autophagy and tumor development, such as regulation of autophagy, HPV, apoptosis, and the PI3K-Akt signaling pathway. The roles of these three lncRNAs in HNSCC will be evaluated in detail in our further research.
Studies have investigated the prognostic markers for lncRNA and developed predictive signatures for HNSCC. Xu et al. [37] established a prognostic prediction signature that was based on 11 lncRNAs by analyzing the TCGA database. Ji et al. [18] identi ed 4 lncRNAs that were related to OS by analyzing the TCGA data. Then, they constructed a nomogram for predicting the prognosis of HNSCC patients. However, the existing lncRNA prediction signatures for HNSCC mostly focus on differences between cancer tissues and normal tissues, instead of focusing on the role of autophagy in HNSCC progression.
Chen et al.
[38] developed a prognostic signature based on seven immune-related lncRNAs, and created a nomogram with age, TNM stage, and risk score. The nomogram was found to have a good prognostic predictive ability in HNSCC. However, in this study, differential expression analyses of lncRNAs in cancer and normal tissues were not performed. If the lncRNAs in the prognostic signature are not speci cally expressed in cancer tissues, the possibility of their clinical application as HNSCC biomarkers is low. In this study, differential expression analyses of lncRNAs were veri ed in cancer cells and matched samples, which enhanced the possibility of their clinical applications. The autophagy-related lncRNAs model we developed can accurately predict patient prognosis and provide a theoretical basis for evaluating the mechanism of autophagy during HNSCC progression.
This study has some limitations. First, M stage and HPV infection status of HNSCC patients were not included in the analysis. This was due to incomplete sample data. Second, our study was retrospective and a small sample size was used. Prospective studies with larger sample sizes should be performed to verify our ndings.

Conclusions
In conclusion, we identi ed 3 autophagy-related lncRNAs. Then, we constructed a risk score model and a nomogram for accurate prediction of HNSCC prognosis. This study provides potential biomarkers and therapeutic targets for HNSCC.

Declarations
Ethics approval and consent to participate The study was approved by the Prospective, Observational, Real-world Oral Malignant Tumors Study (POROMS), and the POROMS number is NCT02395367.

Consent for publication
Not applicable Availability of data and materials The raw data of this study are available in the TCGA database (https://portal.gdc.cancer.gov/) and HADb (http://autophagy.lu/ ). The data generated during the research can be obtained in the supplementary materials.
Competing interests Figure 1 Flowchart of data analysis    Veri cation of the prognostic signature by the testing cohort (n =248). (a) Patients in high score group had a signi cantly lower OS than low score group. (b-d) As the risk score increases, the proportion of death patients increases, and the expression level of related lncRNAs gradually increases. (e-f) Age, N stage and risk score were independent prognostic factors. (g). The risk score had the highest AUC value in the ROC curve analysis.

Figure 6
Veri cation of the prognostic signature by the entire cohort (n =498). (a) The OS rate of the high-risk group was lower than that of the low-risk group. (b-d) The distribution trends of risk score, lncRNA expression level and patient survival time were consistent with the training cohort and testing cohort. (e-g) Age, N stage and risk score were still independent prognostic factors, and risk score had the closest relationship with overall survival rate.   The expression of 3 autophagy-related lncRNAs in HNSCC cells and tissues. (a) Heat map of 3 lncRNAs with signi cantly differential expression in TCGA-HNSC. (b-d) qRT-PCR results showed that 3 lncRNAs expression were higher in HNSCC cell lines (CAL27 and SCC4) than in the HOEC. *P < 0.05, **P < 0.01, *** P < 0.001. (e-g) qRT-PCR results showed that 3 lncRNAs expression were higher in HNSCC tissues than in ANTs (P < 0.05).

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.