An autophagy-related lncRNA prognostic risk model for thyroid cancer

Thyroid cancer (TC) is the most common malignancy of the endocrine system and its incidence is gradually rising. Research has demonstrated a close link between autophagy and thyroid cancer. We constructed a prognostic model of autophagy-related long non-coding RNA (lncRNA) in thyroid cancer and explored its prognostic value. The data used in this study were all obtained from The Cancer Genome Atlas (TCGA) database and the Human Autophagy Database (HADb). We construct a co-expression network by autophagy-related genes and lncRNA to obtain autophagy-related lncRNAs. After univariate Cox regression analysis and multivariate Cox regression analysis, autophagy-related lncRNAs significantly associated with prognosis were identified. Based on the risk score of lncRNA, thyroid cancer patients are divided into high-risk group and low-risk group. A total of 14,142 lncRNAs and 212 autophagy-related genes (ATGs) were obtained from the TCGA database and the HADb, respectively. We performed lncRNA-ATGs correlation analysis and finally obtained 1,166 autophagy-associated lncRNAs. Subsequently, we conducted univariate Cox regression analysis and multivariate Cox regression analysis, nine autophagy-related lncRNAs (AC092279.1, AC096677.1, DOCK9-DT, LINC02454, AL136366.1, AC008063.1, AC004918.3, LINC02471 and AL162231.2) significantly associated with prognosis were identified. Based on these autophagy-related lncRNAs, a risk model was constructed. The area under the curve (AUC) of the risk score was 0.905, proving that the accuracy of risk signature was superior. In addition, multiple regression analysis showed that risk score was a significant independent prognostic risk factor for thyroid cancer. In this study, nine autophagy-related lncRNAs in thyroid cancer were established to predict the prognosis of thyroid cancer patients.


Introduction
Thyroid cancer is the most common malignant tumor of the endocrine system, ranking in 9th place for incidence in 2020, with a global incidence three times higher in women than in men [1,2]. Due to the development of diagnostic techniques and clinical treatment, the incidence of thyroid cancer has been steadily increasing, while it has a better prognosis and low mortality rate [3,4]. According to its histopathological characteristics, thyroid cancer can be classified as papillary 1 3 thyroid cancer (PTC), follicular thyroid cancer (FTC), medullary thyroid cancer (MTC) and anaplastic thyroid cancer (ATC), among which PTC is the most common, accounting for 85%-90% of all [5][6][7]. Surgery is the preferred treatment for thyroid cancer, in most cases of thyroid cancer the 5-year survival rate can exceed 90%, with the exception of ATC, which has a higher mortality rate [8]. Therefore, the development of valid screening biomarkers has a pivotal role.
Autophagy is an intracellular phenomenon of "selfphagocytosis", a physiological process in which cell membranes wrap organelles and cytoplasmic proteins within cells and fuse them with lysosomes to form autophagic lysosomes and degrade them [9]. Autophagy can occur in normal physiological processes and also plays an important role in pathological processes, and extensive studies have explored the intricate role of autophagy in cancer [10,11]. In the cancer-related research, autophagy is a "double-edged sword". Autophagy can inhibit tumorigenesis in the initial stage of cancer, and it can also promote tumor progression in the later stage of cancer [12,13]. Therefore, autophagy has become a potential therapeutic target for cancer treatment [14].
In recent years, long non-coding RNA (lncRNA) has become a rising star in the field of cancer diagnosis and treatment [15]. LncRNA refers to a type of non-coding regulatory RNA with over 200 nucleotides (nt) in length that does not have any protein-coding ability [16]. LncRNA cannot directly encode proteins or peptides, but lncRNA can regulate gene expression and function in a variety of biological processes, such as epigenetic regulation, transcription and post-transcription, splicing, metastasis and apoptosis [17][18][19]. Various lncRNAs have been shown to promote the development of thyroid cancer and can be used as novel biomarkers for early diagnosis and treatment, such as MALAT1, HOTAIR and BANCR [20]. At present, prognostic studies on autophagy-related genes in thyroid cancer have been reported, but there are no studies on the prognosis of autophagy-related lncRNAs in thyroid cancer [21,22]. Therefore, further investigation is necessary to provide clues for understanding their roles in thyroid cancer.

Acquiring information of patients with thyroid cancer
The clinical information and pathology records of the TCGA-THCA patients were acquired from TCGA (https:// portal. gdc. cancer. gov/). We obtained 567 thyroid cancer samples, which included 58 normal samples and 509 tumor samples.

Selection of autophagy-related lncRNAs in thyroid cancer
The autophagy genes were obtained from HADb (http:// autophagy.lu/clustering/index.html). The Ensembl human genome browser was used to classify and annotate the lncR-NAs and protein-coding genes. The autophagy-related lncR-NAs were selected according to the criteria of |Correlation Coefficient|> 0.4 and P < 0.001 by Pearson correlation analysis.

Construction of the risk signature
To identify autophagy-related lncRNAs associated with survival, we initially performed univariate Cox proportional hazards analysis with P < 0.01 as the criteria. And then we included these lncRNAs into multivariate Cox regression analysis based on the Akaike information criterion (AIC) to construct risk model. Based on this risk model, we calculated the risk score for each patient with thyroid cancer using the following formula: Risk score = ∑(expression lncRNA n × coefficient lncRNA n). Thyroid cancer patients were divided into highrisk group and low-risk group according to the median risk score. Survival analysis was used to compare the difference in survival between the high-risk group and low-risk group, and the AUC was obtained by the receiver operating curve (ROC) analysis to assess the predictive effect of the model.

Functional annotation
Gene set enrichment analysis is a powerful bioinformatics tool for gene function annotation [23]. Gene set enrichment analysis was used to analyze differentially expressed genes in the high-risk group and low-risk group and to compare the significantly enriched pathways in the two groups.

Statistical analysis
All statistical analyses were performed using Strawberry Perl (version 5.32.0.1) and R software (version 4.0.2). Cytoscape software (version 3.7.1) was used to construct autophagy-lncRNA co-expression network. GSEA software (version 4.0.1) was used to perform functional annotations. P value less than 0.05 was considered to be statistically significant.

Construction of the autophagy-related gene lncRNAs co-expression network
A total of 14,142 lncRNAs and 212 autophagy-related genes were obtained from the TCGA database and HADb 1 3 database, respectively. And then, we performed lncRNA-ATGs correlation analysis with correlation coefficient = 0.4 and P < 0.001 to obtain co-expression results, and finally obtained 1,166 autophagy-related lncRNAs.

Establishment of the optimal prognostic risk model of nine autophagy-related lncRNAs
To further assess the prognostic value of autophagy-related lncRNAs, we first performed a univariate Cox regression analysis, which resulted in 28 autophagy-related lncR-NAs associated with prognosis (Table 1). Subsequently, a multivariate Cox regression analysis was performed on 28 autophagy-related lncRNAs associated with prognosis to further identify independent prognostic factors. Then we optimized the model according to the Akaike information criterion, and selected nine lncRNAs as independent prognostic factors, suitable for model construction, namely, Table 2).
Then we demonstrated the relationship between these nine autophagy-related lncRNAs and the prognosis of thyroid cancer patients by Kaplan-Meier survival curves ( Fig. 1). The results revealed that patients with high expression of AL136366.1, AC008063.1, AC092279.1, DOCK9-DT, LINC02471 had better prognosis. On the contrary,  patients with high expression of AC096677.1, AC004918.3, AL162231.2, LINC02454 had worse prognosis. Based on the relationship between these nine prognosis-related lncR-NAs and ATGs we constructed a co-expression network and combined with the risk score from the multivariate Cox regression analysis we plotted a Sankey diagram (Fig. 2a, b).

The effect of the autophagy-related lncRNA risk model on thyroid cancer prognosis
Based on the median risk score obtained from multivariate Cox regression analysis, patients with thyroid cancer were divided into high-risk and low-risk groups. The survival curve results showed that the survival rate of patients in the high-risk group was significantly lower than that in the low-risk group (Fig. 3). The risk curve, scatterplot and heat map showed the high-risk group had higher risk coefficients and mortality compared to the low-risk group (Fig. 4a-c).
The ROC curve was used to assess the effectiveness of risk model. The AUC of the risk score was 0.905, indicating that the risk model is effective and considerably reliable (Fig. 5).
Univariate and multivariate analysis of independent prognostic was used to assess whether the risk model was an independent prognostic factor. The results of univariate Cox regression analysis showed that the hazard ratio (HR) of the risk score and 95% CI were 1.021 and 1.008-1.033 (P < 0.001). And the results of multivariate Cox regression analysis showed that the HR of the risk score and 95% CI were 1.065 and 1.025-1.107 (P < 0.001) (Fig. 6a, b). The results indicated that the risk score of the prognostic risk model of the nine autophagy-related lncRNAs can be used showed the association between prognostic autophagy-related lncR-NAs, autophagy-related genes, and risk types 1 3 as an independent prognostic factor for patients with thyroid cancer.

GSEA
Gene set enrichment analysis contributes to further understanding the function of genes. We performed Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis respectively. The enriched gene sets were obtained based on nominal P value < 0.05 and a false discovery rate (FDR) value < 0.25 after performing 1000 permutations. Both KEGG and GO enrichment analyses showed that the spliceosome-related pathways were significantly enriched in the low-risk group (Fig. 7a, b). The detailed information shown in Table 3. In addition, the results of GO enrichment analysis indicated that differentially expressed genes between high and low-risk groups were involved in the RNA splicing, RNA binding, viral transcription and initiation from RNA polymerase.

Discussion
Thyroid cancer is the most common malignancy of the endocrine system. In addition to ATC, the majority of patients with thyroid cancer have a better prognosis and higher overall survival [24]. The treatment for patients with thyroid cancer is usually based on surgical resection and also includes radioiodine therapy and long-term thyroid hormone replacement therapy [25,26]. It is well known that autophagy is a double-edged sword, which can both promote and inhibit tumorigenesis. Studies have shown autophagy is related to the progression of thyroid cancer. The induction of autophagy can make thyroid tumor cells sensitive to certain therapies and plays an important function in the regulation of apoptosis [27]. With the evolution of high-throughput sequencing technology, an increasing number of studies have shown that lncRNAs play an important role in the occurrence and development of cancer. Nowadays, survival models for autophagy-related genes have been established in thyroid cancer, but no model for autophagy-related lncRNA survival is available [28]. Therefore, it is necessary to construct a risk signature based on autophagy-related lncRNA in thyroid cancer. In this study, the TCGA database was used to collect RNA sequences and related clinical information of thyroid cancer patients, and the HADb database was used to collect autophagy-related gene information. After univariate Cox regression analysis and multivariate Cox regression analysis, a nine-autophagy-related lncRNAs significantly associated with prognosis was identified. We stratify patients with thyroid cancer into high-risk and low-risk groups based on the median prognostic risk scores, and the results showed that the high-risk group had a poorer prognosis. The ROC curve and AUC validated the accuracy of the risk model. The AUC of the risk score was 0.905, proving that the accuracy of risk signature was superior.
Of the nine autophagy-related lncRNAs, five lncRNAs (AL136366.1, AC008063.1, AC092279.1, DOCK9-DT, LINC02471) were protective factors for the prognosis of thyroid cancer, while other four lncRNAs (LINC02454, AC096677.1, AC004918.3, AL162231.2) behaved the opposite. LINC02454 and AL136366.1 have been shown to be pivotal mediators in the pathophysiological process of PTC and are closely associated with the development and prognosis of thyroid cancer [29]. Tan et al. reported that LINC02454 is highly expressed in PTC [30]. And LINC02454 may function as an oncogene that inhibits the apoptosis and enhances proliferation of PTC cells. Zhang et al. constructed an overall survival model composed of eight signature lncRNAs [31]. It showed that DOCK9-DT is a protective factor for thyroid patients. Dong et al. pointed out AC004918.3 was involved in constituting a prognostic signature for clear cell renal cell carcinoma (ccRCC) and could be a valid prognostic indicator for ccRCC [32]. Chen et al. demonstrated that LINC02471 can promote the development of PTC. Knockdown of LINC02471 can also inhibit the invasion and metastasis of PTC, and promote apoptosis of PTC cells by directly targeting miR-375 [33]. Cal et al. reported that LINC02471 can be used as a molecular biomarker for the progression and prognosis of thyroid cancer [34]. Yang et al. indicated that LINC02471 may be a valid prognostic indicator for ccRCC [35]. These results are almost same as ours and further proving that our results are reliable. To further explore the functions of nine autophagy-related lncRNAs signature in thyroid cancer, we performed KEGG pathway analysis and GO enrichment analysis. KEGG pathway analysis indicated that the pathways association with spliceosome, RNA polymerase and base excision repair were prominently enriched in the low-risk group. In addition, in GO enrichment analysis we found that differentially expressed lncRNAs are involved in the synthesis of U2-type spliceosomal complex, RNA splicing, SNRNA binding, viral transcription and the initiation of RNA polymerase. These results contribute to our further understanding of the nine autophagy-related lncRNAs signature.
To sum up, we constructed and confirmed a nine autophagy-related lncRNAs risk model which can perform as an independent prognostic factor of thyroid cancer. These may provide a certain theoretical basis for the screening, diagnosis and treatment of patients with thyroid cancer in the future.
This study is the first to construct a prognostic risk model of autophagy-related lncRNAs using bioinformatics in thyroid cancer. Even though we used the extremely authoritative TCGA database, there were still problems with our study. We only used the TCGA database, which lacks further validation of external datasets. Furthermore, we need some systematic and comprehensive in vivo or in vitro experiments to further validate our conclusions.