Construction of a Six Immune-Related lncRNAs Signature to Predict Outcome, Immune Cell Inltration and Immunotherapy Response in Patients with Hepatocellular Carcinoma

Background: Hepatocellular carcinoma (HCC) is one of the most lethal malignant tumors worldwide with poor prognosis. Growing evidence has demonstrated that immune-related long non-coding RNAs (lncRNAs) are relevant to tumor microenvironment (TME) and can help to assess the effects of immunotherapy and evaluate prognosis. This study aimed to identify an immune-related lncRNA signature for the prospective assessment of immunotherapy and prognosis in HCC. Methods: HCC RNA-seq data and clinical information were downloaded from The Cancer Genome Atlas (TCGA) project database. Firstly, we used ESTIMATE to evaluate the tumor microenvironment (TME). Then, cox regression analysis was used to construct a prognostic signature and the risk score. Univariate Cox regression, multivariate Cox regression, principal components analysis (PCA), the receiver operating characteristic (ROC) curve and stratication analyses were applied to con ﬁ rm. Gene set enrichment analysis (GSEA) analysis was employed to explore the biological processes and pathways. Besides, CIBERSORT was used to estimate the abundance of tumor-in ﬁ ltrating immune cells (TIICs). Furthermore, the relationship between the immune-related lncRNA signature and immune checkpoint genes was investigated. Finally, quantitative real-time polymerase chain reaction (qRT-PCR) assays were used to demonstrated the expression of the six lncRNAs. Results:. We identied a six immune-related lncRNAs (MSC-AS1, SNHG3, NRAV) with the ability to stratify patients into high-risk and low-risk groups with signicantly different survival. Univariate Cox regression, multivariate Cox regression, ROC and stratication analyses con ﬁ rmed that the six immune-related lncRNA signature was a novel independent prognostic factor in HCC patients. The high-risk group and low-risk group illustrated different distributions in PCA. GSEA suggested that the six immune-related lncRNA signature is involved in the immune-related biological processes and pathways. Besides, the

the expression of critical immune genes and could predict the clinical response of immunotherapy.
Finally, qRT-PCR demonstrated that the six lncRNAs were significantly differentially expressed in HCC cell lines and normal hepatic cell line.
Conclusions: In summary, we identi ed a six immune-related lncRNA signature with the ability to predict outcome, immune cell in ltration and immunotherapy response in patients with hepatocellular carcinoma.
Background Hepatocellular carcinoma is one of the most common human malignancies and the fourth most common cause of cancer mortality after lung, colorectal and stomach cancer according to World Health Organization. It exerts a heavy disease burden with more than 800,000 newly diagnosed cases and nearly 700,000 deaths each year [1]. Due to the aggressive and insidious growth nature of HCC, most patients are diagnosed at advanced stages at which point therapeutic options are limited and ineffective.
Although many advances have been made in the diagnostic and therapeutic strategies of HCC, the outcomes for HCC patients still remain poor. The median survival of advanced HCC patients is about 9 months and the 5-years overall survival rate is only 10% [2]. Considering the high mortality, it is necessary to further study the clinical diagnosis methods of HCC, explore new risk factors and molecular markers, and develop new therapeutic targets, which could improve the clinical prognosis of patients with HCC.
In recent years, the continuous progress of gene and molecular biology technology has revealed that TME plays a vital role in tumor epigenetics, tumor differentiation, immune escape, and in ltration metastasis [3]. Emerging evidence has con rmed that disorders of the immune response in the TME plays a pivotal role in cancer development and progression [4]. The expression and dysregulation of immunerelated genes are involved in the regulation of the immune system. The host immune response and immune cells are crucial factors of the TME that are consistently involved throughout tumor development [3]. HCC shows a high degree of malignancy, and its poor overall survival outcome is due to the collapse of immune surveillance. Therefore, a better understanding of immune-related factors in the TME and their effects on cancer cells will help in the discovery of novel prognostic elements of HCC.
Long non-coding RNA is a class of RNA molecules with transcripts longer than 200 nucleotides frequently dysregulated in various cancers. Abundant studies have reached a consensus that lncRNA is involved in the proliferation, migration, invasion, apoptosis, angiogenesis, and drug resistance of cancer cells.
Dysregulation of numerous lncRNAs has been reported in many different cancers including breast cancer, lung cancer, HCC and so on. For example, HOTAIR, a 2158 bp lncRNA, is more highly expressed in HCC tissues than in nontumor tissues and it promotes cell proliferation, autophagy, and invasion and reduces the response of hepatoma cells to the apoptosis stimulator TNF-α and the chemotherapeutic drugs [5]. In addition to affecting cancer cells themselves, lncRNA also in uence the TME [6]. In TME, lncRNAs contribute to mediating and controlling several immune and cancer cell interactions and important mechanisms of immune response [7]. Additionally, lncRNAs signi cantly affect the tumor immune process as well as the in ltration of TIICs. Thus, discovering some promising prognostic immune-related lncRNA markers and investigating the underlying molecular mechanisms are eagerly anticipated.
In this study, we identi ed the expression of immune-related lncRNAs in 374 HCC patients from TCGA project. By using ESTIMATE of TME, survival analysis, Cox regression model, CIBERSORT of TIICs and other methods, we identi ed a biologically relevant six-lncRNA signature with the ability to predict the prognosis of patients with HCC. Finally, we further veri ed that these six lncRNAs were signi cantly differentially expressed between HCC cell lines and liver cell lines by a quantitative real-time polymerase chain reaction (qRT-PCR). We aimed to take advantage of lncRNA expression pro les to explore immunerelated lncRNAs, which may have potential clinical signi cance in patient management and may shed light on the tumorigenesis of HCC.

Acquisition of HCC expression data
The human HCC transcriptome RNA-sequencing data were downloaded from TCGA data (https://portal.gdc.cancer.gov/). The corresponding clinical information, such as gender, age and survival information were also downloaded from TCGA. The data was updated to June 2, 2020. The RNAsequencing data was combined into a mRNA matrix le using Perl programming language (http://www.perl.org/). Then the ensembl IDs of genes were converted into gene names.

Evaluation of tumor microenvironment in ltration patterns
The ESTIMATE method was applied to evaluate the presence of stromal cells and immune cells in TME by calculating speci c gene expression data [8]. The ESTIMATE algorithm was utilized via the R software (https://cran.r-project.org/mirrors.html) to evaluate the tumor microenvironment of each HCC samples.
Samples of HCC were classi ed into high immune cell in ltration group and low immune cell in ltration group and the EstimateScore, ImmuneScore, StromalScore and TumorPurity were caculated.

Analysis of tumor in ltrating immune cells
The CIBERSORT method was applied to estimate the abundance of TIICs based on gene expression data [9]. The R package "CIBERSORT" was used to calculate the proportion of 22 immune cells types in each sample.

Acquisition of immune-related lncRNAs
The immune-related genes were obtained from the Molecular Signatures Database v 7.1 (Immune response M19817, immune system process M13664, http://www.broadinstitute.org/gsea/msigdb/index.jsp). Then, immune-related lncRNAs was identi ed according to correlation analysis between the immune score and lncRNA expression level in samples with an absolute value of > 0.5 and p < 0.001.

Acquisition of survival-related lncRNAs
The immune-related lncRNA expression was combined with survival data (samples with overall survival ≤ 30 days were excluded). The survival-related lncRNAs were extracted by univariate cox regression analysis using R "survival" package with signi cant prognostic value P < 0.0001 as the criteria.
Construction of immune-related lncRNA signature model Multivariate Cox regression analysis was used to construct a prognostic signature and the risk score was calculated. The risk score for each patient was as follows: risk score = (lncRNA1 expression × coe cient lncRNA1) + (lncRNA2 expression × coe cient lncRNA2) + …+ (lncRNAn expression × coe cient lncRNAn). The risk score model was a measure of prognostic risk for each hepatic cancer patient. The median risk score served as a cutoff value to classify the patients into a high-and a low-risk groups for the following study.
Validation of immune-related lncRNA model The R package "survival" and "survminer" were used to plot Kaplan-Meier survival curves to compare the survival difference for both groups. The R package "survivalROC" was used to investigate the prognostic value of the immune-related lncRNA signature over time. The receiver operating characteristic curve (ROC) was used to examine the performance of the survival-related lncRNAs. The R package "pheatmap" was used to generate the heatmap. Univariate and multivariate Cox regression analysis was used to evaluate the prognostic relationship between risk score and age, gender, grade, clinical stage and TMN stage and The R package "ggpubr" was used.

Principal components analysis
Principal components analysis (PCA) was carried out to demonstrate the expression patterns of immunerelated lncRNAs in low-risk and high-risk groups.
Role of immune-related lncRNA signature on the immunologic features Gene set enrichment analysis (GSEA) was used to detect the potential functional phenotype or pathways that immune-related lncRNA may be involved in. In the current study, the gene sets of GO (gene ontology), KEGG (Kyoto Encyclopedia of Genes and Genomes), all immunologic signatures gene, all oncogenic signatures gene, immune response and immune system process were analyzed using GSEA 4.0.3.

Correlation analysis of immune cell in ltration
The association of the immune-related lncRNA signature with 22 TIICs was performed to explore whether this immunerelated lncRNA signature may play a crucial role in immune in ltration in HCC. The R package "pheatmap" was used to generate the heatmap of TIICs. Correlation analysis was performed for abundance of TIICs and risk score. Samples with CIBERSORT output value P < 0.05 are considered signi cant.
Cell culture and quantitative real-time PCR detection of lncRNA expressions in cell lines Four human HCC cell lines (Huh7, HepG2, LM3, SMMC-7721) and one normal human hepatic cell line (LO2) were used for qRT-PCR. Huh7, LM3 and SMMC-7721 were cultured in Dulbecco's modi ed Eagle's medium (DMEM, Hylcone) containing 10% FBS (Gibco). HepG2 was cultured in minimum Eagle's medium (MEM, Hylcone) containing 10% FBS (Gibco). The cells were maintained in a humidi ed incubator at 37˚C with 5% CO2. Total RNA was isolated from cells using the Trizol Reagent (Invitrogen) according to the manufacturer's instructions. Each RNA was reverse-transcribed into cDNA with the the High Capacity cDNA Synthesis kit (Applied Biosystems, Foster City, CA, USA). QRT-PCR was determined using the SYBR Premix Ex Taq kit (Takara Biotechnology Co., Ltd.) on a 7500 Fast Real-time PCR system (Thermo Fisher Scienti c Inc., UK). PCR primer sequences were designed and synthesized by Sangon Biotech (Shanghai) Co., Ltd. The primer sequences for qRT-PCR are shown in Table 1. GAPDH was used as endogenous control. The relative expression levels are calculated as a fold change using 2-ΔΔCt method. Table 1 The primer sequences of six immune-related lncRNAs F primer: forward primer; R primer: reverse primer Statistical Analysis All analysis were carried out by R version 3.6.2 and corresponding packages. P values < 0.05 were considered signi cant (*P < 0.05, **P < 0.01, ***P < 0.001).

Result
The immune landscape of the TME in HCC Transcriptome data and clinical data were downloaded from TCGA database, and and the data contained 50 normal samples and 374 tumor samples. Ensembl IDs of genes were converted into gene names. According to the results of hierarchical clustering algorithm, HCC samples were divided into two groups according to immune in ltration, including the high immune cell in ltration group and the low immune cell in ltration group. Subsequently, the TME of each sample was scored, and the TME characteristics including EstimateScore, ImmuneScore, StromalScore and TumorPurity in the immunity high/low groups were compared. The heatmap showed that the immunity high group had lower Tumor Purity but higher ESTIMATE Score, Immune Score and Stromal Score (Fig. 1a). The TME of the two groups was signi cantly different. The box chart (Fig. 1b) also showed that there was a signi cant positive correlation between immunity high group and ESTIMATE Score, Immune Score and Stromal Score, respectively, while there was a positive correlation between immunity low group and Tumor Purity. To further explore the relationship between high and low immunity levels of TME and in ltration of immune cells, the CIBERSORT algorithm was used to measure the relative proportions of immune cells in immunity high/low groups. It revealed that the abundance levels of B cells memory, B cells naïve, Dendritic cell resting, Macrophages M1, T cells CD4 memory activated, T cells CD8, T cells follicular helper and T cells regulatory (Tregs) were signi cantly higher in the immunity high group than immunity low group. Conversely, Macrophages M0 and Macrophages M2 were signi cantly lower in the immunity high group (Fig. 1c).

Immune-related lncRNAs in HCC
The obtained genes were divided into lncRNAs and mRNAs. A total of 331 immune-related genes were obtained from the Molecular Signatures Database v 7.1 (Immune response M19817, immune system process M13664). 236 immune-related lncRNAs were identi ed by correlation analysis.

Validation of immune-related lncRNAs signature in HCC survival
The risk score of each patient was calculated and the samples were divided into high-risk groups and low-risk groups according to the median value of risk score (Fig. 2c). A higher mortality was observed in high-risk groups than in low-risk groups (Fig. 2d). The heatmap showed that with the increase of risk score, the expression levels of lncRNAs were elevated (Fig. 2e). Kaplan-Meier survival curves showed that the overall survival of the high-risk groups was signi cantly lower than that of the low-risk groups indicating the immune-related lncRNAs signature is effective (p = 1.975e-07) (Fig. 2b). Collectively, these studies identify six immune-related lncRNAs as prognostic signature for HCC.
Evaluation of immune-related lncRNAs as independent prognostic factors in HCC Univariate and multivariate Cox regression analysis were used to assess the independent risk factors. Several clinicopathological factors such as such as age, gender, grade, stage and TMN stage and the six immune-related lncRNA signature based risk score were included. The forest map illustrated that the hazard ratio (HR) of risk score and 95% CI were 1.511 and 1.349-1.694 in univariate Cox regression analysis (p = 1.133e-12) (Fig. 3a), and 1.442 and 1.271-1.382 in multivariate Cox regression analysis (p = 1.382e-08) (Fig. 3b), respectively. The ROC curve analysis was applied to illustrate the accuracy of the risk score model. The area under the ROC curve (area under curve, AUC) was measured. The AUC of risk score, age, gender, grade, tumor-stage, T-stage, N-stage and M-stage are 0.775, 0.454, 0.506, 0.475, 0.743, 0.752, 0.508 and 0.508, respectively (Fig. 3c). These data suggested that the six lncRNA signature was an independent prognostic factor in patients with HCC.

The relationships between immune-related lncRNAs and clinical parameters
To investigate the relationship between immune-related lncRNAs and clinical parameters, we analyzed the correlation of immune-related lncRNAs and the clinical characteristics, such as grade, tumor-stage and T-stage. We found that lncRNAs increased with grade, tumor-stage and T-stage (Fig. 4a-c).

PCA analysis
PCA was used to investigate the different distribution patterns between low-risk group and high-risk group on basis of different expression pro les. The low-risk and high-risk group are represented by green and red dots, respectively. Figure 6a, b, c showed the PCA results based on all genes set, all immune-related lncRNAs set and six immune-realated lncRNAs set, respectively. The results demonstrated that in the six immune-realated lncRNAs set, the low-risk group and the high-risk group were separated into two parts and the immune status of the patients in the low-risk group was distinguished from those in the high-risk group.
GSEA analysis of immune-related lncRNAs singnture The GSEA indicated that the primary increased functions of GO analysis (Fig. 7a)were regulation of gene expression epigenetic, gene silencing, histone deacetylase complex, mRNA binding, regulation of cell cycle phase transition; the primary decreased functions of GO analyses were aromatase activity, alpha amino acid catabolic process, monocarboxylic acid catabolic process, cellular amino acid catabolic process, aspartate family amino acid catabolic process. The primary increased functions of KEGG analysis (Fig. 7b) were endocytosis, vasopressin regulated water reabsorption, oocyte meiosis, ubiquitin mediated proteolysis and cell cycle; the primary decreased functions of KEGG analyses were retinol metabolism, drug metabolism cytochrome P450, fatty acid metabolism, primary bile acid biosynthesis and complement and coagulation cascades. Besides, the results showed that immune-related responses such as activation of immune response, immune response and immune system process were enriched in high-risk groups compared with low-risk groups (Fig. 7c).
Correlation between the immune-related lncRNA signature and immune cell in ltration To assess the relation between immune cell in ltration and our 6 lncRNA signature, we pro led 22 TIICs using CIBERSORT algorithm. The heatmap showed the TIICs abundance in high-risk group and low-risk group (Fig. 8a). Given that these 6 lncRNAs were related to tumor immunity, we analyzed the correlation between the 6 lncRNA signature and the in ltration of immune cell subtypes. The results showed the most signi cant positive correlation with immune in ltration were T cells follicular helper (R = 0.16, p = 0.0024) (Fig. 8b) and eosinophils (R = 0.11, p = 0.049) (Fig. 8c) and the most signi cant negative correlation with immune in ltration were plasma cells (R=-0.17, p = 0.0017) (Fig. 8d), monocytes (R=-0.15, p = 0.0049) (Fig. 8e), T cells CD4 memory resting (R=-0.13, p = 0.014) (Fig. 8f), NK cells activated (R=-0.11, p = 0.043) (Fig. 8g). These ndings suggested that the 6 immune-related lncRNAs were associated with immune cell in ltration in HCC.

Impact of the immune-related lncRNA signature and immune checkpoint gene expression on clinical outcome
To investigate the relationship between the immune-related lncRNA signature and immune checkpoint genes, we compare the expression of immune checkpoint genes including PD1, PD-L1 and CTLA-4 between high-risk group and low-risk group. The results showed that patients in high-risk group have higher expression of immune checkpoint genes than patients in low-risk group (Fig. 9a). Besides, survival distribution of four patient groups strati ed by the high/low-risk score and high/low immune checkpoint gene expression was compared. Kaplan-Meier survival curves showed that patients in low-risk group with high PD1 have better survival than patients in high-risk group with high PD1, and patients in low-risk group with low PD1 have better survival than patients in high-risk group with low PD1 (Fig. 9b). Similar results were seen in PD-L1 and CTLA-4 stratifying groups (Fig. 9c, d). These results indicated that the immune-related lncRNA signature may be a potential predictive biomarker of treatment response to immunotherapy.
Expression level of six immune-related lncRNAs in cell lines Finally, we detected the expression levels of six lncRNAs in four human HCC cell lines (Huh7, HepG2, LM3, SMMC-7721) and one normal human hepatic cell line (LO2). The results showed that the six lncRNAs were highly expressed in four HCC cell line than in normal human hepatic cell line (Fig. 10a-f).

Discussion
Hepatocellular carcinoma is one of the most common malignancies worldwide, and although considerable time, effort, and expense have been invested, its incidence and mortality continue to increase annually [10]. Therefore, it is of great clinical signi cance to explore the early diagnosis of HCC and accurately predict the prognosis, and determine the potential therapeutic targets and prognostic indicators of HCC.
It is known that the TME contains tumor cells and surrounding immune cells, endothelial cells, broblasts, extracellular matrix, secreted cytokines, chemokines, etc. Tumors can create a series of favorable conditions for themselves through the TME and even escape the immune cycle. A recent study has found that cellular immune responses are involved in the development of HCC and may be among the factors in uencing poor prognosis in HCC [11]. In this study, expression data from the TCGA database was analyzed based on ESTIMATE method, we demonstrated that there were signi cant differences in EstimateScore, ImmuneScore, StromalScore and TumorPurity between the immunity high group and immunity low group of HCC. Besides, we found that the proportions of TIICs varied between the two groups based on CIBERSORT method.
With the development of next generation sequencing technology, a large number of genomic and transcriptomic sequences are available in public databases. Through mining the public database, researchers found that lncRNAs acted as a key regulator for the development of cancer in various cellular functions, including proliferation, cell differentiation and DNA stability, etc [12]. Although the roles of many lncRNAs in HCC remain elusive, a small part has been extensively investigated. For instance, the HBx-LINE1 activates Wnt signaling, promotes HCC development and progression, and correlates with shorter patient survival [13]. lncRNA HULC is upregulated in HCC and promotes HCC growth, metastasis and drug resistance [14]. LncRNA-WRAP53 is an independent prognostic marker in relapse-free survival and may serve as a serum biomarker for HCC diagnosis and prognosis [15]. These observations point to the considerable potential of lncRNAs as a source of novel targetable molecules for HCC precision therapy and for discovering new diagnostic biomarkers.
Recently, evidence has indicated that lncRNAs can regulate immune cell differentiation and function, such as dendritic cell activity, T cell ratio and metabolism [6] and thus are potential targets for cancer therapeutics and possess predictive value for survival prognosis [4]. Through mining the transcriptome sequencing data via bioinformatics analysis, many studies have already established the lncRNA signature for predicting the prognosis of cancers, including thyroid cancer [16], breast cancer [17], renal cell carcinoma [18], bladder cancer [19], gastric cancer [20] as well as HCC [21]. In this study, we aimed to construct an immune-related lncRNAs signature in HCC. A total of 331 immune-related genes were obtained from the Molecular Signatures Database (Immune response M19817, immune system process M13664). 236 immune-related lncRNAs were identi ed by correlation analysis. By using univariate cox regression, we identi ed six immune-related lncRNAs including MSC-AS1, AC145207.5, SNHG3, AL365203.2, AL031985.3, NRAV as prognostic signature for HCC. By using risk score methods, we developed a six immune-related lncRNA signature which was able to classify HCC patients into the highrisk group and low-risk group with signi cantly different overall survival. We analyzed the relationship between age, gender, grade, tumor stage and risk score by univariate and multivariate Cox regression analysis. The results showed that only risk score had p < 0.05 in both univariate multivariate Cox regression analysis. The AUC of risk score is 0.775 which was greater than other factors. These data suggested that risk score may be an independent prognostic factor in HCC patients. Then we analyzed the correlation of immune-related lncRNAs and the clinical characteristics, we found that these lncRNAs increased with grade, tumor-stage and T-stage. To investigate the applicability of the signature in different clinical conditions, strati cation analyses were performed. It was observed that the signature was able to assess the risk score in subgroups of HCC patients and predict HCC patients survival in each stratum of age, gender, stage and T-stage. Besides, GSEA was employed to further verify the functional annotation, and we found that immune-related responses activation of immune response, immune response and immune system process were enriched in high-risk groups.
Immune cell in ltration in the TME may affect tumor cell survival, metastasis, and therapy resistance [22,23]. We made a comprehensive analysis of the TME immune cells in ltration landscape by estimation the abundance of 22 TIICs in HCC using CIBERSORT method. We found that eosinophils and T cells follicular helper were positive correlated with lncRNA prognostic signature and monocytes, NK cells activated, plasma cells and T cells CD4 memory resting were negative correlated with lncRNA prognostic signature. These ndings suggested that the six immune-related lncRNA signature may play a role in immune in ltration of HCC. However, the underlying mechanism of this association and its clinical signi cance in HCC require further investigation.
In conclusion, using bioinformatics methods and qRT-PCR experiment, we identi ed six immune-related lncRNAs that was correlated with HCC progression and prognosis and may be applied as an independent prognostic indicator in predicting survival for HCC. Besides, the six immune-related lncRNAs were related to the level of in ltration of tumor-in ltrating immune cells. Therefore, the identi cation of immunerelated lncRNA may provide new targets for the research of the molecular mechanisms and immunotherapy of HCC.
Declarations Figure 1 The immune landscape of the TME in HCC. (a) The heatmap showed that the immunity high group had lower Tumor Purity but higher ESTIMATE Score, Immune Score and Stromal Score. The TME of the two groups was signi cantly different. (b) The box chart showed that there was a signi cant positive correlation between immunity high group and ESTIMATE Score, Immune Score and Stromal Score, respectively, while there was a positive correlation between immunity low group and Tumor Purity. (c) CIBERSORT algorithm revealed that the abundance levels of B cells memory, B cells naïve, Dendritic cell resting, Macrophages M1, T cells CD4 memory activated, T cells CD8, T cells follicular helper and T cells regulatory (Tregs) were signi cantly higher in the immunity high group than immunity low group. Conversely, Macrophages M0 and Macrophages M2 were signi cantly lower in the immunity high group.

Figure 2
Construction and assessment of immune-related lncRNA prognostic signature for HCC. (a) The HR and pvalue from the univariable Cox regression of selected immune-related lncRNAs (MSC-AS1, AC145207.5, SNHG3, AL365203.2, AL031985.3, NRAV) according to the criterion of p < 0.0001. (b) Kaplan-Meier survival curves showed that the overall survival of the high-risk groups was signi cantly lower than that of the low-risk groups. (c) HCC samples were divided into high-risk groups and low-risk groups according to the median value of risk score. (d) A higher mortality was observed in high-risk groups than in low-risk groups. (e) The heatmap showed that with the increase of risk score, the expression levels of lncRNAs were elevated.    Immune-related responses such as activation of immune response, immune response and immune system process were enriched in high-risk groups.