Identification of a Genomic Instability-Related lncRNA Signature Associated with the Immune Microenvironment in Pancreatic Cancer

Background: Increasing evidence suggests that lncRNAs play critical roles in the maintenance of genomic stability. However, the identification of a genomic instability-related lncRNA signature (GILncSig) and its clinical significance in the tumor immune microenvironment of pancreatic cancer remain largely unexplored. Methods: In the present study, a systematic analysis of lncRNA expression profiles and somatic mutation profiles was performed in pancreatic cancer patients from TCGA. We performed co-expression network and Gene Ontology (GO) enrichment analyses to determine the potential functions and pathways of the lncRNAs associated with genomic instability. We then developed a risk score model, described its characteristics, and verified its prediction accuracy. The ESTIMATE algorithm, single-sample gene set enrichment analysis (ssGSEA), and CIBERSORT analysis were employed to reveal the characteristics of the tumor immune microenvironment in pancreatic cancer. The correlation of the risk signature with immune infiltration and immune checkpoint blockade (ICB) therapy was analyzed. Results: We identified 206 genomic instability-related lncRNAs, of which five were screened to develop a prognostic GILncSig model. Multivariate Cox regression analysis and stratified analysis revealed that the prognostic value of the GILncSig was independent of other clinical variables. ROC analysis suggested that the GILncSig is better than existing lncRNA-related signatures in predicting survival. Additionally, the prognostic performance of the GILncSig was also found to be favorable in patients carrying wild-type KRAS, TP53 and SMAD4. Furthermore, a nomogram exhibited appreciable reliability for clinical application in predicting the prognosis of patients. Finally, the risk score significantly correlated with the immune score, immune-related signatures, infiltrating immune cells (e.g., B cells), and key ICB molecules (e.g., CTLA4).
Conclusion: In summary, the GILncSig identified here may play a crucial role in immune cell infiltration and immunotherapy, and may serve as an important indicator for clinical stratification management and therapy decisions for pancreatic cancer patients.


Introduction
Pancreatic cancer is one of the deadliest cancers, ranking as the fourteenth most common cancer and the seventh leading cause of cancer mortality worldwide. Owing to the lack of obvious early symptoms, pancreatic cancer usually presents at an advanced stage, which results in a 5-year survival rate as low as 6% (range 2-9%) [1]. Although great advances in surgery, chemotherapy, and radiotherapy for pancreatic cancer have been made in the past few years, long-term survival and prognosis remain poor, with more than 80 percent of patients facing recurrence after resection [2]. Many studies have analyzed the relationship between molecular marker expression, clinicopathological features, and long-term survival in pancreatic cancer. However, their impact on early diagnosis and treatment is still limited [3]. Therefore, searching for new prognostic markers that can predict poor patient outcomes may reveal targets for intervention and provide new treatment strategies for pancreatic cancer.
Genomic instability refers to an increased tendency of the genome to acquire mutations, which is typically conferred by the dysfunction of mechanisms such as DNA damage repair, DNA replication, and transcription. Genomic instability is a hallmark of cancer and is related to cancer initiation and progression [4]. In addition, genome stability status is associated with survival and can be used as a prognostic marker for cancer patients [5]. Long noncoding RNAs (lncRNAs) are arbitrarily defined as non-protein-coding transcripts over 200 nucleotides in length [6].
There is increasing evidence that lncRNAs are involved in a variety of biological processes and play a critical role in genome regulation [6][7][8]. Notably, the dysregulation of lncRNAs has been associated with many complex diseases, including cancers [9][10][11]. A number of lncRNAs are abnormally expressed in tumor tissues and have been implicated in tumorigenesis, such as MALAT1 [12], HOTAIR [13], H19 [14], and MEG3 [15].
lncRNAs mainly regulate gene expression and may indicate tumor status better than protein-coding RNAs, so they can be used as novel biomarkers with diagnostic and prognostic significance [16]. Several lncRNA signatures have been developed to predict patient prognosis with good predictive performance in various cancers, including lung cancer [17], head and neck squamous cell carcinoma [18], ovarian cancer [19] and breast cancer [20,21]. Recently, Lee et al. showed that the non-coding RNA activated by DNA damage (NORAD) maintains genomic stability by sequestering PUMILIO proteins [22]. Hu et al. reported that GUARDIN, a p53-responsive lncRNA, maintains genomic integrity both under steady-state conditions and after exposure to genotoxic stress [23]. These results demonstrate the important role of lncRNAs in maintaining genomic stability, but the lncRNAs associated with genomic instability need to be further explored.
In addition, studies have shown that immune cells can act as tumor inhibitors or tumor promoters and may function as important players in the tumor immune microenvironment (TIME). Numerous studies have identified genomic instability as a promising indicator for predicting responsiveness to immune checkpoint blockade. Therefore, we constructed a GILncSig to investigate whether the lncRNA signature could reflect the tumor immune microenvironment and serve as an effective prognostic predictor for patients with pancreatic cancer.
Materials And Methods

Patient dataset:
The clinical information, RNA-seq expression data, lncRNA transcriptional profiles and somatic mutation information of patients with pancreatic cancer were obtained from The Cancer Genome Atlas (TCGA) project (https://cancergenome.nih.gov/) [24]. A total of 171 TCGA pancreatic cancer patients with lncRNA expression profiles, somatic mutations, survival information and clinical features were included in our study. These patients were divided into an 84-sample training set and an 87-sample testing set. The training set was used to identify the prognostic lncRNA signature and establish the prognostic risk model, while the testing set was used to independently validate its prognostic value.

Identification of genomic instability-associated lncRNAs
To identify genomic instability-associated lncRNAs, a computational framework was constructed based on the lncRNA expression profiles and somatic mutation profiles of pancreatic cancer patients. As shown in Figure 1, the cumulative number of somatic mutations per sample was calculated and the samples were arranged in descending order. The top 25% of patients were defined as the genomic instability group (GU group), and the bottom 25% were defined as the genomic stability group (GS group). We then compared the lncRNA expression profiles of the GU group and the GS group using the significance analysis of microarrays (SAM) method. The differentially expressed lncRNAs that passed the fold change and permutation correction filters were defined as genomic instability-related lncRNAs (fold change > 1.5 or < 0.67 and false discovery rate (FDR) adjusted P < 0.05).
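The grouping and fold-change steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary-based data layout, and the use of a ratio of group means as the fold change are all hypothetical, and the SAM permutation test and FDR correction used in the study are not reproduced here.

```python
from statistics import mean

def split_by_mutation_count(mutation_counts, fraction=0.25):
    """Rank samples by cumulative somatic mutation count (descending) and
    return (gu_ids, gs_ids): the top and bottom `fraction` of samples."""
    ranked = sorted(mutation_counts, key=mutation_counts.get, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k], ranked[len(ranked) - k:]

def fold_change_hits(expression, gu_ids, gs_ids, up=1.5, down=0.67):
    """Keep lncRNAs whose GU/GS mean-expression ratio is > up or < down.
    Assumption: fold change is a ratio of group means (not stated in the text)."""
    hits = {}
    for lncrna, by_sample in expression.items():
        gu_mean = mean(by_sample[s] for s in gu_ids)
        gs_mean = mean(by_sample[s] for s in gs_ids)
        if gs_mean == 0:
            continue  # ratio undefined; skipped in this sketch
        fold_change = gu_mean / gs_mean
        if fold_change > up or fold_change < down:
            hits[lncrna] = fold_change
    return hits

# Toy data (hypothetical sample and lncRNA names).
counts = {"s1": 10, "s2": 50, "s3": 5, "s4": 30, "s5": 20, "s6": 1, "s7": 40, "s8": 3}
gu, gs = split_by_mutation_count(counts)
expr = {
    "lncA": {"s2": 8.0, "s7": 6.0, "s8": 2.0, "s6": 2.0},
    "lncB": {"s2": 4.0, "s7": 4.0, "s8": 4.0, "s6": 4.0},
    "lncC": {"s2": 1.0, "s7": 1.0, "s8": 4.0, "s6": 4.0},
}
candidates = fold_change_hits(expr, gu, gs)
```

In a real analysis the fold-change filter would be combined with the FDR-adjusted P value from the permutation test before an lncRNA is called genomic instability-related.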

Statistical Analysis
We carried out a univariate Cox regression analysis to determine the relationship between the expression level of each lncRNA and overall survival in the training set. lncRNAs with a P value less than 0.05 were considered candidate prognostic lncRNAs whose expression levels were significantly associated with the overall survival of pancreatic cancer patients. To assess the contribution of each candidate lncRNA as an independent prognostic factor for survival, multivariate Cox regression analysis was further performed. A P value less than 0.05 was considered significant. A prognostic risk score model of the genomic instability-related lncRNA signature (GILncSig) was constructed from the lncRNA expression levels and the multivariate Cox regression coefficients to predict the prognosis of patients with pancreatic cancer as follows:

$$\mathrm{GILncSig}(\mathrm{patient}) = \sum_{i=1}^{n} \mathrm{coefficient}(\mathrm{lncRNA}_i) \times \mathrm{expression}(\mathrm{lncRNA}_i)$$

In this formula, GILncSig(patient) is the prognostic risk score for a pancreatic cancer patient, lncRNA_i is the i-th of the n prognostic lncRNAs, coefficient(lncRNA_i) is the corresponding coefficient from the multivariate Cox regression analysis, and expression(lncRNA_i) is the expression level of lncRNA_i.
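As a minimal sketch, the risk score described above (each lncRNA's multivariate Cox coefficient multiplied by its expression level, summed over the signature) could be computed as follows. The lncRNA names and coefficient values below are hypothetical placeholders, not the fitted values from this study.

```python
def risk_score(coefficients, expression):
    """Sum of coefficient * expression over the signature lncRNAs.

    coefficients: {lncRNA name: multivariate Cox coefficient}
    expression:   {lncRNA name: expression level for one patient}
    """
    return sum(coef * expression[name] for name, coef in coefficients.items())

# Hypothetical two-lncRNA signature: one risk factor, one protective factor.
example_coefficients = {"lncRNA_A": 0.5, "lncRNA_B": -0.25}
example_patient = {"lncRNA_A": 2.0, "lncRNA_B": 4.0}
example_score = risk_score(example_coefficients, example_patient)
```

A negative coefficient lowers the score as expression rises, which is why an lncRNA with a negative coefficient is read as a protective factor.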
Using the above formula, lncRNA expression-based risk scores were calculated for the pancreatic cancer patients, who were then divided into a high-risk group and a low-risk group using the median risk score of the training set as the cutoff. Kaplan-Meier survival curves were used to estimate the survival rates of the different patient groups, and the survival difference between the high-risk and low-risk groups was assessed by the log-rank test. Time-dependent ROC analysis for overall survival was used to assess the performance of the prognostic risk model. Multivariate Cox regression and stratified analysis were performed to determine whether the GILncSig was independent of other clinical variables. Hazard ratios (HR) and 95% confidence intervals (CI) were estimated by Cox proportional hazards regression. A nomogram predicting 1-, 2-, and 3-year survival was built in the training set from the results of the multivariate Cox regression analysis using the R "rms" and "survival" packages, and was applied to the testing set and the entire TCGA set for verification. The calibration plot was used to assess the prognostic accuracy of the nomogram. All statistical analyses were performed using R software and Bioconductor.
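The median split and the Kaplan-Meier estimate can be illustrated with a short, dependency-free Python sketch. It is not the R workflow used in the study; the function names and toy data are hypothetical, and assigning a score equal to the cutoff to the low-risk group is an assumption, since the text does not specify how ties are handled.

```python
from statistics import median

def assign_risk_groups(scores, cutoff):
    """Label each patient 'high' if score > cutoff, else 'low'.
    Assumption: a score equal to the cutoff is treated as low risk."""
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival probability).
    events[i] is 1 for an observed death and 0 for a censored observation."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process all patients sharing the same follow-up time together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Toy training-set scores; the cutoff is the training-set median.
training_scores = {"p1": 0.0, "p2": 2.0, "p3": 1.0}
cutoff = median(training_scores.values())
groups = assign_risk_groups(training_scores, cutoff)
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

The same training-set cutoff would be reused unchanged when labeling patients in a validation set, so that the validation is not tuned to its own score distribution.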

Functional Enrichment Analysis
We calculated Pearson correlation coefficients between paired lncRNA and mRNA expression profiles and then established an lncRNA-mRNA co-expression network. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of the protein-coding genes co-expressed with the lncRNAs were performed to predict the biological functions of the differentially expressed lncRNAs using the clusterProfiler package in R version 3.5.2 [25].
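A minimal sketch of the co-expression step, assuming expression values are stored as equal-length lists over the same samples. The function names are hypothetical, and both the `top_n` cutoff and the choice to rank partner genes by the signed correlation (rather than its absolute value) are assumptions, since the text does not state how the most related genes were selected.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

def top_coexpressed(lncrna_expr, mrna_exprs, top_n=10):
    """Return the top_n (gene, r) pairs most positively correlated with the lncRNA."""
    scored = [(gene, pearson(lncrna_expr, values)) for gene, values in mrna_exprs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]

# Toy expression profiles over four samples (hypothetical gene names).
lnc = [1.0, 2.0, 3.0, 4.0]
mrnas = {"G1": [2.0, 4.0, 6.0, 8.0], "G2": [8.0, 6.0, 4.0, 2.0], "G3": [1.0, 3.0, 2.0, 4.0]}
partners = top_coexpressed(lnc, mrnas, top_n=2)
```

The selected partner genes for every lncRNA would then form the edges of the co-expression network and the input gene list for enrichment analysis.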

Tumor immune-related analysis
To characterize the tumor immune microenvironment, the R package "ESTIMATE" was used to calculate immune and stromal scores. Immune infiltration information, comprising the immune cell fractions of each tumor sample, was obtained from the Tumor Immune Estimation Resource (TIMER) (https://cistrome.shinyapps.io/timer/). The correlation between tumor immune cell infiltration and the prognostic risk signature was further analyzed. We selected six key immune checkpoint blockade-related genes in pancreatic cancer to investigate the potential role of the lncRNA-based signature in ICB therapy for pancreatic cancer.

Results
1. Identification of genome instability-associated lncRNAs in patients with pancreatic cancer
To detect potential genome instability-related lncRNAs, the cumulative number of somatic mutations in each TCGA patient with pancreatic cancer was calculated. After ranking patients in descending order of cumulative mutation number, the top 25% (n = 43) and the bottom 25% (n = 40) were classified into the GU group and the GS group, respectively. The lncRNA expression profiles of the GU group and the GS group were then analyzed by unsupervised clustering, and a total of 206 lncRNAs were found to be significantly differentially expressed (Figure 2A). All TCGA patients with pancreatic cancer were divided into a GU-like group and a GS-like group by unsupervised hierarchical clustering analysis based on the expression levels of the 206 differentially expressed lncRNAs. The cumulative number of somatic mutations was higher in the GU-like group and lower in the GS-like group (Figure 2B). As shown in Figure 2C, more mutated genes were present in the GU-like group (P < 0.001, Mann-Whitney U test). Because the UBQLN4 gene is one of the driving factors of genomic instability, its expression level was compared between the GU-like group and the GS-like group. UBQLN4 expression was significantly higher in the GU-like group than in the GS-like group (P < 0.001, Mann-Whitney U test; Figure 2D).
To better understand the biological significance of the 206 differentially expressed lncRNAs, functional enrichment analysis was performed to predict their potential functions. We selected the protein-coding genes (PCGs) most related to the expression of each lncRNA to construct an lncRNA-mRNA co-expression network (Figure 3A). According to the enrichment results for the lncRNA-correlated PCGs, GO terms (e.g., in the cellular component (CC) category, DNA binding in the molecular function (MF) category, and metabolism in the biological process (BP) category) and KEGG pathways (e.g., MAPK signaling pathway, cAMP signaling pathway, pancreatic secretion and endocrine resistance) were annotated as being associated with genome instability (Figure 3B-C). Based on these results, the 206 lncRNAs were considered to be involved in genomic instability-related biological processes, and their altered expression may disrupt the genomic stability of cells. Therefore, the 206 differentially expressed lncRNAs were recognized as candidate genomic instability-associated lncRNAs in pancreatic cancer.
2. Acquisition of a genomic instability-associated lncRNA prognostic signature from the training set
To screen for lncRNAs with independent prognostic value, we performed univariate Cox proportional hazards regression analysis of the relationship between the expression levels of the 206 genomic instability-associated lncRNAs and OS in the training set. Seventeen candidate prognostic lncRNAs were found to be significantly associated with the prognosis of pancreatic cancer patients (Figure 4A). Multivariate Cox proportional hazards regression was then applied to the 17 candidate prognostic lncRNAs. Based on the multivariate Cox model (Figure 4B), 5 of the 17 candidate lncRNAs (AL121772.1, BX640514.2, LINC01133, AC087752.3 and LYPLAL1-AS1) retained their prognostic significance and were thus identified as independent prognostic lncRNAs (P < 0.05). Among the five prognostic lncRNAs, one (AC087752.3) had a negative coefficient and was a protective factor whose high expression was closely associated with longer survival, whereas the remaining four lncRNAs (AL121772.1, BX640514.2, LINC01133 and LYPLAL1-AS1) were risk factors.
According to the GILncSig model, the prognostic risk score was computed for each patient in the training set. Using the median risk score as the cutoff point, all patients in the training set were classified into a high-risk group (n = 38) and a low-risk group (n = 46). Kaplan-Meier analysis indicated that overall survival differed significantly between the two risk groups, and patients in the low-risk group had markedly longer overall survival than those in the high-risk group (P = 0.009, log-rank test; Figure 5A). Time-dependent receiver operating characteristic (ROC) curve analysis of the GILncSig prognostic model yielded an area under the curve (AUC) of 0.653 at 1 year of overall survival (Figure 5C). These results suggested that the GILncSig had prognostic value in patients with pancreatic cancer. We then ranked the risk scores of patients in the training set. Figure 5B shows the expression pattern of the 5 independent prognostic lncRNAs, the expression level of UBQLN4 and the somatic mutation count. In patients with high risk scores, the expression levels of the four risk lncRNAs (AL121772.1, BX640514.2, LINC01133 and LYPLAL1-AS1) were up-regulated, while the protective lncRNA (AC087752.3) was expressed at a low level. In contrast, these prognostic lncRNAs showed the opposite pattern in patients with low risk scores. UBQLN4 expression differed significantly between the high-risk group and the low-risk group (P = 0.049, Mann-Whitney U test; Figure 5D). The number of somatic mutations in the high-risk group was slightly, but not significantly, higher than that in the low-risk group (P = 0.09, Mann-Whitney U test; Figure 5D).
3. Validation of GILncSig in the testing set and entire TCGA set
To confirm our findings, the prognostic performance of the GILncSig was further evaluated in the testing set. Patients in the testing set were divided into a high-risk group (n = 43) and a low-risk group (n = 44) using the same GILncSig and the cutoff value derived from the training set. Kaplan-Meier curves showed a significant difference in overall survival between the two groups, with much shorter overall survival in the high-risk group than in the low-risk group (P < 0.001, log-rank test; Figure 5E), similar to the results observed in the training set. In the testing set of 87 patients, the GILncSig produced an ROC curve with an AUC of 0.806 at 1 year (Figure 5G). Figure 5F shows how the expression levels of the GILncSig lncRNAs, the somatic mutation count, and the expression level of UBQLN4 in the testing set change with increasing risk score. Somatic mutation counts and UBQLN4 expression were significantly higher in the high-risk group than in the low-risk group (P = 0.0044 and P = 0.00054, respectively; Mann-Whitney U test).
Similar results were observed when the GILncSig was applied to the entire TCGA set. As in the training and testing sets, the GILncSig stratified the 171 pancreatic cancer patients of the entire TCGA set into a high-risk group (n = 81) and a low-risk group (n = 90) with clearly different overall survival (P < 0.001, log-rank test; Figure 6A). The AUC of the time-dependent ROC analysis for overall survival in the entire TCGA set was 0.724 (Figure 6B). The expression of the GILncSig lncRNAs, somatic mutation counts and UBQLN4 expression levels of pancreatic cancer patients in the TCGA set are presented in Figure 6C and were similar to those observed in the training set and testing set. Somatic mutation counts were significantly higher in the high-risk group than in the low-risk group (P = 0.0022, Mann-Whitney U test; Figure 6D), as was the expression level of UBQLN4 (P = 0.0001, Mann-Whitney U test; Figure 6D).

4. Comparison of the GILncSig and other lncRNA-related predictive signatures for survival prediction
Recently, two lncRNA-related signatures were reported to predict the prognosis of pancreatic cancer patients. We therefore compared the prognostic value of our GILncSig with that of these signatures: the five-lncRNA signature derived from Song's study (hereinafter referred to as SongSig) [26] and the three-lncRNA signature derived from Shi's study (hereinafter referred to as ShiSig) [27]. Using the same TCGA patient set, we performed time-dependent ROC analysis and calculated the area under the ROC curves to compare the predictive performance of the GILncSig and the two existing lncRNA-related signatures in the entire TCGA set. The AUC at 1 year of overall survival for the GILncSig was 0.724, which was higher than that of SongSig (AUC = 0.642) and ShiSig (AUC = 0.556) (Figure 7). We therefore consider that the GILncSig has better prognostic power than these two lncRNA-related signatures.

5. Independence of prognostic value of the GILncSig from other clinical variables
To determine whether the prognostic value of the GILncSig was independent of other clinical variables, multivariate Cox regression analysis was performed in each patient set using the prognostic risk score, age, gender, pathological grade and stage. The multivariate Cox analysis revealed that the GILncSig was significantly associated with overall survival in each set when adjusted for age, gender, pathological grade and stage (Table 1). We also observed that age, gender, pathological grade and stage were significantly associated with overall survival in the multivariate analysis. We therefore performed stratified analyses according to age, gender, pathological grade and stage. By age, pancreatic cancer patients were stratified into an older patient group (age > 65, n = 81) and a younger patient group (age ≤ 65, n = 90). The GILncSig subdivided each age group into a high-risk group and a low-risk group, and overall survival differed significantly between the high-risk and low-risk groups within each age group (log-rank test P = 0.016 for the older patient group and P < 0.001 for the younger patient group; Figure 8A). Next, all patients were stratified by gender. The 78 female patients were classified into a high-risk group (n = 37) and a low-risk group (n = 41) based on the GILncSig. Similarly, the 93 male patients were separated into two groups (high-risk group: n = 44; low-risk group: n = 49). The overall survival of patients in the low-risk group was significantly longer than that of patients in the high-risk group (log-rank test P = 0.002 for the female group and P = 0.001 for the male group; Figure 8B). In addition, all patients in the entire TCGA set were grouped according to tumor size, lymph node metastasis and distant metastasis. Each group was further separated into a high-risk group and a low-risk group by the GILncSig, and overall survival was compared between the two groups. As shown in Figure 8, overall survival differed significantly between the high-risk and low-risk groups in the T3-4, N0, N1 and M0 groups, the difference was marginally significant in the T1-2 group, and no significant difference was found in the metastatic group (M1 group) (P = 0.052 for the T1-2 group, P < 0.001 for the T3-4 group, Figure 8D; P = 0.027 for the N0 group, P = 0.03 for the N1 group, Figure 8G; P = 0.009 for the M0 group, P = 0.317 for the M1 group, Figure 8C; log-rank test). Finally, the same analysis was applied to the pathological grade and stage of patients. All patients were divided into low-grade (G1/G2) and high-grade (G3/G4) groups according to pathological grade. In the stratified analysis, the high-grade patients were divided into a high-risk group (n = 24) and a low-risk group (n = 25), with a trend toward shorter survival in the high-risk group that did not reach significance (P = 0.068, log-rank test; Figure 8E). The patients in the low-grade group were similarly classified into two risk subgroups with significantly different survival times (P < 0.001, log-rank test; Figure 8E). Furthermore, patients with pathologic stage I or II were combined into an early-stage group (n = 161) and those with pathologic stage III or IV were combined into a late-stage group (n = 7). The GILncSig divided each of the early-stage and late-stage groups into a high-risk group and a low-risk group. Overall survival differed significantly between the two risk groups in the early-stage group (P < 0.001, log-rank test; Figure 8F). In the late-stage group, however, the difference in overall survival between the two risk groups was not significant, probably because of the limited sample size (P = 0.549, log-rank test; Figure 8F). Taken together, these results indicated that the GILncSig was an independent prognostic factor associated with overall survival in pancreatic cancer patients.
6. The prognostic significance of GILncSig is better than KRAS, TP53 and SMAD4 mutation status
KRAS, TP53 and SMAD4 are the most frequently mutated genes in pancreatic cancer and are associated with poor prognosis. With this in mind, the mutation status of these three genes was analyzed in the training set, the testing set and the entire TCGA set, and a further stratified analysis based on KRAS, TP53 and SMAD4 mutation status was performed with the GILncSig. The proportion of patients with KRAS, TP53 and SMAD4 mutations was higher in the high-risk group than in the low-risk group, to varying degrees, in each set. For KRAS, 66% of the high-risk group had KRAS mutations in the training set, significantly higher than 16% of the low-risk group (chi-square test P < 0.001). In the testing set, 72% of the high-risk group had KRAS mutations, significantly higher than 46% of the low-risk group (chi-square test P = 0.040). In the entire TCGA set, 69% of the high-risk group had KRAS mutations, significantly higher than 33% of the low-risk group (chi-square test P < 0.001). These results suggest that the GILncSig is closely related to the mutation status of the KRAS gene. We therefore applied the GILncSig to patients with wild-type KRAS (KRAS Wild) and mutant KRAS (KRAS mutation). Patients with KRAS Wild were divided into a low-risk group (KRAS Wild/GS-like) and a high-risk group (KRAS Wild/GU-like), and patients with KRAS mutation were divided into a low-risk group (KRAS mutation/GS-like) and a high-risk group (KRAS mutation/GU-like). Comparative analysis showed that the overall survival of the KRAS Wild/GS-like group differed significantly from that of the KRAS Wild/GU-like group, and patients in the KRAS Wild/GS-like group had a better prognosis (P = 0.01, log-rank test; Figure 9A). For TP53, as shown in Figure 9B, 73% of the high-risk group had TP53 mutations in the training set, significantly higher than 26% of the low-risk group (chi-square test P < 0.001). Similarly, in the TCGA set, the TP53 mutation rate was clearly higher in the high-risk group than in the low-risk group (66% versus 41%, chi-square test P = 0.004). In the testing set, however, TP53 mutations were only slightly more frequent in the high-risk group than in the low-risk group, and the difference was not significant (58% versus 54%, chi-square test P = 0.874). Overall, we believe that TP53 status may be predicted from the GILncSig risk score. Patients with TP53 mutation and wild-type TP53 were then further divided into a TP53 mutation high-risk group (TP53 mutation/GU-like), a TP53 mutation low-risk group (TP53 mutation/GS-like), a TP53 wild high-risk group (TP53 wild/GU-like) and a TP53 wild low-risk group (TP53 wild/GS-like). Survival analysis showed that patients in the TP53 wild/GS-like group had longer survival than those in the TP53 wild/GU-like group, and higher risk scores were associated with lower survival rates in the TP53 wild subgroups (P = 0.002, log-rank test; Figure 9B). SMAD4 showed results similar to those for KRAS and TP53. The patients in the training set, testing set and TCGA set were each divided into a high-risk group and a low-risk group using the GILncSig. In each set, the proportion of SMAD4 mutations was higher in the high-risk group than in the low-risk group, and the difference was significant in the testing set and the TCGA set (P = 0.228 for the training set; P = 0.028 for the testing set; P = 0.009 for the TCGA set; chi-square test; Figure 9C). The patients with mutant and wild-type SMAD4 were further separated into SMAD4 mutation/GU-like, SMAD4 mutation/GS-like, SMAD4 wild/GU-like and SMAD4 wild/GS-like groups. Survival analysis showed that overall survival differed only marginally among the groups (P = 0.062, log-rank test; Figure 9C). Taken together, these findings suggested that the GILncSig may have greater prognostic significance than KRAS, TP53 and SMAD4 mutation status.

Development and validation of a nomogram for predicting survival in patients with pancreatic cancer
To improve the clinical applicability of the GILncSig, we established a prognostic nomogram combining the risk score, age, gender, pathological grade and stage to predict patient survival at 1, 2, and 3 years in the training set, using the "rms" and "survival" packages in R (Figure 10A). In Figure 10B

Correlation of Risk Score with TIME (Tumor Immune Environment Characterization)
TumorPurity, ImmuneScore and StromalScore were calculated with the ESTIMATE method. The results indicated that patients in the low-risk group had lower TumorPurity and higher ImmuneScore and StromalScore (Figure 11A, B, C). Further analysis of the correlation between the GILncSig and immune cell infiltration showed that patients in the low-risk group had more CD8 T cells, B cells and activated memory CD4 T cells, while M0 macrophages were at a low level (Figure 11D). To further explore the influence of the GILncSig on the TIME of pancreatic cancer, we analyzed the correlation of the risk signature with immune cell infiltration type and level. The results

Discussion
Pancreatic cancer is one of the most common causes of cancer death worldwide [28]. It is characterized by high morbidity, high mortality, difficult early diagnosis and poor prognosis [29,30]. Surgical resection is effective for patients with early pancreatic cancer, while palliative treatment is adopted for patients with locally advanced, metastatic or unresectable pancreatic cancer [31]. In recent years, molecular research on pancreatic cancer has made great progress, and the survival rate of pancreatic cancer patients has improved to some extent. However, the overall prognosis remains poor [32]. As metastasis and recurrence are the main causes of poor prognosis, there is an urgent need to identify effective tumor biomarkers that accurately evaluate the prognosis of patients with pancreatic cancer.
Genomic instability is an important feature of human cancer and is associated with poor prognosis and metastasis [4,33]. It has been reported that genomic instability affects the prognosis of pancreatic cancer, and that the pattern of genomic instability is quite heterogeneous in metastatic pancreatic cancer [34,35]. The degree of genomic instability is known to have diagnostic and prognostic implications, yet measuring genomic instability remains a major challenge. Mettu RK et al. constructed a 12-gene signature to assess genomic instability and predict clinical outcomes in cancers [36]. Zhang S et al. developed a biological rationale-driven genomic instability score to predict the prognosis of ovarian cancer [37].
LncRNAs have complex biological functions and have been shown to be closely related to the occurrence and development of cancers [11,38]. Recently, more and more researchers have paid attention to the clinical significance of lncRNAs in cancer prognosis. For instance, high expression of the lncRNA HOX transcript antisense RNA (HOTAIR) in lung tumor tissues is correlated with metastasis and poor prognosis in patients with lung cancer [39]. The lncRNA AOC4P induces a poor prognosis in gastric cancer patients through epithelial-mesenchymal transition [40]. Continued exploration of lncRNA function has shown that lncRNAs play an important role in maintaining genomic stability [22,23,41]. Despite these efforts, few studies have addressed genomic instability-related lncRNAs in cancers. Therefore, there is an urgent need to investigate the prognostic value of genomic instability-associated lncRNAs in pancreatic cancer patients.
In our study, we identi ed 40 genomic instability-associated lncRNAs by analyzing the lncRNA expression pro le and somatic mutation pro le of 171 patients with pancreatic cancer. Then, the function of these lncRNAs was predicted by the lncRNA-mRNA co-expression network. The GO and KEGG enrichment results suggested that the genes coexpressed with these 206 lncRNAs were enriched at chromosomes and nucleoplasm in the cellular component,DNA binding in the molecular function and the transcription and compound synthesis and metabolism in the biological process can promote genomic instability, which lead to cancer eventually [42,43]. We further divided all patients into training set and testing set. Cox proportional risk regression analysis was performed on the candidate genomic instability-associated lncRNAs in the training set, and a genomic instability-associated lncRNAs signature (GILncSig) consisting of 5 lncRNAs with independent prognostic value (AL121772.1, BX640514.2, LINC01133, AC087752.3 and LYPLAL1-AS1) was established to predict the prognosis of pancreatic cancer. The GILncSig can classify pancreatic cancer patients in the training set into the high-risk group and low-risk group with signi cantly different overall survival, which was veri ed in the testing set and the whole TCGA set. In addition, we also found that patients with pancreatic cancer in the high-risk group had signi cantly higher somatic mutation counts and UBQLN4 expression levels,both of which are characteristics of genomic instability. Comparison of our GILncSig and two recently reported lncRNA-related signatures with predictive value for pancreatic cancer in the same TCGA patient set suggested that the GILncSig has better prognostic ability in predicting survival than those two lncRNA-related signatures. Our study also found that the GILncSig was independent of other clinicopathological factors, including age, gender, pathological grade and stage. 
Furthermore, based on the GILncSig, the mutation frequencies of KRAS, TP53 and SMAD4 in the high-risk group were significantly higher than those in the low-risk group. KRAS, TP53 and SMAD4 wild-type patients in the low-risk group survived significantly longer than patients with the mutant type. These results indicate that the GILncSig may have greater prognostic significance than the KRAS, TP53 and SMAD4 mutation states. Finally, a nomogram was constructed in the training set by combining the GILncSig with four clinical factors (age, gender, pathological grade and stage), which further improved the predictive performance and was verified in the testing set and the entire TCGA set.
Moreover, numerous studies focusing on the tumor immune microenvironment (TIME) have revealed a potential key role of lncRNAs in infiltrating immune cells. In this study, we found that the GILncSig was significantly correlated with immune cell infiltration. ESTIMATE results showed that the GILncSig was positively correlated with tumor purity but negatively correlated with the ESTIMATE score and the immune score, suggesting that the GILncSig could serve as a novel immune indicator in pancreatic cancer.
In addition, ssGSEA results indicated that in the low-risk group the infiltrating immune cells were significantly increased and immune signatures were markedly activated. The immune-activated condition in the low-risk group was associated with high expression of ICB-relevant genes, suggesting samples with high risk score might respond to immunotherapy. Moreover, the correlation analysis between ICB-related genes and the GILncSig indicated that our signature may be able to predict the clinical outcome of ICB therapy in pancreatic cancer.
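The correlations described above, between the risk score and the immune score or the expression of an ICB-related gene, can be illustrated with a rank correlation. The sketch below assumes Spearman's coefficient, which is not confirmed as the method used in this study, and all values are toy data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: hypothetical risk scores and ESTIMATE immune scores for five samples.
risk = [0.2, 1.5, -0.7, 0.9, 2.1]
immune = [800.0, 150.0, 1200.0, 400.0, -300.0]
rho = spearman(risk, immune)
```

A negative coefficient here corresponds to the pattern reported above, in which higher risk scores accompany lower immune scores. A rank-based measure is a common choice for such data because expression and score distributions are often skewed.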
Although the GILncSig identified here is a reliable and promising prognostic signature in the tumor immune microenvironment of pancreatic cancer, several limitations remain. Beyond validation in the TCGA dataset, the GILncSig needs to be verified in additional independent datasets. It is also necessary to further explore the regulatory mechanisms by which the GILncSig lncRNAs act in genomic instability.

Conclusion
In summary, we performed a prognostic analysis of RNA-seq data from pancreatic cancer patients using bioinformatics methods, developed a genomic instability-derived lncRNA signature to predict the prognosis of these patients, and successfully validated it on the independent cohort. Moreover, we integrated the GILncSig with age, gender, pathological grade and stage to construct a nomogram that improved its prediction performance. Further results showed that the GILncSig was significantly correlated with immune cell infiltration and has important significance for genomic instability and ICB treatment of pancreatic cancer.

Declarations
Availability of data and material
Publicly available datasets were analyzed in this study. The data can be found at: https://portal.gdc.cancer.gov/, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi.

Competing interests
The authors declare no conflicts of interest.

Author contributions
QL, YM, CQH, CYS, XLZ, RY, LDY and YPP all have made substantial contributions to conception, acquisition of data, analysis, and interpretation of data. All of them have been involved in drafting the manuscript and revising it critically for important intellectual content. All authors read and approved the final manuscript and take public responsibility for appropriate portions of the content and agreed to be accountable for all aspects of work.

Figure 1 Computational framework of genomic instability-associated lncRNAs detection.

Comparison of 32 immune checkpoint blockade-related genes expression levels in low-/high-risk groups
