Data collection
We collected clinical characteristics, transcriptome data and somatic mutation data for GBM and low-grade glioma (LGG) patients from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/). These data were matched by sample name. Patients with no survival information or with survival of less than 30 days were excluded to reduce interference from non-cancer causes of death. We distinguished mRNAs from lncRNAs using human gene annotations from the HUGO Gene Nomenclature Committee (HGNC2) database (https://www.genenames.org/), and retained 629 samples with paired lncRNA and mRNA expression profiles, survival information, somatic mutation information and common clinicopathological features for further study. All glioma patients were randomly divided into a training set and a test set. The training set (316 patients) was used to identify prognostic lncRNA features and to establish a risk model for outcome. The test set (313 patients) was used to independently validate the performance of the prognostic risk model. We downloaded GSE43378 from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE43378) as an external validation set to test the model.
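The exclusion rule and the random split can be illustrated with a minimal Python sketch. It is not the code used in the study (the analysis was carried out in R); the record layout, the field name `os_days`, the function name `filter_and_split` and the fixed seed are assumptions made for illustration, and the toy cohort is invented.

```python
import random


def filter_and_split(patients, n_train, min_days=30, seed=0):
    """Drop patients without survival data or with survival < min_days,
    then split the remainder at random into training and test sets."""
    eligible = [
        p for p in patients
        if p.get("os_days") is not None and p["os_days"] >= min_days
    ]
    rng = random.Random(seed)  # fixed seed (assumption) so the split is reproducible
    shuffled = eligible[:]
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Toy cohort with hypothetical survival times in days; not TCGA data.
survival_days = [400, 12, None, 95, 30, 720, 5, 260]
cohort = [{"id": f"P{i}", "os_days": d} for i, d in enumerate(survival_days)]

train, test = filter_and_split(cohort, n_train=3)
print("training:", [p["id"] for p in train])
print("test:", [p["id"] for p in test])
```

With the study's numbers, the same call would use `n_train=316` on the 629 retained samples, leaving 313 for the test set.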
Technical route
The workflow of this study is shown in Fig. 1. After collecting the data, we analyzed somatic mutation and transcriptome data to obtain genomic instability-related lncRNAs (GIrlncRNAs). The relationship between GIrlncRNAs and mRNAs was examined by co-expression analysis. Next, we randomly divided the patient cohort into training and test sets. Cox regression and LASSO regression analyses of the GIrlncRNAs were then conducted to construct a prognostic signature. The signature was evaluated by mutation correlation analysis, model comparison, independent prognostic value analysis, clinical stratification, validation in the external data set and cell line validation.
Identification of GIrlncRNAs
To identify GIrlncRNAs, a method derived from the mutator hypothesis was applied20. Patients in the highest 25% and the lowest 25% of cumulative somatic mutation counts were assigned to the genome unsteady-like (GU) and genome steady-like (GS) groups, respectively. The mean expression of each lncRNA was compared between the two groups by the Wilcoxon rank-sum test using the limma package in R. The cut-off thresholds were set at |fold change| > 2.0 and false discovery rate (FDR) < 0.05.
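The grouping and screening steps can be sketched in Python using only the standard library. This is an illustrative re-implementation under stated assumptions, not the R/limma code used in the study: the rank-sum p-value uses the normal approximation with tie correction, the FDR is taken to be the Benjamini-Hochberg adjustment, |fold change| > 2.0 is interpreted as a GU/GS mean ratio above 2 or below 0.5, and expression values are assumed to be positive. All function names and the toy data are hypothetical.

```python
import math


def quartile_groups(mutation_counts):
    """Return (GU ids, GS ids): top and bottom 25% of patients by mutation count."""
    ranked = sorted(mutation_counts, key=mutation_counts.get)
    k = len(ranked) // 4
    return ranked[len(ranked) - k:], ranked[:k]


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average 1-based rank of the tied block
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of x
    mean = n1 * (n + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0
    z = (w - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def bh_fdr(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def gi_lncrnas(expr, gu, gs, fc_cut=2.0, fdr_cut=0.05):
    """Names of lncRNAs passing both the fold-change and the FDR thresholds."""
    names = list(expr)
    pvals, log2fc = [], []
    for name in names:
        a = [expr[name][p] for p in gu]
        b = [expr[name][p] for p in gs]
        pvals.append(rank_sum_p(a, b))
        log2fc.append(math.log2((sum(a) / len(a)) / (sum(b) / len(b))))
    fdr = bh_fdr(pvals)
    return [
        name for name, f, q in zip(names, log2fc, fdr)
        if abs(f) > math.log2(fc_cut) and q < fdr_cut
    ]


# Toy data: 20 hypothetical patients; lncA tracks mutation load, lncB does not.
mutation_counts = {f"P{i}": i for i in range(1, 21)}
expr = {
    "lncA": {f"P{i}": float(i) for i in range(1, 21)},
    "lncB": {f"P{i}": 5.0 + i % 2 for i in range(1, 21)},
}
gu, gs = quartile_groups(mutation_counts)
hits = gi_lncrnas(expr, gu, gs)
print(hits)
```

For the small group sizes of the toy data an exact test would be preferable; the normal approximation is used here only to keep the sketch short.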
Construction of GIrLncSig
The "survival" package in R was used to perform univariate Cox regression analysis on the training set to assess the relationship between the expression level of each GIrlncRNA and the overall survival time of patients. The least absolute shrinkage and selection operator (LASSO) regression algorithm was then used to further screen candidate GIrlncRNAs and to construct the GIrlncRNA prognostic signature (GIrLncSig). The signature risk score was calculated with the following formula, which combines the Cox coefficients and the expression levels:
$$\mathrm{GIrLncSig} = \sum_{i=1}^{n} \mathrm{coef}_i \times E_i$$
Here GIrLncSig is the prognostic risk score of a glioma patient, $n$ is the number of lncRNAs in the signature, $E_i$ is the expression level of lncRNA $i$ in that patient and $\mathrm{coef}_i$ is the coefficient of lncRNA $i$. The median GIrLncSig score was used as the risk cut-off point to divide glioma patients into low-risk and high-risk groups. Survival curves of the two groups were plotted by the Kaplan-Meier method using the "survminer" and "survival" packages in R and compared with the log-rank test; p < 0.05 was considered significant.
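The risk score and the median split can be illustrated with a short Python sketch. The lncRNA names, coefficients and expression values below are invented for illustration and are not the fitted values of the signature; assigning patients whose score equals the median to the low-risk group is likewise an assumption, since the text does not specify how ties at the cut-off were handled.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Sum of coefficient x expression over the lncRNAs in the signature."""
    return sum(coef * expression[name] for name, coef in coefficients.items())


def split_by_median(scores):
    """Return (cut-off, low-risk ids, high-risk ids) using the median score."""
    cutoff = median(scores.values())
    low = [p for p, s in scores.items() if s <= cutoff]  # ties go to low risk (assumption)
    high = [p for p, s in scores.items() if s > cutoff]
    return cutoff, low, high


# Hypothetical coefficients and expression levels, not values from the study.
coefficients = {"lncA": 0.5, "lncB": -0.25}
patients = {
    "P1": {"lncA": 2.0, "lncB": 4.0},
    "P2": {"lncA": 4.0, "lncB": 0.0},
    "P3": {"lncA": 8.0, "lncB": 4.0},
    "P4": {"lncA": 2.0, "lncB": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
cutoff, low_risk, high_risk = split_by_median(scores)
print(scores, cutoff, low_risk, high_risk)
```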
Real-time PCR validation in cell lines
The human glioblastoma cell lines U87, U251, LN229 and U343 and the immortalized cell line SVGp12 were used to validate the lncRNAs in the model at the cellular level.
Cell line culture conditions
Cells were cultured in DMEM supplemented with 10% fetal bovine serum, 100 U/ml penicillin and 100 U/ml streptomycin at 37 °C in a 5% CO2 incubator.
Total cellular RNA was extracted from the collected cells with 1 ml of Trizol (Invitrogen, Waltham, Massachusetts), and RNA absorbance at 260 nm was measured using a Nanodrop 2000 UV spectrophotometer. Based on the measured concentration, 1 µg of RNA was reverse-transcribed into cDNA (New England Biolabs, Ipswich, MA). Real-time polymerase chain reaction (RT-PCR) was performed with the SYBR Green method (Applied Biosystems, Foster City, CA) on a CFX96 real-time PCR system (Bio-Rad, Hercules, California), with Actin as the internal control. Amplification was set at 95 °C for 120 s followed by 39 cycles of 95 °C for 5 s and 60 °C for 30 s. The relative expression of RNA was calculated by the 2^(−ΔΔCt) method. Primers were synthesized by Sangon Biotechnology (Shanghai, China). The primers for the lncRNAs in GIrLncSig are listed in Table 1.
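The 2^(−ΔΔCt) calculation can be written out as a small Python function. The Ct values below are invented for illustration, and treating SVGp12 as the calibrator sample is an assumption; the text states only that Actin served as the internal control.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Relative expression by the 2^(-ddCt) method.

    ct_target, ct_ref: Ct of the lncRNA and of the internal control in the sample.
    ct_target_cal, ct_ref_cal: the same two Ct values in the calibrator sample.
    """
    delta_sample = ct_target - ct_ref              # dCt of the sample
    delta_calibrator = ct_target_cal - ct_ref_cal  # dCt of the calibrator
    return 2 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: a glioma line versus the assumed calibrator (SVGp12).
fold = relative_expression(ct_target=24.0, ct_ref=18.0,
                           ct_target_cal=26.0, ct_ref_cal=18.0)
print(fold)  # ddCt = (24 - 18) - (26 - 18) = -2, so the fold change is 2^2 = 4.0
```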
Table 1
qPCR primers designed to amplify the lncRNAs identified as risk factors in GIrLncSig.
LncRNA | Forward | Reverse
LINC01579 | 5'-TCCCAGTGAAGAGAGAGCGA-3' | 5'-CTAAGTTCCACGTCACGGCT-3'
LINC01116 | 5'-GAATGGCAAAGCACTTGGGG-3' | 5'-AGCTCTCCTTGCAGGTAGGT-3'
MIR155HG | 5'-AGGGGTTTTTGCCTCCAACT-3' | 5'-TCTTTGTCATCCTCCCACGG-3'
CYTOR | 5'-TTCCAACCTCCGTCTGCATC-3' | 5'-AATGGGAAACCGACCAGACC-3'
H19 | 5'-GACATCTGGAGTCTGGCAGG-3' | 5'-CTGCCACGTCCTGTAACCAA-3'
SNHG18 | 5'-TGCACTTTGCCACTGCTACA-3' | 5'-GGGGAATGTGGTTCTCCCTT-3'
FOXD3.AS1 | 5'-AAGAGTAAGAGCAGCGCACC-3' | 5'-ACCTGAGTGGTTTGGTTGGG-3'
CRNDE | 5'-ATTCAGCCGTTGGTCTTTGA-3' | 5'-CTTCTGCGTGACAACTGAGGA-3'
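As a quick sanity check on primer sequences such as those in Table 1, length and GC content can be computed with a few lines of Python. This check is not described in the study; the helper name `gc_fraction` is hypothetical, and only two of the primer pairs from Table 1 are included to keep the sketch short.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a primer sequence (written 5' to 3')."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Two primer pairs copied from Table 1 (forward, reverse).
primers = {
    "LINC01579": ("TCCCAGTGAAGAGAGAGCGA", "CTAAGTTCCACGTCACGGCT"),
    "CRNDE": ("ATTCAGCCGTTGGTCTTTGA", "CTTCTGCGTGACAACTGAGGA"),
}

for name, (forward, reverse) in primers.items():
    for label, seq in (("forward", forward), ("reverse", reverse)):
        print(f"{name} {label}: {len(seq)} nt, GC = {gc_fraction(seq):.0%}")
```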