Molecular datasets from Rennes and Angers University
A total of 125 primary non-G-CIMP GBMs were collected between 2004 and 2013 from the Neurosurgery Departments of Rennes and Angers University Hospitals (RAUH), including a new cohort of 77 samples (RAUH-new cohort[6]) profiled by Infinium HumanMethylation450k BeadChip (Illumina Inc.) and Agilent Whole HumanGenome 8 × 60K Microarray Kit (Agilent Technologies); and a published cohort of 48 samples (RAUH-GSE22891[9]) profiled by Infinium HumanMethylation27k BeadChip (Illumina Inc.) and Agilent Whole HumanGenome 4 × 44K Microarray Kit (Agilent Technologies). All patients were treated with surgery followed by RT plus concurrent and adjuvant TMZ. Sample collection, histological review, clinical follow-up and microarray profiling were previously reported. All patients provided written informed consent, in accordance with the French regulations and the Helsinki Declaration. G-CIMP status was determined by k-means (k=3) clustering on the 1503 featured probes reported by Noushmehr et al[10]. MGMT methylation status was determined using a logistic regression model based oncg12434587 and cg12981137[11]. The gene expression subtypes were predicted by Binary tree classification prediction using the 840 classifiers reported by Verhaak et al[12].
Molecular datasets from public databases
Additional DNA methylation and gene expression microarray data were obtained from public databases, including the clinically annotated cohort from TCGA (TCGA-Brennan et al[13], RT/TMZ, n=219; RT alone, n=73), and three cohorts from the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/): (1) GSE50923[14], RT/TMZ, n=49; (2) GSE60274[15], RT/TMZ, n=32; RT alone, n= 27; (3) GSE36278[16], unknown treatments, n=52. Moreover, DNA methylation microarray data of IDHwt LGGs were obtained from TCGA[17] (WHO grade II to III, various treatments, n=94), Chinese Glioma Genome Atlas (CGGA[18], WHO grade II to III, various treatments, n=30), GSE48462[19] (WHO III, RT/PCV, n=21; RT alone, n=34). For comparison, DNA methylation data of G-CIMP (or IDH mutant) tumors from TCGA[13] and CGGA[18], gene expression data of G-CIMP (or IDH mutant) tumors from TCGA[13], CGGA[20] and GSE16011[21], and molecular data of non-tumor brains (NTBs) from TCGA[13], CGGA[18, 20], GSE60274[15], GSE16011[21] and our RAUH datasets were all obtained. In addition, DNA methylation data of 26 NTBs from GSE63347[22] were also downloaded for comparison of TCGA samples. Patient information for each included database was summarized in Table S1.
Probe selection and RISK score construction
Initial CpGs probe selection was performed by removal of probes not covered on both 27k and 450k platforms, those targeting X and Y chromosomes, and those relevant to single-nucleotide polymorphisms (SNPs). A compendium of 842 genes has been recently identified as DDR genes in a CPRISPR-based study by Olivieri M et al[4]. The 842 DDR genes were corresponded to the initially screened CpG probes based on gene symbols across PubMed gene database (https://www.ncbi.nlm.nih.gov/gene). A total of 1095 CpG probes corresponding to 635 DDR genes were kept for further selection. To make DNA methylation microarray data comparable, batch effects between each platform and dataset were adjusted by M-value transformation and the empirical Bayes approach (ber R package)[23]. Missing β values were imputed by impute R package. TCGA-Brennan et al and RAUH-new cohort were used for discovery phase. Selected CpGs with higher variability in methylation levels were used to correlate with OS by univariate Cox regression model and permutation test (Fig. 1). After the removal of inconsistent results from each discovery cohort, an overlap of 5 CpGs were kept and subjected to multivariate Cox model within meta-discovery cohorts adjusted by age, MGMT methylation status and treatment. Finally, the 5-CpG panel was identified for constructing a RISK score model, which was the sum of β values of each CpG weighted by their multivariate Cox coefficients, adjusted by age, MGMT methylation status, and treatment (Fig. 1). The cutoff for low-risk and high-risk tumors was predefined as median risk score from meta-discovery cohorts, or determined by maxstat R package[24].
Gene set enrichment analysis
Gene set enrichment analysis (GSEA) was run to evaluate functional profiles between grouped samples using the gene sets of the Gene Ontology Biological Processes and Kyoto Encyclopedia of Genes and Genomes (KEGG) from the Molecular Signatures Database (MSigDB)[25]. The proportion of 28 tumor-infiltrating immune cells (TIICs) in tumor bulks was estimated using single-sample GSEA (ssGSEA) algorithm and the 782-gene signature reported by Charoentong et al[26] (http://software.broadinstitute.org/gsea/msigdb/index.jsp). The abundance of the 28 TIICs in tumor samples was summarized as normalized enrichment scores (NES).
Tumor mutation burden (TMB) calculation
Mutation annotation format (MAF) flies for 212 non-G-CIMP GBMs were retrieved from the TCGA database (https://portal.gdc.cancer.gov/). The “maftools” R package was used to analyze the “Masked Somatic Mutation” type of somatic mutation data[27].
Immunohistochemistry staining
Formalin-fixed paraffin-embedded (FFPE) samples of 12 primary gliomas were collected from the Department of Neurosurgery, Xijing Hospital (NTBs, n=3; Grade II, n=3; Grade III, n=3; Grade IV, n=3) and were employed for immunohistochemistry (IHC) staining with anti-NSUN5 antibody (Abcam, ab121633). The percentages of positive cells were evaluated in at least five separate fields at × 400 magnification. Immunoreactivity was scored as follows: 0, no staining; 1, weak staining in < 50% cells; 2, weak staining in ≥ 50% cells; 3, strong staining in < 50%, cells; and 4, strong staining in ≥ 50% cells[28].
Cell culture and drugs
Human glioma cell lines U251 and A172, obtained from American Type Culture Collection (ATCC) were cultured in DMEM (Sigma) containing 10% fetal bovine serum (FBS) respectively at 37°C in 5% CO2. Temozolomide (MedChemExpress, #HY-17364) was dissolved in dimethyl sulfoxide (DMSO, Sigma-Aldrich) at a concentration of 100mM and stored at -80°C.
Pyrosequencing
Total DNA was extracted in using EZ DNA Methylation-GoldTM Kit (Beijing Tianmo). Pyrosequencing was performed by Pyromark Q96 ID platform and analyzed by PyroMark CpG software (Qiagen). The following primers were used: cg01251255-F: 5’-GTTTTGTAAGGYGTGTGTGTGAT-3’; cg01251255-R: biotin-5’- TCAAATTAACAAAAACTTAAAAACC-3’; and cg01251255-seq: 5’-TATTTGTAGGGGTAGGGAGT-3’.
Western blot
Cell lysates were performed in RIPA buffer contained protease inhibitor and phosphatase inhibitor (Roche). The primary antibodies against IRF3S396 (CST, #4947, 1:1000), IRF3 (CST, #4302S, 1:1000), P65S536 (CST, #3033S, 1:1000), P65 (CST, #4764S, 1:1000), STINGS366 (CST, #50907, 1:1000), STING (Abcam, ab171850, 1:1000), TBK1S172 (CST, #5483, 1:1000), cGAS (Sigma Aldrich, #HPA031700, 1:1000), GAPDH (Proteintech, #60004-1-lg, 1:20000), NSUN5 (Abcam, #a121633, 1:1000) were used according to the manufacturers’ recommendations. Each immunoblot was done at least thrice and the signals were quantified using ImageJ software (Bethesda).
RNAi transfection
For NSUN5 knockdown in vitro, negative control (siCtrl), NSUN5 small interfering RNA (si-NSUN5#1 and #2) were obtained from Ribobio™. siRNAs were transfected in GBM cells at the concentration of 1μg/ml. X-tremeGENE siRNA Transfection Reagent (Roche) was employed for establishing cells transfecting process according to the manufacturers’ recommendations. The sequences of siRNA were as follows: siCtrl (negative control); si-NSUN5#1: GACCTGCTCCGATGATGTA; si-NSUN5#2: GCTACCATGAGGTCCACTA.
Cell viability and TMZ cytotoxicity assay
After transfection, U251 cells were seeded on 96 well plates (5000 cells per well). CCK-8 kit was used for cell viability and TMZ cytotoxicity detection. U251 cells were treated TMZ at the concentration of 7.5, 15, 30, 60, 120, 240, 480 μM for 48 h. CCK-8 reagent was added to wells (10 μl/well) and incubated at 37°C for 1 h. The absorbance at 450 nm was measured for calculating the IC50 of TMZ, and for evaluating cell viability at indicated time points.
Cell cycle and apoptosis assay
U251 cells were harvested and fixed in 75% ethanol at 4°C for 1h, to analyze intracellular DNA content. Fixed cell suspension (1 × 106 cells) was then washed twice and stained with Annexin V-fluorescein isothiocyanate (FITC)/ Propidium iodide (PI) apoptosis detection Kit (Becton, Dickinson and Company). The cells were subjected to flow cytometric analysis.
Cell migration and invasion assay
U251 cells were cultured in the upside of the transwell chamber (Corning), 700 μl media with 10% FBS was added in the bottom of the wells. The chamber was lightly wiped with a cotton swab after cultivating for 24h. The migrated cells were fixed with 4% paraformaldehyde and then stained with crystal violet solution and counted under a microscope in three fields (×200). The process for cell invasion assay was similar with migration assay but the transwell membranes were coated with Matrigel (Corning).
RNA sequencing
Total RNA was isolated from cells using Axypre™ Multisource Total RNA Miniprep Kit (Axygen, #365) following the manufacturer’s procedure. The purity and quantity of RNA were measured by the Bioanalyzer 2100 and RNA 6000 Nano Lab Chip Kit (Agilent Technologies) with RIN number >7.0. cDNA libraries were constructed in using NEB Next Ultra Directional RNA Library Prep Kit (NEB), then used the Illumina sequencing technology on an Illumina novaseq 6000 platform. Differential expression analysis was performed with p-value ≤ 0.05 and absolute value of log2 fold change ≥ 1 for significance. Volcano plots were prepared to depict fold-change differences in gene expression.
Statistical analysis
Differences in clinical and molecular features within each subgroup were tested by unpaired t test, Mann-Whitney test, Fisher’s exact or Chi-square test. Overall survival (OS) was the time interval from the date of diagnosis or treatment to the date of death or last follow-up. Progression-free survival (PFS) was the time interval from the date of diagnosis or treatment to the date of progression defined by the Macdonald criteria or Response Assessment in Neuro-Oncology (RANO) criteria or the date of death or last follow-up[29, 30]. Survival data were estimated by the Kaplan-Meier Method, and compared by log-rank test. Univariate and multivariate Cox regression analysis was used to evaluate the prognostic correlation and independence of each variable. Meta-analysis was performed by the inverse-variance method where application of either fixed- or random effect models was based on statistical heterogeneity, with p-value for chi-square test ≤0.05 for significance. The prognostic performance was evaluated by time-dependent receiver operating characteristic (ROC) curve (survcomp R package)[31]. All the calculations were done with SPSS statistics (SPSS software Inc.) and R software, with two-side p value ≤0.05 for significance.