Catalogue and pattern of RNA editing in glioma detected by strand-specific RNA-seq
We performed strand-specific RNA-seq for tumor and matched normal samples obtained from 41 patients across various pathologies of gliomas spanning oligodendroglioma IDH mutant and 1p/19q-codeleted grades 2 and 3 (O2 and O3), IDH mutant astrocytoma grades 2, 3 and 4 (A2, A3 and A4), and GBM (Supplementary table 1). In addition, we developed a computational pipeline to identify RNA-editing sites in a genome-wide context from strand-specific RNA-seq. While many previous methods identifying RNA-editing sites from RNA-seq depend on pre-compiled gene annotation to assign RNA editing types, our pipeline considers strand information embedded in stranded RNA-seq to determine RNA editing types, allowing unbiased identification of RNA-editing sites (see methods in detail). After we applied the computational pipeline to individual samples in our dataset, we additionally filtered potential DNA variants that have inconsistency across the samples in terms of editing type or strand. Also, we only chose sites that were found in at least two patients to minimize the contamination of rare DNA variants. Application of our computational pipeline with the additional filters identified a total of 708,902 RNA variant sites across samples in our dataset. The RNA variants were predominated by A-to-G which is a representation of A-to-I editing in RNA-seq (Figure 1A), comprising about 81% of RNA variant sites (number: 571,774). The second dominant type was C-to-U whose proportion is about 4%. This type of editing is known to be found in human cells. If we consider potential technical errors in strand-specific RNA-seq, A-to-I and C-to-U editing can be identified as their reverse complemented forms of C-to-T and G-to-A, respectively. These four types accounted for about 92% of all the identified sites, indicating our pipeline’s higher specificity of identification.
As expected, most A-to-I editing sites in our list show their strong association with Alu: about 82% of the sites were identified in annotated Alu element regions. Also, most A-to-I editing sites were found in intronic regions (93%). The numbers of A-to-I editing sites vary across patients and tend to be smaller in tumor compared normal tissues (Figure 1B, paired t-test p-value: 0.002077). Many A-to-I editing sites were not found commonly across the patients. The proportion of A-to-I editing sites that were shared by patients decreases according to the increasing number of patients observing the sites (Supplementary Figure 1). But we also identified that some numbers (3,415) of A-to-I editing sites were found in all the patients, which implies that these A-to-I editing sites are regulated in human brain tissues, despite the variability of individual samples.
In order to identify A-to-I editing sites that show different editing levels between tumor and normal tissues, we performed regression-based statistical tests comparing tumor and normal tissues while controlling patient-specific differences (see methods). As pathology is a clear factor contributing to variation in our dataset (Supplementary figure 2), we conducted the statistical tests per a pathology. We excluded the IDH mutant astrocytoma grade 4 from this analysis as the number of patients is too small (n=2). We found that about 0.4~7.5% of A-to-I editing sites showed differential editing between tumor and matched normal tissues (Table 1). Interestingly, low grade gliomas, O2 and A2 showed relatively lower number of differential editing sites (0.4% for O2 and 0.5% for A2) compared with high grade gliomas (5.9%, 3.4%, and 7.5% for O3, A3, and GBM, respectively). When we checked the overlap of the differentially-edited sites between two pathologies, we found that the degree of overlap between O2 and each of the others was much smaller than comparisons between other pairs of pathologies (Figure 1C and Supplementary table 2). We also compared the overall distribution of A-to-I editing level differences between tumor and matched normal tissues. In figure 1D, the changes of A-to-I editing levels are summarized by a cumulative distribution whose shift to the left to the zero indicates the overall decrease of A-to-I editing levels in tumor relative to matched normal tissues. We found that most A-to-I editing sites in higher grades gliomas or non-oligodendroglioma were decreased in general in tumors relative to the matched normal tissues. In contrast, grade 2 oligodendroglioma showed no shift or no bias in difference of A-to-I editing levels. On average, A-to-I editing levels do not show significant differences between tumors and normal tissues (average difference: 0.05, t-test p-value: 0.04) in O2 while the others have significant decreases: O3 (average difference: -0.08, t-test p-value<10-15), A2 (average difference: -0.21, t-test p-value<10-15), A3 (average difference: -0.19, t-test p-value<10-15), GBM (average difference: -0.07, t-test p-value<10-15). All together, these results suggest that both the perturbed sites and the direction of editing level changes are different between grade 2 oligodendroglioma and the other gliomas.
Expression of circular RNA in glioma
Another Alu-associated post-transcriptional regulation that is abundant in brain tissue is a backsplicing process, where a 5′ splice donor joins an upstream 3′ splice acceptor, generating circular RNA. We first identified the genes that harbor circular RNA-supporting sequencing reads in our RNA-seq dataset (see methods). As expected, they are significantly associated with Alu (Figure 2A): the group of genes with circular RNA shows 99% association with Alu element while the group of genes without circular RNA expression only has 36% association (fisher exact test p-value<10-15). Interestingly, the number of genes with circular RNA-supporting reads was significantly decreased in tumor compared to normal tissues in all gliomas except for O2 (Figure 2B: note that we used the relaxed p-value threshold (0.1) for statistical significance here and A4 is excluded for the detection of statistical significance due to lower number of samples). We further found genes showing different circular RNA expression levels between tumor and matched normal tissues while controlling patient-specific differences (see methods). O2 has the least unbiased distribution of differential expression while the others have variable degrees of shift in the distribution of decreased expression of circular RNA in tumors relative to normal tissues (Figure 2C): O2 (average difference: <0.01, t-test p-value= 0.005897), O3 (average difference: -0.02, t-test p-value<10-15), A2 (average difference: -0.05, t-test p-value<10-15), A3 (average difference: -0.06, t-test p-value<10-15), GBM (average difference: -0.03, t-test p-value<10-15).
Expression of Alu RNA in glioma
Alu itself can be expressed and affect cellular processes. We estimated the expression levels of known 47 Alu elements in our samples and compared their expression levels between tumor and the matched normal tissues using the computational pipeline that can handle repeat elements for differential expression analysis (see methods). In O2, a higher proportion of Alu RNAs (37 out of 47) was perturbed in tumors relative to matched normal tissues while the other gliomas only had one to three Alu RNAs as significantly-changed ones (Figure 3A). Alu RNAs were down-regulated in gliomas in general (Figure 3B) but, interestingly, O2 was identified as the most affected tumor in the overall expression change of Alu RNA, as shown by the largest shift of the distribution of expression changes toward negative in the figure 3C: average fold changes of tumor relative to normal tissues for O2, O3, A2, A3, A4 and GBM are -0.52, -0.28, -0.34, -0.23, -0.50 and -0.16, respectively in log2 scale (all of the pathologies showed statistically-significant decreases according to t-tests and the p-value threshold less than 0.01) .
Towards an integrative understanding of Alu-associated molecular processes in glioma
In order to understand the decreasing pattern of A-to-I editing and circular RNA expression in gliomas relative to normal tissues, we checked the expression levels of ADAR (Adenosine Deaminase Acting on Rna) families that are known to generate A-to-I editing and to affect the formation of circular RNA (Figure 4A). ADAR2 was down-regulated significantly in all gliomas except for O2, which is consistent with the identified decreasing patterns of A-to-I editing and circular RNA. In contrast, ADAR1 did not show any significant change across gliomas. ADAR3, known to antagonize ADAR1 and ADAR2 by competing with them also showed some decrease in high grade gliomas. Therefore, ADAR2 among ADAR families seems to be a trans factor underlying the decreased patterns of A-to-I editing and circular RNA expression that we observed. For Alu RNA expression, we checked the expression level of RNA polymerase III transcribing Alu RNA. We found that RNA polymerase III expressions were downregulated in all the pathologies in gliomas despite varying statistical significance (Supplementary figure 3). This downregulation of RNA polymerase III may in part contribute to lower expression of Alu RNA in gliomas compared to normal tissues.
We also sought to understand the relationship between A-to-I editing and circular RNA as they can compete or cooperate for Alu element-associated factors such as ADAR. We first found that A-to-I editing and circular RNA occurred in the same genes, but the overlap size is moderate (Figure 4B). Second, they were compared in terms of the direction of changes in tumor relative to normal tissue. The two processes showed consistency in that if the genes showed decreased circular RNA expression in tumor compared to normal, they also tended to have decreased A-to-I levels in tumor in general (Figure 4C). Finally, we looked into whether the two post-transcriptional processes affect similar pathways by performing gene ontology analyses for the genes harboring differentially-edited sites or the genes showing the different circular RNA expression. We found that the genes with perturbed A-to-I editing were over-represented in multiple gene ontology terms (Supplementary table 3). About 25% of the over-represented terms were shared by at least two pathologies and included the pathways known in neuronal tissues, for examples, glutamate ion channels and the regulation of synapses (Figure 4D). For circular RNA, we found that different pathways are affected, including chromatin organization and regulation of metabolic process (Supplementary table 4). Therefore, our results demonstrated that A-to-I editing and circular RNA perturbations occurs concurrently at some genes, but they affect different genes in different pathways in general.