Identification of long noncoding RNAs in distinct subtypes of mouse retinal ganglion cells

Background : Emerging evidence indicates that long noncoding RNAs (lncRNAs) are important 22 regulators of various biological processes, and their expression can be altered following certain 23 pathological conditions, including central nervous system injury. Retinal ganglion cells (RGCs), 24 whose axons form the optic nerve, are a heterogeneous population of neurons with more than 20 25 molecularly distinct subtypes. While most RGCs, including the ON-OFF direction-selective 26 RGCs (ooDSGCs), are vulnerable to axonal injury, a small population of RGCs, including the 27 intrinsically photosensitive RGCs (ipRGCs), are more resilient. 28 Results: By performing systematic analyses on RNA-sequencing data, here we identify 29 lncRNAs that are expressed in ooDSGCs and ipRGCs with and without axonal injury. Our 30 results reveal a repertoire of different classes of lncRNAs, including long intergenic noncoding 31 RNAs and antisense ncRNAs that are differentially expressed between these RGC types. 32 Strikingly, we also found dozens of lncRNAs whose expressions are altered markedly in 33 response to axonal injury, some of which are expressed exclusively in either one of the subtypes. cis 38 potential roles for Conclusions: Overall, the results study reveal RGC expression of and provide a foundation for future investigation of neuronal

(DSGCs) are vulnerable to axonal injury (7-12). Although several studies have described distinct 70 transcription factors and signaling molecules that might contribute to the differential responses of 71 injured RGCs, our understanding of the underlying mechanisms remain fragmentary. 72 To investigate the molecular mechanisms underlying the exceptional survival and regenerative 73 ability of ipRGCs after axonal injury, we previously performed RNA-sequencing (RNA-seq) on 74 these RGCs and the vulnerable RGC subtype, ON-OFF direction-selective RGCs (ooDSGCs) 75 (7). Subsequently, we had reported hundreds of coding genes that were uniquely expressed in 76 these RGC subtypes with and without optic nerve crush injury. In addition to our study, others 77 have reported expression profiles of coding genes that were differentially expressed between 78 various RGC subtypes (13,14). However, subtype-specific expression of lncRNAs in RGCs has 79 not been systematically examined. In this study, we analyze RNA-seq data from purified murine 80 RGCs and identify lncRNAs that are expressed in two contrasting RGC subtypes (i.e. injury-81 resilient ipRGCs and the injury-vulnerable ooDSGCs) in normal and optic nerve-injured animals. Identification of lncRNAs by analyzing the RNA-seq data 85 RNA-seq data obtained from isolated ipRGCs and ooDSGCs of optic nerve axotomized (i.e. 3 86 days post crush) and normal (i.e. sham surgery) mice (7) were used for the lncRNA analyses. 87 RNA-seq data were previously deposited under the accession number GEO: GSE115661 (7). 88 For the analyses in this current study, trimmed reads were aligned to the mm10 version of the 89 mouse genome with STAR (15), followed by an assembly using the Cufflinks 2.2 tool. For the 90 alignment and assembly we used GTF file from GENCODE version M17 (vM17; release 91 04/2018). 92 To classify the transcripts as putative novel lincRNAs, we used the method described in (16) 93 with some modifications. Briefly, all reads that overlapped or were in a window of ±2 kb of the 94 known mouse GENCODE genes were removed. Next, we generated transcriptome assemblies 95 using Cufflinks 2.2 (17) for each of these samples separately and then used Cuffmerge to 96 combine all annotations. All transcripts that were identified in these analyses as class code 'u' by 97 Cuffmerge (class code "u" -putative novel intergenic transcripts) were retained. Transcripts 104 Differential expression analysis 105 The RNA abundance defined as the FPKM (Fragments Per Kilobase Million) and the sum of 106 exon read count per gene was calculated using RSEM (19), and differential expression (DE) 107 analysis was performed with DESeq2. A gene was considered detected if the FPKM was > 0.3 in 108 6 at least two replicates of one sample and significantly changed if the adjusted p-value was < 109 0.05. 110 Gene ontology (GO) enrichment analysis 111 We used DAVID v6.8 to perform GO enrichment analysis (20). "Biological Process" and 112 "Molecular Function" GO terms with p-value < 0.05 were defined as enriched. injured ooDSGCs vs. normal ooDSGCs; injured ooDSGCs vs. injured ipRGCs; normal 118 ooDSGCs vs. normal ipRGCs. When both mRNA neighbors (upstream and downstream) were 119 located at a distance less than 300 kb from the lncRNA, but only one of them was DE, we 120 selected the DE as the lncRNA neighbor. When both neighbors were non-DE, we selected the 121 closest mRNA as the lncRNA neighbor. Next, we searched for enriched "Biological Process" 122 and "Molecular Function" Gene Ontology terms among each set of lncRNA-nearby mRNAs (see 123 above).

124
Gene expression correlation analysis of lncRNA and mRNA 125 To predict the potential role of lncRNAs, we performed guilt-by-association analysis (21 For the injury procedure, the left optic nerve was exposed intraorbitally and crushed with 147 jeweler's forceps (Dumont #5, Roboz) for 10 seconds approximately 1 mm behind the optic disc. to 16-μm thickness. Tissue sections were blocked in 5% normal donkey serum and 0.3% Triton 153 X-100 in PBS for 1 hour and incubated with primary antibodies diluted in blocking buffer 154 overnight at 4°C, followed by 1-hour incubation with secondary antibodies at room temperature.

183
To identify the repertoire of lncRNAs that are expressed in ipRGCs and ooDSGCs, we analyzed 184 the RNA-seq data obtained from isolated murine ipRGCs and ooDSGCs (7). In addition to the 185 normal uninjured condition, we examined lncRNA expression of these RGC types extracted 186 three days after intraorbital optic nerve crush (7). Thus, we analyzed a total of four RGC groups: 187 normal ipRGCs, injured ipRGCs, normal ooDSGCs and injured ooDSGCs.

216
Next, we compared our lincRNA data, including novel lincRNAs, to the lincRNA data from  Table S3). For normal ooDSGCs, we found that 12% (58/467) 221 of lincRNAs were also seen in the whole retina RNA-seq (Additional file 3: Table S3)  For mRNAs, the median FPKM for normal and injured ipRGCs were 15 and 16.2, respectively.

236
For ooDSGC lncRNAs, the median FPKM for normal and injured conditions were 1.6 and 1.8, 237 respectively. The medians of mRNA for normal and injured ooDSGCs, were 13.1 and 13.7, 238 respectively ( Fig. 2c and 2d). Additionally, we compared the median FPKM for lncRNAs 239 detected in RGCs with the median FPKM for lincRNAs detected in the whole retina RNA-seq 240 (median FPKM = 1.5; Additional file 8: Fig. S2a and S2b) (22). However, we did not find a 241 statistically significant difference from this comparison. 242 We found that the average length of lncRNAs for ipRGCs and ooDSGCs were 1.62 kb and 1.57 243 kb, respectively (Fig. 3a). On average, lncRNAs were shorter than mRNAs in length (t-test, p-244 value < 0.001; Fig. 3a). Moreover, lncRNAs were less spliced, with an average of 2.7 exons per 245 transcript; whereas the average for protein-coding genes was 8.6 exons per transcript (t-test, p-246 value < 0.001; Fig. 3b). Compared to the whole retina lincRNAs, the average length of RGC 247 lncRNAs was shorter (t-test, p-value < 0.001; Additional file 8: Fig. S2c). Similarly, the average 248 number of exons in RGCs was smaller compared to that of the whole retina lincRNAs (t-test, p-249 value < 0.001; Additional file 8: Fig. S2d).

LncRNAs differentially expressed in ipRGCs and ooDSGCs 251
Since very little is known about lncRNAs that are expressed in different RGC types, we sought 252 to identify differentially-expressed lncRNAs in ipRGCs and ooDSGCs. We detected 8 lncRNAs 253 that were differentially expressed (adj p-value < 0.05) between normal and injured ipRGC 254 groups, and 81 lncRNAs differentially expressed between normal and injured ooDSGC groups 255 ( Fig. 4a and 4b). Only 1 putative novel lincRNA was differentially expressed between normal 256 and injured ipRGCs, whereas 21 putative novel lincRNAs were differentially expressed between 257 normal and injured ooDSGCs. Thus, although the amount of lncRNAs expressed in these RGC 258 types is similar (827 transcripts in ipRGCs and 1,078 in ooDSGCs; Fig. 1b), the number of 259 lncRNAs whose expression is significantly altered by injury is markedly lower in ipRGCs 260 compared to ooDSGCs ( Fig. 4a and 4b). 261 Next, we sought to identify lncRNAs that are differentially expressed between injured ooDSGCs 262 and injured ipRGCs as well as those that are differentially expressed between normal ooDSGCs 263 and normal ipRGCs. For the injury group comparison, we detected 103 lncRNAs that were 264 differentially expressed, whereas for the normal group comparison, we identified 112 265 13 differentially-expressed lncRNAs ( Fig. 4c and 4d). Of these lncRNAs, 28 and 35 correspond to 266 putative novel lincRNAs for each comparison, respectively. A full list of the differentially-267 expressed lncRNAs is provided in Additional file 4: Table S4. We validated the expression of 268 one randomly selected novel lincRNA using RT-PCR (Additional file 8: Fig. S3).  Table S2 contains a full list of these genes.

278
Notably, among the neighboring genes, we found protein-coding genes whose protein products  because of the small number of differentially-expressed lncRNAs. One of the nearby genes 296 enriched to this term is Shank3 (Fig. 5a, Additional file 5: Table S5), which encodes for a 297 synaptic scaffolding protein that is associated with neurodevelopmental disorders (26).

311
Nearby protein-coding genes of lncRNAs differentially expressed between normal ooDSGCs and 312 normal ipRGCs 313 Similar to the terms listed above, significantly enriched GO terms in this group comparison 314 include "neuron migration", "regulation of axon extension" "nervous system development" and 315 "transcription factor activity sequence-specific DNA binding" (Fig. 5d, Additional file 5: Table   316 S5). Notably, among the genes enriched in the term "regulation of transcription", there are 317 several transcription factors that are known to be expressed specifically in these two RGC types.

318
For example, Eomes (also known as Tbr2) was identified in our analysis as a neighbor of 319 lncRNAs detected exclusively in ipRGCs (Additional file 2: Table S2). Eomes knockout was 320 shown previously to cause death of ipRGCs in uninjured mice (4) as well as in mice with optic 321 nerve crush (7), demonstrating that Eomes is essential to the survival of ipRGCs.

322
Additionally, we identified Pou4f1 (also known as Brn3a) and Pou4f3 (also known as Brn3c) as 323 neighbors of lncRNAs that were detected exclusively in ooDSGCs. The Brn3/Pou4f transcription 324 factors are known to participate in RGC development and type specification (5, 27). We and 325 others have previously shown that ooDSGCs, but not ipRGCs, exhibit high levels of Brn3a and 326 Brn3c expression (7, 14, 28), raising the possibility that these lncRNAs may play functional roles 327 in regulating expression of these genes and promoting RGC type specification.  Table S7).

340
LncRNA and mRNA co-expression analysis: normal ipRGCs vs. injured ipRGCs 341 We did not find any correlated lncRNA-mRNA pair in the set of differentially-expressed 342 transcripts in this group comparison.

343
LncRNA and mRNA co-expression analysis: normal ooDSGCs vs. injured ooDSGCs 344 Co-expressing pairs comprised 19 lncRNAs whose expression was correlated with a total of 395 345 mRNAs (Fig. 6a, Additional file 6: Table S6), and these mRNAs were grouped into four clusters.

353
LncRNA and mRNA co-expression analysis: injured ipRGCs vs. injured ooDSGCs 354 Findings showed that 320 mRNAs were correlated with 14 lncRNAs and grouped in two clusters 355 (Fig. 7a, Additional file 6: Table S6). The mRNAs in these clusters were significantly enriched 356 in GO terms such as "protein catabolic process" and "response to unfolded protein and regulation 357 of translation" (Fig. 7a, Additional file 7: Table S7).  Table S6). The protein-361 coding genes in these clusters were significantly enriched in GO terms such as "calcium ion 362 binding", "nervous system development", "neuron migration," and "axon guidance" (Fig. 7b, 363 Additional file 7: Table S7).

397
We identified numerous lncRNAs that were differentially expressed between normal ipRGCs 398 and normal ooDSGCs. As mentioned above, many of these lncRNAs lie in close proximity to 399 protein-coding genes that encode RGC type-specific transcription factors, including Eomes, 400 Brn3a, and Brn3c. However, it remains unknown whether these differentially expressed 401 lncRNAs in fact regulate the expression of these transcription factors and control RGC 402 development.

403
Another notable finding in this study is the dozens of lncRNAs whose expressions were altered 404 in response to axonal injury. Nearest-neighbor analysis identified apoptosis-related genes that are 405 close neighbors of these lncRNAs. One example is the gene Ecel1, which has been studied 406 extensively for its role in regulating RGC death after injury (44). Ecel1 protein, a membrane-  Although many lncRNAs are known to act in cis and regulate expression of their nearby genes, it 415 is worth noting that lncRNAs can also act in trans, distant from their synthesis sites. One 416 example is lncRNA RP23-471B6.2 (also knowns lnc-NR2F1). We found that expression of this 417 20 lncRNA was higher in injured ipRGCs compared to ooDSGCs. Given its genomic location, it 418 was suggested that this lncRNA is a potential cis regulator of its gene neighbor Nr2f1 419 (Additional file 4: Table S4), which encodes a transcription factor involved in nervous system 420 development and neuron migration. However, a recent study has suggested that rather than  Guilt-by-association analysis suggested that lncRNAs may be functionally correlated to protein 445 coding genes enriched in specific "Biological Processes" and "Molecular Functions". Notably, 446 some lncRNAs were correlated with genes significantly enriched in "Biological Processes" 447 related to apoptotic pathways (Fig. 6). Genes associated with RGC death, including Atf4, Ddit3, 448 Chac1 and Trib3, were previously shown to be highly expressed in the injured ooDSGCs (7). In 449 our present study, we found that these genes were positively correlated with the lncRNA 450 Brip1os, lncRNA RP23-407N2.2, and lncRNA Dubr (Fig. 6b and 6c, Additional file 6: Table   451 S6).

452
Insulin growth factor-1 (Igf1) is an anti-apoptotic gene highly expressed in the injured ipRGCs 453 (7). Igf1 was positively correlated with the lncRNAs RP23-471B6.2 and RP23-83I13.10 454 (Additional file 6: Table S6). LncRNA RP23-83I13.10 was also positively correlated with Gpx3, 455 a gene known to ameliorate oxidative stress (54) and also highly enriched in the injured ipRGCs 456 (7). We also observed regeneration-associated genes, including Nrn1 and Spp1, which were  Table S6), raising the possibility that these lncRNAs may regulate expression of multiple genes 460 in RGCs after injury. We note however, that our assumptions on the functional roles of lncRNAs 461 remain speculative, and whether they in fact act to regulate gene expression, RGC specification, 462 and survival remains to be determined.            Table S6. List of lncRNAs with expression correlated to protein coding genes (r ≥ |0.9|).