A genetic variant rs10251977 in long non-coding RNA EGFR-AS1 creates a new binding site for miR-891b and modulates the expression of EGFR A/D isoforms

Tyrosine kinase inhibitor (TKI) therapy is one of the effective chemo-preventive approaches against tumors that deregulate the EGFR pathway. About 80% of HNSCC patients overexpress EGFR, making TKI an effective treatment against this cancer. Recently, a synonymous variant, rs10251977, in exon 20 of EGFR was reported to act as a prognostic marker in HNSCC. Analysis of this germline variant in blood samples of oral cancer patients showed a similar frequency in cases and controls. Further, in-silico analysis showed that this polymorphism creates a binding site for miR-891b in EGFR-AS1. EGFR-AS1 expression modulates the EGFR A/D isoforms through alternative splicing. Our bioinformatic analysis showed enrichment of the alternative splicing mark H3K36me3 and the presence of a few intronic polyA sites around exons 15a and 15b of EGFR, facilitating the skipping of exon 15b and thereby promoting the splicing of the EGFR-A isoform. In addition, the presence of PTBP1 binding sites in EGFR and EGFR-AS1 enhances the expression of the EGFR-A isoform by preventing premature termination. Expression profiling of EGFR-AS1, along with miR-891b level and rs10251977 polymorphism status, in oral cancer patients may be useful for targeted therapy.

Our results showed a decreased level of EGFR-AS1 in tumors that expressed a high level of miR-891b. We also found a miR-891b binding site in the 3'UTR of the EGFR-A isoform. We also analyzed another miRNA, miR-138-5p, whose binding site is present in both EGFR-AS1 and the EGFR-A isoform; its high expression level correlated with reduced levels of EGFR-AS1 and EGFR-A. Our study showed that both miRNAs were tumor suppressors and were downregulated in tumor samples.
In addition, supporting the ceRNA function, recent studies have reported miRNA sponging activity of EGFR-AS1 47,33 and miR-891b-mediated gene [...]. Our findings suggest that EGFR-AS1 promotes tumorigenesis by increasing the level of EGFR-A through the modulation of alternative splicing, preventing premature termination through PTBP1. The germline polymorphism rs10251977 (G>A) in EGFR-AS1 creates a binding site for miR-891b, which decreases the level of EGFR-AS1 and thereby increases the EGFR-D/A ratio, conferring susceptibility in tumors by promoting EGFR-AS1 to act as a miRNA sponge. These findings provide insight into the functional importance of the germline polymorphism rs10251977 (G>A) in EGFR-AS1 and into the orchestrated interactions between non-coding and coding RNAs in maintaining cellular homeostasis; its dysregulation may lead to [...]

Introduction

[...] tissues compared to that of normal tissues (p<0.001) (Figure 1a). When the expression levels were correlated with the clinicopathological characteristics of the tumors, a trend towards an increased level of EGFR-AS1 was observed, but without any statistically significant association (Figure S1a). Interestingly, in tumors with GA and AA genotypes (N=25), we noticed a high level of EGFR-AS1 expression, albeit with no statistical significance (p=0.653); this could be due to the small sample size (Figure S1b).

To understand the functional relationship of EGFR-AS1 expression with EGFR and its isoforms, we analysed the relative expression levels of EGFR isoforms A and D and the D/A ratio with reference to the expression level of EGFR-AS1. The tumor samples were stratified into EGFR-AS1 high and low expression groups based on the median value of EGFR-AS1 expression (Table 3). Tumors with an expression level above the median value were assigned to the high expression group and those with a fold change below the median value to the low expression group. We observed a statistically significant difference in the expression level of the EGFR-A isoform with reference to EGFR-AS1 level (p<0.0001), whereas no significant difference was found for the EGFR-D isoform (Figure 1b). However, we observed a significant difference between EGFR isoform A and D levels in both groups (Figure 1c), and found a significantly higher level of EGFR isoform D transcript in the low EGFR-AS1 expressing group than in the high expression group (p=0.017); this could be due to the modulation of RNA splicing. To address this, we carried out in-silico analysis to identify EGFR-AS1/EGFR binding partners that have a functional association with alternative splicing.

First, we predicted the binding partners of EGFR-AS1 using the online database lncRNAtor 12.
EGFR-AS1 showed a significant mechanistic association with three RNA binding proteins, FIP1, [...] (Table S1 and S2), suggesting a role for PTBP1 in alternative splicing of the EGFR-A isoform. To further support this hypothesis, we checked the binding partners of PTBP1 using the STITCH online tool and found that most of its interacting partners (HNRNPK, HNRNPU, HNRNPA1, HNRNPA3, HNRNPD, HNRNPL and SNRPA) are known to play a role in alternative splicing (Figure 2a).
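The median-based stratification described earlier in this section, together with the D/A ratio compared between the two groups, can be illustrated with a minimal sketch. The sample identifiers, field names and expression values below are made up for illustration and are not data from this study; samples falling exactly on the median are left unassigned, which is an assumption, since the text only defines "above" and "below".

```python
from statistics import median

def split_by_median(samples):
    """Split samples into EGFR-AS1 high/low groups around the median fold change."""
    cutoff = median(s["egfr_as1"] for s in samples)
    high = [s for s in samples if s["egfr_as1"] > cutoff]
    low = [s for s in samples if s["egfr_as1"] < cutoff]
    return cutoff, high, low

def mean_da_ratio(group):
    """Mean EGFR-D / EGFR-A expression ratio over a group of samples."""
    return sum(s["egfr_d"] / s["egfr_a"] for s in group) / len(group)

# Hypothetical relative expression values (fold change), for illustration only.
samples = [
    {"id": "T1", "egfr_as1": 0.5, "egfr_a": 1.0, "egfr_d": 2.0},
    {"id": "T2", "egfr_as1": 1.0, "egfr_a": 2.0, "egfr_d": 2.0},
    {"id": "T3", "egfr_as1": 4.0, "egfr_a": 4.0, "egfr_d": 2.0},
    {"id": "T4", "egfr_as1": 8.0, "egfr_a": 8.0, "egfr_d": 2.0},
]

cutoff, high, low = split_by_median(samples)
print("median cutoff:", cutoff)
print("D/A ratio, low EGFR-AS1 group:", mean_da_ratio(low))
print("D/A ratio, high EGFR-AS1 group:", mean_da_ratio(high))
```

In this toy data set the low EGFR-AS1 group has the higher D/A ratio, which mirrors the direction of the reported observation but says nothing about its statistical significance.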

Previous studies have shown that H3K36me3 plays an important role in exon skipping by recruiting alternative splicing factors 14. We checked whether H3K36me3 marks are present in the EGFR region using the UCSC Genome Browser. Interestingly, we found that enriched H3K36me3 marks were present around the skipped region spanning exons 15a and 15b (Figure 2b). To our surprise, this region also carried a few polyA sites around exon 15a, leading to the skipping of exon 15b and preventing premature termination, thereby reducing the EGFR-D isoform. EGFR-AS1 may act as a scaffold to recruit major alternative splicing factors in association with PTBP1, bringing about exon skipping and promoting EGFR-A expression.

Apart from acting as scaffolds, lncRNAs may also act as miRNA sponges in the cytoplasm. Tan et al. reported that the presence of the minor allele in EGFR-AS1 decreased its steady-state level 6, and this could be due to miRNA sponging-mediated degradation of the lncRNA. We used a web-based tool, lncRNASNP2, to determine the functional impact of the minor allele in EGFR-AS1. The minor allele generates a new binding site for miR-891b (Figure 3a and 3b). For the expression study, we chose this miRNA and another miRNA, miR-138-5p (Figure 3c), which targets EGFR and remains unaffected by the presence of either the major or the minor allele (Figure S2b). Both miRNAs target the EGFR-A isoform, as confirmed by the TargetScan and miRWalk tools (Figure 3d). Gene Set Enrichment Analysis (GSEA) for miR-891b and miR-138-5p revealed that they are involved in pathways related to cancer and the MAPK signaling pathway, respectively (Table S3 and S4).
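How a single-nucleotide change can create a miRNA binding site can be illustrated with a simple seed-match check. The sketch below is not the algorithm used by lncRNASNP2, TargetScan or miRWalk; it only tests for a perfect match to the miRNA seed (positions 2 to 8). All sequences are hypothetical toy sequences, not the real miR-891b or EGFR-AS1 sequences.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_site(mirna):
    """Return the target sequence (5'->3') that pairs with miRNA seed positions 2-8."""
    seed = mirna[1:8]  # positions 2-8, counting from 1 at the 5' end
    return "".join(COMPLEMENT[base] for base in reversed(seed))

def has_seed_match(target, mirna):
    """True if the target contains a perfect match to the miRNA seed."""
    return seed_site(mirna) in target

# Hypothetical sequences for illustration only (NOT real miR-891b / EGFR-AS1).
toy_mirna = "UACGGAUCCAGUUAGCAUCGA"
major_allele = "CCUAGGUCCGUAAC"  # toy transcript carrying G at the variant position
minor_allele = "CCUAGAUCCGUAAC"  # same toy transcript carrying A (G>A)

print("seed site on target:", seed_site(toy_mirna))
print("major allele has site:", has_seed_match(major_allele, toy_mirna))
print("minor allele has site:", has_seed_match(minor_allele, toy_mirna))
```

In this toy case only the A allele completes the seed-complementary 7-mer, so the site exists on the minor allele and is absent on the major allele. Real prediction tools additionally weigh site context, pairing energy and conservation.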

EGFR-AS1, a lncRNA transcribed from the antisense strand of EGFR, was found to be overexpressed in several cancers 30,11,31-33. Our study observed a significant upregulation of EGFR-AS1 in oral cancer patients, suggesting the oncogenic potential of this lncRNA, which is consistent with the above findings. When the EGFR-AS1 expression profile was analysed with reference to genotypes, we found decreased expression of EGFR in the GG genotype compared to the GA+AA genotypes, but without any statistical significance. The altered expression pattern of EGFR-AS1 might be due to a previously reported non-canonical RNA editing mechanism that maintains the allele-specific lncRNA based on germline polymorphism 34.

In this study, there was a significant change in the EGFR D:A isoform ratio with the low level [...] diminished level of EGFR-AS1. This is evident from our results, which showed a decreased level of EGFR-AS1 in tumors that expressed a high level of miR-891b. Interestingly, we also found a miR-891b binding site in the 3'UTR of the EGFR-A isoform. We also analyzed another miRNA, miR-138-5p, whose binding site is present in both EGFR-AS1 and the EGFR-A isoform and [...]

The blood samples were subjected to DNA isolation using the Proteinase K digestion and PCI extraction method. The isolated DNA was quantified using a NanoDrop 2000 UV-Vis spectrophotometer (Thermo Scientific, USA), and its integrity was assessed by resolving it on a 1% agarose gel in a Mupid gel electrophoresis system (TaKaRa, Japan). The DNA was diluted to 100 ng/μL and stored at -20 °C, and was later used for screening germline variants.

cDNA conversion was carried out from total RNA using custom-designed miRNA seed-specific stem-loop primers for miRNAs (Table S5) and a random hexamer primer for coding and non-coding RNAs, with 2 μg of RNA for mRNAs and lncRNAs and 10 ng of RNA for miRNAs.

The RNA samples were pre-incubated at 65 °C for 20 min, followed by 55 °C for 90 min, 72 °C for [...]

The frequencies of alleles and genotypes were compared between the patient and control groups by chi-square test. Fisher's exact test and odds ratio (OR) with 95% confidence interval (CI) were calculated to find the risk association. Differences between the means were presented as mean ± SEM and analysed using Student's t-test (Mann-Whitney) in GraphPad Prism statistical software, v 6.01 (GraphPad Software Inc, USA). All tests were two-tailed, and a p value of <0.05 was considered statistically significant.

[...] levels. e. Graph shows the relative fold change of EGFR-AS1 in relation to miR-891b.

(Statistical significance represented as * for P < 0.01, two-tailed Student's t-test).

Table 1 Demographic and clinical data of oral cancer patients and healthy controls
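The case-control comparison described in the statistical analysis (chi-square test and odds ratio with 95% CI) can be sketched for a 2x2 allele-count table as follows. The counts are invented for illustration and are not the study data. The Woolf (log-scale) interval is an assumption, since the text does not state which CI method was used, and the chi-square statistic is computed without continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval; z=1.96 gives ~95%."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts, for illustration only:
# rows = cases / controls, columns = minor allele (A) / major allele (G).
cases_minor, cases_major = 30, 70
controls_minor, controls_major = 20, 80

chi2 = chi_square_2x2(cases_minor, cases_major, controls_minor, controls_major)
odds_ratio, lower, upper = odds_ratio_ci(
    cases_minor, cases_major, controls_minor, controls_major
)
print(f"chi-square = {chi2:.3f}")
print(f"OR = {odds_ratio:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

For these toy counts the confidence interval spans 1, so the example would not indicate a risk association. With small cell counts, an exact test such as Fisher's, as named in the text, is the usual choice.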