REV3L Gene Variants rs1002481, rs462779, and rs465646 Lead to Increased Susceptibility Towards Non-Small Cell Lung Cancer in the Population of Jammu and Kashmir.

Background: Non-small cell lung cancer (NSCLC) is the most prevalent type of lung cancer, accounting for 80–85% of all lung cancer cases. Several genetic studies have examined the association of REV3L (protein reversionless 3-like) gene mutations with cancer, including lung cancer, but no such study has been carried out in the population of Jammu and Kashmir (J&K). Methods: The selected REV3L variants were genotyped using the TaqMan allele discrimination assay in 550 subjects (203 NSCLC patients and 347 healthy controls). The association of the variants with NSCLC was evaluated by logistic regression. Results: Of the four REV3L variants genotyped, rs1002481, rs462779, and rs465646 were significantly associated with NSCLC risk, with odds ratios (OR) of 3.5 (95% CI 1.98–6.3; p value = 0.00002), 4.4 (95% CI 1.8–10.4; p value = 0.00075), and 2.4 (95% CI 1.47–4.008; p value = 0.00053), respectively. Conclusion: Our data suggest a strong association of variants rs1002481, rs462779, and rs465646 with NSCLC. This association indicates a potential role of mutations in the REV3L gene as a risk factor for the development of NSCLC in the studied population. Although this is the first report from any Indian population, these variants have previously been reported to be associated with lung and colorectal cancers in different world populations. Those data, together with ours, support the notion that these variants can be used as potential prognostic biomarkers.

Background
The region of Jammu and Kashmir (J&K) in India has long been considered an endemic cancer zone with a peculiar cancer profile (Rasool et al., 2012) [1]. Lung cancer, which accounted for 18.4% of cancer-related deaths globally in 2018, is a major cancer type in Indian populations, especially among males (India State-Level Disease Burden Initiative Cancer Collaborators, 2016; Ferlay et al., 2018) [2].
One recent study ranked lung cancer as the second most common cancer in the J&K region, with 23.5% of all new cases in 2017 [3]. Several studies have evaluated mutations in key genes, such as those involved in cell cycle and cell growth regulation, DNA damage repair pathways, and many other pathways, with reference to different cancer types in the population of J&K [4][5][6][7], but none has evaluated the role of translesion synthesis (TLS) polymerases in any cancer type. The DNA sequence can be altered by replication errors and by environmental agents such as tobacco smoke, mutagenic chemicals, and certain types of radiation, which may lead to cancer [8][9][10]. However, various DNA repair pathways, including the DNA damage tolerance (DDT) pathway, are responsible for maintaining genomic stability. Any disruption in these pathways therefore has the potential to cause malignancy. TLS polymerases, the key players of the DDT pathway, comprise a series of specialized polymerases such as polymerases κ, ζ, η, ι, and Rev1 [11][12][13]. Owing to their error-prone nature, most of these TLS polymerases, including pol ζ, have been found to be associated with different cancer types [14]. DNA pol ζ is a heterodimer consisting of Rev3, the catalytic subunit, and Rev7, the accessory subunit.
Pol ζ can mediate DNA replication past DNA damage, which may prevent chromosome instability in cells; it can therefore be considered a suppressor of spontaneous tumorigenesis at replication forks that helps maintain genomic stability. However, several point mutations in REV3L cause protein misfolding and mislocalization, which in turn alter the protein's interactions and biological functions. These dislocations are usually caused by genetic mutations that cause fold-level changes in expression and alter protein function. Therefore, any reduction in the activity of pol ζ makes cells, including tumour cells, sensitive to DNA-damaging agents, and its absence can also increase chromosome rearrangements and inflammation, thus promoting carcinogenesis [14][15][16][17].
In the present study, we conducted an association study of the single nucleotide variants (SNVs) rs1002481 (T > A), rs462779 (G > A), rs465646 (G > A), and rs11153292 (G > A) in the REV3L (protein reversionless 3-like) gene, located on chromosome 6q21 (Chr 6: 111,299,028-111,483,71), with NSCLC in the population of Jammu and Kashmir. These variants were selected on the basis of a literature survey indicating that they play an important role in carcinogenesis. SNV rs1002481 is an intronic variant located in intron 7, rs462779 is an exonic variant located in exon 15, rs465646 is located in the 3' UTR of REV3L, and rs11153292 is an intronic variant located in intron 6.

Ethical Statement
The study was approved by the Institutional Ethics Review Board (IERB) of Shri Mata Vaishno Devi University (SMVDU) vide IERB Serial No: SMVDU/IERB/16/41. Written informed consent was obtained from each participant, and all parameters were recorded in a pre-designed proforma. All experimental work in this study was performed according to the guidelines issued by the Institutional Ethics Review Board.

Sampling
A total of 550 subjects (203 NSCLC cases and 347 healthy controls) were recruited for the study. All cases were histopathologically confirmed. Genomic DNA was isolated from blood samples using the FlexiGene ® DNA Isolation Kit, Qiagen (catalogue No. 51206). Agarose gel electrophoresis was used to assess the quality of the genomic DNA, and quantification was performed using the Eppendorf Bio-Spectrometer ® (Nanodrop).

Genotyping
Genotyping of the selected SNVs was performed using the allele discrimination assay on the Agilent MX3005P real-time PCR platform, following the methodology of our previous studies [4,6]. The TaqMan probes were diluted to a 20X concentration using TE (Tris-EDTA) buffer as per the manufacturer's protocol. The total PCR reaction volume was 10 μl, comprising 2.5 μl of TaqMan UNG Master Mix, 0.25 μl of the probe, 3 μl of DNA (5 ng/μl), and 4.25 μl of nuclease-free water to make up the final volume. The thermal conditions were as follows: hold for 10 min at 95 °C, then 40 cycles of 95 °C for 15 s and 60 °C for 1 min. All samples were run in 96-well plates with three no-template controls (NTCs). The post-PCR detection system was used to measure allele-specific fluorescence. Ninety-three randomly selected samples were re-genotyped to cross-validate the genotyping calls, and the concordance rate was 100%.
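The reaction recipe in the Genotyping paragraph can be checked and scaled to a full plate with a few lines of arithmetic. The following Python sketch is illustrative only: the per-reaction volumes are taken from the text, whereas the function names and the 10% pipetting overage are assumptions introduced here and are not part of the reported protocol.

```python
# Per-reaction volumes in microlitres, as stated in the Genotyping section.
PER_REACTION_UL = {
    "TaqMan UNG Master Mix": 2.5,
    "TaqMan probe (20X)": 0.25,
    "genomic DNA (5 ng/ul)": 3.0,
    "nuclease-free water": 4.25,
}

TEMPLATE_KEY = "genomic DNA (5 ng/ul)"


def total_volume(recipe):
    """Sum of all component volumes for one reaction (ul)."""
    return sum(recipe.values())


def premix_for_plate(recipe, n_wells, overage=0.10):
    """Volumes (ul) of the shared reagents for n_wells reactions.

    The DNA template is excluded because it is added to each well
    separately. The overage fraction is a hypothetical pipetting
    allowance, not a value reported in the study.
    """
    factor = n_wells * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in recipe.items()
        if name != TEMPLATE_KEY
    }


def dna_mass_ng(volume_ul=3.0, conc_ng_per_ul=5.0):
    """Mass of template DNA per reaction (ng)."""
    return volume_ul * conc_ng_per_ul


if __name__ == "__main__":
    print("Total reaction volume (ul):", total_volume(PER_REACTION_UL))
    print("DNA per reaction (ng):", dna_mass_ng())
    for name, volume in premix_for_plate(PER_REACTION_UL, 96).items():
        print(f"{name}: {volume} ul for a 96-well plate")
```

The four components sum to the stated 10 μl, and 3 μl of DNA at 5 ng/μl corresponds to 15 ng of template per reaction.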

Statistical analysis
The genotypic frequencies of each variant were tested for Hardy-Weinberg equilibrium (HWE) using the chi-square (χ²) test, and the variants that followed HWE were further analyzed for association with NSCLC using IBM SPSS software v.23.0. The associations between REV3L variants and non-small cell lung cancer risk were estimated by calculating ORs with 95% CIs by binary logistic regression analysis, with adjustment for age, gender, body mass index, alcohol consumption, and smoking status. We also calculated the statistical power for each variant using PS software.
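For readers unfamiliar with the two steps described in this section, the following Python sketch illustrates a chi-square test for HWE and an unadjusted odds ratio under a recessive model. It is a simplified illustration and not the analysis performed in this study: the reported ORs come from covariate-adjusted logistic regression in SPSS, whereas this sketch uses a plain 2×2 table with a Woolf (log-based) confidence interval. The function names and the genotype counts in the usage example are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 degree of freedom) for HWE.

    Arguments are observed genotype counts (AA, AB, BB).
    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def recessive_odds_ratio(case_counts, control_counts, z_crit=1.96):
    """Unadjusted OR and 95% CI under a recessive model.

    Each argument is (ref homozygote, heterozygote, risk homozygote).
    Risk homozygotes are compared with all other genotypes.
    """
    a = case_counts[2]                         # cases, two risk alleles
    b = case_counts[0] + case_counts[1]        # cases, other genotypes
    c = control_counts[2]                      # controls, two risk alleles
    d = control_counts[0] + control_counts[1]  # controls, other genotypes
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    low = math.exp(math.log(odds_ratio) - z_crit * se)
    high = math.exp(math.log(odds_ratio) + z_crit * se)
    return odds_ratio, low, high


if __name__ == "__main__":
    # Hypothetical genotype counts, for illustration only.
    cases = (100, 70, 30)
    controls = (180, 100, 20)
    print("HWE in controls (chi2, p):", hwe_chi_square(*controls))
    print("Recessive OR (OR, low, high):", recessive_odds_ratio(cases, controls))
```

A variant would typically be carried forward to the association step only if the HWE p-value in controls exceeds the chosen threshold (commonly 0.05).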

Results
We recruited a total of 550 subjects, of whom 203 were NSCLC patients (cases) and 347 were healthy individuals (controls). Among the cases, 82% were male and 18% were female; among the controls, 75% were male and 25% were female, suggesting that the prevalence of NSCLC is high among males in the J&K population. The mean age of cases and controls was 61.38 ± 9.7 years and 41.68 ± 16 years, respectively, a statistically significant difference (p-value = 6.36E-29). The average BMI was 22.13 ± 3.75 kg/m² in cases and 23.01 ± 5.2 kg/m² in controls (p-value = 0.034). The distribution of other clinical characteristics between cases and controls is given in Table 1. [...] (Table 2). The distribution of the risk alleles is shown in Figure 1. The difference in the allelic frequency distribution of the rs11153292 variant between cases and controls was statistically insignificant (hence the data are not presented in this manuscript).
To explore the maximum effect of the risk alleles, we evaluated the association under different genetic models using binary logistic regression analysis and observed that, under the recessive model, SNVs rs1002481, rs462779, and rs465646 were significantly associated with NSCLC risk, with adjusted OR* values of 3

Discussion
Since the advent of genome-wide association studies (GWAS), rapid progress has been made in the identification of lung cancer susceptibility loci, such as those on chromosomes 5p15.33, 6p21, and 15q24-25.1 [18]. In addition, individual gene mutation profiling studies and case-control association studies have contributed significantly to the identification of lung cancer susceptibility genes such as EGFR, TP53, and RB1 [18,19]. Although TLS polymerases play an important role in preventing the more deleterious effects of stalled replication forks, their error-prone nature often introduces genomic instability, a potential cause of malignancy [14]. The role of Rev3 in the maintenance of common fragile sites further reinforces the role of TLS polymerases in cancer [20]. Consistent with their potential role in cancer, several studies have investigated the association of SNVs in these genes with different cancers, including lung cancer [21][22][23]. However, the association of TLS gene SNVs with any cancer in the population of J&K has not yet been studied. This is the first study analyzing the role of REV3L genetic polymorphisms in NSCLC in the population of J&K. We examined the association of three SNVs with NSCLC to understand the role of REV3L germline polymorphisms in increasing susceptibility to NSCLC in the studied population. A statistically significant association with NSCLC was observed for genetic variants rs1002481, rs462779, and rs465646, with adjusted odds ratios (OR*) of 3.5 (adjusted 95% CI 1.98-6.3), 4.4 (adjusted 95% CI 1.8-10.4), and 2.4 (adjusted 95% CI 1.47-4.008), respectively, suggesting that REV3L polymorphisms may have a role in the pathological process of NSCLC in the studied population.
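The relationship between the reported odds ratios, confidence intervals, and p values can be illustrated with a short worked calculation. The Python sketch below assumes a Wald-type interval on the log-odds scale, so that the standard error is recovered as (ln upper − ln lower) / (2 × 1.96) and the two-sided p value follows from z = ln(OR) / SE. This is an approximation for illustration, not the procedure used by the statistical software in this study; the function name is hypothetical, and small differences from the reported p values are expected because the published figures are rounded.

```python
import math


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover SE, z, and a two-sided p-value from an OR and its 95% CI.

    Assumes the interval was built as exp(ln(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p_value


# (OR, lower 95% CI, upper 95% CI, reported p value), as given in the text.
REPORTED = {
    "rs1002481": (3.5, 1.98, 6.3, 0.00002),
    "rs462779": (4.4, 1.8, 10.4, 0.00075),
    "rs465646": (2.4, 1.47, 4.008, 0.00053),
}

if __name__ == "__main__":
    for snv, (odds_ratio, low, high, p_reported) in REPORTED.items():
        se, z, p_value = wald_p_from_ci(odds_ratio, low, high)
        print(f"{snv}: SE={se:.3f}, z={z:.2f}, "
              f"recomputed p={p_value:.2e}, reported p={p_reported}")
```

Under this assumption, the recomputed p values fall in the same order of magnitude as the reported ones, which indicates that the odds ratios, intervals, and p values are mutually consistent.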
Using the in silico tool SNiPA, we found that genetic variants rs1002481 and rs465646 play a direct regulatory role and that rs462779 has a direct effect on the transcript. All three SNPs were found to be in linkage disequilibrium (LD, r² ≥ 0.8). The plot symbol of each variant indicates its functional annotation (Figure 2). The mechanism of pol ζ-mediated lesion bypass involves the binding of Rev1 to the replication complex, which then recruits DNA pol ζ to the damaged site. Rev3 does not bind directly to Rev1; instead, it binds to MAD2L2, the accessory subunit of pol ζ, during the formation of the pol ζ complex, and this leads to a conformational change in the C-terminal domain of MAD2L2, which binds the C-terminus of Rev1 to carry out DNA translesion synthesis [14]. Lesion bypass allows the replication process to continue, thereby preventing the more deleterious effects of blocked replication forks, which may lead to chromosome instability in cells (Supplementary figure S1). Hence, pol ζ, along with other TLS polymerases, may be considered a suppressor of spontaneous tumorigenesis at replication forks that helps maintain genomic stability [16,17]. REV3L has been shown to interact with many genes, as visualized with the STRING tool v 10.5 [25] (Supplementary figure S2).
Any molecular alteration in the REV3L gene can cause misfolding and mislocalization of Rev3, thus weakening the interaction between Rev3 and MAD2L2 (also known as Rev7) [26]. This results in under-expression of the pol ζ complex and therefore in compromised lesion bypass [27]. On the other hand, nucleotide changes such as rs465646 in the 3' UTR of REV3L, a microRNA binding site, can cause over-expression of REV3L, which again leads to a compromised lesion bypass mechanism. Hence, both over-expression and under-expression of TLS polymerases can increase the rate of mutation, signifying that they could act equally as oncogenes and tumour suppressor genes, as shown in Figure 3 [28].
Figure 3: Schematic representation of molecular alterations in REV3L leading to both over-expression and under-expression of DNA polymerase zeta, which can increase the rate of mutation, thus promoting carcinogenesis [27].

Conclusion
The statistically significant association of rs1002481, rs462779, and rs465646, which are in linkage disequilibrium, with non-small cell lung cancer suggests an increased genetic risk of non-small cell lung cancer in the population of J&K. The association of TLS polymerases with the development of chemoresistance adds a further dimension to the importance of evaluating these polymerases in cancer patients of this cancer-prone geographic region. In the modern era of medical science, the discovery of an effective treatment for cancer, including lung cancer, remains a daunting challenge because of increasing drug resistance. Therefore, identifying and targeting the genes that cause chemoresistance could prove a critical milestone in improving prognosis. The importance of this study therefore lies in the identification of nucleotide changes in the TLS polymerases in cancer patients, which might prove to be prognostic or predictive biomarkers in the study population.

Availability of data and materials
The data generated and analyzed during this study are not publicly available but can be obtained from the corresponding author upon reasonable request.

Ethics approval and consent for participation

Figure 2: Linkage disequilibrium plot, with the plot symbol of each variant indicating its functional annotation [24].