Map Based Cloning of a Dominant Rust Resistance Gene and Mapping of its Duplicated Paralogues in Cultivated Groundnut (Arachis hypogaea L.)

Understanding the mechanism and nature of resistance genes in crop plants is essential for their use in new breeding techniques. Previously, a dominant rust resistance gene was fine-mapped within a 1.2 cM interval on chromosome A03 of groundnut. Here, the rust resistance gene, VG9514-Rgene, was isolated through map-based cloning. Sequencing of the gene from resistant and susceptible plants revealed non-synonymous mutations in the TIR, NBS and LRR regions of the R-protein. Genetic mapping of PCR markers based on these SNPs confirmed the position of VG9514-Rgene between the FRS 72 and SSR_GO340445 markers on chromosome A03. Homology searching identified four homologous R-genes in the groundnut genome. Of them, Arahy.R8KUIR, Arahy.T6DCA5 and Arahy.ZZ0VZ9 are paralogues. These paralogous genes had several small InDels. Mapping of markers based on these InDels revealed tandem duplication of the paralogous R-genes at the distal portion of chromosome A03. Ka/Ks calculation revealed that the unique VG9514-Rgene had undergone positive selection. Homology-based structure modelling of this R-protein revealed a typical consensus three-dimensional fold of a TIR-NBS-LRR protein. Non-synonymous mutations in the susceptible version of the R-protein were mapped onto this protein model, and the E268Q mutation in the hhGRExE motif, Y309F in the RNBS-A motif and I579T in the MHD motif of the NB-ARC domain were found to be probable candidates for loss of function.


Introduction
Plant genomes contain many resistance genes (R-genes) and their analogues (RGAs), which provide nonhost resistance to plants (Schulze-Lefert and Panstruga 2011). In nature, specific R-genes have evolved against specific plant pathogens (Flor 1942; Guido et al. 1992). Such specific interaction between a pathogen effector (Avr gene product) and an R-gene imparts vertical resistance to crop plants, and such resistance is mostly controlled by oligogenes. R genes are usually dominant in nature and provide full or partial resistance to one or more pathogens. The first R gene, Hm1, was cloned from maize in 1992 (Johal and Briggs 1992). Since then, many R genes have been cloned from different plants, including crop species. These account for a total of 314 cloned functional R genes, which operate through nine distinct mechanisms inside the plant cell to impart disease resistance (Kourelis and van der Hoorn 2018). Jiang et al. (2018) isolated a broad-spectrum and durable NB-LRR gene, R8, that provides resistance against Phytophthora infestans. Similarly, a unique tandem kinase-pseudokinase R gene that provides broad-spectrum resistance to Puccinia striiformis f. sp. tritici (Pst) was isolated by a map-based cloning approach (Klymiuk et al. 2018). Most resistance genes encode intracellular nucleotide-binding/leucine-rich repeat (NLR) immune receptor proteins that have three distinct domains: Toll/Interleukin 1 Receptor (TIR) or Coiled Coil (CC), Nucleotide Binding Site (NBS; also called NB-ARC) and Leucine Rich Repeat (LRR). The NB-ARC domain is so named because of its presence in APAF-1 (apoptotic protease-activating factor-1), R proteins and CED-4 (Caenorhabditis elegans death-4 protein), along with its nucleotide binding sites (van der Biezen and Jones 1998). Cloning of R-genes can be achieved through genetic map-based positional cloning or mutational genomics approaches (Arora et al. 2019).
Prior to the availability of a genome sequence in groundnut, R-genes were predicted from expressed sequence tags (ESTs) using known R-gene protein sequences from model plants. Liu et al. (2013) identified six different classes of R genes from 1053 ESTs, which were later assembled into 156 contigs and 229 singletons as groundnut-expressed RGAs. The genome sequencing initiative in the wild diploid progenitors yielded 278 NBS genes from A. duranensis and 303 NBS genes from A. ipaensis (Song et al. 2017). Subsequently, Zhuang et al. (2019) identified 661 NBS domain-containing R-genes in tetraploid groundnut. All these R-genes were divided into three groups: coiled coil (CC)-NBS-leucine-rich repeat (LRR) (CNL), Toll/interleukin-1 receptor (TIR)-NBS-LRR (TNL) and resistance to powdery mildew8 (RPW8)-NBS-LRR (RNL). Efforts on mapping and development of molecular markers for various disease resistance genes in Arachis species have progressed considerably; however, no reports are available on the isolation and cloning of a resistance gene (R-gene) for a particular disease.
Breeding and cytogenetic efforts were undertaken to introgress a chromosome fragment containing a rust resistance gene from A. cardenasii into cultivated groundnut through wide hybridization techniques in India and abroad (Varman 1999). Recently, a fine mapping effort on a VG 9514 x TAG 24 derived RIL population delimited the rust resistance gene within a 1.2 cM fragment flanked by two SSR markers, SSR_GO340445 and FRS 72, on chromosome A03 (Mondal and Badigannavar 2018). In parallel, the position of the rust resistance gene in the same genomic fragment of chromosome A03 was confirmed using ddRAD-seq and whole-genome resequencing approaches. The genome sequencing initiative further made it possible to examine the region at the nucleotide sequence level and revealed a 331 kb segment corresponding to the 1.2 cM fine mapped region. This 331 kb fragment of chromosome A03 contains several genes coding for disease resistance related proteins, including a single TIR-NBS-LRR gene and three glucan endo-1,3 β glucosidase genes (Mondal and Badigannavar 2018). The present paper reports the cloning of the TIR-NBS-LRR gene and the genetic validation of its candidature for rust resistance in cultivated groundnut. Mapping of other paralogous R genes in the cultivated groundnut genome and positioning of non-synonymous mutations in a homology model of this R-protein were also undertaken in this study.

Plant materials
A rust resistant breeding line, VG 9514, was used as the source of resistance, while TAG 24 was used as the susceptible line. VG 9514 was bred at Tamil Nadu Agricultural University, Vriddachalam from an interspecific cross between Co 1 (A. hypogaea L.) and A. cardenasii (Varman 1999). TAG 24 is a high yielding cultivar bred at Bhabha Atomic Research Centre, Mumbai and is cultivated widely across India (Patil et al. 1995). A recombinant inbred line population (VG 9514 X TAG 24) with 164 lines was used for mapping of the SNP-based PCR markers and InDel markers designed specifically in this study.
Cloning of R-gene within fine mapped region
Based on the findings of Mondal and Badigannavar (2018), a TIR-NBS-LRR gene (Aradu.Z87JB) was predicted as a candidate gene for rust resistance. The predicted gene sequence was downloaded from Peanutbase (https://peanutbase.org/home). The total size of the genomic fragment of the predicted gene was about 3.9 kb. For cloning, four primer pairs (Table 1; Fig 1a) were designed so that adjacent PCR products overlapped. These overlapping regions were used to assemble the sequences derived from the individual PCR products into the whole TIR-NBS-LRR sequence. Gradient PCR (60 to 70ºC) was used to standardize selective amplification with each primer pair. Thereafter, the respective fragments were selectively amplified from both VG 9514 (resistant) and TAG 24 (susceptible) in a 20 µl PCR reaction volume containing 1X colorless buffer, 2 mM MgCl2, 0.2 mM of each dNTP, 0.2 µM of each primer and 1.0 U of Q5® high fidelity DNA polymerase (New England Biolabs, Madison, USA). The amplification profile consisted of initial denaturation for 5 min at 95°C; 35 cycles of 30 s denaturation at 95°C, 30 s annealing at 62-70ºC (depending on the primer pair; Table 1) and 1 min extension at 72°C; and a final extension at 72°C for 10 min. The PCR product of the desired size was eluted from an agarose gel (1% agarose in 1X TAE) and purified with a gel purification kit (Qiagen, Hamburg, Germany). The purified PCR product was then ligated into the pcDNA3.1-myc-HisB plasmid cut with EcoRV. The ligation mixture was used to transform E. coli DH5α competent cells, and transformants were selected on Luria broth agar plates containing ampicillin (100 µg/ml). Positive recombinant cells were confirmed through colony PCR using the T7 promoter forward primer and the BGH reverse primer (Table 1). Each positive colony thus obtained was inoculated in LB broth containing 100 µg/ml ampicillin.
After overnight growth at 37ºC, the recombinant plasmids were isolated from positive cells and the insert size was confirmed through double digestion with XhoI and HindIII. For each fragment, 10 colonies were isolated and further confirmed through double digestion of the recombinant plasmid with the above enzymes. The recombinant plasmids from five such positive clones were then sequenced bi-directionally using vector-specific primers.

Sequencing of the target R-gene
The positive recombinant plasmid DNA from each R-gene fragment was sequenced bi-directionally in an ABI PRISM 3700 DNA Analyzer (Applied Biosystems, CA, USA) using the T7 promoter forward and BGH reverse primers at Dr. KPC Life Sciences Pvt. Ltd., Falta SEZ, West Bengal, India. Removal of excess dye was performed according to the manufacturer's protocol. Vector sequence was trimmed from each read, all the fragments were aligned and the whole R-gene sequence was extracted. The extracted R-gene sequences from VG 9514 and TAG 24 were then submitted to NCBI.

Identification of SNPs and design of SNP-based PCR marker
The R-gene sequences from VG 9514 and TAG 24 were aligned using the ClustalW tool (https://www.genome.jp/tools-bin/clustalw), and similarities and mismatches were identified. The DNA sequences were then translated in silico to generate protein sequences. Based on the protein sequence alignment, non-synonymous mutations were identified, and PCR-based SNP primer pairs (Table 1) relying on 3'-mismatches were designed with the WebSNAPPER tool (Drenkard et al. 2000). Each of the PCR primer pairs (0.2 µM) was then tested for polymorphism between VG 9514 and TAG 24. PCR was performed in a 10 µl reaction mixture at the specific annealing temperature (Table 2) under the following thermal conditions: initial denaturation for 5 min at 95°C; 25 to 32 cycles of 30 s denaturation at 95°C, 30 s annealing at 58-67ºC (depending on the primer pair; Table 2) and 30 s extension at 72°C; and a final extension at 72°C for 5 min. PCR amplification of SNP markers was carried out using 1X buffer, 0.2 mM dNTP and 1.5 U Taq DNA polymerase supplied by the Board of Radiation and Isotope Technology (BRIT), Mumbai, India. The amplified products were size separated in a 1.5% agarose gel along with a 100 bp DNA ladder and stained with 0.1% ethidium bromide. The stained agarose gel was then photographed in a gel documentation system (Kodak, Rochester, USA).
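The in silico translation and comparison step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes two already aligned, gap-free, in-frame coding sequences of equal length, and the function name and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nonsynonymous_changes(resistant_cds, susceptible_cds):
    """List amino acid changes (e.g. 'E2Q') between two aligned coding sequences.

    Assumption: both inputs are in-frame, gap-free and of equal length.
    Codon positions are 1-based; synonymous differences are skipped.
    """
    if len(resistant_cds) != len(susceptible_cds) or len(resistant_cds) % 3:
        raise ValueError("sequences must be aligned, equal length and in frame")
    changes = []
    for i in range(0, len(resistant_cds), 3):
        aa_res = CODON_TABLE[resistant_cds[i:i + 3].upper()]
        aa_sus = CODON_TABLE[susceptible_cds[i:i + 3].upper()]
        if aa_res != aa_sus:
            changes.append(f"{aa_res}{i // 3 + 1}{aa_sus}")
    return changes


# Hypothetical toy sequences: codon 4 differs (CTG vs CTC) but is synonymous.
resistant = "ATGGAATATCTG"    # M E Y L
susceptible = "ATGCAATTTCTC"  # M Q F L
print(nonsynonymous_changes(resistant, susceptible))  # ['E2Q', 'Y3F']
```

Only positions reported by such a comparison would be candidates for allele-specific primers with a 3'-mismatch; synonymous SNPs are deliberately ignored here.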

Identification of InDels in other homologous sequences
The identified R-gene sequence of VG 9514 was used as a BLASTn query against all three genomes in Peanutbase (https://peanutbase.org/home). Based on the search criteria (coverage >80%, E value = 0 and identity >90%), four homologous sequences were found in A. hypogaea and one each in A. duranensis and A. ipaensis. Within these sequences, small InDels (3 to 18 bp) were identified through multiple sequence alignment in ClustalW2. Primer pairs were designed from the sequences flanking the identified InDels and later used for detection of polymorphism and genetic mapping studies (Table 1).
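The hit-selection criteria stated above (coverage >80%, E value = 0, identity >90%) amount to a simple filter over BLASTn results. The sketch below assumes the hits have already been parsed into records; the `Hit` class, its field names and the example values are hypothetical and are not results from this study.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One parsed BLASTn hit (hypothetical record layout)."""
    subject_id: str
    identity_pct: float
    query_coverage_pct: float
    evalue: float


def passes_thresholds(hit):
    # Criteria from the text: coverage >80%, E value = 0, identity >90%.
    return (hit.query_coverage_pct > 80.0
            and hit.identity_pct > 90.0
            and hit.evalue == 0.0)


# Made-up hits, for illustration only.
hits = [
    Hit("hit_A", identity_pct=98.5, query_coverage_pct=100.0, evalue=0.0),
    Hit("hit_B", identity_pct=92.1, query_coverage_pct=75.0, evalue=0.0),
    Hit("hit_C", identity_pct=88.0, query_coverage_pct=95.0, evalue=1e-120),
]
homologues = [h.subject_id for h in hits if passes_thresholds(h)]
print(homologues)  # ['hit_A']
```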

Genotyping of mapping population with SNP and InDel markers
Genomic DNA was freshly isolated from young leaves of all 164 RILs (VG 9514 X TAG 24) along with the parents. The polymorphic SNP-based PCR markers and InDel markers were used for genotyping all 164 RILs. Standard PCR conditions with optimum annealing temperatures (Table 1) were followed for PCR amplification of the SNP-based PCR markers and InDel markers. For PCR amplification of InDel markers, 1 U of Go Taq polymerase was used with 0.2 µM of each primer, 200 µM dNTPs and 1X colorless buffer (Promega, Madison, USA). The amplified products from InDel markers were resolved in a capillary gel electrophoresis system (Qiagen, Germany). Sizing of the PCR products (allele size) and scoring were done using the QiaExcel software (Qiagen, Germany).

Linkage analysis
The linkage analysis was performed using QTL IciMapping ver. 4.1 (Wang et al. 2016). A minimum LOD score of 3.5 was set as the threshold for linkage group determination. For ordering and rippling of grouped markers, the 'nnTwoOpt' and 'SAD' (sum of adjacent distances) commands were used, respectively. The map distance was expressed in centiMorgan (cM) using the Kosambi (1944) mapping function.

Validation of SNP-based PCR marker in other resistant genotypes
A set of 11 rust resistant (including VG 9514) and 11 rust susceptible (including TAG 24) groundnut genotypes was used to validate the strong association/co-segregation of the R-gene sequence derived SNP-based PCR markers with rust resistance in the field (Table 3). Genomic DNA was freshly isolated from all these genotypes. 10 ng DNA from each genotype was used for PCR of the SNP-based PCR markers, following the temperature profile and reaction mixture described above.
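For reference, the Kosambi (1944) function named in the linkage analysis converts a recombination fraction r into a map distance d = 25 ln((1 + 2r)/(1 − 2r)) cM. The sketch below shows this conversion and its inverse; it is a stand-alone illustration with hypothetical function names, not the QTL IciMapping implementation.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse mapping: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


# For small r, the distance in cM is close to 100 * r.
print(round(kosambi_cm(0.10), 2))  # 10.14
```

At short distances such as the 1.2 cM interval discussed in this paper, the Kosambi distance and the raw recombination percentage are nearly identical; the correction matters mainly for loosely linked markers.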

HRM Realtime-PCR amplification and data analysis
The High Resolution Melting (HRM) assay was performed in a Rotor-Gene Q real-time PCR thermocycler (Qiagen, Germany) using pre-screened primer pairs as per the protocol of the Type-it HRM PCR Kit (Qiagen, Hamburg, Germany). Specific amplification for each primer pair was checked on a 2.5% agarose gel. Three primer pairs (HRM-1, HRM-4 and HRM-6) harbouring multiple SNPs, with a product size of ~100 bp, were designed (Table 1). These three primer pairs were screened on three genomic DNA samples (in duplicate): VG 9514 (resistant), TAG 24 (susceptible) and a heterozygous sample made by pooling equal volumes of VG 9514 and TAG 24 gDNA. For each HRM PCR reaction, four independent experiments were carried out. HRM PCR amplification was carried out in a 10 µl reaction volume containing 5 µl EvaGreen master mix (2X), 4.8 µl genomic DNA (2 ng/µl) and 0.2 µl primer mix (forward and reverse) at 10 pmol/µl. Thermal cycling consisted of initial denaturation for 5 min at 95ºC followed by 3-step cycles of denaturation at 94ºC for 30 s, annealing at 60ºC for 30 s and extension at 72ºC for 20 s. HRM of the amplified PCR products was done over the temperature range 65ºC to 90ºC with a 0.1ºC increase at every step. Raw HRM data were analysed with the inbuilt software of the Rotor-Gene Q. For data quality control, PCR amplification was assessed through the CT value and amplification efficiency (Wu et al. 2008). HRM runs with a CT value ≤ 30 and amplification efficiency >1.4 were considered for analysis. Raw HRM curves were analysed using the HRM analysis module of Rotor-Gene Q Software 2.3.3.5 TECHNICIAN (Qiagen, Germany). The melting temperature (Tm) difference was calculated manually with the help of negative derivative plots. Homozygotes melt in a single transition, whereas heterozygotes show multiple melt phases. Genotypes were differentiated based on normalised HRM curves, derivative melt curves and difference plots.

Gene Expression analysis
Total RNA was isolated from healthy young leaves of VG 9514 and TAG 24 plants using the RNeasy Plant Mini Kit followed by DNase I (RNase free) treatment. The quality and quantity of the isolated RNA were checked by electrophoresis on a 1% agarose gel and by A260/A280 absorbance measurement in a Nanodrop spectrophotometer (Thermo-Fisher Scientific Inc., USA). cDNA was prepared by reverse transcription of 1.2 μg of total RNA using the high-fidelity reverse transcriptase and oligo-dT primer supplied in a PrimeScript™ High Fidelity RT-PCR kit. Next, 4 μl of the 1/5-diluted cDNA was added to the 20 μl reaction mix containing 1X SYBR mix and 0.2 μM of each primer (Table 1). Quantitative real time PCR (qRT-PCR) analysis of the set of genes, including ACT2 as the housekeeping gene (Morgante et al. 2011), was performed using premix Ex-Taq DNA polymerase (Takara Bio Inc., Japan) in a Rotor-Gene Q (Qiagen, Germany). Reactions were carried out in triplicate from four independent samples. The two-step PCR program consisted of 15 s of initial denaturation at 94°C followed by 40 cycles of 5 s at 94°C and 31 s at 60°C, with a final amplification for 1 min at 60°C. The amplification fluorescence signals were acquired at 60°C in each of the 40 cycles. Specificity of target gene amplification was checked by melting curve analysis from 60°C to 90°C followed by electrophoretic separation of the same PCR products on a 2.5% agarose gel. Transcript levels were expressed as fold change relative to the control samples, as calculated by the comparative (2^−ΔΔCt) method (Livak and Schmittgen 2001).
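The comparative 2^−ΔΔCt calculation (Livak and Schmittgen 2001) used above can be written out as a short worked example. The Ct values below are hypothetical and serve only to show the arithmetic; the method also assumes near-equal, close to 100% amplification efficiency for the target and reference genes.

```python
def fold_change(ct_target_test, ct_ref_test, ct_target_control, ct_ref_control):
    """Relative expression by the comparative 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(test sample) - dCt(control sample)
    """
    d_ct_test = ct_target_test - ct_ref_test
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_test - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses the threshold two cycles earlier
# in the test sample after normalising to the reference gene (e.g. ACT2).
print(fold_change(ct_target_test=24.0, ct_ref_test=18.0,
                  ct_target_control=26.0, ct_ref_control=18.0))  # 4.0
```

A ΔΔCt of −2 therefore corresponds to a 4-fold higher transcript level in the test sample, since each cycle represents a doubling of product.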

Analysis of evolutionary significance of the cloned R-gene based on Ka/Ks ratio
All the homologous (including homoeologous) genes were identified by BLAST search in Peanutbase from the three available genomes of Arachis. Pairwise alignment was done using Clustal Omega (Sievers et al. 2011) and converted to AXT format. The AXT file was used as input to KaKs_Calculator 2.0 (Wang et al.

Results
Cloning and Sequencing of the targeted R-gene
Using specific primer pairs (Table 1) for four different fragments (R1, R2, R3, R4) of the target R-gene, the genomic regions were amplified from both VG 9514 (resistant) and TAG 24 (susceptible) genotypes. The R1 fragment was amplified at 62ºC annealing temperature and produced an 866 bp PCR product, while the PCR products of the R2, R3 and R4 fragments were 1115 bp, 915 bp and 1281 bp, respectively (Fig 1a, b). There was no length variation in any PCR fragment between the resistant and susceptible genotypes. The whole sequence (3998 bp) was later assembled from these four fragments. The sequence of VG9514-Rgene was submitted to GenBank under accession MK791522. The sequence from the susceptible parent (TAG 24) is available in GenBank as MK791523. Since the VG9514-Rgene  (Fig 2).

Identification of SNPs and their mapping
Alignment of each respective fragment from VG 9514 and TAG 24 revealed no SNPs in R1, 39 SNPs in R2, 51 SNPs in R3 and 86 SNPs in the R4 fragment. There were many non-synonymous mutations in the R2 fragment, which mostly correspond to the TIR and NBS regions of the R-protein. Based on amino acid sequence comparison, the TIR domain had four amino acid changes, the NBS domain 18 changes and the LRR domain 11 changes (Fig 2). Some of these non-synonymous regions were chosen to develop six PCR-based SNP markers. Of these six markers, three showed reproducible polymorphism between VG 9514 and TAG 24. In all three cases, VG 9514 produced positive amplification, while TAG 24 gave no band (Fig S1). All three SNP-based PCR markers were used for genotyping all 164 RILs and showed no segregation distortion (Table 2). They all co-segregated with rust resistance in the RILs. Linkage mapping placed all three SNP-based PCR markers (FR4, FR1 and FR6) at the same position in the A03 linkage group, between the SSR markers FRS 72 and SSR_GO340445, within a 0.6 cM interval (Fig 3). A representative figure showing co-segregation of the FR6 marker is given in Fig S2. Thus, co-segregation of these gene-based PCR markers with resistance and their position within the identified map interval confirmed the candidature of VG9514-Rgene for rust resistance in cultivated groundnut.
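Segregation distortion of a marker in a RIL population is commonly assessed with a chi-square goodness-of-fit test against the expected 1:1 ratio of parental alleles. The sketch below illustrates that test under this assumption; the allele counts are hypothetical (chosen to sum to 164 lines) and are not the values reported in Table 2.

```python
import math


def chi_square_1to1(n_allele_a, n_allele_b):
    """Chi-square test (1 degree of freedom) of a 1:1 segregation ratio.

    Returns (chi2, p_value). For 1 df, the upper-tail probability of the
    chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    total = n_allele_a + n_allele_b
    expected = total / 2.0
    chi2 = ((n_allele_a - expected) ** 2 + (n_allele_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 85 RILs with the resistant-parent allele, 79 without.
chi2, p = chi_square_1to1(85, 79)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
print("distorted" if p < 0.05 else "fits 1:1")
```

With dominant presence/absence markers such as these, the two classes are simply amplification and no amplification, which is why a two-class test is sufficient in a RIL population.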

Validation of SNP-based PCR markers
To validate the strong association of these PCR-based SNP markers with resistance, 11 rust resistant cultivated groundnut genotypes and 11 susceptible genotypes were screened with these allele-specific diagnostic markers. In all cases, the three markers showed a one-to-one correspondence with the resistant/susceptible phenotype. In all 11 resistant genotypes, the diagnostic allele-specific markers amplified the PCR product, while no amplification was observed in any of the susceptible genotypes (Table 3).

HRM analysis of the SNP region in the identified R-gene
Agarose gel electrophoresis of the HRM qPCR products of all three SNP loci showed a single band with no primer-dimers (Fig S3). The normalized HRM curves, derivative melt curves and difference plots clearly distinguished resistant and susceptible genotypes for all three tested HRM primer pairs. The primer pairs HRM-FR1 and HRM-FR6 clearly differentiated the resistant and susceptible parents but not the heterozygote. The HRM marker HRM-FR4 distinguished all three genotypes, namely homozygous resistant, heterozygous and homozygous susceptible, through differences in the shape and shift of the melt curves (Fig 4a&b; Table 4). The melting temperature differences in the normalized HRM curves and derivative melting curves of these HRM PCR products confirmed the presence of SNPs and the allelic difference between the resistant and susceptible parents.

Identification of homologous sequence in Arachis genome
When the cloned R gene (VG9514-Rgene) sequence was used as a BLASTn query against the A. duranensis genome, a single hit on chromosome A03 was found. This is the same gene, Aradu.Z87JB, that was proposed in our earlier report (Mondal and Badigannavar 2018). It had an 18 bp deletion in intron-2 compared to the isolated VG9514-Rgene. In the genome of A. ipaensis, a single gene, Araip.0R3VU, was found to be homologous to the isolated R gene. Araip.0R3VU had a 3 bp deletion in intron-3, a 3 bp insertion in exon-5, an 11 bp insertion in intron-7 and a 3 bp deletion in exon-8, and was thus found to be non-functional. In the cultivated groundnut genome, four homologous genes were detected. The gene Arahy.GFGJ54 was found on chromosome 13 (B03). Arahy.GFGJ54 was found to be the same gene as that in A. ipaensis (Araip.0R3VU) and had multiple indels, as in Araip.0R3VU. Three genes, Arahy.T6DCA5, Arahy.R8KUIR and Arahy.ZZ0VZ9, were detected on chromosome 3 (A03) of A. hypogaea. Arahy.T6DCA5 had two small indels, a 6 bp insertion in exon-7 and a 15 bp deletion in exon-8, and has thus evolved as non-functional. Arahy.R8KUIR had a 4 bp deletion in intron-2, a 3 bp deletion in intron-3 and an 11 bp deletion in intron-7. The gene Arahy.ZZ0VZ9 had two deletions compared to VG9514-Rgene: an 18 bp deletion in intron-2 and a 3 bp deletion in intron-3 (Fig 5). The position of the 18 bp deletion in Arahy.ZZ0VZ9 is the same as in Aradu.Z87JB. To exploit the identified indels in these homologous genes as a source of polymorphism, we designed primer pairs (Table 1) around them and screened for polymorphism between the parents (VG 9514 and TAG 24).

Identification of polymorphic R gene specific InDel markers and their mapping
The new InDel markers were screened for polymorphism between VG 9514 (resistant) and TAG 24 (susceptible). Of the five InDels, three amplified bands consistent with the InDel size and were polymorphic (Table S1; Fig S4). These three are InDel 3, InDel 11 and InDel 15, which correspond to the three homologous R-genes Arahy.ZZ0VZ9, Arahy.R8KUIR and Arahy.T6DCA5, respectively. In every case, multiple bands were amplified owing to the presence of homologous genes in the groundnut genome (Fig S4&S5). Based on the amplification profile in the entire RIL population, a linkage map was constructed. The three polymorphic InDel markers were mapped at the distal region of chromosome A03 and were tightly linked to each other. Further, they were mapped almost 19.3 cM away from the cloned VG9514-Rgene on the same chromosome (Fig 3).

Analysis of non-synonymous mutations within the PCR-based SNP marker of R gene
Based on the sequence comparison between the genes amplified from the resistant and susceptible parents, we identified several SNPs and later developed polymorphic SNP markers. Three such markers showed reproducible polymorphism and thus validated the position of the mutations within the R-gene. The SNP marker FR4 was designed for 'C to T', 'G to T' and 'T to C' mutations within exon-3. Exon-3 codes for parts of the TIR and NBS domains, and these non-synonymous mutations changed 'lysine to asparagine', 'serine to leucine' and 'valine to serine' in the distal portion of the TIR domain, which is in close proximity to the hhGRExE and Walker A/P-loop motifs of the NBS domain. Another SNP marker, FR1, was designed for 'T to A' and 'C to G' mutations in the same exon-3. These mutations corresponded to the amino acid changes 'isoleucine to tyrosine' and 'glutamine to glutamate' between the RNBS-A and Walker B motifs of the NBS domain. The third polymorphic SNP marker, FR6, harbored 'G to T' and 'A to G' mutations in exon-7. These point mutations lead to a change of aspartate to cysteine in the LRR domain of the susceptible version of the R-gene (Table 5).

Gene expression of R-gene and other pathogenesis-related genes
To check the transcript levels of this R-gene and other associated pathogenesis related genes (mentioned in Mondal and Badigannavar 2018), we designed gene-specific primer pairs for qRT-PCR analysis (Table 1). No amplification was observed in the negative control (-RT). Comparison of transcripts in healthy young leaves of VG 9514 and TAG 24 revealed a 4-fold higher expression of the cloned R-gene in the resistant parent. The other pathogenesis related (PR) genes (mainly glucan endo-1,3 β glucosidases) also had higher expression in the resistant parent (Fig 6). Of the three PR genes, the transcript level of Aradu.1WV86 was quite low (Ct value >34) and it was therefore not considered for analysis. Both of the other PR genes, Aradu.T44NR and Aradu.NG5IQ, had a higher level of expression in the resistant plant.

Evolutionary significance of the cloned VG9514-R-gene compared to other homologues
According to the BLAST-based homology criteria, three genes were found to be duplicated in tandem on the distal portion of chromosome A03 of A. hypogaea. These three genes were Arahy.ZZ0VZ9, Arahy.T6DCA5 and Arahy.R8KUIR, and they are paralogs of the identified VG9514-Rgene. However, no orthologous relationship was noticed between these three genes and Aradu.Z87JB (Fig 7). The phylogenetic tree showed a strong homologous relationship between Araip.0R3VU and Arahy.GFGJ54 (bootstrap 100), but a weaker relationship between Arahy.ZZ0VZ9 and Aradu.Z87JB (bootstrap less than 70). Taken together, there was an orthologous relationship between VG9514-Rgene and Araip.0R3VU. In addition, there was a homoeologous relationship between Arahy.GFGJ54 and the three paralogous genes (Arahy.ZZ0VZ9, Arahy.T6DCA5 and Arahy.R8KUIR) of VG9514-Rgene. Based on these leads, Ka/Ks values were determined for all possible combinations of VG9514-Rgene with the one homoeologous (Arahy.GFGJ54) and three paralogous (Arahy.ZZ0VZ9, Arahy.T6DCA5 and Arahy.R8KUIR) genes in the A. hypogaea genome. The identified R-gene displayed a Ka/Ks value >1 with respect to Arahy.GFGJ54 and Arahy.R8KUIR. With respect to the two other paralogous genes, the Ka/Ks values were close to 1 (Table 6). These results indicated that these paralogous genes underwent positive selection or relaxed purifying selection.  (Fig 8). The validated model of the VG9514 R-protein was supported by having more than 99% of the amino acid residues in the allowed region of the Ramachandran plot (Fig S6a). Further, the ProSA program was used to check the quality of the model; it gave a good Z-score (-8.28), and little deviation in energy was observed in the proposed model (Fig S6b). Amino acid differences from the R-protein of TAG 24 were mapped onto the modelled structure of the VG9514 R-protein. There were only three residue differences at the end of the TIR domain, whereas most of the SNPs were in the NB-ARC and LRR domains of the modelled R-protein.
Mapping of the non-synonymous mutations (found in the R-gene sequence of TAG 24) showed that the mutations were scattered throughout the whole VG9514 R-protein model (Fig 8). In the TIR domain, three specific non-synonymous mutations, K252N, S253L and V254S, are positioned in close proximity to the hhGRExE motif of the NBS domain. Another important non-synonymous mutation, E268Q, was found within the hhGRExE motif. In the RNBS-A motif of the NB-ARC domain, an important non-synonymous mutation, Y309F, was identified (Fig 8). Three other amino acid changes, Q349H, I350Y and Q351E, were observed between the RNBS-A and Walker B motifs of the NB-ARC domain. An interesting non-synonymous mutation, K380N, was observed in close proximity to the Walker B motif. Similarly, K420E was located between the RNBS-B/sensor I and RNBS-C motifs. Another mutation, K457E, was found in the GLPL motif of the NBS domain of this R-protein in TAG 24. Two other mutations, Y527W and R528I, were at the end of the RNBS-D motif of the NBS domain. In the MHD motif, an I579T mutation was also identified in the susceptible R-protein of TAG 24 (Fig 8). Many other mutations were also noticed in the LRR domain, which is responsible for recognition of the avirulence (AVR) protein of the pathogen.

Discussion
Genetic resistance provided by crop cultivars is an environmentally viable approach to combat yield losses due to deadly plant diseases. Often, such resistance is not available in the primary gene pool, and disease resistance is then introgressed into cultivated species from wild crop relatives. One such dominant rust resistance gene was introgressed into cultivar Co 1 (A. hypogaea L.) from A. cardenasii through wide hybridization via the hexaploid pathway (Varman 1999; Mondal et al. 2007). The resultant rust resistant breeding line VG 9514 was later used to develop a mapping population to tag this dominant rust resistance gene (Mondal et al. 2007; 2012 a, b). Use of several chromosome-specific SSR markers later fine mapped this rust resistance locus within a 1.  (Table 1) for studying polymorphism and genetic mapping. Three of them showed reproducible polymorphism (presence vs. absence) between the resistant and susceptible parents.
Genetic mapping of these polymorphic SNP markers (FR4, FR1 and FR6) positioned them within the previously fine mapped region between the FRS 72 and SSR_GO340445 markers on chromosome A03, which harbors the consensus major rust QTL (Mondal and Badigannavar 2018). Thus, the SNP markers co-segregated completely with rust resistance and were located within the fine mapped region without any recombination between them (Fig 3).
These non-synonymous SNP regions were later used for high resolution melting (HRM) analysis through quantitative PCR and examined for their allelic differences. All three could clearly differentiate the two genotypes through the HRM reaction. The primer pair HRM-4 distinguished even the heterozygous sample from both parents, demonstrating its utility in marker assisted selection (MAS) (Fig 4a & b). Identification of such gene-based markers in the present study will be of great help in the introgression of rust resistance into high yielding cultivars without linkage drag of undesirable agronomic characters from wild crop relatives. It would have been of interest to examine the gene expression pattern in the presence of rust in the two contrasting genotypes, VG 9514 and TAG 24. However, we did not succeed in obtaining rust infection under artificial growth chamber conditions and therefore proceeded with comparative gene expression analysis between them under normal plant growth conditions. A significantly higher basal expression of the identified R-gene was noticed in VG 9514, probably due to the presence of a strong promoter element. Further, two other pathogenesis related genes (Aradu.NG51Q and Aradu.T44NR) also had higher expression in the resistant genotype compared to the susceptible one (Fig 6).
Mapping of other homologous R-gene in cultivated groundnut genome and evolutionary significance of the isolated VG9514-R gene
Sequence analysis revealed several R-gene sequences in the cultivated groundnut genome that are homologous to the isolated VG9514-Rgene. In the A genome, we found Arahy.ZZ0VZ9 and Aradu.Z87JB. Similarly, Araip.0R3VU and Arahy.GFGJ54 were found in the B genome. According to the BLAST-based homology criteria, Arahy.ZZ0VZ9 had two paralogs, Arahy.T6DCA5 and Arahy.R8KUIR, on chromosome A03. When the isolated VG9514-Rgene sequence was used as the query for BLAST against the A. hypogaea genome in Peanutbase, we detected several InDels within these paralogous genes. The size of the InDels ranged from 3 to 15 bp (Fig 5). Vishwakarma et al. (2017) found 515,223 InDels across the groundnut genome when the genome sequences of the two diploid progenitors of A. hypogaea were compared. Such small InDels in paralogous genes probably arose in the process of chromosome doubling and tetraploidization during the domestication of cultivated groundnut. Based on their polymorphism between VG 9514 and TAG 24, these InDels were used for genotyping of RILs and genetic mapping. All three InDel markers were mapped in close proximity to each other at the distal end of chromosome A03, around 19.3 cM from the identified VG9514-Rgene (Fig 3). Thus, the sequence analysis and mapping of these InDel markers revealed tandem duplications of R-genes in the genome of cultivated groundnut. These genes are non-functional against the rust pathogen, and thus the introgression of a new R-gene from A. cardenasii at another locus of the same chromosome imparted resistance in VG 9514. Shirasawa et al. (2018) detected the introgression of a common 2.7 Mb genome fragment (131.9-134.6 Mb in A03) from A. cardenasii in four rust resistant genotypes: RIL97 (TAG 24 x GPBD 4), GPBD 4, ICGV 86855 and VG 9514.
The present study revealed the sequence of this novel resistance gene (VG9514-Rgene) in groundnut and also mapped three other paralogues of the cloned R-gene at the distal portion of chromosome A03.
Based on Ka/Ks analysis, it was found that VG9514-Rgene had undergone positive selection during the process of evolution compared to Araip.0R3VU, Arahy.GFGJ54 and Arahy.R8KUIR. With respect to the paralogous genes Arahy.ZZ0VZ9 and Arahy.T6DCA5, however, the Ka/Ks value was close to 1, and thus they have been conserved along with the introgressed gene VG9514-Rgene in chromosome A03. Selection pressure on a protein coding sequence is normally assessed by the ratio of the non-synonymous substitution rate (Ka) to the synonymous substitution rate (Ks) (Hughes and Nei 1988). If Ka/Ks > 1, positive selection is assumed to have occurred during the evolution of the sequence. Typically, natural selection shapes the evolution of species through one of two mechanisms: purifying selection, which promotes the conservation of existing phenotypes, and positive selection, which favors the emergence of new phenotypes (Vallender and Lahn 2004). Taken together, these results indicate that the VG9514-Rgene was introduced through an introgression event from A. cardenasii and that this gene has undergone positive selection, emerging as a novel dominant rust resistance gene in groundnut. It is postulated here that resistance genes at the distal portion of chromosome A03 had undergone tandem duplications, possibly through transposon activities. Further, both intergenic and intragenic gene conversions and unequal crossing-over, occurring during repeated backcrossing in the hexaploid pathway of gene introgression from the wild species, might have given rise to the functional rust resistance in VG 9514.
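As an illustration of how the Ka/Ks ratio discussed above can be obtained from a pair of aligned coding sequences, a minimal Python sketch is given below. It assumes the Nei-Gojobori counting approach with Jukes-Cantor correction, which is one common choice and not necessarily the procedure used in this study. The function names and the toy sequences are hypothetical, and, as a simplification, substitution pathways passing through stop codons are not excluded.

```python
import math
from itertools import permutations, product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    aa = CODON_TABLE[codon]
    total = 0.0
    for pos in range(3):
        same = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == aa:
                same += 1
        total += same / 3  # fraction of the 3 possible changes that are silent
    return total


def codon_differences(c1, c2):
    """Synonymous and non-synonymous differences, averaged over all pathways."""
    diff_pos = [i for i in range(3) if c1[i] != c2[i]]
    if not diff_pos:
        return 0.0, 0.0
    paths = list(permutations(diff_pos))
    syn = non = 0.0
    for order in paths:
        cur = c1
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODON_TABLE[cur] == CODON_TABLE[nxt]:
                syn += 1
            else:
                non += 1
            cur = nxt
    return syn / len(paths), non / len(paths)


def jukes_cantor(p):
    """Correct a proportion of differences for multiple substitutions."""
    if p >= 0.75:
        raise ValueError("sequences too divergent for Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def ka_ks(seq1, seq2):
    """Return (Ka, Ks) for two aligned, in-frame coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    s_sites = s_diff = n_diff = 0.0
    n_codons = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        # Skip codons containing gaps, ambiguous bases or stop codons.
        if CODON_TABLE.get(c1, "*") == "*" or CODON_TABLE.get(c2, "*") == "*":
            continue
        s_sites += (synonymous_sites(c1) + synonymous_sites(c2)) / 2
        sd, nd = codon_differences(c1, c2)
        s_diff += sd
        n_diff += nd
        n_codons += 1
    n_sites = 3 * n_codons - s_sites
    if s_sites == 0 or n_sites == 0:
        raise ValueError("no usable synonymous or non-synonymous sites")
    return jukes_cantor(n_diff / n_sites), jukes_cantor(s_diff / s_sites)


if __name__ == "__main__":
    # Toy sequences for illustration only, not real R-gene sequences.
    ka, ks = ka_ks("ATGTTTGGGGGGGGGCTG", "ATGTTAGGGGGGGGTCTG")
    ratio = ka / ks if ks > 0 else float("nan")
    print(f"Ka={ka:.4f} Ks={ks:.4f} Ka/Ks={ratio:.4f}")
```

Under this sketch, a ratio above 1 would be read as positive selection, a ratio near 1 as neutral evolution and a ratio below 1 as purifying selection. For real gene pairs, the coding sequences would first need a codon-aware alignment.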
Implications of the non-synonymous mutations with respect to homology modelling of the VG9514-R protein
The LRR domain is a common motif found in many proteins, and it is involved in protein-protein interactions and effector binding (Jones and Jones 1997). In many NBS-LRR proteins, the putative solvent exposed residues of the LRR domain show significantly elevated ratios of non-synonymous to synonymous substitutions, indicating that diversifying selection has maintained variation at these positions. The LRR domain is involved in determining the recognition specificity of several R proteins (Hwang and Williamson 2003).
Based on protein sequence homology, van Ooijen et al. (2008) proposed nine conserved motifs in the NB-ARC domain. These motif sequences are conserved across R-proteins, and alterations of these sequences may interfere with the structure-function relationship of the R-gene. Based on mapping of the mutations in the susceptible R-protein of TAG 24, it was found that three mutated amino acid residues lie in these conserved motifs: E267Q in the hhGRExE motif, Y309F in the RNBS-A motif and I579T in the MHD motif (Fig 8). We propose here that these residues play a critical role in the rust resistance function in groundnut.
In conclusion, the present investigation has isolated a TIR-NBS-LRR domain containing dominant rust resistance gene in groundnut for the first time in the literature and developed gene-based SNP markers for marker assisted selection. The gene-based markers and their segregation pattern further confirmed the cosegregation of the gene with rust resistance and its candidature. This novel VG9514-Rgene was distinct from the other homologous R-genes in the tetraploid genome and was found to have undergone positive selection. Homology based protein modelling later revealed that the non-synonymous mutations in the susceptible version of this gene in TAG 24 were positioned in all the domains of the R-protein. Of them, three were in conserved motifs (hhGRExE, RNBS-A and MHD) of the NB-ARC domain, which is responsible for ATP binding and hydrolysis and for transducing the ATP-binding-induced conformational change to the TIR domain for programmed cell death.

Declarations

Acknowledgement
We are thankful to the Head, Nuclear Agriculture and Biotechnology Division, BARC for constant encouragement and constructive comments during the work. We also extend our special thanks to the Associate Director, Bioscience group, BARC for timely support. The authors duly acknowledge Dr. Swathi Kota, MBD, BARC for extending her laboratory facility for transformation and plasmid isolation during this work. Efforts made by T. Chalapathi and Sujit Tota in the field experiments are duly acknowledged.

Data Availability Statements
All data supporting the findings of this study are available within the paper and within its supplementary data published online. Sequence information of MK791522 (isolated R-gene sequence from the VG 9514 genotype) and MK791523 (isolated R-gene sequence from cv. TAG 24) is available at NCBI (https://www.ncbi.nlm.nih.gov).

Conflict of Interest Statement
The authors do not have any conflict of interest.
