Screening of parents and RILs against MYMIV
The resistant (Mash114) and susceptible (KUG253) parents showed clear phenotypic differences (Figure 1) while F1 showed resistant reaction (DS=1) indicating dominance of resistance over susceptibility. The frequency distribution in RILs showed symmetric distribution (Figure 2). There were 70 asymptomatic RILs, thus scored as resistant (mean DS=1) while 90 RILs were MYMIV susceptible showing characteristic yellow mosaic symptoms. This distribution of resistant (R) and susceptible (S) fitted the RIL population in 1R:1S ratio (ꭓ2=3.5, p ≥ 0.05) indicating single dominant gene conferring MYMIV resistance. Of the 90 susceptible RILs, 50 RILs showed very high chlorosis in leaves with stunted growth thus given the score of DS=7, whereas 17 RILs showed low chlorosis and scored as DS=3, while 23 RILs were scored DS=5 with moderate chlorosis in leaves and retarded growth, when compared with MYMIV score of infector row (mean DS=7) (Figure 2). Based on MYMIV reaction data, 40 RILs showing extreme resistance and 40 RILs showing extreme susceptible reaction were selected to constitute MYMIV-R and MYMIV-S bulks respectively.
Whole genome re-sequencing analysis and variant detection
A total of 37.6 Gbp clean reads with QC > 30 were obtained from re-sequencing of two bulks (MYMIV-R, MYMIV-S) and two parental genotypes (Table 1) that represented 90 per cent of the original sequencing data. An average sequencing depth of 16.5 x with aggregate 212.9 million clean reads were obtained after initial quality check and mapping of these clean reads to the CN80 blackgram reference genome gave a mean mapping coverage of 92 per cent. GATK identified ~18 million genetic variants encompassing 1522432 SNPs and 301425 InDels in susceptible parent ‘KUG253’ whereas, 1674446 SNPs and 335549 InDels were identified in the resistant parent, Mash114, with a total of 2795896 variants. Among 2795896 variants, transitions (2017687) outnumbered transversions (666488) for both the parental genotypes. All InDels were filtered out for simplicity of analysis and retained after final analysis. SNPs were filtered with maximum missing 0.9, this led to a total of high quality 1861060 SNPs available for QTLseqr analysis, covering 11 chromosomes (492 Mbp of total 570 Mbp genome). However, the SNPs present on smaller contigs were excluded for further analysis.
QTL-seq analysis and identification of candidate QTL region
A total of 588821 high quality SNPs after filtering by QTLseqr default parameters (Additional file 1: Figure S1) were used for QTL-seq. To infer the QTL region conferring MYMIV resistance, the genome wide comparison SNP-index of MYMIV-R and MYMIV-S bulk over the entire length of genome was performed using QTLseqr. The value of SNP-index equaling ‘0’ implies that all the sequencing reads at a particular genomic position in a bulk originated exclusively from susceptible KUG253 genome, while SNP index value of ‘1’ implied that reads originated entirely from resistant Mash114 genome. The SNP-index values of MYMIV-R and MYMIV-S bulks identified four QTLs named as qMYMIV6.1, qMYMIV6.2, qMYMIV6.3, qMYMIV6.4 to be associated with MYMIV resistance on chromosome 6 harboring QTL region containing 16613, 2046, 7014 and 723 SNPs, respectively (Table 2, Fig. 3). The physical distances between the QTLs qMYMIV6.1, qMYMIV6.2, qMYMIV6.3 and qMYMIV6.4 were 2, 5 and 4 Mbp, respectively. The QTL region qMYMIV6.1 was the largest spanning over 8.30 Mbp, followed by qMYMIV6.3 with length of 4.93 Mbp, qMYMIV6.2 with 2.0 Mbp and qMYMIV6.4 covered 0.52 Mbp on chromosome 6.
The 3.4 Mbp region of QTL qMYMIV6.1 from 5.6 to 9 Mbp of chromosome 6 was characterized by average SNP-index lower than 0.2 (lowest value = 0) and higher than 0.8 (highest value = 1) and temporarily designated as qMYMIV6.1.1. While, the rest of the genomic regions of qMYMIV6.1 and remaining three QTLs did not show similar trend; therefore not considered for further analysis. In the qMYMIV6.1.1 region, all the component RILs of MYMIV-R bulk had resistant parent Mash114 type allele and all component RILs constituting MYMIV-S bulk had susceptible parent KUG253 type allele as evident from individual reads in the bulks on integrated genome viewer (IGV) (Figure 4). This strongly suggested that qMYMIV6.1.1 could be probable region for gene underlying MYMIV resistance. Moreover, peak ∆(SNP-index) of 0.65 and peak Gprime value of 42.08 at 8.7 Mbp position of qMYMIV6.1.1 further supports this as putative QTL region. Therefore, qMYMIV6.1.1 was selected for further analysis.
Annotation and identification of candidate genes
A total of 1167 CDS were predicted in qMYMIV6.1 region which were annotated by sequence similarity searches using BLASTx algorithm against NCBI non-redundant (nr) protein database of which 46 % of cds had significant similarity to already known proteins while 53 % had no hits in nr database and referred to as uncharacterized protein (Additional file 2: Data S1). The functional annotation by Blast2GO assigned functional terms and 9330 gene ontology numbers to 547 CDS differentiating their molecular function, biological process and cellular component. Of these, 150 cds mapped to KEGG pathways having role in disease resistance, were selected. The disease resistance genes present on targeted 3.4 Mbp QTL region of MYMIV6.1.1 were visualized on IGV. The 20 genes were identified with read depth more than 40 and in silico polymorphism in parental genotypes as well as corresponding bulks. Of these 9 candidate genes were selected based on GO function, enzyme activity and KEGG pathway for potential involvement in virus resistance.
KASP marker development, linkage mapping and identification of major effect locus
To further validate the association of qMYMIV6.1.1 region to MYMIV and understand genetic relationship of SNPs in the region, a total of nine SNPs present in both introns and exons of selected candidate genes and spanning the QTL region, were extracted. The selected SNPs were converted to KASP markers (Additional file 1: Table S1 and Table S2). The nine KASP markers were used to genotype 160 RILs (Fig. 4) that generated a genetic linkage map of 15.98 cM. KASPs were closely placed on genetic linkage map such that the genetic distances between adjoining markers VM608, VM601, VM609, VM604, VM605, VM610, VM602, VM606 and VM607 were 3.73, 2.76, 1.58, 2.57, 0.78, 1.12, 2.06 and 1.38 cM, respectively. All the KASP markers showed linkage to MYMIV resistance, Further, inclusive composite interval mapping mapped a large effect QTL at maximum LOD value of 42.4 showing PVE = 70.89 % and additive effect = -2.3201 associated with MYMIV. This major effect locus spanned a genetic distance of 1.9 cM including three markers viz., VM605, VM610 and VM602. The interval between markers VM605 and VM610 (0.77 cM) showed LOD value of 42.4 at P > 0.05 and the interval between markers VM610 and VM602 (1.12 cM) showed LOD value of 41.9 at P > 0.05 (PVE = 71.28 %, Additive effect = -2.3201). Thus, it was confirmed that VM602, VM605 and VM610 were tightly linked to MYMIV resistance gene spanning a genetic distance of 1.9 cM which corresponds to 0.5 M physical region on V. mungo genome.
MYMIV resistance in Mash114 introgressed from V. umbellata
Genotyping of V. umbellata genotype RBL1 using KASP markers indicated 2.1 Mbp fragment on chromosome 6 from genomic position 6978055 bp (VM609) to 9085008 bp (VM605) in Mash114 had same alleles as RBL1 (V. umbellata) (Fig. 6). These findings intrigued convincing clue that the gene(s) controlling MYMIV resistance in V. mungo cv. Mash114 are present on introgressed segment from V. umbellata. The candidate region (0.5 M) of introgressed segment harboured 80 genes as predicted by Augustus (Additional File 2: Data S2), of which 9 genes were involved in disease resistance pathway carrying 31 non-synonymous SNP mutations.
In addition, comparison of physical and genetic positions of KASP markers in blackgram revealed similar order of arrangement in both maps (Fig. 6) with few exceptions. Since, genome of cowpea (Vigna unguiculata) was well annotated than blackgram (Vigna mungo), sequence similarity of V. unguiculata with V. mungo was inferred to enrich the annotations of V. mungo and fill the gaps. It was found that the QTL region (0.5 Mbp) mapped by KASP markers on chromosome 6 of blackgram was syntenic with 430 Kb genomic fragment of chromosome 2 in Cow pea (Vigna unguiculata). This syntenic fragment (24-26Mbp) of chromosome 2 of V. unguiculata was further disintegrated to reveal the component genes. Interestingly, abundance of plant disease resistance gene clusters including 37 domains of LRR receptor like serine threonine kinases and thaumatin like protein, domains of putative disease resistance protein RGA, disease resistance RPP13-like protein 4, F-box protein , E3 ubiquitin-protein ligase in QTL region.
Our study has identified a large effect QTL associated with MYMIV resistance in V. mungo and further identified tightly associated KASP markers that can be used in breeding program. However further narrowing down of this region will help to identify putative candidate gene(s) responsible for MYMIV resistance and unleashing its function thereof.