A SNP-Based Linkage Map and QTL Identification for Resistance to Yam Anthracnose Disease (YAD) in Water Yam (Dioscorea alata)

Background: Yam anthracnose disease (YAD), caused by Colletotrichum gloeosporioides, is the primary cause of yield loss in water yam (Dioscorea alata), a widely cultivated species of yam. Development of resistant cultivars has been a prime target for sustainable management of anthracnose in water yam. Molecular breeding tools are required to expedite the development of improved yam varieties. QTL analysis using high-density genetic maps serves as a powerful tool to discover the key genomic locations underlying quantitative traits. This study aimed to tag quantitative trait loci (QTL) for anthracnose disease resistance in a bi-parental mapping population of D. alata.
Results: Two parents with contrasting reactions to yam anthracnose disease and their 204 full-sib offspring were used to develop a high-density genetic linkage map with 3,257 SNP markers generated by genotyping by sequencing (GBS). The total length of the consensus map was 1460.94 cM, with an average of 163 markers per chromosome. Four QTLs for anthracnose disease resistance were detected at four locations on three chromosomes. The proportion of phenotypic variance explained by these QTLs ranged from 10 to 13%. Plant defense response genes, including GDSL-like Lipase/Acylhydrolase, Protein kinase domain and F-box protein, were also detected within the QTL regions.
Conclusion: The results of the present study provide valuable insight into the genetic architecture of anthracnose resistance in water yam. The candidate markers and putative genes identified here form a relevant resource for applying marker-assisted selection as an alternative to conventional, labor-intensive screening for anthracnose resistance in water yam.

cultivars of water yam in West Africa, Central America and the Pacific [10,11,12,13]. High genetic and pathogenic variability has been reported among isolates of C. gloeosporioides from different geographical locations [7,14,15], suggesting a high probability of geographic variation in strains, some of which could overcome existing resistance [16].
Cultural control approaches such as the use of disease-free planting materials, adjustment of plant spacing and planting dates, burying infected plant residues in the soil immediately after harvesting, intercropping, crop rotation with non-host crops and fallowing have been used in other plant pathosystems to reduce pathogen inoculum in the field, delay disease onset, or slow disease progress [17,18]. Nonetheless, these disease management practices have not been effective in controlling anthracnose disease in water yam, nor have they resulted in a substantial increase in tuber yield [19], especially in disease-endemic areas. Also, the use of biological control to impede or outcompete the multiplication and spread of virulent C. gloeosporioides strains in yam fields has been limited [20]. Chemical control can be an effective disease management approach, but most yam producers are smallholder growers who may lack the technical support and finance required to use fungicides [21]. Furthermore, inappropriate use of fungicides could result in the development of C. gloeosporioides strains resistant to systemic fungicides [22], as well as detrimental environmental effects. The best control option is therefore the development and deployment of anthracnose-resistant water yam varieties. It is also expedient to develop varieties with multiple disease resistance genes to provide stable and durable resistance against the broad spectrum of the fungal pathogen. Substantial progress has been made in developing anthracnose-resistant water yam varieties at the International Institute of Tropical Agriculture (IITA), Nigeria, and in national agricultural research systems in West Africa and elsewhere through conventional breeding based on phenotypic observations. This effort is, however, arduous and considerably slow due to the inherent biological constraints of a heterozygous, vegetatively propagated crop [23].
Genomics-informed breeding techniques such as molecular marker-assisted breeding and genomic selection would accelerate efforts to introgress anthracnose resistance into preferred genetic backgrounds [3].
Earlier investigations on anthracnose disease in water yam showed that resistance is likely to be dominant and quantitatively inherited [24]. Efforts have also been made to identify QTL controlling YAD using low-throughput molecular markers and less dense or unsaturated genetic maps based on AFLP markers [5,25] and EST-SSRs [26]. Prospects for locating additional QTL and applying molecular breeding methods in water yam improvement programs are very promising, especially given advances in next-generation sequencing and the recent development of reference genome sequences for D. rotundata and D. alata. The objective of this study was to develop a SNP-based genetic linkage map and identify QTL for anthracnose disease resistance in a bi-parental mapping population of D. alata.

Results
Phenotypic variability of the parents and derived clones
Significant differences (p < 0.05) were observed among the mapping population for their reaction to YAD in both years (Table 1). The mean squares for year as well as for the genotype-by-year interaction were highly significant (p < 0.01). Disease pressure was higher in 2018 than in 2017. The area under the disease progression curve (AUDPC) estimates ranged from 210.0 to 397.5 with an average of 245.5 in 2017, and from 233.4 to 482.1 with an average of 299.8 in 2018. Exposure of the progenies to natural field infestation by anthracnose revealed that none of the lines were highly resistant (mean severity score of 1, equivalent to an AUDPC value < 105) or highly susceptible (mean severity score of 5, equivalent to an AUDPC value > 525) (Fig. 1). The majority of the genotypes evaluated (67-92%) were moderately resistant to anthracnose across the 2 years. Broad-sense heritability for YAD resistance was high (70.64%).
2) followed by a conservative Bonferroni threshold on the informative SNP markers identified 3,931 markers as distorted and 395 as redundant; these were removed before map construction (Fig. 3). Prior to genetic map construction, pairwise recombination fractions were calculated for all markers and the SNP markers were ordered (Supplementary file Fig. 1). The final genetic map was constructed using the 3,257 highly informative SNP markers, which covered all 20 linkage groups of the water yam genome (Fig. 4, Supplementary Fig. 2

QTL Analysis With The Genetic Map
The QTL detected on three of the 20 chromosomes, for the year-specific as well as the across-years data, are presented in Table 3. In Table 3, Chr = chromosome, pos = position, LOD = logarithm of odds score, CI = confidence interval, and R² = % variation explained.
Gene annotation within the regions of the significant QTLs identified putative plant defense response genes. Adjacent to chr7_3179921 were two plant biotic stress related genes: DRNTG_08663.1 (GDSL-like Lipase/Acylhydrolase) and DRNTG_08664.1 (Protein kinase domain). The N-terminal alpha/beta domain gene (DRNTG_14305.1) was detected within the flanking sequence of chr15_8632438. Two additional genes, DRNTG_18245.1 (ANTH domain, putative clathrin assembly protein) and DRNTG_29617.1 (WD domain, WD40 repeat-containing protein), were detected in the vicinity of chr18_18405143, while the F-box protein (DRNTG_23336.1) was found near chr07_5765300.
Interaction analysis among the four QTL detected for yam anthracnose disease resistance revealed significant (p < 0.05) QTL-by-QTL interactions between chr7_3179921 and chr15_8632438, and between chr07_5765300 and chr18_18405143, while no significant interaction was observed for any other QTL combination (Table 4).

Discussions
Anthracnose is one of the major constraints contributing to yield reduction in water yam production. It occurs wherever water yam is grown. Several management options are available to tackle the anthracnose threat in water yam production. However, the development of cultivars with stable and durable resistance against the broad spectrum of the fungal pathogen is the best option for integrated management of the disease. This study assessed the association between anthracnose resistance and genetic markers through QTL mapping in a recombinant clonal population, to enable indirect selection for resistance to the disease in cultivar development efforts. The recombinant clonal population showed a differential response to the disease-causing organism across the two-year evaluation period. The population showed quantitative resistance, with a continuous distribution from resistant to susceptible and substantial skewness towards resistance. No immune lines were identified through natural field exposure to C. gloeosporioides; rather, a large number of the lines were resistant or moderately resistant. The results of this study confirm the earlier reports by Mignouna et al. [5], Petro et al. [25] and Bhattacharjee et al. [26] that resistance to YAD is a dominantly and quantitatively inherited trait. The heritability estimate in the present study was high, as earlier reported by Petro et al. [25] and Bhattacharjee et al. [26].
The high-density genetic linkage map built from 3,257 GBS-derived SNPs, spanning a total length of 1460.94 cM, represents the most saturated genetic map for D. alata to date. In an earlier effort, Cormier et al. [27] constructed a high-density genetic map of D. alata using 1,579 polymorphic SNP markers with a consensus map length of 2613.5 cM, leading to the identification of a major QTL for sex determination on linkage group six. Genetic linkage maps of water yam were also developed using EST-SSRs [26] and AFLPs [5,25]. The genetic linkage map presented in this report offers a strong genetic framework for qualitative and quantitative trait analysis in water yam.
Three studies have so far been conducted to map QTLs controlling resistance to anthracnose in water yam [5,25,26]. The studies by Mignouna et al. [5] and Petro et al. [25] utilized AFLP maps and identified one and nine QTLs for anthracnose resistance, explaining 10% and 26 to 74% of the total phenotypic variation, respectively. Bhattacharjee et al. [26] utilized an EST-SSR genetic map and identified a major QTL on linkage group 14 explaining 69% of the total phenotypic variance. Even though the previous studies ordered markers on 20 linkage groups, the absence of a common genetic map and the different marker systems make it difficult to compare the locations of the QTLs detected in these studies. In this study, four QTLs for yam anthracnose disease resistance were identified on three chromosomes, explaining 10.0 to 12.6% of the total phenotypic variation in the trait. The detected QTL showed general or year-specific effects, which could be attributed to variation in the strains of C. gloeosporioides, or in the intensity of infestation, over the years of field experimentation. Earlier investigations have also demonstrated resistance of water yam to anthracnose disease to be both isolate-specific and non-specific [24,25]. Strain-specific as well as non-strain-specific QTLs were detected for anthracnose resistance in different populations of D. alata [5,25]. Also, Geffroy et al. [28] reported QTL specific for anthracnose resistance in leaves, stems and petioles of common bean. A similar mechanism of isolate specificity or non-specificity may have operated in this study, leading to the identification of different QTL conferring anthracnose resistance over the two years.
Gene annotation identified plant biotic stress response genes within the flanks of the significant QTL for YAD discovered in this study. The GDSL-like Lipase/Acylhydrolase gene detected in the vicinity of chr7_3179921 was reported to regulate systemic resistance to Alternaria brassicicola in Arabidopsis [29,30]. Hong et al. [31] also found this gene to be involved in the defense against drought and Xanthomonas campestris pv. vesicatoria in pepper. The protein kinase domain, implicated in resistance against bacterial blight (Xanthomonas oryzae) in rice [32] and in resistance to the necrotrophic fungal pathogen Plectosphaerella cucumerina in Arabidopsis [33], was also found within the QTL region of chr7_3179921.
The ANTH domain and WD domain discovered in the vicinity of chr18_18405143 were reported, respectively, to be important in Nicotiana benthamiana and Arabidopsis defense against Pseudomonas syringae [34] and to enhance resistance to anthracnose leaf blight in maize caused by Colletotrichum sublineolum [35,36]. The F-box protein found within the QTL region of chr07_5765300 was reported to be involved in cell death and defense response during the recognition of Pseudomonas syringae and Tobacco mosaic virus in tomato and tobacco [37]. The N-terminal domain identified within the flanking sequence of chr15_8632438 in the present study was reported to be involved in the resistance of Arabidopsis to the downy mildew pathogen Hyaloperonospora arabidopsidis [38]. There is therefore evidence that the genes within the flanks of the significant QTL for anthracnose disease resistance discovered in this study are involved in the response to plant biotic stress.

Conclusion
In this study, we report the use of a highly saturated genetic linkage map based on SNP markers to identify QTLs for YAD in water yam. The high-density genetic map based on 3,257 SNPs should be very useful for genetic studies in water yam. The linkage analysis identified four QTLs accounting for 10.0 to 12.6% of the total phenotypic variance in anthracnose severity score. Five genes involved in plant defense against diseases were also identified within the flanks of the QTLs detected in this study. These results, upon validation and development of diagnostic SNP markers, will enable the application of marker-assisted selection, especially at the early breeding stages, to shorten the breeding cycle of water yam. Given the quantitative nature of this trait and the proportion of the phenotypic variance unaccounted for in this study, future investigations should consider larger mapping populations, meta-QTL analysis encompassing diverse genetic backgrounds with high levels of resistance, and high-throughput phenotyping for precise disease evaluation across different locations and years.

Plant materials
A recombinant clonal population of 204 genotypes from the cross TDa9900015 x TDa0500048, developed at IITA, was used for this study. TDa9900015 is a female breeding line moderately resistant to YAD, while TDa0500048 is a male breeding line that is susceptible to the disease. All the recombinant clones, both parents and one highly susceptible check (variety TDa92-2) were evaluated in a field experiment. The field anthracnose phenotyping experiment was conducted over two seasons (2017 and 2018) at the IITA research farm in Ibadan, Nigeria. The field experiment was carried out using a partially replicated design during the major rainy seasons, when anthracnose incidence and severity are high. The susceptible check (TDa92-2) was planted as a spreader row between blocks and around the field.

Phenotyping
Anthracnose disease severity was scored at two months after planting and fortnightly thereafter until six months. Severity was scored by visual assessment of the relative area of plant tissue affected by anthracnose using a 1-5 severity rating scale, where 1 = no visible symptoms of anthracnose disease or infection spots on the leaf surface; 2 = few anthracnose spots or symptoms on 1 to 25% of the plant (i.e. one or two spots of less than 1 cm diameter, and dry tissue on the leaf surface); 3 = anthracnose symptoms covering 26 to 50% of the plant (i.e. one or two spots of more than 1 cm diameter, and dry tissue on the leaf surface; small dark, non-dried spots of more than 1 cm width are present); 4 = symptoms on > 50% of the plant (i.e. coalesced spots with dry tissue covering a significant proportion of the leaf surface; areas of less than 1 cm width coalesce into bigger spots and yellowing of green tissue is intense around the spots); and 5 = severe necrosis and death of the plant (i.e. coalesced spots with dry tissue more than 1.5 cm in diameter covering a great proportion of the leaf surface, with generalized yellowing of the green tissue in the leaf blade) [39].
The area under the disease progression curve (AUDPC) was estimated from the disease severity scores using the trapezoidal method [40]. This method discretizes the time variable and calculates the average disease intensity between each pair of adjacent time points (see Equation 1 in the Supplementary Files), where n = total number of observations, y_i = disease severity at the i-th observation, and t_i = time at the i-th observation.
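As an illustration of the trapezoidal method described above, the following minimal Python sketch sums the average of each pair of adjacent severity scores multiplied by the time elapsed between them. The function name `audpc` and the 15-day scoring schedule are hypothetical and chosen only so that the totals are easy to check by hand; the actual scoring dates of the trial may differ.

```python
def audpc(times, scores):
    """Area under the disease progress curve by the trapezoidal rule.

    times  : observation times (e.g. days), strictly increasing
    scores : disease severity recorded at each observation time
    """
    if len(times) != len(scores):
        raise ValueError("times and scores must have the same length")
    if len(times) < 2:
        raise ValueError("at least two observations are required")
    total = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Average severity over the interval, weighted by interval length
        total += (scores[i] + scores[i + 1]) / 2.0 * dt
    return total


# Hypothetical schedule: 8 observations spaced 15 days apart (assumption)
days = [0, 15, 30, 45, 60, 75, 90, 105]
always_score_1 = audpc(days, [1] * 8)  # plant scored 1 at every visit
always_score_5 = audpc(days, [5] * 8)  # plant scored 5 at every visit
```

Under this assumed schedule, a plant scored 1 at every visit accumulates 105 and a plant scored 5 at every visit accumulates 525, which illustrates how a constant severity score maps onto an AUDPC value.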

Genotyping
Young fresh leaf samples were collected from the 206 genotypes (204 recombinant progenies, the two parents and a check variety) and immediately placed in dry ice. The leaves were stored at -80 °C prior to lyophilization. Lyophilized leaf samples were sent to CIRAD, France, for DNA extraction, library construction and genotyping by sequencing (GBS). DNA extraction and GBS were performed as described in Cormier et al. [27]. GBS libraries were constructed as described by Elshire et al. [41] using the PstI-MseI restriction enzymes. Sequencing was conducted on an Illumina HiSeq 3000 system (150 bp, single-end reads) at the GeT-PlaGe platform in Toulouse, France.

Data Analyses
Phenotype data
Anthracnose severity score data collected at different times during the crop's growth period were converted to AUDPC for quantitative comparison across years. The relative area under the disease progress curve data were subjected to mixed model analysis using the lme4 package implemented in R [42] (see Equation 2 in the Supplementary Files),
where Y_ijk = phenotypic value, µ = overall phenotypic mean, β_i = effect of year i, R_ij = effect of block j in year i, G_k = effect of genotype k, (β_i x G_k) = effect of the interaction between year i and genotype k, and e_ijkm = residual. Block effects were added to the model as a random variable to remove the spatial variation within the trial field. Broad-sense heritability was estimated from the model to assess the proportion of phenotypic variation in the data set due to genetic effects. Best linear unbiased estimates (BLUEs), i.e. unshrunken genotype means, were extracted for each year and across years for use in the QTL analysis.
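The text does not state which heritability formula was used, so the sketch below shows one common entry-mean definition purely as an assumption: genotypic variance divided by the sum of genotypic variance, genotype-by-year variance averaged over years, and residual variance averaged over years and replications. The function name, the variance components and the replication number are all hypothetical; in a partially replicated design `n_reps` would be an average number of replications.

```python
def broad_sense_heritability(var_g, var_gy, var_e, n_years, n_reps):
    """Entry-mean broad-sense heritability from variance components.

    H2 = var_g / (var_g + var_gy / n_years + var_e / (n_years * n_reps))
    This definition is an assumption, not the formula confirmed by the study.
    """
    if min(var_g, var_gy, var_e) < 0:
        raise ValueError("variance components must be non-negative")
    if n_years < 1 or n_reps <= 0:
        raise ValueError("n_years and n_reps must be positive")
    phenotypic = var_g + var_gy / n_years + var_e / (n_years * n_reps)
    if phenotypic == 0:
        raise ValueError("phenotypic variance is zero")
    return var_g / phenotypic


# Hypothetical variance components for AUDPC (illustration only)
h2 = broad_sense_heritability(var_g=600.0, var_gy=200.0, var_e=600.0,
                              n_years=2, n_reps=2)
```

With these made-up components the phenotypic variance on an entry-mean basis is 600 + 100 + 150 = 850, so the ratio is 600/850.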

SNP Calling And Quality Assessment
Raw data were first filtered using the pipeline described in Scarcelli et al. [43]. Demultadapt (https://github.com/Maillol/demultadapt) was used for demultiplexing, and cutadapt 1.2.1 [44] was used to remove the adaptors and low-quality bases; reads with a mean quality score < 30 were discarded using a freely available Perl script (https://github.com/SouthGreenPlatform/arcadhts/blob/master/scripts/arcad_hts_2_Filter_Fastq_On_Mean_Quality.pl). Reads were mapped to the D. alata reference genome with the Burrows-Wheeler Aligner (BWA) [45] using default options, and the final SNP calling was performed with GATK. SNP quality assessment was performed using vcftools [46] and plink [47], and SNPs with low information content (MAF < 0.05, sequencing depth < 5, more than 20% missing data for both SNPs and genotypes, or multiple alleles) were removed.
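The SNP-level quality criteria listed above can be sketched as a single predicate. This is an illustrative Python stand-in, not the vcftools or plink commands used in the study: the function name and data layout are hypothetical, depth is interpreted here as the mean depth over called genotypes (an assumption), and the per-genotype missingness filter is omitted for brevity.

```python
def passes_snp_filters(n_alleles, calls, depths,
                       min_maf=0.05, min_depth=5, max_missing=0.20):
    """Return True if a SNP survives the quality filters.

    n_alleles : number of alleles observed at the site
    calls     : alternate-allele dosage per sample (0, 1 or 2), None if missing
    depths    : read depth per sample, aligned with `calls`
    """
    if n_alleles != 2:                       # drop multi-allelic sites
        return False
    observed = [c for c in calls if c is not None]
    if not observed:
        return False
    n_missing = len(calls) - len(observed)
    if n_missing / len(calls) > max_missing:  # too much missing data
        return False
    alt_freq = sum(observed) / (2.0 * len(observed))
    if min(alt_freq, 1.0 - alt_freq) < min_maf:  # minor allele too rare
        return False
    called_depths = [d for c, d in zip(calls, depths) if c is not None]
    mean_depth = sum(called_depths) / len(called_depths)
    return mean_depth >= min_depth           # drop poorly covered sites


# Hypothetical SNP: 10 samples, no missing calls, depth 10 everywhere
good_calls = [0, 1, 1, 2, 0, 1, 1, 0, 1, 2]
keep = passes_snp_filters(2, good_calls, [10] * 10)
```

Applying the predicate to every site in a genotype table would leave only biallelic, well-covered, sufficiently polymorphic SNPs for map construction.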

Genetic Map Construction
The VCF file with the filtered SNPs, including the two parents alongside the progenies, was used for linkage analysis with the MAPpoly package [48] in the R environment [42]. A chi-square test was applied to all markers, considering the expected segregation patterns under Mendelian inheritance, random chromosome pairing and no double reduction, using the filter.non.conforming function in the MAPpoly library in order to remove non-informative SNP markers from further analysis. Linkage grouping was performed using an initial LOD value of > 4 in MAPpoly. The LOD value of 4.0 that established the known linkage groups was then chosen as the significance criterion for multipoint linkage testing. Marker ordering and diagnostics were performed using the thresh.LOD.ph and thresh.LOD.rf functions in the MAPpoly R package. The final high-density genetic map was constructed using R/QTL2 [49] and viewed in LinkageMapView [50]. The Qtl/jittermap function was used to adjust the genetic positions of closely spaced markers.
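The segregation test described above, combined with the Bonferroni threshold mentioned in the Results, can be illustrated with a small standard-library sketch. It is not MAPpoly's implementation: the function names, the marker names and the diploid segregation ratios (1:1 and 1:2:1) are assumptions made for the example, and the p-value helper only covers 1 and 2 degrees of freedom, where closed forms exist.

```python
import math


def chi_square_pvalue(stat, df):
    """Upper-tail chi-square probability (closed forms for df = 1 or 2)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or 2 are supported in this sketch")


def segregation_pvalue(observed, expected_ratio):
    """Goodness-of-fit test of genotype counts against a Mendelian ratio."""
    n = sum(observed)
    ratio_sum = sum(expected_ratio)
    stat = 0.0
    for obs, ratio in zip(observed, expected_ratio):
        expected = n * ratio / ratio_sum
        stat += (obs - expected) ** 2 / expected
    return chi_square_pvalue(stat, len(observed) - 1)


def flag_distorted(markers, alpha=0.05):
    """Names of markers whose p-value falls below the Bonferroni threshold."""
    threshold = alpha / len(markers)  # Bonferroni correction
    return [name for name, (obs, ratio) in markers.items()
            if segregation_pvalue(obs, ratio) < threshold]


# Hypothetical genotype counts in 204 progenies (illustration only)
example_markers = {
    "snp_a": ((100, 104), (1, 1)),        # close to 1:1
    "snp_b": ((160, 44), (1, 1)),         # strongly distorted
    "snp_c": ((50, 104, 50), (1, 2, 1)),  # close to 1:2:1
}
distorted = flag_distorted(example_markers)
```

Dividing the significance level by the number of tests keeps the family-wise error rate at or below alpha, at the cost of being conservative when many markers are tested.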

QTL Analysis And Annotation
The QTL analysis was performed using the Composite Interval Mapping (CIM) method in the R/QTL2 package [49]. A forward and backward stepwise regression was run to select background markers at a significance level of P < 0.10. The threshold levels used to declare significant QTLs were determined empirically through 1000 permutations of the data, which maintained a chromosome-wise Type I error rate of 0.05 [51].
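The idea behind an empirical permutation threshold can be sketched as follows. This is a simplified single-marker regression scan rather than the CIM procedure used in the study: phenotypes are shuffled to break any marker-trait association, the maximum LOD over the markers of one chromosome is recorded for each shuffle, and the (1 - alpha) quantile of those maxima is taken as the threshold. All names and the simulated data are hypothetical.

```python
import math
import random


def lod_score(genotypes, phenotypes):
    """LOD for a linear regression of phenotype on allele dosage."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    if sxx == 0 or syy == 0:
        return 0.0
    r2 = min(sxy * sxy / (sxx * syy), 1.0 - 1e-12)
    # LOD = (n / 2) * log10(RSS_null / RSS_marker) = -(n / 2) * log10(1 - r2)
    return -(n / 2.0) * math.log10(1.0 - r2)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """(1 - alpha) quantile of the maximum LOD over shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype link
        max_lods.append(max(lod_score(m, shuffled) for m in markers))
    max_lods.sort()
    index = math.ceil((1.0 - alpha) * n_perm) - 1
    return max_lods[max(0, min(n_perm - 1, index))]


# Simulated chromosome: 5 markers, 60 individuals, marker 0 carries the effect
noise = random.Random(0)
sim_markers = [[(i // k) % 2 for i in range(60)] for k in (1, 2, 3, 4, 5)]
sim_pheno = [250.0 + 10.0 * g + noise.gauss(0.0, 1.0) for g in sim_markers[0]]
sim_threshold = permutation_threshold(sim_markers, sim_pheno, n_perm=200)
```

Taking the maximum LOD across all markers of a chromosome in each permutation is what controls the error rate at the chromosome level rather than per marker.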
The location of a QTL was described according to its LOD peak location and the surrounding region with a 95% confidence interval estimated using the Bayesian model. The proportion of phenotypic variance accounted for by each detected QTL was estimated by a single-factor analysis of variance, using the General Linear Model procedure, on the individual marker locus closest to the QTL identified by CIM. Interaction among significant QTL was assessed using the R/QTL2 package [49]. The effect of each significant marker was estimated through a simple stepwise regression analysis using the lme4 R package. The gene-finding format (GFF) files of the D. rotundata and D. alata reference genomes (https://drive.google.com/drive/folders/1H5T4xjKAEl9LliR-4qK_IR6TypCDe8nj and https://yambase.org/organism/Dioscorea_alata/genome) were used to locate candidate genes within the flanking sequences (5 kb) of the QTL detected for anthracnose disease resistance.
Figure caption: The boxplot shows the effect of the different alleles (variants) of chr7_3179921 on the AUDPC estimates. The letters on the X axis represent alleles (CC, CT/TC and TT).
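The single-factor analysis of variance used to estimate the phenotypic variance explained by the marker nearest a QTL amounts to the ratio of the between-genotype-class sum of squares to the total sum of squares. The sketch below illustrates that ratio for three genotype classes such as CC, CT and TT; the function name and the AUDPC values are hypothetical and are not data from the study.

```python
def variance_explained(genotype_classes, values):
    """R^2 of a one-way ANOVA: between-class SS divided by total SS."""
    if len(genotype_classes) != len(values) or not values:
        raise ValueError("inputs must be non-empty and of equal length")
    groups = {}
    for cls, value in zip(genotype_classes, values):
        groups.setdefault(cls, []).append(value)
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("phenotype has no variance")
    # Each class contributes its size times its squared mean deviation
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    return ss_between / ss_total


# Hypothetical AUDPC values for nine clones (illustration only)
classes = ["CC", "CC", "CC", "CT", "CT", "CT", "TT", "TT", "TT"]
audpc_values = [240.0, 250.0, 260.0, 270.0, 280.0, 290.0, 300.0, 310.0, 320.0]
r_squared = variance_explained(classes, audpc_values)
```

In this made-up example the between-class sum of squares is 5400 and the total is 6000, so the marker would account for 90% of the variance; the QTLs reported in this study each explained 10.0 to 12.6%.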