Genome-wide Identification of Powdery Mildew Resistance in Common Bean

Background: Genome-wide association studies (GWAS) have been used to detect genetic variation related to powdery mildew (PM) resistance and several agronomic traits in common bean. However, their application to the common bean-PM interaction, to identify resistance genes and their locations in the common bean genome, has not been fully addressed.
Results: GWAS based on marker-trait association is a useful molecular tool for identifying loci underlying disease resistance and other agronomic traits. SNP genotyping with a BeadChip containing 5398 SNPs was used to detect genetic variation related to PM resistance in a panel of 211 genotypes grown under field conditions for two consecutive years. Significant SNPs identified on chromosomes Pv04 and Pv10 were repeatable, confirming the reliability of the phenotypic data scored from the genotypes grown in two locations over two years. A cluster of resistance genes was revealed on Pv04 of the common bean genome, among which CNL- and TNL-like resistance genes were identified. Furthermore, two resistance genes, Phavu_010G1320001g and Phavu_010G136800g, were identified on Pv10; sequence analysis showed that these genes are homologs of the Arabidopsis disease resistance protein (RLM1A-like) and the putative disease resistance protein (At4g11170.1), respectively. Two LRR receptor-like kinases (RLKs) were also identified on Pv11, in samples collected in 2018 only. Many genes related to PM resistance, encoding auxin-responsive protein, TIFY10A protein, growth-regulating factor 5-like, ubiquitin-like protein and cell wall protein RBR3-like protein, were identified near the significant SNPs. These results suggest that resistance to the PM pathogen involves a network of many constitutively co-expressed genes that may generate several layers of defense barriers or inducible reactions.
Conclusion: Our results provide new insights into common bean-PM interactions and reveal putative resistance genes, together with their locations on the common bean genome, that could be used for marker-assisted selection and for functional genomic studies to confirm the role of these putative genes, and hence for developing common bean lines resistant to PM disease.


Background
Common bean (Phaseolus vulgaris L.) is a significant legume species among the pulse crops and plays a major role in addressing global food security, environmental challenges and healthy diets [1]. Powdery mildew (PM) is one of the most ubiquitous plant diseases and infects a variety of legumes, including common bean. PM in common bean, caused by the pathogen Erysiphe polygoni DC, causes extensive damage and significant yield losses (69%) when infection occurs prior to flowering under warm temperatures (20-24℃), high humidity and shade [2,3]. Because the disease is airborne, accurate identification and immediate action are critical to prevent the spread of PM and thereby minimize losses in yield and in the quality of edible parts. Initial symptoms appear as small, white, talcum-like spots, most commonly on the upper surface of leaves [4]. As the symptoms develop, infected leaves gradually curl downward, change color from pale yellow to brown and ultimately abscise. Under severe conditions, entire leaves and plants are covered by white cottony mycelia [5], which inhibit photosynthesis by blocking light absorption by the leaves [6] and decreasing the rate of photosynthetic carbon dioxide assimilation [7].
Several strategies are used to control PM disease, including application of fungicides, adjusting the planting date to coincide with periods of maximum sunlight exposure, and adoption of good cultural practices. However, these are expensive and not sustainable. For instance, fungicide treatments may not be effective in minimizing pathogen accumulation [8]. The development of resistant bean varieties is the most economical, efficient and ecological approach for managing this pathogen [9]. Studies screening sources of resistance and examining its inheritance in common bean have reported that resistant genotypes carry different PM resistance genes [4,9-11].
Recently, susceptibility (S) genes have been used as an alternative source of resistance to PM disease. The mildew resistance locus (mlo) genes are S genes that promote pathogen proliferation by suppressing the immune system and therefore act as negative regulators of immunity [12,13]. Loss of function of some mlo genes has conferred durable, broad-spectrum, recessively inherited resistance in Arabidopsis [14], cucumber [15], tomato [16,17] and grapevine [18]. A comparative genomics approach revealed that five Mlo loci in the common bean genome cluster in clade V along with the Arabidopsis orthologs underlying PM resistance [19]. However, the function of these Mlo loci has not yet been validated.
A genome-wide association study (GWAS) is a molecular tool used to identify specific genomic regions or loci governing simple to complex traits. GWAS can be used to determine whether a genomic variant is associated with a trait of interest using germplasm, a segregating population, or a collection of diverse genotypes [20,21]. Various candidate genes and quantitative trait loci responsible for traits of interest have been identified using GWAS in different crops. In common bean, for instance, several studies have identified genomic regions and candidate genes associated with bruchid resistance [22], agronomic traits [23], drought tolerance [24], anthracnose, angular leaf spot and Fusarium wilt diseases [25,26] and symbiotic nitrogen fixation [27]. However, these studies were conducted for different traits, with different common bean cultivars, in different geographical regions. In this study, the aim was to reveal associations between genomic regions and PM resistance using the GWAS approach in field-grown cultivars. A collection of 211 common bean genotypes obtained from different sources and locations was expected to provide sufficient recombination for identifying causal loci related to PM resistance.

Results
Phenotyping for powdery mildew disease
The two trial sites were ideal experimental environments for testing PM resistance because of their high disease pressure: beans had been grown there consecutively for more than five years, resulting in the accumulation of E. polygoni inoculum season after season. High rainfall (>1000 mm) and high relative humidity (>80%) also created a conducive environment for disease development during the two bean growing seasons [28]. These environments were used to assess the PM resistance potential of common bean genotypes under natural infection. Highly significant (p<0.001) differences in resistance were observed among the 211 genotypes in both the 2017 and 2018 growing seasons [28]. Highly significant genotype x environment and genotype x season interactions were also observed, indicating that resistance to PM disease is affected by the environment (Table 1). Disease severity was higher in 2018, with a maximum severity score of 9, than in 2017, with a maximum severity score of 6, among the 211 genotypes (Table 2). Disease severity of PM under natural infection was negatively correlated with yield (r=-0.24; data not shown), indicating that PM had a significant effect on the yield of common bean.

Table 1. Analysis of variance in the collection of common bean genotypes for PM disease.

Genetic relationships in the collection of 206 common bean accessions were revealed using STRUCTURE software. Under the Bayesian model, the genetic population structure was captured by describing the molecular variation in each subpopulation with a separate joint probability distribution over the observed sequence sites or loci. The model was used in the association analysis to reduce false associations caused by the unequal distribution of alleles among subpopulations. The model grouped the 206 genotypes into three clusters (K=3).
The first cluster (K=1) consisted of 113 genotypes belonging to the Mesoamerican gene pool; the second cluster (K=2) consisted of 72 genotypes belonging to the Andean gene pool; and the third cluster (K=3) consisted of 21 genotypes belonging to the admixture group (Figure 1). Since the collection included some breeding lines derived from different parental sources, it is possible that some offspring combined the genetic backgrounds of the Mesoamerican and Andean gene pools, giving rise to the hybrids in the third cluster. The result suggests that these subpopulations reflect their genetic background. Interestingly, statistical analysis of the disease severity scores of the genotypes in the three structure groups showed that PM resistance differed significantly (p<0.01) between the two gene pools: genotypes of the Mesoamerican gene pool had significantly higher resistance than those of the Andean gene pool. However, genotypes in the admixture group had resistance similar to either gene pool depending on environment and season, suggesting that the expression of resistance in admixture genotypes was more easily influenced by the environment.

Marker trait associations for powdery mildew disease
Genome-wide association analysis of resistance to PM disease gave comparable sets of significant SNPs for the two years' data. The most significant markers linked to PM resistance were located on Pv04 and Pv10 in both years, indicating the reliability of the data set. A total of nine SNPs were associated with powdery mildew resistance on Pv04, Pv10, and Pv11 (Figure 2; Table 3).

Identification of candidate genes associated with powdery mildew resistance
There were 181 coding genes within the intervals of significantly associated SNPs on Pv04, 46 on Pv10, and 24 on Pv11. Based on BLAST analysis of these coding genes against the protein database in GenBank, nine coding genes were considered candidate resistance genes for PM disease (Table 4). On Pv04, three coding genes located at one telomere were homologous to the resistance genes RPP13, TMV-N, and the LRR receptor-like serine/threonine protein kinase (LRR-RLK) RPK2, while two coding genes located at the opposite telomere were homologous to an LRR-RLK and the transcription factor (TF) MYB87. On Pv10, two coding genes were homologous to RLM1A-like and the putative resistance gene At4g11170, while on Pv11 two coding genes were similar to LRR-RLKs. The sequences of these candidate resistance genes are listed in Additional file 1.

Discussion
Most disease resistance in plants is a complex trait controlled by quantitative trait loci and influenced by environmental factors. The development of resistant varieties requires the identification of resistance genes as a prerequisite in plant breeding. Moreover, understanding the genetic basis of complex traits is needed for molecular breeding. GWAS is a powerful approach for dissecting such complex traits and has been applied in many plant species, including Arabidopsis, Vigna unguiculata and V. radiata. In this study, we used a natural population of common bean, collected from different sources, that possessed a large number of cross-over events. The BeadChip with 5398 SNPs provided sufficient marker coverage on a diverse set of 206 accessions. The phenotype-genotype association was repeatable between the two years, suggesting that the identification of candidate genes pinpointed by significantly associated SNPs was reliable.
Genome-wide association analysis of genes governing PM resistance identified nine candidate genes located on Pv04, Pv10 and Pv11 in the common bean genome, which contains a total of 28,134 coding genes (https://plants.ensembl.org/Phaseolus_vulgaris/Location/Genome?r=11:1-1000). Pv04 and Pv10 have fewer coding genes (<2,000) than the other chromosomes.
In this study, a coding gene (Phavu_004G036200g) was identified at the telomere of Pv04 (between 4.014 and 4.016 Mb). BLAST analysis revealed that this gene is a homolog of the disease resistance gene RPP13. RPP13 encodes a coiled coil-nucleotide-binding site-leucine-rich repeat (CC-NBS-LRR, CNL) type of resistance protein and is known to confer resistance to fungal diseases in plants, including downy mildew in Arabidopsis [29,30] and powdery mildew in barley [31] and wheat [32]. This RPP13-like gene on Pv04 could play an important role in PM resistance in common bean. Recently, [4] identified three loci related to PM resistance on the top arm of Pv04 in common bean, based on linkage mapping of three bi-parental segregating populations. Physically, these three loci lie between 0 and 1.09 Mb, as inferred from the cM positions of the flanking DNA markers. An additional candidate gene (Phavu_004G001500) related to PM resistance was identified by [11]; it is located between 0.84 and 2.18 Mb on Pv04, suggesting the existence of a cluster of resistance genes in this region of Pv04. Indeed, in this study, BLAST analysis of the coding genes around the RPP13-like gene detected several genes related to PM resistance within the 2 Mb interval between the two most significant SNPs, including Phavu_004G020000g (senescence-associated carboxylesterase 101), Phavu_004G028900g (TMV resistance protein N-like), and Phavu_004G037500g (LRR-RLK, RPK2). Both TNL and CNL proteins are involved in pathogen recognition but differ in their signaling pathways. The TMV resistance protein N is encoded by a Toll/interleukin-1 receptor-nucleotide-binding site-leucine-rich repeat (TIR-NBS-LRR, TNL) gene [33]; activation of the resistance response involving TNL requires several known general cofactors of disease resistance, including protein kinases [34].
At the opposite telomere, an additional LRR-RLK and MYB87 could be involved in the disease response, since MYB87 functions as a regulator of genes affecting cell wall organization and remodeling [35]. It is probable that, upon pathogen infection, the constitutive expression of this set of genes on Pv04 underlies the quantitative nature of resistance to PM disease in common bean. Further investigation is needed to elucidate the molecular mechanism of cooperation between the TNL and CNL pathways: whether the TMV-N-like gene mediating resistance to PM requires the LRR-RLK protein kinases to induce effector-triggered immunity (ETI), or requires alternatively spliced transcripts to produce resistance proteins that specifically recognize the pathogen's molecular elicitors [36].
Based on the two years' data, two other candidate resistance genes, Phavu_010G1320001g and Phavu_010G136800g, were identified on the bottom arm of Pv10, 0.6 Mb apart and near the significant SNPs. The former is homologous to the disease resistance protein RLM1A-like, while the latter matched the putative disease resistance protein At4g11170.1. The RLM1A gene confers resistance in Arabidopsis against Leptosphaeria maculans, a fungal pathogen, and is involved in the first layer of defense through callose deposition, which acts as a temporary cell wall in response to pathogen attack [37][38][39]. The gene At4g11170 corresponds to the disease resistance gene RMG1 (Resistance Methylated Gene 1) [40]. Expression of this gene is controlled by DNA methylation of its promoter region. The RMG1 promoter region is constitutively demethylated by active DNA demethylation mediated by the DNA glycosylase ROS1 [40]. Both candidate genes encode TNL resistance proteins and are physically close to each other on Pv10. However, they show no significant similarity in either DNA or amino acid sequence, suggesting that they did not arise from a duplication event. These two clustered resistance genes may meet the digenic requirement for functional resistance, as observed for RPP2, which consists of a complex of TIR-NB-LRR genes for defense response [41].
Additional candidate resistance genes were detected on Pv11 based on the significant SNPs from the 2018 association analysis. Both genes were annotated as LRR receptor-like serine/threonine protein kinases (At1g5430 and At1g56130). LRR receptor-like kinases (RLKs) represent one of the largest protein families in plants [42]. LRR-RLK proteins are involved in the signaling pathways that regulate defense responses to pathogens [43]. The LRR domain of RLKs has undergone accelerated evolution, generating numerous cell-surface and cytoplasmic receptors that interact with a diverse group of proteins to provide specificity in pathogen recognition [43]. An LRR receptor-like serine/threonine protein kinase was identified as a candidate gene conferring resistance to apple scab [44]. In this study, LRR-RLKs on Pv11 were associated with PM resistance based on one year of data. Although resistance loci have been detected on Pv11 using different common bean genotypes [4,45], further study of the interaction between LRR-RLKs and the PM pathogen will help in understanding the mode of action of these resistance genes. In addition, many genes encoding auxin-responsive protein, TIFY10A protein, growth-regulating factor 5-like, ubiquitin-like protein and cell wall protein RBR3-like protein were linked to significant SNPs, suggesting that resistance to the PM pathogen involves a network of many constitutively co-expressed genes that generate several layers of defense barriers or inducible reactions. The function of the candidate genes identified in this study needs to be elucidated through loss-of-function studies.

Conclusions
Common bean is affected by PM disease, which causes significant yield losses. In this study, we used GWAS to identify SNPs associated with resistance to PM disease. The most significant SNPs were identified on Pv04 and Pv10. Further sequence analysis showed the presence of a cluster of resistance genes on Pv04 of common bean, several of which have been shown to be associated with PM resistance. Moreover, two resistance genes were identified on Pv10. Our results reveal putative resistance genes, together with their locations on the common bean genome, that could be used for marker-assisted selection and for functional genomic studies to confirm the role of these putative genes, and therefore benefit breeding programs for common bean improvement.

Plant materials and field experiment
Seeds of 211 common bean accessions were originally sourced from different germplasm centers [27]. Of these, seeds of 184 accessions were acquired from the common bean germplasm repository at the International Center for Tropical Agriculture (CIAT) (www.ciat.cgiar.org), and the remaining 27 were obtained from the Tanzania Agricultural Research Institute (TARI) Selian center (www.tari.go.tz), having originally been collected from Ethiopia (12), Kenya (10), Tanzania (3), and Rwanda (2). These genotypes represented two gene pools based on their centers of origin: Mesoamerican, originating in southern Mexico and Guatemala, and Andean, originating in Peru and Colombia [27]. The experimental materials were planted in northern Tanzania

Phenotypic data analysis
PM disease severity was scored on a 1-9 scale, where 1 is non-pathogenic and 9 is pathogenic, following the CIAT scoring rubric [46,47]. Scoring was conducted twice, at the R-6 and R-8 developmental stages, and the mean of the two evaluations was used for downstream analyses. Analysis of variance (ANOVA) was used to test the genotype x environment, genotype x season and genotype x environment x season interactions. Protected least significant differences (LSD) at p=0.05 were used for comparison of genotypes [48,49]. For each accession, the mean of the disease scores obtained from two replications and two environments in each year was used for the phenotype-genotype association study.
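The averaging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (accession, year, environment, replicate, R-6 score, R-8 score) and the function name `accession_means` are hypothetical and not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def accession_means(records):
    """Return {(accession, year): mean score} for use as the GWAS phenotype.

    records: iterable of (accession, year, environment, replicate,
    score_r6, score_r8) tuples. This layout is assumed for illustration.
    """
    plot_scores = defaultdict(list)
    for accession, year, _env, _rep, score_r6, score_r8 in records:
        # Step 1: average the two evaluations (R-6 and R-8) for each plot.
        plot_scores[(accession, year)].append((score_r6 + score_r8) / 2)
    # Step 2: average over replications and environments within each year.
    return {key: mean(values) for key, values in plot_scores.items()}


# Hypothetical scores for one accession: two environments x two replications.
example = [
    ("ACC1", 2017, "site1", 1, 1, 3),
    ("ACC1", 2017, "site1", 2, 3, 5),
    ("ACC1", 2017, "site2", 1, 2, 2),
    ("ACC1", 2017, "site2", 2, 4, 4),
]
phenotype = accession_means(example)
```

Keeping the year in the key mirrors the per-year association analyses reported in the Results.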

Genotyping data analysis
A total of 15 seeds per genotype were shipped from Tanzania to Tuskegee University through the USDA-Animal and Plant Health Inspection Service (APHIS) under permit number P587-180801-005 and phytosanitary certificate number 00310248, in accordance with import/export regulations for plant material. Seeds were planted in pots 15 cm in diameter in the greenhouse at the George Washington Carver Agricultural Experiment

Population structure analysis
The Bayesian model-based clustering method implemented in STRUCTURE 2.3.4 [51] was applied to determine the population structure of this collection. The admixture model with independent allele frequencies and without prior population information was used for simulation. STRUCTURE was run with a burn-in period of 50,000 followed by 50,000 Markov chain Monte Carlo (MCMC) repetitions. For joint inference of population substructure, the number of clusters (K) was set from 1 to 10, with five independent runs for each K. The ideal number of sub-populations was determined using the ΔK method [52] implemented in the STRUCTURE HARVESTER software [53].
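As a rough illustration of how the number of sub-populations can be chosen from replicate STRUCTURE runs, the sketch below computes a ΔK statistic from the mean log-likelihoods. It assumes that the method cited as [52] is the ΔK approach (absolute second-order change in mean ln P(D) divided by the standard deviation across runs); the function name and the log-likelihood values are invented for the example and are not results from this study.

```python
from statistics import mean, stdev


def delta_k(ln_prob):
    """Compute delta-K for every K that has both neighbours K-1 and K+1.

    ln_prob: dict mapping K -> list of ln P(D) values from replicate runs.
    """
    means = {k: mean(values) for k, values in ln_prob.items()}
    result = {}
    for k in sorted(ln_prob):
        if k - 1 not in means or k + 1 not in means:
            continue  # the statistic is undefined at the ends of the K range
        spread = stdev(ln_prob[k])
        if spread == 0:
            continue  # avoid division by zero when all runs agree exactly
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / spread
    return result


# Invented ln P(D) values for K = 1..4 with three runs each.
runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-700, -701, -699],
    4: [-690, -680, -700],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

With these made-up numbers the largest ΔK falls at K=3; with real runs the values would come from the STRUCTURE output files.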

Marker-trait association analysis
After removing monomorphic SNP markers and those with a minor allele frequency (MAF) <2%, 5052 of the 5398 SNP markers were retained (6.4% removed) for population structure and association analyses with TASSEL 5.0 software. The following mixed linear model (MLM) was used:
Y = γα + ρβ + kµ + ε
where Y is the phenotype of each genotype; γ is the fixed effect of the SNP; ρ is the fixed effect of population structure from the principal component analysis results; k is the random effect of relative kinship; and ε is the error term, assumed to be normally distributed with mean 0 and variance δ². This statistical model was used to test for trait-marker associations [54].
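The marker filtering step can be illustrated with a small Python sketch. It assumes biallelic genotype calls coded as 0, 1 or 2 copies of the alternate allele, with None for a missing call; this coding and the function names are assumptions for illustration and do not describe how TASSEL stores or filters genotypes internally.

```python
def minor_allele_frequency(calls):
    """Return the MAF of one SNP, or None if every call is missing.

    calls: list of 0/1/2 alternate-allele counts, None for missing data.
    """
    observed = [g for g in calls if g is not None]
    if not observed:
        return None
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq)


def keep_snp(calls, min_maf=0.02):
    """Keep a SNP only if its MAF is at least min_maf (2% in the text).

    Monomorphic markers have MAF 0 and are therefore dropped as well.
    """
    maf = minor_allele_frequency(calls)
    return maf is not None and maf >= min_maf


# Hypothetical markers scored on four genotypes.
markers = {
    "snp_mono": [0, 0, 0, 0],       # monomorphic, removed
    "snp_common": [0, 1, 2, None],  # MAF 0.5, retained
}
retained = [name for name, calls in markers.items() if keep_snp(calls)]
```

The 2% threshold is exposed as a parameter so the same helper can be reused with a different cut-off.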

BLAST analysis
Significant SNPs were identified from the Manhattan plots for the two bean growing seasons, and a cut-off of 5 for -log10(p-value) was used to identify the most significant SNPs associated with PM resistance.
The coding genes within the intervals of significantly associated SNPs were used as queries to search for putative homologous proteins in EnsemblPlants release 46 [55] (https://plants.ensembl.org/Phaseolus_vulgaris/Location/Genome?r=11:1-1000). Coding genes were annotated as candidate genes when they were homologous to genes, kinases, or transcription factors associated with disease resistance, with nucleotide identity >90%. The location of each coding gene matching a putative protein was taken as the position of the candidate resistance gene on the corresponding chromosome.
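The two selection steps above (thresholding SNPs, then collecting the coding genes in an interval) can be sketched as follows. The sketch assumes the cut-off is inclusive, and all SNP names, gene names, coordinates and p-values are invented placeholders rather than data from this study.

```python
import math


def significant_snps(p_values, threshold=5.0):
    """Return SNP ids whose -log10(p) meets the cut-off (assumed inclusive)."""
    return sorted(
        snp for snp, p in p_values.items()
        if p > 0 and -math.log10(p) >= threshold
    )


def genes_in_interval(genes, chrom, start, end):
    """Return names of genes overlapping [start, end] on the given chromosome.

    genes: list of dicts with 'name', 'chrom', 'start' and 'end' keys
    (a hypothetical annotation layout used only for this example).
    """
    return [
        g["name"] for g in genes
        if g["chrom"] == chrom and g["end"] >= start and g["start"] <= end
    ]


# Placeholder association results and annotation records.
p_values = {"snp_a": 1e-6, "snp_b": 1e-4, "snp_c": 3e-8}
hits = significant_snps(p_values)

annotation = [
    {"name": "gene1", "chrom": "Pv04", "start": 100, "end": 500},
    {"name": "gene2", "chrom": "Pv04", "start": 900, "end": 1200},
    {"name": "gene3", "chrom": "Pv10", "start": 200, "end": 400},
]
candidates = genes_in_interval(annotation, "Pv04", 400, 1000)
```

The genes returned by the second step would then be the queries for the homology search described in the text.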