Bruchid resistance segregation and genetic analysis
BKB exhibited resistance to Acanthoscelides obtectus (Say), with a mean percentage of damaged seeds of 28.3%. In contrast, LYD3 showed high susceptibility, and all the seeds were damaged by bruchids (percentage of damaged seeds equal to 100%) (Fig. 1a, b).
The resistance of the F1 and F2:3 populations and RILs was identified. Two traits, namely, the percentage of seeds damaged by bruchids (PDB) and the number of perforations (NP), were investigated. The percentage of F1 seeds damaged by bruchids was 100%, which was the same as that of the susceptible parents, and this finding indicated that the trait of resistance to A. obtectus was a recessive trait. The PDS among the F2:3 individuals ranged from 0 to 100%, and the NP ranged from 0 to 360, showing significant resistance segregation (Fig. 1c). The PDS in the RILs ranged from 0 to 100%, and the NP ranged from 0 to 420. However, the minimum damage rate of seeds was lower than that of resistant parents, which indicated that resistant traits had a positive superparent advantage. The distribution of bruchid resistance was continuous in the F2:3 generation, and the ratio of susceptibility to resistance was not consistent with Mendelian inheritance. The kurtosis and skewness values were 0.35 and -0.84, respectively, with an absolute value lower than one, which was approximately normal, and this finding indicates the quantitative inheritance of the resistance trait controlled by recessive polygenes.
High-density genetic linkage map construction
Based on the technical characteristics of whole genome resequencing, the 157 RILs and two parents of common bean were resequenced. Raw reads were obtained by CASAVA base calling and conversion from the raw image data acquired by high-throughput sequencing. After filtering, 80029096 (17.12X coverage) and 71913622 (16.95X coverage) clean reads were generated from BKB (resistant parent) and LYD3 (susceptible parent), respectively, the clean reads from 157 RILs ranged from 39686552 to 105924272, with an average sequencing depth of 14.38 X coverage. By comparing the sequencing data of BKB, LYD3 and 157 RILs with the reference genome, a total of 300,003 SNPs among the parents were screened out and found to be distributed on 11 chromosomes (Table 1) using GATK 3.8 software [19].
The sliding window approach, with 15 SNPs in each sliding window, was used in this study. According to the proportion of SNP markers in the sliding window from the male parent to the female parent, the genotype of the segment where the sliding window was located was determined, and the location of the recombination breakpoint of the chromosome segment was determined. When the sliding window hit a homozygous/homozygous breakpoint, the genotype transiently changed from homozygous to heterozygous and then changed to the other homozygous genotype. A genotype remained unchanged until it hit a recombination breakpoint [27]. After all genotypes were called and the recombination breakpoints were determined, we identified a total of 48,689 recombinant breakpoints, with an average of 310 breakpoints per RIL (Fig. S1a).
By consolidating continuous non-recombination markers on the genome into a bin, a total of 4214 recombination bins were obtained for the 157 RILs. The average physical length of the recombination bins was 135.2 kb, including 712 SNPs. Each bin was used as a marker for genetic marker screening and genetic linkage map construction. Because segregation distortion could affect the results from map construction and QTL positioning, a ratio of 1:1 was used to perform segregation filtration with 4,214 candidate markers, and 1088 biased markers were filtered out to yield 3126 valid markers, which accounted for 74.18% of all bin markers. According to the genome information [59], the linkage groups were constructed using JoinMap 4.0 software [70] by controlling the LOD value between 2.0 and 10.0. A total of 11 linkage groups were generated with 3106 bin markers, containing 2234769 SNP markers. Taking the linkage group as the unit, the genetic distance between the markers was calculated using the Kosambi algorithm. A high-density genetic map of a total map distance of 1,283.68 cM, with an average interval of 0.61 cM between the bin markers, was constructed (Table 1, Fig. 2). The number of effective bin markers in each chromosome ranged from 134 to 417. The length of each chromosome ranged from 79.23 cM to 166.73 cM, and the average distance between adjacent markers ranged from 0.43 cM to 0.80 cM. With the exception of two >10-cM gaps between adjacent markers on LG05 and LG08, all other gaps were < 8 cM. The conversion ratio between genetic distance and physical distance ranged from 1.67 to 2.79. LG08 was the longest with a map length of 166.73 cM, containing 417 bin markers with 297249 SNPs, and had an average genetic distance of 0.71 cM. LG03 was the second with a map length of 159.84 cM containing 289659 SNPs. LG09 and LG01 were the two shortest with map lengths of 79.23 cM and 95.36 cM, including 134 bin markers with 110344 SNPs and 364 bin markers of 246412 SNPs, respectively, and the average genetic distances of LG09 and LG01 were 0.43 cM and 0.63 cM.
Collinearity analysis was performed based on the positions of markers on the genome and the genetic map, and the results showed that the order of most marker positions on each linkage group was consistent with the chromosome (Fig. S1b), indicating good collinearity, high accuracy of the genetic recombination rate and high accuracy of the map construction.
Quantitative trait loci (QTL) analysis
The genetic map was used to identify QTLs controlling bruchid resistance in common bean. After the identification of bruchid resistance, the PDS of the RILs ranged from 0 to 100%, and the NP ranged from 0 to 420. The ICIM-ADD method in QTL ICIMapping4.2 software was used to analyse PDS and NP. The analysis of the 157 RILs detected one QTL associated with PDS was also located on chromosome 6 between markers Bin1565 and Bin1566. The LOD values were 5.49, explaining 16.10% of the variation, and the additive effects were -0.11. One QTL related to NP, which was located on chromosome 6 between markers Bin1565 and Bin1566, had an LOD value of 4.31 and additive effects of -27.48, and explained 16.37% of the variation (Table 2; Fig. 3a; Fig. S2). This QTL for both PDS and NP was located at the physical positions between 1,250,000 and 1,500,000bp on chromosome 6 (Fig. 3b). We named the QTL qCBr6, and the beneficial allele at this locus came from BKB, the resistant parent. The qCBr6 might be a major locus for resistance to Acanthoscelides obtectus (Say).
For narrowing down the regions identified by QTL-seq, we developed 44 SSR markers and 5 InDel markers (Table S1) on chromosome 6in candidate regions to detect polymorphisms between BKB and LYD3. A total 11 SSR markers and 3 InDel markers on chromosome 6were polymorphic between the two parents. The regional linkage maps for chromosome 6 and chromosome 9 were constructed with polymorphic DNA markers using F2 mapping population consisted of 188 individual plants. We identified ten recombinants from the fine‐mapping population using seven molecular markers. The qCBr6 locus was delimited to an interval of approximately 122.3 kb between SSR marker I6-4 and I6-16 (Fig.3c).
To further verify the correlation between this locus and bruchid resistance, an association study in candidate region was conducted using SNP markers and the phenotypic data of two traits, PDS and NP, of one natural population (contained 625 common bean accessions) (Table S2). A total 437 SNPs were generated from the whole genome resequencing project between I6-4 (Chr06,1367622bp) and I6-16 (Chr06,1489891bp) on chromosome 6. The results showed that 70 SNPs (P-value < 1 × 10–6) associated with PDS and 7 SNPs (P-value < 1 × 10–6) associated with NP were detected (Table S3). Seven SNPs (Chr06_1413364, Chr06_1429130, Chr06_1429225, Chr06_1429229, Chr06_1429376, Chr06_1432127, Chr06_1432316) were associated with both two traits. The peak SNP (SNP with the lowest P value) was located in 1432316bp on chromosome 6 (PDS-associated Chr06_1432316, P = 8.38×10-28; NP-associated Chr06_1432316, P = 3.44×10-12) (Fig S3). Four SNPs associated with PDS (Chr06_1376148, Chr06_1376347, Chr06_1376967 and Chr06_1407036) were located within genes, others lied between genes. These results suggested that this locus might be strongly associated with bruchid resistance.
Annotation and prediction of candidate regions for bruchid resistance
According to the common bean reference genome sequence, the region between marker I6-4 and I6-16 contained five genes, Phvul.006G003700 (chr06,1376123–1378343bp), Phvul.006G003800 (chr06,1406616–1408098bp), Phvul.006G003900 (chr06,1417873–1419245bp), Phvul.006G004000 (chr06,1454577–1457697bp) and Phvul.006G004100 (chr06,1486729–1491368bp) (Fig. 3d; Table 3). Phvul.006G003700 was found to be a homologous gene in A. thaliana according to the NCBI database and encoded a protease inhibitor/seed storage/LTP family protein, belonging to the AAI-LTSS [alpha-amylase inhibitor (AAI), lipid transfer (LT) and seed storage (SS)] protein superfamily. The AAI-LTSS family is known to play important roles in defending plants from insects and pathogens, in lipid transport between intracellular membranes, and in nutrient storage. The BLAST searches against the NCBI and UniProt databases revealed that Phvul.006G003700 showed the strongest similarity to an AAI domain-containing protein in Phaseolus vulgaris (kidney bean) (identity = 91.29%) and an AAI domain-containing protein in Glycine max (Soybean) (E-value = 0.0, score = 1,614, identity = 77.9%), which speculated that the protein encoded by Phvul.006G003700 might have a similar function of inhibiting ɑ-amylase activity. Phvul.006G003800 encodes a carbohydrate-binding X8 domain superfamily protein. Phvul.006G004000 and Phvul.006G004100 encode homologues of yeast ADA2 2B and ADA2 2A, respectively. Phvul.006G003900 has unknown function.
Sequence analysis and expression of candidate genes
The coding DNA sequences (CDS) of Phvul.006G003700 in BKB and LYD3 were determined. The amplification of the candidate gene indicated that Phvul.006G003700 had a length of 999 bp in cDNA in the resistant genotype BKB and encoded 332 amino acid residues. No difference was found between BKB and LYD3 (Fig. S4). The results from the search of the SMART Database showed that the Phvul.006G003700 protein contained an AAI domain within the region from amino acids 55 to 329 (Fig. S5).
To further examine the transcriptional regulation of this gene, we analysed the upstream promoter sequences of the genes using the PlantCARE database. The BLAST searches revealed that the amplified sequence included essential cis-regulatory elements of the promoters such as the core promoter element TATA and the common element CAAT that were highly conserved across the investigated species. We obtained the sequence of promoter region for Phvul.006G003700 gene. A 5-bp difference on 800 bp upstream of the start codon in the promoter region of the Phvul.006G003700 gene was found between LYD3 and BKB, which resulted in the absence of essential cis-regulatory element TATA box in the susceptible material (Fig. 4). TATA box was known to be an essential transcription regulator. The absence of the TATA box may reduce the efficiency of transcription and thus affect the gene expression of Phvul.006G003700 in the susceptible parent LYD3.
The qRT-PCR results showed that the expression level of Phvul.006G003700 in BKB was markedly higher than LYD3 both in seeds and leaves (Fig. 5), which revealed that the gene expression level was higher in the bruchid-resistant accession than in the bruchid-susceptible accession. In seeds, gene Phvul.006G003700 in BKB was about 40 times more expressed than it in LYD3, and about 3.5 times more expressed in leaves.
Through the results of association study between SNP markers and two traits of PDS and NP in the natural population, we found that three significant SNPs associated with PDS (Chr06_1376148, p = 5.85×10-10; Chr06_1376347, p = 9.97×10-10; Chr06_1376967, p = 1.11×10-9) were localized in the 3’-UTR region of the candidate gene Phvul.006G003700 (Fig S6). These findings further support the likelihood that Phvul.006G003700 was the candidate gene for bruchid resistance.
Identification and homology sequence analysis of candidate genes
Homologous sequences were accessed of candidate gene Phvul.006G003700 within the G19833 reference genome and other crops. We searched for homologous sequences of the candidate genes within the common bean G19833v1.0 reference genome and identified 14 additional specific genes located on chromosomes 1, 3, 6, 8 and 9. Seven homologous genes, Phvul.003G169100, Phvul.003G218600, Phvul.003G218700, Phvul.003G218800, Phvul.003G218900, Phvul.003G219000 and Phvul.003G219100, were located on chromosome 3. Four homologous genes, Phvul.006G138600, Phvul.006G138800, Phvul.006G138900 and Phvul.006G139100, were located on chromosome 6. Phvul.001G050300, Phvul.008G142500 and Phvul.009G092000 were located on chromosomes 1, 8 and 9, respectively. These homologous genes all had a similar protein domain as the candidate gene and encoded a bifunctional inhibitor/lipid-transfer protein/seed storage 2S albumin protein belonging to the AAI-LTSS protein superfamily. The analysis of conserved motifs revealed two motifs with an e-value >E-60 within the AAI domain region of Phvul.006G003700 and homologous proteins in the P. vulgaris genome (Fig. S7).
To further clarify the relationship between the Phvul.006G003700 gene and homologous genes from other crops, a BLAST search against the NCBI and UniProt databases was performed to find the homologous protein of Phvul.006G003700 in soybean (Glycine max), cowpea (Vigna angularis var. angularis), mung bean (Vigna radiata var. radiata), lupine (Lupinus albus), wheat (Triticum aestivum), maize (Zea mays), rice (Oryza glumipatula), barley (Hordeum vulgare subsp. vulgare), Medicago truncatula, Arabidopsis thaliana and other crops. The protein sequence encoded by Phvul.006G003700 was compared with the homologous protein sequence in other crops by phylogenetic tree analysis (Fig. S8). The results showed that the Phvul.006G003700 protein was closely related to the AAI protein in soybean, which suggested that it might have a similar function to AAI. The analysis of conserved motifs revealed three motifs with an e-value >E-60 among Phvul.006G003700 and homologous proteins in other crops, and the first two conserved motifs were located in the AAI domain region (Fig. S9).