The main trajectories of C. annuum domestication
Three hundred forty-seven accessions from 12 species of Capsicum, collected from genebanks in Asia, the Americas, Africa, and Europe, of which 311 were C. annuum, were resequenced to an average depth of ~9×, generating 10.1 trillion paired-end reads (Table S1). A variome map was obtained, including 18,372,022 single nucleotide polymorphisms (SNPs) and 802,875 insertions/deletions (InDels), with an accuracy of >95%, as verified by Kompetitive Allele-Specific PCR (KASP) (Table S2). Variants were uniformly distributed along the 12 chromosomes, with the exception of a genomic region on chromosome 9 containing substantially more variants (Fig. 1a), and they were about twice as abundant in intergenic regions as in gene bodies (Fig. S1). The median heterozygosity of the accessions was 1.11% (Fig. S2), and 56,182 SNPs and 3,080 InDels caused changes in the protein sequences of coding genes (Table S3).
We used 33,346 synonymous SNPs located in genes to investigate the phylogenetic relationships of the accessions. Different Capsicum species formed distinct branches (Fig. 1b), while the 311 C. annuum accessions formed nine groups (Fig. 1c): I) the wild/ancestral group, which included two wild C. annuum var. glabriusculum and 10 ancestral accessions and was located immediately next to non-annuum species; II) a group mainly composed of old landraces; III) cultivars with diverse geographical origins; IV) and VI) blocky fruit peppers; V) cultivars with diverse fruit types and origins; VII) accessions from the northwest and north of China; VIII) accessions from central China; IX) accessions from southwest China, collected from high-altitude areas in Yunnan, Guizhou, Sichuan, and Tibet (Fig. S3).
Groups I to IX represent the main domestication and breeding trajectories of pepper worldwide. The evolutionary relationships (Fig. 1c-d), the genetic diversity (π) within each group, and the genetic differentiation (FST) between groups (Table 1) together suggest the following scenario: group I is the ancestral group containing the early domesticates, as suggested by its position near the root of the tree and its high genetic diversity (π=0.2939); group II represents old landraces, being closest to group I (FST=0.1546), while group III represents later cultivars, being among the closest to group II (FST=0.1184); both groups II and III exhibit a high genetic diversity (π=0.2860 and 0.2935, respectively), suggesting either that genetic bottlenecks were only minor, or that diversifying selection acted, during the early steps of C. annuum domestication. The evolutionary relationships of groups I, II, and III were further supported by the genotypic compositions, with more ancestral alleles present in group I and more derived, selected alleles present in group III (Fig. S4). Groups II and III gave rise, directly or indirectly, to all other groups (Fig. 1c-d and Table 1): directly to groups IV (blocky), V, VII and VIII; and indirectly to groups VI (large fruited blocky, derived from group IV) and IX (high-altitude Chinese peppers, derived from group VII).
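The way π and FST are read in the scenario above can be illustrated with a minimal sketch. It assumes biallelic SNPs summarized per site as (alternate allele count, number of sampled chromosomes) and uses a simple Nei-style FST with equal population weights; the function names and toy counts are hypothetical, and the estimators used to produce Table 1 may differ.

```python
def site_pi(alt_count, n_chrom):
    """Unbiased per-site nucleotide diversity for one biallelic site."""
    if n_chrom < 2:
        raise ValueError("need at least two sampled chromosomes")
    p = alt_count / n_chrom
    return 2.0 * p * (1.0 - p) * n_chrom / (n_chrom - 1)


def mean_pi(sites):
    """Average per-site diversity over a list of (alt_count, n_chrom) tuples."""
    return sum(site_pi(a, n) for a, n in sites) / len(sites)


def nei_fst(sites_a, sites_b):
    """Nei-style FST = sum(H_T - H_S) / sum(H_T) over sites shared by two groups."""
    num = 0.0
    den = 0.0
    for (a1, n1), (a2, n2) in zip(sites_a, sites_b):
        p1, p2 = a1 / n1, a2 / n2
        h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-group heterozygosity
        p_bar = (p1 + p2) / 2
        h_t = 2 * p_bar * (1 - p_bar)  # heterozygosity of the pooled group
        num += h_t - h_s
        den += h_t
    return num / den if den > 0 else 0.0


# Toy data (hypothetical): two groups, three SNPs each
group_a = [(5, 10), (2, 10), (0, 10)]
group_b = [(5, 10), (8, 10), (10, 10)]
print(mean_pi(group_a), mean_pi(group_b), nei_fst(group_a, group_b))
```

Under this reading, a high π within a group indicates that many sites still segregate at intermediate frequency, while a low FST between two groups indicates that their allele frequencies are similar.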
Among the Chinese groups, group IX exhibited the highest genetic diversity (π=0.2831) (Table 1) and a predominant genetic component (represented by dark green in Fig. 1d) that is present at significant levels in the ancestral groups I and II and was possibly re-introduced in group IX to favor adaptation to high altitudes. The large genetic variation in group IX resulted in large fruit length variations, including a specific slim fruit type (Fig. 1d). All groups, with the exception of IV and VI, exhibit large π values, indicating the inheritance of a large variety of different alleles, or the action of diversifying selection, during their formation. Group V exhibits large variations in fruit shape and likely represents a transition group between traditional and blocky fruit peppers (Fig. 1d). All groups present relatively high levels of genetic admixture (Fig. 1d), confirming the absence of major genetic bottlenecks during domestication and subsequent breeding, with the exception of groups VI (large fruited, blocky peppers) and VIII (central China).
The domestication and differentiation of narrow fruit peppers
The two wild accessions (C. annuum var. glabriusculum) have short, very small, waterdrop-shaped, erect fruits, with high (839-1146 mg/kg DW) capsaicinoid content. The early domesticates of group I present, compared to the wild accessions, a significant increase in fruit size, a large variation in fruit shape (olivary, short, conical), the appearance of pendent fruits (8 out of 10), and very large variation in capsaicinoid content (0-1972 mg/kg DW), indicating a strong diversifying selection exerted on these traits during early domestication (Table S1).
During early domestication, average fruit length increased from around 5.0 cm in group I to 8.0 cm in group II and 11.0 cm in group III, without a corresponding increase in fruit diameter, resulting in increasingly elongated fruit types (Fig. 2a). The Chinese peppers in groups VII, VIII, and IX showed fruit lengths comparable to those of group III. The increase in length, resulting in an increased surface-to-volume ratio, probably served a dual purpose: making the early domesticates distinguishable from their wild ancestors, and facilitating air-drying, a common technique applied to this day to conserve chili peppers. In contrast, capsaicinoid levels, after the initial diversifying selection in early domesticates, showed a multi-phasic trend, with a slight increase in group II and a clear reduction in group III (Fig. 2b). The pungency increased again later in groups VII, VIII, and IX, consistent with a secondary selection for increased capsaicinoid levels in China, where spicy food is popular. Finally, pendent fruit types, which were already prevalent in groups I and II, became almost exclusive in groups III to IX (Table S1).
Selective pressure generates genomic selection signals, measured as a reduction of nucleotide diversity (ROD)21. Several genomic selection signals were detected in the pepper genome during early domestication (group I to group II), in particular on chromosomes 4, 8, 9, and 11 (Fig. 2c and Table S4). Three previously reported QTLs for fruit shape and length22, 23 and four capsaicinoid biosynthesis genes (PDH_E2-P3, PDH_E2-D1, CM1-D2, and a-CT-D1)18 are localized in these genomic regions. There are 348 gene units under selection in the transition from group I to group II, including the A-class flower homeotic gene AP2-A (Capana04g002188) (Tables S5-S6).
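A ROD scan of this kind can be sketched as follows. The sketch assumes one common definition, ROD = 1 − π(derived group)/π(ancestral group), computed per genomic window, and flags the windows with the highest values; the function names, the 5% default cutoff, and the toy π values are hypothetical and are not taken from the Methods.

```python
def rod(pi_ancestral, pi_derived):
    """Reduction of diversity for one window; None if the ancestral group has no diversity."""
    if pi_ancestral <= 0:
        return None
    return 1.0 - pi_derived / pi_ancestral


def candidate_windows(pi_anc, pi_der, top_fraction=0.05):
    """Return indices of the windows whose ROD falls in the top fraction."""
    scored = []
    for i, (a, d) in enumerate(zip(pi_anc, pi_der)):
        value = rod(a, d)
        if value is not None:
            scored.append((i, value))
    if not scored:
        return []
    k = max(1, int(len(scored) * top_fraction))
    ranked = sorted(scored, key=lambda t: t[1], reverse=True)
    return sorted(i for i, _ in ranked[:k])


# Toy data (hypothetical): ten windows, diversity collapses in window 2 of the derived group
pi_group_i = [0.30] * 10
pi_group_ii = [0.30, 0.30, 0.03, 0.30, 0.30, 0.30, 0.30, 0.30, 0.30, 0.30]
print(candidate_windows(pi_group_i, pi_group_ii, top_fraction=0.1))
```

Windows flagged this way are only candidates: a local loss of diversity can also arise from low recombination or demographic effects, which is why overlap with known QTLs and genes is used as supporting evidence.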
Genomic regions on five chromosomes were found to be under selection in the second transition (group II to group III) (Fig. 2d and Table S4). Two previously reported fruit shape QTLs (fs4.1R and fs10.1B), one fruit weight-related gene (fw/CA05g10770), and seven capsaicinoid biosynthesis genes (BCKDH_E3-D2, PDH_E3-D2, GS2-D3, ACS2-D4, ACS2-D1, CPR-D2, pAMT-P5) are localized in these regions, as is a second A-class gene, AP2-A (Capana02g000700) (Tables S5-S6).
Thus, it appears that the early evolutionary transitions in narrow fruit peppers (group I to II, and group II to III) involved selection at large groups of candidate genes for fruit pungency and/or shape, probably relying on the vast genetic diversity for these traits that is present in these groups and on the absence of genetic bottlenecks. In contrast, the transition from group II to the Chinese groups VII-IX involved selection on a narrower group of genes (Fig. 2e-f and Tables S5-S6), consistent with the hypothesis that a genetic bottleneck was active during this transition, probably due to the transport via sea or land (the Silk Road) to mainland China of a subset of the group II genepool.
The recent emergence of blocky fruit, sweet peppers
Blocky fruit peppers (groups IV and VI) exhibit distinctive phenotypes, such as a large increase in fruit diameter and weight, decreased variation in fruit shape, reduction to almost zero of capsaicinoid levels, and pendent fruit orientation, which is necessary to support the large fruit (Fig. 2a and Table S1). As aforementioned, they also exhibit a very low genetic diversity (Table 1), consistent with their recent emergence8 and a higher fraction of fixed alleles, either ancestral or derived, compared to the other groups (Fig. S4). Of the two groups, group VI was probably selected later, as suggested by its higher FST value with respect to group III, lower π value, higher proportion of fixed alleles, and also, larger fruits. The linkage disequilibrium (LD) values of groups IV and VI are the highest in the whole C. annuum population, further confirming their recent emergence (Fig. S5).
By comparing groups IV and VI with group III, several genomic selection signals were identified using the ROD parameter (Fig. 3a and Table S4), overlapping with previously described QTLs for fruit shape, length or weight (fs-8, fs10.1B, fs11.4, fl-8, fd-11, fw4.1)22, 23, 24, 25, and with two capsaicinoid biosynthesis genes (ACS2-D1 and pAMT-P5)18. Given the recent emergence of blocky fruit peppers, the parameters XP-EHH (cross population extended haplotype homozygosity)26 and Tajima’s D27 were used to find additional genomic selection signals (Fig. 3b), which overlapped with the QTLs fd-3.1 for fruit diameter, SAP for flower and ovule development, and qcap6.1 for pungency. Capana07g001005, an Agamous family gene regulating ovule development, Capana10g000984 and Capana10g001014, encoding cyclin-dependent protein kinase regulators of the cell cycle, and Capana05g000060, a member of the IQD family that includes SUN, which regulates fruit shape in tomato28, were localized in these genomic regions and found to be under strong selection (Tables S5-S6).
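For readers less familiar with Tajima's D, the statistic can be sketched with the standard formula, which contrasts pairwise diversity with the diversity expected from the number of segregating sites. The sketch assumes one window described by the alternate allele count at each site among n sampled chromosomes; the function name and toy inputs are hypothetical, and the windowing and software used in the study are not reproduced here.

```python
from math import sqrt


def tajimas_d(alt_counts, n):
    """Tajima's D for one window; returns None when it is undefined."""
    segregating = [k for k in alt_counts if 0 < k < n]
    s = len(segregating)
    if n < 4 or s == 0:
        return None
    # Sum of per-site pairwise differences (theta_pi)
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in segregating)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (pi - s / a1) / sqrt(variance)


# Toy windows (hypothetical): rare variants only versus intermediate-frequency variants
print(tajimas_d([1, 1, 1, 1, 1], 10))  # negative: excess of rare alleles
print(tajimas_d([5, 5, 5, 5, 5], 10))  # positive: excess of intermediate-frequency alleles
```

Strongly negative values in a window are commonly interpreted as compatible with a recent selective sweep or population expansion, which is why the statistic complements ROD for recently emerged groups.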
Two genomic regions, named F9 and F11, on chromosomes 9 and 11 showed very low XP-EHH values (Fig. 3b). The two regions exhibited clear differences, between blocky and non-blocky types, in the depth of reads mapped to the reference genome, which is derived from a non-blocky pepper (Fig. S6), suggesting that these two regions may derive from distant introgressions. To test this hypothesis, we determined the major haplotypes in the genomes of blocky fruit peppers and estimated their similarity to the total C. annuum population by calculating the major haplotype sharing score (MHS). Two regions with consistently lower MHS values co-localized with F9 and F11 (Fig. S7). In F9, all blocky types except three, plus five conical fruit accessions from group V, were highly homologous to each other, while most (92.46%) of the other non-blocky types diverged (Fig. 3c). In F11, all blocky types except four were highly homologous to each other, as well as to four conical fruit peppers from group V and to the two wild and five ancestral peppers from group I, while most (91.06%) of the other non-blocky types diverged (Fig. 3d). These data, taken together, suggest that F11 probably originated from an introgression from a wild C. annuum that occurred in ancestral peppers of group I, persisted at low frequency in groups II and III, and was almost fixed in blocky fruit peppers. F9 is more divergent from the reference than F11, as it has a higher frequency of coding SNPs than the rest of the pepper genome (Fig. 1a and 3e). We further compared genotypes of loci in F9 with the released sequences of C. annuum var. glabriusculum18, and built a phylogenetic tree to trace their evolutionary relationships. These results support the conclusion that F9 was introgressed from this wild C. annuum (Fig. S8).
Selection at few key loci controls the main transitions in pepper fruit evolution
Fruit shape is an important agronomical trait and is controlled by a conserved network of interacting gene products in distantly related plants29, 30. In pepper, fruit shape is extremely varied and serves the dual purpose of distinguishing different cultivars from each other, and facilitating air drying for long-term storage of elongated types. We found overlapping, strong association signals for fruit shape index, length, and diameter on chromosome 3. The most significant SNP overlapped with previously mapped QTLs for fruit shape and length (fs-3.1, fl-3.2), and caused a nonsynonymous Ile340Thr mutation in the Capana03g002426 (TRM25) gene (Fig. 4), encoding a TONNEAU 1 Recruiting Motif protein. TRM proteins are part of a protein complex that interacts with microtubule arrays and controls cell division patterns, and are well-known regulators of fruit shape in tomato and cucumber30. TRM25 was expressed in the early stage of pepper fruit development in both the pericarp and placenta tissues (Fig. 4). An additional gene, Capana09g001401, localized in the chromosome 9 introgression in blocky fruit types, was highly expressed in the pericarp of non-blocky fruit peppers, but not of blocky fruit ones (Fig. S9). Capana09g001401 encodes a glycine-rich cell wall structural protein (GRP) that is associated with cell elongation/expansion and differentiation in various tissues in rice31. The gene was found to be under selection in blocky fruit types (Table S5) and is thus a strong candidate for the control of the blocky fruit shape.
Several genes controlling pungency in pepper have been identified, comprising either structural genes of the capsaicin biosynthesis pathway or, in one case, a transcriptional regulator11, 12, 13, 14. GWAS analysis in narrow-fruited peppers showed a strong association signal on chromosome 6, at the Capana06g001204 gene location. Two nonsynonymous mutations (Ile812Val, Thr495Ile), in strong LD with each other (r²=0.99), were found in this gene and were significantly associated (P=8.71×10⁻¹¹ and 6.16×10⁻¹¹) with the increased pungency phenotype (Fig. 5a-b). Capana06g001204 encodes a phospholipid-flipping ATPase (flippase) and is highly expressed in the middle and late development stages of pepper fruit in the pericarp and placenta (Fig. 5c). We propose the name Flip1 for this novel gene controlling capsaicinoid accumulation. Flippases translocate lipids (mainly phospholipids) across biological membranes through the hydrolysis of ATP, and are involved in a series of physiological responses such as membrane stabilization, vesicle-mediated metabolite transport, adaptation to temperature changes, defense, and lipid signaling32. The role of the FLIP1 protein in the control of pungency is intriguing: we hypothesize that it could be either directly involved in capsaicinoid transport across membranes, or in membrane protection against the destabilizing effects of high capsaicin concentrations33.
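The r² measure of LD quoted for the two Flip1 mutations can be illustrated with a minimal sketch. It assumes phased haplotypes coded 0 (reference allele) and 1 (alternate allele) at two biallelic SNPs; the function name and toy haplotypes are hypothetical, and unphased genotype data would need a different estimator.

```python
def ld_r2(hap_a, hap_b):
    """Squared allelic correlation r^2 between two biallelic SNPs on phased haplotypes."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for x, y in zip(hap_a, hap_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


# Toy haplotypes (hypothetical)
snp1 = [0, 0, 1, 1]
print(ld_r2(snp1, [0, 0, 1, 1]))  # alleles always co-occur
print(ld_r2(snp1, [0, 1, 0, 1]))  # alleles are independent
```

An r² close to 1, as reported here, means that the two mutations are almost always inherited together, so association data alone cannot separate their individual effects.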
Compared to narrow fruit peppers, blocky fruit peppers contain almost no capsaicin or dihydrocapsaicin. GWAS analysis found a strong association signal on chromosome 2 (Fig. S10a-b), close to the previously reported Pun1 gene (Capana02g002340) mediating the last step in capsaicin biosynthesis11. A loss-of-function deletion in the recessive allele pun1 was found using reads mapping information (Fig. S10c); this structural variation has the most significant association (P<2.23×10⁻³⁰⁸) with the pungency trait.
Fruit orientation is an important agricultural trait in both vegetable crops and fruit trees, but its molecular basis is unknown. As aforementioned, fruit orientation transitioned from erect (up) in wild peppers to pendent (down) in domesticated large-fruited ones. The up locus controls fruit orientation in pepper (Fig. 6a), but the gene underlying this variation is unknown. We conducted a genome-wide association study (GWAS) for this trait and found a strong association signal on chromosome 12, where the up locus resides34 (Fig. 6b). The most significant signal was in the promoter region of gene Capana12g000954, expressed in the flower pedicel and the placenta of pepper fruit (Fig. S11). Capana12g000954 encodes a BIG GRAIN 1-like (BG1-like) protein, whose rice ortholog is expressed in vascular tissues and mediates auxin transport35. This gene was one of two genes considered previously as candidates for controlling pepper fruit orientation17. A 579-bp deletion was detected in the promoter region of the gene in the pendent accessions; it showed an extremely significant association with the fruit orientation trait (P=6.00×10⁻¹⁷⁵) and was confirmed in a test population composed of 241 samples (Fig. 6c and Fig. S12). RNA-Seq and quantitative real-time PCR (qRT-PCR) analyses found that this deletion is associated with a high expression level of the gene in pedicels of pendent fruits, whereas accessions with erect fruits exhibited a low expression level of the gene (Fig. 6d). BG1-like genes have been implicated mostly in controlling organ size and yield in rice, Arabidopsis and maize35, 36. Additional growth-related traits such as plant height, tiller angle, and gravitropism, as well as stress tolerance, were affected by down- or up-regulation of these genes. The function of BG1-like genes has not yet been determined in fruit crops. The novel putative role of BG1-like in controlling fruit orientation in pepper is likely mediated by differential distribution of auxin and by the level of the gravitropic response in the pedicel.
We crossed a wild pepper accession (erect) with a blocky pepper accession (pendent) and obtained an F2 population of ~360 individual plants. Bulked segregant analysis with whole genome resequencing (BSA-seq) identified a single significant signal on chromosome 12 (Methods). Inspection of the genomic position of the peak signal found that it overlapped with the GWAS signal, where the gene Capana12g000954 is located (Fig. 6e). We further verified the function of the BG1-like gene through virus-induced gene silencing (VIGS) (Methods). Plants infected with the TRV2::up vector showed erect fruits, compared to the pendent fruits of the wild-type accession and of the accession infected with an empty TRV2 vector (Figs. 6f and S13). Expression of the BG1-like gene was suppressed in pedicels of erect fruits infected with the TRV2::up vector, but not in pedicels of pendent fruits that were not infected or were infected with the empty TRV2 vector (Fig. 6g).
The key temporal sequence in pepper fruit domestication and diversification
Analysis of the pepper variome allows a temporal reconstruction of the key events that shaped the high diversity of today’s peppers. Starting from fruit orientation, the 579-bp deletion in the up promoter associated with pendent fruits was already present in high proportion in the ancestral group I, increased in groups III to IX and reached complete fixation in blocky groups IV and VI (Fig. 7a). Interestingly, the flip1 mutation controlling fruit pungency shows a very similar trend to up, reaching 100% frequency in group III and remaining high thereafter. In the analyzed population, the key variants of the two genes show very high association (P=2.32×10⁻¹¹), which is not due to physical linkage, since the two genes map to chromosomes 6 and 12, respectively. Similarly, the F9 and F11 introgressions associated with the blocky fruit type were found at different frequencies (8.33% and 58.33%, respectively) in group I, but thereafter showed a very high association in all groups (P=1.12×10⁻²¹).
Strong associations between unlinked loci can be explained by a series of different scenarios: i) reduced gene flow of the populations containing the associated regions with respect to the general genepool; this hypothesis is unlikely in the present case, since it would influence the linkage disequilibrium of additional unlinked loci, which does not seem to be the case; ii) simultaneous selection for two different traits, encoded by the associated loci; this seems to be the case for up and flip1 during early domestication; and iii) cooperative action of the associated unlinked loci in determining a single phenotype; this seems to be the case for the F9 and F11 introgressions, which are almost always found together in blocky fruit types.
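An association between two unlinked loci of the kind discussed above can be tested on a 2×2 table of accession counts. The sketch below uses a two-sided Fisher's exact test purely as an illustration; the source does not state which test produced the reported P values, and the function name and toy counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    a: accessions carrying both variants     b: first variant only
    c: second variant only                   d: neither variant
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x co-carriers given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at most as probable as the observed one
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Toy counts (hypothetical): the two variants almost always occur together
print(fisher_exact_two_sided(40, 2, 3, 55))
```

A small P value only indicates that the two variants co-occur more often than expected by chance; distinguishing between the three scenarios listed above requires additional evidence, such as LD at other unlinked loci or phenotypic data.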
In contrast, the knock-out pun1 allele controlling fruit pungency was extremely rare in narrow pepper groups I-IX, and its frequency increased progressively in groups IV and VI (blocky) (Fig. 7a). The most likely explanation is that early selection for blocky fruits co-opted accidentally the pun1 sweet pepper allele in a subset of group IV accessions, and that the associated “sweet” phenotype was subsequently selected for to reach a complete fixation in group VI, which is the most recent blocky fruit group and presents the largest fruits. This selection probably accompanied a switch in the culinary uses of pepper, from a spice in which small, elongated, easy to air-dry fruits prevailed, to a large-fruited, fresh-market vegetable for consumption in raw or cooked form.
On the basis of the above data, we present the following model for pepper fruit domestication and diversification (Fig. 7b): i) all alleles found in one or more later groups were pre-existing in the ancestral group; ii) during early domestication (groups I→III), the up allele frequency increased to almost complete fixation, mediating the conversion from erect to pendent fruits; iii) the F9 and F11 introgressions were co-opted, leading to the appearance of blocky fruit peppers (groups IV and VI), which also became sweet due to the increase and fixation of pun1. In contrast, the genetic circuits controlling fruit elongation and pungency in narrow fruit peppers appear to be more complex: iv) fruit elongation in groups II and IX was primarily mediated by the trm25 allele, while in other groups it appears to be due mainly to additional genes (Fig. 2f); v) similarly, in spite of the low frequency of pun1 in group III, this group has lower capsaicinoid content than group II, which is associated with the complete fixation of flip1 and was probably also accompanied by selection at other loci controlling capsaicinoid content (Fig. 2f).
In conclusion, the first variome map of pepper described here uncovered the main genomic events underlying the initial transition from small, almost round, erect, pungent fruits to larger, more elongated fruits with a larger variation in capsaicinoid content, followed by further diversification in fruit shape and pungency and by the recent appearance of sweet, blocky peppers. These findings greatly expand our understanding of pepper fruit domestication and diversification, and constitute a cornerstone for the further breeding and improvement of this important horticultural crop.