Pepper variome reveals the history and key loci associated with fruit domestication and diversification

Pepper (Capsicum spp.) is one of the earliest domesticated crops, providing a unique pungent sensation when eaten. Through the construction of the first pepper variome, we describe the main groups that emerged during domestication and breeding of C. annuum, their relations and temporal succession, and the molecular events underlying the main transitions. The initial differentiation in fruit shape and pungency, the increase in fruit weight, the transition from erect to pendent fruits, and the recent appearance of blocky, large, sweet fruits (bell peppers) were accompanied by strong selection/fixation of key alleles and introgressions in two large genomic regions. Furthermore, we describe the identification of Up, a key domestication gene controlling erect vs pendent fruit orientation, encoding a BIG GRAIN protein involved in auxin transport, and Flip1, associated with capsaicinoid content, encoding a protein involved in phospholipid flipping. The function of Up was confirmed by virus-induced gene silencing. These findings constitute a cornerstone for understanding the domestication and differentiation of a key horticultural crop.


Introduction
With $15.67 billion of production value ( http://www.fao.org/faostat/ ), pepper (Capsicum spp.) is the third most produced vegetable crop, and a major component of spicy food, highly appreciated in the Mediterranean area, the Middle and Far East, and the Americas. Its pungency is conferred by capsaicinoids, primarily capsaicin and dihydrocapsaicin 1 , and is sensed by a vanilloid receptor also involved in heat and pain perception 2 . The fruits of wild peppers are extremely pungent, small, nearly round, brightly colored, and erect, discouraging mammalian herbivores, which are sensitive to pungency, and favoring seed dispersal by birds, which have an impaired capacity to sense pungency and good color vision 2, 3 . About 35 species have been described in the Capsicum genus, including the five domesticated species: C. annuum L., C. frutescens L., C. chinense Jacq., C. baccatum L., and C. pubescens Ruiz & Pavon 4 . Among the domesticated species, C. annuum is the most widely cultivated one. Archeological microfossil evidence 5 indicates that cultivated pepper species underwent distinct domestication events as early as 6,000 years ago in primary diversity centers in South and Meso-America 4,5,6,7,8 . Pepper was introduced from the West Indies into Europe in the late 15th and early 16th centuries, and was then rapidly distributed to Africa and Asia, including China, where the earliest written record of pepper dates back to 1591 (Ming Period) 7,8,9 . During domestication and breeding, non-deciduous peppers with diverse fruit shapes, sizes, weights, pendent fruit orientation, and a range of pungency levels emerged 10 . The change in fruit position from erect to pendent was selected during early domestication and provides an adaptation to increased fruit size, as well as better protection from sun exposure and from predation by birds. It is thus a key agronomic trait in different fruit-bearing crops 10 . A more recent selection was the emergence of very large, blocky, non-pungent fruits (sweet bell peppers), whose earliest record dates to the 1700s 8 .
Studies exploiting the natural variability of pepper allowed the identification of several QTLs and candidate genes controlling capsaicinoid levels, such as Pun1, pAMT, CaKR1, and Pun3 11, 12, 13, 14 , or fruit shape/size (longifolia 1-like) 15,16,17 . In contrast, the molecular basis of other key fruit traits, such as erect vs. pendent orientation or narrow vs. blocky types, is hitherto undescribed in pepper or in any other plant species.
The large variations in fruit size, shape, weight, orientation, and pungency found in the pepper germplasm offer an opportunity to explore the genomic events underlying the diversification of these important agronomic traits, and the temporal sequence in which they appeared. In spite of the availability of high-quality genomic sequences of several pepper species and accessions 18,19,20 , the understanding of the molecular evolution of this crop is lagging behind that of its close relative, tomato. To fill this gap, we resequenced 347 accessions of 12 Capsicum species, characterized the major fruit traits in these accessions, and uncovered the genomic variations associated with these traits. Our findings allowed the reconstruction of the history of pepper domestication and breeding, and of the major genomic events and key genes that shaped the present-day diversity of this important horticultural species.

Results And Discussion
The main trajectories of C. annuum domestication
Three hundred forty-seven accessions from 12 species of Capsicum, of which 311 were C. annuum, collected from genebanks in Asia, the Americas, Africa, and Europe, were resequenced to an average depth of ~9×, generating 10.1 trillion paired-end reads (Table S1). A variome map was obtained, including 18,372,022 single nucleotide polymorphisms (SNPs) and 802,875 insertions/deletions (InDels), with an accuracy of >95%, verified by Kompetitive Allele-Specific PCR (KASP) (Table S2). Variants were uniformly distributed along the 12 chromosomes, with the exception of a genomic region in chromosome 9 containing substantially more variants (Fig. 1a), and they were about twice as abundant in intergenic regions as in gene bodies (Fig. S1). The median heterozygosity of the accessions was 1.11% (Fig. S2), and 56,182 SNPs and 3,080 InDels caused changes in the protein sequences of coding genes (Table S3).
We used 33,346 synonymous SNPs located in genes to investigate the phylogenetic relations of the accessions. Different Capsicum species formed distinct branches (Fig. 1b), while the 311 C. annuum accessions formed nine groups (Fig. 1c): I) the wild/ancestral group, which included two wild C. annuum var. glabriusculum and 10 ancestral accessions and was located immediately next to non-annuum species; II) a group mainly composed of old landraces; III) cultivars with diverse geographical origins; IV) and VI) blocky fruit peppers; V) cultivars with diverse fruit types and origins; VII) accessions from the northwest and north of China; VIII) accessions from central China; IX) accessions from southwest China, collected from high-altitude areas in Yunnan, Guizhou, Sichuan, and Tibet (Fig. S3).
Groups I to IX represent the main domestication and breeding trajectories of pepper worldwide. The evolutionary relationships (Fig. 1c-d), the genetic diversity (π) within each group, and the genetic differentiation (F_ST) between groups (Table 1) together suggest the following scenario: group I is the ancestral group containing the early domesticates, as suggested by its position near the root of the tree and its high genetic diversity (π=0.2939); group II represents old landraces, being closest to group I (F_ST=0.1546), while group III represents later cultivars, being among the closest to group II (F_ST=0.1184); both groups II and III exhibit a high genetic diversity (π=0.2860 and 0.2935, respectively), suggesting either the existence of only minor genetic bottlenecks, or of diversifying selection, during the early steps of C. annuum domestication. The evolutionary relationships of groups I, II, and III were further supported by the genotypic compositions, with more ancestral alleles present in group I and more alleles derived by selection present in group III (Fig. S4). Groups II and III gave rise, directly or indirectly, to all other groups (Fig. 1c-d and Table 1): directly to groups IV (blocky), V, VII and VIII; and indirectly to groups VI (large fruited blocky, derived from group IV) and IX (high-altitude Chinese peppers, derived from group VII).
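To make the two summary statistics used above concrete, the following minimal sketch computes a per-site nucleotide diversity (π) for biallelic SNPs and a simple Nei-style F_ST between two groups. The estimators, function names, and input layout are assumptions chosen for illustration; the section does not state which estimators or software produced the values in Table 1.

```python
from typing import Sequence


def site_pi(alt_count: int, n_chrom: int) -> float:
    """Unbiased per-site nucleotide diversity for one biallelic SNP.

    alt_count: copies of the alternative allele in the group
    n_chrom:   number of sampled chromosomes (2 x accessions for diploids)
    """
    if n_chrom < 2:
        raise ValueError("need at least two sampled chromosomes")
    p = alt_count / n_chrom
    return 2.0 * p * (1.0 - p) * n_chrom / (n_chrom - 1)


def mean_pi(alt_counts: Sequence[int], n_chrom: int) -> float:
    """Average per-site diversity over a set of SNPs (assumed estimator)."""
    return sum(site_pi(c, n_chrom) for c in alt_counts) / len(alt_counts)


def nei_fst(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Ratio-of-sums F_ST = (H_T - H_S) / H_T for two equally weighted groups.

    p1, p2: alternative-allele frequencies at the same SNPs in each group.
    """
    h_s = 0.0  # mean within-group expected heterozygosity
    h_t = 0.0  # expected heterozygosity of the pooled groups
    for a, b in zip(p1, p2):
        h_s += (2 * a * (1 - a) + 2 * b * (1 - b)) / 2
        p_bar = (a + b) / 2
        h_t += 2 * p_bar * (1 - p_bar)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0
```

Under this sketch, two groups with identical allele frequencies give F_ST = 0, and two groups fixed for different alleles give F_ST = 1, which is the scale on which values such as 0.1184 and 0.1546 can be read.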
Among the Chinese groups, group IX exhibited the highest genetic diversity (π=0.2831) (Table 1) and a predominant genetic component (represented by dark green in Fig. 1d), present at significant levels in the ancestral groups I and II, which was possibly re-introduced in group IX to favor adaptation to high altitudes. The large genetic variation in group IX resulted in large fruit length variations, including a specific slim fruit type (Fig. 1d). All groups, with the exception of IV and VI, exhibit large π values, indicating the inheritance of a large variety of different alleles, or the action of diversifying selection, during their formation. Group V exhibits large variations in fruit shape and likely represents a transition group between traditional and blocky fruit peppers (Fig. 1d). All groups present relatively high levels of genetic admixture (Fig. 1d), confirming the absence of major genetic bottlenecks during domestication and subsequent breeding, with the exception of groups VI (large fruited, blocky peppers) and VIII (central China).

The domestication and differentiation of narrow fruit peppers
The two wild accessions (C. annuum var. glabriusculum) have short, very small, waterdrop shaped, erect fruits, with high (839-1146 mg/kg DW) capsaicinoid content. The early domesticates of group I present, compared to the wild accessions, a significant increase in fruit size, a large variation in fruit shape (olivary, short, conical), the appearance of pendent fruits (8 out of 10), and very large variation in capsaicinoid content (0-1972 mg/kg DW), indicating a strong diversifying selection exerted on these traits during early domestication (Table S1).
During early domestication, average fruit length increased from around 5.0 cm in group I to 8.0 cm in group II and 11.0 cm in group III, without a corresponding increase in fruit diameter, resulting in increasingly elongated fruit types (Fig. 2a). The Chinese peppers in groups VII, VIII, and IX showed fruit lengths comparable to those of group III. The increase in length, resulting in an increased surface-to-volume ratio, probably served a dual purpose: making the early domesticates distinguishable from their wild ancestors, and facilitating air-drying, a common technique applied to this day to conserve chili peppers. In contrast, capsaicinoid levels, after the initial diversifying selection in early domesticates, showed a multi-phasic trend, with a slight increase in group II and a clear reduction in group III (Fig. 2b). The pungency increased again later in groups VII, VIII, and IX, consistent with a secondary selection for increased capsaicinoid levels in China, where spicy food is popular. Finally, pendent fruit types, which were already prevalent in groups I and II, became almost exclusive in groups III to IX (Table S1).
Selective pressure generates genomic selection signals, measured as a reduction of nucleotide diversity (ROD) 21 . Several genomic selection signals were detected in the pepper genome during early domestication (group I to group II), in particular on chromosomes 4, 8, 9, and 11 (Fig. 2c and Table S4).
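As an illustration of how a ROD scan of this kind can be organized, the sketch below computes ROD per genomic window as 1 − π(later group)/π(earlier group) and reports the windows with the largest values. This definition, the 5% cutoff, and the function names are assumptions made for the example; the exact formula and thresholds used in the study are not given in this section.

```python
from math import isnan
from typing import List, Sequence


def rod(pi_earlier: Sequence[float], pi_later: Sequence[float]) -> List[float]:
    """Per-window reduction of diversity, assumed here as 1 - pi_later / pi_earlier.

    Windows with zero diversity in the earlier group are returned as NaN.
    """
    values = []
    for early, late in zip(pi_earlier, pi_later):
        values.append(1.0 - late / early if early > 0 else float("nan"))
    return values


def top_windows(rod_values: Sequence[float], fraction: float = 0.05) -> List[int]:
    """Indices of the windows with the highest ROD (hypothetical top-fraction cutoff)."""
    ranked = sorted(
        (i for i, v in enumerate(rod_values) if not isnan(v)),
        key=lambda i: rod_values[i],
        reverse=True,
    )
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]
```

A window in which diversity drops from 0.4 in the earlier group to 0.1 in the later group has ROD = 0.75 under this definition, whereas a window with unchanged diversity has ROD = 0.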
Thus, it appears that early evolutionary transitions in narrow fruit peppers (group I to II, and group II to III) involved selection at large groups of candidate genes for fruit pungency and/or shape, probably relying on the vast genetic diversity for these traits that is present in these groups and on the absence of genetic bottlenecks. In contrast, the transition from group II to the Chinese groups VII-IX involved selection on a narrower group of genes (Fig. 2e-f and Tables S5-S6), consistent with the hypothesis that a genetic bottleneck was active during this transition, probably due to the transport via sea or land (the Silk Road) to mainland China of a subset of the group II gene pool.
The recent emergence of blocky fruit, sweet peppers
Blocky fruit peppers (groups IV and VI) exhibit distinctive phenotypes, such as a large increase in fruit diameter and weight, decreased variation in fruit shape, reduction of capsaicinoid levels to almost zero, and pendent fruit orientation, which is necessary to support the large fruit (Fig. 2a and Table S1). As mentioned above, they also exhibit a very low genetic diversity (Table 1), consistent with their recent emergence 8 , and a higher fraction of fixed alleles, either ancestral or derived, compared to the other groups (Fig. S4). Of the two groups, group VI was probably selected later, as suggested by its higher F_ST value with respect to group III, lower π value, higher proportion of fixed alleles, and larger fruits. The linkage disequilibrium (LD) values of groups IV and VI are the highest in the whole C. annuum population, further confirming their recent emergence (Fig. S5).
By comparing groups IV and VI with group III, several genomic selection signals were identified using the ROD parameter (Fig. 3a and Table S4), overlapping with previously described QTLs for fruit shape, length or weight (fs-8, fs10.1B, fs11.4, fl-8, fd-11, fw4.1) 22,23,24,25 , and with two capsaicinoid biosynthesis genes (ACS2-D1 and pAMT-P5) 18 . Given the recent emergence of blocky fruit peppers, the parameters XP-EHH (cross population extended haplotype homozygosity) 26 and Tajima's D 27 were used to find additional genomic selection signals (Fig. 3b), which overlapped with the QTLs fd-3.1 for fruit diameter, SAP for flower and ovule development, and qcap6.1 for pungency. Capana07g001005, an Agamous family gene regulating ovule development, Capana10g000984 and Capana10g001014, encoding cyclin-dependent protein kinase regulators of the cell cycle, and Capana05g000060, a member of the IQD family that includes SUN, which regulates fruit shape in tomato 28 , were localized in these genomic regions and found to be under strong selection (Tables S5-S6).
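For readers unfamiliar with Tajima's D, the sketch below implements the standard statistic for a small set of aligned haplotypes in one window. It is a toy illustration only: the haplotype-string input format is an assumption, and genome scans such as the one described above are normally run with dedicated population-genetics software rather than code of this kind.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def tajimas_d(haplotypes: Sequence[str]) -> float:
    """Tajima's D for equal-length haplotype strings (one character per site)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 haplotypes for a defined variance")
    length = len(haplotypes[0])
    # S: number of segregating sites in the window
    s = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    if s == 0:
        return float("nan")  # D is undefined without polymorphism
    # pi: mean number of pairwise differences between haplotypes
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

An excess of rare variants, as expected after a recent selective sweep or population expansion, gives negative values, while an excess of intermediate-frequency variants gives positive values.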
Two genomic regions, named F9 and F11, on chromosomes 9 and 11 showed very low XP-EHH values (Fig. 3b). The two regions exhibited clear differences, between blocky and non-blocky types, in the depth of reads mapped to the reference genome, which is derived from a non-blocky pepper (Fig. S6), suggesting that these two regions may derive from distant introgressions. To confirm this hypothesis, we determined the major haplotypes in the genomes of blocky fruit peppers and estimated their similarity to the total C. annuum population by calculating the major haplotype sharing score (MHS). Two regions with consistently lower MHS scores co-localized with F9 and F11 (Fig. S7). In F9, all blocky types except three, plus five conical fruit accessions from group V, were highly homologous to each other, while most (92.46%) of the other non-blocky types diverged (Fig. 3c). In F11, all blocky types except four showed high homologies to each other, as did four conical fruit peppers from group V, and the two wild and five ancestral peppers from group I, while most (91.06%) of the other non-blocky types diverged (Fig. 3d).
These data, taken together, suggest that F11 probably originated from an introgression from a wild C. annuum that occurred in ancestral peppers of group I, persisted at low frequency in groups II and III, and was almost fixed in blocky fruit peppers. F9 is more divergent from the reference than F11, as it has a higher frequency of coding SNPs than the rest of the pepper genome (Figs. 1a and 3e). We further compared genotypes of loci in F9 with the released sequences of C. annuum var. glabriusculum 18 , and built a phylogenetic tree to trace their evolutionary relationships. These results support the conclusion that F9 was introgressed from this wild C. annuum (Fig. S8).
Selection at few key loci controls the main transitions in pepper fruit evolution
Fruit shape is an important agronomical trait and is controlled by a conserved network of interacting gene products in distantly related plants 29, 30 . In pepper, fruit shape is extremely varied and serves the dual purpose of distinguishing different cultivars from each other, and facilitating air drying for long-term storage of elongated types. We found overlapping, strong association signals for fruit shape index, length, and diameter on chromosome 3. The most significant SNP overlapped with previously mapped QTLs for fruit shape and length (fs-3.1, fl-3.2), and caused a nonsynonymous Ile340Thr mutation in the Capana03g002426 (TRM25) gene (Fig. 4), encoding a TONNEAU 1 Recruiting Motif protein. TRM proteins are part of a protein complex interacting with microtubule arrays and controlling cell division patterns, and are well-known regulators of fruit shape in tomato and cucumber 30 . TRM25 was expressed in the early stage of pepper fruit development in both the pericarp and placenta tissues (Fig. 4). An additional gene, Capana09g001401, localized in the chromosome 9 introgression in blocky fruit types, was highly expressed in the pericarp of non-blocky fruit peppers, but not of blocky fruit ones (Fig. S9). Capana09g001401 encodes a glycine-rich cell wall structural protein (GRP) that is associated with cell elongation/expansion and differentiation in various tissues in rice 31 . The gene was found to be under selection in blocky fruit types (Table S5) and is thus a strong candidate for the control of the blocky fruit type.
Several genes controlling pungency in pepper have been identified, corresponding either to structural genes in the capsaicin biosynthesis pathway or, in one case, to a transcriptional regulator 11,12,13,14 . GWAS analysis in narrow-fruited peppers showed a strong association signal on chromosome 6, at the Capana06g001204 gene location. Two nonsynonymous mutations (Ile812Val, Thr495Ile), in strong LD with each other (r^2=0.99), were found in this gene and were significantly associated (P=8.71×10^-11 and 6.16×10^-11) with the increased pungency phenotype (Fig. 5a-b). Capana06g001204 encodes a phospholipid-flipping ATPase (flippase) and is highly expressed in the middle and late development stages of pepper fruit in the pericarp and placenta (Fig. 5c). We propose the name Flip1 for this novel gene controlling capsaicinoid accumulation. Flippases translocate lipids (mainly phospholipids) across biological membranes through the hydrolysis of ATP, and are involved in a series of physiological responses such as membrane stabilization, vesicle-mediated metabolite transport, adaptation to temperature changes, defense, and lipid signaling 32 . The role of the FLIP1 protein in the control of pungency is intriguing: we hypothesize that it could be involved either directly in capsaicinoid transport across membranes, or in membrane protection against the destabilizing effects of high capsaicin concentrations 33 .
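The LD value quoted above can be illustrated with a short sketch that computes r^2 between two SNPs as the squared Pearson correlation of genotype dosages (0, 1 or 2 copies of the alternative allele per accession). Using dosages instead of phased haplotypes is an assumption made here for simplicity; it is a common approximation for largely homozygous accessions, and the study may have used a different estimator.

```python
from typing import Sequence


def genotype_r2(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Squared Pearson correlation between two genotype dosage vectors."""
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    n = len(g1)
    m1 = sum(g1) / n
    m2 = sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        raise ValueError("r^2 is undefined for a monomorphic site")
    return cov * cov / (v1 * v2)
```

Two variants that always occur together across accessions give r^2 = 1, so a value of 0.99 means that the two amino acid changes are almost never separated in the panel, and association statistics alone cannot tell which of them is causal.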
Compared to narrow fruit peppers, blocky fruit peppers contain almost no capsaicin or dihydrocapsaicin. GWAS analysis found a strong association signal on chromosome 2 (Fig. S10a-b), close to the previously reported Pun1 gene (Capana02g002340) mediating the last step in capsaicin biosynthesis 11 . A loss-of-function deletion in the recessive allele pun1 was found using read mapping information (Fig. S10c); this structural variation has the most significant association (P<2.23×10^-308) with the pungency trait.
Fruit orientation is an important agricultural trait in both vegetable crops and fruit trees, but its molecular basis is unknown. As mentioned above, fruit orientation transitioned from erect (up) in wild peppers to pendent (down) in domesticated large-fruited ones. The up locus controls fruit orientation in pepper (Fig. 6a), but the gene underlying this variation is unknown. We conducted a genome-wide association study (GWAS) for this trait and found a strong association signal on chromosome 12, where the up locus resides 34 (Fig. 6b). The most significant signal was in the promoter region of the gene Capana12g000954, expressed in the flower pedicel and the placenta of pepper fruit (Fig. S11). Capana12g000954 encodes a BIG GRAIN 1-like (BGL) protein, whose rice ortholog is expressed in vascular tissues and mediates auxin transport 35 . This gene was one of two genes considered previously as candidates for controlling pepper fruit orientation 17 . A 579-bp deletion was detected in the promoter region of the gene in the pendent accessions, with an extremely significant association with the fruit orientation trait (P=6.00×10^-175), and was confirmed in a test population composed of 241 samples (Fig. 6c and Fig. S12). RNA-Seq and quantitative Real Time (qRT) PCR analyses found that this deletion is associated with a high expression level of the gene in pedicels of pendent fruits, whereas accessions with erect fruits exhibited a low expression level (Fig. 6d). BG1-like genes have been implicated mostly in controlling organ size and yield in rice, Arabidopsis and maize 35,36 . Additional growth-related traits such as plant height, tiller angle, and gravitropism, as well as stress tolerance, were affected by down- or up-regulation of these genes. The function of BG1-like genes has not yet been determined in fruit crops. The novel putative role of BG1-like in controlling fruit orientation in pepper is likely mediated by differential distribution of auxin and the level of gravitropic response in the pedicel.
We crossed a wild pepper accession (erect) with a blocky pepper accession (pendent) and obtained an F2 population of ~360 individual plants. Bulked segregant analysis with whole genome resequencing (BSA-seq) identified a single significant signal on chromosome 12 (Methods). Inspection of the genomic position of the peak signal found that it overlapped with the GWAS signal, where the gene Capana12g000954 is located (Fig. 6e). We further verified the function of the BG1-like gene through virus-induced gene silencing (VIGS) (Methods). Plants infected with the TRV2::up vector showed erect fruits, compared to the pendent fruits of the wild-type accession and of the accession infected with an empty TRV2 vector (Figs. 6f and S13). Expression of the BG1-like gene was suppressed in pedicels of erect fruits infected with the TRV2::up vector, but not in pedicels of pendent fruits that were not infected or were infected with the empty TRV2 vector (Fig. 6g).
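A common way to locate a BSA-seq signal such as the one described above is the ΔSNP-index: the difference, at each SNP, between the fractions of reads carrying the alternative allele in the two phenotypic bulks, smoothed along the chromosome. The sketch below illustrates that idea only; the bulk names, read-count layout, and window size are hypothetical, and the statistic actually used is described in the Methods, which are not part of this section.

```python
from typing import List, Sequence, Tuple

# (reads supporting the alternative allele, total read depth) at one SNP
ReadCount = Tuple[int, int]


def snp_index(alt_reads: int, depth: int) -> float:
    """Fraction of reads in a bulk that carry the alternative allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def delta_snp_index(
    bulk_erect: Sequence[ReadCount], bulk_pendent: Sequence[ReadCount]
) -> List[float]:
    """Per-SNP difference in SNP-index (pendent bulk minus erect bulk)."""
    return [
        snp_index(*pendent) - snp_index(*erect)
        for erect, pendent in zip(bulk_erect, bulk_pendent)
    ]


def sliding_mean(values: Sequence[float], window: int) -> List[float]:
    """Smooth per-SNP values with a fixed-size window along the chromosome."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

SNPs unlinked to the trait are expected to show values near 0 in both bulks' comparison, while SNPs close to the causal locus approach +1 or −1, producing a single peak like the one reported on chromosome 12.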
The key temporal sequence in pepper fruit domestication and diversification
Analysis of the pepper variome allows a temporal reconstruction of the key events that shaped the high diversity of today's peppers. Starting from fruit orientation, the 579 bp deletion in the up promoter associated with pendent fruits was already present in high proportion in the ancestral group I, increased in groups III to IX and reached complete fixation in blocky groups IV and VI (Fig. 7a). Interestingly, the flip1 mutation controlling fruit pungency shows a very similar trend to up, reaching 100% frequency in group III and remaining high thereafter. In the analyzed population, the key variants of the two genes show very high association (P=2.32×10^-11), which is not due to physical linkage, since the two genes map to chromosomes 12 and 6, respectively. Similarly, the F9 and F11 introgressions associated with the blocky fruit type were found at different frequencies (8.33% and 58.33%, respectively) in group I, but thereafter showed a very high association in all groups (P=1.12×10^-21).
Strong associations between unlinked loci can be explained by a series of different scenarios: i) reduced gene flow of the populations containing the associated regions with respect to the general gene pool; this hypothesis is unlikely in the present case, since it would influence the linkage disequilibrium of additional unlinked loci, which does not seem to be the case; ii) simultaneous selection for two different traits, encoded by the associated loci; this seems to be the case for up and flip1 during early domestication; and iii) cooperative action of the associated unlinked loci in determining a single phenotype; this seems to be the case for the F9 and F11 introgressions, which are almost always found together in blocky fruit types.
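One simple way to quantify an association between two unlinked variants is a 2×2 contingency test on the number of accessions carrying each allele combination. The sketch below uses a two-sided Fisher's exact test as an assumed choice; the test underlying the P values reported above is not specified in this section, and any counts passed to the function are the reader's own.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Rows: allele state at locus 1; columns: allele state at locus 2.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x accessions in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p_value)
```

A small P value indicates only that the two variants co-occur more often than expected by chance; distinguishing between scenarios i) to iii) above requires the additional evidence discussed in the text.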
In contrast, the knock-out pun1 allele controlling fruit pungency was extremely rare in narrow pepper groups I-IX, and its frequency increased progressively in groups IV and VI (blocky) (Fig. 7a). The most likely explanation is that early selection for blocky fruits accidentally co-opted the pun1 sweet pepper allele in a subset of group IV accessions, and that the associated "sweet" phenotype was subsequently selected for, reaching complete fixation in group VI, which is the most recent blocky fruit group and presents the largest fruits. This selection probably accompanied a switch in the culinary uses of pepper, from a spice in which small, elongated, easy to air-dry fruits prevailed, to a large-fruited, fresh-market vegetable for consumption in raw or cooked form.
On the basis of the above data, we present the following model for pepper fruit domestication and diversification (Fig. 7b): i) all alleles found in one or more later groups were pre-existing in the ancestral group; ii) during early domestication (groups I→III), the up allele frequency increased to almost complete fixation, mediating the conversion from erect to pendent fruits; iii) the F9 and F11 introgressions were co-opted, leading to the appearance of blocky fruit peppers (groups IV and VI), which also became sweet due to the increase and fixation of pun1. In contrast, the genetic circuits controlling fruit elongation and pungency in narrow fruit peppers appear to be more complex: iv) fruit elongation in groups II and IX was primarily mediated by the trm25 allele, while in other groups the primary contribution appears to come from additional genes (Fig. 2f); v) similarly, in spite of the low frequency of pun1 in group III, this group has a lower capsaicinoid content than group II, which is associated with the complete fixation of flip1 and was probably also accompanied by selection at other loci controlling capsaicinoid content (Fig. 2f).
In conclusion, the first variome map of pepper described here uncovered the main genomic events underlying the initial transition from small, almost round, erect, pungent fruits to larger, more elongated fruits with a larger variation in capsaicinoid content, followed by the further diversification in fruit shape and pungency and the recent appearance of sweet, blocky peppers. These findings greatly expand our understanding of pepper fruit domestication and diversification, and constitute a cornerstone for the further breeding and improvement of this important horticultural crop.