Identification and chromosomal localizations of bHLHs in pepper
After excluding redundant sequences, a total of 107 bHLH proteins were identified from the C. annuum genome. As shown in Table S1, the identified CabHLH proteins ranged from 190 to 940 amino acid residues in length. The molecular weight (Mw) of these proteins ranged from 21.61 kDa to 106.70 kDa, and the theoretical pI ranged from 4.60 to 9.91. Only ten of the proteins were predicted to be stable (instability index < 40).
The 107 CabHLHs were renamed CabHLH001 to CabHLH107 and mapped to the pepper chromosomes based on their chromosomal positions (Figure S2). They were distributed across 12 chromosomes. Chromosome 01 contained the largest number of CabHLH members (18), while chromosome 05 and chromosome 09 each contained three. However, the genes from CabHLH092 to CabHLH107 could not be assigned to any chromosome.
Conserved amino acid residues in the bHLH domain
The amino acid sequences of the bHLH domain were used for multiple sequence alignment (Figure S3). The results indicated that the bHLH family proteins possess a conserved bHLH domain comprising the basic, first helix, loop and second helix regions. As shown in Figure 1, 20 amino acid residues are conserved with a consensus ratio greater than 50%, and six amino acid residues are conserved with a consensus ratio greater than 75% among the conserved bHLH domains. Five residues (His-5, Glu-9, Arg-10, Arg-12 and Arg-13) are conserved in the basic region; four residues (Leu-23, Leu-26, Val-27 and Pro-28) are conserved in the first helix region; Lys-31 and Asp-34 are conserved in the loop region; and nine residues (Lys-35, Ala-36, Ser-37, Leu-39, Ala-42, Ile-43, Tyr-45, Leu-49 and Leu-56) are conserved in the second helix region. The residues Leu-23 and Leu-49 are extremely conserved among the 107 bHLH proteins in pepper.
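The consensus ratio used above can be illustrated with a minimal sketch. It assumes that the ratio of a column is the fraction of aligned sequences carrying the most frequent residue, with gaps counted in the denominator; the text does not state how gaps were handled, so this is an assumption. The function names and the four-sequence toy alignment are hypothetical and are not taken from Figure S3.

```python
from collections import Counter


def column_consensus(alignment, gap="-"):
    """Return (most frequent residue, ratio) for every alignment column.

    Assumption: the ratio is computed over all sequences, so gapped
    sequences lower the conservation of a column.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] != gap)
        if not counts:
            # Column made only of gaps: nothing is conserved.
            result.append((gap, 0.0))
            continue
        residue, n = counts.most_common(1)[0]
        result.append((residue, n / len(alignment)))
    return result


def conserved_positions(alignment, threshold):
    """List (1-based position, residue, ratio) with ratio above threshold."""
    return [
        (i + 1, residue, ratio)
        for i, (residue, ratio) in enumerate(column_consensus(alignment))
        if ratio > threshold
    ]


# Hypothetical toy alignment (not real CabHLH sequences).
toy_alignment = ["HERR", "HEKR", "HDRR", "NERR"]
above_50 = conserved_positions(toy_alignment, 0.50)
above_75 = conserved_positions(toy_alignment, 0.75)
```

With this toy input, all four columns pass the 50% threshold, but only the last column, which is identical in every sequence, passes the stricter 75% threshold.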
Phylogenetic analysis of the bHLH family proteins
To classify the CabHLH proteins, a phylogenetic tree containing all of the identified pepper bHLH protein sequences together with those of Arabidopsis was constructed by the neighbour-joining method (Figure 2). According to the classification of AtbHLHs in a previous study, the CabHLH proteins were divided into 15 subfamilies and named from group I to group XII. Group II contains the largest numbers of CabHLHs (25) and AtbHLHs (13), while group VII contains only one CabHLH and three AtbHLH proteins. The differing numbers of CabHLH and AtbHLH proteins within the same group might result from unequal duplication of bHLH family members during plant evolution.
Members of the same group might possess similar biological functions. To preliminarily infer the biological functions of CabHLHs, another neighbour-joining phylogenetic tree was constructed for all bHLH proteins in Arabidopsis, tomato, rice and pepper (Figure S4). The functional characteristics of AtbHLHs and SlbHLHs reported in the literature were summarized and used to evaluate the potential functions of the CabHLHs in the same group (Table S2). For example, SlPIF1a, a member of group VI, regulates carotenoid biosynthesis by a light-dependent mechanism in tomato. SlPIF1a corresponded to CabHLH051 in pepper, which was also classified into group VI. These results indicated that the members of group VI might be involved in carotenoid biosynthesis.
Analysis of conserved motifs in CabHLH proteins
To investigate the structural features of CabHLH proteins, conserved amino acid motifs were identified and analysed using the Multiple Em for Motif Elicitation (MEME) Suite. A total of 15 conserved motifs containing 21-100 residues were found and designated motif 1 to motif 15. Information on the motifs is listed in Table S3. Motif 1 and motif 2 are located in the bHLH domain region, which appears in all proteins. Motif 3 to motif 15 are distributed outside the bHLH domain region. Motif 6, motif 7, motif 8 and motif 11 are primarily restricted to group II. Motif 9, motif 12 and motif 15 are found specifically in group IV. Motif 10 and motif 13 were found in group Id, and motif 14 was observed in group V. Generally, most of the proteins in the same group shared common motifs with similar arrangement and position (Figure 3).
Expression profiles of CabHLHs at different developmental stages in pericarp and placenta
Capsorubin and capsaicinoids accumulate in pepper fruit, and their accumulation is closely associated with the transcript levels of biosynthetic genes during fruit development. To obtain further insight into the potential functions of CabHLHs during capsorubin and capsaicinoid biosynthesis, the expression profiles of CabHLHs in pericarp and placenta at different developmental stages were investigated. RNA-Seq raw data were obtained from Kim et al. and included the 6, 16, 25, 36, 38, 43 and 48 DPA stages (Figure 4 and Figure 5). All the raw reads were spliced and remapped to version 2.0 of the C. annuum genome.
As shown in Figure 4B, the expression levels of capsorubin biosynthetic genes began to increase gradually at 36 DPA, which was consistent with the accumulation profile of capsorubin in pericarp tissue. The expression of 20 CabHLH genes was not detectable; these genes were probably transcribed at a low level throughout pericarp development. According to the similarity of their expression profiles, the CabHLH expression patterns across the pericarp developmental stages were hierarchically clustered and classified into 10 clusters (Figure 4A). The expression profiles of CabHLHs in cluster C1, cluster C2, cluster C3 and cluster C4 agreed well with the expression profiles of capsorubin biosynthetic genes, so the members of these four clusters might be associated with capsorubin biosynthesis.
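One simple way to put a number on the "agreement" between a CabHLH expression profile and a biosynthetic gene profile is a correlation coefficient. The sketch below substitutes a plain Pearson correlation for the hierarchical clustering used in the study, so it illustrates the idea and does not reproduce the analysis. The gene names, the expression values and the 0.8 cut-off are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile carries no trend; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def co_expressed(profiles, reference, min_r=0.8):
    """Names of genes whose profile correlates with the reference profile."""
    return sorted(
        name
        for name, values in profiles.items()
        if pearson(values, reference) >= min_r
    )


# Hypothetical values for seven stages (6, 16, 25, 36, 38, 43 and 48 DPA).
reference = [1.0, 1.5, 2.0, 4.0, 9.0, 15.0, 20.0]
profiles = {
    "gene_A": [2.0, 3.0, 4.0, 8.0, 18.0, 30.0, 40.0],  # rises with reference
    "gene_B": [20.0, 15.0, 9.0, 4.0, 2.0, 1.5, 1.0],   # falls as it rises
    "gene_C": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0],     # constant
}
candidates = co_expressed(profiles, reference)
```

Here only the hypothetical gene_A is retained. Correlation captures the shape of a profile but not its magnitude, which is one reason a clustering of all profiles, as in Figure 4A, gives a fuller picture.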
Capsaicinoids were abundantly produced from 13 DPA to 25 DPA during placental development; the expression levels of capsaicinoid biosynthetic genes were high during this period and then gradually declined (Figure 5B). Based on the similarity of their expression profiles, the CabHLH expression patterns across the placental developmental stages were hierarchically clustered into 10 clusters (Figure 5A). A total of 16 genes were expressed at levels too low to be detected. The expression profiles of CabHLHs in cluster L5, cluster L6, cluster L8 and cluster L9 were similar to the expression profiles of capsaicinoid biosynthetic genes. Therefore, these results indicated that the members of cluster L5, cluster L6, cluster L8 and cluster L9 might be associated with capsaicinoid biosynthesis.
Additionally, capsorubin and capsaicinoids are produced mainly in the pericarp and placental tissues of peppers. To determine whether CabHLHs are specifically expressed in pericarp and placental tissues, the expression profiles of all identified CabHLHs in different tissues, including the leaf, root, stem, pericarp and placenta, were investigated. However, the RNA-Seq raw data uploaded by Kim et al. (2014) did not include leaf, root or stem tissues. For these tissues, the RPKM values published online, which were based on mapping to version 1.5 of the C. annuum genome, were used directly. The heat map indicated that CabHLHs were not specifically expressed in certain tissues (Figure S5). Presumably, these TFs orchestrate functions in addition to regulating capsorubin and capsaicinoid biosynthesis.
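For readers unfamiliar with the unit, RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by gene length and sequencing depth. The short sketch below shows the standard definition only; the function name and the numbers are hypothetical and do not come from the data set used here.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2 kb gene in a 10-million-read library.
example_value = rpkm(500, 2000, 10_000_000)
```

Because RPKM values depend on the annotated gene length, values computed against different genome versions (here 1.5 and 2.0) are best compared with some caution.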
Validation of candidate bHLH genes involved in capsorubin and capsaicinoid biosynthesis
To further verify the expression profiles and expression specificity of CabHLHs in the pericarp and the placenta, ten CabHLHs from candidate clusters that might be related to capsorubin and capsaicinoid biosynthesis were selected for qRT-PCR analysis. The selected genes were relatively highly expressed in pericarp or placental tissue at different developmental stages. As shown in Figure 6A, the contents of β-carotene, zeaxanthin and capsorubin increased at the MG stage in pericarp tissue, whereas the content of lutein, which lies on the branch of the pathway that does not lead to capsorubin, decreased. The expression of CabHLH032, CabHLH048, CabHLH095 and CabHLH100 was consistent with the accumulation pattern of carotenoids (β-carotene, zeaxanthin and capsorubin) in pericarp, while CabHLH009 expression was similar to the accumulation pattern of lutein in pericarp. However, these genes were also highly expressed in other tissues (roots, flowers, stems, placentas, leaves and seeds) (Figure 6B). Thus, it was likely that the members of cluster C1, cluster C2 and cluster C3 were associated with capsorubin biosynthesis but also possessed other specific functions in other tissues.
Capsaicin and dihydrocapsaicin began to accumulate at 10 DPA, peaked at 25 DPA in placental tissue and then gradually decreased (Figure 7A). The expression of CabHLH007, CabHLH009, CabHLH026, CabHLH063 and CabHLH086 resembled the accumulation pattern of capsaicin across the placental developmental stages. A high expression level of CabHLH026 was observed in pericarp, seed and placental tissue, and high expression of CabHLH063 was observed in stem and leaf, while CabHLH007, CabHLH009 and CabHLH086 were highly expressed in certain tissues (Figure 7B). Therefore, CabHLH007, CabHLH009, CabHLH026, CabHLH063 and CabHLH086 in cluster L5, cluster L6 and cluster L9 might be associated with capsaicinoid biosynthesis.
The expression of candidate CabHLHs associated with capsaicinoid biosynthesis in response to different temperatures
To obtain a preliminary understanding of whether capsaicinoid biosynthesis is regulated by CabHLH genes in response to different temperatures, the expression of five candidate CabHLHs and five important capsaicinoid biosynthetic genes was measured at different temperatures. As shown in Figure 8A, the contents of capsaicin and dihydrocapsaicin tended to increase with increasing temperature from T15 to T25. The capsaicin content of the placenta in peppers with T25 treatment was significantly higher than that in peppers with T33 treatment. The expression of CabHLH007, CabHLH009 and CabHLH086 increased with increasing temperature, similar to the accumulation of dihydrocapsaicin (Figure 8B). CabHLH026 was highly expressed at T25, which was consistent with the accumulation of capsaicin and the expression of capsaicinoid biosynthetic genes including AT3, AMT, BCKDH and KasIa (Figure 8B; Figure 8C). In contrast, CabHLH063 expression decreased with increasing temperature, which agreed well with the expression profile of the capsaicinoid biosynthetic gene Acl (Figure 8B; Figure 8C). Thus, the expression of CabHLH007, CabHLH009 and CabHLH086 was positively associated with an increase in dihydrocapsaicin content and temperature. By contrast, CabHLH063 expression was negatively related and CabHLH026 expression was positively related to the increase in capsaicin content and temperature. These candidate genes might be involved in the response of capsaicinoid biosynthesis to different temperatures by regulating the transcription of capsaicinoid biosynthetic genes.
The interaction of candidate CabHLHs and MYB31 in yeast
bHLH proteins usually bind DNA and regulate the transcription of downstream genes. To verify the interaction between candidate CabHLHs and CaMYB31, a yeast two-hybrid assay was carried out. The results indicated that these bHLHs interacted with CaMYB31 in a gene-dependent manner. CabHLH007, CabHLH009, CabHLH026, CabHLH063 and CabHLH086 all interacted with CaMYB31 in yeast. CabHLH026 interacted strongly with CaMYB31, while the interactions of CabHLH063 and CabHLH086 with CaMYB31 were weak (Figure 9). Therefore, it is likely that CabHLHs can regulate capsaicinoid biosynthesis by interacting with CaMYB31, a master regulator of capsaicinoid biosynthesis.