Identification, chromosomal distribution and subcellular localization of BBX gene family
To identify the BBX genes in the Gossypium spp. genome and obtain their sequences, a global search of the Gossypium spp. Genomes were carried out by using HMM profiling of the BBX domain (PF00643). After ensuring that the identified members contain conserved domains and delete the repeated sequences, in total, of 17, 18, 37 and 33 putative BBX sequences were identified in G. arboreum, G. raimondii, G. hirsutum and G. barbadense, respectively, via genome-wide identification analysis. In G. hirsutum, 1 BBX was located on scaffold fragments. The BBXs were named according to their location on the chromosomes (Figure 1), and the BBXs located on the scaffold fragments in G. hirsutum is finally named. Table S2 contains detailed location information. The lengths of putative GaBBX protein sequences ranged from 163 aa (GaBBX3) to 374 aa (GaBBX13); GrBBXs, 197 aa (GrBBX16) to 374 aa (GrBBX10); GhBBXs, 166 aa (GhBBX29) to 374 aa (GhBBX32) and GbBBXs, 136 aa (GbBBX3) to 505 aa (GbBBX27). The predicted MW and pI of each BBX are shown in Table S2. The results of subcellular localization showed that most of the BBXs were located in the nucleus, while the others were located in the extracellular, indicating that the nucleus was the main region of biological functions of BBXs.
Based on the genomic location information of 105 BBX genes, we visualized the chromosome distribution of GaBBXs, GrBBXs, GhBBXs and GbBBXs (Figure 1). In G. arboreum, 17 GaBBXs were unevenly distributed on 10 chromosomes. A12 and A13 contained 3 GaBBXs, whereas the other 8 chromosomes, A01, A02, A04, A05, A06, A08, A09 and A11, contained 1 or 2 GaBBXs. In G. raimondii, 18 GrBBXs were located on 9 chromosomes. D02 and D08 contained the most GrBBXs (3), while the other 6 chromosomes contained only 1 or 2 GrBBXs. In G. hirsutum, 37 GhBBXs were unevenly mapped to 21 chromosomes, while, GhBBX37 was located on unassembled scaffolds. At13 contained 4 GhBBXs. At01, Dt01, Dt12 and Dt13 contained 3 GhBBXs, and the other 16 chromosomes contained only 1 or 2 GhBBXs. In G. barbadense, 33 GrBBXs were unevenly mapped to 20 chromosomes. At12, At13, Dt12 and Dt13 contained 3 GrBBXs. The other 16 chromosomes contained only 1 or 2 GrBBXs.
Phylogenetic analysis of the BBX gene family
To investigate the phylogenetic relationships of BBXs, 137 BBX protein sequences (G. arboreum (17), G. raimondii (18), G. hirsutum (37), G. barbadense (33) and A. thaliana (32)) were used to construct a phylogenetic tree based on the NJ method. Members of the BBX family were classified into 5 major groups, I-Ⅴ (Figure 2), and each subgroup was named according to the taxonomic results of previous studies in Arabidopsis [4]. It is worth noting that although AtBBX26 and AtBBX27 belong to group Ⅴ of Arabidopsis according to their structural classification, they are phylogenetically closer to AtBBX12 and AtBBX13, which are in group Ⅱ. As shown in Figure 2, group Ⅱ was the smallest subgroup, containing 7 BBXs. By contrast, group Ⅳ had the most massive numbers of BBX genes, including 69 BBXs. There were 29 BBXs in group I. No cotton species were divided into group Ⅲ or group Ⅴ. In G. hirsutum, BBX gene has 2, 9 and 26 members in group Ⅱ, Ⅰ and Ⅳ, respectively.
Replication events of BBX gene family
From the perspective of the cotton evolution, tetraploid cotton is the result of genome doubling of two diploid cotton hybrids. In terms of the number of genes, the sum of BBX genes in G. arboreum (17) and G. raimondii (19) is about equal to the number of those in G. hirsutum (37) or G. barbadense (33). The results further confirm this view. To explore the replication events of the BBX gene family, MCScanX was used to analyze the collinearity between the At and Dt subgenomes of G. hirsutum and their corresponding ancestral A and D diploid genomes (Figure 3). The data showed that most homologous gene pairs of the BBX gene family were amplified by segmental replication, which meant segmental replication played a key role in the evolution of the BBX gene family. However, the genomic evolution of allotetraploid cotton is extremely complex. In the process of evolution, the genome has experienced not only segmental duplication events but also many tandem duplication events. The duplicate types of BBXs in G. hirsutum are shown in detail in Table S3. In G. arboretum and G. raimondii, 1 and 2 tandem duplication events (GaBBX15/GaBBX16 as well as GrBBX1/GrBBX2 and GrBBX17/GrBBX18) were identified, respectively. In G. hirsutum, 4 tandem duplications (GhBBX1/GhBBX2, GhBBX15/GhBBX16, GhBBX18/GhBBX19, GhBBX34/GhBBX35) were discovered.
Ka/Ks ratios were calculated to evaluate the selection pressure of these homologous gene pairs. Among the 95 homologous gene pairs, 90 homologous gene pairs had Ka/Ks values <1, which indicates that most of the homologous gene pairs had undergone purifying selection in the process of evolution, and these genes pairs may play a similar function. Only a few homologous gene pairs have experienced positive selection, which may lead to new biological functions of these genes.
Analysis of gene structure and conservative motif
The results of the phylogenetic analysis showed that 37 GhBBXs could be divided into 5 groups (A-E), which contained 9, 2, 12, 5 and 9 members, respectively (Figure 5I). To better understand the structural characteristics of GhBBXs, the exon/intron structure was analyzed by GSDS (Figure 5III). The GhBBX genes contained 3 to 7 exons, but most of them contained less than 5 exons. Moreover, the conserved motif was further analyzed by MEME program. The GhBBXs in the same group shows similar motif composition, which further validated the classification results (Figure 5II). Except for group A, the order of motif 1 and motif 2 in GhBBX of other groups was the same. Motif 3 existed only in group A and group B, but motif 4 existed in all groups except group A and group B. Motif 5 exists only in group C, while motif 6 only exists in group E. Figure 5 shows that the distribution of conserved motif and exon/intron structure are different among different groups, but they are highly conservative on the same branches. The results showed apparent conservation, which laying a foundation for functional conservatism and providing guidance for follow-up functional research.
Analysis of cis-acting elements in GhBBX promoter regions
To better understand the regulation of GhBBXs gene transcription and expression, the promoter region of GhBBX (genomic DNA sequence 2 kb upstream of the transcription start site) were used to search the PlantCARE database. A variety of cis-elements were found in the GhBBX promoter region. Among the cis-acting elements, the cis-acting elements related to phytohormone and stress response are the focus of our attention. We found abscisic acid (ABA) response element, gibberellin (GA) response element, auxin (IAA) response element, salicylic acid (SA) response element and methyl jasmonate (MeJA) response element in 21, 19, 11, 17 and 17 GhBBX promoters, respectively. In some GhBBX promoters, there are cis-acting element related to multiple phytohormone, while in other GhBBX promoters, there are only cis-acting element related to a single phytohormone response. In terms of stress-related response elements, these cis-acting elements are mainly related to low temperature, drought, anaerobic and other defenses. In the midst of these elements, the anaerobic cis-acting element is the most frequent stress response element, which appears in the promoters of 32 GhBBX genes, followed by the cis-acting element in response to low temperature. It exists in the promoters of 20 GhBBX genes. Thus, it can be seen that GhBBX may respond to stress response and abiotic stress of cotton. In addition, a large number of light response elements were found in the promoter region of GhBBXs, including Box-4, G-box, GT1-motif, TCT-motif and MRE.
Expression patterns of GhBBXs in different tissues
In order to study the expression pattern of GhBBXs in different tissues, we analyzed the transcriptomic data of root, stem, leaf, anther, filament, pistil and petal in TM-1. The results showed that different members of the cotton BBX family showed different expression patterns. According to the expression characteristics and based on hierarchical clustering analysis, 37 GhBBXs were divided into 3 categories (I-III) (Figure. 5). 6 GhBBXs (GhBBX5, 8, 9, 2326, and 28) belonging to group II were highly expressed in nearly all tissues. 10 GhBBXs (GhBBX2, 4, 10, 16, 19, 21, 22, 29, 32 and 35) belonging to group I were poorly expressed in all tissues. The remaining members (GhBBX1, 3, 6, 7, 11, 12, 13, 14, 15, 17, 18, 20, 24, 25, 27, 30, 31, 33, 34, 36 and 37) belonging to group III exhibited slightly higher expression in vegetative organs, while others showed slightly higher expression in floral organs. These differences in expression patterns might be related to the various functions of GhBBXs.
Expression characterization of GhBBXs in cotton flower bud differentiation
Flower bud differentiation is an important sign that a plant is undergoing a transition from vegetative growth to reproductive growth [27]. To explore whether GhBBX gene, which is highly expressed in flower organs, is involved in the process of flower bud differentiation, we analysed the relative expression of these genes in the shoot apex of the early-maturing cotton cultivar CCRI50 and late-maturing cotton cultivar GX11 from one-leaf stage to five-leaf stage. The graphical representation of the expression profiles of 6 genes at 5 different times is shown in Figure 6. Among the 6 BBX genes, the expression levels of GhBBX5, GhBBX8, GhBBX9, and GhBBX26 in the early-maturing material were higher than those in the late-maturing material, and significant or extremely significant differences were detected at 4, 1, 3, and 5 periods, respectively. The expression levels of the other 2 genes, GhBBX23 and GhBBX28, in the early-maturing material were higher than those in the late-maturing materials at the most stages, but at three-leaf stage, the expression levels in the early-maturing material were lower than those in the late-maturing material.
Expression pattern of GhBBXs under multiple stress
The analysis of cis-acting elements in the promoter region showed that GhBBXs might be related to the abiotic stress, and some studies have shown that BBX genes play a positive role in abiotic stress resistance [23, 28]. Therefore, based on the transcriptomic data of TM-1, we further analyzed the expression patterns of GhBBX under PEG and NaCl stresses. The results shown that GhBBXs clustered mainly into three categories (I-III) based on a hierarchical clustering analysis according to expression features (Figure 5B). The GhBBX genes that belong to groups II and III responded to PEG and NaCl, especially at 12 h after treatment, showing significantly downregulated expression. 11 GhBBXs belonging to group I showed universally low expression after stress treatment.
Based on the transcriptomic data, GhBBX5, GhBBX8, GhBBX9, GhBBX17, GhBBX23, GhBBX26, GhBBX28 and GhBBX36 which belong to group II and whose expression levels increased after stress treatment, were selected for qRT-PCR verification. As shown in Figure 7, PEG stress affected the activity of GhBBXs. GhBBX5, GhBBX23 and GhBBX28 responded to PEG treatment at 1 h, after which their expression was downregulated continuously until a minimum point was reached at 12 h. Afterwards, their expression increased at 24 h. The expression of GhBBX8, GhBBX9, GhBBX17 and GhBBX26 increased at the beginning, after which it decreased but then increased again. The minimum expression level occurred at 12 h. GhBBX36 was the only gene whose expression abruptly decreased but then increased, followed by a sudden increase after reaching the minimum at 12 h. Under NaCl stress treatment, GhBBX5, GhBBX8, GhBBX9, GhBBX17, GhBBX23, GhBBX26 and GhBBX36 showed the same expression pattern. The expression of these genes quickly decreased but then increased. Afterwards, their expression reached a minimum at 12 h followed by an increase. GhBBX28 was the only gene whose expression began at 9 h, after which it decreased; after reaching a minimum at 12 h, the expression increased. Taken together, the results show that GhBBXs can respond to plant abiotic stresses, and at the same time, it provides potential candidate genes for further study.
Expression pattern of GhBBXs under phytohormone
In order to study the response of GhBBXs to phytohormone, 9 genes containing corresponding hormone-response elements in their promoters were selected for qRT-PCR experiment to further analyse the transcript levels of BBX genes under three different hormone treatments (Figure 8). After treated with ABA (Figure 8A), the expression of most of these genes increased early (mainly from 0.5 h to 2 h), followed by a decrease. The expression of three genes decreased early, after which it increased and finally decreased with time. Under GA treatment (Figure 8B), 4 expression patterns were found. the expression of most of these genes increased early, peaking at 0.5 h or 1 h after GA application, after which their expression decreased over time. The expression of two genes (GhBBX18 and GhBBX20) was downregulated at all time points, and the expression of one gene, GhBBX11, was upregulated at all time points except the 9 h time point. The expression of GhBBX7 was upregulated early on, after which it decreased and then increased. In response to IAA treatment (Figure 8C), these genes could be divided into 2 major patterns on the basis of their expression characteristics. The expression of most of these genes increased early, followed by a decrease, peaking from 0.5 h to 2 h. Three genes (GhBBX7, GhBBX13 and GhBBX25) presented downregulated expression at all time points, reaching a minimum at 12 h.