Introduction and phylogenetic analysis of the GhGASAs
To screen cotton flowering-related genes, we identified a total of 38 GhGASAs in the upland cotton genome with an uneven distribution on the 18 chromosomes, and their physicochemical properties were analyzed (Table S3). Notably, 10 GhGASAs were concentrated on chromosomes A05 and D05, whereas no corresponding genes were found on chromosomes A01, A03, A08, A13, D01, D08, or D13. We systematically renamed these genes GhGASA1 to GhGASA38 from chromosomes A01 to D13 according to their chromosomal location (Fig. S2). To further clarify the evolutionary relationships of the GhGASAs, we selected 155 GASA proteins from seven plant species: 38 from upland cotton; 37, 23, 18 and 15 from soybean, Canavalia rosea, tobacco, and Arabidopsis, respectively; and 24 from grapevine (14) and rice (10). An evolutionary tree was constructed, and the tree divided the 155 GASAs into three major branches. Subfamily I, the largest branch, comprised 73 genes, including 20 GhGASAs and 6 AtGASAs. Subfamily II consisted of 44 GASAs, which included 7 GhGASAs and 5 AtGASAs. Subfamily III was the smallest, with only 38 genes, which included 11 GhGASAs and 4 AtGASAs. Therefore, more GASA members from Arabidopsis, cotton, and tobacco belonged to subfamily I than to the other subfamilies. Interestingly, during the evolutionary process, GhGASAs exhibited the closest relationship with soybean, followed by Arabidopsis, Canavalia rosea and tobacco, and exhibited less affinity for rice and grapevine. The results indicated a closer relationship between cotton and most dicotyledonous plants than between cotton and monocotyledonous plants. Additionally, the evolution of Subfamily II appeared relatively conservative compared with that of Subfamily I and Subfamily III (Fig. 1).
Gene duplication analysis of the GhGASAs
To determine the homologous relationships between GhGASAs in the At and Dt subgenomes of G. hirsutum, we examined gene duplication events to determine the amplification patterns of these genes. Tandem duplication occurred when two closely related GhGASAs were located on the same chromosome, and segmental duplication occurred when they were found on different chromosomes. We analyzed the duplication patterns of the 38 GhGASAs in the G. hirsutum genome and identified 44 homologous duplicated gene pairs (Fig. 2). Among these homologously replicating gene pairs, one pair formed on chromosome D05 (GhGASA28/29) by tandem replication, whereas the other 43 pairs formed by fragment replication. These findings suggest that segmental repetition played a more significant role than tandem repetition in the evolution of GhGASAs. To gain insight into the evolutionary constraints influencing the functional divergence of GhGASAs, we calculated the nonsynonymous substitutions (Ka), synonymous substitutions (Ks), and nonsynonymous-to-synonymous substitution ratio (Ka/Ks). A total of 42 duplicated GhGASA pairs presented a Ka/Ks ratio<1, suggesting that these gene pairs might have undergone intense purging selection pressure. However, two pairs of duplicated GhGASAs exhibited Ka/Ks values exceeding 1.0, indicating that these genes underwent positive selection (Table S4). In conclusion, the identification of tandem and segmental duplications elucidated the amplification patterns of GhGASAs and revealed that segmental repetition played a dominant role in their evolution. Furthermore, Ka/Ks analysis indicated that purifying selection is a major force that shapes the functional conservation of duplicated GhGASAs, whereas positive selection may have driven the divergence of specific gene pairs. These findings improve the understanding of the evolutionary dynamics of GhGASAs in the context of polyploidy.
Conserved motif, gene structure and cis-acting element analyses of the GhGASAs
To elucidate the genetic structural diversity of these polygenic families, we constructed a comprehensive structural analysis of the GhGASAs. The exon‒intron maps revealed that most GhGASAs, except for GhGASA17, which contained only one exon in its coding region, typically contained 1-4 exons and introns (Fig. 3a). Similar exon‒intron distribution patterns were observed among evolutionarily closely related GhGASAs. This structural multiformity was crucial for gaining deeper insights into the features and functions of these proteins in cotton. Conserved motif analysis indicated that all the putative GhGASA proteins shared a conserved GASA domain. Most of the proteins contained three to five conserved motifs; in addition, motif 3 was entirely absent in GhGASA17, and motif 6 was exclusively detected in GhGASA8, GhGASA14, and GhGASA23. The diverse motifs suggested that the functions of GhGASA tended to diversify during evolution (Fig. 3b). The distinct order of the bases among the five conserved motifs of the GhGASA proteins is shown in Fig. 3c. These results suggested that the GhGASAs that clustered in the same group might share similar structural features.
To predict the potential biological functions of the GhGASAs, we analyzed the cis-acting elements within their promoters. The cis-acting elements could fall into four categories, and almost all the GhGASAs were involved in light response elements. Most of the GhGASA promoters contained p-box, AuxRE, and GARE cis-acting elements related to gibberellin responsiveness. Furthermore, a total of 48 TGACG and CGTCA motifs were found to be involved in MeJA responsiveness. Notably, five AuxRE elements (auxin-responsive elements) were found in the GhGASA promoter, and ten salicylic acid-responsive TCA elements were identified in eight GhGASA promoters. Additionally, the GhGASA promoters were implicated in plant growth and development: RY elements for seed-specific regulation, O2 sites for metabolic regulation, CAT boxes related to meristem expression, and GCN4 motifs for endosperm expression. Additional cis-acting elements included LTR and TC-rich repeated cis-acting elements, indicating their involvement in stress and low-temperature responses (Fig. 3d). These results underscore the finding that GhGASAs not only play an important role in plant hormone regulation but are also related to plant reproductive development
Tissue-specific expression patterns of GhGASAs
To elucidate the expression patterns of the 38 GhGASAs, we constructed an expression calorimetric map across different tissue organs. Based on the phylogenetic tree and expression level, the nine genes closely related to soybean and Arabidopsis thaliana were preliminarily screened and exhibited upregulation in leaf and flower organs (Fig. 4a). Subsequently, we validated the expression levels of these nine candidate GhGASAs in various tissues by qRT‒PCR (Fig. 4b). The results revealed that the expression of GhGASA3 and GhGASA13 tended to increase first and then decrease in different organs. GhGASA14 and GhGASA16 were highly expressed in floral organs, and GhGASA25 was predominantly expressed in leaves. Interestingly, GhGASA23 was highly expressed specifically in the pistil. Moreover, GhGASAs exhibited both similar and contrasting expression levels across multiple tissues. For instance, GhGASA9 and GhGASA31 were upregulated in anthers and petals. Intriguingly, GhGASA9 and GhGASA18 exhibited completely divergent expression patterns. These findings suggested that the nine GhGASAs were chiefly involved in plant reproductive growth. Additionally, by utilizing transcriptome data from six different leaf stages of early- and late-maturity varieties of upland cotton, we noted that GhGASA9 and GhGASA14 were upregulated not only in floral organs but also in early-maturity varieties of upland cotton (Fig. S3). The reliability of the transcriptome data was further verified by qRT‒PCR. Taken together, these results indicated that GhGASA9 and GhGASA14 might be key candidate genes regulating cotton flowering.
Functional verification of GhGASA9 and GhGASA14 in upland cotton
To validate the functions of GhGASA9 and GhGASA14, we suppressed their expression using VIGS technology. After 21 days of TRV:GhCLA1 plants albinoed, the stem segments of seedlings treated with GA3 exhibited notable elongation compared with those of the control plants sprayed with water (MOCK). However, no significant difference was detected between TRV:00 and TRV:GhGASA9 or TRV:GhGASA14 under both GA3 treatment and MOCK conditions at this stage (Fig. 5a). After four weeks of TRV:GhCLA1 plant albinoed, the response to GA3 gradually weakened after the silencing of GhGASA9 and GhGASA14 compared with that of TRV:00. Only the TRV:00 plants exhibited a significant increase in plant height (Fig. 5b). To provide a clearer observation of the plant height of the TRV:00 plants and the plants in which the two target genes were silenced, we separated the MOCK group from the GA3-treated group. The results revealed that the TRV:00 plants were taller than the TRV:GhGASA9 and TRV:GhGASA14 plants after GA3 treatment. However, no noticeable difference was observed in the MOCK plants (Fig. 5c). These findings indicate that these two target genes exhibit a robust response to GA3.
To clarify the effect of GhGASA9 and GhGASA14 on reproductive development, we investigated the budding and flowering times of these plants. The TRV:00 plants exhibited the earliest budding, followed by the TRV:GhGASA9 and TRV:GhGASA14 plants, whose budding occurred 5-6 days later. Statistical analysis revealed that the average bud emergence for the TRV:GhGASA9 plants was 4.8 days later and for the TRV:GhGASA14 plants was delayed by 6.0 days on average contrasted with the TRV:00 plants. Similarly, flowering time exhibited a consistent trend, with the average flowering time of the TRV:GhGASA9 and TRV:GhGASA14 plants being delayed by 4.3 and 5.0 days, respectively, compared with that of the TRV:00 plants. These results indicating that the silencing of these two target genes delayed both the flowering and budding of the plants compared with those of the TRV:00 plants. After treatment with exogenous GA3, the first bud appeared 3-6 days later in the two gene-silenced plants than in the TRV:00 plants, while the second and third buds were not significantly different between the two gene-silenced plants and the TRV:00 plants. Observations and statistics showed that the growth of the silenced plants was similar to that of the TRV:00 plants, and there was no significant difference in flowering time (Fig. 6a). Interestingly, compared with those in plants exposed to MOCK, the first buds in plants treated with GA3 appeared later; however, no significant difference in the flowering time of the first flower at this stage was detected between these two treatments. These findings indicate that the period from budding to flowering was on average 10-12 days shorter in the GA3 treatment group than in the MOCK treatment group. Taken together, GhGASA9 and GhGASA14 might play a positive role in regulating plant flowering time., GA3 could influence the growth and development of plants, making the flowering period of the different fruiting branches more concentrated and promoted the flowering of plant (Fig. 6b). Moreover, our found GhGASA14 had a more significant effect on flowering time (Fig. 6c).
GhGASA14 is differentially expressed between early‐maturity and late‐maturity cultivars
To verify the authenticity of the results, we used an online website to analyze the presence of SNP loci in these genes. No SNP loci were identified in the coding region of GhGASA9. Notably, one SNP (T/C; chrA09:3986021) associated with flowering time was detected within the exon of GhGASA14 (Fig. 7a), and the flowering time of early-maturing accessions with the CC genotype was significantly shorter than that of late-maturing accessions with the TT genotype; therefore, we speculated that the CC was the favorable allelic variation (FAV) for early maturity (Fig. 7b). Then, we cloned the CDS (321 bp) of GhGASA14 from five early-maturing accessions and five late-maturing accessions. Sanger sequencing revealed that the SNP existed mainly in early-maturing accessions with the CC genotype and existed in late-maturing accessions with the TT genotype (Fig. 7c). The expression level of GhGASA14 in early-flowering accessions with cytosine (CC) was significantly greater than that in late-flowering accessions with thymine (TT) (Fig. 7d). In addition, linear regression analysis was conducted on the expression levels and flowering times of GhGASA14 among the 12 upland cotton accessions, and the results showed a significant linear relationship (R2 = 0.5988), which also confirmed that the higher the expression of GhGASA14 was in early-maturing accessions, the shorter the flowering time was (Fig. 7e). These results suggest that GhGASA14 is differentially expressed between early‐maturity and late‐maturity cultivars and that a high expression level is favorable for flowering.
GhGASA14 may undergo artificial selection in upland cotton
To study the evolution of FAV across geographical origins, we compared the frequency differences among the four cotton regions, namely, the NIR, NSEMR, YRR and YZRR regions. The FAV frequencies of the cotton varieties NIR, NSEMR, YRR and YZRR were 76%, 50%, 49% and 47%, respectively. The FAV frequencies in the cotton varieties at high latitudes were greater than those at low latitudes, and as latitude increased, the FAV frequency increased (Fig. 7f). To verify the effect of this allelic variation on flowering traits, Min-50 and Max-50 germplasms of flowering time were selected, and frequency distributions were calculated. We found that the FAV frequency of the Min-50 accessions was greater than that of the Max-50 accessions (Fig. 7g). Moreover, we estimated the π and Fst values in nearly regions (A09:3,980,000-4,000,000 bp) of GhGASA14. A low π value for a gene region indicated low genetic diversity and a relatively single nucleotide type; a high FST value in the region indicated that the genetic differentiation degree and genetic differences were relatively large (Fig. 7h). Taken together, these results reveal that GhGASA14 may undergo artificial selection in the process of variety improvement.
GhGASA14 regulates cotton flowering through the gibberellin pathway
To further verify the role of GhGASA14 in regulating flowering, we detected several key flowering genes, including GhAP1, GhFT, GhSOC1, GhSVP and GhLFY, as well as GhGA20ox2, GhAGL24 and GhGAI in TRV:GhGASA14 plants. We utilized leaves from TRV:GhGASA14 and TRV:00 plants as templates. Compared to those in TRV:00, the expression levels of GhAP1, GhFT and GhSOC1 were significantly decreased in TRV:GhGASA14 plant, whereas GhLFY was significantly increased. Similarly, the expression of the gibberellin-related genes GhGAI, GhGA20ox2 and GhAGL24 were also significantly reduced in the TRV:GhGASA14 plant. In the presence of GA3, the expression levels of GhSOC1, GhFT and GhGAI were significantly lower than those at TRV:00. In contrast, the expression of GhAP1 and GhSVP increased (Fig. 8a). In addition, we also analyzed the expression levels of GhGASA14, GhAGL24 and GhGAI in the GhAP1-D3-OE plant, ghap1 and WT plants. the expression levels of GhAP1, GhAGL24 and GhGAI were significantly upregulated in the GhAP1-D3-OE line compared to WT. However, the expression level of GhGASA14 was significantly decreased in the GhAP1-D3-OE plant, while it was significantly increased in ghap1 plants (Fig. 8b). In conclusion, these findings suggest that GhGASA14 might promote flowering by influencing the expression levels of key genes in the GA pathway.