Thirty-four GH3-encoding genes (BoGH3s) are present in TO1000
In the Ensembl Plants database (http://plants.ensembl.org/index.html), the protein sequences of fifty-five GH3 candidate genes in TO1000, a kale-type B. oleracea, showed similarity to the nineteen Arabidopsis GH3 proteins [10]. Among these, thirty-four proteins were found to have intact GH3 domains (pfam03321) and were considered GH3 proteins (Table S1; Figure S1). Although the identical genomic sequence was also used for annotation in the NCBI database (NCBI, http://ncbi.nlm.nih.gov) [34], only thirty B. oleracea GH3 candidate proteins, including two with truncations in their GH3 domains, were found to have significant similarity to Arabidopsis GH3s. The thirty-four BoGH3 proteins with intact GH3 domains in the Ensembl Plants database include all twenty-eight putative GH3 proteins with intact GH3 domains identified in the NCBI database (Table S1). For proteins whose sequences differ between the two databases, such as Bo3g009110 and Bo5g053450, the NCBI protein models were adopted in our study because they are supported by RNA-seq data in NCBI. While thirty-four GH3 protein-coding genes were identified from kale-type TO1000 (B. oleracea var. oleracea) in our study, twenty-five and twenty-nine GH3 protein-coding genes were previously reported for cabbage-type B. oleracea var. capitata in comparisons with B. napus genes by two independent studies [32, 33].
Similar to previous phylogenetic analyses of GH3 proteins that included cabbage-type B. oleracea var. capitata, phylogenetic clustering of Arabidopsis and BoGH3 proteins demonstrated that BoGH3 proteins can be divided into three groups (Group Ⅰ, Ⅱ, and Ⅲ) (Fig. 1A) [6, 10, 32, 33]. Group Ⅰ consists of two Arabidopsis and four BoGH3 proteins, while Group Ⅱ consists of eight Arabidopsis and eleven BoGH3 proteins. In Group Ⅲ, nine Arabidopsis GH3s and nineteen BoGH3 proteins were clustered together. In general, the exon/intron structures of BoGH3 genes were the same as those of their closely related counterparts in Arabidopsis, with some exceptions (Fig. 1B). For example, four protein-coding exons were detected for Bo9g023750 in Group Ⅱ, based on the distribution of RNA-seq reads in the NCBI database, while three protein-coding exons are reported for At2g14960 in TAIR JBrowse (https://jbrowse.arabidopsis.org/). For Bo3g039200 and Bo4g196110, which are closely related to At2g46370 (JAR1) with four protein-coding exons, three exons supported by RNA-seq reads were observed. Structural differences were also observed for five BoGH3 genes (Bo3g023700, Bo4g164910, Bo7g116230, Bo7g011450, and Bo8g109490) that were identified only in Ensembl Plants.
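As a quick consistency check, the per-group counts reported above can be tallied against the stated totals of nineteen Arabidopsis and thirty-four BoGH3 proteins. The short sketch below uses only the numbers given in the text; the dictionary layout and variable names are illustrative assumptions, not part of the original analysis.

```python
# Per-group protein counts as stated in the text (Fig. 1A).
# The data layout and names here are illustrative assumptions.
GROUP_COUNTS = {
    "I": {"Arabidopsis": 2, "BoGH3": 4},
    "II": {"Arabidopsis": 8, "BoGH3": 11},
    "III": {"Arabidopsis": 9, "BoGH3": 19},
}

# Sum each species' counts across the three groups.
totals = {"Arabidopsis": 0, "BoGH3": 0}
for counts in GROUP_COUNTS.values():
    for species, n in counts.items():
        totals[species] += n

print(totals)  # {'Arabidopsis': 19, 'BoGH3': 34}
```

The sums agree with the nineteen Arabidopsis GH3 proteins and the thirty-four BoGH3 proteins with intact GH3 domains.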
Synteny is observed for Group III subgroup 6 GH3 genes in Arabidopsis and TO1000
In TO1000, four of the nineteen Group Ⅲ BoGH3 proteins (Bo2g011210, Bo3g009140, Bo7g011450, and Bo9g166800) show a close relationship with Arabidopsis subgroup 6 GH3 proteins (Fig. 1A). While these four BoGH3 genes are found on different chromosomes, the four Arabidopsis GH3 genes in the same subgroup (At5g13350, At5g13360, At5g13370, and At5g13380) are located within a genomic region of less than 15,000 base pairs (bp) on Arabidopsis chromosome 5 (Fig. 2A). When the genes located around the Arabidopsis and TO1000 subgroup 6 GH3 genes were compared, synteny was detected between the At5g13350 ~ At5g13380 GH3 cluster and three TO1000 GH3 genes (Bo2g011210, Bo3g009140, and Bo9g166800) (Fig. 2B-2D). Upstream of the three TO1000 GH3 genes, Bo2g011200 (Fig. 2B), Bo3g009120 (Fig. 2C), and Bo9g167820 (Fig. 2D) were identified; these show sequence similarity to At5g13330, an RAP2.6L transcription factor gene found upstream of the At5g13350 ~ At5g13380 GH3 cluster (Fig. 2A). Moreover, Bo2g011190, Bo3g009110, and Bo9g167830, which cluster with At5g13320 (PBS3) in the phylogenetic tree as Group Ⅲ subgroup 4 GH3 genes, were found further upstream, just as At5g13320 (PBS3) is located upstream of the At5g13350 ~ At5g13380 GH3 cluster. Consistent with the syntenic relationships in these genomic regions, sequence similarities were also observed downstream of the Arabidopsis GH3 cluster and the three TO1000 GH3 genes on different chromosomes (Fig. 2): Bo2g011240 and Bo9g166790 show sequence similarity to At5g13390, No Exine Formation 1.
Subgroup 6 BoGH3 genes are not induced by auxin treatment in the seedling stage
In Arabidopsis, auxin treatment can induce transcription of some GH3 genes, such as At4g27260 (WES1), At4g37390 (YDK1), and At5g54510 (DFL1) [15–17]. However, the expression conditions and functions of GH3 genes in other plants are largely unknown. To gain insight into the expression patterns and functions of the four TO1000 subgroup 6 GH3 genes identified in this study, we tested whether these genes can be induced by plant hormones. None of the subgroup 6 BoGH3 genes was significantly induced by auxin (synthetic 2,4-dichlorophenoxyacetic acid (2,4-D) or natural IAA), GA, or JA treatment at the seedling stage, except Bo3g009140, which was weakly induced by JA (Fig. 3). In contrast, transcriptional induction by auxin was evident for Bo1g004760, a YDK1-like BoGH3 gene, and Bo1g048130, a WES1-like BoGH3 gene, as for their closely related GH3 genes in Arabidopsis [16, 17].
Bo2g011210 and Bo1g048130 are strongly expressed in stamen at a specific stage during flower development
For the four subgroup 6 and two auxin-inducible GH3 genes in TO1000, relative expression patterns were determined in six different organs: root, leaf, stem, floral bud, opened flower, and silique. Among the four subgroup 6 BoGH3 genes, Bo2g011210 was most strongly expressed in floral buds, although significant expression relative to leaf was also observed in siliques (Fig. 4A). Only negligible expression of Bo2g011210 was detected in other organs, including open flowers. For the other three subgroup 6 BoGH3 genes, the strongest expression was found in siliques (Fig. 4B-4D), while comparable expression in floral buds and open flowers was also observed for Bo3g009140 (Fig. 4B). For the auxin-inducible Bo1g004760 and Bo1g048130, distinct relative expression patterns were detected: Bo1g004760 and Bo1g048130 were most strongly expressed in root and floral bud, respectively (Fig. 4E & 4F).
For Bo2g011210 and Bo1g048130, which show strong preferential expression in floral buds (Fig. 4A & 4F), we also determined whether expression is temporally regulated during floral bud development. When expression levels were monitored in developing floral buds sorted by length, which reflects the progress of flower development [44], both genes showed stronger expression when buds were about 2 to 6 mm long, although Bo2g011210 in subgroup 6 showed more dramatic expression changes with developmental progress than Bo1g048130 (Fig. 5A & 5B). In 4 ~ 6 mm-long floral buds, where the two genes are most strongly expressed, expression was detected almost exclusively in stamens among sepals, petals, stamens, and pistils (Fig. 5D & 5E). In contrast, no significant developmental or organ-specific expression patterns were observed for Bo3g009140, another subgroup 6 BoGH3 gene (Fig. 5C & 5F).
Bo2g011210 and Bo1g048130 are expressed in tapetum and pollen grains
To narrow down the spatial expression patterns of the stamen-expressed Bo2g011210 and Bo1g048130, we generated transgenic plants in which GUS (β-glucuronidase) reporter genes are expressed under the control of about 1,500 bp of putative promoter sequence from these BoGH3 genes. Bo2g011210 (-1489 ~ -1)::GUS and Bo1g048130 (-1496 ~ -1)::GUS are two transgenic plants in which the -1489 ~ -1 and -1496 ~ -1 bp DNA sequences upstream of the Bo2g011210 and Bo1g048130 start codons, respectively, are fused to GUS reporter genes. In Bo2g011210 (-1489 ~ -1)::GUS, GUS expression was observed in anthers of developing floral buds (Fig. 6F & 6G), consistent with the qRT-PCR (quantitative reverse transcription polymerase chain reaction) results (Figs. 4 & 5). Weak GUS staining in some stigmas was found to be caused by stigma-attached pollen grains (Fig. 6H). GUS staining was also observed in siliques, but only in the floral organ abscission regions of petals, sepals, and stamens (Fig. 6I & 6J). In addition, GUS expression was detected in leaf primordia of seedlings (Fig. 6K & 6L). In Bo1g048130 (-1496 ~ -1)::GUS, GUS expression was detected in developing anthers and in unfertilized ovules or aborted seeds (Fig. 6M − 6Q), but not in seedling leaf primordia (Fig. 6R). To further define the spatial expression patterns of Bo2g011210 and Bo1g048130 in anthers, cross-sectioned floral buds were examined, and specific expression in tapetum cells and pollen grains was detected (Fig. 6U − 6X). In Bo2g011210 (-1489 ~ -1)::GUS, GUS staining appears to occur in the tapetum first and in pollen grains later (Fig. 6U & 6V).
Bo2g011210 and Bo1g048130 are most strongly expressed around when polarized microspores are generated
To investigate which milestone events in microsporogenesis or microgametogenesis occur in pollen when Bo1g048130 and Bo2g011210 are expressed (Fig. 5), developing pollen grains were collected from floral buds and open flowers. Based on the number and organization of 4’,6-diamidino-2-phenylindole (DAPI)-stained nuclei, tetrads and microspores were observed in floral buds shorter than 2 mm (Fig. 7A & 7D), in which the two anther-expressed GH3 genes, Bo2g011210 and Bo1g048130, are weakly expressed (Fig. 5). In 2 ~ 6 mm floral buds, in which the two anther-expressed GH3 genes are most strongly expressed, microspores, polarized microspores, and bicellular pollen grains were observed (Fig. 7B − 7C & 7E − 7F). While bicellular and tricellular pollen grains were observed in 6 ~ 8 mm buds, only tricellular pollen grains were observed in 8 ~ 10 mm buds and opened flowers (Fig. 7G − 7L). These data show that Bo2g011210 and Bo1g048130 are strongly induced at the time when polarized microspores are mainly produced during early microgametogenesis [45, 46].
One hundred eighty-six bp region upstream of Bo2g011210 is sufficient for anther-specific expression
The DNA sequences responsible for tissue-specific expression of Bo2g011210 were investigated using different DNA regions upstream of the start codon (Fig. 8A). When the P1 construct, in which the -1017 ~ -1 bp region was fused upstream of the GUS reporter gene, was used to generate transgenic plants, GUS expression in anthers and pollen grains was still detected (Fig. 8B − 8D), but expression in floral abscission zones and leaf primordia was lost, except in one case showing GUS staining in the floral abscission zone (Fig. 8E & 8F). When P2 (-418 ~ -1) and P3 (-340 ~ -155), the latter lacking the -155 ~ -1 bp 5’ untranslated region defined by RNA-seq reads in SRX209697 (NCBI), were used, anther-specific GUS expression was maintained (Fig. 8G & 8H). While P4 (-278 ~ -155) did not show GUS expression in any of the twelve lines, five out of twelve P5 (-418 ~ -279) lines showed GUS expression, suggesting that the sixty-two bp region (-340 ~ -279) in the P3 sequence is important for anther-specific expression of Bo2g011210 (Fig. 8I & 8J).
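The coordinate reasoning behind the sixty-two bp region can be reproduced with simple interval arithmetic. The following is a minimal sketch, assuming inclusive coordinates relative to the start codon and reading the P5 end coordinate as -279; the helper names `length` and `overlap` are hypothetical and introduced only for illustration.

```python
# Promoter fragments of Bo2g011210 as given in the text, in bp relative to
# the start codon. Coordinates are treated as inclusive (an assumption), and
# the P5 end is read as -279 (an assumption).
FRAGMENTS = {
    "P1": (-1017, -1),
    "P2": (-418, -1),
    "P3": (-340, -155),
    "P4": (-278, -155),
    "P5": (-418, -279),
}

def length(interval):
    """Number of bp in an inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1

def overlap(a, b):
    """Inclusive intersection of two intervals, or None if they are disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None

# Region shared by the two fragments that drive expression (P3 and P5).
shared = overlap(FRAGMENTS["P3"], FRAGMENTS["P5"])

print(length(FRAGMENTS["P3"]))           # 186
print(shared, length(shared))            # (-340, -279) 62
print(overlap(shared, FRAGMENTS["P4"]))  # None
```

Under these assumptions, P3 spans 186 bp, the segment common to P3 and P5 is -340 ~ -279 (62 bp), and that segment does not overlap P4, which showed no GUS expression.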