Identification and basic physicochemical properties analysis of kenaf HcCUC genes
Using BLASTP alignment together with domain analysis against the CDD and Pfam databases, six HcCUCs were identified. For ease of reference, they were renamed according to their chromosomal positions (Table 1). The six predicted CUC proteins ranged in length from 327 to 386 amino acid residues, with relative molecular masses from 36300.8 Da to 42780.41 Da. The isoelectric point (pI) values ranged from 5.81 to 8.48, with three members having pI values above 7. The aliphatic index ranged from 58.5 to 69.51. The grand average of hydropathicity (GRAVY) ranged from -0.554 to -0.388; all values were negative, indicating that the proteins are hydrophilic. The instability index ranged from 24.96 to 52.75. HcCUC2 and HcCUC5 had values below 40 and are predicted to be stable proteins, whereas the remaining members had values above 40 and are predicted to be unstable.
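As an illustration of how two of these indices are defined, the following is a minimal sketch, assuming the standard Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index. The tool actually used to compute the values in Table 1 is not stated here, and the function names and the toy peptides are hypothetical, not HcCUC sequences.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues.

    Negative values indicate an overall hydrophilic protein.
    """
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu.

    AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu))
    """
    n = len(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Toy peptides (hypothetical): one fully aliphatic, one fully charged.
hydrophobic_peptide = "AVIL"
charged_peptide = "DEKR"
print(gravy(hydrophobic_peptide), aliphatic_index(hydrophobic_peptide))
print(gravy(charged_peptide), aliphatic_index(charged_peptide))
```

Under these definitions, a negative GRAVY such as those reported for all six HcCUCs reflects a predominance of polar and charged residues.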
Table 1 Basic information of CUC genes in kenaf
Gene name | Gene ID | Length of CDS (bp) | Length of protein (aa) | pI | MW (Da) | Instability index | Aliphatic index | GRAVY
HcCUC1 | Hc.01G004330.t1 | 984 | 327 | 8.48 | 36300.8 | 52.75 | 58.50 | -0.554
HcCUC2 | Hc.01G008340.t1 | 1056 | 351 | 6.21 | 39579.93 | 24.96 | 66.13 | -0.404
HcCUC3 | Hc.02G037940.t1 | 1050 | 349 | 6.61 | 38405.1 | 43.63 | 62.92 | -0.423
HcCUC4 | Hc.06G022950.t1 | 1065 | 354 | 8.25 | 39050.86 | 50.07 | 60.42 | -0.491
HcCUC5 | Hc.07G027740.t1 | 1161 | 386 | 5.81 | 42780.41 | 35.24 | 69.51 | -0.388
HcCUC6 | Hc.13G003690.t1 | 1062 | 353 | 7.64 | 38576.2 | 51.05 | 60.34 | -0.418
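The summary statements about Table 1 can be checked directly from its values. The sketch below transcribes the table into a dictionary and applies the thresholds named in the text (instability index below 40 for predicted stability, pI above 7, negative GRAVY). It also checks that each CDS length equals three times the protein length plus one stop codon. The variable and function names are illustrative only.

```python
# Values transcribed from Table 1:
# gene -> (CDS length in bp, protein length in aa, pI, instability index, GRAVY)
TABLE1 = {
    "HcCUC1": (984, 327, 8.48, 52.75, -0.554),
    "HcCUC2": (1056, 351, 6.21, 24.96, -0.404),
    "HcCUC3": (1050, 349, 6.61, 43.63, -0.423),
    "HcCUC4": (1065, 354, 8.25, 50.07, -0.491),
    "HcCUC5": (1161, 386, 5.81, 35.24, -0.388),
    "HcCUC6": (1062, 353, 7.64, 51.05, -0.418),
}

STABILITY_CUTOFF = 40.0  # instability index below this is predicted stable


def summarize(table):
    """Return the gene lists and flags behind the statements in the text."""
    stable = sorted(g for g, row in table.items() if row[3] < STABILITY_CUTOFF)
    above_neutral = sorted(g for g, row in table.items() if row[2] > 7.0)
    all_hydrophilic = all(row[4] < 0 for row in table.values())
    # A complete CDS encodes the protein plus one stop codon.
    cds_consistent = all(row[0] == 3 * (row[1] + 1) for row in table.values())
    return stable, above_neutral, all_hydrophilic, cds_consistent


stable, above_neutral, all_hydrophilic, cds_consistent = summarize(TABLE1)
print("Predicted stable:", stable)
print("pI above 7:", above_neutral)
print("All GRAVY values negative:", all_hydrophilic)
print("CDS lengths match protein lengths:", cds_consistent)
```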
Conserved motif and gene structure analysis
To better understand the structural evolution of HcCUCs, a phylogenetic tree was constructed from the predicted full-length HcCUC protein sequences. The HcCUCs fell into two groups (I and II) (Fig 1a), and members clustered in the same group shared highly similar conserved motifs and gene structures (Fig 1b, c). Motifs 1-4 were present in all HcCUCs, with Motifs 1-3 located within the conserved NAM domain; Motifs 5-9 were found only in group I HcCUCs, whereas Motif 10 was found only in group II HcCUCs (Fig 1b). None of the HcCUCs had a 3' non-coding region, and only HcCUC2 had a 5' non-coding region (Fig 1c). The number and arrangement of coding regions were highly similar across all HcCUCs.
Phylogenetic and collinearity analysis
To further understand the phylogeny of HcCUCs, a phylogenetic tree was constructed from the six HcCUC protein sequences and 22 CUC protein sequences from other species (Fig 2a). The six HcCUC genes fell into two groups (I and V). HcCUC1, HcCUC3, HcCUC4, and HcCUC6 in group I were closely related to CpCUC2, while HcCUC2 and HcCUC5 in group V were most closely related to CpCUC3 and DcCUC3. Of the three AtCUC genes in the model plant Arabidopsis, AtCUC2 and AtCUC3 fell into groups I and V, respectively, alongside the HcCUC genes. This indicates that these two AtCUC genes are more closely related to the HcCUC genes, and that CUC genes in the same cluster may have similar biological functions.
Gene duplication events are essential for the evolution of gene families and often play an important role in gene family expansion and the generation of functionally novel genes. Collinearity analysis within the kenaf genome identified six segmentally duplicated gene pairs among the six HcCUCs (HcCUC1/3, HcCUC1/4, HcCUC1/6, HcCUC2/5, HcCUC3/4, HcCUC4/6) (Fig 2b), indicating that segmental duplication played an important role in the expansion of the HcCUC gene family. Collinearity analysis between kenaf and Arabidopsis identified eight homologous gene pairs between the six HcCUCs and the three AtCUCs (HcCUC1/AtCUC1, HcCUC1/AtCUC2, HcCUC2/AtCUC3, HcCUC3/AtCUC1, HcCUC4/AtCUC1, HcCUC5/AtCUC3, HcCUC6/AtCUC1, HcCUC6/AtCUC2) (Fig 2b). These homologous gene pairs may have evolved from a common ancestor and have a close evolutionary relationship.
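The relationship between the duplication pairs and the phylogenetic groups can be made explicit by treating each pair as an edge in a graph. The following minimal sketch, using only the pairs listed above, finds the connected components of the duplication network and the Arabidopsis partners of each component. It illustrates the bookkeeping only and is not the collinearity analysis itself; the function names are hypothetical.

```python
from collections import defaultdict

# Segmentally duplicated HcCUC pairs, as listed in the text.
DUPLICATION_PAIRS = [
    ("HcCUC1", "HcCUC3"), ("HcCUC1", "HcCUC4"), ("HcCUC1", "HcCUC6"),
    ("HcCUC2", "HcCUC5"), ("HcCUC3", "HcCUC4"), ("HcCUC4", "HcCUC6"),
]

# Kenaf/Arabidopsis homologous pairs, as listed in the text.
HOMOLOG_PAIRS = [
    ("HcCUC1", "AtCUC1"), ("HcCUC1", "AtCUC2"), ("HcCUC2", "AtCUC3"),
    ("HcCUC3", "AtCUC1"), ("HcCUC4", "AtCUC1"), ("HcCUC5", "AtCUC3"),
    ("HcCUC6", "AtCUC1"), ("HcCUC6", "AtCUC2"),
]


def connected_components(pairs):
    """Group genes that are linked, directly or indirectly, by any pair."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # iterative depth-first traversal
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(graph[gene] - component)
        seen |= component
        components.append(sorted(component))
    return components


def arabidopsis_partners(component, homolog_pairs):
    """Collect the AtCUC genes paired with any gene in the component."""
    return sorted({at for hc, at in homolog_pairs if hc in component})


components = connected_components(DUPLICATION_PAIRS)
for component in components:
    print(component, "->", arabidopsis_partners(component, HOMOLOG_PAIRS))
```

The two components recovered this way coincide with phylogenetic groups I and V, and their Arabidopsis partners do not overlap.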
Tissue-specific expression analysis of the HcCUC gene
The expression patterns of HcCUC genes were analyzed using RNA-seq data from different kenaf organs. As shown in Fig 3, expression patterns differed among members. For example, HcCUC6 expression was barely detectable in most tissues and stages, whereas HcCUC4 and HcCUC5 were detected at varying levels in most tissues. HcCUCs were mainly expressed in leaves at the seedling stage, buds at the mature period, and anthers at the dual-core period, and all members were detected in anthers at the dual-core period. In addition, all members except HcCUC6 were detected in leaves at the seedling stage and in 2 cm buds at the mature period, suggesting that HcCUCs may play an important role in the development of these tissues.
Overexpression of HcCUC1 regulated the growth and development of Arabidopsis leaves and lateral branches
To further study the biological function of HcCUCs, we cloned HcCUC1, which clusters in the same phylogenetic group as AtCUC2, has a collinear relationship with AtCUC2, and is highly expressed in leaves at the seedling stage, buds at the mature period, and anthers at the dual-core period. HcCUC1 overexpression lines driven by the cauliflower mosaic virus (CaMV) 35S promoter (35S:HcCUC1) were generated in Arabidopsis (Col-0). Two independent overexpression lines (OE1 and OE2) with increased HcCUC1 transcript levels were selected for further study (Fig 4c).
Overexpression of HcCUC1 delayed the flowering time of Arabidopsis
In addition to the differences in leaves and lateral branches, flowering time differed significantly between wild-type and transgenic Arabidopsis. Compared with the wild type, the HcCUC1 overexpression lines bolted and flowered about four days later (Fig 5a, b, d, e). Because of this delayed bolting, the transgenic lines were shorter than the wild type during early growth, but final plant height did not differ significantly between the transgenic lines and the wild type (Fig 5c, f). These results suggest that heterologous expression of HcCUC1 can delay flowering in Arabidopsis but does not affect the final height of the plant.
HcCUC1 affects the expression of genes related to auxin, leaf development and flowering regulation
Previous studies have shown that auxin plays an important role in leaf and branch development (Xiong and Jiao 2019) and that many genes regulate plant flowering (Fornara et al. 2010). Measurement of the transcript levels of selected related genes in wild-type and transgenic Arabidopsis showed that the auxin synthesis-related genes YUC2 and YUC4 and the auxin transport-related genes PIN1, PIN3, and PIN4 were generally up-regulated in transgenic plants compared with wild-type plants (Fig 6a). In addition, the transcript levels of two leaf-related genes, KNAT2 and KNAT6, were significantly higher than those in wild-type plants (Fig 6b). This suggests that overexpression of HcCUC1 may influence the growth and development of leaves or lateral branches by up-regulating the transcription of auxin-related and leaf-shape-related genes. In contrast, transgenic Arabidopsis showed markedly reduced transcript levels of genes involved in flowering regulation (such as FT, AP1, LFY, and FUL) compared with wild-type plants (Fig 6c), suggesting that heterologous expression of HcCUC1 may delay flowering in Arabidopsis by down-regulating the transcription of flowering regulators.