Identification of full-length CcWRKY genes
In this study, a local BLASTP search was performed to identify complete WRKY members in jute, using Arabidopsis WRKY protein sequences as queries, and Pfam was used to confirm the presence of the conserved domain. As a result, a total of 43 candidate genes containing a WRKY domain (named CcWRKY) were identified, as shown in Table 1.
Analysis of conserved domains of CcWRKY genes
The conserved domains of the CcWRKY gene sequences were identified and analyzed using DNAMAN 5.0, and conserved motif logos were generated with WebLogo. The results showed that the WRKY gene family in jute can be divided into three groups: I, II and III. Group I had nine members and could be further divided into I-C and I-N subgroups. Group I members contain two WRKY domains and zinc finger structures, and the zinc finger structure is CX4C22-23HXH. Group II could be further divided into subgroups II-a, II-b, II-c, II-d and II-e, with 2, 7, 7, 6 and 6 members, respectively. In II-a, II-b, II-d and II-e, the heptapeptide and zinc finger structure of the C-terminal WRKY domain were WRKYGQK and CX5C23HXH, whereas in II-c they were WRKYGQK and CX4C23HXH. Group III had six members, whose heptapeptide and zinc finger structure were WRKYGQK and CX7C23HXC (Fig. 1). Although WRKY transcription factors have a highly conserved WRKY domain, the present results showed that mutations still occur in the protein sequence. Among the 43 WRKY transcription factors identified in jute, the conserved heptapeptide (WRKYGQK) was mutated in one gene and the zinc finger structure was mutated in four genes (Additional file 2: Table S2). Thus, despite the high structural conservation of the WRKY gene family, some variation still occurs in the WRKY domain, which illustrates that the plant WRKY gene family diversified during evolution.
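The motif definitions above can be turned into a simple sequence scan. The following is a minimal sketch, not the DNAMAN/WebLogo workflow used in this study. It assumes the zinc finger notation is read as C-X(n)-C-X(m)-H-X-H/C with X being any residue. The function name `scan_wrky_domain`, the group labels, and the toy sequence are hypothetical and serve only as an illustration.

```python
import re

# Heptapeptide reported for the WRKY domain in the text.
HEPTAPEPTIDE = "WRKYGQK"

# Assumed reading of the zinc finger notation: C-X(n)-C-X(m)-H-X-H/C,
# where X is any residue. Spacings follow the values quoted in the text.
ZINC_FINGERS = {
    "I": r"C.{4}C.{22,23}H.H",
    "II-a/b/d/e": r"C.{5}C.{23}H.H",
    "II-c": r"C.{4}C.{23}H.H",
    "III": r"C.{7}C.{23}H.C",
}


def scan_wrky_domain(protein):
    """Return (has_heptapeptide, list of zinc finger types that match)."""
    seq = protein.upper()
    start = seq.find(HEPTAPEPTIDE)
    has_heptapeptide = start != -1
    # Look for the zinc finger downstream of the heptapeptide when it exists,
    # otherwise fall back to the whole sequence.
    region = seq[start:] if has_heptapeptide else seq
    hits = [name for name, pattern in ZINC_FINGERS.items()
            if re.search(pattern, region)]
    return has_heptapeptide, hits


# Toy sequence (hypothetical, not a real CcWRKY protein) built to carry
# the heptapeptide followed by a C-X7-C-X23-H-X-C zinc finger.
toy = "WRKYGQK" + "A" * 10 + "C" + "A" * 7 + "C" + "A" * 23 + "HAC"
print(scan_wrky_domain(toy))
```

Because the Group I and II-c patterns overlap under this reading (both allow C-X4-C-X23-H-X-H), a scan of this kind can flag candidate domains and mutated heptapeptides, but it cannot replace the phylogenetic classification.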
Phylogenetic analysis of CcWRKY protein in diverse species
By comparing the known WRKY domains of Arabidopsis thaliana WRKY proteins with those of CcWRKY, the WRKY domain sequences of the CcWRKY proteins were clustered and analyzed using MEGA7 (Fig. 2). The CcWRKY proteins can be divided into three groups: I, II and III, and Group II can be further divided into subgroups II-a, II-b, II-c, II-d and II-e. The classification from the phylogenetic tree was consistent with the results in Figure 1 (Table 1).
Intron and exon structure analysis of WRKY genes in jute
In this study, the numbers of exons and introns of the jute WRKY genes were analyzed, and the results are shown in Figure 3. The number of exons varied from 3 to 11. Twenty-one WRKYs (48.84%) contained 3 exons, 5 WRKYs (11.63%) contained 4 exons, 8 WRKYs (18.60%) contained 5 exons, and 6 WRKYs (13.95%) contained 6 exons. Across the groups, Group II c+d+e and Group III were relatively conserved, whereas the structures in Group I and Group II a+b+c differed markedly and were highly variable. Most CcWRKYs in Group II c+d+e and Group III contain 3 exons, except Ccv40151700 (4 exons) and Ccv40018590 (4 exons).
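The percentages quoted above follow directly from the gene counts and the total of 43 genes. The short sketch below reproduces that arithmetic. The variable names are illustrative, and only the counts stated in the text are used.

```python
TOTAL_GENES = 43

# Exon count -> number of CcWRKY genes, as listed in the text.
exon_counts = {3: 21, 4: 5, 5: 8, 6: 6}


def percentage(count, total=TOTAL_GENES):
    """Share of the gene family, in percent, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {exons: percentage(n) for exons, n in exon_counts.items()}

# Genes whose exon count is not itemized in the text (between 7 and 11 exons).
unlisted = TOTAL_GENES - sum(exon_counts.values())

for exons, share in shares.items():
    print(f"{exons} exons: {exon_counts[exons]} genes ({share:.2f}%)")
print(f"not itemized: {unlisted} genes")
```

The four itemized classes account for 40 of the 43 genes, so the remaining three genes fall in exon classes that the text does not list individually.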
Analysis of tertiary structure of protein
The tertiary structure of a protein is formed by further coiling and folding on the basis of its secondary structure. The tertiary structures of the CcWRKY proteins were predicted with SWISS-MODEL. The majority of the 43 amino acid sequences have similar three-dimensional structures. One representative homology model from the CcWRKY gene family is shown in Figure S1; it consists of several beta sheets. The tertiary structures were quite similar to that of Arabidopsis thaliana[33]. This further indicates that the CcWRKY gene family is highly conserved in structure.
Expression analysis of CcWRKY genes in different tissues
Tissue-specific expression of a gene is often considered a marker of its function in that tissue. Since WRKY genes are related to bast fiber development in plants[34, 35], we focused mainly on the expression of CcWRKY genes at different stages of stem growth. Based on the RNA-seq data, we used R to draw a heatmap of the expression patterns of CcWRKY genes at different stem growth stages (Additional file 4: Fig. S2). Expression differences are represented by color: red represents high expression and blue represents low expression. The results showed that all the CcWRKY genes were expressed in the stem of jute, and that their expression differed among stem growth stages. This also indicates that none of the 43 genes is a pseudogene. In Figure S2, the 43 genes fell into two categories: 13 genes showed lower expression across the different tissues of jute, and the others showed higher expression. In total, 10 WRKYs were highly expressed in leaf (60 d), 3 in hypocotyl (10 d), 2 in stem stick (60 d), 2 in stem bark (60 d), 14 in root (60 d), and 12 in stem bark (120 d). Thus, the WRKY genes were mainly expressed in the stem bark of jute. As jute grows, bast fiber gradually accumulates in the stem bark. It is therefore reasonable to infer that the WRKY genes are involved in bast fiber development in jute. For example, Ccv40032460 was highly expressed in hypocotyl (10 d), weakly expressed in stem bark (60 d), and not expressed in stem bark (120 d), which suggests that this gene may play a negative regulatory role in jute fiber accumulation.
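One way to obtain per-tissue counts like those above is to assign each gene to the tissue in which its FPKM is highest and then tally the assignments. The sketch below illustrates that idea only. It assumes a peak-FPKM rule, which may differ from the criterion used to read Figure S2, and the gene names and FPKM values are hypothetical placeholders rather than data from this study.

```python
from collections import Counter


def peak_tissue(fpkm_by_tissue):
    """Return the tissue with the highest FPKM for one gene."""
    return max(fpkm_by_tissue, key=fpkm_by_tissue.get)


def tally_peak_tissues(expression):
    """Count how many genes peak in each tissue."""
    return Counter(peak_tissue(profile) for profile in expression.values())


# Hypothetical placeholder values, for illustration only.
example = {
    "gene_A": {"hypocotyl_10d": 12.0, "stem_bark_60d": 3.5, "stem_bark_120d": 0.0},
    "gene_B": {"hypocotyl_10d": 1.2, "stem_bark_60d": 4.0, "stem_bark_120d": 9.8},
    "gene_C": {"hypocotyl_10d": 0.4, "stem_bark_60d": 2.1, "stem_bark_120d": 6.3},
}

print(tally_peak_tissues(example))
```

A profile such as the hypothetical `gene_A`, which is high in young hypocotyl and absent in older stem bark, has the same shape as the pattern described for Ccv40032460.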
GA3 stress analysis of CcWRKY genes involved in cell wall formation
According to our previous research[36], "Aidianyehuangma" is a dwarf variety that is sensitive to GA3. "Huangma 179" and "Aidianyehuangma" were each grown on medium containing 0.1 mg·L-1 exogenous GA3 (GA) and on MS medium as the control (CK), and hypocotyl length was measured when the first true leaf had emerged. The hypocotyl length of "Aidianyehuangma" treated with GA3 (4.58 cm) was greater than that of CK (1.77 cm) (Additional file 5: Fig. S3). The average hypocotyl lengths of "Aidianyehuangma" and "Huangma 179" treated with GA3 were 4.58 cm and 4.67 cm, respectively, while that of "Huangma 179" without GA3 was 4.61 cm (Additional file 5: Fig. S3). There were no significant differences in hypocotyl length among these three groups. Thus, GA3 treatment significantly increased the plant height of "Aidianyehuangma", which would greatly improve the fiber yield of the variety. This dwarf variety is therefore well suited to studying the relationship between GA3 and fiber development.
To further explore the relationship among fiber development, CcWRKYs and GA3, we selected several important fiber-related genes as marker genes. These genes included CesA1 (CesA, cellulose synthase), CesA4, CesA7, CesA8, CCoAOMT (caffeoyl-coenzyme A O-methyltransferase), 4CL (4-coumarate:coenzyme A ligase), ent-copalyl diphosphate synthase, ent-kaurene oxidase, ent-kaurene synthase, ent-kaurenoic acid oxidase, GA 20-oxidase, gibberellin 2,3-hydroxylase and gibberellin C13 oxidase. During the vigorous growth period (60 days after sowing), the stem bark of "Huangma 179" and "Aidianyehuangma" was treated with GA3, and samples were taken 4 hours and 72 hours after treatment. Samples without GA3 treatment served as controls. We analyzed the RNA-seq results of these materials and drew the corresponding histograms (Additional file 6: Fig. S4, Additional file 7: Fig. S5), in which upward bars indicate up-regulated genes and downward bars indicate down-regulated genes. The expression of most WRKY genes changed significantly under GA3 stress, as did that of the other fiber-related marker genes, and the change was especially pronounced for the down-regulated genes. Comparing the two sampling times (4 h and 72 h) after GA3 spraying, most CcWRKY genes (31 genes) changed in the same direction at both time points in "Huangma 179", and similar results were found in "Aidianyehuangma".
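The up- and down-regulation calls and the "same trend" comparison between 4 h and 72 h can be expressed as a log2 fold change of treated versus control FPKM. The following is a minimal sketch under stated assumptions: the pseudocount of 1 and the threshold of |log2FC| >= 1 are illustrative choices that are not taken from this study, and the function names are hypothetical.

```python
import math

PSEUDOCOUNT = 1.0  # assumed; avoids taking the log of zero for unexpressed genes
THRESHOLD = 1.0    # assumed cutoff: at least a two-fold change


def log2_fold_change(treated, control, pseudocount=PSEUDOCOUNT):
    """log2 ratio of GA3-treated FPKM to control FPKM."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def classify(treated, control, threshold=THRESHOLD):
    """Label a gene as 'up', 'down' or 'unchanged' relative to the control."""
    lfc = log2_fold_change(treated, control)
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


def same_trend(lfc_4h, lfc_72h):
    """True when a gene changes in the same direction at both time points."""
    return lfc_4h * lfc_72h > 0


# Illustrative values only: FPKM 7 after treatment versus 1 in the control.
print(log2_fold_change(7.0, 1.0), classify(7.0, 1.0))
```

Applying `same_trend` to each gene's two fold-change values and counting the genes for which it returns `True` gives a tally comparable to the 31 genes reported for "Huangma 179".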
From these results, we also found that 21 CcWRKY genes and most of the fiber-related marker genes of "Aidianyehuangma" were sensitive to GA3 stress. The changes in expression of these genes were more pronounced in "Aidianyehuangma" than in "Huangma 179" (Fig. 4 and Fig. 5). This indicated that these CcWRKY genes, like the other fiber-related marker genes, played a role in the increase in fiber yield of "Aidianyehuangma" under GA3 stress. In addition, the promoters of these CcWRKYs were analyzed with PlantCARE. The results showed that the promoters of most CcWRKY genes contained gibberellin-related elements, such as the GARE-motif, P-box and TATC-box (Table 2). The CcWRKY genes responded to GA3 stress and could increase the fiber yield of "Aidianyehuangma". This suggests that WRKY genes, like other fiber genes, might also be involved in the growth and development of bast fiber in jute.
To verify the accuracy of the gene expression data, nine CcWRKY genes were randomly selected for qRT-PCR analysis (Additional file 8: Fig. S6). The qRT-PCR results were consistent with the FPKM values.