Genome-wide identification of the ZlARF gene family in Z. latifolia
To identify the ZlARF gene family of Z. latifolia, we ran BLASTP searches against the Z. latifolia genome database, using the ARF protein sequences reported for Oryza sativa and Arabidopsis thaliana as queries (Fig. S1). These searches recovered 33 candidate ARF protein sequences. Pfam and SMART analyses confirmed that all ZlARF family members contain the conserved DBD and MR domains. On the basis of this analysis, the genes were named ZlARF1 to ZlARF33 according to their scaffold positions in the Z. latifolia genome assembly. All ZlARFs containing the DBD and MR domains are summarized in Table 1. Overall, the analysis identified 33 ZlARF members in the Z. latifolia genome.
Table 1
The ARF gene family members in Z. latifolia.
Gene nameA | Gene_IDB | DomainC | Chromosome locationD | ORF length (bp)E | Deduced polypeptideF length (aa) | Deduced polypeptideF MW (kDa) | Deduced polypeptideF pI | Deduced polypeptideF instability index | DirectionG (R/F) | No. of intronsH | Arabidopsis homologI |
ZlARF1 | Zlat_10046274 | DBD; MR; CTD | scaffold_21:842350:855269 | 2442 | 813 | 90.40 | 6.49 | 58.37 | R | 17 | AT1G59750.3 (AtARF1) |
ZlARF2 | Zlat_10045507 | DBD; MR; CTD | scaffold_23:1002962:1014822 | 2433 | 810 | 89.09 | 5.54 | 53.63 | F | 14 | AT4G23980.1 (AtARF9) |
ZlARF3 | Zlat_10042822 | DBD; MR | scaffold_44:189023:210245 | 2706 | 901 | 100.51 | 5.86 | 73.27 | F | 13 | AT1G30330.3 (AtARF6) |
ZlARF4 | Zlat_10039405 | DBD; MR; CTD | scaffold_84:856365:861456 | 2529 | 842 | 92.99 | 6.74 | 59.32 | R | 13 | AT5G62000.2 (AtARF2) |
ZlARF5 | Zlat_10036653 | DBD; MR; CTD | scaffold_119:422479:427812 | 2511 | 836 | 92.53 | 6.69 | 56.90 | R | 15 | AT5G62000.2 (AtARF2) |
ZlARF6 | Zlat_10033866 | DBD; MR; CTD | scaffold_166:63647:71937 | 3399 | 1132 | 125.74 | 6.06 | 62.29 | F | 13 | AT1G19220.1 (AtARF19) |
ZlARF7 | Zlat_10033398 | DBD; MR; CTD | scaffold_169:614405:620911 | 2442 | 813 | 90.50 | 5.95 | 62.86 | R | 13 | AT5G37020.1 (AtARF8) |
ZlARF8 | Zlat_10032874 | DBD; MR | scaffold_173:486166:491783 | 1227 | 408 | 45.30 | 6.25 | 49.24 | F | 11 | AT2G33860.1 (AtARF3) |
ZlARF9 | Zlat_10031865 | DBD; MR; CTD | scaffold_190:298677:304803 | 2454 | 817 | 91.06 | 5.98 | 63.85 | R | 13 | AT5G37020.1 (AtARF8) |
ZlARF10 | Zlat_10031816 | DBD; MR | scaffold_190:27070:31806 | 2757 | 918 | 101.26 | 5.86 | 52.41 | R | 12 | AT1G19850.1 (AtARF5) |
ZlARF11 | Zlat_10031564 | DBD; MR; CTD | scaffold_211:13403:18162 | 2676 | 891 | 99.25 | 5.85 | 63.99 | R | 13 | AT1G30330.1 (AtARF6) |
ZlARF12 | Zlat_10028683 | DBD; MR | scaffold_223:24568:27701 | 1182 | 393 | 43.67 | 6.27 | 68.73 | F | 12 | AT1G30330.3 (AtARF6) |
ZlARF13 | Zlat_10028341 | DBD; MR | scaffold_239:609211:611540 | 2088 | 695 | 74.90 | 7.11 | 48.00 | R | 2 | AT4G30080.1 (AtARF16) |
ZlARF14 | Zlat_10026465 | DBD; MR | scaffold_266:621432:622703 | 1140 | 379 | 41.89 | 9.05 | 39.17 | F | 1 | AT4G30080.1 (AtARF16) |
ZlARF15 | Zlat_10022744 | DBD; MR; CTD | scaffold_324:216217:223745 | 3423 | 1140 | 126.44 | 6.41 | 65.26 | R | 13 | AT1G19220.1 (AtARF19) |
ZlARF16 | Zlat_10023255 | DBD; MR; CTD | scaffold_336:152463:158522 | 3183 | 1060 | 117.69 | 7.59 | 72.76 | F | 16 | AT1G30330.3 (AtARF6) |
ZlARF17 | Zlat_10022955 | DBD; MR | scaffold_347:482128:484518 | 2142 | 713 | 76.37 | 6.15 | 46.55 | R | 3 | AT4G30080.1 (AtARF16) |
ZlARF18 | Zlat_10022003 | DBD; MR; CTD | scaffold_358:524224:528820 | 2181 | 726 | 80.79 | 5.68 | 50.55 | R | 14 | AT4G23980.1 (AtARF9) |
ZlARF19 | Zlat_10021233 | DBD; MR; CTD | scaffold_378:444906:451433 | 1623 | 540 | 60.18 | 7.68 | 53.93 | F | 12 | AT1G19220.1 (AtARF19) |
ZlARF20 | Zlat_10019056 | DBD; MR; CTD | scaffold_432:225158:229813 | 2640 | 879 | 98.07 | 5.78 | 62.27 | F | 13 | AT1G30330.1 (AtARF6) |
ZlARF21 | Zlat_10018204 | DBD; MR; CTD | scaffold_457:428399:433130 | 2049 | 682 | 75.79 | 6.18 | 59.80 | F | 13 | AT1G59750.3 (AtARF1) |
ZlARF22 | Zlat_10017981 | DBD; MR; CTD | scaffold_460:101457:108766 | 2979 | 992 | 110.89 | 5.83 | 69.65 | F | 14 | AT1G19220.1 (AtARF19) |
ZlARF23 | Zlat_10010482 | DBD; MR | scaffold_735:135203:141844 | 2067 | 688 | 74.55 | 6.44 | 49.78 | F | 9 | AT2G33860.1 (AtARF3) |
ZlARF24 | Zlat_10009593 | DBD; MR | scaffold_820:62920:66333 | 2061 | 686 | 74.94 | 6.57 | 47.96 | F | 2 | AT4G30080.1 (AtARF16) |
ZlARF25 | Zlat_10008345 | DBD; MR | scaffold_855:199807:202584 | 2073 | 690 | 75.08 | 8.18 | 47.81 | R | 2 | AT4G30080.1 (AtARF16) |
ZlARF26 | Zlat_10008072 | DBD; MR | scaffold_872:196040:201540 | 2115 | 704 | 77.63 | 7.60 | 55.53 | F | 10 | AT2G33860.1 (AtARF3) |
ZlARF27 | Zlat_10005792 | DBD; MR | scaffold_1005:123221:128458 | 2751 | 916 | 100.74 | 6.02 | 54.76 | R | 12 | AT1G19850.2 (AtARF5) |
ZlARF28 | Zlat_10005803 | DBD; MR; CTD | scaffold_1041:97033:110057 | 2700 | 899 | 98.93 | 6.14 | 52.53 | F | 16 | AT5G62000.5 (AtARF2) |
ZlARF29 | Zlat_10005411 | DBD; MR | scaffold_1089:69:4914 | 1065 | 354 | 39.24 | 6.11 | 56.48 | R | 9 | AT2G33860.1 (AtARF3) |
ZlARF30 | Zlat_10002430 | DBD; MR | scaffold_1824:15174:22474 | 2445 | 814 | 88.00 | 6.77 | 52.03 | R | 10 | AT2G33860.1 (AtARF3) |
ZlARF31 | Zlat_10002256 | DBD; MR; CTD | scaffold_1883:1595:7058 | 2493 | 830 | 92.45 | 6.24 | 62.45 | R | 14 | AT5G62000.5 (AtARF2) |
ZlARF32 | Zlat_10000945 | DBD; MR | scaffold_2488:6240:8497 | 2115 | 704 | 75.94 | 6.30 | 51.76 | F | 2 | AT4G30080.1 (AtARF16) |
ZlARF33 | Zlat_10000277 | DBD; MR | scaffold_3192:2200:4370 | 1551 | 516 | 55.33 | 5.27 | 45.46 | R | 1 | AT1G77850.2 (AtARF17) |
A Names of ZlARF genes in Z. latifolia.
B Gene ID: identifier annotated in the Z. latifolia genome.
C DBD: B3 DNA-binding domain; MR: middle transcriptional regulatory region; CTD: C-terminal domain.
D Scaffold positions in the Z. latifolia genome assembly.
E ORF length: length of the open reading frame in base pairs.
F Length: number of amino acids; MW: molecular weight of the polypeptide; pI: theoretical isoelectric point; instability index.
G Direction (R/F): direction of each ZlARF gene.
H No. of introns: number of introns in each ZlARF gene.
I Homology: the homologous Arabidopsis ARF (AtARF).
The deduced polypeptides are characterized by their length (number of amino acids, aa), molecular weight (MW) and theoretical isoelectric point (pI). The ZlARF proteins varied widely in all three properties (Table S3). ORF lengths ranged from 1065 bp (ZlARF29) to 3423 bp (ZlARF15), and molecular weights ranged from 39.24 kDa (ZlARF29) to 126.44 kDa (ZlARF15) (Table 1). The pI values ranged from 5.27 (ZlARF33) to 9.05 (ZlARF14); 6 ZlARF proteins had a pI above 7 and were alkaline, while the remaining 27 had a pI below 7 and were acidic.
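The ranges quoted for Table 1 can be recomputed directly from the table. The following is a minimal sketch in Python; the numeric values are copied from Table 1, while the helper names (`ROWS`, `extremes`) are introduced here for illustration only. It also checks the assumption that each ORF length equals three times the polypeptide length plus one stop codon.

```python
# (gene number, ORF length in bp, polypeptide length in aa, MW in kDa, pI),
# copied from Table 1.
TABLE1 = [
    (1, 2442, 813, 90.40, 6.49), (2, 2433, 810, 89.09, 5.54),
    (3, 2706, 901, 100.51, 5.86), (4, 2529, 842, 92.99, 6.74),
    (5, 2511, 836, 92.53, 6.69), (6, 3399, 1132, 125.74, 6.06),
    (7, 2442, 813, 90.50, 5.95), (8, 1227, 408, 45.30, 6.25),
    (9, 2454, 817, 91.06, 5.98), (10, 2757, 918, 101.26, 5.86),
    (11, 2676, 891, 99.25, 5.85), (12, 1182, 393, 43.67, 6.27),
    (13, 2088, 695, 74.90, 7.11), (14, 1140, 379, 41.89, 9.05),
    (15, 3423, 1140, 126.44, 6.41), (16, 3183, 1060, 117.69, 7.59),
    (17, 2142, 713, 76.37, 6.15), (18, 2181, 726, 80.79, 5.68),
    (19, 1623, 540, 60.18, 7.68), (20, 2640, 879, 98.07, 5.78),
    (21, 2049, 682, 75.79, 6.18), (22, 2979, 992, 110.89, 5.83),
    (23, 2067, 688, 74.55, 6.44), (24, 2061, 686, 74.94, 6.57),
    (25, 2073, 690, 75.08, 8.18), (26, 2115, 704, 77.63, 7.60),
    (27, 2751, 916, 100.74, 6.02), (28, 2700, 899, 98.93, 6.14),
    (29, 1065, 354, 39.24, 6.11), (30, 2445, 814, 88.00, 6.77),
    (31, 2493, 830, 92.45, 6.24), (32, 2115, 704, 75.94, 6.30),
    (33, 1551, 516, 55.33, 5.27),
]

ROWS = {f"ZlARF{n}": {"orf": orf, "aa": aa, "mw": mw, "pi": pi}
        for n, orf, aa, mw, pi in TABLE1}


def extremes(field):
    """Return ((gene, min value), (gene, max value)) for one Table 1 column."""
    lo = min(ROWS, key=lambda g: ROWS[g][field])
    hi = max(ROWS, key=lambda g: ROWS[g][field])
    return (lo, ROWS[lo][field]), (hi, ROWS[hi][field])


orf_range = extremes("orf")
mw_range = extremes("mw")
pi_range = extremes("pi")

# Proteins with pI above 7 are alkaline, those below 7 are acidic.
alkaline = [g for g in ROWS if ROWS[g]["pi"] > 7]
acidic = [g for g in ROWS if ROWS[g]["pi"] < 7]

# Assumption: the ORF includes one stop codon, so ORF = 3 * (aa + 1).
orf_consistent = all(r["orf"] == 3 * (r["aa"] + 1) for r in ROWS.values())

# Proteins whose length falls between 500 and 1000 aa.
mid_sized = [g for g in ROWS if 500 <= ROWS[g]["aa"] <= 1000]

print("ORF range:", orf_range)
print("MW range:", mw_range)
print("pI range:", pi_range)
print("alkaline:", len(alkaline), "acidic:", len(acidic))
print("ORF lengths consistent with aa lengths:", orf_consistent)
print("500-1000 aa:", len(mid_sized), "of", len(ROWS))
```

Keeping the table values in one place makes it easy to rerun the check if a row of Table 1 is revised.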
Exon-intron, structural domains, conserved motifs and cis-elements of the ZlARF members
The functional motifs and exon-intron positions were analysed in the context of the phylogenetic relationships (Fig. 2, Fig. 3). An unrooted phylogenetic tree built from an alignment of the full-length ZlARF protein sequences indicated that the ZlARFs fall into four major categories, shown in purple, orange, red and green (Fig. 2A). The thirty-three ZlARFs formed 12 orthologous gene pairs, and all pairs were strongly supported by bootstrap tests (≥ 99%). All ZlARF genes were interrupted by introns, and the number of introns varied from 1 to 17 (Fig. 2B). The exon-intron organization showed conserved patterns of gene structure associated with the DBD and MR domains. Within each phylogenetic category, the exon-intron structures were largely consistent. Overall, the highly similar gene structures and motif distributions of the ZlARF members were consistent with their phylogenetic relationships (Fig. 2, Fig. 3). The main structural domains were detected with Pfam. All sequences contained the MR and DBD domains, and 17 sequences (ZlARF1/3/4/5/6/7/9/11/15/16/18/19/20/21/22/28/31) also contained the CTD domain (Fig. 2C). Furthermore, alignment of the homologous domain sequences of the ZlARF proteins revealed that the domain sequences were highly conserved (Fig. S3).
We then used MEME Suite version 5.0.4 to analyse the sequence characteristics of the ZlARF proteins. The analysis identified 20 distinct motifs across the 33 ZlARF proteins. We named these motifs 1–20 and used different colours to distinguish them (Fig. 3). The amino acid sequences of all the motifs were highly conserved, consistent with findings for the Arabidopsis thaliana and Oryza sativa ARF families [26, 40, 41]. The DBD was composed of motifs 1, 2 and 10; the MR domain comprised motifs 4, 6, 9 and 11; and the CTD comprised motifs 7, 8 and 13. All the predicted ZlARF protein sequences shared motifs 1, 2, 4, 5, 6, 9, 10 and 11 (Fig. 3A, Fig. S2). Cis-element analysis showed that several promoters contained elements associated with defence and stress responsiveness, auxin responsiveness, zein metabolism regulation, meristem expression, abscisic acid responsiveness and gibberellin responsiveness (Table S1, Fig. 3B), suggesting roles for several ZlARF genes in responses to auxin and other plant hormones.
Comparison of phylogenetic tree branches and the relationships of AtARFs, OsARFs and ZlARFs
To investigate the evolutionary relationships between the ARF genes of Z. latifolia and those of Arabidopsis and Oryza sativa, we compared 33 ZlARF, 23 AtARF and 25 OsARF genes and constructed a phylogenetic tree (Fig. 4, Table S2). The tree showed that the ARF gene families of Z. latifolia, Oryza sativa and Arabidopsis could be divided into 6 groups, designated groups Ⅰ–Ⅵ (Fig. 4). Groups Ⅰ and Ⅵ constituted the largest clades, together containing 39 ARFs and accounting for 19.75% and 28.40% of the sequences, respectively (Fig. 4B). Interestingly, the numbers of ZlARF proteins in groups Ⅰ, Ⅱ, Ⅲ and Ⅵ were almost the same (Fig. 4B), suggesting that these ARF genes from the three species may have come from a common ancestor. Furthermore, the number of ARFs in Oryza sativa was almost the same as that in Z. latifolia. In contrast, the AtARFs were concentrated in group Ⅳ, where they made up 78.57% of the members (11 of 14) (Table S2, Fig. 4B).
In Fig. 4B, the phylogenetic tree indicated that group Ⅰ contained 7 ZlARF genes (43.75%, 7 of 16), while only 3 AtARF genes (13.04%, 3 of 23) and 6 OsARF genes (24.00%, 6 of 25) were assigned to this group. The other groups were characterized in the same manner: in group Ⅱ, there were 2 AtARFs (8.70%, 2 of 23) and 4 OsARFs (16.00%, 4 of 25); in group Ⅲ, there were 2 AtARFs (8.70%, 2 of 23) and 5 OsARFs (20.00%, 5 of 25); in group Ⅳ, there were 11 AtARFs (47.83%, 11 of 23) and 1 OsARF (4.00%, 1 of 25); in group Ⅴ, there was 1 AtARF (4.35%, 1 of 23) and 1 OsARF (4.00%, 1 of 25); in group Ⅵ, there were 4 AtARFs (17.39%, 4 of 23) and 8 OsARFs (32.00%, 8 of 25) (Fig. 4, Table S2). The results also indicate that most ZlARFs share higher homology with OsARF family members than with AtARF family members (Fig. 4). For instance, ZlARF13, 24, 25 and 33 clustered with OsARF8, 13, 18 and 22 in group Ⅰ, while AtARF10, 16 and 17 formed a separate clade. The outermost histogram shows the number of amino acids in each protein. The ZlARF proteins varied widely in length, from 354 aa (ZlARF29) to 1140 aa (ZlARF15) (Fig. 4A); however, most of them (78.79%) were between 500 and 1000 aa long. We further analysed the orthologous gene pairs in each group. The six groups contained 16, 11, 13, 14, 4 and 23 members, which formed 6, 3, 5, 4, 1 and 8 orthologous gene pairs, respectively. In total, the 81 ARF members formed 27 orthologous gene pairs, of which 17 were ZlARF-OsARF pairs. This indicates that the ZlARFs are more closely related to the OsARFs than to the AtARFs.
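The group composition can be cross-checked with a few lines of arithmetic. The sketch below, in Python, uses only the group sizes and the AtARF and OsARF counts given in the text; the ZlARF count per group is not listed for every group, so it is derived here by subtraction, which is an assumption of this sketch rather than a value read from Table S2.

```python
GROUPS = ["I", "II", "III", "IV", "V", "VI"]

# Members per group and per-species counts as stated in the text.
GROUP_SIZE = dict(zip(GROUPS, [16, 11, 13, 14, 4, 23]))
AT = dict(zip(GROUPS, [3, 2, 2, 11, 1, 4]))   # AtARFs per group
OS = dict(zip(GROUPS, [6, 4, 5, 1, 1, 8]))    # OsARFs per group

# Assumption: ZlARF counts are derived as group size minus AtARFs and OsARFs.
ZL = {g: GROUP_SIZE[g] - AT[g] - OS[g] for g in GROUPS}


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


total = sum(GROUP_SIZE.values())

# Share of all ARFs that falls into each group.
group_share = {g: pct(GROUP_SIZE[g], total) for g in GROUPS}
# Share of each species' ARF family that falls into each group.
at_share = {g: pct(AT[g], sum(AT.values())) for g in GROUPS}
os_share = {g: pct(OS[g], sum(OS.values())) for g in GROUPS}

for g in GROUPS:
    print(f"group {g}: {GROUP_SIZE[g]} members ({group_share[g]}% of {total}); "
          f"Zl={ZL[g]}, At={AT[g]} ({at_share[g]}%), Os={OS[g]} ({os_share[g]}%)")
```

The per-species totals sum to 33, 23 and 25, which matches the family sizes compared in this section.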
Analysis of subcellular localization, synteny, Ka/Ks values and divergence times of the ZlARFs
The subcellular localization of the 33 ZlARF proteins was predicted from their sequences using WoLF PSORT. According to the predictions, the ZlARF family is located mainly in the nucleus (76%, 25 of 33; Fig. 5A). ZlARF1 was predicted to be located in mitochondria; ZlARF6, ZlARF14, ZlARF22 and ZlARF30 in chloroplasts; and ZlARF20, ZlARF29 and ZlARF33 in the cytoplasm. The other 25 ZlARFs were predicted to be located in the nucleus.
Mapping of the ZlARF loci showed that the genes could be placed only on scaffolds, because chromosome-level information was incomplete (Table 1). We therefore pooled all scaffolds into a single class for the collinearity analysis. Further analyses of ARF gene evolution and divergence times among Z. latifolia, Arabidopsis and Oryza sativa showed that a total of 57 orthologous gene pairs exhibited a collinear relationship (10 Z. latifolia - Z. latifolia, 47 Z. latifolia - Oryza sativa; Fig. 5, Fig. 6, Table S3). These results suggested that the ARF genes of Z. latifolia and Oryza sativa were derived from a common ancestor and that the ARF genes of Z. latifolia might function similarly to those of Oryza sativa. In addition, each OsARF gene had 1–3 ZlARF orthologues among the gene pairs (Fig. 6, Table S3), suggesting that a few ZlARF genes had been duplicated through genome duplication. The Ka/Ks values of these gene pairs were all less than 0.45 except for three pairs (ZlARF28-LOC_Os11g32070.1, ZlARF17-OsARF10 and Zlat_10014903-OsARF13; Ka/Ks = 0.65, 0.54 and 0.67, respectively), and the average divergence times were estimated at 12.17 million years ago (Mya) for Z. latifolia - Z. latifolia pairs and 4.16 Mya for Z. latifolia - Oryza sativa pairs (Table S3, Fig. 5B, C). Because all Ka/Ks values were below 1, these results indicated that the ARF gene pairs within Z. latifolia and between Z. latifolia and Oryza sativa had undergone strong purifying selection with limited functional divergence after whole-genome duplication.
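For readers unfamiliar with how Ka/Ks values are interpreted, the following Python sketch may help. It is not the pipeline used in this study: the text does not state how divergence times were computed, so the widely used relation T = Ks / (2λ) is assumed here, together with a substitution rate of 6.5 × 10⁻⁹ per synonymous site per year that is often applied to grasses. The Ka and Ks inputs are hypothetical and are not taken from Table S3.

```python
import math

# Assumption: synonymous substitution rate per site per year (a value often
# used for grasses); the rate used in the study is not stated in the text.
ASSUMED_RATE = 6.5e-9


def ka_ks_ratio(ka, ks):
    """Ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


def selection_mode(ratio, tol=1e-9):
    """Classify selection: <1 purifying, about 1 neutral, >1 positive."""
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "positive"
    return "neutral"


def divergence_time_mya(ks, rate=ASSUMED_RATE):
    """Divergence time in million years, assuming T = Ks / (2 * rate)."""
    return ks / (2 * rate) / 1e6


# Hypothetical Ka and Ks values for one gene pair (illustration only).
ka, ks = 0.052, 0.13
ratio = ka_ks_ratio(ka, ks)
print(f"Ka/Ks = {ratio:.2f} ({selection_mode(ratio)} selection)")
print(f"estimated divergence: {divergence_time_mya(ks):.2f} Mya")
```

Under this convention, every ratio reported in this section, including the three highest (0.65, 0.54 and 0.67), is classified as purifying selection.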
Expression patterns of the ZlARF genes in swollen stem formation
To investigate the physiological roles of the ZlARF genes, real-time PCR was used to measure the expression of individual members of the gene family. Transcript accumulation of the 33 ZlARF genes was evaluated before and after stem formation (Fig. 7). Most ZlARF genes were up- or down-regulated, indicating that they might play a central role in swollen stem formation.
The transcript levels of the ZlARF genes varied greatly between the stages before and after stem formation, suggesting that the ZlARF genes have multiple functions in Z. latifolia stem formation and development. Sixteen ZlARF genes showed marked changes in expression after stem formation, with 8 significantly up-regulated and 8 significantly down-regulated. The ZlARF genes are widely expressed throughout Z. latifolia stem formation, where they may contribute to several aspects of swollen stem formation.
GO, KEGG and interacting network analysis of ZlARFs
All ARF gene members were annotated with Gene Ontology (GO) terms and analysed against the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Fig. 8). GO analysis indicated that the 33 ARF genes were enriched in DNA binding (GO:0003677), nucleus (GO:0005634), regulation of transcription, DNA-templated (GO:0006355) and auxin-activated signalling pathway (GO:0009734); KEGG analysis showed that four genes were annotated to plant hormone signal transduction (ko04075). Although only four genes were assigned to the plant hormone signal transduction pathway by KEGG, the GO analysis annotated all the genes with the auxin-activated signalling pathway (GO:0009734) (Fig. 8A, Table S5). These results suggest that ZlARFs may participate in the activation of auxin signalling to further stimulate stem expansion and growth in Z. latifolia. The cellular component annotation placed the ARF proteins in the nucleus, which is consistent with the earlier subcellular localization prediction. To gain further insight into the roles of these ZlARF genes in stem formation, interaction networks of the ZlARF genes were constructed. Using the STRING database, 23 co-expressed genes were identified by comparison with the Oryza sativa genome (Fig. 8B). Most of the ZlARF genes interact with IAA genes (Aux/IAA family). These genes may act together to regulate stem formation and expansion.
Comparative transcriptome analysis of ZlARF genes in Z. latifolia stem formation
To characterize the expression of all ZlARF genes before and after swollen stem formation in Z. latifolia, we investigated their expression patterns using second-generation RNA-Seq technology. On average, at least 80% of the reads aligned to the genome, which reflects both the quality of the libraries and the relative completeness of the Z. latifolia genome. The RNA-Seq data from stem formation, triadimefon (TDF) treatment and U. esculenta infection of Z. latifolia showed that the ZlARF members exhibited variable expression patterns; several genes were broadly expressed across all treatments, while others exhibited more specific patterns (Fig. 9). The qRT-PCR results from stem formation confirmed the RNA-Seq data. In our study, 10 genes were up-regulated during stem formation, 4 of them with a log2FC value greater than 0.5; 13 genes were down-regulated under TDF treatment, 3 of them with an absolute log2FC value greater than 0.5; and 16 genes were down-regulated under U. esculenta infection, 3 of them with an absolute log2FC value greater than 0.5. The stem length of Z. latifolia was measured during stem formation (Fig. 9A). No stem had formed after TDF treatment. The stem of normal Z. latifolia grew faster than that of U. esculenta-infected male Z. latifolia. Microscopic observation showed that TDF treatment clearly inhibited the formation of U. esculenta, and that U. esculenta infection induced swollen stem formation in male Z. latifolia (Fig. 9B, C, D). Together, these results suggest that some ZlARF genes may play important roles in stem formation after U. esculenta infection. The ARF gene family may be key to stem formation in Z. latifolia. Functional verification of these genes will be the focus of our follow-up research.
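The counts of up- and down-regulated genes above depend on how log2FC is computed and thresholded. The Python sketch below illustrates one common convention and is not the analysis pipeline of this study: the pseudocount of 1, the function names and the expression values are all assumptions introduced for illustration, and the gene labels are hypothetical.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 of treated/control expression; the pseudocount avoids log(0)."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def summarize(expression, threshold=0.5):
    """Group genes by direction of change and by the |log2FC| threshold.

    `expression` maps a gene name to (control, treated) expression values.
    """
    lfc = {g: log2_fold_change(c, t) for g, (c, t) in expression.items()}
    return {
        "up": sorted(g for g, v in lfc.items() if v > 0),
        "up_strong": sorted(g for g, v in lfc.items() if v > threshold),
        "down": sorted(g for g, v in lfc.items() if v < 0),
        "down_strong": sorted(g for g, v in lfc.items() if v < -threshold),
    }


# Hypothetical expression values (control, treated); not data from this study.
example = {
    "geneA": (1.0, 3.0),   # log2(4 / 2)   =  1.0
    "geneB": (7.0, 1.0),   # log2(2 / 8)   = -2.0
    "geneC": (4.0, 4.5),   # log2(5.5 / 5) is small and positive
}

summary = summarize(example)
for label, genes in summary.items():
    print(f"{label}: {len(genes)} gene(s) {genes}")
```

With this convention a gene can count as up-regulated without passing the 0.5 threshold, which mirrors the way the text reports a total and a thresholded subset for each treatment.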