Genome-Wide Identication Analysis of The Auxin Response Factor Gene Family (ARF) in Zizania Latifolia and its Role in Swelling Stem Formation After Ustilago Esculenta Infection

Background: Auxin signalling plays a crucial role in plant growth and development. Although the auxin response factor (ARF) gene family has been studied in some plant species, its structural features, molecular evolution, and expression prole in Zizania latifolia (Z. latifolia) are still not clear. Results: Our study identied 33 putative YSL genes from the whole Z. latifolia genome. Furthermore, a comprehensive overview of the ZlARFs was undertaken, including phylogenetic relationship, gene structures, conserved domains, synteny, Ka/Ks, motifs, and subcellular locations of the gene product. Synteny analyses and the calculation of Ka/Ks values suggested that all 57 orthologous/paralogous gene pairs between Z. latifolia and Z. latifolia, Z. latifolia and Oryza sativa have experienced strong purifying selection. The phylogenetic analysis of ARFs indicated that the ZlARFs can be divided into 6 classes and that most ZlARFs from Z. latifolia have closer relationships with Oryza sativa than with Arabidopsis. RNA-Seq data and qRT-PCR analyses showed that ZlARF genes were expressed in TDF treatment and U. esculenta infection, while some ZlARFs exhibited high expression levels only in U. esculenta infection. Meanwhile, the interaction networks and gene ontology (GO) term of the ZlARF genes were constructed and 23 ZlARF co-expressed genes were identied, most of which were down-regulated involve auxin-activated signalling pathway in after swelling stem formation. Transcriptome analysis results veried the relevant functions of ARF genes, and most ZlARF genes regulated physiological processes in response to differential cell expansion. Conclusion: Comprehensive bioinformation analysis of the auxin response factor gene family (ARF) in Z. latifolia and its association with swelling stem formation after U. esculenta infection. The bioinformatic and RNA-Seq analyses provided valuable information for further study on the regulation of the growth and development of swelling stem formation by ZlARFs in Z. latifolia. an adapted exploratory yeast one hybrid (Y1H) with a core TGTCTC motif used as a decoy sequence [23]. Further, a recent report highlighted the functional importance of the MR. This report indicated that the intrinsically disordered middle region and the PB1 interaction domain of the ARFs drive protein assembly formation [18]. The DBD is a highly conserved structural domain found exclusively in transcription factors that enables proteins of higher plants to interact with domains in other proteins [24]. The MR between DBD and CTD activates or suppresses transcription according to its amino acid composition [25]. 1). We rebuild all scaffold information into one class for collinear analysis. Further analyses of ARF gene evolution and divergence times among Z. latifolia, Arabidopsis and Oryza sativa showed that a total of 57 orthologous gene pairs exhibited a collinear relationship (10 Z. latifolia - Z. latifolia, 47 Z. latifolia - Oryza sativa; Fig. 5, Fig. 6, Table S3). These results demonstrated that the ARF genes of Z. latifolia and Oryza sativa appeared to be derived from a common ancestor and that the function of these ARF genes of Z. latifolia plants might be the same as those of Oryza sativa. In addition, among the orthologous gene pairs, each OsARF gene presented 1–3 ZlARF orthologous genes (Fig. Table S3), suggesting that a few ZlARF genes had been duplicated by genome. The Ka/Ks values of these gene pairs were all less than 0.45 except for three pair (ZlARF28-LOC_Os11g32070.1, ZlARF17-OsARF10, Zlat_10014903-OsARF13, Ka/Ks = 0.65, 0.54, 0.67), and the average divergence times were estimated to be 12.17 million years ago (Mya, Z. latifolia - Z. latifolia) and 4.16 Mya (Z. latifolia - Oryza sativa) (Table S3, Fig. 5B, C). These results demonstrated that the ARF gene pairs shared between Z. latifolia and Z. latifolia, Z. latifolia and Oryza sativa had undergone strong purifying selection with limited functional divergence after whole-genome duplication. indicating that they might play a central role in swelling stem formation. The results showed that the transcriptional signicant of the ZlARF genes varied greatly in before and after stem formation, suggesting that the ZlARF genes had multiple functions in Z. latifolia stem formation and development. 16 ZlARF genes were remarkable expressed after stem formation. Meanwhile, the expression of 8 ARFs were signicantly up-regulated and 8 down-regulated. The ZlARFs gene is widely expressed throughout the process of Z. latifolia stem formation, where it may play a certain role in many aspects of swelling stem formation.

It is reported that the ARF proteins are encoded by a many family and are conserved throughout the evolution of the plant kingdom, and since the expansion of the family seems to be related to the evolution and diversity of plants, the function of the ARFs has been thoroughly studied [26,27] In recent years, microarray assays has shown that the Arabidopsis AtARF1 and AtARF5 domains are have greater similarity to domains with a TGTCGG sequence than to those with the AuxRE TGTCTC element [16]. Many functions of the ARF gene interaction mechanisms and their regulation of biological process, including upregulated growth levels, in Z. latifolia need to be explored Taking advantage of the genomic identi cation of the Arabidopsis thaliana ARF gene (AtARF) family, many researchers have found that the Arabidopsis ARF1 and ARF2 molecules function as transcriptional inhibitors connected with the regulation of plant development, organogenesis and growth. [28,29]. Meanwhile, the ARF family has been found in different species (Zea mays, Oryza sativa and so on), the identity of which has been discovered and continuously veri ed. Sequences derived from massive numbers of sequencing projects are conducive to functional research, offering a resource for discovering transcription factor families. A differential genome analysis of species showed that Oryza sativa, Arabidopsis, Hordeum vulgare, Solanum lycopersicum, Setaria italica, tobacco, Vitis vinifera, and Zea mays have 25,23,17,17,24,50,20 and 31 ARF family members, respectively [26,[30][31][32][33][34][35][36].
Auxin response factors (ARFs) are plant-specific transcription factors that couple the perception of the hormone auxin to gene expression programmes essential to all land plants [37]. The complete genome sequencing of Z. latifolia was published in 2015 and has been widely publicized, laying a solid foundation for research on Z. latifolia genes [3]. However, the ARF gene family has not been identi ed in Z. latifolia. The identi cation and name of an ARF gene in this species will provide a solid foundation for future research. The recent development of the RNA-Seq method has led to not only greater information on genome-wide gene expression but also to analyses with advantages such as higher sensitivity and greater and more dynamic ranges of gene expression patterns and functions to research [38]. The complete genomic sequence of Z. latifolia provides a precious resource for analysing and comparing members of the ARF gene family in terms of evolution at different stages. The genome-wide analysis indicated that the dominant auxin-responsive gene family in plants has various nuclear localization signals and dual origins [39]. Because ARF genes are critical factors in plant growth and development, exploring these genes will aid to information for the comprehensive understanding of the auxin changes in Z. latifolia.
In this study, we performed a systematic analysis of the ARF family in Z. latifolia. The biophysical properties of ZlARF genes were determined. Furthermore, RNA-Seq data from three stem formation were analysed. A comprehensive bioinformatic analysis of ZlARF genes was employed, including an analysis of the conserved motifs, exon-intron structure, phylogenetic tree, conserved domains, synteny, Ka/Ks values and divergence times, expression patterns of 33 ZlARFs. Meanwhile, the homology of each ZlARFs gene family member and relative gene expression levels during swelling stem formation were determined. Samples with before and after stem formation were selected according to their auxin level for transcriptome analysis. By transcriptome sequencing, we explored the relevant expression and functional pro les before and after swollen stem formation, TDF (Triadimefon)-treatment and U. esculenta infection in Z. latifolia. Our results suggested that ZlARFs genes provide valuable information for further studies on the mechanism of swelling stem formation in Z. latifolia.

Plant materials and Treatments
To explore the changes in ARF before and after stem formation, we identi ed the ARF gene family and analysed the relationship between the evolution of the gene family and gene expression. The speci c analysis was follows: The single-season cultivar (Z. latifolia Turcz., cv. 'Dabieshan No. 1') was selected as the experimental material (Yuexi County Longjing Ecological Agriculture Development Co., Ltd., Anhui, China) ( Fig. 1).
For subsequent experiments, stem samples were collected at 153 and 159 days after transplantation, with stem diameters about 5cm, respectively.
Meanwhile, we used two-method to TDF -treatment (80 mg L -1 ) and U. esculenta infection in Z. latifolia. Triadimefon (TDF, CAS: 43121-43-3) is widely used in agriculture and medicine as fungicide, as it can prevent fungal colonization and inhibit the formation of stems and hyphae. Z. latifolia was formed by U. esculenta infection, we could stimulate the expansion of the stem formation through injection of male Z. latifolia. All swelling stems were harvested at appropriate time point, and each sampling was selected in triplicate using freeze tube storage. All samples were placed in cryopreservation tubes and immediately followed by liquid nitrogen storage in a refrigerator at -80℃ for RNA extraction and some index determination. All experiments were performed in three replicates.
Genome-wide identi cation of ARFs gene family members in Z. latifolia The genome annotations and sequences of Z. latifolia were downloaded using the Z. latifolia genome database (Z. latifolia, http://ibi.zju.edu.cn/ricerelativesgd), and a local BLASTP database was built using the BioEdit tool. The comprehensive identi cation of Z. latifoliaARF gene family members were achieved using all AtARFs from the Arabidopsis genome database. The ZlARF gene family members were identi ed by a BLASTP search [60]. Each new query parameter for the AtARF sequence was set to an E-value ≤ 10 −10 to avoid the loss and addition of orthologues. The domains of obtained ZlARF proteins were also further veri ed using NCBI-Conserved Domain database (https://www.ncbi.nlm.nih.gov/Structure/) search program and SMART databases (http://smart.emblheidelberg.de/) [61]. Those proteins which lack ARF domain were removed from further analysis. In addition, protein sequences that were found with obvious errors in their gene. All the candidate genes that might contain the ARF domain were further veri ed. Furthermore, the query sequence was used for further analysis of the structural domain using the Pfam program (http://pfam.xfam.org). Finally, 33 ARF gene models were identi ed in the Z. latifolia genome for further analysis.
Related annotation information about the full-length sequences, chromosome location, deduced polypeptides, and CDS sequences were acquired from the Z. latifolia genomic database using the UltraEdit tool. The sequences containing DBD and MR domains were screened. The PI/Mw was calculated with an ExPASy server to predict the molecular weight of the polypeptide (Mw), instability index, theoretical isoelectric point (PI) and amino acid number (aa) of each ZlARF protein sequence (https://web.expasy.org/protparam/). Finally, we identified and isolated 33 Z. latifoliaARF genes (ZlARFs). These ARF genes are named by chromosomal location. The N-terminal signal peptide of the 33 ZlARFs protein sequences was predicted using the WoLF PSORT [62] server (https://wolfpsort.hgc.jp/), which predicts genes subcellular locations.
Interspeci c phylogenetic tree analysis, classi cation of ZlARF genes The phylogenetic tree was constructed from the 33 ZlARFs (Z. latifolia), 23 AtARFs (Arabidopsis thaliana) and 25 OsARF (Oryza sativa) protein sequences described above using the latest MEGA-X software and the neighbor-joining (NJ) method, and the 1000 replicates bootstrap test was adopted using the ARF protein sequences [63], save the NWK format le. To beautify the evolutionary tree and put the composition of amino acids into the evolutionary tree by the online tool iTOL [64]. The classi cation of the ZlARFs gene were classi ed by comparing them to the ARF of ZlARFs, AtARFs and OsARFs. Meanwhile, 33 ZlARFs evolutionary tree also was constructed using the same method.

Analysis of motif composition structural domain of ZlARFs and cis-elements predictions
The genomic sequence and GFF3 le of the 33 ARF genes were downloaded from the Z. latifolia genomic database, and the intro-exon number patterns were detected through the GSDS 2.0 website as previously described [65]. These sequences were used for further alignment of the structural domains, namely, the DBD, ARF and CTD domains, at the Pfam website (http://pfam.xfam.org) [66]. MEME Suite Version 5.0.5 (http://meme-suite.org/) was used to analyse the conserved motifs of the ZlARFs gene family, and the parameters are set as follows: motif site distribution was zero or one per sequence (ZOOPS), the maximum number of motifs that that MEME should nd was set to 30, sites per motif was set to be between 2 and 33 and motif width was between 6 and 50. A condensed phylogenetic tree was constructed by the construct/test neighbour-joining (NJ) tree method using the MEGA-X software (Version 10.0.5) and a 1000 replicate bootstrap test. Tbtools (Version:1.047) (Chen C, 2020) was used for visual analysis. The promoter regions (3.0 kb upstream sequences) of all YSL genes were used to predict cis-acting elements in the PlantCARE database [67]. Tbtools (v1.047) was used for all visual analysis [68].

Collinearity analyses and inference of divergence time
Synteny analysis among the Z. latifolia, Arabidopsis thaliana and Oryza sativa genomes were conducted via all against all BLASTP comparisons. Z. latifolia genome was scaffold information and no speci c chromosome. To represent the co-progressivity relationship, combine all chromosome locations into one marked scaffold. We compared gene pair to Z. latifolia-Oryza sativa and Z. latifolia-Z. latifolia. The all protein sequences were used as queries to search the corresponding protein sequence data (E < 1*e − 5 , top 5 matches). Synteny pairs were extracted using MCScanX [69] to identify syntenic blocks and duplications within the ARFs. For synteny analysis, Circos tool [70] and TBtools was used to mark the identi ed positions on the chromosomes. The rates of synonymous (Ks) and nonsynonymous (Ka) substitutions were calculated for duplicated ARF genes with the Ka/Ks calculator. Ks could be used as a proxy for time to estimate the dates of the segmental duplication events. Ks values > 2.0 were discarded to avoid the risk of substitution saturation. The Ks value was calculated for each of the gene pairs and then used to calculate the approximate date of the divergence time (T = Ks/2λ), divergence rates λ = 1.5*10 -8 for Z. latifolia species [71].
Transcriptome (RNA-Seq) analysis, cDNA library construction Total RNA from the 12 stem samples taken before and after stem formation, TDF-treatment and U. esculenta infection were extracted using the mirVana miRNA isolation kit (mirVana™ miRNA isolation kit, Ambion-1561) following the manufacturer's protocol. RNA integrity was evaluated using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). The samples with an RNA integrity number (RIN) ≥ 7 were subjected to the subsequent analysis. The libraries were constructed using the TruSeq Stranded mRNA LT sample prep kit (Illumina, San Diego, CA, USA) according to the manufacturer's instructions. Then, these cDNA libraries were sequenced on an Illumina sequencing platform (HiSeqTM 2500 or an Illumina HiSeq X Ten), and 125 bp/150 bp paired-end reads were generated. In addition, the p-values < 0.05 & |log2 (fold change)| ≥ 0.5 was set as the threshold for signi cantly differential expression.
The gene expression of Z. latifolia was pro led by RNA-Seq datasets from the NCBI Gene Expression Omnibus (GEO) and Sequence Read Archive (SRA). Those datasets could be retrieved from the SRA database by searching BioProject id PRJNA551367.

Heatmap and microscopic observations
A hierarchical clustered heatmap for ARF genes was plotted with the heatmap package [72], wherein "Euclidean" distance and "Complete" cluster were used for both row-based and column-based clustering. To study the interaction relationships of the above experimental groups, their respective swollen stem samples were stained using aniline blue (CAS: 28631-66-5, Solarbio), according to an improved method [73]. Fresh stem tip samples of Z. latifolia tissue was sliced and xed with a uid containing the Carnot xative (glacial acetic acid and alcohol, volume ratio of 1:3) and put into a 2-mL centrifuge tube, stored in refrigerator at 4°C for 24 h. To it was added the right amount 10% KOH solution, after which it was subjected to 85°C high-temperature treatment lasting 1 h; washed several times during the KOH solution, after processing the used for subsequent dyeing observation. The slices were transferred into a petri dish containing aniline blue dye, and stained on a horizontal shaker for 5-10 min. After decolorization with 75% alcohol, the slices were observed and photographed under a uorescence microscope (Olympus, Japan, SZX10).

Functional annotation, KEGG enrichment pathway analysis and interacting network
Differentially expressed genes (DEGs) were subjected to GO and KEGG enrichment (http://www.genome.jp/kegg/) analyses to characterize their biological functions and signi cantly enriched metabolic pathways or signal transduction pathways. The DEGs were submitted to the GO analysis of enrichment using the GO-Seq R package based on Wallenius' noncentral hyper-geometric distribution [74]. The KOBAS 3.0 [75] website (http://kobas.cbi.pku.edu.cn/index.php) was used to test statistically the enrichment of the DEGs in KEGG pathways. The interaction network associated with ZlARF genes was constructed using the STRING database [76] and Cytoscape software [77]. The STRING tool (http://string-db.org/, version 11.0) was analysed 33ARF interacting network. We constructed an interactive network of Z. latifolia by comparing Oryza sativa genome used red to marked in the gure.
RNA extraction, cDNA synthesis and quantitative real-time PCR analysis To explore the expression patterns of 33 ARF genes were performed before and after swelling stem formation in Z. latifolia using real-time qPCR. Total RNA samples were isolated from different Z. latifolia stages using an RNA-prep pure plant kit (Tiangen, DP432). The DNase-treated RNA was reverse transcribed for cDNA using the Prime Script™ RT reagent kit (TaKaRa, Japan). qRT-PCR was then performed with the TB Green™ Premix Ex Taq™ II (TaKaRa, RR820A). The ZlARFs gene primers were designed using Oligo 7.60 software. qRT-PCR ampli cation was performed with a Bio-Rad CFX96™ system according to the manufacturer's protocol (Bio-Rad, CA, USA). The relative expression level of the Zl-Actin gene was used as an internal reference for data standardization. Three biological replicates of each sample were subjected to qRT-PCR, and the relative gene expression levels were calculated with the 2 -ΔΔCt method [78]. The realtime qPCR primer sequences are shown in the additional information (Table S4). The result is displayed as the mean ± SD (n = 3 biological replicates).

Statistical analysis
The statistical analysis of the relative expression data was performed using SPSS software. Every sample was based on three biological replicates. The SPSS was used for the analysis of signi cance (Duncan test signi cant: * P < 0.05; highly signi cant: ** P < 0.01).

Results
Genome-wide identi cation of the ZlARFs gene family in Z. latifolia To identify the ZlARFs gene family of Z. latifolia, we performed BLASTP program searches of the Z. latifolia genome databases by ARF sequence summarized in the Oryza sativa and Arabidopsis thaliana data sets (Fig. S1). Using each predicted protein sequence as a query to search, 33 potential ARF proteins sequence were discovered. All ZlARFs family members contain the DBD and MR conserved structural domains by Pfam and SMART website to identi cation. On the basis of our analysis, a generic ZlARF naming system used, from ZlARF1 to ZlARF33 by chromosome location, which distinguished each ARF gene according to the Z. latifolia scaffold information. All ZlARFs with the DBD and MR domain are summarized in Table 1. Overall, the analysis showed that there were 33 ZlARFs members in the genome of Z. latifolia.  The deduced polypeptides are characterized by three types of information: length (number of amino acids, aa), molecular weight and theoretical PI. The ZlARF proteins showed wide variation in their length, molecular weight (MW) and isoelectric point (pI) ( Table S3). ZlARF ORF lengths ranged from 1065 (ZlARF29) to 3399 bp (ZlARF6) and the molecular weights ranged from 39.24 (ZlARF19) to 126.44 kDa (ZlARF15) ( Table 1). The PI range from 5.54 (Z1ARF2) to 9.05 (Z1ARF14), among which 6 ARF protein PI more than 7 were alkaline, and the remaining 26 ARF genes less than 7 were acidic.

Exon-intron, structural domains, conserved motifs and cis-elements of the ZlARF members
The functional motifs and gene exon-intron positions were analysed based on evolutionary tree relationships (Fig. 2, Fig. 3). The full-length protein sequences alignments from all the ZlARFs gene products generated unrooted phylogenetic trees, which indicated that all the ZlARFs can be divided into four major categories, which are presented by colour (purple, orange, red, and green) by branching the evolutionary tree ( Fig. 2A). Thirty-three ZlARFs formed 12 orthologous gene pairs, and all pairs were solidly supported by the bootstrap tests(N > = 99%). The results indicated that all ARF genes were interrupted by introns, and the number of introns varied from 1 to 17 (Fig. 2B). The exon-intron organization suggested that all ZlARF genes showed conserved patterns of the highly similar gene structures and motif distributions of the ZlARF members were consistent with their phylogenetic relationships (Fig. 2, Fig. 3). The main structural domains were detected by Pfam. In addition, all sequences contained the MR and DBD domains, among 17 sequences (ZlARF1/3/4/5/6/7/9/11/15/16/18/19/20/21/22/28/31) contained the CTD domains (Fig. 2C). Furthermore, the sequence alignment of the homologous domain sequences of the ZlARF proteins revealed that the domain sequences were highly conserved (Fig. S3).
We then used MEME suite version 5.0.4 to analyse the sequence characteristics of the ZlARF protein sequences. The results showed that there were 20 independent motifs in the 33 ZlARFs gene. We named these motifs 1-20 and used different colours to distinguish the motifs (Fig. 3). We found that the amino acid sequences of all the motifs were highly conserved, a nding consistent detected Arabidopsis thaliana and Oryza sativa ARF families [26,40,41].

Comparison of phylogenetic tree branches and the relationships of AtARFs, OsARFs and ZlARFs
To investigate the evolutionary relationships between ZlARFs gene in Z. latifolia and we compared 33 ZlARFs, 23 AtARF and 25 OsARF genes and constructed phylogenetic trees (Fig. 4, Table S2). The phylogenetic trees showed that the ARF gene families in Z. latifolia, Oryza sativa and Arabidopsis could be divided into 6 groups, designated group - (Fig. 4). Group and Group constituted the largest clades, containing a total of 39 ARFs, and accounted for 19.75 and 28.40% of the sequences, respectively (Fig. 4B). Interestingly, the ZlARF protein numbers in Groups , , and were almost the same (Fig. 4B), indicating that these ARF genes from the three species plants may have come from a common ancestor. Furthermore, we also found that the number of ARF in Oryza sativa was almost the same as that of Z. latifolia. However, Arabidopsis produced an aggregation rate of up to 75.57% in the group (Table S2, Fig. 4B).
In the Fig. 4B, the phylogenetic tree indicated that there were 6 ZlARFs gene in group ( (Fig. 4, Table S2). The results also indicate that most of the ZlARFs share high homology with OsARFs family members than they do with AtARFs family members, as shown by the evolutionary tree (Fig. 4). For instance, ZlARF13, 24, 25, 33 were clustered with OsARF8, 13,18,22 in group , while AtARF10, 16, 17 were clustered into a separate clade. The outermost histogram shows the number of amino acids. The amino acid peptide is 354 (ZlARF29)-1140 (AtARF15). The ZlARF proteins showed wide variation in their length (Fig. 4A). However, the total number of amino acids was concentrated between 500 and 1000, ratio of 78.79%. Further analysis of the composition results by orthologous gene pairs. Every group was further subdivided into 16,11,13,14,4,23  Analysis of subcellular localization, synteny, Ka/Ks values and divergence times of the ZlARFs ARF protein sequences subcellular localization based on WoLF PSORT. In the current study, the N-terminal signal peptide prediction for the 33 ZlARFs was performed using the WoLF PSORT signal peptide prediction program. According to the prediction results, we found that the ZlARFs family is located mainly in the nucleus with a proportion of 76% (Fig. 5A). ZlARF1 was located in mitochondria. ZlARF6, ZlARF14, ZlARF22, and ZlARF30 were located in chloroplasts. ZlARF20, ZlARF29, and ZlARF33 were located in the cytoplasm, and the other 25 ZlARFs were located in the nucleus.
The mapping of the ZlARF genes loci showed that an inconsistent distribution of the genes with only scaffold information, chromosomal was incomplete (Table 1). We rebuild all scaffold information into one class for collinear analysis. Further analyses of ARF gene evolution and divergence times among Z. latifolia, Arabidopsis and Oryza sativa showed that a total of 57 orthologous gene pairs exhibited a collinear relationship (10 Z. latifolia -Z. latifolia, 47 Z. latifolia -Oryza sativa; Fig. 5, Fig. 6, Table S3). These results demonstrated that the ARF genes of Z. latifolia and Oryza sativa appeared to be derived from a common ancestor and that the function of these ARF genes of Z. latifolia plants might be the same as those of Oryza sativa. In addition, among the orthologous gene pairs, each OsARF gene presented 1-3 ZlARF orthologous genes (Fig. 6 Fig. 5B, C). These results demonstrated that the ARF gene pairs shared between Z. latifolia and Z. latifolia, Z. latifolia and Oryza sativa had undergone strong purifying selection with limited functional divergence after whole-genome duplication.

Expression patterns of the ZlARF genes in swollen stem formation
To investigate the physiological roles of the ZlARF genes, the real-time PCR technique was used to detect the spatial expression of individual members of the gene family. The accumulation of the transcriptional products of 33 ZlARF genes in the before and after stem formation were evaluated (Fig. 7). Most ZlARFs gene were up/down-regulated, indicating that they might play a central role in swelling stem formation.
The results showed that the transcriptional signi cant of the ZlARF genes varied greatly in before and after stem formation, suggesting that the ZlARF genes had multiple functions in Z. latifolia stem formation and development. 16 ZlARF genes were remarkable expressed after stem formation. Meanwhile, the expression of 8 ARFs were signi cantly up-regulated and 8 down-regulated. The ZlARFs gene is widely expressed throughout the process of Z. latifolia stem formation, where it may play a certain role in many aspects of swelling stem formation.

GO, KEGG and interacting network analysis of ZlARFs
All ARF gene members were annotated for identi cation of GO term and encyclopedia of genes and genomes (KEGG) analysis (Fig. 8). GO analysis indicated that 33 ARF genes were enriched in DNA binding, nucleus, regulation of transcription, DNA-templated, auxin-activated signalling pathway (GO: 0003677, GO: 0005634, GO: 0006355, GO: 0009734); KEGG analysis showed that four genes were annotated plant hormone signal transduction (ko04075). Although only four genes were involved in signalling pathways for plant hormone signal transduction. However, the GO analysis found that all the genes were annotated with auxin-activated signalling pathway (GO:0009734) (Fig. 8A, Table S5). The results showed that ZlARFs may participate in the activation of auxin signal to further stimulate the expansion and growth of stem in Z. latifolia. Cellular component found that the ARF gene is located in the nucleus, which is consistent with the previous subcellular prediction. To gain further insight into the roles of these ZlARF genes in stem formation, interaction networks of the ZlARF genes were constructed. From the STRING database, 23 co-expressed genes were identi ed by compare to Oryza sativa genome (Fig. 8B). Most of the ZlARF genes interact with IAA genes (Aux/IAA family). These genes may together to regulate stem formation and expand.
Comparative transcriptome analysis of ZlARF genes in Z. latifolia stem formation To determine the characteristics of all ZlARFs gene according to their expression before and swollen stem formation in Z. latifolia, we investigated the expression patterns of the genes using second-generation RNA-Seq technology. The percentages of the reads aligned to the genome were, on average, greater than or equal to 80%, which signi es both the quality of the libraries and the relative completeness of the Z. latifolia genome. The RNA-Seq data from stem formation, TDF-treatment and U. esculenta infection of Z. latifolia showed that the ZlARF members exhibited variable expression patterns; several genes were broadly expressed in all different treatment, while some members were exhibited a more speci c pattern (Fig. 9). The qRT-PCR results from stem formation con rmed the RNA-Seq data. In our study, 10 up-regulation in stem formation among 4 log2FC value greater than 0.5; 13 down-regulation in TDF-treatment among 3 log2FC value greater than 0.5; 16 down-regulation in U. esculenta infection among 3 log2FC value greater than 0.5. The length of stem formation of Z. latifolia was measured (Fig. 9A). The stem had not yet formed after TDF (triadimefon) treatment. The stem length of normal Z. latifolia was faster than that of U. esculenta infected male Z. latifolia. By microscopic observation, TDF treatment can obviously inhibit the formation of U. esculenta. U. esculenta infection can make male Z. latifolia swollen stem formation (Fig. 9B, C, D). Together, these results suggested that some ZlARF genes may play important roles in stem formation after U. esculenta infection. ARF genes family may be key to stem formation of Z. latifolia. Functional veri cation of these genes will be the focus of our follow-up research.

Discussion
Auxin response factor plays an important role in swelling stem formation It is well known that Z. latifolia has high levels of nutrient and protein content, primarily due to cell division and swelling stem elongation [42]. Auxin involve in fungal infection and host de nes responses are crucial events involved in plant-fungus interactions [43]. Auxin is an important plant hormone and because control of cell expansion and proliferation is closely related to the ARF family of plant hormone signalling pathways [44]. We need to understand the role of these hormones in the gene regulation network during auxin-regulated development, and these genes can play valuable roles during genetic evolution and improvement [45]. Exogenous auxin can increase the yield and photosynthesis capacity in stem formation. lead to the increase of endogenous IAA content, so it is of great signi cance to study the expression of ARF family genes [8]. Base on qRT-PCR, we found that the relative expression levels of many genes of ARF genes were signi cantly expression (Fig. 7). To elucidate the role of the ZlARF gene in growth regulation, auxin signal transduction and plant hormone response in Z. latifolia. We described the main structural characteristics and relative expression model of the ARF gene families by bioinformation.
Comprehensive analysis of ARF genes in stem formation of Z. latifolia Through our research, a search for ARF genes in the Z. latifolia genome resulted in the identi cation of 33 members (Table 1). Among the 33 ZlAEFs, 16 members not included CTD domain, which generally lacked partial sequences in anking region. All of these proteins are nucleus proteins (Fig. 5A) and exhibit similar patterns of the main motif distribution base on phylogenetic analysis (Fig. 3A). Subcellular localization is the main content of our future validation. We found that the nucleus is one of the sites with the greatest concentration of ZlARF. Conserved C-terminal region includes a CTD domain that can form ARF/Aux-IAA heterodimers and ARF/ARF heterodimers and ARF/ARF homodimers through a phytohormone auxin reaction [46,47]. This nding is in accordance with the results of an Oryza sativa [26] analysis, suggesting that these ZlARFs may be regulated in an auxin-independent manner. Meanwhile, we compared the evolutionary relationship, synteny, Ka/Ks values and divergence times of ARF genes in Z. latifolia, Oryza sativa and Arabidopsis, and found that the genes in Z. latifolia and Oryza sativa were the most similar (Fig. 4). Based on phylogenetic analysis, the intron and exon distributions were found to be basically the same in in the ZlARFs gene of the same group (Fig. 2). In other words, the distribution of a gene family within a gene structure can also re ect the relationship between phylogenetically related genes [30,48].

Relevant functions and expression patterns analysis
Our analysis of the genetic structure and motifs in ZlARF will help with understanding its different roles in swelling stem formation. The feature of these motif and cis-elements was important for predicting the transcription factor function [17]. In addition, the motif distributions of some closely related sequences are similar, suggesting that these ZlARF families may have similar functions (Fig. 3). In general, the similarity of most ZlARFs gene structures and motif compositions in each group supports the ndings from the phylogenetic analysis.
All predicted members of the ZlARF families were divided into six different colour-coded groups (group 1-group 6) (Fig. 4), in a manner similar to the classification scheme used for Arabidopsis [49] and Oryza sativa [26]. Interestingly, the ZlARF protein numbers in Groups , , and were almost the same (Fig. 4B), indicating that these ARF genes from the three species plants may have come from a common ancestor [50]. These results revealed that the ARF gene family is affected by polyploidy and has experienced segmental and genome duplication in Z. latifolia species.
In the evolutionary analysis, we found that each group had different genes from Oryza sativa and Arabidopsis, indicating that the ARF gene was mutated before the three species evolved and diversi ed. Further evolutionary analysis showed that there were 27 orthologous gene pairs combined in Z. latifolia and Oryza sativa. In the synteny analysis, there were 57 orthologous gene pairs, including the 47 ZlARF-OsARF pairs and 10 ZlARF-ZlARF (Fig. 5, Fig. 6). This result shows that the relationship between Z. latifolia and Oryza sativa is closer than that between Z. latifolia and Arabidopsis, and the ARF family was better preserved during the evolution of these three plants. In addition, we found that the OsARFs family was evenly distributed, and in Arabidopsis, they were mainly found in group gene clusters. This suggests that the genes in this population may have been acquired by Z. latifolia or may have been lost from the Oryza sativa lineage [51] after differentiation of the common ancestor of Oryza sativa and Z. latifolia [52]. Additionally, we estimated the divergence times of the orthologous ARF pairs between Z. latifolia-Z. latifolia and Z. latifolia-Oryza sativa, which showed Ka/Ks ratios less than 0.45 and indicating the occurrence of strong purifying selection (Fig. 5, Table S3), re ecting the strong control exerted over these genes in evolution [53]. Moreover, the average divergence times of Z. latifolia-Z. latifolia and Z. latifolia-Oryza sativa were estimated to be 12.70 and 4.16 Mya, respectively, indicating that Z. latifolia and Oryza sativa shared a common ancestor and exhibit different differentiation times [54]. These results imply that the ARF gene family might be a good candidate molecular reference for analysis of plant evolution.
Our study found that most duplicated ZlARFs were expressed in different period and treatment, suggesting that these genes had speci c or cellular functions. According to GO and analysis, most of them are located in transcription, DNA-templated, auxin-activated signalling pathway, nucleus and plant hormone signal transaction [14,15] (Fig. 8). PPI showed that ARF and IAA could together to improve plant development [55] and stem formation [56]. In the current study, we used qRT-PCR and RNA-Seq to explore the expression pro le of the ZlARF gene before and after swelling stem formation in Z. latifolia. (Fig. 7, Fig.  9). The expression of the ZlARF gene during Z. latifolia development was rapidly changed compared to that before stem formation in Z. latifolia. Among stem formation, the gene expression levels of different 33 ZlARF genes clearly differed, and some of these genes were very highly and speci cally expressed. These results were similar to those obtained in peach [57] and sweet orange [58], suggesting that some ARF genes play important roles in stem developmental processes. The expression pro le of a gene is often related to its function. Since ARFs regulate auxin biological processes based on gene expression, we hypothesized that the ZlARF gene responds to the formation of swollen stem.
Then, we assayed 6 RNA-Seq from stem formation, TDF treatment and U. esculenta infection, and found that 33 ZlARF genes were expressed. Among them, the down-regulated expressions of ZlARF9, ZlARF19 and ZlARF30 were found stem formation and U. esculenta infection, while the up-regulated expressions were found after TDF treatment (Fig. 9). The results showed that the symbiosis between Z. latifolia and U. esculenta could mediate the expression of ARF gene and in uence the stem formation [59]. However, further research is required to investigate the detailed regulatory mechanisms of ZlARF9, ZlARF19 and ZlARF30 related to induced U. esculenta stimulating stem formation of Z. latifolia. In summary, 33 ZlARF gene family was identi ed based on genome-wide, and related functions were predicted by bioinformatics analysis. These ndings provide a solid foundation for an insight into the potential function of the ZlARF gene. Comprehensive analysis was helpful to explore ARF genes for further functional identi cation and the yield provides a theoretical basis of Z. latifolia.

Conclusions
In this study, we identi ed 33 ZlARF gene family members in Z. latifolia species. Base on the Z. latifolia genomic database, we have gained new insight and comprehensively evaluate about the gene structures, cis-elements, synteny and evolutionary relationships of the Z. latifolia ARF genes. Additionally, expression pattern assays based on RNA-Seq and qRT-PCR data revealed that ZlARFs present after stem formation patterns and are involved in stem growth and expand, especially auxin-activated signalling pathway. Combined analyses of RNA-Seq datasets from different of Z. latifolia demonstrated that 33 ZlAREF genes were up/downregulated in the U. esculenta infection, possibly indicating crucial roles in auxin regulation and transport. Furthermore, the coexpression networks assay exhibited that only ZlARFs and IAAs can be detected the co-expressed genes among these ARF genes. These results provided basic information for studying the mechanism of ARF function in the stem formation in Z. latifolia. In the future, more experimental and bioinformatics work is needed to fully understand the functions of these important candidate genes and the regulatory mechanisms of some important stem formation actionrelated genes in this particular species.

Declarations
Ethics approval and consent to participate: Not applicable.
Consent for publication: All authors agree to publish.
Availability of data and materials: The raw RNA-Seq data used in this study have been deposited in the Nation Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database under the accession number SRP212030 (https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?study=SRP212030).