Genome-wide Identification and Expression Profiling Analysis of WOX Family Proteins Encoded Genes in Triticeae Plant Species

Lei Shi, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences
Ke Wang, Chinese Academy of Agricultural Sciences, Institute of Crop Sciences
Lipu Du, Chinese Academy of Agricultural Sciences, Institute of Crop Sciences
Yuxia Song, Key Laboratory of Agricultural Biotechnology of Ningxia
Huihui Li, Chinese Academy of Agricultural Sciences, Institute of Crop Sciences
Xingguo Ye (yexingguo@caas.cn), Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, https://orcid.org/0000-0002-6616-2753


Background
The Triticeae tribe belongs to the Poaceae family and comprises more than 350 plant species in about 30 genera. Within the tribe, species such as Triticum aestivum (bread wheat), Hordeum vulgare (barley), Secale cereale (rye), Triticum urartu, Triticum dicoccoides, and Triticum turgidum have been cultivated as crops and provide necessary nutrition for more than two billion people worldwide [1,2]. People have long been interested in understanding the origin, genetic basis, and evolution of Triticeae plants in order to improve them. A clear interpretation of the genetic information of Triticeae plants may bring us closer to this objective. The assembly of the wheat genome is a milestone in interpreting that information. However, owing to the large genome size of up to 16 Gb, genomic studies on wheat have lagged behind those on rice and maize [3]. Modern biotechnology tools such as transgenesis and gene editing can help to increase yield, improve quality, and enhance biotic or abiotic stress resistance of major crops, but realizing these aims depends on genetic transformation. The ability to regenerate new plantlets from in vitro tissues is a major limitation that restricts the application of genetic transformation and gene editing systems [4,5].
Regeneration ability is an important genetically controlled physiological trait in most plants; it enables plants to recover from wounding and form new organs. When plants are modified by genetic engineering, the production of shoots or somatic embryos from isolated tissues or cells is an indispensable step in obtaining transgenic plants. However, it is still difficult to obtain regenerated plants during genetic transformation from most genotypes (especially widely grown commercial varieties) of wheat and other Triticeae species [5][6][7]. During plant regeneration, a series of genes are expressed in an orderly manner under the regulation of auxin and cytokinin. These regeneration-related genes include WUSCHEL-RELATED HOMEOBOX (WOX), AUXIN RESPONSE FACTOR (ARF), BABY BOOM (BBM), SCARECROW (SCR), SHORT ROOT (SHR), PLETHORA (PLT), CUP-SHAPED COTYLEDON (CUC), and YUCCA (YUC), which are expressed during embryonic patterning, somatic embryogenesis, cell differentiation, wound repair, and epigenetic reprogramming [5,8-12]. An in-depth understanding of regeneration-related genes at the molecular level will make it possible to break through the bottleneck in genetic transformation and build a more efficient, less genotype-dependent transformation system. The application of regeneration-associated genes including WUS2 and BBM in crop transformation has achieved great success: various maize inbred lines and tissues, as well as recalcitrant genotypes of Indica rice, sugarcane, and sorghum, can be efficiently transformed to obtain transgenic plants [13,14].
The WOX family is a group of plant-specific transcription factors belonging to the homeobox (HB) transcription factor family [15]. All identified WOX genes contain a conserved sequence of 60-66 amino acid residues, called the homeodomain (HD), which is encoded by the HB DNA sequence [16,17]. These homologous sequences fold into a DNA-binding domain. The distinctive WUS-box motif has the form T-L-X-L-F-P-X-X (T-L-[DEQP]-L-F-P-[GITVL]-[GSKNTCV]), with the consensus structure TLELFPLH [15]. Recently published data suggest that WOX genes act as pivotal regulators of embryonic development and polarization, plant growth and development, stem cell differentiation, embryo patterning, and flower development [18][19][20][21][22]. There are 15 WOX genes in Arabidopsis thaliana, 13 in rice, and 21 in maize [15,23,24]. In Arabidopsis, AtWUS acts as a stem cell regulator: it is expressed in the organizing-center (OC) cells of the shoot apical meristem and regulates plant growth and the maintenance of shoot stem cells [25,26]. Ectopic overexpression of WUS genes promotes cell dedifferentiation in the shoot meristem, somatic embryo formation, and the initiation of adventitious shoots and lateral leaves [26][27][28].
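The WUS-box patterns quoted above translate directly into regular expressions. The following is a minimal sketch, not the authors' pipeline: the function name and the toy sequence are hypothetical, and only the two patterns and the consensus TLELFPLH are taken from the text. Note that, as printed, the final H of the consensus is not in the last bracketed set, so in this sketch the consensus matches the general form but not the restricted one.

```python
import re

# General form quoted in the text: T-L-X-L-F-P-X-X (X = any residue).
WUS_BOX_GENERAL = re.compile(r"TL.LFP..")
# Restricted form quoted in the text: T-L-[DEQP]-L-F-P-[GITVL]-[GSKNTCV].
WUS_BOX_STRICT = re.compile(r"TL[DEQP]LFP[GITVL][GSKNTCV]")


def find_wus_box(protein, pattern=WUS_BOX_GENERAL):
    """Return (0-based start, matched residues) for every non-overlapping hit."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]


# Hypothetical toy sequence: the flanking residues are made up; only the
# embedded consensus TLELFPLH comes from the text.
toy = "MAAGS" + "TLELFPLH" + "PQQ"
general_hits = find_wus_box(toy)                 # consensus found at offset 5
strict_hits = find_wus_box(toy, WUS_BOX_STRICT)  # empty: H is outside [GSKNTCV]
```

Running the general pattern first and the restricted pattern second is one way to flag candidate WUS-box proteins whose last residue deviates from the bracketed set.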
AtWOX1 has been found to possibly regulate the activity of S-adenosylmethionine decarboxylase in polyamine homeostasis and/or the expression of CLAVATA3 (CLV3), and has an important function in meristem development in Arabidopsis. Overexpression of AtWOX1 leads to abnormal meristem development and polyamine homeostasis [29]. Normally, AtWOX2 is expressed in the zygote and during early embryogenesis, and functions in the correct development of the apical domain of the embryo [23]. AtWOX2 triggers the expression of PINFORMED1 (PIN1), an auxin transporter that localizes auxin to the cotyledonary tips of the early embryo and to the root pole [18]. AtWOX3 (PRESSED FLOWER1, PRS1) is expressed in the peripheral layer of the shoot meristem and directs cells to form the lateral domain in vegetative and floral organs [30]. The expression of AtWOX2 and AtWOX3 is regulated by Leafy Cotyledon2 (LEC2), and both genes play essential roles in somatic embryogenesis [31]. AtWOX4 is expressed in a narrow domain of cambial cells and, in coordination with PHLOEM INTERCALATED WITH XYLEM (PXY), acts as a key regulator of cambium activity in the main stem [32]. AtWOX5 is expressed in the quiescent center (QC) of the meristematic zone in root tips, regulates columella stem cell (CSC) identity, and helps to maintain the root stem cell niche [33]. AtWOX6 (PRETTY FEW SEEDS2, PFS2) is expressed in developing ovules, primordia, and differentiating organs; it regulates ovule development and affects the differentiation and maturation of leaves, outer integuments, and floral primordia [34]. AtWOX7 is expressed during all developmental stages of the lateral root but is primarily involved in lateral root initiation [35]. AtWOX8 (STIMPY-LIKE) and AtWOX9 (STIMPY) are close homologs [36,37] and are responsible for maintaining the normal development of both basal and apical embryo lineages at the early developmental stage [18].
The expression of AtWOX8 is induced by AtWRKY2 in the basal cell lineage at the initiation stage of embryogenesis [38]. AtWOX11 plays a key role during the differentiation of vascular cambium into new lateral root founder cells. AtWOX11 is strongly induced during de novo root organogenesis, as is its homolog AtWOX12 [39,40]. AtWOX13 is expressed mainly in meristematic tissues, where it promotes replum development and orchestrates fruit patterning [41]. AtWOX14 is regulated by the CLAVATA3/ESRLIKE41/PHLOEM INTERCALATED WITH XYLEM (CLE41/PXY) pair, is expressed in the procambium during stem maturation, and promotes xylem differentiation, vascular cell differentiation, and lignification in inflorescence stems [42,43].
Based on phylogenetic analysis in Arabidopsis, plant WOX proteins are naturally divided into three clades: WUS and WOX1 to WOX7 in the WUS clade; WOX8, 9, 11, and 12 in the intermediate clade; and WOX10, 13, and 14 in the ancient clade [15]. However, the WOX genes in Triticeae plant species have not yet been fully identified and characterized. Therefore, the objectives of this study were (1) to identify WOX genes in six Triticeae plant species, namely T. aestivum, T. turgidum, T. dicoccoides, H. vulgare, A. tauschii, and T. urartu, and to map them onto chromosomes; (2) to divide all the WOX proteins of the six Triticeae species into groups by phylogenetic analysis, using the protein sequences deduced from all the WOX genes together with the sequences of OsWOX genes from rice and AtWOX genes from Arabidopsis; and (3) to analyze the differential expression of TaWOX genes in different tissues by RNA sequencing (RNA-seq) and quantitative real-time PCR (qPCR). Our results provide insights into the functions and evolution of WOX family genes in Triticeae plants and should facilitate their application in genetic transformation for the improvement of Triticeae plants.

Results
Identification of WOX genes in Triticeae plant species
In total, 43 TaWOX transcripts were obtained using the recently released IWGSC wheat genome [3], and there were still 6 pseudogene copies ( [...] (Table S4) were identified from the IWGSC genome database, respectively. Some homologous alleles of WOX genes were not annotated as transcripts in the database, but were also collected and listed in the tables. For example, TaWUSb and TaWUSd were located on chromosomes 2B and 2D in T. aestivum, respectively (Table 1). The WUS genes in the other five Triticeae plant species were also located on their group 2 chromosomes (Table 2, Tables S1-S4). In T. dicoccoides, TdWOX12a and TdWOX12b were located on chromosomes 1A and 1B, respectively, and TdWOX7b and TdWOX13b were located on chromosome 3B (Table S2).
Across the six Triticeae plant species, only one WUS transcript was annotated in the database, namely TaWUSa on chromosome 2A in wheat (Table 1). We found homologous fragments of TaWUSa on chromosomes 2B and 2D in wheat (Table 1), 2D in A. tauschii (Table S1), 2A and 2B in T. dicoccoides and T. turgidum (Tables S2 and S3), and 2H in barley (Table 2). Multiple sequence alignment showed that the full-length open reading frame (ORF) of each of these homologous genes could be recovered, and their deduced amino acid sequences were highly consistent with TaWUS (Fig. 1A). To examine whether these genes can be transcribed and expressed normally, promoter analysis was performed. The promoter regions of the WUS genes in all six Triticeae plant species contained core promoter elements, including the TATA-box and AT~TATA-box near the transcription start site, indicating that they possess potential transcriptional activity (Fig. 1B). The promoter regions of TaWUSa, TdWUSa, TtWUSa, and TuWUS contained a GGTCCAT fragment, which is a cis-acting regulatory element involved in auxin responsiveness. This element was not detected in the promoters of AtaWUS, TaWUSb, TaWUSd, TdWUSb, and TtWUSb.
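The presence or absence of a short cis-element such as GGTCCAT in a promoter can be checked with a plain string scan. The sketch below is illustrative only and does not reproduce the promoter analysis tool used in the study, which is not named here; scanning both strands is an assumption, and the promoter string is a made-up toy sequence.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def scan_element(promoter, element="GGTCCAT"):
    """Return sorted (0-based position, strand) hits of `element`.

    A '-' hit means the reverse complement of the element occurs on the
    given strand, i.e. the element itself lies on the opposite strand.
    """
    promoter = promoter.upper()
    hits = []
    for strand, motif in (("+", element.upper()), ("-", reverse_complement(element))):
        start = promoter.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(motif, start + 1)  # allow overlapping hits
    return sorted(hits)


# Hypothetical toy promoter: one forward copy and one reverse-strand copy.
toy_promoter = "ACGT" + "GGTCCAT" + "TTTT" + "ATGGACC" + "GC"
hits = scan_element(toy_promoter)
```

An empty result for a promoter corresponds to the "not detected" cases listed above.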
Chromosomal location of WOX genes in Triticeae plant species
In general, no WOX gene was found on homologous groups 6 and 7 in the genomes of the six Triticeae plant species, i.e., T. aestivum, T. turgidum, T. dicoccoides, H. vulgare, A. tauschii, and T. urartu (Tables 1 and 2, and Tables S1-S4). In T. aestivum, all TaWOX genes except TaWUS had three transcript copies across the A, B, and D genomes. The three homologous alleles of TaWUS were located on chromosomes 2A, 2B, and 2D. The homologous genes of TaWOX2 and TaWOX12 were located on chromosomes 1A, 1B, and 1D. The three copies of TaWOX4 and TaWOX11 were located on chromosomes 2A, 2B, and 2D. The three homologous genes of TaWOX7 to TaWOX10, TaWOX13, and TaWOX14 were all located on chromosomes 3A, 3B, and 3D. The three alleles of TaWOX6 were located on chromosomes 4A, 4B, and 4D. The three alleles of TaWOX3 and TaWOX5 were located on chromosomes 5A, 5B, and 5D. The chromosomal location of an incomplete transcript of TaWOX8 remains unknown and needs further investigation. No WOX gene was found on homologous groups 6 and 7 (Table 1, Fig. 2A). The HvWOX genes in H. vulgare showed a chromosomal distribution similar to that of the TaWOX genes in T. aestivum and the AtaWOX genes in A. tauschii. HvWOX2 and HvWOX12 were located on chromosome 1H; HvWOX4 and HvWOX11 on chromosome 2H; HvWOX7 to HvWOX10, HvWOX13, and HvWOX14 on chromosome 3H; HvWOX6 on chromosome 4H; and HvWOX3 and HvWOX5 on chromosome 5H (Table 2; Fig. 2B). There were additional copies of HvWOX8 and HvWOX10 on chromosome 3H. HvWOX10.1 and HvWOX10.2 were identical in sequence, whereas HvWOX8.2 was shorter than HvWOX8.1.
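The group-level distribution described above can be tallied in a few lines. This is a small sketch using only the wheat assignments stated in this paragraph; the dictionary and function names are hypothetical, and the incomplete TaWOX8 transcript of unknown location is not represented.

```python
from collections import defaultdict

# Homologous group of each TaWOX gene, transcribed from the paragraph above.
TA_WOX_GROUP = {
    "TaWUS": 2,
    "TaWOX2": 1, "TaWOX12": 1,
    "TaWOX4": 2, "TaWOX11": 2,
    "TaWOX7": 3, "TaWOX8": 3, "TaWOX9": 3,
    "TaWOX10": 3, "TaWOX13": 3, "TaWOX14": 3,
    "TaWOX6": 4,
    "TaWOX3": 5, "TaWOX5": 5,
}


def genes_by_group(gene_to_group):
    """Invert a gene -> group mapping into group -> list of genes."""
    groups = defaultdict(list)
    for gene, group in gene_to_group.items():
        groups[group].append(gene)
    return dict(groups)


def empty_groups(gene_to_group, n_groups=7):
    """Chromosome groups (1..n_groups) that carry no gene in the mapping."""
    used = set(gene_to_group.values())
    return [g for g in range(1, n_groups + 1) if g not in used]


by_group = genes_by_group(TA_WOX_GROUP)
missing = empty_groups(TA_WOX_GROUP)  # groups without any TaWOX gene
```

Group 3 carries the largest share of the family in this tally, and groups 6 and 7 come out empty, in line with the text.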
A similar situation was observed in A. tauschii. AtaWOX2 and AtaWOX12 were located on chromosome 1D. AtaWOX4 and AtaWOX11 were located on chromosome 2D. AtaWOX7 to AtaWOX10, AtaWOX13, and AtaWOX14 were all located on chromosome 3D. AtaWOX6 was located on chromosome 4D, and AtaWOX3 and AtaWOX5 were located on chromosome 5D (Table S1, Fig. S1A). Similar results were also obtained in T. dicoccoides and T. turgidum. As expected, all the TdWOX and TtWOX genes were located on the corresponding chromosomes of the A and B genomes, because these two species carry only those two genomes (Table S2, Table S3, Fig. S1B, Fig. S1C). Additional copies of TdWOX8a and TtWOX14a were also present on the corresponding chromosomes.
To verify the chromosomal locations of the WOX genes in the six Triticeae species, partial sequences of some of the WOX genes were amplified with their specific primers using a set of T. durum-T. aestivum genome D substitution lines (Fig. 3). Among TaWUSa and its two homologs (named TaWUSb and TaWUSd), TaWUSa was detected in T. aestivum L. cv CS (ABD genome), T. durum cv Langdon (AB genome), and all substitution lines except 2D(2A), indicating that the two copies TaWUSa and TdWUSa were located on chromosome 2A. TaWUSb was amplified in CS, Langdon, and all substitution lines except 2D(2B), indicating that TaWUSb was located on chromosome 2B. TaWUSd appeared only in CS, 2D(2A), and 2D(2B), indicating that it was located on chromosome 2D (Fig. 3). Similarly, WOX2a, WOX2b, WOX6a, and WOX6b were absent in 1D(1A), 1D(1B), 4D(4A), and 4D(4B), respectively. WOX2d and WOX6d were detected only in CS and the substitution lines that contain chromosome 1D or 4D, respectively (Fig. 3).
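The reasoning behind the substitution-line assay can be made explicit: a chromosome-specific amplicon appears only in lines that carry that chromosome. The sketch below models this logic under simplifying assumptions (each line named nD(nA) or nD(nB) is Langdon with one A- or B-genome chromosome replaced by its D-genome counterpart, and PCR is perfectly copy-specific). The function names are hypothetical, and the presence/absence patterns are those described in the text for the WUS copies.

```python
GROUPS = range(1, 8)


def complement(line):
    """Set of chromosomes carried by a line.

    'CS' carries the A, B, and D genomes; 'Langdon' carries A and B;
    a name like '2D(2A)' is Langdon with 2A replaced by 2D.
    """
    if line == "CS":
        return {f"{g}{s}" for g in GROUPS for s in "ABD"}
    chroms = {f"{g}{s}" for g in GROUPS for s in "AB"}
    if line != "Langdon":
        added, removed = line.rstrip(")").split("(")
        chroms.discard(removed)
        chroms.add(added)
    return chroms


def infer_chromosome(present):
    """Chromosomes whose presence across lines matches the PCR pattern.

    `present` maps a line name to True (band amplified) or False (no band).
    """
    candidates = [f"{g}{s}" for g in GROUPS for s in "ABD"]
    return [
        c for c in candidates
        if all((c in complement(line)) == seen for line, seen in present.items())
    ]


wus_a = infer_chromosome({"CS": True, "Langdon": True, "2D(2A)": False, "2D(2B)": True})
wus_b = infer_chromosome({"CS": True, "Langdon": True, "2D(2A)": True, "2D(2B)": False})
wus_d = infer_chromosome({"CS": True, "Langdon": False, "2D(2A)": True, "2D(2B)": True})
```

With only a subset of lines scored, the function may return several candidate chromosomes; a unique answer requires the line that lacks the chromosome in question.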

Evolution of WOX family proteins in Triticeae plant species
Phylogenetic trees of WOX family proteins in Triticeae species were constructed based on the deduced protein sequences. The trees suggested that WOX proteins in Triticeae plants are also divided into three clades, as in many other plant species [44,45]. However, the WOX protein classification in wheat was closer to that in rice than to that in Arabidopsis. TaWUS, TaWOX2 to TaWOX5, TaWOX9, TaWOX13, and TaWOX14 were assigned to the same clade as their homologous proteins in rice, corresponding to the Arabidopsis WUS clade (AtWUS and AtWOX1 to AtWOX7). TaWOX6, TaWOX7, and TaWOX10 to TaWOX12, together with their homologous proteins from rice, were classified into one clade, corresponding to the Arabidopsis intermediate clade (AtWOX8, 9, 11, and 12). TaWOX8 and OsWOX8 clustered in separate branches, corresponding to the Arabidopsis ancient clade (AtWOX10, 13, and 14) (Fig. 4).
Barley WOX proteins were also divided into three clades: the first clade harbored HvWOX2, 3, 5, 9, 13, and 14; the second clade contained HvWOX8 only; and the third clade included HvWOX6, 7, and 10 to 12 (Fig. S2A). Similar to wheat, one branch in A. tauschii contained AtaWOX2 to AtaWOX5, 9, 13, and 14. AtaWOX6, 7, and 10 to 12 clustered in the same branch, whereas AtaWOX8 formed another branch alone (Fig. S2B). In T. turgidum, TtWOX proteins were also divided into three clades: TtWOX2 to TtWOX5, 9, 13, and 14 were in the first branch; TtWOX6, 7, and 10 to 12 were in the second branch; and the three copies of TtWOX8 clustered in the same group as OsWOX8 (Fig. S2C). In T. dicoccoides, TdWOX2 to TdWOX5, 9, 13, and 14 clustered in one branch, TdWOX8 formed a branch alone, and TdWOX6, 7, and 10 to 12 were in another branch (Fig. S2D). In T. urartu, only eight sequences encoding WOX family proteins were retrieved because complete genome information for T. urartu was not yet available. The protein sequences deduced from the TuWOX and OsWOX gene sequences were used to construct a phylogenetic tree, in which TuWOX2, 5, and 9 grouped together, TuWOX10 and TuWOX6/11 were in the same branch, and the two homologous sequences of TuWOX8 clustered together (Fig. S2E).
A phylogenetic tree of the WOX family proteins from all six Triticeae species was also constructed using the maximum likelihood method (Fig. 5). In this tree, WOX proteins with the same names from the six Triticeae species clustered together, indicating that the WOX proteins are conserved among these plant species.

Analysis for the conserved motifs of WOX proteins in Triticeae species
All the amino acid sequences of WOX proteins in the six Triticeae species were deduced from the transcripts mentioned above. Each member contained the HOX homeodomain, which is the most notable hallmark and defining feature of this protein family (Fig. 6, Fig. S3). The HOX homeodomain sequences of the three clades of WOX proteins were conserved across the six Triticeae species (Fig. 7A).

Expression patterns of TaWOX genes in different organs of wheat
WOX genes are mainly expressed in meristematic regions and play a regulatory role in plant growth and tissue differentiation. We retrieved data from the expVIP website (http://wheat-expression.com) and outlined the expression patterns of TaWOX genes. TaWUS was expressed in root at the seedling stage, in spike at the vegetative stage, and in spike and leaf/shoot at the reproductive stage. Its expression level was higher in spike than in other organs (Fig. S4A). All three homologs of TaWOX2 to 4, 7, 8, and 12 showed higher expression in developing spike than in other organs, and higher expression at the vegetative stage than at the reproductive stage (Fig. S4B-D, G, H, and L). The expression level of TaWOX5 was higher in grain than in other organs at the reproductive stage (Fig. S4E). TaWOX6 and TaWOX9 to 11 showed high transcriptional activity in root (Fig. S4F, I-K). The transcripts of TaWOX10 and TaWOX11 mainly accumulated in root at the seedling stage, while the expression level of TaWOX9 was high in root at the vegetative stage (Fig. S4I-K). The transcript levels of TaWOX6b and TaWOX6d in root were higher at the reproductive stage than at the vegetative stage (Fig. S4F).
Furthermore, we used wheat root, stem, leaf, and spike at the booting stage, anther at the heading stage, immature embryo, and callus derived from immature embryos at the proliferation and differentiation stages as materials for expression profiling of TaWOX genes by qPCR. The results indicated that the expression patterns of TaWOX genes varied greatly among organs and stages (Fig. 8). The expression levels of TaWUS and TaWOX6 to 8 were relatively high in spike (Fig. 8A, B), and those of TaWOX9 and TaWOX11 were high in root (Fig. 8B, C). Additionally, TaWOX2 showed high activity in embryo, and TaWOX3 and TaWOX4 showed high expression levels in embryogenic callus and differentiating callus, respectively (Fig. 8A).

Discussion
Among Triticeae plant species, wheat and barley are two globally important crops that account for a large proportion of world food production. With the completion of the assembly and annotation of the colossal wheat genome, great progress has been made in functional genomics of Triticeae plants, especially wheat [46][47][48][49]. It is well known that the wheat genome originated from natural hybridization of its three ancestral species. Therefore, the wheat genome, which consists of the A, B, and D genomes, has a large number of repeated gene sequences, and most wheat genes have three or more copies [50]. In the present study, we identified 43 WOX gene copies in the genome of T. aestivum, 42 of which were consistent with the results reported by Li et al. [51], and a new locus of TaWOX8 was added to the TaWOX family. In particular, we identified for the first time 17 WOX genes in H. vulgare, 13 in A. tauschii, 30 in T. turgidum, 25 in T. dicoccoides, and 8 in T. urartu. There were also several duplicated copies of WOX genes, such as TaWOX14a, TaWOX14d, HvWOX10, and TdWOX14. A few WOX-like pseudogenes were found scattered across the Triticeae genomes; these might be duplicates of WOX genes or other genes that lost transcriptional activity during evolution.
WUS plays an indispensable role in maintaining the stem cell niche in the shoot apical meristem (SAM), in lateral primordia differentiation, and in other diverse cellular processes [26]. Deficiency of the WUS gene leads to loss of SAM function and termination of plant growth [25]. However, only the allele of TaWUS located on chromosome 2A was annotated as a transcript. TdWOX12a, TdWOX12b, TdWOX7b, and TdWOX13b, which have high sequence identity with their homologous genes from wheat, were also not annotated as transcripts in the database. The DNA sequences and deduced protein sequences of these four genes were added to the WOX members of the six Triticeae species (Table S2). In barley, the annotations HORVU1Hr1G087940 and HORVU1Hr1G087950 and their deduced protein sequences A0A287GM87 and A0A287GM65 actually originate from HvWOX12 (Table 2).
In previous studies, the classification and naming of WOX genes in wheat were confused to some extent. This might be attributed to the different naming schemes of WOX genes in Arabidopsis and rice [15,23,24]. For example, the TaWOX5 genes reported by Zhao et al. [52] were regarded as TaWOX9 because of their high similarity to OsWOX9, even though, among all the WOX members in Arabidopsis, they were most similar to AtWOX5 (Fig. 4). Several reported TaWOX members, namely TraesCS3A02G358100, TraesCS3B02G391100, TraesCS3D02G352500, TraesCS3A02G358200, TraesCS3A02G358400, TraesCS3B02G391200, TraesCS3D02G352600, and TraesCS3D02G352700 on chromosomes 3A, 3B, and 3D, were named TaWOX13 and TaWOX14 [51] according to the new nomenclature rules. However, the TaWOX13 transcripts were not similar to AtWOX13 or OsWOX13, and the TaWOX14 transcripts were not similar to AtWOX14. Instead, TaWOX13 and TaWOX14 were similar to the homologs of TaWOX5 according to phylogenetic analysis (Fig. 4). WOX13 and WOX14 in the other Triticeae species showed a similar phylogenetic relationship with the WOX5 members (Fig. 5).
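The chromosome of each wheat gene model listed above can be read from its identifier. The sketch below assumes the identifier layout seen in these examples (the prefix TraesCS, a chromosome such as 3A, a two-digit version field, the letter G, and a six-digit number); the regular expression and function names are hypothetical and do not cover identifiers outside this layout.

```python
import re
from collections import defaultdict

# Assumed layout: TraesCS + chromosome (1-7, A/B/D) + 2 digits + G + 6 digits.
ID_PATTERN = re.compile(r"^TraesCS([1-7])([ABD])\d{2}G\d{6}$")


def chromosome_of(gene_id):
    """Return the chromosome encoded in a gene ID, e.g. '3A'."""
    match = ID_PATTERN.match(gene_id)
    if match is None:
        raise ValueError(f"unrecognized gene ID layout: {gene_id}")
    return match.group(1) + match.group(2)


def group_by_chromosome(gene_ids):
    """Map chromosome -> list of gene IDs, preserving input order."""
    grouped = defaultdict(list)
    for gene_id in gene_ids:
        grouped[chromosome_of(gene_id)].append(gene_id)
    return dict(grouped)


# The eight gene models named TaWOX13 and TaWOX14 in the text.
ids = [
    "TraesCS3A02G358100", "TraesCS3B02G391100", "TraesCS3D02G352500",
    "TraesCS3A02G358200", "TraesCS3A02G358400", "TraesCS3B02G391200",
    "TraesCS3D02G352600", "TraesCS3D02G352700",
]
by_chrom = group_by_chromosome(ids)
```

Grouped this way, the eight models split into three on 3A, two on 3B, and three on 3D, with closely spaced numbers on each chromosome.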
All the TaWOX genes in wheat have three or more copies. Because of their sequence similarity, it is difficult to distinguish the expression level of each copy. A feasible approach is to estimate the amount of mRNA by calculating the transcript amount of each copy. Zhao et al. indicated that the transcriptional level of individual TaWOX5 alleles varied during callus growth in wheat [52]. Based on the results of the present investigation, the expression profiles of other WOX alleles also changed among wheat organs, which needs to be verified by further research.

Conclusions
To our knowledge, this is the first genome-wide comparative analysis of WOX family genes in Triticeae plant species. In total, 130 WOX genes were identified, including 43 in T. aestivum, 28 in T. turgidum, 23 in T. dicoccoides, 15 in H. vulgare, 13 in A. tauschii, and 8 in T. urartu. The homologous genes TaWUSb and TaWUSd, and the WUS genes in the other five Triticeae species, were annotated and were predicted to be expressed normally according to promoter element analysis. Four novel homologous alleles of TaWOX genes, TdWOX12a, TdWOX12b, TdWOX7b, and TdWOX13b, were also identified in T. dicoccoides. All of these WOX members showed evolutionary conservation and the same chromosomal arrangement. Based on the RNA-seq data in the wheat-expression database and the qPCR results, TaWOX genes were found to have tissue-specific expression patterns. The results of this study should help to further understand the molecular function and evolutionary relationships of WOX family genes in Triticeae plants, and to apply them in plant genetic transformation in the future.
Total RNA was extracted using the TRIzol™ Reagent Kit (Invitrogen 15596026), and reverse transcription was performed using the PrimeScript™ RT reagent (Takara) according to the manufacturer's protocol. qPCR was performed on an ABI7500 Thermal Cycler using 2 × RealStar Green Fast Mixture (with ROX II, Genestar). TaActin (GenBank: AB181991) was used as the internal control, and three biological replicates were used. Gene-specific primers were designed with Primer Premier 6.0 (Table S5).
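The text does not state how relative expression was computed from the qPCR data. One common choice with a single internal control such as TaActin is the 2^-ΔΔCt method, sketched below as an assumption rather than as the authors' procedure. The function name is hypothetical, and all Ct values are invented for illustration.

```python
from statistics import mean


def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change by the 2^-ddCt method (assumed, not stated in the text).

    ct_target / ct_ref: replicate Ct values of the target gene and the
    internal control in the sample of interest.
    ct_target_cal / ct_ref_cal: the same for the calibrator sample.
    """
    d_ct_sample = mean(ct_target) - mean(ct_ref)
    d_ct_calibrator = mean(ct_target_cal) - mean(ct_ref_cal)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)


# Hypothetical Ct values, three biological replicates each.
fold = relative_expression(
    ct_target=[24.0, 24.5, 23.5],      # target gene, tissue of interest
    ct_ref=[18.0, 18.5, 17.5],         # internal control, tissue of interest
    ct_target_cal=[26.0, 26.5, 25.5],  # target gene, calibrator tissue
    ct_ref_cal=[18.0, 18.5, 17.5],     # internal control, calibrator tissue
)
```

In this toy case the target amplifies two cycles earlier relative to the control than in the calibrator, which corresponds to a fourfold higher relative expression. The method assumes amplification efficiencies close to 100% for both primer pairs.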