Genome-wide identi cation and characterization of barley bHLH transcription factors and their expression in response to low nitrogen stress

Background: Improvement of low nitrogen (LN) tolerance or nitrogen use eciency (NUE) in crops is imperative for environment-friendly agriculture development. The basic helix–loop–helix (bHLH) transcription factors are involved in multiple abiotic stress, suitable as the candidate genes for improving LN tolerance. Little research was done on characterization of bHLH gene family and their response to LN stress in barley. Results: In this study, 168 bHLH genes were identied in barley through genome-wide analysis. HvbHLH proteins were classied into 26 subfamilies based on phylogenetic analysis with bHLH proteins from Arabidopsis thaliana and rice. The analysis of conserved motifs and gene structures supported the evolutionary relationships among these HvbHLH proteins. Further, analysis of stress-related cis-elements in the promoter regions showed that bHLH proteins in barley are probably involved in multiple stress responses. Finally, at least 16 bHLH genes were differentially expressed in two barley genotypes differing in LN tolerance under LN stress. Dynamic expression analysis showed that these differentially expressed genes (DEGs) differed between the two barley genotypes in response to LN stress. Conclusion: It is the rst genome-wide analysis of bHLH family genes in response to LN stress in barley. The results indicate the distinct difference among HvbHLH genes in response to various abiotic stresses. The HvbHLHs specically expressed in the LN-tolerant barley genotype XZ149 identied herein may be valuable for future function analysis of HvbHLH genes under LN stress and breeding for barley cultivars with LN tolerance.


Introduction
Transcription factors (TFs) play important roles in the growth and development of plants and animals, and their responses to the external environment via regulating downstream gene expression. As one of the most critical factors regulating gene expression, TFs have been intensively concerned in the eld of biological research. Among various TF families, the basic-helix-loop-helix (bHLH) TFs constitute one of the largest TF families [1]. The bHLH proteins are characterized by a conserved bHLH domain consisting of about 60 amino acids. Each domain has two functional regions, i.e. the basic region at N-terminus and the HLH region at the C-terminus. The basic region comprises about 15 amino acids including six basic residues, and has DNA-binding activity [2]. The HLH region, containing about 50 amino acids, is composed of two α-helices separated by a variable loop, and it participates in the formation of homodimers or heterodimers [3].
TFs are the excellent candidate genes for developing cultivars with improved tolerance to adverse stress [13][14][15]. In plants, bHLH TFs are involved in many biological processes, such as the regulation of avonoid biosynthesis, morphology and the accumulation of fruit pigments [16][17]. Some bHLH TFs participate in regulation of biotic and abiotic stress responses [18][19][20]. For example, some bHLH TFs confer drought tolerance through regulating stomatal development, photosynthesis and growth [21], or by promoting root development and abscisic acid synthesis [22]. Some enhance plant salt stress [23], cold [24], or manganese [25] through their overexpression. And some could improve plant growth while subjected to nutrition de ciency, such as iron, Pi and N de ciency [26][27][28].
Environmental pollution caused by excessive application of nitrogen (N) fertilizers became a big issue.
One of the e cient solutions is to develop cultivars with high N use e ciency /low nitrogen (LN) tolerance. It is reported that bHLH could improve plant tolerance to N de ciency [28]. Thus, bHLH genes may be used as suitable candidates for the genetic improvement of plant LN tolerance/high N use e ciency. However, up to date little has been known about the function of plant bHLH proteins. In this study, bHLH family genes in barley were identi ed and characterized at genome-wide level. It was found that HvbHLH genes were differentially expressed in the two barley genotypes (XZ149, LN tolerant and XZ56, LN sensitive) differing in LN tolerance [29][30] at transcription level under LN stress by RNA-seq.
The major objectives of this study were to determine the characteristics of HvbHLH gene family, to explore the differences of HvbHLH gene expression between the two contrasting genotypes in response to LN stress, and to identify the HvbHLH genes useful for LN tolerance breeding.

Identi cation of bHLH family genes in barley
Proteins containing bHLH domain in barley were identi ed by searching the whole barley genome. Totally 168 bHLH genes were obtained in barley genome after removing the redundant protein sequences (Additional le 1 and 2: Table S1 and S2). By EXPASY analysis, we found the amino acid length of HvbHLH proteins ranged from 36 to 938, accordingly the molecular weight (MW) ranging from 9.63 to 102.01 kDa (Additional le 1: Table S1). The isoelectric point (pI) varied greatly, from 4.63 to 11.9 (Additional le 1: Table S1). While the MW and pI of 11 HvbHLH proteins were not predicted, due to existence of some consecutive unde ned amino acids (Additional le 1: Table S1).

Phylogenetic analysis of bHLH proteins
In order to analyze the evolutionary relationship of the bHLH genes between barley and other species, the unrooted phylogenetic tree was constructed using the full-length amino acid sequences of all bHLH in Arabidopsis, rice and barley by the neighbor-joining (NJ) method. The result showed that 450 bHLH proteins from Arabidopsis, rice and barley were divided into 27 subgroups (Subgroup 1 to 27; Fig. 1) based on the classi cation of bHLH proteins in Arabidopsis and rice [6][7]. Except Subgroup 20, all members of bHLH family in barley could be found in all the other subgroups ( Fig. 1). Subgroups 3 and 9 were the smallest, each only having three members in barley, rice and Arabidopsis. Subgroup 27 was the largest, having 20, 16 and 17 members in barley, rice and Arabidopsis, respectively. Another NJ phylogenetic tree was also constructed based on HvbHLH protein sequences, and HvbHLHs were classi ed into 25 subfamilies (Subfamily A to Y, Fig. 2a). Classi cation in HvbHLH was generally consistent with those in Arabidopsis and rice phylogenic tree ( Fig. 1 and 2a).
Analysis of motifs and exon/intron structures Ten conserved motifs containing 11 to 57 amino acids were predicted in HvbHLH proteins through MEME software (Additional le 3: Fig. S1). HvbHLH proteins consist of 1-5 motifs, varying with different subfamilies (Fig. 2b). However, all of them contained at least of the Motif 1 or 2, and the proteins in the same subfamily exhibited the similar motif composition (Figs. 2a and b). For example, Motifs 2, 3, and 10 were identi ed in all 6 members of Subfamily K, and Motifs 1, 2, and 7 were identi ed in 12 of the 15 members of Subfamily X. Furthermore, some motifs are unique to one or more Subfamilies. For instance, Motif 5 was speci cally shared by each member in the Subfamily D, and motif 3 was only found in Subfamilies B, K and L. This conservation of the motif composition patterns in each subfamily might indicate their similar biological functions in the same subfamily [31].
The alignment of genome and coding sequences, and also exon/intron structure within the coding sequence in bHLH genes was performed using the TBtools [32]. The structure differed greatly among HvbHLH genes, but most members of the same subfamily had the similar exon/intron structure (Figs. 2a and c). For instance, the members in the Subfamily A had the relatively simple structure, possessing only 1-2 exons, while the members in the Subfamily T had 6-8 exons, being much more complex in its structure (Fig. 2c).

Chromosomal distribution and gene duplication of HvbHLH genes
According to their physical positions, 161 HvHLH genes were nonrandomly mapped on seven barley chromosomes, with most located at the both ends of the chromosomes (Fig. 3). More than one-third of the HvbHLH genes were located on Chr3 (30 members) and Chr5 (26 members), and only 12 genes was located on Chr1 (Fig. 3). In addition, 7 HvbHLHs could not be conclusively localized to a chromosome (Fig. 3).
Totally 18 segmental duplication events were identi ed among HvbHLHs (Fig. 4, Additional le 4: Table  S3). Among them, 16 events happened between diverse chromosomes, while only 2 events were detected within a same chromosome. Nine genes were identi ed as tandem duplication genes (Fig. 4, Additional le 4: Table S3), and the genes located on chromosome 3 showed the highest numbers (7) of tandem duplications, while two genes (HORVU4Hr1G087590 and HORVU4Hr1G087610) were tandem duplicated on Chr4 (Fig. 4). Each pair of duplication genes were derived from the same subfamily ( Fig. 1 and 4).
Obviously, some HvbHLH genes were generated due to gene duplication, and the segmental duplication may play vital roles in the HvbHLH gene family expansion in barley genome.

Stress-related cis-elements in the promoter of HvbHLH genes
To understand the potential regulatory mechanisms of HvbHLH genes, cis-elements were analyzed using 1.5-kb upstream sequences from ATG through PlantCARE. Various cis-elements involved in abiotic stress responses were identi ed, and mainly they could be divided into two groups. One is elements responsive to hormones, including abscisic acid (ABRE), jasmonic acid (CGTCA and TGACG motif), auxin (AuxRRcore and TGA-element), gibberellins (GARE motif, TATC-box and P-box,), salicylic acid (SARE and TCAelements). The other one is responsive to environmental stress, including low-temperature responsive element (LTR), anaerobic induction element (ARE), MYB binding site involved in drought-inducibility (MBS), and defense and stress responsive elements (TC-rich repeats (Fig. 5, Additional le 5: Table S4).
All HvbHLH genes had light responsive cis-elements, belonging to the most commonly predicted ciselements in the promoter of HvbHLH genes, and 155 (92.26%) HvbHLH genes contained the G-box elements.most genes had more than one kind of cis-elements, suggesting that both groups could respond to multiple stresses.
Some HvbHLHs highly expressed in the speci c tissue(s). For instance, the genes HORVU2Hr1G044230 and HORVU6Hr1G020520 showed higher expression in caryopsis (5 and 15 DPA) than other tissues (Fig.  6), suggesting their signi cant roles in grain growth and development. HORVU2Hr1G044230 and HORVU1Hr1G072810 had higher expression in embryos and roots, while HORVU2Hr1G044230, HORVU7Hr1G050530 and HORVU1Hr1G054260 had higher expression in shoots (Fig. 6), indicating these genes may participate in growth and development of vegetative organs. Notably, HORVU2Hr1G044230 displayed extremely high expression in all the tissues (Fig. 6).

Expression pro les of HvbHLHs under low nitrogen stress
The expression patterns of HvbHLH genes in response to LN stress were analyzed in the roots of the two barley genotypes (XZ149, LN tolerant and XZ56, LN sensitive) using RNA-seq (Additional les 7 and 8: Tables S6 and S7). Consequently, 16 differentially expressed genes (DEGs) encoding HvbHLH proteins were identi ed using pair-wise comparison for each genotype under LN stress (Fig. 7, Additional le 9: Table S8). Remarkably, no DEG was found in the two barley genotypes under LN stress at 12 d, suggesting HvbHLHs may play a regulating role at early stress. Furthermore, all the DEGs were changed only at one time point, except MLOC_74557 (HORVU6Hr1G064820) and MLOC_21066 (HORVU3Hr1G000150), which was down-regulated in XZ149 both at 6 h and 48h after stress (Fig. 7).
Four DEGs were up-regulated under LN stress, and three of them responded at 6 h, with MLOC_72280 (HORVU2Hr1G114070) only in XZ149, and MLOC_36351 (HORVU3Hr1G108680) and MLOC_76486 (HORVU5Hr1G018100) in both XZ149 and XZ56. MLOC_55964 (HORVU4Hr1G075250) was up-regulated at 48 h in both XZ149 and XZ6 (Fig. 7). Notably, the change fold of all the up-regulated DEGs was larger in the tolerant genotype XZ149 relative to the sensitive genotype XZ56 (Fig. 7).
To con rm the transcriptomic results and understand the expression patterns of HvbHLHs, the dynamic response to LN stress were examined using real time PCR for the 6 identi ed DEGs at 8 time points after LN stress. The expression patterns of HvbHLHs at 6 h, 48 h and 12 d after LN stress were highly consistent with those obtained by the transcriptomic analysis. (Additional le 10: Fig. S2). Under LN stress, different HvbHLHs exhibited distinct dynamic expression patterns, suggesting that HvbHLHs may be involved in regulation of LN tolerance in different ways. For instance, MLOC_72280 (HORVU2Hr1G114070) was up-regulated during 1 h -2 d under LN stress, but down-regulated at 3 d -12 d in XZ149, while it was up-regulated at 1 h, 3 h, 6 h, and 2 d, and down-regulated at 24 h and 3 d -12 d in XZ56 (Additional le 11: Fig. S3). For MLOC_37666 (HORVU2Hr1G108480), its expression generally showed the decrease with the time of exposed LN stress in XZ149, but increase at 1 h, 3 h, and 6 d under LN stress in XZ56 (Additional le 11: Fig. S3).

Discussion
The bHLH TFs in higher plants comprise a large family, involving in plant growth, development, and stress responses [2]. It is well documented that bHLH is responsible to many abiotic stresses such as cold, drought and salt stress [18,[34][35], and nutrients such as iron, Pi and N de ciency [20,28,36]. Therefore, it is imperative to make the comprehensive analysis of bHLH TFs family. N de ciency is a common issue in the agricultural production worldwide. Some studies found that bHLHs are involved in LN stress [28], nevertheless, their relationship/ roles of bHLH family genes in LN tolerance has not been clearly de ned. In this study, genome-wide identi cation and characterization of the bHLH gene family in barley were carried out. Furthermore, the expression pro les of bHLH family under LN stress were determined by using two barley genotypes differing largely in LN tolerance.
As a result, totally 168 bHLH genes were identi ed in barley through genome-wide analysis (Table S1). These HvbHLH genes were distributed over the barley genome, mainly on both ends of the chromosomes (Fig. 3), similar to the distribution of Arabidopsis and rice bHLH genes on their chromosomes [6][7]. Gene duplication is an important event for the evolution of plant genome, leading to the generation of new genes [37][38]. There were at least 14 pairs of duplication genes in barley bHLH family, indicating that gene expansion occurred during the evolution of the barley genome. Moreover, the segmental duplication could be a predominant driving force for the enlargement of HvbHLH [39].
Based on phylogenetic analysis, HvbHLH proteins could be classi ed into 26 subfamilies (Fig. 1). Motif and gene structure analysis further supported the subfamily division, as members of the same subfamily showed the similar gene structure and motif (Fig. 2), indicating their similar evolutionary origins and biology functions. For instance, in the subfamily K, AtbHLH100 (AT2G41240), AtbHLH101 (AT5G041059), AtbHLH38 (AT3G56970), and AtbHLH39 (AT3G56980) played important roles in regulating iron homeostasis under Fe de ciency in Arabidopsis [19,26]. In the same subfamily, OsbHLH056/OsIRO2 (Os01g72730) protein is an essential regulator of the genes involved in Fe uptake under Fe-de cient conditions in rice [40]. On the other hand, HvbHLHs displayed diverse expression patterns in plant tissues, and some genes within the same subfamily may have different expression patterns (Fig. 6), indicating that their functions have been differentiated even in the same subfamily or they may perform the same function in the different ways.
A great number of cis-elements associated with abiotic stress responses were discovered in promoter regions of HvbHLHs, suggesting that HvbHLHs may be highly associated with the regulation of abiotic stresses response in barley. To further explore the possible functions of the bHLH family in LN tolerance, the expression pro les of HvbHLH genes were analyzed using two barley genotypes differing in LN tolerance. A total of 16 DEGs encoding HvbHLH proteins were identi ed in the two genotypes after 6 h and 48 h treatments (Fig. 7), but no one was found at 12 d LN treatment, suggesting that HvbHLHs respond to LN stress only at early time.
There was a large difference in the transcription regulation of HvbHLH between the two barley genotypes. In detail, the change fold of ve up-regulated DEGs was much higher in the LN tolerant genotype XZ149 than the LN sensitive genotype XZ56 under LN stress (Fig. 7). In particular, MLOC_72280 (HORVU2Hr1G114070) and MLOC_42946 (HORVU4Hr1G003210) were up-regulated in XZ149, but little changed or down-regulated in XZ56 (Fig. 7). Thus, it is worthy to determine the roles of these HvbHLH genes in XZ149. It has been reported that overexpression of TabHLH1 in tobacco improved plant tolerance to N deprivation via regulation of nitrate transporter (NRT) gene transcription [28]. In Arabidopsis, the expression of speci c bHLH transcription factors was enhanced in response to N de ciency, accompanied by anthocyanin accumulation [41][42]. Thus, we may focus the candidate genes on the roles in anthocyanin synthesis pathway and regulation of NRT gene transcription under LN stress in the further study.

Conclusion
In summary, 168 bHLH genes were identi ed in barley by a genome-wide analysis, and their evolutionary relationships were clari ed using phylogenetic, conserved motif, and exon/intron structures analyses. The expression pro les revealed that at least 16 HvbHLH genes may mediate LN stress in barley. The speci cally expressed bHLH genes in XZ149 may be valuable for further functional characterizations of bHLH genes under LN stress and breeding of barley cultivars with LN tolerance.

Phylogenetic analysis
Multiple alignment was performed based on the sequences of Arabidopsis thaliana, rice and barley using ClustralW (http://www.genome.jp/tools/clustalw/) [49]. The unrooted phylogenetic tree was constructed by MEGA 6.0 software employing the neighbor-joining (NJ) method, bootstrapping with 1000 replicates.

Exon/intron structure and motif analysis
The exon/intron structure of the barley bHLH genes were determined by alignment of the cDNAs and their corresponding genomic DNA sequences from the barley genome database. The conserved motifs were predicted on the online MEME tool (http://meme-suite.org /tools/ meme) [50]). And a diagrammatic sketch structure and the motif composition of HvbHLHs was mapped through TBtools software [32].

Cis-elements in promoter regions of HvbHLHs
To predict the cis-elements in the promoter regions of HvbHLHs, the 1500 bp upstream sequences from the start codon of each HvbHLH were retrieved. The cis-element distribution was then investigated in PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [54].

Plant materials and stress treatments
Two Tibetan wild barley genotypes XZ149 (LN tolerant) and XZ56 (LN sensitive) were used for transcriptome and real time PCR. The barley seedlings were cultivated with hydroponics in a plant growth chamber (22/18 ℃, day/night) according to Quan et al. (2016) [55]. Then, three-leaf-stage seedlings were exposed to 0.2 mM N (LN stress) and 2 mM N (control).
The roots in XZ149 and XZ56 were taken for transcriptomic analysis at 6 h, 48 h and 12 d after LN stress.
The threshold for screening DEGs was set as FDR<0.05 and FPKM ≥1 at least in one of the samples [56]. Differential expression analysis of HvbHLHs was displayed using FPKM (fragments per kilobase of exon per million fragments mapped reads) [57] and the expression levels are showed as the heatmaps created by TBtools software by a color gradient from low (blue) to high (red) [32].
To con rm the RNA-Seq results and to further explore the expression patterns of HvbHLH genes, the expression of HvbHLH genes in roots was analyzed at 1 h, 3   The datasets used and/or analyzed during the current study are available from the rst author on reasonable request, and her email address is bio_quanxy@ujn.edu.cn. Figure 1 Phylogenetic tree of bHLH families in barley, Arabidopsis and rice. The different-colored arcs indicate different subgroups of the bHLHs. The unrooted neighbor-joining phylogenetic tree was constructed using MEGA6.0 with full-length amino acid sequences of 450 bHLHs, and the bootstrap test replicate was