Phylogenetic Analysis and Multiple Sequence Alignment of NbFLAs
To clarify their evolutionary relationships and to help classify the NbFLAs, the sequences of all 21 AtFLAs and 38 NbFLAs were used to construct a phylogenetic tree (Fig. 1). Because sequence similarity between some FLAs is low, phylogenetic analysis alone could be misleading. Pair-wise sequence similarity, the presence and number of fasciclin domains, and the presence of a GPI anchor signal were therefore also used for classification, as previously described [8]. Most NbFLAs could be classified by phylogenetic analysis alone, but for a few, including NbFLA8/15 and NbFLA10/14, their protein properties also had to be taken into account.
The 38 NbFLAs we identified could be divided into the same four subclasses previously reported for the AtFLAs [8], named I to IV (Fig. 1). NbFLA2/8/12/15/22/25/26/27/29/32/33/36 belong to subclass I and have a single fasciclin domain and a GPI anchor signal (except NbFLA36), as do the related AtFLAs and PtrFLAs [8, 11]. NbFLA6/9/16/17 belong to subclass II, the smallest group, whose members contain two fasciclin domains but have no C-terminal GPI anchor site. Members of subclass III (NbFLA3/4/5/7/10/14/18/19/23/24/34/38) have either one or two fasciclin domains, and most (77%) have a C-terminal GPI anchor site. The remaining NbFLAs (NbFLA1/11/13/20/21/28/30/31/35/37) constitute subclass IV, whose members are quite distantly related to the other NbFLAs and show no consistent pattern in the number of fasciclin domains or the presence of a GPI signal.
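The subclass assignments listed above can be captured in a small lookup table, which makes it easy to check that the four groups cover all 38 NbFLAs exactly once. The following is only an illustrative sketch: the names SUBCLASSES, subclass_sizes and subclass_of are hypothetical, and the membership lists are copied from the text rather than taken from the authors' data files.

```python
# Subclass membership as listed in the text (NbFLA numbers only).
# All names in this sketch are illustrative, not from the original study.
SUBCLASSES = {
    "I": [2, 8, 12, 15, 22, 25, 26, 27, 29, 32, 33, 36],
    "II": [6, 9, 16, 17],
    "III": [3, 4, 5, 7, 10, 14, 18, 19, 23, 24, 34, 38],
    "IV": [1, 11, 13, 20, 21, 28, 30, 31, 35, 37],
}


def subclass_sizes(subclasses=SUBCLASSES):
    """Return the number of NbFLAs assigned to each subclass."""
    return {name: len(members) for name, members in subclasses.items()}


def subclass_of(number, subclasses=SUBCLASSES):
    """Return the subclass label for a given NbFLA number."""
    for name, members in subclasses.items():
        if number in members:
            return name
    raise KeyError(f"NbFLA{number} is not assigned to any subclass")


# Every NbFLA from 1 to 38 should appear exactly once across the subclasses.
all_members = sorted(n for members in SUBCLASSES.values() for n in members)
assert all_members == list(range(1, 39))
```

For instance, subclass_of(31) returns "IV" and subclass_sizes() reports 12, 4, 12 and 10 members for subclasses I to IV.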
We also constructed separate phylogenetic trees for each subclass of NbFLAs, including the sequences from the other 8 plant species in which FLAs have been identified (Arabidopsis, rice, wheat, poplar, cotton, Chinese cabbage, Eucalyptus grandis and textile hemp) (Additional file 2: Fig. S1). In general, FLAs from closely related species show relatively high sequence similarity, as with AtFLAs/BrFLAs and OsFLAs/TaFLAs. FLAs from the same species often occur in pairs, such as NbFLA26/29 and TaFLA19/27, suggesting that they may be alleles. Subclasses I and III are the two largest groups and their clustering patterns are complicated. In these subclasses, FLAs from the same species do not generally group together, and there are some closely related pairs from different species, suggesting that they might be homologs (e.g. NbFLA12/BrFLA22 and TaFLA2/OsFLA2). In subclasses II and IV, most FLAs from the same species group together (e.g. NbFLA6/9/16/17 and TaFLA6/7/8/29). Subclass II has the fewest members and most of them are not GPI anchored, although the OsFLAs are a notable exception.
Previously reported fasciclin domains contain about 110–150 amino acid residues and have two highly conserved regions (H1 and H2) and a [Phe/Tyr]-His ([Y/F]H) motif [12]. An alignment of the amino acid sequences of the fasciclin domains of the NbFLAs, constructed using MUSCLE with some manual adjustment, showed a similar pattern (Fig. 2). The Thr residue in the H1 region is highly conserved and is followed by other conserved residues such as Val/Ile (one position after Thr) and Asn/Asp (six positions after Thr). These residues may play a role in maintaining the structure of the fasciclin domain and/or in cell adhesion [8]. As reported for other fasciclin domains [11, 31, 35], small hydrophobic amino acids such as Leu, Val and Ile are abundant in the H2 region. In the [Y/F]H motif, His and Pro residues are also relatively conserved.
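The positional pattern described for the H1 region (Thr, then Val/Ile one position later and Asn/Asp six positions later) and the [Y/F]H motif can be written as simple regular expressions. This is a minimal sketch for scanning a protein sequence for candidate sites. It is not the alignment-based procedure used for Fig. 2, the function names are hypothetical, and the example sequence is invented purely for illustration.

```python
import re

# Thr at position 0, Val/Ile at position +1, any four residues,
# then Asn/Asp at position +6. The lookahead lets overlapping
# candidates be reported as well.
H1_PATTERN = re.compile(r"(?=T[VI].{4}[ND])")

# [Tyr/Phe]-His dipeptide.
YH_PATTERN = re.compile(r"[YF]H")


def find_h1_candidates(sequence):
    """Return 0-based start positions of candidate H1 core sites."""
    return [m.start() for m in H1_PATTERN.finditer(sequence)]


def find_yh_motifs(sequence):
    """Return 0-based start positions of [Y/F]H dipeptides."""
    return [m.start() for m in YH_PATTERN.finditer(sequence)]


# Invented toy sequence, not a real NbFLA fasciclin domain.
toy = "AATVFAPSNEAFYH"
h1_sites = find_h1_candidates(toy)
yh_sites = find_yh_motifs(toy)
```

Such short patterns match frequently by chance, so any hits would only be candidates to inspect within an aligned fasciclin domain.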
Analysis of the Structural and Conserved Motifs of NbFLAs
Further analysis of the gene structure and motifs of the NbFLAs is shown in Fig. 3. The phylogenetic tree confirmed that the NbFLAs could be grouped into four subclasses (Fig. 3a). Analysis of the genomic DNA sequences showed that NbFLAs usually had 0, 1 or 2 introns (Fig. 3b). All members of subclass II have one or two introns, while most members of subclasses I and III have none (Fig. 3b). The most closely related members of each subclass usually have a similar exon/intron structure, with little difference in the length of introns and exons. However, a few NbFLA gene pairs showed different intron/exon arrangements. For example, NbFLA1 and NbFLA31 have high sequence similarity, but NbFLA1 has no introns while NbFLA31 has one.
An online MEME analysis was performed to identify additional motifs among the 38 NbFLAs. Twenty conserved motifs were predicted (Fig. 3c and Additional file 3: Table S2) and each NbFLA contained between five and ten of these. Some motifs were common to most members, while others were unique to one or a few subclasses. For example, most NbFLAs (84%) contained motif 17. Motifs 10 and 11 were present only in subclass III, and motifs 9, 16, 18 and 19 were found only in subclass II. Motif 7 was unique to subclasses II and IV, and most members of subclasses I and III, except NbFLA4/5/7/26/38, contained both motifs 3 and 8. Subclass IV was clearly less closely related to the other subclasses, and motifs 12, 13 and 15 were unique to this subclass.
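The subclass-restricted motifs reported above can be summarised as a mapping from motif number to the subclasses in which it occurs. The sketch below is illustrative only: it encodes just the motifs that the text describes as restricted, the names MOTIF_SUBCLASSES and motifs_restricted_to are hypothetical, and the full per-gene motif table is in Additional file 3: Table S2 rather than reproduced here.

```python
# Motif number -> subclasses in which the motif was reported, as stated
# in the text. Motifs not described as subclass-restricted are omitted.
MOTIF_SUBCLASSES = {
    7: {"II", "IV"},
    9: {"II"},
    10: {"III"},
    11: {"III"},
    12: {"IV"},
    13: {"IV"},
    15: {"IV"},
    16: {"II"},
    18: {"II"},
    19: {"II"},
}


def motifs_restricted_to(subclass, table=MOTIF_SUBCLASSES):
    """Return the sorted motif numbers reported in this subclass only."""
    return sorted(
        motif for motif, found_in in table.items() if found_in == {subclass}
    )


def motifs_present_in(subclass, table=MOTIF_SUBCLASSES):
    """Return the sorted restricted motifs that occur in this subclass."""
    return sorted(
        motif for motif, found_in in table.items() if subclass in found_in
    )
```

Under this encoding, motif 7 appears for both subclass II and subclass IV in motifs_present_in, but in neither result of motifs_restricted_to, because it is shared between the two.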
Prediction of cis-acting Elements and Transcription Factors among the NbFLAs
The cis-acting elements in the promoter regions of the NbFLAs were analyzed and the results are shown in Fig. 4 and Additional file 4: Table S3. There were seven types of cis-acting elements: environmental stress-related elements, hormone-responsive elements, development-related elements, light-responsive elements, promoter-related elements, site-binding-related elements and other elements. A total of 105 cis-acting elements were predicted and these showed great diversity (Fig. 4a). The most abundant were light-responsive elements, including G-box, GT1-motif and GATA-motif. Among the predicted environmental stress-related elements, STRE, MBS and ARE were the most abundant (Fig. 4b). Fifteen hormone-responsive elements were also identified, and these are mainly involved in the response to abscisic acid (ABA) or methyl jasmonate (MeJA) (Fig. 4c).
By binding to transcription factors (TFs), cis-acting elements regulate the precise initiation and efficiency of gene transcription. We therefore predicted potential TFs that may regulate the transcription of NbFLAs (Fig. 5 and Additional file 5: Table S4). In total, 25 TFs were predicted, of which C2H2, BBR-BPC, Dof, Myb and MIKC were the most abundant. Each NbFLA was predicted to be regulated by an average of five TFs, but NbFLA4 and NbFLA27 may be regulated by more TFs, including specific TFs such as RAV and CPP, while NbFLA8/15/38 may each be regulated by only two TFs.
Subcellular Localization Analysis of NbFLAs
Bioinformatics analysis based on the NbFLA amino acid sequences predicted that all of them localize to membranes, and that only NbFLA4 localizes to both the nucleus and membranes (Table 1). To validate these predictions, we selected one NbFLA from each subclass (NbFLA4/6/31/32) and analyzed their localization by laser confocal microscopy. AtP1P2A-GFP was used as a membrane marker [36]. The results showed that NbFLA6, NbFLA31 and NbFLA32 localized to membranes and that NbFLA4 was present in both membranes and the nucleus, consistent with the predictions (Fig. 6).
A GPI anchor signal is vital for membrane localization and is predicted in about two thirds of AtFLAs and PtrFLAs and in 20 of the 38 NbFLAs (53%) (Table 1). Among the four selected NbFLAs, only NbFLA31 was not GPI anchored. Correspondingly, although NbFLA31 did localize to membranes, a weak red fluorescence could also be observed in the cytoplasm.
Tissue-Specific Expression of NbFLAs
To better understand the functions of NbFLAs, we selected 11 NbFLA genes from the four subclasses and analyzed their expression in five different tissues (root, stem, young leaf, mature leaf and flower) by RT-qPCR (Fig. 7 and Additional file 6: Fig. S1). NbFLA4 and NbFLA34 were highly expressed in flowers, and the expression of NbFLA4 was particularly high. The expression level of all selected NbFLAs (except NbFLA4) was higher in young leaves than in mature ones. NbFLA11/18/31/32/34 showed similar patterns and were highly expressed in young leaves, while NbFLA2/6/15/17, belonging to subclasses I and II, were highly expressed in stems, suggesting that they may play a role in stem dynamics.
Expression of NbFLAs Under Biotic Stress
To investigate whether NbFLAs participate in the response to pathogens, leaves of N. benthamiana were inoculated with TuMV, potato virus X (PVX), pepper mild mottle virus (PMMoV) or the bacterial pathogen Pst DC3000. At 5 days post virus inoculation (dpi), or 2 days post Pst DC3000 infection, leaves were collected to study the expression pattern of 11 NbFLA genes by RT-qPCR (Fig. 8).
TuMV infection led to a drastic reduction in the expression of all the NbFLAs tested, especially NbFLA15/18/32/34, which all decreased by more than 99%. PVX or PMMoV infection usually induced a modest reduction in expression, although NbFLA6 was slightly upregulated by PVX. The bacterial pathogen Pst DC3000 decreased the expression of most NbFLAs by 73–99% but, in contrast, the expression of NbFLA4 and NbFLA7 was substantially upregulated. These results show that the expression of most NbFLAs is substantially affected by TuMV and Pst DC3000, and that these genes may therefore play roles in post-infection responses.
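To put the reported percentage reductions in terms of relative expression, the short sketch below converts between the two. It assumes that relative expression was calculated with the widely used 2^-ΔΔCt approach, which this section does not state, and the function names are hypothetical.

```python
import math


def relative_expression(delta_delta_ct):
    """Fold change relative to the control, ASSUMING the 2^-ddCt method."""
    return 2.0 ** (-delta_delta_ct)


def percent_reduction(rel_expr):
    """Percent decrease relative to a control whose expression is 1.0."""
    return (1.0 - rel_expr) * 100.0


def ddct_for_reduction(percent):
    """ddCt value that would correspond to a given percent reduction."""
    return -math.log2(1.0 - percent / 100.0)


# A reduction of more than 99% means relative expression below 0.01.
threshold_99 = ddct_for_reduction(99.0)
# The 73% end of the range reported for Pst DC3000.
threshold_73 = ddct_for_reduction(73.0)
```

Under this assumption, a 99% reduction corresponds to a ΔΔCt of about 6.6 cycles and a 73% reduction to about 1.9 cycles, which illustrates how much stronger the TuMV effect is than the lower end of the Pst DC3000 range.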