Identification of Members of the NbFLA Family
Based on previous studies[8], FLAs have an AGP-like glycosylated region, a fasciclin domain and an N-terminal signal peptide. We followed these criteria to identify putative FLAs in N. benthamiana. The sequences of the 21 identified AtFLAs were downloaded[8] and the N. benthamiana genome was downloaded from the Sol Genomics Network (https://solgenomics.net/)[34]. A total of 38 NbFLAs were identified by two rounds of BLASTP and signal peptide prediction (Table 1 and Additional file 1: Table S1). Most of these (66%) are 200-300 aa long, while the largest (NbFLA10) has 495 aa and the smallest (NbFLA26) has only 182 aa. The predicted isoelectric points range from 4.29 to 9.77, and the molecular weights (MWs) derived only from the amino acid sequences (not including glycans) are in the range 19.68-52.32 kDa. The protein properties of the NbFLAs are similar to those of FLAs from other plant species[8, 11].
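As an illustration of how a sequence-only MW (not including glycans) can be derived, the following minimal sketch sums average residue masses and adds one water molecule for the free termini. The function names and the toy peptides are hypothetical, and the tool actually used to compute the values in Table 1 is not stated here, so this is only one plausible way to obtain such numbers.

```python
# Average residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini


def molecular_weight_kda(seq):
    """Return the MW (kDa) of an unmodified protein sequence (no glycans)."""
    seq = seq.strip().upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError("non-standard residues: %s" % sorted(unknown))
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) / 1000.0


def in_length_range(seq, low=200, high=300):
    """True if the sequence length falls in the 200-300 aa range used in the text."""
    return low <= len(seq.strip()) <= high


if __name__ == "__main__":
    toy_peptide = "MKTVAAGGNLLYHP"  # hypothetical sequence, not a real NbFLA
    print(round(molecular_weight_kda(toy_peptide), 3), "kDa")
    print(in_length_range(toy_peptide))
```

Because FLAs are heavily glycosylated, the apparent MW of the mature protein would be expected to exceed a value calculated this way.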
Phylogenetic Analysis and Multiple Sequence Alignment of NbFLAs
To clarify their evolutionary relationships and to aid the classification of NbFLAs, the sequences of all 21 AtFLAs and 38 NbFLAs were used to construct a phylogenetic tree (Fig. 1). Because of the low sequence similarity between some FLAs, phylogenetic analysis alone could be misleading; therefore, pairwise sequence similarity, the presence and number of fasciclin domains and the presence of a GPI anchor signal were also used to create the classification, as previously described[8]. Most NbFLAs could be classified by phylogenetic analysis alone, but for a few (NbFLA8/15 and NbFLA10/14) their protein properties, including the presence and number of fasciclin domains and GPI anchor signals, also had to be taken into account.
The 38 NbFLAs we identified could be divided into the same four subclasses previously reported for the AtFLAs[8], named I to IV (Fig. 1). NbFLA2/8/12/15/22/25/26/27/29/32/33/36 belong to subclass I and have a single fasciclin domain and a GPI anchor signal (except NbFLA36), as do the related AtFLAs and PtrFLAs[8, 11]. NbFLA6/9/16/17 belong to subclass II, which is the smallest group; its members contain two fasciclin domains but have no C-terminal GPI anchor site. Members of subclass III (NbFLA3/4/5/7/10/14/18/19/23/24/34/38) have either one or two fasciclin domains, and most (77%) have a C-terminal GPI anchor site. The remaining NbFLAs (NbFLA1/11/13/20/21/28/30/31/35/37) constitute subclass IV, whose members are quite distantly related to the other NbFLAs and show no consistent pattern in the number of fasciclin domains or the presence of a GPI anchor signal.
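The subclass assignment listed above can be checked for internal consistency with a short script. This is only a bookkeeping sketch: the helper names are hypothetical, and the membership lists are copied from the text. It confirms that the four subclasses are disjoint and together cover NbFLA1 to NbFLA38.

```python
def expand(shorthand, prefix="NbFLA"):
    """Expand 'NbFLA2/8/12' into ['NbFLA2', 'NbFLA8', 'NbFLA12']."""
    numbers = shorthand[len(prefix):].split("/")
    return [prefix + n for n in numbers]


# Membership lists as given in the text.
SUBCLASSES = {
    "I": expand("NbFLA2/8/12/15/22/25/26/27/29/32/33/36"),
    "II": expand("NbFLA6/9/16/17"),
    "III": expand("NbFLA3/4/5/7/10/14/18/19/23/24/34/38"),
    "IV": expand("NbFLA1/11/13/20/21/28/30/31/35/37"),
}


def check_partition(subclasses, total=38, prefix="NbFLA"):
    """Return {member: subclass}; raise if a member is duplicated, missing or unexpected."""
    assignment = {}
    for name, members in subclasses.items():
        for member in members:
            if member in assignment:
                raise ValueError("%s is listed in %s and %s"
                                 % (member, assignment[member], name))
            assignment[member] = name
    expected = {prefix + str(i) for i in range(1, total + 1)}
    if set(assignment) != expected:
        raise ValueError("mismatch: %s" % sorted(set(assignment) ^ expected))
    return assignment


if __name__ == "__main__":
    assignment = check_partition(SUBCLASSES)
    for name, members in SUBCLASSES.items():
        print(name, len(members))
```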
We also constructed separate phylogenetic trees for each subclass of NbFLAs, including the sequences from the other eight plant species in which FLAs have been identified (Arabidopsis, rice, wheat, poplar, cotton, Chinese cabbage, Eucalyptus grandis and textile hemp) (Additional file 2: Fig. S1). In general, FLAs share relatively high sequence similarity among closely related species, as with AtFLAs/BrFLAs and OsFLAs/TaFLAs. FLAs from the same species often occur in pairs, such as NbFLA26/29 and TaFLA19/27, suggesting that they may be paralogous genes. Subclasses I and III are the two largest groups and their clustering patterns are complicated. FLAs from the same species do not generally group together, and there are some closely related pairs from different species, suggesting that they are orthologous genes (e.g. NbFLA12/BrFLA22 and TaFLA2/OsFLA2). In subclasses II and IV, most FLAs from the same species group together (e.g. NbFLA6/9/16/17 and TaFLA6/7/8/29). Subclass II has the fewest members and most of them are not GPI anchored, although the OsFLAs are a notable exception.
Previously reported fasciclin domains contain about 110-150 amino acid residues and have two highly conserved regions (H1 and H2) and a [Phe/Tyr]-His ([Y/F]H) motif[12]. An alignment of the amino acid sequences of the fasciclin domains of the NbFLAs, constructed using MUSCLE together with some manual analysis, showed a similar pattern (Fig. 2). The Thr residue in the H1 region is highly conserved and is followed by other conserved residues such as Val/Ile (one position after Thr) and Asn/Asp (six positions after Thr). These residues may play a role in maintaining the structure of the fasciclin domain and/or in cell adhesion[12]. As reported for other fasciclin domains[11, 31, 35], small hydrophobic amino acids such as Leu, Val and Ile are abundant in the H2 region. In the [Y/F]H motif, His and Pro residues are also relatively conserved.
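The residue spacing described for the H1 region (Thr, then Val/Ile one position later and Asn/Asp six positions later) and the [Y/F]H motif can be expressed as simple regular expressions. The sketch below is illustrative only: the pattern names and the toy sequence are hypothetical, the H1 pattern captures just the three residues named above and not the full H1 region, and such short patterns will also match by chance, so hits would need to be confirmed against the alignment.

```python
import re

# Thr at position i, Val/Ile at i+1, any four residues (i+2..i+5), Asn/Asp at i+6.
H1_CORE = re.compile(r"T[VI].{4}[ND]")
# Tyr or Phe immediately followed by His.
YF_H = re.compile(r"[YF]H")


def find_motifs(seq):
    """Return 0-based start positions of non-overlapping matches for each pattern."""
    seq = seq.strip().upper()
    return {
        "H1_core": [m.start() for m in H1_CORE.finditer(seq)],
        "YF_H": [m.start() for m in YF_H.finditer(seq)],
    }


if __name__ == "__main__":
    toy = "MKTVAAGGNLLYHP"  # hypothetical sequence, not a real fasciclin domain
    print(find_motifs(toy))
```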
Analysis of the Structural and Conserved Motifs of NbFLAs
Further analysis of the gene structure and motifs of the NbFLAs is shown in Fig. 3. The phylogenetic tree confirmed that NbFLAs could be grouped into four subclasses (Fig. 3a). Analysis of the genomic DNA sequences showed that NbFLAs usually had 0, 1 or 2 introns (Fig. 3b). All members of subclass II have one or two introns, while most members of subclasses I and III have none (Fig. 3b). The most closely related members of each subclass usually have a similar exon/intron structure, with little difference in the lengths of introns and exons. However, a few NbFLA gene pairs showed different intron/exon arrangements. For example, NbFLA1 and NbFLA31 have high sequence similarity, but NbFLA1 has no introns while NbFLA31 has one.
An online MEME analysis was performed to identify additional motifs among the 38 NbFLAs. Twenty conserved motifs were predicted (Fig. 3c and Additional file 3: Table S2) and each NbFLA contained between five and ten of these. Some motifs were common to most members, while others were unique to one or a few subclasses. For example, most NbFLAs (84%) contained motif 17. Motifs 10 and 11 were present only in subclass III and motifs 9, 16, 18 and 19 were found only in subclass II. Motif 7 was unique to subclasses II and IV, and most members of subclasses I and III contained both motifs 3 and 8 (except NbFLA4/5/7/26/38). Subclass IV was clearly less closely related to the other subclasses, and motifs 12, 13 and 15 were unique to this subclass.
Prediction of cis-acting Elements and Transcription Factors among the NbFLAs
The cis-acting elements in the promoter regions of the NbFLAs were analyzed and the results are shown in Fig. 4 and Additional file 4: Table S3. There were seven types of cis-acting elements: environmental stress-related elements, hormone-responsive elements, development-related elements, light-responsive elements, promoter-related elements, site-binding-related elements and other elements. A total of 105 cis-acting elements were predicted and these showed great diversity (Fig. 4a). The most abundant were light-responsive elements, including the G-box, GT1-motif and GATA-motif. Fifteen hormone-responsive elements were identified and these are mainly involved in the response to abscisic acid (ABA) or methyl jasmonate (MeJA) (Fig. 4b). Among the predicted environmental stress-related elements, STRE, MBS and ARE were the most abundant (Fig. 4c). Several of the abundant predicted cis-acting elements are known to mediate plant immunity. For example, VdMYB1 binds to the MBS in the VdSTS2 gene promoter, thus activating VdSTS2 transcription and positively regulating defense responses[36]. Machi3-1 and TaRIM1 also bind MBS cis-acting elements to increase host resistance[37, 38].
By binding to transcription factors (TFs), cis-acting elements regulate the precise initiation and efficiency of gene transcription. We therefore predicted potential TFs that may regulate the transcription of NbFLAs (Fig. 5 and Additional file 5: Table S4). The NbFLAs had an average of five predicted TFs, but NbFLA4 and NbFLA27 may be regulated by more TFs, including specific TFs such as RAV and CPP, while NbFLA8/15/38 may each be regulated by only two TFs. In total, 25 TFs were predicted, of which C2H2, BBR-BPC, Dof, Myb and MIKC were the most abundant. Previous studies have demonstrated the role of TFs in regulating plant immunity. NbCZF1, a novel C2H2-type zinc finger protein, is a regulator of plant defense[39] and VvDOF3 enhances powdery mildew resistance in Vitis vinifera[40]. In addition, AtMyb15 and MdMyb30 also participate in enhancing disease resistance[41, 42].
Subcellular Localization Analysis of NbFLAs
Bioinformatics analysis based on the NbFLA amino acid sequences suggested that all of them could localize to membranes, and only NbFLA4 was predicted to localize to both the nucleus and membranes (Table 1). To validate these predictions, we selected one NbFLA from each subclass (NbFLA4/6/31/32) and analyzed their localization by laser confocal microscopy. AtPIP2A-GFP was used as a membrane marker[43]. The results showed that while NbFLA6 and NbFLA32 were located only in membranes, NbFLA4 was present in both membranes and the nucleus, consistent with the predictions (Fig. 6).
A GPI anchor signal is vital for membrane localization and is predicted in about two thirds of AtFLAs and PtrFLAs and in 20 of the 38 NbFLAs (53%) (Table 1). Among the four selected NbFLAs, only NbFLA31 was not GPI anchored. Correspondingly, although a plasmolysis experiment confirmed the membrane localization of NbFLA31, diffuse red fluorescence could also be observed in the cytoplasm (Fig. 6 and Additional file 6: Fig. S2).
Tissue-Specific Expression of NbFLAs
To better understand the functions of NbFLAs, two or three NbFLAs from each subclass were randomly selected and their expression in five different tissues (root, stem, young leaf, mature leaf and flower) was analyzed by RT-qPCR (Fig. 7 and Additional file 7: Fig. S3). The expression level of all selected NbFLAs (except NbFLA4) was higher in young leaves than in mature ones. NbFLA11/18/31/32/34 were highly expressed in young leaves, and NbFLA4 was highly expressed in flowers. It was reported earlier that PtFLA6 is specifically expressed in tension wood (TW) and that decreased transcripts of PtFLA6 influenced stem dynamics[18]. In this study, NbFLA2/6/15/17, which belong to subclasses I and II, were highly expressed in stems, suggesting that they may play a role in stem dynamics.
Expression of NbFLAs Under Biotic Stress
To investigate whether NbFLAs participate in the response to pathogens, leaves of N. benthamiana were inoculated with turnip mosaic virus (TuMV), potato virus X (PVX), pepper mild mottle virus (PMMoV) or the bacterial pathogen Pseudomonas syringae pv. tomato strain DC3000 (Pst DC3000). At 5 days post virus inoculation (dpi), or 2 days post Pst DC3000 infection, leaves were collected to study the expression pattern of 11 NbFLA genes by RT-qPCR (Fig. 8).
TuMV infection led to a marked reduction in the expression of all the NbFLAs tested, especially NbFLA15/18/32/34, which all decreased by more than 99%. PVX or PMMoV infection usually induced a modest reduction in expression, although NbFLA6 was slightly upregulated by PVX. The bacterial pathogen Pst DC3000 decreased the expression of most NbFLAs by 73-99% but, in contrast, NbFLA4 and NbFLA7 were substantially upregulated. These results show that the expression of most NbFLAs is substantially affected by TuMV and Pst DC3000, and that these genes may therefore play roles in post-infection responses.
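To put the reported percentage reductions in terms of RT-qPCR cycle differences, the sketch below assumes that relative expression is calculated with the widely used 2^-ΔΔCt method. This is an assumption for illustration only, as the quantification method is not stated in this section, and the function names and Ct values are hypothetical. Under that assumption, a 99% decrease corresponds to a fold change of 0.01, i.e. a ΔΔCt of about 6.6 cycles.

```python
import math


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^-ddCt method (assumed here, not stated in the text)."""
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


def fold_from_percent_decrease(percent):
    """Convert a percentage decrease (e.g. 99) into a fold change (e.g. 0.01)."""
    if not 0 <= percent < 100:
        raise ValueError("percent decrease must be in [0, 100)")
    return 1.0 - percent / 100.0


def ddct_from_percent_decrease(percent):
    """ddCt (in cycles) that would give the stated percentage decrease."""
    return -math.log2(fold_from_percent_decrease(percent))


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 2 cycles later after infection.
    print(relative_expression(26.0, 18.0, 24.0, 18.0))
    for pct in (73, 99):
        print(pct, round(ddct_from_percent_decrease(pct), 2))
```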