3.1 Identification and characteristics of U. rhynchophylla GATA TFs
In this study, a total of 25 GATA genes were identified in the U. rhynchophylla genome (BioProject: PRJNA931770) and named UrGATA1-UrGATA25 according to their chromosomal locations. The characteristics of the 25 UrGATA proteins, including the number of amino acids, molecular weight, theoretical isoelectric point (pI), instability index, and subcellular localization, are presented in Table S2. The predicted lengths of the U. rhynchophylla GATA TFs ranged from 134 (UrGATA15) to 461 (UrGATA11) amino acids, and the molecular weights ranged from 15.15 kDa (UrGATA15) to 50.97 kDa (UrGATA11). The predicted pI values ranged from 5.33 (UrGATA24) to 10.81 (UrGATA10). All UrGATA proteins were predicted to be unstable based on their instability index values. Subcellular localization prediction indicated that all 25 UrGATA proteins are located in the nucleus.
3.2 Phylogenetic analysis and multiple sequence alignment of UrGATA
To analyze phylogenetic relationships, full-length GATA protein sequences from U. rhynchophylla, A. thaliana, and O. sativa were aligned, and a neighbor-joining tree was constructed in MEGA 7.0 with 1000 bootstrap replications (Fig. 2). Following the GATA classification used for A. thaliana and O. sativa, the UrGATA proteins were divided into four subgroups: I, II, III, and IV (Fig. 2, Fig. 3a). Subgroup I was the largest, with 13 members (UrGATA2, UrGATA6, UrGATA8, UrGATA11-14, UrGATA17, UrGATA19-21, UrGATA23-24). Nine UrGATAs were placed in subgroup II (UrGATA1, UrGATA3-5, UrGATA9, UrGATA15, UrGATA18, UrGATA22, UrGATA25), and subgroup III contained only one member, UrGATA16. UrGATA7 and UrGATA10 were grouped into subgroup IV.
3.3 Conserved motifs and gene structure analysis of UrGATA gene family
To better analyze the diversity of UrGATA proteins, conserved protein motifs were predicted using the MEME online tool. In total, eight conserved motifs (motifs 1–8) were identified (Fig. 3b, Fig. 3d, Table S3). Motif 1 was detected in all UrGATA proteins. Motifs 2, 3, 5, 6, 7, and 8 were found only in subgroup I, and motif 4 was observed only in subgroup II.
To assess the structural diversity of UrGATA genes, the exon/intron organization of their DNA sequences was analyzed. As shown in Fig. 3c, the number of introns in UrGATA genes varied from one to six (twelve with one intron, nine with two introns, one with three introns, two with four introns, and one with six introns), with UrGATA16 containing the greatest number of predicted introns (n = 6). In general, UrGATA members of the same subgroup displayed similar gene structures, in agreement with the subgroups defined by the phylogenetic tree. Consistent with previous reports in A. thaliana, O. pumila, and O. sativa, all 25 UrGATA proteins contained a conserved GATA zinc finger domain regardless of subgroup. Only UrGATA16, which belongs to subgroup III, had the C-X2-C-X20-C-X2-C structure; the other UrGATAs had the C-X2-C-X18-C-X2-C structure (Fig. 4).
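The difference in loop spacing between the two zinc finger types can be checked directly on a protein sequence. The following is a minimal sketch, assuming the domain is matched with a simple regular expression for C-X2-C-X18-C-X2-C or C-X2-C-X20-C-X2-C; the function name and the toy sequences are hypothetical and are not taken from the UrGATA data, and a real analysis would normally rely on a domain database rather than on this pattern alone.

```python
import re

# C-X2-C-X(18 or 20)-C-X2-C; the loop between the two cysteine pairs is
# captured so that its length can be reported.
ZNF_PATTERN = re.compile(r"C.{2}C(.{18}|.{20})C.{2}C")


def classify_gata_zinc_finger(protein_seq):
    """Return the loop length (18 or 20) of the first GATA-type zinc finger
    found in the sequence, or None if no such pattern is present."""
    match = ZNF_PATTERN.search(protein_seq.upper())
    if match is None:
        return None
    return len(match.group(1))


# Toy sequences (hypothetical): alanine is used as a neutral filler residue.
toy_x18 = "MK" + "CAAC" + "A" * 18 + "CAAC" + "GG"
toy_x20 = "MK" + "CAAC" + "A" * 20 + "CAAC" + "GG"

print(classify_gata_zinc_finger(toy_x18))
print(classify_gata_zinc_finger(toy_x20))
```

Under this sketch, a protein such as UrGATA16 would be expected to return 20 and the remaining UrGATAs 18, although this has not been verified against the actual sequences here.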
3.4 Chromosomal distribution and collinearity analysis of UrGATA gene family
The chromosomal locations of the 25 UrGATA genes in the U. rhynchophylla genome are shown in Fig. 5. UrGATA genes were most abundant on Ur_chr7 (UrGATA7-9) and Ur_chr16 (UrGATA17-19). Six chromosomes (Ur_chr1, Ur_chr3, Ur_chr10, Ur_chr12, Ur_chr13, Ur_chr21) had no UrGATA genes, and the other chromosomes contained 1–2 UrGATA genes each. Syntenic blocks within the U. rhynchophylla genome were analyzed to identify the relationship between UrGATA genes and gene duplication events (Fig. 6). No tandem duplication gene pairs were identified among the 25 UrGATA genes, whereas 28 segmental duplication gene pairs were detected across fifteen chromosomes. According to previous studies, gene duplication is a key driver of gene family expansion (Doksani, 2019). Therefore, some UrGATA genes might have arisen through segmental duplication.
To obtain a more comprehensive understanding of the evolution of the UrGATA family, synteny between U. rhynchophylla and three dicotyledonous species (A. thaliana, C. canephora, and C. roseus) and one monocotyledonous species (O. sativa) was analyzed (Fig. 7, Table S4). UrGATA genes showed different numbers of collinear relationships with A. thaliana (48), C. roseus (45), C. canephora (39), and O. sativa (11), with the greatest number found with A. thaliana.
3.5 Cis-acting elements analysis of UrGATA gene family
To gain further insight into the potential functions of GATA TFs in U. rhynchophylla, the 2000-bp regions upstream of the translation initiation sites of the UrGATA genes were extracted, and cis-acting elements were identified using PlantCARE. The cis-acting elements were divided into three categories: plant growth and development, phytohormone responsive, and stress responsive. All 25 UrGATA genes contained at least one cis-acting element in the promoter region (Fig. 8). Elements related to plant growth and development (266), comprising light-responsive and development-related elements, were found in the promoter regions. The G-box (69), Box4 (61), GT1-motif (26), GATA-motif (17), and Sp1 (7) elements are associated with light responses. Elements associated with growth, such as the O2-site (13) and CAT-box (11), were also present. Phytohormone-responsive elements for abscisic acid (ABA) (58), methyl jasmonate (MeJA) (50), salicylic acid (SA) (26), gibberellin (GA) (16), and auxin (13) were also identified in the promoters of UrGATA genes. Abiotic and biotic stress-related elements were likewise found in UrGATA promoter regions, including antioxidant response (ARE) (58), drought (MBS) (15), low temperature (LTR) (14), and wound-responsive (WUN-motif) (2) elements. These predictions suggest that multiple cis-acting elements may regulate UrGATA genes during growth and phytohormone responses.
3.6 Protein interaction network and expression patterns of UrGATA genes
The expression patterns of the 25 UrGATA genes in four tissues (root, stem hook, leaf, and flower) were evaluated using the transcriptome database of U. rhynchophylla (Fig. 9). Of the 25 UrGATA genes, 64% (16/25) were expressed in all samples (TPM > 0), whereas 20% (5/25) were not expressed in any tissue. Furthermore, 40% (10/25) of the UrGATA genes displayed their highest expression levels in the flower, followed by 16% (4/25) with the highest expression levels in stem hook and root, and 8% (2/25) with the highest expression levels in roots. These results show that the expression patterns of the UrGATA genes differ among tissues, suggesting that they perform different functions in these tissues.
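The proportions reported above follow from simple counting over a gene-by-tissue TPM table. The sketch below illustrates that counting under stated assumptions: the function name, the dictionary layout, and the toy values are hypothetical, and "expressed" is taken to mean TPM > 0, as in the text.

```python
def summarize_expression(tpm_by_gene):
    """Summarize tissue expression from a {gene: {tissue: TPM}} mapping."""
    total = len(tpm_by_gene)
    if total == 0:
        raise ValueError("no genes supplied")

    # Genes detected (TPM > 0) in every tissue.
    expressed_everywhere = [
        gene for gene, tpm in tpm_by_gene.items()
        if all(value > 0 for value in tpm.values())
    ]
    # Genes with no detectable expression in any tissue.
    silent = [
        gene for gene, tpm in tpm_by_gene.items()
        if all(value == 0 for value in tpm.values())
    ]

    # For each expressed gene, count the tissue with the highest TPM.
    # Ties are resolved in favor of the first tissue listed.
    peak_tissue_counts = {}
    for gene, tpm in tpm_by_gene.items():
        if gene in silent:
            continue
        peak = max(tpm, key=tpm.get)
        peak_tissue_counts[peak] = peak_tissue_counts.get(peak, 0) + 1

    return {
        "pct_expressed_everywhere": 100.0 * len(expressed_everywhere) / total,
        "pct_silent": 100.0 * len(silent) / total,
        "peak_tissue_counts": peak_tissue_counts,
    }


# Toy data (hypothetical values, not from the U. rhynchophylla transcriptome).
toy_tpm = {
    "geneA": {"root": 1.0, "stem hook": 2.0, "leaf": 3.0, "flower": 9.0},
    "geneB": {"root": 5.0, "stem hook": 0.0, "leaf": 1.0, "flower": 2.0},
    "geneC": {"root": 0.0, "stem hook": 0.0, "leaf": 0.0, "flower": 0.0},
    "geneD": {"root": 1.0, "stem hook": 1.5, "leaf": 0.5, "flower": 4.0},
}
print(summarize_expression(toy_tpm))
```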
STRING was used to build a protein-protein interaction network of 13 UrGATA proteins based on the transcriptome database of the four tissues. In the protein interaction network (Fig. 10), a darker color indicates a greater number of interactions with other GATA proteins. Among the 13 UrGATA proteins, five belonged to subgroup I, five to subgroup II, one to subgroup III, and the remaining two to subgroup IV. UrGATA4, UrGATA5, UrGATA17, UrGATA22, and UrGATA24 were the five key nodes in the network and showed different intensities of interaction with other UrGATA proteins. These results suggest that these UrGATA proteins, especially UrGATA4, might play a key role in GATA-mediated regulation.
3.7 The accumulation of four TIAs in U. rhynchophylla under different lights
To determine whether altered light treatments could affect TIA accumulation in tissue culture-grown U. rhynchophylla seedlings, four TIAs (corynoxeine, isocorynoxeine, isorhynchophylline, and rhynchophylline) were measured at six time points (0d, 1d, 4d, 7d, 14d, 21d) under three conditions: normal light (CK), 20% light, and red light (see Material and methods). As shown in Fig. 11, neither treatment significantly changed corynoxeine content before 7d; red light then increased accumulation from 7d to 21d, whereas 20% light produced less corynoxeine at 7d and 14d but higher levels by 21d. For isocorynoxeine, red light resulted in steady increases over the course of the experiment, whereas 20% light gave higher content at 7d and 21d only. The results for isorhynchophylline were more complex: red light increased content only at 7d and 21d, while 20% light initially decreased content at 4d but then resulted in significant increases at all later time points (7d, 14d, and 21d). For rhynchophylline, both red light and 20% light resulted in increases from 7d through 21d.
These results indicate that responses to the light treatments were treatment- and TIA-specific, but both treatments significantly altered the content of the four measured TIAs during the experiment. In general, the light treatments increased TIA levels relative to the control at later time points (from 7d onward).
3.8 Expression patterns of UrGATA genes and key enzyme genes under different lights
The expression profiles of the 25 UrGATA genes and key TIA enzyme genes in leaf tissue under normal light, 20% light, and red light were examined using RT-qPCR, and heat maps were generated to display the expression profiles (Fig. 12a, Fig. 12b). Several UrGATA genes (UrGATA1, UrGATA9, UrGATA14-15, UrGATA18, UrGATA20-21, and UrGATA24) were not expressed in leaf tissue, which is consistent with the transcriptome sequencing data (Fig. 9, Fig. 12a). Several other UrGATA genes (UrGATA2, UrGATA4-5, UrGATA11, UrGATA13, UrGATA17, and UrGATA22) were up-regulated under red light during the 21d treatment compared to CK. In contrast, most UrGATA genes were down-regulated under the 20% light condition, with only three (UrGATA7, UrGATA8, and UrGATA23) showing increased expression.
Among the key enzyme genes involved in the TIA pathway, UrG8H, Ur7DLGT, UrAS, UrAnPRT, UrTSA, UrTDC, UrSTR, and UrSGD were significantly up-regulated by red light, while the expression of Ur8HGO and Ur7DLH increased, but not significantly, compared to the CK. Interestingly, UrGES, Ur8HGO, UrIO, Ur7DLH, UrLAMT, UrSLS, UrTSA, and UrTSB were highly up-regulated under the 20% light treatment compared to the CK (Fig. 12b).
Correlations between the expression of UrGATA genes and key enzyme genes were analyzed (Fig. 13). UrGATA7 and UrGATA8 were positively correlated with the key enzyme genes analyzed and showed high expression under the different light treatments. Among these, UrAS, UrTDC, and UrG8H showed strong correlations with UrGATA8 (p < 0.05, r > 0.8), and UrGATA7 was significantly correlated with UrSGD, UrLAMT, UrIO, and Ur7DLH (p < 0.05, r > 0.8). Moreover, UrGATA23 was positively correlated with the expression of UrGES (Pearson coefficient 0.87). These results indicate that UrGATA genes and key enzyme genes may be similarly regulated by the light treatments, and that UrGATA TFs may therefore regulate the expression of key enzyme genes, resulting in increased TIA content.
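The correlation screen described above can be illustrated with a short sketch. It assumes that each gene has a vector of expression values measured across the same samples and that pairs are retained when the Pearson coefficient exceeds 0.8; the function names and toy values are hypothetical, and the significance test (p < 0.05) is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def strongly_correlated_pairs(tf_expr, enzyme_expr, r_cutoff=0.8):
    """List (TF, enzyme, r) for every pair whose Pearson r exceeds r_cutoff."""
    pairs = []
    for tf_name, tf_values in tf_expr.items():
        for enzyme_name, enzyme_values in enzyme_expr.items():
            r = pearson_r(tf_values, enzyme_values)
            if r > r_cutoff:
                pairs.append((tf_name, enzyme_name, round(r, 2)))
    return pairs


# Toy expression vectors across five samples (hypothetical values).
toy_tf = {"tfX": [1.0, 2.0, 3.0, 4.0, 5.0]}
toy_enzymes = {
    "enzA": [2.0, 4.0, 6.0, 8.0, 10.0],   # rises with tfX
    "enzB": [5.0, 4.0, 3.0, 2.0, 1.0],    # falls as tfX rises
}
print(strongly_correlated_pairs(toy_tf, toy_enzymes))
```

A high coefficient in such a screen indicates co-variation only; it does not by itself show that the TF regulates the enzyme gene.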
To predict whether UrGATA transcription factors can bind to cis-acting elements in the promoters of key enzyme genes, the 2.0-kb promoter regions of the key enzyme genes in the TIA pathway of U. rhynchophylla were analyzed using PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/). GATA cis-acting elements were predicted in the promoter regions of UrGES, UrG8H, Ur8HGO, UrSLS, UrTSB, and UrTDC, supporting a possible mechanism by which these key enzyme genes might be regulated by UrGATA TFs.
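As an illustration of how candidate GATA binding sites can be located in a promoter, the sketch below scans both strands of a sequence for the WGATAR consensus (W = A or T, R = A or G) commonly described for GATA factors. This consensus is an assumption made for the example; the motif definitions used by the online tool may differ, and the function names and the toy sequence are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# A lookahead is used so that overlapping sites are also reported.
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]))")
SITE_LENGTH = 6


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_gata_sites(promoter):
    """Return sorted (start, strand, site) tuples for WGATAR matches.

    `start` is the 0-based position of the leftmost base of the site on the
    forward strand, whichever strand the match was found on.
    """
    promoter = promoter.upper()
    hits = []
    for match in GATA_SITE.finditer(promoter):
        hits.append((match.start(), "+", match.group(1)))
    length = len(promoter)
    for match in GATA_SITE.finditer(reverse_complement(promoter)):
        # Convert the reverse-strand offset to a forward-strand coordinate.
        start = length - (match.start() + SITE_LENGTH)
        hits.append((start, "-", match.group(1)))
    return sorted(hits)


# Toy promoter fragment (hypothetical, not a U. rhynchophylla sequence).
print(find_gata_sites("CCAGATAGCC"))
```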