Identification and Phylogenetic relationship of the ST gene family in N. nucifera
To identify all putative ST proteins in the N. nucifera genome, we searched the database using the HMM profile of the MFS domain (PF07690) together with BLASTP. A total of 35 STs were identified and subjected to Pfam and SMART analyses; the results are listed in TableS.1. The ST genes were clustered into seven groups: STP, VGT, PLT, INT, SFP, TMT and Glct. No members of the SUT subfamily were found in N. nucifera. To investigate the classification and phylogenetic relationships of the ST gene family in N. nucifera, we constructed a phylogenetic tree from the ST protein sequences. In the phylogenetic tree (Fig. 1), the grouping of all ST proteins was consistent with the Arabidopsis groups.
To explore the diversity within each group, we identified conserved motifs with the MEME program. As shown in Fig. 1, the ST proteins share motifs 1-10, which are conserved across the ST family. Motifs 1, 4, 7, 8 and 10 are representative of the MFS domain. Proteins with similar motifs and motif compositions were clustered into the same group. These results indicate that the ST gene family has highly conserved domains and motifs (Fig.4).
Evolution and Expansion of ST in different plant species
To investigate the evolution of the ST gene family in the plant kingdom, we selected 9 Angiospermae (6 eudicots and 1 basal angiosperm), 1 Pteridophyta and 1 Bryophyta species for comparative analysis (Fig. 2). The number of ST genes in each species was counted at the whole-genome level (Fig. 2a). Land plants have a relatively large number of ST genes. In addition, species with larger genomes tend to contain more ST genes, except for Am. trichopoda and P. patens. Among all species, the density of ST proteins was highest in A. thaliana (0.3926 number/Mb), followed by B. rapa (0.2403 number/Mb) and C. papaya (0.1704 number/Mb), and these densities were higher than those in lower plants. This can be explained by the duplication history of these species: Vitis vinifera, P. trichocarpa and C. papaya did not undergo the α and β duplications, and Am. trichopoda, a basal angiosperm, did not undergo the γ duplication event. Furthermore, specific WGT events resulted in more ST gene family members in some species [40, 41]. No ST genes were detected in V. carteri or C. reinhardtii. We then constructed a phylogenetic tree of the ST genes to analyze the evolutionary relationships among these species (Fig. 2b). The phylogenetic tree showed that the ST gene family formed seven distinct groups (STP, VGT, PLT, INT, SFP, TMT and Glct), which is consistent with the result for A. thaliana. The family expanded during the evolution from lower to higher plants, and the density of ST proteins increased as plants evolved. From algae to angiosperms, the ST gene family has highly conserved domains and motifs. Based on these findings, we reconstructed the evolutionary history of the ST family in the plant kingdom.
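The density figures above are gene counts normalized by genome size. A minimal sketch of that calculation follows; the function name and the input values are illustrative assumptions (the per-species counts and genome sizes are not restated in the text), chosen only so that the result matches the density reported for A. thaliana.

```python
def st_density(n_genes: int, genome_size_mb: float) -> float:
    """Return the number of ST genes per megabase of genome."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return n_genes / genome_size_mb


# Assumed inputs, not taken from the text: 53 genes in a 135 Mb genome.
density = st_density(53, 135.0)
print(f"{density:.4f} number/Mb")
```

Normalizing by genome size lets species with very different genome sizes be compared on one scale, which is why the ranking by density can differ from the ranking by raw gene count.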
Chromosomal distribution and synteny analysis of NNUST genes
The NNUST genes were unevenly distributed across the 11 megascaffolds (Fig. 3): some megascaffolds carried many genes, whereas others carried few. Megascaffold1 contained the largest number of NNUST genes (11), followed by megascaffold6 (4) and by megascaffold2 and megascaffold3 (3), whereas megascaffold10, megascaffold14, megascaffold5 and megascaffold8 each carried only one gene. Duplicated NNUST genes were identified with PlantDGD. The duplicated genes were derived from four modes of gene duplication: 7 from whole-genome duplication (WGD), 2 from tandem duplication (TD), 7 from transposed duplication (TRD) and 17 from dispersed duplication (DSD). These results indicate that dispersed duplication may be a major driving force in the evolution of the NNUST genes.
To investigate the evolution of the NNUST family, we constructed four comparative microsynteny maps between N. nucifera and three dicots (Arabidopsis, S. tuberosum and grape) and one monocot (maize) (Fig. 4). The largest number of collinear gene pairs was found with maize (135), followed by grape (107), Arabidopsis (89) and S. tuberosum (22) (TableS.2). Among these syntenic gene pairs, some genes, such as NNUSTP7 and NNUSFP5, corresponded to at least 4 collinear genes, especially in maize and grape. Some of these NNUST genes (corresponding to at least 4 collinear genes) were identified in the N. nucifera/Arabidopsis, N. nucifera/grape and N. nucifera/maize comparisons, indicating that they may have existed before the ancestral divergence and played an important role in the evolution of the NNUST gene family. In contrast, no collinear gene pairs of the INT subgroup were identified between N. nucifera and any of the other four species, indicating that these genes may have arisen after the divergence of dicotyledonous and monocotyledonous plants.
To further investigate the evolutionary footprint of the NNUST family, the Ka/Ks ratios of the NNUST gene pairs between N. nucifera and Arabidopsis were calculated (TableS.3). All collinear NNUST gene pairs had Ka/Ks < 1, suggesting that the NNUST gene family has been under purifying selection during evolution (Fig.S1).
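The relative contribution of each duplication mode can be checked with a short calculation. The sketch below uses the counts reported in this paragraph; the variable and function names are illustrative only.

```python
# Counts of duplicated NNUST genes per duplication mode, as reported in the text.
duplication_counts = {"WGD": 7, "TD": 2, "TRD": 7, "DSD": 17}


def mode_shares(counts):
    """Return each duplication mode's fraction of all duplicated genes."""
    total = sum(counts.values())
    return {mode: n / total for mode, n in counts.items()}


shares = mode_shares(duplication_counts)
dominant = max(shares, key=shares.get)  # mode with the largest share

for mode, share in shares.items():
    print(f"{mode}: {share:.1%}")
print("dominant mode:", dominant)
```

With these counts, dispersed duplication accounts for 17 of the 33 duplicated genes, slightly more than half.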
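The interpretation of Ka/Ks follows a standard convention: a ratio below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. A minimal sketch of that rule is given here; the function name, the tolerance and the example values are hypothetical and are not taken from TableS.3.

```python
def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify selective pressure on a gene pair from its Ka and Ks values."""
    if ks <= 0:
        # Ka/Ks cannot be computed without synonymous substitutions.
        return "undefined"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical (Ka, Ks) values for illustration only.
example_pairs = {"pair1": (0.12, 0.85), "pair2": (0.40, 0.40), "pair3": (0.90, 0.30)}
for name, (ka, ks) in example_pairs.items():
    print(name, classify_selection(ka, ks))
```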
Comparative expression pattern analysis of the ST genes in different tissues of N. nucifera
To compare the expression of different sugar transporters across tissues, we investigated their divergent expression patterns (Fig.5, TableS.4). We detected the expression levels of the MST genes in four tissues: leaves, petioles, flowers and rhizomes. In the SFP family, all genes showed relatively low expression in leaves and petioles, with NNUSFP3 slightly higher than the others; in the rhizome, NNUSFP3 showed the highest expression, followed by NNUSFP4 and NNUSFP5, whereas NNUSFP2 was hardly expressed. In the pGlcT family, all genes were expressed in flowers, with NNUpGlcT2 and NNUpGlcT4 showing the highest expression. NNUpGlcT4 was the predominantly expressed member of this family and was expressed in all four tissues, followed by NNUpGlcT5. In the STP family, NNUSTP6 appears to be a weakly active member, as its expression was undetectable in our study. NNUSTP3 and NNUSTP9 were the main members of the family and were expressed in all four tissues; NNUSTP9 showed the highest expression, especially in flowers. NNUSTP1, NNUSTP4, NNUSTP5, NNUSTP7 and NNUSTP8 were detected in leaves but showed little or no expression in the other tissues. In the INT and PLT families, some genes (NNUINT2, NNUINT3, NNUPLT1 and NNUPLT2) were detected only in leaves. NNUINT4 and NNUPLT4 showed higher expression only in flowers, whereas NNUINT1 and NNUPLT3 showed little or no expression in any of the four tissues. In the tMT family, all genes were detected in all four tissues, and NNUtMT2 was the most active. In the VGT family, both NNUVGT1 and NNUVGT2 were highly expressed in leaves and flowers but weakly expressed in the other tissues.
Cis-elements and Interaction Network Analysis among ST Proteins in N. nucifera
The cis-elements of the ST genes were identified in the 1.5 kb promoter regions and analyzed using the Plant Cis-acting Regulatory DNA Elements (PLACE) website. We identified the 10 most common cis-elements in the ST genes of N. nucifera (Fig.S2).
These 10 common cis-regulatory elements were highly conserved among all the ST genes studied in N. nucifera (Fig.S2). Three of them, ABRE, the TGACG-motif and the GARE-motif, are responsive to the plant hormones ABA, JA and GA, respectively. Others are responsive to abiotic and biotic stresses, including a fungal elicitor-responsive element (W-box), a light-responsive element (G-box), a low-temperature-responsive element (LTR), a defense- and stress-responsive element (TC-rich repeats) and a drought-responsive element (MBS), indicating the importance of ST genes in stress tolerance. The ST genes were responsive to various stresses, including drought, cold and salinity, which may be due to upstream gene specificity and the binding of the corresponding cis-elements that regulate the expression of ST genes.
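Conceptually, identifying cis-elements amounts to scanning each promoter sequence for known short motifs. The sketch below illustrates this with a simple exact-match scan; it is not the PLACE procedure itself. The motif sequences are commonly cited core sequences assumed here for illustration (they are not given in the text), the promoter is a hypothetical toy sequence, and only the forward strand is scanned for simplicity.

```python
# Assumed core sequences for three of the elements named above (illustrative only).
MOTIFS = {
    "G-box": "CACGTG",
    "W-box core": "TTGAC",
    "TGACG-motif": "TGACG",
}


def scan_promoter(seq, motifs=MOTIFS):
    """Return 0-based start positions of each motif on the forward strand."""
    seq = seq.upper()
    hits = {}
    for name, motif in motifs.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            # Continue one base further so overlapping matches are found.
            start = seq.find(motif, start + 1)
        hits[name] = positions
    return hits


promoter = "AAACACGTGTTTGACGAA"  # hypothetical toy sequence, not a real promoter
hits = scan_promoter(promoter)
for name, positions in hits.items():
    print(name, positions)
```

A real analysis would scan both strands of the full 1.5 kb region and allow degenerate bases in the motif definitions.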
Expression profiles of ST genes under abiotic stresses in N. nucifera
In plants, the ST gene family plays important roles in development and in stress responses. Abiotic stresses such as drought, extreme temperatures and salinity adversely affect plant growth and crop productivity. We therefore chose NaCl, PEG, ABA and cold treatments to identify stress-responsive ST genes (Fig.6, TableS.5). Under ABA treatment, 23 ST genes were upregulated and 2 ST genes were downregulated. Among the 31 ST genes, the expression of NNUSFP5, NNUSTP2, NNUSTP4 and NNUSTP5 was more than 6 times that of the control at 8 h. Under NaCl treatment, 25 ST genes were upregulated. In particular, NNUSFP2, NNUSFP3, NNUSTP3, NNUSTP4, NNUSTP6, NNUSTP7 and NNUVGT2 were upregulated at 8 h or 16 h but downregulated at 24 h. Under PEG treatment, 12 ST genes were upregulated at 8 h, and the expression of NNUSTP5 and NNUSTP8 peaked at 24 h at more than 6 times that of the control. Under cold treatment, 7 ST genes were upregulated and reached their highest expression at 8 h. Among the 31 ST genes, the expression of NNUINT3, NNUSTP2 and NNUSTP5 was more than 5 times that of the control at 8 h, and the expression of NNUINT1, NNUINT2, NNUSTP3, NNUSTP5 and NNUSTP8 was more than 4 times that of the control at 24 h. Notably, the NNUVGT genes showed little or no response to the four stress treatments.
To further investigate the connections among these ST genes, correlation and co-regulatory networks were established based on the Pearson correlation coefficients (PCCs) of the relative expression of the genes (Fig.6b). NNUST gene pairs whose PCC values were greater than 0.5 and significant at the 0.05 level were collected and visualized to construct co-regulatory networks for hormone and abiotic stress responses (Fig.S3). All gene pairs with significant positive correlations are shown in the co-regulatory network, which comprises a total of 35 nodes. The NNUST interaction network shows complex correlations with other genes in N. nucifera, which may indicate that NNUST genes are involved in many fundamental mechanisms and regulate many downstream factors and/or are regulated by many upstream genes. The expansion of the gene family depicted in the network could help plants adapt to changing environments by increasing cooperation and acquiring new functions.
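The edge-selection rule described above (PCC greater than 0.5 and significant at the 0.05 level) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the text does not name the significance test, so a two-sided t-test on the correlation coefficient is assumed; the gene names and expression profiles are hypothetical; and the default critical value 2.776 applies only to profiles with six observations (4 degrees of freedom).

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # flat profile: correlation undefined, treated as 0 here
    return sxy / math.sqrt(sxx * syy)


def coexpression_edges(profiles, r_min=0.5, t_crit=2.776):
    """Keep gene pairs with PCC > r_min whose t statistic exceeds t_crit.

    t_crit must match the degrees of freedom (number of observations - 2);
    2.776 is the two-sided 0.05 critical value for 4 degrees of freedom.
    """
    edges = []
    for g1, g2 in combinations(profiles, 2):
        x, y = profiles[g1], profiles[g2]
        r = pearson(x, y)
        if r <= r_min:
            continue  # negative or weak correlations are not drawn
        df = len(x) - 2
        denom = 1.0 - r * r
        t = math.inf if denom <= 0 else r * math.sqrt(df / denom)
        if t > t_crit:
            edges.append((g1, g2, round(r, 3)))
    return edges


# Hypothetical relative-expression profiles (six observations per gene).
profiles = {
    "geneA": [1, 2, 3, 4, 5, 6],
    "geneB": [2, 4, 5, 8, 10, 13],
    "geneC": [6, 5, 4, 3, 2, 1],
    "geneD": [3, 1, 4, 1, 5, 9],
}
edges = coexpression_edges(profiles)
print(edges)
```

In this toy data set only one pair passes both filters: geneC is negatively correlated with the others, and geneD exceeds the 0.5 threshold with geneA and geneB but is not significant with only six observations. This shows why the two criteria are applied together.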