Identification of the AP2/ERF superfamily in longan
All candidate AP2/ERF genes corresponding to the Pfam AP2 domain (PF00847) were extracted from the longan genome dataset [33]. We annotated 125 distinct AP2/ERF putative genes based on the complete AP2 domains (Additional file 1: Table S1). A total of 101 genes with a single AP2 domain were assigned to the ERF family, which was divided into the ERF and DREB subfamilies with 61 and 40 genes respectively. A total of 19 genes were assigned to the AP2 family, and 13 of them contained two AP2/ERF domains. The remaining six genes, Dlo_000585.2 (AIL), Dlo_023666.1 (WRI1), Dlo_008555.1 (ADAP), Dlo_009234.2 (RAP2.7), Dlo_007774.3 (RAP2.7), and Dlo_012821.1 (RAP2.7), contained only one AP2 domain, but had high similarity with AP2 family members and were distinct from ERF family members. Because of this phylogenetic relationship, we assigned them to the AP2 family, which is consistent with the corresponding genes in Arabidopsis [1]. Four genes, which contained a single AP2 domain and a B3 domain, were assigned to the RAV family. The Dlo_030750.1, which was alone on a branch and close to the RAV family, was homologous to Arabidopsis Soloist (At4g13040). Therefore, we found that the longan AP2/ERF (DlAP2/ERF) superfamily contained 101 ERF, 19 AP2, 4 RAV, and 1 Soloist members.
The following gene characteristics were analyzed: lengths of the coding sequence (CDS) and protein sequence, and protein isoelectric point (pI) and molecular weight were analyzed (Additional file 1: Table S1). Among the 125 DlAP2/ERF genes, the completed CDS of one gene was not found (dlo_039227.1, ERF-77); the other 124 genes were analyzed further. Dlo_013715.1 was the smallest protein with 123 amino acid (aa) and the largest protein was Dlo_011527.1 (Baby Boom, BBM) with 762 aa. The molecular weights of the proteins ranged from 13.1 to 83.3 kDa, and the pI ranged from 4.53 (Dlo_001352.1, ERF-35) to 10.82 (Dlo_013715.1, ERF-50).
Phylogenetic analysis and classification of the longan AP2/ERF superfamily
The evolutionary relationship of the DlAP2/ERF superfamily was assessed by phylogenetic reconstruction using the conserved AP2 domain of the AP2/ERF proteins with the neighbor-joining (NJ) method (Fig. 1). The phylogenetic analysis indicated that the DlAP2/ERF superfamily divided into 13 groups, which is consistent with previous studies [1, 29]. Groups I–IV belonged to the DREB subfamily, groups V–X belonged to the ERF subfamily, and groups XI, XII, and XIII belonged to the AP2, RAV, and Soloist families, respectively. The Soloist gene contained a single AP2 domain but was clustered with the RAV family, which is consistent with what was found in Chinese cabbage [29]. However, in grape, the Soloist gene was clustered with the AP2 family [27]. The in-depth phylogenetic analysis of the AP2 family members in group XI showed that they clustered into three subgroups, AP2-R1, AP2-R2, and AP2-R3 (Additional file 2: Fig. S1).
To further confirm the evolutionary relationship of the DlAP2/ERF superfamily, we also used a maximum likelihood (ML) method to assess the phylogenetic reconstruction (Additional file 3: Fig. S2). The phylogenetic analysis indicated that the AP2/ERF superfamily clustered into four subfamilies, ERF, AP2, RAV, and Soloist, which is similar to the results obtained with the NJ method. The exceptions were Dlo_002336.1 (ERF-91), Dlo_000247.1 (ERF-2), and Dlo_000250.1 (ERF-3), which contain a single AP2 domain and were assigned to the ERF family but clustered with the DREB family with the ML method. The motif analysis showed that in Dlo_002336.1 (ERF-91), Dlo_000247.1 (ERF-2) and Dlo_000250.1 (ERF-3), the highly conserved AP2 domain was similar to that of the DREB family (Fig. 2B), so the different phylogenetic results between the NJ and ML method may be explained by the high similarity of the AP2 conserved domains.
Motif composition and gene structure of the DlAP2/ERF superfamily
The conserved motifs and exon–intron organization of all identified DlAP2/ERF genes were examined to gain insights into the evolution of the DlAP2/ERF superfamily. The results showed that the different families among the AP2/ERF superfamily usually had different structures. For the conserved motif analysis among the DlAP2/ERF proteins, we used the MEME Suite of tools [34]. We identified a total of 30 motifs and used them to analyze the characteristics of the DlAP2/ERF superfamily proteins (Fig. 2B). The results showed that motif-1 was present in almost all AP2/ERF family members, motif-2 was present in all ERF family members. Almost all ERF family members contained motif-1, motif-2, motif-3, and motif-4, and almost all AP2 family members contained motif-1, motif-3, motif-4, motif-9, and motif-14. Additionally, AP2/ERF superfamily members in the same groups usually shared a similar motif composition. For example, motif-5 and motif-6 were unique to the AP2 family, motif-13 and motif-17 were unique to the RAV family, and motif-23 was unique to the AP2-R3 subfamily.
Gene structure analysis showed that the ERF family members had from zero to three introns (Fig. 2C); 14 had one intron, two had two introns, one had three introns, and the remaining 83 members had no introns. All AP2 members and the Soloist gene had one to ten introns, whereas the RAV family had no introns. Further, genes within the same group usually had similar structures. For example, all DREB subfamily members had no intron, and all group V members, five group VI members also had no intron. The similarity of conserved motif composition and gene structure in the same subfamily indicated that the phylogenetic classification was reliable.
Cis-acting elements in longan AP2/ERF genes
To further analyze the potential function of DlAP2/ERF genes, a 2000-bp regulatory region upstream of the ATG promoter was searched for cis-acting elements. Light-responsive (1522), hormone-responsive (870), and stress-responsive (402) boxes were detected in the promoter regions of AP2/ERF superfamily genes (Fig. 3; Additional file 4: Table S2). Hormones-related elements, such as methyl jasmonate (MeJA) (342), ABA (327), gibberellin (GA) (75), auxin (63), and SA (63) boxes, were often found in the promoter of DlAP2/ERF superfamily genes. Stress-related elements, such as anaerobic induction (294), low-temperature (55), defense (49), and wound induction (4) boxes, also were found in the promoter regions of DlAP2/ERF genes. Endosperm-related (35), circadian-related (23), meristem regulation (51), and seed regulation (8) cis-acting elements also were found in the DlAP2/ERF family. These results indicated that DlAP2/ERF genes may be regulated by various cis-acting elements in their promoters during growth and stress responsive.
RNA sequencing (RNA-seq) revealed SNPs and InDels in DlAP2/ERF genes in 'HHZ' and 'SJM' cultivars
To investigate the variation of DlAP2/ERF superfamily genes, we extracted the SNP and InDel data from four transcriptome datasets corresponding to NEC and embryogenic cultures of the 'HHZ' cultivar, different light quality and hormone treatments in EC of the 'HHZ' cultivar, and nine organs of the 'SJM' cultivar (Figs. 4 and 5; Additional file 5: Table S3 and Additional file 6: Table S4). We identified 74, 86, 92, and 159 SNPs, and 2, 3, 4, and 3 InDels in the AP2/ERF superfamily genes in these four datasets, respectively. Among then, Dlo_000585.1 (AIL), Dlo_007513.1 (ERF-11), Dlo_021787.1 (AIL), Dlo_013163.1 (ERF-49), and Dlo_026624.1 (ERF-78) had the highest number of SNPs in the 'HHZ' and 'SJM' cultivars; Dlo_000585.1 (AIL) had more SNPs in the 'HHZ' cultivar than in the 'SJM' cultivar, and Dlo_013163.1 (ERF-49) had more SNPs in the 'SJM' cultivar than in the 'HHZ' cultivar. In addition, 22 genes had SNPs that were detected only in the 'SJM' cultivar and five genes had SNPs that were detected in only the 'HHZ' cultivar. This difference in SNP numbers indicated that the AP2/ERF superfamily members may have diverse functions in these two cultivars.
RNA-seq revealed alternative splicing (AS) events of DlAP2/ERF genes in different tissues and treatments
AS events regulate gene expression and expand proteomic diversity in plant development processes, and are regulated by the developmental stage and stress conditions. To investigate transcript expression and protein translation of AP2/ERF superfamily members, we examined four types of AS events across the NEC and embryogenic cultures (EC, IcpEC, and GE) of the 'HHZ' cultivar, different light quality and hormone treatments in EC of the 'HHZ' cultivar, and nine organs of the 'SJM' cultivar, including intron retention, exon skipping, alternative 5′ splice site donor, and alternative 3′ splice site acceptor (Fig. 6). The largest number of AS events in the AP2/ERF superfamily was detected in the NEC (62) stage, and the smallest number of AS events was found in the EC (34) stage (Fig. 7; Additional file 7: Table S5). The most and least types of AS events among these four stage were intron retention (103) and exon skipping (15), respectively. Nineteen of the 125 DlAP2/ERF genes had AS events in the NEC stage, followed by 15 genes in the GE stage, 14 in the IcpEC stage, and 11 in the EC stage (Fig. 7). Additionally, seven genes (Dlo_012821.1 (RAP2.7), Dlo_032272.1 (ANT), Dlo_006200.1 (ERF-40), Dlo_005429.1 (ERF-38), Dlo_019916.1 (ERF-18), Dlo_015474.1 (ERF-14), and Dlo_000444.1 (ANT)) had a total of twelve specific AS events in the NEC stage. Interestingly, no specific AS events were found in the EC, IcpEC, and GE stages. These results showed that the large number of specific AS events in the NEC stage may form various proteins with different functions, which may be important for the maintenance of NEC.
To confirm whether the AS events of AP2/ERF superfamily were affected by different light qualities and hormones, we detected four types of AS events from the transcriptome datasets (Fig. 7 and Additional file 8: Table S6). The highest number of AS events was detected in blue-light treatment (12), followed by the dark (10) and white-light (8) treatments. Additionally, a total of eight genes had AS events in the blue-light treatment, followed by seven in the dark treatment and five in the white-light treatment (Fig. 7). Among the hormone treatments (2,4-dichlorophenoxyacetic acid (2,4-D) and kinetin (KT)), the highest number of AS events was detected in the 2,4-D+KT (16) treatment, followed by the 2,4-D (15), KT (15) and MS (Murashige and Skoog) (15) treatments. Further, 10 genes were found to undergo AS events in the 2,4-D+KT treatment, followed by nine genes in the 2,4-D, KT, and MS treatments (Fig. 7 and Additional file 9: Table S7). Additionally, Dlo_011527.1 (BBM) underwent specific AS event in the 2,4-D+KT treatment, and Dlo_018830.1 (ERF-70) and Dlo_011527.1 (BBM) underwent specific AS events in blue-light treatment. No specific AS events were detected in the dark, white-light, 2,4-D, KT, or MS treatments. We presumed that the AS events of Dlo_011527.1 (BBM) and Dlo_018830.1 (ERF-70) were affected by KT and blue-light quality.
By analyzing the 'SJM' cultivar transcriptome dataset, we found that the largest number of AS events in the AP2/ERF superfamily was in seed (51) and the smallest number was found in flower (23) (Additional file 10: Table S8). Among the types of AS events found in all nine organs, intron retention (263) was the most common and exon skipping (14) was the least common. A total of 16 AP2/ERF superfamily genes underwent AS events in seed, followed by 14 genes in root; the lowest number of AS events was found in flower (8 genes) and pulp (8 genes) (Fig. 7). Additionally, we detected five specific AS events (Dlo_030836.1 (WRI) and Dlo_014268.1 (ANT)) in seed, and one specific AS event (Dlo_032272.1 (ANT) and Dlo_034081.1 (ADAP))in root and flower, respectively. These results indicated that AS events can form diverse transcripts and may affect seed development. Overall, AS events are enriched in the NEC and seed stage, and they may also be affected by blue-light and KT.
Expression profiling of longan AP2/ERF genes by RNA-seq
To investigate whether the AP2/ERF superfamily genes affected early SE, the development of different organs, and response to light and hormone in longan, we calculated the transcript levels in longan transcriptome datasets using FPKM values and constructed functional pathways. The four transcriptome datasets included the NEC, EC, IcpEC, and GE of the 'HHZ' cultivar, the ECs of 'HHZ' under different light quality and hormone treatments, and nine organs of the 'SJM' cultivar. We first analyzed the transcriptional levels of the AP2/ERF genes in the NEC and embryogenic cultures (EC, IcpEC, and GE) of the 'HHZ' cultivar (Fig. 8, Additional file 11: Table S9). Among the 125 DlAP2/ERF genes, six (Dlo_028757.1, Dlo_023742.1, Dlo_007909.1, Dlo_027111.1, Dlo_000083.1 and Dlo_012177.1) were not expressed in any of the tested samples; the remaining 119 genes were most abundant in the NEC stage (8897.38), followed by the GE (7626.92), IcpEC (6283.83), and EC (3746.59) stages. Among the 119 genes, 67, 31, 11, and 9 genes were most highly expressed in the NEC, GE, IcpEC, and EC stages, respectively. We found 78 differentially expressed AP2/ERF genes in NEC vs. EC, 41 in EC vs. IcpEC, and 18 in IcpEC vs. GE. Besides, Dlo_000444.1 (ANT), Dlo_032272.1 (ANT), Dlo_028976.1 (RAP2.7), and Dlo_015581.1 (RAP2.7) in the AP2 family, and Dlo_005429.1 (ERF-38), Dlo_001352.1 (ERF-35), Dlo_023983.1 (ERF-75), Dlo_015474.1 (ERF-14), Dlo_006200.1 (ERF-40), Dlo_025295.1 (ERF-28), Dlo_022612.1 (ERF-23), and Dlo_024437.1 (ERF-76) in the ERF family were specifically expressed in the NEC stage, with extremely low expression levels in the other stages. Thus, these genes can be used as marker genes in the NEC stage. We presumed that DlAP2/ERF genes may effect mainly the development of NEC and GE, and affect the transition from NEC to EC.
To analysis whether the AP2/ERF superfamily responds to light quality, we analyzed the transcript levels of the DlAP2/ERF genes in ECs with blue-light, white-light, and dark treatments using FPKM values from the transcriptome dataset. We found 106 expressed DlAP2/ERF genes in the transcriptome dataset (Additional file 12: Fig. S3, Additional file 11: Table S9). They were most abundant in the dark treatment (6818.18), followed by white-light (3448.89) and blue-light (3071.69) treatments. A total of 34, 38, and 3 genes were differentially expressed in white-light vs. dark, dark vs. blue-light, and white-light vs. blue-light comparisons, respectively. Further, Dlo_004646.1 (PLT) and Dlo_000585.2 (AIL) had the highest expression levels in the white-light and blue-light treatments, respectively. These results showed that light may affect the expression of DlAP2/ERF genes but they may not be influenced by light quality changes.
To investigate whether the AP2/ERF superfamily responds to 2,4-D and KT, the expression levels of the DlAP2/ERF genes in the transcriptome dataset from hormone treatments (2,4-D and KT in ECs of 'SJM' cultivar) were analyzed. Among the 125 DlAP2/ERF genes, 107 were detected in the transcriptome dataset; the other 18 were not detected (Additional file 13: Fig. S4 and Additional file 11: Table S9). DlAP2/ERF genes were most abundant in the 2,4-D+KT (5770.59) and 2,4-D (5582.93) treatments. The MS (5149.25), and KT (4932.11) treatments, which lacked auxin (2,4-D) in the medium, had the lowest numbers of DlAP2/ERF genes. Additionally, we detected 10 differentially expressed genes in 2,4-D+KT vs. MS, followed by 2,4-D vs. MS (8), 2,4-D+KT vs. KT (7), and 2,4-D vs. KT (5). Only one and zero differentially expressed genes were detected in 2,4-D+KT vs. 2,4-D and KT vs. MS. These results indicate that some DlAP2/ERF genes may be affected by 2,4-D.
We also analyzed the transcriptional levels of DlAP2/ERF genes in nine organs of the 'SJM' cultivar (Additional file 14: Fig. S5, Additional file 11: Table S9). Among the 125 genes, we detected 120 genes that were expressed in all nine organs, three (Dlo_026157.1, Dlo_025387.1 and Dlo_012769.1) that were expressed in some of the organs, and two (Dlo_012177.1 and Dlo_008555.1) that were not detected in an on the organs. The AP2/ERF genes were most abundant in pericarp (14196), followed by stem (12807) and leaf (11765), and least abundant in root (3583) and seed (3831). Some of the genes were specifically expressed in different organs. For example, Dlo_015606.1 (ERF-64), Dlo_021787.1 (AIL), Dlo_014268.1 (ANT), and Dlo_000585.2 (AIL) were specifically expressed in seed, and Dlo_015581.1 (RAP2.7) was specifically expressed in flower bud and flower (Additional file 14: Fig. S5). The results indicated that the AP2/ERF superfamily may widely regulate the developmental processes of longan because of its role in the development of NEC and embryogenic cultures and various tissues, and may be induced by 2,4-D and light.
Expression patterns of AP2/ERF genes in NEC and embryogenic cultures of longan and the response to hormones by qRT-PCR
To confirm the transcriptional regulation of the DlAP2/ERF genes in the NEC and embryogenic cultures of the 'HHZ' cultivar, 12 differentially expressed DlAP2/ERF genes were selected for validation by qRT-PCR (Fig. 9). The qRT-PCR results showed that seven of the 12 selected genes had the same expression trends as those obtained by RNA-seq. Among them, Dlo_023524.1 (ERF-73) had the highest expression level in the NEC stage; Dlo_004646.1 (PLT), Dlo_000287.1 (AP2), Dlo_007774.3 (RAP2.7), and Dlo_009234.2 (RAP2.7) had the highest expression levels in the IcpEC stage; and Dlo_028566.1 (ERF-6), Dlo_011527.1 (BBM), Dlo_014268.1 (ANT)s and Dlo_000585.1 (AIL) had highest expression levels in the GE stage. Specially, the expression levels of Dlo_011527.1 (BBM) and Dlo_000585.1 (AIL) increased from the NEC to GE stage and reached their peaks in the GE stage. Interestingly, Dlo_015581.1 (RAP2.7), Dlo_019916.1 (ERF-18), and Dlo_013715.1 (ERF-50) were specifically expressed in the NEC stage, indicating they may marker genes for the NEC stage. Because of the high expression levels of AP2 genes in the IcpEC and GE stage, we presumed that the AP2 and AIL subfamily genes may promote the formation and development of GE.
According to the cis-acting elements in the AP2/ERF superfamily, we chose six of the 12 validated genes to confirm by qRT-PCR whether the expression patterns were influenced by different hormone treatments (SA, GA, ABA, and MeJA) (Fig. 10). We found that some of the DlAP2/ERF genes were significantly induced by multiple hormone treatments, and others were repressed. For instance, Dlo_023524.1 (ERF-73), Dlo_011527.1 (BBM), Dlo_000585.1 (AIL), Dlo_007774.3 (RAP2.7), and Dlo_009234.2 (RAP2.7) were significantly induced by MeJA treatment, whereas Dlo_004646.1 (PLT) was down-regulated and then up-regulated by MeJA treatment. Dlo_023524.1 (ERF-73) and Dlo_004646.1 (PLT) were induced by GA treatment, Dlo_011527.1 (BBM), Dlo_000585.1 (AIL), and Dlo_007774.3 (RAP2.7) were repressed by GA treatment, and Dlo_009234.2 (RAP2.7) was induced then repressed by GA treatment at different times. Dlo_004646.1 (PLT), Dlo_007774.3 (RAP2.7), and Dlo_009234.2 (RAP2.7) were induced by ABA treatment, Dlo_023524.1 (ERF-73) and Dlo_000585.1 (AIL) were repressed by ABA treatment, and Dlo_011527.1 (BBM) was induced then repressed by ABA treatment. Dlo_004646.1 (PLT), Dlo_007774.3 (RAP2.7), and Dlo_009234.2 (RAP2.7) were induced by SA treatment, whereas Dlo_011527.1 (BBM) and Dlo_000585.1 (AIL) were repressed by SA treatment. Interestingly, three genes, Dlo_011527.1 (BBM), Dlo_004646.1 (PLT), and Dlo_000585.1 (AIL), showed opposite expression patterns under the MeJA and SA treatments. Overall, the candidate genes were influenced by different hormone in various ways.
AP2/ERF protein interactions among specific proteins in longan
To confirm the potential functions of the DlAP2/ERF proteins, we selected six validated differentially expressed genes to construct a protein–protein interaction network using STRING 10.5 software and an Arabidopsis association model (Fig. 11). Dlo_023524.1, which shared high homology with AtRAP2.12, interacts with ACBP1, ACBP2, CYS2, and PRT6. Dlo_011527.1, which shared homology with AtBBM, interacts with MYB115, SERK1, AGL15, LEC1, and FUS3. Dlo_004646.1,which shared homology with AtPLT2, interacts with WOX5, SHR, EIR1, WOX4, BPS2, NRPB3, and NRPE3B. Dlo_000585.1, which shared high homology with AtAIL6, interacts with ELP1, TTG1, EDM2 AFO, LFY, WUS, and MP proteins. Dlo_007774.3 and Dlo_009234.2, which shared homology with AtRAP2.7, interact with TGG2, FKF1, EFE, GAI, TY1, KAN, and ROXY2.