Identification and classification of MADS-box genes in pineapple
Initially, 44 pineapple MADS-box genes were identified by a Hidden Markov Model (HMM) search. To make the search exhaustive, BLASTP was then run against the pineapple genome database using MADS-box protein sequences from Arabidopsis and rice as queries. In total, 48 MADS-box genes were identified in the pineapple genome (Table 1) and confirmed with the NCBI Conserved Domain Database. The CDS length of the pineapple MADS-box genes ranged from 180 bp (Aco030553.1) to 4569 bp (Aco027629.1). The relative molecular mass of the encoded proteins varied from 6.68 kDa to 166.54 kDa, and the protein isoelectric point (pI) ranged from 4.80 to 11.23.
To study the evolutionary relationship between pineapple MADS-box genes and the known MADS-box genes from Arabidopsis and rice, multiple sequence alignments were conducted and a phylogenetic tree was constructed from the amino acid sequences of the MADS-box proteins of pineapple, Arabidopsis and rice. Thirty-four pineapple genes were classified as type II MADS-box genes, comprising 32 MIKC-type and 2 Mδ-type genes (Fig. 1a). The remaining 14 type I MADS-box genes were further divided into the Mα, Mβ and Mγ subgroups. Mα was the largest type I subgroup, with 8 of the 14 type I genes, while 2 and 4 type I genes were classified into the Mβ and Mγ subgroups, respectively (Fig. 1a). The 32 MIKC-type pineapple genes were further divided into 11 clusters: TT16, APETALA3, PISTILLATA, SVP, ANR1, SEP, FUL, AGL12, AGAMOUS, AGL11 and SOC1 (Fig. 1b).
Gene structure and conserved motif analysis
To explore the structural evolution of MADS-box genes in pineapple, their exon-intron arrangements were examined with the Gene Structure Display Server. Closely related genes were usually similar in gene structure; for example, Aco004785.1, Aco011341.1, Aco007999.1 and Aco009993.1 all had 7 exons. However, some closely related genes differed markedly in structural arrangement (Fig. 2). For instance, Aco022101.1 had only one exon, while its close relative Aco027629.1 had 19 exons. Overall, pineapple MADS-box genes contained between 1 and 19 exons. Nine of the 48 MADS-box genes had only one exon, and all of these single-exon genes except Aco030553.1 belonged to type I. Most pineapple MADS-box genes had fewer than 10 exons; only three genes, Aco013736.1, Aco003667.1 and Aco027629.1, had 10, 11 and 19 exons, respectively (Fig. 2).
MEME software was used to analyze motifs in the MADS-box proteins. Twenty conserved motifs were identified (Fig. 3) and annotated with the SMART program. Motifs 1, 3, 7 and 11 correspond to the MADS domain, motif 2 represents the K domain, and motif 6 is the C domain. All MADS-box genes except four (Aco003667.1, Aco015492.1, Aco030656.1 and Aco019839.1) contained motif 1, and the four genes without motif 1 all contained motif 2. Motif 2 was identified in the majority of type II MADS-box genes but in only four type I genes (Aco019039.1, Aco011677.1, Aco030656.1 and Aco019839.1). Genes in the same group tended to share motifs. For example, the two genes of the Mδ-type group, Aco013736.1 and Aco019026.1, contained only motif 1, and Aco022101.1 and Aco027629.1, both in the Mγ group, possessed motifs 1, 8, 11, 15 and 20.
Location on chromosomes of pineapple MADS-box genes
The majority of pineapple MADS-box genes (42 out of 48) were distributed across 19 chromosomes, while the remaining 6 genes were located on 6 scaffolds that could not be assigned to chromosomes (Table 1, Fig. 4). Chromosome 1 carried the most genes (6 genes, 12.5%), followed by chromosome 15 (4 genes, 8.3%). Type II MADS-box genes were mapped to 18 chromosomes (all except chromosome 4), while type I MADS-box genes were found on only 9 chromosomes, reflecting their smaller number. Among type I genes, Mα group genes were distributed on chromosomes 7, 8, 9, 15, 19 and 20, whereas the two Mβ group genes were located on chromosome 1 and scaffold_1517. Genes in the Mγ group were located on chromosomes 4, 13 and 15.
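The percentages quoted above follow directly from the gene counts. A minimal sketch of that arithmetic is given here; the function name and dictionary keys are illustrative choices, and only the counts stated in the text are used.

```python
TOTAL_GENES = 48  # total pineapple MADS-box genes reported in the text


def share_of_total(count, total=TOTAL_GENES):
    """Return the percentage of the gene family represented by `count` genes."""
    return round(100.0 * count / total, 1)


# Counts stated in the text; the other chromosomes are not listed there.
reported_counts = {"chromosome 1": 6, "chromosome 15": 4}

for location, count in reported_counts.items():
    print(f"{location}: {count} genes ({share_of_total(count)}%)")

# Genes anchored to chromosomes versus genes left on unassigned scaffolds.
anchored = 42
unanchored = TOTAL_GENES - anchored
print(f"unassigned scaffolds: {unanchored} genes ({share_of_total(unanchored)}%)")
```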
Expression analysis of the pineapple MADS-box genes in different tissues
To investigate the expression patterns of pineapple MADS-box genes in different tissues, RNA-seq libraries were constructed from four pineapple tissues (leaf, flower, root and fruit), and RNA-seq analysis was performed to obtain FPKM values for the MADS-box genes. Forty MADS-box genes were expressed in at least one tissue, while the other 8 genes (Aco019026.1, Aco008623.1, Aco013644.1, Aco019842.1, Aco019839.1, Aco013324.1, Aco030553.1 and Aco028086.1) were not detectable in any of the four tissues. These 8 genes with no detectable expression (FPKM equal to 0 in all four tissues) were therefore filtered out, and the expression levels of the remaining 40 genes are shown in a heat map (Fig. 5).
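The filtering step described above can be sketched as follows. This is only an illustration of the rule "FPKM equal to 0 in all four tissues"; the gene names and FPKM values are hypothetical placeholders, not data from this study.

```python
TISSUES = ("leaf", "flower", "root", "fruit")


def split_by_detection(fpkm_table):
    """Split genes into detected (FPKM > 0 in at least one tissue) and undetected."""
    detected, undetected = {}, {}
    for gene, values in fpkm_table.items():
        if any(values[tissue] > 0 for tissue in TISSUES):
            detected[gene] = values
        else:
            undetected[gene] = values
    return detected, undetected


# Hypothetical FPKM values for illustration only.
example = {
    "geneA": {"leaf": 0.0, "flower": 12.4, "root": 0.0, "fruit": 3.1},
    "geneB": {"leaf": 0.0, "flower": 0.0, "root": 0.0, "fruit": 0.0},
    "geneC": {"leaf": 5.2, "flower": 0.0, "root": 0.7, "fruit": 0.0},
}

detected, undetected = split_by_detection(example)
print("kept for heat map:", sorted(detected))
print("filtered out:", sorted(undetected))
```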
The RNA-seq expression profile revealed that the majority of pineapple MADS-box genes were highly expressed in flower. Some genes, such as Aco019365.1, Aco017589.1 and Aco025594.1, were expressed at much higher levels in flower than in other tissues. In leaf, many genes had relatively low expression, but Aco027629.1 and Aco002729.1 were expressed at higher levels in leaves than in flowers. In fruit, a few genes, such as Aco002729.1, Aco016643.1 and Aco013229.1, showed high expression. Two genes, Aco007995.1 and Aco018015.1, were highly expressed in root, and Aco022101.1 was expressed only in root.
Ten MADS-box genes were randomly selected for quantitative RT-PCR analysis in flower and leaf tissues to verify the RNA-seq data (Fig. 6). The qRT-PCR results confirmed that most of these MADS-box genes had high expression in flower and low expression in leaves. However, a few genes, such as Aco027629.1 and Aco002729.1, were expressed at higher levels in leaves, the same trend as in the RNA-seq data. These results indicate that our RNA-seq data are suitable for investigating the expression patterns of MADS-box genes in different tissues of pineapple.
Expression analysis of pineapple MADS-box genes in green tip and white base leaves
Pineapple is a CAM plant that achieves greater net CO2 uptake than its C3 and C4 counterparts [24]. To investigate the potential roles of MADS-box genes in pineapple CAM photosynthesis, we studied the expression pattern of MADS-box genes in photosynthetic (green tip) and non-photosynthetic (white base) leaf tissues. The two tissues are physiologically different: the green tip has a very high chlorophyll concentration, while the white base contains extremely little chlorophyll, which is reflected in their different photosynthetic rates [25]. Genes with no detectable expression or low expression (FPKM less than 1 in both tissues) were filtered out. As shown in Fig. 7, the remaining MADS-box genes can be classified into three clusters. Over the 24-hour period, cluster I genes were expressed at higher levels in green tip than in white base. Cluster II genes showed the opposite pattern, with higher expression in white base than in green tip. Cluster III genes did not exhibit obvious differential expression between green tip and white base tissues. In addition, some MADS-box genes showed peak expression at a particular time in either green tip or white base. For example, Aco012428.1 had its highest expression at 6 pm in white base, while Aco027629.1 exhibited its highest expression at 12 am in green tip.
Clusters I and II together contained 14 genes, of which 6 were chosen for qRT-PCR analysis to verify their expression levels in green and white leaves (Fig. 8). In the qRT-PCR results, cluster I genes again showed higher expression in green tip than in white base leaves, and cluster II genes had higher expression in white base leaves. The qRT-PCR results also confirmed that Aco027629.1 had its highest expression at 12 am in green tip leaves.
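As a rough illustration of the low-expression filter and of the three expression patterns described above, the sketch below filters genes whose FPKM stays below 1 in both tissues and then labels each remaining gene by the log2 ratio of its mean FPKM in green tip versus white base. This is a simple stand-in and not the clustering behind Fig. 7: the reading of the filter as "maximum FPKM across time points below 1", the log2-ratio threshold, the pseudocount, and all gene names and values are assumptions made for the example.

```python
import math


def passes_filter(green, white, min_fpkm=1.0):
    """Keep a gene unless its FPKM stays below min_fpkm in both tissues (assumed rule)."""
    return max(green) >= min_fpkm or max(white) >= min_fpkm


def label_gene(green, white, min_log2_ratio=1.0, pseudocount=1.0):
    """Label a gene by comparing mean FPKM in green tip and white base.

    The threshold and pseudocount are hypothetical illustration values.
    """
    mean_green = sum(green) / len(green)
    mean_white = sum(white) / len(white)
    ratio = math.log2((mean_green + pseudocount) / (mean_white + pseudocount))
    if ratio >= min_log2_ratio:
        return "higher in green tip"
    if ratio <= -min_log2_ratio:
        return "higher in white base"
    return "no clear difference"


# Hypothetical FPKM time series: (green tip values, white base values).
examples = {
    "geneA": ([20.0, 35.0, 28.0], [2.0, 3.0, 1.5]),
    "geneB": ([1.0, 0.5, 2.0], [15.0, 22.0, 18.0]),
    "geneC": ([8.0, 9.0, 7.5], [7.0, 10.0, 8.0]),
    "geneD": ([0.2, 0.0, 0.4], [0.1, 0.3, 0.0]),  # removed by the filter
}

labels = {
    gene: label_gene(green, white)
    for gene, (green, white) in examples.items()
    if passes_filter(green, white)
}
for gene, label in labels.items():
    print(gene, "->", label)
```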
Diurnal expression analysis of pineapple MADS-box genes
To identify the circadian expression pattern of MADS-box genes in pineapple, RNA-seq data from green tip and white base leaf tissues over a 24-hour period were used to determine which MADS-box genes had expression patterns that fit the model of cycling genes in Haystack [26]. Transcription factors with a strong correlation (r > 0.7) have been empirically considered to have a diurnal rhythm [27], so we used the same correlation cutoff as the threshold for analyzing the diurnal expression pattern of MADS-box genes. Eleven of the 48 MADS-box genes (23%) were cycling in either green tip or white base leaf tissues. Of these cycling genes, 4 (Aco013229.1, Aco015104.1, Aco004028.1 and Aco019365.1), all belonging to the type II group, were cycling in both green tip and white base leaf tissues (Table 2).
Four genes were cycling in green tip leaf only (Fig. 9). Aco015492.1 exhibited peak expression at 10 am and lowest expression at 1 pm, while Aco004988.1 had lowest expression at 10 am and highest expression at 1 pm. Aco002729.1 and Aco016643.1 showed a similar diurnal rhythm, with peak expression at 8 am and lowest expression at 6 pm. Three genes were cycling only in white base leaf tissues (Fig. 9). Interestingly, Aco012428.1 exhibited two expression peaks, at 6 am and 10 am. Four genes were cycling in both green tip and white base leaves (Fig. 10). Aco013229.1 had much higher expression in green tip than in white base during the daytime, from 6 am to 6 pm, and similar expression levels in both tissues at night. Aco019365.1 exhibited a similar expression pattern in green tip and white base, with highest expression at 3 pm and lowest expression at 10 pm, while Aco004028.1 showed opposite expression profiles in the two tissues, with highest expression at 10 pm in white base and at 8 am in green tip.
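The idea of calling a gene "cycling" when its time course correlates with a model pattern above a cutoff can be sketched as below. This is not Haystack itself: it assumes a single 24-hour cosine template tested at integer-hour phases, whereas the actual tool and its model set may differ. The sampling times and expression values are hypothetical; only the r > 0.7 cutoff comes from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_cosine_fit(times_h, expression, period_h=24.0):
    """Return (best r, peak hour) over cosine templates shifted by whole hours."""
    best_r, best_phase = -1.0, None
    for phase in range(int(period_h)):
        template = [math.cos(2 * math.pi * (t - phase) / period_h) for t in times_h]
        r = pearson(expression, template)
        if r > best_r:
            best_r, best_phase = r, phase
    return best_r, best_phase


def is_cycling(times_h, expression, cutoff=0.7):
    """Apply the correlation cutoff (r > 0.7) quoted in the text."""
    r, _ = best_cosine_fit(times_h, expression)
    return r > cutoff


# Hypothetical sampling every 2 h and synthetic expression values.
times = list(range(0, 24, 2))
rhythmic = [10 + 5 * math.cos(2 * math.pi * (t - 8) / 24) for t in times]  # peak at 8 h
flat = [5.0] * len(times)

r, peak = best_cosine_fit(times, rhythmic)
print(f"rhythmic: r = {r:.2f}, peak hour = {peak}, cycling = {is_cycling(times, rhythmic)}")
print(f"flat: cycling = {is_cycling(times, flat)}")
```

Testing several phase shifts of the same template is what lets genes with different peak times (for example, morning versus evening peaks) all be detected with one cutoff.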