Basil interspecific hybridization and transcriptome study indicates altered developmental and metabolic gene expression

Abstract

Background
In order to understand the developmental modulation of the transcriptome and associated gene expression in inter-genomic combinations, a systematic study was planned using two diverse yet closely related species of Ocimum, targeting their hybrid F1 and the derived amphidiploid (colchiploid of the F1 hybrid). The developmental differences between the F1 and the amphidiploid were analyzed through phenotypic and anatomical assessments.

Results
Study of several genes and transcription factors putatively involved in plant growth and developmental processes links the transcriptome data to the phenotypic differences between the F1 and the amphidiploid. Additionally, differentially expressed genes of stomatal patterning and development appear to be involved in the higher stomatal density of the F1 and the larger stomatal size of the amphidiploid. The absence of 8,330 transcripts of the interspecific hybrid F1 from its amphidiploid, and the exclusive presence of two detected transcripts in the amphidiploid, provide a set of genes for analyzing the functions suppressed or activated between the F1 and the amphidiploid. Estimated chlorophyll, lignin, flavonoid and phenylpropene (eugenol and methyleugenol) contents were correlated with the average FPKM and digital gene expression values in the F1 and the amphidiploid.

Conclusion
This is the first investigation to describe the genes and transcription factors influenced by interspecific hybridization that lead to developmental changes and alleviation of intergenomic instability in the amphidiploid.

Background
All present-day flowering plants that appear to be diploid have in fact undergone at least one round of ancient whole genome duplication, suggesting that all angiosperms have polyploid lineages [1,2]. It is estimated that up to 25-30% of angiosperms remain polyploid and have not yet diploidized [3,4,5,6]. Further, it is suggested that autopolyploidy can remodel the transcriptome and metabolome [7,8]. As such, polyploidy is considered to be the most important event in the diversification and speciation of flowering plants [9]. Polyploidy has been observed to increase genetic variation, offering ecological and evolutionary advantages including robustness, increased environmental fitness, tolerance to a broader range of ecological and environmental conditions, enhanced photosynthetic efficiency, resistance to abiotic and biotic factors [6,10,11], enhanced productivity of secondary metabolites [12], and increased size of cells, stomata and vascular cells [13,14]; its effect on body size, however, is variable [15,14]. Also, owing to their enhanced variation and adaptation potential, polyploids could be 20% more invasive than their closely related diploids [16].
Since polyploidization is preceded by hybridization in allopolyploids, genomic doubling could drive multiple changes in gene expression, including gene silencing or chromosomal changes [17,18], loss and retention of duplicate genes [19], physiological divergence [20] and epigenetic modifications [14]. One important observation in this context is the possible dominance of one genome over the other and repeated patterns of genetic change in natural and synthetic hybrids, including detection of transcriptomic shock altering gene expression, DNA methylation and epigenetic modification [21,22,23]. However, the relative roles of hybridization vs. genome doubling as drivers of genetic and genomic change have not yet been addressed [23], although such knowledge could be quite important in the breeding of plants in which vegetative organs, tissues and the secondary metabolites produced therein are the source of the economic product. The present investigation was therefore planned on two closely related but diverse species of the model plant Ocimum, their interspecific hybrid and the genomically doubled amphidiploid, to elucidate the developmental changes occurring during intergenomic stabilization, and the associated transcriptomic insights and gene expression, using a next-generation sequencing approach.
Ocimum is an important genus, and one of the largest, of the mint family, Lamiaceae. The genus Ocimum, collectively called basil, is represented by many different varieties having incomparable curative properties and unique chemical compositions [24]. The genus shows a high degree of morphological, chemical and genetic differences at inter- and intraspecific levels [25]. Karyomorphological studies performed on different varieties of Ocimum suggest that there is also great variation in chromosome number across the genus, with 'n' ranging from 11 to 38 (http://ccdb.tau.ac.il/search/Ocimum/). This shows that polyploidy is a common event in Ocimum spp. and has played an important role in the evolution of this genus. Conventional breeding techniques and other ploidy manipulation tools have been actively used in the genetic improvement of Ocimum species to develop better plants with high yields of essential oils and other bioactive molecules [24]. In this study, an interspecific hybrid of Ocimum basilicum and Ocimum kilimandscharicum and its amphidiploid plants were used. Assessment of the plants revealed that the interspecific hybrid F1 plants were sterile, with smaller leaf area and taller plant height. However, fertility was restored in the genomically doubled amphidiploids, which had larger leaf area but shorter plant height.
These developmental variations in the interspecific hybrids of Ocimum generated interest in carrying out high-throughput de novo transcriptome sequencing and digital gene expression profiling of the parent plants (O. basilicum and O. kilimandscharicum), the interspecific hybrid F1 and the amphidiploid plants.
To elucidate changes in gene expression, the differential gene expression (DGE) profiles of all four plants (i.e. the two progenitor diploids, their interspecific hybrid and the derived amphidiploid) were compared.
Detection of gene loss, silencing and activation showed that hybridization and whole genome doubling trigger changes in gene expression via genetic and epigenetic alterations immediately upon allopolyploid formation. The overall influence of hybridization and whole genome duplication on the genes related to chlorophyll metabolism as well as the lignin and phenylpropanoid biosynthesis pathways was also investigated. Additionally, many transcripts related to stomatal patterning and development were differentially regulated in F1 vs. amphidiploid, indicating a role of these genes in the higher stomatal density of the interspecific hybrid F1 and the larger stomatal size of the amphidiploid. In particular, several transcription factors (TFs) with a possible role in morphological/anatomical characteristics and in different metabolic processes, such as phenylpropanoid biosynthesis (flavonoid and lignin biosynthesis) and chlorophyll biosynthesis and catabolism, were also identified. This study provides a deeper understanding of the differences between the parents, the F1 and the genomically doubled amphidiploid. The results of this investigation explain the underlying mechanisms responsible for the developmental changes in the interspecific hybrid F1 and the amphidiploid, underpinning the significance of the associated changes in gene expression.

Comparative phenotype
Morphologically, the interspecific hybrid F1 and its colchicine-induced amphidiploid were more similar to parent 2 (OKP2). However, to identify the overall effect of polyploidization, different phenotypic features such as plant height (Additional Fig. S1 A-D), leaf area, inflorescence (Additional Fig. S1 E), stem diameter (Additional Fig. S1 F-I), trichome density (Fig. S2), stomata etc. of the amphidiploid, the hybrid F1 and its parents (OBP1 and OKP2) were measured (Table 1). Compared with its parent plants and the amphidiploid, the interspecific hybrid F1 was robust, rapidly growing, vigorous and tall (110.00 ± 8.9). However, the leaf area of F1 (3.5 ± 0.18) was smaller than that of its amphidiploid (9.63 ± 0.75) and its parent plants. The leaf of the interspecific hybrid F1 was long, medium broad and thin. The inflorescence and stem were weak in the interspecific hybrid F1, but its inflorescence was longer (nearly twofold) than that of the amphidiploid and its parents. In contrast, the amphidiploid was slow growing and medium tall (101.50 ± 9.30). The leaf of the amphidiploid was oval shaped, broad and thick, and its leaf area was greater than that of its parents. Besides these characteristics, trichome density was higher in the hybrid F1 (nearly threefold higher than in the amphidiploid). Scanning electron microscopy revealed that the size of trichomes and stomata (nearly threefold greater than in F1) was greater in the amphidiploid (Fig. 1). Furthermore, the density of stomata was higher in the interspecific hybrid F1. However, oil yield per 100 g of fresh leaf was higher in the amphidiploid (0.48 ± 0.02).

Chromosome number
Root-tip mitosis of the four target plants revealed the modal somatic chromosome numbers to be 2n (AA) = 48 for OBP1, 2n (BB) = 76 for OKP2, n+n (A+B) = 62 for the interspecific hybrid F1, and 2n+2n (AA+BB) = 124 for the amphidiploid (Additional Fig. S3 A-D). This confirms that the F1 hybrid and the amphidiploid constitute genuine genomic combinations of the two progenitor parents employed in the present study.
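The reported counts are arithmetically consistent with an F1 that carries one gametic set from each parent and an amphidiploid that carries the doubled F1 complement. A minimal check, using only the chromosome numbers stated above (the variable names are illustrative and not part of the original analysis):

```python
# Somatic (2n) chromosome numbers reported for the two parents.
somatic = {"OBP1": 48, "OKP2": 76}

# Each parent contributes one gametic (n) set to the F1 hybrid.
gametic = {name: count // 2 for name, count in somatic.items()}

f1_expected = gametic["OBP1"] + gametic["OKP2"]   # n + n (A + B)
amphidiploid_expected = 2 * f1_expected           # 2n + 2n (AA + BB)

print(gametic, f1_expected, amphidiploid_expected)
```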

Comparative gene expression patterns between parents and progenies
The parent plants (OBP1 vs. OKP2) were investigated for their differential gene expression patterns (Fig. 2A). Further, the genes differentially expressed between the F1 hybrid or the amphidiploid and their parents were also analyzed (Fig. 2B). Out of 172 differentially expressed genes between the interspecific hybrid F1 and parent 1 (OBP1), 60 showed up-regulation and 112 down-regulation. Of 123 differentially expressed genes between the amphidiploid and OBP1, 51 were up-regulated and 72 down-regulated. Similarly, 155 differentially expressed genes (47 up-regulated and 108 down-regulated) were obtained from the analysis of the interspecific hybrid F1 and parent 2 (OKP2), and 215 (82 up-regulated and 133 down-regulated) for the amphidiploid and parent 2 (OKP2). In all comparisons, the proportions of up- and down-regulated genes between the interspecific hybrid F1 or the amphidiploid and their parents were asymmetric (FDR < 0.05; BH multiple test correction).
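For readers unfamiliar with the BH (Benjamini-Hochberg) correction behind the FDR < 0.05 cut-off, the following is a minimal sketch of the standard step-up adjustment. It is not the pipeline used in this study; the function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(toy_p)
significant = [i for i, q in enumerate(q_values) if q < 0.05]
print(q_values, significant)
```

In this toy input the third and fourth genes have raw p-values below 0.05 but adjusted values above it, which illustrates why the DEG counts depend on the multiple-testing correction.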
In order to identify the non-additively expressed genes, the expression levels of the amphidiploid and the hybrid F1 were compared with mid-parent values (MPVs) derived from base mean values. For the amphidiploid, it was assumed that one-third of the total transcription is from the genome of parent 1 (OBP1), one-third from parent 2 (OKP2) and one-third from the interspecific hybrid F1. The interspecific hybrid F1, on the other hand, was compared with parent 1 (OBP1) and parent 2 (OKP2), assuming that half of the total transcripts is from the genome of each parent. The number of non-additively expressed genes between the F1 hybrid and the MPV of parent 1 and parent 2 (OBP1 and OKP2) was 38, of which 10 genes were up-regulated and 28 genes were down-regulated (Fig. 3A; FDR < 0.05). Likewise, of 786 non-additively expressed genes observed between the amphidiploid and the MPV of parent 1, parent 2 and F1, 395 genes showed up-regulation and 391 genes exhibited down-regulation (Fig. 3B; FDR < 0.05). GO analysis of the 10 genes up-regulated in the F1 hybrid relative to the MPV of OBP1 and OKP2 indicated enrichment of photosynthesis, terpene biosynthesis including sesquiterpenoid metabolism, and lipid biosynthesis (Additional Fig. S4A; FDR < 0.05; Table S2), whereas the 28 down-regulated genes were mainly associated with lignan biosynthesis and metabolism and with steroid metabolism (Additional Fig. S4B; FDR < 0.05; Table S3). In contrast, the 395 genes up-regulated in the amphidiploid relative to the MPV of OBP1, OKP2 and F1 revealed enrichment of developmental vegetative growth, regulation of leaf development and regulation of shoot apical meristem development (Additional Fig. S4C; FDR < 0.05; Table S4), while the 391 down-regulated genes were mainly enriched in flavonoid metabolism, starch catabolism and S-adenosylmethionine metabolism (Additional Fig. S4D; FDR < 0.05; Table S5).
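The mid-parent comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the equal weighting follows the text (one-half per parent for F1; one-third each for OBP1, OKP2 and F1 for the amphidiploid), while the pseudocount, the function names and the base mean values are hypothetical.

```python
import math


def mid_parent_value(base_means):
    """Equal-weight mean of the reference base means for one gene."""
    return sum(base_means) / len(base_means)


def log2_ratio_to_mpv(progeny, base_means, pseudocount=1.0):
    """log2 ratio of progeny expression to the MPV (pseudocount is an assumption)."""
    mpv = mid_parent_value(base_means)
    return math.log2((progeny + pseudocount) / (mpv + pseudocount))


# Hypothetical base means for a single gene.
obp1, okp2, f1, amphidiploid = 100.0, 300.0, 210.0, 820.0

# F1 vs. MPV of the two parents (one-half from each parent).
f1_vs_mpv = log2_ratio_to_mpv(f1, [obp1, okp2])

# Amphidiploid vs. MPV of OBP1, OKP2 and F1 (one-third from each).
amphi_vs_mpv = log2_ratio_to_mpv(amphidiploid, [obp1, okp2, f1])

print(round(f1_vs_mpv, 3), round(amphi_vs_mpv, 3))
```

A gene whose ratio to the MPV stays near zero is additively expressed; a gene called significant after the FDR correction would be counted as non-additive.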

Hybridization induced transcript expression
To understand the reasons for the developmental changes between the interspecific hybrid F1 and the amphidiploid, pair-wise comparisons of OBP1 vs. F1&Amphid2, OKP2 vs. F1&Amphid2, F1 vs. OBP1&OKP2 and Amphid2 vs. F1,OKP2&OBP1 were analyzed (Fig. 4A). Of the 38,040 common transcripts obtained from these pair-wise comparisons, 244 exhibited an antagonistic expression pattern (log2 fold change ≤ -1 or ≥ 1) in the interspecific hybrid F1 and the amphidiploid. Further analysis revealed a similar expression pattern of these 244 transcripts in the amphidiploid and parent 2 (OKP2). Among these, 126 transcripts were up-regulated in the amphidiploid and parent 2 but down-regulated in the interspecific hybrid F1 (Fig. 5A; Table S6). In contrast, 118 transcripts had higher expression in the interspecific hybrid F1 than in the amphidiploid and parent 2 (Fig. 4C). These results indicate that the overall transcriptome of the amphidiploid was more similar to that of parent 2 (OKP2), which matches the morphological analysis showing that the amphidiploid plants were more similar to parent 2 (OKP2).
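One way to read the antagonistic-expression criterion is that a transcript passes the log2 fold change threshold of 1 in opposite directions in F1 and in the amphidiploid. The sketch below implements that reading; the reference against which each fold change is computed, the function name and the toy values are all assumptions for illustration.

```python
def is_antagonistic(log2fc_f1, log2fc_amphidiploid, threshold=1.0):
    """True when one progeny is up and the other down beyond the threshold."""
    f1_up = log2fc_f1 >= threshold
    f1_down = log2fc_f1 <= -threshold
    amphi_up = log2fc_amphidiploid >= threshold
    amphi_down = log2fc_amphidiploid <= -threshold
    return (f1_up and amphi_down) or (f1_down and amphi_up)


# Hypothetical (F1, amphidiploid) log2 fold changes per transcript.
toy = {
    "t1": (-1.8, 2.1),   # down in F1, up in amphidiploid
    "t2": (1.2, -1.5),   # up in F1, down in amphidiploid
    "t3": (0.4, 1.9),    # F1 below threshold: not antagonistic
    "t4": (-2.0, -1.1),  # same direction: not antagonistic
}

up_in_amphidiploid = sorted(
    t for t, (f, a) in toy.items() if is_antagonistic(f, a) and a > 0
)
up_in_f1 = sorted(
    t for t, (f, a) in toy.items() if is_antagonistic(f, a) and f > 0
)
print(up_in_amphidiploid, up_in_f1)
```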
To identify the possible function of these transcripts, GO enrichment analysis was conducted. This analysis revealed that these genes were mainly enriched in biological functions such as shoot system development, flower development, cotyledon morphogenesis, embryonic morphogenesis, reproductive structure development and programmed cell death (Fig. 4D). It was also observed that many transcripts were reported as "PREDICTED" proteins, including "cellulose synthase-like protein G3" (proposed to synthesize non-cellulosic polysaccharides that comprise plant cell walls), "1-deoxy-D-xylulose 5-phosphate synthase" (catalyzes the first step of the MEP pathway), "SWI/SNF complex component SNF12 homolog" (activator of flower homeotic genes), "oligopeptide transporter 3" (essential for early embryo development), "casein kinase II subunit beta-like" (involved in flowering-time regulation), "AMP deaminase-like isoform X2" (essential for the transition from zygote to embryo), "SNW/SKI-interacting protein" (splicing factor involved in posttranscriptional regulation of circadian clock and flowering time genes), "nuclear transcription factor Y subunit A-10" (positive regulator of photomorphogenesis), "Calmodulin-domain protein kinase 5 isoform 1" (involved in many aspects of plant growth and development), "Epidermal patterning factor-like protein 4" (negative regulator of stomatal development), "WAT1-related protein At4g19185-like" (involved in secondary wall formation) and "protein STRICTOSIDINE SYNTHASE-LIKE 11" (involved in anther development and pollen wall formation) (Table S7).
It was also noticed that many of the transcripts were predicted as "uncharacterized protein LOC105176273, hypothetical protein POPTR_0001s256302g, hypothetical protein MIMGU_mgv1a009003mg, hypothetical protein MIMGU_mgv1a005332mg, hypothetical protein JCGZ_24107, hypothetical protein M569_05704, uncharacterized protein LOC105177873, uncharacterized protein LOC105176091, uncharacterized protein LOC105165185, uncharacterized protein LOC105164617 isoform X1, etc." and many of them were left unannotated, perhaps due to the lack of annotation in Ocimum. Further investigation of these transcripts could provide good candidates for understanding the role of genome doubling in the correction of phenotypic weakness induced by hybridization. These transcripts could also be utilized for the identification of novel genes that probably have a role in the morphological and anatomical differences between the parents, the hybrid and the amphidiploid plants.

Comparison of gene expression between the F1 hybrid and amphidiploid
Between the colchicine-induced amphidiploid and the F1 hybrid, 179 differentially expressed genes (DEGs) were identified, including 132 up-regulated and 47 down-regulated genes, when the BH multiple test correction method was applied (Benjamini and Hochberg, 1995) with FDR < 0.05 (Additional Fig. S5A). These DEGs were mapped to reference canonical pathways in KEGG to find out their involvement in biological pathways, and 44 out of 179 DEGs were assigned to 46 KEGG pathways (Additional Fig. S6). The largest cluster was biosynthesis of secondary metabolites with 13 members and the second largest was metabolic pathways with 9 members, indicating that many of these DEGs were involved in the biosynthesis of secondary metabolites. Thereafter, to investigate the probable function of these DEGs, GO enrichment analysis was performed. In the biological process category, the DEGs were mainly enriched in secondary metabolic processes such as sesquiterpenoid biosynthetic and metabolic process, isoprenoid biosynthetic process and jasmonic acid metabolic process (Additional Fig. S5B); they were also enriched in auxin:proton symporter activity, fatty-acyl-CoA reductase (alcohol-forming) activity, magnesium ion binding, lyase activity, cyclase activity etc. (FDR < 0.05) (Additional Fig. S5C).
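Pathway and GO enrichment of a DEG list is commonly assessed with a one-sided hypergeometric (over-representation) test; whether the tools used here apply exactly this test is an assumption. The sketch below uses the 13 of 44 mapped DEGs reported above for biosynthesis of secondary metabolites, while the background size and the number of background genes in the pathway are hypothetical placeholders.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, K of which carry the term."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


deg_in_pathway = 13        # from the text
mapped_degs = 44           # from the text
background_in_pathway = 50 # hypothetical
background_total = 1000    # hypothetical

p_value = hypergeom_enrichment_p(
    deg_in_pathway, mapped_degs, background_in_pathway, background_total
)
print(p_value)
```

With a realistic annotated background the resulting p-values would then be adjusted across all tested terms, for example with the BH procedure.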

Detection of gene expression alterations in interspecific hybrid F1 and amphidiploid
Gene alteration events were scored as the occurrence of new transcripts (lacking in the parents) or the lack of some transcripts (existing in the parents) in the interspecific hybrid F1 and the amphidiploid. For this analysis, the total transcripts of parent 1 (67,770), parent 2 (73,265), the interspecific hybrid F1 (76,917) and the amphidiploid (76,563) were examined (Fig. 5). This analysis showed that 5,766 transcripts common to the parents and the interspecific hybrid F1 were not detected in the amphidiploid. In addition, about 6,432 transcripts present in the interspecific hybrid F1 and the amphidiploid were missing in the parents. However, of these 6,432 transcripts, only 3,868 were common to the interspecific hybrid F1 and the amphidiploid. Therefore, a total of 8,330 transcripts of the interspecific hybrid F1 were absent in the amphidiploid. On the other hand, only two transcripts were found to be exclusive to the amphidiploid with respect to the interspecific hybrid F1 and the parents (OBP1 and OKP2). These alterations in gene expression may result from gene silencing or activation, or from sequencing error, but here it was assumed that these transcripts were either suppressed or expressed in the amphidiploid. The annotations of the 8,330 transcripts not expressed in the amphidiploid showed that they mainly included genes involved in disease resistance, primary and secondary metabolism and the cell cycle. They also included many transcription factors ("basic helix-loop-helix transcription factor", "MYB/MYB-related", "MADS-box", "APETALA", "AP2/EREBP", "WRKY" etc.), cytochrome P450s and methyl-CpG-binding domain-containing proteins. Moreover, many transcripts were predicted as "uncharacterized or unnamed protein" and many were left unannotated. In contrast, the 2 transcripts exclusive to the amphidiploid were WNK lysine deficient protein kinase and geranylgeranyl transferase type-2 subunit beta 1-like proteins.
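The total of 8,330 can be reproduced from the counts above if the novel transcripts that are not shared between F1 and the amphidiploid (6,432 - 3,868) are all counted as F1-only; that reading is an assumption made here to match the reported figure. The set-based helper and the toy transcript IDs below are hypothetical and only illustrate how presence/absence calls of this kind can be derived.

```python
# Counts reported in the text.
parental_f1_missing_in_amphidiploid = 5766
novel_in_progeny = 6432             # absent from both parents
novel_shared_by_f1_and_amphi = 3868

novel_not_shared = novel_in_progeny - novel_shared_by_f1_and_amphi
f1_absent_in_amphidiploid = parental_f1_missing_in_amphidiploid + novel_not_shared


def absent_from(target, reference):
    """Transcripts detected in the reference set but not in the target set."""
    return reference - target


# Hypothetical transcript IDs.
toy_f1 = {"tA", "tB", "tC", "tD"}
toy_amphidiploid = {"tA", "tD", "tE"}

silenced = absent_from(toy_amphidiploid, toy_f1)    # in F1, not in amphidiploid
activated = absent_from(toy_f1, toy_amphidiploid)   # in amphidiploid, not in F1
print(f1_absent_in_amphidiploid, sorted(silenced), sorted(activated))
```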

Chlorophyll content and DGE related to chlorophyll biosynthesis between F1 hybrid and amphidiploid
Chlorophyll estimation showed that the amounts of Chla, Chlb and total chlorophyll were 0.35 mg/g, 0.13 mg/g and 0.54 mg/g, respectively, in the interspecific hybrid F1 and 0.30 mg/g, 0.11 mg/g and 0.37 mg/g in the amphidiploid. Thus, the Chla, Chlb and total chlorophyll contents were higher in the interspecific hybrid F1 than in its amphidiploid (Fig. 7A). To find the probable reason for this change in content, 167 transcripts in the interspecific hybrid F1 and 169 transcripts in the amphidiploid, related to 27 classic enzymes involved in the chlorophyll metabolic pathway, were analyzed (Table 2). The chlorophyll metabolic pathway in plants consists of ALA, Proto IX, heme and chlorophyll formation/degradation steps. In the ALA, Proto IX, heme and chlorophyll biosynthesis steps, a higher number of transcripts was detected for enzymes such as HemA, HemE, HemY, HemH, COX15, POR and CAO, while fewer transcripts were recorded for HemF, COX10, ChlD, ChlI, ChlM, ChlE, 4VCR and CLH. A single copy each of the enzymes ChlH and ChlG was identified from the transcriptome sequences of the interspecific hybrid F1 and the amphidiploid. In the chlorophyll degradation steps, more than one transcript was identified for enzymes such as NYC1, HCAR, PPH, PAO and RCCR. FPKM values and fold change values for the 167 common transcripts in the interspecific hybrid F1 and the amphidiploid were averaged for further analysis. Based on these identified transcripts, a proposed chlorophyll metabolic pathway in the interspecific hybrid F1 and the amphidiploid was constructed (Fig. 6).
The differences in Chla, Chlb and total chlorophyll content were correlated with the average FPKM and differential gene expression values.

Phenylpropanoid profiling and DEG related biosynthesis in F1 hybrids and amphidiploid
The phenylpropanoid biosynthetic pathway is the predominant pathway in the different Ocimum spp. and produces different phenylpropenes, lignins and flavonoids. Therefore, to understand the effect of interspecific hybridization and whole genome duplication on phenylpropanoid biosynthesis, the essential oils of F1 and the amphidiploid were profiled by GC-MS, and the changes in total lignin and flavonoid content were analyzed (Fig. 7 B-D). The essential oil analysis showed that the amount of eugenol was 0.082 mg/g leaf in the interspecific hybrid F1 and 0.063 mg/g leaf in the amphidiploid, whereas the amount of methyleugenol was 0.087 mg/g leaf and 0.032 mg/g leaf in the interspecific hybrid F1 and the amphidiploid, respectively. The amounts of flavonoid and lignin were 0.26 mg/g leaf and 0.63 mg/g leaf, respectively, in the interspecific hybrid F1 and 0.41 mg/g leaf and 1.03 mg/g leaf, respectively, in the amphidiploid. Thus, the amounts of eugenol and methyleugenol were higher in the interspecific hybrid F1, but the amounts of total flavonoid and total lignin were higher in the amphidiploid. Several transcripts corresponding to 16 enzymes directly involved in the general phenylpropanoid biosynthesis pathway were identified and analyzed to address the question of differential biosynthesis of phenylpropanoids, lignin and flavonoids in the interspecific hybrid F1 and the amphidiploid (Table 3). Among these enzymes, PAL, C4H and 4CL are mandatory enzymes catalyzing the initial three steps of the phenylpropanoid pathway, while HCT, CCR, COMT, CCoAOMT and CAD are downstream enzymes directly involved in the biosynthesis of lignin in plants.
These enzymes, including PAL, C4H and 4CL, belong to multigene families and hence more than one transcript was identified for each of them. EGS and EOMT, which are responsible for the production of eugenol and methyleugenol, respectively, in Ocimum, belong to small gene families; only three transcripts of EGS and one transcript of EOMT were identified in the transcriptome. CHS is the first enzyme of flavonoid biosynthesis, producing the first flavonoid, naringenin chalcone, from 4-coumaroyl-CoA and three molecules of malonyl-CoA, and CHI, the key enzyme of the flavonoid biosynthesis pathway, catalyses the intramolecular cyclization of naringenin chalcone into naringenin. About 6 transcripts of CHS and 7 transcripts of CHI were identified in both the interspecific hybrid F1 and the amphidiploid. Reduction of dihydroflavonols at position 4 is catalyzed by an enzyme belonging to the DFR superfamily. About 14 and 15 transcripts of DFR were detected in the interspecific F1 and the amphidiploid, respectively. Likewise, 8 and 7 transcripts of F3'H and only 1 transcript each of F3'5'H and F3H were recorded in the interspecific F1 and the amphidiploid. About 4 transcripts of UFGT, the enzyme responsible for converting anthocyanidin to anthocyanin, were found in the transcriptome sequences of the interspecific F1 and the amphidiploid. However, transcripts of CCoA-3H, F5H, CAAT and ANS/LDOX could not be annotated. The means of the FPKM values and the differential gene expression values of 184 common transcripts, based on the transcripts identified above, were used to correlate the trend in phenylpropene content with gene expression (Fig. 8).
The average FPKM values and differential gene expression (log2 fold change ≤ -0.1 or ≥ 0.1) showed that four crucial genes directly involved in the biosynthesis of lignin (HCT, CCR, COMT and CAD) were up-regulated in the amphidiploid but down-regulated in the interspecific F1 hybrid. Conversely, expression of the PAL, C4H, 4CL, EGS and EOMT genes was highest in the interspecific hybrid F1, while the genes involved in flavonoid biosynthesis (CHS, CHI, F3'H, DFR and UFGT) were down-regulated in the interspecific hybrid F1. To validate the gene expression profile of the RNA-seq data, qPCR of seven genes involved in general phenylpropanoid biosynthesis was performed (Fig. 9). The qPCR results showed decreased expression of the COMT, CAD, CHS and DFR genes involved in the lignin and flavonoid biosynthetic pathways and increased expression of PAL, 4CL and EGS in the interspecific hybrid F1, confirming the RNA-seq data.
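The per-gene comparison described above (averaging FPKM over the transcripts of each enzyme, then applying the log2 fold change cut-off of 0.1) can be sketched as below. The pseudocount, the function names and the FPKM values are hypothetical; only the 0.1 threshold and the direction of the two example genes follow the text.

```python
import math
from statistics import mean


def mean_log2_fold_change(fpkm_f1, fpkm_amphidiploid, pseudocount=1.0):
    """log2 of mean amphidiploid FPKM over mean F1 FPKM for one enzyme."""
    numerator = mean(fpkm_amphidiploid) + pseudocount
    denominator = mean(fpkm_f1) + pseudocount
    return math.log2(numerator / denominator)


def call_direction(log2fc, threshold=0.1):
    """Classify expression in the amphidiploid relative to F1."""
    if log2fc >= threshold:
        return "up"
    if log2fc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical FPKM values for the transcripts of two enzymes.
toy_fpkm = {
    "CAD": {"F1": [4.0, 6.0], "amphidiploid": [12.0, 18.0]},
    "PAL": {"F1": [30.0, 34.0], "amphidiploid": [14.0, 16.0]},
}

calls = {
    gene: call_direction(mean_log2_fold_change(v["F1"], v["amphidiploid"]))
    for gene, v in toy_fpkm.items()
}
print(calls)
```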

Identification of candidate DEGs involved in higher stomatal density in interspecific hybrid F1 and larger stomatal size in amphidiploid
Higher stomatal density in the interspecific hybrid F1 and larger stomatal size in the amphidiploid were the two peculiar characteristics observed through anatomical analyses of mature leaves. For this reason, 172 candidate DEGs (log2 fold change ≤ -0.1 or ≥ 0.1) putatively associated with stomatal patterning and development were identified (Table S8) [...] to be the most abundant TF family, followed by WRKY, AP2/ERF and NAC. A total of 8 TFs were related to the bZIP superfamily, which is reported to be associated with various biological processes such as flower and vascular development, embryogenesis, organ differentiation and seed maturation. TFs that were annotated as having transcription factor activity but did not fall into any of the families classified by the plant transcription factor database PlantTFDB 4.0 were specified as 'other' (Fig. 10). The differential gene expression (log2 fold change ≤ -0.1 or ≥ 0.1) and average FPKM values indicated that the highest number of TFs differentially expressed between the interspecific hybrid F1 and the amphidiploid belonged to the MYB and MYB-related superfamily (35), followed by WRKY (33), NAC (17), MADS-BOX (14) and ethylene-responsive transcription factor (13). Also, a large number of TFs in the WRKY superfamily (25), the MYB and MYB-related superfamily (22), the NAC superfamily (12), MADS-BOX (7) and ethylene-responsive transcription factor (7) were down-regulated in the interspecific hybrid F1 and up-regulated in the amphidiploid. All the differentially expressed TFs were summarized according to their probable involvement in floral development, leaf development, trichome development, seed development and xylem formation. Many other TFs with a putative role in different metabolic pathways, such as chlorophyll biosynthesis/catabolism, phenylpropanoid biosynthesis, flavonoid biosynthesis and anthocyanin biosynthesis, were also analyzed and listed (Table S9).
In addition, many TFs involved in growth and developmental processes such as flowering time

Discussion
Amphidiploidy alleviates genomic instability in the interspecific hybrid
Polyploidization is an important event of speciation and evolution, leading to the generation of new forms adapted to new ecological niches. It often occurs naturally but can also be realized through breeding. Many modern-day crop plants have evolved through polyploidization. The technique is used extensively by plant breeders, and such events are thought to have a profound effect on genome structure and gene expression [26]. It is well accepted that hybridization between diverse genotypes often generates hybrid vigour and new combinations of characters, but may sometimes lead to hybrid dysgenesis.
Interspecific hybrids that involve diverse genomes, however, often suffer from developmental and reproductive deficiency on account of cell cycle incompatibility and meiotic disturbances. Nevertheless, the latter deficiency could be overcome by genomic duplication/amphidiploidization, as observed in the present interspecific F1 hybrid and its derived amphidiploid.
Polyploidization often causes developmental changes in plants such as increased plant height, enlarged cell and organ size, higher biomass and other phenotypic variation. In Ma bamboo (Dendrocalamus latiflorus Munro), different polyploidy levels displayed altered anatomical, physiological and growth characteristics, such as leaf thickness, fusoid cell and stomatal size, shoot number, and photosynthesis and respiration rate [27]. Senecio cambrensis, generated through chromosome doubling in the sterile triploid hybrid S. × baxteri, produced fruits [28]. Though these phenomena are age-old and very well known, the changes occurring at the expression level have not been systematically studied with respect to the parents. Hence, this study attempts, for the first time, to describe the genes and transcription factors influenced by interspecific hybridization that lead to developmental changes and alleviation of intergenomic instability in amphidiploids. Here, it was found that hybridization and genome duplication have abrupt but noticeable effects on gene expression patterns. The hybridization event strongly alters the parental gene expression patterns, which are ameliorated after genome duplication. Therefore, the transcriptome was analysed in terms of the dynamics of expression changes while the nature of the genome was kept constant.

Gene silencing and activation in interspeci c hybrid F1 and amphidiploid
Chromosome doubling after interspeci c or intergeneric hybridization, leads to the development of new allopolyploid species [29]. Newly formed allopolyploids must overcome the reduced fertility (occurred due to improper chromosome pairing and segregation) in order to prove themselves as a successful species, and associated alteration of gene expressions [30,18]. These alterations in gene expression may occur because of gene silencing and activation. However, the type of gene affected and the probable mechanism involved in the ploidy regulation of gene expression are still challenging [31]. Very few works have addressed such responses in gene alterations in eukaryotic system. In yeast, Galitski et al. [32], showed that ploidy-regulated activation and silencing of genes were mainly related to cell growth and development. In a newly synthesized wheat allotetraploid the silenced/lost genes included rRNA genes and genes involved in metabolism, disease resistance, and cell cycle regulation but the activated genes were of known function and all were retroelements [29]. In this study, it was found that disappearance/ suppression of genes in amphidiploids from interspeci c hybrid F1 was mainly associated with the disease resistance, primary and secondary metabolism and cell cycle. Suppression of many transcription factors like basic helix-loop-helix transcription factor", "MYB/ MYB-related", "MADS-box", "APETALA", "AP5/EREBP", "WRKY" and cytochrome P450 might have helped interspeci c hybrid to overcome the reduced fertility in amphidiploid. Silencing of methyl-CpG-binding domain-containing proteins in amphidiploid plants indicated that the formation of amphidiploid is associated with the epigenetic changes. This suggests that the hybridization and allopolyploidy causes rapid changes in gene structure and expression which contributes to the novel type of expression pro les. 
Further investigation of the unknown/uncharacterized genes whose expression was suppressed or lost in the amphidiploid could provide a better understanding of the genes affected and the mechanisms involved in the ploidy regulation of gene expression upon chromosome doubling.

Enhanced chlorophyll biosynthesis in interspecific hybrid F1
An increase in chlorophyll content serves as an indicator of hybrid vigor, as an increase in chromosome number is believed to increase the number of chloroplasts per cell and hence the chlorophyll content [33,34,35]. However, this tendency does not always hold: in Atriplex confertifolia, chlorophyll content remains constant in plants of different ploidy levels [36]. In the present investigation, the higher chlorophyll content (chl a, chl b and total chlorophyll) of the hybrid F1 compared with the amphidiploid correlates with the respective FPKM and differential gene expression values. The increased level of chlorophyll in the F1 also correlates with increased expression of chlorophyll biosynthetic genes and decreased expression of the degradation genes, in contrast to the amphidiploid. Relative chlorophyll content was found to be similar in haploid, diploid and tetraploid plants of Ricinus communis [37]. In some cases, an increase in chlorophyll content in an interspecific hybrid F1 could be related to sterility [38,39]. F1 hybrids of sorghum raised from CMS lines with the P614 genome had increased chlorophyll a content because of the sterile M35-1A cytoplasm. Total chlorophyll content also increased in F1 hybrids obtained from CMS lines with the Zh10 genome and the P35 pollen parent, which was attributed to the A4 cytoplasm. Ectopic overexpression of bol-miR171b in Brassica oleracea L var. ital also increased chlorophyll content in plants that were likewise sterile [40]. These findings suggest a relationship between chlorophyll biosynthesis and the regulation of sterility in plants, as observed in the present investigation.
However, the available literature does not provide a concrete foundation for this suggestion, which requires further validation of the roles of chlorophyll metabolism genes (HemA, HemL, HemC, HemE, HemF, HemY, HemH, COX10 Chl, ChlI, ChlM, ChlE, 4VCR, HCAR, PPH, PAO, RCCR) and of transcription factors involved in chlorophyll metabolism, such as the NAC transcription factors ANAC046, ANAC087 and ANAC100, in plant fertility, floral timing, floral development and seed development.

Reduced lignin biosynthesis in interspecific hybrid F1
Lignin biosynthesis plays a major role in the developmental changes in the interspecific hybrid F1 and the amphidiploid, as lignin is an integral component of the plant cell wall that provides strength and rigidity. Lignin also protects the plant against various biotic and abiotic stresses [41]. The highest lignin content, together with the differential expression of lignin biosynthesis genes in the amphidiploid, suggests that its thicker stem and leaves and its ability to withstand various abiotic and biotic stresses are due to up-regulation of the HCT, CCR, COMT and CAD genes of the lignin biosynthesis pathway. Hence, the contribution of a robust lignin biosynthesis mechanism to heterosis and enhanced adaptability in the amphidiploid cannot be ignored. Lignin biosynthesis shares the general phenylpropanoid pathway [42], which requires deamination of phenylalanine, successive hydroxylation and O-methylation of the aromatic ring, followed by conversion of the side-chain carboxyl to an alcohol group [43]. Plant development is severely affected by disruption of the CCR and CAD genes [44], and ccc mutants of Arabidopsis showed male sterility due to a lack of lignification in the anther endothecium, causing failure of anther dehiscence and pollen discharge. Down-regulation of COMT to low activity levels reduces lignin content by 30% in alfalfa and maize and by 17% in poplar. In this investigation, lignin content could be correlated with the expression of lignin biosynthesis genes in the F1 and the amphidiploid, as the expression of the biosynthetic genes is depressed in the F1 compared with the amphidiploid.

Higher amount of eugenol and methyleugenol in interspecific hybrid F1 is possibly associated with reduced lignin and flavonoid biosynthesis
The higher amount of phenylpropenes (eugenol and methyleugenol) in the interspecific hybrid F1 (and not in the amphidiploid) is possibly due to down-regulation of the genes involved in lignin and flavonoid production, which diverts some flux towards phenylpropene production in the interspecific hybrid F1. Earlier studies on modification of the lignin and flavonoid biosynthetic pathways in plants suggest that down-regulation of the lignin pathway alters carbon flux within the phenylpropanoid pathway and indirectly influences the production of other secondary metabolites [45]. For example, in Petunia hybrida, suppression of cinnamoyl-CoA reductase (CCR1) and up-regulation of cinnamate-4-hydroxylase (C4H) expression increased the fluxes through the phenylpropanoid pathway [13]. Down-regulation of the chalcone synthase (CHS) gene in flax also decreased lignin synthesis and significantly altered plant morphology, modulating the flux towards tannins [46]. In the present study, the suppressed expression of the CCR, COMT, CAD, CHS and DFR genes was positively correlated with the reduced content of lignin and flavonoids in the interspecific hybrid F1.

Effect of hybridization on stomatal patterning and development associated genes
Stomatal density, guard cell length and stomatal plastid number have frequently been used as morphological markers to test ploidy levels in many plants [47]. In Coffea canephora, significant differences in stomatal frequency and guard cell length were observed between diploid and tetraploid plants [47]. In this work, stomatal density in the interspecific hybrid F1 (2n=62) was nearly twofold higher than in its amphidiploid (2n=124). In contrast, the stomata of the amphidiploid were nearly threefold longer than those of the interspecific hybrid F1. Therefore, to understand the genetic basis underlying such variation in stomatal frequency and stomatal length upon a change in ploidy level, several genes putatively involved in stomatal patterning and development were analyzed. In Arabidopsis, a number of components of the stomatal patterning and development pathway have been identified, including the putative receptor gene TOO MANY MOUTHS (TMM), the ERECTA gene family, CLAVATA, STOMATAL DENSITY AND DISTRIBUTION 1 (SDD1) and several EPIDERMAL PATTERNING FACTORs (EPFs) [48,49]. In addition, an SDD1-like protease, which shares a high level of identity with SDD1, is involved in epidermal development. A downstream MAP kinase signaling cascade also negatively regulates stomatal development. Similarly, a defect in the YODA gene (YDA), a putative MAP kinase kinase kinase (MAPKKK), results in an excessive number of clustered stomata [50,51].
In this study, transcripts associated with stomatal patterning and development, such as EPIDERMAL PATTERNING FACTOR-like protein 9 (EPF9), subtilisin-like protease SDD1, leucine-rich repeat receptor-like protein CLAVATA2 and mitogen-activated protein kinase 4/mitogen-activated protein kinase 4-like (MAPK4/MAPK4-like), were up-regulated, indicating the involvement of these genes in the higher number of stomata in the interspecific hybrid F1. In contrast, transcripts homologous to the TOO MANY MOUTHS (TMM) gene, the ERECTA gene family and the YODA gene were down-regulated in the interspecific hybrid F1, implying the involvement of these transcripts in determining the larger size and lower number of stomata in the amphidiploid.

Differentially expressed Transcription factors
Transcription factors play a critical role in regulating the gene expression underlying the vital processes of all living organisms. They are involved in the regulation of a variety of processes, ranging from development to differentiation and from metabolism to defense [52]. To integrate the transcriptome data with the phenotypic and metabolic profiles, transcription factors involved in flower, trichome, seed and leaf development were identified and analyzed. Some transcription factors involved in chlorophyll metabolism, phenylpropanoid biosynthesis and xylem development were also recovered. Earlier studies in model plants such as Arabidopsis suggest the involvement of MYB, MADS, NAC, WRKY, bHLH and bZIP transcription factors in plant growth and developmental processes [53,54]. In Arabidopsis, the majority of TFs belonging to the MADS-box family are specifically involved in floral developmental processes [55]. Similarly, the SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) gene family and CONSTANS genes promote flowering in Arabidopsis [56]. MYB transcription factors are regarded as master regulators of the phenylpropanoid pathway and are also implicated in trichome formation in Arabidopsis [57]. Several other MYB TFs (MYB20, MYB42, MYB43, MYB46, MYB52, MYB54, MYB58, MYB69, MYB61, MYB63, MYB83, MYB85 and MYB103), as well as some GATA-like TFs, are considered important regulatory factors for secondary wall formation in Arabidopsis [58]. Some NAC TFs, including ANAC046, ANAC087 and ANAC100, directly bind to the promoter regions of NYC1, SGR1, SGR2 and PaO, suggesting the existence of TFs that coordinate the expression of a number of Chl catabolic genes [59].

Development of F1 hybrid and the amphidiploid
An interspecific hybrid was produced between the two target species by hand pollination, taking OBP1 as the female parent. The F1 hybrid was raised from the seeds obtained from the fertilized ovules borne on the female parent. The hybrid thus obtained was seed sterile. Therefore, it was multiplied vegetatively to raise its clonal progenies. Shoot tips of the fast-growing hybrid at the 10-12 leaf stage were treated with a 0.2% aqueous solution of colchicine (Sigma Aldrich) for 24 hours, following the cotton swab and intermittent colchicine dropping method. Colchicine treatment was then stopped by removing the cotton swab, the treated shoots were washed carefully with a water sprayer, and the shoots were allowed to grow naturally. The colchicine-affected shoots were excised after the growth of 10-12 leaf whorls and multiplied vegetatively to raise the amphidiploids. The amphidiploids were seed fertile.

Chromosome count
To ascertain the chromosome status of the progenitor parents, their hybrid and the amphidiploid, somatic chromosome analysis was performed on the four sets of plants. For this, shoot cuttings were planted in sand, and the fast-growing roots emerging from the nodes were excised and pretreated in a saturated aqueous solution of para-dichlorobenzene for three hours at 12-14 °C, followed by thorough washing in water and overnight fixation in Carnoy's solution (absolute alcohol:chloroform:acetic acid, 6:3:1). Root tips were stained overnight at 37 °C in 2% Aceto-Orcein: 1N HCl. Fixed root tips were squashed in 45% acetic acid to observe the chromosome count under the microscope. Only intact cells were considered when counting chromosome number, and the modal number was taken as the somatic chromosome number.

Phenotypic changes and its statistical analysis
Parents (OBP1 and OKP2), interspecific hybrid F1 and amphidiploid (Amphid2) were cloned from cuttings and planted in the BT-2 field of CSIR-CIMAP, Lucknow. Field experiments were carried out for trait measurement and phenotype assessment. Plant height, stem diameter, leaf area, inflorescence length, trichome density, trichome length and stomatal length were measured on mature (six-month-old) plants and their leaves and analyzed statistically. Digital pictures of each mature plant, its leaves and its inflorescence (OBP1, OKP2, F1 and Amphid2) depicting the phenotypic changes were taken. Glandular trichome density was observed at 40X magnification using a compound microscope (Leica DM750). Ten biological replicates were taken for trait measurement and phenotype assessment.

Biochemical assays for estimation of total chlorophyll, total lignin and total flavonoid contents
Total chlorophyll content of both the interspecific hybrid F1 and the amphidiploid was calculated as described by Arnon [62]. The absorbance of the pigment was measured at 645 and 663 nm against 80% acetone. Three biological replicates were taken for the analysis, and the absorbance of each replicate was read three times.
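The chlorophyll calculation from the two absorbance readings can be sketched as follows. This is a minimal illustration that assumes the commonly cited Arnon coefficients for 80% acetone extracts; the function names, the example absorbance values, the extract volume and the tissue mass are hypothetical and are not taken from this study.

```python
def arnon_chlorophyll(a645, a663):
    """Return (chl_a, chl_b, total) in mg per litre of 80% acetone extract.

    Coefficients are the commonly cited Arnon values (assumed here,
    not restated in the text above).
    """
    chl_a = 12.7 * a663 - 2.69 * a645
    chl_b = 22.9 * a645 - 4.68 * a663
    total = 20.2 * a645 + 8.02 * a663
    return chl_a, chl_b, total


def per_gram_fresh_weight(conc_mg_per_l, extract_volume_ml, tissue_mass_g):
    """Convert an extract concentration (mg/L) to mg per g fresh weight.

    extract_volume_ml and tissue_mass_g are hypothetical parameters;
    the source does not report them.
    """
    return conc_mg_per_l * extract_volume_ml / (1000.0 * tissue_mass_g)


if __name__ == "__main__":
    # Hypothetical readings for one replicate (A645, A663).
    chl_a, chl_b, total = arnon_chlorophyll(a645=0.50, a663=1.00)
    print(chl_a, chl_b, total)
    print(per_gram_fresh_weight(total, extract_volume_ml=10.0, tissue_mass_g=0.1))
```

Note that the total is computed from its own pair of coefficients, so it can differ slightly from the sum of chl a and chl b.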
Total lignin content of the mature leaves of three biological replicate samples of the interspecific hybrid F1 and the amphidiploid was estimated using the method described by Kumar et al. [63].
Total flavonoid content was quantified from dried samples of mature leaves of the interspecific hybrid F1 and the amphidiploid using the Dowd method as illustrated by Sankhalkar and Vernekar [64].

Metabolite Analysis
For relative oil analysis, 100 g of fresh leaves of the interspecific hybrid F1 and the amphidiploid were hydrodistilled in a Clevenger's apparatus. The oil samples of both plants were collected separately in microcentrifuge tubes (MCTs). The oil was dehydrated using anhydrous sodium sulphate and diluted with hexane in the ratio of 1:10, and 1 µl was injected into the GC-MS system (MSD 7890A, Agilent Technologies) equipped with an autosampler, an HP-5MS column and a 7977A mass detector. The oil samples were run in splitless mode as described by Akhtar et al. [65]. All samples were run in three biological replicates and analyzed statistically (± standard deviation). Mass spectra were acquired in scan mode and analyzed with MassHunter Workstation software (Agilent Technologies) by comparison with the NIST11 library.

Statistical measurement
Microsoft Office Excel 2007 was used for the calculation of mean values and standard error (SE).
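As an illustration of the mean and SE calculation, the short sketch below assumes the usual definition of the standard error (sample standard deviation divided by the square root of the number of replicates). The function name and the replicate values are hypothetical.

```python
import math
import statistics


def mean_and_se(values):
    """Return (mean, standard error) for a list of replicate measurements.

    SE is taken as the sample standard deviation divided by sqrt(n),
    which is an assumption about how the spreadsheet values were derived.
    """
    if len(values) < 2:
        raise ValueError("at least two replicates are needed to estimate SE")
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


if __name__ == "__main__":
    # Hypothetical measurements from three biological replicates.
    print(mean_and_se([2.0, 4.0, 6.0]))
```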
GraphPad Prism software was used to calculate significant differences between the samples. *** represents p < 0.001, ** represents p < 0.01 and * represents p < 0.05.

Figure legend. The amphidiploid was compared with parent 1 (OBP1), parent 2 (OKP2) and the interspecific hybrid F1, assuming that 1/3 of the total transcription is from the genome of parent 1 (OBP1), 1/3 is from parent 2 (OKP2) and 1/3 is from the interspecific hybrid F1, while the interspecific hybrid F1 was compared with parent 1 (OBP1) and parent 2 (OKP2), assuming that ½ of the total transcription is from the genome of each parent (OBP1 and OKP2). The numbers written on the right-hand side represent the significantly up-regulated genes, the total number of expressed genes and the down-regulated genes, respectively (FDR < 0.05). The numbers written on the right-hand side represent the total number of baseline expressed genes. Percentage indicates the proportion of the total number of expressed genes.

Abbreviations