Morphological and agronomic characteristics with different ploidy levels
As shown in Fig. 1a, chromosome counts in root-tip cells were 24 in F01 (AB), 48 in F02 (AABB) and 48 in F03 (AAAB). With chromosome doubling, F01 and F02 differed markedly in external morphology (Fig. 1b). Compared with the hybrid F01, the allotetraploid plants (F02) were taller and had sturdier stems; larger, thicker and darker leaves; and larger panicles, grains and floral organs (Fig. 1c–h). In addition, the amphiploid plants grew more vigorously, showing the obvious “giant” effect of polyploid organs (Table 1). In terms of seed setting, the allotetraploid hybrid F02 (AABB) had a seed setting rate of 53.16%, whereas the diploid hybrid F01 set no seed at all. The hybrids with the same ploidy (F02 and F03) also showed significant morphological differences owing to their different genome compositions (Fig. 1). Compared with F02, the allotetraploid hybrid F03 grew taller, had larger panicles and floral organs, and bore more grains per panicle. The overall phenotype of F03 (AAAB) was more similar to that of cultivated rice because of its additional copies of the A genome. In terms of fertility, F03 did not set seed after selfing, although immature embryos were occasionally observed. All three hybrids (F01, F02 and F03) had purple stigmas, red awns and black chaff, and all shattered easily.
Table 1
Comparison of morphological characteristics among hybrids of different ploidy
Materials | F01 (AB) | F02 (AABB) | F03 (AAAB) |
Plant height (cm) | 93.51 ± 4.02 | 103.00 ± 1.25 | 109.20 ± 2.29 |
Panicle no. per plant | 24.16 ± 0.37 | 17.57 ± 1.05 | 19.86 ± 1.45 |
Panicle length (cm) | 20.01 ± 1.13 | 25.88 ± 1.97 | 31.57 ± 1.17 |
Grain length/width (cm) | 0.70/0.29 | 0.90/0.30 | 0.90/0.35 |
Awn length (cm) | 6.10 ± 0.30 | 6.90 ± 0.23 | 7.51 ± 0.41 |
Total grain no. per panicle | 170.00 ± 6.32 | 116.35 ± 5.64 | 187.29 ± 7.73 |
Seed setting rate (%) | 0 | 53.16 ± 2.14 | 0 |
Flag leaf length/width (cm) | 15.6 ± 1.37 / 2.3 ± 0.25 | 19.4 ± 0.68 / 2.5 ± 0.24 | 20.1 ± 0.39 / 2.1 ± 0.12 |
Shattering trait | Shattering | Shattering | Shattering |
Awn color | Red | Red | Red |
Stigma color | Purple | Purple | Purple |
Seed color | Black | Black | Black |
The mature, normal pollen grains of the allotetraploid hybrid F02 were full, round and darkly stained, with a mean diameter of 41.44 µm. The pollen grains of F01 and F03 were mostly unstained and predominantly of the typical abortive type, shrunken and shriveled. Their mean diameters were therefore smaller, at 21.08 µm and 26.21 µm, respectively (Table 2). Pollen fertility also differed markedly among the hybrids of different ploidy (Table 2, Fig. 1). Almost no pollen grains were stained in F01. A few stained pollen grains appeared in F03, with a maximum staining rate of 4.35%. The pollen staining rate was highest in F02, ranging from 40.11% to 80.26% with an average of 62.89%. The pollen staining rate directly reflects the fertility of the material. According to the staining results, F02 had the highest fertility, whereas F01 and F03 had poor fertility, which was consistent with the seed setting rates.
Table 2
Comparison of pollen diameter and fertility among hybrids of different ploidy
Materials | Pollen diameter: min (µm) | Pollen diameter: max (µm) | Pollen diameter: average (µm) | Pollen staining rate: min (%) | Pollen staining rate: max (%) | Pollen staining rate: average (%) |
F01 (AB) | 16.12 | 30.22 | 21.08 ± 3.91 | 0.00 | 0.00 | 0.00 |
F02 (AABB) | 29.42 | 49.01 | 41.44 ± 6.67 | 40.11 | 80.26 | 62.89 ± 16.97 |
F03 (AAAB) | 19.75 | 38.08 | 26.21 ± 5.18 | 0.00 | 4.35 | 0.93 ± 1.57 |
Table 3
Statistics of the SMRT sequencing data for hybrids of different ploidy
Items | F01 (AB) | F02 (AABB) | F03 (AAAB) |
Number of circular consensus reads | 278855 | 290833 | 293637 |
Full-length non-chimeric reads | 244819 | 257816 | 258052 |
Filtered short reads | 174 | 81 | 65 |
Number of consensus isoforms | 31409 | 30778 | 28933 |
Average consensus isoform read length (bp) | 1127 | 1710 | 1760 |
Nonredundant isoforms | 11223 | 12722 | 13472 |
Gene loci | 8336 | 8767 | 9140 |
New gene loci | 270 | 206 | 259 |
New isoforms | 2746 | 4044 | 4113 |
Full-length transcriptome sequencing analysis with different ploidy levels
In this study, we applied the Iso-seq approach to a transcriptomic analysis of rice hybrids with different ploidy levels and genome compositions (Fig. S1), whose polyploid genomes are more complex than those of Asian cultivated and wild rice lines [28]. To develop a comprehensive catalog of transcript isoforms, high-quality RNA was extracted from five tissues of the hybrids. These tissues were sampled at different developmental stages and then pooled to construct size-fractionated Iso-seq libraries (1–6 kb). After quality control (Figs. S1 and S2), the PacBio RS II platform generated 278,855 (F01), 290,833 (F02) and 293,637 (F03) circular consensus (CCS) reads, of which 244,819 (87.78%), 257,816 (88.65%) and 258,052 (87.88%) were full-length non-chimeric reads, as judged by the presence of 5’ primers, 3’ primers and poly(A) tails. The full-length non-chimeric reads were clustered to obtain consensus isoforms, and the consensus isoforms in each cluster were corrected to yield 31,153 (F01), 30,603 (F02) and 28,840 (F03) high-quality isoforms. After redundancy removal, 11,223 (F01), 12,722 (F02) and 13,472 (F03) non-redundant consensus isoforms were obtained, covering 8336, 8767 and 9140 gene loci, respectively. In addition, 270 (F01), 206 (F02) and 259 (F03) new gene loci and 2746, 4044 and 4113 new isoforms were found. These results showed that the numbers of non-redundant isoforms, gene loci and new isoforms, as well as the lengths of full-length transcripts, were higher in polyploid rice than in the diploid hybrid F01. The mean length of the polyploid isoforms (about 1700 bp) was considerably greater than the mean transcript length in Ensembl Plants (1190 bp). These Iso-seq full-length isoforms, produced directly from sequencing without assembly, are valuable resources for optimizing gene models.
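As a rough illustration of how the counts above can be compared across samples, the following sketch divides the number of non-redundant isoforms by the number of gene loci for each hybrid. The isoforms-per-locus ratio is not a statistic reported in this study; it is a derived summary introduced here only for illustration, and the variable and function names are hypothetical.

```python
# Counts taken from Table 3 (non-redundant isoforms and gene loci per sample).
nonredundant_isoforms = {"F01": 11223, "F02": 12722, "F03": 13472}
gene_loci = {"F01": 8336, "F02": 8767, "F03": 9140}


def isoforms_per_locus(isoforms, loci):
    """Return the mean number of non-redundant isoforms per gene locus.

    This ratio is an illustrative, derived quantity (an assumption of this
    sketch), not a value reported in the text.
    """
    return {sample: round(isoforms[sample] / loci[sample], 2) for sample in isoforms}


ratios = isoforms_per_locus(nonredundant_isoforms, gene_loci)
for sample, ratio in ratios.items():
    print(f"{sample}: {ratio} non-redundant isoforms per locus")
```

Under these counts the ratio rises from F01 to F03, in line with the statement that polyploid rice carried more non-redundant isoforms than the diploid hybrid.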
Splice junctions and AS modes with different ploidy levels
Isoform sequencing yields long reads without the need for assembly and provides superior evidence for identifying AS variants. Using the high-quality full-length isoforms, we systematically analyzed AS events. Five major types of AS event were identified with a customized program: intron retention (IR), alternative 5′ splice site (A5), alternative 3′ splice site (A3), exon skipping (ES) and mutually exclusive exons (MX). In total, 480, 1104 and 1195 AS events gave rise to 864, 2002 and 2233 alternative splice variants in F01, F02 and F03, respectively, and only a small proportion of splice variants was shared among samples (Fig. 2a). Figure 2b shows that IR was the predominant event type, accounting for 60% of events. The ratio of A3 events was significantly higher in the polyploids than in the diploid, and higher in F03 than in F02. The ratios of ES and IR events were significantly lower in the polyploids than in the diploid, and lower in F03 (AAAB) than in F02 (AABB). Overall, AS events occurred more than twice as often in the polyploids as in the diploid rice, whereas there was no significant difference between F02 (AABB) and F03 (AAAB).
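The comparison of AS event counts can be made explicit with a small calculation. The sketch below, which assumes F01 as the diploid baseline, expresses each sample's AS event count as a fold change over F01 and also reports splice variants per event; the helper name `fold_over_baseline` is hypothetical.

```python
# AS events and alternative splice variants per sample, as reported in the text.
as_events = {"F01": 480, "F02": 1104, "F03": 1195}
as_variants = {"F01": 864, "F02": 2002, "F03": 2233}


def fold_over_baseline(counts, baseline="F01"):
    """Express each sample's count as a fold change relative to the baseline sample."""
    base = counts[baseline]
    return {sample: round(n / base, 2) for sample, n in counts.items()}


event_fold = fold_over_baseline(as_events)
variants_per_event = {
    sample: round(as_variants[sample] / as_events[sample], 2) for sample in as_events
}

for sample in as_events:
    print(sample, "fold vs F01:", event_fold[sample],
          "variants per event:", variants_per_event[sample])
```

With these counts, both polyploids show more than a twofold increase in AS events over F01, while the number of variants per event stays close to two in all three samples.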
Here, using Cuffcompare, transcripts with a class code of ‘=’ were defined as ‘known transcripts’, whereas all others (such as ‘c’, ‘i’, ‘p’, ‘j’, ‘u’, ‘e’, ‘x’ and ‘o’) were defined as ‘unannotated isoforms’. Figure 2c shows 2746, 4044 and 4113 unannotated isoforms in F01, F02 and F03, respectively. For example, the gene LOC_Os01g31360, which is annotated as having five transcripts, was found to generate 13 splice isoforms in the Iso-seq data of F02 (PB.415 splice isoforms, Tables S1 and S2). Similarly, the gene LOC_Os08g06110, which is annotated as having five transcripts, was found to generate 15 splice isoforms in the Iso-seq data of F03 (Table S2). These observations suggest that AS events lead to complex transcriptional regulation in polyploid rice.
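The classification rule described above is simple enough to sketch directly. The example below is a minimal illustration, not the pipeline used in the study: it assumes that each isoform has already been assigned a Cuffcompare class code, and the isoform identifiers and codes in `records` are made-up placeholders.

```python
from collections import Counter

KNOWN_CODE = "="  # class code for an exact match to an annotated transcript


def classify(class_code: str) -> str:
    """Label an isoform as 'known' (code '=') or 'unannotated' (any other code)."""
    return "known" if class_code == KNOWN_CODE else "unannotated"


# Hypothetical (isoform ID, class code) pairs; not real data from the study.
records = [
    ("iso1", "="),
    ("iso2", "j"),
    ("iso3", "u"),
    ("iso4", "="),
    ("iso5", "i"),
]

counts = Counter(classify(code) for _, code in records)
print("known:", counts["known"], "unannotated:", counts["unannotated"])
```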
Alternative polyadenylation analysis
Many studies have shown that APA events increase transcriptome complexity and can regulate gene expression [25]. APA analysis is limited when conventional short-read RNA-seq is used, and the effect of APA on transcriptome complexity in rice lines of different ploidy is still unknown. Examining the 3’ ends of transcripts with Iso-seq allowed us to accurately identify differential polyadenylation sites for the first time in rice (Fig. 3). In the diploid hybrid F01, we detected 7532 genes containing at least one APA site, of which 146 genes had at least five poly(A) sites. In total, 13,468 APA sites were identified, an average of 1.79 poly(A) sites per gene. In the allotetraploid hybrid F02, 4135 genes contained at least one APA site, and 297 genes had at least five poly(A) sites. In total, 6126 APA sites were identified, an average of 1.48 poly(A) sites per gene. In the allotetraploid hybrid F03, 8810 genes containing at least one APA site were identified. In total, 18,038 APA sites were identified, an average of 2.05 poly(A) sites per gene (Fig. 3b). These results suggest that APA is a common phenomenon in rice.
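The per-gene averages quoted above follow directly from the reported totals. As a quick consistency check, the sketch below divides the number of APA sites by the number of APA-containing genes for each hybrid; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# (genes with at least one APA site, total APA sites) per sample, from the text.
apa_counts = {
    "F01": (7532, 13468),
    "F02": (4135, 6126),
    "F03": (8810, 18038),
}


def mean_sites_per_gene(counts):
    """Return the mean number of poly(A) sites per APA-containing gene."""
    return {
        sample: round(sites / genes, 2)
        for sample, (genes, sites) in counts.items()
    }


sites_per_gene = mean_sites_per_gene(apa_counts)
for sample, value in sites_per_gene.items():
    print(f"{sample}: {value} poly(A) sites per gene")
```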
Figure 3a shows that 3246 genes with APA sites were shared among the three samples, accounting for 30.5%. There were more unique genes in F03 than in F01. The proportion of isoforms in each KOG category differed among ploidy levels (Fig. 3c). The ratios of enriched genes in Carbohydrate transport and metabolism, Lipid transport and metabolism, Signal transduction mechanisms and Cytoskeleton were highest in F02 and lowest in F01. In addition, the ratios of enriched genes in Signal transduction mechanisms and in Intracellular trafficking, secretion and vesicular transport were greater in allotetraploid rice than in the diploid hybrid F01. Polyploidy and hybridization, accompanied by variable levels of polyadenylation, thus have certain effects on plant growth, transcription and repair. The results indicated that polyadenylation of these genes has a regulatory effect on biological processes such as plant growth, development and stress responses, while also affecting the diversity and complexity of polyploid genetic traits.
Functional annotation with different ploidy levels
The numbers of non-redundant isoforms in F01, F02 and F03 were 11,223, 12,722 and 13,472, respectively (Fig. 4a). Of these, 7986 (42.4%) were common to all three samples, 8638 (56.4%) to F01 and F02, 8706 (53.5%) to F01 and F03, and 9224 (54.4%) to F02 and F03. Thus, in pairwise comparisons more than half of the isoforms were shared and about 45% were not. The number of unique isoforms increased from F01 to F02, indicating that polyploidization involves complex transcriptional regulation. Non-redundant transcripts were annotated with COG, KOG, GO and KEGG functions, and the annotation rates were 79.5%, 75.5% and 75.1%, respectively, indicating that full-length transcriptome sequencing is a powerful tool for studying post-transcriptional modification, new gene discovery and genome annotation.
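The three-sample overlap can be reproduced from the reported counts by inclusion-exclusion. The sketch below assumes that the percentage for the common isoforms is computed relative to the union of the three isoform sets; this denominator is an assumption made here for illustration and is not stated explicitly in the text.

```python
# Non-redundant isoform counts per sample and shared counts, from the text.
totals = {"F01": 11223, "F02": 12722, "F03": 13472}
pairwise_shared = {
    ("F01", "F02"): 8638,
    ("F01", "F03"): 8706,
    ("F02", "F03"): 9224,
}
shared_by_all = 7986


def union_size(totals, pairwise_shared, shared_by_all):
    """Size of the union of three sets by inclusion-exclusion:
    |A u B u C| = |A| + |B| + |C| - |A n B| - |A n C| - |B n C| + |A n B n C|
    """
    return sum(totals.values()) - sum(pairwise_shared.values()) + shared_by_all


union_all = union_size(totals, pairwise_shared, shared_by_all)
# Assumed definition: isoforms common to all samples as a share of the union.
shared_pct = round(100 * shared_by_all / union_all, 1)
print("union of all isoforms:", union_all)
print("shared by all three samples (%):", shared_pct)
```

Under this assumption the calculation gives a union of 18,835 isoforms and a shared fraction that matches the 42.4% reported for the three samples.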
KOG enrichment showed that, under normal growth conditions, isoform numbers were broadly similar across ploidy levels (Fig. S3), and the main enriched categories were nearly the same. From F01 to F03, the number of isoforms in the KOG categories increased. The ratios of isoforms in the KOG categories Energy production and conversion; Transcription; and Replication, recombination and repair increased significantly from F01 to F02 (Fig. 4b). Meanwhile, the ratios of isoforms in the KOG categories Energy production and conversion; Carbohydrate transport and metabolism; and Translation, ribosomal structure and biogenesis were higher in F03 than in F02, whereas the ratio of isoforms in the Transcription category was lower in F03 than in F02, indicating that hybridization offers advantages to plant growth. In the COG categories, the ratios of isoforms related to Carbohydrate transport and metabolism; Coenzyme transport and metabolism; Lipid transport and metabolism; and Translation, ribosomal structure and biogenesis increased significantly from F01 to F02. This may explain why the polyploid rice grew tall and sturdy. The COG analysis also suggested that the function of Translation, ribosomal structure and biogenesis was more complex in the allotetraploid rice lines. Figure 4c shows that the number of isoforms was higher in the allotetraploid rice than in the diploid hybrid F01, and higher in F03 than in F02. However, the proportions of most GO terms did not change significantly. Only the ratios of transporter activity, nucleic acid binding transcription factor activity and molecular function regulators were higher in allotetraploid rice than in the diploid hybrid F01. This suggested that APA and AS events may contribute to the complexity of polyploidy in translation, nucleic acid binding transcription factor activity and molecular function regulation.
Figure 5 shows that the metabolic pathways and isoforms of rice with different ploidy were mostly the same. Among the lines of different ploidy, 74.5% of isoforms were common, whereas 25.5% were specific. Although the KEGG-enriched specific isoforms were more numerous in allotetraploid rice than in the diploid hybrid F01, they accounted for a smaller proportion (Fig. 5a). For the KEGG terms, the change in the number of isoforms followed the same pattern as for the KOG and GO terms: the number of isoforms for each term increased gradually from F01 to F03 (Fig. S3). These isoforms belonged mainly to the following KEGG pathways (Fig. 5c): Biosynthesis of secondary metabolites, Biosynthesis of antibiotics, Carbon metabolism, Biosynthesis of amino acids, Protein processing in endoplasmic reticulum, and Spliceosome. The percentage of isoforms participating in Carbohydrate metabolism, Amino acid metabolism and Energy metabolism was lower in F02 than in F01, but the percentage of isoforms participating in Translation and Signal transduction was higher in F02 than in F01. In F03, the percentage of isoforms participating in most pathways (especially Carbohydrate metabolism, Amino acid metabolism and Energy metabolism) was greater than that in F02. Three KEGG pathways (Ribosome biogenesis in eukaryotes, Protein export and Microbial metabolism in diverse environments) were significantly enriched in F01 (Fig. 5b). Two KEGG pathways (RNA transport and mRNA surveillance) were significantly enriched in F02, and two KEGG pathways (Plant-pathogen interaction and Arginine and proline metabolism) were significantly enriched in F03. These findings supported the view that AS events and APA play certain regulatory roles.
New isoforms with different ploidy levels
In total, 2746, 4044 and 4113 new isoforms were discovered in F01, F02 and F03, respectively, using the Iso-seq data. The polyploids (F02 and F03) had more new isoforms than the diploid hybrid, but there was no significant difference between F02 and F03 (Fig. 6a). Over half [2071 (59.1%)] of the new isoforms were common. This indicated that more new isoforms can be found through SMRT sequencing, and that the number of new isoforms changed little in the absence of newly added genes.
The new isoforms were annotated with KOG (Fig. 6b). The percentages of new isoforms participating in Energy production and conversion; Lipid transport and metabolism; Translation, ribosomal structure and biogenesis; Posttranslational modification, protein turnover, chaperones; Secondary metabolites biosynthesis, transport and catabolism; and Intracellular trafficking, secretion, and vesicular transport were higher in the polyploids than in the diploid hybrid, and increased gradually from F01 to F03. For GO terms, the number of new isoforms also increased gradually from F01 to F03. Within biological processes, new isoforms assigned to the GO terms response to stimulus, biological regulation, signaling, growth and immune system process were more numerous in the polyploids than in the diploid hybrid, and more numerous in F02 than in F03. Within cellular components, the percentages of new isoforms assigned to the GO terms membrane part, macromolecular complex and membrane-enclosed lumen were higher in the polyploids than in the diploid hybrid. Within molecular function, the percentages of new isoforms assigned to most GO terms were higher in the polyploids than in the diploid hybrid. These results indicated that, through the new isoforms, polyploidy was associated with marked regulatory complexity and a strong growth response to external stimuli.