Genes lost and gained during the evolution and domestication of the allotetraploid
About 90% of the reads from the 30 natural and 19 artificial B. napus genotypes mapped to the B. napus reference genome, and these reads covered a total of 46,941 genes (Fig. 1). Reads that did not map to the reference genome accounted for about 9.2% of the reads from the natural genotypes and 11.7% of those from the artificial genotypes. Assembling these unmapped reads with SOAPdenovo identified 24,505 unique genes from the natural genotypes and 24,879 unique genes from the artificial genotypes (Additional file 1: Table S1, Fig. 1). Thus, genes shared between the artificial and natural genotypes made up about 48.7% of all genes detected in this study, and genes unique to either the artificial or the natural genotypes accounted for the remaining 51.3%.
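The shared and unique proportions quoted above follow from the three gene counts. A minimal sketch of that arithmetic, assuming the denominator is the sum of the shared genes and the two sets of unique genes (the function and variable names are illustrative, not from the study):

```python
# Gene counts reported in the text above.
SHARED = 46_941             # genes found in the B. napus reference genome
UNIQUE_NATURAL = 24_505     # genes assembled from unmapped reads, natural genotypes
UNIQUE_ARTIFICIAL = 24_879  # genes assembled from unmapped reads, artificial genotypes


def shared_and_unique_fractions(shared, unique_a, unique_b):
    """Return (shared fraction, unique fraction) of all detected genes.

    Assumption: the total is the simple sum of the three sets.
    """
    total = shared + unique_a + unique_b
    return shared / total, (unique_a + unique_b) / total


shared_frac, unique_frac = shared_and_unique_fractions(
    SHARED, UNIQUE_NATURAL, UNIQUE_ARTIFICIAL
)
print(f"shared: {shared_frac:.1%}, unique: {unique_frac:.1%}")
```

Under this assumption the fractions round to the 48.7% and 51.3% given in the text.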
Difference in functional groupings of genes between natural and artificial genotypes
GO annotation of the 24,505 unique genes from the natural genotypes identified 80 enriched GO terms (Fig. 2, Additional file 2: Table S2, S3). These include terms involved in plant development, such as ‘transferase activity’, ‘macromolecule metabolic process’, ‘cellular macromolecule metabolic process’, ‘phosphorus metabolic process’, ‘seed development’, and ‘fruit development’. A similar assessment of the 24,879 unique genes from the artificial genotypes identified 48 enriched GO terms. Unlike those from the natural genotypes, most of these enriched GO terms are related to stress responses. Specifically, they include ‘response to stress’, ‘chromosome organization’, ‘cell cycle’, ‘cell cycle process’, ‘cell division’ and ‘mitotic cell cycle process’.
KEGG pathway analyses found that, compared with those in the artificial genotypes, the significantly enriched pathways in the natural genotypes are ‘Phosphatidylinositol signaling system’, ‘RNA degradation’, and ‘Inositol phosphate metabolism’. These pathways have been shown to regulate pollen tube growth, root hair tip growth, flowering and maturation [31–34]. In contrast, the pathways ‘Aminoacyl-tRNA biosynthesis’, ‘Propanoate metabolism’ and ‘Ribosome biogenesis in eukaryotes’ were significantly enriched among the unique genes in the artificial genotypes (P < 0.05) (Additional file 3: Table S4, Fig. 3). Previous studies show that these pathways are predominantly involved in drought stress and disease response [35–38]. These results indicate that many genes related to stress responses have been lost, and that the number of genes related to seed development and metabolic processes has increased significantly, during the evolution and domestication of this species.
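The text does not state which statistical test was used to call a GO term or KEGG pathway enriched. A common choice for this kind of analysis is a one-sided hypergeometric test, sketched below purely as an illustration; the function name and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric P value, P(X >= hits).

    population: number of background genes
    annotated:  background genes carrying the term or pathway
    sample:     genes in the set being tested (e.g. the unique genes)
    hits:       genes in the tested set carrying the term or pathway
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = hypergeom_enrichment_p(population=1000, annotated=50, sample=100, hits=12)
print(f"P = {p_value:.3g}")
```

In practice such P values are usually corrected for multiple testing across all terms before a threshold such as P < 0.05 is applied.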
Difference in B. napus genes derived from the two different diploid donors
To minimize possible influences of dispensable genome components (DGCs), transcriptomic data from 30 B. rapa and 26 B. oleracea genotypes were used to assess the genes derived from each of these two diploid donors during the formation and domestication of the allotetraploid B. napus. Using an identity threshold of 95%, 39,583 non-redundant coding sequences (CDS) were identified from the B. rapa genotypes and 42,521 from the B. oleracea genotypes. Aligning these CDS against the genome of the allotetraploid B. napus showed that the C subgenome shared 52.8% of its genes with its diploid donor B. oleracea and that the A subgenome shared 47.2% of its genes with its diploid donor B. rapa. Clearly, the ratio of shared genes between the A subgenome and its diploid progenitor B. rapa was substantially lower than that between the C subgenome and its diploid progenitor B. oleracea (Table 1).
Table 1
Genes shared between subgenomes and their respective progenitors and gene ratios between the subgenomes.
Species | Genome | Gene number | Retained from AA (No. CDS) | Retained from AA (percentage) | Retained from CC (No. CDS) | Retained from CC (percentage) | A/(A + C) | C/(A + C) |
B. rapa | AA | 39583 | | | | | | |
B. oleracea | CC | 42521 | | | | | | |
Lost genes | AACC | 24783 | 8659 | 21.88% | 9612 | 22.61% | 47.39% | 52.61% |
Retained genes | AACC | 46941 | 14821 | 37.44% | 17314 | 40.72% | 46.12% | 53.88% |
Natural allotetraploid | AACC | 71377 | 18569 | 46.91% | 20183 | 47.47% | 47.92% | 52.08% |
Artificial allotetraploid | AACC | 71724 | 19653 | 49.65% | 21897 | 51.50% | 47.30% | 52.70% |
Lost genes represent genes detected in the artificial allotetraploid genotypes but not in the natural genotypes of B. napus. Retained genes represent genes detected in both the artificial and natural allotetraploid genotypes of B. napus.
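The percentage and ratio columns of Table 1 can be recomputed from the CDS counts. The sketch below assumes that each percentage is the retained CDS count divided by the total non-redundant CDS of the corresponding progenitor, and that A/(A + C) is the share of A-derived CDS among all retained CDS; the function name is illustrative.

```python
# Non-redundant CDS totals for the two diploid progenitors (Table 1).
B_RAPA_CDS = 39_583      # AA genome
B_OLERACEA_CDS = 42_521  # CC genome


def table1_row(from_a, from_c):
    """Return (% of AA CDS, % of CC CDS, A/(A+C) %, C/(A+C) %) for one row."""
    pct_a = 100 * from_a / B_RAPA_CDS
    pct_c = 100 * from_c / B_OLERACEA_CDS
    a_share = 100 * from_a / (from_a + from_c)
    c_share = 100 * from_c / (from_a + from_c)
    return pct_a, pct_c, a_share, c_share


# CDS counts copied from the "Lost genes" and "Retained genes" rows.
for label, a, c in [("Lost genes", 8_659, 9_612), ("Retained genes", 14_821, 17_314)]:
    values = table1_row(a, c)
    print(label, " ".join(f"{v:.2f}%" for v in values))
```

Under these assumptions the computed values agree with the tabulated percentages for both rows.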
GO term analysis showed significant differences in the functional groups of genes derived from the two diploid progenitors. Among the genes shared between the natural and artificial genotypes, the numbers of enriched GO terms for two (‘molecular function’ and ‘biological process’) of the three functional classes were significantly higher for genes derived from B. rapa than for those derived from B. oleracea. The genes derived from B. rapa were enriched in transcription factor, plant-type development and negative regulation of several biological processes. Among the genes unique to the artificial allotetraploid genotypes, the numbers of enriched GO terms from B. oleracea were significantly larger than those from B. rapa (Table 2). The most significantly enriched GO terms included cellular development, response to abiotic stimulus, metabolic process and positive regulation of several biological processes (Fig. 4; Additional file 4: Tables S5, S6, S7).
Table 2
Numbers of enriched GO terms among “retained” and “lost” genes derived from two diploid donors.
GO term | Genes shared between natural and artificial genotypes, derived from B. rapa | Genes shared between natural and artificial genotypes, derived from B. oleracea | Unique genes in the artificial genotypes, derived from B. rapa | Unique genes in the artificial genotypes, derived from B. oleracea |
cellular component | 3 | 19 | 0 | 46 |
molecular function | 15 | 1 | 7 | 16 |
biological process | 73 | 15 | 5 | 77 |
“Retained” genes represent genes shared between the natural and artificial genotypes of B. napus; “lost” genes represent genes unique to the artificial allotetraploid genotypes.