The ancestor karyotype projection provides evidence for studying the evolutionary history of species by identifying collinear genes and their order [11, 13]. Previous ancestor karyotype projection studies contained undefined regions and only revealed limited karyotype dynamics [9, 10, 48]. This study utilized WGDI to identify the proto-chromosomes by searching for shared intact chromosomes or chromosome-like synteny blocks to complete gap regions [12, 13]. The ancestral karyotype projections of eight oak species with AEK and ACEK were established and the evolution of modern chromosomes in Q. glauca was clarified.
Oaks are diploid, and no lineage-specific whole-genome duplication events have occurred in them [40]. Interspecific conserved synteny blocks exist between modern oak genomes and ancestral karyotypes from the same ancestor [49]. Previous research used shared synteny blocks to identify the most intact chromosome as an ancestral proto-chromosome [10, 13]. However, complex rearrangements in oak genomes have scattered the shared synteny blocks across segments of several chromosomes, making it difficult to identify the common ancestral proto-chromosome precisely. Rearrangement occurs frequently in plant genomes and can promote the evolution of chromosome number, size, structure, and composition [8, 50]. After polyploidization, the subsequent diploidization entails various chromosome rearrangements, such as inversions, translocations, fissions and fusions, duplications, and deletions [51]. These events could contribute to the structural diversity of the oak karyotype. This study clarified the evolution of modern Q. glauca chromosomes and confirmed the important role of chromosome fusion and arm exchange in karyotype evolution. In comparison with Betula pendula (Betulaceae), the evolution of AEK 6 is relatively conservative in both Fagaceae and Betulaceae, although their chromosomes have undergone complex rearrangements (Fig. S4) [13]. Chromosome rearrangement enriches the chromosomal structural diversity of these two widely distributed and ancient Fagales lineages and contributes to adaptive evolution. To elucidate the common ancestor and the specific details of karyotype evolution in oaks, it is necessary to analyze karyotype evolution based on representative genomes of other lineages [11, 13].
Both identical and species-specific chromosomal rearrangements within oaks were revealed by the ancestor karyotype projection and interspecific synteny relationships. In oaks, research has revealed the importance of natural hybridization and introgression in promoting genetic diversity and the generation of new species [52, 53]. Identical chromosomal rearrangements among oaks are associated with the evolution of Quercus’ common ancestor, and these rearrangements may have been preserved through frequent hybridization and may inhibit recombination [54, 55]. Species-specific chromosomal variation enriched the lineage-specific diversity of chromosomal structure and contributed to reproductive isolation, speciation, and adaptive evolution [3, 50]. The accumulation of chromosomal rearrangements between species is largely incidental to speciation, and affects gene flow and fitness [54, 56]. Some species-specific chromosome structural variations detected in this study were consistent with previous oak genome research [36–38, 47]. The species-specific inversion and translocation on chromosomes 3 and 5 of Q. lobata may be related to its ancient speciation and unique lineage evolution on the west coast of North America. The interspecific chromosome rearrangements appeared irregular among different sections and therefore could not provide direct evidence for divergence and speciation among oak species. Quercus glauca and Q. gilva, from section Cyclobalanopsis, exhibited inversions on chromosomes 1 and 7, possibly related to speciation and habitat differences. Chromosome rearrangement undoubtedly enriches the diversity of oak karyotypes, and further research on rearranged sequences should explore interspecific differences, stress resistance, and ecological adaptability in oak species.
LTR-RTs and polyploidization promote adaptation and shape genomic structure [57]. The LTR content of the oak species varied, ranging from approximately 139.1 Mb (17.2%) in Q. mongolica [40] to 371.3 Mb (46.6%) in Q. variabilis [41] (Table S1). Previous genomic studies on oaks focused on analyzing LTR-RT content, with little further identification of intact full-length elements across the different lineages of the Copia and Gypsy subfamilies. Based on conserved protein domains and the REXdb database [58], we identified intact full-length LTR-RTs ranging from 33.8 Mb to 56.8 Mb after excluding some Unknown elements and solo LTRs. The amplification and depletion of LTR-RTs affect genome structure, size, and evolutionary rates [15]. Previous research on Fabaceae and Cucurbitaceae species has shown a significant positive correlation between LTR-RT content and genome size [22, 23]. Similar genome sizes but varying LTR-RT densities in oaks imply that species-specific evolutionary histories could have affected the richness of LTR-RTs across species. Several factors could contribute to LTR-RT content, such as chromosomal rearrangement and solo LTR content [59–61]. Among oaks, Q. lobata, which carries species-specific chromosomal rearrangements, has fewer intact LTR-RTs and solo LTRs, which may suggest that its genome remained relatively stable after speciation. The two species with larger genomes, Q. glauca and Q. gilva, have more intact LTR-RTs and solo LTRs, which may suggest rapid evolution of their genomes.
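As a rough check on the contrast between genome size and LTR density, the two figures quoted above from Table S1 can be back-calculated into implied assembly sizes. The short sketch below is illustrative only: the function name `implied_genome_size_mb` is hypothetical, and the inputs are the rounded values given in the text, so the results are approximations rather than reported assembly sizes.

```python
# LTR span (Mb) and LTR share of the assembly (%) as quoted in the text (Table S1).
# These are rounded values, so every derived number below is approximate.
ltr_stats = {
    "Q. mongolica": (139.1, 17.2),
    "Q. variabilis": (371.3, 46.6),
}


def implied_genome_size_mb(ltr_mb, ltr_percent):
    """Back-calculate the assembly size implied by an LTR span and its percentage."""
    return ltr_mb / (ltr_percent / 100.0)


# Hypothetical helper output: implied assembly size per species.
implied = {
    species: implied_genome_size_mb(ltr_mb, ltr_percent)
    for species, (ltr_mb, ltr_percent) in ltr_stats.items()
}

for species, size_mb in implied.items():
    print(f"{species}: implied assembly size of about {size_mb:.0f} Mb")

# Ratio of the two LTR spans, for comparison with the ratio of implied sizes.
ltr_ratio = ltr_stats["Q. variabilis"][0] / ltr_stats["Q. mongolica"][0]
size_ratio = implied["Q. variabilis"] / implied["Q. mongolica"]
print(f"LTR span ratio: {ltr_ratio:.2f}; implied size ratio: {size_ratio:.2f}")
```

Under these rounded inputs, the two implied assembly sizes differ by only a few percent while the LTR span differs by more than twofold, which is the pattern the paragraph describes.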
The LTR-RTs in oaks are sub-classified into different lineages, with SIRE and Retand accounting for most of the Copia and Gypsy subfamilies, respectively. Previous research [22, 23] found that the scale and timeframe of LTR-RT amplification vary dramatically among families, lineages, and species [15]. In oaks, the Copia/Ale, Copia/SIRE, Copia/Angela, and Gypsy/Retand lineages exhibited varying amplification and evolutionary patterns. The amplification of different LTR-RT lineages in the oak genome was a source of intraspecific polymorphism, which is considered an important factor affecting genomic diversity and adaptive evolution [62]. Of the two oak species in section Cyclobalanopsis, Q. gilva showed more ancient lineage amplification, whereas Retand was independently amplified in Q. glauca. The two species of section Cerris (Q. acutissima and Q. variabilis) showed more recent amplification of Gypsy. The amplification and loss rates of specific LTR-RT lineages in oak species may imply differences in evolutionary rate among sections and species [15].
Insertion of LTR-RTs into genomes impacts gene expression, regulation, and function, for example by changing gene structure or the functional elements in the promoter region [23, 63–65]. Comparative transcriptomic analyses confirmed the suppressive effect of LTR-RTs inserted in Q. glauca genes, consistent with previous studies in Cucurbitaceae and Fabaceae species [22, 23]. In GO enrichment analysis, LTR-RT-associated genes in oaks were enriched in envelope and heterochromatin formation, which were related to SIRE and Retand amplification [66–68]. Meanwhile, the mutations caused by LTR-RT insertion may also affect phenotypes. For example, an LTR-RT inserted into the apple MdMYB1 gene increases anthocyanidin accumulation and produces red skin [69], and an LTR-RT insertion in BoCYP704B1 is the primary cause of male sterility in cabbage [70]. Therefore, the impact of inserted LTR-RTs on gene expression regulation in oak genomes warrants further study.
Through integration and subsequent deletion, LTR-RTs are thought to facilitate subtle restructuring of chromosomal landscapes [50]. LTR/Copia and LTR/Gypsy elements are usually interspersed with tandem repeats and enriched in plant centromere regions [59, 71, 72]. Repeat units of 32, 78, and 79 bp are strongly associated with the centromere regions of six chromosomes in Q. glauca, whereas Q. lobata has a consistent repeat unit (148 bp) for each centromere [47]. This result indicates that although centromere function is conserved across species, centromere structure and sequence are diverse [73]. The complex and highly repetitive structure of the centromere region often leads to collapse and truncation during genome assembly, which may mean that we have not identified all centromeres [74]. During polyploidization and the subsequent return to the diploid state, the centromere plays an important role in karyotype rearrangement and speciation [59, 75]. Some chromosomal rearrangement regions in Q. glauca exhibited unique patterns of LTR-RT enrichment. The centromeric tandem repeat units were also common in non-centromeric regions of the Q. glauca genome, which may be related to centromere loss and formation after chromosome fusion and fission. However, whether ancient centromere repeats still exist in the modern genome and have special functions in maintaining chromosome stability remains unknown [76]. Recent studies have proposed a new genome assembly method that can produce a highly continuous and completely gap-free reference genome, allowing better identification of all centromere regions and exploration of centromere evolution [77, 78]. This study lays the groundwork for precise identification of the centromere regions in the oak genome, which would make it possible to explore centromere variation among oaks and its impact on karyotype evolution.