Inference of ancestral diploid karyotype of C. sativa
To understand the evolutionary trajectory of ADK before the divergence of the three C. sativa sub-genomes, we analyzed syntenic conservation and chromosome repatterning between the genome of the ancestor of lineage Ⅰ and that of C. sativa. We took the A. lyrata genome as the reference for the ancestral genome of lineage Ⅰ because of the significant colinearity between the two genomes (Figure S1 and Table S1) and the high similarity of their karyotypes. By searching for homologous genes between A. lyrata and C. sativa, we drew homologous gene dot-plots (Fig. 2 and 3) that show the orthologous correspondence between the ancestral genome of lineage Ⅰ and the C. sativa genome.
In the homologous gene dot-plots of the two genomes, produced directly from BLASTP hits and further highlighted by integrating inferred colinear genes, every chromosome in the ancestral genome has three homoeologous chromosomes, or groups of homoeologous chromosome regions, in the C. sativa genome. Five ACK chromosomes showed nearly perfect orthologous correspondence with at least one complete chromosome in C. sativa (Fig.3a, b, c, d, and e). This indicates that each of these five chromosomes remained intact in ADK (defined correspondingly as ADK chromosomes 1, 2, 5, 6 and 7), directly inheriting the structure of the ACK chromosomes (AK1, 3, 6, 7 and 8) without prominent DNA rearrangements.
Notably, the orthologous correspondence between AK2, 4, 5 and Cs4, 16 (Cs-G1) is nearly the same as that between AK2, 4, 5 and Cs6, 7 (Cs-G2) (Fig.3g and h), indicating that Cs-G1 and Cs-G2 share two ancestral chromosomes, which were formed mainly through reciprocal translocation of arms (RTA) and end-end joining (EEJ) among AK2, 4 and 5. By searching for shared gene synteny between the A. lyrata and C. sativa genomes, we further found that the crossing-over positions between AK4 and AK5 lie between genes AL482377 (corresponding C. sativa ortholog: Csa16g006880.1) and AL321151 (Csa04g046610.1) in AK4, and between genes AL486375 (Csa04g046590.1) and AL486377 (Csa16g006870.1) in AK5. Two evolutionary trajectories could explain these chromosomal changes. In the more complex trajectory, AK2 and AK4 crossed over near one telomere of each chromosome, resulting in EEJ to produce AK2/4 and a satellite chromosome consisting of two telomeres (and possibly a little DNA); a cross-over then occurred between AK5 and the neo-chromosome AK2/4, which experienced one extra translocation and a pericentric inversion, resulting in RTA between the two chromosomes to produce AK5/4 (ADK3) and AK2/4/5 (ADK4) (Fig.4c). In the alternative trajectory, a cross-over between AK4 and AK5 resulted in RTA to produce AK5/4, forming ADK3, and the intermediate AK4/5. AK4/5 and AK2 then crossed over near one telomere of each chromosome, resulting in EEJ to produce AK2/4/5 and, probably, a satellite chromosome consisting of two telomeres (and possibly a little DNA). The neo-chromosome AK2/4/5 experienced one extra translocation and a pericentric inversion to form ADK4 (Fig.4d). Whichever trajectory actually occurred, the satellite chromosome was probably lost, reducing the chromosome number from 8 in ACK to 7 in ADK.
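The way a crossing-over position is bracketed by two neighboring genes can be illustrated with a minimal Python sketch. It is not the procedure used in this study, which is not specified here. It assumes that the digits following "Csa" in a gene identifier give the C. sativa chromosome number, and the helper names `csa_chromosome` and `find_breakpoints` are hypothetical. Only the flanking gene pairs reported above are used as input; a real analysis would scan all colinear gene pairs along each chromosome and filter out isolated, noisy hits.

```python
import re

def csa_chromosome(gene_id):
    """Extract the chromosome number from an ID such as 'Csa16g006880.1'.

    Assumption: the digits between 'Csa' and 'g' encode the chromosome.
    """
    match = re.match(r"Csa(\d+)g", gene_id)
    if match is None:
        raise ValueError(f"unexpected gene ID format: {gene_id}")
    return int(match.group(1))

def find_breakpoints(ordered_pairs):
    """Return adjacent reference genes whose orthologs lie on different chromosomes.

    ordered_pairs: (reference_gene, ortholog) tuples in reference gene order.
    """
    breakpoints = []
    for (ref_a, orth_a), (ref_b, orth_b) in zip(ordered_pairs, ordered_pairs[1:]):
        chr_a, chr_b = csa_chromosome(orth_a), csa_chromosome(orth_b)
        if chr_a != chr_b:
            breakpoints.append((ref_a, ref_b, chr_a, chr_b))
    return breakpoints

# Flanking gene pairs reported in the text for AK4 and AK5.
ak4 = [("AL482377", "Csa16g006880.1"), ("AL321151", "Csa04g046610.1")]
ak5 = [("AL486375", "Csa04g046590.1"), ("AL486377", "Csa16g006870.1")]

print(find_breakpoints(ak4))
print(find_breakpoints(ak5))
```

In both cases the orthologs of the two adjacent A. lyrata genes switch between Cs16 and Cs4, which is the signature expected at a translocation breakpoint.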
The orthologous correspondence between AK2, 4, 5 and Cs5, 9 (Cs-G3) (Fig.3i) differs markedly from that between AK2, 4, 5 and Cs4, 16 (Cs-G1) or Cs6, 7 (Cs-G2), showing that Cs5 and Cs9 have particular structures not shared with the other two sets of chromosomes (Cs4, 16 and Cs6, 7). At first sight, Cs-G3 therefore does not appear to share the two ancestral chromosomes (ADK3, 4) with Cs-G1 and Cs-G2. However, the orthologous correspondence between Cs4, 16 or Cs6, 7 and Cs5, 9 (Fig.3j and k) shows that Cs5 and Cs9 were formed mainly by RTA between ADK3 and ADK4 (Fig.5). By searching for gene synteny between the A. lyrata and C. sativa genomes, we further determined that the crossing-over positions between ADK3 and ADK4 lie between genes AL486375 (corresponding C. sativa ortholog: Csa04g046590.1) and AL321151 (Csa04g046610.1) in ADK3 (where the chromosome arms of AK4 and AK5 are joined), and between genes AL476152 (Csa09g071500.1) and AL926342 (Csa09g071510.1) in ADK4. These findings provide clear evidence that Cs-G3 did inherit the karyotype structures of the two ancestral chromosomes (ADK3, 4), which it shares with Cs-G1 and Cs-G2.
Inferring evolutionary trajectories from ADK to extant C. sativa karyotype
Shared chromosome structural patterns can help resolve phylogenomic relationships. In the homologous gene dot-plots, the orthologous correspondence between AK7 (ADK6) and Cs10, 11, 12 (Fig.3d) suggests that one paracentric inversion is common to Cs10 and Cs11, which belong to Cs-G1 and Cs-G2, respectively, but is absent from chromosome Cs12 of Cs-G3. This suggests that Cs-G1 and Cs-G2 did not diverge directly from ADK but share a common ancestor carrying one paracentric inversion relative to ADK6.
The three sub-genomes and the C. sativa genome could have formed as follows. The ancestral diploid of C. sativa first differentiated into species A and B, and species A then differentiated into species C and D after one paracentric inversion occurred in ADK6 (Fig.5). In species C, crossing-over between ADK6 and ADK7 occurred near one telomere of each chromosome, resulting in chromosome end–end joining (EEJ) to produce ADK6/7 and a satellite chromosome consisting of two telomeres and a little DNA. In species D, ADK5 independently experienced one paracentric inversion (Fig.3c and 5). In species B, crossing-over occurred between ADK3 and ADK4, which had experienced one translocation, resulting in reciprocal translocation of arms (RTA) to produce ADK3/4 and ADK4/3, the latter of which experienced one pericentric inversion (Fig.3j, k and 5). Also in species B, RTA between ADK5 and ADK7 produced ADK5/7 and ADK7/5 (Fig.3j and 5). The crossing-over positions between ADK5 and ADK7 lie between genes AL489681 (corresponding C. sativa ortholog: Csa20g058860.1) and AL351869 (Csa02g002270.1) in ADK5 (the region where the centromere of ADK5 is located), and between genes AL494932 (Csa20g041660.1) and AL494934 (Csa02g033470.1) in ADK7 (the region where the centromere of ADK7 is located). An initial hybridization event between species C (Cs-G1) and species D (Cs-G2) produced a tetraploid genome, and an additional hybridization event between this tetraploid and species B (Cs-G3) eventually formed the extant hexaploid genome of C. sativa [17] (Fig.5).
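The reasoning behind this scenario, namely that a rearrangement shared by two sub-genomes must lie on a branch ancestral to both, can be made explicit with a small Python sketch. The lineage labels and events are taken from the scenario above. The event strings are simplified labels (the extra translocation and pericentric inversion in species B are folded into the RTA entry), and the function name `inherited_events` is hypothetical.

```python
# Parent of each lineage in the proposed divergence scenario.
PARENT = {"A": "ADK", "B": "ADK", "C": "A", "D": "A"}

# Rearrangements placed on the branch leading to each lineage (simplified labels).
BRANCH_EVENTS = {
    "A": {"paracentric inversion in ADK6"},
    "B": {"RTA between ADK3 and ADK4", "RTA between ADK5 and ADK7"},
    "C": {"EEJ of ADK6 and ADK7"},
    "D": {"paracentric inversion in ADK5"},
}

# Diploid donor of each sub-genome, as stated in the text.
SUBGENOME_DONOR = {"Cs-G1": "C", "Cs-G2": "D", "Cs-G3": "B"}

def inherited_events(lineage):
    """Collect every rearrangement on the path from the ADK root to a lineage."""
    events = set()
    while lineage in PARENT:
        events |= BRANCH_EVENTS.get(lineage, set())
        lineage = PARENT[lineage]
    return events

g1 = inherited_events(SUBGENOME_DONOR["Cs-G1"])
g2 = inherited_events(SUBGENOME_DONOR["Cs-G2"])
g3 = inherited_events(SUBGENOME_DONOR["Cs-G3"])

shared_g1_g2 = g1 & g2
print("Shared by Cs-G1 and Cs-G2:", shared_g1_g2)
print("Shared by Cs-G1 and Cs-G3:", g1 & g3)
```

Under this tree, the paracentric inversion in ADK6 is the only event shared by Cs-G1 and Cs-G2, and Cs-G3 shares no derived rearrangement with either, which matches the pattern seen for Cs10, Cs11 and Cs12.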
During the formation of the C. sativa karyotype, 14 chromosomes of C. sativa inherited the structures of ADK chromosomes. One paracentric inversion occurred in Cs-G2 to produce one new chromosome, whereas two RTAs, accompanied by one translocation and one pericentric inversion, occurred in Cs-G3 to produce four new chromosomes. EEJ occurred in Cs-G1 to produce one new chromosome and one satellite chromosome. The loss of the satellite chromosome reduced the chromosome number from 21 to 20.
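The chromosome counts given above can be checked with a short tally. The following Python sketch uses only numbers stated in the text; the variable names are illustrative.

```python
ADK_CHROMOSOMES = 7   # chromosomes in the ancestral diploid karyotype
SUBGENOMES = 3        # Cs-G1, Cs-G2 and Cs-G3

# Expected count if every sub-genome kept all seven ADK chromosomes.
expected_without_fusion = ADK_CHROMOSOMES * SUBGENOMES

# The EEJ in Cs-G1 joins two chromosomes; the satellite by-product is lost.
satellite_chromosomes_lost = 1
expected_extant = expected_without_fusion - satellite_chromosomes_lost

# Extant chromosomes grouped by origin, as counted in the text.
tally = {
    "inherited ADK structure": 14,
    "paracentric inversion (Cs-G2)": 1,
    "two RTAs (Cs-G3)": 4,
    "EEJ (Cs-G1)": 1,
}
total_extant = sum(tally.values())

print(expected_without_fusion, expected_extant, total_extant)
```

Both routes give the same total: 3 × 7 = 21 chromosomes before the fusion, and 14 + 1 + 4 + 1 = 20 after the satellite chromosome is lost.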