Allopolyploidization Enhances Genetic Recombination of the Ancestral Diploid Genome in the Evolution of Wheat

Hongshen Wan, Sichuan Academy of Agricultural Sciences
Jun Li, Sichuan Academy of Agricultural Sciences
Shengwei Ma, Nanjing Agricultural University
Fan Yang, Sichuan Agricultural University
Liang Chai, Sichuan Academy of Agricultural Sciences
Hongxing Xu, Henan University
Qin Wang, Sichuan Academy of Agricultural Sciences
Zehou Liu, Sichuan Academy of Agricultural Sciences
Zongjun Pu, Sichuan Academy of Agricultural Sciences
Wuyun Yang (yangwuyun@126.com), Sichuan Academy of Agricultural Sciences, https://orcid.org/0000-0002-8201-4929

hexaploidization, resulting in hexaploid common wheat. Bread wheat originated from accidental hybridization between the tetraploid wheat T. turgidum (AABB) and the wild diploid grass Aegilops tauschii (DD). The wild diploid Ae. tauschii ssp. strangulata, the D genome donor of hexaploid common wheat [12][13][14][15][16], hybridized with the free-threshing form of tetraploid wheat [15] south and west of the Caspian Sea about 9,000 years ago. This event made hexaploid wheat the major type of wheat in subsequent agricultural history: it accounts for about 95% of world wheat production, whereas tetraploid durum wheat represents the remaining 5% [17].
However, the D genome of the first bread wheat originated from only a small natural group of Ae. tauschii ssp. strangulata [12,14,16], so the genetic basis and diversity of the D genome are rather narrow in modern common wheat. To broaden the genetic diversity of the bread wheat D genome, scientists created synthetic hexaploid wheat (SHW) by crossing tetraploid wheat with Ae. tauschii followed by chromosome doubling [43], and more and more elite commercial wheat varieties derived from SHWs have been released in the subsequent decades [3][44][45][46][47]. Artificial hexaploidization has thus, to some extent, enhanced the genetic variation and adaptive evolution of modern wheat [45], and it has attracted increasing attention [46][47][48][49].
Genetic recombination produces novel allele combinations and is therefore the most important genetic mechanism supplying new variation to the selection pool for evolution. Moreover, genetic advance in a breeding program depends on selecting new recombinant individuals from inter-varietal or interspecific crosses, and boosting genetic recombination would effectively speed up the combining of valuable traits from different parents into new elite varieties [50]. In plants, recombination frequency is affected by genetic factors such as zep1 (rice: [51]), zyp1 (barley: [52]), Ph1 (wheat: [53]), related QTL (wheat: [54]) and male meiosis (wheat: [55]; rye: [56]). It is also affected by external stresses that may change the physiological condition of the plant or cell, including chemicals, nutrient levels, X-rays, mutagenic agents and temperature [57,58]. In Brassica, the genetic recombination frequency of the diploid A genome is boosted in allotriploid and allotetraploid hybrids with the C genome [59], and in Arabidopsis, polyploidization increases meiotic recombination frequency [60]. Allopolyploidization is a driving force in plant evolution [61,62] and has been a primary focus of plant evolutionary genomics in recent years. Hexaploid wheat benefits from heterosis and gene redundancy derived from both emmer wheat and the wild grass. However, changes in genetic recombination in polyploids, such as changes in homologous recombination, which might be a strong force driving some aspects of plant genome variability [63], have not yet received adequate attention in the evolution of wheat, although wheat is regarded as a good model for studying polyploidization in plants [64].
To study the impact of allopolyploidization on genetic recombination in the ancestral diploid D and tetraploid AB genomes of wheat, we developed two sets of segregating populations to investigate the change in genetic recombination frequency (RF) between homologous chromosomes of the D and AB genomes after polyploidization from diploid or tetraploid to hexaploid, respectively. Using molecular markers, we examined the genome-wide distribution of RF on each chromosome at the diploid, tetraploid and hexaploid levels, together with its pattern of change along entire chromosomes of the ancestral diploid and tetraploid genomes after hexaploidization. The effects of hexaploidization on the different ancestral genomes of wheat were thereby defined explicitly. The results suggest that hexaploidization enhanced homologous recombination of the diploid donor genome during the evolution of wheat, which helped wheat overcome the narrow origin of its D genome and sped up its adaptive evolution, making it one of the most important crops.

Distribution of SNPs on AB and D genomes
The Wheat Breeder's Genotyping Array and DArT-Seq™ technology were used to genotype the ancestral genomes of the two genetic population sets, respectively. For genotyping the tetraploid genome AABB, a total of 9925 SNP labels were designed to be anchored on the Wheat Breeder's Genotyping Array, and their numbers ranged from 463 on 6A to 997 on 3B (Table 1). Most 20-Mb intervals across the AB genome contained 10-30 SNPs, and pericentromeric regions carried far fewer SNPs than other genomic regions, especially on chromosomes 1B, 2B, 4B and 5A (Fig. 1A). For the diploid genome DD, a total of 9043 SNPs were obtained by DArT-Seq™ technology, which is based on genotyping by sequencing (GBS) and alignment of the Illumina sequencing reads to the Chinese Spring (CS) RefSeq v1.0 assembly. The number of SNPs per chromosome ranged from 795 on 4D to 1850 on 2D across the whole D genome (Table 1). Their distribution along the physical positions of each chromosome was quite uneven: the number of SNPs increased consistently and linearly with physical distance from the centromere, and SNPs were much more abundant in distal regions far from the centromere (Fig. 1B).
For the AB genome, the 9487 SNP labels on the chip were selected artificially by China Golden Marker Biotech Co Ltd (Fig. 1A) and were distributed more evenly than the SNPs of the D genome (Fig. 1B). A total of 904 SNP markers were polymorphic between the two parents of both the tetraploid and the SHW-derived populations, and the number of polymorphic SNP markers detected on each chromosome ranged from 28 on 4B to 107 on 2B (Table 1). No polymorphic SNP markers were detected in the centromeric region of any chromosome except 3B (Fig. 2). The polymorphic SNP markers clustered in some genomic regions of the AB genome, e.g., the 80-140 Mb region on chromosome 1AL and the 190-210 Mb region on chromosome 2AS (Fig. 1A and 2). The genomic regions containing the majority of the polymorphic SNP loci covered a total physical length of about 2,320 Mb on the AB genome, with an average marker density of about 4 SNPs per 20 Mb (Fig. 2).
For the D genome of the second genetic population set, comprising the parental diploids and their SHWs, a total of 774 polymorphic SNP markers were finally retained for map construction and RF calculation, and the number of polymorphic SNP markers per chromosome ranged from 71 on 4D to 143 on 2D (Table 1). The number of polymorphic SNP markers also decreased linearly toward the centromeres (Fig. 1B, 2). Almost no polymorphic SNP markers were found around the centromeres in the second genetic population set, and the majority of them clustered in genomic regions away from the pericentromeric regions, covering about 1,520 Mb in total with an average marker density of more than 2 SNPs per 10 Mb (Fig. 2).

The level of genetic recombination among tetraploid genome after hexaploidization
The 904 polymorphic SNP markers retained for map construction and RF calculation covered total physical lengths of 4771.2 Mb and 4944.2 Mb on the A and B genomes, respectively. The physical length covered by polymorphic SNP markers on each chromosome ranged from 581.5 Mb on 1A to 813.7 Mb on 3B (Table 2). The genetic maps for both the tetraploid and hexaploid populations were constructed based on the physical positions of the 904 polymorphic SNP markers on the CS RefSeq v1.0 assembly (Fig. 2). For the tetraploid population, the total genetic map length of the AB genome was 3135.9 cM, and the length of each chromosome ranged from 131 cM for 1B to 537 cM for 4B (Fig. 3; Table 3). For the SHW-derived population, the total genetic map length of the AB genome was 2326.7 cM, and the length of each chromosome ranged from 109.5 cM for 1B to 394.5 cM for 4B (Fig. 3; Table 3). The map lengths of chromosomes 2B (393.0 cM) and 4B (537.9 cM) calculated with the tetraploid population were 112.7 cM and 143.3 cM longer than those calculated with the SHW-derived population (280.3 cM for 2B, 394.5 cM for 4B), respectively (Fig. 3; Table 3). The difference in genetic length of chromosomes 2B and 4B between the tetraploid and hexaploid populations was mostly caused by linkage gaps on their genetic maps. Furthermore, the chromosome 4B map constructed with the tetraploid population contained more linkage gaps than the one constructed with the hexaploid population (Fig. 3; Table 3).
For the tetraploid genome AABB, the average genetic distance between two adjacent and linked SNP loci on different chromosomes ranged from 1.13 cM on chromosome 6A to 4.33 cM on chromosome 3A in the tetraploid F2 population, and from 1.14 cM on 6A to 2.92 cM on 3A in the hexaploid population (Table 2).
The average genetic distance between two adjacent and linked SNP loci in the tetraploid genome increased by 0.01 cM on chromosome 6A, 0.10 cM on chromosome 3B and 0.49 cM on chromosome 6A after hexaploidization, while it decreased by 0.20-1.41 cM on the other chromosomes (Table 2). The difference in RF between SHW and tetraploid T. turgidum (TT), recorded as ΔRF SHW-TT, was mostly distributed from -0.05% to 0.03% across the whole AB genome (Fig. 4). However, when the tetraploid was compared with the hexaploid at the chromosome level, no significant difference was found in the average genetic distance between two adjacent and linked SNP loci, nor in the average RF between two adjacent and linked SNP loci on any chromosome (Table 2). This suggests that no remarkable change in RF occurred in most physical regions of the tetraploid genome after hexaploidization.
However, some genomic regions of the tetraploid genome showed larger RF changes after hexaploidization (Fig. 4). In the pericentromeric region of chromosome 4B, the RF between the physical positions of 137.6 Mb and 414.7 Mb increased by 13.4% (about 10-fold), from 1.6% in the tetraploid to 14.9% in the hexaploid (Fig. 4). In the centromeric region of 6B, although no genetic recombination occurred in the interval of 309.6-347.7 Mb in the tetraploid, the RF also increased by 13.4% in the hexaploid genetic background (Fig. 4). This suggests that hexaploidization might increase the RF in the pericentromeric regions of specific chromosomes. Nevertheless, the RF between the physical positions of 705.3 Mb and 729.4 Mb on chromosome 3A of tetraploid wheat decreased by 16.4% (to about 0.5-fold), from 33.3% to 16.9%, when introduced into the hexaploid genetic background. The RFs of the genomic regions close to the telomeres of chromosomes 5A and 5B decreased by >10% after hexaploidization (Fig. 4), and the RFs between the physical positions of 617.8 Mb and 696.9 Mb on chromosome 6BL and of 19.3-65.1 Mb on chromosome 7BS decreased by >15% (Fig. 4).

The level of genetic recombination among diploid genome after hexaploidization
The 774 polymorphic SNP markers used for map construction and RF calculation covered a physical length of about 3992.9 Mb in the Ae. tauschii genome, ranging from 493.8 Mb on 6D to 648.7 Mb on 2D (Table 4). No linkage gap was found on either the diploid or the hexaploid genetic map of any chromosome (Table 5; Fig. 5). The genetic map constructed with the diploid population was 3334.9 cM long in total, and the genetic map length of each chromosome ranged from 263.2 cM for 4D to 593.3 cM for 2D (Table 5; Fig. 5). The total length of the genetic map constructed with the hexaploid population was 7185.4 cM, more than 2-fold that of the diploid genetic map. The same held for the individual chromosome maps: the genetic map length of each chromosome ranged from 519.8 cM for 4D to 1279.4 cM for 7D in the hexaploid (Table 5; Fig. 5).
For the diploid Ae. tauschii (AT) population, the average genetic distance between two adjacent and linked SNP loci on each chromosome ranged from 3.76 cM on 4D to 5.39 cM on 1D, while it ranged from 7.83 cM on 4D to 13.67 cM on 1D in the SHW-derived population (Table 4). For each chromosome, the average genetic distance between two adjacent SNP loci in the hexaploid genetic background was significantly longer, by more than 2-fold, than that in the diploid, and the difference Δ SHW-AT between the hexaploid and diploid populations ranged from 4.07 cM on 4D to 8.28 cM on 1D (Table 4). The average RF between two adjacent and linked SNP loci in the hexaploid was also significantly higher, by more than 2-fold, than that in the diploid (Table 4). The average Δ SHW-AT of RF on each chromosome between the two ploidy levels ranged from 3.96% on 4D to 7.83% on 1D (Table 4). The ΔRF SHW-AT was mostly distributed from -5.0% to 17.0% along the whole D genome (Fig. 6), with a mean of 5.5% and a ratio of 2.3 (Table 4: RF SHWP/RF ATP).
These results suggest that hexaploidization significantly enhanced the RF of the ancestral diploid genome DD.
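The fold changes and differences quoted above can be re-derived from the reported values alone. The short sketch below uses only numbers stated in the text (from Tables 4 and 5); the function and variable names are illustrative and not part of the original analysis.

```python
# Values quoted in the text (Tables 4 and 5); names are illustrative only.
TOTAL_MAP_CM = {"diploid": 3334.9, "hexaploid": 7185.4}

# Average distance (cM) between adjacent linked SNP loci: (diploid, hexaploid).
ADJACENT_DISTANCE_CM = {"4D": (3.76, 7.83), "1D": (5.39, 13.67)}


def fold_change(diploid: float, hexaploid: float) -> float:
    """Ratio of the hexaploid value to the diploid value."""
    return hexaploid / diploid


def difference(diploid: float, hexaploid: float) -> float:
    """Hexaploid value minus diploid value (the 'Δ SHW-AT' of the text)."""
    return hexaploid - diploid


total_fold = fold_change(TOTAL_MAP_CM["diploid"], TOTAL_MAP_CM["hexaploid"])
print(f"total map length: {total_fold:.2f}-fold")

for chrom, (dip, hexa) in ADJACENT_DISTANCE_CM.items():
    print(
        f"{chrom}: difference = {difference(dip, hexa):.2f} cM, "
        f"fold = {fold_change(dip, hexa):.2f}"
    )
```

The differences reproduce the 4.07 cM (4D) and 8.28 cM (1D) reported in the text, and all three ratios exceed 2.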
In addition, 7 outliers were observed in some genomic regions, using a stem-and-leaf plot of the distribution of ΔRF SHW-AT. In the genomic region from 616.8 Mb to 618.1 Mb on chromosome 2DL, the RF increased by 18.0% (5-fold), from 4.5% in the diploid to 22.5% after hexaploidization (Fig. 6). The physical region from 369.3 Mb to 371.1 Mb on 3D showed an RF increase of 18.2% (about 10-fold) in the hexaploid (Fig. 6). The intervals from 561.6 Mb to 562.2 Mb on chromosome 5D and from 5.3 Mb to 7.2 Mb on chromosome 7D showed RF increases of 18.2% (about 6.5-fold) and 18.6% (about 3-fold) after hexaploidization, respectively (Fig. 6). However, three genomic regions showed RF decreases of ≥6.0% after hexaploidization. In the interval from 45.7 Mb to 58.3 Mb on chromosome 5DS, the RF decreased by 10.7%, and in the intervals of 17.6-20.8 Mb on 4D and 352.3-364.7 Mb on 5D the RF decreased by 6.0% and 6.5%, respectively (Fig. 6).

Discussion
The genetic maps constructed based on physical positions of SNPs
In this study, the SNPs used for constructing the genetic maps of the AB and D genomes were ordered according to their physical positions on the sequence assemblies of CS [65] and Ae. tauschii AL8/78 [23], respectively. For comparison, genetic maps were also constructed with the 'nearest neighbor & two-opt' (nnTwoOpt) method, which builds and then improves a marker tour in a manner similar to the Travelling Salesman Problem (TSP) [66]. For the AB genome, chromosomes without any linkage gaps, such as 1A, 1B, 2A, 5A, 5B, 6B, 7A and 7B, had almost the same genetic lengths whether their maps were self-organized by the nnTwoOpt method or aligned with the CS physical map (Table 3). On chromosomes 2B, 3A, 3B, 4B and 6A, some linkage gaps were found, mostly caused by the extremely uneven distribution of polymorphic SNPs. Each chromosome of the D genome carried many more polymorphic SNPs than the chromosomes of the AB genome, and only one linkage gap was found across the whole D genome, on the genetic map of chromosome 3D self-organized by the nnTwoOpt method (Table 5). Because only subtle differences in genetic length were observed between the self-organized genetic maps and those aligned with the physical maps (Tables 3 and 5), we used the physical maps of CS and AL8/78 to order SNPs on the AB and D genomes, respectively. With a physical map as the alignment reference, the changes in genetic length and RF of the diploid and tetraploid genomes after hexaploidization could be investigated directly and conveniently by comparing populations of different ploidy.
However, linkage gaps, where the genetic distance between two adjacent SNP loci was larger than 50 cM, were found on the genetic maps of some chromosomes of the tetraploid genome (Table 3). Double crossovers occur more often between two adjacent SNP loci if their genetic distance is over 50 cM. In this situation, the measured RF is much smaller than the true RF and does not reflect it accurately. For this reason, we only analyzed the genetic distance and RF between adjacent SNP loci that were also closely linked to each other.
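The underestimation caused by double crossovers can be illustrated with standard mapping functions. The sketch below uses the textbook Haldane function (which assumes no crossover interference) and the Kosambi function; the text does not state which mapping function was applied in this study, so both are shown purely as an illustration, and the function names are our own.

```python
import math


def haldane_rf(map_distance_cm: float) -> float:
    """Expected observed RF for a given map distance (Haldane, no interference)."""
    d = map_distance_cm / 100.0  # centimorgans -> Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def kosambi_rf(map_distance_cm: float) -> float:
    """Expected observed RF for a given map distance (Kosambi, with interference)."""
    d = map_distance_cm / 100.0
    return 0.5 * math.tanh(2.0 * d)


# Observed RF lags further behind map distance as the interval grows,
# because double crossovers restore the parental marker combination.
for cm in (1, 5, 10, 25, 50, 100):
    print(
        f"{cm:>3} cM  crossovers/meiosis = {cm / 100:.2f}  "
        f"Haldane RF = {haldane_rf(cm):.3f}  Kosambi RF = {kosambi_rf(cm):.3f}"
    )
```

Under either function the observed RF stays close to the map distance for tightly linked loci but falls well below it at 50 cM and beyond, which is consistent with restricting the comparison to closely linked adjacent loci.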

Enhanced genetic recombination of ancestral diploid genome after hexaploidization
At the chromosome level, we compared the average RF between adjacent SNP loci in the diploid and tetraploid populations with that in the SHW populations. A significant increase in RF after hexaploidization was observed only in the ancestral diploid genome DD, not in the ancestral tetraploid genome AABB. These results suggest that the extent of the RF increase in an ancestral genome depends on the polyploidization level (eventual ploidy/initial ploidy), as the increase in RF from diploid to hexaploid was much greater than that from tetraploid to hexaploid. Most reported studies have focused on the RF change from diploid to tetraploid (polyploidization level = 2). Leflon et al. [59] reported that the total genetic length of the A7 linkage group increased from 52 cM in Brassica rapa to 96 cM in B. napus after allopolyploidization from diploid to tetraploid. The meiotic RF increased from 15.4% in diploid Arabidopsis thaliana to 24.1% in allotetraploid A. suecica [60]. These data show that the RF of a diploid genome rises by less than 2-fold (Pecinka et al. [60]: 1.56) when the polyploidization level is 2. With these plant species it is difficult to investigate the RF change from diploid to hexaploid (polyploidization level = 3), as a trigenomic Brassica (AABBCC) is not known to exist in nature. Using wheat as a good model for studying polyploidy, we found that the RF of the diploid genome increased more than 2-fold (Table 4: about 2.3-fold on average) when its ploidy was raised from diploid to hexaploid. Furthermore, no significant change in RF was observed in the tetraploid genome after hexaploidization, for which the polyploidization level was 1.5 (eventual ploidy/initial ploidy = 6/4).
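The comparison in this paragraph pairs each polyploidization level with the corresponding fold change in recombination. A minimal sketch of that arithmetic follows; it uses only the values cited above, and the data layout and function names are illustrative assumptions.

```python
# (label, initial ploidy, eventual ploidy, value before, value after)
# All values are those quoted in the text from refs [59] and [60].
CASES = [
    ("Brassica A7 map length (cM) [59]", 2, 4, 52.0, 96.0),
    ("Arabidopsis meiotic RF (%) [60]", 2, 4, 15.4, 24.1),
]


def polyploidization_level(initial_ploidy: int, eventual_ploidy: int) -> float:
    """Eventual ploidy divided by initial ploidy, as defined in the text."""
    return eventual_ploidy / initial_ploidy


def fold_increase(before: float, after: float) -> float:
    """Ratio of the value after polyploidization to the value before."""
    return after / before


for label, p0, p1, before, after in CASES:
    print(
        f"{label}: level = {polyploidization_level(p0, p1):.1f}, "
        f"fold = {fold_increase(before, after):.2f}"
    )

# Wheat in this study: the fold change for the D genome is the reported
# average (Table 4), not recomputed here.
print(f"Wheat D genome: level = {polyploidization_level(2, 6):.1f}, fold = about 2.3")
print(f"Wheat AB genome: level = {polyploidization_level(4, 6):.1f}, no significant change")
```

Both diploid-to-tetraploid examples give a fold increase below 2, whereas the diploid-to-hexaploid transition in wheat exceeds 2.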
However, this dependence on polyploidization level might hold only for euploids and not for aneuploids. For example, the total genetic length of the B. rapa A7 linkage group in the allotriploid was 4-fold greater than in both the diploid and the tetraploid [59]. The possible reasons are as follows: (1) aneuploidy causes greater genome instability than polyploidy [2,67], and aneuploidy itself can be responsible for generating chromosomal instability [68,69]; (2) chromosomes that remain as univalents in the aneuploid could lead to a compensatory increase in crossover frequency among unaffected bivalents [59][70][71][72].
The RF of the diploid D genome was significantly enhanced after hexaploidization, but the genetic mechanism behind this is not yet clear. QTLs for crossover (CO) frequency have been detected in hexaploid wheat, and most of them are located on the AB genome. With 13 recombinant inbred mapping populations, Gardiner et al. [73] detected 5 QTLs for CO frequency on the AB genome of common wheat, located on 2A, 2B, 4B, 5A and 6A, respectively; Jordan et al. [54] detected 40 QTLs for total CO frequency by nested association mapping, most of which also mapped to the AB genome. These results imply that genetic factors determining CO frequency might be present in the ancestral genome AABB and might also contribute to the RF increase of the diploid genome DD after hexaploidization. However, the maximum phenotypic effect of the QTLs reported by Gardiner et al. [73] increased CO frequency by less than 15%, and all 40 QTLs detected by Jordan et al. [54] across the whole genome together affected 7.0% of the overall mean for total COs. The effect sizes of these QTLs are much smaller than the >200% increase in RF of the diploid genome caused by hexaploidization in this study. This suggests that the RF increase of the diploid D genome in our study was caused for the most part by hexaploidization itself, while the contribution of genetic factors provided by the AB genome was very minor.

Polyploidization enhancing variation and adaptive evolution of bread wheat
Allopolyploidy often accelerates evolution in wheat in two ways: (1) it triggers rapid genome changes through the instantaneous generation of a variety of cardinal genetic and epigenetic alterations, which generate heterosis between subgenomes in polyploid plants [1,2], and (2) the allopolyploid condition facilitates sporadic genomic changes that are not attainable at the diploid level and takes advantage of gene redundancy [5,8,9,61,62]. Hexaploid wheat benefits from heterosis between the tetraploid wheat and Ae. tauschii genomes. The genome sequence of Ae. tauschii and its whole-genome gene annotation reveal that the diploid progenitor of the hexaploid wheat D genome serves as a gene repertoire for modern wheat adaptation, providing possible resistance to diseases and pests, tolerance to environmental stresses and grain quality [22,23]. Importantly, these genes can be expressed normally in a hexaploid genetic background, as many related QTLs or genes have been mapped to the D genome of synthetic hexaploid wheat [24,26,27,28,29,30,31,32,36,37,41,42,74]. Moreover, analysis of the mRNA and small RNA transcriptomes of nascent hexaploid wheat also demonstrates the generation of heterosis in common wheat [4]. Our study in wheat suggests that the enhanced genetic recombination of the ancestral diploid genome caused by allopolyploidization could be regarded as another advantage, or a new way, of increasing the evolutionary potential of polyploids.
Through allopolyploidization, Ae. tauschii added its genome to that of tetraploid wheat and produced hexaploid wheat, the major type of cultivated wheat, which accounts for about 95% of world wheat production, while tetraploid wheat accounts for the other 5% [17]. This suggests that the added D genome made bread wheat more adaptable to changing environments, so that it spread more rapidly than tetraploid wheat. However, the D genome of the first bread wheats originated from only a small part of the Ae. tauschii population, and a few lucky individuals cannot have possessed all the advantages mentioned above. There must be other reasons underlying the more rapid spread of hexaploid common wheat compared with tetraploid wheat. Interestingly, our study shows that hexaploidization enhanced genetic recombination of the ancestral diploid genome DD in allohexaploid wheat. The RF throughout the whole D genome in SHW was more than 2-fold that in the diploid, which would have favored variation and adaptive evolution of bread wheat through intercrossing among the first hexaploid individuals, as more recombination events have the potential to substantially accelerate the development of new varieties by (1) allowing quick assembly of novel beneficial multiallelic complexes and (2) breaking the linkage among unfavorable genes and fixing desirable haplotypes in fewer generations [50]. This process is more efficient than in a diploid or tetraploid genetic background, as more recombination events occur in the hexaploid genetic background, with a higher probability of supplying more phenotypic variation to the selection pool for evolution.

Conclusions
Allopolyploidization enhances genetic recombination of the ancestral diploid genome and thereby increases the evolutionary potential of wheat, which helped wheat overcome the narrow origin of its D genome, spread quickly and become a major crop of the world. Using wheat as a good model for studying polyploidy, our study suggests that the enhanced genetic recombination of the ancestral diploid genome caused by hexaploidization could be regarded as another advantage, or a new way, of increasing the evolutionary potential of polyploids. More recombination events could generate more gene combinations or haplotypes for natural or artificial selection, resulting in accelerated adaptive evolution.

Plant materials
Two Ae. tauschii accessions, SQ665 and SQ783 (Ae. tauschii ssp. tauschii var. typica), and two T. turgidum cultivars, Yuanwang (T. turgidum conv. turgidum) and Langdon (T. turgidum conv. durum), were used to generate three SHWs: Langdon/SQ665 (LS665), Langdon/SQ783 (LS783) and Yuanwang/SQ783 (YS783). The seeds of the two Ae. tauschii accessions SQ665 and SQ783 were provided by Dr. A. Mujeeb-Kazi at the International Wheat and Maize Improvement Center (CIMMYT), Mexico, in 1995. The tetraploid wheat Yuanwang is a landrace of T. turgidum conv. turgidum collected by Dr. Wuyun Yang in Sichuan Province, China. The seeds of Langdon (T. turgidum conv. durum) were provided by the Triticeae Research Institute of Sichuan Agricultural University. In this study, the two tetraploid wheats were used as female parents in crosses with the two Ae. tauschii accessions. In the first selfing generation (S1) of LS665, LS783 and YS783, the majority of the offspring contained the euploid chromosome set (2n=42), as karyotyped by Zhu [75] using fluorescence in situ hybridization (FISH) with two repetitive DNA sequences, Oligo-pSc119.2 labeled with Alexa Fluor 488-5-dUTP and Oligo-pTa535 labeled with Texas Red-5-dCTP, as described by Tang et al. [76]. We then selected karyotyped S1 individuals carrying the complete A, B and D genomes with 42 chromosomes and created two hexaploid F2 mapping populations, LS783 x YS783 (population size: 193) and LS665 x LS783 (population size: 182), to evaluate the homologous recombination frequency (RF) of the AB and D genomes in the hexaploid genetic background. Meanwhile, the RFs of two other F2 populations, Ae. tauschii SQ665 x SQ783 (population size: 123) and T. turgidum Langdon x Yuanwang (population size: 192), were calculated as controls to investigate the RF changes of the diploid and tetraploid genomes after hexaploidization, respectively.

SNP genotyping
Among the four F2 genetic populations, the two populations Langdon x Yuanwang (A1A1B1B1 x A2A2B2B2) and LS783 x YS783 (A1A1B1B1DD x A2A2B2B2DD) were grouped into the first genetic population set to analyze the RF change of the ancestral tetraploid AB genome after hexaploidization. The other two F2 populations, SQ665 x SQ783 (D1D1 x D2D2) and LS665 x LS783 (AABBD1D1 x AABBD2D2), were grouped together for the RF analysis of the ancestral diploid D genome after hexaploidization. The 4 populations and their parents were planted in the field by single-seed precision sowing. For each plant, 50 mg of tissue was collected from 2-week-old seedlings, and DNA was extracted using the NuClean Plant Genomic DNA Kit (CWBio, Beijing, China). Eluted DNA was quantified using a Qubit 4 Fluorometer (Life Technologies Holdings Pte Ltd, Singapore) and then normalized to the concentration required for genotyping.
Genotyping of the first genetic population set and its parents was carried out on the Affymetrix platform with the Axiom Wheat Breeder's Genotyping Array, which carries 13947 SNP markers and was developed by China Golden Marker Biotech Co Ltd (Beijing, China). The fluorescence signals collected from the SNP array were processed and analyzed in the Affymetrix Axiom Analysis Suite version 4.0.1, using the functions apt-genotype-axiom for genotype calling, ps-metrics for generating various QC metrics and ps-classification for classifying SNPs. Among the 13947 SNP markers, a total of 9487 SNP labels fixed on the chip were designed to be distributed on the AB genomes, and SQ783 was used to remove SNP markers whose fluorescence signal was also detected on the D genome of Ae. tauschii.
Genotyping by sequencing (GBS) analysis of the second genetic population set and its parents was performed with DArT-Seq™ technology, which detects SNP and DArT markers by comparing the sequencing reads obtained with the RefSeq v1.0 assembly of CS [65]. This genotyping technology relies on a complexity reduction method to enrich genomic representations with single-copy sequences, followed by second-generation sequencing on a HiSeq2500 (Illumina, USA). About 100 μl of DNA sample at 50 ng μl⁻¹ was sent to Diversity Arrays Technology P/L (Bruce, Australia) for SNP and DArT analysis. Only SNP data were used for linkage map construction and RF calculation in this study, as RF calculated from co-dominant marker data is more accurate than that calculated from dominant marker data [77]. Among the 28331 SNP loci detected by comparison with the CS assembly, a total of 9043 SNPs were evenly collected and theoretically distributed throughout the D genome. To confirm their specificity to the diploid D genome, the durum wheat Langdon, the common AABB donor of both LS665 and LS783, was also genotyped, and any of the 9043 SNPs that were also detected in Langdon were removed.

Genetic maps and recombination frequency
For the AB genome of the first genetic population set, a total of 904 SNP markers (Table S1) were retained for linkage map construction and RF calculation. The rules used to filter the detected SNPs for further analysis were as follows: (1) the best Blastn hit with the top score was unique in the AB genome when the allele reference sequences of the SNP assay were used to query the CS RefSeq v1.0 assembly; (2) the retained SNP markers must have no genotype call (displayed as missing data) on the D genome of Ae. tauschii accession SQ783; (3) for each SNP site used, polymorphism must be detected between the parents of both the tetraploid and the hexaploid F2 populations; (4) the genotypes of all parents involved were homozygous at the usable SNP sites, and the tetraploid parents Langdon and Yuanwang had the same genotypes as their SHWs LS783 and YS783, respectively; (5) the SNP markers could detect both homozygous and heterozygous genotypes. For the D genome of the second genetic population set, a total of 744 polymorphic SNP markers (Table S2) were retained for linkage map construction and RF calculation, following the basic principles of the above-mentioned rules and using the Ae. tauschii AL8/78 genome sequence assembled by Luo et al. [23] and the corresponding parents.
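The five filtering rules for the AB genome can be expressed as a single predicate. The sketch below is only an illustration under stated assumptions: the record layout (`SnpRecord`), the genotype coding ("AA"/"BB" for homozygotes, "AB" for heterozygotes, `None` for a missing call) and the pooling of progeny calls into one list are hypothetical and do not describe the authors' actual pipeline.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

HOMOZYGOUS = {"AA", "BB"}   # assumed genotype coding
HETEROZYGOUS = "AB"


@dataclass
class SnpRecord:
    """Hypothetical per-marker summary used only for this illustration."""
    unique_ab_hit: bool                      # rule 1: unique best Blastn hit on AB genome
    sq783_call: Optional[str]                # rule 2: call in Ae. tauschii SQ783 (None = missing)
    langdon: Optional[str]                   # tetraploid parent
    yuanwang: Optional[str]                  # tetraploid parent
    ls783: Optional[str]                     # SHW derived from Langdon
    ys783: Optional[str]                     # SHW derived from Yuanwang
    progeny_calls: Sequence[Optional[str]]   # F2 genotype calls (pooled)


def passes_filters(snp: SnpRecord) -> bool:
    parents = (snp.langdon, snp.yuanwang, snp.ls783, snp.ys783)
    if not snp.unique_ab_hit:                           # rule 1
        return False
    if snp.sq783_call is not None:                      # rule 2
        return False
    if any(p not in HOMOZYGOUS for p in parents):       # rule 4, homozygous parents
        return False
    if snp.langdon != snp.ls783 or snp.yuanwang != snp.ys783:  # rule 4, parent matches its SHW
        return False
    if snp.langdon == snp.yuanwang or snp.ls783 == snp.ys783:  # rule 3, polymorphic in both crosses
        return False
    observed = {c for c in snp.progeny_calls if c is not None}
    # rule 5: marker resolves both homozygous and heterozygous genotypes
    return HETEROZYGOUS in observed and bool(observed & HOMOZYGOUS)


good = SnpRecord(True, None, "AA", "BB", "AA", "BB", ["AA", "AB", "BB", None])
d_signal = SnpRecord(True, "AA", "AA", "BB", "AA", "BB", ["AA", "AB", "BB"])
print(passes_filters(good), passes_filters(d_signal))
```

In this sketch a marker that also gives a call in SQ783 is rejected even if it satisfies every other rule, mirroring the use of SQ783 to exclude markers that detect the D genome.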
The QTL IciMapping Software version 3.2 [78,79] was used for genetic linkage map construction and for calculating the RF between two adjacent SNP loci. The RF was estimated by the maximum likelihood method implemented in the software [77]. To compare the RF between the two genetic populations of each set, the SNP markers used for genetic map construction were ordered according to their physical positions on the CS RefSeq v1.0 assembly for the AB genome and on the sequence assembly of Ae. tauschii AL8/78 for the D genome, and the RF between two adjacent and linked SNP loci on the physical map was then calculated using the 'algorithm by input' function of the software.
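To make the maximum likelihood step concrete, the sketch below shows a textbook EM estimator of the RF between two co-dominant markers in an F2 population. It is not the QTL IciMapping implementation; it assumes complete genotype data and no segregation distortion, and the function name and count grouping are our own. Two-locus genotypes are grouped by the number of recombinant gametes they must contain (0, 1 or 2); double heterozygotes are ambiguous (0 or 2) and are resolved in the E-step.

```python
def estimate_rf_f2(
    n_zero: float,        # AABB + aabb: no recombinant gamete
    n_one: float,         # AABb, AaBB, Aabb, aaBb: one recombinant gamete
    n_two: float,         # AAbb + aaBB: two recombinant gametes
    n_double_het: float,  # AaBb: either zero or two recombinant gametes
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> float:
    """EM maximum likelihood estimate of RF from F2 co-dominant genotype counts."""
    total = n_zero + n_one + n_two + n_double_het
    if total <= 0:
        raise ValueError("no genotyped individuals")
    r = 0.25  # starting value
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote carries two
        # recombinant gametes (repulsion phase) given the current r.
        p_two = r * r / ((1.0 - r) ** 2 + r * r)
        expected_recombinants = n_one + 2.0 * n_two + 2.0 * n_double_het * p_two
        # M-step: recombinant gametes divided by all gametes (two per plant).
        r_new = expected_recombinants / (2.0 * total)
        if abs(r_new - r) < tol:
            return r_new
        r = r_new
    return r


# Counts that match the expected class proportions for r = 0.1 in 1000 plants.
print(round(estimate_rf_f2(405, 180, 5, 410), 4))
```

With dominant markers several of these genotype classes cannot be distinguished, which is one way to see why co-dominant SNP data give a more accurate RF estimate.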

Declarations
Ethics approval and consent to participate
Not applicable.

Adherence to national and international regulations
Not applicable.

Consent for publication
Not applicable.

Availability of data and material
The datasets and materials used and analyzed during the current study are available from the corresponding author upon reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
This study was partially supported by National Natural