FISH karyotype polymorphism of T. araraticum and T. timopheevii
Probe Oligo-pTa535-1 generated signals predominantly on the At-genome chromosomes. However, several G-genome chromosomes showed different hybridization locations, and only a small number of signals were distributed on chromosomes 4G, 6G, and 7G. Oligo-pSc119.2-1 mainly hybridized with the G-genome chromosomes, with signals located in the interstitial and subterminal regions, which allowed the identification of 9 of the 14 chromosomes of T. timopheevii. A small number of signals were also present on 1At and 5At. Oligo-pTa71-2 mainly hybridized with the 6At and 6G chromosomes, with the signal on the short arm.
FISH signal polymorphisms were detected among the 50 accessions used in this study (Fig. 2 and Fig. 3). In T. timopheevii, there was no signal variation in the At genome but two signal variations in the G genome (Fig. 2). PI119442 had the most typical FISH karyotype among the 15 T. timopheevii accessions. All seven At-genome chromosomes were monomorphic and highly conserved across the 15 T. timopheevii accessions (Fig. 2a), whereas chromosomes 2G and 7G each had one kind of signal variation (Fig. 2b).
In contrast, T. araraticum had 14 kinds of FISH signal variations in the At genome (Fig. 3a) and 19 kinds in the G genome (Fig. 3b). In the At genome, chromosome 1At had four kinds of signal variations, showing the highest polymorphism. Chromosomes 4At and 5At each had three kinds of signal variations, showing strong signal polymorphism. Chromosome 2At had two kinds of signal variations. Chromosomes 3At and 7At each had one kind of signal variation, showing the lowest signal polymorphism. Notably, chromosome 6At was monomorphic. In the G genome, chromosome 2G had seven kinds of signal variations, showing the highest signal polymorphism. Chromosomes 6G and 7G each had four kinds of signal variations, showing strong signal polymorphism. Chromosomes 1G and 4G each had two kinds of signal variations, showing low signal polymorphism. Chromosomes 3G and 5G were both monomorphic.
Compared with the FISH signals of T. timopheevii PI119442, additional sets of red signals were observed on the short arms of 5At and 2G in T. araraticum AS273 (Fig. 4), and on the long and short arms of chromosomes 1At and 6G, respectively, in T. araraticum AS274 and PI427414 (Fig. 4).
FISH signal cluster analysis
Cluster analysis was performed on the physical locations of the FISH signals from the 50 accessions, using 10 different T. turgidum accessions as the outgroup and a similarity coefficient of 0.57 as the threshold (Fig. 5). The 60 accessions were divided into two groups based on their FISH signals: the 50 accessions with the AtAtGG genome clustered together, and the 10 accessions with the AABB genome clustered together. At a similarity coefficient of 0.89, the 50 AtAtGG accessions were classified into five clusters. Cluster 1 was mostly composed of T. timopheevii, and clusters 2, 3, 4, and 5 were mostly composed of T. araraticum. T. araraticum and T. timopheevii showed strong genetic differentiation, with T. araraticum showing strong genetic diversity. However, three T. araraticum accessions, AS273, AS274, and PI427414, were included in the T. timopheevii group, showing that these three accessions were closely related to this group. T. araraticum was concentrated in Iraq, and the geographical distributions of T. araraticum and T. timopheevii overlapped in Turkey (Fig. 6).
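The idea of grouping accessions by a similarity threshold can be illustrated with a minimal sketch. The text does not state which similarity coefficient or linkage method was used, so the Jaccard coefficient and the simple single-linkage grouping below are assumptions, and the accession names and 0/1 signal scores are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two equal-length 0/1 signal vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def group_by_threshold(signals, threshold):
    """Single-linkage grouping: an accession joins every cluster that holds
    at least one member with similarity >= threshold (assumed method)."""
    clusters = []
    for name in sorted(signals):
        linked = [
            c for c in clusters
            if any(jaccard(signals[name], signals[m]) >= threshold for m in c)
        ]
        merged = [name]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return sorted(clusters)


# Hypothetical presence (1) / absence (0) scores at six signal positions.
signals = {
    "acc_A": [1, 1, 0, 1, 0, 1],
    "acc_B": [1, 1, 0, 1, 1, 1],
    "acc_C": [0, 1, 1, 0, 1, 0],
}

print(group_by_threshold(signals, 0.57))
```

With these toy scores, acc_A and acc_B share a similarity of 0.8 and fall into one group at the 0.57 threshold, while acc_C stays separate. Raising the threshold splits groups further, which mirrors how the 0.89 cut yields more clusters than the 0.57 cut.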
Chromosome distribution of SNPs
Illumina HiSeq PE150 generated a total of 7,141,920 raw single-end sequence reads from 50 samples. G-genome SNPs from the T. timopheevii accessions were assumed to map to the B genome of the T. dicoccoides reference sequence, owing to the close similarity between these two genomes (Hyun et al. 2020). After quality filtering, 190,402 SNPs were identified in the At and G genomes (Table S4, https://doi.org/10.6084/m9.figshare.17032322.v1), including 104,180 SNPs from the At genome, 84,930 SNPs from the G genome, and 1,292 SNPs from unanchored scaffolds (Fig. 7a, Table S5). In the At genome, chromosomes 1At and 2At harbored the lowest (11,982) and highest (17,527) numbers of SNPs, respectively. In the G genome, chromosomes 4G and 5G had the lowest (7,748) and highest (14,637) numbers of SNPs, respectively (Fig. 7b). Across both genomes, the highest and lowest numbers of SNPs were therefore found on chromosomes 2At (17,527) and 4G (7,748), respectively (Fig. 7b). The ratio of SNP numbers in the At genome to the G genome was 1.23. Thus, the number of SNPs in the At genome exceeded that in the G genome.
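The totals and the At-to-G ratio reported above can be checked with a few lines of arithmetic. The counts are taken from the text; the variable names are illustrative only.

```python
# SNP counts as reported in the text.
snps = {"At": 104_180, "G": 84_930, "unanchored": 1_292}

total = sum(snps.values())
ratio = snps["At"] / snps["G"]

print(total)            # 190402
print(round(ratio, 2))  # 1.23
```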
Population structure
The resulting dataset of 190,402 high-quality SNPs from GBS was used for cluster analysis. T. araraticum and T. timopheevii were clearly divided into two subpopulations (Fig. 8). Subpopulation 1 (POP1) contained all T. araraticum accessions except AS273. Subpopulation 2 (POP2) contained all T. timopheevii accessions as well as one T. araraticum accession, AS273. Two T. timopheevii accessions from Greece, CItr15590 and CItr15205, were grouped together with PI94760 from Georgia, with bootstrap values reaching 100%. PI221421 from Serbia was grouped together with PI251018 from Hungary, with bootstrap values of 100%. Most T. timopheevii accessions were related to PI119442 from Turkey, with bootstrap values of 65%. T. araraticum accessions from Iraq were divided into six subgroups (Fig. 8). Two T. araraticum accessions from Azerbaijan, PI352265 and AS272, were grouped together with PI427339 from Iraq. Similarly, two T. araraticum accessions from Iran, PI427366 and PI427370, were grouped together with PI427363 from Iraq.
The cross-validation (CV) error was lowest at K = 2, where the CV curve showed an inflection point (Fig. 9). Clustering information for the 50 individuals is shown for K = 1 to 8 (Fig. 10). At K = 2, T. araraticum was first distinguished from T. timopheevii. The separation of these two subpopulations was also well supported by PCA (Fig. 11). At K = 3, T. araraticum was divided into two subgroups, one comprising the three taxa A2, A6, and A8, and the other comprising the three taxa A4, A7, and A10. Nearly 70% of the genetic information of A5 originated from the ancestors A4, A7, and A10. Almost half of the genetic information of AS273 (A9) came from the ancestor A11, and the remainder was derived from the ancestors A2, A6, and A8.
SNPs that differed completely between the 35 T. araraticum and 15 T. timopheevii accessions at the same physical location were screened, so that the allele at each location was common to all accessions within the respective population. A total of 4,139 such SNPs were identified (Table S6). These differential SNPs were widely distributed over all chromosomes (Fig. 12), with the largest numbers on chromosomes 2At, 3At, 5At, and 6At. One T. araraticum accession, AS273, which has an intermediate spike type, was grouped with T. timopheevii, consistent with the result of the FISH signal cluster analysis. Analysis of 3,034 specific SNPs from AS273 indicated that chromosomes 2At, 5At, 7At, 3G, 4G, and 6G were from T. araraticum and chromosomes 1At, 3At, 4At, 6At, 1G, 2G, 5G, and 7G were from T. timopheevii (Fig. 13).
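The screening criterion described above (every accession within a population carries the same allele, and the two populations carry different alleles) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the data layout, the function name `fixed_differences`, the sample names, and the toy genotypes are all hypothetical.

```python
def fixed_differences(genotypes, pop_a, pop_b):
    """Return positions where each population is monomorphic and the
    two populations carry different alleles.

    genotypes: dict mapping position -> {accession: allele or None}
    Positions with a missing call (None) in either population are skipped.
    """
    hits = []
    for pos, calls in genotypes.items():
        alleles_a = {calls.get(s) for s in pop_a}
        alleles_b = {calls.get(s) for s in pop_b}
        if None in alleles_a or None in alleles_b:
            continue
        if len(alleles_a) == 1 and len(alleles_b) == 1 and alleles_a != alleles_b:
            hits.append(pos)
    return hits


# Hypothetical toy genotypes: two accessions per population.
genotypes = {
    "chr2At:1050": {"ara1": "A", "ara2": "A", "tim1": "G", "tim2": "G"},   # fixed difference
    "chr3At:2210": {"ara1": "C", "ara2": "T", "tim1": "T", "tim2": "T"},   # polymorphic within one population
    "chr5At:870": {"ara1": "G", "ara2": "G", "tim1": "G", "tim2": "G"},    # same allele in both populations
    "chr6At:4400": {"ara1": "T", "ara2": "T", "tim1": "C", "tim2": None},  # missing call
}

print(fixed_differences(genotypes, ["ara1", "ara2"], ["tim1", "tim2"]))
```

Only the first position passes the filter in this toy data set. Skipping positions with missing calls is a conservative choice made for the sketch; the text does not say how missing data were handled.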
Genetic diversity
Genetic diversity analysis of the 50 accessions indicated that the mean GD and PIC were 0.30 and 0.26, with ranges of 0.1–0.5 and 0.1–0.4, respectively (Fig. 14a, b). The Ho values ranged from 0 to 1, with a value of 0.1 when the samples were genotyped on a large number of markers (Fig. 14c). The average MAF was 0.26 (Fig. 14d). Intra-population genetic diversity analysis showed that the mean Na and Ne values for the two subpopulations were 1.97 and 1.50, respectively (Table 1). The Ne value of Pop1 (1.54) was higher than that of Pop2 (1.46). The mean values of He, Ho, I, and uh were 0.32, 0.35, 0.49, and 0.33, respectively. As shown in Table 1, Pop1 (I = 0.53, He = 0.35, uh = 0.35) showed higher genetic diversity than Pop2 (I = 0.45, He = 0.29, uh = 0.31). In addition, AMOVA showed that variation within populations (97%) was much greater than variation among populations (3%) (Table 2). Notably, the Nm value was relatively high (8.39), suggesting high gene flow between subpopulations (Table 2). These findings demonstrate high genetic diversity within subpopulations but low genetic differentiation between subpopulations.
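The upper bounds of the reported GD and PIC ranges are what one would expect for biallelic SNP markers. The sketch below assumes the commonly used definitions (gene diversity as one minus the sum of squared allele frequencies, and PIC as gene diversity minus the sum over allele pairs of twice the squared product of their frequencies); the text does not state the exact formulas, so this is an illustration rather than the authors' computation.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Gene diversity: 1 - sum(p_i^2) over allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - cross


# A biallelic SNP with equal allele frequencies gives the maximum values.
print(gene_diversity([0.5, 0.5]))  # 0.5
print(pic([0.5, 0.5]))             # 0.375
```

Under these definitions a biallelic marker cannot exceed GD = 0.5 or PIC = 0.375, in line with the reported maxima of 0.5 and about 0.4.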
Table 1
Means of genetic parameters for each subpopulation of the 50 timopheevii accessions. Number of alleles (Na), number of effective alleles (Ne), Shannon’s index (I), observed heterozygosity (Ho), expected heterozygosity (He), and unbiased diversity index (uh).
| Population | Na | Ne | I | Ho | He | uh |
|---|---|---|---|---|---|---|
| Pop1 | 2.00 | 1.54 | 0.53 | 0.35 | 0.35 | 0.35 |
| Pop2 | 1.94 | 1.46 | 0.45 | 0.36 | 0.29 | 0.31 |
| Mean | 1.97 | 1.50 | 0.49 | 0.35 | 0.32 | 0.33 |
Table 2
Analysis of molecular variance using 7212 SNP markers of genetic differentiation among and within two subpopulations of the 50 timopheevii accessions.
| Source | df | SS | MS | Est. Var. | % | P value |
|---|---|---|---|---|---|---|
| Among Pops | 1 | 9167.11 | 9167.11 | 167.97 | 3% | 0.001 |
| Within Pops | 48 | 270708.25 | 5639.76 | 5639.76 | 97% | 0.001 |
| Total | 49 | 279875.36 | | 5807.73 | 100% | 0.001 |
| Nm | 8.39 | | | | | |
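The percentages and the Nm value in Table 2 are consistent with each other, as the short check below suggests. It assumes the widely used relationship Nm = 0.25 × (1 − F) / F, where F is the among-population share of the estimated variance; the text does not state which formula was applied, so this is a plausibility check rather than a reconstruction of the original analysis.

```python
# Estimated variance components from Table 2.
var_among = 167.97
var_within = 5639.76
var_total = var_among + var_within

# Among-population share of variance (the differentiation statistic F).
f_stat = var_among / var_total

# Assumed relationship between differentiation and gene flow.
nm = 0.25 * (1.0 - f_stat) / f_stat

print(round(var_total, 2))                         # 5807.73
print(round(100 * f_stat), round(100 * (1 - f_stat)))  # 3 97
print(round(nm, 2))                                # 8.39
```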