Morphological characterization of Tunisian durum wheat accessions
Phenotypic diversity and morphological characterization
The Shannon-Weaver index (H') revealed a high morphological diversity among durum wheat accessions (H' = 0.80) (Table 1). The most polymorphic characters were spike length (SL; H' = 0.98), grain size (GSz; H' = 0.94), grain shape (GSp; H' = 0.87), grain color (GC; H' = 0.86) and spike shape (SS; H' = 0.86), while the least polymorphic trait was spike color (SC; H' = 0.53).
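As a reading aid, the diversity index reported above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes H' is normalized by the logarithm of the number of phenotypic classes (so that it ranges from 0 to 1, as the reported values suggest), and the function name `shannon_weaver` and the toy class labels are hypothetical.

```python
import math
from collections import Counter


def shannon_weaver(classes, n_classes=None):
    """Normalized Shannon-Weaver index for one trait.

    classes   -- iterable of phenotypic class labels, one per accession
    n_classes -- number of possible classes for the trait (assumption:
                 defaults to the number of classes actually observed)
    """
    counts = Counter(classes)
    total = sum(counts.values())
    k = n_classes if n_classes is not None else len(counts)
    if total == 0 or k < 2:
        # A single class (or no data) carries no diversity.
        return 0.0
    # H = -sum(p_i * ln p_i) over the observed classes
    h = sum(-(c / total) * math.log(c / total) for c in counts.values())
    # Normalize by ln(k) so that an even distribution gives 1.0
    return h / math.log(k)


# Toy data: spike color scored for six hypothetical accessions
print(round(shannon_weaver(["white", "white", "red", "red", "black", "black"]), 2))
```

Under this assumption, a trait whose classes are equally frequent gives H' = 1, and a trait fixed for one class gives H' = 0, which matches the null values reported for homogeneous landraces.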
The 304 durum wheat accessions investigated in this study were grouped into 11 landraces, namely Azizi, Jneh Khotifa, Taganrog, Mekki, Richi, Souri, Roussia, Badri, Biskri, Biada and Mahmoudi, recorded in the catalog of durum wheat landraces cultivated in Tunisia. These landraces were characterized by 12 specific morphological traits, based on the International Plant Genetic Resources Institute (IPGRI) and International Union for the Protection of New Varieties of Plants (UPOV) descriptors (Table S1, Table S2). All 12 spike and grain characteristics were almost homogeneous among accessions of the same landrace. This was supported by the Shannon-Weaver index (H'), which was relatively low for each landrace, ranging from 0.00 (Badri and Jneh Khotifa) to 0.23 (Richi), with an overall mean of 0.11 (Table S3). For instance, Mahmoudi accessions had particularly large spikes with sub-pyramidal shape, very long awns and large grains, whereas spikes of Azizi accessions were rectangular and very flat. Biskri accessions had fusiform and large spikes. Spike color, length and shape varied among the studied accessions, from dark to light and from short to long spikes. For example, Badri spikes were very short and thick with a greyish color, whereas Biada spikes and awns were very light (white) in color. Souri and Roussia were both characterized by tight, red-colored spikes with a distinct spike shape, i.e., either rectangular (Souri) or cylindrical (Roussia). Souri and Roussia landraces were also characterized by a distinct orange-colored grain. Interestingly, Richi accessions showed a unique feathery spike, while Mekki accessions were characterized by short and dense spikes with parallel edges. Finally, Taganrog accessions were characterized by white-colored spikes with black stains, while Jneh Khotifa accessions showed very dark (black to purple), long and dense spikes and awns.
Principal component analysis (PCA)
PCA of 12 spike and grain morphological traits of 304 durum wheat accessions showed that PC1 and PC2 axes accounted for 25.73% and 22.34% of the total genetic variation in these traits, respectively (Figure 1). PC1 was mostly associated with SS, SL, number of spikelets per spike (NS), grain color (GC) and awn length (AL), whereas PC2 was mainly associated with GSp, GSz and grain number per spikelet (GN) (Figure 1a). The color-coding of accessions in the two-dimensional PCA plot (PC1 vs. PC2) showed a good correspondence between the morphological trait-based grouping and landrace denomination (Figure 1b), and accessions belonging to the same landrace were included in the same PCA subgroup. Biskri, Jneh Khotifa and Taganrog accessions grouped together, showed positive correlation with both PC1 and PC2 and shared similar spike characteristics, such as SL (mostly long spikes), high GN (>3), black awn color (AC) and AL longer than the spike. Azizi accessions were grouped into a distinct subgroup, mainly characterized by rectangular medium-sized spikes with a tan color. Mahmoudi accessions also formed a distinct subgroup, mainly characterized by unique pyramid-shaped spikes. Accessions of Souri and Roussia formed almost a single subgroup characterized by red-colored, loose and long spikes as well as red-colored glumes and awns. Badri and Mekki landraces formed distinct subgroups negatively correlated with PC1 and PC2, and both subgroups were mainly characterized by short spikes with a low to intermediate GN. Biada and Richi accessions were grouped mainly in the center of the plot and were particularly characterized by white-colored spikes, glumes and awns (Table S2). Overall, PC1 and PC2 could separate all landraces, based on 12 spike- and grain-related morphological traits; the only exceptions were the groups of Roussia and Souri landraces and Biskri, Jneh Khotifa and Taganrog landraces, which could not be distinguished based on SL and SC.
Thus, additional morphological traits, such as glume form, were considered to classify the latter landraces into distinct subgroups (Table S2).
Genetic diversity and population structure of Tunisian durum wheat accessions
Ten SSR markers were used in this study to analyze the genetic diversity and population structure of Tunisian durum wheat accessions. These SSR markers were mapped onto eight different chromosomes and therefore were considered largely independent (Table 2, Table S4). The percentage of missing data was low (<10%) for each locus. The 10 SSR markers amplified a total of 99 alleles, and 188 multilocus genotypes (MLGs) were identified among 302 accessions. The accumulation curve (Figure S1) showed that these SSR markers were able to reach the maximal range of differentiation among the MLGs. The number of different alleles per locus (Na) varied from 4 (Xgpw2103) to 16 (Xgwm413), with an average Na of 9.9 across all loci. Overall, the PIC value was 0.690. The highest PIC value was obtained for Xgwm413 (0.851), whereas the lowest PIC value was obtained for Xgpw2103 (0.448). Shannon’s information index (I) was the highest for Xgwm413 (2.182) and the lowest for Xgpw2103 (0.781). The fixation index (Fis) was approximately equal to 1 for each locus, except Xgwm495 (Fis = -0.373), for which a high PIC value was observed (0.659). Pairwise genetic differentiation (Fst) ranged from 0.201 (Xgwm495) to 0.688 (Xgpw7148).
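For readers unfamiliar with the per-locus statistics listed above, the following sketch computes the number of effective alleles (Ne), Shannon's information index (I) and the polymorphism information content (PIC) from the allele frequencies at one locus. It uses the textbook definitions (Ne = 1/Σp², I = -Σp ln p, and the PIC formula of Botstein et al.); the software and exact estimators used in the study are not stated in this section, so this is an illustration under that assumption. The function name `locus_stats` and the example frequencies are hypothetical.

```python
import math
from itertools import combinations


def locus_stats(freqs):
    """Return Ne, I and PIC for one locus given its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity: sum of squared allele frequencies
    hom = sum(p * p for p in freqs)
    # Number of effective alleles
    ne = 1.0 / hom
    # Shannon's information index (natural log, zero-frequency alleles skipped)
    shannon_i = sum(-p * math.log(p) for p in freqs if p > 0)
    # PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2
    pic = 1.0 - hom - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"Ne": ne, "I": shannon_i, "PIC": pic}


# Hypothetical locus with four alleles at unequal frequencies
print(locus_stats([0.4, 0.3, 0.2, 0.1]))
```

With these definitions, all three statistics increase with the number of alleles and with the evenness of their frequencies, which is consistent with Xgwm413 (16 alleles) ranking highest and Xgpw2103 (4 alleles) lowest for both PIC and I.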
Analysis of population structure and relationship with morphological characterization
We analyzed the population structure on 188 MLGs. The maximum likelihood (LnP(K)) and delta K (ΔK) methods indicated that the most likely number of genetic groups (K) was 11 (Figure 2a, b). The estimated genetic group membership coefficient of each accession at K = 11 is shown in the population structure plot (Figure 2c).
Overall, each genetic group corresponded to a single landrace. The genetic groups G2, G3, G4, G5, G7, G9, G10 and G11 corresponded to Jneh Khotifa, Taganrog, Mekki, Richi, Badri, Biskri, Biada and Mahmoudi landraces, respectively. Moreover, a significant correlation was detected between the genetic distance matrix and morphological distance matrix (P = 0.01; Rxy = 0.435). However, a discrepancy between the genetic grouping and the morphological characterization was observed for Azizi, Souri and Roussia landraces: Azizi accessions were split by STRUCTURE into two different genetic groups (G1 and G8), while Souri and Roussia accessions were grouped together into one genetic group (G6), despite their distinct morphological characteristics.
A total of 41 admixed individuals were observed in the collection. Admixture occurred mainly between G6 (Roussia and Souri) and G10 (Biada) (representing 14.6% of the admixed genotypes), followed by G1 (Azizi) and G9 (Biskri) (representing 9.7%). Mahmoudi (G11), Biskri (G9) and admixed genotypes were the most frequent (representing 23.8%, 12.2% and 14% of the entire collection, respectively), followed by Azizi (G1), Taganrog (G3), Mekki (G4), Badri (G7) and Biada (G10) (each accounting for approximately 8% of the entire collection). However, Jneh Khotifa (G2), Richi (G5), Roussia and Souri (G6) and Azizi (G8) were the least frequent, each accounting for 3% of the entire collection.
Analysis of diversity indices and molecular variance
The 11 groups identified by STRUCTURE analysis presented different levels of genetic diversity (Table 3). Group G6 showed the highest level of genetic diversity, while G7 showed the lowest. The number of effective alleles per locus (Ne) ranged from 1.152 (G7) to 2.379 (G6). Genetic groups with the highest proportion of distinct MLGs were G6 (100% of different MLGs), G8 (90%) and G3 (85.7%), while G7 and G11 had the lowest proportions (27.2% and 34.7%, respectively). The percentage of polymorphism (P) ranged from 40% (G7) to 100% (G6 and G8). Shannon's information index (I) varied from 0.166 (G7) to 0.937 (G6), with an average value of 0.620 across all accessions. In addition, G6 and G8 showed the highest number of private alleles (G6, PA = 7; G8, PA = 4), while G2 and G7 contained no private alleles (PA = 0) (Table S5). Groups G10 and G4 contained two diagnostic alleles (DA) each, while G3, G5 and G7 contained one DA each, with frequency > 70%. The fixation index (Fis) ranged from 0.698 (G4) to 1.0 (G7), where observed heterozygosity (Ho) was 0.100 and null, respectively. Furthermore, analysis of variance (ANOVA) showed that 59% of the total genetic diversity was observed between distinct genetic groups, while 41% of the genetic diversity was explained by differences within each group (Table 4).
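The link between the fixation index and heterozygosity mentioned above can be made explicit with a short sketch. It assumes the usual definition Fis = 1 - Ho/He, where He is the expected heterozygosity; the estimator actually used in the study (for example, averaging over loci) is not given in this section, and the function name `fixation_index` and the example inputs are hypothetical.

```python
def fixation_index(ho, he):
    """Fixation index Fis = 1 - Ho / He.

    ho -- observed heterozygosity
    he -- expected heterozygosity under Hardy-Weinberg proportions
    """
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he


# No observed heterozygotes gives complete fixation (Fis = 1)
print(fixation_index(0.0, 0.5))
# More heterozygotes than expected gives a negative Fis
print(fixation_index(0.6, 0.4))
```

Under this definition, a null Ho necessarily yields Fis = 1, as reported for G7, and a negative value such as the one reported for Xgwm495 indicates an excess of heterozygotes at that locus.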
Minimum spanning network (MSN) analysis
The genetic relatedness between genotypes was tested using MSN analysis, based on Bruvo’s distance. MSN separated all accessions into two main clusters (Figure 3). Cluster C1 contained accessions belonging to Azizi (G1 and G8), Jneh Khotifa (G2), Richi (G5), Souri and Roussia (G6), Badri (G7) and Biskri (G9) landraces, while cluster C2 contained accessions belonging to Taganrog (G3), Mekki (G4), Biada (G10) and Mahmoudi (G11) landraces. In addition, the pairwise Nei’s genetic distances calculated between the 11 genetic groups were consistent with the results of MSN analysis (Table S6). The highest Nei’s genetic distance was recorded between G10 and G5 (2.416), followed by that between G10 and G7 (2.319). The lowest genetic distances were recorded between G1 and G8 (0.421), followed by those between G3 and G11 and between G3 and G4 (0.630 each), indicating that G1 and G8, as well as G3, G4 and G11, were genetically the most closely related groups. In addition, a morphological comparison between the network groupings revealed a significant difference (P < 0.05) between C1 and C2 in terms of SS, SL, AL, GC, GSp, NS, AC and glume color (GlC) (Table 5). Cluster C1 showed higher gene diversity (He = 0.740) and phenotypic diversity (H' = 0.77) than cluster C2 (He = 0.425, H' = 0.61) (Table S7 and S8). The values of SS and SL were higher in cluster C1 than in cluster C2, whereas C2 showed significantly higher AL and GSz than cluster C1 (Table 5).
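To show how a pairwise distance of the kind reported in Table S6 is obtained, the sketch below implements Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)), from allele frequencies. Whether the study used this exact (uncorrected) form is an assumption, and the function name `nei_distance`, the data layout (one dict of allele frequencies per locus) and the example groups are hypothetical.

```python
import math


def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two groups.

    pop_x, pop_y -- lists with one dict per locus, mapping allele -> frequency
    """
    if len(pop_x) != len(pop_y):
        raise ValueError("both groups must be scored at the same loci")
    jx = jy = jxy = 0.0
    for locus_x, locus_y in zip(pop_x, pop_y):
        jx += sum(p * p for p in locus_x.values())
        jy += sum(q * q for q in locus_y.values())
        # Only alleles present in both groups contribute to Jxy
        jxy += sum(p * locus_y.get(a, 0.0) for a, p in locus_x.items())
    if jxy == 0:
        # No shared alleles at any locus: the distance is unbounded
        return math.inf
    # Sums are used instead of means over loci; the ratio is unchanged
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Two hypothetical groups scored at two loci
g_a = [{"a1": 1.0}, {"b1": 0.5, "b2": 0.5}]
g_b = [{"a1": 0.5, "a2": 0.5}, {"b1": 0.5, "b2": 0.5}]
print(round(nei_distance(g_a, g_b), 3))
```

Groups that share most alleles at similar frequencies give distances near zero, whereas groups fixed for different alleles at most loci give large values, which is the pattern expected for the most distant pairs listed above.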
Diversity analysis of Tunisian durum wheat accessions based on regions and climate stages
Analysis of morphological diversity among different regions and climate stages
The Shannon-Weaver index (H') of 12 spike- and grain-related traits was compared among five regions (Sousse, Mahdia, Kairouan, Gabes and Medenine) and three different climate stages (low semi-arid, high-arid and mid-arid) (Table 1). Among all five regions, Kairouan showed the highest diversity index (H' = 0.74), followed by Medenine (H' = 0.66). Sousse showed a null diversity index, indicating no phenotypic variability between accessions in this region; notably, Richi was the only landrace identified in this region. The most polymorphic characteristics by region were SL (H' = 0.69), GSp (H' = 0.65), GC (H' = 0.62) and NS (H' = 0.61). Among all three climate stages, the high-arid climate (represented by Kairouan) showed the highest diversity index (H' = 0.74), whereas the low semi-arid climate (represented by Mahdia and Sousse) showed the lowest diversity index (H' = 0.59). The most polymorphic characters by climate stage were AL (H' = 0.90), GSp (H' = 0.82), GC (H' = 0.79), and NS (H' = 0.73).
The polymorphism level of some morphological characteristics differed distinctly among regions, excluding Sousse where an overall homogeneity of morphological traits was recorded. The H' value of AC varied among regions from 0.12 (Kairouan) to 0.73 (Mahdia). Similarly, the H' value of SL was the highest in Mahdia (0.99) and lowest in Gabes (0.49). Values of spike color (SC) and glume color (GlC) indices were the highest in Medenine (0.53) and Kairouan (0.97), respectively, and lowest in Mahdia (0.00 and 0.48, respectively). Morphological traits were also variable from one climate stage to another. Values of SL and glume color were the highest in high-arid climate (0.48 and 0.96, respectively) and lowest in low semi-arid climate (0.0 and 0.41, respectively). By contrast, AC was the lowest in high-arid climate (0.12) and the highest in mid-arid climate (0.71). However, no variation was observed among regions for GC and among climates for GN.
In addition, a dominant phenotypic class of some morphological traits was observed among regions (within more than 70% of accessions), except Sousse, which did not show any variation in morphological traits (Table S9). In Gabes, the majority of accessions showed long spikes (SL > 9 cm; 84%), with light color (92%) and cylindrical shape (79%), awns shorter than the spike (84%), moderately elongated grain shape (82%), small grains (GSz < 0.3 cm) (82%) and an intermediate number of grains per spikelet (GN = 2–3; 79%), whereas accessions with medium length spikes (SL: 6–9 cm) were dominant in Medenine (73%). In Mahdia, the majority of accessions showed spikes with AL equal to the spike (72%) and small GSz (78%). However, most of the accessions in Kairouan had spikes with AL longer than the spike (72%). Among different climate stages, the mid-arid was dominated by accessions with small grains (GSz < 0.3 cm; 72%), whereas the high-arid climate stage was rich in accessions with dark colored spikes (72%) and black awns (96%). No particular phenotypic class was observed within the low semi-arid climate (Table S9).
Analysis of genetic diversity among different regions and climate stages
The results of ANOVA showed that 19% and 10% of the total genetic diversity was observed among regions and among climate stages, respectively, while 81% and 90% of the genetic variability was explained by differences within regions and within climate stages, respectively (Table 4).
Among regions, Ne ranged from 1.366 (Sousse) to 3.031 (Gabes) (Table 3). Overall, Sousse showed the lowest genetic diversity indices and Gabes the highest: the number of MLGs was the highest in Gabes (31) and lowest in Sousse and Medenine (7), and Shannon's information index was also the highest in Gabes (I = 1.296) and lowest in Sousse (I = 0.305). The percentage of polymorphic loci (P) was 100% for all regions, except Sousse (50%). The number of private alleles was likewise the highest in Gabes (PA = 17) and lowest in Sousse and Medenine (PA = 1). The value of Fis was greater than 0.800 in each region, except Sousse (Fis = 0.691). Interestingly, the number of DAs and the observed heterozygosity were the highest in Sousse: three diagnostic alleles (frequency > 70%) were registered in Sousse (Ho = 0.100), whereas only one such allele was identified in Gabes.
Analysis of SSR data obtained from different climate stages revealed that the mid-arid climate was the most diverse, with the highest number of effective alleles (Ne = 3.174), the highest Shannon's information index (I = 1.318) and the highest number of private alleles (PA = 19). By contrast, the high-arid climate stage showed the lowest number of effective alleles (Ne = 2.707), the lowest Shannon's information index (I = 1.050) and the lowest number of private alleles (PA = 2). However, the value of Fis was similar (>0.800) among all studied climate stages (Table 3).
Correlation between genetic distance and geographic distance
The Mantel test showed a significant correlation (P = 0.010; Rxy = 0.286) between genetic and geographic distances among durum wheat accessions, suggesting that accessions growing in close geographical proximity were genetically related. Azizi and Mahmoudi landraces showed the most widespread geographical distribution, occurring throughout central and southern Tunisia, except Sousse, and across all climate stages. However, Azizi was more frequent in Gabes (25 accessions out of 38), while Mahmoudi was mostly found in Medenine (13 accessions out of 22) and Mahdia (11 accessions out of 27) (Figure 4). In addition, all G5 genotypes, corresponding to the Richi landrace, were found in Sousse; all G7 and G2 genotypes, corresponding to Badri and Jneh Khotifa landraces, respectively, were found in Kairouan; and the landrace Taganrog, representative of the genetic group G3, was exclusively found in Mahdia.
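The logic of the Mantel test reported above can be sketched as follows. This is a generic, one-sided permutation version written with the Python standard library only; the software, number of permutations and sidedness used in the study are not stated here, so those choices (999 permutations, a fixed seed, `>=` comparison) are assumptions, and the names `mantel`, `_upper` and `_pearson` are hypothetical.

```python
import random
from statistics import mean


def _upper(mat, order):
    """Upper-triangle entries of a square matrix after reordering rows/columns."""
    n = len(order)
    return [mat[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(gen, geo, permutations=999, seed=0):
    """Return (Rxy, P) for two symmetric distance matrices of equal size."""
    identity = list(range(len(gen)))
    x = _upper(gen, identity)
    r_obs = _pearson(x, _upper(geo, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        # Permute rows and columns of one matrix together, then recompute r
        if _pearson(x, _upper(geo, perm)) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: four sites on a line, genetic distance proportional to geography
sites = [0.0, 1.0, 3.0, 7.0]
geo = [[abs(a - b) for b in sites] for a in sites]
gen = [[0.1 * d for d in row] for row in geo]
print(mantel(gen, geo))
```

Because the entries of a distance matrix are not independent, significance is assessed by permuting whole rows and columns rather than individual cells; this is why a moderate coefficient such as Rxy = 0.286 can still be significant with a large number of accessions.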
Furthermore, we compared morphological traits between Azizi and Mahmoudi accessions collected from central and southern Tunisia. None of the traits showed significant differences (P > 0.05), except for spike density (SD), which differed significantly within Mahmoudi (P = 0.00). Mahmoudi accessions collected from central Tunisia had compact spikes (SD = 7), whereas those collected from southern Tunisia were characterized by loose spikes (SD = 5) (Table 5).