Various primers were used to amplify and sequence distinct genomic regions of three chosen Allium cultivars. The number of base pairs in each sequenced region slightly differed among the three cultivars, with the trnL region exhibiting shorter lengths than the other barcoding regions analysed in this study (Table 2).
Table 2
Lengths of the sequenced genome regions of three selected Allium sativum cultivars for different molecular markers used in this study (number in base pairs).
Garlic Cultivar
|
Internal Transcribed Spacer Region
|
Barcoding Regions
|
ITS
|
matK
|
trnH-psbA
|
rbcL
|
trnL
|
BDUT 1450
|
645
|
851
|
650
|
537
|
287
|
BDUT 1451
|
651
|
853
|
646
|
540
|
291
|
BDUT 1452
|
630
|
851
|
640
|
547
|
259
|
The substitution pattern and rates for the ITS sequence were estimated using the Tamura and Nei (1993) model with a discrete gamma distribution to account for evolutionary rate differences among sites (5 categories, [+ G]). In the ITS region analysis, the shape parameter for the discrete Gamma distribution was estimated to be 200.0000. The mean evolutionary rates in these categories were 0.90, 0.96, 1.00, 1.04, and 1.10 substitutions per site, suggesting moderate evolution of these cultivars. The maximum log likelihood for this computation was − 2594.439, indicating a good fit with the Tamura–Nei model, significant heterogeneity, and a moderate evolutionary rate. The nucleotide frequencies of the ITS region observed in this Tamura–Nei model were A = 22.54%, T = 30.58%, C = 20.37%, and G = 26.51%. The transition/transformation bias (R) for the ITS region was 1.09, indicating a bias higher than the expected level. Additionally, when the Kimura (1980) 2-parameter model was used for substitution patterns and rates, the nucleotide frequencies were A = 25.00%, T = 25.00%, C = 25.00%, and G = 25.00%. For estimating maximum likelihood (ML) values, a tree topology was automatically computed, and the maximum log likelihood for this Kimura 2-parameter computation was − 2617.110.
Both models exhibit a typical ITS region with AT richness, which might contribute to the higher R value than expected. When the phenetic neighbor–joining (NJ) method was applied to the ITS sequence, the phylogeny of the three cultivars resulted in a divergence of BDUT 1451 from that of the other two cultivars, BDUT 1452 and BDUT 1450, which were found to be in a separate cluster. In contrast, according to the cladistic MP method, BDUT 1450 and BDUT 1452 formed a single cluster, while BDUT 1451 diverged into a separate cluster (Figs. 1 and 2). The discrepancy in clustering patterns could conceivably be attributed to the high rate of heterogeneity, where the NJ method relies on overall sequence similarity, whereas the MP method focuses on shared derived characters. Interestingly, both methods consistently placed BDUT 1451 in a separate cluster, suggesting intraspecific divergence.
The phenetic neighbor-joining (NJ) method of ITS sequence analysis revealed that BDUT 1451 was closely linked to A. sativum gi228015332, and BDUT 1452 diverged from this cluster. BDUT 1450 was closely related to a cluster of A. ampeloprasum var. harmone gi338191565 and A. ampeloprasum var. harmone gi338191566. According to the cladistic method of ITS, BDUT 1450 and BDUT 1452 formed a single clade, whereas BDUT 1451 was closely linked to A. sativum gi228015332. The overall pairwise distance between them was 6.153. A dendrogram created by Baghalian et al. (2006) using 24 ecotypes revealed five main groups. Thirteen of the 24 ecotypes were included in cluster group A, two ecotypes in group B and seven ecotypes in group C. They found that there was no relationship between genetic divergence and geographical origin in the post planting stage, as genotypes from the same origin entered into different clusters, and genotypes from different origins also entered the same cluster. In the study by Figliuolo et al. (2001), molecular analysis confirmed the differentiation of A. sativum and A. ampeloprasum in the sequence cluster. The A. ampeloprasum cluster diverged from A. sativum through the use of single-nucleotide sequence variation in both the ITS1 locus and the RAPD locus. Sequence analysis of ITS1 DNA revealed detailed variation within the A. ampeloprasum complex and monomorphism within A. sativum in the study by Figliuolo and Di Stefano (2007). Several other studies have revealed intricate phylogenetic relationships, and the whole-genome sequence of Allium was generated through ITS analysis (Ipek et al. 2008, Hirschegger et al. 2010 and Nguyen et al. 2008).
One striking characteristic of the ITS data is the strangely large intrageneric genetic distances within Allium species. Kimura distances greater than 40% were found in the Friesen et al. (2000) study and in the Dubouzet and Shinoda (1999) study; Kimura distances often characterize the most distant genera within subfamilies or even families (Baldwin et al. 1995; Blattner and Kadereit 1999; Hsiao et al. 1999; Noyes and Rieseberg 1999). Intrageneric distances in other plant families are mostly less than 10% (Baldwin et al. 1995). These findings make Allium species either extraordinarily rapidly evolving taxa or ancient in origin, and molecular evolution is not accompanied by the increase in comparable numbers of taxonomic categories. Moreover, a phylogenetic plot showed that all three cultivars belonging to the same species fell into distinct clusters, signalling their genetic divergence. Fredotović et al. (2014) performed phylogenetic analyses of the ITS1-5.8S-ITS2 region of 35S rDNA and the non-transcribed spacer (NTS) region of 5S rDNA of A. xcornutum and its relatives to the section A. cepa.
After the ITS region was sequenced, the phylogenetic relationships of the barcoding gene sequences, such as matK, rbcL, trnH-psbA and trnL, were determined via cladistic methods. In the analysis of the matK region, the shape parameter for the discrete Gamma distribution was estimated to be 0.0500. The mean evolutionary rates in these categories were 0.00, 0.00, 0.00, 0.03, and 4.97 substitutions per site. The fifth category stands out as remarkably divergent compared with the others. These results imply the presence of a conserved matK region with a low rate of evolution and significant intraspecific divergence. The nucleotide frequencies observed were A = 32.28%, T = 37.92%, C = 15.24%, and G = 14.57%, which confirmed the characteristic AT-rich chloroplast DNA. The maximum log likelihood for this computation was − 1210.897. Additionally, the estimated transition/transformation bias (R) is 0.11, and the maximum log likelihood for this computation was − 1287.698.
According to the phenetic neighbor–joining (NJ) method of phylogenetic analysis for matK, all three cultivars formed a single cluster, with BDUT 1450 and 1451 closely linked. According to the cladistic MP method, BDUT 1450 and 1452 were found in a single cluster that diverged from BDUT 1451 (Figs. 3 and 4). According to the NJ method of matK analysis performed in this study, all three cultivars were found in a single clade, and the overall pairwise distance between them was 0.012. Both phenetic and cladistic methods concur that BDUT 1451 diverged from BDUT 1450 and 1452. The matK region had a high evolutionary rate (4.97), suggesting the potential accumulation of mutations, possibly driven by neutral evolutionary processes, adaptive evolution, or founder effects, in BDUT 1451. Conversely, BDUT 1450 and 1452 are closely related according to cladistic analysis, consistent with the overall low evolutionary rates observed in the matK region. Overall, the analysis of the matK region revealed high conservation with subtle divergence among the three cultivars.
The estimated shape parameter of the rbcL region was 28.0140 for the discrete gamma distribution, suggesting that the heterogeneity in the rbcL region was significantly greater than that in the matK region. The mean evolutionary rates in these categories were 0.75, 0.89, 0.99, 1.09, and 1.28 substitutions per site, implying a moderate evolutionary rate with high heterogeneity. The nucleotide frequencies were A = 30.11%, T = 28.62%, C = 21.73%, and G = 19.55%, which indicated slight GC bias, contrasting with the AT bias observed in the matK analysis. The maximum log likelihood for this computation was − 2070.782. The estimated transition/transformation bias (R) is 0.41. The maximum log likelihood for this computation was − 2093.020. Analysis of the rbcL region of the tested cultivars revealed that BDUT 1451 and BDUT 1452 were closely related to each other and formed a single cluster according to both methods of phylogenetic analysis. BDUT 1450 was found in a separate cluster (Figs. 5 and 6). Both the phenetic and cladistic methods for rbcL placed BDUT 1450 in a separate cluster, while BDUT 1451 and 1452 formed a closer cluster, aligning with the matK analysis. However, the rbcL analysis revealed closer relationships between BDUT 1451 and 1452 than between BDUT 1451 and matK, potentially indicating incomplete lineage sorting in the rbcL gene, where ancestral polymorphisms have not yet been sorted across lineages.
According to the trnH-psbA region analysis, the estimated value of the shape parameter for the discrete Gamma distribution is 2.6231. The mean evolutionary rates in these categories were 0.32, 0.61, 0.88, 1.22, and 1.97 substitutions per site. These results suggest moderate rate variation across sites in the trnH-psbA region. In addition, these findings indicate a mix of conserved and faster-evolving regions compared to the matK and rbcL barcode regions. The nucleotide frequencies were A = 31.48%, T = 33.73%, C = 17.95%, and G = 16.84%, which indicated a rich AT-bias similar to that of matK, which features characteristic chloroplast DNA. The maximum log likelihood for this computation was − 8919.082. The estimated R bias is 0.68, which is comparatively greater than that of matK and rbcL, potentially reflecting differences in structural constraints in this region. The maximum log likelihood for R bias computation was − 9236.580. In the phenetic NJ and cladistic MP methods of phylogenetic analysis of trnH-psbA, BDUT 1451 and BDUT 1452 were linked in a single cluster, and this cluster diverged from the cultivar BDUT 1450 (Figs. 7 and 8). The overall pairwise distance among the cultivars was 1.289 for the trnH–pspA sequence. This strengthens the confidence in the observed divergence and suggests a fundamental split between BDUT 1450 and the other two cultivars. The closer relationship between BDUT 1451 and BDUT 1452 in rbcL and trnH-psbA, compared to the more distinct separation in matK, suggested potential incomplete lineage sorting (ILS) in these regions.
According to the results of the trnL sequence analysis, the estimated shape parameter for the discrete Gamma distribution was 28.2892, similar to that of rbcL, indicating considerable rate of variation across sites. Similarly, the mean evolutionary rates in these categories resembled those of the rbcL region and were 0.75, 0.89, 0.99, 1.09, and 1.28 substitutions per site, suggesting comparable evolutionary dynamics in these regions. The nucleotide frequencies were A = 37.97%, T = 27.16%, C = 15.57%, and G = 19.31%, similar to those of matK and trnH-psbA. The maximum log likelihood for this computation was − 1030.978. The estimated transition/transformation bias (R) is 0.53. The maximum log likelihood for transition/transformation computation was − 1074.046. The trnL sequence analysis showed that BDUT 1450 and 1452 formed a single cluster, and BDUT 1451 was found in a separate cluster via the phenetic neighbor–joining (NJ) method and the cladistic difference–platelet method. Similarly, BDUT 1452 diverged from the BDUT 1451 cluster, and BDUT 1450 diverged from BDUT 1452 (Figs. 9 and 10). The overall pairwise distance between them was 1.845. Thus, trnL uniquely clusters BDUT 1450 and 1452 together, contrasting with the groupings observed for matK, rbcL, and trnH-psbA.
Species discrimination was successful in an earlier study with 72% of the cases, with the remaining species being matched to groups of congeneric species with 100% success using rbcL and matK. According to the trnH-psbA and rbcL sequence analyses, BDUT 1451 and BDUT 1452 were closely related to each other. BDUT 1450 evolved from a separate clade with A. textile as a closely related species according to the cladistic method and diverged from a cluster with A. textile JX848396 and A. sikkimense isolate LHXJ0489 according to the phenetic method of rbcL analysis. According to the phenetic method, for trnL, BDUT 1452 and BDUT 1450 were found in a single cluster, which showed that they were closely related, and BDUT 1451 was found in a separate clade related to A. linearifolium JF262666; therefore, these genes diverged from a large clade with different Allium species.
The ISSR primers used in this study produced a total of 91 scorable bands with sizes ranging from ~ 250 bp to 1500 bp (Figs. 11–13). The total number of bands for each primer ranged from zero to 21, with an average of 13 (Table 3). The total number of polymorphic alleles and the percentage of polymorphisms were 11 and 21.09%, respectively. This implies that low genetic diversity and limited polymorphism occur among the three cultivars. The ISSR9 and ISHY4 primers did not produce any scorable fragments in BDUT 1450, and ISSRa was observed to be an inefficient primer for amplifying all three garlic clones. ISHY 1b and ISHY2 produced the maximum number of amplified fragments (21 bands).
Table 3
Characteristics of the ISSR primers
Primer
|
Primer sequences 5’-3’
|
Annealing
Temperature (°C)
|
AF
|
PF
|
% P
|
ISSR8US
|
G(AACA)4
|
50
|
14
|
4
|
28.57
|
ISSR9
|
HVH(TC)8
|
50
|
11
|
3
|
27.27
|
ISHY 1b
|
(GA)8YT
|
39
|
21
|
3
|
14.28
|
ISHY 2
|
(AC)8YG
|
43
|
21
|
4
|
19.0
|
ISHY 3
|
(AG)8YT
|
39
|
19
|
3
|
15.79
|
ISHY 4
|
(CA)8RG
|
44
|
5
|
1
|
20.00
|
ISSR a
|
GCV(TG)8
|
51
|
-
|
-
|
-
|
V = G; A; C H = A; C D + G; A; T B = G; C; T Y = C; T R = A;G
AF-Total amplified fragments; PF- Number of polymorphic amplicons; % P- Percentage of polymorphism
The highest percentage of polymorphisms was observed for ISSR8US, followed by ISSR9 (28.57% and 27.27%), suggesting that these regions offer potentially informative markers for differentiating these cultivars. The statistics of genetic variation for all polymorphic loci were 2 for the observed number of alleles (na), 1.6831 ± 0.22 for the effective number of alleles (ne), 0.3969 ± 0.79 for Nei’s gene diversity (h) (1972) and 0.5837 ± 0.086 for Shannon’s information index (I). These results indicate moderate genetic variation across the analysed loci. In this study, a UPGMA dendrogram of garlic cultivars was constructed on the basis of ISSR data obtained by using POPGENE v32 software; BDUT 1450 and BDUT 1452 were linked together in a single cluster, whereas BDUT 1451 evolved as an outgroup from the BDUT 1450 and BDUT 1452 clusters. The average distance between individuals in the same cluster was determined to be 0.5862 for BDUT 1450 and BDUT 1452 and 0.5838 for BDUT 1451. This finding aligns somewhat with the clustering observed in the trnL barcoding region but contrasts with the consistent grouping in the matK, rbcL, and trnH-psbA regions. The estimated Ln probability of the data was − 10.3, the mean value of the maximum likelihood was − 9.8, the variance in the maximum likelihood was 1.0, and the mean value of the alpha was 0.3288. The Bayesian proportion of individual plants for a K = 3 population model was identified by STRUCTURE software and is indicated in Fig. 14. The proportions suggest a combination of divergence and heterogeneity among the cultivars.
Jabbes et al. (2011) revealed that, on the basis of the ISSR primers used, 7 to 21 polymorphic fragments were pragmatic, with an average of 12 fragments, which correlates with our results with the same primers used in this study. The presence of heterologous genomes was reflected by the accessibility of a comparatively high number of polymorphic ISSR markers (Jabbes et al. 2011). The results of Sadeghi and Cheghanirza (2012) demonstrated that ISSR analysis can be used not only to study the high diversity among cultivars but also to study polymorphisms. Additionally, they confirmed that ISSR techniques are advantageous for studying genetic diversity and are robust methods for characterizing plant genomes. ISSR analysis has been extensively used to determine genetic diversity in important crops and fruits and for biodiversity conservation. Furthermore, ISSRs can be used to evaluate relationships at the interspecific level (Huang and Sun 2000). Hao et al. (2002) provided a clear picture of the relationships among closely related congeneric species. Jabbes et al. (2011) reported that the diversity between accessions is always greater than the diversity within accessions.
The discriminative ability of each marker was evaluated by the diversity index value. A higher diversity index indicates more marker information (Kandasamy et al. 2013). The authors concluded that the ISSR marker system was highly informative and efficient for observing the genetic diversity of flowering plants via analytical methods. A total of 99.47% of the bands obtained in the Mukherjee et al. (2013) study were polymorphic. Nagaoka and Ogihara (1997) reported that ISSR primers produce reliable and reproducible bands.