A total of 98 alleles were detected using nine SSR markers (Table 3). The number of alleles per locus ranged from six (HS03, HS10 and HS27) to 20 (HS01), with an average of 10.89, and fragment size ranged from 103 to 343 bp. The expected heterozygosity (He) ranged from 0.04 (HS03) to 0.87 (HS01), and the observed heterozygosity (Ho) ranged from 0.01 (HS03) to 0.72 (HS06). The HS03 marker showed the lowest expected heterozygosity (0.04), indicating that the analyzed accessions showed very little diversity at this locus. The PIC values ranged from 0.04 (HS03) to 0.86 (HS01).
Table 3
Maximum allele frequency (fmax), number of alleles observed (Na), expected heterozygosity (He), observed heterozygosity (Ho) and polymorphic information content (PIC) for nine SSR markers.
SSR marker | fmax | Na | He | Ho | PIC
HS01 | 0.23 | 20 | 0.87 | 0.67 | 0.86
HS03 | 0.98 | 6 | 0.04 | 0.01 | 0.04
HS05 | 0.34 | 14 | 0.83 | 0.40 | 0.82
HS06 | 0.30 | 13 | 0.78 | 0.72 | 0.75
HS08 | 0.40 | 7 | 0.68 | 0.68 | 0.62
HS10 | 0.68 | 6 | 0.49 | 0.31 | 0.44
HS16 | 0.48 | 17 | 0.66 | 0.69 | 0.61
HS27 | 0.58 | 6 | 0.56 | 0.15 | 0.48
HS33 | 0.40 | 9 | 0.76 | 0.43 | 0.72
Average | 0.49 | 10.89 | 0.63 | 0.45 | 0.59
The PIC value represents the probability of detecting polymorphism between two random samples (Ismail et al. 2019). With the exception of primer HS03, the primers showed high discrimination power, with PIC values of 0.44 (HS10) or higher. Given the PIC formula, the observed values depend on the number of detected alleles and their relative frequencies (Guzmán et al. 2020). Therefore, one or two alleles at high frequency lead to low PIC values, as observed for primer HS03, which presented one allele with a frequency of 0.98 (Table 3).
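The dependence of PIC on allele number and frequency can be illustrated with a minimal sketch. It assumes the usual formulas He = 1 - Σp² and PIC = 1 - Σp² - ΣΣ 2p_i²p_j² (Botstein et al. 1980). The allele frequencies below are hypothetical: the first list only mimics the HS03 pattern (one allele at 0.98 and six alleles in total), and the way the remaining 0.02 is split among the rare alleles is an assumption.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    cross = 0.0
    for i in range(len(squares)):
        for j in range(i + 1, len(squares)):
            cross += 2.0 * squares[i] * squares[j]
    return 1.0 - sum(squares) - cross


# Hypothetical locus resembling HS03: one allele at 0.98 and five rare
# alleles sharing the remaining 0.02 (the even split is an assumption).
skewed = [0.98] + [0.004] * 5
# Hypothetical locus with six equally frequent alleles.
even = [1.0 / 6.0] * 6

print(round(expected_heterozygosity(skewed), 2), round(pic(skewed), 2))
print(round(expected_heterozygosity(even), 2), round(pic(even), 2))
```

With the same number of alleles, the skewed locus gives He and PIC close to 0.04, while the evenly distributed locus gives values above 0.8.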
The use of nine SSR primers allowed the characterization of the variability and genetic structure among the progenies and the seven parent (matrix) accessions of the Mangaba genebank. The number of alleles detected per locus (six to 20) is indicative of the allelic richness of the population and was considered sufficient to meet the objective of the present study, since for SSR markers the detection of two to seven alleles per locus is considered satisfactory (Aljumaili et al. 2018). The high heterozygosity values detected for most of the SSR markers used may be related to the outcrossing reproductive system of the species (Table 3).
The average number of alleles per locus, an indicator of genetic diversity, ranged from 3.67 (CA) to 7.22 (PT) (Table 4). The average was lower than that observed by Collevatti et al. (2018) in a study of 28 natural mangaba subpopulations using SSR markers (9.6). The accessions with the lowest number of alleles were PR and CA, which may be related to the smaller number of individuals analyzed (20). The effective number of alleles was lower than the observed number of alleles, ranging from 1.98 (CA) to 3.46 (BI), which suggests that many alleles are rare (p < 0.05) or have low frequency (0.05 < p < 0.25) (Viegas et al. 2011). Unique (private) alleles were observed for five accessions (Table 4).
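Why rare alleles pull the effective number of alleles below the observed number can be shown with a short sketch, assuming the common definition Ne = 1 / Σp². The frequencies are illustrative only and are not taken from the study.

```python
def effective_alleles(freqs):
    """Ne = 1 / sum(p_i^2): the number of equally frequent alleles
    that would give the same expected homozygosity."""
    return 1.0 / sum(p * p for p in freqs)


# Hypothetical locus: two common alleles plus four rare ones.
freqs = [0.60, 0.30, 0.04, 0.03, 0.02, 0.01]

na = len(freqs)                 # observed number of alleles
ne = effective_alleles(freqs)   # effective number of alleles
print(na, round(ne, 2))
```

Here six alleles are observed, but the four rare ones contribute almost nothing to Σp², so Ne stays close to two.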
Table 4
Estimates of genetic variability parameters for mangaba genebank accessions of Embrapa Tabuleiros Costeiros.
Accession | Na | Ne | I | Ho | He | f | %P | Nap
PR | 3.89 | 2.33 | 0.85 | 0.53 | 0.44 | -0.19 | 77.78 | 0
LG | 5.67 | 3.02 | 1.04 | 0.41 | 0.51 | 0.25 | 100 | 1
PT | 7.22 | 3.20 | 1.22 | 0.45 | 0.59 | 0.28 | 100 | 6
BI | 6.78 | 3.46 | 1.31 | 0.48 | 0.62 | 0.27 | 88.89 | 3
TC | 5.78 | 2.22 | 0.91 | 0.43 | 0.45 | 0.02 | 88.89 | 4
AB | 6.44 | 2.76 | 1.19 | 0.48 | 0.58 | 0.14 | 100 | 6
CA | 3.67 | 1.98 | 0.76 | 0.37 | 0.39 | 0.04 | 77.78 | 0
Average | 5.63 | 2.71 | 1.04 | 0.45 | 0.51 | 0.13 | 90.47 | 20
Na: Average number of alleles per locus; Ne: Effective number of alleles per locus; I: Shannon index; Ho: Observed heterozygosity; He: Expected heterozygosity; f: Fixation index; %P: Polymorphism percentage; Nap: Number of private alleles.
Considering the Shannon Index, genetic diversity was lowest in accession CA (0.76) and highest in accession BI (1.31) (Table 4). The high values observed for the Shannon Index indicate high genetic variability in the materials. The estimated values were higher than those observed by Santos et al. (2017), who studied 36 mangaba accessions using ISSR markers (0.28 to 0.42).
Ho ranged from 0.37 (CA) to 0.53 (PR), with a mean of 0.45, and He ranged from 0.39 (CA) to 0.62 (BI), with an average of 0.51 (Table 4). Ho is the proportion of individuals that are heterozygous at a given locus, whereas He is the proportion expected under Hardy-Weinberg equilibrium. With the exception of accession PR, all accessions presented He > Ho. This result indicates heterozygote deficiency in the population and suggests crossing between related individuals (Bernard et al. 2018), which is supported by the positive Fixation Index (f) estimated for these accessions. The estimates for Ho and He were lower than expected for allogamous species (0.63 and 0.65, respectively) and for long-lived species (0.63 and 0.68, respectively) (Nybom 2004). The estimates for Ho were lower than those reported by Costa et al. (2017) for mangaba genotypes evaluated with SSR markers (0.679 to 0.714), and similar to those observed by Chaves et al. (2020), also for mangaba genotypes evaluated with SSR markers (0.428 to 0.581).
Accession PR presented higher Ho than He, which indicates an excess of heterozygotes in this progeny (Yun et al. 2020) and agrees with its negative Fixation Index (f = -0.19) (Table 4). The Fixation Index is one of the most important parameters in population genetics because it reflects the balance between homozygotes and heterozygotes in the population (Pereira et al. 2020). The average Fixation Index was 0.13, indicating a low level of inbreeding in the accessions evaluated.
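The sign of the Fixation Index follows directly from the relation between Ho and He. The sketch below assumes the simple form f = 1 - Ho/He applied to the mean Ho and He in Table 4. It is not expected to reproduce the published f values exactly (for example, it gives about -0.20 for PR instead of -0.19), presumably because those were estimated per locus and then averaged, which is an assumption about the original analysis.

```python
def fixation_index(ho, he):
    """f = 1 - Ho/He; f > 0 suggests a heterozygote deficit, f < 0 an excess."""
    if he <= 0:
        raise ValueError("He must be positive")
    return 1.0 - ho / he


# Mean (Ho, He) per accession as reported in Table 4.
table4 = {"PR": (0.53, 0.44), "LG": (0.41, 0.51), "CA": (0.37, 0.39)}

for name, (ho, he) in table4.items():
    print(name, round(fixation_index(ho, he), 2))
```

Only PR yields a negative value, in line with the excess of heterozygotes described above.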
Mangaba has a self-incompatibility mechanism (Darrault and Schlindwein 2005), which favors cross-fertilization and reduces the occurrence of inbreeding. Thus, the excess of heterozygotes observed for accession PR can be explained by the reproductive system of the species, and the deficit of heterozygotes observed for the other accessions probably resulted from crossing between related individuals. Biparental inbreeding was reported to be the cause of the high inbreeding coefficients observed in natural populations of mangaba sampled in the Midwest region of Brazil (Costa et al. 2017).
The percentage of polymorphic loci was higher than 75% in all accessions (Table 4), confirming the presence of genetic variability in the accessions evaluated. The values were higher than those observed in remnant mangaba populations (73.77%) (Silva et al. 2017). This difference may be related to the marker type, since SSR is considered more informative than ISSR, and to the number of individuals evaluated (296). High genetic variability is often associated with species that have a wide geographic distribution (Al Salameen et al. 2018), as is the case with mangaba.
Nei's genetic distance among the accessions (Table 5) ranged from 0.098 to 0.607. The lowest distance was observed between the PT and TC accessions (0.098) and the highest between the AB and LG accessions (0.607). The genetic differentiation among accessions (GST) ranged from 0.040 (PT and TC) to 0.201 (LG and CA) (Table 5).
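For readers unfamiliar with the statistic, a minimal sketch of a pairwise distance calculation is given here. It assumes Nei's (1972) standard distance, D = -ln(I), computed from allele frequencies pooled over loci. The text does not state which variant of Nei's distance was used, and the populations and frequencies below are made up.

```python
import math


def nei_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance D = -ln(I).

    pop_x and pop_y are lists of dicts, one dict per locus,
    mapping allele label -> frequency in that population.
    """
    jxy = jx = jy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        alleles = set(fx) | set(fy)
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
    identity = jxy / math.sqrt(jx * jy)   # Nei's genetic identity I
    return -math.log(identity)


# Two hypothetical populations scored at two loci (made-up frequencies).
pop_a = [{"103": 0.7, "107": 0.3}, {"210": 0.5, "214": 0.5}]
pop_b = [{"103": 0.6, "107": 0.4}, {"210": 0.2, "214": 0.8}]

print(round(nei_distance(pop_a, pop_b), 3))
```

Populations with identical allele frequencies give D = 0, and D grows as the frequencies diverge.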
Table 5
Nei’s genetic distance (above the diagonal) and GST (below the diagonal) between mangaba accessions.
Accession | PR | LG | PT | BI | TC | AB | CA
PR | - | 0.478 | 0.198 | 0.340 | 0.188 | 0.324 | 0.229
LG | 0.162 | - | 0.261 | 0.239 | 0.574 | 0.607 | 0.589
PT | 0.071 | 0.08 | - | 0.182 | 0.098 | 0.172 | 0.155
BI | 0.105 | 0.069 | 0.043 | - | 0.250 | 0.309 | 0.312
TC | 0.087 | 0.186 | 0.040 | 0.085 | - | 0.144 | 0.153
AB | 0.110 | 0.154 | 0.047 | 0.074 | 0.058 | - | 0.341
CA | 0.111 | 0.201 | 0.064 | 0.108 | 0.079 | 0.125 | -
The values of Nei's genetic distance and GST are consistent with the AMOVA estimates (Table 6), in which the smallest proportion of genetic variation (8%) was detected among accessions and the largest (92%) within accessions. In outcrossing species such as mangaba, 10–20% of the genetic variation is generally found between populations, whereas in autogamous species this value exceeds 50% (Al Salameen et al. 2018). This pattern was also observed in a study of natural populations of mangaba using RAPD markers (Fajardo et al. 2018).
Table 6
Analysis of molecular variance (AMOVA) among the seven mangaba accessions.
Source of variation | df | SS | MS | Variance | % | p-value | RST
Between accessions | 6 | 1880524.440 | 313420.740 | 3230.208 | 8% | 0.001** | 0.076
Within accessions | 605 | 23650842.429 | 39092.302 | 39092.302 | 92% | |
Total | 611 | 25531366.869 | | 42322.510 | 100% | |
df: degrees of freedom; SS: sum of squares; MS: mean square. ** Significant at 1% probability.
The estimated value of the RST statistic was 0.076 (Table 6), indicating moderate genetic differentiation among the accessions evaluated. The lowest Nei's genetic distance was observed between the accessions PT and TC (0.098, Table 5), which may be associated with the common origin of these accessions (Indiaroba, Sergipe, Brazil). The GST values between PT and BI (0.043), PT and TC (0.040), and PT and AB (0.047) are considered low according to the classification proposed by Wright (1978). The other values observed are considered moderate. The presence of private alleles is an indication of differentiation among accessions and calls for conservation strategies for the accessions that carry these alleles. The PR and CA accessions showed no private alleles (Table 4).
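The percentages and the RST value can be checked against the variance components in Table 6. The sketch below assumes a two-level AMOVA in which the differentiation statistic equals the among-accession share of the total estimated variance; the variable names are illustrative.

```python
# Variance components reported in Table 6.
var_among = 3230.208     # among accessions
var_within = 39092.302   # within accessions

total = var_among + var_within
pct_among = 100.0 * var_among / total
pct_within = 100.0 * var_within / total
rst = var_among / total  # among-accession share of the total variance

print(round(total, 3), round(pct_among, 1), round(pct_within, 1), round(rst, 3))
```

The among-accession share is about 7.6%, which rounds to the 8% shown in the table and matches RST = 0.076.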
The principal coordinates analysis (Fig. 3), performed to evaluate the distribution of genetic variability among the accessions, did not allow us to distinguish them according to origin. The first two principal coordinates explained 22.07% of the total genetic variance of the 296 accessions, with 13.66% explained by coordinate 1 and 8.41% by coordinate 2.
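The share of variance attributed to each coordinate comes from the eigenvalues of the double-centred distance matrix. A minimal sketch of classical principal coordinates analysis with NumPy follows. It assumes a symmetric distance matrix as input and reports each axis as a share of the summed positive eigenvalues; the software used in the study may scale these differently. The four-individual matrix is hypothetical.

```python
import numpy as np


def pcoa(dist, n_axes=2):
    """Classical PCoA (metric MDS) of a symmetric distance matrix.

    Returns the coordinates on the first n_axes axes and the share of
    the summed positive eigenvalues explained by each of those axes.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering   # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(b)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = eigvals > 1e-12
    explained = eigvals[:n_axes] / eigvals[positive].sum()
    coords = eigvecs[:, :n_axes] * np.sqrt(np.clip(eigvals[:n_axes], 0, None))
    return coords, explained


# Hypothetical distances among four individuals (not data from the study).
dist = np.array([
    [0.0, 0.2, 0.7, 0.8],
    [0.2, 0.0, 0.6, 0.7],
    [0.7, 0.6, 0.0, 0.3],
    [0.8, 0.7, 0.3, 0.0],
])
coords, explained = pcoa(dist)
print(coords.shape, explained.round(3))
```

When the first two axes explain a modest share, as with the 22.07% reported here, most of the variation lies along many further axes, which is consistent with the weak separation by origin.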
The genetic distance between the accessions was estimated using Rogers' coefficient (1972) and ranged from 0.0 (between accessions BIP2.3 and BIP2.5) to 1.0 (between accessions PTP2.14 and LGP1.10; PTP2.14 and PTP2.11; PTP2.14 and PTP4; PTP2.14 and ABP2.1) (Fig. 4).
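How the bounds of 0.0 and 1.0 arise can be sketched for pairs of diploid individuals. The code assumes the Rogers (1972) distance in its usual form, d = (1/L) Σ_l sqrt(½ Σ_a (x_la - y_la)²), applied to within-individual allele frequencies (0, 0.5 or 1). The genotypes are hypothetical, missing data are not handled, and the implementation in the software used for the study may differ.

```python
import math


def allele_freqs(genotype):
    """Within-individual allele frequencies for one diploid genotype,
    e.g. (103, 107) -> {103: 0.5, 107: 0.5}; (103, 103) -> {103: 1.0}."""
    freqs = {}
    for allele in genotype:
        freqs[allele] = freqs.get(allele, 0.0) + 0.5
    return freqs


def rogers_distance(ind_x, ind_y):
    """Rogers (1972) distance averaged over loci."""
    total = 0.0
    for gx, gy in zip(ind_x, ind_y):
        fx, fy = allele_freqs(gx), allele_freqs(gy)
        alleles = set(fx) | set(fy)
        sq = sum((fx.get(a, 0.0) - fy.get(a, 0.0)) ** 2 for a in alleles)
        total += math.sqrt(0.5 * sq)
    return total / len(ind_x)


# Hypothetical two-locus genotypes (allele sizes in bp are made up).
ind_a = [(103, 103), (210, 210)]
ind_b = [(107, 107), (214, 214)]   # shares no allele with ind_a
ind_c = [(103, 107), (210, 214)]   # shares one allele per locus with ind_a

print(rogers_distance(ind_a, ind_a),
      rogers_distance(ind_a, ind_b),
      rogers_distance(ind_a, ind_c))
```

Under this definition, a distance of 0.0 means identical multilocus genotypes, and 1.0 is reached only when two individuals are homozygous for different alleles at every locus compared.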
The population genetic structure analysis based on Bayesian statistics allowed the identification of two clusters (K = 2) (Fig. 5). The first cluster was composed of 197 progenies derived from parent plants of the accessions PR, PT, BI, TC, AB, and CA. The second cluster was composed of 90 progenies derived from the LG, PT, and BI accessions. Nine individuals showed mixed ancestry (membership values below 80% for both clusters).
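The 80% membership rule described above can be expressed as a small sketch. The membership coefficients (Q) and individual labels are hypothetical, and treating a value of exactly 0.80 as assigned is an assumption.

```python
def assign_cluster(q_values, threshold=0.80):
    """Return the index of the cluster whose membership coefficient reaches
    the threshold, or None when the individual shows mixed ancestry."""
    best = max(range(len(q_values)), key=lambda k: q_values[k])
    return best if q_values[best] >= threshold else None


# Hypothetical membership coefficients (Q) for K = 2; labels are made up.
q_matrix = {
    "ind_01": (0.97, 0.03),
    "ind_02": (0.12, 0.88),
    "ind_03": (0.55, 0.45),
}

assignments = {name: assign_cluster(q) for name, q in q_matrix.items()}
print(assignments)
```

Individuals such as `ind_03`, with no coefficient reaching the threshold, are the ones counted as having mixed ancestry.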
Knowledge of the genetic structure of the germplasm is essential for the design of efficient strategies for the conservation and genetic improvement of the species. Cluster identification supports the selection of parents for breeding programs, which can contribute to increased genetic diversity and potential gain from selection (Campoy et al. 2016).
Population genetic structure analysis based on Bayesian statistics was used to infer the ancestry of the accessions from the molecular information (Bernard et al. 2018). This analysis did not discriminate the accessions according to their origin, confirming the results obtained with the principal coordinates analysis (Fig. 3) and the Rogers genetic distance analysis (Fig. 4). These results indicate that there is no correlation between the molecular data and the geographical origin of the analyzed accessions (Ismail et al. 2019). In the Structure analysis, 57 LG accessions, 17 PT accessions and 15 BI accessions were grouped in the same cluster (green color, Fig. 5). These accessions are grouped on the negative side of the first principal coordinate (Fig. 3) and are also grouped together in the dendrogram (Fig. 4; blue, yellow, and green).
The creation and maintenance of a germplasm bank presents logistical and economic limitations. Thus, the creation of a core collection, which represents most of the genetic diversity present in the active germplasm bank (BAG) in a smaller number of accessions, is an efficient way to reduce costs (Campoy et al. 2016; Bernard et al. 2018) and to increase the efficiency of conservation and genetic improvement strategies for the species. The maximum length subtree function of the DARwin 6.0.14 software was used iteratively to eliminate redundant accessions based on the molecular data, and allowed the selection of 225 accessions to compose the core collection of the Mangaba BAG. All seven accessions were represented in the core collection, which comprised 6.67% PR, 18.22% LG, 20.89% AB, 11.55% BI, 6.67% CA, 20% PT, and 16% TC (Fig. 6). The selected accessions retained 94.90% of the detected alleles.
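Allele retention of a candidate core collection can be checked with a simple count. The sketch below is not the DARwin procedure; it only illustrates, with hypothetical genotypes and helper names, how the percentage of alleles retained by a subset could be computed. If the reported percentage refers to the 98 alleles in Table 3, 94.90% would correspond to 93 alleles, which is an inference rather than a figure stated in the text.

```python
def allele_set(genotypes):
    """Collect (locus, allele) pairs present in a group of individuals.
    Each individual is a list of diploid genotypes, one tuple per locus."""
    found = set()
    for individual in genotypes:
        for locus, genotype in enumerate(individual):
            for allele in genotype:
                found.add((locus, allele))
    return found


def retention(full, core):
    """Percentage of alleles in the full collection also present in the core."""
    all_alleles = allele_set(full)
    return 100.0 * len(allele_set(core) & all_alleles) / len(all_alleles)


# Hypothetical two-locus genotypes for four individuals.
full = [
    [(103, 107), (210, 210)],
    [(103, 103), (210, 214)],
    [(107, 111), (214, 214)],
    [(103, 107), (210, 218)],
]
core = full[:3]   # a made-up subset standing in for a core collection
print(round(retention(full, core), 1))
```

In this toy case the subset misses one allele carried only by the excluded individual, so retention falls below 100%.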
The three approaches used to study the genetic structure of the mangaba accessions (Structure, PCoA, and dendrogram) indicated that the accessions used as parents have the same genetic background and share common alleles (Ahmed et al. 2021). Moreover, as mangaba presents a self-incompatibility mechanism (Darrault and Schlindwein 2005), which favors allogamy, the maintenance of the parent accessions in the same experimental field (BAG) contributed to gene flow and, consequently, to the genetic similarity observed among the progenies.
The use of SSR molecular markers allowed the identification of genetic variability within and between progenies and parent plants of the accessions of the Mangaba genebank, and contributed to the selection of materials to compose the core collection of this BAG, which was established in the field 15 months after sowing (Fig. 7).
Additionally, data related to agronomic and morphological characterization should be used to support the formation of this core collection, since the combination of this information contributes to the design of more efficient strategies for the use of this genetic resource.