DArTseqLD
The full dataset provided by DArT Pty Ltd. comprised 864 genotypes, featuring 15,833 codominant binary SNP markers and 33% missing data. The final numbers of SNP markers and genotypes retained in each subset after filtering are provided in Table S2 (Online Resource 2).
Monitoring of northern Spain populations
We collected wild populations from 19 locations (six in the Basque Country, four in Cantabria and nine in Asturias) and leafy kales from three home gardens in Asturias, one of which was located a few metres away from the site of the wild population of Torimbia. All the wild populations were found in the locations indicated by Gómez-Campo et al. (2005), except for one new site, which was found at El Musel, near Gijón. A few differences were observed in eight locations regarding the status of the populations, compared to the notes in Gómez-Campo et al. (2005). In Cudillero, Cabo de Peñas, Cabo Torres, Torimbia and Llanes, the populations were larger and, except in Llanes, more accessible than ten years before. In Tazones, Llanes and Pendueles, they were no longer easily accessible, and at El Pedrero, the population was much smaller (Online Resource 3). At this time of the year, a large part of the seeds had already been shed, but it was still possible to find remaining siliquae with some seeds on most plants.
Identity of the accessions
A first analysis of all samples together, using three different approaches (a NJ tree (Figure S1, Online Resource 2), Principal Coordinates Analysis (PCoA) and STRUCTURE (not shown)), revealed three accessions that evidently did not cluster within their expected group. One of these was B. incana from Angelokastros, Corfu, which clustered completely separately from the other B. incana populations and actually overlapped with B. cretica from Corfu. The other two were samples of B. montana, from San Marino and from Savona, which clustered within the B. oleracea group, unlike B. montana from Stazzema. Only the latter formed a well-identified autonomous cluster, both in STRUCTURE and PCoA, and clustered together with the Mediterranean species B. cretica, B. drepanensis, B. macrocarpa and B. rupestris in the NJ tree. Because of this doubtful taxonomic attribution, the three above-mentioned accessions were removed and the dataset was re-filtered as described in Materials and Methods before further analysis.
Relationship between species
Samples were analysed by species, with the cultivated B. oleracea group treated separately from the wild-growing populations. Each species forms a well-defined cluster (Fig. 2A), with the partial exception of B. rupestris. The cultivated types almost completely overlap with the wild B. oleracea and partly with B. incana, indicating a very close relationship between these three groups. B. rupestris and B. drepanensis are very close to each other and well separated from all other species, except for four individuals of B. rupestris from Taureana, whose position indicates probable intercrossing with cultivated crops. Brassica cretica from Crete and from Corfu form two very distinct clusters.
The Shannon index values of genetic diversity (Table 2) are not fully comparable, since each group has a different number of individuals and populations. The case of B. rupestris (highest value, 0.207) can be explained by the fact that it combines a true B. rupestris population from Pazzano with a population from Taureana, the latter likely introgressed by cultivated crops. The two populations of B. cretica together have a rather high value of 0.149, but when considered separately (Table 1) their respective values are only 0.043 and 0.056.
Table 2
Genetic diversity indices for accessions grouped by taxon. NP/A = number of populations/accessions; NI = number of individuals; SE = standard error. Means followed by different letters are significantly different at p < 0.01.
Taxon | NP/A | NI | Shannon index (mean) | Shannon index (SE) | % polymorphic loci
B. montana | 1 | 18 | 0.073 a | 0.002 | 21.33
B. macrocarpa | 1 | 20 | 0.078 b | 0.002 | 25.42
B. drepanensis | 1 | 19 | 0.093 c | 0.003 | 26.15
B. oleracea var. viridis (leafy kales) | 6 | 101 | 0.119 d | 0.003 | 54.54
B. oleracea var. oleracea | 25 | 470 | 0.128 e | 0.003 | 73.89
B. incana | 6 | 108 | 0.142 f | 0.003 | 62.17
B. cretica | 2 | 37 | 0.149 g | 0.004 | 32.75
B. rupestris | 2 | 24 | 0.207 h | 0.003 | 59.26
Total (Mean) | 44 | 797 | 0.124 | 0.001 | 44.44
Overall Fst = 0.579
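To make the two diversity measures in Table 2 concrete, the following is a minimal sketch of how a Shannon information index and a percentage of polymorphic loci can be computed from allele frequencies at binary SNP loci. It assumes the usual per-locus definition I = -(p ln p + q ln q) with natural logarithms, averaged over loci; the function names and the toy frequencies are hypothetical, and the software used for the values reported here may differ in detail (for example in how missing data are handled).

```python
import math


def shannon_index(freqs):
    """Mean Shannon information index over biallelic loci.

    freqs holds the frequency p of one allele at each locus;
    the other allele is assumed to have frequency 1 - p.
    """
    def locus_index(p):
        # By convention 0 * ln(0) = 0, so fixed loci contribute nothing.
        return -sum(x * math.log(x) for x in (p, 1.0 - p) if x > 0.0)

    return sum(locus_index(p) for p in freqs) / len(freqs)


def percent_polymorphic(freqs):
    """Percentage of loci at which both alleles are present."""
    polymorphic = sum(1 for p in freqs if 0.0 < p < 1.0)
    return 100.0 * polymorphic / len(freqs)


# Hypothetical allele frequencies at four loci (two fixed, two balanced).
toy_freqs = [0.5, 1.0, 0.0, 0.5]
toy_index = shannon_index(toy_freqs)
toy_polymorphic = percent_polymorphic(toy_freqs)
```

Under this definition a single biallelic locus cannot exceed ln 2 (about 0.693), which is why group means such as those in Table 2 stay well below that bound when many loci are fixed.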
The Analysis of Molecular Variance (AMOVA) shows that the variation is almost equally distributed among (48%) and within (52%) populations (Table 3). This pattern is reflected in the relatively high overall mean genetic differentiation (Fst = 0.579). However, the pairwise values between groups (Table 4) show that genetic differentiation is minimal (Fst = 0.012) between wild oleracea and cultivated leafy kales, indicating that these two groups are genetically almost identical. The closest wild species to these two groups is B. incana (Fst = 0.049 vs. wild oleracea; Fst = 0.060 vs. leafy kales). In comparison, B. cretica is more differentiated from B. oleracea (Fst = 0.172 vs. wild oleracea; Fst = 0.206 vs. leafy kales). Nei’s genetic identities show a similar pattern, with wild oleracea virtually identical to the leafy kales (0.995) and B. incana as the closest wild relative (0.968), followed at some distance by B. cretica (0.868) (Table S3, Online Resource 2).
Table 3
Analysis of Molecular Variance (AMOVA) (NP/A = number of populations/accessions, NI = number of individuals, Fst = coefficient of gene differentiation, SE = Standard Error of Fst, P = probability)
Populations/Accessions | NP/A | NI | Fst | SE | PhiPT | P | Variance within populations | Variance among populations |
7 species groups + leafy kales group | 8 | 797 | 0.579 | 0.004 | 0.482 | 0.001 | 52% | 48% |
7 species groups (no cultivated) | 7 | 696 | 0.578 | 0.004 | 0.548 | 0.001 | 45% | 55% |
Wild oleracea vs. cultivated oleracea | 2 | 571 | 0.024 | 0.001 | 0.054 | 0.001 | 95% | 5% |
Wild oleracea from France vs. Spain | 2 | 411 | 0.025 | 0.001 | 0.056 | 0.001 | 94% | 6% |
Wild oleracea by France and Spain sub-regions | 7 | 411 | 0.124 | 0.002 | 0.109 | 0.001 | 89% | 11% |
Wild oleracea by country | 5 | 470 | 0.127 | 0.003 | 0.117 | 0.001 | 88% | 12% |
Landraces from Spain vs. Italy | 2 | 101 | 0.046 | 0.001 | 0.127 | 0.001 | 87% | 13% |
Six landraces from Spain and Italy | 6 | 101 | 0.186 | 0.003 | 0.293 | 0.001 | 71% | 29% |
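The relation between the PhiPT column and the two variance percentages in Table 3 can be illustrated with a short sketch. It assumes the standard two-level AMOVA definition PhiPT = V_among / (V_among + V_within); the function names are hypothetical, and the rows are simply the PhiPT and percentage values copied from Table 3, not a re-analysis of the genotype data.

```python
def amova_partition(v_among, v_within):
    """Return (PhiPT, % among, % within) from two variance components."""
    total = v_among + v_within
    phi_pt = v_among / total
    return phi_pt, 100.0 * v_among / total, 100.0 * v_within / total


# (label, PhiPT, % within populations, % among populations) from Table 3.
TABLE3_ROWS = [
    ("7 species groups + leafy kales group", 0.482, 52, 48),
    ("7 species groups (no cultivated)", 0.548, 45, 55),
    ("Wild oleracea vs. cultivated oleracea", 0.054, 95, 5),
    ("Wild oleracea from France vs. Spain", 0.056, 94, 6),
    ("Wild oleracea by France and Spain sub-regions", 0.109, 89, 11),
    ("Wild oleracea by country", 0.117, 88, 12),
    ("Landraces from Spain vs. Italy", 0.127, 87, 13),
    ("Six landraces from Spain and Italy", 0.293, 71, 29),
]


def row_is_consistent(phi_pt, within_pct, among_pct):
    """Check that the rounded PhiPT matches the among-population share."""
    return round(100.0 * phi_pt) == among_pct and within_pct + among_pct == 100


all_rows_consistent = all(
    row_is_consistent(phi, within, among)
    for _, phi, within, among in TABLE3_ROWS
)
```

Read this way, the percentage of variance among populations in each row is PhiPT expressed as a rounded percentage, while the Fst column is a separate estimate and need not match it.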
Table 4
Pairwise population Fst values
Taxon | B. cretica | B. drepanensis | B. incana | B. macrocarpa | B. montana | B. oleracea | B. rupestris | Leafy kales
B. cretica | 0.000 | | | | | | |
B. drepanensis | 0.515 | 0.000 | | | | | |
B. incana | 0.214 | 0.370 | 0.000 | | | | |
B. macrocarpa | 0.503 | 0.538 | 0.319 | 0.000 | | | |
B. montana | 0.462 | 0.595 | 0.250 | 0.583 | 0.000 | | |
B. oleracea | 0.172 | 0.352 | 0.049 | 0.303 | 0.192 | 0.000 | |
B. rupestris | 0.342 | 0.320 | 0.225 | 0.377 | 0.374 | 0.222 | 0.000 |
Leafy kales | 0.206 | 0.415 | 0.060 | 0.344 | 0.241 | 0.012 | 0.248 | 0.000
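As an illustration of what a pairwise Fst value such as those in Table 4 measures, the sketch below computes a textbook Nei-style estimate, Fst = (Ht - Hs) / Ht, from allele frequencies of two populations at biallelic loci. This is an assumption about the general form only: the function name is hypothetical, heterozygosities are averaged over loci before taking the ratio, equal population weights are used, and no sample-size correction is applied, so it will not reproduce the published values exactly.

```python
def pairwise_fst(freqs_a, freqs_b):
    """Nei-style Fst between two populations from per-locus allele frequencies.

    freqs_a and freqs_b hold the frequency of the same allele at each locus.
    Hs is the mean within-population expected heterozygosity and Ht the
    expected heterozygosity of the pooled (equally weighted) populations.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need the same loci")
    hs_sum = 0.0
    ht_sum = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        hs_sum += (2 * pa * (1 - pa) + 2 * pb * (1 - pb)) / 2
        p_mean = (pa + pb) / 2
        ht_sum += 2 * p_mean * (1 - p_mean)
    if ht_sum == 0.0:
        # No variation at any locus: differentiation is taken as zero.
        return 0.0
    return (ht_sum - hs_sum) / ht_sum


# Hypothetical frequencies: identical, moderately different, fixed differences.
fst_identical = pairwise_fst([0.5, 0.3], [0.5, 0.3])
fst_moderate = pairwise_fst([0.2], [0.8])
fst_fixed = pairwise_fst([1.0], [0.0])
```

Values close to zero, such as the 0.012 between wild oleracea and leafy kales, correspond to nearly identical allele frequencies, whereas values above 0.5 indicate that most of the variation lies between the two groups.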
A STRUCTURE analysis was run for all accessions and the results were visualized in a bar plot grouped by species (Fig. 2B). DeltaK indicated that the accessions are best divided into two groups (K = 2). One accession, LM18_02_04, the B. cretica accession from Corfu, was equally assigned to both groups. The division into two groups (K1 and K2) clearly separates all the cultivated accessions, together with all the Atlantic oleracea, from the Mediterranean wild species B. montana, B. rupestris, B. drepanensis, B. macrocarpa and B. cretica, B. incana being the exception among the Mediterranean wild species.
The only B. montana population analysed here (‘Stazzema’) clusters together with the truly wild species, but also shows a certain percentage of ‘domesticated’ alleles (K1 type). The other two B. montana populations (‘Savona’ and ‘San Marino’) seemed to be heavily admixed with cultivated crops and were not included in this analysis.
In the case of B. cretica, the two populations from Crete and Corfu are identified as truly wild, but with a significant percentage of domesticated alleles, and one individual is considered admixed.
The other wild species from southern Italy (B. macrocarpa, B. drepanensis and B. rupestris) plot firmly among the truly wild.
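The choice of K from DeltaK can be sketched as follows. The code assumes the commonly used form of the Evanno statistic, DeltaK = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), where L(K) is the mean log-likelihood over replicate runs at a given K. The log-likelihood values below are invented purely for illustration and are not the STRUCTURE output of this study; the function name is hypothetical.

```python
import statistics


def delta_k(runs_by_k):
    """Evanno-style DeltaK for every K that has both neighbours K-1 and K+1.

    runs_by_k maps K to a list of log-likelihoods from replicate runs.
    """
    means = {k: statistics.mean(v) for k, v in runs_by_k.items()}
    result = {}
    for k in sorted(runs_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            spread = statistics.stdev(runs_by_k[k])  # sample standard deviation
            result[k] = second_diff / spread
    return result


# Invented log-likelihoods for three replicate runs at K = 1..4.
toy_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-785.0, -790.0, -780.0],
}
toy_delta = delta_k(toy_runs)
best_k = max(toy_delta, key=toy_delta.get)
```

In this toy input the likelihood improves sharply from K = 1 to K = 2 and only marginally afterwards, so DeltaK peaks at K = 2. Note that the statistic is undefined for the smallest and largest K tested, so it cannot by itself support K = 1.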
Relationship between wild-growing and cultivated B. oleracea
The STRUCTURE analysis of all wild-growing B. oleracea (n = 470) together with the cultivated B. oleracea (n = 101) distinguishes two groups (Fig. 3B). All the cultivated accessions fall into the K2 group, while the wild-growing accessions fall partly into the same group as the cultivated ones and partly into a different group (K1), with various degrees of admixture. The ‘purest’ K1 populations originate mainly from the western part of the northern coast of Spain, i.e. from Cantabria (‘Cuchia’, ‘El Pedrero’) and Asturias (‘Cabo Peñas’, ‘Cudillero’, ‘El Musel’, ‘Torimbia’, ‘Torres’ and ‘Xago’). All the wild-growing northern populations from the UK, Denmark, Germany and France are indistinguishable from the cultivated ones. Considerable admixture is present in the geographical middle between these two extremes, i.e. in the Basque Country (‘Getaria west’ and ‘Urgull’) and western France (‘Mortagne’), but also in other populations from Asturias (‘Pendueles’) and Cantabria (‘Laredo’).
The corresponding PCoA (Fig. 3A) shows the majority of B. oleracea var. oleracea populations plotting on a wide area in the upper and left quadrants, while a few others (from northern Europe) are mixed with the cultivated populations in the bottom right quadrant. AMOVA indicates a variation of 5% among populations and 95% within populations (Table 3). In addition, the Nei genetic identity is very high (0.986), corresponding to a very low coefficient of gene differentiation between cultivated and wild (Fst = 0.024). Shannon index and allele polymorphism values are higher for the wild than for the cultivated accessions (Table S4, Online Resource 2).
Comparison between all wild-growing B. oleracea grouped by country of origin
Populations from each country cluster together and form a gradient of diversity with few overlaps between French and Spanish populations and between French and German ones (Fig. 4A).
Based on the Shannon index, the Spanish group is the most diverse (0.324), followed by the French (0.313), German (0.275), UK (0.193) and Danish (0.154) groups (Table S5, Online Resource 2). Molecular variance among populations is limited to 12% (Table 3), and Nei’s genetic distance is very low between all of these groups. The smallest pairwise Nei distance is between French and Spanish populations (0.015) and the largest is between Danish and British populations (0.096). The second smallest is between German and French populations (0.028) (Table S6, Online Resource 2). Nor does the STRUCTURE analysis of the same grouping of populations show any particular structuring associated with the country of origin (Fig. 4B). The main structuring remains between two groups (K1 and K2), which mainly split the Spanish populations into two categories, while the K2 type prevails in the other countries. The K2 pattern is shared by about half of the Spanish populations, mainly those from the central-eastern part of the northern coast of Spain.
Comparison between wild-growing B. oleracea from France and Spain
A comparison between French and Spanish populations of B. oleracea var. oleracea indicates a higher genetic diversity in Spain. The differentiation between these two groups is very low (Fst = 0.025; Table 3). With the French and Spanish populations grouped by region, the PCoA (Fig. 5) shows French populations from Charente-Maritime, Seine-Maritime and Somme clustering together, confirming the pattern of Fig. 4A. These populations cover the bottom right quadrant. Populations from Manche cluster in the area overlapping with the Basque Country and part of the Cantabrian and Asturian populations. The Asturian populations show the largest diversity (Table S7, Online Resource 2) and are the only ones covering the bottom left quadrant of the PCoA. Pairwise Fst values of genetic differentiation range from 0.021 (between Asturias and Cantabria) to 0.124 (between Somme and Charente-Maritime) (Table S8, Online Resource 2). All Fst values within this analysis are low (< 0.15). The lowest values are between Spanish regions, ranging from 0.021 to 0.030. Differentiation between Spanish and French regional populations is somewhat higher, ranging from 0.043 (Asturias vs. Seine-Maritime) to 0.076 (Cantabria vs. Somme). Differentiation between French regions is higher still, ranging from a minimum of 0.066 (Seine-Maritime vs. Somme) to a maximum of 0.124 (Charente-Maritime vs. Somme).
Nei’s genetic distance shows exactly the same pattern, ranging from the lowest distance of 0.013 between Asturias and Cantabria, to the largest distance of 0.075 between Somme and Charente-Maritime (Table S9, Online Resource 2).
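Because the text reports both Nei’s genetic identity and Nei’s genetic distance, a short sketch of how the two relate may help. It assumes the standard (uncorrected) definitions for biallelic loci, I = Jxy / sqrt(Jx Jy) and D = -ln I, whereas the supplementary matrix is described as using the unbiased estimator, which additionally corrects for sample size. Function names and toy frequencies are hypothetical.

```python
import math


def nei_identity(freqs_x, freqs_y):
    """Nei's standard genetic identity from per-locus allele frequencies."""
    if len(freqs_x) != len(freqs_y):
        raise ValueError("both populations need the same loci")
    n = len(freqs_x)
    # Average probabilities that two alleles are identical, within and between.
    jx = sum(p * p + (1 - p) ** 2 for p in freqs_x) / n
    jy = sum(p * p + (1 - p) ** 2 for p in freqs_y) / n
    jxy = sum(px * py + (1 - px) * (1 - py)
              for px, py in zip(freqs_x, freqs_y)) / n
    return jxy / math.sqrt(jx * jy)


def distance_from_identity(identity):
    """Nei's standard genetic distance D = -ln(I)."""
    if identity <= 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(identity)


# Converting an identity reported in the text into the distance scale.
d_wild_vs_cultivated = distance_from_identity(0.986)
```

On this scale an identity of 0.986, as reported for cultivated versus wild B. oleracea, corresponds to a distance of roughly 0.014, which is of the same order as the smallest regional distances quoted above.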
Comparison between landraces from Italy and Spain
Italian and Spanish landraces plot in separate clusters in the PCoA (Figure S2, Online Resource 2). However, variance among populations is low (13%) (Table 3), as are the pairwise population Fst value (0.046) and the Nei genetic distance (0.033). The Shannon diversity is similar, 0.268 (Italy) vs. 0.244 (Spain) (Table S10). In a PCoA run by population, each landrace clusters into a separate group, except ‘Tazones El Pison’ (Taz_EP_lr) and ‘Torimbia’ (Torim_lr), which fully overlap (Fig. 6A). In fact, the gene differentiation among these six landraces is sufficiently high (0.186) (Table 3) to recognize that each landrace tends to form a separate cluster, while there is no distinct separation between Italian and Spanish landraces as such.
A STRUCTURE analysis of the same group of six Italian and Spanish landraces identifies five groups (K = 5). In this case too, each population shows a distinct predominant genetic character, except for two landraces, one from Sicily (‘De Baudo’s neighbour’) and the other from Asturias (‘Torimbia’), which fall into the same category K1 (Fig. 6B). This pattern confirms that landraces maintain their own identity, but various levels of admixture between the genetically homogeneous groups identified as five clusters (K1 to K5) emerge more clearly here (in particular, two individuals of ‘Torimbia’ are an admixture between K1 and K4). While the PCoAs seemed to visually indicate a rather clear-cut separation between Italian and Spanish landraces, in the STRUCTURE analysis the separation is less evident, as indicated by the landraces ‘Torimbia’ and ‘De Baudo’s neighbour’ falling into the same category K1. The landrace with the least admixture, and therefore the purest identity, is ‘Latassa’ from Calabria, Italy.
A PCoA in which Italian and Spanish landraces are highlighted together with all the wild oleracea (Fig. 2A) shows that they do not form separate clusters when compared to the wild oleracea, and that both Spanish and Italian landraces have areas of admixture with the wild populations, confirming the pattern of Fig. 3A.
Relationship between all accessions
When each population is kept separate, information on the individual populations can be obtained. Table S11 (Online Resource 2) shows the min-max range of Shannon diversity values of all the populations within each species/group. There is no clear pattern distinguishing the genetic diversity of the various species, or the wild from the cultivated. Thus, in all groups there are populations with a very low level of diversity and others with a relatively high level (> 0.1), with the exception of those categories where only one or two populations were tested. Data from each population (Table 1) mostly show a rather low level of diversity (I < 0.1). Only a few populations have I > 0.1. The top three are B. rupestris – ‘Taureana’, Calabria (0.263), B. oleracea var. oleracea – ‘Urgull’ – Basque Country (0.129) and landrace ‘De Baudo’s neighbour’ – Sicily (0.122), which also have the top three polymorphism levels (46.6%, 42.8% and 33.3%, respectively).
Table S12 (Online Resource 2) shows the closest relationships (two or more accessions) to each accession, based on the lowest values of pairwise genetic distance (see also the full matrix of Nei Unbiased genetic distances as Table S13 in Online Resource 1). ‘Laredo’ is the population that most often (63% of the cases) appears as genetically close to other accessions, which include not only other Spanish B. oleracea var. oleracea populations, but also all the populations from France, ‘Helgoland’ (Germany), ‘Rødvig’ (Denmark) and ‘Tenby’ (Wales), and all the landraces from both Spain and Italy. Even in the case of B. cretica from Corfu, ‘Laredo’ is the second closest of all populations. This pattern makes ‘Laredo’ an exemplary ‘pivot’ population. The genetic distance between ‘Laredo’ and all the B. oleracea populations is never higher than 0.020 (ranging between 0.009 and 0.020). ‘Laredo’ is also the population with the lowest values of genetic distance from B. incana populations (0.037 from BincCp_w and 0.040 from BincSo_w), together with ‘Helgoland’ (0.037 and 0.039, respectively). The Shannon diversity index I of ‘Laredo’ is relatively high (0.102) compared to other accessions in this study, although not the highest.
The ‘De Baudo’s neighbour’ landrace is the second most represented population (21% of the cases) among those genetically closest to other accessions. It is the closest accession to both B. cretica populations and one of the closest to B. rupestris from Sicily and Calabria. It shows a short genetic distance from wild oleracea populations from northern Spain and Helgoland, as well as from other landraces from Italy and Spain. It also has a high level of diversity (I = 0.122). The wild population in Tazones, Asturias, is genetically closer to the nearby (ca. 50 m) ‘Tazones’ landrace Taz_EP_lr (0.013) than to any other wild population. This may confirm inter-crossing at this location. On the other hand, this pattern is not repeated in the nearby location of Torimbia, where Tor_w is genetically closer to more distant wild populations (‘Cudillero’: 0.012) than to the landrace in Torimbia (0.022), which was growing at a distance of ca. 0.5 km and was thus possibly not inter-crossing with it. The ‘Torimbia’ landrace is genetically closest to the ‘Laredo’ wild population (0.012), which is > 100 km away.
Looking at the genetically most distant accessions, B. rupestris from Palermo and B. drepanensis are the most distant from all the other accessions (in 98% and 93% of the cases, respectively). The most distant from B. drepanensis and B. rupestris from Palermo are the two B. cretica populations. For B. macrocarpa too, B. rupestris from Palermo is genetically the most distant, even though they are geographically rather close.
The closest population to B. rupestris-‘Palermo’ is B. rupestris-‘Taureana’ (0.332), but the opposite is not true, since B. rupestris-‘Taureana’ is closer to any other accession than to B. rupestris-‘Palermo’. The Shannon Index and allele polymorphism of B. rupestris-‘Taureana’ (0.263 and 46.55%, respectively) are the highest values of the entire series.
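The kind of tally summarised above (which population is most often the nearest neighbour of others) can be sketched from a pairwise distance matrix. In the example below, three distances are taken from the text (Tor_w to ‘Cudillero’ 0.012, Tor_w to the ‘Torimbia’ landrace 0.022, ‘Torimbia’ landrace to ‘Laredo’ 0.012); the remaining three values, the labels Cud_w and Lar_w, and the function names are hypothetical and serve only to complete a small matrix.

```python
from collections import Counter

# Pairwise distances; entries marked "hypothetical" are not from the study.
DISTANCES = {
    frozenset(("Tor_w", "Cud_w")): 0.012,
    frozenset(("Tor_w", "Torim_lr")): 0.022,
    frozenset(("Torim_lr", "Lar_w")): 0.012,
    frozenset(("Tor_w", "Lar_w")): 0.015,     # hypothetical
    frozenset(("Cud_w", "Lar_w")): 0.011,     # hypothetical
    frozenset(("Cud_w", "Torim_lr")): 0.025,  # hypothetical
}


def nearest_neighbours(distances):
    """Map each population to the population at the smallest distance."""
    names = sorted({name for pair in distances for name in pair})
    nearest = {}
    for name in names:
        candidates = [
            (dist, other)
            for pair, dist in distances.items() if name in pair
            for other in pair if other != name
        ]
        # min() on (distance, name) tuples breaks ties alphabetically.
        nearest[name] = min(candidates)[1]
    return nearest


def neighbour_tally(nearest):
    """Count how often each population is another population's nearest."""
    return Counter(nearest.values())


toy_nearest = nearest_neighbours(DISTANCES)
toy_tally = neighbour_tally(toy_nearest)
```

The example also shows why nearest-neighbour relations need not be symmetric, as noted for the two B. rupestris populations: a population can be the closest relative of another without the reverse being true.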
Comparison between B. oleracea and B. incana
A specific comparison was made between B. incana (108 individuals from six populations) and B. oleracea (470 wild and 101 cultivated individuals together).
The PCoA and STRUCTURE analysis (K = 2) neatly separate these two groups (Figures S3A & S3B, Online Resource 2), with no overlaps between them. The genetic distance is 0.075. Variance within populations is 74% and among populations is 26%. The Fst value of gene differentiation is 0.092. The same analyses were run keeping each accession separate, which enabled us to distinguish, within the cluster of B. incana, that the populations from Monte Leano, Lazio, and from Capri, Campania, form sub-clusters. The oleracea are almost all well grouped together, with only a few individuals slightly divergent (including landraces or wild populations).