Variation in six fruit phenotypic traits, seed length, and seed width was observed between and within the 13 Kazakh melon groups and one unknown melon group (Table 1). Kazakh melons had a seed length greater than 9.76 mm, classifying them as large-seeded melons, with the exception of two small-seeded accessions from Group Agrestis with a seed length of 5.76 mm. The coefficient of variation across melons was largest for fruit weight, followed by fruit length and fruit shape. This ranking of relative variance among the fruit traits was also supported by the F values from ANOVA, although the highest F value was detected for fruit length. Fruit length was highly correlated with both fruit weight and fruit shape, with correlation coefficients greater than 0.70 (Supplementary Table 4). Fruit length increased gradually in the order Chandalak, Cassaba, Sary Guliabi/Basvaldy, Kara Guliabi, and Ameri, implying that fruit weight increased and that fruit shape changed from globular to oblong (Fig. 1A). Soluble solids content (SSC) was independent of the other five phenotypic traits based on the correlation coefficients (r = -0.128 to -0.058). Group Chandalak showed higher SSC values in the fruit pulp than the other melon groups (Fig. 1B), although the coefficient of variation indicated little difference in SSC between melon groups, including the hybrid population (Table 1). PCO of the six fruit phenotypic traits produced three coordinates that represented 93.8% of the cumulative variance (Supplementary Table 5), and both the first and second coordinates and the first and third coordinates clearly separated Group Agrestis from the remaining Kazakh melon groups, which were intermixed with each other (Supplementary Fig. 2AB).
Consequently, the fruit trait values showed specific characteristics in Kazakh melons: gradual differentiation of fruit length together with fruit weight and fruit shape changes and conserved fruit width, thickness, and SSC among Kazakh melon groups.
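As an illustration of the two summary statistics used above, the following minimal sketch computes a coefficient of variation and a Pearson correlation coefficient. It assumes that the variance coefficient in Table 1 is the sample standard deviation expressed as a percentage of the mean, which the text does not state explicitly. All trait values and variable names are hypothetical and are not taken from Table 1 or Supplementary Table 4.

```python
from math import sqrt
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-group trait means, invented for illustration only.
fruit_length_cm = [12.0, 15.5, 19.0, 24.0, 30.5]
fruit_weight_kg = [0.9, 1.4, 2.0, 2.9, 4.1]
ssc_brix = [11.0, 9.5, 10.8, 9.9, 10.4]

cv_weight = coefficient_of_variation(fruit_weight_kg)
cv_ssc = coefficient_of_variation(ssc_brix)
r_length_weight = pearson_r(fruit_length_cm, fruit_weight_kg)
r_length_ssc = pearson_r(fruit_length_cm, ssc_brix)

print(f"CV weight = {cv_weight:.1f}%, CV SSC = {cv_ssc:.1f}%")
print(f"r(length, weight) = {r_length_weight:.3f}")
print(f"r(length, SSC) = {r_length_ssc:.3f}")
```

With toy data of this shape, a trait that scales with fruit length gives a large CV and a strong correlation, whereas a conserved trait such as SSC gives a small CV and a correlation near zero, which is the pattern described in the text.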
The DNA fingerprints of the CAPS and dCAPS markers and insertion or deletion markers corresponded with the respective nucleotide sequences in five regions of the chloroplast genome (Supplementary Table 6). Eighty-seven Kazakh melon accessions, excluding two reference Kazakh accessions, were classified into three cytoplasm genotypes by those markers: 2 accessions were classified into Ia, 52 were classified into Ib-1/-2, and 33 were classified into Ib-3. The Ib-1/-2 and Ib-3 cytoplasm genotypes were dominant.
Thirteen RAPD markers and 11 SSR markers generated a total of 92 alleles in the 202 melon accessions examined, of which 70 alleles were detected in the 87 Kazakh melon accessions, excluding the two reference accessions from Kazakhstan (Table 2). The number of alleles per SSR locus ranged from two to four in the Kazakh melon accessions, and no alleles unique to these accessions were detected. The expected heterozygosity (He) ranged from 0.022 to 0.763 and was correlated with the polymorphic information content (PIC; r = 0.758). The mean He was higher for SSR markers than for RAPD markers in both the Kazakh melon accessions (0.390 and 0.133, respectively) and the reference accessions (0.580 and 0.383, respectively), and the mean He was lower in the Kazakh melon accessions than in the reference accessions. Heterozygosity within SSR loci was observed in the Kazakh melon accessions, although the observed heterozygosity (Ho) values ranged from 0.011 to 0.213 (mean: 0.055) and were lower than the He values (range: 0.022 to 0.634; mean: 0.390). Thus, the SSR markers were more efficient than the RAPD markers at detecting polymorphisms.
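The diversity statistics in this paragraph can be sketched for a single codominant locus as follows. The formulas are the commonly used definitions (He as one minus the sum of squared allele frequencies, PIC in the form given by Botstein et al.); the text does not state which estimators were applied, so this is an assumption. The genotypes and allele sizes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies from a list of diploid genotypes (allele pairs)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def observed_heterozygosity(genotypes):
    """Ho = proportion of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    ps = list(freqs.values())
    cross = sum(2 * p * p * q * q for p, q in combinations(ps, 2))
    return 1.0 - sum(p * p for p in ps) - cross


# Hypothetical SSR genotypes (allele sizes in bp) for ten accessions.
genotypes = [(150, 150)] * 6 + [(150, 154)] + [(154, 154)] * 3

freqs = allele_frequencies(genotypes)
he = expected_heterozygosity(freqs)
ho = observed_heterozygosity(genotypes)
pic = polymorphic_information_content(freqs)
print(f"He = {he:.3f}, Ho = {ho:.3f}, PIC = {pic:.3f}")
```

In this toy locus most accessions are homozygous, so Ho falls well below He, mirroring the relationship between the Ho and He values reported for the Kazakh accessions.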
To clarify the genetic relationships between melons from Kazakhstan and neighbouring countries, and their genetic variation, genetic classification was carried out for all melon accessions, including the reference accessions. The pairwise genetic distances (GDs) between the 202 melon accessions were calculated from the RAPD and SSR data and ranged from 0 to 0.936, with an average of 0.393 (data not shown). The GDs calculated from the combined RAPD and SSR data were strongly correlated with the GDs calculated from the RAPD data alone (r = 0.966, P < 0.01) and from the SSR data alone (r = 0.874, P < 0.01). The GDs calculated from the RAPD data and from the SSR data were also highly correlated with each other (r = 0.718, P < 0.01). The genetic relationships between the 202 melon accessions were visualized by UPGMA cluster analysis based on the RAPD genetic distance and SSR allele frequency (Fig. 2). The 202 melon accessions were grouped into seven groups, which were assigned as Cluster I to Cluster VII.
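The distance-based clustering step can be sketched as below. The distance measure is not specified in this section, so the sketch assumes a Jaccard-type dissimilarity over binary band scores purely for illustration; the accession names and band profiles are hypothetical. UPGMA is implemented directly as average-linkage clustering, in which the distance between two clusters is the mean of all pairwise distances between their members.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - (shared bands / bands present in either profile)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - shared / either


def upgma(names, profiles):
    """Return a nested-tuple tree built by average-linkage (UPGMA) clustering."""
    n = len(names)
    dist = {frozenset((i, j)): jaccard_distance(profiles[i], profiles[j])
            for i, j in combinations(range(n), 2)}
    # Each cluster is (subtree, list of member indices).
    clusters = [(names[i], [i]) for i in range(n)]
    while len(clusters) > 1:
        def mean_dist(pair):
            (_, m1), (_, m2) = pair
            total = sum(dist[frozenset((i, j))] for i in m1 for j in m2)
            return total / (len(m1) * len(m2))
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=mean_dist)
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]


# Hypothetical presence/absence band scores for four accessions.
names = ["K1", "K2", "R1", "R2"]
profiles = [
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
]

tree = upgma(names, profiles)
print(tree)
```

For real data sets of several hundred accessions, an established implementation of average-linkage clustering would normally be preferred over this quadratic-memory sketch.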
The STRUCTURE simulations based on the admixture model were performed using all 202 accessions. The LnP(D) and Delta K values suggested the presence of two groups in the 202 accessions; 181 accessions were allocated, with assignment probabilities (Q) > 0.70, to the two groups designated STI and STII, and the remainder were placed in the admixed group STAD. Substructuring under the topmost hierarchy was detected for the accessions in STI using a similar approach. Consequently, using model-based classification, the 202 accessions were divided into STI (159 accessions), STII (22), and the admixed group STAD (21), and STI was separated into five subgroups, STIa-1 (21), STIa-2 (20), STIa-3 (11), STIb-1 (26), and STIb-2 (10), together with three admixed subgroups, STIAD (44), STIaAD (17), and STIbAD (10) (Fig. 2). This model-based classification was significantly correlated with the distance-based classification by the UPGMA cluster analysis (χ2 = 643.83, P < 0.01; Cramér's V = 0.73, P < 0.01). The combined results of these two classifications provided the following phylogenetic overview: STIa, located in Clusters I–III, showed some divergence from STIb, which was mainly found in Clusters III–V; STIAD overlapped with STIa and STIb; STII, in Cluster VII, was grouped alone; and STAD, in Clusters V and VI, was an intermediate group between STI and STII.
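The association between the two classifications can be illustrated with the standard formulas for Pearson's chi-square statistic and Cramér's V. The sketch below is not the authors' analysis: the 2 x 2 table is hypothetical, and the final check assumes that the contingency table had ten model-based categories and seven UPGMA clusters, so that the smaller dimension minus one equals six. Under that assumption, the reported χ2 of 643.83 for 202 accessions reproduces the reported V of 0.73.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for row, rt in zip(table, row_totals):
        for observed, ct in zip(row, col_totals):
            expected = rt * ct / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    return sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))


# Hypothetical table with perfect agreement between two classifications.
toy = [[10, 0], [0, 10]]
toy_chi2 = chi_square(toy)
toy_v = cramers_v(toy_chi2, 20, 2, 2)

# Consistency check on the reported values, under the stated assumption
# of a 10 x 7 table (the table dimensions are not given in the text).
reported_v = cramers_v(643.83, 202, 10, 7)
print(f"toy V = {toy_v:.2f}, V from reported chi-square = {reported_v:.2f}")
```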
To visualize the genetic groups associated with Kazakh melon development, the cytoplasm genotype, which represents the maternal lineage, was combined with the subgroups from the model-based and distance-based classifications (Fig. 3). The model-based classification revealed the following trends for the cytoplasm genotype: divergence of East Asian melons from European and American melons, and a close relationship between Kazakh melons and those from nearby areas in Central Asia and Russia (Fig. 3A). STIb-1 melons and/or Ia cytoplasm melons were rare in Kazakh melons (two accessions). In contrast, among melons with Ib cytoplasm, the subgroup STIAD, an admixture of STIa and STIb, was frequent in Central Asian (Turkmenistan, Uzbekistan, and Tajikistan), Russian, northwestern Chinese, and Kazakh melons and is thought to be a key genetic group for melon development in these areas. Subgroups STIa-1 and STIa-2 were specific to Kazakh melons, and in combination with the Ib cytoplasm genotypes, STIa-1 with Ib-3 cytoplasm and STIa-2 with Ib-1/-2 cytoplasm formed unique genetic groups. STIa-1 with Ib-3 cytoplasm was frequent in Cluster I and rare in Cluster II (χ2 = 279.4, P < 0.01), whereas subgroup STIa-2 with Ib-1/-2 cytoplasm was evenly spread across Cluster I to Cluster III (χ2 = 5.4, P = 0.25; Fig. 3B). Similar to STIAD, these two subgroups were detected in the Kazakh melon groups Ameri, Cantalupensis, Chandalak, Zard, and Cassaba and subgroups Basvaldy, Guliabi, and Kara Guliabi, as well as in the unknown melon group, which was prevalent in Kazakh melon (Fig. 4). Thus, the three subgroups indicated a close relationship among the Kazakh melon groups.
This close relationship was supported by the phylogenetic analysis of 17 melon populations and their fixation index (FST) values. Kazakh melon groups clustered together on the UPGMA tree, although groups Ameri, Cassaba, and Zard had fruit phenotypes similar to those of the northwestern Chinese Group Ameri, Spanish Group Cassaba, and northwestern Chinese Group Zard, respectively (Fig. 5). This clustering was in agreement with the FST values between the 17 melon populations: the mean FST values between each Kazakh melon group and the remaining Kazakh melon groups ranged from 0.094 to 0.186 and were smaller than those with melons from areas near Kazakhstan (FST = 0.138–0.272) and from Spain and the USA (FST = 0.280–0.462) (Supplementary Table 7). Thus, the Kazakh melon groups were genetically similar to each other, indicating low genetic diversity. This low genetic diversity was well supported by the mean gene diversity, which was 0.224 for Kazakh melons and 0.158 for northwestern Chinese melons, and the values for Kazakh melon groups (0.099 to 0.253) were lower than those for Iranian, Afghan, Pakistani, and Central Asian melons (0.331 to 0.363) (Fig. 3A). Thus, the Kazakh melons examined here showed lower genetic variation than those other melons. Even between the three Kazakh provincial melon populations from Zhambyl, South Kazakhstan, and Kyzylorda, low divergence was detected by AMOVA, with only 3% of the total variance attributed to differences among the populations (Supplementary Fig. 3). The FST values among these three provincial melon populations were less than 0.036, indicating similar genetic components among them.
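To make the interpretation of the fixation index concrete, the following sketch computes FST for a single locus as (HT - HS) / HT, where HS is the mean expected heterozygosity within populations and HT is the expected heterozygosity of the pooled allele frequencies. This is one simple textbook form with equal population weights; the estimator actually used for Supplementary Table 7 is not stated here, so it should be read as an assumption. The allele frequencies are hypothetical.

```python
def fst(pop_freqs):
    """Single-locus FST = (HT - HS) / HT with equally weighted populations.

    pop_freqs is a list of {allele: frequency} dicts, one per population.
    """
    alleles = set().union(*pop_freqs)
    k = len(pop_freqs)
    # HS: mean within-population expected heterozygosity.
    hs = sum(1.0 - sum(p.get(a, 0.0) ** 2 for a in alleles)
             for p in pop_freqs) / k
    # HT: expected heterozygosity from the mean allele frequencies.
    mean_freq = {a: sum(p.get(a, 0.0) for p in pop_freqs) / k for a in alleles}
    ht = 1.0 - sum(f ** 2 for f in mean_freq.values())
    return 0.0 if ht == 0 else (ht - hs) / ht


# Hypothetical allele frequencies at one biallelic locus.
similar = [{"A": 0.6, "B": 0.4}, {"A": 0.5, "B": 0.5}]
divergent = [{"A": 0.9, "B": 0.1}, {"A": 0.2, "B": 0.8}]

fst_similar = fst(similar)
fst_divergent = fst(divergent)
print(f"similar populations: FST = {fst_similar:.3f}")
print(f"divergent populations: FST = {fst_divergent:.3f}")
```

Populations with nearly identical allele frequencies give an FST close to zero, as reported for the three provincial populations, whereas populations with contrasting frequencies give values of the magnitude reported between Kazakh melons and those from Spain and the USA.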
The two subgroups STIa and STIAD were dominant in Kazakh melons (Fig. 3A), while the admixed group STAD with STI and STII was a distantly related group to Kazakh melons (Fig. 2). Two accessions from the Kazakh Group Agrestis were allocated to STAD but shared Ib-1/-2 cytoplasm with cultivated Kazakh melons. Their location on the UPGMA tree was Cluster V, where one accession with Ib-1/-2 and two accessions with Ib-3 were included among cultivated Kazakh melons. Thus, the Kazakh Group Agrestis examined here showed phylogenetic similarity with a few cultivated Kazakh melons.