Phenotypic Variation
In the present study, FF, FFC, FT, and SSC were measured over two years (2016 and 2017), and all four traits showed a normal distribution (distributions are based on accession means) (Online resource 2). These findings indicate the importance of the genetic background of each genotype for the Prunus phenotypic profile. The minimum, maximum and mean values were highly consistent between years, and the mean values did not differ significantly between 2016 and 2017. The minimum, maximum and mean values of all phenotypic traits are presented in Table 1.
Table 1
Minimum, maximum and mean values of all phenotypic traits
| Trait | 2016 Min | 2016 Max | 2016 Mean | 2017 Min | 2017 Max | 2017 Mean |
|---|---|---|---|---|---|---|
| FF | 0.1 N | 9.60 N | 1.86 N | 0.03 N | 6.62 N | 1.71 N |
| FFC | 11.6° | 46.72° | 34.04° | 13.99° | 46.77° | 34.12° |
| FT | 95 days | 125 days | 114.24 days | 95 days | 126 days | 112 days |
| SSC | 8.22 °Brix | 29.92 °Brix | 15.28 °Brix | 8.23 °Brix | 32.37 °Brix | 16.27 °Brix |
The mean values of the four traits (FF, FFC, FT and SSC) differed only slightly between the two years (2016 and 2017). Within each year, however, the minimum and maximum values of SSC and FFC differed roughly fourfold (Table 1). FT ranged from 95 to 125 days with a mean of 114.2 days in 2016, and from 95 to 126 days with a mean of 112 days in 2017 (Table 1). FF varied between 0.1 N and 9.60 N in 2016 and between 0.03 N and 6.62 N in 2017 (Table 1). There was a nearly 90-fold difference in the range of FF values across the two years. The individuals showing the highest and lowest average values are listed in Online resource 3.
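The fold differences discussed above can be recomputed directly from Table 1. The sketch below is only an illustration: it assumes that a "fold difference" means the ratio of the maximum to the minimum value within one year, which the text does not state explicitly, and the names `TABLE_1`, `fold_range` and `fold_table` are hypothetical.

```python
# Minimum and maximum values transcribed from Table 1 (units dropped).
TABLE_1 = {
    "FF":  {2016: (0.1, 9.60),   2017: (0.03, 6.62)},
    "FFC": {2016: (11.6, 46.72), 2017: (13.99, 46.77)},
    "FT":  {2016: (95, 125),     2017: (95, 126)},
    "SSC": {2016: (8.22, 29.92), 2017: (8.23, 32.37)},
}


def fold_range(minimum, maximum):
    """Return maximum / minimum (assumed definition of 'fold difference')."""
    return maximum / minimum


def fold_table(table):
    """Compute the max/min ratio for every trait and year, rounded to 2 decimals."""
    return {
        trait: {year: round(fold_range(lo, hi), 2) for year, (lo, hi) in years.items()}
        for trait, years in table.items()
    }


if __name__ == "__main__":
    for trait, by_year in fold_table(TABLE_1).items():
        for year, ratio in by_year.items():
            print(f"{trait} {year}: {ratio}-fold")
```

Ratios computed this way depend on the values exactly as printed in Table 1, so they may differ from the rounded figures quoted in the text.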
The correlation analysis showed no significant correlation among the four traits (FF, FFC, FT and SSC) (Online resource 4). ANOVA demonstrated significant variation due to year, genotype, and the year × genotype interaction for all apricot traits at the P ≤ 0.01 significance level (Online resource 5).
Population Structure Analysis
A total of 24,864 SNP markers were generated from the DArT analysis (Online resource 6), and after filtering [max 5% missing data, minor allele frequency (MAF > 0.5)], 11,532 high-quality SNP markers were obtained (Online resource 7). Because the apricot genome assembly consisted of scaffolds, the detected SNPs were located on scaffolds rather than chromosomes. The mean PIC value was 0.77, with values ranging between 0.05 and 0.99. The markers were assigned to their scaffolds and used in the STRUCTURE (v.2.2) analysis. This analysis was performed for K from 1 to 10, and the peak was observed at K = 3 according to the ΔK computation. The STRUCTURE results showed that the 259 genotypes were divided into three main populations, namely POPI (red), POPII (green) and POPIII (blue) (Online resource 8).
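The marker filtering step can be illustrated with a minimal sketch. It assumes genotypes are coded as the count of alternate alleles (0, 1 or 2) with `None` for a missing call, which is not necessarily the format of the DArT output used in the study. The function names are hypothetical, and the `min_maf` default of 0.05 is a commonly used convention chosen for illustration, not a value confirmed by the text.

```python
def missing_rate(calls):
    """Fraction of samples with no genotype call (None) at one SNP."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls):
    """Frequency of the less common allele among called diploid genotypes."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(genotypes, max_missing=0.05, min_maf=0.05):
    """Keep SNPs with at most `max_missing` missing calls and MAF above `min_maf`.

    `genotypes` maps a SNP id to a list of calls (0, 1, 2 or None).
    Both thresholds are illustrative defaults (assumptions).
    """
    kept = {}
    for snp_id, calls in genotypes.items():
        if missing_rate(calls) <= max_missing and minor_allele_frequency(calls) > min_maf:
            kept[snp_id] = calls
    return kept


# Toy data with hypothetical SNP ids, 20 samples each.
toy = {
    "snp_a": [0, 1, 2, 1] * 5,        # complete and polymorphic: kept
    "snp_b": [0] * 19 + [None],       # monomorphic: dropped by the MAF filter
    "snp_c": [None] * 4 + [1] * 16,   # 20% missing: dropped by the missing-data filter
}

if __name__ == "__main__":
    print(sorted(filter_snps(toy)))
```

Keeping both thresholds as parameters makes it easy to see how the number of retained markers changes with stricter or looser filtering.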
The genotypes were also divided into three groups according to Nei's genetic distance analysis (Online resource 9). The first group consisted of Geno 185 (Nigde, Turkey) and Geno 186 (Malatya, Turkey), the second comprised Geno 38 (Siverek/Urfa, Turkey), Geno 230 (USA) and Geno 255 (Russia), and the third contained the remaining 254 genotypes. These results indicate that the genotypes used in this study were not clustered according to their geographical origin.
The expected heterozygosity and the fixation index (Fst) describe the genetic diversity within populations and the differentiation among them. In this study, the expected heterozygosity was 0.20 for Cluster 1, 0.06 for Cluster 2, and 0.11 for Cluster 3, with a mean value of 0.12. The Fst value varied between 0.14 and 0.81 with a mean value of 0.55, indicating a high level of genetic variation in the population.
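As a reminder of what the expected heterozygosity measures, the sketch below uses the standard gene diversity definition, He = 1 − Σ pᵢ², averaged over loci. Whether the values reported above were computed in exactly this way is not stated in the text, so this is an assumption; the function names are hypothetical. The last lines simply check the reported mean against the three cluster values.

```python
def expected_heterozygosity(allele_freqs):
    """Gene diversity at one locus: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_expected_heterozygosity(loci):
    """Average the per-locus expected heterozygosity over a list of loci."""
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)


# Cluster values as reported in the text.
cluster_he = {"Cluster 1": 0.20, "Cluster 2": 0.06, "Cluster 3": 0.11}
mean_he = sum(cluster_he.values()) / len(cluster_he)

if __name__ == "__main__":
    # A biallelic SNP with equal allele frequencies is maximally diverse.
    print(expected_heterozygosity([0.5, 0.5]))
    print(round(mean_he, 2))
```

For a biallelic SNP the per-locus value cannot exceed 0.5, which is why values such as 0.06 to 0.20 indicate that many loci are close to fixation within a cluster.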
AM Analysis
AM analyses were carried out for four pomological traits (FF, FFC, FT and SSC) using TASSEL (v.5.2.3) software and the MLM (Q+K) model for each of two consecutive years (2016 and 2017). These analyses detected a large number of associations with the pomological traits. FDR and Bonferroni corrections were applied to eliminate false positives among the associations. In total, 131 SNP markers were associated with three traits (FF, FFC, and SSC) in both years. Of these, three, 57 and 71 SNPs were associated with FFC, FF and SSC, respectively.
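The two corrections named above can be sketched as follows. This is a generic illustration of the Bonferroni threshold and the Benjamini-Hochberg step-up procedure, not the TASSEL implementation; the significance level `alpha = 0.05` and the toy P values are assumptions, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P value cutoff that controls the family-wise error rate."""
    return alpha / n_tests


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of tests declared significant at FDR level `alpha`.

    Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha and
    reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Toy P values (made up for illustration only).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

if __name__ == "__main__":
    cutoff = bonferroni_threshold(0.05, len(toy_p))
    print([i for i, p in enumerate(toy_p) if p <= cutoff])
    print(benjamini_hochberg(toy_p))
```

Bonferroni is the stricter of the two, so with many thousands of markers it typically retains fewer associations than the FDR procedure.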
A total of 88 and 228 SNPs were associated with FF in 2016 and 2017, respectively (FDR correction applied, -log10 P ≥ 2.90 for 2016 and ≥ 2.78 for 2017), and 57 of these markers were common to both years (Online resources 10 and 11 and Fig. 1). Most of the significant SNPs for FF were detected in 2017 but not in 2016 (Online resources 10 and 11 and Fig. 1).
For FFC, three SNPs (-log10 P ≥ 3.27, FDR correction applied) and 13 SNPs (-log10 P ≥ 3.10, FDR correction applied) were associated with the trait in 2016 and 2017, respectively, and three of these SNPs (SNP 4257, SNP 17194 and SNP 22875) were common to both years (Online resource 10 and Fig. 2).
The marker-trait association analysis for FT revealed associations with 10 SNPs (FDR correction applied, -log10 P ≥ 3.28) in 2016 and 22 SNPs (FDR correction applied, -log10 P ≥ 3.06) in 2017. However, none of these SNPs was shared between the two years (Online resource 10 and Fig. 3).
For SSC, 167 SNPs (-log10 P ≥ 2.82, FDR correction applied) and 352 SNPs (-log10 P ≥ 2.72, FDR correction applied) were associated with the trait in 2016 and 2017, respectively. Of these SNPs, 71 were detected in both years (Fig. 4 and Online resources 10 and 12). The P values indicating the significance of the associations between the markers and pomological traits are shown as Q-Q plots in Online resource 13.
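To make the reported thresholds and overlaps easier to compare, the sketch below converts each -log10 P cutoff into a plain P value and computes what fraction of the SNPs significant in either year were significant in both. The numbers are transcribed from the text; the overlap measure (shared SNPs divided by the union of both years) is an illustrative choice rather than a statistic used in the study, and the names are hypothetical.

```python
# -log10 P thresholds per trait and year, as reported in the text.
THRESHOLDS = {
    "FF":  {2016: 2.90, 2017: 2.78},
    "FFC": {2016: 3.27, 2017: 3.10},
    "FT":  {2016: 3.28, 2017: 3.06},
    "SSC": {2016: 2.82, 2017: 2.72},
}

# (SNPs in 2016, SNPs in 2017, SNPs shared by both years), as reported.
COUNTS = {
    "FF":  (88, 228, 57),
    "FFC": (3, 13, 3),
    "FT":  (10, 22, 0),
    "SSC": (167, 352, 71),
}


def to_p_value(neg_log10_p):
    """Convert a -log10 P threshold back to a P value."""
    return 10 ** (-neg_log10_p)


def shared_fraction(n_2016, n_2017, n_shared):
    """Shared SNPs as a fraction of all SNPs significant in at least one year."""
    union = n_2016 + n_2017 - n_shared
    return n_shared / union


if __name__ == "__main__":
    for trait, by_year in THRESHOLDS.items():
        for year, threshold in by_year.items():
            print(f"{trait} {year}: P <= {to_p_value(threshold):.5f}")
        print(f"{trait}: shared fraction = {shared_fraction(*COUNTS[trait]):.3f}")
    # The shared counts add up to the total number of stable associations.
    print(sum(shared for _, _, shared in COUNTS.values()))
```

The sum of the shared counts (57 + 3 + 0 + 71) reproduces the 131 SNPs reported for FF, FFC and SSC.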
Identification of Candidate Genes
A total of 30 putative candidate genes were found to be related to the SNPs associated with FF and SSC (Online resource 14). The SNPs associated with FFC did not show similarity to any of the putative candidate genes. For the SSC and FF of the apricots, the following proteins and enzymes corresponding to putative candidate genes showed homology with the SNPs (given in parentheses): putative 3,4-dihydroxy-2-butanone kinase (SNP526), putative leucine-rich repeat receptor-like protein kinase (SNP1023), pentatricopeptide repeat-containing protein (SNP1482), probable LRR receptor-like serine/threonine-protein kinase (SNP1494), vignain-like (SNP2823), transcription termination factor MTERF6 (SNP3309), LRR receptor-like serine/threonine-protein kinase GSO2 (SNP3842), WAT1-related protein At5g64700 (SNP4443 and SNP20435), cell division cycle 20.2, cofactor of APC complex-like (SNP4948), long-chain acyl-CoA synthetase 8 (SNP5158), calcium-transporting ATPase 12 plasma membrane-type-like (SNP5444), probable LRR receptor-like serine/threonine-protein kinase At4g36180 (SNP13398), peroxidase 5 (SNP14227), UDP-glycosyltransferase TURAN (SNP15803), tRNA (guanine(37)-N1)-methyltransferase 1 (SNP16184), transcription factor TFIIIB component B'' homolog (SNP16422), receptor like protein 30-like (SNP17431), putative disease resistance protein (SNP19126), putative pre-16S rRNA nuclease (SNP19568), DNA-directed RNA polymerase III subunit (SNP20753), DEAD-box ATP-dependent RNA helicase 3 (SNP22678), putative pentatricopeptide repeat-containing protein (SNP23251), S locus F-box protein f (SLFf) gene, partial cds; and Sf-RNase and S haplotype-specific F-box protein f (SFBf) genes, complete cds (SNP23325), putative disease resistance RPP13-like protein 1 (SNP23732), TMV resistance protein N-like (SNP23833), putative disease resistance protein (SNP23953), LRR receptor-like serine/threonine-protein kinase GSO2 (SNP24483), and receptor-like protein 12 (SNP24611) (Online resource 14).