Genetic variance
The selection strategies and frameworks were first compared in terms of changes in genetic variance for the three simulated traits: days to flowering (DF), white mold tolerance (WM), and seed yield (SY). Genetic variance was expressed as a relative percentage, with cycle 0 defined as 100%. Differing numbers of initial parents were also compared for each trait. The analysis of variance (ANOVA) for additive genetic variance revealed that the effects of strategy, framework, and number of parents were all statistically significant. As expected, relative genetic variance decreased over the five cycles. For days to flowering, less relative genetic variance was maintained as the number of initial parents increased (Supplemental Fig. 2). Similar trends were observed for white mold tolerance (Supplemental Fig. 3) and seed yield (Supplemental Fig. 4). Genomic selection maintained equal or greater genetic variance compared to conventional breeding, whereas speed breeding maintained less genetic variance than both conventional breeding and genomic selection. Interestingly, for seed yield under the mass selection strategy, genomic selection maintained more genetic variance than conventional breeding.
For days to flowering, bulk breeding maintained the greatest amount of genetic variance in most scenarios. With 30 initial parents under genomic selection, the modified pedigree method maintained the most genetic variance. With 60 initial parents under genomic selection, mass selection maintained the most genetic variance.
For white mold tolerance, bulk breeding led to the greatest genetic variance maintained when the parental population size was 15. For parental population sizes of 30, 60, and 100, mass selection resulted in the most genetic variance maintained.
For seed yield, mass selection resulted in the most genetic variance being maintained for most scenarios. With 15 initial parents under conventional and speed breeding, bulk breeding led to the greatest genetic variance maintained.
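The relative scale used above (cycle 0 defined as 100%) can be illustrated with a minimal sketch. The function name and the variance values below are hypothetical and serve only to show the arithmetic; they are not taken from the simulations.

```python
import numpy as np

def relative_genetic_variance(variance_by_cycle):
    """Express per-cycle additive genetic variance as a percentage of cycle 0."""
    v = np.asarray(variance_by_cycle, dtype=float)
    if v[0] <= 0:
        raise ValueError("cycle 0 variance must be positive")
    return 100.0 * v / v[0]

# Hypothetical variances for cycles 0 to 5 (illustrative numbers only).
example_variance = [2.0, 1.5, 1.0, 0.8, 0.5, 0.4]
relative = relative_genetic_variance(example_variance)
```

On this scale, scenarios with different starting variances can be compared directly, because each trajectory begins at 100%.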
Fixation of favourable alleles and Hamming distance
The fixation of favourable alleles was plotted over 10 cycles. Figures 3, 4 and 5 display the plots for the fixation of favourable alleles in days to flowering, white mold tolerance, and seed yield, respectively. For days to flowering (Fig. 3), a lower percentage of alleles was fixed as the parental population size increased. Across all scenarios, the pedigree method had the fastest allele fixation rate, whereas mass selection had the slowest and resulted in the fewest alleles being fixed. The scenario resulting in the greatest percentage of fixed alleles was single seed descent under genomic selection with 15 parents, where 93.68% of favourable alleles were fixed.
For white mold tolerance, multiple scenarios led to 100% of favourable alleles being fixed (Fig. 4). In general, a higher percentage of alleles was fixed as the parental population size increased. Under genomic selection with 100 initial parents, the pedigree method fixed 100% of favourable alleles in only 2 cycles, the greatest percentage of fixed alleles in the fewest cycles of any scenario. Across all scenarios, the pedigree method had the fastest allele fixation rate.
For seed yield, a parental population size of 15 resulted in the greatest fixation of alleles (Fig. 5). The scenario resulting in the highest percentage of fixed favourable alleles was single seed descent under speed breeding with 15 initial parents, where 98.91% of alleles were fixed.
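As an illustration of how a fixation percentage of this kind can be computed, the sketch below assumes diploid genotypes coded as the number of copies of the favourable allele (0, 1, or 2) in an individuals-by-loci matrix. The coding, the function name, and the example matrix are assumptions made for illustration and are not taken from the simulation software.

```python
import numpy as np

def percent_favourable_fixed(genotypes):
    """Percentage of loci at which every individual is homozygous
    for the favourable allele (assumed coding: 2 = two favourable copies)."""
    g = np.asarray(genotypes)
    fixed = np.all(g == 2, axis=0)  # one boolean per locus
    return 100.0 * fixed.mean()

# Hypothetical population: 3 individuals (rows) by 4 loci (columns).
example_genotypes = np.array([
    [2, 2, 1, 0],
    [2, 2, 2, 0],
    [2, 2, 2, 1],
])
pct_fixed = percent_favourable_fixed(example_genotypes)
```

In this toy population the first two loci are fixed for the favourable allele and the last two are still segregating.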
The plots for average Hamming distance are displayed in Supplemental Figs. 5, 6 and 7. Overall, Hamming distance decreased across cycles and eventually plateaued. For days to flowering (Supplemental Fig. 5), Hamming distance was higher in scenarios with larger parental population sizes, particularly for 60 and 100 parents. Across all scenarios, mass selection had the highest Hamming distance. This was especially pronounced under genomic selection when 30 and 100 parents were simulated. The three frameworks (conventional breeding, speed breeding, and genomic selection) were comparable, with only minor differences. Under conventional and speed breeding, bulk breeding and single seed descent resulted in the lowest Hamming distance. Under genomic selection, the strategy with the smallest Hamming distance depended on the parental population size: bulk breeding, single seed descent, the pedigree method, and the modified pedigree method led to the smallest Hamming distance for parental population sizes of 15, 30, 60, and 100, respectively.
For white mold tolerance (Supplemental Fig. 6), larger parental population sizes produced smaller Hamming distances in the selected individuals. In addition, differences between the strategies were only observed with fewer initial parents. Across all scenarios, mass selection resulted in the largest Hamming distance. The three frameworks (conventional breeding, speed breeding, and genomic selection) led to similar results. With 15 initial parents, bulk breeding resulted in the smallest Hamming distance. For 30 parents under conventional and speed breeding, all strategies except mass selection led to the same Hamming distance. Under genomic selection with 30 parents, bulk breeding, single seed descent, and the modified pedigree method had the smallest Hamming distance. When the parental population size was 60 or 100, all strategies except mass selection resulted in the same Hamming distance after 10 cycles.
For seed yield (Supplemental Fig. 7), a parental population size of 15 led to a smaller Hamming distance compared to larger parental population sizes. As with white mold tolerance, differences between the strategies were more noticeable with fewer initial parents. Mass selection consistently resulted in the largest Hamming distance across all scenarios. When comparing the Hamming distance observed in the final cycle, conventional breeding, speed breeding, and genomic selection produced similar results. Notably, mass selection had a much larger Hamming distance under genomic selection than under the other frameworks. For 15 parents, single seed descent led to the smallest Hamming distance. For 30, 60, and 100 parents, all strategies except mass selection resulted in the same Hamming distance.
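For readers unfamiliar with the metric, an average Hamming distance among selected individuals can be sketched as the mean number of marker positions at which two individuals differ, taken over all pairs. The marker coding, the function name, and the example data below are illustrative assumptions; whether the reported values were raw counts or were normalized by the number of loci is not specified here.

```python
from itertools import combinations

import numpy as np

def mean_pairwise_hamming(genotypes):
    """Mean number of differing marker positions over all pairs of individuals."""
    g = np.asarray(genotypes)
    if g.shape[0] < 2:
        raise ValueError("at least two individuals are required")
    distances = [np.count_nonzero(a != b) for a, b in combinations(g, 2)]
    return float(np.mean(distances))

# Hypothetical marker calls: 3 individuals (rows) by 4 loci (columns).
example_markers = np.array([
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
])
avg_hamming = mean_pairwise_hamming(example_markers)
```

A value of zero would indicate that all selected individuals are identical at the scored loci, which is consistent with the plateau seen as populations become more uniform.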
Genetic gain
The relative genetic gain, averaged across runs, was determined for each cycle in every simulation scenario. Figure 6 displays the trend in genetic gain for the five strategies, as well as the cumulative genetic gain averaged across strategies, when days to flowering was selected. Figure 7 displays genetic gain for white mold tolerance, while Fig. 8 shows the genetic gain plot for seed yield.
For days to flowering genetic gain (Fig. 6), a parental population size of 100 led to the greatest percent cumulative genetic gain, followed by 30, 15, and 60 initial parents. The parental population size of 100 resulted in a maximum of 50% cumulative genetic gain, while the parental population size of 60 led to a minimum of 36%. Conventional and speed breeding resulted in greater cumulative genetic gains than genomic selection for all parental population sizes.
For white mold tolerance genetic gain (Fig. 7), a parental population size of 30 led to the greatest cumulative genetic gain, followed by 15, 100, and 60. Interestingly, genomic selection resulted in cumulative gains similar to those of conventional and speed breeding when the parental population size was 30, 60, or 100. In contrast, genomic selection had much lower cumulative gains than conventional and speed breeding when 15 parents were used. The parental population size of 30 resulted in a maximum of 49% cumulative genetic gain, whereas the parental population size of 15 led to a minimum of 37% cumulative genetic gain.
For seed yield genetic gain (Fig. 8), a larger parental population size resulted in greater cumulative genetic gains, with 100 parents leading to the highest cumulative genetic gains. In general, conventional and speed breeding led to higher cumulative genetic gains compared to genomic selection. The parental population size of 100 resulted in a maximum of 50% cumulative genetic gain. Meanwhile, the parental population size of 15 led to a minimum of 29% cumulative genetic gain.
The proportion of cumulative genetic gain achieved by each cycle, averaged across all strategies, was also determined (Figs. 6–8). In the simulation for days to flowering, the strategies had on average achieved between 91 and 96% of cumulative genetic gain by cycle five under the conventional framework. Meanwhile, for speed breeding, 91 to 96% of cumulative genetic gain was achieved within the first three cycles. Lastly, for genomic selection, 89 to 98% of the cumulative genetic gain was achieved in the first 6 cycles.
In the simulation for improving white mold tolerance, 83 to 97% of cumulative genetic gain was achieved by cycle 3 under the conventional framework. Meanwhile, speed breeding led to 83 to 97% of cumulative genetic gain in the first 2 cycles. Under genomic selection, 93 to 96% of cumulative genetic gain was observed.
Figure 9 shows the number of cycles required for 95% cumulative ΔG. On average, 3.31 cycles were necessary to achieve 95% cumulative ΔG. The scenario requiring the fewest cycles to obtain 95% cumulative ΔG was dependent on the trait. For days to flowering, the pedigree method under speed breeding with 60 parents required only 1.12 cycles to achieve 95% cumulative ΔG. For white mold tolerance, the pedigree method under speed breeding with 30 initial parents required 1.02 cycles. For seed yield, the pedigree method under speed breeding with 30 initial parents allowed for 95% cumulative ΔG to be obtained in 1.04 cycles.
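One plausible way to obtain a "cycles to 95% cumulative ΔG" value is sketched below. It assumes that, within each run, the proportion of cumulative gain at a cycle is the gain since cycle 0 divided by the total gain at the final cycle, and that fractional values such as those reported arise from averaging the first qualifying cycle across runs. Both points are assumptions, and the function name and trajectories are hypothetical.

```python
import numpy as np

def cycles_to_fraction(mean_genetic_value, fraction=0.95):
    """First cycle at which the given fraction of the total (final-cycle)
    gain has been reached within a single run."""
    m = np.asarray(mean_genetic_value, dtype=float)
    total_gain = m[-1] - m[0]
    if total_gain <= 0:
        raise ValueError("the run shows no positive cumulative gain")
    proportion = (m - m[0]) / total_gain
    # argmax on a boolean array returns the index of the first True value.
    return int(np.argmax(proportion >= fraction))

# Hypothetical mean genetic values per cycle (cycle 0 first) for two runs.
run_a = [10.0, 14.0, 14.9, 15.0]  # reaches 95% of the total gain at cycle 2
run_b = [10.0, 14.8, 15.0, 15.0]  # reaches 95% of the total gain at cycle 1
average_cycles = float(np.mean([cycles_to_fraction(run_a),
                                cycles_to_fraction(run_b)]))
```

Averaging an integer cycle count over many runs gives a non-integer summary, which is one reading of values such as 1.12 or 3.31 cycles.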
The average ΔG per cycle was determined for all scenarios (Fig. 10). On average, 5.25% ΔG could be obtained per cycle. The scenario resulting in the greatest ΔG per cycle varied depending on the trait being selected. For days to flowering, single seed descent with 100 initial parents under speed breeding led to 8.45% ΔG per cycle. For white mold tolerance, bulk breeding with 15 initial parents under speed breeding resulted in 8.32% ΔG per cycle. For seed yield, single seed descent, pedigree method, and modified pedigree method with 100 initial parents under speed breeding each led to 8.69% ΔG per cycle.
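The arithmetic behind an average percent ΔG per cycle can be sketched as follows, under the assumption that gain is expressed relative to the cycle 0 mean and averaged over the number of selection cycles. The exact definition used for Fig. 10 is not given here, so the formula, the function name, and the values are illustrative only.

```python
import numpy as np

def average_percent_gain_per_cycle(mean_genetic_value):
    """Average per-cycle gain as a percentage of the cycle 0 mean
    (assumed definition, for illustration only)."""
    m = np.asarray(mean_genetic_value, dtype=float)
    if m[0] == 0:
        raise ValueError("cycle 0 mean must be non-zero for a relative gain")
    per_cycle_gain = 100.0 * np.diff(m) / m[0]  # gain of each cycle, in % of cycle 0
    return float(per_cycle_gain.mean())

# Hypothetical mean genetic values for cycles 0 to 3.
example_means = [100.0, 110.0, 115.0, 120.0]
avg_gain = average_percent_gain_per_cycle(example_means)
```

Under this definition, a 20% cumulative gain over three cycles corresponds to about 6.67% ΔG per cycle, regardless of how unevenly the gain was distributed among cycles.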
Principal component analysis (PCA)
Principal component analyses were conducted to examine patterns in simulation outputs immediately after the first cycle, where each run is represented by a single point on the biplot. Results are shown in Fig. 11 with eigenvector loadings of various population and quantitative genetics statistics, such as effective population size, fixation of favourable alleles, Hamming distance, and genetic gain. These statistics were evaluated on the simulated populations under selection for three traits (days to flowering, white mold tolerance, and seed yield) with varying initial parental population sizes, heritabilities, breeding frameworks, and selection strategies.
The PCA biplots also included genomic selection and speed breeding as alternative frameworks to conventional breeding, each with the five selection strategies (bulk, mass, pedigree, modified pedigree, and single seed descent). For days to flowering (Fig. 12a), a large, roughly linear cluster representing a parental population size of 100 can be observed to the right of the PCA plot, between the eigenvectors for genetic gain and Hamming distance. At the extreme of the eigenvector for Hamming distance was the cluster of runs for mass selection under genomic selection. There was a cluster for the pedigree method with 100 parents under conventional breeding in the direction of the genetic gain eigenvector. At the extreme of the eigenvector for effective population size, there was a cluster corresponding to bulk breeding with a parental population size of 100 under speed breeding. A cluster representing the pedigree method with 15 and 30 parents under speed breeding formed at the extreme of the eigenvector for the fixation of favourable alleles. Between the eigenvectors for fixed favourable alleles and genetic gain, a cluster corresponding to the pedigree method under genomic selection and speed breeding was found. A cluster representing both single seed descent and the modified pedigree method was located closer to the center of the plot, along the axis of the genetic gain vector. Between the eigenvectors for fixed favourable alleles and effective population size, a cluster consisting of multiple strategies, including mass selection, the pedigree method, single seed descent, and the modified pedigree method, was found.
For white mold tolerance, the first two principal components accounted for 81.8% of the variance (Fig. 12b). Notably, fewer distinct clusters formed, with most points concentrated in the center of the plot. At the extreme of the effective population size eigenvector, there was a line-like cluster representing the pedigree method under speed breeding. Between the eigenvectors for effective population size and fixed favourable alleles, there was a cluster consisting of single seed descent and the modified pedigree method under speed breeding. Between the eigenvectors for Hamming distance and effective population size, there were many points corresponding to mass selection. Points reflecting all the strategies were dispersed between the vectors for Hamming distance and genetic gain, with larger parental population sizes concentrated towards the center of the plot. At the extreme of the genetic gain vector, there were many points representing the pedigree method with 15 and 30 parents under conventional breeding.
For seed yield (Fig. 12c), the two major principal components explained 72.3% of the variance. In the outermost region of the plot, there were a number of points representing bulk breeding with 100 parents under conventional breeding between the vectors for Hamming distance and effective population size. Towards the center of the plot, there were clusters for bulk breeding that corresponded to speed breeding and genomic selection, as well as mass selection. There was a distinct cluster for mass selection with 100 parents under genomic selection that was in the direction of the Hamming distance eigenvector. In the direction of the genetic gain eigenvector, there was a cluster corresponding to the pedigree method under conventional breeding. Meanwhile, there was a sparse cluster along the fixed favourable alleles eigenvector, which consisted of the pedigree method, single seed descent, and the modified pedigree method. More points representing single seed descent and the modified pedigree method with 100 parents were found in the center of the plot. In the extreme of the fixed favourable allele eigenvector were points corresponding to the pedigree method with 30 parents under speed breeding.
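The quantities read off the biplots above (run coordinates, eigenvector loadings, and the proportion of variance explained by the first two components) can be illustrated with a minimal PCA sketch based on a singular value decomposition of standardized statistics. This is a generic illustration and not the software used for the analysis. The column order (effective population size, fixed favourable alleles, Hamming distance, genetic gain) and the random input are hypothetical stand-ins for per-run simulation outputs.

```python
import numpy as np

def pca_biplot_parts(stats):
    """Return run scores, variable loadings, and explained-variance ratios
    for the first two principal components of a runs-by-statistics matrix."""
    x = np.asarray(stats, dtype=float)
    # Standardize each statistic so that differing units do not dominate.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = u * s    # coordinates of each run (one point per run)
    loadings = vt.T   # one row per statistic, one column per component
    return scores[:, :2], loadings[:, :2], explained[:2]

# Hypothetical data: 20 runs by 4 statistics, random values for illustration.
rng = np.random.default_rng(0)
example_stats = rng.normal(size=(20, 4))
scores, loadings, explained = pca_biplot_parts(example_stats)
```

In a biplot, each row of `scores` is plotted as a point and each row of `loadings` as an arrow, so a cluster lying in the direction of an arrow has high values for that statistic.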