Genomic divergence of hatchery- and natural-origin Chinook salmon (Oncorhynchus tshawytscha) in two supplemented populations

Captive propagation is widely used for the conservation of imperiled populations. There have been concerns about the genetic effects of such propagation, but few studies have measured this directly at a genomic level. Here, we use moderate-coverage (10X) genome sequences from 80 individuals to evaluate the genomic distribution of variation of several paired groups of Chinook salmon (Oncorhynchus tshawytscha). These include (1) captive- and natural-origin fish separated by at least one generation, (2) fish within the same generation having high fitness in captivity compared to those with high fitness in the wild, and (3) fish listed as different Evolutionarily Significant Units (ESUs) under the US Endangered Species Act. The distribution of variation between high-fitness captive and high-fitness natural fish was nearly identical to that expected from random sampling, indicating that differential selection in the two environments did not create large allele frequency differences within a single generation. In contrast, the samples from distinct ESUs were clearly more divergent than expected by chance, including a peak of divergence near the GREB1L gene on chromosome 28, a gene previously associated with variation in time of return to fresh water. Comparison of hatchery- and natural-origin fish within a population fell between these extremes, but the maximum value of FST was similar to the maximum between ESUs, including a peak of divergence on chromosome 8 near the slc7a2 and pdgfrl genes. These results suggest that efforts at limiting genetic divergence between captive and natural fish in these populations have successfully kept the average divergence low across the genome, but at a small portion of their genomes, hatchery and natural salmon were as distinct as individuals from different ESUs.


Introduction
Captive propagation is an important conservation strategy for imperiled populations, particularly for species whose habitat has been lost or degraded. For some species, captive propagation is a last-ditch attempt to stave off extinction, with captive populations providing critical 'gene banks' for species that are extinct in the wild (e.g., Walters et al. 2010). For other species, releases of captively bred individuals into the wild are intended to maintain and perhaps increase a natural population's abundance, a strategy sometimes known as supportive breeding or supplementation (e.g., Cuenco et al. 1993; Ryman et al. 1995).
Supportive breeding is a common and well-studied strategy for the conservation of salmonid populations (Fraser 2008), and is also employed for many other taxa, including freshwater fishes and mussels (e.g., Rytwinski et al. 2021), birds (e.g., Navarro and Martella 2008; Walters et al. 2010; Dolman et al. 2021), amphibians (e.g., Tapley et al. 2015), and mammals (e.g., Champagnon et al. 2012). In the North Pacific, more than 5 billion juvenile Pacific salmon are released from hatcheries annually (North Pacific Anadromous Fish Commission 2022). These fish migrate to the ocean, rear for several months up to several years, and then return as adults with varying but often high fidelity to their natal watershed. The vast majority of these releases are from harvest-oriented, 'segregated', hatchery populations, where the adult returns are primarily intended to augment fisheries. The fidelity of returning fish to 'segregated' programs is often imperfect, however (Westley et al. 2013), such that gene flow from hatcheries to natural populations may be common even in cases where there is no intentional supportive breeding (e.g., Shedd et al. 2022). Other hatchery programs, particularly in the Pacific Northwest, are explicitly conservation-oriented, 'integrated' hatchery programs, where returning adults are intended to spawn in streams near where they were released as juveniles, and thus boost the targeted population's abundance (Mobrand et al. 2005; Naish et al. 2008).
One concern about the large-scale use of salmon hatcheries for either conservation or harvest enhancement is that such propagation may lead to genetic change that is detrimental to the conservation of natural salmon populations (Fraser 2008). Genetic divergence of closed, segregated hatchery populations from their natural population of origin is expected through drift and adaptation to captivity. The primary means of mitigating genetic risk from these programs is to limit gene flow into exposed natural populations (Hard et al. 1992;HSRG 2004;Mobrand et al. 2005;Paquet et al. 2011).
Managing genetic risks in integrated programs is more complex (Ryman and Laikre 1991), and often involves attempting to maximize the use of wild fish in the captive breeding program. At one extreme, all of the spawners in captivity every generation may consist of wild fish. Several studies have found that in this case the reproductive performance in nature of the returning hatchery-produced fish is increased, and genetic divergence between the hatchery and natural components of the supplemented population is decreased, compared to segregated programs (Araki et al. 2007; Waters et al. 2015; Ford et al. 2016). These types of observations, combined with theoretical expectations that incorporation of natural-origin fish into breeding programs will limit the rate and extent of domestication (Ford 2002; Baskett and Waples 2013), have resulted in guidelines and policies encouraging the use of natural-origin salmon in hatcheries, particularly in Washington State (Mobrand et al. 2005; Anderson et al. 2020), and similar guidelines have recently been proposed for Pacific salmon hatcheries in British Columbia (Withler et al. 2018).
Despite the widespread application of these guidelines for Pacific salmon hatcheries in North America, there has been relatively little empirical investigation into their effectiveness at limiting genetic divergence between the hatchery and natural components of an integrated population. An early study designed to evaluate genetic differences among hatchery and natural Chinook salmon populations by using allozyme markers found no differences in heterozygosity or effective size (N e) between hatchery and natural populations (Waples et al. 1993), and follow-up studies of some of the same populations using microsatellite loci produced similar results (Van Doornik et al. 2011). Those studies evaluated variation at only 12-35 loci, however. This is sufficient to evaluate average patterns of neutral diversity among groups, but not to identify whether there are small portions of the genome that are particularly divergent due to differential selection pressures. More recently, Waters et al. (2015, 2018) used RAD-seq methods to evaluate variation between hatchery and natural fish at several thousand genetically mapped loci in a supplemented, mid-Columbia River Chinook salmon population. In addition to finding differences in N e between the segregated and integrated components, the study also found some genomic regions of particularly high divergence that were possibly caused by directional selection.
Here, we use whole-genome sequencing to evaluate the distribution of allelic divergence at millions of loci between the hatchery and natural components of two independent, integrated Chinook salmon hatchery programs. Specifically, we compare patterns of diversity between the most successful male spawners in the hatchery and the most successful males in the natural stream. We evaluate whether there is any evidence of differential selection on specific genomic regions associated with hatchery propagation. We also evaluate the genomic distribution of variation between hatchery- and natural-origin fish in each population, regardless of spawning location, and between the two overall populations. We anticipate that broadly quantifying the genomic distribution of divergence will be helpful in evaluating the genetic effects of integrated hatchery programs.

Study populations
Our study focuses on two Interior Columbia River Chinook salmon populations, one spawning in the Wenatchee River and its tributaries, Washington; and the other in Catherine Creek, a tributary of the Grande Ronde River, Oregon (Fig. 1). Both populations are protected under the Endangered Species Act (ESA), but are part of separate Evolutionarily Significant Units (ESUs). The Wenatchee River population is part of the Upper Columbia River spring-run Chinook salmon ESU, and the Catherine Creek population is part of the Snake River spring/summer-run Chinook salmon ESU (Myers et al. 1998). Genetically, both populations are part of the Interior Columbia stream-type lineage (Moran et al. 2013;Narum et al. 2018), characterized by relatively early (April-June) adult spawning migration into freshwater and a full year of juvenile rearing in freshwater prior to smoltification and migration to the ocean (Healey 1991).
Both natural populations have one or more associated hatchery supplementation programs that are managed with the intent of increasing natural-spawning abundance. The Wenatchee River population has been supplemented since the early 1990s, with natural-origin broodstock collections starting in 1989 (Maier 2017). Hatchery and natural spawning is integrated, with the proportion of natural-origin fish in the broodstock averaging ~ 45% (Hillman et al. 2021), and the proportion of hatchery fish on the spawning grounds averaging ~ 60% (Ford 2022). The Catherine Creek population has been supplemented since 1994, beginning as a whole-lifecycle captive broodstock program. The captive program utilized entirely natural-origin, wild-caught parr which were raised in the hatchery and spawned as adults. In 2001 a conventional smolt-release hatchery program was started, incorporating returning adults from both wild and hatchery origins as broodstock. From 2005 to 2020, the average proportion of natural-origin fish in the broodstock was ~ 40%, and the average proportion of hatchery fish on the spawning grounds was ~ 60%.
To clearly distinguish between the spawning location of a fish in 2008 and its parents in 2004, we adopt the following terminology for the remainder of this paper (Fig. 2): "stream" and "brood" refer to spawning location of our sampled fish in 2008, in either the stream or as hatchery brood, respectively. "Hatchery" and "natural" refer to the spawning location of the parents of the fish in our sample; a hatchery fish is a fish whose parents were spawned in the hatchery and a natural fish is a fish whose parents spawned in the stream.
The reproductive success of fish spawning in the stream and the hatchery has been monitored in both populations through genetically-inferred pedigrees (Ford et al. 2012, 2015; Berntson et al. 2018). Based on results from these studies, from each of the two populations we selected two groups of 20 male fish with the highest reproductive success that year in the hatchery and the natural stream, respectively, for a total of 80 fish (Table S1). The rationale for focusing on the highest fitness fish in each environment was that if there is selection for genotypes in the hatchery environment that differ from the optimum in the natural environment, then this will be most apparent by comparing the most successful fish in each environment. To reduce variance due to differences among years, ages and sexes, all samples were age-4 males who returned to either river in 2008, except for a small number of age-3 and age-5 hatchery males in Catherine Creek, due to an insufficient number of age-4 fish in that return year. In both populations, both the stream and brood groups contained hatchery and natural fish (Fig. 2, Tables 1, S1). In this study we initially focused on males because in the Wenatchee River differences in the reproductive success of naturally spawning hatchery- and natural-origin females have already been shown to be largely due to environmental differences associated with spawning location within the river. By chance, the Wenatchee sample contained two pairs of full sibs (one pair of hatchery origin, one of natural origin), and two maternal half sibs of natural origin. These were included in all analyses, as there was no reason to believe the presence of siblings was unusual for this population.

DNA sequencing and analysis
DNA was extracted and quantified as described by LaHood et al. (2008). Library preparation and short-read, paired-end, 150 bp Illumina HiSeq sequencing to ~ 10X coverage were conducted by a commercial vendor (Azenta Life Sciences). Sequencing quality was initially reviewed by using fastqc, and then short read data were aligned to a Chinook salmon reference assembly [GCF_002872995.1_Otsh_v1.0; (Christensen et al. 2018)] using bwa mem. Mate information was corrected, duplicate reads marked, and files were sorted using samtools fixmate, markdup and sort, respectively. Read depth was evaluated using angsd (-doDepth 1 -doCounts 1 -doQsDist 1 -remove_bads 1 -uniqueOnly 1 -only_proper_pairs 1). Genotypes (filtered to biallelic SNPs with a SNP p-value threshold of 1e-6 and a genotype posterior probability cutoff of 0.95) were also called using angsd (-minMapQ 20 -minQ 20 -remove_bads 1 -uniqueOnly 1 -only_proper_pairs 1 -GL 1 -doMajorMinor 1 -doMaf 1 -skipTriallelic 1 -SNP_pval 1e-6 -doGeno 3 -doPost 1 -postCutoff 0.95). Analysis of the called genotypes was conducted in R by using custom scripts and the HardyWeinberg package, including calculation of allele frequencies, individual and population heterozygosity, and Weir and Cockerham's (1984) estimate of F ST between pairs of sample groups. Prior to subsequent analysis, SNPs were filtered to exclude any unmapped variants, any variants with a total-sample frequency < 0.01, and any variants with |F IS| > 0.7. We evaluated the distribution of F ST across the genome to characterize patterns of diversity and identify genomic areas of high divergence that might be candidates of differential selection among sample strata. Our primary focus was comparisons between brood and stream samples, and between hatchery and natural samples. Within each population, the stream and brood fish were chosen from the set of same-age natural- and hatchery-origin fish returning in the same year, and are therefore separated from each other by 0-1 generations (Fig. 2).
Because we chose fish that were reproductively successful in each environment, however, there is potential opportunity for selection to lead to allele frequency changes between these sampling strata. The hatchery/natural comparison within each population was expected to be separated by at least one generation. This is due to their parents necessarily having spawned in different environments, and to the < 100% use of natural-origin broodstock. As a point of reference for both the stream/brood and hatchery/natural comparisons, we also examined the distribution of F ST between the two population samples, which diverged an unknown but presumably much larger number of generations in the past, consistent with their status as distinct ESUs.
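As an illustration of the F ST statistic used throughout, the following is a minimal Python sketch of the Weir and Cockerham (1984) estimator at a single biallelic SNP. It is not the authors' R code: the function names, the 0/1/2 genotype coding, and the absence of missing-data handling are all assumptions made for the example. The multi-site function combines sites as a ratio of summed components, which is one common choice and not necessarily how the means in Table 2 were obtained.

```python
from typing import Sequence, Tuple


def wc_components(groups: Sequence[Sequence[int]]) -> Tuple[float, float, float]:
    """Weir and Cockerham (1984) variance components (a, b, c) for one SNP.

    `groups` holds one list per sample group; each entry is a genotype coded
    as the count (0, 1 or 2) of one allele. Missing data are not handled.
    """
    r = len(groups)                                   # number of groups
    n = [len(g) for g in groups]                      # individuals per group
    p = [sum(g) / (2 * len(g)) for g in groups]       # allele frequency per group
    h = [sum(1 for x in g if x == 1) / len(g) for g in groups]  # heterozygote proportion
    n_bar = sum(n) / r
    n_c = (r * n_bar - sum(x * x for x in n) / (r * n_bar)) / (r - 1)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / (r * n_bar)
    s2 = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / ((r - 1) * n_bar)
    h_bar = sum(ni * hi for ni, hi in zip(n, h)) / (r * n_bar)
    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = (n_bar / n_c) * (s2 - (core - h_bar / 4) / (n_bar - 1))      # among groups
    b = (n_bar / (n_bar - 1)) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)  # among individuals
    c = h_bar / 2                                     # within individuals
    return a, b, c


def wc_fst(groups: Sequence[Sequence[int]]) -> float:
    """Single-site estimate a / (a + b + c); NaN for a monomorphic site."""
    a, b, c = wc_components(groups)
    total = a + b + c
    return a / total if total != 0 else float("nan")


def wc_fst_multisite(sites: Sequence[Sequence[Sequence[int]]]) -> float:
    """Ratio-of-sums estimate across several sites (each site is a `groups` list)."""
    comps = [wc_components(groups) for groups in sites]
    num = sum(a for a, _, _ in comps)
    den = sum(a + b + c for a, b, c in comps)
    return num / den if den != 0 else float("nan")
```

With two groups of 20 fish, a fixed difference gives an estimate of 1, whereas two groups with identical genotype counts give a slightly negative value, which is why the permuted means in Table 2 can fall below zero.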
Table 1 Summary of variation within sample groups, where n is the number of samples in the group, SNPs is the total number of single nucleotide polymorphisms identified, H o is the observed sample heterozygosity, H e is the expected sample heterozygosity, N b is the estimated effective number of breeders, N b CIP is the parametric 95% confidence interval around N b, and N b JK is the jackknifed confidence interval around N b
We used random permutations of samples with respect to the characteristic of interest to compare the observed distribution of F ST to the null distribution that would be expected if two subsamples were drawn at random from the same population. For example, to create a null distribution with respect to the stream/brood comparison (n = 20 in each stratum) within the Wenatchee River, we randomly sampled the total set of 40 individuals without replacement to create many random samples of n = 20 each, preserving each individual's multi-locus genotype. The distribution of F ST was then evaluated for each of these random subsets, and compared to the observed stream/brood distribution. To quantify these comparisons, we calculated a variety of quantile statistics, and also visualized the comparison with quantile plots in which the expected quantiles were generated from the randomly permuted data. To construct a plot, the observed F ST values across each variable site in the genome were ordered from lowest to highest and then plotted against the ordered set of sites from one random permutation. This was repeated for each independent permutation to assess the variance due to chance differences among the permutations. In addition, the F ST values for pairs of individual permutations were also plotted against each other to assess the variance around the expected 1:1 line attributable to chance difference among the permutations (see Results for more information).
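The permutation procedure described above can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, genotypes are assumed to be coded 0/1/2 with no missing data, and a squared allele-frequency difference stands in for F ST so that the block is self-contained. Group labels are shuffled among whole individuals, so each fish keeps its multi-locus genotype.

```python
import random
from typing import Callable, List, Sequence, Tuple

Genotypes = Sequence[Sequence[int]]  # one row per individual, one column per SNP


def freq_diff_sq(group1: Genotypes, group2: Genotypes) -> List[float]:
    """Stand-in divergence statistic: squared allele-frequency difference per SNP."""
    out = []
    for j in range(len(group1[0])):
        p1 = sum(ind[j] for ind in group1) / (2 * len(group1))
        p2 = sum(ind[j] for ind in group2) / (2 * len(group2))
        out.append((p1 - p2) ** 2)
    return out


def permutation_quantiles(
    individuals: Genotypes,
    in_group1: Sequence[bool],
    stat_fn: Callable[[Genotypes, Genotypes], List[float]],
    n_perm: int = 100,
    seed: int = 0,
) -> Tuple[List[float], List[List[float]]]:
    """Return the ordered observed statistic and one ordered vector per permutation."""
    rng = random.Random(seed)

    def split(labels: Sequence[bool]) -> Tuple[Genotypes, Genotypes]:
        g1 = [ind for ind, flag in zip(individuals, labels) if flag]
        g2 = [ind for ind, flag in zip(individuals, labels) if not flag]
        return g1, g2

    observed = sorted(stat_fn(*split(in_group1)))
    labels = list(in_group1)          # copy so the caller's labels are untouched
    null = []
    for _ in range(n_perm):
        rng.shuffle(labels)           # reassign whole individuals to the two groups
        null.append(sorted(stat_fn(*split(labels))))
    return observed, null
```

Plotting `observed` against each row of `null` gives one line of a quantile plot per permutation, and plotting pairs of `null` rows against each other gives the chance scatter around the 1:1 line.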
We were interested in whether small effective size in the hatcheries might contribute to rapid divergence due to genetic drift, as has been previously observed in some systems (e.g., Waters et al. 2015). Estimates of the effective number of breeders (N b) for each sampling stratum were made by using the linkage-disequilibrium (LD) method of Hill (1981) and Waples (2006), as implemented in the program NeEstimator (Do et al. 2014). We used a randomly selected set of 10,000 SNPs with a minimum within-stratum allele frequency of 0.05, and only compared pairs of SNPs on different chromosomes. Our sample of adults from a single cohort is expected to provide an estimate of N b in the parental cohort that spawned in 2004 (Waples 2005).
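The raw quantity underlying the LD method is the average squared correlation between genotypes at pairs of loci that are not physically linked. The sketch below shows only that step, under stated assumptions: the function names are hypothetical, the squared Pearson correlation of 0/1/2 dosages is used as a simple stand-in for the composite LD measure, and the sample-size bias corrections and conversion to N b performed by NeEstimator are deliberately omitted. The default cap on the number of SNPs is kept small for speed; the study used 10,000.

```python
import random
from itertools import combinations
from typing import Hashable, Optional, Sequence


def pearson_r2(x: Sequence[float], y: Sequence[float]) -> Optional[float]:
    """Squared Pearson correlation; None if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return None
    return sxy * sxy / (sxx * syy)


def mean_r2_unlinked(
    columns: Sequence[Sequence[int]],   # one list of 0/1/2 dosages per SNP
    chrom: Sequence[Hashable],          # chromosome label per SNP
    min_maf: float = 0.05,
    max_snps: int = 200,
    seed: int = 0,
) -> float:
    """Mean r^2 over SNP pairs located on different chromosomes."""
    keep = []
    for j, col in enumerate(columns):
        p = sum(col) / (2 * len(col))
        if min(p, 1 - p) >= min_maf:    # within-stratum frequency filter
            keep.append(j)
    if len(keep) > max_snps:            # random subset of SNPs
        keep = random.Random(seed).sample(keep, max_snps)
    values = []
    for i, j in combinations(keep, 2):
        if chrom[i] == chrom[j]:
            continue                    # skip potentially linked pairs
        r2 = pearson_r2(columns[i], columns[j])
        if r2 is not None:
            values.append(r2)
    return sum(values) / len(values) if values else float("nan")
```

Smaller numbers of breeders are expected to produce larger values of this mean r^2, which is the signal the LD method converts into an estimate of N b.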

Results
After filtering, we identified a total of 8,187,048 SNPs in the sample of 80 fish. The number of SNPs in various sampling strata varied from 5.1 to 7.3 M (Table 1). Observed and expected heterozygosity varied only slightly among strata, although the Wenatchee River sample when considered as a whole was slightly less diverse than the Catherine Creek sample. Estimates of N b varied considerably among strata, with the estimated N b for the Wenatchee population less than half the value estimated for the Catherine Creek population (Table 1). Within each population, the natural-origin component had a higher estimated N b than the hatchery-origin component. In the Wenatchee River population, the stream component had a lower estimated N b than the brood component, whereas in the Catherine Creek population the N b estimates between the stream and brood components were nearly identical (Table 1). Parametric confidence limits for each estimate were very small, but jackknifed confidence intervals were large, overlapping, and included infinity for multiple strata. The jackknifed estimates are considered to be more accurate when using large numbers of loci (Do et al. 2014), indicating that there is considerable uncertainty in these estimates of N b , presumably due to the small number of individuals sampled.
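The per-site quantities behind the SNP filters and the heterozygosity summaries in Table 1 can be sketched as follows. This is an illustration under assumptions, not the authors' R code: the function names are hypothetical, the frequency filter is interpreted as a minor allele frequency threshold, and F IS is taken as 1 - H o / H e with H e = 2p(1 - p) and no small-sample correction.

```python
from typing import Dict, Sequence


def site_summary(dosages: Sequence[int]) -> Dict[str, float]:
    """Allele frequency, observed and expected heterozygosity, and F_IS at one SNP.

    `dosages` lists each individual's genotype as the count (0, 1 or 2) of one allele.
    """
    n = len(dosages)
    p = sum(dosages) / (2 * n)
    h_obs = sum(1 for g in dosages if g == 1) / n     # proportion of heterozygotes
    h_exp = 2 * p * (1 - p)                           # Hardy-Weinberg expectation
    f_is = 1 - h_obs / h_exp if h_exp > 0 else float("nan")
    return {"p": p, "h_obs": h_obs, "h_exp": h_exp, "f_is": f_is}


def passes_filters(
    dosages: Sequence[int], min_freq: float = 0.01, max_abs_fis: float = 0.7
) -> bool:
    """Keep a SNP only if it is common enough and not extreme in F_IS."""
    s = site_summary(dosages)
    maf = min(s["p"], 1 - s["p"])
    if maf < min_freq:
        return False
    return abs(s["f_is"]) <= max_abs_fis
```

Under these assumptions, a site at which every fish is heterozygous has F IS = -1 and is removed, as is a singleton in the full sample of 80 fish (frequency 1/160).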

Comparison between populations
The mean estimated F ST between the Wenatchee River and Catherine Creek populations was 0.01, and ranged across the genome from < 0 to > 0.5 (Table 2). As expected for populations from different ESUs, the observed distribution differed markedly from the permuted null distribution (Table 2), with the observed data exhibiting a clear pattern of greater divergence across all quantiles, and an apparent inflection point at an observed F ST of ~ 0.3 (Fig. 3A).
The values above the inflection point (~ 99.999% quantile; Table 2) are located in a few discrete genomic locations, with a notable peak on chromosome 28 and several smaller peaks on other chromosomes (Fig. 4A). The peak of F ST values on chromosome 28 (centered on position 12,317,964; Fig. 5) is within the GREB1L/ROCK1 region that has been previously associated with variation in adult run timing in Chinook salmon and steelhead (Hess et al. 2016; Prince et al. 2017; Thompson et al. 2020; Willis et al. 2021; Waples et al. 2022). In this region, the Wenatchee River sample is far more heterozygous than the Catherine Creek sample (Fig. 5), and variation within the Wenatchee River population was significantly associated with an individual's time of return to the Wenatchee River (Fig. 6; Table S2). There are several non-synonymous variants in this region, although none were directly associated with the peak of maximum divergence (Table S2).
In contrast to the between-population comparison, samples of high-fitness spawners in the stream or brood within each population were only slightly more divergent from each other than would be expected under the permuted null distribution (Table 2, Fig. 3B, D). In both populations, the observed data fell slightly above the expected 1:1 line, but were well within the range of variation seen in different random permutations (Fig. 3B, D). The maximum observed values of F ST and the higher quantiles were in each case very similar to expected values under the permuted null distribution in both populations (Table 2).
Regardless of spawning location, hatchery- and natural-origin fish within each population were somewhat differentiated from each other, however, especially in the Wenatchee River population (Fig. 3C, E). The mean F ST was less than a tenth of that observed between populations (Table 2). However, the maximum F ST between hatchery- and natural-origin fish in the Wenatchee River population was similar to that observed between populations, and was clearly higher than the permuted null distribution (Fig. 3C). The highest F ST values were associated with a peak of divergence on chromosome 8 (Fig. 4C) located near the pdgfrl and slc7a2 genes (Fig. 7, Table S3). The maximum values between hatchery- and natural-origin spawners within Catherine Creek were also somewhat higher than expected under the null distribution (Fig. 3E) with some minor peaks of divergence on several chromosomes (Fig. 4E). Genomic locations of divergence between hatchery- and natural-origin fish were not obviously correlated between the two populations (Fig. 4). However, in a nearby population in the Yakima River, Waters et al. (2018) also reported a region of high divergence between hatchery and natural Chinook on chromosome 8 (their Fig. 2b). The RAD-seq data from that study have not been mapped to a Chinook salmon reference genome, so we aligned (BLAST) the 14 chromosome-8 RADtag sequences they report as divergent and/or associated with life-history traits to the reference genome used in our study. The closest sequence reported in their study (tag Ot003069_Ots08q) mapped to positions 8,495,478-8,495,538, or ~ 200 Kb from the peak we found between hatchery- and natural-origin fish in the Wenatchee (Table S4).

Discussion
Concerns about the potential for genetically-based loss of fitness in hatchery salmon date back more than 45 years (e.g., Reisenbichler and McIntyre 1977), and have been a focus of hatchery reform efforts for several decades (Hard et al. 1992; Mobrand et al. 2005). Several studies comparing the reproductive success of hatchery- and natural-origin steelhead when spawning in nature have found large and rapid (single generation) fitness losses that have been inferred to be genetic based on comparing the reproductive success of hatchery-origin fish with zero, one, or two natural-origin parents (Araki et al. 2007, 2008). Although evidence for such rapid, heritable fitness loss has not been found in other propagated Pacific salmon species, it has been commonly assumed that inadvertent domestication selection in hatcheries is a serious risk to natural salmon populations of all species (Mobrand et al. 2005; Araki et al. 2008; Fraser 2008; Anderson et al. 2020).
In this study, we used whole-genome sequence data to directly examine the degree of genomic divergence between hatchery and natural Chinook salmon in two different hatchery-supplemented populations over two time periods of potential divergence. The shortest time period was a comparison of fish from within the same cohort but identified as having high reproductive success in either the hatchery brood or in the natural stream. If hatchery propagation leads to strong, rapid selection for genotypes that are deleterious in nature but advantageous in the hatchery, then we might expect to see evidence for this in the form of larger than expected allele frequency differences between fish that were successful as brood compared to those that were successful in the stream. We did not see this, however. In both the Wenatchee River and Catherine Creek populations, the genomic distribution of divergence between the successful stream and successful brood fish was nearly the same as expected if the two sets of samples were drawn at random from the same statistical population (Fig. 3B, D). This suggests that, to the degree that such selection pressures exist, they are not creating greater than random differences in allele frequencies over the course of a single generation. It is important to note, however, that our power to detect such differences is fairly low, as illustrated by the quantiles of the null distribution of F ST. The power of increased sample size can be clearly seen by comparing the narrower distribution of permuted F ST values for the between population comparison (n = 80) relative to the within population comparisons (n = 40; Table 2). Nonetheless, our results indicate that to the degree that divergent selection in the stream and brood exists, it is not creating large differences in allele frequencies over the course of a single generation.
In contrast to the stream/brood comparisons, hatchery- and natural-origin fish were notably more divergent than would be expected if the two groups were drawn at random from the same cohort (Fig. 3C, E). This was particularly true in the Wenatchee River population. The hatchery/natural comparisons involve at least one generation of separation, because this comparison includes only the progeny of fish that spawned in different environments. Therefore, even in the absence of any differential selection between the hatchery and stream environments, a small amount of differentiation between hatchery and natural fish is expected due to a generation of drift. Spatially non-random spawning of hatchery and natural fish could also create additional opportunities for reduced gene flow (Ford et al. 2015; Hughes and Murdoch 2017). It is therefore not surprising that the observed divergence is somewhat greater than the permuted distribution. Our results do provide an empirical measurement of this divergence, however. For example, even though the hatchery/natural comparisons within each population had mean F ST 10-20X lower than the comparison between the two populations, the maximum F ST values were similar (Table 2). In other words, the maximum divergence between hatchery and natural fish in each population was similar to the maximum divergence between populations that are considered to be different ESUs.
Table 2 Shaded rows contain the empirical comparisons among groups, including the min, max, mean and the labeled quantiles. The unshaded rows contain the corresponding mean values from 100 random permutations of the same individuals used in the empirical comparisons. For example, the first comparison is between the 40 Wenatchee and 40 Catherine Creek fish (mean F ST = 0.01088), and the row below that comparison contains the mean values for 100 permutations of 40 fish each drawn from that group of 80 fish, without replacement (mean F ST = − 0.0007)
Fig. 3 Expected values based on 100 independent sample partition permutations are shown as grey lines. Each grey line represents one random permutation plotted against the observed data, with the variation among the grey lines due to differences among the random permutations. The black points are the means across permutations. Also plotted are the 1:1 line (thin black line) and comparisons of 100 pairs of independent sample partitions plotted against each other (orange lines). Panels B and D are the same as A, but comparing the high-fitness-stream and high-fitness-brood spawners within the Wenatchee River and Catherine Creek samples, respectively. Panels C and E compare hatchery- and natural-origin fish within the Wenatchee River and Catherine Creek samples, respectively
The excess divergence between hatchery and natural fish in the Wenatchee River was associated primarily with a peak on chromosome 8, near the slc7a2 and pdgfrl genes (Fig. 7, Table S3). The function of neither gene has been studied in salmonids, but in mammals the slc7a2 gene encodes a cell membrane protein involved in cationic amino acid transport (Hoshide et al. 1996) and variation in this gene has been associated with various cancers (Sun et al. 2020; Xia et al. 2021). In zebrafish (Danio rerio) the protein produced by this gene is expressed in macrophages involved in central nervous system health (Demy et al. 2020). The pdgfrl gene has also been identified as a tumor suppressor in humans (Guo et al. 2010) and has been associated with a blood vessel inflammation disease (Hou et al. 2013). The finding by Waters et al. (2018) of a peak of divergence between integrated hatchery and natural Chinook salmon near this region of chromosome 8 supports the possibility that variation in this region may be involved in hatchery adaptation more generally. Analysis of additional samples of hatchery and natural Chinook salmon in the Wenatchee River and elsewhere will be needed, however, to test this hypothesis further.
An additional difference between hatchery- and natural-origin fish in our study was the estimated effective number of breeders (N b), which was lower for the hatchery-origin samples in both populations (Table 1). This result is similar to previous observations in the Wenatchee River population, and to estimates of hatchery and natural N b of Chinook salmon in the nearby Yakima River (Waters et al. 2015). For the Wenatchee River and Catherine Creek populations, the lower N b in the hatchery is reflective of the generally smaller number of breeders used in the hatchery compared to the number spawning in the streams. For these populations, the hatchery environment is also markedly more productive than the stream environment: in the Wenatchee River population between 1989 and 2014, each brood spawner produced an average of 7.4 returning adults, compared to 1.03 returning adults per stream spawner (Hillman et al. 2021), and in Catherine Creek the values are 6.1 adults/spawner for the brood and 0.4 for the stream (EB, unpublished data). Despite this high productivity of the hatcheries, however, N b is lower due to the smaller number of spawners.
Fig. 5 Chromosome-28 region containing the peak of divergence (F ST) between the Wenatchee River and Catherine Creek samples (A), with region zoomed in (B). mRNA transcript locations in the region are noted below each plot, with the GREB1L and ROCK1 transcripts labeled as G and R, respectively. The 50-bp-window moving average of the heterozygosity in the Wenatchee (golden) and Catherine Creek (blue) samples is also shown
The observed divergence between the Wenatchee River and Catherine Creek samples provides a useful benchmark for the degree of differentiation between distinct, albeit fairly closely related, Evolutionarily Significant Units (Myers et al. 1998). The distribution of F ST values was markedly (and unsurprisingly) greater than the permuted null distribution for all quantiles (Fig. 3A). Despite this, the absolute level of divergence was certainly not large, with a mean F ST of only 0.01, a 75% quantile of only 0.02, and no fixed differences between the samples. The only obvious peak of divergence occurred in the GREB1L/ROCK1 region of chromosome 28, which has been previously associated with run timing variation in both Chinook salmon and steelhead (see Waples et al. 2022 for a recent review). Our results continue to support this association in two ways. First, we found that variation within this region was associated with run timing within the Wenatchee River (Fig. 6). This association is similar to what has been found in other Chinook salmon populations, including both coastal (Prince et al. 2017; Thompson et al. 2020) and Interior Columbia populations (Koch and Narum 2020; Willis et al. 2021). Variation at the GREB1L/ROCK1 region therefore appears to influence run timing in the Wenatchee River Chinook population, even though the population as a whole has an early (spring) run timing distribution. This is similar to what has been previously observed in Johnson Creek Chinook salmon (Narum et al. 2018), another early-run Interior Columbia population. Second, the patterns of variation between the Wenatchee River and Catherine Creek samples are also consistent with an association between run timing and this genomic region. Compared to other Interior spring/summer run Columbia River Chinook salmon populations, Catherine Creek has a particularly compressed run timing distribution when measured at the mouth of the Columbia River (see Fig. 3 in Sorel et al. 2021).
The Wenatchee River population, in contrast, has a much broader distribution when measured the same way. These patterns are reflected within the peak of divergence between these two populations, where the Wenatchee River sample was markedly more variable compared to the Catherine Creek sample (Figs. 5, 6).
To our knowledge, this is the first study to evaluate the degree of divergence between hatchery and natural salmon in supplemented populations at the whole-genome level. The study is largely exploratory and will benefit from replication, so our results should be interpreted cautiously. Larger sample sizes, especially, will be helpful for improving the signal to noise ratio for detecting outlying regions of divergence. Additional temporal sampling will also be helpful. For example, a supplemented population might change genetically over time in a different way from an unsupplemented population, even if there is only modest differentiation between the hatchery and natural components of the population at any one moment in time. Evaluating whether such temporal changes are due to supplementation per se appears conceptually difficult for any single population study, but perhaps by evaluating large numbers of genomes from multiple supplemented and unsupplemented populations, the question could be addressed. Despite these limitations, our results illustrate the promise of using whole-genome population analyses to address an important and long-standing problem in conservation biology.
Fig. 6 Illustration of the relationship between return time (day of year) of Chinook salmon to the Wenatchee River or Catherine Creek and genotype at position 12,233,225 on chromosome 28. This position was plotted because it has the highest correlation with run timing, but other linked sites also have high correlations (Table S2). The association is significant within the Wenatchee River (Pearson's product-moment correlation, r 2 = 0.63, p = 1.97e−05). Run-timing information in Catherine Creek was only available for stream fish. One fish in Catherine Creek and 2 fish in Wenatchee River with missing genotypes are not shown
Fig. 7 Chromosome-8 region illustrating a peak of divergence (theta; points) between hatchery and natural Chinook salmon in the Wenatchee River. The blue and golden lines illustrate the 50-bp-window moving average heterozygosity in the hatchery and natural fish, respectively. Locations of mRNA transcripts are shown below the points. S and P refer to the location of transcripts for the slc7a2 and pdgfrl genes, respectively