In general, the efficacy of population genetic studies depends on the sample size as well as on the number and type of the chosen markers. Although dwarfed by the existing 170k arrays for dogs (6, 17, 18), the 1319 diagnostic neutral SNPs used in this study have previously been used successfully to detect dog population structure and to infer their genetic relationships (9, 10). In more detail, because the distance between the markers is known, the 1319 SNPs were successfully used for an r²-based estimation of the historical effective population sizes of the Finnish and Nordic Spitz (9). More broadly, analysis of SNPs across all chromosomes can outperform traditionally used STR panels in population genetic analyses (19). When several loci are used, as in the case of whole-genome analyses, sample sizes of 20–30 individuals are considered to give a sufficient overview of the studied population (20, 21). Our sample sizes varied from 90 to 608 (Table 2), and even the smallest sample set did not show any singleton genotypes (Figures 1–3). The same genotyping panel is also used by dog owners to test whether their dogs are carriers of known single-locus Mendelian disorders or other functional variants, but these loci were not included in our analyses. While certain disease variants might be enriched in some breed subpopulations, these do not influence the genetic makeup of the breed. Breeders also often test several related dogs (parents, offspring, siblings), which could influence the representation of individuals within each K when the sample size is small. However, as the related dogs represent the same geographical region and breed type, this type of bias would not affect the identification of subpopulation divisions.
Our analysis demonstrated subpopulation differentiation in all six studied breeds and confirmed that geographical isolation and differential breeding strategies can have similar outcomes and that the differentiation can be fairly rapid, as seen in the Finnish Lapphund. If the reproductive isolation were complemented with strong directional selection, we would expect the differentiation to be even faster and more extreme. It is noteworthy that the differentiated subpopulations with high FST values (Table 1) formed tight clusters on the MDS plots and were more uniform in their STRUCTURE charts than the less differentiated examples (Figures 1–3).
The geographic differentiation between the European and the US populations of the Italian Greyhounds and the Shetland Sheepdogs was expected. Both breeds originate from Europe, and the US populations are based on few founders (22). The founder effect combined with the limited gene flow between the continents is expected to be an efficient driver of subpopulation differentiation. More surprising is that similar or even more extreme differentiation, as evident in the FST values (Table 1), was observed in the division into sport/working dogs and show dogs in English Greyhounds and Labrador Retrievers. In addition, one might expect that selection for running performance or work, respectively, would be a dominant driver of the differentiation, but in both breeds the show dog subpopulations had higher FST values. This probably reflects the overrepresentation of a few champion breeders and hence a low effective population size in the show dog lineages. There also seems to be gene flow from the show lineages to the sport or working dogs but not vice versa, which likewise reflects differential breeding practices in the corresponding kennel cultures.
The Belgian Shepherd is an interesting example of a dog breed with several different breed varieties with discrete breeding practices. Although the separation into breed varieties has driven subpopulation differentiation, the fact that the subtypes are determined by coat type and color, both monogenic traits, can result in a situation where genotypically similar individuals are considered different breed types by the breed community (Figure 3C). For example, the lack of differentiation between Groenendaels and Tervuerens can be explained by the fact that the former have the dominant black coat color but can produce sable-coated offspring, which are registered as Tervuerens. Registration of offspring under a breed variety different from that of their parents plays a role in allowing gene flow between the subpopulations.
In contrast to the Belgian Shepherds, there are no phenotypically separated breed types in the Finnish Lapphund. Instead, a breeder association preferring certain family lineages to others and setting restrictions on outbreeding with the remaining population likely drives the subpopulation differentiation in this breed. The goal of this selective breeding has been to protect the original identity of these reindeer herding dogs; however, as all animal populations evolve over time, it is uncertain how this goal can be objectively evaluated. In fact, the original Sámi reindeer herding dogs have been artificially split into two breeds based on their coat type, the Lapponian Herder and the Finnish Lapphund. The Finnish Lapphund is among the most popular breeds in Finland, with some 1224 dogs registered in 2019, but only a tiny fraction of these dogs are in any herding use.
Population fragmentation is an issue for the conservation of genetic diversity in any population. The measure of population differentiation, FST, is by definition a measure of inbreeding in the subpopulation relative to the total population (23, 24). Inbreeding itself reduces the heterozygosity in a population, and inbreeding can in fact also be expressed as a function of the loss of Hz over generations (see (9) for the different metrics and discussion in dogs). This is also evident in our study, where the subpopulations with the highest FST also had the lowest Hz (Table 1). If the obstacle to gene flow is maintained, Hz will continue to erode at higher rates than in the less differentiated populations. Two factors influence the FST and Hz. The first is the effective population size Ne, which is not simply the number of individuals contributing to the next generation, but also depends on the relatedness of these individuals, affecting the subsequent change in the inbreeding rate of the population [see Kumpulainen et al. for discussion regarding dogs (9)]. In essence, pairing closely related dogs, as is often the case in line breeding, results in a higher inbreeding rate and faster loss of Hz over generations. The Ne can be dramatically reduced, at least temporarily, due to the founder effect when a dog breed is introduced to a new country. The founder effect was also evident in our study, where certain European ancestral lineages were strongly enriched and others absent in the representatives of the American lineages (Figure 1B, D). Similarly, much of the DLA variation has been lost due to the founder effect in the US populations of several European dog breeds (25), which could have functional consequences for their immunocompetence. For example, the US population of Italian Greyhounds, also included in this study, was found to be susceptible to many autoimmune diseases not observed in the European population (22).
Second, as drift operates in each generation, the longer a subpopulation is separated from the others, the larger its FST will be.
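The two factors named above, effective population size and time since separation, can be illustrated with textbook idealizations. The following is a minimal sketch, not the estimation procedure used in this study: it assumes a single biallelic locus, Wright's definition FST = (HT - HS) / HT, and an isolated Wright-Fisher population with constant Ne, no migration, no mutation and no selection. All function names are hypothetical and chosen for illustration only.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def wright_fst(sub_freqs, weights=None):
    """Wright's FST = (HT - HS) / HT for one biallelic locus.

    sub_freqs: allele frequency in each subpopulation.
    weights:   relative subpopulation sizes (assumed equal if omitted).
    """
    n = len(sub_freqs)
    if weights is None:
        weights = [1.0 / n] * n
    # Allele frequency in the pooled (total) population.
    p_bar = sum(w * p for w, p in zip(weights, sub_freqs))
    h_t = expected_heterozygosity(p_bar)
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(w * expected_heterozygosity(p) for w, p in zip(weights, sub_freqs))
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


def heterozygosity_after(h0, ne, generations):
    """Expected heterozygosity after drift: H_t = H_0 * (1 - 1/(2Ne))^t."""
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


def drift_fst(ne, generations):
    """Expected FST of a fully isolated subpopulation: 1 - (1 - 1/(2Ne))^t."""
    return 1.0 - (1.0 - 1.0 / (2.0 * ne)) ** generations


if __name__ == "__main__":
    # Illustrative numbers only, not values from this study.
    for ne in (25, 100):
        for t in (5, 20):
            print(ne, t,
                  round(heterozygosity_after(0.30, ne, t), 3),
                  round(drift_fst(ne, t), 3))
```

Under these assumptions, a smaller Ne or a longer period of isolation both lower the expected Hz and raise the expected FST, which matches the qualitative pattern described in the text. Real breeds violate the idealization (overlapping generations, occasional gene flow, selection), so the sketch is meant for intuition rather than for estimation.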
The loss of genetic diversity can pose a threat to breed health (13, 22, 26–29). Although this loss cannot be completely avoided due to the closed population nature of all dog breeds, some breeding practices, such as extreme selection for “best in breed” winners of competitions and trials or for lineages with culturally valued pedigrees, have the potential to accelerate the process. We also acknowledge that breed differentiation can serve a purpose, such as the maintenance of working lines, in which case the FST can be seen as a measure of an increase in the “quality” genes. In these cases, mixing of differentiated subpopulations could create an inferior outcome, similar to the decay of locally adapted allelic combinations known to occur in the wild, also known as outbreeding depression (30) or migration load (31). However, unlike locally adapted wild populations, dog breeds result from man-made criteria, whose justification should be critically evaluated, and diverse breeding options should be preferred over lineage selection. Moreover, the loss of genetic diversity does not automatically have adverse effects, especially if there is simultaneous positive selection on health, facilitated by rigorous veterinary checks. The health checks, including genetic testing for disease carriers, can also narrow down the breeding population, contributing to the loss of genetic diversity, especially in small breeds.
Knowledge of the genetic differentiation of populations and gene flow between them not only offers insights into breed history and current breeding practices, but also highlights how subpopulations might most beneficially be used for the maintenance or restoration of genetic diversity. Optimally, breeding for desired traits should aim to mimic the natural selection operating in wild populations, selecting for uniformity in some parts of the genome while maintaining diversity elsewhere (32, 33). The simplest tools here might be to maintain a large effective population size and to constrain inbreeding, which most animals tend to avoid under natural conditions (34). We also expect that subpopulation differentiation can result in false positive (or negative) signals in genome-wide association studies of different traits and should therefore be taken into account when designing such studies.