Conservation Status and Historical Relatedness of South African Communal Indigenous Goat Populations Using Genome-Wide Snp Marker

Background: Indigenous goats forms the majority of populations in smallholder; low input, low output production systems and are considered an important genetic resource due to their adaptability to different production environments and support communal farming. Effective population size (N e ), inbreeding levels, and the runs of homozygosity (ROHs) are effective tools for exploring the genetic diversity and understanding the demographic history in efforts to support breeding strategies to use and conserve genetic resources. Results: Across populations, the current N e of Gauteng was the lowest at 371 animals, while the historical N e across populations suggests that the ancestor N e has decreased by 53.86%, 44.58%, 42.16% and 41.16% in Free State (FS), North West (NW), Limpopo (LP) and Gauteng (GP), respectively, over the last 971 generations. Genomic inbreeding levels related to ancient kinship (F ROH >5Mb) was highest in FS (0.08±0.09) and lowest for Eastern Cape (EC) (0.02±0.02). A total of 871 ROH island regions which include important environmental adaptation and hermo-tolerance genes such as IL10RB, IL23A, FGF9, IGF1, EGR1, MTOR and MAPK3 were identied (occurring in over 20% of the samples) in FS (n = 37), GP (n = 42), NW (n = 2) populations only. The mean length of ROH across populations was 7.76Mb and ranged from 1.61Mb KwaZulu-Natal (KZN) to 98.05Mb (GP and NW). Distribution of ROH according to their size showed that the majority (n = 1949) of the detected ROH were >5Mb in length than the other categories. Assuming two hypothetical ancestral populations, the population from KZN and LP are revealed, supporting PC 1. The genomes of KZN and LP shared an origin but have substantial admixture from the EC and NW populations. Conclusions: These ndings indicated a greater negative impact of inbreeding in recent times which is important for planning conservation strategies. It was revealed that the occurrence of high N e and autozygosity


Background
There are currently 15 South African goat genetic resources listed on the DAD-IS of FAO and 13 on the Domestic Animal Genetic Resources Information System (DAGRIS) of the International Livestock Research Institute, including those listed in DAD-IS. In the country, indigenous goat ecotypes have been used as triple purpose animals (e.g skin, milk and meat); and depending on the region, the animal characteristics, and the geographical isolations, they began to diverge into breeds/populations [60]. These ecotypes have generally been named after their place of origin (e.g Northern Cape Skilder) and/or their prominent characteristics (e.g Xhosa lobbed ear) and on basis of the people who own them (e.g Nguni) [41]. These ecotypes are widely spread across all major agro-ecological regions of South Africa displaying adaptability traits to a speci c habitat or production environment and represent a signi cant resource to satisfy present and future demands for sustainable farming in a changing environment. Improvement of indigenous livestock has been practised through the introduction of high performing breeds (exotic and improved breeds) and as a result of indiscriminate mating and breeding, the majority of village goat populations are crossbreds [39]. The majority of the smallholder farmers have small herds or ocks where herd sizes could be less than ve animals with the majority of these herds being nondescript, crossbred or indigenous cattle, sheep and goats [42,47]. The reduction in local indigenous populations suggests a need for conservation of local genetic resources through implementing a national conservation strategy. Various studies conducted in the smallholder communal areas showed average ock sizes of between 1 and 120 goats [61,14]. Furthermore, about 70% of the goats are kept under traditional management systems where the farm structure comprises of about twenty (±20) indigenous goats [43]. Towards this aim, detailed information on the phenotypic, genetic diversity and population structure of goats ecotype populations become important to guide conservation strategies through utilisation of these populations. According to [17], conservation and characterisation of animal genetic resources is critical because of their contribution to the sustainable livelihoods of rural communities that depend on them for food security. Conservation frameworks should incorporate both genetic diversity and breed merits for prioritizing breeds/populations from community to national level to support breeding programs of current populations.
The completion of the rst draft of the goat genome [13] made it possible for the development of highdensity markers [59]. The Illumina goat SNP50K BeadChip includes 53 347 SNPs [59] that have found utility in South African population genetic studies in Angora [27], commercial, indigenous and village goat populations [37] as well as investigate genetic adaptation to environmental pressures [36]. The use of the tool has been described in other African countries [48,65] and in specialised breeds [32]. While much work on South African commercial, indigenous and village goat populations has focused on genetic studies and investigation on genetic adaptation, less work has focused on conservation status and historical relatedness of communal indigenous goat populations.
The presence of the extent of an effective population size (Ne), is an important population genetic parameter that has recently received a great deal of research attention [64], determining population demographic development [12] and demographic processes such as migration and admixture [45], and having profound implications for understanding the architecture of the animal genome [27,3]. In addition, Ne is widely regarded as one of the most critical population parameters because it measures the rates of genetic drift and inbreeding as well as affects the e cacy of systematic evolutionary forces such as mutation, selection, and migration [56]. It also helps to discover population demographic history and allows for the prediction of the behaviour of genetic diversity through time. The Ne is estimated using the r 2 coe cient and measures the observed range and the amount of genetic variation within a frame of population genetics [5]. It also provides information on the degree of inbreeding of the population under consideration [18]. The Ne determines the amount of genetic variation, genetic drift, and linkage disequilibrium (LD) in populations [28]. Implementation of a national conservation strategies for FAnGR must be based on a better understanding on the degree of inbreeding of the populations, genetic variation, genetic drift, and linkage disequilibrium (LD) in populations.
An increase in inbreeding (F) over generations leads to a reduction in genetic diversity [48]. Higher frequency of homozygous genotypes for deleterious alleles with a reduction in individual performance (inbreeding depression) and lower population viability [49]. When an offspring is inbred, it may inherit autozygous chromosomal segments from both parents that are identical by descent (IBD), i.e., segments that are derived from a common ancestor [8]. The result is continuous homozygous segments in the genome, also known as runs of homozygosity (ROH). The ROH are contiguous lengths of homozygous segments of the genome where the two haplotypes inherited from the parents are identical [20].
The extent of ROH can be used to estimate the inbreeding coe cient [7,30,51], to disclose the genetic relationships among individuals, usually estimating with high accuracy the autozygosity at an individual and/or population level [16,15]. Autozygosity is the homozygous state of identical-by descent (IBD) alleles, which can result from several phenomena such as genetic drift, population bottlenecks, mating of close relatives, natural and arti cial selection [11,27]. The ROH can also be used to establish the level of selection pressure on the populations [63]. Length and frequency of ROHs may also be used to distinguish distant from more recent inbreeding since the length of IBD segments follows an inverse exponential distribution with a mean of 1/2 g Morgans, where g is the number of generations from a common ancestor [23]. Effective populations (Ne) and ROH have been studied in humans [20], cattle [31,15,34], pigs [56,1] sheep [33] but less comprehensively in other livestock species, such as goats for designing conservation strategies especially on the communal indigenous goats of South Africa. The objective of this study was to determine the conservation status and historical relatedness of South African communal indigenous goat populations using genome-wide SNP markers.

Genetic diversity indices.
Variation of the estimated N e at t generations ago (from 12 to 983) is presented in Figure 1 The average inbreeding coe cient (F IS ), was lowest in Free State (F IS = 0.03±0.09), followed by Gauteng (F IS = 0.04±0.09), North West (F IS = 0.05±0.01) and highest in Limpopo (F IS = 0.09±0.05). Overall, the inbreeding level was 0.06±0.08. The average F ROH , its range of variation across populations and its distribution are summarised in Table1. Genomic inbreeding coe cients (F ROH ) based on the distribution of the length of runs of homozygosity by class are described in Table 2 and by chromosome in Figure 2. F ROH differed signi cantly among populations across the length categories. The genomic inbreeding coe cients of the shortest ROH (0-5Mb; related to ancient kinship) per animal ranged from 0.02±0.02 in the Eastern Cape population to 0.08 ±0.10 in North West population. The F ROH of Eastern Cape, Limpopo populations increased with category size, whilst decreased in Free State. Gauteng F ROH decreased from 0.07±0.09 to 0.05±0.09 for F ROH >20Mb and increased at >20Mb. In KwaZulu-Natal, F ROH increased to 0.08±0.09 at <20Mb and decreased for >20Mb category. Chromosomal distribution of inbreeding showed higher inbreeding levels in chromosome 15, in Gauteng chromosome 19, in KwaZulu-Natal in chromosome 19 followed by chromosome 16. In the North West chromosome 22 and 19 had the highest inbreeding coe cients compared to other chromosomes. Table 1 Distribution of runs of homozygosity inbreeding coe cients (F ROH ) within each population.  The relationship between the mean number of ROH and the length of the genome covered by ROH per individual varies considerably among animals and populations. The number of ROH per chromosome displayed a speci c pattern with the larger numbers found for chromosome 1, 2 and 3, a number that To identify the genomic regions that were most commonly associated with ROH, the percentage of SNPs in ROH was assessed by analysing the frequency of a SNP occurring in those ROH across different individuals (20%), and this was plotted against the position of the SNP along the chromosome ( Figure.  6). The threshold of 70% and 50% of the individuals did not yield ROH islands across populations thus 20% was used. Several genomic regions were identi ed that frequently appeared in a ROH within individual animals (Table. 3). We identi ed 58 ROH islands at the 20% threshold in the Free State (n = 28) and Gauteng (n = 29) provinces. No ROH islands were observed in the Eastern Cape, KwaZulu-Natal, North West and Limpopo at the set thresholds. The ROH hotspot with the highest occurrences (SNPs = 149) in Gauteng was located on chr7 (7.69Mb). Chromosome position, start and end position of ROH, ROH length and number of SNPs within the genomic regions of extended homozygosity are reported in Table 3 and Additional le 1, Population structure For population structure analysis further quality control parameters were effected in PLINK v1.90 [53]: to remove linked SNPs using the --indep-pairwise 50

Discussion
The indigenous goat population represents the main links to the development of commercial meat-type breeds and may potentially be relevant as a future source of untapped adaptable genetic material [61].
Therefore, improving our understating of within-ecotype relationships among the major goat producing provinces in South Africa offers a rare opportunity enhancing e cient use of the breeds and implementing conservation programs. This study investigated the indications for population status on inbreeding and runs of homozygosity in the communal indigenous goat population. Data from a previous study [38] enabled a broad geographical coverage of South Africa and represents populations from the major goats producing provinces within the country.
Effective population size (N e ) is a crucial population genetic parameter because of its relationship to the loss of genetic variation, increases in inbreeding, the accumulation of mutations, and the determination of the accuracy of genomic selection [21,5]. Gauteng had the smallest estimated N e among the population and Limpopo had the highest. It was also observed in other studies that effective population size (N e ) showed a reduction to 132 in the Kingdom of Eswatini and highest in South Africa 12 generations ago [61]. It is recommended that to prevent a reduction of the adaptive value in populations, N e values between 50 and 100 animals [40] are accepted to be within a factor of 2 from the true value to avoid inbreeding depression. A study by [10], reported a large N e (140) in local goats breeds, such as those from Africa, Spain and Central-Southern Italy local goats breeds and a small Ne (42) in the Angora, Boer, Nubian, Cashmere, Saanen and Alpine populations. The introduction of records keeping as part of the conservation strategy for the communal indigenous goat populations and occasional DNA-based controls will help to design mating plans that enabled farmers to control inbreeding and maintain effective population size. This can be particularly the case for breeds with distinctive phenotypic traits, such as horn shape, and sizes and coat colours in the ecotypes.
The rapid increase pattern in Ne may also include bottlenecks associated with domestication, selection and breed formation, and the endangerment of the breed [56]. A study by [38] based on SNP data and using the same method reported large N e for all investigated breeds (ranging from 140 to 348).
Furthermore, a study by [39] revealed that the ecotype goat was slightly higher (145) in effective population sizes than the Tankwa and commercial breeds across generations. From a conservation standpoint, the indigenous goat population should top the priority in the population studied due to their diminishing effective population size and increased inbreeding coe cients.
Runs of Homozygosity (ROH) can disclose the genetic relationships among individuals, estimating with high accuracy the autozygosity at the individual and population levels and can elucidate about selection pressure events [54]. If long ROH accumulates in the genome of some individuals, they could seriously impact the overall biological tness [29], therefore, it was an important objective to investigate and understand the level of homozygosity among the populations. In this study, only 1 animal was lacking ROH, whilst 206 (99.52%) had at least one ROH longer than 1 Mb. The genomic inbreeding coe cients (F ROH ) values found in the study for the Ethiopian goats were F ROH > 1Mb values [44]. Similar results were found by [44] with more African goats (Cameroon, Ethiopia, Kenya, Madagascar, Malawi, Mali, Mozambique, Nigeria, Tanzania, Uganda, and Zimbabwe) using a clustering algorithm. In this study, differences in terms of total number and length of ROH were short (>5Mb) were more abundant (57.61%). [48] reported results that showed lower than the F ROH > 2Mb for Kenya, Uganda, and Mozambique goat breeds when Goat 50K BeadChip was used. On the other hand, the Eastern Cape population showed very low amounts of ROH. This has been suggested to be consistent with recent admixture in the individuals in Chinese cattle [62]. Long segments were abundant in the North West population. [25] recent study revealed a high mean ROH in the long length category (>30 Mb), and their study suggested that inbreeding is more recent and is indicative of demographic decline. This is possible considering the extensive management systems of goats in the region, where goats are on free-range or are herded with other ocks for some part of the year, although some researchers have argued that such extensive systems may lead to inbreeding [58,22]. [29] suggested that the lower inbreeding levels in African goats could be due to the openness of the breeding systems in most of Africa due to the loose de nition of livestock breeds in the region.
One of the main advantages of genomic coe cients is the availability of chromosomal inbreeding coe cients [34]. ROH islands can be de ned as genomic regions with reduced genetic diversity and, consequently, high homozygosity around the selected locus that might harbor targets of positive selection and are under strong selective pressure [51,50]. ROHs, representing the level of genomic autozygosity, are continuous homozygous segments at the individual and population levels that can be used as a measurement of inbreeding; more in-depth ROHs are the result of demography, natural and arti cial selection, and inbreeding [55]. In this study, all the genomic regions associated with ROH were not discussed in details, but the focus was on some selected regions that show associations with several speci c traits related to livestock breeding. Five genes were identi ed and reported to be associated with the important traits of goats ( Figure. 5) identi ed by the selection signature. Overall, the highest coverage by ROH was observed on chromosome 1, 2, 6 respectively. Gene INHA, located on chromosome 2, was reported as a candidate gene for litter size in goats [24]. The PPP1R36 and Heat Shock Protein A2 (HSPA2) (CHI10, 26.402-26.719 Mb) identi ed in this communal indigenous goats are involved in heat stress response and in other studies, HSPA2, DNAJC24, and DNAJC13 are associated with the heat shock family of genes [55] . The presence of multiple genes associated with heat stress would seem to suggest that the trait is under intense selection pressure in tropically adapted breeds [48].
In accordance with [46], these regions in humans, when they are present in more than 50% of the individuals of a population, can indicate a strong selection occurrence. The occurrence of ROH hotspots in genomic regions that harbour candidate genes may be involved in selection pressure in response to production and environmental conditions. This study identi ed 58 ROH hotspots in Gauteng and Free State populations and revealed 871 genes and 292 KEGG pathways. This threshold did not yield any results and only in 20% ROH islands were detected in the communal indigenous oat population. The stringent criteria on the communal indigenous goats were used to minimize incorrect discovery of ROH (false positives) within regions of low marker density. The 20% threshold has been used in indigenous Chinese pigs, Jinhua [62]. The minimum expected length of homozygous DNA segments is based on the time frame of approximately 25 generations, over which goats are believed to have been characterized in separate breeds [48].
This study further explored the population genetic structure of all South African communal indigenous goat populations. In accordance with our earlier studies [37,27], the principal component analysis (PCA), the ADMIXTURE analyses based on the SNP array and sequence data sets capitulated the major genetic division among the South African goat populations from two large geographic regions: Eastern Cape and Limpopo. The analysis of genotyping data for communal indigenous goat populations revealed a geographical pattern that underlies the distribution of genomic diversity, and a moderate and weak population structure in the Gauteng, Free State and North West provinces respectively. The genomes of KwaZulu-Natal and Limpopo shared an origin but have substantial admixture from the Eastern Cape and North West populations. Population admixture analysis generated some signals of admixture and underlying genetic relationships among the populations [24]. The observation supported a migration route of ancient goat from the northern part of South Africa to the eastern areas of the KwaZulu-Natal, during their migration periods of the Bantu nation. This results can also be in uenced by communaltraditional indigenous goat farming system and adaptation to different climatic conditions. However, a more extensive sampling of communal indigenous goat populations that would cover more evenly the other South African provinces is necessary to assess the impact of these factors.

Conclusions
The results of this study indicated a greater negative impact of inbreeding in recent times which is important for planning conservation strategies. It was revealed that the occurrence of high N

Genetic diversity indices
Historical and recent effective population sizes (N e ) for each breed were estimated with the SNeP [4], which is based on the relationship between linkage disequilibrium (LD), N e and recombination rate. The different SNP marker distance bins used for r 2 analysis were used to obtain different estimates of Ne at different time points by calculating the number of generations (t) in the past as 1/2c. To verify the accuracy of the coe cient of inbreeding the genomic coe cient was estimated via two methods.
(1) PLINK [52] was used to measure the inbreeding coe cient based on the difference between the observed and expected numbers of homozygous genotypes (F HOM ) using the function -het. The calculation formula was as follows: determined using detectRUNs [6]. F ROH was calculated as follows: where L ROH is the total length of ROH on autosomes and L AUTO is the total length of the autosomes covered by SNPs, which was 2450.71 Mb. Furthermore, the correlation between F ROH and F HOM was calculated for the four populations.

Distribution of Runs of homozygosity
Runs of homozygosity (ROH) were identi ed in every individual using detectRUNS [6] using a sliding window of a speci ed length or number of homozygous SNPs to scan along with each individual's genotype at each SNP marker position to detect homozygous segments. The parameters and thresholds applied to de ne a ROH were (i) a sliding window of 50 SNPs across the genome; (ii) the minimum number of consecutive SNPs included in a ROH was 50; (iii) the minimum length of a ROH was set to 1 Mb to avoid short and common ROH that occur throughout the genome due to LD [55]; and (iv) a maximum of ve SNPs with missing genotypes were allowed in a ROH. ROH was divided into ve physical length categories (1-5Mb, 5-10Mb, 10-20Mb, 20-30 Mb and <40 Mb). The mean number of ROH per individual, the average length of ROH, the total number of ROH per animal, the percentage of chromosomes covered by ROH and mean ROH were calculated on detectRUNS. The genomic inbreeding coe cient based on ROHs (F ROH ) was also calculated as the sum of the length of the autosome covered by ROHs divided by the overall length of the autosome covered by the SNPs as described by [35]. Means and standard deviation (sd) of F ROH were calculated as the sum of the lengths of F ROH1-5Mb, F ROH5-10Mb, F ROH10-20Mb or <20Mb .
To identify the genomic regions that were most commonly associated with ROH, the percentage of the occurrences of a SNP in ROH was calculated by counting the number of times the SNP was detected in those ROH across individuals, and this was plotted against the position of the SNP along the chromosome. This percentage threshold was normalised to 70%, 50% and 20% of individuals per population to be an indication of a possible hotspot of ROH in the genome. The function of these genes and pathways in which they are involved were assessed using the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/) database and literature search.

Population structure
Principal component analysis (PCA) was calculated and plotted in Golden Helix SNP and Variation Suite (SVS) V8.1 (Golden Helix, Inc. 2012). ADMIXTURE v1.3.0 program [2] was used for the analysis of ancestry proportions (admixture) with K set from 2 to 10. The whole genotype data set was subjected to linkage disequilibrium (LD) pruning using the default parameter of PLINK v1.9 (50 SNPs step 5 SNPs, r 2 0.5) [52] prior to use in admixture analysis. GENESIS software [9] was used to visualized admixture plots.

Funding
Research grants from the Department of Agriculture, Land Reform and Rural Development`s Directorate Genetic Resources and Directorate Policy Research and Support funded the eldwork data that include data collection, analysis and interpretation that was used for this paper from the project "Characterisation of indigenous goats" as well as full genome sequences of indigenous goats. The rst author would like to acknowledge support by the University of Limpopo for allowing him to be their Ph.D. student.

Availability of data and materials
All data generated or analyzed during this study are included in this published article, and its additional information les are available from the corresponding author on reasonable request. DNA samples were genotyped using the Illumina Goats SNP50 BeadChip using the In nium assay compatible with the Illumina HiScan SQ genotyping platform at the Agricultural Research Council-Biotechnology Platform in South Africa.
Ethics approval and consent to participate Permission for the study and ethical approval were obtained from the animal research ethics committee of both the University of Limpopo and Department of Agriculture, Land Reform and Rural Development.
Furthermore, verbal consent was given by the goat owners.

Consent for publication
Not applicable.

Figure 8
Clusters inferred from ADMIXTURE at K = 2 -6. The cluster membership of each sample is shown by the colour composition of the vertical lines, with the length of each colour being proportional to the estimated membership coe cient.

Figure 8
Clusters inferred from ADMIXTURE at K = 2 -6. The cluster membership of each sample is shown by the colour composition of the vertical lines, with the length of each colour being proportional to the estimated membership coe cient.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download. Table3.docx Table3.docx