Genetic characteristics and diversity of Korean Jeju Black cattle by whole genome SNP markers

Conservation and genetic improvement of cattle breeds require information about the genetic diversity and population structure of the animals. This study investigated the genetic diversity and population structure of three breeds raised on the Korean peninsula. Three popular breeds found in Korea (Jeju Black, Hanwoo and Holstein) were examined together with six other breeds: Angus, Hereford, Brown Wagyu, Black Wagyu, Brahman and Nellore. Genetic diversity within the breeds was analyzed using common measures, namely minor allele frequency (MAF), observed and expected heterozygosity (H_O and H_E), inbreeding coefficient (F_IS) and past effective population size (N_E). Molecular variance was analyzed and population structure was inferred among the nine cattle breeds using model-based clustering (ADMIXTURE). Genetic distances between breed pairs were evaluated using Nei's genetic distance (D_A) and Weir and Cockerham's F_ST.

3 the most recent 13 generation ago.

Conclusion
This study indicates an alarming decline in the effective population size of Jeju Black cattle. A sustainable breeding policy should therefore be implemented to increase the Jeju Black cattle population for genetic improvement and future conservation.

Background
Cattle have been an integral part of animal agriculture since about 8000 BC, when they are thought to have been domesticated in different parts of the world, such as India, the Middle East and North Africa [1]. Different cattle breeds have been domesticated and have adapted throughout the world to variable geographical and climatic conditions. Jeju Black cattle (Jeju Heugu) is one of the indigenous cattle breeds found on Jeju Island, Republic of Korea.
Jeju Black cattle are thought to have originated from the native Hanwoo cattle of mainland Korea, according to the island model of speciation [2]. Ancient cattle bones from archaeological sites on Jeju Island suggest that the breed existed approximately 1100 to 2000 years ago. DNA analysis of bones recovered from Gonaeri and Gwakji-ri in Aewaleup, Jeju city, revealed that ancestors of present-day Jeju Black cattle have been raised by humans since prehistoric times [3]. Ancient documents (the Annals of the Joseon Dynasty) and a mural painting (Anak Tomb No. 3, Goguryeo Dynasty, 357 AD) also suggest its existence. However, reports on the origin of the breed are conflicting [4]. Whatever their origin, Jeju Black cattle have been categorized as endangered because of a substantial shrinkage in population size up to the 1980s.
This breed is well adapted to the subtropical environment of Jeju Island. Beef from Jeju Black cattle has distinct features, such as high marbling and richness in oleic acid, linoleic acid and unsaturated fatty acids, which make it a premium-quality product. Highly marbled beef is in great demand in Korean cuisine and culture, as is Japanese Wagyu beef (Kobe beef). As the Korean economy has grown in terms of per capita GDP, demand for premium-quality beef has been increasing. In past decades, however, this indigenous breed received little attention because of its poor growth performance and the priority given by the government and farmers to another local beef breed, Hanwoo. In recent decades the South Korean government has taken initiatives to conserve and genetically improve the breed. Genetic diversity of any given species is essential for conservation and future genetic improvement, and both require careful study of the genetic characteristics and population structure of the species. In this study, useful parameters such as allelic richness (A_R), level of inbreeding (F_IS), effective population size (N_E) and genetic distance from other local and exotic breeds were studied in the existing Jeju Black cattle population.
Genomic studies using high-throughput whole genome sequencing data have become popular in recent years. Although the cost of genotyping has fallen, it is still not cheap. Variation in the genes underlying a trait of interest is the raw material for the animal breeder; without genetic variation, improvement is not possible. Single nucleotide polymorphisms (SNPs) are one of the most common types of genetic variant in any organism.
Genotyping with SNP microarray chips, however, provides genomic data in an efficient and cost-effective manner. Many useful genetic parameters, such as linkage disequilibrium (LD), effective population size (N_E), inbreeding coefficient and levels of heterozygosity, can be estimated from SNP data. Commercial and custom-made SNP microarray chips are available as low-density to high-density panels for different livestock species. The challenge lies in calculating the various estimates from SNP data with different statistical approaches. Linkage disequilibrium (LD), which can be described as the non-random association of alleles at two or more loci [5], is a powerful tool in genetics, evolutionary biology and conservation biology. Measuring LD might be useful in rare breeds like Jeju Black cattle, where it can be used to calculate population genetic parameters in the absence of pedigree data. The registered Jeju Black cattle population is reported to be approximately 619 animals (Korea Seed stock Database), while other sources put the number at 400 to 500 [6,7]. Because the small remaining population of Jeju Black cattle is confined to Jeju Island, it is essential to estimate the level of inbreeding, an important parameter for assessing the genetic diversity of a given species. Inbreeding depression, the loss of fitness and reproductive performance in inbred populations, has deleterious effects in many cases.
The inbreeding coefficient can be calculated from both pedigree and genomic data, but pedigree-based estimates are generally lower than those obtained from genomic data. Moreover, the inbreeding coefficient (F_SNP) can be estimated more accurately from genome-wide SNP data [8]. Effective population size (N_E) is an important genetic parameter used to determine the amount of genetic variation, genetic drift and linkage disequilibrium (LD) in both cattle and human populations [9,10]. Marker-based estimation of N_E has gained popularity owing to the large amounts of genetic marker data available from advanced DNA microarray chip technology.
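One widely used marker-based approach relates expected LD to effective population size through Sved's approximation, E[r²] ≈ 1/(1 + 4·N_E·c), where c is the recombination distance between markers in Morgans, with the estimate commonly referred to t = 1/(2c) generations ago. The Python sketch below illustrates that relation only. The function names are hypothetical, and it is an assumption that this model resembles the one used in the present study, since the model is not specified in this part of the text.

```python
def ne_from_ld(r2, c):
    """Effective population size from mean r^2 at recombination distance c (Morgans).

    Rearranges Sved's approximation E[r^2] = 1 / (1 + 4 * Ne * c).
    Assumption: no correction for sample size or mutation is applied.
    """
    if not 0.0 < r2 < 1.0:
        raise ValueError("r2 must lie strictly between 0 and 1")
    if c <= 0.0:
        raise ValueError("c must be positive")
    return (1.0 / r2 - 1.0) / (4.0 * c)


def generations_ago(c):
    """Approximate number of generations in the past reflected by LD at distance c."""
    if c <= 0.0:
        raise ValueError("c must be positive")
    return 1.0 / (2.0 * c)


# Illustrative input only (not a value from this study):
# mean r^2 = 0.2 between markers 0.01 Morgans apart.
print(ne_from_ld(0.2, 0.01), generations_ago(0.01))
```

Shorter recombination distances therefore inform on more distant generations, which is why N_E is usually reported as a trajectory over generations rather than as a single value.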
Development of sustainable breed improvement strategies depends on the precise characterization of animal genetic diversity [11]. Although several studies have investigated the diversity pattern of Korean cattle including Jeju Black cattle [2,6,12], it is still controversial whether Jeju Black cattle form a separate breed or a variety of Hanwoo. A more accurate account of the origin of Jeju Black cattle is necessary for future conservation and improvement programs. Genetic diversity studies using microsatellite markers often report larger genetic differentiation values than those using SNP-based markers [13,14], which is not desirable. Moreover, admixture analysis using SNP markers gives more accurate estimates than pedigree analysis [15].

Results
Within breed genetic diversity
Table 1 presents three measures of within-breed diversity across the studied populations.
The mean minor allele frequency ranged from 0.11 (Nellore) to 0.21 (Hanwoo and Angus). The MAF of Jeju Black cattle was estimated at 0.16, an intermediate value among the studied breeds. Nellore cattle had the lowest expected heterozygosity (H_E = 0.15), whereas Hanwoo (H_E = 0.28) had the highest level of genetic diversity among the studied breeds. F_IS, which measures non-random mating (inbreeding), was negative for all breeds, ranging from -0.018 in Black Wagyu to -0.118 in Brown Wagyu, indicating little inbreeding. Jeju Black cattle had an F_IS of -0.076, which is lower than that of Hanwoo (-0.025).
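The diversity measures reported above can be illustrated with a minimal Python sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele with no missing calls and at least one polymorphic SNP, and it uses F_IS = 1 - H_O/H_E, one common definition that may differ from the estimator used in this study. The function name and the toy data are hypothetical.

```python
def diversity_stats(genotypes):
    """Mean MAF, H_O, H_E and F_IS for one breed.

    genotypes: list of individuals, each a list of SNP genotypes coded
    as the number of copies (0, 1, 2) of one allele. Assumes no missing
    data and at least one polymorphic SNP (otherwise H_E is zero).
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    maf_sum = ho_sum = he_sum = 0.0
    for j in range(n_snp):
        column = [row[j] for row in genotypes]
        p = sum(column) / (2.0 * n_ind)              # allele frequency
        maf_sum += min(p, 1.0 - p)                   # minor allele frequency
        ho_sum += sum(1 for g in column if g == 1) / n_ind  # heterozygote share
        he_sum += 2.0 * p * (1.0 - p)                # Hardy-Weinberg expectation
    ho = ho_sum / n_snp
    he = he_sum / n_snp
    return {"MAF": maf_sum / n_snp, "HO": ho, "HE": he, "FIS": 1.0 - ho / he}


# Toy data: 4 animals x 2 SNPs, for illustration only.
toy = [[0, 1], [1, 1], [2, 1], [1, 1]]
print(diversity_stats(toy))
```

Under this definition a negative F_IS, as seen in all nine breeds, simply means that observed heterozygosity exceeds the Hardy-Weinberg expectation.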

Analysis of molecular variance and population differentiation
Livestock biodiversity can be estimated from the level of genetic variation among breeds. Variation in SNP allele frequencies between breeds can be used to measure the

Population structure analysis between nine cattle breeds
The proportions of individuals from each breed assigned to each cluster inferred by ADMIXTURE are presented in Table 6. Because nine cattle populations were tested, we expected the lowest cross-validation (CV) error at K = 9, but the lowest CV error was found at K = 11 (Figure 4). Thus K = 11 was taken as the most probable number of inferred populations.
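The selection rule described here, choosing the K with the lowest cross-validation error, can be sketched in a few lines of Python. The function name is hypothetical and the error values are placeholders chosen for illustration; they are not the values obtained in this study.

```python
def best_k(cv_errors):
    """Return the number of clusters K with the smallest CV error.

    cv_errors: dict mapping K (int) to its cross-validation error (float).
    """
    if not cv_errors:
        raise ValueError("cv_errors must not be empty")
    return min(cv_errors, key=cv_errors.get)


# Placeholder CV errors for illustration only (hypothetical numbers).
example_errors = {9: 0.560, 10: 0.555, 11: 0.550, 12: 0.553}
print(best_k(example_errors))
```

A minimum at a K larger than the number of sampled breeds, as found here, usually points to substructure within at least one breed.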
Effective population size over the past generations
Estimation of effective population size (N_E) is necessary to determine the accuracy of genomic selection [17]. We therefore estimated N_E in the nine cattle breeds at t generations ago (Figure 6 and Table 7). N_E differed between populations. Figure

Discussion
Genetic characterization of animals on the basis of genomic data has become attractive to animal geneticists and biotechnologists because of easy access to high-throughput data from SNP microarray chip technology. In genetic diversity analysis, SNP markers have many advantages over traditional microsatellite markers owing to their higher resolution, although a set of microsatellites is recommended by the FAO for assessing the genetic diversity of farm animals and endangered species [11,18]. In recent years, genomic characterization using SNP markers has been carried out in a variety of cattle breeds, such as Irish Kerry cattle [19], Tyrol Grey [20], Spanish beef cattle breeds [21], Canchim [22], Chinese Yiling yellow cattle [23] and many other indigenous and exotic cattle breeds raised worldwide [24-30]. Among Korean cattle breeds, Hanwoo has received much more research interest because it has been part of the national breeding program since the 1970s [2]. In this study, we focused on the genetic characterization of Jeju Black cattle (JJBC) alongside Hanwoo, Japanese Wagyu and Holstein raised in Korea, to allow a more accurate comparison with other breeds.
Genetic diversity of cattle breeds can be estimated by various indices, such as allelic richness (A_R), heterozygosity (expected versus observed, H_E vs H_O) and inbreeding level (F_IS). In our study all cattle breeds had slightly higher observed than expected heterozygosity. Jeju Black cattle showed lower genetic variability (H_E = 0.21) than Hanwoo and Holstein in Korea, both of which had H_E = 0.28. Sharma [12,31] reported different levels of heterozygosity in JJBC (H_E = 0.39 and 0.25), while Eva [2] reported H_E = 0.29. The heterozygosity level in our study is close to the level reported by Sharma [12,31]. The differences might be due to the use of different genotyping platforms, marker densities and quality control criteria [2]. The lower heterozygosity in JJBC than in Hanwoo and Holstein raised in Korea could be due to the small island population or to the use of few breeding males, which can increase inbreeding. However, Makina [26] states that allele frequencies might be a poor indicator of inbreeding; to monitor the real status of inbreeding, assessment should therefore be done every five years to detect any unfavorable changes in inbreeding levels. The inbreeding level (F_IS) in JJBC was estimated at -0.076, which is more negative than in the other breeds raised in Korea, Hanwoo (-0.025) and Holstein (-0.026). Among the studied populations, genetic variability was lowest in the B. indicus breeds Nellore (H_E = 0.15) and Brahman (H_E = 0.17).
Indicine breeds might have less genetic variability than taurine breeds, as observed by Lin
Analysis of molecular variance also reveals the partitioning of genetic variation into the overall fixation index (F_ST), within-population inbreeding (F_IS) and total inbreeding (F_IT).
Combining all nine breeds, 85% of the total genetic variation was found within populations. This was lower than the within-population genetic variation observed in South African cattle populations (92%) [26] but higher than that reported for Iranian cattle (82.88%) [28] and Ethiopian cattle populations (83.96%) [24]. The overall F_IS was negative (-0.03) and not significant (P > 0.05), probably because of the low level of inbreeding within populations. The total inbreeding estimate (F_IT) and the estimate of population differentiation (F_ST), however, were 0.148 and 0.173 respectively, both significant (P < 0.001) in the studied population.
3A) also showed that Hanwoo and Wagyu breeds are much closer than JJBC.
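As a consistency check on the fixation indices reported in this section, the standard relation between Wright's F-statistics can be applied to them. This is a worked sketch that uses only the rounded estimates quoted in the text:

```latex
(1 - F_{IT}) = (1 - F_{IS})(1 - F_{ST})
\quad\Rightarrow\quad
F_{IT} = 1 - (1 + 0.03)(1 - 0.173) = 1 - 1.03 \times 0.827 \approx 0.148
```

The result agrees with the reported F_IT of 0.148, so the three indices are mutually consistent.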
Unsupervised model-based clustering of our data with ADMIXTURE software estimated general patterns of admixture and the genetic relationships between cattle breeds.
This revealed that 95% of Hanwoo were assigned to cluster ten, whereas 39%, 38%, 23% and 1% of JJBC were assigned to clusters five, one, six and ten respectively. This means that JJBC shared its genome only with Hanwoo cattle. The rest of the Hanwoo genome (3%), on the other hand, was assigned to clusters one, five and six.
Japanese Brown Wagyu (100%) stood alone in cluster three, but Black Wagyu (96%) clustered in eleven, with 4% of its genome assigned to clusters three, nine and ten. Angus (97%) was assigned to cluster nine, with 2% of its genome assigned to clusters eight and ten. Brahman and Nellore shared only 1% of their genome, whereas Hereford and Angus were assigned to clusters seven and nine, with 1 to 2% of their genome assigned to these clusters. In addition, 95% of Holstein raised in Korea were assigned to cluster eight, with 4% of the genome assigned to clusters nine and ten. Figure 5 shows that the Brahman and Nellore populations had the lowest level of admixture in the present study, while JJBC had the highest. JJBC shared few genetic links (1%) with Hanwoo cattle, which might reflect co-ancestry in the origin of these two breeds.
Our admixture analysis clearly revealed that Korean and Japanese taurine cattle differ from European taurine cattle.
ago, whereas other studies performed by Sharma [12] showed their number to be 67.
Sudrajad [6] calculated the N_E of JJBC to be 60 at 11 generations ago, and Eva [2] reported a value of 11 at the most recent generation. The other Korean breed, Hanwoo, had a large N_E of 209 among the studied breeds, but this is still lower than the values reported by Li and Kim [9], who estimated the N_E of Hanwoo at 630 at 11 generations ago, and by Sudrajad [6], who estimated it at about 531. Differences among reports might be caused by many factors, such as sample size, SNP quality control measures and the model used to study LD and N_E [6]. In our study, however, the JJBC sample size was 78, the highest among the previous reports [2,6,12]. The N_E value of JJBC seems to be sufficient for maintaining genetic diversity in the short term, as suggested by

Conclusions
In conclusion, the present study confirms that a significant amount of genetic variation is still retained in Korean cattle, especially in the JJBC population. We therefore suggest that Jeju Black cattle can be regarded as a separate breed, although JJBC shares some alleles with Hanwoo cattle. Further in-depth study with whole genome resequencing and high-density marker scans in a larger population will help to measure genetic parameters accurately and to establish JJBC as a separate breed.
The LD estimator r² is preferred for association studies because of its robustness, simplicity and insensitivity to changes in gene frequency and effective population size. Estimates of D′, on the other hand, are sensitive to allele frequency, especially when one allele is rare, and are inflated in small samples [44]. We therefore focused on r² for further characterization of LD. The r² estimator represents the squared correlation coefficient (r) between two variables (alleles) at two separate SNP marker loci [45]; r² = 1 when only two haplotypes are present, which is usually a consequence of genetic drift or population bottlenecks. PLINK software [35] was used to estimate r² for post-quality-control SNPs with known genomic locations on all autosomes. Here, r² can be expressed by the following equation:
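A standard form of this squared correlation is given below as a sketch. It assumes that P_a and P_b denote the marginal frequencies of allele a at the first locus and allele b at the second; these two symbols are an assumption, since they are not defined in the surrounding text.

```latex
r^{2} = \frac{\left(P_{ab} - P_{a}P_{b}\right)^{2}}{P_{a}\,(1 - P_{a})\,P_{b}\,(1 - P_{b})}
```

The numerator is the squared disequilibrium coefficient D = P_ab - P_a P_b, so r² = 0 when the two loci are independent.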

Animals and genotypes
Where P_ab represents the frequency of haplotypes consisting of allele a at the first locus and allele b at the second locus [17]. Pairwise r² values were calculated for each chromosome in each breed, as well as across breeds, with the --r2 command, using the default settings. In this study, we used PLINK '--ld-window and --ld-window-r2 commands' values, so that correlations between all possible SNPs were tested and that there was no

Table 7 Effective population size (N_E) across nine cattle breeds. Gen Ago, generations ago.
Table 9 Mean pairwise linkage disequilibrium (LD) r² estimates for different inter-SNP distances, estimated with PLINK v1.9.
Cross validation plot. Cross-validation error was lowest for K = 11, which indicates that K = 11 is the optimal number of clusters.

Figures
Effective population size (N_E) in nine cattle breeds.