We defined ROH as the lengths of homozygous genotypes that were >1 Mb and contained only up to one heterozygous genotype. Given the strong linkage disequilibrium (LD) between SNPs with a distance up to 100 Kb [33], short homozygous haplotypes are expected to be prevalent in the buffalo genome. Thus, we set a minimum length of 1 Mb and a minimum number of 40 (AZ) and 38 (KHZ) SNP (as described in methods section) to avoid detecting small and prevalent haplotypes as ROH. Unlike human populations, livestock species generally have higher levels of autozygosity and longer ROHs [13, 20, 21]. However, genotyping errors can always affect the quality of ROH calling [12]. Therefore, we allowed one heterozygous SNP in ROH [25, 30, 31] to avoid losing particularly long ROH because of a single genotyping error.
As presented in Table 1, more than 53% of the detected ROH were 2–4 Mb in length. The proportion of different lengths of ROH can be used as an indicator of the number of past generations in which inbreeding has occurred, because the recombination events can rearrange the chromosomes and reduce the length of ROH. Thus, recent inbreeding results in longer ROH because of long IBD stretches. In contrast, short ROHs arise as a result of ancient inbreeding because in meiosis across generations, the long IBD segments are broken down [19]. We detected ROH with a length from 2 to 4 Mb in all of the samples (Additional file 1), which might indicate that some inbreeding events occurred about 20 generations ago [9]. However, our results should be interpreted with caution. As reported by Ferenčaković et al. [34], a medium-density chip could result in overestimation of the number of long-length ROHs (>4 Mb), probably because some heterozygous genotypes tend to appear in these ROHs by increasing the density of markers. Nevertheless, our results were in line with a previous report of a relatively sharp decrease in the effective population size (Ne) of AZ and KHZ breeds and the consequent increased rate of inbreeding since 20 generations ago [33].
The portion of the genome that was autozygote in the AZ and KHZ breeds was lower than the reported ROH coverage in the Marchigiana beef breed (7%) [15], Austrian dual purpose breeds (9%) [35], and Holstein cattle (10%) [36]. This could be because of lower inbreeding in Iranian water buffalo or because we ignored ROH of <1 Mb in length in our study.
On average, FHOM, FUNI and FGRM were higher in AZ than they were in KHZ. However, the previously reported Ne for AZ (477) was larger than it was for KHZ (212) [33]. Therefore, we expected a lower inbreeding level in AZ. The only comparable estimated inbreeding with our expectation was FROH, which showed lower inbreeding for AZ (0.043) than for KHZ (0.059).
The highest correlation was observed between FUNI and FROH (AZ=0.98 and KHZ=0.94). Literature has reported different correlation coefficients between FUNI and FROH (0.15–0.80) [14], between FHOM and FROH (0.06–0.95) [14, 21, 27], and between FGRM and FROH (0.17–0.81) [21, 37, 38]. The considerable variation among different studies may be because of a strong dependency of FHOM, FUNI and FGRM on allelic frequencies [39].
The FPED of 0.03 previously reported in Iranian buffalo [40] was lower than the estimated FROH in the current study. Given that pedigree data were not available for our study, we could not calculate FPED and compare it with FROH. However, previous studies reported moderate to high (0.47–0.82) and low to moderate (0.12–0.76) correlations between FPED and FROH in cattle and sheep, respectively [14, 17]. A low to moderate correlation between FPED and FROH was also reported by Peripolli et al. [13] in Gyr cattle, suggesting that FPED may not accurately capture small IBD segments that result from ancient inbreeding. Further, accurate and in-depth pedigree records are required to measure FPED. Additionally, methods based on allelic frequency have demonstrated considerable variation among different breeds [14]. Given that ROH does not depend on allele frequencies, and can capture recent and ancient inbreeding, it seems to be a suitable method for measuring inbreeding.
The total length of ROH islands were about 6 and 15 Mb in the AZ and KHZ breeds, respectively (Additional file 3). Consequently, fewer genes were identified in ROH islands in the AZ breed than in the KHZ breed; that is probably why the genes located in ROH islands of the AZ breed were not enriched in any GO terms (P>0.05). In the KHZ breed, however, the genes located in the ROH islands were significantly enriched (P≤0.05) in 40 GO terms (Additional file 4). These GO terms belonged to 23 biological processes (BP), 12 cellular component (CC) and 5 molecular function (MF) groups. In this paper, we focused principally on the GO terms that include the genes with known large effects on important traits in livestock.
Five genes were identified with positive regulation of DNA metabolic development (GO:0051054) in the BP group. Among these genes, STAT6 (signal transducer and activator of transcription 6, on BTA5) has been reported to have large effects on the growth efficiency and the quality of carcass in cattle [41]. Additionally, using co-expression network analysis, Nguyen et al. [42]. reported the critical role of STAT6, PBX2 (PBX homeobox2) and PBRM1 (Protein polybromo1) as transcription factors in regulating pubertal development in Brahman heifers.
Twelve genes in ROH islands were associated with lipid metabolic process (GO:0006629) in the BP group. Of these genes, BMP2 (bone morphogenetic protein 2, on BTA13) plays a major role in rebuilding hair follicles in goats [43]. Further, BMP2 in porcine, cattle and sheep has been reported to have an influence on regulating body size and muscle development [44-47]. Kim et al. [48] found several signatures of selection containing genes such as BMP2 associated with body size and development in goats and sheep native to Egypt. These researchers concluded that the genes influencing body size may be important in regulating adaptation to hot, arid habitats because efficiency in thermoregulation can be associated with body size. Supporting their conclusion is the fact that most breeds in tropical zones have smaller body size than breeds in temperate zones because tropical breeds can regulate their body temperature more efficiently [49]. However, other factors that differ between temperate and arid zones may also contribute to variations in the body sizes of breeds living in different climates.
CYP27B1 (cytochrome P450 family 27 subfamily B member 1) located on BTA5 was also one of the genes enriched in the lipid metabolic process (GO:0006629). This gene is important for making 1-α-hydroxylase, which is required in vitamin D bio-activation, and has been reported to be up-regulated as a result of bacterial infection, suggesting that this gene plays a role in modulating innate immune responses [50].
In ROH islands on BTA13, three genes were associated with the positive regulation of DNA replication (GO:0045740) in the BP group. Proliferating cell nuclear antigen (PCNA) has been reported to be associated with follicular development and growth in buffalo ovaries [51], and may therefore be related to fertility performance.
Single-organism cellular process (GO:0044763) with 64 genes was significantly enriched (P=0.05) in ROH islands, including MARS (methionyl-tRNA synthetase, on BTA5), and ADRA1D (adrenoceptor alpha 1D, on BTA13). MARS has been reported to influence milk and protein production in Chinese [52] and Portuguese [53] Holstein cattle, and ADRA1D largely affects milk protein in Murrah dairy buffalo [54]. INHBC and INHBE (inhibin beta C and E subunits, on BTA5) have been reported as candidate genes associated with reproductive performance in tropical young bulls [55], and composite reproductive traits in Lori-Bakhtiari sheep [56]. KCTD16 (potassium channel tetramerization domain containing 16, on BTA7) was reported as a candidate gene for meat quality in Simmental beef cattle [57], for residual feed intake in Junmu White pigs [58], and for fat yield in Nordic Holstein cattle [59]. PMEL (premelanosome protein) and MYO1A (myosin IA) on BTA5 have been reported as putative candidate genes related to coat colour phenotypes in cattle [60, 61]. PMEL is required for the melanin biosynthesis process in the pigmentation of hair, mucous membranes and eyes [62]. In cattle, PMEL is reported as a candidate gene associated with the dilution of coat colour and consequently colour intensity [63, 64]. Light coat colouring can be beneficial for animals in adapting to hot climates because it can help them to reduce sunlight absorption [65]. However, most of the AZ and KHZ buffalo have a dark coat, which could be a result of some other favourable traits associated with a darker coat colour or the result of artificial selection caused by human interference. SUOX (Sulphite oxidase, on BTA5), within this BP category, was reported to be associated with bone development in cattle [66].
The average LD (r2) between adjacent SNPs in ROH islands was higher than the r2 of adjacent SNPs located on the same chromosome (Additional file 3). Thus, the recombination rates in the ROH islands were lower than those in the rest of the genome. These results are in line with some previous studies [13, 17]. However, a moderate recombination rate has been reported between the SNPs in ROH islands in Valle del Belice sheep [67]. Additionally, ROH hotspots can result from a wide range of underlying causes such as inbreeding and selection [12]. Peripolli et al. [13] argued that the high LD observed in most ROH hotspots is not necessarily caused by selection or conserved IBD haplotypes, but can be an indication of a lower recombination rate in those regions. Nevertheless, most of the ROH hotspots in our study overlapped with selection signature regions found with iHS, which supports the theory that ROH can be used to find genomic regions that have been under natural and/or artificial selection.
Buffalo species have a relatively lower heat tolerance capability than some other livestock species because of their inadequately dispersed sweat glands and their dark coat colour [68]. However, Iranian buffalo breeds have historically been raised in a hot climate [69]. Therefore, selection for higher heat tolerance may have occurred in Iranian buffalo for better adaptation to heat stress [5]. It has been reported that combined networks of multiple genes are often involved in the regulation of complex traits such as adaptation to hot climates [48, 70, 71]. Thus, selection for complex traits would leave only minor footprints because of the selection for numerous regions with lower intensity across the genome [70]. Therefore, we expected to find several genes directly or indirectly influencing different traits that were under artificial selection or important for adaptation and survival in hot areas. We found genes influencing energy and digestive metabolism (KCTD16), autoimmune response (CYP27B1), thermoregulation (BMP2), embryonic development and reproduction (STAT6, PCNA, INHBC and INHBE). These genes seem to be important for species such as water buffalo that have adapted to a hot climate [48].