Runs of homozygosity and signatures of selection for number of oocytes and embryos in the Gir Indicine cattle

Runs of homozygosity (ROH) and signatures of selection are the results of selection processes in livestock species that have been shown to affect several traits in cattle. The aim of the current work was to verify the profile of ROH and inbreeding depression in the number of total (TO) and viable oocytes (VO) and the number of embryos (EMBR) in Gir Indicine cattle. In addition, we aim to identify signatures of selection, genes, and enriched regions between Gir subpopulations sorted by breeding value for these traits. The genotype file contained 2093 animals and 420,718 SNP markers. Breeding values used to sort Gir animals were previously obtained. ROH and signature of selection analyses were performed using PLINK software, followed by ROH-based (FROH) and pedigree-based inbreeding (Fped) and a search for genes and their functions. An average of 50 ± 8.59 ROHs were found per animal. ROHs were separated into classes according to size, ranging from 1 to 2 Mb (ROH1–2Mb: 58.17%), representing ancient inbreeding, ROH2–4Mb (22.74%), ROH4-8Mb (11.34%), ROH8-16Mb (5.51%), and ROH>16Mb (2.24%). Combining our results, we conclude that the increase in general FROH and Fped significantly decreases TO and VO; however, in different chromosomes traits can increase or decrease with FROH. In the analysis for signatures of selection, we identified 15 genes from 47 significant genomic regions, indicating differences in populations with high and low breeding value for the three traits.


Introduction
Natural and artificial selection have been used over the years to achieve more desirable phenotypes in livestock species (Brito et al. 2017). Runs of homozygosity (ROH), as a result of selection processes are contiguous lengths of homozygous genotypes that are present in an individual due to parents transmitting identical haplotypes to their offspring (Broman and Weber 1999;Purfield et al. 2012;Brito et al. 2017). In addition to natural/artificial selection, ROH size and distribution across the genome depend on several factors, such as genetic drift, founder effect, and effective population size (Rebelato and Caetano 2018). As patterns of ROH may be decomposed, the ancestry of an individual and its population can be evaluated by the extent and frequency of ROHs in the following way: the shorter the segment, the more ancient is the relatedness of individuals (Purfield et al. 2012), and the longer the ROH segment, the more likely the inbreeding is recent in the population (Gibson et a. 2006;Kirin et al. 2010). These findings agree with Rebelato and Caetano (2018), to whom ROH sizes tend to decrease in the herd generation after generation.
According to Ma et al. (2019), the selection of dairy breeds over time focused on production traits, which decreased reproductive performance. Moreover, the negative correlation between milk production and reproductive traits, such as oocytes and embryos, is also described in the literature (Leroy et al. 2008;Ayasan et al. 2011). When exploring ROH analysis for Gir dairy cattle, Peripolli et al. (2018) found genes and ROHs associated with milk production and heat adaptation, but nothing was mentioned about reproductive traits. In addition, the analysis of ROH is useful to monitor the rate of inbreeding in a given population as well as the negative effects of inbreeding. In a breeding program, ROH analysis and monitoring might enhance the response to selection, pinpointing genomic regions associated with the target traits (Shi et al. 2020).
Genomic regions harboring candidate genes can be identified through signatures of selection, which are unique patterns in the genome of individuals who have undergone natural and/or artificial selection (Qanbari and Simianer 2014). Inside these genomic conserved regions, genes affecting phenotypic variation in livestock can be identified, in addition to exploring the genetic diversity of a population (Brito et al. 2017). Methodologies to identify signatures of selection have been studied for over 20 years. As an example, Akey et al. (2002) explored signatures of selection using high-density SNP panels with Wright's Fixation Index (F ST , "Wright's Fixation Index"), which is ideal to compare populations (Wright 1978). Randhawa et al. (2016) carried out a review on signatures of selection identified in cattle and found that FST is the most commonly used. This methodology was also used to identify signatures of selection between beef and dairy Gir cattle (Maiorano et al. 2018). Randhawa et al. (2016) reviewed selection signatures found for adaptation, appearance, and production traits in cattle and suggested that conclusions are limited in Zebu breeds, as there are few studies compared to European breeds.
Zebu breeds such as Gir cattle (Bos primigenius indicus) have higher oocyte quality than taurine cattle (Bos primigenius taurus), which allows for greater success in reproductive biotechnologies such as in vitro embryo production (Drum et al. 2019;Rotar and Souza 2019). Given the importance of the Gir breed in the tropics, work has been carried out to explore its productive capacity (do Nascimento Rangel et al. 2018;Pereira et al. 2019;Hortolani et al. 2022) as well as its reproductive performance (González-Herrera et al. 2022). However, citations exploring the production of oocytes and embryos using genomic approaches are still lacking in Zebu cattle.
In this way, our hypothesis in the current manuscript is that ROH and ROH-based inbreeding in Gir cattle can be positively or negatively related to the number of aspirated oocytes and embryos produced in vitro. Also, we hypothesized that ROH-based genomic regions differing between subpopulations sorted by breeding values for the studied traits can be identified, as well as putative candidate genes. The aim of the current work is to verify the profile of runs of homozygosity and ROH-based inbreeding effects in the number of total and viable aspirated oocytes and the number of embryos produced in vitro in Gir Indicine cattle. In addition, we aim to identify signatures of selection and genes among animals grouped according to the breeding value for these traits and discuss their enriched regions.

Genotypic data
A premade genotype file with 2,093 animals and 420,718 SNP markers was provided by Brazilian Agricultural Research Corporation (EMBRAPA)-Center for Research in Dairy Cattle. This premade file is a result of animals genotyped with different density SNP panels, including Illumina BovineSNP50 BeadChip v2 (50 K), Illumina BovineHD BeadChip (777 K), GGP Indicus (34 K), Z-Chip (30 K), and GGP Indicus (50 K). All were extracted and/or imputed for 777 K, resulting in 420,718 SNPs.

Phenotypic data
A dataset with 17,526 follicular aspirations from 1641 Gir donors was provided by five farms. These donors are daughters of 127 sires and 771 dams. These herds are among the largest herds and include various lineages of the Gir breed, fully representing the Brazilian National Dairy Gir breeding program. The traits evaluated were the number of viable oocytes (VO), number of total oocytes (TO), and number of embryos (EMBR). The data were used to verify inbreeding depression (see Inbreeding effect on traits section).
The phenotypic dataset along with a pedigree file, including 4679 animals, containing up to fifteen generations previously described by Rocha et al. (2022), were previously used to estimate breeding values (BVs) for the animals in these three traits. A brief description of the data can be seen in Table 1. Additional information of the data such as heritability, repeatability, genetic, and phenotypic correlation can be seen in Rocha et al. (2022). Briefly, a normal distribution of the residuals was obtained for all traits using the logarithmic scale ln (X + 1), where ln is the natural logarithm and X is the raw data (oocyte and embryos). The normal distribution was verified by the Anderson-Darling test (Stephens 1986;Thode 2002). BVs for all traits were predicted using BLUPF90 family programs (Misztal et al. 2002) with animal models. BVs were used in the current work to sort and group Gir subpopulations to run the signature of selection analysis.

Runs of homozygosity (ROH)
The ROH analysis was performed with the genotype file with 2093 animals and 420,718 SNPs and their pedigree (sire and dam), with a total of 2625 animals. All parameters used to define ROHs were based on the literature (Purfield et al. 2012;Peripolli et al. 2018) and testing. First, our genotype file is a result of genotyped animals with different density SNP panels, which were extracted and/or imputed for 777 K. Therefore, we separated the genotype file into three files: (1) all animals; (2) only animals originally genotyped with the Illumina BovineHD BeadChip; and (3) only animals genotyped with density chips other than the Illumina BovineHD BeadChip. In this way, we ran around 30 analyses for each file testing different parameters to obtain ROH with file 1. The same analyses were tested for files 2 and 3. Our aim with this testing was to define the parameters to obtain ROH and check the proportion of homozygosity and heterozygosity between files 1, 2, and 3, as the addition of heterozygous genotypes may occur in the imputation processes.
In this way, homozygous segments in each individual's genome were detected by the PLINK 1.9 program (Chang et al. 2015) using the following parameters: (i) sliding window of 50 SNPs across the genome; (ii) proportion of homozygous overlapping windows was 0.05; (iii) minimum number of consecutive SNPs included in an ROH was 50; (iv) minimum length of an ROH was set to 1 Mb; (v) maximum gap between consecutive homozygous SNPs was 1000 kb; (vi) density of one SNP per 100 kb; and (vii) maximum of 1 SNP with missing genotypes and up to one heterozygous genotype were allowed in an ROH. For quality control, the parameters considered were (viii) Hardy-Weinberg equilibrium (HWe) exact test P-value of 10 −6 ; (ix) minor allele frequency (MAF) of 0.05; (x) call rates for variants of 0.02; and (xi) call rates for samples of 0.1. Variants with values lower than those established for HWe and MAF resulted in variant exclusion, as well as higher values of call rate for variants. In the same way, samples with higher values than the one established for call rate for samples were excluded. After quality control, 380,942 SNPs and all 2,093 animals remained. ROHs were obtained in each animal and classified by size: ROH1-2 Mb, ROH2-4 Mb, ROH4-8 Mb, ROH8-16 Mb, and ROH > 16 Mb. All ROH classes were present in generations 7 to 14, except ROH > 16 Mb, which was only present in generations 7 to 13.
Numbers of ROHs per chromosome were obtained by calculating the average number of ROHs for all animals in each chromosome. The average size of ROHs per chromosome was obtained by calculating the average size of ROH per animal, and then the average per chromosome. The total size of ROHs per chromosome was obtained by calculating the sum of ROH sizes per animal and then the average per chromosome. Genome coverage, which is the proportion of the autosomal genome covered by ROHs, was also obtained. Genome coverage by each ROH class was calculated by multiplying the average number of ROHs per animal by the average ROH length, then dividing it by the total size of the bovine genome, 2489.37 Mb, and finally multiplying by 100 to obtain the percentage value. Genome coverage by chromosome was calculated by dividing the average length of ROH obtained for a given chromosome by the total size of this chromosome. This result was multiplied by 100 to obtain the percentage value. The total size of the bovine genome and chromosome lengths were obtained according to the consensus map ARS-UCD1.2 (Rosen et al. 2020). where L ROH is the length of ROH and L aut is the total size of the autosomes covered by markers. L aut was taken to be 2489.37 Mb (2,489,385,779 base pairs), based on the consensus map of the bovine genome, ARS-UCD1.2 (Rosen et al. 2020). In addition, F ROH was calculated by chromosome and based on the ROH distribution of five minimum different lengths: F ROH 1-2 Mb, F ROH 2-4 Mb, F ROH 4-8 Mb, F ROH 8-16 Mb, and F ROH > 16 Mb. In addition, pedigree-based inbreeding (F ped ) was obtained considering the pedigree file with 4,679 animals using the "pedigree" package (Coster 2012) in R software (R Core Team (2022)-R version 4.0.2; R Foundation for Statistical Computing, Vienna, Austria).

Inbreeding effect on traits
Considering F ROH and pedigree-based inbreeding per animal, a simple linear regression was used to verify the effect of the inbreeding coefficient on the raw number of structures (number of oocytes and embryos not log transformed) to verify how the original average number of structures varied with inbreeding increase. The following model was used: ,where y is the number of raw structures, x is the F ROH , β0 represents the intercept, β1 is the regression coefficient, and e is the error. Linear regression was also obtained for each trait by chromosome. A Bonferroni correction test was performed for general inbreeding, adjusting the P-value for the number of tests [P < 0.05/(3*(29 + F ROH ))], where 3 is the number of traits, 29 is the number of chromosomes, and F ROH is the general inbreeding. This test was also performed for each individual chromosome and F ped , adjusting the P-value for the number of tests [P < 0.05/3], where 3 is the number of traits.

Signatures of selection
Breeding values (BV) for VO, TO, and EMBR were previously obtained. Animals were ordered from high to low BV for each trait. The breeding values' mean and standard deviation for each trait were calculated. Animals were separated into group 1 (high BV) if they had a breeding value above the mean plus one standard deviation and group 2 (low BV) if their breeding value was below the mean minus one standard deviation. Group 1 had 312 animals for VO, 313 for TO, and 319 for EMBR, and group 2 had 332 animals for VO, 332 for TO, and 330 for EMBR. For each trait, both groups were compared in the search for signatures of selection. A similar approach was used by Nani and Peñagaricano (2020). In group 1, there were 312 common animals between VO and TO, 216 between VO and EMBR, and 216 between TO and EMBR. In total, 216 animals were common among all three traits. In group 2, there were 332 common animals between VO and TO, 243 between VO and EMBR, and 243 between TO and EMBR. In total, 243 animals were common among all three traits. The same group sorting was performed considering only animals with accuracy of BV higher than 0.75. Group 1 had 76 animals for VO, 84 for TO, and 48 for EMBR, and group 2 had 90 animals for VO, 100 for TO, and 50 for EMBR. In group 1, there were 76 common animals between VO and TO, 32 between VO and EMBR, and 32 between TO and EMBR. In total, 32 animals were common among all three traits. In group 2, there were 86 common animals between VO and TO, 40 between VO and EMBR, and 38 between TO and EMBR. In total, 38 animals were common among all three traits (Supplementary File). Due to the small number of animals in each group considering accuracy of BV higher than 0.75, more than 2000 signatures of selection were identified, thus these results were not used for further analyses. Signature selection analyses were performed using Wright's FST command on PLINK software to obtain F ST estimates for each autosomal variant (Weir and Cockerham 1984). The signatures of selection were identified when F ST values were above 0.15, since as described by Wright (1978), the range from 0.15 to 0.25 indicates moderately great differentiation. This methodology has been used in cattle (Makina et al. 2015;Zhao et al. 2015). A total of 47 genomic regions were found, which were used to find positional genes located within a 20,000 base pair (bp) window (10 kb upstream and downstream) of the SNP. Genes were retrieved from the National Center for Biotechnology Information (NCBI 2022) (http:// www. ncbi. nlm. nih. gov) based on the ARS-UCD1.2 assembly of the bovine genome. The literature was reviewed to identify whether the gene functions were related to VO, TO, and EMBR. In addition, all annotated genes, except LOCs (uncharacterized regions in the genome), were further investigated based on the key words "oocyte" and "embryo" (one at a time) through VarElect NGS Phenotyper (Stelzer et al. 2016). The VarElect tool (https:// varel ect. genec ards. org/) has been used in the literature to prioritize variant genes based on the disease/phenotype of interest (Suárez-Vega et al. 2018;Gutierrez-Quintana et al. 2021). To make such inferences, VarElect uses algorithms seeking associations between genes and phenotypes based on shared pathways, interaction networks, parology relations, and other sources.
For the identification of enriched transcription factors with the TFM-Explorer Transcription Factor Matrix Explorer -Bonsai Bioinformatics (2022) (http:// bioin fo. lifl. fr/ TFM/ TFME/), promoter sequences (FASTA format) 3000 bp upstream and 300 bp downstream from the gene transcription start site (Otto et al. 2019) based on the bovine genome were used. To identify the most likely candidate genes related to all traits, Cytoscape software (Shannon et al. 2003) was used to obtain biological processes from the Biological Networks Gene Ontology tool (BiNGO) (Maere et al. 2005). From all genes identified, LOCs were excluded for biological processes enrichment analyses, which were performed using the ClueGO application in Cytoscape (Bindea et al. 2009), considering a Bonferroni correction test.

Runs of homozygosity (ROH)
A total of 105,327 ROH were found among the 2,093 Gir animals analyzed in this study. An average of 50 ± 8.59 ROHs were found per animal, with a maximum number of 95 and a minimum of 26 ROHs. The average length of ROHs per individual was 3.13 ± 1.01 Mb, with a minimum of 1 Mb and a maximum of 89.46 Mb. ROHs ranging from 1 to 4 Mb were present in all animals and represented 80.91% of all ROHs ( Table 2). The proportion of ROHs larger than 16 Mb was 2.24%, with an average length of 24.66 Mb, and they were present in 1,132 animals. ROHs larger than 16 Mb covered 2.07% of the genome, the largest proportion compared to the other classes.
The average number and size, and total size of runs of homozygosity (ROH) per animal for each chromosome are presented in Fig. 1A, B, and C and Supplementary Table S1. The highest number of ROHs was identified on BTA5 (3.153), followed by BTA2 (3.004) and BTA1 (2.975), while the lowest was identified on BTA28, with 0.687 ROHs detected, BTA27 (0.715), and BTA25 (0.763). With respect to the size of the ROHs, BTA28 had the largest ROHs (4.042 Mb), followed by BTA14 (3.861 Mb) and BTA20 (3.699 Mb), and the smallest ROHs were present on BTA12 (2.699 Mb), BTA18 (2.783 Mb) and BTA17 (2.856 Mb).

Inbreeding effect on traits
The number of VO and TO significantly decreased (P < 0.001) with the increase in F ROH and F ped (Fig. 3). More details about intercept (β0), linear coefficient (β1), R 2 , adjusted R 2 , and P-values are available in Supplementary  Table S2.
Despite the overall decrease in the average raw number of structures for the three traits, for each chromosome,

Signatures of selection
Manhattan plots with F ST values for TO, VO, and EMBR are presented in Figs. 5, 6, and 7, respectively. F ST values higher than 0.15 indicate moderately great differentiation between the groups of high and low breeding values (obtained for the traits TO, VO, and EMBR with logarithmic transformation). The same 11 significant SNPs were identified for total and viable oocytes, and 40 significant SNPs were identified for embryos. Two positions on BTA11, one on BTA14 and one on BTA21 were coincident among all traits.  Table S3). Excluding uncharacterized regions in the genome (LOCs), 11 protein-coding genes were found. According to the VarElect online tool, KNTC1 showed a direct match to "oocyte" with a score of 0.27. This score is an indication of the strength of the connection between the gene and the traits, ranging from 1 to 200. The ratio of each score is calculated, and total scores are normalized to a range of 0.05-0.8. All other 10 genes were indirectly matched to oocytes. For "embryo," the genes (scores) directly matched were ABL2 (0.67), TENM4 (0.43), SORL1 (0.24), KNTC1 (0.21), DNAH6  Table S3). These TFs were then used to search biological processes and to determine which gene ontology (GO) terms were significantly overrepresented. In this way, 127 biological processes were identified (Supplementary Table S4). The gene set with the eleven protein-coding genes found in signatures of selection results was used to perform the biological process enrichment analysis. For that, a GO network highlighting biological roles and gene interactions was generated (Table 3). These are related to embryonic development, blastocyst formation, genetic

Discussion
In the current article, we use genomic analysis to describe not only the state of art for the Gir Indicine cattle level of inbreeding, but we also use such information to build knowledge about the effects of such inbreeding in the production of oocytes and embryos, which is ultimately needed for successful genetic improvement of the most relevant Indicine dairy breed. We emphasize that the goal of selection signature analysis is not to look for association between genomic regions and any studied traits but to uncover genomic regions that differ between populations with high and low breeding values inside the Gir breed for the number of oocytes and number of embryos. After the identification of such genomic regions through selection signature analysis, the search for genes harbored inside them was performed. Although no association studies were performed between the identified genes and the studied traits, disentangling the genetic architecture behind the phenotypes might expand the range of information about these less explored traits. Genome-wide association studies to identify the actual association between genomic regions and the number of oocytes and embryos in this population will be carried out further.

Runs of homozygosity
A high frequency of ROH 1-2 Mb compared to other ROH classes was detected, and this may reflect a strong ancient relationship within the breed, which is attributed to the use of few sires and families for breeding, hence leading to loss of genetic variability (Toro Ospina et al. 2020). Peripolli et al. (2018) also studied runs of homozygosity in Gir cattle and found 161,362 homozygous segments in 2908 individuals, in which most ROHs were shorter segments (ROH1-2 Mb). They also found an average number of ROH per animal of 55.12 ± 10.37 and a mean ROH length of 3.17 Mb. Considering the number of individuals used in their study and considering that the total number of ROHs obtained depends on the total number of animals and parameters to define an ROH, these results from Peripolli et al. (2018) are similar to those described in our work.

Inbreeding depression
Performing functional annotation analyses of ROH-enriched regions and with recently formed ROH (> 8 Mb) in Retinta cattle, Goszczynski et al. (2018) found gene clusters related to male and female reproductive functions, such as pregnancy-associated proteins, immune reaction, and flagellum assembly, which is related to the assembly of cilium, a specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface, and its impaired assembly may produce an abnormal movement of sperm flagellum. This may explain in part the reduced fertility in inbred animals. In our work, we have also described a reduction in the number of oocytes (total and viable) with increasing overall ROHbased (F ROH ) and pedigree-based inbreeding (F ped ). Moreover, Goszczynski et al. (2018) also showed that animals harboring the same inbreeding coefficient may present different phenotypes, indicating that inbreeding will not necessarily be harmful to the trait of interest. In our study, we observed that the inbreeding coefficient in different chromosomes may affect the number of oocytes and embryos in different ways. Looking at the inbreeding level per chromosome and the significant positions found in the selection signature when comparing the groups with high and low breeding values for the number of oocytes and embryos, BTA11, BTA14, BTA16, BTA19, BTA21, and BTA29 had significant positions for total and viable oocytes, but only BTA14 showed a significant increase in the number of viable oocytes with an increase in inbreeding, while all the others presented a decrease in the number of viable and total oocytes. BTA14 was the second chromosome with the greater size of ROHs, meaning that a part of recent inbreeding may be positively affecting the production of viable oocytes. Although no gene was identified in our search on BTA14 that could be related to an increased number of viable oocytes, the genes identified on BTA11, BTA16, BTA19, BTA21, and BTA29 probably have a negative influence on the number of oocytes.

Signatures of selection
In this work, we use 0.15 as an empirical threshold to identify signatures of selection since it is widely used in the literature (Makina et al. 2015;Wright 1978;Zhao et al. 2015). Not all approaches can accurately identify genes that are under positive selection because the proportion of true positives in such genes is unknown, so gene ontology (GO) categorization is often used to identify likely candidates (Pavlidis et al. 2012). In this work, as we found signatures of selection differing between groups with high and low breeding values for the number of total and viable oocytes and the number of embryos, our focus in the literature search was to verify whether the genes found could play a role in these traits. As suggested by Pavlidis et al. (2012), the pleiotropy of many genes, their complex relationships, and the wealth of information available in the literature may lead us to explain the action of positive selection on an arbitrary gene. However, many genes found in our work play important roles in biological processes related to reproductive traits, and as this is one of the first works for this type of trait, it provides insights and may open opportunities for new studies to validate these genomic regions, such as studies in other breeds, genome-wide association studies, and gene expression.
The selection of groups based on accuracy above 0.75 promoted the selection of reliable estimated breeding values (BV), but reduced the number of animals available for the search of distinct genomic regions between groups. Due to this small number of animals, the number of signatures of selection found between groups increased and it became less assertive. This was seen by the identification of more than 2000 signatures of selection from these groups considering accuracy of BV higher than 0.75. As it gives many potentially biased results due to the small number of animals, results of groups with accuracy higher than 0.75 were not used for gene ontology analyses. This bias due to the small number of animals could be due to the fact that the Gir breed is an endogamic breed originated decades ago from animals imported from India (Santiago 1986). Our aim in this study was to have an overview about the genomic regions that currently influence such important traits (TO, VO, EMBR), especially with the growing use of embryo in vitro production (Drum et al. 2019;Rotar and Souza 2019). Therefore, signatures of selection identified between groups considering all BVs were used for better overall picture in the genomic organization and the influence of endogamy in the Gir Breed.

Interaction: signatures x inbreeding x ROH
In our work, we found the TENM4 gene on BTA29, which is the chromosome with the third highest inbreeding level. Additionally, a significant decrease in the number of oocytes was observed on BTA29. In the literature, the TENM4 gene is related to the regulation of the nervous system and heart development (Karimi et al. 2021;Manzari et al. 2019). As TENM4 plays a role in initial development and the inbreeding level of BTA29 negatively affects the number of oocytes, monitoring the inbreeding level becomes important to avoid impaired embryo development. When related to reproductive traits, the DNAH6 gene on BTA11 does not have many results in the literature for cattle, but in humans, Henarejos-Castillo et al. (2021) found novel variants predictive of ovarian failure, two of which are related to microtubule activity.
Our results showed that the inbreeding coefficient of BTA16 may be related to a decreased number of oocytes. Fonseca et al. (2020) compared highly fertile heifers and subfertile heifers and found differential expression of ABL2 (BTA16) in the top 10 hub genes in one of the highly fertile modules of several cattle breeds (Angus, Limousin, Brangus, Simmental, Hereford, and Polled Hereford). Therefore, this gene seems to be important for fertility in cattle, and the inbreeding level negatively affects fertility traits, showing the importance of inbreeding monitoring, especially for reproductive traits.
Although genes such as PEAK1 have not yet been related to reproductive traits, they may have undergone genetic drift over the years during selection for other traits. For example, the PEAK1 gene (BTA21) was involved in the ATP binding process in a search for candidate genes related to carcass traits in Hanwoo cattle (Srikanth et al. 2020) and adaptive immune responses in West African cattle (Goyache et al. 2021). Exploring selection signatures based on ROH in Slovak spotted cattle, Kasarda et al. (2021) found PEAK1 to be one of the candidate genes located in regions under selection pressure involved in cell development. Additionally, the PEAK1 gene was described by Wang et al. (2018) as a tyrosine kinase and scaffold protein that facilitates cell movement and growth through integrin-mediated extracellular matrix signals, being highly conserved in vertebrates and important for angiogenesis.
Regarding the genes already mentioned, the TMEM245 gene (BTA8) is one of the differentially expressed genes in blastocysts and is functionally involved in methylation and acetylation in Holstein cows (Kalo et al. 2019). The SORL1 gene (BTA15) was involved in conceptuses recovering from cows diagnosed with nonuterine diseases by Ribeiro et al. (2016). These authors explained that inflammatory diseases before breeding may reduce oocyte fertilization and morula development. Therefore, ROHs formed through our selection process over the years may harbor genes influencing fertility traits in different ways.
In our work, we identified the KNTC1 and RSRC2 genes on BTA17 when comparing breeding values for the number of embryos. Additionally, on BTA17, we observed a significant decrease in the number of oocytes with an increase in the inbreeding coefficient. Fonseca et al. (2020) mention the KNTC1 gene (BTA17) as one of the target genes of an upstream regulator comparing cows with high and low fertility. Although there are few publications about RSRC2 on BTA17 of cattle, Shaw et al. (2013) found that some transcripts encoding variants such as arginine/serine-rich_ coiled-coil_2 (RSRC2) were shared exclusively between oocytes and 4-cell embryos, representing a maternal message expressed by the oocyte but removed after the onset of embryonic genome activation. As the KNTC1 and RSRC2 genes seem to be related to female fertility, it makes sense that the inbreeding level in females may affect oocyte production. This may explain why the KNTC1 gene was directly matched with both oocyte and embryo terms in the VarElect online tool.
Fertility traits are complex and controlled by a large number of genes, each with small individual effects (Wiggans et al. 2011). This was observed in this study mostly for the number of embryos, in which we found 40 significant SNPs in the signature of selection analyses. In a meta-assembly of genomic regions associated with female reproductive efficiency in dairy and beef cattle breeds, Khatkar et al. (2014) reviewed and found multiple QTLs and SNP associations on every bovine chromosome. According to these authors, the highest meta-QTL scores were on BTA1, 2, 3, 6, 7, and 9, whereas the highest meta-GWAS scores were found on BTA1, 5, 13, and 16. Some of these chromosomes are coincident with those described in our work. In a genome-wide association study in Holstein breed, Parker Gaddis et al. (2017) found the largest proportion of variance explained by a 10-SNP window on chromosomes 8 and 14 for total number of structures collected by Ovum-Pick-Up and number of viable embryos, respectively. In our work, BTA8 and BTA14 also stood out for the number of oocytes (total and viable) and the number of embryos.
In relation to the identified genes, overexpression of several genes, including ABL2 for inner cell mass compared to trophectoderm, was also found in the horse blastocyst transcriptome, and the gene ontology biological processenriched analyses showed a relation with protein amino acid phosphorylation and actin filament-based process (Iqbal et al. 2014). Those studies corroborate our findings, not only for ABL2 but also for the PEAK1 gene and their relation with protein autophosphorylation-enriched function.
In a search for the main dysregulated miRNAs in the uterine media, utero-tubal junction and isthmus exposed to seminal plasma, the TMEM245 gene was one of several predicted targets of differentially expressed miRNAs in explant media from the uterus and utero-tubal junction of sows (Barranco et al. 2020). In a transcriptome analysis carried out in four brain regions of mice to determine estrous cycle-specific changes, TMEM245 was an example of a differentially expressed gene in the hippocampus with higher expression in estrus than in diestrus (DiCarlo et al. 2017). Hou et al. (2017) listed TMEM245 among mouse orthologs of human puberty-associated genes with expression related to puberty in hypothalamic-pituitary-gonadal axis tissues (hypothalamus, pituitary, ovary, and testis). These findings might explain the importance of TMEM245 related to central nervous system myelin formation function in the enriched biological processes that we found in this work.
The expression of glutathione S-transferase subunits α1 and α2 (GSTA1, GSTA2) is tissue-and cell-specific and has already been related to steroidogenically active cells, which are hormonally regulated by gonadotropins in bovine ovarian follicles (Rabahi et al. 1999). As a gene family member from GSTA2, glutathione-S-transferase A1 (GSTA1) function is related to protecting cells from reactive oxygen species, and its expression level positively correlates with oocyte quality (Salhab et al. 2013). Mezera et al. (2021) studied the effect of prostaglandin F2α pulses on the bovine luteal transcriptome during spontaneous luteal regression and found that glutathione s-transferases were mentioned as enriched terms. Therefore, the findings in the current work relating the enriched glutathione metabolic process with GSTA2 may be due to its relation with oxidative stress response regulation. Bogliotti et al. (2020) studied transcript profiling of bovine embryos and found several significantly enriched GO terms for transcripts whose relative abundance decreased from metaphase II eggs to the 8-to 16-cell stage, including KNTC1, which is related to the biological process of the cell cycle. In beef cows (75% Red Angus, 25% MARC III), McFee et al. (2021) found that KNTC1's gene function is related to mitosis and that this gene is less expressed in granulosa cell nuclei with a high androstenedione concentration ([A4] ≥ 40 ng/mL) than follicles with an average/control concentration (A4 ≤ 20 ng/mL) and average estradiol. These findings in the literature are consistent with the results on the KNTC1 gene being related to the mitotic cell cycle checkpoint in gene ontology analyses.
In our work, the SORL1 gene was found to be involved in the regulation of choline O-acetyltransferase activity, which is a general term able to influence several functions and processes in the organism. For example, the SORL1 gene has been identified as a candidate gene for lipid metabolism in Nellore cattle (Carreño et al. 2019). On the other hand, the SORL1 gene is related to cholesterol metabolism being upregulated in granulosa cells collected from chicken follicles (Zhu et al. 2019). When related to reproduction, the SORL1 transcript was in the top 10 downregulated genes of conceptuses recovered from cows diagnosed with nonuterine diseases before artificial insemination when compared with those that were not diagnosed with nonuterine disease (Ribeiro et al. 2016).
Ubiquitin-specific peptidase 39 (USP39) functions in pre-mRNA splicing, and its knockdown was suggested as a novel alternative to targeted therapy of medullary thyroid carcinoma . This is consistent with our result that the USP39 gene is involved in spliceosomal complex assembly, a process responsible for catalyzing the splicing of nuclear mRNA. However, information about this gene in bovine species in the literature is still scarce.

Conclusion
The runs of homozygosity (ROH) profile indicates an ancient genomic relationship in this Gir population.
General ROH-based inbreeding (F ROH ) has a negative effect on the number of oocytes (total and viable) and embryos of Gir cattle. This indicates that more attention should be paid when selecting animals for reproduction to avoid inbreeding depression events. However, the effect of F ROH on the studied traits may be positive or negative depending on which chromosome the ROHs are located. These findings provide great insight into how genomic regions can be beneficial or detrimental when affecting traits. Further association studies may consider this paper when checking where the identified genomic regions or genes associated with those traits are located. Signatures of selection were identified, in which eleven protein-coding genes were located. Some of these protein-coding genes are already related in the literature to reproductive traits or to the same enriched biological processes identified in our study. This indicates that even inside a breed, selection marks for reproductive traits are present. Although the number of oocytes and embryos are not considered selection traits yet by the Brazilian National Dairy Gir breeding program, the selection process for other traits may have led to signatures affecting the studied traits. To uncover the actual effect of these genes in oocytes and embryos, studies on gene expression, such as transcriptional landscapes, are recommended.