Single-step genome-wide association studies and post-GWAS analyses for the number of oocytes and embryos in Gir cattle

Genome-Wide Association Studies (GWAS) are used for identification of quantitate trait loci (QTL) and genes associated with several traits. We aimed to identify genomic regions, genes, and biological processes associated with number of total and viable oocytes, and number of embryos in Gir dairy cattle. A dataset with 17,526 follicular aspirations, including the following traits: number of viable oocytes (VO), number of total oocytes (TO), and number of embryos (EMBR) from 1641 Gir donors was provided by five different stock farms. A genotype file with 2093 animals and 395,524 SNP markers was used to perform a single-step GWAS analysis for each trait. The top 10 windows with the highest percentage of additive genetic variance explained by 100 adjacent SNPs were selected. The genomic regions identified in our work were overlapped with QTLs from QTL database on chromosomes 1, 2, 5, 6, 7, 8, 9, 13, 17, 18, 20, 21, 22, 24, and 29. These QTLs were classified as External, Health, Meat and carcass, Production or Reproduction traits, and about 38% were related to Reproduction. In total, 117 genes were identified, of which 111 were protein-coding genes. Exclusively associations were observed for 42 genes with EMBR, and 1 with TO. Also, 42 genes were in common between VO and TO, 28 between VO and EMBR, and four genes were in common among all traits. In conclusion, great part of the identified genes plays a functional role in initial embryo development or general cell functions. The protein-coding genes ARNT, EGR1, HIF1A, AHR, and PAX2 are good markers for the production of oocytes and embryos in Gir cattle.


Introduction
Given the importance of Gir cattle breed in Brazil, the National Program for the Improvement of Dairy Gir (in Portuguese: Programa Nacional de Melhoramento do Gir Leiteiro-PNMGL) was created and has been conducted by Brazilian Agricultural Research Corporation (EMBRAPA)-Center for Research in Dairy Cattle and Brazilian Association of Dairy Gir Breeders (in Portuguese: Associação Brasileira dos Criadores de Gir Leiteiro-ABCGIL) for almost 40 years (Panetto et al. 2021). Ever since, several works have been carried out to explore Gir breed productive capacity and contribute with PNMGL and literature information (Carvalho et al. 2021;González-Herrera et al. 2022;do Nascimento Rangel et al. 2018;Pereira 2019). Previous works using the same Gir population used in this work explored genetic parameters, runs of homozygosity, and signatures of selection for reproductive traits in this breed (Rocha et al. , 2023. However, references exploring genomic information about reproductive traits in Gir cattle are scarce. So far, the genomic investigation in Gir breed showed that quantitative traits associated to fertility, milk production, beef quality, and growth were involved in the differentiation process of dual purpose populations (Maiorano et al. 2018). Despite reproductive traits usually present only a small proportion of phenotypic variance due to additive actions of individual genes, it does not mean that reproductive traits are not under genetic control (Ortega 2018).
An example of a reproductive technology that has been increasing in cattle production in the past years, is in vitro embryo production (IVP), which combines oocyte collection, in vitro fertilization, and embryo transfer (Moore and Hasler 2017). Brazil ranks second in embryo production and is the leader in IVP embryo transfers worldwide (Viana 2022). About a decade ago, it was already known that the IVP technology stood out among the reproductive traits due to its economic impact, since it accelerates genetic improvement, reduces the generation interval and promotes an increase in the number of products/cow/year, as it allows the use of pre-pubescent heifers, cows in early pregnancy, cows with acquired subfertility, and senile cows (Martins 2010). Recently, a milestone was reached for this technology, since for the first time in 2021, over one million IVP cattle embryos were transferred worldwide (Viana 2022). This shows the importance of studying oocyte and embryo production traits in cattle.
The Genome-Wide Association Studies (GWAS) are used to find genomic regions that contribute to genetic variation of a trait, and most reproductive traits are controlled by many genes. As a Bos taurus indicus, the Gir breed has been highlighted in the use of reproductive biotechnologies, such as higher oocyte quality when compared to Bos taurus taurus, which allows greater success of in vitro embryo production than taurine animals (Drum et al. 2019). In this way, the identification of genetic variants for such traits through GWAS can provide clues to understand fertility regulation and genetic architecture in Gir cattle (Ortega 2018;Ma et al. 2019). Usually, GWAS for fertility traits in dairy cattle results in several genomic regions identified due to the pleiotropic effect of genes affecting these polygenic traits (Jaton et al. 2018;Parker Gaddis et al. 2016Sbardella et al. 2021). Ma et al. (2019) suggested that the continuous data collection of fertility traits would enable more powerful estimations of dairy fertility. We hypothesize that genomic regions associated with number of oocytes and embryos can be identified in Gir. Considering the most importance of Gir breed for dairy production in Brazil and that the studied traits have high economic impact in livestock production accelerating genetic improvement, we aimed to identify genomic regions, genes, and biological processes associated with number of total and viable oocytes, and number of embryos in Gir dairy cattle.

Phenotypic data
A dataset with 17,526 follicular aspirations, including the following traits: number of viable oocytes (VO), number of total oocytes (TO), and number of embryos (EMBR) from 1641 Gir donors was provided by five farms. These donors are daughters of 127 sires and 771 dams. This dataset along with a pedigree file, including 4679 animals, covering up to fifteen generations was used to estimate genetic values for the animals in these three traits previously to this work . The Ovum Pick Up (OPU) sessions were conducted between January 2005 and June 2020 at the Minas Gerais State in Brazil, where the farms are located. Eight companies responsible for OPU and in vitro fertilization (IVF) processes were listed in the dataset received from farms, each company using their own protocols; in this way, we considered eight protocols in total for the analysis, varying among farms or sometimes within farms. The number of OPU sessions ranged from 1 to 70 per donor, with an average of 10. A total of 238 bulls were used for the in vitro fertilization processes. The frequency of use of these bulls ranged from 1 to 1789 samples, with an average of 73 samples per bull. A brief description of the data can be seen in Table 1.
The remaining animals of the pedigree did not have genomic information.

Genome-wide association study analyses
From the dataset with 2093 animals, 170 were males and 1923 were females, of which 1202 females had phenotypic data. For GWAS analysis, quality control was performed for this dataset with 420,718 SNPs, considering the following parameters for variant exclusion: SNP with minor allele frequency ≤ 0.05, maximum difference between observed and expected allele frequency for Hardy Weinberg Equilibrium of 0.15, parent-progeny conflict and SNP heritability (h 2 ), in which markers with estimated h 2 < 0.98 were discarded. After quality control analysis, 395,524 SNPs and 2091 genotyped animals remained for further analysis.
The single-step GWAS was conducted using the BLUPF90 software family (Misztal et al. 2002). The variance components were previously obtained . Briefly, animals were separated into 138 contemporary groups (CG) composed according to OPU year and season, farm, and OPU-IVF protocol, considering a minimum of three animals in each CG; lower numbers implicated in data exclusion. Cows without a minimum of three observations were excluded. As the traits did not present normal distribution, they were transformed using the logarithmic scale: ln (X + 1), where ln is the natural logarithm and X is the raw data (oocyte and embryos), verified by the Anderson-Darling test (Stephens 1986;Thode 2002). The following matrix model was used for all traits: where, y, β, a, p, and e are the vectors of observations; fixed effects; additive genetic random effects; permanent environment random effects; and residual effects, respectively; X, Z, and W are the incidence matrices of fixed, additive genetic, and permanent environment, respectively. The effects of animal, permanent environment, and residual were y = X + Za + Wp + e assumed to be random. In the single-trait model, it was assumed that a ∼ N(0, H 2 a ) , p ∼ N 0, I 2 p , and e ∼ N(0, I 2 e ) , where 2 a , 2 p , and 2 e are the additive genetic, permanent environmental, and residual variances, respectively; H is a matrix combining genomic and pedigree information, as proposed by Aguilar et al. (2010) and I is the identity matrix. The statistical model applied for VO and TO can be described as follows: and for EMBR: where Y is the dependent variable, μ is the mean, CG i is the fixed effect of contemporary group; Age j and Age 2 k are the donor's age when OPU was carried out and respective squared age (covariates); AI l is the aspiration interval (covariate); B l is the fixed effects of the bull used in the in vitro fertilization process; BB m is the effect of bull breed (Gir, n = 83; Holstein, n = 154); and e ijklmn is the residual assumed to be independent and to have a normal distribution.
The SNP effects were calculated using postGSf90 software (http:// nce. ads. uga. edu/; Aguilar et al. 2010), and the GWAS results were reported as the percentage of genetic variance explained by a window of 100 adjacent SNPs. Manhattan plots with the percentage of genetic variance were created using 'qqman' package (Turner 2018)

QTL regions, gene search and gene ontology analysis
The top 10 windows explaining the highest percentage of additive genetic variance (Otto et al. 2020;Tiezzi et al. 2015;Vargas et al. 2018) were selected and overlapped  (Chen 2022) and 'RColorBrewer' packages (Neuwirth 2022) in R software (R Core Team. 2022, R version 4.1.2; R Foundation for Statistical Computing, Vienna, Austria). All 111 protein-coding genes identified were used to perform the biological processes enrichment analysis to identify the most likely candidate genes for the number of total and viable oocytes and number of embryos. This analysis was performed using the ClueGO application (Bindea et al. 2009) in Cytoscape (Shannon et al. 2003), considering a Bonferroni correction test. For the identification of enriched transcriptional factors (TF), promoter sequences (FASTA format) with 3000 bp upstream and 300 bp downstream from genes transcription start site (Verardo et al. 2016) were submitted to the Transcription Factor Matrix Explorer (TFM-explorer 2022, http:// bioin fo. lifl. fr/ TFM/ TFME/). To determine significantly overrepresented functional GO terms, TF list was analyzed in Cytoscape software, using the Biological Networks Gene Ontology tool (BiNGO; Maere et al. 2005). The most likely candidate genes related to the studied traits were identified through gene-TF Network Analyzer tool in Cytoscape.
In addition, all annotated genes were further investigated through VarElect NGS Phenotyper (Stelzer et al. 2016). The genes found for the number of embryos and the number of oocytes (total and viable) were investigated based on the keywords "embryo" and "oocyte," respectively. Also, all transcriptional factors were investigated based on both keywords. VarElect tool (https:// varel ect. genec ards. org/) has been used in the literature to prioritize variant genes based on disease/phenotype of interest (Gutierrez-Quintana et al. 2021;Rocha et al. 2023;Suárez-Vega et al. 2018). To do such inferences, VarElect uses algorithms seeking for association between genes and phenotypes based on shared pathways, interaction networks, relation of paralog genes, and other sources. VarElect tool gives a score indicating the strength of the connection between the gene and the traits, ranging from 1 to 200. The ratio of each score is calculated and total scores are normalized to a range of 0.05-0.8.

Results
The percentage of genetic variance explained by each window of 100 adjacent SNPs is displayed in Manhattan plots for number of viable oocytes (Fig. 1), number of total oocytes (Fig. 2), and number of embryos (Fig. 3). The additive genetic variance explained by the 10 windows was 3.30% for VO and 3.54% for TO and EMBR.
The genomic regions from the top 10 windows identified on chromosomes 1, 2, 5, 6, 7, 8, 9, 13, 17, 18, 20, 21, 22, 24, and 29 were overlapped with Quantitative Trait Loci from QTL database ( Fig. 4; Online Resource ESM1; sheet 1). From QTL database, these QTLs were classified as External, Health, Meat and carcass, Production, or Reproduction traits. None of them were directly related to oocyte or embryo. However, about 38% of QTLs found are related to Reproduction traits, such as sperm counts, stillbirth, interval to first estrus after calving, birth index, gestation length, calving ease, daughter pregnancy rate, inseminations per conception, age at first calving, non-return rate, scrotal circumference, inhibin level, first service conception, perinatal mortality, temperament, interval from first to last insemination, sperm motility, and conception rate.
In addition, 117 genes were identified, two of which were open-reading frames (ORFs), four were microRNAs (MIR130B, MIR301B, MIR499, MIR12063), and 111 were protein-coding genes. From these, 42 genes were exclusively associated with EMBR, 1 with TO and no gene was exclusively associated with VO, 42 genes were in common between VO and TO, 28 between VO and EMBR and four genes were in common among all traits, which were CHAF1B, CLDN14, DOP1B, and MORC3 ( Fig. 5; Online Resource ESM1; sheet 2).
From ClueGO application, 857 biological processes were found related to the 111 protein-coding genes (Online Resource ESM1; sheet 3). Great part of the identified genes plays a functional role in initial embryo development or general cell functions. Nine significant biological processes according to Bonferroni correction test (P < 0.05) were related to cellular catabolic processes. Considering the promoter sequences of our 111 protein-coding genes, 16 transcription factors (TFs) were identified (Table 1). These transcription factors were analyzed in Cytoscape software, using BiNGO application to identify biological networks gene ontology (Online Resource ESM1; sheet 4). From this, five TFs connected to biological networks most related to the studied traits, number of total and viable oocytes, and number of embryos were identified: ARNT, EGR1, HIF1A, AHR, and PAX2. These TFs were reviewed in the literature to verify their function related to oocytes or embryos. After that, 81 candidate genes associated with these five TFs (Fig. 6)

Discussion
In our work, we found Quantitative Trait Loci (QTL) overlapped with the top 10 windows with the highest percentage of additive genetic variance explained for the number of viable oocytes, number of total oocytes, and number of embryos, which were related to different types of traits. As we studied polygenic traits, that is, traits affected by many low effect genes, it was expected to find several genomic regions and that these regions may affect more than one trait. Focusing on reproductive-related traits, some QTLs were related to interval to age at first calving, birth index, daughter pregnancy rate, first estrus after calving, daughter pregnancy rate, inseminations per conception, interval from first to last insemination depend on estrus detection, and a correlation between these traits with oocytes and embryos was searched in literature. Perez et al. (2016) found low correlation between age at first calving with number of viable oocytes (−0.003) and number of transferable embryos (−0.006) in Guzerá cattle breed, which can be understood from the physiological point of view in which the reproductive system of the animal is still in development. Cumulative live birth rates were associated to a higher number of oocytes retrieved (Fanton et al. 2023). A low positive correlation (0.12) was found between the number of retrieved oocytes and pregnancy rates (Hsu et al. 2016). Usually, estrus detection depends, among other factors, on the number of ovulation after calving, and it is the first step in getting a cow pregnant (Roelofs and van Erp-van der Kooij 2015). In addition to female fertility performance, male fertility traits are important for conception and then, embryo production. In this case, sperms per count and sperm motility represent some male reproductive traits found in QTL search. Such traits may be determinant of conception success or failure (Berry et al. 2019).
Searching deeper on QTL regions, identification of genomic variants, and genes associated with reproductive traits can improve the knowledge about fertility regulation (Ortega 2018). In our work, genes were found on several Fig. 6 Gene-transcription factor (TF) network: Genes located in the top 10 windows for the number of total and viable oocytes and number of embryos (green circle nodes) and their associated TF (yellow hexagon nodes). Node size corresponds to network analyses (Cytoscape; Shannon et al. 2003), in which larger nodes denote a higher edge density associated with the number of TF binding sites chromosomes and, at first, focus will be given to interesting results found for the four genes in common among all three traits or genes exclusively associated with one trait. CHAF1B gene, for example, was associated with all three traits: number of total oocytes, viable oocytes, and embryos, and it was directly related to the "oocyte" term in VarElect. From its related biological processes obtained by ClueGO analyses, CHAF1B plays a role in nucleosome assembly and organization. Another gene, MORC3, is also associated with all three traits and directly related to the "oocyte" term in VarElect. MORC3 is related to cell aging and proliferation, protein localization, and stability and post-embryonic development biological processes. In initial embryo development, there is a transition between maternal and onset of embryo gene expression known as zygotic/embryonic genome activation, that is, when maternal mRNAs are degraded and zygotic transcription begins (Jukam et al. 2017). Both CHAF1B and MORC3 were mentioned by Graf et al. (2014) to be expressed as primary transcripts with intronic sequences determining the onset of embryonic gene expression. Biological processes related with CHAF1B and MORC3 genes were mentioned as important functions during embryonic genome activation, such as nucleosome assembly and protein localization as consequences of embryo cell division (Jukam et al. 2017). Given the importance in this transition to maternal embryo expression, it explains why CHAF1B and MORC3 genes were associated with the number of oocytes and embryos in our work.
On the other hand, GAS2 is a gene exclusively associated with the number of total oocytes. Also, GAS2 is related to several biological processes in our work, including response to extracellular stimulus, response to nutrient levels, regulation of microtubule-based process, macromolecular complex disassembly, and negative regulation of hematopoiesis. These processes are related to programmed cell death or apoptosis, since apoptotic signaling begins with the attachment of extracellular ligands causing a variety of biological events such as proliferation/homeostasis, development, differentiation, and elimination of harmful cells (Goldar et al. 2015). On GeneCards (2022; https:// www. genec ards. org/), an integrative database that provides information on all annotated and predicted human genes, it is mentioned that GAS2 may play a role in apoptosis by acting as a cell death substrate for caspases. Indeed it makes sense that gene expression related to apoptosis would be observed in total oocytes, since not all oocytes obtained in follicular aspiration maintain quality integrity to be used for in vitro fertilization, hence not being observed in viable oocytes. It is also acceptable that an apoptosis related-gene would not be associated with number of embryos since embryo development requires biological processes related to cell development and maintenance, not cell death.
In the search for genomic regions associated with oocytes and embryos, Jaton et al. (2018) found significant SNPs on chromosome 11 with pleiotropic effect on the number of embryos and viable embryos in Holstein cattle. In our work, we did not find high proportion of additive genetic variance on BTA11, but we found common genes between the number of total and viable oocytes and the number of embryos, which means that those genes may have a pleiotropic effect on the studied traits. This is consistent to our previous study, where we found high positive genetic correlation between the number of total and viable oocytes and the number of embryos . Still in Holstein cattle, Parker Gaddis et al. (2017) found large proportion of variance for number of good quality (grade 1) embryos on BTA14 and for total structures recovered (i.e., total number of unfertilized oocytes and embryos) on BTA8, followed by BTA5, BTA10, and BTA13. In our work, the largest proportion of additive genetic variance for the number of total and viable oocytes was on BTA24 and for embryos on BTA7. This is a good representation of our previous work where a high positive genetic correlation (1.00) was identified for the number of viable oocytes and the number of total oocytes . This means that genes on BTA24 may play important role in cell viability for maintenance of viable oocytes. Also, our results indicate that genes present on chromosome 7 may have a greater influence in embryo developmental process.
In dairy Gir cattle, Pereira (2019) found windows with higher genetic variance explained (> 1%) for the number of viable oocytes on chromosomes 1, 2, 3, 4, 5, 8, 13, 14, 17, e 27 and for the number of viable embryos on chromosomes 2, 6, 9, 14, 18 e 25. Our results corroborate in part with Pereira (2019), such that in our results chromosomes 1, 2, 5, 6, 8, 9, 13, 17 and 18 were present in the windows with the higher proportion of genetic variance in at least one of the three traits studied. Given the small difference in the traits between our study (number of total oocytes, viable oocytes and embryos) and Pereira (2019) (number of viable oocytes and number of viable embryos), it can be seen that genetic variance for this type of traits is scattered among chromosomes, which may be due to its polygenic nature. In addition, SMTN and INPP5J genes were associated with number of viable oocytes by Pereira (2019), while the same genes were associated with number of embryos in our work. Repression of INPP5J expression may result in cell proliferation of human ovarian cancer cells (Zhu et al. 2015). Also, it is known that SMAD transcription factor is essential for oviduct and uterus maintenance and integrity, but knockout mouse for these transcription factors increase SMTN expression in oviducts, indicating that this gene may have a supportive role during early pregnancy in mice (Rodriguez et al. 2016). In this way, SMTN and INPP5J genes may play an secondary role in oocyte and embryo production.
About transcription factors (TFs) found in this work, focus will be given to the five TFs most related to the studied traits. In our work, AHR was associated with anatomical structure, system, and organ development. Its function was confirmed by Gialitakis et al. (2017) who mentioned that AHR plays a significant role in cell cycle progression and fetal development, because its interaction with genes involved in pluripotency. In our work, HIF1A was associated with embryonic hematopoiesis, morphogenesis process, neural crest cell-related process, and heart formation. Turhan et al. (2021) showed that HIF1alpha expression is important for cumulus expansion and expression, since it regulates expression of functional protein markers in bovine cumulus cells during in vitro maturation and subsequent embryo development. Also, based on our findings, it seems that AHR and HIF1A interact with ARNT gene. According to Munakata et al. (2016), HIF1A is the alpha and ARNT the beta subunits of HIF1 gene, a basic helix-loop-helix transcription factor, of which associated genes tend to be upregulated during follicle development. EGR1 transcriptional factor was related to hematopoiesis and organ, system, and anatomical structure development. Although the exact function of the gene is not clear, Ulloa et al. (2015) found higher expression of EGR1 in in vivo produced embryos than in their in vitro counterparts and suggested that EGR1 expression could serve as a marker of embryo quality.
In our work, PAX2 transcriptional factor was involved in several processes including morphogenesis, mesenchymal to epithelial transition, and general embryonic development. In embryonic genome activation, PAX2a plays important roles since it is expressed as a maternal gene in oocytes, needed for oogenesis and early development (Pachoensuk et al. 2020). This means that most transcriptional factors found in our results have consistent regulatory functions associated with the oocyte and embryo production, or at least, with basic cell functions, needed in several biological processes.
Although microRNAs found in the GWAS were not used in gene ontology analysis, interesting results for some of these microRNAs were found in the literature that are worth mentioning. It already has been shown that functional modulation of miRNA-130b is involved in bovine granulosa and cumulus cells function, oocyte maturation, and preimplantation embryo development (Sinha et al. 2017). MiRNA 130 and miRNA 301B were associated with follicular growth and development in cattle (Zielak-Steciwko and Evans 2016). Interestingly, in a review of placenta-derived miRNAs, (Xu et al. 2021) mentioned that miRNA 499 derived from bovine placental exosome inhibits the activation of NF-kB via Lin28B/let-7 axis, attenuating inflammatory responses, and creating an immune-tolerant microenvironment in the uterus. NF-kB was one of the 16 transcriptional factors found in our research and LIN28B was one of the genes found on BTA9 related to the number of embryos. On GeneCards, it is explained that overexpression of LIN28B gene in primary tumors is linked to the repression of let-7 family of micro-RNAs, which contributes to maintain the pluripotent state of embryonic stem cells. Functions associated with these five transcriptional factors described and their interaction between themselves, the genes, and microRNAs validated their importance in oocyte maturation and the general development of the embryo, indicating that they are good markers for the production of oocytes and embryos.
Considering all genes found in our work, 39 genes in common with Graf et al. (2014), who evaluated the transcriptome of bovine oocyte maturation and early embryonic development and identified specific genes according to the timing of embryonic activation. From these common genes, ARHGAP15 was mentioned as being expressed in blastocyst transcripts, but not present in oocytes, that is, its transcripts arise after fertilization, which is in accordance with our results, since ARHGAP15 was associated exclusively with number of embryos. Finally, some genes found in our work were also mentioned by Graf et al. (2014) to be expressed as primary transcripts with intronic sequences determining the onset of embryonic gene expression, such as EIF6, MORC3 and RBM39 expressed in 4-cell embryos, ACSS2, CHAF1B,  DUSP18, DYM, DYNLRB1, EDEM2, ERGIC3, GAS2, GJA1,  GTDC1, INTU, ITCH, LONP2, MED15, MFSD8, MORC2,  MRPL1, PLK4, PREP, SF3A1, SIAH1, SPAG4, TBC1D10A and UBTD2 expressed in 8-cell embryos, N4BP1, PIGU, PTPN3, and SLC25A31 expressed in 16-cell embryos, and EPB41L4B, GSS, NFS1, and SEC14L2 expressed in blastocysts. As the above-mentioned genes are important in the onset of embryonic gene expression, it may explain the results found in VarElect, in which some of these genes were directly related only with "oocyte," or only with "embryo" or with both terms. Assou et al. (2013) found 9805 genes expressed in both cumulus cells from patients stimulated with highly purified human menopausal gonadotropin (HP-hMG) or recombinant FSH (rFSH) that are part of the most widely used protocols for controlled ovarian stimulation and in vitro fertilization in humans. In our work, 56 genes were in common with Assou et al. (2013) results. As they are expressed in human periovulatory follicles, they may play a role in ovulation and may be related to embryo development. Using single-cell transcriptomic RNA sequencing and differential expression analysis in human oocytes, Li et al. (2020) discovered several differentially expressed genes in oocytes that failed to mature completely when compared with normally mature oocytes. Some genes cited by Li et al. (2020) were also found in our work, including MAP1LC3A, NCOA6, and PTPN3 as down-regulated genes and CEP250, CTIF, DYNLRB1, LZTR1, OSBP2, RBM39, UQCC1, and VCAN as upregulated in maturation-deficient oocytes. Li et al. (2020) also found 2.329 genes differentially spliced into more than 1 alternative isoform, in which 12 genes were in common with our findings: ABHD18, CHAF1B, CPNE1, DYNLRB1, FBXW11, GTDC1, LARP1B, LONP2, MFSD8, PIGU, RBM39, and UBE2L3. In addition, some of these genes were directly correlated with oocyte and/or embryo in VarElect. Great part of the genes discovered in our work, related to the number of total and viable oocytes and embryos are also related to ovulation and embryo development or other reproductive traits in several species. Our work provide a variety of genes that may be explored in further research to better understand the exact function of these genes on oocytes and embryo production and development.
Considering now the most likely candidate genes identified through gene-TF Network Analyzer tool, ALDH1L1 was related to 10-formyltetrahydrofolate catabolic process. In humans, it is known that neural tube formation is the origin of the brain and spinal cord, and folate supplementation prevents neural tube defects (Sadler 2005). Anthony and Heintz (2007) suggested that folate might modulate proliferation within the early neural tube on ALDH1L1-positive regulated cells. In our work, several enriched biological pathways were associated with catabolic processes. Despite catabolic process is a basic metabolic function, we can see that in this case, the catabolic process related to the ALDH1L1 gene is crucial at the onset of embryo's development.
Following up the most likely candidate genes associated with the number of oocytes and embryos in our work, TCN2 codifies for transcobalamin 2 and its higher abundance in human placenta suggests a potential role in vitamin B12 transport to the fetus (Layden et al. 2016). CNOT6L and CRKL were found in our work and in Assou et al. (2013), who showed that these genes may play a role in ovulation and embryo development. TBC1D32 was associated exclusively with number of embryos in our work and among its related biological processes was determination of left/ right symmetry. Anton et al. (2018) associated TBC1D32 gene with fertility breeding value in Hungarian Simmental cattle, with a possible role in left-right symmetry and embryonic digit morphogenesis, in which a digit is one of the terminal divisions of an appendage, such as a finger or toe. Furthermore, PI4KA gene was related to viral replication complex formation and maintenance in our work and, according to GeneCards (https:// www. genec ards. org/), this gene plays a role in base cell metabolism functions, such as transferase and kinase activity. PI4KA was upregulated in Japanese Black (Wagyu) embryos after heat shock, and it was considered by Sakatani et al. (2013) an important gene for embryonic development. Also, PI4KA was a differentially expressed gene correlated with post-partum metabolite/hormone patterns in the oviduct of cyclic Holstein dairy cows (Valour et al. 2013). Therefore, genes that have important base cell functions may play a direct or indirect role in embryo development. Further studies are suggest to explore the genetic and genomic relation between the number of oocytes or embryos and other important reproductive traits in dairy cattle, especially the ones found in our QTL results, to provide better understanding over the general relation among fertility traits.

Conclusions
Most part of genes found in the genome-wide association analysis has roles in initial embryo development or general cell functions. Five transcriptional factors found, ARNT, EGR1, HIF1A, AHR, and PAX2 are good markers for the production of oocytes and embryos. Further studies are needed to verify the expression of these genes in dairy Gir cattle to better understand their role in oocyte and embryo production. Genes identified in this work may play a role in basic regulatory cell processes, indicating that they also affect other economic important traits in livestock. As genomic regions found in our work were related to several types of QTLs, we also suggest further studies to explore the relation between the number of oocytes/embryos and production traits.