Analysis of Expression Proles of mRNA and lncRNA in Oviduct During Estrus Phase of Sheep (Ovis Aries) with Two Fecundity (FecB Gene) Genotypes

previous researches have provided a detailed insight on the regulation involved estrus phase and FecB in the reproductive-related tissues including hypothalamus, pituitary, and However, as the host of embryo development and connection between the ovary and the uterus, little is known about the interaction between mRNAs and lncRNAs in sheep oviduct. In the present study, RNA-Seq was performed to identify the transcriptomic proles of mRNAs and lncRNAs in oviduct during estrus phase of sheep with FecB BB/++ genotypes.


Background
The increasing needs for livestock industries to maximize economic bene ts calls for molecular markers that enable identi cation of superior animals. In sheep industry, lambing trait is believed to be one of the most economically important traits, the molecular mechanism underlying sheep lambing trait has been a research hotspot since the 1980s [1,2]. Bone morphogenetic protein receptor 1B (BMPR1B) is the rst major candidate gene revealed strongly associated with the fecundity of Booroola sheep and possesses a mutation (A746G), known as Booroola fecundity (FecB), which results in one amino acid substitution (Q to R) increasing the ovulation rate of Booroola ewes [3,4]. The FecB mutation has an additive effect on ovulation number and litter size, one copy of the FecB mutation can increase the ovulation number by 1.3-1.6 and the litter size by 0.9-1.2, and two copies by 2.73 and 1.1-1.7, respectively [5].
The evolution of transcriptomic technology driven by the high-throughput RNA Sequencing enabled indepth research towards non-coding RNAs (ncRNA), which cannot translate into a protein and previously referred to as junk RNAs [6,7]. Studies in ncRNAs have revealed various imperative biological roles in mammals such as diseases [8], development [9], reproduction [10], etc. Long non-coding RNAs (lncRNA) is a kind of ncRNAs with lengths of more than 200 nt, notably, lncRNAs can act as microRNA (miRNA) sponges to regulate target gene expression in various biological activities [11][12][13]. Interaction between lncRNA and its target genes has become a hot topic in the search for novel practical targets in physiology [14] and disease [15], however, speci c roles of lncRNAs in reproduction are still limited.
Recent observations indicated ewes with variations in FecB gene also potentially to produce singleton [16], hence, the molecular mechanism of FecB remains to be determined and warrants further investigation. Considering the essential roles of the lncRNAs in mammals, non-FecB mutation carrier (FecB ++ , mutant type, M) Small Tailed Han (STH) sheep and FecB mutation carrier (FecB BB , wild type, w) STH sheep were selected during follicular and luteal phases. RNA-seq was performed for the identi cation and characterization the transcriptomic pro les of lncRNAs and mRNAs in the oviduct.
Furthermore, systematic analysis was conducted to add depth to our understanding of the molecular mechanism of FecB and screening for candidate lncRNAs/mRNAs that affecting sheep fertility. Based on CPC, CNCI, and PFAM, 19,969 novel lncRNAs were identi ed from the unknown transcripts ( Fig. 1A), within which 57.3%, 34.8%, and 7.9% were revealed as intronic lncRNA, lincRNA, and antisense lncRNA, respectively (Fig. 1B). In addition, 1,894 annotated lncRNAs and 43,674 mRNAs were also revealed. Overall, 21,863 lncRNAs and 43,674 mRNAs were served as candidates for following analyses.

Overview of Sequencing Results
As Fig. 1C and 1D shows, the length of the lncRNAs was mainly between 200 bp-3,000 bp (1388 bp on average), the majority of lncRNAs have 2-8 exons (4 on average). Regarding mRNAs, the length was mainly distributed in the range of 100 bp -5,000 bp (3810 on average), the majority of mRNAs have 1-20 exons (13 on average), both of which were markedly greater than that of lncRNAs.

Transcriptomic Pro ling and DE Analysis
FPKM was performed to estimate the expression level of candidate transcripts (Fig. 2), our results showed that mRNAs had a signi cantly higher expression level than that of lncRNAs (p < 0.05), and the expression level of candidate transcripts was relatively lower in follicular phase than luteal phase.
In the comparisons between wF and wL, cis-target genes of identi ed DE lncRNAs were found signi cantly enriched in 44 GO terms, top enriched GO terms in BP, CC, and MF, were response to chemical stimulus (GO:0042221), ATP-binding cassette (ABC) transporter complex (GO:0043190), and protein binding (GO:0005515), respectively. Trans-target genes of DE lncRNAs were found signi cantly enriched in 68 GO terms, top enriched GO terms in BP, CC, and MF, were G-protein coupled receptor signaling pathway (GO:0007186), peroxisome (GO:0005777), and transmembrane signaling receptor activity (GO:0004888), respectively. DE mRNAs were found signi cantly enriched in 111 GO terms, top enriched GO terms in BP, CC, and MF, were cellular response to stress (GO:0033554), photosystem II (GO:0009523), and ion binding (GO:0043167), respectively. Regarding KEGG enrichment analysis, cis-target genes of DE lncRNAs, trans-target genes of DE lncRNAs, and DE mRNAs, were found signi cantly enriched in 4, 3, and 6, KEGG pathways, respectively. Within which, several important pathways related to reproduction were enriched, such as PPAR signaling pathway (oas03320), MAPK signaling pathway (oas04010), and ECMreceptor interaction (oas04512), etc.

GO and KEGG analysis using DE lncRNAs and mRNAs identi ed in FecB BB genotype (M) vs. FecB ++ genotype (w)
In the comparisons between MF and wF, cis-target genes of identi ed DE lncRNAs were found signi cantly enriched in 77 GO terms, top enriched GO terms in BP, CC, and MF, were nitrogen compound metabolic process (GO:0006807), small-subunit processome (GO:0032040), and DNA binding (GO:0003677), respectively. Trans-target genes of DE lncRNAs were found signi cantly enriched in 425 GO terms, top enriched GO terms in BP, CC, and MF, were cellular process (GO:0009987), Golgi apparatus part (GO:0044431), and transferase activity, transferring phosphorus-containing groups (GO:0016772), respectively. DE mRNAs were found signi cantly enriched in 135 GO terms, top enriched GO terms in BP, CC, and MF, were organic substance metabolic process (GO:0071704), adherens junction (GO:0005912), and binding (GO:0005488), respectively. Regarding KEGG enrichment analysis, cis-target genes of DE lncRNAs, trans-target genes of DE lncRNAs, and DE mRNAs, were found signi cantly enriched in 4, 10, and 7, KEGG pathways, respectively. Within which, several important pathways related to reproduction were enriched, such as AMPK signaling pathway (oas04152), PI3K-Akt signaling pathway (oas04151), and GnRH signaling pathway (oas04912), etc.
In the comparisons between ML and wL, cis-target genes of identi ed DE lncRNAs were were found signi cantly enriched in 118 GO terms, top enriched GO terms in BP, CC, and MF, were protein metabolic process (GO:0019538), nucleosome (GO:0000786), and calcium ion binding (GO:0005509), respectively. Trans-target genes of DE lncRNAs were found signi cantly enriched in 43 GO terms, top enriched GO terms in BP, CC, and MF, were urea transport (GO:0015840), membrane (GO:0016020), and hydrolase activity, acting on ester bonds (GO:0016788), respectively. DE mRNAs were found signi cantly enriched in (GO:0046483), cell part (GO:0044464), and ion binding (GO:0043167), respectively. Regarding KEGG enrichment analysis, cis-target genes of DE lncRNAs, trans-target genes of DE lncRNAs, and DE mRNAs, were found signi cantly enriched in 4, 4, and 6, KEGG pathways, respectively. Within which, several important pathways related to reproduction were enriched, such as p53 signaling pathway (oas04115), TNF signaling pathway (oas04668), and TGF-beta signaling pathway (oas04350), etc.

DE-interaction network analysis
In general, 18, 401 cis-target genes of 19,378 corresponding lncRNAs and 8,305 trans-target genes of 11, 307 corresponding lncRNAs were predicted. Based on the results of DE analysis, 33 DE lncRNAs were found trans-target 137 DE mRNAs (Fig. 9). Within which, LNC_018420, LNC_016630, and LNC_013441 were quanti ed generating the highest number of connections with corresponding DE mRNAs (47,21, and 20, respectively).

Sequencing Data Validation
Comparison of the relative expression level of the candidate lncRNAs and mRNAs selected for data validation between RT-qPCR and RNA-Seq are shown in Fig. 10. Collectively, selected lncRNAs and mRNAs showed similar expression patterns between RNA-Seq and RT-qPCR, indicating the reliability and repeatability of our sequencing data.

Discussion
Given that the ovulation rate of sheep can be genetically controlled by several major candidate genes [17,18], there is an urgent need to identify novel biomarkers of proli cacy. Even though several transcriptomic researches associated with proli cacy on sheep hypothalamus [19,20], uterus [10], ovary [16], etc. have been previously reported, there is no research focus on oviduct. As the connection between the ovary and the uterus [21], recent researches demonstrated that oviduct is essential in hosts fertilization and preimplantation development of the embryo [22][23][24], however, the in-depth molecular mechanisms have not been elucidated. In the present study, an indigenous Chinese sheep with sheep (Ovis aries) breed with excellent lambing performance (year-round estrous, 2.61 lambs per year on average), STH sheep [25,26] was selected, RNA-Seq was performed on oviduct to investigate the transcriptomic pro les during follicular phase and luteal phase in sheep with two FecB genotypes.

Transcriptomic pro les
In the present study, 21,863 lncRNAs and 43,674 mRNAs were identi ed in sheep oviduct, initially, FPKM was performed to establish the expression level, our results showed that majority of the candidate lncRNAs/mRNAs (>75%) were revealed with low expression level (average FPKM<5), candidate lncRNAs/mRNAs with average FPKM >500 account for about 0.4%. Within which, OVGP1 (Ovis aries oviductal glycoprotein 1) was obtained with highest FPKM. Numerous studies have provided a detailed insight on OVGP1 in enhancing the fertilization rate and embryo development in different species [27][28][29][30], in the present study, the expression level of OVGP1 was remarkably higher in follicular phase than that in luteal phase, our results demonstrated the positive effects of OVGP1 on the reproductive e ciency in STH sheep. Regarding lncRNAs, LNC_006353 was found with highest FPKM, notably, the expression levels of LNC_006353 were higher in follicular phase than that in luteal phase, we hypothesize that LNC_006353 may be an important regulator in embryo development and could potentially serve as candidate lncRNAs for researches on sheep oviduct. 57  It's worth noting that 4 mRNAs: C2CD5 (C2 calcium-dependent domain containing 5), EPB41L1(erythrocyte membrane protein band 4.1-like 1), EML4(echinoderm microtubule associated protein like 4), and VIRMA (vir like m6A methyltransferase associated) and LNC_006270 were found DE n all comparisons. Several researches have been conducted to assess the in uence of these genes on metabolic [31][32][33] and energy balance [34][35][36], however, little is known about the association between these candidates and reproduction, thus, we hypothesis that these genes may indirectly affect embryo development in sheep oviduct via metabolic progress.
In the comparisons between follicular phase and luteal phase, 18 mRNAs and 3 lncRNAs were obtained DE in both two comparisons. Within which, FSTL5 and LNC_016628 were detected having a greater expression levels in follicular phase than luteal phase. FSTL5 belongs to Follistatin-like protein family, a kind of glycoprotein involved in various pathological and physiological processes [37], and played an important role in mammal pregnancy [38,39]. Moreover, FSTL was isolated in combination with TGF-β (transforming growth factor beta) super family [40], key regulators in reproduction [41], which indicated that FSTL5 may also function as a connection between embryo development and oviduct. Regarding DE lncRNAs, LNC_016628 was detected highly expressed in follicular phase speci cally, and predicted target MARK3, an receptor of gonadotropin releasing hormone (GnRH) [42], hence, we hypothesized that a regulatory relationship between LNC_016628, MARK3, GnRH, and embryo development may exist. Collectively, our results suggested that FSTL5 and LNC_016628 may serve as potential biomarkers in embryo development, the speci c mechanisms still require further careful characterization.
FecB has been widely studied in the course of ovary and be proved to be key regulator gene in sheep reproduction [43]. In each estrus cycle, FecB mutation caused partial inactivation of the TGFβ/BMP signaling pathway and leading to increased ovulation rates [44]. In the comparisons between FecB BB genotype and FecB ++ genotype, the number of DE mRNAs/lncRNAs were relatively lower than that observed in comparisons between follicular phase and luteal phase, implying that the FecB mutation have little in uence on transcription stability at mRNA/lncRNA levels in sheep oviduct probably. Contrary to present results, our previous transcriptomic research in sheep hypothalamus detected a remarkably greater number of DE mRNAs and miRNAs in the comparisons between two FecB genotypes [20], the difference in experimental tissues may be responsible for the results. 6 mRNAs and 1 lncRNA were obtained DE in both two comparisons, none research have been found focused on the relationships between these DE candidates and reproduction, thus, whether these genes participate sheep embryo development remains unclear.

Functional Analysis
In the comparisons between follicular phase and luteal phase, most of the enriched GO terms of DE candidates were closely related to reproduction process, such as SAGA complex and ATP-binding cassette (ABC). Regulation of gene expression during embryo development requires cooperation between histone-modifying enzymes and transcriptional factors, such as Gcn5 and PCAF, which were proved integrated into a multi-subunit complex called SAGA [45]. ABC transporters shuttle a wide variety of substrates and part in embryonic development in both direct or indirect way [46][47][48]. Furthermore, Zou et al. found that DE genes were also were signi cantly enriched in ABC transporters in the comprehensive transcriptomic research in ovarian follicles of uniparous and multiple goats [49]. Our ndings indicated that SAGA complex and ABC transporters may also serve as key factors in sheep embryo development.
Regarding KEGG analysis, Hippo signalling pathway was enriched in both two comparisons. Hippo signalling pathway is known as a key driver in the preimplantation embryo development [50][51][52]. Several researches have revealed that differential Hippo activity is highly correlated with inner cell mass and the trophectoderm, the rst two segregate during early embryo development in mammals [53,54]. Furthermore, key transcripts of Hippo signalling pathway were detected during different stages in preimplantation embryo development which highlight the diverse roles of Hippo signalling pathway in embryo development [55,56]. Collectively, it's possible to speculated that Hippo signalling pathway could in uence reproduction by regulating embryo development in sheep oviduct.
In the comparisons between FecB BB genotype and FecB ++ genotype, cellular process involved in reproduction (GO:0048610) and reproduction (GO:0000003) draws our attention as its close relationship with embryonic development. Notably, NES (nestin) was found enriched in these GO terms, as the key regulator in the neural response, most nestin-positive cells were found strongly positive for vimentin [57,58], a differentiation marker in preimplantation embryos [59,60]. Hence, the results from our study indicated that FecB mutant may be involved in nervous system in sheep oviduct. Results of the KEGG showed that majority of the enriched pathways were close related to metabolic pathways. Consistently, several metabolic pathways were also found enriched in the RNA-Seq research in sheep hypothalamus with different genotype [20]. However, TGFβ/BMP signaling pathway was not found enriched in both comparisons. Considering the molecular regulation of FecB mutant, we hypothesis that FecB mutant may probably function as regulator in the basic regulation of metabolic processes of non-ovary tissues in sheep.

DE-interaction network
In the present study, DE-interaction network contains 33 DE lncRNAs with 137 target DE mRNAs was constructed, the topmost connected regulators were LNC_018420 (47) and LNC_016630 (21). Among the target genes of DE lncRNAs, some were also enriched in the GO term and KEGG pathway we mentioned above which related embryo development, such as ZNFs, and ADAMTs. ZNF (zinc nger proteins) constitute the largest superfamily of transcript regulators in mammals, numerous of ZNF genes are identi ed as imperative biomarkers in the development and differentiation of the nervous system [61,62], which also proved to be interacted with multiple regulations in the embryonic development [63]. ADAMTs belongs to a group of proteins that have platelet-associated activity, recent research in rats demonstrated that ADAMTs may possibly affect the endometrial receptivity and resulting in implantation failure of the embryo [64]. Our nding suggests that these top lncRNAs with highest connections, especially were genotyped in pre-experiment, 6 sheep with the FecB ++ genotype and 6 sheep with the FecB BB genotype were identi ed and chosen for the following experiment, 3 FecB ++ sheep and 3 FecB BB sheep were slaughtered during follicular phase by euthanasia (KCl (1 mg/kg) intravenous under deep anesthesia using Pentobarbital Sodium (1.5 mg/kg)), and the remaining sheep were slaughtered during luteal phase by euthanasia. Oviduct tissues were collected and snap-frozen in liquid nitrogen and then stored at -80 °C for RNA extraction. Details about experimental sheep treatment, phenotype identi cation, and initial processing can be found in Zhang et al. [20].

RNA Extraction, and Sequencing
Total RNA was extracted from stored oviduct tissues of 12 sheep with TRIzol Reagent. Qubit 3.0 Flurometer and RNA Nano 6000 Assay Kit were used for examining the quality and quantity of isolated RNA, RNA integrity Number (RIN) was checked by Agilent 2100 bioanalyzer as the threshold of RIN ≥ 8.0.
Library construction for sequencing was performed with 5 μg of high-quality RNA. The lncRNA and mRNAs libraries was constructed with extracted RNA using the NEBNext ® Ultra™ RNA Library Prep Kit for Illumina ® per the manufacturer's recommendations. LncRNA and mRNA libraries were then sequenced using a HiSeq 2500 platform (pair end, 150 bp), all the sequencing works were conducted by Novogene Technology Co., Ltd. (Beijing, China).

Read Assembly
The raw reads in FASTQ format generated by sequencing were subjected to lter for removing low-quality reads based on the following criteria: reads that contained adapters and N (the proportion of bases could not be identi ed >10%), and low-quality reads (base with sQ ≤ 5 accounts for more than 50% of the entire reads) were removed. Finally, cleaned reads were obtained and aligned to the Ovis aries reference genome (Oar_v3.1) using HISAT2 [65].

Differential Expression Analysis
The fragments per kilobase per million mapped reads (FPKM) [70] method was used to estimate the expression levels of lncRNA and mRNA transcripts. DESeq2 [71] was conducted to identify differential expressed (DE) lncRNA and mRNA transcripts in multiple comparison. lncRNA and mRNA transcripts were identi ed as signi cantly DE as the threshold of q value (p values adjusted by Benjamini and Hochberg's False Discovery Rate approach, FDR) <0.05.

Target Genes Prediction of lncRNAs and DE lncRNA-mRNA Interaction network
For an in-depth sight on transcriptomic pro les, target genes (cis-and trans-) of DE lncRNAs in multiple comparisons were predicted. Coding genes located 100 kb up-/down-stream of the corresponding lncRNAs were considered as cis-target genes. Besides, Pearson correlation coe cients was calculated between the expression level of multiple coding genes and corresponding lncRNAs. Coding genes were considered as trans-target genes as the threshold of |correlation| ≥ 0.95.
Based on the results of differential expression analysis and target genes prediction, DE lncRNA-mRNA interaction networks containing DE lncRNAs and DE mRNAs in different comparisons were constructed using Cytoscape software [72].
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Enrichment Analyses GO and KEGG pathway enrichment analyses were conducted using the DE mRNAs and target genes of the DE lncRNAs based on the results of differential expression analysis and target genes prediction. GO and KEGG enrichment analyses were performed using GOseq R library [73] and KOBAS (KO-Based Annotation System) [74] programs, respectively. Followed by a Fisher's exact test with FDR multiple test correction, signi cant enriched GO term or KEGG pathway was determined as the threshold of p < 0.05.

Data Validation
Five lncRNAs and ve mRNAs were randomly selected to verify the reliability and repeatability of our RNA-Seq data. The primers (Additional le 5) of selected lncRNAs, mRNAs, and reference gene beta-actin were designed using Primer Premier 5 software, and real-time quantitative polymerase chain reaction (RT-qPCR) was performed in triplicate following the SYBR Green I method. 2 -ΔΔCt method [75] was used to calculate the relative expression level. The results were shown as the fold change of relative expression level (mean ± standard error) using GraphPad Prism 6 software. The details about experimental RNA extraction, preparation of rst strand of cDNA, and RT-qPCR conditions can be found in Zhang et al. [20] Abbreviations mRNA messenger RNA All experimental procedures in this study were approved by the Science Research Department (in charge of animal welfare issue) of the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China (permit number: IAS2019-49). All efforts were taken to minimize pain and discomfort to the animal while conducting these experiments.

Consent for publication
Not applicable.

Availability of data and materials
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/, PRJNA658731.     DE-interaction network, where red and purple represent lncRNAs and mRNAs, respectively.