Differentially expressed IGF2BP2 in ovarian disorders: strongly associates with alternative splicing regulation in human granulosa cells


 Genome-wide interactions between RNA-binding proteins (RBPs) and RNA targets account for an important portion of post-transcriptional regulation. IGF2BP2 is associated with type 2 diabetes (T2D) and obesity and reportedly functions as an RBP that regulates a series of target genes by binding RNA transcripts. In this study, we detected the differential expression of IGF2BP2 in granulosa cells from women with ovarian disorders and performed RNA-seq and RIP-seq experiments in immortalized human granulosa cells (KGN cells) to evaluate global transcript levels and alternative splicing on KGN cells overexpressing IGF2BP2 versus controls. Our results show that IGF2BP2 preferentially binds to the 3’and 5’UTRs of mRNAs and enriches target gene expression in KGN cells. Notably, besides the conventional GGAC motif, we found a significant enrichment of a new GAAG motif within IGF2BP2-binding regions. We demonstrate that IGF2BP2 is involved in transcription regulation and alternative splicing in genes associated with follicular development. Furthermore, IGF2BP2 partly influences the expression levels of some of these alternatively spliced genes, including MBD3 and FN1, which may lead to ovarian endocrine disorders. In conclusion, we provide a transcriptome-wide analysis that demonstrates the role played by IGF2BP2 in the regulation of gene expression, transcription and alternative splicing of a number of genes involved in the pathogenesis of ovarian endocrine diseases.


Introduction
The ovary is an important reproductive organ in females that performs a variety of functions including generating fertile oocytes, secreting reproductive hormones and maintaining estrous cycles. These are complex biological processes that require transcriptional regulation of a large number of genes [1].
Ovarian granulosa cells (GCs), which serve as functional somatic cells that surround oocytes, can interact with the developing oocyte and are therefore essential for folliculogenesis. Aberrant pathological changes of GCs may result in follicle loss and eventually lead to several ovarian endocrine disorders, such as polycystic ovary syndrome (PCOS) and premature ovarian insu ciency (POI). Amounting evidence from a variety of research studies is revealing the speci c mechanisms associated with GCs regulation and their effect on follicle development. At present, RNA-binding proteins (RBPs) are gaining recognition for their importance in unveiling the mechanistic processes behind the pathogenesis of ovaries [2][3][4][5].
RBPs that bind to RNA target sequences and form ribonucleoprotein complexes play key roles in various gene functions in eukaryotic organisms. Speci c interactions between RBPs and their RNA targets represent important levels of post-transcriptional regulation, including the regulation of gene expression at the level of pre-mRNA splicing and polyadenylation, mRNA stability, mRNA localization and translation.
Alternative splicing (AS) is an important mechanism for controlling gene expression, which regulates tissue identity and numerous critical biological processes. Furthermore, increasing evidence indicates that RBPs are able to promote exon inclusion or skipping by binding to splicing regulatory elements [6][7][8].
Finally, change in the levels and activity of RBPs may lead to AS dysregulation and the production of diverse aberrant transcripts that might contribute to some diseases in humans.
Insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2), also known as IMP2 or VICKZ2, is a member of the IGF2 mRNA-binding protein family and encoded by the IGF2BP2 gene, which is located on chromosome 3q27 [9][10][11]. IGF2BP2 is associated with type 2 diabetes (T2D) and obesity, has been reported to be also involved in insulin resistance, lipid metabolism, and tumorigenesis [12,13]. Igf2bp2 is expressed in multiple organs and tissues, where it may exert a regulatory function. For example, previous studies reported that IGF2BPs (IMPs) such as IGF2BP2 are expressed in male and female gonadal cells at embryonic day 12.5 (E12.5) in mice and that IGF2BP1, IGF2BP2 and IGF2BP3 are present in resting and growing oocytes as well as in the granulosa cells of mature mouse and human ovaries [14].
It has been recently reported that IGF2BP2 is highly expressed in the granulosa cells of women with PCOS and participates in the high mobility group AT hook 2 (HMGA2)/IMP2 pathway to promote the proliferation of granulosa cells [15]. However, the comprehensive role played by IGF2BP2 in the pathogenesis of ovarian diseases remains poorly understood. In this study, we generated IGF2BP2overexpression immortalized human granulosa cells (KGN cells) and obtained IGF2BP2-regulated transcriptomes in the KGN cell line HPB-ALL by RNA sequencing (RNA-seq) in order to study the potential function of IGF2BP2 in the regulation of gene expression that is potentially linked to ovarian diseases. We also performed RNA-immunoprecipitation (IP) sequencing (RIP-seq) in KGN cells to comprehensively identify the mRNAs which are associated with IGF2BP2. Comparative transcriptome analysis revealed that IGF2BP2 binding is strongly associated with the regulation of alternative splicing in KGN cells and partly in uences the expression of a variety of genes linked to the pathogenesis of ovarian endocrine disorders.

Materials And Methods
Clinical samples A total of 30 women were enrolled in the study in the Department of Human Reproductive Medicine, Beijing Obstetrics and Gynecology Hospital from January 1, 2019 to January 31, 2021. They routinely signed informed consent after a detailed explanation about some subsequent data collection for further study. The diagnostic criteria of POI are as follows: 1) younger than 40 years of age; 2) at least 1 year of amenorrhea, 3) two or more instances in which the serum FSH level is greater than 35 U/L (ie, two analyses at intervals of 1 mo or more), and 4) serum estradiol levels less than 20 pg/mL. Women with PCOS were selected in accordance with the Rotterdam criteria [16], including oligoovulation and/or anovulation, clinical and/or biochemical signs of hyperandrogenism, and polycystic ovarian morphology by ultrasound. All women were 20 to 35 years old and with at least two serum concentrations of FSH <10 IU/L and anti-Müllerian hormone >1 ng/mL obtained at least 1 month apart. Control group included women of comparable age with regular menses and normal basic endocrine parameters and bilateral antral follicle counts (single antral follicle count = 6 to 10) who had received IVF-ET/ICSI (in vitro fertilization-embryo transfer/intracytoplasmic sperm injection) for male and/or tubal factor infertility and who had no family history of T2D. Patients were excluded if they had a diagnosis of uterine broids, pelvic tuberculosis, autoimmune diseases, genital organ deformity, metabolic disorders, recurrent miscarriage, recurrent implantation failure, chromosomal abnormality, fertility caused by tubal factor or male factor, a history of other ovarian surgery, or received a steroid or immunosuppressant within the preceding 6 months. The study was approved by the Ethics Committee of the Beijing Obstetrics and Gynecology Hospital, Capital Medical University.

Preparation of human GCs
Human granulosa cells were isolated from follicular uid aspirates obtained at oocyte retrieval as described by Shi et al. [17]. Brie y, follicular uid from each patient was centrifuged at 400×g for 10 minutes, and the layers of granulosa cells with the red blood cell pellet were resuspended. After shaking at 200 rpm for 20 minutes at 37°C, the cell suspensions were layered on 8.0 mL Ficoll-Paque Plus (GE Healthcare, Pittsburgh, PA) and centrifuged at 600×g for 20 minutes. Granulosa cells at the interface were harvested and washed three times with 10 mL Dulbecco's modi ed Eagle medium (DMEM)/nutrient mixture F-12 Ham (DMEM/F-12) supplemented with 10% fetal bovine serum, 100 U/mL penicillin, 100 μg/mL streptomycin sulfate, and 1× GlutaMAX (Invitrogen, Shanghai, China). After a nal centrifugation for 5 minutes at 600×g, the cells were prepared for the following RNA and protein extraction.

Cloning and Plasmid Construction
Primer pairs used for Hot Fusion were designed with CE Design V1.04. Each of the primer comprises of a speci c gene fragment sequence and a 17-30 bp sequence of the pIRES-hrGFP-1a vector. The forward (F) and reverse (R) sequences of the primers used were: F-primer: agcccgggcggatccgaattcATGATGAACAAGCTTTACA TCG R-primer: gtcatccttgtagtcctcgagCTTGCTGCGCTGTGAGGC The pIRES-hrGFP-1a vector was digested by EcoRI and XhoI (NEB) at 37℃ for 2h-3h. After digestion, we ran the vector on a 1.0% agarose gel and puri ed using a Qiagen column kit. Total RNA was isolated from KGN cells with Trizol. Puri ed RNA was transcribed for cDNA by oligo dT primer, after which the insert fragment was synthesized by PCR ampli cation. We added the linearized vector (i.e. digested with EcoRI and XhoI) and the PCR insert to a PCR microtube for ligation with ClonExpress® II One Step Cloning Kit (Vazyme). The plasmids were introduced into an Escherichia coli strain by chemical transformation, and the cells were plated onto LB agar plates containing 1µL/ml ampicillin and incubated overnight at 37℃. The colonies were screened by colony PCR (28 cycles) with universal primers (located on the backbone vector). The insert sequence was con rmed by Sanger sequencing.
Transfected cells were harvested after 48h for RT-qPCR analysis.
Assessment of gene overexpression GAPDH (glyceraldehyde-3-phosphate dehydrogenase) was used as a control gene for assessing the effects of IGF2BP2 overexpression. cDNA synthesis was done using standard procedures and RT-qPCR was performed on a Bio-Rad S1000 with Bestar SYBR Green RT-PCR Master Mix (DBI Bioscience, Shanghai, China). A detailed information about the primers used is presented in Supplementary Table 1.
The concentration of each transcript was normalized to GAPDH mRNA level using 2 -ΔΔCT method (Livak and Schmittgen 2001). We performed statistical comparisons with a paired Student's t-test using the GraphPad Prism software (San Diego, CA).

RNA extraction and sequencing
The KGN cells were ground into ne powder before RNA extraction. Total RNA was treated with RQ1 DNase (Promega) to remove DNA. The quality and quantity of the puri ed RNA were determined by measuring the absorbance at 260nm/280nm (A260/A280) using smartspec plus (BioRad). RNA integrity was con rmed by 1.5% Agarose gel electrophoresis.
For each sample, 0.5μg of total RNA was used for RNA-seq library preparation using the KAPA Stranded mRNA-Seq Kit for Illumina® Platforms (KK8544). Polyadenylated mRNAs were puri ed, fragmented and converted into double strand cDNA. Following end repair and A tailing, the DNA fragments were ligated to the Diluted Roche Adaptor (KK8726). After purifying and size fractioning to 300-500bps, the ligated products were ampli ed and puri ed, quanti ed and stored at -80℃ before sequencing. The strand marked with dUTP (the 2nd cDNA strand) was not ampli ed, allowing for strand-speci c sequencing.
The libraries were prepared for high-throughput sequencing following the manufacturer's instructions for the Illumina NovaSeq 6000 system with 150 nt paired-end sequencing.

RNA-Seq Raw Data Clean and Alignment
We discarded raw reads containing more than 2-N bases and then trimmed the adaptors and low-quality bases from raw sequencing reads using the FASTX-Toolkit (Version 0.0.13). We have also discarded reads shorter than 16nt. The dataset containing the clean reads was aligned to the GRch38 genome by tophat2 [18] allowing 4 mismatches. Uniquely mapped reads were used for gene reads number counting and FPKM calculation (fragments per kilobase of transcript per million fragments mapped) [19].

Differentially Expressed Genes (DEG) analysis
We utilized the R Bioconductor package edgeR [20] to obtain the differentially expressed genes (DEGs).
We set a false discovery rate <0.05 and fold change >2 or < 0.5 as the cut-off criteria for DEGs identi cation.

A3SS & ES and A5SS & ES.
To assess RBP regulated ASEs, we evaluated the signi cance of the ratio alteration of AS events by performing a Student's t-test. Those events with a signi cant at P-value cutoff corresponding to a false discovery rate cutoff of 5% were considered RBP regulated ASEs.

Reverse transcription qPCR validation of DEGs and AS events
We performed quantitative reverse-transcription polymerase chain reaction (RT-qPCR) for some of the DEGs to elucidate the validity of the RNA-seq data. Information on the primers utilized is presented in Supplementary Table 1. Total RNA remaining from the RNA-seq library preparation was used for RT-qPCR.
RNA was reverse transcribed into cDNA using a M-MLV Reverse Transcriptase (Vazyme). Real-time PCR was performed with the Step One Real Time PCR System using the SYBR Green PCR Reagents Kit (Yeasen). The PCR conditions were as follows: denaturing at 95˚C for 10 min, 40 cycles of denaturing at 95˚C for 15 s, annealing and extension at 60˚C for 1 min. PCR ampli cations were performed in triplicate for each sample. The RNA expression levels of all genes were normalized against that of GAPDH.
Additionally, we performed a RT-qPCR assay in order to validate ASEs. Information on the primers utilized for detecting ASEs is shown in Supplementary Table 1. To detect alternative isoforms, we used a boundary-spanning primer for the sequence encompassing the junction of a constitutive exon and an alternative exon as well as an opposing primer in a constitutive exon. The boundary-spanning primer of the alternative exon was designed according to "model exon" to detect model splicing or "altered exon" to detect altered splicing.

Functional enrichment analysis
We identi ed the functional categories of DEGs using Gene Ontology (GO) terms and KEGG pathways with the KOBAS 2.0 server. We used a hypergeometric test and a Benjamini-Hochberg FDR correction to de ne the enrichment of each term.
iRIP-seq library preparation and sequencing KGN cells were irradiated once for 400 mJ/cm2, grinded in liquid nitrogen and lysed in ice-cold wash buffer. Cells lysis was performed in cold wash buffer (1× PBS, 0.1% SDS, 0.5% NP-40 and 0.5% sodium deoxycholate) supplemented with a 200 U/mL RNase inhibitor (Takara) and protease inhibitor cocktail (Roche) and incubated on ice for 30 min. We cleared the cell lysate by centrifugation at 10,000 rpm for 10 min at 4°C. We then added RQ I (promega, 1U/μl) to a nal concentration of 1 U/μl and incubated in a water bath for 30 min at 37°C. Immediately afterwards, we added a stop solution to the lysates in order to quench DNase. The mixture was then vibrated vigorously and centrifuged at 13,000 x g at 4℃ for 20 min to remove cell debris. RNA digestion was then performed by MNase (Thermo Scienti c, EN0181).
For immunoprecipitation, the supernatant was incubated overnight at 4 °C with 10μg IGF2BP-antibody (Proteintech, 11601-1-AP) and control IgG-antibody (CST, 2729S). The immunoprecipitates were further incubated with protein A /G Dynabeads for 2h at 4 °C. After applying the magnet and removing the supernatant, we sequentially washed the beads with lysis buffer, high-salt buffer (250 mM Tris 7.4, 750 mM NaCl, 10 mM EDTA, 0.1% SDS, 0.5% NP-40 and 0.5 deoxycholate), and PNK buffer (50 mM Tris, 20 mM EGTA and 0.5% NP-40) twice. The beads were then resuspended in Elution buffer (50 nM Tris 8.0, 10 mM EDTA and 1% SDS) and incubated the suspension for 20 min in a heat block at 70°C to release the immunoprecipitated RBP with crosslinked RNA before vortex. The magnetic beads were removed on the separator and the supernatant was transferred to a clean 1.5 ml microfuge tube. Proteinase K (Roche) was added into the 1% input (without immunoprecipitated and the immunoprecipitated RBP with crosslinked RNA to a nal concentration of 1.2 mg/ml, and incubated for 120 min at 55°C. The RNA was puri ed with the Trizol reagent (Life technologies).
cDNA libraries were prepared with the KAPA RNA Hyper Prep Kit (KAPA, KK8541) according to the manufacturer's protocol. We prepared high-throughput sequencing libraries following the manufacturer's instructions for the Illumina NovaSeq 6000 system with 150 nt paired-end sequencing.

Data Analysis
After aligning the reads to the human reference genome GRCH38 with TopHat 2(Kim, Pertea et al. 2013), we kept only those that mapped uniquely for downstream analysis. The "ABLIRC" strategy was used to identify the genomic regions of IGF2BP2 binding [22]. Reads showing an overlap of at least 1 bp were clustered as peaks. For each gene, we used computer simulation to randomly generate reads having the same number and sequence lengths as empirical reads in the peaks. The reads obtained were further mapped to the same genes to generate random max peak height from overlapping reads. We repeated the entire process 500 times and selected the peaks with heights greater than those of random max peaks (p-value < 0.05). The IP and input samples were analyzed independently via simulations, and the IP peaks overlapping with Input peaks were removed. We nally determined the IP target genes by the peaks and used the HOMER software to call the IP binding motifs [23].

Functional enrichment analysis
We identi ed the functional categories of genes associated with peaks (target genes) using Gene Ontology (GO) terms and KEGG pathways with the KOBAS 2.0 server [24]. We used a hypergeometric test and a Benjamini-Hochberg FDR correction to de ne the enrichment of each term.

IGF2BP2 is differentially expressed in GCs from patients with ovarian diseases
We rst determined the clinical features related to ovarian diseases in control and patient samples ( Table  1). The average age of POI women is slightly older than the other two groups, while body mass index (kg/m 2 ) of the women with PCOS was 25.7±4.97, which was a little higher than the control group. Levels of follicle-stimulating hormone in POI group was signi cantly higher than that in the control group, which suggested poor ovarian functions in these women. Furthermore, antral follicle number was signi cantly much more in PCOS patients and less in POI patients compared to the normal women. We collected the follicular uid and isolated GCs these women who underwent assisted reproductive technology in the reproductive center.
To investigate potential involvement of IGF2BP2 in etiology of ovarian disease, we detected the expression levels in collected samples from POI, PCOS and normal women. Interestingly, IGF2BP2 was downregulated and upregulated in both transcriptional and protein levels in POI and PCOS patients' GCs respectively (Fig. 1a, 1b), which was indicated that it might be an important regulator to modulate gene expression associated with ovarian disorders. IGF2BP2 preferentially binds to 3' and 5'UTRs of mRNAs To better understanding the roles IGF2BP2 plays in the pathogenesis and interactions with other ovarian disease-related genes, the IGF2BP2-interacting transcripts were con rmed by RIP-seq analysis in KGN cells. In order to assess the speci city of IGF2BP2 pull-downs, we set IgG as the immunoprecipitation control. The IGF2BP2 antibody was used to immunoprecipitate IGF2BP2-RNA complexes from the total KGN cell lysates containing IGF2BP2 overexpressing plasmids. IGF2BP2 was detected by western blotting in the total KGN cell lysate while very few quantities were detected in the negative IgG controls. Similarly, we were able to nd abundant IGF2BP2 quantities in the immunoprecipitate fraction of KGN IGF2BP2-overexpressing cells (Fig. 2a). We sequenced the cDNA libraries from the RNA inputs and the IGF2BP2 fraction using the Illumina NovaSeq 6000 system. After removing the adaptor sequences and ltering out low-quality reads, we obtained a total of 44098992 and 49754547 reads from each of the IGF2BP2 immunoprecipitates repetitions, and 58215977 and 60099967 reads from the two RNA input repetitions. We mapped these four groups of reads to the human GRCH38 genome using Tophat2 [18], about 57% aligned. From the mapping data, 6.3% and 4.9% of the total mapped reads were uniquely mapped in the immunoprecipitates. About 3.4% of the total mapped reads were uniquely mapped in the input groups.
We plotted the distribution of uniquely mapped IGF2BP2 reads across reference genomic regions and found that the RIP-seq reads were highly enriched in CDS and 3′UTR regions (Fig. 2b). Furthermore, intronic regions comprised a small percentage of the whole distribution (Fig. 2b), which is consistent with the previously reported regulation of mRNA stability and translation of IGF2BP2 [25,26]. Our analysis also revealed that the fractions of clean reads that mapped to the 3' and 5'UTRs were signi cantly higher in the IGF2BP2 immunoprecipitates compared to the RNA input. Interestingly, we observed more IGF2BP2binding peaks at CDS than in intronic regions (Fig. 2b). Overall, the immunoprecipitates showed that IGFBP2 mainly binds to 5′UTR, and that this was more apparent in IP1 than IP2 (Fig. 2c).

GGAC and GAAG are IGF2BP2 binding motifs
We searched for overrepresented motifs in the IGF2BP2-binding peaks using the software Homer (http://homer.salk.edu/homer/motif/index.html). The GA-rich motif was amongst the top 10 IGF2BP2binding motifs in two independent experiments (Fig. 3a). Moreover, and in agreement with previous studies [26], we con rmed that GGAC is an enriched binding motif. It has been previously reported that IGF2BPs act as a distinct family of m6A readers that are able to recognize the GG (m6A) sequence to target mRNA transcripts. A further study showed that IGF2B2P2 may bind HMGA2 mRNA via a GGAC motif and that there are possible interactions within the circNSUN2/IGF2BP2/HMGA2 complex [27]. Importantly, we found a novel motif -GAAG -in the two immunoprecipitate experiments that rank eighth (IP1) and third (IP2), respectively (Fig. 3a). In terms of the RNA binding protein, we note that the GGAC motif was enriched at the 5′ UTR and that the GAAG motif was enriched in the 3′UTR (Fig. 3b, 3c). Finally, our analysis on the distribution of reads within 1kb up and downstream of the TSS and the TTS, respectively, showed that there was an enrichment of reads downstream of the start codon and upstream of the stop codon.

IGF2BP2-bound genes are enriched in gene expression regulation
In order to validate reliable IGF2BP2-bound genes, we called IGF2BP2-bound peaks using three different methods, namely ABlife, Piranha and CIMS. The number of overlapped peaks was calculated using ABlife. A total of 4674 peak clusters consistently and repeatedly overlapped in the two iRIP-seq sample replicates (Fig. 4a). We then performed GO and KEGG enrichment analysis to further examine the potential biological roles of these IGF2BP2-bound genes (Fig. 4b, 4c). The top 10 biological process GOterm enrichment demonstrate that IGF2BP2-bound genes are associated with the regulation of transcription from RNA polymerase II promoter, the actin cytoskeleton organization and the nerve growth factor signaling pathway. Analogously, enriched KEGG pathways included focal adhesion, pancreatic cancer, prostate cancer, proteoglycans, among others. Finally, Reactome analysis showed that IGF2BP2binding RNA are mainly associated with XBP1(s)-activated chaperone genes and IRE1 alpha-activated chaperones (Fig. 4d).

IGF2BP2 binds to the mRNAs of NFIC and FOSL2, two genes related to the regulation of ovarian disorders
To con rm IGF2BP2-binding mRNA from the iRIP-seq data, we identi ed all the peaks shared by both the bounds. RN7SL1, PTMS, NFIC, FOSL2, VASP and NR5A1 were selected by ranking the maxheight peaks in descending order. Among these potential binding genes, NFIC, FOSL2, NR5A1 and VASP were further veri ed by RIP-PCR validation. These genes have been previously reported to regulate other genes involved in the pathogenesis of primary ovarian insu ciency, a clinical syndrome that is de ned by the loss of ovarian activity before the age of 40 years and that has potentially devastating consequences on a woman's fertility [28]. We provide a depiction of the read density landscape of these genes across the genome and show much higher peaks compared to IGF2BP2 input samples. Speci cally, IGF2BP2 bound to the CDS and the 3'UTR region of NFIC while the binding sites of NR5A1 included the CDS, intronic regions and the 3'UTR (Fig. 5a, 5b).

RNA-seq analysis of IGF2BP2 regulated transcriptome pro le in KGN cells
The IGF2BP2 protein levels were signi cantly higher in KGN cells transfected with an IGF2BP2overexpressing plasmid compared to controls (Fig. 6a). This was con rmed by RT-qPCR showing signi cantly higher mRNA levels of IGF2BP2 in overexpressing cells (Fig. 6b). IGF2BP2 overexpressing (OE) and control cells were used to construct cDNA libraries for sequencing on an Illumina NovaSeq 6000 system. We trimmed the adaptor sequences and ltered the raw reads by removing low-quality reads and those containing 2-N bases. The clean reads were then mapped to the human genome GRCh38 using TopHat2: approximately 93.8% of the reads aligned and 87.37% were uniquely mapped in triplicates of IGF2BP2 OE. We quanti ed genes and transcripts to compare gene expression patterns across individuals. Effective IGF2BP2 overexpression was further con rmed by performing three repeated parallel RNA-seq experiments. There were 17,825 genes expressed in the RNA-seq data in groups overexpressing IGF2BP2 (Fig. 6c), of which 488 were upregulated and 653 were downregulated (following the criteria of FC ≥ 2 or ≤ 0.5 and FDR < 0.05). The heat map showing the hierarchically clustered Pearson's correlation matrix of the transcript expression values (Fig. 6d), and the volcano plot displaying the signi cantly expressed genes that are associated with the overexpression of IGF2BP2 (Fig. 6e), illustrate the distinct transcription pro les between control and IGF2BP2-overexpressing samples (Fig.  6f). We subsequently examine the potential biological roles of the DEGs regulated by IGF2BP2 by conducted GO enrichment analysis that included three distinct ontologies, namely molecular function, cellular component and biological process. The results including the top GO enrichment results on IGF2BP2-mediated up-or downregulation are shown in Figure 6g. IGF2BP2-upregulated genes are mainly associated with processes involving in ammatory response, cell surface receptor signaling pathway, defense response to virus and cytokine-mediated signaling pathway. The IGF2BP2-downregulated genes are mainly involved in transcription, DNA-dependent regulation of transcription, pattern speci cation process and positive regulation of transcription from RNA polymerase II promoter. Analogously, the top ten KEGG pathways associated with IGF2BP2-upregulated genes include the intestinal immune network for IgA production, in ammatory bowel disease and cytokine-cytokine receptor interaction; while those that are downregulated by IGF2BP2 are associated with transcriptional misregulation in cancer and glycerolipid metabolism (Fig. 6h).
DEGs regulated by IGF2BP2 overexpression were not attributed to the direct binding of IGF2BP2 We found a total of 17 genes that tend to bind to IGF2BP2 among the 1124 DEGs, as shown in the Venn diagram (Fig.7a). However, we found no signi cant correlation between IGF2BP2-binding in these 17 genes and differentially expressed genes (P>0.05). We then selected several relevant DEGs, including OASL, THEMIS2, BCHE, FAM133B and STAB2 that were enriched in the in ammatory response signal pathway and con rmed that the relative expression levels measured by RNA-seq data and RT-qPCR are generally consistent and show similar trends (Fig. 7b).

IGF2BP2 could regulate alternative splicing events in KGN cells
In order to shed light on the role played by IGF2BP2 on AS regulation, we used the uniquely mapped reads from the transcriptome sequencing data to explore IGF2BP2-dependent AS events in KGN cells. We detected 56.5% (207,635 out of 367,321) annotated exons when comparing these uniquely mapped reads to the reference genome annotation, and 54,400 novel splice junctions using Tophat2. Furthermore, we uncovered 11,557 known ASEs in the model gene we named in the reference genome, and 29,625 novel ASEs. Based on a stringent P value cut-off (≤0.05) and changed AS ratio ≥ 0.2, we were able to identify 340 high-con dence RASEs, of which the majority included intron retention (IR, 106 events), exon skipping (ES, 55 events), alternative 3′ splice site (A3SS, 49 events), alternative 5′splice site (A5SS, 46 events), and cassette exon (CE, 38 events) (Fig. 8a). These results suggest that IGF2BP2 is able to globally regulate ASEs in KGN cells. Besides the AS changes attributed to transcription regulation, we also analyzed the expression of RASGs at a transcriptional level. There are different alternatively spliced genes between IGF2BP2-overexpressing and control samples but there was only one signi cantly regulated at a transcript level in RASGs (Fig. 8b). Therefore, the observed changes in AS events cannot be simply attributed a transcription up-or downregulation.

Validation of ASEs by IGF2BP2 overexpression in KGN cells
A small part (67/235) of the IGF2BP2-regulated alternatively spliced genes overlapped the the set of genes that bind to IGF2BP2 (Fig. 9a), which suggests that IGF2BP2 binding might affect some alternative splicing events by binding to certain RNA targets. To further explore these alternative splicing genes directly regulated by IGF2BP2, we performed additional GO and KEGG pathway enrichment analyzes. In this case, the top hits obtained were transcription from RNA polymerase II promoter, blood coagulation, apoptotic process and gene expression (GO-terms), and Resin secretion, mRNA surveillance, glycosphingolipid biosynthesis-ganglio series (KEGG pathways) ( Fig. 9b and 9c, respectively). Among these 67 IGF2BP2-regulated genes, we validated four splicing events affecting MBD3, FN1, TFDP1 and MKNK2 by PCR (see Supplementary Table 1 for primer information), which were con rmed to be 5pMXE, ES, A3SS and A3SS, respectively (Fig. 9d, 9e and S2). These results are consistent with the previous RNA-seq results. After IGF2BP2-overexpression, the ratio of splicing patterns was signi cantly higher in MBD3, FN1 and TFDP1 but lower in MKNK2. These results further suggest that IGF2BP2 binds to different genes and plays an important regulatory role in their AS in KGN cells.

Discussion
IGF2BP2 is highly expressed in the gonads, oocytes and embryos in both mice and in humans [14,10].
The gene belongs to a conserved family of RNA-binding proteins containing six characteristic RNA binding modules, including two N-terminal RNA recognition motifs (RRM1 and RRM2) and four C-terminal hnRNP K-homology (KH1 to KH4) domains that control its RNA-binding speci city [45,9]. The expression of IGF2BP2 is widely distributed in E12.5 mouse embryos [12] and vital for RNA processing -its mRNA transcripts were detected at the 2-cell-stage during mice embryogenesis. It is likely that IGF2BP2 is associated with both energy expenditure and life span in different pathological conditions [46]. Accordingly, a large number of diseases, including diabetes, acute myelocytic leukemia [47] and colorectal carcinoma [48], may be partly attributed to the abnormal expression and dysregulation of IGF2BP2. In a previous study, IGF2BP2 was highly expressed in the granulosa cells of women with PCOS and it was possible to demonstrate that it serves as an HMGA2 target gene and can promote the proliferation of granulosa cells by the HMGA2/IGF2BP2 pathway [49]. However, despite the association between IGF2BP2 and ovarian endocrine disorders, there are previously proposed unknown regulatory mechanisms, and it remains unclear whether IGF2BP2 can regulate alternative splicing in the ovary. In the present study, we performed RNA-seq and RIP to explore whether IGF2BP2 could bind to genes associated with ovarian diseases and was able to regulate alternative splicing events.
We analyzed the binding pro le of IGF2BP2 in KGN cells via RNA-seq and found that IGF2BP2 preferentially binds to the 3' and 5'UTRs of mRNAs (Fig. 2). In general, we con rmed the GGAC motif that was previously reported as a common binding site recognizing the GG (m6A) sequence to target mRNA transcripts and which is involved in interactions within the circNSUN2/IGF2BP2/HMGA2 complex. More importantly, we also found a novel GAAG motif in the two immunoprecipitate experiments. We demonstrated that GGAC's binding sites were enriched at the 5′ UTRs while the GAAG motif was enriched at the 3′ UTRs. The recognition of both 5′ and 3′ splice sites in pre-mRNA was an essential event in the alternative splicing decision.
Furthermore, GO and KEGG analyzes on the RIP-seq data showed that the genes binding to IGF2BP2 mainly function as regulators of gene expression, which reveals an important role in the crosstalk between gene networks. Among the IGF2BP2-binding mRNA, we found four candidates associated with ovarian insu ciency (POI): NFIC, FOSL2, NR5A1 and VASP, as demonstrated by previous studies. For example, a mutation in NR5A1 was identi ed in women POI [50,51,7,52]. This mutation might result in an impaired transcriptional activation of the Amh, Inhibin-a, Cyp11a1 and Cyp19a1 genes, all of which are essential for normal ovarian function [53]. However, it is still unclear if the mutations in NR5A1 are rare and whether they contribute to pathogenicity of POI. Furthermore, FOXL2 exclusively heterodimerizes with Jun members to form an activator protein-1(AP-1) complex, thereby participating in various cellular processes including proliferation, differentiation, and transformation [54]. It has been shown that gonadotropin upregulates FOSL2 and JUNB in granulosa cells and in uences the phosphorylation of the estrogen receptor beta (ESR2), which play an essential role in the onset of puberty, gonadal development and ovulation [55]. Finally, the genes NFIC and VASP have also been associated with some ovarian disorders in female reproduction [56][57][58].
We unbiasedly analyzed the DEGs that show evidence of being regulated by IGF2BP2 from the RNA-seq data. Functional enrichment analyzes demonstrated that these DEGs are enriched in multiple GO-terms and KEGG pathways. These include in ammatory response, defense response to virus and cytokinemediated signaling pathway, all of which are concordant with the reported IGF2BP2 function in regulating in ammatory environments and genomic stability in human vascular endothelial cells and hepatocytes [59]. IGF2BP2-regulated genes were also enriched in intestinal immune network for IgA production, in ammatory bowel disease and cytokine-cytokine receptor interaction.
Among the 1124 DEGs obtained from the RNA-seq data, 17 (or their antisense RNA) have IGF2BP2 binding sites (Fig. 7). Thus, it is possible that these are the only genes which might be regulated through direct binding to IGF2BP2 despite an observed lack of signi cant correlation between IGF2BP2-binding and regulation. The selected genes showing an association with the in ammatory response signal pathway were con rmed by qPCR, further validating the analysis above.
Notably, the majority of alternatively spliced genes that are regulated by IGF2BP2 involved gene expression without changes in transcription levels. Therefore, we hypothesize that IGF2BP2 binds to mRNA to act as a regulator of gene expression and plays an important role in the regulation of alternative splicing events. Moreover, the AS genes that are also IGF2BP2 binding targets were mainly involved in gene expression which is in line with the ndings of a GO analysis comparing RASE between IGF2BP2overexpressing and control groups. The IGF2BP2-regulated AS genes in these biological processes include MBD3, FN1, TFDP1 and MKNK2, all of which were validated by RT-qPCR. We were also able to demonstrate that, after overexpressing of IGF2BP2 in KGN cells, the mRNA expression ratios of MBD3, FN1 and TFDP1 were higher, while that of MKNK2 was lower. MBD3, a member of the nucleosome remodeling deacetylase complex, is a maternal protein that is essential for the protection of imprinting control regions against DNA demethylation after fertilization and for maintaining DNA methylation of the H19 paternal allele during preimplantation [60,61]. FN1 is a glycoprotein that is present on the cell surface, in extracellular uids and connective tissues, and a component of the basement membrane in the ovarian follicle. Fibronectin is a secretion product that is associated with granulosa cell adhesion and migration in rats [62,63]. The production of bronectin is indispensable in the early stages of follicular development and tends to rise during follicle maturation [64]. Previous studies have shown that bronectin appears during the early luteal phase and then decreases in murine granulosa cells [65], and that it might participate in the regulation of cell proliferation [66].

Conclusion
In this study, we indicated that IGF2BP2 might be involved in the regulation of transcription and alternative splicing of a series of genes associated with follicular development in human granulosa cell lines. Moreover, we show that IGF2BP2 partly in uences the expression of some target alternative splicing genes, and that this may lead to the pathogenesis of ovarian disorders. However, the detailed mechanisms of these alternative splicing events and downstream regulatory patterns still require further investigation.

Con icts of interest/Competing interests
The authors wish to declare that they have no con ict of interest.

Availability of data and material
The RNA-seq data produced in this work will be deposited in NCBI's Gene Expression Omnibus and can be accessed through GEO series accession.

Ethics approval
The authors wish to con rm the originality of this work and that is has neither been published nor is it under consideration for publication elsewhere. Ethical approval was obtained from the Ethics Committee of the Beijing Obstetrics and Gynecology Hospital, Capital Medical University.

Consent to participate
Informed consent was obtained from all individual participants included in the study.

Consent for publication
Patients signed informed consent regarding publishing their data.