Ethics Statement
All procedures were performed according to standard protocols or the manufacturers’ instructions and the study was approved by the ethics committees of the First Affiliated Hospital of Xiamen University.
Sample Collection
A total of 6 subjects (3 TFI patients and 3 endometriosis-related infertility patients) were recruited to participate, and all of the patients were of reproductive age with normal menstrual cycles. Herein, 3 patients with endometriosis were all patients whose endometrium occur in the ovary (group EMS). Meanwhile, 3 patients with TFI refer to women who have been diagnosed with tubal obstruction by fallopian tube angiography before (group TFI). In addition, all patients in our study gave informed consent. The follicular were monitored by ultrasonography from the ninth day of the menstrual cycle. On the 6th-7th day after ovulation (the LH test showed double line and the ovulation sign was detected by B ultrasonic examination), endometrium was obtained after these checks performed. The tissue was divided into two parts: the first part was fixed by 10% formaldehyde and was sent for pathological examination and endometrial lesions were excluded. The second part was stored in liquid nitrogen, and then the total RNA was extracted to explore lncRNA and mRNA expression profiles by human lncRNA Array v4.0.
RNA Extractionand Quality Control
According to the manufacturer's instructions, total RNA extracted from tissue was performed using Trizol reagent (Invitrogen). RNA quantity and quality were assessed by spectrophotometric measurement (Nanodrop ND-1000). Denaturing agarose gel electrophoresis was used to evaluate the RNA integrity. RNA was stored at -80°C for further experiments.
Microarray Analysis and Data Processing
Human LncRNA Microarray V4.0 (lncRNAs (40173) + coding genes (20730), 8 × 60 K,Array Star, Rockville, MD, USA) designed for the global profiling of human lncRNAs and protein-coding transcripts was used in our study. The microarray can detect about 40,173 lncRNAs and 20,730 coding transcripts, which are derived from authoritative databases (RNA-seq, GENECODE, RefSeq, UCSC, UCR, ect) and other landmark publications. Sample labeling and array hybridization were performed in terms of the Agilent One-Color Microarray-Based Gene Expression Analysis protocol (Agilent Technology) with minor modification. In brief, mRNA samples from endometrium were purified to remove rRNA and were amplified and transcribed into fluorescent cRNA along the entire length of the transcripts without 3' bias utilizing a random priming method (Arraystar Flash RNA Labeling Kit, Arraystar). RNeasy Mini Kit (Qiagen) was employed to purify the labeled cRNA. After hybridization and washing, the arrays were scanned with Agilent DNA Microarray Scanner (part number G2505C). Raw data were extracted using Agilent Feature Extraction software (version 11.0.1.1). Furthermore, quantile normalization and subsequent data processing were done through GeneSpring GX v12.1 software package (Agilent Technologies). Log2 transformation was performed on the mRNA and lncRNA expression data. The average RNA expression value was used when duplicated data were found. Microarray hybridization and expression data collection were performed by Kangchen Bio-tech, Shanghai, China. The microarray analysis was conducted in School of Medicine, Xiamen University, Xiamen, Fujian, China. Finally, we obtained lncRNA and mRNA expression profiles.
Identification of DEGs
Herein, R software and limma package (http://www.bioconductor.org/packages/release/bioc/html/limma.html) in Bioconductor were used to identify DEGs by comparing mRNA expression differences in eutopic endometrium tissues from TFI and endometriosis-related infertility respectively.13 Normalized data of mRNAs in two groups of samples (TI and EMS) was shown in Fig. 1C. DEGs were identified by fold-change screening at a threshold of 2.0-fold or greater, and a p-value < 0.05.
Enrichment Analyses of DEGs
To understand the biological roles of these DEGs associated with TFI and endometriosis-related infertility, pathway analysis (based on Kyoto Encyclopedia of Genes and Genomes, http://www.genome.jp/kegg/) and Gene Ontology (GO) (http://www.geneontology.org) analysis were assessed by the clusterProfiler package in R software to explore the significant pathways related to DEGs14. adjust. P < 0.05 was used as thresholds to define markedly enriched GO terms/pathways. GO is a series of semantics used to describe the characteristics of genes. These semantics are mainly divided into three kinds: Biological Process (BP) , an orderly biological process with many steps, such as cell growth, differentiation and maintenance, apoptosis and signal transduction, Molecular Function (MF), which mostly refers to the functions of individual gene products, such as binding activity or catalytic activity, and Cellular Component (CC), which is used to describe the location of gene in cells, such as cytoplasm, nucleus, endoplasmic reticulum or mitochondria. KEGG is a database for systematic analysis of metabolic pathways of gene products in cells, and it is the most commonly used metabolic pathway analysis. ClusterProfiler14 package (http://www.bioconductor.org/packages/release/bioc/html/clusterProfiler.html) in Bioconductor implements methods to analyze and visualize functional profiles (GO and KEGG) of gene and gene clusters.
TFs-DEGs Network Construction
Genes are often regulated by TFs at the transcription level. Therefore, DAVID (Database for Annotation, Visualization and Integrated Discovery) 6.8 (https://david.ncifcrf.gov/) was used for TFs enrichment analyses to investigate DEGs at the transcriptional level. Herein, the p-value (Benjamini) ≥ 0.05 was taken as the exclusion standard and the TFs-mRNAs network was constructed by Cytoscape15 software. DAVID is a gene functional classification tool that integrates a set of functional annotation tools for investigators to analyze biological functions behind massive genes. Currently, the DAVID database is mainly used for the analysis of functions, pathways and TFs enrichment of genes.
Analysis of the Important Modules and Hub Genes
To analysis the important modules and Hub genes associated with endometriosis-related infertility, we constructed a PPI network of DEGs through the STRING 11.0 (https://string-db.org/) online database with the basic settings (meaning of network edges: evidence, minimum required interaction score: medium confidence) and visualized the results by Cytoscape software. Then, the important modules in the PPI network was analyzed by Molecular Complex Detection (MCODE) plug-in with the max depth, haircut on, node score cut-off, degree cut-off and k-core set as 100, 0.2, 0.2, 10 and 2, respectively. Gene ontology and pathway enrichment analysis of genes in each module (module 1, module 2 and module 3) were performed by DAVID database and the cut-off criterion of p-value (Benjamini) was set as < 0.05.
Moreover, in this study, we screened out the top 50 Hub genes in the PPI network by cytoHubba16 plug-in and predicted their TFs by iRegulon17 plug-in. The top 5 TFs in the NES scores were displayed. STRING, a database that predicts interactions between proteins or genes, includes both direct physical interactions and indirect functional correlations between proteins. MCODE plug-in can discover closely related regions in PPI network, which may represent molecular complexes. The MCC algorithm of CytoHubba plug-in can help us discover key genes and subnetwork in PPI network. IRegulon, a plug-in that predict TFs of gene sets, predicts TFs by calculating motif enrichment analysis and uses multiple position weight matrices (PWM) to score each motif.
Identification of DELs
The normalized expression of lncRNAs in two groups of samples (TI and EMS) was shown in Fig. 1D. Moreover, normalized data of lncRNAs was used for differential analysis by limma package in R software and the DELs were identified by fold-change screening at a threshold of 2.0-fold or greater, and a p-value < 0.05.
CeRNA Network Construction
It is well known that the DEGs is regulated by miRNAs and lncRNAs at transcriptional level. The combination of miRNAs and mRNAs can promote the degradation of mRNAs. At the same time, lncRNAs can competitively combine with mRNAs to further inhibit the degradation of mRNAs by miRNAs. The interaction network of lncRNA, miRNAs and mRNAs are called competing endogenous RNA (ceRNA) network. Herein, in order to better understand the physiological changes of eutopic endometrium between patients with TFI and endometriosis-related infertility, we used R software and Perl software to construct the ceRNA network (DEGs-miRNAs-DELs) through the miRDB database (miRNA-mRNA, http://mirdb.org/), miRTarBase database (miRNA-mRNA, http://mirtarbase.mbc.nctu.edu.tw/index.html), TargetScan database (miRNA-mRNA, http://targetscan.org/vert_72/) and miRcode database (lncRNA-miRNA, http://www.mircode.org/index.php) database.
Herein, first, DELs-miRNAs interaction pairs were integrated using the miRcode database. Second, target genes of miRNA signatures were obtained using three databases: miRDB, miTarBase, and TargetScan. Genes present in all three databases were regarded as target genes of miRNAs. Comparing predicted target genes of miRNAs with the DEGs, only the remaining overlapping genes and their interaction pairs were used for further analysis. Then, the ceRNA networks (DEGs-miRNAs-DELs) were constructed by Cytoscape software according to DELs-miRNAs pairs and miRNAs-mRNAs (overlapping genes) pairs.
TFs Prediction of MRNA in CeRNA Network
MRNA and miRNA can be regulated by TFs at the transcription level. Herein, in order to predict the upstream regulatory mechanism of mRNAs related to endometriosis-related infertility, we predicted the TFs of mRNAs in ceRNA network by iRegulon plug-in in Cytoscape software and we only showed the top 10 TFs in the NES scores in this study.
Screening of Potential Key Genes
In order to screen out the potential key genes related to endometriosis-related infertility. In this study, we selected the lapping genes of important modules, Hub genes and ceRNA network as the potential key genes. At the same time, we also predicted the TFs of potential key genes and miRNAs (target to potential key genes) by iRegulon, respectively. The top 5 TFs in the NES scores were displayed.
TFBs Predition of Potential Key Genes
In this study, we predicted transcription factor binding sites (TFBs) for these potential key genes that bind to TFs by JASPAR 2020 online software (http://jaspar.genereg.net/). JASPAR18 is a collection of transcription factor DNA-binding preferences, modeled as matrices. These can be converted into Position Weight Matrices (PWMs or PSSMs), used for scanning genomic sequences. JASPAR is an open-access database of curated, non-redundant transcription factor (TF) binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. The higher the score, the greater the likelihood that the TF will bind to the DNA binding site (motif).