The current research landscape of IVF offspring health
Following a Web of Science (WoS) search, 585 publications and 9698 citation items were retrieved from 2003 to April 2021. From the trend line, the number of documents increased exponentially and reached a peak in 2020. A relevance score was calculated for all terms, based on 24075 keyword terms and the 492 records that met the threshold. Finally, 53 keywords related to health problems of offspring after IVF were chosen (Figure 1A). These mainly include metabolism-related keywords such as birth weight, BMI, diabetes, blood pressure, cardiovascular disease, long-term health, and obesity.
Identification of common genes between IVF and T2DM
The transcriptome profile dataset GSE122214 was downloaded from the NCBI GEO database. The analysis showed that the villus transcriptome following IVF treatment differed significantly from that following natural conception. We identified 1,806 DEGs, including 1,064 upregulated and 742 downregulated genes (Figure 1B). In addition, potential targets associated with T2DM were retrieved from the Phenopedia database. Intersecting the DEGs with these T2DM targets in a Venn diagram revealed 15 common genes as hub genes to identify IVF infants at risk of developing T2DM as adults (Figure 1C), and the heatmap of the common genes is displayed in Figure 1D.
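The two steps described above, calling DEGs and intersecting them with a disease gene list, can be sketched as follows. This is a minimal illustration only: the cutoffs (`lfc_cutoff`, `p_cutoff`), the `GeneStat` record, and all gene names and values are hypothetical, since the text does not state the thresholds or the software used for the differential analysis.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    symbol: str
    log2_fc: float  # log2 fold change, IVF vs. natural conception
    adj_p: float    # multiple-testing-adjusted p-value


def split_degs(stats, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (upregulated, downregulated) sets of gene symbols.

    The default cutoffs are assumptions for illustration, not the
    thresholds used in the study.
    """
    up, down = set(), set()
    for g in stats:
        if g.adj_p >= p_cutoff or abs(g.log2_fc) < lfc_cutoff:
            continue  # not significant or effect too small
        (up if g.log2_fc > 0 else down).add(g.symbol)
    return up, down


def common_genes(degs, disease_genes):
    """Venn-style intersection of DEGs with a disease-associated gene set."""
    return sorted(degs & disease_genes)


# Placeholder data, invented for illustration only.
toy_stats = [
    GeneStat("GENE_A", 2.1, 0.001),
    GeneStat("GENE_B", -1.7, 0.010),
    GeneStat("GENE_C", 0.3, 0.200),
    GeneStat("GENE_D", 1.5, 0.030),
]
toy_t2dm = {"GENE_A", "GENE_B", "GENE_E"}

up, down = split_degs(toy_stats)
shared = common_genes(up | down, toy_t2dm)
print(len(up), len(down), shared)
```

In practice the input would be the full differential expression table for GSE122214 and the Phenopedia gene list for T2DM; the set intersection itself is the same operation that the Venn diagram visualizes.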
Functional annotation for DEGs via DAVID and Metascape
To further explore the biological function of the target genes, two online databases (DAVID and Metascape) were used for functional analysis of the selected genes. In the DAVID analysis, the KEGG results indicated that the top canonical pathways associated with the target genes included the PPAR signaling pathway, the HIF-1 signaling pathway, vitamin digestion and absorption, and fat digestion and absorption (Figure 2A). GO analysis revealed that, for biological processes, the common genes were mainly enriched in cholesterol metabolic process, positive regulation of PI3K signaling, blood pressure regulation, and lipoprotein metabolic process, among others (Figure 2B).
For cellular components, the target genes were significantly enriched in plasma membrane, extracellular exosome, very low-density lipoprotein particle, and chylomicron (Figure 2C). For molecular function, these genes were significantly enriched in cholesterol transporter activity, identical protein binding, phospholipid binding, and lipid binding (Figure 2D). Furthermore, functional enrichment analysis with Metascape revealed that the target genes were significantly enriched in regulation of lipid metabolic process, small molecule metabolic process, and oxidoreductase activity, among others (P < 0.05, Figures 2E–G).
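Enrichment results of this kind rest on an over-representation test. The sketch below shows the plain hypergeometric version of that test, as an illustration of the principle only: DAVID and Metascape apply their own variants and multiple-testing corrections, and the function name and the example counts are assumptions, not values from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_query, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    n_background: genes in the background set
    n_term:       background genes annotated to the term or pathway
    n_query:      genes in the query list (e.g. the common genes)
    n_overlap:    query genes annotated to the term
    """
    total = comb(n_background, n_query)
    upper = min(n_term, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 4 of 15 query genes fall in a 100-gene term,
# against a background of 20000 genes.
p_value = hypergeom_enrichment_p(20000, 100, 15, 4)
print(f"{p_value:.3e}")
```

A small p-value means the overlap is larger than expected by chance; with many terms tested, the raw p-values would still need correction before being compared with the 0.05 threshold.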
Properties of selected genes
Gene expression in the sample. A total of 15 common genes were differentially expressed between the villi of IVF and naturally conceived patients. Compared with the villi of naturally conceived patients, eight genes were upregulated and seven genes were downregulated in the villi of IVF patients (Figure 3A).
Physicochemical properties of selected genes. The physicochemical features of the 15 common genes are listed in Table 1, which gives the gene name, gene ID, protein length, molecular weight (MW), isoelectric point (pI), instability index, and predicted N-glycosylation sites. The selected genes are distributed on chromosomes 1, 2, 3, 6, 7, 11, 12, and 19. Protein length ranged from 167 to 4653 amino acids, with LEP being the shortest (167 amino acids) and APOB the longest (4653 amino acids). The MW of the encoded proteins ranged from 18.64 to 515.6 kDa, and the pI ranged from 5.56 to 8.93. Based on the protein instability index, most of the studied proteins (10/15) were unstable. In addition, most proteins had a negative grand average of hydropathicity (GRAVY), indicating that they are hydrophilic.
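As a worked illustration of one of these descriptors, GRAVY is the mean Kyte-Doolittle hydropathy value over all residues of a sequence, so a negative value indicates a predominantly hydrophilic protein. The sketch below computes it directly; the function name and the toy peptides are illustrative, and the tool the authors used for Table 1 is not stated here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = sequence.upper()
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if not seq or unknown:
        raise ValueError(f"empty sequence or unsupported residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Toy peptides (not real proteins): one charged, one aliphatic.
print(round(gravy("RKDE"), 2))  # negative, hydrophilic
print(round(gravy("ILVA"), 2))  # positive, hydrophobic
```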
Protein modification information. Post-translational modifications such as phosphorylation and glycosylation are involved in regulating protein stability and protein interactions. Therefore, potential phosphorylation and glycosylation sites were predicted in the amino acid sequences of the genes common to IVF and T2DM (Table 1, Figure 3B). APOB was predicted to be a heavily N-glycosylated protein (Table 1). The results also showed that APOB exhibited more predicted phosphorylation sites than the other selected proteins (Figure 3B).
Other characteristics. Three common SNP sites were associated with diabetes mellitus, including two SNP sites of APOE: rs7412 and rs429358 (Figure 3C). To identify common conserved motifs upstream of the selected genes, the GLAM2 tool was used to screen promoter regions (1500 bp upstream of the start codon). GC-rich regions were identified as conserved motifs upstream of the selected genes; these conserved sequences may be involved in regulating key targets (Figure 3D). We also identified conserved sites for miRNA families. Multiple miRNAs were found to be involved in regulating the expression of the selected genes, and the miRNAs shown in green, as central miRNAs, regulated multiple genes (Figure 3E).
Network establishment and analysis
Protein-protein interactions (PPIs) are involved in most biological processes, such as DNA transcription and replication, protein transport, protein degradation, and cell cycle regulation. In this study, STRING v11 was used to construct a PPI network (Figure 4A), which was visualized using Cytoscape 3.5.1. First, the Maximal Clique Centrality (MCC) of each node was calculated with CytoHubba, a Cytoscape plugin, and the genes with the six highest MCC values were considered hub genes. In parallel, another Cytoscape plugin, MCODE, was used to analyze the network and determine the central key nodes, resulting in seven hub genes. Subsequently, an online Metascape enrichment analysis was used to obtain a key target complex: APOA1, APOB, and APOE (Figure 4B). Next, gene correlation analysis was used to further investigate the expression relationships between genes; the stronger the correlation, the darker the color (Figure 4C). Finally, a Venn diagram was used to identify the common genes among these results; APOE and APOA1 were two such genes (Figure 4D). Taking the correlation between genes into account, APOA1, APOE, and APOB were strongly and positively correlated. We therefore regarded APOA1, APOE, and APOB as the key genes.
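To make the MCC ranking concrete, the sketch below scores each node as the sum of (|C| - 1)! over the maximal cliques C that contain it, which is the definition commonly given for this measure. It is a standard-library illustration under that assumption, not the CytoHubba implementation, and the node names and edges are placeholders rather than the STRING network of this study.

```python
from math import factorial


def maximal_cliques(adj):
    """Yield the maximal cliques of an undirected graph (Bron-Kerbosch with pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield r
            return
        # Pivot on the node with most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def mcc_scores(edges):
    """Assumed MCC definition: sum of (|C| - 1)! over maximal cliques containing a node."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    scores = dict.fromkeys(adj, 0)
    for clique in maximal_cliques(adj):
        weight = factorial(len(clique) - 1)
        for node in clique:
            scores[node] += weight
    return scores


# Placeholder network: a triangle P1-P2-P3 plus a pendant edge P3-P4.
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
scores = mcc_scores(toy_edges)
top = sorted(scores, key=scores.get, reverse=True)[:2]
print(scores, top)
```

Because the factorial weight grows quickly with clique size, nodes embedded in large, densely connected modules dominate the ranking, which is why the measure tends to select genes such as tightly interacting apolipoproteins as hubs.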
Gene complex-related diseases. Disease correlation analysis was conducted on the gene complex composed of APOA1, APOB, and APOE to find diseases related to the complex for reverse verification, and a gene-disease network was constructed (Figure 4E). As the figure shows, diseases closely related to the three genes include metabolic diseases, cardio-cerebrovascular diseases, nervous system diseases, and immune-related diseases. The metabolic diseases include obesity, diabetes, non-alcoholic fatty liver disease, and lipid metabolism disorders. The cardiovascular and cerebrovascular diseases include hypertension, heart disease, coronary artery disease, and cerebrovascular disease. The immune-related diseases mainly include leukocyte chemotactic factor 2 amyloidosis and apolipoprotein C-III-associated amyloidosis, and the neurological diseases include Alzheimer's disease.
Expression levels of hub genes in obesity. We used the Attie Lab Diabetes Database to examine the correlation between the hub genes and diabetes; BTBR mice with obesity become severely diabetic at 10 weeks of age. APOA1, APOB, and APOE were expressed differently across tissues in 4-week-old and 10-week-old BTBR obese diabetic mice (Figure 5). In 4-week-old BTBR obese diabetic mice, APOA1 was downregulated in adipose tissue, and APOE was downregulated in adipose and islet tissue. In 10-week-old BTBR obese diabetic mice, APOA1 was significantly downregulated in the liver, while APOB and APOE were significantly upregulated in islet tissue.
Identification of potential predictive drugs. From DrugBank, we obtained ten drug-mRNA interaction pairs involving the three key targets. The basic information and structural formula of each drug are presented in Table 2. Nine of the drugs regulate lipid metabolism and one is an antioxidant. Rosuvastatin, pitavastatin, and gemfibrozil were approved for use, whereas gamolenic acid, lovastatin, and mipomersen were approved for use but require additional investigation.
GSEA revealed the safety and potential health effects of IVF-ET on offspring
Functional differences between the two groups were assessed at the genome-wide level rather than among the DEGs alone. The gene sets most significantly enriched in the naturally conceived group were maturity-onset diabetes of the young; glycine, serine and threonine metabolism; fatty acid metabolism; and peroxisome (Figure 6).
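The core of GSEA is a running-sum statistic over all genes ranked by their association with the phenotype, which is why it does not depend on a DEG cutoff. The sketch below computes that enrichment score for one gene set. It is a minimal illustration of the general method: the function name, the weighting exponent default, and the toy genes and scores are assumptions, and the permutation step that GSEA uses to assess significance is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene symbols sorted by descending ranking metric
    scores:       mapping from gene symbol to ranking metric
    gene_set:     set of gene symbols to test
    p:            weighting exponent (p=0 gives the unweighted statistic)
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_weights = [abs(scores[g]) ** p if h else 0.0
                   for g, h in zip(ranked_genes, hits)]
    total_hit = sum(hit_weights)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")
    running, best = 0.0, 0.0
    for w, h in zip(hit_weights, hits):
        # Step up at members of the gene set, step down otherwise.
        running += w / total_hit if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


# Placeholder ranking, invented for illustration only.
toy_scores = {"g1": 2.0, "g2": 1.0, "g3": -1.0, "g4": -2.0}
toy_ranked = sorted(toy_scores, key=toy_scores.get, reverse=True)
es_top = enrichment_score(toy_ranked, toy_scores, {"g1", "g2"})
es_bottom = enrichment_score(toy_ranked, toy_scores, {"g3", "g4"})
print(round(es_top, 3), round(es_bottom, 3))
```

The sign of the score indicates at which end of the ranking the gene set concentrates; which group that corresponds to depends on how the comparison between IVF and natural conception was oriented when the ranking metric was computed.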