3.1 Study design
The flowchart of our study design is shown in Figure 1. Our initial aim was to identify core genes in CSP development and their possible regulatory mechanisms. We used four GEO datasets and five SRA datasets (GSE18850, GSE149436, GSE195795, GSE2957, PRJNA687512, PRJNA516344, PRJNA725184, PRJNA667091, and PRJNA417762), together with decidua and villus samples collected from six women for proteomic analysis. We extracted protein expression data from women with normal pregnancy and with CSP, as well as gene expression data for CD and EI, to determine whether CD and EI are associated with CSP. From the resulting DEGs, co-expressed DEGs were identified. Biological process and KEGG analyses were then performed to select important signaling pathways. Finally, we verified the important role of the core co-expressed genes in CSP by PPI analysis and, based on the pattern of gene changes observed in each dataset, proposed the hypothesis of an ITGB3 rebound effect.
3.2 Quality control
After the database search was completed, a series of quality control checks was performed to ensure that the results met the required standards (Supplementary Figure 1). Most peptides were 7-20 amino acids in length, which met the quality control requirements. Most proteins were matched by more than two peptides and had a sequence coverage below 20%. In addition, proteins above 10 kDa were distributed relatively uniformly, indicating that sample preparation introduced no significant molecular weight bias for proteins above 10 kDa.
3.3 Incidence of CSP is associated with cesarean delivery and embryo implantation
The proteomic analysis detected 7276 proteins across all collected samples. Differential analysis identified 1063 differential proteins in SDvsD, of which 781 were down-regulated and 282 were up-regulated, and 1023 differential proteins in SVvsV, of which 757 were down-regulated and 266 were up-regulated. After enriching the differentially expressed proteins of each comparison group for GO classification, KEGG pathways, and protein structural domains, we performed a clustering analysis to look for functional correlations among the differentially expressed proteins across the comparison groups (Supplementary Figures 2-4). Biological processes were mainly enriched in immune response, substance exchange, synthesis, and metabolism. KEGG terms were primarily enriched in focal adhesion, pathways of various diseases, and nucleic acid synthesis. Considering the sites that affect embryo implantation, focal adhesion signaling, substance exchange, and metabolism may all be involved in the development of CSP.
In addition, after removing duplicate genes and entries with missing gene symbols, we combined the DEGs from the four CD datasets (PRJNA516344, GSE149436, GSE18850, and PRJNA687512) to obtain 3007 DEGs (CD_DEGs). Likewise, combining the DEGs from the three embryo implantation datasets (PRJNA417762, PRJNA667091, and PRJNA725184) yielded 16040 DEGs (EI_DEGs). We then identified 7103 DEGs in GSE2957 for non-pregnancy versus normal pregnancy (NPvsP), of which 398 genes were upregulated and 6705 were downregulated. Analysis of GSE195795 yielded 7103 DEGs for postpartum versus normal pregnancy (PPvsP), of which 3327 genes were upregulated and 4206 were downregulated. These results are listed in order in Supplementary Tables 2-6.
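The merging step described above can be sketched as a set union over per-dataset DEG lists. This is only an illustrative sketch: the function name `merge_deg_lists`, the treatment of `None`, empty strings, and "NA" as missing symbols, and the placeholder gene names are all assumptions, not the pipeline actually used in this study.

```python
def merge_deg_lists(deg_lists):
    """Union of gene symbols across datasets, skipping missing symbols.

    Duplicates collapse automatically because the result is a set.
    Treating None, "" and "NA" as missing is an assumption of this sketch.
    """
    merged = set()
    for genes in deg_lists:
        for symbol in genes:
            if symbol is None:
                continue
            symbol = symbol.strip()
            if symbol and symbol.upper() != "NA":
                merged.add(symbol)
    return merged


# Hypothetical toy input: placeholder symbols, not results from the study.
toy_cd_lists = [
    ["GENE_A", "GENE_B", None],
    ["GENE_B", "GENE_C", ""],
    ["GENE_C", "NA", "GENE_D"],
]
toy_cd_degs = merge_deg_lists(toy_cd_lists)
print(sorted(toy_cd_degs))
```

Using a set keeps the merged list free of duplicates regardless of how many datasets report the same gene.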
By merging the DEGs from the clinical samples and the differential expression datasets, we identified the DEGs shared by CD_DEGs, SDvsD, and SVvsV, named DEGs1, which comprised 142 genes (Fig. 2A). Similarly, the intersection of EI_DEGs, SDvsD, and SVvsV, named DEGs2, comprised 315 genes (Fig. 2B). This suggests that shared genes are involved in both CD and EI and in the occurrence of CSP. These shared genes number in the hundreds, which suggests that the association is unlikely to be accidental. To verify the reliability of the Venn-diagram approach for relating CD and EI to CSP, we performed a statistical analysis. First, we intersected the 7276 proteins detected in the clinical samples (RAW) (Supplementary Table 7), SDvsD, and SVvsV with CD_DEGs and EI_DEGs, respectively, and then performed four chi-square tests by pairwise comparison (Table 1). Because the total number of cases was > 40 and all expected frequencies were > 5, we report Pearson's chi-square results. All four tests showed a statistically significant difference (reported P = 0.00, i.e. P < 0.05). These results indicate that the overlap obtained by intersecting the CD or EI DEGs with the differential proteins differs from the overlap obtained with the full set of detected proteins.
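For readers who wish to reproduce this kind of comparison, a Pearson chi-square test on a 2 x 2 table can be sketched with the standard library alone. The table layout assumed here (rows: two protein sets; columns: overlapping or not overlapping a DEG list) and the counts are hypothetical, since the sketch is not derived from Table 1; the function name is likewise illustrative.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value, minimum expected frequency). The minimum
    expected frequency lets the caller check the "all expected > 5" rule.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    statistic = 0.0
    min_expected = float("inf")
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            statistic += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value, min_expected


# Hypothetical counts, chosen only to illustrate the calculation.
stat, p, min_exp = pearson_chi_square_2x2(30, 70, 10, 90)
print(stat, p, min_exp)
```

The sketch assumes no row or column total is zero; a production analysis would more likely rely on an established statistics package.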
Table 1 Comparison of two methods for finding differential gene associations
3.4 Six hub genes are involved in CSP formation through cell adhesion
GO functional annotation and enrichment of the overlapping DEGs were performed using DAVID v6.8. The BP results indicated that DEGs1 were primarily enriched in “cell-cell adhesion”, “intracellular protein transport”, “endocytosis”, “mRNA splicing, via spliceosome”, and “response to drug”, among others (Fig. 3A), whereas DEGs2 were mainly associated with “cell-cell adhesion”, “cell adhesion”, “innate immune response”, “platelet degranulation”, and “ER to Golgi vesicle-mediated transport”, among others (Fig. 3B). To further determine which genes might play an important role in the progression of CSP, we entered the filtered GO terms into the QuickGO website for secondary screening and obtained the ancestor chart (Supplementary Figure 5). The chart shows that “cell-cell adhesion” is a child term of “cell adhesion”, so only genes enriched in “cell adhesion” were used for the subsequent analysis. Combining the two sets of GO results, we found that cell adhesion was involved in both. Given the characteristics of CSP, it is reasonable to regard cell adhesion as the biological process most relevant to CSP. However, to avoid missing other candidates, all other genes enriched in GO terms were retained, and 60 DEGs were obtained after removing duplicates (Supplementary Table 8).
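The idea behind such term enrichment can be illustrated with a plain hypergeometric over-representation test. This is a swapped-in, simplified technique rather than DAVID's own scoring, and the function name, parameter names, and toy numbers are assumptions made for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical toy numbers: 20 background genes, 5 annotated to a term,
# a query list of 4 genes, 2 of which carry the annotation.
p_enrich = hypergeom_upper_tail(k=2, n=4, K=5, N=20)
print(p_enrich)
```

A small p-value means the query list contains more annotated genes than expected by chance; in practice the p-values are also corrected for testing many terms.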
To identify the core genes, the 60 DEGs were uploaded to STRING for further analysis. The resulting network was divided into three groups by k-means clustering (Fig. 3C), with high similarity between objects in the same cluster and large differences between objects in different clusters. The network contained 56 nodes and 182 edges, with a local clustering coefficient of 0.591 and a PPI enrichment P-value < 1 × 10⁻¹⁶. Next, the protein-protein interaction network was processed with Cytoscape. First, gene clusters were identified with the MCODE plugin using the following criteria: MCODE score > 8, degree cutoff = 2, node score cutoff = 0.2, maximum depth = 100, and k-core = 2. The eight highest-scoring genes in the first gene cluster were obtained (Fig. 3D). Then the top 10 key genes (Fig. 3E) were selected with the CLASS algorithm of cytoHubba. Finally, the genes selected by the two methods were intersected, yielding six hub genes: MMP9, VTN, ITGB3, CTNNB1, CDH1, and ITGA2B. This further screening suggests that these six hub genes are the most likely to be involved in the regulation of CSP.
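As an illustration of what a local clustering coefficient measures, the following sketch computes the average local clustering coefficient of a small undirected graph. It assumes the common definition (the fraction of a node's neighbor pairs that are themselves connected, averaged over nodes appearing in the edge list); whether this matches the exact definition behind the value reported above is not confirmed here, and the toy graph is hypothetical.

```python
from itertools import combinations


def average_local_clustering(edges):
    """Average local clustering coefficient of an undirected graph.

    Nodes with fewer than two neighbors contribute 0. Nodes that do not
    appear in any edge are not counted (a simplification of this sketch).
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    coefficients = []
    for node, nbrs in neighbors.items():
        degree = len(nbrs)
        if degree < 2:
            coefficients.append(0.0)
            continue
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        coefficients.append(2.0 * linked / (degree * (degree - 1)))
    return sum(coefficients) / len(coefficients)


# Hypothetical toy network: a triangle A-B-C with one pendant node D.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
avg_cc = average_local_clustering(toy_edges)
print(avg_cc)
```

Values close to 1 indicate that a node's interaction partners also tend to interact with one another.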
3.5 Three hub genes activate the focal adhesion signaling pathway to form CSP
To explore which signaling pathway is activated in the formation of CSP, we performed KEGG pathway enrichment analysis on the six hub genes. The DAVID enrichment results (Fig. 4A) were mainly concentrated in “focal adhesion”, “ECM-receptor interaction”, and “PI3K-Akt signaling pathway”, among others. In addition, Metascape analysis showed that the six hub genes were concentrated in three pathways: “fluid shear stress and atherosclerosis”, “focal adhesion”, and “proteoglycans in cancer” (Fig. 4C). Focal adhesion was enriched by both methods, making it the most likely signaling pathway for the formation of CSP. The hub genes involved in this pathway are VTN, ITGB3, and ITGA2B, suggesting that these three hub genes most likely regulate the development of CSP.
To explore the molecules associated with these three hub genes and to find more specific mechanisms regulating CSP, we performed a PPI network analysis of the differential proteins from the clinical samples together with the three hub genes (Fig. 4B), using a confidence level of 0.7. The results indicate that six differential genes co-interact with the hub genes. Notably, among these co-regulated genes, SRA and PTK2 are also involved in the focal adhesion signaling pathway and lie downstream of the three hub genes, which further supports the view that these three hub genes promote the development of CSP through activation of the focal adhesion signaling pathway.
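One way to read the co-interaction screen is: keep only edges at or above the confidence threshold, then retain the proteins that are linked to every hub gene. The sketch below follows that reading, which is an assumption, as are the function name, the edge format, and the placeholder node names; none of the values are study data.

```python
def shared_partners(scored_edges, hubs, min_score=0.7):
    """Proteins linked to every hub by an edge scoring at least min_score.

    scored_edges: iterable of (protein_a, protein_b, confidence) tuples.
    Edges between two hubs are ignored when collecting partners.
    """
    partners = {hub: set() for hub in hubs}
    for a, b, score in scored_edges:
        if score < min_score:
            continue
        if a in partners and b not in partners:
            partners[a].add(b)
        if b in partners and a not in partners:
            partners[b].add(a)
    return set.intersection(*partners.values())


# Hypothetical toy edges: placeholder names and scores for illustration only.
toy_hubs = ["HUB_1", "HUB_2", "HUB_3"]
toy_scored_edges = [
    ("HUB_1", "P1", 0.90),
    ("HUB_2", "P1", 0.80),
    ("P1", "HUB_3", 0.75),
    ("HUB_1", "P2", 0.95),
    ("HUB_2", "P2", 0.60),  # below the threshold, so P2 is not shared
    ("HUB_3", "P2", 0.90),
    ("HUB_1", "HUB_2", 0.99),
]
common = shared_partners(toy_scored_edges, toy_hubs)
print(common)
```

Raising `min_score` trades sensitivity for specificity: fewer, but better supported, shared partners remain.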
To explore how gene expression changes specifically regulate CSP formation, we reviewed the gene expression datasets and clinical samples for these three hub genes. First, all three genes were upregulated in the differential protein expression of the clinical samples (SDvsD; SVvsV), suggesting that increased expression of ITGA2B, VTN, and ITGB3 in either decidual or chorionic villus tissue favors CSP formation. In the GSE195795 dataset, after delivery compared with pregnancy, ITGB3 expression was down-regulated, VTN was slightly up-regulated, and ITGA2B expression did not differ significantly. In the GSE2957 dataset (NPvsP), the expression of ITGB3, VTN, and ITGA2B was significantly down-regulated in all three cases. This could be attributed to the upregulation of these three genes after pregnancy, which contributes to embryo implantation. In the three datasets comparing cesarean delivery with spontaneous delivery (PRJNA516344; GSE149436; GSE18850), ITGB3 expression was significantly upregulated in all of them, whereas VTN and ITGA2B showed no significant change. Of the three hub genes, only ITGB3 changed after cesarean delivery, suggesting that ITGB3 is closely linked to cesarean delivery. Remarkably, ITGB3 was the only one of the three hub genes involved in all of pregnancy, cesarean delivery, and CSP. Combining the expression changes of the three hub genes revealed an interesting pattern for ITGB3. First, ITGB3 expression is generally lower after delivery than during pregnancy, and this down-regulation is more pronounced when delivery is by CD (Fig. 4D). Second, ITGB3 expression was slightly upregulated in general pregnancy compared with non-pregnancy, whereas CSP showed a marked increase in ITGB3 expression compared with general pregnancy (Fig. 4E).
In summary, ITGB3 is downregulated more strongly after CD than after general delivery; because the degree of downregulation is greater, the increase in ITGB3 is more pronounced when another pregnancy occurs. This pattern of change therefore leads to a high-fold increase in ITGB3 in CSP, which may be an important causative factor for the occurrence of CSP after CD. The P values of all the above differential genes were less than 0.05.