Are there common mechanisms of aneuploidy-related abortion and defects in fetus?

Background: Monosomies and trisomies, the most common aneuploid abnormalities, are the leading causes of miscarriages and fetal defects in humans. Although there is evidence suggesting that aneuploidies may share some common features, their common mechanism remains unclear. The objective of this study was to explore the common mechanism of monosomies and trisomies, with the aim of identifying critical biomarkers and pathways for early diagnosis and effective therapy. Methods: We obtained the mRNA expression profile GSE114559, comprising data from 101 samples, from the GEO database. These data include normal transcriptomes and transcriptomes for every monosomy and trisomy. We conducted Limma analysis using the criteria adj. p<0.05 and |FC|>1 to identify all monosomy-related and trisomy-related differentially expressed genes (DEGs), and found their overlapping DEGs with a Venn diagram. We then performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses and protein-protein interaction (PPI) network analysis to find the functions, pathways and hub genes of these DEGs. We carried out weighted correlation network analysis (WGCNA) to further detect candidate genes and pathways related to all DEGs and to their overlapping DEGs. Finally, we used qPCR to verify the pathological changes of specific genes. Results: We identified all monosomy-related DEGs, trisomy-related DEGs, and their overlapping DEGs, which were enriched in spliceosome, thyroid hormone, and infection-related genes and signalling pathways. GO and KEGG analyses also showed that epigenetic-related pathways were significantly enriched in the DEGs of monosomies. We explored the hub genes and modules in the DEGs of monosomies and in the overlapping DEGs by PPI analysis. We then found by WGCNA that spliceosome, thyroid hormone, and infection-related genes and signalling pathways were enriched in both the all-DEGs group and the overlapping-DEGs group. Finally, we verified by qPCR some hub genes in trisomy 21 and 47,XYY samples from clinical patients, and the results were consistent with the PPI analysis. Conclusion: Our study indicates a potential common mechanism involving spliceosome, thyroid hormone and infection-related signalling pathways for both monosomies and trisomies, and an epigenetic mechanism for monosomies.


Background
Aneuploidy, most of which is monosomy or trisomy, is defined as a chromosome number that is not an exact multiple of the usual haploid number. It affects 50%-80% of preimplantation embryos, 10-40% of pregnancies, and 0.3% of newborns [1,2]. It is the leading cause of miscarriages and developmental defects in humans [3,4]. Monosomic or trisomic babies that do survive are affected by well-defined syndromes, including Turner, Down, Edwards, Patau, Klinefelter, and Superfemale syndromes. Furthermore, they exhibit obvious clinical phenotypes, including facial abnormalities, cardiovascular and neurodevelopmental abnormalities, and limb asymmetry. This is because normal ontogenetic development is tightly regulated, and aneuploidy leads to large genomic imbalances that disturb it, resulting in miscarriages and developmental defects.
The large genomic imbalance in gene expression caused by aneuploidy is attributed both to genes on the abnormal chromosome and to genes on other chromosomes. Some evidence suggests that only a few genes on the aneuploid chromosome reach the theoretical 0.5-fold or 1.5-fold change, and that several genes on euploid chromosomes show disrupted expression profiles [5][6][7]. This evidence indicates that aneuploidy has a global effect on the whole transcriptome.
Although in theory the global effect on the transcriptome should differ greatly between aneuploidies, because each involves a different set of disturbed genes, various aneuploidies have been found to share some similar phenotypic and molecular characteristics clinically [7][8][9][10]. Thus, there may be common mechanisms among them, including key biomarkers and pathways shared by all aneuploidies, which could benefit early diagnosis and effective treatment of these diseases. It is therefore necessary to identify the common mechanisms associated with various aneuploidies.
One of the most effective and feasible methods for studying the mechanism of aneuploidies is to analyze RNA-seq data with various bioinformatics technologies, an approach that has seen myriad applications in genetics [11][12]. So far, a large number of publicly available aneuploidy-related RNA-seq datasets have been deposited in the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). Some key biomarkers and pathways in aneuploidy have been found by using bioinformatics technologies to integrate and analyze these datasets [13][14][15]. Recent studies indicated that some aneuploidies, such as trisomy 18 and 21, or trisomy 8, 13 and 22 and monosomy 45,X, have potential common mechanisms [7,13].
However, a comprehensive understanding of the molecular mechanisms shared among all aneuploidies, which is needed for early diagnosis and effective therapy, still requires in-depth investigation. Our study therefore investigates the common mechanisms of all aneuploidies.
In this study, we hypothesized that there are common mechanisms among various aneuploidies. To test this hypothesis, we adopted a comprehensive bioinformatics approach to analyze an RNA-seq dataset (GSE114559) from the GEO database, including 5 normal samples, 53 trisomic samples, and 43 monosomic samples, so as to identify the common mechanisms of various aneuploidies.

Data Processing
The differentially expressed genes (DEGs) among monosomies and trisomies were identified with Limma, an R package for comparing groups of samples from a GEO series, as described in reference [16]. Adjusted p values (adj. p) were used to control false positives, using the default Benjamini-Hochberg false discovery rate method. An adj. p < 0.05 and |FC| > 1 were used as the cut-off values. The Venn diagram of these DEGs was generated using a webtool (http://bioinformatics.psb.ugent.be/webtools/Venn/). The heatmaps of these DEGs were then drawn using the pheatmap package (Raivo Kolde, rkolde@gmail.com).
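The selection and overlap steps described above can be sketched in a few lines. This is only an illustration in Python, not the Limma workflow used in the study: the gene names, p values and fold changes are invented, the function names are hypothetical, and the sketch simply thresholds whatever fold-change value it is given, since the text does not state whether FC is on a log2 scale.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, pvalues, fold_changes, alpha=0.05, fc_cutoff=1.0):
    """Keep genes with adj. p < alpha and |FC| > fc_cutoff."""
    adj = benjamini_hochberg(pvalues)
    return {
        gene
        for gene, q, fc in zip(genes, adj, fold_changes)
        if q < alpha and abs(fc) > fc_cutoff
    }


# Toy input (hypothetical genes and statistics, for illustration only).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
mono_degs = select_degs(genes, [0.01, 0.04, 0.03, 0.005], [2.0, 0.5, -1.5, 1.2])
tri_degs = select_degs(genes, [0.20, 0.01, 0.02, 0.01], [1.8, 1.4, -2.1, 0.3])

# The overlapping DEGs are the intersection shown in the Venn diagram.
overlap = mono_degs & tri_degs
print(sorted(overlap))
```

The set intersection plays the role of the central region of the Venn diagram; with real data the two input tables would come from the monosomy-versus-normal and trisomy-versus-normal comparisons.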

Gene Ontology (GO) And Pathway Enrichment Analysis Of DEGs
The functional classification of DEGs was obtained from the GO database (http://www.geneontology.org), which covers three domains: cellular component (CC), molecular function (MF), and biological process (BP) [17]. The biological pathways of DEGs came from the Kyoto Encyclopedia of Genes and Genomes database (KEGG, http://www.genome.ad.jp/kegg/), a knowledge resource for understanding high-level functions and utilities of biological systems [18]. The integrated functional and pathway analysis was performed with DAVID (http://david.abcc.ncifcrf.gov/), an online tool that performs GO and KEGG analysis to provide visualization, annotation, and integration of genes and proteins [19]. p < 0.05 was considered statistically significant.
Then, PPI networks (one for the monosomy-related DEGs and one for the overlapping DEGs) were built with a query score ≥ 400 and visualized with Cytoscape software v3.6.0. Furthermore, the modules in the PPI networks were identified with the Cytoscape plug-in MCODE [CITED]. The GO functions and pathways of the identified modules were annotated with the Cytoscape plug-in BiNGO.
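As a rough illustration of how hub genes can be ranked in such a network, the sketch below filters an edge list by interaction score and counts node degrees. It is a minimal Python example under stated assumptions: the edge list and gene names are invented, the function names are hypothetical, and the degree threshold used in the example is chosen only so that the toy network yields a result.

```python
from collections import Counter


def node_degrees(edges, min_score=400):
    """Count distinct interaction partners per gene among edges with score >= min_score."""
    # Store each pair once, regardless of direction or duplication.
    kept = {
        frozenset((a, b))
        for a, b, score in edges
        if score >= min_score and a != b
    }
    degree = Counter()
    for pair in kept:
        for gene in pair:
            degree[gene] += 1
    return degree


def hub_genes(edges, min_score=400, min_degree=10):
    """Return genes whose degree exceeds min_degree, highest degree first."""
    degree = node_degrees(edges, min_score)
    hubs = [gene for gene, d in degree.items() if d > min_degree]
    return sorted(hubs, key=lambda gene: (-degree[gene], gene))


# Toy edge list: (gene, gene, interaction score); all values are hypothetical.
edges = [
    ("A", "B", 900),
    ("A", "C", 450),
    ("B", "C", 300),  # dropped: score below 400
    ("A", "D", 400),
    ("B", "A", 950),  # same pair as A-B, counted once
]
print(node_degrees(edges))
print(hub_genes(edges, min_degree=2))
```

In practice the edge list would be exported from the interaction database, and module detection (MCODE in this study) would be run on the filtered network rather than on degree counts alone.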

Weighted Correlation Network Analysis (WGCNA)
To describe the correlation of DEG expression patterns and to screen for common co-expression modules across monosomies and trisomies, we used the R package WGCNA as previously described [21]. In brief, expression correlation coefficients of the DEGs were calculated, and an adjacency matrix was constructed from the correlation coefficients and a predefined soft-thresholding parameter (β). Then, gene expression modules with similar patterns were identified from the gene cluster dendrogram using the dynamic tree cut method. Finally, GO and KEGG pathway enrichment analyses were performed on the gene expression modules to characterize modules related to monosomies and trisomies.
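The adjacency step can be illustrated with a short Python sketch. It assumes an unsigned network, in which the adjacency between two genes is the absolute Pearson correlation raised to the power β; the text does not state which network type or β value was used, so the expression values, gene names and β = 2 below are purely illustrative, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, beta):
    """Unsigned adjacency: a_ij = |cor(x_i, x_j)| ** beta for every gene pair."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
    }


# Toy expression profiles across four samples (hypothetical values).
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with g1
    "g3": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with g1
    "g4": [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with g1
}
adjacency = soft_threshold_adjacency(expr, beta=2)
print(round(adjacency[("g1", "g4")], 2))
```

Raising the correlation to a power keeps strong correlations close to 1 while pushing weak ones toward 0, which is why the resulting network is weighted rather than cut at a hard threshold.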

Sample collection and validation of RNA-Seq data via qRT-PCR assay
Experiments with human samples were approved by the medical ethics committee of Xiangya Hospital of Central South University. The anonymized specimens were obtained from the prenatal diagnosis center of Xiangya Hospital, Central South University. The anonymized samples (gestational ages between 16 and 18 weeks) were collected from women undergoing genetic testing for routine clinical indications. Amniotic fluid cells were cultured at 37 °C in a 5% CO2 environment according to standard protocols, using tissue culture flasks and commercially available medium (AmnioMax, Invitrogen, CA).
Following routine diagnostic cytogenetic analysis, second-passage amniotic cell cultures, grown under the same conditions as the primary cultures, were collected for total RNA extraction. Two sample sets were included: the first (the normal sample set) comprised 3 normal euploid pregnancies, and the second (the trisomic sample set) comprised 47,XYY and trisomy 21 pregnancies.
Total RNA was extracted with TRIzol reagent (TRANS) according to the manufacturer's instructions. Real-time qPCR was then performed as previously described [22,23]. The primer sequences are listed in Table S1.

Statistical Analysis
All data are presented as mean ± SEM. Student's t-test was used to assess statistical significance, with the significance level set at P < 0.05.
These data suggest that trisomies and monosomies may share some common mechanisms, and that monosomies may additionally share other mechanisms among themselves.

Identification of the biological functions and signaling pathways of these DEGs by GO/KEGG pathway enrichment analysis
To explore the common mechanism among monosomies and trisomies, we first used GO and KEGG analyses to investigate the functional and signaling pathway enrichment of the DEGs. In the GO results, the top thirty GO terms with the most significant p values in the categories of cellular component, molecular function and biological process are shown in Fig. S2. In detail, in the biological process category, the DEGs of monosomies were significantly enriched in gene expression processes (such as mRNA splicing via spliceosome) and epigenetic inheritance (i.e., histone acetylation and protein deubiquitination), whereas the overlapping DEGs were significantly enriched in gene expression processes, such as mRNA splicing via spliceosome, mRNA processing, and gene expression.
Based on the KEGG analysis, the DEGs of monosomies were enriched in spliceosome, HTLV-I infection, and some pro-survival signalling pathways (i.e., the Notch and Wnt signaling pathways), etc. (Fig. 2a and Table S3). The overlapping DEGs, by contrast, were enriched in spliceosome, thyroid hormone signalling pathway, and pathogenic Escherichia coli infection, etc. (Fig. 2b and Table S4).

Identification of key candidate genes and pathways by DEGs protein-protein interaction network (PPIN)
Secondly, we used the STRING online database and Cytoscape software to further identify the potential hub genes and pathways of monosomies and trisomies. In monosomies, a DEG-related PPI network consisting of 1975 nodes and 19117 edges was constructed from 2527 DEGs and visualized using Cytoscape software (Fig. 3a). Degree > 10 was set as the cutoff criterion, and the genes with the most significant node degree were INS, DHX8, POLR2A, EHMT1, SMARCA4, EP3, CAD, MYC, and CTNNB1. Forty-four significant modules were identified in this PPI network using MCODE, and the most significant module included 82 nodes and 1741 edges (Fig. 3b). Biological functional enrichment analysis showed that genes in this module were markedly enriched in maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA), negative regulation of mRNA splicing, and RNA processing via spliceosome, etc. (Table 1).
In the overlapping DEGs, a PPI network consisting of 193 nodes and 773 edges was constructed from a total of 317 overlapping DEGs and visualized using Cytoscape software (Fig. 3c). The genes with the most significant node degree were SRSF4, HNRNPA2B1, DHX8, HNRNPU, HNRNPA1, HNRNPM, SRSF1, GTF2F1, DHX9, and MYC. One significant module, including 19 nodes and 171 edges, was identified in this PPI network using MCODE (Fig. 3d). Biological functional enrichment analysis showed that genes in this module were markedly enriched in spliceosome (Table 2).
First, we constructed co-expression modules and analyzed the association of gene expression modules through WGCNA. In the all-DEGs group, we filtered by p value and obtained 3 co-expressed gene modules (Fig. 4a). The modules contained 1267 (ME turquoise), 543 (ME blue), and 171 (ME brown) genes, respectively. Approximately 571 genes did not load onto any specific module (ME grey). In the overlapping DEGs group, we filtered by p value and obtained one co-expressed gene module containing 229 genes (ME turquoise). Eighty-eight genes did not load onto any specific module (ME grey) (Fig. 4b).
Then, we identified the biological functions of the positive modules through GO and KEGG analysis. In the KEGG analysis of the all-DEGs group, DEGs in the turquoise module were significantly enriched in spliceosome, RNA transport, and the Notch signaling pathway, etc. (Fig. 5a). DEGs in the blue module were significantly enriched in pathogenic Escherichia coli infection, aminoacyl-tRNA biosynthesis, and DNA replication, etc. (Fig. 5b). DEGs in the brown module were significantly enriched in the endocytosis signalling pathway (Fig. 5c). In the overlapping DEGs group, DEGs in the turquoise module were significantly enriched in spliceosome, thyroid hormone signaling pathway, and pathogenic Escherichia coli infection, etc. (Fig. 5d).

Validation Of The DEGs In Some Trisomies
To confirm the validity of the identified DEGs, we analyzed the expression levels of 10 genes, randomly selected from the top-centrality hub genes, in 47, XXY and trisomy 21 samples. Melting-curve analysis of the qPCR showed a single product for all genes. Comparing the gene expression levels from qPCR with the DGE analysis results, 60-70% of the genes showed a consistent direction in both the DGE library and the qPCR analysis (Table 3 and Fig. 6).

Discussion
This study is, to the best of our knowledge, the first to focus on the common mechanism among all monosomies and trisomies. The topic is clinically significant: aneuploidy has a prevalence of about 35% in abortions and is the leading cause of birth defects [24,25].
We observed that most DEGs of trisomies were the same as those of monosomies. We identified specific common pathways shared by trisomies and monosomies, and we also found that epigenetic pathways were important in monosomies. To the best of our knowledge, this is the first study to use DEGs to identify potential common mechanisms in all trisomies and monosomies.
Thus, one of the main strengths of our work is that we performed a comprehensive analysis of all 24 chromosomes (every monosomy and trisomy) to identify potential common mechanisms in all trisomies and monosomies, although increasing evidence has already highlighted common mechanisms among chromosomal anomalies [7,13,26,27,28].
We found that spliceosome-related genes and signaling pathways were involved in all monosomies and trisomies. The article describing the original data noted significant changes in the spliceosome in inviable blastocysts, and our findings do not contradict it. Aneuploidy is a near-universal characteristic of human cancers [29], and a large number of studies have shown that spliceosome-related genes and signalling are disrupted in cancers [30,31,32], which is consistent with our finding that spliceosome-related genes and signaling pathways are involved in all monosomies and trisomies.
We found that thyroid hormone-related genes and signaling pathways were involved in all monosomies and trisomies. The article describing the original data did not note this. However, previous clinical studies have found that some aneuploidies are associated with thyroid dysfunction: trisomy 21 patients have a high prevalence of thyroid dysfunction [33,34,35]; monosomy 45,X patients are more prone to autoimmune thyroid disease [36]; a partial trisomy 13 patient had autoimmune thyroiditis [37]; and a partial monosomy 13 patient had thyroid dysplasia [38]. These reports are consistent with our study.
We also found that infection-related genes and signaling pathways were involved in all monosomies and trisomies. The article describing the original data did not note this either. However, a few recent studies found that inflammatory cytokines such as IFN, IL-6 and TNF were increased in trisomy 21 [39], which supports our findings.
Surprisingly, we also found that ubiquitination and histone acetylation were involved in all monosomies.
For ubiquitination, a few studies have reported that ubiquitination is involved in trisomy 18 and trisomy 21 [7,40], which is not consistent with our results. No similar studies have investigated this pathway in monosomies, so it is difficult to compare our results with other studies. Our findings may be explained as follows: normal ontogenesis depends on highly coordinated cell-fate decisions by stem cells, and numerous studies have found that ubiquitination sets the level of transcription factors (such as the multipotency markers Oct-4 and Nanog) to regulate pluripotency, or controls relevant signals (such as the pro-survival signalling pathways Notch, Wnt/β-catenin, and insulin growth factor) to specify a particular differentiation in stem cells [41]. This is consistent with the clinical observation that trisomies are seen more often than monosomies, and also with the result that Notch, Wnt/β-catenin, and insulin growth factor were enriched in the DEGs of monosomies (Fig. 3a). For histone acetylation, studies have found that the acetylation level is elevated in model organisms of monosomy 5 and monosomy 45,X [42,43], which provides supporting evidence for our findings.
There are some limitations of this study that should be acknowledged. First, the authors of the original data stated that some of the data came from chimeric samples, and that part of the transcriptome profile reflected the chimeric conditions. Nevertheless, the normal samples used in our analysis were not derived from chimeric samples, and for the abnormal samples, we presume that the impact of an extra chromosome on differential gene expression greatly outweighs the effects of the chimeric conditions. Second, we have only verified the hub genes; the pathways described above lack verification. In future work, we will verify these pathways in aneuploidy.

Conclusions
We systematically analyzed the common mechanisms in all monosomies and trisomies. We found that monosomies and trisomies share spliceosome, thyroid hormone and infection-related signaling pathways, whereas monosomies also involve epigenetic-related signalling pathways. These findings deepen our understanding of trisomies and monosomies, and the genes and pathways identified might be used as diagnostic and therapeutic molecular biomarkers for monosomies and trisomies.

Declarations
Ethics approval and consent to participate
Experiments with human samples were approved by the medical ethics committee of Xiangya Hospital of Central South University. The human samples were obtained following written informed consent from the participants' legal representatives.

Consent for publication
Not applicable.

Availability of data and materials
Not applicable.

Competing interests
None.

Figure: Venn diagram of the overlapping DEGs between the monosomies and trisomies. The intersecting areas indicate the common DEGs. mono: monosomies; tri: trisomies.

Figure: Heatmap of modules associated with the overlapping DEGs group. The color gradient from blue to red indicates correlation coefficients from -1 to 1.