The differential expression and regulatory networks of ceRNAs in umbilical cord blood sample of chromosome 22q11.2 deletion syndrome

Background: Chromosome 22q11.2 deletion (CH22qD) syndrome is the most common human deletion syndrome. Competing endogenous RNAs (ceRNAs) have miRNA binding sites that are capable of competitively binding miRNAs and inhibiting miRNA regulation of target genes. Results: We identied differently expressed miRNAs, circRNAs, lncRNAs and mRNAs of CH22qD, and we analysed the results by using GO analysis, KEGG pathway analysis and network regulation analysis. Conclusions: These analyses may predict the effects of chromosomal microdeletions.


Background
Chromosome 22q11.2 deletion (CH22qD) syndrome is the most common human deletion syndrome.
Patients with 22q11.2 deletion syndrome can suffer from congenital heart diseases, palatal abnormalities, learning di culties, immune de ciency, characteristic facial features, and hypocalcaemia [1]. Competing endogenous RNAs (ceRNAs) have miRNA binding sites that are capable of competitively binding miRNAs and inhibiting miRNA regulation of target genes. According to the ceRNA theory, mRNAs, pseudogenes, lncRNAs, circRNAs, etc. may compete for binding with miRNAs through miRNA response elements (MREs), thereby inhibiting the negative regulation of miRNAs on the target mRNAs. The regulation by ceRNAs provides a new perspective on how to construct gene expression regulatory networks. Marcella Cessna et al [2] demonstrated that linc-MD1 controls the timing of the differentiation of human myoblasts and that its levels are strongly reduced in Duchenne muscle cells.
The authors concluded that the ceRNA network involving linc-MD1 plays an important role in muscle differentiation. Florian A. Karreth et al [3] showed that Braf-rs1 and its human orthologue, BRAFP1, exert their oncogenic activity, at least in part, by acting as competitive endogenous RNAs (ceRNAs) that increase BRAF expression and MAPK activation in vitro and in vivo. The authors found that transcriptional or genomic aberrations of BRAFP1 occur frequently in multiple human cancers, including B cell lymphomas. In this study, we will explore the ceRNA network in chromosome microdeletions.

Methods
Umbilical cord blood samples were obtained starting from April 2017 from two females pregnant with a CH22qD foetus (age, 29) and a healthy foetus (age, 32) at the Shenzhen People's Hospital (Shenzhen, China). Diagnosis of CH22qD was performed by chromosome microarray. This study was undertaken with the approval of the Institutional Ethical Board of Shenzhen People's Hospital (Shenzhen, China), and written informed consent was provided by all subjects.
A total amount of 5 μg RNA per sample was used as the input material for RNA sample preparation. First, ribosomal RNAs were depleted to obtain rRNA-depleted RNAs. rRNA-depleted RNAs were further treated with RNase R (Epicentre, USA) and then subjected to TRIzol extraction. Subsequently, sequencing libraries were generated using rRNA-depleted and RNase R-digested RNA.
After cluster generation, the library preparations were sequenced on an Illumina HiSeq 2500 platform, and 125 bp paired-end reads were generated.
Paired-end clean reads were aligned to the reference genome.
Unmapped reads were kept, and 20-mers from the 5' and 3' ends of the reads were extracted and aligned independently to reference sequences. Then, back-spliced reads with at least two supporting reads were annotated as lncRNAs circRNAs. miRNA binding sites were predicted by psRobot_tar in psRobot [4].
The threshold for the corrected p-value for differential expression was set to 0.05.
The differential expression of the two samples was evaluated by using DEGseq (version 1.20.0) (Wang et al, 2010). The p-value was adjusted by q-value statistics [5], and q-value < 0.01 and |log2(foldchange)| > 1 were the default threshold settings to determine differential expression.
Gene Ontology (GO) enrichment analysis for the source genes of differentially expressed lncRNAs and circRNAs was performed by GOseq (version 1.18.0). KEGG (Kanehisa et al, 2008) is a database for understanding high-level functions and utilities of a biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially from large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (http://www.genome.jp/kegg/). KOBAS [6] was used for KEGG pathway enrichment analysis.

Results
The expression of circRNAs and miRNAs in the cord blood of the CH22qD group were pro led using highthroughput sequencing. Using a threshold of log2 fold change ≥1.0, log2 fold change ≤-1.0 and a pvalue < 0.05, , 2267 lncRNAs and 14502 circRNAs were signi cantly increased in the CH22qD group compared to the control group (Table 1A,1B). lncRNA and circRNAbiological pathway analysis, based on the KEGG database (http://www.genome.jp/), and by GO analysis.
GO annotation and KEGG pathway analysis of differentially expressed lncRNAs and circRNAs lncRNAs are usually coordinately transcribed with their associated mRNAs, the functions of lncRNAs could be mirrored through their associated mRNAs by cis-regulation and trans-regulation [19]. Therefore, the functions of lncRNAs were predicted based on the GO and KEGG pathway annotations of their target genes [20].
GO and KEGG pathway analyses were used for the functional annotation of differentially expressed lncRNAs and circRNAs. The GO analysis contains 3 ontologies: biological process (BP), cell composition (CC) and molecular function (MF). The GO analysis indicated that the most enriched lncRNAs correlated with the metabolic process of the biological process ontology (Figure 1). Moreover, the majority of lncRNAs were related to intracellular functions in the cellular component ontology and were associated with regulation of molecular function in the molecular function ontology (Figure 1). GO analysis indicated that the most enriched circRNAs correlated with the metabolic process of the biological process ontology ( Figure 2). In addition, the majority of circRNAs were related to intracellular functions in the cellular component ontology and were associated with regulation of binding in the molecular function ontology ( Figure 2). Most of GO terms and KEGG pathways for lncRNA hosting genes in CH22qD were mainly involved in metabolic process, such as metabolic process, organic substance metabolic process, cellular metabolic process, primary metabolic process, macromolecule metabolic process and metabolic pathways (Figure 1). Most of GO terms and KEGG pathways for circRNAs hosting genes in CH22qD were mainly involved in metabolic process, such as metabolic process, organic substance metabolic process, cellular metabolic process, primary metabolic process, macromolecule metabolic process, protein processing in endoplasmic reticulum, Protein processing in endoplasmic reticulum and Ubiquitin mediated proteolysis (Figure 2).
The GO and KEGG pathway analyses revealed different functions for the CH22qD lncRNAs and circRNAs that were associated with metabolic. We have listed the top 20 pathways of the KEGG pathway analysis of differentially expressed lncRNAs and circRNAs (Figure 1.2). KEGG enrichment distribution map of candidate target gene is a graphical representation of the results of KEGG enrichment analysis. In this gure, KEGG enrichment was measured by Rich factor, Qvalue and the number of genes enriched into this pathway. Rich factor refers to the ratio of the number of genes in the pathway entry in differentially expressed genes to the total number of genes in the pathway entry in all annotated genes. The larger the Rich factor is, the greater the degree of enrichment will be. Qvalue is the Pvalue after multiple hypothesis testing and correction. The value range of Qvalue is [0,1]. The closer to zero, the more signi cant the enrichment. The metabolic map of the pathway enriched by the most candidate target genes is shown as follows (Figure 3), in which the small boxes represent proteins and the small red boxes represent proteins corresponding to the candidate target genes.

Construction of ceRNA network
According to ceRNA hypothesis, competing endogenous RNAs (ceRNAs) members can compete for the same MREs to regulate each other. We constructed a ceRNA network by the mRNAs, lncRNAs, circRNAs, and miRNAs from our data.
A network was constructed based on the differentially expressed miRNAs, circRNAs, lncRNAs and mRNAs. In the lncRNA-miRNA network, green represents miRNA, and red represents lncRNA ( Figure 4A), in the circRNA-miRNA network, green represents miRNA, and red represents circRNA ( Figure 4B). Some lncRNAs and circRNAs sharing a common binding site of MRE. For instance, LNC_000828 and hg38_circ_0026798 were predicted to be ceRNAs of the miRNA hsa-let-7b-5p, which targets the ENSG00000137309, ENSG00000187838 and ENSG00000077157. These lncRNAs and circRNAs were also implicated in a number of biological processes, including metabolic process, primary metabolic process and organic substance metabolic process. ceRNA regulatory networks, which include mRNAs, miRNAs, lncRNAs, and circRNAs, might act a pivotal part in metabolic process. From the network, we found that hsa-let-7b-5p was associated with several lncRNA and circRNA (Figure 4). The important role of hsa-let-7b-5p in the DiGeorge patients with 22q11.2 deletion has been reported. There are articles showing that down-regulation of hsa-let-7b-5p inhibited cell cycle progression and anchorageindependent growth of melanoma cells [18].

Discussion
Chromosome 22q11.2 deletion syndrome is the most common human deletion syndrome.But we don't know enough about it. so that it is di cult to estimate the risks that a fetus will face when it is born. the coding and non-coding RNAs such as lncRNAs and circRNAs compete with each other to sponge their target miRNAs.
In recent years, the functional signi cance of miRNAs, lncRNAs and circRNAs has been reported. miRNAs are a family of small ncRNAs that negatively regulate gene expression by interacting with the 3'UTR of a target mRNA [8]. Some studies have been published on the relationship between miRNAs and CH22qD. The DiGeorge patients with 22q11.2 deletion were shown to have upregulated miRNA (miR-194-5p) and downregulated miRNAs (miR-15b-3p, miR-185-5p, and let-7b-5p) [7].
We found that hsa-let-7b-5p was associated with several lncRNA and circRNA (Figure 3), which targets the HMGA1, and HMGA1 encodes a chromatin-associated protein involved in the regulation of gene transcription, integration of retroviruses into chromosomes, and the metastatic progression of cancer cells. The encoded protein preferentially binds to the minor groove of AT-rich regions in double-stranded DNA. Multiple transcript variants encoding different isoforms have been found for this gene.
Pseudogenes of this gene have been identi ed on multiple chromosomes. The gene was cancer-related genes, disease related genes and transcription factors [21].
Overexpression of let-7b-5p in melanoma cells in vitro downregulated the expression of cyclins D1, D3, and A, and cyclin-dependent kinase (Cdk) 4. Let-7b-5p inhibited cell cycle progression and anchorageindependent growth of melanoma cells [18]. Perhaps hsa-let-7b-5p plays an important regulatory role in the development of the fetus. hsa-let-7b-5p is down-regulated in our con rmatory experiment.
DLEU2 negatively regulates the G1 cyclins E1 and D1 through the miR-15a/miR-16-1 cluster, and these oncoproteins are subjected to miR-15a/miR-16-1-mediated repression under normal conditions [9]. The lncRNA MALAT1 is a biomarker for lung cancer metastasis, and is a regulator of gene expression governing hallmarks of lung cancer metastasis [10]. Xist encodes an RNA molecule that plays critical roles in the choice of which X chromosome remains active, and in the initial spread and establishment of silencing of the inactive X chromosome [11]. A decrease in APTR is necessary for the induction of p21 after heat stress and DNA damage by doxorubicin, and the levels of APTR and p21 are inversely correlated in human glioblastomas [12].
In our study, miR-15b-3p was associated with ENST00000413077.1, and ENST00000413077.1 was associated with AAMP. data previous study demonstrated the critical role of AAMP in angiogenesis and suggested that blocking AAMP could serve as a potential therapeutic strategy for angiogenesis-related diseases [13]. miR-15b-3p was associated with LINC00278, HCG18 and SNHG17. LINC00278 plays a role in colorectal cancer [14]. HCG18 is highly expressed in hepatocarcinoma cells and can promote the proliferation and invasion of these cells [15]. SNHG17 knockdown signi cantly inhibited the proliferation of CRC cells, and induced cell cycle G1/G0 phase arrest and cell apoptosis. Consistent with these ndings, SNHG17 silencing inhibited tumour growth in vivo [16].
miR-185-5p is a microRNA (miR) that targets Bruton's tyrosine kinase in B cells. This miR is haploinsu cient in 90-95% of individuals with chromosome 22q11.2 deletion syndrome, which are patients who can present with immune, cardiac, and parathyroid problems, learning disorders, and a high incidence of schizophrenia in adults [17].

Conclusions
In conclusion, we identi ed differently expressed miRNAs, circRNAs, lncRNAs and mRNAs of CH22qD, and we analysed the results by using GO analysis, KEGG pathway analysis and network regulation analysis, These analyses may predict the effects of chromosomal microdeletions.

Declarations
Ethics approval and consent to participate This study was undertaken with the approval of the Institutional Ethical Board of Shenzhen People's Hospital (Shenzhen, China).

Not applicable
Availability of data and materials The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests. Funding