Construction of a Competitive Endogenous RNA Network Associated with Angiogenesis in Ischemic Stroke Using a Bioinformatics Approach

Ischemic stroke (IS) is one of the leading causes of death and disability worldwide, and angiogenesis is an important target for its treatment. However, the role of competing endogenous RNA (ceRNA) in angiogenesis in IS remains poorly understood. This study aims to explore the role of ceRNA in angiogenesis in IS and to provide possible targets for its treatment. First, GSE22255 (mRNA), GSE55937 (miRNA) and GSE102541 (lncRNA) were downloaded from the Gene Expression Omnibus (GEO) database. Then, a total of 21 mRNA modules were identified by WGCNA, among which NR4A1, PTGS2, EGR3, and VEGFA in the cyan module were identified as key genes for angiogenesis. Subsequently, 1454 differentially expressed lncRNAs (DELs) were screened and a lncRNA-mRNA co-expression network consisting of 40 lncRNAs and 4 mRNAs was constructed by correlation analysis. Then, 16 differentially expressed miRNAs (DEMs) were screened and an online database was used to predict the interactions among miRNAs, lncRNAs and mRNAs. The angiogenesis-related ceRNA network was finally constructed based on ceRNA theory, in which 1 DEL was predicted to act as a ceRNA for 2 DEMs to regulate 4 hub genes, specifically the HCG18-hsa-let-7i-5p-NR4A1/PTGS2/EGR3 and HCG18-hsa-miR-148a-3p-PTGS2/EGR3/VEGFA interaction axes. The results of gene set enrichment analysis (GSEA) suggest that HCG18 may regulate angiogenesis through the NF-κB-TNFA signaling pathway, hypoxia and other pathways. In conclusion, the above genes may be new biomarkers and potential targets for the treatment of IS.


Introduction
Stroke is an acute cerebrovascular disease. According to statistics, 16.9 million people suffer a stroke every year (Boldsen et al. 2018; Katan and Luft 2018), and the incidence is increasing year by year (Wu et al. 2019). The major subtype of stroke is ischemic stroke (IS), which accounts for 85% of all strokes.
Ischemic stroke is localized ischemic necrosis or softening of brain tissue caused by impaired blood supply, ischemia and hypoxia of the brain (Xu et al. 2016). Studies show that angiogenesis plays an important role in neovascularization and neurological recovery after stroke.
Therefore, further understanding of the molecular mechanisms of angiogenesis in ischemic stroke may provide new directions for clinical prevention and treatment.
Long non-coding RNAs (lncRNAs) are endogenous non-coding RNAs more than 200 nucleotides in length (Takahashi et al. 2020). LncRNAs regulate gene expression by affecting epigenetics, transcription and translation. Previous studies show that lncRNAs play an important role in regulating endothelial cell survival, vascular integrity and angiogenesis. For instance, MALAT1 has been reported to regulate angiogenesis through the 15-LOX1/STAT3 signaling pathway. Zhan et al. (2017) demonstrated that MEG3 expression is upregulated in the OGD/R model and reduces ROS by mediating the p53/NOX4 axis, thereby promoting angiogenesis. LncRNAs can also participate in signal transduction pathways to regulate angiogenesis; for example, HOTTIP regulates endothelial cell proliferation and migration by activating the Wnt/β-catenin pathway (Liao et al. 2018).
LncRNAs can also act as competing endogenous RNAs (ceRNAs) for miRNAs, regulating the expression of target mRNAs and thus influencing the occurrence and progression of disease (Li et al. 2021).
However, little is known about the mechanisms of ceRNAs in ischemic stroke angiogenesis. Therefore, studies of lncRNA-associated ceRNA networks could help to reveal the mechanisms of angiogenesis in ischemic stroke. In this study, mRNA, lncRNA and miRNA expression profiles of peripheral blood mononuclear cells (PBMC) from ischemic stroke patients and healthy controls were obtained from the Gene Expression Omnibus (GEO) database and used to construct an angiogenesis-related ceRNA network, with the aim of providing potential directions for the diagnosis and treatment of ischemic stroke.

Data Acquisition and Preprocessing
The Gene Expression Omnibus (GEO) database is a public genomics database created by NCBI, which hosts microarray, second-generation sequencing and other high-throughput sequencing data (Barrett et al. 2013).
The mRNA dataset GSE22255 (Krug et al. 2012), microRNA dataset GSE55937 (Jickling et al. 2014) and lncRNA dataset GSE102541 were downloaded from GEO (http://www.ncbi.nlm.nih.gov/geo/) through the "GEOquery" package in R software. Dataset GSE22255 included 20 blood samples from patients with ischemic stroke and 20 blood samples from healthy controls. Dataset GSE55937 included 24 blood samples from ischemic stroke patients and 24 blood samples from healthy controls. Dataset GSE102541 included 6 blood samples from patients with ischemic stroke and 3 blood samples from healthy controls.
GSE22255 is based on the GPL570 platform ([HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array). We downloaded a GTF file from the UCSC Table Browser (ucsc.edu) to reannotate the GPL570 probes and obtained expression profile data for 23520 genes. GSE102541 is based on the GPL22755 platform (Agilent-076500 Human lncRNA+mRNA array, Probe name version), and the probe annotation information was obtained from the annotation file of that platform. Annotating the expression data yielded expression profiles for 10090 lncRNA genes. Background correction, normalization and expression quantification were performed on all datasets using the "affy" package.

WGCNA Analysis and Screening of IS-Correlated Modules
The "WGCNA" package in R software was used to construct the co-expression network of IS based on the dataset GSE22255 (Langfelder and Horvath 2008). We selected the 10000 genes whose expression levels varied most across samples to construct the co-expression network. First, the soft threshold of the network was determined. Soft thresholding transforms each entry of the adjacency matrix into a continuous value between 0 and 1, so that the constructed network conforms to a power-law distribution and is closer to the state of real biological networks. Next, a scale-free network was constructed using the blockwiseModules function, and module analysis was performed to identify co-expression modules and group genes with similar expression patterns. Modules were defined by cutting the cluster tree into branches using the dynamic tree cutting algorithm and were assigned different colors for visualization. Then, the module eigengene (ME) was calculated for each module.
Subsequently, the correlation between the ME of each module and IS was calculated. Finally, the gene significance (GS) of the genes in the modules was calculated and the Metascape (http://metascape.org/) online annotation tool was used for functional enrichment analysis of the genes in the significant modules.
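The soft-thresholding step described above can be illustrated with a minimal Python sketch. It is not the authors' R code: it assumes an unsigned network, in which the adjacency of two genes is the absolute Pearson correlation of their expression profiles raised to the power β, and the gene names and expression values in `toy_expr` are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.
    Assumes neither sequence is constant (non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta.
    expr maps a gene name to its expression values across samples."""
    genes = sorted(expr)
    adjacency = {}
    for g in genes:
        for h in genes:
            if g == h:
                adjacency[(g, h)] = 1.0
            else:
                adjacency[(g, h)] = abs(pearson(expr[g], expr[h])) ** beta
    return adjacency

def connectivity(adjacency, gene):
    """Sum of a gene's adjacencies to all other genes (self excluded)."""
    return sum(a for (g, h), a in adjacency.items() if g == gene and h != gene)

# Hypothetical expression profiles over four samples.
toy_expr = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],   # perfectly correlated with geneA
    "geneC": [1, 3, 2, 4],   # correlation 0.8 with geneA
    "geneD": [4, 1, 3, 2],   # correlation -0.4 with geneA
}
adj = soft_threshold_adjacency(toy_expr, beta=8)
k_geneA = connectivity(adj, "geneA")
```

Raising the correlations to a power such as 8 keeps strong correlations close to 1 while pushing weak ones toward 0, which is why a larger β produces a sparser network that resembles a scale-free one.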

Identification of Hub Genes
The set of angiogenesis-related genes was obtained from the gene set enrichment analysis (GSEA) database (http://www.gsea-msigdb.org/gsea/index.jsp) and intersected with the set of genes in the key module, and the overlapping genes were used as candidate genes for angiogenesis. Candidate genes were compared between female ischemic stroke patients and healthy controls and between male ischemic stroke patients and healthy controls, respectively, using the t-test, and those with differential expression in both female and male ischemic stroke patients were used as hub genes for subsequent analysis.
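The intersection step can be sketched in a few lines of Python. This is only an illustration: the five overlapping genes are the ones reported in this study, while the other entries in both lists (prefixed `HYPOTHETICAL_`) are placeholders and do not reflect the real contents of the cyan module or of the angiogenesis gene set.

```python
def candidate_genes(module_genes, angiogenesis_genes):
    """Return the genes present in both collections, sorted by name."""
    return sorted(set(module_genes) & set(angiogenesis_genes))

# Placeholder inputs; only the five shared gene symbols come from the study.
cyan_module = ["EGR3", "KLF4", "NR4A1", "PTGS2", "VEGFA",
               "HYPOTHETICAL_MODULE_GENE_1", "HYPOTHETICAL_MODULE_GENE_2"]
angiogenesis_set = ["VEGFA", "PTGS2", "NR4A1", "KLF4", "EGR3",
                    "HYPOTHETICAL_ANGIO_GENE_1"]

candidates = candidate_genes(cyan_module, angiogenesis_set)
```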

Screening DELs and Construction of lncRNA-mRNA Network
DELs between IS patients and controls in dataset GSE102541 were screened using the "limma" package in R software (Ritchie et al. 2015), with |log2 FC|>1 and P<0.05 as thresholds. Heat maps and volcano plots were generated using the "pheatmap" package (Eisen et al. 1998) and "ggplot2" (Gómez-Rubio 2017) in R software. Next, we used the cor function in R to calculate Spearman correlation coefficients between aberrantly expressed lncRNAs and angiogenesis-related genes in ischemic stroke patients, constructed lncRNA-mRNA co-expression networks with |r|>0.5 and P<0.05, and visualized them using Cytoscape 3.7.2 (Shannon et al. 2003). Finally, the Wilcoxon test was used to compare the expression of all lncRNAs in the lncRNA-mRNA network between the sample groups.
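As a rough illustration of the two filters described above, the following Python sketch applies the fold-change and P-value thresholds to a single gene and computes a Spearman correlation as the Pearson correlation of average ranks. It is a sketch under stated assumptions, not the limma or R `cor` implementation: the function names are hypothetical, and the P values are assumed to be supplied by an upstream statistical test rather than computed here.

```python
import math

def is_differential(log2_fc, p_value, fc_cutoff=1.0, alpha=0.05):
    """Screening rule from the text: |log2 FC| > cutoff and P < alpha."""
    return abs(log2_fc) > fc_cutoff and p_value < alpha

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.
    Assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

def keep_edge(r, p_value, r_cutoff=0.5, alpha=0.05):
    """Edge rule from the text: |r| > cutoff and P < alpha."""
    return abs(r) > r_cutoff and p_value < alpha

# Invented expression values for one lncRNA and one mRNA over four samples.
rho = spearman([1, 2, 3, 4], [10, 20, 40, 30])
```

Because Spearman correlation works on ranks, it captures monotonic relationships and is less sensitive to outlying expression values than Pearson correlation.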

Screening DEMs and Construction of lncRNA-miRNA-mRNA Network
DEMs between IS patients and controls in dataset GSE55937 were screened using the "limma" package in R software, with |log2 FC|>0.5 and P<0.05 as thresholds. Heat maps and volcano plots were generated using the "pheatmap" package and "ggplot2" in R software. Based on the theory that lncRNAs can sponge miRNAs to further regulate mRNAs, we constructed ceRNA networks. We selected lncRNAs and mRNAs with positive regulatory relationships in the angiogenesis-related lncRNA-mRNA regulatory network. The miRNAs targeting angiogenesis-associated lncRNAs were predicted using the miRNA-lncRNA module in StarBase 3.0 (http://starbase.sysu.edu.cn/), and the miRNAs targeting angiogenesis-associated mRNAs were predicted using the miRNA-mRNA module. Finally, the upregulated miRNAs were intersected with miRNAs targeting downregulated lncRNAs, and the downregulated miRNAs were intersected with miRNAs targeting upregulated lncRNAs. On this basis, the angiogenesis-related lncRNA-miRNA-mRNA regulatory network was constructed and visualized using Cytoscape 3.7.2.
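The selection logic above can be sketched as follows in Python. This is an illustrative reading of the procedure, not the authors' code: a lncRNA-miRNA-mRNA triple is kept when the miRNA is predicted to target both the lncRNA and the mRNA and is differentially expressed in the opposite direction to the lncRNA. The prediction dictionaries stand in for StarBase output, the entry `hsa-miR-placeholder` is hypothetical, and the remaining names mirror the axes reported in this study.

```python
def build_cerna_triples(pairs, lnc_targets, mrna_targets,
                        lnc_direction, mirna_direction):
    """Assemble (lncRNA, miRNA, mRNA) triples.

    pairs           -- positively related (lncRNA, mRNA) pairs
    lnc_targets     -- lncRNA -> set of miRNAs predicted to bind it
    mrna_targets    -- mRNA -> set of miRNAs predicted to bind it
    lnc_direction   -- lncRNA -> "up" or "down"
    mirna_direction -- differentially expressed miRNA -> "up" or "down"
    """
    triples = []
    for lnc, mrna in pairs:
        shared = lnc_targets.get(lnc, set()) & mrna_targets.get(mrna, set())
        for mirna in sorted(shared):
            lnc_dir = lnc_direction.get(lnc)
            mir_dir = mirna_direction.get(mirna)
            # Keep only DE miRNAs that change opposite to the lncRNA.
            if lnc_dir is not None and mir_dir is not None and lnc_dir != mir_dir:
                triples.append((lnc, mirna, mrna))
    return triples

LET7I, MIR148A = "hsa-let-7i-5p", "hsa-miR-148a-3p"
PLACEHOLDER = "hsa-miR-placeholder"  # hypothetical, not differentially expressed

pairs = [("HCG18", gene) for gene in ("EGR3", "NR4A1", "PTGS2", "VEGFA")]
lnc_targets = {"HCG18": {LET7I, MIR148A, PLACEHOLDER}}
mrna_targets = {
    "NR4A1": {LET7I},
    "PTGS2": {LET7I, MIR148A, PLACEHOLDER},
    "EGR3": {LET7I, MIR148A},
    "VEGFA": {MIR148A},
}
lnc_direction = {"HCG18": "up"}
mirna_direction = {LET7I: "down", MIR148A: "down"}

triples = build_cerna_triples(pairs, lnc_targets, mrna_targets,
                              lnc_direction, mirna_direction)
```

The placeholder miRNA is dropped even though it is predicted to bind both HCG18 and PTGS2, because it is absent from the differentially expressed miRNA table.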

Gene Set Enrichment Analysis
To further explore the function of lncRNA HCG18 in the lncRNA-miRNA-mRNA regulatory network constructed in this study, we performed gene set enrichment analysis (GSEA) on each of the network mRNAs separately using the clusterProfiler package. The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive gene set databases for performing gene set enrichment analysis (Liberzon et al. 2015).

Construction of Co-expression Networks and Identification of Key Module
First, hierarchical clustering was used to check for outlier samples, and no samples were removed (Fig. 1a). With the soft threshold power set to 8 and a scale-free fit index (R²) cutoff of 0.8, the co-expression network conformed to a power-law distribution and the mean connectivity did not drop to its minimum (Fig. 1b). Therefore, we chose β=8 as a suitable soft threshold parameter to construct a scale-free network, which identified a total of 21 modules (Fig. 1c). Next, we analyzed the correlation of the modules with IS and the gene significance of the genes in the modules. The cyan module (r=0.38, P=0.02) was the most strongly correlated with IS (Fig. 1d), and its gene significance was the highest (Fig. 1e). The results of functional enrichment analysis suggest that genes in the cyan module may be involved in the regulation of blood vessel development and the regulation of smooth muscle cell proliferation (Fig. 1f). Existing studies suggest that angiogenesis-related genes contribute to angiogenic remodeling and neurological recovery after IS (Sui et al. 2020; Zhang et al. 2018). Therefore, we speculate that angiogenesis-related genes in the cyan module may play an important role in the development of IS.

Hub Gene Identification
The set of angiogenesis-related genes obtained from the GSEA database was intersected with the set of genes in the cyan module, and five angiogenesis-related genes closely related to IS were identified (EGR3, KLF4, NR4A1, PTGS2 and VEGFA) (Fig. 2a). The expression of these genes was examined in the dataset GSE22255 (Fig. 2b). The expression levels of EGR3 and PTGS2 in PBMC of female IS patients were significantly lower than those of healthy controls, whereas the expression levels of EGR3, NR4A1, PTGS2, and VEGFA in PBMC of male IS patients were significantly higher than those of healthy controls (Fig. 2c).
KLF4 was the only gene that was not differentially expressed in either female or male IS patients. Therefore, we selected EGR3, NR4A1, PTGS2, and VEGFA as key genes for subsequent analysis.

DEMs Screening Results and lncRNA-miRNA-mRNA Network
A total of 16 differentially expressed miRNAs, including 9 significantly upregulated miRNAs and 7 significantly downregulated miRNAs, were screened according to the previous threshold criteria. The differentially expressed miRNAs in ischemic stroke patients are shown as heat maps and volcano plots (Fig. 4a, b). Based on the previous predictions and validation, we found that only hsa-let-7i-5p and hsa-miR-148a-3p, which were significantly downregulated in patients with ischemic stroke, met the screening criteria (Fig. 4c). Finally, we constructed an angiogenesis-related lncRNA-miRNA-mRNA regulatory network. This regulatory network suggests that in ischemic stroke patients, lncRNA HCG18 may regulate the expression of NR4A1, PTGS2 and EGR3 through hsa-let-7i-5p; it may also regulate the expression of PTGS2, EGR3 and VEGFA through hsa-miR-148a-3p (Fig. 4d).

Discussion
Angiogenesis is an important part of cerebral ischemia repair, and promoting angiogenesis is considered a promising strategy for the treatment of ischemic stroke. To our knowledge, the present study is the first to apply WGCNA to construct a ceRNA network for IS and the first to link the ceRNA network to angiogenesis. WGCNA raises the correlation coefficients of gene expression values to the Nth power, which makes the distribution of the correlation coefficients more consistent with a scale-free network and with biological rules (Kakati et al. 2019). In this study, NR4A1, PTGS2, EGR3 and VEGFA were identified as hub genes for IS angiogenesis by screening and enrichment analysis of the modules. Subsequently, by constructing a ceRNA network, we suggest that the upregulated lncRNA HCG18 may be critical for angiogenesis, as it may act as a ceRNA to downregulate the expression of hsa-let-7i-5p and hsa-miR-148a-3p, leading to the upregulation of NR4A1, PTGS2, EGR3, and VEGFA. GSEA suggests that PTGS2 and EGR3 may affect angiogenesis by influencing the NFKB-TNFA signaling pathway, that NR4A1 is involved in angiogenesis through hypoxia and oxidative phosphorylation, and that VEGFA may be associated with DNA repair and apoptosis.
NR4A1 (also known as TR3, Nur77, NGFI-B, TIS1 and NAK-1), nuclear receptor subfamily 4 group A member 1, is involved in a variety of biological processes including apoptosis, proliferation, inflammation and metabolism (Crean and Murphy 2021; Nie et al. 2016). Therefore, NR4A1 overexpression may have the potential to promote IS angiogenesis. This is supported by several previous studies. Using a middle cerebral artery occlusion (MCAO) model, Ling et al. (2020) found that NR4A1 expression was downregulated, but a miR-224-5p inhibitor ameliorated OGD-induced neuronal apoptosis by targeting the 3'-UTR of NR4A1. In addition, Nur77 overexpression in mouse endothelial cells upregulates integrin β4 to promote vascular neogenesis (Bourbon et al. 2015). The present study suggests that hsa-let-7i-5p may regulate NR4A1, while HCG18 may interact with hsa-let-7i-5p as a ceRNA. To our knowledge, no studies have explored the role of HCG18 in IS, but its role in other diseases may indirectly explain its role in IS. Notably, Zou et al. (2020) demonstrated that lncRNA HCG18 promotes the proliferation and migration of hepatocellular carcinoma cells. Li et al. (2020) revealed that HCG18 sponges miR-34a-5p to mediate HMMR expression and promotes cell proliferation, migration and invasion in lung adenocarcinoma, thereby aggravating its progression. Furthermore, Xiang et al. (2017) reported that let-7i overexpression significantly alleviates cell death and increases the survival of OGD-treated human brain microvascular endothelial cells. Jickling et al. (2016) found that let-7i is decreased in circulating leukocytes of IS patients and is involved in leukocyte activation, recruitment and proliferation pathways. We therefore hypothesize that HCG18 may be upregulated in IS and sequester hsa-let-7i-5p, preventing it from inhibiting the expression of NR4A1 and ultimately leading to upregulation of NR4A1 in IS. The results of this study are consistent with this hypothesis.
VEGFA (also known as VPF, VEGF, MVCD1), a member of the VEGF family, is an important regulator of angiogenesis, neuroprotection and neurogenesis (Geiseler and Morland 2018). Therefore, upregulation of VEGFA may be a protective strategy for IS. Ren et al. (2018) confirmed that lncRNA-MALAT1 promotes angiogenesis by sponging miR-145 and blocking its inhibition of VEGFA expression. In this study, VEGFA expression was significantly upregulated in IS patients, and HCG18 may sponge hsa-miR-148a-3p to regulate VEGFA expression. The role of hsa-miR-148a-3p in IS has not been analyzed, but its role in other diseases has been demonstrated. For example, real-time quantitative PCR analysis has shown that hsa-miR-148a-3p was expressed at low levels in esophageal cancer samples, which promoted cell proliferation and invasion and was associated with a poorer prognosis. In addition, Wang et al. (2020) found that miR-148a-3p overexpression could damage the blood-retinal barrier and inhibit angiogenesis. Thus, upregulated HCG18 may also sponge hsa-miR-148a-3p and reduce its expression, thereby facilitating the upregulation of VEGFA and subsequently promoting angiogenesis.
Early growth response proteins belong to the immediate-early transcription factor family, are expressed in many types of tumor cells and are induced by a variety of stimuli (O'Donovan et al. 2000). Among the four EGR members (EGR1-4), EGR3 is mainly involved in neurodevelopmental processes such as myotome development (Tourtellotte and Milbrandt 1998) and sympathetic neuron differentiation (Jackson et al. 2014). It is therefore speculated that EGR3 upregulation may be one of the protective factors for IS. In support of this hypothesis, Liu et al. (2008) revealed that EGR3 upregulation promotes VEGF-mediated endothelial cell proliferation, migration, and angiogenesis. Chen et al. (2020) also confirmed that EGR3 promotes angiogenesis both in vivo and in vitro. Consistent with these studies, EGR3 was highly expressed in the IS samples in this study. Moreover, this study also suggests that both hsa-miR-148a-3p and hsa-let-7i-5p could regulate EGR3 expression, whereas HCG18 could sponge both hsa-miR-148a-3p and hsa-let-7i-5p.
For PTGS2, a previous study concluded that downregulation of PTGS2 expression not only promoted angiogenesis but also reduced brain infarct volume and inhibited inflammation in MCAO rats. However, PTGS2 was significantly upregulated in the peripheral blood of IS patients in this study, and this inconsistent result may be due to different sample selection. As with EGR3, HCG18 may regulate PTGS2 expression by sponging hsa-miR-148a-3p and hsa-let-7i-5p.
The present study still has limitations. Based on bioinformatics techniques, the study preliminarily explores the mechanism of IS angiogenesis and identifies the interactions among lncRNA, miRNA and mRNA in angiogenesis. Further experimental studies are needed to verify the interactions among the identified ceRNA axes in IS.
In conclusion, this study identifies lncRNA-miRNA-mRNA interaction axes (HCG18-hsa-let-7i-5p-NR4A1/PTGS2/EGR3 and HCG18-hsa-miR-148a-3p-PTGS2/EGR3/VEGFA), which provide a new perspective for the study of the mechanism of IS angiogenesis and potential targets for IS therapy. In the future, we will also conduct clinical, in vitro and in vivo experiments to validate the expression of the IS angiogenesis-related genes identified in this study and their functional relationships.

Declarations
Funding This study was financially supported by the China Medical Board.
Acknowledgments This study used the GEO database as a data source. The interpretation and reporting of these data are the sole responsibility of the authors. The authors acknowledge the efforts of the National Center for Biotechnology Information (NCBI) for the creation and distribution of the GEO database.
Authors' Contributions Jia Wang and Xuxiang Zhang analyzed and interpreted the data. Yibo Xie and Jiachen Li participated in the study. Jia Wang and Xuxiang Zhang wrote the manuscript. Xiaokun Wang and Chunxiao Yang conceived and designed the study. Chunxiao Yang supervised the study and provided support. All authors read and approved the final manuscript.
Availability of Data and Material The data used in this study were downloaded from the GEO database. The data used to support the findings of this study are available from corresponding websites upon request.
Conflict of Interest The authors declare that they have no competing interests.
Code Availability The code that supports the findings of this study is available on request from the corresponding author.

Consent for Publication Not applicable.
Consent to Participate Not applicable.