Re-Exploring the Inflammation-Related Core Genes and Modules in Cerebral Ischemia

The genetic transcription profile of brain ischemic and reperfusion injury remains elusive. To address this, we used an integrative analysis approach including differentially expressed gene (DEG) analysis, weighted-gene co-expression network analysis (WGCNA), and pathway and biological process analysis to analyze data from the microarray studies of nine mice and five rats after middle cerebral artery occlusion (MCAO) and six primary cell transcriptional datasets in the Gene Expression Omnibus (GEO). (1) We identified 58 upregulated DEGs with more than 2-fold increase, and adj. p < 0.05 in mouse datasets. Among them, Atf3, Timp1, Cd14, Lgals3, Hmox1, Ccl2, Emp1, Ch25h, Hspb1, Adamts1, Cd44, Icam1, Anxa2, Rgs1, and Vim showed significant increases in both mouse and rat datasets. (2) Ischemic treatment and reperfusion time were the main confounding factors in gene profile changes, while sampling site and ischemic time were not. (3) WGCNA identified a reperfusion-time irrelevant and inflammation-related module and a reperfusion-time relevant and thrombo-inflammation related module. Astrocytes and microglia were the main contributors of the gene changes in these two modules. (4) Forty-four module core hub genes were identified. We validated the expression of unreported stroke-associated core hubs or human stroke-associated core hubs. Zfp36 mRNA was upregulated in permanent MCAO; Rhoj, Nfkbiz, Ms4a6d, Serpina3n, Adamts-1, Lgals3, and Spp1 mRNAs were upregulated in both transient MCAO and permanent MCAO; and NFKBIZ, ZFP3636, and MAFF proteins, unreported core hubs implicated in negative regulation of inflammation, were upregulated in permanent MCAO, but not in transient MCAO. Collectively, these results expand our knowledge of the genetic profile involved in brain ischemia and reperfusion, highlighting the crucial role of inflammatory disequilibrium in brain ischemia.


3
Background Stroke is a leading cause of death and disability worldwide, and approximately 80% of strokes are caused by cerebral ischemia [1]. Approximately 70% of ischemic strokes are caused by the occlusion of a major cerebral artery [2]. The mainstay of acute treatment for ischemic stroke is rapid recanalization via thrombolysis or mechanical thrombectomy. However, such treatments must be administered only within a narrow therapeutic window [3]. Even after successful recanalization, infarcts often continue to increase in size, a process referred to as ischemia-reperfusion injury. Primary vessel occlusion persists in cases of unsuccessful recanalization [4]. Several clinical trials have attempted to improve the outcome after an ischemic stroke, including by using antioxidant strategies, neuronal protective strategies, and anti-inflammatory strategies. However, most of these approaches have failed, which highlights the critical need for developing new understanding of the mechanisms underlying ischemic stroke [5].
It is challenging to obtain human brain samples from ischemia stroke patients. As an alternative, the middle cerebral artery occlusion (MCAO) model has been used as the primary model in stroke research. The transient MCAO model, which mimics stroke with reperfusion, represents 2.5-11.3% of large vessel stroke patients, while permanent MCAO or gradual transient MCAO mimics a stroke without recanalization, representing the majority of stroke patients [6].
We analyzed MCAO transcriptome data from a Gene Expression Omnibus (GEO) dataset to identify differentially expressed genes (DEGs) and applicated bioinformatic tools to altered pathways and hub genes. Integrated analysis of all of the existing data from different experimental animals and primary cells allowed us to understand the gene profiles changes in an unbiased manner. Weighted-gene coexpression network analysis (WGCNA) was first used to analyze the relationship of gene profile changes and various treatments among studies, and the results provided guidance to MCAO animal models. Hub genes were identified via network analysis, and we experimentally validated the expression of unreported stroke-associated hub genes and several human stroke-associated hub genes. The unreported hub genes provided further research areas in brain stroke.

Meta-Analysis of Brain Ischemia Microarray Datasets
We conducted a meta-analysis of gene expression microarray datasets of MCAO according to the guidelines put forth by Ramasamy et al. [7]. Microarray data of 131 mouse brain tissue samples (71 MCAO and 60 controls) from the GEO database were analyzed as training datasets (DS1) ( Table S1). For each sample, the characteristics of treatment, sampling site, ischemia time, and reperfusion time were collected.
We identified 2669 DEGs between groups (adj. p < 0.05). The upregulated DEGs were more significant and pronounced than the downregulated DEGs, and there were 58 upregulated DEGs with more than 2-fold increase (Fig. 1B, Table S2). Supervised hierarchical clustering based on DEGs showed distinct clustering of the majority of MCAO samples, and the clustering was independent of sampling sites, ischemia time, and reperfusion time (Fig. 1C). Among all MCAO samples, 13 samples with a 24-h reperfusion time showed stronger variations compared to controls and other MCAO samples (Fig. 1C).
To test whether these findings were replicable, we analyzed the microarray data from 101 rat brain samples in the GEO datasets. Supervised hierarchical clustering based on the rat DEGs (adj. p < 0.05) showed poor clustering of MCAO samples and controls ( Figure S1B), indicating that integrated data from rat samples have poorer reliability compared to mouse samples. We chose 39 samples from five rat GEO series with more condensed clustering (20 controls and 19 MCAO) as the validation datasets (DS2 , Table S1). We observed 564 increased DEGs and 677 decreased DEGs in DS2 (adj. p < 0.05). Because of the larger size, we adjusted the statistical criteria to adj. p < 0.01 in DS1, and the DEGs in DS1 showed good replicability with those in DS2 (Fig. 1D). Moreover, Atf3, Timp1, Cd14, Lgals3, Hmox1, Ccl2, Emp1, Ch25h, Hspb1, Adamts1, Cd44, Icam1, Anxa2, Rgs1, and Vim showed significant increases in both DS1 and DS2 (Fig. 1E, Table S3). Hierarchical clustering of DS2 samples based on either DS2-DEGs or the overlapped Fig. 1 Integration, validation, and differential expression (DE) analysis of data from brain ischemia samples. A Workflow of the integrated analysis for brain ischemia in Gene Expression Omnibus (GEO) datasets. Data from mouse brain samples were identified as training DS1. The robustness of the DS1 data was validated based on rat brain samples DS2. DE analysis and WGCNA analysis of DS1 were used to explore phenotype-related gene modules and hub genes. The expression of unreported and part of the known human stroke susceptibility genes related to hub genes was experimentally validated. Module functions were clarified using GSEA, GSVA, and metascape analysis. Reported cell type markers and primary cell transcriptional profile in brain ischemia were used to assess the cell type contribution to module gene profile. B Volcano plot of DEGs in DS1. Genes with p-value < 0.05 and Log2 (fold change) > 1 were labeled.  Figure S1C). Additionally, hierarchical clustering of DS1 samples based on the overlapped DEGs showed distinct clustering of the majority of MCAO samples ( Figure S1A), similar to the clustering observed in the DEGs in DS1 (Fig. 1C). Thus, differentially expressed analysis produced robust and highly reproducible results, warranting further analysis.

WGCNA Analysis Identified Ischemia and Reperfusion Relevant Modules
Next, we applied WGCNA based on the DEGs (adj. p < 0.05) in DS1 to integrate the expression differences into a higher order systems level context. The expression levels of each module were summarized using the first principal component (the module eigengene). We identified nine modules by clustering of module eigengenes ( Fig. 2A; Figure S1D). The relationship of the module eigengene with the sample characteristics provided a complementary assessment of potential confounder variables (Fig. 2B). The modules were highly correlated with ischemic treatment and reperfusion time, but were not significantly related to the sampling sites and ischemic time (Fig. 2B). We used the Pearson correlation coefficients to reassess the correlation between the two most highly ranked positive or negative modules and treatment type, and found that the blue, turquoise, brown, and purple modules had significant correlations with treatment type ( Figure S1E).
The DEGs in DS1 had 24 known human stroke susceptibility genes [8][9][10] (Table S9). Remarkably, the blue and turquoise modules had 19 of 24 genes (Table S9). Therefore, the genes in the DS1-blue and turquoise modules were most likely to map the molecular changes in human stroke. The correlations of modules with ischemia time and reperfusion time were similar to that with groups, which was due to the confounding effects from controls. We further analyzed the module eigengenes of 71 MCAO samples and identified reperfusion time as a main factor associated with gene profile changes ( Figure S2A-C). Among the reperfusion time-related modules, the turquoise module genes increased with prolonged reperfusion time, while the green module genes decreased ( Figure S2B, D). Of the 885 genes in the DS1-turquoise module, 761 of them increased with prolonged reperfusion time (p = 5.2E−176, Fig. 2C).
Genes associated with Alzheimer's disease (AD) play a key role in brain damage due to ischemia and reperfusion [11]. The DEGs from a gene expression dataset of frontal and temporal cortex tissues in AD (GSE122063) showed highly significant overlap with DEGs in DS1 ( Figure S3A). Among the four modules relevant to ischemia, the turquoise module had a significant overlap with increased DEGs in AD ( Figure S3B). Mapping genetic modules of brain ischemia. A Gene clustering of DS1 on "TOM"-based dissimilarity. Genes with similar dissimilarity were set into the same module using the function "cut-treeDynamic." Modules with similarity > 0.8 based on "eigengenes" were merged. B Heatmap of the relationships between modules and group; the values were set as the "correlation coefficient (p-value)." C The overlap of reperfusion-related module genes and DS1-modules (hypergeometric test)

Functional Analysis and Hub-Gene Identification of the Inflammation-Related Blue Module
The blue module eigengene was highly expressed in MCAO cases, indicating that the genes in this module are upregulated in the MCAO brain ( Figure S3C). Most genes did not increase with reperfusion time; indeed, some even decreased (Fig. 2C). We therefore speculated that the genes in the blue module may play a priming role in subsequent complex molecular events. We applied GSEA and GSVA analyses to the genes in DS1 and metascape analysis to the module genes. The overlaps of the three analysis methods were regarded as the pathways and biological processes of the separate module genes (Fig. 1A). The genes in the blue module were implicated in the inflammation process (including T cell differentiation, vasculature development, cytokinecytokine receptor interaction, and tissue morphogenesis), hydrolase and peptidase activity, MAPK signaling pathway, and apoptotic signaling pathways (Fig. 3A, Table S5). All involved pathways and processes were upregulated in brain ischemia (Fig. 3A). Cell typespecific markers (including neurons, astrocytes, mixed microglia, oligodendrocytes, and macrophages) were used to assess the cell contribution to the module gene profile [12] [13]. No significant enrichment markers were detected in the blue module (Table S4). We further analyzed the RNA data from ex vivo inflammationassociated primary cells in brain ischemia models from GEO datasets, including microglia, astrocyte, epithelium cell, and macrophages (Table S1). The genes in the blue module showed significant overlap with astrocyte-relevant DEGs (p = < 0.0001) and microglia-relevant DEGs (p = 0.007, Table S4), indicating that gene changes in the blue module may be dominated by astrocytes and microglia.
A further advantage of network analysis over differential expression (DE) analysis is that it allows the inference of the functional relevance of genes based on their network position. In this case, the hubs had the highest rank of module membership. The core hubs of the blue module are listed in Table S6. Four of the hubs have never been reported in brain ischemia, including Zfp36, Rhoj, Maff, and Nfkbiz (Table S6). Twenty of the 21 hub genes were detected as DEGs in primary astrocytes (Fig. 4A, B; Table S7). We further explored the proteinprotein interaction (PPI) network of core hubs and known human stroke susceptibility genes in the STRING database and found that the genes most strongly connected to stroke susceptibility genes were Adamts1, Zfp36, Nfkbiz, Ccl2, and Hmox1 (Fig. 4C). The transcriptions of Zfp36, Rhoj, Maff, and Nfkbiz in DS1 rapidly increased as early as 2 h after reperfusion and lasted at least 24 h ( Figure S5).

Turquoise Module is a Thrombo-Inflammation-Related Module Involved in Reperfusion Injury
The turquoise module eigengene showed high expression in MCAO cases, indicating that the genes were upregulated in the MCAO brain and increased with reperfusion time ( Figure S3C, 2C). The turquoise module showed markers of mixed microglia (60% microglia, 40% astrocyte and oligodendrocyte), and its genes, including core hubs, showed high significant overlap with DEGs of primary astrocytes and microglia (Table S4, S7). The pathway enriched in the turquoise module was involved in two main processes of thrombo-inflammation [14]. The first was leukocyte activation and migration-related molecular events, such as actin filament-based processes, integrin pathways, and extracellular matrix organization, and the second was platelet activation and degranulation by the transmembrane receptor protein tyrosine-kinase signaling pathway [15] (Fig. 3C, Table S5). This suggests that platelet activation and inflammatory responses are concomitant in ischemia-reperfusion in acute stroke. The core hubs of the turquoise module are listed in Table S6, including Tnc as an astrocyte marker and Tgfbi as a macrophage marker. Mmp3 and Serpina3n are homologs to the known human stroke susceptibility genes MMP12 and Serpine1, respectively. Serpina3n was newly reported to be upregulated in astrocytes and neurons within the ischemic penumbra after stroke. The expression of ser-pina3n expression gradually increased 6 h after reperfusion, peaking on days 2-3, and then decreasing by day 7, which was in accordance with our analysis in Figure S5. Overexpression of serpina3n reduced the infarct size, associated with alleviated inflammation, oxidative stress, and cytotoxic granzyme B inactivation [16] [17]. Ms4a6d and Tgfbi have never been reported in brain ischemia, and their transcription remarkably increased from 8 h and peaked at 24 h following stroke ( Figure S5). Serpina3n, Lgals3, Mmp3, Vim, Cd44, Spp1, and Tgm2 showed interactions with established human stroke susceptibility genes in the STRING database (Fig. 4C).

Brown and Purple Module Genes Were Decreased in MCAO
The brown module of co-expressed genes was negatively related to ischemia status and showed a decrease after stroke ( Figure S3C). This module was enriched for oligodendrocyte markers (Table S4) and was downregulated in the MCAO brain, where it showed activity in response to oxidative stress, porphyrin metabolic processes, and synapse transport ( Figure S4C, 3B). The purple module (Figure S3C), whose genes have been implicated in response to hormone and neurotransmitter transport, was also decreased (Fig. 3D, S4D). The unreported core hubs-Gpr34, Myoc, Crybb1, Rhobtb2, and Rgs9-decreased in transcription processes to the lowest point at 24 h after stroke ( Figure S5).

Expression Identification of Core Hubs
We validated the transcription of unreported core hubs. Zfp36 mRNA was upregulated in permanent MCAO; Rhoj,  signals were fuzzy, which may indicate that it is less abundant in the brain. RHOJ protein remained unchanged in both permanent and transient MCAO (Fig. 5). Several hub genes have been previously identified as being human stroke-associated genes. We found upregulation of Adamts-1, Lgals3, and Spp1 mRNA in both permanent and transient MCAO mice, suggesting shared mechanisms of stroke between species. Fig. 4 Heatmap of log2 fold change (A) and p-value (B) of core hubs in whole brain and primary cells. C PPI network of core hubs and known human stroke susceptibility genes based on the STRING network. The thickness of the edges represents the experimentally deter-mined interaction scores. The red, blue, and turquoise dots represent the known human stroke susceptibility genes, the blue and turquoise module core hubs, respectively

Discussion
In this study, an integrated analysis of mRNA expression after brain ischemia was performed using eight publicly available mouse GEO datasets, five rat GEO datasets, and primary cells. We found suggestive evidence of the MAPK signaling pathway, apoptosis, and an increase in thromboinflammation, which were reported in previous studies. In this integrated analysis of experimental animals, we also reported core regulators of inflammation, including Zfp36, Nfkbiz, Maff, Rhoj, Ms4a6d, and Tgfbi. We further verified their upregulated transcription in brain ischemia. ZFP36, NFKBIZ, MAFF proteins were upregulated as unreported core negative regulators of inflammation after stroke, providing suggestive targets for further inflammatory studies in stroke. We reported 18 DEGs that were common between mouse and rat datasets, and 15 of them were upregulated in MCAO (Fig. 1E, Table S3). Fourteen of the upregulated DEGs were detected in primary astrocytes, six of which were specific to astrocytes; and five and seven of the upregulated DEGs were detected in primary microglia and macrophages, respectively ( Figure S3D). These findings indicated that most of the DEGs were contributed by astrocytes, microglia, and macrophages-the main inflammatory cells-after stroke. Eight and four of the upregulated DEGs overlapped with blue and turquoise module core hubs, respectively ( Figure S3E). These results indicate that 12 of the 15 most significantly upregulated DEGs played a central role in inflammatory-related module genes network. Among the upregulated DEGs, Icam, Adamts-1, Lgals3, and Spp1 were reported as being human stroke-associated genes. Icam1, which is also a blue module core hub, is a known stroke susceptibility gene and is involved in the inflammation process [10]. Polymorphisms of Adamts-1 (rs416905 and rs402007) have been found to be associated with ischemic stroke caused by large artery atherosclerosis (LAA) in humans and are upregulated in unstable carotid plagues and the ischemic brain [18][19][20]. Galectin-3 (encoded by Lgals3) is a mediator of microglia responses in the injured brain [21]. The IL-10-STAT3galectin-3 axis is essential for osteopontin (encoded by Spp1)-producing reparative macrophage polarization [22], and both galectin-3 and osteopontin are emerging biomarkers in stroke that are responsible for monitoring brain microglial activation [23] [24]. Zfp36 is a global post-transcriptional regulator of feedback control in inflammation, promotes cell quiescence, and inhibits apoptosis [25] [26]. ZFP36 is an RNA-binding protein involved in mRNA metabolism pathways and contains two tandemly repeated CCCH-type zinc-finger motifs, binds to AU-rich elements (AUUUA) in the 3′-untranslated regions of specific mRNA, and leads to target mRNA decay [27]. ZFP36 also degrades Tnf, Ptgs2, and Ccl3 mRNAs and negatively regulates NF-κB signaling at the transcriptional corepressor level [28] [29][30][31]. In our mouse MCAO models, Zfp36 was upregulated in permanent MCAO, but not in transient MCAO. Among the primary inflammatory cells after stroke, Zfp36 was upregulated only in astrocytes, but not microglia, the epithelium, and macrophages (Fig. 4A, B). Zfp36 was recently reported to be increased in immortalized hippocampal HT22 neuronal cells after oxygen-glucose deprivation/reoxygenation treatment, where it protects against mitochondrial fragmentation and neuronal apoptosis [32]. Thus, we speculate that study of the regulation of inflammation by Zfp36 in neurons and astrocytes after brain ischemia will be a promising future research direction.
IkappaB-zeta (encoded by Nfkbiz) is the last identified member of IkappaB family proteins, and is strongly induced upon stimulation with LPS, IL1-β, and IFN-γ，which stimulates cells through the Toll-like receptor 4, and localizes in the nucleus, where it inhibits P65 activity [33,34]. Transcription of Nfkbiz is dose-dependent in fibroblasts, and it mediates the transcriptional response to TNF and IL-17A, but not TNF alone [35]. Therefore, Nfkbiz is widely involved in Th subtype-related-cytokines regulation, including the Th1-related cytokines, TNF-α, and IFN-γ; the Th2-related cytokine, IL-10; and the Th17-related cytokines, IL-17 and TNF-α. We found an increase in L10 signaling and T cell differentiation after ischemic stroke (Fig. 3 A, B). Emerging evidence indicates that NKT cells accumulate in the ischemic hemisphere at 24 and 48 h after ischemia [36], and Nfkbiz is essential for NK cell activation in response to IL-12 and IL-18 [37]. The ability of Nfkbiz to regulate T cell differentiation after brain ischemia will be an interesting direction for future research.
Maff belongs to the small Maf proteins (sMafs), which are basic leucine zipper (bZIP)-type transcription factors. MAFFs form homodimers by themselves and heterodimers with p45 NF-E2, Nrf1, Nrf2, and Nrf3. Homodimers act as transcriptional repressors, while heterodimers are transcriptionally active in processes that are dependent on the heterodimeric partner molecules and context [38]. Recent studies have revealed that Maff regulates an atherosclerosis relevant network connecting inflammation and cholesterol metabolism [39]. However, there have been no reports about its ability to regulate inflammation after stroke.
Ser 232 and Ser 235 phosphorylation in Ms4a6d leads to the activation of JAK2-STAT3-A20 cascades, which further results in nuclear factor κB suppression and Nlrp3 and Il-1β repression [40]. Although we did not revalidate its protein levels due to the lack of commercial antibodies, it represents an interesting investigation target in stroke.
AD is a chronic inflammatory neurodegenerative disease in which the brain undergoes an innate immune response, including large changes in the phenotypes of microglia and astrocytes, and disruptions to the blood-brain barrier (BBB) [41]. The turquoise module showed significant overlap with increased DEGs in AD (Fig. 2C). Indeed, the core hub Adamts1 has been identified as a risk locus for AD [42], while APOE4 has been specifically associated with the upregulation of multiple Serpina3 genes [43]. Ms4a6d belongs to the MS4A gene cluster, which is a key modulator of AD risk [44]. This suggests that innate immune response disequilibrium and thrombus formation are common pathological molecular changes in both acute stroke and AD.
The DEG analysis conducted in previous research had had some limitations [45], which we attempted to address in our current study. First, we addressed the wide heterogeneity of previous studies (due to variations in microarray platforms, labs, and technicians) using the "removeBatchEfffect" in the limma package to remove unwanted batch effects. Second, while DEGs are heterogeneous across species, only two common DEGs have been previously identified [45]. Therefore, clustering based on DEGs provided the ability to determine the appropriateness of analysis based on species. Indeed, we found that DEG analysis and network analysis were suitable for mice models, but not rats. Finally, as sample-processing conditions varied across studies, we identified reperfusion time as a major confounding factor using WGCNA analysis.
However, our study has several limitations. The data used for analyses were taken from experimental animal models, and the results should be cautiously interpreted and revalidated in human tissues. Moreover, the analyses did not consider post-transcriptional and post-translational modifications, and further work combining gene expression, protein, and phosphorylation profiles is needed to more accurately predict molecular changes after cerebral ischemia and reperfusion.

Conclusion
In conclusion, through integrative analysis of gene expression datasets from GEO, we have identified a reperfusion-independent inflammation module and a reperfusion-dependent thrombo-inflammation module. We identified six previously unreported inflammation-related core hubs in brain ischemia, including Zfp36, Rhoj, Nfkbiz, Maff, Ms4a6d, and Tgfbi, and further verified that Zfp36, Nfkbiz, and Maff, which are core negative inflammatory regulators, were upregulated after stroke at both the mRNA and protein levels.

Building the Gene Expression Compendium of the Brain or Primary Cells
Gene expression profiling array data were downloaded from the GEO using the terms "cerebral ischemia," "MCAO," "brain ischemia," "cerebral Infarction," or "ischemic attack." Microarray analyses using RNA samples from mouse and rat brain were included in our study. The basic characteristics used to identify the studies included first author name, year of publication, dataset, and platform. Characteristics of samples were collected, including groups (contralateral or ipsilateral of MCAO), sampling sites (cortex, ischemic penumbra, hemisphere, ischemic core, striatum), ischemia time, and reperfusion time. Using these criteria, 131 mouse samples and 101 rat samples were ultimately selected, and raw data from individual datasets were downloaded.
RNA data from ex vivo sorted primary microglia, astrocytes, epithelial cells, and macrophages were downloaded from the GEO using the abovementioned terms, by applying oxygen-glucose deprivation (OGD) or MCAO as cerebral ischemia models.

Combining the Study-Specific Estimates from Individual Datasets and Identification of DEGs
We combined the study-specific estimates from individual mouse or rat datasets according to the guidelines put forward by Ramasamy et al. [7]. Raw data were normalized via log2 transform, probe IDs were converted to official gene symbols, and any probe that did not map to any official gene symbols was discarded. For official gene symbols with multiple probe IDs within a study, we selected the value with the maximum mean value and discarded any official gene symbol that did not provide valuable information in individual datasets and that was not found in at least one individual dataset. For every official gene symbol, we combined the study-specific estimates across the studies and removed the batch effect using the removeBatchEffect function in the limma package. We performed a DE analysis between contralateral samples and ipsilateral samples using the R package "limma." Genes with an adjusted p-value < 0.05 were identified as DEGs. A gene list of the top 5% upregulated and top 5% downregulated genes of the mouse dataset is provided in Table S2.
For merged microglia dataset, we combined the studyspecific estimates from individual datasets using the removeBatchEffect function in the limma package. DE analyses of expression profiling by array between contralateral samples and ipsilateral samples were performed using the R package limma, and those of expression profiling by high-throughput sequencing were performed using the R package "edgeR." Genes with an adjusted p-value < 0.05 and |log2 fold change| > 1 were identified as DEGs of primary inflammatory cells.

WGCNA and Identification of Core Hub Genes
WGCNA is an R package for weighted correlation network analysis and is a method for describing the correlation patterns among genes across samples. WGCNA can be used to identify modules of highly correlated genes, and it can measure the correlation of modules to one another or external traits. DS1-DEGs were input as expression data, and characteristics of samples, including group, sampling sites, ischemic time, and reperfusion time, were input as trait data. A β power was used to increase the gap in the scale-free topology network conversion, where the value of β was 5. The TOM value was calculated, and genes were clustered into modules on TOM-based dissimilarity. The minimum module size was 30. Correlation of modules and traits was measured using the "cor" function in R studio (version 4.0.3, 250 Northern Ave, Boston, MA, USA, 02210). The correlation of gene module membership and gene trait significance of interested modules was calculated to revalidate the correlation of modules and traits. We calculated the connectivity within interested modules, and genes of the highest 5% connectivity were regarded as WGCNA hub genes. Hub genes were overlapped with top 5% DS1-DEGs, and the overlapped genes were identified as core hubs of modules (Table S6).

Pathway and Process Enrichment Analysis Based on Metascape
Metascape (http:// metas cape. org/) is an online tool to combine functional enrichment to leverage independent knowledge bases within one integrated portal [46]. Pathway and process enrichment analysis was conducted with the following ontology sources: KEGG pathway, GO biological processes, Reactome gene sets, canonical pathways, CORUM, and WikiPathways. WGCNA module genes were input into the metascape. Terms with a p-value < 0.01, a minimum count of 3, and an enrichment factor > 1.5 were collected and grouped into clusters based on their membership similarities. Sub-trees with a similarity of > 0.3 were considered a cluster. The most statistically significant term within a cluster was chosen to represent the cluster. To further capture the relationships between the terms, a subset of enriched terms was selected and rendered as a network plot, where terms with a similarity > 0.3 are connected by edges ( Figure S4). We selected the terms with the best p-values from each of the 20 clusters, with the constraint that there were no more than 15 terms per cluster and no more than 250 terms in total. Each node represented an enriched term and was colored first by its cluster ID. For clarity, term labels were only shown for one term per cluster.

Pathway Gene Signatures Analyzed using Gene Set Enrichment Analysis (GSEA)
GSEA is a computational method for exploring whether a priori defined set of genes shows statistically significant, concordant differences between two biological states. GSEA was performed using GSEA software (version v4.1.0 Java Web Start) to determine enriched terms that were predicted to have a correlation with all canonical pathways (including BioCarta, KEGG, PID, Reactome, and WikiPathways) in C2 and biological processes in C5. The gene sets containing less than 15 genes or more than 500 genes were excluded. The phenotype label was set as MCAO vs. controls, and the metric for ranking genes was Signal2Noise. The number of permutations was set as 1000 replications. Pathways with a nominal p-value < 0.05 and FDR < 0.25 were considered significantly enriched.

Gene Set Variation Analysis (GSVA)
GSVA (version 1.38.0) is an open-source software package for R, which provides increased power to detect subtle pathway activity changes over a sample population in comparison to GSEA [47]. Pathway analyses were predominantly performed on canonical pathways in C2 and biological processes in C5 of the molecular signatures database. We applied GSVA using standard settings, as implemented in the GSVA package. Enriched pathways were compared using the limma package between MCAO and controls, and those with an adjusted p-value < 0.01 were considered significantly enriched.

Transient and Permanent MCAO Mouse Model
MCAO was induced with an intraluminal suture to occlude the middle cerebral artery, as previously described [48], with some modifications. Briefly, mice were anaesthetized with isoflurane in air with spontaneous respiration. Under the operating microscope (RWD Life Science Co., Ltd.), the common carotid artery (CCA), internal carotid artery (ICA), and external carotid artery (ECA) were exposed through a midline neck incision. The proximal portions of the CCA and ICA were temporarily ligated, and the distal portion of the ECA was permanently ligated. A 180μm silicon-coated nylon suture was induced into the ECA and advanced approximately 9.0 mm beyond the carotid bifurcation to occlude the middle cerebral artery. For transient MCAO, the suture was removed to restore blood flow after 90-min occlusion. For permanent MCAO, the suture stayed in the vessels. Neurological evaluation was performed using the Longa scoring method [49]. Mice undergoing ischemia were rested in the home cage for 24 h. During all surgical procedures, mice were maintained normothermic using a heating blanket. Animals with a Longa's score of 1-3 were sacrificed. Those who died in the first 24 h after surgery and those with subarachnoid hemorrhage at the time of killing were excluded.

RNA Quantification by RT-qPCR
Approximately 20 mg of brain tissues from the edematous ischemic hemisphere and ipsilateral hemisphere was immediately extracted and frozen in liquid nitrogen. RNA and proteins were extracted and quantified, similar to the steps mentioned in our previously published articles [50]. For RT-qPCR, Gapdh, and Actb were used as reference genes for normalization. All primers are listed in Table S8. Data were analyzed with 5-6 biological replicates and 2-3 technical replicates in each group. The Mann-Whitney U test was used to compare data between MCAO and controls, and data with p-values < 0.05 were considered significantly different.