Identi cation of multiple sclerosis-related genes regulated by EBV-encoded microRNAs in B cells

Xinming Rang Second A liated Hospital of Harbin Medical University Yuan Liu Second A liated Hospital of Harbin Medical University Jingguo Wang Second A liated Hospital of Harbin Medical University Yifei Wang Second A liated Hospital of Harbin Medical University Chaohan Xu Harbin Medical College: Harbin Medical University Jin Fu (  fujin@hrbmu.edu.cn ) Second A liated Hospital of Harbin Medical University https://orcid.org/0000-0001-7532-784X


Background
Multiple sclerosis (MS) is an autoimmune disease characterized by demyelination of white matter in the central nervous system (CNS). It is a major cause of serious physical disability in young adults in developed countries [1]. MS involves the interaction between genetic susceptibility and environmental risk factors, and more than 200 gene loci related to MS have been identi ed through genome wide association studies (GWAS), which can account for 25% of the perceived heritability of MS [2][3][4]. Though a proportionately larger contribution was driven by non-genetic in uences, such as Epstein-Barr virus (EBV) infection, vitamin D de ciency and smoking, less progress has been made in how environmental triggers may function in MS pathogenesis [5]. Until now, the diagnosis of MS mainly relies on imaging features and clinical manifestations due to the limited understanding of MS pathogenesis, suggesting a tremendous need of speci c diagnostic biomarkers.
Epidemiological researches reported that viruses infection are closely related to the pathogenesis and development of MS, including EBV, Cytomegalovirus (CMV), Human immunode ciency virus (HIV), in uenza virus and measles virus, but the correlation between these viruses and MS risk has not been determined quantitatively [6][7][8][9][10]. From a large longitudinal study, the seropositivity of anti-EBV nuclear antigen IgG and the history of infectious mononucleosis caused by EBV infection were the top two risk factors associated with MS [11]. Pathological studies indicated that EBV-infected B cells can migrate to the CNS of MS patients. At advanced stages of the disease (secondary progressive MS), it may be induced to organize themselves into B cell follicles, contributing to viral existence and reactivation [12].
Moreover, a signi cant additive interaction between EBV infection and major histocompatibility complex, class II, dr beta 1 (HLA-DRB1)*1501 has been observed in the risk of MS [13]. Collectively, EBV infection has been identi ed as a key regulator in the initiation and progress of MS, but its roles in the pathogenesis of MS have not been established.
MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression at the posttranscriptional level, whose potential importance has been highlighted in the pathogenesis of MS. In contrast to human miRNAs, viral miRNAs have been con rmed as more active players in host-microbe interactions, but little is known about miRNAs encoded by EBV, the major environmental risk factor of MS. EBV encodes 44 mature miRNAs which are located in two main clusters, the BamHI fragment H rightward open reading frame 1 (BHRF1) and BamHI A rightward transcripts (BART) clusters [14,15]. EBV miRNAs are abundantly expressed in all stages of EBV infection and latency (25% of miRNAs in Latency III), which can target both viral and host cellular mRNAs, allowing EBV-infected B cells to evade the host immune response [16,17]. Albanese et al compared the primary B cells infected with EBV expressing or lacking viral miRNAs. In contrast to miRNAs-expressing EBV, primary B cells infected with miRNAs-de cient EBV were more susceptible to death when introducing EBV-speci c CD8 + T cells[18]. Barth et al indicated that ebv-miR-BART2 inhibited transition from latent to lytic viral replication by downregulating the viral DNA polymerase BALF5, suggesting that EBV miRNAs had an important regulatory impact on host B cells [19].
Current researches of EBV miRNAs are mainly focusing on tumors, ebv-miR-BART8-3p has been con rmed as a potential biomarker for nasopharyngeal carcinoma, which can induce epithelialmesenchymal transition and promote metastasis through directly targeting Ring Finger Protein 38 (RNF38) in nasopharyngeal carcinoma cells [20]. However, the molecular mechanism of EBV miRNAs may function in MS pathogenesis has not been systematically elucidated.
In the present study, we collected and investigated the currently available transcriptomics datasets of MSrelated viruses and MS patients, the results of which showed the most signi cant correlation between EBV infection and MS risk by systematic analysis. Furthermore, we comprehensively screened out MSrelated genes in B cells by integrating MS susceptibility genes and differentially expressed genes (DEGs) from B cells. In comparison with DEGs from B cells after EBV infection in vitro, we con rmed EBV-regulated, MS-related genes. We then obtained target EBV miRNAs which can regulate these genes from several online databases. Finally, we identi ed target EBV miRNAs may directly regulate MS-related genes through bioinformatic prediction and experimental validation. These ndings may provide a novel insight into discovering diagnostic biomarkers and therapeutic targets for MS.

MS-related viruses and MS patients datasets selection
A owchart for this part was shown in Fig. 1a. All currently available microarray gene expression datasets of MS-related viruses and MS patients were downloaded from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/). The selection criteria were as follows: (1) samples of MSrelated viruses datasets were derived from viruses susceptible cells which were isolated from healthy individuals and then infected in vitro. (2) samples of MS patients datasets were derived from those viruses susceptible cells. (3) samples of MS patients datasets were immunosuppressive treatment naïve or had not received treatment for at least two months. (4) the sample size of both case and control groups should be no fewer than three. According to these criteria, seven viral gene expression pro ling studies were included: two datasets for EBV (accession number: GSE45829 and GSE49628 [21,22]), two datasets for CMV (accession number: GSE17948 and GSE24238 [23,24]), one dataset for HIV (accession number: GSE33877 [25]), one dataset for in uenza virus (accession number: GSE32140 [9]) and one dataset for measles virus (accession number: GSE980 [10]). Meanwhile, a total of four MS patients datasets were taken from GSE117935, GSE66988, GSE41890 and GSE37750 [26][27][28][29].
Data pre-processing and identi cation of the DEGs Raw expression values were quantile normalized and log2 transformed. Probe sets mapping multiple genes were removed and the mean values were calculated as expression levels of those mapping on one gene. DEGs were screened out by using linear model in "limma" package in R [30]. The cut-off criteria for statistical signi cance were set as p-value less than 0.05.

Hypergeometric test
A hypergeometric test was used to assess the correlation between MS-related viruses infection and MS risk [31]. The p-value was derived from the hypergeometric function below: Here, m represents the total number of human genome genes, j is the number of DEGs in each group of cells after MS-related viruses infection, n belongs to the number of DEGs between MS patients and healthy controls, and x denotes the number of genes belonged to both two groups of DEGs. The p-value less than 0.05 was considered statistically signi cant.

Acquisition of MS-related genes regulated by EBV
A owchart for this part was shown in Fig. 1b. In order to comprehensively screen out MS-related genes in B cells, DEGs from B cells between MS patients and healthy individuals were integrated with MS susceptibility genes. Through comparing MS-related genes and DEGs between EBV-infected B cells and primary B cells, we identi ed EBV-regulated, MS-related genes.

Pathway enrichment analysis
Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed for given genes through clusterPro ler package in R [32]. Gene symbols were transformed into EntrezID by means of org.Hs.eg.db package. These signi cant pathways were selected using Fisher's exact test, with the threshold of 0.05 for p-value.

Pathway-pathway and pathway-gene networks construction
In order to further identify the involvement of EBV in MS pathogenesis, KEGG pathway enrichment analysis and hypergeometric test were used to construct pathway-pathway network, which was visualized by Cytoscape software [33]. In this hypergeometric function, m represents the total number of human genome genes, j and n represent the number of genes in the two MS risk pathways respectively, and x denotes the number of genes belonged to both two MS risk pathways. In addition, all MS risk pathways regulated by EBV and their mapped genes were considered and combined to construct a pathway-gene network. The hub genes were determined by calculating centrality degree.

Investigation of target EBV miRNAs
MiRTarbase (http://mirtarbase.mbc.nctu.edu.tw/) and TarBase (http://carolina.imis.athenainnovation.gr/diana_tools/web/index.php?r=tarbasev8%2Findex) were used to obtain human target genes of 44 mature EBV miRNAs. All target genes should be experimentally validated, including western blot, luciferase reporter assay or high throughput sequencing [34,35]. In comparison with EBV-regulated, MS-related genes, we identi ed MS-related genes regulated by EBV miRNAs. The pathway-pathway network was constructed according to KEGG pathway enrichment analysis and hypergeometric test. Subsequently, we obtained the hub genes in the MS risk pathway-gene network regulated by EBV miRNAs. In addition, a protein-protein interaction (PPI) network was constructed by STRING database to further investigate EBV miRNAs-regulated, MS-related genes [36]. Finally, we identi ed target EBV miRNAs may directly regulate MS-related genes through bioinformatic prediction and experimental validation.

Acquisition of MS-related viruses and MS patients DEGs
Based on the threshold of 0.05 for p-value, we rst screened out DEGs in viruses susceptible cells from healthy individuals after MS-related viruses infected in vitro. Subsequently, we identi ed DEGs in viruses susceptible cells between MS patients and healthy controls. The number of DEGs was shown in Table 1.

Evaluation of correlation between MS-related viruses infection and MS risk
A hypergeometric test was conducted to explore the correlation between MS-related viruses infection and MS risk. We compared each group of MS-related viruses and MS patients DEGs derived from the same type of cells and found that four types of viruses were statistically signi cant with MS, apart from HIV. As anticipated, the correlation between EBV infection and MS risk was the most signi cant (p < 1.0×10 − 6 ), followed by measles virus, in uenza virus and CMV (Fig. 2).
Acquisition of all MS-related genes and risk pathways in B cells In recent years, a total of 284 MS susceptibility genes have been con rmed through GWAS [2,3]. By analyzing published transcriptomics data from B cells between relapsing-remitting MS (RRMS) patients and healthy controls[26], 601 DEGs were con rmed (322 upregulated-genes, 279 downregulated-genes), with only 12 overlapped genes between DEGs and MS susceptibility genes (Fig. 3a). This result suggested that hereditary susceptibility did not solely dominate the alteration of gene expression during MS development, environmental factors also played an important role, so that both susceptibility genes and DEGs should be regarded as MS-related genes. Therefore, we integrated 284 susceptibility genes and 601 DEGs from B cells, and a total of 873 MS-related genes were collected. By using KEGG pathway enrichment analysis, 52 MS risk pathways were obtained (Fisher's exact test, p < 0.05, Additional le 1: Table S1). As shown in Fig. 3b Table S2). Furthermore, a total of 18 closely associated MS risk pathways were identi ed by hypergeometric test, and the pathway-pathway network was visualized by Cytoscape ( Fig. 4a, b). In addition, we constructed a pathway-gene network among these 18 risk pathways and their enriched genes. In this network, 10 genes (HLA-DPB1, IFNGR2, STAT3, TNFRSF1A, CXCR4, IL10RA, LCK, MALT1, BCL10 and LTA) were determined as hub genes due to their high centrality degree (Fig. 4c, d).

Identi cation of the target EBV miRNAs
By searching miRTarbase and TarBase, 209 and 5962 EBV miRNAs-mRNAs pairs were downloaded, respectively. And a total of 3171 human target genes of 44 mature EBV miRNAs were obtained, out of which 42 target genes overlapped with 150 EBV-regulated, MS-related genes. According to the result, these 42 target genes were regulated by 36 target EBV miRNAs (Fig. 5a, Additional le 2: Figure S1). To further explore the impact of EBV miRNAs in the pathogenesis of MS, 19 closely associated MS risk pathways regulated by EBV miRNAs were screened out through KEGG pathway enrichment analysis and hypergeometric test (Additional le 1: Table S3). Subsequently, a pathway-gene network among these 19 MS risk pathways and their enriched genes were constructed, and a total of 5 hub genes (MALT1, BCL10, IFNGR2, STAT3 and CDK6) were obtained (Fig. 5b, Additional le 2: Figure S2). In addition, a PPI network of these 42 target proteins were established by STRING database, 10 hub genes of which were selected according to betweenness centrality (Fig. 5c). We found that forkhead box p1 (FOXP1) has a close relationship with mucosa-associated lymphoid tissue lymphoma transport protein 1 (MALT1) and BCL10 immune signaling adaptor (BCL10) in the PPI network, which should also be a focus point in our study. Finally, we obtained 15 target EBV miRNAs and their regulated, 6 MS-related genes (Fig. 5d). Surprisingly, MALT1 has been con rmed as a target of ebv-miR-BHRF1-2-5p by luciferase reporter assay in our research [37].  blasts with EBV infection when compared to the original resting B cells, re ecting EBV infection played a facilitative role in B cell activation [21]. Thus, we obtained a total of 3556 DEGs from primary B cells after EBV infection in vitro. Through intersecting these 3556 DEGs and 873 MS-related genes in B cells by the same trend, we identi ed 150 EBV-regulated, MS-related genes. Subsequently, a total of 18 closely associated MS risk pathways regulated by EBV were obtained through KEGG pathway enrichment analysis and hypergeometric test, including Viral protein interaction with cytokine and cytokine receptor, NF-κB signaling pathway and PD-L1 expression and PD-1 checkpoint pathway in cancer. After pathwaygene network analysis, HLA-DRB1 was con rmed as a vital factor in the development of MS that EBV was involved in. Recent studies revealed that B cells could express MHC class II and serve as antigenpresenting cells for activation of CD4 + T cells, which was critically involved in MS pathogenesis [43]. Meanwhile, another two molecules in the MS risk pathway-gene network regulated by EBV, MALT1 and BCL10, were important parts of CARMA1-BCL10-MALT1 (CBM) complex, which was a central mediator of T cell receptor and B cell receptor-induced NF-κB activation, and in this way drove lymphocyte activation [44]. These results indicated that EBV-infected B cells might lead to the activation of T cells via NF-κB pathway, which may point to the crucial molecular processes in the pathogenesis of MS.
In order to explore the regulatory effects and associated signaling pathways of EBV miRNAs in MS pathogenesis, we identi ed 42 MS-related genes regulated by 36 target EBV miRNAs. By constructing pathway-pathway and pathway-gene networks, a total of 5 hub genes (MALT1, BCL10, IFNGR2, STAT3 and CDK6) were obtained. Subsequently, a PPI network of these 42 target proteins were established by STRING database, and we found that FOXP1 had a close relationship with MALT1 and BCL10, which should also be a focus point in our study. Finally, we identi ed 15 target EBV miRNAs and their regulated, 6 MS-related genes. It is worth noting that ebv-miR-BHRF1-2-5p and ebv-miR-BHRF1-3 belonged to the same class cluster and had a synergistic effect. In this study, we found that ebv-miR-BHRF1-2-5p can target both MALT1 and BCL10, the components of CBM complex. MALT1 was a key molecule that promoted activation of NF-κB pathway, regulated regulatory T cells (T regs ) function and maintained immune tolerance. MALT1 knockout mice showed absence of T regs , increased Th1 and Th2 cells, which consequently led to lymphocyte in ltration and multi-organ in ammation [45]. Surprisingly, we rstly detected ebv-miR-BHRF1-2-5p and ebv-miR-BHRF1-3 were signi cantly increased in the circulation of RRMS patients. By luciferase reporter assay, MALT1 has been con rmed as a target gene of ebv-miR-BHRF1-2-5p in our research [37]. The target gene of ebv-miR-BHRF1-3, phosphatase and tensin homolog (PTEN), has also been con rmed. PTEN was a key regulator of the phosphatidylinositol 3-kinase (PIK3)/AKT (protein kinase B) survival pathway [46]. Furthermore, FOXP1 was closely associated with MALT1 and BCL10 in the PPI network and identi ed as a target gene of ebv-miR-BART11 [38,39]. FOXP1 was a critical regulator in maintaining T regs homeostasis and suppressive function. Mice with FOXP1de cient T regs developed spontaneous in ammatory disease with age, which can lead to more severe EAE [47]. In addition, we found another two EBV miRNAs-regulated, MS-related genes, interferon gamma receptor 2 (IFNGR2) and STAT3 were both the upstream genes of programmed cell death ligand 1 (PD-L1) and PD-1 in the PD-L1 expression and PD-1 checkpoint pathway in cancer. PD-L1 and PD-1 knockout mice showed more severe in ammatory in EAE models [48]. Thus, EBV miRNAs could directly target MS-related genes, which consequently altered the expression and function of B cells, changed the autoimmune response of T cells and promoted the initiation and development of MS through NF-κB (MALT1 and BCL10) and PD-L1/PD-1 (IFNGR2 and STAT3) pathways (Fig. 6).  Identi cation of MS-related genes regulated by target EBV miRNAs.

Figure 2
The correlation between MS-related viruses infection and MS risk. This result was calculated by hypergeometric test, and the p-value was log processed.

Figure 4
The MS risk pathway-pathway and pathway-gene network which were regulated by EBV. a The pathwaypathway network between MS risk pathways regulated by EBV. The blue diamonds represent MS risk pathways, and the line between two diamonds represents a signi cant correlation between these two pathways. b The number of associated pathways of each MS risk pathway in the pathway-pathway network. c The pathway-gene network between MS risk pathways which were regulated by EBV and their enriched genes. The blue diamonds represent MS risk pathways, the yellow nodes represent enriched genes, and the red nodes represent hub genes. d The number of enriched pathways of each hub gene in the pathway-gene network.