Integrated bioinformatics approach to understand immune-related key genes and pathways in chronic spontaneous urticaria

Chronic spontaneous urticaria (CSU) refers to recurrent urticaria that lasts for more than 6 weeks in the absence of an identifiable trigger. Because of its recurrent wheals and severe itching, CSU seriously affects patients' quality of life. There is currently no radical cure, and its unclear pathogenesis limits the development of targeted therapy. To reveal the underlying mechanism, two data sets with accession numbers GSE57178 and GSE72540 were downloaded from the Gene Expression Omnibus (GEO) database. After identifying the differentially expressed genes (DEGs) between CSU skin lesion samples and healthy controls, four kinds of analyses were performed: functional annotation; protein-protein interaction (PPI) network and module construction; co-expression and drug-gene interaction prediction analysis; and immune and stromal cell deconvolution analyses. The NF-κB signaling pathway and the Jak-STAT signaling pathway were found to be closely related to the development of CSU. Based on these genes and pathways, which are involved in a variety of immune responses and the chemotaxis of immune cells, xCell was used to analyze the immune cell infiltration of lesions from CSU patients. Results showed that immune cell infiltration in CSU was significantly different from that of the healthy control group.


Introduction
Urticaria is one of the most common diseases in dermatology and mainly manifests as wheals, angioedema, or both. According to whether the course of disease exceeds 6 weeks, urticaria can be divided into acute and chronic conditions. Chronic spontaneous urticaria (CSU) refers to recurrent urticaria that lasts for more than 6 weeks without an identifiable trigger, accounting for about 25% of all urticaria, with an incidence rate of 0.5-1%. In terms of its effect on patients' quality of life, urticaria is comparable to diabetes and coronary heart disease because of its repeated rashes and severe itching [1][2][3]. According to the EAACI/GA(2)LEN/EDF/WAO International Urticaria Guidelines (2018 Edition) [2], the treatment of this disease is still dominated by antihistamines. For intractable and refractory conditions, even when antihistamines are combined with hormones, immunomodulators, leukotriene receptor inhibitors, anticoagulants, or omalizumab, relapse cannot be fully prevented. At present, the possible pathologic factors of CSU include Th1/Th2 cell imbalance, abnormal activation of immune cells (such as mast cells and basophils), autoimmunity, abnormal blood coagulation, and infection, but its exact pathogenesis remains unknown [4][5][6]. Therefore, to better control the clinical symptoms of CSU, the molecular mechanism of its occurrence and development urgently needs to be revealed.
In recent years, bioinformatics analysis of gene expression profiles and other high-throughput data has played a key role in studying the pathogenesis of human diseases. However, analysis of a single gene expression profiling data set is often prone to large errors. Therefore, in this study, we for the first time jointly analyzed two datasets downloaded from the Gene Expression Omnibus (GEO), comprising lesions of 16 CSU patients and normal epidermis of 13 healthy individuals. The common differentially expressed genes (DEGs) of the two data sets were screened, and biological function and pathway enrichment analyses were performed. Through protein-protein interaction (PPI) network analysis and the GeneCards database, we identified four key genes: IL6, TLR4, ICAM1 and PTGS2. In addition, the module analysis showed that several key pathways are closely related to the occurrence and development of CSU and may serve as molecular targets for CSU treatment. We also used xCell for the first time to analyze the difference in immune infiltration between CSU tissue and normal tissue across 64 immune and stromal cell types. There was a significant difference in the characteristics of immune infiltration between the CSU group and the healthy control group.

Methods
Raw data collection
GEO (http://www.ncbi.nlm.nih.gov/geo) is a gene expression database created by NCBI that contains high-throughput gene expression data submitted by research institutes worldwide. Two microarray datasets (GSE57178 [7] and GSE72540 [8]) were downloaded from it. Table 1 shows the details of the two data sets. All patients had severely active CSU (weekly urticaria activity score, UAS7 ≥ 11) for at least 3 months, and standard-dose antihistamine therapy was ineffective.

Identification of DEGs
Raw data of the GSE57178 and GSE72540 datasets were read with the "affy" package, and the RMA algorithm was used for background correction and data normalization. The "limma" package was used to screen for DEGs. Probe sets without corresponding gene symbols were removed, and genes with more than one probe set were averaged. |logFC| > 1 and P-value < 0.05 were considered statistically significant.
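The probe handling and thresholding described above can be illustrated with a minimal Python sketch. It is not the R "affy"/"limma" workflow used in the study; the function names `collapse_probes` and `select_degs`, the input layouts, and the idea that log fold changes and P-values have already been computed elsewhere are all assumptions made for illustration.

```python
from collections import defaultdict


def collapse_probes(expr, probe_to_gene):
    """Average probe-level expression per gene (hypothetical helper).

    expr: dict mapping probe ID -> list of expression values, one per sample.
    probe_to_gene: dict mapping probe ID -> gene symbol ("" or missing if unannotated).
    """
    grouped = defaultdict(list)
    for probe, values in expr.items():
        gene = probe_to_gene.get(probe)
        if gene:  # probe sets without a gene symbol are dropped
            grouped[gene].append(values)
    # Genes with several probe sets are averaged sample by sample.
    return {
        gene: [sum(col) / len(col) for col in zip(*rows)]
        for gene, rows in grouped.items()
    }


def select_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Split genes into up- and downregulated DEGs.

    results: dict mapping gene -> (log fold change, P-value), assumed to come
    from a differential expression test run beforehand.
    """
    up, down = [], []
    for gene, (log_fc, p_value) in results.items():
        if p_value < p_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return sorted(up), sorted(down)
```

With the cut-offs stated in the text as defaults, `select_degs` keeps a gene only when both conditions hold, so a gene with a large fold change but P-value ≥ 0.05 is excluded.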

Enrichment analyses
Gene set enrichment analysis (GSEA) sorts genes according to their degree of differential expression between two types of samples and then checks whether a predefined gene set is enriched at the top or bottom of this ranked list [9]. Because GSEA does not first filter for differential genes, it retains the information from all genes and can find functional gene sets whose members are not individually strongly differential but show the same trend of change. The functional enrichment analysis tool (Funrich) was chosen to analyze the biological pathways of DEGs [10]. The pathway enrichment analyses of DEGs were performed with KOBAS 3.0 (http://kobas.cbi.pku.edu.cn). Two databases, "KEGG pathway" and "Reactome", were used for further analyses. Pathway analysis was conducted to find out which cellular pathways may be involved in the changes of DEGs, so that the key pathways related to DEGs could be identified. P-value < 0.05 was considered significant.
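The ranking idea behind GSEA can be sketched as a running-sum enrichment score. The Python below is a simplified illustration under stated assumptions, not the GSEA software used in the study: the function name `enrichment_score` and its arguments are hypothetical, the gene list is assumed to be pre-sorted by decreasing ranking score, and the permutation step that yields P-values and FDR is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene symbols sorted by decreasing ranking score.
    scores: the ranking scores, in the same order as ranked_genes.
    gene_set: set of gene symbols to test.
    weight: exponent applied to |score| for genes in the set.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_norm == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_norm  # step up at a set member
        else:
            running -= 1.0 / n_miss                 # step down otherwise
        if abs(running) > abs(best):
            best = running                          # keep the largest deviation from zero
    return best
```

A positive score indicates that the set is concentrated at the top of the ranked list, and a negative score that it is concentrated at the bottom.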

PPI network construction and analysis of key modules
The Search Tool for the Retrieval of Interacting Genes (STRING; http://string-db.org) (version 10.0), an online database of known and predicted protein interactions, was applied to predict the PPI network of DEGs. Only interactions with a combined score > 0.4 were retained. Cytoscape (version 3.6.1, http://www.cytoscape.org) was used to visualize molecular interaction networks, and its CytoNCA plug-in was used to analyze the topological properties of nodes in the PPI network, with the parameter set to no weight. By ranking the scores of each node, we obtained important nodes involved in protein interactions within the network. The Molecular Complex Detection plug-in (MCODE; version 1.5.1) of Cytoscape was applied to screen the most significant modules in the PPI network with MCODE scores > 5, degree cut-off = 2, node score cut-off = 0.2, max depth = 100 and k-core = 2.

Identification of genes of interest
Considering that most networks are scale-free, the hub genes with degrees ≥ 10 were chosen. In addition, we used the GeneCards database (https://www.genecards.org/) to identify other genes potentially related to CSU (relevance score ≥ 20) and then intersected them with the hub genes to obtain the key genes. A network of the key genes and their co-expressed genes was analyzed via GeneMANIA (http://www.genemania.org/). In addition, the Drug-Gene Interaction database (DGIdb 2.0; http://www.dgidb.org/), which mines existing resources and generates hypotheses about how genes might be therapeutically targeted or prioritized for drug development, was used to predict drugs based on the genes of interest. The parameters were set as follows: preset filters, FDA approved and Immunotherapies; all others, default. All drug-gene relationship pairs related to the key genes were predicted, and the network map was drawn with Cytoscape.
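The hub-gene and key-gene selection described above can be sketched in a few lines of Python. This is an illustration only, not the Cytoscape/CytoNCA procedure used in the study: the function names `hub_genes` and `key_genes` are hypothetical, edges are assumed to be (gene, gene, combined score) triples with scores on a 0 to 1 scale, and GeneCards relevance scores are assumed to be supplied as a plain dictionary.

```python
from collections import defaultdict


def hub_genes(edges, score_cutoff=0.4, min_degree=10):
    """Return genes whose degree in the filtered PPI network is >= min_degree.

    edges: iterable of (gene_a, gene_b, combined_score) triples.
    """
    neighbors = defaultdict(set)
    for gene_a, gene_b, score in edges:
        if score > score_cutoff and gene_a != gene_b:
            neighbors[gene_a].add(gene_b)  # sets avoid counting duplicate edges twice
            neighbors[gene_b].add(gene_a)
    return {gene for gene, nbrs in neighbors.items() if len(nbrs) >= min_degree}


def key_genes(hubs, relevance, min_relevance=20.0):
    """Intersect hub genes with genes passing the relevance cut-off.

    relevance: dict mapping gene -> relevance score (assumed input format).
    """
    return sorted(g for g in hubs if relevance.get(g, 0.0) >= min_relevance)
```

Counting distinct neighbors rather than raw edges keeps the degree stable if the same interaction appears twice in the input.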

Immune and stromal cells deconvolution analyses
xCell, a novel gene signature-based method, was used to infer 64 immune and stromal cell types; the method was developed with extensive in silico analyses and compared to cytometry immunophenotyping [11]. By applying xCell to the microarray data and using the Wilcoxon test to compare groups, the estimated proportion of immune and stromal cell types was obtained for each sample. The cut-off value for the cell analyses was P-value < 0.05.
Cell types were categorized into lymphoid, myeloid, stromal, stem cells, and others. Venn diagrams were used to compare common cell types from different datasets.
Correlation analysis of key genes and infiltrating immune cells
Spearman correlation analysis was performed on key genes and infiltrating immune cells using the "ggstatsplot" software package, and the results were visualized using the "ggplot2" software package.
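For readers unfamiliar with the statistic, Spearman correlation is the Pearson correlation of the ranks of the two variables. The Python sketch below illustrates this for one gene and one cell type; it stands in for, and is not, the R "ggstatsplot" call used in the study. The function names are hypothetical, the inputs are assumed to be equal-length lists of per-sample values, and no P-value is computed.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

Because only ranks enter the calculation, any monotonic relationship between a gene's expression and a cell type score gives a coefficient of 1 or -1, regardless of its shape.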

Results
Identification of DEGs in CSU
After standardizing the microarray results, DEGs (292 in GSE57178 and 1221 in GSE72540) were identified (Figure 1A and 1B). A total of 99 genes overlapped between the two datasets, as shown in the Venn diagram (Figure 1C), consisting of 92 upregulated genes and 7 downregulated genes. The heat map shows that these DEGs can largely distinguish CSU lesion samples from healthy control samples (Figure 2).
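The overlap shown in the Venn diagram amounts to a set intersection. The short Python sketch below illustrates it under the assumption, not stated explicitly in the text, that a common DEG must change in the same direction in both datasets; the function name `common_degs` and the list inputs are hypothetical.

```python
def common_degs(up_a, down_a, up_b, down_b):
    """Return (common upregulated, common downregulated) genes of two datasets.

    Each argument is an iterable of gene symbols from one dataset's DEG list.
    """
    common_up = sorted(set(up_a) & set(up_b))
    common_down = sorted(set(down_a) & set(down_b))
    return common_up, common_down
```

Genes that are upregulated in one dataset and downregulated in the other are excluded by this rule.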

Analysis of the functional characteristics
All gene expression information of CSU and healthy control samples from the two datasets was uploaded to GSEA, and the hallmark gene set database was used to analyze genes at the level of the overall expression profile. Gene sets with P-value < 0.05 and FDR < 0.25 (the default cut-off) were considered significantly enriched.
The gene set enrichment analysis showed that four common gene sets were significantly enriched in CSU samples (Figure 3): TNF-α signaling via NF-κB, inflammatory response, IL-6/JAK/STAT3 signaling and interferon-gamma response. Enrichment analysis of biological processes (BP) showed that the common DEGs are mainly enriched in the inflammatory response and leukocyte migration. The top 10 biological processes with P-value < 0.05 were selected, and a bar graph was drawn based on P-value and gene count (Figure 4A and Table 2). Metascape was used to visualize the interactive network of BP. The top five enriched terms are myeloid leukocyte activation, leukocyte migration, cytokine-mediated signaling pathway, positive regulation of defense response and cellular response to interferon-gamma (Figure 4B).
A total of 99 DEGs were uploaded to Funrich for enrichment analysis of biological pathways. According to the results, DEGs were mainly enriched in "IL6-mediated signaling events" and "formyl peptide receptors bind formyl peptides and many other ligands", as shown in Figure 5. The top 20 terms of the pathway enrichment results from the two databases, "KEGG pathway" and "Reactome", are shown in Figure 6A and 6B (Tables 3 and 4).

PPI network construction and module analysis
The PPI network of DEGs with combined scores greater than 0.4 was generated by Cytoscape and contained 87 nodes and 347 interaction pairs (Figure 1D). Two modules were identified and created as subnetworks (Figure 7A and 7C). In addition, KEGG pathway enrichment analysis of the genes included in each subnetwork was performed, which revealed that DEGs in the modules were mainly associated with the TNF signaling pathway (involving PTGS2, IRF1, SOCS3, SELE, IL6 and ICAM1), the NF-κB signaling pathway (involving PTGS2, CD14, CCL13, ICAM1 and TLR4) and the Jak-STAT signaling pathway (involving IL6, MYC, OSMR, SOCS3 and PIM1) (Figure 7B and 7D).

Immune and stromal cells deconvolution analyses
Due to technical limitations, the status of immune infiltration in CSU has not been fully revealed, especially in subpopulations with low cell abundance. To determine the cell types that may be involved in

Discussion
As an immune-mediated inflammatory disease, CSU is closely related to autoimmune processes and the systemic inflammatory response. The autoimmunity theory and the Th1/Th2 cell imbalance hypothesis have been the most discussed explanations of its pathogenesis [6,12]. To better understand its pathogenesis, this article explores CSU at the molecular level. In this study, through protein-protein interaction (PPI) network analysis and the GeneCards database, we identified four key genes: IL6, TLR4, ICAM1, and PTGS2. Through the enrichment analysis of the core modules, three signaling pathways, the TNF signaling pathway, the NF-κB signaling pathway and the Jak-STAT signaling pathway, were found to be closely related to the occurrence and development of CSU. Because these genes and pathways are involved in a variety of immune responses and in the chemotaxis of immune cells, we used xCell to analyze the immune cell infiltration of lesions from CSU patients. Results showed that immune cell infiltration in CSU was significantly different from that of the healthy control group.
Dysregulation of the TNF-α signaling pathway is an important feature and pathogenic factor of various diseases, including sepsis, cancer, and autoimmune and inflammatory diseases [13]. Previous studies reported upregulated expression of TNF-α in skin biopsies of patients with different types of urticaria [14]. In addition, the serum concentrations of TNF-α and its soluble receptor types 1 and 2 (sTNFR1 and sTNFR2, respectively) were significantly increased in CSU patients, suggesting that activation of the TNF-α signaling pathway is related to the development of CSU [15]. Furthermore, the significant effect of TNF-α inhibitors on refractory urticaria in which other treatments had proved ineffective also points to the importance of the TNF-α signaling pathway in the development of CSU [16,17]. However, controlled studies of larger scale and with longer follow-up periods are needed to confirm the efficacy and safety of TNF-α inhibitors in the treatment of CSU patients.
The NF-κB signaling pathway plays an important role in regulating the survival and activation of T and B lymphocytes in the thymus, bone marrow, spleen, and periphery [18]. Dysregulation of NF-κB signaling leads to its constitutive activation, which in turn leads to autoimmunity and chronic inflammation. Many autoimmune diseases have been shown to be associated with dysregulation of NF-κB signaling, including type 1 diabetes, systemic lupus erythematosus, and rheumatoid arthritis [19]. TNF is a major inflammatory cytokine that activates NF-κB, with most of its inflammatory effects mediated by TNF receptor 1 (TNFR1) [20, 21]. The positive effect of neutralizing TNFR1-NF-κB signaling in autoimmune and inflammatory syndromes has been confirmed [22]. Our GSEA results showed that the "TNF-α signaling via NF-κB" gene set was significantly enriched in CSU patients in both data sets, which supports the autoimmunity theory of CSU.
The Jak/STAT3 signaling pathway, a pathway closely related to inflammatory reactions, is involved in the development of chronic inflammatory diseases such as atopic dermatitis and psoriasis. The Jak inhibitor JTE-052 has shown good efficacy in the treatment of hapten-induced chronic dermatitis in rats and is expected to become a candidate drug for the treatment of chronic dermatitis.
PTGS2, also known as cyclooxygenase 2 (COX2), is a key enzyme in the biosynthesis of prostaglandin D2 (PGD2), which is involved in inflammation. High expression of PTGS2 can promote the synthesis of large amounts of PGD2 and aggravate the inflammatory response in CSU patients. In addition, IL6 classical signaling can enhance FcεRI-induced COX2 expression, thereby enhancing IgE-dependent PGD2 production by human tissue-derived mast cells [41]. Endothelial dysfunction may increase vascular permeability, leading to a pro-inflammatory response. ICAM1, a biomarker of endothelial dysfunction, is upregulated in skin biopsies of CSU patients, reflecting the pro-inflammatory phenotype of the endothelium [42]. Circulating soluble ICAM1 may also play a role in the pathogenesis of CSU; however, it does not parallel disease activity, nor can it predict the efficacy of H1 antihistamine therapy [43]. TLR4, the earliest discovered Toll-like receptor (TLR), can, once activated, regulate the expression of various genes such as IL6, ICAM1, and COX2 through NF-κB signaling. At the same time, the massive production of Th2 cytokines also leads to an imbalance of Th1/Th2 cells [44,45], promoting the development of CSU.
From the immune infiltration analysis, we found that CSU tissue generally contained a higher proportion of DC, Th2 cells, mast cells, MEP, preadipocytes, and M1 macrophages. Type I allergy is known to play an important role in CSU. DC, as the main antigen-presenting cells, present allergens to B cells and activate them to generate plasma cells, which synthesize and secrete IgE. In addition, as previously mentioned, Th2 cells can help activate B cells to produce IgE and can further amplify the inflammatory response by producing cytokines such as IL-4, IL-10, and IL-13. As key effector cells in the pathogenesis of CSU, mast cells release inflammatory mediators such as histamine and PGD2 after activation, resulting in increased vascular permeability and recruitment of inflammatory cells, which in turn cause symptoms such as wheals, itching, and edema [46]. In addition, M1 macrophages can produce pro-inflammatory factors, such as IL-6 and TNF, and participate in the inflammatory response of CSU [47]. Our results are largely consistent with previous reports on immune cell infiltration in CSU, which in turn supports the reliability of our analysis. However, no studies on the relationship of MEP or preadipocytes with CSU have been reported, and this potential connection is worth further exploration.
In addition, we analyzed the correlation between IL6, TLR4, ICAM1, and PTGS2 and immune cells, and found that IL6 was positively correlated with activated Th2 cells, mv Endothelial cells, and preadipocytes; PTGS2 was positively correlated with neurons, M1 macrophages, and Th2 cells; ICAM1 was positively correlated with Th2 cells, mast cells, MEP, M1 macrophages, mv Endothelial cells, and preadipocytes; and TLR4 was positively correlated with mv Endothelial cells, preadipocytes, M1 macrophages, and MEP. This suggests that these key genes also play an important role in the immune infiltration of CSU. However, the specific impact of these differentially expressed genes on the immune infiltration of CSU lesions needs further study.
We acknowledge that this study has certain limitations. First, the sample size we analyzed is relatively small. Second, this is a retrospective study and all the data are from publicly available databases. Third, the results require further confirmation, for example by in vivo or in vitro experiments. Nevertheless, to our knowledge, no similar research on CSU has been done before. Our results may provide new insights into the occurrence and development of CSU.

Conclusions
In summary, the purpose of this study was to identify DEGs that may be associated with the pathogenesis of CSU. A total of 4 key genes were identified, which may serve as markers for CSU or as drug therapy targets. In addition, three signaling pathways were found to be closely related to the occurrence and development of CSU: the TNF signaling pathway, the NF-κB signaling pathway and the Jak-STAT signaling pathway. By combining a reliable deconvolution algorithm with large-scale genomic data, we found a difference in immune infiltration between the CSU group and the healthy control group. However, the relationship between key genes and immune infiltration, and the biological functions of key genes and immune infiltration profiles in CSU, need further study.

GeneCards relevance scores (fragment): CCL5 (20.63), CDH1 (20.58), MYD88 (20.49), WAS (20.43), CXCR4 (20.19), CASP1 (20.06), CST3 (20.04), MIR223 (20.03).
Figure caption: The PPI network of DEGs was constructed using Cytoscape. Upregulated genes are marked in light red; downregulated genes are marked in light blue.
Figure caption: The biological pathways of DEGs by Funrich. P-value < 0.05 was considered significant.