Genetic Characteristics of Systemic Juvenile Idiopathic Arthritis and the Bioinformatics Basis for Treatment

Objective: Systemic juvenile idiopathic arthritis (sJIA) is a distinctive subtype of juvenile idiopathic arthritis (JIA). Its pathogenesis remains unclear and treatment options are limited. Although previous bioinformatics analyses have identified some genetic factors underlying sJIA, these were mostly single-center studies with small sample sizes, and their results were often inconsistent. Herein, we combined two datasets, GSE20307 and GSE21521, and selected the expression matrices of patients diagnosed with sJIA for further analysis.
Methods: The GSE20307 and GSE21521 matrices were downloaded from the Gene Expression Omnibus (GEO) and analyzed using the online tools GEO2R, Venny, Metascape and STRING, together with Cytoscape, to identify differentially expressed genes (DEGs), enriched pathways, the protein-protein interaction (PPI) network, the main module and hub genes in sJIA patients compared with healthy controls.


Introduction
As a distinctive subtype of juvenile idiopathic arthritis (JIA), systemic juvenile idiopathic arthritis (sJIA) is recognized as a multisystem inflammatory syndrome with extra-articular involvement, and it accounts for 10%-20% of JIA. The clinical manifestations of sJIA include fever, rash, arthritis, generalized lymphadenopathy, hepatosplenomegaly and polyserositis [1]. More seriously, a potentially fatal feature associated with sJIA is macrophage activation syndrome, which can affect one-third of sJIA patients and is characterized by the excessive activation of well-differentiated macrophages, resulting in high fever, hepatosplenomegaly, cytopenias, and intravascular coagulation. Additionally, 50% of patients continue to suffer from active arthritis after being diagnosed with sJIA [2]. Notably, sJIA has a great impact on children's physical and psychological health and seriously affects their quality of life. Therefore, in order to better understand and treat the disease, extensive research has been conducted in the field and notable results have been reported [3][4][5]. Presently, the consensus is that, unlike other JIA subtypes, sJIA is an autoinflammatory rather than an autoimmune condition, and that it resembles a dysregulation of the innate immune response more than an adaptive immune disorder [6,7]. In the pathogenesis of sJIA, monocytes and neutrophils rather than lymphocytes occupy an important position [8], and pro-inflammatory cytokines such as interleukins 1, 6 and 18 (IL-1, IL-6 and IL-18) and tumor necrosis factor (TNF) have been observed as the predominant cytokines. Some researchers also believe that sJIA is a systemic immune-inflammatory reactive disease triggered by infection in people with a genetic predisposition [9]. Nonetheless, information regarding the pathogenesis of sJIA is still scarce despite the extensive effort that has been made, and this makes surveillance as well as treatment of the disease rather challenging.
Bioinformatics analysis is a reliable way to analyze disease-related pathways, find biomarkers and predict therapeutic targets from a genomic and transcriptomic perspective. It has therefore been increasingly used by researchers in recent years, especially in cancers and autoimmune diseases [10]. Notably, a genetic predisposition to sJIA was previously identified through peripheral blood genomic analysis [11,12]. However, these were usually single-center studies with small sample sizes, and their results were often inconsistent.
Thus, in this study, we comprehensively analyzed two public GEO datasets by bioinformatic analysis to identify potential pathogenic pathways and candidate disease-related genes in sJIA. The sJIA patients and healthy controls in GSE20307 and GSE21521 were selected for our study. The expression profiles in both datasets were obtained from peripheral blood mononuclear cells (PBMCs) and were based on the same platform, which should make the combined results more reliable. Our study aimed to identify the genetic characteristics of sJIA and to provide a basis for its treatment from a bioinformatic perspective.

Gene expression profile data
The two gene expression profile datasets (GSE20307 and GSE21521) were downloaded from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo). The gene expression profiles of both datasets were obtained from PBMCs and were based on the GPL570 platform ([HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array). Only the profiles of sJIA patients and healthy controls were selected for the study: 20 sJIA patients and 50 healthy controls from GSE20307, and 18 sJIA patients and 29 controls from GSE21521.

Overlapping DEG identification
GEO2R, an interactive online tool (http://www.ncbi.nlm.nih.gov/geo/geo2r/), was used to screen for DEGs between sJIA patients and healthy controls, with cut-off criteria of |log2FC| > 0.5 and adjusted P < 0.05. The online tool Venny (version 2.1, available online: http://bioinfogp.cnb.csic.es/tools/venny/index.html) was then used to obtain the DEGs that overlapped between the two datasets.
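The screening and overlap step described above can be illustrated with a short sketch. It is not the GEO2R or Venny implementation; it only applies the stated cut-offs (|log2FC| > 0.5, adjusted P < 0.05) to rows of (gene, log2FC, adjusted P) and intersects the two results. The gene names and values are hypothetical placeholders, and requiring the same direction of change in both datasets is an assumption made here for illustration.

```python
from typing import Dict, Iterable, Tuple

Row = Tuple[str, float, float]  # (gene symbol, log2 fold change, adjusted P)


def select_degs(rows: Iterable[Row],
                logfc_cutoff: float = 0.5,
                adj_p_cutoff: float = 0.05) -> Dict[str, str]:
    """Return {gene: 'up' | 'down'} for rows passing both cut-offs."""
    degs = {}
    for gene, logfc, adj_p in rows:
        if abs(logfc) > logfc_cutoff and adj_p < adj_p_cutoff:
            degs[gene] = "up" if logfc > 0 else "down"
    return degs


def overlapping_degs(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, str]:
    """Keep genes that are DEGs in both datasets with the same direction.

    Assumption: direction must agree; a plain Venn overlap would only
    require the gene to appear in both lists.
    """
    return {gene: d for gene, d in a.items() if b.get(gene) == d}


# Hypothetical toy tables standing in for the two GEO2R result tables.
dataset_a = [("GENE_A", 1.2, 0.001), ("GENE_B", -0.8, 0.01),
             ("GENE_C", 0.3, 0.001), ("GENE_D", 0.9, 0.20)]
dataset_b = [("GENE_A", 0.7, 0.02), ("GENE_B", -0.6, 0.04),
             ("GENE_C", 0.9, 0.001), ("GENE_D", 1.1, 0.01)]

overlap = overlapping_degs(select_degs(dataset_a), select_degs(dataset_b))
print(sorted(overlap.items()))
```

In this toy input, GENE_C fails the fold-change cut-off and GENE_D fails the adjusted P cut-off in the first table, so neither is retained in the overlap.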

Functional and pathway analysis of DEGs
Gene Ontology (GO) term [10] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [11] pathway enrichment analyses are two main approaches used in bioinformatics to explore potential biological functions. GO terms are grouped into Cellular Component (CC), Molecular Function (MF) and Biological Process (BP). Metascape, a convenient and up-to-date online gene annotation and analysis tool, was used to analyze the overlapping DEGs. Pathway and process enrichment was evaluated for the overlapping DEGs, and the statistical significance of each term was assessed. Only terms with P < 0.01, a minimum count of 3 and an enrichment factor > 1.5 were considered significant. Genes were also clustered according to their pathways and the results were visualized using bar charts.
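As an illustration of the three significance criteria above, the following sketch tests a single term. It assumes the P value comes from a cumulative hypergeometric test and that the enrichment factor is the ratio of the observed count to the count expected by chance; these definitions and all function names are assumptions of this sketch, not a reproduction of Metascape's code.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment_factor(k: int, n: int, K: int, N: int) -> float:
    """Observed fraction of annotated genes divided by the background fraction."""
    return (k / n) / (K / N)


def is_significant(k: int, n: int, K: int, N: int,
                   p_cutoff: float = 0.01,
                   min_count: int = 3,
                   min_factor: float = 1.5) -> bool:
    """Apply the three cut-offs: P < 0.01, count >= 3, enrichment factor > 1.5."""
    return (k >= min_count
            and enrichment_factor(k, n, K, N) > min_factor
            and hypergeom_upper_tail(k, n, K, N) < p_cutoff)


# Hypothetical numbers: 3 of 20 input genes carry a term that annotates
# 10 of 1000 background genes.
print(enrichment_factor(3, 20, 10, 1000))
print(is_significant(3, 20, 10, 1000))
```

Here k is the number of input genes annotated with the term, n the size of the input list, K the number of background genes with the term and N the size of the background.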

Construction of the PPI network, module analysis and identification of significant candidate genes
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, https://string-db.org/) was used to construct a protein-protein interaction (PPI) network of the overlapping DEGs, with a combined score > 0.4 as the threshold for a statistically significant interaction. Cytoscape (version 3.4.0; http://www.Cytoscape.org) was then used to visualize the PPI network, and the plug-in Molecular Complex Detection (MCODE) [12] was used to further identify important modules in the PPI network [13]. The recognition criteria were MCODE score > 5, degree cut-off = 2, node score cut-off = 0.2, max depth = 100 and k-core = 2. Additionally, hub genes in the PPI network were filtered using cytoHubba based on node degree (connectivity). The nodes with a high degree were identified as hub genes, which might be key candidate genes in the pathogenesis of the disease.
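The thresholding and degree-based ranking described above can be sketched as follows. This is a minimal illustration, not the STRING, MCODE or cytoHubba implementation: it keeps interactions whose combined score exceeds 0.4, counts each node's degree and returns the highest-degree nodes. MCODE module detection is not reproduced. The protein names and scores are hypothetical, and scores are assumed to be on a 0 to 1 scale.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List, Tuple

Edge = Tuple[str, str, float]  # (protein A, protein B, combined score in 0..1)


def build_network(edges: Iterable[Edge],
                  min_score: float = 0.4) -> Dict[FrozenSet[str], float]:
    """Keep undirected edges above the score threshold, dropping self-loops."""
    network = {}
    for a, b, score in edges:
        if a != b and score > min_score:
            # frozenset makes (A, B) and (B, A) the same undirected edge
            network[frozenset((a, b))] = score
    return network


def node_degrees(network: Dict[FrozenSet[str], float]) -> Counter:
    """Count how many retained edges touch each node."""
    degrees = Counter()
    for pair in network:
        for node in pair:
            degrees[node] += 1
    return degrees


def top_hubs(network: Dict[FrozenSet[str], float], n: int = 10) -> List[str]:
    """Return the n highest-degree nodes (ties broken alphabetically)."""
    degrees = node_degrees(network)
    return sorted(degrees, key=lambda g: (-degrees[g], g))[:n]


# Hypothetical interactions; the last row duplicates the first in reverse.
toy_edges = [("P1", "P2", 0.9), ("P1", "P3", 0.7), ("P1", "P4", 0.45),
             ("P2", "P3", 0.5), ("P4", "P5", 0.3), ("P2", "P1", 0.9)]

network = build_network(toy_edges)
print(len(network), "edges retained")
print(top_hubs(network, n=2))
```

Ranking by degree alone favors highly connected nodes regardless of edge confidence; other centrality measures could be substituted in `top_hubs` if a different notion of importance is wanted.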

Identification of DEGs in sJIA
The sJIA patients and healthy controls in GSE20307 and GSE21521 were included in our study because both datasets were based on the same platform and the gene profiles were obtained from PBMCs. A total of 559 and 322 DEGs were identified from GSE20307 and GSE21521, respectively. A total of 289 genes (41 downregulated and 248 upregulated) overlapped between the two datasets, as shown in the Venn diagram (Figure 1); the details are shown in Table 1.

Enrichment analysis of the overlapping DEGs
The online tool Metascape was used to analyze the GO and KEGG pathway enrichment of the overlapping DEGs. In the GO term analysis (Figure 2), the overlapping DEGs were primarily enriched, for biological processes (BP) (Figure 2a), in myeloid leukocyte activation, myeloid cell differentiation, reactive oxygen species metabolic process, hydrogen peroxide catabolic process and porphyrin-containing compound metabolic process. For molecular function (MF) (Figure 2b), the enriched terms were mainly hemoglobin binding, organic acid binding, haptoglobin binding, cofactor transmembrane transporter activity and protein kinase inhibitor activity. For cellular component (CC) (Figure 2c), the DEGs were significantly enriched in specific granule, specific granule lumen, spectrin-associated cytoskeleton, haptoglobin-hemoglobin complex and mitochondrial outer membrane. In addition, the KEGG pathway enrichment analysis (Figure 2d) showed that the overlapping DEGs were mainly involved in Malaria (hsa05144), Non-small cell lung cancer (hsa05223), Mitophagy - animal (hsa04137), Transcriptional misregulation in cancer (hsa05202) and Porphyrin and chlorophyll metabolism (hsa00860).

Construction of the Protein-protein Interaction (PPI) network, Module Analysis and Identification of Hub genes
A PPI network of the overlapping DEGs was constructed on the STRING website and visualized with Cytoscape; nodes that were poorly connected to the rest of the network were removed. The resulting PPI network comprised 203 nodes and 622 edges.
Unfortunately, the pathogenesis and pathological process of sJIA remain largely unclear, and treatment remains challenging. Notably, the recent development of high-throughput technologies has led to a better understanding of the pathogenesis of various diseases [14], and more genes have been discovered and validated by high-throughput sequencing. Our study combined two cohort profile datasets with integrated bioinformatics methods and ultimately identified candidate genes and pathways in the pathogenesis of sJIA. In addition, our results revealed an important module in sJIA, and KEGG pathway analysis of this module showed that its genes were associated with ubiquitin-mediated proteolysis.
Moreover, the top 10 hub genes identified in the PPI network were all present in this module, suggesting that ubiquitin-mediated proteolysis is an important process in the pathogenesis of sJIA.
Among the DEGs, 10 hub genes were selected, 8 of which were related to erythropoiesis. These included genes encoding both adult and fetal hemoglobin, as well as genes coding for structural proteins of red blood cells and for proteins and enzymes on the cell surface, similar to previous studies [15][16][17]. Hinze et al. reported that an erythropoiesis signature was present in sJIA with anemia but not in other JIA subtypes with anemia [18]. Additional research also showed that the erythropoiesis signature was characteristic of active sJIA (with fever) but not of the inactive form of the disease (without fever) [15,19]. Moreover, Hinze et al. showed that the index of the erythropoiesis signature decreased as sJIA improved [19]. Further research by Fall et al. using flow cytometry also revealed that sJIA patients had an expansion of precursor cells, with a higher proportion of CD34+ and CD15+CD16− immature PBMC subgroups [16]. This was consistent with the BP terms enriched in the GO analysis in this study. The results of the current study also revealed myeloid cell activation in sJIA. Overall, the findings corroborate those of previous studies in showing that there is indeed an increase in erythropoiesis genes and myeloid activation in sJIA.
The simultaneous increase in the erythropoiesis signature and precursor cell activation in sJIA may be due to the increase in inflammatory cytokines [15]. An increase in the erythropoiesis signature was previously reported in adult patients with rheumatoid arthritis and anemia [20], in which IL-6 plays an important role. Therefore, the erythropoiesis observed in sJIA may have similar pathogenic mechanisms [21]. IL-6 has an important effect on bone marrow hematopoiesis, and hyper-IL-6 plays a regulatory role in the differentiation of myeloid and erythroid progenitor cells derived from human cord blood. IL-6 can directly stimulate glycoprotein (gp) 130 (the membrane-anchored signal-transducing receptor component of IL-6), effectively stimulating the in vitro expansion of human CD34+ stem/progenitor cells and promoting erythropoiesis [22]. Moreover, the administration of IL-6 was reported to stimulate multi-lineage hematopoietic function and accelerate recovery from radiation-induced hematopoietic hypofunction. Furthermore, IL-6 has a strong effect on other aspects of hematopoiesis [23]. For example, it can induce the expression of hepcidin in liver cells, reduce the absorption of iron in the intestine and induce the expression of ferritin in monocytes/macrophages [24]. As a result, serum iron is retained in the periphery and the available iron reaching the hematopoietic area is reduced. This may be the reason behind the increased production of red blood cells. In addition, the increased expression of erythropoiesis genes may be related to the hemophagocytic syndrome in sJIA. It was previously shown that two-thirds of sJIA patients have macrophage polarization and hemophagocytic syndrome [25]. In the presence of the hemophagocytic syndrome, a large number of blood cells are destroyed and erythrocytes are renewed, potentially leading to the secondary expression of the erythropoiesis signature.
In this study, the MF analysis showed enrichment of hemoglobin binding and haptoglobin binding, which may be another indication of an increase not only in the erythropoiesis signature but also in erythrocyte precursors.
Another finding of our study is that, in the module analysis of the PPI network, the main module was related to ubiquitin-mediated proteolysis, that is, the ubiquitin-proteasome system (UPS). UPS is another important route of protein degradation in eukaryotes besides the autophagy-lysosome pathway. It was first reported in reticulocytes, and studies have found that it is more active in erythrocyte precursor cells [26]. UPS can program the degradation of pre-existing cellular proteins in terminally differentiated erythroid precursors and reshape their proteome, thereby simplifying the cellular proteome of mature erythrocytes. This is essential for maintaining the normal function of red blood cells. Furthermore, Grune et al. showed that the proteasome in erythrocytes plays a key role in the degradation of oxidized hemoglobin [27]. Further research by Hanash et al. on reticulocytes and normal red blood cell lysates also showed that early erythrocyte precursors had a greater ability to reduce excess alpha chains than mature erythrocytes [28]. Additional studies reported that UPS is a protective mechanism against hemoglobinopathies, as it can degrade unstable globins, especially in thalassemia [29]. Therefore, activation of the ubiquitin-proteasome system in sJIA further suggests that there is indeed an increase in the expression of erythropoiesis genes, an increase in erythrocyte precursor cells and possibly abnormal hemoglobin chains.
In any case, the important role of the proteasome in immune diseases has received increasing attention. Existing evidence suggests that UPS activity is increased in rheumatoid arthritis, systemic lupus erythematosus and other autoimmune inflammatory diseases. It was reported that the constitutive subunits of the proteasome are replaced by inducible subunits (β1i/LMP2; β2i/Mecl-1; β5i/LMP7) during inflammation [30], leading to the formation of the immunoproteasome (IP). The IP is related to various biological processes in autoimmune diseases, such as major histocompatibility complex (MHC)-mediated antigen presentation, B cell maturation and antibody secretion, Th1 and Th17 differentiation, production of inflammatory cytokines and macrophage polarization [31]. Moreover, the most important link between UPS and inflammation is NF-κB [32], which is the main regulator of many inflammatory cytokine genes and whose activation is mediated by UPS. UPS regulates the degradation of IκB and thereby controls the activity of NF-κB, hence regulating the secretion of NF-κB-related inflammatory cytokines (including TNFα, IL-1β, IL-6 and IL-10). Related studies also showed that inhibiting UPS can interfere with the antigen presentation role of T cells, inhibit the production of Th17 cells and reduce the quantity and quality of autoantibodies [33]. Accordingly, UPS inhibitors such as bortezomib (BTZ) have shown effectiveness in the treatment of autoimmune diseases in human and animal studies [34,35]. In short, taking previous research together with the results of our study, we believe that UPS inhibitors may be an effective alternative treatment option for sJIA.
The KEGG pathway enrichment analysis also showed that the Malaria and Non-small cell lung cancer pathways were enriched in sJIA. We believe that the enrichment of the malaria pathway provides a basis for the application of hydroxychloroquine in sJIA. Hydroxychloroquine is an effective anti-malarial drug that has also proven to be effective in the control of rheumatic diseases [36]. Studies have shown that hydroxychloroquine can inhibit the antigen presentation function of the autophagy-lysosome pathway by destroying membrane stability, interfering with the activity of lysosomes and impairing the maturation of lysosomes and autophagosomes [37]. The drug also directly and indirectly inhibits the toll-like receptor signaling pathways and reduces the production of cytokines mediated by macrophages [38]. Moreover, it was reported that hydroxychloroquine can inhibit T and B cell receptor calcium signaling [39] and reduce the expression of CD154 in T cells [40], thereby inhibiting T cell proliferation and immunoglobulin production. Therefore, based on previous studies and the KEGG enrichment results obtained herein, hydroxychloroquine may be applicable in the management of sJIA, given that the malaria pathway was enriched in the disease.
Moreover, the results of the KEGG analysis suggested that non-small cell lung cancer (NSCLC)-related pathways were involved in the pathogenesis of sJIA. It is noteworthy that rheumatoid arthritis and cancer have certain aspects in common. Both involve an increase in inflammatory cytokines (including IL-6, IL-23, TNF-α, IL-1 and IL-17) and the proliferation of related cells. On the one hand, previous studies showed that the above-mentioned cytokines play an important role in the occurrence and development of autoimmune inflammatory diseases as well as of non-small cell lung cancer [41]. On the other hand, recent reports indicate that some kinase inhibitors used for the treatment of NSCLC also have some efficacy against rheumatic diseases [42]. Animal studies revealed that the protein tyrosine kinase (PTK) inhibitors imatinib and nilotinib can regulate the processing of antigen peptides by the proteasome and inhibit the function of T cells, thus alleviating the effects of collagen-induced rheumatoid arthritis in mice [43]. In addition, the epidermal growth factor receptor (EGFR) kinase inhibitors erlotinib and gefitinib were shown to be able to treat non-cancer-related TNF-α-mediated inflammatory autoimmune diseases [44]. In short, given the enrichment of the NSCLC pathway, kinase inhibitors used for non-small cell lung cancer may also be effective in sJIA, which provides a potential alternative for the management of the disease.

Conclusion
This study comprehensively analyzed datasets from two centers and, combined with previous research reports, revealed that there is indeed increased erythropoiesis in sJIA. In addition, from a bioinformatics perspective, our study provides a basis for the application of proteasome inhibitors, hydroxychloroquine and kinase inhibitors in patients with sJIA.

Gene symbol, Function, Degree

EPB42
Erythrocyte membrane protein band 4.2 is an ATP-binding protein that probably has a role in the regulation of erythrocyte shape and mechanical properties. Gene Ontology (GO) annotations related to this gene include structural constituent of cytoskeleton and protein-glutamine gamma-glutamyltransferase activity.

SLC4A1
The protein encoded by this gene is part of the anion exchanger (AE) family and is expressed in the erythrocyte plasma membrane. Among its related pathways are transport of glucose and other sugars, bile salts and organic acids, metal ions and amine compounds and Neuroscience. Gene Ontology (GO) annotations related to this gene include protein homodimerization activity and transporter activity.

ALAS2
The product of this gene specifies an erythroid-specific mitochondrially located enzyme.

Top 10 hub genes and their interactions in the PPI network, constructed with cytoHubba in Cytoscape based on degree score. The color scale represents the degree score.