RNA-Seq and Network Analysis Reveal Unique Chemokine Activity Gene Expression Signatures in Synovial Tissue From Rheumatoid Arthritis

Objective: This study identied to provide a comprehensive understanding of genome-wide expression patterns of synovial tissue from rheumatoid arthritis (RA) patients to investigate the potential mechanism RA occurrence and development. Methods: The transcription proles of 9 RA and 15 control (osteoarthritis OA) synovial tissue were generated by RNA-Seq. Gene set enrichment analysis (GSEA) was used to analyze all detected genes and differentially expressed genes (DEGs) were identied by DESeq. To further analyze the DEGs, Gene ontology (GO) functional enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed. The protein-protein interaction (PPI) network of DEGs were constructed by STRING and the hub genes were identied by topology clustering with MCODE-Cytoscape. The most important hub genes were validated by quantitative real time polymerase chain reaction (qRT-PCR). Results: A total of 17736 genes were detected and 651 DEGs were identied. For the GSEA, signicantly enriched gene sets positively correlated with the RA group were CD40 signaling over-activation, Th1 cytotoxic module in C2, over-activation of immune response, adaptive immune response in C5, In C7, the up-regulation of the gene set of effective versus memory CD8 T cell is related to RA group. The down-regulation of gene set of naïve versus effective CD8 T cell is related to RA group.in C7. Biology process enrichment analysis showed that the DEGs were signicantly enriched for signal transduction (P=1.52×10 -08 ), immune response (P=1.94×10 -22 ) and inammatory response (P=1.11×10 -11 ). Molecule function enrichment analysis revealed over-represented calcium ion binding (P=8.61×10 -03 ), receptor binding (P=7.03×10 -05 ) and chemokine activity (P=4.15×10 -15 ). The DEGs were signicantly enriched for plasma membrane (P=2.26×10 -20 ), integral component of membrane (P=7.79×10 -07 ), extracellular region (P=3.43×10 -16 ) in cellular component. The KEGG pathway analysis showed that the DEGs were enriched in the cytokine-cytokine receptor interaction (P=6.86×10 -21 ), chemokine signaling pathway (P=2.03×10 11 ), systemic lupus erythematosus(P=9.23×10 -07 ), T cell receptor signaling pathway (P=6.59×10 -06 ) and rheumatoid arthritis (P=3.24×10 -05 ). We conrmed RA over-expressed PPI network hub genes included CXCL13, CXCL6, CCR5, CXCR5, CCR2, CXCL3, CXCL10 and RA down-regulated hub genes included SSTR1. Conclusions: The study and veried the between RA and OA synovial tissue which highlighted the activity of a subset of chemokine genes, thereby providing novel insights into the molecular mechanisms of RA and identied potential diagnostic and therapeutic targets for RA.


Introduction
Rheumatoid arthritis (RA) is an autoimmune diseases characterized by synovial in ammation and hyperplasia, cartilage and bone destruction, and the clinical manifestations are joint pain, swelling, stiffness, and deformation [1,2]. The pathogenesis of RA is thought to involve genetics, environment factors, obesity, diet, and microbiota [3]. Osteoarthritis (OA) is a joint disease characterized by degeneration of synovial joint and loss of articular cartilage, with the primary clinical features including pain and loss of mobility [4]. Genetics, diet, estrogen use, obesity, bone density, and joint laxity all play a role in OA [5]. As both RA and OA share common physiological targets, the exploration of synovial tissue biomarkers that discriminate between these diseases carries substantial medical utility [6,7].
Transcriptomics is tissue-speci c and as such offers an avenue for the investigation of effects localized to cells that are likely to play an important role to the etiology of diseases [8]. RNA sequencing (RNA-Seq) technology has become a primary took of transcriptomics research to characterize expression within certain cell types and tissues. Using RNA-seq to identify gene expression differences between RA and OA synovial tissue may provide new insights into the understanding of the molecular pathophysiology of these diseases.
In this study, to better understand the transcriptome functional differences between RA and OA, we analyzed whole detected genes by GSEA, identi ed the differentially expressed genes (DEGs) between RA and OA synovial tissue using RNA-seq, then analyzed the DEGs by Gene ontology (GO) functional enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, constructed protein-protein interaction (PPI) network, screened and veri ed the hub genes. Due to their central role in gene expression networks, validated hub genes may serve as critically important molecular markers for identifying RA and OA differences in synovial tissue.

Patients/Tissue
This study included 9 RA patients, who were diagnosed based on the 2010 American College of Rheumatology (ACR)/European League Against Rheumatism (EULAR) classi cation criteria for rheumatoid arthritis [9], 15 OA patients, who were diagnosed on the American College of Rheumatology OA classi cation criteria [10]. The synovial tissue of RA and OA patients were obtained from Guanghua hospital, Shanghai. After taking the synovial tissue, removed the excess fat and vascular tissue, and then put them into liquid nitrogen for later use. The demographic information was shown in Supplementary Table1.This study was approved by the Ethics Committee of Guanghua Hospital of Integrated Traditional Chinese and Western Medicine (approval number: 2018-K-12) and written consent was collected prior to the surgery from the patients.

Ribonucleic acid isolation and Library Preparation
Total ribonucleic acid (RNA) from the synovial tissue was extracted using the Trizol reagent (Thermo Fisher, Waltham, MA, USA) according to the manufacturer's protocol. RNA purity and quanti cation were evaluated using the Nano Drop 2000 spectrophotometer (Thermo Scienti c, Wilmington, DE, USA). RNA integrity was assessed using the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). The total RNA that had a standard of RNA integrity number (RIN)≥7.0, 28S/18S≥0.7 was subjected to RNA-Seq. Then the libraries were constructed using TruSeq Stranded mRNA LT Sample Prep Kit (Illumina, San Diego, CA, USA) according to the manufacturer's instructions.

RNA Sequencing and Differentially Expressed Genes Identi cation
The libraries were sequenced on an Illumina HiSeq X Ten platform. Raw data (raw reads) of fastq format were rstly processed using Trimmomatic [11] and the low quality reads were removed to obtain the clean reads. The clean reads were mapped to the human genome (GRCh38) using HISAT2 [12]. FPKM of each gene was calculated using Cu inks [14,15], and the read counts of each gene were obtained by HTSeqcount [15]. Differential expression analysis was performed using the DESeq R package [16]. P value < 0.05 and |log2FoldChange| >= 2 were set as the threshold for signi cant differential expression.

The Gene Set Enrichment Analysis of All Detected Genes
Gene set enrichment analysis (GSEA) is used to determine the statistically signi cant differences between two groups with regard to a de ned set of genes. The analysis was conducted using the R software package, and the data set was from the Molecular Signatures Database v7.2 (MSigDB)downloaded from the GSEA-MSigDB websites.The MSigDB is database of gene sets for performing gene set enrichment analysis [17]. Adjusted P-value < 0.05, |NES| >1, P<0.05, FDR<0.25 were selected as the cut-off criteria indicating statistically signi cant differences. The MSigDB gene sets included 9 major collections (H:C8). C2 (curated gene sets), C5 (ontology gene sets), C7 (immunologic signature gene sets) are the target data set for our study.

The GO functional enrichment and KEGG pathway analysis
The DEGs were annotated by the GO functional enrichment analysis which included biological process (BP), molecular function (MF), cellular component (CC) and KEGG pathway analysis [18]. KEGG is a database resource for elucidating the genes function at the molecular and higher levels, including biochemical pathways [19]. The annotation and visualization were performed by the clusterPro ler package [20] (an R package for comparing biological themes among gene clusters). The enrichment analysis was performed by hypergeometric test. P<0.05 was chosen as the cut-off criterion indicating statistically signi cant difference.

The PPI network construction and hub genes identi cation
The PPI network was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), a database providing all the exposed protein-protein interactions [21]. The setting of minimum required interaction score was highest con dence (0.900). The hub genes were screened and visualized by the MCODE and CytoHubba plugins in Cytoscape version 3.7.2. The Cytoscape software can visualize, model and analyze molecular and genetic interaction networks [22]. MCODE and cytoHubba plugins can identify hub genes from complex interaction and help to lock the hub-genes in a computationally e cient manner [23].

Validation of hub gene expression by reverse transcription polymerase chain reaction
The reverse transcription polymerase chain reaction (qRT-PCR) was conducted to validate the reliability of the DEG results, the expression levels of 10 selected hub genes were determined. Total RNA were extracted from the 9 RA and 15 OA synovial tissue with TRIzol reagent (Invitrogen, Thermo Fisher Scienti c, Inc). The RNA samples were reverse transcribed to cDNA with PrimeScript™ RT Master Mix (Perfect Real Time) (Takara, China), and qRT-PCR was carried out using TB Green® Premix Ex Taq™ (Tli RNaseH Plus) (Takara, China). β-actin was used as an internal reference. Relative mRNA expression was calculated using the 2 -△△ Ct method. Mann Whitney test was used for the statistical analysis, and P < 0.05 indicated a signi cant difference.

The Total Detected Genes and Differentially Expressed Genes Identi cation
In our sequencing data, a total of 17,736 genes were detected and 651 differential expression genes were identi ed with the threshold of P value < 0.05 and |log2FoldChange| ≥ 2. We identi ed 403 up-regulated and 248 down-regulated differential expressed genes. The volcano plot of DEGs were shown in Fig. 1.

The GO functional enrichment and KEGG pathway analysis of DEGs
The results of GO functional enrichment analysis showed that the DEGs were enriched in signal transduction, immune response and in ammatory response for BP, the DEGs were enriched in calcium ion binding, receptor binding and chemokine activity for MF, and the DEGs were enriched in plasma membrane, integral component of membrane and extracellular region for CC (Fig. 3A, 3B). The KEGG pathway analysis showed that the DEGs were enriched in the cytokine-cytokine receptor interaction, chemokine signaling pathway, systemic lupus erythematosus, T cell receptor signaling pathway and rheumatoid arthritis (Fig. 3C, 3D). The important pathway of cytokine-cytokine receptor interaction and rheumatoid arthritis was shown in Fig. 4.

The PPI network construction and hub genes identi cation
Using the STRING database, our analysis produced a total 630 nodes and 1001 edges. The PPI enrichment p-value was 1.0×10 − 16 . Through MCODE plugin in Cytoscape software, 19 modules were identi ed. The top 5 modules were shown in Fig. 5. Combined with MCODE and cytoHubba plugin in Cytoscape, the hub genes of CXCL13, CXCL6, CCR5, CXCR5, CCR2, CXCL3, CXCL10, CCR7, SSTR1, SSTR3 were identi ed for further analysis.

Validation of hub gene expression by qRT-PCR
Try to veri ed the hub genes, the expression levels of 10 selected hub genes in RA and OA synovial tissue were detected by the qRT-PCR. The primer sequence was shown in Table 3. The statistical results showed that the expression levels of CXCL13(P < 0.0001), CXCL6 (P = 0.0252), CCR5 (P = 0.0002), CXCR5 (P = 0.0033), CCR2 (P = 0.0073), CXCL3 (P = 0.0314), CXCL10 (P < 0.0001) in RA synovial tissue were higher than that in OA synovial tissue and the expression levels of SSTR1 (P = 0.0486) in OA synovial tissue were higher than that in RA synovial tissue (Fig. 6).

Discussion
RA and OA are two common types of arthritis with an in ammatory component, but have distinct etiologies, clinical trajectories and treatment. The pathogenesis and manifestations of these two diseases are complex, with clinical heterogeneity in presentation and disease course. Distinguishing between these two arthritic conditions is critically important for early diagnosis, appropriate treatment and elucidates the underlying pathophysiology of these disorders. Studies have demonstrated that synovial tissue plays an important role in the occurrence and development of both RA and OA. Our study performed second-generation sequencing on the synovial tissue of these two diseases, thereby enabling the identi cation of DEGs, analysis of the functions and pathways from DEGs enrichment results, and subsequently veri ed the hub DEGs by RT-qPCR.
In previous studies, the data sets in the GEO database were used for bioinformatics analysis of RA synovial tissue, such as GSE55235, GSE12021, etc. These studies are based on previous chip information, with different data sets and different genes identi ed [24,25]. To further investigate the biomarkers of synovial tissue in RA synovial tissue. In order to further analyze the transcriptome of RA synovial tissue derived from patients in a clinical setting, we collected synovial tissue from patients with RA and OA within a single rheumatology hospital, performed RNA-seq, and investigated the pathways, gene networks and hub genes to further arthritis transcriptome studies.
In this study, a total of 17736 genes were detected. The GSEA was sensitive to detect genes with relatively smaller fold change [26]. We found that in curated gene sets, the signi cant enriched gene sets positively correlated with RA group were CD40 signaling up, Th1 cytotoxic module. The CD40 signaling is associated with the production of human rheumatoid factor [27] and the CD40/NF-kB signaling pathway play an important role in RA pathogenesis [28].The Th1 cytotoxic module has not been reported to be related to RA, but the Th1 cytotoxic is reportedly associated with tumor microenvironment [29]. In ontology gene sets, the signi cant enriched gene sets positively correlated with RA group were activation of immune response, adaptive immune response. The RA is an autoimmune disease involved in innate and adaptive immunity [30]. In immunologic signature gene sets, the signi cant enriched gene sets positively correlated with RA group was the up-regulation of the gene set of effective versus memory CD8 T cell; The down-regulation of gene set of naïve versus effective CD8 T cell.
-The CD8 + T cells are involved in the pathogenesis of many autoimmune diseases mainly due to their self-reactive cytotoxic in ammatory behavior [31]. Effective CD8 + T cells have proliferation and cytotoxic properties, and induce the death of infected cells and effective memory CD8 + T cells have a lower ability to induce cytotoxicity than effective CD8 + T cells [31,32].
In our study, a total of 651 DEGs were identi ed, of which 403 were up-regulated genes and 248 were down-regulated genes. GO functional enrichment analysis demonstrated that the DEGs were enriched in signal transduction, immune response and in ammatory response in BP term, enriched in calcium ion binding, receptor binding and chemokine activity in MF term, enriched in plasma membrane, integral component of membrane and extracellular region in CC term. The KEGG pathway analysis showed that the DEGs were enriched in the cytokine-cytokine receptor interaction, chemokine signaling pathway, systemic lupus erythematosus, T cell receptor signaling pathway and rheumatoid arthritis. The DEGs was mainly concentrated in immune and in ammation-related pathways.
A total of 10 DEGs were distinguished as hub genes by MCODE and cytoHubba plugin of Cytoscape.
According to the ROC analysis, and qRT-PCR validation, for the synovial tissue ,the expression of CXCL13, CXCL6, CCR5, CXCR5, CCR2, CXCL3, CXCL10 in RA were higher than OA,but the expression of SSTR1 in OA was higher than RA. The expression of CCR7 and SSTR3 showed no difference between RA and OA synovial tissue. CXCL13, CXCL10, CXCL6 and CXCL3 are the main members of the CXC chemokine subfamily. C-X-C motif chemokine ligand 13 (CXCL13), a B cell chemokine, interacting with its receptor C-X-C motif chemokine receptor 5 (CXCR5) promotes the migration and aggregation of B lymphocytes [33]. The expression level of CXCL13 with the serum of RA patients are positively correlated with the level of rheumatoid factor, and is also correlated with the disease activity and treatment response of early rheumatoid arthritis [34][35][36]. C-X-C motif chemokine ligand 10 (CXCL10) is a ligand for the receptor C-X-C motif chemokine receptor 3 (CXCR3), which may stimulate the migration of monocytes, natural killer and T-cell migration [37]. The expression of CXCL10 was detected in serum, synovial uid and synovial tissue of RA patients [38,39]. CXCL10 may be a disease activity marker in early RA because of its high circulating level in plasma of untreated early RA and its association with clinical disease activity [40]. Our study con rmed the high expression of CXCL10 in RA synovial tissue, and identi ed that CXCL10 expression level in RA synovial tissue was higher than that in OA synovial tissue, and the difference was statistically signi cant. C-X-C motif chemokine ligand 3 (CXCL3) was reported to be associated with the invasion and metastasis of various cancer [41][42][43]. CXCL3 and CXCL6 are related to the invasion and migration of a variety of cancers [44][45][46]. The differential expression of CXCL3 and CXCL6 between RA and OA synovial tissue has not been reported. CCR7, CCR5, CCR2 are typical chemokine receptors. C-C motif chemokine receptor 5 (CCR5) is reportedly expressed in the RA synovial tissue and T helper-cell type 1 in ammatory in ltrates. The CCR5 ( the Delta32 allelic variant) has previously been reported as having a protective effect on RA susceptibility [47], however, the effect of CCR5 inhibitors on RA is still controversial [48][49][50]. C-C motif chemokine receptor 2 (CCR2) has been widely considered as a potential therapeutic target for RA and the CCR2 blocking agents have been developed [51]. The monocyte chemoattractantprotein (MCP)-1 (CCL2) and its high-a nity receptor, CCR2, are central to the development of pain associated with knee osteoarthritis. CCR2 plays an important role in both RA and OA. our study found the expression of CCR2 in RA and OA synovial tissue was different, may further identify its differential function between RA and OA. Somatostatins can regulate diverse cellular functions such as neurotransmission, cell proliferation. The somatostatin receptor 1 (SSTR1) was reportedly associated with various cancer, such as prostate cancer [52] and gastric cancer [53]. The role of SSTR1 in RA and OA has not been studied, our study may provide a basis for future arthritis research.

Conclusion
the RNA-seq was used to detect the genes between RA and OA synovial tissue. Combined with bioinformatics analysis, the DEGs were identi ed and the GO functional and KEGG pathway enrichment analysis of DEGs was analyzed. The hub DEGs, SSTR1, CXCR5, CXCL6, CXCL3, CXCL13, CXCL10, CCR7, CCR2 were veri ed by qRT-PCR. The present study could enrich expression pro le data of DEGs between RA and OA synovial tissue and provide novel insight into difference between RA and OA. The candidate DEGs, pathway might be therapeutic targets and biomarkers for RA or OA.   Table 3 The gene primer sequence    The PPI network construction and hub genes identi cation (top 5 modules). The signi cant modules identi ed by MCODE plugin of Cytoscape. The size and color of the circle represent the degree of the hub gene. The darker the color, the bigger the circle, the greater the degree of the core gene. DEG: differentially expressed gene; PPI: protein-protein interaction.