Identification of Differentially Expressed Proteins in the Serum for Systemic Juvenile Idiopathic Arthritis Using Next-generation Proteomics

Hironori Sato, Chiba University
Yuzaburo Inoue (yuzaburo@chiba-u.jp), Chiba Children's Hospital
Yusuke Kawashima, Kazusa DNA Research Institute
Daisuke Nakajima, Kazusa DNA Research Institute
Ren Nakamura, Kazusa DNA Research Institute
Daigo Kato, Chiba Children's Hospital
Kanako Mitsunaga, Chiba Children's Hospital
Takeshi Yamamoto, Benaroya Research Institute at Virginia Mason
Akiko Yamaide, Chiba Children's Hospital
Minako Tomiita, Shimoshizu National Hospital
Akira Hoshioka, Chiba Children's Hospital
Osamu Ohara, Kazusa DNA Research Institute
Naoki Shimojo, Chiba University


Introduction
Systemic juvenile idiopathic arthritis (sJIA) is a chronic arthritis that causes systemic symptoms, such as a spiking high fever, salmon-pink rash, lymphadenopathy, hepatosplenomegaly, and serositis. While sJIA is considered an autoinflammatory disease with no genetic involvement (1-3), the pathogenesis leading to systemic inflammation remains unclear.
One of the clinical problems with sJIA is the difficulty of diagnosing the disease at onset because of the many differential diagnoses; biomarkers to facilitate the diagnosis are therefore needed. In addition, sJIA is sometimes complicated by macrophage activation syndrome (MAS) due to the cytokine storm induced by the activation of T lymphocytes and macrophages (2). As MAS causes rapid progression to life-threatening conditions and requires accurate and prompt evaluation and treatment (4,5), indicators for predicting MAS are also required.
Several acute-phase proteins and pro-inflammatory cytokines have been used to assess sJIA disease activity and the risk of complication with MAS. C-reactive protein (CRP) and ferritin can be measured rapidly and easily in clinical practice (6,7), but these proteins are non-specific, and it takes time for them to be secreted into the blood and increase in concentration. In addition, the use of some biologics, such as tocilizumab, suppresses their production. Interleukin (IL)-18 and soluble CD163 are useful biomarkers for distinguishing sJIA from other diseases because of their rapid rise during the acute phase of sJIA and MAS (8-11). However, accurately evaluating these biomarkers can be difficult, as they cannot be measured immediately in general practice and because some sJIA patients have persistent IL-18 production despite low disease activity (12).
Proteome analyses using mass spectrometry (MS) have recently attracted attention as a powerful tool for elucidating pathology and identifying disease-related proteins as candidate biomarkers. Proteome analyses can identify many proteins from very small samples (13), so applying this technique to systemic acute inflammatory diseases may reveal not only acute-phase proteins and pro-inflammatory cytokines located downstream of inflammatory pathways but also unknown upstream proteins that cause or enhance inflammation. Furthermore, functional and pathway analyses of the identified proteins may reveal new pathological conditions that have not been previously reported.
We therefore used a deep proteome analysis, so-called "next-generation proteomics," of sJIA sera collected at different levels of disease activity to investigate pathology-related proteins and candidate biomarkers for therapeutic targeting and disease assessment.

Patients' characteristics
We collected serum samples from 9 patients with sJIA (5-17 years old). As the study targeted a rare disease in children and its purpose was to search for proteins differentially expressed across disease phases and to clarify the pathogenesis of the disease, we determined that the number of patients was sufficient for a detailed proteomic analysis and for assessing quantitative changes in proteins.
The patients' characteristics and laboratory data at the time of serum collection are shown in Table 1.
Sera were collected at different disease phases (MAS, active, and inactive phases) from seven of nine patients to compare the proteomes at different disease phases. Thus, 11 active-phase (including 6 MAS-phase) and 13 inactive-phase samples were used for the analysis. Some patients had few or mild joint symptoms at the onset of sJIA, and their arthritis was confirmed later. Thus, they did not have arthritis at the time of serum collection. The data collected from each patient did not differ substantially from previously reported information. Therefore, the grouping of the data by disease phase was deemed reasonable. The age was consistent with the general age of sJIA patients, and the ferritin and IL-18 levels tended to be higher in patients with high disease activity. Several patients were treated with biologics (e.g. tocilizumab) during the active phase.

Identifying differentially expressed proteins during the active phase of sJIA and their characteristics
Data analyses and the identification of differentially expressed proteins were performed using the workflow in Figure 1. Our MS analysis identified 2,727 proteins in the initial measurement of whole samples. According to our evaluation criteria, a total of 1,856 proteins were used in the analysis. We analyzed the changes in protein expression during the active phase by selecting five active-phase and eight inactive-phase samples from nine patients to avoid duplicating the same disease stage. The protein variation between the two groups is shown in Figure 2a. Among these, 607 proteins (32.7%) were significantly differentially expressed. A total of 566 proteins were upregulated (≥2-fold) and 41 proteins were downregulated (≤0.5-fold) (Figure 2b). The upregulated proteins included many acute-phase proteins with inflammatory activity, such as CRP, ferritin (ferritin light chain and heavy chain), serum amyloid A proteins (SAA-1, SAA-2), and heat shock proteins. We also detected proteins, such as soluble CD163 and IL-18, that had previously been reported to be elevated during the active phase and in cases with MAS complications (8-11). Among these proteins, the quantitative MS values of CRP, ferritin, and IL-18 showed a high correlation with the laboratory data (IL-18 was measured by ELISA) (Supplemental figure 1). Based on these findings, our proteomic analysis accurately demonstrated the differences in serum protein expression associated with the active phase of sJIA.
We also selected proteins from this group with levels that varied by disease activity. Using samples from the MAS group (n=6), active group (n=5), and inactive group (n=13), we performed an ANOVA and selected 591 proteins that were significantly differentially expressed. Then, we performed a principal component analysis. All 68 PC1 proteins were differentially upregulated with levels that varied by disease activity (Figure 3b). We labeled these proteins URPs (upregulated proteins) and deemed them potentially reflective of disease activity. These URPs and their functional characteristics were investigated, as explained in the following section.

The functional enrichment and pathway analysis of URPs
We performed a functional enrichment and pathway analysis to characterize the URPs. The majority of URPs were cytosolic proteins (GO:0005829). Classification by biological process revealed many GO terms that were related to immunity, e.g. immune-effector processes (GO:0002252), neutrophil activation (GO:0042119), neutrophil degranulation (GO:0043312), and chemokine production (GO:0032602) (Supplemental figure 2). Notably, the top enriched KEGG pathways for URPs included Proteasome (hsa03050). The identified proteasome subunits showed high expression in the active phase, suggesting that the proteasome might play a role in the sJIA pathogenesis.
Next, we performed a protein-protein interaction analysis of URPs using STRING. For this analysis, we included the identified downregulated proteins to further understand the pathogenesis. The major functional groups of these proteins are shown in Figure 4. These proteins, including proteasome subunits, had strong interactions with each other. Interestingly, some downregulated proteins belonging to growth factor binding (GO:0019838) and the PI3K-Akt signaling pathway (hsa04151) were found to be connected with the URPs.

Candidate biomarkers
We selected four proteins from the URPs as candidate biomarkers: leucine aminopeptidase 3 (LAP3), guanylate-binding protein 1 (GBP1), heme oxygenase 1 (HMOX1), and bone morphogenetic protein 10 (BMP10). They were chosen based on their differential expression with high fold changes depending on the disease activity phase and/or their position as core proteins in the network analysis.
A comparison of the quantitative expression values and the fold changes of the four proteins (LAP3, GBP1, HMOX1, and BMP10) is shown in Table 2. These proteins, as well as the conventional inflammatory biomarkers (CRP, ferritin, and SAA), showed high quantitative values in the MS analysis, and their expression varied depending on the disease stage. Furthermore, these proteins were highly correlated with each other (Figure 5a).
We also assessed the time-course expression of the four proteins and conventional biomarkers in a patient with repeated relapses. This patient suffered from repeated complications of MAS despite receiving various immunosuppressive therapies (e.g. methylprednisolone pulse therapy, dexamethasone palmitate, cyclosporine, and etoposide). The clinical course is shown in Figure 5b. The biomarker candidate proteins increased similarly to the conventional biomarkers, suggesting that these proteins may be useful for monitoring sJIA disease activity.

Discussion
This study investigated differentially expressed serum proteins associated with sJIA disease activity using next-generation proteomics. The main finding was the identification of 68 URPs with elevated levels during the active phase among the 2,727 proteins obtained from MS. These URPs included proteasome-related proteins and biomarker candidate proteins, such as LAP3, whose expression varies drastically with disease phase; these proteins might be associated with the pathology.
As few specific biomarkers that can indicate and evaluate sJIA disease activity have been identified, discovering novel biomarkers is very important in clinical practice. We identified proteins that were upregulated in active sJIA and might be useful for predicting relapse and making treatment decisions. In addition, the identification of proteins with constitutively differential expression might facilitate our understanding of the underlying pathogenesis of sJIA.
The biomarker candidate proteins (LAP3, GBP1, HMOX1, and BMP10) that we selected were detected at high levels among the URPs, and their expression levels varied depending on the disease stage. Since these proteins can be measured with high reproducibility using MS, they have the potential to be useful as serum biomarkers for a wide range of clinical applications in the future. Although further studies are needed, these proteins are also located in the inflammatory pathway associated with sJIA disease activity and might be useful for the differential diagnosis.
Through careful preparation and next-generation proteomics technology, we also successfully identified an unprecedented number of serum proteins, which proved useful for the functional and pathway analyses. As serum protein concentrations are known to have a wide dynamic range, most candidate biomarker proteins are low in abundance and cannot be detected by conventional preparation methods due to the influence of highly abundant serum components, such as albumin (14). These techniques may be used to investigate serum biomarkers of other diseases in the future.
LAP3 was the most highly expressed of the candidate biomarker proteins. This protein is not only involved in intracellular protein processing and regular turnover but is also assumed to be associated with inflammation via neutrophil activation and degranulation (15,16). The dysregulation of LAP3 expression is also thought to cause changes in cell proliferation, invasion, and angiogenesis by altering peptide activation, and LAP3 has been reported as a candidate biomarker for the early diagnosis and therapeutic evaluation of colorectal and breast cancer (17). Interestingly, LAP3 and GBP1 were identified as proteins strongly upregulated by type I interferons (IFNs), along with CXCL9, CXCL10, CCL2, and CCL8 (18). CXCL9 is also upregulated during the MAS phase (19), indicating its involvement in the activation of inflammation along with IFN-γ.
Heme oxygenase (HMOX) is a potent anti-inflammatory and antioxidant rate-limiting enzyme that is involved in the degradation of heme to biliverdin, free iron, and carbon monoxide (CO) (20,21). HMOX1 is distributed in the liver, spleen, and endothelium and is rapidly induced in the presence of stressors. Recently, some reports have shown that HMOX1 deficiency results in hyperinflammation and features that are similar to MAS (22); thus, HMOX1 is considered an important protein in the regulation of systemic inflammation. In addition, it has been reported that HMOX1 levels are increased in patients with sJIA and adult Still's disease (12,21,23). In this study, the network analysis showed that HMOX1 is a protein involved in inflammatory pathways, and quantitative measurement by MS revealed that it was highly expressed.
The expression change of BMP10 between MAS and the active phase was the most variable among the identified proteins. BMP10 is a circulating cytokine with an important role in endothelial homeostasis, and high BMP10 levels may increase the recruitment of monocytes to the vascular endothelium (24). This protein might be a useful biomarker for the early detection of complications and disease progression.
Proteasome subunits are important proteins to consider when discussing the underlying pathogenesis of sJIA. Proteasomes are protein complexes that function as part of the ubiquitin-proteasome system and are involved in not only breaking down unneeded proteins but also many other essential cellular processes, such as cell cycle regulation and proliferation (25). Proteasomes are located in not only the cytoplasm but also the serum as circulating proteasomes (c-proteasomes). Although the exact function of c-proteasomes remains unclear, c-proteasome levels are increased in various diseases, such as autoimmune diseases, malignancies, and sepsis (26,27). Furthermore, in this analysis, in cases with repeated relapses, proteasome levels in both the active and inactive phases were higher than those in the active phase of cases without relapse. Although further confirmation will be needed because of the small number of cases reviewed here, our result suggests that increased c-proteasomes in sJIA might not be located downstream of systemic inflammation but rather upstream in the pathogenesis of sJIA. On the other hand, the role of the functions constituted by the downregulated proteins revealed in this study is also unclear. Thus, further studies will be needed to clarify the role of c-proteasomes in the pathogenesis of sJIA, including in relation to the function of the downregulated proteins.
Two strengths of this study warrant mention. First, the sample collection was devised to ensure an effective biomarker search. The clinical course, symptoms, and treatment were carefully followed at the same institution, and samples were collected and analyzed. We were thus able to reduce biases, such as environmental and medical equipment factors that might affect the serum proteome analysis results.
Second, as mentioned above, we performed careful serum preparation and proteome analyses using the latest technology, which allowed us to obtain a large amount of information from the serum. We detected new biomarker candidates and pathological conditions by analyzing them in combination with the clinical course.
However, several limitations of the present study also warrant mention. First, the number of patients was small, so some biomarkers we found may have been affected by the patients' characteristics. Further studies evaluating the biomarkers in a large number of patients are needed to confirm their usefulness. Another methodological limitation is that some low-abundance proteins (such as cytokines and chemokines) cannot be identified in serum, even using deep proteome analyses. Further technological innovation in proteome analyses will be needed to identify new biomarkers.

Conclusion
In conclusion, we successfully identified a protein group that was differentially expressed during the active phase and analyzed its functions and pathways in sJIA by next-generation proteomics. In particular, some proteins such as LAP3, whose levels increase dynamically during the active phase, are expected to be useful for clinical applications, such as making diagnoses and performing therapeutic evaluations, in combination with previously reported biomarkers. The proteins identified here, such as proteasomes, might also provide clues to the pathogenesis of the disease.

Study design
We performed a cross-sectional study of patients treated for sJIA. The study was performed according to the Declaration of Helsinki principles and approved by the ethics review board of Chiba Children's Hospital, Chiba, Japan (approval number 2020-022). Written informed consent was obtained from the study participants and/or their guardians.

Setting and participants
We recruited patients with sJIA at Chiba Children's Hospital in October 2020. Chiba Children's Hospital serves most pediatric patients with rheumatic disease in the area due to the small number of centers treating pediatric rheumatic diseases. The eligible participants were patients who had previously been diagnosed with sJIA based on the International League of Associations for Rheumatology (ILAR) classification criteria (28) and had undergone treatment at Chiba Children's Hospital between April 2013 and March 2020. The exclusion criteria were a lack of medical records, complications of other rheumatic diseases, complications of acute infection at the time of serum collection, or other conditions that might induce inflammation, such as surgery, injury, and malignancy.
The patients' clinical symptoms and laboratory findings were obtained retrospectively from their medical records. The diagnosis of MAS was based on the 2016 EULAR/ACR/PRINTO classification criteria (29). The criteria for active-phase sJIA were typical symptoms, such as a fever, arthritis, hepatosplenomegaly, rash, and generalized lymphadenopathy, and increased CRP levels (>0.3 mg/dL).

Sample preparation for proteome analyses
Highly abundant serum proteins were depleted from 10 μl of serum using a High Select Top 14 Abundant Protein Depletion Mini Spin Column (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions. Depleted serum (50 μl) was diluted with 150 μl of 100 mM Tris-HCl pH 8.5 and 2% sodium dodecyl sulfate (SDS) and treated with 10 mM dithiothreitol at 50 °C for 30 min. The sample was then alkylated with 30 mM iodoacetamide in the dark at room temperature for 30 min and subjected to cleanup and digestion with single-pot solid phase-enhanced sample preparation (SP3) (30).
Two types of beads (hydrophilic and hydrophobic Sera-Mag Speed-Beads; Cytiva, Marlborough, MA, USA) were used for the SP3 method. These beads were combined at a 1:1 (v/v) ratio, rinsed with distilled water, and reconstituted in 500 mM Tris-HCl pH 7.0 at 10 μg solids/μl. The reconstituted beads (20 μl) were then added to the alkylated sample followed by ethanol to bring the final concentration to 75% (v/v), with mixing for 20 min. The beads were subsequently immobilized on a magnetic rack. The supernatant was discarded, and the pellet was rinsed with 80% ethanol and 100% acetonitrile (ACN). The beads were then resuspended in 40 μl of 50 mM Tris-HCl pH 8.0 with 1 μg trypsin/Lys-C Mix (Promega, Madison, WI, USA) and digested by gentle agitation at 37 °C overnight. The digested sample was acidified with 150 μl of 0.1% trifluoroacetic acid (TFA) and then desalted using GL-Tip SDB (GL Sciences Inc., Tokyo, Japan) according to the manufacturer's instructions, followed by drying with a centrifugal evaporator. The dried peptides were redissolved in 3% ACN and 0.1% formic acid and transferred to a hydrophilic-coated low-adsorption vial (ProteoSave vial; AMR Inc., Tokyo, Japan).
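The ethanol addition in the SP3 binding step is a simple dilution calculation. The following is a minimal sketch, assuming a 100% ethanol stock, additive volumes, and an illustrative pre-ethanol volume of 220 μl (200 μl of diluted serum plus 20 μl of bead suspension; the small reagent additions are not specified in the text and are ignored here). The function name is hypothetical.

```python
def ethanol_volume_ul(sample_volume_ul: float, target_fraction: float = 0.75) -> float:
    """Volume of pure ethanol to add so that ethanol makes up
    `target_fraction` (v/v) of the final mixture.

    Solves V_e / (V_e + V_s) = f  =>  V_e = f * V_s / (1 - f).
    Assumes additive volumes and a 100% ethanol stock.
    """
    if not 0 < target_fraction < 1:
        raise ValueError("target_fraction must be between 0 and 1")
    return target_fraction * sample_volume_ul / (1.0 - target_fraction)


# Illustrative volume only (assumption): 200 ul diluted serum + 20 ul beads.
volume = ethanol_volume_ul(220.0)
print(f"Add about {volume:.0f} ul of ethanol")
```

For a 75% (v/v) target, the required ethanol volume is three times the starting volume, whatever that volume actually is in a given preparation.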

Proteome analyses
Peptides were directly injected onto a 75 μm × 40 cm PicoFrit emitter (New Objective, Woburn, MA, USA) packed in-house with C18 core-shell particles (CAPCELL CORE MP 2.7 μm, 160 Å material; Osaka Soda Co., Ltd., Osaka, Japan) at 60 °C and then separated with a 120-min gradient at 100 nl/min using an UltiMate 3000 RSLCnano LC system (Thermo Fisher Scientific, Waltham, MA, USA). Peptides eluting from the column were analyzed on a Q Exactive HF-X (Thermo Fisher Scientific) for overlapping window data-independent acquisition (DIA)-MS (13,31). MS1 spectra were collected in the range of 495-745 m/z at 30,000 resolution with an automatic gain control target of 3×10^6 (unit) and a maximum injection time of 55 (unit). MS2 spectra were collected at >200 m/z at 45,000 resolution with an automatic gain control target of 3×10^6 (unit), a maximum injection time of "auto," and stepped normalized collision energies of 22%, 26%, and 30%. The isolation width for MS2 was set to 4 m/z, and overlapping window patterns of 500-740 m/z were used, with window placements optimized by Skyline.
The MS files were searched against a human spectral library using Scaffold DIA (Proteome Software, Inc., Portland, OR, USA). The human spectral library was generated from the human protein sequence database (UniProt ID UP000005640, reviewed, canonical) using Prosit. The Scaffold DIA search parameters were as follows: experimental data search enzyme, trypsin; maximum missed cleavage sites, 1; precursor mass tolerance, 8 ppm; fragment mass tolerance, 8 ppm; static modification, cysteine carbamidomethylation. The protein identification threshold was set at <1% for both peptide and protein false discovery rates. The peptide quantification was calculated by the EncyclopeDIA algorithm (32) in Scaffold DIA. For each peptide, the four highest-quality fragment ions were selected for quantitation. The protein quantitative value was estimated from the summed peptide quantitative values. Total quantitative values were calculated by normalizing the protein quantitative values between samples.
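To make the overlapping-window layout concrete, the sketch below generates one plausible scheme for a 4 m/z isolation width over 500-740 m/z. It assumes a staggered design in which alternate cycles are shifted by half a window; this is an assumption, not a statement of the exact acquisition method, and it does not reproduce the window-edge optimization performed by Skyline. The function name is hypothetical.

```python
def staggered_windows(start=500.0, stop=740.0, width=4.0):
    """Return two lists of (low, high) isolation windows in m/z.

    Cycle A tiles [start, stop] with adjacent windows of `width`.
    Cycle B is the same tiling shifted down by half a window, plus one
    extra window at the top, so every m/z in [start, stop] is covered
    by two windows offset by width / 2 (assumed staggered scheme).
    """
    n = int(round((stop - start) / width))
    cycle_a = [(start + i * width, start + (i + 1) * width) for i in range(n)]
    half = width / 2.0
    cycle_b = [(lo - half, hi - half) for lo, hi in cycle_a]
    cycle_b.append((stop - half, stop + half))
    return cycle_a, cycle_b


cycle_a, cycle_b = staggered_windows()
print(len(cycle_a), cycle_a[0], cycle_a[-1])
print(len(cycle_b), cycle_b[0], cycle_b[-1])
```

Under this assumption the shifted cycle spans 498-742 m/z, which still lies inside the 495-745 m/z MS1 range given above.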
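The roll-up from fragment ions to proteins described above can be sketched as follows. This is an illustration of the general idea only, not the EncyclopeDIA or Scaffold DIA implementation: the per-fragment quality score is a hypothetical field, and the between-sample normalization method is not specified in the text, so scaling to a common total signal is used here as a stand-in.

```python
from collections import defaultdict


def peptide_quant(fragments, top_n=4):
    """Sum the intensities of the `top_n` highest-quality fragment ions.

    `fragments` is a list of (quality_score, intensity) tuples; the
    quality score is a hypothetical stand-in for the software's metric.
    """
    best = sorted(fragments, key=lambda f: f[0], reverse=True)[:top_n]
    return sum(intensity for _, intensity in best)


def protein_quant(peptides):
    """Sum peptide quantities per protein.

    `peptides` maps (protein, peptide) -> list of fragment tuples.
    """
    totals = defaultdict(float)
    for (protein, _peptide), fragments in peptides.items():
        totals[protein] += peptide_quant(fragments)
    return dict(totals)


def normalize_between_samples(samples):
    """Scale each sample so its total signal equals the mean total.

    `samples` maps sample -> {protein: value}. Total-signal scaling is
    an assumption; the source does not name the normalization used.
    """
    totals = {s: sum(values.values()) for s, values in samples.items()}
    target = sum(totals.values()) / len(totals)
    return {
        s: {p: x * target / totals[s] for p, x in values.items()}
        for s, values in samples.items()
    }


# Toy data: the fifth fragment is intense but has the lowest quality score,
# so it is excluded from the top-four sum.
toy = {("LAP3", "PEPTIDEA"): [(0.9, 100), (0.8, 50), (0.7, 30), (0.6, 20), (0.1, 1000)],
       ("LAP3", "PEPTIDEB"): [(0.9, 10), (0.5, 5)]}
print(protein_quant(toy))
```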

The statistical and bioinformatics analysis of the data
We performed statistical analyses and generated hierarchical clustering and heatmaps using the JMP Pro 13 software program (SAS Institute Inc., Cary, NC, USA) and the Qlucore Omics Explorer software program (Qlucore AB, Lund, Sweden).
We used proteins with at least three peptide counts that were detected in at least one subject for the subsequent analysis, with the exclusion of proteins with missing values in some subjects. After conducting log2 transformation of the quantified continuous values, an independent-sample two-sided t-test or an analysis of variance (ANOVA) was used. P values of <0.05, with a confidence interval of 95%, were considered to indicate statistical significance.
We also used a principal component analysis to reduce the identified proteins to a few dimensions that retain a large amount of the variability of the original values. The first principal component (PC1) explains the highest amount of variability of the original data. Furthermore, correlations among the identified proteins were investigated using Spearman's correlation coefficient.
Analyses of functional Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were performed with the online tool Search Tool for the Retrieval of Interacting Genes (STRING v11; http://string-db.org/) (33). An association network of the identified proteins was also established using the STRING database with the highest confidence interaction score. The proteins that showed elevated serum values were looked up in the Human Protein Atlas (HPA; https://www.proteinatlas.org/) (34).
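The filtering and fold-change steps can be sketched as below. This is a minimal illustration rather than the JMP or Qlucore procedure: it assumes that "at least three peptide counts" means three or more peptides in at least one sample, that missing values are represented as None, and that the fold change is taken as the difference of mean log2 values. The ≥2-fold and ≤0.5-fold cut-offs are those reported for the up- and downregulated proteins. All function names are hypothetical, and the p-value step is left to a statistics package.

```python
import math
from statistics import mean


def keep_protein(peptide_counts, values, min_peptides=3):
    """Keep a protein if some sample has >= min_peptides peptides and
    no sample has a missing (None) quantitative value."""
    has_enough_peptides = max(peptide_counts) >= min_peptides
    has_no_missing = all(v is not None for v in values)
    return has_enough_peptides and has_no_missing


def log2_fold_change(active, inactive):
    """Mean log2 value in the active group minus that in the inactive group."""
    return mean(math.log2(v) for v in active) - mean(math.log2(v) for v in inactive)


def classify(log2_fc, up=2.0, down=0.5):
    """Label a protein by fold change: >= `up` or <= `down` (linear scale)."""
    if log2_fc >= math.log2(up):
        return "up"
    if log2_fc <= math.log2(down):
        return "down"
    return "unchanged"


# Toy values only: a four-fold increase in the active group.
fc = log2_fold_change([8.0, 8.0], [2.0, 2.0])
print(fc, classify(fc))
```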
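Spearman's coefficient is the Pearson correlation computed on ranks, which the following standard-library sketch makes explicit. It is an illustration of the statistic, not the software used in the study; tied values receive their average rank, and the example values are invented.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented values: a monotonic but non-linear relationship.
print(spearman([1, 2, 3, 4], [10, 20, 25, 100]))
```

Because only the ordering matters, a monotonic but non-linear relationship between two proteins still gives a coefficient at the extreme of the scale.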
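The analysis above used the STRING web tool. For readers who prefer a scripted query, the sketch below builds a request URL for STRING's public REST interface. The endpoint layout and parameter names (`identifiers`, `species`, `required_score`) are given as recalled from the STRING API documentation and should be checked against the current documentation before use. A `required_score` of 900 is taken here to correspond to the highest confidence setting (0.9), species 9606 is human, and the function name is hypothetical. No request is sent.

```python
from urllib.parse import urlencode

# Base URL of the STRING REST interface (assumed; verify against the docs).
STRING_API = "https://string-db.org/api"


def string_network_url(gene_symbols, species=9606, required_score=900, fmt="tsv"):
    """Build a STRING 'network' query URL for a list of gene symbols.

    Identifiers are joined with carriage returns, which urlencode
    percent-encodes as %0D.
    """
    params = {
        "identifiers": "\r".join(gene_symbols),
        "species": species,
        "required_score": required_score,
    }
    return f"{STRING_API}/{fmt}/network?{urlencode(params)}"


# The four candidate biomarkers named in the text, used as example input.
print(string_network_url(["LAP3", "GBP1", "HMOX1", "BMP10"]))
```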

Declarations
Hepatomegaly and/or Splenomegaly 6 (24) 4 (67) 0 (0) 2 (28)
Figure 1 The analysis workflow. *We removed proteins that contained samples that could not be quantified as a missing value. sJIA, systemic juvenile idiopathic arthritis; MAS, macrophage activation syndrome; GO, gene ontology.
protein; S100A8,9, protein S100A-8 and 9.
Results of a STRING-based interaction analysis of identified proteins. This schematic diagram represents the proteins and the functional groups of the proteins. Differentially expressed proteins were classified with the Gene Ontology and KEGG pathway, and essential proteins were selected. The thickness of the lines indicates the confidence level of the predicted interactions (confidence score ≥0.9, thick lines; confidence score ≥0.4, thin lines). Refer to Supplemental table 1 for the name of each protein gene.
The lower area shows a line graph for candidate biomarkers. Serum was collected and analyzed at points (1) to (5). CRP, C-reactive protein; IL18, interleukin-18; FTL, ferritin light chain; FTH, ferritin heavy chain; LAP3, leucine aminopeptidase 3; GBP1, guanylate-binding protein 1; HMOX1, heme oxygenase 1; BMP10, bone morphogenetic protein 10.

Supplementary Files
Supplementary file associated with this preprint: supplementarymaterialSR.docx