High Fidelity Cyclic Peptide Microarray to Diagnose Rheumatoid Arthritis

Antibodies reactive with cyclic citrullinated peptides (ACPA) have been included as a part of the ACR criteria for rheumatoid arthritis (RA) since 2010 ii . However, because ELISAs cannot capture the polydispersity of antibodies in RA without losing specificity, they cannot be used as definitive markers for diagnosis or prognosis. Current ELISA methods employ at most 30 unique cyclic citrullinated peptides for detection of ACPA. Apart from antibodies to citrullinated peptides, the serum from RA patients also contains antibodies to carbamylated antigens. Citrullinated and carbamylated peptides hence show great promise for the improved diagnosis of RA. Here we tested whether a novel silicon-based, combinatorial, high-fidelity and high-throughput cyclic peptide microarray that includes both citrullination and carbamylation modifications identifies RA with improved sensitivity compared to standard ELISAs. Using a cyclic modified peptide (CMP) library of more than 2 million sequences, we tested 121 clinically diagnosed RA patients and compared binding profiles to corresponding linear epitopes and to commercially available CCP kits. Disease controls and samples from healthy individuals were also used to determine the specificity of the assay. The tests were performed using high-throughput liquid handlers integrated with biochip imagers, enabling high automation and a rapid turnaround time of 2.5 hours. Since citrulline and homocitrulline are very difficult to distinguish except by mass spectrometry, we use the term Vibrant ACPA to include antibodies that react with either or both of these modified peptides in our assay. These novel CMP sequences have a high degree of accuracy in differentiating RA from controls, compared with standard serologic ELISA tests. Both citrullinated and carbamylated peptides are necessary for increasing the current sensitivity of RA testing.
We also established that a rigid conformation of the peptide is necessary for improved capture of antibodies: comparing cyclic and linear versions of the same peptide showed improved sensitivity with cyclization. This high-throughput pillar plate platform, along with the 2 million data points generated per sample, enables immune profiling on an unprecedented scale. While improved diagnostics is the primary outcome presented here, future identification of antigenic peptides that enable better prediction of prognosis and therapy would potentially improve outcomes in the affected population.


INTRODUCTION
The National Arthritis Data Workgroup estimated that rheumatoid arthritis (RA) affects 1.3 million (0.6%) adults in the US i . Women are two to three times more likely to be affected than men. There is evidence that early treatment improves patient outcomes and retards disease progression by limiting joint destruction and functional disability iv . Although antibodies to a citrulline-containing protein, filaggrin, were discovered more than 20 years ago in patients with RA v,vi , interest has increasingly focused on anticitrullinated peptide/protein antibodies that serve as an important serological marker in the early diagnosis of RA as well as the progressive course of the disease vii . The 2010 ACR/EULAR (European League Against Rheumatism) criteria were specifically prepared to classify and identify patients with early RA who might benefit from early DMARD therapy, introducing ACPA as an alternative criterion to rheumatoid factor ii .
In addition to citrullination (enzymatic deimination of arginine), peptides can be carbamylated by the reaction of cyanate with lysine to form homocitrulline viii , which has also been found to be antigenic in RA patients. Although testing for ACPA has contributed significantly to diagnosis of RA, the sensitivity of ACPA as a biomarker for RA is suboptimal, and diagnostic criteria require clinical findings as well as other serologies for a confirmatory diagnosis and as a guide to prognosis ix . Interestingly, studies have shown that anti-carbamylated antibodies alone could increase sensitivity of detection by up to ~16% x . Whether the suboptimal sensitivity of ACPA is due to limitations of the antigenic probe set, assay methodology, disease heterogeneity or variation in the antibody response is not clear. In order to improve the sensitivity of detection, attempts have been made to modify citrullinated peptides, with the cyclic forms showing an improved sensitivity compared to linear counterparts xi . Because cyclic peptides are conformationally constrained and adopt fixed geometries, they usually bind more efficiently and with higher affinity to their respective receptors.
Moreover, cyclic peptides are often chosen over their linear analogues due to their enhanced enzymatic stability and enhanced membrane permeability, which result in improved bioavailability. They also possess entropic advantages in molecular recognition xii .
Recently, we created a peptide microarray with 110,000 peptides to detect autoantibodies in another autoimmune condition, celiac disease xiii . Patients with celiac disease develop antibodies not only to tissue transglutaminase (tTG) but also to post-translationally modified gliadin peptides (deamidated gliadin peptides, or DGP). Subsequently, using in silico generated peptides as antigen targets, we identified autoantibodies to tTG/DGP neopeptides in 99% of celiac disease patients with 100% specificity xiv . Employing similar technology, we devised a peptide microarray with more than 2 million citrullinated and carbamylated sequences, aiming to improve sensitivity for the diagnosis of RA without loss of specificity.

Study Cohort
To evaluate the diagnostic utility of the cyclic peptide library, sera were obtained from the study cohort summarized in Table 1. All samples were handled according to standard laboratory procedures and stored at -80°C. All samples were probed using a 1:100 primary antibody dilution and a 1:2000 secondary antibody dilution (anti-human IgG and anti-human IgA) and scanned on a microarray fluorescence scanner.

Study Approval
The study was conducted under the ethical principles that have their origins in the Declaration of Helsinki.
The serum samples used were obtained from the University of Washington and other providers following approval by the appropriate IRB panels as indicated.

Silicon substrate preparation
Prime-grade 300-mm silicon wafers (p-type, boron-doped, (1 0 0) orientation, resistivity 1 to 5 Ω·cm, 725-μm thickness) were obtained. A 100 nm thermal oxide was grown on the wafers by dry oxidation. The wafers were then etched through a dedicated photomask design using standard lithography techniques, creating feature areas with 100 nm of silicon dioxide separated by gap areas of bare silicon.
The surface derivatization of the wafers started with an ethanol wash for 5 minutes, followed by immersion in 1% by weight APTES in ethanol for 20-30 minutes to grow the silane layer. The wafers were then cured in a 110°C nitrogen bake oven to form a silane monolayer with an -NH2 group for attaching a linker molecule. Poly-L-glutamic acid (MW: 50,000-100,000) was activated and coupled to the silane layer to increase the surface density, and a poly-PEG layer was attached as a linker molecule. The surface derivatization was completed by adding an Fmoc-protected glycine layer as shown in Figure 1.

Peptide Array Synthesis
After the substrate preparation, the peptides were synthesized as shown in Figure 2. A photoresist composition comprising 1% by weight poly(vinylpyrrolidinone) (PVP) and 5% by weight piperidineglyoxylic acid (an ionic photobase generator, PBG) dissolved in dimethylformamide (DMF) was spin coated at 4000 rpm for 1 minute and soft baked at 70°C for 1 minute on a hot plate. The wafer was then selectively exposed to 365 nm UV at a dose of 50 mJ/cm² through a designed photomask, whereby the PBG converts into the piperidine base in the exposed regions. The wafer was then hard baked at 75°C for 2 minutes, selectively removing the Fmoc protection in the desired features. The incoming Fmoc-protected amino acid was pre-activated at its carboxylic acid using TBTU (2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethylaminium tetrafluoroborate), HOBt (hydroxybenzotriazole) and DIEA (N,N-diisopropylethylamine).
The mixture was spin coated onto the wafer at 3000 rpm for 30 seconds and baked at 65°C for 2 minutes on a hot plate to enable coupling of the Fmoc-protected amino acid to the free amine of the peptide sequence. This was followed by spin coating a solution of 50 wt% DMF and 50 wt% acetic anhydride to cap any unprotected amines left over from the coupling reaction. The whole process was repeated for each layer of amino acids, using individual reticles, until all layers of the designed sequences were complete. The step yield of this process is greater than 99.9995%, enabling high fidelity of synthesis. This is a key advantage of the photolithographic process, which enables superior assay performance with respect to reproducibility in comparison to spotted peptide arrays.
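To illustrate why the per-step yield matters, the fraction of full-length peptide after n coupling cycles can be estimated as the step yield raised to the power n. The sketch below is illustrative arithmetic only: the peptide lengths and the 99% comparison yield are assumptions chosen for the example, not values from this study, and the function name is hypothetical.

```python
def full_length_fraction(step_yield: float, n_steps: int) -> float:
    """Fraction of chains that are full length after n_steps couplings,
    assuming every step succeeds independently with probability step_yield."""
    if not 0.0 <= step_yield <= 1.0:
        raise ValueError("step_yield must be between 0 and 1")
    return step_yield ** n_steps


if __name__ == "__main__":
    for n in (10, 15, 20):  # hypothetical peptide lengths (number of layers)
        high = full_length_fraction(0.999995, n)  # lower bound on step yield stated in the text
        low = full_length_fraction(0.99, n)       # hypothetical lower-fidelity process
        print(f"{n:2d} steps: {high:.4%} full length vs {low:.2%}")
```

Under these assumptions, losses compound quickly at 99% per step but stay negligible at 99.9995%, which is the sense in which a high step yield supports reproducibility between replicate features.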

Array Side Chain Deprotection
Following the completion of peptide array synthesis, the side-chain protecting groups on the amino acids were removed to facilitate biological activity of the peptide sequences. A solution comprising 95 weight % trifluoroacetic acid (TFA) and 5 weight % DI water was reacted on the wafer for 2 hours to remove the side-chain protection from all amino acids as applicable.

Cyclization Methods
Various cyclization approaches were tested, as shown in Figure 3, to identify the method that produces the highest cyclization yield. The approaches used were as detailed below:
A) Cyclization was performed with a cysteine coupled as the first layer (C-terminus) and as the last layer (N-terminus) during peptide synthesis. All peptides were cyclized by cysteine bridge formation under mild oxidative conditions (air oxidation).
B) Peptides were cyclized using a glutamic acid at the C-terminus, which was coupled to the N-terminal amine of the peptide chain by reacting the wafer with an amino acid activation cocktail comprising TBTU/DIEA, facilitating formation of the N->C cyclic peptide bond.
C) Peptides were cyclized by click chemistry using an azido-lysine at the C-terminus and a propargyl-glycine at the N-terminus. Cyclization was performed by Cu-catalyzed alkyne-azide cycloaddition, yielding 1,4-disubstituted 1,2,3-triazoles that covalently link both termini to form a cyclized peptide.
D) Cyclization was performed by bridging two primary amines, the N-terminus and a lysine side chain (C-terminus), by reacting the wafer in a linker solution comprising disuccinimidyl glutarate.
E) Cyclization was performed with two cysteines, one at the N-terminus and one at the C-terminus, by reacting the sulfhydryl groups of the two residues with dibromoxylene under mild basic conditions.
F) Peptides were cyclized using an N-terminal cysteine and a C-terminal thioester in a 2-step N->C cyclization reaction. In the first step, the thiol group performs a nucleophilic attack on the C-terminal carbonyl moiety; this is followed by an irreversible intramolecular S->N acyl shift on the newly formed thioester group, yielding backbone-cyclized peptides.
Cyclization yields were determined by measuring the activity of the C-terminal amino acid using fluorescence before and after the cyclization reaction, as shown in Table 2.
The results indicated that Method E produced the highest cyclization yield, and it was used for the final peptide array cyclization.
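The before/after fluorescence comparison described above can be turned into a yield estimate in several ways; the text does not give the exact formula. The following is a minimal sketch under one plausible assumption, namely that the fluorescence signal is proportional to the number of residues that remain unreacted (and therefore still detectable) after cyclization. The function name and the example numbers are hypothetical.

```python
def cyclization_yield(signal_before: float, signal_after: float) -> float:
    """Estimated fraction of peptides cyclized.

    Assumption (not stated in the source): fluorescence is proportional to the
    amount of unreacted C-terminal residue, so the relative drop in signal
    equals the fraction of peptides that cyclized.
    """
    if signal_before <= 0:
        raise ValueError("signal_before must be positive")
    # Clamp to [0, 1] so measurement noise cannot give a yield outside 0-100%.
    fraction_unreacted = max(0.0, min(1.0, signal_after / signal_before))
    return 1.0 - fraction_unreacted


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary fluorescence units.
    print(f"{cyclization_yield(1000.0, 50.0):.1%}")
```

If the readout instead reports the cyclized product directly, the ratio would need to be inverted; the choice depends on the labeling chemistry, which is why this is offered only as a sketch.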

Cyclic Peptide Design
A designed peptide library of RA-related protein families, including filaggrin, fibrinogen, vimentin, collagen II, enolase, histones, and 14-3-3 eta, was synthesized with a lateral shift of 2 amino acids, as shown in Figure 4, to bias the 2 million random peptide array sequences (data not shown). The library was constructed with native epitopes, citrullinated epitopes (conversion of the amino acid arginine to citrulline) and carbamylated epitopes (conversion of the amino acid lysine to homocitrulline). The library was designed with a photomask series with 3 replicates of each sequence and synthesized using photolithography.
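The sliding-library construction described above can be sketched in a few lines. This is an illustration rather than the design software used in the study: the window length, the toy input sequence, the token names `Cit` and `hCit`, and the choice to convert every arginine (or every lysine) in a window are all assumptions made for the example.

```python
from typing import Dict, Iterator, List

CIT = "Cit"    # token for citrulline, replacing arginine (R)
HCIT = "hCit"  # token for homocitrulline, replacing lysine (K)


def sliding_windows(protein: str, length: int, shift: int = 2) -> Iterator[str]:
    """Yield overlapping peptides of fixed length, stepping `shift` residues each time."""
    for start in range(0, len(protein) - length + 1, shift):
        yield protein[start:start + length]


def modified_variants(peptide: str) -> Dict[str, List[str]]:
    """Return native, citrullinated (R -> Cit) and carbamylated (K -> hCit) forms
    as lists of residue tokens. Converting all R or all K is a simplification."""
    residues = list(peptide)
    return {
        "native": residues,
        "citrullinated": [CIT if aa == "R" else aa for aa in residues],
        "carbamylated": [HCIT if aa == "K" else aa for aa in residues],
    }


if __name__ == "__main__":
    toy_sequence = "GSRGKHSRQEST"  # made-up sequence, not from any protein in the study
    for window in sliding_windows(toy_sequence, length=8, shift=2):
        print(window, modified_variants(window)["citrullinated"])
```

Representing modified residues as tokens in a list, rather than as single letters, avoids overloading the one-letter amino acid alphabet.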

Figure 4 Sliding library of peptides
A training cohort of RA-positive and ELISA CCP-positive samples was run on this design library to probe commonly binding subsequences and dominantly occurring amino acids. The biases learned from this library were used to design the final biased random library of 2 million cyclic modified peptides.

Conventional Assays
The following commercial kits were used to compare results of the Vibrant ACPA assay: RF (Beckman Coulter) and CCP (Inova Diagnostics).

Data Analysis
To identify the optimal peptide epitopes from the library, the sample cohort from Table 1 was run on the peptide microarray. Immunoassay binding activity was scanned using the fluorescence microarray scanner. The scanned binding data were mapped using a region-of-interest (ROI) analysis and converted into raw fluorescence binding intensity. Subsequently, the intensity was normalized against the spots where no epitopes were synthesized, using a least-variant-set normalization algorithm.
Using historic data from previous microarrays, a random forest classifier was trained to detect spots that were not reproducible between replicates of each individual sequence. A random forest classifier is a machine learning procedure that learns from existing data and applies what it has learned to new, unclassified data sets. All epitopes falling outside the 95% confidence interval of the linear regression were removed from the data analysis.
A receiver operating characteristic (ROC) curve was then generated for each sequence to identify the threshold value giving the best combination of sensitivity and specificity. Overall sensitivity and specificity were calculated on the condition that a sample is considered positive if at least 5 unique epitopes have normalized units greater than the threshold chosen for the corresponding sequence.
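The per-sequence thresholding and the "at least 5 epitopes" rule can be sketched as follows. The text does not say which criterion was used to pick the optimal point on each ROC curve; the example assumes Youden's J statistic (sensitivity + specificity - 1), a common choice. All function names and data structures here are hypothetical.

```python
from typing import Dict, Optional, Sequence


def youden_threshold(cases: Sequence[float], controls: Sequence[float]) -> Optional[float]:
    """Return the cutoff that maximizes Youden's J for one peptide sequence.

    A signal strictly greater than the cutoff is called positive. Candidate
    cutoffs are the observed values; ties keep the lowest cutoff.
    """
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(cases) | set(controls)):
        sensitivity = sum(v > cutoff for v in cases) / len(cases)
        specificity = sum(v <= cutoff for v in controls) / len(controls)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff


def is_positive(sample: Dict[str, float], thresholds: Dict[str, float],
                min_epitopes: int = 5) -> bool:
    """Call a sample positive when at least `min_epitopes` distinct epitopes
    exceed their own sequence-specific threshold."""
    hits = sum(1 for seq, cutoff in thresholds.items()
               if sample.get(seq, 0.0) > cutoff)
    return hits >= min_epitopes
```

In use, `youden_threshold` would be applied once per sequence on training data to build the `thresholds` dictionary, and `is_positive` would then be applied to each sample's normalized intensities keyed by sequence.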

Statistical Analysis
McNemar's chi-square test was used to compare the sensitivity of Vibrant ACPA with that of the combined RF and CCP assays on paired data. The complete sensitivity and specificity comparisons are shown in Table 5. The same set of samples was run on each individual method to obtain comparable values for sensitivity and specificity.
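For readers unfamiliar with McNemar's test, it uses only the discordant pairs, that is, patients who are positive by one assay and negative by the other. The sketch below computes the chi-square statistic with 1 degree of freedom; whether the authors applied the continuity correction is not stated, so it is exposed as an option, and the example counts are hypothetical.

```python
import math
from typing import Tuple


def mcnemar_chi_square(b: int, c: int, continuity: bool = True) -> Tuple[float, float]:
    """McNemar's test on the discordant cells of a paired 2x2 table.

    b: samples positive by assay 1 only; c: samples positive by assay 2 only.
    Returns (chi-square statistic, p-value) for 1 degree of freedom.
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs: no evidence of a difference
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    stat = diff ** 2 / (b + c)
    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical discordant counts among RA patients.
    stat, p = mcnemar_chi_square(b=25, c=5)
    print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

With small discordant counts, an exact binomial version of the test is usually preferred over the chi-square approximation.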

Analytical Studies
Analytical accuracy of the peptide array was validated using the spike/recovery method.

Table 4 Extent of variability in repeated assays of the peptide array expressed as the coefficient of variation (CV)
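The two metrics named in this section have standard definitions: percent recovery compares the measured increase after spiking with the amount added, and the CV is the standard deviation of replicate measurements divided by their mean. A short sketch with hypothetical numbers (none taken from the study) follows.

```python
import statistics
from typing import Sequence


def percent_recovery(spiked: float, unspiked: float, amount_added: float) -> float:
    """Spike/recovery: measured increase as a percentage of the amount spiked in."""
    if amount_added == 0:
        raise ValueError("amount_added must be non-zero")
    return 100.0 * (spiked - unspiked) / amount_added


def coefficient_of_variation(values: Sequence[float]) -> float:
    """CV in percent, using the sample standard deviation (n - 1 denominator)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units.
    print(percent_recovery(spiked=148.0, unspiked=50.0, amount_added=100.0))
    print(coefficient_of_variation([98.0, 100.0, 102.0]))
```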

Alanine scanning mutagenesis
Detailed characterization of antibody-antigen interactions is necessary for accurate diagnosis of autoimmune conditions. Antibody recognition of in situ synthesized peptide antigens at the amino acid level can be verified using a technique called alanine scanning. Briefly, peptide epitopes of a set of monoclonal antibodies were grown on the microarray platform along with mutants for each individual amino acid along the epitope sequence. The mutant library was synthesized by replacing each amino acid (AA) with alanine, one AA at a time. Alanine has an inert methyl side chain, so the substitution helps reveal the binding contribution of the amino acid it replaces. The peptide mutants for each epitope were thus synthesized and reacted with the monoclonal antibody to identify the key amino acids contributing to the binding interaction. This enabled validation of the antibody recognition of the in situ synthesized peptides at the amino acid level; the data are presented in supplementary material I.
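Generating the mutant library for an alanine scan is mechanical, as the sketch below shows. The epitope used in the example is a made-up string and the function name is hypothetical; positions that are already alanine are skipped, which is a simplification.

```python
from typing import List, Tuple


def alanine_scan(epitope: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, original residue, mutant sequence) for every
    single-residue alanine substitution of the epitope."""
    mutants = []
    for i, residue in enumerate(epitope):
        if residue == "A":
            continue  # substitution would reproduce the wild-type sequence
        mutants.append((i + 1, residue, epitope[:i] + "A" + epitope[i + 1:]))
    return mutants


if __name__ == "__main__":
    for position, original, mutant in alanine_scan("DYKA"):  # toy epitope
        print(position, original, mutant)
```

A residue whose substitution markedly reduces antibody binding relative to the wild-type peptide is then interpreted as contributing to the interaction.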

Clinical Sensitivity and Specificity
The study cohort of 1056 samples shown in Table 1 was used to test the diagnostic utility of this cyclic modified peptide library (Vibrant ACPA) and the corresponding linear peptide library (ALPA). The samples were also tested on the commercial platforms for RF (Beckman Coulter) and CCP (Inova Diagnostics). As summarized in Table 5, the clinical sensitivity was 95.04% and the clinical specificity was 95.27% for the Vibrant ACPA assay. Sensitivities and specificities of the individual citrullinated and carbamylated peptide libraries are also shown. The effect of this improved assay performance is also reflected in the positive and negative predictive values as shown below.
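As a generic illustration of how predictive values follow from sensitivity and specificity, and not a reproduction of the study's reported values, the sketch below applies Bayes' rule at a given disease prevalence. The prevalence figures in the example are hypothetical and the function name is an assumption.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Sensitivity and specificity from the text; prevalences are hypothetical.
    for prevalence in (0.01, 0.10, 0.50):
        ppv, npv = predictive_values(0.9504, 0.9527, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

The example makes the usual point that predictive values depend strongly on the population tested, so values observed in a study cohort do not transfer directly to a setting with a different prevalence.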

Citrullination Vs Carbamylation
Both citrulline and homocitrulline are nonstandard amino acids containing a ureido group. While citrulline is formed by enzymatic deimination of arginine, homocitrulline is formed when peptides are carbamylated by the reaction of cyanate with lysine. Since we grew essentially the same peptide sequences with both citrulline and homocitrulline modifications, it was interesting to note that for a majority of patients the antibodies bound to both versions of a sequence. Both citrullinated and carbamylated epitopes favored the amino acids E, Q, S, G, R and T in closer proximity than the other amino acids (data not shown).

DISCUSSION
Autoantibodies against citrullinated peptides have emerged as biomarkers for the diagnosis of rheumatoid arthritis and are now routinely utilized as a part of diagnosis and disease management xv,vi . Though these markers are very specific, sensitivity of disease detection is suboptimal and there is conflicting literature comparing anti-CCP2 and anti-CCP3 assays xvi,xvii,xviii . Knowing that diverse proteins are implicated as antigens xix and cyclization of post-translationally modified peptides improves antibody recognition xx , we have developed powerful tools to create a vast library of cyclic citrullinated and carbamylated peptides that markedly improve the sensitivity of RA diagnosis. These novel peptides demonstrated a significantly improved sensitivity for the diagnosis of RA while maintaining specificity compared to the standard assays for CCP and RF combined as shown in Table 5.
The CMP microarray presented here paves the way for the generation of novel cyclic peptides that can be tested for diagnosis, prognosis and, potentially, for monitoring response to treatment. To synthesize millions of such peptides cost-effectively enough to be applicable in a healthcare setting, we employed the silicon platform to grow a diverse group of cyclic molecules. We first confirmed that efficient cyclization, by conferring conformational stability, plays an important role in improving the identification of ACPA in RA compared to linear epitopes. These cyclic epitopes enable diagnosis of RA with 95.04% sensitivity and 95.27% specificity, which is superior to existing assays. The diversity of peptides and the use of both citrullination and carbamylation are also key to capturing the polyclonal response among individuals diagnosed with RA.
In a previous study, ACPA was shown to be more sensitive and more predictive of erosive disease than rheumatoid factor, with a combined sensitivity and specificity of 67% and 95%, respectively xxi . Moreover, a meta-analysis of 151 studies showed the sensitivity and specificity of ACPA to be 57% and 96%, respectively (the analysis refers only to 15 relevant cohort studies) xxii , supporting ACPA as the preferred biomarker for early RA detection. ACPA appears to be a reliable predictor of erosive RA, making it a potentially important prognostic tool that might be used to direct patient management decisions. Other studies have also found that ACPA can be detected prior to the onset of arthritis and also in RA patients with recent-onset arthritis vi,xxiii . The time interval from the onset of the first symptoms to the fulfillment of the classification criteria seems to be directly related to the initial ACPA level xxiv . In the future, it will be interesting to determine whether individual peptide specificities relate to early disease or to development of erosions. In addition to its potential for early diagnosis, prognosis and exploration of pathogenesis, the CMP microarray described here may be broadly useful for identifying lead compounds for drug discovery and for the development of new peptide antigens.
In summary we present here a novel cyclic modified peptide microarray for the diagnosis of RA. The development of this high fidelity and high throughput method will be an important tool for improved diagnosis of this disease with the potential for application to prognosis and monitoring of treatments in the future.

ACKNOWLEDGMENTS
Research reported in this publication was supported by Vibrant America LLC. The content is solely the responsibility of the authors and does not necessarily represent the official views of the FDA. We would like to thank Miaomiao Sun and Ashish Gopalakrishnan for generating the figures.

Author Contributions
Conceived had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests Statement
The authors have read the journal's policy and the authors of this manuscript have the following competing interests. All of the authors listed in this paper are employees of Vibrant Sciences or Vibrant America, with the exception of KE, who is an academic research collaborator. Vibrant America is a clinical laboratory performing commercial diagnostic testing.

Figure 3 Cyclization Methods. Different chemistries that were used to achieve cyclization are demonstrated from Method A to Method F as described under Cyclization Methods.
