The Efficacy of Genetic Testing for Early Detection of Psoriatic Arthritis in Patients with Psoriasis

Background: Improved understanding of the genetic architecture of psoriatic disease (PsD) and reductions in genotyping costs provide an opportunity to assess the application of genetic testing for the diagnosis of psoriatic arthritis (PsA). The study aimed to assess the performance of a multi-marker genetic kit in classifying patients as having PsD and PsA and to assess whether performance improved when genetic and clinical data were combined. Methods: We analyzed 328 patients with psoriasis and musculoskeletal symptoms (78 PsA and 250 psoriasis alone) who were referred to rheumatology for suspected PsA, and 341 non-psoriatic controls. A custom multi-SNP genetic assay, including 41 variants based on large-scale genome-wide association studies in PsA, was genotyped. Machine-learning methods were used to identify the optimal classification model by area under the receiver operating characteristic curve (AUC). Results: The performance of all three classifiers in distinguishing PsD from non-psoriatic controls was moderate, with similar AUC (range 0.64 to 0.69). Logistic regression had the highest AUC and showed moderate sensitivity (61.9%) and specificity (67.9%). The ability of the models to correctly classify PsA among all psoriasis patients was suboptimal, with AUC ranging from 0.55 to 0.60, low sensitivity (1.3% to 34.6%) and moderate to high specificity (69.2% to 99.6%). The combination of genetic and clinical data improved the performance of the models (AUC 0.65 to 0.69); however, the information contributed by the genetic markers was only marginal. Conclusions: Genetic testing has a marginal effect on correctly classifying PsA among psoriasis patients in a clinical setting.


Background
Psoriatic arthritis (PsA) affects up to a third of patients with psoriasis; thus, efforts to improve early detection of the disease focus on patients with psoriasis (1). Delays in diagnosis remain a major gap in care in PsA. Studies have shown that 79% of patients with PsA were diagnosed more than 3 months after symptom onset (2) and that the wait time to rheumatology referral was substantially longer than for other rheumatic conditions, such as rheumatoid arthritis and lupus (3). These gaps in care contribute to poor disease outcomes, such as the development of joint damage, disability and low quality of life.
One of the major factors that contributes to the delayed diagnosis of PsA is the lack of an objective laboratory test to aid diagnosis in the early stages of the disease. The presentation of PsA can be nonspecific and subtle; furthermore, it can co-exist with non-inflammatory musculoskeletal conditions, such as osteoarthritis, which can result in misdiagnosis and delays in referral to rheumatology care among non-specialists (4,5).
To date, strategies to improve early diagnosis of PsA have been directed towards screening patients with psoriasis for musculoskeletal symptoms. Recent efforts have focused on screening questionnaires; however, these tools suffer from low sensitivity and specificity. Since a considerable proportion of psoriasis patients with musculoskeletal complaints do not have PsA, but rather have other non-inflammatory rheumatic conditions, such as osteoarthritis (6), a simple, scalable tool is needed that can help dermatologists and family physicians triage these patients and prioritize rheumatology referral of potential PsA cases (7).
Both psoriasis and PsA have a strong genetic component, as is evident from their substantial heritability (40-90%) and familial aggregation (8,9). This heritability is influenced by specific loci that determine the individual's susceptibility to develop these conditions. Indeed, several large genome-wide association studies (GWAS) identified over 60 risk loci for psoriasis, which encode proteins involved in the innate immune system, antigen presentation and the acquired or adaptive immune response (10). Despite the smaller number of studies, several GWAS and candidate-gene studies identified specific genetic variants associated with PsA. While some of these genes overlap with those of psoriasis, others could differentiate patients with PsA from those with psoriasis without arthritis (PsC) (11-21). The strongest and most reproducible risk loci for PsA are located within the MHC region, particularly the HLA-B and HLA-C genes. Additional suggested PsA-specific loci outside of the MHC region include 5q31, IL-23R, PTPN22, TNFAIP3, FBXL-19, ZNF816, IFNG and IL-13 (11-21). The improved understanding of the complex genetic architecture of PsA, along with the reduction in the costs of genotyping, provides an opportunity to assess the application of genetic testing for PsA diagnosis in a clinical setting.
Currently, there is limited information about the utility of genetic testing in optimizing triage of psoriasis patients with musculoskeletal symptoms and improving early detection of PsA.
We hypothesize that genetic testing could improve the accuracy of early detection of PsA among psoriasis patients who present with musculoskeletal symptoms. The overall aim of our research is to develop a personalized model for identifying PsA and PsD using genetic data. Specifically, we aimed to 1) assess the performance of the multi-marker genetic kit in distinguishing patients with PsD from non-psoriatic controls and patients with PsA from those with psoriasis alone; and 2) assess whether the performance of the prediction model for PsA improved when genetic and clinical data were combined.

Study population
Psoriasis patients who were referred to rheumatology for suspected PsA were prospectively recruited in a single academic medical centre in Toronto, Canada. All patients had a dermatologist-confirmed diagnosis of psoriasis and were experiencing musculoskeletal symptoms. None of the patients had a prior diagnosis of PsA. The patients were referred from dermatology and family medicine clinics as well as from a phototherapy centre that serves as a tertiary referral centre for dermatologists from the Greater Toronto Area. The control population comprised osteoarthritis patients with no psoriasis, inflammatory arthritis or other autoimmune diseases. The study was conducted from January 2016 to December 2018.

Case definition
All participating patients and controls underwent a clinical assessment by a rheumatologist. Patients were classified as having PsA if they met the CASPAR classification criteria for PsA. Patients who were classified as not having PsA at baseline were reassessed after 1 year to determine whether they had developed PsA since the baseline visit. The rheumatologist was blinded to the genetic testing results. In addition, clinical information about the patients' demographics, psoriasis characteristics, duration and severity, family history of PsA, comorbidities, and patient-reported outcomes was collected at the baseline visit.

Genetic Testing
A custom multi-SNP genetic assay was designed using the program Assay Design Suite and Typer IV

Statistical analysis
The accuracy of genetic testing for differentiating between PsA and non-PsA patients was assessed separately for each of the following two outcomes: 1) diagnosis of PsA at baseline; and 2) diagnosis of PsA at 1 year. In addition, we assessed the ability of genetic testing to distinguish between PsD and non-psoriatic controls.
First, we tested the association between each genetic marker individually and patient diagnosis using the Chi-square test and reported the odds ratio (OR) and its 95% confidence interval (CI). Results were reported for significance levels of p < 0.05 and after correcting for multiple testing (p < 0.001 for 41 individual tests).
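The single-marker analysis above can be illustrated with a minimal sketch. The 2x2 carrier counts below are hypothetical and are not taken from the study; the Woolf (log-based) confidence interval and the uncorrected Pearson chi-square are assumptions about the exact variants used, which the text does not specify. The sketch also shows that a Bonferroni correction of 0.05 over 41 tests gives a threshold of about 0.0012, so the reported cut-off of p < 0.001 is slightly more conservative.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) 95% CI for a 2x2 table.

    a: carriers among cases,    b: non-carriers among cases
    c: carriers among controls, d: non-carriers among controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def chi_square_p_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# Hypothetical counts: 30 of 78 cases and 60 of 250 comparators carry the variant.
a, b, c, d = 30, 48, 60, 190
odds_ratio, lower, upper = odds_ratio_ci(a, b, c, d)
p_value = chi_square_p_1df(chi_square_2x2(a, b, c, d))
threshold = bonferroni_threshold(0.05, 41)

print(f"OR {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f}), p = {p_value:.4f}")
print(f"Bonferroni threshold for 41 tests: {threshold:.5f}")
print("Significant after correction:", p_value < threshold)
```

With these illustrative counts the marker would be nominally significant at p < 0.05 but would not survive the correction, which mirrors the pattern reported for the PsA vs. PsC comparison.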
We then used all of the information from the genetic assay to develop machine-learning classifiers to predict PsA and PsD diagnosis, including logistic regression, naïve Bayes and random forest. Age and sex were included in each of the models in addition to genetic data. The performance of the resulting classification models at distinguishing between PsA and non-PsA patients, and between PsD and non-psoriatic controls, was evaluated by calculating their sensitivity, specificity, precision (the proportion of predicted cases that were correct) and area under the receiver operating characteristic curve (AUC). We then selected a number of clinical and demographic variables to assess whether their combination with genetic data could improve model performance. We selected clinical variables that were reported to be associated with PsA compared to psoriasis alone. The following variables were selected for testing: age, sex, race (Caucasian vs. non-Caucasian), psoriasis duration, a history of uveitis, nail psoriasis, flexural psoriasis, psoriasis area and severity index (PASI), body mass index (BMI), patient pain (visual analogue scale of 0 to 10), health assessment questionnaire disability index (HAQ-DI), Functional Assessment of Chronic Illness Therapy - Fatigue (FACIT), high-sensitivity C-reactive protein (hsCRP) and family history of PsA.
We used the RWeka package in R for the machine learning and statistical analysis.
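As a reminder of how the four performance measures named above are defined, here is a minimal sketch in plain Python. It is not the RWeka code used in the study; the labels and scores are hypothetical toy values, and the AUC is computed through its rank interpretation (the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one half).

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 1 (case) and 0 (non-case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """Proportion of true cases that the classifier detects."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of true non-cases that the classifier clears."""
    return tn / (tn + fp)

def precision(tp, fp):
    """Proportion of predicted cases that are true cases."""
    return tp / (tp + fp)

def auc(y_true, scores):
    """AUC via pairwise comparison of case and non-case scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 3 cases, 5 non-cases, with predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.90, 0.40, 0.35, 0.80, 0.30, 0.20, 0.10, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # default cut-off of 0.5

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("sensitivity:", round(sensitivity(tp, fn), 3))
print("specificity:", round(specificity(tn, fp), 3))
print("precision:  ", round(precision(tp, fp), 3))
print("AUC:        ", round(auc(y_true, scores), 3))
```

The toy example shows a classifier with a fairly high AUC that still has low sensitivity at the default cut-off, which is why the threshold-dependent measures are reported alongside AUC.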

Results
A total of 328 patients with psoriasis who were referred for a suspected diagnosis of PsA due to musculoskeletal symptoms were enrolled in the study. Of those, 78 patients were classified as having PsA by the rheumatologist at the initial visit (PsA-baseline) and the remaining 250 patients were deemed not to have PsA (cutaneous psoriasis (PsC)). An additional 17 patients developed PsA within 1 year after the baseline visit, resulting in 95 patients with PsA at 1 year (PsA-1 year). Patients who were lost to follow-up were assumed not to have PsA at 1 year. In addition, DNA from 341 non-psoriatic controls was analyzed.
The patient characteristics are shown in Table 1.

Genetic-based classifiers for PsD vs. controls
The association between the tested SNPs and PsD vs. controls is shown in Fig. 1 and supplementary Table 1.
The frequencies of the following 17 SNPs located in 8 genes were associated with PsD compared with controls: IL-23R (rs11209026), PTPN22 (rs2476601), REL (rs13017599), 5q31 (rs715285), HLA-C (rs887466, rs3869115, rs1050414, rs12189871, rs12191877, rs13214872, rs2894207, rs4406273), HLA-B (rs6457374, rs3131382, rs3129944), MICA (rs67841474) and TYK2 (rs34536443). After correction for multiple testing, the variants in PTPN22, REL, HLA-C and HLA-B remained significantly associated with PsD vs. controls (p < 0.001). Next, we used all 41 SNPs to develop classifiers to predict PsD diagnosis. The performance of the three machine-learning methods (logistic regression, naïve Bayes and random forest) is shown in Table 2. Overall, the performance of all three classifiers was moderate, with similar AUC (range 0.64 to 0.69). Logistic regression had the highest AUC and showed moderate sensitivity (61.9%) and specificity (67.9%).
In the comparison between PsA and PsC at baseline, the following three SNPs were more frequent in patients with PsA: LCE3A (rs10888503, p = 0.004, OR 1.73), IL-23R (rs4655683, p = 0.049, OR 1.46) and TNIP1 (rs146571698, p = 0.038, OR 2.12). None of them remained significantly associated with PsA after correcting for multiple testing. We then analyzed the ability of the multi-marker panel to correctly distinguish patients with PsA at baseline from those with PsC (Table 3).
Overall, the classifiers showed low sensitivity (range 1.3-32%) and high specificity (range 76-99.6%) with a modest AUC of 0.55 to 0.60. The results were not substantially changed when the outcome was considered to be PsA-1-year (AUC 0.52 to 0.56, data not shown). Lastly, we evaluated the change in specificity and sensitivity when modifying the logistic regression model cut-off level. Using a cut-off level of 0.3 instead of the default level of 0.5, the sensitivity increased from 24.4% to 34.6% and the specificity decreased from 88.4% to 69%; with a cut-off level of 0.7, the sensitivity decreased to 7.7% and the specificity increased to 98%.
Since the performance of the genetic-based classifiers was suboptimal, we evaluated whether combining clinical and genetic data could improve their performance in classifying patients as having PsA vs. PsC. The results are shown in Table 4. The combination of genetic and clinical data improved the performance of the models, with an AUC ranging from 0.65 to 0.69. Logistic regression had the highest AUC (0.69) and showed reasonable specificity (81.2%) with low sensitivity (47.4%). Using a lower cut-off level of 0.3, the sensitivity increased to 57.6% and the specificity decreased to 80%. To find irrelevant and redundant features, we also applied the gain ratio attribute evaluator and ranker search method to score the genetic and clinical variables. The variables that contributed the most information to the model were rs146571698 (TNIP1), HAQ and hsCRP. The results were not substantially changed when the outcome was considered to be PsA-1-year (AUC 0.65 to 0.69, data not shown).
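The trade-off produced by moving the logistic regression cut-off can be sketched as follows. The predicted probabilities are hypothetical toy values, not study data. The second part back-calculates patient counts from the reported genetics-only sensitivities and the 78 patients with PsA at baseline; these counts are an inference from the published percentages and are not reported in the text.

```python
def rates_at_cutoff(y_true, probs, cutoff):
    """Return (sensitivity, specificity) when prob >= cutoff is called PsA."""
    tp = fn = tn = fp = 0
    for truth, prob in zip(y_true, probs):
        predicted = prob >= cutoff
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predicted probabilities for 4 cases and 6 non-cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.75, 0.55, 0.35, 0.20, 0.60, 0.40, 0.25, 0.15, 0.10, 0.05]

for cutoff in (0.3, 0.5, 0.7):
    sens, spec = rates_at_cutoff(y_true, probs, cutoff)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")

def implied_count(percent, group_size):
    """Number of patients implied by a reported percentage of a group."""
    return round(percent / 100 * group_size)

# Reported genetics-only sensitivities by cut-off, applied to 78 PsA patients.
reported_sensitivity = {0.3: 34.6, 0.5: 24.4, 0.7: 7.7}
detected = {c: implied_count(p, 78) for c, p in reported_sensitivity.items()}
print("PsA patients detected out of 78, by cut-off:", detected)
```

Lowering the cut-off can only flag more patients, so sensitivity rises while specificity falls. Under the back-calculation, even the most permissive cut-off tested would identify roughly 27 of the 78 patients with PsA.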

Discussion
Our study evaluated the performance of a multi-marker genetic kit in classifying patients with PsD and PsA in a real-world setting. The results of our study suggest that, despite the association of several genetic markers with PsD and PsA, the overall ability of a multi-marker genetic test to correctly classify patients with PsD and those with PsA is only modest. In particular, the low sensitivity of the genetic test suggests that it is not useful as a screening tool among psoriasis patients referred to rheumatology for suspected PsA.
The use of genetic testing at the point of care is part of common practice for the diagnosis of ankylosing spondylitis (AS), as testing for the presence of HLA-B*27 is part of the classification criteria for axial SpA (23) and can help with the diagnosis of this condition. However, PsA is much more genetically heterogeneous and, unlike AS, in which HLA-B*27 is carried by the majority of patients, no single dominant susceptibility gene has been identified in PsA (22,24). The majority of the genetic heritability in PsA is explained by genes located in the MHC region, and there are few non-MHC genes that have been specifically associated with PsA when compared with PsC (11,12,20). This context may explain the suboptimal performance of our multi-marker genetic test.
Patrick et al. used 200 genetic markers to develop a classifier that distinguished between PsA and PsC with good fidelity (AUC 0.82) (25). However, it should be noted that patients with PsA and PsC were recruited from different sources, thus potentially resulting in substantial differences between the two populations that may have been driven by different genetic profiles. In contrast, our study focused on patients with prevalent and often long-standing psoriasis, which may create a type of selection bias because patients who developed PsA before or at the time of psoriasis onset were not included. Thus, susceptibility genes with high penetrance for PsA, such as HLA-B*27, are expected to be less frequent in our study population, as patients who carry these genes are expected to have already developed PsA (immortal time bias) (26). Indeed, the prevalence of the tagging SNP for HLA-B*27 was only 4%, which was similar to the prevalence found in the general population and much lower than the reported prevalence in other PsA cohorts (~15%) (14,26). However, we believe that our source population of psoriasis patients represents a more relevant population in which to test the performance of genetic testing in a clinical setting, as it represents a typical population of patients referred to rheumatology for suspected PsA. In this population, genetic factors with high penetrance for PsA may play a lesser role in disease susceptibility compared to other genes with weaker effects or to environmental risk factors.
The combination of clinical and genetic factors improved the performance of the classifier, with AUC approaching 0.70 and a reasonable specificity; however, the sensitivity of the test remained low.
Moreover, most of the information in these models was provided by non-genetic factors, including patient-reported outcomes and other clinical variables. The relatively low sensitivity of these models precludes their use as screening tools.
Our study was limited in several respects. First, the relatively small sample size may have limited the performance of some of the machine-learning methods. Secondly, the study population included patients with psoriasis who had musculoskeletal symptoms; thus, some of the patients who were classified as PsC at baseline may have had subclinical PsA, leading to clinical misclassification and contributing to the suboptimal performance of the model. Indeed, 17 (6.8%) of the 250 patients with PsC converted to PsA within 1 year, although no significant differences in model performance were observed when these converters were considered.

Conclusions
The results of our study showed that the overall ability of multi-marker genetic testing to distinguish patients with PsA from those with PsC was only modest. This suggests that genetic testing has a limited role in screening for or diagnosing PsA among patients with psoriasis in a clinical setting. Future studies should focus on more dynamic biomarkers that could reflect the change in relevant immunologic pathways over time.
The study was approved by the Women's College Hospital Ethics Board and all patients gave their informed consent (REB#2016-0043). All participants signed an informed consent form for participation in the study.

Consent for publication
Not applicable

Availability of data and materials
The datasets generated and/or analysed during the current study are not publicly available due to terms of our research ethics board.

Competing interests
The authors declare that they have no competing interests

Authorship
LE: conception, design of the work, acquisition of data, interpretation of data; drafted the manuscript; QL: design of the work, data analysis; interpretation of the results; review of the manuscript; DJ: design of the work, data acquisition, review of the manuscript; CF: design of the work, data acquisition, review of the manuscript; JY: data acquisition, review of the manuscript; PR: conception, design of the work, acquisition of data, interpretation of data; review of the manuscript; All authors have read and approved the final manuscript.

Funding
The study was supported by a pilot research grant from the Spondyloarthritis Research Consortium of Canada (SPARCC).