Psychometric Validation of a Novel PRO Measure for Assessing Patient-reported Experience of Cognitive Impairment in Schizophrenia (PRECIS)

Background Cognitive impairment associated with schizophrenia (CIAS) can be a distressing feature that contributes to the burden of the disorder, as well as being a strong predictor of functional impairment. To fully assess the burden to patients living with this illness, there is a need to develop a specific measure of patient-reported outcomes (PROs). Methods Following initial development of the Patient-Reported Experience of Cognitive Impairment in Schizophrenia (PRECIS) instrument, the domain structure, reliability (inter-item consistency and test-retest reliability), and validity (discriminant, divergent, and convergent validity) of the tool were assessed in patients (aged 18–55 years) with CIAS participating in a 12-week, Phase II, randomized, double-blind, placebo-controlled study to evaluate the efficacy, safety, and tolerability of BI 409306. Healthy control subjects were recruited separately. The PRECIS instrument was completed at baseline, Week 6, Week 9, and Week 12. 35 included PRECIS

existing objective measures of cognition and serves to define key patient-based endpoints for use in future clinical studies.

Background
The majority of patients with schizophrenia experience some level of underlying cognitive impairment [1]. Cognitive impairment associated with schizophrenia (CIAS), a serious and often distressing feature of schizophrenia [2], can manifest as deficits in processing speed, attention, episodic memory, working memory, social cognition, and executive function [3,4]. As such, CIAS is associated with substantial functional impairment [2,5], with 20%-60% of variation in functional outcome attributed to cognitive performance [2]. Patients often have some degree of awareness of cognitive deficits, but clinicians typically neglect to ask about the impact of these symptoms on the patient. Pharmaceutical management of schizophrenia has largely targeted managing positive symptoms, increasing the length of remission, and reducing the duration and severity of acute psychosis. While therapies are effective in reducing psychosis, there is typically little improvement in a patient's status, in part due to a lack of treatment effect on negative symptoms and CIAS [6]. Currently, there is a need for effective pharmacological treatments for CIAS [1,3] and for a patient-reported outcome (PRO) instrument to capture subjective outcomes of interventions in mental health [7]. The importance of assessing the patients' perspective is underscored by the increasing demand from regulatory bodies and health technology appraisals to increase patient-centered outcomes research, including the use of PRO instruments [8]. Several interviewer- and performance-based instruments exist to evaluate specific aspects of functional capacity, cognitive functioning, and the patient's subjective perception of their cognitive difficulties in schizophrenia.
These include the Schizophrenia Cognition Rating Scale (SCoRS) [9], which assesses the effect of cognitive deficits on day-to-day functioning, the Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) test battery [10], the Cambridge Neuropsychological Test Automated Battery (CANTAB) [11], and CogState [12], which assess cognitive functioning, and the Subjective Scale to Investigate Cognition in Schizophrenia (SSTICS) [13] and the Self-Assessment Scale of Cognitive Complaints in Schizophrenia (SASCCS) [14], which evaluate patients' experience of CIAS. However, generalized PRO instruments are unlikely to provide sufficient sensitivity to detect differences in individual aspects of multifactorial conditions [7].
Most of these instruments (MATRICS, CANTAB, and CogState) are designed to assess objective (and not subjective) neuropsychological performance in schizophrenia. Although the SCoRS, SSTICS, and SASCCS are subjective measures, the SCoRS includes modified items from dementia rating scales and is not based on patient experience of CIAS, and the SSTICS is, by design, brief. In addition, the SCoRS and SASCCS were not developed based on patient input. Therefore, the development of a PRO instrument specific to CIAS is necessary to enable improved evaluation of new therapies in patients with schizophrenia.
This study outlines the continuing development and validation of a novel PRO instrument for assessing patient experience of CIAS, the Patient-Reported Experience of Cognitive Impairment in Schizophrenia (PRECIS). The purpose of developing this instrument was to detect treatment responsiveness in a clinical trial setting. The initial development and concept elicitation data for the PRECIS instrument have previously been reported [15], and development was conducted in accordance with guidance from the US Food and Drug Administration [8,16] and the American Educational Research Association's Standards for Educational and Psychological Testing [17]. During this development, a conceptual model was proposed based on published data, clinical experience, and advice from leading clinician-scientists in schizophrenia research and neuropsychology [15]. The model included seven domains reported to contribute to the subjective experience of CIAS: attention, communication/social cognition, executive functioning, intermittent impaired perception, memory, metacognitive abilities, and sharpness of thought. Using this conceptual model, intensive concept elicitation interviews were conducted in 80 patients with schizophrenia, and an initial pool of 53 items was developed. Cognitive debriefing interviews resulted in the removal of 18 items and modification of 22 other items. The remaining 35 items represented 23 concepts within six domains plus two items assessing bother.
The instructions for the instrument were modified to include additional explanation regarding the differences between cognitive difficulties and positive symptoms, based on patient feedback [15]. The qualitative findings in these patients informed the basis of a 35-item version of the PRECIS instrument, covering 5 domains (Additional Figure A1). Two domains (sharpness of thought and intermittent impaired perception) were omitted as having insufficient content validity or being poorly comprehended among participants. This study aimed to investigate the psychometric properties of the 35-item instrument and develop a sensitive and reliable tool for assessing patient experience in CIAS, the PRECIS. Here we report on the domain structure, discriminant (known groups) validity testing, domain scoring, internal reliability, and sensitivity of PRECIS to between-group differences, comparing patients with schizophrenia and control subjects.

Methods
This psychometric validation study included patients with schizophrenia who were participating in a Phase II, randomized, double-blind, placebo-controlled, parallel-group study to evaluate the efficacy (cognition and everyday living skills), safety, and tolerability of four orally administered doses of BI 409306 during a 12-week treatment period (clinicaltrials.gov: NCT02281773) [18]. Healthy individuals (control group) were recruited separately from subgroups at the same study sites, but using a separate, parallel-study protocol (clinicaltrials.gov: NCT01505894). The study was conducted in accordance with the ethical principles of the Declaration of Helsinki [19], International Council for Good Clinical Practice [20], and applicable country-specific regulatory requirements. It was reviewed by the New England Research Institute's institutional review board (IRB) and each study site received individual IRB approval via Alpha IRB.

Patients
All patients/individuals were 18-55 years of age. The patient group had a diagnosis of schizophrenia (as per the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition) and had received stable antipsychotic treatment for ≥8 weeks prior to randomization. Patients had no more than a "moderate" severity rating on hallucinations and delusions (Positive and Negative Syndrome Scale [PANSS]) and no more than "moderate" depression scores (PANSS general psychopathology syndrome depression item). The control group were free of major psychiatric illness, neuropsychological impairment, and history of antipsychotic drug use. All patients, or their legally accepted representatives, provided written informed consent.

Assessments
The patient group completed the PRECIS instrument at baseline (randomization or Visit 2), Visit 4 at 6 weeks, Visit 5 at 9 weeks, and end of treatment (EOT) visit at 12 weeks. The PRECIS instrument initially comprised 35 items answered via a 5-category Likert scale (ie, 1 = not at all/not at all hard, 2 = a little bit/a little bit hard, 3 = somewhat/somewhat hard, 4 = quite a bit/quite hard, 5 = very much/very hard) [15].
Identification of an appropriate algorithm to convert the responses into numerical values was carried out once the final structure of the PRECIS instrument was determined, as described below. In addition to the PRECIS instrument, the CANTAB and the MATRICS Consensus Cognitive Battery (MCCB) were used to objectively assess cognitive function. The SCoRS was conducted to assess functional capacity.
Patients completed the CANTAB and MCCB during Screening, Visits 2 and 4, and EOT. The SCoRS was completed at Visit 2 and EOT. The control group completed the original 35-item PRECIS instrument, the MCCB, CANTAB, and SCoRS at a single visit.

Statistical Analyses
Analyses were conducted using SAS version 9.4 software (SAS Institute, Cary, CA, USA). Descriptive statistics were calculated for demographic variables (age, gender, race, and ethnicity). For individual items in the PRECIS instrument, the mean, standard deviation (SD), range, and ceiling and floor effects were calculated. Effects of gender (female vs male), age (≤45 years vs >45 years), and race-ethnicity (white vs black vs other) on PRECIS item scores (35-item instrument) were examined using standard statistical tests.
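As an illustration of the item-level floor and ceiling calculation, the following minimal Python sketch computes the percentage of respondents selecting the lowest and highest Likert categories for one item. It is not the authors' SAS code; the function name, the use of `None` for a missing response, and the example data are assumptions made for illustration.

```python
from collections import Counter


def floor_ceiling_effects(responses, lowest=1, highest=5):
    """Return (floor_pct, ceiling_pct) for one item's Likert responses.

    Missing responses (None) are skipped; this handling is an assumption,
    as the source does not state how missing values enter this calculation.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no non-missing responses for this item")
    counts = Counter(answered)
    n = len(answered)
    floor_pct = 100.0 * counts[lowest] / n
    ceiling_pct = 100.0 * counts[highest] / n
    return floor_pct, ceiling_pct


# Hypothetical responses to a single item on the 1-5 scale.
example_item = [1, 1, 2, 5, None]
floor_pct, ceiling_pct = floor_ceiling_effects(example_item)
print(f"floor: {floor_pct:.1f}%, ceiling: {ceiling_pct:.1f}%")
```

A percentage computed this way can then be compared against whatever threshold the analysis plan specifies.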
Responses to the PRECIS instrument were converted into a total score (unweighted), calculated using the sum of the 35-item scores divided by 35 (average item score). Individual domain scores (scores for each category of items) were computed as the average score across all items within each domain.
Any data points missing ≥10% of data were excluded. If more than one item was missing from a domain, the domain score for the patient was excluded. Individual items with less than adequate reliability or validity were identified and eliminated or modified. Items that met the following predefined elimination criteria were considered poorly performing and were considered for elimination from the 35-item instrument: floor or ceiling effects of >50% among patients; ≥10% missing data; relatively low factor loadings across the domains (<0.025); Cronbach alpha inflation of >10% above the original value upon item removal; lower internal or test-retest reliability (<0.6); and failure to distinguish statistically between patients and controls in discriminatory validity testing (statistical cut-off: p = 0.05).
Analyses were repeated on the revised instrument once poorly performing items were removed. Item finalization was performed over two rounds of analysis and alternative scoring procedures were considered.
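The scoring rules described above (domain score as the average of its items, total score as the average item score, and exclusion of a domain score when more than one of its items is missing) can be sketched in Python as follows. This is an illustrative sketch rather than the study's scoring program. Three points are assumptions not stated in the source: that a domain with exactly one missing item is averaged over the remaining items, that the total score is only computed when every item is answered, and the domain names and item identifiers in the example, which are hypothetical.

```python
def mean_or_none(values, max_missing):
    """Average the non-missing values, or return None if too many are missing."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    if missing > max_missing or not present:
        return None
    return sum(present) / len(present)


def score_precis(responses, domains):
    """Compute domain scores and a total score.

    responses: dict mapping item id -> Likert response (1-5) or None.
    domains:   dict mapping domain name -> list of item ids (hypothetical layout).
    """
    domain_scores = {
        name: mean_or_none([responses.get(i) for i in items], max_missing=1)
        for name, items in domains.items()
    }
    all_items = [i for items in domains.values() for i in items]
    # Assumption: the total (average item score) requires complete data.
    total = mean_or_none([responses.get(i) for i in all_items], max_missing=0)
    return domain_scores, total


# Hypothetical two-domain layout with three items each.
domains = {"attention": ["a1", "a2", "a3"], "memory": ["m1", "m2", "m3"]}
responses = {"a1": 2, "a2": 4, "a3": None, "m1": None, "m2": None, "m3": 5}
domain_scores, total = score_precis(responses, domains)
print(domain_scores, total)
```

In this example the attention domain has one missing item and is still scored, whereas the memory domain has two missing items and is excluded.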

Factor Analysis
An exploratory factor analysis (EFA) was conducted to determine the dimensionality of the latent variables (ie, domains) being assessed by the 35-item PRECIS instrument and to identify common factors (domains) among the 35 items. All pre-treatment scores in the patient group, and for patients and controls combined, were used for these analyses. The control group (N = 88) had insufficient sample size for a separate factor analysis. The number of derived factors was based on eigenvalues (threshold of >1.0), a scree test, and the interpretation of simple structure. Both Promax and Varimax rotations were employed to identify correlated factors and uncorrelated factors, respectively, to evaluate the strength of domains. A confirmatory factor analysis (CFA) was carried out after the poorly performing items had been eliminated and a 24-item version of the PRECIS instrument was available for testing.

Reliability
Reliability was assessed in terms of internal reliability and test-retest replicability. Internal reliability, or inter-item consistency, of the original 35-item PRECIS instrument was evaluated by means of Cronbach's alpha [21,22] based on data obtained at baseline from patients with schizophrenia. Cronbach's alpha typically ranges from 0 to 1, with higher values indicating increased reliability.
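For reference, Cronbach's alpha for k items is commonly computed as α = k/(k − 1) × (1 − Σ s_i² / s_T²), where s_i² is the variance of item i and s_T² is the variance of the summed score. The Python sketch below implements this standard formula using sample variances; it is an illustration only, not the SAS procedure used in the study, and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha from complete response rows (one list of item scores per respondent)."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("all rows must have the same number of items")
    # Sample variance of each item (columns) and of the summed score.
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from four respondents to three items.
example_rows = [[1, 2, 1], [2, 2, 3], [4, 3, 4], [5, 5, 4]]
print(round(cronbach_alpha(example_rows), 3))
```

Recomputing alpha after dropping each item in turn gives the "alpha if item deleted" comparison used to check whether any single item lowers overall consistency.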
Inter-item correlations were analyzed using Spearman's (non-parametric) rank order correlation coefficients. Intra-class correlation coefficients (ICCs) were used to analyze test-retest reliability of the responses to the 35 individual PRECIS items between Visit 5 (Week 9) and EOT (Week 12).
As patients had been randomized to receive treatment that may have influenced their cognitive impairment, this potentially reduced the reliability of the test-retest analysis. To minimize this variability, test-retest analyses were conducted in a subset of placebo-treated patients with stable CANTAB scores (defined as <1-point change in either direction). ICCs, percent agreement scores and paired t-test comparisons of scores were calculated to compare PRECIS instrument responses at Visit 5 and EOT. Correlations of ≥0.70 were defined as good reliability [23]. This analysis was repeated for the revised PRECIS total and factor scores, once poorly performing items had been eliminated.
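The two test-retest statistics reported here, the ICC and percent agreement, can be illustrated with the Python sketch below. The source does not state which ICC form was used, so the one-way random-effects, single-measurement form, ICC(1,1), is assumed purely for illustration. Percent agreement is taken to be the share of patients giving an identical response at both visits, which is also an assumption. The example data are hypothetical.

```python
def icc_one_way(visit_a, visit_b):
    """One-way random-effects ICC(1,1) for paired scores from two visits."""
    if len(visit_a) != len(visit_b) or len(visit_a) < 2:
        raise ValueError("need two equal-length lists with at least two subjects")
    n, k = len(visit_a), 2
    pairs = list(zip(visit_a, visit_b))
    subject_means = [(a + b) / k for a, b in pairs]
    grand_mean = sum(subject_means) / n
    # Between-subject and within-subject mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for pair, m in zip(pairs, subject_means) for x in pair
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("scores have no variance")
    return (ms_between - ms_within) / denominator


def percent_agreement(visit_a, visit_b):
    """Percentage of subjects with identical responses at both visits."""
    if len(visit_a) != len(visit_b) or not visit_a:
        raise ValueError("need two equal-length, non-empty lists")
    same = sum(1 for a, b in zip(visit_a, visit_b) if a == b)
    return 100.0 * same / len(visit_a)


# Hypothetical item responses at Visit 5 and EOT for five patients.
visit5 = [1, 2, 3, 4, 5]
eot = [1, 2, 3, 4, 4]
print(round(icc_one_way(visit5, eot), 3), percent_agreement(visit5, eot))
```

Other ICC forms (for example, two-way models) treat visit effects differently and can give different values on the same data, so the choice of form should be reported alongside the coefficient.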

Validity
For both the original 35-item and the subsequent 24-item versions of the PRECIS instrument, discriminant validity assessed whether it was possible to differentiate between the patient and control groups. PRECIS responses from both groups recorded at randomization were compared using an exact Wilcoxon two-sample (non-parametric) test. Convergent and divergent validity were assessed by correlating pre-randomization PRECIS scores with related and unrelated domains from other validated instruments (CANTAB, MCCB, and SCoRS), using Spearman's (non-parametric) rank order correlation coefficients for patients with schizophrenia. Strong and poor convergent validity were defined as correlations of ≥0.70 and ≤0.40, respectively. These analyses were repeated on the revised, 24-item PRECIS instrument to ensure its validity.
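Spearman's rank order correlation and the convergent validity thresholds can be sketched in Python as shown below. The coefficient is computed as the Pearson correlation of average ranks, which is the standard tie-aware definition. Two choices are assumptions not stated in the source: applying the thresholds to the absolute value of the coefficient (instruments may be scored in opposite directions) and labeling values between the two thresholds "intermediate". The example scores are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    return pearson(average_ranks(x), average_ranks(y))


def convergent_label(rho):
    """Classify a coefficient using the >=0.70 / <=0.40 thresholds (on |rho|, an assumption)."""
    magnitude = abs(rho)
    if magnitude >= 0.70:
        return "strong"
    if magnitude <= 0.40:
        return "poor"
    return "intermediate"


# Hypothetical PRECIS total scores and scores on a comparator instrument.
precis_scores = [1.2, 2.5, 2.5, 3.8, 4.1]
comparator_scores = [2.0, 3.0, 4.0, 6.0, 5.0]
rho = spearman_rho(precis_scores, comparator_scores)
print(round(rho, 3), convergent_label(rho))
```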

Baseline Characteristics
A total of 410 patients with schizophrenia and 88 healthy individuals were enrolled. Overall, the patient group was older than the control group (mean [SD]; patient group, 43.0 [9.5] years; control group, 33.2 [11.1] years), and included a larger percentage of males and African Americans (Table 1).
Effects of gender, age, and race-ethnicity on PRECIS item scores were examined, revealing that none of the item scores were affected by gender or age, and few differences were observed based on race-ethnicity of the participant. Therefore, the potential effects of these demographic variables were not analyzed further.

Item Elimination and Development of a Revised PRECIS Instrument
Item response distributions, EFA, reliability and validity of the 35-item PRECIS were assessed initially.
Results from these analyses are described in Additional File 1, Additional Figures A2 and A3, and Additional Tables A1 and A2.
Items that met the pre-determined criteria were eliminated from the 35-item PRECIS instrument. Two of the 35 items poorly loaded onto the identified domains with a Varimax correlation <0.50. Marked floor effects in patient groups resulted in the elimination of 5 further items. Regarding test-retest reliability, 3 items scored <50% between visits, leading to their elimination from the revised instrument. A further item (CIAS #112) was found to be significantly different between visits and was also eliminated (Additional Table A3). No items were eliminated due to inadequate inter-item consistency or lack of discriminant validity. Analysis of the 35-item PRECIS instrument therefore resulted in elimination of 11 items, in accordance with the pre-defined elimination or inclusion criteria (Additional Table A3). The revised PRECIS instrument included 24 items, which were also assessed (Additional Table A4).

Analysis of Revised 24-item PRECIS Instrument
Factor Analysis
Of the remaining 24 items in the revised PRECIS instrument, 22 function items were subjected to CFA. The two "Bother" items were excluded from the CFA for conceptual reasons, as they were designed to assess the degree to which the rest of the scale mattered to the patient. The CFA identified one strong factor (Attention; eigenvalue: 9.74) and 3 additional factors (Memory, Executive Function, and Communication; eigenvalues: 1.10-1.52; Table 2, Additional Table A5, Additional Figure A4). Reinsertion of the "Bother" factor resulted in a final 5-factor solution. Similar factor solutions were calculated using Varimax (Table 2) and Promax (data not shown) rotations.

Reliability
There was a high level of internal consistency both for the overall 24-item PRECIS instrument (Cronbach's alpha score of 0.942) and individual domains (Cronbach's alpha scores: 0.743-0.873; Table 3). The Cronbach's alpha score for the 24-item instrument was compared with the resulting score following removal of each of the items in turn, to confirm that no individual item was decreasing the overall reliability of the score. Individual item correlations within each of the subscales showed adequate or better internal consistency for individual items and domains, in addition to the overall PRECIS score (Table 3).
For the revised 24-item PRECIS instrument, test-retest analysis for the 5 domains, as shown by ICC in 111 patients, ranged from 0.49 to 0.74, with moderate-high levels of agreement across testing visits (60.4%-74.8% agreement; Table 3). An outlier ICC value <0.50 was noted for the communication domain, which nevertheless had a high (62.2%) agreement ratio, in keeping with the other domains.
As this scale was revised, item-level ICC and percentage agreement analyses were not conducted to avoid duplicating domain analyses (Table 3). Overall, the ICC for the total 24-item PRECIS instrument in test-retest analysis was 0.78, with 73% agreement between responses of patients over two separate visits.

Validity
Discriminant validity testing confirmed there were significant differences between the patient and control groups in each of the 5 domains in the revised 24-item PRECIS instrument (p<0.0001; Table 4). Convergent validity was demonstrated through a significant correlation between PRECIS total and individual domain scores with the SCoRS global rating scale (p<0.0001). There was no correlation between the PRECIS instrument and CANTAB composite or domain scores, with the exception of Verbal Recall/Recognition Memory, which correlated with Domain 2 (Memory) of the PRECIS instrument (Table 5). There was no correlation between the PRECIS instrument total and MCCB composite scores (Table 5).

Discussion
This study aimed to develop and validate a sensitive and reliable PRO instrument for assessing patient experience in CIAS, according to US Food and Drug Administration guidelines [8,16]. This well-powered, clinical psychometric validation study demonstrated that the revised 24-item version of the PRECIS instrument has adequate discriminant validity, good internal consistency, adequate test-retest reliability, and good-to-excellent intra- and inter-item correlations, providing strong evidence of its validity and reliability. The PRECIS instrument also demonstrated good convergent validity, based on correlations with the SCoRS. Moreover, there were very few correlations between the PRECIS instrument and the CANTAB or MATRICS test battery domain and total scores, which further demonstrates the specificity of PRECIS in assessing patients' personal experiences of cognitive impairment. Such experiences can be independent of objective cognitive impairments assessed using performance-based measures (eg, CANTAB and MATRICS test battery) [24,25]. The PRECIS and SCoRS instruments differ in their methodology, in that they involve patient-reported and clinician/observer-reported outcomes, respectively. However, the correlation between the two instruments is consistent with the notion that PRECIS may measure the effect of CIAS on patients' day-to-day functioning, rather than being a direct measurement of cognitive impairment. Overall, the PRECIS instrument is a novel PRO tool with strong evidence of validity and reliability, designed using patient feedback to measure the subjective experience of CIAS. The test-retest reliability coefficients for the 24-item instrument ranged between 0.49 and 0.74, with coefficients for four of the five domains (attention, executive functioning, communication, and bother) falling below 0.70, a value that some have suggested to be the minimum standard for demonstrating reliability [23].
However, a number of factors should be considered when interpreting these coefficients, not least the fact that patients included in the test-retest reliability analysis were part of an interventional study, albeit they were treated with placebo. Although these patients are more likely than their BI 409306-treated counterparts to respond in a consistent way across visits, it is still possible that patients experienced a placebo or study effect, which may have affected their performance leading to a lack of agreement between responses over the two visits. While this effect would reduce test-retest reliability, it would also indicate the high level of construct validity of the PRECIS and allow for greater sensitivity to between-group differences in treatment studies. Such tradeoffs between reliability and validity are common in clinical research, including in studies of cognitive change in schizophrenia [26]. Nonetheless, the test-retest analysis coefficients were supported by moderate to high levels of agreement across visits (60.4%-74.8%), demonstrating adequate test-retest reliability in the context of the current study.
It is worth noting that the primary outcome in the clinical, pharmacological study from which the patient group was enrolled was to observe a change in cognitive function, as measured by the CANTAB. The purpose of developing the PRECIS instrument was to detect treatment responsiveness by comparing any changes in outcome during treatment with corresponding CANTAB scores.
However, as no treatment effect was observed using the validated CANTAB score, treatment responsiveness of the PRECIS instrument could not be assessed. Therefore, the sensitivity of PRECIS to detect differences corresponding with changes in clinical symptom severity requires further exploration. Continued development of the PRECIS instrument is ongoing, including the use of the 35-item version in another Phase II study in CIAS to generate additional data to support the proposed revisions reported here and to obtain treatment responsiveness data. Future work may also involve the inclusion of the PRECIS instrument in a study of cognitive remediation, as a previous study has demonstrated that a moderate effect size for improvement in cognition (0.5 SD), as measured by MCCB, can be achieved using cognitive remediation [27].
While the current study cannot confirm sensitivity of the PRECIS instrument to severity of positive or negative symptoms, the patients eligible for this study comprised a subset of patients with schizophrenia who were taking regular antipsychotic medication (for ≥8 weeks prior to randomization) and with no more than moderate hallucinations, delusions, or depression. While the PRECIS instrument was not developed to measure the severity of positive or negative symptoms, further investigation is needed to fully understand the reliability and validity of the PRECIS instrument in patients with a wide range of symptom severity.
The final stage of the development and validation process was to determine the algorithm to be used to convert the Likert responses into item, domain, and total scores. Further investigation may identify alternatives to the chosen scoring algorithm that may impact on the sensitivity and suitability of the instrument. For example, by increasing or decreasing the influence of the Bother scores relative to that of the other four domain scores, further gains in validity may be observed.
The study has a number of limitations to be considered. First, assessment of reliability and discriminant validity may have been impacted by the differing sizes and characteristics of the patient and control groups, including lower mean age in control subjects and a lower proportion of African American subjects. Furthermore, responses from control subjects demonstrated floor effects with limited range in scores, which may have influenced the assessment of discriminant validity.
Additionally, the patient sample included predominantly African American subjects and therefore may not be representative of all patient populations. It should also be noted that the performance-based measures MCCB and CANTAB, which were applied a number of times throughout the study, have been shown to have small practice effects, which may have influenced the assessment of convergent validity. However, this would not have impacted test-retest reliability of the subjective PRECIS instrument.

Conclusions
In conclusion, a revised 24-item PRECIS instrument has been developed that shows good reliability and validity. This tool may provide a patient-based perspective to complement existing measures of cognition and serves to define key patient-based endpoints for use in future clinical studies.

The study was reviewed by the New England Research Institute's institutional review board (IRB) and each study site received individual IRB approval via Alpha IRB. Prior to patient participation, written informed consent was obtained from each patient or the patient's legally accepted representative. Each signature was dated by each signatory, and the informed consent and any additional patient information form were retained by the investigator as part of the trial records. A signed copy of the informed consent and any additional patient information was given to each patient or the patient's legally accepted representative.

Consent for Publication
Not applicable

Availability of Data and Material
The datasets generated and/or analysed during the current study are not publicly available as they are currently under regulatory review but may be available from the corresponding author upon reasonable request.

Competing Interests
The authors declare that there are no conflicts of interest in relation to the subject of this study. BD is an employee of Synexus and has been the principal investigator on studies for Boehringer Ingelheim