Participants
We recruited participants via convenience sampling from a psychiatric hospital in southern Taiwan. Patients were included in this study if they met the following criteria: (1) a diagnosis of schizophrenia according to the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5) [23]; (2) age ≥ 20 years; and (3) stable use and dosage of antipsychotic medication for at least one month prior to recruitment. The DSM-5 criteria for schizophrenia were assessed and validated by board-certified psychiatrists and supported by clinical observations and interviews during hospitalization, past medical records, and information provided by main caregivers. The exclusion criteria were (1) a diagnosis of another neurological or psychiatric disease affecting cognition (e.g., stroke or depression), (2) another severe medical condition or psychiatric disorder that required treatment during the study, or (3) unstable symptom severity, defined as a change of more than 2 points on the Clinical Global Impressions Scale–Severity (CGI-S) [24].
This study was approved by the Institutional Review Board of the local hospital. All participants signed consent forms before participating in this study.
Procedure
This study comprised three assessments (early, middle, and late) with 2-week intervals between adjacent assessments. At the early and late assessments, which were four weeks apart, the participants completed alternate forms of the TONI-4 (Form A at the early assessment and Form B at the late assessment). At the middle assessment, we administered the Tablet-Based Symbol Digit Modalities Test (T-SDMT) [25] and the Montreal Cognitive Assessment (MoCA) [26]. All assessments were administered by a trained occupational therapist. In addition, the CGI-S was administered in each session to verify that the participants’ symptoms remained stable during the study period. We collected the patients’ demographic characteristics from chart review.
Measures
TONI-4
The TONI-4 is designed to assess fluid intelligence in individuals aged 6 years to 89 years and 11 months. The TONI-4 has alternate forms (i.e., Form A and Form B) to reduce the practice effect [19]. Items for Form A are printed on one side of the picture book, and items for Form B are printed on the reverse side. The two forms are not interchangeable. Each item is composed of a sequence of abstract figures with one figure missing from the sequence. Each sequence includes one or more attributes, such as shape, position, direction, rotation, contiguity, shading, size, and movement. Items increase in difficulty as more attributes are added. The test is terminated when three of five consecutive items are answered incorrectly. The items are scored dichotomously: correct answers earn one point and incorrect answers earn zero points. The rater records the score on the answer sheet. The TONI-4 is norm-referenced and yields an index, which is a standardized score (quotient) with a mean of 100 and a standard deviation of 15. Higher index scores indicate better fluid intelligence [19].
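The termination and dichotomous scoring rules described above can be illustrated with a minimal sketch. It implements only the rule as stated in this section (stop once three of five consecutive items are incorrect, one point per correct item). It does not reproduce the full scoring procedure of the test manual or the conversion of raw scores to index scores, and the function name `toni4_raw_score` is hypothetical.

```python
def toni4_raw_score(responses):
    """Sum correct responses until the termination rule stops the test.

    `responses` is a list of booleans in administration order
    (True = correct, False = incorrect). Administration stops as soon as
    three of the (up to) five most recent consecutive items are incorrect.
    """
    score = 0
    for i, correct in enumerate(responses):
        score += 1 if correct else 0
        # Look at the current item and the four items before it.
        window = responses[max(0, i - 4): i + 1]
        if window.count(False) >= 3:
            break  # termination rule met; later items are not administered
    return score


# Hypothetical response pattern: the sixth item triggers termination.
print(toni4_raw_score([True, True, False, True, False, False, True]))
```

In this sketch the final item in the list is never scored, because the rule is met at the sixth item.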
T-SDMT
The T-SDMT was developed from the SDMT to assess processing speed [25]. This test includes 9 different symbols, each associated with a number (1–9), presented to the examinee on a tablet computer screen (i.e., an iPad). All trials are conducted with the tablet in landscape orientation, held in place by a case adjusted to a 30-degree tilt. To respond to each item, the participant first looks at the symbol in the center of the screen, then searches for the corresponding number in the table at the top of the screen, and finally selects that number on a 3-by-3 grid at the bottom of the screen. The tablet computer automatically records the number of correct answers during the test. A higher number of correct answers indicates better processing speed. The T-SDMT has acceptable psychometric properties in patients with schizophrenia [25].
MoCA
The MoCA briefly measures overall cognitive functioning, including orientation, memory, visuospatial skills, executive functioning, language, and attention [26]. The total scores range between 0 and 30, and higher scores indicate better cognitive functioning. The total score (including the addition of one point for examinees with 12 or fewer years of education) is used for analysis. The MoCA has demonstrated high sensitivity as a cognitive screening test for severe mental illness [27].
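The education adjustment can be written as a short sketch. The function name `moca_total` is hypothetical, and capping the adjusted score at 30 is an assumption made here so that the result stays within the 0–30 range stated above.

```python
def moca_total(raw_score, years_of_education):
    """Return the MoCA total score used for analysis.

    One point is added for examinees with 12 or fewer years of education.
    Assumption: the adjusted total is capped at the scale maximum of 30.
    """
    if not 0 <= raw_score <= 30:
        raise ValueError("raw_score must be between 0 and 30")
    bonus = 1 if years_of_education <= 12 else 0
    return min(raw_score + bonus, 30)


# Hypothetical examinees, not study data.
print(moca_total(24, 10))  # 12 or fewer years of education: one point added
print(moca_total(24, 16))  # more than 12 years: no adjustment
```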
CGI-S
The CGI-S assesses symptom severity on a 7-point scale (1–7) [24]. A score of 1 indicates that the patient is not ill, and a score of 7 indicates that the patient is most severely ill. We used the CGI-S to examine whether the symptom severity of the participants was stable during the study period.
Statistical Analyses
Test–retest reliability
Test–retest reliability was estimated using the intra-class correlation coefficient (ICC) between the early and late assessments, on the basis of a two-way random-effects model with absolute agreement [28]. The following criteria were used to interpret ICC values: an ICC value ≥ 0.80 indicated excellent test–retest reliability; 0.60–0.79, good; 0.40–0.59, moderate; and < 0.40, poor [29].
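As an illustration of the ICC computation, the sketch below derives the coefficient from the mean squares of a two-way analysis of variance. It assumes the single-measurement form, often written ICC(2,1); this section does not state whether the single-measurement or average-measurement form was used. The function names and the example scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is a list of rows, one per participant; each row holds that
    participant's score at every assessment (e.g., [early, late]).
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of assessments
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants, assessments, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Apply the interpretation criteria stated in the text [29]."""
    if icc >= 0.80:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "moderate"
    return "poor"


# Hypothetical early/late index scores for five participants.
example = [[85, 88], [92, 90], [78, 81], [101, 99], [88, 93]]
value = icc_2_1(example)
print(round(value, 2), interpret_icc(value))
```

Because the absolute-agreement form keeps the between-assessment mean square in the denominator, a systematic shift between the early and late scores lowers the coefficient.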
The standard error of measurement (SEM) is an index of random measurement error that can be used to present the precision of individual scores [30]. The SEM percentage (SEM%) was calculated by dividing the SEM by the mean score at the early assessment and then multiplying the result by 100%. An SEM% of less than 10% is considered to indicate limited random measurement error for a measure [31].
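A minimal sketch of the SEM and SEM% calculations follows. This section does not give the SEM formula, so the sketch assumes the commonly used form SEM = SD × √(1 − ICC), with the choice of standard deviation left to the caller. The function names and input values are hypothetical and are not study results.

```python
import math


def sem(sd, icc):
    """Standard error of measurement.

    Assumption: SEM = SD * sqrt(1 - ICC), a widely used formula that is
    not stated explicitly in the text.
    """
    return sd * math.sqrt(1 - icc)


def sem_percent(sem_value, mean_early):
    """SEM expressed as a percentage of the mean early-assessment score."""
    return sem_value / mean_early * 100


# Hypothetical inputs: SD = 15, ICC = 0.84, mean early score = 100.
s = sem(15, 0.84)
print(round(s, 2), round(sem_percent(s, 100), 2))
```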
We also calculated the minimal detectable change (MDC) and MDC percentage (MDC%) to determine how large a change between assessments must be to be considered real (i.e., beyond the score change caused by random measurement error) at the 95% confidence level. The MDC% was calculated by dividing the MDC by the mean score at the early assessment and then multiplying the result by 100% [32].
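The MDC calculation can be sketched as follows. The formula MDC = 1.96 × √2 × SEM is a common convention for the 95% confidence level and is an assumption here, since the text cites [32] without writing the formula out. The function names and input values are hypothetical.

```python
import math


def mdc95(sem_value):
    """Minimal detectable change at the 95% confidence level.

    Assumption: MDC = 1.96 * sqrt(2) * SEM. The factor sqrt(2) reflects
    measurement error in both of the assessments being compared.
    """
    return 1.96 * math.sqrt(2) * sem_value


def mdc_percent(mdc_value, mean_early):
    """MDC expressed as a percentage of the mean early-assessment score."""
    return mdc_value / mean_early * 100


# Hypothetical inputs: SEM = 6.0, mean early score = 100.
m = mdc95(6.0)
print(round(m, 2), round(mdc_percent(m, 100), 2))
```

Under this convention, a change in an individual's score that is smaller than the MDC cannot be distinguished from random measurement error.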
In addition, the agreement between test–retest measurements was analyzed using Bland–Altman plots with 95% limits of agreement (LOA) [33]. In these plots, the difference (d) between each pair of assessments was plotted against the mean of that pair. To examine whether heteroscedasticity existed, Pearson’s correlation coefficient (r) was calculated between the absolute value of the difference between two assessments and the mean score of the two assessments [34]. A Pearson’s r ≥ 0.3 or ≤ −0.3 indicated that the absolute difference was related to the mean score of the two assessments, that is, that heteroscedasticity was present [35]. In other words, the higher the assessment score, the greater (r ≥ 0.3) or smaller (r ≤ −0.3) the absolute difference between the two assessments.
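The quantities behind the Bland–Altman analysis and the heteroscedasticity check can be sketched as below. The direction of the difference (late minus early) and the use of mean difference ± 1.96 × SD of the differences for the LOA are assumptions based on common practice. The function names and scores are hypothetical.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences.

    Undefined (division by zero) if either sequence is constant.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(early, late):
    """Return the bias, 95% limits of agreement, and heteroscedasticity check."""
    diffs = [b - a for a, b in zip(early, late)]        # assumed: late - early
    means = [(a + b) / 2 for a, b in zip(early, late)]  # x-axis of the plot
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)                        # sample SD of differences
    r = pearson_r([abs(d) for d in diffs], means)
    return {
        "bias": bias,
        "loa": (bias - 1.96 * sd, bias + 1.96 * sd),
        "r_abs_diff_vs_mean": r,
        "heteroscedastic": abs(r) >= 0.3,               # criterion from the text
    }


# Hypothetical early/late scores for six participants.
early = [85, 92, 78, 101, 88, 95]
late = [88, 90, 81, 99, 93, 96]
print(bland_altman(early, late))
```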
Effect size (Cohen’s d) was used to estimate the magnitude of the practice effect due to repeated assessments of the TONI-4. An effect size ≥ 0.80 was considered a large practice effect; 0.50–0.79, medium; 0.20–0.49, small; and < 0.20, trivial [36].
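A minimal sketch of the practice-effect calculation is given below. Several variants of Cohen’s d exist for repeated measurements, and this section does not specify which was used. The sketch assumes the mean difference (late minus early) divided by the pooled standard deviation of the two assessments. The function names and scores are hypothetical.

```python
import math
import statistics


def cohens_d(early, late):
    """Cohen's d for the change from the early to the late assessment.

    Assumption: d = (mean_late - mean_early) / pooled SD, where the pooled
    SD is the root mean of the two sample variances.
    """
    mean_diff = statistics.fmean(late) - statistics.fmean(early)
    pooled_sd = math.sqrt(
        (statistics.variance(early) + statistics.variance(late)) / 2
    )
    return mean_diff / pooled_sd


def interpret_practice_effect(d):
    """Apply the criteria stated in the text [36] to the magnitude of d."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "trivial"


# Hypothetical early/late scores for six participants.
early = [85, 92, 78, 101, 88, 95]
late = [88, 90, 81, 99, 93, 96]
d = cohens_d(early, late)
print(round(d, 2), interpret_practice_effect(d))
```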
Convergent validity
Convergent validity was examined by correlating the scores of the TONI-4 at the early assessment with those of the MoCA and the T-SDMT using Pearson’s r. We hypothesized that we would find moderate correlations between the scores of the TONI-4 and the MoCA (i.e., fluid intelligence and cognition) [37], and that small correlations would be found between the scores of the TONI-4 and the T-SDMT (i.e., fluid intelligence and processing speed) [3, 38].