2.1 ICA test description and the scientific rationale behind the test
The ICA test is a rapid visual categorization task with backward masking 17,18,25. The test takes advantage of the human brain’s strong reaction to animal stimuli 25,26. One hundred natural images (50 animal and 50 non-animal) are carefully selected, with varying levels of difficulty, and are presented to the participants in rapid succession. Images are presented in the center of the screen at a 7° visual angle. In some images the head or body of the animal is clearly visible to the participants, which makes it easier to detect. In other images the animals are further away or otherwise presented in cluttered environments, making them more difficult to detect. A few sample images are shown in Figure 1. We used grayscale images to remove the possibility of color blindness affecting participants’ results. Furthermore, color images can facilitate animal detection solely based on color 27,28, without full processing of stimulus shape. This could have made the task easier and less suitable for detecting less severe cognitive dysfunctions.
The strongest categorical division represented in the human higher-level visual cortex appears to be that between animates and inanimates 29,30. Studies also show that, on average, it takes about 100 ms to 120 ms for the human brain to differentiate animate from inanimate stimuli 26,31,32. Following this rationale, each image is presented for 100 ms, followed by a 20 ms inter-stimulus interval (ISI), then a dynamic noisy mask (250 ms), and finally the subject’s categorization of the image as animal or non-animal (Figure 1). Shorter ISI durations can make the animal detection task more difficult, whereas longer durations make the task less useful for testing purposes, as it may then fail to detect less severe cognitive impairments. The dynamic mask is used to remove (or at least reduce) the effect of recurrent processes in the brain 33,34. This makes the task more challenging by reducing the ongoing recurrent neural activity that could artificially boost a subject’s performance, and it further reduces the chances of learning the stimuli. For more information about rapid visual categorization tasks, refer to Mirzaei et al. (2013) 17.
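As an illustration of the trial timing described above, the following minimal Python sketch lays out the three timed phases of a single trial and computes when each one starts. The names `Phase`, `TRIAL` and `phase_onsets` are hypothetical and are not part of the ICA software; only the three durations are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    duration_ms: int


# Durations as stated in the text; the phase names are illustrative.
TRIAL = (
    Phase("image", 100),  # stimulus presentation
    Phase("isi", 20),     # inter-stimulus interval
    Phase("mask", 250),   # dynamic noisy mask
)


def phase_onsets(phases):
    """Return the onset time (ms from image onset) of each phase,
    plus the time at which the response window opens."""
    onsets = {}
    elapsed = 0
    for phase in phases:
        onsets[phase.name] = elapsed
        elapsed += phase.duration_ms
    onsets["response_window"] = elapsed
    return onsets
```

With these durations the mask begins 120 ms after image onset, which coincides with the upper end of the 100 ms to 120 ms range cited for animate versus inanimate discrimination.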
The ICA test starts with a separate set of 10 practice images (5 animal, 5 non-animal) to familiarize participants with the task. These images are excluded from further analysis. If participants perform above chance (>50%) on these 10 images, they continue to the main task. If they perform at or below chance level, the test instructions are presented again, followed by a new set of 10 introductory images. If they perform above chance in this second attempt, they progress to the main task. If they again perform at or below chance, the test is aborted.
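The familiarization rule above can be written as a small decision function. This is a sketch under the assumption that performance is judged on the proportion of correct responses out of the 10 introductory images; the function name `next_step` and its return labels are hypothetical.

```python
CHANCE_LEVEL = 0.5  # two response categories: animal vs. non-animal
MAX_ATTEMPTS = 2    # the introductory block is offered at most twice


def next_step(n_correct, attempt, n_images=10):
    """Decide what follows an introductory block.

    n_correct: number of correctly categorized introductory images
    attempt:   1 for the first introductory block, 2 for the second
    """
    if n_correct / n_images > CHANCE_LEVEL:
        return "main_task"
    if attempt < MAX_ATTEMPTS:
        return "repeat_instructions"
    return "abort"
```

For example, `next_step(5, attempt=1)` asks for the instructions to be repeated, because 5 out of 10 is exactly at chance, while the same score on the second attempt ends the test.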
Backward masking: To construct the dynamic mask a white noise image was filtered at four different spatial scales, and the resulting images were thresholded to generate high contrast binary patterns following the procedure in Bacon-Macé and colleagues (2005) 16,17. For each spatial scale, four new images were generated by rotating and mirroring the original image, creating a pool of 16 images. The noisy mask used in the ICA test was a sequence of 8 images, chosen randomly from the pool, with each of the spatial scales appearing twice.
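The selection of mask frames can be sketched as follows. The sketch covers only the bookkeeping (a pool of 4 scales × 4 variants, and a sequence of 8 frames with each scale appearing twice); it does not generate the filtered noise images themselves. It assumes that the two frames of a given scale are drawn without replacement and that the final frame order is shuffled, neither of which is specified in the text. All names are hypothetical.

```python
import random
from collections import Counter

N_SCALES = 4          # spatial scales of the filtered noise
N_VARIANTS = 4        # rotated/mirrored versions per scale
FRAMES_PER_SCALE = 2  # each scale appears twice in the mask


def build_pool():
    """Pool of 16 mask images, identified by (scale, variant) indices."""
    return [(s, v) for s in range(N_SCALES) for v in range(N_VARIANTS)]


def sample_mask_sequence(pool, rng):
    """Draw 8 frames so that every spatial scale appears exactly twice."""
    frames = []
    for scale in range(N_SCALES):
        candidates = [frame for frame in pool if frame[0] == scale]
        # Assumption: the two frames per scale are distinct variants.
        frames.extend(rng.sample(candidates, FRAMES_PER_SCALE))
    rng.shuffle(frames)  # assumption: frame order is random
    return frames


def scale_counts(sequence):
    """Count how often each spatial scale occurs in a sequence."""
    return Counter(scale for scale, _variant in sequence)
```

Passing an explicit `random.Random` instance as `rng` keeps the draw reproducible, which is convenient when the same mask needs to be regenerated for inspection.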
2.2 Brief International Cognitive Assessment for MS (BICAMS)
The BICAMS battery consists of three standard pen-and-paper tests, measuring speed of information processing, visuospatial learning and verbal learning.
Symbol Digit Modalities Test (SDMT): The SDMT is designed to assess speed of information processing and takes about 5 minutes to administer 35. The test consists of a simple substitution task: using a reference key, the examinee has 90 seconds to pair specific numbers with given geometric figures.
California Verbal Learning Test - 2nd edition (CVLT-II): The CVLT-II test 36,37 measures episodic verbal learning. The test begins with the examiner reading a list of 16 words. Participants listen to the list and then report as many of the items as they can recall. Five learning trials of the CVLT-II are used in BICAMS 20, which takes about 10 minutes to administer.
Brief Visual Memory Test – Revised (BVMT-R): The BVMT-R test assesses visuospatial learning (i.e. immediate recall) and memory (delayed recall) 38,39. Only the learning trials of the BVMT-R are included within BICAMS. Here, in three consecutive trials, six abstract shapes are presented to the participant for 10 seconds. After each trial, the display is removed from view and participants are asked to draw the stimuli with pencil on paper. The test takes about 5 minutes to administer.
In total, 174 participants took part in Substudy 1 (Table 1): 91 patients diagnosed with multiple sclerosis (MS) and 83 healthy controls matched for age, gender and education. Forty-eight MS patients took part in Substudy 2 (Table 2). Of all participants, 25 attended both substudies. Participants’ ages ranged from 18 to 65 years. The study was conducted according to the Declaration of Helsinki and approved by the local ethics committee at Royan Institute. Informed written consent was obtained from all participants. Patients were consecutively recruited from the outpatient clinic of the Aria Medical Complex for MS in Tehran, Iran, and were diagnosed by a consultant neurologist according to the McDonald diagnostic criteria (2010 revision) 40. Healthy controls (HC) were recruited through local advertisements.
Exclusion criteria: severe depression and other major psychiatric comorbidities, presence of neurological disorders and medical illness that independently affect brain function and cognition (other than MS for the patient group), visual problems that cannot be corrected with eye-glasses such that the problem prevents participants from reading, upper limb motor dysfunction, history of epileptic seizures, history of illicit substance and/or alcohol dependence.
For each participant, we also collected information on age, education and gender, together with the clinical MS subtype where applicable. We quantified participant disability and disability progression over time using the Expanded Disability Status Scale (EDSS).
For the purposes of this study, patients with severe abnormality in at least one of the BICAMS sub-tests (defined as 2 standard deviations (SD) below the norm) or with mild abnormality (defined as 1 SD below the norm) in at least two sub-tests of BICAMS were identified as cognitively impaired.
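The classification rule above can be expressed as a short function. This is a minimal sketch, assuming that each BICAMS sub-test result has already been converted to a z-score relative to the normative data, that the thresholds are inclusive, and that a severely abnormal sub-test also counts as (at least) mildly abnormal. The function name and the dictionary keys are hypothetical.

```python
SEVERE_Z = -2.0  # 2 SD below the norm
MILD_Z = -1.0    # 1 SD below the norm


def is_cognitively_impaired(z_scores):
    """Apply the BICAMS-based impairment rule to per-sub-test z-scores.

    z_scores: mapping from sub-test name to z-score, e.g.
              {"SDMT": -1.2, "CVLT-II": -1.5, "BVMT-R": 0.4}
    """
    values = list(z_scores.values())
    n_severe = sum(z <= SEVERE_Z for z in values)
    # Assumption: severe abnormality also counts toward the mild criterion.
    n_mild_or_worse = sum(z <= MILD_Z for z in values)
    return n_severe >= 1 or n_mild_or_worse >= 2
```

Under these assumptions, a single sub-test at -1.2 with the other two in the normal range would not meet the criterion, whereas two sub-tests at -1.2 and -1.5 would.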
2.4 Study procedures
Substudy 1: 174 participants (Table 1) took the iPad-based ICA test and the pen-and-paper BICAMS test, administered in random order. The same researchers who administered the BICAMS directed participants on how to take the ICA test. In this substudy, we investigated convergent validity of the ICA test with BICAMS, ICA’s test-retest reliability and the sensitivity and specificity of the ICA platform in detecting cognitive impairment in MS.
To measure test-retest reliability for the ICA test, a subset of 21 MS and 22 HC participants were called back after five weeks (± 15 days) to take the ICA test as well as the SDMT. The subset’s characteristics were similar to the primary set in terms of age, education and gender ratio. For both SDMT and the ICA, the same forms of the tests were used in the re-test session. Note that in the ICA test, while the images were the same, they were presented in a different random order in each administration.
Substudy 2: In this substudy, we investigated ICA’s correlation with the level of serum NfL in 48 MS patients (Table 2). Participants took both the ICA and the SDMT test, administered in random order. The ICA and SDMT were administered in the same session, but blood samples were collected in another visit with a gap of 2-3 days in between.
Blood samples were collected in tubes for serum isolation, then centrifuged at 3000 rpm for 20 minutes of blood draw, and finally placed on ice. Serum samples were measured at 1:4 dilution. NfL concentrations in serum were measured using a commercial ELISA (NF-light® ELISA, Uman Diagnostics, Umeå, Sweden). We used an anti-NF-L monoclonal antibody (mAB) as the capture antibody and a biotin-labeled anti-NF-L mAB as the detection antibody. All samples were measured blinded. ELISA readings were converted to units per milliliter using a standard curve constructed from calibrators (bovine lyophilized NfL obtained from Uman Diagnostics).
Participants in Substudy 2 also attended an 8-week physical and cognitive rehabilitation program, details of which are reported in separate studies 41,42. The physical rehabilitation program included a combination of endurance and resistance exercises, with gradually increasing intensities over the 8-week period. The cognitive rehabilitation program included playing newly developed games in a virtual reality (VR) environment, targeting sensorimotor integration, memory-based navigation and visual search. For the purpose of this study, we measured pre- and post-rehabilitation ICA results for this group of participants, as well as the correlation of the ICA with NfL pre- and post-rehabilitation.
Participants were divided into a rehabilitation group of 38 individuals and a control group of ten; the control group only took the tests before and after these 8 weeks without attending the rehabilitation program. The rehabilitation group attended three sessions each week, each of them lasting about 70 minutes.
2.5 Accuracy, speed and ICA summary score calculations
In the ICA, participants’ responses to each image and their reaction times (i.e. time between image onset and response) are recorded and used to calculate their overall accuracy and speed. Speed and accuracy are then used to calculate an overall summary score, called the ICA score.
Accuracy is simply defined as the number of correct categorizations divided by the total number of images, multiplied by 100.
\[
\text{Accuracy} = \frac{\text{number of correct categorizations}}{\text{total number of images}} \times 100 \qquad (1)
\]
Speed is defined based on participants’ response reaction times to images they categorized correctly.
[Please see the supplementary files section to view the equation.](2)
Speed is inversely related to reaction time; the higher the speed, the lower the reaction time.
Preprocessing: We used a boxplot to remove outlier reaction times before computing the ICA score. A boxplot is a non-parametric method for describing groups of numerical data through their quartiles, and it allows outliers in the data to be detected. Following the boxplot approach, reaction times greater than q3 + w * (q3 - q1) or less than q1 - w * (q3 - q1) are considered outliers, where q1 and q3 are the lower and upper quartiles of the reaction times and w is the whisker length (w = 1.5). The number of reaction-time data points removed by the boxplot can vary from case to case; if this number exceeds 40% of the observed images, the results are deemed invalid and a warning is shown to the clinician to repeat the test. In this study, none of the participants triggered such a warning. The maximum percentage of outliers was 15%, observed in one of the MS patients.
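The outlier rule and the validity check can be sketched in a few lines of Python. The quartile estimator is an assumption: several conventions exist, and the text does not say which one was used, so the sketch relies on the `inclusive` method of `statistics.quantiles` from the standard library. The function names are hypothetical.

```python
import statistics

WHISKER = 1.5                # w in the text
MAX_OUTLIER_FRACTION = 0.40  # above this, the test is deemed invalid


def remove_rt_outliers(reaction_times_ms, whisker=WHISKER):
    """Split reaction times into (kept, removed) using the boxplot rule."""
    # Assumption: quartiles follow the 'inclusive' interpolation method.
    q1, _median, q3 = statistics.quantiles(
        reaction_times_ms, n=4, method="inclusive"
    )
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    kept = [rt for rt in reaction_times_ms if lower <= rt <= upper]
    removed = [rt for rt in reaction_times_ms if rt < lower or rt > upper]
    return kept, removed


def is_valid_test(n_removed, n_images):
    """Return False when removed reaction times exceed 40% of the images."""
    return n_removed <= MAX_OUTLIER_FRACTION * n_images
```

For the illustrative input `[400, 420, 450, 470, 500, 520, 550, 3000]`, the quartiles are 442.5 and 527.5, so the fences lie at 315 and 655 and only the 3000 ms response is removed.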
The ICA score is a combination of accuracy and speed, defined as follows:
[Please see the supplementary files section to view the equation.] (3)
2.6 ICA’s artificial intelligence (AI) engine
ICA’s AI engine (Figure 2) used in this study was a multinomial logistic regression (MLR) classifier trained on a set of features extracted from the ICA test output for each participant. These features included the ICA score and the trends of speed and accuracy during the test (i.e. whether the speed and/or accuracy were increasing or decreasing over the time-course of the test). The classifier also took the subject’s age, gender and education as inputs in order to match subjects with similar demographics.
The multinomial logistic regression (MLR) classifier 43 is a supervised, regression-based learning algorithm. The learning algorithm’s task is to learn a set of weights for a regression model that maps participants’ ICA test output to classification labels.
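To make the idea of learning weights that map features to class labels concrete, the following is a minimal, dependency-free sketch of multinomial (softmax) logistic regression trained by stochastic gradient descent. It is not the ICA AI engine: the optimizer, learning rate, number of epochs, the two-feature toy data and the two class labels are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, x):
    """Class probabilities for one feature vector x."""
    scores = [
        sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return softmax(scores)


def fit(X, y, n_classes, lr=0.1, epochs=200):
    """Learn one weight vector and bias per class by gradient descent
    on the cross-entropy loss."""
    n_features = len(X[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in zip(X, y):
            probs = predict_proba(weights, biases, x)
            for k in range(n_classes):
                error = probs[k] - (1.0 if k == label else 0.0)
                for j in range(n_features):
                    weights[k][j] -= lr * error * x[j]
                biases[k] -= lr * error
    return weights, biases


def predict(weights, biases, x):
    """Index of the most probable class."""
    probs = predict_proba(weights, biases, x)
    return probs.index(max(probs))


# Synthetic toy data (not study data): two standardized features per
# participant; labels 0 and 1 are hypothetical class indices.
X_toy = [[1.0, 0.2], [0.8, 0.1], [-0.9, -0.3], [-1.1, -0.1]]
y_toy = [0, 0, 1, 1]
```

Calling `fit(X_toy, y_toy, n_classes=2)` returns the learned weights and biases, which `predict` then uses to assign a class to a new feature vector. In practice, additional inputs such as age, gender and education would be appended to each feature vector in the same way.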
The difference between using ICA’s AI engine to detect cognitive impairment and the conventional approach of defining a cut-off value for the test’s outcome score is discussed further in the Discussion section.