Investigating the Need for Calibration to Track Eye Movements: A Feasibility Study

Automated eye tracking technology could enhance diagnosis and treatment for many neurological diseases, including posterior circulation stroke. Much of the current literature focuses on gaze estimation through a form of calibration. Unlike other fields, medicine has a clear need to better track eye symmetry during movement for better detection of abnormal conjugacy, ductions, and vestibulo-ocular function in a variety of neurological diseases. However, patients with neuro-ocular deficits may have a difficult time completing a calibration procedure due to inattention and other associated neurologic deficits. Here, we investigate the need for calibration to measure the symmetry of eye movements in healthy individuals, including testing fixations, smooth pursuits, and saccades. The results of this feasibility study suggest that calibration may not be necessary to measure and track binocular eye movements in tandem. The structure or shape that the eyes draw during visual tracking remains intact even without a calibration procedure. This preliminary study suggests that the technology can be deployed without a calibration procedure within this clinical context. Further research is needed to validate these findings in populations with neuro-ophthalmologic disease, including posterior circulation stroke.


Introduction
Posterior circulation stroke (PCS) is a type of stroke for which gaze estimation could be applied to detect abnormal eye movements. PCS accounts for about 25 percent of ischemic strokes 1 , and is three times 2 more likely to be misdiagnosed than anterior circulation stroke (ACS). One reason is that emergency medical services (EMS) screening tools are not well suited to detect PCS. Moreover, current EMS screening tools do not incorporate symptoms such as abnormal eye movement. As a result, patients with PCS are treated later and less often with acute stroke therapy than patients with ACS. Recognizing the symptoms and getting treatment within 60 minutes can prevent fatality and disabilities 3 .
Dizziness, among other frequent symptoms of PCS, is a non-specific symptom accounting for over 2.5 million emergency department (ED) visits annually 4,5 . Diagnosing symptoms such as abnormal eye movements often requires a trained physician. A trained physician would look into deficits of interest such as nystagmus, abnormal visual fixation, and skew deviation 6,7 . Here the physician performs specific examination procedures that involve fixation and smooth-pursuit of visual stimuli. Abnormal eye position and movements due to a PCS can be subtle and difficult to detect by the untrained observer 6,8 . While clinical maneuvers such as the HINTS exam 9 are a viable solution in the ED, they are not well adapted for use by non-experts in the pre-hospital setting such as first-responders. Therefore, the development of an automated eye-tracking tool could improve PCS screening in the pre-hospital setting in order to provide faster diagnosis and rapid treatment to achieve better patient outcomes.
Device-based eye tracking is widely used to quantify the gaze of an individual, much like a physician assessing eye movements at the bedside. Estimation performed by eye tracking is pertinent to various fields of medicine and clinical research including ophthalmology 10,11 , psychiatry 12 , psychology 13 , psychopharmacology 14 , and neurology [15][16][17][18] . Depending on the use case, researchers may opt to deploy either wearable devices or screen-based platforms to estimate eye movements. Among neuro-ocular deficits, eye trackers can be used to detect nystagmus 19 , quantify saccades 15 , assess conjugacy, and detect scotomas 11 . Pong et al. 20 used a wearable eye tracker to detect nystagmus by performing pupil extraction and estimating inner motion of the eye. Terao et al. 15 present a review on the use of eye tracking technology for saccade detection in clinical settings, and its relation to identifying neurological disorders. Researchers have also investigated visual field defects such as peripheral loss, central loss and hemifield loss using gaze estimation 11 .
Eye tracking has also been used to quantify the mental state of patients with Alzheimer's disease, with a mobile-device-based eye tracker used to perform cognitive assessments 13 . Kapoula et al. 19 used gaze estimation to detect abnormalities in eye movements such as optokinetic nystagmus (OKN) and smooth pursuit (SP) in participants suffering from somatic tinnitus. Puuganti et al. 21,22 investigated saccade detection during nystagmus using a wearable eye tracker while performing the Dix-Hallpike maneuver. Others used eye trackers to understand motion blur caused by nystagmus 23 , and attention deficits 24 . Kumar et al. 16 used eye tracking to quantify abnormal eye movements in stroke survivors during the rehabilitation process.
One limitation of using device-based eye tracking in the clinical setting is the requirement of a preparatory calibration procedure. Particularly in clinical research, some neurological studies 16 have excluded participants who failed the calibration procedure. Calibration is performed to minimize the error between the estimated point of gaze (POG) and the actual POG. The calibration procedure offered by most manufacturers [25][26][27] requires the participant to follow instructions and fixate gaze on a set of known points on a screen. This is a major limitation in clinical settings, as patients suffering from cognitive deficits, pathological vertigo or nystagmus, or visual field defects may have difficulty completing the procedure 11,28,29 . The need for a calibration procedure further limits the use of device-based eye trackers in time-sensitive clinical situations, such as acute stroke, where eye tracking could be useful to identify certain strokes in the emergency setting. Hence, we sought to investigate the feasibility of gaze estimation to detect eye movement abnormalities without performing a calibration procedure.
To investigate the feasibility of gaze estimation of abnormal eye movements, we adapted a set of standard neurological eye movement examinations (i.e. NeuroEye) and assessed a device-based eye tracker's performance with and without calibration. We tested this initially in a healthy population to first understand the normal variance of non-pathological eye movements.

Participant characteristics.
Nineteen healthy controls participated in this calibration study. The mean age was 40 years, and the cohort was 79% female and 21% male. The racial distribution of the controls was 81% White, 14% Asian, and 5% American Indian or Alaska Native. The Tobii Pro Fusion eye tracker was placed 0.6 meters from the participant. However, one participant did not follow the instructions during the experiment and leaned far forward toward the eye tracker. This led to the eye tracker's inability to estimate the gaze. Therefore, this participant was excluded from the study.

Analysis
The HO Test has a low average correlation for both conditions compared to the other eye tests. With calibration, the average coefficient was 0.525 (0.349, 0.686) on the X axis and -0.05 (-0.306, 0.21) on the Y axis. Without calibration, the average coefficients are 0.561 (0.371, 0.727) on the X axis and -0.07 (-0.26, 0.142) on the Y axis. Of note, the 95% confidence intervals are wider for both calibration conditions than for the other eye tests. Furthermore, Fig. 3 is a heat-map that shows a low density of gaze samples over the screen target's true position for both calibration conditions. These results suggest an inability to capture gaze samples during changes in head position, specifically when the head tilts. Further analysis indicated that the eye tracker failed to capture ≈ 20% of gaze data for both conditions. Therefore, the greater proportion of missing data during the HO test suggests that the eye tracker failed to maintain eye detection during the change in the vertical orientation of the head.
The OKN test showed equivalent average correlations on the X axis, 0.933, as seen in Table 1. The 95% confidence interval of the correlation difference is -0.006 to 0.007. The correlations on the Y axis were not equivalent between calibration conditions; there was a difference of 0.079 (-0.052, 0.212). Similar to the HO test, the Y axis 95% confidence intervals are wider than those of the X axis. This suggests that there may be greater variability among measured Y coordinates (see Fig. 4 (a)). Because right and left eye movements show a clear linear correlation on the X axis, as seen in Fig. 4 (b), this difference is likely due to the high variability of the Y coordinates, which reflects the uncertainty of the participants' gaze on the rectangular shape images. The same figure also suggests that each participant's estimated gaze point of origin is different without the calibration procedure. This can be considered a random effect of intercepts.

Implications for stroke exams
This study provides preliminary evidence that a calibration procedure is not necessary to track eye movements over time and to measure eye movement symmetry. Regarding tracking eye movements over time or their path, the resulting shape from the changes in the X and Y positions is maintained for each participant irrespective of calibration. To demonstrate this assertion, Figures 1 to 4 show that the density and paths of the eyes along the X and Y axes maintain similar shapes. As shown in Eq. 2, the calibration condition appears to scale and center the estimated eye positions close to the actual screen coordinates, but does not affect the general shape that the path of the eye movement creates.
Additionally, in neurology and ophthalmology, symmetry of eye movement is very important. A significant deviation away from eye symmetry in terms of velocity when tested in parallel, lack of motion in any direction when tested alternatively, or non-mirrored eye alignment is considered pathological. The Spearman correlation coefficients within this context for all the various conditions, tests, and both axes are shown in Table 1. All differences between calibration conditions are zero or close to it, ranging from -0.041 to 0.079. Additionally, each bootstrapped 95% confidence interval contains zero, which suggests that there is no significant difference between calibration conditions among all the tests and axes. Even when the repeated measurement of participants is taken into account via a linear mixed effects model, the calibration condition parameter did not reach significance. These results suggest that the calibration procedure is not necessary for the measurement of eye movement symmetry.
These two points of path shape and conjugate eye symmetry are important clinically. A neurologist, for example, is much more concerned about the conjugacy of the eyes' direction and velocity following a visual target than about the accuracy or precision of the eyes' point of gaze on the same target. In the same vein, consider a neurologist examining a patient who is unable to move the right eye in all directions. The right eye cannot move away from the nose (abduction). This suggests to the neurologist that the right eye may have a palsy of the sixth cranial nerve, which is responsible for horizontal abduction of the eye. While the lack of eye movement should be easily detected and measured, other characteristics like saccades are more difficult to observe and quantify. However, the path that each eye takes can be easily visualized, as in Fig. 2, even without calibration. The path that the eyes take when tracking a target (smooth pursuit) also allows the quantification of direction and speed of eye movement. This is necessary for appropriate labeling and detection of changes in velocity of the eyes and the symmetry between them.
In summary, this feasibility study demonstrates that the eye tracker is able to measure ocular path and symmetry well with or without calibration in healthy participants. However, this study did not observe individuals with neuro-ocular deficits that result in asymmetry. It is reasonable to assume that the eye tracker, regardless of calibration, will be able to measure the lack of correlation or symmetry between eyes in the presence of ocular weakness or lack of available motion. Even so, this assumption needs to be validated within a population that has neuro-ocular deficits, which is our focus of further study. Because the eye tracker maintains the structure or shape that the eyes draw as they track or find a target and has the ability to measure gaze symmetry without a calibration procedure, the technology can be deployed in a manner that skips calibration within this clinical context. This enhances eye tracking technology's potential to translate into the pre-hospital setting as a screening tool for stroke.
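The invariance behind these findings can be illustrated with a small simulation. The sketch below is written in Python rather than the R used in the study, and it uses simulated positions, not study data. It assumes that, along one axis, the uncalibrated signal of each eye differs from the calibrated one only by a positive scale and an offset (no rotation, shear, or nonlinear terms); the scale and offset values are made up for illustration.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


rng = random.Random(0)

# Simulated horizontal positions: both eyes follow the same target with noise.
target = [math.sin(0.1 * k) for k in range(200)]
left = [t + rng.gauss(0, 0.05) for t in target]
right = [t + rng.gauss(0, 0.05) for t in target]

# Hypothetical uncalibrated readings: each eye has its own scale and offset.
left_raw = [1.3 * v + 0.2 for v in left]
right_raw = [0.8 * v - 0.1 for v in right]

rho_cal = spearman(left, right)
rho_raw = spearman(left_raw, right_raw)
print(rho_cal, rho_raw)
```

Because a positive scale and an offset preserve the ordering of samples, the ranks and therefore the rank correlation between the eyes are unchanged. This would not hold in general if the distortion mixed the X and Y axes or were non-monotonic.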

Methods
Understanding Calibration
Figure 5. Illustration of the calibration process.
The point of gaze (POG) is defined as the intersection of the visual axis of the eye with the scene. Eye trackers use a gaze calibration procedure to estimate this intersection (see Fig. 5). Essentially, the process of calibration uses a minimization function to reduce the error between the estimated gaze point $g_{\text{non-cal}}$ and the calibration point/target on the screen $g_{\text{cal}}$ (see Eq. 1):

$$g_{\text{cal}} = f(g_{\text{non-cal}}) \tag{1}$$

with $g_{\text{cal}}$ and $g_{\text{non-cal}}$ both points (2D vectors) on the screen, and with the underlying assumption that the subject is indeed focusing his or her gaze on the screen point $g_{\text{cal}}$. Generally speaking 27,30,31 , the function $f$ is defined as

$$f(g) = t + A g + \sum_{i=1}^{M} c_i \, \phi_i(g) \tag{2}$$

where $t$ is a translation vector, $A$ a $2 \times 2$ matrix encompassing scaling, shear, and rotation, and $c_i$ coefficients for nonlinear basis functions $\phi_i$ (e.g. polynomials or radial basis functions) to account for any non-linearities present in the system. In this formulation, the calibration model parameters are $t$, $A$, and $c_i$, $i = 1, \cdots, M$. With knowledge of a mathematical model and measured points, the goal of the calibration procedure is to obtain many corresponding measurements $g^{k}_{\text{cal}}$, $g^{k}_{\text{non-cal}}$, $k = 1, \cdots, N$, and then find model parameters $t$, $A$, and $c_i$ that minimize some measure of error between the calibrated and non-calibrated points. One popular error function is least squares, in which case the optimization problem can be defined as:

$$\min_{t,\, A,\, c_i} \; \sum_{k=1}^{N} \left\| g^{k}_{\text{cal}} - f\!\left(g^{k}_{\text{non-cal}}\right) \right\|^{2} \tag{3}$$

with $f$ as given in Eq. 2.

Understanding the above, it then becomes clear that whichever procedure is being used by a particular device manufacturer to obtain $g_{\text{non-cal}}$, uncalibrated points tend to be a 'deformed' version of $g_{\text{cal}}$. In cases where the nonlinear component of the model expressed in Eq. 2 is negligible, the points will be translated and rotated/scaled/sheared versions of the actual points (again assuming that the patient is actually looking exactly at the stimulus point presented on the screen).
In such cases, a few observations can be made with regard to using uncalibrated data to obtain quantitative information for differentiating normal vs. abnormal populations:
• Accuracy. Without calibrated measurements, it is not possible to reliably measure how accurately different populations focus on and track points on the screen, because the true points on the screen cannot be compared with uncalibrated estimates of gaze points.
• Discriminant analysis. Often pattern recognition methods are used on a collection of gaze points estimated from a subject during an examination procedure to 'learn' differences between normal and abnormal populations. To the extent that the pattern recognition methods can be made invariant with respect to affine transformations, discriminant analysis is possible.
• Agreement between left and right eye focus. If quantitative information regarding how well left and right eye movements correlate is desired, the correlation procedures must take into account the model being expressed in eq. 2. For example, if the nonlinear portion of the model is deemed negligible, the correlation between uncalibrated left and right eye movements will undoubtedly be linear 32 . This is due to the fact that the remaining affine portion of the model in eq. 2, by definition, does not affect linear correlations.
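The least-squares calibration described above can be sketched for the purely affine case, in which the nonlinear terms are dropped and only the matrix A and translation t are estimated. The Python sketch below is illustrative only: the function names, the nine-point target grid, and the distortion values are assumptions, and it does not represent any manufacturer's actual calibration routine.

```python
import math


def solve_linear(m, b):
    """Solve m x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_affine(raw_points, target_points):
    """Least-squares fit of target ~ A @ raw + t; returns (A, t)."""
    design = [[x, y, 1.0] for x, y in raw_points]
    # Normal equations (D^T D) w = D^T z, solved once per output axis.
    dtd = [[sum(r[i] * r[j] for r in design) for j in range(3)]
           for i in range(3)]
    rows = []
    for axis in range(2):
        dtz = [sum(r[i] * p[axis] for r, p in zip(design, target_points))
               for i in range(3)]
        rows.append(solve_linear(dtd, dtz))
    A = [[rows[0][0], rows[0][1]], [rows[1][0], rows[1][1]]]
    t = [rows[0][2], rows[1][2]]
    return A, t


def apply_affine(A, t, point):
    """Map one uncalibrated point through the fitted affine model."""
    x, y = point
    return (A[0][0] * x + A[0][1] * y + t[0],
            A[1][0] * x + A[1][1] * y + t[1])


# Hypothetical nine-point calibration grid in normalized screen coordinates.
targets = [(x, y) for x in (0.1, 0.5, 0.9) for y in (0.1, 0.5, 0.9)]

# Made-up affine distortion standing in for uncalibrated gaze estimates.
raw = [(1.2 * x + 0.1 * y + 0.05, -0.05 * x + 0.9 * y - 0.02)
       for x, y in targets]

A, t = fit_affine(raw, targets)
corrected = [apply_affine(A, t, p) for p in raw]
max_err = max(math.hypot(cx - tx, cy - ty)
              for (cx, cy), (tx, ty) in zip(corrected, targets))
print(max_err)
```

With noise-free, purely affine data the fitted model maps the uncalibrated points back onto the targets up to rounding error. With real gaze data the residual would reflect noise and any nonlinear component the affine model leaves out.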

Experiment Setup
The experimental protocol was approved by the University of Virginia's Institutional Review Board. As such, the protocol complies with all national ethical research standards and is in accordance with the Declaration of Helsinki. Written informed consent was obtained prior to subject enrollment and testing. The experimental setup for our feasibility study, which assessed non-calibrated gaze for detecting PCS as a pre-hospital screening tool, consisted of two main components: RoADIE (Rolling Apparatus to Detect Impairment of the Eyes) and NeuroEye, a neurology eye examination that comprises four different tests investigating various motor functionalities of the eye.
Figure 6. Rolling Apparatus to Detect Impairment of the Eyes (RoADIE). (a) Illustration of the hardware components for RoADIE, (b) Data synchronization for RoADIE.

RoADIE
RoADIE is a mobile rig that was developed to acquire gaze data while performing the NeuroEye examinations. RoADIE is equipped with a HIPAA-compliant computer to run our custom-built data acquisition software and store data, a Tobii Pro Fusion Eye Tracker 25 to estimate the gaze, a RealSense camera 33 to acquire visual data, and screen-2, which displays the stimuli of the NeuroEye exam (see Fig. 6 (a)). The data acquisition for all the sensor modalities was triggered using a global timer and was locally synchronised to a precision of 100 ms (see Fig. 6 (b)).
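One way to pair samples from two sensor streams that share a global clock is nearest-timestamp matching with a tolerance. The Python sketch below is only an illustration of that idea and is not the RoADIE acquisition software: the function name, the timestamps, and the use of the 100 ms figure as a matching tolerance are all assumptions.

```python
import bisect


def align_nearest(reference_ts, other_ts, tolerance=0.100):
    """For each reference timestamp (seconds), return the index of the
    nearest timestamp in the sorted list other_ts, or None if the nearest
    one is farther away than the tolerance."""
    matches = []
    for ts in reference_ts:
        pos = bisect.bisect_left(other_ts, ts)
        # The nearest neighbour is either just before or at the insert point.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(other_ts)]
        best = min(candidates, key=lambda i: abs(other_ts[i] - ts),
                   default=None)
        if best is not None and abs(other_ts[best] - ts) <= tolerance:
            matches.append(best)
        else:
            matches.append(None)
    return matches


# Hypothetical timestamps (seconds) for gaze samples and camera frames.
gaze_ts = [0.00, 0.25, 0.50, 0.75]
camera_ts = [0.02, 0.30, 0.48, 1.00]

pairs = align_nearest(gaze_ts, camera_ts)
print(pairs)  # [0, 1, 2, None]
```

The last gaze sample has no camera frame within 100 ms, so it is left unmatched rather than paired with a distant frame.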

NeuroEye
The NeuroEye examination comprises four different tests, investigating various motor functionalities of the eye. The four tests are the "Dot-Test", "H-Test", "OKN-Test", and "HO Test". They are computer adaptations of standard bedside clinical tests. The Dot Test is designed to assess the quality of eye coordination for each saccade. A clinician observes the movement of both eyes as the patient shifts his or her gaze from target to target (see Fig. 7 (a)). The clinician looks for the eyes to suddenly move to the next visual target and stop accurately at its destination. Abnormal signs consist of the eyes stopping short of or too far past the target, also known as under- or over-shooting, respectively, or the eyes failing to initiate a saccade. The H Test is designed to assess the quality of eye movement at a constant slow pace as the eyes track a visual target (smooth pursuit) in all directions. The "H" pattern (see Fig. 7 (b)) ensures that both eyes track the target in all directions to their end ranges. Abnormal signs consist of the eyes lacking motion, or its initiation, in any direction, or the eyes using small saccades to catch up to the moving visual target.
The Head Orientation (HO) Test uses a stationary dot that appears at the center of the screen (see Fig. 7 (c)) and requires the patient to tilt the head to the left and right in alternating fashion. The patient reports if the movement results in double vision. If he or she has double vision already, the patient should remark whether it worsens, improves, or both, and in which direction. This is because head tilting exacerbates a vertical mal-alignment as the head tilts toward the superiorly positioned eye relative to the other one. Conversely, head tilting toward the inferiorly positioned eye will reduce the double vision as the eyes become level with each other. In patients with an Ocular Tilt Reaction (OTR), the patient's initial head tilt is expected to be biased toward one side. Thus, the amount of head tilting may be assumed to be symmetric in a healthy control and more asymmetric in a patient suffering a lesion affecting his or her gravitational perception from the utricle and vertical canal structures.
The optokinetic nystagmus (OKN) test measures the patient's ability to perform smooth pursuit, similar to the H test, and his or her ability to initiate a saccade to fixate on the next visual target after the first target disappears. The visual stimulus is typically vertical bars with a high contrast to the background. The bars move at a quick and constant pace from right to left (see Fig. 7 (d)). This is repeated in the opposite direction. OKN typically remains preserved in individuals with occipital lobe infarcts. Asymmetry in performance of the eyes or poor ability to generate saccades may suggest damage to the ocular, brainstem, or cerebellar nuclei or tracts.

Statistical Plan
Statistical analyses were conducted with R (v4.0.3, R Foundation for Statistical Computing, Vienna, Austria). For all four tests, we assumed that normal eye movement consists of highly correlated position changes between the left and right eyes for the X and Y axes. Additionally, the rate of change that both eyes experience was assumed to be constant, which results in a linear relationship.
A Spearman's rank correlation was calculated to measure the correlation of left and right eye movements for each participant by axis, condition, and test. We chose the Spearman's coefficient over Pearson's due to the non-Gaussian distribution of coordinates. Because of the small sample of participants, we performed 10,000 bootstraps of the participants' Spearman's coefficients to estimate the average correlation coefficient and its 95% confidence interval for each axis, calibration condition, and test. This ensures that the participant-level correlation coefficients in the sample are independent. We then repeated the approach for the difference in correlation coefficients between calibration conditions and its 95% confidence interval. Table 1 shows these results.
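The bootstrap step can be sketched as follows. This is a Python illustration rather than the R code used in the study, it assumes a percentile confidence interval (the interval method is not stated above), and the participant-level coefficients are made-up numbers, not study data.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_boot=10_000, alpha=0.05, seed=0):
    """Resample participants with replacement and return the bootstrap
    estimate of the mean with a percentile (1 - alpha) interval."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.fmean(means), lower, upper


# Hypothetical participant-level Spearman coefficients for one test and axis.
with_cal = [0.95, 0.93, 0.96, 0.91, 0.94, 0.97, 0.92, 0.95, 0.90,
            0.96, 0.93, 0.94, 0.95, 0.92, 0.97, 0.91, 0.94, 0.96]
without_cal = [0.94, 0.94, 0.95, 0.92, 0.93, 0.97, 0.91, 0.96, 0.91,
               0.95, 0.93, 0.95, 0.94, 0.92, 0.96, 0.92, 0.93, 0.96]

# Paired difference per participant between calibration conditions.
diffs = [a - b for a, b in zip(with_cal, without_cal)]

mean_diff, lo, hi = bootstrap_mean_ci(diffs)
print(mean_diff, lo, hi)
```

Resampling whole participants, rather than individual gaze samples, keeps each resampled unit independent; an interval for the difference that contains zero would be read as no detectable effect of calibration.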
To estimate the overall effect of calibration on the eye-tracker's ability to measure symmetry of eye movement, a linear mixed effects model was implemented with the "lme4" 34 package to account for the repeated measurement among participants and their random effects seen in Fig. 4 (b). Participant level Spearman correlations were the response variable with the following co-variables: calibration condition, axis, and test as fixed effects with participant as a random effect. Prior to creating the model, the Spearman correlations, the response variable, received a Fisher's Z transformation to achieve a normal distribution of correlations. Additionally, weights were implemented to account for the different number of observations for each test and correct for heteroscedasticity. The linear mixed effects model showed an even distribution of residuals and fitted values. The calibration condition parameter failed to achieve significance.
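One plausible way to write the model described above is given below as a sketch. The notation is introduced here for illustration and is not taken from the study; in particular, a random intercept per participant and residual variances inversely proportional to the weights are assumptions consistent with the description, not confirmed details.

```latex
% Fisher's Z transformation of the Spearman coefficient \rho_{ij}
% for participant i and observation j (axis, test, condition):
z_{ij} = \operatorname{artanh}(\rho_{ij})
       = \tfrac{1}{2} \ln\!\left( \frac{1 + \rho_{ij}}{1 - \rho_{ij}} \right)

% Linear mixed effects model with a random intercept u_i per participant;
% C, X, and T_m are indicator variables for calibration condition, axis,
% and test, and w_{ij} are the observation weights:
z_{ij} = \beta_0 + \beta_{\mathrm{cal}} C_{ij} + \beta_{\mathrm{axis}} X_{ij}
       + \sum_{m} \beta_{\mathrm{test},m} T_{ij,m} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}\!\left(0, \sigma^2 / w_{ij}\right)
```

Under this formulation, the statement that the calibration parameter failed to achieve significance corresponds to the estimate of the calibration coefficient not being distinguishable from zero.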