This study describes a novel, automated method for strabismus screening (“Strabismus screening Test using Augmented Reality and Eye-tracking”: STARE). STARE combines free, open-source software with a commercially available headset with integrated eye-tracking (HTC Vive Pro Eye, although several similar headsets are also available) to perform an automated alternate cover test. The patient’s task was simply to fixate on a physical target in the eye clinic for 60 seconds.
Encouragingly, STARE detected moderate and severe cases of horizontal strabismus with reasonable sensitivity (87%) and 100% specificity. Given our intended use of STARE as a screening tool, its 100% specificity is particularly important: with an estimated strabismus prevalence of 1.93% [14], a specificity lower than approximately 98% would result in more false positive referrals than true positives. The test was reasonably quick (60 seconds once the patient was wearing the headset and had been instructed to look at a target) and tolerable (100% completion rate amongst control participants and patients), so it might conceivably be used remotely by non-specialists in future as a means of identifying those who need face-to-face specialist care.
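The prevalence argument above can be checked with a short calculation. The following Python sketch is illustrative only: the function names are our own and are not part of the STARE software. It computes the expected true and false positive referrals in a screened population, and the specificity at which the two are equal. Using the 1.93% prevalence, the break-even specificity is roughly 98.0% for a perfectly sensitive test and roughly 98.3% at 87% sensitivity.

```python
def expected_referrals(prevalence, sensitivity, specificity, population=100_000):
    """Return (true_positives, false_positives) expected when screening `population` people."""
    affected = population * prevalence
    unaffected = population - affected
    true_positives = affected * sensitivity
    false_positives = unaffected * (1.0 - specificity)
    return true_positives, false_positives


def break_even_specificity(prevalence, sensitivity):
    """Specificity at which false positives equal true positives.

    Solves (1 - prevalence) * (1 - specificity) = prevalence * sensitivity.
    Below this value, false positive referrals outnumber true positives.
    """
    return 1.0 - (prevalence * sensitivity) / (1.0 - prevalence)


if __name__ == "__main__":
    prevalence = 0.0193  # estimated prevalence of strabismus [14]
    for sensitivity in (1.0, 0.87):
        threshold = break_even_specificity(prevalence, sensitivity)
        print(f"sensitivity {sensitivity:.2f}: break-even specificity {threshold:.4f}")
```

At 100% specificity the expected number of false positive referrals is zero regardless of prevalence, which is why this property matters most for a screening application.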
We were keen to determine whether STARE, in its current iteration, could also quantify horizontal deviations, as would be expected in a specialist setting. Of note, our technical validation procedure, based upon seven control participants with simulated deviations, highlighted large variation in response amplitude, particularly at higher magnitude deviations. This individual variability may account for the wider limits of agreement in our Bland-Altman analysis compared to previous virtual reality tests (95% limits of agreement of 27.9 prism dioptres, compared to 11.3 [8] and 3.1 [9]). One extreme outlier can be identified in our patient data: an 83-year-old myopic patient with an alternating esotropia measuring 25 prism dioptres on APCT and 82 prism dioptres on STARE. The eye trackers incorporated within the virtual reality headset rely on pupil tracking under infra-red illumination to determine eye movement in millimetres [13], and extremes of axial length or vertex distance would lead to errors in converting this movement into a deviation in prism dioptres. This may well explain the gross overestimation of the deviation in the myopic patient mentioned above.
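For readers less familiar with the Bland-Altman approach, the sketch below shows the conventional calculation of the bias (mean difference) and the 95% limits of agreement (bias ± 1.96 standard deviations of the paired differences). It is a generic illustration rather than the analysis code used in this study; the function name is our own, and the example measurements are made-up placeholder values, not patient data.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the 95% limits of
    agreement are bias -/+ 1.96 times the sample standard deviation of
    those differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("Both methods need the same number of measurements")
    if len(method_a) < 2:
        raise ValueError("At least two paired measurements are required")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)
    return bias, bias - half_width, bias + half_width


if __name__ == "__main__":
    # Placeholder deviations in prism dioptres (hypothetical, for illustration only).
    headset_test = [10.0, 20.0, 30.0, 40.0]
    prism_cover_test = [8.0, 22.0, 27.0, 41.0]
    bias, lower, upper = bland_altman_limits(headset_test, prism_cover_test)
    print(f"bias {bias:.2f}, limits of agreement {lower:.2f} to {upper:.2f}")
```

Because the limits scale with the standard deviation of the differences, a single extreme pair such as the outlier described above can widen them considerably in a small sample.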
Comparisons to previous research
Previous authors have attempted to automate strabismus assessment using photographs [15][16][17] and binocular OCT [18]. The former in particular has significant potential for large-scale strabismus screening, but fails to capture the nuances of the alternate prism cover test, including identification of phorias and abnormal retinal correspondence. The assessments are also limited by room lighting, image quality and spectacle correction.
Commercially available virtual reality headsets overcome many of these limitations by offering complete control of targets and illumination, and are a natural choice of device to digitise the alternate prism cover test. Yeh et al. (2021) [8] compare the performance of a virtual reality headset-based test with the APCT in a cohort of 38 patients and show good agreement for both horizontal and vertical deviations. However, their test requires a skilled operator who is able to assess the presence of eye movement during the test and change the target position accordingly. By contrast, and as in the present study, Miao et al. (2020) [9] report an automated virtual reality headset-based test that does not require a skilled operator. They find good agreement with the alternate prism cover test in the assessment of horizontal deviation in 17 patients. However, all patients were required to have their interpupillary distance and axial length measured in addition to performing the test, in order for an angle of deviation to be estimated.
The aforementioned studies used single fixation targets on a uniform black background, reported to be at a distance of 6 m within a ‘virtual environment’. Nevertheless, both note uncertainties around the patient’s perceived fixation distance, and indeed Yeh et al. (2021) [8] find a tendency towards esotropia with their virtual reality test, suggesting accommodation and significant convergence to the target. In our study, instead of depriving the patient of their usual surroundings and distance cues, we allow the patient to view a faithful stereoscopic representation of the room they are in. There is thus complete flexibility over the choice of fixation target: a young child could, for instance, be asked to look at their parent’s face for 60 seconds whilst the STARE protocol is performed.
Test Limitations
In order for pupil tracking to be accurate, the virtual reality headset must be fitted correctly on the patient’s head. The headset used in this study would be too big for use in young children, but was selected as it is relatively inexpensive and readily commercially available. In future, we expect smaller and cheaper virtual reality headsets to become widespread. The eye trackers used in this study were unable to measure torsion, and STARE cannot be used in those with nystagmus. STARE requires the patient to be able to fixate on a target throughout the test, the optimum duration of which appears to be a full 60 seconds (Fig. 7). The protocol used in the present study does not differentiate between manifest and latent deviations.
Study Limitations and Future work
There are several limitations to this feasibility study of STARE. External validity is limited by our small sample of patients. Only horizontal deviations were considered when comparing the size of deviation between APCT and STARE, and no intra- or inter-observer data were recorded. In addition, ocular deviation was measured with STARE only in the primary position.
This study forms the preparatory work for future iterations of the test, where we aim to use computer-generated augmented reality overlays of fixation targets within our real-world environment, in order to eliminate the effect of head movement and more readily determine size of deviation for both near and distance. By introducing a cover-uncover test into the protocol, we aim to also differentiate manifest from latent deviations.