3.1 Identification of studies
We identified 783 records through the electronic database search. After excluding non-RCTs through electronic filters (n=155) and duplicates (n=18), we screened the titles and abstracts of the remaining records. Having excluded studies that did not involve COPD participants (n=177), studies without a VR intervention (n=105), and studies written in a language other than English (n=155), we assessed the full text of the remaining 14 studies. After excluding 8 studies for not involving PR in both experimental groups, a total of 6 RCTs were included in this systematic review. A detailed flowchart is provided in Figure 1.
3.2 Methodological Quality
The methodological quality of all included studies was rated with the PEDro scale (Table 1); the mean score was 6.5/10. Specifically, two studies were rated 8/10, two 6/10, one 7/10, and one 4/10.
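As a quick arithmetic check, the mean can be recomputed from the six PEDro totals listed above. The sketch below uses only the scores stated in the text; the variable names are illustrative.

```python
# PEDro totals as reported in the text:
# two studies at 8/10, two at 6/10, one at 7/10 and one at 4/10.
scores = [8, 8, 6, 6, 7, 4]

# Arithmetic mean across the six included RCTs.
mean_score = sum(scores) / len(scores)

print(f"Mean PEDro score: {mean_score:.1f}/10")  # Mean PEDro score: 6.5/10
```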
To assess risk of bias, we examined the 10 components of the PEDro scale individually (Figure 2). Two categories, therapist blinding and assessor blinding, were left unaddressed by all the studies, and the criterion requiring outcome measurements from >85% of subjects receiving treatment as allocated was not addressed by more than 50% of the included studies. These represent significant sources of bias (Moseley et al. 2019).
3.3 Description of studies
The total number of participants in this systematic review was 360. All participants presented with stable COPD, apart from 16 participants in the study of Mazzoleni et al. (2014) who presented with other pulmonary diseases. The mean age ranged from 64 to 75 years. In most studies FEV1% was > 65, describing patients with a moderate degree of airflow obstruction. Only Suntanto et al. (2019) and Xie et al. (2021) included people with more severe COPD (FEV1% < 50, GOLD stage D).
All included studies (Table 2) added a VR component to usual PR, which was the main rehabilitation strategy for COPD patients. The technological equipment varied across the studies, from the Microsoft Xbox Kinect (Rutkowski et al. 2019, 2020) and Nintendo Wii Fit (Mazzoleni et al. 2014; Suntanto et al. 2019) to head-mounted displays (Rutkowski et al. 2021; Xie et al. 2021).
The Xbox 360 console was used along with the Kinect motion sensor to detect and follow the participants’ movements. The patients took part in mini-games from Kinect Adventures, such as rafting, cross-country running, hitting a ball and a roller coaster ride. The games included in the Kinect training focused on improving balance, elasticity, endurance and strength of the upper and lower limbs. Age-predicted maximal heart rate was used to monitor workload and ensure safe training. The Nintendo Wii Fit system uses haptic controllers and a balance board as interfaces to the games. In the study of Mazzoleni et al. (2014), the games comprised a “Yoga” activity with a deep breathing session in a standing position on the balance board, “Jogging Plus”, which involved running on the spot, and “Twisting and squatting”, which consisted of trunk twisting and arm-leg squatting. Similar games were used by Suntanto et al. (2019), such as “Yoga”, “Torso twist” and “Free run”. Pulse rate, respiratory rate and SpO2 were monitored to keep training safe, whilst intensity was monitored using the 10-point Borg scale. Patients were instructed to maintain the sensation of dyspnoea between 4 and 6 on the modified Borg scale.
In the most recent studies, head-mounted displays (HMDs) were used to immerse patients in a virtual environment. The included studies thus span from semi-immersive gaming platforms to fully immersive environments, the latter either in the form of a simulated bicycle ride (Xie et al. 2021) or a therapeutic garden that represents the patient’s health (Rutkowski et al. 2021), which illustrates the wide range of possibilities this technology offers. A full description of the rehabilitation programs followed by the experimental groups is presented individually for each group in Table 3.
3.4 Intervention Comparability
All of the included studies were randomized and included a control group and an adequate number of individuals. Only one study had a relatively low number of participants (Suntanto et al. 2019); most studies included between 20 and 30 participants per group. Sample size calculation was performed in all studies.
Significant clinical heterogeneity was noted between the included studies, attributable to differences in (a) the technology used, (b) intervention duration and (c) the outcomes assessed. Nevertheless, a quantitative synthesis was performed where possible.
3.5 Effects of Interventions
3.5.1 Effect of VR-Training on Exercise capacity (Figure 3)
The effect of VR-Training, with or without other parallel interventions, on 6MWT distance (in meters) was evaluated in four studies (Mazzoleni et al. 2014; Rutkowski et al. 2019; Rutkowski et al. 2020; Suntanto et al. 2019) including 196 participants in total (Figure 3). A mean difference favoring VR-Training was noted [MD (95% CI) = 22.7 m (19.92 to 25.63)], with statistical significance (Z = 15.66, p < 0.001) and no statistical heterogeneity (I² = 0%, p = 0.7), based on an average PEDro quality score of 7/10 (Table 1).
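The relationship between the pooled mean difference, its 95% confidence interval and the test for overall effect can be illustrated with a short back-calculation. The sketch below is not the software used for the synthesis. It assumes the interval is a symmetric, normal-based 95% CI (half-width = 1.96 × SE), and the function name `z_from_ci` is hypothetical.

```python
import math


def z_from_ci(md, lower, upper, z_crit=1.96):
    """Back-calculate SE, Z and a two-sided p-value from a mean difference
    and its confidence interval (assumed symmetric and normal-based)."""
    se = (upper - lower) / (2 * z_crit)   # CI half-width divided by 1.96
    z = md / se                           # test statistic for overall effect
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under the normal model
    return se, z, p


# Pooled 6MWT estimate as reported in the text: 22.7 m (19.92 to 25.63).
se, z, p = z_from_ci(22.7, 19.92, 25.63)
print(f"SE = {se:.2f} m, Z = {z:.2f}, p < 0.001: {p < 0.001}")
```

Under these assumptions the recomputed Z is about 15.6, in line with the reported Z = 15.66; the small gap is what rounding of the published estimate and interval limits would be expected to produce.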
3.5.2 Effect of VR-Training on Pulmonary function (Figure 4)
The effect of VR-Training, with or without other parallel interventions, on FEV1%pred was evaluated in two studies (Rutkowski et al. 2021; Xie et al. 2021) including 110 participants in total (Figure 4). A mean difference favoring VR-Training was noted [MD (95% CI) = 4.56% (1.64 to 7.49)], with statistical significance (Z = 3.06, p = 0.002) and minimal statistical heterogeneity (I² = 4%, p = 0.31), based on PEDro-rated evidence (Table 1).
3.5.3 Effect of VR-Training on Subjective Dyspnea (Figure 5)
The effect of VR-Training, with or without other parallel interventions, on the MRC dyspnea scale was evaluated in two studies (Mazzoleni et al. 2014; Suntanto et al. 2019) including 56 participants in total (Figure 5). The pooled mean difference was not statistically significant [MD (95% CI) = -0.06 (-0.36 to 0.24); Z = 0.38, p = 0.7], with one study favoring VR-Training and the other favoring the comparison group. No statistical heterogeneity was noted (I² = 0%, p = 0.32), based on PEDro-rated evidence (Table 1).
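The heterogeneity figures can be related to one another with a small sketch. It assumes that the heterogeneity p-value comes from a chi-square test on Cochran's Q with k - 1 = 1 degree of freedom (two pooled studies) and that I² is defined as max(0, (Q - df)/Q) × 100. The function name `i2_from_p_two_studies` is hypothetical, and none of this is taken from the authors' analysis files.

```python
from statistics import NormalDist


def i2_from_p_two_studies(p_het):
    """Recover Cochran's Q and I^2 (in percent) from a heterogeneity p-value
    when exactly two studies are pooled (df = 1), for 0 < p_het <= 1.

    With one degree of freedom, a chi-square variate is the square of a
    standard normal variate, so Q is the squared normal quantile at 1 - p/2.
    """
    df = 1
    q = NormalDist().inv_cdf(1 - p_het / 2) ** 2
    # I^2 is truncated at zero whenever Q does not exceed its degrees of freedom.
    i2 = 0.0 if q <= df else (q - df) / q * 100
    return q, i2


# Heterogeneity p-value reported for the dyspnea comparison.
q, i2 = i2_from_p_two_studies(0.32)
print(f"Q = {q:.2f}, I2 = {i2:.0f}%")  # Q = 0.99, I2 = 0%
```

Because the recovered Q is just below its single degree of freedom, I² is truncated to 0%, which agrees with the reported value.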
3.5.4 Psychological status
Anxiety and depression were evaluated in two studies (Mazzoleni et al. 2014; Rutkowski et al. 2021). Neither study found statistically significant differences between groups, although a significant reduction was noted in the VR intervention group in both studies.