Study selection and characteristics
The literature search strategy, following the PRISMA 2020 guidelines, is summarized in Fig. 1. The initial search, covering records up to August 2023, yielded 4463 records: 4265 from English-language databases (MEDLINE, Embase, Cochrane Library, CINAHL, and ProQuest) and 198 from Korean databases (National Assembly Library, KCI, RISS, KISS, DBpia, and Korea Scholar). Duplicate records were removed using EndNote, leaving 3826 records. Titles and abstracts were screened against the PICOS framework and the inclusion criteria. After this initial screening, we reviewed the full texts of 407 articles and excluded studies that did not involve older adults (k = 153), did not employ an AI intervention (k = 111), did not follow an RCT design (k = 58), or reported unmatched outcomes (k = 72). Thirteen studies met the inclusion criteria and were included in the final analyses.
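The flow counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and the numbers of duplicates removed and of records excluded at title/abstract screening are derived by subtraction rather than reported in the text.

```python
# Record counts copied from the study-selection paragraph above.
identified = {"English databases": 4265, "Korean databases": 198}
after_dedup = 3826            # records remaining after EndNote de-duplication
full_text_reviewed = 407      # articles assessed in full text
excluded_full_text = {
    "not older adults": 153,
    "no AI intervention": 111,
    "not an RCT": 58,
    "unmatched outcomes": 72,
}

total_identified = sum(identified.values())

# Derived by subtraction (not stated explicitly in the text).
duplicates_removed = total_identified - after_dedup
excluded_on_title_abstract = after_dedup - full_text_reviewed

# Studies left after full-text exclusions.
included = full_text_reviewed - sum(excluded_full_text.values())

print(f"identified={total_identified}, duplicates={duplicates_removed}, "
      f"excluded at screening={excluded_on_title_abstract}, included={included}")
```

The reported totals are internally consistent: the two database groups sum to 4463 records, and the four full-text exclusion reasons leave 13 studies.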
Table 1 lists the main characteristics of the 13 RCTs. These studies were conducted in Italy (k = 4) [9, 10, 22, 23], the Netherlands (k = 3) [2, 8, 24], the USA (k = 2) [7, 25], Australia (k = 1) [13], Canada (k = 1) [26], Turkey (k = 1) [27], and the UK (k = 1) [28]. The participants comprised community-dwelling older adults (k = 2) [2, 26]; patients with stroke (k = 4) [22, 23, 25, 27]; patients with cognitive deficiencies, such as dementia, mild cognitive impairment, or Parkinson’s disease (k = 3) [7, 10, 13]; patients with functional limitations, such as hearing loss or upper limb impairments (k = 3) [8, 24, 28]; and patients with multiple sclerosis (k = 1) [9]. The 13 included RCTs involved 1498 participants, with 18–770 participants per trial and mean ages ranging from 46 to 85 years. Interventions were delivered through a robot (k = 8) [9, 10, 13, 22, 26–28], a smart device (k = 4) [7, 8, 24, 26], or the Internet (k = 1) [2]. Short-term interventions (k = 3) [7, 9, 10] were defined as those lasting four weeks or less, and involved three to five sessions per week of 40–45 min each. Mid-term interventions (k = 5) [13, 22–24, 27] were conducted over five to eleven weeks, with two to three sessions per week of 30–90 min each. The remaining five studies applied long-term interventions (≥ 12 weeks), with sessions held either three times per week for 30 min or daily. Two studies did not clearly report intervention intensity. Interventions took place at home (k = 3) [2, 7, 24] or in clinical settings (k = 8) [9, 10, 22, 25, 27, 28], such as hospitals, rehabilitation centers, and primary care centers. Comparison groups received usual care (k = 2) [8, 28], were placed on a waitlist (k = 1) [2], or received an active control (k = 10) [7, 9, 10, 13, 22, 27], such as rehabilitation training or therapy, a reading activity, or rehabilitation training without a robot.
Table 1
Characteristics of the included studies
Study (year) | Country | Study design | Participants | E (M:F) | C (M:F) | Mean age† (years) | AI type | Delivery mode | Intensity (per week) | Duration (wk/m/yr) | Setting | Comparisons | Outcome |
Ambrosini (2021) [22] | Italy | Single-blinded multicenter RCT | Stroke survivors | 36 (25:11) | 36 (25:11) | 60.9 ± 13.7 | Robotic system with Functional electrical stimulation | Robot | 90 min/ 3 times | 9 wks | Clinical center | Advanced conventional therapy (ACT) | SSQOL |
Broekhuizen et al. [2] | Netherlands | RCT | Aged 60–70 years | 119 (72:47) | 116 (67:49) | 64.7 ± 3.0 | Internet-based physical activity | Internet | everyday | 12 wks | Home | Waitlist | RAND36 |
De Luca (2020) [23] | Italy | Parallel-group RCT | Chronic ischemic stroke | 15 (11:4) | 15 (11:4) | 54.4 ± 11.9 | Robotic gait training | Robot | 60 min/ 3 times | 8 wks | Hospital | Physiotherapist-aided training | SF12 |
Steele Gray (2021) [26] | Canada | Cluster RCT | Complex care needs | 23 (8:15) | 21 (14:7) | 68.65 ± 7.10 | Electronic patient-reported outcome tool (ePRO) | Smart device (system) | everyday | 6 m | Primary care center | Technology training | AQOL-4D |
Hornby (2008) [25] | USA | RCT | Chronic stroke with hemiparesis | 24 (15:9) | 24 (15:9) | 57 ± 10 | Robotic-assisted locomotor training | Robot | 30 min/ | 2 yr | Rehabilitation institute | Therapist-assisted locomotor training | Physical SF36 |
Kramer (2005) [24] | Netherlands | RCT | Hearing impairment | 24 (16:8) | 24 (12:12) | 69 ± 7.7 | Home education and hearing aid fitting | Smart device (videotape/ DVD) | . | 11 wks | Home | Hearing aid fitting | IOIHA-QOL |
Meijerink et al. [8] | Netherlands | Cluster RCT | Hearing loss | 72 (50:22) | 74 (28:46) | 61.9 ± 10.0 | Web-based program (SUPR) | Smart device (Booklet, videos) | . | 6 m | Hearing aid dispenser | Usual care | IOI-HA items |
Moyle et al. [13] | Australia | Pilot RCT | Dementia | 9 | 9 | 85.3 ± 8.4 | Companion robots (PARO) | Robot | 45 min/ 3 times | 5 wks | Residential care facility | Reading activity | QOL-AD |
Mustafaoglu (2020) [27] | Turkey | Single-blinded RCT | Older adults with stroke | E1 17 (11:6) E2 17 (10:7) | 17 (12:5) | 53.8 ± 10.7 | Robot-assisted gait training (RAGT) | Robot | 45 min/ 2 times | 6 wks | Rehabilitation center | Conventional training (CT) | SSQOL |
Rodgers (2020) [28] | UK | Three-group RCT | Limb functional limitation | E1 257 (156:101) E2 259 (159:100) | 254 (153:101) | 59.9 ± 13.5 | Robot-assisted training | Robot | 60 min/ 3 times | 12 wks | Robotic gym in clinical rehabilitation application | Usual care | SIS EQ-5D-5L QALYs |
Schmitter-Edgecombe et al. [7] | USA | Pilot RCT | Mild cognitive impairment | 17 (10:7) | 15 (5:10) | 70.6 ± 6.3 | Electronic memory and management aid (EMMA) application | Smart device (application, smart home) | 5 times | 4 wks | Home | Partnered with smart home prompting | QOL-AD |
Spina et al. [10] | Italy | Pilot RCT | Parkinson’s disease | 11 (6:5) | 11 (7:4) | 68 ± 6.9 | Robotic balance training | Robot | 45 min/ 5 times | 4 wks | Hospital | Conventional balance training | PDQ-39 |
Tramontano et al. [9] | Italy | Single-blind RCT | Multiple sclerosis | 14 (6:8) | 16 (6:10) | 46.7 ± 10.4 | Upper limb sensory-motor training with robotic support | Robot | 40 min/ 3 times | 4 wks | Rehabilitation center | Upper limb sensory-motor training without robotic support | MSQOL-54 |
†Data are presented as the mean ± standard deviation or range values.
Abbreviations: M, male; F, female; SSQOL, stroke specific quality of life scale; SF12, short form 12 quality of life test; AQOL-4D, assessment of quality of life-4 dimensions; SF36, short form 36 quality of life scale; QOL18, 18-item quality-of-life questionnaire; IOI-HA, international outcome inventory for hearing aids; QOL-AD, quality of life in Alzheimer’s disease; WHOQOL26, World Health Organization Quality of Life Questionnaire 26; SIS, stroke impact scale; EQ-5D-5L, euroqol-5 dimensions; QALYs, quality-adjusted life-years; PDQ-39, Parkinson’s disease questionnaire; MSQOL-54, multiple sclerosis quality of life-54.
Quality of studies and risk of bias
Quality and risk of bias were assessed using the Cochrane RoB 2.0 tool [17], which evaluates bias in each domain through an algorithm. Of the 13 studies, eight had a low risk of bias (61.5%) [2, 7, 9, 10, 22, 25, 27, 28] and five raised some concerns for risk of bias (38.5%) [8, 13, 23, 24, 26]. None of the studies showed a high risk of bias. By domain, three studies (23.1%) [23, 24, 26] raised concerns for risk of bias in the "randomization process". Although a randomization process was described in 10 of the 13 studies, two studies provided no details of the process and one lacked information on allocation concealment. Regarding "deviations from intended interventions", 11 studies (84.6%) showed a low risk of bias, whereas two studies (15.4%) [13, 24] raised some concerns. Most studies involving an AI intervention did not blind participants and staff owing to the nature of the interventions; however, no deviation from the intended intervention was observed in the experimental context. Most studies analyzed the intention-to-treat (ITT) or modified intention-to-treat (mITT) population, but two studies (15.4%) [13, 24] lacked sufficient information about the analysis and employed inappropriate statistical methods. Concerning the domains of "missing outcome data" and "measurement of the outcome", only one study (7.7%) [8] raised some concerns for risk of bias. In most of the remaining studies, a low risk of bias was observed. Outcome measurement was blinded in some studies (k = 9, 69.2%) [2, 7–10, 13, 22, 23, 27] and unblinded in others (k = 4, 30.8%) [24, 26, 28]. However, the study outcomes were mostly objective or self-reported measures. Regarding the "selection of the reported result", all studies demonstrated a low risk of bias.
Effect of AI intervention on quality of life
The meta-analysis of the 13 studies, using a random-effects model, revealed a modest effect of AI intervention on improving quality of life (Hedges' g = 0.30, 95% CI = 0.10 to 0.51) (Fig. 3A). The Q statistic and I² indicated heterogeneity across studies (Q = 24.60, p = 0.017, I² = 51.22%). The funnel plot and Egger's regression test indicated a potential risk of publication bias (p = 0.002) (Fig. 4A).
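The heterogeneity statistics above are linked by the standard definition I² = max(0, (Q − df)/Q) × 100, with df = k − 1. The following sketch recomputes I² and the p-value of the Q test from the reported Q and k. The function names are ours, and the chi-square tail probability uses a closed form that is valid only for even degrees of freedom (here df = 12), chosen to avoid a dependency on SciPy.

```python
import math


def i_squared(q: float, k: int) -> float:
    """Higgins' I^2 (in percent) from Cochran's Q and the number of studies k."""
    df = k - 1
    return max(0.0, (q - df) / q) * 100.0


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    # P(X > x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    partial_sum = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * partial_sum


Q, K = 24.60, 13  # values reported in the text
i2 = i_squared(Q, K)
p_q = chi2_sf_even_df(Q, K - 1)
print(f"I^2 = {i2:.2f}%, p = {p_q:.3f}")  # prints "I^2 = 51.22%, p = 0.017"
```

Both recomputed values agree with those reported, which suggests that Q, its p-value, and I² all refer to the same 13-study model.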
Subgroup analysis
A subgroup analysis was conducted by AI intervention delivery mode, comparing robots (k = 8) [9, 10, 13, 22, 26–28] and smart devices (k = 4) [7, 8, 24, 26]. The effect was significant for robots (Hedges' g = 0.40, 95% CI = 0.09 to 0.72) but not for smart devices (Hedges' g = 0.34, 95% CI = -0.05 to 0.72), whose confidence interval included zero (Fig. 3B). Funnel plots and Egger's regression test indicated a potential risk of publication bias for robots (p = 0.020) but not for smart devices (p = 0.445) (Fig. 4B, 4C).
Another subgroup analysis was conducted by AI intervention duration, comparing short-term (k = 3) [7, 9, 10], mid-term (k = 5) [13, 22–24, 27], and long-term (k = 5) interventions. The effect was significant only for mid-term interventions (Hedges' g = 0.69, 95% CI = 0.26 to 1.12); the confidence intervals for short-term (Hedges' g = 0.17, 95% CI = -0.24 to 0.59) and long-term (Hedges' g = 0.05, 95% CI = -0.09 to 0.19) interventions included zero. Funnel plots and Egger's regression test suggested a potential risk of publication bias for long-term interventions (p = 0.040) but not for short-term (p = 0.853) or mid-term (p = 0.106) interventions (Fig. 4D, 4E, and 4F).
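For readers who want standard errors or z statistics for the subgroup estimates, these can be approximated from the reported point estimates and 95% confidence intervals. The sketch below assumes the intervals are symmetric, normal-based (Wald) intervals. That is an assumption on our part: the meta-analysis software may have used a different interval method, so the derived values are approximate and are not results reported by the study.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.975)  # about 1.96


def se_z_p(g: float, lower: float, upper: float) -> tuple[float, float, float]:
    """Approximate SE, z, and two-sided p from an estimate and its 95% CI.

    Assumes a symmetric Wald interval: CI = g +/- Z95 * SE.
    """
    se = (upper - lower) / (2 * Z95)
    z = g / se
    p = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return se, z, p


# (Hedges' g, CI lower, CI upper) as reported in the subgroup analyses.
subgroups = {
    "robot": (0.40, 0.09, 0.72),
    "smart device": (0.34, -0.05, 0.72),
    "short-term": (0.17, -0.24, 0.59),
    "mid-term": (0.69, 0.26, 1.12),
    "long-term": (0.05, -0.09, 0.19),
}

results = {name: se_z_p(*values) for name, values in subgroups.items()}
for name, (se, z, p) in results.items():
    print(f"{name:>12}: SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under this approximation, the subgroups whose intervals exclude zero (robot and mid-term) are the ones with p < 0.05, in line with the text.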