Statistical analyses
Statistical analyses were performed using SPSS 29 (SPSS Inc., Chicago, Illinois) and RStudio, Version 2023.03.0 (RStudio PBC, Boston, Massachusetts). Because the questionnaire data are ordinally scaled, analyses were mainly based on non-parametric Friedman tests. The significance level was set to p < .05; p-values < .10 are reported as statistical trends. The correlation coefficient (r) is provided as a measure of effect size. For post hoc comparisons, Wilcoxon signed-rank tests were applied. Two-tailed p-values adjusted for multiple comparisons using the Bonferroni correction, as implemented in R’s ‘p.adjust’ function, are reported. For descriptive values of the outcome variables at each assessment point, refer to Table 1.
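As an illustration of the adjustment described above, the following minimal Python sketch applies a Bonferroni correction to two-tailed p-values. It rests on two assumptions that are not stated in the text: that each p-value is derived from the normal approximation of the reported Z statistic, and that three post hoc comparisons are corrected per outcome. The function names are hypothetical; the capping at 1 mirrors the behaviour of the Bonferroni method in R’s ‘p.adjust’.

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value of a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons and cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Z values of the three PSQI post hoc comparisons reported in this section
# (T0 vs. T1, T1 vs. T2, T2 vs. T3).
z_scores = [-2.274, -3.852, -2.141]
adjusted = bonferroni([two_tailed_p(z) for z in z_scores])

for z, p in zip(z_scores, adjusted):
    print(f"Z = {z:.3f}, adjusted p = {p:.3f}")
```

Under these assumptions the adjusted values come out close to the p-values reported for the PSQI comparisons, which suggests, but does not confirm, that the correction was applied across three comparisons per outcome.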
Subjective parameters
Insomnia Severity (ISI)
A non-parametric Friedman test with the dependent variable ‘Insomnia Severity Index’ and the repeated-measures factor ‘Time’ (T0, T1, T2, T3) revealed a statistically significant change in insomnia severity across time points, χ²(3) = 49.44, p < .001.
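For readers unfamiliar with the Friedman test, the following Python sketch shows how the chi-square statistic can be computed from repeated measurements: scores are ranked within each participant, rank sums are formed per time point, and a correction for ties is applied. The function names and the example scores are hypothetical and serve only as illustration; the statistics reported in this paper were obtained with SPSS and R, not with this code.

```python
def average_ranks(values):
    """Return 1-based ranks (ties share the average rank) and tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes

def friedman_statistic(rows):
    """Tie-corrected Friedman chi-square and its degrees of freedom.

    rows: one list per participant, one score per time point.
    Undefined (division by zero) if every participant has identical scores
    at all time points.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in rows:
        ranks, ties = average_ranks(row)
        for j, r in enumerate(ranks):
            rank_sums[j] += r
        tie_term += sum(t ** 3 - t for t in ties)
    q = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    correction = 1 - tie_term / (n * k * (k * k - 1))
    return q / correction, k - 1

# Hypothetical scores of three participants at T0, T1, T2, T3.
example = [[18, 17, 10, 9], [15, 15, 8, 9], [20, 19, 12, 11]]
stat, df = friedman_statistic(example)
print(f"chi2({df}) = {stat:.2f}")
```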
Post hoc analyses with Wilcoxon signed-rank tests were conducted and a Bonferroni correction was applied (cf. Figure 2A). As expected, there was no significant change from baseline to pre-training (Z = -0.809, p = 1.00). There was, however, a significant change from pre- to post-training (Z = -4.670, p < .001, r = .67), and the results remained stable from post-training to the follow-up assessment one month later (Z = -1.552, p = .362).
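The effect sizes reported alongside the Wilcoxon tests are consistent with the common conversion r = |Z| / √N, with N = 48 participants. The text does not state the formula explicitly, so the following Python sketch should be read as an assumption that happens to reproduce the reported value; the function name is hypothetical.

```python
import math

def effect_size_r(z, n):
    """Effect size r from a Z statistic, assuming r = |Z| / sqrt(N)."""
    return abs(z) / math.sqrt(n)

# ISI, pre- to post-training: Z = -4.670 with N = 48 participants.
print(round(effect_size_r(-4.670, 48), 2))
```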
Sleep quality (PSQI)
A non-parametric Friedman test with the dependent variable ‘PSQI’ and the repeated-measures factor ‘Time’ (T0, T1, T2, T3) revealed a statistically significant change in sleep quality across time points, χ²(3) = 46.21, p < .001.
Post hoc analyses with Wilcoxon signed-rank tests were conducted and a Bonferroni correction was applied (cf. Figure 2B). As expected, there was no significant change from baseline to pre-training, although there was a statistical trend towards an improvement (Z = -2.274, p = .069). Furthermore, there was a significant change from pre- to post-training (Z = -3.852, p < .001, r = .56) and a trend towards a further improvement from post-training to the follow-up assessment one month later (Z = -2.141, p = .097, r = .31).
Figure 2. Improvement in (A) the severity of insomnia symptoms assessed by the Insomnia Severity Index and (B) the subjective sleep quality assessed by the Pittsburgh Sleep Quality Index. (A) Insomnia severity improved significantly from pre-training (T1) to post-training (T2) and remained stable until one-month follow-up (T3); N = 48. (B) Sleep quality showed a trend towards improvement from baseline (T0) to pre-training (T1), improved significantly from pre-training (T1) to post-training (T2), and showed a trend towards further improvement at the one-month follow-up (T3); N = 48. Higher values represent a stronger impairment in the respective measure. Horizontal lines represent the medians, boxes the interquartile range, with whiskers extending to 1.5 times the interquartile range. The black cross corresponds to the mean. Symbols indicate significance: ***p < .001, +p < .10.
Psychological strain (BSI - GSI values)
A non-parametric Friedman test with the dependent variable ‘GSI’ and the repeated-measures factor ‘Time’ (T0, T1, T2, T3) revealed a statistically significant change in the participants’ psychological strain across time points, χ²(3) = 32.12, p < .001.
Post hoc analyses with Wilcoxon signed-rank tests were conducted and a Bonferroni correction was applied (cf. Figure 3A). As expected, there was no significant change from baseline to pre-training (Z = -1.421, p = .466). There was, however, a significant change from pre- to post-training (Z = -4.688, p < .001, r = .68) and results remained stable from post-training to the follow-up assessment one month later (Z = -0.641, p = 1.00).
Subscale Depression (BSI)
A non-parametric Friedman test with the dependent variable ‘depression’ and the repeated-measures factor ‘Time’ (T0, T1, T2, T3) revealed a statistically significant change in the depression subscale across time points, χ²(3) = 12.01, p = .007.
Post hoc analyses with Wilcoxon signed-rank tests were conducted and a Bonferroni correction was applied (cf. Figure 3B). As expected, there was no significant change from baseline to pre-training (Z = -0.172, p = 1.00). There was, however, a significant change from pre- to post-training (Z = -3.435, p = .002, r = .50) and results remained stable from post-training to the follow-up assessment one month later (Z = -0.669, p = 1.00).
Subscale Anxiety (BSI)
A non-parametric Friedman test with the dependent variable ‘anxiety’ and the repeated-measures factor ‘Time’ (T0, T1, T2, T3) revealed a statistically significant change in the anxiety subscale across time points, χ²(3) = 34.70, p < .001.
Post hoc analyses with Wilcoxon signed-rank tests were conducted and a Bonferroni correction was applied (cf. Figure 3C). As expected, there was no significant change from baseline to pre-training (Z = -0.887, p = 1.00). There was, however, a significant change from pre- to post-training (Z = -4.130, p < .001, r = .60) and results remained stable from post-training to the follow-up assessment one month later (Z = -0.563, p = 1.00).
Figure 3. Improvement in (A) psychological strain measured by the Global Severity Index of the Brief Symptom Inventory (BSI), (B) the BSI subscale depression and (C) the BSI subscale anxiety. (A) Psychological strain, (B) symptoms of depression and (C) symptoms of anxiety improved significantly from pre-training (T1) to post-training (T2) and remained stable until one-month follow-up (T3); N = 48. Higher values represent a stronger impairment in the respective measure. Horizontal lines represent the medians, boxes the interquartile range, with whiskers extending to 1.5 times the interquartile range. The black cross corresponds to the mean. Asterisks indicate significance: ***p < .001, **p < .010.
Quality of life (WHOQOL-BREF)
Domain Physical health
A non-parametric Friedman test with the dependent variable ‘physical health’ and the repeated-measures factor ‘Time’ (T0, T1, T2, T3) revealed a statistically significant change in the domain physical health across time points, χ²(3) = 18.96, p < .001.
Post hoc analyses with Wilcoxon signed-rank tests were conducted and a Bonferroni correction was applied (cf. Figure 4A). There were no significant changes from baseline to pre-training (Z = -0.657, p = 1.00), nor from pre- to post-training (Z = -0.206, p = 1.00), but there was a significant change from post-training to the follow-up assessment one month later (Z = -2.829, p = .014, r = .41).
Domain Psychological health
A non-parametric Friedman test with the dependent variable ‘psychological health’ and the repeated-measures factor ‘Time’ (T0, T1, T2, T3) revealed a statistically significant change in the domain psychological health across time points, χ²(3) = 14.96, p = .002.
Post hoc analyses with Wilcoxon signed-rank tests were conducted and a Bonferroni correction was applied (cf. Figure 4B). There were no significant changes from baseline to pre-training (Z = -0.374, p = 1.00), nor from pre- to post-training (Z = -0.909, p = 1.00), but a significant change from post-training to the follow-up assessment one month later (Z = -2.405, p = .049, r = .35).
Figure 4. Improvement in quality of life assessed by the components (A) physical health and (B) psychological health measured by the WHO Quality of Life questionnaire (WHOQOL-BREF). (A) Physical health and (B) psychological health improved significantly from post-training (T2) to follow-up (T3); N = 48. Lower values represent a stronger impairment in the respective measure. Horizontal lines represent the medians, boxes the interquartile range, with whiskers extending to 1.5 times the interquartile range. The black cross corresponds to the mean. Asterisks indicate significance: *p < .050.
[Table 1]
Usage of the program within the follow-up period
We further analyzed whether different frequencies of app use within the follow-up period were associated with different outcomes in terms of changes in the various questionnaires from T0-T3, T1-T3 and T2-T3. To this end, we used non-parametric Kruskal-Wallis tests to investigate differences between the four groups (no usage, rare usage, frequent usage, regular usage).
No significant differences between groups were found in PSQI changes T0-T3 (H(3) = 2.29, p = .515), T1-T3 (H(3) = 5.83, p = .120) and T2-T3 (H(3) = 2.97, p = .396), in ISI changes T0-T3 (H(3) = 0.73, p = .867), T1-T3 (H(3) = 1.48, p = .688) and T2-T3 (H(3) = 6.02, p = .111), in BSI global score changes T0-T3 (H(3) = 1.27, p = .736), T1-T3 (H(3) = 1.54, p = .673), T2-T3 (H(3) = 1.18, p = .758), in BSI subscale depression T0-T3 (H(3) = 1.72, p = .632), T1-T3 (H(3) = 4.67, p = .197), T2-T3 (H(3) = 3.70, p = .296), in BSI subscale anxiety T0-T3 (H(3) = 1.05, p = .788), T1-T3 (H(3) = 1.52, p = .678), T2-T3 (H(3) = 1.17, p = .759), in WHOQOL-BREF domain “physical health” T0-T3 (H(3) = 0.90, p = .824), T1-T3 (H(3) = 0.34, p = .953), T2-T3 (H(3) = 1.00, p = .803), and in WHOQOL-BREF domain “psychological health” T0-T3 (H(3) = 1.23, p = .746), T1-T3 (H(3) = 1.29, p = .733). For changes in WHOQOL-BREF domain “psychological health”, a statistical trend was observed for T2-T3 (H(3) = 7.47, p = .058). For an overview of the descriptive statistics, please refer to Supplementary Tables S1-S7.
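The H statistics above compare change scores across the four usage groups. The following Python sketch illustrates how a tie-corrected Kruskal-Wallis H can be computed from pooled ranks. The function names and the example change scores are hypothetical and are not taken from the study data; the reported values were obtained with the statistical software named above.

```python
def average_ranks(values):
    """Return 1-based ranks (ties share the average rank) and tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes

def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H and its degrees of freedom."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h / correction, len(groups) - 1

# Hypothetical change scores for the four usage groups
# (no, rare, frequent, regular usage).
example_groups = [[-1, 0, 2], [-2, -1, 1], [-3, -2, 0], [-4, -3, -1]]
h, df = kruskal_wallis_h(example_groups)
print(f"H({df}) = {h:.2f}")
```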
Factors predicting improvements during the training phase
A multiple linear regression model was fitted to assess potential predictors of the improvement in insomnia symptoms during the app training phase, defined as the difference in ISI scores between T1 and T2. The following predictors were included in a forced-entry model: sex (male vs. female), age, education (higher, i.e., high school graduation or a university degree, vs. lower), last completed level at T2, PSQI at T1 (low, i.e., 0-5 points, vs. high, i.e., 6 or more points), and ISI at T1 (low, i.e., 0-7 points, vs. high, i.e., 8 or more points). The overall regression was statistically significant (R² = .31, F(6, 41) = 3.04, p = .015), showing that the predictors accounted for 31% of the variability in the improvement in ISI scores (cf. Table 2).
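The reported degrees of freedom and F value can be checked against R² using the standard identity F = (R² / k) / ((1 - R²) / (n - k - 1)), with k = 6 predictors and n = 48 participants. The Python sketch below is only a plausibility check based on the rounded R²; the function name is hypothetical.

```python
def f_from_r_squared(r2, n_predictors, n_obs):
    """Overall F statistic and degrees of freedom of a linear regression."""
    df1 = n_predictors
    df2 = n_obs - n_predictors - 1
    f = (r2 / df1) / ((1 - r2) / df2)
    return f, df1, df2

f, df1, df2 = f_from_r_squared(0.31, 6, 48)
print(f"F({df1}, {df2}) = {f:.2f}")
```

With the rounded R² of .31 this gives F(6, 41) of roughly 3.07, which agrees with the reported 3.04 to within the rounding of R².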
[Table 2]