Our study aimed to examine the characteristics of PISP use in previous prospective evaluation studies of DiGA-compliant digital health applications published internationally before the DiGAV came into force in April 2020.
Main findings in the context of previous research
The core area of PISP was introduced to strengthen the role of patients and to give their assessments and perceived benefits greater weight in the approval of digital health applications (6). Notably, PISP outcome categories are primarily process quality indicators rather than outcome quality indicators such as medical benefit. Considering process quality indicators in addition to outcome quality indicators, which mutually influence each other, provides the basis for a more holistic and patient-centered evaluation of digital health applications.(63)
Altogether, we included 6 studies in our review. All included studies used a controlled study design, and 4 (4/6) were randomized controlled trials. The approval process as a DiGA in Germany also requires a controlled study design, underlining the appropriateness of this design for demonstrating the effectiveness of digital interventions compared to non-controlled designs.(64-66) The majority of the evaluated applications focused on patients with chronic conditions (4/6) and offered monitoring of the corresponding disease as a feature (5/6). Both findings reflect the current range of telemedicine applications, which primarily offer monitoring for patients with chronic conditions.(15)
The most commonly used PISP outcome domain in the evaluation studies was health literacy (7/14, 50.0%), followed by coping with illness-related difficulties in everyday life (3/14, 21.4%). One possible reason for this is that health literacy (67) is a widespread and well-known outcome domain for which various established outcome measurement instruments already exist in the form of validated questionnaires. For instance, the Health Literacy Tool Shed online database listed a total of 240 health literacy measurement instruments as of April 28, 2023.(68)
Within our studies, we found no outcomes belonging to the PISP domains of patient autonomy, coordination of treatment procedures, facilitating access to care, or patient safety. Since validated measurement instruments for most of these domains exist (69-71), and increasing patient empowerment and access to care are among the key promises of digital health use (72), this result is rather surprising.
Regarding the PISP outcome domains in our evaluation studies, self-developed questionnaires or process-generated, tracked, or directly measured data were used for only 4 of the 14 outcomes measured (28.6%). In this regard, our study provides evidence that validated outcome measurement instruments may be lacking for some PISP outcome categories. This is critical because validated questionnaires are mandatory in DiGA evaluation studies for measuring and demonstrating a positive healthcare effect.(5)
In addition, the analysis showed that clearly assigning the outcomes from evaluation studies to the PISP outcome domains is sometimes difficult. This is due to a lack of detailed guidance on which outcomes can be assigned to each domain and which outcome measurement instruments should be used for assessment.(5)
Notably, all included studies evaluated outcomes from the core area of PISPs only as secondary outcomes, alongside outcomes from the core area of medical benefit. No study solely evaluated positive effects for outcomes from the core area of PISPs. This might be due to the novelty of PISP outcome domains in the context of approval trials and the fact that positive healthcare effects in terms of medical benefits and PISPs are now of equal importance.
Implications for future research
Our study is the first systematic review of evidence concerning the characteristics of PISPs in previous evaluation studies of DiGA-compliant digital health applications. Our results therefore serve as a starting point for the guidance needed on which outcomes and outcome measurement instruments can be used in evaluation studies of such applications to measure PISP outcome domains. The results also highlight PISP outcome domains for which knowledge about suitable outcomes and outcome measurement instruments is still lacking. Thus, our findings can be used to further develop existing evaluation frameworks and outcome taxonomies. The discussion of our findings also provides several valuable ideas for the targeted implementation and dissemination of PISP in Germany, as well as in other countries that wish to strengthen the patient perspective in the evaluation and implementation of digital health applications.
Future research should advance the distinct and transparent assignment of outcomes and validated outcome measurement instruments to all 9 of the existing PISP categories based on the results of our review. Research is especially needed concerning the 6 domains for which we found no insights concerning validated outcome measurement instruments.
Another topic for future research is the further inclusion of PISP outcome categories in existing evaluation frameworks and outcome taxonomies. The questions of how PISPs can be classified within existing taxonomies, how taxonomies could be adapted, or even if new ones should be developed hold comprehensive research potential.
Additionally, updating the review in one or two years will be of great interest as it will help to analyze how the characteristics of PISP measurement in evaluation studies have developed since the first DiGA was approved in Germany in October 2020.
Implications for practice
With DiGAs now being part of standard care for people with statutory health insurance in Germany and with PISPs now relevant for approval and reimbursement decisions, Germany is breaking new ground internationally. These innovations, the accompanying discussions, and thus the results of the present study are of international interest for different target groups:
(I) international digital health application manufacturers and vendors who plan to have their application approved as a DiGA in Germany, an important market in the field of mobile health applications (2)
(II) representatives of governments and healthcare systems worldwide concerning the significance of PISP in evaluation studies of digital health applications and the associated DiGA approval and reimbursement process (2)
(III) patients, who gain low-threshold access to quality-assured and evidence-based digital health applications and whose perspective and potential benefits have become significantly more important through innovative German legislation. This greater consideration of patient benefits when implementing and evaluating care interventions follows the principles of value-based healthcare (7).
Limitations
Currently, no legally binding document elaborates on the individual PISP outcome categories, the corresponding outcome domains, outcomes, or outcome measurement instruments. This lack of explanation made it difficult for us to assign individual outcomes to the respective PISP outcome categories. However, to sharpen our understanding, we used the explanations from the DiGA Guide (Table 2)(5) and insights from a textbook on digital health interventions published by an expert committee of the Federal Ministry of Health.(73) Furthermore, the interventions and study settings described in the articles were considered when assigning outcomes to the respective PISP outcome categories, and the entire data extraction was done by two reviewers independently. Nonetheless, the difficulties we faced again illustrate the need for further clarification of which outcomes and outcome measurement instruments can be assigned to the 9 PISP outcome domains.
As this was the follow-up analysis of an existing review not directed at measuring the effectiveness of an intervention, no a priori registration of the review was filed. However, the previous review used validated search strategies as well as a piloted data extraction strategy.
Given the fact that we did not aim to measure intervention effectiveness, we refrained from any quality assessment. Therefore, no statement can be made on potential risks of bias within the included studies.
Since we were unable to obtain any additional references from the extensive additional searches, we assume that the initial data set, and thus the conduct of a subreview, was an appropriate method to address our research questions.