Evaluation of a Clinical Decision Support System for Rare Diseases - A Qualitative Study

doi:10.21203/rs.3.rs-125710/v1

Research Article

This work is licensed under a CC BY 4.0 License

Journal Publication

published 18 Feb, 2021

Read the published version in BMC Medical Informatics and Decision Making

Background

Rare Diseases (RDs) are difficult to diagnose. Clinical Decision Support Systems (CDSS) could support the diagnosis of RDs. The MIRACUM (Medical Informatics in Research and Care in University Medicine) consortium developed a CDSS for RDs based on distributed clinical data from ten German university hospitals. To support the diagnosis of difficult patient cases, the CDSS uses data from the different hospitals to perform a patient similarity analysis that can indicate a possible diagnosis. To optimize our CDSS, we conducted this qualitative study to investigate its usability, including its functionality and the information it presents.

Methods

A Thinking Aloud Test (TA-Test) was performed with RD experts recruited from Rare Diseases Centres (RDCs) at the MIRACUM locations, which are specialized in the diagnosis and treatment of RDs.

An instruction sheet was prepared with tasks that the participants were asked to perform with the CDSS during the study. Participants were asked to share any thoughts about the CDSS. The TA-Test was recorded on audio and video. A questionnaire including the System Usability Scale (SUS) was handed out at the end of the study. Afterwards, the data were analysed using qualitative content analysis according to Mayring, which includes a category-guided deductive approach.

Results

A total of eight experts were included in the study since eight MIRACUM locations have established an RDC.

The results show that more detailed information about the patients, such as descriptive attributes or findings, is needed. The functionality of the CDSS, such as the overview of similar patients and the medical history, was rated positively. However, there is a lack of transparency regarding the results of the CDSS patient similarity analysis. The participants stated that the system should present exactly which symptoms, diagnoses etc. have matched. Regarding usability, the CDSS received a score of 73.21 points on the SUS, which is rated as good usability.

Conclusions

This qualitative study investigated the usability of a CDSS for RDs. Despite the promising results, the CDSS still needs some revisions before it can be used in clinical practice, e.g. improved transparency of the patient similarity analysis.

Medical Informatics

Rare Diseases

Clinical Decision Support Systems

Computer-Assisted Diagnosis

Usability

According to a definition of the World Health Organization (WHO), a disease is referred to as “rare” if fewer than 1.3 out of 2,000 people are affected [1]. It is estimated that about 7,000 different rare diseases (RDs) currently exist, with about 400 million people affected worldwide [2]. Many of these diseases are chronic, degenerative or life-threatening. They can also impair daily life or lead to severe disability [3, 4]. Furthermore, about 80 % of RDs are of genetic origin. Infectious, immunological and environmental factors are also observed as possible causes [5–7].

Often, many years pass before an RD is diagnosed. Many patients are diagnosed too late or not at all, especially those with phenotypes that occur later in life [3]. The search for the right diagnosis can lead to years of limitations and major suffering for patients [8]. A further problem is the geographical distribution of experts for RDs. Research and care are also restricted because few studies and little patient data are available [9]. Therefore, research networks that make clinical knowledge from various institutions available are a promising approach for generating new knowledge for RD research and care [10].

The MIRACUM consortium (Medical Informatics in Research and Care in University Medicine) is a large network of ten university hospitals funded by the German Federal Ministry of Education and Research (BMBF). The aim of the consortium is to create Data Integration Centres (DICs) at each site and to supply data using interoperable standards. The benefits of the DICs will be demonstrated by means of selected use cases. One of these use cases is the prototypical conception and implementation of a Clinical Decision Support System (CDSS) for RDs, which is called DISERDIS (Diagnosis Support in Rare Diseases). The aim of DISERDIS is to identify similar patients at other MIRACUM locations, which could give an indication of a diagnosis [11].

Sutton et al. define a CDSS as a software system that improves health care by enhancing medical decisions with clinical knowledge, information about a patient and other health information. A traditional CDSS compares the characteristics of an individual patient with a clinical knowledge base and generates patient-specific assessments or recommendations [12]. For instance, a CDSS can assist a physician in diagnosing a patient with an RD.

During the development of a CDSS, it is important to involve its future users in every phase of development in order to build a system that is accepted and purposeful in everyday clinical practice [13–15]. Therefore, we decided to use a “User-Centred Design Process” (UCD) for the development of our CDSS. Within a UCD, software prototypes are designed, developed and tested in a multi-level approach together with users according to their requirements and needs [16]. In the context of the UCD, we took several steps to identify user requirements and needs, including a scoping review of literature about CDSS for RDs [17], a cross-sectional survey [18], a qualitative study with expert interviews [19] and a focus group. As a result, a high-fidelity prototype was developed that addresses the requirements of the UCD phases, but is not yet a fully developed software system [20].

However, according to the UCD, an evaluation is necessary to check whether the CDSS meets the requirements defined by the users [21]. Especially for a CDSS, the evaluation of usability plays a decisive role. Usability refers to the quality of a system that allows users to complete a task effectively, quickly and with high satisfaction [22]. To our knowledge, no study has assessed a CDSS for RDs with regard to usability and user acceptance. Several CDSS have been developed and described in published studies, but these studies address only the performance and diagnostic accuracy of the CDSS [17]. Therefore, we performed an evaluation study of our CDSS within the UCD, with the goal of obtaining feedback for the further development of our prototypical CDSS and increasing user satisfaction. The objective of this study was to investigate whether the functionalities and information in the CDSS were implemented in a user-friendly manner and which functionalities and information are perceived as particularly useful, in order to derive recommendations for optimizing the CDSS.

2.1 Design

To investigate the usability of the software, a “Thinking-Aloud-Test” (TA-Test) was performed. This method was originally developed in the field of cognitive psychology to record human cognitive processes. However, it can also be used for the evaluation of a software system [23–25]. “Thinking aloud” means that the users communicate their thoughts aloud while interacting with the software. During the interaction, they indicate why they perform certain actions and what their goal is [26]. We chose this approach to obtain opinions and feedback on our CDSS from all stakeholders in the MIRACUM consortium.

This study was conducted and performed according to the Consolidated Criteria for Reporting Qualitative Research (COREQ) [27, 28]. We considered 31 out of 32 items of COREQ in our study. A checklist is provided in additional file 1.

2.2 Setting and sampling

In this study, we used purposeful sampling, as in the previous UCD studies [18, 19]. Experts in the field of RDs, known by the authors, were invited to participate [29, 30]. According to Meuser and Nagel, an “expert” can be defined as a person who has knowledge in a specific research area that is not accessible to everyone. Experts have experience and knowledge and can act upon their knowledge [31]. Experts were recruited from Rare Diseases Centres (RDCs) within the participating hospitals in MIRACUM. These centres are specialised facilities for patients with RDs or without a diagnosis [31]. As in our former studies, we took the following characteristics of study participants into account: Participants are members of the MIRACUM consortium and work in a medical centre where specific RDs are diagnosed and treated (RDC). They have a completed medical degree and specialist training in human medicine. Considering these characteristics, eight potential study participants were identified, since eight of ten MIRACUM locations have established an RDC.

The participants were contacted by e-mail and asked to suggest a date and time. The invitation was sent in October 2019. A study information letter was distributed to the study participants with the e-mail invitation.

2.3 Data collection

2.3.1 Description of the test environment

The study was carried out under laboratory conditions, which means that the CDSS was used in a test scenario by the study participants and not in a real clinical scenario [32]. Ten fictitious patient cases were created together with an expert for RDs and used in the TA-Test [33, 34]. These patient cases were stored in the CDSS database. Descriptions of the patient cases and of the functionality of the CDSS are provided in additional files 2 and 3.

2.3.2 Preparation and implementation of the study

An instruction sheet was created with three tasks that the participants were supposed to perform during the study (shown in additional file 4, translated from German to English). The sheet included a patient case, which was created by an expert for RDs as mentioned above. One of the ten patient cases was selected and used in the study.

The instruction sheet was divided into two categories: (a) use of the patient management and (b) use of the similarity analysis in the CDSS. The participants were asked to use the similarity analysis function of the CDSS, which calculates the similarity between the patient case shown on the sheet and all other available cases in the database of the CDSS. Each study participant used the same patient case so that the conditions for the evaluation were identical.

In addition to the instruction sheet, a moderation guide was prepared which served as an orientation for the test leader while conducting the study. It also contained questions to maintain the flow of a discussion with study participants [35].

A pre-test was performed before the study was conducted with all study participants. The pre-test was conducted with an expert for RDs and had a duration of 43 minutes. After the pre-test, only minor changes of the instruction sheet and the course of the study were necessary.

The study was conducted by JAS from November to December 2019. His researcher characteristics are as follows: “gender: male”, “experience: 4 years research experience in medical informatics”, “degree: M.Sc. in Medical Informatics”, “occupation: research assistant and PhD student”.

The study was executed during the working time of the study participants in their offices. No other persons were present during the study and there were no interruptions. The study was not repeated. The prototype of the CDSS was provided on a laptop. After signing the consent forms for participation, the instruction sheet was handed out to the participants and the test leader answered possible questions. During the study, the activities on the screen of the study participants were recorded with the software OBS-Studio. In addition, the sound was recorded in order to correlate the participant's actions with verbal statements. Once the TA-Test was completed, the audio and video recording was stopped and a questionnaire was handed out to the study participants, which is further described in the next section.

In summary, the TA-Tests had a length between 18 and 43 minutes, with an average duration of 30 minutes. The TA-Tests were conducted only once and were not repeated.

2.4 Data analysis and processing

The analysis of the study was based on a qualitative content analysis according to Mayring [36] and a questionnaire. Both are described in more detail below.

Qualitative content analysis

A transcription protocol was created with Microsoft Word that includes the indication of the time in the video and audio recording. The statements of the study participants were transcribed into written form in the transcription protocol. The transcription was based on the transcription system of Kuckartz et al. [37]. Additionally, the user's interactions were described in order to trace which interactions the user performed in the software at any given time. A transcription protocol was created for each TA-Test.

The transcripts were checked for validity and possible errors were corrected (e.g. missing words or sentences). The transcripts were returned to the study participants for validation, whereupon all participants confirmed the content of the transcripts. For data analysis of the transcripts, deductive categories, according to Mayring, were created with the aim of assigning text passages from the transcripts to the categories [36]. Deductive categories are used to evaluate a qualitative content analysis and are defined before the study begins. They are divided into main categories and sub-categories. The category system used in this study is based on the deductive categories shown in figure 1. We defined the categories based on our research questions, so that the category system covers all software functions and user interfaces relevant to answering them. For each category, the sub-categories “information”, “software functionality” and “usability” were also examined according to the research questions. These sub-categories are defined as follows:

Information: The statements of the study participants refer to information presented in the CDSS. Information is presented after the use of a software function.
Software functionality: The statements of the study participants refer to a software function of the CDSS.
Usability: The statements of the study participants refer to the usability of the CDSS.

As recommended by Mayring, the category system was checked against the transcribed material in advance to determine whether the categories can be applied to the data material. For this step, we used two (n=2) of the eight transcription protocols (25 %), as Mayring recommends using 10-50 % of the transcribed material [38]. Afterwards, the category system was refined and two categories (1.3 and 3.7) were added to allow a more precise subdivision. After that, all transcripts were used and text passages were assigned to the categories. If a text passage could not be assigned to a category, all authors discussed and decided the assignment. Saturation of the study was reached when (1) all participants had successfully completed the study and (2) the categories were adequately represented in the data after refinement of the category system [39].

While assigning text passages to categories, we defined anchor examples and described when a text passage should be assigned to a category. After the text passages were assigned to the categories, the respective statements were summarised per category. Finally, all results of the study were distributed to the study participants. All participants agreed with the results. To present the results in this paper, quotations were selected that best represent each category. The quotations (shown in additional file 5) were translated from German into English.

Questionnaire

For further analysis in the evaluation of the CDSS, we developed a questionnaire which consists of three parts. In part one, the System Usability Scale (SUS) was used, which is a standardised questionnaire with ten questions (items) to assess the usability of a software system [40]. Since the items are only available in English, they were translated into German for the purposes of the study. In the second part of the questionnaire, further questions were created to evaluate the individual functionalities of the CDSS. These questions were answered using a 5-point Likert scale (from “1 = strongly disagree” to “5 = strongly agree”) [41]. In the third part of the questionnaire, participants were asked to provide some personal information [34]. The following information was collected: gender, age group, medical specialization, years of experience in the field of RDs and previous experience with CDSS.

For the data analysis of the SUS, we used the approach of Bangor et al. [40]. The result is a score between “0” and “100”, which is not supposed to be interpreted as a percentage and must be normalised. According to Bangor [42], normalisation can be achieved by using the ranges shown in table 1. Furthermore, we calculated the mean and standard deviation for each item of the SUS across all participants.

Table 1: Usability ranges of the SUS according to Bangor [42]

SUS Score (range)	Ranking
84.1 - 100.0	Best Imaginable
80.8 - 84.0	Excellent
71.1 - 80.7	Good
51.7 - 71.0	OK
25.1 - 51.6	Poor
0.0 - 25.0	Worst Imaginable
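
The scoring and ranking steps can be illustrated with a short Python sketch. It assumes the standard SUS scoring rule (odd items contribute the answer minus 1, even items contribute 5 minus the answer, and the sum is multiplied by 2.5), which the text does not spell out. The function names are hypothetical, and rounding the score to one decimal before the lookup is also an assumption, made because the ranges in table 1 are given with one decimal.

```python
from typing import Sequence

# Ranking ranges copied from table 1: (lower bound, upper bound, ranking).
SUS_RANGES = [
    (84.1, 100.0, "Best Imaginable"),
    (80.8, 84.0, "Excellent"),
    (71.1, 80.7, "Good"),
    (51.7, 71.0, "OK"),
    (25.1, 51.6, "Poor"),
    (0.0, 25.0, "Worst Imaginable"),
]


def sus_score(responses: Sequence[int]) -> float:
    """Score one participant's ten SUS answers (each 1..5) on the 0..100 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("expected ten responses between 1 and 5")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item_number % 2 == 1 else (5 - answer)
    return total * 2.5


def sus_ranking(score: float) -> str:
    """Look up the ranking from table 1 after rounding to one decimal."""
    rounded = round(score, 1)
    for lower, upper, label in SUS_RANGES:
        if lower <= rounded <= upper:
            return label
    raise ValueError("score outside the range 0..100")
```

For example, a participant who answers “3” to every item obtains `sus_score([3] * 10) == 50.0`, which falls into the “Poor” range.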

The answers to the questions in the second part of the questionnaire were assigned numerical values from “1” to “5” according to the Likert scale, and the mean and standard deviation were calculated for each question across all participants.
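
As a minimal sketch of this per-question summary, the following Python function computes the mean and standard deviation of one question across participants. The text does not state whether the population or the sample standard deviation was used, so the sketch returns both. The function name and the answer vector in the usage note are hypothetical and do not come from the study data.

```python
from statistics import mean, pstdev, stdev
from typing import Dict, Sequence


def summarise_likert(answers: Sequence[int]) -> Dict[str, float]:
    """Mean and standard deviations of one question's answers (values 1..5)."""
    if len(answers) < 2 or any(a < 1 or a > 5 for a in answers):
        raise ValueError("expected at least two answers between 1 and 5")
    return {
        "mean": mean(answers),
        "sd_population": pstdev(answers),  # divides by n
        "sd_sample": stdev(answers),       # divides by n - 1
    }
```

With the hypothetical answers `[2, 2, 2, 2, 2, 1, 1]` from seven participants, the mean is 12/7 and the two standard deviations differ noticeably, which is why the choice should be reported for small samples.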

In this section, the results of the study are shown. First, the characteristics of the study participants are presented. Afterwards, the results of the qualitative content analysis with its categories are shown. In the third part, the results of the questionnaire are described.

3.1 Participants

All eight experts contacted responded to our invitation and participated in the study. We therefore achieved a participation rate of 100 %. One study participant could not fill in the questionnaire due to time constraints, so the following data are only available for seven participants. The characteristics of the study participants are shown in table 2.

Table 2: Characteristics of study participants

Characteristics	Options	Participants (n=7)
Gender	Female	4
Gender	Male	3
Age group	>59	1
	50-59	1
	40-49	2
	30-39	3
Medical specialization	Paediatric Surgery	1
	Neurology	1
	Psychiatry and Neurology	1
	Nephrology	1
	Internal Medicine	1
	Neuropediatric	1
	Paediatric	1
Years of experience in the field of rare diseases	30	1
	24	1
	15	1
	4	2
	3	1
Prior experience with clinical decision support systems	Yes	4
Prior experience with clinical decision support systems	No	3

The participants were predominantly female (n=4). The age groups ranged from 30-39 to older than 59 years. The study participants have different medical specialisations. It is noticeable that three study participants work in paediatrics. The experience of the study participants in the field of RDs ranges from 3 to 30 years. In total, four of the study participants stated that they had already worked with some kind of CDSS.

3.1 Main themes by deductive category

The results of the qualitative content analysis are presented in the following sections. Results are reported by deductive categories of the category system. We provide selected quotations for each statement (see additional file 5). The following exemplary quotes are abbreviated by “Q” and numbered in ascending order.

3.1.1 Patient Overview (Category 1)

Information

One study participant stated that all the expected information about the patient was available in the patient overview (Q1). Two other study participants noted that more precise findings, such as doctor's letters, laboratory or imaging findings, are necessary to determine who diagnosed the RD (Q2-Q4).

For the diagnosis history and symptom history (category 1.1 and 1.2), the study participants stated that the provided information was insufficient (Q5-Q8). One participant said: “So, basically, it is not enough for me. But it depends on whether this is a first, what are the symptoms. Then of course it is sufficient. But in principle I would like to have details. In the case of dyspnoea in particular, this is high dyspnoea, dyspnoea caused by stress, and in case of fever, how high are the temperatures really, is this a chronic increased body temperature or fluctuating symptoms. The same applies to paraesthesia, i.e. which extremities are affected. Paraesthesia can have very different qualities. Depressiveness would suffice and fatigue, which are more descriptive. But this is not enough for me. But to have a first overview, then yes.” (Q6)

As with the symptom and diagnosis history, the information on the family history (category 1.3) was not considered sufficient by the study participants. Further information should be provided to illustrate the family relationships (Q9). For example, one respondent said that there was no information on whether a family member had similar diagnoses or symptoms. Participants would also expect a kind of family tree that shows the links between the family members and their illnesses (Q10-12). One study participant indicated that consanguinity can be a marker for a genetic disorder, but it is not interesting for all diseases (Q13). The participant stated: “In that in the constellation as it exists right now, this is not my first question. Nevertheless, this is interesting, because whenever you have to make a diagnosis, it is not uninteresting to investigate consanguinity. Because paediatrics 80% are genetic. That is super interesting. But in the symptom complexes that were given here, that would not have been my first question.” (Q13)

Usability

The participants agreed that the views of the patient overview, symptom history, diagnosis history and family history (category 1 - 1.3) are presented simply and clearly (Q14-17). One study participant explained: “Otherwise, the mask is clearly arranged.” (Q18)

Functionality

One participant stated that, in addition to the information about the symptoms, the age of the patient and the age at which the symptoms occurred are important. The participant recommended calculating the current age and the age at the onset of symptoms (Q19). One study participant suggested integrating a function to mark diagnoses as valid or not. This could, e.g., indicate whether a diagnosis was made in accordance with medical guidelines (Q20).
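
The age calculation recommended in Q19 could be sketched in Python as follows. This is only an illustration under the assumption that the CDSS stores a date of birth and a date for each symptom. The function name and the dates in the usage note are hypothetical and are not taken from the prototype or the fictitious patient cases.

```python
from datetime import date


def age_in_years(date_of_birth: date, reference: date) -> int:
    """Completed years of age on the reference date (e.g. symptom onset)."""
    if reference < date_of_birth:
        raise ValueError("reference date lies before the date of birth")
    # Subtract one year if the birthday has not yet occurred in the reference year.
    had_birthday = (reference.month, reference.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return reference.year - date_of_birth.year - (0 if had_birthday else 1)
```

Calling the function with the date of symptom onset yields the age at onset, and calling it with today's date yields the current age. For a hypothetical patient born on 20 May 1956 with symptom onset on 1 November 2019, `age_in_years(date(1956, 5, 20), date(2019, 11, 1))` returns 63.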

3.1.2 Execution of the similarity analysis (category 2)

Information

No data available for this category.

Usability

Before the similarity analysis could be carried out by the participant, one task of the instruction sheet required the user to select all MIRACUM locations where data matching should take place. In the course of the study, the study participants noticed that the function for selecting all locations is not immediately visible, as it is located in the lower part of the view (Q21-22). Two study participants stated that the selection of all locations should be placed in the table header (Q21-22).

Functionality

Two study participants commented on the calculation time of the similarity analysis. One participant rated this process as too slow (Q23), while another participant stated: “Well, that didn't take too long, 45 seconds. Maybe even 30.” (Q24).

3.1.3 Results of the similarity analysis (category 3)

Information

In the overview of similar patients (category 3.1), the participants considered the date of birth not relevant, whereas the age at the time of the diagnosis is much more relevant (Q25-26). The participants made corresponding statements about the views of the patient timeline (category 3.3), medical history (category 3.4) and criteria for similarity analysis (category 3.5) (Q28-31). In addition, the patient comparison (category 3.2) should differentiate between the age of the patients and the age at the time of the diagnosis.

The information provided on the medical history (category 3.4) was rated as interesting and relevant (Q31-35). One study participant used an example to explain why the medical history is relevant for diagnosis: “History is always very helpful. Especially in rare diseases. Because then only the history of the disease provides the information. If you think it's from my field, neurology, psychiatry, for example. With atypical Parkinson's syndromes. If you start with a cross-sectional approach, then you can hardly tell the difference at the bottom of the scale. But then the progression of certain symptoms, like the increase or decrease, especially the increase. This will basically give you the information about the diagnosis. As I mentioned earlier, this varies greatly from illness to illness.” (Q36)

Apart from the advantages of such a view, one participant emphasised that the information in the medical history must be chosen in such a way that it really helps to answer the question (Q37). One study participant stated that only those parameters should be displayed in the medical history that are specifically rare: “I think something that is a common symptom of an extremely common disease, you don't need it here. Instead, these should be things that are specifically rare and therefore more specific.” (Q38).

When searching for an expert for a diagnosis (category 3.6), two participants discussed which persons should be considered as an expert. They explained that an expert should have published at a high quality level (Q39-40).

Usability

The participants rated the usability of the overview of similar patients (category 3.1) positively (Q41-42). One participant stated: “This is all very clear to me.” (Q43). The patient timeline (category 3.3) was also rated as useful (Q43-46). One participant suggested including more than one patient in the patient timeline view (Q47).

One study participant stated that the comparison of two patients (category 3.2) is helpful (Q48). He proposed placing the demographic data at the beginning of the table (Q49).

Two participants could not find the function to set the criteria for the similarity analysis (category 3.5) (Q50-51). However, they noted that a regular use of the system may make it easier to use (Q52-53). Two study participants proposed to configure the criteria of the similarity analysis before the analysis is performed (Q54-55). One participant stated that the function to search for experts for a diagnosis (category 3.6) is not intuitive: “No, not very intuitive. That what I did earlier. Oh, that was not related to the patient at all. Then it's perfectly fine. Then it is. You only have to know it once and then. Perfect. Okay.” (Q56)

One participant had difficulty finding the scatterplot (category 3.7) (Q57). Another participant suggested that more information should be displayed when a point in the scatterplot is clicked (e.g. information on diagnosis) (Q57-58). One participant said that the benefits of the scatterplot emerge when there is a large number of patients. However, he rated the tabular presentation as better, as it shows several pieces of information at a glance (Q59).

Functionality

The study participants criticised the lack of transparency of the similarity algorithm (category 3.2) (Q61-62). They explained that it must be comprehensible how the algorithm leads to the results (Q63-Q67). One participant stated: “The displayed parameters. The anonymous patient was 63 years old when she was diagnosed. Similarity between the patients. Exactly here I miss again, which diagnosis have exactly matched?” (Q67)
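
The kind of transparency requested here can be illustrated with a small Python sketch that reports exactly which entries two patients share per criterion. It is not the similarity algorithm of DISERDIS, which is not described in this section. The Jaccard overlap, the function name, the dictionary layout and the example values are all assumptions chosen only to show how matched items could be listed next to a score.

```python
from typing import Dict, Iterable, List, Union

PatientRecord = Dict[str, Iterable[str]]
CriterionReport = Dict[str, Union[List[str], float]]


def explain_match(index_patient: PatientRecord,
                  candidate: PatientRecord) -> Dict[str, CriterionReport]:
    """List shared entries and a Jaccard overlap for each criterion."""
    report: Dict[str, CriterionReport] = {}
    for criterion in ("symptoms", "diagnoses"):
        own = set(index_patient.get(criterion, ()))
        other = set(candidate.get(criterion, ()))
        shared = sorted(own & other)
        union = own | other
        report[criterion] = {
            # The matched entries are what participants asked to see.
            "matched": shared,
            # Share of common entries; 0.0 if both lists are empty.
            "jaccard": len(shared) / len(union) if union else 0.0,
        }
    return report


# Hypothetical example records, not taken from the study's patient cases.
index_case = {"symptoms": ["dyspnoea", "fever", "paraesthesia"], "diagnoses": []}
similar_case = {"symptoms": ["fever", "fatigue", "paraesthesia"],
                "diagnoses": ["diagnosis A"]}
result = explain_match(index_case, similar_case)
```

In this example, `result["symptoms"]["matched"]` is `["fever", "paraesthesia"]`, so a user could see which symptoms contributed to the similarity instead of only an aggregate value.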

One participant rated the search for an expert in the CDSS as a good feature (category 3.6). However, he suggested that it should be possible to search for diagnoses that do not only refer to the confirmed diagnosis of a patient. The diagnoses of the diagnosis history should also be included in the search (Q68). One participant noted that it would be helpful to include differential diagnoses in the system and to include them in the search (Q69).

Regarding the scatterplot (category 3.7), one participant suggested using a spider chart to plot the individual components, such as symptoms, diagnoses and family history (Q70).

3.2 Results of the questionnaire

The first part of the questionnaire (questions 1-10), which is related to the SUS, resulted in a score of “73.21”. Thus, the CDSS achieved the usability rating “good”, according to Bangor et al. [40]. More details are shown in Table 3.

Table 3: Results of the System Usability Scale (SUS)

SUS-Item	Question	N (valid)	Mean	SD
1	I think that I would like to use this system frequently.	7	3.85	0.9897
2	I find the system unnecessarily complex.	7	1.71	0.4518
3	I thought the system was easy to use.	7	3.71	0.4518
4	I think that I need the support of a technical person to be able to use this system.	7	2.57	0.4949
5	I found the various functions in this system were well integrated.	7	3.85	0.8330
6	I thought there was too much inconsistency in this system.	7	2	0.5345
7	I would imagine that most people would learn to use this system very quickly.	7	4.57	0.4949
8	I found the system very cumbersome to use.	7	2	0
9	I felt very confident using the system.	7	3.71	1.1606
10	I needed to learn a lot of things before I could get going with this system.	7	2.14	0.8330
Overall SUS score	73.21

n = 7 participants, mean rating (5-point scale strongly disagree = 1; disagree = 2; neutral = 3; agree = 4; strongly agree = 5), standard deviations (SD) and overall SUS score
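
Because SUS scoring is linear in the answers, the mean of the per-participant scores equals the score computed from the item means. The following Python sketch uses this to cross-check the overall score against the item means in Table 3. Two things are assumptions: the standard SUS scoring rule, and the item sums, which are inferred from the reported means under the premise that seven participants gave integer answers (for example, a mean of 1.71 corresponds to 12/7).

```python
from typing import Sequence

N_PARTICIPANTS = 7

# Inferred sums of the seven answers per SUS item (assumption, see lead-in).
ITEM_SUMS = [27, 12, 26, 18, 27, 14, 32, 14, 26, 15]


def sus_from_item_means(item_means: Sequence[float]) -> float:
    """Overall SUS score computed from the ten item means."""
    if len(item_means) != 10:
        raise ValueError("expected ten item means")
    total = 0.0
    for item_number, item_mean in enumerate(item_means, start=1):
        # Odd items: mean - 1; even items: 5 - mean.
        total += (item_mean - 1) if item_number % 2 == 1 else (5 - item_mean)
    return total * 2.5


exact_means = [item_sum / N_PARTICIPANTS for item_sum in ITEM_SUMS]
overall = sus_from_item_means(exact_means)
print(round(overall, 2))
```

Under these assumptions the result rounds to 73.21, which agrees with the overall score reported in Table 3. Using the two-decimal means from the table instead gives a slightly lower value because of rounding in the published figures.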

The second part of the questionnaire (questions 11-21), with specific items on individual functionalities, resulted in mean ratings from 3.42 to 4.28. This corresponds to values between “neutral” and “agree” on the 5-point Likert scale. More detailed results of the questionnaire are shown in additional file 6.

This qualitative study investigated the usability of a CDSS for RDs. Specifically, the study investigated whether the functionalities and information in the CDSS were implemented in a user-friendly manner and which changes are necessary to improve the CDSS.

4.1 Discussion of results

Information

With regard to the information shown in the patient cases in the CDSS, the study participants stated that relevant clinical information and findings are missing in order to evaluate a patient case more accurately. For example, more detailed descriptive information on symptoms, diagnoses and family histories is required. Participants need more information about the findings related to a diagnosis and about the physician who diagnosed the patient. In terms of the family history, relationships and information about family members of the patient should be provided. Hence, the kind of information presented for patient cases needs to be revised. Since RDs are very heterogeneous and a general data set that describes all diseases cannot be defined, it might be possible to display parameters by disease groups [19]. However, the development of a data set by disease groups depends on what patient data is available at the MIRACUM locations. This should be investigated in further studies before the CDSS is used in clinical practice.

Another aspect mentioned in the context of the information displayed in the CDSS is that the date of birth of a patient is not relevant and the age, e.g. when a symptom occurred, should be calculated.

Despite these limitations of the information presented in the CDSS, the study participants rated some views, such as medical history, as helpful and relevant. This is consistent with the results of the questionnaire. However, the clinical parameters presented there have to be selected in such a way that they really help to answer the question for the patient.

Usability

The results concerning usability show that the clinical users accept the CDSS. This is confirmed by the results of the SUS, which indicate that the usability of the CDSS was rated as good. However, the results may have been influenced by the fact that the information on the patient cases was insufficient for the study participants. It is possible that acceptance of the system will increase if this aspect is addressed in further developments. Nevertheless, most of the study participants stated that they could imagine using the system in the future. Furthermore, the views for the symptoms, diagnoses, family history and the overview of similar patients were rated as structured and clear. In addition, the patient timeline and the comparison of two patients were rated as useful.

The study also identified usability problems. For instance, one problem concerns the function for executing the similarity analysis: the users were not able to select all MIRACUM locations as stated in the instruction sheet. However, the study participants indicated that once the function has been found, it is clear how to use it. Another usability problem was identified in the function for setting criteria for the similarity analysis, since two study participants were not able to find this function in the system.

Functionality

Regarding the software functionality of the CDSS, participants stated that they could easily determine which patients are the most similar and could quickly view information such as diagnosis, gender and location. These results were also reflected in the questionnaire. For the overview of similar patients, the participants stated in the TA-Test as well as in the questionnaire that the tabular presentation is favoured over the scatterplot. Furthermore, the participants suggested some improvements for the CDSS. For example, it should be possible to display several patients in the patient history. Additionally, the participants indicated that the results of the similarity analysis must be sufficiently transparent. This is necessary in order to determine exactly which symptoms or diagnoses match. Other studies also describe the transparency of a CDSS as an essential success factor for acceptance [43–45].
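One way to provide this kind of transparency is to report, alongside each similarity value, which findings two patients share and which they do not. The sketch below illustrates the idea with a simple set overlap (Jaccard index). This is an assumption made for illustration only: the similarity measure actually used by the CDSS is not described in this section, and the function name and example findings are hypothetical.

```python
def explain_match(index_findings, candidate_findings):
    """Return shared and non-shared findings of two patients plus a Jaccard value."""
    index_set = set(index_findings)
    candidate_set = set(candidate_findings)
    shared = index_set & candidate_set
    union = index_set | candidate_set
    return {
        "shared": sorted(shared),
        "only_index": sorted(index_set - candidate_set),
        "only_candidate": sorted(candidate_set - index_set),
        # Jaccard index as a stand-in similarity measure (assumption).
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical symptom lists for an index patient and a similar patient.
explanation = explain_match(
    ["ataxia", "seizures", "hearing loss"],
    ["seizures", "ataxia", "myopathy"],
)
```

Presenting the `shared` list next to each entry in the overview of similar patients would let users see which symptoms or diagnoses have matched, independent of how the underlying score is computed.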

The search for an expert for RDs was considered an important feature. In the future, it should be possible to search not only for the established diagnosis of the similar patient, but also for differential diagnoses.

Summary

As the study has revealed some advantages and disadvantages, the optimization of the CDSS should include the following aspects:

The CDSS should include more detailed information about patients.
The CDSS should calculate the age of the patient where it is necessary.
The view of the patient history should include several patients.
A higher transparency in the results of the similarity analysis is needed (e.g. which symptoms were similar).
The search for an expert for RDs should allow searching for any diagnosis, not only for the established diagnosis of the similar patient.
Usability issues should be resolved, such as the selection of all MIRACUM locations and the difficulty of finding the function for setting criteria for the similarity analysis.

4.2 Discussion of methods

In this study, we chose a TA-Test because it is easy to implement and allows us to assess primarily usability weaknesses as well as the functionality and information of the CDSS at an early stage of development [25]. As an alternative to the TA-Test, so-called “Near Live Clinical Simulations” (NLCS) could be used [24–26, 47]. For example, Li et al. used NLCS to evaluate their CDSS. This test scenario differs in that the participants are in a prepared treatment room similar to that in clinical routine. The study participants are confronted with different case scenarios in which the patient cases are simulated by actors and recorded on video. The participants can start and stop the video at their own discretion. As with the TA-Test, the computer screen is also recorded on video and an audio recording is made [25]. The authors concluded that the use of NLCS provides results on how the clinical workflow affects the use of the CDSS [25]. As our CDSS was still in an early phase of development, we did not opt for this method due to the high effort required for evaluation. However, this method could be an opportunity to investigate the CDSS in a more realistic clinical scenario after the refinement of the system.

To reduce the high effort for preparation and evaluation in NLCS, “Instant Data Analysis” (IDA) can be used [48]. IDA is intended to reduce the amount of work and time needed for the execution and analysis of a usability test. In IDA, several individual sessions are conducted on one day. At the end of the sessions, the study participants meet to discuss the identified usability problems. It is also possible to express thoughts and ideas that may not be at the forefront of the discussion. During the discussion, the usability problems are ranked according to severity and frequency [48]. Both IDA and NLCS could be ways to further evaluate and refine the CDSS within the UCD.

Limitations

The study has several limitations. The results of the study may have been affected by the presence of a test leader throughout the test scenario. This could have had an impact on the natural behaviour of the participants, as they knew they were under observation (Hawthorne effect) [24]. Furthermore, only eight experts participated in the study, according to the inclusion and exclusion criteria. Moreover, one participant did not complete the questionnaire.

The results are limited to the RDCs in the MIRACUM consortium. More RDCs could be included in a further study, as 33 RDCs currently exist in Germany [46]. However, according to the literature, 8-10 participants can identify 80 % of the usability problems of a software product [33, 41-42].
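The relationship between the number of participants and the share of usability problems found is often described with a simple discovery model: if each participant detects a given problem independently with probability p, then n participants are expected to find a share of 1 - (1 - p)^n of the problems. The sketch below illustrates this model only; the independence assumption and the function names are ours, and the detection probability is derived from the 80 % figure above rather than measured in this study.

```python
def share_found(p: float, n: int) -> float:
    """Expected share of problems found by n participants.

    Assumes each participant detects a given problem independently
    with the same probability p (a simplifying assumption).
    """
    return 1.0 - (1.0 - p) ** n


def detection_rate_for(target: float, n: int) -> float:
    """Per-participant detection probability needed to reach a target share with n participants."""
    return 1.0 - (1.0 - target) ** (1.0 / n)


# Detection probability implied by "8 participants find 80 %" under this model.
implied_p = detection_rate_for(0.80, 8)
# Expected share for the seven participants who completed the questionnaire
# if the same implied probability applied (illustration only).
share_with_seven = share_found(implied_p, 7)
```

Under this model the expected gain from each additional participant shrinks as n grows, which is the usual argument for small samples in formative usability tests.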

Another limitation is that the system was only evaluated under laboratory conditions. No real patient data were used and the system was not provided on the participants' own computers. Nevertheless, a lab test offers advantages, e.g. the conditions are controllable and the results are reproducible. A further limitation was that no data are available for the subcategory "Information" in category 2 "Execution of the similarity analysis". Nevertheless, adhering to a high methodological standard by following COREQ may have minimized possible bias across the study.

Conclusion

This qualitative study involved experts for RDs to assess whether the functionalities and information in the CDSS were implemented in a user-friendly manner and to derive which changes should be made.

In terms of the research question, the results showed that more detailed patient information is needed. Furthermore, most of the software functionalities were rated positively, although the study participants suggested some improvements for the functions. For instance, the transparency of the results provided by the CDSS for decision support is insufficient and should be improved.

Overall, the CDSS achieved good usability and most participants could imagine using the system in the future. Therefore, the developed prototype has the potential to be used in clinical practice. However, further work and studies are necessary to refine the CDSS. The results and suggestions of this study should be addressed to increase the usability. In the future, it will be of interest to evaluate the system again when it is used in clinical practice with real patient cases.

Abbreviations

BMBF: German Ministry of Education and Research; CDSS: Clinical Decision Support System; COREQ: Consolidated Criteria for Reporting Qualitative Research; DIC: Data Integration Centre; DISERDIS: Diagnosis Support in Rare Diseases; IDA: Instant Data Analysis; MIRACUM: Medical Informatics in Research and Medicine; NLCS: Near Live Clinical Simulations; RD: Rare Diseases; RDC: Rare Diseases Centre; SUS: System Usability Scale; TA-Test: Thinking Aloud Test; UCD: User-Centred Design Process; WHO: World Health Organization

Declarations

Authors' contributions

JS and MS designed the qualitative study and formulated the research questions. JS performed the study and the data analysis. The transcripts of the study were coded by JS and checked by MS for validity, and possible errors were corrected. If a text passage in the transcripts could not be assigned to a category, all authors discussed and decided the assignment. Results of the study were presented to and discussed with all authors.

Quotations for publication were translated from German to English by JS and checked by BS. The first draft of this publication was written by JS, whereas all authors provided valuable input. The final manuscript was written by JS and approved by all authors.

Acknowledgements

We thank all participants for their participation. The study was part of the MIRACUM use-case “Roll out Rare Diseases”.

Funding

MIRACUM is funded in the context of the Medical Informatics Funding Scheme by the German Federal Ministry of Education and Research (BMBF). Funding reference numbers: FKZ 01ZZ1801A, 01ZZ1801C, 01ZZ1801L.

Availability of data and material

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Consent for publication

All research participants signed the informed consent.

Competing interests

The authors declare that they have no competing interests.

Ethics approval and consent to participate

The study was carried out in accordance with relevant guidelines and regulations. The study was submitted to and approved by the ethics committee of the Technical University of Dresden with the committee's reference number “EK 226052019”. All participants provided written informed consent to participate in the study.

Authors’ information

¹ Medical Informatics Group (MIG), University Hospital Frankfurt, Frankfurt, Germany

² Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technical University of Dresden, Dresden, Germany

³ Chair of Medical Informatics, Department of Medical Informatics, Biometrics and Epidemiology, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany

References

Lopes MT, Koch VH, Sarrubbi-Junior V, Gallo PR, Carneiro-Sampaio M. Difficulties in the diagnosis and treatment of rare diseases according to the perceptions of patients, relatives and health care professionals. Clinics (Sao Paulo). 2018; 73:e68.
World Health Organization. Priority diseases and reasons for inclusion. 2013. https://www.who.int/medicines/areas/priority_medicines/Ch6_19Rare.pdf. Accessed 2 Dec 2020.
Evans WR, Rafi I. Rare diseases in general practice: recognising the zebras among the horses. Br J Gen Pract. 2016; 66:550–1.
Taruscio D, Floridia G, Salvatore M, Groft SC, Gahl W. Undiagnosed Diseases: Italy-US Collaboration and International Efforts to Tackle Rare and Common Diseases Lacking a Diagnosis. Adv Exp Med Biol. 2017; 1031:25–38.
Genetic Alliance UK. What is a Rare Disease. 2018. https://www.raredisease.org.uk/what-is-a-rare-disease. Accessed 2 Dec 2020.
Griffon N, Schuers M, Dhombres F, Merabti T, Kerdelhue G, Rollin L, Darmoni SJ. Searching for rare diseases in PubMed: a blind comparison of Orphanet expert query and query based on terminological knowledge. BMC Med Inform Decis Mak. 2016; 16:101.
Institute of Medicine (US) Committee on Accelerating Rare Diseases Research and Orphan Product Development. Rare Diseases and Orphan Products: Accelerating Research and Development. 2nd ed. Washington (DC): National Academies Press (US); 2010.
Shen F, Liu S, Wang Y, Wen A, Wang L, Liu H. Utilization of Electronic Medical Records and Biomedical Literature to Support the Diagnosis of Rare Diseases Using Data Fusion and Collaborative Filtering Approaches. JMIR medical informatics. 2018; 6(4):e11301.
Storf H, Schaaf J, Kadioglu D, Gobel J, Wagner TOF, Uckert F. Registries for rare diseases: OSSE - An open-source framework for technical implementation. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2017; 60(5):523–31.
Mascalzoni D, Knoppers BM, Aymé S, Macilotti M, Dawkins H, Woods S, Hansson MG. Rare diseases and now rare data? Nat Rev Genet. 2013; 14:372–372.
Schaaf J, Boeker M, Haverkamp C, Hermann T, Kadioglu D, Prokosch H, et al. Finding the Needle in the Hay Stack: An Open Architecture to Support Diagnosis of Undiagnosed Patients. Stud Health Technol Inform. 2019; 264:1580–1.
Sutton RT, Pincock D, Baumgart DC, Sadowski DC, Fedorak RN, Kroeker KI. An overview of clinical decision support systems: benefits, risks, and strategies for success. NPJ Digit Med. 2020; 3:17.
Fraccaro P, O’Sullivan D, Plastiras P, O’Sullivan H, Dentone C, Di Biagio A, et al. Behind the screens: Clinical decision support methodologies - A Review. Health Policy Technol. 2015; 4:29–38.
Berner ES. Diagnostic decision support systems: why aren’t they used more and what can we do about it? AMIA Annu Symp Proc. 2006:1167–1168.
Stanziola E, Uznayo M, Simón M, Otero C, Campos F, Luna D. User-Centered Design of Health Care Software Development: Towards a Cultural Change. Stud Health Technol Inform. 2015; 216:368–71.
LeRouge C, Wickramasinghe N. A review of user-centered design for diabetes-related consumer health informatics technologies. J Diabetes Sci Technol. 2013; 7:1039–56.
Schaaf J, Sedlmayr M, Schaefer J, Storf H. Diagnosis of Rare Diseases: a scoping review of clinical decision support systems. Orphanet J. Rare Dis. 2020; 15:263.
Schaaf J, Sedlmayr M, Prokosch H-U, et al. The Status Quo of Rare Diseases Centres for the Development of a Clinical Decision Support System - A Cross-Sectional Study. Stud Health Technol Inform. 2020; 271:176–183.
Schaaf J, Prokosch H-U, Boeker M, Schaefer J, Vasseur J, Storf H, Sedlmayr M. Interviews with experts in rare diseases for the development of clinical decision support system software - a qualitative study. BMC Med Inform Decis Mak. 2020; 20:230.
Walker M, Takayama L, Landay J, Leila. High-Fidelity or Low-Fidelity, Paper or Computer Choosing Attributes When Testing Web Prototypes. Proc Hum Factors Ergon Soc Annu Meet. 2002; 46:661–665.
Kushniruk AW, Patel VL. Cognitive and usability engineering methods for the evaluation of clinical information systems. J Biomed Inform. 2004; 37:56–76.
Blecker S, Pandya R, Stork S, Mann D, Kuperman G, Shelley D, Austrian JS. Interruptive Versus Noninterruptive Clinical Decision Support: Usability Study. JMIR Hum Factors. 2019; 6:e12469.
Konrad K. Lautes Denken. Handbuch Qualitative Forschung in der Psychologie. VS Verlag für Sozialwissenschaften; 2010.
Richardson S, Mishuris R, O’Connell A, Feldstein D, Hess R, Smith P, McCullagh L, McGinn T, Mann D. “Think aloud” and “Near live” usability testing of two complex clinical decision support tools. Int J Med Inform. 2017; 106:1–8.
Li AC, Kannry JL, Kushniruk A, Chrimes D, McGinn TG, Edonyabo D, Mann DM. Integrating usability testing and think-aloud protocol analysis with “near-live” clinical simulations in evaluating clinical decision support. Int J Med Inform. 2012; 81:761–772.
Press A, McCullagh L, Khan S, Schachter A, Pardo S, McGinn T. Usability Testing of a Complex Clinical Decision Support Tool in the Emergency Department: Lessons Learned. JMIR Hum Factors. 2015; 2:e14.
O’Brien BC, Harris I, Beckman T, Reed D, Cook D. Standards for reporting qualitative research: a synthesis of recommendations. Acad Med. 2014; 89(9):1245–51.
Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): a 32-item checklist for interviews and focus groups. Int J Qual Health Care. 2007; 19:349–357.
Mays N, Pope C. Rigour and qualitative research. BMJ. 1995; 311:109–12.
Patton M. Qualitative Research and Evaluation Methods. 3rd edition. CA: Sage Publications; 2001.
Geschäftsstelle des Nationalen Aktionsbündnisses für Menschen mit Seltenen Erkrankungen (NAMSE). National action league for people with rare diseases. 2010. https://www.namse.de/fileadmin/user_upload/downloads/National_Plan_of_Action.pdf. Accessed 2 Dec 2020.
Buber R, Holzmüller HH. Qualitative Marktforschung Konzepte - Methoden – Analysen. 1st ed. Gabler; 2007.
Genov A, Keavney M, Zazelenchuk T. Usability testing with real data. Journal of Usability Studies. 2009; 4(2):85–92.
Kushniruk AW, Patel VL. Cognitive and usability engineering methods for the evaluation of clinical information systems. J Biomed Inform. 2004; 37:56–76.
Nielsen J. Usability engineering. 1st ed. Cambridge: AP Professional; 1993.
Mayring P. Qualitative content analysis - theoretical foundation, basic procedures and software solution. Klagenfurt; 2014. https://www.psychopen.eu/fileadmin/user_upload/books/mayring/ssoar-2014-mayring-Qualitative_content_analysis_theoretical_foundation.pdf. Accessed 01 Dec 2020.
Kuckartz U. Qualitative Evaluation: Der Einstieg in die Praxis. 2nd edition. Hamburg: VS-Verlag; 2008.
Helfferich C. Die Qualität qualitativer Daten. Manual für die Durchführung qualitativer Interviews. 4th edition. Hamburg: VS-Verlag; 2011.
Saunders B, Sim J, Kingstone T, Baker S, Waterfield J, Bartlam B, et al. Saturation in qualitative research: exploring its conceptualization and operationalization. Quality & Quantity. 2018; 52:1893–907.
Bangor A, Kortum PT, Miller JT. An Empirical Evaluation of the System Usability Scale. Int J Hum Comput Interact. 2008; 24:574–594.
Bühner M. Einführung in die Test- und Fragebogenkonstruktion. 3rd ed. München: Pearson; 2010.
Bangor A, Kortum P, Miller J. Determining What Individual SUS Scores Mean: Adding an Adjective Rating Scale. J Usability Stud. 2009; 4:114–123.
Curcin V, Fairweather E, Danger R, Corrigan D. Templates as a method for implementing data provenance in decision support systems. J Biomed Inform. 2017; 65:1–21.
Bezemer T, de Groot MC, Blasse E, ten Berg MJ, Kappen TH, Bredenoord AL, van Solinge WW, Hoefer IE, Haitjema S. A Human(e) Factor in Clinical Decision Support Systems. J Med Internet Res. 2019; 21:e11732.
Miller A, Moon B, Anders S, Walden R, Brown S, Montella D. Integrating computerized clinical decision support systems into clinical work: A meta-synthesis of qualitative research. Int J Med Inform. 2015; 84:1009–1018.
SE-Atlas. Centres for Rare diseases. 2020. https://www.se-atlas.de/map/zse/?ln=en_EN. Accessed 02 Dec 2020.
Chrimes D, Kitos NR, Kushniruk A, Mann DM. Usability testing of Avoiding Diabetes Thru Action Plan Targeting (ADAPT) decision support for integrating care-based counseling of pre-diabetes in an electronic health record. Int J Med Inform. 2014; 83:636–647.
Joe J, Chaudhuri S, Le T, Thompson H, Demiris G. The use of think-aloud and instant data analysis in evaluation research: Exemplar and lessons learned. J Biomed Inform. 2015; 56:284–291.

Supplementary Files

Additional file 1: COREQ Checklist. (PDF 424 KB)
Additional file 2: Description of patient cases. (PDF 108 KB)
Additional file 3: Functionality of the CDSS. (PDF 1798 KB)
Additional file 4: Instruction sheet. (PDF 238 KB)
Additional file 5: Example quotations of study participants. (PDF 315 KB)
Additional file 6: Results of the questionnaire. (XLS 16 KB)
Cover page: Supplementary Information. (DOCX)

Editorial decision: Major revision
04 Jan, 2021
Reviews received at journal
27 Dec, 2020
Reviewers agreed at journal
18 Dec, 2020
Reviewers agreed at journal
10 Dec, 2020
Reviewers invited by journal
10 Dec, 2020
Editor assigned by journal
10 Dec, 2020
Editor invited by journal
10 Dec, 2020
Submission checks completed at journal
10 Dec, 2020
First submitted to journal
10 Dec, 2020
