The traditional Delphi method consists of five phases: preparation (Phase 1), content development on the relevant topic with an optional so-called round 0 survey (Phase 2), initial content evaluation based on statements in a round 1 survey (Phase 3), re-evaluation of content statements with feedback from all participants in a round 2 survey (Phase 4), and, if necessary, repetition of the re-evaluation pattern until a consensus is reached (Phase 5) (Fig. 1). The participants in the survey rounds are experts selected by the science or monitoring team. The definition of an expert in a given area is subjective, and the resulting consensus can be influenced by the selection of the experts or by the number of iterative rounds in the study [13].
2.1. 360-Degree Delphi Design – what's new?
The new approach is called 360-Degree Delphi (360°D for short) to reflect that opinions are gathered from all sides. The name is a reference to 360-Degree feedback, in which the competence or performance of personnel is evaluated not only by their supervisors or managers, but also by their co-workers and subordinates. In a similar way, 360°D evaluates aspects of a topic by considering the opinions of all relevant stakeholders. Another main innovation of the 360°D approach, compared to the conventional DT, is the composition of the expert collective, which follows a practically oriented stakeholder principle. To differentiate the stakeholder groups, the criteria for inclusion need to be precisely defined. Experts are usually selected according to the number of their publications, their prior work in the field under investigation or the interest they have shown in the specific topic, although there are other methods. With the 360-Degree approach, people are selected on the basis of their participation in the domain. In other words, they are experts as the result of their daily roles and work in the area that will be influenced by the findings of the study. Another innovation is the mandatory performance of certain DT components that are usually voluntary (Fig. 2).
In a similar way to traditional DT, 360°D is structured in five main phases, but some aspects are partially modified (Fig. 2): (Phase 1) preparation phase, (Phase 2) qualitative round/questionnaire, (Phase 3) first quantitative round, (Phase 4) first consensus round, and (Phase 5) optional repetitions of Phase 4. Only changes in the sub-tasks performed are reported in the following sections; we refer readers to the online supplements for a complete itemized task description.
2.1.1 Phase 1 – preparation phase
The preparation phase consists of six steps: (1.1) the creation of the monitoring team, (1.2) the creation of the advisory board, (1.3) the definition of the scientific problem, (1.4) the definition of the stakeholder groups, (1.5) the creation of the survey infrastructure, and (1.6) obtaining ethical approval. Steps 1.1, 1.3 and 1.5 are performed in exactly the same way as in the traditional DT.
Step 1.2. Creation of the advisory board
In 360°D the founding of an advisory board is mandatory. The advisory board does not conduct the study; rather, it provides scientific assistance to minimize bias on the part of the monitoring team. In the 360°D approach it is especially important that the board reviews the definition of the stakeholder groups [14]. The people on this board have a scientific background, though not necessarily in the topic under study, and are experienced in evaluating scientific projects. The advisory board accompanies the study by meeting the monitoring team over the course of the study. Since the advisory board is not directly involved in the work, it can identify advantages and disadvantages and report them to the monitoring team.
Step 1.4. Definition of stakeholder groups
The principal new step in 360°D is the definition of stakeholder groups; the purpose here is to categorize the participants on the basis of their different (possibly corporate) interests in or views on the defined problem. In 360°D the term “participant” is used for any member of the study collective (which is the expert panel in classical DT). In finding a consensus regarding a certain topic, the participants’ current fields of interest and work should be related to that topic.
Step 1.6. Ethical advice and data protection statement
Ethical approval from an institutional review board or ethics committee and a statement or certificate by a privacy officer are recommended for the 360°D approach as part of good scientific practice. Ethical approval will be mandatory if the data acquired are related to individuals, or if there are questions related to sensitive topics such as infectious state, social status or financial status. Similarly, data protection is critical not only for ensuring data quality but also for increasing compliance by the study participants.
2.1.2. Phase 2 – qualitative round
Phase 2 (or round 0) comprises five sub-steps: (2.1) creation of open questions, (2.2) test-phase, (2.3) collective selection, (2.4) main poll, and (2.5) evaluation.
Step 2.1 Creation of open questions
Since the goal of the first question-phase is the collection of general knowledge and problem statements from a stakeholder perspective, open questions for qualitative data acquisition are formulated by the monitoring team using standard question design methods (e.g. [15,16]).
Using the defined problem and the defined stakeholder groups, the wording of the questions is adjusted to make them understandable by the stakeholder groups. Therefore, questions addressed to particular stakeholder groups might be formulated differently. The level of abstraction and the content should be the same for every stakeholder group. A change of perspective can help stakeholders to provide answers based on their role. For example, a question from a patient’s perspective could be “How would you provide your will in an emergency situation?”, while the analog from the perspective of a medical doctor could be “How would you gather the patient’s will in an emergency situation?”.
Step 2.2 Test-phase
Before the main poll, the qualitative open questions are tested to see whether they are understood correctly or whether editing is necessary. Participants in this test-phase should not be included in the later rounds. The questions can be tested either in face-to-face interviews or by a pre-survey created for test purposes. A major advantage of oral interviews is that direct and indirect feedback can be obtained from the participants.
Step 2.3. Collective selection
In addition to the creation of the survey, the monitoring team selects the participants. Participants are selected on the basis of a practically oriented stakeholder principle: their practical experience in the field of study, their possible future interaction with the system, or their current usage of similar systems. Each stakeholder group should contain approximately the same number of participants. It is recommended that a larger pool be contacted, as the response rate is approximately 25% and a qualitative collective should comprise no fewer than ten participants [19].
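The pool-size arithmetic implied here can be made explicit; a minimal Python sketch, assuming the approximate 25% response rate and the ten-participant minimum quoted above:

```python
import math

def required_pool(min_participants: int = 10, response_rate: float = 0.25) -> int:
    """Estimate how many people to contact per stakeholder group so that
    the expected number of responders meets the minimum."""
    return math.ceil(min_participants / response_rate)

print(required_pool())  # contact at least 40 people per group
```

With five stakeholder groups, this suggests contacting on the order of 200 people in total; the actual response rate will of course vary by study.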
Step 2.4 Main poll
The main qualitative poll is performed in the same way as round 0 in the conventional DT. However, depending on the questions, appropriately formulated questionnaires might be necessary for each stakeholder group (compare Step 2.1).
Step 2.5 Evaluation
The answers to the open questions consist of qualitative text data. To evaluate the results, a systematic approach is used. Some variants for qualitative data evaluation are available from social studies. Well-known techniques are “Grounded Theory” [23] or “Qualitative Content Analysis” [24,25]. To reduce the subjectivity of the qualitative analysis, multiple analyses by different people from the monitoring team can be performed independently and merged, providing the opportunity to assess inter-rater reliability. Performing the analysis separately for each stakeholder group is important, even if similar or almost identical questionnaires are used for all stakeholder groups. A statement catalog is created using, for example, a coding system based on the statements for each stakeholder group and in total. Grouped by codes derived by bottom-up coding (see [25]) and agreed by the raters, the catalog of statements provides the basis for creating quantitative items for the following rounds.
2.1.3. Phase 3 – First quantitative round
Phase 3 is the starting point for deriving consensus in phases 4 and 5. The participants score the items derived from the qualitative phase. The sub-steps of this round are similar to a traditional DT: (3.1) creation of closed questions for rounds 1 and 2, (3.2) test-phase, (3.3) collective selection (2), (3.4) round 1 main poll and (3.5) evaluation of round 1. Steps 3.2, 3.3, and 3.4 are performed as in traditional DT, with the collective being selected using the same practically oriented stakeholder principle as in Step 2.3. A new collective can be formed or persons can be added to the existing collective.
Step 3.1. Creation of closed questions for rounds 1 and 2
In this step the statement catalog from Step 2.5 is used to generate quantitative questionnaires by, for example, systematically transforming statements into Likert-scale items or priority rankings. Statements are selected according to their importance within the total collective but also within individual stakeholder groups. This guarantees that statements that are important to only one stakeholder group will still be reflected in the total consensus, even if this group is small or would have a low impact otherwise. The remaining process is performed as in traditional DT.
Step 3.5 Evaluation
The data collected in round 1 are quantitative and can therefore be analyzed by descriptive statistics, as in ordinary DT studies or any other survey. The choice of the analytical approach depends on the level of measurement chosen. A new aspect of 360°D is that data are aggregated for each stakeholder group separately instead of overall figures being generated as in traditional Delphi studies. The results are documented accordingly.
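The per-group aggregation described above can be sketched with plain descriptive statistics; in this minimal Python example the stakeholder groups, item names and ratings are invented for illustration:

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical round-1 ratings: (stakeholder group, item, Likert score 1-5)
ratings = [
    ("patients", "item_1", 5), ("patients", "item_1", 4),
    ("doctors", "item_1", 2), ("doctors", "item_1", 3),
]

# 360°D aggregates per stakeholder group instead of over the whole collective
by_group = defaultdict(list)
for group, item, score in ratings:
    by_group[(group, item)].append(score)

for (group, item), scores in sorted(by_group.items()):
    print(f"{group}/{item}: mean={mean(scores):.2f} sd={pstdev(scores):.2f}")
```

Note how the overall mean (3.50) would hide the clear disagreement between the two groups, which is exactly what the per-group reporting is meant to surface.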
2.1.4. Phase 4 – first consensus round
In Phase 4 a consensus may emerge. The following steps are conducted to perform Phase 4: (4.1) addition of feedback, (4.2) round 2 main poll and (4.3) evaluation of round 2. Steps 4.2 and 4.3 are performed in an identical way to Steps 3.4 and 3.5. If possible, the collective is not altered.
Step 4.1 Addition of feedback
The evaluation results of each statement for each stakeholder group are communicated in an understandable way to the total collective. For example, the average evaluation and dispersion of each stakeholder group for each question is displayed [14,21]. Feedback influences the responses of the participants towards a consensus. Displaying the means and dispersion of ratings for each stakeholder group instead of an overall mean and dispersion allows a faster consensus within stakeholder groups. This may be a benefit of 360°D over traditional DT, as the latter would not reach overall consensus if two stakeholder groups cannot find a common consensus on a statement.
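The text does not fix a numeric consensus criterion; one common choice in Delphi studies is to require a minimum share of ratings inside a target range, checked per stakeholder group. A sketch under that assumption (the 75% cut-off and the 4-5 target range are illustrative, not prescribed by 360°D):

```python
def group_consensus(scores, threshold=0.75, target=(4, 5)):
    """Consensus if at least `threshold` of a group's ratings fall in `target`.
    Both the cut-off and the target range are illustrative assumptions;
    the concrete criterion must be fixed by the monitoring team."""
    hits = sum(target[0] <= s <= target[1] for s in scores)
    return hits / len(scores) >= threshold

print(group_consensus([4, 5, 5, 4]))  # True: all ratings in target range
print(group_consensus([1, 5, 4, 2]))  # False: only half in target range
```

Applying such a check per stakeholder group, rather than to the pooled ratings, matches the per-group feedback logic described above.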
2.1.5. Phase 5 – 2nd to nth consensus round
All further phases are performed in an identical way to Phase 4.
2.2. Evaluation techniques
The evaluation of Phase 2 of 360°D is performed by looking at several aspects: (1) inter-rater reliability is calculated in order to establish a solid coding base, (2) Hypothesis 1 is evaluated using excerpts of the coding catalog, and (3) Hypothesis 2 is evaluated by calculating the dispersion of statements within stakeholder groups. Similarly, statistics from Phase 3 are reported alongside to support the initial findings for Hypothesis 2.
2.2.1. Inter-rater reliability
Two coders receive the raw collected data and independently create a coding structure and base. Data are read and the coding base and corresponding statement coding are created in a bottom-up procedure resulting in codes associated with parts of the raw statements. The coding versions of the two coders are compared to check whether the bottom-up procedures show similarities. Code matches between the codes are created based on position in the statement texts and the semantic meaning of the code label. If multiple codings occur at a text position for one coder, but not for the other one, only one code match is created and the other codings are labeled as unmatched. Since the code structure and code base are created independently, the semantic meaning of individual codes could vary. Each matched code is therefore presented to three evaluators who rate the code matching either as a semantic match or as a mismatch. The rate of semantically matched codes based on a majority vote scheme (at least two out of three) and a single vote scheme (one out of three) is reported in the results.
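The two match rates can be computed directly from the evaluators' votes; a minimal Python sketch with invented vote data:

```python
# Each matched code receives three binary votes (True = semantic match)
votes_per_match = [
    (True, True, False),    # match under the majority scheme
    (True, False, False),   # match under the single-vote scheme only
    (False, False, False),  # mismatch under both schemes
    (True, True, True),     # unanimous match
]

# Majority vote: at least two of three evaluators agree on a match
majority = sum(sum(v) >= 2 for v in votes_per_match) / len(votes_per_match)
# Single vote: at least one evaluator sees a match
single = sum(sum(v) >= 1 for v in votes_per_match) / len(votes_per_match)

print(f"majority-vote match rate: {majority:.2f}")  # 0.50
print(f"single-vote match rate:   {single:.2f}")    # 0.75
```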
2.2.2. Evaluation of Hypothesis 1
To investigate Hypothesis 1, we assume that any of the five stakeholder groups would be eligible to stand as an expert collective for the traditional Delphi technique by themselves. Thus, we analyze whether any of the given statements (represented by their codings) are made exclusively or predominantly by one of the stakeholder groups, thereby being complementary to the other groups.
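The exclusivity analysis can be sketched as counting, per code, the share contributed by each stakeholder group; the 80% predominance threshold below is an illustrative assumption, not a criterion defined in the text:

```python
from collections import Counter

# Hypothetical codings: (code, stakeholder group)
codings = [
    ("c1", "patients"), ("c1", "patients"), ("c1", "patients"),
    ("c2", "doctors"), ("c2", "nurses"), ("c2", "patients"),
]

def dominant_groups(codings, threshold=0.8):
    """Return codes made exclusively or predominantly by one stakeholder
    group, i.e. codes complementary to the other groups."""
    result = {}
    for code in {c for c, _ in codings}:
        counts = Counter(g for c, g in codings if c == code)
        group, n = counts.most_common(1)[0]
        share = n / sum(counts.values())
        if share >= threshold:
            result[code] = (group, share)
    return result

print(dominant_groups(codings))  # {'c1': ('patients', 1.0)}
```

Here "c1" would be lost if only one of the other groups had served as the expert collective, which is the point Hypothesis 1 probes.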
2.2.3. Evaluation of Hypothesis 2
The opinion dispersion of the stakeholders is measured using the code repository derived by the raters. Dispersion $D$ is defined as the average population standard deviation across the different coding occurrences:

$$D = \frac{1}{N_C} \sum_{c \in C} \sigma_c$$

with $C$ being the set of different codes, $N_C$ the number of codes, and $\sigma_c$ the population standard deviation of occurrences within code $c$:

$$\sigma_c = \sqrt{\frac{1}{N_S} \sum_{s \in S} \left( n_{c,s} - \bar{n}_c \right)^2}$$

with $S$ being the different stakeholders within the group, $n_{c,s}$ the number of times code $c$ is associated with stakeholder $s$, $\bar{n}_c$ the average number of times code $c$ is associated with stakeholders in the group, and $N_S$ the number of stakeholders.
Dispersion is reported per stakeholder group and for the total collective using box plots.
In addition, mean and standard deviation are reported and compared for the Phase 3 quantitative questionnaire.
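The dispersion measure, i.e. the average population standard deviation of code occurrences across the stakeholders of a group, can be implemented directly; a minimal Python sketch with invented occurrence counts:

```python
from statistics import mean, pstdev

# occurrences[code] = number of times the code is associated with each
# stakeholder of the group (hypothetical counts for three stakeholders)
occurrences = {
    "c1": [2, 2, 2],  # uniform across stakeholders -> sd 0 (agreement)
    "c2": [0, 3, 0],  # concentrated on one stakeholder -> sd sqrt(2)
}

# D: average population standard deviation across all codes
D = mean(pstdev(counts) for counts in occurrences.values())
print(round(D, 3))  # 0.707
```

A low $D$ indicates that codes occur evenly across the stakeholders of a group, i.e. low opinion dispersion.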
2.3. Software Tools
The web-based tool LimeSurvey (Version 2.05, LimeSurvey GmbH, Hamburg, Germany) was used for online questionnaires. The coding and further analysis was supported by MaxQDA (Version 10, VERBI GmbH, Berlin, Germany), Microsoft Excel (Version 2011, Microsoft Corporation, Redmond, WA, USA) and Matlab (Version R2014b, The Mathworks, Natick, MA, USA).