The Impact of Panel Composition and Topic on Stakeholder Perspectives: Generating Hypotheses from Online Maternal and Child Health Modified-Delphi Panels


Background: Multi-stakeholder engagement is crucial for conducting health services research. Delphi-based methodologies combining iterative rounds of questions with feedback on and discussion of group results are a well-documented approach to multi-stakeholder engagement. The aim of this study is to develop hypotheses about the impact of panel composition and topic on the propensity and meaningfulness of response changes in multi-stakeholder modified-Delphi panels.

Methods: We conducted three online modified-Delphi multi-stakeholder panels using the same protocol. We assigned 60 maternal and child health professionals to a homogeneous (professionals-only) panel, 60 pregnant or postpartum women (patients) to a homogeneous panel, and 30 professionals and 30 patients to a mixed panel. In Round 1, participants rated the seriousness of 11 maternal and child health outcomes using a 0-100 scale and explained their ratings. In Round 2, participants saw Round 1 results and discussed them using anonymous, moderated online discussion boards. In Round 3, participants revised their original ratings. Our outcome measures included binary indicators of response changes to ratings of low, medium, and high severity maternal and child health outcomes and their meaningfulness, measured by a change of 10 or more points on the 0-100 scale.

Results: Participants changed 55% of responses; the majority of response changes were meaningful. We developed three main hypotheses. First, stakeholders may be more likely to change their responses on preference-sensitive topics where there is a range of viable alternatives or perspectives. Second, patients may be more likely to change their responses and to do so meaningfully in mixed panels, whereas professionals may be more likely to do so in homogeneous panels. Third, the association between panel composition and response change may vary according to the topic.

Conclusions: Results of our work not only helped generate empirically derived hypotheses to be tested in future research, but also offer practical recommendations for designing multi-stakeholder online modified-Delphi panels.

Registration: International Registered Report Identifier: DERR1-10.2196/16478


Introduction
Multi-stakeholder engagement is crucial for conducting health services research; it helps ensure that key stakeholder perspectives inform the research process and its outcomes [1]. Patients, caregivers, clinicians, researchers, payers, purchasers, and policy-makers are key stakeholders [2] whose engagement can positively impact all stages of the research process [3,4]. Nonetheless, multi-stakeholder engagement is challenging due to logistical difficulties, power imbalances, and stakeholders' capacity to participate meaningfully [5].
One way to conduct multi-stakeholder engagement is to convene a Delphi panel [6-8]. Delphi-based methodologies that combine iterative rounds of questions with feedback on intermediary panel results were designed to objectively develop group consensus [9,10]. The Delphi method is based on the idea that exposure to alternative perspectives improves the quality of the final responses, which are used to determine the existence of consensus. Delphi-based methodologies provide a useful approach for measuring whether and how participants' perspectives change [11,12].
Modified-Delphi methodologies that start with a survey, proceed with feedback on and an in-person, telephone, or virtual discussion of initial survey results, and end with participants revising their original survey responses offer stakeholders an opportunity to directly engage with each other, which is absent in traditional Delphi panels [8,13-16]. Online modified-Delphi approaches are particularly useful engagement techniques because they allow for large-scale (100+ participants) anonymous engagement, which is not possible in modified-Delphi panels that meet in person. The requirement of in-person discussion limits the panel size to 9-20 participants [13,14]. While the online method has clear benefits, little is known about the contextual factors, such as panel composition or topic, that might affect the outcomes of multi-stakeholder engagement.
Research suggests that stakeholder perspectives vary, with patients and clinicians, for example, having different perceptions of research priorities, treatment preferences, and risk-benefit tradeoffs [17,18]. Although patients' voices may be dominated by clinicians' [17], true consensus in multi-stakeholder initiatives may not be achieved without directly exposing stakeholders to the perspectives of other groups. While patients may be more comfortable sharing their perspectives with peers and, therefore, could be more satisfied with engagement in homogeneous panels, participants in mixed panels could change their positions after being exposed to alternative perspectives, which is key for developing true consensus in multi-stakeholder panels [19]. Although it is possible to imagine how the outcomes of a multi-stakeholder engagement might vary depending on its topic, we are not aware of previous studies that directly addressed this question in the context of modified-Delphi panels.
This paper advances methods for conducting online multi-stakeholder panels using a modified-Delphi approach by exploring the impact of panel composition and topic on stakeholder judgments and generating empirically grounded hypotheses for future research. We use the propensity and meaningfulness of response changes after stakeholders receive statistical feedback and discuss their original responses with others as a measure of panel impact on individual stakeholder judgments. Our results are based on three online modified-Delphi (OMD) panels that engaged patients and professionals around the severity of maternal and child health outcomes linked to gestational weight gain [20,21]. Our findings have practical and methodological implications for assembling multi-stakeholder panels and contribute to ongoing scholarly debates about the impact of feedback [22] and the nature of consensus-building in Delphi panels [11,12,23].

Methods
Study Design: In October-November 2019, we conducted three concurrent OMD panels: a panel of 60 professionals, a panel of 60 patients, and a mixed panel of 30 professionals and 30 patients. We used our professional networks and social media to recruit 90 maternal and child health professionals who had worked in the field for at least five years and 90 women who were either pregnant or had given birth in the past two years. We randomly assigned participants to either homogeneous or mixed panels.
All panels were conducted using ExpertLens™, a previously evaluated OMD platform [13,19,24,25]. Each panel completed a three-round OMD process. In Round 1, participants rated the seriousness of 11 pregnancy weight gain outcomes using a 0 (not serious at all) to 100 (very serious) rating scale and explained their ratings. In Round 2, participants saw the distribution of Round 1 ratings and associated explanations, reviewed how their own ratings compared to their panel's medians and quartiles, and engaged in an asynchronous, anonymous, and moderated online discussion. To preserve confidentiality, participants were only identified as "professionals" or "patients." In Round 3, participants were allowed to revise their original ratings. Those completing all rounds received a $165 gift card. Additional details about the study design [20] and its findings [21] were published elsewhere.

Sample
Our analysis focuses on response changes to the same question between Rounds 1 and 3. We only include a participant's response to a question if it was provided in both rating rounds. Our final sample includes 143 participants and 1,491 response changes.

Variables
Main outcome variables in this study include binary indicators of response change and its meaningfulness (Yes/No). We considered a change of 10 or more points to be meaningful because it moves a response from one decile to another on the 100-point scale.
Our main predictor variable is the composition of the panel a participant was randomized into (patients in a homogeneous panel (reference group), professionals in a homogeneous panel, patients in a mixed panel, and professionals in a mixed panel).
Our control variables include stakeholders' participation experiences, such as satisfaction with various aspects of the online engagement process. Participants used 7-point Likert scales to rate their agreement with the following statements: Participation in this study was satisfying; The charts helped me understand how my responses compared to those of other participants; Round Two discussion changed my perspective on the study topics.
As in previous studies, we dichotomized responses and considered those scoring an item as 5, 6, or 7 as having positive participation experiences [19,24,25].
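As a concrete illustration of how these binary indicators can be derived, the following minimal Python sketch applies the definitions above to a few hypothetical participant-question records (illustrative values only, not study data or the authors' code):

```python
# Hypothetical records: one entry per participant-question pair, with
# Round 1 and Round 3 ratings on the 0-100 scale and a 7-point Likert
# satisfaction score. All values are illustrative.
records = [
    {"round1": 95, "round3": 95, "satisfaction": 6},
    {"round1": 60, "round3": 75, "satisfaction": 6},
    {"round1": 90, "round3": 85, "satisfaction": 4},
    {"round1": 40, "round3": 55, "satisfaction": 4},
]

for r in records:
    # Binary indicator: did the response change at all between rounds?
    r["changed"] = int(r["round1"] != r["round3"])
    # Meaningful change: 10 or more points, i.e. a move of at least one decile
    r["meaningful"] = int(abs(r["round1"] - r["round3"]) >= 10)
    # Dichotomized Likert item: scores of 5, 6, or 7 count as positive
    r["satisfied"] = int(r["satisfaction"] >= 5)
```

Note that the third record changes by only 5 points, so it counts as a change but not a meaningful one.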
Other control variables include participants' race (white vs other) and age.

Statistical analysis
We used mixed-effects logistic regression to estimate panel composition effects on the presence and meaningfulness of response changes. All models were clustered at the individual level to address within-participant correlations, and robust standard errors were produced. We first ran all the models using the seriousness ratings of all pregnancy outcomes combined (n = 1,461 response changes). We then stratified all analyses by health outcome severity level, which we treated as different panel topics. High severity outcomes included infant death, stillbirth, preterm birth, and preeclampsia (n = 528 response changes). Medium severity outcomes included obesity in women, childhood obesity, gestational diabetes, and metabolic syndrome in women (n = 530 response changes). Finally, low severity outcomes included small-for-gestational-age (SGA) birth, large-for-gestational-age (LGA) birth, and unplanned caesarean delivery (n = 403 response changes) [21]. We conducted all analyses using Stata SE 14.
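The published analysis was run in Stata; as a rough sketch of the clustered-error idea only (a plain logit with cluster-robust sandwich standard errors, not the authors' exact mixed-effects specification), the following NumPy example fits the model by Newton-Raphson on synthetic stand-in data clustered by participant. All variable names and data-generating numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 60 hypothetical participants, 10 responses each
n_part, n_resp = 60, 10
pid = np.repeat(np.arange(n_part), n_resp)             # participant (cluster) ids
mixed = np.repeat(rng.integers(0, 2, n_part), n_resp)  # 1 = mixed panel (hypothetical)
X = np.column_stack([np.ones(pid.size), mixed.astype(float)])

# Simulated outcome: mixed-panel members change responses a bit more often
p_true = 1 / (1 + np.exp(-(0.0 + 0.5 * mixed)))
y = (rng.random(pid.size) < p_true).astype(float)

# Fit a plain logit by Newton-Raphson (iteratively reweighted least squares)
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    w = p * (1 - p)
    grad = X.T @ (y - p)
    hess = X.T @ (X * w[:, None])
    beta += np.linalg.solve(hess, grad)

# Cluster-robust (sandwich) standard errors, clustered by participant
p = 1 / (1 + np.exp(-X @ beta))
bread = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
scores = X * (y - p)[:, None]
meat = sum(np.outer(s, s)
           for s in (scores[pid == g].sum(axis=0) for g in range(n_part)))
se = np.sqrt(np.diag(bread @ meat @ bread))

odds_ratio = float(np.exp(beta[1]))  # effect of mixed-panel membership
```

The sandwich estimator sums score contributions within each participant before forming the "meat" matrix, which is what protects the standard errors against within-participant correlation.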

Participant characteristics
Of 180 invited participants, 143 (79%) answered at least one question in both rating rounds. Of these 143 participants, 73 (51%) were health care professionals and 70 (49%) were patients. Of 73 professionals, 46 (63%) were in the homogeneous panel and 27 (37%) were in the mixed panel. Of 70 patients, 47 (67%) were in the homogeneous panel and 23 (33%) were in the mixed panel.
Almost all study participants (92%) were female and roughly two-thirds (65%) were white (Table 1). Among professionals, the majority were researchers (77%) and had a doctoral degree (93%). Two-fifths of all professionals had 15 or more years of experience, 36% had 10-14 years, and 25% had 5-9 years of experience. The majority of patients had a master's or doctoral degree (61%) and reported being pregnant within the past 2 years (76%), and two-fifths reported having 2-3 prior pregnancies. Note: not all respondents provided answers to all of the questions.

Participation experiences
Participants were generally satisfied with their study experiences (mean = 5.7, SD = 1.1), thought that the charts showing the distribution of Round 1 responses helped them understand how their responses compared to those of other participants (mean = 6.4, SD = 1.0), and felt that the discussions changed their perspective (mean = 5.0, SD = 1.3) (Table 1). There were no major differences in participation experiences across panel types. Among professionals, those in the mixed panel, on average, had slightly lower scores on the questions about charts and discussions, but slightly higher scores on overall satisfaction. Patients had slightly higher scores on all three measures of subjective participation experiences than professionals, with patients in the mixed panel being slightly more satisfied than patients in the homogeneous panel.

Response changes
Almost all participants (97%) changed at least one response (data not shown). Of the 1,491 questions that participants answered twice, responses to 55% changed in Round 3 (Table 2). Of all responses provided, 38% were changed by 10 or more points (mean response change = 7.14, SD = 9.98; median = 5). Although the pattern of changes was similar between professionals and patients when panel type was not considered, it varied once panel type was accounted for. A higher percentage of patients' responses changed in the mixed panel (59%) than in the homogeneous panel (50%). In contrast, 58% of responses provided by professionals in the homogeneous panel and 53% in the mixed panel were changed. These results suggest a differential effect according to panel type. Moreover, the patterns of response changes differed by topic: a higher percentage of responses were changed, and changed by 10 or more points, for medium and low severity outcomes than for high severity outcomes across all participant and panel types.

Table 3 shows the results of the mixed-effects logistic regression predicting response changes. Looking at all outcomes shows that patients in the mixed panel (OR = 1.5, CI = 0.9-2.3) and professionals in the homogeneous panel (OR = 1.4, CI = 0.9-2.1) were about 40-50% more likely than patients in the homogeneous panel to change their ratings. These differences, however, were only marginally significant, and only for patients. Moreover, panel composition was a significant predictor of response changes for medium and low severity outcomes, but not high severity outcomes. For medium severity outcomes, patients in the mixed panel (OR = 2.1, CI = 1.2-3.9) and professionals in the homogeneous panel (OR = 1.7, CI = 0.9-3.1) were more likely to change their ratings, compared to patients in the homogeneous panel.
Moreover, patients and professionals in the mixed panel were more likely than patients in the homogeneous panel to change their answers about low severity outcomes (OR = 2.7, CI = 1.2-6.1 and OR = 1.9, CI = 0.9-3.9, respectively). Those satisfied with their participation were less likely than their less satisfied counterparts to change ratings on medium severity outcomes (OR = 0.5, CI = 0.3-1.0), whereas those who felt that the charts helped them understand how their responses compared to those of others were less likely to change their ratings on low severity questions (OR = 0.4, CI = 0.1-1.2). We note that small sample sizes led to imprecise estimates.

Figure 1 shows marginal effects of the logistic regression predicting response changes, which provide additional support for our modeling results. Briefly, patients in the homogeneous panel had the lowest probability of changing their responses (50%) when looking at all outcomes together. Participants had the lowest probability (below 50%) of changing their responses on high severity outcomes. Patients in the mixed panel rating high severity outcomes had the lowest predicted probability of changing their responses (38%), whereas patients in the mixed panel rating low severity outcomes had the highest predicted probability of modifying their responses (72%).

Table 4 shows the results of the mixed-effects logistic regression predicting the meaningfulness of response changes. Panel composition was a significant predictor of meaningful response changes only for questions about low severity outcomes. Relative to patients in the homogeneous panel, patients in the mixed panel and professionals in the homogeneous panel were more likely to meaningfully change their answers (OR = 2.5, CI = 1.2-5.3 and OR = 1.9, CI = 0.9-3.8, respectively). The difference between professionals and patients in homogeneous panels was only marginally significant.

Notes: Patients in a homogeneous panel are the reference group. We control for demographic characteristics (race and age) in all models. Models were clustered at the participant level. Coefficients for the constant are excluded. Values presented are odds ratios (OR) and robust 95% confidence intervals (CI). *** p < 0.01, ** p < 0.05, * p < 0.1

Model results
Perceived usefulness of the charts reduced the likelihood of meaningful response changes on low severity outcomes (OR = 0.3, CI = 0.1-1.0), and participation satisfaction made participants marginally less likely to change their responses by 10 or more points on medium severity outcomes (OR = 0.5, CI = 0.3-1.0).

Figure 2 shows marginal effects of the logistic regression predicting meaningful response changes. As in previous models, patients in the homogeneous panel and professionals in the mixed panel had the lowest probability of changing their responses meaningfully (34% for both groups). Looking across the outcome severity levels, the lowest predicted probability of a response change of 10 or more points was observed for high severity outcomes. Professionals in the mixed panel rating high severity outcomes were the group with the lowest probability of meaningful response changes (22%). Patients in the mixed panel rating low severity outcomes were the group with the highest probability of changing responses meaningfully (54%).

Discussion
We analyzed the impact of panel composition and topic on the presence and meaningfulness of response changes in OMD panels. Our results show that professionals and patients rating the seriousness of maternal and child health outcomes changed more than half of their original responses and that the majority of response changes (69%) were meaningful. This finding suggests that exposure to and discussion of the perspectives of other participants affect individual judgments about outcome severity.
In contrast to previous research that suggested that personal characteristics of OMD panelists were not associated with response changes [12], our results showed that participant background matters and that patterns of response changes are different for pregnant and postpartum women and maternal and child health professionals. It is worth noting, however, that the previous study focused on patient and caregiver panels on different topics, included a different set of participant background measures, and did not look at the meaningfulness of change.
Our results also illustrate heterogeneity in the impacts of panel composition on response changes and their meaningfulness based on panel topic (e.g., outcome severity). While we saw some patterns, we cannot, with certainty, say that certain types of stakeholders are more likely than others to change their responses on certain topics. Nonetheless, our study design provides a unique opportunity to generate hypotheses to be tested in future research.
First, we hypothesize that stakeholders in modified-Delphi panels are more likely to change their responses on certain preference-sensitive topics, such as those where there is a range of viable alternatives or perspectives. In our study, the likelihood and meaningfulness of response changes were affected by the nature of the maternal and child health outcomes considered. Moreover, participants had the lowest probability of changing their responses, and of doing so meaningfully, while rating outcomes deemed "severe" by participants. There is little if any debate that infant death and stillbirth are serious health outcomes. Therefore, it is not surprising that our participants rated these outcomes highly in Round 1 and that their perspectives did not change greatly after the discussion round. At the same time, individual judgments about less severe outcomes, such as gestational diabetes and small-for-gestational-age birth, were more affected by the perspectives of other participants. For example, while some patients may have initially focused more on the short-term impact of gestational diabetes, thinking that it would go away after delivery, exposure to professionals' perspectives may have brought to the fore concerns about risks of complications during delivery and an increased risk of Type 2 diabetes after the child-bearing years.
Second, we hypothesize that stakeholders may be more likely to change their responses in different panel types: patients may be more likely to change their responses and to do so meaningfully in mixed panels, whereas professionals may be more likely to do so in homogeneous panels. The fact that patients generally had the highest probability of changing their responses in the mixed panel may be an illustration of collaborative learning, which takes place in diverse stakeholder panels that focus on important issues. Stakeholders can learn from the perspective of a different group and change their responses based on new ideas they may not have considered [26]. Indeed, patients may be eager to learn not only from the experiences of other patients, but also from professionals who have specialized expert knowledge of the topic. To illustrate, engagement with professionals could help patients learn about the potential burden of undergoing treatments for complications caused by a problem that patients may not have otherwise considered severe enough to worry about.
Although unexpected and somewhat methodologically undesirable, professionals in our study were more likely to change their responses, and to do so meaningfully, in the homogeneous panel. This finding, however, is not too surprising given that shared decision-making often suffers from "selective paternalism," a situation where healthcare professionals step outside of shared decision-making to choose a course of action they think would work best for their patient, but that discounts hearing an alternative patient perspective [27].
Third, we hypothesize that the association between panel composition and response change may vary according to topic. Panel composition may play a bigger role in panels on somewhat less serious or consequential, but still important, health topics. In our study, marginal effects of panel composition varied by outcome type: while patients in the mixed panel were more likely to change their responses, and to do so meaningfully, on medium and low severity outcomes, the reverse was true for high severity outcomes. Professionals in the homogeneous panel generally had higher probabilities of changing their responses, and of doing so meaningfully, than professionals in the mixed panel. This was true across topics with one exception: professionals in the mixed panel had a higher probability of changing their responses on low severity outcomes. Although these findings support our hypothesis that patients and professionals learn under different circumstances, they offer an important nuance: mutual learning may happen in diverse panels, but only for certain types of outcomes. While previous research shows that there are no statistically significant differences between patient/caregiver and clinician/researcher experiences with OMD panels or their willingness to use OMD in the future [19], this study suggests that mixed panels may promote mutual learning in multi-stakeholder panels on certain topics.
Our study has important limitations. Our analysis was limited to three OMD panels that engaged pregnant and postpartum women and health professionals on the topic of maternal and child health outcomes. The patterns of findings may be different in panels that engage different stakeholders on other topics. Moreover, not all study participants answered the same questions twice or provided responses to satisfaction questions. Nonetheless, attrition is common in OMD panels, and our participation rates were higher than in other panels [21].

Conclusions
We recognize that our study cannot provide conclusive answers to our questions. That is why we generated the empirical data needed to formulate evidence-based hypotheses about panel composition and topic for future research. Our practical recommendations (Fig. 3) can help panel designers assess possible threats to achieving valid, reliable panel conclusions and encourage them to consider how panel design choices may affect those conclusions.

Declarations
Ethics approval and consent to participate: The institutional review boards at University of Pittsburgh and the RAND Corporation determined this study to be exempt from review. All study participants have provided their informed consent to participate in the online modi ed-Delphi panels. All data collection activities were carried out in accordance with relevant guidelines and regulations.
Consent for publication: Not applicable.
Availability of data and material: Deidentified study data are available upon reasonable request from the corresponding author.
Competing interests: DK is the ExpertLens team leader. ExpertLens is an online platform for conducting modified-Delphi panels that has been used to collect data in this study. All other authors report no conflict of interest.
Funding: This study was supported by a National Institutes of Health grant (R01 HD094777). The funder did not have any role in the design of the study and collection, analysis, and interpretation of data and in writing this manuscript.
Authors' contributions: All authors contributed to the study conception and design. DK led the data collection, designed the analytic procedures, and drafted the paper; SP conducted all analyses, created all tables and figures, and helped draft the paper; JAH and LMB obtained funding, designed the overall study, advised on the analytic procedures, and reviewed and revised earlier drafts of the manuscript; SMP participated in the data collection and reviewed and revised earlier drafts of the manuscript. All co-authors reviewed and approved the final version of the manuscript.