The Multiple Stakeholder Approach to Real-world Evidence Generation: the Results of the Jandhyala Method in Observing Expert Consensus on Quality Indicators of Rare Disease Patient Registries (RDRs) by Stakeholder Group

Evidence is valuable for informing decision making. Understanding what key stakeholders need from evidence is important to those involved in generating it. Where multiple stakeholders exist, establishing whether their questions are homogeneous or heterogeneous requires a dedicated approach that will be of value to those invested in the outcomes of their decisions. When commercialising its medicines, the pharmaceutical industry engages with non-pharmaceutical industry stakeholders: Payors, Prescribers and, under carefully controlled circumstances, Patients. This original research focussed on the differences between these groups and their pharma-aligned stakeholders (Regulatory Affairs, Market Access, Commercial and Medical Affairs), using measures of their common quality indicators (QIs) for rare disease patient registries. QIs were solicited using the Jandhyala method for observing item awareness and consensus from list-generating questioning. They were compared for homogeneity between individual stakeholder groups and between the combined pharma and non-pharma stakeholder group populations.


Introduction
Evidence is the product of research being conducted to answer a particular research question. Such evidence is of the greatest value to its 'consumers', who are defined by the relevance of the research question as it pertains to them, including the value that the answers will have to them in decision making (1). Evidence-based decision making can be assumed as a risk-mitigating approach by some (2); thus, consumers of evidence may fall into groups on the basis of their common unanswered research questions and the effect that the evidence targeted to answering it is expected to have on an environment.
Consumers may generate relevant evidence themselves or have it generated for them by another interested party. This party may be driven by the shared benefits, to themselves and to the consumer, of answering a specific set of consumer-relevant questions.
The pharmaceutical industry is one such interested party which manufactures medicines for the benefit of patients in return for revenue generated by its adoption (3). However, key hurdles of regulatory approval, reimbursement and prescribing practices need to be addressed, with evidence, tailored to their gatekeeper consumer's needs before these benefits can be broadly realised.
It has been argued that these gatekeepers, more constructively termed 'stakeholders', require a dedicated strategy to meet their evidence needs. Furthermore, where more than one need exists, such as in the pharmaceutical industry, a multiple stakeholder approach is required to plan and deliver evidence that maximises the return on investment, in the form of time and resources, by ensuring as many stakeholders can benefit from the generated evidence as possible (4).
Patient registries, or observational cohort clinical studies of a prospective and/or retrospective type, have the potential to generate answers to a large number of research questions in real-world patients (5), primarily if these questions are appropriately understood and incorporated in the planning phase (6,7).
Their value can be amplified in rare diseases where small, discrete patient populations can be combined and studied as a larger, single population to increase the quality of the evidence generated. Orphan medicines are also among the more costly drugs and are arguably under greater scrutiny because of this (8). Therefore, well-targeted evidence supporting the approval, reimbursement and prescribing of orphan medicines, and so encouraging their adoption, offers the best opportunity for rare populations to receive this life-changing treatment.
The pharmaceutical industry has evolved by recognising the need to engage with key non-pharma (PH-) stakeholders such as the Regulator, Payor (Payo), Patient (Pati) and Prescriber (Pres) by aligning their internal functions to mirror the needs of these stakeholders to ensure their perspective remains in focus for the company (9,10). This could be done using functions such as Regulatory Affairs (RAff), Market Access (MAcc), Commercial (Comm) and Medical Affairs (MAff). Several companies have also implemented patient advocacy functions to engage with these PH- stakeholders.
Given the value and versatility of patient registries in rare diseases for generating real-world evidence (RWE) and the importance of the decisions made by the various stakeholders involved in making new medicines available (11), three clear questions have been identified as the subject of this work. Firstly, whether a multiple stakeholder approach is required, which is tested by confirming that stakeholder needs differ between groups. Secondly, if so, what the exact nature of the differences is and what level of agreement exists on any common items across the groups.
Thirdly, whether pharma is aligned with PH- stakeholders and whether there is alignment between the individual stakeholder groups, especially between those Pharma (PH+) stakeholder groups intended to mirror specific PH- stakeholder groups.
To answer these questions, alignment will be measured by comparing the detailed lists of quality indicators (QIs) of rare disease patient registries (RDRs), solicited by the Jandhyala method, to observe consensus among the stakeholder groups on items that should be included.
It is hoped that outputs from this work may help to inform the planning of RWE-generating strategies and to establish whether a multiple stakeholder approach is justified.

Jandhyala Method
This study was conducted using the Jandhyala method (12), a novel mixed-methods approach for generating expert opinion. Expert opinion is generated by observing levels of awareness and consensus relating to items identified from list-generating questioning via two anonymised online surveys and calculating an awareness index (AI) and consensus index (CI) for each item, respectively (13). The objective of this study was to identify the QIs for a successful RDR from the perspective of multiple relevant stakeholders, or consumers, of the resulting evidence, as well as assess the heterogeneity among stakeholders' expert opinions on the QIs needed for a successful RDR. There were no formal endpoints in this study, as the outcomes were consensus on items that indicated successful RDR quality.
During the first Awareness Round (1) survey, participants were asked to respond to the list-generating question: "What is important to planning for, or judging the success of, a Rare Disease Registry?" They were asked to provide a minimum of three, and a maximum of ten, free-text answers.
The participants' responses from this Awareness Round (1) were analysed per group and refined into mutually exclusive items by three investigators using a process of content analysis and open coding. The codes were then attributed to the relevant participants' answers by one investigator, before being confirmed with a second.
The participants who completed the first round were asked to participate in the second Consensus Round (2) survey, and were asked to rate their level of agreement with the inclusion of the items arising from the Awareness Round (1) survey in the list of QIs using a five-point Likert-scale (strongly agree; agree; neither agree nor disagree; disagree; or strongly disagree). Items reaching a consensus level of ≥ 50% (consensus index ≥ 0.5) were retained in the final list.

Study Participants
Participants with experience in RWE in rare diseases were invited from the author's professional network and via advertisements placed on professional social media networking sites. They were drawn from the following industry and non-industry stakeholder groups: MAff, MAcc, Comm, RAff, Pati, Pres and Payo.

Study Variables
In line with the published method and its reporting convention, the Awareness Index (AI), Consensus Index (CI), and an Index score, indicating prompting, were calculated for each generated item within their stakeholder group.
The AI, measured in the Awareness Round (1), was calculated as the frequency of an item relative to that of the most frequently occurring item. The minimal awareness threshold was defined as an Awareness Index ≥ 50%. The CI was calculated as the percentage of participants who agreed or strongly agreed with an item in the Consensus Round (2).
The Index score measured whether prompting occurred during the Consensus Round (2). Prompting occurred if the absolute difference between the AI and the CI was ≥ 0.05 (or 5%). Prompting was considered negligible if the absolute difference was ≤ 0.05 (or 5%). Unprompted consensus occurred when most participants suggested an item in the Awareness Round (1), and the majority agreed, or strongly agreed, that the item was important in the Consensus Round (2). Any item that was not suggested by the participants during the Awareness Round (1), but which was agreed to be important in the Consensus Round (2), was prompted completely. Items achieving a CI ≥ 0.51 were retained.
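The three indices described above can be illustrated with a minimal Python sketch. It is not the published implementation of the Jandhyala method: the function names, the example items and the ratings are hypothetical, each item is assumed to be counted once per participant, Likert responses are assumed to be coded 1 (strongly disagree) to 5 (strongly agree), and a difference of exactly 0.05 is treated as prompting, since the two definitions above overlap at that value.

```python
from collections import Counter

def awareness_index(round1_answers):
    """AI per item: item frequency divided by the frequency of the most frequent item."""
    # Assumption: an item is counted once per participant who suggested it.
    counts = Counter(item for answers in round1_answers for item in set(answers))
    top = max(counts.values())
    return {item: n / top for item, n in counts.items()}

def consensus_index(round2_ratings):
    """CI per item: share of participants rating the item 4 (agree) or 5 (strongly agree)."""
    return {item: sum(r >= 4 for r in ratings) / len(ratings)
            for item, ratings in round2_ratings.items()}

def index_score(ai, ci):
    """Absolute difference between AI and CI; items absent from Round 1 get AI = 0."""
    return {item: abs(ai.get(item, 0.0) - ci[item]) for item in ci}

def is_prompted(difference, threshold=0.05):
    """Assumption: a difference equal to the threshold counts as prompting."""
    return difference >= threshold

# Hypothetical coded Awareness Round answers from four participants.
round1 = [
    ["core dataset", "patient engagement", "funding"],
    ["core dataset", "patient engagement", "governance"],
    ["core dataset", "funding", "governance"],
    ["core dataset", "patient engagement", "data quality"],
]
# Hypothetical Consensus Round Likert ratings (1-5) for the same items.
round2 = {
    "core dataset": [5, 5, 4, 4],
    "patient engagement": [5, 4, 4, 3],
    "funding": [4, 3, 2, 5],
    "governance": [5, 4, 4, 4],
    "data quality": [4, 4, 5, 2],
}

ai = awareness_index(round1)
ci = consensus_index(round2)
idx = index_score(ai, ci)
prompted = {item: is_prompted(d) for item, d in idx.items()}
```

In this toy data, "governance" was suggested by two of four participants (AI 0.5) but agreed by all four (CI 1.0), so its Index score of 0.5 marks it as prompted, whereas "core dataset" has identical AI and CI.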
The different stakeholder groups were recruited at different time points; thus, it was necessary to code the Awareness Round (1) data for each group individually to send the Consensus Round (2) survey out timeously. Consequently, the generated items could not be harmonised across all groups until the end of the data collection period, and an additional round of analysis was carried out to standardise the generated items and avoid repetition.
The list of QIs for a successful registry from the perspective of multiple stakeholders was generated from the expert opinion survey.

Statistical Analysis
Absolute heterogeneity in stakeholder responses (Table 1)
The threshold for absolute heterogeneity of a stakeholder group was the generation of at least one unique QI. The number of unique QIs generated by each stakeholder group, the total number of items generated by each stakeholder group, and the proportion of unique items are presented.
A chi-squared two-proportions test was performed to assess whether the proportion of unique QIs generated by each stakeholder group differed significantly from that of all other groups combined. A P-value ≤ 0.05 indicated that the proportion of unique QIs generated by the stakeholder group was significantly different from (heterogeneous with) that of the others.
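As an illustration of this comparison, the following Python sketch computes a chi-squared test for two proportions on a 2x2 table using only the standard library. The analysis itself was run in R; the function name is hypothetical, and the use of Yates' continuity correction is an assumption (it is the default of R's prop.test). The example counts are the Pati group's 14 unique items out of 45, against 39 out of 205 for the remaining groups, the latter obtained by summing the group counts reported in the results.

```python
import math

def two_proportion_chi2(unique_a, total_a, unique_b, total_b, yates=True):
    """Chi-squared test (1 degree of freedom) comparing two proportions.

    Returns the test statistic and its P-value. Assumes no row or
    column of the 2x2 table sums to zero.
    """
    a, b = unique_a, total_a - unique_a   # group of interest: unique, not unique
    c, d = unique_b, total_b - unique_b   # all other groups:  unique, not unique
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction (assumed here, not stated in the text).
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Pati group versus all other groups combined.
chi2, p = two_proportion_chi2(14, 45, 39, 205)
```

With these inputs the sketch gives a statistic of about 2.54 and a P-value of about 0.111, in line with the value reported for the Pati group, although this agreement does not confirm the exact procedure used in the study.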
Relative heterogeneity (Table 2)
The heterogeneity among stakeholders for each item was examined using individual- or expert-level consensus scores. For this analysis, the consensus scores of the individual stakeholders were further classified as agree (expert-level consensus score of 4 or 5 in the Jandhyala method) and disagree (expert-level consensus score of 1 to 3 in the Jandhyala method). For each item, these classified scores were tabulated among the stakeholder groups that provided consensus scores for the item. Chi-square tests were performed on the tabulated data, and a P-value ≤ 0.05 indicated heterogeneity among stakeholders for consensus scoring on the item.
The stakeholder groups were combined into PH+ (Comm, MAcc, MAff and RAff) and PH- (Pati, Pres and Payo). Chi-square testing was also performed on the combined tabulated data, and a P-value ≤ 0.05 indicated heterogeneity in consensus between the PH+ and PH- stakeholders for the item.
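The per-item procedure can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions rather than the code used in the study: the function names and the example scores are hypothetical, no continuity correction is applied, and the P-value comes from a closed-form upper tail of the chi-squared distribution for integer degrees of freedom. With cell counts as small as those in the example, the chi-squared approximation is rough.

```python
import math

def dichotomise(scores):
    """Classify Likert scores: 4 or 5 = agree, 1 to 3 = disagree."""
    agree = sum(s >= 4 for s in scores)
    return agree, len(scores) - agree

def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in sqrt(x).
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    density = math.exp(-half) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total

def heterogeneity_test(scores_by_group):
    """Chi-squared test on a groups x (agree, disagree) table for one item."""
    table = [dichotomise(s) for s in scores_by_group.values()]
    n = sum(a + d for a, d in table)
    col_agree = sum(a for a, _ in table)
    col_disagree = n - col_agree
    df = len(table) - 1
    if col_agree == 0 or col_disagree == 0:
        return 0.0, df, 1.0  # every participant gave the same classification
    chi2 = 0.0
    for a, d in table:
        row = a + d
        for observed, col in ((a, col_agree), (d, col_disagree)):
            expected = row * col / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, df, chi2_sf(chi2, df)

# Hypothetical Consensus Round scores for a single item.
item_scores = {
    "Pati": [5, 5, 4, 4, 5, 4, 4],
    "Pres": [2, 3, 4, 1, 3, 2, 3, 2, 5],
    "Payo": [4, 3, 5, 2, 4, 3, 4],
}
chi2, df, p = heterogeneity_test(item_scores)
heterogeneous = p <= 0.05
```

The same function covers the combined comparison: passing two pooled lists of scores, one for PH+ and one for PH-, yields a test with one degree of freedom.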
Heterogeneity in common pairwise stakeholder items

Data includes two tables and one figure.
The heterogeneity between stakeholder pairs in the lists of items generated and agreed upon by each stakeholder group within the pair was examined. This analysis used item-level data and the stakeholder group-level consensus score per item based on the Jandhyala method, ranging from 1 to 4. These scores were categorised according to the method's published scheme into Complete consensus, Consensus +, Consensus - and No consensus, numbered 1, 2, 3 and 4, respectively. The category numbers for each generated list of items were tabulated between each stakeholder pair. Chi-square tests were performed on the tabulated data, and a P-value ≤ 0.05 indicated homogeneity between the stakeholder pair in their consensus on the item.
All statistical analyses were performed using R 3.6.2.

Study Participants
Recruitment took place in May 2019 and January 2020. A total of 55 participants were recruited. PH- stakeholder participants were made up of 9 Pres, 7 Payo and 7 Pati. PH+ stakeholder participants comprised 13 MAff, 5 MAcc, 7 Comm and 5 RAff. Non-industry regulatory functions were uniformly unable to participate in the study on the grounds of conflict of interest. Due to recruitment difficulties, the RAff experts were not always specialised in rare disease, and one of the seven Payo participants had to be recruited from outside the EU.

Relative Heterogeneity
The mean number of unique items suggested per stakeholder group was 7.57, whereas the mean number of total items suggested was 35.71. The mean stakeholder group size was 7.86 participants. The Pati stakeholder group suggested the largest number of both unique and total items: 14/45 (31.11%, P = 0.111). The MAff stakeholder group was the largest group (13/55) but suggested the smallest number of unique and total items: 4/25 (16.00%, P = 0.680). The proportions of unique to total items for Comm, MAcc, RAff, Payo and Pres were 8/41 (19.51%, P = 0.936), 6/35 (17.14%, P = 0.753), 6/37 (16.22%, P = 0.558), 6/32 (18.75%, P = 0.895) and 9/35 (25.71%, P = 0.059), respectively (Table 1 and Fig. 1). All stakeholder groups were found to be equally unique in their proportion of unique to total suggested indicators.
Sixteen items out of 111 reached consensus (Consensus Index ≤ 2) in four or more groups of stakeholders. Eight items resulted from the division of a generated item into its constituent parts, and eight generated items were merged with other generated items as part of the harmonisation process. A degree of both commonality and heterogeneity exists between the consensus levels of these common items.
Two items reached consensus in four or more stakeholder groups for inclusion in the list of QIs of RDRs: "Includes a core dataset as part of the outcomes" and "Engages with patients (organisations) and gains their buy-in" (Table 2 and Figure 2). (Table 3 and Figure 2).

Discussion
This research sought to confirm the presence of discrete stakeholder groups with equally unique needs to validate the need for a multiple stakeholder approach in evidence generation. This was confirmed in both PH+ and PH- stakeholders. Despite the unique QIs mapped out per stakeholder group, several common indicators existed. An understanding of both common and unique items enables evidence generation to be planned to satisfy the needs of more than one stakeholder, thereby improving return on investment and reducing resource attrition over time. Two QIs of RDRs that met with unanimous agreement across the PH+ and PH- stakeholder groups were identified and should be considered 'core' to the success of this type of research: 1) the need to have a core data set and 2) the need to both engage with patients and obtain their buy-in.
The core dataset has clear significance in the context of the 'Achilles heel' of observational clinical studies: missing data among the data requested. The issue here is the central requirement that an observational study must not mandate data collection through unnecessary interventions on patients in the real world, but simply observe their standard diagnostic and monitoring practice. The inevitability of missing data, and the limitation this places on the quality of the evidence generated, is the common theme running through these stakeholders' appreciation for a key QI of a potential powerhouse of evidence generation. Development of core datasets in rare diseases is currently an area of interest for the author, and work is ongoing on how these can be delivered with the multiple stakeholders in mind.
With reference to the work presented here, it should be recognised that PH+ stakeholders have evolved in alignment with, and primarily as a result of, the recognition of the various gatekeeper PH- stakeholder groups above. The results of the dedicated pairwise analysis of common QIs between all possible pairs of stakeholders were surprising. Expected pairs failed to yield homogeneity in unique indicators, most notably Payo-MAcc, whereas other pairs yielded unexpected and arguably less useful alignment, such as MAcc-RAff and MAcc-Comm.
The above findings were in the context of a broader and more worrying lack of alignment in homogeneity between the PH+ and PH- stakeholders; 53/111 (47.75%) QIs were without a corresponding partner on either side of this divide. Even more worrying, although patients were unanimously agreed to be a stakeholder group that needed to be engaged with, none of the Patient stakeholder pairwise tests yielded homogeneity in common QIs with any other group.
The original research presented here provides some evidence which can be used to inform corrective strategies by providing a Neutral List of QIs against which existing patient registries can be measured, and future registries can be designed and implemented. Further work is required to appropriately understand the QIs of the PH- regulatory group in a similar framework.

Declarations
Ethics Approval and Consent to Participate
All participants provided informed consent to participate in this research study, and all ethical standards were adhered to.

Consent for Publication
Not applicable.

Availability of Data and Materials
All data generated or analysed during this study are included in this published article and its supplementary information files.

Competing Interests
The author declares that they have no competing interests.

Funding
No external funding sources supported this work.