Exploring the impact of regional variation on outcome prioritisation in core outcome set development: a case study in the field of gastric cancer surgery


Background: International stakeholder participation is important in the development of core outcome sets (COS). Stakeholders, however, may value health outcomes differently when regional differences are considered. Here, we explore how region, country income and participant characteristics influence prioritisation of outcomes during development of a COS for gastric cancer surgery trials (the GASTROS study).
Methods: 952 participants from 55 countries participating in a Delphi survey during COS development were eligible for inclusion. Recruits were grouped according to region (East or West), country income classification (high and low-to-middle income) and other characteristics (for patients: age, sex, time since surgery, mode of treatment and surgical approach; for healthcare professionals: clinical experience). Groups were compared with respect to how they categorised outcomes (‘consensus in’, ‘consensus out’, ‘no consensus’). Outcomes categorised as ‘consensus in’ or ‘consensus out’ by all 3 stakeholder groups would be automatically included in or excluded from the COS respectively.
Results: In total, 13 outcomes were categorised ‘consensus in’, 13 ‘consensus out’ and 31 ‘no consensus’. There was little variation in prioritisation of outcomes by stakeholders from Eastern or Western countries and high or low-to-middle income countries. There was little variation in outcome prioritisation within either health professional or patient groups.
Conclusion: Our study suggests that there is little variation in opinion within stakeholder groups when participant region and other characteristics are considered. This finding may help COS developers when designing their Delphi surveys and recruitment strategies. Further work across other clinical fields is needed before broad recommendations can be made.


Introduction
A core outcome set (COS) is an agreed minimum group of critically important outcomes which should be reported by all trials within a research field [1]. The GASTROS study (www.gastrosstudy.org) aims to develop a COS in the field of gastric cancer surgery to promote uniform reporting of important outcomes and facilitate evidence synthesis [2]. This is necessary as there is significant variation and heterogeneity in this field with respect to reporting and measurement of outcomes [3]. Furthermore, the outcomes chosen by researchers to report in surgical trials for gastric cancer often do not reflect the priorities held by patients [4]. For this reason, the GASTROS study has sought consensus between patients and healthcare professionals with respect to outcome selection.
Delphi surveys and consensus meetings are commonly used methodologies in the development of COS [1,5]. Delphi surveys ask participants deemed by the study group to hold an important perspective (key stakeholders) to prioritise outcomes and achieve consensus. The completed Delphi survey often informs and influences discussions during a subsequent consensus meeting, with the aim of resolving uncertainties regarding prioritisation and ratifying the final composition of the COS. Clear recruitment strategies for Delphi surveys are an important consideration. If recruitment does not result in representative stakeholder groups, there is a risk that the results of the Delphi may not be valid [6]. This is particularly important in international COS where significant regional and cultural differences may influence the results ahead of a consensus meeting and, ultimately, the final COS.
Ensuring stakeholder groups are representative can be a challenging task. There is a need to consider many factors including the incidence of the disease, treatment protocols, international variation in healthcare systems and values, and socio-economic issues. In the case of curative surgery for gastric cancer it is known that practice varies worldwide (e.g. how surgery is carried out and the extent of resection) and that surgeons typically value different outcomes to patients [4]. There is therefore a need to explore these issues to understand how key stakeholders are selected for survey participation. In the GASTROS study 952 participants from 55 countries were recruited to a Delphi survey (268 patients, 445 surgeons and 239 nurses). It was therefore possible to explore how stakeholder characteristics influenced outcome prioritisation.
This study had two main objectives:
1. To describe the characteristics of Delphi participants and explore their possible influence on the prioritisation of outcomes within stakeholder groups.
2. To explore whether there were regional differences across all stakeholder groups with respect to the categorisation of outcomes.

Methods
This was an analysis of registration data supplied by Delphi survey participants as part of the GASTROS study. Details of the scope, objectives and methodology of the study have been previously described [2][3][4]. In summary, participants were asked to score outcomes in terms of importance. The results of the Delphi survey informed discussions in a consensus meeting where final recommendations were made regarding which outcomes to include in the COS.
Stakeholder selection and baseline information
A guiding principle has been to promote the 'patient voice', as patients are the beneficiaries of trials in this field and have all-important 'lived experience'. The patient voice has previously been shown to be under-represented in COS development [7]. Surgeons provide a clinical perspective and the experience of treating large volumes of patients. Oncology nurses were invited to participate given their central roles as care-givers, patient advocates and core members of the clinical team.
Participation in the Delphi survey was open to all interested stakeholders who fulfilled the following criteria:
Surgeons who had completed their training and routinely treat gastric cancer.
Oncology nurses with a recognised proportion of their role involved in the care and follow-up of gastric cancer patients.
Patients who have undergone surgical resection for gastric cancer with the intention of cure.
There is no sample size requirement for Delphi surveys. To be able to demonstrate the enrolment of a broad and representative range of stakeholders, participants were asked to provide baseline information at registration (for patients: age, sex, time since surgery, mode of treatment and surgical approach; for healthcare professionals: clinical experience). These data points were chosen based on information that was likely to be readily known to participants and on the expert opinion of the GASTROS study management group (SMG) with respect to important factors that may influence outcomes or perspectives. In the context of patients, different health outcomes, such as complications and survival, may impact their lived experience and ultimately how outcomes are prioritised. Similarly, as clinical experience changes with time, there may be a greater exposure to, and therefore appreciation of, the impact or importance of longer-term consequences of surgery.
Additionally, all participants were asked to provide their country of residence so that regional differences could be considered. Participants were categorised into 'Eastern' or 'Western' countries (Figure 1) and 'high-income' or 'low- to medium-income' countries as defined by the Organisation for Economic Cooperation and Development's Development Assistance Committee [8]. Eastern countries were defined as those within East Asia, South East Asia, and Eastern Russia, and included China, Japan, South Korea, Thailand, Vietnam, Malaysia, and Singapore [9]. Western countries were defined as those from Western Europe, North America, Australia, and New Zealand [10]. Contrasting the 'East' and 'West' is of particular importance to gastric cancer given the differences in incidence, pathology, treatment and outcome. It was hypothesised that these differences in approach and survival may influence how stakeholders in these regions prioritise different health outcomes, which could be examined further in this study [11,12]. Similarly, health priorities may be influenced by resource availability as categorised by country income.

Scoring of outcomes in the Delphi survey and categorisation of outcomes
A list of 56 outcomes identified from previous trials and patient interviews [3,4] was presented to survey participants, who were asked to rate each outcome on a scale of importance (1-3: not important, 4-6: important, 7-9: critically important). The ratings of the patient, surgeon and nurse groups were considered separately to ensure that each group had an equal voice. Participants had the opportunity to suggest further outcomes that they believed had not been presented in round 1. One additional outcome suggested by participants in round 1 was identified and, after consideration by the SMG, was presented to participants for scoring in round 2. Therefore, a total of 57 outcomes were presented in round 2 where, for each outcome, participants were shown the scores from each stakeholder group and given the opportunity to change their rating if they wished.
After two rounds of rating, outcomes were categorised as follows:
To be included in the COS ('consensus in')
To be excluded from the COS ('consensus out')
'No consensus' reached, i.e. no decision reached as to whether the outcome should be included in or excluded from the COS.
Criteria for categorising outcomes were set a priori by the SMG and based on established COS methodology [1]. If an outcome was rated 7-9 (critically important) by 70% or more of a stakeholder group and 1-3 (not important) by no more than 15% of the group, then the consensus amongst that group was that the outcome should be included in the COS. If an outcome was rated 7-9 (critically important) by less than 50% of the group, the consensus amongst that group was for the outcome to be excluded from the COS. Unanimous agreement amongst all three stakeholder groups was required for inclusion in, or exclusion from, the COS. Any other combination resulted in the outcome being placed in the 'no consensus' category and discussed at a pre-planned consensus meeting to finalise the COS.
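The categorisation rules above can be illustrated with a minimal Python sketch. The function names, the group labels and the example scores are hypothetical and are not taken from the GASTROS data or software; only the thresholds (70%, 15%, 50%) and the requirement for agreement across all three groups come from the text.

```python
from typing import Dict, List


def group_consensus(scores: List[int]) -> str:
    """Classify one stakeholder group's 1-9 ratings for a single outcome."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores supplied")
    critical = sum(1 for s in scores if 7 <= s <= 9)       # rated 7-9
    not_important = sum(1 for s in scores if 1 <= s <= 3)  # rated 1-3
    # Integer comparisons avoid floating-point issues at the thresholds.
    if 100 * critical >= 70 * n and 100 * not_important <= 15 * n:
        return "in"
    if 100 * critical < 50 * n:
        return "out"
    return "none"


def overall_category(scores_by_group: Dict[str, List[int]]) -> str:
    """Combine group verdicts; all groups must agree for 'in' or 'out'."""
    verdicts = {group_consensus(s) for s in scores_by_group.values()}
    if verdicts == {"in"}:
        return "consensus in"
    if verdicts == {"out"}:
        return "consensus out"
    return "no consensus"


# Hypothetical ratings for one outcome (not real study data).
example = {
    "patients": [9, 8, 7, 7, 8, 9, 7, 5, 8, 9],
    "surgeons": [7, 8, 9, 9, 7, 7, 8, 6, 5, 9],
    "nurses": [4, 5, 6, 7, 8, 3, 2, 5, 6, 9],
}
print(overall_category(example))
```

In this illustrative input, patients and surgeons each reach 'in' but fewer than 50% of nurses rate the outcome 7-9, so the outcome would fall into the 'no consensus' category and go forward to the consensus meeting.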

Data analysis and interpretation
In round 1, participants completing 50% or more of the Delphi survey were included in the round 1 analysis and invited to participate in round 2. Likewise, participants completing 50% or more of the survey in round 2 were included in the round 2 analysis. For the purpose of the present analysis, participants were placed into 'sub-groups' according to the registration data they submitted (e.g. patient treatment type, surgeon experience) to examine the differences in outcome scoring. The following analyses were performed after 2 rounds of ratings:
1. The proportion of participants scoring each outcome as 'critically important' (score 7-9). This analysis approach was chosen as these figures were presented in the consensus meeting discussing results from the Delphi survey.
2. The consensus opinion of each sub-group with respect to whether the outcome should be 'included' in the COS, 'excluded' from the COS or whether 'no consensus' could be reached. These categorisations were compared against the overall 'in', 'out' and 'no consensus' categorisations by each stakeholder group (patients, surgeons and nurses) which were presented to the consensus meeting participants.
Participants not providing demographic data during registration were excluded from the sub-group analyses. When exploring differences in prioritisation, particular focus was placed on outcomes that were categorised as 'consensus in' by one sub-group and 'consensus out' by another.
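The sub-group analysis described above could be sketched as follows. This is an illustration under stated assumptions rather than the study's actual analysis code: the record layout (`registration` and `scores` dictionaries), the characteristic name `approach`, the function names and the example participants are all hypothetical. The 50% completion rule, the 7-9 band and the consensus thresholds are taken from the text.

```python
from typing import Dict, List


def completed_enough(scores: Dict[str, int], n_outcomes: int) -> bool:
    """True if the participant rated 50% or more of the outcomes presented."""
    return 2 * len(scores) >= n_outcomes


def percent_critical(scores: List[int]) -> float:
    """Percentage of ratings in the 7-9 ('critically important') band."""
    return 100.0 * sum(1 for s in scores if s >= 7) / len(scores)


def subgroup_verdict(scores: List[int]) -> str:
    """Apply the a priori consensus thresholds to one sub-group."""
    n = len(scores)
    critical = sum(1 for s in scores if s >= 7)
    low = sum(1 for s in scores if s <= 3)
    if 100 * critical >= 70 * n and 100 * low <= 15 * n:
        return "in"
    if 100 * critical < 50 * n:
        return "out"
    return "none"


def scores_by_subgroup(participants: List[dict], outcome: str,
                       characteristic: str) -> Dict[str, List[int]]:
    """Group ratings for one outcome by a registration characteristic.

    Participants with missing demographic data, or with no rating for
    this outcome, are left out of the sub-group analysis.
    """
    groups: Dict[str, List[int]] = {}
    for p in participants:
        label = p["registration"].get(characteristic)
        score = p["scores"].get(outcome)
        if label is None or score is None:
            continue
        groups.setdefault(label, []).append(score)
    return groups


def is_discordant(groups: Dict[str, List[int]]) -> bool:
    """True if one sub-group reaches 'in' while another reaches 'out'."""
    verdicts = {subgroup_verdict(s) for s in groups.values()}
    return "in" in verdicts and "out" in verdicts


# Hypothetical participants (not real study data).
participants = [
    {"registration": {"approach": "open"}, "scores": {"pain": 8}},
    {"registration": {"approach": "open"}, "scores": {"pain": 9}},
    {"registration": {"approach": "keyhole"}, "scores": {"pain": 3}},
    {"registration": {"approach": "keyhole"}, "scores": {"pain": 5}},
    {"registration": {}, "scores": {"pain": 9}},  # no demographic data
]
groups = scores_by_subgroup(participants, "pain", "approach")
print(groups, is_discordant(groups))
```

Outcomes flagged by `is_discordant` correspond to those given particular focus in the analysis, namely outcomes categorised 'consensus in' by one sub-group and 'consensus out' by another.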
To examine the possible influence of attrition bias between rounds, the characteristics of stakeholders participating in both rounds were compared to those who only completed round 1. A descriptive analysis was undertaken, and the chi-squared test was applied to examine for statistically significant differences at the 0.05 level.
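As an illustration of the attrition comparison, the following sketch computes a Pearson chi-squared test for a 2x2 table (for example, region by completion status). It is a minimal standard-library version under stated assumptions: the counts are invented for demonstration, no continuity correction is applied (the text does not say whether one was used), and the closed-form p-value holds only for one degree of freedom.

```python
import math
from typing import Sequence, Tuple


def chi_squared_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Pearson chi-squared statistic and p-value for a 2x2 table (df = 1)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For df = 1 the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (not study data):
# rows = Western / Eastern, columns = both rounds / round 1 only.
counts = [[10, 20], [20, 10]]
chi2, p = chi_squared_2x2(counts)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

For characteristics with more than two categories, a general contingency-table routine from a statistics package would be needed, since the degrees of freedom and hence the p-value calculation differ.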

Ethical Approval
The study was given ethical approval by the North West - Greater Manchester East Research Ethics Committee (18/NW/0347) and governance approvals by Manchester University Hospitals NHS Foundation Trust.

Overview
The characteristics of participants included in the analysis and attrition rates are summarised in Table 1. After 2 rounds of voting, agreement was reached amongst all three stakeholder groups to include 13 outcomes and exclude 13 outcomes from the COS, leaving 31 'no consensus' outcomes for discussion at the consensus meeting.

Patients
A summary of outcomes categorised for 'inclusion' by at least one patient sub-group after 2 rounds of voting is presented in Table 2. Thirty outcomes were categorised for inclusion in the COS by at least one subgroup. Four outcomes were simultaneously categorised both for 'inclusion' and 'exclusion' by different subgroups. None of the outcomes categorised for automatic inclusion by all stakeholder groups were voted 'consensus out' by any patient sub-group. Seven outcomes were categorised for inclusion in the COS by all patient subgroups.

Surgeons
Table 3 summarises and compares outcomes categorised for inclusion by at least one surgeon sub-group after 2 rounds of voting. Twenty-one outcomes were categorised for inclusion by at least one subgroup. No outcomes were simultaneously categorised both for 'inclusion' and 'exclusion' by different subgroups. Twelve outcomes were categorised by all surgeon subgroups for inclusion.

Nurses
Table 4 summarises and compares the outcomes categorised for inclusion by at least one nurse sub-group after 2 rounds of voting. Twenty-two outcomes were categorised for inclusion by at least one subgroup. Five outcomes were simultaneously categorised both for 'inclusion' and 'exclusion' by different subgroups. None of the outcomes categorised for automatic inclusion by all stakeholder groups were voted 'consensus out' by any nurse sub-group. Ten outcomes were categorised by all nurse subgroups for inclusion.

Regional variations
Table 5 details the final categorisation list of outcomes which was presented to participants at the consensus meeting. This is compared to alternative outcome categorisation lists based on the region and country income differences described above. Consensus agreement to include 8 and exclude 7 outcomes was reached across all regional sub-groups. No outcomes were simultaneously categorised as 'consensus in' and 'consensus out' across different regional sub-groups.

Table legend. Values are the percentage of participants voting the outcome as critically important (score 7-9). Green = for inclusion, Red = for exclusion, Yellow = no consensus. HIC = high-income country, LMIC = low- to middle-income country. *Denotes outcomes which were included in the final list of outcomes for automatic inclusion in the COS. **Participants not from either Western or Eastern countries were excluded from this analysis.

Discussion
The GASTROS study (www.gastrosstudy.org) is the first to bring together healthcare professionals and patients with the purpose of identifying outcomes to include in a COS for surgical trials in gastric cancer. The multi-language survey recruited a broad spectrum of stakeholders with different personal and professional experiences from over 50 countries across 6 continents. We aimed to examine whether certain stakeholder characteristics influenced how outcomes were prioritised and whether there were also regional influences. Our analysis of nearly 1000 survey participants suggested that little variation exists within the stakeholder groups. Similarly, when all stakeholders were categorised according to region or country income, significant differences were not identified. These are important findings which should serve to reassure researchers and patients that the resulting COS has sought and considered international opinion. Furthermore, these findings suggest that priorities within stakeholder groups and across regions are more aligned than may have been previously thought.

Planning recruitment to Delphi surveys
Few studies have previously examined factors which influence how stakeholders prioritise outcomes in the field of COS development. The BRAVO study explored this in the field of breast cancer reconstruction and found that priorities varied within patient and healthcare professional groups [6]. This led its authors to recommend careful participant selection for Delphi surveys by COS developers. These same differences, however, were not identified in our study. The BRAVO study's healthcare professional stakeholder group was more heterogeneous than the groups in this study (breast surgeons, plastic surgeons, nurses and psychologists grouped together) and so these differences may be expected. Furthermore, reconstructive breast surgery is a complex area which covers many different types of procedures. This may also account for the significant variation in outcome prioritisation by patients which was not mirrored in the GASTROS study. Similarly, a COS study in the field of bariatric surgery identified significant variation in outcome prioritisation amongst healthcare professionals [13]. Again, healthcare professionals in that study were heterogeneous, which supports our strategy to separate surgeons and nurses into different stakeholder groups.
Achieving the 'correct balance' of representative stakeholders is an important consideration during the design phase. For example, knowledge of the patient demographic, and of which types of interventions are prevalent within that group, will enable researchers to recruit an appropriate number of stakeholders with those characteristics. With respect to the GASTROS study, the importance of seeking international agreement on core outcomes was identified at the conception stage and subsequently influenced the design of the prioritisation exercise. Our strategy for addressing the significant challenges associated with international involvement included 1) an international working group with regional collaborators, 2) translating surveys and 3) seeking the support of relevant patient and professional groups. Transparent reporting of the methodological approaches adopted during COS development is of paramount importance. Ultimately, a COS will only achieve its stated goals if researchers use it. Whilst several factors are likely to influence the uptake of COS, researchers' confidence that the COS is relevant to them and has been developed through a methodologically robust process is likely to be an important factor in its uptake and dissemination [14].
There are challenges in deciding how to sample participants for a Delphi study. Epidemiological studies, registries and audits provide descriptive regional or national information [15][16][17]. However, in the case of gastric cancer, these resources are not always complete or available. Consequently, the study team widened the promotion of, and enrolment into, the Delphi to capture as many patients as possible. In our study, we demonstrated that there was no significant variation in outcome prioritisation within stakeholder sub-groups with respect to the characteristics that we examined. Consequently, whilst over 1000 participants were enrolled, it may not have been necessary to recruit such large numbers. This will likely guide our recruitment strategy during future planned stages of work when reviewing the COS and identifying outcome measurement instruments. Our experience may also help guide other COS developers as they consider the number of participants to recruit to their Delphi surveys. However, given that some of our findings differed from those in the fields of breast reconstruction surgery and bariatric surgery, more work is needed before broad recommendations can be made.

Variations within stakeholder groups
When regional variations across the three stakeholder groups were compared, the greatest differences in prioritisation were observed amongst nurses. For example, four outcomes (pain, ability to undertake physical exercise, impact on mental health, need for additional intervention) were categorised as 'consensus in' by some nurse subgroups and 'consensus out' by others. These outcomes seemed less important in LMIC and HIC settings within the nurse group.
Understanding the reason for this is likely to be complex. It may be argued that this is simply because nurses are reflecting the importance that patients from these cultures or regions place on these outcomes, as similar trends were seen amongst patients. Limited resources in LMIC settings, which may affect follow-up, may also play a role in understanding how important longer-term problems are in these regions. Further exploration using qualitative research methods may help to understand these differences.
In examining the differences between patient sub-groups, one would expect to see some differences given the number of characteristics that were examined.
Despite this, only two outcomes (urinary complications and conversion to open surgery) were simultaneously categorised as 'consensus in' and 'consensus out' by different sub-groups. This finding suggests that despite the many possible influences on patient experience following gastric cancer surgery, there is not a significant variation in how health-related outcomes are prioritised in this group. Surgeons had the greatest concordance with respect to outcome prioritisation. Overall, the observed differences in outcome prioritisation were small within each stakeholder group, reassuring researchers using the COS that it is based on the views of a representative cohort of patients and healthcare professionals.

Impact of regional variations on outcomes automatically included in COS
The aim of a COS is to identify outcomes which are critically important across all stakeholder groups participating in the process. In the case of the GASTROS study, an outcome would only be automatically included in the COS if patients, surgeons, and nurses each categorised it 'consensus in'. Ultimately, it is not possible to confidently assess how regional differences may have affected the final categorisation of outcomes which informed the consensus meeting. Participants in round 2 were shown the scores of all stakeholder groups from round 1 before being asked to change their score if they wished. To assess regional differences, Western participants in round 2, for example, would have needed to see only Western stakeholder group scores from round 1.
Furthermore, there are a number of other confounding factors which influence why participants change scores between rounds (see below), further complicating an analysis of regional impacts.
Despite this, some assessments could be made. No outcomes categorised for automatic inclusion by all three stakeholder groups were categorised for automatic exclusion by a regional sub-group. And no outcomes categorised for automatic exclusion from the COS by all three stakeholder groups were categorised for automatic inclusion by a regional sub-group. This suggests that the regional differences in approach to management or patient outcome may not significantly influence how stakeholders prioritise outcomes by stakeholders from the West and HIC that were not included in the final list presented to the consensus meeting. Furthermore, some outcomes (surgery-related death, nutritional outcomes, bleeding, overall quality of life, anaesthetic complications) did not reach consensus for automatic inclusion by regional sub-groups yet were automatically included when the overall views of stakeholders were considered. This may bring some to the conclusion that different COS should be developed for different regions as some researchers may be collecting outcomes that were not deemed critically important in their region.
However, researchers should be cognisant of the fact that their trials are internationally relevant and vitally important to the larger picture where evidence synthesis is concerned. From a different perspective, some researchers may feel aggrieved if outcomes which are critically important in their region are not eventually included in the COS. It is important to emphasise that COS are minimum reporting guidelines and that researchers are encouraged to report additional outcomes that they believe are important.

Strengths and Limitations
Strengths of this study include that it is novel and that it was able to recruit a large number of participants from many countries. However, there are some limitations which should be acknowledged. This was an analysis which was not powered to make definitive conclusions about relationships between subgroups and how outcomes were rated. Therefore, the results should be viewed in this context. Furthermore, the sub-groups examined in this paper were chosen by members of the study team based on their extensive experience in the field of gastric cancer and their understanding of factors which may impact on stakeholder experience, perceptions and subsequently how outcomes may be prioritised. It is possible that other unexplored characteristics impact on how stakeholders prioritise outcomes. In addition, this study did not explore how different characteristics interact with one another to impact on outcome prioritisation (e.g. years since surgery and type of gastrectomy). Doing so would create results which would remove the focus from regional differences and would be difficult to interpret. Furthermore, there were significantly fewer patients from Eastern countries enrolled compared to their Western counterparts.
This may have influenced how outcomes were categorised ahead of the consensus meeting. However, due to the interplay of other factors described above, reaching a definite conclusion about the degree of this possible limitation is difficult. This is an area that may benefit from further exploration using qualitative research methods.
Delphi surveys are an established method of reaching consensus in the design of COS [1]. They give participants the opportunity to reflect on their ratings from previous rounds before giving a final score. Only after this opportunity should all scores be analysed, and outcomes categorised ahead of the consensus meeting. During the process of rating outcomes in round 2 of the survey, participants are shown the results from each separate stakeholder group in round 1.
The topic of why participants change their scores between rounds is an interesting one which has been examined elsewhere [18]. Through our previous analysis we identified that the reasons for changing scores provided by stakeholders were varied, including having the time to reflect on the question being asked, changing their minds on the importance, impact or usefulness of the outcome in question, and changes in personal experience of the outcome. In fact, the influence of other stakeholder ratings as a reason for significantly changing a score in round 2 was cited by only a minority of healthcare professionals and patients.
Another factor which may influence scores between rounds is attrition. Our strategy to keep this as low as possible, alongside other approaches to facilitate international participation in Delphi surveys for COS, is a topic which will be described separately. Whilst overall attrition was 30%, the group most affected was nurses, with nearly 45% attrition. However, the characteristics of those completing both rounds were not significantly different to those only completing round 1. By contrast, a statistically significant difference was identified in the characteristics of surgeons completing both rounds, who were predominantly Western and from HIC compared to the balance of surgeons completing round 1. It could be argued therefore that retaining a greater number of Eastern and LMIC surgeons may have led to slightly different survey results. However, whilst statistically significant, this difference is unlikely to be clinically significant given that the number of surgeons not participating in round 2 was relatively small.

Conclusion
The GASTROS Delphi survey recruited a broad spectrum of international stakeholders to produce a list of outcomes which should be included in or excluded from a COS and others which required further discussion at a consensus meeting. Whilst some regional differences were highlighted, there was little variation within stakeholder groups and between regions with respect to how outcomes were prioritised. This may reassure COS users that the adopted methodology was robust and that the views captured during its development were representative. COS developers should carefully consider the characteristics of Delphi survey participants when planning their recruitment strategy.

Figure 1. Countries from which participants were recruited. Eastern countries were defined as those within East Asia, South East Asia, and Eastern Russia, and included China, Japan, South Korea, Thailand, Vietnam, Malaysia, and Singapore. Western countries were defined as those from Western Europe, North America, Australia, and New Zealand. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.