The following sections describe the findings from the interviews, based on the themes derived from the analysis.
Attitudes towards, and experience of, machine learning in OA
The majority of participants had at least a basic understanding of machine learning within OA, and most felt positively about it conceptually. There was a good level of agreement that machine learning and artificial intelligence offer opportunities to achieve analysis that would not be possible by humans alone, or that would be prohibitively time-consuming otherwise. Another major advantage identified was the ability to test a hypothesis and train an algorithm on larger datasets, but then refine it on the smaller datasets which are more typical and achievable within OA research. Being able to develop machine learning tools sensitive enough to reliably test hypotheses in small samples was seen as a positive opportunity. Machine learning was also seen as potentially beneficial to commercial companies, who could use it to expedite trials of their products. All of the perceived benefits of machine learning were felt to have the potential to positively impact patient outcomes.
IP04: “We were training it to use a scoring system which took the radiologist about 35 minutes and of course once you've got the machine learning algorithm sorted, it can be done in a few minutes or seconds.”
IP06: “If it does work it could be quite transformative. […] Because what tended to happen in osteoarthritis is people progress slowly, companies don’t want to do clinical trials of three, four, five years.”
IP09: “I think there’s probably more around diagnosis than there is maybe around prognostic modelling.”
One aspect of machine learning on which participants strongly agreed was that collaboration with people with expertise in the area is essential, since the field is relatively new and also extremely complex. Participants agreed that specific knowledge is required to develop the algorithms and approaches needed to tackle large datasets and extract meaningful insights. Those who had already explored machine learning in their work spoke positively about collaborators with specialist knowledge, but also acknowledged that, due to the infancy of the field, there are currently few analysts with the correct set of skills. IP02 explained that it is crucial not only to have someone who understands coding and the appropriate computer programming languages, but also to have someone who can understand the research aims and what exactly is being sought within the data.
IP02: “I think specialist knowledge, I think that’s the thing, and it’s having the data in the right format for them to use. I think the other thing is when we started doing this there weren’t many people that knew about it, we had to train the computer scientists to understand where our data came from otherwise they didn’t use it in the right way. So it’s about speaking the same languages.”
IP07: “I think as people with those sort of skills are increasingly employed in our sort of health data research departments, then they will bring that knowledge of, and sort of, insight, and the ethos of needing to share these things more openly and widely.”
Though participants were generally positive about machine learning, there were some words of caution. One important observation was that whilst machine learning can facilitate large scale analyses, large scale datasets are required in the first instance. As discussed in previous sections, datasets in OA are typically much smaller in scale than the numbers required for this approach, and as such it is vital that either data pooling is achieved first, or that the algorithms are trained on existing large datasets. There was a concern from some participants that if not applied carefully and cautiously, machine learning studies would be underpowered and therefore the reliability and validity of the outcomes could be compromised.
IP05: “The numbers in most studies would not be big enough by far. And the problem is, if you look at most x-ray studies of OA, we now know that if you were doing a, even with an enriched cohort, you’d probably need about 600 patients per arm in an x-ray study with a 12-month outcome. And, when you look at most studies, they’re 100 patient per arm or 50 patients per arm.”
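The relationship between expected effect size and required sample size that underlies IP05's concern can be illustrated with a standard normal-approximation formula for comparing two group means. The sketch below is illustrative only: the function name `patients_per_arm`, the effect sizes, and the default significance level and power are assumptions introduced here, and it is not the calculation behind the figure of 600 patients per arm quoted by the participant.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect a standardised
    difference in means (two-sided test, normal approximation)."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Illustrative (assumed) standardised effect sizes, from moderate to small.
for d in (0.5, 0.3, 0.2):
    print(f"standardised effect {d}: about {patients_per_arm(d)} patients per arm")
```

Because the required number grows with the inverse square of the effect size, halving the expected effect roughly quadruples the number of patients needed, which is one reason small studies of slowly progressing disease may be underpowered.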
IP09: “That is the risk, that you’re just doing multiple testing and then you find things by chance, or if you’re coming up with a model to explain your data, it’s just horribly overfitted, so i.e. it works perfectly with your little set of data by chance, but it isn’t in any way generalisable to anyone else’s, and that’s the risk.”
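The risk of chance findings described by IP09 can be demonstrated with a small simulation. The sketch below is a minimal illustration using entirely random numbers, not study data: it correlates many random "features" with a random "outcome" and reports the strongest correlation found. The function names, cohort sizes and number of features are assumptions chosen for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(n_patients, n_features, seed=0):
    """Largest absolute correlation between a random outcome and
    features that are pure noise, so any association is spurious."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_patients)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_patients)]
        best = max(best, abs(pearson(feature, outcome)))
    return best


small = strongest_chance_correlation(n_patients=20, n_features=200)
large = strongest_chance_correlation(n_patients=2000, n_features=200)
print(f"20 patients, 200 noise features: strongest |r| = {small:.2f}")
print(f"2000 patients, 200 noise features: strongest |r| = {large:.2f}")
```

With few patients and many candidate variables, at least one noise feature will usually appear strongly associated with the outcome, whereas the same search in a much larger cohort yields only weak spurious correlations. This is the pattern that validation on independent data is intended to expose.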
Size of datasets
Given the need for larger datasets to facilitate the application of machine learning approaches, participants were questioned on their current, typical data collection, in terms of size, format and minimum data requirements. There was a large amount of variance in the sizes of the datasets collected or used by participants, depending on the nature of the study. For studies which involve time-consuming data collection methods such as sample collection, lab analysis or the application of markers, researchers tended to report smaller sample sizes. However, it was felt that smaller sample sizes are an inevitability with this type of research. IP02, who had achieved larger sample sizes on labour- and resource-intensive studies, described the difficulty in doing so, and the associated compromises, time and cost required to collect multiple measures from large cohorts.
IP02: “We’ve got a database of [...] I think it’s about 200 people, some have imaging as well some don’t. […] That took about three years, three, four years to collect that. It’s just getting the people in and keeping the lab quality and the time it takes. [...] It’s harder with the older age groups and sometimes with the younger age groups because they have to take time off work to come in, so it’s just… it’s a lengthy process. […] What we find is it takes ages to marker somebody up and get them ready to test, and then the testing doesn’t take that long. […] And of course the labs got to be free, […] there’s this whole load of logistics go into it and there’s always something.”
There was also some variance in what would be considered a ‘large’ or ‘small’ sample, depending on the aims of the research and the sensitivity of the analysis. Those researchers who reported larger samples described not only having easier measures to collect (for example questionnaires, or routine clinical imaging), but also the ability to pool these measures once taken. For those using validated questionnaires, there was also the opportunity in some cases to increase their sample size by accessing existing large databases.
IP07: “For electronic health records, if it’s primary care data, you can get thousands of participants. […] Secondary care data is less widely available for research. We’re accessing our electronic health records from a single local hospital.”
IP03: “RNA sequencing basically sequences everything in your sample. So, you can be trying to map against 20,000 different genes in each of your samples. So, they are very large datasets.”
Minimum data collection requirements
Participants were asked whether there are any existing guidelines regarding minimum datasets used in their area of research, or whether introducing such guidelines in the future might be possible. There was strong agreement across all disciplines that there are not currently any formal guidelines or frameworks covering minimum data collection requirements. Some participants observed that there are some common data collection methods across different studies, although they were not aware of any central resource providing information on which researchers are using which methods.
IP02: “I think there’s so many inconsistencies, how people capture the data, the capture rates, the type of data and we don’t seem to have any standards or guidelines to say this is the bare minimum.”
It was felt that since most OA research is designed on a study-by-study basis and methodology is determined by the research question, using minimum datasets would not be practical. Participants emphasised that, first and foremost, the study design must be appropriate for the question, and that this often means different measures and methods are considered to be of highest importance in different studies.
IP09: “A core data set might look quite different in a clinical trial of knee OA to hand OA to an observational cohort to a cohort that was designed for predictive modelling, so they may have very different things that they would consider absolutely essential. Or a cohort that doesn’t have OA yet to a cohort that already has OA. […] if we’re going to say, mandate a core set, […] you have to be really clear what settings you are requiring that in and that is appropriate for all the people you are talking to.”
Additionally, it was noted that even where the same standardised measures are used in different studies, they may not be used in the same way or at the same time points. Preferences for adapted or personalised uses of equipment, such as motion capture marker placement, were also a consideration which may make comparing datasets difficult.
IP02: “I suppose what you get a lot with optical tracking is everyone wants their own unique marker set because they all think theirs is better, but it then means that there’s lots of data out there that’s maybe not quite so easy to cross reference and link together.”
The participants were divided on whether they felt that a set of core values could be determined and implemented on all studies. Some felt that this would not be possible for the reasons outlined above; others noted that the feasibility of this would be improved if it were managed by a large organisation such as the MRC or another research council. Participants in support of the idea of minimum datasets with a view to post-hoc data linkage felt that being able to re-use data would be a positive step, particularly in studies which are very resource-intensive or costly.
IP04: “The MRC have set up the Biobank, haven't they, which is a good exemplar of what can be done, […] You could have a common core that different centres could use, that might be a way to improve it.”
Alongside the discussion of standardising data collection, participants also commented that even when using clinical data there are inconsistencies which make data pooling difficult. In particular, one participant felt that the nomenclature used across OA is poorly defined, and the standard International Classification of Diseases (ICD-10) codes can be too varied for effective database searching. A number of reasons were cited for this, including different paths to diagnosis and different presentations of OA. The participant suggested that a framework could be developed to streamline the codes used and provide guidance for clinicians on coding OA, so that researchers may more easily use the data.
IP09: “There’s different nomenclature that people use, subgroups, phenotypes, subsets, various sort of classifiers from that point of view. […] A recent barrier we’ve had. […] So if you are wanting to search for patients who might be eligible for studies, it’s a bit of a minefield and not an efficient way. […] if you’re running a study in diabetes or cardiovascular disease, you’ve got much more efficient ways of searching for people. […] I think probably having some kind of musculoskeletal framework or osteoarthritis framework that encouraged people to use particular codes, to have some guidance there, use them early and be consistent would be really great.”
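To illustrate how consistent coding could simplify database searching, the sketch below groups diagnosis codes by their three-character ICD-10 category. It assumes the ICD-10 arthrosis block M15 to M19 as the working definition of OA; the patient records are hypothetical, the function name `oa_site` is introduced here for illustration, and a real code list would require clinical review and may differ between coding systems and settings.

```python
# Three-character ICD-10 categories in the arthrosis block (M15 to M19).
# Treating this block as "OA" is a simplifying assumption for illustration.
OA_CATEGORIES = {
    "M15": "polyarthrosis",
    "M16": "hip (coxarthrosis)",
    "M17": "knee (gonarthrosis)",
    "M18": "first carpometacarpal joint",
    "M19": "other arthrosis",
}


def oa_site(icd10_code):
    """Return the OA category for an ICD-10 code, or None if the code
    falls outside M15 to M19. Tolerates case and punctuation variation."""
    normalised = icd10_code.strip().upper().replace(".", "")
    return OA_CATEGORIES.get(normalised[:3])


# Hypothetical records showing inconsistent formatting of the same code system.
records = [
    {"patient": "A", "code": "M17.1"},
    {"patient": "B", "code": "m16.0"},
    {"patient": "C", "code": "E11.9"},
    {"patient": "D", "code": "M199"},
]

eligible = {r["patient"]: oa_site(r["code"]) for r in records if oa_site(r["code"])}
print(eligible)
```

A shared, agreed code list of this kind would not resolve differences in how clinicians reach or record a diagnosis, but it would allow searches across sites to be run against the same definition.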
IP07 also felt that clinical data collection could be improved in order to facilitate research, and had been working on this from a structural point of view.
IP07: “We’re looking to try and structure the data collection in clinical care so that it’s then also more useful for research.”
Attitudes towards post-hoc data harmonisation and pooling
A number of data sharing approaches were discussed and evaluated by the interview participants, and barriers and enablers of each were identified. Participants did see benefits to having access to larger datasets.
IP01: “Coming into it there is a lot of new stuff to learn but I think once we get over these sort of teething processes, access to bigger data sets will obviously mean for better studies.”
It was generally agreed that harmonising heterogeneous data may be too time- and resource-consuming and may risk diluting or invalidating findings, particularly when resources such as the Osteoarthritis Initiative (OAI) (9) exist and provide large scale data collected in a robust manner.
IP05: “Now there’s also never enough studies going on that are collecting things in a systematic way that may make it worthwhile. And are people collecting data better than was collected in the nine-year follow-up of the osteoarthritis initiative, which is freely available now for anybody to use?”
Rather than homogenising data, participants instead felt that combining already similar datasets would be more appropriate, but only if there is a sufficiently persuasive argument for adding impact to the findings. There were other advantages seen to combining datasets, including the potential for acceptance into higher impact journals.
IP06: “The more evidence you have for something, the more compelling it is. So, you tend to combine as much data as you can to show that something genuinely is happening. And the positive side of that is it can get in a more prestigious journal as well. To work in that way you kind of combine data, but you’re not homogenising as such, it’s providing additional support for a hypothesis.”
IP09: “You have to have a really good question and have a persuasive reason for people actually putting loads of effort in, because it’s quite a pain, the sort of legal side of data sharing. […] You know, is this in the interests of the research that we set out to do?”
Barriers to sharing data
Though participants felt that harmonisation of different datasets may not be the right choice for OA research, they did agree that in some situations data sharing and pooling may be possible. In discussing how this might work practically, participants were first asked to identify the barriers which might be currently preventing data sharing from happening.
Data management and storage
Participants were in strong agreement that the logistics of sharing data are the biggest barrier and felt that storage was a considerable challenge. There were two main strands to this challenge – determining an appropriate and capable data storage solution and generating the funding for it. For non-physical data, a cloud database was seen as the most appropriate solution, but setting this up was not considered to be simple. The main difficulties within the issue of storage were seen as data security and being able to accommodate the size of the datasets. It was clear that the issue of mass data storage, particularly in readiness for sharing, is a very new concept in the field of OA research and as such is not yet governed by any best practice guidelines.
IP01: “Because I’ve got some long term cohorts I understand the need for… sort of long term data management, that was never an agenda when I was doing things ten years ago and I think it’s only experienced researchers are probably coming to this now, people are all starting to twig this is an important thing.”
Concerns were raised about the responsibility of ensuring that data sharing can be conducted securely, including preparation of the data and also users downloading it safely. Participants mentioned using so-called ‘safe havens’: secure portals designed to allow access to sensitive data without the need to transfer it or download it. The suggestion of cloud storage was seen as viable for the management of such large data, but there remained questions about who would take responsibility for this, how it would be funded and the protocols and processes by which it might be managed.
IP05: “You have to have a safe site to download things to, and you have to prove that you’ve got all the data security on your sites before you can get downloads. It’s quite laborious and complex. And it takes many months after you take a download before you can clean all the data up and start to do anything with it […] so you need big servers set up to deal with this, and then appropriate software for dealing with big data.”
IP09: “It has to be carefully approached and thought through and there have to be clear analysis plans and data management plans, so you can’t do it in a kind of half-baked way.”
With digital data, there was also a concern about the size and format of imaging files, which are not only very large files but also often stored on NHS systems where anonymisation is not necessary. Should these files be required to be downloaded or stored elsewhere, agreed-upon anonymisation protocols would be essential. This would present new challenges. The process of removing patient identifiers from imaging is problematic; this can result in either reducing the usefulness of the data by also removing key information, or conversely can miss elements which would make patient identification possible. This may compromise ethical boundaries in some cases and would need to be considered if and when images are transferred from NHS secure systems to local research systems.
IP05: “MR images, DICOM images are large, stored on people’s routine hospital PACS systems, where they don’t have to be anonymised, because only relevant clinicians can access them. But, for research purpose, they would have to be anonymised in a very good system before they could be shared. […] And the problem is, if you strip off all the identifiers, it may adversely affect the image analysis that’s done later where certain types of image analysis need to know some things about the sequences.”
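The trade-off described above, between removing identifiers and retaining the acquisition details needed for later image analysis, can be sketched as a simple allow-list approach. The example below is a minimal illustration operating on a plain dictionary rather than a real image file: the attribute names follow standard DICOM keywords, but the two lists, the example header and the function name `deidentify` are assumptions made for illustration. It does not address pixel data or burned-in annotations, and a real project would follow an agreed protocol and use a validated tool.

```python
# Attributes that directly identify a patient or site (illustrative, not exhaustive).
IDENTIFYING = {"PatientName", "PatientID", "PatientBirthDate",
               "InstitutionName", "AccessionNumber"}

# Acquisition attributes that later image analysis may depend on (illustrative).
NEEDED_FOR_ANALYSIS = {"Modality", "RepetitionTime", "EchoTime",
                       "SliceThickness", "MagneticFieldStrength"}


def deidentify(header):
    """Return (clean_header, review_list).

    Known identifiers are dropped, known acquisition attributes are kept,
    and anything on neither list is withheld and flagged for manual review,
    since free-text fields can contain identifying details.
    """
    clean, review = {}, []
    for key, value in header.items():
        if key in IDENTIFYING:
            continue
        if key in NEEDED_FOR_ANALYSIS:
            clean[key] = value
        else:
            review.append(key)
    return clean, sorted(review)


# Hypothetical header for a knee MR image.
header = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "Modality": "MR",
    "RepetitionTime": 3000.0,
    "EchoTime": 30.0,
    "StudyDescription": "Knee MRI for Mrs Doe",
}

clean, review = deidentify(header)
print(clean)
print("Needs manual review:", review)
```

Keeping an explicit list of retained attributes, rather than deleting only known identifiers, means that unexpected fields are withheld by default, at the cost of requiring researchers to agree in advance which acquisition details their analyses need.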
Ethical considerations
Another major consideration when discussing data sharing was ethical clearance to do so. Participants noted that as studies often span several years, the ethical applications for studies ending now were written prior to the idea of data sharing becoming more common. Therefore, many ethical documents make no mention of data sharing, or perhaps explicitly state that this will not happen. In these cases, seeking consent from participants retroactively can be problematic. It was also noted that since the introduction of the General Data Protection Regulation (GDPR) in the UK (13), researchers are held to more stringent ethical guidelines.
IP05: “When we set things up 10 years ago, [we] didn’t think we [would] want to come back and dip into things again. […] [The] ability to re-contact people, it has to be in your consent forms. […] But, of course, you have to identify your patients if you’re going to go back, and how did you keep a record of them, why did you keep a record of them when you shouldn’t have after the finish of the study?”
IP05 also highlighted that not only must consent be taken for future sharing of data, but that researchers hold a responsibility to be clear with participants about what their data may be used for. This was seen as important not only from the perspective of informing the participants and obtaining true informed consent, but also at the later stage of determining whether to grant access to other researchers, and whether their proposed use would meet the description given at the consent stage.
IP05: “Apart from GDPR, it’s the issue of what did people give consent for? And most people in their studies weren’t thinking five years ahead, or 10 years ahead, or pooling their data with other people. […] This to me is main issue number one, it’s how do you get the community to include certain phrases, like you should be providing phrases and we’d say, ‘Put these, make sure these are in your ethics’.”
It was, however, generally agreed that people who participate in OA research are happy for their data to be shared, provided there is a well-established rationale for doing so. Participants reported that more recent ethics applications had been updated to include the option for data sharing at a later stage, and felt that their participants showed no changes in willingness to consent since these clauses were introduced.
IP06: “We’ve noticed that OA patients are delighted that somebody is investigating their disease and wants to know something about it.”
Governance
A further challenge identified when discussing data sharing was the management of the process itself and governing appropriate and ethical use of the data. Participants agreed that the responsibility for this must rest with the original data custodian, and as such, there need to be robust processes in place to maintain data security. This was seen as time consuming and costly, and potentially outside of the expertise of the researchers depending on the set-up required. There was a cautious attitude towards the practicalities of data sharing, and a recognition that the impact of doing so improperly would be serious. Participants also felt that where secondary analysis has been completed, the original researchers should be properly credited, and there could be guidelines for doing this appropriately.
IP01: “Above all you have to be sure that appropriate data is being safely released, or safely used. I know that in an organisation it is paramount. [...] So, you have got to be very careful, we have got to have processes.”
Participants who had experience of setting up (or beginning to set up) data sharing processes felt that there were different ways to navigate these challenges. IP02 described a panel approach, whereby researchers wishing to access the data would write an application, which would then be assessed against predetermined guidelines for data use. This proposed approach was also seen as appropriate in terms of anonymisation and could provide clear guidance for anonymisation standards and procedures.
IP02: “So they can apply to a panel that makes sure that you adhere and you can then say ‘yes, you can have this data on these grounds’ and you sign up to it, and there’s all the governance that goes with it.”
It was, however, acknowledged that the resources required to facilitate this process would be significant, and may require full-time administrative responsibility from someone outside the research team.
IP07: “There would be a data access application process; there would be a review of that application; […] we have data sharing agreements that people have to sign and sort of terms and conditions of use as well in terms of how to acknowledge the data source.”
Use of data from databanks and databases
Alongside the potential to share study data between researchers, there exist a number of databases and databanks with purposively collected datasets or curated and collated data from primary sources. Participants were aware of several of these data repositories and had varied experience in accessing them and using the data. Some of the examples mentioned were the Osteoarthritis Initiative (OAI) (9), the Clinical Practice Research Datalink (CPRD) (13), the Imperial Tissue Bank (14), REDCap (15), OpenClinica (16) and the UK Biobank (17). There were also institution-specific databases run by Universities and research centres.
The concept of using data repositories in itself was viewed positively, both for increasing sample size and thus improving statistical power, and for reducing replication of others’ previous work. One participant felt that using these datasets was potentially a viable alternative to costly and time-consuming randomised controlled trials (RCTs), should the relevant data be available. Those working with tissue samples were able to access specimens which would otherwise go to waste, by applying for them via a databank.
IP03: “For my group with ten samples, someone else's group with ten samples, clever people can combine those datasets and then you increase the power of your analysis.”
IP09: “We have a musculoskeletal tissue bank here and we will tend to use that to acquire tissue samples that would be essentially waste tissue for [joint] replacements.”
IP04: “Registries are increasingly important now for studying lots of things. I think they're getting more respect as well. Indeed, they are sometimes suggested as an alternative to clinical trials, because RCTs are incredibly difficult and costly to do.”
Other uses of databanks included using existing data to answer a specific question and generate/support a hypothesis, and then following this up with a more bespoke original study in the lab. The use of databanks in this context was seen as a cost-effective method to prove a concept, which could then be developed further.
IP06: “Within the UK Biobank there are measures relating to the musculoskeletal system, so it’s possible to identify individuals that do have osteoarthritis and then do a genetic analysis of those patients […]. But once that’s done, that just tells you the genetic signal. The next thing is to go in the lab and try and work out what that genetic signal is doing to gene function.”
The application and access processes when using these datasets were generally viewed as appropriate on a governance level, but there were varied experiences in terms of ease of access. Some participants reported positive experiences, whereas others felt that the process was a steep learning curve and somewhat bureaucratic. It was agreed, however, that stringent protocols are appropriate to protect the data.
IP03: “It's normally pretty easy. You do have key words. It might take about half an hour to get your head around it but it's pretty simple to do.”
IP01: “So, it’s not a simple thing where anyone can just say, ‘I want this data and I am going to run these tests.’”
IP07: “It’s a large, complex data set that you have to, kind of, slowly come to understand. […] It is a learning curve with it, like there is with most things and once you understand it, then it becomes that bit more easy to use.”
Once in possession of the raw data, processing and preparation can still be a considerable task. IP07 described taking a collaborative approach and sharing their data preparation programmes on open-source platforms to help other researchers do this more efficiently in future.
IP07: “There’s a whole data preparation step that is complicated. […] when we first did it, it took us, like, over a year to go from the receipt of the raw data to the data ready for analysis, and we’ve written, kind of, programmes and scripts to make that more efficient, and we have shared those on GitHub and Zenodo repositories so that other people can do that more efficiently than we did, to begin with.”
Attitudes towards partnership working and collaboration in OA research
Collaboration in OA research was viewed as potentially useful and important, but not necessarily a widespread approach currently. IP03 in particular felt that when collecting tissue, researchers tend to be collaborative due to the difficulty of obtaining samples.
IP03: “I think in OA people are pretty collaborative to be honest, because we know how difficult it is to get tissues in the first place.”
Several participants felt that collaboration was difficult in such a small field, as competition for funding is high and researchers can be protective of their data. This was seen as slowly changing, with many researchers being open to collaboration and data sharing if certain barriers are considered. It was noted that sharing data beyond the scope of the original study can require significant additional effort in order to prepare it, including data storage solutions and potential costs. Despite this, participants felt that collaborating and working together could potentially improve research outcomes, if effective ways of doing so could be determined.
IP01: “The thing about the randomised trial, someone has taken it and done an enormous amount of work, it’s taken five, if not ten, years to get the final paper out […] and that data needs to be packaged and if people could access it, not just for metanalysis purposes but to draw different data and you see that more and more. There are some groups in England are now pooling different data sets to look at certain questions. […] I think the collaboration side of it is still a relatively new thing.”
IP07: “Sharing experiences, sharing codes would be useful and we know people don’t really do that, and there are reasons why they don’t […] But then that is in conflict with the transparency that we should have with research, and the, you know, spending the money from research councils and charities efficiently, we should very much should be sharing.”
Some participants felt that there is potentially work being duplicated within OA research, with very similar studies happening and little communication between research centres. This was seen as being due to several reasons, but ultimately communication was seen as a key contributing factor. Whilst participants acknowledged that sometimes similar studies are needed, there was also an acceptance that with improved communication and collaboration, data sharing might be a positive step forward.
IP02: “I think the problem is as a group of people interested in arthritis if we all pull together a lot of the time we’re saying the same messages but we use a different language or different way, and if we could come forward with a better dialogue that shows that we are all saying the same thing we could be much more effective as a community.”
Though many participants felt that collaboration within OA research is possible, and potentially a positive approach, it was clear that this is not something to be forced. Participants preferred to allow collaboration to happen naturally, and where appropriate, rather than being mandated by frameworks. However, it was generally agreed that there may be space for the introduction of resources in order to connect researchers with each other, with expert collaborators/advisors, and to disseminate information about what research is being conducted.
IP09: “We can’t force people into a model of collaboration, but I think we can provide platforms that help make it easier for people if they want to engage. I think I would probably approach it that way.”