The use of real-world data (RWD) and the real-world evidence (RWE) derived from it has been widely adopted by pharmaceutical developers and a variety of decision makers, including providers, payers, health technology authorities, and regulatory agencies (Berger et al. 2015; Berger et al. 2016; Berger and Crown 2022; Daniel et al. 2018; Zou et al. 2021). Credible RWE can be created from good-quality RWD when investigated within well-designed and well-executed research studies (Berger and Crown 2022). Adoption and use of RWD have been complicated by concerns regarding whether particular sources of RWD are of “good quality” and “fit-for-purpose”. These concerns have become more urgent as regulatory agencies increasingly use RWD as external comparators for randomized clinical trials and explore whether non-interventional RWD studies can provide substantial supplementary evidence of treatment effectiveness.
The US Food and Drug Administration (FDA) draft guidance “Assessing Electronic Health Record and Medical Claims Data to Support Regulatory Decision Making” states that, for all study designs, it is important to ensure the reliability and relevance of data used to help support a regulatory decision (FDA 2021a). Reliability includes data accuracy, completeness, provenance, and traceability; relevance includes the availability of key data elements (exposures, outcomes, covariates) and a sufficient number of representative patients for the study.
The European Medicines Agency (EMA) has issued a draft framework for data quality for EU medicines regulation (EMA 2022). It defines data quality as fitness for purpose with respect to user needs in relation to health research, policy making, and regulation, and the degree to which the data reflect the reality they aim to represent (TEHDS EU 2022). It divides the determinants of data quality into foundational, intrinsic, and question-specific categories. Foundational determinants pertain to the processes and systems through which data are generated, collected, and made available. Intrinsic determinants pertain to aspects that are inherent to a specific dataset. Question-specific determinants pertain to aspects of data quality that cannot be defined independently of a specific question. The framework also distinguishes three levels of granularity of data quality: value level, column level, and dataset level. The dimensions and metrics of data quality are divided into the following categories: reliability, extensiveness, coherence, timeliness, and relevance.
- Reliability (precision, accuracy, plausibility) evaluates the degree to which the data correspond to reality.
- Extensiveness (completeness and coverage) evaluates whether the data are sufficient for a particular study.
- Coherence examines the extent to which different parts of a dataset are consistent in representation and meaning. This dimension is subdivided into format coherence, structural coherence, semantic coherence, uniqueness, conformance, and validity.
- Timeliness is defined as the availability of data at the right time for regulatory decision making.
- Relevance is defined as the extent to which a dataset presents the elements required to answer a research question.
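Several of these dimensions can be operationalized as simple programmatic checks on a dataset. The sketch below is purely illustrative and not drawn from the EMA framework or any screening tool discussed here: the record fields, the plausible birth-year range, and the check functions are all assumptions chosen for the example. It shows how a completeness metric (extensiveness), a plausibility range check (reliability), and a duplicate-ID check (coherence/uniqueness) might look in code.

```python
from datetime import date

# Toy patient records; field names and values are illustrative only.
records = [
    {"id": "p1", "birth_year": 1980, "outcome": "MI"},
    {"id": "p2", "birth_year": 1875, "outcome": None},  # implausible birth year
    {"id": "p2", "birth_year": 1875, "outcome": None},  # duplicate record
]

def completeness(records, field):
    """Extensiveness: fraction of records with a non-missing value for `field`."""
    return sum(r[field] is not None for r in records) / len(records)

def plausibility_violations(records, lo=1900):
    """Reliability: IDs of records whose birth year falls outside a plausible range."""
    return [r["id"] for r in records
            if not (lo <= r["birth_year"] <= date.today().year)]

def duplicate_ids(records):
    """Coherence/uniqueness: patient IDs that appear more than once."""
    seen, dups = set(), set()
    for r in records:
        (dups if r["id"] in seen else seen).add(r["id"])
    return sorted(dups)

print(round(completeness(records, "outcome"), 2))  # 0.33
print(plausibility_violations(records))            # ['p2', 'p2']
print(duplicate_ids(records))                      # ['p2']
```

In a real screening exercise such checks would be run at the value, column, and dataset levels described above, with thresholds agreed in advance for the specific research question.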
TransCelerate has issued a simpler framework entitled “Real-World Data Audit Considerations” that is divided into pillars of relevance, accrual, provenance, completeness, and accuracy (TransCelerate 2022). These frameworks are part of an ongoing dialogue among stakeholders from which international standards for “regulatory-grade RWD” will eventually emerge.
In the meantime, there is an immediate need for researchers with varying levels of RWD experience to have a screening tool to help them assess whether potential RWD sources are fit-for-purpose when designing studies whose purpose is to answer questions from regulatory agencies or to support claims regarding the benefits and risks of therapies. To this end, we developed such a tool, consistent with the frameworks above, that conforms to how researchers generally approach this critical issue and incorporates concepts from modern validity theory.
We took our cue on the definition of “fit-for-purpose” from the FDA draft guidance on selecting, developing, or modifying fit-for-purpose clinical outcome assessments (COAs) for patient-focused drug development, which is intended to help sponsors use high-quality measures of patients’ health in medical product development programs (FDA 2022). That guidance states that fit-for-purpose in the regulatory context means the same thing as valid within modern validity theory: validity is “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests,” and a clinical outcome assessment is considered fit-for-purpose when “the level of validation associated with a medical product development tool is sufficient to support its context of use” (FDA 2022). In epidemiology, the term validity comprises internal and external validity, which relate to study design and execution. We designed the RWD screening tool to focus on evaluation of the real-world data itself within the larger framework of modern validity theory (Royal 2017).
After all, as Wilkinson notes, “good data management is not a goal in itself, but rather is the key conduit leading to knowledge discovery and innovation, and to subsequent data and knowledge integration and reuse by the community after the data publication process” (Wilkinson 2016). Wilkinson proposed the FAIR principles for the management of RWD generated by public funds (although they are also applicable to datasets created in the private sector) (Wilkinson 2016). Data sources should be Findable, Accessible, Interoperable, and Reusable. These recommendations are complemented by the recommendations of the Duke-Margolis white paper “Determining Real-World Data’s Fitness for Use and the Role of Reliability” (Mahendraratnam et al. 2019) that explored whether RWD are fit-for-purpose by the application of rigorous verification checks of data integrity.
While experts in modern validity theory have not reached consensus on the attributes of validity, there are basic tenets that most modern validity theorists have adopted (Royal 2017). Validity pertains to the inferences or interpretations made about a set of scores, measures, or, in this case, data sources, as opposed to their intrinsic properties. As applied to the evaluation of RWD sources, this means that they must be considered fit-for-purpose for generating credible RWE through well-designed and well-executed study protocols to inform decision making. Modern validity theory would suggest that an accumulation of evidence should be employed to determine whether this inference regarding RWD quality is adequately supported. Hence, the validity of a data source is a judgement on a continuum onto which new evidence is added, and it is assessed as part of a cumulative process because knowledge of multiple factors (e.g., new populations/samples of participants, differing contexts, new knowledge) is gained over time. This element of RWD source evaluation is not specifically recognized in the current recommendations of the FDA and the EMA.
One obstacle to developing a consensus on the evaluation of data quality is that many terms have been used to describe its dimensions and elements, and the terminology has been used inconsistently, despite efforts at harmonization (Kahn et al. 2016). Despite this, RWE derived from RWD focusing on the natural history of disease and the adverse effects of treatment has long been considered “valid” by decision-makers. Recently, the issue of data validity has become more urgent and a focus of regulatory initiatives, as the use of RWE derived from RWD is being expanded to inform decisions about treatment effectiveness and comparative effectiveness. These decisions demand a greater level of confidence and certainty in study results.
A crucial dimension in assessing the validity of data is the need for transparency and traceability/accessibility. This has been reinforced by the FDA in several recent draft guidance documents and the HMA-EMA (European Union’s Heads of Medicines Agencies-European Medicines Agency) Joint Big Data Taskforce Report (HMA-EMA 2019). The FDA guidance on Considerations for the Use of Real-World Data and Real-World Evidence to Support Regulatory Decision-Making for Drug and Biological Products states “If certain RWD are owned and controlled by third parties, sponsors should have agreements in place with those parties to ensure that all relevant patient-level data can be provided to FDA and that source data necessary to verify the RWD are made available for inspection as applicable” (FDA 2021b). The FDA noted in its Data Standards for Drug and Biological Product Submissions Containing Real-World Data Guidance for Industry that “during data curation and data transformation, adequate processes should be in place to increase confidence in the resultant data. Documentation of these processes may include but are not limited to electronic documentation (i.e., metadata-driven audit trails, quality control procedures, etc.) of data additions, deletions, or alterations from the source data system to the final study analytic data set(s)” (FDA 2021c).
Interest in the creation of “regulatory-grade” RWD/RWE has been spurred by the 21st Century Cures Act in the US and by ongoing initiatives in Europe, including the Innovative Medicines Initiative (IMI) GetReal and the HMA-EMA Big Data Joint Taskforce. As noted in the Framework for FDA’s Real-World Evidence Program, which evaluates the potential use of RWE to support a new indication for an already approved drug or to help satisfy post-approval study requirements, the strength of RWE submitted in support of a regulatory decision will depend on its reliability, which encompasses not only transparency in data accrual and quality control but also clinical study methodology, and on the relevance of the underlying data (FDA 2018).
The European Medicines Regulatory Network strategy to 2025 includes the creation of DARWIN (Data Analytics and Real-World Interrogation Network) (Arlett 2020; Arlett et al. 2021). It builds on the observation of the HMA-EMA Big Data Joint Taskforce Report (HMA-EMA 2019) that RWD is challenged by a lack of standardization, sometimes limited precision and robustness of measurements, missing data, variability in content and measurement processes, unknown quality, and constantly changing datasets. The report viewed the number of European databases that currently meet minimum regulatory requirements for content and that are readily accessible, citing Pacurariu et al. (2018), as “disappointingly low”. The International Coalition of Medicines Regulatory Authorities (ICMRA) has called for global regulators to collaborate on standards for incorporating real-world evidence into decision making (ICMRA 2022).
In developing the screening tool, we attempted to find the right balance between the granularity of requested information and the response burden. We defined the dimensions of data suitability (e.g., sufficiency of quality and fitness-for-purpose) in plain English terms consistent with existing frameworks as discussed above. Although we focus on the US and EU, the tool may have relevance to other jurisdictions as well.