OMOP, Data Quality, and the Principles of FAIR, CARE, and FIVE SAFES
The use of the OMOP-CDM aligns well with the need for systematic data evaluation and adherence to data quality standards, and with the Principles of FAIR (Findable, Accessible, Interoperable, and Reusable), CARE (Collective Benefit, Authority to Control, Responsibility, and Ethics), and the FIVE SAFES. Before use, OMOP-CDM data undergoes a rigorous data quality assessment, which includes checks for completeness, concordance, plausibility, and currency against the source EMR data (14). These quality checks are predefined and configured to run on datasets conforming to OMOP standards, and they can be executed using tools like Achilles, which is accessible via the OHDSI Data Quality Dashboard (15). In addition, the OMOP-CDM enables researchers to work within a secure, firewalled environment while applying advanced analytic and prediction techniques. This aligns with the principles of making data ‘FAIR’ (Findable, Accessible, Interoperable, and Reusable), ensuring that data is available for a wide range of research applications (16, 17). Data accessed through an OMOP-CDM also adheres to the CARE principles, as access operates within the governance framework established by the custodians of each local data repository. This framework ensures that data is used responsibly and ethically, benefiting both researchers and the broader community (16, 17).
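As an illustration of what completeness and plausibility checks involve, the following is a minimal sketch only. It is not the implementation used by Achilles or the OHDSI Data Quality Dashboard. The sample rows, the function names, and the 1900 lower bound for birth year are assumptions made for this example; the field names follow the OMOP-CDM `person` table.

```python
from datetime import date

# Hypothetical rows shaped like the OMOP-CDM `person` table (illustrative values only).
persons = [
    {"person_id": 1, "year_of_birth": 1980, "gender_concept_id": 8507},
    {"person_id": 2, "year_of_birth": None, "gender_concept_id": 8532},
    {"person_id": 3, "year_of_birth": 2190, "gender_concept_id": 8507},
    {"person_id": 4, "year_of_birth": 1955, "gender_concept_id": None},
]


def completeness(rows, field):
    """Completeness check: fraction of rows in which `field` is populated."""
    if not rows:
        return 0.0
    return sum(r.get(field) is not None for r in rows) / len(rows)


def implausible_birth_years(rows, earliest=1900, latest=None):
    """Plausibility check: person_ids whose populated year_of_birth lies
    outside [earliest, latest]. The bounds are assumptions for this sketch."""
    latest = date.today().year if latest is None else latest
    return [
        r["person_id"]
        for r in rows
        if r.get("year_of_birth") is not None
        and not earliest <= r["year_of_birth"] <= latest
    ]


report = {
    "year_of_birth_completeness": completeness(persons, "year_of_birth"),
    "gender_completeness": completeness(persons, "gender_concept_id"),
    "implausible_birth_years": implausible_birth_years(persons),
}
print(report)
```

In practice, the thresholds at which such checks pass or fail would be set by the data custodians rather than hard-coded as they are here.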
OMOP data adheres to the ‘Five Safes’ guiding principles by providing a structured and secure framework for managing and sharing healthcare data while ensuring that privacy and security are maintained. In doing so it delivers:
‘Safe Projects’ by allowing for the creation of secure, firewalled environments for data analysis. Researchers can undertake advanced analytic and prediction techniques within these controlled environments without direct access to raw patient-level data, ensuring that projects are conducted safely and in compliance with privacy regulations.
‘Safe People’ by restricting access to OMOP data to authorised individuals, such as researchers who have undergone appropriate training and hold the necessary permissions. This safeguards against unauthorised data access.
‘Safe Data’ through rigorous data quality checks and transformations that improve data accuracy and quality, supported by the standardised format and terminologies of the OMOP-CDM, which enhance data consistency.
‘Safe Settings’ because the OMOP-CDM can be configured to adhere to the specific governance and privacy frameworks established by the custodians of local data repositories. This ensures that data is used in accordance with the policies and procedures set by the data custodians.
‘Safe Outputs’ because the results that researchers generate from OMOP data are aggregated and anonymised to prevent the identification of individual patients. This protects patient privacy while allowing valuable research insights to be shared (Table 1).
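One common way to keep aggregated outputs non-disclosive is small-cell suppression, sketched below. The text does not state which disclosure-control method is applied, so this technique, the threshold of 5, the function name `safe_counts`, and the sample records are all assumptions made for illustration.

```python
from collections import Counter

# Assumed threshold; in practice the data custodian sets this value.
MIN_CELL_COUNT = 5


def safe_counts(records, key, min_cell_count=MIN_CELL_COUNT):
    """Count records per value of `key`, replacing any count below the
    threshold with None so that small groups are not reported."""
    counts = Counter(r[key] for r in records)
    return {
        group: (n if n >= min_cell_count else None)
        for group, n in counts.items()
    }


# Hypothetical patient-level records, reduced to a single grouping field.
records = [{"age_band": "40-49"}] * 7 + [{"age_band": "90+"}] * 2
print(safe_counts(records, "age_band"))
```

Only the suppressed aggregate would leave the secure environment; the patient-level records stay inside it.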
Table 1
Guiding Principles of FAIR, CARE, and the FIVE SAFES
The FAIR Guiding Principles |
F | Findability | Metadata and data should be easily found by both humans and computers through the assignment of a globally unique and persistent identifier, enabling the automatic discovery of datasets and services by machines (16). |
A | Accessibility | Metadata and data should be easily retrieved by authorised and authenticated users via a standard communication protocol (16). |
I | Interoperability | Data from one source can be integrated with data from other sources and aggregated into a single, unified view. Interoperability also covers the integration and exchange of applications, analysis, storage, and workflow processing across different data sources (16). |
R | Reusability | Metadata and data characteristics are specified in detail to enable replication and/or linkage in different settings. Reusability includes the release of data usage licenses, provenance details, and disclosure around community standards relevant to the domain (16). |
The CARE Principles |
C | Collective Benefit | Data use should deliver collective benefit, with Indigenous Peoples’ rights and wellbeing as a primary concern (17) |
A | Authority to Control | Indigenous Peoples’ rights and interests in data about their peoples, communities, cultures, and territories are recognised and clearly articulated (17) |
R | Responsibility | Researchers have a responsibility to develop and nurture respectful relationships with the Indigenous Peoples from whom the data originate (17) |
E | Ethics | Minimise harm and maximise benefit for Indigenous Peoples, for justice and future use (17) |
The Five Safes Framework |
People | Safe People | Is the researcher appropriately trained and authorised to access and use the data? (18) |
Projects | Safe Projects | Is the data used for an appropriate purpose that is valid and of public benefit? (18) |
Settings | Safe Settings | Do the IT access controls and physical environment prevent unauthorised use? (18) |
Data | Safe Data | Has appropriate and sufficient protection been applied to the data to avoid risk of disclosure? (18) |
Outputs | Safe Outputs | Are the statistical results non-disclosive? (18) |