Background No standards exist for the handling and reporting of data quality in health research. This work introduces a data quality framework for observational health research data collections with supporting software implementations to facilitate harmonized data quality assessments.
Methods Developments were guided by the evaluation of an existing data quality framework and literature reviews. Functions for the computation of data quality indicators were written in R. The concept and implementations are illustrated based on data from the population-based Study of Health in Pomerania (SHIP).
Results The data quality framework comprises 34 data quality indicators. These target three aspects of data quality: compliance with pre-specified structural and technical requirements (integrity), presence of data values (completeness), and error in the data values (correctness). R functions calculate data quality metrics from the provided study data and metadata, and R Markdown reports are generated from the results. Guidance on the concept and tools is available through a dedicated website.
Conclusions The presented data quality framework is the first of its kind for observational health research data collections that links a formal concept to implementations in R. The framework and tools facilitate harmonized data quality assessments in pursuit of transparent and reproducible research. Application scenarios include data quality monitoring while a study is being carried out and initial data analysis before substantive scientific analyses begin.
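To make the three dimensions concrete, the following minimal sketch computes one illustrative indicator per dimension from study data and variable-level metadata. It is not the authors' R implementation: it is written in Python, and the variable `sbp`, the metadata fields `type`, `min`, and `max`, and the function `assess_variable` are hypothetical names chosen only for this example.

```python
# Hypothetical study data: one record per participant, None marks a missing value.
study_data = [
    {"id": 1, "sbp": 120.0},
    {"id": 2, "sbp": None},
    {"id": 3, "sbp": 400.0},
    {"id": 4, "sbp": 135.0},
]

# Hypothetical metadata: expected data type and admissible range per variable.
metadata = {"sbp": {"type": float, "min": 60.0, "max": 300.0}}


def assess_variable(records, name, meta):
    """Return one illustrative indicator per data quality dimension."""
    values = [record.get(name) for record in records]
    observed = [v for v in values if v is not None]

    # Integrity: observed values that do not match the expected data type.
    type_mismatches = sum(1 for v in observed if not isinstance(v, meta["type"]))

    # Completeness: fraction of records without a value.
    missing = len(values) - len(observed)
    missing_fraction = missing / len(values) if values else 0.0

    # Correctness: correctly typed values outside the admissible range.
    out_of_range = sum(
        1
        for v in observed
        if isinstance(v, meta["type"]) and not (meta["min"] <= v <= meta["max"])
    )

    return {
        "type_mismatches": type_mismatches,
        "missing_fraction": missing_fraction,
        "out_of_range": out_of_range,
    }


result = assess_variable(study_data, "sbp", metadata["sbp"])
print(result)
```

Keeping the expectations in a separate metadata structure, rather than hard-coding them in the check, mirrors the idea stated in the abstract that metrics are calculated from study data together with metadata, so the same function can be reused across variables.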
Posted 10 Dec, 2020
On 10 Feb, 2021
Received 09 Feb, 2021
Received 20 Jan, 2021
On 19 Jan, 2021
On 03 Jan, 2021
Invitations sent on 13 Dec, 2020
On 08 Dec, 2020
On 08 Dec, 2020
On 08 Dec, 2020
On 01 Dec, 2020