The ACT NOW CE Study was a multi-site, retrospective chart review conducted between July 1, 2016 and June 30, 2017 “to inform the design of a clinical trial to improve care and outcomes for infants with neonatal opioid withdrawal syndrome (NOWS)”.16 Thirty sites from the Environmental influences on Child Health Outcomes IDeA States Pediatric Clinical Trials Network (ECHO ISPCTN) and the Eunice Kennedy Shriver National Institute of Child Health and Human Development’s (NICHD) Neonatal Research Network (NRN), distributed across the U.S., participated in the study. The medical records of approximately 1,800 infants with NOWS were abstracted across all sites; a subset of over 200 cases underwent a formalized QC process to identify data quality errors and to determine the association between MRA and data quality.
To evaluate the MRA process, continuous QC monitoring was performed throughout the ACT NOW CE Study. This process required a certain percentage of cases at each site to be re-abstracted by a second, independent abstractor from that site. Prior to the start of the study, (1) the acceptable error rate threshold was set (no greater than 4.93%, or fewer than 500 errors per 10,000 fields),15 (2) a formal abstraction guideline was developed to ensure consistency in data collection across abstractors and sites, and (3) each abstractor (primary abstractor and QC-abstractor) received extensive MRA and QC training. At a minimum, each site performed QC on the first 3 cases it abstracted (QC1). Additional QC “events” were then required as the site’s case count grew: one randomly selected case for every 25 cases abstracted and entered into the electronic data capture (EDC) system (QC25, QC50, etc.). Accordingly, the total number of QC Events conducted per site corresponded to the total number of cases abstracted at the site. Seven distinct QC Events were observed over the course of the study (QC1, QC25, QC50, QC75, QC100, QC125, QC150).
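The QC Event schedule above can be sketched as a small function. This is an illustrative helper (the name `qc_events_required` and the list-of-labels output are our own), not code from the study:

```python
def qc_events_required(total_cases_abstracted: int) -> list[str]:
    """Return the QC Events a site must complete, given its case count.

    Hypothetical helper mirroring the schedule described above: QC1
    covers the site's first 3 abstracted cases, then one additional
    QC Event (QC25, QC50, ...) is owed for every 25 cases entered
    into the EDC.
    """
    events = ["QC1"]
    events += [f"QC{n}" for n in range(25, total_cases_abstracted + 1, 25)]
    return events

# A site that abstracted 80 cases would owe QC1, QC25, QC50, and QC75.
print(qc_events_required(80))   # → ['QC1', 'QC25', 'QC50', 'QC75']
```

Under this scheme, the largest sites (150+ cases) reach QC150, matching the seven distinct QC Events observed in the study.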
A high-level overview of the QC process follows. The primary abstractor performed MRA on a set of cases (up to 3 for QC1 and 1 additional case for each QC Event thereafter). The site notified the Data Coordinating and Operations Center (DCOC) once the specified number of cases had been entered into the EDC and ceased all data collection and entry until QC was completed. Using a random number generator, the DCOC identified the case(s) for QC and notified the site’s QC-abstractor, who independently abstracted the assigned case(s). The QC-abstractor could not see how the primary abstractor had identified the data elements within the EHR; in effect, the QC-abstractor carried out their abstraction and data entry as if it were a completely new case. Upon completion, an automated script compared the data entered for each QC case (primary- vs. QC-abstractor) and generated a report listing the discrepancies.
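The comparison step can be sketched as a field-by-field diff of the two abstractors' entries. This is a minimal illustration; the field names and record structure are invented for the example and are not the study's actual EDC schema:

```python
def discrepancy_report(primary: dict, qc: dict) -> list[tuple]:
    """Compare primary- vs. QC-abstractor entries field by field.

    Any inconsistency between the two entries is flagged as a
    discrepancy, mirroring the automated script described above.
    Returns (field, primary_value, qc_value) tuples.
    """
    fields = sorted(set(primary) | set(qc))
    return [(f, primary.get(f), qc.get(f))
            for f in fields if primary.get(f) != qc.get(f)]

# Illustrative entries: the two abstractors disagree on one field.
primary_entry = {"birth_weight_g": 3100, "apgar_5min": 9, "treatment": "P"}
qc_entry      = {"birth_weight_g": 3100, "apgar_5min": 8, "treatment": "P"}
print(discrepancy_report(primary_entry, qc_entry))   # → [('apgar_5min', 9, 8)]
```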
The system considered any inconsistency between the primary- and QC-abstractor as a discrepancy; by design, it was highly sensitive and flagged every inconsistency in data entry. Once the report was generated, an informaticist and site manager from the DCOC met with the site (both the primary- and QC-abstractors) via video conference and reviewed the results of the discrepancy report. During the review, the site referenced its EHR to identify the true value for each discrepancy, and the team determined which discrepancies were true errors. A discrepancy was considered a true error if the primary abstractor had entered data into the EDC that was inconsistent with the EHR (the gold standard). If the primary abstractor’s data entry matched the EHR, the discrepancy was not considered a true error (even if the QC-abstractor’s entry did not match). The error rate was then calculated and shared with the site along with a corrective action plan.
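The adjudication rule above can be expressed as a simple filter over the discrepancy report. A minimal sketch, assuming illustrative field names and (field, primary_value, qc_value) discrepancy tuples; this is not the study's adjudication code:

```python
def true_errors(discrepancies: list[tuple], ehr_values: dict) -> list[tuple]:
    """Filter a discrepancy report down to true errors.

    Per the rule described above, a discrepancy is a true error only
    when the primary abstractor's entry disagrees with the EHR (the
    gold standard); if the primary entry matches the EHR, the
    QC-abstractor's mismatch does not count.
    """
    return [(field, primary, qc)
            for field, primary, qc in discrepancies
            if primary != ehr_values.get(field)]

# Two discrepancies; the EHR shows the primary abstractor was right
# about apgar_5min, so only los_days counts as a true error.
discrepancies = [("apgar_5min", 9, 8), ("los_days", 14, 15)]
ehr = {"apgar_5min": 9, "los_days": 15}
print(true_errors(discrepancies, ehr))   # → [('los_days', 14, 15)]
```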
If a site exceeded the acceptance criteria for a given QC Event, the site was required to repeat the event on another 3 randomly selected cases. Cases for repeat QC Events were selected by the DCOC as follows: the DCOC first identified the treatment type (pharmacologic vs. non-pharmacologic), referred to as the case type, most prominent in the site’s current set of abstracted cases, and then randomly selected 2 cases of the most prominent case type and 1 case of the other case type for re-abstraction. If the site also exceeded the acceptance criteria for the repeat QC, the site (primary- and QC-abstractors) was required to participate in retraining and to perform (and pass) another QC before continuing with the study. If the site was within the acceptable limits, it was able to continue with data collection.
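The repeat-QC sampling scheme can be sketched as follows. The `cases` mapping of case IDs to case types is an invented structure for illustration, and the example assumes both case types are present at the site:

```python
import random

def select_repeat_qc_cases(cases: dict, rng: random.Random) -> list[str]:
    """Select 3 cases for a repeat QC Event, per the scheme above:
    2 from the more prominent case type (pharmacologic "P" vs.
    non-pharmacologic "NP") and 1 from the other.
    """
    by_type = {"P": [], "NP": []}
    for case_id, case_type in cases.items():
        by_type[case_type].append(case_id)
    # The prominent type is whichever has more abstracted cases.
    prominent = "P" if len(by_type["P"]) >= len(by_type["NP"]) else "NP"
    other = "NP" if prominent == "P" else "P"
    return rng.sample(by_type[prominent], 2) + rng.sample(by_type[other], 1)

# Illustrative site with 3 pharmacologic and 2 non-pharmacologic cases:
cases = {"c1": "P", "c2": "P", "c3": "P", "c4": "NP", "c5": "NP"}
print(select_repeat_qc_cases(cases, random.Random(0)))
```

With "P" prominent here, the selection always contains 2 pharmacologic cases and 1 non-pharmacologic case, regardless of the random seed.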
The Error Rate Calculation framework, outlined in the Good Clinical Data Management Practices (GCDMP) guidelines,17 was used to describe error rates, the distribution of the error rates, and the error rates over time. Simply put, the error rate is the ratio of the number of data errors detected to the total number of data fields collected:

Error rate = (Number of data errors detected) / (Total number of data fields collected)
For this study, we initially calculated the crude MRA error rates, along with Wald 95% confidence intervals (CIs), over time. Error rates were calculated using all fields (“all-field” error rate) and using only populated fields (“populated-field” error rate) to provide an optimistic and a conservative measurement, respectively, accounting for the variability in how error rates are calculated and reported in the literature. We then derived an adjusted MRA error rate, along with a 95% CI, using a generalized estimating equation model to account for the clustering of cases within sites.
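The crude error rate and its Wald CI can be computed directly from the counts. A minimal sketch of the GCDMP-style calculation described above, with made-up counts; this is not the study's analysis code:

```python
import math

def error_rate_wald_ci(n_errors: int, n_fields: int, z: float = 1.96):
    """Crude error rate (errors / fields inspected) with a Wald 95% CI.

    The Wald interval is p ± z * sqrt(p * (1 - p) / n), using the
    normal approximation to the binomial.
    """
    p = n_errors / n_fields
    se = math.sqrt(p * (1 - p) / n_fields)
    return p, (p - z * se, p + z * se)

# Illustrative counts: 50 true errors found across 10,000 fields.
rate, (lo, hi) = error_rate_wald_ci(50, 10_000)
print(f"rate {rate:.4f}, 95% CI ({lo:.4f}, {hi:.4f})")
```

Swapping the denominator between all fields and populated fields yields the "all-field" and "populated-field" variants described above.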
Of the 1,808 total cases, 219 were selected for QC. Four of those cases were excluded from the analysis because data entry issues prevented a full QC report from being generated, leaving an analytic sample of 215 QC cases. When calculating the all-field and populated-field error rates, the study population was divided into two groups, or case types, based on how the infant was treated for NOWS: with pharmacologic therapy (P) or with only non-pharmacologic therapies (NP). The number and types of data elements varied between the two groups (the pharmacologic treatment group requiring more variables), differences that could cause variation in the number of errors. After reviewing the adjusted error rates by case type, we combined the case types when calculating changes in error rates over time, as the differences in adjusted error rates by case type were not statistically significant and did not warrant further investigation. We derived both crude and adjusted populated-field error rates at each of the 7 distinct QC Event times. For each set of crude and adjusted populated-field error estimates, we fitted a separate time series regression with a time trend as the independent variable, and we report the slope estimates with corresponding 95% confidence limits.
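The time-trend fit reduces to regressing the error rate on QC Event order. A simplified stand-in for the time series regression described above, using an ordinary least-squares slope and entirely hypothetical error rates:

```python
def trend_slope(rates: list[float]) -> float:
    """Ordinary least-squares slope of error rate on QC Event order
    (1, 2, ..., len(rates)). A negative slope indicates error rates
    declining across successive QC Events.
    """
    n = len(rates)
    xs = range(1, n + 1)
    xbar = sum(xs) / n
    ybar = sum(rates) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, rates))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den

# Hypothetical populated-field error rates at QC1 through QC150:
rates = [0.062, 0.055, 0.050, 0.048, 0.044, 0.041, 0.039]
print(f"slope per QC Event: {trend_slope(rates):+.4f}")
```

The study's regression additionally reports 95% confidence limits on the slope, which this sketch omits.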
The overall error rates for the ACT NOW CE Study were also compared to error rates from the literature to assess the impact formalized MRA training and continuous QC monitoring could have on data quality in clinical research. Prior work conducted by Zozus and colleagues3,4 synthesized the existing data quality literature across various clinical research studies. Quantitative data accuracy information was abstracted from the articles and pooled. Manuscripts were categorized by type of secondary data use, data processing method, and data accuracy assessment. Information on the number of errors identified and the number of fields inspected was collected for each manuscript. From this work, we referenced 71 MRA-centric studies to conduct a meta-analysis of the overall error rate reported in the existing literature for comparison against the ACT NOW CE Study. Based on residual and leave-one-out diagnostics, we identified 5 studies as potential outliers; these were removed from the final meta-analysis used to obtain the estimated error rate from the literature.
To derive an overall MRA error rate for comparison with the current study, we performed a meta-analysis of single proportions based on an inverse variance method and a generalized linear mixed model approach using the R package “metafor”.18 Additionally, we analyzed the data from each site within the ACT NOW CE Study separately to obtain site-specific error estimates and used meta-analysis to obtain a single common error estimate.19 Next, we compared the all-field and populated-field error rates from our study with the error estimate derived from the literature, based on independent meta-analyses, using a Wald-type test statistic. As a sensitivity analysis, we also compared the meta-analytic error rate estimate with our study estimate using a meta-regression model that included an indicator moderator distinguishing the literature studies from our study. The meta-analysis was limited to a comparison of overall error rates, as the studies identified in the literature did not provide the granularity needed to support a comparison by data or case type.
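The pooling step can be illustrated with a simplified fixed-effect, inverse-variance estimator. This is a stand-in for exposition only: the study used the R package "metafor" with a generalized linear mixed model, and the study counts below are invented:

```python
def pooled_error_rate(studies: list[tuple]) -> float:
    """Inverse-variance pooled proportion across studies.

    Each study is (n_errors, n_fields_inspected). Each study's
    proportion is weighted by the inverse of its binomial variance,
    so larger, more precise studies dominate the pooled estimate.
    """
    weights, estimates = [], []
    for n_errors, n_fields in studies:
        p = n_errors / n_fields
        var = p * (1 - p) / n_fields   # binomial variance of the proportion
        weights.append(1 / var)
        estimates.append(p)
    return sum(w * p for w, p in zip(weights, estimates)) / sum(weights)

# Three hypothetical studies: (errors found, fields inspected).
studies = [(120, 4_000), (75, 2_500), (40, 1_800)]
print(f"pooled error rate: {pooled_error_rate(studies):.4f}")
```

The pooled estimate always lies between the smallest and largest per-study proportions; the GLMM approach used in the study instead models the study-level counts directly on the logit scale.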