We generated and linked three data sets: 1) a retrospective clinical chart review as a reference standard; 2) original ICD-10-CA coded data; and 3) re-coded ICD-11 coded data (Figure 1). To date, data collection is complete, and analyses are underway.
Sample Size and Cohort
Based on previous findings (3) on the sensitivity and prevalence of conditions in a sample of ICD-10-CA data, we estimated that 3000 records were required to test a 10% difference in sensitivity for common conditions such as myocardial infarction (12.8%), cardiac arrhythmia (21.8%), hypertension (30.2%), and others. Lachenbruch’s midpoint method (4) was used.
The study cohort was a random sample of discharges between January 1 and June 30, 2015, from three hospitals in Calgary, Alberta. Patients were between 18 and 104 years of age and had a valid Alberta Personal Health Number (PHN). Obstetric admissions were excluded because of their short stays and the absence of the chronic conditions of interest. From each hospital, the 1100 records with the lowest random chart numbers were selected. If a patient had multiple discharges during the study period, we randomly selected one discharge record for that patient. The additional 100 records per site allowed for missing or excluded charts.
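The selection steps above can be sketched in Python as follows. This is a minimal illustration, not the study's actual procedure: the record keys (`patient_id`, `hospital`, `age`, `obstetric`), the function name, and the order of the steps (one discharge per patient first, then the lowest random numbers per hospital) are all assumptions, and the PHN validity check is omitted.

```python
import random
from collections import defaultdict

def select_cohort(discharges, per_site=1100, seed=2015):
    """Sketch of the cohort selection; all field names are hypothetical."""
    rng = random.Random(seed)

    # Eligibility: adults aged 18 to 104, obstetric admissions excluded.
    eligible = [d for d in discharges
                if 18 <= d["age"] <= 104 and not d["obstetric"]]

    # Keep one randomly chosen discharge per patient.
    by_patient = defaultdict(list)
    for d in eligible:
        by_patient[d["patient_id"]].append(d)
    one_per_patient = [rng.choice(records) for records in by_patient.values()]

    # Assign each record a random number and keep the lowest per hospital.
    by_hospital = defaultdict(list)
    for d in one_per_patient:
        by_hospital[d["hospital"]].append((rng.random(), d))

    cohort = []
    for numbered in by_hospital.values():
        numbered.sort(key=lambda pair: pair[0])
        cohort.extend(d for _, d in numbered[:per_site])
    return cohort
```

With `per_site=1100` and three hospitals, the function returns at most 3300 records, which matches the 3000 required records plus 100 spare records per site.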
Chart Review Dataset
Internal validation of a dually coded database involves measuring how well codes, selected from both ICD-10-CA and ICD-11, represent the diagnoses identified by chart reviewers, in terms of sensitivity, specificity, positive and negative predictive values.
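As an illustration of the four validity measures named above, the following sketch compares a coded flag against the chart review reference standard for one condition. The function and variable names are hypothetical, and the inputs are assumed to be one boolean per record in the same record order.

```python
def validity_measures(coded, reference):
    """Compare coded flags with chart review flags for a single condition.

    coded, reference: equal-length sequences of booleans, one per record.
    Returns None for any measure whose denominator is zero.
    """
    if len(coded) != len(reference):
        raise ValueError("coded and reference must have the same length")

    tp = fp = fn = tn = 0
    for c, r in zip(coded, reference):
        if c and r:
            tp += 1      # coded and confirmed by chart review
        elif c and not r:
            fp += 1      # coded but not found by chart review
        elif r:
            fn += 1      # found by chart review but not coded
        else:
            tn += 1      # absent in both

    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

The same function could be applied once to the ICD-10-CA flags and once to the ICD-11 flags, so that both coding systems are judged against the same chart review data.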
Data Dictionary
We replicated and expanded the chart review approach from our prior study on the validity of ICD-10-CA (3). We selected 51 medical conditions, including the Charlson and Elixhauser (5,6) comorbidity conditions, and up to three harms. We chose these conditions because they appear in other validation studies (3,7) and are commonly used for risk adjustment. Some definitions were based on the literature (5,6) and our prior validation study (3). Where no published definition was available, ICD-11 Browser definitions (beta version) were used (2). Chart review conditions are listed in Table 1.
Table 1 Chart Review Conditions for Data Collection
Conditions
| 1. Angina | 18. Inflammatory bowel disease | 35. HIV/AIDS |
| 2. Myocardial infarction (new) | 19. Liver disease | 36. Disorders due to tobacco use |
| 3. Myocardial infarction (old) | 20. Cancer | 37. Dyslipidemia |
| 4. Congestive heart failure | 21. Malignancy without metastases | 38. Disorders due to alcohol use |
| 5. Cardiac arrhythmias | 22. Malignancy with metastases | 39. Disorders due to drug use |
| 6. Atrial fibrillation | 23. Leukemia | 40. Psychoses |
| 7. Atrial flutter | 24. Lymphoma | 41. Anxiety |
| 8. Valve disease | 25. Renal disease | 42. Depression |
| 9. Pulmonary circulatory disorders | 26. Rheumatologic disease | 43. Homeless |
| 10. Hypertension | 27. Diabetes | 44. Urinary tract infection |
| 11. Peripheral vascular disease | 28. Hypothyroidism | 45. Pneumonia |
| 12. Cerebrovascular disease | 29. Coagulopathy | 46. Skin/wound infection |
| 13. Paralysis | 30. Anemia | 47. Gastroenteritis |
| 14. Chronic pulmonary disease | 31. Fluid & electrolyte disorder | 48. Other infection |
| 15. Asthma | 32. Obesity | 49. Sepsis |
| 16. Peptic ulcer disease | 33. Significant weight loss | 50. Pressure ulcer |
| 17. Gastrointestinal bleed | 34. Dementia | 51. Sleep disorders |
Harms
| Harm 1 | Harm 2 | Harm 3 |
Detailed condition definitions are available in the Data Dictionary for ICD-11 Field Trial [see Additional file 1].
Chart Access
Patient charts were available in paper and electronic (hybrid) form in each hospital’s health records department. Electronic content was accessed in Sunrise Clinical Manager™ (SCM).
Chart Review Team
Six nurse chart reviewers underwent extensive training on the data extraction process, led by the research coordinator. Training involved learning the data dictionary definitions and following a consistent order when reviewing the chart documents. To test the data definitions, all reviewers practiced identifying the medical conditions in the same five charts. Discrepancies between the reviewers were discussed and the data dictionary was refined. We then proceeded with inter-rater reliability (IRR) testing, described below.
The nurse reviewers examined the entire chart for the presence of specific health conditions. These reviewers were blinded to the ICD codes assigned by the coders.
ICD-10-CA Coded Dataset
We used previously coded charts because the existing ICD-10-CA dataset represented a “real-life” sample of coding practices. Alberta hospitals employ trained clinical coders (CCs) (i.e., nationally certified health information management specialists) who read through patient hospital charts. These CCs assigned ICD-10-CA diagnosis codes to describe each patient’s hospitalization, based on the Canadian ICD-10-CA coding standards (8). Each discharge record contains a unique identification number for the admission and up to 25 fields for diagnosis codes. These records became the study dataset.
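To make the record structure concrete, the sketch below shows one way a condition could be flagged from the diagnosis fields of a discharge record. It is an assumption-laden illustration: the function name, the prefix-matching rule, and the example prefix `I10` are hypothetical stand-ins, not the condition definitions used in this study (those are in the data dictionary).

```python
MAX_DIAGNOSIS_FIELDS = 25  # each discharge record holds up to 25 diagnosis codes

def has_condition(diagnosis_codes, code_prefixes):
    """Return True if any diagnosis code starts with one of the given prefixes.

    diagnosis_codes: list of code strings; empty fields may be None or "".
    code_prefixes: tuple of prefixes written without dots, e.g. ("I10",).
    """
    if len(diagnosis_codes) > MAX_DIAGNOSIS_FIELDS:
        raise ValueError("a record has at most 25 diagnosis fields")
    # Normalize by dropping dots and upper-casing so "i10.0" matches "I10".
    normalized = [c.replace(".", "").upper() for c in diagnosis_codes if c]
    return any(code.startswith(tuple(code_prefixes)) for code in normalized)

# Illustrative record: the codes and the prefix list are examples only.
record_codes = ["I21.4", "I10", None]
print(has_condition(record_codes, ("I10",)))
```

Applying such a function to every record and condition yields one coded flag per record, which can then be compared with the chart review flag for the same record.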
Re-coded ICD-11 Dataset
The third phase involved re-coding the same inpatient charts using ICD-11.
Training Material Development
The research coordinator and a member of the Canadian Institute for Health Information (CIHI) developed the ICD-11 training materials. Materials included three slide sets covering ICD-11 concepts and tools (9). Materials for coding practice included two sets each of Morbidity and Quality and Safety Case Scenarios. Coding rules and decision trees for hospital-acquired conditions (harms) were developed in conjunction with the WHO Quality and Safety Technical Advisory Group (TAG). The full training process is presented in Eastwood et al. (submitted) (10).
Clinical Coding Team
Six professional CCs were hired and trained. Trainers included a team from the University of Calgary, CIHI, and WHO experts in ICD-11 concepts. Training involved 20 classroom hours and approximately 40 hours of coding practice homework before coding full hospital charts. The coding team and trainers then met monthly during the coding phase to discuss coding issues. Because ICD-11 coding rules were limited, ICD-11 coding decisions were based on what was available at the time in the draft WHO ICD-11 Reference Guide (11), the WHO ICD-11 Coding Tool (12), and the Canadian ICD-10-CA coding standards (8).
Analysis
Test Inter-rater Reliability (IRR) of Chart Review
To test agreement between reviewers, IRR involved two nurses reviewing sets of the same 10 charts. Agreement was checked for the presence of the 17 Charlson conditions. Where agreement was poor (kappa<0.60), retraining took place and chart review resumed in batches of 10 charts until agreement was high (kappa>0.8) (13). High agreement was reached after two reviewers completed 49 sets of records. Reviewers then independently extracted data from the remaining charts over several months. Data were entered into a secure electronic data collection tool, REDCap (version 7.6.9, ©2018 Vanderbilt University). IRR was not available for the previously coded ICD-10-CA dataset.
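For readers unfamiliar with the kappa statistic, the following is a minimal sketch of Cohen's kappa for two raters. We assume here that Cohen's form is the one meant; the text above says only "kappa". The function name and example ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items.

    rater_a, rater_b: equal-length sequences of labels
    (e.g. 1 = condition present, 0 = condition absent).
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over labels of the product of marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)

    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of one condition across a batch of 10 charts.
nurse_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
nurse_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(round(cohen_kappa(nurse_1, nurse_2), 2))
```

In this example the raters agree on 9 of 10 charts and chance agreement is 0.5, so kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8, which sits exactly at the threshold and would not yet count as high agreement under the kappa>0.8 criterion.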
Test Inter-rater Reliability of ICD-11 Coded Charts
IRR involved 60 full charts coded by two CCs, similar to the chart review IRR process above. IRR focused on consistent coding of the main condition, given the large number of possible codes generated from full hospital charts. After the first 40 charts, a kappa of 0.50 was reached on the main condition parent code (the highest level in the ICD-11 condition hierarchy). Training continued, differences were discussed, and experts were engaged for guidance. After the next 20 charts were coded, a kappa of 0.88 was reached for main condition parent codes and independent coding commenced. The CCs were blinded to the original ICD-10-CA codes and the chart review data.