Drug and natural health product data collection and curation in the Canadian Longitudinal Study on Aging (CLSA)

doi:10.21203/rs.3.rs-3085472/v1

Purpose

Mapping drug and natural health product (NHP) data to standardized terminologies is central to the analysis of these data. This study aimed to develop an efficient data collection and curation process for all drugs and NHP used by Canadian Longitudinal Study on Aging (CLSA) participants.

Methods

The 3-step sequential data collection and curation process consisted of: 1) mapping drug inputs to the Health Canada Drug Product Database (DPD), 2) algorithm-recoding of unmapped drug and NHP inputs, and 3) manual recoding. A gold standard manually recoded input was established by two pharmacy technicians. The proportion of algorithm-correctly recoded inputs was calculated as the number of algorithm-correctly recoded inputs, based on the gold standard, divided by the number of algorithm-recoded inputs.

Results

Among the 30,097 CLSA Comprehensive cohort participants, 26,000 (86.4%) were using a drug or an NHP, with a mean of 5.3 (SD 3.8) inputs per participant-user, for a total of 137,366 inputs. Of those inputs, 70,177 (51.1%) were mapped to the Health Canada DPD, 20,729 (15.1%) were recoded by algorithms and 44,108 (32.1%) were manually recoded. In a validation sample (n = 1407 inputs), the Direct algorithm correctly classified 99.4% of drug and 99.5% of NHP inputs for which a gold standard could be established. In another validation sample of 329 manually recoded free-text inputs, consensus was reached by 2 recoders for 89.7% of drug and 74.8% of NHP inputs.

Conclusion

We developed an efficient 3-step process for drug and NHP data collection and curation for use in a longitudinal cohort.

Keywords: drug, natural health product, CLSA, older adults, longitudinal cohort

Large databases of health information are an important resource to study the use and outcomes of health services including the use of medications [1–5]. Information on the prevalence, incidence and duration of drug therapy is important in health research, health system planning and assessment of appropriate prescribing for treatment patterns and burden [4, 6–8]. Moreover, as the global population of adults 65 years and older continues to grow, so will the need for timely and accurate information not only on prescribed medications but also on non-prescription medications and natural health products (NHP). Standardized coding and classification of medication data can improve the efficiency of data collection and curation processes, which are complex because inputs arrive in heterogeneous formats including generic names (e.g. acetaminophen), trade names (e.g. Tylenol) and numeric drug identifiers (e.g. 02046040) [9, 10].

The mapping of medication data to standardized terminologies such as the RxNorm ontology [11] has been proposed to allow efficient analysis and interpretation of drug data [9, 10]. The performance of this mapping to standardized terminologies has been evaluated with medication data from hospital pharmacy systems [12, 13], electronic health records (EHRs) [14], a drug adverse events database [15], multi-site clinical trials [16, 17], and longitudinal cohorts [17]. For prospective clinical studies, the ASPREE [16] clinical trial in older adults and the 45 and Up study [18] reported a method of structured medication data collection based on a list of common medications with the option of free-text data entry for other medications. Both studies used a structured process of automated and manual coding for the curation of the free-text data by medication experts [16, 18]. Systematic approaches for the curation of large free-text medication data have involved automated and manual approaches [9, 15].

The Canadian Longitudinal Study on Aging (CLSA) is a population-based research platform established to better understand how biological, medical, psychological and social determinants have an impact in maintaining health and in the development of disease and disability as people age [19, 20]. The complete documentation of all drugs and NHP used every 3 years over 20 years in a cohort of more than 30,000 participants requires efficient data mapping and curation processes. In this article, we describe a 3-step process for the data entry and mapping of drug data to the Health Canada Drug Product Database (DPD) by CLSA interviewers, as well as the development and validation of a cleaning process of free-text/numeric drug and NHP inputs in a software algorithm approach followed by manual recoding.

Study population

The recruitment and baseline evaluations of the 51,338 CLSA participants aged 45–85 years at enrolment were completed in 2015 [19]. The complete CLSA cohort is composed of the Tracking cohort of 21,241 participants who provide data via telephone interviews and the Comprehensive cohort of 30,097 participants who provide data via in-person home interviews and visits to a data-collection site. Comprehensive participants provided data in English or French on all regularly used drugs and NHP.

Drug and NHP data collection / mapping drug data to Health Canada database

In the first of a 3-step process, drug and NHP data were entered in the CLSA data collection software by interviewers who were trained to identify the relevant information from medication packaging. During an in-home visit, CLSA interviewers asked participants to present all regularly scheduled or taken medications (i.e., scheduled, once a day, every other day, taken occasionally, as required), including prescription, non-prescription, over-the-counter (OTC), herbals, vitamins or NHP in all routes of administration. The interviewer entered either the generic name (e.g. atorvastatin), trade name (e.g. Lipitor) or drug identification number (DIN) (e.g. 02230711) in a type-to-search box that mapped the drug input to the Health Canada DPD and generated a list of corresponding generic or trade drug names. When no adequate drug name correspondence was found, the name or DIN was entered as a free-text/numeric input. Since the type-to-search box was not mapped to the Health Canada Licensed Natural Health Products Database (LNHPD), NHP were entered as free-text/numeric inputs. The interviewer also recorded information about the dosage, frequency, duration, start date and indications for use.

Drugs authorized for sale by Health Canada are listed in the Health Canada DPD [21], which contains information notably on product name, list of active ingredients, DIN and World Health Organization (WHO) anatomical therapeutic chemical (ATC) classification. NHP licensed by Health Canada are listed in the Health Canada LNHPD [22], which contains information notably on product name, the product’s medicinal ingredients, the product’s non-medicinal ingredients and natural product number (NPN). The NHP database does not include ATC codes. Both databases are updated nightly.

Algorithm recoding

In a second step, sequential algorithms were applied to map free-text (drug or NHP names) or numeric (DINs or NPNs) inputs to the products of the Health Canada drug and NHP databases. Seven algorithms were developed in a software algorithm approach independent of the sample data (Table 1). The algorithms were run sequentially such that once an input was matched, it was no longer considered in the remaining algorithms. For a given input, the first algorithm attempted to map the input to the drug followed by the NHP database before moving on to the next algorithm. The Direct and Code algorithms were run first since they only ever matched a single input to a single drug or NHP, while the Word and Simple algorithms at times found multiple matches. In cases of multiple matches due to numerous dosage strengths, the input was matched to the suitable drug or NHP with the lowest DIN or NPN.
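The sequential logic described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the work was done in SQL and PHP); the function name `recode`, the two toy matchers and all identifiers in the sample catalogues are hypothetical. The sketch assumes each algorithm can be expressed as a predicate on an input and a product name, tries the drug catalogue before the NHP catalogue, stops at the first algorithm that matches, and keeps the lowest DIN or NPN when several products match.

```python
from typing import Callable, Optional

Predicate = Callable[[str, str], bool]   # (input text, product name) -> matched?
Catalogue = list[tuple[str, str]]        # (identifier, product name); DIN or NPN


def recode(text: str,
           algorithms: list[tuple[str, Predicate]],
           drugs: Catalogue,
           nhps: Catalogue) -> Optional[tuple[str, str, str]]:
    """Return (algorithm, source, identifier) for the first algorithm that matches."""
    for label, matches in algorithms:                     # fixed algorithm order
        for source, catalogue in (("drug", drugs), ("NHP", nhps)):  # drug first
            hits = [ident for ident, name in catalogue if matches(text, name)]
            if hits:
                # Several products may match (e.g. dosage strengths):
                # keep the one with the lowest identifier.
                return label, source, min(hits, key=int)
    return None  # still unmapped: left for manual recoding


# Illustrative data only; these identifiers are made up.
drugs = [("00000005", "TYLENOL SUPER"),
         ("00000002", "TYLENOL SUPER"),
         ("00000009", "TYLENOL ALLERGY")]
nhps = [("80000001", "VITAMIN D")]
algorithms = [
    ("Direct", lambda text, name: text == name),
    ("Word", lambda text, name: f" {name} " in f" {text} "),
]

print(recode("TYLENOL ALLERGY", algorithms, drugs, nhps))
print(recode("LARGE TYLENOL SUPER RELIEF 100MG", algorithms, drugs, nhps))
print(recode("HYPERTENSION MEDICATION", algorithms, drugs, nhps))
```

Returning as soon as one algorithm matches mirrors the rule that a matched input is no longer considered by the remaining algorithms.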

Table 1. Developed algorithms

Name	Description	Examples
Code	The input is compared to the DIN or NPN. A match is found when the input is identical to the DIN or NPN. There can only ever be one match.	The input “02275619” matches the DIN “02275619”. In comparison, the input “0227-5619” does not match the DIN “02275619”.
Direct	The input is compared to the drug or NHP’s name. A match is found when the input is identical to the drug’s or NHP’s name (including all special characters, spaces, etc.). There can only ever be one match.	The input “TYLENOL ALLERGY” matches the drug name “TYLENOL ALLERGY”. In comparison, the input “TYLENOL ALLERGY 100MG” does not match the drug name “TYLENOL ALLERGY”.
Word	The input is compared to the drug or NHP’s name. A match is found when the drug’s or NHP’s name is found as a sub-string within the input. Spaces are considered such that only whole words can be matched. There may be multiple matches.	The input “LARGE TYLENOL SUPER RELIEF 100MG” matches the drug name “TYLENOL SUPER”. In comparison, the input “LARGE TYLENOL SUPERIOR RELIEF 100MG” does not match the drug name “TYLENOL SUPER”.
Simple	The input with all non-alpha-numeric characters removed is compared to the drug or NHP’s name with all non-alpha-numeric characters removed. A match is found when the two altered names are identical. There may be multiple matches.	The input “TYLENOL-ALLERGY (50-MG)” (transformed into “TYLENOLALLERGY50MG”) matches the drug name “TYLENOL ALLERGY 50MG” (transformed into “TYLENOLALLERGY50MG”). In comparison, the input “TYLENOL-ALLERGY (50-MG)” (transformed into “TYLENOLALLERGY50MG”) does not match the drug name “TYLENOL ALLERGY” (transformed into “TYLENOLALLERGY”).
Reverse-word	This algorithm is identical to “Word”, but the input is searched as a sub-string within the drug or NHP.
No-units	All units of measurement are removed from the input.	The input “ASPIRIN COATED CAPLETS 500MG” would have the units, 500MG, removed and become “ASPIRIN COATED CAPLETS”.
Predefined	List of common drugs and NHPs established by our team. Inputs with predefined names would get coded first.	Aspirin, Vitamin B, Vitamin C, Vitamin D, multivitamin, etc.

DIN, Drug identification number; NHP, Natural health product; NPN, natural product number.
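The matching rules in Table 1 can be sketched as small Python predicates, checked here against the table's own examples. This is an illustration under assumptions, not the study's PHP/SQL code: the function names are hypothetical, the Simple rule is assumed to keep only ASCII letters and digits, and the list of units recognised by `strip_units` (MCG, MG, ML, IU, G) is an assumption, since the source does not enumerate the units.

```python
import re


def code_match(text: str, identifier: str) -> bool:
    """Code: the input must be identical to the DIN or NPN."""
    return text == identifier


def direct_match(text: str, name: str) -> bool:
    """Direct: the input must be identical to the product name."""
    return text == name


def word_match(text: str, name: str) -> bool:
    """Word: the product name appears in the input as whole words."""
    return f" {name} " in f" {text} "


def reverse_word_match(text: str, name: str) -> bool:
    """Reverse-word: the input appears in the product name as whole words."""
    return word_match(name, text)


def simple_key(value: str) -> str:
    """Drop every character that is not an ASCII letter or digit (assumption)."""
    return re.sub(r"[^A-Za-z0-9]", "", value)


def simple_match(text: str, name: str) -> bool:
    """Simple: compare the two names after removing non-alphanumerics."""
    return simple_key(text) == simple_key(name)


# Assumed unit list; the source does not specify which units were removed.
UNIT_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s*(?:MCG|MG|ML|IU|G)\b", re.IGNORECASE)


def strip_units(text: str) -> str:
    """No-units: remove quantities such as 500MG, then tidy the spacing."""
    return " ".join(UNIT_PATTERN.sub(" ", text).split())


print(word_match("LARGE TYLENOL SUPER RELIEF 100MG", "TYLENOL SUPER"))
print(simple_key("TYLENOL-ALLERGY (50-MG)"))
print(strip_units("ASPIRIN COATED CAPLETS 500MG"))
```

Padding both strings with spaces in `word_match` is one simple way to enforce the whole-word rule, so that “SUPERIOR” does not match “SUPER”.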

Work was conducted using SQL (database scripting language) and PHP (general programming language). The Health Canada databases and CLSA data were loaded into a secure MySQL database using SQL. Some pre-processing was conducted on these databases before using PHP to enhance performance, increase speed of matching and make the computer algorithms more efficient. For instance, the Simple algorithm compared the unmapped inputs to drug and NHP names from the Health Canada databases by ignoring non-alpha-numeric characters. This was done by removing the non-alpha-numeric characters from both the unmapped inputs and the Health Canada databases names, then comparing the two. It would be slow to transform the drug names in this way every time a comparison is made. Instead, all drug names were electronically converted during this pre-process step once and used by the algorithm every time a match was searched for. Another example is a list that was made of all identical drug and NHP names. The final version of the algorithm sequence and variables from the Health Canada databases are presented in Supplementary Text.
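The pre-processing idea for the Simple algorithm, transforming each database name once rather than at every comparison, can be sketched as a lookup table built ahead of time. The sketch below uses a Python dictionary in place of the MySQL tables used in the study; `build_simple_index` and the sample identifiers are hypothetical.

```python
import re
from collections import defaultdict


def simple_key(value: str) -> str:
    """Remove every character that is not an ASCII letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", "", value)


def build_simple_index(catalogue: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Transform every product name once and group identifiers by transformed name."""
    index: dict[str, list[str]] = defaultdict(list)
    for identifier, name in catalogue:
        index[simple_key(name)].append(identifier)
    return dict(index)


# Illustrative catalogue; identifiers are made up.
catalogue = [("00000002", "TYLENOL ALLERGY 50MG"),
             ("00000009", "TYLENOL ALLERGY")]

index = build_simple_index(catalogue)          # done once, before matching starts

# Each unmapped input now costs one transformation and one dictionary lookup.
hits = index.get(simple_key("TYLENOL-ALLERGY (50-MG)"), [])
print(hits)
```

With the index in place, the cost of matching an input no longer grows with the number of product names that would otherwise have to be transformed and compared one by one.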

As part of an iterative algorithm improvement approach, two pharmacists (LD, BC) independently recoded 40 unmapped drug and NHP inputs. The pharmacist-recoded inputs were compared to algorithm-recoded inputs during meetings of the research team, leading to algorithm refinement. This process of review – discussion – algorithm refinement was conducted three times for a total of 120 inputs, leading to two new algorithms: Predefined and No-units (Table 1). The greater complexity of recoding NHP inputs compared to drug inputs was identified early in this process and discussed throughout our work.

Manual recoding

In a third recoding step, following the application of the algorithms to the unmapped drug and NHP data, the remaining unmapped de-identified data were exported directly from the CLSA’s database to an Excel file for manual recoding by 3 pharmacy technicians. The same group of recoders conducted the recoding and validation work. The recoders’ work was supported by a set of decision rules (Supplementary Text) to assign selected NPNs for the most prevalent NHP inputs (e.g., NPN=80083109 for calcium).

Spelling dictionary

As inputs were manually recoded, common misspellings were compiled into a dictionary and applied to future iterations of the computer algorithms. In the pre-processing stage, all inputs containing any of the misspelled words in the dictionary were replaced with the correct spelling before the algorithms were run.
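A minimal sketch of this pre-processing step is shown below. The dictionary entries and the function name `correct_spelling` are hypothetical, and the sketch assumes that corrections are applied word by word to space-separated inputs; the source does not specify how multi-word misspellings were handled.

```python
# Hypothetical dictionary entries: misspelling -> correct spelling.
SPELLING = {
    "ASPRIN": "ASPIRIN",
    "ACETOMINOPHEN": "ACETAMINOPHEN",
}


def correct_spelling(text: str, dictionary: dict[str, str]) -> str:
    """Replace each misspelled word found in the dictionary; keep other words as-is."""
    return " ".join(dictionary.get(word, word) for word in text.split())


print(correct_spelling("ASPRIN COATED CAPLETS", SPELLING))
print(correct_spelling("VITAMIN D", SPELLING))
```

Running this step before the matching algorithms lets an exact-match rule such as Direct succeed on inputs that would otherwise fail because of a single typing error.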

Validation process

A validation sample of 100 Comprehensive cohort participants was randomly selected to evaluate the performance of the recoding algorithms and manual recoding. This sample included 352 free-text drug and NHP inputs for which a gold standard recoded input was determined independently by 2 recoders, with resolution of discrepancies by a pharmacist. A gold standard recoded input could not be established for some inputs due to insufficient input information. Differing commercial products of the same generic drug or NHP were considered to be in agreement. After this first validation, the algorithms were further refined and validated in a second sample of 544 Comprehensive cohort participants with 1407 unmapped free-text drug and NHP inputs. In this second validation, the gold standard recoded input was established by a single recoder, based on the recoders’ consensus measured in the first validation.

Analysis

Manual recoding was considered the gold standard for free-text inputs. The proportion of algorithm-correctly recoded inputs was calculated as the number of algorithm-correctly recoded inputs, based on the gold standard, divided by the number of algorithm-recoded inputs. In the primary analysis, the denominator included only the inputs for which a gold standard could be established in order to distinguish between drug and NHP. In a sensitivity analysis, the denominator included all algorithm-recoded inputs, regardless of gold standard coding, for a more conservative estimate that cannot differentiate between drug and NHP.
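The two denominators can be made concrete with the counts reported for the Direct algorithm in the first validation sample (Table 2): 96 drug and 53 NHP inputs had a gold standard, and 3 further algorithm-recoded inputs did not. The short Python sketch below reproduces the reported percentages; the variable and function names are illustrative only.

```python
def proportion_correct(correct: int, recoded: int) -> float:
    """Algorithm-correctly recoded inputs divided by algorithm-recoded inputs."""
    return correct / recoded


# Direct algorithm, first validation sample (counts from Table 2).
drug_correct, drug_recoded = 94, 96      # gold standard: drug
nhp_correct, nhp_recoded = 51, 53        # gold standard: NHP
no_gold_standard = 3                     # recoded by the algorithm, no gold standard

# Primary analysis: only inputs with a gold standard, split by drug and NHP.
primary_drug = proportion_correct(drug_correct, drug_recoded)
primary_nhp = proportion_correct(nhp_correct, nhp_recoded)

# Sensitivity analysis: all algorithm-recoded inputs in the denominator.
sensitivity = proportion_correct(
    drug_correct + nhp_correct,
    drug_recoded + nhp_recoded + no_gold_standard,
)

print(f"{primary_drug:.1%} {primary_nhp:.1%} {sensitivity:.1%}")
```

Inputs without a gold standard cannot be counted as correct, so adding them to the denominator can only lower the estimate, which is why the sensitivity analysis is the more conservative of the two.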

Mapping and recoding of drug and NHP inputs

Among CLSA’s 30,097 baseline Comprehensive cohort participants, 26,000 (86.4%) were using a drug or an NHP. Among drug or NHP users, a mean of 5.3 (SD 3.8) inputs per participant were documented for a total of 137,366 inputs. In the first of a 3-step process, interviewers mapped 70,177 (51.1%) of the 137,366 inputs to a drug in the Health Canada DPD (Figure 1). Of the remaining 67,189 unmapped inputs (Figure 1), 3,247 (4.8%) were pre-processed by the spelling dictionary. In step 2, the Direct and Code algorithms recoded 10,657 (7.8%) drug and 10,072 (7.3%) NHP inputs. In step 3 (manual recoding), 10,185 (7.4%) drug and 33,923 (24.7%) NHP inputs out of the 46,460 (33.8%) remaining unmapped inputs were manually recoded (Figure 1). Insufficient input information made it impossible to code 2,352 (1.7%) inputs (e.g., study drug, hypertension medication), which are made available to researchers as entered (Figure 1).
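As a reading aid, the flow of inputs through the three steps can be checked with a few lines of Python using only the counts reported above; the variable names are illustrative. Each percentage is expressed relative to the 137,366 total inputs.

```python
total = 137_366
mapped_by_interviewers = 70_177                 # step 1
algorithm_drug, algorithm_nhp = 10_657, 10_072  # step 2 (Direct and Code)
manual_drug, manual_nhp = 10_185, 33_923        # step 3
not_codable = 2_352                             # insufficient information

unmapped_after_step1 = total - mapped_by_interviewers
algorithm_recoded = algorithm_drug + algorithm_nhp
unmapped_after_step2 = unmapped_after_step1 - algorithm_recoded
manually_recoded = manual_drug + manual_nhp

# Every input ends in exactly one of four categories.
categories = {
    "mapped by interviewers": mapped_by_interviewers,
    "algorithm-recoded": algorithm_recoded,
    "manually recoded": manually_recoded,
    "not codable": not_codable,
}
for label, count in categories.items():
    print(f"{label}: {count} ({count / total:.1%})")
```

The four categories sum to the 137,366 inputs, and the inputs left after step 2 equal the manually recoded inputs plus those that could not be coded.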

Algorithm and manual recoding validation

First validation sample

From the first validation sample, 352 free-text inputs were submitted to algorithm recoding and reviewed by 2 recoders (and a pharmacist for non-consensus inputs) to establish a gold standard recoded input. Of these 352 inputs, 12 free-text inputs were recoded by neither the recoders nor the algorithms because of insufficient information. Of the remaining 340 inputs, 307 were recoded by the algorithms (Table 2). The Direct algorithm recoded the most inputs (49.5%), followed by the Word algorithm (22.5%). In the main analysis of the inputs for which a gold standard could be established, the Direct and Word algorithms correctly classified 97.9% and 59.3% of drug inputs and 96.2% and 30.6% of NHP inputs, respectively. In the sensitivity analysis of all algorithm-recoded inputs, the Direct and Word algorithms correctly classified 95.4% and 39.1% of inputs, respectively.

Table 2. Validation of algorithm recoding with manual recoding (gold standard) – first validation sample

Algorithm	Manual recoding (gold standard): not recoded	Manual recoding (gold standard): drug	Manual recoding (gold standard): NHP	Algorithm correctly recoded inputs, primary analysis: drug	Algorithm correctly recoded inputs, primary analysis: NHP	Algorithm correctly recoded inputs, sensitivity analysis: drug or NHP
Direct	3 (27.3%)	96 (65.3%)^a	53 (35.6%)	94 (97.9%)^b	51 (96.2%)^c	145 (95.4%)^d
No-units	0 (0.0%)	0 (0.0%)	2 (1.3%)	–	1 (50.0%)	1 (50.0%)
Predefined	1 (9.1%)	1 (0.7%)	55 (36.9%)	0 (0.0%)	46 (83.6%)	46 (80.7%)
Reverse-word	1 (9.1%)	19 (12.9%)	1 (0.7%)	15 (78.9%)	1 (100%)	16 (76.2%)
Simple	0 (0.0%)	4 (2.7%)	2 (1.3%)	4 (100%)	2 (100%)	6 (100%)
Word	6 (54.5%)	27 (18.4%)	36 (24.2%)	16 (59.3%)	11 (30.6%)	27 (39.1%)
All	11 (100%)	147 (100%)	149 (100%)	129 (87.8%)	112 (75.2%)	241 (78.5%)

NHP, Natural Health Product

a Percent of the manually recoded drug inputs also recoded by the Direct algorithm out of the 147 manually recoded drug inputs

b percent of correctly recoded drug inputs by the Direct algorithm out of the 96 recoded drug inputs by the Direct algorithm

c percent of correctly recoded natural health products inputs by the Direct algorithm out of the 53 recoded natural health products inputs by the Direct algorithm

d percent of correctly recoded inputs by the Direct algorithm out of the 152 recoded inputs by the Direct algorithm

Of the 352 drug and NHP inputs, consensus was reached by both recoders for 294 (83.5%) inputs. Of these 352 inputs, the recoders agreed that there was insufficient information to recode 21 inputs, which were excluded from the following subgroup analysis. Of the remaining 329 inputs, consensus was reached by the recoders for 156 (89.7%) of the 174 drug inputs and for 116 (74.8%) of the 155 NHP inputs. Based on these results, the second validation of the algorithms was conducted with a gold standard established by a single recoder. The recoders’ consensus was similar for algorithm-recoded inputs (83.4%) and non-algorithm-recoded inputs (84.4%).

Second validation sample

Of the 1407 free-text inputs of the second validation sample, 27 were recoded by neither the recoder nor the algorithms because of insufficient information. Of the remaining 1380 inputs, 1280 were recoded by the algorithms (Table 3). The Predefined algorithm recoded the most inputs (44.8%), followed by the Direct algorithm (29.0%). Modifications to the Predefined algorithm for the coding of vitamins explain the increase in recoded inputs from the first to the second validation sample. In the main analysis of the inputs for which a gold standard could be established, the Direct and Predefined algorithms correctly classified 99.4% and 86.4% of drug inputs and 99.5% and 78.2% of NHP inputs, respectively. In the sensitivity analysis of all algorithm-recoded inputs, the Direct and Predefined algorithms correctly classified 94.6% and 77.0% of inputs, respectively. Following the second validation, the Code and Direct algorithms were selected for step 2 algorithm-recoding of the unmapped free-text inputs of the baseline Comprehensive cohort participants.

Table 3. Validation of algorithm recoding with manual recoding (gold standard) – second validation sample

Algorithm	Manual recoding (gold standard): not recoded	Manual recoding (gold standard): drug	Manual recoding (gold standard): NHP	Algorithm correctly recoded inputs, primary analysis: drug	Algorithm correctly recoded inputs, primary analysis: NHP	Algorithm correctly recoded inputs, sensitivity analysis: drug or NHP
Direct	18 (26.1%)	171 (49.0%)^a	182 (21.1%)	170 (99.4%)^b	181 (99.5%)^c	351 (94.6%)^d
No-units	1 (1.4%)	1 (0.3%)	3 (0.3%)	1 (100%)	2 (66.7%)	3 (60.0%)
Predefined	15 (21.7%)	59 (16.9%)	499 (57.9%)	51 (86.4%)	390 (78.2%)	441 (77.0%)
Reverse-word	4 (5.8%)	47 (13.5%)	23 (2.7%)	44 (93.6%)	1 (4.3%)	45 (60.8%)
Simple	3 (4.3%)	1 (0.3%)	14 (1.6%)	1 (100%)	13 (92.9%)	14 (77.8%)
Word	28 (40.6%)	70 (20.1%)	141 (16.4%)	64 (91.4%)	101 (71.6%)	165 (69.0%)
All	69 (5.4%)	349 (27.3%)^e	862 (67.3%)	331 (94.8%)	688 (79.8%)	1019 (79.6%)

NHP, Natural Health Product

a Percent of the manually recoded drug inputs also recoded by the Direct algorithm out of the 349 manually recoded drug inputs

b percent of correctly recoded drug inputs by the Direct algorithm out of the 171 recoded drug inputs by the Direct algorithm

c percent of correctly recoded natural health products inputs by the Direct algorithm out of the 182 recoded natural health products inputs by the Direct algorithm

d percent of correctly recoded inputs by the Direct algorithm out of the 371 recoded inputs by the Direct algorithm

We described a 3-step process for the mapping of drug and NHP data to Health Canada databases that included algorithm recoding of 15.1% of all drug and NHP inputs, with high agreement with gold standard manual recoding. The developed algorithms have saved and will continue to save significant manual recoding time considering the large volume of CLSA drug and NHP data collected every three years over twenty years. The 3-step process will enable the medication data collected from CLSA participants to be curated more efficiently and released as part of the CLSA research data platform for use by researchers. The process has the potential to be tested and applied in other large studies.

In the first of the 3-step process, CLSA in-person interviewers mapped 51% of 137,366 drug and NHP inputs to the Health Canada DPD. In CLSA, the mapping of all drugs and NHP, a much more diverse dataset, contrasts with the mapping to a selection of 2025 common medications in the multi-national ASPREE clinical trial in older adults [16] and to a list of the 32 most common medications used in the Australian population in the 45 and Up study [18].

In the second mapping step, 2 (Code and Direct) of the 7 developed algorithms were selected for algorithm-recoding of unmapped drug and NHP inputs. The limited number of selected algorithms highlights the need for a validation process to identify the challenging inputs in a specific dataset. In our final validation sample, the Direct algorithm correctly classified 99.4% of drug and 99.5% of NHP inputs among the inputs for which a gold standard could be established. Similar validations of drug mapping and recoding have been reported by other groups. In the 45 and Up study, the automated coding of drug terms first to generic names using the Systematised Nomenclature of Medicine – Clinical Terms followed by coding to the WHO ATC classification achieved positive predictive values above 95% and sensitivity of 79% at the exact ATC level, with higher sensitivity values for drugs than for vitamins and supplements [18]. The cleaning of drug names in the Food and Drug Administration Adverse Event Reporting System (FAERS) database resulted in standardization of 95% of drug names [15]. In another study on the FAERS database, drug name coverage of 93% was achieved in the mapping to RxNorm standard code ingredients [23]. With highly structured inpatient pharmacy data from the GEMINI database from 7 Canadian hospitals over 8 years, the use of existing RxNorm functionality resulted in sensitivity greater than 98.5% and an F-measure above 90.0% in the standardization of 13 selected drug classes [12].

In the third mapping step, the 33.8% of inputs that remained unmapped after step 2 were submitted to manual recoding, with higher consensus for drug than NHP inputs. The lack of adequate information on the herbal product itself is highlighted in the CONSORT statement on herbal interventions [24]. The mapping of the NHP inputs to Health Canada’s LNHPD adequately documents the CONSORT recommended product name and allows researchers using CLSA data to further detail the CONSORT recommended elements of ‘characteristics of the herbal product’. General NHP designations (e.g., multivitamins) were coded as per our decision rules.

Strengths and limitations

The main strength of our approach is the mapping and recoding of drug and NHP data to the standardized information of Health Canada’s drug and NHP databases. The availability of these regularly updated databases was essential to this project. This linkage included the WHO ATC categories for drugs, a derived variable particularly useful for researchers using CLSA data. Our sequential approach limited the manual recoding to 33.8% of drug and NHP inputs. The main limitations of our approach are the initial free-text entry of all NHP inputs and the 74.8% consensus on NHP inputs during manual recoding. Also, our approach would need to be adapted for drug and NHP data collection in other countries because product names vary.

Ongoing developments

We continue to refine our collection and curation processes for medication data in the CLSA by exploring the linkage of the type-to-search box to Health Canada’s LNHPD for the mapping of NHP information by CLSA interviewers. A concern for NHP mapping is that multiple brand name extensions generate a large number of options, which could increase interviewers’ data collection time. We are also evaluating the integration of the algorithms into the type-to-search box to generate a list of possible matches.

We created an efficient 3-step sequential process for drug and NHP data collection and curation in a longitudinal cohort, as shown by the mapping of half of the drug and NHP inputs by the interviewers and algorithm recoding of 15.1% of inputs. The accuracy of our approach was shown by the agreement of algorithm coding with gold standard manual recoding and by the recoders’ consensus on drug inputs in the manual recoding process. Our approach has the potential to be applied by researchers using other large datasets requiring cleaning. We are pursuing the development of our approach for the data collection and mapping of NHP data to Health Canada’s LNHPD and integrating the algorithms into the day-to-day operations of the upcoming follow-up data collection periods in the CLSA.

Ethical Approval The CLSA was approved by the Hamilton Integrated Research Ethics Board (approval number 10-423, for the Comprehensive cohort) at McMaster University and the research ethics boards of all collaborating institutions.

Competing interest None.

Authors’ contribution BC, LG, PDE, JB, KN, LD designed the study. BC, LG, PDE, LM, JB, LD analysed the data. BC, LG, PDE, JB, LD drafted the manuscript. All authors interpreted the data, critically revised the manuscript and approved the final version.

Funding Funding for the Canadian Longitudinal Study on Aging (CLSA) is provided by the Government of Canada through the Canadian Institutes of Health Research (CIHR) under grant reference: LSA94473 and the Canada Foundation for Innovation. The funders had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The CLSA is led by Drs Parminder Raina, Christina Wolfson and Susan Kirkland. Benoit Cossette is a Junior 1 Research Scholar from the Fonds de recherche du Québec – Santé. Lauren Griffith is supported by the McLaughlin Foundation Professorship in Population and Public Health. Parminder Raina holds the Raymond and Margaret Labarge Chair in Optimal Aging and Knowledge Application for Optimal Aging, is the Director of the McMaster Institute for Research on Aging and the Labarge Centre for Mobility in Aging and holds a Tier 1 Canada Research Chair in Geroscience.

Availability of data and materials CLSA data are currently available to approved public sector researchers in Canada and elsewhere. The data application process is described on CLSA’s website (www.clsa-elcv.ca) which also hosts the medication and NHP data support document providing a brief overview.

Acknowledgements The authors would like to thank the participants who give their time to the Canadian Longitudinal Study on Aging. The authors also thank Helga Weigelin and Bevonie Brown who conducted the manual recoding, Jean-Philippe Turcotte and Claudie Rodrigue for the data analysis and Joanne Ho, Carol Bassim and Kasia Makara for their support in coordinating this work.

Supplementary Information The online version contains supplementary material available at

1. Cadarette SM, Wong L (2015) An Introduction to Health Care Administrative Data. Can J Hosp Pharm 68:232–237. https://doi.org/10.4212/cjhp.v68i3.1457
2. Murdoch TB, Detsky AS (2013) The inevitable application of big data to health care. JAMA 309:1351–1352. https://doi.org/10.1001/jama.2013.393
3. Metge C, Grymonpre R, Dahl M, Yogendran M (2005) Pharmaceutical use among older adults: using administrative data to examine medication-related issues. Can J Aging Rev Can Vieil 24 Suppl 1:81–95. https://doi.org/10.1353/cja.2005.0052
4. Schneeweiss S, Avorn J (2005) A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol 58:323–337. https://doi.org/10.1016/j.jclinepi.2004.10.012
5. Zhan C, Miller MR (2003) Administrative data based patient safety research: a critical review. Qual Saf Health Care 12 Suppl 2:ii58-63. https://doi.org/10.1136/qhc.12.suppl_2.ii58
6. Moriarty F, Bennett K, Fahey T, et al (2015) Longitudinal prevalence of potentially inappropriate medicines and potential prescribing omissions in a cohort of community-dwelling older people. Eur J Clin Pharmacol 71:473–482. https://doi.org/10.1007/s00228-015-1815-1
7. Moriarty F, Hardy C, Bennett K, et al (2015) Trends and interaction of polypharmacy and potentially inappropriate prescribing in primary care over 15 years in Ireland: a repeated cross-sectional study. BMJ Open 5:e008656. https://doi.org/10.1136/bmjopen-2015-008656
8. Galvin R, Moriarty F, Cousins G, et al (2014) Prevalence of potentially inappropriate prescribing and prescribing omissions in older Irish adults: findings from The Irish LongituDinal Study on Ageing study (TILDA). Eur J Clin Pharmacol 70:599–606. https://doi.org/10.1007/s00228-014-1651-8
9. Richesson RL (2014) An informatics framework for the standardized collection and analysis of medication data in networked research. J Biomed Inform 52:4–10. https://doi.org/10.1016/j.jbi.2014.01.002
10. Nikiema JN, Liang MQ, Després P, Motulsky A (2021) OCRx: Canadian Drug Ontology. Stud Health Technol Inform 281:367–371. https://doi.org/10.3233/SHTI210182
11. RxNorm. National library of medicine. Available at: . Accessed 2023-01-13
12. Waters R, Malecki S, Lail S, Mak D, Saha S, Jung HY, Razak F, Verma A. Automated identification of unstandardized medication data: A scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data. medRxiv. Available at: . Accessed 2023-01-13
13. Hernandez P, Podchiyska T, Weber S, et al (2009) Automated mapping of pharmacy orders from two electronic health record systems to RxNorm within the STRIDE clinical data warehouse. AMIA Annu Symp Proc AMIA Symp 2009:244–248
14. Zhou L, Plasek JM, Mahoney LM, et al (2012) Mapping Partners Master Drug Dictionary to RxNorm using an NLP-based approach. J Biomed Inform 45:626–633. https://doi.org/10.1016/j.jbi.2011.11.006
15. Veronin MA, Schumaker RP, Dixit RR, Dhake P, Ogwo M. A systematic approach to “cleaning” of drug name records data in the FAERS database: a case report. Int J Big Data Man. 2020;1:105–118
16. Lockery JE, Rigby J, Collyer TA, et al (2019) Optimising medication data collection in a large-scale clinical trial. PLOS ONE 14:e0226868. https://doi.org/10.1371/journal.pone.0226868
17. Richesson RL, Smith SB, Malloy J, Krischer JP (2010) Achieving standardized medication data in clinical research studies: two approaches and applications for implementing RxNorm. J Med Syst 34:651–657. https://doi.org/10.1007/s10916-009-9278-5
18. Gnjidic D, Pearson S-A, Hilmer SN, et al (2015) Manual versus automated coding of free-text self-reported medication data in the 45 and Up Study: a validation study. Public Health Res Pract 25:e2521518. https://doi.org/10.17061/phrp2521518
19. Raina P, Wolfson C, Kirkland S, et al (2019) Cohort Profile: The Canadian Longitudinal Study on Aging (CLSA). Int J Epidemiol 48:1752–1753j. https://doi.org/10.1093/ije/dyz173
20. Raina PS, Wolfson C, Kirkland SA, et al (2009) The Canadian longitudinal study on aging (CLSA). Can J Aging Rev Can Vieil 28:221–229. https://doi.org/10.1017/S0714980809990055
21. Health Canada Drug Product Database. Available at: . Accessed 2023-01-13
22. Health Canada Licensed Natural Health Products Database. Available at: . Accessed 2023-01-13
23. Banda JM, Evans L, Vanguri RS, et al (2016) A curated and standardized adverse drug event resource to accelerate drug safety research. Sci Data 3:160026. https://doi.org/10.1038/sdata.2016.26
24. Gagnier JJ, Boon H, Rochon P, et al (2006) Recommendations for reporting randomized controlled trials of herbal interventions: Explanation and elaboration. J Clin Epidemiol 59:1134–1149. https://doi.org/10.1016/j.jclinepi.2005.12.020
