2.1 Cohorts
The first cohort comprised 235 home-based people with PD (PwP) and 231 healthy controls of Caucasian heritage from the Perron Institute for Neurological and Translational Science PD Database, as previously reported 23. Clinical and demographic data, including the age of symptom onset, were recorded in the database. All PwP were examined by a movement disorder neurologist prior to inclusion in the study to verify the diagnosis in accordance with the UK Brain Bank criteria for idiopathic PD and reported no family history of PD 24. Healthy controls were confirmed to have no history of any neurological disorder. This study was approved by the Sir Charles Gairdner Hospital Human Research and Ethics Committee (Approval number 2006/073). Written informed consent was obtained from all participants, in accordance with the Australian National Health and Medical Research Council research guidelines.
The second cohort was derived from the international Parkinson’s Progression Markers Initiative (PPMI) database (available at http://www.ppmi-info.org/data). This cohort comprised 368 PwP and 172 healthy controls after exclusion of non-Caucasian participants, to reflect the composition of the Australian cohort.
2.2 DNA extraction
In the Australian cohort, DNA was extracted from either blood samples or buccal swabs. Buccal samples were collected by a trained researcher using Isohelix™ DNA/RNA Buccal Swabs (Cell Projects Ltd, Kent, U.K.) and stored until DNA extraction. Alternatively, blood was collected from the median cubital vein. DNA was extracted and purified from these samples using QIAamp DNA mini kits (Qiagen Pty LTD., Victoria, Australia), according to the manufacturer’s protocol. DNA concentration was determined from absorbance readings measured with a NanoDrop One Microvolume UV-Vis spectrophotometer (Thermo Fisher Scientific Australia Pty LTD., Victoria, Australia).
2.3 Genotyping of TOMM40 ‘523’ using PCR and fragment analysis
PCR amplification of the ‘523’ variant in the Australian cohort was performed using fluorescently labelled primers, as previously described 13. The forward primer sequence was 5’-/6-FAM/-TGCTGACCTCAAGCTGTCCTC-3’ and the reverse primer sequence was 5’-GAGGCTGAGAAGGGAGGATT-3’; both were synthesized by Integrated DNA Technologies Pty Ltd (IDT, Iowa, USA). Endpoint PCR was performed using 5 µL of 5x MyFi™ Reaction Buffer (including 1 mM dNTPs and 3 mM MgCl2; Bioline, NSW, Australia), 1 µL of MyFi™ DNA Polymerase (Bioline), 0.2 µL each of forward and reverse primer (20 µM; IDT), 0.25 µL of 1% dimethylsulfoxide (DMSO; Sigma), 50 ng of genomic DNA and 11.35 µL of dH2O (Baxter Healthcare), to a final volume of 25 µL. Optimised PCR conditions were as follows: 1 cycle at 94 °C for 3 minutes; 27 cycles of denaturation at 94 °C for 15 seconds, annealing at 65 °C for 20 seconds and extension at 70 °C for 30 seconds; and 1 cycle at 70 °C for 5 minutes. Applied Biosystems® SimpliAmp™ Thermal Cyclers (Thermo Fisher Scientific, MA, USA) were used for endpoint PCR cycling. PCR products were stored at 4 °C until capillary fragment separation, which was conducted by the Australian Genome Research Facility (AGRF, WA, Australia). Electropherograms were analysed using Peak Scanner™ Software (v1.0; Thermo Fisher Scientific). ‘523’ allele lengths were determined according to the method previously established by Linnertz et al. 13. Briefly, the highest intensity peak(s) in each peak cluster between 160–190 bp were identified and sized, and 150 bp (accounting for flanking regions and primers) was subtracted from the peak sizes to determine poly-T allele lengths. Alleles were grouped using the convention established by Roses et al. 9: Short (S, T ≤ 19), Long (L, 20 ≤ T ≤ 29) and Very Long (VL, T ≥ 30).
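The sizing and grouping rule above can be written as a short calculation. The following is a minimal sketch rather than the software used in the study: the function names are hypothetical, and rounding peak sizes to whole base pairs and treating a single called peak as a homozygote are assumptions made for illustration.

```python
FLANK_BP = 150  # flanking regions plus primers, as stated in the text

CATEGORY_ORDER = {"S": 0, "L": 1, "VL": 2}


def poly_t_length(peak_size_bp):
    """Poly-T length = sized peak (rounded to whole bp, an assumption) minus the flank."""
    return round(peak_size_bp) - FLANK_BP


def classify_523(t_length):
    """Group a poly-T length as Short (<= 19), Long (20-29) or Very Long (>= 30)."""
    if t_length <= 19:
        return "S"
    if t_length <= 29:
        return "L"
    return "VL"


def genotype_523(peak_sizes_bp):
    """Assumption: one called peak means a homozygote, two a heterozygote."""
    if len(peak_sizes_bp) not in (1, 2):
        raise ValueError("expected one or two called peaks")
    calls = [classify_523(poly_t_length(p)) for p in peak_sizes_bp]
    if len(calls) == 1:
        calls = calls * 2
    return "/".join(sorted(calls, key=CATEGORY_ORDER.get))


# Hypothetical peak sizes, not study data: 16 T (S) and 35 T (VL).
print(genotype_523([166.2, 185.1]))  # S/VL
```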
2.4 Genotyping of TOMM40 ‘523’ using Whole Genome Sequencing
Whole genome sequencing (WGS) data were obtained from the PPMI database (available at http://www.ppmi-info.org/data) as binary alignment map (BAM) files that had been aligned to the human reference genome GRCh38 using the Burrows-Wheeler transform alignment algorithm 25. The BAM files were analysed using the Integrative Genomics Viewer 26 to calculate the length of the poly-T repeat, as previously demonstrated 27. Alleles were grouped using the convention established by Roses et al. 9: Short (S, T ≤ 19), Long (L, 20 ≤ T ≤ 29) and Very Long (VL, T ≥ 30).
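To illustrate what is meant by the length of the poly-T repeat on a sequencing read, the sketch below counts the longest uninterrupted run of T bases in a read sequence. It is not the procedure used here, where alignments were inspected in the Integrative Genomics Viewer; the function name and the example read are hypothetical, and the read is assumed to span the whole repeat on the strand where it appears as T.

```python
import re


def longest_t_run(sequence):
    """Length of the longest uninterrupted run of T in a read (0 if none)."""
    runs = re.findall(r"T+", sequence.upper())
    return max((len(run) for run in runs), default=0)


# Hypothetical read spanning a 16-T repeat with unique flanking sequence.
read = "GACCTCAAGC" + "T" * 16 + "GAGACGGAGT"
print(longest_t_run(read))  # 16
```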
2.5 APOE ε genotyping
APOE ε genotypes were determined using single nucleotide polymorphism (SNP) genotyping and PCR-restriction fragment length polymorphism (PCR-RFLP) analyses. APOE ε2/ε3/ε4 genotypes were determined by genotyping two SNPs (rs429358 and rs7412) using the MassARRAY® system (Agena Bioscience) at the Australian Genome Research Facility (AGRF; Queensland, Australia) 28. For PCR-RFLP analyses, endpoint PCR reactions were prepared to a final volume of 10 µl using primer sequences previously described 29. Reactions contained 7.2 µl dH2O (Baxter Healthcare, NSW, Australia), 2 µl MyFi reaction buffer (Bioline, NSW, Australia), 0.05 µl MyFi DNA polymerase (Bioline, NSW, Australia), 0.375 µl forward and 0.375 µl reverse primer (Integrated DNA Technologies, Iowa, USA) at 200 ng/µl, and 25 ng DNA. The PCR amplification protocol comprised an initial hold at 95 °C for 4 minutes 30 seconds, followed by 35 cycles of denaturation at 95 °C for 30 seconds, annealing at 60 °C for 30 seconds and extension at 72 °C for 1 minute 30 seconds. Restriction enzyme reactions were prepared to a final volume of 20 µl, containing 6.3 µl dH2O (Baxter Healthcare, NSW, Australia), 0.2 µl acetylated BSA (Promega, WI, USA), 2 µl C buffer (Promega, WI, USA), 0.5 µl HhaI (Promega, WI, USA) and 10 µl PCR product. Reactions were incubated for 4 h at 37 °C prior to polyacrylamide gel fractionation. Restriction enzyme digested products were fractionated on a 12% (w/v) 29:1 polyacrylamide gel (BioRad, CA, USA) in 1x TBE. Electrophoretic fragment separation was performed at 100 V for 3 h on the DCode™ Universal Mutation Detection System (BioRad, CA, USA). Gels were stained in 1x TBE containing SYBR® Gold nucleic acid gel stain (Thermo Fisher Scientific, MA, USA) for 4 min before visualization using a BioRad Chemidoc™ MP Imaging System. APOE ε genotype data for the PPMI cohort were obtained from the online database (available at www.ppmi-info.org/data).
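The two SNPs define the ε alleles through the standard haplotype table: ε2 carries T at both sites, ε3 carries T at rs429358 and C at rs7412, and ε4 carries C at both. The sketch below applies that table to unphased calls. The function name is hypothetical, the calls are assumed to be reported on the forward strand, and a double heterozygote is assumed to be ε2/ε4 rather than a rare alternative haplotype; how the genotyping service made this call is not stated in the text.

```python
def apoe_genotype(rs429358, rs7412):
    """Map unphased calls such as "CT" and "CC" to an APOE ε genotype."""
    g1, g2 = rs429358.upper(), rs7412.upper()
    for g in (g1, g2):
        if len(g) != 2 or set(g) - {"C", "T"}:
            raise ValueError(f"unexpected genotype call: {g!r}")
    n_e4 = g1.count("C")  # each C at rs429358 marks one ε4 allele
    n_e2 = g2.count("T")  # each T at rs7412 marks one ε2 allele
    if n_e4 + n_e2 > 2:
        raise ValueError("calls imply a rare haplotype outside ε2/ε3/ε4")
    alleles = ["ε2"] * n_e2 + ["ε4"] * n_e4
    alleles += ["ε3"] * (2 - len(alleles))  # remaining alleles are ε3
    return "/".join(sorted(alleles))


print(apoe_genotype("CT", "CC"))  # ε3/ε4
```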
2.6 Statistical methods
The Australian and PPMI cohorts were analysed separately and together using IBM-SPSS software (version 26, IBM Corporation). A nominal significance threshold of p < .05 was used for all statistical tests. Variables were described using mean and standard deviation (SD, in brackets) or frequency and percentage (%, in brackets), as appropriate. Normality was assessed, and clinical characteristics were subsequently analysed using the independent samples t-test, Mann-Whitney U test or Chi-square test, as appropriate. For cross-sectional analysis, Chi-square tests, stratified Mantel-Haenszel tests and binary logistic regression models were used to evaluate the association between TOMM40 ‘523’ genotypes and risk of PD in the Australian and PPMI cohorts. Binary logistic regression models were run both with and without correction for APOE ε4 status (grouped as zero, one or two ε4 alleles) and patient sex. Analyses were also run considering all combinations of ‘523’ length category and APOE ε genotype, to examine interactive effects without a priori assumptions. These analyses were performed separately in the PPMI and Australian cohorts, and again after combining the cohorts. Following this, binary logistic regression models were restricted to APOE ε3/ε3 carriers only, as previously examined and stated as a requirement for replication studies 30.
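For a stratified analysis across two cohorts, the Mantel-Haenszel approach pools the per-cohort 2 x 2 tables into a single common odds ratio. The sketch below shows that estimator in plain Python as an illustration only; the analyses in this study were run in SPSS, the function names are hypothetical, and the counts are invented toy values that do not correspond to either cohort.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2 x 2 table.

    a = carriers among cases, b = non-carriers among cases,
    c = carriers among controls, d = non-carriers among controls.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Common odds ratio pooled over strata, each given as (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Invented toy counts for two strata (not study data).
strata = [(10, 20, 5, 40), (20, 10, 10, 20)]
print([odds_ratio(*t) for t in strata])  # per-stratum odds ratios
print(mantel_haenszel_or(strata))        # pooled estimate
```

When the per-stratum odds ratios agree, as in this toy example, the pooled estimate coincides with them; when they differ, it is a weighted compromise between them.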
Generalised linear models (GLMs) were also constructed to study the effect of TOMM40 on age of disease onset, correcting for APOE ε allele status and sex. The GLMs were then repeated in APOE ε3/ε3 carriers only. Residual plots were examined for all models and no violations were noted. Pairwise comparisons were corrected for multiple comparisons using the Bonferroni method.
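The Bonferroni adjustment for a set of pairwise comparisons multiplies each p-value by the number of comparisons and caps the result at 1. A minimal sketch follows, with a hypothetical function name and invented p-values:

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented p-values for three pairwise comparisons (not study results).
print(bonferroni([0.01, 0.04, 0.5]))
```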
Subsequently, TOMM40 ‘523’ S allele carrier status (S/S genotype, carriage of one S allele, or non-carriage of the S allele) and APOE ε4 status (ε4/ε4 genotype, carriage of one ε4 allele, or non-carriage of the ε4 allele) were combined by allele load to produce five groups. Mean comparisons were then analysed using the Kruskal-Wallis one-way analysis of variance and univariate GLMs. This analysis was repeated for the combinations of TOMM40 ‘523’ S allele carrier status with APOE ε2 status, and TOMM40 ‘523’ VL allele carrier status with APOE ε4 status.
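One reading of the load combination that yields exactly five groups is to add the number of ‘523’ alleles of interest (0 to 2) to the number of APOE alleles of interest (0 to 2), giving a combined load of 0 to 4. The text does not spell out the grouping rule, so the sketch below is an assumption for illustration, with hypothetical function names and genotype strings.

```python
def allele_count(genotype, allele):
    """Number of copies of `allele` in a genotype string such as "S/VL"."""
    return genotype.split("/").count(allele)


def combined_load(tomm40, apoe, tomm40_allele="S", apoe_allele="ε4"):
    """Assumed rule: combined load = '523' allele count + APOE allele count."""
    return allele_count(tomm40, tomm40_allele) + allele_count(apoe, apoe_allele)


print(combined_load("S/VL", "ε3/ε4"))  # 2
# The repeated analyses only change the alleles being counted.
print(combined_load("VL/VL", "ε3/ε4", tomm40_allele="VL"))  # 3
```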
Finally, Kaplan-Meier curves for age at PD symptom onset were estimated, stratified by TOMM40 ‘523’ genotype and by combined TOMM40 ‘523’ and APOE genotype. Survival curves were compared using the log-rank test, which places weight on longer survival periods 31,32. Kaplan-Meier analysis was also run with allelic stratification by TOMM40 ‘523’. Additionally, distributions of age at onset, adjusted for sex, were compared using Cox proportional hazards regression models.
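When age at onset is treated as a survival time, the Kaplan-Meier estimate at each age is the running product of the proportion of at-risk individuals who remain symptom-free. The sketch below computes the product-limit estimate in plain Python. It is an illustration and not the SPSS procedure used in the study; the function name and the ages are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each event time.

    times  = age at onset (event) or age at last follow-up (censored)
    events = 1 if onset was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather everyone with the same time (events and censored alike).
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve


# Invented ages at onset with no censoring (not study data).
for age, s in kaplan_meier([50, 60, 60, 70], [1, 1, 1, 1]):
    print(age, round(s, 2))
```

In a stratified analysis the same estimate would be computed within each genotype group before the curves are compared.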