The aim of this study is to investigate the genetic predisposition to elite athletic endurance by conducting the largest GWAS in elite athletes to date, followed by functional validation through a metabolomics study to shed light on the mechanisms underlying the genetic associations.
Participants
Discovery study
Seven hundred and fifty-three consenting European international-level athletes (594 males, 159 females) from different sports disciplines were included in this study. All had participated in national or international sports events and tested negative for doping substances at anti-doping laboratories in Qatar (ADLQ) and Italy (FMSI). No other participant information was available because of the strict anonymization process undertaken by the anti-doping laboratories. This study was performed in line with the World Medical Association Declaration of Helsinki – Ethical Principles for Medical Research Involving Human Subjects. All protocols were approved by the Institutional Research Board of ADLQ (F2014000009). Athletes were dichotomized into groups with different aerobic (dynamic) and power (static) components (Table 1) based on their sport types, as described previously (3). Table 1 also lists the number of participants included in each analysis, by sport type, class/group and gender.
Replication study
The Russian athletes’ study involved 219 athletes (95 females, age 21.9 (3.5) years; 124 males, age 22.1 (4.2) years), comprising 43 sprinters, 120 middle-distance athletes and 56 long-distance athletes. Sprinters included 100-400 m runners (n = 8), sprint cyclists (n = 5), 500-1000 m speed skaters / short trackers (n = 10), 50-100 m swimmers (n = 19) and a 200 m kayaker (n = 1). Middle-distance athletes comprised rowers (n = 59), 0.8-1.5 km runners (n = 10), middle-distance cyclists (n = 7), middle-distance kayakers / canoeists (n = 21), 1.5-3.0 km speed skaters (n = 15) and 200-400 m swimmers (n = 8). Long-distance athletes included 3-10 km runners (n = 3), a marathon runner (n = 1), biathletes (n = 14), cross-country skiers (n = 12), 0.8-25 km swimmers (n = 14), triathletes (n = 6) and race walkers (n = 6). All athletes were Olympic team members (international level) who had tested negative for doping substances. Russian controls were 173 unrelated citizens of Russia (126 males and 47 females) without any competitive sport experience (all Caucasians of Eastern European descent). The Russian study was approved by the Ethics Committee of the Federal Research and Clinical Center of Physical-chemical Medicine of the Federal Medical and Biological Agency of Russia. Written informed consent was obtained from each participant. The study complied with the guidelines set out in the Declaration of Helsinki and ethical standards in sport and exercise science research. The experimental procedures were conducted in accordance with the guiding principles for reporting the results of genetic association studies defined by the STrengthening the REporting of Genetic Association studies (STREGA) Statement.
Genotyping
Discovery study
DNA was extracted from leukocytes (venous blood samples) of all participants using the DNeasy Blood & Tissue kit (Qiagen) following the manufacturer’s instructions. DNA concentration and quality were assessed using the Nanodrop (Thermo Fisher) and the Qubit Fluorometer (Invitrogen) to ensure that DNA of sufficient quantity and quality was obtained for genotyping. Illumina Drug Core array-24 BeadChips were used to genotype 476,728 SNPs in the 753 European elite athletes whose samples had been collected for anti-doping analysis (discovery cohort). This array contains over 240,000 highly informative genome-wide tag SNPs and a novel custom set of ~200,000 markers designed to support studies of drug target validation and treatment response. The assay required 200 ng of DNA as input at a concentration of at least 50 ng/µl. All further procedures were performed according to the manufacturer’s instructions for the Infinium HD Assay. Briefly, 4 µl of DNA was mixed with Illumina amplification reagents and incubated overnight at 37 °C in a hybridization oven. On the second day, the amplified DNA was fragmented with enzymatic reagents and then precipitated by centrifugation. The re-suspended pellet was subsequently loaded onto the BeadChip and incubated overnight at 48 °C in a hybridization oven. On the third day, the BeadChips underwent enzymatic base extension and fluorescent staining. Finally, after coating, the BeadChips were imaged using the iScan.
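The input requirement stated above (200 ng of DNA at a concentration of at least 50 ng/µl) is consistent with the 4 µl used in the amplification step. The following minimal sketch only illustrates that arithmetic; the function name and its parameters are hypothetical and are not part of the assay protocol.

```python
def volume_for_mass(mass_ng, concentration_ng_per_ul):
    """Return the volume (µl) that contains `mass_ng` of DNA.

    Hypothetical helper for illustration only: volume = mass / concentration.
    """
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / concentration_ng_per_ul


# At the minimum concentration of 50 ng/µl, 200 ng corresponds to 4 µl.
volume_at_minimum = volume_for_mass(200, 50)
```

A more concentrated sample would contain the same 200 ng in a smaller volume, which is why the requirement is given as a minimum concentration.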
Replication study
Molecular genetic analysis in Russian cohorts was performed with DNA samples obtained from leukocytes (venous blood). Four ml of venous blood were collected in tubes containing EDTA (Vacuette EDTA tubes, Greiner Bio-One, Austria). Blood samples were transported to the laboratory at 4°C and DNA was extracted on the same day. DNA extraction and purification were performed using a commercial kit according to the manufacturer's instructions (Technoclon, Russia) and included chemical lysis, selective DNA binding on silica spin columns and ethanol washing. Extracted DNA quality was assessed by agarose gel electrophoresis at this step. HumanOmni1-Quad BeadChips (Illumina Inc, USA) were used for genotyping of 1,140,419 SNPs in athletes and controls. The assay required 200 ng of DNA sample as input with a concentration of at least 50 ng/µl. Exact concentrations of DNA in each sample were measured using a Qubit Fluorometer (Invitrogen, USA). All further procedures were performed according to the instructions of Infinium HD Assay.
Data Extraction and SNP Identification
Raw data were extracted, peak-identified and QC-processed using Illumina iScan hardware and software. These systems are built on a web-service platform utilizing Microsoft’s .NET technologies, which run on high-performance application servers and fiber-channel storage arrays in clusters to provide active failover and load-balancing.
Metabolomics
Screening of serum metabolites was performed in 490 elite athletes (Table S1) using protocols established at Metabolon, Durham, NC, USA. The platform utilizes Waters ACQUITY ultra-performance liquid chromatography (UPLC) and a Thermo Scientific Q-Exactive high resolution/accurate mass spectrometer interfaced with a heated electrospray ionization (HESI-II) source and an Orbitrap mass analyzer operated at 35,000 mass resolution. The detailed protocol and QC measures have been published previously (14, 42).
Statistical analysis
Following genotyping with Illumina’s Drug Core SNP array, analysis was performed using PLINK v1.9. Quality control measures were applied to the genotype data set to exclude samples with a low genotype call rate or excess heterozygosity. In addition, SNPs with a genotype call rate < 98%, a minor allele frequency < 1%, or deviation from Hardy-Weinberg equilibrium (P < 10E-6) were excluded. After filtering with these criteria, 341,385 SNPs remained for analysis. Population background was determined using principal component analysis (PCA) in comparison with samples from the 1000 Genomes Project, and only samples with European ancestry were included in the analysis. The analysis in the European and Russian cohorts was performed using linear or logistic regression models. For the discovery cohort, the model compared sports grouped by training modality (i.e. sports with a high vs. low/moderate aerobic component), with gender and principal components 1, 2, 3 and 4 as covariates. A stringent Bonferroni significance level of p ≤ 0.05/341,385 = 1.46E-7 was used to define significant associations. The meta-analysis was performed using Cochrane Review Manager version 5.3. Random and fixed effect models were applied. The degree of heterogeneity between the studies was assessed with the I² statistic. Associations between SNPs and metabolite levels were computed using the lm function in R (version 3.3.1), correcting for gender, hemolysis and principal components. An additive inheritance model was used (SNPs were coded as 0, 1 or 2 according to their genotype group). Pathway enrichment analyses were carried out using Chi-square tests to identify pathways enriched in metabolites ranked by p-value from the linear model, since no association reached the Bonferroni level of significance.
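The SNP-level filters, the additive 0/1/2 coding and the Bonferroni threshold described above can be illustrated with a short Python sketch. It is not the PLINK implementation used in the study. All function and constant names are hypothetical, and the Hardy-Weinberg check uses a 1-degree-of-freedom chi-square approximation in place of PLINK's own exact test. The Hardy-Weinberg cut-off written as "10E-6" in the text is read here as 10^-6, which is an assumption.

```python
import math

CALL_RATE_MIN = 0.98  # SNPs with a call rate < 98% are excluded
MAF_MIN = 0.01        # SNPs with a minor allele frequency < 1% are excluded
HWE_P_MIN = 1e-6      # assumption: "10E-6" in the text is read as 10^-6


def additive_code(genotype, effect_allele):
    """Additive model: count copies of the effect allele (0, 1 or 2)."""
    if genotype is None:  # missing call
        return None
    return sum(1 for allele in genotype if allele == effect_allele)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) approximation to the Hardy-Weinberg test."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def snp_passes_qc(codes):
    """Apply the call-rate, MAF and HWE filters to one SNP's 0/1/2 codes."""
    called = [c for c in codes if c is not None]
    if not called or len(called) / len(codes) < CALL_RATE_MIN:
        return False
    freq = sum(called) / (2 * len(called))  # effect-allele frequency
    if min(freq, 1 - freq) < MAF_MIN:
        return False
    n0, n1, n2 = (called.count(k) for k in (0, 1, 2))
    return hwe_chi2_pvalue(n2, n1, n0) >= HWE_P_MIN


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# 0.05 / 341,385 tested SNPs gives roughly 1.46E-7, as stated in the text.
threshold = bonferroni_threshold(0.05, 341385)
```

In this sketch a SNP is retained only if it passes all three filters, and the threshold scales inversely with the number of SNPs that remain after filtering.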