SLC16A1 SNPs and leisure-time physical activity in young Brazilian adults

Background: Physical inactivity is a pandemic risk factor for non-communicable diseases. Investigating its determinants is critical to inform effective interventions. However, little is known about the genetic determinants of physical activity. Methods: Adults from the 1982 Pelotas Birth Cohort were investigated. Five SLC16A1 SNPs were assessed for association with physical activity measured by the International Physical Activity Questionnaire. Results: At a mean age of 22.8 years, rs1049434-AT and TT genotypes (compared to AA) were associated with 4.9 (95% CI: -32.8; 41.5) and 20.6 (95% CI: -29.1; 69.4) more minutes per week of self-reported leisure-time physical activity in males, respectively. rs3849174-TG and GG males reported 7.9 (95% CI: -43.1; 27.3) and 41.6 (95% CI: -111.5; 28.2) fewer minutes per week compared to TT, respectively. At a mean age of 30.2 years, the results for rs1049434 in males were very similar. Effect estimates of 22.6 (95% CI: -53.8; 8.6) and 28.7 (95% CI: -90.8; 33.4) fewer minutes were observed for rs3849174-TG and GG males, respectively. Results were inconsistent for the rs17493313 SNP and for females. Conclusion: Our results suggest that the rs1049434 and rs3849174 SNPs may be genetic determinants of physical activity. However, our findings need replication in larger samples with more precise measures of physical activity.


Introduction
The pandemic of physical inactivity (16) causes more than 5 million deaths worldwide each year (18). Even so, one third of the world's adult population fails to reach the recommended 150 minutes per week of at least moderate-intensity physical activity, and an alarming four fifths of the adolescent population may be characterized as physically inactive according to public health recommendations (9). A better understanding of the determinants of physical activity at the population level is urgently needed to help change this scenario (1).
So far, research into the correlates and determinants of physical activity has primarily focused on socio-demographic, intrapersonal, interpersonal and environmental factors, and the vast majority of data are from high-income countries (1). However, some previous studies have identified genetic factors associated with different aspects of physical activity. Genetic variation [especially single nucleotide polymorphisms (SNPs)] in genes involved in biological processes physiologically related to physical activity has been studied in high-level athletic contexts in distinct populations, as well as in laboratory studies and in specific designs such as twin studies (1,5,15).
Family and twin studies assessing physical activity heritability have provided estimates that vary widely between populations, as described previously (1,7). Such variability could be attributed to different factors, from population differences to measurement error. A twin study using objectively measured physical activity and sedentary behavior obtained heritability estimates of 47% and 31% for time spent in moderate-to-vigorous intensity physical activity and in sedentary behavior, respectively (7). Although some candidate genes have been proposed, including MC4R (20), LEPR (19,28) and DRD2 (25), the genetic architecture underlying the heritable component of physical activity remains largely unidentified (6). While there are currently no robust variants identified through genome-wide association studies, candidate-gene studies can help identify biological pathways involved in physical activity predisposition, as well as replicate and refine findings from genome-wide investigations (once these are available) in different populations.
Genetic factors involved in lactate accumulation are possibly associated with predisposition to being physically active in the general population. More specifically, lactate levels are well-established markers of oxygen supply to the tissues, which influences muscle fatigue (13). This in turn may influence an individual's motivation to exercise (1). Genetic variation in the SLC16A1 gene (which encodes the lactate transporter protein monocarboxylate transporter 1) was identified in patients with lactate transport deficiency (22). Later, this gene was screened for genetic variation in a Singaporean sample (17), and one of the identified variants has been associated with lactate accumulation after exercise (3,4,8) and with athletic performance (8,24).
Despite the plausible link between variation in genes involved in lactate clearance and physical activity, and the corroborative evidence in specific groups of athletes, no population-based investigation of the potential association between this gene and physical activity has been performed to date. The aim of the present study was to investigate the association between SLC16A1 SNPs and leisure-time physical activity in young Brazilian adults.

Study participants and data collection
In 1982, the maternity hospitals in Pelotas, a southern Brazilian city (current population 330,000), were visited daily and screened for all newborns. In total, 5914 live births whose families lived in the urban area were examined and their mothers interviewed. Information on more than 99% of the live births was collected (n = 2898). Individuals belonging to the birth cohort were thereafter followed up at the mean ages of 11 (14,30).

Genotyping and SNP selection
At the 22-year follow-up visit, subjects were also invited to visit the research laboratory to donate a blood sample, collected by venous puncture. DNA was extracted and frozen at -70 °C. DNA samples were genotyped using the Illumina HumanOmni2.5-8v1 array. From the SNPs that remained after quality control [exclusion criteria: Hardy-Weinberg Equilibrium (HWE) P-value < 1 × 10⁻⁸, minor allele frequency ≤ 1% and genotyping rate ≤ 90%], those that lie within the SLC16A1 region (genome assembly GRCh37.p13; from start to stop codons, including introns) and that map uniquely to this gene were selected, resulting in five SNPs: rs17493313, rs9429505, rs7169, rs1049434 and rs3849174.
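The quality-control exclusions described above can be sketched as a simple per-SNP filter. This is an illustration only, not the pipeline used in the study: the `SnpStats` record, the `passes_qc` function and the example values are hypothetical, and only the three thresholds are taken from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the exclusion criteria stated in the text.
HWE_P_MIN = 1e-8       # exclude if HWE P-value < 1e-8
MAF_MIN = 0.01         # exclude if minor allele frequency <= 1%
CALL_RATE_MIN = 0.90   # exclude if genotyping rate <= 90%


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary; the field names are assumptions."""
    rsid: str
    hwe_p: float
    maf: float
    call_rate: float


def passes_qc(snp: SnpStats) -> bool:
    """Return True when the SNP survives all three exclusion criteria."""
    return (
        snp.hwe_p >= HWE_P_MIN
        and snp.maf > MAF_MIN
        and snp.call_rate > CALL_RATE_MIN
    )


# Invented values for illustration only (not cohort data).
example = [
    SnpStats("snp_a", hwe_p=0.30, maf=0.25, call_rate=0.99),
    SnpStats("snp_b", hwe_p=1e-12, maf=0.25, call_rate=0.99),  # fails HWE
    SnpStats("snp_c", hwe_p=0.30, maf=0.005, call_rate=0.99),  # fails MAF
    SnpStats("snp_d", hwe_p=0.30, maf=0.25, call_rate=0.90),   # fails call rate
]
kept = [snp.rsid for snp in example if passes_qc(snp)]
print(kept)
```

Note that the inequalities mirror the wording of the criteria: a SNP with a genotyping rate of exactly 90% or a minor allele frequency of exactly 1% is excluded.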

Statistical analyses
Linkage disequilibrium (LD) analyses were performed by calculating r² values for all pairwise combinations of SNPs. HWE and the distribution of the genotypes of each SNP according to observed skin color were evaluated by Fisher's exact and χ² tests, respectively. Crude and adjusted associations between each SNP and physical activity were evaluated by linear regression. Adjusted analyses controlled for the top 20 ancestry-informative principal components (calculated using an LD-pruned subset of ~300,000 autosomal SNPs), and were also stratified by sex.
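As a minimal sketch of the pairwise LD step, r² can be approximated from unphased genotypes as the squared Pearson correlation between allele dosages (0, 1 or 2 copies of a given allele). The text does not state which r² estimator was used, so this choice is an assumption; the function names and the toy dosage vectors are hypothetical.

```python
from itertools import combinations


def ld_r2(dosage_a, dosage_b):
    """Squared Pearson correlation between two genotype dosage vectors."""
    if len(dosage_a) != len(dosage_b):
        raise ValueError("dosage vectors must have the same length")
    n = len(dosage_a)
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b))
    var_a = sum((a - mean_a) ** 2 for a in dosage_a)
    var_b = sum((b - mean_b) ** 2 for b in dosage_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


def pairwise_r2(genotypes):
    """r2 for every pair of SNPs in a {snp_name: dosages} mapping."""
    return {
        (s1, s2): ld_r2(genotypes[s1], genotypes[s2])
        for s1, s2 in combinations(sorted(genotypes), 2)
    }


# Toy dosages for six hypothetical individuals (not cohort data).
toy = {
    "snp_x": [0, 1, 2, 0, 1, 2],
    "snp_y": [0, 1, 2, 0, 1, 2],  # identical to snp_x
    "snp_w": [0, 0, 0, 2, 2, 2],  # uncorrelated with snp_x
}
for pair, r2 in pairwise_r2(toy).items():
    print(pair, round(r2, 3))
```

SNP pairs with r² close to 1 carry nearly the same information, which is why a subset of SNPs can be selected based on LD.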
Due to positive skewness in physical activity scores, we repeated some analyses using bootstrap to estimate effect sizes, confidence intervals and P-values. Because findings using bootstrap were virtually identical to those using classical linear regression, we opted to present only the latter (bootstrap results available upon request). Statistical significance was a priori defined as P < 0.05.
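The bootstrap idea can be illustrated for the simplest contrast, the difference in mean minutes per week between a variant genotype group and the reference group (which equals the crude regression coefficient for a genotype indicator). This is a sketch under stated assumptions: the text does not describe the resampling scheme, so resampling within genotype groups and the percentile interval are choices made here, and the function name and data are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_mean_difference(ref, alt, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for mean(alt) - mean(ref).

    Each group is resampled with replacement, keeping its original size.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        ref_sample = [rng.choice(ref) for _ in ref]
        alt_sample = [rng.choice(alt) for _ in alt]
        diffs.append(mean(alt_sample) - mean(ref_sample))
    diffs.sort()
    lower = diffs[int(n_boot * alpha / 2)]
    upper = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    return mean(alt) - mean(ref), lower, upper


# Invented, positively skewed minutes per week (not cohort data).
ref_minutes = [0, 0, 0, 10, 20, 30, 60, 120, 300]
alt_minutes = [0, 0, 15, 30, 45, 90, 150, 240, 420]

estimate, lower, upper = bootstrap_mean_difference(ref_minutes, alt_minutes)
print(f"difference = {estimate:.1f} min/week, 95% CI: {lower:.1f}; {upper:.1f}")
```

Because the interval is read from the empirical distribution of resampled differences, it does not rely on normally distributed residuals, which is the motivation for using it with skewed activity scores.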
Principal components were calculated using SNP & Variation Suite version 7.7.8 (Golden Helix). The remaining analyses were performed using R version 3.0.2 (http://www.r-project.org/).

Sample description
In the 2012-13 follow-up visit (30-31 years of age), we were able to locate 3701 individuals, which (added to the 325 known to have died) represents a follow-up rate of 68.1%. Follow-up rates were slightly lower among males, individuals in the wealthiest socioeconomic groups at birth and those with normal birth weight (Table 1). Compared with all 2012-13 follow-up participants, individuals with both genetic and physical activity data were poorer and less active in their leisure time at a mean age of 30.2 years.

Table 2 provides a description of the studied sample. Males accounted for 47.5% of the individuals, and more than 70% reported white skin color. The frequencies of the homozygous variant genotype (i.e., the genotype with two copies of the least prevalent allele) varied between 2.9% and 15.5% for the three SNPs selected based on LD (see next paragraph). Men were more active than women for all physical activity measurements. As expected, the differences between the mean and the median for physical activity variables indicate substantial positive skewness.

As expected, the genotypic frequencies of the three SNPs varied significantly according to skin color (Table 3; P = 0.011, P < 0.001 and P < 0.001 for rs17493313, rs1049434 and rs3849174, respectively). No significant deviations from HWE were observed for rs17493313 and rs1049434 in the total sample or within skin color strata. However, rs3849174 presented a significant deviation from HWE among black participants (P = 0.044). Therefore, all association analyses involving rs3849174 were performed both in all skin color groups combined and in non-black participants only, to avoid the possibility of finding spurious associations.
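A minimal sketch of an exact HWE test is given below. It sums, over all heterozygote counts compatible with the observed allele counts, the probabilities that do not exceed the probability of the observed count. The text only states that an exact test was used, so this particular formulation is an assumption; the function name and the genotype counts are hypothetical.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_hom1, n_het, n_hom2):
    """Two-sided exact HWE P-value from the three genotype counts."""
    n = n_hom1 + n_het + n_hom2
    if n == 0:
        raise ValueError("at least one genotype is required")
    minor = min(2 * n_hom1 + n_het, 2 * n_hom2 + n_het)  # minor allele count
    major = 2 * n - minor

    def log_prob(het):
        # Log probability of `het` heterozygotes given n and the allele counts.
        hom_minor = (minor - het) // 2
        hom_major = n - het - hom_minor
        return (
            het * log(2)
            + lgamma(n + 1)
            - lgamma(hom_minor + 1) - lgamma(het + 1) - lgamma(hom_major + 1)
            + lgamma(minor + 1) + lgamma(major + 1)
            - lgamma(2 * n + 1)
        )

    observed = exp(log_prob(n_het))
    # Heterozygote counts must share the parity of the minor allele count.
    candidates = (exp(log_prob(h)) for h in range(minor % 2, minor + 1, 2))
    # Small tolerance so that ties with the observed probability are included.
    total = sum(p for p in candidates if p <= observed * (1 + 1e-7))
    return min(1.0, total)


# Invented genotype counts for illustration (not cohort data).
print(hwe_exact_p(50, 40, 10))
```

With many strata and SNPs tested, an isolated P-value just below 0.05 in one stratum can arise by chance, which is one reason for repeating the association analyses with and without that stratum rather than discarding the SNP.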

Associations of SLC16A1 SNPs with physical activity
To test the hypothesis that genetic variation in SLC16A1 is associated with physical activity in the general population, crude and adjusted analyses involving rs17493313, rs1049434 and rs3849174 SNPs and physical activity were performed.   Associations involving leisure-time physical activity at a mean age of 30.2 years are shown in Table 5.
rs17493313 estimates presented inconsistent directions for males when compared to Table 4; the pattern (i.e., change in regression coefficients according to the number of C alleles) was inconsistent in both sexes, and the magnitude of the estimates was considerably smaller. Estimates for rs1049434 were inconsistent for females, but similar in direction and magnitude for males: when compared to AA, AT and TT individuals reported 3.9 (95% CI: -28.6; 36.5) and 19.0 (95% CI: -24.4; 62.4) more minutes per week of leisure-time physical activity, respectively. Females also presented inconsistent results for the rs3849174 SNP, while males presented consistent directions. Although lower reported activity with an increasing number of G alleles was also observed at the younger age, the difference between the effect estimates of TG and GG individuals was smaller at a mean age of 30.2 years.

Table 5. Linear regression coefficients for leisure-time physical activity (min/week) at a mean age of 30.2 years.
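The genotype contrasts reported in this section (for example, AT and TT compared to AA) correspond to regression coefficients on genotype indicator variables. The sketch below fits such a model by ordinary least squares in plain Python, first crude and then adjusted for a single covariate standing in for an ancestry principal component. The study's analyses were run in R with 20 principal components; the helper names, the data and the single covariate here are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(design, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(genotype, covariates=()):
    """Intercept plus indicators for AT and TT; AA is the reference."""
    return [1.0, float(genotype == "AT"), float(genotype == "TT"), *covariates]


# Invented data for nine hypothetical individuals (not cohort data).
genotypes = ["AA", "AA", "AA", "AT", "AT", "AT", "TT", "TT", "TT"]
minutes = [30, 60, 90, 40, 70, 100, 60, 80, 130]
pc1 = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1, -0.3, 0.1, 0.2]  # stand-in covariate

crude = ols([design_row(g) for g in genotypes], minutes)
adjusted = ols([design_row(g, (p,)) for g, p in zip(genotypes, pc1)], minutes)
print("crude:   ", [round(b, 2) for b in crude])
print("adjusted:", [round(b, 2) for b in adjusted])
```

In the crude model the AT and TT coefficients equal the differences between each group's mean and the AA mean, so similar crude and adjusted coefficients indicate that the covariates explain little of the genotype contrast.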

Discussion
We investigated the association of genetic variation in SLC16A1 with leisure-time physical activity in a population-based sample. Some genotypes presented appreciable score-lowering effect estimates, but 95% CIs were wide.
Results were more consistent for males than for females, and the SNP rs1049434 presented the most consistent results.
To the best of the authors' knowledge, this is the first investigation of an association between genetic variation in SLC16A1 and physical activity in the general population. Furthermore, this is the first candidate-gene assessment of the associations of SLC16A1 SNPs other than rs1049434 with physical activity. To date, there are only four reports of associations between rs1049434 and physical activity.
The first reported an association between the SNP and lactate accumulation and maximum lactate concentration in 10 men aged 20-26 under controlled high-intensity circuit training (3). Later, a study in 15 men and 14 women observed an association between rs1049434 and blood lactate accumulation during different training series in males but not in females (4). Two recent studies provided further evidence on this topic: one observed a higher prevalence of the A allele in endurance-oriented athletes than in non-athletic individuals, while mean blood lactate levels were higher in male rowers carrying the T-allele than in AA ones (8); the other provided evidence that sprint/power athletes (n = 100) were more likely than endurance athletes (n = 112) and matched controls (n = 621) to present the T allele (24).
Our findings agree with previously published studies regarding the general notion that SLC16A1 SNPs influence physical activity. Although effect estimates were larger for the rs3849174 SNP, the rs1049434 variant (which was associated with different aspects of physical activity in the aforementioned studies) presented very consistent effect estimates in males. However, whereas the literature suggests that the rs1049434-T allele is associated with reduced predisposition to be physically active, we observed the opposite. Additionally, an association was observed in females but not in males, which is also contrary to available evidence (4). However, women have consistently been reported to be less active (regarding overall and leisure-time physical activity) than men (1), and some data suggest women are less motivated to exercise (21,29). If SLC16A1 SNPs influence exercise levels (a component of leisure-time physical activity), lower participation in structured exercise among women may explain the sex differences observed. Nevertheless, the inconsistencies with the literature highlight the need for future population-based studies in different genetic backgrounds to evaluate the association of these SNPs with leisure-time physical activity and whether this association differs between sexes.
One of the important limitations of our study is the use of self-reported physical activity. In this regard, we have focused on leisure-time physical activity for two reasons. First, it is well known that the occupational and housework domains are overestimated by IPAQ in Latin America (11). Second, the role of genetic determinants is likely to be more pronounced in leisure-time than in transport-related physical activity, especially in low- and middle-income populations, where transport-related physical activity is likely determined by socioeconomic factors rather than individual choice (10).
Moreover, given Mendel's 1st and 2nd laws (26,27), it is unlikely that germline genetic variants (including SNPs) are associated with potential sources of systematic error in physical activity reports.
Another potential source of bias in our study is the fact that our sample is multi-ethnic. However, ancestry-informative principal components obtained from genome-wide genotyping data were available and are known to provide robust protection against population stratification (23). It is also reassuring that the results for the overall sample were similar between crude and adjusted analyses.
Although two SNPs (especially rs1049434) presented consistent results in males at different ages, our study is likely underpowered. Some causes of this limitation include the measurement error of questionnaire measures of physical activity (12), the multi-factorial nature of this behavior (1) and the small effect sizes frequently observed for common genetic variants. In the future (when the effects of life-long physical activity patterns will be more apparent in our cohort) it will also be possible to test the association of these SNPs with physical activity-related traits as an additional replication strategy, evaluating the effects of these SNPs on physical activity at different ages. However, the wide confidence intervals we observed do not allow us to exclude the possibility that our findings are due to chance. Although the associations we reported have biological plausibility, replication in other populations is warranted.
Considering that physical inactivity is a major risk factor for non-communicable diseases, identifying

Written informed consent was obtained from all participants.

Competing interests: The authors declare no conflicts of interest.
Funding Source: Funding for genome-wide genotyping was received from the Brazilian Ministry of