Background Across complex traits, common variants explain only a modest amount of variance, with SNP-heritability consistently below heritability estimates from close relatives. Here, we examined the contribution of rare variants to tobacco use risk in up to 26,000 individuals of European ancestry in the Trans-Omics for Precision Medicine (TOPMed) program with whole-genome sequence data (WGS; ~30X coverage).
Method We grouped about 35 million genetic variants by their minor allele frequencies (MAF) and linkage disequilibrium (LD) and estimated SNP-heritability for age of smoking initiation (N = 14,747), cigarettes smoked per day (N = 15,425), smoking cessation (N = 17,871) and smoking initiation (N = 26,340) using a linear mixed model. Population structure among rare variants was detected and adjusted for using a permutation procedure. We estimated an upper bound for the narrow-sense heritability of tobacco use from the available pedigrees of close relatives in TOPMed.
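The grouping step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the MAF bin edges, the number of LD groups, and the function names are all assumptions chosen for illustration, and the per-variant LD scores are taken as a given input. Heritability estimation itself (fitting one variance component per bin in a linear mixed model) is not shown.

```python
import numpy as np

# Hypothetical MAF bin edges (as fractions); the abstract does not list the exact bins.
MAF_EDGES = np.array([0.0001, 0.001, 0.01, 0.05, 0.5])


def minor_allele_freq(genotypes):
    """genotypes: (n_individuals, n_variants) array of 0/1/2 allele counts."""
    alt_freq = genotypes.mean(axis=0) / 2.0
    return np.minimum(alt_freq, 1.0 - alt_freq)


def assign_bins(maf, ld_score, n_ld_bins=2):
    """Return (maf_bin, ld_bin) per variant; -1 marks variants outside all MAF bins."""
    # With right=True, bin i holds MAF_EDGES[i] < maf <= MAF_EDGES[i + 1].
    maf_bin = np.digitize(maf, MAF_EDGES, right=True) - 1
    maf_bin[(maf_bin < 0) | (maf_bin >= len(MAF_EDGES) - 1)] = -1

    ld_bin = np.full(maf.shape, -1, dtype=int)
    for b in range(len(MAF_EDGES) - 1):
        idx = np.where(maf_bin == b)[0]
        if idx.size == 0:
            continue
        # Rank variants by LD score within the MAF bin, then cut the ranks
        # into n_ld_bins groups of (nearly) equal size.
        ranks = np.argsort(np.argsort(ld_score[idx]))
        ld_bin[idx] = ranks * n_ld_bins // idx.size
    return maf_bin, ld_bin
```

Each (MAF bin, LD bin) pair would then define one set of variants whose relatedness matrix enters the mixed model as a separate variance component, under the assumptions stated above.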
Results Rare variants with MAF 0.01–0.1%, mostly from non-protein-altering regions, accounted for 26% of the variation in age of initiation and 15% of the variation in cessation. Follow-up analysis indicated that about one-third of this rare-variant contribution is potentially confounded with rare-variant population structure, even after adjusting for principal components. After further conservative adjustment for population structure, we estimated SNP-based heritability to be 0.21 (SE = 0.08) for age of initiation, 0.15 (0.06) for cigarettes per day, 0.21 (0.09) for cessation, and 0.24 (0.07) for initiation, 1.8–4.5 times higher than previous SNP-based estimates. Our pedigree-based upper bound for SNP-based heritability ranged from 0.18 to 0.35.
Conclusion The substantial contribution of rare variants to several smoking phenotypes sheds light on the missing heritability and genetic etiology of tobacco use. It also informs fine-mapping strategies, since the majority of the rare-variant contribution was located in non-coding regulatory regions.