In genome-wide association studies, the extent and impact of confounding due to population structure are well recognized. Inadequate handling of such confounding is likely to produce spurious associations, hampering replication and the identification of causal variants. Several strategies have been developed to protect associations against confounding, the most popular being based on Principal Component Analysis. In contrast, the extent and impact of confounding due to population structure in gene-gene interaction (epistasis) association studies are much less investigated and understood. In particular, the role of nonlinear genetic population substructure in epistasis detection is largely under-investigated, especially outside a regression framework.
To identify causal variants acting in synergy, and to improve the interpretability and replicability of epistasis results, we introduce three strategies based on the model-based multifactor dimensionality reduction approach for structured populations, namely MBMDR-PC, MBMDR-PG and MBMDR-GC.
Simulation results comparing the performance of the various approaches show that, in the presence of population structure, MBMDR-PC and MBMDR-PG consistently control the type I error rate at the nominal level better than MBMDR-GC. Moreover, all three proposed methods of population structure correction outperform MDR-SP in terms of statistical power.
Through extensive simulation studies, we demonstrate the effect of various degrees of genetic population structure and relatedness on epistasis detection, and we propose appropriate remedial measures based on linear and nonlinear sample genetic similarity.
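
The Principal Component Analysis idea mentioned above can be illustrated with a small sketch. This is not the MBMDR-PC procedure itself, whose details are not given here; it is a generic illustration, under stated assumptions, of removing linear population structure from a trait before testing. The function names (`top_principal_components`, `residualise`, `simulate_two_populations`), the two-population simulation, and all parameter values are hypothetical.

```python
import numpy as np


def top_principal_components(genotypes, n_components=2):
    """Return leading PC scores of a (samples x SNPs) genotype matrix."""
    # Centre each SNP column, then standardise; monomorphic SNPs keep scale 1.
    centred = genotypes - genotypes.mean(axis=0)
    scale = centred.std(axis=0)
    scale[scale == 0] = 1.0
    standardised = centred / scale
    # Thin SVD: the columns of U * S are the principal component scores.
    u, s, _ = np.linalg.svd(standardised, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def residualise(trait, pcs):
    """Regress the trait on an intercept plus the PCs; return the residuals."""
    design = np.column_stack([np.ones(len(trait)), pcs])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    return trait - design @ coef


def simulate_two_populations(n_per_pop=200, n_snps=300, seed=0):
    """Hypothetical toy data: two populations with different allele
    frequencies and a trait mean that differs by population only."""
    rng = np.random.default_rng(seed)
    freqs_a = rng.uniform(0.1, 0.9, n_snps)
    freqs_b = np.clip(freqs_a + rng.normal(0, 0.2, n_snps), 0.05, 0.95)
    geno_a = rng.binomial(2, freqs_a, size=(n_per_pop, n_snps))
    geno_b = rng.binomial(2, freqs_b, size=(n_per_pop, n_snps))
    genotypes = np.vstack([geno_a, geno_b]).astype(float)
    labels = np.repeat([0, 1], n_per_pop)
    # No SNP is causal: the trait depends only on population membership.
    trait = 1.5 * labels + rng.normal(0, 1, 2 * n_per_pop)
    return genotypes, labels, trait


if __name__ == "__main__":
    genotypes, labels, trait = simulate_two_populations()
    pcs = top_principal_components(genotypes, n_components=2)
    adjusted = residualise(trait, pcs)
    print("trait vs population, raw:     ", np.corrcoef(labels, trait)[0, 1])
    print("trait vs population, adjusted:", np.corrcoef(labels, adjusted)[0, 1])
```

In this toy setting any SNP whose allele frequency differs between the two populations would appear associated with the raw trait. Testing against the adjusted trait instead removes the linear component of that confounding. Nonlinear substructure, which the abstract highlights, is not addressed by this sketch.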

Figures 1-6
On 06 Feb, 2021
Received 31 Jan, 2021
On 17 Jan, 2021
Received 17 Jan, 2021
Invitations sent on 14 Jan, 2021
On 14 Jan, 2021
On 13 Jan, 2021
On 13 Jan, 2021
On 13 Jan, 2021
Posted 02 Jul, 2020
On 05 Nov, 2020
Received 31 Oct, 2020
On 13 Oct, 2020
Received 10 Aug, 2020
On 17 Jul, 2020
Invitations sent on 14 Jul, 2020
On 01 Jul, 2020
On 30 Jun, 2020
On 30 Jun, 2020
On 29 Jun, 2020