Nonlinear ridge regression improves cell-type-specific differential expression analysis
Background: Epigenome-wide association studies (EWAS) and differential gene expression analyses are generally performed on tissue samples, which consist of multiple cell types. Cell-type-specific effects of a trait, such as disease, on omics measurements are of interest but are difficult or costly to measure experimentally. When omics data are measured for bulk tissue, the cell type composition of each sample can be inferred statistically. Cell-type-specific effects are then estimated by linear regression that includes terms representing the interaction between the cell type proportions and the trait. This approach raises two issues: scaling and multicollinearity.
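The linear regression with interaction terms described in the Background can be illustrated with a minimal sketch. It is not the omicwas implementation: the simulated proportions, the binary trait, the parameter values and the helper `interaction_design` are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_cell_types = 200, 3

# Hypothetical cell type proportions per sample (each row sums to 1).
W = rng.dirichlet(alpha=[8.0, 3.0, 1.0], size=n_samples)

# Hypothetical binary trait (e.g. case = 1, control = 0).
x = rng.integers(0, 2, size=n_samples).astype(float)

# Assumed true parameters, both in linear scale:
# baseline level per cell type and cell-type-specific trait effect.
alpha_true = np.array([1.0, 2.0, 3.0])
beta_true = np.array([0.5, 0.0, -1.0])

# Bulk level = proportion-weighted mix of cell-type-specific levels + noise.
y = W @ alpha_true + (W * x[:, None]) @ beta_true + rng.normal(0.0, 0.05, n_samples)

def interaction_design(W, x):
    """Columns: W_h (cell type baselines), then W_h * x (interaction terms)."""
    return np.hstack([W, W * x[:, None]])

D = interaction_design(W, x)

# Ordinary least squares; no separate intercept because rows of W sum to 1.
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
alpha_hat, beta_hat = coef[:n_cell_types], coef[n_cell_types:]
print("estimated cell-type-specific effects:", np.round(beta_hat, 2))
```

In this sketch the coefficient on each interaction column `W_h * x` plays the role of the cell-type-specific effect of the trait.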
Results: First, although cell composition is analyzed in the linear scale, differential methylation and differential expression are more suitably analyzed in the logit and log scales, respectively. To analyze the two scales simultaneously, we applied nonlinear regression. Second, we show that the interaction terms are highly collinear, which hampers ordinary regression. To cope with the multicollinearity, we applied ridge regularization. In simulated data, nonlinear ridge regression attained well-balanced sensitivity, specificity and precision. The marginal model attained the lowest precision and the highest sensitivity, and was the only algorithm to detect weak signals in real data.
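The collinearity of the interaction terms, and the effect of a ridge penalty on it, can be sketched as follows. This is a simplified linear-scale illustration under stated assumptions, not the authors' algorithm: the data are simulated, the helper `ridge_fit` is hypothetical, the penalty weight is arbitrary, and penalizing only the interaction coefficients is an assumption made here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_cell_types = 200, 3

# Hypothetical proportions that vary little between samples, and a binary trait.
W = rng.dirichlet(alpha=[80.0, 30.0, 10.0], size=n_samples)
x = rng.integers(0, 2, size=n_samples).astype(float)

# Each interaction column is the trait scaled by a proportion, so the columns
# move together whenever the proportions vary little across samples.
inter = W * x[:, None]
D = np.hstack([W, inter])
corr = np.corrcoef(inter, rowvar=False)
print("correlation between interaction columns:\n", np.round(corr, 2))

# Simulated bulk levels with assumed true parameters (linear scale).
alpha_true = np.array([1.0, 2.0, 3.0])
beta_true = np.array([0.5, 0.0, -1.0])
y = W @ alpha_true + inter @ beta_true + rng.normal(0.0, 0.1, n_samples)

def ridge_fit(D, y, lam, penalized):
    """Closed-form minimizer of ||y - D c||^2 + lam * sum(c[penalized] ** 2)."""
    P = np.diag(penalized.astype(float))
    return np.linalg.solve(D.T @ D + lam * P, D.T @ y)

# Assumption: penalize only the interaction (trait effect) coefficients.
penalized = np.concatenate([np.zeros(n_cell_types, bool), np.ones(n_cell_types, bool)])

beta_ols = ridge_fit(D, y, 0.0, penalized)[n_cell_types:]    # lam = 0 is OLS
beta_ridge = ridge_fit(D, y, 1.0, penalized)[n_cell_types:]  # arbitrary lam
print("OLS effects:  ", np.round(beta_ols, 2))
print("ridge effects:", np.round(beta_ridge, 2))
```

The ridge penalty shrinks the interaction coefficients toward zero, trading some bias for lower variance; in practice the penalty weight would need to be chosen by a data-driven criterion rather than fixed as it is here.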
Conclusion: Nonlinear ridge regression performed cell-type-specific association tests on bulk omics data with well-balanced performance. The omicwas package for R implements nonlinear ridge regression for cell-type-specific EWAS, differential gene expression and QTL analyses. The software is freely available from https://github.com/fumi-github/omicwas
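One way to write down the two-scale idea behind nonlinear ridge regression is sketched below for gene expression. The notation is an assumption introduced here, not taken from the abstract: $Y_i$ is the bulk level in sample $i$, $W_{h,i}$ the proportion of cell type $h$, $X_i$ the trait, $\alpha_h$ and $\beta_h$ the cell-type-specific baseline and trait effect, and $\lambda$ the ridge penalty weight. How the noise term enters and which coefficients are penalized are likewise assumptions.

```latex
% Linear interaction model: composition and trait effect both in linear scale
Y_i = \sum_{h} W_{h,i}\,(\alpha_h + \beta_h X_i) + \varepsilon_i

% Nonlinear variant: cell types mix in linear scale,
% while the trait effect acts in log scale within each cell type
Y_i = \sum_{h} W_{h,i}\,\exp(\alpha_h + \beta_h X_i) + \varepsilon_i

% Ridge-regularized estimate of the nonlinear variant
(\hat{\alpha}, \hat{\beta}) = \arg\min_{\alpha,\beta}
  \sum_{i} \Bigl( Y_i - \sum_{h} W_{h,i}\,\exp(\alpha_h + \beta_h X_i) \Bigr)^{2}
  + \lambda \sum_{h} \beta_h^{2}
```

For methylation data, the same sketch would replace $\exp$ with the inverse logit function, matching the logit scale mentioned in the Results.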
Posted 06 Jan, 2021