DOI: https://doi.org/10.21203/rs.3.rs-149087/v1
Identifying causal risk factors for severe coronavirus disease 2019 (COVID-19) is critical for its prevention and treatment. Many associated pre-existing conditions and biomarkers have been reported, but these observational associations suffer from confounding and reverse causation. Here, we perform a large-scale two-sample Mendelian randomization (MR) analysis to evaluate the causal roles of many traits in severe COVID-19. Our results highlight multiple body mass index (BMI)-related traits as risk-increasing: BMI (OR:1.89, 95% CI:1.51–2.37), hip circumference (OR:1.46, 1.15–1.85), and waist circumference (OR:1.82, 1.36–2.43). Our multivariable MR analysis further shows that the BMI-related effect is driven by fat mass (OR:1.63, 1.03–2.58), but not fat-free mass (OR:1.00, 0.61–1.66). Several white blood cell counts are negatively associated with severe COVID-19, including those of neutrophils (OR:0.76, 0.61–0.94), granulocytes (OR:0.75, 0.601–0.93), and myeloid white blood cells (OR:0.77, 0.62–0.96). Furthermore, some circulating proteins are associated with an increased risk of (e.g., zinc-alpha-2-glycoprotein) or protection from severe COVID-19 (e.g., interleukin-3/6 receptor subunit alpha). Our study shows that fat mass and white blood cells underlie the etiology of severe COVID-19. It also identifies risk and protective factors that could serve as drug targets and guide the effective protection of high-risk individuals.
The coronavirus disease 2019 (COVID-19) is a global pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2)1. As of mid-January 2021, 92 million confirmed cases and two million deaths from COVID-19 have been reported worldwide2. Despite substantial public health and medical efforts, COVID-19 continues to cause irreversible damage and death3-5. It is essential to identify risk factors and potential drugs for COVID-19 in order to improve primary prevention and to develop treatment strategies.
Many observational studies have reported that age, gender, ethnicity, and pre-existing conditions, such as cardiovascular disease, diabetes, chronic respiratory disease, hypertension, and cancers, are associated with increased COVID-19 susceptibility and severity5-8. Moreover, these retrospective observational studies have noted that hospitalized COVID-19 patients, especially those with severe respiratory or systemic conditions, are at increased risks of atrial fibrillation, non-sustained ventricular tachycardia, acute kidney injury, neurologic disorders, and thrombotic complications9-12. Vitamin D deficiency, higher body mass index (BMI), and obesity have been associated with an increased risk of COVID-1913,14. Some lifestyle factors were also identified as risk-increasing, such as smoking, alcohol consumption, and lack of physical activity15. However, it is difficult to infer causal effects from observational studies because they are susceptible to confounding and reverse causation, while data from randomized controlled trials are scarce and inconclusive.
Mendelian randomization (MR) provides a promising opportunity to validate and prioritize putative risk factors and drug targets. MR studies use randomly allocated genetic variants related to the exposure as instrumental variables for investigating the effect of the exposure on an outcome16. Because of this random allocation, the MR estimate is expected to be independent of confounding factors, and MR has been demonstrated to be an efficient and cost-effective strategy for identifying causal effects17. Recent MR studies have provided evidence of causality for a range of risk factors on COVID-19 (Supplementary Table 1). For instance, BMI and smoking are associated with an increased COVID-19 risk, while no evidence of causal effects was found for circulating 25-hydroxy-vitamin D (25OHD) levels18-20. However, inconsistent results were also reported for some factors, such as Alzheimer’s disease, blood lipids, and physical activity18,19,21-26. Some of these inconsistencies are likely due to the use of early genome-wide association studies (GWAS) of COVID-19, which have small sample sizes. Moreover, most studies are limited to a small number of candidate factors, leaving many more to be tested.
In this study, we conducted an unbiased and exhaustive MR analysis to examine the causal effects of an extensive list of exposures on severe COVID-19. All traits with existing GWAS, as compiled by the Integrative Epidemiology Unit (IEU) Open GWAS project, were included27. Based on these GWAS, independent genetic variants associated with a trait at genome-wide significance were selected as instrumental variables for the trait. The associations between genetic instruments and the risk of severe COVID-19 were evaluated based on three GWAS of COVID-19. The COVID-19 Host Genetics Initiative (HGI) study A2 was used in our discovery analysis. HGI A2 compared COVID-19 patients with confirmed severe respiratory symptoms to population controls28. The HGI B2 study, comparing hospitalized COVID-19 patients to population controls, was used as one of our two replication datasets. The other replication dataset, labeled as the NEJM study, was drawn from the first published GWAS of COVID-19, which compared patients with respiratory failure to healthy controls from Italy and Spain29. Multiple sensitivity analyses were performed to detect and correct for the presence of pleiotropy. We report only associations that show no evidence of pleiotropy in their genetic instruments and that are replicated in at least one of the two replication analyses. As an in-depth investigation into the BMI-related traits, we further conducted a multivariable MR analysis to disentangle the effects of fat mass and fat-free mass. Our findings provide insights into the etiology of severe COVID-19 and prioritize candidate causal risk factors for public health intervention and for drug discovery.
Study overview
The workflow of our study is summarized in Figure 1. Starting with the 34,519 GWAS compiled by the IEU Open GWAS project, we focused on the 14,422 that are based on European-descent samples, in order to match the major ancestry in the GWAS of COVID-19 and to avoid false positives arising from population differences in genetic effects (Supplementary Tables 2 and 3). Genetic instruments were selected from each GWAS as independent genetic variants at genome-wide significance, and their effect alleles were harmonized with the outcome GWAS. Three or more genetic instruments are required for statistical tests of pleiotropic effects, and thus exposures with fewer instruments were excluded. For the univariable MR analysis of each exposure-outcome pair, we first applied the inverse variance-weighted (IVW) method with a multiplicative random-effects model30. We then evaluated the possible presence of pleiotropic effects with Cochran’s Q test of heterogeneity and the MR-Egger intercept test for directional pleiotropy31-33. We excluded all exposures with indications of pleiotropy in their genetic instruments to fulfill the key assumptions underlying MR analysis. We retained 6,442 GWAS for the discovery analysis with the HGI A2 study (Supplementary Table 4), 6,407 GWAS for the replication analysis with the HGI B2 study (Supplementary Table 5), and 6,248 GWAS for the replication analysis with the NEJM study (Supplementary Table 6). The false discovery rate (FDR) approach was used to correct for multiple testing across exposures (Supplementary Tables 7-9).
Based on these three sets of analysis, we defined two sets of results: 1) the significant and replicated results, which have a q-value < 0.05 in the discovery analysis and a nominal p-value < 0.05 in either one of the replication studies (Supplementary Table 10); and 2) the suggestive and replicated results, which have a nominal p-value < 0.05 in the discovery analysis and a nominal p-value < 0.05 in either one of the replication studies (Supplementary Table 11). A total of 49 significant and replicated traits were identified. Among them, 17 were replicated in both replication datasets (Table 1, Supplementary Table 12).
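The two-tier reporting rule above can be written out as a short function. This is a minimal sketch for illustration only, not the code used in this study; the function name `classify` and its arguments are hypothetical. Because every result with a q-value < 0.05 also has a nominal p-value < 0.05, the sketch returns the stronger tier when both apply.

```python
def classify(q_discovery, p_discovery, p_replications, alpha=0.05):
    """Return the evidence tier for one exposure, or None if it is not reported.

    q_discovery, p_discovery: FDR q-value and nominal p-value in the discovery analysis.
    p_replications: nominal p-values from the replication analyses.
    """
    # Replication requires a nominal p-value below alpha in at least one dataset.
    n_replicated = sum(p < alpha for p in p_replications)
    if n_replicated == 0:
        return None
    if q_discovery < alpha:
        return "significant and replicated"
    if p_discovery < alpha:
        return "suggestive and replicated"
    return None


# Zinc-alpha-2-glycoprotein, values taken from Table 1:
# discovery (HGI A2) p = 0.0009, q = 0.0197; replication p = 0.0049 (HGI B2), 0.0121 (NEJM).
tier = classify(q_discovery=0.0197, p_discovery=0.0009, p_replications=[0.0049, 0.0121])
print(tier)
```

Requiring replication before assigning a tier mirrors the text: an association that passes the FDR threshold in the discovery analysis but fails in both replication datasets is not reported at all.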
BMI-related traits
In the univariable MR study, eight BMI-related traits are positively associated with severe COVID-19 in our discovery analysis and also in both of our replication analyses (Table 1). A genetically predicted one standard deviation (SD) increase in BMI is associated with a higher risk of severe COVID-19 (OR: 1.89, 95% CI: 1.51–2.37, p = 3.15 × 10⁻⁸). Consistent with BMI, genetically instrumented higher hip circumference (OR: 1.46, 95% CI: 1.15–1.85, p = 0.0017) and waist circumference (OR: 1.82, 95% CI: 1.36–2.43, p = 6.20 × 10⁻⁵) are associated with a higher risk of severe COVID-19. The univariable MR study also provided strong evidence that weight and fat mass in the left arm, right arm, left leg, right leg, trunk, and whole body are positively associated with severe COVID-19 (Supplementary Tables 10 and 11).
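For readers who want to relate the reported odds ratios to the underlying log-odds estimates, the sketch below converts between the two scales. It assumes a normal approximation on the log-odds scale; the function names are hypothetical, and the recovered standard error is limited by the rounding of the published OR and CI.

```python
import math


def log_scale_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-odds estimate and its SE from a reported OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    return beta, se


def or_summary(beta, se, z_crit=1.959964):
    """Convert a log-odds estimate and SE to an OR, a 95% CI, and a two-sided p-value."""
    odds_ratio = math.exp(beta)
    ci = (math.exp(beta - z_crit * se), math.exp(beta + z_crit * se))
    # Two-sided p-value under the normal approximation: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return odds_ratio, ci, p_value


# BMI in the discovery analysis: OR 1.89, 95% CI 1.51-2.37.
beta, se = log_scale_from_or(1.89, 1.51, 2.37)
odds_ratio, ci, p_value = or_summary(beta, se)
print(round(beta, 3), round(se, 3), p_value)
```

With the rounded BMI values, the recomputed p-value is on the order of 10⁻⁸, in line with the discovery p-value listed for BMI in Table 1.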
To pinpoint the different aspects of BMI-related traits, we investigated the roles of fat mass and fat-free mass indices in severe COVID-19 (Supplementary Table 13). In the multivariable MR analysis controlling for fat-free mass, there is strong evidence for direct causal effects of fat mass measured at different body parts, including the whole body, left and right arms, left and right legs, and the trunk. The evidence is consistent across the three GWAS of COVID-19 (Fig. 2, Supplementary Table 13). On the other hand, there is no evidence for direct causal effects of fat-free mass (Fig. 3, Supplementary Table 13). The multivariable MR analysis results indicate that the causal effects of BMI-related traits on severe COVID-19 are mainly driven by fat mass.
White blood cell traits
In the univariable MR analyses, we identified five white blood cell traits that are negatively associated with the risk of severe COVID-19. Specifically, suggestive associations were found for neutrophil count (OR: 0.76, 95% CI: 0.61–0.94, p = 0.013), sum basophil neutrophil counts (OR: 0.71, 95% CI: 0.57–0.87, p = 0.001), sum neutrophil eosinophil counts (OR: 0.76, 95% CI: 0.61–0.95, p = 0.015), myeloid white cell count (OR: 0.77, 95% CI: 0.62–0.96, p = 0.0197), and granulocyte count (OR: 0.75, 95% CI: 0.601–0.93, p = 0.009) (Fig. 4). For all five traits, causal estimates are broadly concordant between the weighted median (WM) and weighted mode methods, and consistent directions of effect were also found by the MR-Egger method (Supplementary Table 11). Taking neutrophil count as an example, consistent estimates of a protective effect were found with WM (OR: 0.61, 95% CI: 0.42–0.88, p = 0.009) and weighted mode (OR: 0.59, 95% CI: 0.39–0.91, p = 0.017). Overall, our findings support the causal roles of white blood cells, especially neutrophils, in reducing the risk of developing severe COVID-19.
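To illustrate why the weighted median is used as a robustness check, the sketch below computes its point estimate from per-SNP ratio estimates. It is an illustration under stated assumptions, not the analysis code of this study: the function name and the toy data are hypothetical, the weights use the first-order inverse variance of each ratio, and the bootstrap standard error is omitted.

```python
def weighted_median(beta_exposure, beta_outcome, se_outcome):
    """Weighted median of per-SNP ratio estimates (point estimate only)."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order inverse-variance weight of each ratio estimate.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    order = sorted(range(len(ratios)), key=lambda j: ratios[j])
    r = [ratios[j] for j in order]
    w = [weights[j] for j in order]
    total = sum(w)
    # Cumulative weight up to the midpoint of each SNP's own weight, as a fraction.
    cum, running = [], 0.0
    for wj in w:
        running += wj
        cum.append((running - 0.5 * wj) / total)
    below = max(j for j, c in enumerate(cum) if c < 0.5)
    # Linear interpolation between the two ratios that straddle the 50% point.
    frac = (0.5 - cum[below]) / (cum[below + 1] - cum[below])
    return r[below] + (r[below + 1] - r[below]) * frac


# Toy data: four SNPs share a ratio of -0.3; the fifth is a pleiotropic outlier (ratio 0.5).
bx = [0.10, 0.12, 0.08, 0.11, 0.09]
by = [-0.030, -0.036, -0.024, -0.033, 0.045]
se = [0.01] * 5
wm = weighted_median(bx, by, se)
print(round(wm, 3))
```

In this toy example the outlying SNP carries well under half of the total weight, so the weighted median stays at the ratio shared by the other four SNPs.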
Circulating proteins
Our univariable MR analyses revealed evidence of causal effects for some circulating proteins. There are six proteins whose effects on severe COVID-19 are significant in the discovery MR analysis (q-value < 0.05) and also replicated in both replication analyses (p-value < 0.05) (Table 1, Supplementary Table 12). Three of them are negatively associated with the risk of severe COVID-19: interleukin-3 receptor subunit alpha (OR: 0.87, 95% CI: 0.79–0.94), interleukin-6 receptor subunit alpha (OR: 0.88, 95% CI: 0.83–0.94), and prostate-associated microseminoprotein (OR: 0.71, 95% CI: 0.58–0.86). The other three are risk-increasing: zinc-alpha-2-glycoprotein (OR: 1.37, 95% CI: 1.14–1.66), C1GALT1-specific chaperone 1 (OR: 1.20, 95% CI: 1.19–1.21), and corneodesmosin (OR: 1.12, 95% CI: 1.09–1.16). Another six circulating proteins have significant and replicated effects on severe COVID-19, although they are replicated in only one replication analysis (Supplementary Table 10): inter-alpha-trypsin inhibitor heavy chain H1 (OR: 1.08, 95% CI: 1.04–1.12), alpha-2-macroglobulin receptor-associated protein (OR: 1.14, 95% CI: 1.07–1.23), resistin (OR: 1.09, 95% CI: 1.07–1.11), reticulon-4 receptor (OR: 0.86, 95% CI: 0.79–0.93), C-C motif chemokine 23 (OR: 0.88, 95% CI: 0.83–0.92), and collectin-10 (OR: 0.83, 95% CI: 0.76–0.901). Additionally, our suggestive and replicated results revealed another 14 proteins to be associated with the risk of severe COVID-19 (Supplementary Table 11). Overall, our MR analyses prioritized scores of circulating proteins that are likely causal in the development of severe COVID-19.
This exhaustive MR study examined an extensive list of risk factors and prioritized those that are likely to play causal roles in the development of severe COVID-19. It leveraged GWAS of COVID-19 of the largest sample size, and the findings were replicated with one, and for some associations, two additional COVID-19 GWAS. Using univariable MR, we first confirmed that BMI-related traits are causal risk factors for severe COVID-19. Our multivariable MR results further suggested that the effects of BMI-related traits are driven by fat mass but not fat-free mass. Moreover, our findings indicate that white blood cell traits, particularly neutrophils, are inversely associated with the severe COVID-19 risk. We also highlighted scores of circulating proteins that could potentially serve as drug targets.
Our main finding that higher values of BMI-related traits increase the risk of severe COVID-19 is consistent with several recent MR studies18,19,23,24. Our multivariable MR analysis further showed that fat mass is a causal risk factor for severe COVID-19, while fat-free mass is not. These results indicate that the causal effect of BMI on severe COVID-19 is likely driven by fat mass. The causal effects of BMI and fat mass have plausible biological mechanisms. Fat mass is known to have deleterious effects on lung function, inflammation, and immunity34-36. In adipose tissue, high production of circulating proinflammatory cytokines and adipokines may intensify virally induced inflammation and immune dysregulation, and contribute to acute respiratory distress syndrome, which is the leading cause of mortality from COVID-1937-40. Notably, other causal risk factors of severe COVID-19 identified in this study are also related to adiposity, including glucosamine, resistin, IL6Ra, prostate-associated microseminoprotein, and zinc-alpha-2-glycoprotein41-45. These connections suggest a shared mechanism for their contribution to severe COVID-19. Therefore, further mechanistic understanding of fat mass and other related risk factors will shed light on the etiology of severe COVID-19 and provide multiple targets of intervention for prevention and treatment.
Our study indicates that white blood cell traits, especially neutrophils, reduce the risk of severe COVID-19. In addition to direct evidence from neutrophils, sum basophil neutrophil counts and sum neutrophil eosinophil counts are directly related to neutrophils, and concordant causal effects were obtained using multiple MR methods. We also identified myeloid white blood cell counts and granulocyte counts as being inversely associated with the risk of severe COVID-19, which is consistent with our previous MR findings46. In contrast to the negative associations in this MR study, previous observational studies have provided strong evidence that elevated white blood cells and neutrophils but depleted lymphocytes are common in COVID-19 patients47-50. This discrepancy highlights the possibility that observed associations are due to confounding and reverse causation. The causal role of neutrophils in preventing the development of severe COVID-19 has biological support. Neutrophils, the integral components of the innate immune system, are the first line of defense against invading pathogens51. Moreover, neutrophils participate in elaborate cell signaling networks involving cytokines, chemokines, survival, and growth factors that cause downstream pro-inflammatory effects52. On the other hand, neutrophils are involved in the hyperinflammatory responses (e.g., overproduction of neutrophil extracellular traps and cytokine storm) in severe COVID-19 patients. This reflects the reverse causal effect of COVID-19 on neutrophils53,54. Overall, our present results support the causal effects of white blood cells, especially neutrophils, on severe COVID-19, likely through an enhanced immune response that suppresses virus infection in the early stage.
In our search for potential drug targets, we found that six immune-related proteins are inversely associated with the risk of severe COVID-19. Interleukin-6 receptor subunit alpha is associated with a decreased risk of severe COVID-19, which is consistent with a recent MR finding55. Signaling of the interleukin-6 receptor subunit alpha influences many inflammatory molecules and tissue regeneration56. Interleukin-3 receptor subunit alpha has important functions in the hemopoietic, vascular, and immune systems57. Prostate-associated microseminoprotein may influence inflammation and cancer development58. C-C motif chemokine 23 is a chemotactic agent that probably plays an important role in inflammation and atherosclerosis59. Collectin-10 can act as a cellular chemoattractant in vitro and is probably involved in the regulation of cell migration60. Reticulon-4 receptor influences the central nervous system and protects motoneurons against apoptosis61. The identification of these immune-related circulating proteins highlights the critical role of immune responses in the development of severe COVID-19.
Our study identified another six circulating proteins that are associated with an increased risk of severe COVID-19. Most of them are glycoproteins. The effect of zinc-alpha-2-glycoprotein might be mediated by the depletion of fatty acids from adipose tissues62. C1GALT1-specific chaperone 1 might abolish a glycosyltransferase function and disrupt the O-glycan Core 1 synthesis63,64. Corneodesmosin is a glycoprotein expressed in the epidermis and the inner root sheath of hair follicles65. Inter-alpha-trypsin inhibitor heavy chain H1 is involved in cell adhesion and leukocyte migration at sites of inflammation66. The alpha-2-macroglobulin receptor-associated protein is responsible for the role of exotoxin A in pseudomonas disease and immunity67. Resistin is known as a hormone that potentially links obesity to diabetes by resisting insulin action45. More in-depth mechanistic work is needed to better understand the physiological and biological processes through which these druggable proteins contribute to COVID-19 severity. While our study identified scores of circulating proteins, we cannot rule out the possibility that there are more COVID-19-related proteins. The number of genetic instruments is often limited for circulating proteins, precluding many of them from being analyzed. Our findings on circulating proteins not only suggest possible etiological processes but also provide potential druggable targets.
Our study has many strengths. One strength as an MR study is the ability to assess causal effects, avoiding bias from reverse causation and residual confounding. A major feature and strength of our study is an unbiased and exhaustive approach to screen an extensive list of risk factors. To address the issue of multiple testing, we used FDR corrections in the discovery analysis. To ensure robustness and reduce false positives, we only reported results that were replicated in at least one replication analysis. Another strength is that we applied multivariable MR analyses to evaluate the independent causal effects of fat mass and fat-free mass.
Our study also has several weaknesses. Although we applied multiple sensitivity analyses, including the heterogeneity test, MR-Egger, and WM method, we could not fully rule out the possibility that some genetic variants might be pleiotropic. Another limitation is that some GWAS of exposures and the HGI GWAS of COVID-19 have overlapped samples, especially those from the UK Biobank. To mitigate this issue, we also utilized another GWAS of COVID-19, the NEJM study, which does not have overlapping samples. A further weakness is that the statistical power for some analyses was limited, and some null results might be false negatives. Further positive findings may be revealed if more GWAS with larger sample sizes become available. As an effort to reduce complications from population stratification, our study focuses on European ancestry, and thus the findings may not be generalizable to other ethnicities. Lastly, further research is required to decipher the biological pathways underpinning the observed associations with severe COVID-19.
In conclusion, the present study provides evidence that the causal association between BMI-related traits and severe COVID-19 is driven by fat mass, but not by fat-free mass. Our findings suggest that neutrophils, granulocytes, and myeloid white blood cells are inversely associated with the severe COVID-19 risk. Our study also identifies putatively causal associations between 12 circulating proteins and severe COVID-19. These findings provide valuable insights into the etiology of severe COVID-19. These prioritized risk and protective factors could serve as drug targets and guide the effective protection of high-risk populations.
Exposure data sources
To obtain a comprehensive list of traits with existing GWAS, the summary statistics of 34,519 published GWAS were extracted from the latest MRC Integrative Epidemiology Unit (University of Bristol) GWAS database (https://gwas.mrcieu.ac.uk/). Details of each GWAS study can be found at https://gwas.mrcieu.ac.uk/datasets/. The R package TwoSampleMR (version 0.5.5) was applied to retrieve the IEU GWAS datasets68. The univariable MR study was conducted using the same package.
These GWAS were further filtered based on the following criteria: 1) European ancestry; and 2) not an expression quantitative trait locus (eQTL) study, that is, excluding datasets labeled as “eqtl” from eQTLGen 201969. A total of 14,422 GWAS summary datasets were retained and used in this study. Detailed information on the data sources used is available in Supplementary Table 2, while details for each exposure and its GWAS are available in Supplementary Table 3.
Outcome data sources
For evaluation of the association with COVID-19 severity, the instrument-outcome effects were retrieved from the recent version of GWAS meta-analysis by the COVID-19 Host Genetics Initiative (HGI, release 4 alpha, accessed on October 9, 2020)28. Detailed information has been provided on the COVID-19 HGI website (https://www.covid19hg.org/results/). In our primary discovery analysis, we used the summary statistics based on the comparison of 2,972 patients confirmed as “very severe respiratory” COVID-19 with the 284,472 general population samples. This is called “the HGI A2 study” in this study.
To reduce false positives and to ensure the robustness of our discoveries, replication analyses were performed with two additional GWAS of COVID-19. One of them was also from the COVID-19 HGI, comparing 6,492 hospitalized COVID-19 patients with 1,012,809 control participants. We called it “the HGI B2 study”. Only single nucleotide polymorphisms (SNPs) with imputation quality scores > 0.6 were retained. The other GWAS was on 1,610 COVID-19 patients with respiratory failure and 2,180 controls from Italy and Spain, and it was called “the NEJM study”29.
Selection of instrumental variables
For the implementation of MR, SNPs were selected based on the genome-wide significance threshold (p < 5 × 10⁻⁸). To ensure that SNPs are independent, we pruned the variants by linkage disequilibrium (LD) (r² threshold of 0.001 within a clumping window of 10,000 kb). When target SNPs were not present in the outcome dataset, proxy SNPs were used instead through LD tagging (minimum LD r² threshold of 0.8). The effect alleles of selected genetic variants were harmonized across the exposure and outcome associations.
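The selection and harmonization steps can be sketched as follows. This is a simplified illustration, not the TwoSampleMR implementation: the function names, record fields, and SNP labels are hypothetical, and LD clumping, proxy lookup, and strand or palindromic-SNP checks are deliberately omitted because they require a reference panel and allele frequencies.

```python
GENOME_WIDE_P = 5e-8


def select_instruments(exposure_rows, p_threshold=GENOME_WIDE_P):
    """Keep SNPs whose exposure association reaches genome-wide significance."""
    return [row for row in exposure_rows if row["p"] < p_threshold]


def harmonize(exposure, outcome):
    """Align the outcome effect to the exposure effect allele; None if alleles disagree."""
    exp_alleles = (exposure["effect_allele"], exposure["other_allele"])
    out_alleles = (outcome["effect_allele"], outcome["other_allele"])
    if exp_alleles == out_alleles:
        beta_out = outcome["beta"]
    elif exp_alleles == out_alleles[::-1]:
        beta_out = -outcome["beta"]  # flip sign so both betas refer to the same allele
    else:
        return None  # incompatible alleles: drop the SNP
    return {"snp": exposure["snp"], "beta_exposure": exposure["beta"],
            "beta_outcome": beta_out, "se_outcome": outcome["se"]}


# Hypothetical summary statistics for three SNPs.
exposure_rows = [
    {"snp": "snp_a", "effect_allele": "A", "other_allele": "G", "beta": 0.12, "p": 3e-10},
    {"snp": "snp_b", "effect_allele": "C", "other_allele": "T", "beta": 0.08, "p": 1e-9},
    {"snp": "snp_c", "effect_allele": "G", "other_allele": "T", "beta": 0.05, "p": 2e-6},
]
outcome_rows = {
    "snp_a": {"effect_allele": "A", "other_allele": "G", "beta": 0.04, "se": 0.02},
    "snp_b": {"effect_allele": "T", "other_allele": "C", "beta": -0.03, "se": 0.02},
}

harmonized = []
for row in select_instruments(exposure_rows):
    outcome = outcome_rows.get(row["snp"])
    if outcome is None:
        continue  # a proxy SNP would be looked up here
    aligned = harmonize(row, outcome)
    if aligned is not None:
        harmonized.append(aligned)
print(harmonized)
```

In the toy data, `snp_c` fails the significance threshold, and the outcome effect of `snp_b` changes sign because its effect and other alleles are swapped relative to the exposure GWAS.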
Univariable Mendelian randomization
Two-sample Mendelian randomization analysis was undertaken using GWAS summary statistics for each exposure-outcome pair. To estimate the causal effect of each trait on severe COVID-19, the IVW method with a multiplicative random-effects model was used as the primary analysis30,33,70. Horizontal pleiotropy occurs when SNPs affect severe COVID-19 through biological pathways that are independent of the exposure. To assess heterogeneity among genetic instruments, Cochran’s Q statistic was calculated for the IVW analyses31. An extended version of Cochran’s Q statistic (Rücker’s Q′) can be estimated for the MR-Egger analysis32. We used the MR-Egger intercept test to evaluate the extent to which unbalanced horizontal pleiotropy may affect the effect estimate33. To account for pleiotropy, additional sensitivity analyses were performed with the MR-Egger33,70, weighted median (WM)71, and weighted mode methods72. The MR-Egger method allows for unbalanced horizontal pleiotropic effects even when all SNPs are invalid instruments33. The WM method can provide robust causal estimates when at least 50% of SNPs are valid genetic instruments, while the weighted mode method requires that the largest group of instruments identifying a consistent causal effect be valid instruments71,72. The false discovery rate (FDR) approach was used to correct for multiple testing, and it was applied to the p-values from the IVW random-effects model73. An association was deemed significant if its q-value was < 0.05, a threshold that controls the expected proportion of false positives among the associations declared significant, and suggestive if its unadjusted p-value was < 0.05.
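The primary estimator and the two pleiotropy diagnostics described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the TwoSampleMR source: the function names and toy data are hypothetical, standard errors are inflated by the residual dispersion only when it exceeds one (a common convention, assumed here), and p-values for Q and for the intercept are omitted.

```python
import math


def ivw_random_effects(bx, by, se_y):
    """IVW estimate with multiplicative random effects, plus Cochran's Q."""
    w = [1.0 / s ** 2 for s in se_y]
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    sxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    beta = sxy / sxx
    # Cochran's Q: weighted squared deviations from the fitted line through the origin.
    q = sum(wi * (y - beta * x) ** 2 for wi, x, y in zip(w, bx, by))
    dispersion = q / (len(bx) - 1)
    se = math.sqrt(max(1.0, dispersion) / sxx)  # never narrower than the fixed-effect SE
    return beta, se, q


def egger_intercept(bx, by, se_y):
    """Weighted regression with an intercept; returns (intercept, se_intercept, slope)."""
    # Orient every SNP so that its exposure effect is positive.
    by = [y if x >= 0 else -y for x, y in zip(bx, by)]
    bx = [abs(x) for x in bx]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    sx = sum(wi * x for wi, x in zip(w, bx))
    sy = sum(wi * y for wi, y in zip(w, by))
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    sxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    det = sw * sxx - sx * sx
    slope = (sw * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    rss = sum(wi * (y - intercept - slope * x) ** 2 for wi, x, y in zip(w, bx, by))
    sigma2 = max(1.0, rss / (len(bx) - 2))
    se_intercept = math.sqrt(sigma2 * sxx / det)
    return intercept, se_intercept, slope


# Toy data: true causal effect 0.5, first without and then with directional pleiotropy.
bx = [0.10, 0.20, 0.15, 0.05, 0.25]
se_y = [0.02] * 5
by_valid = [0.5 * x for x in bx]
by_pleiotropic = [0.5 * x + 0.02 for x in bx]

beta_valid, se_valid, q_valid = ivw_random_effects(bx, by_valid, se_y)
beta_biased, se_biased, q_biased = ivw_random_effects(bx, by_pleiotropic, se_y)
intercept_valid, _, _ = egger_intercept(bx, by_valid, se_y)
intercept_biased, _, slope_biased = egger_intercept(bx, by_pleiotropic, se_y)
print(beta_valid, beta_biased, intercept_biased, slope_biased)
```

In the toy data, a constant pleiotropic offset biases the IVW estimate upward, whereas the MR-Egger regression absorbs the offset in its intercept and recovers the slope. A non-zero intercept is therefore the signal that the intercept test looks for.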
Two additional exclusion criteria were applied to filter out exposures: 1) exposures with fewer than three genetic instruments, because three or more are required for statistical tests of pleiotropic effects and for sensitivity analyses that correct for pleiotropy; and 2) exposures with indications of pleiotropy in their genetic instruments, because the presence of pleiotropy violates the assumptions of MR analysis. For the remaining exposures, FDR correction for multiple testing was applied separately for each analysis with the HGI A2, HGI B2, or NEJM study. To identify potential causal risk factors for severe COVID-19, we defined two tiers of evidence. First, the significant and replicated results were defined as those with a q-value < 0.05 in the discovery analysis and a nominal p-value < 0.05 in either one of the replication studies (Supplementary Table 10). Second, the suggestive and replicated results were defined as those with a nominal p-value < 0.05 in the discovery analysis and a nominal p-value < 0.05 in either one of the replication studies (Supplementary Table 11). All MR analyses were conducted in R with the TwoSampleMR package68. An analysis flowchart is shown in Fig. 1.
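The text does not name the specific FDR procedure, so the sketch below assumes the Benjamini-Hochberg step-up method, which is one common way to obtain FDR-adjusted values. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical IVW p-values for eight exposures in one analysis.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
q_values = benjamini_hochberg(p_values)
n_significant = sum(q < 0.05 for q in q_values)
print(q_values, n_significant)
```

Because the adjustment depends on the number of tests, applying it separately to the HGI A2, HGI B2, and NEJM analyses, as described above, can give different adjusted values for the same nominal p-value.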
Multivariable Mendelian randomization
As many BMI-related traits are typically correlated with each other, we conducted a two-sample multivariable MR (MVMR) analysis to explore independent causal risk factors for severe COVID-1974. SNPs associated with fat mass and fat-free mass were obtained from previous GWAS by MRC IEU and the Neale Lab through the TwoSampleMR package. The effects of genetically predicted fat mass and fat-free mass were estimated jointly for each body compartment (whole body, left arm, right arm, left leg, right leg, and trunk) using the MVMR package (version 0.2.0) in R.
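The idea behind the multivariable analysis can be illustrated with a two-exposure inverse-variance weighted regression, in which SNP-outcome effects are regressed on the SNP effects for both exposures without an intercept. This is a minimal sketch under stated assumptions, not the MVMR package code: the function name and toy data are hypothetical, and standard errors and instrument-strength diagnostics are omitted.

```python
def mvmr_ivw(bx1, bx2, by, se_y):
    """Direct effects of two exposures from a weighted no-intercept regression."""
    w = [1.0 / s ** 2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    s1y = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    s2y = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("exposure effects are collinear; direct effects are not identified")
    # Solve the 2x2 weighted normal equations.
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Toy data: SNP effects on fat mass (bx_fat) and fat-free mass (bx_lean) are correlated,
# but only fat mass affects the outcome (direct effect 0.5 on the log-odds scale).
bx_fat = [0.10, 0.20, 0.15, 0.05, 0.25, 0.12]
bx_lean = [0.08, 0.10, 0.20, 0.02, 0.15, 0.18]
by = [0.5 * x for x in bx_fat]
se_y = [0.02] * 6

effect_fat, effect_lean = mvmr_ivw(bx_fat, bx_lean, by, se_y)
print(round(effect_fat, 6), round(effect_lean, 6))
```

Because both exposures enter the same regression, each coefficient is a direct effect conditional on the other exposure, which is what allows fat mass and fat-free mass to be separated despite their correlation.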
Data availability
All analyses were conducted using publicly available data. The exposure data (GWAS summary statistics) used in the analyses described here are freely accessible on the MR-Base platform (https://www.mrbase.org/). We downloaded COVID-19 data (GWAS summary statistics) from the COVID-19 Host Genetics Initiative (https://www.covid19hg.org/) and the COVID GWAS results browser (https://ikmb.shinyapps.io/COVID-19_GWAS_Browser/).
Code availability
The code used in the Mendelian randomization analyses described here is freely accessible in the TwoSampleMR R package via GitHub (https://github.com/MRCIEU/TwoSampleMR/). Full documentation for the R package is also provided (https://mrcieu.github.io/TwoSampleMR/). We implemented the MVMR analysis using the MVMR R package (https://github.com/WSpiller/MVMR/).
Acknowledgments
We thank the investigators of the COVID-19 genome-wide association study, the COVID-19 Host Genetics Initiative, and the FinnGen consortium for sharing summary-level data. We would like to express our gratitude to all other Ye lab members for stimulating discussions.
Contributions
Y.S. and K.Y. conceived the study. Y.S. performed data analysis and prepared visualizations. Y.S., J.Z., and K.Y. interpreted the results. Y.S. and K.Y. wrote the first draft of the manuscript. All authors reviewed and approved the final version.
Competing interests
The authors declare no competing interests.
Table 1. Significant and replicated causal associations with severe COVID-19
| GWAS ID | Exposure | Outcome dataset | SNP# | Odds ratio | 95% CI | p-value | q-value |
|---|---|---|---|---|---|---|---|
| ukb-b-19953 | Body mass index (BMI) | HGI A2 | 454 | 1.89 | 1.51, 2.37 | 3.15E-08 | 1.78E-06 |
| | | HGI B2 | 453 | 1.69 | 1.45, 1.97 | 1.31E-11 | 9.34E-10 |
| | | NEJM | 449 | 1.53 | 1.05, 2.25 | 0.0286 | 0.2867 |
| ukb-b-15590 | Hip circumference | HGI A2 | 417 | 1.46 | 1.15, 1.85 | 0.0017 | 0.0319 |
| | | HGI B2 | 417 | 1.51 | 1.28, 1.77 | 5.93E-07 | 1.88E-05 |
| | | NEJM | 412 | 1.66 | 1.11, 2.50 | 0.0142 | 0.1871 |
| ukb-b-9405 | Waist circumference | HGI A2 | 371 | 1.82 | 1.36, 2.43 | 6.20E-05 | 0.0017 |
| | | HGI B2 | 371 | 1.92 | 1.57, 2.34 | 1.57E-10 | 9.33E-09 |
| | | NEJM | 369 | 1.95 | 1.21, 3.16 | 0.0065 | 0.1108 |
| ukb-b-8338 | Arm fat mass (left) | HGI A2 | 419 | 1.69 | 1.33, 2.14 | 1.33E-05 | 0.0004 |
| | | HGI B2 | 419 | 1.61 | 1.37, 1.90 | 9.35E-09 | 3.70E-07 |
| | | NEJM | 417 | 1.52 | 1.02, 2.29 | 0.0420 | 0.3645 |
| ukb-b-6704 | Arm fat mass (right) | HGI A2 | 424 | 1.74 | 1.36, 2.22 | 7.64E-06 | 0.0003 |
| | | HGI B2 | 424 | 1.68 | 1.43, 1.96 | 1.16E-10 | 7.36E-09 |
| | | NEJM | 422 | 1.65 | 1.12, 2.44 | 0.0115 | 0.1613 |
| ukb-a-279 | Leg fat mass (left) | HGI A2 | 287 | 2.76 | 2.00, 3.81 | 8.04E-10 | 5.55E-08 |
| | | HGI B2 | 287 | 2.20 | 1.77, 2.72 | 5.73E-13 | 5.66E-11 |
| | | NEJM | 285 | 1.82 | 1.08, 3.07 | 0.0243 | 0.2617 |
| ukb-b-7212 | Leg fat mass (left) | HGI A2 | 423 | 2.41 | 1.81, 3.22 | 2.02E-09 | 1.30E-07 |
| | | HGI B2 | 422 | 2.14 | 1.75, 2.61 | 1.15E-13 | 1.36E-11 |
| | | NEJM | 418 | 1.98 | 1.18, 3.31 | 0.0094 | 0.1442 |
| ukb-b-18096 | Leg fat mass (right) | HGI A2 | 425 | 2.66 | 1.98, 3.58 | 8.66E-11 | 7.34E-09 |
| | | HGI B2 | 424 | 2.33 | 1.90, 2.86 | 5.09E-16 | 7.97E-14 |
| | | NEJM | 420 | 2.04 | 1.23, 3.40 | 0.0061 | 0.1059 |
| ukb-b-18377 | Leg fat percentage (left) | HGI A2 | 381 | 4.05 | 2.70, 6.09 | 1.56E-11 | 1.39E-09 |
| | | HGI B2 | 379 | 2.76 | 2.09, 3.63 | 5.01E-13 | 5.25E-11 |
| | | NEJM | 375 | 2.11 | 1.02, 4.37 | 0.0453 | 0.3786 |
| ukb-b-6591 | Age at first sexual intercourse | HGI A2 | 199 | 0.39 | 0.26, 0.59 | 7.25E-06 | 0.0003 |
| | | HGI B2 | 198 | 0.43 | 0.33, 0.56 | 7.50E-10 | 3.81E-08 |
| | | NEJM | 196 | 0.52 | 0.27, 0.97 | 0.0405 | 0.3566 |
| ukb-b-4667 | Leisure/social activities: Religious group | HGI A2 | 23 | 0.00 | 0.00, 0.12 | 0.0028 | 0.0468 |
| | | HGI B2 | 23 | 0.01 | 0.00, 0.16 | 0.0006 | 0.0108 |
| | | NEJM | 23 | 0.02 | 0.00, 0.84 | 0.0402 | 0.3566 |
| ukb-b-11535 | Mineral and other dietary supplements: Glucosamine | HGI A2 | 5 | 0.00 | 0.00, 0.00 | 1.23E-11 | 1.15E-09 |
| | | HGI B2 | 5 | 0.00 | 0.00, 0.04 | 3.74E-06 | 0.0001 |
| | | NEJM | 5 | 0.00 | 0.00, 0.00 | 3.69E-06 | 0.0002 |
| prot-a-208 | Zinc-alpha-2-glycoprotein | HGI A2 | 3 | 1.37 | 1.14, 1.66 | 0.0009 | 0.0197 |
| | | HGI B2 | 3 | 1.24 | 1.07, 1.45 | 0.0049 | 0.0580 |
| | | NEJM | 3 | 1.40 | 1.08, 1.82 | 0.0121 | 0.1645 |
| prot-a-294 | C1GALT1-specific chaperone 1 | HGI A2 | 3 | 1.20 | 1.19, 1.21 | < 2E-16 | < 2E-16 |
| | | HGI B2 | 3 | 1.22 | 1.05, 1.41 | 0.0074 | 0.0800 |
| | | NEJM | 3 | 1.53 | 1.26, 1.86 | 2.07E-05 | 0.0009 |
| prot-a-500 | Corneodesmosin | HGI A2 | 4 | 1.12 | 1.09, 1.16 | 1.31E-12 | 1.43E-10 |
| | | HGI B2 | 4 | 1.06 | 1.05, 1.08 | 1.37E-14 | 1.88E-12 |
| | | NEJM | 4 | 1.27 | 1.14, 1.41 | 2.40E-05 | 0.0010 |
| prot-a-1530 | Interleukin-3 receptor subunit alpha | HGI A2 | 3 | 0.87 | 0.79, 0.94 | 0.0011 | 0.0238 |
| | | HGI B2 | 3 | 0.84 | 0.79, 0.90 | 7.38E-08 | 2.68E-06 |
| | | NEJM | 3 | 0.74 | 0.58, 0.94 | 0.0136 | 0.1804 |
| prot-a-1540 | Interleukin-6 receptor subunit alpha | HGI A2 | 3 | 0.88 | 0.83, 0.94 | 0.0001 | 0.0036 |
| | | HGI B2 | 3 | 0.90 | 0.86, 0.95 | 2.39E-05 | 0.0006 |
| | | NEJM | 3 | 0.82 | 0.70, 0.96 | 0.0160 | 0.2040 |
| prot-a-1950 | Prostate-associated microseminoprotein | HGI A2 | 3 | 0.71 | 0.58, 0.86 | 0.0005 | 0.0121 |
| | | HGI B2 | 3 | 0.81 | 0.71, 0.93 | 0.0023 | 0.0316 |
| | | NEJM | 3 | 0.79 | 0.66, 0.96 | 0.0176 | 0.2159 |
NOTE: Only associations that are replicated in both replication analyses (with the HGI B2 and NEJM datasets) are included here. Odds ratios and 95% confidence intervals were derived using the inverse-variance weighted random-effects model. The full list, including those that are replicated with only one study, is available in Supplementary Table 10. SNP, single nucleotide polymorphism; SNP#, number of SNPs retained for this analysis; CI, confidence interval.
Supplementary Table 1. Notable Mendelian randomization studies of COVID-19.
Supplementary Table 2. Details of the datasets used in the present Mendelian randomization study.
Supplementary Table 3. Dataset information of all traits included in our MR analysis.
Supplementary Table 4. All MR results based on the HGI A2 dataset.
Supplementary Table 5. All MR results based on the HGI B2 dataset.
Supplementary Table 6. All MR results based on the NEJM dataset.
Supplementary Table 7. All MR results using FDR, based on the HGI A2 dataset.
Supplementary Table 8. All MR results using FDR, based on the HGI B2 dataset.
Supplementary Table 9. All MR results using FDR, based on the NEJM dataset.
Supplementary Table 10. Significant and replicated results (IVW FDR < 0.05 with HGI A2; and IVW p < 0.05 with HGI B2 OR IVW p < 0.05 with NEJM).
Supplementary Table 11. Suggestive and replicated results (IVW p < 0.05 with HGI A2; and IVW p < 0.05 with HGI B2 OR IVW p < 0.05 with NEJM).
Supplementary Table 12. Significant and replicated results that are replicated with both HGI B2 and NEJM (IVW FDR < 0.05 with HGI A2; and IVW p < 0.05 with HGI B2 AND IVW p < 0.05 with NEJM).
Supplementary Table 13. Multivariable Mendelian randomization of fat mass and fat-free mass indices on severe COVID-19.