The role of thrombophilia related polymorphisms in Coronavirus pandemic mortality

The current study presents a comprehensive analysis of several databases and an extensive bibliographic review, aiming to evaluate the correlation between the frequency of thrombophilia-related polymorphisms and COVID-19 mortality rates worldwide. We found 8 polymorphisms in 8 genes statistically correlated with Daily Death Rates of COVID-19, an important clue for identifying genes involved in poor prognosis of COVID-19, specifically those related to thromboembolic events.

A few studies have associated COVID-19 with genetic factors, mostly related to the immune response, such as the HLA Class I genes (5). The ABO blood group and single nucleotide polymorphisms (SNPs) in the 3p21.31 and 9q34.2 (6) and the 19p13.3, 12q24.13 and 21q22.1 (4) regions were also recently reported as associated in genome-wide association studies (GWAS). Ongoing GWAS initiatives may reveal further key host genetic players underpinning susceptibility or resistance to the disease. Proteins of the coagulation cascade are now being described as critically influenced by the pathological alterations brought about by COVID-19, which modulate key elements in these pathways, such as von Willebrand factor (vWF) and the antithrombin III binding agent (2). Interestingly, Manne et al. reported that the increase of P-selectin, a major platelet activator, occurs mainly during the acute phase of COVID-19, which may unveil key gene modulations accounting for the thrombophilia (2).
Although hematological parameters monitored by routine blood screening tests, such as D-dimer, together with thromboembolic comorbidities (1), are considered relevant risk factors for thrombosis-related events, little is still known about the genetic basis underlying the clotting cascade in the presence of COVID-19. In this context, genetic polymorphisms that predispose to thromboembolism have the potential to generate clinically relevant knowledge on COVID-19 pathogenesis that can be used for prognosis and stratification of therapy.
Hence, we compiled a comprehensive list of functional polymorphisms in 24 key genes or clusters related to thrombophilia, obtained from surveys of the OMIM (8) and Orphanet (9) databases and listed in the notes of Table 1. Since very low frequency SNPs are unlikely to contribute to general epidemiological findings on COVID-19, SNPs with a frequency below 1% were not considered for the meta-analysis. Additionally, a number of SNPs have no available population frequencies. The worldwide population frequencies of the remaining 18 SNPs, mapping to 15 genes, were retrieved from the Ensembl database (10) and from a comprehensive literature survey using the SNP IDs as keywords (raw data and bibliographic sources are given in Additional Files 1 and 2). They are presented in Table 1, with their respective gene location, frequency ranges and effects on thrombophilia.
Table 1: Chromosomal localization of each gene, the mutations they present, their frequency and their effects. Recoverable column headers: Genes (1), Chromosome [...].
Notes: (3) Available at OMIM; (4) Thrombopoietin receptor; (5) ADAM metallopeptidase with thrombospondin type 1 motif 13; (6) Coagulation factor XIII B chain; (7) Fibrinogen alpha chain; (8) Coagulation factor IX; (9) Thrombomodulin; (10) Hyaluronan binding protein 2; (11) Serpin family E member 1; (12) Methylenetetrahydrofolate reductase; (13) Fibrinogen alpha chain; (14) Uncharacterized LOC105378861; (15) Fibrinogen gamma chain; (16) Coagulation factor V; (17) Coagulation factor II, Thrombin. The other six [...] not included in this list are: Kininogen 1 (KNG1), Protein C, inactivator of coagulation factors Va and VIIIa (PROC), Protein S (PROS1), Serpin family C member 1 (SERPINC1), Vitamin K epoxide reductase complex subunit 1 (VKORC1), Janus kinase 2 (JAK2) and Host cell factor C2 (HCFC2). KNG1, PROC and JAK2 do not have any frequency available for their SNPs; PROS1, SERPINC1 and HCFC2 [...] two SNPs with frequency available, but all of them have frequencies lower than 1%. The genes SERPINC1, PROC and PROS1 are polymorphic, having at least 8 mutations described.
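The SNP selection step described before Table 1 can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: it assumes that an SNP is discarded when it has no reported population frequency, or when its highest reported frequency is below 1%. The SNP identifiers, population labels and frequencies are hypothetical.

```python
def filter_snps(frequencies_by_snp, min_frequency=0.01):
    """Keep SNPs that have frequency data and reach min_frequency somewhere.

    frequencies_by_snp maps an SNP ID to a dict {population: allele frequency}.
    Assumption (not stated in the text): the 1% threshold is applied to the
    highest frequency reported across populations.
    """
    kept = {}
    for snp_id, population_freqs in frequencies_by_snp.items():
        if not population_freqs:
            # No frequency available in any population: cannot be analysed.
            continue
        if max(population_freqs.values()) < min_frequency:
            # Rare everywhere: unlikely to shape population-level statistics.
            continue
        kept[snp_id] = population_freqs
    return kept


# Hypothetical input, for illustration only.
example = {
    "rs_hypothetical_1": {"AFR": 0.02, "EUR": 0.15, "EAS": 0.00},
    "rs_hypothetical_2": {"AFR": 0.001, "EUR": 0.004, "EAS": 0.0},
    "rs_hypothetical_3": {},
}
print(sorted(filter_snps(example)))
```

In this toy input only the first SNP passes: the second is rare in every population and the third has no frequency data.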
We also applied an indirect approach to assess the relevance of these SNPs to COVID-19 prognosis, by correlating their worldwide frequencies with COVID-19 mortality rates. Estimates of the number of individuals infected with, or killed by, SARS-CoV-2 in over 200 countries were obtained from the WHO Dashboard (11) on November 2, 2020, together with the respective population sizes.
These data were used to estimate for each country: (i) the Case Fatality Rate (CFR), defined as the number of deaths by COVID-19 divided by the number of confirmed cases, and (ii) the Daily Death Rate (DDR), defined as the average number of deaths per day (since the first confirmed case) per ten million inhabitants. CFR and DDR estimates for all countries are presented along with genetic data in Supplementary Data.
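The two definitions above translate directly into code. The sketch below is illustrative only: the function names are ours, the day count is assumed to be the number of elapsed days between the first confirmed case and the data cutoff, and the country figures in the usage lines are hypothetical.

```python
from datetime import date


def case_fatality_rate(deaths, confirmed_cases):
    """CFR: deaths by COVID-19 divided by confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return deaths / confirmed_cases


def daily_death_rate(deaths, population, first_case, cutoff):
    """DDR: average deaths per day since the first confirmed case,
    expressed per ten million inhabitants.

    Assumption: the number of days is cutoff minus first_case.
    """
    days = (cutoff - first_case).days
    if days <= 0 or population <= 0:
        raise ValueError("need a positive day count and population")
    return deaths / days / population * 10_000_000


# Hypothetical country: 1,000 deaths, 50,000 confirmed cases,
# 20 million inhabitants, first case on 1 March 2020.
cfr = case_fatality_rate(1_000, 50_000)
ddr = daily_death_rate(1_000, 20_000_000, date(2020, 3, 1), date(2020, 11, 2))
print(f"CFR = {cfr:.3f}, DDR = {ddr:.2f} deaths/day per 10 million")
```

Note that the number of confirmed cases enters only the CFR, whereas the DDR depends on deaths, population size and elapsed time.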
For thirteen highly polymorphic SNPs, the Spearman rank correlation between their frequencies and the CFR and DDR estimates was computed. Five SNPs were polymorphic in only a fraction of the populations. For these, CFR and DDR estimates were compared between two groups of populations, one in which the SNP was polymorphic and another in which it was monomorphic, using the Mann-Whitney test. Correction for multiple tests was applied accordingly. The Spearman correlation and Mann-Whitney tests were performed with the program BioEstat version 5.3 (12).
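The two test statistics can be sketched in plain Python as below. This is not the BioEstat implementation used in the study: it computes only the Spearman coefficient and the Mann-Whitney U statistic (no p-values), and it uses a Bonferroni adjustment as an assumed correction for multiple tests, since the text does not name the method applied.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mann_whitney_u(group_a, group_b):
    """U statistic of group_a: its rank sum minus n_a * (n_a + 1) / 2."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    return sum(ranks[:n_a]) - n_a * (n_a + 1) / 2


def bonferroni(p_values):
    """Assumed correction: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical allele frequencies and DDR values for five populations.
freqs = [0.01, 0.05, 0.10, 0.20, 0.30]
ddr = [0.5, 1.1, 1.0, 2.4, 3.0]
print(spearman_rho(freqs, ddr))

# Hypothetical DDR values where an SNP is polymorphic vs monomorphic.
print(mann_whitney_u([2.4, 3.0, 1.8], [0.5, 1.1, 1.0]))
```

In practice the p-values would come from a statistics package; the sketch only shows how population frequencies and death rates enter each statistic.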
The results of the statistical analysis are presented in Table 2. We found significant correlations between frequencies and DDR for seven SNPs, six of which remained significant after correction for multiple tests. No frequencies were correlated with CFR. Moreover, results from the Mann-Whitney test suggested that two SNPs are associated with DDR, even after correction for multiple tests. No association was detected between SNP polymorphism and CFR. Our results suggest an association of eight thrombophilia-related SNPs with death rates attributable to COVID-19. Only two of the SNPs listed as probable mechanistic candidates mapped close to positions previously associated with poor prognosis, namely rs2301612 within the ADAMTS13 gene and rs5985 within the F13A gene. The polymorphism rs2301612 is functionally related to thrombophilia and is located within the 9q34.2 region, the same locus implicated by a recent COVID-19 GWAS (6), which also overlaps the ABO blood group locus. Moreover, the polymorphism rs5985, located within the F13A gene, was found to be significantly associated with thrombophilia. This SNP lies approximately one million bp from the RIOK1 gene, a candidate outside the MHC region previously described by GWAS (4).
The remaining seven SNPs linked with DDR mapped to the genes FGA, FGG, F2, F5, F9, HABP2 and MTHFR. These loci were not described previously by other authors in the context of COVID-19, though the polymorphisms are well known to be associated with thrombophilia. FGA and FGG polymorphisms have been associated with D-dimer levels, which have been shown to be an important marker in COVID-19 (3). Both the F2 and F9 SNPs were associated with susceptibility to thrombus formation, as was the F5 SNP, known as Factor V Leiden, which is responsible for hypercoagulability and thrombosis (13). In the same context, the SNP within the MTHFR gene has also been linked with thrombosis susceptibility (14).
Considering three major ethnicities (Africans, Europeans and East Asians), there are clearly remarkable differences in DDR: mortality rates were significantly higher in European populations than in the others.
Interestingly, the allele frequencies of the implicated markers we evaluated were found to be consistently higher amongst Europeans.
For the majority of healthcare services worldwide, molecular or serological testing for SARS-CoV-2 infection has been preferentially applied to severe or critical cases and to clarify suspected deaths by COVID-19. Hence, differences in testing coverage would have less impact on the number of deaths than on the number of cases. This global scenario favors epidemiological statistics such as DDR, which uses the country's population as denominator and takes time into account. DDR, being an average daily incidence, therefore seems more suitable for COVID-19, which is still an ongoing pandemic. The present approach provides preliminary evidence that a significant proportion of deaths by COVID-19 are most likely the result of thrombophilia-related events that can be explained, at least in part, by differences in the genetic distribution of the underpinning polymorphisms.
Correlations tend to be stronger with DDR than with CFR. Since CFR is the ratio of the number of deaths to the number of confirmed cases, it can be heavily biased by low testing coverage and by the high frequency of both asymptomatic and non-reported mild cases, which altogether depend on socio-economic and political management factors. It is widely acknowledged that in the majority of countries SARS-CoV-2 testing has been preferentially offered to particular segments of the population, notably severe or critical cases and suspected COVID-19 deaths. Although the analysis is merely meta-analytical, the strength of the correlations makes these results worth considering for future evaluation, and they need to be corroborated in more structured case-control or cohort studies.
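The argument about testing coverage can be illustrated numerically. The sketch below is a toy scenario under stated assumptions: all figures are hypothetical, every death is assumed to be counted regardless of testing coverage, and only the fraction of infections that become confirmed cases varies between the two settings.

```python
def observed_rates(deaths, infections, detection_fraction, population, days):
    """Return (CFR, DDR) as they would be observed under partial testing.

    Assumptions: deaths are fully ascertained, while only
    detection_fraction of the infections are confirmed as cases.
    """
    confirmed_cases = infections * detection_fraction
    cfr = deaths / confirmed_cases
    ddr = deaths / days / population * 10_000_000
    return cfr, ddr


# Same hypothetical epidemic observed under two testing regimes.
scenario = dict(deaths=500, infections=100_000, population=10_000_000, days=200)
cfr_wide, ddr_wide = observed_rates(detection_fraction=0.5, **scenario)
cfr_narrow, ddr_narrow = observed_rates(detection_fraction=0.1, **scenario)

print(f"wide testing:   CFR = {cfr_wide:.3f}, DDR = {ddr_wide:.2f}")
print(f"narrow testing: CFR = {cfr_narrow:.3f}, DDR = {ddr_narrow:.2f}")
```

With identical deaths, the observed CFR is five times higher under narrow testing, while the DDR is unchanged because confirmed cases do not enter its formula.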
Although our meta-analytical approach supports a contributing role of the thrombophilia-associated genetic markers in COVID-19 outcomes, further work is warranted to generate experimental evidence. Additionally, the results call for careful planning of sampling strategies in order to avoid population stratification (15), because the frequencies of these polymorphisms vary widely with ethnicity. Finally, the cumulative effects of multiple polymorphisms should be considered when evaluating the role of these SNPs in COVID-19.