A multipronged deep omics-based investigation of COVID-19 plasma samples identifies key molecular networks of disease severity progression

Prognosis and management of COVID-19 severity remain a challenge even after months of the pandemic. We employed high-resolution mass spectrometry-based deep proteomics and metabolomics to study plasma samples from a cohort of 186 patients and to understand the mechanism of COVID-19 disease severity. A subset of patients displayed COVID-19-like symptoms but tested negative for the virus, while the COVID-19 positive patients were classified into non-severe or severe groups based on the clinical manifestation of the disease. Of the 1200 proteins detected, 27 were differentially regulated in the COVID-19 positive group, while 38 were differentially expressed between the non-severe and severe groups. GO enrichment analysis highlighted regulation of peptidase activity, regulated exocytosis, extracellular structure organization, blood coagulation, fibrin clot formation, complement activation, leukocyte activation involved in immune response, and response to glucocorticoid as biological processes dysregulated in severe SARS-CoV-2 infection. Further, metabolomics identified 15 significant metabolites indicating dysregulation in tryptophan, glycine, serine, threonine, arginine, proline, and porphyrin metabolism pathways. We validated potential severity biomarkers such as angiotensinogen, apolipoprotein B, SERPINA3, SERPING1, and fibrinogen gamma chain using targeted proteomics. Moreover, using our proteomics dataset and docking studies, we found that the FDA-approved drugs Selinexor and Ponatinib have the potential to be repurposed for therapeutic intervention in COVID-19.
Figure 6. Docking results of three different target proteins with two FDA-approved drugs, Selinexor and Ponatinib. Figure 6A represents the predicted 2D interaction diagram of Selinexor with SERPIN A7, involving various chemical bonds. Figure 6B shows the 3D diagrams, which represent the binding pockets as well as the interaction of the drugs with three differentially expressed proteins: SERPIN A7, SERPIN D1, and S100 A9. Selinexor binds SERPIN A7 with a binding affinity of -9.3 kcal/mol, SERPIN D1 with -8.7 kcal/mol, and S100 A9 with -7.5 kcal/mol. Ponatinib binds SERPIN A7 with a binding affinity of -9.3 kcal/mol and SERPIN D1 with -8.4 kcal/mol.


Introduction
Unparalleled events have unfolded in the year 2020. A previously unknown infectious microbe has held life as we know it to ransom by triggering a raging pandemic of coronavirus disease (COVID-19) that has swept across continents like a wildfire and, much like a wildfire, has left behind a trail of more than a million dead and the destruction of the lives and livelihoods of millions of others. The microbe that we now know as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2)1 is a betacoronavirus that belongs to the family Coronaviridae of the order Nidovirales. The virus thus shares its lineage with two other highly infectious viruses, SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV), both of which have caused outbreaks in the preceding decades.2 SARS-CoV-2 attaches to the angiotensin-converting enzyme 2 (ACE2) receptor in humans and infects the human respiratory tract and lungs, mostly leading to typical flu-like symptoms such as dry cough, body ache, and fever.3 However, it sometimes leads to acute respiratory distress syndrome. Patients with severe COVID-19 deteriorate to multi-organ dysfunction and death4,5 despite intense medical intervention.
Although SARS-CoV-2 primarily targets the respiratory tract and lungs, an increasing body of work has shown that the virus infects other organs such as the gastrointestinal tract, liver, kidney, cardiac muscles, central nervous system, musculoskeletal system, and even the male reproductive system 6,7. The diagnosis of the disease has been accurate and widespread. This was possible because of the increased availability and deployment of RT-PCR assays 8 and serological test kits, and because testing rates were scaled up globally.9 However, the prognosis of the disease remains a challenge, because the precise pathophysiological pathways that are perturbed as the disease progresses in severity remain mostly unexplored.
Blood is the only body fluid that reaches virtually all organ systems of the body and potentially carries information on perturbations in their physiology. Further, because its collection is minimally invasive, it has been the mainstay of several diagnostic tests that assay multiple parameters to assess the physiological state of the human body. Indeed, blood is the clinician's method of choice for understanding physiological aberrations, and the blood plasma proteome is thus an excellent source for assessing the host response. 10 While alterations in proteins, the functional molecules, tell us the story of what was happening inside, the small molecules called metabolites describe the aftermath of the incident. The sensitivity and high-throughput nature of high-resolution mass spectrometry have allowed scientists to detect even slight perturbations in host physiology11.
For our study, we chose to apply a multi-omics strategy to delineate the systems-wide perturbations brought about by SARS-CoV-2 infection in the human host during both non-severe and severe cases of COVID-19. We investigated the host plasma proteome and metabolome from blood collected from a large cohort of 186 patients (Supplementary Table S1) of varying COVID-19 severity during their active infection phase. We found 38 differentially expressed proteins that could contribute to the transition from non-severe to severe COVID-19 and identified key pathological processes that turn non-severe COVID-19 into the severe form. Dysregulation of pathways, especially those relating to inflammation, complement activation, and blood clotting, is a key event that can prognosticate the severity of the disease. Further, to validate our candidate biomarkers, we conducted targeted proteomics-based Multiple Reaction Monitoring (MRM) studies on an independent cohort of patients.
In this study, we have also performed a comprehensive metabolomics analysis of plasma samples from COVID-19 patients at different severity levels. Metabolites are small biomolecules with a molecular mass of less than 1000 Da. They play a vital role in facilitating the pathways in all organisms 12. Researchers have targeted metabolites as potential diagnostic biomarker candidates because of their involvement at various stages of the central dogma 13,14. Kynurenine, nitrogen, fatty acid, amino acid 15,16, pyrimidine 15,17,18, TCA cycle, fructose, mannose, guanosine monophosphate (GMP) of nucleotide biosynthesis pathways, carbamoyl phosphate of the urea cycle, carbon metabolism 19 and the carnitine pathway 20 have been reported to be altered in plasma and serum samples of COVID-19 infected patients. We carried out an inclusive LC-MS/MS-based metabolomics study to identify differentially abundant plasma metabolites in COVID-19 positive vs negative patients and in Non-severe COVID-19 (NSC) vs Severe COVID-19 (SC) patients. Identification of differentially altered metabolites may form the basis for the future development of new prognostic and therapeutic intervention strategies.
Finally, information obtained from the multi-omics experiments was used to initiate in silico docking studies of over-expressed host proteins against an existing drug library, which identified small molecules of interest for the management of COVID-19 severity. We could predict 2 FDA-approved drugs that could potentially be repurposed for the management of COVID-19. The salient outcomes of this study provide valuable information on plasma biomarkers associated with the severity of COVID-19 and help unravel the mechanistic pathways of the pathogenesis of SARS-CoV-2 infection. Further, the potential therapeutic targets predicted from the in silico study could be validated in cell-line-based studies and clinical trials for therapeutic intervention.

Quantitative proteomic analysis of COVID-19 plasma samples
We performed label-free quantification (LFQ) of a total of 74 depleted plasma samples, of which 20 were negative, 18 were non-severe, and 36 were severe (Figure 1A). Figure 1B-F respectively depict the schematic workflow of label-free quantification under discovery proteomics, an overview of the statistical data analysis, a summary of synthetic peptide peaks after Multiple Reaction Monitoring (MRM) under validation proteomics, an outline of the biological network analysis, and the docking study. The correlation matrix of all 74 samples is shown in Supplementary Figure 1. The mass spectrometry settings for label-free quantification are shown in Supplementary Figure 2. The LFQ analysis of the 74 samples yielded a total of 1206 proteins. A list of 278 missing-value-imputed proteins from 71 samples was taken forward for partial least squares-discriminant analysis (PLS-DA) for an overall assessment of the difference between the COVID-19 positive and COVID-19 negative sample cohorts. The two cohorts segregated into two separate clusters, as shown in Figure 1G.
Proteomic analysis of COVID-19 Non-Severe and COVID-19 Severe patients
Further, this study also investigated the proteomic alterations between the non-severe and severe cohorts, which yielded a list of 38 significantly differentially expressed proteins (Supplementary Table S3). Figure 2A represents a heatmap of the top 25 differentially expressed proteins in the severe and non-severe cohorts. A list of 287 missing-value-imputed proteins was taken forward for partial least squares-discriminant analysis (PLS-DA) for an overall assessment of the difference between the severe and non-severe cohorts. The two cohorts segregated into two separate clusters, except for samples P93, P30, and P106, which clustered closer to the opposite cohort (Figure 2B). Figure 2C depicts the significant differentially expressed proteins (DEPs) in the form of a volcano plot. Proteins such as Kallistatin (SERPINA4), Serum amyloid P-component (APCS), Protein S100-A8 (S100A8), Fibrinogen gamma chain (FGG), Corticosteroid-binding globulin (SERPINA6), and Alpha-1-antichymotrypsin (SERPINA3) were upregulated in the severe cohort, whereas proteins such as Complement factor D (CFD), Monocyte differentiation antigen (CD14), Complement component C8 alpha chain (C8A), Apolipoprotein(a) (LPA), and Apolipoprotein M (APOM) were downregulated in severe compared with non-severe patients. Supplementary Figure S3 represents the top 25 differentially expressed proteins of the severe and negative cohorts in the form of a heatmap and depicts the PLS-DA clustering of severe versus negative patients.

MRM analysis of proteins overexpressed in COVID-19 severe
The MRM study aimed to validate the differentially regulated proteins found between COVID-19 severe and non-severe samples in the LFQ data. The BSA QC standard used to monitor day-wise instrument response is shown in Supplementary Figure S4. To establish that all injections gave more or less the same response, we spiked an equal amount of a heavy-labeled synthetic peptide (FEDGVLDPDYPR) into all samples. The uniform peak areas for this peptide, shown in Supplementary Figure S5, establish this. Even duplicates run on separate days showed comparable peak areas with very low CV. Based on the response of the differentially regulated peptides, the list was further refined to keep only peptides showing significant dysregulation (adjusted p-values below 0.05) between severe and non-severe samples. For this, the peaks were annotated and the transitions were refined according to the library match to give dotp values for all peptides. A dotp value is a measure of the match between the experimental peak and the library fragmentation pattern. The refined list thus had 183 transitions belonging to 28 peptides of 9 host proteins and 1 synthetic peptide. The peptide sequences and transitions of proteins that showed differential regulation between COVID-19 non-severe and severe patient samples are listed in Supplementary Table S4. Using the MSstats external tool in Skyline, we determined that the proteins AGT, APOB, SERPINA3, FGG, and SERPING1 each have 3 or more peptides that show a peak area fold change of more than 3 and an adjusted p-value of less than 0.05 at a confidence of 95-99% (Figure 3). This validates that, for the given set of samples, these proteins show statistically significant overexpression in COVID-19 severe patients relative to COVID-19 non-severe patients (refer to the data availability section for the Skyline files).
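The dotp score described above can be illustrated with a minimal sketch. It assumes the score is a plain normalized dot product (cosine similarity) between the measured transition peak areas of a peptide and the corresponding library fragment intensities; the exact computation in Skyline may differ, and the function name `dotp` and its inputs are hypothetical names chosen for illustration.

```python
import math


def dotp(observed_areas, library_intensities):
    """Normalized dot product between observed transition peak areas and
    library fragment intensities (assumed definition, for illustration).

    Both sequences must list the same transitions in the same order.
    Returns a value in [0, 1] for non-negative inputs; 1 means the
    relative fragment pattern matches the library exactly.
    """
    if len(observed_areas) != len(library_intensities):
        raise ValueError("observed and library vectors must have equal length")
    dot = sum(o * l for o, l in zip(observed_areas, library_intensities))
    norm_obs = math.sqrt(sum(o * o for o in observed_areas))
    norm_lib = math.sqrt(sum(l * l for l in library_intensities))
    if norm_obs == 0 or norm_lib == 0:
        return 0.0  # no signal on one side: treat as no match
    return dot / (norm_obs * norm_lib)
```

Because the score depends only on relative intensities, a peak whose transitions are all scaled by the same factor still scores close to 1, which is why it is useful for confirming peak identity independently of abundance.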

Biological Pathway and Network analysis of differentially expressed proteins in Severe vs Non-severe comparison
We also identified the enriched biological processes for the 38 dysregulated proteins in COVID-19 severe compared with COVID-19 non-severe patients. The proteins belonging to the enriched biological processes are shown in the form of a protein-protein interaction network, and a few of these proteins are shown as violin plots (Figure 4A). Figure 4B shows a network of enriched terms colored by cluster, where nodes that share the same cluster are typically close to each other. We identified biological processes such as regulation of peptidase activity, regulated exocytosis, extracellular structure organization, blood coagulation, fibrin clot formation, complement activation (classical pathway), leukocyte activation involved in immune response, and response to glucocorticoid to be enriched in COVID-19 severe patients. The list of proteins expressed in these pathways is shown in Supplementary Table S5.

Metabolomics profiling of COVID-19 patient cohort
The workflow of metabolome profiling from plasma samples is shown in Figure 5A. The quality control check of the internal standard across all sample runs is shown in Figure 5B. The Principal Component Analysis (PCA) plot showing proper segregation of the QC pools from all batches, used as a quality check of the sample runs, is shown in Figure 5C. In the comparison of COVID-19 negative and COVID-19 positive samples, 32 metabolites emerged as common yet significant differentially expressed metabolites (DEMs), with an FDR-adjusted p-value below 0.05 and a fold change above 1.5 (Supplementary Table S6). Of the 32 DEMs, only 11 were not contaminants from the blank solvent. Of these 11, only 1 was level-2 annotated (Linoleate) and 2 were level-3 annotated (Kauralexin A1 and D-(+)-Maltose) (Supplementary Table S7). The comparative analysis of non-severe and severe COVID-19 positive patients yielded 24 features with an FDR-adjusted p-value below 0.05 and a fold change above 1.5, which were considered statistically significant DEMs (Supplementary Table S7); of these, 13 metabolites remained after blank subtraction. These 13 metabolites were used for the PCA plot (Figure 5D) and for the heat map showing the segregation of the non-severe and severe sample sets (Figure 5E). The blank-subtracted significant DEMs were used to calculate the Variable Importance in Projection (VIP) scores and to draw the volcano plot, with their expression trends represented as box plots (Figure 5F and 5G). The box plots represent all the level 2 metabolites, i.e. 4 out of the 13 significant DEMs (Supplementary Table S6); the remaining DEMs, annotated only at level 3 or level 4, are listed in Supplementary Table S8 and their trends are represented in Supplementary Figure S6G.
A total of 18 significantly altered metabolites were found in the comparison of NSC vs SC: 13 DEMs, 1 metabolite specific to the NSC cohort, and 4 metabolites specific to the SC cohort. Five of the eighteen significant metabolites were of level 2 MSI, viz. Propionylcarnitine, N-Methylethanolamine phosphate, Indole-3-acetic acid, Creatine, and Bilirubin. Kauralexin A1 was of level 3 MSI and the remaining 12 significant metabolites belong to level 4 MSI (Supplementary Table S9).
Propionylcarnitine was enriched in the oxidation of branched-chain fatty acids pathway; indole-3-acetic acid was enriched and mapped on the tryptophan metabolism pathway; creatine was enriched and mapped on the glycine, serine, threonine, arginine, and proline metabolism pathways; and bilirubin was enriched and mapped on the porphyrin metabolism pathway (Supplementary Figure S7). Kauralexin A1 was not found in the enrichment or pathway analysis.
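The DEM selection rule used above (FDR-adjusted p-value below 0.05 and fold change above 1.5) can be sketched as follows. This is only an illustration under stated assumptions: the text does not name the FDR procedure, so the Benjamini-Hochberg method is assumed here, and the fold-change cutoff is assumed to apply symmetrically (above 1.5 or below 1/1.5). The function names and the input layout are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_dems(features, fc_cutoff=1.5, alpha=0.05):
    """Return names of features passing both thresholds.

    features: list of (name, fold_change, raw_p_value) tuples, where
    fold_change is a positive ratio between the two groups (assumed).
    """
    adjusted = bh_adjust([p for _, _, p in features])
    selected = []
    for (name, fold_change, _), q in zip(features, adjusted):
        changed = fold_change > fc_cutoff or fold_change < 1.0 / fc_cutoff
        if q < alpha and changed:
            selected.append(name)
    return selected
```

Features that pass this filter would still need the blank-subtraction step described in the text before being reported as DEMs.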

In-silico screening of drugs against differentially expressed proteins
We performed in silico molecular docking of significantly altered proteins against a library of 58 drugs (Supplementary Table 10A-E). Of the 58 drugs, 30 are FDA approved, 9 are clinically approved, and 19 are pre-clinical approved. For each protein, we identified from the literature a positive control drug that is a known inhibitor of the protein. The positive control drug gives us a possible cut-off for the docking score. After docking, we used two criteria for selecting drugs for a protein. First, the drug's binding affinity should be equal to or stronger (that is, equally or more negative) than that of the control inhibitor. Second, the drug's binding pocket should be similar to that of the control drug. For the COVID-19 non-severe vs severe comparison, we docked 5 significant proteins: Heparin cofactor 2, Thyroxine-binding globulin, Angiotensinogen, Carbonic Anhydrase-1, and Carbonic Anhydrase-2.
Heparin cofactor 2 (SERPIN D1) is a 499-amino-acid protein that binds the drug Sulodexide with a binding affinity of -7.1 kcal/mol; Sulodexide was therefore taken as the control drug (Supplementary Figure S8A). When docked with the customized drug library, we find four FDA-approved drugs that bind to a binding pocket similar to that of the control drug and have better binding affinity than Sulodexide, namely Selinexor (-8.7 kcal/mol), Ponatinib (-8.4 kcal/mol), EGCG (-7.7 kcal/mol), and Nafamostat (-8.1 kcal/mol). Another protein, Thyroxine-binding globulin (SERPIN A7), has 415 amino acids. It showed a binding affinity of -7.4 kcal/mol with the drug Tamoxifen, a well-known inhibitor of the protein, which we therefore used as the control inhibitor (Supplementary Figure S8B). From the customized drug library, SERPIN A7 bound Selinexor and Ponatinib with a binding affinity of -9.3 kcal/mol (Figure 6B). The 2D interaction diagram of Selinexor docked with SERPIN A7 shows that the amino acids Y20 and R381 form potential hydrogen bonds at the binding site (Figure 6A). Angiotensinogen is 485 amino acids long; it binds Irbesartan, a known inhibitor of the protein, with a binding affinity of -8.4 kcal/mol, so we used Irbesartan as the control drug (Supplementary Figure S8C). Angiotensinogen binds the drug ML-240, a pre-clinical drug, with a binding affinity of -8.9 kcal/mol. This is the only drug identified in our study to target Angiotensinogen. We also performed docking of Carbonic Anhydrase-1 (261 amino acids) and Carbonic Anhydrase-2 (260 amino acids). The small molecule Topiramate binds Carbonic Anhydrase-1 with a binding affinity of -9.2 kcal/mol (Supplementary Figure S8D) and Acetazolamide binds Carbonic Anhydrase-2 with a binding affinity of -6.3 kcal/mol (Supplementary Figure S8E); they were used as the control drugs for the respective proteins.
In our study, we identified EGCG as the only FDA-approved drug that can be used to target Carbonic Anhydrase-1 (binding affinity -9.5 kcal/mol) and Nafamostat (binding affinity -8.2 kcal/mol) as the only one to target Carbonic Anhydrase-2. Four proteins from the COVID-19 positive vs negative comparison were used for molecular docking: Protein S100 A9, Carboxypeptidase B2, Glutathione S-transferase omega-1, and 6-Phosphogluconate dehydrogenase.
We found that the drug Rapamycin can be used to target all 4 proteins, as it binds each protein with a higher binding affinity than the respective control drug and in the same binding pocket as the control drug. The first protein, Protein S100 A9, is a small protein of 114 amino acids. We used Tasquinimod as the control inhibitor, as it binds the protein with a binding affinity of -7.5 kcal/mol (Supplementary Figure S8F). Using the above-mentioned selection criteria, we identified that Protein S100 A9 can be targeted by two FDA-approved drugs: Selinexor, which also binds the protein with a binding affinity of -7.5 kcal/mol, and Rapamycin, an mTOR inhibitor, which binds the protein with a binding affinity of -8. Four FDA-approved drugs from our customized drug library can be used to target GSTO-1: Rapamycin (binding affinity -8.8 kcal/mol), Selinexor (binding affinity -8.6 kcal/mol), Ponatinib (binding affinity -9.1 kcal/mol), and Silmitasertib (binding affinity -8.3 kcal/mol). 6-Phosphogluconate dehydrogenase is a protein of 483 amino acids; we used Physcion as the control drug, as it binds 6-PGDH 1 with a binding affinity of -7.0 kcal/mol (Supplementary Figure S8I). Six FDA-approved drugs from our customized drug library can be used to target 6-PGDH: Rapamycin (binding affinity -8.8 kcal/mol), Selinexor (binding affinity -8.5 kcal/mol), Ponatinib (binding affinity -10.3 kcal/mol), Silmitasertib (binding affinity -7.7 kcal/mol), Daunorubicin (binding affinity -8.4 kcal/mol), and Dabrafenib (binding affinity -8.6 kcal/mol). From our molecular docking analysis, we found that Rapamycin, a drug already approved for organ transplant rejection, binds all four significantly upregulated proteins of the COVID-19 positive vs negative comparison.
We also found that Selinexor, an exportin antagonist approved for multiple myeloma, and Ponatinib, a tyrosine kinase inhibitor approved for Chronic Myeloid Leukemia (CML), can be used to target proteins from both the COVID-19 positive vs negative comparison and the non-severe vs severe comparison, as docking indicates that they may inhibit proteins from both comparisons. Another drug, Pevonedistat, which is clinically approved, can also be explored in the future for targeting COVID-19, as it was likewise predicted to inhibit proteins from both comparisons.
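The two selection criteria above can be expressed as a small filter. This is a minimal sketch under stated assumptions: binding affinities are docking scores in kcal/mol where a more negative value means stronger binding, and "similar binding pocket" is approximated by the Jaccard overlap of contacting residue sets with an arbitrary threshold of 0.5. The overlap measure, the threshold, the residue labels, and the two `hypothetical_*` entries are placeholders invented for illustration; only the Sulodexide, Selinexor, and Ponatinib affinities for SERPIN D1 come from the text.

```python
def passes_criteria(candidate, control, min_pocket_overlap=0.5):
    """Check a docked candidate against the positive control drug.

    candidate / control: dicts with
      "affinity": docking score in kcal/mol (more negative = stronger)
      "pocket":   set of residue labels contacted in the docked pose
    """
    # Criterion 1: affinity equal to or stronger than the control.
    strong_enough = candidate["affinity"] <= control["affinity"]
    # Criterion 2: pocket similar to the control (Jaccard overlap, assumed).
    union = candidate["pocket"] | control["pocket"]
    shared = candidate["pocket"] & control["pocket"]
    overlap = len(shared) / len(union) if union else 0.0
    return strong_enough and overlap >= min_pocket_overlap


# Placeholder residue labels; not taken from the docking results.
PLACEHOLDER_POCKET = frozenset({"res_a", "res_b", "res_c"})

control = {"name": "Sulodexide", "affinity": -7.1, "pocket": PLACEHOLDER_POCKET}
candidates = [
    {"name": "Selinexor", "affinity": -8.7, "pocket": PLACEHOLDER_POCKET},
    {"name": "Ponatinib", "affinity": -8.4, "pocket": PLACEHOLDER_POCKET},
    # Invented entries that fail one criterion each.
    {"name": "hypothetical_weak_binder", "affinity": -6.0,
     "pocket": PLACEHOLDER_POCKET},
    {"name": "hypothetical_other_site", "affinity": -9.0,
     "pocket": frozenset({"res_x", "res_y"})},
]

selected = [c["name"] for c in candidates if passes_criteria(c, control)]
```

Using the control inhibitor as the cut-off keeps the threshold specific to each protein rather than relying on one global docking score.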

Discussion
Nasopharyngeal swab samples and serological tests are being routinely used in clinics for the detection of SARS-CoV-2 infection diagnosis; however, biomarkers for prognosis of the disease before it could lead to fatal symptoms are yet to be found. The understanding of the host response towards the viral infection might provide important clues on the progression of the disease from non-severe to severity. A multiomics integrated approach was applied for an in-depth understanding of the disease mechanism. Thus, for our study, we looked into the plasma of the infected host for our multi-omics analysis. Few studies have already reported differences in the level of blood-based proteins such as lactate dehydrogenase (LDH), d-dimers, and in ammatory markers such as C-reactive protein (CRP), ferritin, and brinogen in COVID-19 patients 21,22 . Major aim of our study was to perform deep pro ling of serum proteome and metabolome from a large cohort of COVID-19 patients (n=186), facilitating the robust and statistically signi cant evaluation of differential expression between non-severe and severe disease groups. The Indian subcontinent being fairly unscathed by the severity of the pandemic, reporting a few of the lowest case fatality rates per million, 23 our study on large number of patients hold importance in understanding the biology associated with the same.
Deep proteome comparison between COVID-19 positive and COVID-19 negative patients who exhibited symptoms similar to COVID-19 revealed a subset of proteins that differentiates COVID-19 from other febrile respiratory maladies. Of importance was the upregulation of the von Willebrand factor (VWF).
Animal studies have shown that increased VWF might be due to hypoxic conditions in the lung endothelial cells 24 however, this induces a risk of both arterial or venous thrombosis since it directly promotes the thrombotic process during in ammation. 25 The increase in Haptoglobin-related protein (HPR) is also found in cases of idiopathic pulmonary brosis 26 and as a factor of non-bacterial pneumonia 27 thus may act as a biomarker of lung trauma. Carboxypeptidase B2 has anti-in ammatory and anti-brinolytic effects and its increase in this cohort is indicates the natural response to systemic in ammation brought about by COVID-19. 28 Another protein Pro lin-1 (PFN1) overexpression implicated in the vascular hyperpermeability, vascular hypertrophy can perhaps explain the aberrant physiology of COVID-19 patients. These apart acute phase response proteins were like SAA-4 and S100A8 were also upregulated in response to COVID-19. Interestingly, however the protein lymphatic vessel endothelial hyaluronic acid receptor 1 (LYVE1) was downregulated and might be an indication of liver injury 29 while attenuated Histidine-rich glycoprotein (HRG) expression might explain the altered hemostasis in the patients. 30 However there lies a caveat, most of these patients were under medications and the results might also be due to the ongoing therapies than the disease itself.
In our plasma deep proteome study of non-severe vs severe patients, we observed that only 38 were differentially expressed proteins such as FGG, S100A8, VWF, SAA4, SERPIND1, and SERPINA6 to be upregulated in the COVID-19 positive patients while interestingly the mitochondrial 60 kDa heat shock protein, HSPD1 was only expressed in the severe patients. It has been already reported that high levels of circulatory HSPD1 are associated with cardiac failures. 31 Therefore increased HSPD1in severe patients can act as a clinical biomarker of cardiac malfunction in the severe group. Also, our results indicated an increase in plasma Cholinesterase (BCHE) in the severe group which is upregulated in patients suffering from mild ischemic stroke. 32 These ndings thus implicate severe COVID-19 associated risk of cardiac and CNS injury that has been already reported by clinicians and these biomarkers could help prognosis of the threat.
The plasma levels of carbonic anhydrase 1 (CA1) was found to be substantially elevated in the severe group. Increased carbonic anhydrase has been found to mediate hemorrhagic retinal and cerebral vascular permeability. 33 The rami cations of increased CA1 are also substantiated by earlier reports on a cohort study of sepsis secondary to pneumonia 34 where it was found to be upregulated during sepsis. Moreover, the role of increased CA1 in the worsening of ischemic diabetic cardiomyopathy also paints a rather gloomy picture of the cardiac sequelae of COVID-19, especially in diabetic patients 35 and might also contribute to the increased fatality of diabetic patients. 36 The protein Fibrinogen (FGG) was also found to be upregulated in severe patients when compared to non-severe. FGG is an oligomeric glycoprotein produced in the liver and secreted in the blood. The increased brin formation and breakdown correlated with the high level of D-dimers observed in the COVID-19 patients with the worst outcomes 37 . The increasing level of FGG in severe might be due to liver injury, impairing hepatic brinogen secretion with acquired brinogen storage disease 38 . The protein S100A8 (calgranulin A/myeloid-related protein 8) belongs to the group of alarmins or damage-associated molecular patterns (DAMPs) which are released in response to stress against the microbial infection that leads to exacerbating the in ammatory response. Chen et al. and his co-workers reported that the level of S100A8 positively correlated with the C t value and oxygen demand indicating the severity of the acute respiratory distress (ARDs) in COVID-19 patients. 39 A recent study showed that severe COVID-19 patients release massive amounts of S100A8 which is accompanied by changes in monocytes and neutrophil subsets. 40 The protein AGT was found to be signi cantly upregulated in severe patients as compared to the nonsevere. 
Angiotensinogen (AGT) is a component of the renin-angiotensin system (RAS), a substrate of renin that regulates blood pressure and uid balance. The dysregulation of AGT and RAS might lead to acute lung injury and acute respiratory distress leading to serious prognosis 41 . Apolipoprotein B-100 (APOB) is involved in lipid transport and low-density lipoprotein (LDL) catabolism. The high levels of APOB in the plasma might be an indicator of signi cant cardiovascular manifestation seen in COVID-19 infected severe patients 42 . These results are consistent with previous ndings 15 on COVID-19 patient sera which had identi ed dysregulation of multiple apolipoproteins. Several serine protease inhibitors (SERPINs) such as SERPING1 and SERPINA3 were also identi ed to be upregulated in severe patients. The increasing level of SERPINs, an acute-phase protein positively correlates and associates with a high level of IL-6 seen in severe patients 43 . The validation study by MRM based assay could speci cally detect AGT, FGG, APOB, SERPING1, and SERPINA3 host peptides in COVID-19 patients using. The mass spectrometry-based detection of host peptides used in our study can be used in the clinics for the prognosis of disease severity.
A subset of proteins was also down-regulated in the severe group. The protein peptidase inhibitor 16 (PI16) was severely down-regulated (FC= -2.45,P <0.005). It concurs well with the previous studies wherein it has been shown that while PI16 protective role against atherosclerosis, PI16 inhibition by circulating in ammatory cytokines that act through the NF-κB signaling pathway. 44 These results demonstrate the pathogenesis of cardiac maladies in severe COVID-19 patients. 45 Patients with severe COVID-19 often report lower platelet count. 46 Our studies have demonstrated that a crucial factor in platelet biogenesis TPM4 47 is inhibited (FC-1.94, P < 0.001) in severe cases thereby providing novel biological insight into COVID-19 severity. Yet another ubiquitously present protein βII spectrin was found to be downregulated in severe COVID-19, given that inadequate βII spectrin might precipitate into arrhythmia, heart failure, or even neurodegeneration 48 the ndings hold much importance. Two proteins namely APOM which is known to protect the lungs and kidneys from injuries 49 and APOA2 were also downregulated, similar observations were previously reported. 15 Functional enrichment analysis of the 38 differentially expressed proteins in severe vs non-severe cohort revealed that these proteins are enriched in pathways related to blood coagulation, brin clot formation, complement system, leukocyte activation, regulation of peptidase activity, regulated exocytosis, and extracellular structure organization among others. Proteins like A2M, SERPINA4, SERPINA3, SERPING1, and FGG that are involved in regulated exocytosis of platelets were upregulated in the severe cohort suggesting an increased consumption of platelets. This could be a possible reason for the lower platelet count (clinically called thrombocytopenia) commonly reported in many severe cases of COVID-19 50 , which is also associated with coagulation abnormalities, disease severity, and mortality [51][52][53] . 
There is enough evidence to suggest that, aside from their role in hemostasis, platelets have potent immune and inflammatory effector functions. Interaction between viruses and platelets is known to stimulate platelet degranulation, leading to the release of a variety of cytokines and chemokines 54,55 . Platelets also interact directly with leukocytes and endothelial cells to trigger and modulate inflammatory reactions and immune responses 55 . Thus, platelet hyperactivity due to the upregulation of these proteins correlates with the over-exuberant host inflammatory response as COVID-19 progresses from non-severe to severe.
Many of the peptidase activity regulator proteins, including SERPINA4, SERPING1, SERPINA3, SERPIND1, and A2M, are involved in blood coagulation and inflammation pathways. SERPINA4 is an inhibitor of the kallikrein-kinin system involved in coagulation and inflammation 56,57 ; SERPING1 is an inhibitor of the classical pathway of the complement system as well as of several proteins involved in blood coagulation 58 ; SERPINA3 is a major inhibitor of cathepsin G, a key proteolytic enzyme and inflammatory effector released by neutrophils 59 ; A2M is an inhibitor of a variety of proteases involved in blood coagulation and inflammation, including thrombin, kallikrein, plasmin, and cathepsin G 60 ; SERPIND1 regulates blood clot formation by inhibiting thrombin 61 . Moreover, FGG, the fibrinogen gamma chain, is a component of the clotting factor fibrinogen, which promotes tissue repair. High fibrinogen levels are associated with bleeding and thrombosis, and also correlate with the increased erythrocyte sedimentation rate (ESR) observed in severe cases 62,63 . COVID-19 associated coagulopathy is common in severe patients 64 , while overt disseminated intravascular coagulopathy, a critical condition characterized by abnormal blood clotting and bleeding, is observed in the majority of the critically ill patients who do not survive 65,66 . These thrombotic complications can be characterized by dysregulation of proteins involved in blood coagulation, fibrin clot formation, and platelet exocytosis. Conversely, these proteins can be associated with disease severity and risk of mortality and can serve as biomarkers for a better prognosis.
Consistent with previous studies, multiple acute phase proteins (APPs) such as APCS, C4B, A2M, SERPING1, SERPINA3, and FGG were found to be upregulated in severe patients 15,67 . APPs are manifested as the body's innate response to any kind of stress. Tissue damage caused by injury or infection instigates a local inflammatory response that leads to the release of pro-inflammatory cytokines. APPs are synthesized and released mainly by liver hepatocytes in response to these cytokines 68 . […] 15,16 The significantly altered metabolites captured here can be used for rapid prognosis of the severity of COVID-19 in patients with vulnerability or comorbidity, for efficient healthcare.
Multiple antiviral drug therapies and clinical drug trials are under way to find a definitive solution for this life-threatening viral infection. Current approaches to treating the various stages of COVID-19 with commercially available drugs fall into two major categories: treatment with antiviral drugs and treatment with immune modulators. HIV protease inhibitors are well-known members of the former category, but no definitive studies have yet proven these drugs to be potent inhibitors; hence the search has to continue 83 . In our study, we performed an in silico drug repurposing analysis with 9 proteins from our proteomic analysis against a library of 58 small molecules. The chosen drugs were previously found to target the protein-protein interactions between SARS-CoV-2 and human proteins in a cell line model 84 . Two FDA-approved drugs, Selinexor and Ponatinib, were found to inhibit most of the proteins belonging to two different comparisons: COVID-19 positive vs COVID-19 negative, and non-severe vs severe patients' plasma. The Food and Drug Administration (FDA) had previously approved Selinexor for the treatment of multiple myeloma in combination with dexamethasone 85 . The drug is a first-in-class exportin-1 (XPO1) inhibitor that induces apoptosis in cancer cells by blocking nucleocytoplasmic transport of tumor suppressor proteins 86 . Although originally developed as anticancer drugs, exportin inhibitors can act as antiviral drugs, as they have the potential to block the intracellular replication of viral particles by inhibiting the transport of viral replication proteins into the cytoplasm 87 . Hence, the drug is currently in a phase 2 clinical trial for COVID-19 infection (ClinicalTrials.gov Identifier: NCT04349098).
Five plasma proteins from our study, including Thyroxine-binding globulin (SERPIN A7) and Heparin cofactor 2 (SERPIND1), which belong to the family of serine protease inhibitors (SERPINs), were shown to interact tightly with Selinexor, suggesting that these SERPINs could be targets for the drug. Both proteins were also seen to interact with another FDA-approved drug, Ponatinib. Ponatinib was originally developed as a tyrosine kinase inhibitor and is used to treat patients with chronic myeloid leukemia (CML) 88 . It is currently not in any clinical trial for COVID-19 patients, but recent studies in a mouse model showed that it can suppress the cytokine storms arising from viral infections such as influenza 89 . Hence Ponatinib, as an immune modulator, appears to be a suitable candidate for therapeutic cocktails against COVID-19 infection in the future.
Managing case fatality in COVID-19 remains a challenge for all countries worldwide. While the COVID-19 pandemic has unleashed an unprecedented crisis on the lives and livelihoods of most, it is the health workers who have borne the brunt of the pandemic. With limited availability of hospital infrastructure and overworked staff, it was pertinent to uncover the factors that lead to the severity of symptoms associated with COVID-19, in order to reduce the caseload in hospitals and to manage severe cases at an accelerated pace. Our study not only revealed candidate biomarkers for disease severity prognosis but also enhanced our understanding of the mechanism of COVID-19 severity. Further validation of these biomarkers and repurposed drugs can aid clinicians in faster disease prognosis and better therapeutic strategies to improve survival rates, especially in the vulnerable population, while selecting drugs based on the severity of infection would help in better management of symptoms.

Sample and clinical details
For this study, we procured plasma samples from 186 patients who visited Kasturba Hospital for Infectious Diseases, Mumbai. All plasma samples were collected with approval from the Institute Ethics Committee, IIT Bombay, and the Institutional Review Board of Kasturba Hospital for Infectious Diseases. Based on RT-PCR results, these patients were assigned as COVID-19 positive or COVID-19 negative. Depending on the clinical symptoms, and as advised by clinicians, positive patients were further grouped into severe (patients on mechanical ventilation with severe symptoms of acute respiratory distress or bilateral pneumonia) and non-severe (patients with mild symptoms of cough, fever, fatigue, and breathlessness, without invasive ventilation). For plasma proteomic analysis of COVID-19 infected patients, 20 negative, 18 non-severe, and 33 severe cases were taken forward (Supplementary Table 1). Approximately 2 mL of whole blood was collected by Kasturba Hospital for biochemical and serological tests from RT-PCR confirmed and suspected COVID-19 patients. Whole blood was collected in a sterile vacutainer by trained medical practitioners under aseptic conditions. After the biochemical tests were performed, leftover blood (~1 mL) was collected and centrifuged at 3000 rpm for 10 minutes to separate the plasma. The separated plasma was then incubated at 56°C for 30 min for viral inactivation and stored at -80°C in cryovials. For the MRM validation experiment, 12 COVID-19 positive patient samples were taken forward.
For metabolomics, 31 negative, 43 non-severe, and 29 severe plasma samples were used. Around 100 µL of plasma was dispensed into a sterile tube. 200 µL of pre-chilled 100% ethanol was added to the plasma sample, which was incubated for 5 minutes at room temperature (RT). The tube was then incubated in a biosafety cabinet for 1.5 hours until the ethanol had evaporated. To the semi-dried sample, 4X (400 µL) 100% methanol was added and vortexed briefly. The tube was incubated at -20°C overnight. The next day, the sample was centrifuged at 4°C for 30 minutes at 12000 g. The supernatant was collected in a fresh tube and stored at -20°C. Samples were transported to the Proteomics Lab at IIT Bombay. 250 µL of supernatant was concentrated using a speed vac to a final volume of approximately 80 µL, of which 50 µL was dispensed into a glass vial for the mass spectrometry-based metabolomics profile run. To each glass vial containing 50 µL of metabolite extract, 0.5 µL of reserpine (10 µg/mL) was added as an internal control for capturing instrumental variation. Vials containing metabolite extract were then placed in an autosampler for metabolite profiling using a Q Exactive (Thermo Scientific).

Proteomics analysis
To improve the detectability of low-abundance plasma proteins, plasma samples were depleted. Plasma samples were first subjected to a Pierce™ Top 12 Abundant Protein Depletion Spin Column (ThermoFisher Scientific) and incubated for 1 hour under rocking motion. Samples were centrifuged at 1500 g for 2 min to obtain the low-abundance plasma proteins. To concentrate the sample, it was evaporated to 1/4th of its initial volume and then stored at -80°C for further processing. The depleted plasma sample was quantified by Bradford assay with BSA as the standard. To 30 µg of the depleted plasma sample, 6 M urea lysis buffer was added, followed by a 6-fold dilution with ammonium bicarbonate. Before digestion, the plasma protein extract was reduced with TCEP (final concentration 20 mM) at 37°C for 1 hour and then alkylated with iodoacetamide (final […] with an Easy nano-LC 1200 system with a gradient of 80% ACN and 0.1% FA for 120 min, with blanks after every sample. BSA was run at the start and end of each set of runs to check the instrument quality. All samples were loaded onto the LC column at a flow rate of 300 nL/min. Mass spectrometric data acquisition was done in data-dependent acquisition mode with a mass scan range of 375-1700 m/z and a mass resolution of 60,000. A mass window of 10 ppm was set with a dynamic exclusion of 40 s. All MS/MS data were acquired using higher-energy collisional dissociation (HCD) fragmentation, and data acquisition was done using Thermo Xcalibur software version 4.0.

Proteomics data and pathway analysis
The raw datasets were processed with MaxQuant (v1.6.6.0) against the Human Swiss-Prot database (downloaded on 09.07.2020), searched with the built-in Andromeda search engine of MaxQuant 90 . Raw files were processed with Label-Free Quantification (LFQ) parameters, setting the label type as "standard" with a multiplicity of 1. The Orbitrap was set to Orbitrap Fusion mode. Trypsin was used for digestion with a maximum of 2 missed cleavages. Carbamidomethylation of cysteine (+57.021464 Da) was set as the fixed modification, whereas oxidation of methionine (+15.994915 Da) was set as the variable modification. The false discovery rate (FDR) was set to 1% at the protein and peptide levels to ensure high reliability of protein detection/identification. Decoy mode was set to "reverse", and the type of identified peptides was set to "unique+razor".
A sample-wise correlation analysis of 74 samples was performed to assess data quality, and 3 samples were removed. Proteomic data of the remaining 71 samples were taken forward for missing value imputation using the k-nearest neighbors (KNN) algorithm in MetaboAnalyst 91 . Statistical analysis and data visualization were carried out in Python and Microsoft Excel. Significantly differentially expressed proteins were determined using Welch's t-test, with a p-value below 0.05 as the cut-off. Biological pathway analysis was done using Metascape for GO enrichment analysis 92 , whereas STRING (version 11.0) 93 was used to prepare the protein-protein interaction (PPI) network.
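The significance test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function name `welch_t_test` and the intensity values are hypothetical, and the two-sided p-value is approximated by numerically integrating the Student t density so that only the Python standard library is needed. In practice a statistics library routine for the unequal-variance t-test would normally be used.

```python
import math
from statistics import mean, variance


def _t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def welch_t_test(a, b, steps=2000):
    """Return (t, df, two-sided p) for Welch's unequal-variance t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    # Simpson's rule (steps must be even) for the density integral on [0, |t|].
    upper = abs(t)
    h = upper / steps
    total = _t_pdf(0.0, df) + _t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    central = total * h / 3
    p = min(1.0, max(0.0, 1.0 - 2.0 * central))
    return t, df, p


# Hypothetical log-intensities of one protein in two groups.
severe = [10.1, 10.4, 9.9, 10.6, 10.2]
non_severe = [8.0, 8.3, 7.9, 8.1, 8.4]
t, df, p = welch_t_test(severe, non_severe)
print(f"t = {t:.2f}, df = {df:.1f}, significant at 0.05: {p < 0.05}")
```

Welch's variant does not assume equal variances in the two groups, which suits cohorts of unequal size such as 18 non-severe and 33 severe samples.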

Targeted proteomics by Multiple Reaction Monitoring Assay
The proteins that were statistically significant and upregulated in COVID-19 severe samples compared with COVID-19 non-severe and COVID-19 negative samples in the LFQ data were selected for a targeted MRM study. The list of transitions was prepared for unique peptides of these selected proteins using Skyline (Ver 20.2.1.286). The missed cleavage criterion was 0; precursor charges +2 and +3 and product charges +1 and +2 with y ion transitions (from ion 3 to last ion -1) were included. Pools of each group of COVID-19 samples (positive and negative) were run against all the generated transitions, and a list was finalized based on the resulting data. The final list consisted of 35 peptides from 13 proteins. This list included a spiked-in synthetic peptide (FEDGVLDPDYPR, with a heavy-labeled C-terminal arginine) used to monitor the consistency of the mass spectrometry runs. For the experiment, a Vanquish UHPLC system (ThermoFisher Scientific, USA) connected to a TSQ Altis mass spectrometer (ThermoFisher Scientific, USA) was used. The peptides were separated using a Hypersil Gold C18 column (1.9 μm, 100 x 2.1 mm; ThermoFisher Scientific, USA) at a flow rate of 0.45 mL/min for a total time of 10 minutes. The binary buffer system used was 0.1% FA as buffer A and 80% ACN in 0.1% FA as buffer B. Approximately 1 µg of BSA was also run with the samples to check uniformity in the instrument response.
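The transition criteria above can be illustrated with a short sketch that enumerates y-ion transitions for the spiked-in peptide sequence. It is not the Skyline implementation. Several points are assumptions: "from ion 3 to last ion -1" is read as y3 through y(n-1), the masses are standard monoisotopic residue masses for the unlabeled sequence (the isotope composition of the heavy label is not stated, so it is not applied), and the function names are hypothetical.

```python
from itertools import product

# Monoisotopic residue masses (Da) for the amino acids in the example peptide.
RESIDUE_MASS = {
    "F": 147.06841, "E": 129.04259, "D": 115.02694, "G": 57.02146,
    "V": 99.06841, "L": 113.08406, "P": 97.05276, "Y": 163.06333,
    "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276


def mz(neutral_mass, charge):
    """m/z of a positively charged ion with the given neutral mass."""
    return (neutral_mass + charge * PROTON) / charge


def y_ion_transitions(peptide, precursor_charges=(2, 3), product_charges=(1, 2)):
    """List (precursor m/z, precursor z, ion label, product z, product m/z)."""
    n = len(peptide)
    precursor_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    rows = []
    # range(3, n) yields y3 ... y(n-1), the assumed reading of the criterion.
    for pz, k, fz in product(precursor_charges, range(3, n), product_charges):
        # y_k contains the last k residues plus one water.
        y_mass = sum(RESIDUE_MASS[aa] for aa in peptide[-k:]) + WATER
        rows.append((mz(precursor_mass, pz), pz, f"y{k}", fz, mz(y_mass, fz)))
    return rows


for row in y_ion_transitions("FEDGVLDPDYPR")[:4]:
    print("precursor %.4f (%d+) -> %s (%d+) %.4f" % row)
```

For a 12-residue peptide this gives 9 y ions, 2 precursor charges, and 2 product charges, so 36 candidate transitions before any empirical refinement on pooled samples.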
A batch of samples that included 6 severe and 6 non-severe COVID-19 samples was run in duplicate against the above-mentioned list of 35 peptides. The first and second replicates were run two days apart to establish reproducibility. After the data were acquired, the raw files were imported into Skyline and peaks were annotated with the help of a library. The library was built from the in-house LFQ data of COVID-19 samples.

Metabolomics analysis
The extracted samples, with internal standard added, were analyzed using Ultra-High-Performance Liquid Chromatography coupled with tandem Mass Spectrometry (UHPLC-MS/MS) in the positive ion mode of Electrospray Ionization (ESI). The system used was an Ultimate 3000 coupled to a Q Exactive (Thermo Fisher, USA) with a heated ESI (HESI) source and an Orbitrap mass analyzer. The resolution of the mass spectrometer was set at 140,000 for Full MS and 17,500 for ddMS2, with a scan range of 100 to 700 m/z. The capillary temperature was 340 °C, the sheath gas flow rate 42, the aux gas flow rate 10, and the spray voltage 3.8 kV. A C18 column, i.e. Hypersil GOLD (100 x 2.1 mm, 1.9 µm particle size, Thermo Fisher Scientific, USA), was used in the UHPLC with water and 100% methanol as gradient eluents, both containing 0.1% formic acid, in a 20 min gradient. The gradient consecutively reached 1% methanol at 2 min, 50% methanol at 5 min, and 98% methanol at 14 min, stayed at 98% until 17 min, returned to 1% at 17.2 min, and stayed at 1% methanol until 20 min, all at a flow rate of 0.350 mL/min. The samples were run in batches and each sample was analyzed in three technical replicates using MS-only and MS2 modes. Each batch of samples began with blank runs consisting only of resolving solvent, i.e., 50% methanol, and a single blank was run after every sample. Quality control samples consisting of a pool of the samples were also run after every 5 samples to check the consistency of the instrument quality.
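The gradient program above can be written down as a table of breakpoints, which makes it easy to check the solvent composition at any time point. The sketch below is illustrative only: the 1% starting composition at 0 min and the linear ramps between breakpoints are assumptions, since the text gives only the breakpoints themselves, and the names `GRADIENT` and `percent_methanol` are hypothetical.

```python
# (time in min, % methanol) breakpoints taken from the description above.
# The 0-min starting point of 1% is an assumption; the text only states
# that the gradient is at 1% methanol at 2 min.
GRADIENT = [(0.0, 1.0), (2.0, 1.0), (5.0, 50.0), (14.0, 98.0),
            (17.0, 98.0), (17.2, 1.0), (20.0, 1.0)]


def percent_methanol(t_min, table=GRADIENT):
    """Linearly interpolate the methanol percentage at time t_min."""
    if not table[0][0] <= t_min <= table[-1][0]:
        raise ValueError("time outside the 20 min gradient")
    for (t0, p0), (t1, p1) in zip(table, table[1:]):
        if t0 <= t_min <= t1:
            return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)


for t in (0.0, 3.5, 10.0, 15.0, 20.0):
    print(f"{t:5.1f} min: {percent_methanol(t):5.1f}% methanol")
```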

Statistical analysis
Analysis of the acquired data was initially performed with the Compound Discoverer 3.0 software (Thermo Fisher), which performs metabolite identification/quantitation, chromatographic peak alignment, mass spectrum visualization, and statistical analysis. The QC pools and internal standards were checked across batches to decide on the subsequent normalization and transformation strategy. Metabolites of interest were further searched in the METLIN and HMDB databases using the observed m/z with a mass error constraint of 3 ppm at negative mode, and experimental MS/MS spectra were compared with the available reference MS/MS spectra in METLIN and HMDB. Spearman correlation analysis was performed to check the data quality, and samples having R2 above 0.5 were considered for further post-processing. Features with over 30% missing values were filtered out, and missing value imputation was done separately for each cohort through KNN (k-nearest neighbors) in MetaboAnalyst. The data were then log-transformed and median normalized, followed by a two-tailed unpaired Student's t-test for each pair of cohorts. Features having an FDR-adjusted p-value less than 0.05 and a log2 fold change above 1.5 were considered statistically differentially expressed metabolites.
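The filtering and thresholding steps above can be sketched in plain Python. This is a hedged illustration, not the pipeline that was run in Compound Discoverer and MetaboAnalyst. Three things are assumptions: the FDR adjustment method is not named in the text and Benjamini-Hochberg is used here, median normalization is taken to mean per-sample median centering of log2 intensities, and the fold-change cut-off is applied to the absolute log2 fold change. All function names are hypothetical.

```python
import math
from statistics import median


def keep_feature(values, max_missing=0.30):
    """Keep a feature unless more than 30% of its values are missing (None)."""
    missing = sum(v is None for v in values)
    return missing / len(values) <= max_missing


def log2_median_normalize(sample):
    """Log2-transform one sample and subtract its median log-intensity."""
    logged = [math.log2(v) for v in sample]
    centre = median(logged)
    return [v - centre for v in logged]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_differential(adj_p, log2_fc, alpha=0.05, fc_cutoff=1.5):
    """Apply the adjusted p-value and log2 fold-change thresholds."""
    return adj_p < alpha and abs(log2_fc) > fc_cutoff


# Hypothetical raw p-values and log2 fold changes for four features.
raw_p = [0.01, 0.04, 0.03, 0.20]
log2_fc = [2.1, 0.4, -1.8, 3.0]
adj = benjamini_hochberg(raw_p)
print([is_differential(q, fc) for q, fc in zip(adj, log2_fc)])
```

Adjusting p-values before thresholding matters here because many features are tested at once, so an unadjusted 0.05 cut-off would admit a large share of false positives.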

Molecular docking
Differentially expressed proteins from our proteomic study were taken forward for in silico docking studies, for which we retrieved the complete crystal structures of the proteins available from the Protein Data Bank (PDB) 94 . Known inhibitors of the selected proteins were searched in the literature and are termed control inhibitors; for each control, the binding affinity (kcal/mol) was documented. We prepared a library of 58 small molecules, of which 30 are already FDA approved, 9 are in clinical trials, and 19 are in pre-clinical trials. SDF files for each of the compounds were downloaded from the ZINC 15 database 95 . The proteins having a complete crystal structure and known inhibitors were taken forward in this study, and each of them was docked against the library along with its respective controls. We used Autodock Vina 1.1.2 (Trott, O., Olson, 2019), which is built into the PyRx software (https://pyrx.sourceforge.io/), to perform the docking experiment. After loading the .pdb structures of the proteins, they were first converted to macromolecules via Autodock tools. Similarly, SDF files for the selected drugs were converted to PDBQT format, a file format readable by Autodock Vina, using the Open Babel tool. In our blind docking method, the exhaustiveness was set to 50, and instead of choosing a particular active site, the whole protein was enclosed in the grid box. The docking output files were split into individual poses, and the pose having the lowest binding energy was taken forward for further analysis. Finally, the docked structures were visualized using PyMOL (Version 2.4) and Discovery Studio Visualizer Software (Version 4.0) and checked for the binding pockets of the drugs in the library.
Additionally, the protein-ligand interaction profiler (PLIP) server was used to calculate the number and types of interactions between the protein and the drugs 96 .

Data availability
All proteomics data associated with this study are present in the manuscript or the Supplementary Materials. Raw MS data and search output files for proteomics datasets are deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD022296 (Link: https://www.ebi.ac.uk/pride/archive/login, Username: reviewer_pxd022296@ebi.ac.uk, Password: Ewqtl0NJ). Targeted proteomic data is deposited on Panorama Public and the ProteomeXchange ID reserved for the data is PXD022475. It can be accessed using the access URL https://panoramaweb.org/COVID_PLASMA.url

Declarations

Figure 1
Landscape of Proteomic analysis of COVID-19 positive patients: Figure 1A represents the sample cohort, which includes 20 COVID-19 Negative, 18 COVID-19 Non-Severe, and 33 COVID-19 Severe patients; Figure 1B depicts the schematic workflow of label-free quantification under discovery proteomics; Figure 1C illustrates the overview of statistical data analysis; Figure 1D represents the use of a synthetic peptide in Multiple Reaction Monitoring (MRM) under targeted proteomics, with representative peaks shown; Figure 1E and 1F represent the outline of the biological network analysis and docking study, respectively; Figure 1G represents the segregation between COVID-19 Positive (includes Severe and Non-Severe) and COVID-19 Negative samples; Figure 1H portrays a volcano plot showing significant differentially expressed proteins; Figure 1I displays violin plots of the SERPIND1, VWF, and MIF proteins (ns: 5.00e-02 < p <= 1.00e+00; *: 1.00e-02 < p <= 5.00e-02; **: 1.00e-03 < p <= 1.00e-02; ***: 1.00e-04 < p <= 1.00e-03; ****: p <= 1.00e-04); and Figure 1J depicts validation of acute phase proteins such as SAA1 and SAA2 in different study cohorts.

Figure 2
Proteomic analysis of COVID-19 Non-Severe and COVID-19 Severe patients: Figure 2A represents the top 25 differentially expressed proteins of Severe and Non-Severe in the form of a heatmap; Figure 2B and 2C depict the PLS-DA clustering and the significant DEPs in the form of a volcano plot, respectively; Figure 2D displays a panel of 8 proteins found to be most relevant in terms of Severe vs Non-severe (ns: 5.00e-02 < p <= 1.00e+00; *: 1.00e-02 < p <= 5.00e-02; **: 1.00e-03 < p <= 1.00e-02; ***: 1.00e-04 < p <= 1.00e-03; ****: p <= 1.00e-04).

Figure 4
Biological Pathway and Network analysis of differentially expressed proteins in the Severe vs Non-severe comparison: Figure 4A represents the enriched biological processes with their co-expressed proteins in the form of a bipartite network, where a few proteins are shown in the form of violin plots (ns: 5.00e-02 < p <= 1.00e+00; *: 1.00e-02 < p <= 5.00e-02; **: 1.00e-03 < p <= 1.00e-02; ***: 1.00e-04 < p <= 1.00e-03; ****: p <= 1.00e-04). Figure 4B depicts a bar graph of enriched gene ontology terms based on DEPs, where the x axis represents -log10 p value and the y axis represents gene ontologies; Figure 4C shows the network of enriched terms colored by cluster, where nodes that share the same cluster are typically close to each other.

[…] PCA plot representing proper segregation of QC pools from all the batches for quality check of the sample runs, D. PCA plot of the Non-severe and Severe COVID-19 patient cohorts, E. Heat map of significantly altered metabolites in the Non-severe and Severe COVID-19 patient cohorts, F. VIP score plot of significantly altered metabolites in the Non-severe and Severe COVID-19 patient cohorts, and G. Volcano plot of significantly altered metabolites, level 2 MSI, in the Non-severe and Severe COVID-19 patient cohorts, with their trend of alteration represented by box plots.

Figure 6
In-silico molecular docking of drugs against upregulated proteins from different stages of COVID infection: docking results of three different target proteins with the two FDA-approved drugs Selinexor and Ponatinib. Figure 6A represents the predicted 2D interaction diagram of Selinexor with SERPIN A7, involving various chemical bonds. Figure 6B shows the 3D diagrams, which represent the binding pockets as well as the interaction of the drugs with three differentially expressed proteins: SERPIN A7, SERPIN D1, and S100 A9. Selinexor binds SERPIN A7 with a binding affinity of -9.3 kcal/mol, SERPIN D1 with a binding affinity of -8.7 kcal/mol, and S100 A9 with a binding affinity of -7.5 kcal/mol.
Ponatinib binds SERPIN A7 with a binding affinity of -9.3 kcal/mol and SERPIN D1 with a binding affinity of -8.4 kcal/mol.