COVID-19 plasma deep proteome reveals distinct signatures in severe patients

Prognosis and management of COVID-19 severity remain a challenge even months into the pandemic. Host plasma proteome alterations carry insights into the physiological changes that occur in response to the infection. Here we employed a mass spectrometry-based label-free quantitative (LFQ) proteomics approach to study alterations in the plasma proteome in a cohort of 73 patients (20 COVID negative, 18 non-severe, and 33 severe) and to understand the disease dynamics. Of the 1200 proteins detected in the patient plasma, 38 proteins were differentially expressed between the non-severe and severe groups. Host proteins such as Angiotensinogen, apolipoprotein B, SERPINA3, SERPING1, and Fibrinogen gamma chain identified in the LFQ analysis were further validated using a targeted mass spectrometry assay. Utilizing our proteomics dataset, we identified multiple drugs that could inhibit the upregulated proteins involved in disease pathogenesis; of these, two FDA-approved drugs, Selinexor and Ponatinib, showed promise for re-purposing as potential COVID-19 therapeutics. The plasma proteome revealed significant dysregulation in the pathways related to peptidase activity, regulated exocytosis, blood coagulation, complement activation, leukocyte activation involved in immune response, and response to glucocorticoid during severe SARS-CoV-2 infection. Further, the results suggest that COVID-19 severity can be prognosticated using specific biomarkers of severity, and a few of these proteins are excellent targets for re-purposed drugs.


Introduction
Unparalleled events unfolded in the year 2020. A previously unknown infectious microbe held life as we know it to ransom by triggering a raging pandemic of Coronavirus disease (COVID-19) that swept across continents like a wildfire and, much like a wildfire, left behind a trail of more than a million dead and destroyed the lives and livelihoods of millions of others. The microbe, which we now know as Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), 1 is a beta coronavirus that belongs to the family Coronaviridae of the order Nidovirales. The virus thus shares its lineage with other highly infectious viruses, SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV), both of which caused outbreaks in the preceding decades. 2 In humans, SARS-CoV-2 attaches to the Angiotensin-converting enzyme 2 (ACE2) receptor and infects the respiratory tract and lungs, mostly leading to typical flu-like symptoms such as dry cough, body ache, and fever. 3 Sometimes, however, it leads to acute respiratory distress syndrome. Patients with severe COVID-19 deteriorate to multi-organ dysfunction and death 4,5 despite intense medical intervention. Although SARS-CoV-2 primarily targets the respiratory tract and lungs, an increasing body of work has shown that the virus infects other organs such as the gastrointestinal tract, liver, kidney, cardiac muscles, central nervous system, musculoskeletal system, and even the male reproductive system. 6,7 The diagnosis of the disease has been accurate and widespread, owing to the increased availability and deployment of RT-PCR assays 8 and serological test kits, and to the global scaling up of testing rates. 9 However, the prognosis of the disease remains a challenge because the precise pathophysiological pathways that are perturbed during the disease's progression to severity remain mostly unexplored.
Blood is the only body fluid that reaches virtually all organ systems of the body and potentially carries information on perturbations in their physiology. Further, because it can be collected in a minimally invasive manner, it has been the mainstay of several diagnostic tests used for assaying multiple parameters to assess the physiological state of the human body. Indeed, blood is the clinician's biofluid of choice for understanding physiological aberrations, and the blood plasma proteome is thus an excellent source for assessing host response. 10 The sensitivity and high-throughput nature of modern mass spectrometry allow scientists to detect even faint perturbations in host physiology. 11 For our study, we chose to apply a deep proteomics strategy to delineate the systems-wide perturbations brought about by SARS-CoV-2 infection in the human host during non-severe and severe COVID-19. We surveyed the host plasma proteome from blood collected from a large cohort of 73 patients (Supplementary Table S1) of varying COVID-19 severity during their active infection phase.
We could uncover pathological processes associated with the transition from non-severe to severe COVID-19. A mere 38 differentially expressed proteins distinguished severe from non-severe COVID-19. Dysregulation of pathways, especially those related to inflammation, complement activation, and blood clotting, is crucial and can prognosticate the disease's severity. Further, to validate our findings, we conducted multiple reaction monitoring (MRM) studies on unique peptides.
Finally, armed with our knowledge of the differentially expressed proteome, we used a customized drug library for in silico docking studies on over-expressed host proteins to screen potential drugs for the management of the disease. The analysis revealed two FDA-approved drugs with the best binding affinity towards the therapeutic targets reported in the study.
These salient outcomes provide valuable information on plasma biomarkers associated with the severity of COVID-19 and shed light on the mechanistic pathways of the pathogenesis of SARS-CoV-2 infection. Further, the potential FDA-approved drugs showing inhibition towards the marker proteins could be validated in cell-line-based studies.

All samples were loaded onto the LC column at a flow rate of 300 nL/min. Mass spectrometric data acquisition was done in data-dependent acquisition mode with a mass scan range of 375-1700 m/z and a mass resolution of 60,000. A mass window of 10 ppm was set with a dynamic exclusion of 40 s. All MS/MS data were acquired using the higher-energy collisional dissociation (HCD) method of fragmentation, and data acquisition was done using Thermo Xcalibur software version 4.0.

Proteomics data and pathway analysis
The raw datasets were processed with MaxQuant (v1.6.6.0) and searched against the Human Swiss-Prot database (downloaded on 09.07.2020) with the built-in Andromeda search engine. 12 Raw files were processed with Label-Free Quantification (LFQ) parameters, setting the label type to "standard" with a multiplicity of 1. The instrument type was set to Orbitrap Fusion mode. Trypsin was specified as the digestion enzyme with a maximum of 2 missed cleavages. Carbamidomethylation of cysteine (+57.021464 Da) was set as the fixed modification, whereas oxidation of methionine (+15.994915 Da) was set as the variable modification. The false discovery rate (FDR) was set to 1% at both the protein and peptide levels to ensure reliable protein identification. Decoy mode was set to "reverse", and the type of identified peptides was set to "unique+razor".
A sample-wise correlation analysis of the 74 samples was performed to assess data quality, and three samples were removed. Proteomic data from the remaining 71 samples were taken forward for missing value imputation using the k-nearest neighbors (KNN) algorithm in MetaboAnalyst. 13 Statistical analysis and data visualization were carried out in Python and Microsoft Excel. Significantly differentially expressed proteins were determined using Welch's t-test, with a p-value below 0.05 as the cutoff. Biological pathway analysis was done using Metascape for GO enrichment analysis, 14 whereas STRING (version 11.0) 15 was used to prepare the protein-protein interaction (PPI) network.

Targeted proteomics by Multiple Reaction Monitoring Assay
The proteins that were statistically significant and upregulated in COVID-19 severe compared with COVID-19 non-severe and COVID-19 negative samples in the LFQ data were selected for a targeted MRM study. The list of transitions was prepared for unique peptides of these selected proteins using Skyline (version 20.2.1.286). Missed cleavages were set to 0; precursor charges +2 and +3 and product charges +1 and +2 were included, with y-ion transitions from ion 3 to last ion - 1. Pools of each group of COVID-19 samples (positive and negative) were run against all the generated transitions, and the list was finalized based on the resulting data. The final list consisted of 35 peptides from 13 proteins. It included a spiked-in synthetic peptide (FEDGVLDPDYPR, with a heavy-labeled C-terminal arginine) used to monitor the consistency of the mass spectrometry runs. For the experiment, a Vanquish UHPLC system (ThermoFisher Scientific, USA) connected to a TSQ Altis mass spectrometer (ThermoFisher Scientific, USA) was used. The peptides were separated using a Hypersil Gold C18 column (1.9 μm, 100 × 2.1 mm; ThermoFisher Scientific, USA) at a flow rate of 0.45 mL/min for a total time of 10 minutes. The binary buffer system consisted of 0.1% FA as buffer A and 80% ACN in 0.1% FA as buffer B. Approximately 1 µg of BSA was also run with the samples to check uniformity in the instrument response.
A batch of samples that included six severe and six non-severe COVID-19 samples was run in duplicate against the above-mentioned list of 35 peptides. The first and second replicates were run two days apart to establish reproducibility. After the data were acquired, the raw files were imported into Skyline, and peaks were annotated with the help of a spectral library built from the in-house LFQ data of COVID-19 samples.
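To make the transition settings above concrete, the following Python sketch enumerates y-ion transitions for one peptide. It is an illustration rather than a reproduction of Skyline's output: the function name is hypothetical, "ion 3 to last ion - 1" is assumed to mean y3 through y(n-2) for a peptide of n residues, standard monoisotopic residue masses are used, and the heavy label on the C-terminal arginine is ignored because the isotope composition is not stated in the text.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def mz(neutral_mass, charge):
    """m/z of a species carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


def y_ion_transitions(peptide, precursor_charges=(2, 3),
                      product_charges=(1, 2), first_ion=3):
    """List (precursor z, precursor m/z, ion, product z, product m/z) tuples."""
    masses = [MONO[aa] for aa in peptide]
    precursor_mass = sum(masses) + WATER
    n = len(peptide)
    rows = []
    for pz in precursor_charges:
        for k in range(first_ion, n - 1):  # y3 .. y(n-2), assumed range
            fragment_mass = sum(masses[-k:]) + WATER  # last k residues
            for fz in product_charges:
                rows.append((pz, round(mz(precursor_mass, pz), 4),
                             f"y{k}", fz, round(mz(fragment_mass, fz), 4)))
    return rows


transitions = y_ion_transitions("FEDGVLDPDYPR")
print(len(transitions), "candidate transitions for the light peptide")
for row in transitions[:4]:
    print(row)
```

Such a candidate list is typically far larger than what is finally monitored, which is why the list is refined after running the pooled samples.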

Molecular docking
Differentially expressed proteins from our proteomic study were taken forward for in silico docking studies, for which we retrieved the complete crystal structures of the proteins available from the Protein Data Bank (PDB). 16 Known inhibitors of the selected proteins were searched in the literature and are termed control inhibitors; for each control, the binding affinity (kcal/mol) was documented. We prepared a library of 58 small molecules, of which 30 are already FDA approved, nine are in clinical trials, and nineteen are in the pre-clinical phase. SDF files for each of the compounds were downloaded from the ZINC 15 database. 17 The proteins having a complete crystal structure and known inhibitors were taken forward in this study, and each of them was docked against the library along with its respective control. We used Autodock Vina 1.1.2 (Trott, O., Olson, 2019), which is built into the PyRx software (https://pyrx.sourceforge.io/), to perform the docking experiment. After loading the .pdb structures of the proteins, they were first converted to macromolecules via Autodock tools. Similarly, SDF files for the selected drugs were converted to PDBQT format, which is a file format readable by Autodock Vina, using the Open Babel tool. In our blind docking method, the exhaustiveness was set to 50, and instead of choosing a particular active site, the whole protein was enclosed in the grid box. The docking output files were split into individual poses, and the pose having the lowest binding energy was taken forward for further analysis. Finally, the docked structures were visualized using PyMOL (version 2.4) and Discovery Studio Visualizer (version 4.0) and checked for the binding pockets of the drugs in the library. Additionally, the protein-ligand interaction profiler (PLIP) server was used to calculate the number and types of interactions between the protein and the drugs. 18
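The blind docking setup described above, in which the grid box encloses the whole protein, can be illustrated with a short Python sketch that derives a box center and size from the atom coordinates of a PDB file. PyRx handles this step interactively, so the code is not the procedure used in the study; the function name and the 5 Å padding are assumptions chosen for the example.

```python
def blind_docking_box(pdb_lines, padding=5.0):
    """Return (center, size) of a box enclosing all atoms, in angstroms.

    Coordinates are read from the fixed-width columns of ATOM/HETATM
    records (x: columns 31-38, y: 39-46, z: 47-54 of the PDB format).
    `padding` is an assumed margin added on every side of the protein.
    """
    coords = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    if not coords:
        raise ValueError("no ATOM/HETATM records found")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size
```

With a structure file on disk, `blind_docking_box(open("protein.pdb"))` (the file name is a placeholder) would return values that map onto the `center_x`, `center_y`, `center_z`, `size_x`, `size_y`, and `size_z` settings of AutoDock Vina, alongside `exhaustiveness = 50` as stated in the text.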

Deep proteomic analysis of patient plasma
We performed label-free quantification of a total of 74 depleted plasma samples, of which 20 were negative, 18 were non-severe, and 36 were severe (Figure 1A). Figure 1B-F depict, respectively, the schematic workflow of label-free quantification under discovery proteomics, an overview of the statistical data analysis, a summary of the synthetic peptide peaks after Multiple Reaction Monitoring (MRM) under validation proteomics, and the outlines of the biological network analysis and the docking study. The correlation matrix of all 74 samples is shown in Supplementary figure 1. The mass spectrometry settings for the label-free quantification are shown in Supplementary figure 2. The LFQ analysis of the 74 samples yielded a total of 1206 proteins. A list of 278 proteins with imputed missing values from 71 samples was taken forward for partial least squares-discriminant analysis (PLSDA) for an overall assessment of the difference between the COVID-19 positive and COVID-19 negative cohorts. The two cohorts segregated into two separate clusters, as shown in Figure 1G. The statistical analysis between the COVID-19 positive and COVID-19 negative cohorts yielded a list of 27 significantly differentially expressed proteins, which is represented as a volcano plot in Figure 1H and a heatmap in Figure 1J (Supplementary Table S2). We identified proteins such as von Willebrand factor (VWF), Haptoglobin-related protein (HPR), Glutathione peroxidase 3 (GPX3), Alpha-2-macroglobulin (A2M), Carbonic anhydrase 2 (CA2), Protein S100-A8 (S100A8), Carboxypeptidase B2 (CPB2), Heparin cofactor 2 (SERPIND1), Fibrinogen gamma chain (FGG), Profilin-1 (PFN1), and Serum amyloid A-4 protein (SAA4) to be significantly upregulated in the COVID-19 positive patients.
Proteins such as Lymphatic vessel endothelial hyaluronic acid receptor 1 (LYVE1), Intercellular adhesion molecule 1 (ICAM1), Macrophage migration inhibitory factor (MIF), Histidine-rich glycoprotein (HRG), IgGFc-binding protein (FCGBP), Immunoglobulin heavy variable 3-15 (IGHV3-15), and Insulin-like growth factor-binding protein 3 (IGFBP3) were significantly downregulated in the COVID-19 positive patients. Violin plots of a few dysregulated proteins (SERPIND1, VWF, and MIF) are shown in Figure 1I. We also found a protein, Cadherin EGF LAG seven-pass G-type receptor 2 (CELSR2), that was absent from all the COVID-19 negative samples but present in COVID-19 positive samples. The identification of CELSR2 in COVID-19 positive samples is also notable because it was present in 5 of 18 non-severe samples but in 26 of 33 severe samples.
Further, this study also investigated the proteomic alterations between the non-severe and severe cohorts, which yielded a list of 38 significantly differentially expressed proteins (Supplementary Table S3). Figure 2A shows a heatmap of the top 25 differentially expressed proteins between the severe and non-severe cohorts. A list of 287 proteins with imputed missing values was taken forward for partial least squares-discriminant analysis (PLSDA) for an overall assessment of the difference between the severe and non-severe cohorts. The two cohorts segregated into two separate clusters, except for samples P93, P30, and P106, which clustered closer to the opposite cohort (Figure 2B). Figure 2C depicts the significant DEPs as a volcano plot. Proteins such as Kallistatin (SERPINA4), Serum amyloid P-component (APCS), Protein S100-A8 (S100A8), Fibrinogen gamma chain (FGG), Corticosteroid-binding globulin (SERPINA6), and Alpha-1-antichymotrypsin (SERPINA3) were upregulated in the severe cohort, whereas proteins such as Complement factor D (CFD), Monocyte differentiation antigen (CD14), Complement component C8 alpha chain (C8A), Apolipoprotein(a) (LPA), and Apolipoprotein M (APOM) were downregulated in severe compared with non-severe patients. Supplementary figure S3 shows the top 25 differentially expressed proteins between the severe and negative cohorts as a heatmap and depicts the PLSDA clustering of severe versus negative patients.

MRM analysis of proteins overexpressed in severe COVID-19
The MRM study aimed to validate the differentially regulated proteins found between COVID-19 severe and non-severe samples in the LFQ data. The BSA run as a QC standard to monitor the day-wise instrument response is shown in Supplementary figure S4. To establish that all injections gave approximately the same response, we spiked an equal amount of a heavy-labeled synthetic peptide (FEDGVLDPDYPR) into all samples. The uniform peak areas for this peptide, shown in Supplementary figure S5, establish this. Even duplicates run on separate days showed comparable peak areas with very low CV. Based on the response of the differentially regulated peptides, the list was further refined to keep only peptides showing significant dysregulation (adjusted p-values below 0.05) between severe and non-severe samples. For this, the peaks were annotated, and transitions were refined according to the library match to give dotp values for all peptides. A dotp value is a measure of the match between the experimental peak and the library fragmentation pattern. The refined list had 183 transitions belonging to 28 peptides of 9 host proteins and 1 synthetic peptide. The peptide sequences and transitions of proteins that showed differential regulation between COVID-19 non-severe and severe patient samples are listed in Supplementary Table S4. Using the MSstats external tool in Skyline, we determined that the proteins AGT, APOB, SERPINA3, FGG, and SERPING1 each have three or more peptides that show a peak area fold change greater than 3 and an adjusted p-value below 0.05 at a confidence of 95-99% (Figure 3). This validates that, for the given set of samples, these proteins show statistically significant overexpression in COVID-19 severe patients compared with COVID-19 non-severe patients (refer to the data availability section for the Skyline files).

Biological pathway and network analysis of differentially expressed proteins in the severe vs. non-severe comparison
We also identified the enriched biological processes for the 38 dysregulated proteins in COVID-19 severe compared with COVID-19 non-severe patients. The proteins in the enriched biological processes are shown as a protein-protein interaction network, and a few proteins are shown as violin plots (Figure 4A). Figure 4B shows a network of enriched terms colored by cluster, where nodes that share the same cluster are typically close to each other. We identified biological processes such as regulation of peptidase activity, regulated exocytosis, extracellular structure organization, blood coagulation, fibrin clot formation, complement activation (classical pathway), leukocyte activation involved in immune response, and response to glucocorticoid to be enriched in COVID-19 severe patients. The list of proteins expressed in these pathways is shown in Supplementary Table S5.

In-silico screening of drugs against differentially expressed proteins
We performed in silico molecular docking of significantly altered proteins with the library of 58 drugs (Supplementary table 6A-E). Of the 58 drugs, 30 are FDA approved, nine are in clinical trials, and 19 are in the pre-clinical phase. For each protein, we identified from the literature a positive control drug that is a known inhibitor of the protein. The positive control drug provides a possible cut-off for the docking score. After docking, we used two criteria for selecting drugs for a protein. Firstly, the drug's binding affinity should be equal to or better than that of the control inhibitor (that is, an equal or more negative binding energy). Secondly, the drug's binding pocket should be similar to that of the control drug. For the COVID-19 non-severe vs. severe comparison, we docked five significant proteins: Heparin cofactor 2, Thyroxine-binding globulin, Angiotensinogen, Carbonic Anhydrase-1, and Carbonic Anhydrase-2.
Heparin cofactor 2 (SERPIN D1) is a protein of 499 amino acids. It binds the drug Sulodexide with a binding affinity of -7.1 kcal/mol, and Sulodexide was hence taken as the control drug (Supplementary figure S6A). When the protein was docked with the customized drug library, we found four FDA-approved drugs that bind to a binding pocket similar to that of the control drug and have better binding affinity than Sulodexide, namely Selinexor (-8.7 kcal/mol), Ponatinib (-8.4 kcal/mol), EGCG (-7.7 kcal/mol), and Nafamostat (-8.1 kcal/mol). Another protein, Thyroxine-binding globulin (SERPIN A7), has 415 amino acids. It showed a binding affinity of -7.4 kcal/mol with the drug Tamoxifen, a well-known inhibitor of the protein; hence we used Tamoxifen as the control inhibitor (Supplementary figure S6B). From the customized drug library, SERPIN A7 bound Selinexor and Ponatinib with a binding affinity of -9.3 kcal/mol (Figure 6B). The 2D interaction diagram of Selinexor docked with SERPIN A7 shows that the amino acids Y20 and R381 form potential hydrogen bonds at the binding site (Figure 6A). Angiotensinogen, a protein of 485 amino acids, binds Irbesartan with a binding affinity of -8.4 kcal/mol. Irbesartan is a known inhibitor of the protein, so we used it as the control drug (Supplementary figure S6C). Angiotensinogen binds the drug ML-240, which is a pre-clinical drug, with a binding affinity of -8.9 kcal/mol. This is the only drug we identified in our study to target Angiotensinogen. We also performed docking of Carbonic Anhydrase-1 (261 amino acids) and Carbonic Anhydrase-2 (260 amino acids). The small molecule Topiramate binds Carbonic Anhydrase-1 with a binding affinity of -9.2 kcal/mol (Supplementary figure S6D), and Acetazolamide binds Carbonic Anhydrase-2 with a binding affinity of -6.3 kcal/mol (Supplementary figure S6E); they were used as the control drugs for the respective proteins.
Our study identified EGCG as the only FDA-approved drug that can be used to target Carbonic Anhydrase-1 (binding affinity -9.5 kcal/mol) and Nafamostat (binding affinity -8.2 kcal/mol) as the only one to target Carbonic Anhydrase-2. Four proteins from the COVID-19 positive vs. negative comparison were used for molecular docking: Protein S100 A9, Carboxypeptidase B2, Glutathione S-transferase omega-1, and 6-Phosphogluconate dehydrogenase.
We found that the drug Rapamycin can be used to target all four proteins, as it binds the proteins with higher binding affinity than their respective control drugs and its binding pocket is the same as that of the control drug. The first protein is Protein S100 A9, a small protein of 114 amino acids. We used Tasquinimod as the control inhibitor, as it binds the protein with a binding affinity of -7.5 kcal/mol (Supplementary figure S6F). Using the above-mentioned criteria to select drugs, we identified that Protein S100 A9 can be targeted using two FDA-approved drugs: Selinexor, which also binds the protein with a binding affinity of -7.5 kcal/mol, and Rapamycin, an mTOR inhibitor, which binds to the protein with a binding affinity of -8. Four FDA-approved drugs from our customized drug library can be used to target GSTO-1: Rapamycin (binding affinity -8.8 kcal/mol), Selinexor (binding affinity -8.6 kcal/mol), Ponatinib (binding affinity -9.1 kcal/mol), and Silmitasertib (binding affinity -8.3 kcal/mol). 6-Phosphogluconate dehydrogenase is a protein of 483 amino acids. We used Physcion as the control drug; it binds with 6-PGDH 1 with a binding affinity of -7.0 kcal/mol (Supplementary figure S6I). Six FDA-approved drugs from our customized drug library can be used to target 6-PGDH: Rapamycin (binding affinity -8.8 kcal/mol), Selinexor (binding affinity -8.5 kcal/mol), Ponatinib (binding affinity -10.3 kcal/mol), Silmitasertib (binding affinity -7.7 kcal/mol), Daunorubicin (binding affinity -8.4 kcal/mol), and Dabrafenib (binding affinity -8.6 kcal/mol). In our molecular docking analysis, Rapamycin, a drug already approved for organ transplant rejection, bound all four significantly upregulated proteins of the COVID-19 positive vs. negative comparison.
We also found that Selinexor, an exportin antagonist approved for multiple myeloma, and Ponatinib, a tyrosine kinase inhibitor approved for Chronic Myeloid Leukemia (CML), can be used to target proteins from both the COVID-19 positive vs. negative comparison and the non-severe vs. severe comparison, as both were predicted to inhibit proteins from both comparisons. Another drug, Pevonedistat, which is in clinical trials, could also be explored in the future against COVID-19, as it was also predicted to inhibit proteins from both comparisons.
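The two selection criteria used in this section (a binding affinity at least as good as the control inhibitor, and a binding pocket similar to that of the control) amount to a simple filter, sketched below in Python. The affinities are the values reported above for Heparin cofactor 2 with Sulodexide as the control; the data structure, the function name, and the boolean `same_pocket` flag, which stands in for the visual pocket comparison, are assumptions made for this illustration.

```python
from typing import NamedTuple


class DockingHit(NamedTuple):
    drug: str
    affinity: float    # kcal/mol; more negative means stronger predicted binding
    same_pocket: bool  # stand-in for the visual comparison with the control pocket


def select_candidates(hits, control_affinity):
    """Keep hits that bind at least as well as the control in a similar pocket."""
    kept = [h for h in hits
            if h.affinity <= control_affinity and h.same_pocket]
    return sorted(kept, key=lambda h: h.affinity)  # strongest binder first


# Values reported for Heparin cofactor 2 (control: Sulodexide, -7.1 kcal/mol).
heparin_cofactor_2_hits = [
    DockingHit("Selinexor", -8.7, True),
    DockingHit("Ponatinib", -8.4, True),
    DockingHit("EGCG", -7.7, True),
    DockingHit("Nafamostat", -8.1, True),
]
selected = select_candidates(heparin_cofactor_2_hits, control_affinity=-7.1)
print([h.drug for h in selected])
# ['Selinexor', 'Ponatinib', 'Nafamostat', 'EGCG']
```

Because docking scores are negative, "equal to or better than the control" translates into a less-than-or-equal comparison on the binding energy.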

Discussion
Nasopharyngeal swab samples and serological tests are routinely used in clinics to diagnose SARS-CoV-2 infection; however, biomarkers for prognosis of the disease before it leads to fatal symptoms are yet to be found. Understanding the host response to the viral infection might provide important clues on the progression of the disease from non-severe to severe. A proteomics approach was applied for an in-depth understanding of the disease mechanism. A few studies have already reported differences in the levels of blood-based proteins such as lactate dehydrogenase (LDH), D-dimers, and inflammatory markers such as C-reactive protein (CRP), ferritin, and fibrinogen in COVID-19 patients. 19,20 One specific strength of our study is the in-depth profiling of the plasma proteome from a cohort of COVID-19 patients (n = 73), facilitating a robust and statistically significant evaluation of differential expression between the non-severe and severe disease groups. Since the Indian subcontinent has been reasonably unscathed by the severity of the pandemic, reporting some of the lowest case fatality rates per million, 21 our study holds importance in understanding the biology associated with this observation.
Deep proteome comparison between COVID-19 positive patients and COVID-19 negative patients who exhibited symptoms similar to COVID-19 revealed a subset of proteins that differentiates COVID-19 from other febrile respiratory maladies. Of importance was the upregulation of the von Willebrand factor (VWF). Animal studies have shown that increased VWF might be due to hypoxic conditions in the lung endothelial cells; 22 however, this induces a risk of arterial or venous thrombosis since it directly promotes the thrombotic process during inflammation. 23 An increase in Haptoglobin-related protein (HPR) is also found in cases of idiopathic pulmonary fibrosis 24 and as a factor in non-bacterial pneumonia, 25 and HPR may thus act as a biomarker of lung trauma. Carboxypeptidase B2 has anti-inflammatory and anti-fibrinolytic effects. Its increase in this cohort indicates the natural response to the systemic inflammation brought about by COVID-19. 26 Overexpression of another protein, Profilin-1 (PFN1), which is implicated in vascular hyperpermeability and vascular hypertrophy, can perhaps explain the aberrant physiology of COVID-19 patients. Apart from these, acute-phase response proteins such as SAA4 and S100A8 were also upregulated in response to COVID-19.
Interestingly, however, the protein Lymphatic vessel endothelial hyaluronic acid receptor 1 (LYVE1) was downregulated, which might indicate liver injury. 27 At the same time, attenuated Histidine-rich glycoprotein (HRG) expression might explain the altered hemostasis in the patients. 28 There is a caveat, however: most of these patients were under medication, so the results might also be due to the ongoing therapies rather than the disease itself.
In our plasma deep proteome study of non-severe vs. severe patients, we observed only 38 differentially expressed proteins. Proteins such as FGG, S100A8, VWF, SAA4, SERPIND1, and SERPINA6 were upregulated in the COVID-19 positive patients, while, interestingly, the mitochondrial 60 kDa heat shock protein (HSPD1) was expressed only in the severe patients. It has already been reported that high levels of circulatory HSPD1 are associated with cardiac failure. 29 Therefore, increased HSPD1 in severe patients could act as a clinical biomarker of cardiac malfunction in the severe group. Our results also indicated an increase in plasma Cholinesterase (BCHE) in the severe group; BCHE is upregulated in patients suffering from mild ischemic stroke. 30 These findings thus implicate a severe COVID-19-associated risk of cardiac and CNS injury, which has already been reported by clinicians, and these biomarkers could help in the prognosis of this threat.
The plasma levels of carbonic anhydrase 1 (CA1) were found to be substantially elevated in the severe group. Increased carbonic anhydrase has been found to mediate hemorrhagic retinal and cerebral vascular permeability. 31 The ramifications of increased CA1 are also substantiated by earlier reports on a cohort study of sepsis secondary to pneumonia, 32 where it was found to be upregulated during sepsis.
Moreover, the role of increased CA1 in the worsening of ischemic diabetic cardiomyopathy also paints a rather gloomy picture of the cardiac sequelae of COVID-19, especially in diabetic patients, 33 and might also contribute to the increased fatality of diabetic patients. 34
The Fibrinogen gamma chain (FGG) was also found to be upregulated in severe patients compared with non-severe patients. FGG is an oligomeric glycoprotein produced in the liver and secreted into the blood. Increased fibrin formation and breakdown correlated with the high level of D-dimers observed in the COVID-19 patients with the worst outcomes. 35 The increasing level of FGG in severe patients might be due to liver injury, impairing hepatic fibrinogen secretion with acquired fibrinogen storage disease. 36 The protein S100A8 (calgranulin A/myeloid-related protein 8) belongs to the group of alarmins or damage-associated molecular patterns (DAMPs), which are released in response to stress against microbial infection and exacerbate the inflammatory response. Chen et al. reported that the level of S100A8 correlated positively with the Ct value and oxygen demand, indicating the severity of acute respiratory distress syndrome (ARDS) in COVID-19 patients. 37 A recent study showed that severe COVID-19 patients release massive amounts of S100A8, which is accompanied by changes in monocyte and neutrophil subsets. 38 AGT was found to be significantly upregulated in severe patients compared with non-severe patients. Angiotensinogen (AGT) is a component of the renin-angiotensin system (RAS), which regulates blood pressure and fluid balance, and is the substrate of renin. The dysregulation of AGT and RAS might lead to acute lung injury and acute respiratory distress, leading to a severe prognosis. 39 Apolipoprotein B-100 (APOB) is involved in lipid transport and low-density lipoprotein (LDL) catabolism.
The high levels of APOB in the plasma might be an indicator of the significant cardiovascular manifestations seen in severe COVID-19 patients. 40 These results are consistent with previous findings 41 on COVID-19 patient sera, which identified dysregulation of multiple apolipoproteins. Several serine protease inhibitors (SERPINs), such as SERPING1 and SERPINA3, were also found to be upregulated in severe patients. The increasing level of SERPINs, which are acute-phase proteins, correlates positively with the high level of IL-6 seen in severe patients. 42 The validation study by MRM-based assay could specifically detect AGT, FGG, APOB, SERPING1, and SERPINA3 host peptides in COVID-19 patients. The mass spectrometry-based detection of host peptides used in our study could be used in clinics for the prognosis of disease severity.
A subset of proteins was also downregulated in the severe group. Peptidase inhibitor 16 (PI16) was strongly downregulated (FC = -2.45, P < 0.005). This agrees with previous studies showing that, while PI16 has a protective role against atherosclerosis, it is inhibited by circulating inflammatory cytokines that act through the NF-κB signaling pathway. 43 These results may help explain the pathogenesis of cardiac maladies in severe COVID-19 patients. 44 Patients with severe COVID-19 often show a lower platelet count. 45 Our study shows that TPM4, a crucial factor in platelet biogenesis, 46 is downregulated (FC = -1.94, P < 0.001) in severe cases, thereby providing novel biological insight into COVID-19 severity. Another ubiquitously present protein, βII spectrin, was also downregulated in severe COVID-19; because inadequate βII spectrin might precipitate arrhythmia, heart failure, or even neurodegeneration, 47 this finding is of particular importance. Two further proteins, APOM, which is known to protect the lungs and kidneys from injury, 48 and APOA2, were also downregulated; similar observations were previously reported. 41
Functional enrichment analysis of the 38 differentially expressed proteins in the severe vs. non-severe comparison revealed that these proteins are enriched in pathways related to blood coagulation, fibrin clot formation, the complement system, leukocyte activation, regulation of peptidase activity, regulated exocytosis, and extracellular structure organization, among others. Proteins such as A2M, SERPINA4, SERPINA3, SERPING1, and FGG, which are involved in regulated exocytosis of platelets, were upregulated in the severe cohort, suggesting increased consumption of platelets. This could be a reason for the lower platelet count (clinically called thrombocytopenia) commonly reported in severe cases of COVID-19, 49 which is also associated with coagulation abnormalities, disease severity, and mortality. 50-52 There is considerable evidence that platelets have potent immune and inflammatory effector functions aside from their role in hemostasis. Interaction between viruses and platelets is known to stimulate platelet degranulation, leading to the release of a variety of cytokines and chemokines. 53,54 Platelets also interact directly with leukocytes and endothelial cells to trigger and modulate inflammatory reactions and immune responses. 54 Thus, platelet hyperactivity due to the upregulation of these proteins correlates with the over-exuberant host inflammatory response as COVID-19 progresses from non-severe to severe.
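As an illustration of how the pathway enrichment of a set of differentially expressed proteins can be assessed, the following minimal sketch computes an over-representation P value with a one-sided hypergeometric test. This is a generic approach and not necessarily the procedure used by the enrichment tool in this study. The background of 1200 detected proteins and the 38 differentially expressed proteins are taken from the text; the function name, the pathway size, and the overlap are hypothetical placeholders.

```python
from math import comb


def enrichment_pvalue(background: int, pathway_size: int,
                      selected: int, overlap: int) -> float:
    """Return P(X >= overlap) under a hypergeometric null.

    background   -- proteins detected in the experiment
    pathway_size -- detected proteins annotated to the pathway
    selected     -- differentially expressed proteins
    overlap      -- differentially expressed proteins in the pathway
    """
    total = comb(background, selected)
    upper = min(pathway_size, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(pathway_size, i) * comb(background - pathway_size, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# 1200 detected and 38 differentially expressed proteins come from the text;
# the pathway size (40) and the overlap (5) are hypothetical placeholders.
p = enrichment_pvalue(background=1200, pathway_size=40, selected=38, overlap=5)
print(f"enrichment P value: {p:.3g}")
```

In practice, such P values would be computed for every annotated pathway and then corrected for multiple testing before any pathway is reported as enriched.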
Many of the peptidase activity regulator proteins, including SERPINA4, SERPING1, SERPINA3, SERPIND1, and A2M, are involved in blood coagulation and inflammation pathways. SERPINA4 is an inhibitor of the kallikrein-kinin system involved in coagulation and inflammation. 55,56 SERPING1 is an inhibitor of the classical pathway of the complement system as well as of several proteins involved in blood coagulation. 57 SERPINA3 is a significant inhibitor of cathepsin G, a key proteolytic enzyme and inflammatory effector released by neutrophils. 58 A2M is an inhibitor of a variety of proteases involved in blood coagulation and inflammation, including thrombin, kallikrein, plasmin, and cathepsin G; 59 SERPIND1 regulates blood clot formation by inhibiting thrombin. 60 Moreover, FGG, or fibrinogen gamma chain, is a component of the clotting factor fibrinogen, which promotes tissue repair. High fibrinogen levels are associated with bleeding and thrombosis and correlate with the increased erythrocyte sedimentation rate (ESR) observed in severe cases. 61,62 COVID-19 associated coagulopathy is common in severe patients, 63 while overt disseminated intravascular coagulopathy, a critical condition characterized by abnormal blood clotting and bleeding, is observed in most of the critically ill patients who do not survive. 64,65 These thrombotic complications can be characterized by dysregulation of proteins involved in blood coagulation, fibrin clot formation, and platelet exocytosis. These proteins can, in turn, be associated with disease severity and mortality risk and could serve as biomarkers for a better prognosis.
Consistent with previous studies, multiple acute phase proteins (APPs) such as APCS, C4B, A2M, SERPING1, SERPINA3, and FGG were upregulated in severe patients. 41,66 APPs are produced as part of the body's innate response to any kind of stress. Tissue damage caused by injury or infection instigates a local inflammatory response that leads to the release of pro-inflammatory cytokines. APPs are synthesized and released mainly by liver hepatocytes in response to these cytokines. 67 Severe COVID-19 patients tend to have higher levels of pro-inflammatory cytokines, 68-70 which may explain the elevated APP levels and the acute inflammatory state correlating with disease severity. The complement system is a significant contributor to the acute phase response against infection. C4B is a proteolytic product of complement factor C4 and is involved in the propagation of all three complement pathways; 71 APCS, or serum amyloid P component (SAP), is an activator of the classical pathway of the complement system. 72 Other proteins involved in the complement system that were downregulated in the severe cohort include CFI, patients with commercially available drugs include two major categories: treatment with antiviral drugs and treatment with immune modulators. HIV protease inhibitors are well-known examples of the former category, but no definitive studies have yet proven those drugs to be potent inhibitors; hence the search has to continue. 79 In our study, we performed an in silico drug re-purposing analysis with 9 proteins from our proteomic analysis against a library of 58 small molecules. The chosen drugs were previously found to target the protein-protein interactions between SARS-CoV-2 and human proteins in a cell line model. 80 Two FDA-approved drugs, Selinexor and Ponatinib, were predicted to inhibit most of the proteins drawn from two comparisons: COVID-19 positive vs. COVID-19 negative, and non-severe vs. severe patients' plasma. 
The Food and Drug Administration (FDA) previously approved Selinexor for the treatment of multiple myeloma in combination with dexamethasone. 81 The drug is a first-in-class exportin-1 (XPO1) inhibitor that induces apoptosis in cancer cells by blocking nucleocytoplasmic transport of tumor suppressor proteins. 82 Although originally developed as anticancer drugs, exportin inhibitors can act as antiviral drugs, as they have the potential to block the intracellular replication of viral particles by inhibiting the transport of viral replication proteins into the cytoplasm. 83 Hence, the drug is currently under a phase 2 clinical trial for COVID-19 infection (ClinicalTrials.gov Identifier: NCT04349098). Five plasma proteins from our study, including Thyroxine-binding globulin (SERPINA7) and Heparin cofactor 2 (SERPIND1), both belonging to the family of serine protease inhibitors (SERPINs), were shown to interact tightly with Selinexor, suggesting that these SERPINs could be targets for the drug. Both proteins were also seen to interact with another FDA-approved drug, Ponatinib. Originally developed as a tyrosine kinase inhibitor, Ponatinib is used to treat patients with chronic myeloid leukemia (CML). 84 It is not currently under any clinical trial for COVID-19 patients, but recent studies in mouse models showed that it could suppress the cytokine storms arising from viral infections such as influenza. 85 Hence Ponatinib, as an immune modulator, appears to be a suitable candidate for therapeutic cocktails against COVID-19 infection in the future.
A stitch in time saves nine. Perhaps no proverb holds as true as this one when it comes to managing case fatality in COVID-19. While the COVID-19 pandemic has unleashed an unprecedented crisis on most lives and livelihoods, it is the health workers who have borne the brunt of the pandemic. With limited hospital infrastructure and overworked staff, it is pertinent to uncover the factors leading to the severity of symptoms associated with COVID-19, in order to reduce the caseload in hospitals and manage severe cases at an accelerated pace. Further, the validation of these biomarkers can help clinicians achieve faster disease prognosis and better survival rates, especially in the vulnerable population, while selecting drugs based on the severity of infection would help to better manage symptoms.

Declarations
Data availability
All proteomics data associated with this study are present in the manuscript or the Supplementary Materials. Raw MS data and search output files for the proteomics datasets are deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD022296 (Link: https://www.ebi.ac.uk/pride/archive/login, Username: reviewer_pxd022296@ebi.ac.uk, Password: Ewqtl0NJ). Targeted proteomic data are deposited on Panorama Public, and the ProteomeXchange ID reserved for the data is PXD022475. They can be accessed using the access URL https://panoramaweb.org/COVID_PLASMA.url
Figure 1 Landscape of proteomic analysis of COVID-19 positive patients: Figure 1A represents the sample cohort, which includes 20 COVID-19 Negative, 18 COVID-19 Non-Severe, and 33 COVID-19 Severe patients; Figure 1B depicts the schematic workflow of label-free quantification under discovery proteomics; Figure 1C illustrates the overview of statistical data analysis; Figure 1D represents the use of synthetic peptides in Multiple Reaction Monitoring (MRM) under targeted proteomics, with representative peaks shown; Figures 1E and 1F represent the outline of the biological network analysis and the docking study, respectively; Figure 1G represents the segregation between COVID-19 Positive (includes Severe and Non-Severe) and COVID-19