8-Hydroxydihydrosanguinarine (8-HDS), a pyridone containing analogue of sanguinarine, can be a potential inhibitor of S protein and M protease of SARS CoV2: Insights from computational studies

The unprecedented global pandemic of COVID-19 has created a daunting scenario urging an immediate generation of therapeutic strategy. Interventions to curb the spread of viral infection primarily include setting targets against the virus. Here in this study we target S protein to obstruct the viral attachment and entry and also the M pro to prevent the viral replication. For this purpose, the interaction of S protein and M pro with phytocompounds, sanguinarine and eugenol, and their derivatives were studied using computational tools. It is evident from the docking studies that 8-Hydroxydihydrosanguinarine, a derivative of sanguinarine, exhibits maximum binding affinity with both the targets. The binding energies of the ligand with S protein and M pro scored to be ΔGb -9.4 Kcal/mol and ΔGb -10.3 Kcal/mol respectively. MD simulation studies depict that the phytocompound could effectively cause structural perturbations in the targets which would affect their functions. 8-Hydroxydihydrosanguinarine distorts the α -helix in the secondary structure of M pro and RBD site of S protein. Protein-protein interaction study in presence of 8-hydroxydihydrosanguinarine (8-HDS) also corroborate the above findings which indicate that this polyphenol interfere in the coupling of S Protein and ACE2. The alterations in protonation of M pro suggest that the protein structure undergoes significant structural changes at neutral pH. ADME Solubility, Pharmacokinetics, Drug-likeness) property of 8- hydroxydihydrosanguinarine suggests that this could be a potential drug. This makes the phyto-alkaloid a possible therapeutic molecule for anti COVID-19 drug design.


INTRODUCTION
The ominous COVID-19 has gripped the globe with panic and distress. The coronavirus, also known as SARS-CoV2 because of around 79.5% genomic identity with the RNA of SARS CoV [1], belongs to genus betacoronavirus [2][3][4]. Due to its high rate of transmission and unavailability of specific therapy, it was proclaimed as a pandemic by WHO.
SARS-CoV2, the enveloped, positive sense, single stranded RNA virus, causes respiratory infections in humans [5].The single stranded genomic RNA, ~30 kb in length [6], comprises atleast 6 open reading frames (ORFs) along with 5' cap structure and 3' poly A tail [7]. The first ORF occupying about two-third of the genome length encodes two translational products, polyproteins -pp1a and pp1ab [8] which mediates viral replication and transcription. The viral expression is coordinated by a highly complex proteolytic processing cascade [9]. The M pro, main protease (also known as 3 CL-pro) plays a pivotal role in processing these polyproteins into 16 mature non-structural proteins (nsp) [8]. The nsps are employed for the production of sub genomic RNAs which are required for synthesis of the structural proteins i.e. envelope (E), spike (S), membrane (M), and nucleocapsid (N) proteins and other accessory proteins.
Various X-Ray crystallographic studies [7,10] depict that M pro comprising protomers a and b consists of 306 Amino acid residues. Each protomer consists of three domains, domain-I (8 -99 aa residues), domain-II (100 -183 aa residues) and domain-III (200 -306 aa residues). A connecting loop, between domain II and III, spans from 184 to 199 residues. The N-terminal or N finger (1-7 residues) is responsible for the proteolytic activity [11,12] whereas, the Cterminal domain III of M pro plays a major role in the dimerization of protomers a and b [13].
The establishment of SARS-CoV in the host cell involves proteolytic processing events mediated through M pro which direct gene expression and replication [14]. There are about 11 cleavage sites of M pro on the larger polyprotein pp1ab with Leu-Gln↓Ser, Ala, Gly as recognition sequence. Thus inhibiting this protease activity would block viral replication [15]. Incorporation of protease inhibitors such as lopinavir/ritonavir into CoV infected patients has responded with healthier consequences [16]. Cure to such infections might be more improved by administration of non-covalent protease inhibitors. Most of the anti-SARS approved drugs were intended for certain other viral strains. Recently, a few clinical development programs have effectively accomplished with a CoV protease specific inhibitor [17]. While clinical trials are sprinting for resolving the broad range efficiency scale of various molecules in infected patients, the main protease (M pro) seems to be a promising target in drug design [18,19].
The entry of SARS-CoV-2 into host cells is mediated through the binding of glycoproteinous spike protein of the virus, the S protein and host cell membrane receptor, i.e. Angiotensin Converting Enzyme 2 (ACE2) [20] (Fig. S1). Lu et al., 2020 have suggested that receptor binding domain (RBD) of S-glycoprotein of SARS-CoV2 being similar to that of SARS-CoV, targets Angiotensin Converting Enzyme 2 (ACE2), a monomeric membrane bound receptor on human cells [21]. Hence, it is presumed that blocking the RBD site of the spike protein would prevent viral attachment to the ACE2 receptor and entry into the host cells, thus specifying it as a target to prevent the viral infection [20,22]. Though a few recent and related studies have suggested various natural products for the development/repurposing of anti SARS-CoV2 drug targeting Spike glycoprotein, Membrane glycoprotein and Main protease [20,23,24], to the best of our knowledge none has investigated the potentiality of sanguinarine or eugenol for this purpose.
This study, therefore, takes interest in inspecting the competence of two phytocompounds, sanguinarine and eugenol and their derivatives, in preventing viral infection/replication by inhibiting M pro and blocking S protein through in silico approaches. Virtual screening of sanguinarine and eugenol derivatives was conducted on the basis of molecular interaction, out of them 8-hydroxydihydrosanguinarine (8-HDS) showed the highest binding affinity. 8-HDS, a derivative of sanguinarine, obtained from Chelidonium majus L and is a benzo [c] phenanthridine type alkaloid [25]. Sanguinarine, is known for its wide range of biological activities such as regulation of nuclear factor NF-κB, apoptosis and disintegration of microtubules [26]. Pyridone containing analogues indeed have been proven to be potent in preventing viral replication by inhibiting certain essential viral enzymes [27]. Also tests have been conducted for pyridone derivatives for anti-hepatitis B virus (HBV) activity and cytotoxicity in vitro [28]. Therefore, 8-HDS containing pyridone ring might prove to be effective in averting SARS CoV 2 infection [15] (Fig. S2).
Needless to mention that computational study minimizes biological waste, research time span and is cost-effective. In this context we have employed various computational approaches (Molecular docking and simulation) for designing of natural compound based therapeutic intervention against COVID-19. Findings of this computational study indicate the potentiality of 8-HDS as an effective and promising therapeutic agent for SARS-CoV 2.

Sequence analysis
Using PDB database, the cryo EM structure of COVID-19 S protein (PDB ID -6vsb) and X-Ray diffraction structure of COVID-19 M pro (PDB ID-6y84) having resolution 3.46 Å and 1.39 Å respectively have been obtained for computational study. For analysing multiple sequence alignment the FASTA sequences of M protein of 2019-nCoV, MERS-CoV, HCoV-NL63, SARS-CoV were also derived. FASTA sequence of M pro of above PDB structure was taken for the analysis of its physicochemical property and secondary structure prediction in ExPASy ProtParam and SOPMA tools, respectively.

S-protein-ACE2 interaction in presence of 8-HDS
A computerised rigid body docking tool, clusPro2.0 was used for S-protein-ACE2 protein docking analysis in presence or absence of 8-HDS. This tool helps in screening docked conformations with respect to their clustering properties taking into consideration different protein parameters. The selection of the filtered conformations was based on the assessment of empirical free energy. Both lowest de-solvation and electrostatic energies were taken into account for the evaluation of free energy. ClusPro is accessible at https://cluspro.bu.edu/publications.php. Piper, being a FFT-based rigid docking tool, serves the ClusPro clustering program for detecting native site by providing 1000 low energy outcomes [29]. The native site is assumed to possess a wide range of free-energies to draw greater number of results. Initially the sample was taken for about 10 9 positions of the ligand with respect to the receptor. Out of these, only the top 10 3 positions were selected among all relative ligand positions in correspondence to the receptor.

Molecular docking analysis between M pro and S protein with sanguinarine and eugenol
The binding affinities of M pro and S protein with different derivatives of sanguinarine and eugenol were evaluated through molecular docking program AutoDock Tools 1.5.6. The canonical SMILES ids of sanguinarine and eugenol along with their derivatives were acquired from PubChem database (https://pubchem.ncbi.nlm.nih.gov/). CHIMERA 1.11.2 [30] programme was used for the conversion of 3D structures. Binding affinity of M pro and S protein with derivatives of sanguinarine and eugenol was estimated using AutoDock Vina1.1.2 [31]. Various parameters such as binding affinity, receptors interacting atom, receptor pocket atom, receptor ligand interaction site, atomic contact energy (ACE) and side amino acid residues were studied to recognise the binding site of M pro. Virtual screening of sanguinarine and eugenol derivatives was conducted on the basis of molecular interaction.
Pictorial depiction of docking results were analysed by Discovery Studio 2017 R2 Client [32].

Molecular simulation
The ligand 8-HDS was extracted in chemically unstandardized 2D structures from PubChem database (https://pubchem.ncbi.nlm.nih.gov/). LigPrep [33] was used to standardise the ligand files, lower energy and extrapolated 3D structures which were virtually screened by Glide [34]. The orthorhombic, truncated octahedron and rhombic dodecahedron were taken up for precisely directing solvent simulations with periodic marginal conditions. An 8-staged stabilization run was conducted prior to 100 ns production run. Beginning primarily with task, then followed by simulations with NVT at T = 10 K and small time steps in Brownian Dynamics and restraints on solute heavy atoms for 100 ps. The third stage included repetition of the above stage but with restraints on solute heavy atoms for 12 ps, the following 4th stage was also carried out in a similar manner with NPT instead of NVT. In stage 5, solvate pockets were focussed. Similar to stage 4, the stage 6 was carried out. The next stage was concerned with simulation at NPT for 24 ps with no restraints. Ultimately, simulations were done.
In this study, Molecular simulations were performed specifically for the top two identified hits to study the stability of the ligand receptor complex for 100 ns. Stability of docked complexes of ‗S protein -8-HDS' and ‗M pro -8-HDS' till 100 ns simulation time was checked using system builder of Desmond implemented in Maestro [36]. The system for ‗S

pKa Calculations
PROPKA program is one of the widely used approaches for calculating of pKa values in proteins. PROPKA generates convenient structural rationalization of the predicted pKa values. In this program, Graphical User Interface (GUI) tool was employed for computing the pH-dependent properties of proteins such as charge and stabilization energy by providing a direct link between the structure and the pKa data (predicted by the PROPKA calculations, via the Visual Molecular Dynamics (VMD) program) [38]. PROPKA2.0 was used to calculate the pKa value of M protease, S protein and protein ligand complex -S protein-8-HDS‖ and -M protease -8-HDS‖ at pH 7.0.

Drug likeliness analysis
Swiss ADME is a robust web tool to access physicochemical properties, pharmacokinectics, drug-likeliness and medical chemistry friendliness of a molecule to determine its proficiency to be used as a drug. Bioavailability Radar (lipophilicity, size, polarity, solubility, saturation and flexibility) has been used for this purpose to calculate drug-likeness. Assessment of ADME (absorption, distribution, metabolism and excretion) is essential for drug design. The Canonical SMILES of 8-HDS was retrieved from PubChem database and the ADME properties were predicted at Swiss ADME (http://www.swissadme.ch/).

Structural analysis
The secondary structure of M protease as predicted from SOPMA (Self Optimised Prediction Method with Alignment) showed that the M protease is composed of 306 amino acid residues consisting of 89 α helices (29.08%), 35 βturns (11.44%), 99 random coil (32.35%) and 83 extended strands (27.12%). It was revealed through ExPASy ProtParam that there are 26 negatively charged (Asp + Glu) and 22 positively charged residues (Arg + Lys). The aliphatic index was determined as 82.12. The GRAVY (Grand Average of Hydropathicity) was found to be -0.019. The instability index was scored to be 27.65. These characteristics factually support that the protein is stable.
The estimated half-lives of M protease in mammalian reticulocytes, yeast and Escherichia coli on a comparative analysis were determined as 1.9 hours, 20 hours and 10 hours, respectively. in between Domain-II and Domain-III. Thus it was deduced from the structure alignment and sequence alignment that SARS-CoV is closely related to SARS-CoV2 (Fig. 1).

S-protein-ACE2 interaction in presence of 8-HDS
The best 10 docking models with different free energies were obtained from the ClusPro web-server. The total RMSD value was taken as criteria for grouping [39]. Our study analysed 5 ClusPro docking models which were chosen based on probability of S Protein, S Protein with 8-HDS to interact with the predicted binding sites of ACE2 as well as the lowest binding energy during such interactions. Average binding energy of all 5 binding positions for S Protein-ACE2 interaction is -901.2 kJ/mol. Nevertheless, average binding energy for S Protein-ACE2 in presence of 8-HDS is -771.86 kJ/mol ( Table 1).

Molecular docking analysis
Analysis of the binding interactions of sanguinarine, eugenol and their derivatives with S protein and M pro was done using Autodock Vina 1.1.2 [31].
From the above data, it can be deduced that sanguinarine shows higher binding affinity towards both M pro and S protein ( Table 2). Out of them 8-HDS has the highest binding affinity towards both proteins.

Molecular Simulation analysis
As the best docking results were observed in case of 8-HDSwith both S protein and M protease, it was taken further for molecular simulations study.

Interaction of S protein with 8-hydroxydihydrosanguinarine
The average change in displacement of atoms for all frames with respect to the reference frame in trajectory, i.e. protein ligand Root Mean Square Deviation (RMSD) is stable after 40 ns which was recorded to be 2.5 Å (Fig. S8) in ‗S protein-8-HDS' complex. Such changes suggest that the compound, 8-HDS is capable of causing significant conformational changes in the protein during simulation. The plot illustrates, that the RMSD of ligand is smaller than that of protein, which suggests that compound has not diffused away the binding site.
The structural changes in the protein as determined through RMSF plot reveal that maximum fluctuations occur between 300-500 amino acid residues (Fig. S4). It is known to us that S protein's RBD site spans from 319 amino acid to 591 amino acid residues [40]. Therefore it His1058, Ser730, and Thr778 are seen to interact with ligand over the major time course of 100 ns trajectory. The hydroxyl group of pyridone ring in 8-HDS is found to be in association with the His1058 of the protein for 52% of the 100 ns trajectory. Thr778 and Ser730 interact with the ligand though water bridges for 52% and 33% of the trajectory, respectively (Fig.   S5).

Interaction of M protease with 8-hydroxydihydrosanguinarine
The Root Mean Square Deviation (RMSD) of the M pro -8-HDS complex was found to be 1.25 Å (Fig. 2). Though the protein ligand interaction occurs over 100 ns simulation time yet substantial interaction was observed in the first 10 ns and in between 85-100 ns.
The RMSF exhibited the highest peak near the 50 amino acid residues which is an αhelix region (Fig. S3). As α-helix is known to determine the functional aspects, perturbations in this region make the protein feeble in its action. Generally the fluctuations appear in the N and the C-terminal ends but peaks appearing all over the RMSF plot ensure that the compound is capable of destabilising the protein by causing conformational changes. Water bridges, H-bond and hydrophobic interactions which play a decisive role in ligand binding are persistent in this protein ligand interaction. Phe294 is observed to be involved in maximum interaction time i.e. about 75% (Fig. S7). Asp295 shows dual interaction with hydroxyl group of pyridone of 8-HDS through double water bridges one for 34% and another for 30% of simulation time. Asn151 also interacts with the ligand for 40% of the simulation time (Fig. 3). The information derived above from simulation study is also supported by well-  (Fig. 4, S9, S10).

pKa Calculations of M protease, S Protein and its complex with 8-HDS
Four maximum pK a and pKa shift values of amino acid residues of M pro and S protein in presence or absence of 8-HDS were retrieved by PROPKA2. Amino acid residues Tyr161, Tyr54, His163 and His164 showed pK a shift values 5.5, 5.2, -4.8 and -4.8, respectively in native M pro at pH 7.0. It was observed that pK a shift values of all four Amino acid residues changed to 5.3, 5.7, -5.8 and -4.9, respectively during interaction with 8-HDS (Fig. 5, Table. ST5). pK a and pka shift values remain constant for S-protein as well as S-protein-8-HDS complex.  (Table ST4).

DISCUSSION
In this study we have primarily analysed the molecular interactions of S protein and M protease with 8-HDS through an insilico approach. Inspecting the sequential and structural alignment through various computational tools enhances our concept for better understanding of SARS-CoV2.
Here it was inferred from sequence similarity study that M protease of SARS-CoV2 shows 96% sequential identity with that of SARS-CoV. Thus the sequence similarities confirm that SARS-CoV2 is a descendant from SARS-CoV. Evolutionary studies state that SARS-CoV2 is more closely related to SARS-CoV which had infected several people in the year 2003 with ~10% fatality rate of [41] than MERS-CoV that had caused infection in 2012 with a fatality rate of ~36% [42].
Because of their potential capability to fit into different environments and susceptibility to recombination and mutation, Coronaviruses can deliberately adapt to altered host range and tissue tropism [43]. The major concern now is to contain the spread of virus. Blocking the viral infection in cells through M protease inhibition and targeting the viral spike protein to prevent its attachment and entry into host cells can serve as approaches to combat against the virus. These present phytocompounds, eugenol and sanguinarine, as well as their derivatives are to be employed for the above purpose.
MD simulations exhibit flexibility of intrinsic receptors and also include treatment of water molecules rather than simple portrayal of ligand -protein interactions. Therefore, MD simulations prove to be effective in studying atomic level interactions and establishing the ligand selectivity towards the target. 8-HDS possesses the capability to fluctuate the RBD site of S protein, which interacts with ACE2 receptor of human cell for the viral infection [20].
Structural fluctuation at RBD site may inhibit the binding of S protein and ACE2 receptor, which eventually prevent the viral infection in human cell. It was observed that during protein-protein interaction, binding energy of S Protein-ACE2 decreases in presence of the phytocompound, (i.e. 8-HDS). A significant decline of 129.34 kJ/mol of binding energy was observed during the interaction of S Protein-ACE2 in presence of 8-HDS compared to their direct binding. Therefore, it can be presumed that 8-HDS is capable of hindering the attachment of RBD site of S Protein to the ACE2 receptor protein (Fig. 6, 7). This would indeed pave a way for the utilisation of 8-HDS in repurposing/design of effective therapy to prevent the viral entry.  /mol (Fig. S6).
The hydrophobic amino acids are the main dynamic force in protein folding. In general, proteins become functional after being folded into specific globular structure. During protein folding, hydrophobic amino acids get buried in the core of the protein to get protected from water, leading to the protein fold stable [67,68]. Phe8, Ile152, Phe294, Val297, Val303 hydrophobic Amino acids of M Pro generally interact with 8-HDS during the 100 ns simulation which caused instability in the protein [67] (Fig. S10).
In some recent studies, many natural compounds have been evaluated via molecular docking to analyse the binding affinity of these compounds with CoV proteins with respect to antiviral drugs. This serves as a prospective to evaluate the likelihood of these molecules as drug For a molecule to be an effective drug, it needs to reach the target in optimised concentration and be available in bioactive form till the necessary biological events occur. The SwissADME technology makes the process of drug discovery with less time and resource consuming. Appraisal of the structural or physicochemical properties of development compounds for drug-likeness is enough to consider it as an oral drug-candidate [64]. Druglikeness of a molecule is evaluated with respect to bioavailability by qualitatively examining the probability of the molecule to be developed into an oral drug.
The pink core area of bioavailability Radar representing lipophilicity, size, polarity, solubility, saturation and flexibility define the optimal range of properties for drug-likeness of the input molecule, 8-HDS (Fig. 10) (Table ST4). Due to zero rotatable bonds 8-HDS have no Bioavailability Radar in the flexibility region. The BOILED-Egg model [65] predicts easy penetration of 8-HDS through blood-brain barrier (BBB) and human gastrointestinal absorption (HIA). From the Physicochemical, Lipophilicity, Water Solubility, Pharmacokinetics, Drug-likeness properties of 8-HDS, the study arrives at a conclusion that 8-HDS could be a potential drug. The resulting changes in protonation of M pro are in good agreement that the protein structure undergoes significant structural changes at neutral pH [66]. Our study primarily focuses on 8-HDS which can be a promising drug molecule against SARS-CoV2 due to the presence of a pyridone ring. This phytocompound is known for its antiviral properties and shows drug likeness, structural alteration, binding affinity and molecular interaction. These findings infer that 8-HDS could serve as an effective and potential drug molecule.

CONCLUSION
The COVID-19 pandemic has become a global consternation by infecting millions of people worldwide, claiming several thousand lives and having shaken the socio-economic stability.
It calls for an urge to stop the march of rapidly dwindling scourge. Our research via computational tools establishes that 8-hydroxydihydrosanguinarine can be an efficient inhibitor in blocking viral infection. Thus it can prove to be a potential molecule in therapeutic drug development against COVID-19.         Comparative study of pKa amino acid residues of M pro in presence and absence of 8-HDS.

Figure 6
Docked model depicting interaction of S-protein with ACE2 receptor in the absence of polyphenol.

Figure 10
The Bioavailability Radar depicts glimpse of the drug-likeness of a molecule. The pink area represents the optimal range for each properties of 8-hydroxydihydrosanguinarine.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.