Identification of the most suitable repurposed drugs considering mutational spectra at RdRp (nsp 12), 3CLpro (nsp 5) and PLpro (nsp 3) of SARS-CoV-2 in the Indian population

Background and Objective: COVID-19 originated in Wuhan, China and spread to different geographical locations of the world, with variation in the viral sequence due to mutation and consequent alteration in different protein structures, resulting in different interactions with the host body. The highly infectious and divergent character of the virus makes it imperative to identify promising inhibitory compounds for RdRp, 3CLpro and PLpro as suitable antiviral drug targets so that viral replication can be prevented in the host body. Methods: RdRp, 3CLpro and PLpro sequences from Indian patients were retrieved from the database and multiple sequence alignment (MSA) was employed to identify mutations at nsp 3, nsp 5 and nsp 12. Protein structures were modeled considering all possible combinations of the sequences abundant in India, and docking was performed with repurposed drugs currently under trial for COVID-19, using Autodock Vina, to find the most suitable ones considering the mutational spectra of SARS-CoV-2 in India. Results: PLpro was found to be the most vulnerable to mutation, with four mutations found in > 5% of the studied population, whereas in 3CLpro none has been observed at a frequency > 5% so far. Two mutations were found at RdRp in > 5% of the Indian population. Therefore 3CLpro and RdRp were further analysed as suitable targets for repurposed drugs. Elbasvir emerged as the most suitable drug to inhibit the activity of RdRp in the Indian population, followed by Remdesivir and Methylprednisolone. TMC 310911, Lopinavir and, again, Elbasvir were found to be the best candidates for inhibiting 3CLpro. Interpretation and Conclusions: Remdesivir and Lopinavir (alone or in combination with Ritonavir), the most popular drugs of choice in recent times, may be suitable for use in the Indian population, considering its mutational variations. Among others, Elbasvir, TMC 310911 and Methylprednisolone are good choices for treating COVID-19.

largest mass quarantine popularly known as 'Lock Down'. At present, India is among the highly affected countries globally in terms of the number of affected individuals and deaths. India was reported to have 101139 positive cases and 3163 deaths as of 19th May 2020 [1]. COVID-19 originated in Wuhan (China) and spread to different parts of the world, with variation in the viral sequence due to mutation and consequent alteration in different protein structures, affecting its interaction with the host body [2,3] and its virulence.
The highly contagious and divergent character of the virus makes it imperative to identify promising inhibitory compounds that act on suitable antiviral drug targets [4]. The SARS-CoV-2 genome encodes a large orf1ab polyprotein which includes three non-structural proteins (nsp) of interest here, namely RNA-dependent RNA polymerase (RdRp), papain-like protease (PLpro) and coronavirus main protease (3CLpro) [5,6]. 3CLpro and PLpro are proteinases that cleave the replicase polyproteins at different sites, resulting in the formation of the functional proteins, and are two important drug targets [4,7].
RdRp is an essential enzyme in the life cycle of RNA viruses [8], and thus targeting RdRp is a promising strategy for blocking viral RNA synthesis [6]. The efficacy of a candidate drug depends on a few important criteria, including its binding energy (ΔG), its orientation at the active site of the receptor protein and the interacting residues [4]. Inhibition of 3CLpro, the main protease (nsp 5), is an encouraging strategy against SARS-CoV-2 infection, and drugs such as Lopinavir, Ritonavir and a few others that are used to treat HIV are being explored as repurposed drugs to combat COVID-19 [6,9]. Lopinavir combined with Ritonavir is an approved drug combination for HIV infection and may also inhibit the papain-like protease PLpro (nsp 3) [6,10]. Remdesivir, another repurposed drug, targets RdRp (nsp 12); it is showing improved clinical outcomes in many countries and clinical trials are underway [11,12]. Favipiravir, a purine nucleoside analog, is another broad-spectrum antiviral compound which also inhibits viral RdRp and is a licensed drug in Japan [6,13].
It is important to emphasize that the infectivity of SARS-CoV-2 is high and that its mutational spectra vary with geographical location. Therefore, it is important to explore the divergence of the PLpro, 3CLpro and RdRp proteins in the Indian population at the sequence level, and consequently at the structural level, when selecting the drug target. Designing antiviral drugs targeting the proteases and RdRp of orf1ab is a promising approach; however, it is important to consider the mutational spectra [14]. In this study, we assess the effect of Remdesivir, Favipiravir, Lopinavir, Ritonavir and others on the Indian strain(s) of SARS-CoV-2, and examine a few more antiviral repurposed drugs currently under trial for COVID-19, to identify the potential inhibitory drug candidates best suited for the treatment of COVID-19 in India.

Materials and Methods
ORF1ab protein sequence retrieval from the database
The protein sequences were retrieved from the "NCBI Virus" database using the specific input "SARS-CoV-2". The output was then refined to sequence lengths of 7075 to 7100, as the length of the target orf1ab polyprotein is 7096. The geographic region was set to 'India'. A total of 75 sequences were retrieved (up to 15th May 2020): 14 from Karnataka, 53 from Gujarat, 6 from Telangana and 2 from Kerala.
Multiple Sequence Alignment (MSA) and screening of mutations at nsp 3, nsp 5 and nsp 12
Clustal Omega [15] was employed to align the retrieved sequences, using the ancestral orf1ab polyprotein sequence from Wuhan (YP_009742608) as the reference sequence [16,17]. The penalties for gap opening and gap extension were set to 12 and 2 respectively, to ensure that unnecessary gaps were not created during alignment and that alterations were visualized appropriately. Regions 819-2763 (nsp 3), 3264-3569 (nsp 5) and 4393-5324 (nsp 12) of the alignment file were extracted to explore the mutations in the viral PLpro, 3CLpro and RdRp proteins respectively. All mutation positions and the corresponding residue changes in PLpro, 3CLpro and RdRp were noted.
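The screening step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used: the function names are hypothetical, the toy rows are not real viral sequences, and the sketch assumes that the reference row of the alignment contains no gaps, so that alignment columns coincide with reference positions.

```python
# Regions of the orf1ab alignment (1-based, inclusive), as listed in the text.
REGIONS = {"nsp 3": (819, 2763), "nsp 5": (3264, 3569), "nsp 12": (4393, 5324)}


def region_of(position):
    """Return the region name containing a 1-based position, or None."""
    for name, (start, end) in REGIONS.items():
        if start <= position <= end:
            return name
    return None


def call_substitutions(reference, sample):
    """List (position, ref_residue, sample_residue, region) for each mismatch.

    Both arguments are rows of the same alignment. Assumption: the reference
    row has no gaps, so alignment columns equal reference positions.
    """
    if len(reference) != len(sample):
        raise ValueError("aligned rows must have equal length")
    calls = []
    for index, (ref_aa, alt_aa) in enumerate(zip(reference, sample)):
        if ref_aa == alt_aa or alt_aa in "-X":
            continue  # identical residue, gap, or unknown residue
        region = region_of(index + 1)
        if region is not None:
            calls.append((index + 1, ref_aa, alt_aa, region))
    return calls


# Toy rows (hypothetical, not real viral sequences) carrying one substitution.
toy_reference = "A" * 5324
toy_sample = toy_reference[:4488] + "V" + toy_reference[4489:]
print(call_substitutions(toy_reference, toy_sample))
```

Mismatches outside the three regions are ignored on purpose, since only nsp 3, nsp 5 and nsp 12 are screened in this study.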

Simulation of protein structure by Homology Modelling
Structures of the target proteins for the wild type sequence as well as the mutated sequences were generated by homology modeling using Swiss Model. The quality of the modeled protein structures was assessed by query coverage of the template sequences, Global Model Quality Estimation (GMQE) [18] and QMEAN score [19]. Mutated structures were then superimposed on the wild type using UCSF Chimera [20] and PyMOL [21] for better visualization and comparison. All the .pdb files for these protein structures were downloaded and used in later analysis.
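As an illustration of how the set of sequences to be modeled can be enumerated, the sketch below lists every combination of a given set of substitutions, using as input the two RdRp substitutions reported in the Results (4489 A>V and 4715 P>L). The helper names are hypothetical, and the conversion to nsp 12 numbering assumes the region start (4393) given in the alignment step.

```python
from itertools import combinations

NSP12_START = 4393  # first orf1ab position of nsp 12, as given in the text


def enumerate_variants(mutations):
    """Return every subset of the mutations, from wild type to all combined."""
    variants = []
    for size in range(len(mutations) + 1):
        variants.extend(combinations(mutations, size))
    return variants


def to_local(position, region_start):
    """Convert a 1-based orf1ab position to 1-based numbering within a region."""
    return position - region_start + 1


RDRP_MUTATIONS = ((4489, "A", "V"), (4715, "P", "L"))

for variant in enumerate_variants(RDRP_MUTATIONS):
    if not variant:
        print("wild type")
        continue
    labels = [f"{ref}{to_local(pos, NSP12_START)}{alt}" for pos, ref, alt in variant]
    print(" + ".join(labels))
```

With two substitutions this yields four sequences to model: the wild type, each single mutant, and the double mutant.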
Extraction of selected drug molecules from the library of antiviral drugs
Antiviral compounds were retrieved through ClinicalTrials.gov [22] and the DrugBank database [23]. First, we searched ClinicalTrials.gov for known repurposed drugs currently under trial against COVID-19 in Phase 2 to Phase 4. The selected drugs were then checked against the DrugBank database, and those potential but not yet approved drugs in the COVID-19 category were listed for the present analysis. Among the 23 drugs retrieved, some (22%) were known to target the main protease (3CLpro) of the virus and some (13%) to target RdRp. A few monoclonal antibodies were also retrieved but were not included in this study.

Screening and selection of drugs against RdRp and proteases through Docking
Protein-ligand (drug) interaction was studied by molecular docking using Autodock Vina [24], and the interaction analysis was done using PyMOL. The protein structure (.pdb) file was edited using Autodock Tools. The water molecules were removed, polar hydrogens were added to the structure, and both the protein and ligand structure files (.pdb) were converted into .pdbqt format. The docking region and its co-ordinates were specified using the Grid Box tool. Blind docking was executed by setting the size of the Grid Box to 110x110x110. The binding region was then focused by adjusting the size to 42x52x42 for better visualisation and analysis of the particular region.
The co-ordinates were set at x = 115.793, y = 123.619, z = 119.332, the spacing was fixed at the default of 0.375 Å and the exhaustiveness was set to its default value of 8. For each run, only the top-ranked result was considered, as it had the best RMSD score. The docking outcomes were further analyzed in PyMOL, and the interacting residues, their bond lengths and the binding pockets were examined using various features of PyMOL. The drugs were selected depending on the strength of the receptor-drug association, i.e. the binding affinity.
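The search-space settings above can be collected into a Vina configuration file. The sketch below writes such a file from Python. It rests on an assumption that should be checked against the original setup: the Grid Box dimensions (110x110x110 and 42x52x42) are treated as numbers of grid points, as in Autodock Tools, and converted to Ångström with the stated 0.375 Å spacing, because Vina expects box sizes in Ångström. The file names are hypothetical placeholders.

```python
def grid_points_to_angstrom(points, spacing=0.375):
    """Convert a Grid Box dimension in grid points to a length in Angstrom."""
    return points * spacing


def vina_config(receptor, ligand, center, size_points, spacing=0.375,
                exhaustiveness=8):
    """Return the text of an AutoDock Vina configuration file.

    Assumption: size_points are grid points (Autodock Tools convention),
    converted here to the Angstrom units that Vina expects.
    """
    size = [grid_points_to_angstrom(p, spacing) for p in size_points]
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {center[0]}",
        f"center_y = {center[1]}",
        f"center_z = {center[2]}",
        f"size_x = {size[0]}",
        f"size_y = {size[1]}",
        f"size_z = {size[2]}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names; centre and box values are those quoted in the text.
blind_docking = vina_config(
    receptor="rdrp_wild_type.pdbqt",
    ligand="ligand.pdbqt",
    center=(115.793, 123.619, 119.332),
    size_points=(110, 110, 110),
)
print(blind_docking)
```

The focused run would use the same call with `size_points=(42, 52, 42)`. If the box dimensions were in fact entered directly in Ångström, the conversion step should be dropped.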

Results
Mutational spectra analysis reveals significance of RdRp and 3CLpro for drug targeting: All 75 sequences retrieved from December 2019 to 14th May 2020 were analysed and mutations at nsp 3, nsp 5 and nsp 12 were identified. Among those three locations in orf1ab, nsp 3 (PLpro) was found to be the most vulnerable, harboring a total of 14 mutations. Some of these 14 mutations occur at very low frequency (< 5%); we therefore considered only mutations present in at least 4 of the 75 studied sequences (i.e. > 5%). Four mutations at nsp 3 (1159 I>M, 1534 S>I, 2016 T>K and 2376 P>L) were found to be prominent in the Indian strain(s), with frequencies of 5.3 to 13.3%. At nsp 12 (RdRp), a total of 11 mutations have been observed so far, but except for 4489 A>V (18.6%) and 4715 P>L (74.6%), the rest are present at very low abundance (< 5%) (Fig. 1). Nsp 5, the main protease 3CLpro, was found to be the least vulnerable to mutation in the Indian strain(s) of SARS-CoV-2, with no mutation observed at > 5% abundance. Thus the wild type 3CLpro is predominant in the Indian population. Only mutations with > 5% abundance in the population were considered for later analysis, and the detailed list of mutations at nsp 3, nsp 5 and nsp 12 is shown in Table I.
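The abundance cut-off used above amounts to simple arithmetic, sketched below with hypothetical function names. With 75 sequences, a mutation must occur in at least 4 of them to exceed 5%, and 4 of 75 corresponds to the 5.3% lower bound quoted for nsp 3.

```python
import math


def frequency_percent(count, total):
    """Percentage of sequences carrying a mutation."""
    return 100.0 * count / total


def is_prominent(count, total, threshold_percent=5.0):
    """True if the mutation frequency is strictly above the threshold."""
    return frequency_percent(count, total) > threshold_percent


def minimum_count(total, threshold_percent=5.0):
    """Smallest number of sequences whose frequency exceeds the threshold."""
    return math.floor(total * threshold_percent / 100.0) + 1


TOTAL_SEQUENCES = 75  # number of Indian sequences analysed in the text
print(minimum_count(TOTAL_SEQUENCES))
print(round(frequency_percent(4, TOTAL_SEQUENCES), 1))
```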
Homology modeling predicts mutation induced altered protein structure: The findings suggest that 3CLpro, the main protease, is the most suitable drug target owing to its highly conserved sequence and corresponding structure in the Indian population, whereas PLpro is not suitable for drug targeting. Therefore nsp 5 (3CLpro) and nsp 12 (RdRp) were further analysed in this study and modeled structures were generated with Swiss Model. The GMQE and QMEAN scores were 0.99 (out of 1) and 0.45 (values around zero indicate the best quality and values ≤ -4.0 indicate low quality) for the wild type structure of nsp 5 (3CLpro), whereas they were 0.96 and -1.49 respectively for wild type nsp 12 (RdRp).
Identification of suitable drug molecules targeting RdRp using molecular docking: Among the 13 repurposed drugs tested against RdRp following our methodology, Elbasvir emerged as the most suitable drug to inhibit the activity of RdRp in the Indian population. Elbasvir is an FDA-approved antiviral used against hepatitis C virus (HCV) infection, a chronic infectious liver disease. The estimated binding free energy (ΔG) of Elbasvir against the wild type RdRp structure was -8.84 kcal/mol, and the estimated inhibition constant (Ki) was 363 nM. The ΔG and Ki values for all the mutant structures were even better for Elbasvir, indicating a slightly greater binding affinity than for the wild type and suggesting that a smaller dose may be required to inhibit the enzyme.
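The relationship between ΔG and Ki can be made concrete with a short calculation. The text does not state how Ki was obtained, so the sketch below assumes the standard thermodynamic relation ΔG = RT ln(Ki) at an assumed temperature of 298.15 K; the function name is hypothetical. Under these assumptions, a ΔG of -8.84 kcal/mol gives a Ki of roughly 330 nM, the same order as the reported 363 nM but not identical, so the reported values may have been derived differently (for example, as means over several runs).

```python
import math

GAS_CONSTANT = 1.98720425864083e-3  # kcal / (mol K)


def inhibition_constant_nm(delta_g, temperature=298.15):
    """Estimate Ki in nM from a binding free energy in kcal/mol.

    Assumption: Ki = exp(delta_g / (R * T)); the temperature of 298.15 K
    is not taken from the text.
    """
    ki_molar = math.exp(delta_g / (GAS_CONSTANT * temperature))
    return ki_molar * 1e9


for label, delta_g in (("Elbasvir vs wild type RdRp", -8.84),):
    print(f"{label}: Ki ~ {inhibition_constant_nm(delta_g):.0f} nM")
```

Because Ki depends exponentially on ΔG, a change of about 1.4 kcal/mol corresponds to a roughly tenfold change in Ki at this temperature, which is why modest differences in docking score translate into large differences in the estimated inhibition constant.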
Remdesivir, the most popular drug of choice in recent times, was found to be the second-best option to inhibit RdRp in India. The ΔG and Ki values of Remdesivir against wild type RdRp were -8.52 kcal/mol and 620 nM. The ΔG values for the mutant structures harboring the 4489 A>V and 4715 P>L mutations alone or in combination were -8.02, -7.78 and -7.66 kcal/mol respectively. Besides the ΔG and Ki values, the number of polar bonds formed between the interacting residues may be an important feature in a receptor-drug association study. For the wild type and the two mutant types harboring either the 4489 A>V or the 4715 P>L mutation, 6 polar bonds were formed, but when both mutations occurred simultaneously, 5 polar bonds were formed, indicating no major alteration in the interaction. The bond lengths ranged from approximately 1.9 Å to 5.5 Å and the detailed interactions are shown in Table II. The results revealed that although the binding affinity of Remdesivir decreased, the change was not significant for the mutant structures, indicating the suitability of this drug in the Indian population even under the mutational background.
The third suitable option to inhibit RdRp is Methylprednisolone, a corticosteroid used to prevent inflammation. The ΔG value for the wild type structure was -8.18 kcal/mol, and for the mutant structures it ranged from -7.84 to -7.96 kcal/mol. The Ki values ranged between 1000 and 2000 nM.
Cinnamaldehyde (ΔG = -4.9 kcal/mol) and Thymoquinone (ΔG = -4.94 kcal/mol) were used as negative controls for all the drugs tested here, and a statistical test further confirmed significantly higher (p < 0.0001) binding affinity towards RdRp for all the chosen drugs compared to the negative controls. Their interaction with RdRp is shown in Fig. 3.
Molecular docking identifies promising repurposed drugs targeting 3CLpro: 3CLpro (nsp 5) is the most appropriate drug target for the viral strain(s) of India as it is the least vulnerable to mutation. Out of 15 repurposed drug molecules tested against it, Elbasvir, TMC 310911 and Lopinavir were found to be the best candidates to inhibit 3CLpro, as shown in Fig. 4. According to the observed ΔG and Ki values, Elbasvir (ΔG = -10.44 kcal/mol and Ki = 25 nM) is the best one, but only three polar bonds were formed during its interaction with 3CLpro, one of which has a bond distance ≥ 4 Å. The ΔG and Ki values of TMC 310911 were -9.98 kcal/mol and 54 nM, and 5 polar bonds were formed, all with bond lengths ≤ 3 Å. For the well-known protease blocker Lopinavir, the ΔG and Ki values were -9.26 kcal/mol and 179 nM, and 4 polar bonds were observed between the drug and the enzyme, with bond distances ranging from 2.1 Å to 2.8 Å. Here also, Cinnamaldehyde and Thymoquinone were used as negative controls for 3CLpro and showed ΔG values of -4.96 and -5.32 kcal/mol respectively, and a statistical test confirmed significantly higher (p < 0.0001) binding affinity of the above-mentioned three drugs towards 3CLpro compared to the negative controls. The binding affinity and interactions of all the selected drugs against 3CLpro are shown in Table III.

Discussion
A range of repurposed drugs, from HIV protease inhibitors and common antivirals to anti-inflammatory drugs, anti-malarials and RNA polymerase blockers, are currently being tested in clinical trials [22,25]. It is therefore important to identify an effective course of action against COVID-19 urgently. As mutational spectra vary with geography, it is important to examine the efficacy of each drug considering the specific geographical location and the genetic variation in the corresponding population.
Remdesivir and Favipiravir are two widely discussed drugs, in India as well as all over the world, that are known to prevent viral replication in the host body by inhibiting RdRp, the enzyme that drives viral replication, but India's Union health ministry is still not satisfied with the efficacy of these two drugs for the treatment of COVID-19 [26,27]. Based on the SARS-CoV-2 RdRp sequencing data available so far, the infected population in India can be classified into four groups: (i) one carrying the wild type sequence, (ii) one harboring a single mutation at 4489 (A>V), (iii) one harboring a single mutation at 4715 (P>L), and (iv) one carrying both of these frequently observed RdRp mutations. Our analysis found that Favipiravir is not a suitable drug for COVID-19 treatment in India, as it has a poor binding affinity towards all four groups (mean ΔG ranged between -5.04 and -5.38 kcal/mol and Ki ranged between 212843 and 120323 nM). We also found that Remdesivir can be a suitable drug in the Indian population, as the mean ΔG and Ki values for group (i) were -8.52 kcal/mol and 620 nM respectively. The binding affinity of Remdesivir for group (iv) was weaker (ΔG = -7.66 kcal/mol) but was still significantly higher than that of the negative controls (p < 0.0001). According to our analysis, the best candidate for inhibiting RdRp in the Indian population is Elbasvir, a known anti-HCV drug. The mean ΔG values of Elbasvir towards the RdRp of groups (i) to (iv) were -8.84, -8.92, -8.98 and -9.02 kcal/mol respectively. The Ki values ranged between 268 and 363 nM, indicating that a smaller dose may be sufficient to inhibit the RNA polymerase. The number of polar bonds formed was 5 for the wild type (group i) and 4 for all mutant types, all with bond lengths between 2 and 4 Å.
Until now, Hydroxychloroquine (HCQ), alone or in combination with Azithromycin, has been recommended as prophylaxis for COVID-19 by some sources in India [27,28], and HCQ has recently been a widely discussed topic worldwide. Our results indicate that HCQ alone is not suitable for targeting RdRp or 3CLpro: the binding free energy for RdRp ranged between -5.48 and -5.68 kcal/mol, and for 3CLpro the mean binding free energy was -6.14 kcal/mol, with only two interacting residues forming two polar bonds. There is an indication that HCQ in combination with Azithromycin may act slightly better against 3CLpro, as Azithromycin has a stronger affinity towards 3CLpro (ΔG = -8.64 kcal/mol and Ki = 507 nM).
Lopinavir and Ritonavir are two repurposed drugs that have been prescribed for the treatment of HIV/AIDS and are known to inhibit the protease of HIV [9,25], and the Lopinavir/Ritonavir combination is being considered as a promising inhibitor of SARS-CoV-2 3CLpro. In this study, we found that both Lopinavir and Ritonavir have favorable ΔG and Ki values for interaction with 3CLpro and can be used as a promising combination. TMC 310911, a protease inhibitor structurally similar to another known repurposed drug, Darunavir, has been under trial since 7th February 2020 and may be more promising than Lopinavir in inhibiting the 3CLpro of SARS-CoV-2.
Thus, it can be concluded that Remdesivir and Lopinavir (alone or in combination with Ritonavir), the drugs most widely discussed all over the world in recent times for combating COVID-19, may also be suitable for use in the Indian population. Among others, Elbasvir, TMC 310911 and Methylprednisolone are good choices for treating COVID-19 based on in silico analysis.

Tables
Table I. List of mutations