In silico elucidation revealed SARS CoV and MERS CoV Drug Compounds could be Potential Therapeutic Candidates against Post Fusion Core (S2) Protein of Novel Coronavirus (2019-nCov)

Novel coronavirus (2019-nCoV), since its emergence in Wuhan, China, on December 31, 2019, remains uncontrolled and has drawn attention around the globe. According to the World Health Organization, up to March 20, 2020, 209,839 confirmed cases of COVID-19 had been reported globally, along with 8,778 deaths. 2019-nCoV is likely to be a recombinant of different coronaviruses such as SARS-CoV and MERS-CoV. Recent developments revealed that the glycosylated spike (S) protein of 2019-nCoV contributes significantly to facilitating 2019-nCoV infection in the human body. The S1 subunit of the spike protein facilitates binding of 2019-nCoV to host cell receptors, while the S2 subunit (post-fusion core of 2019-nCoV) is a key factor in fusion of 2019-nCoV with the host cell membrane and the subsequent delivery of its RNA genome into the host cell. Membrane fusion and receptor binding are therefore critical in coronavirus infection, and blocking the active sites of the 2019-nCoV spike protein S2 subunit may reduce COVID-19 infections in humans. We used clustering-based drug-drug interaction (DDI) networks and a modularity-based drug repositioning approach to inhibit the membrane fusion and receptor binding capacity of 2019-nCoV. About 150 drug compounds effective against SARS-CoV and MERS-CoV were retrieved and screened on the basis of Lipinski's rule of five. Clusters and strongly interacting DDI networks were generated according to their modularity class, average path length and density. Promising drug candidates were then filtered by toxicity class and molecular docking. Our findings reveal that the drug compounds ZINC000029038525 and ZINC000029129064 have significant binding potential with active sites of the post-fusion core of the 2019-nCoV S2 subunit and may inhibit the membrane fusion and receptor binding capacity of 2019-nCoV. Therefore, these drug compounds, alone or in combination, could be strong and more effective therapeutic candidates against 2019-nCoV infections.


Introduction
Coronaviruses (CoV) are non-segmented, single-stranded, positive-sense RNA viruses of the subfamily Coronavirinae in the family Coronaviridae and the order Nidovirales. Members of the order Nidovirales share several common features: i) a highly conserved genomic organization, with a large replicase gene preceding structural and accessory genes; ii) expression of many nonstructural genes by ribosomal frameshifting; iii) expression of downstream genes by synthesis of 3′ nested sub-genomic mRNAs; and iv) several unique or unusual enzymatic activities encoded within the large replicase-transcriptase polyprotein1. All viruses of the order Nidovirales are enveloped and contain very large genomes for RNA viruses, with Coronavirinae having the largest identified RNA genomes, of approximately 30 kilobases (kb)2. Based on phylogenetic relationships and genomic structures, Coronavirinae are further divided into four groups, the alpha, beta, gamma and delta coronaviruses, which have been detected in a wide range of animal species such as birds, rabbits, reptiles, cats, dogs, pigs, monkeys and bats3. Alpha coronaviruses (229E and NL63) and beta coronaviruses (OC43 and HKU1) usually cause mild upper respiratory diseases in humans4. In the last two decades, two large-scale pandemics, Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS), have been caused by coronaviruses5. It has been reported that SARS-CoV and MERS-CoV were first discovered in bats and might cause future disease outbreaks6. In 2002-2003, the SARS-CoV outbreak in China resulted in over 8,000 confirmed cases and 800 deaths. Similarly, Middle East Respiratory Syndrome coronavirus (MERS-CoV) was first identified in 2012 in the Middle East (Saudi Arabia and Jordan)7. Estimated death rates of SARS-CoV and MERS-CoV were about 10% and 35%, respectively8.
The World Health Organization recorded 2,494 laboratory-confirmed cases of MERS-CoV worldwide, including 858 associated deaths; of these, 1,896 cases, including 732 related deaths, were recorded in Saudi Arabia alone9.
The year 2020, however, started with profound concern over the onset of a novel coronavirus (2019-nCoV) outbreak in Wuhan, China. This emerging pathogen was rapidly characterized as a novel member of the beta coronaviruses10. In early cases, 2019-nCoV infection was linked to contact of infected individuals with the original seafood market, but it is now spreading incessantly through human-to-human transmission and triggering respiratory, enteric, hepatic and neurological diseases of variable severity in humans11.
Typical clinical symptoms shown by patients are fever, dry cough, dyspnea, headache and pneumonia, and disease onset may result in progressive respiratory failure due to alveolar damage, and even death12. On February 16, 2020, WHO confirmed over 51,000 cases globally, along with 1,600 associated deaths. According to recent records of the European Union, from December 31, 2019 to March 20, 2020, about 209,839 cases of COVID-19 (in accordance with the applied case definitions in the affected countries) have been reported, including 8,778 deaths13.
The complete protein structure of 2019-nCoV showed high resemblance to that of SARS-CoV, with a root mean square deviation (RMSD) of 3.8 Å14. A glycosylated spike (S) protein having two structurally distinct conformations, the pre-fusion (S1) and post-fusion (S2), is considered the key facilitator of coronavirus infections in humans. Receptor binding and membrane fusion are the initial and critical steps, and they are facilitated by the S1 and S2 subunits15. This process is triggered by the binding of the pre-fusion (S1) subunit to a host-cell receptor, resulting in destabilization of the S1 trimer and transition of the S2 subunit to a stable post-fusion conformation16. It is expected that the pre-fusion core (S1) facilitates binding of 2019-nCoV to host cell receptors, while the post-fusion core (S2) is a key factor in fusion of 2019-nCoV with the host cell membrane and the subsequent delivery of its RNA genome into the host cell.

Methods
Gene selection
The genes mutated in SARS-CoV and MERS-CoV infections, Angiotensin Converting Enzyme 2 (ACE2) and Tumor Necrosis Factor (TNF), were selected on the basis of their scoring values in GeneCards17.

Data collection and mining
Effective drug compounds used against the SARS-CoV and MERS-CoV mutated genes (ACE2 and TNF), along with their properties, were collected through virtual screening using the ZINC15 database, which is a search engine for investigating compounds against biological targets18. The set of pharmacologically active structures was then refined step-wise using Lipinski's rule of five, which is important for a drug's pharmacokinetics in the human body19.
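The rule-of-five screen can be illustrated with a minimal Python sketch. It assumes the commonly cited thresholds (molecular weight at most 500 Da, Log P at most 5, at most 5 hydrogen bond donors and at most 10 hydrogen bond acceptors) and a strict reading in which all four conditions must hold; the text does not say whether a single violation was tolerated. The `Compound` class, the `passes_lipinski` helper and the three example records are hypothetical and are not data from this study.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record holding the properties used by the filter."""
    zinc_id: str
    mol_weight: float  # molecular weight in Da
    log_p: float       # octanol-water partition coefficient
    hbd: int           # hydrogen bond donors
    hba: int           # hydrogen bond acceptors


def passes_lipinski(c: Compound) -> bool:
    """Strict rule of five: every one of the four conditions must hold."""
    return (
        c.mol_weight <= 500
        and c.log_p <= 5
        and c.hbd <= 5
        and c.hba <= 10
    )


# Invented example records, for illustration only.
compounds = [
    Compound("EXAMPLE-A", 342.4, 2.1, 2, 5),
    Compound("EXAMPLE-B", 612.7, 6.3, 4, 9),   # too heavy and too lipophilic
    Compound("EXAMPLE-C", 480.0, 4.9, 5, 10),  # sits exactly on two limits
]

selected = [c.zinc_id for c in compounds if passes_lipinski(c)]
print(selected)  # ['EXAMPLE-A', 'EXAMPLE-C']
```

Rotatable bonds are reported alongside these properties in the results but are not part of the classical rule, so the sketch leaves them out.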
Clustering
The WEKA tool was used for clustering of the drug compounds, as reported earlier20. The absolute interconnectivity between clusters Ci and Cj was measured by the clustering algorithm, and the relative interconnectivity between a pair of clusters Ci and Cj was measured by the formula explained before21.
In that formula, EC(Ci, Cj) denotes the sum of the weights of the edges.
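The formula itself is not reproduced in this text. As a hedged reconstruction, assuming the cited measure is the relative interconnectivity used in Chameleon-style hierarchical clustering, it would take the form below. The single-cluster terms EC(C_i) and EC(C_j), read as the edge weight cut when a cluster is bisected, are an assumption carried over from that algorithm and are not defined in this text.

```latex
\[
RI(C_i, C_j) \;=\;
\frac{\left| EC(C_i, C_j) \right|}
     {\tfrac{1}{2}\left( \left| EC(C_i) \right| + \left| EC(C_j) \right| \right)}
\]
```

Under this reading, the numerator is the absolute interconnectivity of the pair, and the denominator normalizes it by the average internal interconnectivity of the two clusters.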

Drug-drug interaction (DDI) networks' generation
Gephi, a software tool for data visualization and exploration that analyzes social, biological and network data in real time22, was used to generate the DDI networks. In Gephi, K-means clustering supports the generation of drug-drug interaction networks to find strong associations between the drug compounds within each cluster. A strongly interacting network was then generated from the set of DDI networks on the basis of modularity and degree distribution analysis.

Toxicity prediction
Toxicity of the selected drug compounds was checked using the ProTox web server23, which incorporates molecular similarity, pharmacophores, fragment propensities and machine-learning models for the prediction of various toxicity endpoints24. Only drug compounds within the safest toxicity classes, viz. 4-6, were considered.

Molecular docking
The post-fusion core of the 2019-nCoV S2 subunit protein (PDB ID 6LXT) was retrieved from the RCSB PDB server (https://www.rcsb.org/). Afterwards, PDB files of the prepared protein and drug compounds (ligands) were submitted to the PatchDock and FireDock servers to perform structure prediction of protein-ligand complexes25, and to evaluate the binding affinities and ideal binding mode between the post-fusion core of the 2019-nCoV S2 subunit protein and each drug compound26.

Repositioning
Repositioning is an attractive approach for achieving a favorable cost-benefit ratio in drug development, because conventional strategies for developing novel drugs are costly, time-consuming and risky27. The effective drug compounds used against SARS-CoV and MERS-CoV infections were repositioned for the post-fusion core of the 2019-nCoV S2 subunit protein.

Results And Discussion
Computational techniques are powerful knowledge-based approaches which contribute significantly to the selection of compounds with an increased likelihood of biological activity28. In this study, a computational framework was designed to identify potential drug compounds against COVID-19 infections. We selected drug compounds for the genes ACE2 and TNF, which play a leading role in SARS-CoV and MERS-CoV. ACE2, a monocarboxypeptidase, is the functional receptor29 and has a critical role in mediating SARS-CoV infection30. TNF, a pleiotropic cytokine, contributes to homeostasis and disease pathogenesis31, and its concentration is markedly elevated in MERS-CoV infected individuals32.

Clustering of drug compounds
About 150 drug compounds retrieved from the ZINC database were filtered on the basis of Lipinski's rule of five, and recorded along with their ZINC ID, Log P, molecular weight (MW), hydrogen bond donors (HBD), hydrogen bond acceptors (HBA) and rotatable bonds (RB). Filters were applied to the sorted data, and only the 109 drug compounds satisfying Lipinski's rule of five were selected for further analysis. Compounds having similar properties were grouped together using a clustering strategy: the K-means algorithm was used to group similar chemical compounds, and eight clusters, as shown in Figure 1, were identified by the Elbow method.
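The K-means grouping and the Elbow method can be sketched in plain Python as follows. This is an illustration of the general technique, not the WEKA implementation used in the study. The two-dimensional toy points stand in for scaled property vectors and are invented, and the deterministic farthest-first seeding is a simplification chosen so that the run is reproducible. The elbow is read off as the value of k after which the within-cluster sum of squares (inertia) stops dropping sharply.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, inertia)."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centroids[j]
            )
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, inertia


# Invented toy data: three well-separated groups of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]

# Elbow curve: inertia for k = 1..5.
inertia_by_k = {k: kmeans(points, k)[1] for k in range(1, 6)}
for k, value in inertia_by_k.items():
    print(k, round(value, 2))
```

On this toy data the inertia falls steeply up to k = 3 and only marginally afterwards, which is the bend the Elbow method looks for. With real compounds, the property columns would normally be scaled to comparable ranges first so that molecular weight does not dominate the distance.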

DDI networks
DDI networking is a systematic way to explore the potential of a drug candidate in a physiological environment33. This networking strategy is based on modularity, path lengths and degree distribution, and has a great impact on drug development and clinical care34. DDI and drug-target interaction (DTI) were used in parallel for successful repositioning of the selected drug candidates against the 2019-nCoV spike protein. In addition, Gephi was used to generate and analyze the networks of the clustered drug compounds. The Expansion and Fruchterman-Reingold layouts were used to visualize the interactions of modules in the network. Each node represents a different drug, and a link between two drugs represents the interaction between them according to the similarity of their properties. As each cluster contains drugs with similar properties, the strong interactions among the drugs determine their synergistic effects and provide information about the functional profile of a drug cluster. In total, 8 distinct DDI networks were generated using Gephi, while keeping network diameter and average path length constant at 1 for all networks. Each network is described by its set of vertices V and edges E, average path length L, degree D of nodes, network density, and modularity classes, as shown in Figures 2a and 2b. Interactions between drug compounds are shown by dotted lines, while strong interactions are represented by solid lines.
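Since links here reflect similarity of properties, one way to picture the construction is a distance threshold over property vectors. The sketch below is only an assumption about how such a network could be built. The Euclidean metric, the two cutoffs, the `build_network` helper and the four example drugs are all hypothetical, and the text does not state which similarity measure or threshold was used in Gephi.

```python
from itertools import combinations
from math import dist


def build_network(props, weak_cutoff, strong_cutoff):
    """Link every pair of drugs whose property vectors are close.

    Pairs within strong_cutoff are labelled "strong" (solid lines in the
    figures); pairs within weak_cutoff are labelled "weak" (dotted lines).
    """
    edges = []
    for (a, pa), (b, pb) in combinations(props.items(), 2):
        d = dist(pa, pb)  # Euclidean distance between the two vectors
        if d <= strong_cutoff:
            edges.append((a, b, "strong"))
        elif d <= weak_cutoff:
            edges.append((a, b, "weak"))
    return edges


# Invented, already-scaled property vectors for four hypothetical drugs.
props = {
    "D1": (0.0, 0.0),
    "D2": (0.1, 0.0),
    "D3": (0.5, 0.0),
    "D4": (3.0, 3.0),
}

edges = build_network(props, weak_cutoff=0.6, strong_cutoff=0.2)
print(edges)
# [('D1', 'D2', 'strong'), ('D1', 'D3', 'weak'), ('D2', 'D3', 'weak')]
```

An edge list of this kind can be exported as a CSV file and loaded into Gephi for layout and statistics.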

Modularity and degree distribution of drug compounds
Modularity classes of all DDI networks were calculated with Gephi. Each node was assigned a distinct color on the basis of its modularity class. The network structure is generally characterized by high modularity, both in the bipartite network and in its projections, indicating that the topology is markedly distinct from that of a random network and contains a rich and heterogeneous modular structure35. Modularity and degree distribution are useful indicators for determining the properties and effectiveness of chemical compounds. A higher modularity value indicates many connections within modules, while a lower modularity value indicates fewer connections within modules36. Network modularity helps in supporting complex behaviors within a network37. Degree distribution analysis yields the in-degree, degree and out-degree distributions of the graph36. As shown in Table 1, the 8th DDI network had the highest average degree (3.366), followed by the 7th and 1st networks at 3.203 and 3.067, respectively. The same DDI networks also showed high average weighted degree. Density of all networks ranged from 0.011 to 0.042, with the highest value in the 8th network and the lowest in the 7th. Modularity was highest for the 1st DDI network (0.599), followed by the 3rd, 6th and 5th networks (0.597, 0.578 and 0.573, respectively). The number of nodes varied from 32 (4th network) to 153 (7th network), while the number of edges was highest in the 7th, 1st and 8th networks at 245, 138 and 69, respectively.
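To make the modularity score concrete, here is a minimal Python sketch of the standard Newman modularity Q for an undirected, unweighted graph with a given partition. Gephi finds the partition itself with the Louvain method; the sketch only scores a partition that is supplied by hand. The toy graph of two triangles joined by one bridge is invented for illustration.

```python
def modularity(edges, community):
    """Newman modularity Q = sum over communities of
    (intra-community edges / m) - (community degree sum / 2m)^2."""
    m = len(edges)
    intra = {}       # number of edges with both ends in the community
    degree_sum = {}  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(
        intra.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph: two triangles (a-b-c and d-e-f) joined by the bridge c-d.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]

two_modules = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_module = {node: 0 for node in "abcdef"}

print(round(modularity(edges, two_modules), 3))  # 0.357
print(round(modularity(edges, one_module), 3))   # 0.0
```

Splitting the toy graph along the bridge gives Q = 5/14, whereas lumping all nodes into one module gives Q = 0, which matches the reading that higher values mean denser connections within modules than between them.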

Repositioning of strongly interacting drugs and toxicity analysis
During the early stages of the drug development process, predictive studies should be performed, as they can save time, effort, and animal or even human lives, particularly when evaluating the efficacy and biopharmaceutical properties of drug candidates38. Strong DDI, considered in parallel with molecular docking, reflects the interplay of several behaviors of a drug.
We used Gephi to generate the strongly interacting cluster. Out of the eight DDI networks described above, drugs with strong interactions and modularity class greater than 4 were selected, and the final strongly interacting network was built. Out of thirty-two drugs, 12 showed a greater modularity value. The final strongly interacting network (Figure 3) consisted of 111 nodes and 170 edges, with average path length 1.0, graph density 0.014, network diameter 1, modularity 0.585 and average degree 3.063. In Figure 3, interactions between drug compounds are shown by dotted lines, while solid lines indicate strong interactions.
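The reported network statistics can be cross-checked from the node and edge counts alone. The short Python sketch below is an inference from the numbers rather than a statement of how Gephi was configured: the average degrees match 2E/N, while the densities match E/(N(N-1)), which is the convention for directed graphs. The function names are illustrative.

```python
def average_degree(nodes, edges):
    """Average degree when every edge contributes to two node degrees."""
    return 2 * edges / nodes


def directed_density(nodes, edges):
    """Edges divided by the number of ordered node pairs."""
    return edges / (nodes * (nodes - 1))


# Node and edge counts as reported in the text.
networks = {
    "7th DDI network": (153, 245),
    "final strongly interacting network": (111, 170),
}

for name, (n, e) in networks.items():
    print(name, round(average_degree(n, e), 3), round(directed_density(n, e), 3))
# 7th DDI network 3.203 0.011
# final strongly interacting network 3.063 0.014
```

Both pairs of values agree with the figures quoted for the 7th DDI network and for the final network.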
We also used a drug repositioning approach to identify novel therapeutic uses of the drug compounds39. Chemical structures, toxicity classes and predicted LD50 values of the strongly interacting drug compounds are given in Table 2. Drugs in toxicity classes 5 and 6 were selected for further docking. Toxic doses are given as LD50 values in mg/kg of body weight40. Acute toxicity tests such as the classical LD50 test are designed to determine the median lethal dose of a test substance41. LD50 values are used to rank the relative acute hazard of chemicals, particularly when no other toxicology data are available for them.
Drug selection was based on the toxicity class of the drugs, and those with high toxicity were discarded.
We found that seven drugs were in toxicity class 5, three were in toxicity class 4 and two were in class 6. All three of these toxicity classes are considered nontoxic. Two drug candidates, viz. ZINC000029038525 and ZINC000029129064, were the safest on the basis of their toxicity class of 6.
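The relation between a predicted LD50 and a toxicity class can be sketched in Python as below. It assumes the GHS-based class boundaries that ProTox documents (5, 50, 300, 2000 and 5000 mg/kg); the `toxicity_class` helper is illustrative and is not part of ProTox. The class counts are the ones reported above, and the last lines reproduce the number of compounds carried forward to docking.

```python
from bisect import bisect_left

# Upper LD50 bounds (mg/kg) of classes 1-5; anything above 5000 is class 6.
_UPPER_BOUNDS = [5, 50, 300, 2000, 5000]


def toxicity_class(ld50_mg_per_kg):
    """Map a predicted oral LD50 to a toxicity class from 1 to 6."""
    return bisect_left(_UPPER_BOUNDS, ld50_mg_per_kg) + 1


# Boundary behaviour of the mapping.
print(toxicity_class(300))   # 3
print(toxicity_class(2000))  # 4
print(toxicity_class(5001))  # 6

# Class counts reported in the text for the 12 strongly interacting drugs.
reported_counts = {4: 3, 5: 7, 6: 2}

# Classes 5 and 6 were taken forward to docking.
docked = sum(n for cls, n in reported_counts.items() if cls >= 5)
print(docked, "of", sum(reported_counts.values()))  # 9 of 12
```

The resulting 9 of 12 matches the number of compounds later submitted to docking.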

Molecular docking analysis
For docking analysis, 9 of the 12 drug compounds were selected on the basis of their toxicity class. The structure of the novel coronavirus 2019-nCoV S2 subunit protein, retrieved from the RCSB PDB server under PDB ID 6LXT, has a resolution of 2.9 Å, a total structure weight of 84658.0 and a residue count of 792. In the docking analysis, the screened drug compounds and the S2 protein structure were submitted to PatchDock, which provides a list of potential complexes based on shape complementarity between the drug compound and the protein, along with their binding affinities. The topmost PatchDock results were inspected in Discovery Studio to check their bumps (red colour) and 2D structures, as shown in Figure 4.
Binding affinities of the docked drug-protein complexes were further evaluated, and values were calculated as binding affinity energies (Table 3). Molecular docking analysis revealed that the drug compounds ZINC000029038525 and ZINC000029129064 have the lowest binding energies, which indicates strong interactions with the 2019-nCoV S2 subunit protein. Therefore, ZINC000029038525 and ZINC000029129064 are the most viable compounds for drug repositioning; they can be taken forward for in vitro and in vivo validation and considered as an immediate treatment for 2019-nCoV-affected patients.

Conclusion
Identification of novel uses of drug-target interactions is an important parameter in improving clinical care. We targeted the glycosylated spike (S) protein because of its indispensable function in facilitating the entry of the novel coronavirus 2019-nCoV into host cells. In silico elucidation revealed that 9 screened drugs had good interactions with the spike protein S2 subunit (post-fusion core of 2019-nCoV). More specifically, 2 drug compounds, viz. ZINC000029038525 and ZINC000029129064, showed strong interactions with the S2 protein and could be potential therapeutic candidates against 2019-nCoV because of their lowest toxicity, drug-target interaction and minimum binding energy. We suggest that these drug compounds could block the active sites of the post-fusion core of the 2019-nCoV S2 subunit and may contribute significantly, alone or in combination (in a synergistic fashion) with neutralizing antibodies, to the prevention and treatment of 2019-nCoV infections. Further in vitro and in vivo investigations may confirm the efficacy of these suggested drug compounds.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Tables
Due to technical limitations, Tables 1-3 can be found as a download in the supplemental file section.

Figure 1
Bar graph (A) and plot matrix (B) representation of the drug clusters along with their attributes.