Computer-Aided Identification of Bioactive Compounds of Azadirachta indica (Neem) with Potential Activity against SARS-CoV-2 Main Protease

Coronavirus disease 2019 (COVID-19) is a zoonotic disease caused by a novel virulent virus known as Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2). Cases, morbidity and fatality associated with the disease continue to increase throughout the world. Azadirachta indica (Neem) is a medicinal plant popularly known for its antimalarial and broad-spectrum antiviral activities. The bioactive compounds of Neem were therefore analyzed in this study for possible inhibitory activity against the SARS-CoV-2 3-Chymotrypsin (3C)-like protease (main protease), an important therapeutic target of the virus. This was done through a computational approach involving molecular docking, pharmacophore modelling and ADMET studies. Out of 150 Neem compounds subjected to molecular docking against the main protease, rutin had the highest binding affinity, followed by tannin amine, quercitrin, hyperoside and kaempferol, all of which ranked above the standard inhibitor K36. The compounds interacted with Glu-166, Asn-142, His-41, Cys-145 and other crucial amino acid residues of the catalytic cleft of the protease. Most of the selected compounds displayed acceptable drug-likeness, pharmacokinetic and toxicity parameters. These compounds could therefore be developed further for the treatment and management of COVID-19 after experimental studies.


Introduction
Coronavirus disease 2019 (COVID-19) was declared by the World Health Organization a Public Health Emergency of International Concern on 30 January 2020 and a pandemic on 11 March 2020 [1,2]. As of 30 May 2021, over 169 million cases had been cumulatively confirmed, with more than 3.5 million confirmed deaths from the disease, which was first discovered in Wuhan, China [4,5]. Person-to-person transmission through close contact with asymptomatic and symptomatic patients, and contact of uninfected persons with contaminated objects or surfaces, have been reported as the causes of the rapid global spread of the pandemic and as a major bottleneck in combating the virus [2,6,7].
Consequently, the outbreak of SARS-CoV-2 has led to a global medical emergency that has heavily affected international travel, trade and commerce, resulting in a dire economic crash in several countries, including the developed nations [7,8]. In the quest for probable pharmacological agents against the virus, drugs like Remdesivir, Chloroquine (CQ) and Hydroxychloroquine (HCQ) were repurposed for the prevention and treatment of the disease. Despite the preliminary positive results from in vitro studies and the extensive use of CQ/HCQ during the first wave of COVID-19, the severe side effects related to their prolonged administration [9] have discouraged their use. This and other factors, such as the emergence of new variants of the virus and the high cost of anti-COVID-19 vaccines [10], have contributed to the global increase in new cases. Therefore, there is an urgent need for an affordable anti-COVID-19 drug to cure the disease and halt its global spread.
Coronaviruses are made up of critical structural and non-structural proteins that can be exploited as therapeutic targets against the virus; the main protease, or 3-Chymotrypsin (3C)-like protease, is one of them [5,11,12]. The 3C-like protease is an upstream enzyme that is involved in SARS-CoV-2 replication and transcription [11]. With the availability of several computational modelling techniques, screening of chemical compounds for potential inhibitory activity against these protein targets represents a viable option.
Azadirachta indica (commonly called Neem) belongs to the Meliaceae family of plants, is cultivated in Asia and Africa, and is called dongoyaro, ogwu akom and maina by the Yorubas, Igbos and Hausas of Nigeria, respectively. Neem is inexpensive, easily available and quite abundant in Nigeria. It is regarded as one of the most important medicinal plants and has been declared the "Tree of the 21st century" by the United Nations [13]; in India it is known as the "Life-giving tree", "Panacea for all diseases", "Nature's Drugstore", "Divine Tree" and "Village Pharmacy" [14,15]. Neem is therefore a "multipurpose" medicinal plant by virtue of its use against a wide repertoire of infectious and chronic diseases [16]. The medicinal potential of the plant resides in the phytoconstituents found in all of its parts (roots, bark, leaves and seeds) [16][17][18]. Neem bioactive compounds have been experimentally shown to be anti-inflammatory, anticarcinogenic, antidiabetic, antiplasmodial, antifungal, antimicrobial [19][20][21][22][23] and antiviral [24,25]. Recently, there was a debatable clinical claim that Neem extract can cure patients with COVID-19, although without globally accepted experimental evidence. However, in silico experimentation may shed some light on this assumption [26,27]. This study aimed to identify potential anti-COVID-19 agents from the repository of bioactive compounds of the Neem plant through molecular docking, pharmacophore modelling and ADMET studies.

Protein preparation
The crystal structure of the SARS-CoV-2 3C-like protease (PDB ID: 7C6U) was retrieved from the Protein Data Bank (PDB). The protein was prepared using the Protein Preparation Wizard panel of Glide (Schrödinger Suite 2020-3): bond orders were assigned, hydrogen atoms were added, disulfide bonds were created, and missing side chains and loops were filled in using Prime. Water molecules beyond 3.0 Å of the het groups were removed, and the structure was minimized using the OPLS_2005 force field and optimized using PROPKA [28,29]. Subsequently, the receptor grid file was generated to define the binding pocket for the ligands.

Ligand preparation
About 150 compounds of Azadirachta indica were obtained from Dr. Duke's Phytochemical and Ethnobotanical Database. Their structures, together with that of the standard inhibitor (K36), were downloaded from the PubChem database and prepared for molecular docking using the LigPrep module (Schrödinger Suite 2020-3). Low-energy 3D structures with correct chiralities were generated. The possible ionization states of each ligand structure were generated at a physiological pH of 7.2 ± 0.2. Stereoisomers of each ligand were computed by retaining specified chiralities while varying the others.
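To illustrate why a pH window such as 7.2 ± 0.2 matters when ionization states are enumerated, the following minimal sketch applies the Henderson-Hasselbalch relation to a single acidic group. It is not the algorithm used by the ligand preparation software; the function name and the pKa values are hypothetical and chosen only for illustration.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of an acidic group present as A- at a given pH.

    Follows the Henderson-Hasselbalch relation: pH = pKa + log10([A-]/[HA]).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Hypothetical pKa values for illustration only (not taken from the study).
example_groups = {
    "acidic phenol (assumed pKa 7.0)": 7.0,
    "carboxylic acid (assumed pKa 4.5)": 4.5,
}

for label, pka in example_groups.items():
    # Evaluate at the lower bound, centre and upper bound of the pH window.
    low, mid, high = (fraction_deprotonated(pka, ph) for ph in (7.0, 7.2, 7.4))
    print(f"{label}: {low:.2f} / {mid:.2f} / {high:.2f} deprotonated "
          f"at pH 7.0 / 7.2 / 7.4")
```

A group whose pKa lies inside or near the window is present in both protonation states in appreciable amounts, which is why more than one ionization state per ligand may need to be docked.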

Receptor grid generation
Receptor grid generation defines the position and size of the protein's active site for ligand docking. The scoring grid was defined based on the co-crystallized ligand using the receptor grid generation tool of Schrödinger Maestro 12.5. The van der Waals (vdW) radius scaling factor of nonpolar receptor atoms was set to 1.0, with a partial charge cut-off of 0.25.

Protein-ligand docking
The Glide tool of Schrödinger Maestro 12.5 was used to perform the molecular docking studies with the generated receptor grid file. The prepared ligands were first docked using standard precision (SP) with ligand sampling set to flexible, and then re-docked using extra precision (XP) with ligand sampling set to none (refine only). For ligand atoms, the vdW radius scaling factor was set to 0.80 with a partial charge cut-off of 0.15.

Receptor-ligand complex pharmacophore modelling
The receptor-ligand pharmacophore models of the three top-scoring compounds were developed using PHASE. The auto (E-pharmacophore) method was used, and the hypothesis was set with a maximum of 7 features to be generated, a minimum feature-feature distance of 2.00 Å, a minimum feature-feature distance of 4.00 Å for features of the same type, and donors represented as vectors.

Pharmacology parameters
The absorption, distribution, metabolism, excretion and toxicity (ADMET) properties of the test compounds were determined using in silico integrative model predictions at the SwissADME and ProTox-II servers.

Molecular docking of neem compounds against 3C-like protease of SARS-CoV-2
The compounds demonstrated various levels of binding affinity for the protein target, as shown in Table A1 (Appendix). The ten top-scoring Neem compounds showed binding affinities ranging from -9.140 to -5.480 kcal/mol for the SARS-CoV-2 3C-like protease, the values being an estimate of the change in Gibbs free energy (ΔG), as shown in Table 1. The first six compounds (rutin, tannin amine, quercitrin, hyperoside, kaempferol and myricetin) displayed binding affinities higher than that of the standard ligand (-6.577 kcal/mol). Among the ten top-scoring compounds, rutin exhibited the highest binding affinity (-9.140 kcal/mol) and nimbinone the lowest (-5.480 kcal/mol).
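As a rough aid to interpreting these scores, the sketch below converts a ΔG value into the dissociation constant it would imply through ΔG = RT ln(Kd). This is only an illustration: docking scores are empirical estimates rather than measured free energies, and the temperature of 298.15 K and the function name are assumptions, not part of the study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def estimated_kd(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (mol/L) implied by delta_g = R*T*ln(Kd).

    The default temperature of 298.15 K is an assumption for illustration.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# Scores reported in the text (kcal/mol).
scores = {"rutin": -9.140, "K36 (standard)": -6.577, "nimbinone": -5.480}

# Print from most negative (strongest predicted binding) to least negative.
for name, delta_g in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {delta_g:.3f} kcal/mol -> implied Kd ~ {estimated_kd(delta_g):.2e} M")
```

Because the relation is exponential, a difference of about 1.4 kcal/mol at this temperature corresponds to roughly a tenfold change in the implied Kd, so the gap between rutin and the standard ligand would be substantial if the scores were taken at face value.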

Molecular modelling of biological interactions
Analysis of the molecular interactions of K36, rutin, tannin amine, quercitrin, hyperoside and kaempferol with the SARS-CoV-2 3C-like protease showed that the compounds occupied the binding pocket of the enzyme, which comprises amino acid residues 25 to 192, as shown in Figure 2a-f. In addition to forming one or more hydrogen bonds with Glu-166, to which the standard inhibitor (K36) binds, rutin also formed hydrogen bonds with His-164, Thr-190, Gly-143 and Thr-26; tannin amine with Leu-141, Gln-189 and Gly-143; quercitrin with His-164, Leu-141 and Thr-190; and hyperoside with His-164 and Thr-190. Kaempferol, however, formed a hydrogen bond only with Thr-26. Also worthy of note is the pi-pi stacked interaction of hyperoside with His-41. Moreover, all the compounds and the standard ligand bind to several other amino acid residues, including His-41, Asn-142 and Cys-145, through different types of molecular interactions.
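The hydrogen-bond contacts listed above can be tabulated to see which residues recur across ligands. The following sketch simply encodes the contacts stated in the text in a dictionary (the variable names are illustrative) and counts how many ligands hydrogen-bond to each residue; it does not re-derive the contacts from the docked structures.

```python
from collections import Counter

# Hydrogen-bond partners as described in the text (K36 is the standard inhibitor).
hbond_partners = {
    "K36": ["Glu-166"],
    "rutin": ["Glu-166", "His-164", "Thr-190", "Gly-143", "Thr-26"],
    "tannin amine": ["Glu-166", "Leu-141", "Gln-189", "Gly-143"],
    "quercitrin": ["Glu-166", "His-164", "Leu-141", "Thr-190"],
    "hyperoside": ["Glu-166", "His-164", "Thr-190"],
    "kaempferol": ["Thr-26"],
}

# Number of ligands that hydrogen-bond to each residue.
residue_counts = Counter(
    residue for residues in hbond_partners.values() for residue in residues
)

# Ligands that share at least one hydrogen-bond residue with the standard.
standard = set(hbond_partners["K36"])
shares_with_standard = sorted(
    name for name, residues in hbond_partners.items()
    if name != "K36" and standard & set(residues)
)

for residue, count in residue_counts.most_common():
    print(f"{residue}: {count} ligand(s)")
print("Share a hydrogen-bond residue with K36:", shares_with_standard)
```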

Absorption, distribution, metabolism, excretion and toxicity (ADMET) properties
The SwissADME predictions of lipophilicity, solubility, drug-likeness and oral bioavailability of the ten top-scoring Neem compounds are shown in Table 2, their pharmacokinetic properties in Table 3 and the ProTox-II predicted toxicity profile in Table 4. For the water solubility (Log S) values, nimbaflavone is predicted to be poorly soluble, azadirachtannin and nimbinone moderately soluble, and the remaining compounds soluble. Lipophilicity (Log P) values ranged from -0.25 for hyperoside to 5.11 for nimbaflavone. The drug-likeness prediction showed that rutin, tannin amine, quercitrin, hyperoside and azadirachtannin violate two or more Lipinski rules, myricetin violates only one, while kaempferol and the remaining compounds fully obey the rules. Kaempferol, scopoletin, nimbaflavone and nimbinone fully obey Veber's rules, while the remaining compounds violate one of the rules. Furthermore, rutin, tannin amine, quercitrin, hyperoside and azadirachtannin have bioavailability scores of 0.17, while the remaining compounds scored 0.55.
The pharmacokinetic prediction (Table 3) showed that gastrointestinal (GI) absorption is high for kaempferol, scopoletin, nimbaflavone and nimbinone, and that blood-brain barrier (BBB) permeability is high for scopoletin and nimbinone, while the remaining compounds have low GI absorption and low BBB permeability. Rutin, tannin amine, azadirachtannin and nimbinone are substrates of the permeability glycoprotein (P-gp). Kaempferol is predicted to inhibit CYP2D6, CYP1A2 and CYP3A4; nimbaflavone is an inhibitor of CYP2C9 and CYP3A4; nimbinone is an inhibitor of CYP3A4 and CYP2C19; myricetin is an inhibitor of CYP1A2 and CYP3A4; and scopoletin could inhibit CYP1A2. The ProTox-II-predicted toxicity profile of the compounds (Table 4) indicated that tannin amine, kaempferol and nimbinone are not likely to be hepatotoxic, carcinogenic, immunotoxic, mutagenic or cytotoxic. Rutin, hyperoside, nimbaflavone and azadirachtannin are likely to be immunotoxic. Quercitrin and scopoletin could be immunotoxic and carcinogenic, and myricetin could be carcinogenic and mutagenic. Except for myricetin and azadirachtannin, which have low LD50 values (159 and 274 mg/kg respectively), the LD50 values of the compounds ranged from 2000 to 5000 mg/kg and, apart from nimbaflavone, these compounds all belong to acute oral toxicity class 5.
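The acute oral toxicity classes referred to above follow LD50 bands. The sketch below assumes the six-class banding based on the Globally Harmonized System as used by ProTox-II (upper bounds of 5, 50, 300, 2000 and 5000 mg/kg for classes 1 to 5, with class 6 above 5000 mg/kg); the function name is illustrative, and the class assignments in Table 4 remain the authoritative source.

```python
def toxicity_class(ld50_mg_per_kg: float) -> int:
    """Map a predicted oral LD50 (mg/kg) to an acute toxicity class (1 to 6).

    Assumed banding: class 1 <= 5, class 2 <= 50, class 3 <= 300,
    class 4 <= 2000, class 5 <= 5000, class 6 > 5000 mg/kg.
    """
    upper_bounds = [(5, 1), (50, 2), (300, 3), (2000, 4), (5000, 5)]
    for upper, tox_class in upper_bounds:
        if ld50_mg_per_kg <= upper:
            return tox_class
    return 6


# LD50 values quoted in the text for the two most toxic compounds,
# plus the two ends of the range reported for the remaining compounds.
for label, ld50 in [("myricetin", 159), ("azadirachtannin", 274),
                    ("lower end of range", 2000), ("upper end of range", 5000)]:
    print(f"{label}: LD50 {ld50} mg/kg -> class {toxicity_class(ld50)}")
```

Under this assumed banding a value of exactly 2000 mg/kg falls in class 4 rather than class 5, which shows how a compound at the lower end of the 2000 to 5000 mg/kg range can sit outside class 5.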

Receptor-ligand pharmacophore modelling
The pharmacophore models of rutin, tannin amine, quercitrin, hyperoside and kaempferol on the SARS-CoV-2 3C-like protease are shown in Figure 3. Four hydrogen bond donors, one hydrogen bond acceptor and two aromatic rings are the structural features involved in the molecular interaction of rutin with the enzyme. Quercitrin uses two hydrogen bond donors and two aromatic rings; tannin amine requires five hydrogen bond donors, one hydrogen bond acceptor and one aromatic ring; hyperoside requires three hydrogen bond donors and two aromatic rings; and kaempferol uses one hydrogen bond donor and two aromatic rings.

Discussion
The computational approach to drug discovery helps in predicting the activity and fate of a potential drug candidate, thereby cutting down the cost and time of drug development, avoiding unwarranted drug toxicity and reducing the ethical concerns associated with the use of experimental animals [30]. With the computer-aided ligand-protein interaction technique (molecular docking), the efficacy of the diverse bioactive compounds present as a composite mixture in a given medicinal plant can be individually evaluated [31,32]; their respective drug-likeness, pharmacokinetic and pharmacodynamic behaviours can be predicted through the SwissADME server [33,34] and their toxicity profiles through the ProTox-II server [35,36]. In this study, 10 compounds were selected from 150 bioactive compounds of Neem after molecular docking evaluation of their inhibitory potential against the SARS-CoV-2 3C-like protease. This drug target is a SARS-CoV-2 enzyme belonging to the family of cysteine proteases, and it is one of the non-structural proteins (nsps) required for the cleavage of the viral RNA-translated polyproteins (pp1a and pp1ab) into other nsps such as helicase (nsp13), RNA-dependent RNA polymerase (RdRp or nsp12), exonuclease (nsp14) and endonuclease (nsp15), all of which are components of the viral replication-transcription complex (RTC) required for new virion synthesis [11,12,37,38]. Among the 10 selected test compounds, rutin had the highest binding affinity, followed by tannin amine, quercitrin, hyperoside and kaempferol, all of which ranked above the standard inhibitor K36. Rutin, a glycoside formed by the fusion of the flavonol quercetin with a disaccharide [39], has been shown to be anticarcinogenic, antidiabetic, antimicrobial, and cardio-, neuro-, haemato-, hepato- and nephroprotective. Tannin amine, quercitrin, hyperoside and kaempferol have also been reported to display a wide range of biological activities, including antiviral activity, in different experimental studies [40][41][42][43].
Tannin amine, a tannic acid derivative, is a polyphenol used for the treatment of a vast number of diseases, including viral diseases [39]. Quercitrin is a rhamnose glycoside of quercetin, a flavonoid with well-known antiviral potency [44]. Hyperoside, a quercetin-3-O-galactoside, is another flavonoid that has shown antiviral potential against the hepatitis B virus (HBV) [45]. Kaempferol is a flavonoid aglycone that also exists in glycoside forms and has been implicated as an antiviral agent against HIV and coronavirus [46]. The five compounds interacted with important active-site amino acid residues of the enzyme. Rutin, tannin amine, quercitrin, hyperoside and the standard ligand formed one or more hydrogen bonds with Glu-166. Kaempferol was also found to associate with this residue. Glu-166 is an important residue required by the enzyme for its substrate-induced dimerization, a necessary condition for catalysis [47]. Mutation of this residue has been found to significantly reduce the substrate-induced dimerization process and subsequently prevent enzyme activation [47][48][49]. The compounds also bind to Asn-142, which blocks the entrance of the substrate-binding subsite in the enzyme monomer by forming a hydrogen bond with Glu-166 [48]. Also of high significance is their interaction with His-41 and Cys-145. These residues play very critical roles at the catalytic site of the enzyme [50]. The catalytic site holds a His-41-Cys-145 catalytic dyad in a cleft between two structural domains of the enzyme, where Cys-145 acts as a nucleophile during the first step of the catalytic process and His-41 acts as a base catalyst [51]. Molecular interaction with these important residues is the target of most SARS-CoV-2 3C-like protease inhibitors.
The pharmacophore models of rutin, quercitrin, tannin amine, hyperoside and kaempferol on the SARS-CoV-2 3C-like protease showed that hydrogen bond donors/acceptors and aromatic rings are the structural features of the compounds responsible for the molecular interactions with the enzyme (Figure 3). These features could have contributed to the binding affinity of the compounds. Hydrogen bonds are generally considered to be facilitators of protein-ligand binding [52], and their presence is an indication of good docking quality and complex stability [53]. Aromatic interactions are very significant for molecular recognition and are particularly essential in drug design, since about 20% of amino acids are aromatic in nature [54].
Despite the high binding affinity of the compounds for the protein target, a promising drug candidate should also fulfil the paramount criteria of drug-likeness. Drug-likeness is achieved when the molecular and structural features of the test compound are within the acceptable range. Such features include water solubility, lipophilicity, molecular size, flexibility, polarity and saturation of the compound, and they determine whether a compound will be orally bioavailable or not [55,56]. Water solubility is associated with lipophilicity and permeability, and this, in turn, determines the bioavailability of molecules at the target site. The compounds (apart from the poorly soluble nimbaflavone and the moderately soluble azadirachtannin and nimbinone) possess high water solubility, which is needed for easy passage within the aqueous blood but in principle could reduce their membrane permeation capacity and bioavailability. However, the compounds possess varying levels of lipophilicity, some of which are good enough for them to penetrate the intestinal linings. The lipophilicity values of most of the selected Neem phytoconstituents fall within the acceptable octanol-water partition coefficient (Log P) of ≤ 5. Additionally, kaempferol, scopoletin, nimbaflavone and nimbinone fully obeyed Lipinski's and Veber's rules, and hence they can be predicted to be orally bioavailable. Myricetin could also be a good oral drug because it violated only one Lipinski rule. The Lipinski rule comprises the criteria of molecular weight (MW) ≤ 500, octanol/water partition coefficient (C log P) ≤ 5, number of hydrogen bond donors (HBD) ≤ 5 and number of hydrogen bond acceptors (HBA) ≤ 10; an orally active drug should not violate more than one of these criteria. Under Veber's rule, compounds that meet the two criteria of ≤ 10 rotatable bonds and a polar surface area no greater than 140 Å² are projected to have good oral bioavailability.
Apart from passing Lipinski's and Veber's rules, these compounds also possess a good bioavailability score of 0.55, as against 0.17 for the remaining compounds. This implies that kaempferol, scopoletin, nimbaflavone, nimbinone and myricetin have a 55% probability of at least 10% oral bioavailability in rat or measurable human colon carcinoma (Caco-2) permeability, whereas the remaining compounds have only about a 17% probability. The oral bioavailability prediction also agrees with the predicted gastrointestinal absorption (which is high for kaempferol, scopoletin, nimbaflavone and nimbinone) and the blood-brain barrier (BBB) permeability (which is high for scopoletin and nimbinone).
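The Lipinski and Veber criteria stated above can be expressed as two simple violation counters. The sketch below is a minimal illustration, not the SwissADME implementation; the class and function names are hypothetical, and the descriptor values for kaempferol and rutin are approximate values from public chemical databases used only as examples, not the values in Table 2.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float              # molecular weight (g/mol)
    logp: float            # octanol/water partition coefficient
    hbd: int               # hydrogen bond donors
    hba: int               # hydrogen bond acceptors
    rotatable_bonds: int   # number of rotatable bonds
    tpsa: float            # polar surface area (square angstroms)


def lipinski_violations(d: Descriptors) -> int:
    """Count violations of MW <= 500, logP <= 5, HBD <= 5, HBA <= 10."""
    return sum([d.mw > 500, d.logp > 5, d.hbd > 5, d.hba > 10])


def veber_violations(d: Descriptors) -> int:
    """Count violations of rotatable bonds <= 10 and polar surface area <= 140."""
    return sum([d.rotatable_bonds > 10, d.tpsa > 140])


# Approximate, illustrative descriptor values (assumptions, not from the study).
kaempferol = Descriptors(mw=286.24, logp=1.9, hbd=4, hba=6, rotatable_bonds=1, tpsa=111.13)
rutin = Descriptors(mw=610.52, logp=-1.3, hbd=10, hba=16, rotatable_bonds=6, tpsa=269.43)

for name, d in [("kaempferol", kaempferol), ("rutin", rutin)]:
    print(f"{name}: Lipinski violations = {lipinski_violations(d)}, "
          f"Veber violations = {veber_violations(d)}")
```

With these illustrative inputs the counters agree with the pattern reported in the text: kaempferol violates neither rule set, whereas rutin violates more than one Lipinski criterion and one Veber criterion.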
In addition to the drug-likeness properties, the pharmacokinetic properties of the compounds were also considered. The results indicated that kaempferol, quercitrin and hyperoside might escape the efflux pump (P-gp), a multidrug resistance protein that protects the organs from oxidative damage by xenobiotics. Another important pharmacokinetic-related factor is the underlying role played by drug-metabolising enzymes such as cytochrome P-450 [57,58]. Roughly 50-90% of known drugs are substrates of the five important isoforms CYP1A2, CYP2C19, CYP2C9, CYP2D6 and CYP3A4 [59][60][61][62]. This implies that kaempferol, nimbaflavone, nimbinone, myricetin and scopoletin could inhibit the metabolism of drugs that are substrates of one or more of these enzymes, thereby causing some level of drug-drug interaction [60,61,63,64].
Prediction of the toxicity profile of test compounds is an essential part of the early drug discovery process. Roughly 89% of new drug candidates fail in human clinical trials despite their high efficacy and acceptable pharmacokinetic properties, and 50% of these failures are due to unexpected drug-related toxicity [63,65]. From the results obtained in this study, tannin amine, kaempferol and nimbinone are not likely to induce any toxic effect, but rutin and hyperoside might be immunotoxic, and quercitrin could be both carcinogenic and immunotoxic. However, the two most toxic of all the compounds are myricetin and azadirachtannin, with LD50 values of 159 and 274 mg/kg respectively. The remaining compounds, the majority of which belong to oral toxicity class 5 (low probability of being harmful if swallowed), are relatively safe, with LD50 values ranging from 2000 to 5000 mg/kg.