Sequences retrieval and alignment
All structural alignments were performed using the Dali server (http://ekhidna2.biocenter.helsinki.fi/dali/) (Holm, 2020). nCOVSNVs with ID NC_045512 were identified and cross-matched with BLASTp for the COVID-19 virus (SARS-CoV-2) receptor
binding domain (RBD) and subdomain-1 (319th to 591st aa) of the the spike glycoprotein [bat coronavirus RaTG13] with about 74% sequence identity in the single receptor-binding domain for the up configuration and the S1 protein partial [SARS coronavirus GD322] with 100% sequence identity. Τotal of 1652 SARS-CoV-2 S protein complete sequence alignments of COVID-19 RBD subdomain-1 (319th to 591st) amino acid available at the (2,3,7-39) NCBI Virus portal were retrieved. We then sought to rank the COVID-19 virus RBD subdomain-1 (319th to 591st) amino acid sequences based on the predicted probability that the RNA folds into the MEA structure with 73.96% sequence identity and not other structures (8,9,12-30) according to the patient’s corresponding sample information, dates of sample collections, and geographical locations among others which have shown strong clinical evidence of the presence of SARS-CoV by immunohistochemistry, high-resolution oligonucleotide array, electron microscopy, and real-time reverse transcription-qRT-PCR.
Preparation of the protein structures
For this purpose, (1,2,4-6,23) we initially selected the three-dimensional structures of the non-structural proteins of the Nsp3, Nsp5 (PLpro domain), (1-5,6,7-48) Nsp12 (RdRp), the structural Spike proteins, the Nsp15 (endoribonuclease), and the nucleocapsid protein (N protein) between the SARS-CoV-2 Mpro structure and the SARS-CoV Mpro, PDB entry 6LU7 as the best-characterized drug targets among coronaviruses. (2,5,6,7,9-27) For the N protein, we clustered 31 conformations with (11,13,14-29,32) Glu174 present in an opened residue conformation out of a total of 40 ionization states (16,19,21-46,47,48) present in the globular cluster of five protease helices, which is involved through a salt-bridge interaction in modulating the homodimerization of the NMR-derived structure of the (PDB code 6YI3 49) Mpro, mainly between the binding cavities of the Arg4 of one protomer and the Glu290 domains of other to select a short linear peptide subset representative (22,24,26-42,43,46,48) within the protease flexibility. (17,18,21-43,48) The aliphatic carbon atoms from the the closed conformation of the Glu174/Glu166 side chain were prepared as a part of the phosphate binding site which was lead in the binding site to steric clashes with potential inhibitors by removing all water molecules using PyMOL software.
Collecting gold-standard pairwise drug combinations
In this study, we focused on pairwise drug combinations on clinically and FDA-approved investigational drugs by assembling the publicly available clinical data sources from the updated human interactome and the multiple commonly used small molecule databases. (23,24) Each approved drug in combinations was required to have the experimentally screening system (25,26) and validated COVID-19 protein target information based on GeneCards and on high-quality PPIs: each EC50, IC50, Ki, or Kd ≤ 10 µM. (27,28) Compound name, gene expression data, metabolic associations, evolutionary analysis, commercial name or generic name of each drug and protein-drug datasets were included and (29,30) standardized by MeSH and UMLS vocabularies (22,28-47) and further transferred to DrugBank ID (30-32) from the DrugBank database (v4.3)40. (30-38, 40) Duplicated drug pairs were removed. (40, 42, 44) In total, 681 unique pairwise drug combinations by literature-derived low-throughput experiments (46,48) connecting 362 drugs were retained. We also compiled clinically reported adverse (30,32,37-47,48) drug–drug interactions (DDIs) data from the (38,39,40,42-48) the Therapeutic Target Database (TTD, v4.3.02)41, the PharmGKB database (December 30, 2015) and the FDA approved DrugBank databases using reported binding affinity data such as the dissociation constant (Kd), inhibition constant/potency (Ki), median inhibitory concentration (IC50) ≤ 10 µM or median effective concentration (EC50). Drug–Protein target interactions, (42-48) Chemical similarity analysis and (32,35,36-48) Collecting adverse drug–drug interactions of (54,58) FDA approved drug pairs were acquired from the publicly available DrugBank database (v4.3).
Gene Ontology (GO) similarity analysis and Clinical similarity of drug pair analysis
The Gene Ontology (GO) annotation for all the drug-gene targets of the SARS- CoV-2 main protease, Mpro, DPP4 ADABP, ADCP2, CD26, DPPIV, TP103, FURIN FUR, PACE, PCSK3,FURIN-ADAMTS1-ROR-GAMMA- SRRM2-RORγ- NF-κB/RelA-STAT3A- RORγ- NF-κB/RelA-STAT3B, Nsp15, TCEB2-->ASB8->>TCEB1 coding genes was downloaded from the website: http://www.geneontology.org/. We used three types of the experimentally validated or (4,5,7-22,48) literature-derived evidences (32,33,36-48) biological processes (BP), (31,32-48) molecular function (MF), (37,39,4,42-45) and cellular component (CC), (37-48) excluding annotations inferred (36-47,48) computationally. The semantic comparison of GO annotations offers quantitative (32,33-45-48) ways to compute similarities between (22,27,20-32,41-45) genes and drug-gene (22,23,25,27-48) combination theapy products. The overall GO similarity of the drug target-SARS-CoV-2 main protease, Mpro, DPP4 ADABP, ADCP2, CD26, DPPIV, TP103, FURIN FUR, PACE,PCSK3, FURIN-ADAMTS1-ROR-GAMMA-SRRM2-RORγ- NF-κB/RelA-STAT3A-RORγ- NF-κB/RelA-STAT3B, Nsp15, TCEB2-->ASB8->>TCEB1 coding genes binding to the Remdesivir, Colchicine and Ursolic acid drugs was determined by Eq. (x˙=f(x,t,η),x(t0)=x0(η) D={(tk,y~k)}k=1nt y~ik=yi(tk)+εik,εik∼N0,σi2(y~ik−yi(tk))22σi2, p(θ|D)=p(D|θ)p(θ)p(D) p(D|θ)1βlp(θ) σ10 σ20 , averaging all pairs of protein target-SARS-CoV-2 main protease, Mpro, DPP4, ADABP, ADCP2, CD26, DPPIV, TP103, FURIN FUR, PACE, PCSK3, FURIN-ADAMTS1-ROR-GAMMA-SRRM2-RORγ- NF-κB/RelA-STAT3A-RORγ- FURIN-LRP1-MMP17-MMP2-MMP25-NF-κB/RelA-STAT3B, Nsp15, TCEB2-->ASB8->>TCEB1 coding genesa and b with ∈ and ∈ ⟨ GO⟩=1pairs∑{p(D|θ)=∏i=1ny∏k=1nt1σi2πexp},GO(,p(θ|D)=p(D|θ)p(θ)p(D) p(D|θ)1βlp(θ) σ10 σ20)(6).
Preparation of the datasets with known e-Drug3D drugs
The dataset containing the (5,7,9-48) FDA-approved drugs and active metabolites was constructed from the e-Drug3D commercial small molecule and fragments of 1305 different FDA-approved drugs that currently contains 1519 annotated 3D structures of molecular weight <2000 that are represented by a single conformation dataset, (10,12,14-48) a dataset updated annually and freely available for the scientific community at https://chemoinfo.ipmc.cnrs.fr/MOLDB/index.php (13,15-48). In this In-Silico effort we prepared mol2 files of the essential dataset of the chemical structures from the two supplementary e-Drug3D collections which were carefully constructed and have been shown similar inhibition in infectious models and resulting from (i) the generation of multiple conformers of high-quality and curated structures of FDA-approved drugs with molecular weight less than 2000 for the: {[5‐(1lambda4,3lambda4,7lambda4‐purin‐6‐ylsulfanyl)‐1‐methyl‐4H‐1lambda5‐imidazol‐4‐
ylidene](oxo)imino}‐lambda1‐oxidanylidene-heptahydrogen ring systems and (ii) the calculation of the most probable ionic and tautomeric states at pH 7.4. Smaller drugs (backbone with less than 20 heavy atoms) were also prepared in the .pdb, .pdbqt and .mol2 format files (26,27-42,46-48) present in the e-Drug3D dataset. (28,30-48) The ensemble of 3D molecule conformations (31,32-45,48) of the larger drugs from e-Drug3D were provided on a separated drug library dataset.
Drugs under clinical trials (COVID-19 repurposing dataset)
Our approach was focused on identifying a cluster of (1-48) similar chemotypes followed by parallel docking grid generation using the advanced to the next MM-PBSA-WSAS, KNIME-HTS-HVS filter, and chemical structure preparation wizard as provided by the BiogenetoligandorolTM cluster of algorithms that had the potential to target the (2-48) FURIN-ADAMTS1-ROR-GAMMA-SARS-COV-2 conserved domains and fit the geometric constraints without any restraints or constraints when filling in the open valence of the SARS-COV-2-ACE2-RORγ-BRD4-FURIN (1-30,41-48) binding pocket residues. These were then (6-48) converted into substructure searches of one copy of SARS-CoV-2 main protease (7-45) which were used to mine commercially available (8-42) compounds and covalently bonded SARS-COV-2 inhibitors using the (9-32) eMolecules database. The dataset of the selected hit drugs followed the force field parameters as applied to the partial atomic charges of the selectred ligands which were derived using the RESP and collected from published articles (3,5,7,13,48) and approved drugs listed on the (2-39) DrugBank database in the "Clinical Trial Summary by Drug" section to fit the HF/6-31G* electrostatic potentials generated using the Gaussian 16 software package. (4,6-18) We intended to generate these from two different branches to model the viral protein using the isotropic position scaling algorithm. (7,9-46,48) One branch of the selected hit elements was refered to as the “Remdesivir Literature Substructures” branch, (8,11-47) which was based on the Remdesivir, Colchicine and Ursolic acid substructures using the Antechamber module as extracted (15,46) from published bromodomain inhibitors.
Structure-Based Pharmacophore, Docking, Machine Learning (QSAR) Methods.
Molecular docking, quantum mechanical LigandorolTM-inspired physarum-prize-collecting Neural Matrix Factorizations (12-13) and drug repositioning scoring and constrained analysis using the SHAKE algorithm were implemented (13-14) to a collection of the (14-15) ZINC databases. Protein-molecule complexes, (12-21). followed by structural relaxation (13-22) for the cocrystal ligand of SARS-CoV-2 main protease, N3 were generated through (17-18) flexible-ligand:rigid-receptor (19) The molecular (4) docking in this local energy minimization of the equations of motion was conducted at a time step of 1 fs to optimize (15) protein-molecule (22-24) interactions for the relaxation phase and 2 fs capping (24-27) for the N- and C-terminal of each (26) fragment. i-GEMDOCK (26-27) software deployed at 298 K, 1 bar for 10 ns for the equilibrium and sampling phases through cycles in amino-acids 15-23+ within 4 Å to calculate the full electrostatic energy of any (27-28) docked molecule. In addition, following our (37- 38) static analysis, we conducted some (38-39) preliminary molecular dynamics studies (40-41) on a potential “latch” for the Down state protomer. (38) Explicit solvent molecular dynamics (MD) (39) in both the minimization and MD simulation stages of novel coronavirus spike protein (42-43) were performed (44-45) using the NAMD2 program (45). We used the CHARMM-Gui (46) with the CHARMM36m force field (47-48) along with TIP3P water molecules to explicitly (30-39,44) solvate the proteins of a unit cell in a macroscopic lattice and add any (32-43) missing residues of repeating images from the experimental structure atom files selected from the sampled snapshots. (45) Simulations were carried out (35-45) maintaining the number of simulated particles, (27-48) pressure and temperature (the NPT ensemble) (11-48) constant with the Langevin piston method (23-39) specifically used to maintain a constant pressure of 1atm. (12-46) We employed periodic (18- 40) docking boundary conditions to enrich repurposing drug candidates (20-43) using a more accurate and more efficient method, the BiogenetoligandorolTMQMM-PBSA-WSAS pipeline for a water box simulation (43) volume as well as the particle nanomesh (24-45) Ewald (PME) method with a 20 Å cutoff distances (24-46) between the simulated SARS-COV-2 protein (16-47) and water box edge since the most chemical ligands and herbals comprising antiviral and antibacterial activities are large and consist of many rotatable bonds. (15) The integration time entropy step was 2 femtoseconds using a method coined WSAS (weighted solvent accessible surface area) with our protein simulations (17-48) as conducted under physiological conditions (37 C, pH of 7.4, (69,70) physiological ionic strength) based on the MM-PBSA-WSAS binding free energies. To emphasize the (8,9,12-47) computational efficiency of the method (9,10-46) we would like to mention that the average computing time (12,14,15-48) for one solution was <10 s on a single GPU processor by using solvent assessible surface area, 0.054. (15,16,17-48) A significant variability in the solutions can be observed. (15,17,19-48). The docking calculation (20,21,23-48) for estimating the nonpolar solvation energy was performed on the prepared dataset of 195 SARS-COV-2 receptors and FDA ligands by using the BiogenetoligandorolTM Cluster of docking softwares and workflows based on default parameters.