Multi-reference Computational Method for De-novo Design, Optimization, and Repositioning of Pharmaceutical Compounds Illustrated by Identifying Multi-target SARS-CoV-2 Ligands

In this work, a novel computational multi-reference poly-conformational algorithm is presented for the design, optimization, and repositioning of pharmaceutical compounds. The algorithm searches for candidates by comparing similarities between conformers of the same compound and identifies target compounds whose conformers are simultaneously “close” to the conformers of each compound in a reference set. The reference compounds can have very different mechanisms of action (MoAs), which directly and simultaneously shape the properties of the target candidate compounds.

The algorithm's functionality has been validated in silico by scoring ChEMBL drugs against FDA-approved reference compounds that either had the highest predicted binding affinity to our chosen SARS-CoV-2 targets or were confirmed to inhibit such targets in vivo. All our top-scoring ChEMBL compounds also turned out either to be high-affinity ligands of the chosen targets (as confirmed separately in other studies) or to show significant efficacy in vivo against those selected targets.


Conformers as independent molecular entities. In real life, most molecules exist in multiple conformations (shapes), depending on the surrounding environmental conditions. In particular, the 3D shape of a molecule dictates its biological activity and enables the molecule to fit into the binding pockets of proteins. Often, distinctly different chemical compounds that have similar shapes (and similar charge distributions along the molecular surface) have the potential to bind, as long as the ligand's partial charges are positioned in the binding pocket the same way (i.e., form the same hydrogen bonds). Therefore, it is beneficial to compare the shapes and surface charge distributions of the query and reference compounds on a conformer-by-conformer basis. If one of the conformers of the query molecule matches one of the conformers (especially a bound-to-target one) of the reference molecule, then there is a chance that the query compound will also exhibit similar binding properties to the same target.

[...] efficient method proposed here is expected to be very useful for finding candidate drugs for multi-target disease indications, ligand-based drug design, and drug repurposing applications.

Method applications for SARS-CoV-2 treatment compounds. The set of SARS-CoV-2 treatment compounds has been used both for method validation, since some compounds have been confirmed to be effective 24, and for the search for new potential compounds based on the existing known set, since no drug has been identified to be 100% effective against the virus, [...] and spread of infection. Because our method does not directly use target information but rather analyzes the 3D shapes of a compound that was already predicted or experimentally found to be effective against a particular target (we call it a reference compound), one has to choose one (or more) such compound(s) as a reference for each target.
For each of the above SARS-CoV-2 targets (3CLpro, PLpro, and RdRp), we focused on the reference compounds with the highest binding affinities in a recent in silico multi-target repurposing study 24.
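The multi-reference idea described here, one reference compound per target and a query that must resemble all of them at once, can be illustrated with a small ranking sketch. The similarity values, the query names (`query_A`, `query_B`), and the use of the minimum as the combining rule are all assumptions made for illustration; this is not the scoring function of the method itself.

```python
from typing import Dict, List, Tuple

# Hypothetical similarity of each query to the reference compound chosen for
# each target (values in [0, 1]; invented for illustration only).
SIMILARITY: Dict[str, Dict[str, float]] = {
    "query_A": {"3CLpro": 0.82, "PLpro": 0.78, "RdRp": 0.80},
    "query_B": {"3CLpro": 0.95, "PLpro": 0.40, "RdRp": 0.91},
}


def multi_reference_score(per_target: Dict[str, float]) -> float:
    """Combine per-target similarities into one score.

    Assumed rule: take the minimum, so a query scores well only if it is
    close to every reference simultaneously.
    """
    return min(per_target.values())


def rank_queries(similarity: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Return (query, score) pairs sorted from best to worst."""
    scored = [(name, multi_reference_score(s)) for name, s in similarity.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_queries(SIMILARITY):
        print(f"{name}: {score:.2f}")
```

With the minimum as the combining rule, `query_B` is penalized for its weak match to the PLpro reference even though it matches the other two references very well. Other rules (mean, product) would trade off the targets differently.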

For the new compound search (virtual library screening), we used the same set of reference compounds as for the method validation.

[...] possible shapes, adopted via varying environmental conditions, of the same molecule (i.e., conformers) rather than just a single shape that was used before. In particular, the suggested approach is based on the matching of ligand-ligand fingerprints without explicitly using target structure information, unlike docking and molecular dynamics approaches that simulate the physical binding of a ligand to the target. The supporting theory behind the method is based on the decision to treat conformers, which might have different binding characteristics and properties, as independent entities. In such an approach, each conformer receives its own independent alignment-free 3D-similarity score against the known multiple references. All conformers were generated using the ETKDG algorithm implemented in RDKit 27. Benchmarking studies have found ETKDG to be the best-performing freely available conformer generator to date 28,29, providing diverse and chemically meaningful conformers that reproduce crystal conformations.
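To make the notion of an alignment-free, conformer-by-conformer comparison concrete, the sketch below uses a simplified descriptor in the style of ultrafast shape recognition (USR): moments of atomic distances to four reference points. This is a swapped-in, shape-only technique chosen for illustration; it ignores charges and is not the fingerprint used by the method. The toy coordinates stand in for conformers that would in practice come from a generator such as ETKDG, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]
Conformer = List[Point]

# Toy, asymmetric set of atom coordinates (invented; not a real molecule).
TOY_CONFORMER: Conformer = [
    (0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.1, 1.2, 0.0),
    (3.4, 1.0, 0.8), (4.0, 2.3, 1.1),
]


def _moments(values: List[float]) -> List[float]:
    """Mean, standard deviation, and signed cube root of the third moment."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return [mean, math.sqrt(var), math.copysign(abs(third) ** (1.0 / 3.0), third)]


def usr_descriptor(conf: Conformer) -> List[float]:
    """12-number shape descriptor built only from interatomic distances,
    so it is unchanged by rotating or translating the conformer."""
    n = len(conf)
    ctd = tuple(sum(p[i] for p in conf) / n for i in range(3))  # centroid
    cst = min(conf, key=lambda p: math.dist(p, ctd))  # atom closest to centroid
    fct = max(conf, key=lambda p: math.dist(p, ctd))  # atom farthest from centroid
    ftf = max(conf, key=lambda p: math.dist(p, fct))  # atom farthest from fct
    desc: List[float] = []
    for ref in (ctd, cst, fct, ftf):
        desc.extend(_moments([math.dist(p, ref) for p in conf]))
    return desc


def similarity(a: List[float], b: List[float]) -> float:
    """Map the mean absolute descriptor difference into (0, 1]."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)) / len(a))


def best_conformer_match(query: List[Conformer], reference: List[Conformer]) -> float:
    """Treat every conformer as an independent entity and return the best
    similarity over all query/reference conformer pairs."""
    ref_desc = [usr_descriptor(c) for c in reference]
    return max(similarity(usr_descriptor(q), r) for q in query for r in ref_desc)


if __name__ == "__main__":
    # Rotate 90 degrees about z and translate: the shape is unchanged.
    moved = [(-y + 5.0, x - 2.0, z + 1.0) for x, y, z in TOY_CONFORMER]
    # Stretch along x: the shape genuinely changes.
    stretched = [(2.0 * x, y, z) for x, y, z in TOY_CONFORMER]
    print(best_conformer_match([moved], [TOY_CONFORMER]))
    print(best_conformer_match([stretched], [TOY_CONFORMER]))
```

Because the descriptor depends only on distances within a conformer, no superposition step is needed before comparing two conformers, which is what keeps this kind of comparison fast. Taking the maximum over conformer pairs is one possible way to summarize the match; averaging over pairs is an equally plausible alternative.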
Unlike what the majority of computational methods assumed a couple of decades ago (e.g., the CoMFA method 58), recent research indicates that the bioactive conformation is not necessarily the lowest-energy conformation in the presence of the receptor 59-61. In particular, as long as the increase in energy of a less favorable conformation is compensated by its binding to the target, i.e., the total ligand-target energy is lower than the sum of the energies of the non-bound target and ligand, the bound state is favored. The proposed method emphasizes and relies on the ligand's ability to use its potentially higher-energy conformations depending on the target it attempts to bind. Note, however, that when a sufficiently large number of conformers is requested, the ETKDG algorithm generates more conformers with lower energy than with higher energy 27,28; therefore, when averaged over all conformers (we generate 100 conformers per molecule), the lower-energy conformers contribute more to the total overlap.

Actually, one of the things that distinguishes ligand-based 3D virtual screening methods from 2D methods is that one has to start worrying about how many conformers to include in the reference [...]

The authors have called the approach MultiRef3D to emphasize that it is a fast, alignment-free [...]

[...] selected ChEMBL compounds were already marketed drugs for which at least one target is known.

The corresponding ChEMBL extraction query is provided in the manuscript Supplement at GitHub [...]

In addition, all of them also turned out to be good binders of 3CLpro 57.

The other hits were Temafloxacin and Trovafloxacin, predicted to be potent 3CLpro ligands 55 and experimentally shown to inhibit virus replication 56,57, and the anti-inflammatory drugs Benoxaprofen and Ciproflaxin, also predicted to target 3CLpro 58,59.

Concluding the list is an interesting multi-target Aurora/JAK inhibitor hit: compound At-9283.

[...] (1) and (2). Direct docking of the fingerprint-matched queries can be used to confirm the match.

Our methodology emphasizes pursuit of candidate compounds that achieve therapeutic effect (e.g.