Synthesis of the poised DNA-encoded library
The first NUDEL was designed around two sequential amide couplings: a set of Fmoc-protected amino acids is coupled to an amino-derivatised DNA headpiece in cycle 1, followed by Fmoc deprotection and a second amide coupling with a set of aryl halide-containing acids. Libraries of this design could then incorporate any active fragment for which a derivative was available with a functional group that allowed coupling with the halide. Commonly used reactions of this type, such as Suzuki-Miyaura11,12 and Buchwald-Hartwig13 reactions, have recently been shown to work efficiently on DNA-tagged substrates.
The library components were selected based on diversity of shape and chemical functionality (Fig. 2a). Seven Fmoc amino acids, consisting of the canonical amino acids alanine, valine and phenylalanine, pyrrolidine-3-carboxylic acid and the aniline derivatives 3- and 4-aminobenzoic acid and (4-methylamino)benzoic acid, were selected for cycle 1. Six aryl halide-containing acids, namely 3- and 4-iodobenzoic acid, (4-iodophenyl)acetic acid, 5-iodonicotinic acid, 2-(6-bromopyridin-3-yl)acetic acid and 2-(4-bromo-1H-pyrazol-1-yl)acetic acid, were selected for cycle 2. Thus, the final library would contain a variety of hydrogen bonding and lipophilic functionality along with differing overall shapes that present different relative orientations of the components.
Through the DEL methodology, large numbers of compounds become synthetically achievable in parallel. However, one of the major disadvantages of working with DNA-conjugated compounds is that the removal of side products is difficult. Underpinning the fidelity of the library is the requirement that reactions used in its synthesis proceed in high conversion to the intended product. This ensures that the compound structure associated with each label is correct, as side products or incomplete reactions give rise to multiple products attached to the same sequence. To ensure the NUDEL was of high fidelity, examples of amide coupling conditions for compounds 1a-c with headpiece 6 were optimised to the highest conversions, before further expanding for the selected library building blocks. Working with a small-scale library afforded the ability to conduct thorough validation experiments for each stage of library construction.
After a period of optimisation, the library synthesis protocol involved on-resin monomethoxytrityl deprotection of a hexylamino-tagged DNA adapter sequence, followed by HATU-mediated coupling of a PEG-4 linker bearing an Fmoc-protected amine (Fig. 2b). Concomitant Fmoc deprotection and cleavage from the resin were achieved by methylamine treatment. The resulting headpiece was annealed with a complementary DNA strand and the building block 1 codons were attached by enzymatic ligation. The Fmoc amino acids in cycle 1 were incorporated by DMTMM coupling in borate buffer, followed by deprotection to yield the 7 intermediate amines. These intermediates were pooled, and the resulting mixture was split into 6 wells. Codon 2 ligation was then carried out, followed by a second DMTMM-mediated acylation with the cycle 2 halide-containing acids, resulting in a fully encoded library of 42 aryl halides (11aa-gf), poised for coupling of active fragments.
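The split-and-pool bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function name `enumerate_library` and the convention of labelling cycle 1 monomers a-g and cycle 2 monomers a-f are assumptions inferred from the compound range 11aa-gf, not part of the reported protocol.

```python
from itertools import product
from string import ascii_lowercase

N_CYCLE1 = 7  # Fmoc amino acids used in cycle 1
N_CYCLE2 = 6  # aryl halide-containing acids used in cycle 2


def enumerate_library(n_cycle1: int, n_cycle2: int, prefix: str = "11") -> list[str]:
    """Return one label per (cycle 1, cycle 2) monomer combination.

    The letter-based labels are an assumed convention (e.g. "11aa"),
    chosen to mirror the 11aa-gf range quoted in the text.
    """
    cycle1_labels = ascii_lowercase[:n_cycle1]
    cycle2_labels = ascii_lowercase[:n_cycle2]
    return [f"{prefix}{a}{b}" for a, b in product(cycle1_labels, cycle2_labels)]


library = enumerate_library(N_CYCLE1, N_CYCLE2)
print(len(library), library[0], library[-1])
```

Because every pooled cycle 1 intermediate is present in each of the 6 wells, the library size is simply the product of the two monomer counts.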
Coupling of an active fragment
To demonstrate the use of the NUDEL to generate a focused library for fragment expansion, we selected bromodomain-containing protein 4 (BRD4) as a test case. 3,5-Dimethylisoxazole is a known active fragment binder of BRD4.14 We proposed that the NUDEL aryl halides could be derivatised by coupling to the corresponding boronic acid of dimethylisoxazole via on-DNA Suzuki-Miyaura cross-coupling,11,12 thus providing a BRD4-focused library.
The coupling of dimethylisoxazole boronic acid to the library was potentially challenging because of the heterocyclic system and the sterically hindering bis-ortho-methyl substitution. To establish it, a model DNA substrate containing the adapter sequence was used in a trial reaction (Fig. 3a). Micellar-mediated Suzuki-Miyaura conditions performed very well with this substrate, furnishing the coupled dimethylisoxazole with 100% conversion and giving confidence that the BRD4 fragment could be coupled efficiently on-DNA. A sample of the NUDEL was subjected to the same coupling conditions, resulting in the BRD4-targeted library.
Affinity selection against BRD4
The targeted library was incubated with immobilised BRD4 (first bromodomain, BD1) protein in the presence of a DNA-linked version of the established BRD4 ligand JQ115 as a positive control. After washing to remove non-binders, the protein was denatured, and the sample was subjected to 40 cycles of PCR amplification and next-generation sequencing (>150,000 reads). The resulting output was analysed to assess the frequency of occurrence of each building block codon and its enrichment relative to the starting library. Remarkably, this highlighted a specific combination of cycle 1 codon (ACTATGGA) with cycle 2 codon (CTTAGAGC), corresponding to alanine and methylamidopyrazole respectively, which occurred significantly more often (11-fold) than any other sequence and was comparable to the enrichment of the JQ1-derived control, spiked into the library at the concentration of a representative library member (Fig. 3b). Interestingly, neither codon appeared significantly enriched in any other combination. Together, this suggested that compound 21a would be a potent BRD4 ligand and that the combination of 1a and 2f, rather than either motif individually, resulted in a synergistic and multiplicative gain in affinity.
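One common way to express enrichment relative to the starting library is the ratio of a codon pair's read frequency after selection to its read frequency before selection. The Python sketch below illustrates that calculation under stated assumptions: the `enrichment` function, the placeholder codons and all read counts are invented for illustration and are not the analysis pipeline or data used in this work. Only the codon pair ACTATGGA/CTTAGAGC is taken from the text.

```python
from collections import Counter


def enrichment(selected: Counter, naive: Counter) -> dict:
    """Fold enrichment per codon pair: selected frequency / naive frequency."""
    selected_total = sum(selected.values())
    naive_total = sum(naive.values())
    result = {}
    for pair, naive_count in naive.items():
        if naive_count == 0:
            continue  # enrichment is undefined without reads in the naive library
        selected_freq = selected.get(pair, 0) / selected_total
        naive_freq = naive_count / naive_total
        result[pair] = selected_freq / naive_freq
    return result


# Codon pair named in the text.
HIT = ("ACTATGGA", "CTTAGAGC")
# Hypothetical placeholder codons, used only to fill out the toy example.
OTHERS = [("ACTATGGA", "GGGGGGGG"), ("TTTTTTTT", "CTTAGAGC"), ("TTTTTTTT", "GGGGGGGG")]

# Toy read counts, invented purely for illustration.
naive_counts = Counter({HIT: 100, **{pair: 100 for pair in OTHERS}})
selected_counts = Counter({HIT: 700, **{pair: 100 for pair in OTHERS}})

fold = enrichment(selected_counts, naive_counts)
for pair, value in sorted(fold.items(), key=lambda item: -item[1]):
    print(pair, round(value, 2))
```

Normalising by total reads on both sides keeps the comparison independent of sequencing depth, which matters when the selected and naive samples are sequenced to different depths.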
Off-DNA hit confirmation
To confirm the synergistic finding, a ‘matched square’ of four compounds was selected for off-DNA synthesis and evaluation (Fig. 3c): compound 22, which corresponded to the hit; 24, which contained 1a (active cycle 1 monomer) and 2a (representative non-enriched cycle 2 monomer); 18a, which contained 1g (representative non-enriched cycle 1 monomer) and 2f (enriched cycle 2 monomer); and 18b, which contained both non-enriched monomers.
Compounds 18a-b were synthesised from 4-(aminomethyl)benzoic acid 15 by formation of the primary ethyl amide, with protection of the amine as the trifluoroacetamide and subsequent deprotection. Amide coupling to the appropriate cycle 2 monomer and incorporation of the dimethylisoxazole warhead via Suzuki cross-coupling then gave the desired off-DNA compounds 18a-b (Fig. 3c). Compounds 22 and 24 were obtained in a similar manner, first coupling alanine ethyl ester 19 with either 2a or 2f, followed by Suzuki cross-coupling to install the isoxazole. Ester hydrolysis and amide coupling gave the final ethyl amide products 22 and 24.
SAR analysis
The Kd values of 18a, 18b, 22 and 24 were determined by SPR analysis of binding to BRD4 BD1. Gratifyingly, hit compound 22 had a Kd of 51 nM (Fig. 4a). In contrast, the affinities of 18a, 18b and 24 were 35 µM, 2.5 µM and 15 µM, respectively, i.e. roughly 50- to 700-fold weaker (Fig. 4b). This result is consistent with the enrichment observed during the selection and shows that it is the combination of the alanine and the pyrazolylacetamide moieties, rather than either of them in isolation, that is responsible for the dramatic gain in affinity. This highlights a distinct advantage of this approach over traditional fragment growing. It would not be possible to identify this unique, synergistic combination of substructures without synthesising every combination, something that is essentially prohibitive for analogues synthesised off-DNA. Moreover, using fragment growing approaches, this combination could not be identified because the pyrazolylacetamide species would not be identified as beneficial when lacking the cycle 1 alanine and so would not have been selected for subsequent elaboration. The use of the NUDEL, which contains every possible combination of the selected substructures, makes the discovery of such synergistic combinations of groups facile.
The binding mode of 22 (CLAS-106) was characterised by solving its crystal structure in complex with BRD4 (Fig. 4c). The structure was determined at 1.8 Å resolution and, as expected, 22 was observed to interact with the KAc recognition site. The oxygen of the isoxazole moiety forms a hydrogen bond with the conserved Asn-140, and a conserved water mediates a hydrogen bond between the nitrogen of the isoxazole and Tyr-97. The pyrazole core is sandwiched between Pro-82 of the WPF shelf and Leu-92 of the ZA-loop, with a key hydrogen bond formed between a conserved structural water and the main chain of Pro-86 and Gln-85 of the ZA-loop. The extended pyrazole core, a product of cycle 2 synthesis, is further stabilised through an additional water-mediated hydrogen bonding network involving the main chain of Asp-86. This provides an explanation for the synergistic increase in potency from the combination of the pyrazole and alanine, in that it provides a unique relative orientation of the three critical groups to form complementary hydrogen bonds to the two adjacent structural water molecules.
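The non-additivity within the matched square can be quantified with a simple thermodynamic-cycle calculation, sketched below in Python. This analysis is an illustration and is not reported in the source: the coupling factor (the ratio of the products of Kd values across the two diagonals of the square), the assumed temperature of 298.15 K and all function names are assumptions introduced here. The Kd values and monomer assignments are those quoted above.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15  # assumed temperature; not stated in the text

# Kd values in nM, keyed by compound; monomer pairs follow the matched square.
KD_NM = {
    "22": 51.0,       # 1a + 2f (hit)
    "24": 15_000.0,   # 1a + 2a
    "18a": 35_000.0,  # 1g + 2f
    "18b": 2_500.0,   # 1g + 2a
}


def fold_weaker_than_hit(kd: dict) -> dict:
    """Kd of each comparator divided by the Kd of hit compound 22."""
    return {name: value / kd["22"] for name, value in kd.items() if name != "22"}


def coupling_factor(kd: dict) -> float:
    """Ratio that equals 1 if the two monomer changes contribute independently."""
    return (kd["24"] * kd["18a"]) / (kd["18b"] * kd["22"])


def coupling_energy_kcal(kd: dict, temperature_k: float = TEMPERATURE_K) -> float:
    """Coupling free energy, RT * ln(coupling factor), in kcal/mol."""
    return R_KCAL * temperature_k * math.log(coupling_factor(kd))


folds = fold_weaker_than_hit(KD_NM)
omega = coupling_factor(KD_NM)
ddg = coupling_energy_kcal(KD_NM)
print({name: round(value) for name, value in folds.items()})
print(round(omega), round(ddg, 2))
```

A coupling factor far above 1 means that the affinity of 22 is much higher than would be predicted by combining the individual effects of swapping each monomer, which is the numerical counterpart of the synergy described in the text.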
ADME and cellular pharmacology of 22
To confirm the activity of 22 in a cellular context, human multiple myeloma MM.1S cells were treated with the compound. Consistent with the high polarity of the compound, and hence expected low permeability and significant efflux, cells required permeabilisation with digitonin to show activity. In the presence of digitonin, 22 showed downregulation of c-Myc, an established downstream pharmacodynamic marker of BRD4 inhibition,16 at a concentration of 30 µM (Fig. 4d). A more lipophilic, cell-permeable analogue 25 (benzyl amide derivative) showed downregulation of c-Myc in a concentration-dependent manner in both the presence and absence of digitonin.
The LogD value for compound 22 was 0.90, giving a lipophilic ligand efficiency of 6.4, which is consistent with a high-quality lead compound (Fig. 4e).17 It had high solubility and low in vitro clearance in human and rat microsomes and hepatocytes. As expected, and consistent with the cellular activity, permeability in MDCK cells was low, without evidence of MDR1-mediated efflux. Overall, compound 22 has a high-quality technical profile with scope for facile further optimisation.
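The quoted lipophilic ligand efficiency can be reproduced with a short calculation, sketched here in Python. It assumes the common definition LLE = pKd − LogD, with pKd taken from the SPR Kd of 51 nM; the source does not state which potency measure was used, and the function name is introduced only for illustration.

```python
import math


def lipophilic_ligand_efficiency(kd_molar: float, log_d: float) -> float:
    """LLE as pKd minus LogD (assumed definition; pKd = -log10 of Kd in mol/L)."""
    p_kd = -math.log10(kd_molar)
    return p_kd - log_d


# Values quoted in the text for compound 22.
KD_22_MOLAR = 51e-9
LOG_D_22 = 0.90

lle_22 = lipophilic_ligand_efficiency(KD_22_MOLAR, LOG_D_22)
print(round(lle_22, 1))
```

Under this assumed definition the result rounds to the value of 6.4 given in the text.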