Intracellular Drug Targets In Mycobacterium Tuberculosis Revealed By A Chemo-Genetic Approach

Mycobacterium tuberculosis (Mtb), the etiological agent of tuberculosis, is one of the most devastating infectious agents in the world. It causes chronic lung disease, and about one third of the world's population is exposed to it. Chemo-genetic characterization through in vitro evolution combined with whole-genome sequencing can identify novel drug targets and drug resistance genes in Mtb. We analysed the genomes of 53 Mtb mutants resistant to 15 different hit compounds. We found nonsynonymous mutations or indels in 30 genes that may be associated with the acquisition of drug resistance. Beyond confirming previously identified drug resistance mechanisms such as rpoB and lead targets reported in novel anti-tuberculosis drug screens such as mmpL3, ethA and mbtA, we discovered several unrecognized candidate drug targets, including prrB and TB18.5. Exploration of the genomes of M. tuberculosis chemical mutants could aid novel drug discovery and the structural biology of compounds and their associated mechanisms of action relevant to tuberculosis treatment.


Introduction
Mycobacterium tuberculosis (Mtb), the etiological agent of tuberculosis, is one of the most devastating infectious agents in the world 1 . One third of the world's population is exposed to Mtb, and the disease has killed nearly two million people annually. In 2018, about 1.5 million people died from the disease, and 10 million people developed the illness (http://www.who.int/en/news-room/fact-sheets/detail/tuberculosis). TB transmission is airborne: droplets containing Mtb enter the lungs, and circulating alveolar macrophages engulf the bacilli (http://www.who.int/mediacentre/factsheets/fs104/en/). Macrophages are key components of the human innate immune system that destroy invading microorganisms. Yet Mtb is able to survive the macrophage's killing machinery, persist, and even replicate inside the macrophage in a specialized organelle termed the phagosome. Mtb can thus evade the host immune system and is protected from many antibiotics that fail to reach the phagosome 2 .
There are several first- and second-line anti-TB drugs. Standard treatment involves a regimen of four drugs (isoniazid, rifampin, pyrazinamide and ethambutol) taken daily for 6 to 9 months, far longer than for most bacterial infections. With the increasing prevalence of multidrug-resistant and extensively drug-resistant tuberculosis, treatment often involves more expensive second-line drugs and requires over 24 months. A few candidate drugs and hit compounds have been discovered in the last two decades, but only two drugs, bedaquiline and pretomanid, have been FDA approved in the past forty years 3 . There is therefore an urgent need to combat these resistant Mtb strains by refuelling the drug development pipeline with novel drug discovery approaches.
One of the challenges in TB drug discovery is the poor translation of compounds with in vitro activity into efficacy in clinical settings. For instance, compounds may be selected that are active only under in vitro conditions; the targets inhibited and identified in liquid culture during cell-based phenotypic screens may not be essential in vivo 4 . This challenge can be addressed through a more global understanding of the host-Mtb interaction using a chemical-genetic approach. We have developed an advanced intracellular drug-screening assay to screen compounds in infected macrophages 5,6 . Using this approach, we further screened two libraries and identified a set of diverse chemical entities that are highly effective against Mtb within the human macrophage, with marked intracellular selectivity. Yet the mode of action (MOA) of these hit compounds and their potential drug targets have not been fully elucidated.
Whole-genome sequencing (WGS) followed by bioinformatics analysis has been effective for investigating the epidemiology and transmission of Mtb in outbreak investigations and for infection control 1,7−9 . The technology and variant-calling pipelines are also useful for characterizing genetic polymorphisms and mechanisms of resistance in drug-resistant clinical strains 10-14 . Because the lack of understanding of compound MOA has become the major barrier to developing novel drugs to treat tuberculosis, WGS has been extended to identify and characterize candidate drug targets in novel TB drug discovery programs: the genomes of spontaneous drug-resistant mutants of M. bovis BCG or M. smegmatis obtained from screening have been characterized, followed by variant annotation and target identification and validation 15-18 .
In this study, we sought to identify novel candidate drug targets and the MOA of hit compounds, and to further understand the mechanisms of resistance to various chemical entities. We took advantage of the hit compound libraries and the intracellular drug-screening assays. We postulated the MOA of selected hit compounds based on their chemical properties. We then generated 53 resistant mutants of M. tuberculosis H37Rv on media containing 2-5 × MIC90 of various hit compounds and sequenced their genomes to assist identification of the corresponding MOA. The identified mutations led us to novel MOA candidate proteins, including possible drug targets, which are critically important for identifying new antibiotics for the long-term control of TB.

Results And Discussion
Identification of hit compounds
Previous screening of GSK-proprietary libraries had identified a set of diverse chemical entities. High-throughput screening (HTS) was performed using a 5 µM single shot and a 1-10 µM dose response for 84,000 compounds from predefined in-house GSK libraries: first, the "TB box", comprising 11,000 compounds that came out of an in vitro phenotypic screen of 2,000,000 compounds against M. bovis BCG with hit confirmation in Mtb; and second, a library of 73,000 compounds drawn from GSKChem with "ideal" medicinal chemistry characteristics, termed "Small&Beautiful".
Data were analysed, and intracellular MIC50 and MIC90 values were extrapolated for all compounds tested. In total, 523 hit compounds belonged to the GSK "TB box", with intracellular MIC90 <3 µM, and 31 hit compounds belonged to Small&Beautiful, with MIC90 <10 µM. A total of 564 hits from the HTS campaign were identified across the two libraries (unpublished data).
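The library-specific cut-offs above amount to a simple threshold rule, which can be sketched as follows. This is a minimal illustration only: the `Compound` record, the `select_hits` helper and the example values are hypothetical and are not taken from the screening data; only the two cut-offs (<3 µM for the TB box, <10 µM for Small&Beautiful) come from the text.

```python
from dataclasses import dataclass

# Library-specific intracellular MIC90 cut-offs (in µM) quoted in the text.
MIC90_CUTOFF_UM = {"TB box": 3.0, "Small&Beautiful": 10.0}


@dataclass(frozen=True)
class Compound:
    name: str         # hypothetical identifier
    library: str      # "TB box" or "Small&Beautiful"
    mic90_um: float   # intracellular MIC90 in µM


def select_hits(compounds):
    """Return compounds whose MIC90 is strictly below their library's cut-off."""
    return [c for c in compounds if c.mic90_um < MIC90_CUTOFF_UM[c.library]]


# Invented example values, for illustration only.
example = [
    Compound("cpd-1", "TB box", 0.8),
    Compound("cpd-2", "TB box", 4.5),
    Compound("cpd-3", "Small&Beautiful", 7.2),
    Compound("cpd-4", "Small&Beautiful", 12.0),
]
hits = select_hits(example)
print([c.name for c in hits])
```

Keeping the cut-offs in a single dictionary keyed by library makes it explicit that the two libraries were held to different potency thresholds.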
Reasoning that it would be difficult to obtain mutants from intracellular Mtb, we took advantage of the observation that various carbon sources can mimic the intracellular environment within macrophages. The 85 hit compounds selected were subjected to in vitro MIC determination in different carbon sources (glucose, cholesterol and acetate). Twenty-seven (34%) of the compounds tested had an MIC90 in cholesterol. Most of the compounds active in cholesterol were also active in acetate 6 .

Generation of mutants, mutant characterization and whole-genome sequencing
Of these 85 hit compounds, the properties of 16 are summarized in Table 1. These compounds demonstrated potent activity against Mtb in defined carbon source media 5,6 . Although structural analysis suggested antitubercular activity for some compounds, the targets or MOA of several remain unknown or undefined.

Identification of candidate intracellular drug targets in resistant mutants
We discovered 74 nonsynonymous SNPs and 13 indels (insertions or deletions) located in 30 different genes that arose in 53 Mtb mutant genomes (Table 2). Most polymorphisms (28) were located in genes involved in cell wall and cell processes, followed by intermediary metabolism and respiration (16) and regulatory proteins (14) (Fig. 2). Annotation of these mutations revealed several known genes or targets relevant to drug tolerance or resistance; several mutations occurred in targets of the first- or second-line drugs used to treat tuberculosis. We found a mutation in rpoB (Asp574Asn), which encodes the beta subunit of RNA polymerase, in mutants resistant to compound 950A. Mutations in rpoB known to confer resistance to rifampicin are commonly reported in MDR and XDR M. tuberculosis strains (Gygli et al. 2017); approximately 95% of rifampicin-resistant clinical isolates carry a mutation in the resistance-determining region of rpoB. In addition, six strains resistant to three compounds (472A, 739A, 912A) possessed mutations (three different nonsynonymous mutations) in ethA (Table 2), which encodes an FAD-containing monooxygenase, a mycobacterial enzyme responsible for bioactivation of ethionamide (ETH), an antibiotic prodrug used in tuberculosis treatment 19-22 . Loss-of-function mutations in ethA result in ethionamide resistance 20 . Interestingly, four mutants resistant to 472A and 739A that bore substitutions in ethA also possessed indels in Rv3220c, which belongs to a two-component regulatory system that enables the organism to make coordinated changes in gene expression in response to various environmental stimuli 23 . However, Rv3220c did not appear to contribute to M. tuberculosis virulence, because an Rv3220c knockout mutant did not cause any severe infection phenotype in mice relative to the H37Rv wild type in a previous study 23 .
For the two mutants resistant to 472A and 739A that lacked mutations or indels in ethA and Rv3220c, mutations were observed in rpsO, which encodes ribosomal protein S15 and is important in protein translation 24 .
Apart from ethA, mutants resistant to 912A and 705A had mutations or gained a stop codon in Rv3083 (MymA), which also plays a role in activating ethionamide; loss of MymA function results in ethionamide-resistant Mtb 25,26 . Grant et al. (2016) found that MymA, a Baeyer-Villiger monooxygenase (BVMO) not previously described as an activating enzyme, is required to oxidize compounds to the corresponding sulfoxide for their activity against replicating and non-replicating bacilli 25 ; loss of MymA function is proposed to confer resistance comparable to loss of EthA function 26 .
Gene targets that acquire independent mutations may confer a fitness advantage on Mtb strains in the presence of antimicrobial drugs. Importantly, we identified mmpL3 as a target of independent mutation in mutants resistant to compounds 267A, 213A 27 and 290A. MmpL3 is a membrane transporter in the resistance-nodulation-cell division family and has been shown to be the target of several small molecules and antimycobacterial compounds 28-31 . Similarly, multiple strains resistant to 412A developed mutations in prrB, which belongs to a two-component regulatory system composed of the PrrB histidine kinase and the PrrA response regulator 32-34 . This gene has been shown to be critical for the viability of M. tuberculosis cells and is required for the initial phase of macrophage infection; prrBA is conserved among all mycobacterial species, pointing towards a critical function in mycobacterial physiology. Recently, prrB was also reported as the target of a diarylthiazole hit compound 33 .
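The idea of a gene being hit independently in several mutants selected on the same compound can be expressed as a small grouping step. The sketch below is illustrative and is not the pipeline used in this study: the function name, the tuple layout and the placeholder compound, mutant and gene labels are all hypothetical.

```python
from collections import defaultdict


def independently_mutated_genes(calls, min_mutants=2):
    """Group variant calls by (compound, gene) and keep the recurrent genes.

    `calls` is an iterable of (compound, mutant_id, gene, change) tuples.
    A gene is reported for a compound when it is mutated in at least
    `min_mutants` distinct mutants selected on that compound.
    """
    mutants = defaultdict(set)
    for compound, mutant_id, gene, _change in calls:
        mutants[(compound, gene)].add(mutant_id)
    return {
        key: sorted(ids)
        for key, ids in mutants.items()
        if len(ids) >= min_mutants
    }


# Placeholder labels only; these are not calls from Table 2.
calls = [
    ("cpdX", "m1", "geneA", "p.Ala10Val"),
    ("cpdX", "m2", "geneA", "p.Gly55Asp"),
    ("cpdX", "m2", "geneB", "p.Ser3fs"),
    ("cpdY", "m3", "geneA", "p.Ala10Val"),
]
recurrent = independently_mutated_genes(calls)
print(recurrent)
```

Counting distinct mutants rather than distinct calls avoids rewarding a single mutant that carries several changes in the same gene.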
In addition, mutations were found in targets of new anti-tuberculous drugs. A frameshift insertion leading to loss of function in Rv0678 was observed in mutants resistant to 454A. Rv0678 regulates the expression of the MmpS5-MmpL5 efflux pump, and its variants can confer resistance to bedaquiline, leading to 2- to 8-fold increases in bedaquiline MIC as well as 2- to 4-fold increases in clofazimine MIC 35 . Such variants have been isolated in vitro upon exposure to clofazimine 3 or bedaquiline. Recently, cross-resistance between clofazimine (CFZ) and bedaquiline (BDQ) was shown to be due to mutations within Rv0678 35-37 , a transcriptional repressor, which result in derepression and upregulation of the multi-substrate efflux pump MmpL5. Similar genetic polymorphisms (SNPs and indels) in Rv0678 have also been reported in mutants resistant to "compound 5" in a drug target discovery study 38 . Interestingly, all the mutants bearing indels in Rv0678 also had a deletion in mbtA, which encodes an adenylating enzyme that catalyzes the first step in the biosynthesis of the mycobactins 39 . In the last decade, siderophore biosynthesis has been pursued as a drug target for tuberculosis 40 .
Our study also revealed mutations in genes that are poorly investigated or of unknown function. Some of these could be newly discovered drug targets. For instance, independent mutations were observed in TB18.5 in mutants resistant to 296A. TB18.5 is a conserved hypothetical protein of unknown function, but it has been predicted to be an outer membrane protein (https://mycobrowser.epfl.ch/). Other candidate targets include narL, discovered in mutants resistant to 412A. NarL belongs to one of the two-component regulatory systems and regulates the synthesis of formate dehydrogenase-N and nitrate reductase enzymes during aerobic nitrate metabolism 43 . A lead compound targeting NarL is being explored for Mtb treatment 44 .
Mutations were observed in the polyketide synthase gene pks6 in two mutants (resistant to compound 296A) that also had mutations in TB18.5. Pks6 is involved in human infection 45 . A substitution was also found in ctpC in a mutant resistant to 648X; CtpC appears to be important in the transport of the heavy metal zinc and contributes to the survival of Mtb in macrophages 46,47 . In addition, mutants resistant to compound 486X had mutations in phoR and fbiA/fbiC, which have important functions. PhoPR is a well-known two-component regulator of pathogenic phenotypes, including secretion of the virulence factor ESAT-6, biosynthesis of acyltrehalose-based lipids, and modulation of antigen export 48-50 . Clinical mutants resistant to delamanid, a drug for M. tuberculosis, were found to possess mutations in fbiA and fbiC, as well as fbiB 51 . Likewise, mutations in fbiA and fbiC have been related to delamanid resistance in M. bovis BCG mutants; such mutations are likely to affect the F420 pathway, and delamanid requires bioactivation by the F420-dependent nitro-reduction pathway to exert its anti-tuberculous activity 51 .
Mutants resistant to 622A and 1114A had mutations in virS; expression of the mymA operon genes may be regulated through PknK-mediated phosphorylation of VirS 52 . VirS is important for M. tuberculosis to block phagosome-lysosome fusion in activated macrophages and to survive in acidic conditions 53 . Another mutant resistant to compound 1114A had mutations in dnaE1, which is essential for high-fidelity DNA replication and is considered a potential drug target 54,55 . A mutant resistant to 412A had a mutation in the isoniazid-inducible gene iniB, which is involved in cell wall synthesis 56 , while moaC3 (also mutated in a mutant resistant to 412A) is part of the molybdenum cofactor (Moco) biosynthesis pathway, which may be significant for pathogenesis 57 .
Multiple mutants resistant to 213A, 622A and 1114A gained N-terminal stop codons in sugI, which encodes a sugar-transport membrane protein in M. tuberculosis 58 . The same mutation may be associated with resistance to the second-line drug D-cycloserine; Chen et al. 13 suggested that the loss-of-function mutation found in a mutant may result in lower uptake of D-cycloserine into the cell, thereby leading to higher resistance.
The role of mutations in the genes of unknown functions or "relatively low-abundance" genes such as Rv0370c, Rv3629, Rv1948, Rv1825, Rv0585c, Rv3175, Rv3327 is unclear; these mutations may be random or involved in compensating for resistance mutations or providing an additional level of resistance 59 .
Fitness costs caused by chemical resistance mutations could be ameliorated by compensatory mutations, which do not contribute directly to drug resistance 59 . Indeed, whole-genome sequencing of MDR and XDR strains has also revealed many mutations, some of which may reflect trade-offs or compensation for fitness costs 10 .

Conclusion
Bioinformatics analysis of 53 mutants selected against various compounds identified several promising genes that confer resistance to the given chemical entities and as such may provide novel drug targets. Some targets of these chemical libraries are consistent with those previously tied to a proposed mechanism of action or resistance (e.g. rpoB, mmpL3, ethA), while others point to potential new pathways identified in our analysis (e.g. prrB). The analysis has extended our understanding of the biological basis of antituberculous action. Future studies are needed to address the role of the identified mutations in genes of unknown function and how they might be involved in the mode of action of, or resistance to, these compounds.

Materials And Methods
Preparation of chemical compounds.
Hit compounds with potential anti-tuberculous activity were identified, and the chemicals were prepared under conditions similar to those previously described by Sorrentino et al. 6 .
Libraries.
The TB box library collection of 11 [...] to the reference genome sequence of H37Rv (NC_000962.3) using BWA-mem 62 , and SNVs and indels were called using GATK v.3 63 . The SNVs generated using GATK were filtered using vcftools (10) to ensure high confidence. The filtering parameters were (i) a minimum read depth of 10; (ii) a minimum base quality of 30 for every nucleotide in the sample; and (iii) a minimum mapping quality of 20. SnpEff 64 was used to annotate the SNV changes in mutants according to the reference genome and GFF files of Mtb H37Rv in NCBI. Unique variants in mutants were identified by examining the discordant SNVs between wild type and mutants that differed from the H37Rv reference in NCBI 65 . To avoid false-positive SNVs, each unique variant in a mutant was inspected in Tablet 66 . SNPs and indels occurring in PPE/PE_PGRS genes, which contain repetitive elements, were excluded to avoid inaccuracies in read mapping and alignment in those portions of the genome 10 .
Clustering analysis of 85 hit compounds into different known targets or chemical entities.
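The variant filtering and wild-type comparison described in the sequencing analysis can be sketched in a few lines. This is a simplified stand-in, not the vcftools/GATK commands actually run: the `Variant` record and both helper functions are hypothetical, the example calls are invented, and all three thresholds are treated as minimums, which is an assumption of this sketch.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    pos: int          # 1-based position on the H37Rv reference
    ref: str          # reference allele
    alt: str          # alternate allele
    depth: int        # read depth at the site
    base_qual: float  # base quality supporting the call
    map_qual: float   # mapping quality at the site


# Thresholds quoted in the text; treating each as a minimum is an assumption.
MIN_DEPTH = 10
MIN_BASE_QUAL = 30
MIN_MAP_QUAL = 20


def passes_filters(v: Variant) -> bool:
    """True when the call meets all three confidence thresholds."""
    return (
        v.depth >= MIN_DEPTH
        and v.base_qual >= MIN_BASE_QUAL
        and v.map_qual >= MIN_MAP_QUAL
    )


def unique_to_mutant(mutant: Iterable[Variant],
                     wild_type: Iterable[Variant]) -> List[Variant]:
    """High-confidence mutant calls that are absent from the parental strain."""
    # Every parental call is subtracted, filtered or not, so that a
    # low-quality parental call cannot make a shared variant look unique.
    parental = {(v.pos, v.ref, v.alt) for v in wild_type}
    return [
        v for v in mutant
        if passes_filters(v) and (v.pos, v.ref, v.alt) not in parental
    ]


# Invented calls, for illustration only.
wild_type = [Variant(1000, "A", "G", 50, 38.0, 60.0)]
mutant = [
    Variant(1000, "A", "G", 48, 37.0, 60.0),  # shared with the parent
    Variant(2500, "C", "T", 35, 36.0, 60.0),  # high-confidence, mutant-only
    Variant(4200, "G", "A", 6, 35.0, 60.0),   # fails the depth threshold
]
unique = unique_to_mutant(mutant, wild_type)
print([(v.pos, v.ref, v.alt) for v in unique])
```

In practice the surviving calls would still be inspected visually and screened against repetitive regions, as the text describes.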

Figure 2
The function of candidate target genes identified from M. tuberculosis H37Rv mutants from screening.

Supplementary Files
This is a list of supplementary files associated with this preprint. supplementarytable1.xlsx