Innate biosignature of treatment failure in human cutaneous leishmaniasis

The quality and magnitude of the immune and inflammatory responses determine the clinical outcome of Leishmania infection, and contribute to the efficacy of antileishmanial treatments. However, the precise immune mechanisms involved in healing or in chronic immunopathology of human cutaneous leishmaniasis (CL) are not completely understood. Through sequential transcriptomic profiling of blood monocytes (Mo), neutrophils (Nφ), and eosinophils (Eφ) over the course of systemic treatment with meglumine antimoniate, we discovered that a heightened and sustained Type I interferon (IFN) response signature is a hallmark of treatment failure (TF) in CL patients. The transcriptomes of pre-treatment, mid-treatment and end-of-treatment samples were interrogated to identify predictive and prognostic biomarkers of TF. A composite score derived from the expression of 9 differentially expressed genes (common between Mo, Nφ and Eφ) was predictive of TF in this patient cohort for biomarker discovery. Similarly, machine learning models constructed using data from pre-treatment as well as post-treatment samples, accurately classified treatment outcome between cure and TF. Results from this study instigate the evaluation of Type-I IFN responses as new immunological targets for host-directed therapies for treatment of CL, and highlight the feasibility of using transcriptional signatures as predictive biomarkers of outcome for therapeutic decision making.


INTRODUCTION
Vector-borne infectious diseases cause more than 700,000 human deaths each year 1 .Among the most impactful are those caused by viruses and intracellular protozoan parasites, including malaria, visceral and cutaneous leishmaniasis (CL), Chagas disease and dengue, which disproportionately affect the poor, and perpetuate the cycle of poverty and disease 2 .CL annually affects about one million people in 92 countries 3 .In the absence of vaccines and effective vector management, control of CL relies on treatment.However, the use and e cacy of most antileishmanials is limited by parenteral delivery, high levels of toxicity including hepato-and cariotoxicity 4 , and increasing rates of treatment failure.
Despite WHO/PAHO guidelines favoring local treatments like thermotherapy or intralesional Sb V therapy for uncomplicated CL, the vast majority of patients in endemic regions across the continent are still treated with systemic therapies, due to local protocol misalignment and limited access to technologies like thermotherapy.Furthermore, a high proportion of patients in the eld (as much as 60% 5 ) are not eligible for local therapies, as they do not comply with the clinical criteria for use which are lesions in anatomical areas other than the face or near the joints, lesions < 3 cm of diameter, and patients presenting with a single lesion 6 .Therefore, systemic antimonials remain the treatment of choice in clinical practice, which is reinforced by their lower cost and higher availability.This translates into a challenge for patients inhabiting rural and dispersed communities in most endemic areas, where access to medical attention is limited.Therefore, adherence to treatment and clinical follow-up are compromised, resulting in higher rates of treatment failure (TF).Consequently, there is an urgent need for predictive biomarkers of treatment outcome for these vulnerable patient populations, but most importantly, for better therapeutic interventions.
Clinical resolution of CL is accompanied by a reduction in parasite burden and yet, parasite elimination per se is not the sole determinant of healing.Parasite persistence has been documented following therapeutically achieved cure in more than 40% of patients infected with Leishmania Viannia species in Central and South America [7][8][9] .Pathogenesis of CL is mediated by the immune and in ammatory responses; thus, resolution of disease and control of infection are intimately linked to the host response 10 .Recent studies from our group and others are beginning to unravel the role of innate and adaptive immune components in TF of human CL upon treatment with antimonial drugs and miltefosine [11][12][13][14] .We have shown that lesions of TF patients are characterized by sustained local in ammation mediated by heightened expression of pro-in ammatory chemokines and cytokines, predominantly associated with activation and migration of monocytes, polymorphonuclear cells, and T H 1 CD4 + T cells 13 .In addition, an environment of high cytolytic activity mediated by CD8 + T cells and subsequent induction of in ammation mediated by IL1β, has been shown in lesion biopsies of TF patients infected with L.V. braziliensis 15 .Interestingly, the expression of in ammatory genes in peripheral blood mononuclear cells (PBMCs) of CL patients is modulated by in vivo exposure to antimonial drugs 11 , as early as 30 minutes after intramuscular drug delivery.The dynamics of expression of some of those genes, including ones involved in monocyte and neutrophil chemotaxis and activation (ccl2, cxcl2, cxcl3, cxcl8), were related to the pharmacokinetics (PK) of plasma antimony concentrations.These observations suggest that drugdependent modulation of in ammation, especially the immunomodulation of the innate responses, is a central component of healing in CL.
We hypothesized that a loop of sustained and deregulated in ammatory gene expression in innate cells leads to their continuous activation and recruitment to cutaneous lesions, promoting immunopathology and leading to TF in CL patients.Here, we present the rst comprehensive transcriptomic pro ling of peripheral blood monocytes (Mo), neutrophils (Nφ) and eosinophils (Eφ) of CL patients undergoing antileishmanial chemotherapy with meglumine antimoniate (Glucantime -GLUC).The pro les from each of the innate immune cell types interrogated here yielded clear signatures that distinguished TF patients from those who cured.Most prominent, and shared between the three cell types, was an enhanced Type I interferon (IFN) gene expression pro le which comprised the hallmark signature of innate cells in patients with TF.A score derived from the expression levels of a subset of Type I IFN-inducible genes successfully classi ed cured and TF patients, even before initiation of treatment.Characterization of the innate immune response during the course of treatment allowed identi cation of host-speci c prognostic signatures of the therapeutic response.This constitutes a solid rst step toward the rational selection of host-targeted CL therapeutics that redirect in ammation and control pathology.

Study participants and samples
Forty adult participants treated with antileishmanial therapy were included in this study, and recruited in our outpatient clinics, one in the South Paci c coast of Colombia in Tumaco, and another in the urban center of Cali.One patient was excluded due to progression to mucosal leishmaniasis, in accordance with the clinical protocol designed for this study (Fig. 1A).Of the remaining participants, 35 completed clinical follow-up.One participant did not donate blood or tissue samples and was therefore not included in the transcriptomic analyses.Most of the 34 enrolled participants were male of Afrocolombian descent, presented with ulcerated skin lesions, and most were infected with L.V. panamensis.Five patients were treated with miltefosine (MLF), all cured.Twenty-nine patients were treated with GLUC, 19 cured and 10 experienced TF.Due to the small sample size of MLF-treated patients and the absence of TF in this group, only samples from GLUC-treated patients (29 in total) were carried forward for transcriptomic analyses (Fig. 1B).Adherence to treatment for GLUC-treated patients was > 95% (measured as the proportion of ampoules received from those prescribed).Signi cant differences were noted in the age and ethnicity of patients who cured vs. patients with TF in the aggregated data of patients recruited in Cali and Tumaco (Table S1a).
Drug resistance and virulence of the infecting Leishmania strain may contribute to TF.In L. V. braziliensis and L. V. guyanenesis infections, virulence has been associated with the presence of the Leishmania RNA virus (LRV).To assess the impact of these variables, we evaluated the Leishmania isolates from study participants for susceptibility to pentavalent antimony and the presence of LRV.Leishmania strains were isolated from 24 patients (14 cures and 10 TF); isolates from the remaining 5 could not be propagated.As intracellular amastigotes, 10 strains were susceptible to GLUC, 12 were resistant (of which 5 were isolated from patients who were cured, and 7 from TF), and data was unavailable for 2 (Fig. S1).No statistically signi cant difference was found (Fisher´s P = 0.23).All strains evaluated were negative for LRV.Together, these results rule out a signi cant contribution of virulence and drug resistance to the outcome of treatment in this cohort of patients.

Global assessment of samples and transcriptomic pro les
Lesion biopsy samples were collected before initiation of treatment.Peripheral blood samples were obtained pre-treatment (Pre-Tx), mid-way through treatment (Mid-Tx, day 8), and at the end of treatment (End-Tx, day 20).Mo, Nφ, and Eφ were isolated from all blood samples and cDNA libraries were constructed and sequenced from a total of 186 samples (Table S2_metadata).Following low coverage ltering (Fig. S2), two samples were removed.For the remaining 184 samples (Fig. 2A), we used principal component analysis (PCA) and a correlation heatmap analysis to visualize the relationship between samples (Fig. 2).The PCA resulted in the expected grouping by cell type (Fig. 2B).A similar clustering was observed in hierarchical clustering analysis (Fig. 2C).These data support the quality, reproducibility and speci city of transcriptomes derived from the isolated blood cells from our patient cohort.
The use of two clinics for patient recruitment (one in Tumaco and another in Cali) necessitated the evaluation of the data to account for possible clinic-associated batch effects.When all samples were colored by clinic on the same PCA plot, no grouping by clinic was discernible within cell types (Fig. S3, Panel A); however, a grouping of samples by cell type revealed a signi cant amount of variance which separated Tumaco from Cali, con rming the presence of a batch effect attributable to the clinic (Fig. S3, Panels B-D).Similar analyses were performed evaluating the effect of ethnicity, without nding any substantial contribution of this variable to overall variability of the data (Fig S3, Panels F-H).To highlight this variability between clinics, we modeled the 'clinic' variable in SVA and used the surrogate variablesmodi ed counts to generate the PCA plots, further supporting the hypothesis of a strong batch effect associated with the clinic (Fig. 3, Panels A and B-D).
To investigate the contribution of each potential variability-contributing metadata variable (such as clinic, donor, visit number, ethnicity or cell type), we performed a loadings analysis of the Surrogate Variables (SVs).First we performed a SV analysis (SVA) with cure/TF status as the variable of interest to calculate the SV loadings.Next, we calculated the F-statistic for each metadata variable (such as clinic location, donor, visit number and cell type) with respect to each SV (Table S3).This metric estimates if each factor is a good indicator of separation based on that SV loading, with higher F-statistic values indicating more between-class variation and less within-class variation.We found that the largest components of variance in the expression data were 'cell type' and 'clinic', followed by modest 'donor/participant' variability.After sub-setting to include only samples collected from Tumaco, we identi ed 'cell type' and 'donor' as variables contributing to expression variability (Table S3).Notably, we did not see a large contribution due to 'visit number' in any SV calculations, showing that the expression changes in the samples are modest between samples collected Pre-Tx, Mid-Tx or at End-Tx.SV1 and SV2 variability was dominated by the cell type, while SV3 through SV5 are representative of the donor and clinic (which are confounded).Visit number variability was not represented in any SV, indicating again that there is minimal variability in gene expression in samples collected at different time-points during treatment.
Based on the signi cant batch effect introduced by the patient recruitment clinic, and the skewed representation of cured patients recruited in Cali (9 of 10 CL patients recruited in this site cured), we excluded all samples obtained from patients in the Cali clinical site from the initial transcriptomic analyses and biomarker discovery.Therefore, transcriptomic variance associated with therapeutic outcome was analyzed on SVA-transformed data sets.
Transcriptomes from lesion biopsies obtained pretreatment, corroborate heightened tissue cytolytic activity in treatment failure patients A recent comparative transcriptome analysis of Pre-Tx lesion biopsies from CL patients infected with L.V. braziliensis showed a gene signature of heightened cytolytic activity in lesions from TF patients compared to those who cured 15 .To explore the congruity of those ndings with infections with other L. Viannia species, we examined CL lesion transcriptomes in our study cohort.Overall, the transcriptomic pro les of lesion biopsies from patients who cured were indistinguishable from patients with TF (Fig. S3E), even following SVA (Fig. 3E).As expected, only few genes (n = 28) were differentially expressed (DE) (P < 0.05; |log 2 FC| ≥ 1) in lesion biopsies, 17 of which were up-regulated and 11 down-regulated in patients who went onto fail treatment, compared to cures (Table S4a).Notably, among up-regulated genes, a signature of increased cytolytic activity (GZMB, NCR1, SH2D1B, PRF1, KLRC1, GNLY, FGFBP2, KIR2DL4, CCL3 and CCL4) was found in tissue samples from TF patients ((Table S4b), consistent with previous ndings from TF patients infected with L.V. braziliensis 15 .The minimal difference in the global transcriptomes of skin lesions from patients that cured and TF patients was expected, since bulk transcriptomes from complex multicellular tissues are often skewed to re ect the most abundant or transcriptionally active cells within the sample.Nevertheless, the fact that a clear transcriptomic signature of enhanced cytolytic activity was detected in TF suggests that systemic differences leading to the activation and/or recruitment of cytotoxic Natural Killer (NK) and CD8 + T cells could be contributing to this enhanced in ammatory state.

Transcriptomic pro les of innate immune cells do not change over the course of treatment
Our understanding of the participation of innate immune responses in the outcome of antileishmanial therapy is almost exclusively limited to the role of macrophages as primary host cells for the parasite.However, mounting evidence shows that other innate cells including Nφ, and more recently Eφ and NK cells, participate in the in ammatory responses that contribute to CL immunopathology, and thus their functions are relevant to the outcome of treatment 16 .
Among the hallmarks of innate immune cell functions are the velocity and robustness of their elicited responses, which in turn require tight mechanisms of control to avoid host injury.To explore the dynamics of these responses and their participation in therapeutic responsiveness, we analyzed samples collected from CL patients at three different visits: Pre-Tx, Mid-Tx and End-Tx.PCA plots of the transcriptomes of individual cell types did not reveal any signi cant clustering of samples based on visit (i.e. over the course of treatment) (Fig. S4).Thus, we opted to group transcriptome samples from all time points from each cell type, to provide increased robustness to DE analyses for identi cation of biomarkers and transcriptional signatures of cure and TF.
Monocyte, neutrophil and eosinophil transcriptomes from CL patients who cure differ from those with TF Examination of Nφ, Mo and Eφ transcriptomes showed a clear clustering of samples by treatment outcome, in each of the three cell types, revealing a distinct separation of samples along the rst principal component (PC1) (Fig. 4A).On average, PC1 explained 18% of the variance across the three cell types.A DE analysis in each of three cell type transcriptomes revealed 3 to 4 times more signi cantly DE genes between cures and TF in Mo, when compared to Eφ and Nφ (Table S5), consistent with higher transcriptional activity of Mo.Those DE pro les are represented in the form of volcano plots (Fig. 4B).However, when a |log 2 FC| ≥ 1 threshold was established, a comparable number of DE genes was observed between cures and TF among the three cell types: 113 in Mo, 190 in Eφ, and 113 in Nφ (Table S5).Although more than 70% of signi cant DE genes with |log 2 FC| > 1 were unique to each cell type (Fig. 4C), gene enrichment analyses showed common signi cantly enriched features between the different innate cells (Table S6).Among up-regulated genes, enrichment of type I IFN responses was common to Mo, Eφ and Nφ from TF patients, and this was also supported by GSVA (Table S7).No similarities in gene enrichment analysis were found for down-regulated genes among cell types, with the exception of MHCII related genes (discussed below) (Table S6).
In Eφ, a robust induction of IFNα/β signaling was revealed by an up-regulated gene pro le which included innate immune receptors, signaling molecules, transcription factors and regulators of signaling pathways (Table S5).The gene encoding IRF7 was up-regulated in TF Eφ, which together with IRF3, are the canonical transcriptional regulators of Type I IFNs 17 .IFIH1 (gene encoding MDA5), a RIG-I-like receptor and cytoplasmic sensor of dsRNAs, was also up-regulated in TF.MDA5 has been demonstrated to be an ampli er of innate immune responses and associated with autoin ammation 18 .Genes encoding downstream effectors (including OAS1, OAS2, OAS3, OASL, BST2, MX1, MX2, IFI6, XAF1, GBP2, and IFI27), and pathway regulators (IFIT5, ISG15, RSAD2, USP18, HLA-G, DHX58, and DDX60) were included in enriched gene categories in Eφ (Fig. 5), further substantiating a signature of enhanced Type I IFN responses in TF patients.Type I IFN response pathways were also enriched in Nφ transcriptomes from TF (Table S6 and Table S7).Notably, genes encoding HERC6, IFI44L, ISG15, USP18, IFI27 and DDX60 (Fig. 5) were also expressed at higher levels in TF Nφ.Consistently, monocyte transcriptomes from TF patients were also enriched in mRNAs encoding Type I IFN-related genes, suggesting synergistic innate cell functions towards enhanced type I IFN in ammation in TF patients.Downregulation of IL1R1 and IL1R2 (decoy) receptors was observed in Mo from TF patients, and downregulation of molecules related to MHCII antigen presentation was found in all cell types from TF patients (Table S5 and Table S6).Genes involved in wound healing and cell proliferation (HBGEF, EGR1, and EGR3) were found downregulated in Eφ of TF patients.In Mo from TF patients, down-regulated antimicrobial peptide genes (including CTSG, LTF, CAMP, DEFA3, and LCN2) and immune receptor activity genes (including IL2RB, IL1R2, HLA-DQA1, HLA-DQB1, IL1R1, and CXCR4) were signi cantly enriched.Altogether, these results suggest that mechanisms underlying TF are associated with an enhancement of Type I IFN signaling, a dampening of antimicrobial effectors (antimicrobial peptides), and functions linking the innate and adaptive immune systems (antigen presentation).
In a complementary approach to DE analyses, we employed weighted gene co-expression network analysis (WGCNA) to identify co-expressed gene clusters associated with therapeutic response.The Mo data resulted in four gene modules with a signi cant association.The four modules comprised 478 genes altogether and were enriched in genes encoding ribosomal, mitochondrial, DNA nuclear activity and IFNα/β-inducible proteins (Table S8, Figure S5).In Nφ, three modules comprising a total of 581 genes were associated with therapeutic response and were enriched in cellular pathways similar to those found in our DE analyses: Type I IFN, MHC-II, DNA nuclear activity, glycolytic metabolism, cell migration, vesicular transport, cell death and the immunoproteasome.For Eφ, both associated modules comprised a total of 200 genes, with an enrichment of genes related to Type I IFN responses consistent with DE analysis, and Nφ and Mo WGCNA.Of the total genes contained within the signi cant modules from the three cell types, 36 were shared and all were up-regulated in TF.Among those, 24 were related to the type I IFN response (Figure S5 IFNα/β-inducible genes constitute a hallmark signature of TF in CL Based on the functional commonalities between the cell-speci c transcriptional pro les, we recognized a common innate gene signature, from which we were able to identify biomarkers that predict TF.From the gene lists derived from our WGCNA and DE analyses, we selected the 10 common DE genes in Mo, Nφ and Eφ: IFI44L, IFI27, PRR5, PRR5-ARHGAP8, RHCE, FBXO39, RSAD2, SMTNL1, USP18, and AFAP1.Each of those common genes were up-regulated in TF, with the exception of AFAP1, which was downregulated.We therefore constructed a predictive composite score for each patient based on this innate gene signature (excluding AFAP1 because it was downregulated in TF).RPKM values were extracted from all Pre-Tx samples for each cell type.Those were selected for their predictive value in guiding early therapeutic interventions.RPKM values were used to construct raw and normalized scores (Z-score), the latter to account for possible outliers within groups.Both raw and Z-scores were higher in TF patients, and this difference was statistically signi cant for the scores derived from Nφ, and for the Mo + Nφ composite score (Fig. 6A-C and Fig. S6).Consistently, raw and Z-scores from Pre-Tx Nφ samples and the Mo + Nφ scores, were signi cantly predictive of the therapeutic response (Fig. 6E, F).
We next applied a machine learning approach to carry out a comprehensive analysis of all data on hand and complement our focused DE analyses.The expression matrices of all samples obtained from both clinics (Tumaco and Cali) were split into 10 rounds of training (40%) and testing sets (60%), through random partitioning and cross-validation.The training sets were used to generate K nearest neighbors (KNN), logistic regression (GLM), gradient boost (XGBoosted GLM), and random forest models 19 .Following an evaluation of the resulting models by comparing the predictions of the training data to the known clinical outcomes (Table S9), we used those models to predict clinical outcomes in the test partition and evaluated their performance with an emphasis on a dataset that only included Pre-Tx samples as these would be the most translatable for clinical and therapeutic decision making.The GLM model appeared to best predict the clinical outcome with good overall performance (accuracy 0.73/0.89,sensitivity 0.74/0.95,speci city 0.95/0.80,and AUC 0.6/0.85; each pair of values is the mean of Pre-Tx / mean of all visits) (Table S9).As expected, addition of samples from subsequent visits enhanced the performance of all models (Table S9).

DISCUSSION
The response to antimicrobial chemotherapy has been primarily believed to depend on the drug susceptibility of the etiological agent, patient adherence to the therapeutic scheme, and intrinsic differences in drug exposure (PK).It is known that immunosuppression negatively impacts the e cacy of antileishmanials, both in murine models of infection [20][21][22][23][24] and in humans 25,26 .However, approximately 25% of immunocompetent individuals present with TF in controlled clinical trials [27][28][29] .In this study we elucidated a central role for Type I IFN innate in ammatory responses in TF during systemic treatment with meglumine antimoniate (Glucantime®).
Type I IFNs (IFNα and IFNβ) are rapidly induced during viral infections, and are central to the antiviral response 17 .However, their role in infections with intracellular bacteria or protozoan parasites remains elusive.A low dose of IFNβ protected mice from progressive CL caused by L. major 30 , and this was related to induction of iNOS, NK cytotoxicity, and early production of IFNγ 31 .In visceral leishmaniasis (VL) caused by L. donovani, IFNα/β acts as an upstream suppressor of anti-parasitic T H 1 cells, and IFNAR1 -/-mice better control infection compared to wild type 32 .These results provide evidence of both protective and pathogenic roles of Type I IFNs in leishmaniasis, which are likely dependent on the IFNα/β concentration and downstream regulation of the response.In human PBMCs, pharmacological blocking of IFNα/β resulted in antigen-speci c increase of IFNγ production, and this was reverted by inhibition of MHCII (HLA-DR) antigen presentation 32 .Interestingly, heightened expression of IFNα/β-stimulated genes (ISGs) in Mo, Nφ and Eφ of TF patients was consistently accompanied by signi cant dampening of MHCII gene expression.This suggests that, similar to what occurs in VL, an impaired protective immunity mediated by type I IFNs via antigen presenting cells, could be occurring in CL.
Recent evidence shows that Type I IFNs promote pathogenesis and severity of M. tuberculosis (MTB) infection in both mice and humans 33,34 .This was associated with dampening of protective IL-1 signaling via eicosanoid imbalance.In monocytes from TF, high type I IFN-inducible gene expression was accompanied by repression of IL1R1 and IL1R2 (decoy receptor), and increased IL1β.Downregulation of IL1R1, even in the context of up-regulated IL1β, suggests that the cross-balance of Type I IFNs and IL1 is not only relevant to the pathogenesis and severity of TB, but also to the outcome of CL treatment.
Severity and tissue damage during infectious diseases is often mediated by immunopathology caused by exacerbated and uncontrolled in ammatory responses.During viral infections, IFNα/β promote CD8 + T cell longevity and clonal expansion 35 , as well as NK cell functions 17 .Notably, Pre-Tx lesion biopsies from TF patients exhibited a transcriptional pro le compatible with enhanced cytolytic activity mediated by CD8 + T and NK cells, similar to what observed in TF patients infected with L.V. braziliensis 15 .It is plausible that the heightened Type-I IFN responses, mediated by systemic innate in ammatory cells, could contribute to skin immunopathology driven by CD8 + T and NK cells in lesions of CL patients who do not respond to treatment.
The mechanisms by which IFNα/β-inducible genes are modulated during Leishmania infection remain unknown, and constitute part of our ongoing investigations.In L. Viannia infections, expression of IFNα/ β and its contribution to disease severity has been proposed to be mediated by the Leishmania RNA virus (LRV).A systematic evaluation of the presence of LRV in more than 100 L.V. panamensis clinical strains did not show evidence of the presence of LRV in this species 36 .This was consistent with the absence of LRV in all L.V. panamensis, as well as in L.V. braziliensis clinical isolates from our study participants, ruling out any contribution of LRV to the Type I IFN signature of innate cells observed in TF patients.We have previously shown that L.V. panamensis induces TLR4 gene expression as early as 8h after interaction with human monocyte-derived macrophages 37 .Intracellular parasite survival and TNFα production were found to be dependent on TLR4 37 .Furthermore, using RNA-seq, we have shown induction of a Type-I IFN signature in human PBMCs, occurring as early as 24h after L.V. panamensis infection with strains associated with chronic CL, and not with those causing self-healing disease 38 .These previous ndings lead us to hypothesize that rapid and, likely, strain-speci c TLR engagement by Leishmania could induce a rapid (8h) IFNα/β gene expression, leading to a second wave of IFNα/βinducible genes as early as 24h post infection.
The slow discovery pipeline for novel antimicrobials, and especially for those causing neglected tropical diseases, has driven the development of innovative approaches such as host-directed therapies (HDTs).
The basis of HDTs relies on detailed understanding of the role of host responses in pathogenesis and clinical outcome of infections.However, HTDs for leishmaniasis, including immunomodulation, have often been based on knowledge of the contribution of immune responses to disease in animal models, resulting in failed clinical trials [39][40][41][42] .Understanding the innate factors driving therapeutic healing of CL offers a unique opportunity for rational identi cation of HTDs that optimize available therapeutic regimens, and can capitalize on past and current pharmaceutical developments in modi ers of innate immune functions.Interestingly, antibody-mediated blocking of IFNαR in mice, or the use of FDAapproved ruxolitinib (a small molecule inhibitor of JAK1 and JAK2), synergized with amphotericin B to control L. donovani infection 32 .Our data suggest that modulation of Type I-IFN responses is a likely target for host-directed therapy in CL caused by L. Viannia.
The high frequency and severity of adverse events and of TF during antileishmanial treatment demands strati cation of therapeutic interventions to populations where they will be most effective.Results from this study revealed a transcriptional Type-I IFN innate signature of TF.This signature allowed construction of a composite score, with signi cant speci city and sensitivity to predict TF before initiation of treatment.Although our sample size presents an evident limitation for successful implementation of machine learning techniques, GLM models were good predictors of outcome, albeit using the top 3000 most variable genes.The area under the curve (AUC) achieved using our composite scores, which incorporate 9 genes, closely mirrored the AUCs obtained with machine learning models that included all 3000 genes.This similarity not only underscores the robustness of our selected genes but also validates the effectiveness of our hallmark in capturing the essential genomic signatures.
Notably, the score derived from Nφ, and the composite score from Mo + Nφ were both predictive of TF.That Nφ are the most abundant white blood cells in blood, and monocytes one of the most transcriptionally active, supports the likelihood of developing whole blood tests for future validation, which also facilitate access in remote rural populations.Similar scoring systems have assisted screening in other immunological systems.IFN-scores based on gene expression pro les of Type-I IFN stimulated genes have been used as screening tools for monogenic interpheronopathies, and to stratify patients with systemic lupus erythematosus 43,44 .Genes reported in these scoring systems often differ between diseases.However, four of the ten genes that composed our IFN-signature (IFI27, IFI44L, USP18 and RSAD2), have been consistently reported as signature members in other autoin ammatory diseases, and used for clinical assessments 43,45 .
Personalized medicine is anticipated to be restricted to developed countries widening the disparity between "the rich and the poor" 46 .Technology-driven research, systems medicine and genetic knowledge should reduce health care disparities rather than exacerbating them.Our study provides a solid rst step towards validation and implementation of personalized medicine for CL, one of the most neglected tropical infectious diseases of global importance.

MATERIALS AND METHODS
Ethics statement.This study (IRB code #1273) was approved and monitored by the Institutional Review Board for ethical conduct of research involving human subjects of the Centro Internacional de Entrenamiento e Investigaciones Médicas (CIDEIM) in accordance with national (Resolution 008430, República de Colombia, Ministry of Health, 1993) and international (Declaration of Helsinki and amendments, World Medical Association, Fortaleza, Brazil, October 2013) guidelines.All individuals voluntarily participated in the study and written informed consent was obtained for each participant.
Study design and subjects.This study was designed to identify host biomarkers and innate immune functions that participate in the response to antileishmanial treatment in CL patients.Transcriptional pro ling of innate immune cells and lesion biopsies was conducted.Adult patients (18 to 60 years of age) with parasitological diagnosis of active CL with a time of evolution < 6 months, and without apparent immune de ciencies (negative HIV test, no evidence of immunological disorder or treatment with medication having immunomodulating effects), who received standard-of-care treatment with Glucantime (GLUC, 20 mg/kg/day for 20 days) or miltefosine (MLF, 1.8-2.5 mg/kg daily dose for 28 days) were included in this study.Treatment outcome was evaluated at week 13 following initiation of treatment for GLUC and at week 26 for MLF.Cure was de ned as complete re-epithelization and absence of in ammatory signs for all lesions.TF was de ned as incomplete re-epithelization and/or the presence of induration, raised borders, or redness in any lesion; reactivation of the original lesion(s); or the appearance of new lesions.
Skin lesions biopsies samples.Skin lesion punch biopsies were obtained before initiation of treatment.Biopsy punches of 3 mm were obtained under local anesthetic, taking into account the following ratio: 1/3 of healthy skin and 2/3 of the edge of the lesion (the indurated edge, which does not include necrotic tissue).Skin biopsies were immediately stored in 1 mL Allprotect® (Qiagen).Samples were equilibrated overnight at 4°C and then stored at -20°C until processing 47 .
Isolation of monocytes, neutrophils and eosinophils from peripheral blood samples.Ninety mL of whole blood anticoagulated with EDTA was obtained from each patient.PBMCs and polymorphonuclear leukocytes (PMNs) were isolated by centrifugation over a Polymorphprep™ (Axis-Shield) gradient according to the manufacturer's instructions.CD14 + Mo were puri ed from PBMCs using the CD14 microbeads ultrapure kit (Milteny Biotec) coupled to magnetic cell sorting (MACS).CD16 + Nφ were puri ed from PMNs using CD16 microbeads human kit (Milteny Biotec) and Eφ were obtained by negative selection using the eosinophil isolation kit human (Milteny Biotec).Cells were washed with cold PBS, precipitated by centrifugation, and the cell pellet was resuspended in 100-200 µL of RNAlater (Ambion) and stored at -80°C for later use.Purity of isolated cell populations was evaluated by ow cytometry and light microscopy.Samples were used only when purity was > 98%.
Leishmania strains, typing and drug susceptibility testing.Leishmania isolates were obtained from all patients and propagated in Senekjie´s biphasic blood agar and immediately stored in liquid nitrogen until use.Strains were typed by immunoreactivity to monoclonal antibodies as previously described 48 .Drug susceptibility of intracellular amastigotes was estimated by evaluation of % parasite survival in PMAdifferentiated U-937 cells after exposure to pentavalent antimony (SbV) at a nal concentration of 32 µg/mL, compared to control without drug exposure.Leishmania strains were de ned as Sb-resistant when percent reduction of the parasite burden after drug exposure was < 78%.Susceptibility cutoff was de ned based on a panel of well-characterized clinical isolates presenting with intrinsic resistance or susceptibility to SbV 49,50 .
RNA isolation and cDNA synthesis.RNA isolation from puri ed cell populations (Mo, Nφ and Eφ) stored in RNAlater was performed using TRIzol™ (Invitrogen, USA), followed by RNA cleanup with RNeasy Mini Kit columns (Qiagen, USA).Isopropanol/water (1:1) was used for RNA precipitation.RNA isolation from lesion biopsies was performed by tissue disruption, homogenization and extraction using TRIzol™ reagent as previously described 47 .RNA integrity was assessed using an Agilent 2100 bioanalyzer.For RNA-seq, poly(A)-enriched cDNA libraries were generated using the Illumina TruSeq v2 sample preparation kit (San Diego, CA) and checked for quality and quantity using bioanalyzer and quantitative PCR.
RNA-seq data generation, preprocessing, and quality trimming.Global data assessment, visualization and differential expression analysis.Biological replicates and batch effects were assessed and visualized using an R package, hpgltools (https://github.com/elsayedlab/hpgltools),developed and maintained in the El-Sayed lab.Normalized data were visualized using log 2 transformed counts per million (CPM) reads following ltering to remove low counts (de ned as any gene with a sum less than twice the number of samples or when all samples had fewer than 2 counts).Samples in which fewer than 11,000 genes were observed in a non-zero genes plot, were removed.Following data ltration, visualizations were performed to observe the sample relationships; these included density plots, boxplots of depth, coe cient of variance, hierarchical clustering analyses based on Pearson's correlation coe cient and Euclidean distance, variance partition analyses, and principal component analyses (PCA) before and after normalization.Several combinations of normalization and batch adjustment strategies were evaluated along with surrogate variable estimation via SVA 55 .Samples were queried via cure/TF status, visit number, clinic, and cell type in order to calculate the surrogate variable (SV) loadings, and the F-statistic was calculated for each variable with respect to each SV.
Differential expression analyses were performed using a single pipeline which performed all pairwise comparisons using the Bioconductor packages: limma 56 , edgeR 57 , DESeq2 58 , EBSeq 59 , and an explicitly basic analysis (using only Log 2 CPM values).In each case (except EBSeq and the basic analysis), the surrogate variable estimates provided by SVA were used to adjust the statistical model in an attempt to address the batch/surrogate effects.Each contrast was evaluated in context of its agreement with other methods, but the interpretations were primarily informed by the DESeq2 results.Detailed information on the analytical pipeline and scripts is available in Supplementary Material.Genes with signi cant changes in abundance (|Log 2 fold change| ≥ 1 and false discovery rate adjusted P values ≤ 0.05) were passed to gPro leR 60 and GSVA 61 .Gene ontology analyses were supplemented with manual data curation.Network analyses were performed with STRING 11.0 62 .Simultaneously, gene set variation analysis (GSVA) was performed to produce an enrichment score against the mSigDB 63 datasets (C2, C7, and H) on a per-sample basis.These scores were passed to limma to evaluate the difference in GSVA score distributions for each gene set in the samples.Results from limma were then ltered according to log 2 fold change, adjusted P value, and maximum GSVA score mean.
Detection of virus sequences.Kraken 2, with a supplemented version of its viral database 64 , was used to check each sample speci cally for the Leishmania RNA virus (LRV), as well as any putative viral reads.
Con rmation of LRV absence was performed by qRT-PCR as described 36 .
Weighted gene co-expression network analysis (WGCNA).WGCNA co-expression networks were generated and examined using low-count ltered, SVA-adjusted, RPKM (Reads per kilobase per million) by average CDS length, log 2 -transformed counts as input.Pairwise Pearson correlations between each gene pair were calculated and transformed into a signed adjacency matrix using the minimum power that resulted in a scale-free R 2 t of 0.8.The resulting modules and associated eigengenes were produced via the default correlation matrix blockwise module detection methods from WGCNA.Module scores by sample were queried against the metadata factors (cure and TF) via Pearson's correlations and scored using the P value metrics provided by WGCNA.The eigengenes were extracted from modules with scores deemed signi cant, manually examined, and passed to gene set enrichment methods.Modules cantly associated with the and containing more than 1,000 genes were excluded from the analyses.
Composite scores.Pre-Tx RPKM values were used to construct raw and normalized (Z-score) composite scores.Selection of genes that constitute each individual composite was based on the DE genes between cures and TF that were common among Nφ, Eφ, and Mo (signature genes).Raw scores were calculated per patient, as the sum of RPKM values of the signature genes in each cell type (Eq.1).A normalized score was also computed based on the sum of normalized RPKM values (Eq.2).For raw and normalized scores, "I" indexes genes and "k" patients.
For each cell type: ( where and Receiver operating characteristic (ROC) curves were used to explore the predictive potential of raw scores and Z-scores to discriminate therapeutic cure and failure.Sensitivity and speci city parameters were calculated for scores.Youden's J statistic was used to de ne cutoff values.
Machine learning models.A series of machine learning models were and via the caret R package 19 to create transcriptome-informed classi ers of patients likely to cure or fail treatment. (rawscore

ACKNOWLEDGEMENTS
We acknowledge all patients who participated in this study and members of the CIDEIM Clinical Unit.We also acknowledge the support of the personnel of the CIDEIM BioBank, for their technical assistance in phenotyping of strains processed in this study, and Mariana Rosales for support with the identi cation of LRV in Leishmania strains.Here, and based on the results shown in Fig. S3 we modeled the 'clinic' variable in SVA and used the surrogate variables-modi ed counts to generate the PCA plots.The analyses were performed using all annotated protein-coding genes following ltering for low counts, cpm normalization and log 2 transformation.In the PCA plots, the rst two principal components are shown on the X and Y axes respectively, with the proportion of total variance attributable to that PC indicated.Each sample is represented as a single point with color indicating either the clinic (Panel A), or clinic, treatment and outcome (Panels B-E).The inner and outer colored ellipses represent the 90% and 95% con dence intervals, respectively.These are only shown for classes containing more than 3 samples.
, Panel C), and this was consistent with the DE analyses.Prominent in the WGCNA analyses were STAT1 and STAT2, two known transcription factors involved in Type I IFN signaling and expression of IFN-inducible genes.Both transcription factors were up-regulated in TF, supporting the coordinated and sustained Type I IFN effector functions elicited downstream of IFN receptor ligation.Together, WGCNA and DE analyses support the participation of a systemic proin ammatory environment sustained in TF, mediated by Type I IFNs.

Figure 3 Assessment
Figure 3

Figure 4 Global
Figure 4

Figure 5 Network
Figure 5 54ngle and paired-end reads were obtained on an Illumina NovaSeq 6000 at the Genetic Resources Core Facility, Johns Hopkins Department of Genetic Medicine, Baltimore, MD; or on an Illumina HiSeq1000 at the Brain & Behavior Institute -Advanced Genomic Technologies Core (BBI-AGTC) at the University of Maryland, College Park, MD.Trimmomatic 51 was used to remove Illumina adapter sequences, discard reads shorter than 40 nucleotides, and trim any 4 nucleotide rolling window with a mean Phred quality score less than or equal to 20.Sequence quality metrics were assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/).Raw data and project information available via NCBI-dbGaP study ID # 38338; phs003545.v1MappingcDNAfragmentsandabundance estimation.Reads were aligned against the human (hg38 revision 100), L.V. panamensis (TriTrypDb release 36), and L.V. braziliensis (release 26) genomes with HISAT2 (2.1.0) 52g the default parameters.The resulting accepted hits and mapped reads were sorted and indexed via SAMtools53and passed to HTSeq54for generating count tables.

2 1 m
Initial expression sets were selected to include all data from innate cells from patients recruited in Cali and Tumaco, or data only from Tumaco patients.The starting data were log 2 -transformed CPM values, normalized, ltered to exclude genes with CV < 0.1, centered, ltered to exclude genes with correlations ≥ 0.95.The most variable 3,000 genes were then selected.The remaining data was split into training (0.4) and testing (0.6) sets 10 times.The training datasets were used to create k-nearest neighbor, random forest, GLM, and gradient boost models with an arbitrarily chosen mix of bootstrap and CV sampling.The test partitions were evaluated for accuracy and sensitivity/speci city with respect to the known outcome of each patient.Statistical analyses.For the exploration and description of the sociodemographic and clinical variables, univariate analyses were performed.Categorical variables were described with frequencies and percentages.Quantitative variables were described as means (± SD) or medians (IQR) according to the distribution of the data.For the comparison of qualitative variables, Fisher's exact test or the chi2 test was used according to the distribution of the data.Fisher´s test was also used for comparison of tables larger than 2x2 65 .Quantitative variables were compared using t-test or U-Mann-Whitney tests.Normality was determined with qq plots and the Shapiro wilk test.In all analyses P-values < 0.05 were considered signi cant.Statistical analysis was performed using GraphPad Prism version 9 and R version 4.1.3.