Leishmania subpopulations identified through scRNA-seq
To define the transcriptional profiles of L. major throughout its development in P. duboscqi, we infected female sandflies with lesion-derived RFP+ (red fluorescent protein) amastigotes (Day 0; Extended data Fig. 1a). Midguts from 15 sandflies were dissected at 2-, 5- and 9-days post-infection (p.i.), homogenized and RFP+ promastigotes were sorted to remove tissue debris (Fig. 1a, Extended data Fig. 1a-c). The time-points were based on prevailing observations regarding Leishmania developmental stage morphology during sandfly colonization, as well as bulk transcriptome analysis7. Microscopic examination on day 2 revealed a predominant procyclic population with a minority of nectomonad forms. On day 5, leptomonad and nectomonads predominated, with limited detection of metacyclics. By day 9, most promastigotes were fast-swimming metacyclics colonizing the thoracic midgut with mainly leptomonads and haptomonads (Fig. 1b, Extended data Fig. 1a-c). For this time point, only midguts with a well-developed plug were selected for downstream processing (Extended data Fig. 1b, Day 9).
Visualization of the data after unsupervised cell clustering and dimensionality reduction (UMAP, Uniform Manifold Approximation and Projection) highlighted the differences between cells in each time point and the expected separation between amastigotes (Day 0) and promastigotes. Within the promastigote populations, the transcriptomic signature of cells obtained from Day 2 was remarkably different from those in later time-points (Days 5 and 9), (Fig. 1c). Unsupervised cell clustering demonstrated the transcriptomic heterogeneity in sandfly-isolated promastigotes populations days 5 and 9 (Fig. 1d). The distribution of cells among the six clusters in the UMAP plots at Days 2, 5 and 9 (Fig. 1d), and the gene expression profile (Fig. 1e) indicated high cell heterogeneity within clusters 4 and 5.
Cluster 4 contained a major and a minor sub-cluster which shared markers with cluster 2 (named cluster 4_2), and cluster 1 (named cluster 4_1), respectively (Fig. 2a,b). Since cluster 1 was mostly represented at day 2, cluster 4_1 was likely an early-differentiation state of the next evolutive form, which was confirmed by defining the two subpopulations using time-regulated markers (Extended Data Fig. 3a-c, clusters 4_1 and 4_2). In addition, cluster 4_1 shares markers with both clusters 1 and 4_2 (Fig. 2b, Supplementary Table 1). Of note, cluster 4_1 does not have exclusive markers. Cluster 5 also consisted of two discrete sub-populations based on the cell distribution in the UMAP and the expression of a few genes, such as LmjF.27.1770 (Frag1/DRAM/Sfk1 family), (Fig. 1d,e). Examples of markers that define either cluster 5_1 or 5_2 are shown in Extended data Fig. 3d-f. Using this second approach of annotation of the data structure, we identified eight discrete Leishmania clusters (Fig. 2a).
Next, we used Slingshot9 to estimate the temporal ordering of individual cells along an inferred developmental trajectory where cells are ordered by a pseudotime value that can be interpreted as corresponding to a developmental stage. The two lineages inferred by the algorithm initially follow the same course, with cells advancing in the following order: 3, 1, 4_1, 4_2, and 2. However, they subsequently diverge, with one lineage progressing from cluster 2 to 5_1, and ultimately to 5_2, while a second lineage diverges into cells from cluster 6 (Fig. 2a). This challenges the prevailing paradigm suggesting a linear Leishmania differentiation process, where metacyclic promastigotes are considered the final developmental stage in the sandfly1.
The only well characterized markers reported to date that can distinguish between different developmental stages of Leishmania promastigotes are those used to identify culture- and sandfly-derived metacyclics. Among these, SHERP (small hydrophilic endoplasmic reticulum-associated protein) is the most well-characterized and widely used7,8,10–13. SHERP is found on the cytosolic surface of the endoplasmic reticulum and mitochondria10 and is associated with survival and development in the sandfly14. Unequivocal detection of the three different SHERP genes in L. major (LmjF.23.1050, LmjF.23.1080 and LmjF.23.1086) is not possible by our methodology, since their 3’UTRs (untranslated regions) are indistinguishable by sequencing. Therefore, we use the expression of LmjF.23.1050 to define SHERP levels, as the sequencing reads were randomly aligned to this gene copy during the analysis. While clusters 1 and 4_1 were SHERP negative, all clusters from days 5 and 9 (Fig. 2a – dotted line) upregulated SHERP, albeit at varying levels. We classified these clusters as SHERPNeg (1 and 4_1), SHERPLo (3, 4_2 and 2), SHERPInt (6), and SHERPHi (5_1 and 5_2), based on their average expression profiles (Fig. 2c, Extended Data Fig. 4c).
Inferred developmental stage assignments of discrete clusters
Based on the anticipated Leishmania development within the sandfly midgut15, the pseudotime trajectory, as well as the top upregulated genes and SHERP expression, we designated clusters as: 3 - amastigotes, 1 - procyclics, 4_1 - procyclic to nectomonad transition, 4_2 - nectomonads, 2 - leptomonads, 5_1 - early metacyclics, 5_2 - late metacyclics and 6 – late leptomonads/haptomonads (Fig. 2d,e). Top markers in each cluster are presented in Fig. 2b and d, and the complete and detailed rationale for manual annotation of the clusters is described in Supplementary Methods.
Amastigotes and procyclics were identified by their early positioning on the developmental trajectory, and the marked expression of δ-amastins and promastigote surface markers (e.g. PSA2), respectively. Procyclics (SHERPneg; Day 2) and nectomonads (SHERPLo; Day 5) were connected along the trajectory by a cluster of cells in a transient stage of differentiation expressing markers of both stages (Fig. 2b,c). Nectomonad markers included genes described in a previous bulk RNA-seq analysis to be upregulated in this stage in comparison to procyclics and metacyclics, such as NEK157.
At Day 5, leptomonads (SHERPLo; Day 5 and 9) emerged as a prominent population (Fig. 2f), and although not characterized by a set of exclusive expressed markers, the top 5 upregulated genes are annotated as amino acid transporters and permeases (Extended Data 5, Supplementary Table 1), suggesting a role in adaptation to metabolic changes and increase in proliferation. Our pseudotime analysis suggests leptomonads serve as the final shared state between the two distinct developmental lineages before the branching point (Fig. 2a). The SHERPInt population (Days 5 and 9) was marked by the expression of PPG3, a major component of the PSG plug, and a calpain-like cysteine peptidase (Fig. 2b), and genes previously associated with expression in haptomonads (Extended Data 6a)16.
At days 5 and 9, we identified two well-defined SHERPHi clusters, designated early and late metacyclics. Both clusters upregulated other genes in common (e.g.; surface antigen protein LmjF.12.0990), but some genes in early metacyclics are unique, including a histone lysine methyltransferase DOT1B and a mitochondrial DNA LIGkα. The zinc-metalloprotease GP63-4 was identified as an exclusive marker in late metacyclics (SHERPHi GP63-4+). Altogether, our data suggest that the SHERPHi DOT1B+ cluster represents an early stage in the differentiation of metacyclic promastigotes, with the terminal stage represented in the SHERPHi GP63-4+ cluster (Fig. 2a,c).
Based mainly on promastigote development in Lutzomia longipalpis sandflies, leptomonads are argued to be the precursors to metacyclics, with their detection coinciding with the expansion of the population following migration to the anterior midgut1. We employed cell cycle stage regulators previously identified for Trypanosoma brucei procyclics (Briggs et al., 2023) to distinguish replicating stages. While the anticipated upregulation of Leishmania S- and G2/M regulators' homologs was concentrated among a subset of cells in procyclics, their expression in leptomonads seemed diminished and more dispersed (Extended Data 6b). Unexpectedly, we observed a striking upregulation of late G1, S and G2/M markers in the majority of early metacyclics, whereas few nectomonads, late metacyclics and haptomonads expressed these genes (Extended Data 6b). This suggests that early metacyclics could serve as a replicative stage in mature sandfly infections.
The pseudotime values obtained for our development dataset were also used to explore the shifts of gene expression along the trajectory using TradeSeq17. TradeSeq employs statistical models, specifically Generalized Additive Models (GAMs), to infer smooth functions associated with the pseudotime trajectory. These functions facilitate the visualization and interpretation of temporal aspects related to genes involved in development. Both lineages exhibited coordinated gene regulation during the initial phase of their shared development. As pseudotime progressed towards the midpoint, both lineages displayed a second group of upregulated genes, coinciding with the downregulation of the previously active genes. This observation supports an ongoing differentiation (Fig. 2g, Supplementary Table 2). Despite our identification of 8 clusters and their linkage to stages described for Leishmania, gene regulation occurs in two clearly distinct temporal blocks. This reflects both the strong association observed across all early developmental stages, emphasized by the absence of bonafide markers for nectomonads and leptomonads, and the unique signature of the final stages, suggesting specialization and distinct functions.
As expected, genes upregulated in cells committed to lineage 1 were linked to markers from late metacyclics (Surface antigen protein [LmjF.12.0990]; Frag1/DRAM/Sfk1 [LmjF.27.1770]; and Beta tubulin [LmjF.33.0820]), while those upregulated in cells committed to lineage 2 were linked to markers from the SHERPInt PPG3+ cluster (PPG3 [LmjF.35.0500], Calpain-like [LmjF.20.1185], Hypothetical proteins [LmjF.26.2680, LmjF.31.1440, LmjF.31.1445], ATG8 [LmjF.09.0158], DUF1935 [LmjF.34.0190]). SHERP (LmjF.23.1050) emerged as the most strongly associated with the early divergence of the lineages. HASPB (LmjF.23.1060) and GAPDH (LmjF.36.2350) were upregulated in cells committed to lineage 1. By associating cell positioning with pseudotime and its stage annotation, and aligning with the smoother function of each gene, we discerned the segregation between the lineages. While some genes were linked to one of the lineages as early as the procyclic stage (e.g., SHERP, HASPB, Surface Antigen Protein, and Hypothetical Proteins), others emerged only in fully differentiated stages (e.g., PPG3, Calpain-like, Frag1/DRAM/Sfk1, and GAPDH), (Fig. 2h).
Reporter cell lines distinguish three late-infection stages
To validate the assignment of specific clusters to distinct stages and to corroborate the scRNA-Seq data, we employed CRISPR-Cas9 to endogenously tag putative stage-specific markers, using either mNeonGreen (mNG) or mCherry (mCH), (Fig. 3; Extended data 8). The reporter genes were chosen based on their selective expression during late infections (Fig. 3a,c,e). The images shown to visualize the reporter lines are confined to the thoracic midgut region behind the stomodeal valve, showing partial protrusion of the plug and escape of free-swimming promastigotes, caused by the removal of the head during midgut dissection. The late leptomonad/haptomonad reporter cell line (mNG::31.1440) was observed in association with the plug and the thoracic midgut, apparently excluding the stomodeal valve (Fig. 3b). Parasites found free or weakly associated with the plug did not express the reporter protein. To distinguish between the two metacyclic populations, we tagged either LIGkα (early metacyclic; Fig. 3d) or the hypothetical protein encoded by LmjF.19.0540 (late metacyclic; Fig. 3f) using mCherry. Notably, the SHERPHi populations exhibited different midgut interactions, with some early metacyclics found attached to the thoracic midgut and cardia regions by an extended flagellum (Fig. 3d, indicated by the white arrow), while late metacyclics displayed a free-swimming behavior (Fig. 3f). Labeled promastigotes of the double fluorescent, reporter cell line mNG::31.1440 and mCH::19.0540 were embedded in the PSG plug or were free swimming, respectively. Dual fluorescent forms were not observed, confirming that each marker is expressed by a discrete stage in mature infections.
To further characterize the morphology of the marked stages, we conducted correlative light and electron microscopy (CLEM) on promastigotes recovered from midguts infected with the reporter double fluorescent lines. Using the fluorescent signals to map the coordinates of the populations of interest on the coverslip, the cells were imaged by scanning electron microscopy (SEM). The SHERPIntPPG3+ subpopulation displayed a large cell body and a very short flagellum, with many coated with what appears to be midgut tissue, possibly acquired from their sites of attachment (Fig. 4a-h). Indeed, we detected fragments of tissue with mNG::31.1440 reporter cells still attached (Fig. 4a, c), while others appeared mechanically dissociated with their short flagellum covered with pieces of the tissue (Fig. 4b,g-h). Additionally, a protein-like matrix covered the cell body of several of these cells, likely the PSG (Fig. 4b,f, e). Finally, some mNG::31.1440 cells displayed a small, uncoated flagellum, including some with an enlarged tip (Fig. 4d-f). By integrating transcriptome analysis with morphological characterization of the SHERPIntPPG3+ cells, we designate the SHERPIntPPG3+ cells as haptomonads.
Both early and late metacyclics exhibited the typical metacyclic morphology with a slender, tapered cell body and long flagellum (Fig. 4i-r). However, early metacyclics were also found attached to fragments of the midgut tissue (Fig. 4i, white arrows). Most interesting was the evidence of dividing cells with a second emerging flagellum (Fig. 4m-n). The late metacyclic cells, by contrast, were uniformly unattached and had a single, long flagellum (Fig. 4o-r).
We utilized our reporter cell lines to further assess the cell cycle attributes of the specific stages. On day 9 p.i., the pooled promastigotes collected from 50 sandflies infected with either of the double reporter cell lines revealed that the haptomonads in each case showed approximately three times more cells in the G2/M stage compared to the total population. The early metacyclics exhibited an increase in DNA content, with 27% of cells in S phase and 31.3% in G2/M phases (Fig. 5a), that was associated with their having the highest frequency of cells with 2 nuclei and 1 kinetoplast (2N1K) or 1N2K (Fig. 5b). The cell cycle profile of late metacyclics was similar to the total population, and the proportion of cells containing 2N1K, 1N2K, or 2N2K, was lower compared to the other groups (Fig. 5d).
The Leishmania meiosis toolbox does not delineate a discrete sexual stage
Leishmania development in the vector includes a non-obligatory, cryptic sexual cycle18. Previously, we described the up regulation of 2 genes encoding proteins involved in gamete fusion (HAP2-1 and HAP2-2), and a nuclear fusion protein (GEX1), in hybridizing competent cells in vitro18,19. We also described the requirement for HAP2-2 and HOP-1, a component of the synaptonemal complex (SC), for successful mating in the sandfly20. Overall, expression of the gamete fusion and meiosis gene homologs in L. major was distributed across each of the clusters, including amastigotes. HAP2-1 (LmjF.35.0460), HAP2-2 (LmjF.36.3860) and GEX1 (LmjF.04.0100) were most strongly expressed by leptomonads, haptomonads and a subset of procyclics (Fig. 5e). A smaller percentage of cells demonstrated some level of expression for SC-related genes (Fig. 5e). Interestingly, cells that expressed any of the SC genes, except for HOP2, shared the exact same list of highly expressed genes (Fig. 5f; Supplementary Table 3), although the majority are hypothetical proteins. Annotated genes falling above the fold-change and p-value thresholds (avg_log2FC ≥ 1, p-value 5e-2) were involved in protein folding (LmjF.32.3300, chaperone protein DnaJ), DNA replication (LmjF.24.0990 and LmjF.36.6710, replication factor C, subunits 1 and 3, respectively), cell cycle control (LmjF.28.0030, cell division cycle 6, CDC6) and ubiquitination (LmjF.24.2140, ubiquitin conjugating enzyme), (Fig. 5f).
Effect of a second blood meal on population structure and gene expression
Recently, an uninfected, second bloodmeal taken by infected flies has been implicated in the amplification of parasite numbers, and the appearance of a morphologically distinct form termed retroleptomonad6,21. We allowed the flies to take a second blood meal (2BM) on uninfected blood on day 6 p.i., obtaining promastigotes 3 days later to investigate their RNA expression in comparison to infected sandflies that received a single blood meal (1BM - day 9), (Fig. 6a). We analyzed 19,311 cells (7,363 from the 1BM group, and 11,948 from the 2BM group) and, after unsupervised clustering, found that both groups were composed of 5 clusters, corresponding to the expected developmental stages as validated by the markers for each cluster (Fig. 6b-c,i). The frequency of nectomonad and leptomonad cells increased when compared to 1BM, with lower representation of haptomonads and late metacyclics (Fig. 6c, Extended Data 9a,b). We did not, however, discern the emergence of a distinct cluster defining it as a separate and identifiable stage in the life cycle. Within each cluster, we observed changes in particular genes after a second BM (Fig. 6d,e), some of which were associated with a shift in carbohydrate metabolism that can be linked to the change in the available carbon source (Fig. 6d, Supplementary Table 4).
The stage heterogeneity of transmitted promastigotes
To determine which populations were transmitted, approximately 100 sandflies from day 9 p.i. were provided with pre-warmed culture media in an artificial feeder to allow the recovery of Leishmania egested during feeding attempts. We collected media samples every 5 minutes over a total of 1 hour, immediately placing the samples on ice (Fig. 6f). We analyzed 9,169 cells (Day 9 – midgut: 7,363; Day 9 – Transmitted: 1,806) and annotated cell clusters based on markers defined by the development analysis. Surprisingly, the principal transmitted stage was haptomonads (82%), followed by late metacyclics (6.5%), nectomonads (6%), and early metacyclics (< 1%), (Fig. 6g-h). The transmitted cells exhibited a general downregulated expression of the genes defining the stages when compared to the cells present in the day 9 midguts (Extended Data 10f). We acknowledge the potential for artifacts in relation to their DEGs due to the change in environment, which we hoped would be mitigated by the 5-minute intervals for collection and immediate temperature reduction (4°C).
Comparing the infectivity of discrete and mixed transmitted stages in mice
The notable overrepresentation of haptomonads in the transmitted inoculum raises questions about their potential role in infection outcome. Using the double-fluorescent reporter cell lines, we infected two groups of sandflies and sorted each subpopulation on day 9 p.i., selecting flies with mature infections characterized by rapidly swimming metacyclics and a prominent plug. The following populations were obtained: mNeonGreen-positive (haptomonad), mCherry-positive (either early or late metacyclic, depending on the reporter cell line used), double-positive, and double-negative (representing other stages), (Extended Data 10a, b). The double positive events were most likely dead cells and debris since they failed to proliferate in a limiting dilution assay. All other populations survived and proliferated as expected (Extended Data 10c, d). We confirmed viability by staining the midgut homogenates (9 days p.i.) with a vital dye. The signal profile of mNeonGreen and mCherry populations was similar to the negative control (log-phase promastigotes), with only the double-negatives and the total population displaying some level of cell death (Extended Data 10e).
From the populations sorted from flies infected with the reporter cell line mNG::31.1440 and mCH::19.0540, BALB/c mice ears were intradermally infected with 10,000 cells (high dose) of each of three distinct populations: mNeonGreen+ (haptomonad), mCherry+ (late metacyclic), and double-negative (other stages). We reconstituted the total population to match the proportion of the respective stages observed in the transmitted inoculum determined by scRNA-Seq, up to a total dose of 10,000 promastigotes. Additionally, we infected groups of mice with proportionate combinations of late metacyclics and haptomonads, or late metacyclics and the negative population (other stages). Lastly, we infected ears with either early metacyclics (sorted from flies infected with the mNG::31.1440 and mCH::LIGkα reporter line) or late metacyclics alone, each at a low dose (650 cells) (Fig. 6i). All the inocula produced uncontrolled dermal lesions. Lesions appeared earliest (3–4 weeks) and progressed most rapidly in size in ears infected with 10,000 late metacyclics or with a mixture of 9,350 haptomonads and 650 late metacyclics (Fig. 6j, extended data 10g). In addition, these two groups and the reconstituted total population produced more severe pathology in comparison to the other groups (Fig. 6k). Each of the other groups, including the high dose haptomonads, showed a similar trend in lesion onset (4–5 weeks), development, and pathology. After 10 weeks, ears infected with the mix of late-metacyclics and haptomonads harbored significantly more parasites compared to ears from the groups infected with low dose metacyclics (p = 0.0328), but indistinguishable from all other groups (Fig. 6l). These findings indicate that metacyclics are not the only infectious stage transmitted by sandflies and suggest that the presence of haptomonads in the transmitted dose influences lesion severity and reduces the requirement for a high dose of infective metacyclics to achieve similar infection outcomes.