Transition between sexual and asexual stages of development defined by transcriptome
P. falciparum NF54-pfs16-GFP-Luc (42) parasites were induced to form gametocytes after 1.5 cycles of asexual development (3 days) and monitored for the next 13 days to capture gametocyte commitment and development to mature stage V gametocytes (Fig 1A). Tight synchronization of the asexual parasites ensured coordinated gametocyte development, and gametocytes (stage I) were observed in culture from day 0 onwards (Fig 1B). Morphological evaluation showed a shift from a predominantly asexual parasite population to >60% gametocytes by day 3 of gametocytogenesis following the removal of asexual stages (Fig 1B).
We measured mRNA abundance genome-wide using DNA microarrays that included 5792 annotated transcripts, each represented by >2.5 probes including ncRNA and tRNAs that could produce unique probes (43). Using two color arrays in which the Cy5 channel pooled total RNA from timepoints comprised of both asexual and sexual stages, we were able to easily normalize each gene at every timepoint to distinguish the timing of peak abundance throughout the entire time course and across developmental stages. In each sample, expression values were captured for 96-99% of the 5443 genes on the array (P<0.01, full dataset provided in Additional File 1), a 1.5-fold improvement in coverage compared to the 65% of the transcriptome (3410 genes) captured in the previously reported Young et al. dataset (36). Overall, the transcriptome of gametocytes is distinct from asexual parasites, as is evidenced by a clear shift in Pearson correlation between the transcriptomes of asexual parasites (day -2 to 2) and gametocytes (day 3 onward) (Fig 1C). Populations containing predominantly asexual parasites (days -2 to 2) were highly correlated across the first two 48 h cycles (r2=0.54-0.86, data provided in Additional File 2) and were characterized by periodic gene expression changes between the asexual ring and trophozoite stages (Fig 1C). From Day 3 onward, the transcriptional profiles diverged indicating a switch from asexual to sexual development, evidenced by a loss of the 48 h correlation pattern (Fig 1C). During subsequent days of gametocytogenesis, daily peak correlations were associated with developmental progression through stage I-II (days 3-5, r2=0.56-0.73), stage III-IV (days 6-9, r2=0.51-0.92), and mature stage V gametocytes (days 10-13, r2=0.50-0.84) (Fig 1C, data provided in Additional File 2), which corresponded to morphological transitions observed via Giemsa-stained thin-blood smears throughout the time course.
Conversion from asexual to sexual development was also clearly detectable in the expression profiles of individual genes required during asexual development (e.g. kahrp (pf3d7_0202000)) while sexual genes were only expressed during gametocyte development from Day 3 (44) (Fig 1D). The genes restricted to expression during sexual development include downstream targets of PfAP2-G (23) and markers associated with mature gametocyte sex-specificity (Fig 1D) (35) and 24 novel gametocyte-associated transcripts (data provided in Additional File 2). Among these transcripts were a putative ncRNA, three rRNAs and two tRNAs, suggesting that the expression of non-coding RNAs may not only play a role during gametocyte commitment (18) but also in gametocyte development and maturation in P. falciparum. Together, these data comprise a high-resolution P. falciparum blood stage developmental transcriptome that allows for the temporal evaluation of transcriptional abundance patterns associated with gametocyte commitment, development and maturation.
The gametocyte-specific transcriptional program reflects the molecular landscape of gametocyte development
To associate temporal gene expression to gametocyte commitment and stage transitions throughout development, the full 16-day transcriptome dataset was K-means clustered revealing 2763 transcripts with overall decreased abundance (clusters 1-5) and 2425 with increased abundance during gametocytogenesis (clusters 6-10, Fig 2A). Therefore, gametocytogenesis relies on a more specialized program of gene expression compared to asexual development, with only 45% of transcripts showing increased abundance during gametocyte development (Fig 2A) compared to the 80-95% of transcripts increased during specific phases of asexual development (11,19,45). Interestingly, individual clusters showed specific patterns of gene expression throughout gametocyte development (Fig 2A), with transcript abundance during gametocytogenesis either decreased following asexual development (clusters 1-3, 1042 transcripts); maintained (clusters 4-5, 1721 transcripts) or increased (cluster 6-7, 1571 transcripts). Three clusters (clusters 8-10) show transcripts with specific peaks in expression during development, indicative of developmental gene regulation.
Cluster 1 is predominantly comprised of critical asexual stage transcripts which showed a decline in abundance during gametocytogenesis, with up to 5 log2 fold (log2FC (Day3/Day1)) decreases in the expression of these transcripts between the ring and early gametocyte stages (Fig 2A). These transcripts include Maurer’s cleft proteins e.g. rex1 (pf3d7_0935900) and rex2 (pf3d7_0936000) as well as knob associated proteins that form part of the cytoadherence complex (kahrp (pf3d7_0202000), kahsp40 (pf3d7_0201800)), supporting earlier observations that the gametocytes mediate sequestration via different mechanisms than asexual parasites (46). Many of these cytoadherence associated transcripts are associated with heterochromatin protein 1 (HP1) occupancy during gametocyte development (47), and other HP1 (47,48) and H3K9me3 (17) repressed genes are also significantly enriched in cluster 1 (P<0.0001, Fisher’s exact test, genes listed in Additional File 3). This suggests asexual development-specific genes are actively repressed by epigenetic regulation throughout gametocyte development. Clusters 1-3 also contain transcripts involved in metabolic processes that are not critical to gametocyte development including genes encoding for enzymes of heme metabolism and glycolysis (Fig 2B, cluster 3, Additional File 1) as well as regulators of egress (pkg (pf3d7_1436600)) and invasion (bdp1 (pf3d7_1033700) and ap2-i (pf3d7_1007700)), all processes that are not necessary for gametocyte maturation (Fig 2A, cluster 2). Beyond these examples, clusters 1-3 also contain 214 unannotated genes that could be specifically required for asexual development only (Fig 2B).
Some transcripts show low abundance throughout gametocytogenesis (Fig 2A, clusters 4 and 5, average expression <0.1 log2(Cy5/Cy3), with amplitude change <0.5 log2(Cy5/Cy3)). These clusters include regulators of proliferation (e.g. origin of replication complex protein mcm4 (pf3d7_1317100), proliferating cell antigen 1 (pf3d7_1361900) and cyclin dependent kinase crk4 (pf3d7_0317200)). By comparison, clusters with transcripts maintained at increased levels throughout commitment and development (Fig 2A, clusters 6 and 7, average log2(Cy5/Cy3) >0.31) included expected gene sets involved in the constitutive processes of macromolecular metabolism (e.g. DNA replication, protein modification and RNA metabolism Fig 2B, Additional File 1) (36,38). Interestingly, cluster 6 (and cluster 2) showed a high degree of cyclic oscillation in transcript abundance (Fig 2A). Many of these transcripts relate to transport, general cellular metabolism and homeostasis, functions in which fluctuation would not be unexpected (Fig 2B, Additional File 1). Importantly, cluster 7 also contained transcripts classified by gene ontology as involved in cellular differentiation (caf40 (pf3d7_0507600), pf3d7_0918400, pf3d7_0926800 and speld (pf3d7_1137800)) (GO:0030154, P=0.026, Fig 2B, S1 Table).
A significant proportion (15%) of the transcriptome is associated with peak expression during specific stage-transitions in gametocyte development (Fig 2A, clusters 8-10), reminiscent of the phased expression typical of the asexual transcriptome (11,12). Transcripts involved in early-stage development increased from stage I-II in cluster 8 in a transcriptional profile often associated with targets of AP2-G (22,23,25). Transcripts in cluster 9 increased in abundance in the intermediate phase of development (stage III-IV) before the expression of transcripts required for development in mosquitoes in cluster 10 (stage V, gamer (pf3d7_0805200), mtrap (pf3d7_1028700), cht1 (pf3d7_1252200), Fig 2A &B). The transcripts in clusters 8-10 are thus markers of biological transitions during gametocyte development. Clusters 6 & 8 are enriched for genes that contribute to the metabolic shift to mitochondrial metabolism (e.g. malate dehydrogenase (mdh, pf3d7_0618500)) and fatty acid biosynthesis (e.g. β-ketoacyl-ACP synthase III (kasIII, pf3d7_0618500)) (49,50) in gametocytes, followed by the emergence of processes related to cytoskeletal formation (clusters 8 & 9, Fig 2A &B, Table S1) that lead to the construction of a rigid subpellicular microtubule array during the sequestering stages (stages I-IV) of gametocytes (51). The microtubule array results in the characteristic crescent shape of the intermediate stages before the complex depolymerizes in stage V that is accompanied by the increased transcript abundance of actin depolymerization factors 1 and 2 (Fig 2A, cluster 10, pf3d7_0503400, pf3d7_1361400) to allow for a more deformable erythrocyte that can re-enter circulation (51). This cluster also includes the genes encoding the serine repeat antigens (sera) 3 and 5 (pf3d7_0207800, pf3d7_0207600) that play a role in egress in asexual parasites (52,53), implying that they may retain this role during gametocyte egress from the erythrocyte in the mosquito midgut. The striking temporal patterns of transcript abundance in clusters 8-10 suggests strict transcriptional regulation of these genes to ensure the timing of gametocyte sequestration, circulation and egress. Interestingly, these patterns are exhibited by parasites that need not fulfil any of these functions when grown in vitro in the absence of host-interactions, suggesting that transcription of these genes is hard-wired.
Different gene sets enable sexual commitment and development
The time-resolved gametocyte transcriptome also allows interrogation of the expression of genes involved in sexual commitment throughout gametocyte development (18,20,25) (Fig 3). In total, previous reports produced a set of 1075 unique genes proposed to function as an “on switch” that characterizes gametocyte commitment (18,20,25). Of these, 680 genes (63%) also have increased transcript abundance during gametocyte development (Fig 3). These increased transcripts include those encoding epigenetic regulators involved in cell cycle control such as SIR2A (PF3D7_1328800) and SAP18 (PF3D7_0711400) that contribute to decreased DNA synthesis and a block in proliferation (54,55) . These epigenetic modifiers and readers do not have direct roles postulated for commitment, but could contribute to the global change in abundance of specific histone marks as the parasite differentiates (56). The remaining 395 transcripts are not increased in abundance during gametocyte development, suggesting that these transcripts are short lived and possibly only essential during gametocyte commitment. These short lived transcripts include gdv1, whose protein product prevents epigenetic repression of ap2-g during commitment (29), iswi and sn2fl, which encode chromatin remodelling proteins (Fig 3A), that are expressed in sexually committed cells downstream of ap2-g (25) and hp1 and hda2 that antagonise ap2-g expression (27).
From our data, we also identified specific 5’ cis-regulatory motifs that are enriched upstream of genes involved in gametocytogenesis (Fig 3B). The first motif, (ATGTGTA) was highly represented in cluster 7 in genes that express ubiquitously throughout both sexual and asexual development. This motif has been correlated with genes involved in DNA replication (57) and the significance of its enrichment in genes associated with differentiation is unclear. The second motif, (AGACA) that is enriched upstream of the genes in the developmentally regulated clusters 8 and 9 has been associated with sexual commitment and development in previous datasets (18,58) although no trans-acting factors have been identified for either of these motifs (15,59) (Fig 3B). Additionally, a second well-conserved motif was enriched in cluster 8, (ACATAC) that has not been reported before and possibly represents a new avenue for investigation of cis-regulatory elements in genes contributing to parasite differentiation. In addition, genes in cluster 10 were enriched for 3 motifs, of which the first (GT[A/G]CA) closely matches the composite motif observed in genes bound by both the AP2-I and AP2-G transcription factors (23) and the second motif (GGTGCA) closely resembles the transcription factor binding site of AP2-I alone (60). Cluster 9 was the only cluster of genes with an enriched motif in their 3’ UTR, coinciding with 63% of this cluster being translationally repressed in the gametocyte stage (32,35).
Transcriptional patterns characterize distinct transitions in gametocyte development
Apart from commitment to sexual development, the parasite also undergoes distinct developmental and transcriptional transitions during gametocyte development. The initial transition occurring in stage I gametocytes and regulating immature gametocyte development is characterised by increased transcript abundance in cluster 8 (Fig 4A), which showed a significant enrichment for genes involved in regulation of transcription (GO:0010468, 11 transcripts, P=0.029) including the specific ApiAP2 transcription factors pf3d7_0404100, pf3d7_0516800, pf3d7_1429200 and the myb1 transcription factor (pf3d7_1315800) (Fig 4A). Other genes with potential regulatory functions include a possible novel transcription factor, pf3d7_0603600, which contains an AT-rich interaction domain (IPR001606: ARID) and an uncharacterized RNA binding protein (pf3d7_1241400). Proteins expressed by these two genes have been detected previously during gametocyte development (Fig 4A) (34,35,40,41). These proteins, along with the C-Myc binding protein MYCBP (PF3D7_0715100), are of interest for further study to determine their role in controlling gene expression during gametocyte development.
A second outcome of the initial transition into gametocytogenesis is the determination of sex differentiation in P. falciparum parasites, which is proposed to be an PfAP2-G independent process that occurs at the very onset of commitment (18,35,40,41,61). However, sexually dimorphic gametocytes are only morphologically detectable by microscopy from stage III onwards (62). Our data indicate that the male-enriched transcripts from Lasonder et al. 2016 (35) show increased abundance earlier in development (stage I-II; 27% of cluster 8, P<0.0001, two-tailed Fisher’s exact test, Fig 4B, Additional File 3) compared to female transcripts. These 98 male-enriched transcript abundance may be good biomarkers of early male differentiation as an alternative to α-tubulin II, which is expressed promiscuously in early gametocyte populations (63).
Female-enriched transcripts (35) peak in abundance only after sexual dimorphism is clearly discernible, from stage II-III onwards (Fig 4B) and are significantly overrepresented in the intermediate development cluster 9 (Fig 4B, Additional File 3, 76% of the cluster, P<0.0001, two-tailed Fisher’s exact test). Overall, this trend held true for the 158 female-enriched transcripts in cluster 9, including those encoding canonical female markers, e.g. osmiophilic body protein g377 (pf3d7_1250100) (64,65), late-stage antigen pfs25 (pf3d7_1031000) (35,65) and ccp1-3 (pf3d7_1475500, pf3d7_1455800, pf3d7_1407000) (35,65) and ccp4 (pf3d7_0903800) that was recently used to reliably type male and female gametocytes in late-stage gametocytes (66). We also detect a small subset of female-enriched transcripts (pf3d7_0918700, imp2 (pf3d7_0730400), pf3d7_1007800, pf3d7_1466800, pf3d7_1146800, obc13 (pf3d7_1214800)) that are expressed earlier in gametocyte development (Fig 4B) and could potentially be important for female development before morphological differences are apparent.
The second transcriptional transition we observed coincides with the onset of gametocyte maturation from stage IV to V (Fig 4C). These transcripts show increased abundance in sexually committed asexual parasites as well as mature stage V gametocytes but have diminished abundance during the early and intermediate stages of gametocytogenesis (cluster 10, Fig 4C). This cluster was significantly enriched for transcripts stabilized during commitment (47% of transcripts, P<0.0001, two-tailed Fisher’s exact test) (18), as well as genes marked with H3K36me3 in asexual parasites (49% P<0.0001, Fisher’s exact test) (16). Interestingly, the epigenetic H3K36me3 mark is abundant during the intermediate stages of gametocyte development (56) and genes overlapping in the three datasets encode transcripts associated with the intracellular signalling machinery of the parasite (cdpk1 (pf3d7_0217500), cdpk5 (pf3d7_1337800) and adenylyl cyclase beta (pf3d7_ 0802600, (67)), along with cAMP-dependent protein kinase A catalytic and regulatory subunits (pkac (pf3d7_0934800), pkar (pf3d7_1223100) (Fig 4C). Of these, CDPK1 has been confirmed to function in de-repressing female gametocyte transcripts during parasite development in mosquitoes (68). Several of the genes in cluster 10 also have roles in invasion including the merozoite proteins msp1, pf3d7_0930300, msp2, pf3d7_0206800, rh4, pf3d7_0424200, and rh5, pf3d7_0424100, suggesting that invasion genes need to again be expressed for transition to gametogenesis in the mosquito. Overall, the gametocyte transcriptome reveals three major stages in gametocyte development (differentiation (Fig 4A), intermediate development (Fig 4B), maturation (Fig 4C)) that promote gametocyte maturation of P. falciparum parasites.
ApiAP2 transcription factors are expressed at specific intervals during gametocytogenesis
To investigate the possible contribution of factors associated with transcriptional regulation to the observed stage-progressions during gametocytogenesis, we interrogated the expression of the genes encoding the ApiAP2 transcription factor family (Fig 5). Of the 27 family members, 15 genes encoding ApiAP2 transcription factors increased in transcript abundance during gametocyte development. Transcript abundance for pf3d7_0404100, pf3d7_1350900, pf3d7_1449500, pf3d7_0802100, pf3d7_1429200 increased consistently throughout the time course (Additional Figure 1). However, most ApiAP2-encoding transcripts increased in abundance at discrete intervals (Fig 5A) throughout gametocytogenesis. As expected, ap2-g (pf3d7_1222600) transcript abundance increased before the appearance of gametocytes (days -1 to 2). The target genes bound by AP2-G (23), peaked in transcript abundance directly following AP2-G peak abundance as expected, coinciding with stage I of gametocyte development (Additional Figures 2 & 3). Thereafter, three transcription factors pf3d7_1408200, pf3d7_1317200 and pf3d7_0611200 were increased during stage I to III of development (days 2-6). In the rodent-infectious malaria parasites P. berghei and P. yoelii, orthologs of the first two genes have been associated with gametocyte development through knockout studies (21,30,69) while little is known of pf3d7_0611200. Three ApiAP2-encoding transcripts for pf3d7_0516800, pf3d7_1222400, pf3d7_0934400 were increased in abundance from stage I to V of development (Fig 5A), following a pattern similar to the increased abundance of cluster 8 (Fig 4A). During the later stages, pf3d7_1143100, pf3d7_1239200 and pf3d7_0613800 were increased in abundance. Expression of pf3d7_1143100 is translationally repressed in P. berghei gametocytes (32) indicating that these transcription factors may not contribute to gene expression in P. falciparum gametocytes, but may instead have functional significance in subsequent development in the mosquito.
To associate functional regulation of gene sets due to the cascade-like increased abundance of the transcripts encoding the ApiAP2 transcription factors, the gametocyte transcriptome was analysed using Gene Regulatory Network Inference Using Time Series (GRENITS (70)) (Additional File 1) with the strongest predicted regulators shown in Fig 5B. From this analysis, pf3d7_0611200, which increased in abundance directly following ap2-g, coexpressed with 314 genes (probability linkage >0.6), 223 of which were anti-correlated for expression and functionally enriched for genes involved in host invasion (GO:0044409: entry into host, P=2.97e-12; Fig 5B). Interestingly, 116 of the genes co-expressed with this transcription factor were enriched for the TGCAC motif (P=5.5e-13), of which 100 were negatively co-expressed, indicating a repressive role for this transcription factor, either alongside or instead of the P. falciparum ortholog of pbap2-g2 (21), pf3d7_1408200. This motif bears a striking resemblance to the motif bound by the 3rd AP2-domain of AP2-I, GTGCAC (13), suggesting this transcription factor could act as repressor of the invasion genes that AP2-I activates in asexual development. A secondary enriched domain was present in 43 of the co-expressed genes (GGTTG) and both of these binding motifs warrant further study into their functional relevance. The second apiap2 transcript increased in abundance, is pf3d7_1317200, the P. falciparum ortholog of pbap2-g3, coexpressed with 21 genes involved in cell cycle processes, including DNA replication (GO:0044786, P=0.0061) and chromosome organization (GO:0051276 P=0.0046). Unlike its ortholog in P. berghei (69), no enrichment for female specific proteins or transcripts were observed in the co-expressed genes and further phenotypic information is needed to describe this ApiAP2’s activity in P. falciparum. The two final ApiAP2 transcription factors are increased between stage I-V of development, with the first, pf3d7_0934400, showing mostly negative co-expression with its target genes (27/37 transcripts, including pkg and ptex88 (pf3d7_1105600)), suggesting this ApiAP2 transcription factor might also act as repressor. Secondly, the transcript of ap2-o2 is increased in abundance throughout development but peaks at stage IV (day 8-9) of development and was predicted to regulate 22 target genes. Taken together, this data supports the involvement of successive expression of ApiAP2 transcription factors in a regulatory cascade during gametocyte development, as has been proposed for P. berghei gametocytes (21) and shows that this subsequent expression co-occurs with stage transition during P. falciparum gametocytogenesis.