This study provides one of the most comprehensive analyses to date of the transcriptomic signatures of plastic phenotypes, and is the first to employ a support vector machine (SVM) approach to interrogate the molecular basis of behaviour. We have applied this approach to identify caste-specific gene expression profiles in P. dominula and to explore the relationship between transcriptomic and plastic phenotypic changes following a major social perturbation in paper wasp colonies. We identify a set of nearly 2000 genes that optimally capture gene expression differences between established queens and workers. Using a caste classifier based on these genes, we find that queen removal leads to a colony-wide shift in expression, where the expression profiles of all individuals move towards a state intermediate between those of established queens and workers. Individual variation around this intermediate state is related to age and phenotypic attributes, with older (and therefore more queen-like) individuals showing expression profiles that are closer to those expected for established queens. Our results show that molecular responses to queen removal in P. dominula consist of both a general colony-wide response independent of phenotypic change and a response that reflects the plastic phenotypic transition.
Our study contributes to the enormous progress in our understanding of the relationship between molecular changes and changes in phenotypic expression that has been made in the past decade, facilitated by the increased availability of ‘omic’ data and complex bioinformatic analyses. Recent studies have started to challenge the view that there is a direct correspondence between transcriptomic states and external phenotypes. Libbrecht et al (2018), for example, show that gene expression responses associated with a reversible phenotypic change differ qualitatively based on the directionality of the change (from reproductive to non-reproductive or vice versa). Meanwhile, molecular manipulations have revealed a surprising degree of plasticity in traits canonically regarded as non-plastic, such as mammalian sex (Matson et al 2011) or ant castes (Simola et al 2016). Our results go further, showing a shift in caste-specific brain gene expression profiles among individuals whose phenotypic caste expression remains otherwise apparently unchanged. This shows that the expectation of a close match between expression profiles and phenotypes is excessively simplistic, or at least that detecting such a match requires detailed knowledge of relevant genes and/or an exhaustive phenotypic characterisation. This, in turn, suggests that the use of expression data to infer the molecular basis of phenotypes is more challenging than hitherto appreciated.
A major advantage of this study is the use of individual-level gene expression data from a large number of subjects, including individuals reared in a shared social environment but exhibiting very different phenotypic responses to perturbation. By sequencing individuals rather than pools, we were able to match each gene expression profile to high-resolution phenotypic data that capture the scale of naturally occurring variation in features such as age, ovarian development and dominance behaviour. This resolution allows us to address questions that are otherwise inaccessible in gene expression analyses. For example, we have been able to show that caste identity, but not the residuals of caste identity on age, is significantly predictive of individuals’ change in transcriptomic caste identity following queen loss.
Our discovery of colony-wide responses to queen loss suggests that this social perturbation provokes a significant reaction even from individuals that have little hope of attaining the vacant reproductive role. This is a surprising finding given that P. dominula is thought to express a ‘conventional’ gerontocratic mechanism of dominance and queen succession that mitigates the need for costly intragroup conflicts over the identity of the replacement queen (Pardi 1948; Tsuji & Tsuji 2005; Monnin et al 2009), which should greatly reduce the need for young, low-ranking workers to respond to queen loss (Taylor et al 2020). The gene expression responses of lower-ranked workers to queen removal might plausibly represent a form of safeguard against queen loss: if queen loss sometimes occurs multiple times in quick succession or is frequently associated with a general decimation of the nest population (e.g. through predation), there might be kin-selected benefits of a colony-wide ‘de-differentiation’ of individuals that facilitates a quicker succession process.
Replacement queens in our colonies did not have access to males and therefore remained unmated even after queen succession. This may partially explain the fact that even individuals with fully developed ovaries and very high dominance ratings did not transition to a fully queen-like gene expression profile, as mating can induce significant gene expression changes in insects (e.g. Gomulski et al 2012; Zhou et al 2014). The lack of immediate mating opportunities for new queens in our experiment is not necessarily unrealistic, however: unmated hymenopteran females can lay unfertilized eggs, which develop as males. Moreover, in naturally occurring early P. dominula nests, replacement queens may be established a month or more before they are mated (Strassmann et al 2004), presumably due to a scarcity of early males. The unmated replacement queens analysed here are therefore representative of those that would be present on wild nests shortly after queen loss.
To our knowledge, this study represents the first application of a support vector classification approach to behaviour-associated transcriptomic data. Using this approach we identified a large group of genes as differing meaningfully between Polistes castes: over 10% of annotated genes, a much larger set than those found using standard analytic approaches in this study and others (e.g. Ferreira et al 2013; Toth et al 2014; Patalano et al 2015; Geffre et al 2017). Standard approaches using packages such as edgeR (Robinson et al 2010), DESeq2 (Love et al 2014), or NOISeq (Tarazona et al 2011), which typically include information-sharing between genes and relatively strict fold change cutoffs, allow differential expression to be assessed with a high degree of confidence at the level of individual genes. However, because such approaches assess genes individually and do not reduce the dimensionality of the samples’ transcriptional profiles, they are not well-suited to the tracking of subtle but consistent broad-scale changes. This is especially true for gene expression data that are noisy and heterogeneous, or those that are dominated by many genes of small effect (Huang et al 2018). The molecular bases of caste in simple social insect societies represent such a class of data.
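The contrast between gene-by-gene testing and a classifier that pools many weak signals can be illustrated with a minimal sketch. The code below is not the pipeline used in this study: the data are simulated, the function names (`simulate`, `train_linear_svm`, `decision_score`), the effect size of 0.3 per gene, the 50 genes and the group sizes of 20 are all assumptions chosen for illustration, and the classifier is a bare-bones linear SVM fitted by sub-gradient descent rather than a library implementation with feature selection. It shows how a linear classifier trained on queens and workers returns a single continuous score per individual, so that profiles lying between the two castes can be placed along a queen-to-worker axis.

```python
import random


def simulate(n, shift, n_genes, rng):
    """Simulate n profiles; gene j has mean shift * direction_j and unit-variance noise.

    Even-indexed genes are (hypothetically) queen-biased, odd-indexed worker-biased.
    """
    return [[shift * (1 if j % 2 == 0 else -1) + rng.gauss(0, 1)
             for j in range(n_genes)] for _ in range(n)]


def decision_score(w, b, x):
    """Linear decision value: positive is queen-like, negative is worker-like."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Batch sub-gradient descent on the L2-regularised hinge loss.

    y must hold labels +1 (queen) and -1 (worker).
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision_score(w, b, xi) < 1:  # sample violates the margin
                for j in range(p):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def mean(values):
    return sum(values) / len(values)


rng = random.Random(1)  # fixed seed so the simulation is reproducible
n_genes = 50
queens = simulate(20, 0.3, n_genes, rng)
workers = simulate(20, -0.3, n_genes, rng)
# Assumed intermediate state: no caste bias in any gene.
post_removal = simulate(20, 0.0, n_genes, rng)

w, b = train_linear_svm(queens + workers, [1] * 20 + [-1] * 20)

for name, group in [("queens", queens), ("workers", workers),
                    ("post-removal", post_removal)]:
    print(name, round(mean([decision_score(w, b, x) for x in group]), 2))
```

In this simulation each gene differs between castes by well under one standard deviation, so few genes would pass a strict per-gene threshold at these sample sizes, yet the combined score separates the groups and places the simulated intermediate individuals between them.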
Using a machine learning approach, we have undertaken a detailed analysis of the relationship between gene expression and socially mediated phenotypic plasticity, revealing broad-scale changes in caste-associated gene expression profiles following a major social disruption. Our results reveal a hitherto unrecognized capacity for large-scale disruption to caste-biased gene expression profiles even in the absence of apparent changes in caste phenotype, a disconnect that undermines simplistic models of the relationship between transcriptome and phenotype. Future studies should continue to marry detailed phenotypic and gene expression data in order to assess the prevalence and provenance of such discontinuities.