(i) Transcriptome identifies gene candidates linked to insecticide-resistance status
Using DESeq2, we identified 607 DEGs between resistant and susceptible GWSS populations (Table S1). Of these, 81 had higher expression in resistant populations and 526 had higher expression in susceptible populations. Of the DEGs, 57.3% had a functional annotation match to at least one database. Insect cuticle proteins were the dominant function of DEGs with higher expression in susceptible populations (IPR000618; n = 28). In contrast, M13 peptidases (IPR000718; n=11) and cytochrome P450s (IPR001128; n=8) were the dominant functions found with higher expression in resistant populations. Ordination of overall transcriptome profiles did not depict a strong pattern related to resistance and while many genes were identified as differentially expressed, only a single gene had an obvious consistent pattern linked to resistance status, J6590_005969 (Figure 1).
Based on previous insecticide literature [14–23], we had hypothesized that cytochrome P450 genes might be partially or fully responsible for insecticide resistance of GWSS in the California Central Valley. Of the 81 DEGs identified as having higher expression in the resistant populations, eight were annotated as cytochrome P450s. Of these, a single gene, J6590_005969, was also the most significantly differentially expressed gene between resistant and susceptible populations (Figure 2). Phylogenetic approaches place J6590_005969 in a clade with other GWSS cytochrome P450s belonging to the CYP6A9 family. Interestingly, this locus is flanked on both sides by Helitron, Mutator and CACTA transposons, similar to other observations of repetitive elements flanking cytochrome P450s involved in detoxification (e.g., [32]). None of the elements surrounding J6590_005969 were predicted by the Extensive de novo TE Annotator (EDTA) to be structurally intact (i.e. containing TIR regions, transposases or protein domains).
(ii) Functional enrichment of GO terms supports a landscape of resistance in GWSS
GO term enrichment analysis was performed on the DEGs to identify significantly over-enriched functional terms (Table 2, Bonferroni-adjusted p < 0.05). For DEGs with higher expression in insecticide-resistant GWSS, we found that GO terms for seven molecular functions (MF) and 15 biological processes (BP) were over-enriched, but no terms for cellular compartments (CC) were enriched. For DEGs with higher expression in susceptible GWSS, we found that GO terms for three MFs and three CCs were over-enriched, but found no enrichment in terms for BPs. Generally, over-enriched MF terms in resistant GWSS were represented by gene clusters with predicted functions such as cytochrome P450s (GO:0005506, GO:0016705, GO:0020037, GO:0046906), the M13 protease neprilysin (GO:0004222, GO:0008237) and vitellogenins (GO:0005319) (Figure 3). In contrast, gene clusters represented by over-enriched MF terms in susceptible GWSS were dominated by genes with functions related to cuticle structural proteins (GO:0042302, GO:0005198) and peritrophins (GO:0008061). The functions of genes represented in over-enriched BP terms in insecticide-resistant GWSS include vitellogenins (GO:0006869,GO:0010876) and a cluster of three gene copies of the body color gene yellow (GO:0018958, GO:0046189, GO:0042440, GO:0046148, GO:0019953, GO:0032504, GO:0044703, GO:0048609, GO:0051704, GO:0000003, GO:0022414, GO:1901617, GO:1901615). The functions of over-enriched CC terms in susceptible insects include peritrophins (GO:0005576), and NADH-ubiquinone oxidoreductases and MICOS complex subunits (GO:0019866, GO:0005743).
(iii) No evidence of broad-scale population structure from coding region variants
Observed heterozygosity was significantly lower than expected (Bartlett’s test, p < 0.001) and FIS was 0.25 indicating high levels of non-random mating (e.g. inbreeding) in sampled GWSS. Overall Fst was 0.02, indicating low differentiation between populations (i.e., A-D; Table 1) and possibly high levels of migration or gene flow between populations, which is consistent with the relatively small geographic range studied here. Within-population FST were 0.028 and 0.033 for resistant and susceptible populations, respectively, and were 0.070, 0.081, 0.087, and 0.033 for population A (Tulare susceptible), B (Temecula susceptible), C (General Beale resistant) and D (Tulare resistant), respectively. Pairwise (between-population) FST was 0.011 between resistant and susceptible populations. Pairwise FST was similarly low for between A, B, C and D populations, ranging from 0.012 to 0.036 (Figure 4).
Further, population structure results from fastSTRUCTURE (marginal likelihood maximized at K=1, Figure S2) and Landscape and Ecological Association Studies (LEA) (lowest cross-entropy at K=1; Tracy Widom test p > 0.05 for all PCA eigenvalues, Figure S3) were both supportive of weak differentiation between populations. Despite a lack of overall population structure, using PCA ordinations, we observed some minimal separation by the general location individuals were collected from (Figure 4). Thus, when performing analysis of molecular variance (AMOVA) tests, we used resistance status alone (therefore assuming no population structure) and in a nested structure to account for separation by collection location or population. AMOVA tests on resistance status alone were significant (p < 0.01), but only explained a tiny portion, 1.78%, of observed variation. Most of the variation, 98.22%, was found within individuals, further supporting a panmictic (randomly mating) population. When using a nested structure, AMOVA tests of resistance status were not significant (p > 0.05), indicating no detectable signal of resistance when accounting for local population structure. In these cases, collection location or population were significant (p < 0.01), accounting for 4.05% of variation, indicating that while there is no detectable broad-scale population structure, there may still be some local adaptation occurring. We used OutFlank to identify Fst outliers as likely variants potentially under selection due to resistance status, but no significant outliers were detected (q > 0.1).
We used SNPEff annotations of the functional effect of variants to identify SNVs that might contribute to the increased expression of J6590_005969. Unfortunately, the low levels of expression of J6590_005969 in susceptible GWSS prevented our ability to call SNVs in this gene for susceptible samples. However, given that the reference genome represents a susceptible genotype [33], we still investigated the variants called in the resistant GWSS individuals. We found no SNVs that were homozygous or heterozygous for an alternate allele that was predicted to affect J6590_005969 function.