Colony census
Ropalidia plebeiana constructs massive nest aggregations (Fig. 1). We conducted a colony census of 26 colonies in a population at Dinner Creek in New South Wales, Australia (location data in Table S1). These routinely censused colonies (RCCs) were censused approximately every 3 days from mid-October 2001 to the end of March 2002; newly emerged adult females in the RCCs were individually marked with paint. During the paper wasp life cycle, foundresses begin nest construction and produce workers, gynes, and males. A foundress is assigned a queen after workers begin to emerge if she is the egg-producer at that time. Gynes that are candidate female foundresses in the following season were designated as females that emerged after February 21, 2002, for the following reason: we found 59 marked and overwintered foundresses again on September 16, 2002, and the mean date of emergence of the first gyne marked for 19 colonies was February 21, 2002. Newly emerged males were collected with forceps and immediately stored in 99% EtOH. The mean date of emergence of the first male in 24 colonies was March 2, 2002. The original queen(s) of the RCCs were collected in the same manner in March, if still alive, and stored in 99% EtOH. Workers that emerged before February 21 were collected during March. At the end of March, we collected the whole RCC and counted the number of pupae. Then, we collected all pupae of each colony and stored them in 99% EtOH. We checked and sexed the collected insects according to morphological differences in antennae. We estimated the sex ratio of reproductives (males/[males + new gynes]) for each colony from these data. We genotyped some of these female pupae as new gynes.
In addition, we individually marked 960 foundresses in 111 arbitrarily censused colonies with paint before worker emergence at Dinner Creek. From October 2001 to February 2002, we arbitrarily collected the colonies, including all colony members of foundresses with marks and emerged workers without marks, and stored them in 99% EtOH. After each round of collection, we visually checked whether each colony had been divided into subcolonies by comb-cutting, because it was possible to observe comb-cutting during each colony census. The number of colonies was 16 before comb-cutting and 36 after comb-cutting.
All collected female adults were dissected under a microscope, and their ovarian condition was noted as follows: 0, undeveloped; 1, slightly developed; 2, one mature egg; 3, more than two mature eggs.
Sample collection for population structure
We collected 14–40 adults from each of the 14 populations (total: 309) in 2008, including the population at Dinner Creek, where we conducted the colony census using insect nets and stored them in 99% EtOH.
Genetic analyses
Collected individuals were transported back to the laboratory and stored in 99% EtOH at −20 °C until DNA extraction. The excised legs of adults and pupae were frozen in liquid nitrogen and smashed with a pestle in 1.5-mL tubes, then mixed with 50 µL extraction buffer containing 150 mM NaCl, 10 mM Tris-HCl (pH 8.0), 1.0 mM EDTA, 10 µg proteinase K, and 40 µg Chelex 100 (Bio-Rad Laboratories). The tubes were incubated at 56 °C for 2 h and at 99.9 °C for 3 min [29]. Then each solution was precipitated with ethanol and maintained in TE buffer at 4 °C.
Genotyping for COI
Polymerase chain reaction (PCR) was performed with 1 µL diluted genomic DNA (approximately 1 ng) in a mixture consisting of 1 µL primer mix (2.5 µM), 0.1 µL 10 mM dNTP mix, 0.05 µL Taq polymerase (5 units/µL, Ex Taq; TaKaRa), 1 µL 10 × buffer (provided with the polymerase, containing 1.5 mM MgCl2), and 6.85 µL distilled water with a total volume of 10 µL. PCR was performed using a thermal cycler (Gene Amp PCR System 2700; Applied Biosystems). After denaturation for 4 min at 94 °C, the samples were subjected to 30–35 cycles of 1 min at 94 °C, 1 min at the annealing temperatures appropriate for the particular primer pair, and 45 s at 72 °C, with a final extension for 7 min at 72 °C. We used five primer pairs (Table S2). The products were electrophoresed in 8% polyacrylamide gel and visualized by silver staining [30]. Genotype scoring and data entry were performed twice, and the scores were compared. Discrepancies were checked, and if necessary, the sample was reanalyzed.
The cytochrome oxidase subunit I (COI) gene was amplified and sequenced for the population samples. PCR fragments were produced with the mtDNA primers CO1-LCO and CO1-HCOoutout [31]. At the first step, double-stranded PCR products were generated with a PCR mixture containing 2 µL genomic DNA, 5 µL primer mix (0.8 pmol/µL), 2.5 µL 10′ PCR buffer, 2 µL 2.5 M dNTP, 10.375 µL distilled water, and 0.125 µL Taq DNA polymerase (5 units/µL, Ex Taq; TaKaRa) with a total volume of 22 µL. The PCR program was: 5 min of denaturing at 94 °C followed by 40 cycles of 15 s at 94 °C, 5 s at 46 °C, and 30 s at 68 °C. The reaction was terminated with 7 min of elongation at 72 °C. An aliquot of the PCR products from this first step was used as the template for a second PCR with only one of the two primers added to produce a single-stranded PCR product. The second PCR mixture included 4 µL DNA template (PCR product from the first step), 2 µL primer (0.8 pmol/µL), 3.5 µL sequencing buffer, and 0.5 µL BigDye Terminator Ready Reaction Mix (Applied Biosystems) in a total volume of 10 µL. The single-stranded PCR products were sequenced with dye-labeled terminators with the ABI Prism BigDye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems) and 3100 DNA analyzer (Perkin-Elmer). PCR fragments were sequenced in both directions to ensure their accuracy.
Analyses of Population structure
For microsatellite analyses, we developed specific primers for R. plebeiana. We used four specific primer pairs and one new pair specific for R. revolutionalis (Rrev 188; [32]) to analyze the population and colony structures of R. plebeiana. For the microsatellite data set, we checked the existence of null alleles across the loci with MICRO-CHECKER [33] using genotyping data from 309 adults from the 14 populations. Departures from Hardy-Weinberg assumptions were calculated using GenAlEx ver. 6 [34]. Linkage disequilibrium was tested with Genepop ver. 1.2 [35]. We sampled the 14 populations using arbitrary netting by hand. Therefore, we may have captured some kin groups. To exclude this possibility, we detected full-sib relationships using Kinship [36] and removed them from our data. We ran the software assuming "full-sib" as the primary hypothesis and "unrelated" as the null hypothesis, with a Type I error of less than 0.05.
For both the microsatellite and mtDNA data sets, we quantified genetic variation between hierarchical levels of populations (among regions, among populations within regions, and within populations) and their fixation indices using analysis of molecular variance [37] in GenAlEx ver. 6 [34]. To examine mtDNA haplotype differentiation, this method yields analogs of F statistics, called Ф statistics, as defined by Michalakis and Excoffier [38]. We determined the probabilities that the molecular variance and fixation indices at different levels were significant and positive (indicating differentiation) using permutation analyses of 1,000 randomly permuted data sets. FST (for microsatellites) and ФST (for mtDNA) show the overall genetic variation among populations, indicating a sex-biased dispersal pattern. We evaluated genetic differentiation among populations with pairwise FST with 10,000 permutations using GenAlEx ver. 6. Pairwise FST was used to investigate the impact of interpopulation geographic distance on genetic differentiation using an isolation-by-distance (IBD) model [39]. Significant correlations between genetic differentiation (FST values for both markers) and the corresponding geographic distance were assessed using a Mantel test with 10,000 permutations. For geographic distances, straight-line distances (in kilometers) between paired members of the 14 populations were considered.
We applied Bayesian clustering in STRUCTURE ver. 2.0 [40] as an alternative measure of the degree of genetic differentiation using the microsatellite data set to delineate the number of genetically identified clusters (K) and assign individuals to clusters without using prior information about their locations of origin. We used an admixture model with correlated allele frequencies [41] for K values 1 (panmixia) to 10, performing 10 runs for each K value with 1,000,000 Markov chain Monte Carlo iterations and a burn-in period of 100,000 iterations. We estimated the likelihood of K for K = 1–10; the highest likelihood indicated the most probable number of populations. We calculated an ad hoc criterion of K [42] to determine the optimal K.
Analyses of Colony structure
We genotyped four classes of females, namely, foundresses, nestmate workers, new gynes, and foundresses of the following season. To genotype foundresses of the following season, we collected females on September 15 and 16 in 2002. For new gynes, we genotyped the pupae as well as adults of each colony. We calculated coefficients for genetic relatedness and inbreeding using Relatedness ver. 4.2c [43]. Standard errors were based on jackknifing over colonies [44]. Colonies were weighted equally in all analyses.
If not otherwise specified, all statistical analyses were performed in R ver. 3.4.1 [45].