We collected yellow-legged hornets using traps at Mt. Jiri National Park in September 2019 when the colony was at its maximum size and the production of new queens resulted in high demand for hornet workers. Sugared water with acetic acid and 95% ethanol was used as the trap solution in 2 L plastic containers. The traps were monitored every two weeks. The collected hornets were preserved in 95% ethanol for use in later molecular and morphological analyses. Traps were placed in 30 sites and hornets were captured from 13 sites (Fig. 6, Table 3). We selected three individuals from each site, totalizing 39 similar-sized workers.
DNA extraction and microsatellite genotyping
We collected muscle samples from the hind legs of each worker for genomic DNA extraction. DNeasy Blood and Tissue kits (Qiagen, Hilden, Germany) were used to extract genomic DNA according to the manufacturer’s protocol. We measured the concentrations and quality of genomic DNA using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Wilmington, USA), and the concentrations of genomic DNA were diluted between 10 to 20 ng/µL range. Seven polymorphic microsatellite loci were amplified using primers developed by  for the yellow-legged hornet (D2-185, D3-15, R1-137, R1-36, R4-33, R1-75, and R1-169). We performed amplifications using the procedures for each locus detailed in . Seq-Studio Genetic Analyzer (Applied Biosystems, Foster City, USA) was used to visualize the amplicons, and GeneMapper version 6.0 was used to evaluate the dataset for genotype errors and the presence of null alleles.
Analysis of genetic diversity and distance, and population structure
The genotype datasets were used to analyze genetic diversity and population structure. GENEPOP version 4.7  was used to confirm deviations in Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium of seven microsatellite loci. Null alleles, significant deviations from HWE, or evidence of linkage disequilibrium were not observed in any of the seven loci. In addition, seven loci could sufficiently separate 39 samples of yellow-legged hornets (Fig. 7).
The MS Excel add-in, GenAlEx version 6.5 , and Arlequin version 3.5 , were used to calculate genetic diversity and diversity indices, namely mean number of alleles (NA), effective number of alleles (NE), observed heterozygosity (HO), expected heterozygosity (HE), Shannon’s information index (I), molecular diversity (h), and inbreeding coefficient relative to the sub-population (FIS) of each population. The UPGMA based on Euclidean distances was used to construct a dendrogram based on Nei’s genetic distances obtained using GenAlEx version 6.5, using PAST 3 . Paired population differentiation (FST) was performed using Arlequin version 3.5. The linearized FST value (FST/[1- FST]) with geographic distance was used in the Mantel test (number of permutations: 999) to find evidence of genetic isolation by distance from the hornet using the ‘vegan’ package in R .
Bayesian clustering was performed using STRUCTURE version 2.3.4 . An admixture model, which allows us to confirm whether ancestors in population k have passed a portion of their genetic material to individual i. Simulations (100,000) were performed in each analysis after an initial burn-in of 100,000 simulations. We used the ΔK method  in STRUCTURE Harvester , which supported the estimation of the best‐identified k value. The range of one to 14 possible clusters with three independent runs each was employed in STRUCTURE Harvester. In the STRUCTURE harvest analysis results, the best-supported K value among hornets across 13 populations was identified as 7 (Fig. 8).
DAPC , which is a multivariate clustering algorithm, was used to analyze the population structures of 39 hornets from 13 sites using the “adegenet” package in R . The analysis describes the greatest amount of variation through a discriminant function representing a linear combination of correlated alleles in a linear discriminant analysis based on principal components generated after reducing the dimension of the genetic variation using principal component analysis . The method was used to determine how distinct each population among the 13 sites was.
Landmark-based geometric morphometrics were used to analyze the wing shapes of the hornets. TpsDig  was used to digitize the 19 landmark points in the wing vein (Fig. 9). Morpho J version 1.07a (Manchester, UK) was used to convert the digitized landmark coordinates into Procrustes coordinates. Centroid size, which used the size proxy in landmark morphometrics , was measured to compare wing sizes among the 13 groups. CVA was used to compare morphological differences among the 13 groups and to calculate the f contribution (%) of each canonical variate (CV) in CVA. Variation in wing shape among the hornets from 13 sites was visualized in a wireframe graph along the first two (CV1 and CV2) axes. We also identified the values and significant distances of the Procrustes to explore the morphological similarity between groups. One-way ANOVA was used to test significant differences in centroid size, CV1, and CV2, among the hornets using GraphPad Prism version 7.0 for Windows (GraphPad Software, San Diego, USA). When significant differences were found in the ANOVA test, Tukey’s post hoc tests were performed. All differences were considered significant at p < 0.05.