To demonstrate the general applicability of a hypothesis, it is desirable to collect data from sites that differ in their physical characteristics. Climate is one factor that could affect how birds respond to this experiment. Therefore, we conducted this experiment in twelve public parks in two climatic regions: six in subtropical Nanning (Shishan Park [SP], Medicinal Botanical Garden [MBG], Peoples’ Park [PP], Xiangsi Lake Park [XLP], Xinxu River Park [XRP], and Flower Park [FP]) and six in temperate Kyoto (Yoshidayama [YM], Fudo Onsen [FO], Manshu-in [MI], two areas of Takara-ga-ike [TGI1 and TGI2], and Shimogamo Jinga [SJ]) (see Table S1). Climatic regions were assigned according to the Köppen climate classification. Most sites were more than 2 km from one another within each region to ensure that the bird communities at each site were independent of one another. We chose 10 sub-sites in each park within an area of ~4,000 m². Sub-sites were small areas with tree-dominated vegetation, high bird activity, and low human activity. Each sub-site was approximately 4 m² in size, was at least 15 m from other sub-sites, and had trees or bushes with vegetation within 3 m of the ground on which to present the prey.
We used 20 mm long mealworms (Tenebrio molitor) as prey and prepared three prey types. Untreated, undefended mealworms were used during predator training. During the experiment, we used undefended mealworms injected with 0.02 ml of water and defended mealworms injected with 0.02 ml of a 3% quinine sulphate solution (Sigma Aldrich, Q0132–25G).
The experiment lasted 8 days and comprised two stages of 4 days each. The first 4 days were a training period that allowed birds to find the undefended prey and to start interacting with them. Prey deployed during the training stage were always tied onto 25 × 25 mm blue polypropylene (PP) squares (HTML colour code #81BEF7; https://html-color-codes.info/). The coloured PP background acted as an aid to enable the birds to learn the properties of the prey through associative learning. Four days were enough for the birds to find the prey and learn their locations (see Results section).
The second stage (days 5–8) was the experiment itself. During this phase, we deployed prey at the same sub-sites as those used in the training. We deployed prey in two clusters, each released within an area of 1.5 m² and more than 1 m from the other cluster. In one cluster, we deployed undefended mealworms injected with water, which served as undefended control prey. We deployed these control prey to show that predators were able to discriminate among prey (i.e., were educated) and that their predation of model-mimics was related to their preferences. In the second cluster, we deployed the model-mimics (the experimental treatment), which were a combination of undefended (injected with water) and defended (injected with quinine) mealworms (for more details of prey preparation and deployment in the field see [21, 22]). We used 25 × 25 mm PP squares of different colours for the controls and treatments. We used pink (#F5A9D0) and yellow (#F2F5A9), which were randomized across the 12 sites in a balanced manner (see Fig. S1 for the appearance of the prey; Table S1).
The experiment was conducted across three winters (Dec. 2016-Mar. 2017, Nov. 2017-Mar. 2017, and Jan. 2018). At each site, we used a different ratio of model-mimics (range: 0–1 in 0.2 increments; Table S1). Each day, we began deploying prey 30 min after sunrise, and deployment took about 60 min. We left the prey for 2 h, after which we collected them and recorded the rates of partial predation (hereafter referred to as ‘taste rejection’) and full predation. Although we attempted to collect 8 consecutive days of data at each site, we did not run trials when it was raining. In such instances, we continued the prey deployments on the next day with good weather [see 21 for more details].
This experiment was approved by the Animal Ethics Experimental Committee of Guangxi University and the Animal Ethics Committee of the Department of Zoology at Kyoto University. The experiment complied with all laws of the countries in which it was conducted and adhered to the Animal Behavior Society/Association for the Study of Animal Behaviour regulations for the use of animals in research. This experiment also adhered to the ARRIVE guidelines.
Predator diversity survey
The timing of our experiment was designed to minimize time and study-site effects (Table S1). However, the abundance and diversity of predators might influence the attack rate and taste rejection rate [23, 24]. Therefore, before starting the daily trials, we conducted a 5-min bird point-count to determine the assemblage of potential predators, recording the species identity of all birds detected (seen or heard, except those in flight). The bird count was performed at the centre of a 50 m diameter circle within the study site. Bird counts were performed during the 4-day experimental stage [see 21, 26 for further details of bird count methods].
We used generalized linear mixed-effect models (GLMMs) with a binomial distribution, fitted with the lme4 package, to determine how mimic frequency influenced the frequency of taste rejection. We analysed data from Nanning and Kyoto separately. We combined the 4 days of taste rejection records and used the proportion of taste-rejected prey (taste-rejected prey/the total number of prey per sub-site) as the dependent variable. Mimic frequency (0, 0.2, 0.4, 0.6, 0.8, and 1) was the fixed effect, and sub-site nested within study site (6 sites per region) was the random factor. We also calculated the level of taste rejection relative to overall predation to account for the effect of predation rate (i.e., proportion of taste-rejected prey/proportion of attacked prey; hereafter, ‘relative taste rejection’). We fitted a GLMM with this ratio as the dependent variable, following the same procedure described above. Finally, we repeated this procedure using data from the controls. The models for the Nanning treatment data showed complete separation because there was no taste rejection at a mimic frequency of 1.0 in Nanning. Therefore, we added one artificial taste rejection observation to the Nanning data. We also increased the maximum number of iterations to 10000 to improve model convergence. We calculated the coefficient of determination to examine the amount of variation explained by the model. Type III Wald chi-square tests were conducted to estimate the significance levels of our independent variables. Finally, we conducted post-hoc tests using the multcomp package, comparing pairwise least-squares means with P-values adjusted by the Bonferroni method, to examine differences in the proportion of (relative) taste rejection among mimic frequencies.
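As an illustration of the response variables and the P-value adjustment described above, the following is a minimal Python sketch; it is not the analysis code, which was run in R. It assumes that "attacked" prey are those either partially or fully predated, and the function names and the counts in the usage note are hypothetical.

```python
import math

# Mimic frequencies used across sites (from the text above).
MIMIC_FREQUENCIES = [0, 0.2, 0.4, 0.6, 0.8, 1]

# Number of pairwise comparisons among the six mimic frequencies.
N_PAIRWISE = math.comb(len(MIMIC_FREQUENCIES), 2)


def taste_rejection_metrics(n_deployed, n_partial, n_full):
    """Return (proportion taste rejected, relative taste rejection) for one sub-site.

    Assumption: attacked prey = partially predated + fully predated prey.
    Relative taste rejection is undefined (None) when no prey were attacked.
    """
    if n_deployed <= 0:
        raise ValueError("n_deployed must be positive")
    n_attacked = n_partial + n_full
    if n_partial < 0 or n_full < 0 or n_attacked > n_deployed:
        raise ValueError("inconsistent prey counts")
    prop_rejected = n_partial / n_deployed
    prop_attacked = n_attacked / n_deployed
    relative = prop_rejected / prop_attacked if n_attacked > 0 else None
    return prop_rejected, relative


def bonferroni_adjust(p_values):
    """Bonferroni adjustment: multiply each P-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For example, `taste_rejection_metrics(20, 2, 6)` describes a hypothetical sub-site where 20 prey were deployed, 2 were partially eaten, and 6 were fully eaten. With six mimic frequencies there are `N_PAIRWISE` pairwise contrasts, so under this sketch each raw P-value would be multiplied by that number before being compared with the significance threshold.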
Neither of the GLMMs above showed evidence of overdispersion or spatial autocorrelation; the latter was assessed with Moran's I tests on the model residuals using the “Moran.I” command in the ape package. This indicates that there was enough mixing of individual birds within the parks to outweigh any effect of bird territoriality. We also used Wilcoxon rank sum tests to examine differences in the proportion of taste rejection and the proportion of relative taste rejection between the two study regions and between treatment and control.
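To make the spatial autocorrelation check concrete, the sketch below computes the observed Moran's I statistic in Python for a set of residuals and sub-site coordinates. It is only an illustration under stated assumptions: it uses inverse-distance weights, which is one common choice and not necessarily the weighting used in the analysis, it does not reproduce the significance test, and the function name and inputs are hypothetical.

```python
import math


def morans_i(values, coords):
    """Observed Moran's I using inverse-distance weights (an assumed weighting).

    values: residuals, one per location.
    coords: (x, y) positions in the same order as values.
    """
    n = len(values)
    if n != len(coords) or n < 2:
        raise ValueError("need matching values and coords, at least two points")
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("values have zero variance")
    weighted_cross = 0.0  # sum of w_ij * dev_i * dev_j over all i != j
    weight_sum = 0.0      # sum of w_ij over all i != j
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.dist(coords[i], coords[j])
            if dist == 0:
                raise ValueError("coincident locations are not supported")
            w = 1.0 / dist
            weight_sum += w
            weighted_cross += w * dev[i] * dev[j]
    return (n / weight_sum) * (weighted_cross / denom)
```

Under the null hypothesis of no spatial autocorrelation, the expected value of the statistic is -1/(n - 1); values well above that suggest that nearby locations have similar residuals, and values well below it suggest that neighbours differ.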
Since predator diversity could influence attack rate [21–24, 32], we tested the dissimilarity of bird composition between study sites using the Jaccard and Bray-Curtis indices calculated in the vegan package (see electronic supplementary materials for details). Independent t-tests were performed to examine the difference in point-count abundance (log-transformed) and richness (log-transformed) between the two study regions. All analyses were run using R software version 3.4.2.
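For readers unfamiliar with the two dissimilarity indices, the following Python sketch shows their standard definitions for a pair of sites. It is an illustration only, not the vegan implementation used in the analysis; the function names and the species labels in the usage note are hypothetical placeholders.

```python
def jaccard_dissimilarity(species_a, species_b):
    """Presence/absence Jaccard dissimilarity: 1 - |shared species| / |all species|."""
    a, b = set(species_a), set(species_b)
    union = a | b
    if not union:
        raise ValueError("both sites are empty")
    return 1 - len(a & b) / len(union)


def bray_curtis_dissimilarity(counts_a, counts_b):
    """Abundance-based Bray-Curtis dissimilarity between two sites.

    counts_a, counts_b: dicts mapping species name to count.
    Computed as sum(|a_i - b_i|) / sum(a_i + b_i) over all species.
    """
    species = set(counts_a) | set(counts_b)
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        raise ValueError("both sites have zero total abundance")
    diff = sum(abs(counts_a.get(s, 0) - counts_b.get(s, 0)) for s in species)
    return diff / total
```

With hypothetical point counts such as `{"sp1": 3, "sp2": 1}` at one site and `{"sp2": 1, "sp3": 4}` at another, both functions return a value between 0 (identical composition) and 1 (no species shared). Jaccard uses only which species are present, whereas Bray-Curtis also reflects how many individuals of each species were counted.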