Not all cities are the same: variation in animal phenotypes across cities within urban ecology studies

The sustained expansion of urban environments has been paralleled by an increase in the number of studies investigating the phenotypic changes of animals driven by urbanization. Most of these studies have been confined to only one urban center. However, as the types and strength of anthropogenic stressors differ across cities, a generalizable understanding of the effects of urbanization on urban-dwelling species can only be reached by comparing the responses of urban populations from the same species across more than one city. We conducted phylogenetic meta-analyses on data for animal species (including both invertebrates and vertebrates) for which measures of any morphological, physiological, or behavioral trait were reported for two or more cities. We found that morphological, physiological and behavioral traits of urban animals all differ similarly across cities, and that such phenotypic differences across cities increase as more cities are investigated in any given study. We also found support for phenotypic differences across cities being more pronounced the farther away cities are from each other. Our results clearly indicate that separate urban populations of the same species can diverge phenotypically, and support previous pleas from many researchers to conduct urban studies across several urban populations. We particularly recommend that future studies choose cities in different biomes, as urban adaptations may differ substantially in cities sited in different ecological matrices. Ultimately, a generalized knowledge about how organisms are affected by urbanization will only be possible when comprehensive biological patterns are similarly studied across separate and distinct cities.


Introduction
The exponential growth of the human population and the increasing percentage of humans moving into urbanized areas have led to a sustained expansion of urban environments (United Nations 2018). Urban environments are ecologically different from the non-urbanized environments in which many species have evolved (Grimm et al. 2008). Consequently, urban populations of different species are exposed to anthropogenic stressors that can drive phenotypic change. Most studies investigating such changes have been confined to only one urban center, normally the city in which the researchers are sited, which is then compared to adjacent natural areas (Bonier 2012; Fidino et al. 2021; Johnson and Munshi-South 2017).
However, several authors have repeatedly raised the need to compare the phenotypic responses of urban species across several cities because the types and strength of anthropogenic stressors among cities are not equal (Bonier 2012; Donihue and Lambert 2015; Fidino et al. 2021; Magle et al. 2019; Ouyang et al. 2018; Rivkin et al. 2019). Comparing the responses of urban populations from the same species across more than one city can offer a generalizable understanding of the effects of urbanization on urban species (Fidino et al. 2021). Studying several cities is equally important to determine if any species has developed different adaptive responses to urbanization in different cities (i.e., different selection pressures), or it can allow us to establish patterns of convergent evolution associated with urbanization (Rivkin et al. 2019). Indeed, it is unclear whether species' responses to urbanization are consistent across different cities. For example, similar genetic changes underlying neural function and development in great tits (Parus major) occurred across multiple European cities (Salmón et al. 2020), whereas patterns of thermal tolerance under urbanization in an acorn ant species differed across three large US cities (Diamond et al. 2018).
Whether we should predict species responses to differ or not across cities depends on whether we consider different cities to be ecologically homogeneous or not. Several authors have argued that urbanization leads to homogeneous habitats globally, even across major climatic regions, as all cities are designed similarly to meet the needs of humans (Groffman et al. 2014; McKinney 2006). If different cities are replicates of the same type of environment, we should expect to observe little phenotypic differentiation across cities. Alternatively, separate cities can be considered to differ substantially from each other due to differences in many important parameters, such as size, age, growth pattern, land-use legacies, policies on urban planning, zoning, socioeconomic development, local and national culture, human population density, climate, latitudinal location, topography, habitat structure, water availability, levels of different types of pollution, control of urban wildlife, and levels of biodiversity in the region (Evans et al. 2009b; Miles et al. 2021; Ouyang et al. 2018). Thus, despite different cities sharing some similar landscapes, the combination of the abovementioned parameters should lead to very different conditions for the animals living in those different cities (Winchell et al. 2022). If different cities are considered as distinct urban environments instead of replicates of the same type of urban environment, we should predict significant phenotypic differences to arise across urban environments in separate cities. We should also predict across-city phenotypic differences to be more pronounced the more cities are compared in a study. Moreover, as the geographic distance between cities within a study increases, we might also predict that phenotypic differences should be more pronounced because cities that are farther apart may diverge more in abiotic factors such as those associated with climate.
Phenotypic differentiation across cities may occur due to adaptation, non-adaptive genetic changes, epigenetic effects, or phenotypic plasticity (Johnson and Munshi-South 2017; Lambert et al. 2021; Liker 2020). In most urban studies, the mechanism(s) underlying phenotypic changes between urban and non-urban populations is unresolved (Lambert et al. 2021). However, there is ample evidence about the broad number of phenotypic traits involved, including an array of behavioral, physiological and morphological traits affected by urbanization (Liker 2020; Ouyang et al. 2018; Putman and Tippie 2020). What remains unclear is whether certain types of phenotypic traits are affected sooner (i.e., are altered more quickly) or more intensely by urbanization. Some authors have argued that behavioral and physiological traits may change more than morphological traits in response to urbanization, partly because behavioral and physiological traits can be plastic at different life stages including adulthood (as mentioned above, these plastic changes may not necessarily involve local adaptation to urban conditions), whereas the plasticity of most morphological traits may be restricted to developmental phases (Crispo et al. 2010; Evans et al. 2010).
Here we conducted meta-analyses to determine if the phenotypes of animals are consistently different across cities (whether urbanization generally alters animal phenotypes). We focused only on animals to assess the potentially different effect of urbanization on morphological, physiological and behavioral traits. We collected data for any animal species (including both invertebrates and vertebrates) for which measures of any morphological, physiological, or behavioral trait were reported for two or more cities. We addressed seven questions: (i) whether the phenotype of urban animals differs across cities, regardless of the type of phenotypic trait or the number of cities investigated; (ii) whether across-city phenotypic differences may be restricted to some types of phenotypic traits (i.e., morphological, physiological or behavioral traits); (iii) whether phenotypic differences across cities increase as more cities are investigated; (iv) whether choosing cities based on any a priori differences between them (e.g., latitude or climatic differences) results in higher phenotypic differentiation between those cities; (v) whether phenotypic differentiation across cities increases as the geographical distance between cities increases; (vi) whether phenotypic differences across cities are more pronounced the more cities differ in human population size or density; and (vii) whether any observed patterns across all taxa are maintained when restricting the analysis to smaller taxonomic groups (birds, invertebrates and reptiles).
Our approach will elucidate whether cities within studies on phenotypic responses to urbanization in animals generally can act as replicates of each other (i.e., phenotypes show little differentiation between or among cities) or whether certain factors (e.g., number of cities studied, geographic distance between cities, differences in human population size or density) contribute to more or less differentiation in animal phenotypes among cities. Furthermore, our analyses will determine whether the degree of phenotypic change is more pronounced for certain types of traits (e.g., behavioral vs. morphological traits) and/or within certain taxonomic groups. Overall, our results will help inform the design and interpretation of urban ecology studies on animals.

Data collection
We started our literature search on 4th May 2020 with previously collected papers on urban ecology, selecting 2,102 papers that contained "cities" anywhere in the text. That same day we performed a search in Web of Science (SCI-Expanded; accessed through the IRIS Consortium of Irish University and Research Libraries), using the terms "urban*" AND "cities" under Topic. Search words with an asterisk allow for different forms of a word to appear in the search results (e.g., the term urban* searched publications containing the words urban, urbanised, urbanized, urbanisation, urbanization, etc.). We obtained 136,200 results, but selected only 4,604 results under the following Web of Science categories that were pertinent: "Ecology", "Zoology", "Biology", "Entomology", "Evolutionary Biology", "Ornithology", "Reproductive biology", "Physiology", "Anatomy & Morphology", "Biodiversity Conservation", "Endocrinology & Metabolism", and "Psychology Biological". On 8th May 2020 we performed two additional searches in Web of Science, one with the terms "urban*" AND "multi-city" producing 119 results (all categories considered), and another one with the terms "urban*" AND "multicity" producing 19 results (all categories considered). After removing duplicate results and irrelevant papers (non-animal studies) we had 2,800 results. From these, we considered 268 studies on any phenotypic trait in any animal species sampled in two or more cities. Citations from those 268 studies led us to consider 5 further studies.
On 18th March 2021, we collected all the studies that had cited any of the previous 273 studies that we considered relevant, i.e., studies sampling animals in two or more cities or reviews that mentioned the importance of collecting data across cities when investigating urban populations. For this we used Web of Science (or Scopus if the cited study was not included in Web of Science). Before any filtering, this search produced 3,752 results, of which 275 were results we had not previously considered. Citations in these 275 studies led us to consider 6 further studies.
On 7th May 2021, we made a new search in Web of Science for papers that were published in 2020 and 2021. The combination of terms "urban*" AND "cities" produced 11,587 results. Selecting results from the categories "Ecology", "Zoology", "Biology", "Entomology", "Evolutionary Biology", "Biodiversity Conservation", "Multidisciplinary Sciences", "Physiology", "Ornithology", "Toxicology", "Environmental Studies", and "Urban Studies" reduced the number of results to 3,125. We also made a search with the terms "urban*" AND "multi-city" (18 results), and "urban*" AND "multicity" (4 results). After removing duplicates and irrelevant studies, we considered 52 studies, of which only 11 included measurements in more than one city.
Even though significant differences in phenotypic traits have been found in humans living in different cities, e.g., involving sperm quality (Auger et al. 2001; Swan et al. 2002, 2003), we did not include humans in our study, as humans have the ability to move across cities, and it is thus not possible to know whether individuals have moved in and out of cities. We did not consider studies that only reported genetic data or biodiversity estimates (e.g., species richness or evenness). We collected measurements for any morphological, physiological or behavioral trait for which the sample size in each city was at least 5. If values were reported for both juveniles and adults, we only used data from adults. If values were reported separately for males and females and they were within 10% of each other, we combined both sets of data by calculating the weighted means and the weighted standard deviations. If values for one sex were more than 10% higher than those of the other sex, we used data from the sex with the highest mean value. If standard errors of the mean were reported, we estimated the standard deviation by multiplying the standard error by the square root of the sample size.
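The two data-handling rules above (pooling male and female summaries, and converting a standard error to a standard deviation) can be sketched in Python. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the pooling formula is the standard one for merging the summary statistics of two groups, assumed here because the text does not state the exact formula used.

```python
import math


def se_to_sd(se: float, n: int) -> float:
    """Recover a standard deviation from a standard error of the mean."""
    return se * math.sqrt(n)


def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Merge two groups' (n, mean, SD) summaries into one.

    Assumed formula: a sample-size-weighted mean, and an SD that adds the
    between-group spread to the two within-group sums of squares.
    """
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    ss_between = n1 * n2 / n * (mean1 - mean2) ** 2
    sd = math.sqrt((ss_within + ss_between) / (n - 1))
    return n, mean, sd
```

Under this assumed formula, the combined SD equals the SD that would be obtained from the raw male and female measurements taken together, which is why it is a common choice when only summary statistics are available.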
For any appropriate study in which the reported results for the urban sites from separate cities were not sufficient to calculate effect sizes, we contacted the corresponding author and requested that information.
From each appropriate study, we compiled the mean, standard deviation and sample size from two cities. From studies in which data were available from three or more cities, we selected the two cities with the smallest and the greatest means for each trait. If two separate studies measured the same trait for the same species and in the same cities, we selected the study with larger sample sizes (this led to the removal of only 9 entries in our dataset; see Online Resource 1, Table S1). We also included the following information in the dataset: (1) the type of trait measured (behavior, physiology, morphology); (2) whether or not there was an a priori expectation in trait differences between cities (i.e., whether the authors selected the cities due to some intrinsic difference between those cities; this was a yes/no variable); (3) the number of cities compared in each study; (4) the geographical distance between any two comparison cities (in km), calculated using an online calculator (https://www.distancefromto.net); and (5) the human population size and density for each city. We used the human population and population density information provided in the respective studies. Otherwise, we determined the human population and population density for each city as close as possible to the sampling year. If information about sampling time was not provided by the authors, we chose the year previous to publication to estimate population size and density. If different population values were given for the same city (e.g., for the city proper and for the metropolitan area), we chose the larger value.

Statistical analyses
We calculated the standardized mean difference (SMD) in phenotype values between the cities as Hedges' g (Hedges 1981). This measure of effect size is appropriate when the dataset contains means with opposing signs. We calculated Hedges' g so that larger values indicate a greater difference between the smallest and largest mean phenotype of the two cities compared. The higher the value of any Hedges' g, the more different the phenotypic trait was between the two compared cities. Hedges' g values are included in the dataset (see Online Resource 2).
To determine whether the overall effect size is different from zero, we ran a random effects meta-analytic model with no moderators using the rma.mv function in the metafor package for R (Viechtbauer 2010) (R version 4.1.1). We added weights to this model through the argument weights = 1/vi, with vi representing the variance around each effect size. Adding weights is more conservative and is more robust to publication bias (Henmi and Copas 2010). In this model, we also accounted for non-independence among effect sizes by including various random factors. We included paper id and effect size id (each different effect size has its own id) as random factors to account for between-study effects and within-study effects, respectively. We also added phylogeny (as a correlation matrix) to control for potential non-independence from phylogenetic relatedness of species. We used Mesquite v.3.6 (build 917) for the phylogenetic reconstruction, combining information from different sources to resolve the following relationships: Bombus species (Arbetman et al. 2017); squamates (Watanabe et al. 2019); birds (delBarco-Trillo 2018); and Zosteropidae in relation to other Passeriformes (Cai et al. 2019).
We also tested the effects of various moderators on model heterogeneity. We were interested in the effects of 6 moderators: (1) the type of trait measured (behavior, physiology, morphology), (2) whether or not there was an a priori expectation in trait differences between cities (i.e., whether the authors selected the cities due to some intrinsic difference between those cities), (3) the number of cities in the study, (4) the distance between the two comparison cities, (5) the absolute difference in human population density between the two comparison cities, and (6) the absolute difference in human population size between the two comparison cities. Because we had various explanatory moderators, we used an information-theoretic approach to select the most informative model, or set of models, that best explained heterogeneity (Burnham and Anderson 2002). For this, we used the glmulti package in R (Calcagno and de Mazancourt 2010). We compared models that contained none, one, and up to six (i.e., all) of our moderator variables using AICc values. For this process, we had to fit various random/mixed-effects meta-regression models using maximum likelihood estimation (instead of REML) because log-likelihoods are not directly comparable for models with different fixed effects. We solely compared models with main effects only, and we included the same random factors as above (paper, effect size id, and phylogeny). We selected as the "best" models those with the lowest AICc values, i.e., those within 2 units of the lowest AICc value. For each model, we also calculated the model weight, which represents the probability that the model is the best model. Finally, for each model factor (moderator), we calculated model-averaged parameter estimates, which are weighted averages of the model coefficients across all potential models, and we calculated the relative importance by taking the sum of the weights (probabilities) for the models in which the factor appeared.
To determine whether the taxon studied affected the above results, we performed subgroup analyses by running separate meta-analytic models for individual taxonomic groups. We could only do this for birds, invertebrates, and reptiles as these animal groups were well represented in our dataset (birds: 41 species and 168 effect sizes; invertebrates: 9 species and 26 effect sizes; reptiles: 4 species and 43 effect sizes) compared to the other taxonomic groups (amphibians: 1 species and 4 effect sizes; and mammals: 4 species and 10 effect sizes). For these subgroup analyses we used the same approaches as above, including the model without moderators (to find the overall effect size) and the model selection process to determine which factors were most important at explaining the model results.
Publication bias, which primarily looks for whether small studies with small effect sizes are missing from the dataset, was evaluated using funnel plots and Egger's test for asymmetry (Borenstein et al. 2009; Egger et al. 1997). We also used the trim-and-fill method (Nakagawa and Santos 2012) to estimate the number of small studies missing and to estimate what the actual effect size would be had these studies been published and included in the analysis.

Overall meta-analysis
Upon analyzing heterogeneity among 251 effect sizes, the overall meta-analytic mean from the multilevel random effects model was significantly different from zero (estimate = 0.653, 95% CI = 0.146-1.159, Z = 2.525, P = 0.012). Thus, the difference between cities in phenotypes is on average about 0.65 standard deviations. We also found significant variation in effect sizes (i.e., heterogeneity) that is not accounted for by sampling variance (I² = 90.22, Q = 1645.44, df = 250, P < 0.001). Approximately 90% of the total variance was due to heterogeneity: phylogeny accounted for approximately 32%, paper id for 24%, and effect size id for 34% of the total variance.

Effects of moderators
From 64 potential models, we identified three that were more than 2 information criteria units lower than all other models, but within 2 units of each other (see Online Resource 1, Table S2, Figure S2). The top model (AICc = 604.10, weight = 0.335) included type of trait, number of cities, distance between cities, and the absolute difference in human population density as moderators. The second-best model (AICc = 604.75, weight = 0.242) included the same moderators in addition to the moderator of a priori expectation. The third-best model (AICc = 606.08, weight = 0.124) included number of cities, distance between cities, and the absolute difference in human population density, but its model weight, or probability of being the best model, was less than half that of the top model. Here, we will report the results of the top model (Online Resource 1, Table S3) and provide results on the other models in Online Resource 1 (Tables S4-S5).
Based on the model selection results, we reran the phylogenetic meta-analysis using the REML estimation method. We found significant heterogeneity with I² = 82.82 (Q = 1012.33, df = 245, P < 0.001). Of the total heterogeneity, approximately 11% was attributed to phylogeny, 14% was attributed to paper id, and 57% was attributed to effect size id. The test of moderators (omnibus test of all model coefficients except for the intercept) was significant (Q = 19.33, df = 5, P = 0.0017). The number of cities was the only significant moderator, with more cities in a study contributing to a greater difference between phenotypes (estimate = 0.164, 95% CI = 0.070-0.258, Z = 3.421, P = 0.0006, Fig. 1). The distance between cities was also marginally significant in the top model (estimate = 0.0002, 95% CI = 0.000-0.0003, Z = 1.769, P = 0.077). Although the type of phenotypic trait and the difference in human population density between cities were identified as important moderators through our model selection process, they were not significant in the best model (see Online Resource 1, Table S3) nor in the second or third best models (Online Resource 1, Tables S4-S5).
Fig. 1 Forest plots showing the point estimates (standardized mean difference as Hedges' g) and their 95% confidence intervals for each study (effect size id listed on y axis) in our dataset. The estimates are ranked and color-coded by number of cities. We observed more phenotypic differences across cities (larger effect sizes) the more cities that were included in the study
Performing multimodel inference to determine the importance of the various moderators across all models, we found that number of cities, distance between cities, and human population density had the highest importance values (which represent the sum of the weights for the models in which the variable appears) with values of 1.00, 0.96, and 0.85 respectively (Online Resource 1, Table S6), but number of cities was the only moderator that reached statistical significance (P < 0.001).
Model selection revealed the importance of number of cities as a predictor variable for both birds (importance value = 0.99, P = 0.018) and invertebrates (importance value = 0.99, P < 0.001), but not for reptiles (importance value = 0.39, P = 0.507; Fig. 2; Online Resource 1, Table S6). For reptiles, the distance between cities was ranked as having the highest importance (value = 0.81, P = 0.143; Fig. 2; Online Resource 1, Table S6). The geographical distance between cities was also consistently ranked highly across all models, being the second most important predictor for the full dataset (importance value = 0.96, P = 0.073), for birds only (importance value = 0.97, P = 0.011), and for invertebrates only (importance value = 0.18, P = 0.636), and the most important predictor for reptiles only (importance value = 0.81, P = 0.143). However, distance was negatively related to phenotypic differences between cities in reptiles; for every one-unit increase in distance between cities, the standardized mean difference in phenotypes decreases by 0.0008 (Table 1). This pattern is the opposite of what we found in the other taxonomic subgroups, in which increasing distance between cities led to more phenotypic differentiation.
Fig. 2 [...] for each factor is equal to the sum of the weights/probabilities for the models in which the variable appears. The red line at 0.8 is often used as a cutoff to determine the most important variables
For birds, the best model, with the lowest AICc value, contained the predictors of number of cities, distance between cities, and difference in human population density, each of which significantly explained model heterogeneity (Q = 631.46, df = 164, P < 0.001; Table 1). This is similar to the model containing all species, which is not surprising, as approximately 67% of the effect sizes in our study are accounted for by bird species. For invertebrates, the best model only contained number of cities as a predictor, and this also significantly explained model heterogeneity (Q = 68.28, df = 24, P < 0.001; Table 1). For reptiles, the best model contained trait and distance between cities as predictors (Q = 114.02, df = 40, P < 0.001; Table 1). Morphological traits had, on average, a standardized mean difference between cities that was 0.775 lower than that of behavioral traits, i.e., the average mean difference in behavioral phenotypes between cities is larger than that of morphological phenotypes (as we predicted). However, this result should be taken with caution as behavioral estimates are based on a single lizard species (Anolis cristatellus) across only two studies. There were no physiological traits in the dataset for reptiles.
Within the top model for birds, phylogeny accounted for approximately 22%, paper id for 9%, and effect size id for 49% of the total variance. Within the top model for invertebrates, phylogeny accounted for approximately 41%, paper id for 15%, and effect size id for 16% of the total variance. Within the top model for reptiles, phylogeny accounted for approximately 0%, paper id for 19%, and effect size id for 51% of the total variance.

Publication bias
Our funnel plot for the meta-analysis without moderators showed significant asymmetry (Egger's test: z = 2.2992, P = 0.022; Online Resource 1, Figure S3) with small studies with large effect sizes being more likely to be published than small studies without significant or large effects. Using the trim-and-fill method, we found that the number of missing studies was 93 (out of 251) and the corrected model estimate (overall effect size) was 0.428 (95% CI: −0.0356 to 0.8911), which is smaller than our original estimate of 0.653 and failed to be significantly different from zero effect at α = 0.05 (Z = 1.8095, P = 0.070). However, it must be noted that the missing studies estimated by the trim-and-fill method had negative effect sizes, but our approach to calculate effect sizes (using the difference between the smallest and largest phenotype between cities) could only generate positive effect sizes.

Discussion
We compared different types of phenotypic traits in urban populations of invertebrate and vertebrate species across separate cities. Our main result is that the phenotype of urban animals differs across cities, regardless of the type of phenotypic trait investigated, and this was the case when we considered all taxa together, and when we considered separately birds or invertebrates. We also found that phenotypic differences across cities are more pronounced the more cities are investigated and the farther away cities are from each other (except for our analyses on reptiles).
Although there have been many recent studies investigating phenotypic changes across cities, it must be noted that in the majority of those studies, the focus was on rural-urban comparisons, with the different cities simply providing replicates for those rural-urban comparisons (Evans et al. 2009b; Potvin and Parris 2012; Slabbekoorn and den Boer-Visser 2006; Tyler et al. 2016). Indeed, in some of these studies any potential phenotypic differences across urban populations are neither reported nor discussed (Eggenberger et al. 2019). In a review considering parallel evolution in cities (i.e., whether rural-urban comparisons in different cities show consistent and similar responses driven by urbanization), parallelism was exhibited in only 44% of species across all the cities studied (Santangelo et al. 2020). Even in cases when parallelism across urban-rural comparisons exists, there may be significant differences in phenotypic traits across urban populations, as the changes taking place may be higher in some urban populations than in others. But if episodes of non-parallelism are predominant, in which phenotypic traits increase in some urban populations compared to the rural population, but decrease in some others, then substantial differences across urban populations should be expected, and this is confirmed by our results. The emergence and increase of phenotypic differences across urban populations is further exacerbated by the fact that rates of phenotypic change are much higher in urban areas than in natural contexts (Alberti et al. 2017; Hendry et al. 2008).
Phenotypic differences across urban populations may be due to many reasons: adaptation (Lambert et al. 2021; Winchell et al. 2022); phenotypic plasticity (Bressler et al. 2020; Thompson et al. 2018); decreased gene flow, and founder effects, i.e., stochastic differentiation following separate colonizations by different subpopulations in different cities (Evans et al. 2009b); genetic drift, a nonadaptive, genome-wide process that could lead to random phenotypic differentiation across urban populations (Mueller et al. 2020); and hybridization between native and non-native species, which may potentially increase the distinctiveness of phenotypic traits across cities (Beninde et al. 2018). In the majority of studies in our dataset, the processes involved in any phenotypic differences across urban populations were not investigated, and thus we were not able to determine their relative roles either. We also did not include studies investigating only genetic differences in separate cities, as we could not calculate effect sizes as we did for the phenotypic traits. However, many recent genomic studies have addressed the existence of genetic differentiation across cities. For example, a study on rat populations across four cities, including temperate, subtropical and tropical cities, showed similar genetic diversity across cities but different patterns of gene flow depending on city-specific barriers separating subpopulations within each city (Combs et al. 2018); and a study on bumblebees in nine German cities found in some loci a high degree of genetic differentiation associated with urbanization (Theodorou et al. 2018).
In our models, the most consistent moderator explaining phenotypic differences across cities was the number of cities investigated: the more cities were included in a study, the larger the difference between the smallest and largest mean urban phenotype reported in that study (i.e., a higher standardized mean difference). This was the case for the models containing all taxa, and for models with only birds and only invertebrates, but it was not the case for the models with only reptiles, although this may have been due to the fact that the variation in the number of cities was relatively small in our considered reptile studies (range = 2-5 cities; average = 3.2 cities). However, overall, the more cities for which data from a phenotypic trait were available, the greater the difference was in that phenotypic trait across urban populations. This result supports the idea that separate urban populations of the same species may diverge phenotypically instead of changing in a parallel fashion. Our result also highlights the importance of studying urban populations in many cities, as some biological patterns may only become apparent when doing so. For example, only by studying bird and plant biodiversity across many cities could researchers determine that the density of species was more affected by urban characteristics (e.g., landcover and city age) than by non-anthropogenic factors such as climate and geography (Aronson et al. 2014).
We also found that a greater geographical distance between cities is likely to lead to greater phenotypic differentiation across urban populations. This positive association was the case for the models containing all taxa, and for the models with only birds, but not for the models with only invertebrates (no association) or only reptiles (negative association). Such a difference in the case of invertebrates and reptiles may be due to the fact that geographical distances between studied cities tended to be smaller for invertebrates (range = 22.12-645.79 km; average = 124.13 km) and reptiles (range = 17.4-1661.66 km; average = 162.77 km) than for birds (range = 12.31-9489.13 km; average = 844.68 km). It is also possible that for many invertebrate species distances between cities are magnified compared to birds and reptiles, and that thus there is a smaller distance threshold beyond which any further distance between cities has a superfluous effect. As for reptiles, we found that the difference between phenotypes was greater as distance between cities decreased (for each one-unit increase in distance between cities, the standardized mean difference in phenotypes decreased by 0.0008). However, this result should be taken with caution, as 93% of effect sizes were associated with small distances between cities (average = 78.61 km), whereas the remaining 7% of effect sizes (amounting to only 3 effect sizes) were associated with much larger distances (average = 1284.87 km).
A greater geographical separation between cities not only minimizes the occurrence of gene flow but can also maximize abiotic differences between those two cities, e.g., related to latitude and climate conditions. Additionally, small distances between cities will promote a leapfrog process of urban colonization, in which new urban populations are not established by colonizers from adjacent rural populations but by colonizers from urban populations in nearby cities (Evans et al. 2009a, 2010). Cities that are close together in which urban populations were established via a leapfrog process should be more phenotypically similar compared to separate urban populations that were independently established from their respective adjacent rural populations. However, even in species in which the leapfrog process of colonization is at play, separate urban populations will have traversed separate evolutionary paths since their establishments in the different cities (assuming there is [...]
[...] the importance of measuring traits across several cities. When cities are selected so that they differ in some ecological feature (e.g., in relation to latitude, or biome), researchers can concurrently study the effects of urbanization and other ecological factors. This allows researchers to tackle questions such as the effects of urbanization in different ecoregions (e.g., temperate, desert, and tropical cities), or how the combined effects of urbanization and climate change may affect populations differently in separate cities. At the other extreme, if the selected cities are very close together and very similar in many aspects, one minimizes the likelihood of observing major phenotypic differentiation between any two urban populations (Sparkman et al. 2018), which may provide an interesting system to perform experimental approaches that require starting with similar phenotypes.
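The information-theoretic quantities used for model selection under Statistical analyses (model weights and relative importance) can be illustrated with a short Python sketch. It is not the glmulti implementation: the function names and moderator labels are hypothetical, and only the three top-ranked models reported under Effects of moderators are used, so the weights are renormalized over three models rather than all 64 and will not equal the reported weights. Their ratios, however, should agree with the reported ones.

```python
import math


def akaike_weights(aicc_values):
    """Convert AICc values into model weights that sum to 1."""
    best = min(aicc_values)
    rel_likelihood = [math.exp(-(a - best) / 2) for a in aicc_values]
    total = sum(rel_likelihood)
    return [r / total for r in rel_likelihood]


def relative_importance(models, weights):
    """Sum the weights of the models in which each moderator appears."""
    importance = {}
    for terms, w in zip(models, weights):
        for term in terms:
            importance[term] = importance.get(term, 0.0) + w
    return importance


# AICc values and moderator sets of the three top-ranked models in the text
# (labels are illustrative, not the variable names used by the authors).
aicc = [604.10, 604.75, 606.08]
models = [
    {"trait", "n_cities", "distance", "density_diff"},
    {"trait", "n_cities", "distance", "density_diff", "a_priori"},
    {"n_cities", "distance", "density_diff"},
]
weights = akaike_weights(aicc)
importance = relative_importance(models, weights)
```

A moderator that appears in every model considered, such as the number of cities here, receives an importance of 1 by construction, which is consistent with the value of 1.00 reported for that moderator.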
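The geographical distance between comparison cities described under Data collection was obtained from an online calculator. As a rough, reproducible alternative, a great-circle distance can be computed from city coordinates with the haversine formula. The sketch below is an assumption about how such a distance could be derived, not a description of what the calculator does; the function name and the Earth radius constant are choices made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Great-circle distances treat the Earth as a sphere, so values may differ slightly from those of calculators that use an ellipsoidal model.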
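The effect-size calculation described under Statistical analyses can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' code (the analyses were run with metafor in R): the function name is hypothetical, the small-sample correction uses the common approximation 1 - 3/(4df - 1), and the sampling variance uses a standard large-sample formula. The absolute difference between the two city means is taken so that, as in the text, only non-negative effect sizes arise.

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (g, variance) for the absolute difference between two city means."""
    df = n_a + n_b - 2
    pooled_sd = math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df)
    d = abs(mean_a - mean_b) / pooled_sd  # larger mean minus smaller mean
    correction = 1 - 3 / (4 * df - 1)     # approximate small-sample correction
    g = correction * d
    variance = 1 / n_a + 1 / n_b + g ** 2 / (2 * (n_a + n_b))
    return g, variance


# Example: the same trait measured in two cities, 10 individuals each.
g, vi = hedges_g(12.0, 2.0, 10, 10.0, 2.0, 10)
weight = 1 / vi  # inverse-variance weight, as in the argument weights = 1/vi
```

Because the difference is taken as an absolute value, the order in which the two cities are passed does not affect the result.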
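The overall estimate reported under Overall meta-analysis can be checked for internal consistency with a few lines of arithmetic: assuming a Wald-type (normal) test, the standard error follows from the estimate and its Z value, and the confidence interval and P value follow from the standard error. The Python sketch below makes that assumption explicit; the function name is hypothetical, and small discrepancies from the reported bounds are expected because the published values are rounded.

```python
import math


def wald_summary(estimate, z, critical_z=1.959964):
    """Back-calculate SE, 95% CI and two-sided P from an estimate and its Z."""
    se = estimate / z
    lower = estimate - critical_z * se
    upper = estimate + critical_z * se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return se, lower, upper, p


# Reported values: estimate = 0.653, Z = 2.525.
se, lower, upper, p = wald_summary(0.653, 2.525)
```

With these inputs the back-calculated interval and P value agree with the reported 95% CI of 0.146-1.159 and P = 0.012 to within rounding.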
Our results clearly indicate that separate urban populations of the same species can diverge phenotypically, and that this is the case for any phenotypic trait, no matter if it is morphological, physiological or behavioral.In principle, there seem to be two opposing views on whether the responses of animals to urbanization should be consistently similar or dissimilar across cities.First, if several cities under investigation are considered to be similar replicates of the same type of environment, we would predict to find more episodes of convergence than of divergence regarding phenotypic traits, especially when phenotypic differentiation is mostly driven by phenotypic plasticity.Second, if different cities are ecologically distinct (Santangelo et al. 2020), we would expect to find phenotypic differences across them (Ouyang et al. 2018;Thompson et al. 2016), as we did in our meta-analysis.This is likely to be the case the more cities are investigated and the farther apart cities are, which is also mostly supported by our results.As already mentioned, the fact that evolution rates are higher in urban areas than in any other type of environment (Alberti et al. 2017) means that even small differences among cities can lead to measurable phenotypic differentiation across them.Cities can also be highly stochastic, regularly disturbed, and thus variable over time (Sattler et al. 2010).That is, replication may not only be important at the spatial scale (different cities), but also at the temporal scale (populations being studied over time).
In conclusion, most studies on urban ecology have been restricted to one urban center, with researchers tending to conduct studies only in the city in which they live.However, our results support previous pleas from many researchers to conduct urban studies across several urban populations.Those different urban populations would not necessarily act as replicates, as our analysis shows that phenotypic differentiation increases as the more cities are investigated.One approach to implement multi-city studies is by establishing a long-term network of research partners located across little gene flow between them), and phenotypic differences may have still arisen across cities, in this case being greatly determined by the age of those cities and thus the age of the different urban populations.Differences in the human population densities (a proxy of city size) between the compared cities did not have an effect on the degree of across-city phenotypic differences in the models considering all data, only invertebrates, or only reptiles.However, we found a surprising effect in the case of birds, with the difference in phenotypes between cities being smaller as the difference in population densities increased, although this effect was relatively small (estimate = -0.001).In principle, phenotypic differentiation is likely to be higher in larger cities than in smaller cities.For example, gene flow between rural and urban populations may be more important in smaller cities as the distance between rural and urban populations is reduced (Santangelo et al. 2020).Larger cities will also provide more opportunities for population structuring, with more subpopulations within a city possibly diverging phenotypically from one another (Johnson and Munshi-South 2017).However, whether city size by itself is a main driver of phenotypic differentiation across cities remains unclear.
We predicted that morphological traits would be more similar across cities compared to physiological traits, and especially compared to behavioral traits.The reason for this prediction is that morphological traits are generally set at maturity, whereas physiological and behavioral traits can be more plastic at different life stages including adulthood.However, our study does not support this prediction.The overall meta-analyses including moderators did not show significant differences between the types of traits.And the same was the case for the subgroup analyses, with the exception of reptiles.We did find more differentiation in behaviors in reptiles than in morphological traits (there were no physiological traits in the dataset), but behaviors were represented by only two studies on a single species.Although we cannot provide a robust explanation for the lack of significant differences between the three types of traits, it must be noted that there was a high degree of variation within each type of traits in our dataset, e.g.behavioral traits included such various traits as the spiderweb surface in a spider species, alarm calls in birds, and the velocity on different surfaces in a lizard species.
Phenotypic differences observed between pairs of cities were similar in cases in which cities were selected by researchers due to some intrinsic difference between those cities (e.g., latitude or city size), and in cases in which the researchers did not mention any a priori differences between the cities.The fact that phenotypic differences between separate urban populations exist even when comparing cities that are not clearly different from one another emphasizes many cities (Magle et al. 2019).We also recommend that future studies assess comprehensive sets of traits, as the degree of phenotypic differentiation across cities may vary in different traits (Santangelo et al. 2020).Using a comparative framework would also be important, because different species may have undergone different processes of adaptation to urban environments, given their different ecological requirements.Finally, we recommend that future studies choose cities in different biomes, as urban adaptations may differ substantially in cities sited in different ecological matrices, e.g.cities in desert or tropical regions.Ultimately, a generalized knowledge about how organisms are affected by urbanization will only be possible when comprehensive biological patterns are similarly studied across separate and distinct cities.

Fig. 2 The relative importance of model factors (terms) averaged across all possible models for (A) the full dataset, (B) birds only, (C) invertebrates only, and (D) reptiles only. The importance value (x-axis)

Table 1 Multivariate meta-analytic results of the top models (lowest AICc value) for each taxonomic group. The number of effect sizes is denoted