We obtained 967 effect sizes (nfemale = 505, nmale = 462), representing 64 species, extracted from 150 studies published between 1987 and 2019 (Fig. 1). Ornaments encompassed plumage (n = 850), bill (n = 61), eye (n = 4), and bare skin parts such as feet, gular skin, orbital ring, wattle, comb, and gape (n = 52). Traits included pigment- and structural-based colouration, and expression, size, and shape of ornaments.
The global meta-analytic means of the relationships between the degree of elaboration of ornaments and overall condition and fitness combined were positive for both females and males, and their 95% credible intervals did not overlap zero (females: Zr = 0.18 and 95% CI = 0.11 − 0.27, males: Zr = 0.21 and 95% CI = 0.10 − 0.32; model 1 in Fig. 2, Supplementary Tables S1.1 and S2.1). Males tended to have slightly larger effects than females, but the difference was not statistically significant (female Zr − male Zr = -0.03, 95% CI = -0.15 − 0.08). Effects from correlational studies and from experimental studies, in which condition or ornament expression was manipulated, did not differ significantly (910 vs 57 effect sizes, respectively; Zr study type = 0.02, 95% CI = -0.07 − 0.13, Supplementary Table S3) and, therefore, we did not consider this moderator any further.
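For readers less familiar with Fisher's z-transformed correlations (Zr), the following minimal sketch shows how the reported means could be back-transformed to the correlation scale. It assumes the standard Fisher transformation (Zr = atanh(r), with approximate sampling variance 1/(n − 3)); the function names are illustrative only and are not taken from the original analysis code.

```python
import math

def zr_to_r(zr: float) -> float:
    """Back-transform Fisher's z (Zr) to a correlation coefficient r."""
    return math.tanh(zr)

def r_to_zr(r: float) -> float:
    """Fisher's z-transformation of a correlation coefficient r."""
    return math.atanh(r)

def zr_sampling_variance(n: int) -> float:
    """Approximate sampling variance of Zr for a sample of size n."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    return 1.0 / (n - 3)

# Global means and 95% CI bounds reported in the text (mean, lower, upper).
reported = {
    "female": (0.18, 0.11, 0.27),
    "male": (0.21, 0.10, 0.32),
}

# Back-transform each value to the r scale (rounded for display).
back_transformed = {
    sex: tuple(round(zr_to_r(v), 3) for v in values)
    for sex, values in reported.items()
}

for sex, (mean_r, lower_r, upper_r) in back_transformed.items():
    print(f"{sex}: r = {mean_r} (95% CI {lower_r} to {upper_r})")
```

At effect sizes of this magnitude, Zr and r are nearly identical, so the back-transformation changes the reported values only slightly.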
Positive effects in both sexes were larger for condition parameters than for fitness parameters, as shown by model 2, which included a factor separating effects associated with condition or fitness parameters in interaction with sex (difference condition − fitness: ΔZr female = 0.09, 95% CI = 0.02 − 0.15; ΔZr male = 0.10, 95% CI = 0.04 − 0.17; model 2 in Fig. 2, Supplementary Table S1.2). The effects for condition parameters alone were positive (females: Zr = 0.22 and 95% CI = 0.15 − 0.30, males: Zr = 0.25 and 95% CI = 0.16 − 0.37). Males tended to have slightly stronger effects than females, but the difference was not statistically significant (female Zr − male Zr = -0.03, 95% CI = -0.15 − 0.08; model 2 in Fig. 2, Supplementary Table S1.2). The effects for fitness parameters alone were also positive, albeit weaker (females: Zr = 0.14 and 95% CI = 0.06 − 0.22, males: Zr = 0.15 and 95% CI = 0.04 − 0.27), with no statistically significant difference between sexes (female Zr − male Zr = -0.02, 95% CI = -0.14 − 0.09; model 2 in Fig. 2, Supplementary Table S1.2).
To determine whether different condition or fitness parameters show different associations with ornamentation, we analyzed effects for specific condition and fitness parameters separately. All specific condition parameters showed positive effects, and no difference between sexes was statistically significant. Only the association between ornamentation and parasites was non-significant for males, but this may be due to the relatively small sample size for this specific condition parameter (n = 9) and the large associated errors (model 3 in Fig. 2, Supplementary Tables S1.3 and S2.3). For fitness parameters, we found clear positive effects for reproductive success, offspring quality or condition, and timing of breeding, whereas effects for parental quality and survival were not significant, with 95% credible intervals substantially overlapping zero for both sexes. No differences between sexes were found for any of these effects (model 4 in Fig. 2, Supplementary Tables S1.4 and S2.4).
To investigate whether the degree of sexual dimorphism affects the strength of the association between ornament elaboration and condition or fitness, we tested whether effects corresponding to more sexually dimorphic ornaments show more marked differences between the sexes than effects for ornaments that do not differ much between females and males. We predicted that, as male-biased sexual dimorphism increases, male effects will become larger and more positive, while female effects will become weaker. This should translate into a statistically significant interaction between sex and sexual dimorphism.
We obtained information on ornament sexual dimorphism for 47 mutually ornamented species, and these data comprised 432 effect sizes. As expected, sexual dimorphism tended to be male-biased (Cohen’s d median = 0.6, and range = -1.96 − 11.13, n = 216, positive sign indicates more ornamented males); that is, males tended to have more elaborate ornaments than females. However, mean sex differences were not statistically significant (Cohen’s d meta-analytical mean = 0.70 and 95% CI = -0.34 − 1.70). For condition parameters we initially found the expected effect of sexual dimorphism (i.e., heightened condition-dependence towards male-biased traits in males and vice versa for females), although the 95% CI nearly encompassed zero (βsex:dimorphism = 0.04 and 95% CI = 0.0003 − 0.09, n = 214; Supplementary Table S1.7). This outcome was driven by one extremely dimorphic ornament (Willow Ptarmigan Lagopus lagopus comb size; Supplementary Fig. S2). Excluding the four effects related to this ornament from the analysis reduced the effect of sexual dimorphism, which became non-significant (βsex:dimorphism = 0.03 and 95% CI = -0.05 − 0.11, n = 210, Supplementary Table S1.5). For fitness parameters we found no significant effect of sexual dimorphism (βsex:dimorphism = -0.02 and 95% CI = -0.09 − 0.05, n = 218; no extreme values were identified, Supplementary Table S1.6). Thus, overall, variation in ornament sexual dimorphism did not affect the magnitude of the association between ornament elaboration and condition or fitness (Fig. 3, Supplementary Tables S1.5 and S1.6).
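To make the sign convention for sexual dimorphism concrete, the sketch below computes Cohen's d for a single ornament as the male mean minus the female mean, divided by the pooled standard deviation, so that positive values indicate more ornamented males. The pooled-SD form is an assumption, since the text does not state which variant of d was used, and the measurements are hypothetical illustration values rather than data from the study.

```python
import math
from statistics import mean, stdev

def cohens_d(male, female):
    """Cohen's d (male minus female) using the pooled standard deviation.

    Positive values mean males are more ornamented than females.
    Assumes the pooled-SD variant of d; the study may have used another.
    """
    n_m, n_f = len(male), len(female)
    pooled_var = (
        (n_m - 1) * stdev(male) ** 2 + (n_f - 1) * stdev(female) ** 2
    ) / (n_m + n_f - 2)
    return (mean(male) - mean(female)) / math.sqrt(pooled_var)

# Hypothetical ornament measurements (arbitrary units), for illustration only.
male_ornament = [5.1, 5.6, 6.0, 5.8, 5.5]
female_ornament = [4.9, 5.2, 5.0, 5.4, 5.0]

d = cohens_d(male_ornament, female_ornament)
print(f"Cohen's d (male - female) = {d:.2f}")
```

Swapping the two arguments flips the sign of d, which is why the direction of the contrast needs to be stated alongside the values.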
Random effects and heterogeneity
The random effects, phylogeny (range of variances across models: 0.05 − 0.20) and species ID (0.03 − 0.16), had only minor effects, although in general phylogenetic effects seemed to be more marked in males than in females (Supplementary Table S4). Covariation between female and male effects (expressed as correlation coefficients, r) tended to be stronger for fitness than for condition parameters, but in general had very broad credible intervals that overlapped zero (Supplementary Table S4). Heterogeneity (computed for the model with sex and publication year; model 1 in Fig. 2) was overall very high for females (I² = 0.85, 95% CI = 0.82 − 0.87) and males (I² = 0.83, 95% CI = 0.80 − 0.87). Heterogeneity for the phylogeny component was 0.06 (2.35 × 10⁻⁸ − 0.13) for females and 0.15 (1.30 × 10⁻⁹ − 0.31) for males, and for species ID it was 0.05 (2.35 × 10⁻⁸ − 0.13) for females and 0.04 (9.62 × 10⁻¹⁰ − 0.14) for males.
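As an illustration of how total heterogeneity can be partitioned among random-effect components, the sketch below computes I² for each component as its variance divided by the sum of all variance components plus a "typical" sampling variance. This follows a commonly used multilevel formulation and is an assumption about the general approach, not a reproduction of the study's calculation; the variance values and names are hypothetical.

```python
def typical_sampling_variance(sampling_variances):
    """Typical within-study sampling variance from inverse-variance weights.

    Uses s^2 = (k - 1) * sum(w) / (sum(w)^2 - sum(w^2)), with w = 1 / v.
    """
    k = len(sampling_variances)
    weights = [1.0 / v for v in sampling_variances]
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    return (k - 1) * sum_w / (sum_w ** 2 - sum_w2)

def i2_partition(variance_components, sampling_variances):
    """Return I^2 for each variance component and for their total."""
    s2 = typical_sampling_variance(sampling_variances)
    between = sum(variance_components.values())
    total = between + s2
    result = {name: v / total for name, v in variance_components.items()}
    result["total"] = between / total
    return result

# Hypothetical variance components and sampling variances, for illustration.
components = {"phylogeny": 0.02, "species": 0.01, "residual": 0.13}
sampling_vars = [0.04] * 5  # e.g. five effects each with n = 28 (1 / (n - 3))

i2 = i2_partition(components, sampling_vars)
for name, value in i2.items():
    print(f"I2 {name}: {value:.2f}")
```

Under this formulation, the component-specific values sum to the total I², which makes it straightforward to see how much of the heterogeneity is attributable to phylogeny or species identity.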
Publication Bias
Overall, publication bias did not seem to be particularly marked for either sex. Based on exploratory analyses of funnel plots, we found slight asymmetries (i.e., seemingly minimal publication bias; Supplementary Fig. S3), which were also supported by Egger’s tests revealing significant publication bias. Trim-and-fill analysis of potential publication bias identified only 0 to 2 missing data points in funnel plots across data sets, all of which were negative and corresponded to relatively mid-to-low powered effect sizes from female studies. Although these results contradicted the Egger’s tests, which suggested (negative) publication bias only for male studies, adjusting for missing samples resulted in minimal displacements of mean effect sizes that did not affect the conclusions (Supplementary Table S5). We did find evidence for time-lag bias (βpublication year = -0.0062 and 95% CI = -0.0096 − -0.0028; model 1, Supplementary Table S1.1), indicating that studies with larger or significant effects were published earlier than those with smaller or non-significant effects.
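The logic of Egger's test for funnel-plot asymmetry can be sketched as follows, assuming the classical formulation in which each standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error). An intercept far from zero suggests small-study effects. This is a simplified illustration with made-up numbers; it omits the standard error of the intercept needed for a significance test and does not account for the multilevel structure of the actual data set.

```python
def egger_regression(effects, std_errors):
    """Classical Egger regression: standardized effect ~ precision.

    Returns (intercept, slope). The intercept indexes funnel asymmetry and
    the slope estimates the underlying effect.
    """
    snd = [y / se for y, se in zip(effects, std_errors)]  # standardized effects
    precision = [1.0 / se for se in std_errors]
    n = len(snd)
    mean_x = sum(precision) / n
    mean_y = sum(snd) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, snd))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical standard errors, for illustration only.
std_errors = [0.05, 0.10, 0.15, 0.20, 0.25]

# Symmetric case: every study estimates the same effect regardless of precision.
symmetric_effects = [0.2 for _ in std_errors]

# Asymmetric case: less precise studies report systematically larger effects.
asymmetric_effects = [0.2 + 1.0 * se for se in std_errors]

print(egger_regression(symmetric_effects, std_errors))
print(egger_regression(asymmetric_effects, std_errors))
```

In the symmetric case the intercept is essentially zero, whereas in the asymmetric case it picks up the dependence of effect size on standard error while the slope still recovers the underlying effect.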