Sampling and phylogeny
To explore the morphological diversity (phenotype) across the Scorpiones order, we made morphological measurements on 61 species. We measured one specimen per species, spanning approximately 70% of extant families (14 of 20), five continents (North and South America, n = 20; Eurasia, n = 15; Africa, n = 24; Oceania, n = 2). The Scorpiones families are represented by the Chaerilidae (n = 1), Buthidae (n = 23), Iuridae (n = 1), Bothriuridae (n = 3), Chactidae (n = 5), Scorpiopiidae (n = 3), Euscorpiidae (n = 1), Troglotayosicidae (n = 1), Caraboctonidae (n = 2), Hormuridae (n = 8), Urodacidae (n = 1), Diplocentridae (n = 2), Scorpionidae (n = 5) and the Vaejovidae (n = 5). The ecomorphs included are those referenced in the literature as psammophilous (e.g., Apistobuthus pterygocercus and Vejovoidus longuiunguis), lithophilous (e.g., Hadogenes paucidens and Iurus dufoureius), pelophilous (e.g., Pandinoides cavimanus and Odontobuthus doriae), corticolous (e.g., Tityus trinitatis and Opisthacanthus asper) and lapidicolous scorpions (e.g., Buthus ibericus and Bothriurus coriaceus). Specimens were selected from the scorpion collection at CIBIO in Vila do Conde, Portugal, the RMNH in Leiden, the Netherlands, and the MNHN in Paris, France.
We used recent transcriptome-based phylogenies of the Scorpiones order for our comparative analyses [31, 32]. From there, we drew an ultrametric tree with the same topology as the higher-level classifications. The interfamilial relationships are congruent in both publications, except for the position of the Vaejovidae, which was revised in [32] and adopted in this study. Species that were not present in either phylogenetic analysis were treated in one of two ways: species with members of the same genus represented in the phylogenies were placed at the position of the genus; species without genus representation were placed basal inside the corresponding family clade [33, 34]. In all cases, unclear relationships between species were represented with polytomies. Branch lengths were adjusted to the tree topology, calculated using Grafen’s method [35].
Morphological measurements
We aimed for an unbiased sampling of scorpion anatomy by measuring all anatomical regions of the scorpion body. Specifically, we did not restrict our sampling to known eco-functional traits. We used digital calipers (Absolute IP67, Mitutoyo Inc., Kawasaki, Japan) to measure 70 lengths from six anatomical regions, namely in the prosoma, mesosoma, metasoma, telson, walking legs, and pedipalps, to the nearest 0.01 mm, following Stahnke [36]. Our measurements did not include structures such as the chelicerae, carinae, tarsi, or the setal hairs. The measured species exhibit considerable size variation, and our sampling represents total body lengths from 22.1 mm in Microbuthus sp. to 158.05 mm in Ha. granulatus. To reduce the number of variables for statistical analysis, we summed trochanter and femur lengths into a “proximal leg” part and the patella, tibia, and metatarsus lengths into a “distal leg” part. This division corresponds to biomechanically functional units of the leg: on one side, the distal muscles, responsible for pretarsal (“foot”) movement occupy all distal segments until and including the patella; on the other side, most of the leg motion occurs around the femoropatellar joint [37]. The lengths of the five metasoma segments were added together, while their heights and widths were averaged. Pedipalp measurements represent averages between left and right pedipalps. For details about the measurements (description, abbreviation, and illustration), see Supplementary Table 1 and Supplementary Figure 1. The final morphological dataset consisted of 36 morphological variables, which were log-transformed before further analysis (Supplementary Table 2). In general, scorpions are not characterized by strong sexual dimorphism, so we did not differentiate specimens by sex. However, in those species with more substantial sexual dimorphism, males and females may have subtly different ecological roles [38].
Unlike in, e.g., herpetology, no consensually accepted single linear measurement corresponds well with overall body size in scorpions [39]. Therefore, an isometric body size (IsoSize) was calculated by projecting the 36 linear measurements on an isometric vector. We then calculated each linear measurement's regression residuals using IsoSize to obtain size-corrected traits, following [40]. Last, we calculated the degree of phylogenetic signal present in the morphological variables given the phylogeny using the function physignal of the of R package GEOMORPH. physignal provides a mathematical generalization of the Kappa statistic [41] appropriate for highly multivariate data [42].
Ecological data and microhabitat clustering
Since the conceptualization of the five scorpion ecomorphs, not all species have been assigned to one. For example, less than 50% of the taxa selected here are unambiguously assigned to an ecomorph in the literature. In cases where an ecomorph assignment can be found in the literature, the assignment is often made based on the morphological habitus of the specimens, risking circular reasoning. To overcome these limitations, we chose to forego the assignments to classical ecomorphs entirely in our analysis. Instead, we retrieved descriptors of microhabitat use from the literature. We selected descriptors referring to substrates, their arrangement, and the activities scorpions perform in them. Those most frequently encountered in the literature were used to record presence (= 1) vs. unreferenced presence (= 0) for the following 12 parameters: “compact soil”, “loose sand”, “rock surface”, “leaf-litter”, “under rocks”, “under vegetation”, “dug burrow”, “passive shelter”, “climb bushes”, “climb trees”, “rock crevices” and “hanging upside down”. Each species was assigned one or more microhabitat uses (see Supplementary Table 2).
We grouped species with similar microhabitat uses together into clusters. To achieve this, we used the matrix of 12 microhabitat traits in the following three steps. First, we calculated Jaccard distances between pairs of species based on all microhabitat descriptors together, using the function vegdist of R package vegan [43]. Secondly, using the obtained distance matrix, we selected the number of habitat clusters based on the Bayesian Information Criterion (BIC), using the function Mclust of R package mclust [44] (Supplementary Figure 2). We selected the number of clusters corresponding to the first k preceding a plateau in BIC values. The number of clusters, four (see results), was used to compute a k-means clustering. The distance matrix was used as input for a Principal coordinate analysis (PCoA), resulting in 26 axes. Lastly, we reviewed the ecological composition of each microhabitat cluster by calculating two matrix correlations (Spearman’s as well as Pearson's ρ): 1) between the microhabitat traits and the PCoA scores to obtain PCoA-to-microhabitat correspondence; 2) between the resulting matrix from 1) and the k-means cluster centers to obtain microhabitat-to-cluster correspondence. Cluster terminology reproduces the different associations of each cluster to the microhabitat traits.
To visualize the scorpion ecospace, we performed multiple correspondence analyses (MCA) using function MCA of R package FactoMineR [45]. MCA uses the PCoA scores of microhabitat traits to plot barycenter points of categories (n-individual mean scores) and barycenter points of individuals (n-category mean scores) simultaneously; for visual clarity, however, only the latter were plotted.
Ecology-morphology associations
We searched for strongly eco-covarying traits by exploring which morphological traits co-vary most with ecology across all species. To this end, we applied a phylogenetic Partial Least Squares (PLS, using the phylogeny shown in Figure 2) to the multivariate sets of ecological and morphological variables using the function phylo.integration [46–48] of R package GEOMORPH [49]. Here, multivariate ecology consisted of 26 microhabitat traits (PCoA axes), and multivariate morphology consisted of 36 size-corrected morphological traits. Permutations with 10,000 cycles were used to test for significance of the multivariate correlation between vectors of morphology and ecology. The resulting matrix of morphological traits with maximized ecological covariance is hereafter referred to as the matrix of eco-projected morphology.
Ecomorphological distinctiveness between microhabitat clusters
To corroborate the existence of ecomorphs in scorpions, we examined whether microhabitat clusters exhibit distinctive phenotypes as represented by ecologically correlated morphology. For this purpose, we performed a MANOVA with the matrix of eco-projected morphology as the dependent variable and microhabitat cluster as a predictor while accounting for phylogenetic autocorrelation using generalized least squares (GLS). To test for significance, we used randomization of residuals over 10,000 permutations, as implemented in the function lm.rrpp of the RRPP R package [50, 51]. Then, to identify which microhabitat clusters differed significantly, we employed distance-based testing of pairwise differences between microhabitat cluster means, as implemented in the function pairwise of RRPP [52, 53]. For illustration, we plotted group means rotated to their principal components and with 95% confidence ellipses around them, using the plotting tools of the RRPP R package.