The number of vertebrae in mammals is highly constrained. In the neck, it is fixed at seven in all of the ~6400 species except sloths and manatees [1, 2]. Similarly, the number of dorsal vertebrae (the sum of thoracic and lumbar vertebrae) is limited to 19 or 20 in the majority of mammals [2, 3]. Thus, nearly all mammals have 26 or 27 presacral vertebrae (PSV, the sum of cervical, thoracic and lumbar vertebrae). As vertebrae derive from somites, the number of (trunk) somites is correspondingly conserved across mammals. This contrasts with birds and reptiles, in which both PSV number and the number of vertebrae within the individual vertebral regions are highly variable (e.g., [4]). It has been suggested that a combination of developmental and biomechanical innovations set this meristic constraint very early in mammalian evolution [3, 5]. Nevertheless, a few lineages broke these constraints and evolved an increased number of dorsal (and thus presacral) vertebrae: elephants (23–24 dorsal vertebrae), hyraxes (28–31), sea cows (24), and golden moles (22–24), all four of which belong to the Afrotheria, as well as horses and rhinos (perissodactyls, 22–24) and hero shrews (22–25; [6, 7]). Because even an increase of one or two vertebrae is extremely rare among mammals, these taxa represent substantial trunk elongations.
Somites, the developmental precursors of vertebrae, are formed in early development through the process of somitogenesis. During somitogenesis, a pair of somites buds off from the paraxial presomitic mesoderm every two hours in mouse embryos, suggesting that somite segmentation is controlled by a biological clock with a two-hour cycle [8, 9]. The process of somitogenesis has been formalized in a theoretical model, the ‘clock and wavefront’ model [10]. In this model, temporal and spatial information are integrated to determine somite boundaries. Spatial information is provided by the wavefront of Fgf expression, which continuously regresses posteriorly as the body axis elongates [11, 12]. Temporal information is generated by a traveling wave of cyclic expression of oscillator genes that sweeps from posterior to anterior [11, 12]. Each time the oscillator expression meets the wavefront, a new pair of somites is generated. Thus, the oscillator period sets the time needed to form one somite, and the distance that the wavefront travels during one oscillation cycle sets the size of each somite.
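The two relations that close the paragraph above can be written as a short arithmetic sketch. Only the two-hour mouse clock period comes from the text; the function names, the wavefront speed, and the duration of the segmentation window are hypothetical values chosen purely for illustration.

```python
def somite_size(wavefront_speed_um_per_h, clock_period_h):
    """Somite length = distance the wavefront moves during one clock cycle."""
    return wavefront_speed_um_per_h * clock_period_h


def somite_count(duration_h, clock_period_h):
    """Number of complete clock cycles (one somite pair each) in a fixed time window."""
    return int(duration_h // clock_period_h)


# Illustrative inputs only (assumptions, not measured values):
WAVEFRONT_SPEED = 50.0   # micrometres per hour, hypothetical
WINDOW = 60.0            # hours of segmentation, hypothetical

size_wild_type = somite_size(WAVEFRONT_SPEED, 2.0)   # 2 h clock, as in mouse
pairs_wild_type = somite_count(WINDOW, 2.0)
pairs_fast_clock = somite_count(WINDOW, 1.75)        # hypothetical faster clock
```

Under these assumptions, a faster clock yields more and smaller somites within the same time window, which is the qualitative behavior the model predicts when only the oscillator period changes.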
The basic helix-loop-helix factor Hes7 is a key effector of Notch signaling during somitogenesis. Its expression follows a two-hour oscillatory cycle controlled by negative feedback, and each cycle of Hes7 expression coincides with the generation of one pair of somites [13]. This Hes7 oscillation has been proposed to be the molecular basis of the somite segmentation clock [13–15]. An important aspect of the negative feedback control is the delay between the onset of transcription and the appearance of the functional protein, which arises from the transcription, splicing, and transport of the mRNA and from its translation [13, 15–20]. This transcriptional delay is thought to depend on the number of introns that have to be removed during splicing: variation in intron number would lead to variation in mRNA maturation time and, thus, in transcriptional delay. Using transgenic mice, Harima and colleagues [21] showed that reducing the number of introns within the Hes7 gene from three (the wild-type condition) to two or one shortens the delay and accelerates both Hes7 oscillation and somite segmentation. This in turn increased the number of somites and vertebrae in the cervical and upper thoracic region, thus increasing total PSV number. Their results suggested that the number of introns is important for the appropriate tempo of oscillatory expression and that Hes7 is a key regulator of the pace of the segmentation clock [16, 21]. Variation in Hes7 intron number could therefore be a potential evolutionary mechanism for varying PSV number across mammals. To test this hypothesis, we inferred Hes7 intron number from published genomes across mammals with varying PSV number.
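The dependence of clock pace on delay described above can be illustrated with a generic delayed negative feedback loop, in which the protein represses the synthesis of its own mRNA after a fixed lag. The sketch below is a toy model under stated assumptions: it is not the method of this study or of Harima and colleagues, the function name `oscillation_period` is hypothetical, and all rate constants and delays are illustrative values rather than measured Hes7 parameters.

```python
def oscillation_period(delay_min, t_end=2000.0, dt=0.05):
    """Euler integration of a delayed auto-repression loop; returns the period in minutes.

    dm/dt = k / (1 + (p(t - delay) / p0)**2) - c * m   (mRNA)
    dp/dt = a * m - b * p                              (protein)
    """
    # Illustrative rate constants (assumptions, not Hes7 measurements).
    a, b, c, k, p0 = 4.5, 0.23, 0.23, 33.0, 40.0
    lag = int(round(delay_min / dt))      # delay expressed in time steps
    n = int(t_end / dt)
    m = [0.0] * (n + 1)
    p = [0.0] * (n + 1)
    for i in range(n):
        # Protein level one delay ago; taken as zero before the simulation starts.
        p_delayed = p[i - lag] if i >= lag else 0.0
        dm = k / (1.0 + (p_delayed / p0) ** 2) - c * m[i]
        dp = a * m[i] - b * p[i]
        m[i + 1] = m[i] + dt * dm
        p[i + 1] = p[i] + dt * dp
    # Discard the first half as transient, then time the upward crossings
    # of the mean mRNA level; one crossing occurs per oscillation cycle.
    tail = m[n // 2:]
    mean = sum(tail) / len(tail)
    crossings = [j * dt for j in range(1, len(tail)) if tail[j - 1] < mean <= tail[j]]
    if len(crossings) < 2:
        return None                       # no sustained oscillation detected
    return (crossings[-1] - crossings[0]) / (len(crossings) - 1)


period_long_delay = oscillation_period(25.0)    # hypothetical longer delay
period_short_delay = oscillation_period(15.0)   # hypothetical shorter delay
```

In this toy model, the shorter delay gives the shorter oscillation period, which is the qualitative relationship the intron-removal experiments point to. The absolute numbers carry no biological meaning.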