Using the MLS and BM records of nonflying eutherian mammals from the AnAge online database, we obtained a new allometric equation for expected longevity: \(Y = 3.7136 \times \text{BM}^{0.1842}\). The LQ of each cetacean species was then calculated as the ratio of observed to expected longevity, \(\text{LQ} = \text{MLS} / (3.7136 \times \text{BM}^{0.1842})\). The mean LQ of the 65 cetaceans was 0.97, and 0.5 SD was 0.17 (LQ = 0.97 ± 0.17). Six cetacean species had LQ > 1.14 and were classified as long-lived: bowhead whales, long-finned pilot whales (Globicephala melas), Pacific white-sided dolphins (Lagenorhynchus obliquidens), killer whales (Orcinus orca), bottlenose dolphins (Tursiops truncatus), and Indo-Pacific bottlenose dolphins (T. aduncus). In contrast, five species were classified as short-lived (LQ < 0.80): vaquita (Phocoena sinus), harbor porpoises (P. phocoena), beluga whales (Delphinapterus leucas), baiji (Lipotes vexillifer), and common minke whales (Balaenoptera acutorostrata). The remaining species (0.80 < LQ < 1.14) were used as the control (Figure 1). For the MLS, the mean ± 0.5 SD across all cetaceans was 47.90 ± 15.67. Five species had MLS > 63.57 and were classified as long-lived: blue whales, bowhead whales, humpback whales, killer whales, and sperm whales (Physeter catodon). Four species had MLS < 32.23 and were thus considered short-lived: vaquita, harbor porpoises, Yangtze finless porpoises (Neophocaena asiaeorientalis), and baiji. In addition, the ancestral states of both MLS and LQ were reconstructed to further classify long-lived lineages at the ancestral nodes (Figure S1). The long-lived cetacean species identified by the two criteria (LQ > 1.14, MLS > 63.57) were used for subsequent analyses.
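The LQ calculation and the mean ± 0.5 SD classification described above can be illustrated with a minimal Python sketch. The coefficients 3.7136 and 0.1842 are taken from the text; the function names, the species records, and the use of the sample standard deviation are assumptions made for illustration only. BM must be supplied in the same units used to fit the allometric equation, which this section does not state.

```python
from statistics import mean, stdev

# Coefficients of the allometric equation reported in the text.
A, B = 3.7136, 0.1842


def expected_longevity(bm):
    """Expected longevity predicted from body mass (BM)."""
    return A * bm ** B


def longevity_quotient(mls, bm):
    """LQ = observed MLS divided by the expected longevity."""
    return mls / expected_longevity(bm)


def classify(values, width=0.5):
    """Label each species relative to mean +/- width * SD.

    Assumption: the sample SD (n - 1 denominator) is used; the text
    does not say which estimator was applied.
    """
    centre = mean(values.values())
    half = width * stdev(values.values())
    labels = {}
    for name, value in values.items():
        if value > centre + half:
            labels[name] = "long-lived"
        elif value < centre - half:
            labels[name] = "short-lived"
        else:
            labels[name] = "control"
    return labels


# Hypothetical (MLS, BM) records, NOT values from AnAge.
records = {
    "species_A": (90.0, 4.0e6),
    "species_B": (45.0, 2.0e5),
    "species_C": (20.0, 5.0e4),
    "species_D": (50.0, 6.0e5),
}
lq_values = {name: longevity_quotient(mls, bm) for name, (mls, bm) in records.items()}
print(classify(lq_values))
```

The same `classify` helper can be applied directly to raw MLS values, which mirrors the second criterion (MLS > mean + 0.5 SD) used in the text.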
Gene duplications in cetacean genomes
To gain insight into the molecular mechanisms of cancer resistance in the p53 pathway, we identified genes that had undergone duplications in cetaceans. We detected copy number gains in 23 genes in at least one cetacean lineage (Figure 2, Table S2). Using the COSMIC v92 [22] and TSGene 2.0 [23] databases, we found that more than 50% of the duplicated genes were tumor suppressor genes (i.e., CASP3, CDK1, CDK2, CDK6, EI24, GADD45A, IGFBP3, PERP, RCHY1, SFN, SIAH1, THBS1). Four genes with two copies were unique to the long-lived cetacean lineages (LQ > 1.14): BCL2L1 (long-finned pilot whales), IGFBP3 (Indo-Pacific bottlenose dolphins), PERP (bottlenose dolphins), and STEAP3 (Indo-Pacific bottlenose dolphins); only one copy of each was identified in the other cetacean lineages. In addition, three copies of CCNB2 and two copies of MDM4 were detected only in the large, long-lived sperm whale, whereas only one copy of each was found in the other cetacean lineages. We also found that both CASP3 and CCNG1 had undergone duplication in all the large long-lived cetacean species (MLS > 63.57) except the common minke whale. In contrast, only one copy of RCHY1 was identified in the five large, long-lived species, but two copies were found in the other cetacean species. To test whether the eight genes with copy number gains (BCL2L1, IGFBP3, PERP, STEAP3, CCNB2, MDM4, CASP3, CCNG1) and the one gene with copy number loss (RCHY1) are unique to long-lived cetaceans, we added 17 non-cetacean mammals to the analysis and found higher copy numbers of both CASP3 and PERP in the well-known long-lived species (including large cetaceans, primates, and the naked mole-rat), although there were some exceptions (e.g., cow). In addition, we found no effect of genome assembly length or scaffold N50 on the estimated gene copy number (Figure S2).
Positive selection of p53 pathway-related genes in cetaceans
A total of 46 “one-to-one” orthologous genes were identified among the 73 genes involved in the p53 pathway. To detect signatures of episodic selection along the long-lived cetacean lineages, we used four methods: the free-ratio and branch-site models from the PAML4.9 package, and aBSREL and BUSTED from Datamonkey. The likelihood ratio test (LRT) revealed that the free-ratio model, which assumes an independent ω for each branch, fit the data significantly better than the one-ratio model for three genes (APAF1, CASP8, AIFM2; P < 0.05, Figure 1, Table S3). For APAF1 and CASP8, ω > 1 was identified only in four long-lived branches: the LCA of the humpback whale and the terminal branch of the Indo-Pacific bottlenose dolphin for APAF1, and the LCA of delphinids and the branch leading to the Pacific white-sided dolphin for CASP8. However, AIFM2 was found to be under positive selection in both the long-lived long-finned pilot whale and the short-lived vaquita. Similar results were obtained from the more stringent branch-site model in CODEML, which can detect stronger positive selection acting on only a few sites against a background of overall purifying selection. Evidence of positive selection was observed in two long-lived branches: the branch leading to the sperm whale for TP73 and the branch leading to the long-finned pilot whale for AIFM2 (Figure 1, Table S3). Two positively selected sites identified using the Bayes empirical Bayes (BEB) approach, one in each gene (TP73: 506; AIFM2: 459), had undergone radical changes in at least one property. In addition, three genes (TP53I3, SIVA1, GTSE1) were identified as positively selected in the short-lived groups and another two (CCNE1 and TP53) along the non-long-lived branch leading to the common minke whale. We then ran aBSREL, another branch-site method available on Datamonkey, which appears to be markedly more sensitive in detecting episodic selection than the branch-site methods in PAML. The results showed evidence of episodic selection on two branches from two genes after correcting for multiple testing.
TP73 was found to be under positive selection along the branch of the large, long-lived humpback whale, whereas GTSE1 was subject to positive selection along the non-long-lived lineage of the LCA of Phocoenidae and Monodontidae (Table S4). The BUSTED program, which may be particularly effective at testing for selection limited to foreground branches, further indicated that TP73 has undergone positive selection in the long-lived cetaceans (P < 0.05, Table S5). Together, the four selection tests identified three positively selected genes (APAF1, CASP8, TP73) unique to the long-lived cetacean species.
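The likelihood ratio tests used above to compare nested codon models (for example, free-ratio versus one-ratio) follow a standard recipe: twice the log-likelihood difference is compared against a χ² distribution whose degrees of freedom equal the difference in the number of free parameters. The sketch below uses only the Python standard library. The log-likelihood values and the degrees of freedom are hypothetical and are not taken from Table S3; the function names are illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    with y = x / 2, starting from Q(1, y) = exp(-y) for even df
    or Q(1/2, y) = erfc(sqrt(y)) for odd df.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, P value) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for one gene (NOT from the paper).
# For free-ratio vs. one-ratio, df is the number of branches minus one.
lnl_one_ratio = -12345.67
lnl_free_ratio = -12300.12
stat, p_value = likelihood_ratio_test(lnl_one_ratio, lnl_free_ratio, df=40)
print(f"2*dlnL = {stat:.2f}, P = {p_value:.3g}")
```

For the branch-site test, the same function would typically be called with `df=1`, since the null model fixes a single ω parameter at 1.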
Neutral theory predicts that the ω value is higher in species with small effective population sizes (Ne), such as cetaceans. To test whether the signals of selection identified in the three genes (APAF1, CASP8, TP73) of long-lived cetaceans were due to a low Ne, we used the branch-site and free-ratio models implemented in PAML4.9 to evaluate the ω value in each lineage across 17 non-cetacean mammals. Two genes were also found to be under positive selection in other long-lived mammals: CASP8 and TP73 in the ancestral branch of primates, and TP73 in the terminal branch of Brandt's bat (Myotis brandtii). These results suggest that the positive selection detected in the long-lived cetaceans is not attributable to a low Ne (Table S6).
Gene-phenotype evolution
Phylogenetic generalized least squares (PGLS) regressions were performed between the evolutionary rate of each orthologous gene (represented by root-to-tip ω) and three lifespan-associated traits (MLS, BM, and LQ). Regression analyses revealed that the evolutionary rates of two genes were significantly positively correlated with LQ: CD82 (R² = 0.392, P = 0.004) and SERPINE1 (R² = 0.343, P = 0.010, Figure 3). For MLS, a positive association between log10 (root-to-tip ω) and log10 (MLS) was identified for TSC2 (R² = 0.364, P = 0.008). Moreover, the evolutionary rate of TP73 was positively related to BM (R² = 0.226, P = 0.031, Figure 3).
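As an illustration of how a PGLS regression of this kind works, the sketch below fits a single-predictor model by whitening the data with the Cholesky factor of a phylogenetic covariance matrix and then applying ordinary least squares. It assumes a Brownian-motion covariance (shared branch lengths between species), which this section does not specify; the four-species tree, the trait values, and all function names are hypothetical. It is a teaching sketch and not the software used for the analyses reported here.

```python
import math


def cholesky(c):
    """Lower-triangular L with L * L^T = c (c symmetric positive definite)."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low


def forward_solve(low, b):
    """Solve low * z = b for z by forward substitution."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    return z


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pgls(x, y, cov):
    """Return (intercept, slope) of y ~ x under covariance matrix cov."""
    low = cholesky(cov)
    ones = forward_solve(low, [1.0] * len(x))  # whitened intercept column
    xs = forward_solve(low, x)                 # whitened predictor
    ys = forward_solve(low, y)                 # whitened response
    s11, s12, s22 = dot(ones, ones), dot(ones, xs), dot(xs, xs)
    t1, t2 = dot(ones, ys), dot(xs, ys)
    det = s11 * s22 - s12 * s12
    intercept = (t1 * s22 - t2 * s12) / det
    slope = (s11 * t2 - s12 * t1) / det
    return intercept, slope


# Hypothetical tree ((A,B),(C,D)): each pair shares half of the total depth.
cov = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
log_trait = [1.2, 1.5, 1.7, 1.9]       # e.g. log10(MLS), made-up values
log_omega = [-0.9, -0.7, -0.65, -0.4]  # e.g. log10(root-to-tip omega), made-up
print(pgls(log_trait, log_omega, cov))
```

With an identity covariance matrix, the function reduces to ordinary least squares, which gives a convenient sanity check.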