Identification of orthologous gene set and phylogenetic analysis
After data processing, an average of 8,601 single-copy genes were identified for each of the 15 species, ranging from 8,282 (O. curzoniae) to 8,814 (O. cuniculus). These single-copy genes accounted for 30.25% of the total number of genes in each species on average, of which B. mutus (44.59%) and P. hodgsonii (25.33%) had the highest and the lowest single-copy gene proportion, respectively. More information about gene sets of 15 species were shown in Table S1. These single-copy genes were further used to obtain the one-to-one single-copy orthologous genes. After alignment, a total of 6,008 single-copy orthologous genes were identified. After gap removal, 6,000 single-copy orthologous genes were used for subsequent analysis (Fig. S1).
According to the phylogenetic relationship constructed by single-copy orthologous genes, the six high-altitude species and their closest relatives belong to four different orders, including Lagomorpha, Primates, Carnivora and Artiodactyla (Fig. 1). Among them, Artiodactyla has the most high-altitude species, including B. mutus, P. hodgsonii and O. aries (high). Although these three high-altitude species belong to the same family and order (Bovidae and Artiodactyla), the phylogenetic relationship showed that the MRCA (most recent common ancestors) of these three species and their closest relatives were not high-altitude species. Thus, they might have independently evolved high-altitude adaptation mechanisms. According to the results of divergence time estimation, among the six high-altitude species, O. aries (high) and its low-altitude close relatives O. aries (low) had the closest relationship, with a divergence time of 3.99 Ma (95% CI: 6.30–2.30), while O. curzoniae and its low-altitude close relatives O. cuniculus had the longest divergence time of 45.15 Ma (95% CI: 65.5–24.9). In general, the results of phylogenetic relationships showed that these six high-altitude species belong to the high-altitude adaptation mechanism of independent evolution, which meets the requirements of convergent evolution research.
Gene Family Expansion and Contraction Analysis
According to the analysis results of gene family expansion and contraction, a total of 17,153 gene families were identified by MRCA (most recent common ancestors) in 15 species for subsequent analysis. Interestingly, except for the O. curzoniae, the remaining five high-altitude species had more gene family changes than their closest relatives (Fig. 2). It was worth noting that the expanded gene families of the remaining five high-altitude species, except O. curzoniae, were not significantly more than their closest relatives, or even less (i.e., 577 of R. bieti versus 685 of T. francoisi). Compared with low-altitude relatives, the gene family changes of these five high-altitude species had a significantly higher proportion of contracted gene families. At the same time, the number of contracted gene families of high-altitude species was also much larger than that of low-altitude relatives (i.e., 1,133 of R. bieti versus 600 of T. francoisi, 1,610 of P. hodgsonii versus 708 of C. hircus, 2,208 of O. aries (high) versus 357 of O. aries (low), 1,991 of V. ferrilata versus 482 of V. lagopus, and 976 of B. mutus versus 409 of B. taurus).
In order to explore the convergence pattern of gene family changes in high-altitude species, gene families that significantly (p < 0.05) expanded or contracted together in more than half of the high-altitude species were identified. The results showed that in these six high-altitude species, the convergence pattern of the significant contracted gene families was much larger than the significant expanded gene families (Fig. 3). As shown in Fig. 3a, 38 significantly expanded gene families were identified to exist in more than three high-altitude species and no significantly expanded gene families were found in all six high-altitude species. Furthermore, only one of these significantly expanded gene families existed in five high-altitude species (OG0000128), and five existed in four high-altitude species (OG0000068, OG0000304, OG0000398, OG0001349 and OG0001430). In contrast, 112 significantly contracted gene families were identified to exist in more than four high-altitude species (Fig. 3b, c). Among the 112 significantly contracted gene families, two gene families were identified in six species and 25 gene families were identified in five high-altitude species. In addition, among the 450 significantly expanded gene families, the number of species-specific gene families was much higher than that existing in more than two species (71 of P. hodgsonii, 66 of O. curzoniae, 49 of O. aries, 36 of V. ferrilata, 32 of R. bieti, and 29 of B. mutus, respectively) (Fig. S2a). As for significantly contracted gene families, compared with the number of gene families existing in more than two species, the species-specific gene families did not show a significantly higher trend. Only the number of species-specific gene family of one species was significantly higher than other types of gene families (82 of P. hodgsonii) (Fig. S2b).
To further explore the functional convergence pattern of the significantly changed gene families of these six high-altitude species, we performed GO enrichment analysis on the significantly expanded and contracted gene families of each species. Enriched GO terms that existed in at least five species were retained for subsequent analysis. As shown in Fig. 4a, a total of 29 enriched GO terms existed in at least five high-altitude species, of which 16 were enriched in all high-altitude species. In addition, the number of significantly expanded gene families in the same enriched GO term of each species also varied greatly (Table S2). Different from the results of gene families, the gene families that significantly contracted in the six high-altitude species did not enrich significantly more GO terms than the enrichment results of the significantly expanded gene families. A total of 37 enriched GO terms existed in at least five high-altitude species, of which 11 were enriched in all high-altitude species (Fig. 4b) (Table S3).
Rapidly Evolving and Positive Selection Genes
To better understand the effect of high-altitude environment on the nucleic acid sequence changes of genes of different species, identification of rapidly evolving gens (REGs) and positive selection genes (PSGs) were conducted under the two ratio model in PAML and a total of 1,713 REGs were identified in all six high-altitude species. After KEGG and GO enrichment analysis of these REGs, 40 significantly over-represented pathways and 43 significantly enriched GO terms were obtained (Fig. 5a, b). Notably, ultraviolet (UV) radiation response-, angiogenesis related- and hypoxia-related REGs showed significant pathway expansion, including the Melanoma (ko05218, 19 genes, p = 1.69E-06), MAPK signaling pathway (ko04010, 48 genes, p = 2.18E-06), Calcium signaling pathway (ko04020, 38 genes, p = 5.41E-05), EGFR tyrosine kinase inhibitor resistance (ko01521, 15 genes, p = 0.000886), Regulation of actin cytoskeleton (ko04810, 32 genes, p = 0.000986), Ubiquitin mediated proteolysis (ko04120, 24 genes, p = 0.001141), Ras signaling pathway (ko04014, 34 genes, p = 0.001142), mTOR signaling pathway (ko04150, 24 genes, p = 0.002045), PI3K-Akt signaling pathway (ko04151, 44 genes, p = 0.002641), Renal cell carcinoma (ko05211, 13 genes, p = 0.004766), and AMPK signaling pathway (ko04152, 18 genes, p = 0.008991). The 43 significantly enriched GO terms, including G protein-coupled receptor signaling pathway (GO:0007186, 28 genes, p = 1.13E-20), G protein-coupled receptor activity (GO:0004930, 26 genes, p = 2.69E-19), potassium ion transport (GO:0006813, 22 genes, p = 4.15E-06), protein phosphorylation (GO:0006468, 71 genes, p = 1.18E-05), voltage-gated potassium channel activity (GO:0005249, 14 genes, p = 0.000153), zinc ion binding (GO:0008270, 59 genes, p = 0.000341), protein tyrosine kinase activity (GO:0004713, 17 genes, p = 0.000551), voltage-gated potassium channel complex (GO:0008076, nine genes, p = 0.000814), intracellular signal transduction (GO:0035556, 20 genes, p = 0.004235), and ion transport (GO:0006811, 27 genes, p = 0.018482), were also related to angiogenesis and hypoxia response and might reflect the convergence adaptation mechanism of these six high-altitude species to the low-oxygen, high UV-radiation and extreme temperature conditions.
The number of PSGs was significantly less compared with REGs. Only seven PSGs were identified in all six high-altitude species. Four of these seven genes might reflect convergent coping mechanisms of high-altitude species when exposed to the stress of high-altitude environments, including CXCL10 gene, ERBIN gene, and SCAMP1 gene which promote angiogenesis under hypoxic conditions, as well as FYCO1 gene which is involved in UV-radiation damage response (Table 1).
Table 1
Basic information on four genes associated with high-altitude adaptation mechanisms that exhibit convergent positive selection features
Abbreviation | Full name | Function |
ERBIN | Erbb2 interacting protein | Enhanced the expression of Hif-1α protein and promoted angiogenesis |
CXCL10 | C-X-C motif chemokine | Hypoxia response related gene |
SCAMP1 | Secretory carrier membrane protein 1 | Vascular endothelial growth factor related gene |
FYCO1 | FYVE and coiled-coil domain-containing protein 1 | UV-radiation damage repair related gene |