Sequence characteristics and conservation analysis of the five Dmrt family genes of A. schrenckii
The ORF of Dmrt2 has a total length of 1046 bp, encoding 348 amino acids. The ORFs of DmrtA1 and DmrtA2 are 1353 bp and 1338 bp and encode 450 and 445 amino acids, respectively. The ORFs of DmrtB1a and DmrtB1b are 1104 bp and 1122 bp and encode 367 and 373 amino acids, respectively. The detailed characteristics of the five Dmrt genes are summarized in Table 1.
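The relationship between ORF length and protein length reported above can be checked with a short calculation. The sketch below is illustrative only and assumes that each reported ORF length counts one codon per amino acid plus a single stop codon; the names `REPORTED`, `expected_orf_length` and `check_orfs` are hypothetical and not part of the original analysis.

```python
# Values as reported in the text: gene -> (ORF length in bp, protein length in aa).
REPORTED = {
    "Dmrt2": (1046, 348),
    "DmrtA1": (1353, 450),
    "DmrtA2": (1338, 445),
    "DmrtB1a": (1104, 367),
    "DmrtB1b": (1122, 373),
}


def expected_orf_length(n_amino_acids: int) -> int:
    """ORF length in bp, assuming one codon per residue plus one stop codon."""
    return 3 * (n_amino_acids + 1)


def check_orfs(reported: dict) -> dict:
    """Map each gene to True if its reported ORF length matches the assumption."""
    return {
        gene: bp == expected_orf_length(aa)
        for gene, (bp, aa) in reported.items()
    }


if __name__ == "__main__":
    for gene, consistent in check_orfs(REPORTED).items():
        bp, aa = REPORTED[gene]
        print(f"{gene}: reported {bp} bp, {aa} aa, "
              f"expected {expected_orf_length(aa)} bp, consistent={consistent}")
```

Under this assumption, the reported lengths of DmrtA1, DmrtA2, DmrtB1a and DmrtB1b match the expected values, whereas the reported Dmrt2 length differs by one base pair from the expected 1047 bp.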
According to domain composition, the five Dmrt family genes could be divided into two groups: the first group contained DmrtA1 and DmrtA2, with both DM and DMA domains, while the second group contained Dmrt2, DmrtB1a and DmrtB1b, all of which had only the DM domain (Fig. 1). Comparison of amino acid conservation among the five Dmrt family genes showed that their DM domain sequences were highly conserved and that the DMA domains of DmrtA1 and DmrtA2 were also highly conserved; however, the amino acid sequences of the regions outside the DM and DMA domains were relatively variable. A schematic illustration of the amino acid sequence conservation of the five genes in A. schrenckii is shown in Supplementary Fig. 1.
Higher-order protein structure prediction for the five Dmrt family genes of A. schrenckii
The secondary structures of all five Dmrt family proteins consisted of 4-peptide α-helices (4.63–10.06%), β-sheets (1.33–3.45%) and coils (83.11–93.73%). The secondary structure types and their corresponding proportions among the five Dmrt family proteins of A. schrenckii are shown in Fig. 2 and Supplementary Table 4.
The tertiary structure analysis of the five Dmrt family proteins showed that DmrtA1 and DmrtA2 existed as monomers and that the other three Dmrt proteins formed homotrimers. Through further analysis, we found that serine (Ser)-rich regions were only distributed in the α-helices of Dmrt2, DmrtA1 and DmrtA2, while redundant alanine (Ala) residues were mainly distributed in β-sheets or junctions between β-sheets and α-helices (Fig. 3).
Sequence similarity comparison and phylogenetic analysis
The similarity of the amino acid sequences among the DM domains and coding sequences (CDSs) of the five Dmrt family genes was analyzed. We found that the similarity among DM domains was higher than that among CDSs. For example, the amino acid sequence similarity was highest between the DM domains of DmrtB1a and DmrtB1b, at 98.2%, followed by that between DmrtA1 and DmrtA2 (94.4%) and Dmrt2 and DmrtA2 (88.9%). In contrast, the degree of CDS similarity between Dmrt2 and DmrtB1a was the same as that between Dmrt2 and DmrtB1b, at 51.9%. The results of the comparisons of amino acid similarity among the five Dmrt family genes (DM domains vs. CDS) are listed in Supplementary Table 5.
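A pairwise comparison of this kind can be sketched as a percent-identity calculation over two pre-aligned sequences. This is a minimal illustration, not the procedure used for Supplementary Table 5: the alignment software and similarity scoring scheme are not stated here, percent identity is used as a simpler stand-in for similarity, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns containing a gap ('-') in either sequence are skipped, which is
    one of several possible conventions (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # ignore gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy peptide fragments, invented for illustration only.
    print(round(percent_identity("MKRSPK", "MKRTPK"), 1))
    print(round(percent_identity("MK-SPK", "MKRSPK"), 1))
```

Applying the same function separately to the DM domain region and to the full-length sequence of each gene pair would give the two kinds of values contrasted above.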
Subsequently, the phylogenetic tree of the five Dmrt family genes was reconstructed based on the amino acid sequence of the DM domains. The phylogenetic analysis revealed that the Amur sturgeon DmrtA2, DmrtA1 and Dmrt2 sequences aggregated into single branches and then clustered together with the corresponding sequences of other teleost fishes. Additionally, we found that DmrtA2 showed the earliest evolutionary divergence from the Dmrt family and that DmrtA1 presented the closest phylogenetic relationship with DmrtA2, followed by Dmrt2. Importantly, Amur sturgeon DmrtB1a and DmrtB1b, which were the only two novel Dmrt family genes in sturgeon, aggregated into a single clade and then clustered together with the sequences of other vertebrates (Fig. 4).
Furthermore, using the mouse Dmrt8 amino acid sequence as the outgroup, we reconstructed a protein-based phylogenetic tree of the DmrtB1 genes reported from nine fish species, including six actinopterygian fishes and three sarcopterygian fishes. As shown in the phylogenetic tree (Fig. 5A), the DmrtB1 proteins from three Perciformes species (O. niloticus, M. salmoides, and L. calcarifer) clustered into one clade, while the DmrtB1 proteins from one Acipenseriformes species (A. schrenckii), two Coelacanthiformes species (L. chalumnae, L. menadoensis), one Dipnoi species (P. amphibius), one Siluriformes species (I. punctatus) and one Lepidosteiformes species (L. oculatus) clustered into another clade. Additionally, the DmrtB1 proteins from Acipenseriformes (A. schrenckii) and Siluriformes (I. punctatus) showed the closest phylogenetic relationship. Intriguingly, however, the DmrtB1 proteins of Acipenseriformes (A. schrenckii) were phylogenetically closer to those of the three sarcopterygian fishes than to those of the three Perciformes species, even though A. schrenckii and the Perciformes species are both actinopterygians.
DmrtB1a and DmrtB1b were under positive selection
The nonsynonymous (Ka) and synonymous (Ks) substitution rates are important markers in studies of evolutionary selection on genes. The Ka/Ks ratio can be used to assess whether genes are under selection pressure: Ka/Ks > 1 indicates probable positive selection, Ka/Ks values close to 1 represent neutral evolution or relaxed selection, and Ka/Ks < 1 (especially less than 0.5) indicates purifying selection. First, the Ka/Ks ratios of A. schrenckii DmrtB1a and DmrtB1b were assessed against eight related fish species: O. niloticus, M. salmoides, L. calcarifer, I. punctatus, L. oculatus, P. amphibius, L. chalumnae, and L. menadoensis. The results showed average Ka/Ks values > 1 for DmrtB1a (1.49) and DmrtB1b (1.73), indicating positive selection (Fig. 5B). The relatively high Ka/Ks ratios, which indicate potential adaptive evolution, may reflect specific selection pressure on A. schrenckii. Alternatively, they could be a sign of increased variability of these particular proteins within a broader group of species. The DmrtB1 gene Ka/Ks ratios were also calculated for the remaining pairwise sequence comparisons among the eight fish species (average Ka/Ks ratio = 1.728). The Ka/Ks cross test showed that positive selection events on the DmrtB1 gene were widespread in teleost fish.
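The interpretation rule for Ka/Ks stated above can be written as a small classifier. This is a minimal sketch: the text gives no numerical definition of "close to 1", so the `tolerance` parameter is an assumption, and the function names are hypothetical. Only the two average ratios (1.49 and 1.73) are taken from the text.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks


def classify_ratio(ratio: float, tolerance: float = 0.05) -> str:
    """Classify a Ka/Ks ratio following the rule described in the text.

    `tolerance` defines what counts as 'close to 1'; its value here is an
    assumption of this sketch, not a threshold from the study.
    """
    if abs(ratio - 1.0) <= tolerance:
        return "neutral or relaxed selection"
    if ratio > 1.0:
        return "probable positive selection"
    return "purifying selection"


if __name__ == "__main__":
    # Average Ka/Ks values reported in the text for the two genes.
    for gene, ratio in {"DmrtB1a": 1.49, "DmrtB1b": 1.73}.items():
        print(gene, ratio, classify_ratio(ratio))
```

Averaging ratios across species pairs, as reported here, can hide variation between individual comparisons, so classifying each pairwise ratio separately may also be informative.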
Tissue expression profiles and spatiotemporal expression patterns in gonads
To reveal the expression characteristics of the five Dmrt family genes, expression analysis of seven main tissues of three-year-old A. schrenckii individuals was first performed using real-time PCR. Tissue expression profiles of the five Dmrt family genes are shown in Fig. 6. Here, we observed two obvious expression characteristics. 1) Relative expression was lowest in the brain, but each Dmrt gene was expressed in all seven tissues, which may indicate that the five Dmrt genes have extensive biological functions. 2) Dmrt2, DmrtA1, DmrtA2 and DmrtB1a were most highly expressed in the gills, whereas the expression level of DmrtB1b was highest in the liver.
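Relative expression from real-time PCR is often derived from cycle threshold (Ct) values. The quantification method is not stated in this section, so the sketch below assumes the widely used 2^(-ΔΔCt) calculation with a single reference gene and roughly 100% amplification efficiency; the function name and the Ct values in the example are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_calibrator: float,
                        ct_reference_calibrator: float) -> float:
    """Fold change of a target gene relative to a calibrator sample.

    Uses the 2^(-ddCt) calculation, which assumes near-100% amplification
    efficiency for both the target and the reference gene.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the tissue of interest with the calibrator tissue.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Invented Ct values: a tissue of interest versus a calibrator tissue.
    fold = relative_expression(ct_target=24.0, ct_reference=18.0,
                               ct_target_calibrator=26.0,
                               ct_reference_calibrator=18.0)
    print(f"fold change relative to calibrator: {fold}")
```

With this convention, a value above 1 means higher expression than in the calibrator tissue and a value below 1 means lower expression.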
Subsequently, spatiotemporal expression patterns in the gonads during sex differentiation and further development were analyzed using real-time PCR. The results are presented in Fig. 7. The expression level of Dmrt2 was highest in 5 M gonads (UGs), and there were no differences between the testis and ovary in any of the three examined differentiated stages. DmrtA1 and DmrtA2 showed similar spatiotemporal expression patterns, with relatively higher expression levels in 5 M gonads and in the ovaries at the other three developmental stages. In the 36 M gonads, the relative expression of DmrtA1 and DmrtA2 was significantly higher in ovaries than in testes (P < 0.05). In contrast, the expression levels of DmrtB1a gradually increased in the testes from 12 M to 36 M, and significantly different expression levels were observed between testes and ovaries at 36 M (P < 0.05). Most interestingly, DmrtB1b showed a distinctive expression pattern during the gonadal differentiation of A. schrenckii. The expression of DmrtB1b was lowest in UGs. Thereafter, it increased at 12 M and then gradually decreased in the ovaries from 12 M to 36 M, while it continuously increased in the testes, which showed significantly higher expression than the ovaries at 24 M and 36 M (P < 0.05).