Structural analysis of the ACTR8 gene in various primates
We conducted a structural analysis of the ACTR8 gene in nine primates: hominoids (human, chimpanzee, and gorilla), Old world monkeys (rhesus monkey, crab-eating monkey, and African green monkey), New world monkeys (marmoset and squirrel monkey), and a prosimian (ring-tailed lemur), using DNA and mRNA sequences from the NCBI genome database. The DNA sequences obtained were screened for repetitive elements using RepeatMasker, which revealed an AluSz6 element located in the seventh intron in antisense orientation (Fig. 1A). The ACTR8 gene is composed of 13 exons. The length of the untranslated region (UTR) differs between species, but the open reading frame (ORF) is highly conserved and encodes 624 amino acids. Notably, the squirrel monkey ACTR8 gene has 12 exons and encodes a shorter protein of 616 amino acids. No AluSz6-exonized transcripts were found in the NCBI database for any of the nine primates. Next, we performed genomic PCR on genomic DNA samples to determine when AluSz6 was integrated. Amplicons containing AluSz6 were detected in all the primates studied (Fig. 1B). These results indicate that AluSz6 was integrated into the primate genome before the divergence of the simian and prosimian lineages.
Identification of the Alu-derived and alternative splicing variants of the ACTR8 gene
To confirm the occurrence of AluSz6 exonization of the ACTR8 gene in the crab-eating monkey, reverse transcription (RT) PCR was performed using two validation primer pairs (V1 and V2) (Fig. 2A and Additional file 1: Table 1). The V1 primer pair was designed to identify transcript variants, and the V2 primer pair was used to detect transcripts containing the Alu-derived exon. In total, seven transcripts were identified in the crab-eating monkey: the V1 primers detected five transcripts, and the V2 primers detected two (Fig. 2B). Sequence analysis revealed that the variants originated from multiple AS events, including exon skipping, alternative 3′ SS and 5′ SS usage, intron retention, mutual exclusion, and Alu exonization (Fig. 2C) [5, 6]. The TV1 transcript skips exon 8 and carries exon 9a, which is 19 bp longer than exon 9 because of an alternative 3′ SS. The TV2 transcript has exon 7a and an AluSz6-derived exon, which are generated by mutual exclusion and Alu exonization, respectively. TV3 and TV4 share the same AluSz6-derived exon but carry exon 9 and exon 9a, respectively, through differential use of the alternative 3′ SS. TV5 is generated by simultaneous AluSz6 exonization and intron retention. TV6 has a longer AluSz6-derived exon because of an alternative 5′ SS.
Because Alu-exonized transcripts often exhibit tissue-specific expression patterns [34-36], we profiled ACTR8 gene expression in various tissues of the crab-eating monkey, including the cerebellum, cerebrum, heart, kidney, lung, pancreas, spleen, and testis. Variant-specific RT-PCR primers were designed for the seven transcripts based on the splice junctions of each (Fig. 2C). The RT-PCR results did not reveal tissue-specific ACTR8 gene expression. The original transcript was ubiquitously expressed in all tissues evaluated, whereas the other transcript variants (TV1–TV6) showed low or no expression across tissues (Fig. 2D). We further investigated ACTR8 gene expression in other primates (human, rhesus monkey, African green monkey, marmoset, and squirrel monkey) using cerebellum cDNA samples and transcript variant-specific primers (Fig. 2E). In humans and Old world monkeys, the transcript variants showed the same expression patterns as in the crab-eating monkey, and each variant appeared to be expressed at a similar level. Notably, New world monkeys expressed only the original transcript. Our data suggest that the expression of the ACTR8 gene is regulated by lineage-specific AS events in primates.
Bioinformatics analysis of alternative splicing transcripts
Multiple sequence alignment of the AluSz6-derived exon in nine primates (human, chimpanzee, gorilla, rhesus monkey, crab-eating monkey, African green monkey, marmoset, squirrel monkey, and ring-tailed lemur) demonstrated that a novel G duplication is used as the new 5′ SS for the AluSz6-derived exon in Old world monkeys and apes (Fig. 3 and Additional file 1: Fig. S1A). These results suggest that lineage-specific Alu exonization in the ACTR8 gene might be caused by the G duplication mutation, which is absent in New world monkeys and prosimians. We verified this finding by comparing our results with the AluSz6 sequences from the UCSC genome browser, which confirmed that the G duplication in the AluSz6 sequence differs between primate lineages (Additional file 1: Fig. S4).
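The kind of comparison described above, looking for a base present in some lineages but absent in others, can be sketched as a scan over a gapped alignment. This is only an illustrative sketch, not the procedure used in this study: the function name `insertion_columns` is hypothetical, and the toy sequences are invented placeholders rather than the real AluSz6 alignment.

```python
def insertion_columns(alignment):
    """Find alignment columns where only some sequences carry a base.

    `alignment` maps a species name to a gapped sequence ('-' marks a gap).
    Returns a list of (column_index, {species: base}) for every column in
    which at least one sequence has a gap and at least one has a base.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")

    hits = []
    for col in range(lengths.pop()):
        with_base = {
            species: seq[col]
            for species, seq in alignment.items()
            if seq[col] != "-"
        }
        # Keep only columns that split the species into two groups.
        if 0 < len(with_base) < len(alignment):
            hits.append((col, with_base))
    return hits


# Invented placeholder sequences (assumption), NOT real AluSz6 data.
toy_alignment = {
    "human": "ACAGGTAAG",
    "rhesus_monkey": "ACAGGTAAG",
    "marmoset": "ACAG-TAAG",
    "ring_tailed_lemur": "ACAG-TAAG",
}

hits = insertion_columns(toy_alignment)
for column, carriers in hits:
    print(column, carriers)
```

In this toy input, the extra G at column 4 is reported only for the two species that carry it, mirroring how a lineage-specific duplication would stand out in a real alignment.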
Moreover, we identified lineage-specific exon inclusion or exclusion events that are independent of AluSz6 integration. The TV2 transcript carrying exon 7a showed lineage-specific expression, although the splice sites (donor and acceptor sites) were well conserved in all primates evaluated (Additional file 1: Fig. S1). Branch point analysis of the sequences surrounding exon 7a using the SVM-BP finder predicted several branch point candidates. Among these, the TTATAAGAT sequence had the highest predicted potential for the inclusion of exon 7a. This sequence was located 21 bp upstream of the 3′ SS of exon 7a in Old world monkeys and apes, but not in New world monkeys and prosimians (Additional file 1: Fig. S1). The lineage-specific mutual exclusion exon, exon 7a, is therefore likely to be spliced in because of this branch point difference. In the squirrel monkey, exons 2 and 3 were found to be longer than in the other primates (Fig. 1A), so we examined the splice sites of these constitutive exons. Interestingly, the squirrel monkey has a TA sequence, a specific acceptor splice site, whereas Old world monkeys and apes do not have the 5′ SS consensus sequence in the same region (Additional file 1: Fig. S2). The TA acceptor splice site is also present in marmosets and lemurs; although they are likely to express the longer exons 2 and 3, we did not confirm this experimentally (Additional file 1: Fig. S2). Therefore, besides AluSz6, other factors also promote lineage-specific expression of the primate ACTR8 gene.
Analysis of ACTR8 isoform structure and function
To assess how the seven transcript variants affect translation and protein function, we performed in silico analysis using the ORF finder (http://www.ncbi.nlm.nih.gov/projects/gorf/) and the Pfam database (https://pfam.xfam.org). The seven transcript variants produced a total of four ACTR8 protein isoforms: the original and three alternative isoforms. The original transcript encodes the full-length protein of 624 amino acids, including an ATP-binding site (amino acids (aa) 55–56, 288, and 290) and a nucleotide-binding site (aa 283–286) [37, 38]. The TV1 transcript encodes isoform 1 with 579 aa, which lacks aa 308 to 352. TV2 encodes isoform 2 with 341 aa, and TV3–TV6 encode isoform 3 with 304 aa (Fig. 4). Exon 7a and AluSz6 introduce a premature termination codon (PTC) in TV2–TV6, yielding truncated proteins: isoforms 2 and 3 lack the C-terminal region (Additional file 1: Fig. S3). Notably, isoforms 2 and 3 retain the ATP-binding pocket and nucleotide-binding site, which are essential functional domains for correct ACTR8 functioning. According to previous studies, the N-terminal region of ACTR8 is critical for protein functional activity, and N-terminal deletions have deleterious effects on the expressed protein [39-41]. In contrast, deletions in the C-terminal region did not lead to defects in ACTR8 function [37]. Thus, although the Alu-derived transcripts in the ACTR8 gene alter the protein structure, the resulting novel isoforms may produce lineage-specific proteins without loss of function.
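The effect of a PTC and of an in-frame deletion on protein length can be illustrated with a short sketch. It is not the ORF finder itself: `translate_orf` is a hypothetical helper that translates a coding sequence with the standard genetic code until the first stop codon, and the toy sequences are invented for illustration. Only the residue numbers in the final arithmetic (624 aa full length, aa 308 to 352 missing from isoform 1) come from the text above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate_orf(cds):
    """Translate a coding sequence from its first base until the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon: translation ends here
            break
        protein.append(aa)
    return "".join(protein)


# Invented toy sequences (assumption), NOT real ACTR8 sequence.
toy_original = "ATGGCTGAAACCTGGTAA"
# Inserting an exon that carries an in-frame stop codon (TGA) truncates the
# product: everything downstream of the insertion is lost from the C-terminus.
toy_inserted_exon = "GTTTGA"
toy_with_exon = toy_original[:6] + toy_inserted_exon + toy_original[6:]

full_length = translate_orf(toy_original)   # "MAETW"
truncated = translate_orf(toy_with_exon)    # "MAV"
print(full_length, truncated)

# In-frame deletion arithmetic for isoform 1, using the numbers reported above.
deleted_residues = 352 - 308 + 1            # aa 308 to 352, inclusive
isoform1_length = 624 - deleted_residues
print(deleted_residues, isoform1_length)    # 45 579
```

The toy example shows why a PTC removes the C-terminal portion while leaving N-terminal residues untouched, which is consistent with isoforms 2 and 3 keeping the binding sites located before residue 304.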