Re-evaluating the transcription start site of tammar MEST.
Although MEST imprinting has been characterised in both the South American grey short-tailed opossum (Monodelphis domestica) (Das et al., 2012) and the tammar (Suzuki et al., 2005), the transcription start sites (TSSs) of the MEST isoforms are not yet well defined in marsupials. To characterise the tammar MEST gene locus, we searched the wallaby genome database (Wallabase: https://wallabase.science.unimelb.edu.au/) for the putative tammar MEST gene and compared it with the mouse Mest protein (Accession number: NP_032616.1). A 2859 bp putative tammar MEST sequence was identified (Figure 1A). Since the MEST gene has isoform-dependent imprinted expression in eutherians (Kosaki et al., 2000; Li et al., 2015; Reule et al., 1998; Riesewijk et al., 1997; Yonekura et al., 2019), we examined isoforms of MEST by 5' RACE experiments using adult tammar testis. As MESTA shares its protein-coding sequence with the longer isoform, MESTB, in eutherians, the 5' RACE reactions were performed using primers designed against the protein-coding sequence of the putative tammar MEST (Figure 1A: green coloured boxes). Sequencing of the 5' RACE products confirmed that there are three isoforms of MEST (Figure 1B and C). The longest isoform was expressed from a more upstream CpG island that was distinct from the one used by the other two isoforms (Figure 1C). We named the shortest isoform MESTA (DDBJ accession number: LC747011) and the longest isoform MESTB (DDBJ accession number: LC747012) in accordance with the eutherian MEST isoforms. The intermediate-size isoform was named MESTC. Although each isoform had a different TSS, all three shared the same translation start site.
Identification of two isoforms of MEST in monotremes.
To ask whether the short isoform of MEST is specific to therian mammals, we examined MEST isoforms in monotremes by 5' RACE experiments using adult platypus testis. To characterise the monotreme MEST gene locus, we searched for the platypus orthologue of MEST with NCBI BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), comparing against the tammar MEST amino acid sequence, and identified a 2374 bp putative platypus MEST (Accession number: XM_001511283.6) (Figure 2A). Next, 5' RACE reactions were performed using primers designed against the protein-coding region of the putative platypus MEST (Figure 2A and B: green coloured boxes). Sequencing of the RACE products confirmed that platypus has two isoforms of MEST (Figure 2B and C). Their TSSs were located in different CpG islands (Figure 2C). We named the shorter isoform MESTA (DDBJ accession number: LC747014) and the longer isoform MESTB (DDBJ accession number: LC747015) in accordance with the eutherian and marsupial MEST isoforms. Although each isoform had a different TSS, both shared the same translation start site.
Identification of an orthologue of MESTIT1 in the tammar wallaby.
Since the human lncRNA MESTIT1 is expressed from the DMR at the promoter of human MESTA (Li et al., 2002; Nakabayashi, 2002), we asked whether similar antisense transcripts are present around the TSS of the tammar MESTA by analysing stranded transcriptome data sets. In adult tammar testis transcriptome data, antisense-mapping reads were present in the CpG island near the TSS of the tammar MESTA (Figure 3A). After isolating a partial transcript by PCR with a primer specific to the antisense transcript candidate, 5' and 3' RACE experiments were performed to determine the full length of the transcript (Figure 3B). While the 5' RACE reaction yielded one band, the 3' RACE reaction yielded three different bands. Of these, two 3' RACE products were non-coding transcripts with alternative poly-A signals and distinct poly-A tails (DDBJ accession numbers: MESTIT1 isoform 1, LC746974; MESTIT1 isoform 2, LC746975) (Figure 3B: black asterisks). Surprisingly, the largest 3' RACE product encoded an isoform of CEP41, the gene neighbouring MEST (later renamed CEP41A; DDBJ accession number: LC747013) (Figure 3B: red asterisks). The genes neighbouring the putative MESTIT1 orthologue in the tammar showed synteny with the corresponding genes in the human genome, suggesting that the MESTIT1 gene is conserved between the tammar and human genomes.
To ask whether the lncRNA MESTIT1 is present in tammar sperm, as it is in human sperm, RT-PCR analysis was performed. PCR amplification was observed in all sperm samples, and only after reverse transcription (Figure 3C).
Tammar has two isoforms of CEP41.
During the RACE reactions used to identify MESTIT1 in the tammar, an isoform of CEP41 was identified. To determine whether the tammar wallaby has CEP41 isoforms other than the one identified by the MESTIT1 RACE experiment, we examined the CEP41 locus further. The wallaby genome database (Wallabase: https://wallabase.science.unimelb.edu.au) contained a putative CEP41, but its exon structure differed from that of the CEP41 identified in this study (Figure 4A). Therefore, we renamed the isoform identified by 3' RACE as CEP41A (Figure 4A). The putative CEP41 and CEP41A shared several exons. To confirm the presence of other isoforms of CEP41, 5' RACE reactions were performed using primers designed against the exons common to CEP41A and the putative CEP41 (Figure 4B: green coloured boxes). Sequencing of the 5' RACE products identified two TSSs for the tammar wallaby CEP41. CEP41A shared a CpG island with marsupial MESTA. The other isoform shared a CpG island with MESTB. Because of its different exon structure, the newly identified CEP41 isoform was named CEP41B. Furthermore, we confirmed that the predicted protein encoded by CEP41B differs in its C-terminal region from the amino acid sequence encoded by CEP41A (Figure 4C).
CEP41A isoform is not present in either mouse or platypus
To ask whether CEP41A is a marsupial-specific isoform, the presence of CEP41 isoforms in mouse and platypus was examined by 5' RACE experiments using adult testes (Figure 5). First, the sequence of the mouse CEP41 gene was obtained from NCBI (Accession number: NM_031998.3). Since CEP41A shared several protein-coding exons with CEP41B in the tammar, 5' RACE reactions were performed using primers against the conserved region of mammalian CEP41 (Figure 5A: green coloured boxes). A single transcript was isolated from adult mouse testis. This transcript was identical to the known mouse CEP41 and had a gene structure similar to that of the tammar CEP41B (Figure 5A). Similar experiments were performed with adult platypus testis after obtaining a putative platypus CEP41 gene from NCBI (Accession number: XM_001511256.6). A single transcript was isolated from adult platypus testis. This transcript also had high homology to the tammar CEP41B (Figure 5B).
Genomic analysis of the MEST and CEP41 flanking regions
To investigate the locations of CpG islands relative to the CEP41 and MEST isoforms in mammals, the genomic structures of the MEST and CEP41 flanking regions in human, tammar and platypus were compared. In both tammar and platypus, two large domains of CpG islands exist in close proximity (Figure 6A: yellow highlighted regions). The two major MEST isoforms, MESTA and MESTB, were each expressed from one of the two large CpG island domains (Figure 6A). However, the intergenic region between CEP41 and MEST is expanded in the human genome, and the human MESTB did not share a CpG island with human CEP41 (Figure 6A). NCBI BLAST searches of the human MEST and CEP41 flanking region identified a LINE1 ORF1 in the vicinity of MEST. Similar LINE1 elements were also found in the mouse and elephant genomes by NCBI BLAST searches (Figure 6B). However, the elephant LINE1 element was located close to CEP41 (Figure 6B).
Neighbouring transcripts of MESTA are not imprinted in the tammar placenta tissues
Since human MESTIT1 is imprinted (Nakabayashi, 2002), it was possible that the tammar MESTIT1 is also imprinted. To test whether MESTIT1 is imprinted in the tammar wallaby, allelic expression of the gene was examined using tammar placenta tissues. First, SNP sites were examined by direct sequencing of genomic DNA with RT-PCR (Figure 7A: black arrows). After examining 29 samples, one SNP site was identified in the region common to the two isoforms (Figure 7A). For the shorter isoform, specific primers could not be designed because the SNP site was too close to the poly-A tail. However, the longer isoform could be detected from cDNA using the same primers as used for the SNP search (Figure 7A). Using these primers, the imprinting status of the tammar lncRNA was determined by direct sequencing of the PCR products that contained the SNP site (Figure 7A). Of the 29 samples examined, three were informative, combining a homozygous maternal genotype with a heterozygous fetal genotype (Figure 7B). In contrast to the gDNA PCR data, in which the two peaks at the SNP site had almost the same signal strength, the cDNA PCR products showed that the lncRNA was preferentially expressed from one of the two alleles (Figure 7B). Individuals #1 and #2 showed paternally skewed expression, but individual #3 showed maternally skewed expression (Figure 7B). Because the preferentially expressed allele was not consistently of the same parental origin, the lncRNA was not imprinted in the tammar placenta.
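The reasoning behind an informative sample (homozygous mother, heterozygous fetus) can be sketched in a few lines of Python. This is only an illustration of the logic, not the analysis pipeline used in this study: the function names `allele_origins` and `expressed_parent` are hypothetical, and the genotypes in the usage lines are made-up placeholders rather than data from Figure 7.

```python
def allele_origins(maternal_genotype: str, fetal_genotype: str) -> dict[str, str]:
    """Map each fetal allele to its parent of origin for an informative pair.

    Hypothetical helper: genotypes are two-letter strings such as "CC" or "CA".
    A pair is informative only if the mother is homozygous and the fetus is
    heterozygous; the fetal allele absent from the mother must then be paternal.
    """
    maternal = set(maternal_genotype)
    fetal = set(fetal_genotype)
    if len(maternal) != 1:
        raise ValueError("uninformative: mother is not homozygous")
    if len(fetal) != 2:
        raise ValueError("uninformative: fetus is not heterozygous")
    (maternal_allele,) = maternal
    if maternal_allele not in fetal:
        raise ValueError("fetus does not carry the maternal allele")
    (paternal_allele,) = fetal - maternal
    return {maternal_allele: "maternal", paternal_allele: "paternal"}


def expressed_parent(maternal_genotype: str, fetal_genotype: str,
                     major_cdna_allele: str) -> str:
    """Return the parent of origin of the allele that dominates the cDNA trace."""
    return allele_origins(maternal_genotype, fetal_genotype)[major_cdna_allele]


# Placeholder genotypes, for illustration only.
origins = allele_origins("CC", "CA")
parent = expressed_parent("CC", "CA", "A")
```

Under these assumptions, a locus would be called imprinted only if `expressed_parent` returned the same parent for every informative animal.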
Although CEP41 is not imprinted in mouse (Yamada et al., 2002), the tammar CEP41A shares a CpG island with the shorter MEST isoform, MESTA. It was therefore possible that marsupial CEP41A is imprinted together with MESTA. To test whether CEP41A is imprinted in the tammar wallaby, allelic expression of the gene was examined in tammar placenta tissues. Since each isoform has a unique TSS and unique exons, allelic expression analysis was performed using CEP41A-specific primers (Figure 7C and D). First, SNP sites in the 3' UTR were examined by direct sequencing of genomic DNA with PCR (Figure 7C: blue arrows). After examining 18 samples, two SNP sites were identified in the 3' UTR of the tammar CEP41A (Figure 7D: Arrowheads). Allelic expression analysis was performed using CEP41A-specific primers (Figure 7D). Of these 18 samples, two animals were a combination of maternal homozygous and fetal heterozygous. Fortunately, the maternal homozygous SNPs in these two samples were different from each other (Figure 7E). In animal #1, CEP41A was preferentially expressed from the maternal allele (C) in the BOM and TOM tissues. However, in animal #2, CEP41A was preferentially expressed from the paternal allele (C) in the TOM tissues (Figure 7E). CEP41A was not detectable in the BOM of animal #2, so its allelic expression could not be determined there (Figure 7E). In animal #3, CEP41A showed clear bi-allelic expression in the BOM placenta and skewed expression in the TOM placenta (Figure 7E). Therefore, in the tammar placenta, CEP41A was not imprinted.
Re-evaluating allelic expression of MEST in the tammar placenta tissues
Since CEP41A and MESTIT1 were not imprinted in the tammar placenta even though they shared a CpG island with the MEST gene, we re-evaluated MEST imprinting in the tammar placenta. The two major isoforms identified in this study were each expressed from a different CpG island (Figure 8A). Since each isoform has a unique TSS and unique exons, allelic expression analysis was performed using an isoform-specific forward primer for each isoform and a shared reverse primer (Figure 8B). Our RACE experiments could not identify the isoform previously described (Suzuki et al., 2005), but we re-examined its allelic expression using the same primers as Suzuki et al. (2005) (Figure 8A and B). The reverse primer was designed to detect the previously described C/A SNP site in the 3' UTR (Suzuki et al., 2005) (Figure 8B: Arrowhead). Seventeen samples were examined, and six animals were heterozygous at the SNP site. All six samples showed monoallelic expression (Figure 8C). The mothers of four of the six animals were not homozygous, but fortunately, the maternal homozygous SNPs in the remaining two samples were different from each other (Figure 8C). In animal #1, all isoforms were expressed exclusively from the paternal allele (A). However, in animal #2, all isoforms were expressed from the maternal allele (A) (Figure 8C). Therefore, none of the isoforms was imprinted in these two animals, although all isoforms were mono-allelically expressed in the tammar TOM tissues (n=6).
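The distinction drawn here between imprinted and merely monoallelic expression can be made explicit with a small Python sketch. It is an illustration under stated assumptions, not the authors' method: the `Observation` record and `classify_locus` function are hypothetical names, and the two observations simply restate animals #1 and #2 as described in the text (allele A expressed, of paternal and maternal origin respectively).

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """One informative animal with monoallelic expression (hypothetical record)."""
    animal: str
    expressed_allele: str   # base seen in the cDNA trace
    parent_of_origin: str   # "maternal" or "paternal"


def classify_locus(observations: list[Observation]) -> tuple[str, bool]:
    """Classify a monoallelically expressed locus from informative animals.

    Returns a label plus a flag saying whether every animal expressed the
    same base. Imprinting requires the same parental origin in all animals.
    """
    if len(observations) < 2:
        return "undetermined", False
    parents = {obs.parent_of_origin for obs in observations}
    same_allele = len({obs.expressed_allele for obs in observations}) == 1
    if len(parents) == 1:
        return f"consistent with {parents.pop()} expression (imprinted)", same_allele
    return "monoallelic but not imprinted", same_allele


# Animals #1 and #2 as described in the text above.
mest = [
    Observation("#1", "A", "paternal"),
    Observation("#2", "A", "maternal"),
]
label, same_allele = classify_locus(mest)
```

With these two observations the sketch reports monoallelic, non-imprinted expression and flags that the same base (A) was expressed in both animals; how to interpret that flag is left to the reader.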