Identification of novel candidate genes implicated in odontogenic potential in the developing mouse tooth germ using transcriptome analysis

In tooth bioengineering for replacement therapy of missing teeth, the utilized cells must possess an inductive signal-forming ability to initiate odontogenesis. This ability is called odontogenic potential. In mice, the odontogenic potential signal is known to be translocated from the epithelium to the mesenchyme at the early bud stage in the developing molar tooth germ. However, the identity of the molecular constituents of this process remains unclear. The purpose of this study is to determine the molecular identity of odontogenic potential and to provide a new perspective in the field of tooth development research. In this study, whole transcriptome profiles of the mouse molar tooth germ epithelium and mesenchyme were investigated using the RNA sequencing (RNA-seq) technique. The analyzed transcriptomes corresponded to two developmental stages, embryonic day 11.5 (E11.5) and 14.5 (E14.5), which represent the odontogenic potential shifts. We identified differentially expressed genes (DEGs), which were specifically overexpressed in both the E11.5 epithelium and E14.5 mesenchyme, but not expressed in their respective counterparts. Of the 55 DEGs identified, the top three most expressed transcription factor genes (transcription factor AP-2 beta isoform 3 [TFAP2B], developing brain homeobox protein 2 [DBX2], and insulin gene enhancer protein ISL-1 [ISL1]) and three tooth development-related genes (transcription factor HES-5 [HES5], platelet-derived growth factor D precursor [PDGFD], semaphrin-3 A precursor [SEMA3A]) were selected and validated by quantitative RT-PCR. Using immunofluorescence staining, the TFAP2B protein expression was found to be localized only at the E11.5 epithelium and E14.5 mesenchyme. Thus, our empirical findings in the present study may provide a new perspective into the characterization of the molecules responsible for the odontogenic potential and may have an implication in the cell-based whole tooth regeneration strategy.


Introduction
In vertebrates, organogenesis depends on a well-organized series of induction phenomena, especially on the interactions between the epithelium and mesenchyme, including various signaling pathways (Ribatti and Santoiemma 2014). Odontogenesis is a complex process that involves reciprocal induction between the ectodermal epithelium and mesenchyme originated from the neural crest (Balic and Thesleff 2015). Owing to its complexity, teeth development can be used as a tool to study complicated organogenesis, as indicated by various in vivo models and human mutation studies (Mitsui et al. 2016;Wang et al. 2009a). Odontogenesis includes biological events such as differentiation, morphogenesis, and mineralization, and parameters such as tooth structure, position, and shape are strictly controlled by intraor inter-tissue signaling which involve numerous genes and growth factors (Bei 2009;Yu and Klein 2020).
Odontogenic potential is defined as the capability of a specific tissue to induce gene expression in an adjacent tissue and initiate odontogenesis. In developing mouse, the odontogenic potential first appears on the dental lamina stage, corresponding to mouse embryonic day 10.5, initially at the dental epithelium. Then, at the early bud stage, which corresponds to mouse embryonic day 12.5, it is shifted to the mesenchyme and prolonged up until birth (Lumsden 1988). The previously mentioned events were validated by experiments that induced odontogenesis by recombining epithelium or mesenchyme possessing odontogenic potential with non-dental cells, such as embryonic stem cells, bone marrow-derived cells, keratinocytes, gingival epithelial cells, and hair follicle dermal papilla cells (Angelova Volponi et al. 2013;Otsu et al. 2012;Ruan et al. 2018;Wu et al. 2009;Yu et al. 2007). Additionally, it was reported that the odontogenic potential was preserved in the epithelium and mesenchyme of the human tooth germ, although its conversion issuing time was different (Hu et al. 2014).
Bone morphogenetic protein 4 (BMP4) is a molecule that contributes to signal transduction between the epithelium and mesenchyme during the early stages of odontogenesis and its expression profile and onset age appear to coincide with the odontogenic potential shift (Vainio et al. 1993). Additionally, it was reported that growth factors such as Bone morphogenetic protein 7 (BMP7) and Epiregulin (EREG) could help adult dental pulp cells in recovering inductive potential, thereby implying that pulp cells with previously no inductive potential could differentiate adjacent non-dental epithelial cells to ameloblasts (Gao et al. 2015). These findings suggest that these molecules may be part of the odontogenic induction signaling pathways. However, the odontogenic potential mechanisms are believed to be much more intricate than what is currently known, including interactions between various signal transduction pathways, transcription controlling factors, and extracellular matrix composed of secretory products of tissues (Zheng et al. 2016a, b). Nevertheless, the traditional molecular biological research methods, which focus on investigating the roles of particular genes or single signaling pathways, have limitations. Thus, to clarify these unknown mechanisms, a comprehensive systemic approach is needed such as genome-wide screening and transcriptomic analysis. These research methods not only explore the odontogenic potential signaling components themselves, but can also contribute as a critical step towards stem cell-based bioengineered teeth replacement.
This study aimed to determine the molecular identity of odontogenic potential by exploring novel genes and proteins and to reveal their functions, thereby providing a new perspective to odontogenesis and, potentially, aiding in the development of tooth regeneration techniques.

Tissue sample
All animal procedures were approved by the Animal Care Committee of the Chonnam National University of Korea (CNU IACUC-YB-2020-11). The day the vaginal plug was found was designated E0.5. After embryos were collected from pregnant female mice at 11.5-day and 14.5-day gestation, respectively, the molar tooth germs were microdissected from their mandibles. In this study, to improve the reproducibility of transcriptome profiles, we carried out three biological replicate experiments for each embryonic stage from a total of 11 embryos. The isolated tissues were treated with 0.75 mg/ml of dispase II for 30 min at 37 ℃ to separate the epithelium from the mesenchyme. The separated tissues were then placed on ice and used for extraction of the total RNA. Approximately 5 g of each tissue was treated for 10 min with a Reverse Transcription (RT) kit with an RNA Stabilization Reagent (Qiagen, Chatsworth, CA, USA,cat. no. 76,104). Total RNA was extracted with 10 ml of Tri-Reagent (Molecular Research Center, Inc., Cincinnati, OH, USA) following the manufacturer's instructions. The RNA pellets were dissolved in DEPC-treated water. The water with total RNAs was treated with DNase I (Sigma-Aldrich, St. Louis, MO, USA, cat. no. AMPD1) to eliminate gDNA contamination.

RNA sequencing
After qualifying and quantifying the total RNA using the NanoDrop1000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA) and the Bioanalyzer 2100 (Agilent Technologies, Palo Alto, CA, USA), 1 µg of total RNA was used with the TruSeq RNA library preparation kit (Illumina, San Diego, CA, USA) according to the manufacturer's instructions. Briefly, for removal of bacterial and eukaryotic ribosomal RNA (rRNA), the total RNA sample was initially treated with the Ribo-Zero rRNA Removal Kit (Epicentre, Madison, WI, USA). The remaining RNA was chemically fragmented and converted into single-stranded cDNA using reverse transcriptase and random hexamers. Then, a second strand was synthesized to produce double-stranded cDNA ready for application to TruSeq library construction using DNA Polymerase I and RNase H. After an end repair process, single 'A' bases were added to the cDNA fragments and the ligation of the index adapters was performed. Finally, the cDNA libraries were sequenced using the Hiseq 2000 platform (Illumina) to obtain approximately 6 giga base pairs (Gbp) per sample. All raw sequenced data have been deposited in the NCBI Short Reads Archive (SRA) database with the accession number SRX9609228-SRX9609231 under BioProject PRJNA681713.

Bioinformatic analysis
All raw sequence reads generated from the NGS libraries were initially pre-processed by Trimmomatic (v0.36) to trim the adaptor sequence and remove low-quality sequences. The remaining trimmed reads were aligned to the mouse genome (mm10) using Hisat2 (v2.1.0) (Kim et al. 2019) with the default parameter. For quantified known transcripts, data on the resulting aligned reads were put into Cufflinks (v2.2.1) (Trapnell et al. 2010) and, unless stated otherwise, all gene expression levels in our analysis were normalized by fragments per kilobase of exon per million fragments mapped (FPKM). A differential gene expression analysis was carried out using Cuffdiff (v2.1.1) (Trapnell et al. 2012). In this study, we defined genes as differentially expressed using a fold change (FC) cut-off analytical approach (FC ≥ 2) in each comparison. The mouse reference genome and annotation data were obtained from the UCSC genome browser (https:// genome. uscs. edu), and the data for visualization were generated using the R software (R Development Core Team, Vienna, Austria). The experimental procedures for transcriptome analysis are illustrated in Fig. 1.

Real-time quantitative RT-PCR (RT-qPCR)
The mRNA expression levels of transcription factor AP-2 beta isoform 3 [TFAP2B], developing brain homeobox protein 2 [DBX2], insulin gene enhancer protein ISL-1 [ISL1]), transcription factor HES-5 [HES5], platelet-derived growth factor D precursor [PDGFD], and semaphrin-3 A precursor   [SEMA3A] were assessed by real-time RT-qPCR. The cDNA was synthesized from the total RNA obtained from the E11.5 and E14.5 mouse mandibular molar tooth germ using a RT system containing Moloney Murine Leukemia Virus reverse transcriptase (Promega, Madison, WI, USA). Table 1 shows the primer sequences and expected product sizes. Equal amounts of cDNA were used for real-time amplification of the target genes using a Rotor-Gene 3000 (Corbett Research, Mortlake, Australia). The amplified cDNA was detected using the SYBR Green PCR Master Mix Reagents Kit (Qiagen, Valencia, CA, USA). PCR conditions were as follows: incubation for 2 min at 95 °C, followed by 45 cycles of 5 s of denaturation at 95 °C, and 15 s for annealing and extension at 62 °C. The reaction mixture lacking cDNA was used as a negative control in each run. Data were analyzed using the Corbett Robotics Rotor-Gene software (version 6.1, Build 90 software). The ratios of the intensities of the target genes and β-actin signals were used as relative measures of the target gene expression levels. The primers were tested with conventional RT-PCR to optimize reaction conditions to generate a single PCR product on gel electrophoresis. To ensure experimental accuracy, we conducted real-time PCR assays in triplicate for each sample. The mean FC of the expression levels in the experimental group, when compared with that of the control group, was calculated using the 2 −ΔΔCt method, and the range of the FCs was calculated from the standard error of the values.

Immunofluorescence staining
The E11.5 and E14.5 embryo head tissues were harvested and fixed in a 3.7 % paraformaldehyde solution. Frontal serial sections from the paraffin-embedded tissues were cut at 5-µm thickness. The immunofluorescence staining was performed using the TSATM Kit (Invitrogen, Carlsbad, CA, USA). Briefly, after blocking endogenous peroxidase, the sections were allowed to react with the primary antibody against TFAP2B (Novos Biologicals, Centennial, CO, USA) overnight, and subsequently with the HRP-conjugated secondary antibody (Cell Signaling Technology, Beverly, MA, USA). The sections were incubated in Alexa Fluor 488 ® tyramide working solution, and then they were photographed using an LSM confocal microscope (Carl Zeiss, Standort Gȍttingen-Vertrieb, Germany).

Transcriptome analysis
To investigate the gene expression profile changes during embryonic odontogenesis, we collected both dental epithelial and mesenchymal tissues from the lower molar tooth germ of embryonic day 11.5 (E11.5) and embryonic day 14.5 (E14.5) mice, and sequenced an average of 75 million raw reads (more than 2.5× coverage per sample), which provide a sufficient depth of coverage for detecting the most expressed genes in the genome . After trimming the raw reads (Table 2), a total of 288 million reads (96.1 %) were used for further reference-based transcriptome analysis (Fig. 1). On average, we obtained 72.1 million clean reads per sample, and they were independently mapped to the mouse reference genome. Considering the FPKM value higher than 1, nearly half of the annotated genes in the mouse genome were expressed in each of the tissues, with 11,422 and 11,580 genes expressed in E11.5 and E14.5 epithelial cells, respectively, and 11,446 and 12,242 genes expressed in E11.5 and E14.5 mesenchymal cells, respectively. To ascertain whether our transcriptome data reflected the tooth development and mutation processes, we built expression distance matrices for each developmental stage and constructed a gene expression tree ( Fig. 2A). The two different development stages in each epithelial and mesenchymal cell type were grouped together, thereby indicating that the tooth development stages appeared to be primarily manifested at the transcriptome level.

Identification of the putative odontogenic potential-related genes
To identify the odontogenic potential-related genes that showed dynamic expression changes during the odontogenic potential shifts from the epithelium to the mesenchyme, two pairwise transcriptome comparisons were conducted. The aim of the first comparison was to identify genes relatively over-expressed in epithelial versus mesenchymal cells at E11.5 because on that day, the odontogenic potential first resides in the epithelial compartment. We found that 1356 genes were expressed (FPKM greater than 1) with a minimum FC of 2 in the epithelial cells at E11.5 (Fig. 2B). The next comparison was drawn between the same cell types in different embryonic stages, because at the early bud stage, the odontogenic potential is shifted from the epithelium to the mesenchyme. Using the same comparison criteria, we found that 2269 genes were overexpressed in the mesenchymal cells at E14.5 (Fig. 2C). From the obtained comparison results, the 55 most commonly expressed tissue-and stage-specific genes were newly identified as odontogenic potential-related candidate genes (Fig. 2D) and, as expected, they were the most abundantly expressed in either epithelial cells at E11.5 or mesenchymal cells at E14.5 (Fig. S1).

Validation of the RNA sequencing (RNA-seq) data using RT-qPCR
To verify the reliability of the RNA-seq results, the top three most expressed transcription factor genes (TFAP2B, DBX2, and ISL1) and three tooth development-related genes (HES5, PDGFD, and SEMA3A) were selected (Walker et al. 2019;Wu et al. 2010;Fujii et al. 2019) (Fig. 2E) and their expression levels were assessed by RT-qPCR. Consistent with the RNA-seq data, the mRNA expression levels of DBX2, ISL1, and TFAP2B were significantly higher in the E11.5 epithelial and E14.5 mesenchymal tissues when compared with those in the E11.5 mesenchymal and E14.5 epithelial tissues. However, temporospatial changes in the expression levels of HES5, PDGFD, and SEMA3A were not statistically significant but these expression patterns were retained (Fig. 3).

Immunofluorescent localization of Tfap2b in the developing molar tooth germ
Next, based on the above-mentioned results, the temporospatial expression pattern of TFAP2B was observed by immunofluorescence staining. Its expression was restricted to the epithelium of the lower molar tooth germ at the dental lamina stage (E11.5), while it was strongly localized in the lower molar mesenchyme at the cap stage (E14.5). In the upper molar mesenchyme at the cap stage, TFAP2B expression was localized as well. However, it was mainly detected in the E11.5 upper molar mesenchyme differing from the expression pattern in the lower molar tooth germ (Fig. 4).

Discussion
The use of stem cells such as dental pulp stem cells (DPSCs) has been recognized as the most promising approach in clinical dentistry for repairing damaged teeth (Sui et al. 2019). DPSC transplanted to in vivo tissues could differentiate to odontoblast and originate dentine-like structures (Zhu et al. 2018). However, recent studies have reported that the stem cell-based whole tooth regeneration strategy has remarkable limitations, because usually such stem cells lack the ability to generate induction signals to initiate the tooth development process (Hu et al. 2014). To identify novel odontogenic potential-related genes, we initiated the present study with the well-stablished findings that the odontogenic potential signals are translocated from the epithelium to the mesenchyme at an early bud stage in mice and compared the transcriptome-wide changes in gene expression on dental epithelial and mesenchymal tissues from the lower molar tooth germ of E11.5 and E14.5 mice.
With the growing popularity of the RNA-sEq. technology, differential expression analysis for RNA-seq data has Fig. 3 Relative difference in the mRNA expression obtained from the epithelium and mesenchyme of the developing mouse lower molar tooth germ. The expression levels of DBX2, ISL1, HES5, PDFGD, TFAP2B, and SEMA3A genes were evaluated by RT-qPCR analysis. β-actin was used as a control. The data represent mean ± S.E. from three independent experiments Fig. 4 Immunofluorescence findings of TFAP2B expression in the developing mouse molar tooth germ. At E11.5, TFAP2B is expressed in the lower molar epithelium, whereas its expression is mainly observed in the lower molar mesenchyme at E14.5 1 3 been widely used, which is carried out as per the following steps: (1) sequencing of the fragmented complementary DNA sequences from the RNA samples, (2) mapping of the small generated sequences to a reference genome, (3) estimation of the expression levels for each gene, (4) normalization of the mapped sequence reads and identification of the differentially expressed genes (DEGs), and (5) evaluation of the relevance of the identified genes using other in-vivo or in-silico analyses (Oshlack et al. 2010;Wang et al. 2009b). Unlike traditional methods of DEG analysis, where negative binomial distribution and local regression along with false discovery rate (FDR) cut-off of 0.05 are used to estimate the relationship between the variance and the mean of expression levels (Rapaport et al. 2013;Trapnell et al. 2013), in this study, we employed a simple FC statistics model. As the synchronization of the developmental stages was crucial for our experiments, instead of using a multi-round sample pooling strategy, a two-round method was used in this study as the cutoff to choose the DEGs in which the FCs were larger than the cutoff. Finally, regarding the 55 DEGs that had a relatively abundant expression in epithelial cells at 11.5 and mesenchymal cells at 14.5, no enriched Gene Ontology terms were found within the threshold of FDR < 0.05. However, a total of seven transcription factor genes (TFAP2B, DBX2, ISL1, JDP2, TOX2, HIVEP2, and AHR) and several tooth development-related genes (HES5, PDGFD, and SEMA3A) were found in this DEGs group.
In this study, by using RT-qPCR and immunofluorescence analysis, we investigated and proposed that TFAP2B might be one of the genes conferring odontogenic potential in tissues. This finding is also substantially supported by recent literature, which reports that TFAP2B mutations are strongly associated with dental anomalies such as tooth agenesis and abnormal development in teeth size and number (Mani et al. 2005), in addition to being one of the most common causes of patent ductus arteriosus (PDA) Tanasubsinn et al. 2017). Intriguingly, based on our immunofluorescence analysis, we observed that TFAP2B showed non-homogeneous expression patterns between E11.5 upper and lower molar tooth germs. This finding is probably not due to the different roles of this gene in the development of the upper and lower teeth, but the temporal difference in development, which is supported by a previous study about the developmental differences between maxillary and mandibular teeth in a keratin 14-Noggin transgenic mouse (Plikus et al. 2005).
In conclusion, we compared the gene expression profiles in the developing mouse tooth germ before and after the odontogenic potential shift using RNA-sEq. We identified several transcription factor genes with potential roles in the odontogenic potential signaling pathways. Further functional studies and the data from chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) will be helpful for understanding the dynamic changes in chromosome accessibility and regulatory network in the epithelium and mesenchyme during the tooth development process. Altogether, these results can provide a molecular cytological basis for the development of bioengineered tooth regeneration techniques to replace damaged or lost teeth, which is currently one of the most studied topics in dental research.