Seven HSCR patients from different subgroups (4 S-HSCR, 2 L-HSCR and 1 TCA) were recruited at Queen Mary hospital, Hong Kong. Each patient was given a disease score according to the length of aganglionic segment (0-5; control= 0 and TCA=5) (Fig. 1A & B). The RET alleles in rs2435357 and variants in the coding region of RET gene were identified through sequencing the genomic DNA from the patients’ blood. In brief, among the four S-HSCR patients, HSCR#5, #10, #20 showed relatively mild phenotypes with disease scores 1 or 2, while HSCR#1 had a longer aganglionic segment up to the lower sigmoid, representing the intermediate phenotype between S- and L-HSCR. Skin biopsies were obtained from these patients for generation of iPSC lines. The study was approved by the institutional review board of The University of Hong Kong together with the Hospital Authority (UW 13-419).
Human induced pluripotent stem cells (iPSCs)
A control hiPSC line (IMR90-iPSC) was obtained from WiCell Research Resources (WiCell, WI., RRID:CVCL C434). Another control hiPSC cell line (UE02302) was generated from urine derived cells of a male individual by episomal reprogramming vectors carrying the four reprogramming factors33. All iPSCs used in this study were at the intermediate (35-65) passage numbers and maintained on Matrigel (Corning, 354277)-coated plate with mTeSR1 medium (StemCell Technologies, 85850).
Generation of iPS cell lines
Fibroblasts were derived from skin biopsies obtained from patients diagnosed with HSCR disease. Fibroblasts were then maintained and expanded in DMEM with 10% FBS. iPSC were generated from fibroblasts using episomal vectors carrying the four reprogramming factors (Oct4, Klf4, Sox2, and c-Myc) 34. In brief, 1 ´ 106 fibroblasts were transfected with episomal vectors using Nucleofector with transfection kit (Lonza, VPD-1001) and the transfected cells were then replated on a 35mm gelatin-coated well. The medium was changed 24 hours later and replaced daily from thereon. At day 7, cells were dissociated and replated on MEFs in ES medium (DMEM/F12 (Thermo Fisher Scientific,, 10565018) plus 10% KnockOut Replacement Serum (KSR) (Thermo Fisher Scientific,, 10828028), 0.5% L-glutamine (Thermo Fisher Scientific, 25030018), 1% NEAA (Thermo Fisher Scientific, 11140050), 0.1% β-mercaptoethanol (Thermo Fisher Scientific, 21985023), 0.5% penicillin-streptomycin (Thermo Fisher Scientific,15140122) and FGF2 (Peprotech, 100-18B). Cells were fed daily with ES medium. Following the appearance of hESC-like colonies after approximately 3–4 weeks, individual colonies were isolated and clonally expanded to establish iPSC lines.
Derivation of neural crest (NC) from iPSC lines
At day 0, Control or HSCR-iPSCs were seeded on Matrigel-coated plate (105 cells cm−2) in iPS cell medium containing 10 ng ml−1 FGF2 and 10 μM Y-27632 (Tocris Bioscience, 1254). Differentiation was then initiated by replacing iPSC medium with KSR medium, containing KnockOut DMEM (Thermo Fisher Scientific, 10829018) plus 15% KSR, 1% NEAA, 1% L-glutamine, 1X β-mercaptoethanol, 1% penicillin-streptomycin, LDN193189 (100 nM, Reprocell, 04-0074) and SB431542 (10 μM, Abcam, ab120163). The dual SMAD inhibitors and a potent GSK inhibitor were added at different time frame during the NC induction, including LDN193189 (from day 0 to day 3), SB431542 (from day 0 to day 4), 3 μM CHIR99021 (from day 2 to day 10, Reprocell, 04-0004). The NC cells were finally caudalized with 1 μM retinoic acid (Abcam, ab120728) (from day 6 to day 9). The KSR medium was gradually changed to N2 medium at day 4 by increasing N2 from 25% to 75% from day 4 to 9 as described previously7. The N2 medium contained neural basal medium (Thermo Fisher Scientific, 22103049) and DMEM/F12 in 1:1 ratio supplemented with 0.5% N2 supplement (Thermo Fisher Scientific, 17502048), 1% B27 supplement (Thermo Fisher Scientific, 17504044), 5 mg ml-1 insulin (Thermo Fisher Scientific, 12585014) and 1% penicillin-streptomycin. The ENCCs were enriched by FACS with antibodies against p75NTR and HNK-1 at day 10 of the differentiation as described7,8,20. FACS-sorted ENCCs were collected for single-cell RNA sequencing.
In vitro differentiation of ENCCs to neuronal progenitors (NPs)
Around 40 thousand FACS-enriched ENCCs were seeded as droplets on poly-ornithine/laminin/fibronectin (PO/LM/FN)-coated 24 well plate in N2 medium containing 10ng ml−1 FGF2, 3 μM CHIR99021 and 10 μM Y-27632. The neuronal differentiation started 48 hours later and the attached ENCCs were then cultured with N2 medium containing BDNF (10ng ml−1 Peprotech, 450-01), GDNF (10 ng ml−1, Peprotech, 450-10) and ascorbic acid (200 μM, Sigma, A4034-100G), NT-3 (10ng ml−1, Peprotech, 450-03), NGF (10ng ml−1, PeproTech, 450-01) and cAMP (1 μM, Sigma, D0260). For valproic acid (VPA) treatment, 1mM VPA (Sigma, P4543) was added during neuronal differentiation. The culture medium was changed every 2 days. ENCC-derived neurons at differentiation day 5 or 9 were fixed for immunocytochemistry analyses.
Fluorescence activated cell sorting (FACS) and flow cytometric analysis
For quantifying the ENCCs, the 10 day-differentiated cells were dissociated with Accutase and then incubated with anti-human antibodies including APC-HNK-1 (BD Pharmingen, 560845) and FITC-p75NTR (Miltenyi Biotec, 130-091-917) for 30-45 minutes on ice. To stain for PE-RET (Neuromics, FC15018), the cells were fixed in 4% paraformaldehyde for 10 minutes at room temperature and permeabilized using 0.1% (w/v) Saponin solution, then washed and blocked in PBS with 2% FBS. The cells are then stained with antibodies for 30-45 minutes on ice. Approximately 106 cells were stained and labeled cells were detected using FACSAriaIII (Becton Dickinson Immunocytometry Systems, San Jose, CA, USA). Isotype-matched antibodies were used as controls. FlowJo version 8.2 (Tree Star, Inc.) was used to analyze flow data.
For cell sorting, HNK-1/p75NTR stained cells were washed and resuspended in PBS with 2% FBS. The HNK-1and p75NTR double positive cells were enriched using fluorescence activated cell sorting (FACS) with BD FACSAria III Cell Sorter. The HNK-1 and p75NTR double positive cells were gated and sorted using the four-way purity mode and the purity of sorted cells was >96% and evaluated by flow cytometry. The sorted neural crest cells were collected for immunostaining or subsequent experiments. A list of primary antibodies and the working dilutions are provided in Supplementary Table 1.
For immunocytochemistry, the cells were fixed with 4% paraformaldehyde in PBS at room temperature for 30 min, followed by blocking with 1% bovine serum albumin (BSA) (Thermo Scientific, 23209) with or without 0.1% Triton X-100 (Sigma, T8787) in PBS buffer. Cells were then incubated in primary antibody overnight at 4°C and host-appropriate Alexa Fluor -488 or 594 secondary antibody (Molecular Probes, Invitrogen) (Supplementary Tables 1 & 2) for 1 h at room temperature. Cells were then counterstained with mounting medium with DAPI (Thermo Scientific, P36931) to detect nuclei. Cells were photographed using Carl Zeiss confocal microscope (LSM 810). Quantitative image analysis of differentiated neuronal cultures was performed with ImageJ plug-in tool. The total signal of neuronal marker was normalized with DAPI signal and the values reported in bar charts represent the mean ± SEM.. The same approach was used to quantitate the expression of neuronal marker in various groups. A minimum of 4,000 cells were analyzed per sample.
FACS-enriched ENCCs were plated on human fibronectin coated 12-well culture plates (30,000 cells cm−2). After 24 hours, cells were treated with mitomycin (10μg ml−1) to stop the cell proliferation. A wound was created in the center of each well by scratching with a pipette tip. Cells were allowed to migrate for 18 hours. The images of the initial wound and final wound were captured immediately and at 18 hours after scratching. The migration distance was obtained by comparing the width of the initial wound created and wound closure during 18 hours.
Reverse Transcription-PCR (qRT-PCR)
Total RNA was isolated from iPSC-derived ENCCs using RNeasy Mini Kit (Qiagen); and reverse transcribed in 20μl reaction system using HiScript II Q RT SuperMix (Vazyme, R223-01), in accordance with the manufacturer’s instructions. diluted cDNA samples were amplified by Luna® Universal qPCR Master Mix (New England BioLabs, M3003) with forward and reverse primers specific for HDAC1, PKMEx2-4, PKM1 (PKM-Ex9) and PKM2 (PKM-Ex10). Fluorescence signal was measured by ViiA 7 Real-Time PCR System (Thermo Fisher Scientific) at the end of each cycle. Each individual sample was assayed in triplicate and gene expression was normalized with GAPDH expression. The primer sequences are listed in Supplementary Table 3.
FACS-sorted ENCCs were collected and lysed using protein lysis buffer containing 50mM Tris-HCl, pH7.5, 100mM NaCl, 1% Triton X-100, 0.1mM EDTA, 0.5mM MgCl2, 10% glycerol, protease inhibitor cocktail (Roche) and phosphatase inhibitor cocktail (Roche). After incubation on ice for 15 minutes and the total proteins were collected by centrifuge for 10 minutes at 12000rpm, 4 °C. 20μg of total protein from cell lysates was separated on 12% SDS-polyacrylamide gels and blotted with the corresponding primary antibodies. A list of primary antibodies and working dilutions is provided in Supplementary Table 1. The same membranes were stripped and hybridized with anti-β-actin monoclonal antibody (Millipore, MAB 1501) as a protein-loading control. All blots were incubated with secondary horseradish peroxidase-conjugated anti-mouse or anti-rabbit or anti-goat antibody (1:2500, DAKO).
The Agilent Seahorse XFe96 Analyzer (Seahorse Bioscience, Billerica, MA) was used to measure basal, ATP-linked, and maximal oxygen consumption rate (OCR); reserve capacity; and extracellular acidification rate (ECAR). Cellular bioenergetics was performed using XF Cell Mito Stress Test Kit (Agilent Technologies, 103010-100), XF Glycolysis Stress Test Kit (Agilent Technologies, 103020-100), and XF Long Chain Fatty Acid Oxidation Stress Test Kit (Agilent Technologies, 103672-100). 5 x 104 of FACS-enriched ENCCs (Passage 1 -2) were seeded onto fibronectin-coated Seahorse XF96 V3 PS Culture Microplate (Seahorse Bioscience Inc, Billerica, MA, USA, 101085-004) in N2 medium containing 10ng ml−1 FGF2 and 3 μM CHIR99021. After a 24-hr incubation, N2 medium from each well were removed, and washed once with pre-warmed assay medium (Seahorse XF DMEM Medium, Seahorse Bioscience Inc, 103575-10) supplemented with 25 mM glucose, 2 mM L-glutamine, 1 mM sodium pyruvate, 0.5% N2 supplement, 1% B27 supplement, 1% penicillin-streptomycin, 10ng ml−1 FGF2 and 3 μM CHIR99021, then 180μl of assay medium was added. Cells were incubated in 37°C incubator without CO2 for 1hr to allow to pre-equilibrate with the assay medium. Cells were treated with/without PKM2 inhibitor (compound 3k, 1μM, Selleckchem, D8375), SAG (1μM) or cyclopamine-KAAD (1nM) for 1hr.
For Mito Stress Test, ten-fold concentrated compounds in the kit of oligomycin (Complex V inhibitor), carbonyl cyanide-4 (trifluoromethoxy) phenylhydrazone (FCCP, mitochondrial uncoupler), and a mixture of rotenone (complex I inhibitor) and antimycin A (complex III inhibitor) were loaded into the cartridge to produce final concentrations of 1.2μM, 2μM, and 1μM respectively.
For the Glycolysis stress test, ten-fold concentrations of compounds from the kit containing glucose (fuel for glycolysis), oligomycin and 2-Deoxy-D-Glucose (2-DG, competitive inhibitor of hexokinase) were loaded into the cartridge to produce final concentrations of 10 mM, 1.2 μM and 50 mM, respectively.
For Long Chain Fatty Acid Oxidation Stress Test, , ten-fold concentrations of compounds from the kit containing etomoxir (ETO, fatty acid oxidation inhibitor), oligomycin, FCCP and a mixture of rotenone and antimycin were loaded into the cartridge to produce final concentrations of 5μM 1.2μM, 2μM, and 1μM respectively.
Data were analysed using Agilent Seahorse analytics and normalized with number of cells seeded in each well.
Lactate and Pyruvate Assay
Lactate and pyruvate levels in medium was measured using Lactate Colorimetric Assay Kit (BioVision Incorporated, #K607) and Pyruvate Colorimetric/Fluorometric Assay Kit (BioVision Incorporated, #K609) according to the manufacturer's instructions, respectively. In brief, FACS-enriched ENCCs (Passage 1 -2) were seeded at 5 x 105/well onto a fibronectin-coated 12-well plate and incubate overnight. On the next day, cells were treated with/without PKM2 inhibitor (1μM), SAG (1μM) or cyclopamine-KAAD (1nM) for 1hr. The medium were then collected and the levels of lactate or pyruvate were measured simultaneously. The levels were normalized to DNA concentration. The experiments were performed in triplicate.
ATPlite Luminescence Assay
ATP was detected using the ATPlite Luminescence ATP Detection Assay System (PerkinElmer, #6016943) according to the manufacturer's protocol. In brief, FACS-enriched ENCCs (Passage 1 -2) were seeded at 5 x 103/well onto a fibronectin-coated 96-well plate and incubate overnight. On the next day, cells were treated with PKM2 inhibitor (1μM), SAG (1μM) or cyclopamine-KAAD (1nM) for 1hr. After incubation, 50 μL lysis buffer (from the ATPlite kit) was added and shaken for 5min. Then, 50μL substrate (from the ATPlite kit) was added and incubated for 15min in dark. The ATP content was measured with VICTOR Nivo Multimode Microplate Reader (PerkinElmer).
To measure the mitochondrial activity, FACS-enriched ENCCs (Passage 1 -2) were seeded at 1.5 x 105/well onto a fibronectin-coated 24-well plate and incubate overnight. On the next day, cells were washed once with PBS, and starved in 2mM glucose NC medium for 2hrs. Cells were then treated with GW9508 (10μM), SAG (1μM) or ETO (5μM, 1hr) + SAG (1μM) for 1hr. 250 nM MitoTracker Red (Life Technologies, M7514) were then added, and incubated for 30 min at 37 °C, 5% CO2. After incubation, cells were washed twice with PBS, and then fixed with 4% paraformaldehyde for 20min at room temperature. After fixation, cells were stained with DAPI, and imaged using a 63x objective on Carl Zeiss confocal microscope (LSM 800).
FACS-enriched ENCCs (Passage 1 -2) were seeded at 5 x 103/well onto a fibronectin-coated 96-well plate and incubate overnight. On the next day, cells were treated with PKM2 inhibitor (1μM), SAG (1μM) or cycoplamine-KAAD (1nM) for 1hr. For pyruvate or lactate treatment, cells were washed once with PBS, and starved in 2mM glucose NC medium for 2hrs. Then 2mM pyruvate, 2mM pyruvate + 1μM SAG, 4mM lactate or 4mM lactate + 1μM SAG were added to the cells, and incubated for 3hr or 6hr. 10μl of MTT labelling reagent (final concentration 0.5mg/ml, Sigma, M2128) was added to each well for 3hr. Medium with labelling reagent were then removed, and 200μl DMSO was added to solubilize the purple formazan crystals. The absorbance was measured at 570nm using VICTOR Nivo Multimode Microplate Reader (PerkinElmer).
Plate-based single-cell (sc) RNA sequencing were performed at the Centre of Genomic Science, The University of Hong Kong. For scRNA-seq of human cells, the Smart-seq® v4 (Clontech) kit was used for first-strand synthesis. Single cells were directly sorted into 4 l of lysis buffer in a 384-well plate using a FACSAria III flow cytometer (BD Biosciences). First-strand DNA was synthesized within 16 cycles of amplification according to the manufacturer’s instructions. cDNA was purified on Agencourt AMPureXP magnetic beads, washed twice with fresh 80% ethanol and eluted in 17 l of elution buffer. Then, 1 l of cDNA was checked and quantified on an Agilent Bioanalyzer high-sensitivity DNA chip. Sequencing libraries were produced using Illumina Nextera XT tagmentation according to the manufacturer’s instructions except using 150 pg of input cDNA, 5 min of tagmentation and 12 cycles of amplification using the Illumina XT 24 index primer kit. Libraries were cleaned using an equal volume (50 ml) of Agencourt AMPureXP magnetic beads and re-suspended in 20l of elution buffer. Libraries were checked and quantified on an Agilent Bioanalyzer high-sensitivity DNA chip (size range 150-2000 bp) and by Qubit dsDNA BR (Molecular Probes). Libraries were pooled to a normalized concentration of 1.5 nM and sequenced on an IlluminaTM NextSeq 500 using the 150 bp paired-end kit as per the manufacturer’s instructions.
Preprocessing and quality control of Smart-Seq Single-Cell RNA Sequencing Data
Fastq files with paired-end reads were trimmed by Cutadapt (version 1.7) to remove read adapters or low-quality tail bases and then aligned to the ENSEMBL GRCh38 (release 90, https://www.ensembl.org/) human transcriptome by using Bowtie2 (version 22.214.171.124)35 with options “--sensitive --mp 1,1 --np 1 --score-min L,0,-0.1 -I 1 -X 2000 --no-mixed --no-discordant -N 1 -L 25 -k 200,” which allows 1 mismatch during sequence alignment. Quality control of the reads of each cell was assessed by using FastQC (version 0.11.6). Gene expression level was quantified by using raw read count number and Transcript per million (TPM) values generated by RSEM (version 1.3.0)36. For the quality control of the genes and cells, genes that were detected in less than 5 cells and cells that expressed with either fewer than 3000 genes or more than 9000 genes, or greater than 20% mitochondrial genes were excluded from the expression matrix. Finally, our study included 2,985 high-quality cells for downstream analysis. The distributions of uniquely mapped reads count and detected gene number in the cells are shown in Supplementary Data 1 and Supplementary Fig. 2A-C.
Dimensionality reduction and cell clustering
The R package Seurat (version 3.1.4)23 implemented in R (version 3.6.2) were used to perform dimensional reduction analysis on iPSC-derived ENCCs RNA-seq data. The “NormalizeData()” function from Seurat was used to normalize the raw counts, and the scale factor was set to 200,000, then followed by “FindVariableFeatures()” with default parameters to calculate highly variable genes for each sample. After performing “JackStraw()”, which returned the statistical significance of PCA (Principal component analysis) scores, we selected 15 significant PCs to conduct dimension reduction and cell clustering. Then, cells were projected in 2D space using t-SNE (t-distributed Stochastic Neighbor Embedding) with default parameters. Clustering on individual iPSC-derived ENCC used different resolutions and was finally integrated into 5 main clusters by the expression of canonical markers.
Single cell data integration and batch effect correction
To account for batch effect among different samples, we used “FindIntegrationAnchors()” in the Seurat package to remove batch effect and merge samples to one object. In detail, the top 4,500 genes with the highest expression and dispersion from each sample were used to find the integration anchors, and then the computed anchoret was applied to perform dataset integration.
Identification of differentially expressed genes and pathway enrichment analysis
To identify unique differentially expressed genes (DEGs) among each cluster, the “FindAllMarkers()” function from Seurat was used and non-parametric Wilcoxon rank sum tests were set to evaluate the significance of each individual DEG. The DEG analysis between HSCR and control-iPSC-derived ENCCs based on single-cell expression data was performed using monocle (version 2.14.0) R package37. The DEGs with adjusted P-value less than 0.05 and Log2 Fold Change (log2FC) larger than 0.5 were thought to be significant and used in downstream analysis. Gene ontology (GO) term enrichment analyses were performed using clusterProfiler (version 3.14.3) R package38. Terms that had an adjusted P-value < 0.05 was defined as significantly enriched.
Pathway score analysis
In order to measure the activity of whole pathways between samples at transcriptome level, we applied an simple additive model that ignore the interactions between genes to estimate the overall expression level of the pathways. The genes involved in neurogenesis (GO:0050771), proliferation (GO:0051726), RNA splicing (GO:0033120) and cellular respiration (GO:1901857) in GO database were used to evaluate the overall changes of the key biological processes.
Histone deacetylase activity inference
Key components of Sin3, NuRD and CoREST complexes were obtained from public database. A geometric average model was applied to estimate the overall activity of histone deacetylase complexes based on the average expression of their subunits. Geometric average was used to set the complexes activity to zero if any of the subunit was not expressed in single cells.
Transcriptomic timing analysis
A Bayesian linear regression model was applied to detect “switch-like” upregulation or downregulation of genes along the pseudotime axis. Crucially, the model probabilistically assigns a region along the axis associated with the positive or negative activation of each of each gene in the core gene set.
Gene regulatory network (GRN) analysis
Single cell regulatory network inference and clustering (SCENIC, version 1.1.2)26 was used to infer transcription factor networks active using scRNA-Seq data. Analysis was performed using default and recommended parameters as directed on the SCENIC vignette (https://github.com/aertslab/SCENIC) using the hg19 RcisTarget database. Kernel density line histograms showing differential AUC score distribution across conditions were plotted with ggplot2 v.3.1.1 using the regulon activity matrix (‘3.4_regulonAUC.Rds’, an output of the SCENIC workflow) in which columns represent cells and rows the AUC regulon activity. Fold-change (FC) difference between median AUC values was calculated and the highest changed TFs were plotted.
Differential co-expression analysis
To predict HDAC1 activated and repressed target genes, we applied a three-step strategy to obtain the target genes of HDAC1 during HSCR pathogenesis. In the first step, we identified the significantly correlated genes to HDAC1 which represented the genes that had strong regulatory relationships with HDAC1. In the second step, we utilized the differential expression profile between control and HSCR-ENCCs to filter out the genes whose expression levels were not significantly disrupted by HDAC1. Finally, to confine the genes to be the actual targets of HDAC1, a public ChIP-seq dataset designed for HDAC1 from a similar cell state (neuronal progenitor) in Cistrome database39 was utilized to obtain the direct/indirect binding genes of HDAC1. Activated and repressed target genes were determined by the positive/negative correlation and up/down-regulation of the differential expression, respectively. In addition, TF motif enrichment analysis by FIMO tool40 of MEME suite41 was used to cross-validate the predicted activated targets of HDAC1. Similar method was applied to identify the potential target genes/exons of PTBP1 exon9 based on RNA binding motif analysis. Firstly, AS events that significantly correlated with PTBP1 exon9 were selected. Secondly, exons targeted by PTBP1 through motif recognition were kept. Finally, we focused on the AS events that significantly associated with HSCR disease.
Alternative splicing analysis
SUPPA (version 2.3)42 was used to perform the alternative splicing analysis. Firstly, transcript events and local alternative splicing (AS) events were generated from ENSEMBL GRCh38 genome annotation (gtf) file. Secondly, the transcript and local AS event inclusion levels (PSI, Percent-Spliced-In) were quantified from multiple samples. Lastly, differential splicing for AS events and differential transcript usage between HSCR and control iPSC-derived ENCCs were calculated based on single-cell level PSI value. Clustering of cells based on PSI value was also performed by Seurat package.
Protein-protein interaction (PPI) network analysis
The PPI among core gene set and HSCR associated DEGs were obtained from STRING (v11)25. We included only the interactions based on manual curation or experimental evidence with a combined score >0.4. Cytoscape43 was used to plot the network.
The differences among multiple treatment groups were analyzed with a two-sided unpaired Student’s t test or one-way analysis of variance followed by Tukey post-test using GraphPad Prism 7 (GraphPad Software). A P value less than 0.05 was interpreted to represent a statistically significant difference. All experiments were replicated at least three times, and data are shown as means with SEM. No specific randomization or blinding protocols were used.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Raw sequencing data are available in the Sequence Read Archive (SRA) at the NCBI Center with the accession number PRJNA784249. The processed scRNA-seq data sets are available at https://doi.org/10.5281/zenodo.6104610. All other relevant data supporting the key findings of this study are available within the article and its Supplementary Information files or from the corresponding author upon reasonable request. A reporting summary for this Article is available as a Supplementary Information file. Source data are provided with this paper.