CHIRP-Seq: FoxP2 transcriptional targets in zebra finch brain include numerous speech and language-related genes

Background Vocal learning is a rare, convergent trait that is fundamental to both human speech and birdsong. The Forkhead Box P2 (FoxP2) transcription factor appears necessary for both types of learned signals, as human mutations in FoxP2 result in speech deficits, and disrupting its expression in zebra finches impairs male-specific song learning. In juvenile and adult male finches, striatal FoxP2 mRNA and protein decline acutely within song-dedicated neurons during singing, indicating that its transcriptional targets are also behaviorally regulated. The identities of these targets in songbirds, and whether they differ across sex, development and/or behavioral conditions, are largely unknown. Results Here we used chromatin immunoprecipitation followed by sequencing (ChIP-Seq) to identify genomic sites bound by FoxP2 in male and female, juvenile and adult, and singing and non-singing birds. Our results suggest robust FoxP2 binding concentrated in putative promoter regions of genes. The number of genes likely to be bound by FoxP2 varied across conditions, suggesting specialized roles of the candidate targets related to sex, age, and behavioral state. We validated these binding targets both bioinformatically, with comparisons to previous studies and biochemically, with immunohistochemistry using an antibody for a putative target gene. Gene ontology analyses revealed enrichment for human speech- and language-related functions in males only, consistent with the sexual dimorphism of song learning in this species. Fewer such targets were found in juveniles relative to adults, suggesting an expansion of this regulatory network with maturation. The fewest speech-related targets were found in the singing condition, consistent with the well-documented singing-driven down-regulation of FoxP2 in the songbird striatum. 
Conclusions Overall, these data provide an initial catalog of the regulatory landscape of FoxP2 in an avian vocal learner, offering dozens of target genes for future study and providing insight into the molecular underpinnings of vocal learning.


Background
Vocal learning is a complex phenotype in which organisms learn to accurately imitate sounds and use them in the appropriate social contexts. This trait is a remarkable example of convergent evolution across several distinct avian and mammalian taxa (1) and forms the basis for song and speech learning in songbirds and humans, respectively. Both processes involve developmental critical periods whereby learning is most robust in juveniles, with ongoing maintenance of the learned vocalizations in adulthood (2). A growing body of work suggests that this phenotypic convergence extends to the level of neural architecture (3) and molecular specialization (4,5), with analogous motor circuits and gene expression patterns supporting vocal learning across taxa. However, our understanding of the genetic mechanisms that establish the specialized circuitry and expression profiles that underlie vocal learning remains incomplete.
An important candidate gene for accurate vocal learning is Forkhead Box P2 (abbreviated FOXP2 when referencing the human form of the protein and FoxP2 when referencing it in other taxa (6)). FoxP2 is a member of the Forkhead Box family of transcription factors, which canonically bind to a 6-12 base-pair sequence, or regulatory element, usually in the promoter region of genes, whereby they alter target gene expression (7) through both activation and repression, with the latter being more commonly reported for FoxP2 (8). The role of FOXP2 in vocal learning was first identified in the KE family, a human cohort with a point mutation in exon 14 of the gene, resulting in reduced DNA binding and speech deficits (9,10). A similar phenotype was later demonstrated in a songbird, the zebra finch (Taeniopygia guttata), as FoxP2 knockdown in juveniles impaired their ability to learn the songs of their adult tutors (11).
Additional evidence from songbirds highlights the dynamic nature of FoxP2 expression in the brain. Work from Teramitsu and White (12) first described the behavioral regulation of FoxP2, with decreased mRNA and protein levels observed in Area X, the vocal-dedicated brain region of the songbird striatum, 2 hours after the onset of undirected singing in the morning (a form of vocal practice). This finding has been replicated across multiple studies and species (13)(14)(15)(16)(17)(18). Further, while FoxP2 is highly expressed in the striatum of both sexes of zebra finches, only males engage in vocal learning and only males exhibit singing-linked down-regulation, suggesting a sexually dimorphic distribution of molecular targets within this species. Molecular targets of FOXP2 have been identified in brain tissues from humans (19,20), the only known primate vocal learner (21). However, the FoxP2 regulatory network in an avian vocal learner, and how it fluctuates across sex, development, and behavior, is currently unknown.
Any investigation of transcription factor target sites requires a high-quality reference genome assembly, a resource that has only recently become available for the zebra finch (22). The original zebra finch reference genome was produced using Sanger sequencing, which resulted in an accurate but exceedingly fragmented assembly with large sequence gaps (23). Such gaps often occur in highly repetitive, intergenic regulatory regions, limiting the ability to resolve transcription factor binding sites and subsequent target genes (20). An updated assembly produced by the Vertebrate Genomes Project using both short-read (Illumina, 10X) and long-read (PacBio, Bio, Nanopore) technologies has produced a more accurate, contiguous assembly which greatly improves the mappability of next-generation sequencing data (22). Here, we explore the specific utility of this updated assembly for studying FoxP2 gene regulation.
We used a chromatin-immunoprecipitation and sequencing experiment (ChIP-Seq) to determine the molecular targets of FoxP2 in the zebra finch telencephalon across an informative set of developmental and behavioral conditions, including adult female, adult male, juvenile female, and juvenile male, the latter in both singing and non-singing states. To do so, we developed a bioinformatic pipeline for data analysis using both the original and updated zebra finch reference assemblies. With this pipeline, we quantified a substantial improvement in our ability to identify downstream target genes with the newer assembly. We describe distinct binding profiles in putative promoters across all conditions, with distinct gene associations in each condition. Adult males exhibit robust functional enrichment of a set of genes related to speech and language dysfunction in humans that fluctuates as a function of age, sex and behavioral state. This initial evidence of a speech and language-related regulatory network under the control of FoxP2 in the zebra finch provides dozens of novel gene candidates for further study.

Subjects
All animal use was in accordance with National Institutes of Health and American Veterinary Medical Association guidelines for experiments involving vertebrate animals and approved by the University of California, Los Angeles Chancellor's Institutional Animal Care and Use Committee (ARC-2001-054). For the ChIP-Seq experiments, we created five treatment conditions, with three unrelated zebra finches per condition: i) non-singing adult males (> 120d), ii) non-singing adult females (> 120d), iii) non-singing juvenile females (65d), iv) non-singing juvenile males (65d), and v) singing juvenile males (65d). The comparison between singing and non-singing juvenile males was designed to allow us to examine the behavioral regulation of FoxP2 previously reported by our own and other groups (11)(12)(13)(14)(15), and its effects on the number of transcriptional targets identified.
For the juvenile male cohort, young males were housed with their families from hatching, enabling them to form a template of their tutor's song (24). At the onset of sensorimotor learning (35d), they were individually housed with a female conspecific until 64d, when the female was removed. At 65d, males were selected for either the non-singing (NS) or singing (S) condition using established methods for producing high (NS) or low (S) levels of Area X FoxP2 mRNA and protein (11)(12)(13)(14)(15). The NS condition was enabled by the experimenter sitting near the bird's cage in the morning and, if the bird attempted to sing, gently distracting it from singing for 2h after lights-on. Juvenile males who sang > 10 motifs were not used on that day. Those that sang < 10 motifs in the first 2h were then left undisturbed for an additional hour prior to sacrifice. This protocol previously resulted in gene expression profiles similar to those of birds that do not sing of their own volition (15), and measurements of corticosterone suggest that it does not induce a stress response in NS birds (13). To be included in the S group, birds must have spontaneously sung > 90 motifs during the 2h window. Those that met these criteria were sacrificed 1h later. As noted, these different levels of singing during the initial 2h reliably produce birds with either high (NS) or low (S) FoxP2 levels in Area X. The additional 1h delay after 2h of singing or non-singing was intended to capture differential FoxP2-mediated transcriptional regulation of target genes (15). Juvenile males were sacrificed by rapid decapitation and the telencephalon was extracted and flash-frozen on liquid nitrogen.
For the remaining non-singing conditions, adult males, and adult and juvenile females were selected from our aviaries and housed individually the day before use. All non-singing subjects were monitored for 3h in the morning (to ensure that males sang < 10 motifs; female zebra finches do not typically sing). All birds were sacrificed by rapid decapitation and the telencephalon was extracted and flash-frozen on liquid nitrogen.
In addition to the ChIP-Seq experiments, three adult males were used for immunohistochemical validation of a putative target gene, calcium/calmodulin-dependent serine protein kinase (CASK). As with the juvenile S and NS conditions, we allowed one individually housed adult male to sing undisturbed in the absence of conspecifics for 2h in the morning. Another individually housed adult male was monitored and, when necessary, distracted from singing for 2h. An additional male was housed with a female and the status of his singing over the morning was unmonitored. Three hours after light onset, males were sacrificed via inhalation overdose (isoflurane) and perfused with 4% paraformaldehyde, and the brains were extracted and cryoprotected in sucrose solution.

Chromatin Immunoprecipitation
Chromatin immunoprecipitation (ChIP) was performed using ChIP-IT High Sensitivity (Active Motif, Cat. No. 53040) following the manufacturer's protocol. In brief, brains were minced and crosslinked in a formaldehyde solution. The tissue was disrupted using a hand-held homogenizer for 45s at 35,000 rpm. Homogenates were sonicated at 25% amplitude, 30s on, 30s off, for 30 min. A portion of each sonicate was decrosslinked and quantified by a Nanodrop 1000 (Thermo Scientific, F713). Samples were then split evenly into 2 tubes. A cocktail of 4µg of each of three anti-FoxP2 primary antibodies was applied to one half (Thermo Fisher, Cat. No. 720031; Abcam, Cat. No. ab1307; and Santa Cruz, Cat. No. sc-517261), while the second was used as an input DNA control. After an overnight incubation, samples were washed, decrosslinked, and prepped for sequencing.
Library preparations were quality-assessed on the Agilent 4200 TapeStation system (Cat. No. G2991AA) using D1000 ScreenTape. The ChIP libraries were quantified with the Qubit 1.0 Fluorometer and diluted to 10nM. Libraries from each sample were combined and sequenced across 2 lanes by the UCLA Neurogenomics Core (UNGC; https://www.semel.ucla.edu/ungc) on the Illumina HiSeq 4000 sequencer, generating between 15 and 50 million 65bp single-end reads per library. Reads were demultiplexed by the UNGC.

Sequence Data Analysis
We developed a custom computational pipeline to analyze our ChIP-Seq data. First, quality control for raw sequence reads from all samples was conducted using FastQC (v0.11.9) and all reports were aggregated using MultiQC (v1.11). Reads that passed QC are detailed in Table S1. All test and control samples were high quality (Phred > 30) with no adapter contamination, so no trimming was necessary. All samples were aligned to the most recent high-quality zebra finch reference genome (bTaeGut1.4.pri; RefSeq Accession: GCF_003957565.2) using Bowtie2 (v2.2.5). We then used samtools (v1.3.1) and sambamba (v0.8.1) to filter out all unmapped and multi-mapped reads, retaining only uniquely mapped reads (q > 30) for the downstream analyses. We used MACS2 (v2.2.7.1) to call peaks in all samples independently, then generated a high confidence peak set containing overlapping peaks at the same genomic loci +/- 5 bp on either side in 2 or more replicates. We used ChIPseeker (v1.30.3) to annotate these high confidence peaks to genes in each condition. The final peak x gene association table was used as input for gene ontology analysis using gprofileR (v0.2.1, hsapiens background) and network analysis using StringDB (v10.0, confidence coefficient = 0.4). Putative promoters were defined as the regions 1000bp before and after the transcription start site.
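The replicate-consensus and promoter definitions above can be illustrated with a small sketch. This is not the pipeline used in the study: peaks are represented as hypothetical (chromosome, start, end) tuples, the helper names `overlaps`, `consensus_peaks`, and `in_promoter` are introduced here for illustration only, and the 5 bp tolerance is interpreted (as an assumption) as a permitted gap between peaks from different replicates.

```python
from typing import List, Tuple

Peak = Tuple[str, int, int]  # hypothetical (chromosome, start, end) record


def overlaps(a: Peak, b: Peak, slack: int = 5) -> bool:
    """Same chromosome, and the intervals overlap once a gap of up to `slack` bp is tolerated."""
    return a[0] == b[0] and a[1] - slack <= b[2] and b[1] <= a[2] + slack


def consensus_peaks(replicates: List[List[Peak]], slack: int = 5,
                    min_reps: int = 2) -> List[Peak]:
    """Keep one representative peak per locus supported by at least `min_reps` replicates."""
    kept: List[Peak] = []
    for peaks in replicates:
        for peak in peaks:
            # Count replicates (including the peak's own) holding an overlapping peak.
            support = sum(any(overlaps(peak, other, slack) for other in rep)
                          for rep in replicates)
            already_kept = any(overlaps(peak, k, slack) for k in kept)
            if support >= min_reps and not already_kept:
                kept.append(peak)
    return kept


def in_promoter(peak: Peak, tss_chrom: str, tss_pos: int, window: int = 1000) -> bool:
    """True if the peak touches the region `window` bp before or after the TSS."""
    return overlaps(peak, (tss_chrom, tss_pos - window, tss_pos + window), slack=0)


# Toy data: three replicates of one condition (coordinates are made up).
replicates = [
    [("chr1", 100, 200), ("chr2", 500, 600)],
    [("chr1", 203, 300)],
    [("chr3", 10, 50)],
]
high_confidence = consensus_peaks(replicates)
```

In this toy example only the chr1 locus is retained, because it is the only one seen in two replicates. A production analysis would more likely rely on interval tools that operate on the peak files directly.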

Genome Assembly Comparison
We compared the number of uniquely mapped, multi-mapped, and unmapped reads from all samples across the two reference assemblies and tested for differences in these distributions using two-sample t-tests implemented in R v4.2.2. The small size of each experimental group (n = 3) precluded formal high-powered statistical tests between conditions, so differences observed between age, sex and singing are assessed primarily descriptively.
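For readers unfamiliar with the test, the two-sample t statistic can be sketched as below. The text does not state whether the pooled-variance or unequal-variance (Welch) form was used; this sketch assumes the Welch form, which is the default of R's `t.test`. The function name `welch_t` and the input values are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return the Welch t statistic and its approximate degrees of freedom."""
    vx = variance(x) / len(x)  # squared standard error of the mean of x
    vy = variance(y) / len(y)  # squared standard error of the mean of y
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Made-up percentages of uniquely mapped reads under two assemblies.
original_assembly = [60.0, 62.0, 61.0]
updated_assembly = [75.0, 77.0, 76.0]
t_stat, dof = welch_t(updated_assembly, original_assembly)
```

Converting the statistic to a p-value requires the t distribution, which a statistics package would normally supply.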

Gene Set Enrichment Analysis
Genes associated with FOXP2 ChIP-Seq peaks were evaluated from previous studies, including Spiteri et al. (2007) and Vernes et al. (2007; 2011) (19,20,25). To assess the degree of overlap between these studies and ours, we conducted a gene set enrichment analysis using a hypergeometric test. The hypergeometric probability for each comparison was calculated using the "phyper" function in R with the parameters q, m, n, and k, where:
q = number of genes with FOXP2 peaks in both our dataset and a previous test study
m = number of genes associated with FOXP2 peaks in the previous test study
n = number of genes not associated with FOXP2 peaks in the previous test study
k = number of genes associated with FOXP2 peaks in our study
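As an illustration of how q, m, n, and k enter the hypergeometric test, the following sketch computes the probability of observing at least q shared genes by chance. Which tail was evaluated is not stated in the text, so the upper tail P(X >= q) is an assumption here, as are the function name `hypergeom_upper_tail` and the example numbers.

```python
from math import comb


def hypergeom_upper_tail(q: int, m: int, n: int, k: int) -> float:
    """P(X >= q) when k genes are drawn from m 'target' and n 'non-target' genes."""
    total = comb(m + n, k)   # all ways to draw k genes
    upper = min(m, k)        # the overlap cannot exceed either list
    favorable = sum(comb(m, i) * comb(n, k - i) for i in range(q, upper + 1))
    return favorable / total


# Hypothetical comparison: 300 prior targets among 15,000 genes,
# 800 genes called in the present data, 20 shared between the lists.
p_value = hypergeom_upper_tail(q=20, m=300, n=14700, k=800)
```

Under the same assumption, the corresponding R call would be `phyper(q - 1, m, n, k, lower.tail = FALSE)`.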

Immunohistochemistry
The brains of the adult non-singing male (NS), singing male (S) and unmonitored male were cryo-sectioned in the coronal plane at 15µm and thaw-mounted onto slides (Superfrost, Fisher Scientific), then stored at -80 °C until use. Immunohistochemical experiments were conducted simultaneously for dual antigens as previously described (13) using primary antibodies against FoxP2 (1:500 dilution of Santa Cruz Biotechnology SC-517261) and CASK (1:200 dilution of Invitrogen PA5-96141). Per the manufacturer's product specification, the immunogen for the anti-CASK antibody comprises amino acids 1-300 of human CASK (NP_0011195261), a region that is entirely conserved in zebra finch CASK (XP_030138806.1). Signals were visualized using fluorescence-tagged secondary antibodies (1:1,000 dilutions of Alexa Fluor 488nm A31620 to detect FoxP2 and Alexa Fluor 596 A11035 to detect CASK). Coverslips were mounted using ProLong Gold Antifade Mountant with DAPI (360nm, Molecular Probes, Eugene OR). Images were captured using an AxioImager fluorescent microscope (Carl Zeiss, Thornwood NY), Basler camera and Pylon viewer software (Basler Inc., Exton, PA). Fiji (ImageJ) was used to colorize the images via the LUT editor. Background noise was reduced using the contrast editor, and a 600 x 600 pixel image was cropped from the original 1920 x 1200 pixels in a location that most clearly showed cell bodies that express CASK (due to the high background level of this protein in tissues). An air bubble stain in the Nissl image in panel A was retouched to improve clarity.

Comparison of Reference Assemblies
We successfully generated ~18 million sequence reads across the three replicates of each of five conditions (n = 15) that underwent ChIP-seq using the cocktail of anti-FoxP2 antibodies, and 250 million sequence reads in their respective input DNA controls (Table S1). Reads from the ChIP conditions are enriched for sequences bound by FoxP2, while the input control samples contain sequences with general open chromatin. To determine which reference genome used these data most effectively, we analyzed our data using both the "original" (RefSeq: GCF_000151805.1) and "updated" (RefSeq: GCF_003957565.2) zebra finch reference assemblies (Fig. 1). We found that the updated assembly significantly increased unique mapping of sequence reads by ~15% (t = 21.414, n = 25, p < 2e-16), and significantly decreased both multiple (t = 26.539, n = 25, p < 2e-16) and unmapped (t = 6.4358, n = 25, p = 5.9e-08) reads (Fig. 1A). Surprisingly, we did not find a significant difference between the assemblies in the total number of peaks called from aligned reads (Fig. 1B). However, we did find substantial improvement with the updated assembly when annotating these peaks to genes. For example, in the adult male non-singing condition, using the original zebra finch reference assembly, we called 1,112 high-confidence peaks (i.e., overlapping peaks in 2 or more individuals) and assigned them to 500 unique genes (Fig. 1C). In contrast, using the updated zebra finch reference assembly, we called 1,267 high-confidence peaks and assigned them to 812 unique genes. These increases are not simply due to the annotation of more genes in incomplete regions, as the concordance rate between these gene sets was only 23% (Fig. 1D). These data suggest that the updated zebra finch reference assembly allowed us to more confidently call peaks and assign them to genes. We thus use this updated assembly (bTaeGut1.4.pri) for the remainder of the study.

Localization of Putative FoxP2 Binding Sites
Given FoxP2's role as a transcription factor, if we successfully isolated and sequenced regions of DNA bound by FoxP2, we would expect an enrichment of peaks within the putative promoter regions of genes when compared to other gene regions, regardless of condition. Indeed, the percentage of peaks that were located within putative gene promoters varied from 63-88% across the different conditions, suggesting successful pulldown of FoxP2-bound regions of DNA (Fig. 2A). Although FoxP2 binds primarily in the putative promoter in all conditions, we hypothesized the peak locations and subsequent target genes would vary between conditions given their differences in song learning behavior (e.g., female zebra finches do not learn, young males are engaged in sensorimotor learning or quiescent, and adult males have learned their songs). To test this, we conducted a comparative assessment of all high confidence peaks and their genes (Table S2). In general, FoxP2 binding peak frequency increased with age and decreased with singing behavior (Fig. 2B). Both adult male and female zebra finches had markedly more unique peaks associated with unique genes than did juveniles of the same sex, suggesting that more genes are regulated by FoxP2 as the zebra finch matures, although the small sample size (three birds per condition) precluded statistical testing.

Bioinformatic Identification and Validation of Identified FoxP2 Gene Targets
We bioinformatically identified 812 high confidence FoxP2 gene targets within the adult male zebra finch telencephalon. To assess whether these putative targets may be associated with FoxP2, we took multiple approaches, namely comparison of our list with: i) lists of putative FoxP2 targets previously generated in studies on mammalian nervous tissue (19,20,25), ii) genes previously shown to be differentially expressed in song-dedicated Area X relative to outlying striatum in adult male zebra finches (5), and iii) genes previously shown to be acutely regulated by singing in Area X (15,26). A substantial number of putative targets identified here were cross-validated by one of the above bioinformatic approaches (see below, Tables 1, 2). Only one, CASK, was cross-validated in two separate approaches: this gene was previously identified as a putative FoxP2 gene target in developing mouse brain (25) and as differentially expressed in Area X of adult male zebra finches (5). For this reason, we selected CASK for an immunohistochemical validation designed to determine whether FoxP2 protein colocalizes with the protein of this putative target within medium spiny neurons of adult male zebra finch Area X (see next sub-section, below).
A gene set enrichment analysis against three FOXP2 ChIP-seq datasets from mammalian nervous tissues (19,20,25) revealed 46 genes that exhibited FOXP2 binding in putative promoters in either human or mouse and at least one zebra finch condition (Table 1). While none of these relationships reached the level of significance, this is not surprising, as similar tests that compared putative targets across the prior mammalian studies found no significant overlap, with only two genes, CCK and NRN1, being shared among them (19,20,25).
Given the importance of FoxP2 expression in Area X of male zebra finches, as a second approach to validation, we examined whether any peaks were located in the putative promoters of genes known to exhibit differential expression in the zebra finch Area X relative to the adjacent non-vocal ventral striatum (5). We found 21 genes exhibiting either up or down regulation in adult male Area X with at least one FoxP2 binding site in the putative promoter region (Table 2). Of these 21 genes, only one, encoding a protein with unknown function (KIAA0232), was associated with peaks in the juvenile condition, and only in the non-singer condition (Table 2). This result suggests that FoxP2 serves distinct regulatory roles in zebra finches across sex, development, and behavior. Of these 21 genes, only one, RASEF (also known as RAB45), is located on the Z chromosome, indicating that chromosomal dosage does not account for most of the regional differences in gene expression. Another, CASK, was previously shown to be a putative Foxp2 target in mice (Tables 1, 2; (25)). An additional 12 of these genes are isoforms or family members of genes previously identified as putative Foxp2 targets in mammalian studies (19,20,25). Moreover, mutations in 10 of these genes are either direct causes of or implicated in nervous system dysfunction including speech and general motor delay (Table 2; see Discussion).
If the genes identified as having peaks are indeed transcriptional targets of FoxP2, then their expression levels are predicted to change as a function of FoxP2 levels. As a third approach to validating these genes as FoxP2 targets, we compared them with a list of transcripts we previously found to be differentially expressed in Area X between singing and non-singing males (15,26). In 2012, we employed cDNA microarray technology and weighted gene co-expression network analysis (WGCNA; (27)) to assess coordinated changes in gene expression in 26 adults who sang different amounts of song on a given morning. In that study, all 12 of the 60-mer probes for FoxP2 on the microarray indicated decreased FoxP2 expression with greater amounts of singing. Following WGCNA, genes whose expression levels were significantly correlated with singing were grouped into a so-called 'song-related module'. Any of these previously identified genes are candidates for FoxP2 transcriptional regulation since both FoxP2 and these genes change expression in concert within Area X during an acute 2h bout of singing.
Overlaps between genes in the adult song-related module (15) and the present study include NTRK2, HOMER1, IRS2, DUSP6 and UBXN2A. NTRK2 encodes the neurotrophin receptor tyrosine kinase 2, which shows one peak of potential FoxP2 binding in the juvenile male singer condition. Previously, Vernes and colleagues (25) identified NTRK2 as a putative target of Foxp2 in the developing mouse brain, providing partial validation for the bioinformatic approach used here.
In addition to adult males, we previously used bulk RNA sequencing to assess singing-driven changes in Area X gene expression of juvenile males (26) at the same age studied here. Among the 5 overlaps mentioned above, HOMER1, which exhibits one peak in the juvenile non-singer condition, was highly correlated with the amount of singing in juveniles in the prior work. Similarly, IRS2 exhibits seven peaks in adult males, two peaks in adult females, and three peaks in the juvenile male non-singing condition and was significantly regulated by singing in juveniles. DUSP6 shows one peak in adult males and was a member of the juvenile song-related module. Moreover, this gene, which encodes dual specificity phosphatase 6, was also shown by Vernes and colleagues to be a putative Foxp2 target (25), providing additional validation of our pipeline.
Finally, UBXN2A exhibits one peak in the adult female condition and was significantly regulated by singing in juvenile zebra finch males (26).

Validation of an Identified FoxP2 Target Using Immunohistochemistry
An immunohistochemical experiment to detect the protein for FoxP2 and that of one of its putative gene targets, CASK, revealed co-localized expression within Area X neurons of an adult male housed with a female (Fig. 3), providing further support that our in silico results have biological relevance. In line with prior work showing singing-driven down-regulation of FoxP2 in Area X, only CASK, and not FoxP2, was detected in Area X neurons of a male who sang by himself for 2h in the morning. Conversely, only FoxP2, and not CASK, was detected in Area X neurons of a male who did not sing. These qualitative findings of inverse expression levels suggest that FoxP2 represses CASK within Area X. Outside of Area X, robust FoxP2 and CASK signals were co-detected in cerebellar Purkinje neurons (Fig. 3).

Functional Processes of Putative FoxP2 Targets
We next wanted to better understand the overall functional processes of putative FoxP2 target genes identified in each condition. We conducted a gene set enrichment analysis, using the unique set of targets per condition as input (Fig. 2B, Table S3). Each condition had hundreds of GO terms (range 775 to 1733) involved in a wide range of processes including cell signaling, neurogenesis, and axon guidance. Interestingly, we found that FoxP2 targets were enriched for genes related to human speech and language in males only, consistent with the sexually dimorphic vocal learning in this species (Fig. 4). These functions include poor or absent speech, speech and language impairment, and delayed speech onset, all consistent with the phenotypes found in the KE family following FOXP2 mutation (9).
The non-singing adult male zebra finch exhibited the most putative FoxP2 target genes known to be involved in human speech, with many fewer found in the non-singing juvenile male (Fig. 4), suggesting the baseline vocal learning regulatory network of FoxP2 expands with development and song learning. Conversely, the singing juvenile male condition exhibited even fewer target genes related to human speech, consistent with less FoxP2 protein being available to bind to its targets in this behavioral condition (13,17). Overall, these data suggest that the FoxP2 transcription factor targets genes involved in human speech and language and that the degree of this regulation depends on sex, age, and singing.
Given the robust enrichment for speech and language-related functions in the FOXP2 candidate target genes in adult male zebra finches, we investigated whether these genes may interact in a functional network. Using StringDB, we defined a protein-protein network from all FOXP2 putative target genes involved in speech and language in the adult male zebra finch telencephalon (Fig. 5). Of the 61 zebra finch genes, 57 were recognized by the network algorithm, and 48 of these genes (84%) formed a functional network with each other. Most notable was UBB, a highly conserved gene coding for ubiquitin, which is involved in a number of cellular processes such as protein trafficking and degradation (28). In the adult male, the UBB promoter contains the most FOXP2 peaks out of any condition, suggesting that strong regulation of free-floating ubiquitin levels is important for vocal maintenance in the adult male zebra finch.

Discussion
In this study, we provide the first detailed description of the putative molecular targets of FoxP2 in the vocal learning zebra finch across sex, development, and behavioral conditions. This analysis was enabled by a new high-quality reference genome assembly, highlighting the importance of robust computational resources for accurate biological conclusions. Improvements to the genomic annotation, including adding previously missed genes, cleaning up spurious exons, and defining upstream promoters (23), have all significantly aided the identification and association of FOXP2 binding peaks with target genes. Using the new assembly, we found evidence for FoxP2 regulation of ~60 genes in adult male zebra finch telencephalon that are involved in human speech and language function, with fewer such genes found in non-singing juveniles and even fewer in juvenile singers, indicating both developmental and behavioral changes in regulation. This dataset highlights a functional network composed of dozens of candidate genes that are targets for further study for their role in vocal learning function in zebra finch and across vocal learning taxa.
The finding of 46 shared putative target genes between our present study on zebra finches and at least one of three prior studies on mammalian nervous tissue, while not statistically significant, is remarkable given that the overlap in target genes between those mammalian studies was limited to two genes, CCK and NRN1 (Table 1; (19,20,25)). The lack of commonality among mammalian studies likely reflects the different tissue sources: Spiteri and colleagues (2007) examined human fetal basal ganglia and inferior frontal cortex (19), whereas Vernes and colleagues examined human SH-SY5Y cell lines in their 2007 study (20) and reported on whole embryonic mouse brain in 2011 (25). Nonetheless, those prior studies highlighted important consistencies in biological themes, notably neurite outgrowth and synaptic plasticity. All three mammalian studies used microarrays to identify putative targets whereas the present study used DNA sequencing, a methodological difference that reflects ongoing technological advances. An even greater number of putative targets identified here are related to isoforms previously identified in prior mammalian work (see below).
Among the 21 putative targets we identified that are also differentially expressed in Area X (Table 2; (5)), the RASEF gene (formerly known as RAB45) is the only one located on the Z chromosome. RASEF is a member of the Rab family of GTPases involved in membrane trafficking. In mammals, many RAB isoforms are putative FoxP2 transcriptional targets in brain tissues (19,20,25). RASEF is part of a novel locus associated with attention deficits identified in a meta-analysis of age-related cognitive decline in 3,045 individuals aged ≥ 65 (29). Other genes identified in our analysis that are linked with human syndromic brain phenotypes include: TUBB3 (pruning of misguided axons during development (30)); FUS (fronto-temporal lobe dementia (31)); P4HTM (HIDEA syndrome (32)); and MTMR10 (human 15q13.3 microdeletion syndrome (33)).
In mammals, MTMR2 has been identified as a putative FoxP2 target (25) that is associated with neurite outgrowth, providing further validation of the bioinformatic pipeline used here.
Only one gene, CASK, was identified as a putative gene target in zebra finches (the present study) and in mice (Vernes et al., 2011), and also exhibited differential expression in zebra finch Area X (Tables 1, 2; (5)). We found that signals for CASK protein colocalized with those for FoxP2 within single neurons in zebra finch Area X, providing support for their biological interaction. The CASK gene lies on zebra finch chromosome 1 and encodes a calcium/calmodulin-dependent serine protein kinase anchored to the neuronal membrane at synapses. There, its CaM-kinase domain phosphorylates itself as well as the presynaptic protein neurexin-1 (34,35). CASK translocates to the nucleus and interacts with transcription factors to regulate gene expression (36), including that of NECDIN, RLN and the NMDA receptor subunit 2b (37).
In humans, the CASK gene is X-linked and its mutation leads to FG syndrome 4, a form of X-linked mental retardation (36). Recently, a de novo variant of CASK was found to cause a neurodevelopmental disorder in a 9-year-old boy with severe psychomotor delay (38). CASK is part of a signaling pathway that includes the widely validated autism susceptibility gene CNTNAP2 and the Prader-Willi syndrome gene NECDIN (36).
Zhang and colleagues (2023; (36)) showed that CNTNAP2 undergoes proteolytic cleavage and its intracellular domain promotes the nuclear translocation of CASK to affect NECDIN expression. Remarkably, viral-driven expression of NECDIN in the Cntnap2 −/− mouse model of autism normalized the social deficits of these mice. The authors conclude that the CNTNAP2-CASK-NECDIN signaling pathway plays a critical function in ASD (36).
Our analysis did not identify two genes, VLDLR1 and CNTNAP2, that were previously validated as direct transcriptional targets of FoxP2 in humans and zebra finches (39,40,41). Interestingly, we and others previously identified Vldlr1 mRNA as being regulated by singing in zebra finch Area X (15,42), and as part of a song-related gene module. Our prior work used tissue punches of Area X, whereas, for technical reasons (see below), the present study used the entire telencephalon. Similarly, Adam and colleagues (41) specifically targeted Area X with lentiviral injections to knock down FoxP2 levels, leading to altered Cntnap2 expression.
Here, the inclusion of pallial and striatal tissues outside of Area X likely diminished our ability to detect these associations.
The possibility of a specialized role of FoxP2 in female zebra finches is intriguing given its prominent role in vocal learning in males (11,43). We found evidence for female-specific FoxP2 binding in genes associated with ribosomal biogenesis, suggesting differences in protein synthesis between the sexes. Humans (44), mice (45), and yeast (46) exhibit a wide variety of specializations in ribosomal genes across tissues, and Drosophila exhibit a sex-specific pattern of ribosomal gene expression in their testes and ovaries (47). Given the sexual dimorphisms in neural circuitry governing vocal learning in the zebra finch, these female-specific binding events could represent FoxP2 repression of genes that facilitate the synthesis of vocal learning-related proteins in males. Should this be the case, we would not expect to see these sex-specific patterns in songbird species in which females also learn and produce song, or in parrots where call learning occurs in both sexes (48-50).
Many of the putative FoxP2 targets we identified in adult males are genes involved in the cellular ubiquitination pathway and are critical nodes in the speech/language regulatory network, including UBB, USP9X, and CBL. Ubiquitination also influences PTEN function, another gene in this network with mutations associated with communication deficits in autism spectrum disorders (51). However, the directionality of regulation of these target genes is currently unknown. FOXP2 is canonically thought to serve a repressive role in gene regulation, and strong repression of free-floating ubiquitin (UBB), as well as ubiquitin ligases (CBL) and proteases (USP9X), could serve to maintain the current ubiquitin profile in the brain. One way to test this idea would be to repeat these experiments using a singing adult male condition, with the hypothesis that the peaks indicating FoxP2 binding around these genes would disappear, leading to disinhibition and providing flexibility to the ubiquitination state of the brain.
One limitation of this study is that data were obtained from whole telencephalic lysates, rather than solely from song control regions. The reason for this was technical, as limitations in cell number at the onset of these experiments precluded the use of such a small brain region from individual birds, while potential interindividual variability made pooling individuals to increase cell number undesirable. However, it is likely that Area X provides the primary source of behavioral regulation in our signal, as previous studies have not observed variation in FoxP2 levels in other telencephalic regions as a function of singing (12,14). In addition, using RNA-Seq data from an adult non-singing zebra finch Area X and surrounding striatum (5), we found that several of the FoxP2 target genes from the matching condition in this study exhibit differential expression (Tables 2 and S4). Additional experiments using RNA-seq and ATAC-seq (52) to profile transcriptomic activity in Area X of juvenile males before and after singing, as well as developing female zebra finches, are necessary to determine the extent of FoxP2 regulation in this region.
Overall, this work advances our understanding of the molecular mechanisms underlying the rare trait of vocal learning. Since a role for FOXP2 in human speech and language was first established (9), molecular pathways governed by FOXP2 in human tissue have been identified (19,20,25,53,54), leading to the hypothesis of similar patterns of regulation in other vocal learning species such as songbirds. The present work provides support for the hypothesis of convergence of FOXP2 transcriptional networks across vocal learning songbirds and humans, and potentially other lineages that exhibit vocal learning. Such similarity would suggest shared constraints on the evolution of this complex trait and provide insights into rescuing deficits in these molecular pathways in the future.

Figures
Figure 1 Updated zebra finch reference genome assembly improves sequence alignment and gene assignments. A) Average sequence alignment statistics across all samples using the original (blue) and updated (orange) genomic assemblies. All comparisons were significant at p < 5e-8 or greater. B) Total number of called FOXP2 peaks by condition using the original and updated genome assemblies. C) Total number of high-confidence peaks and genes associated with peaks for the adult male condition, comparing the original and updated assemblies. D) Total percentage of genes associated with peaks that were found using one or both assemblies.
Figure 2 FOXP2 promoter binding and gene regulation vary across conditions. A) Feature plot of called peaks for all conditions. Total peaks for each condition are displayed as a proportion of each annotated feature. B) Total number of unique peaks (blue) and associated unique genes (green) for each condition in the experiment.
Abbreviations: NS = Non-singer, S = Singer
Figure 4 FOXP2 regulates genes involved in human speech and language in adult male zebra finches. Dot plot highlighting speech/language related GO terms from gprofileR. Color of each dot denotes significance after multiple test corrections (Enrichment P-value; FDR < 0.05) and the size of each dot denotes the number of genes found in the given term and condition. Adult and juvenile female birds are not displayed as they had no enrichment for the plotted terms. A full list of significant GO terms for each condition can be found in Table S2.
Figure 5 Adult male zebra finch regulatory network of FOXP2 molecular targets, many of which are involved in human speech/language dysfunction. Protein-protein interaction network for all genes associated with human speech/language dysfunction in the adult male zebra finch (Fig. 4). Network constructed using StringDB (v10.0). Lines between nodes (genes) denote confidence of interaction with all connections >40% confidence. All unplaced nodes are clustered on the left of the network. Orange boxes highlight important genes in the ubiquitination pathway.

Supplementary Files
This is a list of supplementary files associated with this preprint. TableS1ProjectMetadata.csv

Table 2
Putative Songbird FoxP2 Targets with Specialized Expression in Area X¹