Unsupervised classification of brain-wide axons reveals neuronal projection blueprint

Long-range axonal projections are quintessential determinants of network connectivity, linking cellular organization and circuit architecture. Here we introduce a quantitative strategy to identify, from a given source region, all “projection neuron types” with statistically different patterns of anatomical targeting. We first validate the proposed technique with well-characterized data from layer 6 of the mouse primary motor cortex. The results yield two clusters, consistent with previously discovered cortico-thalamic and intra-telencephalic neuron classes. We next analyze neurons from the presubiculum, a less-explored region. Extending sparse knowledge from earlier retrograde tracing studies, we identify five classes of presubicular projecting neurons, revealing unique patterns of divergence, convergence, and specificity. We thus report several findings: (1) individual classes target multiple subregions along defined functions, such as spatial representation vs. sensory integration and visual vs. auditory input; (2) all hypothalamic regions are exclusively targeted by the same class also invading midbrain, a sharp subset of thalamic nuclei, and agranular retrosplenial cortex; (3) Cornu Ammonis, in contrast, receives input from the same presubicular axons projecting to granular retrosplenial cortex, also the purview of a single class; (4) path distances from the presubiculum to the same targets differ significantly between classes, as do the path distances to distinct targets within most classes, suggesting fine temporal coordination in activating distant areas; (5) the identified classes have highly non-uniform abundances, with substantially more neurons projecting to midbrain and hypothalamus than to medial and lateral entorhinal cortex; (6) lastly, presubicular soma locations are segregated among classes, indicating topographic organization of projections. 
This study thus demonstrates that classifying neurons based on statistically distinct axonal projection patterns sheds light on the functional organization of their circuit.


Formal Definition and Quantitative Solution of the Classification Problem.

The axonal projections of each neuron in a source region can be represented as a k-dimensional vector, where k is the number of target regions invaded by the source region. Each of the k components of the vector quantifies the number of axonal points within the corresponding region (Figure 1). We explore the null hypothesis, H0, that all neurons from a source region belong to a single projection class (Figure 2A), as opposed to the alternative hypothesis, HA, that distinct projection classes exist from that source region (Figure 2B). If two hypothetical classes exist, the projections will be more similar between neurons within a class and more different across classes (Figure 2C). In such a two-class scenario, the combined within- and across-class distances would thus form a wider distribution than the distribution generated if all neurons belong to just a single class (Figure 2D). To formally test HA, we measure all pairwise differences between neurons (as arccosine vector distances, see Methods & Materials). We then generate the distribution of distances for H0 by randomizing the projection patterns while preserving total axonal length both by neuron and by target region. We achieve this single-class "continuum" by iterative stochastic swapping of axonal points between neurons across two target regions (see Figure 2E and Methods & Materials). We can then apply Levene's one-tailed statistical test to ascertain whether the original distribution of pairwise distances has significantly larger variance than the randomized distribution. If it does, we reject H0 in favor of HA. Starting from the top node in an unsupervised hierarchical clustering tree, we can thus repeat Levene's test on the neurons of each of the two subtrees, continuing the process until none of the variance differences are statistically significant (Figure 2F).
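The pairwise arccosine distance named above can be sketched as follows. This is a minimal illustration, assuming each neuron is a row of non-negative axonal point counts; the function names and the toy matrix are hypothetical and are not taken from the study's code or data.

```python
import math
from itertools import combinations


def arccosine_distance(u, v):
    """Angle in radians between two projection vectors (0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("projection vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def pairwise_distances(matrix):
    """All neuron-to-neuron distances for a neurons x regions matrix."""
    return [
        arccosine_distance(matrix[i], matrix[j])
        for i, j in combinations(range(len(matrix)), 2)
    ]


# Hypothetical toy data: 4 neurons x 3 target regions (axonal point counts).
toy_counts = [
    [10, 0, 0],
    [20, 0, 0],
    [0, 5, 5],
    [0, 8, 8],
]
toy_distances = pairwise_distances(toy_counts)
```

Because the distance depends only on the angle between vectors, two neurons with the same relative targeting but different total axonal length (such as the first two rows of the toy matrix) are at distance zero.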

Validation of the Approach.

[...] We tested whether the path distances from presubicular neurons of a given projection class differed across their divergent target regions (Figure 6). In these analyses of divergence, ipsilateral and contralateral targets were considered separately, as the latter are systematically farther than the [...] projections from B27 have longer distances than those from A38 (Figure 7A). For the ipsilateral parasubiculum, path distances from D19 are longer than those from B27 (Figure 7B). Finally, for both the contralateral subiculum and parasubiculum, path distances from B27 are longer than those from A38 (Figure 7B-C).

This study introduced an original method to objectively identify projection-based neuronal classes by pairing Levene's test with unsupervised hierarchical clustering. We first conducted a confirmatory study on layer 6 of the primary motor cortex to verify that the proposed technique could reproduce known projection types in a previously explored area of the mammalian brain. The results yielded two clusters with axonal projections consistent with those of the corticothalamic and intratelencephalic neuron classes found in past studies, thereby confirming the validity of the technique [23].

To test whether the technique could lead to novel insights, we then applied it to the presubiculum, a region with crucial cognitive function [24], yet few studies on its circuitry [25]. The results yielded five clusters, indicating distinct neuron classes, which led us to reject the null hypothesis that projection neurons exhibit random variation within the constraints of regional connectivity from the presubiculum. In an earlier study [26], retrograde tracing identified five classes of neurons projecting from the presubiculum, which target the retrosplenial cortex, contralateral subiculum, medial entorhinal cortex, anterior thalamic nucleus, and lateral mammillary nucleus. Our results confirm the existence of these five classes and add new information that reveals patterns of divergence (e.g., class A38 projects to the retrosplenial cortex, dentate gyrus, subiculum, and entorhinal cortex), convergence (e.g., the subiculum receives projections from classes A38, contralateral C3, and D19), and specificity (e.g., class E6 projects exclusively to the medial geniculate nucleus, and all hypothalamic regions receive projections solely from class D19).

The proposed clustering technique correctly distinguishes cortical (classes A38, B27, and C3) from subcortical (D19 and E6) pathways in the second binary split in the hierarchical classification. [...]

From a comparison of divergent path distances from one presubicular class to its major targets, along with a comparison of convergent path distances from each presubicular class to collectively major targets, we found that path distances to the same targets were significantly different between classes, as were the path distances to distinct targets within most classes. This might imply that electrical impulses reach different targets with varying delays, both within the same class and between classes.

Topographic analysis of presubicular classes revealed spatial separation between the somata of each class. This suggests the possibility of anatomically mapping the input and output of the circuitry specializing in head direction computations [30]. Our reported topography of presubicular projection classes is consistent with the recently observed local modularity of the head-direction microcircuit [31], and may help clarify the relationship between the egocentric and allocentric spatial and episodic representations of the cortico-hippocampal system [32].

As with many secondary data analyses, we have limited knowledge of, and control over, artifactual [...] Our results so far, in the cases of the mouse primary motor cortex and presubiculum, indicate that the executed analysis is robust to these possible confounding variables [22].

Overall, this study revealed that neurons can be divided into distinct classes based on axonal projection patterns, as demonstrated in layer 6 of the primary motor cortex and the presubiculum. The same analysis can be applied to neurons projecting from any other mouse brain region with sufficient data. There are currently approximately 40 regions fitting this criterion in the existing datasets, but this number is expected to grow in the near future.

Hypothesis Design.

To determine whether distinct projection classes of neurons exist from a particular parcel of the brain (hypothesis HA), we tested the pairwise differences between neurons from the experimental matrices described above. If only a single class of neurons exists, then only a single distribution of differences between neurons will be generated (Figure 2A). If two hypothetical classes exist, then the differences between neurons, evaluated two at a time, will be smaller within a given class than across the two classes (Figure 2B-C). In a multi-class scenario, a histogram of the differences between neurons should be wider than the distribution generated when all the neurons belong to just a single class (Figure 2D). To generate the distribution of differences for the null hypothesis, H0, a randomized control matrix was generated from the original experimental matrix through multiple iterations of the stochastic pairwise swapping of axonal counts from two neurons across two target regions (Figure 2E). This method randomized the projection patterns, yielding a "continuum" consistent with the regional connectivity of Figure 2A, while preserving axonal sizes (row sums) and regional targeting (column sums) of the original experimental matrix.
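One way to picture a swap that preserves both row and column sums is sketched below. The exact swap rule is an assumption made for illustration: at each iteration a random number of axonal points is moved from region a to region b in one neuron and, symmetrically, from region b to region a in another neuron. The function name, the number of iterations, and the toy matrix are hypothetical.

```python
import random


def randomize_preserving_margins(matrix, n_swaps, seed=None):
    """Return a shuffled copy of an integer count matrix (neurons x regions).

    Each swap shifts d points from region a to b in neuron i and from
    region b to a in neuron j, so every row sum and column sum is unchanged.
    Requires at least two neurons and two regions.
    """
    rng = random.Random(seed)
    m = [list(row) for row in matrix]  # work on a copy
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        i, j = rng.sample(range(n_rows), 2)  # two distinct neurons
        a, b = rng.sample(range(n_cols), 2)  # two distinct target regions
        max_shift = min(m[i][a], m[j][b])    # keeps all counts non-negative
        if max_shift == 0:
            continue  # nothing to exchange for this draw
        d = rng.randint(1, max_shift)
        m[i][a] -= d
        m[i][b] += d
        m[j][a] += d
        m[j][b] -= d
    return m


# Hypothetical toy data: 4 neurons x 3 target regions.
toy_counts = [
    [10, 0, 0],
    [20, 0, 0],
    [0, 5, 5],
    [0, 8, 8],
]
control = randomize_preserving_margins(toy_counts, n_swaps=1000, seed=0)
```

After enough iterations the two blocks of the toy matrix are blended, while each neuron keeps its total count and each region keeps its total input.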

We assessed the hypothesis that the variance of experimental data was significantly larger than the [...] consistent with the scenario presented in Figure 2B.
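A one-tailed variance comparison in the spirit of Levene's test can be sketched with the standard library alone. For two groups, Levene's W statistic equals the square of the pooled two-sample t statistic computed on absolute deviations from each group's center, so the direction of the difference can be read from the sign of t. Two assumptions are made here and are not taken from the study: deviations are measured from the group mean (the text does not say whether the mean or the median was used), and the one-tailed p-value uses a normal approximation to the t distribution, which is only reasonable for large samples. The function names and sample values are hypothetical.

```python
import math
from statistics import NormalDist, fmean


def abs_deviations(sample):
    """Absolute deviations of each value from the sample mean."""
    center = fmean(sample)
    return [abs(x - center) for x in sample]


def levene_one_tailed(observed, randomized):
    """Test whether `observed` is more dispersed than `randomized`.

    Returns (W, t, p): Levene's W for two groups, the signed t statistic
    on absolute deviations, and an approximate one-tailed p-value.
    """
    z1, z2 = abs_deviations(observed), abs_deviations(randomized)
    n1, n2 = len(z1), len(z2)
    m1, m2 = fmean(z1), fmean(z2)
    ss1 = sum((z - m1) ** 2 for z in z1)
    ss2 = sum((z - m2) ** 2 for z in z2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("deviations have zero variance in both groups")
    t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Assumption: normal approximation to the t distribution (large n).
    p = 1.0 - NormalDist().cdf(t)
    return t * t, t, p


# Hypothetical distance samples: a bimodal (two-class) set versus a narrow one.
observed = [0.0, 0.1, 0.2, 1.4, 1.5, 1.6] * 5
randomized = [0.7, 0.75, 0.8, 0.85, 0.9, 0.95] * 5
w_stat, t_stat, p_value = levene_one_tailed(observed, randomized)
```

A small p-value with positive t indicates that the observed distances are more dispersed than the randomized ones; swapping the two arguments reverses the direction of the test.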

Right: the path distance of an archetype neuron from class A38 (light blue), from its soma (black) in the ipsilateral presubiculum (green) to the subiculum (purple), is significantly shorter than that (dark blue) to the lateral entorhinal cortex (orange). (B) Left: box and whisker plot depicting the distributions of path distances from class B27 to its major ipsilateral and contralateral targets. [...] plot of the path distances from neurons in contralateral classes to the subiculum. Bottom: the distance of an archetype neuron from class A38 (blue), from its soma in the presubiculum (green) to the contralateral Sub (purple), is significantly shorter than the comparable distance of an archetype neuron from class B27 (red). See Figure 4 for abbreviation definitions. Significant differences in distances were calculated using a Wilcoxon Signed Rank Test performed on neuronal path distances, and multiple testing was corrected for by False Discovery Rate to determine the significance of the resultant p-values.

Tables

Table S1. Raw axonal counts for primary motor area layer 6.

Table S3. Non-negative least-square normalizations.