Neuronal Connectivity as a Determinant of Cell Types and Subtypes

Classifications of single neurons at brain-wide scale is a powerful way to characterize the structural and functional organization of a brain. We acquired and standardized a large morphology database of 20,158 mouse neurons, and generated a whole-brain scale potential connectivity map of single neurons based on their dendritic and axonal arbors. With such an anatomy-morphology-connectivity mapping, we defined neuron connectivity types and subtypes (both called “c-types” for simplicity) for neurons in 31 brain regions. We found that neuronal subtypes defined by connectivity in the same regions may share statistically higher correlation in their dendritic and axonal features than neurons having contrary connectivity patterns. Subtypes defined by connectivity show distinct separation with each other, which cannot be recapitulated by morphology features, population projections, transcriptomic, and electrophysiological data produced to date. Within this paradigm, we were able to characterize the diversity in secondary motor cortical neurons, and subtype connectivity patterns in thalamocortical pathways. Our finding underscores the importance of connectivity in characterizing the modularity of brain anatomy, as well as the cell types and their subtypes. These results highlight that c-types supplement conventionally recognized transcriptional cell types (t-types), electrophysiological cell types (e-types), and morphological cell types (m-types) as an important determinant of cell classes and their identities.


Introduction
A mammalian brain is a complex network of tens of millions or more neurons and supporting cells that work together to carry out its functions (Purves, et al, 2019;Luo, 2015). These neurons form intricate modules within neuronal circuits and connectomes (Lichtman, et al, 2008;Sporns, 2011;Van Essen, et al, 2012;Seung, 2012). Cataloging brain-wide neuron types is recognized as a powerful way to understand the structural and functional organization of the brain (Peng, et al, 2021;Winnubst, et al, 2019). A definitive class of neuronal cells in a mammalian brain typically has a number of neurons that share multiple common attributes. Recent advances in classifying neurons usually rely on four major types of attributes: anatomical (Zeng and Sane, 2017;Muñoz-Castañeda, et al, 2021), physiological (Gouwens, et al, 2020), morphological (Peng, et al, 2021;Winnubst, et al, 2019), and molecular (Moffitt, et al, 2018;Zhang, et al, 202l;Zhang, et al, 2023), which are further complemented by other attributes such as lineage or developmental trajectory of these cells (Sebé-Pedrós, et al, 2018 ;Zhong, et al, 2020;Russ, et al, 2021).
Large-scale data acquisition and analyses at single-neuron resolution have succeeded in identifying neurons based on their 3-D registered soma-locations to the standard brain atlases. For mouse brains, the Allen Common Coordinate Framework (Dong, 2008;Wang, et al, 2020) and image-modality specific variations (Qu, et al, 2022) serve as standard atlases to index anatomical locations of neurons of interest. However, such soma-location cell types, or s-types, provide only an anatomical reference of the respective neurons, with marginal indication about the structural, physiological, molecular, and other attributes of neurons. Practically available techniques (Lee, et al, 2021;Lipovsek, et al, 2021) to record electrophysiological, morphological, and transcriptional properties of individual neurons have generated massive resources of these data modalities (Kalmbach, et al, 2021), enabling the analysis of electrophysiological cell types (e-types), morphological cell types (m-types), and transcriptional cell types (t-types) that further classify various s-types (Scala, et al, 2021).
The description of the morphology of neurons has been a crucial force to advance neuroscience since the time of Cajal (Cajal, 1909). While high-resolution digital reconstruction of the 3-D morphology of neurons is very challenging Manuben-Gil, et al, 2023), recent efforts in large-scale, semi-automatic reconstruction have yielded substantial datasets for whole mouse brains (Peng, et al, 2021;Winnubst, et al, 2019;Gao, et al, 2022) and other complex primate brains (e.g. Han, et al, 2023). We believe that a comparative approach taking advantage of these resources will help understand the morphological classification and distribution of single neurons. We also envision that an objective comparison of the morphology of neurons, especially from various data sources, should be carried out in a standardized coordinate system of an entire brain. Recent advances in brain mapping and registration (e.g., Qu, et al, 2022) provide such an opportunity.
Recent neuroscience research has highlighted the urgent need and various approaches to study neuronal connectivity and the whole-brain connectome (Abbott, et al, 2020;Whitesell, et al, 2021;Axer, et al, 2022). The MICrONS Consortium has recently produced a number of analyses about the connectivity of cortical neurons using electron microscopy (EM) datasets (e.g. Turner, et al, 2022;Dorkenwald, et al, 2022;Yin, et al, 2020). Parallel efforts also include the EM-based reconstruction and analysis of Drosophila hemibrain that also have a limited use of the cell connectivity to define cell types (Scheffer, et al, 2020). However, the EM approach has not yet scaled up to a whole mouse brain, which motivated us to take an alternative approach. Indeed, the connectivity of neurons has to be mediated by their morphology, making it challenging to study at the whole-brain scale. While it is clear that neurons can be classified based on their regional projection and connectivity, a quantitative study of the connectivity types of neurons in mammalian brains has been challenging. The goal of this study is to make an initial attempt to define neuronal connectivity in the context of whole brain based on aggregating and augmenting the largest single neuron morphology reconstruction datasets. In particular, we analyzed data with axonal and dendritic reconstructions of all brain regions to catalog connectivity types (or c-type) and subtypes of anatomically defined neurons. We further investigated the role of such connectivity types, and found that the connectivity types and subtypes supplement with the conventionally recognized transcriptional cell types (t-types), electrophysiological cell types (etypes), and morphological cell types (m-types), as a new determinant of cell classes and identities. Our finding underscores the importance of potential connectivity in characterizing the modularity of brain anatomy, as well as the cell types and their subtypes.

Whole-Brain Map of Neuron Arbors and Projections
To formalize terminology, we call each group of neurons whose somas are in the same brain region a "soma-type", or s-type. A neuron type determined based on clustering of morphological features, or m-features, is called a "morphology-type", or m-type. Many morphological features, such as length, surface area, and number of branches of a neuron, are independent of a neuron's spatial orientation, while other m-features, such as width and height, may be associated with a neuron's orientation. A neuron type determined based on clustering of connectivity features/profiles, or c-features, of neurons is called a "connectivity-type", or c-type. All cfeatures are orientation-independent and thus c-type is also not associated with a neuron's orientation. Of note, connectivity is often associated with a topological direction, which means where axons project to and where neuronal input signal come from. Morphology features do not immediately exhibit such a directionality except partitioning into dendritic and axonal arbors. Remarkably, any quantifiable neuronal "connectivity" must be based on an anatomically precise mapping of individual neurons' morphology at the whole brain scale. In another word, c-type is a derivative of m-types but also involves multiple neurons, neuron-populations, and brain regions, in a standardized manner.
To effectively study s-types, m-types, and c-types, we built a comparative morphology neuron database that consists of 20,158 neuron-morphology reconstructions (Figure 1), with which we inferred potential axon-dendrite connectivity of single neurons for standard whole-brain anatomical regions. To do so, we first aggregated four state-of-the-art neuron reconstructions datasets from independent sources (Supplementary Table 1). In detail, we specifically generated 3-D dendritic morphology of 10,860 neurons, called DEN-SEU ( Figure 1A Figure 1), and the axonal morphology of 6357 neurons generated by ION (Gao, et al, 2022). For fair and comprehensive analyses, we cross-validated neuron morphologies (Supplementary Figures 2 and 3) to avoid potential systematic bias favoring one particular way in generating the respective neuron data. Moreover, we applied 3-D brain registration (Methods) to all these neurons to map them onto the same spatial coordinate system, Allen mouse brain Common Coordinate Framework, CCFv3 , so that all these neurons' somalocations and 3-D morphologies can be compared against each other directly.
The somas of the DEN-SEU neurons are distributed fairly evenly across major brain regions, while the other three full morphology or axon datasets focus on specific brain regions of cerebral cortex, thalamus, striatum, hypothalamus, hippocampus, and claustrum ( Figure 1A; Supplementary Table 2). Compared with the sparse and long projection of axons in the other datasets, the dendritic arbors of DEN-SEU cover all CCFv3 brain regions (Figure 1B), making this dataset suitable for analyzing the target projection/connection regions of axons of neurons. We also cross-validated the quality and distribution of our assembled neuron data with other public documented neuron morphologies of shared by independent labs ( Supplementary  Figures 4 and 5; Supplementary Table 3) via NeuroMorpho.Org (Akram, et al, 2018;Bijari, et al, 2020). At the same time, the projection pathways of aggregated full neurons and axons data in this study capture many important regional connections, such as neurons originating from primary motor cortex (MOp) (Figure 1C). The total number of brain regions reached by projection axons follows a broad distribution (Figure 1C), indicating that most axons normally project to relatively distal regions. By contrast, dendrites extend a much shorter distance, invading at most 5 or 6 brain regions nearby their soma anatomical locations ( Figure 1C). The large number of neurons involved in this study form complex patterns of potential connectivity, which should be quantified and analyzed in a principled way. We tackled this challenge by considering the modularity and granularity of individual neurons.
Neuron arbors often correlate with regions of dense connections between neurons. Therefore, we used a recent machine learning method, AutoArbor (Peng, et al, 2021), to determine the topologically connected arborization regions of neurons automatically. In particular, we started with 2941 fully reconstructed neuron morphologies in the Allen/SEU-ALLEN and MouseLight datasets to produce a brain-wide arborization map of a mouse brain ( Figure 1D). In this way, various neuronal pathways indicated in our datasets ( Figure 1C) are quantitatively modeled. For example, we observed clear modules of projection and potential connection patterns in large brain regions such as isocortex, striatum, and thalamus. This motivated us to characterize the potential connectivity among neurons using the structural components, i.e., neurite arbors, systematically. Accordingly, we generated in total 26205 axonal arbors and 20158 dendritic arbors ( Figure 1B) for all neurons in this study. We subsequently used these arbors to define the connectivity among neurons and respective c-types.   (Supplementary Figure 6). Horizontal axis: soma location of single cells. Vertical axis: arbor projection regions which are also grouped into larger brain areas. Size of circles: arbor length in brain regions. Color bar: ratio of local and distal arbors relative to soma locations. Distinct from morphology analyses of neurons that rely on various m-features, such as the Sholl analysis (Binley, et al, 2014), L-Measure (Scorcioni, et al, 2008), and extended global or local structural features (Wan, et al, 2015), we study cell-typing by generating the neuron connectivity features. One approach to quantifying single neuron connectivity is based on axon-dendrite colocalization that needs precise details on synaptic contact location approximation (Rees, et al. 2017), which however is still challenging at the whole brain scale. Our intuitive approach is to use soma locations and defined spatial domains of neuronal arborization. Specifically, we determined the connection targets of a neuron based on the 3-D registered brain regions invaded by its axonal arbor, and the connection strength based on the spatial adjacency of this neuron's axonal arbor and nearby dendritic arbors of neurons in our dataset (Figure 2A). We detected arbor domains of neurons that originate from a specific brain region using Gaussian mixture models (Methods) and produced spatially and statistically optimal parcellation of projection sites of all s-types. Within each arbor domain, the arborization pattern of each group of neurons of the same s-type is approximated using a spatially homogeneous Gaussian distribution. For example, SSp neurons were found to have 9 arbor domains, 4 of which are axonal arbor domains and 5 are dendritic arbor domains ( Figure 2B).

Connectivity Profiles Augment Morphology-Based Neuron Types
For each neuron in a specific s-type, we then computed a connection barcode ( Figure 2C) as the features to characterize the axon-dendritic spatial overlap of its axonal arbors and dendritic arbor domains of all s-types at the whole brain scale, all defined in the standard CCF space. For DEN-SEU, we produced 19 dendritic arbor domains per brain hemisphere. We also produced another 56 dendritic arbor-domains per brain hemisphere for other s-types with at least 60 reconstructed neurons. These dendritic arbor domains span an average volume of 8.94 mm 3 . The resultant connection barcode is thus a 150-dimensional feature vector for the entire brain, indicating how axons of neurons in a s-type may project and potentially connect to various dendritic domains in the context of whole brain anatomy. With this barcode, neurons belonging to a s-type can be further clustered. For instance, ventral posteromedial nucleus (VPM) neurons were smoothly clustered into four connection groups ( Figure 2C), which are visually separable from each other ( Figure 2D).
To understand the advantage of the connectivity barcode, we first applied it to assisting conventional morpho-analysis of cell types that clusters s-types or their sub-types based on mfeatures. It was difficult to separate MOp, Subiculum (SUB), and ventral posterolateral nucleus (VPL) neurons that have heavily overlapping m-features, as seen in both the overlap scores and the feature scatter plot (Figure 2E top). However, when the connectivity features were appended to the m-feature vectors to cluster these three s-types, they became clearly separable in terms of a minimal overlapping in this case ( Figure 2E bottom right). When the first principal component of the connectivity features was added in visualization, the separation of the three s-types was visible ( Figure 2E bottom left). This shows that the connectivity features help discriminate neuron classes, similar to the dimension-increment analysis or support vector machines Steinwart and Christmann, 2008) in pattern recognition and machine learning, where non-separable classes could become distinguishable in higher-dimensional spaces.  Connectivity and morphological features c-type

Connectivity Types Outperform Morphology Types in Neuron Classification
To investigate whether c-features would classify cell types better than conventionally used mfeatures (Zeng and Sane, et al, 2020; Peng, et al, 2021), instead of providing auxiliary dimensions to assist cell typing, we computed morphological features' similarity scores (mscore) of all 31 known s-types (n > 60) ( Figure 3A). Except a small amount, i.e., 25.8%, of stypes that have relatively low similarity in their m-features, the majority of s-types (74.2%) make up 3 boxed cohorts, within each of which neurons of different s-types share high similarity mfeatures ( Figure 3A). Remarkably, the similarity score between the c-features (c-score) of all 3 cohorts of s-types are dramatically reduced while the c-scores of the other 8 s-types remain low ( Figure 3B). In another word, in general a s-type is well separated from other s-types in the space of c-features.
We further directly compared corresponding m-scores and c-scores to quantify the improvement of cell typing performance of connectivity features over morphological features. Here, 76% of entries in the ratio matrix of c-scores and m-scores ( Figure 3C, 3D, and 3E) are less than 1, while 99% of such entries corresponding to the boxed cohorts are less than 1. We also visualized the actual clustering of neurons based on either morphological features or connectivity features. Examination of the paired UMAP (Uniform Manifold Approximation and Projection) clustering for the 12 smallest ratios of c-scores and m-scores shows that c-features are much more separable than m-features ( Figure 3F). For example, ACAv5 neurons have mixed m-features with AId2/3 neurons, however their c-features are clearly separable ( Figure 3F). This is the same case for MOs5 neurons vs ILA5 neurons, ORBm2/3 neurons vs ACAv5 neurons, and all other visualized pairs of cell types, although all these cases have varying distributions in their UMAP space ( Figure 3F). As our results analyze the largest neuron archives for the mouse brain containing major neurons classes, it is reasonable to conclude that c-features could serve as strong contenders of m-features for cell typing of neurons whose somas are from wellestablished anatomical regions.

Connectivity Features Correlate with Spatial Separation of Potential Cell Subtypes
After establishing that connectivity is a powerful attribute for classifying neurons types, we investigated whether c-features would also help identify sub-types of neurons that share their soma locations in the same anatomical area. To do so, we generated a distance map (d-map) to measure the spatial separation of two neurons based on their soma locations ( Figure 4A). Because within any specific brain region neurons were labeled in a stochastic way, the pairwise soma-distance may form a Gaussian-like or Gaussian-mixture distribution ( Figure 4B). Particularly, when somas scatter almost uniformly within a brain region, their pairwise distance will be close to Gaussian, such as LGd and CP neurons ( Figure 4B, red and blue). Conversely, when somas form two or more subclusters within a region, their pairwise distances may form a distribution with long-tail, or approximately a Gaussian mixture distribution, such as the ACAd 6a neurons in this database (Figure 4B, green). Correlating the morphology similarity scores (mscores) and connectivity similarity scores (c-scores) with d-map provides a useful way to understand which kind of features may help identify subtypes of neurons whose somas are from subareas in an established s-type.
In the example of CP neurons, we calculated the pairwise m-score and c-score matrices ( Figure  4C) sorted in the same order of neurons as in the respective d-map ( Figure 4A). Using the cfeatures, we obtained three major CP clusters ( Figure 4D) with different projection and arborization patterns (Figure 4E), although their somas are mixed fairly uniformly ( Figure 4F), while there is no obvious subcluster based on m-features' similarity matrix ( Figure 4C). Similarly, we computed the d-maps and respective m-scores and c-scores matrices for LGd and ACAd6a neurons ( Figure 4G, Figure 4H). The Gaussian-mixture like distribution of the pairwise neuron-distances of ACAd6a neurons also translate to potential clusters in ACAd6a's dmap (Figure 4H), while the single Gaussian-like distributions of CP and LGd neurons ( Figure  4B) correspond to the less clear hierarchical clustering of the respective sorted d-maps ( Figure  4A, Figure 4G).
We computed the corresponding d-maps, m-score, and c-score matrices for all 31 s-types of neurons. Overall, we found that for any pair of neurons, their c-scores are only slightly greater than the m-scores ( Figure 4I). There is a weak positive correlation between these two scores ( Figure 4J), of which m-scores follow a much flatter marginal distribution than c-scores ( Figure  4J); this indicates that statistically it would be harder to produce clearly segregated neuron clusters based on morphology similarity. However, remarkably the corresponding entries of the d-map and c-score matrices have evidently negative correlation, which is also much stronger than that between d-map and m-score entries ( Figure 4K, Figure 4L). Indeed only 6 out of 31, or 19.4%, s-types show stronger negative-correlation of soma-location-and-morphology similarity over soma-location-and-connectivity correlation ( Figure 4K). Neurons with far away soma locations can be at most 4 times more likely to have different c-features than m-features ( Figure 4L). Thus, we conclude that potential subtypes for a s-type are statistically better represented by c-features than by m-features.

Spatially Tuned Connectivity Features Identify Cell Subtypes
Anatomical sub-grouping of neurons within a specific brain region reflects the spatial coherence of these cells. As c-features correlate more strongly with the spatial adjacency of neurons, for each s-type we combined connectivity profiles and spatial adjacency to cluster neurons and identify potential anatomical subtypes. We called this approach Spatially-Tuned c-Features, with which we produced clear subtyping of neurons ( Figure 5) that we had never been able to identify using alternative methods.
In particular, for cortical neurons (Figure 5A~F), we found that neurons in the prelimbic area (PL) have 2 subtypes for each of the layers 2/3 ( Figure 5A), layer 5 ( Figure 5B), and layer 6a (Figure 5C), respectively. Neurons in layers of the secondary motor cortex, MOs, could also be clustered into subgroups (Figure 5D~F). The layer 2/3 MOs neurons are clustered into two large subgroups indicated by the sorted distance matrix, along with distinct projection patterns of these subgroups in the cross-sectional views of the CCF space ( Figure 5D). Similarly, each of layer 5 and layer 6 MOs neurons were divided into two subgroups, respectively (Figure 5E and 5F). Detailed examination of these MOs subtypes provides guidance for analyzing connectivity-based subtypes of cortical neurons (see next section).
We also attempted to identify subregions in the thalamic gateway related to sensory and motor input, particularly VPL (Figure 5G), VPM (Figure 5H), and LGd ( Figure 5I). We found that subregions of somas in these areas correspond to neurons projecting to distinguishable spatial targets, visualized often as homogeneous color-blobs of neuron-subclusters, which are particularly clear in the three subtypes of VPM neurons ( Figure 5G).
LGd has three known anatomical subregions (Guido, et al, 2018;Okigawa, et al, 2021), i.e., LGd-shell, LGd-core, and LGd-ip (ipsilateral zone). We found two major distinguishable subtypes of somas using our approach, which may provide further spatial granularity to study the previously documented subregions. Of note, LGd neurons could not be clearly clustered using either morphology features or connectivity or spatial distance features alone ( Figure 4G).

Subtyping MOs and VISp Neurons Reveals Diversified Connectivity, Transcriptomic, and Electrophysiological Characteristics
MOs neurons have long axonal projections that subserve animal decisions (Yang and Kwan, 2021). In addition to individual neurons' spatial patterning, we also profiled the symmetry of MOs connectivity using the cortical layer 5 neurons. To do so, we kept the somas separated when calculating their space distance map. Our examination of individual MOs neurons confirmed long-range projection targets at the full-brain scale (Figure 6A, top). The overall projection patterns of these MOs neurons are also consistent with the previously documented population projection (Oh, et al, 2014) ( Figure 6A, bottom-right). We found that the somas in MOs5_1 and those of MOs5_2 and MOs5_3 clusters distribute on the two sides of the brain (Figure 6A, bottom-left), while the somas in MOs5_2 and MOs5_3 essentially intermingled. Indeed, the projection patterns of MOs5_1 match well with the mirrored sum pattern of MOs5_2 and MOs5_3. In another words, the spatially tuned connectivity analysis provides a powerful way to reveal both the anatomical distribution of neuron subtypes and their symmetry. Particularly, while the reconstructions of neurons of this MOs5 dataset have three anatomical subtypes when both hemispheres of the brain are considered, there are only two genuine subtypes ( Figure 5E) that are distributed symmetrically on the brain's coronal plane. These two subtypes might be further subdividable as implied in respective clustering tree ( Figure 5E).
We also examined both the axonal and dendritic morphologies of MOs5 subtypes. While the most dendritic features of the two genuine subtypes, MOs5_2 and MOs5_3, are similar to each other, their axonal features ( Figure 5E) are clearly different in area, width, and relative shift of centers, despite the similar numbers of axonal bifurcations. This means that, although these two subtypes have similar branching complexity, their projection patterns differ. Such variability of MOs5_2 and MOs5_3 is also seen in the different correlation and neuron-beta (Peng, et al, 2021; Methods) values compared to the overall MOs population projection ( Figure 6C). The respective scores of MOs5-vs-population and MOs5_1-vs-population are comparable to each other, indicating MOs5_1 is a good ipsilateral approximation of the overall MOs5 patterns, also as the "sum" reference for MOs5_2 and MOs5_3 ( Figure 6C). The MOs5 and MOs2/3 neurons also covary strongly with the MOs population projection. Differently, MOp neurons show more variation in the single-neuron-vs-population comparison, while their integrative projection pattern also matches with previous population projection data (Oh, et al, 2014) ( Figure 6C). We also correlated the m-features of individual neurons' dendritic and axonal arbors. For MOs5 neurons, m-features such as the number of bifurcations and total length show recognizable level of correlation, in the range of 0.3~0.7, between dendrites and axons ( Figure 6D).
We also produced UMAP analyses to compare the transcriptomic subtypes of single MOs neurons (Yao, et al, 2021), connectivity subtypes and morphological subtypes ( Figure 6E). As the transcriptomic data of MOs and FRP (frontal pole, cerebral cortex) were mixed due to the limited spatial resolution at this point, we prepared connectivity and morphological features of individual neurons in a similar way, also specifically for layer 5. Within each of the individual scenarios, we observed relatively coherent subtyping except for the cases of morphological features. However, it seems that the diversity exhibited in the c-features cannot be immediately explained by the subgrouping of the transcriptomic features. We have not observed a conclusive layer-by-layer correspondence between transcriptomic and connectivity subtypes, either.

coefficient, between single neurons and neuron-populations in motor cortex, specifically MOp, MOs2/3, MOs5, and MOs6a subtypes. D. Correlation of dendritic and axonal morphological features for MOs5 connectivity subtypes, along with examples of the first MOs5 cluster. Note that the clustered neurons in A might not have dendrite reconstructions, however in this dendro-axonal correlation analysis only neurons in A but also with full dendrites and axons are counted. E. Transcriptomic profile-based single neuron clustering of FRP-MOs neurons (n=34,331) and more specific FRP-MOs layer 5 neurons (n=9879), compared with the clustering based on connectivity and morphology features of FRP-MOs / FRP-MOs layer 5 neurons.
Moreover, we performed a joint analysis of the m-type, c-type, t-type and e-type data based on retrieving the publicly available electrophysiological and transcriptomic recordings of single neurons that also fall into the brain regions used in this study. For the primary visual area (VISp), we analyzed 48 fully reconstructed neuron morphologies and their regional connectivity patterns (Supplementary Figure 7A), along with their morphometric features (Supplementary Figure  7B) and anatomical locations of cell bodies (Supplementary Figure 7C). We found that VISp neurons in different cortical layers have a less clear separation in morphology (Supplementary Figure 6D) than in connectivity (Supplementary Figure 7E). We also re-analyzed previous single neuron electrophysiological recordings (Gouwens, et al, 2020) based on the concatenated e-type features (Supplementary Table 7) and colored these e-type data using their molecular profiles and anatomical locations (Supplementary Figure 7F-G). While it is clear that certain neurons in different layers have preferences in their physiological and molecular properties, there is a general disparity between such features and their c-types.

Subtyping Single-Cell Connectivity of VP Nuclei Indicates Broader Multisensory Integration
By subtyping single cell reconstructions of 390 VPM and 83 VPL neurons, we were able to document the broad regional connections of VP neurons in a comprehensive manner. First, we clustered individual neurons' detailed projections onto cortical areas and layers into 8 subtypes as a matrix ( Figure 7A). These 8 groups have similar separation of their soma locations as well as the respective axonal arbor targets' locations ( Figure 7B and 7C). The longest dendrite can be about 5 times of the shortest dendrites in these groups ( Figure 7D). We also confirmed the majority projection of VP neurons to layer 2/3 and 4 of somatosensory cortex ( Figure 7A It is interesting to note that, while our previous study (Peng, et al, 2021) implied that a small portion of VP projection may target MOp, the detailed examination presented in the next section (Figure 8) visualizes abundant outgoing arborization of VP neurons in MOp regions. Strikingly, we estimated a nonnegligible 20.7% VP cells (n = 98) actually project to multiple cortical areas such as motor or visceral areas that are outside somatosensory cortex, and even beyond such as CP ( Figure 7A). Furthermore, we found that a single VP neuron could target multiple sensory areas. For example, a VPM neuron can simultaneously projects to SSs and sub-areas of SSp, such as SSp-m, SSp-n, SSp-ll or SSp-ul (Figure 7A and 7E). Indeed, some VPL neurons even project to layer 1 in addition to layer 4 (lower right panel in Figure 7E). Such neurons carry two separate axonal clusters: a larger one projecting to VPL neuron's typical projection target, i.e., SSp-layer 4, and a smaller one targeting layer 1 of a different cortical area, such as SSs or even VIS (visual cortex).
Also of note, the surface-area of some VPM cells' axonal cluster in SSp-bfd (largest: 384,942μm 2 ) is twice larger than that of a barrel ( Figure 7F). Traditionally, it is believed that each VPM cell only projects to one barrel (Pierret, et al, 2000). Our finding suggests potential signal regulation across multiple barrels; thus, the tactile sense signaling transmission could be a multithread process.
Additionally, 18.6% of VP neurons (n = 88) possess small branches with bouton terminations in subcortical striatum, suggesting VP-striatum projections (Figure 7G). Our finding indicates a new pathway in thalamic-subcortical circuit, supplementing the main pathway of VP nuclei to somatosensory cortex. Taken together, these single-cell VP reconstructions give clues to supplementary and complex signal transmission paths in multisensory integration circuits.

Subtyping Target Connections of Thalamocortical Neurons in MOp Cortex
In addition to the outgoing "forward" connection patterns examined in preceding sections, we also investigated the diversity of incoming connections of a target brain region. Previous literature shows that the primary motor cortex (MOp) receives thalamocortical projection from the sensory-motor relay nuclei VAL and the modulatory or high order nuclei like VM or PO (Kuramoto, et al, 2011;Guo, et al, 2018;Guo, et al, 2000). Our analysis revealed additional connections from sensory relay nuclei VPM and VPL (Figure 7 and Figure 8).
With the whole-brain mapped full reconstructions we produced, it can be seen that individual neurons from PO, VM, VAL project to motor and somatosensory areas as a whole spectrum of connectivity subtypes (Figure 8A and 8B).

Discussion
This work studies the whole-brain scale connectivity of single neurons using one of the largest data archives produced to date, leveraging both new dendritic reconstructions that cover the entire brain and existing axonal and full reconstructions (Winnubst, et   indicating quality reconstructions. The consistency of these data suggests a novel approach to reveal how individual neurons in different regions are wired into different networks with different circuit motifs at whole brain scale. There are two remarkable topics in such an integrative approach. First, one may be able to study the building blocks of a brain, i.e., organizational "types" or "subtypes" of individual neurons, in terms of connectivity. This work makes an initial attempt toward this end. Second, one may be able to construct and study the "microscale" connectome based on individual neurons, filling a gap between previous work at the population level mesoscale connectome and the nanoscale connectome that relies on using electron microscopy and/or other super-resolution microscopy methods more suitable for examining synaptic level connections of neurons in potentially smaller, local brain regions. We have also attempted the second approach in another ongoing study (unpublished work).
To understand the potential connectivity of neurons throughout a brain at single neuron resolution, it is essential to analyze the arborization of axons and dendrites in different anatomical areas. An overall axonal arborization distribution map (Figure 1) provides an understanding of the marginal distribution of neurons that innervate from different regions, and also highlights that arbors can be powerful entities to study neuronal connectivity. To complete this paradigm, we produced brain-wide dendritic arbor domains, which were used to generate the connectivity profiles for each individual neuron. In this way, the connectivity features can be precisely defined and utilized for analysis. This approach therefore constitutes a contribution essential for whole brain scale single-neuron analysis.
Our framework derives potential connectivity from full morphology plus atlas mapping into a standard space, so that the regional connection relationship of multiple neurons can be compared objectively with the appropriate context of their relative locations, distributions of their shapes, and spatial adjacency and/or overlap of their axon-dendritic arbors. Atlas mapping thus enables the expansion of previous approaches to neuron-type circuit analysis that were limited to local anatomical domains (Tecuatl, et al, 2021) to the entire mouse brain and long-range projection neurons. Our data show that connectivity features of neurons not only provide additional dimensions to distinguish neurons from different anatomical regions, but also allow effective neuron-typing when they are used alone. Within each of the anatomically established brain regions, we also see a strong correlation between the connectivity-based similarity and the spatial adjacency of neurons and their somas. Therefore, to approach the more challenging task of subtyping neurons, we can aggregate both connectivity and spatial information to observe distinct neuron-groups that are otherwise difficult to distinguish. Our application in analyzing MOs neurons demonstrates diversity of such neuron subtypes that cannot be readily inferred from existing data of neuron population-projections and molecular profiling. Further screening of the enriched regional connectivity of neurons in VP nuclei and MOp may provide additional evidence that connectivity subtypes do exist, and carry biological significance in signal relay and integration.
Within a general framework of cell typing, our study demonstrates that morphology cannot accomplish this alone. Indeed, based on this study and also previous work (Peng, et al, 2021), as well as converging results from invertebrate nervous systems (Mehta, et al, 2023), we hypothesize that t-types or e-types alone may also be insufficient, and it is an interesting open question how to synergize all these data in a common connectivity framework. We believe that there are two key steps to address this challenge. The first is the generation of connectivity associated t-type, e-type, and m-types data. A second step is building a thorough statistical model of all such data to mine the associations and distribution patterns, which could be homogeneous clusters or globally nonlinear manifold patterns (Liu and Qian, 2022).
This approach of leveraging connectivity-type analysis toward the determination and validation of neuronal cell types is powerful. It can be extended to brain-scale analysis of single neurons' synaptic connectivity when data becomes available. An excellent example can be seen in the single-cell connectivity-types defined for a Drosophila brain (Scheffer, et al, 2020) that elaborate on the connection detail built upon morphological and lineage similarities. While such an approach provides the electron microscopy-based, ultrascale spatial resolution to precisely pinpoint synaptic connections, it is also subject to noise and imperfect process of data acquisition and computation, which would likely be exacerbated when applied to a much larger and complicated mammalian brain. The strength of the present approach is that we can readily study cell typing and subtyping using the arborization based regional connectivity, without precise pinpointing of synaptic level connections. This may be valuable when considering that individual synapses are subject to turnover via structural plasticity, while arbor geometry provides a relatively more stable circuit scaffolding (Stepanyants et al, 2002). Connection types and subtypes can also provide a useful blueprint of future synaptic level analysis. In summary, neuronal connectivity in mammalian brain provides a powerful discriminant in the classification of neuronal cell types, refining and adding novel class information to existing and widely studied modalities.
We caution that the subtlety in definitions of "morphology" and 'connectivity" might cause slight confusion in a specific context. In literature, sometimes the analysis of morphology type might have used part of the connectivity information, such as the orientation-aligned neurons could be analyzed using lamination information (Gouwens, et a, 2019). The goal of this study, however, is to factorize the analysis in an understandable way. In this sense, our paradigm can contribute to a more organized and clear communication in this field.

Acknowledgment
This project was mainly supported by a Southeast University initiative for neuroscience awarded to HP. HP was also supported by a Zhejiang Lab BioBit Program visiting grant (2022BCF07). The Southeast University team was also supported by a MOST (China) Brain Research Project 2022ZD0205200/2022ZD0205204 awarded to LL. GAA was supported in part by NIH grants R01NS39600 and RF1MH128693. HZ was supported by BRAIN Initiative grant U19MH114830. We thank Sebastian Seung for comment and suggestion on the manuscript and literature, Yong Yao, Rafael Yuste, David Van Essen and a number of other experts for various inspirations including discussions, comments, suggestions, and community events.

Author Contribution
HP conceptualized and designed this study, and managed the entire project. LL executed the project and supervised ZY and FX in generating all results along with the assistance of LMG and HC. HD, MH, and HZ all advised on the project and manuscript. GAA advised on selection and analysis of independent reconstructions for validation, participated in critical discussion on the early phases of analysis design, and assisted in editing and revising the manuscript focusing specifically on relation with extant scientific literature. HP wrote the manuscript with input from all coauthors.

Full and axon reconstructions
We performed a detailed analysis of 1741 fully reconstructed single neuron morphologies (Peng, et al, 2021) called BICCN AIBS/SEU-ALLEN, and 1200 full single neuron reconstructions from the Janelia MouseLight project (Winnubst, et al, 2019). We also analyzed the axonal morphology of 6357 neurons generated by ION (Gao, et al, 2022). All neurons were registered to the Allen Mouse Brain Common Coordinate Framework v3 (CCFv3). These data are also documented in Supplementary Table 1. The naming convention of brain regions follows the CCFv3 and also consistent with the previous studies. Abbreviations are also recorded in Supplementary Table 2.

Generation of dendritic tracing
We generated 10860 dendrite reconstructions from fMOST imaging with the following protocol. First, we collected image samples following the same protocol in our previous study on generating the full reconstructions (Peng, et al, 2021). Next, we ran the APP2 algorithm (Xiao and Peng, 2013) for tracing local arbors by taking manually defined and validated somas as the central starting points in local image volumes (1024x1024x512 voxels), for the goal that the joint area of these local volumes covers main dendrite arbors. We ran APP2 with a number of background thresholds (10,15,20,25,30,35) [35,129], and 'Max Branch Order' [3,32]. An automatic tracing would be discarded if less than four of its features fell out of these limits. In case more than one tracing qualified for a soma location, we kept the tracing with greatest length. We spatially registered all tracings to CCFv3 using mBrainAligner (Qu, et al, 2022). In total, we collected images from 53 mouse brains, we identified 31,625 neurons, and we generated 17,228 qualified tracings. We visually inspected all tracings and discarded those with obvious errors (e.g., 1 trace covers multiple touching neurons, mis-alignment during registration), finally obtaining 10,860 proofread dendritic tracings.

Independent reconstructions for validation
To cross-validate the neuron morphologies used in this work, we also considered independent morphologies produced and documented in public resources. Particularly, we searched adult mouse neuron reconstructions in certain brain regions via keywords using the searching tool of NeuroMorpho.

Gaussian Mixture Model classification of neuron nodes
We resampled fully traced SWC files (n = 9298) to have nodes every 10 µm and saved their coordinates in the space of the Common Coordinate Framework v3 (CCFv3)  at 25µm isotropic voxel resolution. For each of the 19 CCFv3 brain regions with most neurons in the analyzed datasets (MOs, AId, ACAd, ACAv, ORBvl, ORBl, ORBm, VPM, CP, AIv, FRP, ILA, MOp, SSp, VPL, SUB, LGd, SSs; see Supplementary Table 2 for reference), we pooled all SWC coordinates in a single data frame containing their x, y, and z locations. We clustered the pooled data using the Mclust function with default parameters (mclust R package version 5.4.7 (Scrucca, et al, 2016)). We selected the Gaussian Mixture Model (GMM; among all combinations of spherical, diagonal and ellipsoidal with equal or varying volume, shape and orientation) as it provides optimal clustering as measured by the Bayesian Information Criterion (BIC) (Schwarz, 1978). BIC is a measure for the comparative evaluation among a finite set of statistical models, based on maximizing the likelihood function while penalizing for the number of parameters in the models. We saved the resulting classification with the node IDs of each neuron.

Definition of arbor domains using a-shape
We defined 3-D dendritic domains by using the pooled, clustered SWC coordinate dataset. We found the minimal volume enclosing all nodes belonging to each single cluster by obtaining the 3D a-shape of the point set (alphashape3d R package version 1.3.1) (Edelsbrunner, et al, 1994). The 3D a-shape is a generalized definition derived from the Delaunay triangulation (Delaunay, et al, 1934) with a parameter a to control for the level of detail (the convex hull is obtained when a»¥). To obtain detailed volumes enclosing all neuron nodes in each cluster, we used a=0.4. We call the obtained 3D shapes "arbor domains". When the majority of the nodes within an arbor domain belonged to neurons with their soma in the domain itself, we categorized those as dendritic arbor domains. Otherwise, we considered the obtained domains to be axonal. We saved all arbor domains as surface objects. We plotted 2D slices of the arbor domains using the R base plot function (version 4.1.0).
The definition of dendritic domains based on full tracings was obtained both for raw data distributed in both brain hemispheres and for flipped neurons, ensuring that all of them had somas in the same hemisphere. To further analyze connectivity, in that case, dendritic domains were flipped to also recapitulate homologous contra-lateral regions.
In addition to the arbor domains obtained from fully traced neurons, we also generated single dendritic domains from using all node coordinates for dendritic tracings with somas inside each of the 19 brain regions with most neurons. 3D a-shapes were defined using the same method. However, in this case we did not perform GMM-clustering and all coordinates were pooled in a single set for each brain hemisphere.

Single neuron connectivity to dendritic arbor domains
To define outgoing connections from single fully traced neurons to dendritic arbor domains, we measured the spatial overlap between single neurons and arbor domains. To do so, we obtained all voxels enclosed by each domain 3D α-shape (we tested whether they are inside the surface of the domain using the inashape3d function in the alphashape3d R package version 1.3.1) and saved them as a 3D mask in the CCFv3 space. To convert surface polygon file format .ply files to 3D masks we used binvox version 1.35 (Nooruddin, et al, 2003). We then obtained a 3D volume where each voxel contains an array of indices identifying each 3D a-shape volume visiting such voxel. We obtained a-shapes for each individual fully traced neuron (a=0.4) and saved the enclosed volume as a 3D mask. Finally, we measured the overlap volume between each single neuron mask and the volume containing all 3D arbor domain indices. We saved the overlapping volume between each neuron and each dendritic domain as a connectivity matrix.

Support Vector Machine clustering of morphology and connectivity
To assess the relevance of arbor domain connectivity for defining cell types and subtypes, we used a Support Vector Machine (SVM; hyperoverlap R package version 1.1.1; linear kernel, cost=1000 and stoppage.threshold=0.2) to classify neurons with somas located in MOp, SUB and VPL regions (Brown, et al, 2020;. For each pair of brain regions, we used SVM to cluster the data in two groups. To assess the separation of the neurons in the space defined by the two morphological variables "total length" and "maximum branch order", we measured the pairwise overlap of points from each of the three brain regions. To account for arbor domain connectivity, we obtained a PCA from the connectivity matrix of the analyzed neurons. We performed pairwise SVM classification analogously by adding the first three principal components of the connectivity matrix in the dataset. We plotted these results using the ggplot2 R package (version 3.4.0).

m/c-score metric
The m/c-score can be used to quantify the dissimilarity of morphological (see Supplementary  Table 5 for axonal features and Supplementary Table 6 for dendritic features) and connectivity (arrays of spatial overlap between each single neuron and all dendritic arbor domains) features between two clusters, taking into account both their intra-class similarity and inter-class separation. A higher score indicates the greater difference between two clusters, while a lower score indicates more similarity. The m/c-score is calculated as the following formula: . , where, *+,-./01233 represents inter-class distance between the centers of two clusters, which is calculated using Manhattan distance metric (Han, et al, 2022).
$4&56/786%%(9) represents intra-class distance of cluster , which is defined as the average of the Manhattan distances between each sample and all other samples within the same cluster.
With regard to m-score matrix clustering, we applied hierarchical clustering by clustermap function (method="ward", metric="euclidean") in Seaborn Python package (version 0.11.2). We used umap-learn Python package (version 0.5.1) to implement UMAP decomposition with default parameters and plotted results as scatterplots with Matplotlib (version 3.3.4).

Anatomy-based distance metric
The distance metric follows the Mahalanobis definition (McLachlan, et al, 1999). Let $ = [ $ , $ , $ ] be the position of soma in 3-D space. Due to the computational convenience, the soma location should be mirrored to the ipsilateral hemisphere. For two somas : , ! , anatomybased distance was defined using the following equation: where, 646&=>? represents the covariance of 3D positions of voxels of relevant ipsilateral anatomical region in the 25µm CCFv3 reference space volume.

Distance-weighted connectivity-based clustering
Distance-weighted connectivity-based clustering was used to cluster s-type cells based on both their connectivity feature similarity and physical distance of somas. Two matrices were generated to represent these components: a connectivity similarity matrix (c-similarity matrix denoted by @ ) calculated using cosine similarity, and a distance matrix (d-map denoted by # ) calculated based on the anatomy-based distance between the somas of the cells. Both matrices were linearly normalized to values between 0 and 1. To emphasize spatial adjacency, a distance affinity matrix ( #A ) was constructed using Gaussian kernel; #A = exp(− # • # ). This ensured that larger values in the affinity matrix indicated greater spatial adjacency between cells. Hierarchical clustering was subsequently applied on the matrix resulting from multiplying @ and #A to produce diversity clustering results. The optimal number of clusters was determined by the Calinski-Harabasz (Caliński, et al, 1974) score (metrics.calinski_harabaz_score function from scikit-learn Python package version 0.24.2) automatically.

Correlation between single cell and population morphology, projections, and transcriptomics
We used a transcriptomic dataset of 34,331 neurons in MOs and FRP brain regions, which is collected from a newly released dataset (Yao, et al., 2021). The analysis is performed by SCANPY (a python package, version: 1.9.3). To ensure the data quality, we filtered out 9432 genes that are detected in less than 3 cells, and filtered out 624 cells that expressed over 6,000 genes. We normalized the data (using functions: pp.normalize_total, pp.log1p, pp. regress_out, under default parameters), and reduced its dimension (using tl.pca, and tl.umap, under default parameters) for visualization. We further extracted L5 related cells (9,879 cells) using this genetic modality, following the same procedures.