Positional transcriptomics shed light on site-specific pathologies of the aorta


Pathologies of large vessels, such as atherosclerosis and aneurysms, tend to emerge at specific sites. The consistency in their distribution suggests that a combination of unique local stressors, including both physical forces and specific gene expression profiles, contributes to disease etiology. Here we used single-cell RNA sequencing to identify signatures in smooth muscle cells with site-restricted predominance to uncover potentially relevant gene products. We showed that a small cohort of transcripts (5.5%) displays preferential expression at specific sites of the vascular tree, in accordance with their embryological origin. Importantly, in silico studies revealed that several of these genes mapped to loci from linkage studies for which no specific disease-causing candidates had been previously found. One of these candidates was Mcam/CD146, which mapped to the familial aortic aneurysm 1 (FAA1) locus identified by linkage analysis two decades ago. We showed that Mcam was significantly reduced in the AngII/hypercholesterolemic model of aortic aneurysm and further demonstrated that absence of the gene in mice resulted in larger lesions and accelerated death due to dissection. Our study highlights site-specific alterations in gene expression profiles of smooth muscle cells that yield important insight into site-specific vascular pathologies.


Introduction
A growing body of evidence supports the notion that embryological origin can impact the gene expression profile of daughter cells through transcription factor selection and unique epigenetic modifications. In relation to the vascular tree, detailed cell-lineage tracing experiments uncovered complex and diverse embryological ancestry 1. In particular, vascular smooth muscle cells (vSMCs) that populate the aorta descend from neural crest, somites, second heart field, and lateral mesoderm 2-4. Interestingly, vSMCs derived from these precursors remain segregated in the adult tissue, as revealed by lineage tracing studies 5. In addition to ancestry, vSMCs, like most cells, also possess positional identity 6 that is retained in the adult, as per expression of Hox family members 7,8. Interestingly, changing the topographical expression of Hox leads to alterations in gene expression and vascular remodeling 9.
The link between embryological origin and location is thought to contribute to disease susceptibility, particularly in the case of site-specific pathologies [10][11][12][13]. Furthermore, embryological origin can affect responses to the same stimuli even in adjacent cells. For example, in mouse models of Loeys-Dietz syndrome, responses to TGF-β result in either activation or repression of Smad2/3 signaling depending on whether vSMCs are derived from secondary heart field or neural crest 14. These findings underscore the notion that vSMCs derived from distinct progenitors appear to retain memory of their embryological ancestry, with impact on disease susceptibility and responses to therapy. Thus, a more granular understanding of the transcriptional profile of vSMCs that examines location and ancestry is of high relevance. The advent of next-generation sequencing and, more specifically, single-cell RNA sequencing (scRNA-seq) offers the unprecedented opportunity to clarify coordinated transcriptional diversity of vascular beds, vascular regions, vessel types and their relationship to unique physiological needs and/or emergence of pathology.
Amongst the pathologies that affect the aorta, aortic aneurysms show remarkable site-predilection, each type presenting with distinct etiology and history [15][16][17] . Of note, the prevalence of abdominal aortic aneurysms (AAA) is higher than that of thoracic aneurysms (TAA) 18 although there is an overlap in the risk-factors that predispose to both types of aneurysms 19 . This implies that intrinsic features associated with the topographical location of the vessel combined with risk factors modulate cellular response to vascular injury.
Here we use scRNA-seq to profile the transcriptome of vSMCs from adult mice in three distinct locations of the arterial tree: carotids, aortic arch (ascending-thoracic) and descending (thoraco-abdominal) aorta. Using this approach, we identified a small but important fraction of transcripts that displayed region-preferential patterns of expression. These regionally expressed genes defined site-specific signatures of adult vSMC subpopulations in relation to their anatomic location. Within these small signatures, we found both known and novel disease-associated genes, linking regionally skewed expression to susceptibility to disease emergence. Furthermore, delving into the network of mRNA signatures present in vSMCs during AAA, we identified CD146/Mcam, a gene with preferential expression on the ventral aspect of the thoraco-abdominal aorta. Notably, the MCAM gene resides in a locus previously associated with aneurysm susceptibility 20, but it had not been directly linked to development of the pathology.
Importantly, experimental aneurysm induction in the CD146/Mcam null mouse revealed that absence of this gene increases disease susceptibility and lethality. Overall, our data indicate that discrete expression patterns in regions of the aorta could guide identification of causative disease variants, particularly from the multiple loci identified by genome-wide association studies.

Results
Transcriptional characterization of vascular smooth muscle cells
To identify region-specific transcriptional identities in vSMC populations, we performed single-cell RNA sequencing (scRNA-seq) at 3 distinct sites: carotid arteries, aortic arch (ascending thoracic) and descending (thoraco-abdominal) aorta (Figure 1a and 1b). The rationale for selecting these sites was the combination of embryonic origin, hemodynamics, and disease emergence. The aortic root was excluded from the evaluation. The collected aortic arch region extended to the left subclavian branch, in concordance with the limits of neural crest origin 5. In total, we sequenced between 3,732 and 4,632 cells  Table 2). Quality assessment of each of the three libraries was performed, and cells with low RNA counts and high levels of mitochondrial RNA were removed from the dataset prior to downstream analysis (Supplementary Figures 1 and 2).
To identify common and regional signatures, the samples were merged and analyzed using Seurat (Figure 1c). This analysis identified 11 distinct cell clusters of varying vessel composition (Figure 1d). To determine the cellular constituency of the 11 cell clusters, expression of classical cell-type markers was assessed, identifying six distinct cell populations (Figure 1e, 1f, Supplemental Figure 1, Supplemental Table 3). Vascular smooth muscle cells (vSMCs) represented the predominant cell type in all samples, with the carotid library showing the greatest degree of diversity, as 32% of the carotid-identified cells were non-vSMC (Supplemental Table 3).
Populations of vSMCs were computationally extracted from the other cell types for in-depth analysis and  Table 4). Visualization of the data using t-Distributed Stochastic Neighbor Embedding (tSNE) highlighted the global uniqueness of each population of vSMC per anatomical region, as the data segregated into three distinct, mostly contiguous but not overlapping groups that matched the referred anatomical location (Figure 2a From this cell population 18,940 transcripts were quantified and assessed for site-enriched expression. In this analysis, 1045 transcripts (5.5%) showed unique anatomic location-enriched expression (Figure 2c, Supplementary Excel File 1). Importantly, while vSMC populations contained these regionally enriched expression patterns, their core identity, as per expression of the classical vSMC markers Myh11, Acta2, and Tagln, was unchanged (Supplemental Figure 3). Molecular drivers of the transcript regional specification were further evaluated by plotting their relative log fold change against the frequency of transcript-producing cells (Figure 2d and Supplementary Excel File 2). The predominant pattern was defined by relatively equal contributions of increased transcript production per cell and increased number of transcript-producing cells in a specific region (Figure 2d). For a smaller portion of transcripts, site-enrichment was driven by increased transcript counts per positive vSMC (fold change). This is exemplified by carotid-enriched Btg2, which shows a 1.5-log fold increase in transcript quantity per cell in carotid vSMCs relative to aortic arch and descending aorta (Figure 2d and Supplemental Figure 3). Some site-specific transcripts were characterized by differential increases in the frequency of cells expressing the transcript (cell enrichment).
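The two drivers of site enrichment described above (fold change per positive cell versus cell enrichment) can be separated with a small calculation. The following is a minimal sketch on toy counts, not the analysis code used in this study; the function name `enrichment_drivers`, the base-2 logarithm, and the example values are all assumptions made for illustration.

```python
import math


def enrichment_drivers(counts_in_region, counts_elsewhere):
    """Split site enrichment of one transcript into its two components.

    Returns a tuple:
      - log2 fold change of mean counts among expressing (count > 0) cells
      - ratio of the fractions of expressing cells (cell enrichment)
    The log base is an assumption; the text does not specify it.
    """
    def summarize(counts):
        positive = [c for c in counts if c > 0]
        if not positive:
            raise ValueError("transcript is not detected in this group")
        fraction_expressing = len(positive) / len(counts)
        mean_per_positive_cell = sum(positive) / len(positive)
        return fraction_expressing, mean_per_positive_cell

    frac_in, mean_in = summarize(counts_in_region)
    frac_out, mean_out = summarize(counts_elsewhere)
    return math.log2(mean_in / mean_out), frac_in / frac_out


# Toy transcript driven by fold change: same fraction of positive cells,
# but twice the counts per positive cell in the region of interest.
fold_change_driven = enrichment_drivers([4, 4, 0, 0], [2, 2, 0, 0])

# Toy transcript driven by cell enrichment: same counts per positive cell,
# but three times as many positive cells in the region of interest.
cell_enrichment_driven = enrichment_drivers([2, 2, 2, 0], [2, 0, 0, 0])
```

In this toy setting the first transcript has a log2 fold change of 1 with no change in the fraction of expressing cells, whereas the second has no fold change but a three-fold higher fraction of expressing cells.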
The cell-enrichment pattern is exemplified by the HOX genes Hoxb3os (carotid) and Hoxa7 (descending aorta), which show 1.7- and 2.5-fold increases in transcript-expressing cells, respectively (Figure 2d and Supplemental Figure 3).
Table 4 and Supplementary Excel File 3). Detailed analysis showed that while the vSMC subcluster-derived transcriptional signatures were relatively unique, certain clusters showed high overlap with one of the three anatomic location-defined signatures (Figure 2g). Visualization of the contribution of vSMCs from each of the three anatomic locations to the vSMC subclusters revealed that subclusters with high gene signature overlap with a given anatomic signature contained substantial cell contributions from that anatomic location (Figure 2h). For example, subcluster 0, predominantly composed of cells isolated from the descending aorta, displayed a gene signature that is 80% identical to the descending aorta anatomic signature (Figure 2g and 2h). In contrast, subcluster 6, predominantly composed of aortic arch cells (93.5% arch-derived vSMCs), shows minimal gene signature overlap (15%) with the aortic arch anatomic signature.
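The signature overlap reported above can be illustrated as a simple set comparison. This is a minimal sketch under stated assumptions: the function name `signature_overlap` is hypothetical, the gene identifiers are placeholders rather than genes from the dataset, and the choice of the subcluster signature as the denominator is an assumption, since the text does not state how the percentage was normalized.

```python
def signature_overlap(subcluster_signature, anatomic_signature):
    """Percentage of subcluster signature genes also found in an anatomic signature.

    Assumption: the percentage is expressed relative to the size of the
    subcluster signature; the source does not specify the denominator.
    """
    subcluster = set(subcluster_signature)
    if not subcluster:
        raise ValueError("subcluster signature is empty")
    shared = subcluster & set(anatomic_signature)
    return 100.0 * len(shared) / len(subcluster)


# Placeholder gene identifiers (hypothetical, for illustration only).
toy_subcluster = ["gene_a", "gene_b", "gene_c", "gene_d", "gene_e"]
toy_anatomic = ["gene_a", "gene_b", "gene_c", "gene_d", "gene_x", "gene_y"]

overlap_percent = signature_overlap(toy_subcluster, toy_anatomic)
```

With these placeholder lists, four of the five subcluster genes are shared, so the overlap is 80%.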
Expression of the HoxA cluster in combination with hierarchical clustering of the gene signature was utilized to further locate the topological distribution of cells (Figure 2i-k). The data uncovered striking links with developmental origin and topographical location despite maintaining strong core smooth muscle cell identity as highlighted by the high expression of common genes (Supplementary Figure 3).
We next proceeded to evaluate unique transcriptional signatures associated with three specific anatomic sites (carotids, ascending aorta and descending aorta) and explored the possible association of each unique signature with the emergence of site-specific vascular pathology.

Unique Carotid vSMCs Signature (caSMCs)
The carotids are a major branch point from the aortic arch responsible for the supply of blood to the brain (Figure 3a), and their tunica media is populated by vSMCs derived from the neural crest 1. A total of 632 transcripts (3.3% of total transcripts) constituted the molecular signature of the carotid artery (caSMCs), distinguishing this population from their counterpart vSMCs in the aortic arch and descending aorta (Figure 2c). caSMCs were the predominant population associated with two subclusters: subcluster 2 (87.81% carotid-derived) and subcluster 5 (100% carotid-derived) (Figure 3b,c and Supplementary Table 4). Differential gene expression analysis of these two subclusters identified 459 genes (subcluster 2) and 291 genes (subcluster 5) whose expression pattern separated these two clusters from other SMCs (Supplemental Excel Gene ontology analysis of the 632 anatomic location signature genes showed a significant enrichment for proteins involved in growth factor signaling, vascular development and AP-1 response (Figure 3d, Supplementary Excel File 4). From these pathways, candidate gene validation using immunohistochemistry was performed for the genes Klf4, Gucy1a3 and Gelsolin (Gsn).
Klf4, a member of the Kruppel-like factor family of transcription factors, was represented in 6 of the 10 top ontological categories (response to growth factor, regulation of phosphate metabolic process, vascular development, AP-1 pathway, cellular response to organic cyclic compound and response to wounding) (Figure 3d and 3e). Immunofluorescence staining of KLF4 showed significantly increased expression in all layers of carotid vascular smooth muscle relative to aortic arch and descending aorta (Figure 3f and 3g). Gelsolin (Gsn), a cytoplasmic, actin-regulating protein, was identified as a significantly enriched carotid SMC gene through its membership in two top ontology categories: AP-1 response and response to wounding (Figure 3d and 3h). Immunofluorescence staining of GSN showed significantly increased cytosolic expression in the carotid artery, with substantial increases in the layer of smooth muscle closest to the vessel lumen, when compared to aortic arch and descending aorta (Figure 3i and 3j). Gucy1a3, the major nitric oxide receptor, which was recently identified as causative for the carotid-restricted vasculopathy Moyamoya 21, was also identified as a significantly enriched carotid SMC gene (Figure 3k). Immunofluorescence staining of GUCY1A3 showed high expression in carotid vascular smooth muscle, with notably high levels in the smooth muscle layer in closest proximity to the endothelium (Figure 3l  Gene ontology analysis of the aortic arch anatomic location signature showed significant enrichment for transcripts involved in extracellular matrix organization, metabolism and autophagy (Figure 4d, Supplementary Excel File 4). From these identified signature genes, validation using immunohistochemistry was performed for the selected candidates: Aggrecan (Acan), extracellular superoxide dismutase (Sod3) and Thrombomodulin (Thbd).
Aggrecan, a large proteoglycan, was identified through its membership in 3 ontological categories: extracellular matrix organization, collagen metabolic processes and pyruvate metabolic processes (Figure 4d and 4e). Immunofluorescence staining of ACAN displayed significantly increased expression in all layers of aortic arch vascular smooth muscle, characterized by intense staining surrounding each vSMC in the arch, relative to the carotid arteries and descending aorta (Figure 4f and 4g). Sod3 was identified as part of the antigen presentation ontology category (Figure 4d and 4h). Immunofluorescence staining of SOD3 showed significantly increased expression of the protein in all layers of aortic arch vascular smooth muscle relative to the carotid arteries and descending aorta (Figure 4i and 4j). Thrombomodulin (Thbd), a critical component of the coagulation cascade, was identified as significantly enriched in the aortic arch through membership in the platelet degranulation ontology category (Figure 4d and 4k). Immunofluorescence staining of THBD showed significantly increased expression in all layers of aortic arch vascular smooth muscle relative to carotid artery and descending aorta (Figure 4l and 4m).

Characterization of Descending Aorta vSMCs (daSMCs)
SMCs of the descending aorta derive from the somites and the lateral mesoderm 1 (Figure 5a). The scRNA-seq data uncovered 187 genes (1% of total) that provided the molecular signature for daSMCs (Figure  Gene ontology on the descending aorta anatomic location signature identified significant enrichment for genes involved in oxidative phosphorylation, extracellular matrix proteoglycans, and muscle contraction (Figure 5d, Supplementary Excel File 4). Three genes were selected to validate the signature using immunohistochemistry: Aquaporin 1 (Aqp1), Ccdc3, and Perlecan (Hspg2). Aqp1, a well-characterized water channel, was identified in the gene ontology categories oxidative phosphorylation, muscle contraction, blood vessel development and actin filament-based processes (Figure 5e). Immunofluorescence localization of AQP1 identified the protein preferentially in smooth muscle cells of the descending aorta (Figure 5f and 5g). Surprisingly, AQP1 distribution in the endothelium was also site-enriched, with limited expression in the descending aorta relative to the aortic arch (Supplemental Figure 4).
Ccdc3, a putative/predicted secretory factor known for its anti-inflammatory role in the vasculature and effects on lipid accumulation, was the top transcriptionally enriched descending aorta gene (Figure 5h). Immunofluorescent staining for CCDC3 showed predominant expression in the descending aorta relative to aortic arch and carotid artery (Figure 5i and 5j). Perlecan (Hspg2), a secreted heparan sulfate proteoglycan, was identified as a descending aorta signature gene candidate through its membership in two ontology categories: ECM proteoglycans and blood vessel development (Figure 5d and 5k). Immunofluorescence identification of Perlecan (Hspg2) revealed preferential expression in vSMCs of the descending aorta (Figure 5l and 5m).
Identification of CD146/MCAM as a novel site-restricted aneurysm-associated gene
The cohort of genes selectively expressed at specific vascular locations (5.5% of total) offered the opportunity to inquire about their potential contribution to region-specific pathologies. Thus, we performed an in silico forward genetic screen of archived loci previously associated with mendelian vascular disease through integration of entries from the Human Phenotype Ontology (HPO) database (https://hpo.jax.org/app/) 22 with data from Online Mendelian Inheritance in Man (omim.org) 23. Following detailed curation, 16 reported loci representing 6 vascular-associated traits were identified (Figure 6a, Supplemental Figure 5 and Supplementary Table 5). From each locus, a list of regional candidate genes was obtained using the UCSC Table Browser and subsequently overlaid with the site-enriched single-cell data to identify novel candidate genes for each genomic region. A total of 13 transcripts predominantly expressed in specific vascular regions mapped to vascular disease loci (Supplementary Table 5). Given the discovery-driven nature of this approach, we sought to confirm the feasibility of validating putative disease candidates by examining the regionally restricted expression of a disease-associated, site-enriched candidate, Tfap2b (Figure 6b, Supplementary Table 6) 24. Immunohistochemistry at low magnification revealed a clear predominance of TFAP2B in the ascending portion of the arch leading towards the brachiocephalic artery, after which it becomes significantly reduced (Supplemental Figure 6). In the lower curvature, however, expression persists into the region beyond the remnant of the ductus arteriosus (Supplemental Figure 6). At higher magnification, substantial increases in both level and frequency of TFAP2B-positive nuclei were evident in the aortic arch when compared to the descending aorta and carotid arteries (Figure 6c, 6d). These findings strongly supported the notion that site-specific expression might uncover disease-associated gene candidates.
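The overlay step of the in silico screen described above (candidate genes within a disease locus intersected with site-enriched transcripts) can be sketched as an interval and set filter. This is an illustrative sketch only, not the procedure used in this study: the `Gene` and `Locus` types, the function `candidates_in_locus`, and every name and coordinate in the toy data are hypothetical placeholders.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    symbol: str
    chrom: str
    start: int
    end: int


class Locus(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def candidates_in_locus(locus, genes, site_enriched):
    """Return site-enriched genes whose coordinates overlap the locus interval."""
    hits = []
    for gene in genes:
        overlaps = (
            gene.chrom == locus.chrom
            and gene.start <= locus.end
            and gene.end >= locus.start
        )
        if overlaps and gene.symbol in site_enriched:
            hits.append(gene.symbol)
    return sorted(hits)


# Hypothetical locus, gene annotations and site-enriched set (toy values).
toy_locus = Locus("toy_locus_1", "chrA", 1_000, 5_000)
toy_genes = [
    Gene("gene_a", "chrA", 1_200, 1_800),   # inside locus, site-enriched
    Gene("gene_b", "chrA", 2_000, 2_500),   # inside locus, not enriched
    Gene("gene_c", "chrA", 6_000, 6_500),   # outside locus
    Gene("gene_d", "chrB", 1_200, 1_800),   # different chromosome
]
toy_site_enriched = {"gene_a", "gene_c", "gene_d"}

toy_candidates = candidates_in_locus(toy_locus, toy_genes, toy_site_enriched)
```

Only `gene_a` is returned for the toy locus, because it is the single gene that both lies within the interval and belongs to the site-enriched set.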
Amongst the identified candidate transcripts was the daSMC-enriched gene Mcam, which resides in the critical region of FAA1, a locus for familial aneurysm identified two decades ago using linkage analysis (Figure 6e) 20. Importantly, as reported in the original paper, this locus, unlike others, affects multiple aortic segments, with dilation of the thoracic and abdominal aorta. Transcriptionally, Mcam (also known as CD146) was significantly enriched in daSMCs relative to other vascular sites (Figure 6f). Within the tunica media, CD146/MCAM protein was found to decorate the extracellular space of individual vSMCs, particularly concentrated in the ventral side of the descending aorta. The aortic arch and carotid artery showed more limited distribution of the protein, with higher concentration in the innermost layers of the vessel (Figure 6g and 6h and Supplementary Figure 7). Upon further analysis of CD146/MCAM expression, we also noted uneven distribution across the circumference of the aortic segment. Descending aortic segments in the ventral aspect displayed a substantial increase in CD146/MCAM protein expression across all layers when compared to vSMC layers on the opposite, dorsal side of the vessel (Figure 6i).
Given that echocardiographic examination of the FAA1 families indicated the involvement of both the thoracic and abdominal aorta in aneurysms, we performed scRNA-seq profiling of vascular segments from dissecting abdominal aortic aneurysms (AAA) modeled by angiotensin II infusion (Figure 7a Taken together, these findings support the conclusion that expression of CD146/MCAM in the descending-abdominal aorta is protective against the development of AAA.

Discussion
Heterogeneity in smooth muscle cell populations has been established between distinct vessel types, with important physiological outcomes 25, and even within the same vascular wall segment, with pathological consequences 26,27. It is also acknowledged that SMCs are endowed with significant transcriptional plasticity, particularly when exposed to stressors 28-32. Nonetheless, heterogeneity based on distinct regions of the same vessel has only recently started to be explored 33. Single-cell RNA sequencing has emerged as a powerful research tool that provides multi-layered transcriptomic information at single-cell resolution, allowing for the deconvolution of heterogeneous tissue samples. Here, we sought to characterize the molecular signatures of adult vSMCs associated with carotid arteries, aortic arch and thoraco-abdominal (descending) aorta using scRNA-seq, taking into consideration the embryological origin of those sites and positional identity. Using this approach, we profiled 18,940 transcripts across 12,305 cells. While the bulk of the transcriptome was largely similar, 1045 transcripts (5.5%) showed regionalized expression patterns. Most of those transcripts exhibited increases in both expression levels and percentage of cells expressing the given transcript in the regional vSMC population, although a few were skewed in only one of the two patterns. The analysis provided the opportunity to explore whether these site-enriched transcripts bear potential relevance in disease emergence.
A question posed from the onset of this work was the intersection of embryological origin, topological identity and adaptation to physical forces. While distinct in structure, both the carotids and a large portion of the aortic arch share neural crest ancestry, hence the interest in including carotids in the analysis. Despite the common origin, scRNA-seq data showed clear segregation between vSMCs from the aortic arch and carotids; however, hierarchical clustering demonstrated that carotids were closer to the aortic arch than to the descending aorta. Also critical to the interpretation of these data are the hemodynamics unique to each vascular segment, which notably impact transcription. In fact, evaluation of the aortic arch revealed important adaptations to flow dynamics. Specifically, we identified two clearly distinct subclusters associated with the greater and lesser curvatures (subclusters 1 and 6), the latter in locations of highly turbulent flow. Interestingly, positional information, as per Hox gene expression, was retained and detectable by scRNA-seq. Hoxa genes, in particular, were highly informative in identifying the topological distribution of the subclusters and also helped us uncover three additional subclusters in the transition between the aortic arch (neural crest origin) and descending aorta (mesodermal origin), namely subclusters 3, 4 and 0.
Importantly, many of the region-specific genes identified have been previously associated with vascular disease (Supplemental Table 6). For example, Gucy1a3, a transcript enriched in the carotids, has been linked to moyamoya 21; Prdm6 and Tfap2b, both specific to the arch, were associated with patent ductus arteriosus 24,34; and Mfap5 was associated with a familial aortic aneurysm classified as thoracic but affecting the aortic root 35. These associations lend further support to the central hypothesis being tested, namely that localized, slightly skewed transcriptional expression patterns can be informative in exploring disease causation. Along these lines, and to further scrutinize the hypothesis, we overlaid the information captured by scRNA-seq to inquire (in silico) whether loci previously associated with vascular disease included any of the site-specific transcripts identified.
The Online Mendelian Inheritance in Man (OMIM) database is a manually curated, daily updated catalog of information for all known mendelian diseases, including known causative genes. Despite significant advancement in the identification of disease-causing mutations using next-generation sequencing, OMIM still contains several hundred mendelian disorders for which the underlying genetic mutation remains unknown 36. Thus, we had the opportunity to overlap the identified transcripts with region-specific diseases, such as moyamoya, patent ductus arteriosus, and site-specific aneurysms (Supplementary Table 5). Through this process, we captured 13 genes (Mcam, Atf5, Cyth2, Ppp1r15a, Tbl1x, Bhlhe4, Wwp1, Stk3, Ywhaz, Klf10, Azin1, Pabpc1, and Gem) that overlapped between site-specific enrichment (this work) and previously identified loci for localized vascular disease. While these overlaps are exciting, definitive proof of causation will require rigorous mechanistic experiments. Along these lines, we decided to undertake one such exploration in relation to Mcam and aneurysm.
Rupture of aortic aneurysms is a well-established cause of preventable cardiovascular death in the developed world. However, given the disease's asymptomatic presentation, improved methods for identification of individuals at risk are required to decrease disease-associated mortality. Substantial evidence suggests a major role for genetic susceptibility in the development of both thoracic and abdominal aortic aneurysms (TAA and AAA) 37. In fact, mutations in several structural genes coding for extracellular matrix and contractile proteins have been identified and replicated in animal models 38. Nonetheless, not all aneurysms showed mutations in the pool of causative genes identified so far; thus, either additional genes are yet to be identified and/or non-genetic causes of the pathology, such as inflammation, might be at fault. In addition, gene modifiers are also acknowledged to contribute to the pathology and to influence its severity. Our systematic in silico approach showed that Mcam mapped to the FAA1 locus 20 and was specifically reduced in smooth muscle cells at sites of aneurysms, further supporting a possible contribution of MCAM to the structural integrity of the tunica media.
Mcam (also known as CD146) is a transmembrane glycoprotein expressed by endothelial cells, pericytes, and smooth muscle cells that also functions as a co-receptor for PDGFRβ 39,40. During embryonic development, CD146/MCAM is highly expressed in early progenitors of smooth muscle cells, where it was shown to regulate the balance between vascular smooth muscle cell proliferation and differentiation 41.
Expression in adult smooth muscle appears to be restricted to sites subjected to flow-mediated stress, such as branches 41 and the ventral aspect of the dorsal aorta (this work). Our findings are consistent with a role for CD146/MCAM in resisting tissue stress and promoting resilience, a function in accord with its expression at vascular branches. Importantly, two recent publications have also attributed immunomodulatory functions to CD146/MCAM 42,43, a role of importance for the prevention of tissue breakdown by inflammatory cells, which are also known to promote or exacerbate aneurysm development. Definitive experimental support for the contribution of CD146/MCAM to the prevention of aneurysms came from the evaluation of disease development in the null mouse, which clearly showed earlier lethality and increased pathology in comparison to control littermates.
Overall, our work joins a growing list of scRNA-seq profiling studies aimed at improving our collective understanding of the transcriptional heterogeneity present in the vascular tree. Importantly, we further demonstrated that detailed scRNA-seq data can be leveraged to advance previous GWAS studies that linked loci to vascular disease.

Immunostaining and Confocal Microscopy
Formalin-fixed, paraffin-embedded specimens from aortic and carotid vessels were sectioned at 5 µm and incubated with primary antibodies overnight, followed by species-specific secondary antibodies for 1 hour prior to mounting in Prolong Gold (Thermo Scientific). Specific antibody information, including antibody concentration, vendor information and RRID, is included in Supplementary Table 1. A subset of antibodies (Tfap2b) was amplified using tyramide signal amplification kits (T20934) (Thermo Scientific) following the manufacturer's recommendations. Samples were evaluated and photographed using a confocal microscope (Nikon 1AR) equipped with 20X and 60X oil objectives. For a subset of candidates (Tfap2b and Mcam), samples were evaluated and photographed using an LSM880 confocal microscope (Carl Zeiss) equipped with a Zeiss Plan-Apochromat 20x/0.8 M27 objective for acquisition. Image quantification: Images were first processed with the Imaris file converter and then transferred to the Imaris 9.7 software. Visualization of all the channels was normalized across images from all three anatomic sites. For quantification of the relative average intensity of staining for each candidate gene, Imaris surfaces were generated using the aSMA+ channel (for cytoplasmic staining) with background subtraction for thresholding. The surfaces were approximately 10 voxels in size per surface across the entire image for all three regions. This approach allowed us to segregate and restrict our analysis of the channel intensity to only those regions in the image that expressed aSMA. Subsequently, following surface creation, all of the aSMA+ surfaces in the image of the anatomic location were selected, and the mean intensity of the site-specific candidate (red channel) per surface was extracted from the statistics table and exported as a new tab-delimited file.
This process of surface generation, selection and channel intensity statistic export was repeated for all 3 sites, per candidate gene investigated, if the staining pattern appeared predominantly cytosolic. If the staining of the candidate gene was nuclear, surfaces were generated using the DAPI channel (nuclear staining, Tfap2b) with background subtraction for thresholding. The surfaces were approximately 10 voxels in size per surface across the entire image for all three regions. Then, only those surfaces from nuclei that resided in the tunica media (nuclei in cells positive for aSMA) were manually selected for data export and analysis. Once exported from Imaris, the mean channel intensity files for the candidate genes (3 files per candidate gene) were merged into a table (1 table per gene, 11 tables total) and imported into R Studio for statistical testing and data visualization.
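The logic of restricting the intensity measurement to marker-positive regions, as described above, can be illustrated with a simplified calculation. This sketch is not the Imaris workflow: it applies a pixel-wise mask to a toy 2D image rather than generating surfaces of approximately 10 voxels, and the function name `masked_mean_intensity`, the background value and the threshold are assumptions chosen for illustration.

```python
def masked_mean_intensity(marker, candidate, background, threshold):
    """Mean candidate-channel intensity over marker-positive pixels.

    marker, candidate: 2D lists of equal shape (e.g. aSMA and red channel).
    A pixel is marker-positive when (marker - background) exceeds threshold.
    Simplification: pixel-wise mask instead of Imaris surface objects.
    """
    total = 0.0
    n_pixels = 0
    for marker_row, candidate_row in zip(marker, candidate):
        for marker_value, candidate_value in zip(marker_row, candidate_row):
            if marker_value - background > threshold:
                total += candidate_value
                n_pixels += 1
    if n_pixels == 0:
        raise ValueError("no marker-positive pixels in image")
    return total / n_pixels


# Toy 2 x 2 image: two pixels are marker-positive after background subtraction.
toy_marker = [[10, 200], [220, 5]]
toy_candidate = [[50, 100], [140, 60]]

toy_mean = masked_mean_intensity(toy_marker, toy_candidate, background=20, threshold=100)
```

In the toy image only the two bright marker pixels contribute, so the candidate signal from marker-negative regions is excluded from the mean.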
Single Cell RNA Sequencing: Sample Preparation for Site Enriched Transcriptomics: Thirty minutes following SQ injection of 400 IU of heparin, 3 C57BL/6J male mice were sacrificed and perfused with 10 mL of Versene. Following perfusion, the aorta and carotid vessels were harvested from the mouse and further dissected under a microscope to remove the adventitial layer. Subsequently, the three vascular fragments, namely carotid arteries (right and left common, not their branches), aortic arch (from the root to the subclavian branch), and descending aorta (below the subclavian up to the gonadal branches, but excluding all branches), were obtained, minced into small fragments and incubated under agitation in 1 mL digestion buffer containing freshly prepared liberase solution (2.5% Liberase TH from frozen stock [Sigma Aldrich], 5 Kunitz Unit/mL DNase1, 1M HEPES in 1X HBSS). All preparations were done at temperature except for the digestion, which was done at 37 °C for approximately 20 min. The suspension was neutralized with DMEM + 10% FBS, passed through a 40 µm filter and then centrifuged for 5 min at 10,000 x g. Following centrifugation, the cell pellet was resuspended in 0.4% BSA-PBS and assessed for viability prior to library preparation using trypan blue staining with the TC20 cell counter (BioRad). Only samples with more than 85% viability were used in the generation of libraries. Cells were partitioned with Gel Beads into emulsion in the Chromium instrument, where cell lysis and barcoded reverse transcription of RNA occurred, followed by amplification. Single-cell RNA-seq libraries were prepared using the Chromium single cell 3′ library and gel bead kit v3 (10x Genomics).
Single Cell RNA Sequencing: Sample Preparation for AAA mouse model: At sacrifice, three aneurysmal and three control aortas were pooled and digested for 1 h at 37 °C in 10 mg/ml Collagenase type II (C6885, Sigma Aldrich) and 1 mg/ml Elastase (LS002292, Worthington Biochemistry) and processed for single-cell RNA-sequencing.

Data Availability and Bioinformatic Analysis for Single Cell RNA Sequencing Data
The scRNA-seq datasets in this article are deposited in the Gene Expression Omnibus (GEO) international public repository under accession code GSE156731; the AAA dataset is available from Hadi 2018 45 .
Sequencing of all three libraries (positional transcriptomic dataset) was performed on the same run on an Illumina HiSeq 4000, and the digital expression matrix was generated by demultiplexing, barcode processing, and gene unique molecular index counting. Further processing of the scRNAseq data involved quality control, normalization, confounding factor identification, dimensionality reduction, and cell-gene level analysis. Preprocessing of the raw data was conducted following the Cell Ranger pipeline (10x Genomics). To identify different cell types and find signature genes for each cell type, the R package Seurat (version 3.2.2) was used to analyze the digital expression matrix. Cells with fewer than 500 unique molecular identifiers (UMIs) and greater than 10% mitochondrial expression were removed from further analysis. The Seurat function NormalizeData was used to normalize the raw counts. Variable genes were identified using the FindVariableGenes function, and the top 2000 variable genes were selected for further analysis. The Seurat ScaleData function was used to scale and center expression values in the dataset for dimensional reduction. Principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP) were used to reduce the dimensions of the data, and the first 2 dimensions were used in plots. A graph-based clustering approach was then used to cluster the cells; signature genes were subsequently found and used to define cell types for each cluster. The FindAllMarkers Seurat function was used to find positive markers of each vSMC population with the Wilcoxon rank-sum test. Pathway enrichment analysis was performed using Metascape. The top 10 categories for each of the clusters investigated are provided as Supplemental Excel File 4. Data visualizations (graphs) were generated using the ggplot2 package for R.
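As an illustration only, the cell-level quality control and count normalization described above can be sketched in plain Python. This is not the Seurat code used in the study: the function names, the toy counts, and the "mt-" prefix for mitochondrial genes are assumptions, a cell is assumed to be removed if it fails either threshold, and the normalization assumes the log(1 + x) transform with a scale factor of 10,000 that NormalizeData applies by default.

```python
import math

# Thresholds taken from the text; everything else here is hypothetical.
MIN_UMIS = 500
MAX_MITO_FRACTION = 0.10
SCALE_FACTOR = 10_000  # assumed Seurat default for log-normalization


def mito_fraction(counts):
    """Fraction of a cell's UMIs that come from genes prefixed 'mt-' (assumed naming)."""
    total = sum(counts.values())
    mito = sum(c for gene, c in counts.items() if gene.startswith("mt-"))
    return mito / total if total else 0.0


def passes_qc(counts):
    """Keep cells with at least MIN_UMIS UMIs and at most MAX_MITO_FRACTION mitochondrial UMIs."""
    return sum(counts.values()) >= MIN_UMIS and mito_fraction(counts) <= MAX_MITO_FRACTION


def log_normalize(counts):
    """Scale each gene by the cell total, multiply by SCALE_FACTOR, then take log(1 + x)."""
    total = sum(counts.values())
    return {gene: math.log1p(c / total * SCALE_FACTOR) for gene, c in counts.items()}


# Toy UMI counts per cell (invented values, for demonstration only).
cells = {
    "cell_a": {"Acta2": 600, "Myh11": 300, "mt-Co1": 50},   # passes both filters
    "cell_b": {"Acta2": 200, "Myh11": 100, "mt-Co1": 20},   # too few UMIs
    "cell_c": {"Acta2": 400, "Myh11": 200, "mt-Co1": 400},  # too much mitochondrial signal
}

kept = {name: log_normalize(counts) for name, counts in cells.items() if passes_qc(counts)}
print(sorted(kept))
```

In this toy input only one of the three cells survives both filters; the real analysis operates on the full digital expression matrix inside Seurat.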
Site-Specific Gene Signature determination: The FindAllMarkers Seurat function was used to find positive markers of each vSMC population with the Wilcoxon rank-sum test for each anatomic location (3; Supplemental Excel File 1) and subcluster population (6; Supplemental Excel File 3). Following gene list generation, the datasets were filtered for significance (padj ≤ 0.05) and for the presence of genes shared between anatomic sites. Duplicate genes, if identified, were allocated to the anatomic location (1,045 genes) or vSMC subcluster (1,681 genes) with the most significant padj.
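The filtering and duplicate-allocation rule described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure as run by the authors: the function name, the tuple layout, and the gene names and padj values are all hypothetical.

```python
def allocate_markers(marker_rows, alpha=0.05):
    """Assign each significant gene to the single group with the smallest padj.

    marker_rows: iterable of (gene, group, padj) tuples, e.g. one row per
    gene and anatomic location from a marker table (layout is assumed).
    Returns a dict mapping gene -> (group, padj).
    """
    best = {}
    for gene, group, padj in marker_rows:
        if padj > alpha:
            continue  # drop markers that fail the padj <= 0.05 filter
        if gene not in best or padj < best[gene][1]:
            best[gene] = (group, padj)  # keep the most significant assignment
    return best


# Invented rows: GeneA is a marker at two sites, GeneB is not significant.
rows = [
    ("GeneA", "carotid", 0.001),
    ("GeneA", "aortic_arch", 0.0001),
    ("GeneB", "descending_aorta", 0.2),
    ("GeneC", "descending_aorta", 0.04),
]

signature = allocate_markers(rows)
for gene, (group, padj) in sorted(signature.items()):
    print(gene, group, padj)
```

Each gene ends up in exactly one group, which is what allows the location and subcluster lists to be compared gene by gene afterwards.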
Subcluster and Anatomic Location Signature overlap: To identify shared signature genes, the final gene lists for both anatomic location (1,045) and vSMC subcluster (1,681) were overlaid individually using the duplicate identification functions in Excel. Genes were characterized as one of 3 major categories: unique to anatomic location, unique to subcluster signature, or shared between a specific anatomic location and a specific vSMC subcluster. From these data, a tabular matrix was constructed with all unique permutations of the 3 major categories (31 pattern combinations were found), recording the number of genes in each category. Data visualization was performed using the tidyverse and ggalluvial packages in R.
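The three-way categorization and the pattern counts can be sketched in a few lines of Python. This is a hypothetical re-expression of a step that was performed in Excel; the dictionary layout, category labels, and gene names are assumptions made for illustration.

```python
from collections import Counter


def classify_overlap(location_sig, subcluster_sig):
    """Label every gene as location-only, subcluster-only, or shared.

    Both arguments map gene -> assigned group. Returns (categories, patterns),
    where patterns counts genes per (location, subcluster) combination and
    None marks the list in which a gene is absent.
    """
    categories = {}
    patterns = Counter()
    for gene in sorted(set(location_sig) | set(subcluster_sig)):
        loc = location_sig.get(gene)
        sub = subcluster_sig.get(gene)
        if loc is not None and sub is not None:
            categories[gene] = "shared"
        elif loc is not None:
            categories[gene] = "unique_to_location"
        else:
            categories[gene] = "unique_to_subcluster"
        patterns[(loc, sub)] += 1
    return categories, patterns


# Invented signatures, one assigned group per gene.
location_sig = {"GeneA": "aortic_arch", "GeneB": "carotid", "GeneC": "descending_aorta"}
subcluster_sig = {"GeneA": "cluster_2", "GeneC": "cluster_2", "GeneD": "cluster_5"}

categories, patterns = classify_overlap(location_sig, subcluster_sig)
for pattern, n_genes in sorted(patterns.items(), key=str):
    print(pattern, n_genes)
```

A table of this shape (one row per pattern with its gene count) is the kind of input an alluvial plot expects.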
Gene ontology analysis: Pathway enrichment analysis was performed using Metascape express analysis with default options. Metascape utilizes the hypergeometric distribution in the calculation of significance for each gene enrichment category. Scores (log p-value) are reflective of the number of genes in the gene list defined by the category in question relative to the total number of genes in the category relative to background. From the output, the top 10 summary gene ontology categories (as ranked by p-value of enrichment) for each of the anatomic locations were used for visualization of gene ontology following curation and removal of duplicate or highly similar terms (Supplemental Excel File 4). From these data, two tabular matrices were constructed. The first matrix (the vertex file) contained the fold enrichment for the individual gene ontology categories as well as top gene ontology category membership, fold enrichment for anatomic location, and p-value for anatomic location. The second matrix (the edge file) contained all gene ontology memberships for visualized genes. Both matrices were then visualized as a custom ontology network using a combination of the ggplot2, igraph, and ggraph packages for R.
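For readers unfamiliar with the hypergeometric test, the calculation behind an enrichment p-value can be sketched as below. This is a generic textbook formulation in Python and not Metascape's implementation; the function names and the toy numbers are assumptions. With a background of N genes, K of which belong to a category, and a gene list of n genes containing k category members, the p-value is the probability of observing k or more members by chance.

```python
import math


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: category genes in the list, n: list size,
    K: category size in the background, N: background size.
    """
    denominator = math.comb(N, n)
    upper = min(n, K)
    numerator = sum(math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / denominator


def fold_enrichment(k, n, K, N):
    """Observed category fraction in the list divided by the background fraction."""
    return (k / n) / (K / N)


# Toy numbers chosen only to keep the arithmetic easy to follow.
p_value = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
print(p_value, math.log10(p_value), fold_enrichment(k=2, n=4, K=5, N=20))
```

Reporting the logarithm of the p-value, as in the scores described above, simply makes very small probabilities easier to rank and plot.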
"In-silico" forward genetic screen: Genomic loci for Mendelian vascular anomalies were identified through query, integration, and manual curation of data from multiple online biomedical databases as follows.
Given the complexity of Mendelian disease nomenclature, vascular diseases/traits were identified using the Human Phenotype Ontology (HPO) database (https://hpo.jax.org/app/) with the search term "vascular anomalies" (HP:0002597), which generated 1043 entries. HPO entries were filtered to sub-select those entries represented in Online Mendelian Inheritance in Man (OMIM) (omim.org). Of the 540 OMIM entries, 193 entries were classified as phenotypic series entries, which are used to identify those disorders with known genetic heterogeneity. Given that the initial HPO query restricted OMIM entries to those phenotypic series entries with known disease-gene associations, we reintegrated the missing entries for each disease/trait, representing 924 additional entries. These 1464 entries were then manually curated, and entries were excluded based on the following criteria: 1) predominance of vascular feature, 2) known disease gene, 3) case reports, 4) chromosomal abnormalities, 5) association or ambiguous mapping to genome. This exclusion left 16 entries representing 16 unique genomic loci and 6 distinct vascular traits. Using the published genomic markers that defined the critical region (either STS or SNPs depending on the publication/locus in question), the coordinates of these loci were translated to hg38 using the STS track on the UCSC genome browser. Using the UCSC table browser (https://genome.ucsc.edu/), we queried the UCSC SQL database to identify all genes residing in each of the given loci using the knownGene table based on the most current build of the human genome (hg38). The "name" column data from the knownGene table for each query was then used to pull data from the kgXref table to convert gene identifiers to HUGO approved gene symbols. The gene symbols from the regional loci data were then overlaid with the scRNASeq anatomic origin markers for each of the 3 locations.
Genes were identified as positive hits if they resided within the genomic coordinates of the loci in question and defined vSMCs in the arterial bed relevant to the trait.
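The hit-calling rule above amounts to an interval query followed by a set intersection, which can be sketched as follows. This Python example is illustrative only and does not reproduce the UCSC queries: the coordinates, gene names, and function name are invented, and "resided within" is assumed to mean that the whole gene lies inside the locus.

```python
def genes_in_locus(locus, gene_coords):
    """Return the genes that lie entirely inside a locus.

    locus: (chromosome, start, end); gene_coords: gene -> (chromosome, start, end).
    Full containment is an assumption; a looser rule would also accept partial overlap.
    """
    chrom, start, end = locus
    return {
        gene
        for gene, (g_chrom, g_start, g_end) in gene_coords.items()
        if g_chrom == chrom and g_start >= start and g_end <= end
    }


# Hypothetical locus and gene coordinates (not real hg38 positions).
locus = ("chr11", 1_000, 5_000)
gene_coords = {
    "GeneA": ("chr11", 1_200, 1_800),  # inside the locus
    "GeneB": ("chr11", 4_900, 5_600),  # extends past the locus boundary
    "GeneC": ("chr5", 1_200, 1_800),   # different chromosome
    "GeneD": ("chr11", 2_000, 2_500),  # inside the locus, but not a marker
}

# Hypothetical vSMC markers for the arterial bed relevant to the trait.
site_markers = {"GeneA", "GeneB", "GeneC"}

positive_hits = genes_in_locus(locus, gene_coords) & site_markers
print(sorted(positive_hits))
```

Only genes that satisfy both conditions are retained, which mirrors the two-part definition of a positive hit.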
Statistics: For image analysis and animal studies, statistical tests were performed using the stats and rstatix packages. Gaussian distribution of the data was assessed with the Shapiro-Wilk test. Normally distributed data were compared by unpaired t-test (two-group comparison). Data that departed from the Gaussian distribution were analyzed using the Kruskal-Wallis test (multiple groups).
Cell type diversity across three vascular beds. a. Graphical description of experimental design. Carotid arteries (green), aortic arch (red) and descending aorta (blue) were dissected from three C57BL/6J male mice, combined, and enzymatically dissociated to obtain single cell suspensions for scRNA-seq using the 10x Genomics platform. b. Image of an intact, isolated mouse aorta and carotid vessels following removal of the adventitia and prior to enzymatic digestion. Relative locations prior to dissociation are
Boxplot visualizing the max descending aorta (non-aneurysmal) diameter following 28d of angiotensin II