A high-resolution brain proteome map uncovers the Inter-hemispheric laterality & Inter-regional protein expression changes

The human brain has always been a black box full of mysteries. Here we present one of the most comprehensive proteomics investigations of the brain, focusing on inter-hemispheric differences. An extensive mass spectrometry-based analysis of 19 brain regions from both the left and right hemispheres measured more than 3300 proteins and 38700 peptides. These high-resolution data provide comprehensive coverage of experimentally measured (non-hypothetical) proteins across various regions to characterize inter-hemispheric differences. We also examined the brain proteins in terms of synapse analysis and investigated the expression of protein markers of neuroanatomical allied regions and brain disorders across the 19 regions and sub-regions of the brain. Furthermore, we have developed the most comprehensive Brain Proteome Database, based on our own and publicly available curated data, representing more than 9000 proteins (with isoforms) and around 90000 peptides at www.brainprot.org, which can aid in understanding the human brain’s complexity.


Introduction
The human brain is the primary organ of the central nervous system, reflected in its multitude of neurons and glial cells, each of which comprises hundreds of different subtypes defined chiefly by their molecular, morphological and connectivity properties 1 . The various synaptic connections between these cell types contribute to the definition of neuro-anatomical sub-divisions in the adult human brain, which are estimated to number around 900 and are grouped under two distinct hemispheres. The average weight and overall size of the left and right hemispheres are very similar in healthy adults, but the two differ at the functional and anatomical levels 2 . Regarding brain laterality, the left hemisphere specializes in language processing, while the right hemisphere specializes in attention, visuospatial tasks, and many aspects of emotion. Lateralization is accompanied by structural asymmetries that vary with handedness, gender, age, and a variety of genetic, neuropsychiatric, and neurodegenerative conditions [3][4][5] . The functionality of these hemispheres and of the different neuroanatomical regions of the brain has been revealed in terms of specific regional transcriptional signatures that are regulated in a spatiotemporal manner across mammalian brains [6][7][8][9][10][11] .
The Human Brain Proteome Project (HBPP), under the patronage of the Human Proteome Organization (HUPO), has initiated, established and conducted many proteomics-based studies, which have provided detailed insights into multiple neuroanatomical regions and neuronal compartments (such as neuromelanin granules, mitochondria, nucleus, membranes, axogliasomes, postsynaptic density and myelin) 12,13 . This initiative has created a draft of a human brain proteome database and a sub-cellular reference proteome of a healthy human brain 14 . Proteins regulate most biological functions, and the systematic application of neuroproteomics is needed to functionally interpret the information provided by transcriptome-wide approaches 1 . Until now, all proteomic characterizations performed in the human brain have been unilateral, leaving a gap in our knowledge of the functional proteomic lateralization of the brain 15 . This study aims to bridge that gap and extend the initiatives of the HBPP by performing proteome-based investigations of 19 neuro-anatomical regions from both the left and right hemispheres. Additionally, the findings of the inter-region analysis have been integrated with publicly available proteomic datasets of different neuroanatomical regions of the human brain. We have also checked the status of popular marker proteins of Cerebrospinal Fluid (CSF), Meninges, Spinal Cord, brain cells and different brain disorders in the different neuroanatomical regions of the brain. Finally, we have developed an online portal with all the proteins and peptides, which can be used by neuroscientists and researchers to accelerate inter-hemispheric and neuroanatomical region-based brain studies.
In summary, the widespread implementation and establishment of interhemispheric proteomic innovations should advance knowledge about the proteomic portrait of the human brain, leading to novel developments in neurological differential diagnosis and therapies.

Results
Interhemispheric brain region-based proteome profiling
A human brain proteome map was generated by including the proteomic profiles of 19 different neuroanatomical regions of both the left and right hemispheres. The samples were run unfractionated, primarily so that our findings could be validated using targeted proteomics strategies in the future (Figure 1). The analysed data of both the left and right hemispheres were later integrated, providing a total of more than 3300 proteins and 38700 peptides at 1% FDR (Supplementary Fig. 1a, Supplementary Data 1). The Human Protein Atlas (HPA), the Allen Brain Atlas and the literature were consulted to group the regions 16,17 .

Chromosome Map of Interhemispheric brain proteome
A total of 3318 proteins were mapped to their corresponding chromosomes (22 Chromosomes, X, Y, Mitochondrial, Unplaced) using the neXtProt repository as a reference, and represented in the form of a circos plot to provide better visualization of the percentage coverage of chromosomes and the chromosomal distribution of the data (Supplementary Fig. 1c, Supplementary Data 2). The brain proteome chromosome map was further compared with the total human proteome chromosome map using bar plots, which showed that both follow a similar trend (Supplementary Fig. 1d).

Comparative Analysis of Neuroanatomical regions of Human Brain
The neuroanatomical region-based proteomics data of this study were taken forward, and a heat map was drawn with the region-enriched protein intensities (Figure 2a). A few of the proteins in the heatmap were found to be equally enriched in more than one region or sub-region of the human brain. A list of 549 common proteins was found to be expressed in the duplicates of both hemispheres in all 19 regions and sub-regions of the brain (Supplementary Fig. 1a, Supplementary Data 1). Neurofascin (NFASC), Glia maturation factor beta (GMFB), Glial fibrillary acidic protein (GFAP) and Neurochondrin (NCDN) are among the well-known human brain proteins that belong to the common proteins identified in this study; they are shown in Figure 2b. The proteins in these heatmaps were found to be differentially expressed when the left and right hemispheres were analysed. Cerebellar Vermis-enriched proteins such as Cerebellin 1 precursor (CBLN1), Cerebellin 3 precursor (CBLN3), Purkinje cell protein 2 (PCP2), Neurexin 1 (NRXN1) and Neurexin 3 (NRXN3) were found to be upregulated in the right hemisphere compared with the left hemisphere (Figure 3C). A list of proteins differentially regulated between the left and right hemispheres of the Dentate Gyrus of the Hippocampus mapped to Generation of Neurones (GO:0048699) with an FDR of 1.74e-09 in STRING. Most of the mapped proteins were found to be upregulated in the right hemisphere, which was also confirmed using regression analysis (Figure  and 3C). Furthermore, the total brain proteins were analysed in NetworkAnalyst, using KEGG as the background database, to understand the synapse pathways under Nervous system (hsa09156). Glutamatergic synapse (hsa04724, p-value: 3.25e-6), GABAergic synapse (hsa04727, p-value: 1.05e-6), Cholinergic synapse (hsa04725, p-value: 0.0013), Dopaminergic synapse (hsa04728, p-value: 3.76e-8) and Serotonergic synapse (hsa04726, p-value: 0.0077) are shown in the form of a bipartite network with the mapped genes (Figure 4B).
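The synapse pathway p-values reported here come from an over-representation analysis. As a rough illustration of how such a p-value can be obtained, the Python sketch below uses a one-sided hypergeometric test, a common choice for this kind of enrichment. Whether NetworkAnalyst applies exactly this test and background is an assumption, the function name is hypothetical, and the counts in the usage lines are invented for illustration rather than taken from this study.

```python
from math import comb


def enrichment_p_value(background, pathway_size, hits_total, hits_in_pathway):
    """One-sided hypergeometric p-value, P(X >= hits_in_pathway).

    background      -- number of genes in the reference set
    pathway_size    -- number of background genes annotated to the pathway
    hits_total      -- number of genes in the query list
    hits_in_pathway -- number of query genes annotated to the pathway
    """
    upper = min(pathway_size, hits_total)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, hits_total - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return tail / comb(background, hits_total)


# Hypothetical counts for illustration only (not values from the study).
p = enrichment_p_value(background=20000, pathway_size=114,
                       hits_total=3318, hits_in_pathway=45)
print(f"p-value = {p:.3g}")
```

Exact integer arithmetic keeps the tail sum stable for large gene sets. In practice the resulting p-values would also be corrected for multiple testing across pathways.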

Expression of Neuroanatomical area and Disease markers in Human Brain regions
Protein markers of the neuroanatomical allied regions, such as the spinal cord, CSF, Meninges, Pituitary and brain cells, were curated from the literature. Two major types of brain-related disorders, brain tumours and neurodegenerative disorders (NDD), were considered for the disease marker analysis. The popular proteins for each of these diseases were chosen with a p-value cut-off of 0.05 from Pubpular 20 .
A literature-based search and a DisGeNet 21 search were also used to verify the list. Furthermore, the lists of disease-based protein markers and neuroanatomical allied region markers were taken forward to examine their expression in normal human brain regions, as shown in Supplementary Fig. 3D (Supplementary Data 4). However, the analysis of the expression of the large number of brain disease-based proteome markers in human brain regions is still ongoing and should give a better understanding in future.

Development of Inter-Hemispheric Brain Proteome Map (IBPM) Portal
A web-based portal was designed to provide easy access to the inter-hemisphere-based human brain proteomics data and to region-specific proteomic data integrated with publicly available datasets. This information is publicly available at www.brainprot.org, which includes a search tool for comparing proteins between the left and right hemispheres and for fetching proteomics information in an interactive and user-friendly environment.

Discussion
An increase in the knowledge of brain disorders requires a deep understanding of the human brain. A large number of disease-related studies depend on control brain samples to compare basal-level protein expression, which is always challenging. Popular databases and repositories such as the Allen Brain Atlas and the Human Protein Atlas have taken vast initiatives to provide a brain reference map in terms of microarray-based gene expression data and transcriptomics data, respectively, but a gap remains in terms of comprehensive mass spectrometry-based proteomics data. This study attempts to fill some of the gaps in our knowledge by providing a neuro-anatomical region-based brain proteome map. Different neuroanatomical regions of the brain are known to be functionally different from each other despite being part of the same organ. The differential expression of genes and proteins, together with the diverse functionality of the different regions of the brain, makes it the most complex organ of the body. This study has, for the first time, demonstrated the differential expression of the proteome in 19 different neuro-anatomical regions and also in terms of brain lateralization. The 549 common proteins can be considered brain-enriched proteins, as they were found in all 19 regions with maximum confidence; 137 of them match the proteins listed as elevated in brain in the Human Protein Atlas. Our investigation has also found proteins that are enriched in one region or in a group of anatomically similar regions of the human brain. Hippocalcin (HPCA), a neuron-specific calcium-binding protein, showed higher expression in the Basal Ganglia and also in the Hippocampus. According to GTEx mRNA differential expression data for normal tissues, HPCA is likewise overexpressed in the Putamen and Caudate Nucleus of the Basal Ganglia, followed by the Hippocampus.
Protein Phosphatase 1 Regulatory Inhibitor Subunit 1B (PPP1R1B), which plays an important role in neurotransmitter stimulation, was also found to be overexpressed in the Basal Ganglia when compared with other regions. PHD Finger Protein 24 (PHF24) and ATPase H+ Transporting V0 Subunit C (ATP6V0C) were found to be overexpressed in the Cerebral cortex in this study, which correlates with the Brain Atlas of the Human Protein Atlas (HPA). Olfactory marker protein (OMP), which plays a crucial role in olfactory signaling and odor discrimination, was identified only in the Olfactory Bulb in this study. This protein has been reported as a key player in accelerating the maturation of olfactory sensory neurons (OSN) 23 . N-Terminal Xaa-Pro-Lys N-Methyltransferase 1 (NTMT1) was identified in the brain stem and substantia nigra, whereas Serine And Arginine Rich Splicing Factor 11 (SRSF11) was identified only in the brain stem. The expression of these two proteins correlates with the RNA expression data of the Brain Atlas of the Human Protein Atlas (HPA), although SRSF11 overexpression was shown there in the Cerebellum followed by the Midbrain. NUAK family kinase 1 (NUAK1), identified in the Frontal cortex, Parietal cortex and Occipital cortex of the cerebral cortex, is known to play an important role in cell adhesion, cell proliferation and tumor progression. GTEx mRNA differential expression data report an overexpression of NUAK1 in the Frontal cortex in normal tissues. A recent study in a mouse model revealed that Nuak1 could be considered a novel therapeutic entry point for tauopathies 24 . The basal ganglia-specific protein Ankyrin Repeat Domain 63 (ANKRD63) was identified in the Putamen, Lenticular Nucleus and Caudate Nucleus in our study. A previous study revealed prominent expression of ANKRD63 during mouse brain development 26 . Chimerin 2 (CHN2), a GTPase-activating protein, was identified only in the Cerebellar vermis in this study.
GTEx mRNA differential expression data showed that CHN2 is overexpressed in the Cerebellum and Cerebellar Hemisphere. These region-based proteins could be used as region-based anatomical protein markers, as their overexpression in this study is mostly supported by mRNA expression reported in GTEx and the Brain Atlas of HPA (Human Protein Atlas).
This study has also investigated, for the very first time, interhemispheric proteomic expression in 19 different regions of the human brain and found several differentially expressed proteins. We characterized this by measuring hemisphere-enriched proteins in the Basal Ganglia, Cerebral cortex and Brain stem. We also found Cerebellar vermis-enriched proteins such as NRXN1, NRXN3, PCP2, CBLN1 and CBLN3, which also show high expression in the RNA data of the Cerebellum in HPA. These proteins play an important role in synaptic signal transmission and were found to be upregulated in the right hemisphere in our study 26,27,28 . Our proteome-level investigation has also examined biological pathways such as adult hippocampal neurogenesis. Neurogenesis is the process of generating new neurones, which are important for emotion, depression and cognition 29,30 . The hippocampus, especially the dentate gyrus (DG), plays an important role in adult neurogenesis and behavioural discrimination 31, 32 . We mapped the dentate gyrus proteome to generation of neurones (GO:0048699), a biological process, which yielded 18 proteins differentially regulated between the left and right hemispheres. Seventeen of these proteins, including Growth Associated Protein-43 (GAP43), Synapsin-1 (SYN1) and synaptosomal-associated protein 25 (SNAP25), were found to be upregulated in the right hemisphere when compared with the left hemisphere. GAP43 plays an important role in neurogenesis, and failure of its expression leads to apoptosis of neurones 33 . SYN1 and SNAP25 belong to the hippocampal synaptoproteome, which regulates neurotransmission 34 . The overexpression of these dentate gyrus neurogenesis proteins points towards an active role of the right hemisphere, but the result needs validation in a larger cohort of samples.
A large number of studies that depend on human control brain regions can refer to this proteome map to get an overview of region-based proteomic expression. Furthermore, this study has also included different repositories and published datasets to make it the most comprehensive compendium of human brain proteins. Alongside the available Human Genome and neuroanatomical region-based transcriptomics data, our proteomic data corroborate these assets and would serve as a reference map for neurobiologists and brain disorder studies. This study of neuroanatomical region-based brain lateralization might help to accelerate the understanding of the complexity of the brain by researchers and clinicians working worldwide. In light of the detection of lateralized protein expression in post-mortem adult brain data, the future study of embryonic, fetal, and developmental stages in combination with single-cell proteomics will allow a more comprehensive protein atlas of the human brain to be obtained.

The remaining brain areas were fixed with 4% buffered formalin for 4 weeks, and some other selected regions were embedded in paraffin and stained with haematoxylin and eosin following the BrainNet Europe guidelines 35 . Protein expression of Tau, β-amyloid, TDP-43, α-synuclein, ubiquitin, and αB-crystallin was analysed by immunohistochemistry using specific antibodies across different brain regions. The microscopic study demonstrated minor neuronal hypoxic changes in the temporal and hippocampal areas. No neuropathological signs of neurodegenerative disorders were found on pathological examination of the subject.
Sample preparation for LFQ analysis
Protein extraction was performed as previously described 36 . Brain samples were homogenized in lysis buffer containing 7 M urea, 2 M thiourea and 50 mM DTT supplemented with protease and phosphatase inhibitors. Then, samples were spun down at 100,000×g for one hour at 15 °C. After protein precipitation, protein concentration in the supernatant was measured with the Bradford assay kit (Biorad). Protein enzymatic cleavage was carried out with trypsin (Promega; 1:20, w/w) at 37 °C for 16 h. Peptides were desalted (C18 Sep-Pak columns, Merck Millipore) and vacuum dried. Peptides were reconstituted in 0.1% formic acid, and peptide quantification was performed using the Scopes method 37 .
Liquid chromatography tandem mass-spectrometry (LC-MS/MS)
1 µg of digested and desalted peptides was loaded onto the column at a flow rate of 5 µl/min. Peptides were resolved on an analytical column at a flow rate of 300 nl/min over a 180 min gradient of solvent B (80% ACN in 0.1% FA). Mass spectrometric acquisition was performed in DDA (Data Dependent Acquisition) mode over a full scan range of 375-1700 m/z with an Orbitrap Fusion mass analyser at a mass resolution of 60,000 in the Mass Spectrometry Facility at IIT Bombay (MASSFIITB). The mass window was set to 10 ppm with a dynamic exclusion duration of 40 s. All MS/MS spectra were acquired by HCD (High energy Collision Dissociation) fragmentation. Each sample was run in duplicate in a randomized order to prevent run-to-run bias and batch variation.
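To make the 10 ppm mass window concrete, the short Python sketch below converts a parts-per-million tolerance into absolute m/z bounds. It assumes the window is applied symmetrically around the reference m/z (the text does not state this), and the function names are hypothetical helpers, not part of any instrument software.

```python
def ppm_window(mz, ppm=10.0):
    """Return the (low, high) m/z bounds of a symmetric ppm tolerance window."""
    delta = mz * ppm / 1_000_000  # absolute tolerance in m/z units
    return mz - delta, mz + delta


def within_tolerance(observed_mz, reference_mz, ppm=10.0):
    """True if observed_mz lies inside the ppm window around reference_mz."""
    low, high = ppm_window(reference_mz, ppm)
    return low <= observed_mz <= high


# At m/z 1000 a 10 ppm tolerance corresponds to +/- 0.01 m/z units.
low, high = ppm_window(1000.0)
print(f"{low:.4f} to {high:.4f}")
```

Because the tolerance scales with m/z, the same 10 ppm setting is narrower in absolute terms at the low end of the 375-1700 m/z scan range than at the high end.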

LFQ data analysis in MaxQuant
The raw datasets were processed with MaxQuant (v1.6.5.0) against the UniProt Human Proteome Database (Proteome ID: UP000005640) and searched with the built-in Andromeda search engine of MaxQuant 38 . Raw files were processed with Label-Free Quantification (LFQ) parameters, setting the label type to "standard" with a multiplicity of 1. The instrument type was set to Orbitrap Fusion mode. Trypsin was specified as the digestion enzyme with a maximum of 2 missed cleavages. Carbamidomethylation of Cysteine (+57.021464 Da) was set as the fixed modification, whereas oxidation of Methionine (+15.994915 Da) was set as the variable modification. The False Discovery Rate (FDR) was set to 1% at the protein and peptide levels to ensure high reliability of protein detection/identification. The minimum peptide length was set to 7 amino acids. Decoy mode was set to "randomize", and the type of identified peptides was set to "unique+razor".
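The digestion settings above (trypsin, up to 2 missed cleavages, minimum length of 7 amino acids) determine which peptides the search engine considers. The Python sketch below illustrates that rule on a made-up sequence. It is not MaxQuant's implementation; it assumes cleavage after K or R except when the next residue is P, and the function names and example sequence are hypothetical.

```python
def cleavage_fragments(protein):
    """Split a sequence after every K or R that is not followed by P."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):  # C-terminal remainder without a cleavage site
        fragments.append(protein[start:])
    return fragments


def tryptic_peptides(protein, missed_cleavages=2, min_length=7):
    """List peptides with up to `missed_cleavages` skipped sites and a length filter."""
    fragments = cleavage_fragments(protein)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages + 1, len(fragments))
        for j in range(i, last):
            peptide = "".join(fragments[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


# Made-up sequence for illustration; "TYNLAK" is dropped by the length filter.
for peptide in tryptic_peptides("MAGTSLLKDEVRPQWERTYNLAK"):
    print(peptide)
```

Allowing missed cleavages enlarges the search space, which is one reason the identifications are then controlled at 1% FDR.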

Data quality check and analysis
The analysis of 21 samples showed that the left hemisphere Motor Cortex (MCx) sample did not yield an adequate number of proteins. The sample was re-run for confirmation, and the MCx sample was then excluded from the study. The MaxQuant analysis was performed with the remaining 20 samples, and the data were taken forward for further analysis.
Statistical and bioinformatics analyses of the spectra and proteomics data were performed using Metaboanalyst 39 , Microsoft Excel, Python, and R statistical software. No missing value imputation was performed. Hierarchical clustering was performed on z-score transformed LFQ intensities using Euclidean distance in Metaboanalyst 39 and Hierarchical Clustering Explorer 40 (Version 3.5). PANTHER 41 , WebGestalt 42 and SynGO were used for pathway enrichment and synapse biology analysis based on Gene Ontology (GO). STRING and NetworkAnalyst were used for protein-protein interaction and enrichment analysis.
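The preprocessing behind the hierarchical clustering can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used by Metaboanalyst or Hierarchical Clustering Explorer: each protein (row) is z-scored across samples using the sample standard deviation, distances are computed between sample columns, and the intensity values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def zscore(values):
    """Centre a protein's intensities on 0 and scale to unit sample variance."""
    m, s = mean(values), stdev(values)  # a constant row (s == 0) would need filtering first
    return [(v - m) / s for v in values]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sample_distance_matrix(rows):
    """Z-score each protein row, then compute pairwise distances between sample columns."""
    scaled = [zscore(row) for row in rows]
    columns = list(zip(*scaled))
    return [[euclidean(a, b) for b in columns] for a in columns]


# Hypothetical log-scale LFQ intensities: 3 proteins (rows) x 4 samples (columns).
intensities = [
    [24.1, 24.3, 27.9, 28.2],
    [30.0, 29.8, 26.1, 26.4],
    [22.5, 22.4, 22.9, 23.0],
]
for row in sample_distance_matrix(intensities):
    print(" ".join(f"{d:5.2f}" for d in row))
```

An agglomerative clustering step would then repeatedly merge the closest samples or clusters according to this distance matrix.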
Regression/Residual Analysis of inter-hemispheric data
A simple bivariate linear regression analysis was performed to identify proteins significantly over- or under-expressed in a particular sample. In this approach, the protein expression levels were regressed for pairs of samples. For example, the expression levels for a replicate of AMY were put on the Y-axis, and the expression levels for another, non-AMY sample were put on the X-axis. The regression analysis was conducted in MS Excel, and standard residual values were calculated for each protein for the regression. This approach was repeated for all replicates in a pair-wise fashion, regressing all of the sample profiles in the region against all sample profiles outside the region. The mean residual value was then calculated for a given region. The advantage of this approach is that it combines the power of a volcano plot with the ability to examine replicate-to-replicate variation and region-to-region variation.
For each regression, the distribution of residuals and the correlation coefficients were examined. Raw regression results are presented in the supplementary data. This approach was used not only to look at the region-specific proteins but also to examine the Left versus Right expression for each region. In this way, the Left versus Right differential expression could be examined across all regions.
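The pair-wise regression and residual averaging described above can be sketched in Python as follows. This is a minimal illustration, not the Excel workflow used in the study: residuals are standardized by dividing by their sample standard deviation, which is an assumption about the scaling, and the function names and intensity values are hypothetical.

```python
from itertools import product
from statistics import mean, stdev


def standardized_residuals(x, y):
    """Regress y on x by ordinary least squares; return one standardized residual per protein."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sd = stdev(residuals)  # sample standard deviation; this scaling is an assumption
    return [r / sd if sd > 0 else 0.0 for r in residuals]


def mean_region_residuals(in_region, out_region):
    """Average standardized residuals over every in-region (Y) versus out-of-region (X) pair."""
    runs = [standardized_residuals(x, y) for y, x in product(in_region, out_region)]
    return [mean(values) for values in zip(*runs)]


# Hypothetical log-scale intensities for 5 proteins; the last is elevated in the region.
amy = [[20.0, 22.0, 24.0, 26.0, 30.0], [20.2, 21.9, 24.1, 26.0, 29.8]]
other = [[20.1, 22.1, 23.9, 26.1, 26.0], [19.9, 22.0, 24.0, 25.9, 26.2]]
for index, value in enumerate(mean_region_residuals(amy, other)):
    print(f"protein {index}: mean standardized residual {value:+.2f}")
```

The same function applies to the Left versus Right comparison by passing the left-hemisphere replicates of a region as one group and the right-hemisphere replicates as the other. Proteins absent from either sample of a pair would need to be excluded first, since no imputation is performed.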

Data mining and data extraction for comparative analysis
Proteomics studies were searched in the ProteomeXchange and PRIDE repositories using keywords related to the human brain and to the different neuro-anatomical regions of the human brain according to the Allen Brain Atlas classification table. All the studies available in the repositories up to 10.02.2020 that met the filter criteria Species: Human (Homo sapiens) and Method: LFQ were downloaded. The manuscripts and sample metadata files were thoroughly studied to understand the experimental design and to determine which raw files belonged to healthy subjects. In most cases, authors were contacted to verify the raw files. Additionally, in-house mass spectrometry data for a few other regions from our previous study were also included. All these curated raw mass spectrometry files were taken forward for MaxQuant data analysis with the Human Proteome Database (Isoform). Further, brain-related databases and repositories were also considered, including the Brain Atlas of the Human Protein Atlas (HPA), the CSF Proteome Resource (CSF-PR) 43 , and Harmonizome 44 .
Inter-hemispheric brain proteome map (IBPM) portal
The IBPM portal at www.brainprot.org is built on the Django framework, with features for visualizing the data of the study, and acts as an interface for researchers in the field of brain proteomics for data integration and region-specific information. The portal also fetches general information about each protein, such as chromosome localization, function, and peptide information, from various existing repositories. The portal is designed with scalability at its core; its database is dynamic and can be extended to incorporate public databases.

Data availability
All MS raw files can be accessed from ProteomeXchange using the identifier PXD019936.