Molecular markers in the CSF proteome differentiate neuroinflammatory diseases

Background: Multiple sclerosis (MS) is characterized by different degrees of inflammatory and neurodegenerative features in the early relapsing vs. progressive subtypes. By using controls with different extents of inflammation vs. neurodegeneration, we examined the CSF proteome to identify molecular markers that differentiate between subtypes of MS. Gene expression of specific proteins was explored in MS brain lesions with diverse pathological background. Methods: (i) First, we compared the proteome by LC-MS/MS in 169 pooled CSF samples from MS subtypes to inflammatory/degenerative controls: AQP4-IgG-positive and AQP4-IgG-negative neuromyelitis optica spectrum disorder (NMOSD), Alzheimer’s disease (AD), and healthy controls. F-test based feature selection was used to cluster diseases and MS subtypes. (ii) Next, we selected 299 molecules by comprehensive statistics, and quantified them in the individual CSF samples. (iii) We also screened the genes of MS-specific CSF proteins in transcriptomes of 73 MS brain lesions with different pathology. Results: We identified 11 proteins that separated diseases, and 8 proteins that clustered MS subtypes. Secondary progressive (SP)MS had the most unique proteome, characterized by upregulation of intrinsic pathway proteins of the coagulation pathway. SPMS also clustered far from NMOSD, indicating less inflammatory pathways. Primary progressive (PP)MS was more similar to relapsing-remitting (RR)MS than SPMS. Quantification of 299 proteins in 170 individual CSF samples identified 5 molecules uniquely upregulated in MS subtypes and in AQP4-IgG-positive NMOSD, respectively. Chitinase-3-like protein 1 (CHI3L1) was upregulated in part of PPMS and remission CSF samples, and it was expressed by astrocytes in chronic active lesions. GFAP was upregulated in 70% of AQP4-IgG-positive NMOSD but only in 40% of AQP4-IgG-negative NMOSD.
Conclusions: By the combination of untargeted and targeted quantitative analysis, we identified CSF molecular markers of axonal growth inhibition, lipid binding, and protein/lipid transport that differentiate between neuroinflammatory and neurodegenerative diseases, and also MS subtypes. The majority of them were expressed in MS brain lesions, suggesting their origin from the brain tissue and not from the systemic compartment. Data suggest that the CSF proteome of SPMS is different from PPMS, and astrocyte damage may not be major pathology in part of the AQP4-IgG seronegative NMOSD.

26), AD (n=22), NMOSD AQP4-IgG + (n=13), NMOSD AQP4-IgG − (n=5) and healthy (n=27) were used to quantify the 299 proteins in each individual CSF by mass spectrometry labelled with TMT 11plex spiked with SIS, and analysed by ANOVA. Proteins were grouped based on their specific regulation in disease groups, and their gene expression was also examined in different brain WM lesion types of progressive MS by using MS-Atlas.

Background
Identification of specific molecular markers that reflect the pathology and disease course of multiple sclerosis (MS) is difficult because of the dynamic and complex molecular pathogenesis. Early in the course, MS is characterized by clinically active and silent phases (relapsing-remitting, RRMS). A secondary progressive phase (SPMS) evolves in a subset of patients, where a combination of neurodegenerative processes and adaptive and innate immune responses contributes to the advancing disability, and limits the efficacy of disease modifying treatments (DMTs) that target mainly systemic adaptive immune responses [1-4]. One out of eight MS patients is diagnosed with primary progressive (PP)MS, characterized by the absence of clinical relapses and gradual worsening from onset. Axonal degeneration, cortical lesions, CNS innate immune responses, inflammatory demyelination and remyelination significantly influence the prognosis and long-term outcome of MS [1,4,5]. Early prediction of mechanisms that culminate in the progressive phase may provide a more individualized treatment approach and postpone the secondary phase [6].
Hypothesis-driven exploratory omics approaches are effective tools for revealing molecular pathways and quantifying differentially expressed molecules to identify multiple markers that may predict disease outcomes. Mass spectrometry is an analytical technique for the characterization of biological samples and is increasingly used in omics studies as both a nontargeted and targeted approach for discovery proteomics and quantification with high throughput abilities. Proteomics of the cerebrospinal fluid (CSF) reflects more specific changes related to CNS damage than serum, and is a powerful tool for elucidating mechanisms by networks, pathways, protein groups and individual proteins that reflect both the similar and the unique molecular events as inflammation, degeneration, reparation or oxidative stress conditions in the MS subgroups [7].
Here, we used a comprehensive two-stage approach, with an untargeted and then a quantitative targeted method to characterize the molecular landscape of the CSF in different phases of MS. We aimed to identify molecules in the CSF with a potential clinical interest in MS subtypes, and to better understand MS pathophysiology at different stages. Disease controls were selected to include conditions with strong inflammatory alterations in the CNS without major degenerative processes but with similarity to MS, i.e. AQP4-IgG + and AQP4-IgG − neuromyelitis optica spectrum disorder (NMOSD) [8], and neurodegenerative conditions associated with innate inflammatory responses in the CNS, i.e. Alzheimer's disease (AD) [9]. Based on the different protein abundances in 169 CSF samples, we: (i) clustered the diseases and MS subtypes based on similarities and differences; (ii) selected hundreds of proteins that were quantified in 170 individual CSF samples of the MS subgroups and controls; and (iii) compared the unique CSF proteins associated with MS stages/subtypes with MS brain lesion signatures [10].
Relapse was verified by neurologists, and samples were taken within a maximum of one month after the first relapse symptoms. Patients with AQP4-IgG − NMOSD were not treated with immunosuppressive medications, while patients with AQP4-IgG + NMOSD received azathioprine or mycophenolate mofetil. NMOSD was stable in all patients. CSF samples were obtained by lumbar puncture, collected in polypropylene tubes and gently mixed.
The samples were centrifuged at 2000 × g for 10 min at 4 °C to remove cells and other insoluble materials and stored in polypropylene tubes at − 80 °C pending analysis.
The study was conducted in accordance with the approval of the Danish National Ethics Committee (S-20120066), and informed consent was obtained from each participant.

Sample Preparation for Proteomic Discovery
CSF samples of each disease group were pooled into one of three sample pools, producing three technical replicates (Fig. 1A). Proteins were ethanol/acetone precipitated and re-dissolved in 2 M thiourea, 20 mM dithiothreitol (DTT), and the protein amount was estimated using Qubit Protein Assay (Thermo Fisher Scientific). Following alkylation, proteins were digested with LysC (0.02 AU/mg proteins) for 4 h, and then with trypsin (50:1 ratio) overnight at 37 °C. Peptides were reverse phase (RP) purified using homemade columns of C8/R2 and C18/R3 (Applied Biosystems). Purified peptides were re-dissolved in 0.1% formic acid. The peptide amount in each sample was determined by amino acid composition analysis (AAA). Subsequently, equal amounts of each sample pool were labelled with one of the iTRAQ 8plex reagent labels according to the manufacturer's protocol. The bulk peptide sample was fractionated using hydrophilic interaction chromatography (HILIC), and each fraction was further separated by reverse phase chromatography prior to identification by mass spectrometry (Q Exactive HF, Thermo Fisher). The three technical replicates of the sample pools were run separately (Additional file 2A).

Data Availability
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [11] partner repository with the dataset identifier PXD017643.

Statistical Analyses for Selection of Proteins
Proteome Discoverer software (further PD software, Thermo Scientific, v1.4) was used to process the raw mass spectrometry (MS) files, identify the proteins and generate quantitative data which was further processed by three parallel approaches.
ANOVA-based (analysis of variance). For each peptide, ANOVA was performed with the lmPerm R package to determine difference between groups. Afterwards, to determine which pairs of groups showed most differences, the Tukey's HSD (honest significant difference) test was performed as posthoc analysis.
Limma-based (linear models). Linear regression and analysis of variance were performed with the limma R package. The ratios of a specific protein between two compared groups were log 2 transformed, normalized to the median, and the 3 replicates merged into one, and proteins were significant according to q-values (FDR < 0.1). The resulting data were visualized in volcano plots and heatmaps.
Complementary analysis of the three replicates. Using the PD software, for each of the three sets the coefficient of variation (CV) of proteins (any subject group to healthy subjects) within the set as well as the ratio of the mean abundance between the sets were calculated. A protein was selected for further analyses if the ratio was larger (or smaller) than 1 + 2 × CV (or its reciprocal). Subsequently, the PD software calculated a "global" ratio for a protein based on data from the three sets compared to healthy samples (and CV within the combined sets). Proteins were finally selected if the protein expression was larger (or smaller) than 1 + 2 × CV (or its reciprocal) at least between two different conditions, and was consistently altered in a minimum of two of the three sets.
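The selection rule of the complementary analysis can be illustrated with a minimal Python sketch. This is not the Proteome Discoverer implementation. The function names are hypothetical, "reciprocal" is assumed to mean 1 / (1 + 2 × CV), and "consistently altered" is assumed to mean the same direction of change in at least two of the three sets.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def passes_cv_threshold(ratio, cv):
    """True if ratio > 1 + 2*CV or ratio < 1 / (1 + 2*CV) (assumed reading)."""
    upper = 1.0 + 2.0 * cv
    return ratio > upper or ratio < 1.0 / upper


def select_protein(set_ratios, set_cvs, min_sets=2):
    """Keep a protein changed in the same direction in at least min_sets sets.

    set_ratios: ratio (disease group / healthy) in each replicate set.
    set_cvs:    CV of the protein within each replicate set.
    """
    up = sum(1 for r, cv in zip(set_ratios, set_cvs) if r > 1.0 + 2.0 * cv)
    down = sum(
        1 for r, cv in zip(set_ratios, set_cvs) if r < 1.0 / (1.0 + 2.0 * cv)
    )
    return max(up, down) >= min_sets
```

With a CV of 0.2 the thresholds are 1.4 and about 0.71, so ratios of 1.5 and 1.6 in two sets would select the protein, whereas 1.5 in one set and 0.6 in another would not.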

Linear discriminant analysis (LDA)
To reduce any possible batch effect, the three pools were merged after scaling them individually (per protein). An F-test based feature selection was performed, where only proteins with an FDR < 0.05 (ANOVA) were considered. Next, the set of candidate proteins was pruned for collinearity by iteratively removing the protein with the highest variance inflation factor (VIF), until only proteins with VIF < 10 remained. This resulted in 11 proteins, which were used to conduct a linear discriminant analysis (LDA). Additionally, the test was also performed only on the MS samples, resulting in 8 proteins responsible for the subgroup separation according to the LDA.
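The collinearity pruning step can be sketched in plain Python. The sketch assumes that the VIF of each protein is taken as the corresponding diagonal element of the inverse correlation matrix, which equals 1 / (1 − R²) from regressing that protein on the others. The F-test and FDR filter are not reproduced, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix of equally long numeric columns."""
    n = len(columns[0])
    z = []
    for col in columns:
        m, s = mean(col), pstdev(col)  # a constant column (s == 0) is not handled
        z.append([(v - m) / s for v in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (fails on singular input)."""
    k = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]


def vifs(columns):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[i][i] for i in range(len(columns))]


def prune_by_vif(data, threshold=10.0):
    """Iteratively drop the protein with the highest VIF until all are below threshold.

    data: dict mapping protein name -> list of scaled abundances.
    """
    names = list(data)
    while len(names) > 1:
        scores = vifs([data[name] for name in names])
        worst = max(range(len(names)), key=lambda i: scores[i])
        if scores[worst] < threshold:
            break
        names.pop(worst)
    return names
```

For three hypothetical proteins where the third is almost the sum of the first two, all three VIFs exceed 10; removing the third leaves two uncorrelated proteins with a VIF of 1 each.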

Pathway Analysis
After the data were normalized to control samples, Ingenuity Pathway Analysis (IPA) was used to identify molecular pathways and perform functional analysis between different disease groups and subgroups.

Sample Preparation for Quantification
CSF from each patient was precipitated with ethanol/acetone and dissolved in urea buffer containing DTT, as described in a previous paper with parallel reaction monitoring (PRM) [12]. Total protein content was estimated by AAA, and 10 µg of proteins were digested with trypsin. After digestion, Stable Isotope Standards (SIS) mix was added in equal volume to every sample (both previously prepared [12] and additional ones). Peptides in each sample were labelled with one of the TMT 11plex labels. A pooled sample was prepared by mixing a small amount from approximately half of all the available samples.
This pooled sample was labelled with TMT 11plex 126 label. Subsequently, this pooled sample was split equally into 17 samples, and mixed with ten other patient samples in a random manner (Additional file 2B). There were 17 TMT sets each containing at least one (if available) sample from every patient group. Samples were randomized so that each set contained a representative of each patient group, and each sample of every patient group was labelled with a different TMT label.
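The layout of the 17 TMT sets can be illustrated by a minimal Python sketch. It is not the randomization procedure that was actually used. It assumes round-robin distribution of shuffled samples across sets, followed by a shuffle of the channel order within each set. The function name, the seed, and the channel names other than 126 are assumptions based on the standard TMT 11plex nomenclature.

```python
import random

# Assumed TMT 11plex channel names; channel 126 carries the common pool.
CHANNELS = ["126", "127N", "127C", "128N", "128C", "129N",
            "129C", "130N", "130C", "131N", "131C"]


def build_tmt_sets(samples_by_group, n_sets=17, seed=0):
    """Distribute patient samples over n_sets TMT sets.

    samples_by_group: dict mapping patient group -> list of sample IDs.
    Returns a list of dicts mapping channel -> (group, sample ID).
    """
    rng = random.Random(seed)
    ordered = []
    for group in sorted(samples_by_group):
        members = list(samples_by_group[group])
        rng.shuffle(members)
        ordered.extend((group, sample) for sample in members)

    # Dealing consecutive samples round-robin spreads every group over the sets.
    sets = [[] for _ in range(n_sets)]
    for i, item in enumerate(ordered):
        sets[i % n_sets].append(item)

    layout = []
    for members in sets:
        if len(members) > len(CHANNELS) - 1:
            raise ValueError("more samples than free channels in one set")
        rng.shuffle(members)  # vary which channel each group receives
        plex = {"126": ("pool", "pool")}
        for channel, item in zip(CHANNELS[1:], members):
            plex[channel] = item
        layout.append(plex)
    return layout
```

With 170 samples and 17 sets, each set holds the pool plus ten patient samples, and any group with at least 17 samples is represented in every set.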

Data Processing and Statistical Analyses of Validated Proteins
The raw data was processed with the ProteomeDiscoverer software (v2.3). The samples used for analysis contained SIS standard added in the same amount to each sample and labelled with TMT along with all the other CSF peptides. Each patient group was set as one of the Categorical factors, and every patient within a patient group was set as a Biological replicate. The Pool sample was set as "Control" and every patient sample was set as "Sample". The scaling parameter was set to "On Average Control". In this way, samples were normalized and scaled to the Pool (which is a common/identical sample across the 17 replicates). The software calculated ratios for protein abundances between any patient group and healthy controls based on proteins identified and quantified in corresponding samples from all the 17 sets. In an alternative approach, the quantitative data from ProteomeDiscoverer were extracted and further processed in Excel (Microsoft). The constant ratio of CSF proteins to SIS was used to calculate normalization factors within each of the 17 TMT sets.
Additionally, this SIS normalization could also be used for correcting the few samples that contained less than 10 µg of proteins or a different volume. After normalization, an average ratio for each protein (for every patient group) was calculated based on the ratios to the corresponding protein in the Pooled sample. The significance of the ratios was assessed by ANOVA using PRISM and PolySTest software [13].
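The alternative normalization can be sketched as follows. This is an illustration under assumptions, not the spreadsheet calculation that was performed. It assumes that each channel is rescaled so that its SIS signal matches that of the pool channel, that each protein is then expressed as a ratio to the pool, and that ratios are averaged per patient group. All function and variable names are hypothetical.

```python
from statistics import mean


def sis_factors(sis_by_channel, reference="126"):
    """Per-channel factor that brings the SIS signal to the pool-channel level."""
    ref = sis_by_channel[reference]
    return {channel: ref / value for channel, value in sis_by_channel.items()}


def ratios_to_pool(protein_by_channel, factors, reference="126"):
    """Normalize one protein within one TMT set and express it relative to the pool."""
    scaled = {ch: value * factors[ch] for ch, value in protein_by_channel.items()}
    pool = scaled[reference]
    return {ch: value / pool for ch, value in scaled.items() if ch != reference}


def group_means(ratios, group_of_channel):
    """Average the sample/pool ratios of one protein within each patient group."""
    by_group = {}
    for channel, ratio in ratios.items():
        by_group.setdefault(group_of_channel[channel], []).append(ratio)
    return {group: mean(values) for group, values in by_group.items()}
```

For example, a channel with half the SIS signal of the pool receives a factor of 2, so a protein with the same raw abundance as in the pool yields a normalized ratio of 2.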

Human Brain Lesion Signature
Gene names of the proteins of interest were uploaded to msatlas.dk, and heatmaps were produced of genes present in the human MS brain [10]. Stars were added when there was a significant difference (FDR < 0.05) between MS lesion type and control white matter from non-neurological disease brain areas.

Immunohistochemistry and RNAscope of Chronic Active Brain Lesion

Results
Global CSF proteome landscape of MS subtypes compared to NMOSD and AD
Untargeted analysis of the CSF proteome in MS subgroups and controls
Altogether, we detected 878 proteins in the 169 CSF samples. By using F-test based feature selection, we identified 11 proteins, which were used to conduct a linear discriminant analysis (LDA) including both the disease groups and healthy controls: there was no overlap between the different disease groups, and no influence of the technical batch effect (Fig. 1B). NMOSD (AQP4-IgG − and AQP4-IgG +) and SPMS were the most distinct groups both from each other and from healthy controls, PPMS, RRMS (relapse, remission) and AD. The presence of genes coding these 11 proteins in the MS brain was examined by using MS-Atlas [10]. All were expressed in the MS brain, and 5 of them were significantly differentially expressed in different lesion types (PEBP4, CNTNAP4, NRXN1, CPQ, OLFML3) (Fig. 1C).
F-test based feature selection was also applied to the MS CSF samples separately and resulted in 8 proteins differentiating the MS subtypes (early MS in remission and relapse, SP and PPMS) (Fig. 1D).
The LDA according to these 8 proteins also identified the SPMS subtype as being the most different (Fig. 1E). Seven of the 8 genes encoding the proteins were present in the MS brain, and 3 were significantly differentially expressed: GOLM1 in all the lesion types (active, chronic active, inactive and remyelinating), FRZB in active and chronic active lesions, and SELENBP1 in inactive lesions (Fig. 1F).
Next, we normalized the protein levels to healthy controls, and the diseases were clustered based on the abundance in protein log2 ratios (except AQP4-IgG − NMOSD due to lack of technical replicates) (Fig. 2A). AQP4-IgG + NMOSD was the most different from the other diseases, and SPMS the most different from the other MS subtypes (Fig. 2A). Volcano plots of normalized proteins in different disease groups also indicated that AQP4-IgG + NMOSD and SPMS had the highest number of altered proteins compared to healthy controls (FDR < 0.1) (Fig. 2B).

Functional analysis of the CSF proteome
Functional classification and molecular pathways of the proteome in the different diseases were generated by Ingenuity Pathway Analysis (IPA) (Fig. 2C, Additional file 3). The most shared pathway was "LXR/RXR Activation", shared by SPMS, PPMS, MS remission, and AQP4-IgG + NMOSD. "Acute Phase Response Signalling" was shared between SPMS, AQP4-IgG + NMOSD and AD. "Axonal Guidance Signalling" was shared between MS remission, PPMS and AD. PPMS and AD shared "Intrinsic Prothrombin Activation Pathway". AD and AQP4-IgG + NMOSD shared "Complement". SPMS had two unique pathways: "Neuroprotective Role Of THOP1 In AD" and "Coagulation System", while PPMS and remission had one each, "FXR/RXR Activation" and "Clathrin-mediated Endocytosis Signalling", respectively.
While CSF in MS relapse did not share common top pathways, the top pathways were "Hematopoiesis from Pluripotent Stem Cells", "Leucocyte Extravasation Signalling" and "Agrin Interactions at Neuromuscular Junction" (Fig. 2C). The distinct biological functional enrichment of MS relapse was also reflected by the top 5 predefined diseases or functions (Fig. 2D). The top network assigned for all the disease groups were "Metabolic Disease", "Cellular Movement", "Neurological Disease" and "Psychological Disorders", while relapse only shared "Cellular Movement".

Unique CSF proteins in disease subtypes
Quantification of disease-specific proteins in the CSF
By combining different statistical analyses of the pooled CSF samples (ANOVA, limma, complementary analysis), we selected 299 dysregulated proteins (Fig. 3A, Additional file 4). These were quantified in 170 individual CSF samples by mass spectrometry.

CHI3L1 and metalloproteinase inhibitor 1
The two proteins that were significantly altered in all three tests, chitinase-3-like protein 1 (CHI3L1) and metalloproteinase inhibitor 1 (TIMP1), were not significantly altered in the individual samples by the quantitative proteomics (Fig. 3B). However, some of the PP and remission patients had increased levels of CHI3L1, while some of the AQP4-IgG + NMOSD patients had increased levels of metalloproteinase inhibitor 1 compared to the pool (Fig. 3B). By immunohistochemistry (IHC), we also found CHI3L1 increased at the rim of chronic active lesions in progressive MS WM tissue (Fig. 3C). The morphology of cells expressing CHI3L1 in chronic active lesions was consistent with astrocytes. The astrocytic expression was confirmed by combined RNAscope and immunohistochemistry that colocalized CHI3L1 and GFAP at the chronic active rim in close proximity to MHCII expressing cells (Fig. 3D-F).

Upregulated proteins in the CSF of MS
The trypsin-1 protein was the most significantly upregulated protein in RRMS in remission, PPMS and SPMS subjects compared to both the disease and healthy controls (Fig. 4A). Apolipoprotein C-I and augurin were also upregulated in these three MS subtypes compared to healthy controls and AD patients (Fig. 4B, 4C). Receptor-type tyrosine-protein phosphatase gamma was also upregulated in these three MS subtypes compared to disease controls (Fig. 4D). Apolipoprotein A-II was significantly upregulated in SPMS compared to relapsing MS, AQP4-IgG + NMOSD and healthy controls (Fig. 4E).

AQP4-IgG + NMOSD-specific CSF proteins
GFAP, inter-alpha-trypsin inhibitor heavy chain H1 and H2, serum amyloid P-component, and actin cytoplasmic 1 protein were uniquely upregulated in AQP4-IgG + NMOSD compared to all MS subtypes, AD and healthy controls (Fig. 5). Glial fibrillary acidic protein (GFAP) was detected in less than 50% of the patients with AQP4-IgG − NMOSD, similarly to MS and AD.

CSF proteome signatures in MS brain lesion transcriptomes
We compared the CSF proteome signatures to the recently established transcriptome signatures of different MS lesion types [10]. Two of the MS-specific upregulated proteins were present as transcripts in the human progressive MS brain: apolipoprotein C-I (APOC1) was significantly upregulated in active lesions, and receptor-type tyrosine-protein phosphatase gamma (PTPRG) was significantly upregulated in all WM tissue of progressive MS (NAWM and lesions) (Fig. 6A).
Three of the five altered proteins in AQP4-IgG + NMOSD patients were also detected as transcripts in the MS WM brain tissue: glial fibrillary acidic protein (GFAP) was upregulated in active, inactive and remyelinating lesion types, inter-alpha-trypsin inhibitor heavy chain H2 (ITIH2) was significantly upregulated in all lesion types, while actin cytoplasmic 1 (ACTB) was not differently expressed compared to non-neurological-disease WM brain areas (Fig. 6B).

Discussion
This comprehensive two-stage proteomic study with a high number of human CSF samples (n = 169) from a spectrum of different neurological diseases provided information about the global CSF proteomic landscape in disease subgroups compared to healthy controls.

Mapping the global CSF proteome in a spectrum of neurological disorders
With F-test based feature selection, a combination of 11 proteins could separate the diseases without overlap and technical batch effect. Due to the pruning for collinearity during the F-test, these proteins have very different functions, i.e. axonal growth inhibition (RTN4R), cell-matrix interactions (CRTAC1), lipid binding and inhibition of serine proteases (PEBP4), and hydrolysis of circulating peptides (CPQ).
The combination of 8 additional proteins could also separate the MS subgroups, and 4 were related to intracellular processing and transporting of synthesized proteins and lipids (GOLM1, NUCB1, NPC2, SELENBP1). LDA and differential abundance of proteins indicated that SPMS and AQP4-IgG +/− NMOSD clustered far from each other, and SPMS differed the most from the other MS subtypes (Fig. 1A).
These two disease groups also had the highest number of significantly altered proteins (FDR < 0.1) (Fig. 1B). We examined the presence of the 17 differentiating molecules among transcripts in different brain MS lesions [10], and identified 8 genes that were significantly differentially expressed.
Neurexin-1 has been related to neurodegeneration in MS [14], and NRXN1 was uniquely significantly upregulated in the chronic active lesion type associated with progressive MS. Olfactomedin-like protein 3 is a known marker of activated ramified microglia, and OLFML3 was also significantly upregulated in chronic active lesions as well as in NAWM. Selenium-binding protein 1 is an astrocytic marker related to metabolic processes [15], and SELENBP1 was uniquely expressed in inactive lesions characterized by astrocytic scar tissue [10]. Secreted frizzled-related protein 3 is involved in axon targeting basement membrane breakdown [16], and the FRZB gene was significantly upregulated in active and chronic active lesion types. Contactin-associated protein-like 4 is involved in the formation and maintenance of myelinated axons [17], and CNTNAP4 was upregulated in the inactive lesion type. This molecular CSF profile and associated brain lesion spectrum highlight the importance of noninflammatory mechanisms in differentiating these diseases.
We also examined pathways that were different among diseases and MS subtypes (Fig. 2). In this regard, relapse was the most distinct disease group with almost nothing in common with the other diseases. It was dominated by unique immune-related pathways, and the top predicted diseases/functions were more related to systemic than CNS-specific events. Although early MS in remission and PPMS resembled each other the most by sharing the top 5 predicted diseases/functions, they differed by specific pathways suggesting "clean-up" events in remission and regulating lipid and glucose metabolism in PPMS (FXR/RXR Activation). The unique SPMS enriched pathway was the "Coagulation" system, while PPMS and AD shared "Intrinsic Prothrombin Activation Pathway". A previous study also found proteins involved in coagulation unique to chronic active lesion samples, suggesting dysregulation of molecules associated with coagulation in chronic active lesions [18]. Another recent study also identified higher levels of CSF proteins related to the coagulation cascade in MS patients with higher cortical lesion load [19].
Unexpectedly, in our study immune related proteins such as cytokines, chemokines, growth factors and adhesion molecules were not frequently detected. This could be because of the constrained dynamic range of mass spectrometers to truly cover the broad spectrum of lower abundance. A recent systematic review revealed 19 inflammatory proteins specifically altered in MS [20]. Not surprisingly, the majority of the upregulated MS proteins (11 of 19) were immunoglobulins. However, most of these proteins also appeared to be highly abundant in the CSF [21,22], and low abundant proteins likely to be involved in the distinct damaging vs. repairing processes of MS remain to be discovered.

Disease-specific molecular markers
Next, 299 proteins were selected and quantified in 170 individual CSF samples (169 of these were also used for the discovery phase). This identified 12 molecules of potential interest, including 10 molecular markers specific to MS and AQP4-IgG + NMOSD.

Molecular markers significantly altered in all three tests in the discovery analysis but not significant in quantification of individual samples
Two proteins (CHI3L1 and TIMP1) were significantly altered in all three statistical tests (ANOVA, limma, complementary analysis) in the pooled discovery CSF proteome, but were not unique to diseases in the individual quantification study. However, a subgroup of MS patients with PP and remission had increased levels of CHI3L1 (Fig. 3B). CHI3L1 (YKL-40) is a promising biomarker of inflammation in progressive MS [23], and was originally discovered in the CSF proteome of patients with CIS converting to RRMS 16 . Immunohistochemistry and RNAscope indicated that the gene encoding CHI3L1 was primarily expressed by astrocytes in the rim of chronic active lesions (Fig. 3C-F).
Another recent study also found that CHI3L1 reflects disease progression, and together with the biomarker neurofilament light chain protein, it may help to discriminate MS phenotypes [24]. These data suggest that some of the emerging biomarkers in progressive MS may reflect unique molecular changes in the brain related to specific subtypes of lesions. The high expression of CHI3L1 in the CSF of patients with progressive MS [25] may be related to the increasing number of a specific subtype of chronic active lesions, and we may speculate that its level in the CSF of patients with progressive MS may even reflect the number of this lesion type in the brain. The expression of CHI3L1 by astrocytes has been recently described in neurodegenerative diseases and often appears in clusters of astrocytes [26]. Knock-out animal models indicated a protective role of CHI3L1, as traumatic brain injury and experimental autoimmune encephalomyelitis were more severe in its absence [27,28]. CHI3L1 can also influence the migratory capacity of astrocytes and reduces astrogliosis [27,28]. It may therefore dampen the inflammation and limit astrogliosis.
TIMP-1 seemed to be highly expressed in a subset of AQP4-IgG + NMOSD patients (Fig. 3B). TIMP-1 is produced by astrocytes in both homeostasis and early/acute inflammatory events [29]. We have previously found TIMP-1 to peak during acute remyelination in the cuprizone model and to be associated with reduced inflammation in the CSF of MS [12]. Induction of TIMP-1 in neurons and astrocytes was also related to early cellular events triggered by seizures and with long-lasting changes in tissue reorganization and/or neuroprotection [30]. Increased TIMP-1 levels in serum have also been proposed as a prognostic biomarker of mortality in patients with traumatic brain injury [31]. Therefore, increased TIMP-1 and CHI3L1 in the CSF may reflect acute and chronic astrocytic responses in subgroups of MS and AQP4-IgG + NMOSD patients.

Molecular markers of MS
Two apolipoproteins were found increased in MS. These are important players in cholesterol homeostasis, and in CNS diseases for neuronal homeostasis and regeneration [32]. Apolipoprotein C-I was significantly upregulated in RRMS in remission, PPMS and SPMS, and its transcript was significantly induced in active MS lesions in SPMS brain (Fig. 6A). Apolipoprotein A-II was significantly altered in the CSF in SPMS compared to both AQP4-IgG + NMOSD and healthy controls (Fig. 4E).
Increased levels of apolipoprotein A-II have been associated with fatigue in MS patients [33], and may reflect later disease mechanisms accumulated with chronic damage. Apolipoproteins have also been linked to the genetic risk of MS: APOE genotype has been associated with disease severity and MR activity [34-36].
Trypsin-1, a protease that degrades other proteins, was also significantly upregulated in remission, PP and SPMS compared to the disease and healthy controls. Since we were not able to detect the gene of this protein (PRSS1) expressed in the MS brain [10], its presence in the CSF may originate from the systemic compartment.
Receptor-type tyrosine-protein phosphatase gamma (PTPRG) levels were increased in RRMS in remission, PP and SPMS compared to the disease controls, but only RRMS in remission was significantly upregulated compared to healthy controls (Fig. 4D). Another study also found it increased in the CSF of early MS patients compared to controls [37], suggesting that it may be induced from onset of the disease. We also found it significantly upregulated in progressive MS tissue in both NAWM and all kinds of lesions (Fig. 6A).

Molecular markers in AQP4-IgG + NMOSD
We found 5 upregulated unique molecular markers in the CSF of patients with AQP4-IgG + NMOSD.
Increased GFAP reflects astrocyte damage and death in AQP4-IgG + NMOSD [38,39]. It was not increased in AQP4-IgG seronegative NMOSD, indicating that at least in a subset of these patients, disease mechanisms do not primarily target astrocytes. Another study also reported higher GFAP levels in AQP4-IgG + patients compared to AQP4-IgG − NMOSD [40]. The unique elevation of serum amyloid P-component in the CSF in AQP4-IgG + NMOSD may be related to damage to the blood-brain barrier [41]. Upregulation of inter-alpha-trypsin inhibitor heavy chain H1 and H2 may represent endogenous neuroprotective immunomodulatory proteins within the CNS [42]. The ITIH2 gene was significantly upregulated in all lesion types in the MS brain (Fig. 6B), suggesting that this molecule can be an indicator of non-specific neurological inflammatory damage and control.

Conclusions
With the combination of untargeted and targeted quantitative proteomic analysis of the CSF, we identified molecular markers that differentiated between neuroinflammatory and neurodegenerative diseases, and also MS subtypes. Our data support a recent observation that the coagulation system is an important pathway in SPMS. We found that the proteome of SPMS was the most different from the other subtypes of MS including PPMS. Genes encoding proteins that clustered disease and MS subtypes could also be detected in MS brain lesions. Among specific CSF proteins in NMOSD, GFAP was present in 70% of AQP4-IgG seropositive NMOSD and significantly upregulated; however, the 40% detection level in AQP4-IgG seronegative NMOSD may suggest that astrocyte damage may not be major pathology in part of these patients. The absence of an independent cohort validation, different ages of patients in disease subgroups, and immunotherapy of NMOSD patients are limitations of the study.

Ethics approval and consent to participate
The study was conducted in accordance with the approval of the Danish National Ethics Committee (S-20120066), and informed consent was obtained from each participant.

Consent for publication
Not applicable.

Availability of data and materials
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [11] partner repository with the dataset identifier PXD017643.