Transcriptomic analysis reveals higher number of DEGs in sALS1 than sALS2 fibroblasts and different transcriptional responses to EH301
To characterize gene expression profiles of sALS subgroups we performed 3’RNAseq analysis on fibroblast lines from control, sALS1, and sALS2 subjects (n=6 per group). In sALS1, 281 genes were differentially expressed (DEGs, padj. <0.05) relative to controls (Supplementary Table 1), while only one gene reached statistical significance in the comparison between sALS2 and controls (Fig. 1A, Supplementary Table 2). Interestingly, several genes relevant to neuronal function and development were differentially expressed in sALS1. For example, stathmin 2 (STMN2) was downregulated in sALS1 by ~80%. STMN2 has been linked to TDP-43 dysfunction [37] and a novel STMN2 genetic variant has been associated with ALS risk, onset, and progression [38]. Furthermore, the most upregulated gene in sALS1 was an antisense RNA for kinesin family member 5C (KIF5C-AS1). KIF5C is highly expressed in the brain and enriched in motor neurons [39], where it regulates axonal transport [40], and alterations of KIF5C are associated with intellectual disabilities and cortical development malformations [41, 42]. In addition, YIF1A, which was upregulated in sALS1, interacts with the ALS8 related protein VAPB [43] involved in neuronal ER-Golgi interactions [44]. HIST1H4C, a replication-dependent component of the nucleosome, was among the top downregulated genes in sALS1. Mutations affecting lysine 91 in HIST1H4C have been associated with a syndrome characterized by developmental anomalies and intellectual disabilities, indicating the importance of chromatin organization for the correct development and function of the nervous system [45]. SOX9, a transcription factor that controls several aspects of neurodevelopment [46] and is highly expressed in astrocytes and neural progenitor cells [47], was downregulated in sALS1. SYNE2 (nesprin), involved in organellar subcellular organization [48] and associated with muscular dystrophy [49], was also downregulated in sALS1. 
The only gene significantly upregulated in sALS2 compared to controls was MRE11, which encodes a double-strand break repair protein implicated in DNA damage response [50].
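The adjusted p-value cutoff used above to call DEGs can be illustrated with a short sketch. It assumes that padj was obtained with the Benjamini-Hochberg procedure, which the text does not state; the gene names and p-values are hypothetical placeholders, not data from this study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five placeholder genes (made-up numbers).
raw = {"geneA": 0.001, "geneB": 0.008, "geneC": 0.039, "geneD": 0.041, "geneE": 0.60}
padj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))

# Keep genes that pass the padj < 0.05 threshold used in the text.
degs = [gene for gene, p in padj.items() if p < 0.05]
```

In this toy example geneC and geneD have raw p-values below 0.05 but do not pass after adjustment, which is why the number of DEGs depends on the adjusted rather than the raw p-value.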
In addition to examining the expression of individual genes, we performed pathway analysis of the biological processes and molecular functions of sALS1 DEGs with WebGestalt [26], which revealed that upregulated genes are involved in vesicular and protein transport and in extracellular matrix organization (Fig. 1B, Supplementary Table 3). Among the downregulated genes, the most enriched pathways in sALS1 were linked to cell cycle progression and cytoskeletal function (Fig. 1B).
Next, we evaluated the effects of EH301 on fibroblast transcriptomic profiles. Cells were exposed to EH301 (NR 1mM, PT 10µM) for 48 hours prior to RNA extraction. Interestingly, we found that EH301 affected a larger number of genes in sALS1 and sALS2 fibroblasts compared to controls (233 genes in sALS1, 202 genes in sALS2, 77 genes in controls), with little overlap between groups (Fig. 1C, Supplementary Tables 4-6). In control fibroblasts, pathway analysis of DEGs between vehicle and EH301 treated cells indicated that EH301 modified the expression of genes involved in mRNA processing (Fig. 1D, Supplementary Tables 7-9). For example, SYNCRIP, a ribonucleoprotein involved in RNA stabilization and editing, which has been associated with intellectual disabilities [51–53], was downregulated by EH301. CWC22 and CWC27, which cooperate during spliceosome assembly and are linked to developmental defects [54], were also downregulated in control fibroblasts by EH301. In sALS1 fibroblasts, EH301 influenced downstream steps of protein biosynthesis, by modifying the expression of genes involved in ribosome organization, translation initiation, and protein localization (Fig. 1D). Several genes encoding ribosomal proteins, components of the 60S and the 40S subunits, were upregulated. Furthermore, two elements of the eukaryotic initiation factor 3 complex were differentially expressed after treatment. EIF3F, a positive regulator of NOTCH signaling [55], was upregulated. Conversely, EIF3J, involved in the recognition of starting codons [56] and in ribosome recycling [57], was downregulated. SEC11A, which mediates import of nascent protein into the ER [58], was upregulated, while KDELR3, mediating protein trafficking from Golgi to ER and involved in stress response [59], was downregulated by EH301. BCAP31, a chaperone abundant in the ER and involved in transmembrane protein export [60, 61] and in the assembly of mitochondrial Complex I [62], was upregulated. 
Both KDELR3 and BCAP31 have been associated with pathologies of the nervous system [63, 64]. Of note, treatment with EH301 normalized YIF1A expression in sALS1, while STMN2 and KIF5C-AS1 expression remained altered. Surprisingly, no pathway was found to be significantly enriched in sALS2 fibroblasts, even though the expression of 187 genes was altered by EH301 in this group.
In summary, RNAseq in human primary fibroblasts confirmed that, based on the number of DEGs, sALS1 samples are more distinct from controls than sALS2 and that genes involved in neurodevelopment and neuronal function are differentially regulated in sALS1 fibroblasts. Moreover, pathway analysis indicates that EH301 affects sALS1, sALS2, and control fibroblast gene expression differently, mostly affecting mRNA splicing and stability in controls and protein biosynthesis and localization in sALS1, while no specific pathways were identified in EH301 treated sALS2.
The metabolite profiles of sALS and control fibroblasts are modified by EH301
Next, we investigated how treatment with EH301 affects the metabolite profiles of sALS1, sALS2, and control fibroblasts. We performed targeted metabolomics in the same cell lines (n=6 per group) used for transcriptomics, in the same cell culture conditions. Following exclusion of low abundance hits, 166 metabolites were used for analysis. Metabolomics profiles showed that sALS1 had reduced cystathionine and increased betaine compared to controls (Supplementary Tables 10-12), corroborating previously reported differences in the transsulfuration pathway [8]. Cystathionine levels were unchanged in EH301 treated sALS1 but decreased in both control and sALS2 fibroblasts (Fig. 2A). Oxidized glutathione was increased after treatment in controls, but unchanged in sALS1 and sALS2 fibroblasts (Fig. 2A). Together, these results indicate that EH301 modulates the transsulfuration pathway, but does not correct the alterations observed in sALS1. As expected, metabolic pathway analysis showed that EH301 modifies metabolites of the nicotinate and nicotinamide pathway in all groups, increasing availability of NAD precursors and NAD (Fig. 2A, B). EH301 treated sALS2 also showed decreased fumarate and malate compared to vehicle treated cells, suggesting that NAD derived from NR accelerates the TCA cycle (Fig. 2A). D-glyceraldehyde-3-phosphate was significantly increased at baseline in sALS2 compared to controls and was normalized by EH301 (Fig. 2A), pointing to accelerated flux of NAD-dependent reactions in EH301 treated sALS2 cells. Riboflavin, the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD), was decreased at baseline in both sALS1 and sALS2 fibroblasts and returned to control levels after treatment with EH301 (Fig. 2A). The pentose phosphate pathway was also affected by EH301 in control, sALS1, and sALS2 fibroblasts, with increased concentration of ribose and phosphorylated ribose in treated cells (Fig. 2A, B).
Pyrimidine metabolism was modified by EH301 only in sALS1 and sALS2 fibroblasts, while controls were not affected (Fig. 2A, B). Purine metabolism, which was reported to be altered in sALS1 at baseline [8], was affected by EH301 in control and sALS1 cells (Fig. 2A, B). Interestingly, EH301 increased 1-methyladenosine in all groups. 1-Methyladenosine is an S-adenosylmethionine (SAM)-dependent modification of RNA regulating mRNA localization, stability, translation, and splicing [65]. 1-Methyladenosine also responds to stress, decreasing upon glucose or amino acid starvation and increasing after heat shock [65]. 1-Methyladenosine can also modify tRNA, regulating its stability and folding [66], and is found in mtDNA-encoded transcripts [67]. Thus, some of the effects of EH301 on transcription and metabolism could be mediated by regulation of SAM-dependent epigenetic marks on RNA.
The alanine, aspartate, and glutamate metabolic pathways were globally altered by EH301 in sALS1 and sALS2 fibroblasts (Fig. 2A, B). Specifically, glutamate was decreased at baseline in sALS1, but not in sALS2, compared to controls (Fig. 2A, B). This selective glutamate decrease in sALS1 could be due to increased extrusion of glutamate in exchange for cystine by the SLC7A11 transporter, which was shown to be upregulated in sALS1 fibroblasts [8]. EH301 decreased glutamate levels in all groups, a potentially disease-relevant effect, since extracellular glutamate homeostasis is known to be dysregulated in ALS [68].
EH301 strongly protects fibroblasts from cell death induced by thiol group depletion
Although EH301 did not correct the characteristic imbalances of transsulfuration metabolites in sALS1, we investigated whether increased NAD availability and modification of amino acid metabolism by EH301 could improve cell viability under metabolic stress induced by methionine and cystine deprivation. To this end, we cultured cells for 72h in methionine and cystine depleted medium, in the presence or absence of EH301 or its individual components (NR or PT). Depletion of methionine and cystine caused similar levels of cell death in all groups (Fig. 2C, D). Interestingly, addition of NR alone was effective in protecting sALS1 fibroblasts from cell death, while the viability of sALS2 and controls was not improved (Fig. 2C, D). Treatment with PT alone was sufficient to prevent cell death in all groups (Fig. 2C, D). The combination of NR and PT (EH301) had comparable effects to PT alone. These results further indicate that the metabolic alterations of sALS1 are different from those of sALS2 and potentially more responsive to nicotinamide derivatives. They also indicate that PT is the most potent compound in protecting cells from the profound redox stress deriving from thiol group depletion.
Transcriptomic analysis performed on control cells exposed to medium depleted of methionine and cystine showed that treatment with EH301 reduces expression of genes involved in inflammation and apoptosis (Fig. 2E, F, Supplementary Tables 13-14). Of note, the expression of the stress response factor ATF3 was downregulated by EH301, while the level of ATF5, which promotes expression of chaperones and pro-survival factors [69], was increased (Fig. 2E). Members of the kinesin family and TP53 were among the genes upregulated by EH301 in fibroblasts grown in the absence of methionine and cystine. PRMT1 and PRMT2, which regulate DNA damage response and other signaling pathways through SAM-dependent arginine methylation, were also increased by EH301. On the other hand, chemokines of the CXCL family (CXCL1, CXCL2, CXCL3, CXCL5, CXCL6, CXCL8) were downregulated by the treatment. The transcripts of SOD2 and different metallothionein isoforms were reduced in fibroblasts treated with EH301 compared to vehicle, further indicating that EH301 acts through antioxidant and anti-inflammatory mechanisms, which prevent the need for upregulation of free radical scavengers and stress response genes under thiol-depleted conditions.
Weighted gene co-expression network analysis highlights associations between fibroblast transcriptional profiles and ALS clinical traits, which are altered by EH301 treatment
Weighted gene co-expression network analysis (WGCNA) is a powerful unbiased method for analysis of transcriptome-wide changes due to disease state [30, 70]. WGCNA differs from more traditional differential gene expression analysis methods, because it considers groups of genes with highly similar expression patterns across samples as part of a set of interconnected modules, rather than considering genes as single entities. This type of analysis increases the statistical power available to identify significant associations with phenotypic traits by minimizing noise. It may also provide more comprehensive information on complex biological processes [71].
To apply the WGCNA framework to our fibroblast gene expression dataset, we first constructed a co-expression matrix using normalized expression data for 17,662 genes. For this analysis we included all the vehicle-treated lines except one sALS1 line that was identified as an outlier, based on its extreme distance from all other samples in hierarchical clustering (Fig. S1). We applied the same method to construct a matrix using all the EH301-treated samples. The WGCNA framework uses this matrix as input for hierarchical clustering to group highly co-expressed genes into modules. We identified 25 such modules in the vehicle network and 90 in the EH301 network (Fig. 3A, B). We next correlated module gene expression with sex, age, and six disease traits: disease status (ALS or control), disease subgroup (sALS1 or sALS2), disease duration, ALSFRS-R, rate of ALSFRS-R decline, and forced vital capacity (FVC%). Disease duration, ALSFRS-R, rate of decline, and FVC% are all relevant markers of ALS severity, which were significantly correlated with each other, as expected (Fig. S2). Age did not correlate significantly with any of the disease traits or with any of the first ten principal components of gene expression in either vehicle or EH301 samples. Since these components cumulatively explain over half of the total variance in the dataset, this suggests that age does not significantly contribute to the variance in gene expression in our data. Sex significantly correlated with vehicle PC9 and EH301 PC1 and PC9 but did not correlate with any of the disease traits (Fig. S2). Nevertheless, we opted to include sex and age in our analysis, as they are potential biologically relevant variables in ALS. In the vehicle network, 10 modules (40%) were significantly associated with one or more traits (Fig. 3C), while in the EH301 network, 38 modules (44%) had significant associations with one or more traits (Fig. 3D).
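The module-trait correlation step described above can be sketched as follows. This is a simplified illustration, not the WGCNA implementation: WGCNA summarizes a module by its eigengene (the first principal component of module expression), whereas this sketch uses the average z-scored expression as a stand-in. All gene names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def zscore(values):
    """Center and scale one gene's expression across samples."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def module_profile(expression, genes):
    """Average z-scored expression per sample (stand-in for the eigengene)."""
    scaled = [zscore(expression[g]) for g in genes]
    return [sum(column) / len(column) for column in zip(*scaled)]


# Hypothetical normalized expression for six samples (made-up numbers).
expression = {
    "gene1": [5.0, 5.2, 4.9, 7.1, 7.4, 6.9],
    "gene2": [2.1, 2.0, 2.3, 3.9, 4.2, 4.0],
    "gene3": [8.0, 8.1, 7.9, 9.6, 9.9, 9.5],
}
disease_status = [0, 0, 0, 1, 1, 1]  # 0 = control, 1 = ALS

profile = module_profile(expression, ["gene1", "gene2", "gene3"])
r = pearson(profile, disease_status)
```

A module would then be called trait-associated when the p-value attached to this correlation falls below the chosen significance threshold; the p-value calculation is omitted here.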
We then performed GO pathway analyses using the GO: Biological Process (GO:BP), GO: Molecular Function (GO:MF), and KEGG databases on the significantly associated modules and found that in the vehicle network 6/10 modules (Table 2) had a significant enrichment for one or more pathways, while in the EH301 network 23/38 modules (Table 3) had a significant enrichment for one or more pathways. The module significantly associated with the largest number of traits in the vehicle network was the Greenyellow, which associated with disease status, disease duration, ALSFRS-R score, and nearly reached significant association (p=0.06) for FVC%. GO analysis showed that the set of genes comprising the Greenyellow module were functionally enriched for genes involved in cell cycle, chromatin modifications, and DNA damage repair (Fig. 4A). The Turquoise module significantly associated with FVC%, and neared significance for association with disease metabotype, ALSFRS-R score, and rate of decline. Genes belonging to the Turquoise module were functionally enriched for pathways related to DNA damage repair, autophagy and protein catabolism, cell cycle, innate immunity, and mitochondrial function (Fig. 4B). Interestingly, the Turquoise module was also significantly enriched for genes annotated by the KEGG database as important for ALS pathogenesis (KEGG hsa05014). Finally, the Salmon module, which significantly associated with disease metabotype, contained genes belonging to pathways related to autophagy and protein catabolism (Fig. 4C). Clustering of module eigengene expression revealed that the Salmon, Turquoise, and Greenyellow modules had highly dissimilar expression patterns from each other (Fig. S3). Furthermore, when comparing the GO terms enriched in these three modules, we found minimal overlap in terms enriched in the Greenyellow and Salmon modules (Fig. 4D), suggesting that the genes comprising them have mostly distinct functions. 
Although the Turquoise module has GO terms showing an over 50% overlap with those found in the Salmon and Greenyellow modules, it also has over 1000 unique GO terms, indicating that it contains genes that have functional annotations not represented in either of the other two modules. Thus, based on their eigengene expression and GO enrichment, the three modules identified in the vehicle network are non-redundant.
Table 2
Vehicle network modules with a significant trait association and GO annotation
| ALS vs CTL | sALS1 vs sALS2 | Disease Duration | ALSFRS-R | Rate of Decline | FVC% | Age | Sex |
| Greenyellow | Salmon | Greenyellow | Greenyellow | | Turquoise | Brown | |
| | | | | | | Darkred | |
| | | | | | | Orange | |
Table 3
EH301 network modules with a significant trait association and GO annotation
| ALS vs CTL | sALS1 vs sALS2 | Disease Duration | ALSFRS-R | Rate of Decline | FVC% | Age | Sex |
| Brown | White | Lightpink2 | Darkviolet | Darkviolet | Red | Tan3 | Salmon1 |
| | Indianred1 | Skyblue | Red | Red | Salmon1 | Lavenderblush3 | Navajowhite2 |
| | Brown | | Sienna4 | Magenta4 | Slateblue | Darkred | Slateblue |
| | Firebrick | | Salmon1 | Mediumpurple1 | | | Magenta4 |
| | | | Slateblue | Lavenderblush3 | | | Mediumpurple1 |
| | | | Brown | Lightskyblue4 | | | Darkorange |
| | | | Magenta4 | | | | Mediumorchid4 |
| | | | | | | | Magenta3 |
In the EH301 network, four modules were associated with at least three traits. The Brown module was associated with disease status, disease metabotype, and ALSFRS-R and included genes enriched for cell cycle, chromatin modifications, DNA damage repair, nucleic acid metabolism, and transcriptional activity GO terms (Fig. 4E). The Red module associated with ALSFRS-R, rate of decline, and FVC% and included genes enriched for chemotaxis, antigen processing, and immunity (Fig. 4F). The Slateblue module was associated with ALSFRS-R, FVC%, and sex and included genes enriched for RNA stem-loop and scaffold protein binding (not shown). Lastly, the Magenta4 module was associated with ALSFRS-R, rate of decline, and sex and included genes enriched for Fanconi anemia pathway, a pathway activated by DNA damage (not shown). Clustering of module eigengene expression revealed that Brown, Slateblue, and Magenta4 cluster together, while Red does not (Fig. S3). There was minimal overlap in GO enrichment terms among the four modules in the EH301 network that associate with traits (Fig. 4G). This indicates that, while the gene expression signatures of three of the four trait-associated modules in the EH301 network are similar, all four modules are functionally distinct.
Comparing the vehicle and EH301 networks revealed a striking difference, as the EH301 network included nearly four times the number of modules observed in vehicle treated cells. However, the intramodular connectivity was comparable between the two networks (Table 4), indicating that modules are clustered with a similar robustness in both networks. On the other hand, the total average connectivity and extramodular connectivity were significantly higher in the EH301 network (Table 4). This indicates that individual modules are more highly connected to each other in the EH301 than in the vehicle network. Modules from the two networks were compared based on their components using Fisher’s exact test, and modules were paired if p < 0.05. This analysis revealed that 22/25 vehicle modules have a corresponding module in the EH301 network (Fig. S4). Of these 22 pairs, four significantly associated with only one common trait, while the Greenyellow and Brown pair associated with two traits common to both (disease status and ALSFRS-R). When the GO terms from all modules that significantly associated with a trait in both networks were compared, most terms associated with disease status in the Greenyellow vehicle module were also found in the Brown EH301 module (Fig. 5A), and were related to cell cycle and DNA replication (Fig. 4). This suggests that the genes that correlate with disease status likely share similar functions, regardless of treatment with EH301, but the latter introduces new associations with genes annotated with different functions, including nucleic acid metabolism and transcriptional regulation (Fig. 5A and Fig. 4E). However, for other traits including disease metabotype (Fig. 5B), disease duration (Fig. 5C), and FVC% (Fig. 5E), the overlap between GO terms from the vehicle and EH301 networks was small or absent. GO terms differing between sALS1 and sALS2 in vehicle conditions were related to autophagy and protein catabolism (Fig. 4C), while GO terms after EH301 treatment were related to glycolysis, extracellular matrix organization, cell cycle, and transcription (Supplementary Table 15). Vehicle modules associated with disease duration were enriched for terms related to cell cycle and DNA damage repair, while EH301 modules were enriched for terms related to cell adhesion, membrane polarization, and cation homeostasis (Fig. 4A and Supplementary Table 15). The Turquoise vehicle module associated with FVC% was enriched for terms related to mitochondrial function, innate immunity, cell cycle, and autophagy (Fig. 4B), while the EH301 modules associated with FVC% were enriched for terms including MHC complex assembly and antigen presentation (Fig. 4F) and cell cycle (Fig. 4E, and Supplementary Table 15). This suggests that EH301 modifies gene sets associated with disease traits, and that these genes regulate more diverse functions than those from the vehicle network. Similar to disease status (sALS or control), there was a substantial overlap in GO terms between the Greenyellow vehicle module and Brown EH301 module associated with ALSFRS-R (Fig. 5D), but there were also several modules from the EH301 network with no GO overlap with vehicle modules. These were enriched for terms including sugar alcohol metabolism, RNA and protein binding, MHC complex assembly, and cell adhesion and motility (Supplementary Table 15). Overall, comparison of the two networks revealed conservation of most of the vehicle modules after EH301 treatment, including the functionally similar Greenyellow/Brown pair that associated with disease status and ALSFRS-R. However, a large disparity was found in the functional annotation of the other modules associated with clinical traits in the two networks, indicating that EH301 modulates the expression of gene sets that are significantly correlated with various clinical traits.
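The pairing of vehicle and EH301 modules by Fisher's exact test can be illustrated with a minimal sketch. It assumes a one-sided test for over-representation of shared genes against a common gene universe, which the text does not specify; the gene identifiers are hypothetical.

```python
from math import comb


def fisher_overlap_pvalue(module_a, module_b, n_universe):
    """One-sided Fisher's exact test for over-representation of shared genes.

    Returns the probability of observing at least the actual overlap when
    len(module_b) genes are drawn at random from a universe of n_universe
    genes that contains the genes of module_a.
    """
    a, b = set(module_a), set(module_b)
    overlap = len(a & b)
    largest_possible = min(len(a), len(b))
    tail = sum(
        comb(len(a), i) * comb(n_universe - len(a), len(b) - i)
        for i in range(overlap, largest_possible + 1)
    )
    return tail / comb(n_universe, len(b))


# Hypothetical universe of 20 genes and three toy modules.
vehicle_module = ["g0", "g1", "g2", "g3", "g4", "g5"]
eh301_module = ["g0", "g1", "g2", "g3", "g4", "g10"]
unrelated_module = ["g12", "g13", "g14", "g15", "g16", "g17"]

p_paired = fisher_overlap_pvalue(vehicle_module, eh301_module, 20)
p_unrelated = fisher_overlap_pvalue(vehicle_module, unrelated_module, 20)

# Modules are paired when p < 0.05, as in the text.
is_pair = p_paired < 0.05
```

With thousands of genes per network, the same calculation applies unchanged because Python integers do not overflow; in practice a correction for the number of module pairs tested may also be appropriate.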
Table 4
Vehicle and EH301 network structure comparison parameters
| Parameter | Vehicle Network Average | EH301 Network Average | P-value |
| Total Connectivity | 184.45 | 243.75 | 2.04e-118 |
| Intramodular Connectivity | 140.0 | 144.0 | 0.18 |
| Extramodular Connectivity | 44.0 | 100.0 | 0.00 |
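The three connectivity parameters compared in Table 4 can be sketched under the usual network definitions, which are assumed here rather than taken from the text: a gene's total connectivity is the sum of its adjacencies to all other genes, intramodular connectivity is the part of that sum contributed by genes in the same module, and extramodular connectivity is the remainder. The adjacency values and module labels below are hypothetical.

```python
def connectivity(adjacency, modules):
    """Split each gene's connectivity into intramodular and extramodular parts.

    adjacency: square matrix (list of lists) of pairwise connection strengths,
               ordered like the keys of `modules`.
    modules:   dict mapping gene name to module label.
    """
    genes = list(modules)
    result = {}
    for i, gene in enumerate(genes):
        total = 0.0
        intra = 0.0
        for j, other in enumerate(genes):
            if i == j:
                continue  # a gene's connection to itself is not counted
            total += adjacency[i][j]
            if modules[other] == modules[gene]:
                intra += adjacency[i][j]
        result[gene] = {"total": total, "intra": intra, "extra": total - intra}
    return result


# Hypothetical four-gene network with two modules (made-up adjacencies).
modules = {"g1": "blue", "g2": "blue", "g3": "red", "g4": "red"}
adjacency = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.7],
    [0.2, 0.1, 0.7, 1.0],
]
k = connectivity(adjacency, modules)
```

Under these definitions total connectivity equals the sum of the other two, which is roughly what the averages in Table 4 show (for example, 140.0 + 44.0 against 184.45 in the vehicle network).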
To investigate the transcriptional regulation underlying the effects of disease and EH301 treatment on gene expression, we performed a transcription factor (TF) binding site enrichment analysis on all genes in modules significantly associated with disease traits in both networks. The TF binding sites most enriched in genes associated with disease in both vehicle and EH301 modules were those of the E2F and the Sp families of TFs (Supplementary Table 16). Importantly, enrichment of binding sites of several TFs was found only in disease-associated modules of the EH301 network, indicating that these transcriptional effects are EH301-specific (Supplementary Table 16). These TFs have also been associated with ALS in patients or model systems and include AP-2, FOXO1A [72], SREBP1 [73], MTF-1 [74, 75], and RARB [76].
Lastly, we aimed to identify “hub” genes from important modules, or genes that drive the expression profile of each module, while also correlating significantly with disease traits. Hub genes may be useful as biomarkers for potential classification of patients based on their clinical characteristics and disease severity. For each gene we calculated a significance score, denoting how strongly that gene associates with a trait, and a module membership score, denoting how closely that gene’s expression pattern matches the average module eigengene expression, or how strongly that gene “belongs” to that module [30]. We identified the most inter-connected genes in each module using connection strengths calculated from the topological overlap matrix and visualized with VisANT [77]. For each of the relevant modules in the vehicle network (Fig. S5) and the EH301 network (Fig. S6), we selected the 50 most significant genes for each associated trait, as well as the top 50 most strongly connected genes within the associated module. To illustrate the potential application of this analysis to discover disease biomarkers, we chose five genes from the Greenyellow vehicle module that were significantly associated with ALSFRS-R. To this end, genes were ranked in order of significance for correlation with ALSFRS-R and chosen if they 1) were part of the network hub identified by VisANT and 2) were identified as differentially expressed in the sALS1 vs CTL comparison in vehicle samples. Expression patterns of these five genes (DIRAS3, GTSE1, RRM2, CDCA5, and HJURP) showed clear differences between control and sALS1 samples, and to a lesser extent between control and sALS2 (Fig. 6A). We next examined if the expression of these genes could be used to group samples based on their ALSFRS-R score. Dendrograms were constructed from hierarchical clustering first of ALSFRS-R score and then of average expression of the five genes, for vehicle-treated samples (Fig. 6B, left) and EH301-treated samples (Fig. 6B, right). In both vehicle and EH301 treated fibroblasts, clustering based on ALSFRS-R significantly matched the clustering based on gene expression, indicating that expression of these five selected genes can be used to group samples based on their ALSFRS-R scores.
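The two-criterion selection of candidate genes described above can be sketched as a simple ranking and filtering step. The function name, gene names, and scores are hypothetical placeholders; the sketch only shows the logic of ranking by trait significance and keeping genes that are both hub members and differentially expressed.

```python
def select_candidates(gene_significance, hub_genes, deg_genes, n=5):
    """Rank genes by trait significance, then keep hub genes that are also DEGs.

    gene_significance: dict mapping gene name to its significance score
                       for the trait (higher means stronger association).
    hub_genes:         set of genes in the network hub.
    deg_genes:         set of differentially expressed genes.
    n:                 number of candidates to return.
    """
    ranked = sorted(gene_significance, key=gene_significance.get, reverse=True)
    return [g for g in ranked if g in hub_genes and g in deg_genes][:n]


# Hypothetical scores and gene sets (made-up values).
gene_significance = {
    "geneA": 0.91,
    "geneB": 0.85,
    "geneC": 0.80,
    "geneD": 0.72,
    "geneE": 0.65,
    "geneF": 0.40,
}
hub_genes = {"geneA", "geneB", "geneD", "geneE", "geneF"}
deg_genes = {"geneA", "geneC", "geneD", "geneE"}

candidates = select_candidates(gene_significance, hub_genes, deg_genes, n=2)
```

Here geneB and geneC rank highly but are dropped because each satisfies only one of the two criteria.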
WGCNA of transcriptomic data from ALS iMNs supports and extends fibroblast results
To extend the WGCNA analysis to a cell type affected by the disease, we utilized transcriptomic and clinical data from 124 (99 ALS and 25 control) iMN lines obtained by the Answer ALS project (https://www.answerals.org/). We constructed a new network (Fig. 7A) using expression data from the 22,653 genes that passed quality control filters and identified 38 modules. We next calculated associations between modules and six traits (disease status, baseline ALSFRS-R, most recent ALSFRS-R, ALSFRS-R progression slope, age, and sex) and found 23 modules (60%) with a significant association with one or more traits (Fig. 7B). GO analysis revealed that 20/23 modules had significant enrichment for one or more GO:MF, GO:BP, and/or KEGG pathways (Table 5). Of these, Blue, Magenta, and Tan associated with both disease status and one or more measures of ALSFRS-R. Genes in both the Blue and Tan modules were functionally enriched for terms associated with nucleotide metabolism and several types of protein modifications, and Tan module genes were also enriched for terms involved in the mitochondrial electron transport chain (Fig. 7C, D). Genes in the Magenta module were enriched for pathways related to signal transduction through G-protein coupled receptors (Fig. S7).
Table 5
iMN network modules with a significant trait association and GO annotation
| ALS | ALSFRS-R Baseline | ALSFRS-R Latest | ALSFRS-R Progression | Age | Sex |
| Blue | Cyan | Magenta | Blue | Honeydew1 | Bisque4 |
| Magenta | | Tan | Magenta | Darkgrey | |
| Tan | | Lightgreen | Tan | Bisque4 | |
| | | Maroon | Yellowgreen | | |
| | | Yellowgreen | Lightpink4 | | |
| | | Cyan | Coral1 | | |
| | | Lightpink4 | Darkred | | |
| | | Darkgreen | Sienna3 | | |
| | | Orangered4 | Black | | |
| | | Lightyellow | Purple | | |
| | | | Turquoise | | |
Next, we evaluated which modules in the iMN network contained genes from the Greenyellow module in the vehicle-treated fibroblast network. We found that the majority of the Greenyellow genes were found in the Purple and Turquoise modules in the iMN network, both of which significantly associate with ALSFRS-R progression slope. Accordingly, GO analysis revealed that both Purple and Turquoise genes were enriched for pathways related to cell cycle and development, similar to the pathways identified in the fibroblast Greenyellow module (Fig. 7F-G, Supplementary Table 15). We then evaluated whether the genes associated with traits in the vehicle fibroblast network had similar functional GO annotations to the genes associated with the same traits in the iMN network. There was little overlap in GO enrichment between modules associated with disease status in the fibroblast and iMN networks (Fig. 7H). However, for all traits related to ALSFRS-R, there was a large overlap in the GO terms identified in each network, with approximately 2/3 of the GO terms associated with ALSFRS-R in the fibroblast network also associating with one or more of the ALSFRS-R measures in the iMN network (Fig. 7I). Therefore, genes related to ALSFRS-R, a measure of disease severity, share similar functions, including cell cycle and nucleic acid metabolism, in fibroblasts and iMNs.
Finally, to provide initial proof of concept of the potential predictive value of the five genes related to ALSFRS-R progression in fibroblasts (DIRAS3, GTSE1, RRM2, CDCA5, and HJURP), we used multinomial logistic regression to classify ALS iMN samples based on the expression of these genes. We used a ten-fold cross validation approach, in which we randomly divided the 62 ALS samples with available ALSFRS-R progression slopes into ten sets. In each iteration, one set was used as the test data and the model was trained on the remaining 9 sets. We arbitrarily defined cases as “fast progressors” if their ALSFRS-R progression slope values were one standard deviation or more below the mean. Using this method, we obtained an average accuracy of 78.9% (95% CI 66.0% - 91.6%), precision of 80.1% (95% CI 67.6% - 92.6%), false positive rate of 1.4% (95% CI -1.4% - 4.2%), and false negative rate of 17.0% (95% CI 6.3% - 28.0%), indicating good specificity and fair sensitivity. We then computed receiver operating characteristic (ROC) curves for six of the ten iterations (four iterations were unusable due to the absence of any fast progressors in the test set) and obtained an average area under the curve (AUC) value of 0.767 (95% CI 0.651 - 0.883) (Fig. 7J). This approach represents an example of how genes associated with ALSFRS-R in fibroblasts could be utilized to discriminate patients with fast disease progression relative to all other ALS cases in disease-relevant iMNs.
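The evaluation scheme described above (labeling fast progressors, splitting samples into folds, and scoring predictions) can be sketched as follows. The classifier itself is omitted; the slopes and predictions are hypothetical, and the use of the sample standard deviation is an assumption not stated in the text.

```python
import random
from statistics import mean, stdev


def label_fast_progressors(slopes):
    """Flag samples whose slope is one sample SD or more below the mean."""
    cutoff = mean(slopes) - stdev(slopes)
    return [slope <= cutoff for slope in slopes]


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def classification_metrics(truth, predicted):
    """Accuracy, precision, and false positive/negative rates for boolean labels."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    tn = sum((not t) and (not p) for t, p in zip(truth, predicted))
    fp = sum((not t) and p for t, p in zip(truth, predicted))
    fn = sum(t and (not p) for t, p in zip(truth, predicted))
    return {
        "accuracy": (tp + tn) / len(truth),
        "precision": tp / (tp + fp) if (tp + fp) else None,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else None,
    }


# Hypothetical ALSFRS-R progression slopes for ten samples (made-up numbers).
slopes = [-0.2, -0.3, -0.25, -0.4, -1.8, -0.35, -0.3, -0.15, -2.0, -0.28]
truth = label_fast_progressors(slopes)

folds = k_fold_indices(len(slopes), k=5)
# Folds whose test set contains no fast progressor cannot yield a ROC curve.
folds_without_fast = sum(1 for fold in folds if not any(truth[i] for i in fold))

# Hypothetical predictions that catch only one of the two fast progressors.
predicted = [i == 4 for i in range(len(slopes))]
metrics = classification_metrics(truth, predicted)
```

Because fast progressors are a small minority by construction, some test folds contain none of them, which is consistent with only six of the ten iterations yielding a usable ROC curve.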