Heat Shock Protein Grp78/BiP/HspA5 Binds Directly to TDP-43 and Mitigates Toxicity Associated with Neurodegenerative Disease Pathology

Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease with no cure or effective treatment in which TAR DNA Binding Protein of 43 kDa (TDP-43) abnormally accumulates into misfolded protein aggregates in affected neurons. It is widely accepted that protein misfolding and aggregation promote proteotoxic stress. The molecular chaperones are the body's primary line of defense against proteotoxic stress, and there has been long-standing interest in understanding the relationship between chaperones and aggregated protein in ALS. Of particular interest is the heat shock protein of 70 kDa (Hsp70) family of chaperones; however, defining which of the 13 human Hsp70 isoforms is critical for ALS has presented many challenges. To gain insight into the specific Hsp70 that modulates TDP-43, we investigated the relationship between TDP-43 and the Hsp70s using proximity-dependent biotin identification (BioID) and discovered several Hsp70 isoforms associated with TDP-43 in the nucleus, raising the possibility of an interaction with native TDP-43. We further found that HspA5 bound specifically to the RNA-binding domain of TDP-43 using recombinantly expressed proteins. HspA5 is increased in prefrontal cortex neurons of ALS patients. Finally, overexpression of HspA5 in Drosophila rescued TDP-43-induced toxicity, suggesting that upregulation of HspA5 may have a compensatory role in ALS pathobiology.

Cell lines were verified for fusion-protein expression and proper localization using immunofluorescence (IF) and western blot (WB). The stable cell lines were maintained in 5.0% CO2 at 37°C in DMEM/F12 1:1 (HyClone) supplemented with 10% fetal bovine serum (FBS). All cells were tested monthly for mycoplasma contamination.


INTRODUCTION
Proteostasis is the proper equilibrium between the biogenesis, folding, trafficking and degradation of proteins within the cellular milieu 1. Any interference in proteostasis leads to accumulation of misfolded proteins, a central pathological hallmark of several neurodegenerative diseases including Alzheimer's disease and amyotrophic lateral sclerosis (ALS) 2,3. In over 95% of ALS patients, TAR DNA-binding protein of 43 kDa (TDP-43) is mislocalized from the nucleus to the cytoplasm, where it is found misfolded and aggregated in affected neurons and glia 1,2. TDP-43 pathology has been observed across several neurodegenerative disorders including frontotemporal degeneration (FTD), Alzheimer's disease, and limbic-predominant age-related TDP-43 encephalopathy (LATE) [4][5][6][7]. Although the causative factors that lead to TDP-43 aggregation are still not fully understood, studies implicate proteostasis mechanisms such as impaired autophagy and the ubiquitin proteasome system (UPS) 8,9 as well as compromised endolysosomal function [10][11][12]. TDP-43, a DNA/RNA-binding protein, consists of a folded N-terminal domain (NTD) linked by a flexible loop to two tandem RNA recognition motifs (RRM1 and RRM2) and a predominantly unfolded C-terminal prion-like domain that harbors the majority of disease-associated mutations in ALS 13. TDP-43 functions primarily in RNA metabolism, including splicing, translation and the cytoplasmic stress granule response 14. Thus, in ALS, TDP-43 aggregation leads to a loss-of-function effect on TDP-43-controlled pathways as well as a dysregulation of proteostasis 15,16. Central to proteostasis are the chaperones, a large family of proteins that typically bind to exposed hydrophobic sequences to assist in protein folding, degradation and the clearance of aggregated protein 17,18.
One major chaperone subfamily is the evolutionarily conserved Hsp70s, which consist of 13 gene products (HspA1A, HspA1B, HspA1L, HspA2, HspA5, HspA6, HspA7, HspA8, HspA9, HspA12A, HspA12B, HspA13 and HspA14) 19,20. The canonical Hsp70 proteins share high sequence identity and have diverse cellular localizations and functions 19. All canonical Hsp70 proteins have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD) that allosterically communicate in an ATP-dependent manner to recognize and bind client proteins 21.
Typically, high levels of Hsp70 can be produced by cells in response to hyperthermia, oxidative stress, changes in pH, chemical disruption of proteostasis 22 and expression of disordered proteins [23][24][25] .
Intriguingly, in motor neurons, the primary cells affected in ALS, there appears to be an incomplete Hsp stress response, as inferred from the lack of Hsp70 upregulation in response to several stress paradigms 26,27 .
Moreover, overexpression of chaperones, including Hsp70s, reduced TDP-43 aggregate formation 28 and injection of recombinant human Hsp70 was effective in improving motor defects as well as increasing lifespan of a superoxide dismutase type 1 (SOD1) mouse model of ALS 29 . Collectively, these findings may partially explain why strategies to boost Hsp70 have been touted as neuroprotective in neurodegenerative diseases, particularly ALS. In support of this, Arimoclomol, a co-inducer of heat shock protein expression, has been under investigation in a clinical trial for ALS patients but recently failed in phase II/III (Clinicaltrials.gov identifier NCT03491462). Arimoclomol is known to prolong heat shock factor 1 (HSF1) binding to the heat shock element (HSE) localized in the promoter of inducible Hsp70 isoforms, and induced the expression of a certain subset of heat shock proteins in neuronal cell lines 30 . This might indicate that only a precise Hsp70 isoform subset is able to mitigate ALS toxicity.
It is still unclear how and which Hsp70 isoforms regulate TDP-43. Previous studies demonstrated that at least three Hsp70 isoforms immunoprecipitate with TDP-43: HspA1A, HspA5 and HspA8 31. It was later hypothesized that Hsp70s could be constitutively bound to TDP-43. Upon a heat shock event, as misfolded proteins accumulate, Hsp70 could be released from its interaction with TDP-43, which could thereby promote the formation of TDP-43 aggregates 32. More recently, it was shown that in cells, several Hsp70 isoforms accumulate within mutated TDP-43 phase-separated anisosomes (anisotropic intranuclear liquid spherical shells) 33. To date, potential direct binding between the Hsp70 isoforms and TDP-43 has not been investigated. Here, we interrogated the association of TDP-43 with Hsp70 isoforms using BioID, a technique that leverages the activity of a promiscuous biotin ligase to biotinylate proteins based on proximity 34. We found that HspA5 and HspA8 were enriched in the nuclear, but not cytoplasmic, fraction of TDP-43. We further tested direct binding of TDP-43 with the Hsp70 isoforms HspA1A, HspA5 and HspA8 and found that the TDP-43 RRM domains selectively bind HspA5. We showed an upregulation of HspA5 in neurons of the prefrontal cortex of ALS patients compared to healthy controls. Finally, we discovered that upregulation of the HspA5 homologue (Hsc70.3 in Drosophila melanogaster) protects against TDP-43-induced degeneration in Drosophila, while the ATP binding-deficient Hsc70.3 K97S variant 35 had no effect. Our data underscore Hsp70 isoform preference by TDP-43 and thus position induction of HspA5 binding to TDP-43 as a novel therapeutic strategy for mitigating TDP-43 toxicity.

BioID identifies Hsp70 networks binding to TDP-43 in the nucleus.
To characterize nuclear versus cytoplasmic localization as well as possible Hsp70 isoform specificity of TDP-43, we performed proximity-dependent biotin labeling (BioID) of TDP-43 in the nucleus or the cytoplasm. BioID2 was fused to the N-terminal domain of TDP-43, and either a 3x tandem nuclear localization signal (3xNLS) or a nuclear export signal (NES) was added to localize TDP-43 to the nucleus or cytoplasm, respectively. BioID2-3xNLS-TDP43, BioID2-NES-TDP43 or the BioID2 control were stably expressed in human neuroblastoma SH-SY5Y cells and their localization was verified using immunofluorescence (Fig. 1A). It is worth noting that while BioID2-NES-TDP43 mostly localized to the cytoplasm, some marginal nuclear localization was observed and is likely due to the intrinsic NLS of TDP-43. Cells expressing each TDP-43 variant or BioID2-only as a control were lysed for BioID pulldown in triplicate, and affinity capture of biotinylated proteins was confirmed via western blot (Fig. S1).
Biotinylated proteins identified via mass spectrometry (MS) were ranked by label-free quantification (LFQ) intensity, enrichment compared to the BioID2-only control, and the number of replicates (N) in which each protein was identified. Using a threshold of at least 3-fold enrichment over control and N ≥ 2, 144 nuclear and 28 cytoplasmic interaction candidates for TDP-43 were identified (Fig. 1B, C, Table S1). "Highest confidence associations" were proteins found only in the BioID2-3xNLS-TDP43 or BioID2-NES-TDP43 samples, and not at all in the control BioID samples; these were ranked by LFQ intensity (Fig. 1B). "Good confidence associations" were proteins enriched at least 3-fold over control, ranked by the experimental:control intensity ratio (Fig. 1C).
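The two-tier selection described above can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the `ProteinHit` record, the `classify` and `rank` helpers, the placeholder protein names and all intensity values are hypothetical, and the sketch assumes that the N ≥ 2 replicate threshold applies to both confidence tiers.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    name: str            # placeholder identifier, not a real result
    bait_lfq: float      # LFQ intensity in the BioID2-TDP-43 sample
    control_lfq: float   # LFQ intensity in the BioID2-only control (0.0 if absent)
    n_replicates: int    # replicates (out of 3) in which the protein was identified


def classify(hit, min_fold=3.0, min_replicates=2):
    """Return the confidence tier for one protein (assumed rules, see lead-in)."""
    if hit.n_replicates < min_replicates or hit.bait_lfq <= 0:
        return "rejected"
    if hit.control_lfq == 0:
        return "highest confidence"      # never seen in the control
    if hit.bait_lfq / hit.control_lfq >= min_fold:
        return "good confidence"         # at least 3-fold over control
    return "rejected"


def rank(hits):
    """Group accepted hits by tier and order each tier as described in the text."""
    tiers = {"highest confidence": [], "good confidence": []}
    for hit in hits:
        tier = classify(hit)
        if tier in tiers:
            tiers[tier].append(hit)
    # Highest confidence: ranked by LFQ intensity.
    tiers["highest confidence"].sort(key=lambda h: h.bait_lfq, reverse=True)
    # Good confidence: ranked by experimental:control intensity ratio.
    tiers["good confidence"].sort(
        key=lambda h: h.bait_lfq / h.control_lfq, reverse=True
    )
    return tiers


# Hypothetical values chosen only to exercise each branch.
example_hits = [
    ProteinHit("protein_A", 8.0e7, 0.0, 3),    # absent from control
    ProteinHit("protein_B", 9.0e6, 2.0e6, 2),  # 4.5-fold enriched
    ProteinHit("protein_C", 5.0e6, 4.0e6, 3),  # 1.25-fold, below threshold
    ProteinHit("protein_D", 6.0e7, 0.0, 1),    # seen in a single replicate
]
ranked = rank(example_hits)
for tier, members in ranked.items():
    print(tier, [h.name for h in members])
```

Keeping the thresholds as function arguments makes it easy to see how the candidate lists would change under a stricter fold-enrichment or replicate cutoff.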
Surprisingly, HspA5 and HspA8 were found as highest confidence and good confidence associations, respectively, in the nuclear TDP-43 sample (BioID2-3xNLS-TDP43) (Fig. 1B, C). No Hsp70 isoform was identified in the cytoplasmic TDP-43 sample (BioID2-NES-TDP43), hinting toward an absence of such an interaction with TDP-43 in the cytoplasm without stress (Table S1). HspA8 is well described for its implication in nuclear import of client proteins as it shuttles between the cytoplasm and nucleus 36. Although HspA5 is mostly known for its ER localization, several studies have shown the presence of HspA5 in the nucleus 33,37,38, including in SH-SY5Y cells 39. Thus, our data suggest that in SH-SY5Y cells and in the absence of stress, HspA5 and HspA8 selectively associate with nuclear, but not cytoplasmic, TDP-43.
We thus first predicted where Hsp70 could bind to TDP-43 using LIMBO, a position-specific algorithm for identifying Hsp70 binding sites in proteins 40 (Fig. S2A). While predicted binding sites were noted in the NTD and RRM domains, the algorithm did not predict any Hsp70 binding sites in the prion-like domain (Fig. S2B, C). Thus, our computational predictions suggest that Hsp70 does not bind to the unstructured C-terminal domain but to the N-terminal domain and to the RRMs.
The SBD of these Hsp70 isoforms is approximately 200 amino acids long and is composed of a two-layered twisted β-sheet and a C-terminal α-helical subdomain. The SBD and its binding to the client peptide are allosterically modulated by the ATP binding site. However, binding of ATP to the TDP-43 RRM domains has also been shown to enhance the stability of TDP-43 41; we reasoned that this may inhibit Hsp70 isoform binding, and therefore opted for an Hsp70 construct that lacked the N-terminal nucleotide binding site but retained the ability to recognize client peptides.
All three Hsp70 isoforms bound TDP-43(1-102) with a similar affinity, calculated to be in the high nanomolar to low micromolar range (Fig. 2A, B). There was a small but significant difference in binding affinity between HspA1A and HspA5 for TDP-43(1-102) (p = 0.0242, Fig. 2B).

HspA5 binds TDP-43 RRM2 at the interface with RNA
Spurred by the selective binding of the RRM region of TDP-43 (TDP-43(109-260)) to HspA5, we set out to experimentally map, at greater resolution, potential HspA5 binding sites within TDP-43. To do this we synthesized a peptide-binding array of 15-mer peptides with an overlap of 5 amino acids that spanned the RRM region of TDP-43. The peptide-binding array was incubated with HspA5-SBD protein, and peptide binding was detected using an antibody directed against HspA5 (Fig. 3A). HspA5 bound to several TDP-43 peptides in RRM1 (noted in red in Fig. 3A) and in RRM2 (highest binding peptide shown in orange in Fig. 3A). Some C-terminal TDP-43 peptides also bound to HspA5, but this could be because these C-terminal peptides (e.g., peptide 70) contain several glutamine (Q) and asparagine (N) residues, typical of prion-like domains. There was good concordance between our computationally predicted sites (Fig. S2B, C) and peptides in the RRM1 and RRM2 domains of TDP-43 that bound HspA5.
We next mapped these potential HspA5-binding regions on TDP-43(102-269) in the context of the three-dimensional, folded structure of TDP-43. We calculated the surface accessibility of the TDP-43 peptides bound by Hsp70 and mapped the peptide sequences onto the known structure of the TDP-43 RRM domains complexed with (UG)6 RNA (PDB code: 4bs2 42) (Fig. 3). Notably, all of the TDP-43 peptides bound by HspA5 in the NTD and RRM domains have partial surface accessibility (Fig. 3B). Moreover, they have relatively low dynamics in the NMR structures and include secondary structural elements (helix for the accessible peptide in RRM1, strand for the accessible peptide in RRM2) (Fig. 3C). Given that HspA5 binds TDP-43(102-269), these data suggest that (i) HspA5 might recognize only a portion of the peptide, sufficient for initiating binding, and (ii) there might be structural elements at play in the HspA5/TDP-43 interaction.
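To make the array design concrete, the following Python sketch tiles a sequence into overlapping peptides. It is an illustration under stated assumptions rather than the layout actually synthesized: the `tile_peptides` helper is hypothetical, "an overlap of 5 amino acids" is read as consecutive 15-mers sharing 5 residues (a step of 10), and the input is a toy alphabet string rather than the TDP-43 sequence.

```python
def tile_peptides(sequence, length=15, overlap=5):
    """Split `sequence` into fixed-length peptides sharing `overlap` residues.

    Returns a list of (start_position, peptide) tuples with 1-based start
    positions. Trailing residues that do not fill a full step after the last
    window are not covered by an extra peptide in this sketch.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be >= 0 and smaller than length")
    step = length - overlap  # assumed reading of "overlap of 5 amino acids"
    peptides = []
    for start in range(0, max(len(sequence) - length, 0) + 1, step):
        peptides.append((start + 1, sequence[start:start + length]))
    return peptides


# Toy 25-residue sequence (letters A to Y), not a real protein.
toy_sequence = "ABCDEFGHIJKLMNOPQRSTUVWXY"
for position, peptide in tile_peptides(toy_sequence):
    print(position, peptide)
```

If the array instead used a shift of 5 residues between neighbouring peptides, the same helper could be called with `overlap=10`.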

HspA5 is increased in the cytoplasm in human ALS tissue
Considering the interaction between TDP-43 and HspA5, as well as the mislocalization of TDP-43 in ALS, we next asked whether the distribution of HspA5 is affected in ALS patients. To do this we immunolabelled HspA5 in paraffin-embedded sections of layer V of the frontal cortex from 2 control patients and 2 sporadic ALS patients. In control patients, HspA5 exhibits a granular, cytoplasmic pattern within neurons (Fig. 4A). Although the general distribution of HspA5 (granular and cytoplasmic) remains unchanged in the neurons of the frontal cortex of the 2 ALS patients, the intensity of HspA5 immunoreactivity was increased (Fig. 4B). Our data suggest that, in the neurons of the frontal cortex of ALS patients, the HspA5 protein is upregulated in the cytoplasm.

Upregulation of the HspA5 Drosophila homolog mitigates TDP-43 disease-associated toxicity.

Discussion
Targeting the molecular chaperone pathway is a potential therapeutic strategy in neurodegenerative disorders such as ALS. Arimoclomol, a compound that increases Hsp70 proteins as well as other Hsp chaperones 50
We have further demonstrated that upregulation of the HspA5 homologue in Drosophila protects against disease-associated toxicity of TDP-43. HspA5 has also been implicated in regulating the toxicity and aggregation of the ALS-causing protein superoxide dismutase 1 (SOD1). For example, knock-in mice expressing HspA5 that lacks the ER retention signal, KDEL, display age-related motor problems, loss of motor neurons and aggregation of wild-type SOD1 66. Moreover, the neuronal pathology caused by expression of mutant SOD1 (SOD1-G93A) was exacerbated in mice deficient in the HspA5 co-factor SIL1, while SIL1 overexpression induced significant neuroprotection related to improved ER proteostasis and reduced SOD1 aggregation 67. It is worth noting that previous work showed that in Drosophila downregulation of tankyrase 1 and tankyrase 2 (Tnks-1/2), which physically interact with TDP-43, reduces TDP-
HspA5, among other proteins, being upregulated [72][73][74]. It is thus also possible that HspA5 interacts with and helps to properly fold TDP-43 in the cytosol and mitochondria during ER stress, and thus HspA5 becomes depleted in the ER, leading to accumulation of newly synthesized TDP-43 in the rough ER.
Finally, it is worth noting that Arimoclomol is a co-activator that prolongs the binding of activated HSF1 to heat shock elements in the promoter region of many chaperones including Hsp70 family members.
Notably, HspA5 expression is not under the control of HSF1 20,75-77. Moreover, Arimoclomol was shown to induce the expression of only HspA6 and HspA1A in human SH-SY5Y cells 30. The failure of Arimoclomol in the phase II/III clinical trial for ALS patients might be partly explained by a lack of specific Hsp70 isoform targeting.
Overall, the observations in this study suggest that upregulation of HspA5 in ALS may have a compensatory role, prolonging the survival of neurons by preventing TDP-43 misfolding and subsequent toxicity. Elucidating the stimuli and the underlying cellular mechanisms that control HspA5 binding to TDP-43 will provide the platform for investigating HspA5 as a potential therapeutic target in TDP-43-associated disease.

Materials
All reagents were purchased from Sigma (St. Louis, MO, USA) and Fisher Scientific (Hampton, NH, USA) unless otherwise indicated. TDP-43(102-269) and TDP-43(1-102) were obtained as previously described 31,32. The TDP-43 expression strain was described previously 37,38
were imaged with a Leica Z16 Apo A microscope, DFC420 camera and 2.0x planapochromatic objective.
For paraffin sections, fly heads were fixed, processed and quantified as previously described 78. Paraffin sections of 8 µm were cut and mounted onto glass slides. Three sections per head were imaged at the same anatomical position, and the retinal width and vacuolization were quantified using ImageJ software.
GraphPad Prism 6 was used to determine statistical significance.

Drosophila Immunoblotting.
Immunoblotting was performed as previously described 55
Horseradish peroxidase (HRP)-coupled secondary antibodies made up in TBST were goat anti-rabbit-HRP (1 in 5,000; EMD Millipore #AP307P) and goat anti-mouse-HRP (1 in 10,000; Abcam, ab6789). All experiments were carried out on three or more biological replicates, blots were quantified with ImageJ 79, and statistical analysis was carried out using GraphPad Prism 6 software.

LC-MS/MS analysis.
Prior to LC-MS/MS analysis, dried peptides were reconstituted with 2% ACN, 0.1% FA, and concentration was determined using a NanoDrop™ spectrophotometer (Thermo Fisher Scientific). Samples were then analyzed by LC-MS/MS using a Proxeon EASY-nanoLC system (Thermo Fisher Scientific) coupled to a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). Peptides were separated using an analytical C18 Aurora column (75 µm × 250 mm, 1.6 µm particles; IonOpticks) at a flow rate of 300 nL/min (60 °C) using a 120-min gradient: 1% to 5% B in 1 min, 6% to 23% B in 72 min, 23% to 34% B in 45 min, and 34% to 48% B in 2 min (A = 0.1% FA; B = 80% ACN, 0.1% FA). The mass spectrometer was operated in positive data-dependent acquisition mode. MS1 spectra were measured in the Orbitrap over a mass-to-charge (m/z) range of 350-1700 with a resolution of 70,000 at m/z 400. The automatic gain control (AGC) target was set to 1 × 10^6 with a maximum injection time of 100 ms. Up to 12 MS2 spectra per duty cycle were triggered, fragmented by HCD, and acquired with a resolution of 17,500, an AGC target of 5 × 10^4, an isolation window of 1.6 m/z and a normalized collision energy of 25. The dynamic exclusion was set to 20 seconds with a 10 ppm mass tolerance around the precursor.
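As a quick consistency check on the gradient described above, the Python sketch below encodes the four stated segments and confirms that they sum to the stated 120 min. The `GRADIENT` list and `percent_b_at` helper are hypothetical names, and the sketch assumes each segment is a linear ramp, which the text does not state explicitly.

```python
# Each tuple is (start %B, end %B, duration in minutes), as listed in the text.
GRADIENT = [
    (1.0, 5.0, 1.0),
    (6.0, 23.0, 72.0),
    (23.0, 34.0, 45.0),
    (34.0, 48.0, 2.0),
]

total_minutes = sum(duration for _, _, duration in GRADIENT)


def percent_b_at(t_min, segments=GRADIENT):
    """Return %B at time `t_min`, assuming a linear ramp within each segment."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start_b, end_b, duration in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start_b + (end_b - start_b) * fraction
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")


print(total_minutes)        # total gradient length in minutes
print(percent_b_at(0.5))    # halfway through the first segment
print(percent_b_at(120.0))  # end of the final segment
```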

MS Data Analysis.
All mass spectra were analyzed with MaxQuant software version 1.6.11.0. MS/MS spectra were searched against the Homo sapiens UniProt protein sequence database (downloaded in January 2020) and GPM cRAP sequences (commonly known protein contaminants). Precursor mass tolerance was set to 20 ppm for the first search, where initial mass recalibration was completed, and 4.5 ppm for the main search. Product ions were searched with a mass tolerance of 0.5 Da. The maximum precursor ion charge state used for searching was 7. Carbamidomethylation of cysteine was searched as a fixed modification, while oxidation of methionine and acetylation of protein N-termini were searched as variable modifications. Enzyme was set to trypsin in specific mode, and a maximum of two missed cleavages was allowed for searching. The target-decoy-based false discovery rate (FDR) filter for spectrum and protein identification was set to 1%. Interaction candidates were those proteins enriched at least 3-fold over control samples (BioID2-only) and identified in at least two of the three experimental replicates (N ≥ 2).
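For readers less familiar with ppm tolerances, the following Python sketch converts a relative tolerance into an absolute m/z window using the standard definition (tolerance = m/z × ppm / 10^6). The `ppm_window` helper and the example m/z value are illustrative only and are not taken from the dataset.

```python
def ppm_window(mz, ppm):
    """Return the (lower, upper) m/z bounds for a symmetric ppm tolerance."""
    delta = mz * ppm / 1e6  # absolute tolerance in m/z units
    return mz - delta, mz + delta


# Illustrative precursor at m/z 1000 with the two search tolerances from the text.
for tolerance_ppm in (20.0, 4.5):
    lower, upper = ppm_window(1000.0, tolerance_ppm)
    print(f"{tolerance_ppm} ppm -> {lower:.4f} to {upper:.4f}")
```

Because the tolerance is relative, the absolute window widens with increasing m/z, which is why precursor tolerances are usually quoted in ppm while the fixed 0.5 Da product-ion tolerance is quoted in absolute units.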

Immunohistochemistry.
Samples from the frontal cortex and spinal cord of ALS and control patients were obtained from the University of Michigan Brain Bank (Table 1). Consent for autopsy was obtained in accordance with guidelines from the University of Michigan Brain Bank, which reviewed and confirmed that protocols met the criteria for human-subjects research.