Molecular Dynamics simulations of TREM2 variants R47H and R62H show structural alterations at key binding regions leading to possible implications for mechanisms in Alzheimer’s disease.

Abstract

Background
There is strong evidence supporting the association between Alzheimer's disease (AD) and two protein-coding variants, R47H and R62H, in triggering receptor expressed on myeloid cells 2 (TREM2). The TREM2 protein is an immune receptor found in brain microglia. The two variants lie in similar positions in the protein but lead to different functional outcomes. A structural alteration caused by the variants could be having a large effect on the protein. A crystallised structure was used to investigate structural differences: both variants were introduced into the protein structure, and the resulting models were subjected to 300 ns of molecular dynamics (MD) simulation in order to investigate the link between structural change and AD risk.

Results
Results suggest structural alterations in both variant models of TREM2, which could be causing the reduced functional outcome. A large change was noted in the R47H model in the complementarity-determining region two (CDR2) loop, a proposed binding site for ligands such as APOE; a smaller change was observed in the R62H model in this same loop. The overall structure remained stable, possibly accounting for the reduced, rather than absent, function of TREM2 in disease.

Conclusions
These differing levels of structural impact could explain the differences in TREM2-ligand binding observed in vitro when the mutations are present. Further studies of this binding loop could not only improve our understanding of TREM2's role in the onset of dementia but also possibly provide a target for therapeutics.

Background
The World Health Organization (WHO) estimates there will be 50 million dementia sufferers worldwide by 2050, with Alzheimer's disease (AD) being the most common form (1). With no cure or effective treatment options, further research into the disease-causing mechanisms is needed. One route is to use the outcomes of genetic studies, which are continually adding to the list of confirmed Alzheimer-associated genes. Sims et al recently reported the first genome-wide evidence for the coding variant R62H in triggering receptor expressed on myeloid cells 2 (TREM2) (2,3). This and other studies have implicated the immune system and microglial cells in the development and progression of AD (4,5).
The TREM2 gene includes two genome-wide significant coding variants, R62H (odds ratio = 1.67, P = 1.55×10⁻¹⁴ (2)) and R47H (odds ratio = 2.90, P = 2.1×10⁻¹² (5)), both of which are associated with an increased risk of developing AD (4,6-8). Although a number of other variants have been implicated, they are yet to reach genome-wide significance. A greater understanding of these variants and their impact upon the function of TREM2 and its role in immune pathways can help with understanding the development of neurodegenerative disease. TREM2 is an innate immune receptor protein which is expressed on the surface of dendritic cells, macrophages and microglia and has been shown to play an anti-inflammatory role (9). It contains an extracellular V-type immunoglobulin (Ig) domain, a transmembrane domain which associates with the adaptor protein DAP12 for signalling, and a cytoplasmic tail (9,10). Recent studies by Zhao et al have shown wildtype TREM2 to bind directly to amyloid beta (Aβ), with mutant forms of TREM2 showing a reduced rate of binding (11,12). TREM2 has also been reported to bind to several other ligands, including apolipoprotein E (APOE) and apolipoprotein J (APOJ) (13-15). Subtle differences in the protein secondary structure of the R47H variant have previously been reported, though how these may cause a change in binding affinity is not well understood (11).
In order to investigate the structural impact of the R47H and R62H mutations and predict possible loss of function, we carried out an in silico study of the binding domain of the protein containing the mutations. Here we describe the results of this study and, in particular, the similarities and differences between the two models. Results suggest a greater effect on the binding loops by the R47H variant, in keeping with existing literature.

Methods
The immunoglobulin domain of the TREM2 protein has previously been crystallised (10). Both mutations were added to this structure using the modify protein function in the Accelrys software Discovery Studio.
The wildtype protein (WT) and the two mutated structures were subjected to over 300 ns of molecular dynamics (MD) simulation. MD was carried out using the GROMACS (26) software suite with the built-in Amber03 (27) force field parameters. All structures were placed in a cubic box, solvated using TIP3P water molecules and neutralised using Cl⁻ ions. The particle mesh Ewald (PME) method was used to treat long-range electrostatic interactions and a 1.4 nm cut-off was applied to Lennard-Jones interactions. All of the simulations were carried out in the NPT ensemble, with periodic boundary conditions and at a temperature of 310 K. There were three steps to each simulation: (1) energy minimisation, using the steepest descent method and a tolerance of 1000 kJ mol⁻¹ nm⁻¹; (2) a warm-up stage of 25,000 steps with a 0.002 ps time step, during which atoms were restrained to allow the model to settle; and (3) an MD stage run for a total of 300 ns. Root mean square deviation (RMSD) was monitored along with the total energy, pressure and volume of the simulation to check for stability.
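As a rough check on the protocol above, the step counts and simulated times can be worked out in a few lines. This is a minimal sketch, not the authors' input files: the dictionary keys are standard GROMACS .mdp option names, the values are those quoted in the text, and the 2 fs production time step is an assumption (the text states the time step only for the warm-up stage). The thermostat and barostat are not named in the text and are left out; the helper names are hypothetical.

```python
# Values quoted in the Methods. Integer femtoseconds avoid floating-point drift.
TIME_STEP_FS = 2        # 0.002 ps, stated for the warm-up stage
WARMUP_STEPS = 25_000
PRODUCTION_NS = 300     # production length of each simulation


def simulated_time_ps(steps, time_step_fs=TIME_STEP_FS):
    """Simulated time in picoseconds covered by a number of steps."""
    return steps * time_step_fs / 1000


def steps_for_ns(length_ns, time_step_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover length_ns nanoseconds."""
    return length_ns * 1_000_000 // time_step_fs


# Standard GROMACS .mdp option names. Thermostat and barostat settings are
# omitted because the text does not name the algorithms used.
production_mdp = {
    "integrator": "md",
    "dt": TIME_STEP_FS / 1000,              # ps; ASSUMED equal to the warm-up step
    "nsteps": steps_for_ns(PRODUCTION_NS),
    "coulombtype": "PME",                   # particle mesh Ewald electrostatics
    "rvdw": 1.4,                            # nm, Lennard-Jones cut-off
    "ref_t": 310,                           # K
    "pbc": "xyz",                           # periodic boundary conditions
}


def render_mdp(options):
    """Format a dictionary of options as 'key = value' lines."""
    return "\n".join(f"{key:<12} = {value}" for key, value in options.items())


if __name__ == "__main__":
    print(f"warm-up covers {simulated_time_ps(WARMUP_STEPS)} ps")
    print(render_mdp(production_mdp))
```

Under these assumptions the warm-up stage covers 50 ps and each 300 ns production run corresponds to 150 million integration steps.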
Resulting structures were analysed for flexibility using gmx rmsf and for hydrogen bonding using gmx hbond (both available within the GROMACS suite). All proteins were visualised for structural differences using VMD. In addition, the functional effect and stability of the variants were predicted using three online servers: HoPE, SIFT and I-Mutant (16-18,28). HoPE analyses the impact of a mutation, taking into account structural impact and contacts such as possible hydrogen bonding and ionic interactions. The SIFT software predicts tolerated and deleterious SNPs and identifies any impact of amino acid substitution on protein function. Lastly, I-Mutant is a neural-network-based predictor of protein stability changes.
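To make the flexibility measure concrete: gmx rmsf reports the root mean square fluctuation of each atom about its time-averaged position. The pure-Python sketch below computes that quantity for frames that are assumed to be already superimposed on a common reference. The function name and the toy coordinates are hypothetical, and it is an illustration of the quantity, not a replacement for the GROMACS tool.

```python
import math


def rmsf(frames):
    """Per-atom RMSF for a trajectory.

    frames is a list of frames; each frame is a list of (x, y, z) tuples,
    one per atom, in the same atom order. Frames are ASSUMED to be aligned.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [sum(frame[atom][d] for frame in frames) / n_frames
                for d in range(3)]
        # Mean squared displacement from that average position.
        squared = sum(
            sum((frame[atom][d] - mean[d]) ** 2 for d in range(3))
            for frame in frames
        )
        fluctuations.append(math.sqrt(squared / n_frames))
    return fluctuations


if __name__ == "__main__":
    # Toy trajectory: atom 0 is fixed, atom 1 oscillates along x.
    toy = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    ]
    print(rmsf(toy))
```

In the toy trajectory the fixed atom has an RMSF of zero, while the oscillating atom has an RMSF equal to its displacement amplitude.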
Distributions such as RMSD, energy, pressure and volume were tested for normality using the Anderson-Darling test. None were normally distributed, so all statistical differences between the wildtype and mutated simulations were assessed using the Mann-Whitney U test.
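The rank-based comparison described above can be sketched in pure Python. This is a minimal illustration of the Mann-Whitney U test using the normal approximation with a tie correction and no continuity correction. It does not cover the Anderson-Darling step, the function names are hypothetical, and the sample values are toy numbers, not data from this study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Standard deviation of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    sd = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sd == 0:                            # every value identical
        return u_x, 1.0

    z = (u_x - n1 * n2 / 2) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


if __name__ == "__main__":
    wild_type = [0.23, 0.25, 0.22, 0.24, 0.26]   # toy values only
    variant = [0.17, 0.18, 0.16, 0.19, 0.15]     # toy values only
    print(mann_whitney_u(wild_type, variant))
```

For very small samples an exact p-value would be preferable to the normal approximation used here.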

Results
The TREM2 protein has been partly crystallised from a mammalian cell system (10,11). This structure contains the protein's extracellular ligand-binding domain (ECD), which comprises amino acids 19-134, and was used as the basis for the molecular dynamics modelling experiments in this study. The TREM2 ECD is a V-type Ig domain which contains nine β-strands and two short α-helices, all of which are characteristic of an Ig protein domain. Both variants, R47H and R62H, can be found on the TREM2 protein surface and the surface of this Ig domain, on one of the β-strands.

Predicted mutational effects;
The Have your Protein Explained server (HoPE) was run to predict the possible mutational effects of both variants on the protein (16). Results from the server suggest that the wildtype amino acid (arginine) at position 47 forms hydrogen bonds with the amino acids at positions 66 (threonine) and 67 (histidine), which would not occur with the histidine variant in this position. These bonds may be important for protein structural integrity. The wildtype residue is conserved at position 47, though a histidine is observed here in some species returned by the BLAST search. Residue 62, on the other hand, is less well conserved, but histidine is not observed here in any species searched. Both variant amino acids cause a clear loss of charge and size, shown schematically in figure 1. The SIFT online tool was used to predict the tolerance of the two variants in the protein; this does not predict the effect on binding or function, but whether the variants will be tolerated in the protein structure. The results, based on 13 sequences, predict that R47H would be tolerated with a score of 0.06 and R62H with a score of 0.10. A score of <0.05 would be classed as a damaging prediction (17). The I-Mutant server results showed a decrease in stability for both the R47H and R62H variants (18).

Molecular Dynamic Simulations;
Structures containing the variants were constructed as detailed in the Methods section and, along with the original structure, each was run in triplicate for 300 ns. The stability of the simulations was checked, and volume, pressure and root mean square deviation (RMSD) remained stable throughout.

Local structural changes;
Local structural changes were investigated in the three models. The region surrounding both variants, amino acids 43-65, was viewed using VMD; figure 2 shows the starting structure of the wild type (WT) simulation with the CDR loops and variant amino acids highlighted. The R62H variant alters this local structure, with a shift in the beta sheet and a large movement of the random coil (figure 3c). However, the R47H variant does not appear to alter the structure surrounding the amino acid in any way. As well as altering the local structure, the R62H variant also altered the flexibility of the individual residue, i.e. the amount of movement it has. This flexibility was measured using the GROMACS rmsf module. Results show the WT and R47H to have an average flexibility of 0.23 +/- 0.02 and 0.01, respectively, at amino acid 62; the R62H variant, on the other hand, has a reduced flexibility of 0.17 +/- 0.01 (figure 3a). There is also, to a lesser extent, a reduction of flexibility across the neighbouring amino acids which surround the R62H variant. A further change to the local structure seen in the MD simulations is a change in the positioning of the variant amino acid relative to the wildtype: the wildtype residue protrudes from the molecule in both cases, whereas the mutated amino acid is visibly far more buried within the structure (figure 3d-g).

Wider structural changes;
Solvent accessible surface area (SASA) was measured for the whole model and for the individual mutated residues. Overall SASA was reduced from 71 to 70; this small change is not significant and would not be expected to have any effect on protein function. Amino acid specific SASA was measured for the WT and mutated proteins at the 47 and 62 sites. Here a reduction in SASA can be seen at the mutated site in each protein (figure 3b). This correlates with the observed positional change of the amino acid from one protruding from the structure to a more buried position.
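The text does not say which tool or algorithm was used to measure SASA. As an illustration only, the sketch below uses the Shrake-Rupley approach: test points are spread over a sphere of radius (atom radius + probe radius) around each atom, and the exposed area is the fraction of points not falling inside any neighbouring atom's expanded sphere. The atom radii, the 1.4 Å probe radius, the point count and all names are assumptions for the example and are not taken from the study.

```python
import math


def sphere_points(n):
    """Roughly uniform points on a unit sphere (golden-spiral construction)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        ring = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        points.append((ring * math.cos(phi), ring * math.sin(phi), z))
    return points


def sasa_per_atom(coords, radii, probe=1.4, n_points=200):
    """Shrake-Rupley solvent accessible surface area of each atom.

    coords: list of (x, y, z) in angstroms; radii: atom radii in angstroms.
    The probe radius of 1.4 angstroms (a water-sized probe) is an ASSUMPTION.
    Returns areas in square angstroms.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (centre, radius) in enumerate(zip(coords, radii)):
        expanded = radius + probe
        exposed = 0
        for ux, uy, uz in unit:
            px = centre[0] + expanded * ux
            py = centre[1] + expanded * uy
            pz = centre[2] + expanded * uz
            buried = False
            for j, (other, other_radius) in enumerate(zip(coords, radii)):
                if j == i:
                    continue
                limit = other_radius + probe
                dist2 = ((px - other[0]) ** 2 + (py - other[1]) ** 2
                         + (pz - other[2]) ** 2)
                if dist2 < limit * limit:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * expanded ** 2 * exposed / n_points)
    return areas


if __name__ == "__main__":
    isolated = sasa_per_atom([(0.0, 0.0, 0.0)], [1.6])
    pair = sasa_per_atom([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.6, 1.6])
    print(isolated, pair)
```

An isolated atom keeps its full sphere area, whereas an atom with a close neighbour loses the part of its surface that the neighbour covers. This is the same effect as the reduced per-residue SASA reported for the more buried histidine side chains.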
Significant structural alteration can be seen in the CDR2 loop: figure 4 shows that the R47H variant causes a loss of beta sheet and a change in alpha helix position in this binding loop. The effect of the R62H variant is subtler, with a change in loop structure to the left and right of the alpha helix. There is a further effect on the flexibility of the loop (figure 4); here the variants differ, with R47H becoming more flexible and R62H less flexible when compared to the wild type simulation. All three images are taken from the representative structures created from the entire simulations and show the loop in the same position, and therefore should all be identical if no structural change had occurred.

Discussion
Arginine, which is present in the wildtype protein at both positions, is a long, extended amino acid with a side chain of carbons and nitrogens. Histidine, which is the mutated amino acid in both variants, has a ring structure, with less availability for hydrogen bonding. This is further exacerbated by the positioning of the histidine amino acid observed in figure 3. Figure 2 depicts the structure of TREM2 which has been modelled and run through the MD simulations. The complementarity-determining region (CDR) loops, which are suggested to be key for the ligand binding process (19,20), are coloured as follows: red for CDR1, green for CDR2 and purple for CDR3. The two mutated sites are shown in dark blue; both are close to the CDR1 and CDR2 loops, with R47H actually being found in the CDR1 loop.
In this study we present structural findings, and their hypothesised functional effects, for two AD-associated genome-wide significant mutations in TREM2. The R62H variant is more common in the Caucasian population (OR = 1.67; P = 1.55×10⁻¹⁴ (2)), whereas the rarer variant, R47H, carries a far greater risk (OR = 2.90; P = 2.1×10⁻¹² (5)). Both variants are found in exon 2, the region of the gene which is predicted to encode the ligand-binding domain, and both are missense mutations causing a coding change from the wild type arginine to a histidine.
Previous studies have shown that disruptions to the protein in exon 2 are likely to cause either TREM2 signalling problems or a decrease in protein function. The functional impact of both variants has been presented at an in vitro and in vivo level, but here we present a study which aims to identify the structural cause behind the functional alterations. It has been observed that R47H has a greater functional effect than R62H, even though the same amino acid change is involved and the two sites are in very close proximity to each other; why this difference arises is largely unknown (11). In 2014, Abduljaleel et al looked into the possible structural changes within TREM2 by performing MD simulations on the R47H variant. Their simulations ran for just 10 ns, and they presented results which suggested a possible alteration to binding loops and overall stability. This short simulation time may not have been long enough to reveal any large or distal impact; this study builds upon those results and expands their hypotheses (21).
Although tools like SIFT and I-Mutant can suggest an overall tolerance of the variant alongside a local structural shift, MD simulations give us a clearer picture of what is changing in the structure. The overall tolerance of the variants is key: because the protein remains stable and tolerates the mutation, the mutated protein is likely to perform with reduced function, as has been reported in vitro. This is further supported by the simulation results, which suggest no change in the global protein flexibility or SASA.
The local structure shows a greater change, beginning with the positioning and charge of the amino acid. The wildtype residues at both sites are found outstretched from the protein's binding domain, where they could perform key functions in binding; it has been suggested that positive amino acids such as these play an important role in TREM2 (22). The mutated residues are neutral in charge, provide less opportunity for hydrogen bonding and are buried within the binding domain, which alone could affect TREM2's ability to bind to ligands such as APOE. Further to this very local change, both variants are found in the vicinity of the CDR1 and CDR2 binding loops: R47H lies on CDR1 and R62H between the two loops. These, and other putative AD variants, are found on the surface of the protein, where they may affect TREM2's ability to bind and function. Local, as well as global, SASA may be important when considering rates of reactions which require a protein-protein or protein-ligand interaction (23). A change in the SASA of either of these two amino acids, which could be key in the binding process, should therefore be considered a detrimental effect, and our results showed a reduction in SASA at the mutated residue for both models. Sudom et al recently published a paper which showed the mutated R47H protein to contain a remodelled helix in the CDR2 loop, though their crystal structure is missing residues 76-81 (24). Our results also show an altered helix structure in the CDR2 loop, together with a loss of the beta sheet structure, which is replaced by a random coil. A random coil is far more variable, which could explain why they were unable to resolve this region of the protein and why the crystal structure is missing these amino acids.
The TREM2 domain we are investigating also contains three possible N-glycosylation sites, one of which is at position 79. The alteration in structure here could be affecting the ability of TREM2 to undergo translational modification and could explain the altered glycosylation seen in vitro in the R47H mutated form (25).
Park et al recently showed that the R47H mutation in TREM2 resulted in decreased protein stability; based on our models, this may be due to the large alteration in the CDR2 loop structure (25). Another study by Atagi et al presented strong evidence for the binding of TREM2 to APOE and, more interestingly, a lack of binding when the R47H variant was present (15). This is further supported by Yeh et al, who measured a decrease in TREM2's ability to bind CLU/APOJ and APOE when the R47H and R62H variants were present. Their results support the difference in binding loop loss that we see between the two variants, as they observed less of a decrease in binding with the R62H mutation (13). The binding loop degradation we observed may be the key to understanding the functional effect these variants are having on the protein.

Conclusion
The evidence shown here correlates with previous studies which indicate a binding change when the R47H variant is present. We present novel findings which show the R62H mutation to have a structural effect on the same region of the protein, albeit to a lesser extent. This provides insight into, and support for, the studies which show less of a decrease in binding ability with the R62H protein compared to the R47H form. Understanding the structural and functional changes which occur in this AD-associated protein increases our knowledge of the mechanisms behind the processes which cause AD and, as a result, may provide novel drug and therapeutic targets.

Availability of data and material
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
Part-funded by the European Regional Development Fund through the Welsh Government. The funders have no role in the design, analysis or interpretation of any data in this manuscript.

Authors' Contributions
GM designed and carried out the study and wrote the manuscript, RS interpreted the data and edited the manuscript, JW oversaw the study design and edited the manuscript. All authors approved the final manuscript.

Figure 2. Wild type TREM2 starting structure. The structure is depicted in cartoon style with secondary structure colouring. CDR loops are coloured as follows: CDR1 = red, CDR2 = green, CDR3 = purple. The positions of the two mutated sites are coloured in dark blue and shown in full.