DOI: https://doi.org/10.21203/rs.3.rs-2447725/v1
Glaucoma is a chronic optic neuropathy that leads to irreversible vision loss. Aging and family history are the two most important risk factors of glaucoma. One of the most studied genes involved with the onset of open angle glaucoma is myocilin (MYOC). About 105 germline mutations within MYOC are known to be associated with glaucoma and result in endoplasmic reticulum (ER) stress that leads to trabecular meshwork (TM) cell death and subsequent intraocular pressure (IOP) elevation. However, only about 4% of the population carry those mutations. An analysis of MYOC somatic cancer-associated mutations revealed a notable overlap with pathogenic glaucoma variants. Because TM cells have the potential to accumulate somatic mutations at a rapid rate due to ultraviolet (UV) light exposure, we propose that an accumulation of somatic mutations within MYOC is an important contributor to the onset of glaucoma.
Glaucoma is the leading cause of irreversible vision loss and blindness in the world affecting approximately 68.6 million individuals1, 2. It has been estimated that by the year 2040, 111.8 million people will be affected by this chronic optic neuropathy3, thus emphasizing the importance of studying the underlying mechanisms associated with its pathophysiology.
Aging is one of the major risk factors of glaucoma; everyone ages 60 and older is at high risk of glaucoma1, 2. However, a family history of the disease is also known as a risk factor1, 2. Advances in the field of genetic and genomic research have led to extensive investigations of the genes contributing to glaucoma and resulted in identification of 127 associated genomic regions4–6. The majority of the identified loci are linked with the onset of open angle glaucoma (OAG), whereas eight genes are associated with primary angle closure glaucoma4. The most frequently occurring subtypes of glaucoma are the early-onset juvenile OAG (JOAG) and the late-onset primary OAG (POAG), together accounting for 74% of all reported incidents7. Mutations in the myocilin gene (MYOC) were first identified in families with an autosomal dominant inheritance of JOAG and were linked to the GLC1A locus on chromosome 1q24.3-q25.25, 8–10. Ever since, extensive efforts have been focused on studying myocilin (MYOC), the most associated protein in OAG development11, 12.
MYOC, a trabecular meshwork-inducible glucocorticoid response protein, is expressed in various tissue and cell types including the trabecular meshwork (TM), ciliary body, retina, myocytes, astrocytes, fibroblasts, endothelial cells, and some epithelial cells11, 13, 14. Despite a broad expression profile of MYOC, mutations in the protein are currently known to cause disease only in the eye. Thermodynamically unstable pathologic germline mutations in MYOC lead to its overexpression in TM cells15–18, which initiates a cascade of events that lead to glaucoma. Recent studies have illustrated that increasing the concentration of the transforming growth factor-β2 (TGF-β2) induces MYOC expression in TM cells19, resulting in changes to the extracellular matrix (ECM) structure11. Modulating the ECM of TM cells has been shown to obstruct the aqueous humor outflow and increase the intraocular pressure (IOP), subsequently leading to glaucoma20.
There are 105 known glaucoma-causing MYOC variants (www.myocilin.com), the majority of which localize to the olfactomedin (OLF) domain of MYOC21. Glaucoma-associated OLF variants compromise thermal stability of MYOC22, leading to protein misfolding and a decrease in melting temperature (Tm)23, thus reducing protein folding efficiency22 and promoting the formation of aggregates within the endoplasmic reticulum (ER) of TM cells20. This leads to activation of the unfolded protein response (UPR) pathway and subsequent TM cell death due to elevated ER stress20. TM cells play an essential role in modulating the aqueous humor outflow from the anterior chamber of the eye, whereas death of TM cells has been associated with elevated IOP, retinal ganglion cell death, and irreversible loss of vision through OAG progression24, 25. Inherited germline mutations in MYOC are disease-causing in about 4% of POAG and up to 36% of JOAG cases26, 27, which amount to over three million and up to 29 million affected individuals, respectively. A recent study shows a moderate correlation between the stability of inherited germline MYOC variants and age of glaucoma diagnosis28, which underlines the importance of thermodynamic stability of mutated MYOC in activating the UPR pathway.
Herein we have analyzed germline and somatic MYOC mutations obtained from glaucoma and cancer genomics databases. MYOC genome analysis revealed a remarkable overlap between glaucoma-causing and cancer-associated mutations. Hence, we reason that somatic mutations within MYOC may also contribute to glaucoma because an increase in the frequency of ultraviolet (UV) light exposure to the eye can accelerate accumulation of disease-causing somatic mutations within MYOC of TM cells. This would result in the aging related pathogenic phenotype of glaucoma. As mutated C→T variants account for ≥ 60% of UV-induced mutations29, when combined with the complement G→A variants, MYOC may accumulate a large variety of pathogenic mutations due to UV exposure. Indeed, it has been shown that extended periods of time spent outdoors is associated with a higher risk of developing exfoliation syndrome glaucoma30, underscoring the potential importance of somatic MYOC variants in OAG development. Other studies have ascertained a link between age-related neurodegeneration and an accumulation of somatic mutations in neurons31, 32. Age-associated defects in development and neurogenesis could result from an elevated frequency of somatic mutations due to a higher rate of cell division and an increase in oxidative damage33. Thus, we propose that age-related MYOC somatic mutations are also major contributors to the onset of glaucoma and the mechanisms driving the emergence of glaucoma have both hereditary and crucial environmental components.
The MYOC gene is mapped to chromosome 1q24.3-q25.2 and is comprised of three exons that encode a 58-kD, 504 amino acid polypeptide with a leucine zipper motif within a coiled coil domain at the N-terminus and an OLF domain at the C-terminal fragment34 (Fig. 1a-c). The N-terminal segment includes a signal sequence at amino acid position 1–32, which is cleaved off at the ER11, 35 during protein synthesis. Recent evidence suggests that the homo-oligomerizing coiled coil domains at amino acids 74–102 and 118–180 assemble into a Y-shaped tripartite parallel dimer of dimers structure36. Glaucoma-causing variants within the tetramer modulate the structure without altering stability36. MYOC undergoes proteolytic cleavage between residues R226 and I227, which produces a 35-kD secreted C-terminal OLF domain and a 20-kD N-terminal tetramer fragment with coiled coil domains37. The OLF domain is comprised of amino acids 244–503 and contains two cysteine residues that participate in disulfide bond formation, C245 and C43311, 38.
We obtained 278 known MYOC variants that exhibit Mendelian inheritance from the MYOC allele-specific glaucoma phenotype online database (www.myocilin.com)21; of the 278 MYOC germline mutations, 105 manifest a glaucoma phenotype, while 173 are identified as neutral polymorphisms or mutations with unknown pathogenicity21. Using small cycles, all the mutations are summarized in Fig. 1. The colors of those cycles are defined in the Venn diagram (Fig. 1d). The overlaps between glaucoma-causing and neutral polymorphisms (cyan cycles) are the residue positions where the mutated amino acids are different.
We also found 155 somatic cancer-associated MYOC missense and nonsense mutations acquired from the cancer genomics online database, cBioPortal (www.cbioportal.org)39, 40; the most frequent cancer types associated with these mutations are cutaneous melanoma, glioblastoma multiforme, and uterine endometrioid carcinoma. Notably, 38 glaucoma-causing germline mutations (36%)21 and 38 melanoma somatic variants (83% of total melanoma mutants)39, 40 arise from C→T or G→A nucleotide transitions. There are 11 MYOC somatic mutations that precisely overlap with glaucoma-inducing germline mutations (R91STOP, G246R, L255P, T285M, R296C, A363T, G367R, T377M, D384N, A427T, and R470C). All white circles within the mutation diagrams demonstrate varying amino acid changes at the same position with exact overlaps between either cancer and glaucoma variants or cancer mutations and neutral polymorphisms.
Somatic cancer-associated variants and germline neutral polymorphisms are spread evenly across the entire protein with frequent overlap, glaucoma-causing germline mutations largely localize to the OLF domain of MYOC (Fig. 1a, b). All except one glaucoma and cancer overlapping variants localized to the OLF domain. To investigate the thermodynamic parameters of MYOC and determine the effect of each mutation on protein stability, we utilized the Rosetta energy function and calculated the change in free energy (ΔΔG) of individual cancer and glaucoma mutations41 (online methods) with refinement (beta_nov16_cart), which provided a very efficient sampling of the global minimum41. As expected, stabilizing mutations were predominantly located on the exterior of the MYOC OLF domain. These variants are schematically illustrated as dotted sidechain structures for glaucoma-causing (Fig. 2a) and cancer-associated (Fig. 2b) mutations. Amino acid changes are color-coded based on the Venn diagram additive color model (Fig. 1a). The stable and unstable variants of glaucoma-causing and cancer-linked mutations within the MYOC OLF domain are also shown in Fig. 2c and Fig. 2d, respectively; in the figures, variants with positive and negative ΔΔG values plotted on the bottom and top row, respectively. The ratio of cancer to glaucoma stabilizing variants was significant; 31 cancer-associated and four glaucoma-inducing mutations exhibited negative ΔΔG values. Five of these mutations overlapped with the germline neutral polymorphisms (T293K, T325M, E414K, A447V, and R470H) and one variant (A427T) overlapped with a germline glaucoma-causing mutation observed in older individuals (61 ± 21.1 years of age)21. A427T was previously reported to possess a relatively high Tm (48.3 ± 0.3 °C) compared to other glaucoma-causing mutations23. Additionally, the other three stabilizing glaucoma-inducing mutations, Q297H, E300K, and N450D, also had a late age onset between 60–75 years of age21, 42.
In this study, we present a collection of known somatic and germline MYOC variants and demonstrate that a significant level of overlap exists between the two types of mutations. We propose that an accumulation of pathogenic MYOC somatic mutations within the TM of the eye by means of UV radiation throughout aging can lead to protein aggregation, induced ER stress, TM cell death, and subsequent IOP elevation, which are attributes of the glaucoma phenotype induced by the known germline MYOC glaucoma-causing variants. Particularly, UV-induced somatic C→T or G→A transition mutations within the MYOC coding region that lead to pathogenic MYOC missense or nonsense glaucoma variants are likely to be overexpressed in glaucomatous TM cells. Therefore, age-related glaucoma onset may be influenced by somatic mutation buildup within key genes associated with its pathophysiology.
MYOC mutations can either be detrimental or stabilizing depending on the thermodynamic properties of mutated residues’ biochemical environment. Deleterious mutations, such as MYOC glaucoma-causing variants, reduce the folding efficiency of a protein and can induce protein aggregation, among other harmful responses. Genome editing using clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 technology has been shown to accomplish the alleviation of the glaucomatous phenotype by knocking down the expression of mutant MYOC, reducing ER stress, and ultimately decreasing IOP43. With the advent of genetic modification, it may soon be feasible to reverse the harmful mutations within the genome and prevent the onset of age-related neurodegenerative diseases. On the other hand, stabilizing mutations can enhance mutated protein’s stability, resulting in a product that can sustain its desired native structure to a higher degree. Indeed, many cancer-associated variants that lead to more stable protein products likely have gain-of-function properties, thereby leading to cancer progression. For example, MYOC mutations that were determined to be stabilizing were found in the prostate neuroendocrine, cervical squamous cell, uterine endometrioid, bladder urothelial, and lung squamous cell carcinomas; colon, endocervical, colorectal, prostate, and lung adenocarcinomas; acral and desmoplastic melanomas, uterine carcinosarcoma, anaplastic astrocytoma, and diffuse glioma39, 40. The most stabilizing E253Q variant is associated with endocervical adenocarcinoma, with the next stabilizing A447V mutation linked to colon adenocarcinoma39, 40. Four of the glaucoma-causing mutations are stabilizing MYOC variants (Fig. 2c), and two of them overlap with cancer somatic mutations (Fig. 2d). Those mutations may have an alternative pathway for glaucoma progression and are likely not involved in the activation of the UPR pathway. Further studies on those stabilizing glaucoma-causing variants may explore the possibility of uncovering a secondary pathological mechanism of the multifaceted glaucoma progression.
In summary, our analysis suggests that the accumulation of pathogenic somatic variants within individual cells may also contribute to other age-related diseases, as observed in the development of cancers44. Additionally, glaucoma-causing MYOC stabilizing mutations may contribute to its pathogenesis via an alternative pathway not involved with ER stress induction. Furthermore, stabilizing MYOC cancer-associated variants could be cancer risk factors for those who carry the mutations.
We obtained 278 known MYOC variants that exhibit Mendelian inheritance from the MYOC allele-specific glaucoma phenotype online database (www.myocilin.com); of the 278 MYOC germline mutations, 105 manifest a glaucoma phenotype, while 173 are identified as neutral polymorphisms or mutations with unknown pathogenicity. We also found 155 somatic cancer-associated MYOC missense and nonsense mutations acquired from the cancer genomics online database, cBioPortal (www.cbioportal.org). All the mutations are listed in file Glaucoma-Cancer Myocilin Mutation Correlation – Colorcoded.xlsx (Supplementary Information). We utilized the mutation diagram generated through the cBioPortal for Cancer Genomics MutationMapper and modified it using the Hypertext Markup Language (HTML). The HTML codes we put together for Fig. 1a-1c are included in the Supplementary Information as: MYOC_Figure 1a.docx, MYOC_Figure 1b.docx, and MYOC_Figure 1c.docx.
We obtained the myocilin (MYOC) structure (PDB entry 4WXQ) and modified it using the PyMOL computer software for molecular visualization (Fig. 2a, b)
The most recent Rosetta score function utilizes highly accurate calculation algorithms for energy approximations to a degree that allows for ΔΔG prediction in kilocalories per mole (kcal/mol). Optimization data for the Rosetta energy function for macromolecular modeling and design illustrate a strong correlation between the Rosetta-predicted values and experimental data for ΔΔG calculation (R = 0.994)45, thus we define values below 0 kcal/mol as having stabilizing thermodynamic properties. Initially, we compared our ΔΔG outputs with the Rosetta literature value for the T193V mutation (–4.95 kcal/mol) in the reverse transcriptase-RNaseH-derived peptide bound to HIV-1 protease (PDB entry 1KJG)45 and found that our calculated ΔΔG (–4.12 kcal/mol) using the updated version of the REF2015 score function (beta_nov16_cart) was slightly closer to the experimentally determined value (–1.11 kcal/mol), thereby signifying an improvement in the score function parameters for predicting free energy changes upon an introduction of a mutation.
Python programming language and the PyRosetta program (www.pyrosetta.org) were used for the Rosetta energy calculations. The program’s input is the 4wxs.clean.pdb file, a coordinate file for the 4WXS PDB structure that was cleaned from non-canonical amino acids prior to running the program.
We refined the PDB using a fast relax protocol and the beta_nov16_cart energy score function, using the following initialization parameters:
init(extra_options = "-beta_nov16_cart -in:file:s 4wxs.clean.pdb -packing:use_input_sc -relax:constrain_relax_to_start_coords -in:ignore_unrecognized_res -relax:coord_constrain_sidechains -relax:ramp_constraints false -relax:cartesian -relax:min_type lbfgs_armijo_nonmonotone")
The PyRosetta commands that we’ve implemented in our calculations matched the Rosetta options in the initialization. From this initialization, we calculated the initial PDB score for the relaxed wild type.
After obtaining an initial PDB score for the relaxed wild-type protein, we utilized the mutate_residue() function within PyRosetta to introduce a mutation of interest. We then applied the fast relax on this mutation. The program produced a PDB file and the PDB score for the relaxed mutation. We compared the difference between the PDB scores for the relaxed mutation and the relaxed wild type, and if we ran multiple tests of the relaxed mutation, we also output the average difference between the PDB scores for the mutation and wild type.
After this refinement, we applied a Cartesian version of Rosetta’s ΔΔG protocol on the output PDB files of the relaxed wild type and relaxed mutations, using the following initialization:
init (extra_options = "-beta_nov16_cart -in:file:s input_file -packing:use_input_sc -relax:constrain_relax_to_start_coords -in:ignore_unrecognized_res -relax:coord_constrain_sidechains -relax:ramp_constraints false -relax:cartesian -relax:min_type lbfgs_armijo_nonmonotone -ddg:mut_file -ddg:iterations 5 -max_cycles 200 -relax:min_type lbfgs_armijo_nonmonotone -fa_max_dis 9.0")
We also ran PyRosetta commands to apply a fast relax on the input PDB files. From this initialization, we calculated the initial PDB score for the relaxed wild type applied with ΔΔG protocol. We ran a similar procedure for all the relaxed mutations and compared the difference between the PDB scores for the relaxed mutation and the relaxed wild type, both applied with ΔΔG protocol. If we ran multiple tests of the relaxed mutation, we also output the average difference between the PDB scores for the mutation and wild type.
After running the program in the same directory, all the output PDB files of the relaxed mutations can be found, and the PDB scores can be found in the output text files.
The python program we used, mutation_scorer.py, is included in the Supplementary Information, named python_codes.docx. The program runs on Python3 and has been tested on MacOS and Linux operating systems.
The updated version of the REF2015 algorithm was used for accurate ΔΔG prediction. We ran a PyRosetta implementation that matched the Rosetta functions used by Alford, et al. (2017)45 using the PDB entry 1KJG. They obtained a ΔΔG of -4.95 kcal/mol for the T193V mutation, whereas our value was − 4.12 kcal/mol, which is slightly closer to the experimentally determined value of -1.11 kcal/mol.
To learn more about the nature of these MYOC mutations, we have utilized Rosetta scoring algorithms to predict the free energy change (ΔΔG) of known MYOC mutations45.
We obtained a high resolution (1.9 \(\AA\)) MYOC OLF domain PDB structure (entry 4WXS) with a E396D variant46 from the protein data bank (www.rcsb.org) and performed ΔΔG calculations for each somatic and germline mutation using the most updated Rosetta scoring function with Cartesian refinement (beta_nov16_cart), which provided a very efficient sampling of the global minimum45. Interestingly, stabilizing mutations were predominantly located on the exterior of the MYOC OLF domain. These variants are schematically illustrated as dotted Van der Waals structures for glaucoma-causing (Fig. 2a) and cancer-associated (Fig. 2b) mutations. Similarly to Fig. 1, the codes that generated Fig. 2c and 2d are included Supplementary Information: MYOC_Figure2c.docx and MYOC_Figure2c.docx.
Amino acid changes are color-coded based on the Venn diagram additive color model (Fig. 1d).
We found that the ratio of cancer to glaucoma stabilizing variants was quite large; 31 cancer-associated and four glaucoma-inducing mutations exhibited negative ΔΔG values. The thermodynamically stabilizing cancer-associated variants within the OLF domain are: G244E, E253Q, T281N, D289E, D289N, T293K, T293M, D294N, F299I, E300Q, M308I, K314N, I317T, T325M, V344L, Y371F, E409K, L413I, E414K, T416I, A427T, A427V, I432V, V439I, A447V, T448N, V449I, R470H, Y473F, F487L, and M494V. Five of these mutations overlapped with the germline neutral polymorphisms (T293K, T325M, E414K, A447V, and R470H) and one variant (A427T) overlapped with a germline glaucoma-causing mutation observed in older individuals (61 ± 21.1 years of age)21. A427T was previously reported to possess a relatively high Tm (48.3 ± 0.3 °C) compared to other glaucoma-causing mutations46. Additionally, the other three stabilizing glaucoma-inducing mutations, Q297H, E300K, and N450D, also had a late age of onset between 60–75 years of age21, 47.
G367R, a germline glaucoma-causing variant with an early age of onset (13–23 years of age) and a high maximum IOP (40 mmHg)48 was observed in tumors of patients with esophageal adenocarcinoma. Interestingly, this variant was not stabilizing according to Rosetta calculations with a ΔΔG value of + 4.23 kcal/mol. Several glaucoma pathogenic mutations manifested similar characteristics (i.e., overlap with somatic variants, early age of onset, and destabilizing properties) including G246R, L255P, T285M, R296C, A363T, P370L, T377M, D384N, and R470C, thus recapitulating the likelihood that these pathogenic mutations accumulate within MYOC of TM cells and largely contribute to the onset of glaucoma.
The E253Q and A447V cancer-associated variants were the most stabilizing of all mutations tested with Rosetta ΔΔG values of − 6.93 and − 6.12 kcal/mol, respectively. Interestingly, the A445V and Y473C neutral polymorphisms were also highly stabilizing according to the Rosetta energy calculations with ΔΔG values of − 4.15 and − 5.67 kcal/mol, respectively. The reported literature Tm value for A445V is 54.2 ± 0.2 °C and for Y473C is 54.2 ± 0.1 °C, which compared to the wild-type MYOC Tm (52.2 °C)46 demonstrates that these neutral variants are thermodynamically stabilizing.
To emphasize where the stabilizing somatic (Fig. 2c) and germline (Fig. 2d) variants are located along the mutation diagram, we slightly raised the mutation points above the rest of the respective variants within the OLF domain (Fig. 2c, d). Multiple thermodynamically stabilizing cancer-associated mutations appeared at amino acid positions D289 and A427 within the OLF domain. The stabilizing R470H cancer-linked and neutral variant, illustrated as a white circle with a thicker lining (Fig. 2c), overlaps with a glaucoma-causing and cancer-inducing R470C mutation, which did not exhibit favorable thermodynamic properties.
To investigate ΔΔG parameters of a somatic and neutral germline variant within the N-terminal region of MYOC, we calculated the energy of the R82H mutation that is associated with pancreatic cancer and is among the most frequently mutated cancer-linked variants. We utilized the recently published full length structure of MYOC7 49 to calculate the free energy of the frequently mutated R82H variant located in the coiled-coil region and found it to be exceptionally stabilizing (–9.69 kcal/mol). This observation is reminiscent to the disorder-to-order transition observed with disease-associated mutations within intrinsically disordered regions of proteins50.
Acknowledgements
We thank Drs. Mark Sheffield, For-Shing Lui, Rajendra Ramsamooj, Floyd Culler, and Dean Joseph Silva from the California Northstate University, College of Medicine, and Dr. Glenn Yiu from the Department of Ophthalmology & Vision Science, University of California, Davis for the insightful discussions and helpful comments on the manuscript. We also thank members of Dr. Jie J. Zheng laboratory for providing support and valuable discussions.
Data Availability
The datasets generated in the study and the computer codes created for the study are all included in the Supplementary Information.
The MYCO mutations are listed in the Supplementary Information file named Glaucoma-Cancer Myocilin Mutation Correlation – Colorcoded.xlsx.
The HTML codes used to generate Figures 1a-c, 2c, and d are included in the Supplementary Information with the files named as MYOC_Figure 1a.docx, MYOC_Figure 1b.docx, MYOC_Figure 1c.docx, MYOC_Figure2c.docx, and MYOC_Figure2d.docx.
The python program mutation_scorer.py used for the energy-based analysis is included in the Supplementary Information with the file named python_codes.docx.