Musculoskeletal symptoms are a common AI treatment-emergent toxicity that often leads to treatment discontinuation4–7. Prior studies, primarily using candidate gene approaches, have reported several genetic variants that affect AIMSS risk14–19. Additionally, a genome-wide association study (GWAS) identified variants in TCL1A that may predict AIMSS20; however, this finding has not been successfully validated in other cohorts20, 21. Our primary GWAS identified two new variants (rs79048288 and rs912571) that were associated with AIMSS risk in the ELPh cohort.
In the primary GWAS analysis, the T allele of the intronic rs79048288 variant within CCDC148 was associated with a 4.4-fold higher AIMSS risk and the G allele of the intergenic rs912571 upstream of PPP1R14C was associated with a 3.3-fold lower AIMSS risk. In silico analyses did not reveal any obvious functional impact of the intronic variant within CCDC148 (rs79048288), and little is known about the physiological function of this protein, except that dysregulated expression has been reported in several cancer types38, 39. In silico analyses revealed that the G allele of rs912571 was associated with higher expression of PPP1R14C in sun-exposed skin, indicating that this variant could be functionally consequential. This gene encodes a signal-transducing protein phosphatase, also referred to as KEPI, that is an inhibitor of myosin phosphatase and regulates smooth muscle contraction, providing further suggestive evidence that this variant could be associated with AIMSS40–42.
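The fold changes quoted above follow directly from the hazard ratios in Table 2: a hazard ratio below 1 is reported as a "fold lower" risk by taking its reciprocal. The short sketch below illustrates that arithmetic only; the helper name `fold_change` is introduced here for illustration and is not part of the study's analysis code.

```python
def fold_change(hazard_ratio):
    """Express a hazard ratio as a (direction, fold-change) pair.

    HR >= 1 is reported as-is ("higher"); HR < 1 is reported as
    its reciprocal ("lower"). Illustrative helper, not study code.
    """
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    if hazard_ratio >= 1:
        return ("higher", hazard_ratio)
    return ("lower", 1.0 / hazard_ratio)


# Hazard ratios for the two primary GWAS variants, taken from Table 2.
table2_hr = {"rs79048288": 4.42, "rs912571": 0.30}

for variant, hr in table2_hr.items():
    direction, magnitude = fold_change(hr)
    print(f"{variant}: {magnitude:.1f}-fold {direction} AIMSS risk")
```

Rounded to one decimal place, this gives the 4.4-fold higher and 3.3-fold lower risks stated in the text.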
Two additional variants of interest, rs1324052 and rs74418677, were found to be associated with AIMSS in the letrozole-only analysis. In silico analyses suggest that rs74418677 is a regulatory variant that affects expression of SUPT20H, also referred to as P38IP, a protein that is known to be involved in cell cycle regulation and cellular autophagy43, 44. Intriguingly, a nonsense variant (p.Lys25X) in this gene was identified as the likely causal variant in hereditary rheumatoid arthritis45. We speculate that patients with subclinical arthritis-like conditions are at increased risk of clinically overt musculoskeletal pain when administered letrozole, similar to the identification of hereditary neuropathy genes as predictors of taxane-induced neuropathy46. However, we are not aware of any prior studies that have investigated or reported an association for these or other polymorphisms in these genes (i.e., CCDC148, PPP1R14C, or SUPT20H) with AIMSS or any other AI treatment outcome.
Our attempted replication of variants previously reported to be associated with AIMSS found only two significant associations in our combined cohort, both of which have been previously reported in ELPh. The association between rs9322336 in ESR1 and AIMSS risk in exemestane-treated patients was previously reported by our group in 2013.23 Interestingly, another group recently reported that this variant was also associated with lower AIMSS risk in an independent cohort of 196 patients treated with letrozole or anastrozole.18 While these two studies provide consistent and suggestive evidence of association, the association between rs9322336 and AIMSS risk awaits validation in additional studies. In fact, such a validation was recently attempted in the racially diverse ECOG E1Z11 cohort of anastrozole-treated patients using rs2347868, which is modestly correlated with rs9322336 (linkage disequilibrium R2 of 0.31), and no association was detected for this variant or any of the other nine variants tested.47 The other association for RANKL (rs7984870) was previously reported by our group in 201922 and was itself an attempt to replicate a previously reported association by Wang et al.48 This variant was not included in the E1Z11 analysis and was not successfully replicated in our recent analysis of an independent cohort of 143 AI-treated women49. Taken together, there is only weak evidence that any of these candidate variants in CYP17A114, CYP19A114–16, ESR117, 18, or TCL1A20, 21 are associated with AIMSS risk.
A genetic biomarker of AIMSS could be useful in clinical decision making. Patients carrying a variant that increases risk of AIMSS for all AIs may be candidates for enhanced toxicity monitoring50 or evidence-based interventions such as exercise, yoga, duloxetine, and acupuncture51–53. The identification of a genetic variant that increases risk for only one or two of the AIs would be even more clinically useful. These patients could be switched within the AI class, since all third-generation AIs are similarly effective54 and switching within the drug class can improve treatment tolerability and persistence.4 This study indicates that none of the previously reported genetic biomarkers are sufficiently robust for clinical use. The variants identified in our GWAS, particularly the rs74418677 variant that may increase risk of letrozole-induced musculoskeletal toxicity through expression of SUPT20H, should be prioritized for future validation studies, e.g., within the E1Z11 cohort.47 Convincing evidence of clinical validity is warranted prior to further investigation of the causal mechanism for these candidate variants and is necessary before any efforts to use these variants for clinical decision making.
This study had several strengths, including the use of a hypothesis-agnostic genome-wide association approach in a large, prospectively accrued cohort of patients with a well-documented clinical outcome. However, there were also some limitations that should be considered. Though the ELPh cohort is similar in size to a prior AIMSS GWAS20, pharmacogenetic sample sizes like these are still orders of magnitude smaller than those used in disease genetics GWAS55–57, which limits power to detect associations with smaller effect sizes or for uncommon variants (i.e., MAF<0.025). In addition, we were not able to attempt validation of an AIMSS polygenic risk score reported by another group due to a lack of information in their publication with which to recapitulate their 70-variant signature.58 Finally, although we were not in a position to functionally characterize these variants in preclinical models, whether rs74418677 plays a role in regulating SUPT20H expression should be investigated.
In conclusion, we identified several new variants that were associated with AIMSS risk in our cohort of AI-treated patients, including rs912571 (PPP1R14C) and rs74418677 (SUPT20H), that should be prioritized for attempted replication in independent cohorts of AI-treated patients. Successful validation of these associations is necessary prior to prospective studies that use genetic biomarkers to inform clinical decision making to reduce AIMSS and enhance AI treatment persistence to improve clinical outcomes in patients with HR+ breast cancer.
Table 1
Characteristics of breast cancer patients by AI treatment
| Letrozole (n=199) | Exemestane (n=201) |
Age at enrollment (years) | 60 (13.5) | 58 (11) |
Body mass index (kg/m2) | 28.9 (8.9) | 28.8 (7.0) |
Prior taxane chemotherapy treatment | 68 (34%) | 66 (33%) |
Pre-treatment pain score on visual analog scale | 2.0 (3.6) | 2.3 (3.8) |
Discontinuation of AI due to AIMSS | 44 (22%) | 56 (28%) |
Discontinuation of AI due to other reasons | 22 (11%) | 35 (17%) |
Time to discontinuation of AI due to AIMSS (months) | 9.0 (8.3) | 5.9 (6.4) |
AI=aromatase inhibitor; AIMSS=AI-induced musculoskeletal symptoms. Data are median (interquartile range) or number (percentage). |
Table 2
Variants associated with time to discontinuation of AI treatment due to AI-induced musculoskeletal symptoms
Treatment group | Variant | Chromosome | Positiona | Nearest gene | Allelesb | EAFc | r2 | HR (95% CI)d | P-value |
Combined | rs79048288 | 2 | 159271033 | CCDC148 | C>T | 0.026 | 0.95 | 4.42 (2.67-7.33) | 7.69x10−9 |
Combined | rs912571 | 6 | 150440290 | Intergenic | C>G | 0.93 | 0.92 | 0.30 (0.20-0.47) | 4.74x10−8 |
Letrozole | rs1324052 | 13 | 37841344 | Intergenic | G>A | 0.091 | 0.91 | 5.91 (3.16-11.06) | 2.80x10−8 |
Letrozole | rs74418677 | 13 | 37846201 | Intergenic | G>C | 0.091 | 0.91 | 5.91 (3.16-11.06) | 2.80x10−8 |
AI=aromatase inhibitor; EAF=effect allele frequency; r2=imputation r-squared; HR=hazard ratio; CI=confidence interval |
aPosition based on genome build 37. |
bEffect allele is second allele. |
cEAF in treatment group. |
dHazard ratio based on Cox proportional hazards model assuming additive genetic effects and adjusted for age (under 55 years), baseline pain score on visual analog scale, prior taxane chemotherapy treatment, and (for combined treatment group) drug (exemestane). |
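The model described in footnote d can be written out explicitly. The expression below is a sketch of the standard Cox proportional hazards form under the covariates listed in that footnote; the symbols ($G$, $\beta_G$, $\beta_1$ to $\beta_4$) are notation introduced here for illustration and do not appear in the source.

```latex
h(t \mid G, \mathbf{x}) = h_0(t)\,
  \exp\!\big(\beta_G\, G
  + \beta_1\, \mathbb{1}[\text{age} < 55]
  + \beta_2\, \text{pain}_{\text{VAS}}
  + \beta_3\, \text{taxane}
  + \beta_4\, \text{exemestane}\big),
\qquad \mathrm{HR} = e^{\beta_G}
```

Here $G$ is the number of copies of the effect allele (0 to 2, or an imputed dosage), and the exemestane term applies only to the combined treatment group. Under this additive coding, the reported HR is the multiplicative change in hazard per copy of the effect allele.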
Table 3
Candidate variants associated with time to discontinuation of AI treatment due to AI-induced musculoskeletal symptoms
Treatment group | Variant | Chromosome | Positiona | Nearest gene | Allelesb | EAFc | r2 | HR (95% CI)d | P-value |
Combined | rs912571 | 6 | 150440290 | ESR1 | C>G | 0.765 | 0.98 | 0.61 (0.44-0.83) | 0.002 |
Combined | rs9322336 | 6 | 152200430 | ESR1 | C>T | 0.774 | 1.00 | 0.64 (0.46-0.88) | 0.007 |
Exemestane | rs9322336 | 6 | 152200430 | ESR1 | C>T | 0.769 | 1.00 | 0.61 (0.39-0.99) | 0.033 |
Exemestane | rs2347868 | 6 | 152251568 | ESR1 | T>C | 0.775 | 0.98 | 0.58 (0.38-0.88) | 0.011 |
Combined | rs7984879 | 13 | 43146482 | RANKL | G>C | 0.447 | 1.00 | 1.42 (1.06-1.89) | 0.018 |
Exemestane | rs7984879 | 13 | 43146482 | RANKL | G>C | 0.460 | 1.00 | 1.61 (1.07-2.42) | 0.022 |
Exemestane | rs2369049 | 14 | 96171851 | TCL1A | A>G | 0.157 | 1.00 | 0.53 (0.29-1.00) | <0.05 |
Exemestane | rs11849538 | 14 | 96175978 | TCL1A | C>G | 0.145 | 1.00 | 0.50 (0.25-0.98) | 0.042 |
AI=aromatase inhibitor; EAF=effect allele frequency; r2=imputation r-squared; HR=hazard ratio; CI=confidence interval |
aPosition based on genome build 37. |
bEffect allele is second allele. |
cEAF in treatment group. |
dHazard ratio based on Cox proportional hazards model assuming additive genetic effects and adjusted for age (under 55 years), baseline pain score on visual analog scale, prior taxane chemotherapy treatment, and (for combined treatment group) drug (exemestane). |