All-inclusive analysis of AA-specific BC risk alleles suggests race-group-specific effects
Our overall BC risk assessment was an all-inclusive analysis that included all breast cancer subtypes and SIR/ancestry groups. Having expanded the number of BC cases from Eastern and Western African nations, we investigated previously published BC risk alleles that have been validated among African American women in the AMBER consortium32 (Fig. 1A, Table 2). Three alleles showed significant associations with overall BC risk in our unadjusted models: rs2981578 (FGFR2), rs4849887 (GLI2), and rs3745185 (BABAM1). Interestingly, the T allele of rs2981578 in the FGFR2 gene was associated with increased risk (OR = 1.508, p = 0.008491), which contrasts with previous reports of the C allele as the risk allele. The C allele of rs4849887 in the GLI2 gene was associated with increased risk (OR = 1.654, p = 0.006122), replicating previous findings. We also replicated the protective effect of the A allele of rs3745185 in the BABAM1 gene (OR = 0.67, p = 0.008402).
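As a rough illustration of how an unadjusted allelic odds ratio of the kind reported above can be obtained, the sketch below computes an OR and a Wald 95% confidence interval from a 2x2 table of allele counts. The counts are hypothetical placeholders, not data from this cohort, and the function name `allelic_odds_ratio` is introduced here for illustration only; the software and exact model used in the study are not described in this section.

```python
import math


def allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.959964):
    """Odds ratio and Wald confidence interval from a 2x2 allele-count table.

    case_alt / case_ref: counts of the tested and reference alleles in cases.
    ctrl_alt / ctrl_ref: the same counts in controls.
    z: normal quantile for the interval (1.959964 gives roughly 95%).
    """
    cells = (case_alt, case_ref, ctrl_alt, ctrl_ref)
    if min(cells) <= 0:
        raise ValueError("the Wald interval needs four positive cell counts")
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / count for count in cells))
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, for illustration only (not cohort data).
or_value, ci_low, ci_high = allelic_odds_ratio(120, 80, 100, 100)
print(f"OR = {or_value:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

An OR above 1 whose interval excludes 1 corresponds to an increased-risk association, and an OR below 1 whose interval excludes 1 corresponds to a protective association.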
To determine whether these all-inclusive association models might be confounded by race-specific differences in age or allele frequency, we adjusted the risk model for race and age. Interestingly, each unadjusted risk association lost significance in the combined race group model after this adjustment, indicating that the risk alleles may have a higher frequency in one of the SIR groups (see Table 1). Specifically, the frequency of the risk (C) allele of rs4849887 is 10–15% lower in populations of West African descent (AA = 35%, Ghanaians = 36%) than in European Americans (44%) and East Africans (44%) in our cohort. Two additional alleles gained significant overall BC risk associations after race and age adjustment in our all-inclusive model: rs2981579 in the FGFR2 gene (OR = 1.899, p = 0.03038) and rs3112572 in the LOC643714 gene (OR = 2.410, p = 0.03055).
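The loss of significance after adjustment is the pattern expected when both allele frequency and the proportion of cases differ between groups. A minimal sketch of that effect follows, using a Mantel-Haenszel stratified odds ratio as a simple stand-in for the covariate-adjusted model, which this section does not fully specify. All counts and the group labels `group_A` and `group_B` are invented so that the allele has no effect within either group; they are not study data.

```python
def odds_ratio(a, b, c, d):
    """OR for one 2x2 table: a, b = case alt/ref alleles; c, d = control alt/ref."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR across an iterable of (a, b, c, d) tables."""
    tables = list(strata)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical strata: the allele is common in group_A, which is mostly cases,
# and rarer in group_B, which is mostly controls. Within each group OR = 1.
strata = {
    "group_A": (120, 80, 30, 20),
    "group_B": (20, 30, 80, 120),
}

# Ignoring group membership means summing the tables cell by cell.
pooled = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*pooled)
adjusted_or = mantel_haenszel_or(strata.values())

print(f"crude OR = {crude_or:.3f}, group-adjusted OR = {adjusted_or:.3f}")
```

In this constructed example the crude OR is well above 1 even though the allele has no effect inside either group, which is why an association that disappears after adjustment points to between-group differences in allele frequency rather than to a shared risk effect.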
Next, we tested whether the BC risk associated with our candidate alleles differed among SIR groups by performing a nested overall BC risk assessment within each SIR group (Table 2 and Supplemental Table 1). We found that rs4849887, which lost significance for overall risk in our race-adjusted model, is associated with higher overall BC risk in AAs and Ghanaians, but only in Ghanaians after adjusting for age (OR = 2.472, p = 0.001032) (Fig. 1B). No significant associations were found between the previously identified variants and breast cancer risk among SIR EA in either unadjusted or age-adjusted models (Supplemental Table 1).
AA-specific risk variants are associated with TNBC-specific risk within ancestry groups
The higher rate of TNBC among women of African descent worldwide raises the question of whether there is a shared genetic risk among the African diaspora, and we have previously shown that quantified West African ancestry was strongly associated with TNBC disease31. Using a case-series analysis in our African-enriched cohort, we tested whether previously reported AA-specific risk alleles were associated specifically with TNBC disease risk (Fig. 2A, Table 3 and Supplemental Table 2). Prior to adjusted covariate modeling, five of the nine AA-risk variants showed significant association with TNBC disease risk. Four of these variants had not previously been reported to confer ER-negative disease-specific risk, and four were predicted to have a protective effect: rs2981578 in FGFR2 (OR = 0.667, p = 0.0627), rs3745185 in BABAM1 (OR = 0.503, p = 0.009), rs4849887 in GLI2 (OR = 0.414, p = 0.003), and rs2363956 in ANKLE1 (OR = 0.593, p = 0.0149). Only the SNV rs609275 in MYEOV/CCND1 (OR = 2.479, p = 5.68E-05) showed higher risk for TNBC in the unadjusted model. The ANKLE1 variant rs2363956 replicated the previously reported TNBC/ER-negative-specific protective effect and was the only variant to retain significance after adjusting for race and age (OR = 0.542, p = 0.014).
Similar to our BC case-control analysis, we used a nested risk analysis within SIR groups to test for SIR-specific risk. For the admixed AA population, we included quantified West African ancestry (WAa) in the adjusted covariate modeling. The rs2363956 variant in the ANKLE1 gene retained a protective effect for TNBC in AAs even after covariate adjustment (age- and WAa-adjusted OR = 0.4204, p = 0.005), indicating that this is not a mere artifact of disequilibrium or of a biased distribution of the allele in African populations (Fig. 2B and Table 3). Among Ghanaians, the protective effect was observed in unadjusted models but was lost after age adjustment (unadjusted OR = 0.7904, p = 0.7664; age-adjusted OR = 1.471, p = 0.8163) (Table 3 and Supplemental Table 2).
DARC/ACKR1 alleles in BC and TNBC risk
In addition to the previously implicated AA-risk alleles, we included DARC/ACKR1 alleles, including the TNBC-risk-associated Duffy-null allele31, to investigate whether alternative variants may capture risk due to unique biological contributions of either isoforms or distinct gene regulation (Table 1). Our new analysis found that four DARC SNVs were also significantly associated with overall BC risk in our all-inclusive analysis models (rs2814778, OR = 1.512, p < 0.001; rs17838198, OR = 4.798, p < 0.001; rs3027016, OR = 4.586, p = 0.005; and rs12075, OR = 2.534, p < 0.001). However, after adjusting for age and race, these associations were mostly lost (Table 2). In our SIR nested analysis model, the DARC/ACKR1 variant rs3027013 showed a significant protective effect in EA patients, even after age-adjusted modeling (age-adjusted OR = 0.1314, p = 0.03897) (Fig. 1C and Supplemental Table 1).
For DARC/ACKR1 variant associations with TNBC-specific risk, we similarly observed that seven of eight variants were associated with TNBC disease prior to race/age adjustment; five of the minor alleles showed a protective effect and two showed increased risk (rs6676002, OR = 0.191, p = 0.007; rs3027008, OR = 0.134, p = 0.006; rs17838198, OR = 0.367, p = 0.02; rs3027016, OR = 0.39, p = 0.06; rs12075, OR = 0.38, p = 0.003; rs71782098, OR = 3.403, p = 0.019; and rs2814778, OR = 3.062, p < 0.001) (Table 3). Interestingly, as we previously reported with only AA and EA, the Duffy-null allele, rs2814778, retained a significant TNBC-risk association with the addition of East African and West African samples, even after age and SIR adjustments (OR = 3.814, p = 0.001). The Duffy-null (rs2814778) TNBC-risk association was also retained in our nested SIR analysis among AA, following adjustment for both age and quantified West African ancestry (OR = 3.368, p = 0.007) (Fig. 2C and Table 3). This indicates that the TNBC-specific risk conferred by the Duffy-null allele in the DARC/ACKR1 gene is not an artifact of shared ancestry bias, but rather reflects an ancestry-specific risk allele.
Functional consequences of the TNBC-protective rs2363956 variant in ANKLE1
In our TNBC risk analysis, we found that the minor G allele of the rs2363956 ANKLE1 variant was protective against TNBC disease, as has previously been shown for ER-negative disease among AA32. Given its SIR-specific effect, we investigated the frequency of the allele across global 1000 Genomes (1KG) populations33. The population minor allele frequency (MAF) of the protective G allele is nearly equal among European and African groups (51.5% vs 51.2%, respectively; Table 1). However, among TNBC cases in our ICSBCS cohort, the frequency is much lower in AA patients than in EA patients (14% and 43%, respectively). This 40% drop in the minor allele frequency in TNBC cases among AA (Fig. 3B) explains the apparent protective effect of the minor allele and suggests that the major allele may somehow drive TNBC frequency higher in AAs.
To date, although this variant has been repeatedly reported as a risk allele in both breast and ovarian cancer32,34,35, no investigation has linked its functional impact to risk or survival in this population. Given that the variant causes a dramatic amino acid change from leucine to tryptophan (L184W, Fig. 3A), there is a high probability that the protein structure is affected and that its function is consequently altered. We generated a 3D rendering of the variant, comparing the wildtype structure of the protein, with leucine at position 184, to the minor allele structure, with tryptophan at that position, and found a predicted destabilization of the gene product (Fig. 3A).
The allele’s protective effect through destabilization of ANKLE1 structure, together with its significant loss in AAs, who suffer from higher rates of TNBC, suggests that the major allele ANKLE1 protein could be a genetic driver of TNBC. We hypothesize that wildtype ANKLE1 expression suppresses TNBC progression, which is most frequently found in EA patients when caused by the rs2363956 variant. To further investigate this hypothesis, we determined whether ANKLE1 expression had any impact on survival36. We found that survival trends in TCGA breast cancer cases are significantly affected by ANKLE1 expression, but that the advantage of ANKLE1 expression benefits only EA patients (Fig. 3C-E). Specifically, when comparing high vs low/medium ANKLE1 expression within SIR groups, EA had a significant survival improvement associated with higher expression (p = 0.035), but AA did not (p = 0.83) (Fig. 3C-E). In fact, when including only patients with high ANKLE1 expression, EA had a survival advantage over AA (Fig. 3E, p = 0.052). This suggests that the benefit of ANKLE1, found only in EA, could be due to the 41–53% chance that EA are expressing the polymorphic version of ANKLE1, which harbors the rs2363956 allele.
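For readers who wish to reproduce this kind of two-group survival comparison, the following is a minimal sketch of a log-rank test in plain Python. It assumes that a log-rank test is an appropriate stand-in for the comparison behind the reported p-values; the tool cited above may use a different procedure. The function name `logrank_test`, the follow-up times, and the event indicators are hypothetical and are not TCGA data.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df.

    times_*: follow-up times; events_*: 1 if the event occurred, 0 if censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Numbers still at risk just before time t in each group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("groups never overlap in the risk set; test undefined")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical follow-up data (months) for high vs low/medium expression.
high_times, high_events = [14, 22, 30, 41, 55, 60], [0, 1, 0, 0, 1, 0]
low_times, low_events = [6, 9, 15, 18, 27, 33], [1, 1, 1, 0, 1, 1]

chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")
```

Running the same function separately within each SIR group mirrors the within-group comparisons described above, while comparing EA and AA among high-expression patients only corresponds to the between-group contrast.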