The demographic and clinical characteristics of 1,340 invasive breast cancer cases and 675 controls selected from the AABCG consortium and the GBHS are presented in Table 1. The mean age at diagnosis of breast cancer cases was 54.8 years (standard deviation (SD) = 7.7 years), and the mean age of controls was 58.0 years (SD = 6.9 years). Approximately 14.8% of cases and 1.0% of controls reported a family history of breast cancer. Approximately 38.1% of the cases had hormone receptor-negative tumors.
Summary of the identified potential loss-of-function deletions
We identified a total of 80 deletions in the intragenic regions of the 29 established or suspected susceptibility genes (Supplementary Data 4). The median length of these deletions was 405 bp (range, 50 bp to 32.8 kb), and the majority were low-frequency or rare (78.8% with a deletion carrier frequency < 0.01). Of these deletions, 33 were present only in cases (n = 44) and not in controls. In contrast, 16 deletions were present only in controls (n = 17) and not in cases. The majority of these 49 deletions were located in intronic regions (83.7%). Although there was no significant overall case-control difference in the frequency of carriers of the deletions detected in this study, cases were significantly more likely than controls to carry deletions either in coding/exonic regions or in intronic regions with evidence of epigenetic signals (P < 0.05; 2.6% in cases and 1.2% in controls). We also observed that four particular deletions showed a higher frequency in cases than in controls (odds ratio (OR) > 1.5 for each deletion; Supplementary Data 4).
Potential loss-of-function deletions present only in cases
Of the 33 deletions that were present only in cases (n = 44, 3.3% of 1,340 cases), we found five putative pathogenic deletions in the exonic or coding regions of BRCA1, BRCA2, RAD51C, GEN1, and BRIP1, and 28 in the intronic regions of 18 other genes (Table 2; Supplementary Data 4). Several deletions were found in all participating studies (Supplementary Data 5 and 6). No cases carried more than one deletion. Eleven of these deletions were identified in seven established breast cancer susceptibility genes (BRCA1, BRCA2, PTEN, CDH1, NF1, STK11 and CHEK2), and the remaining 22 were identified in 15 putative breast cancer susceptibility genes (Table 2). One of these deletions, located in the PTEN gene, had been reported in our previous study conducted in populations of Asian and European descent 10; in the present study, the deletion was found in nine cases and no controls (P = 0.03) (Supplementary Data 4).
Three cases carried potential loss-of-function deletions in the BRCA1 or BRCA2 genes, accounting for 0.22% of the cases under study (Table 2). Two of these are putative pathogenic deletions located in coding regions: a 3.34 kb deletion in the BRCA1 gene involving three exons (Fig. 1a) and a 32.8 kb deletion in the BRCA2 gene with a loss of nine exons (Fig. 1c). The third deletion (3.41 kb) is located in the first intron of the BRCA1 gene. This deletion may involve functional elements, as suggested by ChIP-seq enriched peaks (Fig. 1b).
Sixteen cases carried potential loss-of-function deletions in five other established breast cancer susceptibility genes (PTEN, CHEK2, NF1, CDH1, and STK11), accounting for 1.2% of the total investigated cases (Table 2). Of these deletions, three (~ 0.9 kb, 7.1 kb, and 0.5 kb) were identified in PTEN, one (~ 3.0 kb) in CHEK2, one (~ 9.8 kb) in NF1, two (~ 3.8 kb and 1.1 kb) in CDH1, and one (~ 0.3 kb) in STK11. All of these deletions can lead to the loss of intronic sequences of these breast cancer susceptibility genes. In addition, the deleted regions are likely to involve potential regulatory elements, as indicated by epigenetic signals such as histone modifications, DNase I hypersensitive sites and ChIP-seq enriched peaks (Supplementary Data 4).
Twenty-five cases carried potential loss-of-function deletions in 15 putative breast cancer susceptibility genes, accounting for 1.9% of the total investigated cases. Of the 22 deletions observed in these genes, three putative pathogenic deletions involve exonic or coding sequences: one (~ 140 bp) in BRIP1, one (~ 82 bp) in GEN1 and one (~ 4.9 kb) in RAD51C (Table 2). The other 19 deletions can lead to the loss of intronic sequences of these breast cancer susceptibility genes (Table 2). Of these, 11 deleted regions are located in nine genes (BMPR1A, POLE, AKT1, RAD51D, MSH2, MSH6, XRCC2, FANCM and FANCC) and may involve regulatory elements supported by epigenetic signals (Supplementary Data 4).
Rare deletions with higher frequency in cases than controls
We identified three potential loss-of-function deletions in breast cancer susceptibility genes, each showing a higher frequency in cases than in controls (Table 2; Supplementary Data 4). The deletion in TP53 (~ 1.6 kb in the coding region) results in a loss of the entire last exon and the 3’ untranslated region (UTR). This deletion was observed in six cases and one control in the present study and has been reported in our previous study 10. The other two deletions were located in the intronic regions of GEN1 and MSH6 and overlapped epigenetic signals (Table 2; Supplementary Data 4).
Taken together, these three potential loss-of-function deletions and those observed only in cases were present in 4.6% of cases (n = 61) and 0.6% of controls (n = 4). Carrying any one of these 36 deletions was associated with an 8.0-fold increased risk of breast cancer (95% CI = 2.94 - 30.41, P = 1.5 × 10⁻⁷) (Table 2). The frequency of deletion carriers differed significantly among studies (e.g. 3.6% and 1.4% for African Americans and Ghanaians, respectively; P = 0.01; Supplementary Data 6).
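For readers who wish to check the arithmetic behind the 8.0-fold estimate, the following is a minimal sketch that derives a crude (unadjusted) odds ratio from the carrier counts reported above (61 of 1,340 cases; 4 of 675 controls). The choice of a two-sided Fisher's exact test is an assumption made for illustration; the statistical test and any covariate adjustment used in the study are not stated in this section, so the P value from this sketch is not expected to match the reported one exactly. The function names are hypothetical.

```python
from math import comb


def odds_ratio(case_carriers, n_cases, control_carriers, n_controls):
    """Crude odds ratio from a 2x2 carrier table (no covariate adjustment).

    Assumes at least one carrier among controls and at least one
    non-carrier among cases; otherwise the ratio is undefined.
    """
    case_noncarriers = n_cases - case_carriers
    control_noncarriers = n_controls - control_carriers
    return (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )


def fisher_exact_two_sided(case_carriers, n_cases, control_carriers, n_controls):
    """Two-sided Fisher's exact P value (illustrative assumption, see lead-in).

    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed table, with the margins held fixed.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    denom = comb(total, carriers)

    def prob(k):
        # Probability that exactly k of the carriers are cases.
        return comb(n_cases, k) * comb(n_controls, carriers - k) / denom

    k_min = max(0, carriers - n_controls)
    k_max = min(carriers, n_cases)
    p_obs = prob(case_carriers)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Carrier counts taken from the text: 61/1,340 cases and 4/675 controls.
crude_or = odds_ratio(61, 1340, 4, 675)
p_value = fisher_exact_two_sided(61, 1340, 4, 675)
print(f"crude OR = {crude_or:.2f}")
print(f"two-sided Fisher exact P = {p_value:.2e}")
```

The crude odds ratio from these counts rounds to 8.0, consistent with the reported estimate. The sketch does not attempt to reproduce the reported confidence interval, since the interval method is not described here.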
A low-frequency deletion associated with breast cancer risk
We identified a low-frequency deletion (~ 72 bp) in the intronic region of NF1 that was associated with breast cancer risk (OR = 1.93, 95% CI = 1.14 - 3.42, P = 0.01) (Table 3). The deleted region may involve regulatory elements, as suggested by epigenetic signals including a ChIP-seq enriched peak (Supplementary Data 4). This deletion was observed in 5.3% of cases and 2.8% of controls.