Breast cancer in Western Africa: A pilot molecular analysis of BRCA genes in early-onset breast cancer patients in Burkina Faso



Abstract
Background: Breast cancer (BC) is the most commonly diagnosed cancer and the second leading cause of cancer-related deaths, after cervical cancer, among women in Africa. Although the epidemiological data are now aligned with those of industrialized countries, knowledge of breast cancer in Africa, and in Western Africa in particular, still lacks clinical data, information on medical treatments, and an evaluation of the genetic and non-genetic factors implicated in the etiology of the disease.
The early onset and aggressiveness of breast cancers diagnosed in patients of African ancestry strongly suggest that genetic risk factors play an important role. To date, however, very few studies have addressed the impact of germline mutations on breast cancer in Africa, which has negatively affected prevention, awareness and patient management.
Using next generation sequencing (NGS), we analyzed all coding regions and exon-intron junctions of BRCA1 and BRCA2, the two most important genes in hereditary breast cancer, in fifty-one women from Burkina Faso with early-onset breast cancer, with or without a family history.
Results: We identified six different pathogenic mutations (3 in BRCA1, 3 in BRCA2), two of which were recurrent, in 8 unrelated women.
In addition, in 4 other patients we identified two variants of uncertain clinical significance (VUS) and two variants never previously described in the literature, one of which is nonetheless present in the dbSNP database.
Conclusions: The present study is the first in which the entire coding sequence of the BRCA genes has been analyzed by next generation sequencing in young Burkinabe women with breast cancer.
Our data support the importance of genetic risk factors in the etiology of breast cancer in this population and suggest the need to improve genetic cancer risk assessment. Furthermore, identifying the most frequent BRCA1 and BRCA2 mutations in the population of Burkina Faso will allow the development of an inexpensive genetic test to identify subjects at high genetic cancer risk, which could be used to design personalized therapeutic protocols.

Background
The rate of new cancer cases is increasing worldwide [1]. In 2018, the Global Cancer Observatory estimated that, by 2030, 24 million people worldwide will develop cancer and 13 million people will die annually from cancer, with 75% of these deaths in low- and middle-income countries [2,3]. In these countries, the trend in the epidemiology of breast cancer is converging with that observed in high-income countries [4]. In Africa, and specifically in Western Africa, which includes Burkina Faso, breast cancer is the second most frequently diagnosed cancer in women [4,5]. However, the lack of resources for preventive screening and of access to quality healthcare leads to significant delays in breast cancer detection, contributing to a high mortality rate in these countries [6,7]. Moreover, the genetics of breast cancer in African countries remains largely uncertain [8]. Although the early age of onset and the aggressiveness of the disease suggest a strong hereditary-familial component, very few studies have attempted to fill this knowledge gap [8,9,10,11].
To better assess the genetic cancer risk in this population, we used next generation sequencing (NGS) to analyze the coding regions and exon-intron junctions of the two main breast cancer susceptibility genes [12], BRCA1 [13] and BRCA2 [14], in 51 women from Burkina Faso affected by breast cancer.
To the best of our knowledge, this is the first work to analyze, with an NGS approach, the entire coding sequence of the BRCA1 and BRCA2 genes in Burkinabe women with breast cancer. Identifying and characterizing the most recurrent mutations in these patients will allow the development of genetic tests based on a population-specific mutation panel, thus decreasing the cost of genetic testing. This in turn will make preventive screening possible in high-risk populations.
Moreover, the early identification of women carrying germline pathogenic or likely pathogenic variants in BRCA1 or BRCA2 will make it possible to set up an appropriate diagnostic-therapeutic path to improve the overall survival rate of breast cancer patients.

Results
A total of 51 African women from Burkina Faso affected by breast cancer were selected through genetic counseling at the CERBA/LABIOGENE laboratory of the University of Ouagadougou (Burkina Faso) for genetic testing of the BRCA1 and BRCA2 genes. The molecular analyses were carried out at the Medical Genetics Laboratory of the University of Rome Tor Vergata.
All patients were younger than 40 years at diagnosis. The mean age at diagnosis was 34.8 ± 4.14 years. Ninety-four percent (94%) of patients had invasive ductal breast carcinoma, and only 24% (12/51) reported a family history of breast cancer. Overall, among the 51 patients analyzed, we found 8 carriers of a pathogenic variant (16%), 2 carriers of a variant of uncertain significance (VUS) (4%) and 2 carriers of a novel, previously undescribed variant (4%) (Fig. 1). The median age of patients carrying pathogenic variants was lower (33.25 ± 3.77 years) than that of patients carrying only benign variants (35.16 ± 4.16 years), although this difference was not statistically significant (p value = 0.25).
In total, 6 pathogenic variants were identified (4 found in a single patient each and 2 recurrent in more than one patient). Three variants were in BRCA1 and 3 in BRCA2. BRCA1 mutation carriers had a mean age of 32.4 ± 3.78 years, and BRCA2 mutation carriers a mean age of 34.67 ± 4.04 years. The mutation prevalence in our cohort was 15.7% (95% CI: 5.7-25.7%) for the two BRCA genes combined: 9.8% (95% CI: 1.6-18.0%) for BRCA1 and 5.9% (95% CI: 0.2-11.4%) for BRCA2. The mutations included missense, nonsense, small deletion and intronic variants (Table 1). In BRCA1 we identified 2 recurrent pathogenic variants in 4 unrelated patients. Specifically, 2 patients carried the same BRCA1 mutation, c.4088C>G (p.Ser1363*), a nonsense mutation that replaces the serine at amino acid position 1363 with a stop codon. These patients had undifferentiated and ductal breast carcinoma, diagnosed at 37 and 34 years of age, respectively, and both had a family history of breast cancer. Two other patients shared the intronic BRCA1 mutation c.4986+6T>C (LRG_292t1:c.4986+6T>C). This variant has a severe impact on splicing because it activates a downstream cryptic splice donor site, resulting in an aberrant RNA transcript and a truncated protein. Both patients had ductal breast carcinoma, diagnosed at the early ages of 28 and 29 years, respectively; one reported a family history of cancer, while the other did not. We also identified 2 frameshift mutations in BRCA2, c.6445_6446del (p.Ile2149*) and c.6757_6758del (p.Leu2253Phefs*7), and 1 frameshift mutation in BRCA1, c.5177_5180del (p.Arg1726Lysfs*3). Two of the frameshift variants were found in women with no family history of breast cancer. The only missense mutation identified, c.8009C>T (p.Ser2670Leu), was in BRCA2; it was present in a 37-year-old woman with medullary breast carcinoma and a family history of breast cancer. Interestingly, the frequencies of the pathogenic variants detected in our cohort differ significantly from those listed in the GnomAD database for the African population (p-value < 0.05), and two variants, c.6445_6446del (p.Ile2149*) and c.8009C>T (p.Ser2670Leu), are not present in GnomAD at all (Table 1).
In addition to the pathogenic alleles, we identified two missense variants of uncertain significance (VUS), one in BRCA1 (c.5348T>C, p.Met1783Thr) and one in BRCA2 (c.7504C>T, p.Arg2502Cys), with allelic frequencies not statistically different from those listed in the GnomAD database for the African population (p-value > 0.05) (Table 1). Both variants lie in a functional domain of the respective protein: Met1783Thr is in the BRCT2 domain of BRCA1, while Arg2502Cys is in the BRCA2 helical domain. To predict the potential impact of these variants on the protein we used two tools, Mutation Taster and PolyPhen-2. The in silico analysis predicted a damaging role for the BRCA1 variant (Mutation Taster: disease causing; PolyPhen-2: probably damaging, with a score of 1.000); moreover, alignment of the BRCA1 protein sequence with its orthologs showed that the wild-type residue is moderately conserved across species, implying a role for this residue in protein function. The in silico analyses for the BRCA2 variant Arg2502Cys predicted a benign effect on the protein (Mutation Taster: polymorphism; PolyPhen-2: benign, with a score of 0.022). In this case, alignment of the BRCA2 protein sequence with its orthologs showed that the wild-type residue is poorly conserved among species, implying that it has no relevant functional or structural role in the protein.
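The prevalence estimates reported in this section can be checked with a few lines of arithmetic. The sketch below is only illustrative: it takes the carrier counts from the text (8 overall, 5 in BRCA1, 3 in BRCA2, out of 51) and computes both a normal-approximation (Wald) interval and an exact (Clopper-Pearson) interval. Which interval the original analysis used is an assumption here; the Wald formula happens to reproduce the combined and BRCA1 intervals quoted above, while for 3/51 its lower bound falls below zero, which is one reason exact intervals are often preferred for small counts. All function names are hypothetical.

```python
import math


def wald_ci(k, n, z=1.96):
    """Normal-approximation (Wald) interval for the proportion k/n."""
    p = k / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target, increasing, steps=100):
    """Solve f(p) = target on [0, 1] for a monotone function f."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson_ci(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) interval for the proportion k/n."""
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


# Carrier counts taken from the text: 8 overall, 5 BRCA1, 3 BRCA2, out of 51.
for label, carriers in [("BRCA1/2", 8), ("BRCA1", 5), ("BRCA2", 3)]:
    w_lo, w_hi = wald_ci(carriers, 51)
    e_lo, e_hi = clopper_pearson_ci(carriers, 51)
    print(f"{label}: {100 * carriers / 51:.1f}% "
          f"Wald {100 * w_lo:.1f} to {100 * w_hi:.1f}%, "
          f"exact {100 * e_lo:.1f} to {100 * e_hi:.1f}%")
```

The exact interval is found by bisection on the binomial tail probabilities so that the sketch needs only the standard library.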

Discussion
In Africa, including Sub-Saharan Africa (SSA), breast cancer (BC) is the most frequently diagnosed cancer in women, with an incidence that has increased in the last few years and a five-year survival rate lower than in industrialized countries [16]. Epidemiological studies would be very useful to better identify risk factors and understand the genetic variability among African populations. However, few data are available to date, which significantly affects the accuracy of diagnosis and the clinical management of patients with African ancestry.
The prevalence of mutations in BRCA genes is not yet well defined in Western Africa. Data are scarce for most countries, and the results of the few studies carried out do not give a true picture of the specific mutations in each country. The prevalence and spectrum of germline mutations in the BRCA1 and BRCA2 genes are certainly better delineated in European and North American populations [17]. For many of these populations, recurrent and founder mutations have been identified, which has allowed the development of targeted genetic tests that, being less expensive, can be used more easily for population screening [18]. One example is the Ashkenazi Jewish population, in which identified founder mutations have long been used as the first genetic screening test for women of Jewish descent [19]. Founder mutations have also been identified in different European and Asian populations, while for West Africa only one mutation in BRCA1 has been identified as a potential founder mutation [20]. These examples suggest that specific mutation panels can be developed for specific populations. It is therefore important to identify and characterize the recurrent mutations in African populations [11].
In this study, we determined the prevalence of mutations in the BRCA genes in a cohort of young Burkinabe women with breast cancer. Interestingly, the frequency of all identified pathogenic variants, some of which are present in more than one patient, was statistically different from that reported in the GnomAD database for the African population, and two variants are not present in the database at all. This result suggests that the identified variants could be population-specific and therefore extremely important for genetic testing strategies. In addition to the pathogenic variants, we identified 2 variants of uncertain clinical significance (VUS) and 2 variants never described in the literature, only one of which has previously been reported, and only in the dbSNP database. The two VUS are in functional domains; the in silico analyses predicted a damaging role for the one in BRCA1 and a benign effect on the protein for the one in BRCA2. Both variants have an allelic frequency not statistically different from that listed in the GnomAD database for the African population, suggesting that they are unlikely to contribute to disease risk on their own. However, the role of these variants should be investigated further using multifactorial models and functional studies [21]. To overcome the difficulties in classifying BRCA1/2 VUS, the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium was established in 2009 [22]. The International Agency for Research on Cancer (IARC) Working Group, in collaboration with ENIGMA, has developed a five-level multifactorial model to classify VUS identified in BRCA genes, based on the segregation of the variant in families, the co-occurrence with previously identified pathogenic mutations and tumor histopathology, combined with an analysis of the sequence conservation and properties of the mutated residues [23].
In our study, we could not carry out segregation analysis, because DNA from other family members was not available, nor functional studies; the only analyses performed were those using the in silico tools.
The two novel variants identified were in the BRCA1 gene. One is a nonsense variant that, in accordance with ACMG and ENIGMA criteria, we classified as likely pathogenic; the other is a missense variant that needs further investigation. Finally, 17 patients who tested negative for BRCA1/2 mutations (those with better quality DNA) were screened for large genomic rearrangements (LGRs) in the BRCA genes by multiplex ligation-dependent probe amplification (MLPA). No genomic rearrangements were identified in any of the patients analyzed. This finding agrees with the results obtained by Zhang J et al. [24], which showed that genomic rearrangements do not contribute significantly to BRCA-associated risk in the Nigerian population.
Our study has limitations. One is certainly the low number of patients analyzed; another is that only the BRCA genes were screened. To date, the majority of African BC genetic studies concern only the molecular analysis of BRCA genes, which leads to an incomplete characterization of BC genetic risk factors. Our next step will be to analyze patients who tested negative for BRCA1/2 mutations with a larger panel that includes other important breast cancer susceptibility genes, in order to obtain a more complete picture of genetic risk factors in Africa, as has been done in European and American countries [25,26].

Conclusions
Socio-economic conditions that lead to a weak healthcare system, the lack of health insurance, limited access to drugs and therapies, and the lack of genetic tests certainly have a strong impact on the high mortality and incidence rates of breast cancer in African countries. We believe that our study, although conducted on a limited number of patients, makes an important contribution to knowledge of the genetic risk factors of BC in Western Africa. This knowledge will support cancer prevention programs and will allow Burkinabe women at high risk of breast cancer to be included in appropriate diagnostic-therapeutic programs. Ultimately, it will help reduce the disparities that still exist between being a breast cancer patient in a low-income country and in a high-income country.

Patients' recruitment
This was a prospective cohort study conducted from August 1st, 2015 to February 29th, 2016. It consisted of a genetic analysis of histopathologically confirmed breast cancer cases among women younger than 40 years at the University Hospital Center Yalgado OUEDRAOGO (CHU-YO). Approval was obtained from the Ethics Committee for Health Research of Burkina Faso (N° 2014-8-098). After written informed consent was obtained from each patient, clinical, paraclinical and therapeutic data were collected in the General Surgery, Gynecology-Obstetrics, Oncology and Anatomy-Pathology departments of the CHU-YO. Women younger than 40 years with histologically confirmed breast cancer who attended these departments and gave their free and written consent to participate were included in the present study.
The parameters studied for each patient were epidemiological and socio-demographic characteristics (age, sex, occupation, weight, height, level of education, origin) and clinical data (antecedents, consultation time, reasons for consultation, symptoms and physical signs).
Data were collected through interview, physical examination, and investigations. For each patient, the tumor was classified according to the cTNM (clinical Tumor-Nodes-Metastasis) and pTNM (p = pathologic) classifications of breast cancers (7th edition, 2010), and graded according to the Scarff-Bloom and Richardson (SBR) grading system.

DNA extraction and NGS analysis
DNA extraction and NGS analyses were performed at the Medical Genetics Laboratory of the University of Rome Tor Vergata.
Total DNA was isolated from peripheral blood using the QIAGEN® EZ1 DNA Blood 200 µl kit (Qiagen) with the BioRobot EZ1 Workstation (Qiagen, Valencia, CA, USA). The concentration and quality of the DNA were determined using the NanoDrop 1000 (Thermo Fisher Scientific) and the Qubit Fluorometer 2.0 (Thermo Fisher Scientific).
The NGS analyses were performed using the Ion AmpliSeq™ BRCA1 and BRCA2 custom panel (Thermo Fisher Scientific, Inc.). The panel consists of three primer pools (55 amplicons) targeting the entire coding region and the exon-intron boundaries of BRCA1 and BRCA2. A total of 10 ng of DNA per sample was used for library preparation with the Ion AmpliSeq™ Library kit 2.0 (Ion Torrent; Thermo Fisher Scientific, Inc.). Each library was barcoded using the Ion Xpress™ Barcode Adapters kit (Ion Torrent; Thermo Fisher Scientific, Inc.). Amplification was followed by an emulsion reaction, which creates aqueous droplets that each randomly trap one or more DNA fragments. Libraries were purified using Agencourt AMPure XP beads, quantified with the Qubit 2.0 fluorometer (Thermo Fisher Scientific) using the Qubit dsDNA HS assay kit, and diluted to approximately 100 pmol/L for the Ion PGM and 30 pmol/L for the Ion S5. Templated Ion Sphere Particles (ISPs) were loaded onto an Ion 510 Chip (Thermo Fisher Scientific) or an Ion 316 Chip (Thermo Fisher Scientific). Sequencing was performed on an Ion S5 platform using the Ion S5 Sequencing kit (Thermo Fisher Scientific) and on an Ion PGM platform using the Ion PGM sequencing kit. All protocols were followed as recommended by the manufacturers, without modification.
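The dilution step above requires converting a fluorometric concentration (ng/µL) into a molar one (pmol/L). A minimal sketch of that conversion follows, assuming the common approximation of 660 g/mol per base pair of double-stranded DNA. The example concentration (0.5 ng/µL) and mean library fragment length (200 bp) are hypothetical values chosen for illustration and are not taken from this study; the function names are likewise hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # approximate mass of one double-stranded base pair


def library_molarity_pM(conc_ng_per_ul, mean_length_bp):
    """Convert a dsDNA concentration (ng/µL) into molarity (pmol/L)."""
    grams_per_litre = conc_ng_per_ul * 1e-3  # 1 ng/µL equals 1e-3 g/L
    mol_per_litre = grams_per_litre / (AVG_BP_MASS_G_PER_MOL * mean_length_bp)
    return mol_per_litre * 1e12


def dilution_factor(conc_ng_per_ul, mean_length_bp, target_pM):
    """Fold dilution needed to bring the library down to target_pM."""
    return library_molarity_pM(conc_ng_per_ul, mean_length_bp) / target_pM


# Hypothetical library: 0.5 ng/µL, mean fragment length 200 bp.
# Targets from the text: about 100 pmol/L (Ion PGM) and 30 pmol/L (Ion S5).
for platform, target in [("Ion PGM", 100.0), ("Ion S5", 30.0)]:
    factor = dilution_factor(0.5, 200, target)
    print(f"{platform}: dilute about {factor:.1f}-fold")
```

In practice the mean fragment length would come from the library's size profile, which the text does not report.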

Data analysis
A preinstalled plugin in the Torrent Browser generates Binary Alignment/Map (BAM) and Variant Call Format (VCF) files. Raw sequence data were processed using the Torrent Suite™ software (Ion Torrent; Thermo Fisher Scientific, Inc.) to analyze barcoded reads, to align reads to the hg19 reference genome (Genome Reference Consortium GRCh37) and to generate run metrics, including chip loading efficiency, total read counts and quality. Coverage analysis and variant calling were performed with the Torrent Variant Caller plugin on the Torrent Server. We analyzed BAM files in IGV (Integrative Genomics Viewer) [27] to verify the actual coverage of the genes and the presence of variants, and in Ion Reporter (Thermo Fisher Scientific, Inc.), which allows annotation of single nucleotide variants, insertions, deletions and splice site alterations.
Variants were named according to the nomenclature of the Human Genome Variation Society [28]. All annotations and variants were determined using BRCA1 (NM_007294.3) and BRCA2 (NM_000059.3) as reference transcripts. All candidate variants were required to be present on both sequenced DNA strands with a minimum depth of 50X.
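The acceptance rule for candidate variants (seen on both strands, minimum depth 50X) can be expressed as a simple filter. The sketch below is an illustration under stated assumptions, not the pipeline used in the study: the `VariantCall` record and its field names are hypothetical, and the read counts in the example are invented. In a real workflow these values would be read from the variant caller's VCF output.

```python
from dataclasses import dataclass

MIN_DEPTH = 50  # minimum total read depth at the variant position (from the text)


@dataclass
class VariantCall:
    hgvs: str          # variant name in HGVS notation
    depth: int         # total reads covering the position
    alt_forward: int   # reads supporting the variant on the forward strand
    alt_reverse: int   # reads supporting the variant on the reverse strand


def passes_filter(call, min_depth=MIN_DEPTH):
    """Keep a call only if it is deep enough and supported on both strands."""
    return (call.depth >= min_depth
            and call.alt_forward > 0
            and call.alt_reverse > 0)


# Invented read counts, for illustration only.
calls = [
    VariantCall("BRCA1 c.4088C>G", depth=412, alt_forward=101, alt_reverse=97),
    VariantCall("hypothetical low-depth call", depth=38, alt_forward=10, alt_reverse=9),
    VariantCall("hypothetical single-strand call", depth=250, alt_forward=60, alt_reverse=0),
]

candidates = [c.hgvs for c in calls if passes_filter(c)]
print(candidates)
```

Requiring support on both strands is a common guard against strand-specific sequencing artifacts, which is presumably the motivation for the rule stated above.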
The in silico analysis to predict the potential impact of the variants on the structure and function of the protein was performed using PolyPhen-2 [32] and Mutation Taster [33].
The evaluation of the novel variants was based on the location, type and evolutionary conservation of the mutated amino acids, the biophysical and biochemical differences between the wild-type and mutant amino acids, and the in silico analysis of the mutant protein sequence.

Results' validation by Sanger sequencing
All variants of class 3 (variants of uncertain significance) and class 4/5 (likely pathogenic/pathogenic), as well as the novel variants detected by NGS, were confirmed by bidirectional Sanger sequencing.
The sequencing was performed using the BigDye Terminator v3.1 Cycle Sequencing kit (Applied Biosystems; Thermo Fisher Scientific, Inc.) and the ABI 3130xl Automated Sequencer (Applied Biosystems, Foster City, CA, USA). The results were analyzed using Sequencing Analysis 5.2.0 software.

Multiplex Ligation-dependent Probe Amplification (MLPA)
The presence of large genomic rearrangements (LGRs) in the BRCA1 and BRCA2 genes, in addition to the presence of the 1100delC mutation in the CHEK2 gene, was investigated by multiplex ligation-dependent probe amplification (MLPA) using commercial MLPA kits from MRC Holland (BRCA1: P002, BRCA2: P045) according to the manufacturer's instructions. We used 100 ng of DNA from each sample and three reference samples, and tests were performed in duplicate in the same experiment. Fragment analysis was performed on the ABI 3130xl sequencer, and the data generated were imported into and analyzed with Coffalyser.Net software (v.140721.1958).

Statistical Analysis
In this study, the analyses focused only on mutations classified as pathogenic. We calculated the mutation prevalence and the exact 95% confidence interval (CI) using a binomial distribution. Differences between the allele frequencies in our cohort and those listed in the GnomAD database [34] for the African population were evaluated by Fisher's exact test. p-values less than 0.05 were considered statistically significant.
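The allele frequency comparison described above can be illustrated with a small two-sided Fisher's exact test on a 2x2 table of variant and reference allele counts. This is a minimal sketch, not the software used in the study. The cohort row assumes a heterozygous variant found in 2 of 51 women (2 of 102 chromosomes); the reference row (1 variant allele among 16,000 chromosomes) is a hypothetical placeholder and not a real GnomAD count. The function name is hypothetical, and the two-sided p-value follows the usual convention of summing all tables at least as improbable as the observed one.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-7))


# Rows: cohort, reference population. Columns: variant alleles, other alleles.
cohort_variant, cohort_total = 2, 102            # from the text, assuming heterozygous carriers
reference_variant, reference_total = 1, 16000    # hypothetical placeholder counts

p_value = fisher_exact_two_sided(
    cohort_variant, cohort_total - cohort_variant,
    reference_variant, reference_total - reference_variant,
)
print(f"two-sided p = {p_value:.3g}")
```

Fisher's exact test is a natural choice here because the variant counts are very small, so large-sample approximations such as the chi-squared test would be unreliable.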

Declarations
Ethics approval and consent to participate
This work was approved by the Ethical Committee for Health Research of Burkina Faso (N° 2014-8-098). Each subject enrolled in the study signed a written informed consent form. The research involving human subjects complied with all relevant national regulations and institutional policies and was conducted in accordance with the tenets of the Helsinki Declaration.

Availability of data and materials
All data generated or analysed during this study are included in this published article.

Competing interests
The authors declare that they have no competing interests.

Funding
This research has been financed by the Pietro Annigoni Association of Italy.