Genetic landscape of homologous recombination repair genes in early‐onset/familial prostate cancer patients

Prostate cancer (PrCa) is one of the three most frequent and deadliest cancers worldwide. The discovery of PARP inhibitors for the treatment of tumors with deleterious variants in homologous recombination repair (HRR) genes has placed PrCa on the roadmap of precision medicine. However, the overall contribution of HRR genes to the 10%–20% of carcinomas arising in men with early‐onset/familial PrCa has not been fully clarified. We used targeted next‐generation sequencing (T‐NGS) covering eight HRR genes (ATM, BRCA1, BRCA2, BRIP1, CHEK2, NBN, PALB2, and RAD51C) and an analysis pipeline querying both small and large genomic variations to clarify their global and relative contribution to hereditary PrCa predisposition in a series of 462 early‐onset/familial PrCa cases. Deleterious variants were found in 3.9% of the patients, with CHEK2 and ATM being the most frequently mutated genes (38.9% and 22.2% of the carriers, respectively), followed by PALB2 and NBN (11.1% of the carriers, each), and finally by BRCA2, RAD51C, and BRIP1 (5.6% of the carriers, each). Using the same NGS data, exonic rearrangements were found in two patients, one pathogenic in BRCA2 and one of unknown significance in BRCA1. These results help to clarify the genetic heterogeneity that underlies PrCa predisposition in early‐onset and familial disease.


| INTRODUCTION
Prostate cancer (PrCa) has long been one of the most frequent and deadliest cancers in men worldwide, with the 2018 estimates pointing to 1.3 million diagnoses and 359 thousand deaths. 1 The combined efforts of several research groups, using the most advanced genome screening technologies, have started to unveil the genetic component underlying predisposition to the 10%-20% of PrCa cases occurring at an early age and/or in families with aggregation of the disease. 2,3 Lessons from classical hereditary cancer syndromes, such as Hereditary Breast and Ovarian Cancer (HBOC) and Lynch Syndrome (LS), have been fundamental in the identification of germline determinants of PrCa risk. In fact, apart from HOXB13, with the specific G84E mutation defined as a moderately penetrant PrCa risk variant, 4 the other well-established genes predisposing to PrCa are those associated with HBOC or LS, namely the homologous recombination repair (HRR) genes BRCA1 and BRCA2 [5][6][7] and the mismatch repair (MMR) genes MSH2 and MSH6. 8,9 Screening of variants in the breast cancer predisposing genes ATM, CHEK2, and PALB2 in PrCa patient cohorts has established them as additional moderate penetrance PrCa predisposing genes. 10,118][19] Additionally, the identification of carriers of HRR deleterious variants may be useful to delineate therapeutic options, considering the positive response to PARP inhibitors observed in patients with metastatic, castration-resistant prostate carcinomas with compromised HRR. 20,21
The evolution of NGS data analysis methods has also empowered the detection of genomic structural variations (SVs).3][24] Apart from the CHEK2 large deletion identified in the Polish population 25 and the Portuguese founder variant c.156_157insAlu in BRCA2, 26 the few examples describing SVs associated with PrCa development derive from genome-wide association studies (GWAS), with the coding genes involved being mostly unknown. 27,28
A former study of our group revealed that the most frequent BRCA1 and BRCA2 variants found in Portuguese HBOC families could only explain a fraction of the missing PrCa heritability, suggesting that other variants in the BRCA genes and/or other genes may play a bigger role in PrCa predisposition. 26 Using Targeted Next Generation Sequencing (T-NGS) in a pilot study including 121 patients with strong criteria for hereditary disease, we not only unveiled new candidate genes associated with PrCa development, but also confirmed the low frequency of deleterious variants in the BRCA genes. 29 Likewise, MSH2, one of the first genes associated with PrCa predisposition, was not frequently found mutated in either study. 26,29
In this study, our aim was to characterize 462 PrCa patients fulfilling criteria for early-onset and/or familial/hereditary PrCa for a wide range of germline variants, including SNVs/INDELs and SVs. We specifically examined eight HRR-related genes, consisting of five well-established PrCa predisposing genes (ATM, BRCA1, BRCA2, CHEK2, and PALB2), as well as three additional HRR genes proposed to be involved in PrCa predisposition (BRIP1, NBN, and RAD51C). Our goal was to clarify the overall and relative contribution of these HRR genes to genetic predisposition for PrCa, and to validate an analysis pipeline of T-NGS data for the molecular diagnosis of hereditary disease.

| Ethics statement
Patient samples were collected under the ethics approval with reference CES 38-010 revised by the IPO Porto Ethics Committee. Informed consent was obtained from all subjects involved in the study.

| Biological samples
A total of 462 patients, previously described, 26 who fulfilled criteria for early-onset and/or familial/hereditary PrCa, were enrolled in this study.
Specifically, these patients included individuals diagnosed with PrCa before the age of 56 (early-onset criterion) and/or individuals diagnosed with PrCa at any age, provided that they had at least one relative (up to fourth degree) diagnosed with PrCa and that at least one of the PrCa patients in the family had been diagnosed before the age of 66 (family history criterion). In cases where DNA was available from multiple PrCa patients within a family, the youngest individual at the time of diagnosis was considered the index case. Among the 462 patients, 151 fulfilled the early-onset criterion only, 222 fulfilled the family history criterion only, and 89 fulfilled both criteria. Clinicopathological characteristics are summarized in Table S1.
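The two inclusion criteria can be written as a small decision rule. The sketch below is illustrative only: the function name, the input layout, and the handling of unknown ages are hypothetical, and it assumes that the index case counts among "the PrCa patients in the family" for the age-66 condition, which the text does not state explicitly.

```python
from typing import Iterable, Optional, Set

EARLY_ONSET_AGE = 56     # diagnosis before this age meets the early-onset criterion
FAMILY_HISTORY_AGE = 66  # one affected family member must be diagnosed before this age


def inclusion_criteria(index_age: int,
                       relative_ages: Iterable[Optional[int]]) -> Set[str]:
    """Return the criteria met by an index case (hypothetical helper).

    relative_ages holds the age at diagnosis of each relative (up to fourth
    degree) with PrCa; None marks an unknown age.
    """
    met: Set[str] = set()
    if index_age < EARLY_ONSET_AGE:
        met.add("early-onset")
    relatives = list(relative_ages)
    if relatives:
        # Assumption: the index case is one of "the PrCa patients in the family".
        known_ages = [index_age] + [a for a in relatives if a is not None]
        if any(age < FAMILY_HISTORY_AGE for age in known_ages):
            met.add("family-history")
    return met


# Group sizes reported in the text; they should add up to the full series.
group_sizes = {"early-onset only": 151, "family history only": 222, "both": 89}
assert sum(group_sizes.values()) == 462
```

For example, an index case diagnosed at 50 with no affected relatives meets the early-onset criterion only, whereas one diagnosed at 70 with a brother diagnosed at 60 meets the family history criterion only.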
A series of 701 control samples of healthy individuals from the Northern Portuguese population, also previously described, 29 was used (supplementary information).
DNA was previously extracted from peripheral blood leukocytes by standard procedures, 26 and kept at −80 °C.

| Variant calling and classification
The NextGENe software (v2.4.2.2; Softgenetics, State College, PA) was used for sequence alignment to the reference genome (GRCh37/hg19) and variant calling. The performance of the SureSelect custom panel was evaluated by assessing per-base coverage with a Python-based script, considering the full coding and splicing consensus regions.
All regions were covered at least 30x in all samples, validating the robustness of the custom panel design.
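The script used for this check is not shown in the text. As a minimal sketch of what a per-base coverage check might look like, the function below flags any target region containing a base covered below 30x. The function name, the input layout (a mapping from region name to per-base depths), and the example values are all hypothetical.

```python
from typing import Dict, List

MIN_DEPTH = 30  # minimum per-base coverage reported in the text


def regions_below_threshold(depths: Dict[str, List[int]],
                            min_depth: int = MIN_DEPTH) -> Dict[str, int]:
    """Map each under-covered region to its lowest per-base depth.

    A region without depth data is treated as uncovered (depth 0).
    """
    failing = {}
    for region, per_base in depths.items():
        lowest = min(per_base, default=0)
        if lowest < min_depth:
            failing[region] = lowest
    return failing


# Hypothetical per-base depths for two target regions of one sample.
example = {
    "BRCA2_exon3": [212, 198, 175, 160],
    "CHEK2_exon4": [88, 41, 27, 35],
}
print(regions_below_threshold(example))  # {'CHEK2_exon4': 27}
```

A panel passes the check for a sample when the returned mapping is empty.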

Analysis and filtering of single nucleotide variants (SNVs) and INDELs, as well as the software used in the analysis of structural variants (SVs), are summarized in Figure 1 and detailed in the supplementary information. Retained variants were annotated according to ClinVar classification, using ClinVar Miner, 30 and classified according to the guidelines of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG-AMP) using InterVar (https://wintervar.wglab.org, version updated on June 13, 2022). 31 Variants consistently classified as "benign"/"likely benign" in ClinVar Miner by October 2021 were excluded.

| Validation of next generation sequencing findings
Validation of P/LP SNVs and INDELs was performed by Sanger sequencing (primer sequences in Table S2, if not previously described 26,29). For validation of SVs, we used both multiplex ligation-dependent probe amplification (MLPA; MRC Holland, Amsterdam, the Netherlands) and array-based Comparative Genomic Hybridization (aCGH) with the SNP microarray CytoScan XON (Thermo Fisher Scientific, Inc.; Waltham, MA).
Additional details are described in the supplementary information.

| Transcript analysis
To explore the transcriptional consequences of the SV identified in BRCA1 in patient HPC498, the pattern of BRCA1 transcripts was analyzed in RNA extracted from the patient's peripheral blood lymphocytes (PBLs) by qRT-PCR, followed by Sanger sequencing (supplementary information).

| Genotyping in Portuguese control subjects
Recurrent variants were screened in 701 control samples using KASP (Kompetitive Allele Specific PCR) genotyping (LGC, Teddington, UK), if not previously evaluated. 29 KASP assay primers (Table S2) were designed using the Primer-BLAST design tool from NCBI and acquired from Metabion (Martinsried, Germany). Data were analyzed in the LightCycler 480 Software 1.5.0 (Roche Diagnostics, Basel, Switzerland). The numbers of carriers and non-carriers were compared between patients and control samples using Fisher's exact test (two-tailed) in GraphPad (https://www.graphpad.com/quickcalcs/contingency1/). Differences were considered statistically significant with a p value < 0.05.
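For readers who wish to reproduce this kind of comparison without the online calculator, a two-tailed Fisher's exact test on a 2 x 2 table can be sketched with the Python standard library alone. This is an illustrative implementation, not the one used in the study: it sums the hypergeometric probabilities of all tables that are no more likely than the observed one, which is one common definition of the two-sided p value. The carrier counts in the usage lines are hypothetical.

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Rows: patients / controls; columns: carriers / non-carriers.
    """
    n = a + b + c + d
    row1 = a + b          # first-row total (e.g. patients)
    col1 = a + c          # first-column total (e.g. carriers)
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum all tables at least as extreme (no more probable) than the observed one.
    tolerance = 1 + 1e-7
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * tolerance)
    return min(1.0, p_value)


# Hypothetical counts: 5 carriers among 462 patients, 1 among 701 controls.
p = fisher_exact_two_tailed(5, 462 - 5, 1, 701 - 1)
print(f"two-tailed p = {p:.4f}; significant at 0.05: {p < 0.05}")
```

The small tolerance guards against floating-point ties between tables that are equally probable in exact arithmetic.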

| Statistical associations with clinicopathological factors
To explore statistical associations between the germline mutational status for the eight genes and clinicopathological data of the prostate carcinomas, information regarding PSA at diagnosis, Gleason Score, and TNM staging was gathered. Additionally, patients were assigned into risk groups, according to the NCCN guidelines (v2.2020), and interrogated for associations with germline mutational status. Early-onset disease and family history criteria were also queried. The IBM Statistical Package for the Social Sciences (v.25) was used to infer statistical significance. An association was considered statistically significant if reaching a p value < 0.05 by two-tailed Fisher's exact test (GraphPad).
FIGURE 1 Flow chart of the analysis pipeline used for the detection of small (SNVs/INDELs) and large (CNVs/TLs) genomic variations.

| Frequency of carriers of germline deleterious single nucleotide variants/INDELs in the eight HRR genes
Using the analysis pipeline for SNVs/INDELs, we found 13 carriers of pathogenic/likely pathogenic (P/LP) variants in well-established PrCa-predisposing genes: seven in CHEK2, four in ATM, and two in PALB2 (Table 1). There were no carriers of P/LP SNV/INDEL variants in either BRCA1 or BRCA2.
The ATM variant found in patient HPC177, as well as the family tree, was previously described. 29 The additional carriers of deleterious variants in ATM include two frameshift variants, c.8264_8268del and c.9079dup, in patients HPC234 and HPC196, respectively, and a splicing variant (c.1236-2A > G) in patient HPC408.
The five patients carrying the CHEK2 missense variant c.349A > G and the CHEK2 splicing variant found in patient HPC395 have been previously described. 29,34 A second carrier of the same splicing variant (c.593-1G > T) was identified in an early-onset PrCa case (HPC353) without family history of cancer (Figure 2D).
The two nonsense variants in PALB2 were identified in patients with distinct familial cancer histories, with the variant c.1438A > T being identified in an early-onset PrCa patient (HPC36) with two first-degree relatives diagnosed with different carcinomas (Figure 2E), and the variant c.2257C > T occurring in a patient (HPC223) with a history of PrCa in the brother, lung cancer in both parents, and two uncharacterized cancers, in brain and leg, in two paternal uncles (Figure 2F). Segregation of the variant in the affected brother could not be performed.
TABLE 1 Germline P/LP variants found in the eight HRR genes under study.
Among the three HRR genes that have been suggested to predispose to PrCa development, four patients carrying P/LP variants were identified: two in NBN, one in BRIP1, and one in RAD51C, the last having been identified in a former study. 29 The BRIP1 frameshift variant c.2947dup was identified in a patient (HPC427) diagnosed with PrCa at the age of 48 years, and   4).

| Frequency of germline structural variants in HRR genes
Using the TL modules of SureCall and DRAGEN, a single output, occurring in the BRCA2 gene in sample HPC398, was obtained with SureCall only. The aligned reads revealed the Portuguese founder variant c.156_157insAlu (Figure 5A), previously identified 26,35 and included in Table 1. The three CNV software tools reported two variants with high quality metrics (Table S3): a duplication of exon 2 of BRCA1 in patient HPC498 and a duplication of exon 5 of RAD51C in patient HPC163 (Figure 5B).
S4), and, thus, the translation of a wild-type BRCA1 protein may not be affected. Segregation analysis in the affected maternal uncle could not be performed (Figure S1), so the
] not described in gnomAD database.
CoCa, colon carcinoma; GaCa, gastric carcinoma; HeCa, hepatic carcinoma; LuCa, lung carcinoma; NHL, non-Hodgkin Lymphoma; PrCa, prostate carcinoma.
of early-onset/familial PrCa cases. Although the BRCA genes are the best-established risk genes for PrCa development, and the most frequently mutated genes in the germline of both metastatic patients 18 and patients unselected for family history, 37 studies using multigene panel testing to assess the frequencies of deleterious variants in both BRCA genes and in other cancer predisposing HRR genes in PrCa patient cohorts with criteria for hereditary disease are scarce.
Thus, we aimed to clarify the overall and relative frequencies of deleterious germline variants in eight HRR genes in our entire series of 462 early-onset/familial PrCa cases. For this purpose, we used a custom T-NGS panel covering the five well-established PrCa predisposing genes, namely, BRCA1, BRCA2, ATM, CHEK2, and PALB2, and three additional genes suggested as PrCa-predisposing in the literature, namely, BRIP1, NBN, and RAD51C. We complemented the analysis pipeline for small variations (SNVs/INDELs) 29 with four different NGS data analysis software tools designed to detect large exonic SVs.
Following our analysis pipeline for SNVs/INDELs, we identified deleterious variants in 17 patients, representing ~3.7% of the 462 patients of our series of early-onset/familial PrCa. By including CNV analyses, we were able to identify the BRCA2 c.156_157insAlu variant, and, additionally, to identify one carrier of an SV in BRCA1. In fact, the c.156_157insAlu variant is the BRCA2 pathogenic variant most frequently found in Portuguese HBOC families and, so far, its detection has been restricted to a variant-specific method, 26 as it is challenging to identify by sequencing approaches. Thus, its identifica-
approaches for genetic screening and the knowledge that 3%-15% of the hereditary cancer syndromes arise from a cancer-predisposing SV, 38 the identification of CNVs has been hampered by the lack of robust analysis pipelines that may cover the whole spectrum of large genomic variations. Our analysis approach supports the usefulness of NGS to ascertain the full spectrum of CNVs, which overall account for 5.6% of all carriers of P/LP variants in our study. This frequency is in accordance with the described prevalence of ~7.2% among all pathogenic variants identified in 28 cancer predisposing genes. 39 MLPA is the gold-standard technique for validation of SVs in known cancer predisposing genes; however, considering the limited design of MLPA probes, which target only known cancer genes and may miss SVs involving genomic regions outside the targeted sites, 39 aCGH has increased value as a validation approach.
Additionally, aCGH may be useful to narrow down the genomic breakpoints, which may guide the design of cost-effective PCR-based screening approaches for the identification of carriers of large deletions, which in most populations comprise the most frequent pathogenic SVs. 39 Although aCGH validated a true-positive CNV in BRCA1 in patient HPC498, it did not allow the boundaries of the genomic duplication to be defined in this case. In the literature, a large deletion encompassing this region is reported in several HBOC families, resulting in a non-functional BRCA1 transcript, 40 13 It is possible that the genetic background of the Portuguese population, characterized by several cancer predisposing variants already described to exhibit a founder effect 34,[44][45][46][47] and by the contribution of the Sephardic Jewish population, estimated to account for 19.8% of the Portuguese ancestry, 48,49 is biasing the overall observed genetic landscape of HRR gene mutations in our early-onset/familial PrCa patients.
We have not found an association between the carrier status for a P/LP variant and disease aggressiveness, even when considering only the carriers of P/LP variants in the well-established PrCa risk genes. However, since our cohort is enriched in patients selected for personal and/or family history of PrCa, and not in patients with aggressive disease or in carriers of P/LP variants in BRCA2, associated with aggressive disease in multiple studies, 14,17,42,50 this study may better represent the landscape of genomic alterations in those PrCa patients/families that may benefit from genetic risk assessment, eventually conditioned by a population-specific effect. P/LP variants in ATM, NBN, and PALB2 have also been associated with aggressive PrCa; 50,51 however, the low frequency of carriers in our cohort does not allow this association to be confirmed.
Our study shows that deleterious variants in HRR genes may explain PrCa development in ~3.9% of the PrCa cases arising at an early age and/or with familial aggregation of the disease. P/LP variants in CHEK2 and ATM are the most frequent (38.9% and 22.2% of the carriers, respectively), followed by PALB2 and NBN (11.1% of the carriers, each), and then by BRCA2, RAD51C, and BRIP1 (5.6% of the carriers, each). By allowing the simultaneous screening of many genes and the identification of both small and large genomic variations, NGS has a major role in the identification of the genetic components underlying cancer development. Still, further studies are needed to clarify the (prostate) cancer risk and the implications in patient management of carrying a deleterious variant in several of these genes.
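The percentages above follow directly from the carrier counts given in the Results (seven CHEK2, four ATM, two PALB2, two NBN, and one each in BRIP1, RAD51C, and BRCA2). As a quick arithmetic check, with variable names chosen here for illustration:

```python
COHORT_SIZE = 462  # early-onset/familial PrCa cases in the series

# Carriers of P/LP variants per gene, as reported in the Results.
carriers_by_gene = {
    "CHEK2": 7, "ATM": 4, "PALB2": 2, "NBN": 2,
    "BRCA2": 1, "RAD51C": 1, "BRIP1": 1,
}


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_carriers = sum(carriers_by_gene.values())        # 18
overall_frequency = percent(total_carriers, COHORT_SIZE)  # 3.9
share_of_carriers = {gene: percent(n, total_carriers)
                     for gene, n in carriers_by_gene.items()}

print(total_carriers, overall_frequency)
print(share_of_carriers)
```

The 17 carriers of SNVs/INDELs alone correspond to 17/462, or about 3.7%, and the single structural variant counted among the P/LP variants (BRCA2 c.156_157insAlu) corresponds to 1/18, or about 5.6% of carriers.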
Segregation with the disease was observed in the affected brother of patient HPC234 (now deceased), who agreed to participate during the patients' recruitment phase (2014-2015; Figure 2A). Although the carriers of the ATM variants c.9079dup (HPC196) and c.1236-2A > G (HPC408) both have relatives diagnosed with PrCa (Figure 2B, C), variant segregation with the disease could not be assessed.
FIGURE 2 Family trees of HPC cases carrying P/LP SNVs/INDELs in well-established PrCa predisposing HRR genes. (A)-(C) Patients carrying P/LP variants in ATM. (D) Patient carrying the recurrent splicing variant in CHEK2. (E) and (F) Patients carrying P/LP variants in PALB2. Squares represent the males, circles the females, and diamonds unknown gender. Numbers inside the symbols denote the number of individuals with that gender, if known. Deceased individuals are represented by a diagonal line through a symbol and the affected ones are highlighted by colored symbols. The index case is indicated by an upper left arrow and the cancer type and age at diagnosis are indicated whenever known. Identified carriers of the variant are marked by a plus (+) symbol.

Using MLPA and aCGH, we validated the duplication in BRCA1 in patient HPC498, encompassing at least 2718 bp of the genomic region containing part of the BRCA1 and NBR2 genes (chr17:43123235-43125953) (Figure 5C, D). Transcriptomic analysis in PBLs from patient HPC498 revealed an undescribed aberrant transcript retaining 299 bp of intron 1-2 of BRCA1 NM_007294.4 (r.-19_-20ins-20 + 236_-20 + 534) (Figure 5E, F). Available online tools to predict the translation initiation site (TIS) of the aberrant BRCA1 transcript highlight the probability of new TISs in the 299 bp retained; however, any of the predicted alternative ATGs would lead to the occurrence of a stop codon before the wild-type TIS (Table
FIGURE 3 Family trees of HPC cases carrying P/LP SNVs/INDELs in HRR genes suggested as PrCa predisposing genes. (A) Patient carrying the BRIP1 frameshift variant. (B) and (C) Patients carrying the NBN splicing and frameshift variants, respectively. Squares represent the males, circles the females, and diamonds unknown gender. Numbers inside the symbols denote the number of individuals with that gender, if known. Deceased individuals are represented by a diagonal line through a symbol and the affected ones are highlighted by colored symbols. The index case is indicated by an upper left arrow and the cancer type and age at diagnosis are indicated whenever known. Identified carriers of the variant are marked by a plus (+) symbol. AMI, acute myocardial infarction; BrCa, breast carcinoma; LuCa, lung carcinoma; OvCa, ovary carcinoma; PrCa, prostate carcinoma.

FIGURE 4 Distribution of carriers of P/LP variants by gene considering cases with early-onset or familial PrCa. Blue colors represent well-established and yellow colors not well-established PrCa predisposing genes.
FIGURE 5 Exonic rearrangements identified by T-NGS. (A) Sequence alignment obtained with SureCall TL analysis for sample HPC398, showing insertion of the Alu sequence in exon 3 of BRCA2 (c.156_157insAlu), flanked by a short sequence duplication (TSD). 35 (B) Dot-plot of the ratio of the read-depth coverage of the test sample (HPC498 in the upper panel and HPC163 in the lower panel) against the median read-depth coverage of a set of 10 randomly selected samples from the same NGS run. Green and red horizontal lines denote the empirical threshold to assign gains and losses, respectively. (C) and (D) Validation of the BRCA1 duplication in patient HPC498 was obtained by MLPA (C) and aCGH (D). A new transcript retaining 299 bp of the alternative BRCA1 exon 1B was identified in patient HPC498 by One-Step RT-PCR in RNA extracted from PBLs. Agarose gel electrophoresis of the RT-PCR amplicons and Sanger sequencing electropherograms are shown in (E) and (F), respectively. The identified aberrant BRCA1 transcript, matching Ensembl transcript ENST00000357654.9 (BRCA1-203), contains, additionally, three SNVs (black arrowed), two described in dbSNP with MAF < 0.01% [c.-20 + 415A > G, rs1420583297 (a); and c.-20 + 346A > G, rs1489102499 (c)], and one [(c.-20 + 349A > G (b)
tion by an NGS bioinformatic pipeline not only increases the sensitivity of the screening test, but also strengthens the potential of the T-NGS data. A limitation of the SureCall software is that it is only compatible with Agilent's gene panel designs, so other bioinformatic tools must be validated for use with other sequencing strategies. Another possible drawback of the Agilent technology may be the output of false-positive CNVs. In fact, samples from the patients with suspected CNVs (HPC498 and HPC163) were later analyzed with the TruSight Cancer panel (v2, Illumina), and only the BRCA1 duplication was reported by the CNV tool from NextGENe. Still, in light of the identification of the Alu insertion, Agilent's technology and the associated SureCall TL analysis module may compensate for the output of false-positive CNVs, which can ultimately be validated by complementary technologies. Despite the broad dissemination of NGS