High RAD51 gene expression is associated with aggressive biology and with poor survival in breast cancer

Although the DNA repair mechanism is important in preventing carcinogenesis, its activation in established cancer cells may support their proliferation and aggravate cancer progression. RAD51 cooperates with BRCA2 and is essential in the homologous recombination of DNA repair. To this end, we hypothesized that RAD51 gene expression is associated with cancer cell proliferation and poor prognosis of breast cancer (BC) patients. A total of 8515 primary BC patients with transcriptome and clinical data from 17 independent cohorts were analyzed. The median value was used to divide each cohort into high and low RAD51 expression groups. High RAD51 expression enriched the DNA repair gene set and was correlated with DNA repair-related genes. Nottingham histological grade, Ki67 expression and cell proliferation-related gene sets (E2F Targets, G2M Checkpoint and Myc Targets) were all significantly associated with the high RAD51 BC group. RAD51 expression was positively correlated with Homologous Recombination Deficiency, as well as both mutational burden and neoantigens that accompanied a higher infiltration of immune cells. Primary BC with lymph node metastases was associated with high expression of RAD51 in two cohorts. There was no strong correlation between RAD51 expression and drug sensitivity in cell lines, and RAD51 expression was lower after the neoadjuvant chemotherapy compared to before the treatment. High RAD51 BC was associated with poor prognosis consistently in three independent cohorts. RAD51 gene expression is associated with aggressive cancer biology, cancer cell proliferation, and poor survival in breast cancer.


Introduction
Homologous recombination repair is an important DNA repair mechanism that addresses DNA double-strand breaks caused by various external and internal stressors [1]. The BRCA1 and BRCA2 genes are essential for homologous recombination repair [2]: germline mutations in BRCA1 and/or BRCA2 induce genomic instability through homologous recombination deficiency (HRD), leading to an increased risk of breast and/or ovarian carcinogenesis [2]. HRD is not only an important cause of hereditary breast cancer but also contributes to "BRCAness", the traits of BRCA1 genetic disorder found in some sporadic breast cancers [3][4][5]. HRD is a critical therapeutic target in breast cancer because nearly 70% of tumors of the most aggressive subtype, triple-negative breast cancer (TNBC), contain characteristic "BRCAness" features [6]. In addition, poly ADP-ribose polymerase (PARP) has also been identified as essential in DNA repair. PARP inhibitors are used to induce DNA double-strand breaks and destroy cancer cells with HRD. The effectiveness of PARP inhibitors against breast cancer with germline BRCA mutation has been confirmed in multiple clinical trials [7][8][9].
RAD51 is an ATPase that forms helical nucleoprotein filaments on single or double-stranded DNA [10] and plays a critical role in the early stages of DNA double-strand break recognition during homologous recombination repair. BRCA2 activates the homologous recombination cascade in a RAD51-dependent manner, particularly during mitosis [11]. BRCA2 recognizes nuclear filaments in single-stranded DNA loaded with RAD51 during DNA damage and invades the homologous DNA duplex to pair up and initiate homologous recombination repair [12,13]. Although RAD51 expression is tightly regulated in normal cells to avoid aberrant DNA recombination [14], its expression is strongly upregulated in several types of cancer including breast [15][16][17][18]. High levels of RAD51 overactivate homologous recombination, resulting in uncontrolled double-stranded DNA break repairs and cancer cell persistence [19]. Therefore, high expression of RAD51 confers resistance to radiation and drugs whose typical function is to induce double-stranded breaks in cancer cells [20][21][22]. Based on these mechanisms, studies have reported the involvement of RAD51 in cancer resistance to PARP inhibitors [23,24]. Studies have even suggested RAD51 to be a candidate as a biomarker of drug sensitivity and as a therapeutic target to combat drug resistance.
Our group has been pursuing translational research that addresses the clinical relevance of gene expression using in-silico analysis of large patient cohorts of transcriptomes associated with clinical parameters [25][26][27][28][29][30][31][32]. Previously, we have reported that increased expression of the BRCA2 gene is associated with enhanced cancer cell proliferation and immunogenicity in breast cancer [33]. In cancer cells, high expression of BRCA2 correlated with HRD and was also associated with aggressive breast cancer. Noting that RAD51 acts together with BRCA1 and/or BRCA2 as a key player in homologous recombination repair, we hypothesized that RAD51 mRNA expression is associated with increased cancer cell proliferation, and thus with poor prognosis. In addition, we hypothesized that RAD51 might be highly expressed in the treatment non-responder group due to its involvement in drug resistance. To date, studies of RAD51 have been limited to experiments with cell lines, animals, and retrospective studies with small cohorts. In contrast, we analyzed the relationship between RAD51 gene expression and breast cancer using three large primary breast cancer cohorts containing several thousand patients. In addition, we analyzed RAD51 expression by treatment response using multiple neoadjuvant chemotherapy (NAC)-treated breast cancer cohorts to explore its potential as a predictor and biomarker of treatment response in breast cancer.
These cohorts were also downloaded from the GEO database via the R package GEOquery. For the GSE180962 cohort, only the control group was used in the analysis. RAD51 expression was calculated as the mean value of the probes assigned to RAD51 on the platform corresponding to each expression data series. Details of the treatment information, the number of patients included in the study, and the accession number of the platform used for annotation for each cohort are summarized in Supplementary Table 1.
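The probe-averaging step described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study; the probe identifiers, the annotation mapping, and the function name `gene_expression` are hypothetical placeholders, and the expression values are toy numbers.

```python
from statistics import mean

# Hypothetical probe-to-gene annotation (placeholder IDs, not real platform probes).
probe_to_gene = {"probe_A": "RAD51", "probe_B": "RAD51", "probe_C": "BRCA2"}

# Toy expression matrix: probe -> {sample: expression value}.
expression = {
    "probe_A": {"S1": 6.0, "S2": 8.0},
    "probe_B": {"S1": 7.0, "S2": 9.0},
    "probe_C": {"S1": 5.0, "S2": 5.5},
}

def gene_expression(gene, probe_to_gene, expression):
    """Average all probes annotated to `gene`, separately for each sample."""
    probes = [p for p, g in probe_to_gene.items() if g == gene and p in expression]
    if not probes:
        raise ValueError(f"no probes annotated to {gene}")
    samples = expression[probes[0]].keys()
    return {s: mean(expression[p][s] for p in probes) for s in samples}

rad51 = gene_expression("RAD51", probe_to_gene, expression)
```

Averaging probes gives one value per gene per sample, which is what the median split and correlation analyses operate on.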

BRCA mutation data
Genetic mutation data were available for TCGA and METABRIC. TCGA had 30 patients with BRCA1 mutations, 29 patients with BRCA2 mutations, and a total of 57 patients with either mutation. METABRIC had 55, 60, and 114 patients, respectively.

Breast cancer cell line RAD51 expression and drug sensitivity data
Breast cancer cell line RNA sequencing data and drug sensitivity data were obtained from the DepMap portal, as we reported previously [59,60]. This included 64 breast cancer cell lines, and immunohistochemistry staining data were downloaded as well. Expression 21Q3 Public data were used for RAD51 expression, and AUC data from the PRISM primary or secondary screens and from GDSC1 and GDSC2 were used to determine drug sensitivity.

Gene set enrichment analysis
Gene Set Enrichment Analysis (GSEA) [61] was performed on the gene expression data by dividing each analysis dataset into two groups based on the median expression of RAD51. This approach examines how strongly the pathways defined by particular gene sets differ in expression between the two groups. GSEA 4.1.0 (free software from the Broad Institute) was used for the analysis, and the Hallmark collection was selected from the major collections of the Molecular Signatures Database [62]. Following the recommendations of the Broad Institute, an FDR q-value below 25% was used as the cut-off for significance, and the Normalized Enrichment Score (NES) was used to assess the strength of the association with each gene set.
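The grouping and significance rules described above can be sketched as follows. This is an illustrative Python sketch, not the GSEA software itself. The function names, the toy values, and the gene set labels are hypothetical. The text does not say how samples exactly at the median are handled, so assigning them to the low group is an assumption.

```python
from statistics import median

def split_by_median(expression_by_sample):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: samples exactly at the median are labeled 'low'.
    """
    cutoff = median(expression_by_sample.values())
    return {s: ("high" if v > cutoff else "low")
            for s, v in expression_by_sample.items()}

def significant_gene_sets(results, fdr_cutoff=0.25):
    """Keep gene sets with FDR q-value below the cut-off, sorted by NES (descending).

    `results` maps gene set name -> (NES, FDR q-value).
    """
    kept = [(name, nes) for name, (nes, fdr) in results.items() if fdr < fdr_cutoff]
    return [name for name, nes in sorted(kept, key=lambda item: item[1], reverse=True)]

# Toy inputs for illustration only (not values from the study).
groups = split_by_median({"S1": 1.0, "S2": 2.0, "S3": 3.0, "S4": 4.0})
selected = significant_gene_sets({"SET_A": (2.1, 0.001),
                                  "SET_B": (1.2, 0.40),
                                  "SET_C": (2.5, 0.10)})
```

The split is computed within each cohort, so the cut-off is cohort-specific and the two groups are of roughly equal size.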

Immune cell fractionation, HRD, and mutation score analysis
TCGA HRD score, intratumoral heterogeneity score, mutation burden score, and immune activity score were calculated and reported by Thorsson et al. in 2018 [63]. Fractionation of intratumoral immune cells and stromal cells was calculated using the xCell web tool [64], an algorithm for enumerating immune cell subsets from the transcriptome, as previously reported [65][66][67]. xCell estimates immune cell fraction for each cohort by comparing 489 gene signatures corresponding to 64 cell types, including adaptive and innate immune cells, hematopoietic progenitor cells, epithelial cells, and extracellular matrix cells, with the input of bulk gene expression dataset. CYT score was used as a measure of immune activity, as previously reported [67,68].
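As a rough illustration of the CYT score, the sketch below assumes the commonly used definition of cytolytic activity, namely the geometric mean of GZMA and PRF1 expression. The text does not spell out the formula, so this definition, the function name `cyt_score`, and the toy input values are all assumptions.

```python
from math import sqrt

def cyt_score(gzma, prf1):
    """Cytolytic activity as the geometric mean of GZMA and PRF1 expression.

    Assumption: inputs are non-negative expression values on a linear
    (not log-transformed) scale.
    """
    if gzma < 0 or prf1 < 0:
        raise ValueError("expression values must be non-negative")
    return sqrt(gzma * prf1)

# Toy values for illustration only.
example_cyt = cyt_score(4.0, 9.0)
```

A geometric mean keeps the score low unless both genes are expressed, which suits a marker meant to reflect joint activity of the two effector molecules.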

Statistical analysis
Data downloading, organization, analysis, and visualization were done using R 4.0.1. The following packages were used in this study: Survival 3.

RAD51 gene expression was associated with DNA repair activity in breast cancer
RAD51 is known to play an essential role in the DNA repair mechanism. Therefore, we first investigated whether RAD51 gene expression was associated with the DNA repair pathway and with the expression of its member genes. Comparison of RAD51 expression between normal breast and tumor tissues in the TCGA cohort showed that RAD51 was highly expressed in breast cancer (p < 0.001, Fig. 1a). Intratumoral heterogeneity and Homologous Recombination Deficiency (HRD) scores were positively correlated with RAD51 expression in TCGA (r = 0.32 and 0.53, respectively; Fig. 1b). Further, RAD51 high breast cancer significantly enriched the DNA repair gene set consistently in the TCGA, METABRIC, and GSE96058 cohorts (all p < 0.001 and FDR < 0.01, Fig. 1c). It was also associated with high expression of DNA repair genes, such as BRCA1, BRCA2, E2F1, E2F4, E2F7, and CDK12, consistently across all three cohorts (TCGA, METABRIC, and GSE96058; all p < 0.05, Fig. 1c). Since we were unable to find any literature on the correlation between RAD51 and BRCA1 or BRCA2 expression in in-vitro or in-vivo studies, we analyzed the relationship in breast cancer cell lines. We found that RAD51 gene expression strongly correlated with BRCA1 and BRCA2 expression in the Cancer Cell Line Encyclopedia (Supplementary Figure S1). Taken together, RAD51 expression is associated with DNA repair activity in the breast cancer tumor microenvironment (TME).

RAD51 gene expression was strongly associated with cancer cell proliferation
As cancer with HRD is known to be highly malignant, we investigated the relationship between RAD51 expression and cancer cell proliferation. Utilizing the score value provided by Thorsson et al. [63], we found a very strong correlation between RAD51 expression and the Proliferation score in the TCGA cohort (r = 0.879, p < 0.001, Fig. 2a). RAD51 expression strongly correlated with Nottingham histological grade and with pathological quantification of cancer cell proliferation consistently across all three cohorts (TCGA, METABRIC, and GSE96058; all p < 0.001, Fig. 2b). RAD51 expression also correlated with the Tubular score, Nuclear score, and Mitotic score in TCGA (Supplementary Figure S2). In agreement, RAD51 expression was highly correlated with the cell proliferation marker gene, MKI67, across all three cohorts (all r > 0.4, Fig. 2b). Strikingly, all five of the cell proliferation-related gene sets in the Hallmark collection (E2F Targets, G2M Checkpoint, Myc Targets v1 and v2, and Mitotic Spindle), as well as MTORC1 Signaling, were enriched in the high RAD51 breast cancer group consistently across all cohorts with a strong significance of FDR < 0.01 (Fig. 2c). These results suggest that high RAD51 breast cancer is associated with high cancer cell proliferation.
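The correlation coefficients reported above can be illustrated with a short sketch. The text does not state whether Pearson or Spearman correlation was used, so the Pearson coefficient below is an assumption, and the function name `pearson_r` and the toy vectors are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Toy vectors for illustration only: a perfectly linear relationship.
r_toy = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

In practice one vector would hold per-sample RAD51 expression and the other a score such as the Proliferation score or MKI67 expression.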

RAD51 was associated with a high mutation rate
As RAD51 mainly co-acts with BRCA2 and partly with BRCA1, it was of interest to investigate whether RAD51 expression was associated with overall mutation rates and BRCA gene mutations. Both silent and non-silent mutation rates were significantly increased in the high RAD51 expression breast cancer group in TCGA (both p < 0.001, Fig. 3a). In addition, we compared the wild-type group to the mutation-bearing group (in BRCA1, BRCA2, or both). RAD51 expression was significantly higher in patients with mutations in BRCA1, BRCA2, or both in the METABRIC cohort (all p < 0.01). However, this was not validated in the TCGA cohort (Fig. 3b). In summary, RAD51 expression correlated with the mutation rate, but not consistently with BRCA mutations.

RAD51 high breast cancer was immunogenic and elicited cancer immunity in the tumor microenvironment
We have previously reported that cancers with high mutation rates elicit immunogenicity and cancer immunity. Having identified increased levels of mutations in high RAD51 breast cancers, it was of interest to investigate the association of RAD51 expression with cancer immunity. As expected, single-nucleotide variant (SNV) neoantigens and Indel neoantigens were both significantly higher in breast cancer with high RAD51 expression. Several factors related to cancer immunity (interferon (IFN)-gamma response, tumor infiltrating lymphocytes (TIL) regional fraction, Wound Healing, B-Cell Receptor (BCR) Richness, BCR Shannon, and Fraction altered) were all significantly higher in the high RAD51 group (p < 0.001, Fig. 4a). Further, we investigated the abundance of immune cells in the TME, and several immune cell types (CD4 naive T-cells, CD4+ memory T-cells, T helper type 1 cells, T helper type 2 cells, plasma cells, M1 macrophages, and activated dendritic cells) showed significantly higher infiltration in the high RAD51 breast cancer group. Cytolytic Activity score (CYT), which reflects overall immune cell killing, was also significantly increased consistently across all three cohorts (all p < 0.001, Fig. 4b). Thus, we could conclude that high RAD51 expressing breast cancer is highly immunogenic and has activated cancer immunity.

RAD51 gene expression was associated with triple-negative breast cancer and with lymph node metastasis
To further elucidate the characteristics of high RAD51 breast cancer, we analyzed its association with clinicopathological factors. Consistently among the three cohorts, RAD51 expression was highest in triple-negative breast cancer (TNBC) compared to all other immunohistochemical subtypes of breast cancer (all p < 0.001, Fig. 5). In contrast, the estrogen receptor (ER)-positive/human epidermal growth factor receptor 2 (HER2)-negative subtype had the lowest expression of RAD51. RAD51 expression was higher in advanced stages in TCGA, but this was not validated in the METABRIC cohort. RAD51 expression was significantly increased in the primary tumors of patients with more metastatic lymph nodes in both the METABRIC and GSE96058 cohorts (both p < 0.02), which was not validated in TCGA. On the other hand, RAD51 expression in the primary breast cancer did not change with the presence of distant metastases. These results suggest that RAD51 is highly expressed in aggressive TNBC and in primary breast cancer with lymph node metastasis.

RAD51 expression is high in tumor that achieved pathological complete response after NAC
Breast cancer containing a BRCA1 and/or BRCA2 mutation with HRD is known to be sensitive to platinum-based cytotoxic chemotherapy and PARP inhibitors. In addition, RAD51 expression was reported to be associated with resistance to PARP inhibitors. We therefore investigated the relationship between RAD51 expression and drug sensitivity. We compared sensitivity to cytotoxic chemotherapies and multiple PARP inhibitors with RAD51 gene expression across breast cancer cell lines from the DepMap portal. In TNBC cell lines, RAD51 expression was positively correlated with sensitivity to docetaxel and epirubicin (both p < 0.05 and r > 0.5, Fig. 6a), but not to cisplatin. However, sensitivity to none of the PARP inhibitors correlated with RAD51 expression (Fig. 6a). On the other hand, RAD51 expression significantly correlated with sensitivity to niraparib in ER-positive/HER2-negative cell lines (p < 0.05 and r = 0.9, Fig. 6a).
As RAD51 was reported to have a role in drug resistance, it was of interest to investigate its association with pathological complete response (pCR) after neoadjuvant  Figure S2). RAD51 expression was compared between groups that achieved pCR and those that did not, by immunohistochemical subtype (Fig. 6c). Although we expected RAD51 expression to be higher in the residual disease (RD) group, particularly in TNBC, that scenario was found in only a single cohort (GSE20271, p = 0.042, Fig. 6c). The opposite relationship was found in another cohort (GSE25066, p = 0.001, Fig. 6c), and most of the cohorts did not show any significant difference in RAD51 expression as related to pCR in TNBC. In contrast, RAD51 expression was higher in the pCR group in the ER-positive/HER2-negative subtype across two cohorts (GSE50948 and GSE20271, both p < 0.05, Fig. 6c). These results suggest that RAD51 expression of a bulk tumor does not predict response to NAC.

RAD51 high breast cancer has worse survival consistently across all three cohorts
Given that breast cancers with high expression of RAD51 are more aggressive, it was of interest to investigate whether these characteristics translated into survival disparities. To this end, we compared the survival between high and low RAD51 expression groups. Surprisingly, overall survival (OS) was significantly worse in the high RAD51 breast cancer group consistently across all three cohorts, and the same was observed in disease-specific survival (DSS) in TCGA and METABRIC. Disease-free survival (DFS) was significantly worse only in METABRIC (Fig. 7). This difference may be because the number of patients and the follow-up period in TCGA are approximately half of those in METABRIC. In short, the expression of RAD51 was associated with a worse prognosis.
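The survival comparison above can be illustrated with a minimal Kaplan-Meier estimator. The text does not state which estimator or test was used, so Kaplan-Meier is an assumption here, and the function name `kaplan_meier` and the toy follow-up data are hypothetical. In the study, one curve would be computed for each of the high and low RAD51 groups.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `times` are follow-up times; `events` are 1 for an observed event
    (for example, death) and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= removed
    return curve

# Toy data for illustration only: four patients, one censored at time 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients contribute to the risk set until their last follow-up, which is why cohorts with shorter follow-up carry less information about late events.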

Discussion
In this study, we investigated the characteristics of breast cancers with high RAD51 expression through functional analysis of clinical, immunohistochemical, and transcriptomic data using multiple large breast cancer patient cohorts. In line with previous reports, we found that RAD51 was highly expressed in cancer compared to normal tissues, and strongly correlated with HRD and intratumor heterogeneity. We also showed that the DNA repair gene set, as well as multiple genes related to homologous recombination repair, were significantly associated with high RAD51 expression. Further, high RAD51 expression was significantly associated with histological grade and with all five Hallmark cell proliferation-related gene sets, indicating that RAD51 high tumors are highly proliferative. RAD51 was also positively correlated with mutation rates. RAD51 expression in BRCA-mutant tumors was significantly higher than in BRCA-wild-type tumors in METABRIC, but this was not validated in the TCGA cohort. This was most likely because of a lack of power due to the low number of mutant cases in TCGA (about half of that of METABRIC). Martin et al. also reported a significant correlation between RAD51 expression and BRCA1 mutation, reporting that tumors with BRCA1 mutation had 2.5-fold higher expression of RAD51 compared with wild type in the gene expression microarray of 117 primary breast tumors [69]. Cancer cell immunogenicity and cancer immune activity were all significantly enhanced in high RAD51 tumors across all three cohorts, and increased immune cell infiltration was also observed in all cohorts. Primary tumors from patients with lymph node metastases were associated with high expression of RAD51 in both the METABRIC and GSE96058 cohorts. There was no strong correlation between RAD51 expression and drug sensitivity other than to niraparib in the ER-positive/HER2-negative subtype.
Contrary to our expectation, RAD51 expression was lower after NAC compared to prior to treatment consistently across three independent cohorts. RAD51 expression was higher in primary tumors that did not achieve pCR after NAC compared to tumors that did achieve pCR in only one among ten independent TNBC NAC cohorts analyzed, whereas this was not validated in any other subtypes in the other cohorts. Finally, overall survival was significantly worse in high RAD51 breast cancer across all three large cohorts. DSS was also worse in TCGA and METABRIC, and DFS was also worse in METABRIC.
We found that RAD51 expression was highly associated with cancer cell proliferation across multiple cohorts, which agrees with Maack et al., who reported that RAD51 was more highly expressed in invasive breast cancer with higher grades [15]. Although multigene assay risk scores were not available in the cohorts examined in this study, our results were consistent with previous studies in that RAD51 expression was proportional to clinical proliferation indices such as Nottingham histological grade across all cohorts and the Tubular score, Nuclear score, and Mitotic score in TCGA. RAD51 was most highly expressed in TNBC, which is known to be the most aggressive subtype of breast cancer. Although not consistent in all cohorts, our study suggested that high RAD51 expression occurred at more advanced stages of breast cancer, such as in the presence of multiple lymph node metastases, which is consistent with a previous report that RAD51 protein was associated with cancer progression and metastasis of sporadic breast cancer [70]. High RAD51 breast cancers had a higher mutational burden and an increased number of neoantigens, and thus were more immunogenic. Although there was increased immune cell infiltration in high RAD51 breast cancer, none of the immune-related gene sets were enriched in RAD51 high tumors, suggesting that anti-cancer immunity was not truly activated. As a result of its strong reflection of cancer aggressiveness, high RAD51 expression was associated with poor survival. RAD51 was highlighted as a potential marker for predicting treatment response of breast cancer. BRCA-deficient ovarian and breast cancers with HRD showed sensitivity to PARP inhibitors and DNA-damaging drugs such as platinum, because these drugs arrest a large number of replication forks and lead to synthetic lethality [71]. Since these processes can be circumvented by RAD51, which plays a central role in the repair and restart of replication forks [72,73], the high expression of RAD51 is thought to lead to resistance to these drugs [74].
RAD51 histological expression as identified by fluorescent immunostaining was found to reflect homologous recombination repair function and was claimed as a predictive marker of pCR after NAC in TNBC [75]. Loss of RAD51 fusion in TNBC correlated with HRD as well as with pCR after platinum-based neoadjuvant chemotherapy [76]. However, RAD51 gene expression in our study showed discrepant results to the previously reported RAD51 assay, which was a functional HRD marker scored by simultaneous expression of both RAD51 and geminin, a cell proliferation marker [75]. Low RAD51 tumors determined by RAD51 assay were most frequently of the TNBC subtype, which was inverse to the relation found by our RAD51 gene expression study. Furthermore, high RAD51 expression was positively correlated with HRD, indicating that there may be a dissociation between these functional HRD markers and the gene expression of RAD51. Comparison of drug sensitivity with RAD51 expression suggested that RAD51 expression may be positively correlated with chemotherapy sensitivity in TNBC cell lines. Interestingly, no resistance to PARP inhibitors was observed. The original RAD51 assay study also showed that RAD51 was barely expressed in the baseline biopsy samples but was upregulated in samples taken immediately after radiation-induced DNA damage [75]. However, RAD51 gene expression was downregulated after NAC in our study comparing pre and post NAC samples. RAD51 was not under-expressed in the group that achieved pCR after NAC, and conversely, was highly expressed in the pCR group in some cohorts. It is unclear whether this difference is due to differences between RAD51 gene expression in the RAD51 assay and in bulk tumors, but the function of RAD51 as a marker of drug sensitivity is questionable.
The limitations of this study are as follows. First, there is a patient selection bias in the large cohorts included in this analysis, because the patient information was collected more than 10 years ago. Patients receiving newly authorized treatments, such as PARP inhibitors, are not included. Second, the in-vitro cohorts were all small, with fewer than 30 cell lines, so studies of PARP inhibitors across a larger number of cell lines may give different results. In addition, we did not perform in-vivo or in-vitro experiments, so the mechanisms by which RAD51 induces cell proliferation and drug resistance will require more detailed testing. Given our result that RAD51 expression was associated with immunogenicity, it was of interest to investigate its potential as a biomarker for immunotherapy. However, we were unable to pursue this since we did not have access to breast cancer patient cohorts with information on response to immunotherapy. Finally, as all our analyses were conducted in retrospective cohorts, prospective studies will need to be designed to investigate the usefulness of RAD51 as a biomarker.

Conclusion
RAD51 expression is strongly associated with aggressive cancer biology, cancer cell proliferation, and poor survival in breast cancer.