Increased frequency of CHD1 deletions in prostate cancers of African American men is associated with rapid disease progression without inducing homologous recombination deficiency

We analyzed genomic data derived from the prostate cancers of African American and European American men in order to identify differences that may contribute to racial disparity in outcome and that could also define novel therapeutic strategies. In addition to analyzing patient-derived next generation sequencing data, we performed FISH-based confirmatory studies of Chromodomain helicase DNA-binding protein 1 (CHD1) loss on prostate cancer tissue microarrays. We created CRISPR-edited, CHD1 deficient prostate cancer cell lines for genomic, drug sensitivity and functional homologous recombination (HR) activity analysis. We found that subclonal deletion of CHD1 is nearly three times as frequent in prostate tumors of African American men as in those of men of European ancestry, and that it is associated with rapid disease progression. We further showed that CHD1 deletion is not associated with homologous recombination deficiency associated mutational signatures in prostate cancer. In prostate cancer cell line models, CHD1 deletion did not induce HR deficiency as detected by the RAD51 foci formation assay or by mutational signatures, which was consistent with the moderate increase in olaparib sensitivity. CHD1 deficient prostate cancer cells, however, showed higher sensitivity to talazoparib. CHD1 loss may contribute to the worse outcome of prostate cancer in African American men. A deeper understanding of the interaction between CHD1 loss and PARP inhibitor sensitivity will be needed to determine the optimal use of targeted agents such as talazoparib in the context of castration resistant prostate cancer.


INTRODUCTION
Despite an improving trend, African American (AA) men with prostate cancer (PCa) still have a significantly worse outcome, with a 2.2-fold higher mortality rate compared with men of European ancestry (EA) (1). Recent studies demonstrated that AA men are at higher risk of progression after radical prostatectomy, even in equal access settings and when accounting for socioeconomic status (2,3). While the reasons underlying these disparities are multifactorial, these data strongly argue that germline and/or somatic genetic differences between AA and EA men may in part explain these differences.
Comparative analyses of AA and EA prostate tumors have identified several genomic differences. PTEN deletions, ERG rearrangements and consequent ERG over-expression are more frequent in PCas of EA men (4)(5)(6). In contrast, LSAMP and ETV3 deletions, ZFHX3 mutations, MYC and CCND1 amplifications and KMT2D truncations are more frequent in PCas of AA men (7)(8)(9). ERF, an ETS transcriptional repressor, also showed an increased mutational frequency in AA prostate cancer cases with probable functional consequences such as increased anchorage independent growth (10), and SPINK1 expression is also enriched in African American PCa (11).
Chromodomain helicase DNA-binding protein 1 (CHD1) deletion is frequently present in prostate cancer. Deletions are associated with increased Gleason score and faster biochemical recurrence (12), activation of transcriptional programs that drive prostate tumorigenesis (13) and enzalutamide resistance (14).
Mechanistically, CHD1 loss influences prostate cancer biology in at least two ways. CHD1, an ATPase-dependent chromatin remodeler, contributes to a specific distribution of androgen receptor (AR) binding in the genome of prostate tissue. When it is lost, the AR cistrome redistributes to HOXB13 enriched sites and thus alters the transcriptional program of prostate cancer cells (13). CHD1 may also contribute to genome integrity. It is required for the recruitment of CtIP, an exonuclease, to DNA double strand breaks (DSB) to initiate end-resection. Impairment of this important step of DSB repair upon CHD1 loss was proposed to lead to homologous recombination deficiency (15,16). The functional impact of CHD1 loss is likely further influenced by the presence of SPOP mutations, which were reported to be associated with the suppression of DNA repair (17).
CHD1 loss is frequently subclonal (18), that is, present only in a subset of cells. This makes its detection by next generation sequencing more challenging (19), and it may go undetected depending on the fraction of cells harboring this aberration. Therefore, the true proportion of PCa cases with CHD1 loss may be underestimated. We thus decided to investigate the frequency of CHD1 loss in EA and AA PCa by methods more sensitive to subclonal deletions, including the evaluation of multiple tumor foci present in each prostatectomy specimen.

RESULTS
Subclonal CHD1 deletion is more frequent in African American prostate cancers and associated with worse clinical outcome.
CHD1 is frequently subclonally deleted in prostate cancer (18). Our initial analysis of the SNP array data from TCGA comparing AA and EA PCa cases suggested that the subclonal loss of CHD1 may be a more frequent event in AA men (Suppl. Figures 1 and 2). To independently validate this observation, we assessed CHD1 copy number by FISH (for probe design see Suppl. Figure 3) in tissue microarrays (TMAs) sampling multiple tissue cores from each tumor focus. Sampling included index tumors and non-index tumors per whole mounted radical prostatectomy section in a matched cohort of 91 AA and 109 EA patients from the equal-access military healthcare system (Fig. 1a). Key clinico-pathological features including age at diagnosis, serum PSA level at diagnosis, pathological T-stage, Gleason sum, Grade group, margin status, biochemical recurrence (BCR) and metastasis showed no significant differences between AA and EA cases (Suppl. Table 1a). Consistent with the cohort design and long-term follow up (median: 14.5 years), we observed a 40% BCR rate and a 16% metastasis rate (20). For each case up to four cancerous foci were analyzed, each sampled by two TMA punch cores on average (for details see Methods and Suppl. Tables 1a, 1b and 1c). We detected monoallelic CHD1 loss in 27 out of 91 AA cases (29.7%) and 14 out of 109 (11%) EA cases, indicating that CHD1 deletion is about three times more frequent in prostate tumors of AA men. Our FISH data showed only 3 cases (2 AA and 1 EA) in which all TMA punch cores in a single tumor focus harbored CHD1 deletion across the entire sampled area of a given tumor (Fig. 1b; see the "FISH assay" part of the Materials and Methods for details). In most cases CHD1 deletion was present in only a subset of tumor glands within a 1 mm TMA punch, which further confirmed the subclonal nature of CHD1 deletion in prostate cancer. As a control, we performed FISH staining for PTEN deletion and immunohistochemistry (IHC) staining for ERG overexpression in a subset of the cohort (42 AA and 59 EA prostate cancer cases), confirming the previously described frequency differences between AA and EA PCa (4, 5) (Suppl. Table 1e). CHD1 deletion, PTEN deletion and ERG expression were frequently mutually exclusive, both when individual tumor cores were considered and when all tumor cores from a given patient were considered (Suppl. Figures 4a and 4b). In general, the genomic defects including CHD1 deletion, PTEN deletion and ERG expression were mainly detected in index tumors.
Further analyses revealed a significant association between CHD1 deletion and both pathologic stage and Gleason sum. A higher frequency of CHD1 deletion was detected in T3-4 pathological stage compared to T2 stage (P = 0.043, Suppl. Table 1d). Prostate cancer cases with higher Gleason sum scores (3 + 4, 4 + 3, 8-10) were seen more frequently in the CHD1 deletion group than in the non-deletion group (P < 0.001). In contrast, the lower Gleason sum score (3 + 3) was more often seen in non-deletion cases (P < 0.001, Suppl. Table 1d). CHD1 deletion was more commonly detected in cases with higher grade group (GG3 and GG4-GG5) (P = 0.024, Suppl. Table 1d). CHD1 deletion was more strongly associated with rapid biochemical recurrence in AA cases (P < 0.0001, Fig. 1c) than in EA cases (P = 0.051, Suppl. Figure 5b). Univariable survival analysis was conducted to determine the association of the clinical features, including CHD1 deletion, with BCR and metastasis for further multivariable model analysis (Suppl. Figures 5a and 5c, respectively). The multivariable Cox model analysis showed that CHD1 deletion was an independent predictor of BCR (P = 0.012 and P = 0.032, Suppl. Figure 5b) after adjusting for age at diagnosis, PSA at diagnosis, race, pathological tumor stage, grade group and surgical margins. Moreover, a significant correlation between CHD1 deletion and metastasis was also detected in both AA (P = 0.0055, Fig. 1d) and EA (P = 0.023, Suppl. Figure 5d) patients by Kaplan-Meier analysis. Following multivariable adjustment in the Cox proportional hazards model, CHD1 deletion was significantly associated with metastasis (P = 0.032 and P = 0.048, Suppl. Figure 5d). Taken together, our data strongly support the association of CHD1 deletions with aggressive prostate cancer and worse clinical outcomes in AA PCa.
Estimating the frequency of subclonal CHD1 loss in next generation sequencing data of AA and EA prostate cancer.
Previous publications characterizing the genome of AA prostate cancer cases (10,21) did not report an increased frequency of CHD1 loss such as the one we observed in the FISH-based analysis presented above. Methods to detect copy number variations from WGS or WES data have at least two major limitations. First, subclonal copy number variations (sCNV) can be missed if they are present in fewer than 30% of the sampled cells (19). Second, copy number loss can be underestimated with smaller deletions (e.g., < 10 kb). Although various tools are available for inferring sCNVs from WES, WGS or SNP array data, such as TITAN (19), THetA (22), and Sclust (23), they are designed to work on the entire genome and likely miss small (~1-10 kb) CNVs during the data segmentation process. To maximize the accuracy of our analysis we performed a gene-focused analysis of the copy number loss in CHD1. We considered several factors: the change in the normalized coverage in the tumors relative to their normal pairs, the cellularity of the tumor genome, and the approximate proportion of tumor cells exhibiting the loss. We also evaluated whether the deletion was heterozygous or homozygous using a statistical method designed for calling subclonal loss of heterozygosity (LOH) events within a confined genomic region (details are available in the Materials and Methods section and in the Supplementary Material).
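The cohort-level comparison reported in the following paragraph (20 of 77 AA versus 73 of 453 EA cases with CHD1 loss) is a 2x2 contingency test. As an illustration only, a minimal pure-Python sketch of a Fisher exact test on those counts is given here. The function names are hypothetical, and because the text does not state whether the reported p-value is one- or two-sided, the sketch computes both rather than asserting which one matches.

```python
from math import comb


def hypergeom_pmf(k, n_total, n_success, n_draw):
    """P(X = k) when n_draw items are drawn from n_total, n_success of which carry the event."""
    return comb(n_success, k) * comb(n_total - n_success, n_draw - k) / comb(n_total, n_draw)


def fisher_exact(a, b, c, d):
    """Fisher exact test on the 2x2 table [[a, b], [c, d]].

    Returns (odds_ratio, one_sided_p, two_sided_p). The one-sided p-value
    tests for enrichment of the event in the first row.
    """
    n_total = a + b + c + d
    n_success = a + c              # all cases carrying the event
    n_draw = a + b                 # size of the first group
    k_min = max(0, n_draw - (n_total - n_success))
    k_max = min(n_draw, n_success)
    pmf = {k: hypergeom_pmf(k, n_total, n_success, n_draw) for k in range(k_min, k_max + 1)}
    p_obs = pmf[a]
    one_sided = sum(p for k, p in pmf.items() if k >= a)
    # Two-sided: sum over all tables that are at most as likely as the observed one.
    two_sided = min(1.0, sum(p for p in pmf.values() if p <= p_obs * (1 + 1e-9)))
    odds_ratio = (a * d) / (b * c)
    return odds_ratio, one_sided, two_sided


# Counts taken from the text: AA = 59 WES + 18 WGS, EA = 408 WES + 45 WGS.
aa_deleted, aa_total = 20, 59 + 18
ea_deleted, ea_total = 73, 408 + 45
odds_ratio, p_one_sided, p_two_sided = fisher_exact(
    aa_deleted, aa_total - aa_deleted, ea_deleted, ea_total - ea_deleted
)
print(f"odds ratio = {odds_ratio:.2f}")
print(f"one-sided p = {p_one_sided:.3f}, two-sided p = {p_two_sided:.3f}")
```

Exact integer binomial coefficients keep the sketch free of third-party dependencies; a statistics package would normally be used in practice.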
Using this approach in a large cohort (N = 530 cases; 59 AA WES, 18 AA WGS, 408 EA WES and 45 EA WGS; for details see the Supplementary Material and Suppl. Figures 6-25), we observed that CHD1 is more frequently deleted in AA tumors (N = 20; 26%) than in EA tumors (N = 73; 16%). Taken together, when next generation sequencing based copy number variations were analyzed with a more sensitive method on the combined cohorts of whole exomes and whole genomes, CHD1 loss was detected more frequently in AA cases than in EA cases (p = 0.029, Fisher exact test), which is consistent with our observations with the FISH method in the TMA cohort.
Subclonal CHD1 loss is present in a significant subset of prostate cancer cases without SPOP mutations.
SPOP mutations and CHD1 deletions often occur together in prostate cancer, with SPOP mutation as an early event and CHD1 loss as a later, subclonal event during tumor progression (18). However, as we pointed out above, subclonal CHD1 loss is often missed by routine next generation sequencing analysis. Therefore, we reanalyzed the next generation sequencing cohorts for SPOP mutations and found that CHD1 loss and SPOP mutations frequently occur independently of each other as well. In the 530 cases analyzed, we identified 61 SPOP mutant cases and 95 subclonal CHD1 deletions, but only 42 cases (about 69% of SPOP mutants and 44% of CHD1 deleted cases) had both genomic aberrations present.
CHD1 loss is not associated with genomic aberration features that are usually observed in HR-deficient cancers.
CHD1 loss was proposed to be associated with reduced HR competence in cell line model systems (15,24). Detecting and quantifying HR deficiency in tumor biopsies is currently best achieved by analyzing next generation sequencing data for specific HR deficiency associated mutational signatures. Those include: 1) a single nucleotide variation based mutational signature (COSMIC signature 3 (25) and SBS3 (26)); 2) a short insertions/deletions based mutational profile, often dominated by deletions with microhomology, a sign of alternative repair mechanisms joining double-strand breaks in the absence of HR, which is also captured by COSMIC indel signatures ID6 and ID8 (26); 3) large scale rearrangements such as non-clustered tandem duplications in the size range of 1-100 kb (mainly associated with BRCA1 loss of function) (27). Some of these signatures can be efficiently induced by the inactivation of BRCA1, BRCA2 or several other key downstream HR genes (Suppl. Figures 26-44) (28). HR deficiency is also assessed in the clinical setting by a large scale genomic aberration based signature, namely the HRD score (29), which is also approved as a companion diagnostic for PARP inhibitor therapy. A composite mutational signature, HRDetect (30), which combines several of the mutational features listed above, was also evaluated as an alternative method to detect HR deficiency in prostate adenocarcinoma (31). In order to investigate whether an association between CHD1 loss and HR deficiency exists in prostate cancer biopsies, we performed a detailed analysis of the mutational signature profiles of CHD1 deficient prostate cancer.
We analyzed whole exome and whole genome sequencing data of several prostate adenocarcinoma cohorts (for the detailed results see the Supplementary Material) containing samples from both AA (52 WES and 18 WGS cases) and EA (387 WES and 45 WGS cases) individuals in order to determine whether CHD1 loss is associated with the HRD mutational signatures.
We divided the cohorts into three groups: 1) BRCA2 deficient cases that served as positive controls for HR deficiency, 2) CHD1 deleted cases without mutations in HR genes, and 3) cases without BRCA gene aberration or CHD1 deletion (for details see Supplementary Material).
In the WGS cohorts, CHD1 deficient cases showed a limited increase of the HRD score relative to the control cases, but their scores remained significantly lower than those of the BRCA2 deficient cases, and none of the CHD1 deficient cases had an HRD score above the threshold currently accepted in the clinic as an indicator of HR deficiency (Fig. 2a). Since CHD1 deletions tend to be subclonal, we investigated whether the low HRD scores are due to a "dilution" effect, where the HR proficient regions without CHD1 deletion reduce the intensity of the HRD score. The HRD score did not show a statistically significant correlation with the estimated fraction of the subclonal loss of CHD1 (Fig. 2a, Suppl. Figures 26-27), and even cases in which all cells had CHD1 deletion did not have an HRD score high enough to indicate HR deficiency. Similarly, the most characteristic HRD associated single nucleotide variation signature (signature 3, SBS3) was significantly increased in the BRCA2 deficient cases but only slightly increased in the CHD1 deficient cases (Fig. 2b).
The increase in the relative contribution of the short indel signatures ID6 and ID8 to the total number of indels, which is characteristic of biallelic loss of function of BRCA2, was not observed in the CHD1 loss cases (see Supplementary Material). This suggests that the alternative end-joining repair pathways do not dominate the repair of DSBs in CHD1 deleted tumors. In the WGS cohort we also determined the number of structural variants (SVs) as previously defined (Suppl. Figure 35) (32). The SV signature associated with HR deficiency (SV3) was not elevated in the CHD1 deficient tumors. Interestingly, an SV signature characterized by an increase in the number of non-clustered 1 kb-1 Mb deletions (termed RS5 (27)) was significantly increased in both the BRCA2 mutant and the CHD1 deficient cases (Fig. 2c), with the latter showing a less significant increase. Notably, this signature also displayed a strong subclonal dilution. This signature was previously described to be associated with BRCA2 deficiency (27,32), but it is also present in tumors without BRCA2 deficiency, and the current version of this signature, SV5 (https://cancer.sanger.ac.uk/signatures/sv/sv5/), is not associated with HR deficiency.
Finally, the BRCA2 deficient cases showed high HRDetect scores (Suppl. Figures 36-38). However, since the HRDetect scores arise from a logistic regression, which involves the non-linear transformation of the weighted sum of its attributes, even slightly lower linear sums in the CHD1 loss cases compared to the BRCA2 mutant cases can result in substantially lower HRDetect scores (Suppl. Figure 38).
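The non-linearity described above can be illustrated with a minimal sketch of the logistic transformation itself. The linear sums used here are hypothetical round numbers chosen for illustration; they are not HRDetect weights or values from any cohort in this study.

```python
from math import exp


def logistic(linear_sum):
    """Map a weighted linear sum of features to a probability-like score in (0, 1)."""
    return 1.0 / (1.0 + exp(-linear_sum))


# Hypothetical linear sums: a moderate shift on the linear scale
# produces a much larger shift on the score scale near the midpoint.
for label, linear_sum in [("higher linear sum", 2.0), ("midpoint", 0.0), ("lower linear sum", -2.0)]:
    print(f"{label:>18}: linear sum = {linear_sum:+.1f} -> score = {logistic(linear_sum):.3f}")
```

Because the curve is steepest around a linear sum of zero, samples whose weighted feature sums differ only moderately can end up on opposite sides of a score threshold.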
We have previously processed WES prostate adenocarcinoma data for the various HR deficiency associated mutational signatures (31). When the CHD1 deficient cases were compared to the BRCA1/2 deficient and BRCA1/2 intact cases, we obtained results that were consistent with the WGS based results outlined above (Suppl. Figures 39-44).
Deleting CHD1 in prostate cancer cell lines does not induce homologous recombination deficiency as detected by the RAD51 foci formation assay or mutational signatures.
In order to investigate the functional impact of the biallelic loss of CHD1 we created several CRISPR-Cas9 edited clones of the AR-negative PC-3 and AR-positive 22Rv1 cell lines (Fig. 4a, Suppl. Figure 47a). RAD51 foci formation was induced by 4 Gy irradiation. The CHD1 deficient prostate cancer cell lines did not show a reduction of RAD51 foci formation (Fig. 3a). Non-irradiated cells were used as controls (Suppl. Figure 46). DNA repair pathway aberration induced mutational signatures can also be detected in cell lines by whole genome sequencing (28,33). We grew single cell clones from the PC-3 and 22Rv1 cell lines for 45 generations to accumulate the genomic aberrations induced by CHD1 loss (Suppl. Figure 45). Two such late passage clones and an early passage clone were subjected to WGS analysis. All the clones retained the BRCA2 wild type background of their parental clone. Furthermore, CHD1 elimination did not induce any of the mutational signatures commonly associated with HR deficiency (Fig. 3b-d).
Taken together, CHD1 loss in prostate cancer cell line model systems did not induce any signs of HR deficiency.
CHD1 deficient cell lines show limited sensitivity to PARP inhibitors, with talazoparib more effective in some model systems.
CHD1 deficient cancer cells were reported to have moderately increased sensitivity to the PARP inhibitor olaparib (15), which is consistent with the lack of observed HR deficiency described in the previous section. PARP inhibitors were initially thought to exert their therapeutic activity by inhibiting the enzymatic activity of PARP, but it was later revealed that PARP trapped on DNA may make a more significant contribution to cytotoxicity (reviewed in (34)). Therefore, in addition to olaparib, we also determined the efficacy of the strong PARP trapping agent talazoparib in several prostate cancer cell lines in which CHD1 was either knocked out or suppressed. In addition to the PC-3, 22Rv1 and LNCaP cells with CRISPR-Cas9-mediated CHD1 deletion, we also suppressed CHD1 by shRNA in the C4-2b, Du145 and MDA-PCa-2b prostate cancer cell lines; the last of these is one of the few AA derived prostate cancer cell line models. Consistent with previous reports, deleting CHD1 induced a maximum of approximately 5-fold increase in olaparib sensitivity, with minimal or no change in some cell lines (Fig. 4 panels c, e, i, k, o, q) (15). The increase in talazoparib sensitivity was similar to that of olaparib for most cell lines, with a few notable exceptions. Talazoparib sensitivity increased by about 15-20-fold in the CHD1 deficient PC-3 cells (Fig. 4d) and, notably, in the CHD1 deficient AA derived cell line (MDA-PCa-2b) talazoparib sensitivity increased by 4-fold (Fig. 4p), while the increase in olaparib sensitivity was approximately 1.5-fold (Fig. 4o). In summary, in four of the six cell lines (Fig. 4d, j, l, p), CHD1 suppression was associated with a talazoparib sensitivity consistent with therapeutically achievable concentrations (around 10 nM or less). These data suggest that trapped PARP may have a more toxic effect in cells with CHD1 deficiency.
The impact of SPOP mutations on the clonality of CHD1 deletions and HR deficiency associated mutational signatures.
Although less frequent, SPOP mutations and CHD1 deletions may co-exist in a subset of prostate cancers (35), and SPOP mutations have been shown to suppress key HR genes (17). Therefore, we investigated whether the presence of an SPOP mutation in a CHD1 deficient prostate cancer is associated with a further increase of HR deficiency associated mutational signatures. We identified cases with SPOP mutations or CHD1 deletions only, cases with both SPOP mutations and CHD1 deletions, and cases without either of those aberrations (Fig. 5a). Cases with both aberrations showed significantly higher levels of signature SBS3, RS5 and the total number of large-scale structural rearrangements relative to cases with either aberration alone. It should be noted, however, that the proportion of cells in a given tumor with CHD1 deletions tended to be significantly higher in SPOP mutant cases than in cases with CHD1 deletions without SPOP mutations. Therefore, it is possible that the presence of SPOP mutations intensifies HR deficiency associated mutational signatures by increasing the proportion of CHD1 deficient cells in a tumor (Fig. 5b).
Finally, we investigated whether adding SPOP mutations to a CHD1 deficient background increases PARP inhibitor sensitivity. We overexpressed the SPOP mutant SPOP F102C in the CHD1 deleted PC-3 cells (Suppl. Figure 48), but we could not detect a further increase in either olaparib or talazoparib sensitivity (Suppl. Figure 47).

DISCUSSION
The presence of functionally relevant subclonal mutations in various solid tumor types is well documented (36, 37). Deletions present only in a minority of tumor cells are difficult to detect unless more targeted analytical approaches are applied. Here we present one example of such detection bias with significant functional relevance. We used a FISH based approach to detect CHD1 deletion in PCa. Consistent with the previously described subclonal nature of CHD1 loss, we found that while this gene is often deleted in prostate cancer, it is rarely deleted in every tumor core or tumor focus. When we took the subclonal nature of CHD1 loss into consideration, a significant racial disparity emerged, with an approximately 3-fold increase in the frequency of CHD1 deletion in AA PCa patients vs. EA patients. This loss was also significantly associated with rapid disease progression to biochemical recurrence and metastasis. Since CHD1 loss is associated with a more malignant phenotype, the significantly higher frequency of CHD1 loss in AA PCa may account for the diverging clinical course observed in PCa between men of African and European ancestry. It is possible that CHD1 loss is in fact more frequent in EA PCa as well, but with a lower focal density than in AA cases. This is certainly a limitation of our bioinformatics approach. However, CHD1 single cell-level deletions have not been observed in our high-resolution FISH assay in tumors of EA patients.
Several studies pointed out a potential link between CHD1 loss and homologous recombination deficiency (15,16,24). Interestingly, CHD1 null cells showed only a modest (3-fold) increase in sensitivity to PARP inhibitor or platinum-based therapy (15,16,24). This suggested that CHD1 loss may not lead to a significant level of HR deficiency. Our results support this assumption, since CHD1 deficient tumors did not display increased levels of the verified HR deficiency associated mutational signatures and CHD1 loss in cell lines did not induce HR deficiency as detected by functional assays either.
Consequently, the limited sensitivity of CHD1 deficient cell lines to PARP inhibitors suggests that this treatment may be less effective than in bona fide HR deficient cases, such as those with inactivated BRCA2. Nevertheless, the facts that talazoparib is effective in some of the CHD1 deficient cell line models and that CHD1 suppression induces enzalutamide sensitivity (14) may explain some of the unexpected results of the TALAPRO-2 study (38). In this trial, patients without mutations in the DNA damage pathway (BRCA2 etc.) also benefitted from a combination of talazoparib and enzalutamide. We hypothesize that talazoparib, perhaps by eliminating CHD1 deficient cells, may delay the emergence of enzalutamide resistance, which may define an effective therapy in a significant subset of AA PCa cases, namely those with CHD1 deficiency.
CHD1 was also reported to be associated with an altered immunogenic phenotype in prostate cancer (39). These results, coupled with the demonstrated differences in tumor immunity between EA and AA prostate cancer cases (40), raise the possibility that CHD1 deficiency may define an AA PCa population sensitive to targeted immunotherapy.
Finally, the somewhat increased genomic instability of CHD1 deficient cases, as reflected by the moderately elevated HRD scores, may also indicate that it is the genomic instability rather than the CHD1 loss that is responsible for the significantly worse outcome of CHD1 deficient cases detected in our AA PCa cohort. Separating these two effects will require further studies.

Cohort selection and Tissue Microarray (TMA) generation
The aggregate cohort was composed of 2 independently selected cohorts of samples from the Bio-specimen bank of the Center for Prostate Disease Research and the Joint Pathology Center. Whole-mount prostates were collected from 1996 to 2008 with a minimal follow-up time of 10 years. Self-reported race was validated by genomic ancestry analysis showing a 95% accuracy (41). The first cohort of 42 AA and 59 EA cases was described before (7,41). Similarly, the second cohort of 50 AA and 50 EA cases was selected based on tissue availability (> 1.0 cm tumor tissue) and tissue differentiation status (1/3 well differentiated, 1/3 moderately differentiated and 1/3 poorly differentiated). All the selected cases had signed patient consent forms for tissue research applications. Patients who donated tissue for this study also contributed to the long-term follow-up data (the mean follow-up time was 14.5 years). Our study was reviewed by the Uniformed Services University's Human Research Protections Program (HRPP) Office and "determined to be considered research not involving human subjects as defined by 32 CFR 219.102(e) because the research involves the use of de-identified specimens and data not collected specifically for this study" (Ref #910230). Each TMA block comprised 10 cases per slide, and each case was represented by 2 benign tissue cores, 2 prostatic intraepithelial neoplasia (PIN) cores if available, and 4-10 tumor cores covering the index and non-index focal tumors from formalin fixed paraffin embedded (FFPE) whole-mount blocks. The numbers of patients, tumors and tumor cores of the combined cohort are given in Supplementary Table 1d. All the blocks were sectioned into 8 µm tissue slides for FISH staining.
Fluorescence in situ hybridization (FISH) assay: A gene-specific FISH probe for CHD1 was generated by selecting a combination of bacterial artificial chromosome (BAC) clones (Thermo Fisher Scientific, Waltham, MA) within the region of observed deletions near 5q15-q21.1, resulting in a probe matching ca. 430 kbp and covering the CHD1 gene as well as some upstream and downstream adjacent genomic sequences, including the complete repulsive guidance molecule B (RGMB) gene. Due to the high degree of homology of chromosome 5-specific alpha satellite centromeric DNA to the centromere repeat sequences on other chromosomes, and the resulting potential for cross-hybridization to other centromere sequences, particularly on human chromosomes 1 and 19, a control probe matching a stable genomic region on the short arm of chromosome 5, instead of a centromere 5 probe, was used for chromosome 5 counting (Supplementary Fig. 1e). The FISH assay of CHD1 was performed on TMAs as previously described (7). The green signal was from the probe detecting the control chromosome 5 short arm and the red signal was from the probe detecting the CHD1 gene copy. The FISH-stained TMA slides were scanned with a Leica Aperio VERSA digital pathology scanner for further evaluation. The criterion for CHD1 deletion was that in over 50% of counted cancer cells (with at least 2 copies of the chromosome 5 short arm detected in one tumor cell) more than one copy of the CHD1 gene had to be undetected. When examining tumor cores, deletions were called when more than 75% of evaluable tumor cells showed loss of the allele. Focal deletions were called when more than 25% of evaluable tumor cells showed loss of the allele, or when more than 50% of evaluable tumor cells in each gland of a cluster of two or three tumor glands showed loss of the allele. Benign prostatic glands and stroma served as built-in controls.
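The core-level calling thresholds stated above can be summarized in a short sketch. It is a simplification under stated assumptions: the function name and return labels are hypothetical, only the 75% and 25% cell-fraction thresholds are implemented, and the additional gland-cluster rule for focal deletions is deliberately left out.

```python
def call_core(cells_with_loss, evaluable_cells):
    """Classify one tumor core from the fraction of evaluable tumor cells showing allele loss.

    Simplified: ignores the gland-cluster criterion for focal deletions.
    """
    if evaluable_cells <= 0:
        raise ValueError("a core needs at least one evaluable tumor cell")
    if not 0 <= cells_with_loss <= evaluable_cells:
        raise ValueError("cells_with_loss must lie between 0 and evaluable_cells")
    fraction = cells_with_loss / evaluable_cells
    if fraction > 0.75:        # more than 75% of evaluable cells: deletion
        return "deletion"
    if fraction > 0.25:        # more than 25% of evaluable cells: focal deletion
        return "focal deletion"
    return "no deletion"


# Hypothetical cell counts for three cores.
for lost, total in [(80, 100), (30, 100), (10, 100)]:
    print(f"{lost}/{total} cells with loss -> {call_core(lost, total)}")
```

Both thresholds are strict inequalities, matching the "more than" wording of the criteria, so a core with exactly 75% affected cells is called a focal deletion.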
The sub-clonality of CHD1 deletion was presented as a heatmap showing the CHD1 deletion status of all the tumors sampled from the whole-mount sections of each patient. The color designations were as follows: red (full deletion), meaning that all the tumor cores within a given tumor carry a CHD1 deletion; yellow (sub-clonal deletion), meaning that only some of the tumor cores within a given tumor carry a CHD1 deletion; and green (no deletion), meaning that no tumor core carries a CHD1 deletion (Supplementary Table 1b).
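The three-color designation can be expressed as a small rule over the per-core deletion calls of one tumor. The sketch below is illustrative only; the function name and the boolean input format are assumptions, not part of the original analysis.

```python
def tumor_status(core_has_deletion):
    """Summarize one tumor from per-core CHD1 deletion calls (True = deleted).

    Returns a (status, heatmap_color) pair following the color scheme in the text.
    """
    cores = list(core_has_deletion)
    if not cores:
        raise ValueError("a tumor needs at least one evaluated core")
    if all(cores):
        return ("full deletion", "red")
    if any(cores):
        return ("sub-clonal deletion", "yellow")
    return ("no deletion", "green")


# Hypothetical core calls for three tumors.
print(tumor_status([True, True]))
print(tumor_status([True, False, False]))
print(tumor_status([False, False]))
```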
Statistical analysis: The correlations between CHD1 deletion and clinico-pathological features, including pathological stage, Gleason score sum, Grade group, margin status, and therapy status, were calculated using an unpaired t-test or chi-square test. Gleason Grade Groups, from Grade group 1 to Grade group 5, were derived from the Gleason patterns for the cohort. Due to the small sample sizes within each Grade group, Grade group 1 through Grade group 3 were categorized as one level, as were Grade group 4 through Grade group 5. A BCR was defined as either two successive post-RP PSAs of ≥ 0.2 ng/mL or the initiation of salvage therapy after a rising PSA of ≥ 0.1 ng/mL. A metastatic event was defined by a review of each patient's radiographic scan history, with a positive metastatic event defined as the date of a positive CT scan, bone scan, or MRI in their record. The associations of CHD1 deletion with time-to-event outcomes, including BCR and metastasis, were analyzed by Kaplan-Meier survival curves and tested using a log-rank test. Multivariable Cox proportional hazards models were used to estimate hazard ratios (HR) and 95% confidence intervals (CIs), adjusting for age at diagnosis, PSA at diagnosis, race, pathological tumor stage, grade group, and surgical margins. We checked the proportional hazards assumption by plotting the log-log survival curves. A P-value < 0.05 was considered statistically significant. Analyses were performed in R version 4.0.2.
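For readers unfamiliar with the Kaplan-Meier estimator used for the BCR-free and metastasis-free survival curves, a minimal sketch follows. The study's analyses were run in R; this Python version, its function name and its toy follow-up data are illustrative assumptions only.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. BCR) occurred at that time, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving   # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up times in years; 0 marks a censored patient.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Patients censored at a given time are still counted as at risk for events at that same time, which is the usual convention.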
Immunohistochemistry for ERG: ERG immunohistochemistry was performed as previously described (42). Briefly, four µm TMA sections were dehydrated and blocked in 0.6% hydrogen peroxide in methanol for 20 min and were processed for antigen retrieval in EDTA (pH 9.0) for 30 min in a microwave, followed by 30 min of cooling in EDTA buffer. Sections were then blocked in 1% horse serum for 40 min and were incubated with the ERG-MAb mouse monoclonal antibody developed at CPDR (9FY, Biocare Medical Inc.) at a dilution of 1:1280 for 60 min at room temperature. Sections were incubated with the biotinylated horse anti-mouse antibody at a dilution of 1:200 (Vector Laboratories) for 30 min, followed by treatment with the ABC Kit (Vector Laboratories) for 30 min. The color was developed by VIP (Vector Laboratories) treatment for 5 min, and the sections were counterstained with hematoxylin. ERG expression was reported as positive or negative. ERG protein expression was correlated with clinico-pathologic features.

Prostate patients and specimens in the in-silico study cohorts
Evaluation of the self-declared ancestries: Since the available ancestry data were based on the self-assessment of the patients, and accurate identification of the samples was a crucial part of our study, we interrogated the genotypes of 3000 SNPs that are specific to one of the greater Caucasian, African and Asian ancestries in each of the germline samples (43). The data were collected into a single genotype matrix, the first two principal components of which were used to train a non-naïve Bayes classifier to differentiate between the three ancestries (details are available in the supplementary material, Suppl. Figures 5-21).
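One way to picture a Bayes classifier on two principal components is sketched below. It assumes that "non-naïve" means each ancestry is modelled by a bivariate Gaussian with a full (non-diagonal) covariance matrix; this is our reading for illustration and may differ from the classifier described in the supplementary material. All function names and the toy PC coordinates are hypothetical.

```python
import math


def fit_gaussian(points):
    """Mean and full 2x2 covariance of a list of (pc1, pc2) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def log_density(point, mean, cov):
    """Log of the bivariate normal density, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    mahalanobis = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (mahalanobis + math.log(det)) - math.log(2 * math.pi)


def train(labelled_points):
    """labelled_points: dict mapping ancestry label -> list of (pc1, pc2)."""
    total = sum(len(pts) for pts in labelled_points.values())
    return {
        label: (fit_gaussian(pts), math.log(len(pts) / total))
        for label, pts in labelled_points.items()
    }


def classify(model, point):
    """Return the label with the highest log posterior (up to a constant)."""
    return max(
        model,
        key=lambda label: log_density(point, *model[label][0]) + model[label][1],
    )


# Toy PC coordinates (hypothetical, not real genotype data).
training = {
    "African": [(-2.0, 0.1), (-2.2, -0.1), (-1.8, 0.3), (-2.1, 0.2)],
    "Caucasian": [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3), (1.1, 1.1)],
    "Asian": [(1.0, -1.5), (0.8, -1.2), (1.3, -1.6), (1.1, -1.3)],
}
model = train(training)
print(classify(model, (-2.0, 0.15)))
```

Keeping the off-diagonal covariance term is what distinguishes this sketch from a naïve Bayes model, which would treat the two principal components as independent within each class.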
Identification of local subclonal loss of CHD1 in prostate adenocarcinoma: The paired germline and tumor binary alignment (bam) files were analyzed using bedtools genomecov (v2.28.0) (44), and their mean sequencing depths were determined. The coverage over CHD1 and its direct vicinity (chr5:98,853,485-98,930,272 in GRCh38 and chr5:98,190,408-98,262,740 in GRCh37) was collected in 50 bp wide bins into d-dimensional vectors (d_GRCh37 = 1447, d_GRCh38 = 1536) using an in-house tool and samtools (v1.6) (45), and was normalized using the corresponding mean sequencing depths. The linear relationship between the paired germline and tumor coverages was determined in the following form: $t = \beta_0 + \beta_1 g$, where $g$ is the normalized coverage of the germline sample and $t$ is the normalized coverage of its corresponding tumor pair. The intercept ($\beta_0$) was used to ensure that the data were free of outliers, and the slope ($\beta_1$) was used as a raw measure of the observable loss in the tumor. Similar slopes were calculated for 14 housekeeping genes in each of the sample pairs, and these were used to assess the significance of the loss (Supplementary Material).
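The slope calculation described above can be sketched as an ordinary least-squares fit of the depth-normalized tumor coverage against the depth-normalized germline coverage across the 50 bp bins. The sketch below assumes a plain OLS fit; the in-house tool may differ, and the function names and bin counts are hypothetical.

```python
import math


def normalize(bin_coverage, mean_depth):
    """Divide per-bin coverage by the sample's mean sequencing depth."""
    return [c / mean_depth for c in bin_coverage]


def fit_line(x, y):
    """OLS fit of y = intercept + slope * x.

    Returns (intercept, slope, standard error of the slope).
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope_se


# Hypothetical bins: germline at mean depth 30x, tumor at mean depth 40x,
# with the tumor retaining 70% of the expected coverage in this region.
germline_bins = [30.0, 28.0, 32.0, 31.0, 29.0, 33.0]
tumor_bins = [0.7 * g / 30.0 * 40.0 for g in germline_bins]

g = normalize(germline_bins, 30.0)
t = normalize(tumor_bins, 40.0)
intercept, slope, slope_se = fit_line(g, t)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

In this toy case a slope below 1 reflects reduced tumor coverage relative to the germline in the region, and an intercept far from 0 would point to outlying bins.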
The cellularity (c) of the tumors was estimated using sequenza (46) after a rigorous selection of the most reliable cellularity-ploidy pair among the alternative solutions offered by the tool. In order to account for the uncertainty of the reported cellularity values, a beta distribution was fitted on the grid-approximated marginal posterior densities of c. These were used to simulate random variables to determine the proportion of the approximate loss of CHD1 in the tumors, combining the slope of the germline-tumor coverage fit and its standard error with the fitted shape parameters of the cellularity to obtain the cellularity-adjusted slope of the curve. The approximate level of loss in CHD1 is distributed as 1 minus the cellularity-adjusted slope (further details are available in the supplementary materials, Suppl. Figure 23).
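A minimal Monte Carlo sketch of this step is given below. The exact adjustment formula is given in the supplementary material and is not reproduced here; the sketch assumes a simple two-component mixing model in which normal cells (fraction 1 - c) contribute a slope of 1, so that the observed slope equals (1 - c) + c × adjusted slope. That model, the clamping to [0, 1], and all names and parameter values are assumptions made for illustration.

```python
import random


def simulate_loss(slope, slope_se, shape_a, shape_b, n_draws=10000, seed=0):
    """Simulate the level of loss in tumor cells.

    slope, slope_se  -- coverage slope and its standard error
    shape_a, shape_b -- shape parameters of the beta distribution
                        fitted to the cellularity posterior
    """
    rng = random.Random(seed)
    draws = []
    while len(draws) < n_draws:
        c = rng.betavariate(shape_a, shape_b)  # cellularity draw
        if c <= 0.0:
            continue  # guard against division by zero
        s = rng.gauss(slope, slope_se)  # slope draw
        # ASSUMPTION: observed slope = (1 - c) + c * adjusted slope
        adjusted = (s - (1.0 - c)) / c
        loss = 1.0 - adjusted
        draws.append(min(1.0, max(0.0, loss)))  # clamp to a valid proportion
    return draws


# Hypothetical inputs: slope 0.8 +/- 0.01, cellularity centred near 0.8.
draws = simulate_loss(0.8, 0.01, 80.0, 20.0)
mean_loss = sum(draws) / len(draws)
print(f"mean simulated loss = {mean_loss:.3f}")
```

Under this mixing assumption the loss reduces to (1 - slope) / c, so the same observed slope implies a larger loss in a tumor of lower cellularity.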

Local subclonal LOH-calling
The SNP variant allele frequencies (VAF) in the close vicinity of CHD1 in the tumor were collected with GATK HaplotypeCaller (v4.1.0) (47). The coverage and VAF data were carefully analyzed to ensure that we focused strictly on the regions that had suffered the most serious loss (e.g., if only a part of the gene was lost, the unaffected region was excluded from the analysis). Using the tumor cellularity (c) and the estimated level of loss in the tumor, we assessed whether a heterozygous or a homozygous subclonal deletion is more likely to result in the observed frequency pattern (a detailed explanation is available in the supplementary notes, Suppl. Figure 25, Suppl. Tables 2-3).
Mutational signatures. Second-generation somatic point-mutational signatures were estimated with the deconstructSigs R package (48). The list of considered mutational processes, whose signatures' linear combination could lead to the final mutational catalogs (a.k.a. mutational spectra), was determined in a dynamic process in which each signature component was investigated one by one in an iterative manner, and only those components were kept that improved the cosine similarity between the reconstructed and original spectra by a considerable margin (> 0.001).
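The retention criterion can be illustrated with a short sketch that compares the cosine similarity of two reconstructions against the original spectrum. This shows only the similarity test, not the signature fitting itself (which was done with deconstructSigs); the function names and the toy 4-channel spectra are hypothetical, whereas real single-base substitution spectra have 96 channels.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two mutation-count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keeps_signature(original, without_sig, with_sig, min_gain=0.001):
    """True if adding the candidate signature improves the reconstruction
    of the original spectrum by more than min_gain in cosine similarity."""
    gain = cosine_similarity(original, with_sig) - cosine_similarity(
        original, without_sig
    )
    return gain > min_gain


# Toy spectra (hypothetical).
original = [10.0, 5.0, 0.0, 5.0]
reconstruction_without = [10.0, 0.0, 0.0, 0.0]
reconstruction_with = [10.0, 5.0, 0.0, 4.0]

print(keeps_signature(original, reconstruction_without, reconstruction_with))
```

A candidate signature that leaves the reconstruction essentially unchanged yields a gain near zero and is dropped under this rule.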
HRD-scores. The calculation of the genomic scar scores (loss of heterozygosity: LOH, large-scale state transitions: LST, and number of telomeric allelic imbalances: ntAI) was performed using the scarHRD R package (49). The allele-specific segmentation data of the samples were provided by sequenza (46).
Stable CRISPR-Cas9 expressing isogenic PC-3 cell line generation. The full-length SpCas9 ORF was introduced into the PC-3 cell population by lentiviral transduction using the lentiCas9-Blast (Addgene #52962) construct. After antibiotic (blasticidin) selection, the surviving populations were single-cell cloned, and the resulting isogenic cell lines were tested for Cas9 activity by cleavage assay.
Gene knock-out induction. CHD1 was targeted in the CRISPR-Cas9 expressing PC-3 cell line using the guide RNA CHD1_ex2_g1 (gCTGACTGCCTGATTCAGATC), resulting in the PC-3 CHD1 ko 1 and CHD1 ko 2 homozygous knock-out cell lines. The same guide RNA was used to transiently knock out the CHD1 gene in the 22Rv1 parental cell line.
Transfection. Cells were transiently transfected with the Nucleofector® 4D device (Lonza) using supplemented Nucleofector® SF solution and 20 µl Nucleocuvette® strips, following the manufacturer's instructions. Following transfection, cells were resuspended in 100 µl culturing media and plated in 1.5 ml pre-warmed culturing media in a 24-well tissue culture plate. Cells were subjected to further assays 72 h post transfection.
Duolink® Proximity Ligation Assay (Sigma) was carried out using antibodies against γH2AX and RAD51 (Cell Signaling) according to the manufacturer's instructions. Signals were detected by fluorescence microscopy (Nikon Ti2-e Live Cell Imaging System). Quantification of the fluorescent signals was carried out using the Fiji-ImageJ software.
Sample preparation for Whole Genome Sequencing (WGS).
DNA was extracted from the 22Rv1 and PC-3 CHD1 knock-out isogenic cell lines at a low passage number (22Rv1_1, PC-3_1). Following 45 passages, the CHD1 knock-out isogenic cell lines were single-cell cloned, and two colonies per cell line (22Rv1_2, 22Rv1_3, PC-3_2, PC-3_3) were propagated for DNA isolation.
DNA was extracted using the QIAamp DNA Mini Kit (QIAGEN). Whole genome sequencing of the DNA samples was carried out by Novogene.
Exponentially growing PC-3 cell lines (WT, CHD1 ko1 and CHD1 ko2) and 22Rv1 cell lines (WT and CHD1 ko) were seeded in 96-well plates (1500 PC-3 cells/well and 3000 22Rv1 cells/well) and incubated for 36 h to allow cell attachment. Identical cell numbers of the seeded parallel isogenic lines were verified with the Celigo Imaging Cytometer after attachment. C4-2B, MDA-PCa-2b and DU145 cells were transiently transfected with Ctrl siRNA (5'-CGUACGCGGAAUACUUCGAUUUU-3') and CHD1 siRNA (5'-CACAAGAGCUGGAGGUCUAUU-3') using RNAiMAX (Invitrogen, 13778-150) according to the

Samples that simultaneously harbor mutations in SPOP and a loss in CHD1 tend to have higher markers. P-values were estimated using non-parametric Wilcoxon signed-rank tests. (b) Proportion of cells with intact CHD1 in SPOP mutants and samples identified with CHD1 loss. While the deletion in CHD1 in SPOP mutants is mostly clonal, in samples with a wild-type SPOP background it is mostly subclonal. The color code for points in both panels A and B is illustrated in the bottom right corner of the figure.

Figure 4