The genomic landscape of canine diffuse large B-cell lymphoma identifies distinct subtypes with clinical and therapeutic implications

Diffuse large B-cell lymphoma (DLBCL) is the most common lymphoid neoplasm in dogs and in humans. It is characterized by a remarkable degree of clinical heterogeneity that is not completely elucidated by molecular data. This poses a major barrier to understanding the disease and its response to therapy, or when treating dogs with DLBCL within clinical trials. We performed an integrated analysis of exome (n = 77) and RNA sequencing (n = 43) data in a cohort of canine DLBCL to define the genetic landscape of this tumor. A wide range of signaling pathways and cellular processes were found in common with human DLBCL, but the frequencies of the most recurrently mutated genes (TRAF3, SETD2, POT1, TP53, MYC, FBXW7, DDX3X and TBL1XR1) differed. We developed a prognostic model integrating exonic variants and clinical and transcriptomic features to predict the outcome in dogs with DLBCL. These results comprehensively define the genetic drivers of canine DLBCL and can be prospectively utilized to identify new therapeutic opportunities. Giannuzzi et al. present an integrated analysis of clinical features and exome and RNA sequencing data in a cohort of dogs with diffuse large B-cell lymphoma to better define the genetic landscape of this tumor and identify multiple mutations associated with the outcome.

Lymphoma in domestic dogs is considered a representative and highly predictive spontaneous model for human disease. In particular, the complex genetic interplay, the intact immune system, the environmental exposures and the increasing incidence in this model represent powerful elements for translational studies 1 . Among the many lymphoma subtypes, canine diffuse large B-cell lymphoma (cDLBCL) is the most common, accounting for approximately 50-60% of hematological malignancies in this species 2 .
Current survival rates for cDLBCL after chemotherapy or chemo-immunotherapy are usually disappointing, and dogs show markedly different clinical courses and treatment responses, demonstrating a heterogeneous clinical behavior and a difficulty in anticipating outcome 3 . Proposed cDLBCL prognostic classification systems are based on bone marrow infiltration, substage, mitotic rate and histologic features (centroblastic and immunoblastic) without consideration of the mechanisms underlying tumorigenesis 4,5 . Transcriptomics have shed some light on the pathogenesis of cDLBCL, revealing similarities with its human counterpart, but also important differences that should be considered in veterinary and comparative clinical trials. Compared with normal B cells, cDLBCL present active NF-κB signaling induced by antigen engagement of the B-cell receptor 6 . Additionally, upregulation of several Toll-like receptors suggests a pathogenesis similar to human activated B-cell-like DLBCL (ABC DLBCL), and the activation of immune-related signatures is correlated with an inferior outcome. Indeed, dogs with a shorter overall survival and tumor-free interval show a higher expression of transcripts coding for proteins involved in JAK/STAT signaling, microenvironment, immune system and p53 pathway 7 .
Recent preliminary studies have started to describe the mutational spectrum of canine lymphomas, providing a comprehensive catalog of somatic mutations in coding regions 8 . One study based on whole exome sequencing (WES) investigated canine B-cell lymphomas obtained from three predisposed breeds (Boxer, Golden Retriever and Cocker Spaniel) and found that both TRAF3 and MAP3K14 were frequently mutated 9 . Notably, FBXW7 mutations occurring in a specific codon that is recurrently mutated in several human cancers were identified 9 . Despite the large number of cases included in that study, tumors were not appropriately classified according to World Health Organization criteria and survival data were not reported, limiting the clinical relevance. Also, even if a recent canine pan-cancer study revealed that mutations are preferentially cancer-dependent rather than breed-dependent, some genes might still be breed-specific, partially masking the heterogeneous genetic landscape of canine lymphoma 8 .
In human medicine, the integration of next-generation sequencing technologies in clinical practice holds great promise for personalized medicine, but correlations between genotype and phenotype are critical for the interpretation of these analyses. Veterinary oncology has only recently been modeling the same approach, but the process has been hampered by several difficulties. Firstly, large multiinstitutional molecular studies comprising datasets of fully characterized canine tumors often suffer from a lack of funding. Secondly, understanding the link between molecular aberrations and prognosis is challenging, because treatment and outcome are strongly influenced by the owner. Thirdly, even if genetic alterations are defined, their functional impact and clinical validation are often unknown, thus preventing the identification of new therapeutic targets and prognostic markers.
To address these issues and to prospectively inform clinical trials of cDLBCL, we performed a comprehensive multiomics profiling of de novo diagnosed cDLBCL with the goal of clarifying the genetic changes within this tumor. Genetic data obtained from WES were correlated with clinicopathological features. Finally, an integrated model comprising mutations, copy number aberrations (CNAs) and transcriptome was designed to predict overall survival and tumor-free interval. Further, TP53 mutations were validated in an independent cohort of cDLBCL.

Landscape of somatic mutations in cDLBCL.
On the basis of WES, the median sequencing depth of targeted regions was 265 (range 140-394) for tumors and 246 (range 110-625) for normal samples, with a mapping rate of 99%. Collectively, the total number of short somatic variants identified across tumors ranged from 93 to 2,899, with an average of 282. Of these variants, 10.3-28.7% were annotated as protein-coding variants with an average of 18.4%, including 4.9% insertions and deletions (indels) and 95.1% single-nucleotide variants (SNVs) (Fig. 2). Among the latter, 68.5% were missense (range 47.3-84.3%) (Supplementary Fig. 1). Using the Sorting Intolerant From Tolerant (SIFT) algorithm, 1,866 missense variants were classified as deleterious and 1,220 as tolerated. The full list of the nucleotide variants is reported in Supplementary Data 3.

cDLBCL is characterized by recurrent mutations in specific protein-coding genes.
A total of 2,831 protein-coding genes showed a nonsynonymous somatic variant in at least one tumor for a total of 3,769 protein-coding variants, and 2,368 genes were mutated in only one sample. More importantly, eight genes (TRAF3, SETD2, POT1, TP53, MYC, FBXW7, DDX3X and TBL1XR1) were recurrently mutated in at least 15% of the dogs. The top 43 most frequently mutated genes are shown in Fig. 3. Several candidate cancer genes were previously identified as genetic drivers in canine cancers (TRAF3 (refs. 9,10 ), SETD2 (refs. 11,12 ), POT1 (refs. 9,13 ) and TP53 (ref. 8 )), but others have never been reported in dogs before (H3C8, DIAPH2 and EHD3).
In concordance with previous studies 10 , TRAF3 was the most frequently mutated gene in our cohort (53% of the dogs). A total of 62 mutations were identified, and 16 dogs carried multiple aberrations. Of these mutations, 41.9% and 29% were frameshift and nonsense variants, respectively, while 25.8% were missense variants (Supplementary Table 1). Exon 11 (ENSCAFT00000028719.4) was the most affected with a total of 31 mutations (Fig. 4a). The second most frequently mutated gene was SETD2. We identified 29 somatic mutations in 24 dogs (31%). Nonsense variants were the most frequent (37.9%), followed by frameshift and missense variants (31% each) (Fig. 4b and Supplementary Table 2).
We filtered the protein-coding somatic variants for known cancer genes using the Catalogue of Somatic Mutations in Cancer (COSMIC) 14 dataset (v92, 723 genes), and a total of 172 cancer genes with at least one protein-coding variant were retrieved (Supplementary Data 4). By comparing the genetic alterations in cDLBCLs with human DLBCL as retrieved from COSMIC, we found that 1,902 genetic alterations in driver genes were shared, including MYC and TP53 (Supplementary Data 4).
To explore patterns of co-mutation and mutual exclusivity, we examined pairwise overlaps using Fisher's test and mutual exclusion 15 among the top cancer genes. Significant pairwise interactions (false discovery rate (FDR) 0.1) are depicted in Fig. 5. Notably, PLEC mutations exclusively co-occurred with SETD2, while MAP3K14 mutations were mutually exclusive with TRAF3, apart from one dog.
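The pairwise test described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes that each gene pair is summarized as a 2x2 table of sample counts, computes a two-sided Fisher's exact P value from the hypergeometric distribution, and uses the direction of a continuity-corrected odds ratio (an illustrative choice) to label the trend. The function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: samples with both genes mutated, b: only gene 1 mutated,
    c: only gene 2 mutated, d: neither gene mutated.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x double-mutant samples
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def interaction(a, b, c, d):
    """Label the trend of a gene pair from a continuity-corrected odds ratio."""
    # Adding 0.5 to each cell avoids division by zero (assumption of this sketch)
    odds = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    return "co-occurrence" if odds > 1 else "mutual exclusivity"
```

For a hypothetical pair mutated together in 3 of 6 samples and never separately, `fisher_exact_two_sided(3, 0, 0, 3)` gives P = 0.1 and `interaction(3, 0, 0, 3)` reports co-occurrence. In a real analysis the P values across all gene pairs would still need multiple-testing correction.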
The tumor mutational burden (TMB) ranged from 0.12 to 2.94 (mean 0.32; Supplementary Data 1) and was significantly higher in tumors presenting SETD2 and TP53 mutations compared with wild-type (WT) tumors (P < 0.05), in agreement with what has been reported by others in both canine and human tumors bearing these mutations 8,16,17 . To identify the potential contributions of the mutational processes within tumor exomes, we applied a Bayesian non-negative matrix factorization (NMF) approach. The analysis revealed that the predominant mutational process in all the tumors was signature 1A, which is the product of cytosine deamination at CpG sites due to ageing (Fig. 6).
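Signature extraction starts from a count of substitutions in their trinucleotide context. The Python sketch below illustrates only that counting step, not the factorization itself. It assumes that each mutation is given as a reference trinucleotide plus the alternate base, and it folds purine-centred events onto the opposite strand so that every event is reported from the pyrimidine. All function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def context_key(trinuc, alt):
    """Return a pyrimidine-centred label such as 'A[C>T]G'.

    trinuc: reference trinucleotide centred on the mutated base.
    alt: alternate base on the same strand as trinuc.
    """
    ref = trinuc[1]
    if ref in "AG":
        # Purine reference: describe the event on the opposite strand
        trinuc, alt = revcomp(trinuc), COMPLEMENT[alt]
        ref = trinuc[1]
    return f"{trinuc[0]}[{ref}>{alt}]{trinuc[2]}"


def count_contexts(mutations):
    """Count mutations, given as (trinucleotide, alt) pairs, per context."""
    return Counter(context_key(trinuc, alt) for trinuc, alt in mutations)


def is_cpg_transition(key):
    """True for C>T at a CpG site, the pattern attributed to deamination."""
    return key[2:5] == "C>T" and key[-1] == "G"
```

Six substitution types in 16 flanking contexts give the 96 categories. A G>A change in the context CGT, for example, is counted as `A[C>T]G`, which `is_cpg_transition` flags as a CpG transition.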
To construct a comprehensive view of the common genetic alterations underlying cDLBCLs, we grouped genetic aberrations targeting specific oncogenic signaling pathways, including genes and CNAs occurring in >5% of cases 18 . In addition to activation of the NF-κB signaling pathway (80% of the dogs), we identified chromatin remodeling and histone modifications (73%), and TNF signaling pathways (66%). The cell cycle was also frequently altered (64%), including alterations in genes with known roles in the G1/S checkpoint (amplifications of MYC and mutations of FBXW7) and the G2/M checkpoint (mutations of TP53) (Fig. 7).

CNAs in cDLBCL are associated with clinical outcome.
Regions of somatic CNA were defined using WES segmentation data. The proportion of the tumor genome showing chromosomal aberrations ranged from 0.1% to 19.9% per genome (mean 6.8%), and the median number of CNAs was 35 (range 1-487) (Fig. 8).
To reliably detect recurrent changes in tumors, the Genomic Identification of Significant Targets in Cancer (GISTIC) algorithm was used. A total of 78 significant regions were identified, including 24 gains mainly involving CFA 5, 6, 9, 13, 17, 27 and 31, as well as 54 losses mainly involving CFA 1, 6, 17, 26 and 38 (Supplementary Data 5). In line with previous observations, the most extensive and highly recurrent CNAs were retrieved in CFA 13 overlapping the MYC locus (44% of the dogs) and CFA 31 (30% of the dogs), followed by focal gains in CFA 5 and 27. Losses of CFA 14 were identified in 15 dogs (19%). Significant associations with TTP and LSS were detected for 17 and 20 CNAs, respectively (FDR <0.05) (Supplementary Data 6). In particular, the frequently observed gain of the whole CFA 13 was significantly associated with improved TTP in dogs treated with chemo-immunotherapy (P = 0.027) (Supplementary Fig. 2 and Supplementary Data 7).
The potential associations between CNAs and somatic mutations were investigated. We found that focal losses in CFA 8 and large broad gains in CFA 13 were strongly associated with TRAF3, SETD2 and TP53 alterations (Supplementary Figs. 3, 4 and 9). Gains in CFA 31 were associated with TP53 mutations (Fig. 9) and losses in CFA 14 with POT1 mutations (Supplementary Fig. 5).

Recurrently mutated genes associate with clinicopathological features of cDLBCL.
We assessed whether there were associations between the most recurrent mutations and clinical features (Supplementary Data 8). Within the cohort, dogs with MYC and DDX3X mutations had a higher percentage of peripheral blood infiltration (P = 0.03 and P = 0.021, respectively). Also, DDX3X mutations were significantly associated with bone marrow infiltration (P = 0.028) and a higher LDH level (P = 0.036). In line with these results, all dogs with DDX3X mutations had stage V disease (P = 0.002). When all recurrent mutations were considered collectively, no significant associations were observed by multivariate analysis.

Somatic mutations are associated with specific transcriptional signatures in cDLBCL.
Using data from dogs with both WES and RNA-seq, we evaluated whether the most frequent mutations (TRAF3, SETD2, POT1, TP53, MYC, FBXW7 and DDX3X) and TMB were associated with specific gene expression signatures (Supplementary Table 3). DDX3X mutations were characterized by high expression of transcripts involved in the translation initiation complex and in human Burkitt lymphoma (Supplementary Data 9). MYC mutations were characterized by signatures associated with tumor microenvironment and apoptosis (Supplementary Data 9).
We have previously shown that cDLBCL can be divided into two main clusters, of which the cluster characterized by high expression of T-cell and macrophage markers shows an inferior outcome 7 . When we applied the same approach here, the poor outcome cluster was associated with high TMB (Wilcoxon rank-sum test, P < 0.05) and low frequency of TRAF3mut cases (13% versus 75%).

Confirmation of the negative prognostic value of mutated TP53 in an independent cohort of cDLBCL.
We confirmed TP53 mutations by Sanger sequencing and further examined exons 4-8 in a second group of 56 dogs affected by DLBCL, whose clinicopathological features are reported in Supplementary Data 10. Fifteen dogs harbored mutations in TP53, all of which occurred at different nucleotide positions, except in two dogs, which shared the same mutation. Eleven mutations were novel, while three were already reported in dbSNP albeit with unknown frequency. In three dogs, variants were classified as germline since they were also retrieved in matched normal tissue. Among the somatic mutations, we classified nine missense, one frameshift deletion, one frameshift insertion and one splice-acceptor variant. All missense mutations were predicted as deleterious by SIFT (Supplementary Data 11). Mutations in TP53 were associated with age and significantly enriched in dogs diagnosed with stage IV disease (P = 0.001). The prognostic relevance of TP53 mutations was confirmed (Supplementary Data 12). Indeed, TP53mut dogs had a significantly shorter TTP (P < 0.0001) and LSS (P < 0.0001) compared with TP53WT dogs (Supplementary Fig. 7a,b).

Integration of omics and clinicopathological features predicts survival in cDLBCL.
We developed a multivariate supervised learning approach for defining the association of survival with clinicopathological variables, genetic features and gene expression data. Among the models tested, the Cox model with elastic net regularization outperformed the random forest models and was retained for further steps (Supplementary Fig. 8a,b). The most predictive features were identified using Least Absolute Shrinkage and Selection Operator (LASSO) shrinkage and were combined to generate survival prediction models for LSS and TTP. cDLBCLs from the first cohort of 77 dogs were divided into two subgroups based on their observed risk (long and short survivors) using the median survival time as threshold: 95 and 177 days for TTP and LSS, respectively. Performance was evaluated in cross-validation to avoid overfitting in training and model selection (Methods). The best-performing scores for LSS (area under the receiver operating characteristic curve (AUROC) 0.95) and TTP (AUROC 0.87) were obtained with the following variables: age, bone marrow infiltration (%), treatment (chemotherapy versus chemo-immunotherapy), TP53 genetic status (mut versus WT) and STAP2 and G3BP2 gene expression as logCPM (Supplementary Fig. 8c,d). By excluding STAP2 and G3BP2 expression data, AUROC for LSS and TTP dropped to 0.90 and 0.79, respectively (Supplementary Fig. 8a,b). Furthermore, when TP53 genetic data were removed from the analysis, AUROC for LSS and TTP dropped further to 0.83 and 0.74, respectively (Supplementary Fig. 8a,b). The predictive model was validated using the second cohort of 56 dogs. All the clinicopathological features and the TP53 status were included, and the AUROC was 0.83 and 0.82 for LSS and TTP, respectively. This mild drop was probably due to the single treatment effect, which reduced the ability of the model to discriminate. Indeed, chemo-immunotherapy was the most predictive clinical feature.
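The evaluation metric used above can be sketched in a few lines of Python. This is an illustration rather than the authors' code: it assumes that every dog has an observed survival time (censoring is ignored), labels dogs below the cohort median as short survivors, and computes the AUROC as the probability that a short survivor receives a higher risk score than a long survivor. Function names are hypothetical.

```python
from statistics import median


def short_survivor_labels(survival_days):
    """Return 1 for survival below the cohort median, otherwise 0."""
    cutoff = median(survival_days)
    return [1 if t < cutoff else 0 for t in survival_days]


def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A model whose risk scores rank every short survivor above every long survivor reaches an AUROC of 1.0, while uninformative scores give 0.5.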
To prospectively use these data in clinical practice, we have developed an interactive webtool (http://compbiomed.hpc4ai.unito.it/canine-dlbcl) to predict survival and tumor relapse using clinicopathological data, transcriptomic and genomic features.

Discussion
The application of genetic and transcriptomic analyses has led to an increased understanding of the biology of human DLBCL, paving the way to target-specific therapeutic approaches 19 . Here we integrated clinical features, somatic mutations, CNAs and transcriptome in cDLBCL, providing new data on the genetic hallmarks of this tumor type, identifying multiple mutated genes associated with outcome, and thereby providing potential therapeutic targets. TRAF3, mutated in 53% of the dogs, was the most frequently affected gene in our series. The gene product is part of a complex including TRAF3 itself, TRAF2, BIRC2, BIRC3 and MAP3K14. The disruption of this complex leads to the activation of the noncanonical NF-κB pathway. TRAF3 inactivation has been described in both canine and human lymphoid neoplasms 9,10,20-24 , including in 15% of human DLBCL cases in which it contributes to the activation of the NF-κB signaling cascade 25 . Indeed, in our series, dogs with mutated TRAF3 presented an active NF-κB transcriptome program, and other genes encoding members of the TRAF3 complex were also recurrently mutated, especially MAP3K14. The latter codes for a kinase (also known as NIK, NF-κB-inducing kinase), which phosphorylates NFKB2 (p100), causing its proteasomal processing and the formation of p52-containing NF-κB dimers that translocate into the nucleus to transactivate target genes. MAP3K14 mutations were largely mutually exclusive with TRAF3 mutations, and overall, 80% of cDLBCL presented genetic lesions compatible with NF-κB activation, emphasizing the importance of this pathway in cDLBCL pathogenesis. These data provide potential therapeutic targets 26-28 , but also highlight lesions associated with resistance to Bruton's tyrosine kinase inhibitors 24,29 , which show antitumor activity in many NF-κB-driven lymphomas. Both findings have implications for the management of dogs with DLBCL and for the use of these animals as models for the human disease.

Fig. 5 | Somatic interactions in cDLBCL.
Mutually exclusive and co-occurring genes among the 43 recurrently mutated genes are shown. Trend toward co-occurrence and exclusivity are represented with blue and red, respectively (*P < 0.01, •P < 0.05; Fisher's exact test).
LAB ANIMAL | VOL 51 | JULY 2022 | 191-202 | www.nature.com/laban

Differently from our observations in cDLBCL, SETD2 mutations are present in less than 10% of human DLBCL 34,35 , while they are more common in T-cell lymphomas 31-33 . SETD2 was not the only mutated gene encoding proteins involved in chromatin remodeling and transcription regulation. Over 70% of cDLBCL contained at least one mutated gene of this class, including histone 3 members (H3C8 or H3C12, 10%), KDM6A, SUZ12 (5%), KDM2A, KDM3B and EZH2 (3%) (Supplementary Fig. 9). The mutations in H3C8 and H3C12 occurred in the same hotspot, determining amino acid 27 (or 28 according to COSMIC annotation) conversion from lysine (K) into methionine (M). This mutation has also been observed in human pediatric gliomas, where it defines a specific entity termed diffuse midline glioma, which is an infiltrative midline high-grade glioma with predominantly astrocytic differentiation 36 . The mutation inhibits the activity of the polycomb repressive complex 2 (PRC2), composed of the K27 histone methyltransferase EZH2 (enhancer of zeste homolog 2) and the core accessory proteins EED, SUZ12 and RbAp48 (ref. 37 ) (Supplementary Fig. 9): we also observed recurrent inactivating heterozygous mutations in two of the PRC2 proteins (EZH2, 3%; SUZ12, 5%). As methionine cannot be methylated by EZH2, gliomas bearing the K27M mutation present a global reduction of H3K27me levels, a modification associated with gene silencing, and DNA hypomethylation at many loci 37 . However, perhaps owing to a redistribution of the PRC2 complex, H3K27M mutated gliomas still retain a substantial number of genes with the H3K27me3 mark and are dependent on the remaining PRC2 enzymatic activity 37-39 . The removal of di- and trimethyl groups from H3K27 is done by two histone demethylases, one of which, UTX, is encoded by KDM6A. KDM6A was mutated in 5% of cDLBCL (Supplementary Fig. 9).
UTX forms a complex with H3K4 methyltransferases MLL2 (KMT2D)/MLL3 (KMT2C) 37 . The inactivation of the two methyltransferases is observed in 20-30% of human DLBCL 40,41 , and it determines a diminished global H3K4 methylation with deregulation of CD40, Toll-like and B-cell receptor signaling pathways 42,43 .
In our series, we did not find any mutation in the acetyltransferases CREBBP and EP300, which activate transcription via acetylation of histone H3 lysine 27 (H3K27Ac) and are recurrently inactivated in human lymphomas 40,44 . Besides impairing PRC2 activity, H3K27M might also affect acetyltransferase activity, leading to aberrant gene expression and enhancer dysfunction 45 , and this could at least partially mimic the effects of CREBBP and EP300 mutations.

Fig. 6 | Mutational signature analysis in cDLBCL.
Mutation signature analysis was performed using nonsynonymous and synonymous substitutions for all 77 tumors. A single mutational signature (S1), corresponding to signature 1A, was identified as predominant. The plot shows the distribution of the six types of substitution (C > A, C > G, C > T, T > A, T > C and T > G, in 96 different trinucleotide contexts) defined by the pyrimidine as inferred from the NMF algorithm. Each subgraph within a signature represents one substitution. The bars within each subgraph include the nucleotides on either side of the mutation location in the reference genome. The error bars represent ±standard error (SE) of the coefficients calculated over the replications of the extraction process. Graphics have been created using the signeR package within R software v3.6.3 (www.r-project.org).
Previously reported in a single case of canine B-cell lymphoma 10 , TBL1XR1 was mutated in 17% of our cDLBCLs. The gene is frequently mutated in human ABC DLBCL, specifically in the genetically defined MCD/cluster 5 subtypes, which are characterized by frequent extranodal localization, immune-escape lesions, and an unfavorable clinical outcome 44,47 . TBL1XR1 mutations were also associated with a trend toward a poor prognosis in our series of cDLBCL, but the low frequency of mutations prevented the association from reaching statistical significance. TBL1XR1 is a core component of the SMRT/NCOR1 transcriptional repressor complexes, thus also contributing to transcription regulation (Supplementary Fig. 9).
Similarly to the NF-κB signaling cascade, the high frequency of alterations in genes involved in chromatin regulation and transcription regulation is of dual relevance to veterinary and comparative medicine. Evidence from preclinical studies indicates the potential use of therapeutic agents including EZH2 inhibitors for KDM6A 48 and H3K27M mutants 39 , demethylating agents for KDM6A mutants and HDAC3 inhibitors for TBL1XR1 mutants 49 . The latter 50 might also show activity in H3K27M mutants, owing to the possibly reduced acetyltransferase activity of CREBBP and EP300 on H3K27M 45 . LSD1 inhibitors 51,52 and especially KDM5 inhibitors 53 could work in KDM6A mutants, which might have concomitant deregulated MLL2/MLL3 activity 37 . However, it is important to keep in mind that, although cDLBCL and its human counterpart appear phenotypically similar, subtle genetic differences exist between the two.
In our series, we identified recurrently mutated genes found in WES studies of other canine cancers, including osteosarcoma, melanoma and lymphoma 12,54 . Compared with the previous two studies in canine lymphoma 9,10 , the frequency of mutations for the top four genes was overall higher in the current work. This difference could be attributable to several reasons. First, we included fully characterized cDLBCL, whereas in previous studies, a general diagnosis of B-cell lymphoma was reported and mostly obtained by fine-needle aspiration. Second, in our series, normal DNA was available for every dog and used as a match for variant calling. Finally, from a technical point of view, a wider whole exome enrichment kit was used here.
The somatic mutations observed in our series of cDLBCL appeared almost exclusively compatible with the spontaneous deamination at CpG sites previously associated with aging, which is also the most frequent mechanism reported in human DLBCL 44 . We did not observe signatures compatible with the activity of activation-induced cytidine deaminase, the second most common mechanism leading to mutations in the human counterpart 44,55 . Given that chromosomal translocations involving the immunoglobulin genes, which are frequent in humans, have not so far been described in cDLBCL, our data suggest potential differences between the two species in the effects of activation-induced cytidine deaminase-mediated somatic hypermutation on the transformation process.
Recent advances in chemo-immunotherapy have offered new options for the treatment of canine cancers 56,57 . The addition of an autologous vaccination (APAVAC) to CHOP-based chemotherapy has dramatically prolonged both remission time and overall survival in dogs with DLBCL 3 . The APAVAC vaccine consists of heat shock proteins purified from the dog's tumor, and their presentation to and recognition by the dogs' immune system provides protection 58 . While chemotherapy still represents the cornerstone of lymphoma treatment, it is usually not curative for dogs with DLBCL. Immunotherapy may circumvent the immune evasion caused by cancer heterogeneity by immunizing the host against a large repertoire of individual tumor-associated antigens 59 .
However, not all patients benefit from immunotherapy 3 , and predicting a patient's response would reduce financial costs. Here we identified two groups of animals differing in their clinical outcome. Dogs older than 10 years, having bone marrow involvement, and TP53 mutations had a poor outcome and a small benefit from chemo-immunotherapy. Conversely, the absence of such features identified a cDLBCL subset with good prognosis and an important gain in survival if treated with chemo-immunotherapy.
The negative prognostic impact of TP53 mutations has been reported in human DLBCL 44,47,60 , but it is still unclear in dogs. In our series, TP53 mutations were frequently found by WES, and these findings were validated in a second group of dogs, maintaining a similar frequency and prognostic significance. Loss of TP53 causes disruption of checkpoint responses to DNA damage and contributes to genomic and chromosomal instability. Here most of the mutations affected the p53 DNA-binding domain, and we can thus hypothesize an inactivating effect, even if this would require experimental validation. In addition, the co-occurrence of TP53 and FBXW7 mutations resulted in a worse outcome than TP53 mutation alone. Other than TP53, POT1 mutations, previously reported in cDLBCL, were associated with poor outcome in our dogs. POT1 mutations contribute to cancer development in multiple ways, including human chronic lymphocytic leukemia, where the predominant effects of POT1 loss are increased telomere length, with telomerase activity and genomic instability 61 contributing to tumorigenesis.
In conclusion, our results suggest that clinical trials testing new targeted agents in cDLBCL should be evaluated in the context of clinical features and genetic aberrations, including mutations affecting TP53, noncanonical NF-κB pathway and chromatin remodeling.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41684-022-00998-x.

built for LSS and TTP. Performance of the models was evaluated by eightfold cross-validation with two holdout sets, repeated 20 times on random permutations of the samples. One holdout set was needed to select the optimal model by hyperparameter grid search, and the other was used to evaluate the final performance of the optimal model. Performance was evaluated by the AUROC for classifying dogs with survival time lower than the median survival in the dataset. Given the limited number of samples, the much larger number of available features and the observed poor performances of models fitted using too many features, we decided to eschew automatic feature selection methods. Thus, various datasets with different features were tested to find those that were most predictive and combine them in a single predictive model. Categorical clinical features were one-hot-encoded producing 41 numeric and binary features.
We tested a dataset of 77 samples with 2,832 genetic mutations (binary) and clinical features and a dataset of 43 samples with clinical features and the top 100 most significantly differentially expressed genes between tumors and controls. Features whose regression coefficient was consistently different from zero among the cross-validation fittings of the best performing models were selected and combined in the final dataset.
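One way to picture the resampling scheme described above is the Python sketch below. It is an interpretation, not the authors' implementation: the text does not say how the two holdout folds were paired, so the sketch assumes that in each of the eight rounds one fold is the test set, the following fold is the validation set used for hyperparameter selection, and the remaining six folds are used for training. The function name and the seed are hypothetical.

```python
import random


def two_holdout_splits(n_samples, n_folds=8, n_repeats=20, seed=0):
    """Yield (train, validation, test) index lists for repeated k-fold CV.

    Each repeat shuffles the samples and cuts them into n_folds folds.
    Fold k is the test set, fold k+1 (wrapping around) is the validation
    set, and all other folds form the training set (assumed pairing).
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            v = (k + 1) % n_folds
            train = [i for j, fold in enumerate(folds)
                     if j not in (k, v) for i in fold]
            yield train, folds[v], folds[k]
```

With 77 samples, 8 folds and 20 repeats this yields 160 splits. Keeping the test fold out of hyperparameter selection is what prevents the reported AUROC from being optimistically biased.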
PCR amplification and Sanger sequencing.
Sanger sequencing was performed to validate the protein-coding somatic mutations identified by WES in the TP53 gene. Primer pairs were specifically designed using Primer3 (https://primer3.ut.ee/) and Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/) tools (Supplementary Table 4). PCR was performed in a final volume of 20 µl using HotStarTaq DNA Polymerase kit (QIAGEN, Hilden, Germany) and 50 ng of genomic DNA (gDNA) with the following cycling conditions: initial denaturation at 95 °C for 5 min, 33 cycles at 95 °C for 30 s, at 58 °C for 30 s and at 72 °C for 40 s, with final extension at 72 °C for 10 min. PCR products were purified using the ExoSAP-IT PCR Product Cleanup Reagent (Applied Biosystems). Purified products were sequenced in the forward or reverse direction using the BigDye Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems) following manufacturer's instructions and analyzed on a SeqStudio Genetic Analyzer (Applied Biosystems). Sequencing electropherograms were manually inspected using Chromas v2.6.6 software. Mutations were identified by comparing sequences obtained from each tumor to the canFam3 reference genome and were classified as somatic when absent in matched normal tissue.

Statistical analysis.
To explore the associations between clinicopathological variables and recurrently mutated genes (at least 5% of cases) and identify different classes of clinicopathological variables in mutated or WT individuals, we built classification trees using the recursive partitioning algorithm implemented in the 'party' R package. The correlation analysis between TMB and genes mutated in at least 15% of cases was performed by means of Student's t-test. Bonferroni correction was applied for multiple comparison analyses.
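The Bonferroni correction mentioned above is simple enough to show directly. The minimal Python sketch below assumes that the input is the list of raw P values from one family of comparisons. The function name is hypothetical and the values in the usage note are invented for illustration.

```python
def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For three hypothetical tests with raw P values of 0.01, 0.04 and 0.5, the adjusted values are approximately 0.03, 0.12 and 1.0, so only the first would remain below 0.05.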
Survival analysis was conducted using 'survival' and 'survminer' R packages. The following clinicopathological variables were tested for their influence on TTP and LSS by means of univariate and multivariate Cox proportional hazard model: treatment (chemotherapy versus chemo-immunotherapy), breed (pure versus mixed), sex (female versus male), age (<10 years versus ≥10 years), weight (<10 kg versus ≥10 kg), stage (IV and V), substage (a and b), peripheral blood infiltration (%), bone marrow infiltration (%), presence of bone marrow infiltration (yes versus no), LDH activity (normal versus increased), pretreatment with steroids (yes versus no) and TMB (<0.21 versus >0.28). The influence of the mutational status of the 43 most frequently mutated genes (mut or WT) on both TTP and LSS was also evaluated. Variables with a P value ≤0.200, with the exception of TMB, were included in the multivariate analysis. For categorical variables, Kaplan-Meier (KM) curves were drawn and compared by means of log-rank test.
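The Kaplan-Meier curves referred to above follow the product-limit estimator. The Python sketch below is meant only to make the calculation concrete: the authors used the 'survival' and 'survminer' R packages, and this code does not reproduce their output. It assumes follow-up times in days and an event indicator of 1 for an observed event and 0 for censoring. The function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up times in days.
    events: 1 if the event was observed, 0 if the subject was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: fraction of at-risk subjects surviving time t
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Events and censored subjects both leave the risk set after time t
        at_risk -= removed
    return curve
```

Censored subjects contribute to the risk set up to their last follow-up but never produce a drop in the curve. Comparing two such curves, as in the log-rank test, requires the per-time observed and expected event counts in each group, which this sketch does not compute.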