What factors are associated with the prognosis of primary testicular diffuse large B-cell lymphoma? A study based on the SEER database

Primary testicular diffuse large B-cell lymphoma (PT-DLBCL) is a relatively rare urological tumor with a high degree of malignancy and a poor prognosis. This study aimed to identify prognostic risk factors for survival in patients with PT-DLBCL, to construct a predictive model, and to verify its reliability. First, we selected patients from the SEER database (2000–2018) and analyzed their survival with the Kaplan–Meier method. Then, we analyzed prognostic factors by Cox regression. Finally, the data from the training cohort were used to construct a prediction model, which was presented as a nomogram. We evaluated the nomogram using the concordance index (C-index), decision curve analysis (DCA), and the area under the receiver operating characteristic (ROC) curve. In addition, calibration curves were plotted to assess the agreement between the survival predicted by the nomogram and the observed survival. Univariate and multivariate analyses identified five independent prognostic factors for overall survival (OS) and cancer-specific survival (CSS) in patients with PT-DLBCL: age, laterality, Ann Arbor stage, chemotherapy, and radiotherapy. Based on these factors, we constructed prognostic nomograms and found that age contributed the most to the survival of patients with PT-DLBCL. The C-indexes of the nomograms for OS and CSS were 0.758 (0.716–0.799) and 0.763 (0.714–0.812) in the training cohort, and 0.756 (0.697–0.815) and 0.748 (0.679–0.817) in the validation cohort. We produced the first nomogram for PT-DLBCL, and it can be used to estimate the CSS and OS of patients and thereby assess their prognosis.


Introduction
Primary testicular lymphoma (PTL) is a rare and aggressive extranodal lymphoma (Berjaoui et al. 2023; Guo et al. 2022; Twa et al. 2021) that mainly affects men older than 60 years and has an incidence of 0.26 cases per 100,000 people (Sun et al. 2022). It accounts for 1-2% of malignant lymphomas and 1-5% of primary testicular tumors (Chen et al. 2020; Wang et al. 2020a, b). Primary testicular diffuse large B-cell lymphoma (PT-DLBCL) accounts for about 80-98% of PTL. The typical clinical manifestation of PT-DLBCL is a solid, painless testicular mass that shows no preference for either side and cannot be separated from the affected testis (Shen et al. 2022; Cheah et al. 2014; Horne and Adeniran 2011; Leivonen et al. 2019). The median size of PT-DLBCL tumors is 6 cm; 40% of cases are accompanied by hydrocele, and 6-10% of cases have simultaneous bilateral involvement (Cheah et al. 2014; Gundrum et al. 2009). PTL is a highly aggressive malignant tumor that tends to involve the contralateral testis and the central nervous system (CNS) and to spread to other extranodal sites, such as the skin, lungs, kidneys, adrenal glands, gastrointestinal tract, and other soft tissues (Mazloom et al. 2010; Kridel et al. 2017).
To date, no predictive model for PT-DLBCL has been developed, so it is not possible to predict survival in patients with PT-DLBCL. In recent years, with the continuous development of information technology, the nomogram has been widely used as a prognostic tool in oncology and has been accepted by most scholars (Bianco 2006).

The SEER database is a collection of information on the morbidity, treatment, prognosis, and survival of millions of patients with malignancies in selected states and counties in the United States (Clegg et al. 2009; Liao et al. 2020). Therefore, we used the SEER database to identify patients with PT-DLBCL, determined the risk factors related to OS and CSS in these patients, and established a nomogram.

Data source and extraction
The SEER database of the National Cancer Institute (http://seer.cancer.gov/seerstat/) is a free cancer database. It covers approximately 30% of the total US population and contains clinical information on millions of patients with malignancies in selected US states and counties. This study included data from the SEER database from 2000 to 2018 [Incidence-SEER Research Plus Limited-Field Data, 21 Registries, Nov 2020 Sub (2000-2018)]. We used SEER*Stat 8.3.9.2 (account ID: 18244-Nov2020) to extract, download, and analyze the data. We selected cases according to the International Classification of Diseases for Oncology, third edition (ICD-O-3), using site code C62.9 (testis) and histological code 9680/3 (diffuse large B-cell lymphoma, DLBCL).
We searched the SEER database for patients diagnosed with PT-DLBCL from 2000 to 2018. The exclusion criteria were as follows: (I) multiple primary tumors; (II) incomplete follow-up data; and (III) missing clinicopathological information (Ann Arbor stage, laterality, or surgical information). The screening process is shown in Fig. 1. The variables collected in our study included the year of diagnosis, age at diagnosis, race, marital status, laterality, Ann Arbor stage, surgery of the primary lesion, radiotherapy, chemotherapy, cause of death, and follow-up information. The outcomes of the study were OS and CSS. We defined OS as the time from diagnosis to death from any cause and CSS as the time from diagnosis to death from PT-DLBCL. In total, we included 355 patients. We used the initial operation for PT-DLBCL as the starting point for follow-up, and death or the end of follow-up on December 31, 2018, as the endpoint.
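The selection rules and endpoint definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`site`, `histology`, `n_primaries`, `months`, `status`, `cause`) and their codings are hypothetical and do not correspond to actual SEER*Stat export columns, and the four example cases are made up.

```python
def eligible(case):
    """Apply the inclusion codes and exclusion criteria described in the text."""
    required = ("stage", "laterality", "surgery")
    return (
        case["site"] == "C62.9"                # ICD-O-3 site: testis
        and case["histology"] == "9680/3"      # ICD-O-3 histology: DLBCL
        and case["n_primaries"] == 1           # (I) exclude multiple primary tumors
        and case["months"] is not None         # (II) exclude incomplete follow-up
        and all(case[k] is not None for k in required)  # (III) missing information
    )

def endpoints(case):
    """Return (follow-up months, OS event, CSS event) for one case."""
    dead = case["status"] == "dead"
    os_event = int(dead)                                   # death from any cause
    css_event = int(dead and case["cause"] == "PT-DLBCL")  # other deaths are censored
    return case["months"], os_event, css_event

# Hypothetical example cases (not SEER data).
base = {"site": "C62.9", "histology": "9680/3", "n_primaries": 1,
        "stage": "I", "laterality": "left", "surgery": "yes"}
cases = [
    {**base, "months": 24, "status": "dead", "cause": "PT-DLBCL"},
    {**base, "months": 60, "status": "dead", "cause": "other"},
    {**base, "months": 120, "status": "alive", "cause": None},
    {**base, "months": 36, "status": "alive", "cause": None, "stage": None},
]
rows = [endpoints(c) for c in cases if eligible(c)]
print(rows)
```

The fourth case is dropped because its stage is missing, and the second case counts as an event for OS but is censored for CSS.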

Statistical methods
We used X-tile software (Yale University, New Haven, CT, USA) (Camp et al. 2004) to determine the optimal age cut-off points and to classify patients into three age groups (< 56 years, 56-74 years, and > 74 years) (Supplement Fig. 1).
First, we randomly divided all the included patients (355) into a training cohort (235) and a validation cohort (120) at a ratio of 2:1. Next, we compared the baseline clinical characteristics of the two cohorts with the Chi-square test. The nomogram was built on the clinicopathological characteristics of the training cohort. We used Kaplan-Meier curves and log-rank tests to analyze patients' OS and CSS. We first conducted univariate analyses of the included factors and then entered the factors associated with survival in patients with PT-DLBCL (p < 0.1) into the final multivariate analysis. Factors that remained significant in the multivariate analysis (p < 0.05) were used to construct the nomograms for OS and CSS at 3, 5, 10 and 15 years. We performed internal validation of the nomogram in the training cohort (235) and external validation in the validation cohort (120). Discrimination was measured with the concordance index (C-index): a C-index of 0.5 indicates that the model has no predictive power, and values from 0.5 to 1.0 indicate increasing predictive power. We also drew receiver-operating characteristic (ROC) curves to assess the prediction of survival outcomes and performed decision curve analysis (DCA) to measure the practical value of the nomograms in the clinical setting. We used SPSS Statistics 26.0 (SPSS Inc., Chicago, IL, USA), GraphPad Prism 8.0, and R 4.2.2 (R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org/) for all statistical analyses. A P value < 0.05 was considered statistically significant.
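The random division into training and validation cohorts can be illustrated with a short Python sketch. It is not the authors' procedure (the paper reports using SPSS and R); the function name `random_split`, the seed value, and the use of integers as patient identifiers are assumptions made for the example. The cohort sizes 235 and 120 are taken from the text.

```python
import random

def random_split(ids, n_validation, seed=2023):
    """Shuffle patient ids reproducibly and hold out n_validation of them."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    validation = shuffled[:n_validation]
    training = shuffled[n_validation:]
    return training, validation

# 355 included patients, 120 of them assigned to the validation cohort.
train_ids, valid_ids = random_split(range(355), n_validation=120)
print(len(train_ids), len(valid_ids))
```

Fixing the seed is a design choice that lets the same split be regenerated when the analysis is repeated.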

Basic characteristics of patients
We included a total of 355 eligible patients with PT-DLBCL from the SEER database (Table 1). We then randomly divided these patients into a training cohort (n = 235) and a validation cohort (n = 120) at a ratio of 2:1. Of these, 23.7% of patients (84) were younger than 56 years, 45.4% (161) were aged 56-74 years, and 31.0% (110) were older than 74 years. A total of 235 patients (66.2%) were married. Most patients were white (85.9%). The tumor was right-sided in 165 patients (46.5%), left-sided in 161 (45.4%), and bilateral in 29 (8.1%). According to Ann Arbor staging, most patients were diagnosed as stage I (200; 56.3%), 17.2% of patients were diagnosed as stage II, only 5.9% of patients

Identification of independent prognostic risk factors for OS and CSS in the training cohort
We performed univariate and multivariate Cox regression analyses of the included factors. The univariate results showed that age, laterality, Ann Arbor stage, surgery, chemotherapy, and radiotherapy were potential prognostic risk factors for OS and CSS in patients with PT-DLBCL. The multivariate Cox analysis showed that age, laterality, Ann Arbor stage, chemotherapy, and radiotherapy were independent prognostic factors for OS and CSS in patients with PT-DLBCL (Table 2).
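To make the univariate step concrete, the following Python sketch fits a Cox model with a single covariate by Newton-Raphson on the partial likelihood, using the Breslow approach to ties. It is a simplified illustration, not the software used in the study (SPSS and R), and it does not cover the multivariate model. The function name `cox_univariate` and the toy data are assumptions.

```python
import math

def cox_univariate(times, events, x, n_iter=25):
    """Estimate the log hazard ratio (beta) of one covariate in a Cox model."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for i, (t_i, d_i) in enumerate(zip(times, events)):
            if not d_i:
                continue  # censored patients contribute only through risk sets
            # Risk set: everyone still under observation at time t_i.
            risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            score += x[i] - s1 / s0               # first derivative of log-likelihood
            info += s2 / s0 - (s1 / s0) ** 2      # negative second derivative
        if info == 0:
            break
        beta += score / info                      # Newton-Raphson update
    return beta

# Toy data (made up): x = 1 marks a hypothetical adverse factor.
times = [2, 4, 5, 7, 9, 12, 15, 20]
events = [1, 1, 1, 1, 0, 1, 1, 0]
x = [1, 1, 0, 1, 0, 0, 1, 0]
beta = cox_univariate(times, events, x)
hazard_ratio = math.exp(beta)
print(round(beta, 3), round(hazard_ratio, 3))
```

A positive beta (hazard ratio above 1) means the factor is associated with a higher risk of the event; in the study, factors with p < 0.1 at this stage were carried into the multivariate model.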

Nomogram construction for OS and CSS
We developed prognostic nomograms based on the multivariate Cox regression results. The independent prognostic factors included in the nomograms were age, laterality, Ann Arbor stage, chemotherapy, and radiotherapy. The nomogram for OS showed that age contributed the most to the prognosis of PT-DLBCL patients, followed by Ann Arbor stage. Laterality, radiotherapy, and chemotherapy also had a certain impact on OS (Fig. 2A). The nomogram for CSS showed that age was again the factor that contributed most to the prognosis of patients with PT-DLBCL, followed by Ann Arbor stage. Laterality, radiotherapy, and chemotherapy also had a certain degree of influence on CSS (Fig. 2B). The detailed scores of the prognostic factors in the OS and CSS nomograms are shown in Supplement Table 1.
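A common way to turn Cox coefficients into nomogram points is to scale each variable so that the variable with the widest effect range spans 0 to 100 points. The Python sketch below illustrates that scaling under stated assumptions: the coefficients are invented for the example and are not the values from Table 2 or Supplement Table 1, and the function names are hypothetical.

```python
def nomogram_points(coefs):
    """Convert log hazard ratios into 0-100 nomogram points.

    coefs maps each variable to {level: log hazard ratio}, with the
    reference level given as 0.0.
    """
    ranges = {var: max(levels.values()) - min(levels.values())
              for var, levels in coefs.items()}
    widest = max(ranges.values())  # this variable receives the full 100 points
    points = {}
    for var, levels in coefs.items():
        low = min(levels.values())
        points[var] = {lvl: round(100 * (b - low) / widest, 1)
                       for lvl, b in levels.items()}
    return points

def total_points(points, patient):
    """Sum the points of one patient's levels."""
    return sum(points[var][level] for var, level in patient.items())

# Illustrative coefficients only (NOT the study's estimates).
coefs = {
    "age": {"<56": 0.0, "56-74": 0.8, ">74": 2.0},
    "stage": {"I": 0.0, "II": 0.5, "IV": 1.2},
    "chemotherapy": {"yes": 0.0, "no": 1.0},
}
points = nomogram_points(coefs)
score = total_points(points, {"age": ">74", "stage": "II", "chemotherapy": "yes"})
print(points["age"], score)
```

Under this scaling, the variable whose levels span the largest coefficient range has the longest axis on the nomogram, which is the sense in which a factor such as age "contributes the most".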

Nomogram validation/performance of nomograms for OS and CSS
In the training cohort, the C-indexes for OS and CSS were 0.758 [95% confidence interval (CI) 0.716-0.799] and 0.763 (0.714-0.812), respectively, both of which were greater than the C-indexes of the individual variables age, laterality, Ann Arbor stage, chemotherapy, and radiotherapy. In the validation cohort, the C-indexes for OS and CSS were 0.756 (0.697-0.815) and 0.748 (0.679-0.817), again superior to those of age, laterality, Ann Arbor stage, chemotherapy, and radiotherapy alone. In addition, we calculated the C-indexes for OS and CSS in the total data set, which were 0.752 (95% CI 0.719-0.785) for OS and 0.752 (0.710-0.793) for CSS (Supplement Table 2). We compared the AUC values of the OS nomogram with those of age, laterality, Ann Arbor stage, chemotherapy, and radiotherapy at 3, 5, 10 and 15 years in the training cohort. The results showed that, for OS in the training cohort, the AUC values for the training cohort [...] Table 3). We also calculated the AUC values for OS and CSS in the validation cohort at 3, 5, 10 and 15 years. The results showed that the AUC values of the nomogram were greater than the AUC values of age, laterality, Ann Arbor stage, radiotherapy, and chemotherapy (Fig. 3C, Fig. 3D, and Supplement Table 3).
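For readers unfamiliar with the C-index, the Python sketch below computes Harrell's concordance index on toy data. It illustrates the definition only and is not the routine used in the study; the function name and example values are assumptions. A pair of patients is usable when the one with the shorter time had an observed event, and it is concordant when that patient also has the higher predicted risk.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs ranked correctly (ties count 0.5)."""
    concordant, tied, usable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Usable pair: patient i had an observed event before patient j's time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / usable

# Toy data (made up): higher score should mean shorter survival.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
perfect = harrell_c_index(times, events, [4, 3, 2, 1])
reversed_order = harrell_c_index(times, events, [1, 2, 3, 4])
uninformative = harrell_c_index(times, events, [1, 1, 1, 1])
print(perfect, reversed_order, uninformative)
```

A score that carries no information gives 0.5, which matches the interpretation of the C-index given in the statistical methods.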
We divided all patients into two groups according to their total nomogram score and found that patients with high scores had a significantly worse prognosis than those with low scores (Fig. 4). In both the training cohort and the validation cohort, the calibration curves showed that the predicted probabilities agreed closely with the observed probabilities, indicating that the nomogram predictions were consistent with the actual outcomes (Fig. 5). We also performed DCA of the nomogram and the individual risk factors. The results confirmed the effectiveness of the nomogram prediction model (Fig. 6).
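Decision curve analysis compares strategies by their net benefit at a chosen risk threshold. The Python sketch below shows the standard net benefit formula for a binary outcome at a fixed time horizon. It is a simplification: it ignores censoring, which a survival-based DCA has to handle, and the function name, predicted risks, and outcomes are made up for the example.

```python
def net_benefit(predicted_risk, died, threshold):
    """Net benefit = TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(died)
    flagged = [(p >= threshold, d) for p, d in zip(predicted_risk, died)]
    tp = sum(1 for high, d in flagged if high and d == 1)   # correctly flagged
    fp = sum(1 for high, d in flagged if high and d == 0)   # flagged unnecessarily
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy data (made up): predicted risk of death by the horizon and actual outcome.
risk = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
died = [1, 1, 0, 0, 0, 1]
nb_model = net_benefit(risk, died, threshold=0.5)
nb_treat_all = net_benefit([1.0] * len(died), died, threshold=0.5)
print(round(nb_model, 4), nb_treat_all)
```

A model is useful at a given threshold when its net benefit exceeds both the "treat all" strategy and the "treat none" strategy, whose net benefit is zero by definition.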

Discussion
Because the disease is rare, large clinical case series are lacking (Berjaoui et al. 2023; Pollari et al. 2021), and the SEER database provides a large amount of useful clinical data for the study of rare tumor types. In our study, we extracted a total of 355 PT-DLBCL patients with complete information from the SEER database. By univariate and multivariate analysis of these data, we identified five independent prognostic risk factors associated with OS and CSS. A SEER database-based study of 1169 cases of primary testicular lymphoma showed that age, year of diagnosis, specific NHL subtype, radiotherapy, and Ann Arbor stage were independent prognostic factors for PTL (Xu and Yao 2019). In this study, there were 355 patients with PT-DLBCL, of whom 142 (40%) died of PT-DLBCL and 30 (14.1%) died of other causes during follow-up. Survival curves showed 5-year OS and CSS rates of 60.8% and 66.7%, respectively.
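Survival rates such as the 5-year OS are read from a Kaplan-Meier curve. The Python sketch below shows the product-limit estimator on toy data to illustrate how such a rate is obtained. The function names and the follow-up times are assumptions, and the numbers are unrelated to the study's results.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    curve, surv = [], 1.0
    for t in sorted(set(t for t, e in zip(times, events) if e == 1)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1 - deaths / at_risk      # product-limit step
        curve.append((t, surv))
    return curve

def survival_at(curve, horizon):
    """Survival probability at a given horizon (step function, starts at 1)."""
    surv = 1.0
    for t, s in curve:
        if t <= horizon:
            surv = s
    return surv

# Toy follow-up in months (made up); event 1 = death, 0 = censored.
months = [12, 30, 30, 48, 60, 72, 90, 100]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(months, events)
five_year = survival_at(curve, 60)
print(round(five_year, 3))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from the simple proportion of patients alive at 60 months.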
Standard treatment for PTL includes orchiectomy, chemotherapy with rituximab, radiotherapy to the contralateral testis, and CNS prophylaxis (Mazloom et al. 2010; Zelenetz et al. 2019; Caumont et al. 2020). The R-CHOP chemotherapy regimen (rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone) is the standard of care for aggressive PT-DLBCL. In the prospective IELSG10 trial by Vitolo et al., rituximab was added to CHOP every 21 days (R-CHOP21) or dose-dense CHOP every 14 days (CHOP14). The investigators also administered intrathecal methotrexate (IT-MTX) and radiotherapy (RT) to the contralateral testis to prevent testicular recurrence. Follow-up results showed a 5-year PFS, OS, and cumulative incidence of progression of 74%, 85%, and 18%, respectively. The trial also demonstrated that the protocol achieves effective systemic control in patients with PTL. Most importantly, this regimen avoided recurrence in the contralateral testis (Vitolo et al. 2011). In the clinical trial by Poeschel et al., four cycles of R-CHOP in patients with aggressive PT-DLBCL were found to be no less effective than six cycles, with a relative reduction in toxic effects (Poeschel et al. 2019). In a recent genetic analysis by Nayak et al., frequent 9p24.1/PD-L1/PD-L2 copy number alterations and increased expression of PD-1 ligands were found in PTL patients. At 13+ to 17+ months after PD-1 blockade with nivolumab, the patients' disease had not progressed (Nayak et al. 2017).

Fig. 3 The area under the receiver-operating characteristic (ROC) curve (AUC) was used to evaluate the predictive capability of the model for 5-year OS and CSS in the training cohort (A and B) and validation cohort (C and D)

By combining these independent prognostic factors, this study constructed a reliable prognostic risk assessment model for PT-DLBCL patients. For each patient, an overall score can be read from the nomogram and used to estimate future survival. For patients with high scores, doctors can intervene earlier and individualize treatment to improve prognosis. We also established the reliability and accuracy of the model through multiple validations, which supports its use as a clinical guide.
Our study still has some limitations. First, as with other predictive models based on the SEER database, this is a retrospective cohort study, so some data are inevitably missing, which reduces the sample size. Prospective studies are therefore needed to confirm our results. Second, a number of factors that may affect the prognosis of patients are not available in the SEER database, such as the IPI score, the specific chemotherapy regimen, and B symptoms. Third, the data for this study are from the USA. This study would be more meaningful if we could further verify the results with data from China. We enrolled 20 patients from our own institution, but the sample size was too small for validation.
However, the C-index and AUC showed that our nomograms had a clear advantage over age, laterality, Ann Arbor stage, radiotherapy, and chemotherapy alone, and the calibration curves showed that the nomograms had high accuracy. DCA at 3, 5, 10 and 15 years showed that the nomograms had better clinical practicability.

Conclusion
The results of this study showed that age, laterality, Ann Arbor stage, chemotherapy, and radiotherapy were independent prognostic risk factors for patients with PT-DLBCL. We used the SEER database to construct a reliable prognostic nomogram model for the assessment of their survival with strong clinical guidance. For patients with PT-DLBCL with high scores, clinicians can make earlier interventions and implement individualized treatment to improve the prognosis of patients. We also confirmed the reliability and accuracy of this model through various validations.

Acknowledgements
Thanks to all those who contributed to this research. The SEER database provides us with a free public resource.

Author contributions
Conception and design were done by PS and SY. Analysis and interpretation of data were done by SY. Drafting of the manuscript and statistical analysis were done by SY and BZ. Critical revision of the manuscript for important intellectual content was done by PS and WC.

Data availability
The data that support the findings of this study are available in the SEER database. These data were derived from the following resources available in the public domain: https://seer.cancer.gov/seerstat/.

Conflict of interest
All authors involved in the writing of the article declare that there is no conflict of interest.

Ethical approval
We selected the data from the SEER database, which is a public database and does not require ethical approval. The data are available from http://seer.cancer.gov/seerstat/.