A Risk Model for Detecting Clinically Significant Prostate Cancer Based on Clinical Parameters and the Prostate Imaging Reporting and Data System using Bi-parametric Magnetic Resonance Imaging in a Japanese Cohort.

Background: Selectively identifying men with clinically significant prostate cancer (sPC) is a pivotal issue. A risk model for detecting sPC based on the Prostate Imaging Reporting and Data System (PI-RADS) for biparametric magnetic resonance imaging (bpMRI) and clinical parameters in a Japanese cohort is expected to be beneficial. Methods: Between January 2011 and December 2016, we retrospectively analyzed clinical parameters and bpMRI findings from 773 biopsy-naïve patients. A risk model was established using multivariate logistic regression analysis and was presented as a nomogram. Discrimination of the risk model was assessed using the area under the receiver operating characteristic curve. Statistical differences between the predictive model and clinical parameters were analyzed using DeLong's test. Results: sPC was detected in 343 men (44.3%). In the multivariate logistic regression analysis to predict sPC, age (P=0.002), log prostate-specific antigen (P<0.001), prostate volume (P<0.001) and PI-RADS scores (P<0.001) contributed significantly to the model. The risk model showed a higher area under the curve (0.862) than age (0.646), log prostate-specific antigen (0.652), prostate volume (0.697) and imaging scores (0.822). DeLong's test also showed that the novel risk model performed significantly better than those parameters (P<0.05).


Introduction
Prostate cancer is the most commonly diagnosed cancer in Japan. The incidence of prostate cancer is rapidly increasing, with over 90,000 males newly diagnosed in 2017. Over 12,000 males died of prostate cancer in 2018, making it the sixth most frequent cause of cancer-related death among men in Japan [1]. Population-based prostate-specific antigen (PSA) screening tests can increase early detection of prostate cancer and thus lead to declines in prostate cancer-related mortality [2]. However, these tests lack specificity, resulting in increased numbers of unnecessary prostate biopsies, which in turn are associated with risks of rectal bleeding and sepsis. The risk of over-treatment leading to adverse impacts on quality of life without improving survival is also a concern. Randomized controlled clinical studies that evaluated the efficacy of prostate cancer screening have highlighted the need to reduce overdiagnosis of clinically insignificant prostate cancer. A new diagnostic pathway is thus needed to selectively identify men with clinically significant prostate cancer (sPC), while reducing the number of unnecessary biopsies and the over-detection and over-treatment of clinically insignificant prostate cancer [2,3].
The use of multi-parametric magnetic resonance imaging (mpMRI) of the prostate, incorporating anatomical and functional imaging (T2-weighted imaging, diffusion-weighted imaging (DWI) and dynamic contrast enhancement (DCE)), has been beneficial for detecting sPC. However, mpMRI has been criticized for its widely variable reported diagnostic performance across different institutions. In 2012, the Prostate Imaging Reporting and Data System (PI-RADS) was introduced to facilitate standardized interpretation of mpMRI findings [4]. In PI-RADS, a score reflecting suspicion for the presence of sPC is assigned on a 1- to 5-point scale based on the mpMRI sequences. PI-RADS has shown high diagnostic accuracy for detecting sPC by means of targeted biopsies [5,6].
The use of clinical data together with mpMRI findings has become important for urologists to better stratify individuals who may warrant prostate biopsy. Multivariable prediction models are superior to conventional decision-making based solely on PSA testing or digital rectal examination (DRE) in predicting the outcome of prostate biopsies. Previous multivariable prediction models for detecting sPC were built from clinical parameters including various combinations of age, PSA, prostate volume (PV), DRE findings and others. MRI findings were also used as a parameter in prediction models, but without a standardized reporting system [7,8]. The usefulness of an individualized risk calculator and a multivariable nomogram including mpMRI data based on PI-RADS scores for detecting sPC has been reported [9][10][11]. Furthermore, bi-parametric MRI (bpMRI) of the prostate, incorporating anatomical and functional imaging (T2-weighted imaging and DWI, without DCE), has maintained high diagnostic accuracy [12,13]. Predictive models based on bpMRI findings and clinical parameters for risk assessment and selection of sPC have also recently been reported [14,15].
However, epidemiologically, the characteristics of prostate cancer exhibit regional and ethnic differences [16]. While a risk calculator and nomogram should ideally be constructed from, and well validated in, the same population, no reports have described a risk calculator and nomogram using PI-RADS scores combined with other clinical parameters from a Japanese-only cohort [6]. The aim of the present study was to develop the first risk model and nomogram using PI-RADS score among Japanese males for detecting sPC and reducing over-detection and over-treatment of clinically insignificant prostate cancer.

Results
In the multivariate logistic regression analysis to predict sPC, age (P = 0.002), logPSA (P < 0.001), PV (P < 0.001) and PI-RADS score (P < 0.001) contributed significantly to the model (Table 2). Multicollinearity between all variables was tested using individual variance inflation factors, and no multicollinearity was found. The nomogram of the risk model and the regression equation are shown in Fig. 1. The novel risk model was internally validated by bootstrapping. Discrimination of the risk model was compared with that of its individual parameters using ROC analyses (Fig. 2, Table 3). The risk model reached a higher AUC (0.862) than age (0.646), PV (0.697), logPSA (0.652) and PI-RADS score (0.822). DeLong's test also showed that the novel risk model performed significantly better than each of those parameters, including PI-RADS score alone (Table 3). Table 4 shows TPR, FPR, PPV and NPV at exemplary probability thresholds of this risk model and at the best PI-RADS score cutoff. At a probability threshold of 10%, the net reduction in biopsies taken based on the risk model was 43.0%, while the rate of missing sPC was 2.3%. Bootstrapped calibration plots of the risk model demonstrated no untoward deviations of predicted risk from observed risk of sPC over the entire range (Fig. 3).
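The threshold-based quantities reported here can be illustrated with a short sketch. It assumes that a patient is referred for biopsy when the predicted probability is at or above the threshold, that the reduction in biopsies is the fraction of all patients falling below the threshold, and that the rate of missed sPC is the fraction of sPC cases falling below the threshold. The text does not spell out these definitions, and the function name and toy data are hypothetical.

```python
def threshold_metrics(pred_probs, outcomes, threshold):
    """Summarize the rule 'biopsy if predicted probability >= threshold'."""
    tp = fp = tn = fn = 0
    for p, y in zip(pred_probs, outcomes):
        if p >= threshold:
            if y == 1:
                tp += 1   # biopsied, sPC present
            else:
                fp += 1   # biopsied, no sPC (unnecessary biopsy)
        elif y == 1:
            fn += 1       # not biopsied, sPC missed
        else:
            tn += 1       # not biopsied, no sPC (biopsy avoided)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN
        return num / den if den else float("nan")

    n = tp + fp + tn + fn
    return {
        "TPR": ratio(tp, tp + fn),
        "FPR": ratio(fp, fp + tn),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "biopsies_avoided": ratio(tn + fn, n),  # assumed definition
        "missed_spc": ratio(fn, tp + fn),       # assumed definition
    }


# Toy data (hypothetical, not from the study)
probs = [0.05, 0.08, 0.07, 0.20, 0.60, 0.90]
spc = [0, 0, 1, 0, 1, 1]
metrics = threshold_metrics(probs, spc, threshold=0.10)
```

If the study defined the missed-sPC rate relative to all patients rather than to sPC cases, only the denominator of `missed_spc` would change.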
In bootstrapped DCA, the risk model showed a higher net benefit in terms of accurately detecting patients with sPC, compared with PI-RADS score and other parameters alone (Fig. 4). The risk model showed a benefit for sPC threshold probabilities larger than 10%.

Discussion
Because of the high diagnostic accuracy for sPC detection, upfront mpMRI has been recommended as a triage test to indicate the need for biopsy among biopsy-naïve men in whom sPC was suspected due to high PSA [17][18][19]. As a result of the high negative predictive value, men with no suspected evidence of sPC on MRI may defer systematic biopsy [20]. Moreover, to improve predictive values, new multivariate risk prediction tools have recently been constructed using the mpMRI suspicion score [9,10,21].
Recently, prostate MRI performed without DCE, termed "bi-parametric MRI" (bpMRI), has shown beneficial results. The effectiveness of bpMRI for detecting sPC in biopsy-naïve patients has been reported. bpMRI also has the advantages of avoiding the adverse events associated with some gadolinium-based contrast agents, a shorter examination time and reduced costs [22]. On the other hand, DCE MRI has been reported to improve the sensitivity of MRI for detecting sPC. Nevertheless, predictive models based on bpMRI findings and clinical parameters for risk assessment and selection of sPC have also recently been reported [14,15,23,24].
In Japanese cohorts, the efficacy of mpMRI and bpMRI as a triage test for detecting sPC has also been reported [25][26][27]. However, no multivariate risk prediction models for detecting sPC based on PI-RADS scores of mpMRI or bpMRI as ordinal variables have previously been reported in Japanese populations.
The characteristics of our novel risk model were as follows. First, in all cases, bpMRI was performed in the pre-biopsy setting, because biopsy artifacts could affect bpMRI findings and this model was constructed to reduce unnecessary biopsies. Second, DRE, a variable used in other nomograms, was not included in this study. Because anterior prostate cancer is less commonly palpable, if DRE is used as a variable in the prediction model, the dataset should ideally be divided into two groups according to whether DRE findings are positive, and a model should be constructed independently for each group [28]. Our dataset was too small to be divided into such groups.
PI-RADS score contributed significantly to the model, as did the other parameters in the multivariate logistic regression analysis. Interestingly, the odds ratio of PI-RADS score 2 compared to score 1 was 0.292 (P=0.098) and that of PI-RADS score 3 compared to score 1 was 2.005 (P=0.332) (Table 2). PI-RADS score 1 and score 2 indicated a normal prostate gland and benign prostate disease (inflammation and/or hyperplasia), respectively. In a proportion of cases with PI-RADS score 2, PSA was elevated because of inflammation and hyperplasia. Therefore, among high-PSA cases, PI-RADS score 1 might carry a higher risk of sPC than PI-RADS score 2 in real clinical practice. Moreover, because of the low number of cases with PI-RADS score 1 (only 11 cases (1.42%)), the odds ratio for PI-RADS score 2 relative to score 1 might not have reached statistical significance.
This also explained why lower PV cases tended to carry a higher risk of sPC. This was presumably because multicollinearity among parameters could not be excluded completely even if multivariate analysis was performed.
A low PI-RADS score carries a 5-10% risk of sPC, allowing biopsy to be potentially avoided [29,30]. ROC analysis revealed that this novel model offered a high AUC (c index=0.862), approximately equivalent to previous reports, although the model lacked external validation and should not be directly compared to other risk models constructed from different regional and ethnic cohorts [9]. The risk model enables avoidance of unnecessary biopsies in more patients without increasing the risk of missing a diagnosis of sPC at an arbitrary probability threshold. More specifically, at probability thresholds of 10% and 20% in this model and with a cut-off PI-RADS score between 2 and 3, the net reductions in biopsies were 43.0%, 57.0% and 57.0%, while the rates of missing sPC were 2.3%, 6.4% and 6.4%, respectively. Using DCA, the present study showed that the risk model using PI-RADS scores improved clinical decisions for biopsy of patients with suspected sPC, as compared with clinical parameter models or PI-RADS score alone. The risk model provided benefits in the decision to biopsy patients for sPC at probability thresholds exceeding 10%. From a practical perspective, at various probability cutoffs, the combined model demonstrated the best performance among all prediction parameters. Although cost-effectiveness remains an issue due to differences in social insurance situations and the high penetration rate of MRI in other countries, a protocol using MRI to determine biopsy indications in cases with high PSA values should be considered.
The present findings should be interpreted in the context of some limitations. First, this was a retrospective analysis, which elevates the risk of selection bias. Second, inter-reader agreement on bpMRI was not evaluated in the present study. Third, low numbers of systematic biopsy cores were collected in our cohort. The number of sPC lesions detected by systematic biopsy was therefore thought to be lower than it would have been with more cores, and a higher core count could have improved model accuracy and internal validation. Last, no external validation was performed. If the excellent results obtained with bpMRI and other clinical parameters at a single institution, as in this study, are not reproduced in other hospitals, broad use of the novel risk model would lead to patient mismanagement in a substantial proportion of cases.
To the best of our knowledge, this represents the first report of a risk calculator and nomogram using PI-RADS version 2 scores of bpMRI among Japanese males for detecting sPC in pre-biopsy settings. Meanwhile, recent risk models have been reported to detect sPC using quantitative mpMRI, which may also help standardize mpMRI and bpMRI interpretation, and image recognition using new statistical tools (machine learning, deep learning and neural network analysis) [31,32]. Risk models using genetic elements and molecular markers rather than image variables are also being reported [33]. Lastly, prospective, multi-center risk models for sPC prediction that include such new biochemical parameters, financial aspects and novel MRI fusion biopsy data are expected to be established in the future.

Study population
Between January 2011 and December 2016, a total of 773 biopsy-naïve patients suspected of having localized prostate cancer based on abnormal PSA levels were analyzed retrospectively at a single institution, Toranomon Hospital, Tokyo, Japan. Indications for biopsy were a high PSA level (≥4.0 ng/ml), abnormal DRE or lesions suspicious for prostate cancer on bpMRI images. Exclusion criteria were previous prostate surgery, previous diagnosis of prostate cancer and administration of a 5-alpha-reductase inhibitor or anti-androgen, as these agents affect PSA values. Full data on PI-RADS scores of bpMRI before prostate biopsy, biopsy outcome, PSA, age and PV were available for all patients. These data were used for risk model development and internal validation. The study was approved by the Toranomon Hospital Ethics Committee (approval no. 1573). All methods were conducted in accordance with the relevant local guidelines and regulations. All patients provided informed consent or were informed through the hospital's web page, which included an opt-out option approved by the Toranomon Hospital Ethics Committee.

Imaging
All bpMRI was performed using a 1.5- or 3.0-T system (Magnetom; Siemens, Erlangen, Germany) with a multichannel body surface coil. The bpMRI protocol included axial, coronal and sagittal turbo spin echo T2-weighted sequences and axial DWI with apparent diffusion coefficient (ADC) calculation (Supplementary Table S1). A 1.5-T system was generally used for the first bpMRI and a 3.0-T system was used for the second and subsequent bpMRI. All image analyses were performed according to PI-RADS version 2.0 on a scale from 1 to 5, with higher numbers indicating a greater likelihood of sPC [34]. Analyses were performed by or under the supervision of a few expert uroradiologists. An overall PI-RADS score was determined for each lesion, which entailed assignment of a separate score for each of the T2-weighted and DWI sequences. PV was calculated on T2-weighted imaging as 0.52 × length × width × height.
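The prostate volume formula can be written as a one-line function. This is a minimal sketch that assumes the three diameters are measured in centimeters, so that the result is in milliliters; the units are not stated in the text, and the function name and example measurements are hypothetical.

```python
def prostate_volume(length_cm, width_cm, height_cm):
    """Ellipsoid approximation of PV: 0.52 x length x width x height."""
    if min(length_cm, width_cm, height_cm) <= 0:
        raise ValueError("all dimensions must be positive")
    return 0.52 * length_cm * width_cm * height_cm


# Hypothetical measurements, for illustration only
volume_ml = prostate_volume(4.0, 5.0, 3.0)
```

The factor 0.52 approximates π/6, the coefficient of the volume of an ellipsoid expressed in terms of its three diameters.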

Biopsy protocol
All patients underwent systematic transperineal and transrectal biopsy (mapping 8-14 cores) of the whole gland in the lithotomy position under local anesthesia, carried out by one of several expert urologists [35]. If one or more lesions suspicious for prostate cancer were detected on bpMRI (suspicious lesions were retrospectively reported as PI-RADS score ≥3), transperineal cognitive targeted biopsies were added for each lesion (2-4 cores per lesion; median, 2 per lesion). Transrectal ultrasonography (ARIETTA; Hitachi Aloka Medical, Wallingford, CT, USA) was used to guide biopsies without MRI fusion software.

Histopathology
Histopathological analyses of biopsies were performed by or under the supervision of a few expert uropathologists specializing in prostate assessment according to International Society of Urological Pathology standards. For all cores, the length of the cancer in millimeters and both primary and secondary Gleason grades were recorded separately. The study defined sPC as grade group ≥3 (Gleason score ≥4 + 3) or a maximum cancer core length ≥6 mm in any location [5].
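The sPC definition can be expressed as a simple per-patient rule. The sketch below is illustrative only: it assumes each cancer-positive core is recorded as a tuple of primary Gleason grade, secondary Gleason grade and cancer length in millimeters, with benign cores omitted, and it maps Gleason patterns to grade groups in the standard way. The function names and data layout are hypothetical.

```python
def grade_group(primary, secondary):
    """Map primary + secondary Gleason grades to a grade group (1-5)."""
    score = primary + secondary
    if score <= 6:
        return 1
    if score == 7:
        return 2 if primary == 3 else 3  # 3+4 -> group 2, 4+3 -> group 3
    if score == 8:
        return 4
    return 5  # Gleason score 9-10


def is_spc(positive_cores):
    """sPC if any core has grade group >= 3 or cancer length >= 6 mm.

    positive_cores: list of (primary, secondary, cancer_length_mm) tuples,
    one per cancer-positive core; an empty list means no cancer was found.
    """
    return any(
        grade_group(primary, secondary) >= 3 or length_mm >= 6.0
        for primary, secondary, length_mm in positive_cores
    )
```

Under this rule a patient with only a 2 mm focus of Gleason 3 + 3 is not classified as sPC, whereas a 6 mm focus of Gleason 3 + 4 is.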

Statistical analysis
Patient demographics, MRI and biopsy results (age, PSA, PV, PI-RADS score 1-5 and presence or absence of sPC) were analyzed descriptively. First, we divided all patients into two groups by pathological outcome: an sPC group and an others group. The others group included patients with clinically insignificant prostate cancer and those with no cancerous tissue. Clinical parameters were compared between groups using the Wilcoxon test and Pearson test. Subsequently, we performed multivariate logistic regression analysis to predict the presence of sPC on biopsy. We calculated odds ratios and used the multivariate logistic regression coefficients to develop a multivariable nomogram that predicts the probability of sPC (a nomogram is a graphical calculating device; in this study, it gives the approximate probability of sPC derived from the logistic function). To relax the assumption of linearity in raw PSA, PSA was log-transformed (logPSA).
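To make the link between regression coefficients and predicted probability concrete, the following sketch evaluates a logistic model for one patient. All coefficient values are placeholders chosen for illustration and are not the fitted values of this study, which are given in Fig. 1. The sketch also assumes a natural logarithm for logPSA and dummy coding of PI-RADS with score 1 as the reference category; both are assumptions, and all names are hypothetical.

```python
import math

# Placeholder values for illustration only; NOT the study's fitted model.
PLACEHOLDER_INTERCEPT = -3.0
PLACEHOLDER_COEFS = {
    "age": 0.03,
    "log_psa": 0.8,
    "pv": -0.03,
    "pirads_2": -1.2,
    "pirads_3": 0.7,
    "pirads_4": 2.0,
    "pirads_5": 3.5,
}


def build_features(age, psa, pv, pirads):
    """Encode predictors; PI-RADS is dummy-coded with score 1 as reference."""
    if pirads not in (1, 2, 3, 4, 5):
        raise ValueError("PI-RADS score must be an integer from 1 to 5")
    features = {"age": age, "log_psa": math.log(psa), "pv": pv}
    for level in (2, 3, 4, 5):
        features[f"pirads_{level}"] = 1.0 if pirads == level else 0.0
    return features


def predict_probability(features, intercept, coefs):
    """Logistic function: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(coefs[name] * x for name, x in features.items())
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical patient: 65 years, PSA 6.0 ng/ml, PV 40, PI-RADS 4
p = predict_probability(
    build_features(65, 6.0, 40.0, 4), PLACEHOLDER_INTERCEPT, PLACEHOLDER_COEFS
)
```

A nomogram performs the same calculation graphically: each predictor contributes points proportional to its coefficient times its value, and the total is mapped to a probability through the logistic function.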
Discrimination of risk models for sPC with or without MRI scoring was compared using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Statistical differences between predictive models were analyzed using DeLong's test.
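As an illustration of what the AUC measures, the sketch below computes it as the probability that a randomly chosen sPC case receives a higher score than a randomly chosen non-sPC case, with ties counted as one half. This is a plain pairwise implementation for clarity, not the pROC routine used in the study, and it does not implement DeLong's test. The function name and toy data are hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): predicted risks and biopsy outcomes
example_auc = auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.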
The extent of over- or underestimation of the predicted rate relative to the observed rate of sPC was explored graphically using calibration plots, which were internally validated using 1000 bootstrap resamples. The calibration intercept indicates whether predictions are systematically too low or too high, and thus should ideally be zero. The calibration slope reflects the average effects of predictors in the model and is estimated in a logistic regression model with the logit of model predictions as the only predictor. For a perfect model, the slope equals 1 [36].
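The calibration intercept and slope described here can be estimated by fitting a two-parameter logistic regression of the outcome on the logit of the predicted probability. The sketch below does this with a few Newton-Raphson steps in plain Python. It is a simplified illustration that fits both parameters jointly, assumes predictions lie strictly between 0 and 1 and that the data are not perfectly separable, and omits the bootstrap resampling used in the study. The function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def calibration_intercept_slope(pred_probs, outcomes, iterations=25):
    """Fit outcome ~ a + b * logit(prediction) by Newton-Raphson."""
    x = [logit(p) for p in pred_probs]
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(iterations):
        mu = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        w = [m * (1.0 - m) for m in mu]
        # Gradient of the log-likelihood with respect to (a, b)
        g0 = sum(y - m for y, m in zip(outcomes, mu))
        g1 = sum((y - m) * xi for y, m, xi in zip(outcomes, mu, x))
        # Information matrix entries (X' W X)
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system in closed form
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b
```

A slope below 1 indicates predictions that are too extreme (overfitting), while a non-zero intercept indicates systematic over- or underestimation of risk.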
Last, we assessed the clinical usefulness of the risk model using decision curve analysis (DCA) based on 1000-times repeated bootstrapped validation. These analyses estimate a 'net benefit' for prediction models by totaling the benefits (true-positive biopsies) and subtracting the harms (false-positive biopsies) [37]. The harms are weighted by the relative harm of a missed sPC compared to an unnecessary biopsy. This weight is derived from the threshold probability of sPC at which a patient would opt for biopsy. This threshold can vary from patient to patient in clinical settings. The reduction in the number of biopsies at different probability thresholds was further assessed and related to the number and percentage of sPC detected. The interpretation of the decision curve is that the model with the highest net benefit at a particular threshold probability is the most useful model in terms of the balance of risk and benefit. To quantify the potential reduction of unnecessary biopsies and potential over-diagnosis, we calculated the true-positive rate (TPR), false-positive rate (FPR), positive predictive value (PPV) and negative predictive value (NPV) at exemplary probability thresholds.
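The net benefit described here is commonly written as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability and n the number of patients. The sketch below computes it for a model and for the "biopsy all" strategy, with "biopsy none" fixed at zero. It assumes biopsy is chosen when the predicted probability is at or above the threshold and omits the bootstrapping performed in the study. The function names and toy data are hypothetical, and this is not the rmda implementation.

```python
def net_benefit(pred_probs, outcomes, threshold):
    """Net benefit of 'biopsy if predicted probability >= threshold'."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(pred_probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(pred_probs, outcomes) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of one false positive
    return tp / n - (fp / n) * weight


def net_benefit_biopsy_all(outcomes, threshold):
    """Net benefit when every patient is biopsied."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical, not from the study)
probs = [0.05, 0.20, 0.60, 0.90]
spc = [0, 0, 1, 1]
nb_model = net_benefit(probs, spc, threshold=0.10)
nb_all = net_benefit_biopsy_all(spc, threshold=0.10)
```

Plotting these quantities over a range of thresholds gives the decision curve; a model is useful at a given threshold when its net benefit exceeds both the "biopsy all" value and zero.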
All tests performed were two-sided and a P value of less than 0.05 was considered to indicate statistical significance. Statistical analyses were performed using R version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria). ROC analysis and DCA were performed using the pROC package and rmda package, respectively. Reporting followed the Standards of Reporting of Diagnostic Accuracy (Supplementary Table S2).

Title: Net DCA demonstrating the benefit for predicting sPC on biopsy
Legend: The turquoise line is the net benefit of providing all patients with biopsy, and the horizontal black line is the net benefit of providing no patients with biopsy. The net benefit provided by each prediction tool is given.
