Selectively identifying men with clinically significant prostate cancer (sPC) is a pivotal issue. A risk model for detecting sPC based on the Prostate Imaging Reporting and Data System (PI-RADS) for bi-parametric magnetic resonance imaging (bpMRI) and clinical parameters in a Japanese cohort is expected to be beneficial.
Between January 2011 and December 2016, we retrospectively analyzed clinical parameters and bpMRI findings from 773 biopsy-naïve patients. A risk model was established using multivariate logistic regression analysis and was presented on a nomogram. Discrimination of the risk model was compared using the area under the receiver operating characteristic curve. Statistical differences between the predictive model and clinical parameters were analyzed using DeLong’s test.
sPC was detected in 343 men (44.3%). In the multivariate logistic regression analysis to predict sPC, age (P=0.002), log prostate-specific antigen (P<0.001), prostate volume (P<0.001) and PI-RADS scores (P<0.001) contributed significantly to the model. The risk model showed a higher area under the curve (0.862) than age (0.646), log prostate-specific antigen (0.652), prostate volume (0.697) and imaging scores (0.822). DeLong’s test also showed that the novel risk model performed significantly better than each of those parameters alone (P<0.05).
This novel risk model performed significantly better than PI-RADS scores and other parameters alone, and is thus expected to support decisions on whether to biopsy when sPC is suspected.
Prostate cancer is the most commonly diagnosed cancer in Japan. The incidence of prostate cancer is rapidly increasing, with over 90,000 males newly diagnosed in 2017. Over 12,000 males died of prostate cancer in 2018, making it the sixth most frequent cause of cancer-related death among men in Japan. Population-based prostate-specific antigen (PSA) screening tests can increase early detection of prostate cancer and thus lead to declines in prostate cancer-related mortality. However, these tests lack specificity, resulting in increased numbers of unnecessary prostate biopsies, which in turn are associated with risks of rectal bleeding and sepsis. The risk of over-treatment leading to adverse impacts on quality of life without improving survival is also a concern. Randomized controlled clinical studies that evaluated the efficacy of prostate cancer screening have highlighted the need to reduce over-diagnosis of clinically insignificant prostate cancer. A new diagnostic pathway is thus needed to selectively identify men with clinically significant prostate cancer (sPC), while reducing the number of unnecessary biopsies and the over-detection and over-treatment of clinically insignificant prostate cancer [2, 3].
The use of multi-parametric magnetic resonance imaging (mpMRI) of the prostate, incorporating anatomical and functional imaging (T2-weighted imaging, diffusion-weighted imaging (DWI) and dynamic contrast enhancement (DCE)), has been beneficial for detecting sPC. However, mpMRI has been criticized for the wide variability in its reported diagnostic performance across institutions. In 2012, the Prostate Imaging Reporting and Data System (PI-RADS) was introduced to facilitate standardized interpretation of mpMRI findings. In PI-RADS, a score reflecting suspicion of sPC is assigned on a 1- to 5-point scale based on the mpMRI sequences. PI-RADS has shown high diagnostic accuracy for detecting sPC by means of targeted biopsies [5, 6].
The use of clinical data together with mpMRI findings has become important for urologists to better stratify individuals who may warrant prostate biopsy. Multivariable prediction models are superior to conventional decision-making based solely on PSA testing or digital rectal examination (DRE) in predicting the outcome of prostate biopsies. Previous multivariable prediction models for detecting sPC were built from clinical parameters including various combinations of age, PSA, prostate volume (PV), DRE findings and others. MRI findings were also used as a parameter in prediction models, but without a standardized reporting system [7, 8]. The usefulness of an individualized risk calculator and a multivariable nomogram including mpMRI data in the form of PI-RADS scores for detecting sPC has been reported [9–11]. Furthermore, bi-parametric MRI (bpMRI) of the prostate, which incorporates anatomical and functional imaging (T2-weighted imaging and DWI) but omits DCE, has maintained high diagnostic accuracy [12, 13]. Predictive models based on bpMRI findings and clinical parameters for risk assessment and selection of sPC have also recently been reported [14, 15].
However, epidemiologically, the characteristics of prostate cancer exhibit regional and ethnic differences. While a risk calculator and nomogram should ideally be built from, and validated in, the cohort in which they are applied, no reports have described a risk calculator and nomogram using PI-RADS scores combined with other clinical parameters from a Japanese-only cohort. The aim of the present study was to develop the first risk model and nomogram using PI-RADS scores among Japanese males for detecting sPC and reducing over-detection and over-treatment of clinically insignificant prostate cancer.
In total, sPC was detected in 343 men (44.3%). The demographics, MRI and biopsy data of both groups are given in Table 1. Men in the sPC group were older (69 vs 65 years, P < 0.001), had higher PSA (9.01 vs 6.72 ng/ml, P < 0.001), lower PV (29.6 vs 39.85 ml, P < 0.001), and a greater number of biopsy cores (range 8–14 cores, P = 0.021). The proportion of borderline and malignant lesions on mpMRI (PI-RADS scores 3, 4 or 5) was significantly higher in the sPC group (93.59 vs 65.46%, P < 0.001).
In the multivariate logistic regression analysis to predict sPC, age (P = 0.002), logPSA (P < 0.001), PV (P < 0.001) and PI-RADS score (P < 0.001) contributed significantly to the model (Table 2). Multicollinearity was tested between all variables by the individual variance inflation factor and no multicollinearity was found. The nomogram of the risk model and the regression equation are shown in Fig. 1.
The novel risk model was internally validated by bootstrapping. Discrimination of the risk model was compared with that of each individual parameter using ROC analyses (Fig. 2, Table 3). The risk model reached a higher AUC (0.862) than age (0.646), PV (0.697), logPSA (0.652) and PI-RADS score (0.822). DeLong’s test also showed that the novel risk model performed significantly better than each of those parameters, including PI-RADS score alone (Table 3). Table 4 shows the TPR, FPR, PPV and NPV at exemplary probability thresholds of this risk model and at the best PI-RADS score cutoff. At a probability threshold of 10%, the net reduction in biopsies based on the risk model was 43.0%, while the rate of missing sPC was 2.3%. Bootstrapped calibration plots of the risk model demonstrated no untoward deviations of predicted risk from observed risk of sPC over the entire range (Fig. 3).
In bootstrapped DCA, the risk model showed a higher net benefit in terms of accurately detecting patients with sPC, compared with PI-RADS score and other parameters alone (Fig. 4). The risk model showed a benefit for sPC threshold probabilities larger than 10%.
Because of its high diagnostic accuracy for sPC detection, upfront mpMRI has been recommended as a triage test to indicate the need for biopsy among biopsy-naïve men in whom sPC is suspected due to high PSA [17-19]. Given the high negative predictive value, men with no suspected evidence of sPC on MRI may defer systematic biopsy. Moreover, to improve predictive values, new multivariate risk prediction tools have recently been constructed using the mpMRI suspicion score [9,10,21].
Recently, prostate MRI performed without DCE, termed “bi-parametric MRI” (bpMRI), has yielded favorable results. The effectiveness of bpMRI for detecting sPC in biopsy-naïve patients has been reported. bpMRI also has the advantages of avoiding the adverse events associated with some gadolinium-based contrast agents, a shorter examination time and reduced costs. On the other hand, DCE MRI has been reported to improve the sensitivity of MRI for detecting sPC. Nevertheless, predictive models based on bpMRI findings and clinical parameters for risk assessment and selection of sPC have also recently been reported [14,15,23,24].
In a Japanese cohort, the efficacy of mpMRI and bpMRI for detecting sPC as a triage test was also reported [25-27]. However, no multivariate risk prediction models for detecting sPC based on PI-RADS scores of mpMRI or bpMRI as ordinal variables among Japanese populations have been reported previously.
The characteristics of our novel risk model were as follows. First, in all cases, bpMRI was performed in the pre-biopsy setting, because biopsy artifacts could affect bpMRI findings and this model was constructed to reduce unnecessary biopsies. Second, DRE, a variable used in other nomograms, was not included in this study. Because anterior prostate cancer is less commonly palpable, if DRE is used as a variable in the prediction model, the dataset should ideally be divided into two groups according to whether DRE findings are positive, and a model should be constructed independently for each group. Our dataset was too small to be divided in this way.
PI-RADS score contributed significantly to the model, as did the other parameters in the multivariate logistic regression analysis. Interestingly, the odds ratio of PI-RADS score 2 compared to score 1 was 0.292 (P=0.098) and that of PI-RADS score 3 compared to score 1 was 2.005 (P=0.332) (Table 2). PI-RADS score 1 and score 2 indicated a normal prostate gland and benign prostate disease (inflammation and/or hyperplasia), respectively. In a proportion of cases with PI-RADS score 2, PSA was elevated because of inflammation and hyperplasia. Therefore, among high-PSA cases, PI-RADS score 1 might carry a higher risk of sPC than PI-RADS score 2 in real clinical practice. Moreover, because of the low number of PI-RADS score 1 cases (only 11 (1.42%)), the odds ratio for PI-RADS score 2 relative to score 1 might not have reached statistical significance. This may also explain why lower PV cases tended to carry a higher risk of sPC, presumably because multicollinearity among parameters could not be excluded completely even though multivariate analysis was performed.
A low PI-RADS score harbors a 5–10% risk of sPC, allowing biopsy to be potentially avoided [29,30]. ROC analysis revealed that this novel model offered a high AUC (c index=0.862), approximately equivalent to previous reports, although the model lacked external validation and should not be compared directly with other risk models constructed from different regional and ethnic cohorts. The risk model enables avoidance of unnecessary biopsies in more patients without increasing the risk of missing a diagnosis of sPC at an arbitrary probability threshold. More specifically, at probability thresholds of 10% and 20% in this model and with a cut-off PI-RADS score between 2 and 3, the net reductions in biopsies were 43.0%, 57.0% and 57.0%, while the rates of missing sPC were 2.3%, 6.4% and 6.4%, respectively. Using DCA, the present study showed that the risk model using PI-RADS scores improved clinical decisions for biopsy of patients with suspected sPC, as compared with clinical parameter models or PI-RADS score alone. The risk model provided benefits in the decision to biopsy patients for sPC at probability thresholds exceeding 10%. From a practical perspective, at various probability cutoffs, the combined model demonstrated the best performance among all prediction parameters. Although cost-effectiveness remains an issue due to differences in social insurance situations and the high penetration rate of MRI in other countries, a protocol for biopsy indications based on MRI in cases with high PSA values should be considered.
The present findings should be interpreted in the context of some limitations. First, this study was a retrospective analysis, which carries an elevated risk of selection bias. Second, inter-reader agreement on bpMRI was not evaluated in the present study. Third, low numbers of systematic biopsy cores were collected in our cohort. The number of sPC lesions detected by systematic biopsy was therefore probably lower, and sampling more cores could have improved model accuracy and internal validation. Last, no external validation was performed. If the excellent results obtained with bpMRI and other clinical parameters at a single institution, as in this study, are not reproduced in other hospitals, broad use of the novel risk model will lead to patient mismanagement in a substantial proportion of cases.
To the best of our knowledge, this represents the first report of a risk calculator and nomogram using PI-RADS version 2 scores of bpMRI among Japanese males for detecting sPC in pre-biopsy settings. On the other hand, recent risk models have been reported to detect sPC using quantitative mpMRI, which may also help standardize mpMRI and bpMRI interpretation and image recognition using new statistical tools (machine learning, deep learning and neural network analysis) [31,32]. Risk models using genetic elements and molecular markers rather than image variables are also being reported. Lastly, prospective, multi-center risk models for sPC risk prediction that include such new biochemical parameters, financial aspects and novel MRI fusion biopsy data are expected to be established in the future.
Between January 2011 and December 2016, a total of 773 biopsy-naïve patients suspected to have localized prostate cancer based on abnormal PSA levels were analyzed retrospectively at a single institution, Toranomon Hospital, Tokyo, Japan. Indications for biopsy were a high PSA level (≥4.0 ng/ml), abnormal DRE or lesions suspicious for prostate cancer on bpMRI images. Exclusion criteria were previous prostate surgery, previous diagnosis of prostate cancer and administration of a 5-alpha-reductase inhibitor or anti-androgen, as these agents affect PSA values. Full data on PI-RADS scores of bpMRI before prostate biopsy, biopsy outcome, PSA, age and PV were available for all patients. These data were used for risk model development and internal validation. The study was approved by the Toranomon Hospital Ethics Committee (approval no. 1573), and all methods were conducted in accordance with the relevant local guidelines and regulations. All patients either provided informed consent or were informed via the hospital's web page, which included an opt-out option approved by the Toranomon Hospital Ethics Committee.
All bpMRI was performed on a 1.5- or 3.0-T system (Magnetom; Siemens, Erlangen, Germany) using a multichannel body surface coil. The bpMRI protocol included axial, coronal and sagittal turbo spin echo T2-weighted sequences and axial DWI with apparent diffusion coefficient (ADC) calculation (Supplementary Table S1). A 1.5 T system was generally used for the first bpMRI and a 3.0 T system was used for the second and subsequent bpMRI. All image analyses were performed according to PI-RADS version 2.0 on a scale from 1 to 5, with higher numbers indicating a greater likelihood of sPC. Analyses were performed by or under the supervision of a few expert uroradiologists. Overall PI-RADS scores were determined for each lesion, which entailed assigning a separate score for each of the T2-weighted and DWI sequences. PV was calculated on T2-weighted imaging as 0.52 × length × width × height.
All patients underwent systematic transperineal and transrectal biopsy (mapping 8–14 cores) of the whole gland in the lithotomy position under local anesthesia, carried out by one of several expert urologists. If one or more lesions suspicious for prostate cancer were detected on bpMRI (suspicious lesions were reported as PI-RADS score ≥3 retrospectively), transperineal cognitive targeted biopsies were added for each lesion (2–4 cores per lesion; median, 2 per lesion). Transrectal ultrasonography (ARIETTA; Hitachi Aloka Medical, Wallingford, CT, USA) was used to guide biopsies without MRI fusion software.
Histopathological analyses of biopsy specimens were performed by or under the supervision of a few expert uropathologists specializing in prostate assessment, according to International Society of Urological Pathology standards. For each core, the length of the cancer in millimeters and both primary and secondary Gleason grades were recorded separately. The study defined sPC as grade group ≥3 (Gleason score ≥4 + 3) or a maximum cancer core length ≥6 mm in any location.
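The sPC definition above can be written as a short rule. The following Python sketch is illustrative only: the function names and the per-core tuple layout are assumptions made here rather than part of the study's workflow, and the grade-group mapping follows the standard grouping of Gleason patterns 3 to 5.

```python
def grade_group(primary, secondary):
    """Map primary/secondary Gleason patterns (each 3, 4 or 5) to grade group 1-5."""
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("Gleason patterns are expected to be 3, 4 or 5")
    total = primary + secondary
    if total == 6:
        return 1                            # 3 + 3
    if total == 7:
        return 2 if primary == 3 else 3     # 3 + 4 vs 4 + 3
    if total == 8:
        return 4                            # 4 + 4, 3 + 5, 5 + 3
    return 5                                # Gleason score 9 or 10


def is_spc(cores):
    """Return True if any positive core meets the study's sPC definition.

    cores: iterable of (primary, secondary, cancer_length_mm) tuples,
    one per cancer-positive core (hypothetical layout).
    """
    for primary, secondary, length_mm in cores:
        if grade_group(primary, secondary) >= 3 or length_mm >= 6:
            return True
    return False
```

Under this rule, a patient with a single Gleason 3 + 4 core of 2 mm would fall into the others group, whereas a 4 + 3 core of any length, or any core with at least 6 mm of cancer, would be classified as sPC.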
Patient demographics, MRI and biopsy results (age, PSA, PV, PI-RADS score 1–5 and presence or absence of sPC) were analyzed descriptively. First, we divided all patients into two groups by pathological outcome: an sPC group and an others group. The others group included patients with clinically insignificant prostate cancer and those with no cancerous tissue. Clinical parameters were compared between groups using the Wilcoxon test and Pearson test. Subsequently, we performed multivariate logistic regression analysis to predict the presence of sPC on biopsy. We calculated odds ratios and used the coefficients of the multivariate logistic regression to develop a multivariable nomogram that predicts the probability of sPC (a nomogram is a graphical calculating device; in this study it gives the approximate probability of sPC derived from the logistic function). To relax the assumption of a linear effect, PSA was log-transformed (logPSA).
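To make the link between regression coefficients and a predicted probability concrete, the sketch below applies the logistic function to a linear predictor built from age, logPSA, PV and PI-RADS score. The coefficient values are hypothetical placeholders, not the fitted values of this study (those are reported in Table 2 and Fig. 1). The use of the natural logarithm for logPSA and the coding of PI-RADS as a categorical term with score 1 as the reference level are assumptions made here for illustration.

```python
import math

# Hypothetical placeholder coefficients, NOT the study's fitted values.
EXAMPLE_COEFS = {
    "intercept": -5.0,
    "age": 0.03,        # per year
    "log_psa": 1.0,     # per unit of ln(PSA in ng/ml); natural log is assumed
    "pv": -0.04,        # per ml of prostate volume
    # PI-RADS as a categorical term, score 1 as reference (assumed coding)
    "pirads": {1: 0.0, 2: -1.0, 3: 0.5, 4: 2.0, 5: 3.5},
}


def spc_probability(age, psa, pv, pirads, coefs=EXAMPLE_COEFS):
    """Predicted probability of sPC from a logistic regression model."""
    if psa <= 0:
        raise ValueError("PSA must be positive to take its logarithm")
    if pirads not in coefs["pirads"]:
        raise ValueError("PI-RADS score must be an integer from 1 to 5")
    linear_predictor = (
        coefs["intercept"]
        + coefs["age"] * age
        + coefs["log_psa"] * math.log(psa)
        + coefs["pv"] * pv
        + coefs["pirads"][pirads]
    )
    # Logistic function maps the linear predictor to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

A nomogram is a graphical rendering of the same calculation: each predictor is mapped to points proportional to its contribution to the linear predictor, and the total is converted to a probability.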
Discrimination of risk models for sPC with or without MRI scoring was compared using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Statistical differences between predictive models were analyzed using DeLong’s test.
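As an illustration of the discrimination measure, the AUC equals the probability that a randomly chosen patient with sPC receives a higher score than a randomly chosen patient without sPC, with ties counted as one half. The pure-Python sketch below computes it by pairwise comparison. It is meant only for orientation: the analyses in this study were run in R, the function name is an assumption, and DeLong’s test itself is not implemented here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: predicted probabilities or any ordinal score (e.g. PI-RADS)
    labels: 1 for sPC, 0 for others
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("Both outcome classes are required")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5   # ties contribute one half
    return concordant / (len(positives) * len(negatives))
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; rank-based implementations scale better for larger datasets.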
The extent of over- or underestimation of predicted rate relative to observed rate of sPC was explored graphically using calibration plots, which were internally validated using 1000 bootstrap resamples. The intercept indicates whether predictions are systematically too low or too high, and thus should ideally be zero. The calibration slope reflects the average effects of predictors in the model and is estimated in a logistic regression model with the logit of model predictions as the only predictor. For a perfect model, the slope equals 1.
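The calibration intercept and slope described above can be sketched as a one-predictor logistic regression of the observed outcome on the logit of the predicted probability. The example below fits that model by Newton-Raphson in plain Python. It is a minimal sketch under stated assumptions: the function names are hypothetical, the intercept and slope are estimated jointly (some authors instead estimate the intercept with the slope fixed at 1), and the bootstrap resampling used in the study is omitted.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def calibration_intercept_slope(pred_probs, outcomes, n_iter=50):
    """Fit outcome ~ a + b * logit(pred) by Newton-Raphson; return (a, b)."""
    x = [logit(p) for p in pred_probs]
    a, b = 0.0, 1.0                       # start at perfect calibration
    for _ in range(n_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g_a += yi - mu                # gradient of the log-likelihood
            g_b += (yi - mu) * xi
            h_aa += w                     # observed information matrix
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        if det == 0.0:
            raise ValueError("Predictions must take at least two distinct values")
        # Newton step: solve the 2x2 system (information) * delta = gradient
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b
```

A slope below 1 indicates predictions that are too extreme (high risks overestimated, low risks underestimated), which is the typical signature of overfitting.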
Last, we assessed the clinical usefulness of the risk model using decision curve analysis (DCA) based on 1000-times repeated bootstrapped validation. These analyses estimate a ‘net benefit’ for prediction models by totaling the benefits (true-positive biopsies) and subtracting the harms (false-positive biopsies). The harms are weighted by the relative harm of a missed sPC compared to an unnecessary biopsy. This weight is derived from the threshold probability of sPC at which a patient would opt for biopsy. This threshold can vary from patient to patient in clinical settings. The reduction in the number of biopsies at different probability thresholds was further assessed and related to the number and percentage of sPC detected. The interpretation of the decision curve is that the model with the highest net benefit at a particular threshold probability is the most useful model for balancing risk and benefit. To quantify the potential reduction of unnecessary biopsies and potential over-diagnosis, we calculated the true-positive rate (TPR), false-positive rate (FPR), positive predictive value (PPV) and negative predictive value (NPV) at exemplary probability thresholds.
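The net benefit at a threshold probability p_t is conventionally computed as TP/n − (FP/n) × p_t/(1 − p_t), where n is the number of patients. The sketch below implements this formula together with TPR, FPR, PPV and NPV at a given threshold. It is a plain-Python illustration with hypothetical function names. It treats a predicted probability at or above the threshold as an indication for biopsy, which is an assumption, and it does not include the bootstrapping used in the study.

```python
def confusion_counts(probs, labels, threshold):
    """Return (tp, fp, tn, fn), treating prob >= threshold as 'biopsy'."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        biopsy = p >= threshold
        if biopsy and y == 1:
            tp += 1
        elif biopsy and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def net_benefit(probs, labels, threshold):
    """Net benefit: TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    tp, fp, _, _ = confusion_counts(probs, labels, threshold)
    n = len(labels)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def _ratio(numerator, denominator):
    # Undefined ratios (empty denominator) are reported as NaN
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(probs, labels, threshold):
    """TPR, FPR, PPV and NPV at a given probability threshold."""
    tp, fp, tn, fn = confusion_counts(probs, labels, threshold)
    return {
        "tpr": _ratio(tp, tp + fn),
        "fpr": _ratio(fp, fp + tn),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }
```

Evaluating `net_benefit` over a grid of thresholds for the risk model and for each single parameter yields the curves compared in a decision curve plot.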
All tests performed were two-sided, and a P value of less than 0.05 was considered to indicate statistical significance. Statistical analyses were performed using R version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria). ROC analysis and DCA were performed using the pROC package and the rmda package, respectively. Reporting followed the Standards for Reporting of Diagnostic Accuracy (Supplementary Tables S2 and S3).
Due to technical limitations, tables are only available as a download in the Supplemental Files section.