Search strategy and study selection
This study was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [19]. The PubMed and Embase databases were systematically searched from inception until January 12, 2020 using keywords and related terms for “prostate”, “PSMA-PET”, “SVI” and “EPE” with the following search query: (prostate OR prostatic) AND ("prostate-specific membrane antigen" OR PSMA) AND ("positron emission" OR PET) AND ("extracapsular extension" OR ECE OR "extraprostatic extension" OR EPE OR "seminal vesical invasion" OR SVI OR T3 OR T3a OR T3b OR ((local OR localized OR regional OR locoregional) AND (stage OR staging OR extent* OR invasion))). The reference lists of eligible articles were also screened to identify further relevant articles. No language restrictions were applied.
Studies were included based on “Patient, Index test, Comparator, Outcome, and Study design” (PICOS) criteria: (1) “patients” with prostate cancer presenting for primary staging; (2) PSMA-PET as the “index test”; (3) radical prostatectomy as the “comparator” or reference standard; (4) SVI or EPE as the “outcome”; and (5) a “study design” of clinical trial or prospective or retrospective cohort study, published either as an original article or as a conference abstract. Of note, we planned to meta-analyze only studies assessing 68Ga-based radioligands, as they are widely used and investigated in the literature.
Studies were excluded if they (1) included a small number of patients (<10); (2) were of other publication types (e.g., review articles, letters, or editorials); (3) focused on other topics; (4) did not provide sufficient data to construct 2×2 contingency tables for sensitivity and specificity; or (5) had overlapping study populations. When overlap was present, we used the study that provided the more comprehensive information required for meta-analysis.
Study selection was performed by two independent reviewers (S.W. and S.G.); disagreements were resolved through discussion with a third reviewer (H.A.V.).
Data extraction and quality assessment
Relevant study-, clinicopathological-, and PET-related information was extracted and collated in Excel 2016 as follows: (1) study: first author, publication year, institution, period of enrollment, country of origin, study design (prospective vs. retrospective), and endpoint (SVI, EPE, or both); (2) clinicopathological: number of patients, age, serum PSA level, Gleason score, and risk classification [20]; (3) PET: vendor, type of scanner, ligands, anatomical imaging component (MRI vs. CT), and whether PET was assessed blinded to clinicopathological information.
The quality of the studies was assessed using the revised Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool [21]. Data extraction and quality assessment were performed by the same three reviewers in the same manner as study selection.
Data synthesis and analysis
The primary outcome of our study was the diagnostic performance of PSMA-PET for determining SVI and EPE, expressed as sensitivity and specificity. The secondary outcome was any difference in diagnostic performance between PET/MRI and PET/CT.
True positive, false negative, false positive, and true negative values were tabulated using the sensitivity and specificity or the corresponding raw data provided by each of the included studies. If a study reported diagnostic test accuracy results for multiple readers, the average value across all readers was used. Sensitivity and specificity were meta-analytically pooled using hierarchical logistic regression modelling, and the corresponding hierarchical summary ROC (HSROC) curves were generated with their 95% confidence and prediction regions [22, 23]. Publication bias was evaluated by subjective assessment of Deeks’ funnel plot and by the p-value of Deeks’ asymmetry test [24].
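As an illustration of the tabulation step, the following minimal Python sketch shows one way a 2×2 table could be back-calculated from a reported sensitivity and specificity, and how tables from several readers could be averaged. The names TwoByTwo, reconstruct_2x2 and average_readers are hypothetical, and the rounding rule is an assumption; the study does not describe its tabulation at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of a diagnostic 2x2 table against the reference standard."""
    tp: int
    fn: int
    fp: int
    tn: int


def reconstruct_2x2(sensitivity, specificity, n_diseased, n_non_diseased):
    """Back-calculate counts from reported accuracy and group sizes.

    Assumption: counts are rounded to the nearest integer and the
    complementary cells are derived from the group totals.
    """
    tp = round(sensitivity * n_diseased)
    tn = round(specificity * n_non_diseased)
    return TwoByTwo(tp=tp, fn=n_diseased - tp, fp=n_non_diseased - tn, tn=tn)


def average_readers(tables):
    """Average TP and TN across readers who assessed the same patients."""
    n_diseased = tables[0].tp + tables[0].fn
    n_non_diseased = tables[0].fp + tables[0].tn
    for t in tables:
        # All readers must have read the same cohort.
        if (t.tp + t.fn, t.fp + t.tn) != (n_diseased, n_non_diseased):
            raise ValueError("readers must share the same group totals")
    tp = round(sum(t.tp for t in tables) / len(tables))
    tn = round(sum(t.tn for t in tables) / len(tables))
    return TwoByTwo(tp=tp, fn=n_diseased - tp, fp=n_non_diseased - tn, tn=tn)
```

Deriving the false negative and false positive counts from the group totals, rather than rounding each cell separately, keeps the number of patients with and without the finding unchanged after rounding.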
Heterogeneity was assessed with several methods. First, heterogeneity was evaluated using Cochran’s Q test. Second, the Higgins I² statistic was used to determine the degree of heterogeneity as follows: inconsistency index (I²) = 0–40%, unimportant; 30–60%, moderate; 50–90%, substantial; and 75–100%, considerable [25]. Third, we tested for the presence of a threshold effect, defined as a positive correlation between sensitivity and false-positive rate. Finally, meta-regression analysis was performed using the anatomical imaging component of the PET (MRI vs. CT) as a covariate to ascertain whether diagnostic performance differed between studies using PET/MRI and those using PET/CT.
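The first three heterogeneity checks can be sketched in plain Python as follows. This is a simplified illustration, not the procedure implemented in the statistical packages used in the study: it assumes that Q and I² are computed on logit-transformed proportions with inverse-variance weights and a 0.5 continuity correction, and that the threshold effect is examined with a Spearman rank correlation. All function names are hypothetical.

```python
import math


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance.

    Assumption: a 0.5 continuity correction is added to both cells.
    """
    a = events + 0.5
    b = (total - events) + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def cochran_q_and_i2(estimates, variances):
    """Cochran's Q and Higgins I2 (in percent) with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I2 = (Q - df) / Q, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


def _average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, e.g. of sensitivity against false-positive rate.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

In use, each study's sensitivity would be passed through logit_and_variance(tp, tp + fn) before calling cochran_q_and_i2, and likewise for specificity with (tn, tn + fp). A clearly positive spearman_rho between per-study sensitivity and false-positive rate (1 minus specificity) would suggest a threshold effect.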
The “metandi” and “midas” modules in Stata 10.0 (StataCorp LP, College Station, TX, USA) and the “mada” package in R software version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria) were used for statistical analyses. A two-tailed P < 0.05 was considered statistically significant, with the exception of Deeks’ asymmetry test, for which P < 0.1 indicated statistical significance.