Utility of Three-Protein Panels in the Separation of Aggressive Prostate Cancer from Non-Aggressive Tumors

Background Prostate cancer (PCa) is a heterogeneous group of tumors, including non-aggressive (NAG) and aggressive (AG) subtypes, with variable clinical outcomes. We assessed the diagnostic utility of selected protein markers to identify AG tumors. Methods A tissue microarray (TMA) was constructed that included NAG and AG tumors. Twelve protein markers were evaluated on the TMA by immunohistochemical (IHC) staining. The markers were also evaluated for their potential utility, individually or in panels, for distinguishing AG from NAG tumors. Results Higher expression of four protein markers, namely prostate-specific membrane antigen (PSMA), phospho-EGFR, androgen receptor (AR), and P16, was identified in AG tumors of Gleason score 4 and 5. In contrast, Galectin-3, DPP4 and MAN1B1 revealed stronger staining patterns in NAG tumors. The sensitivity and specificity of individual markers varied widely. Among the two-marker panels, the panel of DPP4 and PSMA stood out, with a specificity of 38.46% at 95% sensitivity. To further improve the detection ability, we combined DPP4 and PSMA with either Galectin-3 or phospho-EGFR into three-marker panels. The specificity reached >46% at 95% sensitivity and the AUC was >0.85. Our panels can be used to improve the separation of AG from NAG tumors and to aid in the optimization of the treatment strategy for patients.


Introduction
Prostate cancer (PCa) is a heterogeneous group of tumors with variable clinical outcomes [1][2][3] . Most PCa presents as localized disease with low or no risk of tumor progression, and patients can be managed by active surveillance rather than aggressive/surgical interventions. Only a subset of PCa behaves aggressively, leading to tumor progression, metastasis and cancer-related death 4-7 . Overdiagnosis and overtreatment of indolent PCa remain clinical issues in the management of PCa patients. Several recent studies, including those of the US Preventive Services Task Force (USPSTF), have shown that the incidence of localized disease continues to decline, whereas the incidence of advanced-stage disease continues to rise in men over 50 years old [4][5][6][7] . Multiple risk stratification systems have been developed to separate high-risk aggressive PCa (AG) from low-risk non-aggressive indolent tumors (NAG), combining clinical and pathological parameters (such as serum PSA levels, Gleason score, ISUP grade of the tumor, and clinical and pathological staging). However, these tools are still insufficient for the prediction of disease progression or the separation of AG from NAG 4,6−8 . Therefore, there is an urgent clinical need to identify AG in order to optimize the treatment strategy for patients.
Current genomic studies have identified a spectrum of molecular abnormalities associated with PCa [9][10][11][12][13] . These studies demonstrated distinct molecular abnormalities in subtypes of PCa. In the Cancer Genome Atlas (TCGA) study, the comprehensive analysis of a large cohort revealed that 74% of 333 primary PCas fell into one of seven subtypes defined by specific gene fusions (ERG, ETV1/4, FLI1) or mutations (SPOP, FOXA1, IDH1) 9 . Among numerous genomic aberrations, SPOP and FOXA1 mutants had the highest levels of AR-induced transcripts 9 . The study of whole genomes and tumor methylomes revealed several unique tumor-specific RNA and methylation patterns in AG tumors 10 . In addition, many other epigenetic alterations have also been identified in PCa, involving the EGFR, PI3K and MAPK signaling pathways and DNA repair genes 11,12 , as well as loss of PTEN and alterations in TMPRSS2-ERG fusions [12][13][14] . These studies not only reveal genomic heterogeneity among primary PCa, but also identify potentially actionable therapeutic targets.
Furthermore, genomic dysregulation in PCa leads to aberrant transcription and expression of cellular proteins. Based upon multi-omic proteogenomic studies and comprehensive proteomic analyses of tumor tissue, many potential protein biomarkers and molecular mechanisms associated with AG tumors have also been suggested [15][16][17][18][19][20][21][22][23] . Interestingly, one study 15 also found that changes in mRNA abundance could not reflect the variability in protein abundance, indicating the importance of proteomic study of tumors. In a comparative study of PCa cell lines and PCa tumor tissues, 12 mutant peptides were identified as differentially expressed in PCa 16 . Our previous proteomic studies identified differentially expressed protein biomarkers, including 14 proteins with high expression in AG tumors and 14 proteins with high expression in NAG tumors 17 . Similarly, several other studies also demonstrated that proteins were differentially expressed in subtypes of PCa 18-23 . In a study of 28 primary PCa with Gleason scores ranging from 6 to 9, the authors found increased expression of pro-NPY, which was associated with a poor prognosis 18 . Recently, we identified decreased proteinase activity in AG tumors 19 . All these findings demonstrate an extensive involvement of intracellular proteins and signaling pathways in PCa. Therefore, further evaluation of the roles of potential protein markers as independent predictors of pathological AG tumors is necessary [20][21][22][23] in order to optimize clinical management.
In PCa, the Gleason score of a tumor has been considered an indicator of the aggressiveness of the disease, and is assigned based upon the morphology of the dominant tumor pattern plus the secondary tumor pattern 24,25 . Based upon the Gleason score of tumors, a PCa grading system was further developed by the International Society of Urological Pathology (ISUP) to predict the clinical behavior of the tumor 24 . In the ISUP grading system, grade group 1 (Gleason score 6 or less) has a low risk of disease progression, grade group 2 (Gleason score 3+4 = 7) and grade group 3 (Gleason score 4+3 = 7) have an intermediate risk of progression, grade group 4 (Gleason score 8) has a high risk of progression, and grade group 5 (Gleason score 9 or 10) has the highest risk of tumor progression 24 . Although both ISUP grade groups 2 and 3 are considered intermediate-risk groups, studies have shown that grade group 2 has a more favorable prognosis than grade group 3 25,26 . Thus, the current guideline suggests that active surveillance should be selected for grade group 2 and surgical intervention should be selected for grade group 3 or higher PCa 27,28 . In patients, a small amount of tumor tissue is often obtained for the initial evaluation of tumor Gleason scores. However, the procedure can be difficult and the sample may not represent the true nature of the tumor. For example, the Gleason score, particularly the secondary pattern of an aggressive tumor, may be difficult to identify and/or assign in certain cases because the focus of representative morphology is minute 24,29 . Studies have shown that patients with biopsy-proven grade group 1 and grade group 2 tumors could be upgraded to higher grade groups after evaluation of the surgically resected whole tumor tissue 29 . Therefore, an integrated evaluation of tumor morphology in combination with the expression of protein markers is necessary to improve the separation of AG from NAG tumors.
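The grade-group rules described above can be written out as a small lookup. The following Python sketch is for illustration only; the function names `isup_grade_group` and `is_aggressive` are hypothetical, and the AG definition (Gleason score of 7 or higher) follows the cohort definition used in this study. A Gleason score of 7 built from patterns other than 3+4 or 4+3 is not covered by the text, so the sketch rejects it rather than guessing.

```python
def isup_grade_group(primary: int, secondary: int) -> int:
    """Map primary and secondary Gleason patterns to an ISUP grade group (1-5)."""
    if not (1 <= primary <= 5 and 1 <= secondary <= 5):
        raise ValueError("Gleason patterns must be between 1 and 5")
    score = primary + secondary
    if score <= 6:
        return 1  # Gleason score 6 or less
    if score == 7:
        if (primary, secondary) == (3, 4):
            return 2  # 3+4 = 7
        if (primary, secondary) == (4, 3):
            return 3  # 4+3 = 7
        # Assumption: other ways of reaching 7 are not defined in the text.
        raise ValueError("Gleason score 7 is only defined here for 3+4 and 4+3")
    if score == 8:
        return 4
    return 5  # Gleason score 9 or 10


def is_aggressive(primary: int, secondary: int) -> bool:
    """AG as used in this cohort: Gleason score >= 7; NAG: Gleason score 6."""
    return primary + secondary >= 7
```

For example, a 3+4 tumor falls in grade group 2 and a 4+3 tumor in grade group 3, although both have a Gleason score of 7 and both count as AG under the cohort definition.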
To assess the utility of potential protein markers and to understand molecular features of tumor progression in PCa, we evaluated the expression patterns of 12 protein biomarkers identified by proteomic and genomic studies. Immunohistochemistry (IHC) was performed on PCa tissue microarrays (TMA), including both indolent NAG and AG subtypes and tumor-matched normal/benign adjacent tissue (NAT). The purposes of our study were to evaluate the utility of PCa-associated proteins, to optimize the performance of individual markers by combining them into panels, and to assess the protein biomarker panels, which may have potential for the development of a clinical assay to separate AG from NAG PCa.

Clinical Information
In our cohort, the median age of patients was 61 years, ranging from 40 to 73 years. Based upon the grading criteria of the International Society of Urological Pathology (ISUP) and the morphological features of the dominant nodule, the ISUP grades of our cohort were: 20 cases of Grade 1, 18 cases of Grade 3, 5 cases of Grade 4, and 14 cases of Grade 5. In addition, 7 cases of Grade 2 were found in tertiary nodules. The pathological stages were 30 cases of pT2, 1 case of pT2x, 13 cases of pT3A, 11 cases of pT3B and 2 cases of pT4 ( Table 2, Supplementary Table 1). Taken together, 20 indolent NAGs with a Gleason score of 6 and 37 AGs with a Gleason score ≥ 7 were included in the TMA.

IHC Staining Pattern of Individual Protein in PCa TMA
IHC stains of 12 proteins were performed on the TMA. The information on primary antibodies is summarized in Table 1. Among them, antibodies for 7 proteins presented variable staining patterns ( Figure 1A). The majority of protein markers revealed membranous and cytoplasmic staining patterns, except P16, which revealed a nuclear staining pattern. The staining patterns of each protein in tumors and NAT were analyzed using a semi-quantitative scoring system.
The IHC staining patterns of antibodies to 7 proteins are summarized in Table 3. Higher expression of four proteins, namely PSMA, phospho-EGFR, androgen receptor (AR), and P16, was identified in AG tumors. In contrast, three antibodies, anti-Galectin-3, anti-DPP4, and anti-MAN1B1, revealed stronger staining patterns in Gleason score 3 tumors, but weak staining patterns in Gleason score 4 and Gleason score 5 tumors. IHC scores of 1, 2 and 3 were used as cut-off scores for positive or negative staining in tumors, respectively. Both PSMA and phospho-EGFR had a positive correlation with the Gleason scores of the tumor, whereas Galectin-3 and DPP4 had a negative correlation with the Gleason scores of the tumor ( Figure 1B). The correlations of the IHC scores of PSMA, phospho-EGFR, Galectin-3 and DPP4 with the Gleason scores of tumors are shown in Figure 2.
We did not detect the expression of total EGFR with antibody D38B1, PD-1 with antibody NAT105, PD-L1 with antibodies 22C3 and SP142, and PTEN with antibody 6H2.1 in PCa.

The Sensitivity and Specificity of Individual Proteins
Based upon the individual staining patterns, receiver operating characteristic (ROC) analysis was performed, and the area under the curve (AUC) of each individual marker was compared. The sensitivity and specificity of phospho-EGFR, Galectin-3, DPP4 and PSMA for distinguishing AG from NAG are summarized in Figure 3.
Among individual markers, AUC values ranged from 0.48 to 0.7, with the best performance shown by DPP4 and PSMA. Expression of DPP4 and PSMA was significantly altered in tumors with Gleason score ≥4 (p < 0.05), and both markers performed better than phospho-EGFR and Galectin-3 in the separation of AG tumors from NAG tumors ( Figure 3A). To directly compare the performance of individual markers, we fixed the sensitivity at 95% and then compared the specificity. At 95% sensitivity, the specificity ranged from 0% to 8.79% ( Figure 3B). By examining each individual marker at its best cutoff point on the ROC curve (the maximal summed sensitivity and specificity), we found that phospho-EGFR, Galectin-3, DPP4 and PSMA had specificities of 49.5%, 89%, 73.6% and 89%; the corresponding sensitivities were 68.1%, 36.3%, 79.6%, and 68.1%, respectively ( Figure 3B). Among these markers, Galectin-3 and PSMA had the best specificity of 89%, and DPP4 had the best sensitivity of 79.6%. To evaluate the statistical stability of the performance of markers for AG tumor detection, we used both label permutation and bootstrap methods ( Figure 3C). Again, both DPP4 and PSMA demonstrated higher stability than phospho-EGFR and Galectin-3. Taken together, the reduced expression of DPP4 and the elevated expression of PSMA could be used as a signature of aggressiveness of PCa.
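The two operating points used above, the specificity at a fixed 95% sensitivity and the cutoff with the maximal summed sensitivity and specificity, can be illustrated with a short Python sketch. This is not the analysis code of the study (which was carried out in R); the function names are hypothetical, labels are assumed to be coded 1 for AG and 0 for NAG, and a higher score is assumed to indicate AG (for a marker that decreases in AG, such as DPP4, the score would be negated first). The sketch uses only the empirical cutoffs and does not interpolate between ROC points.

```python
def roc_points(scores, labels):
    """Return (cutoff, sensitivity, specificity) for every observed cutoff.

    A case is called AG when its score is >= the cutoff.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both AG (1) and NAG (0) cases are required")
    points = []
    for cutoff in sorted(set(scores)):
        sensitivity = sum(s >= cutoff for s in pos) / len(pos)
        specificity = sum(s < cutoff for s in neg) / len(neg)
        points.append((cutoff, sensitivity, specificity))
    return points


def specificity_at_sensitivity(scores, labels, target=0.95):
    """Best specificity among cutoffs whose sensitivity is at least `target`."""
    # The lowest cutoff always has sensitivity 1.0, so the list is never empty.
    eligible = [spec for _, sens, spec in roc_points(scores, labels) if sens >= target]
    return max(eligible)


def best_cutoff(scores, labels):
    """Cutoff with the maximal summed sensitivity and specificity."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2])
```

A marker can have a reasonable best-cutoff specificity and still have a very low specificity at 95% sensitivity, because the second criterion forces the cutoff low enough to capture nearly all AG cases.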

Further Construction and Evaluation of Protein Panels in Separation of AG Tumors
Based upon the performance of the individual protein markers, we combined them into two- and three-marker panels, and evaluated the performance of these panels in the separation of AG from NAG tumors.
To further improve the specificity of these markers at a fixed sensitivity of 95%, we also constructed three-marker panels and evaluated their performance. These panels included combinations of DPP4 plus Galectin-3 plus phospho-EGFR, DPP4 plus Galectin-3 plus PSMA, and DPP4 plus phospho-EGFR plus PSMA ( Figure 5). All AUCs were further improved to > 0.80 ( Figure 5A and Figure 5B).
Specificities and sensitivities of the three-marker panels were as follows: 75.8% and 83.2% in the panel of DPP4 plus Galectin-3 plus phospho-EGFR, 83.5% and 76.1% in the panel of DPP4 plus Galectin-3 plus PSMA, and 81.3% and 79.6% in the panel of DPP4 plus phospho-EGFR plus PSMA ( Figure 5B). All specificities of the three-marker panels were > 75% (75.8%, 83.5%, and 81.3%), and all sensitivities were > 76% (83.2%, 76.1%, and 79.6%). The random models (label permutation analysis) and the real data were well separated, indicating that the performance of the three-marker panels was reliable ( Figure 5C). We observed that three-marker panels performed much better than individual markers as well as two-marker panels, and the specificity at 95% sensitivity was improved, especially in panels containing both DPP4 and PSMA. Specificities at 95% sensitivity of these panels reached 48.35% and 46.15%, respectively ( Figure 5B).
Taken together, our data demonstrated that three-marker panels containing DPP4 and PSMA can significantly improve the separation of AG from NAG, compared with individual markers or two-marker panels.

Discussion
Prostate cancer (PCa) is the most common cancer and the second leading cause of cancer death in men in the United States; the estimated numbers of new cases and cancer-related deaths in 2020 were 192,000 and 33,000, respectively [1][2][3][4] . The majority of PCa presents as an indolent tumor, for which the patient can be observed in an active surveillance program 4,24−28 . Only about 10% of PCa presents as an aggressive disease with a high risk of tumor progression 4,26−28 . Therefore, in order to accurately predict high-risk tumors and to limit overtreatment of indolent tumors, it is crucial to distinguish AG PCa from NAG tumors.
Although both the Gleason score and the newly developed ISUP grade group system have been used to assess the potential clinical behavior of the tumor, these systems still have certain limitations in guiding therapeutic decisions for patients 24 .
To better understand the molecular mechanisms and to identify AG tumors, great efforts have been made to profile the proteogenomic landscape of PCa 9-23 . Our previous large-scale quantitative proteomic studies of PCa tumor tissue identified a spectrum of cellular proteins that are up- or downregulated in subtypes of PCa 8, 17,19 . These differentially expressed tissue proteins identified by the MS approach are particularly interesting as biomarker candidates because of the high likelihood of their detectability in tumor tissue 17,19 . Using an immunohistochemical approach, we assessed the detectability of selected candidate protein markers in tumor tissue, and their utility as individual markers and/or as protein panels in the separation of AG tumors.
In this study of 12 markers, four proteins, namely PSMA, phospho-EGFR, AR, and P16, demonstrated higher expression levels in AG tumors than in NAG tumors. In contrast, three protein markers, Galectin-3, DPP4 and MAN1B1, revealed stronger staining patterns in Gleason score 3 tumors, but weak staining patterns in Gleason score 4 and Gleason score 5 tumors. The AUCs of phospho-EGFR, Galectin-3, DPP4 and PSMA were 0.48, 0.54, 0.68, and 0.7, respectively. Among the individual markers, PSMA showed the best discrimination power, with a specificity of 89%, whereas DPP4 showed a sensitivity of 79.6%. These two markers performed better than phospho-EGFR and Galectin-3 in AG tumor detection. However, both markers had limitations when we considered the specificity at 95% sensitivity. Nevertheless, reduced expression of DPP4 and elevated expression of PSMA could be used together as signatures of aggressiveness of PCa.
PSMA is a type II transmembrane glycoprotein of 750 amino acids. It has a long C-terminal extracellular domain and a short N-terminal intracellular domain 31 . Its extracellular domain has enzymatic activity, functioning as folate hydrolase I or glutamate carboxypeptidase II 32 . In benign prostate tissue, the expression of PSMA is low. A similarly low level of expression is found in kidney, small intestine, and brain tissue. However, the expression of PSMA is significantly increased in PCa, and the overexpression of PSMA is also correlated with disease progression and tumor metastasis in PCa 33,34 . Similarly, DPP4 is a type II transmembrane glycoprotein, but it has serine exopeptidase activity. DPP4 plays a critical role in regulating cellular proliferation and migration 35 . Both oncogenic and tumor suppressor activities of DPP4 have been identified in cancers 36,37 . A reduced serum DPP4 level is also found in PCa patients, especially in patients with metastatic disease 38 .
In a study of primary and metastatic PCa, we recently found that decreased DPP4 expression and activity are associated with PCa aggressiveness 17,19 . The findings of decreased DPP4 levels in aggressive and metastatic PCa suggest its critical role in AG tumors. Galectin-3 is a member of the lectin superfamily and plays critical roles in regulating cellular signaling pathways and cancer progression 39 . In prostate cancer, its expression is elevated in the early stages of tumors, but gradually decreases over the course of disease progression and is completely lost in advanced-stage tumors [39][40][41] . In metastatic PCa, Galectin-3 regulates the formation of tumor cell aggregates and their adhesion to the microvascular endothelium 42 . Based upon its expression and biological roles, Galectin-3 has been suggested as a predictive marker for the biochemical recurrence of PCa 40,41 . EGFR is a transmembrane glycoprotein that is activated by dimerization upon ligand binding 43 . The phosphorylation of EGFR leads to several downstream intracellular phosphorylation events [43][44][45][46] . EGFR activation plays a key role in cell survival, proliferation, migration and differentiation, including in PCa progression 45,46 .
Based upon the performance of individual protein markers and the important biological functions of these proteins, particularly DPP4 and PSMA, we combined individual biomarkers into several panels and evaluated their performance in the separation of AG from NAG tumors. The two-marker panels were constructed by combining PSMA or DPP4 with either Galectin-3 or phospho-EGFR. Higher AUCs were achieved using two-marker panels than with individual markers, indicating a significant improvement in differentiating AG tumors from NAG tumors. All specificities of the two-marker panels were over 68% (85.7%, 86.8%, 68.1% and 76.9%), and all sensitivities were ≥ 69% (71.7%, 69%, 87.6% and 85%). Furthermore, the specificity at 95% sensitivity was generally improved by using two-marker panels. Among the combinations, the two-marker panel consisting of DPP4 and PSMA demonstrated the highest specificity (38.46%) at 95% sensitivity, indicating the best performance.
To investigate whether higher discrimination power could be achieved, three-marker panels were also constructed by combining both DPP4 and PSMA with either Galectin-3 or phospho-EGFR. In three-marker panels, all AUCs were improved to >0.83, indicating a further improvement in the separation of AG from NAG. All specificities of the three-marker panels were > 75% (75.8%, 83.5%, and 81.3%), and all sensitivities were > 76% (83.2%, 76.1%, and 79.6%). In three-marker panels containing both DPP4 and PSMA, specificities at 95% sensitivity reached 46.15% and 48.35%, respectively. These three-marker panels demonstrated the best performance, compared with individual markers and the other two- and three-marker panels.
The unique feature of our study is the integrative analysis of protein expression across a spectrum of tumor Gleason scores, including Gleason score 6 NAG (Gleason score 3+3), Gleason score ≥ 7 AG (Gleason score 3+4, 4+3, 4+4, 4+5, 5+4 and 5+5) and NAT. IHC stains of these protein markers correlated with and validated previous studies of differential expression of proteins in subtypes of PCa. This integrative IHC study with the previously defined aggressive PCa subtypes demonstrated that three-marker panels can be used for the separation of aggressive PCa from indolent tumors. Our three-marker panels can serve as a signature of AG tumors.
Furthermore, our findings also suggest that loss of Galectin-3 expression and DPP4 activity may promote prostate cancer aggressiveness. The decreased expression of these two proteins and the subsequent increase in the bioactivity of phospho-EGFR and PSMA may promote tumor cell proliferation and disease progression. These findings also help us to gain further insight into the proteomic heterogeneity of aggressive PCa and to investigate the molecular taxonomy of the tumor for future diagnostic, prognostic, and therapeutic stratification.
In summary, we assessed the diagnostic value of selected protein markers in the identification of AG tumors using TMA and IHC. Higher expression of four protein markers, namely PSMA, phospho-EGFR, AR, and P16, was identified in AG tumors of Gleason score 4 and 5. In contrast, three protein markers, Galectin-3, DPP4 and MAN1B1, revealed stronger staining patterns in NAG tumors of Gleason score 3. The sensitivity and specificity of individual markers for distinguishing AG were variable and relatively low. We constructed two- and three-marker panels. The combination of two tissue markers could provide better separation of AG from NAG tumors, especially the panel composed of DPP4 and PSMA. We observed further improvement when combining DPP4 and PSMA with Galectin-3 (AUC of 0.85) as well as when combining DPP4 and PSMA with phospho-EGFR (AUC of 0.86). More importantly, higher specificities of 46.15% and 48.35% were achieved by using the aforementioned three-marker panels at a fixed 95% sensitivity. These panels can be used to assess the aggressiveness of PCa and to improve the separation of AG from NAG tumors using tumor tissue. These panels provide an additional diagnostic tool to address the urgent clinical need and to optimize the treatment strategy for PCa patients.

Case Selection
PCa cases were collected from radical prostatectomy specimens (except for one case, which was collected from a transurethral resection of the prostate (TURP)) with informed consent and in a manner that protected patients' identities. A total of 57 cases were included in the study, including Gleason score 3 (i.e. 3+3), 4 (i.e. 3+4, 4+3, or 4+4), or 5 (i.e. 5+4 or 4+5) tumors. Among them, 54 cases were collected between January 2002 and December 2009, and an additional 3 cases were collected in 2012. Electronic medical records were reviewed and the clinical and pathological data, including age and TNM T-stage, N-stage and M-stage, were obtained (Supplementary Table 1). The pathological stages of PCa were determined according to the eighth edition of the AJCC guidelines 30 . AG and NAG tumors were defined using the criteria of the International Society of Urological Pathology (ISUP) 24 .
The study was approved by the Institutional Review Board of Johns Hopkins Medical Institutions. In addition, all methods performed in the study were in accordance with the relevant guidelines and regulations.

Construction of Tissue Microarray
The PCa tissue microarray (TMA) was constructed using the surgically resected tumors described above (n=57 cases). All tumor tissue blocks were fixed in 10% formalin and embedded in paraffin. In addition to a review of the original pathology reports, the hematoxylin and eosin (H&E) stained tumor sections were re-reviewed by an American Board of Pathology certified pathologist (QKL) prior to TMA construction to ensure the representation of the tumor area and adjacent tumor-matched benign tissue (NAT). Cores (0.6 mm in diameter) were obtained from tumor tissue and/or NAT on each representative paraffin block and transplanted into the recipient TMA block. From the 57 cases, 215 cores of PCa and 111 cores of NAT were included in the TMA.

Resources of Primary Antibodies
We evaluated 12 proteins in the study. Details of the primary antibodies are summarized in Table 1. Distinct membranous, cytoplasmic or nuclear staining was considered for each protein IHC stain. The intensity of the IHC staining pattern of each protein was semi-quantitatively scored by two researchers, QKL (an American Board of Pathology certified pathologist) and NH, using a 4-tier system: 0 (0%, no staining), 1 (<10%, weak and focal staining), 2 (10-50%, medium and focal staining), or 3 (>50%, strong and diffuse staining) in tumor cells. All IHC stains were scanned using Concentriq (Proscia Inc, Philadelphia, PA https://proscia.com) and stored as digital files. Depending on the TMA section, some cores could not be evaluated due to the loss of tissue cores during processing (please see the individual staining patterns in the results).
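The percentage thresholds of the 4-tier system can be expressed as a simple function. The Python sketch below is illustrative only and rests on an assumption: it uses the percentage of stained tumor cells alone, whereas the scoring described above was performed visually and also took staining intensity and distribution into account. The function name `ihc_score` is hypothetical.

```python
def ihc_score(percent_positive: float) -> int:
    """Map the percentage of stained tumor cells to the 4-tier IHC score (0-3)."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0  # no staining
    if percent_positive < 10:
        return 1  # <10%
    if percent_positive <= 50:
        return 2  # 10-50%
    return 3  # >50%
```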

Statistical Analysis
The discriminatory power of each protein marker panel (composed of ≥ 1 candidate marker), using logistic regression, was evaluated via receiver operating characteristic (ROC) curve analysis. To ensure statistical stability of the results, we used bootstrap resampling (n=500) of the data to construct and evaluate the predictive models of protein marker panels. Bootstrap resampling with label permutation was also carried out to generate random models for examining the reliability of the panels. The mean ROC curves were depicted based on bootstrap resampling results and an area under the curve (AUC) was computed for each mean ROC curve. All the analyses were carried out in R (version 3.5). The predictive models were built using caret (version 6.0-85) and ROC curves were generated using pROC (version 1.13).
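The resampling logic described above can be sketched in a few lines. The following Python example is a simplified illustration and not the R code (caret and pROC) used in this study: it assumes that a panel score has already been computed for each case (in the study, this score came from a logistic regression model that was refit during resampling), it summarizes the bootstrap by the mean AUC rather than by a mean ROC curve, and the function names are hypothetical. Labels are assumed to be coded 1 for AG and 0 for NAG.

```python
import random


def auc(scores, labels):
    """Rank-based AUC: probability that an AG case outscores a NAG case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both AG (1) and NAG (0) cases are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_mean_auc(scores, labels, n_boot=500, permute=False, seed=0):
    """Mean AUC over bootstrap resamples; permute=True shuffles labels to build random models."""
    if len(set(labels)) < 2:
        raise ValueError("both AG (1) and NAG (0) cases are required")
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample cases with replacement
        s = [scores[i] for i in idx]
        y = [labels[i] for i in idx]
        if permute:
            rng.shuffle(y)  # break the link between score and label
        if 0 < sum(y) < n:  # skip resamples that contain only one class
            aucs.append(auc(s, y))
    return sum(aucs) / len(aucs)
```

Comparing `bootstrap_mean_auc(scores, labels)` with `bootstrap_mean_auc(scores, labels, permute=True)` mirrors the comparison of real and random models: a reliable panel should give a mean AUC clearly above the value of about 0.5 expected under label permutation.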