Collection of samples and patients
After exclusion of samples without adequate clinical information, a total of 341 patients from TCGA were included in the present study. The mean age was 61.5 years and the median follow-up was 2.68 years. Of the 341 patients, 30 had low Gleason score (≤6) PCa and 140 had high Gleason score (≥8) PCa. All patients had undergone RP and been pathologically diagnosed with BCR, and they were assigned to a training set (n=169) and a testing set (n=172).
Prognostic proteins were screened by univariate Cox regression analysis to identify proteins correlated with BCR. With cut-offs of P<0.05 and |log2FC| >2, a total of 21 proteins significantly related to BCR were identified. As shown in Supplementary Figure S1, KM analyses supported the discriminative power of the 21 selected proteins for further analysis (p<0.05).
Development and validation of the protein signature
LASSO Cox regression analysis was conducted in the training set to construct a prognostic model; from the 21 proteins identified by the initial univariate Cox regression, it selected 5 proteins (alpha-Catenin, BRD4, DJ1, SMAD1 and YB1) (Figure 1). A risk score for BCR was derived from the levels of the five selected proteins weighted by their regression coefficients, as follows: risk score = (-2.771 × level of alpha-Catenin) + (1.577 × level of BRD4) + (-2.239 × level of DJ1) + (2.152 × level of SMAD1) + (2.428 × level of YB1). Among these five prognostic proteins, three (BRD4, SMAD1 and YB1) had positive coefficients, suggesting that high expression levels were associated with a higher risk of BCR. Two (alpha-Catenin and DJ1) had negative coefficients, indicating that high expression levels were associated with a lower risk of BCR.
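The risk score formula can be illustrated with a minimal Python sketch. The coefficients are those reported above. The function name `risk_score`, the dictionary keys and the example protein levels are hypothetical and serve only to show the arithmetic; they are not taken from the study data or code.

```python
# Regression coefficients as reported in the text.
# The dictionary keys are illustrative labels, not identifiers from the study.
COEFFICIENTS = {
    "alpha-Catenin": -2.771,
    "BRD4": 1.577,
    "DJ1": -2.239,
    "SMAD1": 2.152,
    "YB1": 2.428,
}


def risk_score(levels):
    """Return the coefficient-weighted sum of protein levels for one patient."""
    missing = set(COEFFICIENTS) - set(levels)
    if missing:
        raise KeyError(f"missing protein levels: {sorted(missing)}")
    return sum(coef * levels[name] for name, coef in COEFFICIENTS.items())


# Hypothetical protein levels, chosen only to make the arithmetic easy to follow.
example_levels = {
    "alpha-Catenin": 0.0,
    "BRD4": 1.0,
    "DJ1": 0.0,
    "SMAD1": 1.0,
    "YB1": 0.0,
}
example_score = risk_score(example_levels)
print(example_score)
```

With these made-up levels only BRD4 and SMAD1 contribute, so the score is the sum of their two coefficients.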
To investigate the predictive performance of the signature, patients were assigned to high- and low-risk groups using the median risk score as the cut-off value. Patients with a risk score of 1.804 or lower were assigned to the low-risk group and all others to the high-risk group (Figure 2A). A higher risk score indicated a worse prognosis with respect to BCR; accordingly, higher levels of proteins with positive coefficients yielded higher risk scores. The distribution of patient survival status and protein levels was then analyzed. Compared with the low-risk group, the high-risk group had significantly more deaths (Figure 2B), and the levels of proteins with positive coefficients were higher in the high-risk group (Figure 2C). Patients with high BCR risk tended to express high-risk proteins, whereas those with low BCR risk tended to express protective proteins.
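The grouping rule can be sketched as follows. The cut-off of 1.804 is the value reported above. The function name `stratify` and the example scores are hypothetical. The sketch assumes that a score equal to the cut-off belongs to the low-risk group, following the "1.804 or lower" wording.

```python
from statistics import median


def stratify(scores, cutoff=None):
    """Label each risk score 'low' (score <= cutoff) or 'high' (score > cutoff).

    If no cutoff is given, the median of the supplied scores is used.
    """
    if cutoff is None:
        cutoff = median(scores)
    return ["low" if score <= cutoff else "high" for score in scores]


# Hypothetical risk scores; 1.804 is the cut-off reported in the text.
example_scores = [0.4, 1.2, 1.804, 2.5, 3.1]
example_groups = stratify(example_scores, cutoff=1.804)
print(example_groups)
```

Passing the cut-off explicitly applies a threshold fixed in one cohort to another cohort, whereas omitting it recomputes the median from the scores supplied.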
Kaplan-Meier analysis demonstrated that patients with higher BCR risk had significantly worse survival than those with lower BCR risk (p<0.0001) (Figure 3). The AUCs of the 5-protein signature at 1, 3 and 5 years were 0.691, 0.797 and 0.808 in the training set and 0.74, 0.739 and 0.82 in the test set, respectively, suggesting that the prognostic signature had good specificity and sensitivity (Figure 3). The C-index of the signature was 0.679 (95%CI: 0.599 to 0.759) in the training set, 0.704 (95%CI: 0.613 to 0.794) in the test set and 0.693 (95%CI: 0.634 to 0.752) in the entire set.
In univariate and multivariate Cox proportional hazards regression analyses, the 5-protein signature was confirmed to predict BCR-free survival independently of other clinicopathological factors, including age, Gleason grade, T stage, N status, PSA and residual tumor (Table 1). KM survival analysis also indicated the discriminative capability of the signature across subgroups defined by different clinical prognostic features (Supplementary Figure S2). To compare the predictive accuracy of the signature with that of the other clinicopathological factors, we calculated the AUC of the ROC curve, which showed that the 5-protein signature had significantly better prognostic performance than any other clinical factor in the training, test and entire sets (Figure 4).
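For readers unfamiliar with the C-index, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is a simplified illustration: pairs with tied times are ignored and no confidence interval is computed. It is not the software used in the study, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Simplified Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i has an observed event and a
    strictly shorter follow-up time than patient j. The pair is concordant
    when patient i also has the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up times, event indicators (1 = event, 0 = censored)
# and risk scores. All values are hypothetical.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 1, 0, 1]
toy_scores = [4.0, 3.0, 2.0, 1.0]
toy_c = concordance_index(toy_times, toy_events, toy_scores)
print(toy_c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of roughly 0.68 to 0.70.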
Identification and validation of the nomogram
The nomogram was constructed in the entire set by multivariate Cox regression analysis of the 5 proteins together with preset clinicopathological covariates, including age, Gleason grade, T stage, N status, PSA and residual tumor. It demonstrated good prognostic performance for BCR in PCa patients (Figure 5A). Calibration plots for 3-, 5- and 10-year BCR overall survival (OS) showed good agreement between the nomogram predictions and the actual outcomes (Figure 5B). The C-index of the nomogram was 0.777 (95%CI: 0.699 to 0.855) in the training set, 0.771 (95%CI: 0.691 to 0.851) in the test set and 0.764 (95%CI: 0.701 to 0.827) in the entire set. Finally, net benefit curves showed that the nomogram outperformed the signature and the other clinicopathological factors (Figure 5C).
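The quantity plotted in a net benefit curve can be sketched with the standard decision-curve formula, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The sketch below assumes a binary outcome at a fixed time horizon and ignores censoring, which is a simplification; the text does not state how censoring was handled in the study. The function name and the toy data are hypothetical.

```python
def net_benefit(probabilities, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    Uses TP/n - (FP/n) * threshold / (1 - threshold) for a binary outcome
    (1 = event, 0 = no event). Censoring is not handled in this sketch.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    flagged = [(p >= threshold, y) for p, y in zip(probabilities, outcomes)]
    true_pos = sum(1 for positive, y in flagged if positive and y == 1)
    false_pos = sum(1 for positive, y in flagged if positive and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and observed outcomes.
toy_probs = [0.9, 0.8, 0.2, 0.1]
toy_outcomes = [1, 0, 1, 0]
for pt in (0.2, 0.5):
    print(pt, net_benefit(toy_probs, toy_outcomes, pt))
```

Evaluating this function over a range of thresholds for each model gives one curve per model; the model whose curve lies highest over the clinically relevant thresholds is preferred.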
Validation and functional characteristics of the 5 proteins
Pearson correlation coefficients showed positive correlations between the 5 proteins and their corresponding genes (Supplementary Figure S3). GO enrichment analysis indicated that these protein-related genes were enriched in immune- or cell differentiation-related GO terms (Figure 6A), suggesting that the effect of the prognostic proteins on cancer might be related to the tumor microenvironment. Additionally, GSEA was conducted in the TCGA database to identify biological signaling pathways related to the five proteins that differed between the high- and low-risk groups (Figure 6B). Among significant protein sets (FDR<0.05 and p<0.05), five pathways were identified: 1) base excision repair, 2) DNA replication, 3) nucleotide excision repair, 4) pyrimidine metabolism, and 5) spliceosome. Finally, a Sankey diagram showed the associations between co-expressed proteins and the 5-protein signature, which may interact with each other through certain molecular mechanisms (Figure 6C).
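The protein-gene correlation step at the start of this subsection relies on the Pearson correlation coefficient, which can be sketched in plain Python as follows. The function name and the paired values are hypothetical and stand in for a protein level and the expression of its corresponding gene across patients.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return covariance / sqrt(var_x * var_y)


# Hypothetical paired measurements for one protein and its gene.
protein_levels = [1.0, 2.0, 3.0, 4.0]
gene_expression = [2.0, 4.0, 6.0, 8.0]
toy_r = pearson_r(protein_levels, gene_expression)
print(toy_r)
```

A coefficient above zero indicates that protein level and gene expression rise together, which is the sense of "positive correlation" used above.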