Survival-related ADCP genes
To investigate the prognostic significance of ADCP genes, a gene signature was established using data from cohort A. The workflow for constructing the gene signature is depicted in Supplementary Table 1. Univariate Cox regression identified 130 prognostic genes (p < 0.01; Supplementary Data Table S1).
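The univariate screening step can be illustrated with a minimal sketch. It assumes a one-covariate Cox model fitted by Newton-Raphson with Breslow handling of ties and a Wald test; the function names (`cox_univariate`, `screen_genes`), the gene labels, and all values are hypothetical and are not taken from the study, whose actual software is not described here.

```python
import math


def cox_univariate(times, events, x, max_iter=50, tol=1e-9):
    """Fit a one-covariate Cox model; return (beta, standard error, Wald p-value).

    Assumes the covariate varies within the risk sets, so the information is positive.
    """
    n = len(times)

    def loglik_grad_info(beta):
        ll = grad = info = 0.0
        for i in range(n):
            if not events[i]:
                continue
            # Risk set: every subject still under observation at times[i].
            s0 = s1 = s2 = 0.0
            for j in range(n):
                if times[j] >= times[i]:
                    w = math.exp(beta * x[j])
                    s0 += w
                    s1 += w * x[j]
                    s2 += w * x[j] * x[j]
            mean = s1 / s0
            ll += beta * x[i] - math.log(s0)
            grad += x[i] - mean
            info += s2 / s0 - mean * mean
        return ll, grad, info

    beta = 0.0
    ll, grad, info = loglik_grad_info(beta)
    for _ in range(max_iter):
        if abs(grad) < tol or info <= 0:
            break
        step = grad / info
        # Halve the step until the log partial likelihood no longer decreases.
        while True:
            new_ll, new_grad, new_info = loglik_grad_info(beta + step)
            if new_ll >= ll or abs(step) < 1e-12:
                break
            step /= 2
        beta += step
        ll, grad, info = new_ll, new_grad, new_info
    se = 1 / math.sqrt(info)
    p = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided Wald test
    return beta, se, p


def screen_genes(expr_by_gene, times, events, alpha=0.01):
    """Keep genes whose univariate Cox p-value is below alpha."""
    return [g for g, x in expr_by_gene.items()
            if cox_univariate(times, events, x)[2] < alpha]


# Hypothetical toy cohort: follow-up times, event flags (1 = death), expression.
times = [2, 3, 4, 5, 6, 8, 9, 11, 12, 14]
events = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0]
expr = {
    "GENE_X": [2.9, 2.1, 2.5, 1.2, 1.9, 1.6, 0.8, 1.1, 1.4, 0.5],
    "GENE_Y": [1.0, 1.8, 0.7, 1.5, 1.2, 0.9, 1.6, 1.1, 1.3, 1.4],
}
for gene, values in expr.items():
    beta, se, p = cox_univariate(times, events, values)
    print(gene, "HR =", round(math.exp(beta), 3), "p =", round(p, 4))
print("Genes passing p < 0.01:", screen_genes(expr, times, events))
```

In practice, an established survival library would normally be used in place of this hand-written fit; the sketch only makes the screening criterion explicit.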
Consensus clustering analysis of ADCP genes
Based on the expression levels of survival-related ADCP genes in the TCGA database, two distinct regulatory patterns were identified using an unsupervised clustering method. A total of 515 cases were classified into ADCP-related cluster 1 and 491 cases into ADCP-related cluster 2 (Fig. 1A). PCA demonstrated that the patients could be separated into two distinct groups, providing further evidence for the existence of two significantly different subtypes (Fig. 1B). To evaluate the survival difference between the two clusters, patients from cohort A (TCGA) with complete survival information were analyzed. Subsequently, the relationship between these two clusters and various clinical features was analyzed (Fig. 1D). Cluster 1 showed better survival than cluster 2. In cluster 1, the proportions of patients aged under 60 and of M0, N0-1, T1-2, and stage I-II cases were higher. These findings suggest that ADCP genes may affect tumor development through some potential mechanisms. The heat map depicts the transcriptomic characteristics of the differentially expressed ADCP genes in the two ADCP subtypes (Fig. 1E).
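A survival comparison of this kind is typically visualized with Kaplan-Meier curves. The following minimal sketch shows the product-limit estimator for a single group; the function name `kaplan_meier` and the toy follow-up data are hypothetical and do not come from cohort A.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator)."""
    curve = []
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical follow-up times (months) and event flags (1 = death, 0 = censored).
cluster_times = [5, 8, 12, 12, 15]
cluster_events = [1, 0, 1, 0, 0]
for t, s in kaplan_meier(cluster_times, cluster_events):
    print(f"t = {t}: S(t) = {s:.3f}")
```

Applying the same function to each cluster separately gives the two curves whose difference would then be assessed, for example with a log-rank test.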
The immune landscape of ADCP subtypes
To explore differences in pathway enrichment between the two ADCP clusters in cohort A, GSEA was performed, and distinct immune infiltration patterns were identified in the two subtypes. The enrichment histogram showed that cluster 1 was significantly enriched in hormone metabolism pathways (early and late estrogen response) and inflammatory signaling pathways (allograft rejection, inflammatory response, KRAS signaling, IL6-JAK-STAT3 signaling, and TNFα signaling via NF-κB) (Fig. 2A). GSEA also confirmed differences in immune pathways between the ADCP clusters. DEGs with higher expression in cluster 1 were significantly enriched in early estrogen response, allograft rejection, inflammatory response, interferon response, KRAS signaling, and TNFα signaling via NF-κB (Fig. 2B). Given the strong correlation between ADCP subtypes and immune activity, the TME of the two clusters in cohort A was investigated (Fig. 2C). Cluster 1 was characterized by high infiltration of resting natural killer (NK) cells and macrophages, while cluster 2 exhibited elevated infiltration of B cells, plasma cells, T cells, NK cells, dendritic cells, mast cells, and neutrophils. Based on the TCGA expression profiles, the stromal score, immune score, and ESTIMATE score of malignant tumor tissues were calculated with the ESTIMATE algorithm. ESTIMATE produces a stromal score that measures the presence of tumor-associated stroma and an immune score that reflects the level of immune cell infiltration. These scores are combined into an index called the ESTIMATE score, which provides a comprehensive estimate of tumor purity. Compared with cluster 1, samples in cluster 2 showed significantly higher ESTIMATE scores (Wilcoxon test, P < 0.05, Fig. 2D). This trend was also observed for the stromal and immune scores (Wilcoxon test, P < 0.05). Additionally, we studied the association between the two subtypes and major histocompatibility complex (MHC) molecules and T cell stimulators. Apart from HLA-C, the expression levels of MHC molecules and T cell stimulators tended to be higher in cluster 2 (Fig. 3E-F).
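The score comparison can be sketched as follows. The ESTIMATE score is taken as the sum of the stromal and immune scores, and the two clusters are compared with a Wilcoxon rank-sum test. The test here uses a normal approximation without continuity or tie correction, which is an assumption of this sketch; the function names and all score values are hypothetical.

```python
import math


def estimate_score(stromal, immune):
    """ESTIMATE score as the sum of the stromal and immune scores."""
    return stromal + immune


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test; returns (U statistic for a, p-value).

    Normal approximation, average ranks for ties, no tie correction.
    """
    pooled = sorted((v, g) for g, grp in enumerate((a, b)) for v in grp)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical (stromal, immune) score pairs per sample; not study data.
cluster_1 = [(-300.0, 150.0), (-120.0, 400.0), (50.0, 310.0)]
cluster_2 = [(200.0, 900.0), (340.0, 1200.0), (410.0, 760.0)]
scores_1 = [estimate_score(s, i) for s, i in cluster_1]
scores_2 = [estimate_score(s, i) for s, i in cluster_2]
u_stat, p_value = rank_sum_test(scores_1, scores_2)
print("U =", u_stat, "p =", round(p_value, 4))
```

With only a few samples per group, an exact test would be preferable to the normal approximation; the approximation is reasonable at cohort sizes such as those reported above.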
Identification of key ADCP genes
Using the limma algorithm, a total of 432 differentially expressed genes (DEGs) between group 1 and group 2 were identified under the thresholds of FDR q-value < 0.01 and |logFC| > 1 (Fig. 3A, Supplementary Data Table S2). Prognostic models were constructed for 1006 BC patients with OS information in cohort A. LASSO Cox regression analysis was used to determine the best prognostic features among the 130 survival-related ADCP genes. After the variables were entered into the LASSO Cox regression model with the smallest lambda, 39 ADCP-related genes were chosen to construct the ADCP-related risk scoring model. Three key ADCP genes (DEFB1, SIAH2 and SYT1) were identified by overlapping the DEGs from cohort A with the model ADCP genes (Fig. 3B). The relationships between the expression levels of the three ADCP-related signature genes and OS are presented in the forest plot (Fig. 3D). The expression of each of the three key ADCP genes was significantly associated with survival (Fig. 3E).
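The filtering and overlap steps amount to a threshold filter followed by a set intersection, as the sketch below shows. The gene labels, statistics, and the function name `filter_degs` are hypothetical placeholders, not results from the study.

```python
def filter_degs(results, fdr_cutoff=0.01, lfc_cutoff=1.0):
    """Keep genes with FDR q-value < fdr_cutoff and |logFC| > lfc_cutoff."""
    return {gene for gene, (logfc, fdr) in results.items()
            if fdr < fdr_cutoff and abs(logfc) > lfc_cutoff}


# Hypothetical differential-expression table: gene -> (logFC, FDR q-value).
de_results = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.7, 0.004),
    "GENE_C": (0.4, 0.0001),  # significant but small effect: excluded
    "GENE_D": (1.9, 0.2),     # large effect but not significant: excluded
}
degs = filter_degs(de_results)

# Hypothetical genes retained by the LASSO Cox model.
model_genes = {"GENE_B", "GENE_E"}

key_genes = degs & model_genes  # genes present in both lists
print(sorted(degs), sorted(key_genes))
```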
Multidimensional analysis of key ADCP genes
We studied the relationship between CNV and immune infiltration in BRCA, and also explored the association between gene methylation and immune infiltration. CNV and methylation of the key ADCP genes were closely associated with the infiltration of key immune cells, including T and B cells (Fig. 4A and B). The expression levels of the three key ADCP genes were then analyzed in cohort A. In this data set, only DEFB1 was down-regulated in group 1, while SIAH2 and SYT1 were up-regulated in group 2 (Wilcoxon test, P > 0.05). Only DEFB1 was up-regulated in the high-risk group, while SIAH2 and SYT1 were down-regulated in the low-risk group (Wilcoxon test, P > 0.05). Only DEFB1 was up-regulated in the tumor group; in contrast, SIAH2 and SYT1 were down-regulated in the low-risk group (Wilcoxon test, P > 0.05, Fig. 4C). Among the three key ADCP genes, SIAH2 had the highest frequency of copy number alterations (CNA). Specifically, deletions were the predominant type of CNA (Fig. 4D). Methylation analysis showed that the beta value of SIAH2 was higher in the tumor group than in the normal group, whereas the opposite was observed for DEFB1 (Fig. 6E). Gene expression is generally negatively correlated with methylation level, whereas copy number gain tends to increase gene expression. In summary, CNA and methylation may be important reasons for the up-regulation of SYT1 expression in BC.
ADCP-related signatures for the prognostic prediction of BC
The LASSO algorithm was used to establish a risk model. In total, 39 prognosis-related genes were identified and used to construct risk-score models in the training set (cohort A, n = 1006) and the test set (cohort B, n = 1980). Survival analysis showed that a higher risk score was associated with a lower survival rate in both the training and test sets (p < 0.0001; Fig. 5A-B). A time-dependent ROC curve was generated to assess the sensitivity of the model. The 3-, 5-, and 10-year AUCs were 0.743, 0.754, and 0.79 in the training set (Fig. 5C) and 0.546, 0.678, and 0.716 in the test set (Fig. 5C).
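A risk score of this kind is commonly computed as a coefficient-weighted sum of gene expression values, with patients then divided into high- and low-risk groups. The sketch below assumes this linear form and a median cutoff; the text does not state the cutoff, so the median split, the function names, the coefficients, and the expression values are all hypothetical.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the model genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical LASSO Cox coefficients and expression values (not study data).
coefs = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 2.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 0.0},
    "P4": {"GENE_A": 3.0, "GENE_B": 2.0},
}
scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

When a model is validated in an independent cohort, the coefficients are kept fixed and only the expression values change, so the same `risk_score` call applies to both training and test patients.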
The ADCP group served as an independent prognostic factor in BC
Since ADCP genes are significantly associated with high malignancy and advanced tumors in BC, univariate and multivariate Cox regression analyses were carried out to determine the prognostic significance of ADCP genes for BC patients. ADCP group, age, TNM stage, stage, and risk score were included as covariates. The results showed that ADCP group, age, TNM stage, stage, and risk score were independent prognostic factors for BC patients (Fig. 6A and 6B). We constructed a nomogram by combining the independent prognostic factors, which serves as a clinically relevant quantitative tool for predicting the mortality of individual BC patients (Fig. 6C). Based on the concordance index (C-index) curves of the different variables over time in the TCGA cohort, the nomogram performed better than any single factor (Fig. 6D). The scores of the prognostic parameters are summed to assign a total score to each patient; the higher the total score, the worse the prognosis. The nomogram showed performance similar to that of the ideal model (Fig. 6E).
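The concordance index used to compare the nomogram with single factors can be illustrated with a minimal sketch of Harrell's C-index. The function name `concordance_index` and the toy values are hypothetical; how the time-dependent C-index curves in the study were computed is not specified here.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index: the fraction of comparable pairs in which the patient
    with the shorter observed survival has the higher predicted risk score.
    Tied scores count as half concordant."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical follow-up times, event flags, and nomogram total points.
follow_up = [2, 4, 6, 8]
died = [1, 1, 0, 1]
total_points = [4, 3, 2, 1]  # higher points = worse predicted prognosis
print(concordance_index(follow_up, died, total_points))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a higher curve indicates better discrimination.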
ADCP groups for the prediction of the chemotherapeutic response
We evaluated the chemotherapy response and drug resistance of patients in the ADCP groups. Figure 7 shows the sensitivity of the two ADCP subtypes to six anticancer drugs (AZD8055, A.443654, AMG.706, AKT.inhibitor.VIII, ABT.888, ATRA). The IC50 levels in group 2 were higher than those in group 1 (Fig. 7A-F), and small-molecule drugs with potential therapeutic effects on BC could be identified from the drug sensitivity results. The three-dimensional structures of AZD8055, A.443654, AMG.706, AKT.inhibitor.VIII, ABT.888 and ATRA were obtained from PubChem (Fig. 7G).
Gene expression level verification via qRT-PCR and WB
We verified the mRNA and protein levels of DEFB1, SIAH2 and SYT1 in BC cell lines and adjacent cell lines by qRT-PCR and WB. The qRT-PCR results are shown in Fig. 8A. Compared with the normal control, the DEFB1 level in the BC cell line MDA-MB-453 was lower, whereas the SIAH2 and SYT1 levels were higher. The protein expression levels of the three key ADCP genes determined by WB were consistent with the qRT-PCR results (Fig. 8B-C).