Construction and Validation of a Prognostic Model for the Assessment of Postoperative Overall Survival of Patients with Metaplastic Breast Cancer: Based on a Retrospective Large Data Analysis and Chinese Multicenter Study

Purpose: Surgery is an important treatment for patients with metaplastic breast cancer (MBC). This study used prognostic clinicopathological factors to establish a model for predicting postoperative overall survival (OS) in patients with MBC. Methods: Patients in the Surveillance, Epidemiology, and End Results (SEER) database diagnosed with MBC from 2010 to 2015 were selected and randomized into a SEER training cohort and an internal validation cohort. We identified independent prognostic factors after MBC surgery using multivariate Cox regression analysis and used them to construct a nomogram. The discriminative and predictive power of the nomogram was assessed using Harrell's concordance index (C-index) and calibration plots. Decision curve analysis (DCA) was used to evaluate the clinical usefulness of the model. Results: We randomly divided 1044 patients from the SEER database into a training set (n=732) and a validation set (n=312) in a 7:3 ratio. Multivariate analysis showed that age at diagnosis, T stage, N stage, M stage, tumor size, radiotherapy, and chemotherapy were independent prognostic factors affecting OS. The C-index of the nomogram was higher than that of the 7th edition of the AJCC TNM staging system in both the SEER training set and the validation set. The calibration plots showed that the survival rate predicted by the nomogram was close to the actual survival rate. The DCA showed that the nomogram was more clinically useful than the AJCC staging system.

Conclusions: The nomogram can predict the postoperative OS of patients with MBC and can provide a reference for doctors and patients to establish treatment plans.

Background
Metaplastic breast cancer (MBC) is a rare subtype of breast cancer with unique histopathological and molecular characteristics [1], accounting for less than 1% of all breast malignancies [2]. While most human cancers have only one histological component in a primary tumor, MBC is a heterogeneous cancer with diverse morphology and two or more different tissue types [3], including pure epithelial components (e.g., squamous cells), mesenchymal components (e.g., spindle cells, chondroid, osseous, and myoid cells), or a mixture of the two [4]. In 2000, the World Health Organization (WHO) recognized MBC as a unique pathological entity [5]. Since then, the incidence of MBC has increased, which might be due to an increase in the number of cases, improved recognition of the disease by pathologists, or both [6].
MBC patients are not sensitive to chemotherapy and hormone therapy, and relevant targeted therapy and immunotherapy programs have not yet been developed. Therefore, surgical intervention is still the main treatment [7]. MBC is characterized by strong invasiveness and poor prognosis and is often negative for estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2); its prognosis is worse than that of triple-negative breast cancer, and the survival rate is lower [8,9]. At present, there are no standardized guidelines for MBC, and most clinicians still guide the treatment of MBC according to the scheme for invasive ductal cancer (IDC) [6]. However, compared with IDC, MBC has a larger tumor size, a lower degree of differentiation, a faster growth rate, and less axillary lymph node metastasis [4,10,11]. Since the two diseases have different clinicopathological features, it may be unreasonable to predict the pathological course and prognosis of MBC from existing studies on patients with IDC [12]. Therefore, MBC requires independent treatment and management strategies and its own customized model for personalized evaluation of postoperative survival.
The TNM staging system proposed by the American Joint Committee on Cancer (AJCC) is a common tool used by oncologists to predict disease progression and design treatment strategies. However, considering that many factors affect cancer progression, it may be unreasonable to predict the prognosis of MBC based on TNM staging alone [13]. In this respect, a nomogram makes up for the shortcomings of TNM staging [14]. Due to the low incidence of MBC, most studies on MBC come from a single medical institution or a limited cohort of case reports [12,15]. Therefore, based on data from the Surveillance, Epidemiology, and End Results (SEER) large-scale database in the United States and the clinical and pathological data of patients with MBC from six large hospitals in Shandong Province, we explored the indicators that affect the prognosis of patients with MBC and constructed a model to predict survival. The objective is to provide a reference for informing the treatment of patients with MBC.

Patient selection and data processing
Patient data from 2010-2015 were screened from the SEER database (1975-2016 varying) using SEER*Stat version 8.3.8 (https://seer.cancer.gov/). The obtained data were divided at random into a training set and an internal validation set in a 7:3 ratio. Based on clinical experience, existing literature, and the availability of information in the SEER data, the following variables were selected and evaluated: age, race, marital status (married: married/unmarried or domestic partner; single: unmarried; others: divorced/separated/widowed), grade, tumor size, laterality, T stage, N stage (negative, positive), M stage, subtype, ER status, PR status, HER2 status, type of surgery (lumpectomy, mastectomy), radiation (no/unknown, yes), and chemotherapy (no/unknown, yes). The inclusion criteria for data screening were: (a) women with primary MBC, (b) an MBC diagnosis consistent with the International Classification of Diseases for Oncology, 3rd edition (coded as 8032/3, 8033/3, 8070/3, 8570/3-8572/3, 8575/3, 8980/3, 8982/3), and (c) primary site C50.0-C50.6, C50.8, or C50.9. The exclusion criteria were: (a) incomplete clinicopathological data or unknown records, (b) a diagnosis that was not histologically confirmed, (c) survival time less than 1 month or unknown, (d) bilateral MBC, and (e) no surgery at the primary tumor site.
For external validation, a Chinese multicenter validation set was included, comprising patients diagnosed between January 2010 and December 2020 at six hospitals (Shandong Cancer Hospital, Weihai Municipal Hospital, Rizhao People's Hospital, Linyi Central Hospital, Weifang Traditional Chinese Medicine Hospital, and the Affiliated Hospital of Weifang Medical College). The inclusion and exclusion criteria for the Chinese multicenter data were consistent with those of the SEER dataset. Overall survival (OS) was used as the endpoint of this study and was defined as the time from diagnosis to death from any cause or to the last follow-up. The last day of follow-up was December 1, 2020. For the retrospective analysis of the external validation set, we received approval from the institutional review board of each of the six institutions in Shandong, China. Since this was a retrospective study, patients were not required to sign an informed consent form. All patient data were used anonymously.

Construction of nomogram
In the training set, variables that were significantly related to postoperative OS in univariate analysis were entered into a multivariate Cox proportional hazards regression model. The results are reported as hazard ratios (HRs) with 95% confidence intervals (CIs). Based on the results of the multivariate analysis, a nomogram was constructed to predict the probability of OS at 1, 3, and 5 years in patients with MBC.
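As an illustration of how a fitted Cox model yields the survival probabilities that a nomogram displays, the following minimal Python sketch applies the standard proportional hazards relation S(t|x) = S0(t)^exp(lp), where lp is the sum of coefficient times covariate value and exp(coefficient) is the hazard ratio. The function names, the covariate name, the baseline survival value, and the coefficient are all hypothetical and are not estimates from this study.

```python
import math


def hazard_ratio(coefficient):
    """Hazard ratio for a one-unit increase in a covariate: HR = exp(beta)."""
    return math.exp(coefficient)


def survival_probability(baseline_survival, coefficients, covariates):
    """Predicted survival S(t|x) = S0(t) ** exp(linear predictor).

    baseline_survival: S0(t) at the time point of interest (e.g. 5 years)
    coefficients:      dict of covariate name -> Cox coefficient (log HR)
    covariates:        dict of covariate name -> patient value
    """
    if not 0.0 <= baseline_survival <= 1.0:
        raise ValueError("baseline survival must lie in [0, 1]")
    linear_predictor = sum(
        beta * covariates[name] for name, beta in coefficients.items()
    )
    return baseline_survival ** math.exp(linear_predictor)


if __name__ == "__main__":
    # Hypothetical values for illustration only (not taken from this study).
    betas = {"older_age_group": math.log(2.0)}  # corresponds to HR = 2
    print(hazard_ratio(betas["older_age_group"]))
    print(survival_probability(0.8, betas, {"older_age_group": 1}))
    print(survival_probability(0.8, betas, {"older_age_group": 0}))
```

A nomogram is a graphical rescaling of the same linear predictor: each covariate contributes points in proportion to its coefficient, and the total is mapped to a survival probability at each time point.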

Discrimination and calibration of the nomogram
We used the C-index, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC) to evaluate the discriminative ability of the prediction model. A higher C-index indicates better discriminative performance by the nomogram. Calibration was evaluated using calibration curves: the closer the calibration curve lies to the reference line of perfect agreement, the better the predicted survival matches the observed survival. To test the accuracy and reliability of the nomogram, the SEER internal validation set and the Chinese multicenter external data set were used. In the training and internal validation sets, the nomogram was also compared with the 7th edition AJCC staging model.
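For readers unfamiliar with the C-index, the following minimal Python sketch computes Harrell's C for right-censored data. It is a simplified illustration under stated assumptions (pairs with tied survival times are skipped, and tied risk scores count as half concordant); it is not the routine used in this study, and the example data are hypothetical.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C: among comparable pairs, the fraction in which the patient
    with the higher risk score has the shorter observed survival time.

    times:       follow-up time for each patient
    events:      1 (or True) if death was observed, 0 (or False) if censored
    risk_scores: model output where a higher value means a worse prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ends in an observed
        # death; pairs with tied times are skipped in this simplified version.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical follow-up times (months), event flags, and risk scores.
    months = [5, 10, 15, 20]
    died = [1, 1, 0, 1]
    print(harrell_c_index(months, died, [0.9, 0.7, 0.4, 0.2]))
    print(harrell_c_index(months, died, [0.9, 0.2, 0.4, 0.7]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.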

Survival risk analysis and clinical usefulness of the nomogram
The patients were divided into high-risk and low-risk groups according to the risk score obtained from the nomogram. The Kaplan-Meier method was used to assess the significance of survival differences between the two risk groups. The difference in survival between a triple-negative metaplastic breast cancer (TNMC) group and a non-triple-negative metaplastic breast cancer (NTNMC) group was also analyzed. Whether radiotherapy would bring survival benefits to elderly female patients was also studied. DCA was used to evaluate the clinical usefulness of the nomogram.
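The Kaplan-Meier estimate used for the risk-group comparison can be sketched in a few lines of Python. This is a minimal illustration of the product-limit estimator only, with hypothetical example data; it does not include the significance test between groups, and it is not the code used in this study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 (or True) if death was observed, 0 (or False) if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times (months) and event flags.
    for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Computing one curve per risk group and plotting them together gives the kind of comparison shown in the Kaplan-Meier figures.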

Statistical analysis
All statistical analyses were performed using SPSS 25.0 (SPSS Inc., Chicago, IL, USA) and R software (version 4.0.3; http://www.r-project.org/). X-tile (version 3.6.1; Yale University, New Haven, CT, USA) was used to determine the best cut-off values for age and tumor size. Categorical variables were summarized as frequencies and proportions. Cox proportional hazards regression analysis was performed using SPSS. The nomogram, ROC curves, and calibration curves were generated using the R packages "foreign," "survival," and "rms"; the DCA was performed in R using "stdca.R". A P value <0.05 was considered statistically significant.

Patient characteristics
As shown in Figure 1, 1044 patients with postoperative MBC were screened and divided into a training set (n=732) and an internal validation set (n=312). As an external validation dataset, 40 eligible patients from a Chinese multicenter study were included. The best cut-off for age was 72 years, and the best cut-offs for tumor size were 4.2 and 6.7 cm (Fig. 2). Table 1 summarizes the demographic and clinicopathological characteristics of the training, internal validation, and complete cohorts. In the training set, the median survival time was 31 months (interquartile range [IQR]: 18-54 months), and the median patient age was 61 years (IQR: 51-72 years). Among the patients, 47.4% were in the T2 stage, 76.9% were negative for lymph node metastasis, and 69.2% were diagnosed with TNMC. Table 1 also shows the data distribution for the multiple centers in China. Among these patients, the median survival time was 41.5 months (IQR: 13-56.75 months), the median age was 52.5 years (IQR: 46-61 years), 47.5% were at the T2 stage, 85.0% were negative for lymph node metastasis, and 80% had a mastectomy.
Table 2 summarizes the results of univariate and multivariate Cox regression analyses in the training cohort. Univariate regression analysis showed that age at diagnosis, tumor size, marital status, grade, T stage, N stage, M stage, radiotherapy, chemotherapy, and type of surgery were significantly correlated with OS, whereas race, laterality, subtype, ER status, PR status, and HER2 status were not. Multivariate analysis identified age at diagnosis, T stage, N stage, M stage, radiation, chemotherapy, and tumor size as independent predictors of survival. Based on these prognostic factors from the training cohort, a nomogram was developed to predict the 1-, 3-, and 5-year survival probability of patients with MBC (Figure 3).

Verification and evaluation of nomogram
The nomogram showed good discriminative performance. In the SEER training and validation cohorts, the C-index of the nomogram was 0.803 and 0.769, respectively, which was higher than that of the AJCC staging system (0.752 and 0.717, respectively). In addition, the AUCs confirmed the superiority of the predictive model for predicting 1-year, 3-year, and 5-year survival in the training cohort (nomogram vs. AJCC, 0.860 vs. 0.790; 0.811 vs. 0.777; 0.827 vs. 0.785, respectively) and the validation set (nomogram vs. AJCC, 0.721 vs. 0.689; 0.797 vs. 0.750; 0.765 vs. 0.734, respectively) (Figure 4a-f). Thus, in both the training set and the internal validation set, discrimination and survival prediction were improved compared with the 7th edition of the AJCC TNM staging system. In the Chinese validation set, the C-index was 0.857 (95% CI, 0.726-0.988), and the AUC values for 1 year, 3 years, and 5 years were 0.813, 0.858, and 0.881, respectively (Figure 5a-c). The external validation set therefore also supports the discriminative and predictive ability of the nomogram. In addition, the calibration curves showed good agreement between the predicted values of the nomogram and the actual observed results (Figure 6a-c).

Survival risk classification
The Kaplan-Meier curves showed significant differences in OS between the risk subgroups of the SEER training set (P<0.001, Fig. 7a), which was confirmed in the two validation data sets (SEER validation set, P<0.001, Fig. 7b; multicenter validation set, P=0.0158, Fig. 7c). Among all 1044 patients, the nomogram showed great potential to differentiate between high- and low-risk groups (P<0.001, Fig. 7d). In addition, in the training cohort, a subgroup analysis comparing triple-negative and non-triple-negative MBC showed no significant difference in survival (P=0.338, Fig. 8a). Radiation therapy was found to provide a significant survival benefit in older women (P=0.0052, Fig. 8b).

Clinical application of the nomogram
The 1-, 3-, and 5-year DCA curves of the nomogram showed greater net benefits than either the "full treatment" or the "no treatment" strategy and performed better than the AJCC stage model, which demonstrates the potential clinical applicability of the nomogram (Figure 9a-c).
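The net benefit plotted in a decision curve can be illustrated with a minimal Python sketch. It uses the usual definition for a binary outcome, net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability. This is a simplification: the analysis in this study involves censored survival data, which this sketch does not handle, and the function names and example values are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    predicted_risk: predicted probability of the event for each patient
    outcome:        1 (or True) if the event occurred, 0 (or False) otherwise
    threshold:      threshold probability pt, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcome)
    true_pos = sum(
        1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y
    )
    false_pos = sum(
        1 for p, y in zip(predicted_risk, outcome) if p >= threshold and not y
    )
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Net benefit of the reference strategy in which every patient is treated."""
    return net_benefit([1.0] * len(outcome), outcome, threshold)


if __name__ == "__main__":
    # Hypothetical predicted risks and observed outcomes.
    risks = [0.9, 0.8, 0.3, 0.1]
    events = [1, 0, 1, 0]
    print(net_benefit(risks, events, 0.2))
    print(net_benefit_treat_all(events, 0.2))
```

The "no treatment" strategy has a net benefit of zero at every threshold, so a model is useful over the range of thresholds where its curve lies above both reference strategies.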

Discussion
At present, the most useful treatment for MBC is surgical resection. Due to the low incidence of MBC, it is difficult to collect clinical and pathological data; therefore, we constructed a nomogram for predicting postoperative survival of MBC patients based on data from a large-scale database in the United States and verified its performance using data from multiple Chinese centers. The AJCC staging system takes into account only tumor size and location, local lymph node invasion, and distant metastases [6], and its ability to predict postoperative survival in MBC patients is poor. The nomogram combines factors including age, tumor size, and treatment information to evaluate the prognosis of patients. As far as we know, this is the first nomogram for predicting the postoperative survival of patients with MBC, and it can be used to provide patients with personalized services.
Compared with IDC, the OS rate of MBC was lower [16]. In this study, seven factors were identified through univariate and multivariate analyses as significantly related to the OS of patients with MBC: age, T stage, N stage, M stage, tumor size, chemotherapy, and radiotherapy. T stage had the greatest impact on the survival of patients with MBC, and T2 was the most common stage. Many previous studies have also reported that the T stage of patients with MBC is higher than that of patients with invasive ductal carcinoma. Compared with IDC, MBC metastasizes mainly via the hematogenous route. In addition, MBC is more likely to metastasize to the lung and central nervous system, while IDC more often metastasizes to bone [17]. There are great differences between MBC and IDC in clinicopathological and biological aspects [6]. The expression of HG and Ki-67 in MBC is higher than that in IDC [4]. Therefore, it might not be appropriate to treat patients with MBC according to a management plan for IDC. This emphasizes the importance of the nomogram for informing personalized services for patients with MBC after surgery.
In this study, the type of surgery was not an independent predictor of prognosis in MBC patients, which is consistent with previously reported results [18]. Whether the surgery is lumpectomy or mastectomy, postoperative radiotherapy can bring survival benefits to patients with MBC [19]. Li et al. showed that even elderly women (≥60 years old) can benefit from radiotherapy [20]. We found that even older female patients (≥73 years old) obtained a significant survival benefit from postoperative radiotherapy (P=0.0052).
However, because these studies had small sample sizes or were retrospective, caution is still needed when treating patients with reference to these studies.The development of standardized radiotherapy guidelines based on prospective studies with sufficient sample size would be of great clinical value.
The differences between molecular typing of subtypes were not found to be statistically significant.
When subtypes were grouped into TNMC and NTNMC groups, there was no significant difference in survival between the two groups (P=0.338). Previous studies have also shown this result [7]. In a retrospective analysis of 51 MBC patients treated at Seoul National University Hospital, Kyu-Hyoung et al. found that triple-negative MBC is a favorable prognostic marker for patients with MBC. Their study also mentioned that TNMC is more prone to distant metastasis than NTNMC; however, after distant metastasis, TNMC progresses more slowly, which may have caused the difference between the two subgroups. Since that study had a small sample from a single institution, the mechanism behind this result needs to be further explored [21].
After screening, eight histological types of MBC were included: metaplastic carcinoma, carcinosarcoma, squamous cell carcinoma, spindle cell carcinoma, sarcomatoid carcinoma, fibromatosis-like metaplastic carcinoma, low-grade adenosquamous carcinoma, and metaplastic carcinoma with chondroid or osseous differentiation [22,23]. Because not all histological types were represented in the training set after the 7:3 split, histology was analyzed in the whole cohort. A univariate Cox regression analysis showed that histology was not a risk factor for OS after surgery in patients with MBC (P=0.139). Previous studies also found no significant correlation between histology and prognosis [15,18]. Univariate analysis showed that marital status is a risk factor for the prognosis of MBC. Previous studies found that social environment also has an impact on this disease, which emphasizes that we might need to consider the effect of non-biologic stressors on the disease in the future [7].
This study found that hormone receptor status (ER/PR) is not a risk factor affecting the survival of patients with MBC, which is consistent with the results of previous studies. Hormone therapy might not provide survival benefits to patients with MBC [6,16]. Although chemotherapy was found to provide a survival benefit in patients with MBC after surgery [24,25], its impact was small. Studies have reported that most MBCs are resistant to chemotherapy, which has been linked to cancer cells undergoing the epithelial-to-mesenchymal transition and showing stem cell-like characteristics [26-29]. Joneja et al. used the first-generation gene sequencing method to compare MBC, triple-negative breast cancer, HER2-positive breast cancer, and hormone-positive breast cancer and found that the expression rate of PD-L1 in MBC was higher than that of the other three cancer subtypes (46% vs. 9%, 6%, 6%, respectively; P<0.001) [30]. Although the expression mechanism is not yet clear, this provides a theoretical basis for immunotherapy of MBC [31]. Many mutated genes have been detected in MBC, and the most frequently mutated genes, TP53 and PIK3CA, suggest a direction for investigating targeted therapy [31-33].
Based on the results of the analysis, we constructed a nomogram to predict the postoperative survival of patients with MBC. The C-index and ROC curves of the nomogram performed better than those of the 7th edition of the AJCC TNM staging system. The calibration curves show that the 1-year, 3-year, and 5-year survival rates predicted by the nomogram are similar to the actual survival rates. The nomogram also has good clinical applicability. However, due to the rarity of the disease and the small amount of data in the external validation set, the external data set could not be used to confirm that the nomogram performs better than AJCC staging. Nonetheless, since the 40 cases of MBC from large hospitals in China reflect real-world clinical practice, the results suggest that the nomogram may be applicable in practice.
The study has some limitations. First, some patients were excluded due to incomplete information when the data were screened, and some selection bias may exist. Second, some important parameters and specific information related to prognosis, such as family history of breast cancer, vascular invasion, chemotherapy regimens, and targeted therapies, are missing from the SEER database. Third, this is a retrospective study based on SEER and needs to be validated in a prospective clinical trial.

Figures

Figure 1 Flow
Figure 4 Comparison
Figure 9 Decision

Table 2 summarizes the results of univariate and multivariate Cox regression analyses in the training cohort. Univariate regression analysis showed that age at diagnosis, tumor size, marital status, grade, T stage, N stage, M stage, radiotherapy, chemotherapy, and type of surgery were significantly correlated with OS.