Establishing a Prediction Model for the Recurrence Time of Hormone Receptor-Positive/Human Epidermal Growth Factor Receptor 2-Negative Breast Cancer

Objectives This study aimed to develop a model to predict the recurrence time of patients with hormone receptor-positive (HR+)/human epidermal growth factor receptor 2-negative (HER2-) breast cancer. Methods We included female patients with HR+/HER2- metastatic breast cancer (MBC) who presented to the Department of Breast Oncology at Peking University Cancer Hospital. Data were collected from medical records. Patients were divided into an early recurrence group and a late recurrence group according to disease-free survival (DFS). Predictors of recurrence time were identified, and a nomogram was developed and validated using the concordance index (C-index), the area under the curve (AUC), and calibration plots. Results Median DFS was 50.0 months. A total of 382 patients (59.8%) presented with early recurrence (DFS ≤5 years), and 257 patients (40.2%) presented with late recurrence (DFS >5 years). A nomogram based on potentially associated clinicopathological factors was developed, and validation showed that it was well calibrated for predicting recurrence time (AUC=0.703, C-index=0.697). Conclusions The nomogram can predict the risk of late recurrence. Prospectively designed studies are needed to further validate our model.


Introduction
The incidence of breast cancer has increased rapidly in China; a paper published in the Chinese Journal of Cancer Research reported an age-standardized incidence rate of 36.1/100,000 in 2018 (1). Approximately 30 percent of patients with early breast cancer (BC) eventually experience recurrence (2, 3). Clinicopathological characteristics, such as hormone receptor (HR) status and the number of metastatic lymph nodes, are used to predict recurrence risk as well as recurrence time. Among those characteristics, HR status is widely thought to distinguish early recurrence (≤5 years) from late recurrence (>5 years) (4, 5). For HR-negative BC, the recurrence rate peaks during the first two years after initial diagnosis and drops rapidly to a low level afterwards; most recurrences happen within five years of initial diagnosis. However, for HR-positive (HR+) BC, about 50 percent of recurrences happen after five years of adjuvant endocrine therapy (6, 7, 8), and the recurrence risk tends to remain stable at a low level for at least twenty years (3, 9).
Estrogen receptor (ER), progesterone receptor (PR), or both are found in about 70 percent of all BC, and such tumors are considered HR+ BC (10, 11). Adjuvant therapy for early BC is important in improving prognosis, and adjuvant chemotherapy has been shown to significantly reduce the recurrence rate in the first five years after diagnosis (12, 13). However, patients with HR+/human epidermal growth factor receptor 2-negative (HER2-) BC still face a high risk of late recurrence (14, 15). Previous studies have shown that five years of endocrine therapy could reduce late recurrence (15-year recurrence) by a third (16), and that extending adjuvant endocrine therapy to 10 years could reduce the risk of late recurrence in the second decade after initial diagnosis (17, 18, 19, 20, 21). Therefore, reliable prediction of recurrence time for HR+/HER2- BC patients is necessary for appropriate treatment.
In studies conducted outside China, researchers have explored the factors affecting recurrence time in HR+/HER2- BC patients. For example, Yamashita et al found that larger tumor size, a greater number of positive lymph nodes, and higher histological grade were associated with early recurrence. After stratification by age or hormonal status, PR status and Ki-67 expression level were also associated with recurrence time, in addition to the factors mentioned above (18). Because of heterogeneity, the prognosis of BC is influenced by race, so results from studies abroad may not apply to Chinese patients. Therefore, this study aimed to explore factors predicting recurrence time and to construct a predictive model, providing a reference for the treatment of patients with HR+/HER2- BC in China.

Study population and design
Patients with histologically confirmed metastatic recurrence of HR+/HER2- BC who presented to the Department of Breast Oncology at Peking University Cancer Hospital between April 2007 and October 2019 were included. Patients who were stage IV at initial diagnosis or had other malignant tumors were excluded. The primary outcome was disease-free survival (DFS), defined as the time from initial diagnosis to first recurrence. Patients were stratified into two groups according to DFS: an early recurrence group (DFS ≤5 years) and a late recurrence group (DFS >5 years). Metastatic sites were identified by imaging examinations or rebiopsy. Multiple metastasis was defined as metastatic lesions involving more than one organ. Data on clinicopathological factors, adjuvant therapies, and metastatic characteristics were collected from medical records. This study conformed to the guidelines of the 1996 Declaration of Helsinki and was approved by the Ethics Committee of Peking University Cancer Hospital (No.2017KT40).
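The grouping rule above can be illustrated with a minimal Python sketch. It assumes that DFS is measured between two calendar dates and that the 5-year threshold is applied to the exact anniversary of diagnosis; the text does not say how the interval was computed, and the function names `add_years` and `recurrence_group` are hypothetical.

```python
from datetime import date

CUTOFF_YEARS = 5  # threshold from the text: DFS <= 5 years is early, > 5 years is late


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return d.replace(year=d.year + years, day=28)


def recurrence_group(diagnosis: date, first_recurrence: date) -> str:
    """Classify a patient as 'early' or 'late' recurrence from two dates."""
    if first_recurrence < diagnosis:
        raise ValueError("recurrence cannot precede initial diagnosis")
    # Assumption: recurrence exactly on the 5-year anniversary counts as early.
    if first_recurrence > add_years(diagnosis, CUTOFF_YEARS):
        return "late"
    return "early"
```

For example, a patient diagnosed on 15 March 2010 whose first recurrence was recorded on 16 March 2015 would fall in the late recurrence group under these assumptions.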

IHC evaluation
Expression of ER, PR, and Ki-67 was evaluated by immunohistochemistry (IHC). HR was considered positive if there was ≥1% positive nuclear staining for ER, PR, or both. HER2 was considered positive with an IHC score of 3+, or 2+ with fluorescence in situ hybridization showing gene amplification. HER2-positive patients were excluded from this study. According to the expression level of ER and PR, HR was stratified into four groups. Group 1 was defined as 1+ or 25%. Group 2 was defined as 2+ or 25%~50%. Group 3 was defined as 3+ or 50%~75%. Group 4 was defined as ≥75%. Similarly, Ki-67 was stratified into four groups according to its expression level.
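The positivity rules above translate directly into code. The following Python sketch is illustrative only: the function names and argument conventions (percentages of stained nuclei, an integer IHC score from 0 to 3, a FISH flag) are assumptions, and raising an error for a 2+ score without a FISH result is a design choice made here, not something stated in the text.

```python
from typing import Optional


def is_hr_positive(er_percent: float, pr_percent: float) -> bool:
    """HR+ if at least 1% of nuclei stain positive for ER, PR, or both."""
    return er_percent >= 1.0 or pr_percent >= 1.0


def is_her2_positive(ihc_score: int, fish_amplified: Optional[bool] = None) -> bool:
    """HER2+ if IHC is 3+, or 2+ with gene amplification on FISH."""
    if ihc_score not in (0, 1, 2, 3):
        raise ValueError("IHC score must be 0, 1, 2 or 3")
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        if fish_amplified is None:
            # Assumption: an equivocal 2+ score cannot be resolved without FISH.
            raise ValueError("IHC 2+ requires a FISH result")
        return fish_amplified
    return False


def is_eligible(er_percent: float, pr_percent: float,
                ihc_score: int, fish_amplified: Optional[bool] = None) -> bool:
    """HR+/HER2- patients are kept; HER2-positive patients are excluded."""
    return (is_hr_positive(er_percent, pr_percent)
            and not is_her2_positive(ihc_score, fish_amplified))
```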

Statistical methods
Statistical analysis was performed using SPSS 23.0 and R 4.0.0. DFS was evaluated using the Kaplan-Meier method and compared using the log-rank test. The chi-square test was used to compare clinicopathological characteristics between the early and late recurrence groups. Multivariate analysis was performed using a logistic regression model to identify independent factors influencing recurrence time. A nomogram was constructed on the basis of potentially associated prognostic factors. The discrimination of the nomogram was evaluated by the receiver operating characteristic (ROC) curve and the concordance index (C-index). Calibration was performed by bootstrapping with 1000 resamples. A P-value <0.05 was considered statistically significant.
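To make the discrimination measure concrete, the sketch below computes a C-index for a binary outcome as the proportion of (late, early) patient pairs in which the late-recurrence patient received the higher predicted probability, with ties counted as one half. When computed on the same predictions, this quantity coincides with the area under the ROC curve. The study used SPSS and R; this pure-Python function, its name, and its inputs are illustrative assumptions rather than the authors' code.

```python
def c_index(predicted, outcomes):
    """Concordance index for a binary outcome (1 = event, 0 = no event).

    predicted: predicted probabilities of the event.
    outcomes:  observed 0/1 labels, same length as `predicted`.
    """
    events = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("both outcome classes must be present")

    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0   # pair ranked correctly
            elif p_event == p_non_event:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))
```

A value of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation of the two groups.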

Patient characteristics
A total of 639 female patients who presented with a metastatic recurrence of BC were included in the study. Most patients (91.1%) were diagnosed after January 1, 2000. Patient characteristics are described in Table 1. Median age at initial diagnosis of the primary tumor was 47 years, with 263 patients (41.2%) aged ≥50 years. A majority of patients (61.0%) were premenopausal at initial diagnosis. Most patients (528/639, 83.0%) received adjuvant therapy after surgery, while 108/639 (17.0%) received neoadjuvant therapy before surgery. All patients underwent surgery, and most operations (93.6%) were mastectomies, including radical and simple mastectomy. During the first five years of follow-up, the number of patients with recurrence grew rapidly. By the fifth year of follow-up, 59.8% (382/639) of patients had relapsed. Most metastases (557/639, 87.2%) occurred within ten years of follow-up.

Univariate analysis
Univariate survival analysis for DFS identified the following parameters as significant factors (Figure 3): age at diagnosis, hormonal status at diagnosis, treatment mode, histology of the primary tumor, Scarff-Bloom-Richardson (SBR) grade, Ki-67, and number of positive lymph nodes. Older age (≥50), postmenopausal status, higher SBR grade, higher level of Ki-67, and more positive lymph nodes were associated with shorter DFS. In addition, patients receiving neoadjuvant therapy (most of whom presented at a later stage) relapsed earlier than those not receiving therapy before surgery (P=0.003). The level of hormone receptor expression and tumor size did not show a significant impact on recurrence time. Although tumor size did not significantly affect DFS (P=0.157), patients with larger tumors tended to relapse earlier. Some variables (neoadjuvant/adjuvant chemotherapy and radiotherapy) need to be described with possible confounding factors taken into account in order to understand their impact.
Neoadjuvant or adjuvant chemotherapy and radiotherapy appeared to have no impact on recurrence time. Patients treated with chemotherapy or radiotherapy had more positive lymph nodes (P<0.001 for both) than patients who did not receive the respective treatment. Metastasis did not appear later in these patients even though they received treatment intended to delay recurrence.
Results comparing the early and late recurrence groups are shown in Table 2 and Figure 4. Patients with early recurrence accounted for 59.8% (382/639) of all patients, more than those with late recurrence (257/639). Compared with the late recurrence group, more patients in the early recurrence group were postmenopausal at initial diagnosis (42.6% vs 33.6%), received neoadjuvant therapy (20.6% vs 11.7%), had a higher level of Ki-67, had more positive lymph nodes, and did not receive radiotherapy (46.6% vs 37.0%). There was no significant difference between the two groups in the proportion of patients receiving chemotherapy (P=0.480), probably because patients receiving chemotherapy had more positive lymph nodes, as mentioned above. With regard to first metastatic sites (Table 3), bone was the most common in both groups. Liver metastasis (19.4% vs 11.3%, P=0.006) and chest/skin/soft tissue metastasis (28.8% vs 21.4%, P=0.036) were more frequent in patients with early recurrence than in those with late recurrence, while pleural metastasis (5.0% vs 12.8%, P<0.001) was more frequent in patients with late recurrence. There was no significant difference between the two groups in bone, lymph node, lung, or brain metastasis, nor in the number of first metastatic sites.
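As a worked illustration of the chi-square comparison, the Python sketch below applies Pearson's test (without continuity correction) to a 2x2 table for liver metastasis. The counts 74/382 and 29/257 are assumptions reconstructed from the reported percentages (19.4% and 11.3%); they are not taken from Table 3, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and
    no continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square variable with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Assumed counts: liver metastasis yes/no in early (n=382) and late (n=257) groups.
statistic, p_value = chi_square_2x2(74, 382 - 74, 29, 257 - 29)
```

With these reconstructed counts the statistic is roughly 7.4 and the P-value is close to the reported 0.006, which suggests the uncorrected Pearson test is consistent with the published figure.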

Multivariate analysis
On the basis of the univariate results and clinical experience, logistic regression analysis was performed. Age at initial diagnosis, hormonal status at initial diagnosis, treatment mode, type of surgery, expression level of Ki-67, N stage, neoadjuvant/adjuvant chemotherapy, neoadjuvant/adjuvant endocrine therapy, and adjuvant radiotherapy were included in the logistic model, and the results are shown in Table 4. Postmenopausal status, higher expression of Ki-67, and more positive lymph nodes were more common in patients with early recurrence than in those with late recurrence (P=0.046, 0.003, and 0.021, respectively). The age of menopause in Chinese women is usually around 50 years. To clarify whether there was an interaction between age and hormonal status, interaction analysis was performed. The result indicated no synergistic effect of age and hormonal status on recurrence time (P=0.662).

Construction and validation of the predictive model
A logistic regression-based nomogram was constructed on the basis of potentially associated factors. Only patients with complete data on the associated factors could be included in developing the nomogram. Because many patients lacked data on Ki-67 (Table 1), the factors mentioned above, except the expression level of Ki-67, were first used to construct a nomogram without Ki-67 (Figure 5). There were 563 patients in the development set. The C-index for this nomogram was 0.637, and the AUC was 0.635 (95% CI: 0.589-0.681).
Internal validation was performed using the bootstrap resampling method, and the calibration curve is depicted in Figure 6. To optimize the model, and considering the important role of Ki-67 in recurrence, a second nomogram was then constructed with the expression level of Ki-67 included (Figure 7). There were 357 patients in the development set of this second nomogram. Its C-index was 0.697, and its AUC was 0.703 (95% CI: 0.643-0.763) (Figure 8). The calibration curve is depicted in Figure 9. The calibration results showed that absolute error and squared error were smaller for the nomogram with Ki-67 than for the nomogram without Ki-67, and the predicted curve (solid line) of the nomogram with Ki-67 more closely approximated the 45-degree line. Moreover, the ROC curve showed that the nomogram with Ki-67 exhibited better discrimination in predicting the probability of late recurrence than the nomogram without Ki-67. From the nomogram, the probability of late recurrence for each patient can be easily obtained by calculating the total score of the clinicopathological factors.
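The way a logistic regression-based nomogram turns factor scores into a probability can be sketched as follows. All coefficients, the intercept, and the three binary predictors below are HYPOTHETICAL placeholders chosen for illustration; they are not the fitted values from Table 4, and the real nomogram includes more factors. The point scale follows the common convention of assigning 100 points to the predictor with the largest effect.

```python
import math

# HYPOTHETICAL log-odds coefficients for late recurrence (DFS > 5 years).
# These are NOT the study's fitted values; each predictor is a 0/1 indicator.
INTERCEPT = -0.2
COEFFICIENTS = {
    "premenopausal": 0.7,
    "low_ki67": 0.6,
    "few_positive_nodes": 0.5,
}


def late_recurrence_probability(patient):
    """Inverse-logit of the linear predictor of the logistic model."""
    linear_predictor = INTERCEPT + sum(
        beta * patient[name] for name, beta in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def nomogram_points(patient):
    """Rescale each contribution so the strongest predictor is worth 100 points."""
    largest = max(COEFFICIENTS.values())
    return {
        name: 100.0 * beta * patient[name] / largest
        for name, beta in COEFFICIENTS.items()
    }


example = {"premenopausal": 1, "low_ki67": 1, "few_positive_nodes": 0}
total_points = sum(nomogram_points(example).values())
probability = late_recurrence_probability(example)
```

Because the points are a linear rescaling of the log-odds contributions, a higher total score always maps to a higher predicted probability, which is what allows the probability to be read off a single axis of the nomogram.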

Discussion
Late recurrence is observed not only in HR+ BC but also in many other solid tumors, such as thyroid cancer, prostate cancer, and melanoma. The mechanism of late recurrence is still poorly understood. Some researchers think it is partly due to changes in the immune microenvironment of the primary tumor and to tumor dormancy (22, 23, 24, 25, 26, 27). Although late recurrence remains a difficult problem for HR+ BC patients, increasing evidence shows that extending the duration of endocrine therapy could reduce the risk of late recurrence (9). Therefore, identifying patients at high risk of late recurrence and proposing a more appropriate treatment plan for them has become a critical task. In this study, we identified the features of late recurrence and some factors that might influence recurrence time in HR+/HER2- patients, and constructed a model to predict the risk of late recurrence.
Whether hormonal status affects recurrence risk and recurrence time is controversial. Ditsatham et al found that premenopausal patients had a higher risk of recurrence than postmenopausal patients (28). In other studies, there was no difference in DFS between premenopausal and postmenopausal patients with late recurrence (29), and there was no difference in hormonal status between patients with early and late recurrence (30). In this study, hormonal status was an important factor affecting recurrence time in both univariate and multivariate analysis. Premenopausal patients relapsed later than postmenopausal patients (OR=0.477, P=0.046). This phenomenon is probably due to comorbid conditions, tumor burden, and the immune microenvironment (24). As postmenopausal patients are older, they are more likely to have serious comorbid conditions than younger premenopausal patients. There might also be a substantial difference in the immune microenvironment between premenopausal and postmenopausal patients. As a result, although premenopausal patients have higher levels of estrogen, they might experience recurrence later than postmenopausal patients. Thus, the duration of endocrine therapy for premenopausal patients should be extended.
In previous studies, some researchers found that lymph node status could provide predictive information on late recurrence of BC (31, 32, 33). The risk of late recurrence, especially distant recurrence, increased when there were positive lymph nodes (34). Some results revealed that the risk of late recurrence increased significantly when there were more than three positive lymph nodes (29, 35). In this study, >3 positive lymph nodes were more frequently seen in patients with early recurrence than in those with late recurrence. Meanwhile, even when there were fewer than 3 positive lymph nodes, quite a number of patients (67.6%) developed late recurrence. The results of this study confirm the influence of positive lymph nodes on recurrence time. This indicates that, in future clinical work, patients with more than 3 positive lymph nodes should receive adjuvant radiotherapy and extended endocrine therapy to delay recurrence. On the other hand, for patients with few or no positive lymph nodes, the risk of late recurrence may not be that low, and other factors should be taken into consideration.
In biological studies, Ki-67 is regarded as an important prognostic factor. Ki-67 is a nuclear protein, originally identified by the monoclonal antibody of the same name, that is expressed in all active phases of the cell cycle but is absent in G0. It is a marker of cell proliferation, reflecting the aggressiveness of tumor cells. Whether Ki-67 is an independent prognostic factor is still under debate.
In a study on late recurrence of BC, a high level of Ki-67 was associated with short DFS in univariate analysis, while only histological grade and lymph node metastasis were independent prognostic factors in multivariate analysis (29). In this study, there was a statistically significant difference in the level of Ki-67 between the early and late recurrence groups, and a high level of Ki-67 was more frequently seen in patients with early recurrence (P=0.005). After adjusting for the potential influence of other factors in multivariate analysis, Ki-67 was an independent factor affecting recurrence time of BC.
The influence of treatment mode on recurrence time has rarely been discussed in previous studies. Akrami et al found that chemotherapy before surgery was not related to recurrence time (41). In this study, treatment mode, comprising neoadjuvant therapy and adjuvant therapy, was defined according to whether patients received treatment before surgery. Patients received neoadjuvant therapy to shrink the tumor before surgery, and these patients were usually in poor condition at initial diagnosis. Our data supported this, with neoadjuvant therapy more frequently seen in patients with early recurrence in univariate analysis (20.6% vs 11.7%, P=0.003). Patients with a later disease stage are more likely to undergo neoadjuvant therapy, which is why early recurrence occurred more often in patients receiving neoadjuvant therapy. This phenomenon reflects the fact that the characteristics of the primary tumor greatly affect recurrence time, and even when patients received treatment before surgery, recurrence could not be delayed.
Akrami et al found that adjuvant radiotherapy is a predictive factor of late recurrence (41). In another study, recurrence time was not associated with adjuvant radiotherapy (30). In the univariate analysis of this study, adjuvant radiotherapy was more frequently seen in the late recurrence group than in the early recurrence group (63.0% vs 53.4%). However, in multivariate analysis, there was no significant difference in radiotherapy between the two groups (P=0.719). Because adjuvant radiotherapy is related to clinical factors such as breast-conserving surgery and positive lymph nodes, we further analyzed the relationship between type of surgery, lymph node metastasis, and adjuvant radiotherapy. Adjuvant radiotherapy was not associated with type of surgery (P=0.126) but was associated with positive lymph nodes (P<0.001). Patients treated with adjuvant radiotherapy had more positive lymph nodes, and their recurrence was not delayed even though they received radiotherapy.
It has been demonstrated that breast tumor molecular subtypes are associated with the site of metastasis. Bone is the most common metastatic site in HR+ BC (42, 43, 44, 45). This phenomenon is probably related to the important role of bone marrow in hematopoiesis. Dormant BC micrometastases residing in specific bone marrow niches could pass from bone into the circulation, leading to metastasis and proliferation of tumor cells (46, 47). In this study, bone was the most common first metastatic site in both the early and late recurrence groups (44.0% and 50.2%, respectively). On the other hand, liver and chest/skin/soft tissue metastases were more frequently seen in the early recurrence group (19.4% vs 11.3% and 28.8% vs 21.4%, respectively). Pleural metastasis was more frequently seen in the late recurrence group (12.8% vs 5.0%, P<0.001). There was no difference in lung metastasis between the two groups (P=0.405). In some other studies, the relationship between recurrence time and metastatic site was slightly different. For example, in a Japanese study, lung metastasis was more common in HR+/HER2- patients with late recurrence (48). In another Japanese study, compared with patients with early recurrence within 10 years, patients with late recurrence had fewer chest and liver metastases and more lung and pleural metastases (32). Metastatic site is associated not only with the primary tumor but also with the microenvironment and organ characteristics.
According to the results of univariate and multivariate analysis, we constructed a nomogram using R software and optimized it for better predictive value, achieving a higher AUC of 0.703 and a better calibration curve. Age, hormonal status, positive lymph nodes, Ki-67, treatment mode, type of surgery, radiotherapy, chemotherapy, and endocrine therapy were included in this model. With the help of this model, doctors could easily identify patients at higher risk of late recurrence and give them more intensive treatment.
This study has some advantages. First, the sample size was relatively large, which supports the reliability of the results. Second, this study included patients diagnosed over the past 20 years with a relatively long follow-up time and made a comprehensive review of the medical records, which helped to clarify the clinicopathological factors influencing recurrence time. In addition, this study proposed a visual model that helps to better understand and calculate the risk of late recurrence. This study also has limitations. It is a retrospective study with selection bias and incomplete data. Histological grade, an important prognostic factor, was not available in nearly half of the patients, so the result on histological grade should be interpreted with caution. Furthermore, we developed and validated the nomogram using data from the same center. Another population from a different center is still needed to externally validate this nomogram.
Currently, a growing body of research focuses on the role of genes and biomarkers, in addition to clinicopathological factors, in predicting late recurrence risk (49). Examples include Oncotype DX, MammaPrint, BC Index, and circulating tumor cells, which have the potential to optimize the selection of candidates for extended endocrine therapy (31, 50, 51, 52, 53, 54, 55, 56, 57, 58). Combining clinicopathological factors, genes, and biomarkers may add significant prognostic information on late recurrence of BC and provide a more useful reference for future work.

Conclusion
Our study showed that premenopausal status at initial diagnosis, fewer positive lymph nodes, and a lower level of Ki-67 were more common in HR+/HER2- BC patients with late recurrence. We developed a well-calibrated nomogram predicting recurrence time of HR+/HER2- BC patients. The nomogram can predict the risk of late recurrence. Prospectively designed studies are needed to further validate our model.

Declarations
Ethics approval and consent to participate This study conformed to the guidelines of the 1996 Declaration of Helsinki and was approved by the Ethics Committee of Peking University Cancer Hospital (No.2017KT40).
Consent for publication Not applicable.
Availability of data and materials The data in this article are available upon request.
Competing Interest The authors declare that they have no competing interests.
Funding There were no funding sources for this work.
Authors' contributions Huiping Li and Mengyu Hu contributed to the conception and design of the study. Mengyu Hu and Huajie Xing analyzed and interpreted the patient data. Mengyu Hu was a major contributor in writing the manuscript. All authors read and approved the final manuscript.

Forest plot in univariate analysis. Premenopausal status, adjuvant therapy, fewer positive lymph nodes, lower level of Ki-67, and not receiving radiotherapy were more common in the late recurrence group.

Figure 6
Calibration curve of the nomogram without Ki-67. B stands for bootstrap. Internal validation was performed using the bootstrap resampling method with 1000 repetitions, and the calibration curve was drawn from the results. The closer the curve is to the diagonal line, the more reliable the nomogram is. Absolute error and squared error describe the difference between the predicted value and the actual value. The smaller the error is, the more reliable the nomogram is.

Figure 9
Calibration curve of the nomogram with Ki-67. B stands for bootstrap. Internal validation was performed using the bootstrap resampling method with 1000 repetitions, and the calibration curve was drawn from the results. The closer the curve is to the diagonal line, the more reliable the nomogram is. Absolute error and squared error describe the difference between the predicted value and the actual value. The smaller the error is, the more reliable the nomogram is.