An Extended Cox Prognostic Model of ER/PR+ and HER2− Breast Cancer Patients

Background: This study explored a new prognostic model for ER/PR+ and HER2− breast cancer, which we call the extended Cox prognostic model. The model determines the cut-off values of multiple continuous prognostic factors and their interactions through a new modeling idea and variable selection method. Methods: A total of 335 patients with ER/PR+, HER2− breast cancer were enrolled in the final analysis. The primary endpoint was breast cancer-specific mortality (BCSM). The prognostic factors included in this study were histological grade, histological type, stage, T, N, lymphovascular invasion, P53, Ki67, ER, PR and age. The four continuous variables (Ki67, ER, PR, age) were each partitioned into a series of binary variables, all of which were fitted in the multivariate Cox analysis. A smoothly clipped absolute deviation (SCAD) variable selection method was used. Model performance was assessed by discrimination and calibration. Results: We developed an extended Cox model with a time threshold at 164 weeks (more than 3 years) post-operation. The cut-off values for PR, Ki67 and age were 20%, 60% and 41-55 years, respectively. After 164 weeks post-operation, there was an interaction between age and PR in patients aged ≥ 41 years with PR ≥ 20%: these patients had relatively higher mortality after 164 weeks than before 164 weeks postoperatively. Conclusions: Our study may offer guidance on prognosis for patients with ER/PR+ and HER2− breast cancer in China. The new idea may serve as one approach to modeling and to determining the cut-off values of prognostic factors in the future.


Introduction
Breast cancer is the most common malignancy among Chinese women. In Shanghai, China, with the rising incidence of breast cancer, studies of prognostic models for breast cancer have become increasingly necessary in recent years [1]. Prognostic gene signatures (Oncotype DX, MammaPrint, and so on) have recently received growing attention; however, neither first- nor second-generation gene signatures are ready for use in clinical practice because of numerous concerns about cost and technology [2]. Meanwhile, the prognostic value of the classic clinicopathologic variables has again attracted serious attention [3-5], and some evidence indicates that models based on clinicopathologic variables are excellent surrogates for the prognostic gene signatures. Models based on classic clinicopathologic variables are therefore highly valued in China because they are feasible in clinical practice. Since 2006, the National Comprehensive Cancer Network (NCCN) has classified invasive breast cancer into four subtypes [6]. This molecular subtyping is a classification method similar to intrinsic subtyping; it is better suited to current clinical practice in China and also serves as an independent prognostic factor [7,8]. Among the four molecular subtypes, ER and/or PR+ and HER2− (estrogen receptor and/or progesterone receptor positive and human epidermal growth factor receptor 2 negative) is the most common, accounting for approximately 60% of breast cancer patients [8,9]. There is thus an urgent demand, with potentially wide impact, for an improved prognostic model for patients with ER/PR+ and HER2− breast cancer that is based on classic clinicopathologic variables and meets the specific clinical needs in China.
The current classic prognostic algorithms (PREDICT, Adjuvant! Online, and the Nottingham Prognostic Index) are far from ideal [10-12]. These models were often based on data sets from non-Chinese or non-Asian patients. Specifically, they either treated prognostic factors such as Ki67 (a nuclear marker of cell proliferation), estrogen receptor (ER), progesterone receptor (PR) and age as continuous factors, or used cut-off values for these factors that were determined merely by univariate analysis, experience or speculation. In addition, existing models ignored the interaction effects between prognostic factors. Therefore, the current models showed poor accuracy and were not well suited to clinical practice in China. It is critical to develop a novel, improved prognostic algorithm for analyzing clinical data from China.
[...] included age at diagnosis, number of lymph nodes sampled and number of positive lymph nodes (categorised as 0, 1 to 3, 4 to 9, and 10+ positive nodes [4]), lymphovascular invasion (categorised as positive or negative), tumour size (categorised as < 21 mm, 21 to 50 mm, 50+ mm [4]), histological grade (categorised as I, II, III [4]), pathological type, protein 53 (p53) status, proliferating cell nuclear antigen Ki67 (Ki67) status, ER status, PR status, HER2 status, information on local therapy (wide local excision, mastectomy, radiotherapy), and type of adjuvant systemic therapy (chemotherapy, endocrine therapy, both).
Patients were excluded from the analyses if they had mucinous carcinoma, cribriform carcinoma or tubular carcinoma [6]; had incomplete information; received chemotherapy or radiation before the operation; did not undergo surgery; did not complete local treatment (local excision without radiotherapy); had no axillary lymph node dissection; or did not complete adjuvant systemic therapy (chemotherapy and endocrine therapy). This left 568 individuals. Variables for each patient included age, TNM status, T stage, N stage, pathological subtype, histological grade, lymphovascular invasion, p53 status, Ki67 status, ER status, PR status, vital status, and survival time.
1.2 Categorization of patients into the ER/PR+ and HER2− subtype of breast cancer
All formalin-fixed paraffin-embedded (FFPE) tumor blocks were collected at the time of surgery, prior to adjuvant therapy, and were stored at room temperature. Approval was obtained from the Institutional Review Boards (IRB). Tumor sections of 4 to 10 µm were cut. One section was stained with hematoxylin/eosin (HE) to confirm the presence of invasive carcinoma, and the other sections were used for molecular analyses by two independent pathologists. ER, PR and HER2 were assayed by immunohistochemistry (IHC) and evaluated according to standard criteria [9,13,14].
The criteria for evaluating ER and PR in breast cancer cells by IHC were as follows [13]: ER or PR was positive if the cell nuclei stained brown. In one section, five high-power regions were selected randomly. The patient was assigned to subtype ER/PR+ if the percentage of positive cells in these regions was ≥ 1%. The criteria for evaluating HER2 by IHC were as follows: the patients were categorized into four groups, 0, 1+, 2+, or 3+ [9,14]. HER2 IHC 0 and 1+ were HER2 negative [HER2 (−)], HER2 IHC 2+ was equivocal, and HER2 IHC 3+ was HER2 positive. Tumor cells scored HER2 IHC 2+ were further analyzed by fluorescence in situ hybridization (FISH). According to the average HER2 gene copy number or the HER2/chromosome 17 centromere (CEP17) ratio, the FISH results were considered positive, equivocal, or negative [9]. Patients with an equivocal FISH result were excluded from the study. Among the 568 patients, 185 individuals had a HER2 status of IHC 2+ [9,14].
Tissue microarrays (TMAs) were constructed from tissue cores 1.5 mm in diameter to determine the HER2 status of these 185 patients by FISH. Finally, 335 of the 568 patients were classified as ER/PR+ and HER2− and formed the study population.
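The selection rules above can be sketched in code. This is only an illustration of the stated criteria (ER or PR ≥ 1% positive cells; HER2 IHC 0/1+ negative, 3+ positive, 2+ resolved by FISH, equivocal FISH excluded). The function names and the string labels are hypothetical and do not come from the study.

```python
from typing import Optional


def hormone_receptor_positive(er_pct: float, pr_pct: float) -> bool:
    """ER/PR+ if ER or PR stains in at least 1% of tumour cells."""
    return er_pct >= 1.0 or pr_pct >= 1.0


def her2_status(ihc_score: int, fish: Optional[str] = None) -> str:
    """Return 'negative', 'positive' or 'excluded'.

    ihc_score is 0, 1, 2 or 3 (for 0, 1+, 2+, 3+).
    fish is 'positive', 'negative' or 'equivocal'; it is only
    consulted when the IHC score is 2+.
    """
    if ihc_score in (0, 1):
        return "negative"
    if ihc_score == 3:
        return "positive"
    if ihc_score == 2:
        if fish == "negative":
            return "negative"
        if fish == "positive":
            return "positive"
        if fish == "equivocal":
            return "excluded"  # equivocal FISH results were excluded
        raise ValueError("IHC 2+ needs a FISH result")
    raise ValueError("IHC score must be 0, 1, 2 or 3")


def in_study_population(er_pct: float, pr_pct: float,
                        ihc_score: int, fish: Optional[str] = None) -> bool:
    """True for ER/PR+ and HER2-negative tumours."""
    return (hormone_receptor_positive(er_pct, pr_pct)
            and her2_status(ihc_score, fish) == "negative")
```

For example, `in_study_population(50, 0, 2, "negative")` is true, while a tumour with IHC 3+ is never included regardless of its ER and PR values.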

Treatment for the patients.
All patients underwent modified radical mastectomy or breast-conserving surgery, followed by adjuvant chemotherapy. After surgery they were treated with four to six cycles of CEF (cyclophosphamide, epirubicin, and fluorouracil) chemotherapy, or four cycles of CEF followed by four cycles of T (docetaxel) chemotherapy, or four to six cycles of TEC (docetaxel, epirubicin, and cyclophosphamide) chemotherapy. If necessary, patients received postoperative radiotherapy followed by endocrine therapy, but not trastuzumab treatment.

Follow-up
The primary endpoint of this study was breast cancer-specific mortality (BCSM), which was determined by following up the survival of patients over a certain time period. From the first day after surgery until death or the end of the study (

Statistical analysis
The extended Cox prognostic model was developed in all eligible patients as follows. First, to determine the cut-off values for each of the continuous variables (Ki67, ER, PR, age), these four variables were partitioned into a series of binary variables. Second, all variables were fitted in the multivariate Cox analysis. A SCAD (smoothly clipped absolute deviation) variable selection method [15] was used to build a Cox prognostic model that determined the independent variables, the cut-off values and the interaction effects between different factors. Finally, while developing the model, we hypothesised that it could be divided into two parts at a certain time point. We therefore built a new model, named the extended Cox prognostic model, with a time threshold at 164 weeks (more than 3 years), which was the threshold giving the smallest Akaike information criterion (AIC) value.
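The first step, partitioning a continuous variable into a series of binary variables, can be sketched as follows. The text does not give the exact coding scheme or the grid of candidate cut-offs, so the "value ≥ cut-off" indicators, the function name and the example numbers below are assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence


def threshold_indicators(name: str,
                         values: Sequence[float],
                         cutoffs: Sequence[float]) -> Dict[str, List[int]]:
    """Build one 0/1 column per candidate cut-off.

    Each column is 1 where the value is at or above the cut-off and
    0 otherwise. A variable selection step can then keep only the
    columns (that is, the cut-offs) that carry prognostic information.
    """
    return {
        f"{name}>={c:g}": [1 if v >= c else 0 for v in values]
        for c in cutoffs
    }


# Illustrative PR percentages and a hypothetical grid of cut-offs.
pr_percent = [0, 5, 20, 45, 90]
candidate_cutoffs = [10, 20, 30]
columns = threshold_indicators("PR", pr_percent, candidate_cutoffs)
```

Here `columns["PR>=20"]` is `[0, 0, 1, 1, 1]`. If a penalised Cox fit retains only that column among the PR indicators, 20% is read off as the cut-off for PR.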
We evaluated the predictive accuracy of the extended Cox prognostic model in terms of discrimination and calibration. For model discrimination, receiver-operator-characteristic (ROC) curves were plotted for the data at 1, 3 and 5 years post-operation [16], and the areas under the ROC curves (AUC) were calculated. Model calibration was assessed by a simplified goodness-of-fit (GOF) method [17]. We compared the number of deaths observed with the number calculated at 5 years post-operation. We grouped the risk scores into 5 sets and then calculated the GOF statistic for each set. This yields a chi-square goodness-of-fit test.
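As a simplified illustration of the discrimination measure, the AUC at a fixed time point can be computed as the fraction of (died, survived) patient pairs in which the patient who died had the higher risk score. This sketch ignores censoring, which the time-dependent ROC method cited above [16] is designed to handle, so it is not the method used in the study; the function name and inputs are hypothetical.

```python
from typing import Sequence


def auc(risk_scores: Sequence[float], died: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison.

    died[i] is 1 if patient i died by the chosen time point and 0
    otherwise. Ties in the risk score count as half a concordant pair.
    """
    cases = [s for s, d in zip(risk_scores, died) if d == 1]
    controls = [s for s, d in zip(risk_scores, died) if d == 0]
    if not cases or not controls:
        raise ValueError("need at least one death and one survivor")
    concordant = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return concordant / (len(cases) * len(controls))
```

An AUC of 0.5 corresponds to a risk score that ranks patients no better than chance, and 1.0 to a score that ranks every death above every survivor.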

Prognostic factors
According to whether the prognostic factor was categorical or continuous, [...] Secondly, 1730 variables, including interaction terms, were used in the model to evaluate the interaction effects between prognostic factors. A SCAD variable selection method [15] was also used to develop the Cox proportional hazards model (named Model 2).
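For reference, the SCAD penalty is usually written as follows in the statistical literature. The symbols θ, λ and a are the conventional ones and are not taken from the text above, which does not report the tuning parameters that were used.

```latex
p_\lambda(\theta) =
\begin{cases}
\lambda\theta, & 0 \le \theta \le \lambda,\\[4pt]
\dfrac{2a\lambda\theta - \theta^{2} - \lambda^{2}}{2(a-1)}, & \lambda < \theta \le a\lambda,\\[4pt]
\dfrac{(a+1)\lambda^{2}}{2}, & \theta > a\lambda,
\end{cases}
\qquad \theta = |\beta_j|,\ \lambda > 0,\ a > 2 .
```

The penalty grows linearly for small coefficients, which can shrink weak indicator variables to exactly zero, and is constant for large coefficients, so strong predictors are not over-shrunk. In a penalised Cox model, the sum of these penalties over all coefficients is subtracted from the log partial likelihood.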
* PR-age (age ≥ 41 years, PR ≥ 20%): a binary variable representing the interaction term between age and PR; it equals one if age is no less than 41 years and PR is no less than 20%, and zero otherwise.
* For the age factor, the patients were divided into three groups: the young group (< 41 years old), the middle-aged group (41-55 years old) and the elderly group (55 years old and above).
* Denotes the interaction term between PR status and age after 164 weeks postoperatively.

Table VI
Coefficients, hazard ratios (95% CI) and P-values from the prognostic model
The extended Cox prognostic model determined the cut-off values for multiple continuous prognostic factors and the interaction effects between the factors. Firstly, the cut-off values of the prognostic factors were determined by the Cox prognostic model with the SCAD variable selection method [15]. Among the 1730 predictors, histological grade and N status were treated as categorical factors. The model indicated that Ki67, PR and age also acted as categorical factors, and it automatically determined cut-off values for them. For Ki67 expression, a cut-off value of 60% was selected to distinguish low expression (< 60%) from high expression (≥ 60%). For PR expression, a cut-off value of 20% was selected to distinguish low expression (< 20%) from high expression (≥ 20%).
For age, we categorized the patients into three groups: the old group, aged at least 55 years at surgery; the young group, aged less than 41 years; and the middle-aged group, aged 41 to 55 years. Secondly, regarding the interaction effect, there was an interaction between age and PR after 164 weeks (more than 3 years) post-operation. Patients aged ≥ 41 years with PR ≥ 20% had relatively higher mortality after 164 weeks post-operation than before 164 weeks.
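The reported cut-offs translate into the following binary covariates. This is a sketch for illustration: the dictionary keys are hypothetical names, and assigning a patient aged exactly 55 to the old group follows the "at least 55 years" wording above.

```python
from typing import Dict


def encode_covariates(age_years: float, pr_pct: float,
                      ki67_pct: float) -> Dict[str, int]:
    """Encode age, PR and Ki67 with the cut-offs reported in the text."""
    return {
        "ki67_high": int(ki67_pct >= 60),        # Ki67 cut-off at 60%
        "pr_high": int(pr_pct >= 20),            # PR cut-off at 20%
        "age_lt_41": int(age_years < 41),        # young group
        "age_41_to_55": int(41 <= age_years < 55),  # middle-aged group
        "age_ge_55": int(age_years >= 55),       # old group
        # Age-PR interaction indicator; in the extended model it only
        # enters the hazard after 164 weeks post-operation.
        "pr_age": int(age_years >= 41 and pr_pct >= 20),
    }
```

For a 50-year-old patient with PR 30% and Ki67 10%, the function returns `pr_high = 1`, `age_41_to_55 = 1` and `pr_age = 1`, with all other indicators equal to 0.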
In our study, each one-level increase in histological grade was associated with a 2.78-fold increase in the hazard of BCSM, and each one-level increase in N status was associated with a 1.64-fold increase.
Patients with high Ki67 expression had a hazard of BCSM 4.21 times that of patients with low Ki67 expression. Within 164 weeks post-operation, the hazard of BCSM in patients with low PR expression was 4.88 times that of patients with high PR expression. Within 164 weeks (more than 3 years) post-operation, patients aged < 41 years had the highest hazard of BCSM, followed by patients aged ≥ 55 years, while patients aged 41 to 55 years had the lowest. After 164 weeks post-operation, there was an interaction between age and PR in patients aged ≥ 41 years with PR ≥ 20%, and the hazard of BCSM in these patients rose. Among patients with high PR expression, older age was associated with higher mortality after 164 weeks post-operation: patients aged 41 to 55 years had nearly the same hazard of BCSM as those aged < 41 years, while the hazard of BCSM in patients aged ≥ 55 years was 3.09 times that of patients aged < 41 years.
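One common way to fit a Cox model whose effects change at a fixed time is to split each patient's follow-up at the threshold, so that covariates such as the age-PR interaction can be switched on only in the later interval. The text does not state how the model was implemented, so the sketch below is an assumption about the data layout, with hypothetical names.

```python
from typing import List, Tuple

THRESHOLD_WEEKS = 164.0  # time threshold reported in the text

# One row per interval: (start, stop, event, late_period)
Row = Tuple[float, float, int, int]


def split_follow_up(weeks: float, died: bool,
                    threshold: float = THRESHOLD_WEEKS) -> List[Row]:
    """Split one patient's follow-up at the time threshold.

    The event indicator is attached to the interval in which follow-up
    ends. late_period is 1 for the interval after the threshold.
    """
    if weeks <= 0:
        raise ValueError("follow-up time must be positive")
    if weeks <= threshold:
        return [(0.0, float(weeks), int(died), 0)]
    return [
        (0.0, float(threshold), 0, 0),                # survived the early period
        (float(threshold), float(weeks), int(died), 1),
    ]
```

A patient who died at 200 weeks contributes two rows, `(0, 164, 0, 0)` and `(164, 200, 1, 1)`. Multiplying a covariate by `late_period` gives a term that affects the hazard only after 164 weeks, which is how a time-specific interaction of this kind can be represented.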

Model discrimination and calibration
As expected, the extended Cox prognostic model showed good discrimination. Figure [...]
The extended Cox prognostic model was also well calibrated according to the goodness-of-fit (GOF) test [17]. We grouped the risk scores into 5 groups and obtained a GOF statistic of 0.54, with a P-value of 0.76 (> 0.05). This indicated that the fit of our extended Cox prognostic model was good.
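The calibration check can be illustrated with a Pearson-type comparison of observed and expected deaths per risk group, followed by a chi-square P-value. This is a sketch, not the exact simplified GOF procedure of [17]. The counts are hypothetical, and the degrees of freedom are not stated in the text; 2 is used below only as an assumption for the worked number.

```python
import math
from typing import Sequence


def gof_statistic(observed: Sequence[float],
                  expected: Sequence[float]) -> float:
    """Sum of (observed - expected)^2 / expected over the risk groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability for an even number of df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )


# Hypothetical observed and model-expected deaths in 5 risk groups.
observed_deaths = [2, 3, 5, 8, 12]
expected_deaths = [2.5, 3.5, 4.5, 8.5, 11.0]
statistic = gof_statistic(observed_deaths, expected_deaths)

# P-value for a statistic of 0.54 under the assumed 2 degrees of freedom.
p_value = chi2_sf_even_df(0.54, 2)  # approximately 0.763
```

A large P-value means the observed and expected numbers of deaths do not differ more than chance would explain, which is the sense in which the model is described as well calibrated.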

Discussion
In this study, we explored an extended Cox model for the prognosis of ER/PR+ and HER2− breast cancer that calculates the cut-off points of prognostic factors and their interaction. The cut-off values of Ki67, PR and age, and the interaction between age and PR status, were generated by the model itself.
The model was well calibrated and provided a high degree of discrimination.
We found that the prognosis of patients was associated with histological grade, N status, Ki67 status, PR status and age, which was consistent with previous reports [18]. The cut-off values for PR, Ki67 and age were 20%, 60% and 41-55 years, respectively. Importantly, these cut-off values were determined solely by our extended Cox prognostic model (a multivariable analysis), and not by univariate analysis, experience or speculation, which differs from previous studies [10-12, 19, 20]. Histological grade and N status had a linear effect on the hazard ratio. The cut-off value for PR (20%) was consistent with the St. Gallen consensus of 2013 [21,22], which supports the fidelity of our model. We speculate that breast cancer with low PR expression is probably a different intrinsic subtype, the Luminal B subtype. Our prognostic model selected 60% as the cut-off value for Ki67, which was much higher than those in other studies (e.g., 14% and 20%) [18,21-24]. We attribute this difference to the lack of a standardised procedure for Ki67 assessment and the controversial interpretation of Ki67 assays [25], and to the fact that previous studies determined the Ki67 cut-off by the ROC method, a univariate analysis [19,23]. The prognostic value of the Ki67 index in breast cancer therefore needs further exploration. The cut-off values for age (41 and 55 years) were consistent with the age range of perimenopausal and menopausal Chinese women, although they differed from those of other studies [18,26]. Hence our model was consistent with the currently available gold standard for the PR factor, and its cut-off values for age corresponded well to the physiological condition of our patients.
Most importantly, our model generated the cut-off values by algorithmic calculation, without empirical bias or reliance on univariate analysis.
In our study, algorithmic analysis by the extended Cox prognostic model showed an interaction between age and PR in patients aged ≥ 41 years with PR ≥ 20% after 164 weeks (more than 3 years) post-operation. Because of this interaction, these patients had relatively higher mortality after 164 weeks post-operation. Among patients with ER/PR+, HER2− breast cancer and PR ≥ 20%, the older the patient, the lower the survival beyond 164 weeks after surgery. The existence of an interaction between age and PR status has been reported previously [18]. We speculate that it may be related to multiple factors, including the subtype of breast cancer, menopausal status and the health of the patients more than 164 weeks after surgery. Tumor cells in different subtypes of breast cancer have different growth characteristics under different sex hormone levels. The Luminal A subtype of breast cancer was more likely to recur and metastasize more than 164 weeks after surgery in menopausal women.
The choice of drug in adjuvant endocrine therapy for perimenopausal or menopausal patients with ER/PR+, HER2− breast cancer and PR ≥ 20% therefore needs careful consideration, and endocrine therapy should last long enough. Compared with the previous report [18], our study characterised the interaction in more detail by identifying both the patient subgroup and the time threshold at which it appears.
Our study has advantages in the following three respects.
1. We adopted a new idea, determining the cut-off values solely from our extended Cox prognostic model, a multivariable analysis. The cut-off values for multiple prognostic factors, including age, PR and Ki67 status, were determined and were statistically significant for prognosis.
2. While developing the model, we introduced potential interactions and hypothesised that the model could be divided into two parts at a certain time point. An extended Cox model with a time threshold at 164 weeks (more than 3 years) post-operation was built on the basis of statistical analysis. We found an interaction between age and PR after 164 weeks post-operation: the hazard of BCSM in patients aged ≥ 41 years with PR ≥ 20% rose after 164 weeks post-operation.
3. Our model was derived from Chinese clinical data, which should make it particularly relevant to Chinese clinical practice. Because of genetic similarity, the model might also apply to the prognosis of breast cancer in other Asian female populations. The cut-off values for age (41 and 55 years) were consistent with the age range of perimenopausal and menopausal Chinese women, although they differed from those of other studies.
In conclusion, using a new modeling idea and statistical method, we explored an extended Cox prognostic model for ER/PR+ and HER2− breast cancer that calculates the cut-off points of prognostic factors and their interaction. The modeling idea and statistical method differed from those of previous studies, especially with regard to the cut-off value of age and its interaction with PR. The results of our study may offer guidance on prognosis and treatment for patients with ER/PR+ and HER2− breast cancer in China. The idea used in our study may serve as one approach to determining the cut-off values of prognostic factors in the future, and dividing a prognostic model into two parts at a certain time point may be a new approach to developing prognostic models.