Between 2001 and 2016, a total of 525 female patients received surgical treatment in southern China. Of these, 15 patients lacked age information and were excluded from the study. A further 51 patients lacked tumor size information and 6 lacked node status. Of the remaining 453 cases, 75 lacked pathological stage and treatment information and 95 were lost to follow-up, so 286 patients were finally included in the southern China cohort. Between 1975 and 2016, a total of 65535 patients were recorded in the SEER database. Of these, 26277 were lost to follow-up, 35956 lacked lymph node stage information, 2678 lacked tumor size information, and 28 lacked complete tumor staging and treatment information, leaving 596 patients in the SEER cohort (Fig. 1).
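The SEER exclusion steps described above can be tallied as a simple check. The following is a minimal sketch in Python; the helper name `apply_exclusions` and the step labels are illustrative and not part of the original analysis, and it assumes the exclusions were applied sequentially in the order stated.

```python
# Sequential exclusions for the SEER cohort, using the counts reported in the text.
# The helper name `apply_exclusions` and the step labels are illustrative only.

def apply_exclusions(start, exclusions):
    """Return (reason, excluded, remaining) for each sequential exclusion step."""
    remaining = start
    flow = []
    for reason, count in exclusions:
        remaining -= count
        flow.append((reason, count, remaining))
    return flow

seer_exclusions = [
    ("lost to follow-up", 26277),
    ("no lymph node stage information", 35956),
    ("no tumor size information", 2678),
    ("incomplete tumor staging and treatment information", 28),
]

seer_flow = apply_exclusions(65535, seer_exclusions)
for reason, count, remaining in seer_flow:
    print(f"excluded {count:>6} ({reason}); remaining: {remaining}")

seer_final = seer_flow[-1][2]  # patients left after the last exclusion step
```

With the reported counts, the final tally agrees with the 596 patients stated for the SEER cohort.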
Comparison of clinicopathological features between the southern China and SEER cohorts
All patients were divided into three age subgroups: a young adult group (<40 years), a middle-aged group (40-70 years) and an aged+ group (>70 years). Among the 510 breast cancer (BC) patients in southern China, 101 (19.8%) were under 40, and middle-aged patients made up the largest share, with 396 (77.65%) patients. Between 1975 and 2016, 65535 BC patients were included in the SEER database. The proportion of young patients in SEER, 4024 (6.14%), was lower than that in southern China, but the difference was not statistically significant (P=0.923). Middle-aged patients accounted for 41256 (62.95%) cases in the SEER cohort (P<0.001). The proportion of the aged+ group was significantly higher in the SEER cohort, at 13104 (20.00%) patients, than in southern China, at 5 (0.98%) patients (P=0.048) (Tab. 1). When the two cohorts were compared, newly diagnosed cases in both were mostly aged 40 to 70 years. However, the distribution within this range differed. In southern China, the subgroup aged 40 to 48 years (48 being the median age of the 40-70 group) and the subgroup aged 48 to 70 years were roughly similar in size, with 194 (38.04%) and 202 (39.61%) patients, respectively. BC patients in the SEER cohort tended to be older: 9105 (13.89%) patients were in the 40 to 48 subgroup, whereas 32151 (49.06%) were in the 48 to 70 subgroup (Tab. 1).
According to TNM stage, BC patients were divided into Tx, Tis and invasive groups (the invasive group comprising tumors ≤2 cm, >2 to ≤5 cm, and >5 cm). The invasive group was the largest, with 448 (89.96%) patients in southern China and 1364 (2.99%) patients in the SEER cohort (P<0.001). In southern China, tumors of 2 to 5 cm were the most common, with 323 (64.86%) patients. In the SEER cohort, however, tumor size data were missing for 64071 (97.8%) cases. Each cohort contained 1 Tis case (Tab. 1). There were 257 (50.39%) patients with node metastasis in southern China and 296 (0.45%) in the SEER cohort. Patients without lymph node metastasis numbered 206 (40.39%) and 1145 (1.75%), respectively. However, lymph node status data were missing for many patients, 29269 (44.7%) cases (Tab. 1). Tumor stage differed significantly between southern China and the SEER database (P<0.001). Stage 2 was the most common in both, with 262 (51.37%) and 14936 (22.81%) cases, respectively, followed by stage 3, with 133 (26.08%) and 12993 (19.75%) cases. However, tumor stage was missing for about 29269 (44.7%) cases in the SEER database (Tab. 1). ER expression was recorded in both the southern China and SEER cohorts and differed between them (P<0.001). ER (+) accounted for the higher proportion in both, with 290 (56.86%) and 20233 (65.34%) cases, respectively, while ER (-) accounted for 178 (34.9%) and 5563 (17.96%) cases (Tab. 1). Similarly, PR expression differed significantly between the southern China and SEER cohorts (P<0.001). PR (+) was more common in both, with 270 (52.94%) and 179194 (55.55%) cases, and PR (-) accounted for 182 (35.69%) and 8110 (26.2%) cases (Tab. 1). HER2 expression also differed between the two groups (P<0.001).
HER2 (+) was the most common status in southern China, with 283 (55.49%) cases, but accounted for only 169 (9.46%) cases in the SEER cohort, where HER2 (-) predominated with 1530 (85.62%) cases (Tab. 1). KI-67 expression was available only for southern China, where 365 (71.57%) cases were KI-67 (+); among these, expression between 14% and 51.7% was the most common, with 170 (16.58%) cases. No KI-67 expression data were available in the SEER cohort (Tab. 1).
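As an illustration of how a between-cohort comparison of a categorical feature can be tested, the sketch below applies a Pearson chi-square test to the ER (+) and ER (-) counts reported above. It is a hedged example, not the authors' procedure: the text does not state which test was used, the function name `chi_square_2x2` is hypothetical, and cases with unknown ER status are left out here, which may differ from the original analysis.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the test statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: southern China, SEER; columns: ER (+), ER (-). Counts as reported in the text.
er_stat, er_p = chi_square_2x2(290, 178, 20233, 5563)
print(f"chi-square = {er_stat:.2f}, p < 0.001: {er_p < 0.001}")
```

A p-value this small is displayed as 0.000 by software that rounds to three decimals, and is conventionally reported as P<0.001.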
Survival and prognosis analysis
Between 2001 and 2016, 393 BC patients were followed up in southern China. In the SEER cohort, 39258 BC patients with complete follow-up information between 1975 and 2016 were analyzed. During follow-up in southern China, 59 patients died, 219 were alive and 115 relapsed. In the SEER cohort, 3613 patients died and 35645 were alive. DFS and OS were first compared across all included BC patients: DFS did not differ significantly between southern China and the SEER cohort (P=0.133), whereas OS differed significantly (P<0.001) and was higher in the SEER cohort (Fig. 2A-B). Second, because the southern China data covered only 2001 to 2016, a stratified analysis was used to compare DFS and OS in southern China and the SEER database over the same period. During the first 70 months of follow-up, DFS was higher in southern China than in the SEER cohort; thereafter, DFS was significantly higher in the SEER cohort (P=0.035). OS over this period also differed significantly (P<0.001) and was higher in the SEER cohort than in southern China (Fig. 2C-D). Finally, the SEER cohort was split into two periods, 1975 to 2000 and 2001 to 2016, and DFS and OS were calculated for each. In the SEER cohort, both DFS and OS were significantly higher in 2001 to 2016 than in 1975 to 2000 (P<0.001) (Fig. 2E-F).
We then analyzed and compared the influence of different clinicopathological features on the survival and prognosis of BC patients in southern China. Among age, tumor size, lymph node status, ER, PR, HER2, KI-67, surgery and radiotherapy, we found that tumor size, positive lymph node status and KI-67 expression significantly affected OS of BC patients in southern China (P=0.018, P<0.001 and P=0.034, respectively) (Supplementary Fig. 1). We next compared the effect of tumor size on survival between the cohorts. DFS and OS differed significantly between the SEER cohort and southern China when tumor size (T) was >2 cm (P=0.01 and P=0.04), but not when T was ≤2 cm (P=0.188 and P=0.604) (Fig. 3A-D). The effect of tumor size on survival was then analyzed within each cohort separately. In southern China, tumor size had little effect on DFS (P=0.487), but OS was significantly lower in the T>2 cm group than in the T≤2 cm group (P=0.012) (Fig. 3E-F). In the SEER cohort, DFS and OS were slightly lower in the T>2 cm group than in the T≤2 cm group, but the differences were not statistically significant (P=0.738 and P=0.299) (Fig. 3G-H). We then compared the effect of node status on survival between the cohorts. In node-positive patients, DFS and OS differed between southern China and the SEER cohort (P<0.001 and P=0.044); the same held for node-negative patients (P<0.001 for both). For either lymph node status, OS was higher in the SEER cohort than in southern China (Fig. 4A-D).
When southern China and the SEER cohort were analyzed separately, DFS and OS in southern China were lower in node-positive than in node-negative patients; the difference was statistically significant for OS (P<0.001) but not for DFS (P=0.448) (Fig. 4E-F). In the SEER cohort, DFS and OS were slightly higher in node-positive than in node-negative patients, but the differences were not statistically significant (P=0.226 and P=0.087) (Fig. 4G-H). We also analyzed the effect of KI-67 expression on survival in southern China. DFS and OS were both higher with KI-67 <14% than with KI-67 ≥14%, and the differences were statistically significant (P=0.05 and P=0.034) (Fig. 5A-B). Univariate and multivariate analyses of southern China and the SEER cohort were performed by Cox regression. In the univariate analysis of DFS, T>2 cm, positive node status, ER (+), PR (+), HER2 (+), surgery and radiation showed no statistically significant association with increased risk of death. The hazard ratio (HR) for the KI-67 high-expression group was 1.376 (95% CI: 1.000-1.894, P=0.050). However, in the multivariate analysis of DFS, all the included clinicopathological features were statistically significant (Tab. 2).
Univariate and multivariate analyses of southern China were performed by Cox regression. In the univariate analysis of OS, T>2 cm (HR 3.406, 95% CI: 1.232-9.417, P=0.018), positive node status (HR 0.308, 95% CI: 0.169-0.564, P<0.001) and KI-67 high expression (HR 2.128, 95% CI: 1.057-4.285, P=0.034) were significantly associated with increased risk of death. In the multivariate analysis of OS, positive node status was significantly associated with increased risk of disease recurrence or death (HR 0.226, 95% CI: 0.098-0.519, P<0.001) (Tab. 3).
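The hazard ratios, confidence intervals and P values above are linked through the standard Wald approximation: the 95% CI is symmetric around log(HR), so the standard error and a two-sided P value can be recovered from the interval. The sketch below is a consistency check under that assumption; the function name `wald_p_from_hr` is hypothetical, and the original software may have used a different test.

```python
import math

def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the SE of log(HR), the z score and a two-sided Wald p-value from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided p-value for a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value

# Univariate OS results for southern China, as reported in the text.
reported = {
    "T>2cm": (3.406, 1.232, 9.417),
    "KI-67 high expression": (2.128, 1.057, 4.285),
}
results = {name: wald_p_from_hr(*vals) for name, vals in reported.items()}
for name, (se, z, p) in results.items():
    print(f"{name}: SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Under this assumption, the recovered P values for T>2cm and KI-67 high expression round to the reported 0.018 and 0.034.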
Comparison of treatment
Treatments differed between southern China and the SEER cohort (Tab. 4). In southern China, 389 (97.01%) patients received chemotherapy, whereas in the SEER cohort 49751 (75.92%) patients did not receive chemotherapy (P<0.001). Surgical treatment also differed markedly: 387 (95.09%) patients in southern China underwent mastectomy (including simple resection and modified radical operation), whereas in the SEER cohort 53584 (81.76%) patients did not undergo mastectomy and only 11951 (18.24%) had breast surgery (the specific operation method is not clear). Most patients in both cohorts did not receive radiotherapy, 351 (72.82%) in southern China and 43107 (65.78%) in the SEER cohort, although the difference between the cohorts was statistically significant (P=0.001).
Changes in morbidity over the years
We further analyzed the age distribution of BC patients in the SEER cohort by decade. Except in the 1990s, the proportion of BC patients in the young adult group (<40 years) increased gradually over time: 6% in the 1970s, 6.31% in the 1980s, 5.76% in the 1990s and 7.62% in the 2000s, with median ages of 35, 36, 36 and 36 years, respectively. Among all age groups, the middle-aged group (40-70 years) had the highest incidence, at 63.89%, 61.74%, 62.67% and 66.42%, with median ages of 57, 56, 56 and 56 years. In addition, the morbidity of the aged+ group (>70 years) showed an overall decline over time, at 30.11%, 31.95%, 31.57% and 25.96%, with median ages of 78, 78, 78 and 77 years, respectively (Fig. 6).