Investigating the role of core needle biopsy in evaluating tumor-stroma ratio (TSR) of invasive breast cancer: a retrospective study

Tumor-stroma ratio (TSR) of invasive breast carcinoma has gained attention in recent years due to its prognostic significance. Previous studies showed TSR is a potential biomarker for indicating the tumor response to neoadjuvant chemotherapy. However, it is not clear how well TSR evaluation in biopsy specimens might reflect the TSR in resection specimens. We conducted a study to investigate whether biopsy evaluation of TSR can be an alternative method. We collected cases with invasive breast carcinoma of no special type (IBC-NST) from University of Yamanashi hospital between 2011 and 2017 whose biopsy and resection specimens both had a pathological diagnosis of IBC-NST (n = 146). We conceptualized a method for evaluating TSR in biopsy specimens within a preliminary cohort (n = 50). Within the studied cohort (n = 96), biopsy-based TSR (b-TSR) and resection-based TSR (r-TSR) were scored by two pathologists. We then evaluated our method’s validity and performance by measuring interobserver variability between the two pathologists, Spearman’s correlation between b-TSR and r-TSR, and receiver operating characteristic (ROC) analysis for defining stroma-rich and stroma-poor tumors. The intraclass correlation coefficient between the two pathologists was 0.59. The correlation coefficients between b-TSR and r-TSR in the two pathologists were 0.45 and 0.37. The ROC areas under the curve were 0.7 and 0.67. By considering an r-TSR of < 50% as stroma-rich, the sensitivity and specificity of detecting stroma-rich tumors were 64.1% and 66.7%, respectively, when b-TSR was < 40%. Our current b-TSR evaluation method can provide information about r-TSR and facilitate pre-treatment therapy follow-up.


Introduction
Tumor-stroma ratio (TSR) is a prognostic factor of invasive breast carcinoma, especially triple-negative breast carcinoma. Many studies showed that stroma-rich tumors had a worse prognosis compared to stroma-poor tumors [1,2]. This prognostic significance is more obvious in triple-negative breast carcinoma. A previous study even proposed a new tumor-node-metastasis (TNM) classification that integrates the concept of TSR into the TNM categories of breast carcinoma [3]. The suggested classification achieved a better prognostic stratification of breast cancer. Moreover, TSR can be a potential biomarker for evaluating the tumor response to neoadjuvant chemotherapy [4,5]. Therefore, there may be a need for pre-operative TSR evaluation.
Minh-Khang Le and Toru Odate have contributed equally to this work.

The prognostic value of TSR in breast carcinoma is well-established in the literature, and the method of choosing the 10X microscopic field to evaluate has been well described across studies [2,6,7]. In general, microscopic fields with tumor cells present at all four quadrants of their border (north, east, west, and south) meet the criteria for TSR evaluation. TSR of breast carcinoma is measured as the percentage of tumor-cell occupied area relative to stroma-occupied area in the most stroma-abundant regions within the tumor under a 10X objective lens. With this methodology, TSR is mainly evaluated on resected specimens. By contrast, the ability of core needle biopsy (CNB) to reflect the TSR, which is conventionally evaluated on resection specimens, in breast carcinoma has not yet been studied. Moreover, CNB is a crucial procedure for preoperative pathological diagnosis of breast tumors [8]. Therefore, it is necessary to examine how CNB predicts TSR of breast carcinoma.
We aimed to investigate the utility of CNB specimens for predicting TSR of invasive breast carcinoma, no special type (IBC-NST). By examining the correlation between biopsy-based TSR (b-TSR) and corresponding resection-based TSR (r-TSR), we illustrated how well CNB specimens can be used to evaluate TSR of breast carcinoma. Next, we assessed the performance of b-TSR to delineate stroma-rich and stroma-poor breast tumors. Finally, we analyzed the histopathological pitfalls that can potentially reduce the validity of the b-TSR evaluation.

Case collection
We accessed the pathology archive of the University of Yamanashi Hospital and retrieved tumor cases with the diagnosis of breast carcinoma during the period from 2011 to 2017. Next, we manually selected cases based on the following inclusion criteria: (1) cases with both biopsy and corresponding resection specimens of the same tumor; (2) the pathologic diagnoses from biopsy and resection were both invasive breast carcinoma of no special type; and (3) the patients received no neoadjuvant radiation or chemotherapy. In cases with multiple biopsies, we selected the most recent specimen to evaluate b-TSR. If one patient had bilateral tumors, we treated such tumors as two cases. For each case, we selected a representative histopathologic resection slide consisting of invasive components. The Institutional Review Board of the University of Yamanashi approved this study (approval number 2547).
First, all the histologic slides of resected specimens were re-examined by the two pathologists (MKL, TO) to confirm the previous diagnoses. Next, the r-TSR in resection slides was double scored by the same pathologists. The third observer (NO) was consulted if a consensus could not be reached. We then excluded cases that were misdiagnosed and inadequate for r-TSR evaluation. All the corresponding biopsy slides were covered and randomized by another author (MK). The pathologists previously evaluating r-TSR then independently scored the b-TSR of each biopsy slide.

r-TSR evaluation of resection specimens
We evaluated r-TSR on resection slides based on previous studies [6,7]. First, the entire slide was evaluated in the direction toward the most stroma-rich field under low power of our light microscope. Second, the invasive tumoral region with the most abundant stroma was marked, using the 10X objective lens. The invasive tumor cells must be present on all sides (north, east, west, and south) within the selected fields. Finally, TSR (recorded as percentage) was estimated in 10% increments in the selected fields. We categorized tumors with TSR ≥ 50% as stroma-poor and tumors with TSR < 50% as stroma-rich [9].
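The categorization rule above can be written as a small helper. This is only an illustrative sketch: the function name `classify_tsr` is hypothetical and does not come from the study, which scored slides visually rather than in software.

```python
def classify_tsr(tsr_percent: float) -> str:
    """Categorize a tumor from its TSR (percentage of tumor-cell area).

    Follows the cut-off stated in the text: TSR >= 50% is stroma-poor,
    TSR < 50% is stroma-rich. The function name is an assumption.
    """
    if not 0 <= tsr_percent <= 100:
        raise ValueError("TSR must be a percentage between 0 and 100")
    return "stroma-poor" if tsr_percent >= 50 else "stroma-rich"
```

Note that a lower TSR means more stroma, so a score of 40% is stroma-rich while a score of exactly 50% is stroma-poor.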

b-TSR evaluation of biopsy specimens
To adapt to the biopsy context, we conceptualized a method that is relevant to the conventional TSR evaluation in resection specimens (Fig. 1). We established and modified the method by evaluating 50 preliminary cases multiple times. The method includes a few ordered steps to evaluate b-TSR in corresponding biopsy specimens, described as follows:
Placement of the evaluation field. To evaluate b-TSR, the visual field was adjusted in such a way that the greatest length of the biopsy specimen was achieved within a 10X power field (Fig. 1A).
Selecting the field of evaluation. First, the most abundant tumoral stroma areas in the specimen were noted. The evaluator then selected fields such that the invasive tumor cells (not the in situ component) were present at two ends of the biopsy (Fig. 1B, east/west). Meanwhile, tumor cells must also be (1) present at both side margins of the specimen (Fig. 1B, Left) and/or (2) distributed throughout the specimen within the field (Fig. 1B, Right). The evaluator may select multiple fields and choose the most stroma-abundant field after b-TSR estimation.
b-TSR estimation. Using a 10X power field, we estimated the b-TSR as the percentage of the specimen area occupied by tumor cells within the evaluation field: b-TSR (%) = (Area occupied by invasive tumor cells)/ (Area of the biopsy specimen) × 100.
Finally, b-TSR was evaluated in 10% increments within the selected field. Fat tissue and blood vessels were considered stromal areas. Ductal carcinoma in situ (DCIS) components were not included in either the numerator or denominator of the equation. We avoided tumor necrosis in the selected field, except for comedo necrosis which is considered a part of DCIS. We used Olympus BX50 microscopy with a 10X objective field diameter of 2.2 mm for all the steps. The CNB needle size was 16G with a 1.2 mm inner diameter and 1.65 mm outer diameter.
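As an illustration of the b-TSR formula and the 10% scoring increments, the sketch below computes b-TSR from two area measurements. In the study the percentage was estimated visually, so the numeric area inputs, the function name `b_tsr_percent`, and the round-half-up rule are all assumptions made for the example.

```python
import math


def b_tsr_percent(tumor_area: float, specimen_area: float) -> int:
    """Return b-TSR (%) rounded to the nearest 10% increment.

    tumor_area: area occupied by invasive tumor cells in the field.
    specimen_area: area of the biopsy specimen in the field.
    Both areas are assumed to already exclude DCIS components, which the
    text leaves out of the numerator and the denominator. Fat tissue and
    blood vessels count as stroma, so they stay in specimen_area only.
    """
    if specimen_area <= 0:
        raise ValueError("specimen_area must be positive")
    if not 0 <= tumor_area <= specimen_area:
        raise ValueError("tumor_area must lie between 0 and specimen_area")
    raw = 100 * tumor_area / specimen_area
    # Round half up to the nearest 10% (assumed convention; Python's
    # built-in round() would round halves to the nearest even number).
    return int(math.floor(raw / 10 + 0.5)) * 10
```

For example, a field in which invasive tumor cells cover one quarter of the specimen area has a raw value of 25% and is scored as 30% under this rounding assumption.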

Statistical analysis
Intraclass correlation coefficient (ICC) was calculated to evaluate interobserver variability between the two pathologists. ICC values < 0.5, 0.5-0.75, 0.75-0.9, and > 0.9 were considered as poor, moderate, good, and excellent reliability, respectively [10]. The Spearman and Kappa correlation coefficients were used to evaluate the correlation between r-TSR and b-TSR. We performed the receiver operating characteristics (ROC) analysis to evaluate the ability of the b-TSR evaluation method to categorize the tumors into stroma-rich and stroma-poor categories. The area under the curve (AUC) was calculated for the result of each pathologist. The 95% confidence interval (95%CI) of sensitivity and specificity were calculated, using 1000 bootstrap samples. The methodology of the present study was similar to our previous study [11].
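The ICC reliability bands and the bootstrap confidence intervals can be sketched as follows. The study used R; this Python version is only an illustration. The percentile method, the handling of values falling exactly on a band boundary, and all function names are assumptions, since the text specifies only the bands and the use of 1000 bootstrap samples.

```python
import random


def icc_reliability(icc: float) -> str:
    """Map an ICC value to the reliability bands quoted in the text.

    Values exactly on a boundary (0.5, 0.75, 0.9) are assigned by an
    assumed convention; the text does not specify them.
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


def sensitivity(truth, pred):
    """Fraction of truly stroma-rich cases predicted stroma-rich."""
    hits = [p for t, p in zip(truth, pred) if t]
    return sum(hits) / len(hits)


def specificity(truth, pred):
    """Fraction of truly stroma-poor cases predicted stroma-poor."""
    rejections = [not p for t, p in zip(truth, pred) if not t]
    return sum(rejections) / len(rejections)


def bootstrap_ci(truth, pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI of a metric (assumed method).

    truth/pred are equal-length lists of booleans (True = stroma-rich).
    Resamples that lack either class are redrawn so both metrics are defined.
    """
    if not any(truth) or all(truth):
        raise ValueError("truth must contain both classes")
    rng = random.Random(seed)
    n = len(truth)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        t = [truth[i] for i in idx]
        p = [pred[i] for i in idx]
        if not any(t) or all(t):
            continue
        stats.append(metric(t, p))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

With these bands, the ICC of 0.59 reported in this study falls in the moderate range.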
The descriptive statistics were mean, standard deviation (for continuous variables), and number of cases (for categorical variables). We performed two-sample t-tests, Wilcoxon's tests, and Chi-squared tests to compare normally distributed variables, non-normally distributed variables, and categorical variables, respectively. All the analyses were performed using R software version 4.1.1 (the R Foundation, Vienna, Austria). A p-value less than 0.05 was considered statistically significant.
Figure 2 summarizes the study design and methodology. Based on the inclusion criteria, we collected 173 tumors. By reviewing histopathological slides of resection specimens, we then excluded 27 tumors that were morphologically unclear and could not be evaluated. Examples of such cases included misdiagnosis, invasive carcinoma with mixed type, and tumor with abundant necrosis with almost no viable cells. A final cohort of 146 tumors remained for r-TSR evaluation. We used 50 of the tumors as the preliminary cohort to modify and optimize the evaluation procedure. Subsequently, the remaining 96 tumors underwent consensus-based r-TSR evaluation and separate b-TSR assessment by the two pathologists. In this cohort, 4 tumors originated from two patients with bilateral tumors.

Evaluation of r-TSR
Two main evaluators simultaneously measured r-TSR in the same fields under a multi-head light microscope. The individual results usually differed by 10%, and agreement was achieved in all the cases. In cases of non-consensus (n = 0), the third pathologist was to be consulted (not at the same time) with the same evaluating field. Although this experimental design may lead to bias in field selection, it reduces interobserver variation within each specimen. The r-TSR distribution of 96 cases is shown in Fig. 3A. The average r-TSR in our cohort was 40.8% (± 20.3%).

Evaluation of b-TSR
The two pathologists separately evaluated the b-TSR scores of 96 biopsy specimens, providing each biopsy specimen with two results. Figure 1C (left) illustrates a candidate field that met our criteria of evaluation. There were a total of 5 specimens that could not be fully evaluated (Fig. 1C, right). These included 3 specimens that neither pathologist could evaluate due to the lack of tumor regions that fulfilled our criteria for field selection, and each pathologist could not assess 1 additional case. Therefore, each of the two pathologists successfully evaluated 92 cases. The b-TSR distribution from the 2 pathologists' combined results is shown in Fig. 3B. Values of the b-TSR histogram originate from the raw results of each pathologist (n = 184). Interobserver variability, as determined by the ICC, was 0.59 (95%CI 0.44-0.71; p < 0.001), indicating moderate reliability between the two pathologists.
Fig. 1 A Placement of the evaluation field. The slide is adjusted to place the greatest length of the biopsy specimen within the field. B Selecting evaluation fields. Invasive tumor cells should always be present at both specimen ends and either at side margins (left) or throughout the specimen (right). C Real-world representation illustrating the appropriate field (left) and a sample not fitting the criteria and unable to be evaluated (right). Tumor cells are present in small clusters far away and/or the specimen is fragmented
Fig. 2 The flow chart showing the design of this study. Overall, there were 96 tumors included in the main analyses. Among them, 91 tumors were successfully evaluated by both pathologists. These cases were used for analyses of interobserver variation, biopsy-resection correlation, and mismatch analysis. There were 3 cases unable to be evaluated by both pathologists while two cases were assessed by only 1 pathologist. Finally, 184 biopsy-resection pairs (b-TSR and r-TSR of each specimen measured by one pathologist comprise one pair) were used for receiver operating characteristic (ROC) analysis
Fig. 5 […] The corresponding biopsy specimen is likely to be taken from the cellular areas (B), which results in higher b-TSR. C and D show the opposite situation in which r-TSR > b-TSR. There is a large sclerotic stroma in the resection specimen that cannot be captured by 10X fields (C, upper right). The adjacent highly cellular areas (C, lower left) are the only regions fulfilling the criteria (C, dashed circle). The biopsy, on the other hand, might be taken from the border between cellular and acellular areas, resulting in a lower b-TSR (D)
The ROC analyses of the two pathologists' results had AUCs of 0.7 (95%CI 0.58-0.82) and 0.67 (95%CI 0.55-0.79) (Fig. 4A, B). To find the optimal cut-off point of b-TSR to differentiate stroma-rich and stroma-poor tumors, we calculated Youden Indexes [13] across all the b-TSR values from 0 to 100% in increments of 10% (Table 1). The highest index indicates the optimal threshold. For this analysis, we paired each pathologist's b-TSR result for a tumor with the tumor's r-TSR, yielding 184 pairs of b-TSR/r-TSR values. That is, one r-TSR score would be paired with pathologist 1's b-TSR and pathologist 2's b-TSR. If we selected the cut-off value for determining stroma-rich tumors as b-TSR < 40% (the highest Youden Index), the sensitivity and specificity of our method in detecting stroma-rich tumors were 64.5% (95%CI 54.3%-74.2%) and 66.7% (95%CI 52.1%-80.0%), respectively. The kappa correlation coefficient between b-TSR and r-TSR methods is 0.28 (± 0.07). Table 2 shows the confusion matrix of the biopsy evaluation method with the cut-off b-TSR of < 40%.
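The cut-off search described above can be sketched as follows, using the standard definition of the Youden Index, J = sensitivity + specificity - 1. The function names and the toy pairs are hypothetical and are not data from this study; the reference standard is r-TSR < 50% (stroma-rich), and a biopsy is called stroma-rich when its b-TSR falls below the candidate cut-off.

```python
def youden_table(pairs, r_cutoff=50, cutoffs=range(0, 101, 10)):
    """Return (cutoff, sensitivity, specificity, J) for each b-TSR cut-off.

    pairs: iterable of (b_tsr, r_tsr) percentages, one pair per
    pathologist and tumor. Truth is r_tsr < r_cutoff (stroma-rich);
    the prediction is b_tsr < cutoff.
    """
    pairs = list(pairs)
    n_rich = sum(1 for _, r in pairs if r < r_cutoff)
    n_poor = len(pairs) - n_rich
    if n_rich == 0 or n_poor == 0:
        raise ValueError("pairs must contain both stroma-rich and stroma-poor tumors")
    rows = []
    for c in cutoffs:
        tp = sum(1 for b, r in pairs if r < r_cutoff and b < c)
        tn = sum(1 for b, r in pairs if r >= r_cutoff and b >= c)
        sens = tp / n_rich
        spec = tn / n_poor
        rows.append((c, sens, spec, sens + spec - 1))
    return rows


def best_cutoff(rows):
    """Pick the row with the highest Youden Index (first one on ties)."""
    return max(rows, key=lambda row: row[3])


# Toy (b-TSR, r-TSR) pairs, invented purely for illustration.
toy_pairs = [(20, 30), (30, 40), (50, 30), (60, 70), (30, 80), (70, 60)]
toy_best = best_cutoff(youden_table(toy_pairs))
```

In the toy data the search selects a cut-off of 60%, which is unrelated to the 40% threshold found in the study cohort.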

Comparison of tumor characteristics between concordant and mismatch cases
To investigate the cause of biopsy-resection TSR mismatch, we compared tumor characteristics of mismatch and concordant cases. Given the subjectivity of the evaluation method and stromal heterogeneity of the tumor, we defined a mismatch case as one in which both pathologists' b-TSR scores differed from the corresponding r-TSR by > 20%. Therefore, we did not include cases with fewer than two b-TSR scores (n = 5). The total number of cases in this analysis was n = 91. The biopsy core length, tumor size, and the biopsy to resection time interval were reported. Table 3 presents the comparison of the two cohorts. Patients in mismatch cases were significantly younger than those in concordant cases (p = 0.036) while r-TSR in mismatch cases was higher than in concordant cases (p < 0.001). To investigate whether age and r-TSR were related, we performed Spearman's correlation analysis, showing that age was negatively associated with r-TSR (rho = −0.34; 95%CI −0.52 to −0.14; p < 0.001). There were no differences in biopsy to resection time intervals (p = 0.439), single-core biopsy length (p = 0.856), total biopsy length (p = 1), or tumor size (p = 0.856). Other tumor characteristics such as Ki67, ER, PR, and HER2 status also showed no statistically significant differences. In the entire cohort, we found 5 triple-negative IBC-NST, all of which belonged to the concordant cohort.
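The mismatch definition can be expressed as a short predicate. This is a sketch only; the function name `is_mismatch` and the use of `None` for a missing score are assumptions made for illustration.

```python
from typing import Optional


def is_mismatch(b_tsr_1: Optional[float],
                b_tsr_2: Optional[float],
                r_tsr: float,
                tolerance: float = 20) -> Optional[bool]:
    """Apply the mismatch definition from the text.

    A case is a mismatch only if BOTH pathologists' b-TSR scores differ
    from r-TSR by more than `tolerance` percentage points. Cases with
    fewer than two b-TSR scores are excluded, signalled here by None.
    """
    if b_tsr_1 is None or b_tsr_2 is None:
        return None
    return (abs(b_tsr_1 - r_tsr) > tolerance
            and abs(b_tsr_2 - r_tsr) > tolerance)
```

Under this rule, a tumor with r-TSR of 60% and b-TSR scores of 20% and 50% is concordant, because only one of the two scores lies more than 20 points from the r-TSR.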

Histopathological investigation of mismatch cases
We re-examined mismatch cases to find the causes of biopsy-resection TSR differences. Of the 12 cases, 10 had higher r-TSRs and 2 cases had lower r-TSRs compared to the corresponding b-TSR. In mismatch cases with lower r-TSR than b-TSR, we found significant stromal component heterogeneity in resection specimens, where the stroma-rich components occupied small areas. When the resection specimens had only small areas of highly stromal regions, these regions could be chosen by the pathologists and evaluated within a full 10X visual field such that the criterion of having tumor cells along each border quadrant could be achieved (Fig. 5A, dashed circle) while very cellular areas would likely be captured by the biopsy specimen (Fig. 5A, dashed lines and Fig. 5B). In contrast, in the resection slides of mismatch cases with higher r-TSR than b-TSR, we found sclerotic areas that were larger than a single 10X field. Although there could be small clusters of invasive tumor cells at the border between cellular and sclerotic areas, they were not situated along each border quadrant of the 10X field. Therefore, that criterion could not be met for r-TSR evaluation of that area (Fig. 5C). On the other hand, such small clusters could fulfill the criteria of b-TSR evaluation (Fig. 5C, dashed lines and 5D). Although we did not encounter fibrotic foci in our mismatch cases, we believe they should be listed as a concern for biopsy-resection discordance because biopsy specimens cannot capture an entire fibrotic focus (Fig. 6A, B), and, thus, can lead to a 10X field that cannot satisfy our criteria for b-TSR evaluation (Fig. 6B).

Discussion
The stromal heterogeneity of invasive breast carcinoma may stem from the different histopathological patterns and tumoral architectures within different regions of the tumor. This phenomenon, in turn, can be explained by the clonal evolution of the tumor cells, leading to different morphologies and biological behaviors [14,15]. The regionally dynamic relationship between tumor cells and tumor stroma may also reflect the localization of biological interactions between cancer cells and normal cells [16], which further […] [17]. By examining TSR of colon carcinoma, the authors divided these tumors into "stroma-rich" and "stroma-poor" groups by a cut-off TSR value of 50% and illustrated a profound prognostic significance between the two groups. Since its first appearance, the use of TSR has gained popularity with many subsequent studies illustrating its effect on survival outcomes [18,19]. The application of TSR concept in tumor prognostication has spread to other solid tumors such as ovarian carcinoma, cervical squamous cell carcinoma, hepatocellular carcinoma, non-small cell lung cancer, nasopharyngeal squamous cell carcinoma, pancreatic ductal adenocarcinoma, and invasive bladder cancer [20-22]. While most types of cancer show inferior survivorship in stroma-rich tumors, pancreatic ductal adenocarcinoma may express the opposite trend, possibly due to the complicated role of cancer-associated fibroblasts in these tumors [23,24]. Interestingly, some authors believed that incorporating TSR into the TNM classification system can improve our current way of stratifying invasive breast carcinoma, gastric adenocarcinoma, and esophageal [3,25,26]. Moreover, the roles of TSR can also be observed in evaluating the response of some tumors with neoadjuvant therapy [4,27,28]. Therefore, pre-operative evaluation of TSR from biopsy specimens is important.
Given that the procedure of TSR evaluation requires a full 10X objective field, TSR assessment can feasibly be performed in biopsy of mucosal tumors [29,30]. CNB specimen, on the other hand, may not be appropriate for conventional r-TSR evaluation because of its thread-like shape under 10X microscopic fields. This situation prompted us to conceptualize a b-TSR evaluation method and examine the relationship between b-TSR and r-TSR.
In the present study, we illustrated that b-TSR correlated moderately with r-TSR, which can be informative for future TSR studies of breast carcinoma. However, we encountered difficulties in establishing an appropriate method to predict r-TSR from b-TSR preoperatively. The poor performance potentially originated from different causes. First, inadequate sampling by CNB can reduce the tumor area to be evaluated. This can present as a short biopsy core length. Our results showed no differences in single-core and total biopsy length between concordant and mismatch cases. Second, tumor size may cause a substantial mismatch between b-TSR and r-TSR. On the one hand, minimal tumor size may lead to inadequate sampling of tumor tissue or inappropriate sampling of the normal bystander tissue. On the other hand, larger tumor size leads to higher stromal heterogeneity, which is difficult to capture with small biopsy specimens. Our study illustrated no significant difference in tumor size between concordant and mismatch cases. Third, since tumors progress over time, it is possible that the tumor architecture, and thus the TSR, changes by the time of surgical resection if surgery is delayed. Therefore, the biopsy to resection time interval can lead to biopsy-resection TSR mismatch. We also found no relationship between the biopsy to resection time interval and TSR mismatch. Since the present study is retrospective, we could not strictly control the purpose of previous biopsies and eliminate the associated confounders. Consequently, b-TSR alterations over time across multiple biopsies were not reliable and were not assessed. Finally, stromal components can be quite different between regions due to their heterogeneous nature. For this reason, it is difficult to adopt the TSR concept in small biopsy specimens with minute needle diameters.
When we re-evaluated the histopathologic features of mismatch cases, we found that the origins of the biopsy-resection TSR disagreements were mainly related to the inability of biopsy specimens to capture the whole tumor-stroma landscape. Interestingly, our study showed that b-TSR was slightly lower than r-TSR, which was counter-intuitive given that the larger region for r-TSR evaluation should lead to a higher probability of encountering a low-TSR area. It is possible that this paradox was caused by our methodology of b-TSR evaluation. This phenomenon also explained the higher proportion of tumors with high TSR in the mismatch cases. We also found that younger age was associated with higher TSR, which was consistent with a previous study [31]. This relationship (age and TSR) explains the younger age in the b-TSR mismatch cohort in our analysis since these cases had significantly higher r-TSR. In the present study, although all cases of triple-negative IBC-NST belonged to the biopsy-resection TSR concordant group, we believe that this was a result of sampling error and the high concordance in these tumors was not truly significant.
Another interesting histopathological feature that complicates the situation of biopsy-resection TSR mismatch is the presence of fibrotic foci (FF). FF and TSR are somewhat overlapping concepts, but they are not identical [32]. The r-TSR requires 10X field assessment while FF, which presents as a scar-like structure surrounded by invasive ductal carcinomatous cells, is determined by the evaluation of its whole architecture without a rigid criterion. Small tumor nests can be sporadically encountered within FFs and may or may not be sufficient to create a 10X field fulfilling the criteria of r-TSR evaluation. Since the FF size can be larger than 5 mm, a core-needle biopsy with a small needle diameter is unlikely to capture the field within FF that can meet our criteria. Specifically, the biopsy needle cannot reach the tumor cells at FF borders toward the side margin directions (Fig. 5), leading to a biopsy field that cannot meet the b-TSR evaluation criteria. The focus of the evaluators, therefore, shifts to other biopsy fields with more tumor cells instead. By contrast, finding an optimal 10X field within FFs for r-TSR evaluation is possible in a large resection sample given its larger evaluation diameter in all directions.
Fig. 6 An illustration of the fibrotic foci, a potential histopathological pitfall of our method. The biopsy cannot capture the structure of fibrotic foci (A) while a full 10X objective field can visualize the fibrotic foci with a visual field meeting the criteria of r-TSR evaluation (B)
Since TSR is a potential indicator to estimate the response of breast carcinoma to neoadjuvant chemotherapy [4,5,33], it is reasonable to attempt predicting r-TSR pre-operatively. Although we were unable to establish a method to rigorously predict r-TSR from CNB, we found a positive correlation between r-TSR and b-TSR. It would be interesting to examine further whether b-TSR itself can be a potential marker for neoadjuvant therapy [5]. However, our study was not designed to collect and control for such information. We believe that b-TSR can still provide useful information to estimate r-TSR pre-operatively. To fully exploit this information, deep learning techniques to objectively calculate b-TSR from biopsy specimens may be a good option to improve the accuracy [34]. An alternative approach is to incorporate patients' clinical features and imaging parameters to render a better pre-operative estimation of r-TSR. Previous studies showed that imaging features can also provide information in predicting r-TSR [35-37]. Therefore, a comprehensive investigation of different clinicopathological parameters may yield a good prediction of r-TSR, which is a potential marker for neoadjuvant chemotherapy [4]. For example, a multivariable logistic regression model can be used to construct a clinical predictive table of low and high TSR [38,39], yielding clinical scores, while a deep learning model can take images from imaging studies and histopathological biopsy virtual slides as input to provide imaging scores and pathological scores [34]. A final combining meta-learner model can then be trained to integrate these scores to provide the final results. Finally, it is also plausible to pre-operatively assess the overall stromal landscape by b-TSR and tumor-infiltrating lymphocytes (TIL) simultaneously, which can be of therapeutic and prognostic value [11,40].

Conclusion
Our study showed that b-TSR can provide useful information about r-TSR of the corresponding resection specimens. Pre-treatment b-TSR measurement can be a good r-TSR estimator with careful consideration of tumor characteristics such as multifocality, size, and imaging. Finally, a more comprehensive pre-operative analytic model combining clinicopathological features, wherein b-TSR can be a component, is required for a more precise r-TSR estimation.