Women with TNBC exhibit significantly worse 5-year survival rates than those with non-TNBC, regardless of the tumor stage at diagnosis (69). No targeted or endocrine therapy is available for TNBC, and NAC is the cornerstone of treatment. However, only 30–40% of TNBC patients achieve pCR with NAC, and there is a dire need for early identification of the remaining 60–70% of patients, who should be offered alternative regimens to improve treatment outcomes. In this study, we used ML approaches to predict NAC response and stratify patients into NAC responders and non-responders based on H&E-stained WSIs of tissue biopsies. We developed a two-step prediction model: in the first step, the histology class of each H&E image tile was determined using a tile-level classification pipeline; in the second step, spatial graph-derived features associated with histology class pairs were used to predict patient-level NAC response (pCR vs RD). Our model uncovers and leverages novel features predictive of NAC response, together with spatial patterns of TME histology components, from WSIs of TNBC tissue biopsies. This study also highlights the role of various TME components in accurately predicting NAC response.
TME components and their interactions can influence NAC response in patients with TNBC (70–72). Traditional methods using human annotations are unable to capture these spatial relationships. In contrast, our approach incorporates the spatial relationships of various TME components to predict NAC response in patients with TNBC. Using a graph structure for spatial TME characterization, we identified eight histology component pairs that accurately predicted NAC response. We expect that an investigation of higher-order combinations (e.g., ternary and quaternary) can further increase NAC response prediction accuracy. The top three TME features captured the spatial interactions between (1) tumor cells and tTILs, (2) stroma and sTILs, and (3) tTILs and PGCCs. Studies have shown the predictive importance of tumor area, immune activation markers, and TILs in TNBC biopsies (73–76). Our results provide further evidence that the interrelationships among TILs, stroma, adipocytes, and tumor cells can predict NAC response in patients with TNBC. Other recently published studies have also relied on WSI models (77, 78). One used a federated learning model to predict NAC response in TNBC and found that hemorrhage, TILs, and necrosis were predictive of pCR, whereas apocrine change, fibrosis, and noncohesive tumor cells were predictive of RD (77). Another quantified stromal and tumor features in a WSI-based multi-omic (WSI, clinical, pathological) ML model and found that high collagenous stroma was most strongly associated with lower pCR rates (78). Our study used expert annotations that effectively guided the ML models to identify specific histological patterns in spatial TME contexts. While our supervised ML model identified TILs as a histological component in common with these studies, it did not rank hemorrhage, necrosis, fibrosis, or apocrine change as important predictors because annotated training data for these classes were lacking.
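To make the idea of class-pair spatial features concrete, the following is a minimal sketch, not the pipeline used in this study. It assumes that tile-level classification has already produced a grid of class labels, that tiles are connected to their eight grid neighbors, and that each feature is the fraction of graph edges joining a given pair of histology classes. The function name `pair_edge_fractions`, the class list, and the choice of edge fraction as the feature are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Hypothetical subset of class names; the study's actual label set is larger.
CLASSES = ("tumor", "stroma", "tTIL", "sTIL")


def pair_edge_fractions(grid, classes=CLASSES):
    """Return the fraction of adjacent-tile edges joining each class pair.

    grid: list of rows; each entry is a class name, or None for a
    background tile. Adjacency is 8-connectivity on the tile grid,
    which is an assumption of this sketch.
    """
    # These four offsets visit every undirected neighbor relation once.
    offsets = ((0, 1), (1, -1), (1, 0), (1, 1))
    counts = Counter()
    n_rows = len(grid)
    for r, row in enumerate(grid):
        for c, a in enumerate(row):
            if a is None:
                continue
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < len(grid[rr]):
                    b = grid[rr][cc]
                    if b is not None:
                        # Sort so (a, b) and (b, a) share one key.
                        counts[tuple(sorted((a, b)))] += 1
    total = sum(counts.values())
    pairs = [tuple(sorted(p)) for p in combinations_with_replacement(classes, 2)]
    return {p: (counts[p] / total if total else 0.0) for p in pairs}


# Toy 2 x 2 slide: every tile touches every other tile.
toy_grid = [["tumor", "tTIL"], ["stroma", "sTIL"]]
features = pair_edge_fractions(toy_grid)
```

In a full workflow, a per-patient feature vector of this kind would be passed to a patient-level classifier (pCR vs RD); that step is omitted here.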
Our NAC response prediction pipeline provides classification accuracy and attention maps that can be highly useful in clinical practice. Attention maps help pathologists and researchers by identifying tissue regions in a WSI that are highly predictive of NAC response, thereby improving slide review, reducing visual fatigue, and facilitating image data interpretation. Information from attention maps can be combined with other WSI-derived data, such as Ki67 and pH3 immunohistochemistry-stained serial tissue sections, to train deep-learning models for enhanced prediction (79–81). Ki67 and pH3 are clinical biomarkers with demonstrated predictive value for NAC response in TNBC tumors (82, 83). Furthermore, our predictive model is well suited to integrating data from various sources, such as electronic health records, laboratory test results, and demographic information, to provide predictions based on a more complete view of each patient's health status.
Limitations of the study include the small sample size, slide quality issues, and computationally expensive processing. Quality checks are necessary to ensure that enough adequate samples are included to train effective classifiers. Differences in slide staining protocols, artifacts, and plating variations across institutions (e.g., cutting glass slide edges) may have resulted in inconsistent slide quality. Thus, although we began with a larger number of WSIs, the final validation cohort was reduced. Because the sample size was small, histology classes were unevenly represented across patient slides. More histology classes (e.g., microcalcification, muscle) should be included to improve the training of the tile-level histology classifier across all histology classes. Two pathologists independently annotated the WSIs; however, more experts can be included in the future to validate the annotations and reduce interobserver variability. Additionally, our pipeline is computationally expensive because it involves multiple processes, such as partitioning gigapixel WSIs, calculating various feature measures for each tile, and constructing graphs based on spatial relationships. Computational constraints can also stem from data loss, standard maintenance, and outages on institutional high-performance computing (HPC) servers. Refining the code for faster processing (e.g., parallel processing) on an advanced computer architecture could help support ML processes and data management. Important spatially related histological features cannot be identified using image viewing software alone because such software does not scale to large datasets. Each digital pathology software package is limited in the amount of data that can be processed through its graphical user interface before its computational capabilities are exceeded.
Future work will include model validation in a larger cohort and the development of prediction models with higher-order feature combinations and graph convolution networks. It is important to develop an efficient pipeline that can handle more image data in less computational time. Additionally, combinations of the features with the highest predictive value will be used to increase the predictive power of the full feature set. For example, attention maps can be leveraged to select regions of interest for more complex analyses, such as imaging mass cytometry, to distinguish between the various TIL subtypes and to further refine NAC response prediction. We also plan to extend our pipeline to incorporate other tissue stains, including immunohistochemistry. A more efficient pipeline can reduce the frequency of false negatives and thus minimize the risk of undertreating patients, which can result in early relapse and poor outcomes.