Dataset and partition
In this article, we structured the dataset from Huashan Hospital’s BioBank of Central Nervous System Diseases[27] (Ethics approval: KY2015-256) according to the following criteria:
Inclusion criteria: 1. comprehension of the research process, agreement to use the patient data, and signing of the informed consent form by the patient or his or her guardian; 2. tumour surgery (resection or biopsy) from 2013 to 2021; 3. postsurgical pathological diagnosis of glioma, ependymoma, metastasis or lymphoma.
Exclusion criteria: 1. unavailable slides (missing or broken); 2. unqualified slides or insufficient tissue; and 3. a controversial diagnosis from an experienced neuropathologist.
In total, 1,038 slides were collected. Of these, 770 slides served as the modeling data, while the remaining 268 slides were used for independent testing. Among the modeling data, 142 slides were used to train the feature extractors for cell subtyping and grading, and the other 628 slides were used to build the feature aggregator responsible for final label inference. A 3-fold development procedure was applied to ensure the robustness of the procedure across different dataset partitions. The detailed dataset and partition are shown in Fig. 1.
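The slide counts above can be checked, and a 3-fold partition sketched, as follows. This is a minimal illustration that assumes, purely for the sake of the example, that the folds are drawn by random shuffling over the 770 modeling slides; the slide identifiers, the seed and the `k_fold_splits` helper are hypothetical, and the partition actually used is the one shown in Fig. 1.

```python
import random

# Slide counts reported in the text.
TOTAL, MODELING, TESTING = 1038, 770, 268
EXTRACTOR, AGGREGATOR = 142, 628
assert MODELING + TESTING == TOTAL
assert EXTRACTOR + AGGREGATOR == MODELING


def k_fold_splits(ids, k=3, seed=0):
    """Yield (train_ids, val_ids) pairs; every id is held out exactly once."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    # Interleaved slicing gives k folds whose sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, val


# Hypothetical slide identifiers, used only to make the sketch runnable.
slide_ids = [f"slide_{n:04d}" for n in range(MODELING)]
splits = list(k_fold_splits(slide_ids, k=3))
```

Each of the three splits holds out a different third of the slides, so every slide contributes to validation exactly once.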
Digitization Of Histopathological Slides
An industrial-scale whole-slide imaging scanner (CytoExplorer ZJ300-CS2, ZHONGJI BIOLOGICAL, Wuhan) was used to digitize the histopathological slides. The scanner is equipped with a high-precision electrical displacement platform for precise positioning of the slides. Images are acquired by a high-resolution camera with a highly adaptable three-dimensional focusing control technology, developed to achieve fast and accurate digital scanning of whole-slide images (WSIs). The patches scanned from multiple views are then stitched and fused to form a complete WSI.
Standard Annotation Procedure And Committee
To fit the pMIL framework, we adopted annotation software built around a box-selection module, which allows annotators to quickly label the components of a WSI by directly circling them and to outline the key pathological features that support the hierarchical diagnosis.
Uneven understanding of pathological classification and varying familiarity with the annotation software make it challenging to standardize annotation quality among annotators. We therefore established a committee to ensure consistent, high-quality annotation. Three pathologists from Huashan Hospital formed the standard annotation committee in this study: one neuropathologist with more than 15 years of experience, who led the committee, and two junior neuropathologists. The two junior members read all the WSIs in the training set and performed the annotation independently. If their annotations of a WSI disagreed (less than 50% overlap in the cell subtyping annotation boxes or more than 50% discrepancy in the high-risk feature boxes), the leader was invited to review the disputed WSI and make the final decision.
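The escalation rule can be sketched in a few lines. The text does not specify how overlap between annotation boxes is computed, so this example assumes intersection-over-union (IoU) on a single pair of axis-aligned boxes; the function names and the box coordinates are hypothetical.

```python
def box_area(box):
    """Area of an axis-aligned box (x0, y0, x1, y1); degenerate boxes give 0."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def box_iou(a, b):
    """Intersection-over-union of two boxes (an assumed overlap measure)."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def needs_senior_review(subtype_overlap, risk_discrepancy, threshold=0.5):
    """Escalate when subtyping overlap is below, or high-risk feature
    discrepancy is above, the 50% threshold stated in the text."""
    return subtype_overlap < threshold or risk_discrepancy > threshold


# Hypothetical boxes drawn by the two junior annotators on the same WSI.
box_a = (0, 0, 10, 10)
box_b = (5, 0, 15, 10)
overlap = box_iou(box_a, box_b)                      # 50 / 150
escalate = needs_senior_review(overlap, risk_discrepancy=0.0)
```

In this example the two boxes share only a third of their combined area, so the slide would be referred to the senior neuropathologist.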
Pipeline-structured Multiple Instance Learning (pMIL)
The pMIL framework can be summarized as two groups of units: (1) clinical task-driven feature extractors and (2) integration-oriented feature aggregators. Based on these units, we established a cell-subtyping network and a grading network.
Feature extractor
The goal of feature extractors is to implement dimensionality reduction on complex original inputs by embedding high-dimensional inputs into a low-dimensional feature matrix. The tasks or training targets of the feature extractors are designed to identify specific clinical image markers for better interpretation of the decision-making procedure.
Feature aggregator
The purpose of the feature aggregator is to provide a permutation-invariant, attention-based decision for a bag of vectors through instance pooling and multihead attention.
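To illustrate why attention-based pooling is permutation invariant, the following is a simplified sketch, not the architecture used in pMIL (see Supplementary material 2 for that). It assumes one fixed query vector per head, scaled dot-product scores and a softmax over the instances; in a trained model the queries would be learned, and all numbers here are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(bag, queries):
    """Pool a bag of d-dimensional vectors into one vector of size
    len(queries) * d, using one attention head per query vector."""
    d = len(bag[0])
    pooled = []
    for q in queries:
        # Scaled dot-product score of each instance against this head's query.
        scores = [sum(qj * xj for qj, xj in zip(q, x)) / math.sqrt(d)
                  for x in bag]
        weights = softmax(scores)
        # Weighted sum over instances: the order of the bag does not matter.
        pooled.extend(sum(w * x[j] for w, x in zip(weights, bag))
                      for j in range(d))
    return pooled


bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # three instance vectors
queries = [[1.0, 0.0], [0.0, 1.0]]              # two hypothetical heads
out = attention_pool(bag, queries)
out_reversed = attention_pool(bag[::-1], queries)
```

Because each head reduces the bag with a weighted sum, reordering the instances changes the result only by floating-point rounding.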
Cell-subtyping network
The patches are classified into five cell types: oligodendroglioma (O), astrocytoma (A), ependymoma (EPE), lymphoma (LYM), and metastasis (MET), plus two auxiliary types: background (BG) and nontumoural tissue (NT). NT covers normal brain tissue, gliosis, bleeding and tissue inflammation. First, a feature extractor reduces the dimensions, disassembles all patches and embeds them into a category feature map. Then, a feature aggregator with 8 ports pools and samples the feature map of the category in question and finally provides a decision that satisfies permutation invariance through the multihead attention mechanism.
Grading network
Microvascular proliferation (MVP) and necrosis (NEC) are typical imaging biomarkers for grading; the presence of such markers indicates a greater likelihood of malignant behavior. Labeling the location of MVP requires highly labor-intensive, pixel-level annotation; thus, instead of labeling the specific locations, we cropped the WSI into several patches and identified whether such features were present in each patch, turning the segmentation task into a classification task. The classifier also embeds meaningful features for the patient-level decision process. A feature extractor first generates a grading feature map. The subsequent feature aggregator pools not only the grading feature map but also the subtyping feature map obtained in the previous stage, so that the multihead attention mechanism can "think comprehensively" about the grading task.
Details of the pMIL framework and its algorithms can be found in Supplementary material 2.
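The conversion from localization to patch-level classification can be sketched as below. This is only an illustration under stated assumptions: non-overlapping square patches, a hypothetical patch size of 512 pixels, and a patch labeled positive when it intersects any annotated high-risk box. The actual cropping and labeling scheme is described in Supplementary material 2.

```python
def patch_grid(width, height, patch=512):
    """Top-left corners of non-overlapping patches fully inside the image."""
    return [(x, y)
            for y in range(0, height - patch + 1, patch)
            for x in range(0, width - patch + 1, patch)]


def patch_labels(coords, marked_boxes, patch=512):
    """Return 1 for each patch that intersects an annotated box, else 0."""
    labels = []
    for px, py in coords:
        hit = any(px < bx1 and bx0 < px + patch and
                  py < by1 and by0 < py + patch
                  for bx0, by0, bx1, by1 in marked_boxes)
        labels.append(1 if hit else 0)
    return labels


# Hypothetical 2048 x 1024 image with one annotated high-risk box.
coords = patch_grid(2048, 1024)
labels = patch_labels(coords, marked_boxes=[(600, 100, 700, 200)])
positive_patches = [c for c, lab in zip(coords, labels) if lab == 1]
```

Each patch thus carries a binary label, which is what lets a classifier replace a pixel-level segmentation model.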
Alternative Displays Of Diagnostic Information
Exclusion of low diagnostic areas and cell-subtyping map generation
In pMIL, the classification stage yields a classification mask. The entities predicted in this stage include the five tumour types (O, A, EPE, LYM, and MET) and the two auxiliary types (NT and BG). We can therefore superimpose this classification mask on the original WSI to obtain (1) a prompt for areas of low diagnostic value (BG + NT) and (2) a cell topographic map (O + A + EPE + LYM + MET).
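As a rough sketch of how one mask yields both displays, assume the classification mask is a grid of patch-level labels (a hypothetical representation; the label strings follow the abbreviations in the text):

```python
TUMOUR = {"O", "A", "EPE", "LYM", "MET"}
AUXILIARY = {"BG", "NT"}


def split_masks(label_grid):
    """Split a grid of patch labels into the two overlays:
    a boolean low-diagnostic-value mask (BG + NT) and a tumour map that
    keeps the tumour label and leaves every other position as None."""
    low_value = [[lab in AUXILIARY for lab in row] for row in label_grid]
    tumour_map = [[lab if lab in TUMOUR else None for lab in row]
                  for row in label_grid]
    return low_value, tumour_map


# Hypothetical 2 x 3 grid of patch predictions.
grid = [["BG", "NT", "A"],
        ["O", "A", "MET"]]
low_value, tumour_map = split_masks(grid)
```

The two outputs are complementary: every patch appears either in the low-value mask or in the tumour map, never in both.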
High-risk image marker identification
Our approach provides additional auxiliary functions by backtracking through the attention mechanism. We further leveraged gradient-based class activation mapping (Grad-CAM) to locate these high-risk markers. Details are given in Supplementary material 3.
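For readers unfamiliar with Grad-CAM, its core computation is short: each feature-map channel is weighted by the spatial mean of the class-score gradient with respect to that channel, the weighted channels are summed, and negative values are clipped. The sketch below applies this to small made-up arrays; in practice the activations and gradients would come from a deep learning framework, and how Grad-CAM is wired into pMIL is described in Supplementary material 3.

```python
def grad_cam(activations, gradients):
    """activations, gradients: lists of K channels, each an H x W nested list.
    Returns the H x W map ReLU(sum_k alpha_k * A_k), where alpha_k is the
    spatial mean of the gradient for channel k."""
    weights = []
    for grad in gradients:
        flat = [g for row in grad for g in row]
        weights.append(sum(flat) / len(flat))
    h, w = len(activations[0]), len(activations[0][0])
    return [[max(0.0, sum(a_k * act[i][j]
                          for a_k, act in zip(weights, activations)))
             for j in range(w)]
            for i in range(h)]


# Made-up example: two channels of size 2 x 2.
activations = [[[1.0, 0.0], [0.0, 2.0]],
               [[0.0, 1.0], [1.0, 0.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]],        # channel 0 supports the class
             [[-1.0, -1.0], [-1.0, -1.0]]]    # channel 1 opposes it
cam = grad_cam(activations, gradients)
```

Positions where the supporting channel is active remain in the map, while positions dominated by the opposing channel are clipped to zero.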
Molecular Pathology: Integrated Diagnosis
The taxonomy of the WHO CNS5 increases the weight given to non-histopathological parameters, such as molecular pathology. Therefore, we reserved an input window (Fig. 2a) for users to provide molecular pathological parameters in the future. The system automatically integrates the input diagnostic information through a built-in decision tree (Fig. 2b). The integrated diagnoses of all slides can be found in Supplementary material 4.
Evaluation Metrics
A confusion matrix is used to describe the performance of the analysis unit. Accuracy, sensitivity and specificity are used to evaluate the resolving power of each category, and the arithmetic means of the accuracy, sensitivity and specificity of each category are used to show the overall performance of the model. The confusion matrix is segmented to analyze each category[28] (Fig. 3c). The metrics are defined as follows:
\(\mathrm{Accuracy}_{type} = \frac{N^{t}_{TP}+N^{t}_{TN}}{N^{t}_{TP}+N^{t}_{FP}+N^{t}_{TN}+N^{t}_{FN}}\)
\(\mathrm{Sensitivity}_{type} = \frac{N^{t}_{TP}}{N^{t}_{TP}+N^{t}_{FN}}\)
\(\mathrm{Specificity}_{type} = \frac{N^{t}_{TN}}{N^{t}_{TN}+N^{t}_{FP}}\)
\(\mathrm{Accuracy}_{overall} = \frac{1}{T}\cdot\sum_{t=1}^{T}\frac{N^{t}_{TP}+N^{t}_{TN}}{N^{t}_{TP}+N^{t}_{FP}+N^{t}_{TN}+N^{t}_{FN}}\)
\(\mathrm{Sensitivity}_{overall} = \frac{1}{T}\cdot\sum_{t=1}^{T}\frac{N^{t}_{TP}}{N^{t}_{TP}+N^{t}_{FN}}\)
\(\mathrm{Specificity}_{overall} = \frac{1}{T}\cdot\sum_{t=1}^{T}\frac{N^{t}_{TN}}{N^{t}_{TN}+N^{t}_{FP}}\)
where \(N^{t}_{TP}, N^{t}_{FP}, N^{t}_{TN}, N^{t}_{FN}\) represent the numbers of true positives, false positives, true negatives, and false negatives of type \(t\), respectively. \(T\) represents the total number of types; thus, \(T = 9\) in this study.
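The per-type and overall metrics can be computed from a multiclass confusion matrix by treating each type in turn as the positive class (one-vs-rest). The sketch below assumes rows are true types and columns are predicted types; the function names and the 3 x 3 matrix are made up for illustration and are not results from this study.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples of true type i predicted as type j.
    Returns one dict of accuracy, sensitivity and specificity per type."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for t in range(n):
        tp = cm[t][t]
        fn = sum(cm[t]) - tp                          # type t, predicted other
        fp = sum(cm[i][t] for i in range(n)) - tp     # other, predicted type t
        tn = total - tp - fn - fp
        metrics.append({
            "accuracy": (tp + tn) / total,
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        })
    return metrics


def overall(metrics, key):
    """Arithmetic mean of one metric over all T types."""
    return sum(m[key] for m in metrics) / len(metrics)


# Made-up confusion matrix for three types.
cm = [[8, 2, 0],
      [1, 7, 2],
      [0, 1, 9]]
metrics = per_class_metrics(cm)
overall_accuracy = overall(metrics, "accuracy")
overall_sensitivity = overall(metrics, "sensitivity")
overall_specificity = overall(metrics, "specificity")
```

Note that the per-type accuracy counts true negatives, so it is typically higher than the fraction of slides assigned to the correct type.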
Prototype Packaging And Time Period Test
Based on the WSI scanner used in the study, we optimized the operating workflow and built a scanning module. The pMIL model and the auxiliary functions were packaged to form the kernel diagnosis module. The information management module included image storage, diagnostic information integration and display functions. These three modules were packaged to form the HAS-Bt. We evaluated the time required to operate HAS-Bt on the independent testing set: all 268 slides were run on the system. A timer was started when each slide was placed on the system, and three checkpoints were recorded in total. A researcher who had not participated in prototype development was responsible for the recording.