Basal cell carcinoma is the most common form of skin cancer in humans. The incidence is as high as the incidence of all other cancers combined1. Further, the number of BCC cases is increasing globally2–4. Although metastasis and death are rare, BCCs can cause significant morbidity due to aggressive and destructive local growth5.
BCCs are a heterogeneous group of tumors with different growth patterns. Internationally, BCCs are classified into two broad categories based on histopathologic features: low-risk and high-risk subtypes6. These categories can be further divided into subclasses. Swedish pathologists, for example, classify BCCs according to the “Sabbatsberg model”, which comprises three risk categories: a) “low-aggressive” subtypes, further divided into nodular (type Ia) and superficial (type Ib); b) “medium-aggressive” subtypes (type II), which include less aggressive infiltrative tumors that grow in a more well-defined manner and more superficially than high-aggressive tumors; and c) “high-aggressive” subtypes (type III), which include the more aggressive infiltrative and morpheaform tumors7. Correct assessment of the subtype is crucial for planning the appropriate treatment. However, there is significant inter-pathologist variability when grading tumors8 and reporting the subtype9,10.
Moreover, the time-consuming process of evaluating histological slides, combined with an increasing number of samples, delays diagnosis and increases costs11. To reduce diagnosis time and inter-observer variation, deep learning12 approaches have been actively investigated. Deep learning enables computational image analysis in pathology, with the potential to increase classification accuracy and reduce inter-observer variability13,14. Interestingly, even previously unknown morphological features associated with metastatic risk, disease-free survival, and prognosis may be revealed15,16.
Early computational histology methods required pixel-wise annotations, i.e., pathologists delineating specific regions on whole-slide images (WSIs)17. Pixel-wise annotation, however, is time-consuming, and such approaches do not generalize well to real-world data18. As an alternative, weakly supervised learning has become a widely adopted framework for WSI classification. The most common technique within weakly supervised learning is multi-instance learning (MIL)19, which can use WSI-level labels, i.e., labels not associated with a specific region, without losing performance20. MIL treats the set of instances (the patches of a WSI) as a bag: the presence of even a single positive patch makes the bag, i.e., the WSI, positive; otherwise, the bag is treated as negative. MIL requires that the WSIs be partitioned into sets of patches, often without the need for data curation18.
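As an illustration, the standard MIL assumption can be expressed as max-pooling over per-patch scores. This is a minimal sketch of the general principle, not the specific pipeline used in this work; the instance-level classifier producing the scores is assumed given.

```python
import numpy as np

def mil_bag_label(patch_scores: np.ndarray, threshold: float = 0.5) -> int:
    """Standard MIL assumption: a bag (WSI) is positive if at least
    one instance (patch) is positive.

    patch_scores: per-patch probabilities of the positive class,
    produced by some instance-level classifier (assumed here).
    """
    # Max-pooling over instance scores implements the
    # "at least one positive patch => positive slide" rule.
    bag_score = patch_scores.max()
    return int(bag_score >= threshold)

# Toy example: a single suspicious patch is enough to flag the slide.
print(mil_bag_label(np.array([0.1, 0.2, 0.9])))  # -> 1
print(mil_bag_label(np.array([0.1, 0.2, 0.3])))  # -> 0
```

In practice, the hard max is often replaced by a learned, attention-weighted aggregation over instances, but the bag-level label semantics stay the same.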
Later works have increasingly added a self-supervised contrastive learning paradigm to extract better feature vectors, in which pre-trained CNN models are fine-tuned using a contrastive learning framework21. Adding such components to MIL approaches has been shown to improve performance22,23. However, the MIL framework fundamentally assumes that the patches are independently and identically distributed, neglecting the correlation among instances19,24, which degrades the overall performance of the classification models. Instead, this spatial correlation can be captured using graph neural networks, which in turn increases model performance25–27.
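To illustrate how spatial correlation between patches can be made explicit, patches can be connected into a graph based on their grid coordinates on the slide, and features aggregated over that graph. This is a hypothetical, simplified sketch (no learned weights or nonlinearity), not the construction used by the cited works:

```python
import numpy as np

def build_patch_graph(coords: np.ndarray, radius: float = 1.5) -> np.ndarray:
    """Adjacency matrix connecting patches whose grid coordinates on
    the slide lie within `radius` of each other (illustrative rule)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = (dist <= radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

def gcn_layer(features: np.ndarray, adj: np.ndarray) -> np.ndarray:
    """One simplified graph-convolution step: average each patch's
    features with those of its spatial neighbours."""
    adj_hat = adj + np.eye(len(adj))      # add self-connections
    deg = adj_hat.sum(axis=1, keepdims=True)
    return (adj_hat @ features) / deg     # mean aggregation

# Toy 2x2 grid of patches with 3-dimensional feature vectors.
coords = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
feats = np.random.rand(4, 3)
smoothed = gcn_layer(feats, build_patch_graph(coords))
print(smoothed.shape)  # -> (4, 3)
```

Stacking such neighbourhood-aggregation layers (with learned projections in between) lets each patch representation absorb its spatial context before the bag-level decision is made.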
Recently, Transformers28 have made a great leap on the AI front by introducing the capability to incorporate context across a sequence of tokens in natural language processing tasks, e.g. GPT-329. Inspired by the success of transformers in natural language processing, Dosovitskiy et al.30 proposed the Vision Transformer (ViT), a method for image classification that takes flattened patches of an image as input. This allows the model to treat the patches as a sequence of tokens and to account for their positions (context) using positional embeddings. By incorporating these positional relationships (contextual information), ViT can outperform CNNs, especially when using features obtained from self-supervised contrastive models. In addition, vision transformers require substantially less data and compute than many CNN-based approaches30,31. Further, their relative resilience to noise, blur, artifacts, semantic changes, and out-of-distribution samples may contribute to better performance32.
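The ViT input pipeline described above can be sketched as follows. This is a minimal illustration of patch flattening and positional embeddings; the projection and embedding matrices are random stand-ins for what ViT learns during training:

```python
import numpy as np

def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened non-overlapping patches,
    as in ViT: each patch becomes one token of length patch*patch*C."""
    h, w, c = image.shape
    tokens = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            tokens.append(image[i:i + patch, j:j + patch].reshape(-1))
    return np.stack(tokens)

# Toy 8x8 RGB image split into 4x4 patches -> 4 tokens of length 48.
img = np.random.rand(8, 8, 3)
tokens = patchify(img, 4)

# Each token is linearly projected, and a positional embedding is added
# so the transformer knows where each patch sits in the image.
d_model = 16
proj = np.random.rand(tokens.shape[1], d_model)  # stand-in projection
pos = np.random.rand(tokens.shape[0], d_model)   # stand-in embeddings
embedded = tokens @ proj + pos
print(embedded.shape)  # -> (4, 16)
```

The resulting token sequence is what the transformer encoder attends over; without the positional term, the model would see the patches as an unordered set.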
In medical imaging, transformers have been applied to classification, segmentation, detection, reconstruction, enhancement, and registration tasks32. In histology specifically, vision transformers have been successfully applied to a range of tasks, including the detection of breast cancer metastases and the classification of cancer subtypes in lung, kidney, and colorectal cancer33,34. Given the success of vision transformers in many medical applications and the ability of graph neural networks to capture correlations among patches, we combine graph neural networks and Transformers to detect and classify BCCs.