Deep Learning Use for Differentiation of Low-grade vs High-Grade Glioma in Intraoperative Squash Smears

Objective: Automated diagnosis using Artificial Intelligence (AI) techniques would be a useful addition to intraoperative squash smear diagnosis. A robust diagnostic tool would enhance capabilities in centres with limited expertise in the diagnosis of intracranial lesions. This study explores the ability of deep learning models to classify squash smear images of glioma into high- and low-grade tumors. Methods: Squash smears obtained intraoperatively (500 slides) were scanned and an image dataset was built. The dataset was pre-processed and fed into a Convolutional Neural Network (CNN) model for training and validation. It consisted of 10,000 images of high-grade (6,000) and low-grade (4,000) gliomas, divided into training, validation and testing sets. A CNN model based on a deep learning algorithm was built and trained on the training dataset, reaching an accuracy of 96.2%. On a testing dataset of images previously unseen by the trained model, it achieved accuracies of 91% for diagnosing high-grade glioma and 77% for low-grade glioma. A positive predictive value of 86.6% and an F1-score of 0.887 were achieved. A feature visualization technique was applied at the end to visualize regions of interest. Such techniques can be used for rapid screening of slides, or sections of a slide, to assist in rapid diagnosis.


Introduction
Central Nervous System (CNS) malignancies account for nearly 1.3-2% of all tumors in India and have a worldwide incidence of 5-10 per 100,000 persons [1][2][3]. Computer Aided Diagnosis (CAD) has previously been used in many fields related to neurosurgery. In neuroradiology, it has been used to analyse images acquired via modalities such as X-ray, Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) for faster and better diagnosis. Artificial Intelligence has been used for report generation based on CT in Traumatic Brain Injury [4], and for segmentation of brain tumor morphology and differentiation of high-grade from low-grade brain tumors based on MRI [5]. Many of these approaches depended heavily on feature extraction from the images and on appropriate image pre-processing, which can be tedious and prone to error.
Intraoperative squash smear cytological preparation was first introduced by Eisenhardt and Cushing in the early 1930s and by Badt in 1937 [6]. Diagnostic accuracies of squash smears range from 76% to 96% in different studies [6]. Typically, the procedure takes from 30 minutes to 2 hours, depending on the center. At our center, the typical turnaround time for diagnosis is 30 to 45 minutes. The situation becomes more difficult during emergencies and in the early hours, when an expert neuropathologist may not be available.
In some places, intraoperative diagnostics may not be available at all due to a lack of expertise. Also, there is no readily available tool to confirm the quality of the tissue obtained, for example during framed or frameless stereotactic biopsy.
Artificial Neural Network (ANN) based techniques can address at least some of the issues mentioned above. Deep learning, a form of complex ANN, has recently become very popular for solving challenges in trend analysis, computer vision, computer-aided diagnosis, etc. This method can also give neurosurgeons some certainty that the tissue sent for final analysis is indeed representative of the tumor during blind procedures such as stereotactic biopsies. It can also facilitate repeat sampling on the operating table itself. To the best of our knowledge, the use of a CNN on brain squash smear images has not been tried before.

Methods
The aim of this study was to build a CNN model specific to the analysis of squash smear cytology of brain tumor tissue and to check the validity of such AI agents in distinguishing high-grade from low-grade gliomas.
After the dataset was prepared from smear cytology slides, a deep learning CNN model designed specifically for handling such medical images was developed. The steps in the process are described below.

Image Acquisition and Data Collection
Approval from the Institutional Ethics Committee was obtained before collecting the data. Intraoperatively prepared squash smear slides of gliomas were acquired from the Department of Neuropathology for surgeries performed over an 18-month period from June 2017 to December 2018. Of the 500 slides scanned, an average of 20 representative images were obtained from each smear slide. Only non-overlapping, uniformly spread, single-cell-layer and well-stained areas of the slide were selected for imaging by an expert neuropathologist. Images had to be selected this way because neural network training requires correctly labelled input data. The ground truth was taken from the final histopathology report.
The slides were converted into digital images with a digital microscope (Olympus® Bz53, DP27 Camera). The data collected for the study were images of squash smear slides whose final histopathology diagnosis was glioma. The images were then divided into three sets: a training set, a validation set and a testing set (Figure 1). Images in the training set were used only to train the network, and the remaining images were used only for validation and testing. Of the 10,000 images collected, 6,000 belonged to high-grade glioma and 4,000 to low-grade glioma. Of these, 3,200 were used for training and the rest for testing and validation. Only high- and low-grade glioma were considered in this initial project because, on squash smear or frozen section, this is often the only level of granularity that even an expert neuropathologist can obtain for gliomas; immunohistochemistry has to be done later, during final histopathology reporting.
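The division into three sets can be illustrated with a minimal sketch. It assumes a stratified random split per class; the function name `stratified_split`, the fixed seed and the equal division of the non-training images between validation and testing are illustrative assumptions, not details reported in this study.

```python
import random

def stratified_split(labels, train_frac, val_frac, seed=0):
    """Return (train, val, test) lists of indices, split separately per class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * train_frac)
        n_val = round(len(indices) * val_frac)
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test

# 6,000 high-grade (label 1) and 4,000 low-grade (label 0) images, as in the study.
labels = [1] * 6000 + [0] * 4000
# 3,200 of 10,000 images were used for training; the equal division of the
# remainder into validation and test sets is an assumption for illustration.
train_idx, val_idx, test_idx = stratified_split(labels, train_frac=0.32, val_frac=0.34)
```

Splitting each class separately keeps the 60:40 ratio of high-grade to low-grade images the same in all three sets.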

Image Preprocessing
The acquired images had a height of 480 pixels and a width of 612 pixels and were coded in RGB with 3 channels (612 x 480 x 3). First, the image resolution was reduced to (150 x 150 x 3) for computation. Each image was then vectorized into linear arrays per channel. The arrays were normalized with Min-Max Scaler normalization and used as inputs. The images were then processed through the Image Data Generator to add randomness, noise, rotations and other variations so that the data would generalize better. Figure 2 shows an image before (a) and after (b) pre-processing.
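The normalization and augmentation steps can be sketched in NumPy as follows. This is a simplified illustration, not the pipeline used in the study: per-channel scaling, the helper names `min_max_scale` and `random_flip_rotate`, and the choice of flips and 90-degree rotations are assumptions, and the input is a random stand-in for an image already resized to 150 x 150.

```python
import numpy as np

def min_max_scale(image):
    """Rescale each channel of an H x W x C image to the range [0, 1]."""
    image = image.astype(np.float32)
    lo = image.min(axis=(0, 1), keepdims=True)
    hi = image.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on flat channels
    return (image - lo) / span

def random_flip_rotate(image, rng):
    """Illustrative augmentation: random horizontal flip and 90-degree rotation."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]
    return np.rot90(image, k=int(rng.integers(0, 4)), axes=(0, 1))

rng = np.random.default_rng(0)
# Stand-in for an already resized 150 x 150 RGB smear image (random pixel values).
raw = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)
scaled = min_max_scale(raw)
augmented = random_flip_rotate(scaled, rng)
```

Because the resized image is square, flips and quarter-turn rotations leave its shape unchanged, so the augmented array can be fed to the same network input.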

Building of CNN Model
Convolutional Neural Networks (CNNs) are standard deep learning methods for working with image data. A CNN applies a convolution function of the following form:

$$Y_k = f(W_k * x)$$

This formula is an oversimplification of the actual convolution formula [7], but within the scope of this article it serves its purpose. Here, the input is denoted by $x$ and the filter for the $k$-th feature map by $W_k$. The function $f(\cdot)$ denotes the convolutional function, and $Y_k$ denotes the output of the function for input $x$ at the $k$-th position. Convolution itself is a linear operator at its core.
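To make the convolution step concrete, the following is a minimal NumPy sketch of a single filter sliding over a small 2-D input. The function `conv2d_valid`, the toy input and the 2 x 2 filter are hypothetical; real CNN layers also add a bias, a non-linearity and many filters per layer.

```python
import numpy as np

def conv2d_valid(x, w):
    """Slide filter w over a 2-D input x (no padding, stride 1).

    Like most deep learning libraries, this computes cross-correlation,
    i.e. the filter is not flipped.
    """
    kh, kw = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    y = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return y

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4 x 4 input
w_k = np.array([[1.0, 0.0], [0.0, -1.0]])      # hypothetical 2 x 2 filter W_k
y_k = conv2d_valid(x, w_k)                     # 3 x 3 feature map

# Linearity: doubling the input doubles the output.
y_doubled = conv2d_valid(2.0 * x, w_k)
```

The last line illustrates the linearity of the operation: scaling the input by a constant scales the feature map by the same constant.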
A CNN model was built based on VGG19 [7], with added layers at the end that were specially designed to handle these processed images (Figure 3). Models were built in Python with the Keras library and TensorFlow as the backend, all of which are open-source packages. As can be seen in the figure, the input has the shape (150 x 150 x 3). The subsequent layers alternate between convolutional layers, as described previously, and pooling layers, and are connected via dense layers. On top of this architecture, another ANN was built and connected, forming the final 5 layers of the network. These layers were built specifically with the input and the nature of the data in mind. Finally, the model was compiled with Stochastic Gradient Descent (SGD) as the optimizer, "binary cross-entropy" as the loss function and "accuracy" as the metric, giving a total of 28,939,329 trainable parameters.
The final output layer was binary, labelled '1' for high-grade glioma or '0' for low-grade glioma. This layer uses the "sigmoid" activation function, which outputs probabilities between 0 and 1. This value indicates the confidence of the CNN's prediction of high-grade versus low-grade glioma.
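The relationship between the sigmoid output, the binary label and the binary cross-entropy loss can be sketched with the standard library alone. The logit value, the 0.5 decision threshold and the helper names are illustrative assumptions, not values taken from the trained model.

```python
import math

def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p, eps=1e-7):
    """Loss for one image: y_true is 1 (high-grade) or 0 (low-grade)."""
    p = min(max(p, eps), 1.0 - eps)   # clip to keep log() finite
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

def predict_label(p, threshold=0.5):
    """Turn a probability into a class label; 0.5 is an assumed cut-off."""
    return 1 if p >= threshold else 0

p_high = sigmoid(2.0)                           # hypothetical logit from the last layer
label = predict_label(p_high)                   # 1, i.e. high-grade glioma
loss_if_high = binary_cross_entropy(1, p_high)  # small: prediction agrees with label
loss_if_low = binary_cross_entropy(0, p_high)   # large: prediction contradicts label
```

A confident prediction that matches the true label yields a small loss, while the same prediction against the opposite label is penalized heavily, which is what drives the weights during training.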

Training of Model
The CNN model was then trained on a workstation with an Intel® architecture processor and 16 GB RAM, with computation accelerated by an NVIDIA® RTX 2060 6 GB graphics processor. Training took 121.02 minutes for 100 epochs with a batch size of 32 images. Note that all training was done without any feature extraction or human intervention; for actual reporting purposes, a much lower computer configuration should suffice.
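The training figures above imply the following per-epoch workload. This is a back-of-the-envelope sketch that assumes each epoch is exactly one pass over the 3,200 training images; with on-the-fly augmentation the actual number of steps per epoch may have been configured differently.

```python
import math

train_images = 3200        # training-set size reported in the Methods
batch_size = 32
epochs = 100
total_minutes = 121.02

# Assumption: each epoch is one pass over the 3,200 training images.
steps_per_epoch = math.ceil(train_images / batch_size)   # 100 weight updates per epoch
total_updates = steps_per_epoch * epochs                 # 10,000 updates overall
seconds_per_epoch = total_minutes * 60 / epochs          # about 72.6 seconds
```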

Validation and Statistics
The artificial neural network was trained on the images in the training set, and accuracies and cross-validation matrices were computed. These help in validating the fitness of the model for generalized use and for real-world practice. Statistical analysis was done with the Scikit-Learn library, which is integrated with the Keras library.

Results
A) Results During Training the Dataset: Figure 4 is a line chart of the loss and accuracy on the training and validation data; for loss, lower values are better, and for accuracy, higher values are better. The model started converging at around the 20th epoch, with minimal overfitting. Specific techniques such as learning rate regulation and dropout were used to avoid overfitting. Multiple models with various hyperparameters were also tested before finalizing the best-performing model; this hyperparameter tuning was tedious and time-consuming. At the 100th epoch, the loss was 0.0950 on the training set and 0.1016 on the validation set. Accuracies of 96.2% on the training set and 96.39% on the validation set were reached. The validation dataset was used only for internal comparison and not for training the model.

B) Results During Testing phase:
The testing dataset was used to obtain results on final data that had not previously been exposed to the model. Results were obtained as '1' for "high-grade glioma" and '0' for "low-grade glioma", as mentioned before.
The confusion matrix is a 2x2 cross table of true labels versus predicted labels, shown in Table 1. As indicated in the table, the prediction accuracy was 91% for high-grade glioma and 77% for low-grade glioma. These reports could be generated in a fraction of the time for each image.
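The metrics derived from a 2x2 confusion matrix can be sketched as follows. The counts below are hypothetical, chosen only so that the per-class accuracies equal 91% and 77%; they are not the counts in Table 1. The final line uses values reported in this paper.

```python
def recall(tp, fn):
    """Fraction of true cases of a class that were detected (per-class accuracy)."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Positive predictive value: fraction of positive calls that were correct."""
    return tp / (tp + fp)

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Hypothetical 2 x 2 counts, with high-grade glioma as the positive class.
tp, fn = 91, 9      # high-grade images called high / low
tn, fp = 77, 23     # low-grade images called low / high

high_grade_recall = recall(tp, fn)     # 0.91
low_grade_recall = recall(tn, fp)      # 0.77
ppv = precision(tp, fp)                # depends on the class balance of the test set

# Consistency check using the values reported in this paper.
reported_f1 = f1_score(0.866, 0.91)
```

With the reported positive predictive value of 86.6% and the high-grade sensitivity of 91%, the F1 formula gives approximately 0.887, which agrees with the F1-score quoted in the abstract.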

Discussion
A squash smear is prepared from tumor tissue obtained during surgery and sent for analysis to give the neurosurgeon an idea of the nature and type of the tumor. In various studies comparing squash smear diagnosis with the final histopathological diagnosis, accuracy ranges from 83% to 95%.
Reported accuracies differ between squash smears obtained from stereotactic biopsy procedures and those from direct tumor decompression procedures [6][8][9][10]. Artificial Intelligence (AI) can be used to assist in various diagnostic methods. Previous studies using some form of feature selection indicated that cervical cancers could be diagnosed with accuracies ranging from 85% to 90% on an external testing dataset [11]. In thyroid cancer, fine needle aspiration cytology diagnosis could be done with a sensitivity of 90.48% [12]. However, in many of these studies the sample size was very limited, so their generalizability is doubtful.
In neuroradiology, Sasank Chilamkurthy et al. [14] demonstrated the utility of CNNs for predicting different abnormalities on non-contrast CT scans. Their study showed that accuracies of up to 92% can be achieved in detecting pathologies such as intracranial, subdural or extradural hematoma. Their network could also predict bony abnormalities such as skull fractures (92.0%), as well as midline shift and mass effect (93.0%).
In CNS histology, classification and segmentation of histopathological slides of CNS tumors have been done with a similar use of CNNs. Yan Xu et al. [13] used 23 histopathological slides of Glioblastoma Multiforme and 22 images of low-grade glioma and achieved accuracies of up to 97.5%. In our study, by contrast, we used 10,000 squash smear images for building and validating the CNN model. A study of computer-aided histological diagnosis of malignant versus non-malignant breast lesions achieved an accuracy of 88.3% [14]. In the present study, the sensitivity of the CNN for detecting high-grade and low-grade glioma was 91% and 77%, respectively, which is comparable to conventional detection by human pathologists in various studies (83-95% [6][11] and 77-80% [15]).
Feature visualization can help us visualize the inner workings of these generally considered black box models. It gives us an idea about which parts of the image our model is giving importance to, while making the diagnosis. If this network is used to screen the whole slide it can give the interpreter an overview of relevant regions of interest. Screening whole slides with feature visualizing techniques can reduce reporting time where experts are available.
Other techniques, such as diagnosis with Raman spectroscopy, which has decades of research behind it [16,17], are also on the rise, but compared with our method they require specialized equipment for reporting.
All the computer programming work in this study was done by team members only. Promoting the learning of computer programming in the field of medicine is the need of the hour.

Future Possibilities and Limitations:
The future holds many possibilities for intraoperative tissue diagnosis reporting. With larger datasets, the same model can be made more accurate and generalized to the diagnosis of all types of CNS pathologies. Once trained, similar models can be useful in the diagnosis of final histopathological slides. If implemented with surgical confocal microscopes, such a model could also help delineate normal from abnormal brain tissue in real time during surgery. We will try to achieve this, build better models in the future and make them publicly available to assist neurosurgeons and neuropathologists.
Although the reporting accuracy for low-grade gliomas is comparable to human standards, there is scope for improvement with newer deep learning techniques. In this study, we sought to demonstrate a proof-of-concept implementation for diagnostic intraoperative neuropathology.

Conclusion
We have demonstrated that deep learning models can be used for the diagnosis of high- and low-grade gliomas in squash smear pathology.
Artificial Intelligence can be used reliably if properly standardized images of CNS tumor squash smear cytology are obtained. In our study, a CNN model differentiated high-grade from low-grade gliomas with accuracies of 91% and 77%, respectively, which is comparable to current human diagnostic accuracies. Feature visualization tools based on these models can also screen large areas of a slide for further detailed analysis by an expert neuropathologist. We strongly believe that the generalizability of such models will improve with even larger datasets.
These results should encourage future research in intraoperative neuropathology to differentiate other clinically relevant tumors such as germinomas, lymphomas, types of meningioma and more.

Declarations
Disclosures: No conflict of interest to disclose.

Tables
Table 1: Table showing

Supplementary Files
Supplementary file associated with this preprint: annotation.docx