A Hybrid High-Performance Intelligent Computing Approach of CA-CNN and RNN for Skin Cancer Image Grading

— Skin cancer is characterized as the uncontrollable growth of skin cells caused by unrepaired DNA damage. Melanoma, which arises from melanocytes, is the deadliest form of skin cancer, and early diagnosis supports therapists in curing it. Computational pathology offers a unique ability to spatially dissect tissue interfaces on digitized histology images. A hybrid context-aware convolutional neural network with a recurrent neural network (CA-CNN-RNN) for skin cancer histological images is proposed in this research. The proposed model first encodes a histology image's local representations into higher-dimensional features, then aggregates the features by considering their spatial arrangement to enable the final predictions. In this research, H&E-stained section images from The Cancer Genome Atlas are used as the dataset for assessment: of the 58 images, 37 were used for training and 21 for testing. Histology images of melanoma skin cancer were analyzed and validated with various classifiers, namely VGG-19, Inception, ResNet50, and DarkNet-53, within the hybrid CA-CNN-RNN model. The results generated on this dataset are analyzed using criteria such as accuracy, recall, precision, and F-score. The performance analysis shows that the proposed CA-CNN-RNN performs well with all of the classifiers, and among them the DarkNet-53 model performs best on all the parameters.


I. INTRODUCTION
Melanoma, the most harmful type of skin cancer, has risen by around 50 percent in the last decade to over 287,000 cases. This equates to over 60,000 melanoma-related deaths per year. The World Health Organization estimates that over a million non-melanoma skin cancer cases are diagnosed every year [1]. Although most countries do not officially report non-melanoma cancer cases, the real incidence is widely considered to run into several millions. As a result, skin cancer is the most widespread cancer in the world. Skin cancer is classified into two sorts: non-melanoma and melanoma [2]. Melanoma is the most lethal type of skin cancer. Basal cell carcinoma and squamous cell carcinoma are the most prominent non-melanoma tumors. Merkel cell cancer is a rare, highly aggressive, and rapidly developing cancer [3].
According to the most recent WHO data, the number of melanoma-related deaths is projected to rise by 20% by 2025 and by 74% by 2040. Between 2008 and 2018, the incidence of melanoma skin cancer increased by 44 percent, while deaths increased by 32 percent. Australia, Norway, Denmark, Sweden, and Germany are the five countries with the highest rates of skin cancer [4].
Melanoma develops in pigment cells (melanocytes). Unlike other forms of skin cancer, it can quickly spread (metastasize) to other organs. Cancer may spread through tissue, the lymph system, or the bloodstream. When spreading through the skin, it can only disperse to surrounding areas; but if it penetrates the lymph system or blood vessels, it can spread to other body tissues [5]. Tissue invaded by melanoma becomes a tumor proliferation, which is hard to treat. Since the malignant growth occurs on the skin's surface, it may be detected by a quick visual examination and fully cured if detected early. A dermatologist's visual examination of the suspicious skin area is the initial step in identifying a malignant lesion. Because several forms of lesions look identical, a proper diagnosis is important [6]. The melanoma stages are summarized in Table I.

TABLE I
MELANOMA STAGES WITH DESCRIPTION [9]

Stage 0: Abnormal melanocytes are located in the skin's outer layer (epidermis). These cells have the potential to develop into a tumor. This is also known as "melanoma in situ."

Stage 1: Cancer has developed. Size and ulceration (the formation of a break in the skin) are considered. This stage is split into two sub-stages, 1A and 1B.
  Stage 1A: The cancer is less than 1 mm thick and has no ulceration.
  Stage 1B: The cancer has ulceration but is less than 1 mm thick, or it is between 1 mm and 2 mm thick but has no ulceration.

Stage 2: Size and ulceration are considered. This stage is split into three sub-stages: 2A, 2B, and 2C.
  Stage 2A: The cancer has ulceration and is more than 1 mm but not more than 2 mm thick, or it is between 2 mm and 4 mm thick but has no ulceration.
  Stage 2B: The cancer has ulceration and is more than 2 mm but not more than 4 mm thick, or it is more than 4 mm thick but has no ulceration.
  Stage 2C: The cancer is thicker than 4 mm and has ulceration.

Stage 3: Metastasis is considered. Tumors of any size, with or without ulceration, may exist. The cancer may have spread to one or more lymph nodes; tumor cells may lie at least 2 cm from the main cancer region in a lymph vessel between the main tumor and the neighbouring lymph nodes; or there may be smaller tumors within a 2 cm area around the main tumor on or below the skin.

Stage 4: The tumor may spread to other areas of the body, such as the lungs, brain, or liver, far from the main tumor.
Zhiying X et al. proposed a computer-aided methodology for detecting skin cancer from dermoscopy images. To improve diagnostic accuracy, the input images were first denoised using a median filter. To distinguish the cancerous area from the background, a CNN based on the SBO algorithm was used. The segmented images were then used to extract a variety of features. An optimal approach based on the SBO algorithm was used for feature selection and for reducing the number of features to refine the classification process. To perform the final classification, the extracted features were categorized using the SBO algorithm and an optimized SVM classifier [13].
Youssef Filali et al. proposed an effective skin-lesion classification method based on the combination of handcrafted features (shape, skeleton, colour, and texture) and features derived from a highly effective deep learning architecture. Feature engineering was then used to exclude unnecessary features and keep only the appropriate ones. The results were attributed, first, to the fusion of handcrafted and pre-trained features, and second, to the use of a genetic algorithm for feature selection, which demonstrated that the chosen features were the best and most important ones. For better results, this model could be enhanced by extracting more features from the data [14].
Ammara M and Adel AJ implemented a new method for automated segmentation of histopathological skin images without the need for user interference. For fine segmentation of cancerous regions, the approach merged the advantages of clustering and evolutionary algorithms such as level sets. An improved fuzzy C-means clustering with orientation sensitivity aided in a better approximation of the tumor field. This novel orientation-sensitive fuzzy C-means clustering was utilized to produce the initial coarse segmentation and to quantify the parameters governing the level-set evolution. A refined fast level-set-based algorithm was then used to complete the segmentation process [15]. Ni Zhang et al. proposed an optimal CNN approach for early skin cancer detection. An improved whale optimization strategy was used to optimize the CNN. The primary goal of CNN training was to improve the layer parameters, creating a strong relationship between the layers to ensure proper recognition. The optimization algorithm was used to optimize the network's weights and biases in order to decrease the error between the network output and the desired output [16].
Marwan AA proposed a new prediction technique that classified skin lesions as benign or cancerous based on a regularizer method. Deep CNNs were used to achieve strongly discriminative and theoretically common tasks over the finely grained target classes. The result was a binary classifier that distinguished between benign and malignant lesions. Furthermore, this CNN model was reviewed on various use cases and yielded positive AUC-ROC results. In that work, the novel regularizer was not appropriate for feature selection or feature reduction [17].

III. PROPOSED METHOD
Because of the large amount of pixel data found in digital histology images, a CNN-RNN can be used to analyze them. In this research, a hybrid CA-CNN-RNN model is proposed that operates on large-context images. The proposed model first encodes a histology image's local representations into higher-dimensional features, then aggregates the features by evaluating their spatial arrangement to enable the final predictions. Figure 3 shows the processing stages for classifying skin cancer in the proposed model. Initially, at the preprocessing step, dataset samples were preprocessed in order to determine the real features of the images. The dataset was then divided into training and testing sets. The pre-trained CNN architecture combined with an RNN classifier was utilized to classify the test dataset.

a. Hybrid CA-CNN-RNN Model
In this research, a hybrid model combining CNN and RNN architectures is presented to classify skin cancer from histopathological images. The proposed model's first stage encodes the input image Z_t into the feature cube E_t. The input images are interpreted patch by patch by the LR-CNN. The proposed architecture is adaptable enough to allow the use of any image classifier as the LR-CNN, such as VGG-19, Inception, ResNet50, and DarkNet-53. This adaptability further allows it to utilize pre-trained weights in the case of a small dataset. Furthermore, the LR-CNN can be trained independently before being integrated into the proposed model, allowing it to learn concrete representations and resulting in early convergence of the model's context-aware learning stage. The spatial dimensions of a patch's output features differ depending on the dimensions of the input patch and the network architecture used for feature extraction [18]. Figure 4 represents the detailed architecture of the proposed CA-CNN-RNN hybrid model.
The preceding layer's output is passed to the subsequent layer by the operator (→), and the preceding layer's output is represented by the operator (•). Since the proposed model's input has relatively broad spatial dimensions, certain parts of the image may be irrelevant for label prediction. An attention block was implemented to reduce the importance of trivial features and emphasize the important ones. The weighted feature cube E' is defined as (Equation 1):

E' = E ⊗ L_s(f_{1×1}(E; W_{1×1}))    (1)

where f_{1×1} and W_{1×1} represent the 1×1 convolution layer and its parameters, respectively, L_s indicates the softmax layer, and the operator ⊗ represents the Hadamard product.
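The attention weighting above can be sketched in a few lines of NumPy. This is an illustrative minimal version, not the paper's implementation: the 1×1 convolution is a per-location linear map over channels, the softmax is taken over all spatial positions, and the resulting map scales the feature cube via a Hadamard product. The names `attention_weight` and `W` are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a flat vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_weight(E, W):
    """Weight a feature cube E of shape (H, W_sp, C) by a spatial attention map.

    A 1x1 convolution is a per-location linear map over channels,
    implemented here as a tensordot with a (C, 1) weight matrix W.
    The softmax over all spatial positions yields the attention map,
    which rescales E via an element-wise (Hadamard) product.
    """
    h, w, c = E.shape
    logits = np.tensordot(E, W, axes=([2], [0]))[..., 0]  # (h, w)
    attn = softmax(logits.reshape(-1)).reshape(h, w)      # weights sum to 1
    return E * attn[..., None]                            # broadcast Hadamard product

E = np.random.rand(4, 4, 8)   # toy feature cube
W = np.random.rand(8, 1)      # 1x1 conv parameters
E_prime = attention_weight(E, W)
```

Because the softmax weights sum to one over all spatial positions, each location's features are scaled down in proportion to how little attention it receives.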
After the LR-CNN is utilized to encode the essential patch-based representation of the image into the feature cube, the goal of the context block (CB) is to learn the spatial context within the feature cube. The CB discovers the relationships among image patch features based on their spatial positions. Three CB architectures were implemented, each with a different level of sophistication and capacity to collect context data. The first CB is made up of a 3x3 convolution layer, followed by ReLU activation and batch normalization. The next CB uses a residual block architecture with two distinct filter sizes. In contrast to the prior two context blocks, the third context block processes the input feature map in parallel with various filter sizes to capture context from differing receptive fields. The extracted features are given as input to the RNN layer.
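The first context block (3x3 convolution, then ReLU, then normalization) can be illustrated with a small NumPy sketch. This is a single-channel toy version under stated assumptions, not the paper's code: batch normalization is stood in for by per-map standardization, and the names `conv3x3` and `context_block` are hypothetical.

```python
import numpy as np

def conv3x3(x, k):
    """'Same'-padded 3x3 convolution of a single-channel map x with kernel k."""
    h, w = x.shape
    p = np.pad(x, 1)                       # zero-pad 1 pixel on every side
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(p[i:i+3, j:j+3] * k)
    return out

def context_block(x, k):
    """3x3 conv -> ReLU -> per-map standardization (stand-in for batch norm)."""
    y = np.maximum(conv3x3(x, k), 0.0)     # ReLU activation
    return (y - y.mean()) / (y.std() + 1e-5)

x = np.random.rand(8, 8)       # toy feature map
k = np.ones((3, 3)) / 9.0      # simple averaging kernel
y = context_block(x, k)
```

The 3x3 window is what lets each output position mix information from its spatial neighbours, which is the "context" the block is meant to capture.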
Using internal memory, the RNN can handle long-term dependencies. Nodes between layers in a CNN's fully connected networks are connectionless and process only one input, whereas nodes in an RNN are connected in a directed graph and process input in a specified sequence [18]. A cascaded sequence of three context blocks (C(•)) of the same form was used. Using the traditional CNN structure, the network produces both coarse and fine labels. The softmax loss function was employed during the training phase to jointly optimize the coarse and fine label predictions (Equation 3):

L = -(1/|N|) Σ_{n∈N} [ Σ_{c∈C} 1{y_n = c} log p_c^(n) + Σ_{f∈F} 1{y_n = f} log p_f^(n) ]    (3)
where 1{·} is the indicator function. The characters N, C, and F represent the collection of images, coarse categories, and fine categories, respectively. The softmax probabilities of the coarse and fine categories are denoted by p_c and p_f, respectively. During the inference phase, the trained network can be used to determine coarse and fine labels at the same time.
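The joint coarse/fine softmax loss described above can be sketched as follows. This is a minimal NumPy version under the assumption that the loss is the sum of two standard cross-entropy terms, one per label head, averaged over the batch; the function name `joint_loss` is illustrative.

```python
import numpy as np

def softmax(z):
    """Row-wise numerically stable softmax."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def joint_loss(coarse_logits, fine_logits, coarse_y, fine_y):
    """Sum of cross-entropy losses over the coarse and fine label heads,
    averaged over the batch (the indicator 1{.} selects the true class)."""
    n = coarse_logits.shape[0]
    pc = softmax(coarse_logits)
    pf = softmax(fine_logits)
    loss_coarse = -np.log(pc[np.arange(n), coarse_y]).mean()
    loss_fine = -np.log(pf[np.arange(n), fine_y]).mean()
    return loss_coarse + loss_fine

rng = np.random.default_rng(0)
cl = rng.normal(size=(4, 2))   # logits for 2 coarse categories
fl = rng.normal(size=(4, 5))   # logits for 5 fine categories
loss = joint_loss(cl, fl, np.array([0, 1, 0, 1]), np.array([0, 2, 4, 1]))
```

Optimizing the two terms jointly lets the shared backbone learn features that serve both granularities at once.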
A recurrent neural network is a feedforward neural network with one or more feedback loops that is designed to process sequential data. Given an input sequence (x1, ..., xt), an RNN produces an output sequence (y1, ..., yt). An RNN is utilized whenever an input-output relationship depends on time and the model must deal with long-term dependencies. The sequence modelling method is to use an RNN to encode the input sequence into a fixed-sized vector, and then map the vector to a softmax layer. Unfortunately, the RNN has a difficulty in that the gradient vector can grow or shrink exponentially over long sequences. These vanishing and exploding gradient issues make it difficult to learn long-term associations from the RNN architecture's sequences.
However, the Long Short-Term Memory (LSTM) is capable of successfully solving such long-distance dependency problems. The primary distinction between the LSTM and the RNN is that the LSTM adds a separate memory cell state to hold long-term state, which it updates or exposes as needed. The LSTM is made up of three gates: an input gate, a forget gate, and an output gate. These gates are used to regulate how much of the input should be read (input gate i), whether the current cell value should be forgotten (forget gate f), and whether the new cell value should be output (output gate o).
These gates allow the input signal to propagate through the recurrent hidden states without distorting the output; as a result, the LSTM can deal with vanishing and exploding gradients and successfully model long-term temporal dynamics that a plain RNN cannot learn [22].
LSTM neurons are used as recurrent neurons in this research. The gate definitions and the LSTM update at timestep k are described in Equations 4 to 9:

i_k = σ(Z_yi y_k + Z_di d_{k-1} + Z_ui u + w_i)    (4)
f_k = σ(Z_yf y_k + Z_df d_{k-1} + Z_uf u + w_f)    (5)
o_k = σ(Z_yo y_k + Z_do d_{k-1} + Z_uo u + w_o)    (6)
g_k = tanh(Z_yg y_k + Z_dg d_{k-1} + Z_ug u + w_g)    (7)
s_k = f_k ʘ s_{k-1} + i_k ʘ g_k    (8)
d_k = o_k ʘ tanh(s_k)    (9)

where ʘ denotes the element-wise product, σ is the sigmoid function σ(x) = (1 + e^{-x})^{-1}, and tanh is the hyperbolic tangent function tanh(x) = (e^x - e^{-x})/(e^x + e^{-x}). The symbols i, f, o, and g stand for the input gate, forget gate, output gate, and input modulation gate, respectively. The input vector, hidden state, image visual feature, and memory cell are represented by y, d, u, and s, respectively. When updating the LSTM, the image visual feature u is injected at each timestep. The weights and biases that must be learnt are denoted by Z and w. Because the CA-CNN-RNN both classifies skin cancer and trains on and predicts skin cancer images, there is no need to construct separate networks for different levels of classification. As a result, the hybrid CA-CNN-RNN is robust and can classify any image category.
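One LSTM timestep as described above can be sketched in NumPy. This is an illustrative implementation under the assumption that all gates share a concatenated input of (y, d_{k-1}, u); the visual feature u is injected at every step as in the text, and the name `lstm_step` is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(y, d_prev, s_prev, u, Z, w):
    """One LSTM update at timestep k, with the image visual feature u
    injected at every step. y: input vector, d_prev: hidden state,
    s_prev: memory cell, Z: dict of weight matrices, w: dict of biases."""
    x = np.concatenate([y, d_prev, u])     # shared input to all four gates
    i = sigmoid(Z['i'] @ x + w['i'])       # input gate
    f = sigmoid(Z['f'] @ x + w['f'])       # forget gate
    o = sigmoid(Z['o'] @ x + w['o'])       # output gate
    g = np.tanh(Z['g'] @ x + w['g'])       # input modulation gate
    s = f * s_prev + i * g                 # memory cell update (Eq. 8)
    d = o * np.tanh(s)                     # new hidden state (Eq. 9)
    return d, s

rng = np.random.default_rng(1)
n_in, n_hid, n_vis = 3, 4, 2
dim = n_in + n_hid + n_vis
Z = {k: rng.normal(size=(n_hid, dim)) for k in 'ifog'}
w = {k: np.zeros(n_hid) for k in 'ifog'}
d, s = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid),
                 rng.normal(size=n_vis), Z, w)
```

Because the memory cell s is updated additively through the forget gate, gradients can flow across many timesteps without vanishing, which is the property the text relies on.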

IV. EXPERIMENT ANALYSIS
The proposed model's results were obtained using the MATLAB/Simulink tool. Based on histology images of melanoma skin cancer, a context-aware CNN-RNN hybrid model for skin cancer grading was proposed in this work. The dataset was used to generate the results, which were then analyzed based on parameters such as accuracy, recall, precision, and F-score. H&E-stained images from The Cancer Genome Atlas were used for the assessment. Melanoma cancer grading was the key focus of this study, and the model can be used to perform grading for other severe diseases by using different datasets in the process.

A. Description of Dataset
The Cancer Genome Atlas provided 58 H&E-stained images from formalin-fixed, paraffin-embedded diagnostic blocks of melanoma cancers. Using Bio-Formats, each digitized histology image was scaled to 20x magnification with a pixel size of 0.504 μm. Expert pathologists provided annotations on each slide for four separate areas to fix the ground truth for regional classification: tumor region, normal stroma, normal epidermis, and lumen/white space. Of the 58 images, 37 randomly selected images were used for training and 21 images were used for testing [23].
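The 37/21 random split can be sketched with the standard library. This is a generic illustration, not the authors' exact selection procedure; the function name `split_dataset` and the fixed seed are assumptions made for reproducibility of the example.

```python
import random

def split_dataset(image_ids, n_train, seed=0):
    """Randomly select n_train ids for training; the remainder form the test set."""
    rng = random.Random(seed)                       # seeded for reproducibility
    train = rng.sample(image_ids, n_train)          # sample without replacement
    chosen = set(train)
    test = [i for i in image_ids if i not in chosen]
    return train, test

ids = list(range(58))               # 58 H&E-stained images
train, test = split_dataset(ids, 37)
```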

B. Performance Assessment
Using parameters such as accuracy, recall, precision, and F-score, the results were evaluated for various network models: VGG-19 [24], Inception [25], ResNet50 [26], and DarkNet-53 [27]. The melanoma cancer histology images were split into patches of size 1800x1800, and the label of every patch was predicted using the proposed model at an input level of 224x224. The performance analysis of these classifier models for melanoma cancer grading is tabulated in Table II; the measures are defined in Equations 10 to 13.
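Splitting a whole-slide image into fixed-size patches can be sketched as below. This is a minimal illustration of non-overlapping tiling, not the paper's pipeline; the function name `tile_patches` and the example image dimensions are assumptions.

```python
def tile_patches(height, width, patch=1800):
    """Return the top-left corners of non-overlapping patch x patch tiles
    that fit fully inside a height x width image."""
    return [(r, c)
            for r in range(0, height - patch + 1, patch)
            for c in range(0, width - patch + 1, patch)]

# A hypothetical 5400 x 3600 histology image yields 3 x 2 = 6 full tiles.
corners = tile_patches(5400, 3600, patch=1800)
```

Each 1800x1800 patch would then be resized to the model's 224x224 input level before prediction.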
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (10)
Recall = TP / (TP + FN)    (11)
Precision = TP / (TP + FP)    (12)
F-score = 2 × (Precision × Recall) / (Precision + Recall)    (13)

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively. While the classifiers' overall efficiencies were comparable, Inception outperformed the others with the highest mean accuracy. The DarkNet-53 model, on the other hand, exhibited consistent performance across the 3 folds with the minimum standard deviation.
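The four reported measures follow the standard confusion-matrix definitions and can be computed as below; the function name `metrics` and the example counts are illustrative.

```python
def metrics(tp, fp, fn, tn):
    """Standard accuracy, precision, recall, and F-score from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score

# Hypothetical confusion counts for one classifier on one fold.
acc, prec, rec, f1 = metrics(tp=40, fp=5, fn=10, tn=45)
```

The F-score is the harmonic mean of precision and recall, so it penalizes a classifier that trades one off sharply against the other.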

V. CONCLUSION
Based on histology images of melanoma skin cancer, a context-aware CNN-RNN hybrid model for skin cancer grading was proposed in this research. The proposed model is intended for the classification of large input images. In this work, H&E histology images from The Cancer Genome Atlas were used as the dataset for assessment. Of the 58 images, 37 were selected randomly for the training phase and 21 were used for the testing phase. The histology images of melanoma skin cancer were analyzed and validated with various classifiers, namely VGG-19, Inception, ResNet50, and DarkNet-53, within the CA-CNN-RNN model. The dataset was used to generate the results, which were then analyzed based on parameters such as accuracy, recall, precision, and F-score. Melanoma cancer grading was the key focus of this work, and the proposed model can be used to perform grading for other severe diseases by using different datasets in the process. The proposed model is adaptable, allowing it to use any network architecture based on local representation learning. According to the performance review, the proposed CA-CNN-RNN performed well with the different classifiers, and among them the DarkNet-53 model performed best on every parameter, as discussed in the results section. In future, advanced deep learning models can be combined with the proposed model for enhanced performance, and the model can be applied to various cancer grading processes using suitable datasets.

ACKNOWLEDGEMENT
We would like to thank the University of Tabuk, Kingdom of Saudi Arabia for providing the support for our research work.