Hybrid high-performance intelligent computing approach of CA-CNN and RNN for skin cancer image grading

Skin cancer is defined as unregulated cell proliferation due to irreversible DNA damage. Melanoma is one kind of skin cancer, generated by melanocytes, which can result in serious health issues. Early detection of this skin cancer based on image processing assists clinicians in its treatment. Computational pathology has the unique capability of spatially dissecting specific interfaces on digitized histology images. The main objective of this research is to classify skin cancer using a deep learning model. This work proposes a hybrid context-aware convolutional neural network with a recurrent neural network (CA-CNN-RNN) based on skin cancer histology images. The model encodes the local representations of the histological images into high-dimensional features and then aggregates the features by considering their spatial configuration to perform the final classification. The dataset for this work is made up of H&E-stained images from the database of The Cancer Genome Atlas. Using the hybrid CA-CNN-RNN model, experiments on histological images of melanoma were performed and validated against several classification models, namely DarkNet-53, VGG-19, ResNet50, and Inception. The dataset is utilized to evaluate the model, and the results are analyzed using parameters such as accuracy, precision, recall, and F1-score. The experimental analysis reveals that the CA-CNN-RNN performed best with the DarkNet-53 model, achieving 97.14% accuracy, 96.49% precision, 98.21% recall, and 96.50% F-score.


Introduction
Melanoma, a harmful type of skin cancer, has risen by around 50% in the last decade to over 2,870,001 cases. This equates to over 600,001 melanoma-related deaths per year. The World Health Organization estimates that over a million non-melanoma skin cancer cases are diagnosed every year (Manu et al. 2020). Although most countries do not officially report cases of non-melanoma cancer, the real incidence is widely considered to run into several millions. As a result, skin cancer is the most widespread cancer in the world. Skin cancer is classified into two sorts: non-melanoma and melanoma (Khushboo et al. 2019). Melanoma is the most lethal type of skin cancer. Basal cell carcinoma and squamous cell carcinoma are the most prominent non-melanoma tumors. Merkel cell cancer is a rare, highly aggressive, and rapidly developing cancer (Erdem and Mehmet 2018).
According to the most recent WHO data, the number of melanoma-related deaths will rise by 20% by 2025, increasing to 74% by 2040. Between 2008 and 2018, the incidence of melanoma skin cancer increased by 44%, while deaths increased by 32%. Australia, Norway, Denmark, Sweden, and Germany are the five countries with the highest rates of skin cancer (Peizhen et al. 2019). According to the GLOBOCAN 2018 report, skin cancer in India accounts for less than 1% of all cancers (https://rb.gy/gcn89u).
Melanoma develops in pigment cells (melanocytes). Unlike other forms of skin cancer, it can quickly spread (metastasize) to other organs. Cancer may spread through tissue, the lymph system, or the bloodstream. When spreading through the skin, it can only disperse to surrounding areas; but if it penetrates the lymph system or blood vessels, it can spread to other body tissues (Titus et al. 2019). Tissue reached by melanoma develops tumor proliferation, which is hard to treat. Since the malignant development happens on the skin's surface, it can be detected by a quick visual examination and fully cured if recognized early. A dermatologist's visual screening of the suspicious skin area is the initial step in identifying a malignant lesion. Because several forms of lesions look identical, a proper diagnosis is important (Kassem et al. 2021) (Table 1).

Types of skin cancer
Basal cell carcinoma (BCC) is a common type of skin cancer. It is very common in people with fair skin. BCCs are generally identified as a flesh-colored spherical growth, a pearl-like lump, or a pinkish skin patch. Indoor tanning or sun exposure can lead to BCCs, which can thereafter develop into cancerous growths on the skin. BCCs can be detected in a variety of locations, the most prevalent of which are the head and neck as well as the arms and legs. BCC must be diagnosed and treated as soon as possible. BCC can disseminate in many ways; it has the potential to damage nerves and bones if left unchecked (Fig. 1). Squamous cell carcinoma (SCC) is one of the prevalent types of skin cancer. People with fair skin are more likely to develop SCC, though this cancer can also occur in people with darker skin. Scaly patches or red hard bumps are common symptoms of SCC. Sores that heal only to return can also occur. People with sun-exposed areas of the body, such as the ears, arms, neck, and back, are more likely to get SCC. SCC can injure and deform the skin because it grows deep into the dermis. Early recognition and treatment can prevent SCC from growing deeper and spreading to various parts of the body. Actinic keratoses (AKs) are dry, scaly spots or patches that appear on the skin of some persons. An AK, which is also caused by excessive sun exposure, is not skin cancer.
Melanoma is frequently referred to as ''the most deadly skin cancer'' due to its proclivity to spread. One may at first notice an unusually dark region on the skin which is not like the rest of the skin; this is a sign of melanoma. Early detection and treatment are critical.
Unfortunately, only after the suspicious lesion (or mole) has been biopsied or excised can the stage of a melanoma be established. Four specific features are known to decide the stage: tumor diameter, ulceration, and spread to lymph nodes or different areas of the body. Staging is critical in designing an adequate recovery plan and assessing the prognosis. Melanoma is classified into five stages, Stages 0-4 (Andre et al. 2017; Sumaiya et al. 2019); Table 1 describes each stage. Pathologists use histology slides to examine the microanatomy of cells and tissues under the microscope. However, recent advances in digital imaging solutions have produced digital histology slides, allowing pathologists to perform the same study on a computer screen (Achim et al. 2019; Jwan and Subhi 2021).

Table 1 Melanoma stages with description (Yan et al. 2017)

Stage 0: Abnormal melanocytes are located in the skin's outer layer (epidermis). These cells have the potential to develop into a tumor. This is also known as ''melanoma in situ.''

Stage 1: Cancer has developed. Size and ulceration (the formation of a skin break) are considered. This level is split into two stages, 1A and 1B.
  Stage 1A: The cancer is less than 1 mm in diameter and has no ulceration.
  Stage 1B: Either the cancer has ulceration but is less than 1 mm in thickness, or it has a diameter between 1 mm and 2 mm but no ulceration.

Stage 2: Size and ulceration are considered. This stage is split into three sections: Stages 2A, 2B, and 2C.
  Stage 2A: Either the cancer has ulceration and is more than 1 mm but not more than 2 mm thick, or it is between 2 mm and 4 mm thick but has no ulceration.
  Stage 2B: Either the cancer has ulceration and is more than 2 mm but not more than 4 mm thick, or the cancer is more than 4 mm thick but has no ulceration.
  Stage 2C: The cancer is thicker than 4 mm and has ulceration.

Stage 3: Metastasis is considered. Tumors of any size, with or without ulceration, might exist. The cancer might have spread to more than one lymph node, or tumor cells may be at least 2 cm away from the main cancer region and might be in a lymph vessel between the main tumor and the neighboring lymph nodes, or there may be smaller tumors in a 2 cm area around the main tumor on or below the skin.

Stage 4: The tumor may have spread to other areas of the body, such as the lungs, brain, or liver, and may be far from the main tumor.

The CA-CNN-RNN model for the classification of skin tumors based on histological images is designed in this work. The proposed approach first encodes the image's local representations into high-dimensional features and then integrates the features by considering their spatial configurations to perform the final classification. Fifty-eight H&E-stained images from formalin-fixed, paraffin-embedded diagnostic blocks of melanoma skin cancer from The Cancer Genome Atlas are used as the dataset.
Hematoxylin and Eosin (H&E) staining is commonly utilized in histopathology studies because it allows the researcher/pathologist to see the tissue in precise detail. This is accomplished by visibly labeling cell features such as the organelles, nucleus, cytoplasm, and extracellular components. Using a context-aware CNN-RNN architecture, the proposed model assesses melanoma skin cancer from histopathological images. The remainder of this paper is organized as follows: Sect. 2 discusses the related works, Sect. 3 presents the proposed method, Sect. 4 presents the performance analysis, and Sect. 5 presents the conclusion and future extensions of the research.

Related works
To develop a cell classification method that relied exclusively on cell nuclei morphology, Konstantinos et al. proposed a hierarchical structure that replicated how pathologists view tumor architecture and identify tumor heterogeneity. To capture the global context, the Simple Linear Iterative Clustering (SLIC) superpixel algorithm was utilized to segment and identify tumor regions in low-resolution H&E-stained histology images of melanoma cancers. On 58 whole-tumor images from the melanoma dataset, superpixel classification of the images into tumor, stroma, epidermis, and lumen/white space yielded 97.70% accuracy in training and 95.70% accuracy in testing (Konstantinos et al. 2018). Because of phenotypic correlations, there are limitations to classifying cells based solely on local contextual factors or nuclear morphology.
Zhiying et al. proposed a computer-aided methodology for detecting skin cancer from dermoscopy images. To improve diagnostic accuracy, the input images were first denoised using a median filter. To distinguish the cancerous area from the background, a CNN based on the SBO algorithm was used. A variety of features were then extracted from the segmented images. An optimal approach based on the SBO algorithm was utilized for optimal feature selection and for reducing the feature order to refine the classification process. To perform the final classification, the extracted features were categorized using the SBO algorithm and an optimized SVM classifier (Zhiying et al. 2020). The results could be further improved by extracting additional features from the segmented images. Filali et al. proposed an effective skin-lesion classification method based on the combination of handcrafted features (shape, skeleton, color, and texture) and features derived from a very effective deep learning architecture. Feature engineering was then used to exclude unnecessary features and pick only those that were appropriate. The performance was attributed, first, to the fusion of handcrafted and pretrained features, and second, to utilizing a genetic algorithm for feature selection, which demonstrated that the chosen features were the best and most important ones. For better results, this model could be enhanced by extracting more features from the data (Filali et al. 2020).
Ammara and Al-Jumaily implemented a new method for automated segmentation of histopathological images of skin without the need for user interference. For fine segmentation of cancerous regions, the approach merged the advantages of clustering and evolutionary algorithms such as level sets. An improved fuzzy C-means clustering with orientation sensitivity aided in the better approximation of the tumor field. This novel orientation-sensitive fuzzy C-means clustering was utilized to produce the original coarse segmentation and to quantify the governing parameters of the level-set evolution. Then, to complete the segmentation process, a refined fast level-set-based algorithm was used (Ammara and Al-Jumaily 2020). The results were obtained from segmented images only; to improve them, feature-extraction-based classification could be implemented. Zhang et al. proposed an optimal CNN approach for early skin cancer detection. An improved whale optimization strategy was used to optimize the CNN. The primary goal of CNN training was to improve the layer parameters, which created a strong relationship between the layers to ensure proper recognition. The optimization algorithm was used to optimize the collection of weights and biases in the network to decrease the error between the network output and the desired output, which could be more efficient at producing better results (Zhang et al. 2020).
Marwan AA proposed a new prediction technique that classified skin lesions as benign or cancerous based on a regularizer method. Deep CNNs were used to achieve strongly differentiated and theoretically common tasks over the finely grained target classes. As a result, this was a binary classifier that distinguished between benign and malignant lesions. Furthermore, this CNN model was reviewed on various use cases and yielded positive AUC-ROC results. In that work, the novel regularizer was not appropriate for feature selection or feature reduction (Marwan 2019). Muhammad et al. presented a new context-aware DNN for tumor grading that could incorporate 64 times more context than traditional CNN-based patch classifiers. The network was well suited for the cancer grading task based on detecting tumors in glandular structures. Two stacked CNNs were used in the proposed context-aware network. The first, LR-CNN, was used to learn the histological image's local representations. The RA-CNN then aggregated the learned local representations based on their spatial pattern. The proposed context-aware model was tested for cancer grading and classification. For patient survival analysis, this method can conduct downstream analysis at the digital WSI level (Muhammad et al. 2020).
Hosny et al. used AlexNet and transfer learning models to classify skin lesions using a deep neural network. To compare with the state of the art, the proposed approach was trained and evaluated using the public ISIC 2018 dataset. This approach has the potential to classify seven different types of lesions (Kassem et al. 2020a). Kassem et al. classified skin lesions using transfer learning and pretrained models based on GoogleNet. The ISIC 2019 dataset was used to evaluate the proposed model's capacity to classify various sorts of skin tumors. To solve the problem of image imbalance between classes, this method's performance improved as the quantity of images in each class decreased (Kassem et al. 2020b). Hosny et al. developed a deep CNN-based classification technique for investigating and examining skin cancer. An alternative technique to augmentations such as rotations and translations was applied to the segmented images to solve the problems of class imbalance and overfitting (Kassem et al. 2020c).
Using ECOC (Error-Correcting Output Codes)-SVM and deep convolutional neural network techniques, a skin cancer classification model was proposed in Dorj et al. (2018). Skin cancer images in RGB format were utilized to evaluate the model. These images were cropped to minimize noise and achieve better results. To extract features, a pre-trained, existing AlexNet CNN model was utilized. The skin cancer was classified using an ECOC SVM classifier. The computational time of this model was high because dermatologists had to validate the images one by one.
In Hekler et al. (2019), the potential benefits of integrating human and AI expertise for classifying skin cancer were studied. Dermoscopic images were classified into five diagnostic categories and used to train a single CNN. The dermatologists and the trained CNN then independently classified a set of biopsy-verified skin lesions into those five classifications. Given the certainty of each decision, the two independently determined diagnoses were integrated into a new classifier using a gradient boosting approach. The primary task was to correctly classify the images into five distinct groups, while the secondary task was to correctly classify lesions as malignant or benign. Here the accuracy was degraded due to the poor quality of the images.

Proposed method
Because of the large amount of pixel data found in digital histology images, a CNN-RNN can be used to analyze them. In this research, a hybrid CA-CNN-RNN model is proposed which focuses on images to provide a large context. The proposed method encodes the histopathological image's local representations into high-dimensional features and then combines the features by evaluating their spatial arrangement to perform the final classification.
From The Cancer Genome Atlas, 58 H&E-stained images from formalin-fixed, paraffin-embedded diagnostic blocks of melanoma skin cancer were used as the dataset. Using the CA-CNN-RNN, the proposed model assesses melanoma skin cancer from histopathological images. Figure 3 shows the processing stages for classifying skin cancer in the proposed approach. Initially, at the preprocessing step, the dataset samples are preprocessed in order to determine the real features of the images. The dataset is then split into training and testing sets. The pretrained CNN model integrated with an RNN classifier is utilized to classify the test dataset.

Hybrid CA-CNN-RNN Model
In this research, a hybrid model combining CNN and RNN architectures is presented to classify skin cancer from histopathological images. The proposed model's first stage encodes the input image Z_t into the feature cube E_t, which the LR-CNN builds patch by patch. The proposed architecture is adaptable enough to allow the use of any image classifier as the LR-CNN, such as VGG-19, Inception, ResNet50, or DarkNet-53. This adaptability also allows it to utilize pretrained weights in the case of a small dataset. Furthermore, the LR-CNN can be trained independently before being integrated into the proposed model, allowing it to learn concrete representations and resulting in early convergence of the model's context-aware learning stage. The spatial dimensions of the output feature e^t_{ij} of a patch z^t_{ij} differ depending on the dimensions of the input patches and the network architecture used for feature extraction (Muhammad et al. 2020). Figure 4 represents the detailed architecture of the proposed CA-CNN-RNN hybrid model.
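To make the patch-wise encoding concrete, the sketch below tiles an image into non-overlapping patches and stacks each patch's feature vector into a (rows, cols, dim) feature cube. The statistics-based `toy_encoder` is purely illustrative and stands in for a real LR-CNN backbone such as DarkNet-53.

```python
import numpy as np

def encode_feature_cube(image, patch, encoder):
    """Tile `image` into non-overlapping `patch`-sized pieces and encode each
    patch z_ij into a feature vector e_ij, producing a feature cube."""
    H, W, _ = image.shape
    rows, cols = H // patch, W // patch
    dim = encoder(image[:patch, :patch]).shape[0]
    cube = np.zeros((rows, cols, dim))
    for i in range(rows):
        for j in range(cols):
            z_ij = image[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
            cube[i, j] = encoder(z_ij)   # local representation of patch z_ij
    return cube

# Stand-in encoder: channel-wise mean/std statistics instead of a real LR-CNN.
def toy_encoder(patch):
    return np.concatenate([patch.mean(axis=(0, 1)), patch.std(axis=(0, 1))])

img = np.random.rand(8, 8, 3)              # tiny 8x8 RGB stand-in "slide"
cube = encode_feature_cube(img, 4, toy_encoder)
print(cube.shape)                          # (2, 2, 6)
```

Because the encoder is a plain callable, any pretrained classifier's feature extractor could be dropped in without changing the tiling logic.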
This CA-CNN model, combined with an RNN, is presented to enhance the feature representations and the fundamental context information between features of skin cancer images. It comprises context feature extraction, feature fusion, and classification. During feature extraction, a context-region-of-interest (Context-RoI) layer is added to the CNN, and context features are extracted.
The operator (→) denotes passing the preceding layer's output to the subsequent layer, and the operator (•) denotes a layer's output. Since the proposed model's feature cube has relatively broad spatial dimensions, certain parts of the image may be irrelevant for predicting the image label. An attention block is implemented to down-weight trivial features and emphasize informative ones. The weighted feature cube E′ is defined as (Eq. 1):

E′ = L_s(L^{1×1}_c(E; h_c)) ∘ E    (1)

where L^{1×1}_c and h_c represent the 1 × 1 convolution layer and its parameters, respectively, L_s indicates the softmax layer, and the operator ∘ represents the Hadamard product.
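The attention step of Eq. 1 can be sketched as a per-location linear scoring (the 1 × 1 convolution collapses to a dot product with a weight vector at each spatial position), a softmax over the spatial grid, and a Hadamard rescaling of the feature cube. The weight vector here is random and purely illustrative.

```python
import numpy as np

def attention_weight(E, w):
    """E' = softmax(1x1-conv scores) ∘ E over the spatial grid of E."""
    rows, cols, dim = E.shape
    scores = E.reshape(-1, dim) @ w            # L^{1x1}_c(E; w): one score per patch
    a = np.exp(scores - scores.max())
    a = a / a.sum()                            # L_s: softmax over all locations
    return E * a.reshape(rows, cols, 1)        # Hadamard rescaling of the cube

rng = np.random.default_rng(0)
E = rng.standard_normal((2, 2, 6))
w = rng.standard_normal(6)                     # hypothetical learned parameters h_c
E_prime = attention_weight(E, w)
print(E_prime.shape)                           # (2, 2, 6)
```

Patches whose scores dominate the softmax keep most of their magnitude, while low-scoring (trivial) patches are attenuated, matching the stated purpose of the block.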
After the LR-CNN is utilized to encode the essential patch-based representations of the image into the feature cube, the goal of the primary context block (CB) is to learn the spatial context within the feature cube. The CB discovers the relationships among image patch features based on their spatial positions. Three CB architectures were implemented, each with a different level of sophistication and capacity to collect context information.
The first CB is made up of a 3 × 3 convolution layer, followed by ReLU activation and batch normalization. The next CB uses a residual-block architecture with two distinct filter sizes. In contrast to the prior two context blocks, the third context block processes the input feature map in parallel with various filter sizes to capture context from differing receptive fields.
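A minimal NumPy sketch of the first context block (3 × 3 convolution over the patch-feature grid, followed by ReLU; batch normalization is omitted for brevity) might look like the following. The kernel is random and illustrative only.

```python
import numpy as np

def conv3x3_relu(E, K):
    """First context block, sketched: a 3x3 convolution over the spatial grid
    of patch features, followed by ReLU.
    E: (rows, cols, d_in) feature cube; K: (3, 3, d_in, d_out) kernel."""
    rows, cols, d_in = E.shape
    d_out = K.shape[3]
    P = np.pad(E, ((1, 1), (1, 1), (0, 0)))          # 'same' zero padding
    out = np.zeros((rows, cols, d_out))
    for i in range(rows):
        for j in range(cols):
            window = P[i:i+3, j:j+3, :]              # 3x3 spatial neighborhood
            out[i, j] = np.tensordot(window, K, axes=([0, 1, 2], [0, 1, 2]))
    return np.maximum(out, 0.0)                      # ReLU

rng = np.random.default_rng(1)
E = rng.standard_normal((4, 4, 6))
K = rng.standard_normal((3, 3, 6, 8))
print(conv3x3_relu(E, K).shape)                      # (4, 4, 8)
```

Each output location mixes its 3 × 3 neighborhood of patch features, which is exactly how the block injects spatial context that a purely patch-wise classifier would miss.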
The extracted features are given as input to the RNN layer. Using internal memory, the RNN can handle long-term dependencies. In a CNN's fully connected networks, nodes between layers are connectionless and process just one input, whereas nodes in an RNN are connected in a directed graph and process inputs in a specified sequence (Muhammad et al. 2020). A cascaded sequence of three context blocks C(•) of the same form is used. The final prediction V′ is computed from the features of the input image Z as (Eq. 2):

V′ = L_s(C(C(C(E′))))    (2)

where E′ is the attention-weighted feature cube of Z. Using the traditional CNN structure, the network produces both coarse and fine labels. The softmax loss function is employed during the training phase to jointly optimize the coarse and fine label predictions (Eq. 3).
L = −(1/|N|) Σ_{n∈N} [ Σ_{j∈C} 1{y_n = j} log p_j + Σ_{k∈F} 1{y_n = k} log p_k ]    (3)

where 1{·} is the indicator function. The symbols N, C, and F represent the collection of images, coarse categories, and fine categories, respectively. The softmax probabilities of the coarse and fine classes are denoted by p_j and p_k, respectively. The trained network can determine fine and coarse labels simultaneously during the inference phase. The following steps represent the algorithm of the CA-CNN-RNN model.
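A sketch of the joint coarse/fine softmax loss, assuming each image has one set of logits per label granularity (the logits and labels below are random and illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def joint_softmax_loss(coarse_logits, fine_logits, coarse_y, fine_y):
    """Cross-entropy summed over coarse and fine label predictions,
    averaged over the N images (a sketch of Eq. 3)."""
    N = len(coarse_y)
    p_c = softmax(coarse_logits)             # p_j over coarse classes C
    p_f = softmax(fine_logits)               # p_k over fine classes F
    loss = -(np.log(p_c[np.arange(N), coarse_y]).sum()
             + np.log(p_f[np.arange(N), fine_y]).sum()) / N
    return loss

rng = np.random.default_rng(2)
loss = joint_softmax_loss(rng.standard_normal((4, 2)),    # 2 coarse classes
                          rng.standard_normal((4, 5)),    # 5 fine classes
                          np.array([0, 1, 0, 1]),
                          np.array([0, 2, 4, 1]))
print(loss > 0)                                           # True
```

The indicator sums in Eq. 3 reduce to indexing each image's true class, which is what the fancy-index `p_c[np.arange(N), coarse_y]` implements.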
An RNN is a neural network model with one or more feedback loops that is developed to process data sequentially. Given the input sequence (x_1, …, x_t), the RNN creates the output sequence (y_1, …, y_t). An RNN is utilized whenever an input-output relationship depends on time and the ability to deal with long-term dependencies is needed. The standard sequential modeling method is to use an RNN to encode the input sequence into a fixed-size vector and then map that vector to a softmax layer. In general, an RNN has difficulty when the gradient vector grows or decays exponentially over long sequences. The vanishing and exploding gradient issues make it difficult to learn long-term associations from sequences with the RNN architecture.
However, the Long Short-Term Memory (LSTM) is capable of successfully solving such long-distance dependency problems. The primary distinction between LSTM and RNN is that LSTM adds an individual memory cell state to hold long-term state, which it updates or exposes as needed. The LSTM is made up of three gates: the input, forget, and output gates. These gates are utilized to regulate how much of the input should be read (input gate i), whether the present cell value should be forgotten (forget gate f), and whether the new cell value should be exposed (output gate o).
These gates allow input signals to propagate through the recurrent hidden state without affecting the output; as a result, LSTM can cope with vanishing and exploding gradients and successfully model long-term temporal dynamics that a plain RNN cannot learn (Guo et al. 2018). LSTM neurons are used as the recurrent neurons in this work. The gate definitions and the LSTM update at timestep k are given in Eqs. 4-9:

i_k = σ(Z_i [y_k, d_{k−1}, u] + w_i)    (4)
f_k = σ(Z_f [y_k, d_{k−1}, u] + w_f)    (5)
o_k = σ(Z_o [y_k, d_{k−1}, u] + w_o)    (6)
g_k = φ(Z_g [y_k, d_{k−1}, u] + w_g)    (7)
s_k = f_k ⊙ s_{k−1} + i_k ⊙ g_k    (8)
d_k = o_k ⊙ φ(s_k)    (9)

where ⊙ denotes the element-wise product, σ is the sigmoid function σ(x) = (1 + e^{−x})^{−1}, and φ is the hyperbolic tangent function φ(x) = (e^x − e^{−x})/(e^x + e^{−x}). The symbols i_k, f_k, o_k, and g_k stand for the input gate, forget gate, output gate, and input modulation gate. The input vectors, hidden states, image visual features, and memory cells are represented by y, d, u, and s, respectively. The image visual feature u is imposed at each timestep when updating the LSTM. The weights and biases to be learnt are denoted by Z and w. Because the CA-CNN-RNN classifies skin cancer and also trains on and predicts skin cancer images, there is no need to construct separate networks for different levels of classification. As a result, the hybrid CA-CNN-RNN is robust and can classify any image category.
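Assuming the standard LSTM form of Eqs. 4-9, with the image visual feature concatenated into each gate's input, one update step can be sketched as follows; the weight matrices are random placeholders for the learned parameters Z and w.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(y_k, d_prev, s_prev, u, Z, w):
    """One LSTM update (Eqs. 4-9): gates are computed from the current input
    y_k, previous hidden state d_prev, and the image visual feature u."""
    x = np.concatenate([y_k, d_prev, u])
    i = sigmoid(Z['i'] @ x + w['i'])             # input gate       (Eq. 4)
    f = sigmoid(Z['f'] @ x + w['f'])             # forget gate      (Eq. 5)
    o = sigmoid(Z['o'] @ x + w['o'])             # output gate      (Eq. 6)
    g = np.tanh(Z['g'] @ x + w['g'])             # input modulation (Eq. 7)
    s = f * s_prev + i * g                       # memory cell      (Eq. 8)
    d = o * np.tanh(s)                           # hidden state     (Eq. 9)
    return d, s

rng = np.random.default_rng(3)
n_in, n_hid, n_u = 4, 3, 2
Z = {k: rng.standard_normal((n_hid, n_in + n_hid + n_u)) for k in 'ifog'}
w = {k: np.zeros(n_hid) for k in 'ifog'}
d, s = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid),
                 np.zeros(n_hid), rng.standard_normal(n_u), Z, w)
print(d.shape, s.shape)                          # (3,) (3,)
```

Because the forget gate multiplies the previous cell state rather than squashing it through a nonlinearity, gradients along s can survive many timesteps, which is the mechanism behind the vanishing-gradient claim above.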

Experiment analysis
The proposed model's results are evaluated using the MATLAB/SIMULINK tool. Based on histology images of melanoma skin cancer, a context-aware CNN-RNN hybrid model for skin cancer grading was proposed in this work. The dataset was utilized to evaluate the model, and the results are analyzed based on parameters such as accuracy, precision, recall, and F-score. The images from The Cancer Genome Atlas database were utilized for the experiment. Melanoma cancer grading was the key focus of this study, and the model can be applied to grading any severe disease by using different datasets in the process.

Description of dataset
The Cancer Genome Atlas provided 58 H&E-stained images from formalin-fixed, paraffin-embedded diagnostic blocks of melanoma cancers (https://www.openmicroscopy.org/bio-formats/). Using Bio-Formats, each digitized histology image was scaled to 20× magnification with a pixel size of 0.504 µm. Expert pathologists annotated four separate areas on each slide to fix the ground truth for regional classification: tumor region, normal stroma, normal epidermis, and lumen/white space. Out of the 58 images, 37 randomly selected images were utilized to train the model, and 21 images were utilized to test it.
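The 37/21 random split can be reproduced in spirit as follows; the seed is arbitrary, since the paper does not state one, and the indices stand in for the actual slide files.

```python
import numpy as np

# Randomly partition 58 slide indices into 37 training and 21 test slides.
rng = np.random.default_rng(42)          # arbitrary seed; not from the paper
indices = rng.permutation(58)
train_idx, test_idx = indices[:37], indices[37:]
print(len(train_idx), len(test_idx))     # 37 21
```

Splitting at the whole-slide level (rather than the patch level) keeps all patches from one patient on the same side of the split, avoiding optimistic leakage between training and testing.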

Performance assessment
Using parameters such as recall, accuracy, F-score, and precision, the results are evaluated against network models including Inception (Szegedy et al. 2016), VGG-19 (Kamran and Sabbir 2018), DarkNet-53 (Redmon and Farhadi 2018), and ResNet50 (He et al. 2016). The melanoma cancer histology images are split into patches of size 1800 × 1800, and the label of every patch is predicted using the proposed model at a level of 224 × 224. The performance analysis of these classifier models for melanoma cancer grading is tabulated in Table 2. The metrics are computed as (Eqs. 10-13):

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (10)
Precision = TP / (TP + FP)    (11)
Recall = TP / (TP + FN)    (12)
F-score = 2 × Precision × Recall / (Precision + Recall)    (13)

While the efficiency of every classifier was comparable, Inception outperformed them all with the highest mean accuracy. The DarkNet-53 model, on the other hand, exhibited consistent efficiency across 3 folds with the minimum standard deviation.
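Eqs. 10-13 translate directly into code; the binary labels below (1 = tumor, 0 = non-tumor) are illustrative only.

```python
import numpy as np

def metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F-score for a binary labeling,
    following Eqs. 10-13."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    return acc, prec, rec, f1

# Toy example: 2 TP, 1 TN, 1 FP, 1 FN.
print(metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Note that precision and recall are undefined when their denominators are zero (no predicted or no actual positives), so a production implementation would guard those divisions.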
Tables 3 and 4 present the performance evaluation of the proposed model with different transfer learning models. Figure 6 represents the accuracy comparison of training and testing performance based on the classifiers. The accuracy data show that the DarkNet-53 model obtained the highest performance in training and testing with 98.39% and 97.14%, while VGG-19 had the lowest performance in both cases with 95.52% and 93.72%. Figure 7 represents the comparison of the precision evaluation of the classifiers on the training and testing data. The DarkNet-53 model attained the highest performance in both training and testing with 97.15% and 96.49%, respectively. Inception and ResNet50 obtained similar results on the training data with 95.36% and 95.74%, but differed on the testing data.
The comparison of recall performance on both training and testing data is presented in Fig. 8. The DarkNet-53 model delivered the best performance in both cases with 98.92% and 98.21%. The Inception model had the second-best performance with 97.58% in training and 95.98% in testing. Figure 9 represents the comparison of the F-score performance on training and testing data. In this comparison, the DarkNet-53 model obtained the highest performance with 97.85% in training and 96.50% in testing. The lowest performance was obtained by VGG-19 in training with 94.70% and ResNet50 in testing with 93.08%.
Using the CA-CNN-RNN model, feature extraction and classification based on histopathology images were carried out.

Conclusion
A hybrid CA-CNN-RNN model for skin cancer grading based on melanoma skin cancer images was proposed in this work. The proposed research was intended for the classification of large input images. H&E histology images from The Cancer Genome Atlas database were utilized as the dataset for the experiments. Classification was performed with transfer learning models including VGG-19, ResNet50, DarkNet-53, and Inception using the CA-CNN-RNN approach. From the dataset of 58 images, 37 randomly selected images were utilized for the training phase, and 21 images were utilized for the testing phase. The results were assessed based on parameters such as accuracy, precision, F-score, and recall. Melanoma cancer grading was the key focus of this work, and the proposed model can be used for grading any severe disease by using suitable datasets. The proposed model is adaptable, allowing it to use any network architecture for local representation learning. According to the performance review, the CA-CNN-RNN model with various classifiers performed well; among the classifiers, the DarkNet-53 model performed effectively on every parameter, as discussed in the results section. The proposed model achieved 97.14% accuracy, 96.49% precision, 98.21% recall, and 96.50% F-score. In the future, advanced deep learning models can be combined with this proposed model for enhanced performance, and the model can be applied to various cancer grading processes using suitable datasets.