Lung cancer is the leading cause of cancer-related deaths worldwide [1]. Early detection and precise diagnosis are crucial for successful treatment. Currently, diagnosis is made by examining lung cells in the laboratory. The cells can be taken from lung secretions such as mucus, from fluid removed from the area around the lung (thoracentesis), or from a suspicious area using a needle or surgery (biopsy). However, these sampling procedures can be invasive.
On the other hand, image analysis techniques applied to different imaging modalities offer a more comfortable alternative for cancer detection. Histological imaging of hematoxylin and eosin-stained tissue specimens remains the gold standard for lung cancer diagnosis [2]. However, tumor classification is challenging because of tumor heterogeneity. Chest X-rays, chest CT, MRI, and PET are also used for lung cancer detection.
A CT scan [3] is more likely to show lung tumors than a routine chest X-ray [4]. It can show the size, shape, and position of lung tumors. Low-dose computed tomography (LDCT) is also used to detect lung cancer. However, one of its drawbacks is a relatively high false positive rate [5].
Like CT scans, MRI scans show detailed images of soft tissues in the body. However, MRI uses radio waves and strong magnets instead of X-rays. MRI can detect and stage lung cancer, and it could be an excellent alternative to CT or PET/CT in the investigation of lung malignancies and other diseases [6].
In a PET scan, by contrast, a slightly radioactive form of sugar is injected into the blood and collects mainly in cancer cells, where it can be traced. In addition, combined PET/CT, performed on a special machine that acquires both scans at the same time, is used to improve the accuracy of cancer detection. This lets the doctor compare areas of higher radioactivity on the PET scan with a more detailed picture on the CT scan [7].
In terms of classifiers and detectors, many methods for lung cancer detection have appeared in the literature. These methods are mainly based on machine learning (ML) or deep learning networks (DLN) [8–11]. In [8], ML is used for cancer detection from histopathological images. Seven classifiers are evaluated: naive Bayes, support vector machines (SVM) with Gaussian kernel, SVM with linear kernel, SVM with polynomial kernel, bagging for classification trees, random forest utilizing conditional inference trees, and Breiman's random forest. The best accuracy obtained with this method is 85%, achieved by SVM with Gaussian kernel, random forest utilizing conditional inference trees, and Breiman's random forest.
In [9], a cancer diagnostic model based on a deep learning-enabled SVM is proposed. This method uses a convolutional neural network (CNN) with an SVM classifier. The authors reported an accuracy of 94% for the detection of pulmonary nodules representing early-stage lung cancer.
In [10], deep learning has been proposed as a promising tool to classify malignant nodules. The Lung Cancer Prediction Convolutional Neural Network (LCP-CNN) has been trained to generate a malignancy score for each nodule using CT data. The reported accuracy is 94.5%. The review in [11] provides a comprehensive survey of algorithms and techniques relevant to the major processes in the pipeline, including feature representation, feature indexing, searching, etc.
From these methods, it can be concluded that ML achieves better accuracy when combined with a CNN and a feature extractor, because ML is feature dependent. However, features extracted from a CNN or hand-crafted features can be affected by image rotation, zooming, and/or scaling. Therefore, the accuracy is still low. Moreover, methods in the literature classify images as cancerous or normal without identifying the type of lung cancer. Therefore, in this work, we propose a multi-class ML lung cancer detector that can classify CT images into Adenocarcinoma, Large cell carcinoma, Squamous cell carcinoma, and Normal. The proposed method is based on the Wavelet Scattering Transform (WST), which is scale-independent, rotation-independent, and zoom-independent. The proposed method is tested with different classifiers, including SVM, Kernel Nearest Neighbor, and Random Forest. Our main contributions in this work are:
- Unlike the previous work [13], we propose to classify CT lung images into four classes: Adenocarcinoma, Large cell carcinoma, Squamous cell carcinoma, and Normal.
- Unlike the work done in [5], we used the wavelet scattering transform as a feature domain rather than the time domain to make the system more robust against scaling and shifting.
WST is used as a feature domain to extract features that are insensitive to scaling and shifting rather than using the time domain.
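To make the claim of shift insensitivity concrete, the following is a minimal sketch and not the implementation used in this work. It assumes NumPy, a small hypothetical bank of Gabor-like band-pass filters (`gabor_bank`), circular convolution via the FFT, and global rather than local averaging. It computes only zeroth- and first-order scattering-style coefficients and checks them on a random array standing in for a CT slice. Robustness to scaling and rotation is not demonstrated here.

```python
import numpy as np

def gabor_bank(size, scales=(1, 2), n_angles=4):
    """Hypothetical bank of Gabor-like band-pass filters, defined in the Fourier domain."""
    fy = np.fft.fftfreq(size)[:, None]   # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(size)[None, :]   # horizontal frequencies
    filters = []
    for j in scales:
        centre = 0.25 / j                # centre frequency decreases with scale
        sigma = 0.1 / j                  # bandwidth shrinks with scale
        for k in range(n_angles):
            theta = np.pi * k / n_angles
            cx, cy = centre * np.cos(theta), centre * np.sin(theta)
            bump = np.exp(-((fx - cx) ** 2 + (fy - cy) ** 2) / (2 * sigma ** 2))
            filters.append(bump)
    return filters

def scattering_features(image, filters):
    """Zeroth- and first-order coefficients: filter, take the modulus, then average."""
    spectrum = np.fft.fft2(image)
    feats = [image.mean()]                               # zeroth order: global average
    for psi_hat in filters:
        response = np.fft.ifft2(spectrum * psi_hat)      # circular convolution with one filter
        feats.append(np.abs(response).mean())            # modulus, then global average
    return np.array(feats)

rng = np.random.default_rng(0)
image = rng.random((64, 64))                             # assumption: stand-in for a CT slice
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))     # circularly shifted copy

filters = gabor_bank(64)
f0 = scattering_features(image, filters)
f1 = scattering_features(shifted, filters)
max_diff = np.max(np.abs(f0 - f1))                       # zero up to floating-point error
```

Because the modulus discards the phase introduced by a shift and the average pools over all positions, the two feature vectors agree up to floating-point error. A full wavelet scattering transform averages over a local window and adds higher-order coefficients, so its invariance holds only up to the chosen averaging scale. The resulting feature vectors would then be passed to a classifier such as an SVM or a random forest.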