## 3.1. MSNN architecture

A private database of lung CT scan images is used in this work. The goal is to design a novel, effective convolutional neural network architecture that detects lung cancer and produces a binary output indicating whether a nodule is cancerous or non-cancerous. Each input CT scan image belongs to a specific class, and a probability score is assigned to each output image.

MSNN has been designed to diagnose lung cancer from lung CT scan images: it learns patterns from the training images and uses them to classify the test images. The network's input layer accepts grayscale images of size 512×512. Figure 1 presents the MSNN architectural layout, which consists of five successive blocks. Blocks 1–4 each comprise four layers: convolution (conv), batch normalization (BN), rectified linear unit (ReLU), and max pooling; block 5 instead comprises convolution, BN, ReLU, and a global average pooling layer. Table I lists the MSNN architecture details, including the total parameters for each layer.
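As a rough sketch, the shape flow through the five blocks can be traced in plain Python. The pooling window and stride, and the filter counts beyond block 1, are assumptions here, since the text gives only the 512×512 input and the 8 filters of the first convolution (see Table I for the actual values):

```python
# Sketch of the five-block MSNN layout, tracing feature-map shapes.
# Assumptions (not stated in the text): 'same'-padded convolutions,
# 2x2 stride-2 max pooling, and filter counts doubling per block
# (only the 8 filters of block 1 are given in the text).
def msnn_shapes(h=512, w=512, filters=(8, 16, 32, 64, 128)):
    shapes = [("input", h, w, 1)]
    for i, c in enumerate(filters, start=1):
        # conv + BN + ReLU keep the spatial size under 'same' padding
        if i < len(filters):
            h, w = h // 2, w // 2  # 2x2 max pooling halves each dimension
            shapes.append((f"block{i}", h, w, c))
        else:
            # block 5 ends in global average pooling: one value per channel
            shapes.append((f"block{i}+GAP", 1, 1, c))
    return shapes

for name, hh, ww, cc in msnn_shapes():
    print(f"{name}: {hh}x{ww}x{cc}")
```

Under these assumptions the spatial size halves four times (512 → 256 → 128 → 64 → 32) before global average pooling collapses each feature map to a single value.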

The first layer is the convolution layer, which convolves the input image (f) with a filter (g) using Eq. 1:

$$f\left[x\right]*g\left[x\right]=\sum _{k=-\infty }^{\infty }f\left[k\right].g[x-k]$$

1

where x and k are spatial variables.
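Eq. (1) can be implemented directly; the snippet below is a minimal one-dimensional illustration (the network itself applies the 2-D analogue to images), checked against NumPy's built-in convolution:

```python
import numpy as np

# Direct implementation of Eq. (1): (f * g)[x] = sum_k f[k] * g[x - k].
def conv1d(f, g):
    n = len(f) + len(g) - 1        # length of the full convolution
    out = np.zeros(n)
    for x in range(n):
        for k in range(len(f)):
            if 0 <= x - k < len(g):  # skip terms where g is undefined
                out[x] += f[k] * g[x - k]
    return out
```

The same sum, extended to two spatial variables, is what each of the network's convolution layers computes for every filter.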

In general, a smaller filter size can lead to overfitting, while a larger filter size can increase underfitting. This layer therefore uses 8 filters with a 6×6 filter size.

The next layer is the BN layer, which speeds up training and reduces the network's sensitivity to initialization. Normalization over a batch (v) of m instances for unit 'i' proceeds in the following steps.

First, compute the batch mean using Eq. (2):

$${\mu }_{i}=\sum _{r=1}^{m}{v}_{i}^{r}/m$$

2

where r ranges from 1 to m.

Second, compute the batch variance using Eq. (3):

$${\sigma }_{i}^{2}=\sum _{r=1}^{m}{\left({v}_{i}^{r}-{\mu }_{i}\right)}^{2}/m$$

3

Third, compute the normalized batch instances using Eq. (4):

$${v}_{n}^{r}=\left({v}_{i}^{r}-{\mu }_{i}\right)/{\sigma }_{i}$$

4

Last, scale and shift with the learnable parameters using Eq. (5):

$${a}_{i}^{r}={\gamma }_{i}*{v}_{n}^{r}+{\beta }_{i}$$

5
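The four steps of Eqs. (2)–(5) can be sketched in a few lines of NumPy for a single unit; the small epsilon term is a standard numerical-stability addition not written in the equations above:

```python
import numpy as np

# Batch normalization for one unit i over a batch v of m activations,
# following Eqs. (2)-(5); gamma and beta are the learnable parameters.
def batch_norm(v, gamma=1.0, beta=0.0, eps=1e-5):
    m = len(v)
    mu = np.sum(v) / m                       # Eq. (2): batch mean
    var = np.sum((v - mu) ** 2) / m          # Eq. (3): batch variance
    v_n = (v - mu) / np.sqrt(var + eps)      # Eq. (4); eps for stability
    return gamma * v_n + beta                # Eq. (5): scale and shift
```

With gamma = 1 and beta = 0 the output batch has (approximately) zero mean and unit variance, which is what stabilizes and accelerates training.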

The ReLU layer adds nonlinearity to the network by applying a rectifier function to the linear outputs of the convolution. The function is given by Eq. (6):

$$f\left(x\right)=\begin{cases}0, & x<0\\ x, & x\ge 0\end{cases}$$

6
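Eq. (6) is equivalent to an element-wise maximum with zero, which is how it is typically implemented:

```python
import numpy as np

# Eq. (6): ReLU passes non-negative values through and zeroes the rest.
def relu(x):
    return np.maximum(0, x)
```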

Max pooling layers decrease the size of the convolved feature map, reducing computational cost.
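A minimal sketch of max pooling follows; the 2×2 window with stride 2 is an assumption for illustration, as the text does not state the pooling parameters:

```python
import numpy as np

# 2x2 max pooling with stride 2 (window and stride are assumptions;
# the text states only that pooling shrinks the feature map).
def max_pool_2x2(x):
    h, w = x.shape
    x = x[:h - h % 2, :w - w % 2]            # trim odd edges if present
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
```

Each non-overlapping 2×2 window is replaced by its maximum, halving both spatial dimensions while keeping the strongest activations.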

A global average pooling layer is used in this work, which averages each feature map to a single value; this helps to enhance model performance by reducing the loss function value.
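Global average pooling can be sketched as a per-channel mean over the spatial dimensions, collapsing an (H, W, C) feature tensor to a length-C vector:

```python
import numpy as np

# Global average pooling: each feature map collapses to its mean,
# so an (H, W, C) tensor becomes a length-C vector.
def global_avg_pool(x):
    return x.mean(axis=(0, 1))
```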

Each input image is assigned a probability score by a series of FC (fully connected), BN, ReLU, FC, and SM (softmax) layers. The FC layers classify the images into categories, and the SM layer converts the output of the network's last layer into a probability distribution.
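The final softmax step can be sketched as follows; the two logits standing in for the cancerous / non-cancerous classes are illustrative values, not outputs of the actual network:

```python
import numpy as np

# Softmax converts the final FC layer's logits into a probability
# distribution; subtracting the max is a standard stability trick.
def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()
```

For the binary task here, the larger of the two resulting probabilities determines whether the nodule is labeled cancerous or non-cancerous.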