Our main goal is to classify traffic sign board images using a convolutional neural network (CNN). The proposed process consists of three steps: image denoising, image enhancement in the spatial domain, and image classification. The image quality metrics used in this work are the mean squared error (MSE), the peak signal-to-noise ratio (PSNR), the structural similarity index (SSIM), the accuracy in percentage, and the processing time in seconds, following Liu et al. (2021). The main task is to select the best-performing combination from the image denoising, image enhancement, and image classification methods described by Zaklouta et al. (2021). The selection criterion is the lowest MSE and processing time in seconds together with the highest PSNR, SSIM, and accuracy percentage, Liu et al. (2021). Based on the experiments on the images in the Training image database, it was decided to combine pretrained neural network-based image denoising with non-local means filtering type 2; the non-local means filter type 2 is applied eight consecutive times, and the resulting MSE, PSNR, and SSIM values are 0.0000, Infinity, and 1.0000, which are the ideal values of these metrics for a given image, Zaklouta et al. (2021). Based on the experiments on the images in the Testing image database, it was likewise decided to combine pretrained neural network-based image denoising with non-local means filtering type 2; here the non-local means filter type 2 is applied four consecutive times, and the resulting MSE, PSNR, and SSIM values again reach the ideal values of 0.0000, Infinity, and 1.0000, Liu et al. (2021). For image classification, the images in the Testing and Training image databases are resized to 331 × 331 and passed to two pretrained deep convolutional neural networks, NasNetLarge and GoogleNet. The accuracy percentage for both the Testing and Training image databases is 100%, and the processing time is 2.6606 seconds for the Training image database and 2.5465 seconds for the Testing image database, Kay et al. (2021). The results for the images in our Testing image database after image denoising and enhancement in the spatial domain are shown in Figure 12:
The images in our Training image database, after the image denoising and enhancement process in the spatial domain, are shown in Figure 13, following Liu et al. (2021):
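To make the pre-processing and quality-measurement steps described above concrete, the following is a minimal Python sketch, assuming scikit-image is used for the non-local means filtering and for the MSE, PSNR, and SSIM computation; the pretrained neural network-based denoising stage is not reproduced here, and the function name and parameter choices are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch (not the paper's implementation): repeated non-local means
# filtering plus MSE/PSNR/SSIM scoring with scikit-image. The pretrained
# neural network-based denoising stage is omitted; `denoise_and_score`,
# `repeats`, and the filter parameters below are illustrative assumptions.
import numpy as np
from skimage import io, img_as_float
from skimage.restoration import denoise_nl_means, estimate_sigma
from skimage.metrics import (mean_squared_error,
                             peak_signal_noise_ratio,
                             structural_similarity)

def denoise_and_score(path, repeats=4):
    """Apply non-local means filtering `repeats` times (4 or 8 in the paper)
    and report MSE, PSNR, and SSIM of the result against the input image."""
    reference = img_as_float(io.imread(path))   # input traffic sign image, scaled to [0, 1]
    result = reference.copy()
    for _ in range(repeats):
        sigma = float(np.mean(estimate_sigma(result, channel_axis=-1)))
        result = denoise_nl_means(result, h=0.8 * sigma, sigma=sigma,
                                  fast_mode=True, channel_axis=-1)
    mse = mean_squared_error(reference, result)
    psnr = peak_signal_noise_ratio(reference, result, data_range=1.0)
    ssim = structural_similarity(reference, result, channel_axis=-1, data_range=1.0)
    return mse, psnr, ssim  # ideal values: 0.0000, infinity, 1.0000
```

Calling, for example, `denoise_and_score("sign.png", repeats=8)` would correspond to the eight-pass configuration used for the Training image database.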
Table 5 Comparative analysis of various image denoising and enhancement techniques in the spatial domain, based on the image quality metrics MSE, PSNR, and SSIM, for the input traffic sign board images present in the Testing image database
Sr. No. | Techniques used | Mean Squared Error (MSE) | Peak Signal to Noise Ratio (PSNR) | Structural Similarity Index (SSIM)
1 | Pretrained Neural Network-based Image Denoising | 1.25E-5 | 47.18008125 | 0.99859166666667
2 | Non-Local Means Filtering Type 2 | 0.676 | 51.479 | 0.995
3 | Pretrained Neural Network-based Image Denoising followed by four passes of Non-Local Means Filtering Type 2 | 0.0000 | Infinity | 1.0000
Table 6 Comparative analysis of various image denoising and enhancement techniques in the spatial domain, based on the image quality metrics MSE, PSNR, and SSIM, for the input traffic sign board images present in the Training image database
Sr. No. | Techniques used | Mean Squared Error (MSE) | Peak Signal to Noise Ratio (PSNR) | Structural Similarity Index (SSIM)
1 | Pretrained Neural Network-based Image Denoising | 1.3737075332349E-5 | 49.074868537666 | 0.99855731166913
2 | Non-Local Means Filtering Type 2 | 0.080 | 61.175 | 0.999
3 | Pretrained Neural Network-based Image Denoising followed by eight passes of Non-Local Means Filtering Type 2 | 0.0000 | Infinity | 1.0000
The tables above compare the Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index (SSIM) of the above-mentioned techniques for the input traffic sign board images in both the Testing and Training image databases, Zaklouta et al. (2021).
The tables below show the accuracy in percentage and the processing time in seconds of the above-mentioned techniques for the input Testing and Training image databases, following Liu et al. (2021); a minimal sketch of how these two metrics can be tallied is given after Table 8.
Table 7 Comparative analysis of various image denoising and enhancement techniques in the spatial domain, based on the accuracy in percentage and the processing time in seconds, for the input traffic sign board images present in the Testing image database
Sr. No. | Techniques used | Accuracy in Percentage (%) | Processing Time in Seconds
1 | Pretrained Neural Network-based Image Denoising | 97.01% | 8.4883328333333
2 | Non-Local Means Filtering Type 2 | 98.79% | 1.1919867291667
Table 8 Comparative analysis of various image denoising and enhancement techniques in the spatial domain, based on the accuracy in percentage and the processing time in seconds, for the input traffic sign board images present in the Training image database
Sr. No. | Techniques used | Accuracy in Percentage (%) | Processing Time in Seconds
1 | Pretrained Neural Network-based Image Denoising | 82.94% | 12.232690183161
2 | Non-Local Means Filtering Type 2 | 99.98% | 1.6439581447563
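For the accuracy-percentage and processing-time columns in Tables 7 and 8, the following is a minimal Python sketch of how the two metrics might be tallied over an image database; the `images`, `labels`, `denoise`, and `classify` names are hypothetical placeholders, not the paper's actual measurement code.

```python
# Hypothetical sketch of tallying accuracy (%) and processing time (s) over an
# image database; `images`, `labels`, `denoise`, and `classify` are assumed
# placeholders, not the paper's implementation.
import time

def evaluate(images, labels, denoise, classify):
    start = time.perf_counter()
    predictions = [classify(denoise(img)) for img in images]
    elapsed = time.perf_counter() - start          # processing time in seconds
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = 100.0 * correct / len(labels)       # accuracy in percentage
    return accuracy, elapsed
```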
After an input traffic sign image has been denoised and enhanced in the spatial domain, it is first resized to 331 × 331, which is the input size required by the NasNetLarge pretrained deep convolutional neural network, Kay et al. (2021). The resized traffic sign images are then passed to NasNetLarge, followed by the GoogleNet deep convolutional neural network, Liu et al. (2021). The classification results for the traffic sign images in our Testing image database are as follows (Figure 14); a minimal sketch of this classification stage is given after Table 9:
Table 9 Comparative analysis of the image classification technique, based on the accuracy in percentage and the processing time in seconds, for the input traffic sign board images present in the Testing and Training image databases
Sr. No. | Image classification technique and input image database | Accuracy in Percentage (%) | Processing Time in Seconds
1 | Resized pre-processed images + NasNetLarge + GoogleNet, Testing image database | 100% | 2.5465
2 | Resized pre-processed images + NasNetLarge + GoogleNet, Training image database | 100% | 2.6606
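As a rough illustration of the classification stage described above, the following Python sketch resizes an image to 331 × 331 and runs it through pretrained NasNetLarge and GoogleNet models; the use of the `timm` and `torchvision` packages, the model identifiers, the omitted per-model input normalization, and the lack of fine-tuning on traffic sign classes are all assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the classification stage: resize to 331 x 331 and pass
# the image through pretrained NASNet-Large and GoogLeNet. The model sources
# (timm's "nasnetalarge", torchvision's GoogLeNet), the missing per-model input
# normalization, and the absence of traffic-sign fine-tuning are assumptions.
import timm
import torch
from PIL import Image
from torchvision import models, transforms

preprocess = transforms.Compose([
    transforms.Resize((331, 331)),   # input size required by NasNetLarge
    transforms.ToTensor(),
])

nasnet = timm.create_model("nasnetalarge", pretrained=True).eval()
googlenet = models.googlenet(weights=models.GoogLeNet_Weights.DEFAULT).eval()

def classify_sign(path):
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)  # 1 x 3 x 331 x 331
    with torch.no_grad():
        nasnet_pred = nasnet(x).argmax(dim=1).item()
        googlenet_pred = googlenet(x).argmax(dim=1).item()  # adaptive pooling accepts 331 x 331
    return nasnet_pred, googlenet_pred
```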
A detailed description of the various layers present in a conventional convolutional neural network is as follows:
8.1. Image Input Layer:
The 2D input image enters the network through the image input layer, where the data is normalized, Kay et al. (2021).
8.2. Convolutional 2D Layer:
A 2D convolutional layer applies sliding convolution filters to 2D input, and the dimensions the layer convolves over depend on the form of that input. For 2D image input (data with four dimensions corresponding to the pixels in two spatial dimensions, the channels, and the observations), the layer convolves over the two spatial dimensions, Zaklouta et al. (2021). For 2D image sequence input (data with five dimensions corresponding to the pixels in two spatial dimensions, the channels, the observations, and the time steps), the layer also convolves over the two spatial dimensions. For 1D image sequence input (data with four dimensions corresponding to the pixels in one spatial dimension, the channels, the observations, and the time steps), the layer convolves over the spatial and time dimensions, Kay et al. (2021).
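As a small illustration of the 2D image case, the following PyTorch snippet (an assumption, not the paper's framework) shows a convolutional layer sliding its filters over the two spatial dimensions of a four-dimensional batch (observations × channels × height × width).

```python
# Illustrative PyTorch example: a 2D convolutional layer convolves over the two
# spatial dimensions of 4-D image input (observations x channels x height x width).
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
batch = torch.randn(4, 3, 331, 331)   # 4 observations, 3 channels, 331 x 331 pixels
features = conv(batch)
print(features.shape)                 # torch.Size([4, 8, 331, 331])
```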
8.3. Rectified Linear Unit (ReLU) Layer:
This function is defined as:
F(x) = x,  x ≥ 0
F(x) = 0,  x < 0
8.4. Fully Connected Layer:
The fully connected layer multiplies the input by a weight matrix and then adds a bias vector, Zaklouta et al. (2021).
8.5. Softmax Layer:
The softmax layer applies the softmax function to its input, Kay et al. (2021).
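To tie Sections 8.1 to 8.5 together, the following is a minimal PyTorch sketch (not the paper's network) that stacks the described layer types: an input normalization step, a 2D convolutional layer, a ReLU layer, a fully connected layer, and a softmax layer; the 43-class output is an illustrative assumption.

```python
# Minimal PyTorch sketch (not the paper's network) stacking the layer types from
# Sections 8.1 to 8.5: input normalization, 2D convolution, ReLU, a fully
# connected layer, and softmax. The 43-class output is an illustrative assumption.
import torch
import torch.nn as nn

class TinyTrafficSignCNN(nn.Module):
    def __init__(self, num_classes=43):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # 8.2 convolutional 2D layer
        self.relu = nn.ReLU()                                   # 8.3 ReLU layer: F(x) = max(x, 0)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, num_classes)                    # 8.4 fully connected layer: Wx + b
        self.softmax = nn.Softmax(dim=1)                        # 8.5 softmax layer

    def forward(self, x):
        x = (x - x.mean()) / (x.std() + 1e-8)                   # 8.1 image input layer: normalize data
        x = self.relu(self.conv(x))
        x = self.pool(x).flatten(1)
        return self.softmax(self.fc(x))

probs = TinyTrafficSignCNN()(torch.randn(1, 3, 331, 331))
print(probs.shape, float(probs.sum()))                          # torch.Size([1, 43]), ~1.0
```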