Taking images and sharing them on sites such as Google, Yahoo, Facebook, and WhatsApp has become commonplace in recent years, yet finding a suitable image among them remains difficult [11–13]. Text-Based Image Retrieval (TBIR) has traditionally been used for this task: each image saved in the database must be retrievable by a matching keyword [3]. The problem is that individuals describe the same image differently, which leads to mismatches, and the sheer number of images makes labeling each one impractical.
This shortcoming is overcome by Content-Based Image Retrieval (CBIR), also known as Query by Image Content (QBIC) [5–7]. As the name indicates, CBIR analyzes the content of the image rather than its text annotations. Traditionally, images are retrieved based on color, texture, and shape, which remains an open problem [8–10]; the image data is stored and represented as feature vectors. This reduces human intervention and makes retrieval easier to carry out on very large databases. CBIR is widely used in crime detection, medical diagnosis, military geographical information systems, etc.
Both the query image and each image stored in the database go through feature extraction, which extracts low-level properties such as color, shape, and size before their similarity is compared, as shown in Fig. 1. If the two feature vectors match, the retrieved image is the same as the query.
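The pipeline above can be sketched as follows. This is a minimal illustration only: the histogram feature and the helper names `extract_features` and `similarity` are toy stand-ins for the low-level color/shape/size features described in the text, not the system's actual implementation.

```python
import numpy as np

def extract_features(image, bins=8):
    """Toy low-level feature: a normalized intensity histogram.
    (A real CBIR system combines color, texture, and shape features.)"""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def similarity(query_image, db_image):
    """Euclidean distance between feature vectors; 0 means the
    extracted features are identical."""
    fq = extract_features(query_image)
    fd = extract_features(db_image)
    return np.linalg.norm(fq - fd)

# Two images with identical content yield identical features.
query = np.full((4, 4), 128, dtype=np.uint8)
match = np.full((4, 4), 128, dtype=np.uint8)
print(similarity(query, match))  # identical features -> distance 0.0
```

In practice the database-side features are extracted once and stored, so only the query image is processed at retrieval time.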
When an image contains noise, the regions of interest must be selected, so we employ deep-learning-based pre-processing to achieve accurate results [4]. A deep Convolutional Neural Network (CNN), which aids in image analysis, is used to achieve high performance and efficiency. Previously, high-dimensional photographs were rare; only a few images were preserved, and they were accessed only on rare occasions. The MNIST dataset is employed for pre-processing. The backpropagation method employed in [14] requires extensive training and resources, both of which are difficult to obtain.
Feature extraction for content-based image retrieval is the technique of generating a compact (numerical or alphanumerical) representation of certain features of a digital image, from which information about the image contents can be deduced. Every feature is closely related to the type of data it captures, and the choice of one feature over another is determined by the retrieval task. Kernel PCA demands greater resources because all data points are saved in a kernel matrix whose size grows quadratically with the number of data points [16]. In addition, PCA feature extraction fails to produce good results on high-dimensional data [15].
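The quadratic-memory point about kernel PCA can be made concrete. The sketch below (toy data, hypothetical helper names) contrasts linear PCA, which projects n samples directly, with the n × n Gram matrix that kernel PCA must first materialize:

```python
import numpy as np

def pca_features(X, k):
    """Linear PCA via SVD: project n samples onto the top-k components."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def rbf_kernel_matrix(X, gamma=0.1):
    """Kernel PCA first builds an n x n Gram matrix, so memory
    grows quadratically with the number of stored data points."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

X = np.random.default_rng(0).normal(size=(100, 16))
Z = pca_features(X, k=4)      # 100 samples reduced to 4 features each
K = rbf_kernel_matrix(X)      # 100 x 100 entries, regardless of k
print(Z.shape, K.shape)       # (100, 4) (100, 100)
```

Doubling the number of stored images quadruples the Gram matrix, which is why the text treats kernel PCA as resource-hungry for large image databases.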
The extracted features must then be filtered by feature selection. This is critical to eliminate redundancy, improve accuracy, and minimize computing cost. The selected features are tested with a similarity measurement that addresses feature relevance and class-conditional redundancy [17]; however, these conditions do not cope with high-dimensional image feature selection [18]. If two feature vectors have a high degree of similarity, the corresponding images are identical or nearly related [19].
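One simple way to eliminate redundancy, shown here only as an illustrative sketch (a greedy correlation filter, not the method of [17]), is to drop any feature that is nearly a copy of one already kept:

```python
import numpy as np

def drop_redundant(X, threshold=0.95):
    """Greedy redundancy filter: keep a feature (column) only if its
    absolute correlation with every already-kept feature is below
    the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = rng.normal(size=200)
# Column 1 is a near-duplicate (scaled copy) of column 0.
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), b])
print(drop_redundant(X))  # -> [0, 2]
```

Removing such near-duplicate features shrinks the vectors fed to the similarity measurement without discarding information.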
In this study, we therefore employ AlexNet, which delivers reliable results even on huge databases with high-dimensional datasets. After experimenting with several feature extraction methods, latent feature extraction proved the most promising, since it takes hidden variables into account. Although numerous mutual-information algorithms are in use, they have not shown promising results on high-dimensional images; to address this, we employ MINE (Mutual Information Neural Estimation), which achieves a high accuracy rate. Euclidean distance is applied as the similarity metric, and the result is accurate.
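The final retrieval step, ranking database images by Euclidean distance between feature vectors, can be sketched as below. The feature vectors are assumed to be precomputed (e.g., 4096-dimensional activations from an AlexNet fully connected layer); the random vectors and the helper name `retrieve` are illustrative only.

```python
import numpy as np

def retrieve(query_vec, db_vecs, top_k=3):
    """Rank database images by Euclidean distance to the query's
    feature vector and return the top-k indices and distances."""
    dists = np.linalg.norm(db_vecs - query_vec, axis=1)
    order = np.argsort(dists)[:top_k]
    return order, dists[order]

rng = np.random.default_rng(2)
db = rng.normal(size=(10, 4096))                 # 10 stored feature vectors
query = db[7] + 0.01 * rng.normal(size=4096)     # near-duplicate of image 7
idx, d = retrieve(query, db)
print(idx[0])  # image 7 is the closest match
```

Because the distance is computed on compact feature vectors rather than raw pixels, the same ranking procedure scales to large databases.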