Content-Based Image Retrieval with Combined Features: Color and Gradient

Abstract: Feature extraction is a fundamental stage of effective content-based image retrieval (CBIR), yet extracting low-level features that represent images well remains challenging. This paper proposes a solution to this problem. First, images and their gradients are clustered with multi-level thresholding. A codebook is generated from the threshold values, and its size depends on the number of thresholds. Every pixel of the color image is then assigned to a cluster by means of the codebook. Color reduction is performed by assigning to each pixel the average value of the pixels in its cluster. A cluster-based one-dimensional histogram (CBH) is created from the number of pixels in each cluster, where every cluster is represented by a single color. Cluster-based histogram feature vectors are extracted from both the original image and the gradient image, and the two are combined into what is called the combined feature vector (CFV). The main advantages of CFV are that it applies an effective color reduction technique and that it represents texture information by means of the gradient operator. The main contribution of the proposed combined feature vector is therefore its high accuracy and stability for image retrieval. The proposed method has been tested on the Corel-1K, Corel-5K, Corel-10K and GHIM-10K datasets. In addition, different histogram similarity measures, namely cosine similarity, histogram intersection and Euclidean distance, have been evaluated with the developed algorithm. The experimental results are analyzed in two categories. First, CBIR results obtained with combined feature vectors generated by the Otsu, Kapur and center of gravity of histogram (CGH) procedures are evaluated. Then, the CBIR strategy based on the CGH method is compared with CBIR systems based on local binary pattern (LBP) and gradient-structures histogram (GSH). The CBIR approach based on the CGH technique was observed to significantly outperform the compared methods.


Introduction
The rapid increase in the number and usage areas of digital images has created large image databases, and retrieving desired images from these databases has become a critical issue. For this reason, image retrieval, which is used in many fields such as medicine and health, aeronautics and the military, has become an important research area [1][2][3]. Text-based systems were the first to be proposed for the image retrieval problem. These systems index images with text data. However, manually indexing all images in a database is labor-intensive and costly [4,5]. In addition, indexing images with text is a subjective approach. Content-based image retrieval (CBIR) systems, which use low-level image features such as color, texture and shape, have been proposed to overcome these shortcomings of text-based retrieval systems [6].
A typical CBIR architecture consists of two basic stages: feature extraction and similarity measurement. Feature extraction is the most vital stage of such systems, and many researchers have focused on it because the capability of the extracted features to represent images significantly affects the performance of CBIR systems. If the feature vectors obtained for the images are not defined properly, system performance will be low. Feature extraction methods are generally divided into global and local methods. Global methods work on the whole image and have a low computational cost, whereas local approaches consider the local features of regions in the image. Histogram [7], color moment [8] and co-occurrence matrix [9] are commonly used global features. LBP [10], GIST [11], SIFT [12] and SURF [13] are typical techniques for local features.
The histogram vector is a well-known color-based descriptor that can be used as a global feature. Its easy calculation and its robustness to rotation and scaling have made it frequently used in CBIR systems [14,15]. Liu and Yang incorporated the edge orientation and the color difference between two points into the histogram [16]. Varish et al. devised a color and texture based retrieval system in which images were converted into HSV color space before histograms were computed [17]. In another study, the edge histogram and wavelet transform were applied in the YCbCr color space, and the feature vector obtained in this way was seen to increase the performance of the developed system [18]. A histogram, which is an effective way to represent images, is a one-dimensional vector for gray-level images. For color images, there are three different histograms, one per color channel.
Therefore, how to combine the histogram information of the individual color channels is an issue that needs to be solved. Alternatively, using the color histogram in image retrieval systems requires processing a three-dimensional array. Since there are $2^{24}$ different colors in RGB color space, the computational cost then increases excessively. Color reduction methods have become imperative to overcome these problems. Popular color reduction approaches proposed so far include Median-Cut [19], Octree [20], K-means [21] and C-means [22]. Although the global histogram is an easy-to-use and frequently used representation tool, it does not capture the spatial information of pixels, nor does it contain information about the texture properties of the image. These inadequacies make it necessary to use a combined vector that includes additional information such as texture and shape alongside color features.
Although various features have been proposed for feature vectors, texture is one of the most frequently used in image analysis [23][24][25]. Li et al. [26] proposed a medical image retrieval system using texture; multilevel indexing was used in their strategy, which increased the performance of the retrieval system. In another study, the relationship between symmetric pixels within a local window was analyzed to extract the texture feature, and using the obtained texture and color features together improved the performance of the retrieval system [27]. That study showed that performance increases when multiple visual representations are used. The effect of similar approaches on CBIR systems has been confirmed in other studies [28][29][30][31]. One of the frequently preferred techniques for extracting texture is the local binary pattern (LBP). LBP is a texture feature methodology that takes into account the relationship between the center pixel and its neighboring pixels [32]. Variants of the LBP algorithm have been employed by many researchers [33][34][35][36][37][38]. Nevertheless, the most important drawbacks of LBP, which uses local spatial features, are that it overlooks macro texture information and is sensitive to noise.
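To make the center-versus-neighbors relationship concrete, the following is a minimal sketch of the basic 3x3 LBP operator. The function name `lbp_3x3`, the neighbor ordering and the `>=` comparison are illustrative assumptions; the LBP variants cited above differ in these details.

```python
import numpy as np

def lbp_3x3(gray):
    """Basic 8-neighbor LBP code for every interior pixel of a 2-D gray image.

    Assumption: a neighbor contributes a 1-bit when it is >= the center pixel,
    and neighbors are visited clockwise starting at the top-left corner.
    """
    g = np.asarray(gray).astype(int)
    height, width = g.shape
    center = g[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbor = g[1 + dy:height - 1 + dy, 1 + dx:width - 1 + dx]
        codes |= (neighbor >= center).astype(int) << bit
    return codes

def lbp_histogram(gray):
    """Normalized 256-bin histogram of the LBP codes."""
    counts = np.bincount(lbp_3x3(gray).ravel(), minlength=256)
    return counts / counts.sum()
```

Because each code depends only on a 3x3 window, the descriptor captures micro texture, and a single noisy pixel can flip several bits, which illustrates the two drawbacks noted above.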
One of the remarkable approaches for CBIR systems in recent years is the deep learning based solution [39,40]. However, these methodologies have higher computational complexity than traditional low-level feature-based approaches. Furthermore, the database must contain enough images for the learning process to be successful.
In this paper, a novel image retrieval system has been developed with feature vectors obtained by combining gradient features with the outcomes of the cluster-based color histogram (CBH) algorithm. In this way, texture and color features are unified in a single feature vector. The multi-level thresholds are calculated by the center of gravity of the histogram method. The RGB color space is then divided into sub-prisms by the thresholds calculated for each color channel. All pixels in the same sub-prism are assigned to the same cluster and take the average value of that cluster, which implements the color reduction process. A one-dimensional histogram based on the color index is also created. The same process is applied to the original image and to its gradient. Finally, the global feature vector is created by combining the one-dimensional histogram information extracted from the original image and its gradient.

Image Thresholding and Color Quantization
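Image thresholding groups similar pixels together and thus provides an effective solution for image clustering. Kapur's entropy and Otsu's technique are thresholding methods that give successful results for gray-level images. Kapur's entropy method seeks the threshold values that maximize the sum of the class entropies [41], whereas the Otsu approach seeks the thresholds that maximize the between-class variance [42]. Both optimization methods are based on the image histogram. While they work with a single histogram on gray-level images, they must be extended to three histograms for color images.
The histogram indicates the probability of the pixels in the image, and the probability of the i-th gray level is calculated as
$$p_i = \frac{h_i}{N}, \qquad i = 0, 1, \dots, L-1,$$
where $h_i$ is the number of pixels with gray level $i$, $N$ is the total number of pixels and $L$ is the number of gray levels. For $r$ thresholds $t_1 < t_2 < \dots < t_r$, with $t_0 = 0$ and $t_{r+1} = L$, the objective function of Kapur's method is
$$f(t_1, \dots, t_r) = \sum_{k=0}^{r} H_k, \qquad (3)$$
where $H_k$ are the partial entropies of each class and they are defined as
$$H_k = -\sum_{i=t_k}^{t_{k+1}-1} \frac{p_i}{\omega_k} \ln \frac{p_i}{\omega_k},$$
where $\omega_k = \sum_{i=t_k}^{t_{k+1}-1} p_i$ are the partial probabilities of the clusters. The threshold values that maximize Equation 3 are taken as the thresholds for gray-scale image clustering.
The objective function of the Otsu threshold approach is defined as
$$f(t_1, \dots, t_r) = \sum_{k=0}^{r} \sigma_k, \qquad (5)$$
where $\sigma_k$ are the between-cluster variances, calculated as
$$\sigma_k = \omega_k (\mu_k - \mu_T)^2. \qquad (6)$$
Here $\omega_k$ indicates the partial probabilities, $\mu_k$ represents the class averages and $\mu_T$ the gray-level average. These variables are calculated by Equation 7 as follows:
$$\omega_k = \sum_{i=t_k}^{t_{k+1}-1} p_i, \qquad \mu_k = \frac{1}{\omega_k} \sum_{i=t_k}^{t_{k+1}-1} i\,p_i, \qquad \mu_T = \sum_{i=0}^{L-1} i\,p_i. \qquad (7)$$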
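As an illustration of the two criteria, the sketch below evaluates the Kapur and Otsu objectives for a single threshold and finds the maximizer by exhaustive search. The function names, the half-open class ranges [0, t) and [t, L), and the brute-force search are assumptions made for this example; practical multi-level implementations use faster search strategies.

```python
import numpy as np

def normalized_histogram(gray, levels=256):
    """Return p_i = h_i / N for an integer-valued gray-level image."""
    hist = np.bincount(np.asarray(gray).ravel(), minlength=levels).astype(float)
    return hist / hist.sum()

def otsu_objective(p, t):
    """Sum of omega_k * (mu_k - mu_T)^2 over the classes [0, t) and [t, L)."""
    levels = np.arange(p.size)
    mu_total = np.sum(levels * p)
    score = 0.0
    for lo, hi in ((0, t), (t, p.size)):
        omega = p[lo:hi].sum()
        if omega > 0:  # an empty class contributes nothing
            mu = np.sum(levels[lo:hi] * p[lo:hi]) / omega
            score += omega * (mu - mu_total) ** 2
    return score

def kapur_objective(p, t):
    """Sum of the class entropies H_k over the classes [0, t) and [t, L)."""
    score = 0.0
    for lo, hi in ((0, t), (t, p.size)):
        omega = p[lo:hi].sum()
        if omega > 0:
            q = p[lo:hi][p[lo:hi] > 0] / omega  # skip zero bins: 0 * ln 0 = 0
            score -= np.sum(q * np.log(q))
    return score

def best_threshold(p, objective):
    """Exhaustively search the single threshold t that maximizes the objective."""
    return max(range(1, p.size), key=lambda t: objective(p, t))
```

For a multi-level search the same objectives would be summed over r + 1 classes, and the number of candidate threshold combinations grows rapidly with r, which is the cost issue that motivates cheaper alternatives.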
The thresholding techniques explained above are applied to gray-level images, and their computational cost becomes high as the number of thresholds increases. Demirci and Okur [43] have recently developed an algorithm called center of gravity of histogram (CGH) that provides multi-level thresholding. Their strategy uses histogram data and recursive means to estimate the threshold values. Initially, the global mean, $\mu_T$, is calculated. Subsequently, the means determined in previous steps are re-used, so that the number of thresholds increases with each step. The algorithm is based on sums of partial probability distributions, and the partial means are calculated as follows:
$$\mu_T = \sum_{i=0}^{L-1} i\,p_i, \qquad \mu_0 = \frac{\sum_{i=0}^{\lfloor\mu_T\rfloor} i\,p_i}{\sum_{i=0}^{\lfloor\mu_T\rfloor} p_i}, \qquad \mu_1 = \frac{\sum_{i=\lfloor\mu_T\rfloor+1}^{L-1} i\,p_i}{\sum_{i=\lfloor\mu_T\rfloor+1}^{L-1} p_i},$$
with $\mu_{00}$ and $\mu_{01}$ obtained in the same way from the sub-ranges below and above $\mu_0$ within $[0, \mu_T]$, and $\mu_{10}$ and $\mu_{11}$ from the sub-ranges below and above $\mu_1$ within $(\mu_T, L-1]$. As can be seen, each stage depends on the values obtained in the previous stage. The threshold values determined with the proposed approach are presented in Table 1. While the general average of the image is used for a single threshold, recursive means are taken into account for more thresholds.
Table 1. Threshold values determined with the CGH approach.
One threshold (t1): µT
Two thresholds (t1, t2): µ0, µ1
Three thresholds (t1, t2, t3): µ0, µT, µ1
Four thresholds (t1, t2, t3, t4): µ00, µ01, µ10, µ11
Five thresholds (t1, t2, t3, t4, t5): µ00, µ01, µT, µ10, µ11
The above-mentioned thresholding techniques are based on the histogram, which is a feature vector that represents images globally. Its low computational cost and its robustness to scaling and rotation make it a useful feature for image retrieval systems. Single-channel information is used for image retrieval in gray-level images. On the other hand, since a color image has three channels, three different histograms must be processed or combined for color image retrieval.
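The recursive-mean idea behind CGH can be sketched as follows. This is an illustration under stated assumptions rather than the reference implementation of [43]: the names `partial_mean` and `cgh_thresholds`, the use of integer truncation to split the histogram at a mean, and the fallback values for empty ranges are all choices made for this example. The mapping from the number of thresholds to the selected means follows Table 1.

```python
import numpy as np

def partial_mean(p, lo, hi):
    """Center of gravity of histogram bins lo..hi (inclusive)."""
    if hi < lo:
        return float(min(lo, p.size - 1))  # assumption: degenerate range
    levels = np.arange(lo, hi + 1)
    weights = p[lo:hi + 1]
    total = weights.sum()
    if total == 0:
        return (lo + hi) / 2.0  # assumption: midpoint when the range is empty
    return float(np.sum(levels * weights) / total)

def cgh_thresholds(p, r):
    """Return r thresholds (1 <= r <= 5) built from recursive histogram means."""
    last = p.size - 1
    mu_t = partial_mean(p, 0, last)              # global mean
    mu_0 = partial_mean(p, 0, int(mu_t))         # mean of the lower part
    mu_1 = partial_mean(p, int(mu_t) + 1, last)  # mean of the upper part
    # Second recursion level: split each half again at its own mean.
    mu_00 = partial_mean(p, 0, int(mu_0))
    mu_01 = partial_mean(p, int(mu_0) + 1, int(mu_t))
    mu_10 = partial_mean(p, int(mu_t) + 1, int(mu_1))
    mu_11 = partial_mean(p, int(mu_1) + 1, last)
    table = {
        1: [mu_t],
        2: [mu_0, mu_1],
        3: [mu_0, mu_t, mu_1],
        4: [mu_00, mu_01, mu_10, mu_11],
        5: [mu_00, mu_01, mu_t, mu_10, mu_11],
    }
    if r not in table:
        raise ValueError("this sketch supports 1 to 5 thresholds")
    return table[r]
```

Each threshold costs only a few weighted sums over the histogram, so no search over threshold combinations is needed; this is consistent with the low computational cost attributed to the approach.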
The second solution for color image retrieval is to use the color histogram, which has a high computational cost since a three-dimensional array needs to be processed. In addition, each color channel histogram of a color image consists of 256 levels. Color reduction methodologies are therefore among the topics that researchers focus on.
Color reduction is a frequently used method for image compression and classification. It is defined as expressing the image with fewer colors. When a color quantization algorithm is applied to a color image, the size of the image does not change; only the number of colors in the image is reduced. In other words, the pixels of the image are clustered into meaningful subsets.
In this study, a novel cluster-based one-dimensional histogram (CBH), which combines the three color channels by means of thresholding techniques, is proposed. Furthermore, a color reduction structure has been devised for color image retrieval.
First, the RGB color space is divided into sub-cubes by the threshold values determined for each color channel through thresholding techniques. Pixels located in the same sub-cube are assigned to the same cluster, and the average value of the pixels in each cluster is used for color reduction. The number of sub-cubes in RGB space depends on the number of thresholds: when the number of thresholds increases, the number of sub-cubes increases. With $r$ thresholds per channel, the number of sub-cubes in RGB space is $(r+1)^3$ [44]. Subsequently, each cluster is assigned a distinct label $i$. The rules for partitioning the RGB color space (see Figure 1) are given in Table 2. Here $t_r$, $t_g$ and $t_b$ denote the threshold values calculated from the red, green and blue channels, respectively, while $n_i$ represents the number of pixels assigned to the i-th class. Since the number of pixels in an image is equal to the sum of the numbers of pixels assigned to the classes, the number of pixels in the image can be calculated as
$$N = \sum_{i} n_i.$$
Accordingly, the probability of any pixel being in the i-th class can be written as
$$p_i = \frac{n_i}{N}.$$
The probability distribution function $p_i$ is thus a cluster-based one-dimensional feature vector representing the color image, created by combining the three color channels. The rules in Table 2 amount to a codebook generation process. Color reduction is performed simultaneously by assigning the average value of the pixels in each sub-cube. Figure 2 (a) shows a randomly selected image from the Corel-1K dataset and Figure 2 (b) shows the distribution of its pixels in RGB color space. [...] Figure 3(c). In addition, visual consistency was preserved in both reduced images. The PSNR values obtained when color reduction was performed with CGH for 1, 2 and 3 thresholds are given in Table 3. As can be seen in Table 3, the similarity between the reduced image and the original increases with the number of thresholds. Figure 4 shows the traditional histogram of the bus image in Figure 2 (a).
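A minimal sketch of the codebook lookup, the cluster-based histogram, the color reduction and the PSNR measurement is given below. Several details are assumptions made for illustration, because Table 2 is not reproduced here: cluster labels run from 0 to (r+1)^3 - 1, a pixel value equal to a threshold falls into the upper interval (the behavior of `np.digitize`), and the label is formed with the red channel as the most significant digit.

```python
import numpy as np

def cluster_labels(image, thresholds):
    """Map every RGB pixel to a sub-cube label in 0 .. (r+1)^3 - 1.

    image: H x W x 3 array; thresholds: three sorted sequences (red, green,
    blue), each holding the r threshold values of one channel.
    """
    base = len(thresholds[0]) + 1
    idx = [np.digitize(image[..., c], thresholds[c]) for c in range(3)]
    return (idx[0] * base + idx[1]) * base + idx[2]

def cluster_based_histogram(labels, r):
    """CBH: p_i = n_i / N over the (r+1)^3 clusters."""
    counts = np.bincount(labels.ravel(), minlength=(r + 1) ** 3)
    return counts / counts.sum()

def reduce_colors(image, labels):
    """Replace each pixel with the mean color of its cluster."""
    reduced = np.zeros(image.shape, dtype=float)
    for label in np.unique(labels):
        mask = labels == label
        reduced[mask] = image[mask].mean(axis=0)
    return reduced

def psnr(original, reduced, peak=255.0):
    """Peak signal-to-noise ratio in dB between original and reduced image."""
    mse = np.mean((np.asarray(original, dtype=float) - reduced) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))
```

With one threshold per channel the histogram has 8 bins, compared with the 256 x 256 x 256 cells of a full color histogram, which is the reduction in dimensionality the CBH is designed to achieve.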

Image Retrieval based on Combined Feature Vector
The aim of a CBIR system is to retrieve desired images from databases. Feature databases are created from the image databases, and CBIR systems utilize these feature vectors, so feature extraction is the basic step of image retrieval. The histogram is an important feature vector for representing images. In gray-level images, it is extracted by processing a single channel. In color images, on the other hand, three color channels must be processed separately, so the computational complexity and storage costs are high. Combining the three color histograms is a further problem. Figure 6 shows the stages of the proposed feature vector extraction algorithm. Figure 7 represents the structure of the CFV for a single threshold. As seen in Figure 7, the vector has 16 elements that hold color and texture features together. The CFV has 54 and 128 elements for 2 and 3 thresholds, respectively; in general it has $2(r+1)^3$ elements for $r$ thresholds, namely one CBH of $(r+1)^3$ bins from the original image and one from the gradient image. Figure 8 [...] The related similarity method is defined by Equation (12) [...], and the cosine similarity between two feature vectors $Q = (q_1, \dots, q_n)$ and $D = (d_1, \dots, d_n)$ is
$$\cos(Q, D) = \frac{\sum_{i=1}^{n} q_i d_i}{\sqrt{\sum_{i=1}^{n} q_i^2}\,\sqrt{\sum_{i=1}^{n} d_i^2}}. \qquad (13)$$
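The construction of the CFV and the three similarity measures used in the experiments can be sketched as below. The gradient operator is not specified in this section, so `np.gradient` (central differences, rescaled to 0..255) is used purely as a stand-in; the function names, the 0-based bin layout and the tie rule at threshold values are likewise assumptions for illustration.

```python
import numpy as np

def gradient_magnitude(image):
    """Per-channel gradient magnitude rescaled to 0..255 (stand-in operator)."""
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img, axis=(0, 1))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag * (255.0 / peak) if peak > 0 else mag

def cbh(image, thresholds):
    """Cluster-based histogram with (r+1)^3 bins for r thresholds per channel."""
    base = len(thresholds[0]) + 1
    idx = [np.digitize(image[..., c], thresholds[c]) for c in range(3)]
    labels = (idx[0] * base + idx[1]) * base + idx[2]
    counts = np.bincount(labels.ravel(), minlength=base ** 3)
    return counts / counts.sum()

def combined_feature_vector(image, color_thresholds, gradient_thresholds):
    """CFV: CBH of the image followed by CBH of its gradient image."""
    return np.concatenate([
        cbh(np.asarray(image), color_thresholds),
        cbh(gradient_magnitude(image), gradient_thresholds),
    ])

def euclidean_distance(a, b):
    """Smaller values mean more similar feature vectors."""
    return float(np.linalg.norm(a - b))

def histogram_intersection(a, b):
    """Sum of bin-wise minima; larger values mean more similar."""
    return float(np.minimum(a, b).sum())

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0
```

In a retrieval loop, the CFV of the query would be compared against every stored CFV and the database images ranked by the chosen measure, ascending for Euclidean distance and descending for the other two.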

Experimental Results and Discussion
The proposed algorithm was tested on different databases, namely Corel-1K, Corel-5K, Corel-10K and GHIM-10K. The performance of the CBIR system is measured by precision (P), recall (R) and mean average precision (mAP). P and R are calculated as follows:
$$P = \frac{\text{number of relevant images retrieved}}{\text{total number of images retrieved}}, \qquad R = \frac{\text{number of relevant images retrieved}}{\text{total number of relevant images in the database}}. \qquad (14)$$
The mAP value is defined as
$$\mathrm{mAP} = \frac{1}{X} \sum_{j=1}^{X} \frac{1}{Q_j} \sum_{i=1}^{Q_j} P(I_i),$$
where $Q_j$ is the number of relevant images for query $j$, $X$ is the total number of queries and $P(I_i)$ is the precision at the i-th relevant image. To obtain the experimental results, each image in the database was used as a query and the top 20 images were retrieved. Precision-Recall (P-R) curves, one of the most frequently used graphical methods for comparing algorithms, were then generated, and the general performance on the different databases is presented with these curves. Figure 9 and Figure 10 are the P-R curves of the experimental results achieved with the Corel-1K and Corel-10K datasets, respectively. [...] Figure 9(a) and Figure 10, respectively. On the other hand, in Figure 9(b), the P values at the same R level are [...].
The results above were obtained with the Euclidean similarity measure. The proposed algorithm has also been tested with the cosine similarity metric and the histogram intersection technique. Table 4 shows the performance of the cosine similarity metric on the Corel-1K dataset. The findings clearly indicate that all three approaches produce better results as the number of thresholds increases. Among them, the CGH algorithm retrieved the largest number of relevant images. The mAP value of the CGH algorithm with r = 3 is 77.8%, while it is 74.7% and 73.7% for Otsu and Kapur, respectively.
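The evaluation measures can be computed from ranked retrieval lists as in the following sketch. It assumes that each database image carries a class label and that an image is relevant when its label equals the query label; it also takes Q_j as the number of relevant images found within the retrieved list, which is one common reading of the definition above. The function names are illustrative.

```python
def precision_recall(retrieved_labels, query_label, n_relevant_in_db):
    """Precision and recall for one query, given the labels of the retrieved images."""
    hits = sum(1 for label in retrieved_labels if label == query_label)
    precision = hits / len(retrieved_labels)
    recall = hits / n_relevant_in_db
    return precision, recall

def average_precision(retrieved_labels, query_label):
    """Mean of the precision values measured at each relevant retrieved image."""
    hits = 0
    total = 0.0
    for rank, label in enumerate(retrieved_labels, start=1):
        if label == query_label:
            hits += 1
            total += hits / rank  # precision at this relevant image
    return total / hits if hits else 0.0

def mean_average_precision(ranked_lists, query_labels):
    """Average of the per-query average precision over all queries."""
    scores = [average_precision(ranked, query)
              for ranked, query in zip(ranked_lists, query_labels)]
    return sum(scores) / len(scores)
```

For the protocol described above, every database image would be used once as the query, `retrieved_labels` would hold the labels of its top 20 matches, and `n_relevant_in_db` would be the size of the query's class.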
Since the CGH algorithm proposed for the CBIR system is more successful than the Otsu and Kapur techniques on the Corel-1K and Corel-10K datasets, its performance was also compared across image similarity measures, with the number of thresholds, r, set to 3. The performance of the proposed system with different similarity approaches on the various databases is shown in Table 5. It can be seen that the Euclidean similarity metric is more effective than the other methods. Histogram intersection is based on the minimum value of corresponding histogram bins. Because some clusters in the CFV have zero frequency, the intersection metric produces the worst results on the databases.
The proposed CGH-based system was also compared with the LBP- and GSH-based CBIR systems, for which Corel-10K and GHIM-10K have been tested. The precision and recall values of the algorithms are given in Table 6. The drawbacks of LBP are that it produces a long histogram and does not evaluate macro texture. The GSH strategy has the limitation that it needs balanced color, density and edge components [46]. CFV, on the other hand, not only combines general color and texture information but also requires no parameters. Therefore, the CFV-based system is more effective than the others.
Table 6. Precision and recall parameters: Corel-10K and GHIM-10K

Conclusion
The feature extraction stage is an important issue for an effective CBIR system. In feature extraction, low-level features of images are converted into vectors, and images are represented by these feature vectors. The histogram is a commonly used feature vector in CBIR systems. However, the histogram of a color image is a multidimensional matrix, so the computational complexity is high if operations are performed on it. In this study, a novel retrieval architecture has been suggested that uses a combined feature vector (CFV), built from cluster-based one-dimensional histograms, to join texture and color features. Initially, the gradient operator was applied to the color images.
Subsequently, the original and gradient images were clustered using multi-level thresholding techniques. Color reduction was performed by assigning the average value of each cluster to its pixels in the reduced image. Cluster-based one-dimensional histograms (CBH) were produced for both the original and the gradient images, and the CFVs were formed by combining them.
Experiments were performed on different databases. It was observed that the CFV with CGH is significantly superior to the CFV with the Kapur and Otsu techniques. Consequently, a novel content-based image retrieval system based on a combined feature vector, which contains color and texture information, has been developed, and its performance was confirmed by experiments.