Depth grid based local description for 3D point clouds

Abstract: With the rapid development and broad application of next-generation image processing technologies, the manufacturing industry is increasingly adopting intelligent equipment. To meet the demands of high-precision, high-efficiency production, growing attention has been devoted to 3D point cloud processing methods that go beyond traditional approaches. A fundamental challenge in point cloud processing is establishing a point-to-point correspondence mapping between two point clouds, which relies on the local feature description information inherent in the point cloud. This paper investigates novel local description methods for point clouds, addressing the inadequate descriptive capability and robustness of existing methods. Specifically, this study explores the encoding of point information in the neighborhood space and multi-view projection mapping, and proposes a local point cloud description method based on depth grids. The method first establishes a local reference frame through neighborhood projection and distance weighting, and then builds a depth grid; the contribution of neighboring points to the depth of each grid cell is calculated to obtain the feature values. To enhance efficiency, the calculation of the feature values incorporates normalization and multi-view projection techniques. The proposed method is compared and evaluated against various local description methods to verify its effectiveness and accuracy.


Introduction
In contemporary society, the advancement of computer science and technology has ushered in a new era of human production. Computer technology, as a driving force behind progress in various fields, has opened up numerous subject areas that warrant in-depth study. Particularly in the field of non-artificial vision, high-precision visual sensing instruments have emerged as viable alternatives to the human eye, meeting stringent precision requirements. Moreover, the proliferation of cameras, smartphones, surveillance equipment, and other devices has fueled the remarkable growth of machine vision. Initially reliant on two-dimensional serial images to depict objective reality, machines encountered limitations and uncertainties when applied to practical operations.
Consequently, there has been a growing emphasis on the analysis and utilization of three-dimensional data. Within this realm, the study of point clouds, which capture the spatial information of objects in three-dimensional data, has become a crucial field [1]. Point clouds, like images, can convey real semantic information. Characterized by irregular distribution, varying density, and the lack of a specific topology, point cloud data represent the most fundamental form of 3D data while remaining resistant to scale distortion and brightness variation. Point clouds and their associated technologies find widespread application in today's industrial production [2]. In manufacturing inspection, the quality of production processes can be assessed by reconstructing models of workpieces. In remote sensing and mapping, point cloud data collected by drones offer significant research potential through big data analysis. For autonomous driving, point cloud technology enables effective adaptation to the complex environmental demands of urban driving. In intelligent healthcare, the simulation of certain physiological features aids doctors in analyzing conditions and obtaining accurate diagnoses and treatment outcomes. In archaeological restoration, accurate analysis of the degree of damage to historical artifacts facilitates the development of appropriate response plans.
The field of point cloud processing encompasses various technical branches, including semantic segmentation, point cloud reconstruction, and target recognition. The successful completion of these vision tasks relies on establishing accurate correspondences between points within the point cloud.
Practical visual inspection often involves complex structures and voluminous targets, making it nearly impossible to achieve the desired results with a single camera sensing pass. Consequently, incomplete data sets are often obtained, or a massive amount of data is acquired, necessitating research into multiple aspects of data processing. Firstly, the identification of key points with discriminative ability and a sufficiently sparse distribution within the point cloud is crucial. Additionally, the geometric representation of local information within the point cloud is essential [3]. Focusing on local information is imperative because of the challenges associated with processing multiple sets of point clouds or the entire point cloud. Local description aims to digitize spatial information by representing the feature information of a local point cloud with a sequence of features. As illustrated above, point cloud local feature description and point cloud alignment play a significant role in the field of non-artificial vision and present various technical difficulties and challenges. Assessing the quality of local features requires consideration of their descriptive capability, robustness, and compactness. This paper conducts an in-depth study of point cloud local description to enhance the descriptive power and robustness of point cloud local descriptors.

Related work
Point cloud local description methods can be classified according to whether or not they are based on a local reference frame. In this paper, we review local description algorithms according to this classification, focusing on the analysis of methods for encoding point information in the neighborhood space and on multi-view projection mapping.

Point Cloud Local Description
Yu Zhong [4] et al. designed an eigenshape signature based on the eigenvalues of the covariance matrix and the eigenmatrix, and screened key points by directional difference density. Radu, B.S. [5] et al. proposed a normal-vector-aligned mirror-image feature based on depth-map analysis; built on the principle of rotation invariance, it is invariant in six degrees of freedom, is simple to match, allows the descriptor size to be set flexibly, and has a clear physical meaning. Tombari [6] et al. absorbed the LRF-construction idea of ISS, divided feature description into two parts, feature encoding and histogram statistics, and proposed the orientation histogram feature SHOT. First, the spherical neighborhood is divided into n regions by latitude and longitude; then the cosines of the angles between the normal vectors of the neighborhood points in each subregion and the z-axis of the LRF are computed. Cosine values within a certain range are assigned to one of m intervals, so the final descriptor has m*n dimensions. Such high-dimensional descriptors reduce computational efficiency, so Prakhya [7] proposed a binary improvement of SHOT that encodes the descriptor with certain rules, improving efficiency at some cost in accuracy. SHOT can also be improved by introducing texture information: for urban environments, Behley [8] et al. derived a spectral histogram method with SHOT-like ideas, which likewise builds descriptors by cumulative voting for domain-specific performance optimization. Tang, K. [9] et al. constructed a unique LRF centered on feature points and, by linking voxel partitions, established the Signature of Geometric Centroids (SGC); combining an LRF with geometry also works well. Without building an LRF, local surfaces can be described by directly partitioning space into cells and computing local geometric features. The Point Feature Histogram (PFH) proposed by Rusu, R.B. [10] et al. establishes a local reference frame for each pair of key and neighborhood points, describing their spatial geometric differences while ensuring rotational invariance and robustness. The authors subsequently improved the method [11] by simplifying the features and correcting the statistics of the k neighborhood point pairs, reducing the computational complexity of PFH, and proposed the Fast Point Feature Histogram (FPFH). Logoglu, K.B. [12] et al. extended the PFH and FPFH ideas to colored point clouds and proposed the Colored Histogram of Spatial Concentric Surflet-Pairs (CoSPAIR). The Radius-based Surface Descriptor (RSD) proposed by Marton, Z.C. [13] et al. describes the geometric features of points by estimating the radial relationship between each point and its neighbors, a parsimonious yet descriptive method. The covariance-based descriptor proposed by Fehr, D. [14] et al. in 2016 has gained wide acceptance because of its low computational and memory requirements; the authors subsequently added RGB channels to extend the original covariance description to colored point clouds with the desired results. Beksi [15] et al. encapsulated the shape and visual features of the entire point cloud in a covariance descriptor, merging principal and Gaussian curvature into the shape vector and adding gradients and depths to the visual vector; combined with dictionary learning, this yielded a point cloud classification framework for target classification. Zhao, G. [16] et al. developed the Height Gradient Histogram (HGIH), which was the first to exploit the height-dimension data in point clouds; it performs well but lacks descriptive ability for small targets.

Encoding point information in the neighborhood space
Neighborhoods in point clouds are regions defined by specific rules centered on key points. Commonly used neighborhood types include cylindrical, spherical, and k-nearest-neighbor neighborhoods. The choice of neighborhood calculation method often depends on the data structure used to store the point cloud in memory. While calculating neighborhoods based on the point cloud's storage structure can offer higher efficiency, it may introduce performance issues due to disparities between the computer representation and the actual point cloud. Geometric shape neighborhoods, on the other hand, require traversal-based judgments to determine potential neighborhood points, sacrificing some efficiency for improved stability.
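As a concrete illustration of the geometric (spherical) neighborhood described above, the following sketch computes a radius neighborhood with a brute-force scan in NumPy. The function name and toy data are ours, not the paper's; the linear scan is exactly the traversal-based judgment the text refers to, and a production system would use a spatial index such as a k-d tree instead.

```python
import numpy as np

def spherical_neighbors(points, center, radius):
    """Indices of all points within `radius` of `center`, excluding the center itself."""
    d = np.linalg.norm(points - center, axis=1)
    # d > 0 drops the key point itself; d <= radius keeps the spherical neighborhood.
    return np.nonzero((d <= radius) & (d > 0))[0]

# Toy cloud: a key point at the origin plus points at known distances.
cloud = np.array([[0.0, 0.0, 0.0],
                  [0.5, 0.0, 0.0],
                  [0.0, 0.9, 0.0],
                  [0.0, 0.0, 2.0]])
print(spherical_neighbors(cloud, cloud[0], 1.0))  # -> [1 2]
```

Unlike a k-nearest-neighbor query, the number of returned points varies with local density, which is what gives the spherical neighborhood its robustness to resolution changes.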
Once the neighborhood points are determined, the next step is to establish encoding relationships between the key point and the other points within the neighborhood. A straightforward approach is to use direct geometric information, such as normal vectors or curvature. However, this proves difficult to accommodate the complexities and variations encountered in actual 3D inspection processes. A more sophisticated approach is therefore necessary to construct neighborhood spatial point information, employing consistent calculation rules across different key point neighborhoods to derive distinct feature information for each key point.
By incorporating a local reference frame, this neighborhood-based encoding gains enhanced descriptive capacity and robustness. Additionally, the established local reference frame facilitates subdividing the neighborhood into smaller grid intervals; mapping the obtained feature information onto the grid further improves the performance of the local descriptor.

Multi-view projection mapping
The unique challenges posed by self-occlusion and related issues in 3D objects significantly affect the effectiveness and robustness of local description methods, often making them overly sensitive to changes such as density resolution. It has been observed that recognizing objects from multiple perspectives greatly enhances recognition capability; fusing information from multiple viewpoints is therefore a crucial approach to comprehensive 3D object representation. This fusion relies on the local reference frame employed during object observation. Hence, when establishing local descriptions, projection mapping can be applied to integrate visual information from multiple perspectives, leveraging the benefits of a local reference frame [17].
By analyzing rotational features, a method utilizing multi-view projection mapping information can be developed. Two-dimensional projection information is simpler than direct three-dimensional information, allowing further processing through abstraction, normalization, gridding, and other techniques. These processes not only improve descriptor performance but also address the data redundancy of multi-view information.
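Why multi-view fusion must rely on the local reference frame can be illustrated with a small NumPy sketch (function name and data are illustrative): coordinates expressed in an LRF that rotates with the surface are unchanged by a rigid transform, so any projection computed from them is unchanged too.

```python
import numpy as np

def local_coords(p, Q, L):
    """Express neighborhood points Q in the LRF L (rows = axes) at key point p."""
    return (Q - p) @ L.T

rng = np.random.default_rng(1)
p = np.array([0.2, -0.1, 0.4])
Q = p + rng.normal(size=(50, 3))
L = np.eye(3)                        # assume this LRF was estimated at p

# Apply an arbitrary rigid transform to the whole scene.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
t = np.array([3.0, -2.0, 1.5])
p2, Q2, L2 = R @ p + t, Q @ R.T + t, L @ R.T  # the LRF axes rotate with the surface

a = local_coords(p, Q, L)
b = local_coords(p2, Q2, L2)
print(np.allclose(a, b))             # LRF coordinates are invariant to the transform
```

Any 2D projection (such as the depth grids proposed later in this paper) derived from these invariant coordinates inherits the same resistance to rigid body transformations.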

Depth grid based local description
Obtaining highly robust and descriptive local features from point cloud data is crucial in point cloud local description. As discussed earlier, point clouds are unordered and irregular and lack explicit topology, making it challenging to directly utilize their spatial attributes and set information.
Decoding the three-dimensional features and representing them in a meaningful way is therefore necessary. One viable approach is to project the 3D features onto multiple 2D planes, effectively representing the 3D information in multiple planar views. In this paper, we propose a point cloud local description method based on multiple 2D local depth grids, focusing on the establishment of local reference frames and on feature acquisition.
By projecting the 3D features onto multiple 2D planes, information can be captured and represented from various angles, enabling the extraction of meaningful local features with high robustness and descriptive ability.

Local reference frame
The local reference frame is a coordinate system that is independent of the world coordinate system and intrinsic to the local surface information. It offers resistance to rigid body transformations while providing sufficient spatial information. Ideally, the local reference frame should be repeatable, i.e., unaffected by positional transformations. On numerous public datasets, local descriptions based on local reference frames consistently outperform those that do not incorporate them.
The construction of local reference frames entails two key aspects. Firstly, attention should be paid to the validity of the local reference frame's feature representation, since the discriminative ability of the local description relies on it. Secondly, emphasis should be placed on the stability of the local reference frame itself: only when the local reference frame is stable can robust local descriptions be achieved.
Classical methods for constructing local reference frames follow two approaches: methods based on the spatial distribution of points and methods based on covariance analysis. In accordance with the definition of a coordinate system, a local reference frame is defined by three axes, one of which can be obtained as the cross product of the vectors corresponding to the other two.
That is, for a key point p in a given point cloud, the local reference frame constructed at p can be written as

L_p = { x(p), y(p), x(p) × y(p) },

where x(p) and y(p) are the x-axis and y-axis of L_p. The calculation of the local reference frame thus reduces to calculating the x-axis and y-axis separately. To ensure the effectiveness of the local reference frame, as much feature information as possible should be combined.
A proven option is to use the local surface normal vector as a coordinate axis; therefore, the normal vector can be taken as the x-axis. The normal vector depends on the selection of the local point set, i.e., the neighborhood points. Traditional local point set selection often uses the k-nearest neighbor algorithm, which can be replaced by a spherical nearest neighbor algorithm to ensure better robustness to changes in point cloud resolution. This algorithm takes a spherical space of radius r centered at the point p, and all points in this space except p are defined as the neighborhood points of p. These spherical neighborhood points form a local surface Q = {q_1, q_2, …, q_n}, and the normal vector n(p) corresponding to this local surface can be used as the x-axis of the local reference frame.
Unlike simple normal vector estimation, normal vectors in real scenarios often exhibit duality, i.e., the estimated normal direction may be reversed, so further processing is needed to eliminate the ambiguity. This is done by referring to the vectors between the key point and its neighborhood points to determine the actual direction of the normal. The final x-axis is

x(p) = n(p) if n(p) · Σ_{i=1}^{n} q_i ≤ 0, and x(p) = −n(p) otherwise,

where n is the number of points in the neighborhood and q_i is the vector from the key point to the i-th neighborhood point.
As shown in Figure 1, the y-axis must lie in the plane S that passes through the point p and is perpendicular to the x-axis. One can take the vector q_i from p to a neighboring point and project it onto the plane S:

v_i = q_i − (q_i · x(p)) x(p).

To obtain better performance, each projection vector is weighted before the projection vectors are summed. Neighborhood points closer to the key point should receive higher weights; conversely, points farther from the key point should receive lower weights, so the weights should be related to the distance between the neighboring points and the key point. Two distances are considered here: the straight-line distance between the two points and the projection distance from the neighboring point to the plane S. Combining the two, the weights can be taken as

w_i = (r − ||q_i||)² (q_i · x(p))².

In the final merging of the vectors, one can either choose a single most representative vector or merge the vectors directly by weight; the latter is chosen here, and the y-axis of the local reference frame is obtained as

y(p) = Σ_{i=1}^{n} w_i v_i / || Σ_{i=1}^{n} w_i v_i ||.

The expression of the entire local reference frame is completed by obtaining the z-axis as the cross product of the x-axis and y-axis.
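The construction above can be sketched in NumPy as follows. Where the text leaves details open, the exact weight forms (squared distance-to-boundary times squared projection distance) and the sign convention for the normal are our assumptions; the normal itself is estimated as the smallest-eigenvalue direction of the neighborhood covariance, a standard choice.

```python
import numpy as np

def build_lrf(p, Q, r):
    """Sketch of the LRF described above; weight forms and sign rule are assumptions.
    p: key point (3,); Q: spherical neighborhood points (n, 3); r: support radius."""
    V = Q - p                                   # vectors q_i from key point to neighbors
    # x-axis: surface normal as the smallest-eigenvalue eigenvector of the covariance,
    # with its sign flipped so it points against the bulk of the neighbor vectors.
    C = V.T @ V / len(Q)
    _, vec = np.linalg.eigh(C)                  # eigenvalues ascending
    x = vec[:, 0]
    if np.sum(V @ x) > 0:
        x = -x                                  # eliminate the normal's duality
    # y-axis: distance-weighted sum of the projections of q_i onto the plane S ⟂ x.
    proj = V - np.outer(V @ x, x)               # v_i = q_i - (q_i · x) x
    w1 = (r - np.linalg.norm(V, axis=1)) ** 2   # closer neighbors weigh more
    w2 = (V @ x) ** 2                           # projection distance to the plane S
    y = (w1 * w2) @ proj
    y /= np.linalg.norm(y)
    z = np.cross(x, y)                          # z-axis from the cross product
    return np.stack([x, y, z])                  # rows are the axes

rng = np.random.default_rng(0)
Q = rng.normal(scale=[1.0, 1.0, 0.05], size=(200, 3))  # roughly planar local patch
L = build_lrf(np.zeros(3), Q, r=5.0)
print(np.round(L @ L.T, 6))                     # ≈ identity: axes are orthonormal
```

Because y is built from projections orthogonal to x, the three axes form a right-handed orthonormal frame by construction.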

Depth grid based local description
When constructing feature descriptions, trade-offs in information selection are often necessary. To enhance the performance and stability of the feature description, as much information as possible should be gathered; however, including excessive information can reduce efficiency or even degrade performance. Classical feature selection methods choose direct information, such as point density or the offset angle of the normal vector. By further processing and decoding the direct information, redundancy can be reduced without sacrificing valuable data. One effective decoding approach is to utilize local depth, which projects the 3D information onto a 2D plane to create a depth map. During the calculation, the depth map is divided into a grid according to specified parameters. For each key point, the points within its neighborhood are projected onto the depth map grid, and the contribution of these neighborhood points to the depth map is calculated, yielding a two-dimensional depth map. By analyzing the depth map, the desired local features can be obtained.
It is important to note that, because viewpoint selection causes self-occlusion in 3D targets, it is not sufficient to construct a single depth map; instead, three depth maps are constructed from three different viewpoints, and their feature information is combined into the final Depth Grid (DG) feature descriptor (also referred to as the Improved Neighborhood Projection, INP). First, a depth grid parameter v is set, indicating the number of cells along each side of a single depth grid map, while the size of the whole map is determined by the radius r of the spherical neighborhood. The side length of a single grid cell is therefore

s = 2r / v.

Since the coordinates of the neighboring points in the local reference frame are known, the edge length s determines the cell into which each neighboring point is projected, together with its depth value after projection. The depth values require further simplification: because several neighboring points may project into the same cell, each cell actually holds a series of depth values. When screening these values, the effect of self-masking should be eliminated, so the smallest depth value in each cell is selected; the final cell depths are then obtained by normalization. The three depth maps capture the object's structure from multiple viewpoints, and their feature information is combined to create the local descriptor. The calculation process is illustrated in Figure 2.

Fig. 2 Calculation Process of INP Local Description
After the calculation process, the final depth grid feature, which is high-dimensional, is obtained. In practical implementation, only three projection depth calculations are performed for each point, and these calculations are relatively straightforward. The depth grid feature relies on the local reference frame, and both the local reference frame and the depth grid are based on eigenproperties within the local neighborhood. Importantly, these eigenproperties remain unchanged under rigid body transformations, ensuring that the depth grid feature is resistant to such transformations. Additionally, the weighting of the y-axis in the local reference frame and the use of a single depth value per grid cell provide independence from sparsity. These characteristics contribute to the noise immunity of the feature description. As a result, the depth grid description exhibits both resistance to rigid body transformations and noise resilience.
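A minimal sketch of a single depth-grid view follows, under our assumptions about cell indexing, the fill value for empty cells (the maximum depth r), and the normalization of depths to [0, 1]; the full DG descriptor would compute three such grids along the three LRF axes and concatenate them into a 3·v² vector.

```python
import numpy as np

def depth_grid(local_pts, r, v):
    """One depth-grid view (indexing, fill value, and normalization are assumptions).
    local_pts: neighbors already expressed in the LRF (n, 3); the grid lies in the
    x-y plane and depth is taken along z. Grid is v×v over [-r, r]^2."""
    s = 2.0 * r / v                        # side length of one grid cell
    grid = np.full((v, v), np.inf)
    for x, y, z in local_pts:
        j = min(int((x + r) // s), v - 1)  # cell index along x
        k = min(int((y + r) // s), v - 1)  # cell index along y
        grid[j, k] = min(grid[j, k], z)    # keep smallest depth: removes self-masking
    grid[np.isinf(grid)] = r               # empty cells get the maximum depth
    return (grid + r) / (2.0 * r)          # normalize depths from [-r, r] to [0, 1]

# Two points share a cell (min depth wins); one lands in another cell.
pts = np.array([[0.1, 0.1, 0.3], [0.1, 0.1, -0.2], [-0.4, 0.4, 0.0]])
f = depth_grid(pts, r=0.5, v=4)
print(f)
```

Selecting the minimum per cell means that a surface point occluding deeper points dominates the cell, which is the self-occlusion screening described above.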

Experiments and analysis
In order to verify the effectiveness of the DG feature description proposed in this paper, this section conducts experimental comparative analyses of the local reference frame and of the feature descriptor against other methods from several perspectives. The performance of this method and of other classical methods is compared on different datasets and under different conditions within the same dataset. The experimental platform is a computer with an Intel Core i7-6700HQ 2.60 GHz CPU and 8.00 GB RAM running Windows 10 64-bit; the implementation is based on the VS2019 C++ platform and the PCL 1.12.0 point cloud library. The datasets used for the experiments are the S3CR dataset [18] and the ITODD dataset [19].

Performance Analysis of LRF
Ideally, the local reference frames corresponding to the same key points should be identical for point clouds that have undergone a rigid body transformation, ensuring robustness to such transformations. The error between coordinate systems can be measured by the angular difference between coordinate axes. For example, for the x-axis before and after the transformation, the cosine of the angle between the two axes can be taken as cos(x); ideally cos(x) = 1, and the larger the angular error between the axes, the smaller the value. Extending this to the other two axes, an evaluation index for quantitative assessment of the local reference frame is obtained:

MeanCos = ( cos(x) + cos(y) + cos(z) ) / 3.

To obtain a comprehensive result, the same key points are first taken randomly from each point cloud, local reference frames are calculated for these key points, and the MeanCos values are then computed.
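The MeanCos index, as reconstructed above, can be computed as in this sketch; representing a frame as a matrix whose rows are the x, y, z axes is our convention, not the paper's.

```python
import numpy as np

def mean_cos(L1, L2):
    """MeanCos between two LRFs (rows are unit axes x, y, z): the mean of the
    cosines of the angles between corresponding axes."""
    return float(np.mean(np.sum(L1 * L2, axis=1)))

I = np.eye(3)
theta = np.deg2rad(10.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
print(mean_cos(I, I))                  # 1.0 for identical frames
print(round(mean_cos(I, I @ Rz.T), 4)) # slightly below 1 after a 10° rotation about z
```

A value of 1 means perfect repeatability; any residual rotation between the estimated frames pulls the index below 1.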
It should be noted that the final local features do not need to be computed at this stage, since only the performance of the local reference frame is evaluated. In addition, the various methods for building local reference frames and final local descriptions require a neighborhood support radius parameter.
In this experiment, the support radius is set to 15 pr, where pr is the point cloud resolution. The experimental results obtained on the different datasets using the various methods of establishing the local reference frame are shown in Table 1. It can be seen from Table 1 that the performance of the CA coordinate system on the ITODD dataset differs markedly from that of the other coordinate systems, because the CA-based local reference frame does not eliminate the duality of the coordinate axes; SHOT-LRF solves the duality problem, so its performance is significantly better. Meanwhile, on the bevel workpiece data the PSD coordinate system performs poorly, because a PSD-based coordinate system is more sensitive to changes in point cloud resolution. DG-LRF outperforms both the CA and PSD coordinate systems on every dataset, and slightly outperforms SHOT-LRF. Because real point clouds are often affected by environmental factors such as detection conditions, the resolution and noise conditions can vary between different point clouds of the same 3D target. Figure 3 shows the repeatability of the different local reference frames under different conditions; since the choice of downsampling rates is not linear, negative logarithmic values are used for their representation. As can be seen in Figure 3, all local reference frames show varying degrees of performance degradation as the noise increases and the downsampling rate decreases. The degradation of the PSD coordinate system is the most obvious, because this local reference frame uses only a single point to determine the axis direction. The other local reference frames, which use all points in the surface neighborhood to determine the axis direction, degrade more slowly, with SHOT-LRF and DG-LRF being more effective because of the introduction of multi-vector weights. In general, the DG-LRF proposed in this paper has excellent performance, with the best repeatability and robustness, although there is still significant performance degradation when the downsampling rate is too high.

Performance Analysis of DG local description
In the actual feature matching process between the source and target point clouds, a feature value is obtained for each key point of the source point cloud. The feature values of all points in the target point cloud are then traversed and compared with the key point's feature value, and points with close feature values can be considered pre-matched points. A reasonable feature matching should be unambiguous, so for each source key point the closest and second-closest feature distances in the target point cloud are taken, and if the ratio between them is less than a certain threshold, the two points are considered a match. In the experiment the rigid body transformation is known, and the error between the rigid-body-transformed point and the matched point is calculated. If the error is less than a threshold (which can be set to half the descriptor support radius), the match is considered correct; if it is greater, a mismatch is considered to have occurred. Two indicators are involved in the experiment: whether a match is made, and whether the match is correct. For the former, recall can be defined as

recall = (number of correct matches) / (total number of corresponding features). (9)

For the latter, 1-Precision can be defined as

1 − precision = (number of false matches) / (total number of matches). (10)

Combining the two indicators, an RP line graph can be drawn. If no restriction is placed on the feature ratio, the actual number of corresponding points equals the number of matched point pairs, which equals the sum of the numbers of correct and incorrect matches. By varying the feature ratio, multiple sets of data are obtained and a polyline is drawn; the point of this line farthest from the origin must lie on x + y = 1. In addition, the AUC (Area Under Curve) index can be used to reflect the comprehensive performance of the RP polyline; it is calculated as the area of the polygon formed by the line and the x-axis plus the area of the rectangle to the right of the line.
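One point of the RP curve at a given feature-ratio threshold can be computed as in this sketch; the function name and toy numbers are illustrative, and the inputs (nearest and second-nearest feature distances plus a geometric-correctness flag per correspondence) follow the procedure described above.

```python
import numpy as np

def rp_point(d1, d2, is_correct, tau):
    """One (1-precision, recall) point of the RP curve at ratio threshold tau.
    d1, d2: nearest / second-nearest feature distances for each source key point;
    is_correct: whether each nearest match is geometrically correct."""
    matched = d1 / d2 < tau                        # ratio test decides if a match is made
    n_match = int(matched.sum())
    n_correct = int((matched & is_correct).sum())
    recall = n_correct / len(d1)                   # Eq. (9): correct / all correspondences
    one_minus_precision = (n_match - n_correct) / max(n_match, 1)  # Eq. (10)
    return one_minus_precision, recall

# Toy run: 4 correspondences; at tau = 0.9, three pass the ratio test, two correctly.
d1 = np.array([0.2, 0.3, 0.5, 0.9])
d2 = np.array([1.0, 0.5, 0.6, 1.0])
ok = np.array([True, True, False, True])
print(rp_point(d1, d2, ok, tau=0.9))
```

Sweeping tau over (0, 1] and collecting these points traces out the RP polyline from which the AUC is computed.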
For the two datasets, S3CR and ITODD, the RP curves are shown in Figure 4. As can be seen in Figure 4, the DG descriptor proposed in this paper performs best on the S3CR dataset, followed by the RoPS descriptor; the SHOT and FPFH descriptors perform poorly, and at low feature ratio thresholds FPFH is the only descriptor with alignment errors. The FPFH features, on the other hand, perform quite well at high feature ratio thresholds, where they are second only to the DG descriptor; it can be concluded that different feature descriptors have different dominant scenarios. When the feature ratio threshold is at its maximum, every key point must find a match in the target point cloud; at that point the number of correct matches for the SHOT description is lower than for the FPFH description, and the DG description is slightly lower than the RoPS description. In practice, however, a very high feature ratio threshold is generally not used. On the ITODD dataset, the matching ability of all local descriptions decreases compared to the S3CR dataset due to the presence of multi-target occlusion, and the DG descriptor still performs best. The AUC values of the RP curves are shown in Table 2. From the AUC comparison, the DG descriptor proposed in this paper has the best overall performance, while the FPFH features perform very differently on datasets with different characteristics. Next, the ability of the local descriptors to resist Gaussian noise and downsampling is verified. RP curves obtained by adding different levels of Gaussian noise and performing different degrees of downsampling on the S3CR dataset are shown in Figure 5. As can be seen in Figure 5, the relative matching ability of the different descriptors under different noise and downsampling environments is approximately the same, with the DG descriptor slightly higher than the RoPS descriptor, and both higher than the SHOT and FPFH descriptors. Meanwhile, if no restriction is placed on the feature ratio, the FPFH descriptor performs better than the SHOT descriptor, while the opposite holds when the feature ratio is restricted.
The AUC metrics of each descriptor are shown in Table 3. As can be seen in Table 3, downsampling has a stronger impact on the performance of the local descriptors than adding noise. This is because, compared with noise, downsampling corrupts the spatial structure of the local point cloud more severely, which degrades the descriptor's descriptive ability. Collectively, the AUC value of the DG descriptor proposed in this paper is higher than that of the RoPS descriptor, and markedly higher than those of the FPFH and SHOT descriptors, especially under noisy conditions. This shows that the descriptive ability and stability of the DG descriptor are better than those of the other descriptors.

Conclusion
This study focuses on the development of a point cloud local description method and introduces the Depth Grid (DG) descriptor as a solution to the limitations of existing methods. By combining neighborhood point space information and multi-view projection mapping, the proposed method addresses the issues of insufficient descriptive capability and poor robustness. The DG descriptor, along with its corresponding method for establishing the local reference frame, is compared with several recent algorithms through experiments using evaluation metrics such as MeanCos, RP curves, and AUC. The experimental results demonstrate that the DG descriptor exhibits superior descriptive ability and robustness on the selected datasets.
It is important to note that the local description based on the depth grid employs multi-view projection, resulting in a high-dimensional feature representation. This presents an opportunity for further optimization of computational efficiency. Additionally, the abundance of feature data may contain spatial redundancies; strategies such as binarization can therefore be explored to optimize the dimensionality of the feature description.

Fig. 1 Illustration of the local reference frame

Fig. 3 Reproducibility comparison of different coordinate systems under different conditions (a) Gaussian noise (b) Downsampling

Fig. 4 Feature matching ability of different descriptors under different datasets (a) S3CR dataset (b) ITODD dataset

CHI ZHANG received the B.S. degree in information engineering from the Wuhan University of Technology, Wuhan, China, in 2020, where he is currently pursuing the M.S. degree with the School of Information Engineering. His research interests include digital image processing and machine learning.

YUYAN SONG received the B.S. degree in information engineering from the Wuhan University of Technology, Wuhan, China, in 2021, where she is about to pursue the M.S. degree with the School of Information Engineering. Her research interests include digital image processing and machine learning.

LIWEI DING received the B.S. degree in information engineering from the Wuhan University of Technology, Wuhan, China, in 2022, where he is currently pursuing the M.S. degree with the School of Information Engineering. His research interests include digital image processing and machine learning.

YECHEN HUANG received the B.S. degree in information engineering from the Wuhan University of Technology, Wuhan, China, in 2022, where he is currently pursuing the M.S. degree with the School of Information Engineering. His research interests include digital image processing, machine learning, and embedded systems.

Table 1
MeanCos comparison of different local reference frames on the datasets

Table 2
AUC values for different local descriptors

Table 3
AUC values for different local descriptors under Gaussian noise and downsampling