Fingerprint Presentation Attack Detection using Referential Quality Metrics and Minutiae Count



INTRODUCTION
Biometrics are now in common use because they provide a way to identify and authenticate individuals. Among all biometric traits, fingerprints are the most reliable and useful [1]. The authors of [2] noted that fingerprint-based biometric systems are less expensive than face- and iris-based systems. Many researchers have commented on the applications and usability of fingerprints, but fingerprint systems fall short of their objective because of presentation attacks [3].
To address these security threats, fingerprint presentation attack detection (FPAD) has been studied at both the hardware and the software level. Hardware-based FPAD approaches combine fingerprint readers with sensors that analyse signs of life such as odour, blood pressure and skin distortion. Software-based approaches generally use two types of fingerprint features: dynamic features, such as changes in skin colour caused by skin elasticity and pressure, and static features, such as ridge and valley characteristics, sweat pores and perspiration.
FPAD solutions are further grouped into open-set and closed-set types. Open-set solutions are trained with limited information about spoof materials and tested on a different set of spoof materials [4], [5]. Closed-set solutions are trained with full information about the spoof materials and tested only on known spoof materials. The major challenges for closed-set FPAD solutions are over-fitting and poor generalization. Many researchers have noted that spoof fingerprints can be fabricated from numerous materials, so a generalized open-set solution is very difficult to achieve and depends on the fingerprint sensor.
The remainder of this article is organized as follows. Section 2 reviews related research on FPAD, the significant findings of the literature, stated as research questions (RQs), and the process of addressing them. Section 3 describes the proposed method, which is based on the RQs identified in the literature review. Section 4 presents the experimental evaluation of the proposed work. Finally, Section 5 concludes and outlines directions for future work.

RELATED WORK
Presentation attacks expose security flaws in existing fingerprint recognition systems, so automated spoof detection techniques have been developed in recent years. Spoof artifacts vary widely because of the optical design of the sensors and the mechanical properties of the spoof materials. Spoof detectors perform poorly on unseen spoof samples because the machine learning model is trained on a limited set of spoof samples.
FPAD approaches are mainly based either on local features, including Local Binary Patterns (LBP), Local Phase Quantization (LPQ) and Binarized Statistical Image Features (BSIF) [6], or on global features learned with deep learning methods [7]. A recent work took a middle approach, applying deep learning to minutiae-centred patches [8]. The state of the art for these feature-based solutions is discussed in this section. In one paper, the authors proposed a convolutional neural network (CNN) based approach that uses local patches to capture features around fingerprint minutiae. These local patches are used to train an Inception-v3 CNN model, which generates a global spoof score to discriminate between fake and live fingerprints. The method reduced the average classification error by up to 69% for both unknown and known spoof materials on the LivDet 2015 dataset. R. F. Nogueira et al. [9] used principal component analysis (PCA) and a support vector machine (SVM) to identify liveness features. They also evaluated combinations of machine learning models on the LivDet 2009, 2011 and 2013 datasets and compared the results of the different models. They concluded that ConvNet + PCA + SVM and AUG + ConvNet + PCA + SVM gave the best results on the LivDet 2013 dataset. A. Rattani et al. [10] built a fingerprint spoof detector using a W-SVM approach, which achieved an average true detection rate of up to 70% on the LivDet 2011 dataset. Y. Ding et al. [11] proposed multiple one-class SVM (OC-SVM) classifiers with local textural features based on GLCM, LBP, BSIF, BGP and LPQ. Each OC-SVM uses a different set of features, together with some fake fingerprints to refine the decision boundary. This technique requires fewer spoof samples for training and performs stably across different fabrication materials. They obtained a Correct Detection Rate (CDR) of 87%, which was better than that of a binary SVM (B-SVM) [12] on the LivDet 2011 dataset.
C. Yuan et al. [13] proposed fingerprint liveness detection based on CNN feature extraction with PCA reduction and an SVM classifier. This method performed well in liveness detection; they reported average classification error (ACE) scores on the LivDet 2009, 2011 and 2013 datasets. E. Park et al. [14] proposed a CNN model that operates on fingerprint patches to calculate ACE scores. Using patches has the advantage of increasing the size of the training set. They used the LivDet 2011, 2013 and 2015 databases and obtained an ACE of 1.35%. R. Gajawada et al. [15] proposed the Universal Material Translator (UMT), which uses style transfer for cross-material fingerprint spoof detection. This technique improves generalization to novel spoof materials while preserving high performance on known materials. They used local patches rather than whole images from LivDet 2015. E. Marasco et al. [16] provide a detailed survey of the methodologies used for liveness detection.
After studying the available literature, we identified the following significant points. First, previous research used a bounded approach (dependent on shape, size, location and device) for fingerprint presentation attack detection (RQ1). This approach suffers from high computational overhead, requires device-dependent algorithms and needs a large variety of live samples to give good results. Second, researchers used different datasets and algorithms to improve accuracy, but paid little attention to reducing the error rate, which is more important in spoof detection for presentation attacks (RQ2).
To examine RQ1, we used non-hand-crafted features, namely the Image Quality Assessment (IQA) referential quality metrics (SSIM, MSE and PSNR) and the minutiae count, to reduce the computational overhead (with a limitation regarding testing on multiple sensors). To examine RQ2, we verified the consistency of the error rate across different machine learning algorithms and measured performance with the average classification error (ACE) score and accuracy. In the next section, we discuss the proposed method for spoof detection, which aims at a good trade-off between accuracy and ACE score.

PROPOSED WORK
This section describes the proposed work, which is based on the issues identified in the literature. Many researchers have used the LivDet 2009, 2011, 2013, 2015, 2017 and 2019 datasets [23] for FPAD. In this study, we evaluated the proposed approach on the CrossMatch dataset of LivDet 2015. This dataset contains 500 live fingerprint images and 502 spoof fingerprint images (152 Body Double, 150 Ecoflex and 200 Play Doh). Figure 1 shows that distinguishing 'Live' from 'Spoof' fingerprint images is difficult. Methods are therefore required to detect spoof fingerprints, as they are a threat to security. In this work, we propose a scheme that classifies a presented fingerprint as live or spoof in order to secure biometric verification. Figure 2 shows the basic architecture of the proposed work. The methodology is divided into three stages: pre-processing, image quality assessment and minutiae extraction, and training.

Pre-processing
Captured fingerprint images are often not of high quality and contain noise introduced during recording, for example because of dryness or wetness of the skin. To enhance the captured images, a Gabor filter tuned to a specific wavelength and orientation is applied. The grayscale image is then converted into a binary image by thresholding, so that each pixel is assigned either black or white. The ridges in the binary image are then thinned to one-pixel width to ease minutiae extraction.
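The enhancement and thresholding steps can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the kernel size, the parameter values and the choice that values below the threshold become black are all assumptions made for illustration, and thinning is omitted.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0):
    """Real part of a Gabor kernel as a size x size list of lists (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def binarize(image, threshold):
    """Map each grey value to 0 (black) or 255 (white).

    Assumption: values below the threshold become black; the polarity
    used in the paper is not stated.
    """
    return [[0 if pixel < threshold else 255 for pixel in row] for row in image]

# Hypothetical parameters: 5x5 kernel, wavelength 4 pixels, orientation 0 rad.
kernel = gabor_kernel(size=5, wavelength=4.0, theta=0.0, sigma=2.0)
binary = binarize([[10, 200], [130, 90]], threshold=128)
```

In practice the kernel would be convolved with the image at the locally estimated ridge orientation and frequency before thresholding.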

Image Quality Assessment
Image quality is assessed using three full-reference image quality metrics: Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM). MSE and PSNR are error-sensitive measures, whereas SSIM is a structural similarity measure. These metrics are explained below. In this work, MSE measures the average squared grey-level deviation of spoof and real fingerprint images from a reference. Two images are used: the enhanced fingerprint image obtained from the first stage and a plain white image as the reference image. The MSE is given by equation 1,
$$\mathrm{MSE} = \frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N}\left[A(x, y) - A'(x, y)\right]^{2} \qquad (1)$$
where A(x, y) is the enhanced image, A'(x, y) is the reference image and M × N is the image resolution.
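The MSE between an enhanced image and a plain white reference can be sketched as follows. This is an illustrative sketch under stated assumptions: images are represented as lists of rows of 8-bit grey values, white is taken to be 255, and the helper names are hypothetical.

```python
def white_reference(rows, cols, white=255):
    """Plain white reference image (assumption: 8-bit images, white = 255)."""
    return [[white] * cols for _ in range(rows)]

def mse(image, reference):
    """Mean of the squared pixel-wise differences between two equally sized images."""
    rows, cols = len(image), len(image[0])
    total = 0.0
    for x in range(rows):
        for y in range(cols):
            diff = image[x][y] - reference[x][y]
            total += diff * diff
    return total / (rows * cols)

# Toy 2x2 "enhanced" image with one black ridge pixel.
enhanced = [[255, 0], [255, 255]]
print(mse(enhanced, white_reference(2, 2)))  # 16256.25
```

With a white reference, the MSE grows with the amount of dark ridge area in the enhanced image.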
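PSNR measures error relative to the peak value, that is, the ratio of the maximum power of the image to the power of the noise in the image. The higher the PSNR, the better the quality of the image. The general form is given by equation 2,
$$\mathrm{PSNR} = 20 \log_{10}\left(\frac{\mathrm{MAX}}{\sqrt{\mathrm{MSE}}}\right) \qquad (2)$$
where MAX is the maximum possible pixel value of the image.
SSIM is a perception-based model that considers image degradation and measures the perceptual difference between two similar images. SSIM uses luminance, contrast and structure to compare local patterns of pixel intensities. It is given by equation 3,
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^{2} + \mu_y^{2} + c_1)(\sigma_x^{2} + \sigma_y^{2} + c_2)} \qquad (3)$$
where $\mu_x$ is the average of the enhanced fingerprint image, $\mu_y$ is the average of the reference image, $\sigma_x^{2}$ and $\sigma_y^{2}$ are the corresponding variances and $\sigma_{xy}$ is the covariance of the two images. The constants $c_1$ and $c_2$ are included to avoid instability when denominator terms such as $\mu_x^{2} + \mu_y^{2}$ are very close to zero. They are given by $c_1 = (k_1 L)^{2}$ with $k_1 = 0.01$ and $c_2 = (k_2 L)^{2}$ with $k_2 = 0.03$, where L is the dynamic range of the pixel values.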
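The PSNR and SSIM definitions can be illustrated with a short sketch. It assumes 8-bit images (a peak value and dynamic range of 255) and computes SSIM once over the whole image from flattened pixel lists; practical SSIM implementations usually average the index over local windows, so this global version is a simplification, and the function names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def psnr(mse_value, max_value=255.0):
    """PSNR in decibels from an MSE value; infinite when the images are identical."""
    if mse_value == 0:
        return math.inf
    return 20.0 * math.log10(max_value / math.sqrt(mse_value))

def global_ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM computed once over all pixels of two flattened images x and y."""
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = mean(x), mean(y)
    var_x = mean([(a - mu_x) ** 2 for a in x])
    var_y = mean([(b - mu_y) ** 2 for b in y])
    cov_xy = mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

enhanced = [255, 0, 255, 255]   # flattened toy image
reference = [255] * 4           # flattened plain white reference
print(psnr(16256.25))
print(global_ssim(enhanced, reference))
```

Note that with a constant white reference the reference variance and the covariance are both zero, so the structure term of the index depends only on the variance of the enhanced image.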

Minutiae points extraction
Before minutiae points are extracted, the ridge orientation must be estimated, since it is needed for describing, matching and detecting minutiae. First, the image gradient is calculated from the original image by convolving it with a filter; in this work, we used a Gaussian filter. The gradient is calculated in two directions, 'x' and 'y'. From these two gradient images, covariance data are calculated. After the covariance data are smoothed, a weighted summation of the data is performed. Sine and cosine functions of the principal gradient direction are computed and smoothed further, which helps in calculating the reliability of the orientation estimate. The moment of inertia is then calculated about the orientation axis (the minimum inertia) and about the axis perpendicular to it (the maximum inertia). The reliability factor is calculated as 1 - (minimum inertia/maximum inertia). If the ratio (minimum inertia/maximum inertia) is close to one, there is little orientation information. Finally, a mask is applied to the reliability measure to exclude areas where the denominator of the orientation calculation is small; in this experiment, we used a mask value of 0.001. After the ridge orientation is calculated, minutiae points are extracted [29]. Minutiae are specific points in the fingerprint, such as ridge bifurcations and ridge endings. In this work, all ridge bifurcations, which have a crossing number of three, are detected; spurious ones are then weeded out and trimmed to obtain the final set of minutiae points.
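The crossing-number test mentioned above can be sketched on a thinned binary image as follows. This is a minimal illustration, not the authors' code: it assumes ridge pixels are 1 and background pixels are 0, skips the image border, and omits the weeding of spurious minutiae. The function names are hypothetical.

```python
def crossing_number(skeleton, r, c):
    """Half the number of 0/1 transitions met when walking once around pixel (r, c)."""
    # The 8 neighbours in circular order; the first is repeated to close the loop.
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0),
               (1, -1), (0, -1), (-1, -1), (-1, 0)]
    values = [skeleton[r + dr][c + dc] for dr, dc in offsets]
    return sum(abs(values[i] - values[i + 1]) for i in range(8)) // 2

def find_minutiae(skeleton):
    """Return (endings, bifurcations) as lists of (row, col) for interior ridge pixels."""
    endings, bifurcations = [], []
    for r in range(1, len(skeleton) - 1):
        for c in range(1, len(skeleton[0]) - 1):
            if skeleton[r][c] != 1:
                continue
            cn = crossing_number(skeleton, r, c)
            if cn == 1:
                endings.append((r, c))
            elif cn == 3:
                bifurcations.append((r, c))
    return endings, bifurcations

# Toy Y-shaped skeleton: one ridge splits into two at (2, 2).
y_shape = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
print(find_minutiae(y_shape))  # ([], [(2, 2)])
```

The minutiae count used as a feature would then be the length of the retained list after spurious points are removed.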

Training and Testing of Models
In this phase, we trained three types of classifiers, a Neural Network (NN), a Support Vector Machine (SVM) and k-Nearest Neighbour (k-NN), and compared their results. The input to each classifier contains four features: MSE, PSNR, SSIM and the minutiae count.

Neural Network
A neural network is a set of algorithms that aims to recognize patterns in data in a way loosely modelled on how the human brain works. It is a collection of interconnected neurons, artificial or organic in nature, that is widely used for classifying data.
In this work, we used a total of 1002 live and spoof fingerprint images, of which 70% (702 samples) were used for training, 15% (150 samples) for validation and the remaining 15% (150 samples) for testing. The data were divided randomly. The NN model used 9 hidden neurons and a sigmoid activation function, which returns a value in the range 0 to 1. The number of hidden neurons was chosen by training the network with different numbers of neurons, as shown in Table 1. We chose 9 hidden neurons because this gave the best results for both testing and overall accuracy.
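The random 70/15/15 division can be reproduced with a short sketch. The rounding rule is an assumption chosen so that the counts match the reported 702/150/150 samples (the validation and test sizes are rounded down and the training set takes the remainder); the function name and the seed are hypothetical.

```python
import random

def split_indices(n_samples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Randomly split sample indices into training, validation and test sets."""
    n_val = int(n_samples * val_fraction)    # rounded down
    n_test = int(n_samples * test_fraction)  # rounded down
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]          # training set takes the remainder
    return train, val, test

train, val, test = split_indices(1002)
print(len(train), len(val), len(test))  # 702 150 150
```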

SVM
In this work, we experimented with several kinds of SVM (Linear, Quadratic, Cubic, Fine Gaussian, Medium Gaussian and Coarse Gaussian SVM) and obtained the best outcome with the Fine Gaussian SVM. We used 5-fold cross-validation. The kernel function chosen was Gaussian, which gave the best result among the kernel functions tried. Two hyperparameters, the kernel scale and the box constraint, were tuned by experimenting with different combinations of values, as shown in table 2. We observed that as the box constraint increases, the kernel scale has to decrease to obtain better results. In addition, the box constraint should not be too low, as this leads to overfitting problems and increases the number of support vectors. The best accuracy, 88.8%, was obtained with a kernel scale of 1 and a box constraint of 5. The multiclass method was left at its default, one-vs-one.

k-NN
k-nearest neighbour (k-NN) is a supervised machine learning algorithm widely used for classification. It uses feature similarity to predict the value of a new data point, which is assigned a value according to how closely it matches points in the training set. In this work, we used Medium k-NN. Three hyperparameters were tuned: the number of neighbours, the distance metric and the distance weight. Their values were set by examining different combinations, as shown in table 3. We observed that a very small number of neighbours makes the model more sensitive to noise in the data, so an optimal value must be chosen for this parameter; we therefore did not experiment with fewer than 5 neighbours. The best result was achieved with 5 neighbours, the Chebyshev distance metric and inverse distance weighting.
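The best-performing configuration (5 neighbours, Chebyshev distance, inverse distance weighting) can be sketched in plain Python. This is an illustrative sketch rather than the toolbox implementation used in the experiments: the function names, the handling of exact matches and the toy data are assumptions.

```python
from collections import defaultdict

def chebyshev(a, b):
    """Chebyshev distance: the largest absolute difference over all features."""
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_features, train_labels, query, k=5):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbours."""
    neighbours = sorted(
        ((chebyshev(f, query), label) for f, label in zip(train_features, train_labels)),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        if distance == 0:
            return label  # assumption: an exact match decides the class outright
        votes[label] += 1.0 / distance
    return max(votes, key=votes.get)

# Toy two-feature data; 0 = live, 1 = spoof (labels follow the confusion matrices).
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1]
print(knn_predict(features, labels, (0.5, 0.5)))  # 0
```

Because the four features in this work have very different numeric ranges, some form of feature scaling would normally be applied before computing distances; whether and how that was done here is not stated.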

EXPERIMENTAL EVALUATION AND DISCUSSION
Section 3 reported the testing accuracy of all models (NN, SVM and k-NN) on features extracted from LivDet 2015. This section discusses those results and benchmarks the ACE score of FPAD across the different machine learning models.

Results analysis using Neural Network
Table 1 in section 3 gives the best accuracy of the neural network, 88%, at 9 hidden neurons. The confusion matrix and receiver operating characteristic (ROC) curve are shown in figure 3. In the confusion matrix, '0' denotes live fingerprints and '1' denotes spoof fingerprints. Out of 40 epochs, the best validation result was obtained at epoch 34, with a validation performance of 0.24575; the gradient value was 0.022848 at epoch 40. According to the confusion matrices shown in figure 3(a), the false-negative rate for spoof (FerrFake) is small for all four matrices, that is, training, validation, testing and all combined. For spoof detection algorithms, FerrFake should be lower than FerrLive for better authentication. The ROC curve in figure 3(b) shows that the false-positive rate remains at 0 at a true positive rate of 0.65.

Results analysis using k-NN
Table 3 in section 3 gives the best accuracy for k-NN, 88.6%. The confusion matrix is shown in figure 4(a). The false-negative rate (FNR) is smaller for spoof fingerprints (8.8%) than for live fingerprints (14%). This means that only 44 spoof images were misclassified as live. Because this quantity should be small, we aimed for a lower FNR for spoof than for live fingerprints. According to the ROC curve shown in figure 4(b), the classifier gives its optimal result at (0.09, 0.86), that is, a TPR of 0.86 at an FPR of 0.09. In this case, the false-positive rate is less than the true positive rate. Moving to the right of the optimal point, both TPR and FPR increase, which is undesirable because the FPR should be as low as possible; moving to the left, both values decrease. We obtained AUC = 0.95, which is good given that the maximum possible area is 1.

Results analysis using SVM
Table 2 in section 3 gives the best accuracy of the SVM (Fine Gaussian SVM), 88.8%. According to the confusion matrix shown in figure 5(a), out of 500 live images, 440 were classified correctly as live and 60 were misclassified as fake. Out of 502 spoof fingerprints, 450 were classified correctly as spoof and 52 were misclassified as live. As with k-NN and NN, the false-negative rate for spoof (10.4%) is lower than the false-negative rate for live (12.0%). According to the ROC curve shown in figure 5(b), the SVM classifier gives its optimal result at (0.10, 0.88), that is, a TPR of 0.88 at an FPR of 0.10. In this case, the false-positive rate is less than the true positive rate. Moving to the right of the optimal point, both TPR and FPR increase, which is undesirable because the FPR should be as low as possible; moving to the left, both values decrease. We obtained AUC = 0.95, which is good given that the maximum possible area is 1.

Performance comparison of ACE score with other state of art method
Table 5 compares the three methods, k-NN, NN and SVM, in terms of accuracy and ACE score. SVM gives the best results, with the lowest ACE score and the highest accuracy. The ACE score is calculated as ACE = (FerrLive + FerrFake) / 2, where FerrFake is the number of misclassified spoof fingerprints divided by the total number of spoof fingerprints and FerrLive is the number of misclassified live fingerprints divided by the total number of live fingerprints. The lower the ACE score, the better the model for the FPAD system.
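As a worked check of the ACE formula, the sketch below recomputes the ACE score and accuracy from the misclassification counts reported for the SVM (60 of 500 live and 52 of 502 spoof images) and for k-NN (14% of 500 live images, i.e. 70, and 44 of 502 spoof images). The function names are hypothetical.

```python
def ace_score(live_total, live_misclassified, spoof_total, spoof_misclassified):
    """Average classification error: the mean of FerrLive and FerrFake."""
    ferr_live = live_misclassified / live_total
    ferr_fake = spoof_misclassified / spoof_total
    return (ferr_live + ferr_fake) / 2.0

def accuracy(live_total, live_misclassified, spoof_total, spoof_misclassified):
    """Fraction of all images that were classified correctly."""
    correct = (live_total - live_misclassified) + (spoof_total - spoof_misclassified)
    return correct / (live_total + spoof_total)

# SVM: 60 of 500 live and 52 of 502 spoof images misclassified.
print(round(ace_score(500, 60, 502, 52), 4))  # 0.1118
print(round(accuracy(500, 60, 502, 52), 3))   # 0.888

# k-NN: 70 of 500 live and 44 of 502 spoof images misclassified.
print(round(ace_score(500, 70, 502, 44), 4))  # 0.1138
print(round(accuracy(500, 70, 502, 44), 3))   # 0.886
```

Both results agree with the reported accuracies of 88.8% and 88.6% and with the reported ACE values to four decimal places.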
Type-I and Type-II errors need to be minimized to raise confidence in an FPAD system. We therefore compare the performance of our approach with other state-of-the-art methods in terms of ACE score. We evaluated the proposed approach with the three classification methods described above, which reflects the algorithm's robustness against existing spoof materials in closed-set environments. We observed that the models trained on referential quality metrics and the minutiae count of the entire image achieved a significantly greater reduction in average classification error than existing methods that use complex feature sets, as shown in table 6.
Table 5. ACE score and accuracy of all three methods

CONCLUSION AND FUTURE SCOPE
Fingerprint spoof detection is a challenging task because live and spoof fingerprints are difficult to tell apart. In this work, we tried to address issues identified in the literature. The proposed method addresses them using referential quality metrics and the minutiae count as features. These features were evaluated with NN, k-NN and SVM machine learning models. The accuracies and ACE scores obtained by the three methods were approximately similar: SVM gave an accuracy of 88.8% with ACE = 0.111792, the neural network gave 88% with ACE = 0.119780, and k-NN gave 88.6% with ACE = 0.113824. The accuracies are in a good trade-off with other state-of-the-art methods, and the ACE scores are quite low. We therefore conclude that image quality metrics combined with the minutiae count provide a significant improvement in error rates. This paper extracts quality metrics from whole images; in future, parts or patches of fingerprints could be used. To improve accuracy, other image quality metrics, such as additional full-reference or no-reference measures, could be utilized. The input image could also be divided into R, G and B components, with combinations of these components used as input images and the rest as reference images. These two proposals may further enhance the training process and thus increase performance.

DECLARATIONS
Funding: No funds, grants, or other support was received.

Conflicts of interest/Competing interests:
We declare that none of the authors has any conflict of interest.

Availability of data and material: Not applicable
Code availability: The code will be provided on request after acceptance for publication.