This section summarizes the experiments performed and the results obtained to validate the efficacy of the proposed system, i.e., fuzzy-logic-based edge detection followed by SURF feature extraction and KNN classification. We then present results on the medically altered facial dataset and compare performance with contemporary methods for recognizing medically altered faces.
4.1 Medically altered facial dataset
The facial database for face identification across medical alterations is taken from [11]. It contains 1800 before/after-surgery samples for 900 subjects: 519 pairs for local procedures and 381 pairs for global procedures. The local procedures comprise 74 samples for Otoplasty (ears), 192 for Rhinoplasty (nose), 105 for Blepharoplasty (eyelid), 60 for Brow lift (forehead), 32 for Dermabrasion (skin texture) and 56 samples for other miscellaneous procedures. The global procedures comprise 73 samples for skin peeling and 308 samples for face-lift therapy. Figure 3 shows some sample images from the plastic surgery facial database.
4.2 Experimental parameters (evaluation metrics)
The metrics employed to evaluate the recognition schemes are Equal Error Rate (EER), Recognition Rate (RR), Half Total Error Rate (HTER), Computation Time (CT), F-score and the Receiver Operating Characteristic (ROC) curve [7] [9]. Recognition Rate (RR) measures the proportion of correct matches between pre- and post-surgery samples and is inversely related to EER. EER is the operating point at which FRR equals FAR. False Rejection Rate (FRR) is the probability of rejecting a genuine subject as an impostor. False Acceptance Rate (FAR) is the probability of accepting a false match between pre- and post-surgery images, i.e., a pair whose probe and gallery images belong to different subjects.
FAR = Number of false acceptances/Number of false claims
FRR = Number of false rejections/Number of recognition attempts
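These two rates, together with the EER operating point at which they coincide, can be computed directly from lists of genuine and impostor matching distances. The sketch below is illustrative, not the paper's implementation (the function names are our own), and assumes smaller distances mean acceptance:

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """FAR and FRR at a given distance threshold (distance <= threshold = accept)."""
    genuine = np.asarray(genuine)
    impostor = np.asarray(impostor)
    far = np.mean(impostor <= threshold)   # impostor pairs wrongly accepted
    frr = np.mean(genuine > threshold)     # genuine pairs wrongly rejected
    return far, frr

def eer(genuine, impostor):
    """Sweep candidate thresholds and return the point where FAR ~= FRR."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_t, best_gap = thresholds[0], np.inf
    for t in thresholds:
        far, frr = far_frr(genuine, impostor, t)
        if abs(far - frr) < best_gap:
            best_gap, best_t = abs(far - frr), t
    far, frr = far_frr(genuine, impostor, best_t)
    return (far + frr) / 2.0, best_t

far, frr = far_frr([0.1, 0.2, 0.3], [0.8, 0.9, 1.0], 0.5)
e, t = eer([0.1, 0.2, 0.3], [0.8, 0.9, 1.0])
```

With perfectly separated toy distances, both rates and the EER come out zero; on real score distributions the sweep finds the crossover threshold.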
The KNN classifier uses the squared Euclidean distance metric to compute the distance between matched feature points. A small distance between matched feature points indicates that the before/after-surgery samples belong to the same person (low EER), while a large distance indicates an impostor pair, i.e., the pre- and post-surgery images are not of the same person.
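A minimal sketch of this matching step, using scikit-learn's brute-force nearest-neighbour search with the squared Euclidean metric. The descriptors here are random stand-ins for 64-D SURF descriptors, and the 1.0 acceptance threshold is purely illustrative:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Stand-in 64-D descriptors (SURF descriptors are 64- or 128-D):
# gallery = pre-surgery image, probe = post-surgery image of the same subject.
gallery = rng.normal(size=(200, 64))
probe = gallery + rng.normal(scale=0.05, size=gallery.shape)  # slight perturbation

nn = NearestNeighbors(n_neighbors=1, algorithm="brute", metric="sqeuclidean")
nn.fit(gallery)
dist, idx = nn.kneighbors(probe)

# Small squared-Euclidean distances -> genuine match; large -> impostor pair.
matches = (dist.ravel() < 1.0).sum()
print(matches, "of", len(probe), "descriptors matched")
```

Because the probe descriptors are mild perturbations of the gallery ones, every probe descriptor matches its own gallery counterpart under the threshold.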
The larger the distance between matched feature points, the higher the EER; equivalently, the lower the EER, the higher the recognition rate (RR). HTER combines FRR and FAR by averaging them at a threshold chosen on the training/validation sets and then reporting the result on the test set. The F-score is the harmonic mean of precision and recall. Precision is a measure of quality whereas recall is a measure of quantity, and the two typically trade off against each other. Here the RR value is taken as precision while the inverse of RR is taken as recall.
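The HTER and F-score definitions above reduce to two one-line formulas; a minimal sketch (the helper names are our own):

```python
def hter(far, frr):
    """Half Total Error Rate: the mean of FAR and FRR at a chosen threshold."""
    return (far + frr) / 2.0

def f_score(precision, recall):
    """F-score: the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

print(hter(0.04, 0.06))      # averages the two error rates
print(f_score(0.9, 0.9))     # equal precision and recall give that same value
```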
Further computational details for the F-score can be found in Sabharwal et al. [7]. Computation Time (CT) is the overall processing time in seconds, including feature extraction and matching. We also plot a ROC-style recall–precision curve with recall on the x-axis and precision on the y-axis; recall corresponds to the true positive rate (sensitivity).
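Such a recall–precision curve can be traced by sweeping the match threshold over the scores; a small sketch using scikit-learn on toy genuine/impostor labels and scores (the data is illustrative):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# 1 = genuine pair, 0 = impostor pair; scores are match similarities
# (e.g., a negated KNN distance, so larger means "more likely genuine").
y_true  = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_score = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.5])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
# Plotting recall (x-axis) against precision (y-axis) gives the
# recall-precision curve described above.
print(list(zip(recall.round(2), precision.round(2))))
```

By scikit-learn's convention the curve starts at recall 1 (everything accepted) and ends at the point (recall 0, precision 1).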
4.3 Experimental setup
Facial samples taken before and after therapy are normalized to zero mean and unit variance, followed by histogram equalization (to enhance image contrast) and Wiener filtering (to restore degraded boundaries). Samples are then converted to grayscale and resized to a consistent dimension of 200×200 pixels. Of the total facial samples, 70% were used for training, 15% for testing and 15% for validation. A leave-one-person-out (LOPO) design was employed for the recognition experiments: all images of a single subject were used as probe images while the remaining samples were used for training.
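One possible rendering of this preprocessing chain with NumPy/SciPy; the exact ordering and the 3×3 Wiener window are our assumptions, as the text does not fix these details:

```python
import numpy as np
from scipy.signal import wiener
from scipy.ndimage import zoom

def hist_equalize(img):
    """Histogram equalization of an 8-bit grayscale image (boosts contrast)."""
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    cdf = hist.cumsum() / img.size
    lut = np.round(255 * cdf).astype(np.uint8)
    return lut[img]

def preprocess(img, size=(200, 200)):
    """Normalize, equalize, Wiener-filter and resize one face sample."""
    img = img.astype(np.float64)
    img = (img - img.mean()) / (img.std() + 1e-12)        # zero mean, unit variance
    img8 = np.clip((img - img.min()) / (img.max() - img.min()) * 255,
                   0, 255).astype(np.uint8)
    img8 = hist_equalize(img8)                            # enhance contrast
    smoothed = wiener(img8.astype(np.float64), mysize=3)  # denoise / restore edges
    zy = size[0] / smoothed.shape[0]
    zx = size[1] / smoothed.shape[1]
    return zoom(smoothed, (zy, zx))                       # resize to 200x200

out = preprocess(np.random.default_rng(1).integers(0, 256, (120, 160)).astype(np.uint8))
print(out.shape)
```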
This protocol also guaranteed that samples of a given subject never appeared in both the training and testing sets. For each identification scheme, 95% confidence intervals (CIs) for RR, EER, and HTER were estimated empirically by bootstrap resampling the outcomes of the respective scheme. A CI gives the range expected to contain the true value of the unknown performance parameter; its width depends on the sample size, the confidence level and the variability in the sample [7].
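The percentile bootstrap used for these intervals can be sketched as follows, assuming one binary hit/miss outcome per LOPO trial (the trial outcomes below are synthetic, not results from the dataset):

```python
import numpy as np

def bootstrap_ci(per_trial_hits, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a recognition rate.

    per_trial_hits: 1 if the probe was identified correctly, else 0,
    one entry per LOPO trial.
    """
    rng = np.random.default_rng(seed)
    hits = np.asarray(per_trial_hits, dtype=float)
    stats = [rng.choice(hits, size=hits.size, replace=True).mean()
             for _ in range(n_boot)]
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return hits.mean(), (lo, hi)

rr, (lo, hi) = bootstrap_ci([1] * 90 + [0] * 10)   # 90 hits out of 100 trials
print(f"RR = {rr:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```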
4.3.1 Rhinoplasty (nose) Surgery
Refer to figures 4, 5, and 6.
4.3.2 Otoplasty (ear) Surgery
Table 1
RR, EER and HTER with 95% confidence intervals (CIs) [7] [9] [11] [29] [30] [37] [38]
RECOGNITION TECHNIQUE | RR,% at FAR = 1% | CI-RANGE | EER,% | CI-RANGE | HTER,% | CI-RANGE |
GNN (Singh et al. [2]) | 53.10% | [45.99–59.65] | 12.53% | [10.23–16.43] | 12.44% | [10.04–16.04] |
SURF (Singh et al. [2]) | 49.60% | [42.49–56.15] | 15.00% | [13.00–16.80] | 14.24% | [12.24–16.04] |
Circular-LBP (Singh et al. [2]) | 47.30% | [39.19–52.85] | 16.66% | [14.66–18.46] | 14.06% | [11.46–17.26] |
LFA (Singh et al. [2]) | 37.98% | [35.18–42.28] | 25.06% | [20.46–32.86] | 24.88% | [20.08–32.08] |
FDA/ LDA (Singh et al. [2]) | 32.58% | [26.98–41.18] | 26.42% | [21.82–34.22] | 24.52% | [19.92–32.32] |
PCA (Singh et al. [2]) | 29.70% | [27.40–31.90] | 36.42% | [31.82–40.82] | 32.58% | [26.98–41.18] |
Correlation scheme (Marsico et al. [3]) | 66.86% | [59.93–73.59] | 12.50% | [11.50–13.40] | 10.30% | [7.90–13.90] |
Parts/sparse technique (Aggarwal et al. [4]) | 76.67% | [69.57–83.23] | 7.50% | [6.50–8.40] | 7.12% | [6.12–8.02] |
Granular approach (Bhatt et al. [1]) | 77.50% | [73.75–81.46] | 8.33% | [7.33–9.23] | 7.03% | [5.73–8.63] |
Gabor/LBP + PCA + Euclidean metric (Lakshmiprabha et al. [29]) | 74.40% | [66.57–81.23] | 10.67% | [9.50–10.40] | 9.30% | [8.90–11.90] |
Feature/texture fusion scheme (Ali et al. [6]) | 91.49% | [89.52–93.45] | 5.13% | [3.63–4.11] | 4.62% | [3.72–5.72] |
Evolutionary granular algorithm + SIFT and EUCLBP (Verghis et al. [30]) | 87.30% | [83.75–86.46] | 7.83% | [6.33–8.23] | 7.03% | [6.73–7.63] |
DenseNet with shape, color & space texture classifier (Suri et al. [37]) | 91.75% | [88.52–91.55] | - | - | - | - |
Proposed scheme (Fuzzy + SURF + KNN) | 94.29% | [91.52–95.45] | 3.76% | [1.33–2.81] | 3.57% | [1.67–3.94] |
Table 2
Identification rates (RR, %) for various cosmetic procedures and recognition techniques [7] [9] [11] [29] [30]
Surgical procedure | PCA | FDA | LFA | CLBP | SURF | GNN | Correlation scheme | Parts/sparse technique | Granular approach | Gabor/LBP + PCA + Euclidean metric | GFRAPS | Feature/texture fusion | Proposed |
Dermabrasion | 22.7 | 24.6 | 25.7 | 41.6 | 41.3 | 43.2 | 60.96 | 57.37 | 58.2 | 57.37 | 84.6 | 72.2 | 95.29 |
Browlift | 31.0 | 33.0 | 39.8 | 48.6 | 50.0 | 56.6 | 60.96 | 70.77 | 71.6 | 70.77 | 83.3 | 85.6 | 86.39 |
Otoplasty | 58.9 | 59.2 | 61.0 | 68.3 | 65.1 | 69.9 | 74.26 | 84.07 | 84.9 | 84.07 | 85.4 | 88.9 | 90.56 |
Blepharoplasty | 30.8 | 36.2 | 40.4 | 51.6 | 52.6 | 60.8 | 65.16 | 74.97 | 75.8 | 74.97 | 72.7 | 87.8 | 87.43 |
Rhinoplasty | 25.6 | 25.3 | 35.6 | 44.3 | 50.2 | 53.7 | 58.06 | 67.87 | 68.7 | 67.87 | 70.8 | 82.7 | 84.95 |
Skin peeling | 27.7 | 32.7 | 40.5 | 53.2 | 50.0 | 53.3 | 57.66 | 67.47 | 68.3 | 67.47 | 87.2 | 82.3 | 88.45 |
Rhytidectomy | 21.1 | 21.2 | 21.8 | 40.4 | 39.0 | 41.5 | 45.86 | 55.67 | 56.5 | 55.67 | 65.0 | 70.5 | 73.93 |
Table 3
F-score and computation time (CT) for different identification techniques [7] [9] [11] [29] [30] [39]
Method | F-score (%) | CT (sec) |
GNN (Singh et al. [2]) | 0.31 | 12.17 |
SURF (Singh et al. [2]) | 0.37 | 9.37 |
CLBP (Singh et al. [2]) | 0.55 | 3.87 |
LFA (Singh et al. [2]) | 0.009 | 2.60 |
LDA/FDA (Singh et al. [2]) | 0.22 | 1.95 |
PCA (Singh et al. [2]) | 0.27 | 3.82 |
Correlation method (Marsico et al. [3]) | 0.85 | 13.77 |
Sparse approach (Aggarwal et al. [4]) | 0.78 | 7.36 |
Granular system (Bhatt et al. [1]) | 0.95 | 13.63 |
Gabor/LBP + PCA + Euclidean metric (Lakshmiprabha et al. [29]) | - | 10.74 |
Geometrical-FRAPS (Said et al. [5]) | 0.25 | 11.85 |
Feature/texture fusion (Ali et al. [6]) | 0.99 | 5.28 |
Evolutionary granular algorithm + SIFT and EUCLBP (Verghis et al. [30]) | - | 4.74 |
Proposed scheme (Fuzzy + SURF + KNN) | 1.00 | 4.59 |
Table 4
Advantages/disadvantages of various edge detection methods [18] [19] [20]
Edge detection techniques | Advantages | Disadvantages |
First order derivatives and gradient operators (Roberts, Prewitt and Sobel) | Simple computation for detecting edges and their orientations | Imprecise detection; sensitive to noise |
Second order derivatives and the Laplacian operator | Locates precise edge positions and tests a wider area around the pixel | Cannot find edge orientations |
Zero crossing (second order derivative of the Laplacian) | Detects edges/orientations and has fixed characteristics in all directions | Sensitive to noise |
Gaussian edge detector (Canny) | Good error rate and edge localization; symmetric along the edge and reduces noise by smoothing the image, giving improved detection in noisy conditions | Complex computation; time-consuming |
Fuzzy logic-based edge detection | Detects very few false edges compared to other schemes; the image contour is not distorted, preserving relevant edges; describes the system as a combination of numerics and linguistics; fuzzy algorithms are robust, being insensitive to changing environments or missing rules | Fuzzy logic does not suit every problem |
Table 5
MSE values for various edge detection methods [18] [19] [20]
Type of edge detection | MSE (%) |
Prewitt | 48.13% |
Roberts | 47.93% |
Sobel | 48.63% |
Zero-cross | 46.93% |
LoG (Laplacian of Gaussian) | 46.75% |
Canny | 47.61% |
Fuzzy | 42.66% |
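As a rough illustration of the idea behind these results, the sketch below maps gradient magnitude to a graded edge-membership degree in [0, 1] instead of a hard threshold. It is a toy ramp-membership detector, not the rule base used in the proposed system, and the low/high breakpoints are invented:

```python
import numpy as np
from scipy.ndimage import convolve

def fuzzy_edges(img, low=20.0, high=60.0):
    """Toy fuzzy edge detector: gradient magnitude -> edge membership in [0, 1]
    via a ramp membership function (breakpoints are illustrative)."""
    img = img.astype(np.float64)
    gx = convolve(img, np.array([[-1, 0, 1]] * 3) / 3.0)    # horizontal gradient
    gy = convolve(img, np.array([[-1, 0, 1]] * 3).T / 3.0)  # vertical gradient
    mag = np.hypot(gx, gy)
    return np.clip((mag - low) / (high - low), 0.0, 1.0)    # fuzzy "edge" degree

# A vertical step image: membership ~1 on the step, ~0 in flat regions.
img = np.zeros((16, 16))
img[:, 8:] = 200
deg = fuzzy_edges(img)
print(deg[8, 0], deg[8, 8])
```

Because membership is graded rather than binary, weak gradients receive small but non-zero edge degrees instead of being cut off outright, which is why fewer genuine edges are lost.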
4.4 Result analysis
Figure 4 and Fig. 7 portray various types of surgical treatments [8]. Figure 5 and Fig. 8 illustrate the edges detected by the fuzzy methodology for the various local/global surgical treatments. Figure 6 and Fig. 9 show matched feature points after feature extraction (via SURF) and classification (via KNN). Table 1 reports the RR, EER and HTER metrics of state-of-the-art recognition schemes alongside the proposed scheme; all three metrics are best for the proposed scheme. Table 2 compares RR values for the various local and global cosmetic procedures across existing recognition algorithms and shows that the proposed scheme yields the highest identification rates, with the highest RR (95.29%) for Dermabrasion. After Dermabrasion, the second-best results are obtained for Otoplasty, moderate results for Blepharoplasty, and the least acceptable results for Rhytidectomy. Figure 10 illustrates the RR values graphically. Table 3 reports the remaining evaluation metrics, F-score and CT. RR and EER are dependent (mutually related) metrics, whereas F-score and CT are independent evaluation metrics [7]. The proposed scheme achieves an F-score of one, the best reported to date, with moderate computation time compared to the other recognition schemes. Figure 11 shows F-score and CT graphically for the different identification techniques. Figure 12 depicts the recall–precision curve with precision on the y-axis and recall on the x-axis; the closer the curve lies to the top-right corner (high precision at high recall), the better the test.
Evaluation metrics are best optimized with minimum values of EER, HTER and CT and maximum values of RR and F-score. Running time in image-processing systems quantifies the time taken as a function of the size of the input image. Time complexity is expressed in Big-O notation, where n is the input dimension; Big-O notation characterizes systems in terms of overall scalability and efficiency. Running time can be analyzed with respect to the recognition scheme employed, the type of features extracted (global/local) and the nature of the surgery undertaken. Singh et al. [2] described six face recognition schemes for recognition after medical alterations. One of them, LDA (Linear Discriminant Analysis), operates by minimizing the within-class covariance, thereby maximizing the between-class covariance; its computation involves dense matrices and eigendecomposition, which can be expensive in both time and memory. LDA has O(mnt + t³) complexity, where m is the number of samples, n the number of attributes and t = min(m, n). PCA's time complexity is O(p²n + p³), where p is the number of attributes and n the number of data points; the running time of PCA grows linearly with the size of the database. LFA representations are low-dimensional and sparsely distributed; here the object description is given in terms of statistically derived local attributes. SURF has a theoretical time complexity of O(mn + k), where k is the number of extrema. Local binary patterns are computationally modest, but when the dimension of the features surges exponentially with the number of neighbors, computational complexity rises accordingly. Local facial features tend to improve matching between before/after-therapy samples but adversely affect running time. FDA, LFA, SURF, PCA and CLBP extract local attributes, while GNN extracts both local and global attributes.
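The O(p²n + p³) structure of PCA is visible directly in its two steps, forming the p × p covariance matrix and eigendecomposing it; a small sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 64                           # n samples of dimension p
X = rng.normal(size=(n, p))

Xc = X - X.mean(axis=0)                  # center the data
cov = Xc.T @ Xc / (n - 1)                # covariance: O(p^2 n)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition: O(p^3)

# Project onto the top-10 principal components and reconstruct.
top = eigvecs[:, -10:]                   # eigh returns ascending eigenvalues
Z = Xc @ top
X_rec = Z @ top.T

# The reconstruction error equals the mass of the discarded eigenvalues.
err = np.linalg.norm(Xc - X_rec) ** 2 / (n - 1)
print(err, eigvals[:-10].sum())
```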
GNN gave the highest recognition rate at the expense of increased computation time. Bhatt et al. [1], Marsico et al. [3], Said et al. [5] and Aggarwal et al. [4] presented approaches in which the face is segmented into several regions of interest to isolate salient facial traits. The techniques proposed by Marsico et al. [3], Said et al. [5] and Aggarwal et al. [4] extracted local features, while Bhatt et al. [1] proposed a granular arrangement coupled with a genetic algorithm that extracted both local and global features. Marsico et al. [3] evaluated recognition after cosmetic surgery with Face Recognition against Occlusions and Expression Variations, which divides the face into relevant regions and codes them using a Partitioned Iterated Function System; Face Analysis for Commercial Entities was used to compute the image correlation index, and a Split Face Architecture selected the feature-abstraction method. Lakshmiprabha et al. [29] analyzed Gabor features to implement a multimodal biometric system using periocular and face biometrics; an advantage of this method is that neither a training process nor contextual data from other datasets is required [46], but the RR and EER values obtained were unsatisfactory for surgically altered faces. Said et al. [5] proposed a system that operates in real time with low hardware requirements and evaluation time. Local features result in dense matrices (most elements non-zero), whereas global ones yield sparse matrices (most elements zero). Sparse matrices contain mostly zero elements and can save memory and accelerate processing, thus reducing computation time. Aggarwal et al. [4] proposed a region-based sparse representation: it localizes the primary facial parts, builds a training matrix for each facial part, and finally performs sparse recognition for each.
Sparse spectral methods (such as Laplacian eigenmaps and LLE) perform eigenanalysis of an n × n matrix. Because this matrix is sparse, the time for eigenanalysis shrinks: eigenanalysis of a sparse matrix takes O(pn²) time, where p is the ratio of non-zero elements to the total number of elements, with O(pn²) memory as well [17]. Running time can be saved by designing a data structure that traverses only the non-zero elements. Verghis et al. [30] postulated a multi-objective evolutionary granular system operating on several granules extracted from a facial image; the experimental outcomes outperformed state-of-the-art methods, including a commercial system, for matching medically altered facial samples.
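The saving from sparsity can be illustrated with SciPy's sparse Lanczos eigensolver on a sparse graph Laplacian (a path graph here, purely illustrative): the solver works through matrix–vector products over the roughly 3n non-zero entries instead of all n²:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 500
# Path-graph Laplacian: tridiagonal, so only 3n - 2 of the n^2 entries
# are non-zero; the non-zero fraction p is about 3/n.
main = 2.0 * np.ones(n)
main[0] = main[-1] = 1.0                 # end nodes have degree 1
off = -np.ones(n - 1)
L = sp.diags([main, off, off], [0, 1, -1], format="csr")

# Lanczos iteration uses only the non-zero entries, avoiding a dense
# O(n^3) eigendecomposition.
vals, vecs = eigsh(L, k=4, which="LA")   # 4 largest eigenvalues
print(np.round(vals, 5))                 # all below the Laplacian bound of 4
```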
It is a region-based identification scheme that gave satisfactory outcomes with moderate-to-high computational complexity. Table 4 compares fuzzy-logic-based edge detection with the other edge detection schemes prevailing in the literature. The fuzzy approach detects very few false edges compared to the other schemes, and the image contour is not distorted, thus preserving the relevant edges. Table 5 gives the MSE (Mean Square Error, %) of the various edge detection techniques relative to fuzzy edge detection; of all the techniques, fuzzy edge detection yields the most distinct edges and the minimum MSE values [18] [19] [20]. Tables 4 and 5 clearly state the motivation for choosing fuzzy edge detection as the preprocessing step in the proposed methodology.