1
|
Ulutas et al. [44] 2017
|
• A mirror-invariant binary feature extraction technique is proposed to detect frame duplication and mirroring. Binary features are used to measure the similarity between frames, and the PSNR of candidate frame pairs is used to eliminate false candidates (see the sketch after this entry).
|
• 10 videos
• Duplication
• Mirroring
|
99.98
100
|
99.30
97.34
|
99.35
98.20
|
-
|
• Efficient, with low computational time.
• Dataset is very small.
• No cross-dataset validation.
|
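A minimal sketch of the PSNR-based false-candidate filter described above; the threshold value and helper names are illustrative, not taken from [44]:

```python
import numpy as np

def psnr(frame_a: np.ndarray, frame_b: np.ndarray) -> float:
    """Peak signal-to-noise ratio between two 8-bit frames."""
    mse = np.mean((frame_a.astype(np.float64) - frame_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(255.0 ** 2 / mse)

def filter_candidates(frames, candidate_pairs, threshold=35.0):
    """Keep only candidate (i, j) frame pairs whose PSNR exceeds the threshold."""
    return [(i, j) for i, j in candidate_pairs
            if psnr(frames[i], frames[j]) >= threshold]
```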
2
|
Kingra, Aggarwal et al. [18] 2017
|
• Prediction residual (PR) and optical flow (OF) gradients are used to identify frame-based tampering in videos encoded with the MPEG-2 and H.264 standards (see the sketch after this entry).
|
• DIC Punjab University
Group 1:
• Ins/del/dup
Group 2:
• Ins/del/dup
Group 3:
• Ins/del/dup
|
-
|
-
|
80/83/75
92/83/88
100/96/100
|
Localization Accuracy: 80% (for all)
|
• Performance decreases when the technique is applied to video sequences with high illumination.
• Localization accuracy is poor.
• No evaluation is done on an unseen dataset.
|
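A rough sketch of the prediction-residual side of this idea, assuming frames are 8-bit numpy arrays; the plain frame difference and z-score rule are stand-ins for the PR/OF gradient features of [18]:

```python
import numpy as np

def residual_energy(frames):
    """Mean absolute difference (prediction-residual energy) between consecutive frames."""
    return np.array([
        np.mean(np.abs(frames[k + 1].astype(np.float64) - frames[k].astype(np.float64)))
        for k in range(len(frames) - 1)
    ])

def anomalous_points(frames, z=3.0):
    """Flag residual spikes deviating from the mean by more than z standard deviations."""
    e = residual_energy(frames)
    return np.where(np.abs(e - e.mean()) > z * e.std())[0]
```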
3
|
Ulutas et al. [39] 2018
|
• A novel approach is adopted to detect frame duplication in a video. A BoW model is used to generate visual words and build a dictionary from the SIFT key-points of frames; hierarchical k-means is then applied to generate a large vocabulary tree for quantization. BoW features built from visual-word frequencies are used to detect frame duplication (see the sketch after this entry).
|
• 31 videos
• Duplication (stationary cam)
• Duplication (moving cam)
|
97.94
98.57
|
97.65
99.13
|
96.73
98.17
|
-
|
• Computationally efficient.
• Dataset is small.
• Detects only one type of forgery.
|
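A simplified sketch of the BoW pipeline, substituting a flat MiniBatchKMeans vocabulary for the hierarchical k-means vocabulary tree of [39]; assumes OpenCV >= 4.4 (SIFT) and scikit-learn:

```python
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def bow_features(frames, vocab_size=200):
    """Build a visual vocabulary from SIFT descriptors and return one
    normalized visual-word histogram per frame."""
    sift = cv2.SIFT_create()
    per_frame = []
    for f in frames:
        gray = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
        _, desc = sift.detectAndCompute(gray, None)
        per_frame.append(desc if desc is not None else np.empty((0, 128), np.float32))
    vocab = MiniBatchKMeans(n_clusters=vocab_size).fit(np.vstack(per_frame))
    hists = []
    for desc in per_frame:
        words = vocab.predict(desc) if len(desc) else np.empty(0, int)
        h = np.bincount(words, minlength=vocab_size).astype(np.float64)
        hists.append(h / (h.sum() or 1.0))  # guard against empty frames
    return np.array(hists)
```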
4
|
Zhao, Wang et al. [45] 2018
|
• The similarity between the H-S and S-V histograms of every frame is compared (see the sketch after this entry).
• SURF feature extraction with FLANN matching is used to confirm tampering.
|
• 10 test shots
• Insertion
• Deletion
• Duplication
and Localization
|
98.07
(for all)
|
100
(for all)
|
99.01
(for all)
|
-
|
• Cannot handle videos with scene changes within a shot.
• Poor localization.
• Dataset is very small.
• No cross-dataset validation.
|
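A minimal sketch of the H-S / S-V histogram comparison using OpenCV's correlation metric; the bin counts are illustrative:

```python
import cv2

def hs_sv_similarity(frame_a, frame_b):
    """Correlation between the H-S and S-V 2-D histograms of two BGR frames."""
    hsv_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2HSV)
    hsv_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2HSV)
    sims = []
    for chans, ranges in (([0, 1], [0, 180, 0, 256]),   # H-S plane
                          ([1, 2], [0, 256, 0, 256])):  # S-V plane
        h_a = cv2.calcHist([hsv_a], chans, None, [32, 32], ranges)
        h_b = cv2.calcHist([hsv_b], chans, None, [32, 32], ranges)
        cv2.normalize(h_a, h_a)
        cv2.normalize(h_b, h_b)
        sims.append(cv2.compareHist(h_a, h_b, cv2.HISTCMP_CORREL))
    return sims  # [H-S similarity, S-V similarity]
```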
5
|
Huang, Zhang et al. [46] 2018
|
• Audio forensic detection is performed by wavelet packet decomposition.
• Frame-level features are extracted with a perceptual hash (see the sketch after this entry).
• A quaternion DCT feature is used for fine detection.
|
• 115 videos from SULFA and OV
• 124 self-recorded
• Deletion
• Insertion
|
0.9876
(for all)
|
0.9847
(for all)
|
-
|
-
|
• An audio track is required alongside the video.
• Poor localization.
• No evaluation is done on an unseen dataset.
|
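A generic DCT-based perceptual-hash sketch for frame-level features; the exact hash construction in [46] may differ:

```python
import cv2
import numpy as np

def phash(frame, hash_size=8):
    """DCT-based perceptual hash: low-frequency DCT coefficients of a
    downscaled grayscale frame, thresholded against their median."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, (32, 32)).astype(np.float32)
    dct = cv2.dct(small)[:hash_size, :hash_size]  # keep low frequencies
    return (dct > np.median(dct)).flatten()

def hamming(h1, h2):
    """Bit distance between two hashes; small distance = near-duplicate frames."""
    return int(np.count_nonzero(h1 != h2))
```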
6
|
Jia, Xu et al. [14] 2018
|
• Optical flow (OF) sum consistency is used to find suspected tampering points; fine detection is done through inter-frame correlation (see the sketch after this entry).
• The location of the forgery is determined by the proposed algorithm.
|
• VTL: 55
• SULFA: 36
• DERF: 24 videos
• Duplication
• Computation time: 1.623 µs/pixel
|
0.985
|
0.985
|
-
|
-
|
• Unable to detect tampered videos with largely static scenes.
• Poor localization.
• No evaluation is done on an unseen dataset.
|
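A sketch of the optical-flow sum consistency check using Farneback flow; the neighbor-ratio test is an illustrative stand-in for the detection rule in [14]:

```python
import cv2
import numpy as np

def optical_flow_sums(frames):
    """Sum of optical-flow magnitudes between consecutive grayscale frames."""
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    sums = []
    for prev, nxt in zip(grays, grays[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        sums.append(float(np.sum(np.linalg.norm(flow, axis=2))))
    return np.array(sums)

def suspected_points(frames, ratio=3.0):
    """Flag positions where the flow sum jumps relative to its neighbors."""
    s = optical_flow_sums(frames)
    local = (np.roll(s, 1) + np.roll(s, -1)) / 2.0  # mean of the two neighbors
    return np.where(s[1:-1] > ratio * local[1:-1])[0] + 1
```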
7
|
Fadl et al. [47] 2018
|
• The energy difference between frames is computed to find anomalous points (see the sketch after this entry).
• SNR and spatial/temporal energy are used to detect the forgeries.
|
• 120 videos from SULFA, 28 from and 3 from IVY Lab
• Duplication
• Insertion
• Deletion
|
0.97
0.99
0.97
|
0.99
0.99
0.95
|
-
|
F1:
0.98
0.99
0.96
|
• Cannot detect forgery when frames are deleted from static scenes.
• Localization of the forgery is not performed.
• No evaluation is done on an unseen dataset.
|
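A minimal sketch of the inter-frame SNR series whose abrupt dips mark anomalous points; frames are assumed to be 8-bit numpy arrays, and the decision rule in [47] is more elaborate:

```python
import numpy as np

def frame_snr(ref, test):
    """Signal-to-noise ratio (dB) of frame `test` relative to frame `ref`."""
    ref = ref.astype(np.float64)
    test = test.astype(np.float64)
    noise = np.sum((ref - test) ** 2)
    return 10.0 * np.log10(np.sum(ref ** 2) / noise) if noise else float("inf")

def snr_series(frames):
    """SNR between each pair of consecutive frames; abrupt dips suggest an
    insertion/deletion/duplication boundary."""
    return np.array([frame_snr(a, b) for a, b in zip(frames, frames[1:])])
```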
8
|
Bakas et al. [24] 2018
|
• A deep-learning-based digital forensic technique using a 3D-CNN is proposed to detect inter-frame video forgery.
• A difference layer is introduced in the CNN that targets extraction of temporal information from the videos and aids the detection of inter-frame forgery (see the sketch after this entry).
|
• 9000 videos taken from UCF101
• Duplication
• Insertion
• Deletion
|
-
|
-
|
97% (average)
|
-
|
• Accuracy is low for duplication forgery: duplications of fewer than 20 frames within a single shot are detected, but duplications of more than 20 frames are missed.
• No evaluation is done on an unseen dataset.
|
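A toy PyTorch sketch of a 3D-CNN with a leading temporal-difference layer, in the spirit of [24]; the layer sizes are illustrative, not the published architecture:

```python
import torch
import torch.nn as nn

class DifferenceLayer(nn.Module):
    """Temporal difference of a clip tensor (N, C, T, H, W) -> (N, C, T-1, H, W);
    duplicated or inserted frames leave characteristic residual patterns."""
    def forward(self, x):
        return x[:, :, 1:] - x[:, :, :-1]

class InterFrameForgeryNet(nn.Module):
    """Toy 3D-CNN classifier preceded by the difference layer."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            DifferenceLayer(),
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(16, num_classes),
        )

    def forward(self, clip):  # clip: (N, 3, T, H, W) with T >= 2
        return self.net(clip)
```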
9
|
Long, Basharat et al. [5] 2019
|
• An I3D network finds candidate duplicate sequences at the coarse level.
• A Siamese network (ResNet-152) confirms duplication at the frame level (see the sketch after this entry).
• Duplicated frames are distinguished from original frames by an I3D-based inconsistency detector.
|
• Media Forensics Challenge dataset (MFC18)
• VIRAT: 12 videos
• iPhone-4 videos: 17
• Frame duplication and localization
|
-
|
-
|
-
|
AUC:
84.05 (VIRAT)
81.46 (iPhone)
|
• Detects only one type of forgery.
• Poor localization.
• No evaluation is done on an unseen dataset.
|
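A sketch of the frame-level confirmation step, with a cosine-similarity threshold over precomputed embeddings standing in for the ResNet-152 Siamese network of [5]; `emb` (any frame encoder's output) and the threshold are assumptions:

```python
import numpy as np

def cosine_sim(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def confirm_duplicates(emb, candidates, threshold=0.98):
    """Keep candidate (i, j) frame pairs whose embeddings nearly coincide.
    `emb` is a (num_frames, dim) array produced by some frame encoder."""
    return [(i, j) for i, j in candidates
            if cosine_sim(emb[i], emb[j]) >= threshold]
```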
10
|
Fadl et al. [11] 2020
|
• Features are extracted from the temporal average (TA) of each shot.
• Edge Change Ratio (ECR): the input video sequence is divided into short clips according to the ECR.
• GLCM: statistical texture features are extracted for each TA image (see the sketch after this entry).
• The feature vectors are sorted lexicographically in a matrix; the similarity between adjacent vectors then determines duplication.
|
• 51 videos taken from SULFA, LASIESTA, and IVY Lab
• Duplication
• Duplication with shuffling
• Execution time: 0.011 s per frame
|
0.99
0.95
|
0.98
0.98
|
-
|
-
|
• Localization of forgery has not been made.
• No evaluation is done on unknown dataset
|
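A sketch of the GLCM-feature and lexicographic-sorting steps, assuming uint8 TA images and scikit-image >= 0.19 (graycomatrix/graycoprops); the tolerance value is illustrative:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(ta_image):
    """Statistical texture features of a uint8 temporal-average (TA) image."""
    g = graycomatrix(ta_image, distances=[1], angles=[0, np.pi / 2],
                     levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.concatenate([graycoprops(g, p).ravel() for p in props])

def duplicate_pairs(features, tol=1e-3):
    """Sort feature vectors lexicographically; near-identical neighbors
    indicate duplicated shots."""
    order = np.lexsort(features.T[::-1])  # first feature is the primary key
    f = features[order]
    close = np.all(np.abs(np.diff(f, axis=0)) < tol, axis=1)
    return [(order[k], order[k + 1]) for k in np.where(close)[0]]
```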
11
|
Kharat et al. [12] 2020
|
• The proposed algorithm comprises two steps (see the sketch after this entry).
• First, suspicious frames are identified in the test video using motion vectors.
• Second, SIFT key-points are used as features for comparison, and the Random Sample Consensus (RANSAC) algorithm is finally used to locate duplicated frames.
|
• 20 videos from YouTube movies
• Duplication
|
99.9
|
99.7
|
99.8
|
-
|
• A small dataset is used to test the performance of the model.
• No cross-dataset validation has been performed
|
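A sketch of the SIFT-plus-RANSAC confirmation between two suspicious frames; the Lowe ratio and inlier thresholds are illustrative:

```python
import cv2
import numpy as np

def is_duplicate_pair(frame_a, frame_b, min_inliers=50):
    """Match SIFT key-points between two suspicious frames and verify the
    match geometrically with RANSAC; many inliers => duplicated frame."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY), None)
    kp_b, des_b = sift.detectAndCompute(cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY), None)
    if des_a is None or des_b is None or len(des_b) < 2:
        return False
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    # Lowe ratio test to keep distinctive matches only
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    if len(good) < 4:  # findHomography needs at least 4 correspondences
        return False
    src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return mask is not None and int(mask.sum()) >= min_inliers
```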
12
|
Fadl et al. [25] 2021
|
• An inter-frame forgery detection system is proposed that uses pre-trained 2D-CNNs over spatiotemporal information, with fusion, for automatic deep feature extraction.
• A Gaussian RBF multi-class support vector machine (RBF-MSVM) is used for classification (see the sketch after this entry).
|
• 13135 videos from VIRAT, SULFA, LASIESTA, and IVY Lab
• Insertion
• Deletion
• Duplication
|
-
|
-
|
99.9
98.7
98.5
|
-
|
• No cross-dataset validation has been performed
|
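A minimal scikit-learn sketch of the RBF-MSVM classification stage; the `C` value and scaling step are assumptions, and `X`/`y` stand for the fused CNN features and the forgery labels:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_rbf_msvm(X, y):
    """Multi-class SVM with a Gaussian RBF kernel over fused deep features.
    X: (n_videos, n_features), y: labels such as
    {original, insertion, deletion, duplication}."""
    clf = make_pipeline(StandardScaler(),
                        SVC(kernel="rbf", C=10.0, gamma="scale"))
    return clf.fit(X, y)
```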
13
|
Alsakar et al. [22] 2021
|
• A video forgery detection scheme is developed that represents highly correlated video data by a third-order tensor tube-fiber mode with low computational complexity (see the sketch after this entry).
• Frame insertion and deletion are detected using an arbitrary number of core tensors. The tensor data are orthogonally transformed to achieve further data reduction and to provide good features for tracing the forgery.
|
• 18 videos taken from TRACE library
• Insertion
• Deletion
|
96
92
|
94
90
|
-
|
F1:
95
91
|
• The detection of frame duplication forgery has not been addressed
• No cross-dataset validation has been performed
|
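A loose numpy sketch of a tube-fiber representation with an orthogonal (SVD) reduction; this is one interpretation, not the exact tensor construction of [22]:

```python
import numpy as np

def tube_fiber_matrix(video):
    """Arrange a (T, H, W) grayscale video as a matrix whose columns are
    tube fibers, i.e. the temporal fiber at each pixel location."""
    t, h, w = video.shape
    return video.reshape(t, h * w).astype(np.float64)  # (T, H*W)

def orthogonal_features(video, rank=8):
    """Orthogonally transform the tube-fiber matrix via truncated SVD to get
    a compact temporal signature; discontinuities in the trajectories hint
    at frame insertion or deletion."""
    m = tube_fiber_matrix(video)
    u, s, _ = np.linalg.svd(m - m.mean(axis=1, keepdims=True),
                            full_matrices=False)
    return u[:, :rank] * s[:rank]  # (T, rank) temporal feature trajectories
```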
14
|
Panchal et al. [48] 2023
|
• First, the input videos are categorized as static or dynamic by a key-frame extraction algorithm.
• In the second step, different sets of video quality assessment attributes are chosen for static and dynamic videos using the forward selection method to improve accuracy.
• Finally, multiple linear regression is used to identify outliers among the selected attributes, which determines whether the video is an original, a single-tampered, or a multiple-tampered frame-deletion video (see the sketch after this entry).
|
• 80 original and 50 tampered videos are taken from SULFA, VTD, UCF-101, and TDTVD
• Deletion
|
-
|
-
|
96.25%
|
-
|
• The detection of frame duplication and insertion forgeries have not been addressed
• No cross-dataset validation has been performed
|
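A sketch of the multiple-linear-regression outlier test; the attribute and target names are hypothetical placeholders for the selected VQA attributes:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def regression_outliers(vqa_attrs, target, z=2.5):
    """Fit multiple linear regression over the selected video-quality
    attributes and flag samples whose residuals are outliers; such outliers
    point to (single or multiple) frame deletion."""
    model = LinearRegression().fit(vqa_attrs, target)
    residuals = target - model.predict(vqa_attrs)
    zscores = (residuals - residuals.mean()) / (residuals.std() + 1e-12)
    return np.where(np.abs(zscores) > z)[0]
```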