"Video transformer for deepfake detection with incremental learning" by Sohail Ahmed Khan

MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

Sohail Ahmed Khan, Mohamed Bin Zayed University of Artificial Intelligence
Hang Dai, Mohamed Bin Zayed University of Artificial Intelligence

Document Type

Conference Proceeding

Abstract

Face forgery by deepfake is widespread on the internet, raising severe societal concerns. In this paper, we propose a novel video transformer with incremental learning for detecting deepfake videos. To better align the input face images, we use a 3D face reconstruction method to generate a UV texture map from a single input face image. The aligned face image also provides pose, eye-blink, and mouth-movement information that cannot be perceived in the UV texture image, so we use both the face images and their UV texture maps to extract image features. We present an incremental learning strategy that fine-tunes the proposed model on a smaller amount of data and achieves better deepfake detection performance. Comprehensive experiments on various public deepfake datasets demonstrate that the proposed video transformer with incremental learning achieves state-of-the-art performance in deepfake video detection, with enhanced feature learning from the sequenced data.

DOI

10.1145/3474085.3475332

Publication Date

10-17-2021

Keywords

deepfakes detection
face forensics
transformer
video analysis

Comments

IR Deposit conditions: non-described

Citation Information

S. Khan and H. Dai, "Video transformer for deepfake detection with incremental learning", in Proceedings of the 29th ACM International Conference on Multimedia, New York, 2021, pp. 1821-1828. Available: 10.1145/3474085.3475332.