"SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video Anomaly Detection" by Antonio Barbalau

Article

SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video Anomaly Detection

arXiv

Antonio Barbalau, University of Bucharest, Romania
Radu Tudor Ionescu, University of Bucharest, Romania & SecurifAI, Romania
Mariana-Iuliana Georgescu, University of Bucharest, Romania & SecurifAI, Romania
Jacob Dueholm, Aalborg University, Denmark & Milestone Systems, Denmark
Bharathkumar Ramachandra, Geopipe Inc, United States
Kamal Nasrollahi, Aalborg University, Denmark & Milestone Systems, Denmark
Fahad Shahbaz Khan, Mohamed bin Zayed University of Artificial Intelligence
Thomas B. Moeslund, Aalborg University, Denmark
Mubarak Shah, University of Central Florida, United States

Document Type

Article

Abstract

A self-supervised multi-task learning (SSMTL) framework for video anomaly detection was recently introduced in the literature. Due to its highly accurate results, the method attracted the attention of many researchers. In this work, we revisit the self-supervised multi-task learning framework and propose several updates to the original method. First, we study various detection methods, e.g., ones based on detecting high-motion regions using optical flow or background subtraction, since we believe the currently used pre-trained YOLOv3 is suboptimal: for instance, objects in motion or objects from unknown classes are never detected. Second, we modernize the 3D convolutional backbone by introducing multi-head self-attention modules, inspired by the recent success of vision transformers. As such, we alternatively introduce both 2D and 3D convolutional vision transformer (CvT) blocks. Third, in our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps through knowledge distillation, solving jigsaw puzzles, estimating body pose through knowledge distillation, predicting masked regions (inpainting), and adversarial learning with pseudo-anomalies. We conduct experiments to assess the performance impact of the introduced changes. Upon finding more promising configurations of the framework, dubbed SSMTL++v1 and SSMTL++v2, we extend our preliminary experiments to more data sets, demonstrating that our performance gains are consistent across all data sets. In most cases, our results on Avenue, ShanghaiTech and UBnormal raise the state-of-the-art performance to a new level. Copyright © 2022, The Authors. All rights reserved.
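
To make the first update in the abstract more concrete, the sketch below locates a high-motion region by differencing two consecutive grayscale frames and taking the bounding box of the pixels that changed. This is only an illustration of the general idea behind motion-based region proposals; it is not the detector used in SSMTL++, which the abstract describes as relying on optical flow or background subtraction. The function name `high_motion_box`, the `diff_threshold` parameter and its default value are assumptions made for this example.

```python
import numpy as np


def high_motion_box(prev_frame, next_frame, diff_threshold=25):
    """Return (x0, y0, x1, y1) around pixels that changed between two frames.

    Hypothetical helper for illustration only: plain frame differencing,
    not the optical-flow or background-subtraction pipeline of the paper.
    Returns None when no pixel changes by more than diff_threshold.
    """
    # Cast to a signed type so the subtraction cannot wrap around for uint8 input.
    diff = np.abs(next_frame.astype(np.int16) - prev_frame.astype(np.int16))
    mask = diff > diff_threshold
    if not mask.any():
        return None
    # Indices of the rows and columns that contain at least one moving pixel.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    # The box is half-open: x1 and y1 are one past the last moving pixel.
    return int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1


# Synthetic example: a bright 10x10 patch appears in the second frame.
prev_frame = np.zeros((64, 64), dtype=np.uint8)
next_frame = prev_frame.copy()
next_frame[20:30, 40:50] = 200
box = high_motion_box(prev_frame, next_frame)
print(box)
```

A region found this way does not depend on a fixed set of object classes, which is the limitation of a pre-trained object detector that the abstract points to. A practical system would likely return several boxes rather than one, for example by splitting the mask into connected components.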

DOI

10.48550/arXiv.2207.08003

Publication Date

7-16-2022

Keywords

Convolution,
Distillation,
Learning systems,
Machine learning

Comments

IR Deposit conditions: non-described

Preprint available on arXiv

Citation Information

A. Barbalau et al., "SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video Anomaly Detection," 2022, arXiv:2207.08003.