Decouple and Resolve: Transformer-Based Models for Online Anomaly Detection From Weakly Labeled Videos


TIFS Volume 18 | 2023

By: Tianshan Liu; Cong Zhang; Kin-Man Lam; Jun Kong

As one of the vital topics in intelligent surveillance, weakly supervised online video anomaly detection (WS-OVAD) aims to identify ongoing anomalous events moment by moment in streaming videos, using models trained with only video-level annotations. Previous studies tended to use a unified single-stage framework, which struggled to address the online constraint and the weakly supervised setting simultaneously. To solve this dilemma, we propose a two-stage framework, namely “decouple and resolve” (DAR), which consists of two modules: a temporal proposal producer (TPP) and an online anomaly localizer (OAL). Supervised by video-level binary labels, the TPP module fully exploits hierarchical temporal relations among snippets to generate precise snippet-level pseudo-labels. Then, given the fine-grained supervisory signals produced by TPP, the Transformer-based OAL module is trained to aggregate both useful cues retrieved from historical observations and anticipated future semantics in order to make predictions at the current time step. The TPP and OAL modules are jointly trained to share beneficial knowledge in a multi-task learning paradigm. Extensive experimental results on three public data sets validate the superior performance of the proposed DAR framework over competing methods.
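The decoupling described in the abstract can be pictured with a toy sketch. This is not the authors' TPP or OAL implementation: the function names, the min-max thresholding rule, and the moving-average scorer are all assumptions chosen for illustration, and nothing is trained. The sketch only shows the two constraints being handled separately: an offline step that turns a video-level label into snippet-level pseudo-labels, and an online step whose output at time t depends only on snippets observed up to t.

```python
from typing import List


def pseudo_labels(scores: List[float], video_is_anomalous: bool,
                  ratio: float = 0.5) -> List[int]:
    """Offline step (hypothetical stand-in for a proposal producer).

    A video labeled normal yields all-zero snippet labels. For a video labeled
    anomalous, scores are min-max normalized and snippets whose normalized
    score exceeds `ratio` are marked 1.
    """
    if not video_is_anomalous:
        return [0] * len(scores)
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero for constant scores
    return [1 if (s - lo) / span > ratio else 0 for s in scores]


def online_scores(stream: List[float], history: int = 3) -> List[float]:
    """Online step (hypothetical stand-in for an online localizer).

    The score at step t is the mean of the last `history` observations up to
    and including t, so no future snippet is ever read.
    """
    out = []
    for t in range(len(stream)):
        window = stream[max(0, t - history + 1): t + 1]
        out.append(sum(window) / len(window))
    return out


# Usage: the whole video is available offline, the stream is scored causally.
labels = pseudo_labels([0.1, 0.2, 0.9, 0.8], video_is_anomalous=True)
causal = online_scores([1.0, 2.0, 3.0, 4.0], history=2)
```

In the paper the online module is a Transformer trained on the pseudo-labels and also anticipates future semantics; the moving average here merely makes the causality constraint visible.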

Video anomaly detection (VAD), which aims to meet the increasing security demands of daily life and greatly reduce human effort [1], [2], [3], plays a crucial role in intelligent surveillance systems. In particular, the goal of VAD is to identify the frames or snippets that involve anomalous events, e.g., burglary, robbery, or explosion, in unconstrained videos. Most previous VAD methods were devoted to unsupervised learning paradigms [4], [5], [6], [7], which train a model to memorize normal patterns using only normal samples. Then, in the inference phase, outliers, i.e., abrupt patterns with high reconstruction or prediction errors, are deemed anomalies [8]. However, unsupervised VAD methods tend to produce a high false alarm rate for unseen normal events [9], due to a lack of knowledge about anomalous samples. Recently, to address this limitation, the weakly supervised video anomaly detection (WS-VAD) paradigm [10], [11], [12], [13], [14] has been proposed, which introduces video-level binary labels into the training stage. Compared with the unsupervised pipeline, the WS-VAD paradigm yields a more satisfactory trade-off between detection performance and manual annotation cost.
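The unsupervised paradigm described above can be illustrated with a deliberately tiny sketch. It assumes feature vectors are plain lists of floats, uses the mean of the normal training samples as the "memorized" pattern, and takes the largest training error as the threshold. All names here are hypothetical; the cited methods use learned models such as autoencoders or frame predictors rather than a mean vector.

```python
from typing import List, Sequence


def fit_normal_mean(normal: Sequence[Sequence[float]]) -> List[float]:
    """Memorize normal patterns as the per-dimension mean of normal samples."""
    dim = len(normal[0])
    return [sum(x[i] for x in normal) / len(normal) for i in range(dim)]


def reconstruction_error(x: Sequence[float], mean: Sequence[float]) -> float:
    """Squared distance between a sample and its 'reconstruction' (the mean)."""
    return sum((a - b) ** 2 for a, b in zip(x, mean))


def detect(samples: Sequence[Sequence[float]], mean: Sequence[float],
           threshold: float) -> List[bool]:
    """Flag a sample as anomalous when its error exceeds the threshold."""
    return [reconstruction_error(x, mean) > threshold for x in samples]


# Train on normal samples only, then threshold at the largest training error.
normal_train = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
mean = fit_normal_mean(normal_train)
threshold = max(reconstruction_error(x, mean) for x in normal_train)

# A sample near the mean passes, a far outlier is flagged.
flags = detect([[1.0, 1.0], [5.0, 5.0], [3.0, 1.0]], mean, threshold)
```

In this toy setup the third sample, `[3.0, 1.0]`, is also flagged simply because it lies outside the range seen in training. That is the false-alarm behavior on unseen normal events that the paragraph attributes to unsupervised methods, and it is what video-level labels in WS-VAD are meant to mitigate.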

