On the Relationship Between Universal Adversarial Attacks and Sparse Representations

By: Dana Weitzner; Raja Giryes

The prominent success of neural networks, mainly in computer vision tasks, is increasingly overshadowed by their sensitivity to small, barely perceptible adversarial perturbations of the input image. In this article, we aim to explain this vulnerability through the framework of sparsity. We show the connection between adversarial attacks and sparse representations, with a focus on explaining the universality and transferability of adversarial examples in neural networks. To this end, we show that sparse coding algorithms, including the neural network-based learned iterative shrinkage thresholding algorithm (LISTA), suffer from this sensitivity, and that common attacks on neural networks can be expressed as attacks on the sparse representation of the input image. The phenomenon we observe also holds when the network is agnostic to the sparse representation and dictionary, and thus offers a possible explanation for the universality and transferability of adversarial attacks.
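For readers unfamiliar with sparse coding, the following is a minimal sketch of the classical iterative shrinkage thresholding algorithm (ISTA), the non-learned iteration that LISTA unrolls into a network. It is a toy illustration only, not the authors' code or the attack studied in the article. The function names, the identity dictionary, the value of `lam`, and the example inputs are all assumptions chosen for clarity, and plain Python is used to keep the block dependency-free.

```python
def soft_threshold(v, tau):
    """Elementwise shrinkage: the proximal operator of tau * ||.||_1."""
    return [max(abs(a) - tau, 0.0) * (1.0 if a > 0 else -1.0) for a in v]


def ista(D, x, lam=0.1, n_iter=200):
    """Approximately solve min_z 0.5 * ||x - D z||^2 + lam * ||z||_1.

    D is a list of rows (n x m), x has length n; returns z of length m.
    """
    n, m = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the squared spectral norm of D,
    # so 1 / L is a safe (if conservative) gradient step size.
    L = sum(d * d for row in D for d in row)
    z = [0.0] * m
    for _ in range(n_iter):
        # Gradient of the quadratic term: D^T (D z - x).
        residual = [sum(D[i][j] * z[j] for j in range(m)) - x[i] for i in range(n)]
        grad = [sum(D[i][j] * residual[i] for i in range(n)) for j in range(m)]
        # Gradient step followed by shrinkage.
        z = soft_threshold([z[j] - grad[j] / L for j in range(m)], lam / L)
    return z


def support(z, tol=1e-8):
    """Indices of the nonzero entries of the sparse code."""
    return [j for j, a in enumerate(z) if abs(a) > tol]


# Hypothetical toy inputs: a 2x2 identity dictionary and two nearby signals.
D = [[1.0, 0.0], [0.0, 1.0]]
clean = ista(D, [1.0, 0.09])
perturbed = ista(D, [1.0, 0.12])
print(support(clean), support(perturbed))  # prints [0] [0, 1]
```

In this toy setting, moving one input coordinate by 0.03 across the threshold `lam` adds an atom to the support of the sparse code. This only hints at why the support of a sparse representation can be sensitive to small input changes; the article's analysis concerns structured adversarial perturbations and learned dictionaries, which this sketch does not reproduce.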

Introduction

Deep neural networks are increasingly used in many real-life applications, so their striking sensitivity to small perturbations raises serious concerns. Since this sensitivity was first reported [1], numerous works have been dedicated to devising defense strategies [2][3][4] and to designing ever more sophisticated attacks [5][6][7]. Despite extensive research from a wide range of perspectives, it is still unclear why neural networks are so susceptible to these minute perturbations [7][8][9][10][11][12]. One of the most intriguing properties of adversarial attacks is the existence of image-agnostic (universal) and model-agnostic (transferable) adversarial perturbations. It is not fully understood why some adversarial examples generated for one model may hinder the performance of another, or how some perturbations cause the misclassification of entire datasets.
