Robustness and transferability of universal attacks on compressed models
File(s)
2012.06024v1.pdf (2.45 MB)
Working paper
Author(s)
Matachana, Alberto G
Co, Kenneth T
Muñoz-González, Luis
Martinez, David
Lupu, Emil C
Type
Working Paper
Abstract
Neural network compression methods such as pruning and quantization are very
effective for efficiently deploying Deep Neural Networks (DNNs) on edge devices.
However, DNNs remain vulnerable to adversarial examples: inconspicuous inputs
that are specifically designed to fool these models. In particular, Universal
Adversarial Perturbations (UAPs) are a powerful class of adversarial attacks
which create adversarial perturbations that can generalize across a large set
of inputs. In this work, we analyze the effect of various compression
techniques on UAP attacks, including different forms of pruning and
quantization. We test the robustness of compressed models to white-box and
transfer attacks, comparing them with their uncompressed counterparts on
the CIFAR-10 and SVHN datasets. Our evaluations reveal clear differences between
pruning methods, including Soft Filter and Post-training Pruning. We observe
that UAP transfer attacks between pruned and full models are limited,
suggesting that the systemic vulnerabilities across these models are different.
This finding has practical implications as using different compression
techniques can blunt the effectiveness of black-box transfer attacks. We show
that, in some scenarios, quantization can produce gradient-masking, giving a
false sense of security. Finally, our results suggest that conclusions about
the robustness of compressed models to UAP attacks are application-dependent,
as we observe different phenomena in the two datasets used in our experiments.
Date Issued
2020-12-10
Citation
2020
Publisher
arXiv
Copyright Statement
© 2020 The Author(s). This work is published with CC BY license.
License URL
Sponsor
Commission of the European Communities
Identifier
http://arxiv.org/abs/2012.06024v1
Grant Number
824988
Subjects
cs.LG
cs.AI
cs.CR
Notes
Accepted to AAAI 2021 Workshop: Towards Robust, Secure and Efficient Machine Learning
Publication Status
Published