Jacobian regularization for mitigating universal adversarial perturbations

File: 2104.10459v2.pdf (Adobe PDF, 229.5 kB), embargoed until 06 September 2022
Title: Jacobian regularization for mitigating universal adversarial perturbations
Authors: Co, KT
Rego, DM
Lupu, EC
Item Type: Journal Article
Abstract: Universal Adversarial Perturbations (UAPs) are input perturbations that can fool a neural network on large sets of data. They are a class of attacks that represents a significant threat as they facilitate realistic, practical, and low-cost attacks on neural networks. In this work, we derive upper bounds for the effectiveness of UAPs based on norms of data-dependent Jacobians. We empirically verify that Jacobian regularization greatly increases model robustness to UAPs by up to four times whilst maintaining clean performance. Our theoretical analysis also allows us to formulate a metric for the strength of shared adversarial perturbations between pairs of inputs. We apply this metric to benchmark datasets and show that it is highly correlated with the actual observed robustness. This suggests that realistic and practical universal attacks can be reliably mitigated without sacrificing clean accuracy, which shows promise for the robustness of machine learning systems.
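The regularizer referred to in the abstract penalizes the norm of the network's input-output Jacobian alongside the task loss. As a hypothetical minimal sketch (not the authors' code), consider a linear model y = Wx, for which the input-output Jacobian is exactly W, so the Frobenius-norm penalty ||J||_F^2 reduces to ||W||_F^2 and its effect on training is easy to inspect; the data, the regularization strength `lam`, and the learning rate `lr` below are illustrative assumptions:

```python
import numpy as np

# Toy regression data: 64 samples, 3 features, generated by a known linear map.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
W_true = np.array([[2.0, -1.0, 0.5]])
Y = X @ W_true.T

# For y = W x the input-output Jacobian dy/dx is W itself, so Jacobian
# regularization here coincides with an L2 penalty on the weights.
W = rng.normal(size=(1, 3))
lam, lr = 0.1, 0.05  # illustrative regularization strength and step size
for _ in range(500):
    err = X @ W.T - Y                  # residuals, shape (64, 1)
    grad_mse = 2 * err.T @ X / len(X)  # gradient of the mean squared error
    grad_jac = 2 * lam * W             # gradient of lam * ||J||_F^2
    W -= lr * (grad_mse + grad_jac)

# The penalty shrinks the Jacobian norm below that of the unregularized
# solution (which would recover W_true), trading a small amount of fit
# for a smaller input-output sensitivity.
jac_norm = np.linalg.norm(W)
```

For a deep network the Jacobian is input-dependent, so in practice the penalty is evaluated on (an estimate of) the Jacobian at each training input via automatic differentiation; the linear case above only isolates the shrinking effect the paper's upper bounds rely on.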
Issue Date: 7-Sep-2021
Date of Acceptance: 15-Jun-2021
DOI: 10.1007/978-3-030-86380-7_17
ISSN: 0302-9743
Publisher: Springer International Publishing
Start Page: 202
End Page: 213
Journal / Book Title: Lecture Notes in Computer Science
Volume: 12894
Copyright Statement: © Springer Nature Switzerland AG 2021. The final publication is available at Springer via
Keywords: cs.LG
Artificial Intelligence & Image Processing
Publication Status: Published
Embargo Date: 2022-09-06
Online Publication Date: 2021-09-07
Appears in Collections: Computing