Saturday, May 11, 2019

Adversarial Examples Are Not Bugs, They Are Features

http://gradientscience.org/adv/

Over the past few years, adversarial examples – inputs that have been slightly perturbed by an adversary to cause unintended behavior in machine learning systems – have received significant attention in the machine learning community (for more background, read our introduction to adversarial examples here). There has been much work on training models that are not vulnerable to adversarial examples (in previous posts, we discussed methods for training robust models: part 1, part 2), but all of this research does not really confront the fundamental question: why do these adversarial examples arise in the first place?
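To make the notion of a "slightly perturbed" input concrete, here is a minimal sketch of how such a perturbation can be crafted with the standard fast gradient sign method (FGSM), written in PyTorch. The classifier `model`, the labeled batch `(x, y)`, and the budget `epsilon` are illustrative assumptions and not part of the original post; the linked paper itself is not limited to this particular attack.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Craft adversarial examples with the fast gradient sign method (FGSM).

    model:   a differentiable classifier returning logits
    x:       input batch of images with values in [0, 1]
    y:       true labels for the batch
    epsilon: maximum per-pixel perturbation (L-infinity budget)
    """
    x_adv = x.clone().detach().requires_grad_(True)

    # Loss of the model on the (so far unperturbed) input.
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()

    # Take one step in the direction that increases the loss the most,
    # bounded per pixel by epsilon, then clip back to the valid image range.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0, 1).detach()
```

Even a perturbation this small – invisible to a human viewer for typical values of epsilon – is often enough to flip the model's prediction, which is exactly the puzzle the post goes on to address.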
So far, the prevailing view has been that adversarial examples stem from "quirks" of the models that will eventually disappear once we make enough progress towards better training algorithms and larger-scale data collection. Common explanations attribute adversarial examples either to the high dimensionality of the input space (e.g. here) or to finite-sample phenomena (e.g. here or here).
Today we will discuss our recent work, which provides a new perspective on why adversarial examples arise. However, before we dive into the details, let us first tell you a short story:
