Over the past few years, adversarial examples – or inputs that have
been slightly perturbed by an adversary to cause unintended behavior in
machine learning systems – have received significant attention in the
machine learning community (for more background, read our introduction
to adversarial examples here).
There has been much work on training models that are not vulnerable to adversarial examples (in previous posts, we discussed methods for training robust models: part 1, part 2). However, all of this research leaves a more fundamental question unaddressed: why do adversarial examples arise in the first place?
So far, the prevailing view has been that adversarial examples stem
from “quirks” of the models that will eventually disappear once we make
enough progress towards better training algorithms and larger-scale data
collection. Common explanations attribute adversarial examples either
to the high dimensionality of the input space (e.g. here) or to finite-sample phenomena (e.g. here or here).
Today we will discuss our recent work
that provides a new perspective on why adversarial examples
arise. However, before we dive into the details, let us first tell you a
short story: