“Adversarial Examples Are Not Bugs, They Are Features” by Ilyas et al is pretty interesting.
— Chris Olah (@ch402) May 9, 2019
📝Paper: https://t.co/8B8eqoywzl
💻Blog: https://t.co/eJlJ4L8nhA
Some quick notes below.
(2) The claim which seems to me really remarkable, if it holds up, is that you can use this process to turn robust models into robust datasets, for which normal training creates robust models. pic.twitter.com/WwC4okBahs
— Chris Olah (@ch402) May 9, 2019
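The robust-dataset construction, as I understand it, is roughly: for each training image, synthesize a new image whose representation under a robust model matches the original's, and keep the original label. Below is a minimal sketch of that idea, assuming a pre-trained robust classifier with a hypothetical `features()` method exposing its penultimate-layer activations; names and hyperparameters are illustrative, not the paper's actual code.

```python
import torch

def robustify(x, x_init, robust_model, steps=1000, lr=0.1):
    """Synthesize an image whose robust-model representation matches x's.

    `robust_model.features` is an assumed interface (penultimate-layer
    activations), not an API from the paper's released code.
    """
    target = robust_model.features(x).detach()
    x_r = x_init.clone().requires_grad_(True)
    opt = torch.optim.SGD([x_r], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.norm(robust_model.features(x_r) - target)
        loss.backward()
        opt.step()
        x_r.data.clamp_(0, 1)  # keep the result a valid image
    return x_r.detach()

# The "robust dataset" pairs each synthesized image with the ORIGINAL label;
# the claim is that ordinary (non-adversarial) training on these pairs
# already yields a robust model.
```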
(3) The other interesting result is that you can create a different dataset out of adversarial attacks, where you try to predict the attack's target class.
— Chris Olah (@ch402) May 9, 2019
They find this model - trained on adversarial attacks - generalizes to clean data, which I probably wouldn’t have predicted in advance. pic.twitter.com/ood8JL1dVy
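A rough sketch of how that attack-labelled dataset could be assembled: run a targeted attack toward a class chosen independently of the true label, then keep the attacked image with the target class as its label. The sketch below uses a targeted L2 PGD attack against a hypothetical standard `model`, operates on a single example for clarity, and uses placeholder hyperparameters rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=0.5, step=0.1, iters=100):
    """Targeted L2 PGD: nudge x toward being classified as `target`."""
    x_adv = x.clone()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            # descend the loss, i.e. move toward the target class
            x_adv = x_adv - step * grad / (grad.norm() + 1e-12)
            # project back into the L2 ball of radius eps around x
            delta = x_adv - x
            scale = torch.clamp(eps / (delta.norm() + 1e-12), max=1.0)
            x_adv = (x + delta * scale).clamp(0, 1)
    return x_adv.detach()

# "Mislabelled" dataset: pairs (x_adv, target), with `target` chosen
# independently of the true label. The surprising result is that a classifier
# trained only on these pairs still generalizes to the clean test set.
```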