In new preprint, @danish037, @Mansi1410, Bhuwan & I show it's easy to train attention models that assign negligible weight to ***impermissible tokens*** even while provably relying on them in prediction. Calls into doubt uses of attention in auditing/FAT*. https://t.co/6KDk7VH1Ze (1/n)
— Zachary Lipton (@zacharylipton) September 18, 2019
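The mechanism behind the tweet's claim is, roughly, an auxiliary penalty that pushes attention mass away from the impermissible tokens during training, while a contextual encoder still lets those tokens influence the prediction through neighboring hidden states. Below is a minimal PyTorch sketch of that idea, not the authors' code: the model architecture, the `deceptive_loss` helper, and the penalty form are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the preprint's implementation) of training a
# model whose attention over "impermissible" tokens is penalized toward zero,
# while a bidirectional recurrent encoder can still carry information from those
# tokens into other positions' hidden states.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Bidirectional GRU: information from any token can reach every position,
        # so low attention on a token does not guarantee the model ignores it.
        self.encoder = nn.GRU(embed_dim, embed_dim, batch_first=True, bidirectional=True)
        self.query = nn.Parameter(torch.randn(2 * embed_dim))
        self.out = nn.Linear(2 * embed_dim, num_classes)

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))     # (batch, seq, 2*dim)
        scores = h @ self.query                        # (batch, seq)
        attn = F.softmax(scores, dim=-1)               # attention weights
        context = (attn.unsqueeze(-1) * h).sum(dim=1)  # attention-weighted summary
        return self.out(context), attn

def deceptive_loss(logits, labels, attn, impermissible_mask, lam=1.0):
    """Cross-entropy plus a penalty on attention mass over impermissible tokens."""
    ce = F.cross_entropy(logits, labels)
    mass = (attn * impermissible_mask).sum(dim=-1)  # attention on impermissible tokens
    return ce + lam * mass.mean()

# Toy usage with random data; token ids < 10 stand in for impermissible tokens.
model = AttnClassifier(vocab_size=100)
tokens = torch.randint(0, 100, (8, 20))
labels = torch.randint(0, 2, (8,))
impermissible_mask = (tokens < 10).float()
logits, attn = model(tokens)
loss = deceptive_loss(logits, labels, attn, impermissible_mask)
loss.backward()
```

Trained this way, the attention map looks "clean" over the impermissible tokens, yet the prediction can still depend on them, which is the auditing concern the tweet raises.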