Tweeted By @Thom_Wolf
A very nice paper for those interested in Transformers for NLP (and if you are not, you should be!). It gives insights into why these models improve on high-level metrics like BLEU/ppl. I like that they went the extra mile to get a real comparison to the (interesting) results of @ketran! https://t.co/9qoOWArtlD
— Thomas Wolf (@Thom_Wolf) August 29, 2018