Tweeted By @Thom_Wolf
A fascinating article by @lena_voita if you're interested in understanding what makes MLM models like BERT different from LM models like GPT/GPT-2 (auto-regressive) and MT models.
— Thomas Wolf (@Thom_Wolf) September 16, 2019
And conveyed in such a beautiful blog post, a masterpiece of knowledge sharing! https://t.co/OlmIsv2ewc