Tweeted By @Thom_Wolf
A personal ❤️ in GPT2 paper: the discussion on language-modeling & multi-tasking. We still see papers being rejected bc LM is not considered "useful" (ex Transfo-XL) 😢 If your dataset is diverse enough LM is a proxy for a *huge* multi-task objective over pretty much every NLP task! https://t.co/MeJJVrXdIg
— Thomas Wolf (@Thom_Wolf) February 14, 2019