Tweeted By @Thom_Wolf
I feel like good old LSTMs (or QRNNs) are usually better for text classification indeed.
— Thomas Wolf (@Thom_Wolf) January 28, 2020
Note that for those who want to try text classification with pretrained BERT models, you can take a look at the experimental section of this paper: https://t.co/w4MWPTB79u