Tweeted By @slashML
BERT's success in some benchmark tests may be simply due to the exploitation of spurious statistical cues in the dataset. Without them, it is no better than random. https://t.co/o3i8FtmC8z
— /MachineLearning (@slashML) July 21, 2019