Tweeted by @Tim_Dettmers
I wrote an in-depth analysis of how GPUs would compare against TPUs for training BERT. I conclude that current GPUs are about 30-50% slower than TPUs for this task https://t.co/BG8mIqQWMj
— Tim Dettmers (@Tim_Dettmers) October 17, 2018