Tweeted by @ak92501
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
— AK (@ak92501) January 12, 2021
pdf: https://t.co/0i6fcOuy4X
abs: https://t.co/AUKgennqZy
github: https://t.co/8QD4sJ2ckE
pic.twitter.com/iDPDXj4bRR