Tweeted By @fchollet
New code walkthrough on https://t.co/m6mT8Sa9M5: Switch Transformers, an architecture that makes it possible to increase the representational capacity of a Transformer while keeping its computational cost low. Implemented by Khalid Salama https://t.co/nkMu0QwPuo
— François Chollet (@fchollet) February 17, 2021