You think the RNN era is over? Think again!
— Behnam Neyshabur (@bneyshabur) March 17, 2022
We introduce the "Block-Recurrent Transformer", which applies a transformer layer in a recurrent fashion & beats Transformer-XL on LM tasks.
Paper: https://t.co/j9GOABGCsx
w/ DeLesley Hutchins, Imanol Schlag, @Yuhu_ai_ & @ethansdyer
1/ pic.twitter.com/NDn8YyWoOE
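The tweet's core idea, applying a transformer layer "in a recurrent fashion", can be illustrated with a toy sketch: the sequence is split into fixed-size blocks, and a small recurrent state is carried from block to block, with each block attending over its own tokens plus the incoming state. This is a hypothetical, heavily simplified NumPy illustration (function names, block and state sizes are made up), not the authors' implementation:

```python
import numpy as np

def attention(q, k, v):
    # Plain scaled dot-product attention (single head, no projections).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def block_recurrent_pass(x, block_size=4, state_size=2):
    # Process the sequence block by block, threading a recurrent
    # state (a few learned-vector slots, here just zeros) between blocks.
    n, d = x.shape
    state = np.zeros((state_size, d))
    outputs = []
    for start in range(0, n, block_size):
        block = x[start:start + block_size]
        ctx = np.concatenate([state, block])  # block sees state + itself
        outputs.append(attention(block, ctx, ctx))
        state = attention(state, ctx, ctx)    # state updated for next block
    return np.concatenate(outputs), state

x = np.random.default_rng(0).normal(size=(8, 16))
y, final_state = block_recurrent_pass(x)
```

Because the state has a fixed size, information from arbitrarily distant blocks can flow forward at constant cost per block, which is the RNN-like property the tweet alludes to.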