Tweeted By @ak92501
PermuteFormer: Efficient Relative Position Encoding for Long Sequences
— AK (@ak92501) September 7, 2021
abs: https://t.co/S0bSxCDoc2
Experiments show that PermuteFormer uniformly improves the performance of Performer with almost no computational overhead and outperforms the vanilla Transformer on most tasks.