Memorizing Transformers
— AK (@ak92501) March 18, 2022
abs: https://t.co/T4xmmbcOMI
an extension to the Transformer architecture, called kNN-augmented attention, which dramatically increases the length of the context that a language model can attend to by using k-nearest-neighbor lookup into a large external memory
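The idea in the tweet can be sketched in a few lines: keep a large store of past (key, value) pairs, and for each query retrieve its top-k nearest memory keys and attend over them alongside the local context. This is only a minimal illustrative sketch, not the paper's implementation (the paper combines local and memory attention with a learned gate; here the retrieved entries are simply concatenated with the local ones, and all names are made up for illustration):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def knn_augmented_attention(q, local_k, local_v, mem_k, mem_v, top_k=4):
    """Attend over the local context plus the top_k nearest external-memory entries.

    q:        (d,)   query vector
    local_k:  (L, d) keys in the local context window
    local_v:  (L, d) values in the local context window
    mem_k:    (M, d) keys in the large external memory (M >> L)
    mem_v:    (M, d) values in the external memory
    """
    # k-nearest-neighbor lookup: select the top_k memory keys by dot-product score.
    mem_scores = mem_k @ q                        # (M,)
    idx = np.argsort(mem_scores)[-top_k:]         # indices of the top_k nearest keys
    # Attend jointly over local keys and the retrieved memory keys.
    keys = np.concatenate([local_k, mem_k[idx]], axis=0)
    vals = np.concatenate([local_v, mem_v[idx]], axis=0)
    attn = softmax(keys @ q / np.sqrt(q.shape[0]))
    return attn @ vals                            # (d,) attended output
```

Because only `top_k` memory entries enter the attention, the memory can grow far beyond the local window while per-token cost stays roughly constant (plus the cost of the nearest-neighbor search, which real systems accelerate with an approximate index).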