Synthesizer: Rethinking Self-Attention in Transformer Models
By @ytay017 @dara_bahri @MetzlerDonald
(1) Random alignment matrices perform surprisingly well.
(2) Learning attention weights from query-key interactions is not so important. https://t.co/pGwh83gilU
— ML Review (@ml_review) May 5, 2020
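The tweet's first claim corresponds to the paper's Random Synthesizer variant, where the attention matrix is a directly parameterized (or even fixed random) matrix rather than being computed from query-key dot products. Below is a minimal sketch of that idea in PyTorch; it is an illustration under assumptions, not the authors' implementation, and the names `RandomSynthesizerAttention`, `max_len`, and `trainable` are ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RandomSynthesizerAttention(nn.Module):
    """Sketch of Random Synthesizer attention: the alignment matrix is a
    learned (or fixed random) parameter, independent of the input tokens,
    so no query-key interaction is ever computed."""

    def __init__(self, max_len: int, d_model: int, trainable: bool = True):
        super().__init__()
        # Random alignment logits of shape (max_len, max_len).
        # With trainable=False this stays at its random initialization,
        # matching the "fixed random" variant described in the paper.
        self.attn_logits = nn.Parameter(
            torch.randn(max_len, max_len), requires_grad=trainable
        )
        self.value = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model), with seq_len <= max_len.
        seq_len = x.size(1)
        # Softmax over the key dimension turns logits into attention weights.
        weights = F.softmax(self.attn_logits[:seq_len, :seq_len], dim=-1)
        # Weights are input-independent: the same mixing pattern is applied
        # to every example in the batch.
        return weights @ self.value(x)
```

In contrast to standard self-attention, nothing here depends on pairwise token similarity, which is exactly why the paper's finding that such matrices "perform surprisingly well" is notable.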