Mixtape: breaking the softmax bottleneck that limits expressiveness of neural language models.
— Russ Salakhutdinov (@rsalakhu) December 12, 2019
A network with Mixtape Output Layer is only 35% slower than softmax-based network, while outperforming softmax in perplexity & translation quality #NeurIPS2019 https://t.co/ZxpIqtJomX
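The "softmax bottleneck" refers to the fact that a single softmax over one logit matrix is a low-rank factorization of the (high-rank) conditional word distribution, which limits expressiveness. Mixtape addresses this by mixing in logit space before a single softmax. Below is a minimal numpy sketch of that core idea with a simplified scalar gate; all shapes and names (`W`, `Wg`, `K`) are illustrative assumptions, not the paper's exact architecture (Mixtape uses more efficient vector gating).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d, V, K = 8, 20, 4                 # hidden size, vocab size, mixture components (illustrative)

h = rng.normal(size=(d,))          # hidden state from the network
W = rng.normal(size=(K, d, V))     # per-component output projections (hypothetical names)
Wg = rng.normal(size=(d, K))       # gate projection

# Gate priors: one mixing weight per component (scalar-gate simplification).
pi = softmax(h @ Wg)

# Mix the K component logits BEFORE the softmax (the logit-space mixing idea),
# so only one softmax over the vocabulary is needed.
logits = np.einsum('k,kdv,d->v', pi, W, h)
probs = softmax(logits)
```

Because the mixing happens in logit space rather than probability space, only a single softmax over the vocabulary is computed, which is what keeps the overhead modest relative to a plain softmax layer.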