Tweeted By @ak92501
Primer: Searching for Efficient Transformers for Language Modeling
abs: https://t.co/JM9v7pNoSI
github: https://t.co/xhA7uGyC7H
Experiments show Primer’s gains over Transformer increase as compute scale grows and follow a power law with respect to quality at optimal model sizes. pic.twitter.com/CXq1yYMfUA
— AK (@ak92501) September 20, 2021