Ceshine's Data Science Tweet Collection

by kchonyc on 2021-01-27 (UTC).

what a simple yet effective idea! :)

looking at it from the architectural depth perspective (https://t.co/DM4MvzqFQW by zheng et al.,) the depth (# of layers between a particular input at time t' and output at time t) is now (t-t') x L rather than (t-t') + L. https://t.co/GkkuMH7a5u pic.twitter.com/eXWeqoEKxx
— Kyunghyun Cho (@kchonyc) January 27, 2021
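
To make the depth comparison in the tweet concrete, here is a minimal sketch that evaluates the two expressions it quotes, (t-t') + L and (t-t') x L. The function names and the example values (t = 10, t' = 2, L = 6) are hypothetical and chosen only for illustration. The code takes the two formulas at face value and does not model either architecture.

```python
def additive_depth(t: int, t_prime: int, num_layers: int) -> int:
    """Depth expression (t - t') + L quoted in the tweet."""
    return (t - t_prime) + num_layers


def multiplicative_depth(t: int, t_prime: int, num_layers: int) -> int:
    """Depth expression (t - t') x L quoted in the tweet."""
    return (t - t_prime) * num_layers


# Hypothetical example values: output at t = 10, input at t' = 2, L = 6 layers.
t, t_prime, num_layers = 10, 2, 6
print(additive_depth(t, t_prime, num_layers))        # 8 + 6 = 14
print(multiplicative_depth(t, t_prime, num_layers))  # 8 * 6 = 48
```

With these values the gap between the two expressions is already more than threefold, and it widens as either the time distance or the layer count grows.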

research

by ylecun on 2021-01-27 (UTC).

Want Transformers to perform long chains of reasoning and to remember stuff?
Use Feedback Transformers.
Brought to you by a team from FAIR-Paris. @tesatory https://t.co/dpCzt2yiPf
— Yann LeCun (@ylecun) January 27, 2021

research
