by ylecun on 2022-10-10 (UTC).

MaxViT: combines ConvNet modules and 2 types of self-attention (local within a block, and on a subsampled grid).
Since DETR (hi @alcinos26 !), I've become convinced that combining Conv and attention/dynamic routing was the Right Thing. https://t.co/DNOBsqL54Z

— Yann LeCun (@ylecun) October 10, 2022
cv research
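The design the tweet describes is easy to sketch: each MaxViT stage interleaves an MBConv block with two attention types, block attention (full attention within non-overlapping P×P windows) and grid attention (attention over a P×P uniform grid of tokens spaced a window apart, giving a sparse global mix). Below is a minimal PyTorch sketch of one such block, assuming this reading of the paper; the `MaxViTBlock` module, its dimensions, and the omissions (relative position bias, squeeze-excitation, stochastic depth, the exact MBConv normalization) are my simplifications, not the reference implementation.

```python
import torch
import torch.nn as nn


def block_partition(x, p):
    # (B, H, W, C) -> (B * num_windows, p*p, C): non-overlapping p x p windows
    B, H, W, C = x.shape
    x = x.view(B, H // p, p, W // p, p, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, p * p, C)


def block_unpartition(w, p, B, H, W):
    # inverse of block_partition
    C = w.shape[-1]
    x = w.view(B, H // p, W // p, p, p, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)


def grid_partition(x, g):
    # (B, H, W, C) -> (B * num_groups, g*g, C): each group takes one token
    # from each of the g x g windows (same relative position), i.e. a
    # dilated, global mix of tokens
    B, H, W, C = x.shape
    x = x.view(B, g, H // g, g, W // g, C)
    return x.permute(0, 2, 4, 1, 3, 5).reshape(-1, g * g, C)


def grid_unpartition(w, g, B, H, W):
    # inverse of grid_partition
    C = w.shape[-1]
    x = w.view(B, H // g, W // g, g, g, C)
    return x.permute(0, 3, 1, 4, 2, 5).reshape(B, H, W, C)


class MaxViTBlock(nn.Module):
    """One MaxViT-style block: MBConv, then block (local) attention, then
    grid (sparse global) attention, each with a residual connection."""

    def __init__(self, dim, heads=4, window=7):
        super().__init__()
        self.window = window
        # MBConv-style inverted bottleneck with a depthwise 3x3 in the middle
        self.conv = nn.Sequential(
            nn.Conv2d(dim, 4 * dim, 1), nn.GELU(),
            nn.Conv2d(4 * dim, 4 * dim, 3, padding=1, groups=4 * dim), nn.GELU(),
            nn.Conv2d(4 * dim, dim, 1),
        )
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.block_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.grid_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W), H and W divisible by window
        x = x + self.conv(x)                   # local features via convolution
        B, C, H, W = x.shape
        p = self.window
        t = x.permute(0, 2, 3, 1)              # to (B, H, W, C) for attention

        # block attention: full attention inside each p x p window
        w = self.norm1(block_partition(t, p))
        t = t + block_unpartition(self.block_attn(w, w, w)[0], p, B, H, W)

        # grid attention: full attention across each strided p x p grid group
        g = self.norm2(grid_partition(t, p))
        t = t + grid_unpartition(self.grid_attn(g, g, g)[0], p, B, H, W)

        return t.permute(0, 3, 1, 2)           # back to (B, C, H, W)


# quick shape check
if __name__ == "__main__":
    block = MaxViTBlock(dim=64)
    out = block(torch.randn(2, 64, 28, 28))
    print(out.shape)  # torch.Size([2, 64, 28, 28])
```

The Conv-then-attention ordering is the point of LeCun's comment: the MBConv supplies local inductive bias, and the two attention steps add local and global dynamic routing on top.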
by martin_gorner on 2022-10-11 (UTC).

This looks like the Vision Transformers architecture we have been waiting for: MaxViT https://t.co/WbzgJ50PjB
1/ State of the Art accuracy on ImageNet (no pre-training on huge datasets)
2/ Linear complexity wrt. image size (thanks to a clever attention design) pic.twitter.com/5bW0N7n3s5

— Martin Görner (@martin_gorner) October 11, 2022
research cv
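The linear-complexity claim in point 2 follows directly from the grouping sketched above: full self-attention scores all N² pairs of the N = H·W tokens, whereas block and grid attention only score pairs inside fixed P×P groups, costing O(N·P²). A back-of-the-envelope count (my numbers, not a benchmark from the paper):

```python
# Full self-attention over N = H*W tokens scores every pair -> O(N^2).
# Block and grid attention each work in fixed P*P-token groups -> O(N * P^2),
# linear in the number of tokens once P is fixed (P = 7 in the paper).
P = 7
for H in (56, 112, 224):
    N = H * H  # tokens at this feature-map resolution
    print(f"H={H:3d}  tokens={N:6d}  "
          f"full-attn pairs={N * N:.1e}  block+grid pairs={2 * N * P * P:.1e}")
```

Doubling the image side quadruples the block+grid cost (linear in token count) but multiplies the full-attention cost by sixteen, which is why the design scales to large images.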

Tags

learning tutorial misc nlp rstats gan ethics research dataviz survey python tool security kaggle video thought bayesian humour tensorflow w_code bias dataset pytorch cv tip application javascript forecast swift golang rl jax julia gnn causal diffusion