Scalable Neural Video Representations with Learnable Positional Features
— AK (@_akhaliq) October 14, 2022
abs: https://t.co/r8ofrXivQ0
project page: https://t.co/nHTBO27iOG pic.twitter.com/4I7YRnIzGT
This looks like the Vision Transformers architecture we have been waiting for: MaxViT https://t.co/WbzgJ50PjB
— Martin Görner (@martin_gorner) October 11, 2022
1/ State of the Art accuracy on ImageNet (no pre-training on huge datasets)
2/ Linear complexity wrt. image size (thanks to a clever attention design) pic.twitter.com/5bW0N7n3s5
MaxViT: combines ConvNet modules and 2 types of self attention (local, by block, and on a subsampled grid).
— Yann LeCun (@ylecun) October 10, 2022
Since DETR (hi @alcinos26 !), I've become convinced that combining Conv and attention/dynamic routing was the Right Thing. https://t.co/DNOBsqL54Z
Content-Based Search for Deep Generative Models
— AK (@_akhaliq) October 7, 2022
abs: https://t.co/6yAYV5XNqO
project page: https://t.co/fTF1qDsYyh pic.twitter.com/jxEDXIagrJ
XDoc: Unified Pre-training for Cross-Format Document Understanding
— AK (@_akhaliq) October 7, 2022
abs: https://t.co/bHZuRhbzDP pic.twitter.com/CQudsSPM4e
UniCLIP: Unified Framework for Contrastive Language–Image Pre-training
— AK (@_akhaliq) September 28, 2022
abs: https://t.co/7s5k4jYDIL pic.twitter.com/Ky3g54UFhj
fast-stable-diffusion colabs, +25% speed increase + memory efficient
— AK (@_akhaliq) September 26, 2022
github: https://t.co/QO7UFwqXjf
colab: https://t.co/b8JiLOhYmJ pic.twitter.com/1hKZW7B3IL
I'm impressed with this @fastdotai student project that managed to create a dataset and model from scratch to recognise 800 bird species. https://t.co/qOYGlVckNR pic.twitter.com/xpQSH2A6hD
— Jeremy Howard (@jeremyphoward) September 24, 2022
VToonify: Controllable High-Resolution Portrait Video Style Transfer
— AK (@_akhaliq) September 23, 2022
abs: https://t.co/ooiRHTRVf3
project page: https://t.co/juAGeGxoFK
github: https://t.co/JvFUrT3uDZ pic.twitter.com/IQx2b6DXfd
https://t.co/kGd50oUFR1: an open-source toolkit for animal pose tracking. Works with any type/number of animals. Published in Nature Methods. Seems highly useful both for academia and real-world deployment!
— François Chollet (@fchollet) September 21, 2022
Built with Python/TF/Keras. pic.twitter.com/pkgvYWyCUT
Extremely Simple Activation Shaping for Out-of-Distribution Detection
— AK (@_akhaliq) September 21, 2022
abs: https://t.co/LoGObiufa6 pic.twitter.com/uz6MFOnhOG
Text2Light: Zero-Shot Text-Driven HDR Panorama Generation
— AK (@_akhaliq) September 21, 2022
abs: https://t.co/cwTRtf3AJF
project page: https://t.co/mDRWKQzA7r
github: https://t.co/6jJuhy37aT pic.twitter.com/qMhy5f75zE