Genji-python 6B is now on @huggingface Spaces using @Gradio
— AK (@ak92501) August 13, 2021
link: https://t.co/4vQm6oVkdq https://t.co/UG2xH9cCxK pic.twitter.com/pVegXpPLhu
Mobile-Former: Bridging MobileNet and Transformer
— AK (@ak92501) August 13, 2021
pdf: https://t.co/Ssr6oFOjy7
abs: https://t.co/lctrhRG2Oq
achieves 77.9% top-1 accuracy at 294M FLOPs, gaining 1.3% over MobileNetV3 while saving 17% of computations pic.twitter.com/ChNT9kJtSy
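For readers curious how a block like this fits together, here is a minimal, hedged sketch of the Mobile-Former idea: a MobileNet-style convolutional branch and a small set of global transformer tokens run in parallel and exchange information through a lightweight two-way cross-attention bridge. The class name, dimensions, and exact block structure below are illustrative assumptions, not the paper's design.

```python
# Illustrative sketch only; module names and sizes are assumptions,
# not Mobile-Former's exact architecture.
import torch
import torch.nn as nn

class MobileFormerBlockSketch(nn.Module):
    def __init__(self, channels=64, num_tokens=6, token_dim=192, heads=2):
        super().__init__()
        # Mobile branch: depthwise-separable conv, MobileNet-style
        self.mobile = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.Conv2d(channels, channels, 1),
            nn.ReLU6(inplace=True),
        )
        # Former branch: a standard transformer layer over a few global tokens
        self.former = nn.TransformerEncoderLayer(
            d_model=token_dim, nhead=heads,
            dim_feedforward=2 * token_dim, batch_first=True,
        )
        # Two-way bridge: cross-attention in both directions
        self.proj_in = nn.Linear(channels, token_dim)
        self.proj_out = nn.Linear(token_dim, channels)
        self.mobile_to_former = nn.MultiheadAttention(token_dim, heads, batch_first=True)
        self.former_to_mobile = nn.MultiheadAttention(token_dim, heads, batch_first=True)

    def forward(self, x, tokens):
        b, c, h, w = x.shape
        feat = self.proj_in(x.flatten(2).transpose(1, 2))   # (b, h*w, token_dim)
        # Mobile -> Former: global tokens attend to local features
        tokens = tokens + self.mobile_to_former(tokens, feat, feat)[0]
        tokens = self.former(tokens)
        # Former -> Mobile: features attend to the few global tokens (cheap)
        feat = feat + self.former_to_mobile(feat, tokens, tokens)[0]
        x = self.proj_out(feat).transpose(1, 2).reshape(b, c, h, w)
        return self.mobile(x), tokens

x = torch.randn(1, 64, 56, 56)
tokens = torch.randn(1, 6, 192)   # a handful of global tokens keeps attention cheap
y, t = MobileFormerBlockSketch()(x, tokens)
```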
Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations
— AK (@ak92501) August 13, 2021
pdf: https://t.co/ZPTagL3LzO
abs: https://t.co/TfhdXimw4s
presents a scalable approach for pretraining with over a billion images to improve a production Unified Visual Embedding model pic.twitter.com/bFmlbpD01e
Jurassic-1: Technical Details and Evaluation
— AK (@ak92501) August 12, 2021
pdf: https://t.co/FzG56j1kHw
github: https://t.co/i2RQjyLVU9
Jurassic-1 is a pair of auto-regressive language models recently released by AI21 Labs, consisting of J1-Jumbo, a 178B-parameter model, and J1-Large, a 7B-parameter model pic.twitter.com/MS0DGlypTm
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person
— AK (@ak92501) August 11, 2021
pdf: https://t.co/pm6IWdWScu
abs: https://t.co/d5O2t0x1zi pic.twitter.com/qSiJEwXm62
Making Transformers Solve Compositional Tasks
— AK (@ak92501) August 11, 2021
paper: https://t.co/1qUhBPlTfa
explores the design space of Transformer models, showing that the inductive biases given to the model by several design decisions significantly impact compositional generalization pic.twitter.com/WSMeRNl3SX
Paint Transformer: Feed Forward Neural Painting with Stroke Prediction now on @huggingface Spaces using @Gradio
— AK (@ak92501) August 10, 2021
link: https://t.co/fEL3zjtK2d pic.twitter.com/638qoDIEap
AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer
— AK (@ak92501) August 10, 2021
pdf: https://t.co/VaWfcR2kKh
abs: https://t.co/FtEfKs8Ekk
github: https://t.co/Y3IFKiKZok pic.twitter.com/xO9YwDZ3cK
Video Contrastive Learning with Global Context
— AK (@ak92501) August 6, 2021
pdf: https://t.co/0kkXi2hu3X
abs: https://t.co/se2YGoaoo6
github: https://t.co/Rhn4WJjquM pic.twitter.com/HQlA0zw2O2
Token Shift Transformer for Video Classification
— AK (@ak92501) August 6, 2021
pdf: https://t.co/sdbS5P5RpD
abs: https://t.co/w5UpOnjHjl
github: https://t.co/4KQ0rdfCHN pic.twitter.com/A2RA717L84
Vision Transformer with Progressive Sampling
— AK (@ak92501) August 5, 2021
pdf: https://t.co/UW4Q8YmWPi
abs: https://t.co/usaqUHuSkS
When trained from scratch on ImageNet, PS-ViT achieves 3.8% higher top-1 accuracy than the vanilla ViT, with about 4× fewer parameters and 10× fewer FLOPs pic.twitter.com/ikxFIUuk9M
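The mechanism named in the title can be sketched briefly: instead of committing to a fixed grid of patches, sampling locations are refined iteratively, with each step predicting per-token offsets from the current token content and re-sampling the feature map there. The sketch below illustrates that loop under stated assumptions; the module name, offset scaling, and iteration count are hypothetical, not PS-ViT's exact formulation.

```python
# Illustrative sketch of progressive token sampling; not PS-ViT's exact module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgressiveSamplingSketch(nn.Module):
    def __init__(self, dim=256, grid=7, iters=4):
        super().__init__()
        self.iters = iters
        self.offset = nn.Linear(dim, 2)   # predicts (dx, dy) per token
        # initial sampling points: regular grid in [-1, 1] normalized coords
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, grid), torch.linspace(-1, 1, grid), indexing="ij"
        )
        self.register_buffer("init_pts", torch.stack([xs, ys], -1).view(1, -1, 2))

    def forward(self, feat):                     # feat: (b, dim, H, W)
        b = feat.size(0)
        pts = self.init_pts.expand(b, -1, -1)    # (b, N, 2)
        tokens = None
        for _ in range(self.iters):
            # bilinearly sample features at the current locations
            sampled = F.grid_sample(
                feat, pts.unsqueeze(2), align_corners=True
            ).squeeze(-1).transpose(1, 2)         # (b, N, dim)
            tokens = sampled if tokens is None else tokens + sampled
            # move sampling locations based on what the tokens have seen so far
            # (the 0.25 step scale is an arbitrary choice for this sketch)
            pts = (pts + torch.tanh(self.offset(tokens)) * 0.25).clamp(-1, 1)
        return tokens, pts

tokens, pts = ProgressiveSamplingSketch()(torch.randn(2, 256, 14, 14))
```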
S2-MLPv2: Improved Spatial-Shift MLP Architecture for Vision
— AK (@ak92501) August 3, 2021
paper: https://t.co/SXOcXSwt5n
With 55M parameters, S2-MLPv2-Medium achieves 83.6% top-1 accuracy on the ImageNet-1K benchmark using 224 × 224 images, without self-attention or external training data pic.twitter.com/cqTNTvcsfg
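The core trick named in the title is easy to show: split the channels into four groups and shift each group one pixel in a different direction, so the channel-mixing MLP that follows mixes information across neighboring positions with no self-attention at all. The snippet below is a sketch of the basic S2-MLP shift only; S2-MLPv2's full block (channel expansion, multiple shifted branches, split attention) is not reproduced here.

```python
# Sketch of the basic spatial-shift operation used by S2-MLP-style models.
import torch

def spatial_shift(x):
    """x: (b, c, h, w) -> copy where four channel groups move right/left/down/up."""
    b, c, h, w = x.shape
    g = c // 4
    out = x.clone()
    out[:, 0*g:1*g, :, 1:]  = x[:, 0*g:1*g, :, :-1]   # group 0: shift right
    out[:, 1*g:2*g, :, :-1] = x[:, 1*g:2*g, :, 1:]    # group 1: shift left
    out[:, 2*g:3*g, 1:, :]  = x[:, 2*g:3*g, :-1, :]   # group 2: shift down
    out[:, 3*g:4*g, :-1, :] = x[:, 3*g:4*g, 1:, :]    # group 3: shift up
    return out

y = spatial_shift(torch.randn(1, 8, 4, 4))
```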