Tag - cv

by ak92501 on 2021-06-24 (UTC).

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
pdf: https://t.co/2Kb6zoFwR1
github: https://t.co/Ahe17BEDwf

achieves 81.5% top-1 accuracy on ImageNet without extra large-scale training data (e.g., ImageNet-22k) using only 25M learnable parameters pic.twitter.com/KO1Kx7OnSi
— AK (@ak92501) June 24, 2021

research cv

by ak92501 on 2021-06-21 (UTC).

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
pdf: https://t.co/6isx98PS1M

train ViT models of various sizes on the ImageNet-21k dataset, either match or outperform their counterparts trained on larger, but not publicly available JFT-300M pic.twitter.com/Zoso6tUPPA
— AK (@ak92501) June 21, 2021

research cv

by DeepMind on 2021-06-18 (UTC).

Skip-Sideways is an efficient approximation of backprop for training video networks. It allows training on potentially infinite video sequences, being independent of the video length: https://t.co/OXG2AgKvlb (1/) pic.twitter.com/JxnJk9FzIN
— DeepMind (@DeepMind) June 18, 2021

research cv

by huggingface on 2021-06-17 (UTC).

🤗Transformers v4.7.0 was just released with 🖼️DETR by @facebookai!

DETR is an Object Detection model that can take models from timm by @wightmanr as a backbone.

Contributed by @NielsRogge, try it out: https://t.co/0AOf3P7QaC

v4.7.0 launches with support for PyTorch v1.9.0! pic.twitter.com/1LAYy4cN2W
— Hugging Face (@huggingface) June 17, 2021

tool cv

by ak92501 on 2021-06-17 (UTC).

Watching Too Much Television is Good: Self-Supervised Audio-Visual Representation Learning from Movies and TV Shows
pdf: https://t.co/hBN7Um0RMm
abs: https://t.co/cRXjYZzUhy pic.twitter.com/t56CsoEYPx
— AK (@ak92501) June 17, 2021

research cv

by dustinvtran on 2021-06-16 (UTC).

This is stellar work in collaboration with folks in Zurich. Do neural networks really make overconfident predictions in general? For new architectures, this is not true. It may not even be true consistently for CNNs! @FrancesAnnHubis @XiaohuaZhai @neilhoulsby @MarioLucic_ https://t.co/zkB1zzmzkp
— Dustin Tran (@dustinvtran) June 16, 2021

research cv

by ak92501 on 2021-06-16 (UTC).

Keep CALM and Improve Visual Feature Attribution
pdf: https://t.co/I90yEPbjse
abs: https://t.co/JKIlsnNnGn
github: https://t.co/MP1F8LXvY0

identifies discriminative attributes for image classifiers more accurately than CAM and other visual attribution baselines pic.twitter.com/50DBMoVYai
— AK (@ak92501) June 16, 2021

research cv w_code

by ak92501 on 2021-06-16 (UTC).

BEIT: BERT Pre-Training of Image Transformers
pdf: https://t.co/WiFZIiErLt
abs: https://t.co/Ld2067ltiV

large-size BEIT obtains 86.3% only using ImageNet-1K, even outperforming ViT-L with supervised pre-training on ImageNet-22K (85.2%) pic.twitter.com/abMaWZ1aZ8
— AK (@ak92501) June 16, 2021

research cv

by ak92501 on 2021-06-15 (UTC).

Styleformer: Transformer based Generative Adversarial Networks with Style Vector
pdf: https://t.co/jNVLty3unL
abs: https://t.co/SEK0ko63E7
github: https://t.co/hQanKidsZ8

outperforms GAN-based generative models, including StyleGAN2-ADA, with fewer parameters on CIFAR-10 pic.twitter.com/bs3JmTJtdz
— AK (@ak92501) June 15, 2021

research cv

by ak92501 on 2021-06-14 (UTC).

MlTr: Multi-label Classification with Transformer
pdf: https://t.co/wqvd89AtJq
abs: https://t.co/H2n64N5OGa pic.twitter.com/yXFM5caLMK
— AK (@ak92501) June 14, 2021

research cv

by ak92501 on 2021-06-14 (UTC).

SimSwap: An Efficient Framework For High Fidelity Face Swapping
pdf: https://t.co/l2aWTrM1CP
abs: https://t.co/ZSuDnRLUuF
github: https://t.co/deYKr8rhLY pic.twitter.com/cBuaXySkd9
— AK (@ak92501) June 14, 2021

research cv w_code

by facebookai on 2021-06-11 (UTC).

Today, we’re introducing TextStyleBrush, the first self-supervised AI model that replaces text in existing images of both scenes and handwriting — in one shot — using just a single example word: https://t.co/0QfLraAQvV pic.twitter.com/FNDJxNC20S
— Facebook AI (@facebookai) June 11, 2021

research cv
