ALADIN: All Layer Adaptive Instance Normalization for Fine-grained Style Similarity
— AK (@ak92501) March 18, 2021
pdf: https://t.co/K7homQW2GM
abs: https://t.co/jTp3DS4u7o pic.twitter.com/NUBfHbZZWE
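The "adaptive instance normalization" (AdaIN) in the title refers to the well-known operation of re-normalizing content features with style statistics. A minimal NumPy sketch of that operation (the tensor shapes and test data here are illustrative assumptions, not details from the paper):

```python
import numpy as np

def adain(content, style, eps=1e-5):
    # content, style: (C, H, W) feature maps.
    # Per-channel mean/std over the spatial dimensions.
    c_mu = content.mean(axis=(1, 2), keepdims=True)
    c_sd = content.std(axis=(1, 2), keepdims=True)
    s_mu = style.mean(axis=(1, 2), keepdims=True)
    s_sd = style.std(axis=(1, 2), keepdims=True)
    # Normalize the content features, then re-scale and
    # shift them with the style statistics.
    return s_sd * (content - c_mu) / (c_sd + eps) + s_mu

# Illustrative feature maps (3 channels, 8x8 spatial grid).
rng = np.random.default_rng(0)
c = rng.normal(0.0, 1.0, (3, 8, 8))
s = rng.normal(2.0, 0.5, (3, 8, 8))
out = adain(c, s)
```

After the call, each channel of `out` carries the content's spatial pattern but the style's per-channel mean and standard deviation.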
Is it Enough to Optimize CNN Architectures on ImageNet?
— AK (@ak92501) March 17, 2021
pdf: https://t.co/zC5jToLTto
abs: https://t.co/oIYWstLrIf pic.twitter.com/RrEHLNJEqa
Revisiting ResNets: Improved Training and Scaling Strategies
— AK (@ak92501) March 16, 2021
pdf: https://t.co/Pn5cU2SVkB
abs: https://t.co/icpnuFwmXU pic.twitter.com/bA0E1GWR5z
Facebook AI has built TimeSformer, a new architecture for video understanding. It’s the first based exclusively on the self-attention mechanism used in Transformers. It outperforms the state of the art while being more efficient than 3D ConvNets for video. https://t.co/8mQ2rMgcDo pic.twitter.com/dBpbT3UJRx
— Facebook AI (@facebookai) March 15, 2021
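A minimal sketch of what "exclusively self-attention over video" can look like: patch tokens attend first across time, then across space within each frame. This toy NumPy version (single head, shared weights for both steps, no residuals, LayerNorm, or MLP) is an illustrative assumption, not TimeSformer's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    # x: (tokens, dim) -> single-head scaled dot-product attention.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

# Toy video clip as patch embeddings: T frames, N patches/frame, dim D.
T, N, D = 4, 9, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(T, N, D))
Wq, Wk, Wv = (rng.normal(size=(D, D)) * 0.1 for _ in range(3))

# Temporal step: each spatial position attends across the T frames.
xt = x.transpose(1, 0, 2)                                  # (N, T, D)
xt = np.stack([self_attention(t, Wq, Wk, Wv) for t in xt])
x = xt.transpose(1, 0, 2)                                  # (T, N, D)

# Spatial step: each frame's N patches attend to one another.
x = np.stack([self_attention(f, Wq, Wk, Wv) for f in x])
print(x.shape)  # still (T, N, D)
```

Factoring attention into a temporal pass and a spatial pass keeps the cost per token at O(T + N) attended positions rather than O(T·N) for full joint space-time attention.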
We introduce a new approach for image compression: instead of storing the pixels in an image, we store the weights of an MLP overfitted to the image 🌟 At low bit-rates this can do better than JPEG! https://t.co/ATIyOEiwNX
with @adam_golinski @notmilad @yeewhye @ArnaudDoucet1 pic.twitter.com/5sVBc2oST5
— Emilien Dupont (@emidup) March 10, 2021
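A minimal sketch of the idea: fit a small MLP mapping (x, y) pixel coordinates to intensities, so the "compressed file" is just the trained weights and decoding is a forward pass. The toy image, network sizes, and training loop below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

# Toy 4x4 grayscale "image": a smooth diagonal ramp in [0, 1].
ys, xs = np.mgrid[0:4, 0:4]
image = (xs + ys) / 6.0

# Inputs: normalized (x, y) coordinates; targets: pixel intensities.
coords = np.stack([xs.ravel() / 3.0, ys.ravel() / 3.0], axis=1)  # (16, 2)
targets = image.ravel()[:, None]                                  # (16, 1)

# Tiny 2-layer MLP: 2 -> 16 -> 1 with tanh hidden activation.
rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.3, (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.3, (16, 1)); b2 = np.zeros(1)

lr = 0.05
for step in range(5000):
    # Forward pass.
    h = np.tanh(coords @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - targets
    loss = (err ** 2).mean()

    # Backward pass: manual gradients of the MSE loss.
    d_pred = 2 * err / err.size
    dW2 = h.T @ d_pred;            db2 = d_pred.sum(0)
    d_h = d_pred @ W2.T * (1 - h ** 2)
    dW1 = coords.T @ d_h;          db1 = d_h.sum(0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# Decoding = re-running the MLP on the coordinate grid.
decoded = (np.tanh(coords @ W1 + b1) @ W2 + b2).reshape(4, 4)
print(f"final MSE: {loss:.4f}")
```

The bit-rate is then governed by how many weights the MLP has (and how coarsely they are quantized), independent of the image's pixel count.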
Introducing VISSL (https://t.co/iBEpmCi09R) - a library for reproducible, SOTA self-supervised learning for computer vision! Over 10 methods implemented, 60 pre-trained models, 15 benchmarks, and counting. pic.twitter.com/ZZMd8DpHBD
— PyTorch (@PyTorch) March 9, 2021
First Principles of Computer Vision by Shree Nayar.
In the era of deep learning everything, understanding the fundamentals is more important than ever! https://t.co/wQvdIXC8TM
— Jia-Bin Huang (@jbhuang0604) March 7, 2021
Detectron2Go (D2Go) is a new, state-of-the-art extension for Detectron2 that gives developers an end-to-end pipeline for training and deploying object detection models on mobile devices and hardware. https://t.co/SjtH6PWBQq pic.twitter.com/QeZE4rR74w
— Facebook AI (@facebookai) March 4, 2021
SEER: large-scale SSL for vision.
- pre-train via SSL on 1 billion randomly selected images using SwAV.
- fine-tune on ImageNet: 84.2% top-1 accuracy.
- ft on 10% of ImageNet: 77.9%
- ft on 1% (13 samples per class): 60.5%
- beats SOTA on other CV tasks https://t.co/Q8BT5QvKmf
— Yann LeCun (@ylecun) March 4, 2021
A new blog post I wrote with Ishan Misra.
An overview of Self-Supervised Learning.
We look at recent progress in SSL for vision & explain why SSL is more challenging with high-D continuous signals (images, video) than it is for discrete signals (text). https://t.co/DlL885CPpb
— Yann LeCun (@ylecun) March 4, 2021
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
— AK (@ak92501) March 3, 2021
pdf: https://t.co/fblyzH2hGe
abs: https://t.co/tVgBdfOnQ5
github: https://t.co/NNkF3oheok pic.twitter.com/nnFUaPJaYU
Our newest @kaggle competition is OCR for chemical compounds. Can you apply ML to translate from an image of the chemical structure to the text string that represents it? 4 million chemical structure images to help solve this problem! https://t.co/YpnGWsxczk
— Ben Hamner (@benhamner) March 2, 2021