It seems that everyone wants to publish many papers but no one wants to read others’ papers…
— Jia-Bin Huang (@jbhuang0604) October 12, 2021
(thoughts after attending a poorly attended poster session)
Look at the gender breakdown of who speaks in popular films! pic.twitter.com/Jq5CEerex2
— John B. Holbein (@JohnHolbein1) October 12, 2021
Causal ImageNet: How to discover spurious features in Deep Learning?
— AK (@ak92501) October 12, 2021
abs: https://t.co/NvelZA7Aa4 pic.twitter.com/OhFgarP17L
Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning
— AK (@ak92501) October 12, 2021
abs: https://t.co/2bT9if0KTH
A singleton language model with 245B parameters, achieving SOTA results on natural language processing tasks; trained on a high-quality Chinese corpus of 5TB of text pic.twitter.com/4p6qxxDm0Y
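The tweet highlights zero-shot and few-shot results but does not show the prompt format. The sketch below is a generic illustration of how a few-shot prompt is usually assembled for such a model; the template, field names, and `build_few_shot_prompt` helper are illustrative assumptions, not Yuan 1.0's actual setup.

```python
# Illustrative few-shot prompt construction (not Yuan 1.0's actual template,
# which is not given in the tweet). Each demonstration pairs an input with
# its label; the query is appended with the label left blank for the model
# to complete.
def build_few_shot_prompt(demonstrations, query, input_key="text", label_key="label"):
    lines = []
    for demo in demonstrations:
        lines.append(f"Input: {demo[input_key]}\nLabel: {demo[label_key]}")
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

demos = [
    {"text": "The movie was wonderful.", "label": "positive"},
    {"text": "I wasted two hours of my life.", "label": "negative"},
]
prompt = build_few_shot_prompt(demos, "The plot was gripping from start to finish.")
print(prompt)  # this string is fed to the language model, which completes "Label:"
```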
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
— AK (@ak92501) October 12, 2021
abs: https://t.co/keCbjWjil8 pic.twitter.com/Z2uqatQeIH
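The tweet gives only the title, but a "feature adapter" in this context is typically a small bottleneck MLP applied on top of frozen CLIP features and blended back with a residual ratio. The sketch below is a generic adapter of that kind; the dimensions, ratio, and module names are chosen for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class FeatureAdapter(nn.Module):
    """Generic bottleneck adapter over frozen backbone features (illustrative)."""
    def __init__(self, dim=512, reduction=4, residual_ratio=0.2):
        super().__init__()
        self.residual_ratio = residual_ratio
        self.adapter = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.ReLU(inplace=True),
        )

    def forward(self, features):
        adapted = self.adapter(features)
        # Blend adapted features with the original (frozen) features.
        return self.residual_ratio * adapted + (1 - self.residual_ratio) * features

features = torch.randn(8, 512)          # e.g. frozen CLIP image embeddings
adapted = FeatureAdapter()(features)    # only the adapter's weights are trained
```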
A Few More Examples May Be Worth Billions of Parameters
— AK (@ak92501) October 12, 2021
abs: https://t.co/UaR7ANxkzq
github: https://t.co/U00DR6CMn5 pic.twitter.com/S30zjBTLm3
Has anyone else noticed that walking whilst learning gives better results? Are there any studies of this phenomenon? https://t.co/oORKggBKXI pic.twitter.com/OkZoeAccdU
— Jeremy Howard (@jeremyphoward) October 11, 2021
ViDT: An Efficient and Effective Fully Transformer-based Object Detector
— AK (@ak92501) October 11, 2021
abs: https://t.co/rOytM75swG
obtains the best AP and latency trade-off among existing fully transformer-based object detectors, and achieves 49.2 AP owing to its high scalability for large models pic.twitter.com/CVAzoT3dNh
Token Pooling in Visual Transformers
— AK (@ak92501) October 11, 2021
abs: https://t.co/0Jr3cJqvRe
Applied to DeiT, it achieves the same ImageNet top-1 accuracy using 42% fewer computations pic.twitter.com/lusPaB9Bns
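The tweet does not describe the pooling operator itself. The sketch below shows one simple way to downsample a token sequence between transformer blocks (strided average pooling over the sequence dimension) as a stand-in for the paper's method; the stride and shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def pool_tokens(tokens, stride=2):
    """Downsample a (batch, num_tokens, dim) token sequence by average pooling.

    A generic stand-in for token pooling between transformer blocks; the
    paper's actual pooling operator may differ.
    """
    x = tokens.transpose(1, 2)                   # (batch, dim, num_tokens)
    x = F.avg_pool1d(x, kernel_size=stride, stride=stride)
    return x.transpose(1, 2)                     # (batch, num_tokens // stride, dim)

tokens = torch.randn(4, 196, 384)                # e.g. DeiT-S patch tokens
print(pool_tokens(tokens).shape)                 # torch.Size([4, 98, 384])
```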
Impressive 172 pp. paper from @DeepMind & @GoogleAI: train deep nets on ImageNet with SGD w/o batch norm, and even w/o skip connections if you substitute SGD with a better optimizer such as K-FAC or Shampoo. Shocking! And probably very useful for theory. https://t.co/CoPGVN3Csl pic.twitter.com/hmJ3N2ZMl1
— andrea panizza (@unsorsodicorda) October 10, 2021
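The claim in the tweet is architectural: remove batch norm, and even the skip connections, and the network can still be trained well if the optimizer is strong enough. The sketch below shows only the architectural side, a plain convolutional block with no normalization and no residual branch; the K-FAC/Shampoo optimizers themselves are not sketched here.

```python
import torch
import torch.nn as nn

class PlainConvBlock(nn.Module):
    """A residual-style block with the batch norm and skip connection removed.

    Per the tweet, blocks like this can reportedly reach competitive accuracy
    when SGD is replaced by a stronger optimizer such as K-FAC or Shampoo
    (optimizer not shown here).
    """
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # No BatchNorm, and no `x + ...` residual addition.
        return self.act(self.conv2(self.act(self.conv1(x))))

x = torch.randn(2, 64, 32, 32)
print(PlainConvBlock(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```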
Nice new paper improving image generation and (generative) unsupervised representation learning https://t.co/OYB21fe3sm uses ViT instead of CNN to improve VQGAN into a new "ViT-VQGAN" image patch tokenizer. Tokens are then fed into a GPT for image generation, or linear probing. pic.twitter.com/cepvrv55Jk
— Andrej Karpathy (@karpathy) October 8, 2021
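The tweet describes a two-stage pipeline: a ViT-based VQGAN turns image patches into discrete tokens, and a GPT then models those token sequences for generation (or the features are linearly probed). The sketch below only illustrates the interface between the two stages with a toy nearest-neighbor quantizer; the patch encoder, codebook size, and shapes are placeholders, not the paper's models.

```python
import torch
import torch.nn as nn

class ToyPatchTokenizer(nn.Module):
    """Toy stand-in for a ViT-VQGAN tokenizer: patch embedding + codebook lookup."""
    def __init__(self, patch=16, dim=64, vocab=1024):
        super().__init__()
        self.encode = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # patch embed
        self.codebook = nn.Embedding(vocab, dim)

    def forward(self, images):
        z = self.encode(images).flatten(2).transpose(1, 2)        # (B, N, dim)
        # Nearest codebook entry per patch -> discrete token ids.
        dists = torch.cdist(z, self.codebook.weight[None].expand(z.size(0), -1, -1))
        return dists.argmin(dim=-1)                               # (B, N) token ids

tokenizer = ToyPatchTokenizer()
images = torch.randn(2, 3, 256, 256)
tokens = tokenizer(images)          # (2, 256) discrete patch token ids
# Stage 2 (not shown): an autoregressive GPT is trained on these token
# sequences; sampling from it and decoding the tokens yields new images.
print(tokens.shape)
```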
I am excited to share my latest work: 8-bit optimizers – a replacement for regular optimizers. Faster 🚀, 75% less memory 🪶, same performance📈, no hyperparam tuning needed 🔢. 🧵/n
— Tim Dettmers (@Tim_Dettmers) October 8, 2021
Paper: https://t.co/V5tjOmaWvD
Library: https://t.co/JAvUk9hrmM
Video: https://t.co/TWCNpCtCap pic.twitter.com/qyItEHeB04
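Assuming the linked library is the author's bitsandbytes package, the intended usage is a drop-in swap for a standard PyTorch optimizer, as in the sketch below; the model, learning rate, and training step are placeholders.

```python
import torch
import torch.nn as nn
import bitsandbytes as bnb  # assumed to be the linked library

model = nn.Linear(1024, 1024).cuda()   # placeholder model

# Drop-in swap: replace torch.optim.Adam with its 8-bit counterpart.
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-3)

x = torch.randn(8, 1024, device="cuda")
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```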