Scaled ReLU Matters for Training Vision Transformers
pdf: https://t.co/OcaIFc7Vfl
abs: https://t.co/1k2vrXXsoT pic.twitter.com/glmnhkahWj
— AK (@ak92501) September 9, 2021
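To make the title concrete, here is a minimal sketch of what a "scaled" ReLU could look like inside a convolutional patch-embedding stem for a ViT, assuming the scaling is a learnable per-channel factor applied after the ReLU. The paper's exact formulation may differ, so treat ScaledReLU and ConvStem below as illustrative assumptions rather than the authors' method.

```python
import torch
import torch.nn as nn

class ScaledReLU(nn.Module):
    """ReLU followed by a learnable per-channel scale (illustrative assumption)."""
    def __init__(self, channels: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(channels))

    def forward(self, x):  # x: (B, C, H, W)
        return torch.relu(x) * self.scale.view(1, -1, 1, 1)

class ConvStem(nn.Module):
    """Toy convolutional stem for a ViT that uses the scaled ReLU above."""
    def __init__(self, in_ch: int = 3, embed_dim: int = 64):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, embed_dim, kernel_size=7, stride=4, padding=3)
        self.norm = nn.BatchNorm2d(embed_dim)
        self.act = ScaledReLU(embed_dim)

    def forward(self, x):
        return self.act(self.norm(self.conv(x)))
```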
We're slowly learning more about Google's not-exactly-public efforts in the huge LM space. The highlight here for me was the subfigure on the right: More evidence that we can see discontinuous, qualitatively-important improvements in behavior as we scale. https://t.co/Q68HaF2Vag pic.twitter.com/yGHmwCLvPt
— Prof. Sam Bowman (@sleepinyourhat) September 7, 2021
PermuteFormer: Efficient Relative Position Encoding for Long Sequences
abs: https://t.co/S0bSxCDoc2
experiments show that PermuteFormer uniformly improves the performance of Performer with almost no computational overhead and outperforms the vanilla Transformer on most tasks pic.twitter.com/PcmcRPrOtC
— AK (@ak92501) September 7, 2021
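Roughly, PermuteFormer encodes position by applying a position-dependent permutation to query/key features before Performer-style linear attention, so that query-key dot products depend only on relative offsets. The snippet below is a toy, single-head sketch of that idea under my assumptions (a simple base permutation, no kernel feature map); it is not the paper's implementation.

```python
import torch

def positional_permute(x, base_perm):
    """Apply a position-dependent permutation to the feature dimension.

    x: (seq_len, dim) queries or keys; base_perm: (dim,) index tensor.
    Position i uses the base permutation composed with itself i times, so the
    permuted q_i . k_j only depends on the offset i - j (a rough stand-in for
    relative position encoding).
    """
    seq_len, dim = x.shape
    out = torch.empty_like(x)
    perm = torch.arange(dim)        # identity permutation for position 0
    for i in range(seq_len):
        out[i] = x[i, perm]
        perm = base_perm[perm]      # compose one more step of the base permutation
    return out

# Example with a cyclic shift as the (assumed) base permutation
dim, seq_len = 8, 5
base_perm = torch.roll(torch.arange(dim), 1)
q = positional_permute(torch.randn(seq_len, dim), base_perm)
k = positional_permute(torch.randn(seq_len, dim), base_perm)
```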
Deep Saliency Prior for Reducing Visual Distraction
pdf: https://t.co/ukJEBvPzH1
abs: https://t.co/voqLbUbPZs
project page: https://t.co/h2nvBRGgcD pic.twitter.com/7hNCqA0Ocb
— AK (@ak92501) September 7, 2021
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond
pdf: https://t.co/Fk81CFNMBa
abs: https://t.co/vyIHZQsPLK pic.twitter.com/DvvUQIzRuQ
— AK (@ak92501) September 3, 2021
Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces
pdf: https://t.co/lgcfsZO7dp
abs: https://t.co/ZV5TJyVAp4 pic.twitter.com/9BYyrVpxAC
— AK (@ak92501) September 2, 2021
∞-former: Infinite Memory Transformer
pdf: https://t.co/4B4sxwGEM5
abs: https://t.co/sBtFRIv7rc
proposes the ∞-former, which extends the vanilla transformer with an unbounded long-term memory pic.twitter.com/0oFWpixF3b
— AK (@ak92501) September 2, 2021
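The actual ∞-former keeps its long-term memory as a continuous signal attended to via basis functions, which is what lets the context grow without bound at fixed cost. The snippet below is only a crude stand-in for that idea: compress an arbitrarily long history into a fixed number of slots and attend over those slots together with the local context. It is not the paper's mechanism.

```python
import torch
import torch.nn.functional as F

def compress_to_fixed_memory(h, num_slots: int = 32):
    """Crudely compress an arbitrarily long sequence of hidden states (T, D)
    into a fixed number of memory slots via average pooling over time.
    (The real ∞-former fits a continuous signal with basis functions; the
    pooling here is only a stand-in assumption.)"""
    pooled = F.adaptive_avg_pool1d(h.t().unsqueeze(0), num_slots)  # (1, D, num_slots)
    return pooled.squeeze(0).t()                                   # (num_slots, D)

def attend_with_long_term_memory(q, local_kv, memory):
    """Single-head attention over local context plus compressed memory.
    For brevity, keys and values share the same tensors."""
    kv = torch.cat([memory, local_kv], dim=0)          # (M + T_local, D)
    scores = q @ kv.t() / kv.shape[-1] ** 0.5           # (T_q, M + T_local)
    return torch.softmax(scores, dim=-1) @ kv           # (T_q, D)
```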
WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU
pdf: https://t.co/6z9WZacJyk
abs: https://t.co/wUX0MThFGc pic.twitter.com/vR6V68Zdgv
— AK (@ak92501) September 1, 2021
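The gist, as I understand it, is that both the environment simulation and the policy rollouts stay on the GPU as batched tensors, so there are no per-step CPU-GPU copies. The toy snippet below only illustrates that pattern with a made-up vectorized environment; it is not WarpDrive's API or simulator.

```python
import torch

# Toy illustration of the "everything stays on the GPU" idea: state, actions,
# and rewards live as batched device tensors, so rollouts for many
# environments and agents proceed without CPU<->GPU transfers.
device = "cuda" if torch.cuda.is_available() else "cpu"
num_envs, num_agents = 1024, 4

state = torch.zeros(num_envs, num_agents, 2, device=device)  # e.g. 2-D positions

def step(state, actions):
    """Batched environment step: move each agent, reward staying near the origin."""
    state = state + actions               # (num_envs, num_agents, 2)
    reward = -state.norm(dim=-1)          # (num_envs, num_agents)
    return state, reward

actions = torch.randn(num_envs, num_agents, 2, device=device) * 0.1
state, reward = step(state, actions)
```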
AI is influencing the world, and right now most of the actors that have power over AI are in the private sector. This is probably not optimal. Here's some research from @jesswhittles and me on how to change that. https://t.co/vljhYFXyGC
— Jack Clark (@jackclarkSF) August 31, 2021
SummerTime: Text Summarization Toolkit for Non-experts
pdf: https://t.co/RCgAVCFPLx
abs: https://t.co/xuGph3jsdB
github: https://t.co/OYTgNX6u0I pic.twitter.com/txTS9Lg0Cq
— AK (@ak92501) August 31, 2021
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
pdf: https://t.co/CHAhM6eO0B
abs: https://t.co/Hav0uyFLpH
can convert small language models into better few-shot learners without any prompt engineering pic.twitter.com/qlTIbLvolc
— AK (@ak92501) August 31, 2021
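"Differentiable prompt" broadly means the prompt lives in embedding space and is learned by gradient descent rather than hand-written. Below is a generic soft-prompt sketch to make that concrete; the paper's actual recipe may differ in details (for instance, how label words are parameterized), so this is an illustration, not the authors' method.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Trainable prompt embeddings prepended to the model's input embeddings.
    A generic soft-prompt sketch, not the paper's exact parameterization."""
    def __init__(self, prompt_len: int, embed_dim: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, token_embeds):                      # (B, T, D)
        batch = token_embeds.shape[0]
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, token_embeds], dim=1)   # (B, prompt_len + T, D)
```

During few-shot training, only the prompt parameters (and optionally the backbone) receive gradients, which is what removes the need for manual prompt engineering.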
Hire-MLP: Vision MLP via Hierarchical Rearrangement
pdf: https://t.co/4Vjf9gQZuv
abs: https://t.co/9umSzoO621
achieves 83.4% top-1 accuracy on ImageNet, surpassing previous Transformer-based and MLP-based models with a better accuracy/throughput trade-off pic.twitter.com/Ftsgaaipva
— AK (@ak92501) August 31, 2021
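One way to read "hierarchical rearrangement": tokens are regrouped (within local regions, and across regions at other stages) so that a plain channel MLP can mix spatial information. The block below sketches only the within-region half under that assumption; it is not the paper's exact module.

```python
import torch
import torch.nn as nn

class RegionMixMLP(nn.Module):
    """Toy "rearrange then MLP" block: tokens inside each r x r region are
    folded into the channel dimension, mixed with a small MLP, then unfolded.
    Loosely inspired by the inner-region rearrangement idea; illustrative only."""
    def __init__(self, dim: int, region: int = 2):
        super().__init__()
        self.region = region
        self.mix = nn.Sequential(
            nn.Linear(dim * region * region, dim * region * region),
            nn.GELU(),
        )

    def forward(self, x):                      # x: (B, H, W, C), H and W divisible by region
        B, H, W, C = x.shape
        r = self.region
        x = x.reshape(B, H // r, r, W // r, r, C).permute(0, 1, 3, 2, 4, 5)
        x = x.reshape(B, H // r, W // r, r * r * C)   # fold region tokens into channels
        x = self.mix(x)
        x = x.reshape(B, H // r, W // r, r, r, C).permute(0, 1, 3, 2, 4, 5)
        return x.reshape(B, H, W, C)                  # unfold back to the token grid
```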