Tweeted By @_akhaliq
Global Context Vision Transformers
— AK (@_akhaliq) June 22, 2022
abs: https://t.co/d6go0yv7fu
github: https://t.co/rUYFs09ReC
On the ImageNet-1K dataset for classification, the tiny, small and base variants of GC ViT with 28M, 51M and 90M parameters achieve 83.2%, 83.9% and 84.4% Top-1 accuracy, respectively. pic.twitter.com/XKoJAvUcYm