Tweeted By @ankurhandos
Emerging Properties in Self-Supervised Vision Transformers https://t.co/UbIaCJLi9Q
— Ankur Handa (@ankurhandos) April 30, 2021
object segmentation emerges out of ViT networks trained with self-supervision. This information is directly accessible in the self-attention modules of the last block. pic.twitter.com/KCCpIg87z9