Tweeted By @GoogleAI
Introducing a minimalist and effective approach for vision language model pre-training that learns a single representation from both visual and language inputs and efficiently leverages scaled datasets to achieve state-of-the-art performance. Learn more ↓ https://t.co/U9DY2CZbqR
— Google AI (@GoogleAI) October 15, 2021