Tweeted By @evolvingstuff
VisualBERT: A Simple and Performant Baseline for Vision and Language
— Thomas Lahore (@evolvingstuff) August 12, 2019
"VisualBERT...is even sensitive to syntactic relationships, tracking, for example, associations between verbs and image regions corresponding to their arguments" https://t.co/Wm2RqBqvPI pic.twitter.com/HCfV8QOBtA