Tweeted by @karpathy on 2021-11-13 (UTC)
research cv

Great paper and thread!
- 😮that super simple MSE loss works vs. BEiT-style dVAE (multi-modal) cross-entropy
- <3 efficiency of asymmetric encoder/decoder
- 👏detailed training recipes
- +1 v curious about dataset size scaling
- bit of lack of commentary on test-time protocol https://t.co/MQFAvrqBvr
— Andrej Karpathy (@karpathy) November 13, 2021
