Tweeted By @rasbt
NLP & vision transformers are very expensive to train, so we now focus more on fine-tuning. But should we select by model size, data size, or upstream accuracy? Turns out that (maybe intuitively) upstream accuracy is the best predictor for downstream accuracy: https://t.co/mR8Y8HNmAe pic.twitter.com/o3JD20b750
— Sebastian Raschka (@rasbt) February 10, 2022