by karpathy on 2021-08-27 (UTC).

Badly tuned LR decay schedules are an excellent way to silently shoot yourself in the foot. Models can often look like they are converging but it's just LR getting too low too fast. FixedLR (+optional warmup) with 1 manual decay of 10X on plateau is a safe strong baseline.

— Andrej Karpathy (@karpathy) August 27, 2021
tip
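
A minimal PyTorch sketch of that baseline, for concreteness: a fixed LR with an optional linear warmup, then a single hand-applied 10x cut. The model, data, and specific values (`base_lr`, `warmup_steps`) are illustrative placeholders, not from the tweet.

```python
import torch

model = torch.nn.Linear(10, 1)
loss_fn = torch.nn.MSELoss()
base_lr = 3e-4  # placeholder value
optimizer = torch.optim.Adam(model.parameters(), lr=base_lr)

# Optional warmup: ramp the LR linearly up to base_lr over the first
# warmup_steps optimizer steps; after that the multiplier stays at 1.0,
# i.e. the LR is simply fixed.
warmup_steps = 100
warmup = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda step: min(1.0, (step + 1) / warmup_steps))

for step in range(1000):
    x, y = torch.randn(32, 10), torch.randn(32, 1)  # stand-in batch
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step < warmup_steps:
        warmup.step()

# The one manual decay: when the validation curve visibly plateaus,
# cut the LR by 10x once, by hand.
for group in optimizer.param_groups:
    group['lr'] = base_lr / 10
```

If eyeballing the validation curve is impractical, `torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1)` automates the same 10x cut.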
by egrefen on 2021-08-28 (UTC).

What’s more, if your training is in @PyTorch, you can rather easily add this behaviour with minimal changes to your codebase, using @higherpytorch. https://t.co/U5dFLBXTHZ

— Edward Grefenstette 🇪🇺 (@egrefen) August 28, 2021
research w_code
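
The linked code isn't reproduced here, but the library behind @higherpytorch is `higher` (facebookresearch/higher), which wraps a stock optimizer in a differentiable one and lets you override its hyperparameters with tensors. The sketch below is one plausible reading of the tweet, assuming the `override` argument of `higher.innerloop_ctx`: the learning rate becomes a tensor that is itself trained by backpropagating a validation loss through a few unrolled inner steps. All models, data, and step counts are placeholders.

```python
import torch
import higher  # pip install higher

model = torch.nn.Linear(10, 1)
loss_fn = torch.nn.MSELoss()

# The learning rate is now a learnable tensor, updated by a meta-optimizer.
lr = torch.tensor(1e-2, requires_grad=True)
meta_opt = torch.optim.SGD([lr], lr=1e-3)
inner_opt = torch.optim.SGD(model.parameters(), lr=0.1)  # lr overridden below

x_tr, y_tr = torch.randn(32, 10), torch.randn(32, 1)  # stand-in train batch
x_va, y_va = torch.randn(32, 10), torch.randn(32, 1)  # stand-in val batch

for meta_step in range(10):
    meta_opt.zero_grad()
    # innerloop_ctx yields a functional copy of the model and a
    # differentiable optimizer; `override` routes gradients through lr.
    with higher.innerloop_ctx(model, inner_opt,
                              override={'lr': [lr]}) as (fmodel, diffopt):
        for _ in range(5):  # unroll a few inner training steps
            diffopt.step(loss_fn(fmodel(x_tr), y_tr))
        val_loss = loss_fn(fmodel(x_va), y_va)
        # The gradient flows back through the unrolled steps into lr.
        val_loss.backward()
    meta_opt.step()
```

Note that only `lr` is updated here; the base model's own weights are left untouched, since the inner updates happen on the functional copy.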
