Rethinking “Batch” in BatchNorm
— AK (@ak92501) May 18, 2021
pdf: https://t.co/ZfLGqlGxPv
abs: https://t.co/oJArBeNN90 pic.twitter.com/TgqI1HkQv7
This paper gives me anxiety. BatchNorm is the most deviously subtly complex layer in deep learning. Many issues (silently) root cause to it. Yet it is ubiquitous because it works well (it multi-task helps optimization/regularization) and can be fused to affines at inference time. https://t.co/3EC2Abm8Ry
— Andrej Karpathy (@karpathy) May 18, 2021
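The fusion Karpathy alludes to works because inference-mode BatchNorm, with its running statistics frozen, is just a per-channel affine transform, so it folds into the preceding linear/conv weights. A minimal numpy sketch of that folding for a linear layer (function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def fuse_linear_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-mode BatchNorm into the preceding linear layer.

    BN(y) = gamma * (y - mean) / sqrt(var + eps) + beta applied to
    y = W @ x + b is itself affine, so the two collapse into one
    linear layer with rescaled weights and a shifted bias.
    """
    scale = gamma / np.sqrt(var + eps)      # per-output-channel scale
    W_fused = W * scale[:, None]            # rescale each output row
    b_fused = (b - mean) * scale + beta     # absorb shift into the bias
    return W_fused, b_fused

# Sanity check: the fused layer matches linear -> BN at inference time.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3)); b = rng.normal(size=4)
gamma = rng.normal(size=4); beta = rng.normal(size=4)
mean = rng.normal(size=4); var = rng.uniform(0.5, 2.0, size=4)

x = rng.normal(size=3)
y_ref = gamma * ((W @ x + b) - mean) / np.sqrt(var + 1e-5) + beta
W_f, b_f = fuse_linear_bn(W, b, gamma, beta, mean, var)
assert np.allclose(y_ref, W_f @ x + b_f)
```

The same algebra applies per channel to conv layers; frameworks exploit it so the BN layer disappears entirely from the inference graph. Note this only holds when BN uses fixed running statistics — in training mode the normalization depends on the batch, which is exactly the subtlety the paper examines.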