The quest for optimal normalization in neural nets continues. SwitchNorm: add BatchNorm + InstanceNorm + GroupNorm with a learnable blend at each layer https://t.co/9hOQXnkk8T fun plots; + code https://t.co/34r96BStCS
— Andrej Karpathy (@karpathy) July 7, 2018