Tweeted By @hardmaru
These two figures basically list all of the methods people have tried so far to make stochastic gradient descent work
— hardmaru (@hardmaru) October 10, 2020
“Descending through a Crowded Valley — Benchmarking Deep Learning Optimizers”https://t.co/i05IpY0YcB pic.twitter.com/Re1fMMTabm