Tweeted by @hillbig
They proved that training a multi-layer NN with ReLU activations can reach the global optimum in polynomial time if the inputs are non-degenerate and the network is over-parameterized, solving one of the most important unsolved problems in DNNs https://t.co/emWfXo1QO7
— Daisuke Okanohara (@hillbig) November 21, 2018