by hillbig on 2018-07-02 (UTC).

While DropOut has been thought to regularize the model by preventing co-adaptation among hidden neurons, DropOut actually keeps gradients flowing even when the activation functions are saturated, and helps the optimization converge to flat minima. https://t.co/YEH33wWqRZ

— Daisuke Okanohara (@hillbig) July 2, 2018
research
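For reference, a minimal NumPy sketch of inverted dropout, the standard formulation the tweet is describing. The function names and the rate p_drop=0.5 are illustrative choices, not taken from the linked paper:

```python
import numpy as np

def dropout_forward(x, p_drop=0.5, train=True, rng=np.random.default_rng(0)):
    """Inverted dropout: randomly zero units and rescale the survivors
    so the expected activation matches the test-time (no-dropout) value."""
    if not train or p_drop == 0.0:
        return x, None
    mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)
    return x * mask, mask

def dropout_backward(dout, mask):
    """Gradients flow only through the units that were kept this step;
    dropped units contribute zero gradient for this mini-batch."""
    return dout * mask

# Illustration: with mostly saturated sigmoid activations, each mini-batch
# still samples a different thinned sub-network, which is one way to picture
# the "dropout helps gradient flow / flat minima" argument in the tweet.
x = np.array([[4.0, -3.0, 0.1]])
h = 1.0 / (1.0 + np.exp(-x))              # sigmoid activations, mostly saturated
h_drop, mask = dropout_forward(h, p_drop=0.5)
print(h_drop)
```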
by dennybritz on 2018-07-02 (UTC).

Dropout has been a standard technique for years, but we’re still finding new ways to interpret it.

I tend to think of new techniques/architectures as an advancement in either regularization or optimization/gradient flow, but sometimes it’s difficult to tell which one it is. https://t.co/74lt0VtExG

— Denny Britz (@dennybritz) July 2, 2018
research
