Let's get started.
— David Page (@dcpage3) September 11, 2019
What does the original paper have to say? pic.twitter.com/SUehxk58Jt
Recent papers have studied the Hessian of the loss for deep nets experimentally:
(@leventsagun et al) https://t.co/JNJKeqZyvZ, https://t.co/Wbk3sSbIbr
(Papyan) https://t.co/l4QcB85nir.
(@_ghorbani et al) https://t.co/VUxknF5QkM compare what happens with and without BN.
— David Page (@dcpage3) September 11, 2019
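For readers who want a feel for the kind of measurement these papers make, here is a minimal sketch (my own toy setup, not taken from any of the papers above) of estimating the top eigenvalue of the loss Hessian with power iteration on Hessian-vector products in PyTorch:

```python
import torch

# Hypothetical toy model and data, stand-ins for a real net and dataset.
torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.Tanh(),
                            torch.nn.Linear(32, 1))
x, y = torch.randn(256, 10), torch.randn(256, 1)

loss = torch.nn.functional.mse_loss(model(x), y)
params = list(model.parameters())
# create_graph=True keeps the graph so the gradient can be differentiated again.
grads = torch.autograd.grad(loss, params, create_graph=True)

# Power iteration on Hessian-vector products: v <- Hv / ||Hv||.
v = [torch.randn_like(p) for p in params]
for _ in range(50):
    hv = torch.autograd.grad(grads, params, grad_outputs=v, retain_graph=True)
    norm = torch.sqrt(sum((h * h).sum() for h in hv))
    v = [h / norm for h in hv]

# The Rayleigh quotient v^T H v of the converged unit vector approximates
# the Hessian eigenvalue of largest magnitude.
hv = torch.autograd.grad(grads, params, grad_outputs=v, retain_graph=True)
lam_max = sum((h * u).sum() for h, u in zip(hv, v))
print(f"estimated top Hessian eigenvalue: {lam_max.item():.4f}")
```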
So we have given precise experimental meaning to the statement that 'internal covariate shift' limits LRs and that BN works by preventing this...
— David Page (@dcpage3) September 11, 2019
...matching the intuition of the original paper!
More details here: https://t.co/09Li90gCFQ
— David Page (@dcpage3) September 11, 2019
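As a self-contained illustration of the learning-rate claim (a toy experiment of my own, not from the thread): train the same small net at a deliberately large learning rate, with and without BatchNorm. The no-BN run typically blows up while the BN run stays stable:

```python
import torch

def make_net(use_bn: bool) -> torch.nn.Sequential:
    layers = [torch.nn.Linear(10, 64)]
    if use_bn:
        layers.append(torch.nn.BatchNorm1d(64))
    layers += [torch.nn.ReLU(), torch.nn.Linear(64, 1)]
    return torch.nn.Sequential(*layers)

torch.manual_seed(0)
x, y = torch.randn(512, 10), torch.randn(512, 1)

for use_bn in (False, True):
    torch.manual_seed(1)  # identical weight init for a fair comparison
    net = make_net(use_bn)
    opt = torch.optim.SGD(net.parameters(), lr=1.0)  # deliberately large LR
    for _ in range(200):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()
    # Without BN this run often diverges to nan/inf; with BN it usually
    # trains stably at the same learning rate.
    print(f"BatchNorm={use_bn}: final loss {loss.item():.4g}")
```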
This is the best distillation of recent (and old!) research on batchnorm I've seen.
— Jeremy Howard (@jeremyphoward) September 11, 2019
There is so much to learn about training mechanics by studying this thread and the links it contains. https://t.co/a1PeCy7M1s
Precisely.
— Yann LeCun (@ylecun) September 12, 2019
(I've since been told by my random matrix theory colleagues at Courant that the distribution of eigenvalues of a random covariance matrix can be obtained in a much simpler manner than with the replica symmetry breaking calculations used for this paper). https://t.co/zuOlcut3vB
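The simpler route LeCun alludes to is presumably the Marchenko-Pastur law: for a sample covariance matrix built from i.i.d. entries, the limiting eigenvalue density has a closed form, no replica calculations needed. A short NumPy sketch (dimensions chosen arbitrarily) comparing an empirical spectrum against that density:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 4000, 1000                  # samples x features; arbitrary choice
X = rng.standard_normal((n, p))    # i.i.d. unit-variance entries
C = X.T @ X / n                    # sample covariance matrix
eigs = np.linalg.eigvalsh(C)

q = p / n                          # aspect ratio, here 0.25
lo, hi = (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2  # spectrum edges

# Empirical eigenvalue histogram vs. the Marchenko-Pastur density
# f(x) = sqrt((hi - x)(x - lo)) / (2 pi q x) on [lo, hi].
hist, edges = np.histogram(eigs, bins=50, range=(lo, hi), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
mp = (np.sqrt(np.clip((hi - centers) * (centers - lo), 0, None))
      / (2 * np.pi * q * centers))

print("mean |empirical - MP| density gap:", np.abs(hist - mp).mean())
```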