by TheGradient on 2020-02-14 (UTC).

1/5 Self-Distillation loop (feeding predictions as new target values & retraining) improves test accuracy. But why? We show it induces a regularization that progressively limits # of basis functions used to represent the solution. https://t.co/570qXFmlGj w/@farajtabar P.Bartlett pic.twitter.com/b79Q6ZSxlS

— Hossein Mobahi (@TheGradient) February 14, 2020
research, learning
by TheGradient on 2020-02-14 (UTC).

2/5 Knowledge distillation by @geoffreyhinton @OriolVinyalsML @JeffDean originally motivated to transfer knowledge from large to smaller networks. Self-distillation is special case with identical architectures; predictions of model are fed back to itself as new target values.

— Hossein Mobahi (@TheGradient) February 14, 2020
learning
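The loop these two tweets describe can be made concrete in a few lines. The sketch below is an assumed illustration only, not the authors' code: it uses scikit-learn kernel ridge regression on synthetic sine data, with arbitrary hyperparameters, to show how each round refits an identical model on the previous round's predictions.

```python
# Minimal self-distillation sketch (assumed setup): refit an identical model
# on its own predictions each round and watch the test error.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)   # noisy ground-truth labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

targets = y_train                                   # round 0 trains on the original labels
for step in range(5):
    model = KernelRidge(alpha=0.1, kernel="rbf", gamma=1.0)   # same architecture every round
    model.fit(X_train, targets)
    targets = model.predict(X_train)                # predictions become the next round's targets
    test_mse = np.mean((model.predict(X_test) - np.sin(X_test).ravel()) ** 2)
    print(f"round {step}: test MSE = {test_mse:.4f}")
```

In this kind of setup, successive rounds tend to produce smoother fits, which is one informal way to see the progressive regularization the first tweet refers to.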
by ericjang11 on 2020-02-16 (UTC).

self-distillation takes many useful forms in ML research (Q-learning, quantization, teacher-student architectures). Awesome fundamental work! https://t.co/6trYyiANxZ

— Eric Jang 🇺🇸🇹🇼 (@ericjang11) February 16, 2020
research, learning
