Tweeted By @svlevine
Can we view RL as supervised learning, but where we also "optimize" the data? New blog post by Ben, Aviral, and Abhishek: https://t.co/8wZp0pEiOx
— Sergey Levine (@svlevine) October 13, 2020
The idea: modify (reweight, resample, etc.) the data so that supervised regression onto actions produces better policies. More below: pic.twitter.com/vWBN8CepCo
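A minimal sketch of the reweighting variant of that idea, not the blog post's own method: each logged (state, action) pair gets an exponentiated-return weight, and a policy is fit by weighted regression onto the actions. The linear policy, the `temperature` parameter, the function name `fit_weighted_policy`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def fit_weighted_policy(states, actions, returns, temperature=1.0):
    """Weighted least-squares fit of a linear policy, actions ~ states @ W."""
    # Exponentiated-return weights (an assumed weighting scheme);
    # subtracting the max keeps exp() numerically stable.
    weights = np.exp((returns - returns.max()) / temperature)
    # Scaling each row by sqrt(weight) turns weighted least squares
    # into an ordinary least-squares problem.
    scale = np.sqrt(weights)[:, None]
    W, *_ = np.linalg.lstsq(states * scale, actions * scale, rcond=None)
    return W

rng = np.random.default_rng(0)
states = rng.normal(size=(200, 1))

# Hypothetical logged data: half the actions follow a "good" rule
# (a = 2s, return 1) and half follow a "bad" rule (a = -2s, return 0).
good = np.arange(200) % 2 == 0
actions = np.where(good[:, None], 2.0 * states, -2.0 * states)
returns = good.astype(float)

# Constant returns give uniform weights, i.e. plain behavior cloning.
W_plain = fit_weighted_policy(states, actions, np.zeros(200))
# A low temperature concentrates the weight on high-return samples.
W_weighted = fit_weighted_policy(states, actions, returns, temperature=0.1)
```

With uniform weights the two behaviors largely cancel and the fitted slope stays near zero; with return-based weights the same regression recovers a slope close to the good rule's 2. The supervised learner is unchanged, only the data it sees is "optimized".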