Steve's YouTube channel is awesome. I found the control bootcamp series useful a few years ago to review some basic concepts. https://t.co/tuRzCqg0jT https://t.co/rD6pAsP4gS
— hardmaru (@hardmaru) February 16, 2021
Introducing the World Models Library, an open-source, platform-agnostic suite of tasks and tools for examination of world model design and performance in visual model-based reinforcement learning. Learn more and grab the code at https://t.co/l9jKoBWS2V pic.twitter.com/sT6W4m2FDe
— Google AI (@GoogleAI) February 3, 2021
We thank @svlevine for his excellent talk "Data-Driven Reinforcement Learning: Deriving Common Sense from Past Experience" last Friday, now available on our YouTube channel. https://t.co/RfFMxmLivj
— UCL CSML (@uclcsml) January 10, 2021
Offline model-based RL for goal reaching: learn a distance "Q-like" function from offline data, and a video prediction model, then use them to accomplish visually indicated goals.
— Sergey Levine (@svlevine) January 10, 2021
w/ Stephen Tian et al. https://t.co/pmXL8fGHXv https://t.co/x9XXI7PN06
🧵> pic.twitter.com/G3t23nBWXo
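One way to combine a learned distance function with a learned dynamics model for visually indicated goals is random-shooting planning: sample candidate action sequences, roll them through the model, and keep the sequence ending closest to the goal. A minimal sketch, where `predict_next`, `distance`, and `sample_action` are hypothetical stand-ins for the learned components (not the paper's actual planner):

```python
import numpy as np

def plan_to_goal(state, goal, predict_next, distance, sample_action,
                 horizon=5, n_candidates=64):
    # Random-shooting planner: evaluate candidate action sequences under
    # the learned model and return the one whose final predicted state
    # minimizes the learned distance to the goal.
    best_cost, best_seq = np.inf, None
    for _ in range(n_candidates):
        seq = [sample_action() for _ in range(horizon)]
        s = state
        for a in seq:
            s = predict_next(s, a)  # roll the model forward
        cost = distance(s, goal)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq
```

The actual method plans with a video prediction model over pixels; this sketch only illustrates the model-plus-distance planning loop.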
Aviral Kumar and I have posted our NeurIPS offline reinforcement learning tutorial on YouTube for your enjoyment :)
— Sergey Levine (@svlevine) December 17, 2020
Slides, colab exercise, etc.: https://t.co/S639WkAroh
Part 1: https://t.co/OozPaXLVhF
Part 2: https://t.co/MPLhyipS1K
PyGeneses is a Deep Reinforcement Learning framework that attempts to simulate artificial agents in bio-inspired environments. One of the use cases features researching various possible behavior trends and drawing parallels with the real world. https://t.co/sfNC5WJsy8
— PyTorch (@PyTorch) December 4, 2020
One Learning to RL them all:
— Yann LeCun (@ylecun) December 4, 2020
ReBeL (Recursive Belief-based Learning) is a general RL+Search method that works for all two-player zero-sum games, including imperfect-information games (poker, liar's dice,...) and perfect-information games (chess, go....). https://t.co/2sw8Zbe8rg
"Coax is Plug-n-Play reinforcement learning in Python usingΒ @OpenAI Gym,Β JAX, and @DeepMind's Haiku.
β π©βπ» Paige Bailey @ 127.0.0.1 π (@DynamicWebPaige) November 6, 2020
For the full documentation, including many examples, go toΒ https://t.co/cVMYPnbp8j."https://t.co/PXdfWLZnWR pic.twitter.com/UZwLeMOGU7
We've been studying why deep RL is so hard, and we think we have another reason: implicit under-parameterization: https://t.co/haeE1YX4Ue
— Sergey Levine (@svlevine) October 30, 2020
Iteratively training on your own targets is a kind of "self-distillation," and leads to loss of rank ->
w/ Aviral Kumar @agarwl_ @its_dibya pic.twitter.com/h97OiV7d4Y
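The "loss of rank" above is typically measured as an effective rank of the penultimate-layer feature matrix: the number of singular values needed to capture most of the spectrum's mass. A hedged sketch of one such metric (the paper's exact definition and threshold may differ):

```python
import numpy as np

def effective_rank(features, delta=0.01):
    # Smallest k such that the top-k singular values account for a
    # (1 - delta) fraction of the total singular-value mass.
    s = np.linalg.svd(features, compute_uv=False)  # sorted descending
    cum = np.cumsum(s) / np.sum(s)
    return int(np.searchsorted(cum, 1.0 - delta) + 1)
```

Tracking this quantity over training iterations is how rank collapse under repeated bootstrapped self-distillation can be observed.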
Gamma-models are dynamics models without a fixed time step. Instead, gamma models predict discounted averages of future state visitations, allowing us to train "infinite horizon" models with TD.
— Sergey Levine (@svlevine) October 29, 2020
w/ @michaeljanner & @IMordatch https://t.co/QYga0n3Vy4 https://t.co/j4FKAQmrWK
->
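The TD training idea behind gamma-models can be sketched as a bootstrapped sampling target: with probability 1 − γ the target is the observed next state, and otherwise a sample from the model itself queried at the next state. A minimal sketch with hypothetical helper names (not the paper's code):

```python
import numpy as np

def gamma_model_td_target(next_state, sample_model, gamma=0.99,
                          rng=np.random):
    # TD target for a generative gamma-model: mixes the single-step
    # successor (weight 1 - gamma) with a bootstrapped sample from the
    # (target) model at the next state (weight gamma). Averaged over
    # training, this yields the discounted future state visitation.
    if rng.random() < 1.0 - gamma:
        return next_state
    return sample_model(next_state)
```

At γ = 0 this recovers an ordinary single-step dynamics model; larger γ extends the model's effective horizon without a fixed time step.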
Conservative safety critics use conservative Q-learning (CQL) to learn a safety critic, exploiting the lower bound property of CQL to provide guarantees on safety.
— Sergey Levine (@svlevine) October 29, 2020
w/ @mangahomanga, Aviral Kumar, @nick_rhinehart, @florian_shkurti, @animesh_garg https://t.co/6BjwHz9Zjx
-> pic.twitter.com/9dYxUjBSEn
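The CQL regularizer that gives this bound can be sketched as a simple penalty added to the Bellman error: push Q-values down on actions from the current policy and up on actions from the dataset, so the learned Q lower-bounds the true value. A hedged sketch with hypothetical inputs (for a safety critic over costs, the sign is flipped so failure probability is conservatively *over*estimated; the paper's exact objective may differ):

```python
import numpy as np

def cql_penalty(q_policy, q_data, alpha=1.0):
    # CQL regularizer: added to the standard Bellman loss, it penalizes
    # high Q-values on policy-sampled (potentially out-of-distribution)
    # actions relative to dataset actions, producing a conservative
    # (lower-bound) value estimate.
    return alpha * (np.mean(q_policy) - np.mean(q_data))
```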
I recorded an extended version of my offline RL talk, as practice for a live presentation earlier this week: https://t.co/DvnjzagxsN
— Sergey Levine (@svlevine) October 20, 2020
Covers the following:
AWAC: https://t.co/JYyprRInhR
MOPO: https://t.co/53VtOZKbcx
CQL: https://t.co/TYL7RTNbO7
D4RL: https://t.co/MvBXpSghkM