by ilyasut on 2018-06-25 (UTC).

The deep learning moment of deep RL: https://t.co/25qB39K3HL

— Ilya Sutskever (@ilyasut) June 25, 2018
research
by soumithchintala on 2018-06-25 (UTC).

OpenAI demonstrates remarkable progress in a limited version of 5v5 Dota using two concepts that we didn't think can learn long time-scale strategies: selfplay, LSTM. Carefully designed reward functions are notable -- intermediate, global, team-spirit. https://t.co/GBTw1e7ERR

— Soumith Chintala (@soumithchintala) June 25, 2018
research
by awjuliani on 2018-06-25 (UTC).

Great work by @OpenAI team. More evidence that scaling up simple RL methods (rather than designing complicated algorithms) enables solving increasingly complex problems. https://t.co/1wpazo69hU

— Arthur Juliani (@awjuliani) June 25, 2018
research
by jackclarkSF on 2018-06-25 (UTC).

Some megascale RL results from @OpenAI:
We've scaled existing methods to train AIs with sufficient teamwork skills to solve hard problems within Dota 2
- Scaled-up PPO+LSTM
~120,000 CPUs + 256 GPUs
- Self-play
- Hyperparameter called "Team Spirit" to teach AIs to collaborate https://t.co/lcSGWw0yr5

— Jack Clark (@jackclarkSF) June 25, 2018
research
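
The "Team Spirit" hyperparameter mentioned in the tweet above can be pictured as a weight that shifts each agent's reward from its own score toward the team average. The tweet gives no formula, so the sketch below assumes a simple linear blend with a weight in [0, 1]; the function name `blend_team_spirit` and the sample reward values are hypothetical and are not taken from OpenAI's code.

```python
from statistics import fmean


def blend_team_spirit(individual_rewards, team_spirit):
    """Mix each agent's own reward with the team's mean reward.

    Assumption (not stated in the tweet): a linear interpolation, where
    team_spirit = 0 keeps rewards fully individual and team_spirit = 1
    gives every agent the team average.
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be in [0, 1]")
    team_mean = fmean(individual_rewards)
    return [
        (1.0 - team_spirit) * reward + team_spirit * team_mean
        for reward in individual_rewards
    ]


if __name__ == "__main__":
    # Hypothetical per-hero rewards for one timestep: only one hero scored.
    rewards = [1.0, 0.0, 0.0, 0.0, 0.0]
    for spirit in (0.0, 0.5, 1.0):
        print(spirit, blend_team_spirit(rewards, spirit))
```

With a low weight, the hero that scored keeps most of the credit; with a high weight, the credit is spread evenly, which is one way an agent could be pushed to value outcomes that help teammates rather than only itself.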
by hardmaru on 2018-06-25 (UTC).

Amazing what a single-layer 1024-unit LSTM can be trained to do with a bit of engineering! OpenAI Five Model Architecture: pic.twitter.com/mRbD02KpNc

— hardmaru (@hardmaru) June 25, 2018
research
by Smerity on 2018-06-25 (UTC).

I want to wax poetic about the models (LSTM+PPO pushed far beyond what people likely thought possible, mirroring @Smerity et al + @GaborMelis et al in LSTM language modeling), the game (DotA as a complex test-bed), or the stupendous compute (180 years of gaming per day), but ... pic.twitter.com/cgUJ4w39G0

— Smerity (@Smerity) June 25, 2018
research
