Congrats @jnhwkim and team on winning the VQA challenge at #CVPR2018.
— PyTorch (@PyTorch) June 19, 2018
Read their paper "Bilinear Attention Networks" at https://t.co/sqXNysYxcv
PyTorch-based code at: https://t.co/XIOwVHfLtO https://t.co/FGOR3Pe9Rl
Code and pre-trained models to reproduce the recent paper "Scaling Neural Machine Translation" (https://t.co/mrRDmlwax1), where we train on up to 128 GPUs with half-precision floating-point operations as well as delayed batching.
— PyTorch (@PyTorch) June 16, 2018
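For context on the techniques named above: "delayed batching" means accumulating gradients over several mini-batches and applying a single optimizer update, which emulates a much larger batch across many GPUs; half-precision training additionally needs loss scaling, which is omitted here. The snippet below is a minimal hypothetical sketch of gradient accumulation in PyTorch, not fairseq's actual training loop; the model, data, and hyperparameters are placeholders.

```python
# Minimal sketch of delayed batching (gradient accumulation) in PyTorch.
# All names and numbers are placeholders, not fairseq's training code.
import torch
import torch.nn as nn

model = nn.Linear(512, 512)                      # stand-in for a Transformer
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
accumulation_steps = 16                          # one update per 16 mini-batches

optimizer.zero_grad()
for step in range(160):
    src = torch.randn(32, 512)                   # dummy source batch
    tgt = torch.randn(32, 512)                   # dummy target batch
    loss = nn.functional.mse_loss(model(src), tgt)
    (loss / accumulation_steps).backward()       # gradients sum into .grad buffers
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                         # delayed update
        optimizer.zero_grad()
```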
Facebook's fairseq MT engine is really, really fast... Like, 50% faster than @marian_nmt (which is itself way faster than Sockeye/OpenNMT/Tensor2Tensor/xnmt/Nematus/etc) at generating from the same Transformer model https://t.co/tWZuSog5gd
— James Bradbury (@jekbradbury) June 15, 2018
I made a @pytorch implementation of @openai's pretrained transformer with a script to import OpenAI's pre-trained model.
— Thomas Wolf (@Thom_Wolf) June 14, 2018
Link: https://t.co/6zY8NavPA3
Thanks @AlecRad, @karthik_r_n, @TimSalimans, @ilyasut for open-sourcing the code right away!
Did terminology move and I didn't notice? Do you consider language modeling (training a probabilistic model p(w1 w2 w3 ... wN)) to be unsupervised learning?
— Hal Daumé III (@haldaume3) June 12, 2018
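For reference, the probabilistic model in the question factorizes over the sequence by the chain rule, so plain unlabeled text is all that is needed to train it; whether that counts as "unsupervised" is exactly the terminology being debated:

```latex
p(w_1, w_2, \ldots, w_N) = \prod_{i=1}^{N} p(w_i \mid w_1, \ldots, w_{i-1})
```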
Great work! Language models serving as the basis for transfer learning in task agnostic NLP is really going to feed the next generation of tools and transformations =]
— Smerity (@Smerity) June 11, 2018
This is exactly where we were hoping our ULMFiT work would head - really great work from @OpenAI! 😊
— Jeremy Howard (@jeremyphoward) June 11, 2018
If you're doing NLP and haven't tried language model transfer learning yet, then jump in now, because it's a Really Big Deal. https://t.co/0Dj8ChCxvu
What I've been working on for the past year! https://t.co/CAQMYS1rR7
— Alec Radford (@AlecRad) June 11, 2018
Inspired by CoVE, ELMo, and ULMFiT we show that a single transformer language model can be finetuned to a wide variety of NLP tasks and performs very well with little tuning/tweaking.
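A rough illustration of the finetuning pattern described above: keep the pretrained language-model body, attach a small randomly initialized task head, and train everything end to end with a small learning rate. The class below is a hypothetical sketch (with an embedding layer standing in for a real pretrained transformer), not the released OpenAI or ULMFiT code.

```python
# Hypothetical sketch of finetuning a pretrained LM body for classification.
import torch
import torch.nn as nn

class ClassifierOnLM(nn.Module):
    def __init__(self, lm_body: nn.Module, hidden_size: int, num_classes: int):
        super().__init__()
        self.lm_body = lm_body                           # pretrained weights go here
        self.head = nn.Linear(hidden_size, num_classes)  # new task-specific layer

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.lm_body(token_ids)                 # (batch, seq, hidden)
        return self.head(hidden[:, -1, :])               # predict from the last position

# Stand-in body for demonstration; a real setup would load pretrained weights.
model = ClassifierOnLM(nn.Embedding(50000, 768), hidden_size=768, num_classes=2)
logits = model(torch.randint(0, 50000, (4, 16)))         # shape: (4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=6.25e-5)  # small finetuning LR
```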
We show large language models trained on massive text corpora (LM1b, CommonCrawl, Gutenberg) can be used for commonsense reasoning and obtain SOTA on Winograd Schema Challenge. Paper at https://t.co/aRndlByWfj, results reproducible at https://t.co/jFOmUYf03O pic.twitter.com/s3uyrksAQz
— Trieu H. Trinh (@thtrieu_) June 11, 2018
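The idea behind that result is to let the language model itself do the disambiguation: substitute each candidate antecedent for the ambiguous pronoun and keep the sentence the model assigns the higher probability. Below is a hypothetical sketch of that substitution-scoring step; `lm_log_prob` is a placeholder hook for a trained language model, not part of the released code.

```python
# Hypothetical sketch of substitution scoring for Winograd schemas.
from typing import List

def lm_log_prob(sentence: str) -> float:
    """Placeholder: return a trained language model's log-probability of `sentence`."""
    raise NotImplementedError("plug in a trained language model here")

def resolve_pronoun(sentence: str, pronoun: str, candidates: List[str]) -> str:
    # Naive whole-word substitution, for illustration only.
    scores = {
        c: lm_log_prob(sentence.replace(f" {pronoun} ", f" {c} ", 1))
        for c in candidates
    }
    return max(scores, key=scores.get)

# Example schema:
# resolve_pronoun("The trophy doesn't fit in the suitcase because it is too big.",
#                 pronoun="it", candidates=["the trophy", "the suitcase"])
```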
New NLP News - Conversational AI tutorial, RNNs for particle physics, InfoGAN, NLP Coursera, NLP book, killer robots, Code2pix, Google AI principles, relational reasoning https://t.co/5zT3QpjYAT
— Sebastian Ruder (@seb_ruder) June 11, 2018
Some of my thoughts on AI and the future of chat bots: https://t.co/a9Fm1AXFkY
— Richard (@RichardSocher) May 21, 2018
Slides for my keynote this morning on Successes and Frontiers of Deep Learning are now online. Mainly a high-level overview of some important and exciting application areas + some recent work on semi-supervised / transfer learning. https://t.co/uVcLtbHSuT
— Sebastian Ruder (@seb_ruder) May 21, 2018