New model from XLM outperforms BERT on all GLUE tasks, trained on same data.
Get it here: https://t.co/cYYOETEeaj
Tweets from Guillaume & Alex:... https://t.co/2ysUltBH7f
— Yann LeCun (@ylecun) June 21, 2019
I did a thing :) Starting today Marian is serving our largest language pairs and probably most of @mstranslator traffic. Before, it was only being used for two prototype language pairs (en-de and en-zh).
Getting this to work was a fun year, more here: https://t.co/21c6qFUJZN
— Marian NMT (@marian_nmt) June 17, 2019
Our #acl2019nlp paper Entailment-driven Extracting and Editing for Conversational Machine Reading (E3) is out! E3 studies conversational machine reading (CMR), a task oriented dialogue problem where reasoning rules aren't fixed but implied by procedural text. Paper/code 1/10 pic.twitter.com/v6Ro6wOaik
— Victor Zhong (@hllo_wrld) June 14, 2019
Paper: https://t.co/V7U4FpYMb9
Dockerized code: https://t.co/gWocV5qmPn
10/10 #nlproc #ACL2019NLP #DeepLearning
— Victor Zhong (@hllo_wrld) June 14, 2019
We've spent a few evenings last week building an interactive demo called *Write with Transformer*
It lets you interact in a very intimate way with GPT-2, call, control, question the model... and I just can't stop playing with it!
You can try it at https://t.co/EZhtCodnoi https://t.co/me75uCeJ9q
— Thomas Wolf (@Thom_Wolf) June 13, 2019
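Write with Transformer is a hosted demo, but the same kind of autocomplete can be reproduced locally. A minimal sketch, assuming the Hugging Face transformers package and the public "gpt2" checkpoint (neither is specified in the tweet, and this is not the demo's own stack):

```python
# Minimal local sketch of GPT-2 text completion.
# Assumes the Hugging Face `transformers` package and the public "gpt2" checkpoint.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The meaning of life is"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sample a continuation; top-k sampling keeps the output varied but coherent.
output_ids = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,
    top_k=40,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```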
And now we have KERMIT https://t.co/hVvQOknF4o to go along with ELMo, BERT, ERNIE, BigBird and Grover.
— Mark Riedl Mars (Moon) (@mark_riedl) June 12, 2019
New paper out looking into ELMo- and BERT/STILTs-style transfer from a huge range of source tasks! "Can You Tell Me How to Get Past Sesame Street?" https://t.co/tU9dfhyG7Y pic.twitter.com/m8sFdafGoY
— Sam Bowman (@sleepinyourhat) June 11, 2019
In our new paper (my first collaboration at DeepMind, yay!) with Cyprien, @ikekong, & @DaniYogatama, we leverage episodic memory during training (sparse replay) and inference (local adaptation) for continual learning (on QA and classification tasks). https://t.co/M7lgKhVwXZ pic.twitter.com/ZHdl3yAu72
— Sebastian Ruder (@seb_ruder) June 11, 2019
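The tweet only names the two ingredients; as a rough illustration of what sparse replay can mean in practice (a generic toy loop, not the paper's implementation), a continual-learning setup can store a small fraction of incoming examples in an episodic memory and occasionally mix them back into training:

```python
import random

# Illustrative sketch of sparse experience replay for continual learning.
# Generic toy loop, NOT the paper's implementation; `stream` and `train_step`
# are hypothetical placeholders supplied by the caller.
memory = []          # episodic memory of past examples
WRITE_PROB = 0.05    # fraction of incoming examples written to memory
REPLAY_EVERY = 100   # replay sparsely, e.g. every 100 steps
REPLAY_BATCH = 32

def continual_train(stream, train_step):
    for step, example in enumerate(stream):
        train_step([example])                  # learn from the new example
        if random.random() < WRITE_PROB:
            memory.append(example)             # store a small sample in memory
        if memory and step % REPLAY_EVERY == 0:
            replay = random.sample(memory, min(REPLAY_BATCH, len(memory)))
            train_step(replay)                 # occasional replay of old examples
```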
Lots of requests to @seb_ruder & I for full replication details for ULMFiT on IMDb. Here it is! And thanks to @GuggerSylvain it now runs in just 6 hours on a single GPU :) https://t.co/xbl3pIcctD
— Jeremy Howard (@jeremyphoward) June 11, 2019
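For readers who just want the shape of that replication, the general ULMFiT recipe on IMDb with fastai v1 looks roughly like the sketch below; it condenses the public fastai IMDb tutorial rather than the linked script, so paths, epoch counts, and learning rates are assumptions:

```python
# Rough ULMFiT-on-IMDb outline with fastai v1 (condensed from the public
# fastai tutorial; hyperparameters and epoch counts are illustrative only).
from fastai.text import *

path = untar_data(URLs.IMDB)

# 1) Fine-tune the pretrained AWD-LSTM language model on IMDb text.
data_lm = TextLMDataBunch.from_folder(path)
learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn_lm.fit_one_cycle(1, 1e-2)
learn_lm.unfreeze()
learn_lm.fit_one_cycle(1, 1e-3)
learn_lm.save_encoder("ft_enc")

# 2) Train the sentiment classifier on the fine-tuned encoder,
#    gradually unfreezing layers as in the ULMFiT paper.
data_clas = TextClasDataBunch.from_folder(path, vocab=data_lm.vocab, bs=64)
learn_clas = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn_clas.load_encoder("ft_enc")
learn_clas.fit_one_cycle(1, 2e-2)
learn_clas.freeze_to(-2)
learn_clas.fit_one_cycle(1, slice(1e-2 / (2.6 ** 4), 1e-2))
learn_clas.unfreeze()
learn_clas.fit_one_cycle(2, slice(1e-3 / (2.6 ** 4), 1e-3))
```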
I've released my web UI for GPT-2-117M that allows you to generate text from the original model, backed by Google Cloud Run for massive scalability at mostly no cost! https://t.co/LO4hHuQo2l pic.twitter.com/HNhg25x8bz
— Max Woolf (@minimaxir) June 10, 2019
Student with access to TPU credits reproduced GPT2-1.5B and plan to release model https://t.co/WJUEJSEKE8
— /MachineLearning (@slashML) June 8, 2019
Very cool visualizations of different word senses being represented in later layers of BERT, by @_coenen, Emily Reif, Ann Yuan and collaborators. https://t.co/RiMjXseXxW pic.twitter.com/o6sKdPprAJ
— Chris Olah (@ch402) June 7, 2019
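The underlying idea, that the same word gets distinct representations in BERT's later layers depending on its sense, is easy to poke at yourself. A minimal sketch, assuming the Hugging Face transformers package and bert-base-uncased (the visualization work in the tweet uses its own tooling):

```python
# Compare BERT's contextual vectors for the word "bank" in two senses.
# Assumes the Hugging Face `transformers` package and bert-base-uncased;
# this is only an illustration of the idea, not the authors' visualization code.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def word_vector(sentence, word, layer=-1):
    """Return the hidden state of `word` from the given BERT layer."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    idx = tokens.index(word)
    return outputs.hidden_states[layer][0, idx]

v_river = word_vector("He sat on the bank of the river.", "bank")
v_money = word_vector("She deposited cash at the bank.", "bank")
v_money2 = word_vector("The bank approved the loan.", "bank")

cos = torch.nn.functional.cosine_similarity
print("river vs money:", cos(v_river, v_money, dim=0).item())
print("money vs money:", cos(v_money, v_money2, dim=0).item())  # expected to be higher
```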