I can think of hundreds of data augmentation strategies for CV, eg https://t.co/JsGj2fRNMa
For NLP tasks, there's only a handful, eg https://t.co/aRQiZrh5N3
NLP data is far more brittle.
— Reza Zadeh (@Reza_Zadeh) May 18, 2020
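For a flavor of what that NLP "handful" looks like, below is a minimal synonym-replacement sketch using WordNet via nltk. It is our illustration (the function name and the choice of `n` included), not one of the strategies behind the links above.

```python
# Synonym replacement: one of the few text augmentations in common use.
# Requires: pip install nltk
import random
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

def synonym_replace(sentence: str, n: int = 2) -> str:
    """Replace up to n words with a random WordNet synonym."""
    words = sentence.split()
    # Only words that WordNet knows about are candidates for replacement.
    candidates = [i for i, w in enumerate(words) if wordnet.synsets(w)]
    random.shuffle(candidates)
    for i in candidates[:n]:
        lemmas = {lem.name().replace("_", " ")
                  for syn in wordnet.synsets(words[i])
                  for lem in syn.lemmas()}
        lemmas.discard(words[i])  # don't "replace" a word with itself
        if lemmas:
            words[i] = random.choice(sorted(lemmas))
    return " ".join(words)

print(synonym_replace("The quick brown fox jumps over the lazy dog"))
```

Even this mild augmentation shows the brittleness the tweet is pointing at: a swapped synonym can shift register or outright flip a label, which is exactly why CV-style augmentation rarely transfers to text.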
By the way, we just updated our Colab notebook on training a new language model from scratch.
Updated Colab notebook with the new Trainer =)
➡️ https://t.co/nGQxwqwwZu pic.twitter.com/h392FC7FhZ
— Julien Chaumond (@julien_c) May 16, 2020
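If you just want the shape of it without opening the notebook, a from-scratch masked-LM training run with the Trainer looks roughly like the sketch below. This is not the notebook's code: `corpus.txt`, the RoBERTa-style config, and all hyperparameters are placeholder choices of ours.

```python
# Train a masked language model from scratch (fresh weights, no pretraining).
from transformers import (RobertaConfig, RobertaForMaskedLM, RobertaTokenizerFast,
                          LineByLineTextDataset, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

# Reuse an existing tokenizer for simplicity; the notebook trains one too.
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
config = RobertaConfig(vocab_size=tokenizer.vocab_size)
model = RobertaForMaskedLM(config)  # randomly initialized, not pretrained

# One training example per line of the (placeholder) corpus file.
dataset = LineByLineTextDataset(tokenizer=tokenizer,
                                file_path="corpus.txt", block_size=128)
# The collator applies BERT-style random masking on the fly.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm=True, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lm-from-scratch",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()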
[3/4] Unsupervised MLM scores from BERT narrow the human gap on the BLiMP minimal pairs set (@a_stadt, @sleepinyourhat), suggesting left-to-right bias in GPT-2 has an outsized effect.
(Yes, we ran these experiments when the dataset came out, 1 week before the ACL deadline) pic.twitter.com/I5ng2NobSd
— Julian Salazar (@JulianSlzr) May 15, 2020
[1/4] "Masked Language Model Scoring" is in #acl2020nlp! Score sentences with any BERT variant via mask+predict (works w/ @huggingface). Improves ASR, NMT, acceptability.
Paper: https://t.co/j2iJlIJbVQ
Code: https://t.co/zVbCkAhi9P
(w/ @LiangDavis, @toannguyen177, Katrin K.) pic.twitter.com/LhLYdXC3FS
— Julian Salazar (@JulianSlzr) May 15, 2020
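Their code is behind the link above; to make the mask+predict idea concrete, here is a from-scratch sketch of the pseudo-log-likelihood score with bert-base-uncased. `mlm_score` is our name, a recent transformers API is assumed, and note the cost: one forward pass per token.

```python
# Pseudo-log-likelihood of a sentence under BERT: mask each position in
# turn and sum the log-probabilities of the true tokens.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def mlm_score(sentence: str) -> float:
    ids = tok(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tok.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

print(mlm_score("the cat sat on the mat"))
print(mlm_score("the cat sat in the mat"))  # typically scores lower
```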
Surviving every AI wave, two kernels have consistently been the beating hearts of Natural Language Processing: Datasets and Metrics.
Today we release "nlp", a library to easily share & load data/metrics, already providing access to 99+ datasets!
Try it: https://t.co/37pfogRWIZ pic.twitter.com/m70zXs2zXs
— Thomas Wolf (@Thom_Wolf) May 15, 2020
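Loading a dataset and its metric is a two-call affair. A minimal sketch against the release API (pip install nlp; the project was later renamed `datasets`, keeping the same entry points):

```python
# Minimal tour of the library's two entry points.
from nlp import load_dataset, load_metric

squad = load_dataset("squad")           # downloads and caches the dataset
print(squad["train"][0]["question"])    # rows behave like plain dicts

metric = load_metric("squad")           # the paired metric for the task
# metric.compute(predictions=..., references=...) -> {"exact_match": ..., "f1": ...}
```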
Let's democratize NLP for all languages!
Today, with v2.9.1, we are releasing 1,008 machine translation models, covering 140 different languages, trained by @jorgtiedemann with @marian, ported by @sam_shleifer. Find your language here: https://t.co/9EMtfopij3 [1/4] pic.twitter.com/ACyxZxzNCg
— Hugging Face (@huggingface) May 14, 2020
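To try one of the released checkpoints, a minimal sketch with a recent transformers API, using the Helsinki-NLP/opus-mt-en-de pair as an arbitrary example (swap the language codes in the model name to pick another):

```python
# Translate English to German with one of the Helsinki-NLP Marian checkpoints.
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-de"
tok = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

batch = tok(["Machine translation for everyone!"],
            return_tensors="pt", padding=True)
out = model.generate(**batch)
print(tok.decode(out[0], skip_special_tokens=True))
```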
I just rediscovered these posts from @mcgenergy for combining @huggingface and @fastdotai for various NLP tasks. I highly recommend reading this for anyone doing NLP:
- language modeling: https://t.co/wpWGRHEyUr
- sequence classification: https://t.co/vP21CTqCtg
— Hamel Husain (@HamelHusain) May 14, 2020
This Word Does Not Exist https://t.co/gcUqp2SM0k
Fun project where @turtlesoupy trained a GPT-2 language model over the Oxford English Dictionary.
Sampling from it, you get realistic-sounding words with fake definitions and example usage, e.g. pic.twitter.com/G1u0qjGwat
— hardmaru (@hardmaru) May 14, 2020
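The deployed model is @turtlesoupy's fine-tuned GPT-2, which this snippet does not reproduce; still, the sampling side of the recipe is easy to sketch with the stock gpt2 checkpoint and a dictionary-style prompt (the prompt format and the headword "florble" are our invention, not the project's):

```python
# Sample a fake dictionary entry from vanilla GPT-2.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "florble (noun): "
result = generator(prompt, max_length=40, num_return_sequences=1)
print(result[0]["generated_text"])
```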
Does model size matter? @jxmorris12 does an excellent job comparing @huggingface's BERT and DistilBERT. #machinelearning #deeplearning #100daysofmlcode https://t.co/Y7dHbH7yqk
— Lavanya (@lavanyaai) May 13, 2020
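For the size half of that comparison, a quick parameter count (our sketch, not code from the post) makes the gap concrete: roughly 110M parameters for BERT-base versus about 66M for DistilBERT.

```python
# Count parameters for the two checkpoints discussed in the post.
from transformers import AutoModel

for name in ["bert-base-uncased", "distilbert-base-uncased"]:
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M parameters")
```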
New textrecipes series post!
This post explores how you can use lexicons in your recipes.
First but not last time blogging about the {textdata} package: https://t.co/XN04bceRVg #rstats pic.twitter.com/qOuj0zuaog
— Emil Hvitfeldt (@Emil_Hvitfeldt) May 12, 2020
Facebook AI released Blender, "largest-ever open-domain chatbot".
I tried some things.
====
FACEBOOK AI: "truly intelligent, human-level AI...must effortlessly understand the broader context of the conversation and how specific topics relate to each other."
FACEBOOK BOT: pic.twitter.com/cqhx21JANV
— Tomer Ullman (@TomerUllman) May 11, 2020
Facebook AI used 1.5B Reddit posts to create a chatbot.
"Good conversation requires a number of skills: providing engaging talking points, listening, displaying knowledge, empathy and personality appropriately, while maintaining a consistent persona." https://t.co/S8JJmSEN1M https://t.co/Rb9TDjqnyF pic.twitter.com/HYHqo33hUq
— hardmaru (@hardmaru) May 7, 2020