GPT-3 is currently generating an average of **4.5 billion words per day**: https://t.co/pqwhce8rD3
— Greg Brockman (@gdb) March 25, 2021
Who needs floats? I-BERT doesn't!
— Hugging Face (@huggingface) March 22, 2021
I-BERT: A quantized Transformer with int-8 *only*
Get the best parameters with Transformers and use in TensorRT for a 4x (!!) speedup!
Contributed by @sehoonkim418, @amir__gholami @ZheweiYao
Try it on the hub: https://t.co/00w4evcRUe pic.twitter.com/eM1LJKgGAX
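For orientation, a minimal sketch of pulling an I-BERT checkpoint through Transformers; the hub id "kssteven/ibert-roberta-base" and the input sentence are illustrative assumptions, not part of the announcement, and the TensorRT deployment step is not shown.

```python
# Minimal sketch: loading an I-BERT checkpoint via Transformers.
# The hub id "kssteven/ibert-roberta-base" is an assumption for illustration.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "kssteven/ibert-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Who needs floats? I-BERT doesn't!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```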
All NLP Tasks Are Generation Tasks: A General Pretraining Framework
— AK (@ak92501) March 19, 2021
pdf: https://t.co/Zhdx33AUgL
abs: https://t.co/ibJdjVbIsD pic.twitter.com/8OPdrHVCPn
Multi-view subword regularization is simple but yields consistent improvements over pre-trained multilingual models. The best thing: It only needs to be applied during fine-tuning.
— Sebastian Ruder (@seb_ruder) March 16, 2021
Paper: https://t.co/gxTgbzVvWN
Code: https://t.co/FqUyZgEnOQ https://t.co/sTFxot6yan
Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
— AK (@ak92501) March 16, 2021
pdf: https://t.co/uyzxL1tu4L
abs: https://t.co/qX90yC7nAH pic.twitter.com/Y15zxoaJo8
🔥Fine-Tuning @facebookai's Wav2Vec2 for Speech Recognition is now possible in Transformers🔥
— Hugging Face (@huggingface) March 12, 2021
Not only for English but for 53 Languages🤯
Check out the tutorials:
👉 Train Wav2Vec2 on TIMIT https://t.co/33Bx8Nj4mN
👉 Train XLSR-Wav2Vec2 on Common Voice https://t.co/xOoEQV3Krn pic.twitter.com/rxp2hAbaLS
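For a quick sense of the API, a minimal CTC inference sketch with the Wav2Vec2 classes in Transformers; the checkpoint id and the dummy audio are illustrative assumptions, and the linked tutorials cover the actual fine-tuning recipes.

```python
# Minimal sketch: CTC inference with a pre-trained Wav2Vec2 checkpoint.
# The checkpoint id "facebook/wav2vec2-base-960h" and the dummy audio are
# illustrative assumptions; see the linked tutorials for fine-tuning.
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "facebook/wav2vec2-base-960h"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

waveform = np.zeros(16000, dtype=np.float32)  # 1 s of silence at 16 kHz, stand-in for real audio
inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids))
```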
CUAD: A dataset with over 13,000 annotations for hundreds of legal contracts that have been manually labelled by legal experts, to serve as a benchmark for contract understanding.
— hardmaru (@hardmaru) March 12, 2021
Discussion: https://t.co/qP4dC40Z8l
GitHub: https://t.co/Kj7NZs3qW0
Paper: https://t.co/RTbXxbs06a pic.twitter.com/IZthZPAPs0
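A sketch of one way to load the benchmark, assuming CUAD is mirrored on the Hugging Face Hub under the id "cuad" in SQuAD-style QA format; the field names below follow that format and are assumptions rather than something stated in the tweet.

```python
# Sketch only: loading CUAD through the datasets library, assuming a Hub
# mirror under the id "cuad" with SQuAD-style fields (question, context, answers).
from datasets import load_dataset

cuad = load_dataset("cuad")
example = cuad["train"][0]
print(example["question"])
print(example["answers"])
```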
A new blog post I wrote with Ishan Misra.
— Yann LeCun (@ylecun) March 4, 2021
An overview of Self-Supervised Learning.
We look at recent progress in SSL for vision & explain why SSL is more challenging with high-D continuous signals (images, video) than it is for discrete signals (text). https://t.co/DlL885CPpb
We keep updating BERTScore, our generation evaluation method, behind the scenes. Been a while so highlights:
— Yoav Artzi (@yoavartzi) March 3, 2021
- Now supports 53 pre-trained models via @huggingface's Transformers
- WMT-16 to-EN correlations here: https://t.co/nlcL2QxtGh --> current best: deberta-xlarge-mnli
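A minimal scoring sketch with the bert-score package, passing the deberta-xlarge-mnli backbone noted as the current best above; the candidate and reference sentences are made up for illustration.

```python
# Minimal sketch: computing BERTScore with the deberta-xlarge-mnli backbone
# mentioned above. The candidate/reference sentences are illustrative only.
from bert_score import score

cands = ["The contract was signed yesterday."]
refs = ["The agreement was signed on the previous day."]

P, R, F1 = score(cands, refs, model_type="microsoft/deberta-xlarge-mnli", lang="en")
print(F1.mean().item())
```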
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
— AK (@ak92501) March 3, 2021
pdf: https://t.co/fblyzH2hGe
abs: https://t.co/tVgBdfOnQ5
github: https://t.co/NNkF3oheok pic.twitter.com/nnFUaPJaYU
Nice video by Abhishek about his AutoNLP project! Check it out https://t.co/dzavvR74vl
— Thomas Wolf (@Thom_Wolf) February 28, 2021
What happens when you mix the SHA-RNN with the SRU, similar to the QRNN? 2.5-10x less training time and darn close to SotA results on the enwik8, WikiText-103, and Billion Word language modeling datasets.
— Smerity (@Smerity) February 26, 2021
Impressive work from @taolei15949106 at @asapp!
See https://t.co/aNCqhTLnn6 https://t.co/eD3mWPJnwo
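For context, a minimal sketch of running the plain SRU layer from the open-source `sru` package; the SRU++ variant discussed in the linked work adds attention and may expose a different interface, and the tensor shapes and sizes here are illustrative assumptions.

```python
# Minimal sketch: a stack of plain SRU layers from the `sru` package.
# SRU++ (the attention-augmented variant in the linked work) may differ;
# the dimensions below are illustrative assumptions.
import torch
from sru import SRU

seq_len, batch_size, input_size, hidden_size = 32, 4, 128, 256
x = torch.randn(seq_len, batch_size, input_size)  # (length, batch, features)

rnn = SRU(input_size, hidden_size, num_layers=2, dropout=0.1)
output, state = rnn(x)  # output: (length, batch, hidden_size)
print(output.shape)
```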