"The Future of Natural Language Processing" https://t.co/PKl4cSSlZ8 from @huggingface, well done quick summary of recent NLP work and lots of good pointers! π
β Andrej Karpathy (@karpathy) April 24, 2020
"The Future of Natural Language Processing" https://t.co/PKl4cSSlZ8 from @huggingface, well done quick summary of recent NLP work and lots of good pointers! π
β Andrej Karpathy (@karpathy) April 24, 2020
Introducing a scalable approach to reducing gender bias in #GoogleTranslate, with applications to new translation pairs, including Finnish, Hungarian, and Persian-to-English. Check it out → https://t.co/xyg2OqEuSP
– Google AI (@GoogleAI) April 22, 2020
Microsoft's DialoGPT helps build versatile, engaging and natural open-domain conversational agents. Explore the source code and trained model: https://t.co/TQhUh5oYAg
– Microsoft Research (@MSFTResearch) April 22, 2020
Decoding functionality from @HuggingFace: https://t.co/CPuxxLms0R https://t.co/Yks5vwcl26
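The decoding functionality linked above presumably refers to the text-generation utilities in 🤗 transformers (generate() with sampling). As a loose illustration only, where the DialoGPT checkpoint name, prompt, and sampling settings are my own assumptions rather than anything from the tweet, decoding a conversational reply could look like this:

```python
# Assumed example: decoding a DialoGPT reply with transformers' generate().
# The checkpoint name, prompt, and sampling settings are illustrative choices.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Encode one user turn, terminated by the end-of-sequence token.
input_ids = tokenizer.encode("Does money buy happiness?" + tokenizer.eos_token,
                             return_tensors="pt")

# Sample a reply with top-k / nucleus sampling.
reply_ids = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(reply_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```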
New Python v0.7.0 of 🤗 tokenizers is out, with:
- Reduced memory usage by 70%
- Rock-solid offsets/alignments, working even with byte-level BPE
- And so much more!
And soon all of this in 🤗 transformers too!
pip install tokenizers https://t.co/wONh4qZSlF pic.twitter.com/Io1kmLE63M
– Anthony MOI (@moi_anthony) April 17, 2020
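A minimal sketch of using that release, assuming a local training corpus; the file path, vocabulary size, and sample sentence below are placeholders of mine:

```python
# pip install tokenizers
# Sketch: train a byte-level BPE tokenizer and inspect tokens plus their offsets.
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(files=["corpus.txt"], vocab_size=20_000, min_frequency=2)  # placeholder corpus

encoding = tokenizer.encode("Hello, world!")
print(encoding.tokens)   # sub-word tokens
print(encoding.offsets)  # character offsets back into the original string
```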
I'm excited to announce XTREME, a new benchmark that covers 9 tasks and 40 typologically diverse languages.
Paper: https://t.co/ZjBIYK6QcX
Blog post: https://t.co/L0SiDRRHMX
Code: https://t.co/QEmw5ZGHoN pic.twitter.com/YVo0T9gT63
– Sebastian Ruder (@seb_ruder) April 13, 2020
Announcing XTREME, a new #NaturalLanguageProcessing benchmark for cross-lingual generalization, which covers 40 typologically diverse languages using nine tasks that collectively require reasoning about different levels of syntax or semantics. Learn more → https://t.co/F7pgTQdbuo
– Google AI (@GoogleAI) April 13, 2020
An In-depth Walkthrough on Evolution of Neural Machine Translation: https://t.co/FwVesyZ7Wb
This paper looks like a nice overview that summarizes NMT progress over the past few years. pic.twitter.com/3QmtwzPTNV
– Denny Britz (@dennybritz) April 13, 2020
This was a really cool integration to work on.
If you have a model you'd like to visualize in ExBERT, let us know and we'll let you know how to add it.
Hat tip @Ben_Hoov @hen_str @sebgehr for the ExBERT repo https://t.co/k2Gphl6YZp
– Julien Chaumond (@julien_c) April 9, 2020
The key insight is the following: In the small dataset regime, it is all about dataset augmentation. The analog in computer vision is that you get much better results, particularly on small datasets, if you do certain dataset augmentations. This also regularizes the model.
– Tim Dettmers (@Tim_Dettmers) April 8, 2020
The most dramatic performance gain comes from discrete embedding dropout: You embed as usual, but now with a probability p you zero the entire word vector. This is akin to masked language modeling, but the goal is not to predict the mask – just regular LM with uncertain context.
– Tim Dettmers (@Tim_Dettmers) April 8, 2020
The second most important factor is regular input dropout: You take the embeddings and drop out elements with probability p. This also has a data augmentation effect very similar to dropping out random pixels for images. What is a good way to think about this? 1/2
– Tim Dettmers (@Tim_Dettmers) April 8, 2020
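As one reading of that thread, here is a minimal PyTorch sketch of the two regularizers it describes; the module name and probability values are my own illustrative choices, not Dettmers' code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegularizedEmbedding(nn.Module):
    """Sketch of the two regularizers from the thread (assumed implementation):
    discrete embedding dropout zeroes entire word vectors with probability p_word,
    while regular input dropout drops individual embedding elements with p_input."""

    def __init__(self, vocab_size: int, dim: int, p_word: float = 0.1, p_input: float = 0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.p_word = p_word
        self.p_input = p_input

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(token_ids)  # (batch, seq, dim)
        if self.training and self.p_word > 0:
            # Discrete embedding dropout: zero whole word vectors, rescale the rest.
            keep = (torch.rand(x.shape[:-1], device=x.device) > self.p_word).float()
            x = x * keep.unsqueeze(-1) / (1.0 - self.p_word)
        # Regular input dropout on the embedding elements.
        return F.dropout(x, p=self.p_input, training=self.training)
```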
Another Transformer variant with lower computational complexity, suitable for long-range tasks, is Sparse Sinkhorn Attention (https://t.co/qWp2AJVdkd) by Yi Tay et al.
A GitHub Colab reimplementation in PyTorch (https://t.co/B5FcGuTZhy) also combined it with ideas from Reformer. https://t.co/WSwZuSRyPb pic.twitter.com/54fJrRbhEA
– hardmaru (@hardmaru) April 8, 2020
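At the heart of Sparse Sinkhorn Attention is a learned soft permutation of key/value blocks obtained via Sinkhorn normalization. A small sketch of that operator as I understand it; the iteration count and tensor shapes are illustrative, not the authors' code:

```python
import torch

def sinkhorn_sort(logits: torch.Tensor, n_iters: int = 8) -> torch.Tensor:
    """Iteratively normalize rows and columns in log space so that
    exp(logits) approaches a doubly stochastic (soft permutation) matrix."""
    for _ in range(n_iters):
        logits = logits - torch.logsumexp(logits, dim=-1, keepdim=True)  # normalize rows
        logits = logits - torch.logsumexp(logits, dim=-2, keepdim=True)  # normalize columns
    return torch.exp(logits)

# Illustrative usage: a soft permutation over 4 blocks.
perm = sinkhorn_sort(torch.randn(4, 4))
print(perm.sum(dim=0), perm.sum(dim=1))  # both approximately all ones
```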