"The Future of Natural Language Processing" https://t.co/PKl4cSSlZ8 from @huggingface, well done quick summary of recent NLP work and lots of good pointers! π
β Andrej Karpathy (@karpathy) April 24, 2020
"The Future of Natural Language Processing" https://t.co/PKl4cSSlZ8 from @huggingface, well done quick summary of recent NLP work and lots of good pointers! π
β Andrej Karpathy (@karpathy) April 24, 2020
Introducing a scalable approach to reducing gender bias in #GoogleTranslate, with applications to new translation pairs, including Finnish, Hungarian, and Persian-to-English. Check it out → https://t.co/xyg2OqEuSP
– Google AI (@GoogleAI) April 22, 2020
Microsoft's DialoGPT helps build versatile, engaging and natural open-domain conversational agents. Explore the source code and trained model: https://t.co/TQhUh5oYAg
– Microsoft Research (@MSFTResearch) April 22, 2020
Decoding functionality from @HuggingFace: https://t.co/CPuxxLms0R https://t.co/Yks5vwcl26
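The decoding functionality linked above presumably refers to the text-generation utilities in 🤗 transformers (generate() with sampling). As a loose illustration only, where the DialoGPT checkpoint name, prompt, and sampling settings are my own assumptions rather than anything from the tweet, decoding a conversational reply could look like this:

```python
# Assumed example: decoding a DialoGPT reply with transformers' generate().
# The checkpoint name, prompt, and sampling settings are illustrative choices.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Encode one user turn, terminated by the end-of-sequence token.
input_ids = tokenizer.encode("Does money buy happiness?" + tokenizer.eos_token,
                             return_tensors="pt")

# Sample a reply with top-k / nucleus sampling.
reply_ids = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(reply_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```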
New Python v0.7.0 of 🤗 tokenizers is out, with:
- Reduced memory usage by 70%
- Rock-solid offsets/alignments, working even with byte-level BPE
- And so much more!
And soon all of this in 🤗 transformers too!
pip install tokenizers https://t.co/wONh4qZSlF pic.twitter.com/Io1kmLE63M
– Anthony MOI (@moi_anthony) April 17, 2020
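A minimal sketch of using that release, assuming a local training corpus; the file path, vocabulary size, and sample sentence below are placeholders of mine:

```python
# pip install tokenizers
# Sketch: train a byte-level BPE tokenizer and inspect tokens plus their offsets.
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(files=["corpus.txt"], vocab_size=20_000, min_frequency=2)  # placeholder corpus

encoding = tokenizer.encode("Hello, world!")
print(encoding.tokens)   # sub-word tokens
print(encoding.offsets)  # character offsets back into the original string
```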
I'm excited to announce XTREME, a new benchmark that covers 9 tasks and 40 typologically diverse languages.
Paper: https://t.co/ZjBIYK6QcX
Blog post: https://t.co/L0SiDRRHMX
Code: https://t.co/QEmw5ZGHoN pic.twitter.com/YVo0T9gT63
– Sebastian Ruder (@seb_ruder) April 13, 2020
Announcing XTREME, a new #NaturalLanguageProcessing benchmark for cross-lingual generalization, which covers 40 typologically diverse languages using nine tasks that collectively require reasoning about different levels of syntax or semantics. Learn more → https://t.co/F7pgTQdbuo
– Google AI (@GoogleAI) April 13, 2020
An In-depth Walkthrough on Evolution of Neural Machine Translation: https://t.co/FwVesyZ7Wb
This paper looks like a nice overview that summarizes NMT progress over the past few years. pic.twitter.com/3QmtwzPTNV
– Denny Britz (@dennybritz) April 13, 2020
This was a really cool integration to work on.
If you have a model you'd like to visualize in ExBERT, let us know and we'll let you know how to add it.
Hat tip @Ben_Hoov @hen_str @sebgehr for the ExBERT repo https://t.co/k2Gphl6YZp
– Julien Chaumond (@julien_c) April 9, 2020
The key insight is the following: In the small dataset regime, it is all about dataset augmentation. The analog in computer vision is that you get much better results, particularly on small datasets, if you do certain dataset augmentations. This also regularizes the model.
– Tim Dettmers (@Tim_Dettmers) April 8, 2020
The most dramatic performance gain comes from discrete embedding dropout: You embed as usual, but now with a probability p you zero the entire word vector. This is akin to masked language modeling, but the goal is not to predict the mask – just regular LM with uncertain context.
– Tim Dettmers (@Tim_Dettmers) April 8, 2020
The second most important factor is regular input dropout: You take the embeddings and drop out elements with probability p. This also has a data augmentation effect very similar to dropping out random pixels for images. What is a good way to think about this? 1/2
– Tim Dettmers (@Tim_Dettmers) April 8, 2020
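As one reading of that thread, here is a minimal PyTorch sketch of the two regularizers it describes; the module name and probability values are my own illustrative choices, not Dettmers' code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegularizedEmbedding(nn.Module):
    """Sketch of the two regularizers from the thread (assumed implementation):
    discrete embedding dropout zeroes entire word vectors with probability p_word,
    while regular input dropout drops individual embedding elements with p_input."""

    def __init__(self, vocab_size: int, dim: int, p_word: float = 0.1, p_input: float = 0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.p_word = p_word
        self.p_input = p_input

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(token_ids)  # (batch, seq, dim)
        if self.training and self.p_word > 0:
            # Discrete embedding dropout: zero whole word vectors, rescale the rest.
            keep = (torch.rand(x.shape[:-1], device=x.device) > self.p_word).float()
            x = x * keep.unsqueeze(-1) / (1.0 - self.p_word)
        # Regular input dropout on the embedding elements.
        return F.dropout(x, p=self.p_input, training=self.training)
```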
Another Transformer variant with lower computational complexity, suitable for long-range tasks, is Sparse Sinkhorn Attention (https://t.co/qWp2AJVdkd) by Yi Tay et al.
A GitHub Colab reimplementation in PyTorch (https://t.co/B5FcGuTZhy) also combined it with ideas from Reformer. https://t.co/WSwZuSRyPb pic.twitter.com/54fJrRbhEA
– hardmaru (@hardmaru) April 8, 2020
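At the heart of Sparse Sinkhorn Attention is a learned soft permutation of key/value blocks obtained via Sinkhorn normalization. A small sketch of that operator as I understand it; the iteration count and tensor shapes are illustrative, not the authors' code:

```python
import torch

def sinkhorn_sort(logits: torch.Tensor, n_iters: int = 8) -> torch.Tensor:
    """Iteratively normalize rows and columns in log space so that
    exp(logits) approaches a doubly stochastic (soft permutation) matrix."""
    for _ in range(n_iters):
        logits = logits - torch.logsumexp(logits, dim=-1, keepdim=True)  # normalize rows
        logits = logits - torch.logsumexp(logits, dim=-2, keepdim=True)  # normalize columns
    return torch.exp(logits)

# Illustrative usage: a soft permutation over 4 blocks.
perm = sinkhorn_sort(torch.randn(4, 4))
print(perm.sum(dim=0), perm.sum(dim=1))  # both approximately all ones
```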