by evolvingstuff on 2019-01-10 (UTC).

REALLY cool improvement upon Transformer networks that makes use of recurrence and relative positional encodings! Up to 1800x faster!

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context https://t.co/eV4iy1kOPT

TensorFlow & PyTorch: https://t.co/MqZAZKlhEn pic.twitter.com/TjrjOomYnb

— Thomas Lahore (@evolvingstuff) January 10, 2019
nlp w_code research
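The recurrence idea behind Transformer-XL is simple to state: hidden states computed for the previous segment are cached and prepended as extra context for the current one, with no gradient flowing back into the cache. Below is a minimal single-head sketch of that idea in PyTorch. It is not the released implementation; the names (attend_with_memory, mem, seg_len, mem_len) are illustrative, and multi-head attention, layer norm, and the relative positional terms are all omitted.

import torch

def attend_with_memory(h, mem, w_q, w_k, w_v):
    # h:   (seg_len, d_model) hidden states of the current segment
    # mem: (mem_len, d_model) cached hidden states of the previous segment
    context = torch.cat([mem.detach(), h], dim=0)  # reuse the cache, but stop gradients into it
    q = h @ w_q                                    # queries come only from the new segment
    k = context @ w_k                              # keys and values span memory + new segment
    v = context @ w_v
    attn = torch.softmax(q @ k.t() / k.shape[1] ** 0.5, dim=-1)
    return attn @ v

d_model, seg_len, mem_len = 16, 4, 4
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
mem = torch.zeros(mem_len, d_model)
for _ in range(3):                                 # slide over three consecutive segments
    h = torch.randn(seg_len, d_model)
    out = attend_with_memory(h, mem, w_q, w_k, w_v)
    mem = h[-mem_len:]                             # carry this segment's last states forward as the new cache

The caching is also where the large evaluation speedup comes from: instead of recomputing representations for the whole context from scratch at every step, the model reuses the cached segment states.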
by hardmaru on 2019-01-11 (UTC).

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Up to 1800x faster than vanilla Transformer during evaluation. New SoTA results on Wikipedia (enwik8, text8, WikiText-103), One Billion Words, and Penn Treebank. 🔥 https://t.co/xmpTB43t29

— hardmaru (@hardmaru) January 11, 2019
research nlp
by hardmaru on 2019-01-11 (UTC).

They also released PyTorch and TF implementations with pretrained models: https://t.co/VrE2nR5wBY

— hardmaru (@hardmaru) January 11, 2019
nlp w_code research
by jeremyphoward on 2019-01-11 (UTC).

Yeah I think it says a lot more about the problems of academic review than problems with the paper

— Jeremy Howard (@jeremyphoward) January 11, 2019
nlp research
by Smerity on 2019-01-11 (UTC).

Excited for the Transformer-XL codebase! It also extends my AWD-LSTM `https://t.co/F18vD0XbfY` script to download the One Billion Words + text8 datasets (original grabbed WikiText-2, WikiText-103, enwik8 and PTBC) whilst keeping the most important part ;) https://t.co/aWkksPxmdr pic.twitter.com/6UaO6O2BDh

— Smerity (@Smerity) January 11, 2019
nlp w_code research
by hardmaru on 2019-01-11 (UTC).

Compared to RNNs, the Transformer family of architectures seemingly scales to hundreds of millions of parameters with relative ease. pic.twitter.com/Xo7Ng5b2Gj

— hardmaru (@hardmaru) January 11, 2019
nlp research
by hardmaru on 2019-01-17 (UTC).

Transformer-XL: Combining Transformers and RNNs Into a State-of-the-art Language Model

Blog post by @HorevRani giving an overview of the model and key concepts such as the recurrence mechanism and the relative positional encoding scheme. https://t.co/ORv18GkZBv https://t.co/l1OJKvUNyc

— hardmaru (@hardmaru) January 17, 2019
learning
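For readers who want the core equation behind that relative scheme: following the paper's notation, the attention score between a query at position i and a key at position j decomposes into four terms,

A^{\mathrm{rel}}_{i,j} = \underbrace{E_{x_i}^\top W_q^\top W_{k,E}\, E_{x_j}}_{\text{(a) content}} + \underbrace{E_{x_i}^\top W_q^\top W_{k,R}\, R_{i-j}}_{\text{(b) content-dependent position}} + \underbrace{u^\top W_{k,E}\, E_{x_j}}_{\text{(c) global content bias}} + \underbrace{v^\top W_{k,R}\, R_{i-j}}_{\text{(d) global position bias}}

where R_{i-j} is a sinusoidal encoding of the relative distance only, and u and v are learned vectors that replace the absolute-position query terms. Because no term in the score depends on absolute positions, the cached hidden states from earlier segments remain valid when they are reused.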
