Ceshine's Data Science Tweet Collection

by ak92501 on 2021-02-26 (UTC).

Investigating the Limitations of the Transformers with
Simple Arithmetic Tasks
pdf: https://t.co/jGGlcZ3l8V
abs: https://t.co/GY8swbXHBu pic.twitter.com/KLpxmQRq6j
— AK (@ak92501) February 26, 2021

research

by Smerity on 2021-02-26 (UTC).

What happens when you mix the SHA-RNN with the SRU, similar to the QRNN? 2.5-10x less training time and darn close to SotA results on the enwik8, WikiText-103, and Billion Word language modeling datasets.
Impressive work from @taolei15949106 at @asapp!
See https://t.co/aNCqhTLnn6 https://t.co/eD3mWPJnwo
— Smerity (@Smerity) February 26, 2021

research nlp

by katecrawford on 2021-02-25 (UTC).

This is beyond concerning - it's totally inappropriate. It structurally undermines the integrity of research. https://t.co/BYRepnqY5z
— Kate Crawford (@katecrawford) February 25, 2021

ethics misc

by PyTorch on 2021-02-25 (UTC).

FairScale, a PyTorch extension for efficient large scale training, is releasing FullyShardedDataParallel, which shards model params across GPUs (+offload to CPU). Details: https://t.co/xshPfLeXyr. Inspired by DeepSpeed/@MSFTResearch, and made by @myleott @m1nxu @sam_shleifer pic.twitter.com/1ICMsJwtUP
— PyTorch (@PyTorch) February 25, 2021

pytorch tool

by jburnmurdoch on 2021-02-25 (UTC).

NEW: it’s a while since I’ve done a big international Covid thread, but this one feels important.

The first six weeks of 2021 have gone rather well in terms of humanity’s fight against Covid.

As well as the rollout of vaccines, global cases halved(!) between Jan 11 and Feb 18 pic.twitter.com/bnoxNkUZsu
— John Burn-Murdoch (@jburnmurdoch) February 25, 2021

dataviz

by jekbradbury on 2021-02-25 (UTC).

Something like half the appendix of the DALL-E paper (https://t.co/fIBdsdA3lQ) describes work the authors had to do on GPUs that they wouldn't have had to do on TPUs:
- scaling fp16 mixed precision
- reducing gradient all-reduce comms w/ PowerSGD
- manual optimizer sharding
— James Bradbury (@jekbradbury) February 25, 2021

research

by hardmaru on 2021-02-25 (UTC).

Hierarchical variational autoencoders are getting more powerful every day. This paper looks at ways to convert a VAE into an image completion generative model. It seems we no longer need GANs or adversarial losses for this level of realism anymore? https://t.co/1pL8QTAKsC https://t.co/wNJPh9CdN4
— hardmaru (@hardmaru) February 25, 2021

research cv

by ak92501 on 2021-02-25 (UTC).

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
pdf: https://t.co/zq8nL4Kpab
abs: https://t.co/BuHoKo502s
github: https://t.co/b1SirihVe6 pic.twitter.com/ijv8bVQCbj
— AK (@ak92501) February 25, 2021

research cv

by ak92501 on 2021-02-25 (UTC).

Zero-Shot Text-to-Image Generation
pdf: https://t.co/1jrZu1ibgE
abs: https://t.co/chlumVCi7H pic.twitter.com/DlQZkvDSiZ
— AK (@ak92501) February 25, 2021

research cv

by thomasp85 on 2021-02-24 (UTC).

I've added a quick intro to ggfx for those curious about how to use it https://t.co/PNegGjQy9I
— Thomas Lin Pedersen (@thomasp85) February 24, 2021

rstats tutorial learning tool

by ak92501 on 2021-02-24 (UTC).

DALL-E code & notebook
github: https://t.co/KW8Rl9lbes pic.twitter.com/Avv9dUqe7K
— AK (@ak92501) February 24, 2021

research w_code cv

by radekosmulski on 2021-02-24 (UTC).

✅ feeling lost 98% of the time and not knowing if your approach makes sense
✅ identifying and trusting good advice among so much noise
✅ finding the time to study when being a parent, a student, an employee
✅ learning how to pick projects to work on for self-study
— Radek Osmulski (@radekosmulski) February 24, 2021

misc thought
