by ak92501 on 2021-02-26 (UTC).

Investigating the Limitations of the Transformers with
Simple Arithmetic Tasks
pdf: https://t.co/jGGlcZ3l8V
abs: https://t.co/GY8swbXHBu pic.twitter.com/KLpxmQRq6j

— AK (@ak92501) February 26, 2021
research
by Smerity on 2021-02-26 (UTC).

What happens when you mix the SHA-RNN with the SRU, similar to the QRNN? 2.5-10x less training time and darn close to SotA results on the enwik8, WikiText-103, and Billion Word language modeling datasets.
Impressive work from @taolei15949106 at @asapp!
See https://t.co/aNCqhTLnn6 https://t.co/eD3mWPJnwo

— Smerity (@Smerity) February 26, 2021
research nlp
by katecrawford on 2021-02-25 (UTC).

This is beyond concerning - it's totally inappropriate. It structurally undermines the integrity of research. https://t.co/BYRepnqY5z

— Kate Crawford (@katecrawford) February 25, 2021
ethics misc
by PyTorch on 2021-02-25 (UTC).

FairScale, a PyTorch extension for efficient large scale training, is releasing FullyShardedDataParallel, which shards model params across GPUs (+offload to CPU). Details: https://t.co/xshPfLeXyr. Inspired by DeepSpeed/@MSFTResearch, and made by @myleott @m1nxu @sam_shleifer pic.twitter.com/1ICMsJwtUP

— PyTorch (@PyTorch) February 25, 2021
pytorch tool
by jburnmurdoch on 2021-02-25 (UTC).

NEW: it’s a while since I’ve done a big international Covid thread, but this one feels important.

The first six weeks of 2021 have gone rather well in terms of humanity’s fight against Covid.

As well as the rollout of vaccines, global cases halved(!) between Jan 11 and Feb 18 pic.twitter.com/bnoxNkUZsu

— John Burn-Murdoch (@jburnmurdoch) February 25, 2021
dataviz
In a group with 90 other tweets.
by jekbradbury on 2021-02-25 (UTC).

Something like half the appendix of the DALL-E paper (https://t.co/fIBdsdA3lQ) describes work the authors had to do on GPUs that they wouldn't have had to do on TPUs:
- scaling fp16 mixed precision
- reducing gradient all-reduce comms w/ PowerSGD
- manual optimizer sharding

— James Bradbury (@jekbradbury) February 25, 2021
research
In a group with 7 other tweets.
by hardmaru on 2021-02-25 (UTC).

Hierarchical variational autoencoders are getting more powerful every day. This paper looks at ways to convert a VAE into an image completion generative model. It seems we no longer need GANs or adversarial losses for this level of realism anymore? https://t.co/1pL8QTAKsC https://t.co/wNJPh9CdN4

— hardmaru (@hardmaru) February 25, 2021
research cv
by ak92501 on 2021-02-25 (UTC).

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
pdf: https://t.co/zq8nL4Kpab
abs: https://t.co/BuHoKo502s
github: https://t.co/b1SirihVe6 pic.twitter.com/ijv8bVQCbj

— AK (@ak92501) February 25, 2021
research cv
by ak92501 on 2021-02-25 (UTC).

Zero-Shot Text-to-Image Generation
pdf: https://t.co/1jrZu1ibgE
abs: https://t.co/chlumVCi7H pic.twitter.com/DlQZkvDSiZ

— AK (@ak92501) February 25, 2021
research cv
by thomasp85 on 2021-02-24 (UTC).

I've added a quick intro to ggfx for those curious about how to use it https://t.co/PNegGjQy9I

— Thomas Lin Pedersen (@thomasp85) February 24, 2021
rstats tutorial learning tool
by ak92501 on 2021-02-24 (UTC).

DALL-E code & notebook
github: https://t.co/KW8Rl9lbes pic.twitter.com/Avv9dUqe7K

— AK (@ak92501) February 24, 2021
research w_code cv
In a group with 7 other tweets.
by radekosmulski on 2021-02-24 (UTC).

✅ feeling lost 98% of the time and not knowing if your approach makes sense
✅ identifying and trusting good advice among so much noise
✅ finding the time to study when being a parent, a student, an employee
✅ learning how to pick projects to work on for self-study

— Radek Osmulski (@radekosmulski) February 24, 2021
misc thought