by PiotrCzapla on 2019-09-01 (UTC).

I see many people spend hours of compute training BERT to get worse results than they would get at a fraction of the cost using ULMFiT or even naive Bayes. Re IMDb: ULMFiT has 95% accuracy and 55M parameters, fewer when SentencePiece & QRNN are used.

— Piotr Czapla (@PiotrCzapla) September 1, 2019
nlp thought
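As a concrete reference point for the ULMFiT baseline mentioned in the tweet above, here is a minimal sketch using fastai's high-level text API (it mirrors the fastai quickstart). The hyperparameters (epochs, learning rate, drop_mult) are illustrative assumptions, not the exact settings behind the quoted 95% / 55M-parameter figure.

```python
# Minimal ULMFiT-style IMDb sentiment baseline with fastai's text API.
# Hyperparameters are illustrative, not the settings behind the 95%
# accuracy figure quoted in the tweet above.
from fastai.text.all import *

# Download IMDb and build DataLoaders from its train/test folder layout.
path = untar_data(URLs.IMDB)
dls = TextDataLoaders.from_folder(path, valid='test')

# AWD-LSTM classifier initialised from Wikitext-103 pretrained weights.
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)

# A few epochs of fine-tuning on a single GPU; the full ULMFiT recipe
# (language-model fine-tuning first, then gradual unfreezing) is what the
# quoted accuracy refers to.
learn.fine_tune(4, 1e-2)
```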
by Smerity on 2019-09-01 (UTC).

I'm incredibly proud that the low compute / low resource AWD-LSTM and QRNN that I helped develop at @SFResearch live on as first class architectures in the @fastdotai community :) https://t.co/nVXplj0L86

— Smerity (@Smerity) September 1, 2019
misc
by Smerity on 2019-09-01 (UTC).

Whilst pretrained weights can be an advantage, they also tie you to someone else's whims. Did they train on a dataset that fits your task? Was your task ever intended? Did their setup have idiosyncrasies that might bite you? Will you hit a finetuning progress dead end?

— Smerity (@Smerity) September 1, 2019
misc
by Smerity on 2019-09-01 (UTC).

I don't want to wake up a half decade from now and realize the many fine brains of our field have only been dedicated to finetuning models that are near impossible to reproduce, and that we've inevitably concentrated the core progress of AI in only a few silos / domains.

— Smerity (@Smerity) September 1, 2019
misc
by Smerity on 2019-09-01 (UTC).

If you enjoy my ranting, you may enjoy my ranted article on this topic.
"Adding ever more engines may help get the plane off the ground...
but that's not the design planes are destined for." https://t.co/aKBLuNpwN0

— Smerity (@Smerity) September 1, 2019
misc
