by PiotrCzapla on 2019-09-01 (UTC).

I see many people spend hours of compute training BERT to get worse results than they would get at a fraction of the cost using ULMFiT or even naive Bayes. Re IMDb: ULMFiT has 95% accuracy and 55M parameters, fewer when SentencePiece & QRNN are used.

— Piotr Czapla (@PiotrCzapla) September 1, 2019
nlp thought
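As a concrete reference point for the ULMFiT baseline mentioned in the tweet above, here is a minimal sketch using fastai's high-level text API (it mirrors the fastai quickstart). The hyperparameters (epochs, learning rate, drop_mult) are illustrative assumptions, not the exact settings behind the quoted 95% / 55M-parameter figure.

```python
# Minimal ULMFiT-style IMDb sentiment baseline with fastai's text API.
# Hyperparameters are illustrative, not the settings behind the 95%
# accuracy figure quoted in the tweet above.
from fastai.text.all import *

# Download IMDb and build DataLoaders from its train/test folder layout.
path = untar_data(URLs.IMDB)
dls = TextDataLoaders.from_folder(path, valid='test')

# AWD-LSTM classifier initialised from Wikitext-103 pretrained weights.
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)

# A few epochs of fine-tuning on a single GPU; the full ULMFiT recipe
# (language-model fine-tuning first, then gradual unfreezing) is what the
# quoted accuracy refers to.
learn.fine_tune(4, 1e-2)
```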
by Smerity on 2019-09-01 (UTC).

I'm incredibly proud that the low compute / low resource AWD-LSTM and QRNN that I helped develop at @SFResearch live on as first class architectures in the @fastdotai community :) https://t.co/nVXplj0L86

— Smerity (@Smerity) September 1, 2019
misc
by Smerity on 2019-09-01 (UTC).

Whilst pretrained weights can be an advantage, they also tie you to someone else's whims. Did they train on a dataset that fits your task? Was your task ever intended? Did their setup have idiosyncrasies that might bite you? Will you hit a finetuning progress dead end?

— Smerity (@Smerity) September 1, 2019
misc
by Smerity on 2019-09-01 (UTC).

I don't want to wake up a half decade from now and realize the many fine brains of our field have only been dedicated to finetuning models that are near impossible to reproduce, and that we've inevitably concentrated the core progress of AI in only a few silos / domains.

— Smerity (@Smerity) September 1, 2019
misc
by Smerity on 2019-09-01 (UTC).

If you enjoy my ranting, you may enjoy my ranted article on this topic.
"Adding ever more engines may help get the plane off the ground...
but that's not the design planes are destined for." https://t.co/aKBLuNpwN0

— Smerity (@Smerity) September 1, 2019
misc
