by Smerity on 2019-02-15 (UTC).

The work that caused the kerfuffle was a large scale language model from @OpenAI. Think of it as a super powered version of the predictive text on your phone that has read more data and can generate fairly coherent text. https://t.co/NFnJe5HFlp

— Smerity (@Smerity) February 15, 2019
nlp thought
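Smerity's "predictive text" framing above can be made concrete: a language model repeatedly predicts a next word from what it has generated so far. The toy bigram table below is a hypothetical stand-in, not OpenAI's code; GPT-2 differs mainly in using a large neural network (and far more data) to produce those next-word probabilities.

```python
import random

# Toy next-word probability table (purely illustrative).
BIGRAMS = {
    "the":   {"model": 0.6, "text": 0.4},
    "model": {"reads": 0.7, "writes": 0.3},
    "reads": {"the": 1.0},
    "text":  {"ends": 1.0},
}

def generate(start, max_words=5, seed=0):
    """Autoregressive loop: sample a next word, append, repeat."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(max_words):
        choices = BIGRAMS.get(words[-1])
        if not choices:  # no known continuation: stop
            break
        nxt = rng.choices(list(choices), weights=list(choices.values()))[0]
        words.append(nxt)
    return " ".join(words)
```

The same loop, with a neural model in place of `BIGRAMS`, is what produces the "fairly coherent text" in the GPT-2 samples.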
by Smerity on 2019-02-15 (UTC).

As everyone has a different point of view, it's just collisions everywhere :S
- When does a model go from "safe" to "dual use"?
- How much of a "dual use" delay do we need to add?
- Should we release to journalists first or researchers?
- How can small labs participate in PR?

— Smerity (@Smerity) February 15, 2019
thought
by Smerity on 2019-02-15 (UTC).

Releasing a "restricted" model in this way has other (intentional or not) consequences - primarily that "AI is scary" is cat nip to reporters. Hence by acting like a good dual use citizen you can accidentally provoke the AI hype beast. https://t.co/B41sqFJ9sh

— Smerity (@Smerity) February 15, 2019
thought
by deliprao on 2019-02-15 (UTC).

What started off as “oh nice! Impressive results” regarding today’s OpenAI announcements has turned into deep cynicism and mistrust because of their way of handling this matter. I think it is a research win and policy failure.

— Delip Rao (@deliprao) February 15, 2019
misc
by GaryMarcus on 2019-02-16 (UTC).

completely agree: remarkable ≠ semantically coherent. system has no idea that a unicorn has one horn, blithely asserts it has four, etc. simultaneously close and not close to Searle’s room, not at all close to genuine understanding. https://t.co/eVpDUsi6B3

— Gary Marcus (@GaryMarcus) February 16, 2019
misc
by AlecRad on 2019-02-17 (UTC).

By the way - I think a valid (if extreme) take on GPT-2 is "lol you need 10,000x the data, 1 billion parameters, and a supercomputer to get current DL models to generalize to Penn Treebank."

— Alec Radford (@AlecRad) February 17, 2019
thought nlp
by jjvincent on 2019-02-17 (UTC).

.@zacharylipton with a lucid writeup of the publication / reception of openAI’s language modelling research last week https://t.co/689hFl8NNW

— James Vincent (@jjvincent) February 17, 2019
nlp
by AlecRad on 2019-02-19 (UTC).

Those samples use a different technique than the ones shown in the blog. The samples you are looking at are temperature=1. We use top_k=40. Unconditional samples with that are here: https://t.co/OxQBnCc6mA
It's also important to note that conditioning on "real" text helps too.

— Alec Radford (@AlecRad) February 19, 2019
nlp
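Radford's distinction above between temperature=1 and top_k=40 sampling is worth unpacking: temperature rescales the model's logits before sampling, while top-k truncates the distribution to the k most likely tokens, which suppresses low-probability nonsense. A minimal sketch (the function name and toy logits are illustrative, not from the GPT-2 codebase):

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None):
    """Sample a token id from raw model logits.

    temperature=1.0 samples the unmodified distribution; top_k=40
    (the setting used for the blog-post samples) restricts sampling
    to the 40 highest-scoring tokens.
    """
    logits = np.asarray(logits, dtype=np.float64) / temperature
    if top_k is not None:
        # Mask out everything below the k-th best logit.
        kth_best = np.sort(logits)[-top_k]
        logits = np.where(logits < kth_best, -np.inf, logits)
    # Softmax over the (possibly truncated) logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```

With `top_k=1` this degenerates to greedy decoding; larger k trades diversity against coherence, which is why top_k=40 samples read noticeably better than temperature=1 samples.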
by seb_ruder on 2019-02-19 (UTC).

Dear OpenAI: Please Open Source Your Language Model
Nice piece by @gradientpub's @hughbzhang that makes the case that "withholding the full GPT-2 model is both unnecessary for safety reasons and detrimental to future progress in AI". https://t.co/v0BB3bYdKf

— Sebastian Ruder (@seb_ruder) February 19, 2019
thought
by jeremyphoward on 2019-02-26 (UTC).

This is why systems like GPT-2 will be very effective at influencing discourse in places where close reading is not the norm (e.g. many social media discussions of politically charged topics) https://t.co/wRepSV34mX

— Jeremy Howard (@jeremyphoward) February 26, 2019
misc
by marian_nmt on 2019-02-27 (UTC).

Another comment on the GPT-2 data: the WMT 2019 training data this year for English-German consists of 28GB of English and 58GB(!!!) of German plain text news data with document boundaries. So, similar to @OpenAI Webtext, news-domain but bilingual: https://t.co/EHOD3ZvGL7

— Marian NMT (@marian_nmt) February 27, 2019
dataset nlp
by jeremyphoward on 2019-02-27 (UTC).

We took a quick look at whether you can do something like @OpenAI GPT2 with far less resources. @GuggerSylvain trained a model on a single GPU for 20 hours. Here's the 1st response for the 1st thing we tried.

(More details coming once we've done more research.) pic.twitter.com/VuCW68MtI1

— Jeremy Howard (@jeremyphoward) February 27, 2019
nlp research