by Smerity on 2019-02-15 (UTC).

The work that caused the kerfuffle was a large scale language model from @OpenAI. Think of it as a super powered version of the predictive text on your phone that has read more data and can generate fairly coherent text. https://t.co/NFnJe5HFlp

— Smerity (@Smerity) February 15, 2019
nlp thought
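Smerity's "predictive text" framing above can be made concrete: a language model repeatedly predicts a next word from what it has generated so far. The toy bigram table below is a hypothetical stand-in, not OpenAI's code; GPT-2 differs mainly in using a large neural network (and far more data) to produce those next-word probabilities.

```python
import random

# Toy next-word probability table (purely illustrative).
BIGRAMS = {
    "the":   {"model": 0.6, "text": 0.4},
    "model": {"reads": 0.7, "writes": 0.3},
    "reads": {"the": 1.0},
    "text":  {"ends": 1.0},
}

def generate(start, max_words=5, seed=0):
    """Autoregressive loop: sample a next word, append, repeat."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(max_words):
        choices = BIGRAMS.get(words[-1])
        if not choices:  # no known continuation: stop
            break
        nxt = rng.choices(list(choices), weights=list(choices.values()))[0]
        words.append(nxt)
    return " ".join(words)
```

The same loop, with a neural model in place of `BIGRAMS`, is what produces the "fairly coherent text" in the GPT-2 samples.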
by Smerity on 2019-02-15 (UTC).

As everyone has a different point of view, it's just collisions everywhere :S
- When does a model go from "safe" to "dual use"?
- How much of a "dual use" delay do we need to add?
- Should we release to journalists first or researchers?
- How can small labs participate in PR?

— Smerity (@Smerity) February 15, 2019
thought
by Smerity on 2019-02-15 (UTC).

Releasing a "restricted" model in this way has other (intentional or not) consequences - primarily that "AI is scary" is cat nip to reporters. Hence by acting like a good dual use citizen you can accidentally provoke the AI hype beast. https://t.co/B41sqFJ9sh

— Smerity (@Smerity) February 15, 2019
thought
by deliprao on 2019-02-15 (UTC).

What started off as “oh nice! Impressive results” regarding today’s OpenAI announcements has turned into deep cynicism and mistrust because of their way of handling this matter. I think it is a research win and policy failure.

— Delip Rao (@deliprao) February 15, 2019
misc
by GaryMarcus on 2019-02-16 (UTC).

completely agree: remarkable ≠ semantically coherent. system has no idea that a unicorn has one horn, blithely asserts it has four, etc. simultaneously close and not close to Searle’s room, not at all close to genuine understanding. https://t.co/eVpDUsi6B3

— Gary Marcus (@GaryMarcus) February 16, 2019
misc
by AlecRad on 2019-02-17 (UTC).

By the way - I think a valid (if extreme) take on GPT-2 is "lol you need 10,000x the data, 1 billion parameters, and a supercomputer to get current DL models to generalize to Penn Treebank."

— Alec Radford (@AlecRad) February 17, 2019
thought nlp
by jjvincent on 2019-02-17 (UTC).

.@zacharylipton with a lucid writeup of the publication / reception of openAI’s language modelling research last week https://t.co/689hFl8NNW

— James Vincent (@jjvincent) February 17, 2019
nlp
by AlecRad on 2019-02-19 (UTC).

Those samples use a different technique than the ones shown in the blog. The samples you are looking at are temperature=1. We use top_k=40. Unconditional samples with that are here: https://t.co/OxQBnCc6mA
It's also important to note that conditioning on "real" text helps too.

— Alec Radford (@AlecRad) February 19, 2019
nlp
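Radford's distinction above between temperature=1 and top_k=40 sampling is worth unpacking: temperature rescales the model's logits before sampling, while top-k truncates the distribution to the k most likely tokens, which suppresses low-probability nonsense. A minimal sketch (the function name and toy logits are illustrative, not from the GPT-2 codebase):

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None):
    """Sample a token id from raw model logits.

    temperature=1.0 samples the unmodified distribution; top_k=40
    (the setting used for the blog-post samples) restricts sampling
    to the 40 highest-scoring tokens.
    """
    logits = np.asarray(logits, dtype=np.float64) / temperature
    if top_k is not None:
        # Mask out everything below the k-th best logit.
        kth_best = np.sort(logits)[-top_k]
        logits = np.where(logits < kth_best, -np.inf, logits)
    # Softmax over the (possibly truncated) logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```

With `top_k=1` this degenerates to greedy decoding; larger k trades diversity against coherence, which is why top_k=40 samples read noticeably better than temperature=1 samples.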
by seb_ruder on 2019-02-19 (UTC).

Dear OpenAI: Please Open Source Your Language Model
Nice piece by @gradientpub's @hughbzhang that makes the case that "withholding the full GPT-2 model is both unnecessary for safety reasons and detrimental to future progress in AI". https://t.co/v0BB3bYdKf

— Sebastian Ruder (@seb_ruder) February 19, 2019
thought
by jeremyphoward on 2019-02-26 (UTC).

This is why systems like GPT-2 will be very effective at influencing discourse in places where close reading is not the norm (e.g. many social media discussions of politically charged topics) https://t.co/wRepSV34mX

— Jeremy Howard (@jeremyphoward) February 26, 2019
misc
by marian_nmt on 2019-02-27 (UTC).

Another comment on the GPT-2 data: the WMT 2019 training data this year for English-German consists of 28GB of English and 58GB(!!!) of German plain text news data with document boundaries. So, similar to @OpenAI Webtext, news-domain but bilingual: https://t.co/EHOD3ZvGL7

— Marian NMT (@marian_nmt) February 27, 2019
dataset nlp
by jeremyphoward on 2019-02-27 (UTC).

We took a quick look at whether you can do something like @OpenAI GPT2 with far less resources. @GuggerSylvain trained a model on a single GPU for 20 hours. Here's the 1st response for the 1st thing we tried.

(More details coming once we've done more research.) pic.twitter.com/VuCW68MtI1

— Jeremy Howard (@jeremyphoward) February 27, 2019
nlp research