by Thom_Wolf on 2019-05-03 (UTC).

First, there was growing evidence that beam-search is highly sensitive to the length of the output. Best results are obtained when the output length is predicted from the input before decoding (https://t.co/imPuU2t6hp, https://t.co/WkxOsscAe7 at EMNLP 2018) [4/9]

— Thomas Wolf (@Thom_Wolf) May 3, 2019
nlp research
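
The length sensitivity mentioned above is commonly mitigated by length-normalizing beam scores before picking a winner. Below is a minimal sketch of that idea (my own illustration, not code from the thread or the linked papers), using the GNMT-style length penalty of Wu et al. (2016); the beams, raw scores, and alpha value are all hypothetical.

def length_penalty(length, alpha=0.6):
    """GNMT-style length penalty: ((5 + length) / 6) ** alpha."""
    return ((5.0 + length) / 6.0) ** alpha

def rerank(candidates, alpha=0.6):
    """Sort (token_ids, sum_logprob) beams by length-normalized log-probability."""
    return sorted(
        candidates,
        key=lambda c: c[1] / length_penalty(len(c[0]), alpha),
        reverse=True,
    )

if __name__ == "__main__":
    # Hypothetical beams: raw scores favor the short one (-4.0 > -4.8),
    # but length normalization flips the ranking toward the longer beam.
    beams = [([1, 2, 3], -4.0), ([1, 2, 3, 4, 5, 6], -4.8)]
    print(rerank(beams))

Without the penalty, plain log-probability sums systematically prefer shorter outputs, which is exactly the sensitivity the tweet points at.
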
by Thom_Wolf on 2019-05-03 (UTC).

Last in this recent trend of work is https://t.co/D3IOZ8CMNQ in which @universeinanegg & co show that the distribution of words in BS/greedy decoded texts is very different from the one in human texts.
Clearly BS/greedy fail to reproduce distributional aspects of human text [7/9] pic.twitter.com/7blkmtLPjB

— Thomas Wolf (@Thom_Wolf) May 3, 2019
research nlp
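
A simple way to see the kind of mismatch the tweet describes is to compare empirical word distributions directly. The sketch below is my own toy illustration (not the paper's method or data): it computes the total variation distance between the unigram distributions of a human-written string and a repetitive, greedy-style decoded string; both strings are invented examples.

from collections import Counter

def word_distribution(text):
    """Relative frequency of each word in a text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two word distributions."""
    vocab = set(p) | set(q)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in vocab)

if __name__ == "__main__":
    human = "the cat sat on the mat and the dog slept by the door"
    decoded = "the the the cat cat sat sat on the the mat mat"  # repetitive
    print(total_variation(word_distribution(human), word_distribution(decoded)))
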
by Thom_Wolf on 2019-05-03 (UTC).

Finally, here is a gist showing how to code top-k and nucleus sampling in PyTorch: https://t.co/aDOlWLI3aq
[9/9]

— Thomas Wolf (@Thom_Wolf) May 3, 2019
nlp
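
For reference, here is a condensed sketch of top-k and nucleus (top-p) filtering in PyTorch, written in the spirit of the linked gist but not a verbatim copy of it; the random logits and GPT-2-sized vocabulary in the usage example are assumptions for illustration.

import torch
import torch.nn.functional as F

def top_k_top_p_filtering(logits, top_k=0, top_p=0.0, filter_value=-float("inf")):
    """Filter a 1D tensor of next-token logits with top-k and/or nucleus filtering."""
    # Top-k: keep only the k highest logits.
    if top_k > 0:
        top_k = min(top_k, logits.size(-1))
        threshold = torch.topk(logits, top_k).values[-1]
        logits[logits < threshold] = filter_value
    # Nucleus (top-p): keep the smallest set of tokens whose
    # cumulative probability exceeds top_p.
    if top_p > 0.0:
        sorted_logits, sorted_indices = torch.sort(logits, descending=True)
        cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)
        # Mask tokens beyond the nucleus, shifted right so the first
        # token above the threshold is always kept.
        sorted_mask = cumulative_probs > top_p
        sorted_mask[1:] = sorted_mask[:-1].clone()
        sorted_mask[0] = False
        logits[sorted_indices[sorted_mask]] = filter_value
    return logits

if __name__ == "__main__":
    logits = torch.randn(50257)  # assumed GPT-2 vocabulary size
    filtered = top_k_top_p_filtering(logits.clone(), top_k=50, top_p=0.9)
    next_token = torch.multinomial(F.softmax(filtered, dim=-1), num_samples=1)
    print(next_token.item())

Tokens set to -inf get zero probability after the softmax, so sampling only ever draws from the surviving top-k/nucleus set.
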
