They trained a model to generate the model parameters of the encoder and decoder LSTMs used for multilingual machine translation! https://t.co/B5kTkZ00HD
— hardmaru (@hardmaru) August 28, 2018
Instead of predicting one step at a time, can we train models that predict only future events they are confident about? Predict the predictable moments in an unpredictable system. These are useful for control.
— Sergey Levine (@svlevine) August 28, 2018
Time agnostic prediction (D. Jayaraman et al): https://t.co/W6TlXqCraX pic.twitter.com/oLEsZPTLE4
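For intuition, here is a minimal sketch of the "predict only the predictable moments" idea: rather than penalising the prediction against one fixed future time step, the loss keeps only the best-matching future frame. The tensor shapes and the absence of any predictor network are simplifications for illustration, not the paper's setup.

```python
import torch

def time_agnostic_loss(predicted_frame, future_frames):
    """Minimum-over-time loss: score one predicted frame against every
    candidate future frame and keep only the best match, so the model is
    rewarded for predicting whichever moment it finds most predictable.

    predicted_frame: (batch, C, H, W)    single predicted "bottleneck" frame
    future_frames:   (batch, T, C, H, W) T ground-truth future frames
    """
    # Per-example reconstruction error against each future time step
    errors = ((future_frames - predicted_frame.unsqueeze(1)) ** 2).mean(dim=(2, 3, 4))  # (B, T)
    # Keep only the lowest error per example: the "predictable moment"
    best_error, best_t = errors.min(dim=1)
    return best_error.mean(), best_t

# Example call with random tensors standing in for video frames
loss, matched_t = time_agnostic_loss(torch.randn(4, 3, 32, 32), torch.randn(4, 10, 3, 32, 32))
```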
#EMNLP2018 paper by @eaplatanios on Context Sensitive Parameter Generation for Universal NMT! https://t.co/93mLfECxnC
— Graham Neubig (@gneubig) August 28, 2018
Inspired by @hardmaru's HyperNetworks, we learn to generate NMT parameters for the languages we want to translate. Nice results on multilingual and zero-shot MT! pic.twitter.com/QOYrSwoPMq
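Both tweets describe the same pattern: a hypernetwork-style generator emits the translation model's parameters conditioned on the language. A toy sketch of that pattern, with a single linear layer standing in for the encoder/decoder LSTMs (the sizes and module names here are illustrative, not the paper's):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParameterGenerator(nn.Module):
    """Toy contextual parameter generator: a language embedding is mapped to
    the weights of a target layer, so one generator can emit parameters for
    every language pair."""

    def __init__(self, num_languages, lang_dim, in_dim, out_dim):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        self.lang_embedding = nn.Embedding(num_languages, lang_dim)
        # Emits a flat vector that is reshaped into the target layer's W and b
        self.generator = nn.Linear(lang_dim, out_dim * in_dim + out_dim)

    def forward(self, lang_id, x):
        params = self.generator(self.lang_embedding(lang_id))           # (out*in + out,)
        W = params[: self.out_dim * self.in_dim].view(self.out_dim, self.in_dim)
        b = params[self.out_dim * self.in_dim:]
        return F.linear(x, W, b)

# Generate the parameters for language id 3 on the fly and apply them
gen = ParameterGenerator(num_languages=8, lang_dim=16, in_dim=32, out_dim=32)
out = gen(torch.tensor(3), torch.randn(5, 32))                           # (5, 32)
```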
A recent study of “Generalisation in humans and deep neural networks” https://t.co/YYcm97w4C9. If we attempt to enlarge our training data distribution to improve generalisation via data augmentation, unsurprisingly, the network only learns to generalise to the particular types of augmentation used. pic.twitter.com/3HGm7TeR3p
— hardmaru (@hardmaru) August 28, 2018
A new #deeplearning model for improving abstractive summarization by actually creating novel phrases. Reinforcement learning in #naturallanguageprocessing. Work by amazing intern @iam_wkr and Salesforce Researchers Romain Paulus and @CaimingXiong. https://t.co/gXcp3gyOov pic.twitter.com/UBK8V0VDC6
— Richard (@RichardSocher) August 28, 2018
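The tweet does not spell out the training objective, but a common recipe in this line of work (e.g. Paulus et al.'s earlier deep reinforced summarization model) mixes a self-critical policy-gradient term, rewarded by a metric such as ROUGE, with the usual maximum-likelihood loss. A hedged sketch of that general recipe, not necessarily this paper's exact objective; the mixing weight is illustrative:

```python
def mixed_rl_mle_loss(sample_log_prob, sample_reward, baseline_reward, mle_loss, gamma=0.98):
    """Self-critical policy gradient mixed with maximum likelihood.
    sample_reward / baseline_reward: e.g. ROUGE of a sampled vs. greedy summary.
    sample_log_prob: total log-probability of the sampled summary.
    gamma: weight on the RL term (illustrative value, not the paper's).
    """
    # Push up samples that beat the greedy baseline, push down those that don't
    rl_loss = (baseline_reward - sample_reward) * sample_log_prob
    return gamma * rl_loss + (1.0 - gamma) * mle_loss
```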
And still more trouble for deep learning: “human visual system .... more robust [than deep neural nets] to nearly all of [a dozen] tested image manipulations” https://t.co/LxxL5TmRoz
— Gary Marcus (@GaryMarcus) August 28, 2018
So much this. Or maybe you're indirectly tweaking regularization.
— Jeremy Howard (@jeremyphoward) August 28, 2018
This is why researchers that run experiments need to be good practitioners. If you're not training with appropriate LR and regularization, your experiments are probably meaningless. https://t.co/mTJHZOyBgC
If they want to improve the quality of scientific publications, rather than banning p-values or changing the 0.05 threshold, journals should make us show the data. pic.twitter.com/UFz3xrPQZJ
— Rafael Irizarry (@rafalab) August 27, 2018
Driverless cars just got a whole lot harder. This technical paper by @amirrosenfeld raises some profound questions about the robustness of #DeepLearning as a perceptual mechanism. https://t.co/byBdv5Vlra
— Gary Marcus (@GaryMarcus) August 27, 2018
"Style Transfer as Unsupervised Machine Translation," Zhang et al.: https://t.co/aztJfKyBpc
— Miles Brundage (@Miles_Brundage) August 27, 2018
New seq2seq architecture - jointly encodes source and targets into a 2D ConvNet. No enc/dec or explicit attention.
— PyTorch (@PyTorch) August 27, 2018
Outperforming ConvS2S and Transformers on IWSLT'14 de<->en, with 3 to 8 times fewer parameters
from @melbayad and team https://t.co/8NmiwmnhI2 https://t.co/LqUYynj8vB pic.twitter.com/KFdcucErHI
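A toy version of the "joint 2D grid" idea: each (target position, source position) cell holds the concatenation of the two token embeddings, and a 2D ConvNet processes the whole grid, so source-target interaction happens without an explicit attention module. The masked/DenseNet-style convolutions of the actual model are omitted; names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class Joint2DEncoder(nn.Module):
    """Joint source-target grid followed by a plain 2D ConvNet."""

    def __init__(self, vocab, dim=64, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(vocab, dim)
        self.tgt_emb = nn.Embedding(vocab, dim)
        self.conv = nn.Sequential(
            nn.Conv2d(2 * dim, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
        )
        self.out = nn.Linear(hidden, vocab)

    def forward(self, src_ids, tgt_ids):
        # src_ids: (B, S), tgt_ids: (B, T)
        S, T = src_ids.size(1), tgt_ids.size(1)
        src = self.src_emb(src_ids).unsqueeze(1).expand(-1, T, -1, -1)   # (B, T, S, D)
        tgt = self.tgt_emb(tgt_ids).unsqueeze(2).expand(-1, -1, S, -1)   # (B, T, S, D)
        grid = torch.cat([src, tgt], dim=-1).permute(0, 3, 1, 2)         # (B, 2D, T, S)
        feats = self.conv(grid)                                          # (B, H, T, S)
        # Pool over the source axis to get one feature per target position
        pooled = feats.max(dim=3).values.permute(0, 2, 1)                # (B, T, H)
        return self.out(pooled)                                          # next-token logits
```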
"LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations," Schaarschmidt et al.: https://t.co/PuCmcngk2j
— Miles Brundage (@Miles_Brundage) August 27, 2018
RL for data management tasks, using imperfect demos
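A standard way to use imperfect demonstrations is to warm-start the policy with behavioural cloning before RL fine-tuning. The sketch below shows only that warm-start step; the environment, state encoding, and action space of LIFT's data-management tasks are placeholders.

```python
import torch
import torch.nn as nn

def pretrain_from_demos(policy, demos, epochs=5, lr=1e-3):
    """Behavioural cloning on (state, action) demonstration pairs.
    demos: list of (state_tensor, action_index) pairs, possibly imperfect.
    """
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for state, action in demos:
            logits = policy(state.unsqueeze(0))            # (1, num_actions)
            loss = loss_fn(logits, torch.tensor([action]))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return policy

# Example: a tiny policy over 4 candidate actions for a 10-dim state
policy = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 4))
demos = [(torch.randn(10), 2), (torch.randn(10), 0)]
pretrain_from_demos(policy, demos, epochs=1)
```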