Interesting -- the larger the model, the less data it needs to reach the same validation loss, this is the opposite of what statistics teaches us https://t.co/88qwts7klV
— Yaroslav Bulatov (@yaroslavvb) August 14, 2019
“Nvidia was able to train BERT-Large using optimized PyTorch software and a DGX-SuperPOD of more than 1,000 GPUs that is able to train BERT in 53 minutes.” – @kharijohnson, @VentureBeat https://t.co/9gT3aZTsBs
— Stanford NLP Group (@stanfordnlp) August 14, 2019
Neural Text Generation with Unlikelihood Training
"We propose a new objective, unlikelihood training, which forces unlikely generations to be assigned lower probability by the model." https://t.co/fhXiNqrDAH pic.twitter.com/EJQ4BPes5Z
— Thomas Lahore (@evolvingstuff) August 14, 2019
a blog post that nicely describes not only whether she's SOTA'd (whatever that means) but also a stream of scientific process behind this work. https://t.co/gYHjJdAkSm
— Kyunghyun Cho (@kchonyc) August 14, 2019
Project Euphonia is a speech-to-text transcription model for those with atypical speech. In a new @interspeech2019 paper, learn how researchers are collaborating with the #ALS community to develop Euphonia for those with ALS or other speech impairments. https://t.co/mSHSINm88a
— Google AI (@GoogleAI) August 13, 2019
This format of setting a paper in dialogue with its critics formally has a rich tradition in journal culture but it's rare in AI/ML. This is an exciting development from @ch402 & the @distillpub team, and I hope they run many more of these:https://t.co/TY6wEy9Jb1
— Zachary Lipton (@zacharylipton) August 12, 2019
New SOTA on NLVR2. Very impressive progress! 👏 Will be interesting to see NLVR2 attention examples. Still a lot of room for human performance. https://t.co/sshDbfzUkj #NLProc https://t.co/ECqoPZnWey pic.twitter.com/ur6kM4Achy
— Yoav Artzi (@yoavartzi) August 12, 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
"VisualBERT...is even sensitive to syntactic relationships, tracking, for example, associations between verbs and image regions corresponding to their arguments" https://t.co/Wm2RqBqvPI pic.twitter.com/HCfV8QOBtA
— Thomas Lahore (@evolvingstuff) August 12, 2019
Misspelling Oblivious Embeddings (MOE) is a new model for word embeddings that are resilient to misspellings, improving the ability to apply word embeddings to real-world situations, where misspellings are common. https://t.co/gk91vr6LRE #nlp
— Facebook AI (@facebookai) August 9, 2019
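The announcement does not spell out the objective, but the stated goal (embeddings of misspelled forms landing near the embeddings of their correct spellings) suggests an auxiliary loss along the lines of the sketch below. This is an assumed negative-sampling formulation for illustration only, not Facebook's MOE code; in the paper this kind of term is combined with the usual fastText loss.

```python
import torch
import torch.nn.functional as F

def misspelling_loss(embed, misspelled_ids, correct_ids, negative_ids):
    """Auxiliary loss sketch: pull each misspelling toward its correct word
    and away from sampled negatives (assumed negative-sampling form).

    embed:          nn.Embedding over the joint vocabulary
    misspelled_ids: (batch,) ids of misspelled forms
    correct_ids:    (batch,) ids of their correct spellings
    negative_ids:   (batch, k) ids of sampled negative words
    """
    m = embed(misspelled_ids)   # (batch, dim)
    c = embed(correct_ids)      # (batch, dim)
    n = embed(negative_ids)     # (batch, k, dim)

    pos = F.logsigmoid((m * c).sum(-1))                        # attract correct word
    neg = F.logsigmoid(-(n * m.unsqueeze(1)).sum(-1)).sum(-1)  # repel negatives
    return -(pos + neg).mean()
```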
Temporal Cycle-Consistency Learning (TCC) is a novel self-supervised method for learning representations that are well-suited for fine-grained temporal labeling of video. Learn how it’s done and download the TCC codebase to try it out for yourself! https://t.co/HhSGSGaTLu
— Google AI (@GoogleAI) August 8, 2019
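One way to read "temporal cycle-consistency": embed the frames of two videos, map each frame of the first to a soft nearest neighbor in the second, map that neighbor back, and train the embedder so the cycle returns to the starting frame. The PyTorch sketch below implements that cycle-back classification idea under those assumptions; it is not the released TCC codebase.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(u, v, temperature=0.1):
    """Soft cycle-back classification loss between two frame-embedding
    sequences (a simplified reading of temporal cycle-consistency).

    u: (n, dim) embeddings of frames from video 1
    v: (m, dim) embeddings of frames from video 2
    """
    # Soft nearest neighbor of each u_i inside v.
    sim_uv = -torch.cdist(u, v)                      # higher = closer
    alpha = F.softmax(sim_uv / temperature, dim=1)   # (n, m)
    v_soft = alpha @ v                               # (n, dim)

    # Cycle back: which frame of u does each soft neighbor land on?
    logits = -torch.cdist(v_soft, u) / temperature   # (n, n)

    # A cycle is consistent if frame i maps back to index i.
    targets = torch.arange(u.size(0), device=u.device)
    return F.cross_entropy(logits, targets)
```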
"ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks," Lu et al.: https://t.co/2xJgxhFLhC
— Miles Brundage (@Miles_Brundage) August 7, 2019
"Our work represents a shift ... towards treating visual grounding as a pretrainable and transferable capability."
#acl2019nlp paper on "Beyond BLEU: Training NMT with Semantic Similarity" by Wieting et al.: https://t.co/5N9SBiPyDq
I like this because it shows 1) a nice use case for semantic similarity, 2) that we can/should optimize seq2seq models for something other than likelihood or BLEU! pic.twitter.com/Fh8WJe5tKH
— Graham Neubig (@gneubig) August 6, 2019
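The thrust of the paper is to optimize the translation model for a semantic-similarity reward rather than for likelihood or BLEU. A minimum-risk-style rendering of that idea is sketched below; the exact training procedure is an assumption here, and the similarity metric (the paper's SIM score, or any sentence-level similarity) is treated as an external black box.

```python
import torch
import torch.nn.functional as F

def expected_risk_loss(sample_log_probs, similarities):
    """Minimum-risk-style objective sketch: weight each sampled translation's
    cost (1 - semantic similarity to the reference) by its renormalized
    model probability over the sample set.

    sample_log_probs: (k,) log p(hyp_i | source) for k sampled hypotheses
    similarities:     (k,) semantic similarity of each hypothesis to the reference
    """
    weights = F.softmax(sample_log_probs, dim=0)  # renormalize over the sampled set
    cost = 1.0 - similarities                     # lower similarity -> higher cost
    return (weights * cost).sum()
```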