ADAM's proof was wrong and then corrected in 2018 -- A new video on ADAM & its proof -- Part I! https://t.co/0VBTyaGe4n
— /MachineLearning (@slashML) January 13, 2020
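The 2018 correction referred to here is the AMSGrad fix from Reddi et al.'s "On the Convergence of Adam and Beyond": keep a running maximum of the second-moment estimate so the effective step size never increases. A minimal NumPy sketch of that update (the function name, hyperparameters, and toy usage below are illustrative assumptions, not taken from the video):

```python
import numpy as np

def amsgrad_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update: Adam with a running max of the second-moment estimate.

    The max (v_hat) is the 2018 fix: it keeps the per-coordinate effective
    learning rate from growing, which the original Adam proof implicitly assumed.
    """
    m, v, v_hat, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    v_hat = np.maximum(v_hat, v)                # AMSGrad: never let v_hat decrease
    m_corr = m / (1 - beta1 ** t)               # bias correction, as in Adam
    theta = theta - lr * m_corr / (np.sqrt(v_hat) + eps)
    return theta, (m, v, v_hat, t)

# toy usage on f(theta) = ||theta||^2
theta = np.array([1.0, -2.0])
state = (np.zeros_like(theta), np.zeros_like(theta), np.zeros_like(theta), 0)
grad = 2 * theta
theta, state = amsgrad_step(theta, grad, state)
```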
What an elegant idea:
— Thomas Lahore (@evolvingstuff) January 13, 2020
Choosing the Sample with Lowest Loss makes SGD Robust
"in each step, first choose a set of k samples, then from these choose the one with the smallest current loss, and do an SGD-like update with this chosen sample"https://t.co/mwZjnhJy92
Perhaps the least acknowledged downside of deep neural net #AI models: the carbon footprint
— Eric Topol (@EricTopol) January 12, 2020
But this key preprint is starting to get noticed https://t.co/klTF0KAaIi by @strubell @andrewmccallum https://t.co/pGsBKVCF1B @AndrewLBeam https://t.co/q7msmkQOLN @techreview @_KarenHao pic.twitter.com/j3dcGityEv
On the Relationship between Self-Attention and Convolutional Layers
— hardmaru (@hardmaru) January 11, 2020
This work shows that attention layers can perform convolution and that they often learn to do so in practice. They also prove that a self-attention layer is as expressive as a conv layer. https://t.co/44I1uOd4LF pic.twitter.com/iqioR9eXzU
Very happy to share our latest work accepted at #ICLR2020: we prove that a Self-Attention layer can express any CNN layer. 1/5
— Jean-Baptiste Cordonnier (@jb_cordonnier) January 10, 2020
📄Paper: https://t.co/Cm61A3PWRA
🍿Interactive website: https://t.co/FTpThM3BQc
🖥Code: https://t.co/xSfmFCy0U2
📝Blog: https://t.co/3bp59RfAcj pic.twitter.com/X1rNS1JvPt
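The intuition behind the claim is easy to check in a toy setting: give the layer one head per kernel position, let each head attend deterministically to a single relative pixel offset, and mix the heads with the kernel weights. A single-channel NumPy sketch of that construction (function names and sizes are mine; the paper's actual proof covers multi-channel layers with learned relative positional encodings):

```python
import numpy as np

def conv2d(img, kernel):
    """Plain valid-padding 2D convolution (cross-correlation), single channel."""
    H, W = img.shape
    K = kernel.shape[0]
    out = np.zeros((H - K + 1, W - K + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + K, j:j + K] * kernel)
    return out

def attention_as_conv(img, kernel):
    """Emulate the conv with K*K 'heads', each hard-attending to one relative
    offset (the degenerate attention pattern the paper shows self-attention can
    learn), then mixing the heads with the kernel weights as output projection."""
    H, W = img.shape
    K = kernel.shape[0]
    out = np.zeros((H - K + 1, W - K + 1))
    for dy in range(K):
        for dx in range(K):
            # head (dy, dx): attention weight 1 on the pixel at that offset, 0 elsewhere
            head_output = img[dy:dy + out.shape[0], dx:dx + out.shape[1]]
            out += kernel[dy, dx] * head_output
    return out

rng = np.random.default_rng(0)
img, kernel = rng.normal(size=(6, 6)), rng.normal(size=(3, 3))
assert np.allclose(conv2d(img, kernel), attention_as_conv(img, kernel))
```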
What I did over my winter break!
— Jeff Dean (@JeffDean) January 9, 2020
It gives me great pleasure to share this summary of some of our work in 2019, on behalf of all my colleagues at @GoogleAI & @GoogleHealth. https://t.co/hGoog8G9QD
.@stanfordnlp people’s #ICLR2020 papers #1—@ukhndlwl and colleagues (incl. at @facebookai) show the power of neural nets learning a context similarity function for kNN in LM prediction—almost 3 PPL gain on WikiText-103—maybe most useful for domain transfer https://t.co/on0ntDaqXL pic.twitter.com/5yKRhhjZMr
— Stanford NLP Group (@stanfordnlp) January 6, 2020
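The paper is the kNN-LM of Khandelwal et al.: interpolate the base LM's next-token distribution with a nearest-neighbour distribution built from stored (context vector, next token) pairs. A rough sketch of the interpolation step (the brute-force L2 search, lam, k, and temperature are illustrative assumptions; the paper uses FAISS over a WikiText-103-scale datastore):

```python
import numpy as np

def knn_lm_probs(query, datastore_keys, datastore_vals, lm_probs,
                 k=8, lam=0.25, temp=1.0):
    """Mix a base LM's next-token distribution with a kNN distribution
    retrieved from stored (context vector, next token) pairs."""
    vocab = lm_probs.shape[0]
    # retrieve the k nearest stored contexts by L2 distance
    dists = np.linalg.norm(datastore_keys - query, axis=1)
    nn = np.argsort(dists)[:k]
    # softmax over negative distances, aggregated per retrieved target token
    weights = np.exp(-dists[nn] / temp)
    weights /= weights.sum()
    knn_probs = np.zeros(vocab)
    np.add.at(knn_probs, datastore_vals[nn], weights)
    # final distribution: interpolate kNN and LM probabilities
    return lam * knn_probs + (1 - lam) * lm_probs
```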
Handwritten Optical Character Recognition (OCR): A Comprehensive Systematic Literature Re... https://t.co/QjdnwhbTWY pic.twitter.com/8BBKjKXM67
— arxiv (@arxiv_org) January 5, 2020
Actually, re-reading the Adaptive Sparse Transformers by Gonçalo M. Correia, Vlad Niculae and André F.T. Martins https://t.co/vPq8duBp4k, I found this nice observation of a BPE-merging head that I can't resist sharing with you as well. Isn't that a sweet head?👇
— Thomas Wolf (@Thom_Wolf) January 4, 2020
[3/3] pic.twitter.com/tFMsPIrdgg
Fantastic new paper on calibration of predictions by @BenVanCalster @laure_wynants @MaartenvSmeden @ESteyerberg out of the STRATOS initiative: https://t.co/pNFvTH711Z https://t.co/Yto7IgXKWb
— Frank Harrell (@f2harrell) January 4, 2020
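For a quick sense of what calibration assessment looks like in practice, here is a small sketch using scikit-learn's calibration_curve: it reports calibration-in-the-large (mean predicted risk vs. observed event rate) and a binned reliability curve. The paper discusses richer tools such as the calibration slope and flexible calibration curves; the function name and bin count here are my own choices.

```python
import numpy as np
from sklearn.calibration import calibration_curve

def calibration_report(y_true, y_prob, n_bins=10):
    """Basic calibration checks for a binary risk model."""
    # calibration-in-the-large: average predicted risk vs. observed event rate
    print(f"mean predicted risk: {y_prob.mean():.3f}  observed rate: {y_true.mean():.3f}")
    # binned reliability curve: observed fraction of events per predicted-risk bin
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=n_bins)
    for p, o in zip(mean_pred, frac_pos):
        print(f"predicted ~{p:.2f} -> observed {o:.2f}")
```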
DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection https://t.co/2oyMcVP4SX
— Delip Rao (@deliprao) January 4, 2020
FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping https://t.co/Wbcdv0iQJk
— /MachineLearning (@slashML) January 2, 2020