What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
— AK (@ak92501) April 13, 2022
abs: https://t.co/Lk71qAPdzm
github: https://t.co/hIzImwwFoD pic.twitter.com/NSI294Gs7M
Chinchilla: A 70 billion parameter language model that outperforms much larger models, including Gopher. By revisiting how to trade off compute between model & dataset size, users can train a better and smaller model. Read more: https://t.co/RaZGUclBYQ 1/3 pic.twitter.com/TNWI1RLloA
— DeepMind (@DeepMind) April 12, 2022
"Machine Learning State-of-the-Art with Uncertainties" -- great paper by @psteinb_ & @helmholtz_ai
— Sebastian Raschka (@rasbt) April 12, 2022
making a case for confidence intervals in ML benchmarks, or really any ML work. And no, adding CIs (e.g. via normal approximation) doesn't have to be expensive :) https://t.co/pgW6ILD7JW https://t.co/n1LMuCx1RQ
No Token Left Behind: Explainability-Aided Image Classification and Generation
— AK (@ak92501) April 12, 2022
abs: https://t.co/n5Jeu5Q8c7 pic.twitter.com/hLvkQgVFrr
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
— AK (@ak92501) April 7, 2022
abs: https://t.co/aL2vCMoyEp
github: https://t.co/xyk5vVRzvU pic.twitter.com/zFqHIngLwu
Temporal Alignment Networks for Long-term Video
— AK (@ak92501) April 7, 2022
abs: https://t.co/8VRuU21Lgg pic.twitter.com/wM72irpZQ5
MixFormer: Mixing Features across Windows and Dimensions
— AK (@ak92501) April 7, 2022
abs: https://t.co/3cLfqEzNfl pic.twitter.com/FUtMJS3p3o
KNN-Diffusion: Image Generation via Large-Scale Retrieval
— AK (@ak92501) April 7, 2022
abs: https://t.co/3E0f0wXBkI pic.twitter.com/78RHYZfpaC
PaLM: Scaling Language Modeling with Pathways
— AK (@ak92501) April 6, 2022
abs: https://t.co/yWvL0NGyjB pic.twitter.com/ACu4cVqAGO
MultiMAE: Multi-modal Multi-task Masked Autoencoders
— AK (@ak92501) April 5, 2022
abs: https://t.co/HrnyoHP9Xz
project page: https://t.co/NRdhfhYPCy pic.twitter.com/BADf1UMd3J
WavFT: Acoustic model finetuning with labelled and unlabelled data
— AK (@ak92501) April 4, 2022
abs: https://t.co/Feck7OBQ9i pic.twitter.com/DyfyXv24AF
XGBoost Is All You Need
— Bojan Tunguz (@tunguz) March 30, 2022
Deep Neural Networks and Tabular Data: A Survey https://t.co/Z2KsHP3fvp pic.twitter.com/uh5NLS1fVP