On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models
abs: https://t.co/pJJaIJnEOK pic.twitter.com/0RS52ny7S2
— AK (@_akhaliq) September 13, 2022
Deep tabular methods are an interesting research direction! So, this morning, I sat down and summarized my thoughts + the recent papers I read.
A Short Chronology Of Deep Learning For Tabular Data: https://t.co/VAXJRBMyzj
— Sebastian Raschka (@rasbt) July 24, 2022
The Technology Behind BLOOM Training
Discover how @BigscienceW used @MSFTResearch DeepSpeed + @nvidia Megatron-LM technologies to train the World's Largest Open Multilingual Language Model (BLOOM): https://t.co/8QOxhrIVbs
— Hugging Face (@huggingface) July 14, 2022
A Tour of Visualization Techniques for Computer Vision Datasets
abs: https://t.co/N0j9jlMZcC pic.twitter.com/bKzgtpj616
— AK (@ak92501) April 20, 2022
Information on recommender systems is hard to come by.
But did you know that @eugeneyan has put together a list of:
- 68 RecSys papers and articles
- 57 papers and articles on Search and Ranking
This is amazing, thank you so very much for this!!! https://t.co/qdvwZ1zfLc pic.twitter.com/c1Q9Vx3YNL
— Radek Osmulski (@radekosmulski) April 14, 2022
A Review on Language Models as Knowledge Bases
abs: https://t.co/C70a1YM8AX pic.twitter.com/Ce84fhz5yX
— AK (@ak92501) April 14, 2022
XGBoost Is All You Need
Deep Neural Networks and Tabular Data: A Survey https://t.co/Z2KsHP3fvp pic.twitter.com/uh5NLS1fVP
— Bojan Tunguz (@tunguz) March 30, 2022
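The tweet's point is that gradient-boosted trees remain a very strong tabular baseline. As a minimal sketch of how little code that baseline takes (my own illustration using xgboost's scikit-learn API on a synthetic dataset; the data and hyperparameters are placeholders, not from the linked survey):

```python
# Gradient-boosted trees as a tabular baseline; dataset and settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

# Synthetic tabular classification data standing in for a real dataset.
X, y = make_classification(n_samples=5_000, n_features=20, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1, eval_metric="logloss")
model.fit(X_train, y_train)

print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```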
Really enjoy working on academic research. However, I recently realized that training models on my GPU workstation != using DL in the real world. There are lots of cool, compounding techniques to scale up your PyTorch models. A good & succinct overview: https://t.co/JtfBLvMYXF
— Sebastian Raschka (@rasbt) March 17, 2022
We just assessed the effectiveness of the DDP, Pipe and FSDP distributed training techniques available via PyTorch with different model sizes and network configurations. See the results here. https://t.co/vN191zRclh
— PyTorch (@PyTorch) March 17, 2022
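For context on what the PyTorch team is comparing, here is a minimal sketch of switching between the DDP and FSDP wrappers (PyTorch >= 1.11). The toy model, sizes, and training step are my own placeholders, not the benchmark setup from their post; it assumes a launch via `torchrun --nproc_per_node=<gpus> train.py`.

```python
# Minimal DDP vs. FSDP wrapping sketch; model and hyperparameters are illustrative.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def build_model():
    # Illustrative toy model; any nn.Module works here.
    return torch.nn.Sequential(
        torch.nn.Linear(1024, 4096), torch.nn.ReLU(), torch.nn.Linear(4096, 1024)
    )

def main(use_fsdp: bool = False):
    dist.init_process_group("nccl")               # env vars set by torchrun
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = build_model().cuda()

    if use_fsdp:
        # FSDP shards parameters, gradients, and optimizer state across ranks,
        # cutting per-GPU memory at the cost of extra communication.
        model = FSDP(model)
    else:
        # DDP keeps a full replica per GPU and all-reduces gradients each step.
        model = DDP(model, device_ids=[local_rank])

    # Build the optimizer after wrapping so it sees the (possibly sharded) params.
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    x = torch.randn(32, 1024, device="cuda")
    loss = model(x).square().mean()
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()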
Excellent and unintuitive read on GPUs. The chip doing the compute has a tiny amount of memory & is connected to the main memory literally through a straw. Most of the energy goes to data movement too. Many repercussions, e.g., latency is better predicted by # activations than # flops. https://t.co/67PBOfEcNK
— Andrej Karpathy (@karpathy) March 15, 2022
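To make that concrete, here is a back-of-the-envelope sketch (my own illustration with made-up layer sizes, not from the linked article) counting both quantities for a small MLP. Elementwise ops like ReLU add almost no FLOPs but still produce full tensors of activations that have to travel through that "straw", which is one reason activation counts track latency better on bandwidth-bound hardware.

```python
# Back-of-the-envelope FLOPs vs. activation elements for a toy MLP.
# Batch size and layer dimensions are illustrative placeholders.
batch = 256
linear_dims = [(1024, 4096), (4096, 4096), (4096, 1024)]

flops = 0
activations = 0
for d_in, d_out in linear_dims:
    flops += 2 * batch * d_in * d_out    # multiply-accumulates for the matmul
    activations += batch * d_out         # matmul output written to memory
    flops += batch * d_out               # ReLU: one op per element...
    activations += batch * d_out         # ...but another full tensor of reads/writes

bytes_fp16 = 2 * activations             # 2 bytes per element in fp16
print(f"FLOPs       : {flops:,}")
print(f"Activations : {activations:,} elements (~{bytes_fp16 / 1e6:.0f} MB in fp16)")
# When the workload is bandwidth-bound, moving those megabytes, not doing the
# FLOPs, is what the measured latency tends to track.
```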
Great figure illustrating the different types of deep generative models via @lilianweng (https://t.co/4NJZzr9HKF) & a nice list of cons. GANs: unstable training, low diversity; VAE: surrogate loss; Flow-based: special architecture needed for reversible transformations; Diffusion: slow to sample. pic.twitter.com/zl4RLIW3EB
— Sebastian Raschka (@rasbt) February 21, 2022
Unsolved Problems in ML Safety
pdf: https://t.co/aZjdACwbJY
abs: https://t.co/Fu2F70Koga pic.twitter.com/OwMjKlTpqv
— AK (@ak92501) September 29, 2021