[Feature Selection Using Null Importances] by Kaggler and VP of Data Science @h2oai, @ogrellier. HT @YifanX https://t.co/RMMvsgd6nC
— meg.ehh 🇨🇦 (@MeganRisdal) May 19, 2022
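For context, the null-importances idea is to compare each feature's actual importance against a "null" distribution obtained by refitting the model on a shuffled target; features that can't beat their own null distribution are likely noise. A minimal sketch, assuming a generic scikit-learn setup rather than ogrellier's original notebook:

```python
# Minimal null-importances sketch (generic scikit-learn setup, not the
# original Kaggle notebook).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def null_importances(X, y, n_runs=50, random_state=0):
    rng = np.random.default_rng(random_state)
    # Actual importances: model fit on the true target.
    actual = RandomForestClassifier(random_state=random_state).fit(X, y).feature_importances_
    # Null importances: model fit on a shuffled (meaningless) target.
    nulls = np.empty((n_runs, X.shape[1]))
    for i in range(n_runs):
        y_perm = rng.permutation(y)
        nulls[i] = RandomForestClassifier(random_state=i).fit(X, y_perm).feature_importances_
    return actual, nulls

# Keep features whose actual importance beats, say, the 95th percentile
# of their null distribution; the exact threshold is a judgment call.
# actual, nulls = null_importances(X, y)
# keep = actual > np.percentile(nulls, 95, axis=0)
```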
Yesterday I asked some Kagglers about their favorite creative ideas with real-world applicability that they found through Competitions ...
They're very cool so I'm sharing them in a thread (add your own!) 👇
— meg.ehh 🇨🇦 (@MeganRisdal) May 19, 2022
Dialog Inpainting: Turning Documents into Dialogs
abs: https://t.co/uVnXYgkKxu
Using inpainted data to pre-train ConvQA retrieval systems advances SOTA across three benchmarks (QReCC, OR-QuAC, TREC CAsT), yielding up to 40% relative gains on standard evaluation metrics pic.twitter.com/eONAdbN1fg
— AK (@ak92501) May 19, 2022
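The core move in the paper is to treat a document's sentences as one speaker's turns and have a trained inpainter generate the missing questions in between. A rough sketch of just that data-shaping step, with hypothetical helper names (not the paper's code):

```python
# Rough sketch of the dialog-inpainting framing: interleave masked reader
# turns with verbatim document sentences; a trained inpainter model would
# then fill in the masked questions. Names here are illustrative only.
MASK = "<extra_id_0>"  # placeholder mask token

def document_to_partial_dialog(sentences):
    """Turn a list of document sentences into a partial dialog."""
    dialog = []
    for sent in sentences:
        dialog.append(("reader", MASK))   # question to be inpainted
        dialog.append(("writer", sent))   # verbatim document sentence
    return dialog

doc = [
    "Mount Everest is Earth's highest mountain above sea level.",
    "Its elevation was most recently measured in 2020.",
]
for speaker, turn in document_to_partial_dialog(doc):
    print(f"{speaker}: {turn}")
```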
Good discussion on this thread: Why do top speech/audio conferences like ICASSP and Interspeech have very high acceptance rates like 46%-48%?
IMO, low acceptance rates do not imply that the conference is any good. If anything, the opposite might be true: https://t.co/7EDybWXdi1 https://t.co/kebzEuYMz1 pic.twitter.com/AoKlp7Tjff
— hardmaru (@hardmaru) May 17, 2022
RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis
abs: https://t.co/4kdIL9g29r pic.twitter.com/JmXpCqFPXx
— AK (@ak92501) May 17, 2022
The Intel engineers in the PyTorch open-source community have created a new Intel® Extension for PyTorch* which maximizes deep learning inference and training performance on Intel CPUs. Get the extension to make use of these features today: @fanzhao_intel https://t.co/RMtyhRHeDE
— PyTorch (@PyTorch) May 16, 2022
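Typical usage is a one-line wrap of an existing model. A minimal inference sketch, assuming intel-extension-for-pytorch and torchvision are installed (exact APIs may differ across versions):

```python
# Minimal Intel Extension for PyTorch inference sketch.
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

model = models.resnet50(weights=None).eval()
# ipex.optimize applies CPU-specific operator and memory-layout optimizations.
model = ipex.optimize(model)

with torch.no_grad():
    out = model(torch.rand(1, 3, 224, 224))
```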
Stumbled upon this neat flowchart for choosing text classification methods. I usually eye-balled it, but using a samples/words-per-sample ratio cut-off seems reasonable. I.e., with samples/words-per-sample < 1500, use a bag-of-words model; with >= 1500, use a sequence model (https://t.co/gbLJEGBOJA) pic.twitter.com/2BmxTTx9Z4
— Sebastian Raschka (@rasbt) May 16, 2022
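The rule of thumb is cheap to compute. A tiny sketch, using the median words per sample as the ratio's denominator:

```python
# Decision rule from the flowchart: number of samples divided by the
# median number of words per sample, with 1500 as the cut-off.
import numpy as np

def choose_model(texts):
    words_per_sample = np.median([len(t.split()) for t in texts])
    ratio = len(texts) / words_per_sample
    # Below ~1500, n-gram / bag-of-words models tend to win; above it,
    # sequence models (e.g., embeddings + CNN/RNN) are recommended.
    return "bag-of-words (n-gram) model" if ratio < 1500 else "sequence model"

texts = ["an example document about text classification"] * 10_000
print(choose_model(texts))
```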
My editors just shared with me the feedback from early reviewers and I'm in tears 😭
With the help of so many people, I worked really hard on this book. I'm grateful that people gave it a chance.
Read the book online: https://t.co/fxph4OYIsf
Pre-order: https://t.co/5RHFYzu7kq pic.twitter.com/vyhGbsih8A
— Chip Huyen (@chipro) May 16, 2022
ML bugs are so much trickier than bugs in traditional software because rather than getting an error, you get degraded performance (and it's not obvious a priori what ideal performance is).
So ML debugging works by continual sanity checking, e.g. comparing to various baselines.
— Greg Brockman (@gdb) May 14, 2022
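A minimal illustration of that baseline-comparison sanity check, using scikit-learn's DummyClassifier as the stand-in baseline:

```python
# Sanity check: a model that can't beat a majority-class baseline is a
# red flag, even though nothing ever throws an error.
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
baseline = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y).mean()
model = cross_val_score(GradientBoostingClassifier(), X, y).mean()
print(f"baseline={baseline:.3f} model={model:.3f}")
assert model > baseline, "model fails the sanity check: no better than guessing"
```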
My adamance that “business logic belongs in ETL, not BI” is, fundamentally, the same as “create a metric layer.” And it’s like we’re all figuring out how to do that well as we go along. https://t.co/WUiYFUAwi1
— JD Long (@CMastication) May 14, 2022
Learn how our developer community solves real, everyday machine learning problems with PyTorch. From Advertising & Marketing to Travel and so much in between, get to know PyTorch’s features and capabilities. Read all about PyTorch’s Community Stories: https://t.co/ceOqYIL5fR
— PyTorch (@PyTorch) May 13, 2022
A short thread about DeepMind's recent GATO paper. It trains a basic transformer on an impressive number of datasets pic.twitter.com/ncpP8aLFgs
— Eric Jang (@ericjang11) May 13, 2022