StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN
— AK (@ak92501) November 3, 2021
abs: https://t.co/TGdEjthZlk
github: https://t.co/7q09qdMTdf pic.twitter.com/sGcg5imUAM
Can Vision Transformers Perform Convolution?
— AK (@ak92501) November 3, 2021
abs: https://t.co/rsHhON89sV
A single ViT layer with image patches as input can perform any convolution operation constructively, with the multi-head attention mechanism and relative positional encoding playing essential roles pic.twitter.com/Qw1RqqEfjV
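The constructive claim is easy to make concrete with a toy example. The sketch below is my own illustration, not the paper's construction: give one attention head to each of the nine relative offsets of a 3x3 kernel, let that head's attention matrix be one-hot on the token at its offset (the hard limit of a very large relative positional bias), and fold the per-offset kernel slices into the output projection. The result matches a 3x3 convolution over the patch grid.

```python
# Toy check: 9 one-hot-attention heads == a 3x3 convolution over patch tokens.
import numpy as np

rng = np.random.default_rng(0)
H, W, d = 6, 6, 4                          # patch grid and embedding dimension
x = rng.normal(size=(H, W, d))             # patch embeddings
K = 3
kernel = rng.normal(size=(K, K, d, d))     # kernel[u, v] maps the patch at offset (u-1, v-1)

def conv_over_patches(x):
    """Plain 3x3 convolution over the patch grid, zero-padded at the borders."""
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            for u in range(K):
                for v in range(K):
                    ii, jj = i + u - 1, j + v - 1
                    if 0 <= ii < H and 0 <= jj < W:
                        out[i, j] += kernel[u, v] @ x[ii, jj]
    return out

def nine_head_attention(x):
    """One head per relative offset: a one-hot attention matrix (hard limit of a
    large relative positional bias) gathers the token at that offset, and the
    output projection applies that offset's kernel slice before summing heads."""
    N = H * W
    tokens = x.reshape(N, d)
    out = np.zeros((N, d))
    for u in range(K):
        for v in range(K):
            A = np.zeros((N, N))                    # this head's attention matrix
            for i in range(H):
                for j in range(W):
                    ii, jj = i + u - 1, j + v - 1
                    if 0 <= ii < H and 0 <= jj < W:
                        A[i * W + j, ii * W + jj] = 1.0
            out += A @ tokens @ kernel[u, v].T      # values = raw tokens; per-head output projection
    return out.reshape(H, W, d)

assert np.allclose(conv_over_patches(x), nine_head_attention(x))
print("9 one-hot-attention heads reproduce the 3x3 convolution over patches")
```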
"When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute" by Tao Lei @taolei15949106 - Outstanding Paper at EMNLP https://t.co/7IR25d9Sz2
— Sasha Rush (@srush_nlp) October 30, 2021
(Tao's work is always a must-read. It combines algorithmic cleverness with practical engineering and experiments.)
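For context, the "fast recurrence" in the title refers to the SRU family of element-wise recurrent layers. The sketch below is my own minimal SRU-style cell, not the paper's SRU++ (which additionally replaces the input projection with attention and includes extra peephole terms); it only shows why the recurrence is cheap: all matrix multiplies are batched over time, and the sequential loop is purely element-wise.

```python
# Minimal SRU-style cell (assumption: my own simplified sketch, not SRU++).
import numpy as np

rng = np.random.default_rng(0)
T, d = 16, 8                               # sequence length, hidden size
x = rng.normal(size=(T, d))
Wx, Wf, Wr = (rng.normal(size=(d, d)) * 0.3 for _ in range(3))
bf = br = np.zeros(d)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Batched projections: fully parallel over time, no recurrence involved.
U, F, R = x @ Wx.T, sigmoid(x @ Wf.T + bf), sigmoid(x @ Wr.T + br)

c = np.zeros(d)
h = np.zeros((T, d))
for t in range(T):                         # the only sequential work is element-wise
    c = F[t] * c + (1.0 - F[t]) * U[t]     # internal state update
    h[t] = R[t] * c + (1.0 - R[t]) * x[t]  # highway-style output
print(h.shape)
```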
We've trained a system to answer grade-school math problems with double the accuracy of a fine-tuned GPT-3 model.
— OpenAI (@OpenAI) October 29, 2021
Multistep reasoning is difficult for today's language models. We present a new technique to help. https://t.co/JRXUYZOSg7
With @YiTayML, @anuragarnab, @giffmana, and @ashVaswani, we wrote up a paper on "the efficiency misnomer": https://t.co/yM6XeykB30
— Mostafa Dehghani (@m__dehghani) October 27, 2021
TL;DR:
"No single cost indicator is sufficient for making an absolute conclusion when comparing the efficiency of different models". pic.twitter.com/EaZ4nVBWEz
Parameter Prediction for Unseen Deep Architectures
— hardmaru (@hardmaru) October 27, 2021
Their graph hypernetwork can predict all 24M parameters of a ResNet-50, achieving 60% CIFAR-10 accuracy and 50% top-5 accuracy on ImageNet. A forward pass takes only a fraction of a second, even on a CPU! https://t.co/qRJiDTVRoH
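As a toy illustration of the parameter-prediction idea (my own minimal hypernetwork, nothing like the paper's graph hypernetwork over architecture graphs): one forward pass of a small network emits all of the weights of a target MLP, which can then be evaluated immediately without its parameters ever being trained directly.

```python
# Toy hypernetwork: predict a target net's weights in one forward pass.
# All names and sizes here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
layer_shapes = [(8, 4), (2, 8)]            # target MLP: 4 -> 8 -> 2
emb_dim, hidden = 3, 16

# Hypernetwork parameters (these are what would actually be trained).
H1 = rng.normal(size=(hidden, emb_dim)) * 0.3
H2 = {i: rng.normal(size=(np.prod(s), hidden)) * 0.1 for i, s in enumerate(layer_shapes)}

def layer_embedding(shape):
    """Crude encoding of a target layer (just its shape) as the hypernetwork input."""
    out_dim, in_dim = shape
    return np.array([out_dim, in_dim, out_dim * in_dim], dtype=float) / 10.0

def predict_parameters():
    """One forward pass of the hypernetwork -> all target-net weights."""
    weights = []
    for i, s in enumerate(layer_shapes):
        h = np.tanh(H1 @ layer_embedding(s))
        weights.append((H2[i] @ h).reshape(s))
    return weights

def target_forward(x, weights):
    for W in weights[:-1]:
        x = np.maximum(W @ x, 0.0)          # ReLU hidden layers
    return weights[-1] @ x

w = predict_parameters()                    # no training of the target net itself
print(target_forward(rng.normal(size=4), w))
```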
s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning
— AK (@ak92501) October 27, 2021
abs: https://t.co/AwURzk5Stg pic.twitter.com/hXAVat6q89
Image-Based CLIP-Guided Essence Transfer
— AK (@ak92501) October 26, 2021
abs: https://t.co/wFGwhJxRCZ
github: https://t.co/38xy9RrC9x
The new method creates a blending operator that is optimized to be simultaneously additive in both latent spaces pic.twitter.com/WqKURD8ny8
Self-Supervised Learning by Estimating Twin Class Distributions
— AK (@ak92501) October 25, 2021
abs: https://t.co/LA6IagSCTv
github: https://t.co/QkBgV8FcRU pic.twitter.com/RW5OLHfb3W
SOFT: Softmax-free Transformer with Linear Complexity
— AK (@ak92501) October 25, 2021
abs: https://t.co/EralXVH5CZ
github: https://t.co/4miqmwAGcA
Introduces a softmax-free self-attention mechanism that linearizes the Transformer's complexity in space and time pic.twitter.com/85Mw5MJOUc
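To make the complexity claim concrete, here is the generic kernelized linear-attention trick, shown only as an illustration of why dropping the softmax lets the matrix product be re-associated into linear time; SOFT's specific construction differs in its details.

```python
# Generic softmax-free (kernelized) attention vs. standard softmax attention.
import numpy as np

rng = np.random.default_rng(0)
N, d = 1024, 64                          # sequence length, head dimension
Q, K, V = (rng.normal(size=(N, d)) for _ in range(3))

def softmax_attention(Q, K, V):
    S = Q @ K.T / np.sqrt(d)             # N x N: quadratic in sequence length
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Without the softmax, the product re-associates:
    # phi(Q) @ (phi(K).T @ V) costs O(N * d^2) instead of O(N^2 * d).
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                        # d x d summary of keys and values
    Z = Qp @ Kp.sum(axis=0)              # per-query normalizer
    return (Qp @ KV) / Z[:, None]

print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```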
Here is a one-year perspective on @chrmanning's question (data courtesy of @SemanticScholar). Very interesting result: EMNLP has far fewer little-cited papers, but Findings has more very-highly-cited papers. Findings high-risk, sometimes high reward. 1/2 https://t.co/BnouEtU03e pic.twitter.com/d4Fj9Tv5YB
— Graham Neubig (@gneubig) October 21, 2021
FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling
— AK (@ak92501) October 20, 2021
abs: https://t.co/FDZTByplIY
FlexMatch outperforms FixMatch by 14.32% and 24.55% on the CIFAR-100 and STL-10 datasets, respectively, when there are only 4 labels per class pic.twitter.com/uMYQ171WoL
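The curriculum idea behind those gains is easy to sketch: instead of one fixed confidence threshold for pseudo-labeling (as in FixMatch), FlexMatch scales the threshold per class by how well that class is currently being learned, so poorly learned classes still receive pseudo-labels. A simplified version, omitting FlexMatch's warm-up and threshold-mapping function, on synthetic predictions:

```python
# Simplified class-wise flexible thresholding for pseudo-labeling.
import numpy as np

rng = np.random.default_rng(0)
num_classes, base_tau = 5, 0.95
probs = rng.dirichlet(np.ones(num_classes) * 0.3, size=1000)  # fake unlabeled predictions

conf, pred = probs.max(axis=1), probs.argmax(axis=1)

# "Learning status" per class: how many unlabeled samples pass the fixed
# threshold with that class as the prediction.
sigma = np.array([np.sum((conf > base_tau) & (pred == c)) for c in range(num_classes)])
beta = sigma / max(sigma.max(), 1)            # normalize by the best-learned class

flexible_tau = base_tau * beta                # lower thresholds for hard classes
mask = conf > flexible_tau[pred]              # which samples get pseudo-labels

print("per-class thresholds:", np.round(flexible_tau, 3))
print("pseudo-labeled fraction:", mask.mean())
```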