Tag - cv

by ak92501 on 2022-04-19 (UTC).

Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding
abs: https://t.co/D3FpREgzOg pic.twitter.com/B0spxshKDd
— AK (@ak92501) April 19, 2022

dataset cv

by ak92501 on 2022-04-19 (UTC).

An Extendable, Efficient and Effective Transformer-based Object Detector
abs: https://t.co/3D2aSqmSkr
github: https://t.co/tNopT866Jc pic.twitter.com/cZN3Sbooob
— AK (@ak92501) April 19, 2022

research cv w_code

by ak92501 on 2022-04-19 (UTC).

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
abs: https://t.co/wuzHfvfDHQ
github: https://t.co/dms3SfhNQo pic.twitter.com/PDd8Xfp1K9
— AK (@ak92501) April 19, 2022

research w_code cv nlp

by ak92501 on 2022-04-15 (UTC).

Masked Siamese Networks for Label-Efficient Learning
abs: https://t.co/dYXpFnTm3Y
github: https://t.co/MHm8z6lBWr

On ImageNet-1K, with only 5,000 annotated images, the base MSN model achieves 72.4% top-1 accuracy; with 1% of ImageNet-1K labels, it achieves 75.7% top-1 accuracy. pic.twitter.com/wXhSeUtNc5
— AK (@ak92501) April 15, 2022

research w_code cv

by ak92501 on 2022-04-12 (UTC).

No Token Left Behind: Explainability-Aided Image Classification and Generation
abs: https://t.co/n5Jeu5Q8c7 pic.twitter.com/hLvkQgVFrr
— AK (@ak92501) April 12, 2022

research cv

by ak92501 on 2022-04-07 (UTC).

Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
abs: https://t.co/aL2vCMoyEp
github: https://t.co/xyk5vVRzvU pic.twitter.com/zFqHIngLwu
— AK (@ak92501) April 7, 2022

research w_code cv

by ak92501 on 2022-04-07 (UTC).

Temporal Alignment Networks for Long-term Video
abs: https://t.co/8VRuU21Lgg pic.twitter.com/wM72irpZQ5
— AK (@ak92501) April 7, 2022

research cv

by ak92501 on 2022-04-07 (UTC).

KNN-Diffusion: Image Generation via Large-Scale Retrieval
abs: https://t.co/3E0f0wXBkI pic.twitter.com/78RHYZfpaC
— AK (@ak92501) April 7, 2022

research cv

by ak92501 on 2022-03-30 (UTC).

Fine-tuning Image Transformers using Learnable Memory
abs: https://t.co/EysBcFM7xa

The authors propose augmenting Vision Transformer models with learnable memory tokens. The model adapts to new tasks using few parameters, while optionally preserving its capabilities on previously learned tasks. pic.twitter.com/gJod2r1hSv
— AK (@ak92501) March 30, 2022

research cv

by ak92501 on 2022-03-30 (UTC).

Unified Transformer Tracker for Object Tracking
abs: https://t.co/ujhGLOr4vT pic.twitter.com/Rx3SZQGEvH
— AK (@ak92501) March 30, 2022

research cv

by ak92501 on 2022-03-29 (UTC).

Video Frame Interpolation Transformer
Paper: https://t.co/JtQik8ahVt
Code: https://t.co/4CevEIQYfY pic.twitter.com/fVdL5Wz5QD
— AK (@ak92501) March 29, 2022

research cv w_code

by ak92501 on 2022-03-26 (UTC).

.@Gradio Demo for GroupViT: Semantic Segmentation Emerges from Text Supervision on @huggingface Spaces
demo: https://t.co/otoSP8yeNB
github: https://t.co/VnyWcJQmV4 pic.twitter.com/l4V9TMHisR
— AK (@ak92501) March 26, 2022

w_code learning cv
