Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding
— AK (@ak92501) April 19, 2022
abs: https://t.co/D3FpREgzOg pic.twitter.com/B0spxshKDd
An Extendable, Efficient and Effective Transformer-based Object Detector
— AK (@ak92501) April 19, 2022
abs: https://t.co/3D2aSqmSkr
github: https://t.co/tNopT866Jc pic.twitter.com/cZN3Sbooob
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
— AK (@ak92501) April 19, 2022
abs: https://t.co/wuzHfvfDHQ
github: https://t.co/dms3SfhNQo pic.twitter.com/PDd8Xfp1K9
Masked Siamese Networks for Label-Efficient Learning
— AK (@ak92501) April 15, 2022
abs: https://t.co/dYXpFnTm3Y
github: https://t.co/MHm8z6lBWr
On ImageNet-1K, with only 5,000 annotated images, the base MSN model achieves 72.4% top-1 accuracy; with 1% of ImageNet-1K labels, it achieves 75.7% top-1 accuracy pic.twitter.com/wXhSeUtNc5
No Token Left Behind: Explainability-Aided Image Classification and Generation
— AK (@ak92501) April 12, 2022
abs: https://t.co/n5Jeu5Q8c7 pic.twitter.com/hLvkQgVFrr
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
— AK (@ak92501) April 7, 2022
abs: https://t.co/aL2vCMoyEp
github: https://t.co/xyk5vVRzvU pic.twitter.com/zFqHIngLwu
Temporal Alignment Networks for Long-term Video
— AK (@ak92501) April 7, 2022
abs: https://t.co/8VRuU21Lgg pic.twitter.com/wM72irpZQ5
KNN-Diffusion: Image Generation via Large-Scale Retrieval
— AK (@ak92501) April 7, 2022
abs: https://t.co/3E0f0wXBkI pic.twitter.com/78RHYZfpaC
Fine-tuning Image Transformers using Learnable Memory
— AK (@ak92501) March 30, 2022
abs: https://t.co/EysBcFM7xa
The authors propose augmenting Vision Transformer models with learnable memory tokens. The model adapts to new tasks using few parameters, while optionally preserving its capabilities on previously learned tasks pic.twitter.com/gJod2r1hSv
Unified Transformer Tracker for Object Tracking
— AK (@ak92501) March 30, 2022
abs: https://t.co/ujhGLOr4vT pic.twitter.com/Rx3SZQGEvH
Video Frame Interpolation Transformer
— AK (@ak92501) March 29, 2022
Paper: https://t.co/JtQik8ahVt
Code: https://t.co/4CevEIQYfY pic.twitter.com/fVdL5Wz5QD
.@Gradio Demo for GroupViT: Semantic Segmentation Emerges from Text Supervision on @huggingface Spaces
— AK (@ak92501) March 26, 2022
demo: https://t.co/otoSP8yeNB
github: https://t.co/VnyWcJQmV4 pic.twitter.com/l4V9TMHisR