Open-Set Semi-Supervised Object Detection
— AK (@_akhaliq) August 30, 2022
abs: https://t.co/mW9vWZ6uEX
project page: https://t.co/rplP961v7U pic.twitter.com/g93CV7CW06
Stable Diffusion Tutorial: GUI, Better Results, Easy Setup, text2image and image2image
— AK (@_akhaliq) August 27, 2022
video: https://t.co/AkBEJfvtrw
github: https://t.co/X5APJg4QAV pic.twitter.com/hGacGDufaY
Masked Vision and Language Modeling for Multi-modal Representation Learning
— AK (@_akhaliq) August 4, 2022
abs: https://t.co/zpOExcblUH pic.twitter.com/siunmulnng
A @Gradio Demo for OCR-free Document Understanding Transformer on @huggingface Spaces
— AK (@_akhaliq) July 28, 2022
demo: https://t.co/o0Gdheg8O4
Get started with Gradio: https://t.co/qh8qpILE1S pic.twitter.com/hN3GWGHnAT
Time and object detectors are running by so fast!
— Sebastian Raschka (@rasbt) July 20, 2022
Feels like I was using YOLOv3 just a few years ago. This month, YOLOv7 was released!
It's 1500% faster than a SWIN Transformer-based R-CNN (at the same accuracy) and 150% faster than YOLOv5! https://t.co/fdHnzNPj2n pic.twitter.com/byeT9ftAbT
Nature scientific report: a deep learning model for analysis of ophthalmic images, deployed to a mobile app to enable field diagnosis. Built with Keras and TensorFlow. https://t.co/T4UCDE1TZ7
— François Chollet (@fchollet) June 30, 2022
Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch https://t.co/fAhIdq5p6v #deeplearning #machinelearning #ml #ai #neuralnetworks #datascience #pytorch
— PyTorch Best Practices (@PyTorchPractice) June 25, 2022
A quick thread on "How DALL-E 2, Imagen and Parti Architectures Differ" with breakdown into comparable modules, annotated with size 🧵#dalle2 #imagen #parti
— Rosanne Liu (@savvyRL) June 25, 2022
* figures taken from corresponding papers with slight modification
* parts used for training only are greyed out pic.twitter.com/9zsIUq3toU
MaskViT: Masked Visual Pre-Training for Video Prediction
— AK (@_akhaliq) June 24, 2022
abs: https://t.co/uhMEB6ashb
project page: https://t.co/yK4d3nx9Xj pic.twitter.com/efj7G3cjVq
Global Context Vision Transformers
— AK (@_akhaliq) June 22, 2022
abs: https://t.co/d6go0yv7fu
github: https://t.co/rUYFs09ReC
On ImageNet-1K dataset for classification, the tiny, small and base variants of GC ViT with 28M, 51M and 90M parameters achieve 83.2%, 83.9% and 84.4% Top-1 accuracy, respectively pic.twitter.com/XKoJAvUcYm
Temporally Consistent Semantic Video Editing
— AK (@_akhaliq) June 22, 2022
abs: https://t.co/sg1dRt2xkw
project page: https://t.co/PyZKnxUQko pic.twitter.com/1Az9nG5ccH
DALL-E for Detection: Language-driven Context Image Synthesis for Object Detection
— AK (@_akhaliq) June 22, 2022
abs: https://t.co/rXx4npbY5G pic.twitter.com/QBHP494eSn