by tingchenai on 2021-09-23 (UTC).

Have you wondered why object detection, unlike classification, has so many sophisticated algorithms?

With Pix2Seq (https://t.co/ygsG3aAIbG), we simply cast object detection as a language modeling task conditioned on pixels!

(with @srbhsxn, Lala Li, @fleet_dj, @geoffreyhinton) pic.twitter.com/aTYZ5IvJc9

— Ting Chen (@tingchenai) September 23, 2021
research cv nlp
by karpathy on 2021-09-24 (UTC).

Amusing! Object detection cast naively into language modeling framework + borrowing many of the tips&tricks.
- random object ordering seems fine ✅
- coords, class labels flattened into a single softmax 😂
- sequence augmentation is the most gnarly part, almost as yucky as nms 😬 https://t.co/FxSz5UbpxY

— Andrej Karpathy (@karpathy) September 24, 2021
research cv nlp
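
To make the "single softmax" remark concrete, here is a minimal sketch of how boxes and class labels might be flattened into one token sequence over a shared vocabulary. It is an illustration under stated assumptions, not the authors' code: the bin count, class count, token layout, and the names `quantize`, `objects_to_tokens`, and `tokens_to_objects` are all hypothetical, and the sequence augmentation mentioned above is not covered.

```python
import random

# Illustrative assumptions, not values taken from the tweets above.
NUM_BINS = 1000                      # coordinate tokens occupy ids 0..999
NUM_CLASSES = 80                     # class tokens occupy ids 1000..1079
CLASS_OFFSET = NUM_BINS              # class ids are shifted past the coordinate bins
EOS_TOKEN = NUM_BINS + NUM_CLASSES   # end-of-sequence token
VOCAB_SIZE = EOS_TOKEN + 1           # one softmax covers coords, classes, and EOS


def quantize(coord, num_bins=NUM_BINS):
    """Map a normalized coordinate in [0, 1] to an integer bin in [0, num_bins - 1]."""
    coord = min(max(coord, 0.0), 1.0)
    return min(int(coord * num_bins), num_bins - 1)


def objects_to_tokens(objects, rng=None):
    """Flatten [((ymin, xmin, ymax, xmax), class_id), ...] into one token list.

    If `rng` is given, objects are shuffled first (random object ordering).
    """
    objects = list(objects)
    if rng is not None:
        rng.shuffle(objects)
    tokens = []
    for box, class_id in objects:
        tokens.extend(quantize(c) for c in box)   # four coordinate tokens
        tokens.append(CLASS_OFFSET + class_id)    # one class token
    tokens.append(EOS_TOKEN)
    return tokens


def tokens_to_objects(tokens):
    """Invert objects_to_tokens, returning bin-center coordinates."""
    body = tokens[:tokens.index(EOS_TOKEN)]
    objects = []
    for i in range(0, len(body), 5):
        chunk = body[i:i + 5]
        if len(chunk) < 5:
            break  # ignore a trailing incomplete object
        box = tuple((t + 0.5) / NUM_BINS for t in chunk[:4])
        objects.append((box, chunk[4] - CLASS_OFFSET))
    return objects


example = [((0.125, 0.25, 0.5, 0.75), 3), ((0.0, 0.0, 1.0, 1.0), 17)]
print(objects_to_tokens(example))
# [125, 250, 500, 750, 1003, 0, 0, 999, 999, 1017, 1080]
```

Passing `rng=random.Random(0)` to `objects_to_tokens` shuffles the objects before flattening, which is one simple way to mimic the random ordering mentioned in the first bullet.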