Describe Anything: Detailed Localized Image and Video Captioning Paper • 2504.16072 • Published Apr 22 • 63
QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension Paper • 2503.08689 • Published Mar 11 • 4
Article Instruction-tuning Stable Diffusion with InstructPix2Pix By sayakpaul • May 23, 2023 • 17