3 134

Qinghua Duan

qhduan

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago

ovi054/image-to-prompt

liked a Space 21 days ago

OmniSVG/OmniSVG-3B

reacted to andito's post with 🔥 26 days ago

Many VLMs claim to process hours of video. But can they follow the story? 🤔

Today, we introduce TimeScope: the benchmark that separates true temporal understanding from marketing hype. Let's see how much VLMs really understand! ⏳

We test three skills that matter for real-world use:

🔎 Localized Retrieval: Find a specific action.

🧩 Information Synthesis: Piece together scattered clues.

🏃 Fine-Grained Perception: Analyze detailed motion (e.g., count how many times a person swings an axe).

The results are in, and they're revealing. Only Gemini 2.5 Pro handles 1-hour-long videos. Performance drops sharply with duration, proving that long video understanding is still challenging. We've found the breaking points; now the community can start fixing them. 📈

Want to learn more? TimeScope is 100% open-source. Benchmark your model and help us build the next generation of video AI.

📖 Blog: https://huggingface.co/blog/timescope-video-lmm-benchmark

👩‍💻 Leaderboard & Demo: https://huggingface.co/spaces/Apollo-LMMs/TimeScope

📊 Dataset: https://huggingface.co/datasets/Apollo-LMMs/TimeScope

⚙️ Eval Code: https://github.com/EvolvingLMMs-Lab/lmms-eval

Organizations

spaces 1

No application file

Test Gradio

models 2

qhduan/aquila-7b

Text Generation • Updated Jun 15, 2023 • 12 • 11

qhduan/aquilachat-7b

Text Generation • Updated Jun 15, 2023 • 13 • 17

datasets 0

None public yet