Zhiding Yu

Zhiding

https://research.nvidia.com/person/zhiding-yu

Chrisding

AI & ML interests

None yet

Recent Activity

published a model 4 days ago

nvidia/VideoITG-8B

Image-Text-to-Text • 8B • Updated 4 days ago • 9 • 5

updated a model 4 days ago

nvidia/VideoITG-8B

Image-Text-to-Text • 8B • Updated 4 days ago • 9 • 5

published an article 6 days ago

Article

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

and 4 others • 6 days ago • 51

updated a model 8 days ago

nvidia/Eagle2.5-8B

Image-Text-to-Text • 8B • Updated 8 days ago • 12.6k • 24

liked a model about 1 month ago

nvidia/Eagle2.5-8B

Image-Text-to-Text • 8B • Updated 8 days ago • 12.6k • 24

published a model about 1 month ago

nvidia/Eagle2.5-8B

Image-Text-to-Text • 8B • Updated 8 days ago • 12.6k • 24

published an article about 2 months ago

Article

Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub

and 11 others • Jun 27 • 28

updated a collection 2 months ago

Eagle 2

Collection

Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding. • 10 items • Updated 2 days ago • 36

upvoted a paper 2 months ago

AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs

Paper • 2506.05328 • Published Jun 5 • 20

liked a dataset 3 months ago

nvidia/PhysicalAI-Spatial-Intelligence-Warehouse

Updated May 26 • 59 • 11

updated 2 models 4 months ago

nvidia/Eagle2-1B

Image-Text-to-Text • 1B • Updated Apr 27 • 2.08k • 24

nvidia/Eagle2-2B

Image-Text-to-Text • 2B • Updated Apr 27 • 567 • 29

upvoted a paper 4 months ago

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Paper • 2504.15271 • Published Apr 21 • 66

authored a paper 4 months ago

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Paper • 2504.15271 • Published Apr 21 • 66

liked a dataset 4 months ago

nvidia/Llama-Nemotron-Post-Training-Dataset

Viewer • Updated May 8 • 3.91M • 9.29k • 557

liked a model 5 months ago

nvidia/GR00T-N1-2B

Robotics • 2B • Updated Jul 8 • 1.33k • 325

authored a paper 5 months ago

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 95

upvoted a collection 6 months ago

Deepseek Papers

Collection

Deepseek papers collection • 24 items • Updated 13 days ago • 268

updated 2 models 6 months ago

nvidia/QLIP-L-14-392

0.7B • Updated Feb 10 • 32 • 10

nvidia/QLIP-B-16-256

0.2B • Updated Feb 10 • 85 • 4
