- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 181
- COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training
  Paper • 2401.00849 • Published • 16
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 48
- LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
  Paper • 2311.00571 • Published • 41
Collections including paper arxiv:2412.18319
- LinFusion: 1 GPU, 1 Minute, 16K Image
  Paper • 2409.02097 • Published • 33
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
  Paper • 2409.11406 • Published • 26
- Diffusion Models Are Real-Time Game Engines
  Paper • 2408.14837 • Published • 123
- Segment Anything with Multiple Modalities
  Paper • 2408.09085 • Published • 22
- SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
  Paper • 2407.09413 • Published • 10
- MAVIS: Mathematical Visual Instruction Tuning
  Paper • 2407.08739 • Published • 32
- Kvasir-VQA: A Text-Image Pair GI Tract Dataset
  Paper • 2409.01437 • Published • 71
- MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct
  Paper • 2409.05840 • Published • 47
- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 37
- Token-Budget-Aware LLM Reasoning
  Paper • 2412.18547 • Published • 45
- Efficiently Serving LLM Reasoning Programs with Certaindex
  Paper • 2412.20993 • Published • 35
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 45
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
  Paper • 2501.00958 • Published • 97
- Are Vision-Language Models Truly Understanding Multi-vision Sensor?
  Paper • 2412.20750 • Published • 20
- Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
  Paper • 2412.21187 • Published • 36
- HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
  Paper • 2412.18925 • Published • 95
- ProcessBench: Identifying Process Errors in Mathematical Reasoning
  Paper • 2412.06559 • Published • 78
- Maya: An Instruction Finetuned Multilingual Multimodal Model
  Paper • 2412.07112 • Published • 26
- OpenAI o1 System Card
  Paper • 2412.16720 • Published • 31
- Diving into Self-Evolving Training for Multimodal Reasoning
  Paper • 2412.17451 • Published • 42
- rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
  Paper • 2501.04519 • Published • 237
- Learning an evolved mixture model for task-free continual learning
  Paper • 2207.05080 • Published • 1
- EVOLvE: Evaluating and Optimizing LLMs For Exploration
  Paper • 2410.06238 • Published • 1
- Smaller Language Models Are Better Instruction Evolvers
  Paper • 2412.11231 • Published • 27
- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 37
- Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling
  Paper • 2412.14860 • Published • 2
- Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
  Paper • 2411.04282 • Published • 33
- Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning
  Paper • 2412.15797 • Published • 17
- RL Zero: Zero-Shot Language to Behaviors without any Supervision
  Paper • 2412.05718 • Published • 4
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
  Paper • 2412.16145 • Published • 38
- Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning
  Paper • 2412.15797 • Published • 17
- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 37