Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning Paper • 2410.22304 • Published Oct 29, 2024 • 17
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines Paper • 2409.12959 • Published Sep 19, 2024 • 37
Enhancing Large Vision Language Models with Self-Training on Image Comprehension Paper • 2405.19716 • Published May 30, 2024
MIRAI: Evaluating LLM Agents for Event Forecasting Paper • 2407.01231 • Published Jul 1, 2024 • 16
Check out our new benchmark paper on LLM agents for global events forecasting! MIRAI: Evaluating LLM Agents for Event Forecasting (2407.01231)
Arxiv: https://arxiv.org/abs/2407.01231
Project page: https://mirai-llm.github.io
GitHub Repo: https://github.com/yecchen/MIRAI
Dataset: https://drive.google.com/file/d/1xmSEHZ_wqtBu1AwLpJ8wCDYmT-jRpfrN/view?usp=sharing
Interactive Demo Notebook: https://colab.research.google.com/drive/1QyqT35n6NbtPaNtqQ6A7ILG_GMeRgdnO?usp=sharing
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding Paper • 2406.09411 • Published Jun 13, 2024 • 18
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Paper • 2403.14624 • Published Mar 21, 2024 • 51
Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance Paper • 2402.08680 • Published Feb 13, 2024 • 1
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models Paper • 2402.05935 • Published Feb 8, 2024 • 15
Robust Learning with Progressive Data Expansion Against Spurious Correlation Paper • 2306.04949 • Published Jun 8, 2023
Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves Paper • 2311.04205 • Published Nov 7, 2023 • 5
Towards Understanding Mixture of Experts in Deep Learning Paper • 2208.02813 • Published Aug 4, 2022 • 1
Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP Paper • 2310.00927 • Published Oct 2, 2023 • 1
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2, 2024 • 64