Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models Paper • 2412.12606 • Published 9 days ago • 41
Smaller Language Models Are Better Instruction Evolvers Paper • 2412.11231 • Published 10 days ago • 24
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation Paper • 2410.23090 • Published Oct 30 • 54
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation Paper • 2410.09584 • Published Oct 12 • 47
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making Paper • 2409.16686 • Published Sep 25 • 9
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data Paper • 2409.03810 • Published Sep 5 • 30
DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning Paper • 2407.04078 • Published Jul 4 • 17
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Paper • 2407.01284 • Published Jul 1 • 75
PreAct: Predicting Future in ReAct Enhances Agent's Planning Ability Paper • 2402.11534 • Published Feb 18 • 1
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Paper • 2406.08587 • Published Jun 12 • 15
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation Paper • 2406.18676 • Published Jun 26 • 6
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published Jun 19 • 16
Semi-Supervised Knowledge-Grounded Pre-training for Task-Oriented Dialog Systems Paper • 2210.08873 • Published Oct 17, 2022 • 1
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models Paper • 2308.01825 • Published Aug 3, 2023 • 21
InstructERC: Reforming Emotion Recognition in Conversation with a Retrieval Multi-task LLMs Framework Paper • 2309.11911 • Published Sep 21, 2023 • 3
Revisit Input Perturbation Problems for LLMs: A Unified Robustness Evaluation Framework for Noisy Slot Filling Task Paper • 2310.06504 • Published Oct 10, 2023 • 1
Query and Response Augmentation Cannot Help Out-of-domain Math Reasoning Generalization Paper • 2310.05506 • Published Oct 9, 2023 • 1