Collections including paper arxiv:2402.03620
-
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs
Paper • 2407.00653 • Published • 11
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
Paper • 2406.18629 • Published • 40
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
Paper • 2406.14562 • Published • 27
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Paper • 2406.04271 • Published • 27
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 142
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 109
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Paper • 2402.07456 • Published • 41
Learning From Mistakes Makes LLM Better Reasoner
Paper • 2310.20689 • Published • 28
-
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
Paper • 2212.14024 • Published • 3
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Paper • 2310.03714 • Published • 30
DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines
Paper • 2312.13382 • Published • 3
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper • 2312.10003 • Published • 35
-
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 99
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 109
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Paper • 2404.03715 • Published • 60
Do language models plan ahead for future tokens?
Paper • 2404.00859 • Published • 2
-
Communicative Agents for Software Development
Paper • 2307.07924 • Published • 2
Self-Refine: Iterative Refinement with Self-Feedback
Paper • 2303.17651 • Published • 2
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper • 2312.10003 • Published • 35
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 14
-
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper • 2403.10704 • Published • 56
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models
Paper • 2403.13447 • Published • 17
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 109
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 67