Collections including paper arxiv:2401.07382

Pending Classification

about 2 hours ago

Video Creation by Demonstration

Paper • 2412.09551 • Published 14 days ago • 8
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published 16 days ago • 45
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Paper • 2412.06531 • Published 17 days ago • 71
APOLLO: SGD-like Memory, AdamW-level Performance

Paper • 2412.05270 • Published 20 days ago • 38

Papers - Reasoning - Critic Pattern

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

Paper • 2305.11738 • Published May 19, 2023 • 8
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning

Paper • 2402.14809 • Published Feb 22 • 3
DRLC: Reinforcement Learning with Dense Rewards from LLM Critic

Paper • 2401.07382 • Published Jan 14 • 2

Papers - Training - Critic Model

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

Paper • 2305.11738 • Published May 19, 2023 • 8
Shepherd: A Critic for Language Model Generation

Paper • 2308.04592 • Published Aug 8, 2023 • 31
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning

Paper • 2402.14809 • Published Feb 22 • 3
DRLC: Reinforcement Learning with Dense Rewards from LLM Critic

Paper • 2401.07382 • Published Jan 14 • 2

Papers - Training - Reward Model

PERL: Parameter Efficient Reinforcement Learning from Human Feedback

Paper • 2403.10704 • Published Mar 15 • 57
WARM: On the Benefits of Weight Averaged Reward Models

Paper • 2401.12187 • Published Jan 22 • 18
RewardBench: Evaluating Reward Models for Language Modeling

Paper • 2403.13787 • Published Mar 20 • 21
DreamReward: Text-to-3D Generation with Human Preference

Paper • 2403.14613 • Published Mar 21 • 35

Papers - Training Research

Measuring the Effects of Data Parallelism on Neural Network Training

Paper • 1811.03600 • Published Nov 8, 2018 • 2
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

Paper • 1804.04235 • Published Apr 11, 2018 • 2
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Paper • 1905.11946 • Published May 28, 2019 • 3
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7 • 62

Moral Foundations of Large Language Models

Paper • 2310.15337 • Published Oct 23, 2023 • 1
Specific versus General Principles for Constitutional AI

Paper • 2310.13798 • Published Oct 20, 2023 • 2
Contrastive Preference Learning: Learning from Human Feedback without RL

Paper • 2310.13639 • Published Oct 20, 2023 • 24
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Paper • 2309.00267 • Published Sep 1, 2023 • 47
