- RLHF Workflow: From Reward Modeling to Online RLHF
  Paper • 2405.07863 • Published • 67
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 126
- Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
  Paper • 2405.15574 • Published • 53
- An Introduction to Vision-Language Modeling
  Paper • 2405.17247 • Published • 85
Collections including paper arxiv:2408.13933
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 143
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 11
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 50
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 44

- Qwen2-Audio Technical Report
  Paper • 2407.10759 • Published • 55
- Qwen2 Technical Report
  Paper • 2407.10671 • Published • 155
- Gemma 2: Improving Open Language Models at a Practical Size
  Paper • 2408.00118 • Published • 73
- EXAONE 3.0 7.8B Instruction Tuned Language Model
  Paper • 2408.03541 • Published • 34

- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 38
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 68
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
  Paper • 2407.07523 • Published • 4
- Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
  Paper • 2407.12327 • Published • 77

- Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
  Paper • 2402.14797 • Published • 19
- Subobject-level Image Tokenization
  Paper • 2402.14327 • Published • 17
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 126
- GPTVQ: The Blessing of Dimensionality for LLM Quantization
  Paper • 2402.15319 • Published • 19

- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 44
- PaLI-3 Vision Language Models: Smaller, Faster, Stronger
  Paper • 2310.09199 • Published • 24
- Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
  Paper • 2310.08678 • Published • 12
- MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
  Paper • 2310.09478 • Published • 19