Edmon02's Collections
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Paper
•
2309.14717
•
Published
•
44
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Paper
•
2310.09199
•
Published
•
25
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
Paper
•
2310.08678
•
Published
•
12
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Paper
•
2310.09478
•
Published
•
19
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper
•
2310.11453
•
Published
•
96
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Paper
•
2310.17631
•
Published
•
33
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Paper
•
2309.14509
•
Published
•
17
Skywork: A More Open Bilingual Foundation Model
Paper
•
2310.19341
•
Published
•
5
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Paper
•
2311.09257
•
Published
•
45
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Paper
•
2307.16789
•
Published
•
98
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper
•
2312.15166
•
Published
•
56
MobileQuant: Mobile-friendly Quantization for On-device Language Models
Paper
•
2408.13933
•
Published
•
13
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding
Paper
•
2409.03420
•
Published
•
26
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
Paper
•
2409.12903
•
Published
•
21
Training Language Models to Self-Correct via Reinforcement Learning
Paper
•
2409.12917
•
Published
•
135
Language Models Learn to Mislead Humans via RLHF
Paper
•
2409.12822
•
Published
•
9
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
Paper
•
2410.08196
•
Published
•
45