Dreamer312 (MC)

upvoted a paper 3 months ago

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20 • 76

upvoted a collection 3 months ago

Llama 4

Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 4 days ago • 47

upvoted 2 papers 3 months ago

SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization

Paper • 2505.12346 • Published May 18 • 19

Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

Paper • 2409.10262 • Published Sep 16, 2024 • 1

upvoted an article 3 months ago

Mixture of Experts Explained

Article • By … and 5 others • Dec 11, 2023 • 828

upvoted a collection 3 months ago

Qwen3

Collection • 84 items • Updated 12 days ago • 1.1k

upvoted 2 articles 4 months ago

Proximal Policy Optimization (PPO)

Article • Aug 5, 2022 • 53

Merge Large Language Models with mergekit

Article • Jan 9, 2024 • 133

upvoted an article 5 months ago

Trace & Evaluate your Agent with Arize Phoenix

Article • By … and 2 others • Feb 28 • 41

upvoted an article 6 months ago

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Article • Jan 31 • 50

upvoted a paper 10 months ago

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

Paper • 2404.13013 • Published Apr 19, 2024 • 32

upvoted an article 11 months ago

A failed experiment: Infini-Attention, and why we should keep trying?

Article • By … and 2 others • Aug 14, 2024 • 69

upvoted 3 articles 12 months ago

TGI Multi-LoRA: Deploy Once, Serve 30 Models

Article • By … and 2 others • Jul 18, 2024 • 59

Preference Optimization for Vision Language Models

Article • By … and 3 others • Jul 10, 2024 • 80

Docmatix - a huge dataset for Document Visual Question Answering

Article • By … and 1 other • Jul 18, 2024 • 76

upvoted a paper almost 2 years ago

Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 244
