Metal Whale

metalwhale

https://blog.metalwhale.dev/

AI & ML interests

None yet

Recent Activity

liked a model 14 days ago

zai-org/GLM-4.5

upvoted a paper 21 days ago

Group Sequence Policy Optimization

liked a model about 1 month ago

moonshotai/Kimi-K2-Instruct

Organizations

None yet

upvoted a paper 21 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 25 days ago • 289

upvoted a collection 4 months ago

Qwen3

Collection • 84 items • Updated 12 days ago • 1.1k

upvoted 2 papers 5 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 416

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123

upvoted an article 6 months ago

Open-source DeepResearch – Freeing our search agents

Article • and 4 others • Feb 4 • 1.28k

upvoted an article 7 months ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • and 2 others • Jan 28 • 877

upvoted a paper 8 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 109

upvoted a collection 9 months ago

Molmo

Collection • Artifacts for open multimodal language models. • 5 items • Updated Apr 30 • 308

upvoted an article 9 months ago

Releasing the largest multilingual open pretraining dataset

Article • and 2 others • Nov 13, 2024 • 102

upvoted a paper 10 months ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 180
