Agustín Piqueres Lajarín

plaguss

AI & ML interests

None yet

Recent Activity

updated a dataset about 11 hours ago
HuggingFaceH4/numina-deepseek-r1-qwen-7b
liked a model about 11 hours ago
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
published a dataset about 11 hours ago
HuggingFaceH4/numina-deepseek-r1-qwen-7b

Organizations

Hugging Face, SomosNLP, Hugging Face H4, Argilla, Blog-explorers, Hugging Face TB Research, Argilla Explorers, distilabel-internal-testing, Data Is Better Together, LLHF, SLLHF, Hugging Quants, argilla-internal-testing, Argilla Warehouse, Hugging Face FineVideo, smol-explorers, Hugging Face Science, Data Is Better Together Contributor

plaguss's activity

reacted to lewtun's post with 🚀🔥 about 13 hours ago
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret, we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
updated a dataset about 14 hours ago
published a dataset about 14 hours ago