We are reproducing the full DeepSeek R1 data and training pipeline so that everyone can use their recipe. Instead of doing it in secret, we can do it together in the open!
🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.
🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.
🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.
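The data curation in Step 1 is often done by rejection sampling: generate reasoning traces from the teacher and keep only the ones whose final answer checks out. A minimal sketch of that filtering idea (the trace format and helper names are illustrative assumptions, not Open R1's actual code):

```python
# Hypothetical sketch of distillation-data curation via rejection sampling:
# keep only teacher traces whose final answer matches a known reference.
# The "Answer:" trace format is an assumption for illustration.

def extract_answer(trace: str) -> str:
    """Pull the final answer from a trace ending in 'Answer: <value>'."""
    marker = "Answer:"
    return trace.rsplit(marker, 1)[-1].strip() if marker in trace else ""

def filter_traces(samples, reference_answers):
    """Keep (problem, trace) pairs whose extracted answer matches the reference."""
    kept = []
    for problem, trace in samples:
        if extract_answer(trace) == reference_answers.get(problem):
            kept.append({"prompt": problem, "completion": trace})
    return kept

# Two sampled traces for the same problem; only the correct one survives.
samples = [
    ("2+2?", "Think: 2 plus 2 is 4. Answer: 4"),
    ("2+2?", "Think: maybe 5. Answer: 5"),
]
dataset = filter_traces(samples, {"2+2?": "4"})
```

The surviving pairs would then serve as SFT data for the student model.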
✨ Launched All-Scenario Reasoning Model (language, visual, and search reasoning capabilities), with medical expertise as one of its key highlights. https://ying.baichuan-ai.com/chat
✨ Released Baichuan-M1-14B Medical LLM on the Hub. Available in both Base and Instruct versions, supporting English & Chinese.
UI-TARS 🔥 a series of native GUI agent models (2B/7B/72B) released by ByteDance, combining perception, reasoning, grounding, and memory into one system.
What happened yesterday in the Chinese AI community?
T2A-01-HD https://hailuo.ai/audio MiniMax's Text-to-Audio model, now in Hailuo AI, offers 300+ voices in 17+ languages and instant emotional voice cloning.
Trae https://www.trae.ai/ A new coding tool by ByteDance for professional developers, supporting English & Chinese with free access to Claude 3.5 and GPT-4 for a limited time.
Kimi k1.5 https://github.com/MoonshotAI/Kimi-k1.5 | https://kimi.ai/ An o1-level multi-modal model by Moonshot AI, using reinforcement learning with long and short chain-of-thought and supporting up to 128k tokens.
And today…
Hunyuan3D-2.0 tencent/Hunyuan3D-2 A SoTA 3D synthesis system for high-res textured assets by Tencent Hunyuan, with open weights and code!
✨ MIT License: enabling distillation for custom models
✨ 32B & 70B models match OpenAI o1-mini in multiple capabilities
✨ API live now! Access Chain-of-Thought reasoning with model='deepseek-reasoner'
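Calling the reasoner is a standard chat-completions request with the model name from the announcement. A minimal sketch of building that request (the endpoint URL and response shape are assumptions to verify against DeepSeek's official API docs):

```python
# Sketch of a chat-completions request for the reasoner model named in the
# post. The base URL below is an assumption; check the official docs.
import json

BASE_URL = "https://api.deepseek.com/chat/completions"  # assumed endpoint

def build_request(question: str) -> dict:
    """Build the JSON payload for a chain-of-thought completion."""
    return {
        "model": "deepseek-reasoner",  # model name from the announcement
        "messages": [{"role": "user", "content": question}],
    }

payload = build_request("How many primes are below 20?")
body = json.dumps(payload)  # send with any HTTP client, plus your API key header
```

The API is OpenAI-compatible, so the same payload works through the usual OpenAI-style client libraries pointed at DeepSeek's base URL.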
InternLM3-8B-Instruct 🔥 Trained on just 4T tokens, it outperforms Llama3.1-8B and Qwen2.5-7B in reasoning tasks, at 75% lower cost! internlm/internlm3-67875827c377690c01a9131d
✨ MiniMax-Text-01:
- 456B parameters with 45.9B activated per token
- Combines Lightning Attention, Softmax Attention, and MoE for optimal performance
- Training context up to 1M tokens, inference handles 4M tokens
✨ MiniMax-VL-01:
- ViT-MLP-LLM framework (non-transformer)
- Handles image inputs from 336×336 to 2016×2016
- 694M image-caption pairs + 512B tokens processed across 4 stages
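The gap between 456B total and 45.9B activated parameters comes from MoE routing: a gate scores the experts for each token and only the top-k actually run. A toy sketch of that routing idea (expert counts, scores, and the gating function here are illustrative, not MiniMax's actual design):

```python
# Toy sketch of MoE top-k routing: why only a fraction of a sparse model's
# parameters is active per token. All numbers are illustrative.
import math

def softmax(xs):
    """Numerically stable softmax over a list of gate scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_scores, k=2):
    """Return the indices of the top-k experts and their renormalized weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: -probs[i])[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

# 8 experts available, but only 2 are activated for this token:
chosen = route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
```

Each token's output is then the weighted sum of just those k experts' outputs, so compute per token scales with k, not with the total expert count.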
MiniCPM-o 2.6 🔥 an end-side multimodal LLM released by OpenBMB from the Chinese community. Model: openbmb/MiniCPM-o-2_6
✨ Real-time English/Chinese conversation, emotion control and ASR/STT
✨ Real-time video/audio understanding
✨ Processes up to 1.8M pixels, leads OCRBench & supports 30+ languages
💫 ...And we're live! 💫 Seasonal newsletter from ethicsy folks at Hugging Face, exploring the ethics of "AI Agents" https://huggingface.co/blog/ethics-soc-7
Our analyses found:
- There's a spectrum of "agent"-ness
- *Safety* is a key issue, leading to many other value-based concerns
Read for details & what to do next! With @evijit, @giadap, and @sasha