AI & ML interests

Evaluating open LLMs

Recent Activity

open-llm-leaderboard's activity

lewtunย 
posted an update about 13 hours ago
view post
Post
829
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

๐Ÿงช Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

๐Ÿง  Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

๐Ÿ”ฅ Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
AdinaYย 
posted an update 1 day ago
AdinaYย 
posted an update 2 days ago
AdinaYย 
posted an update 3 days ago
AdinaYย 
posted an update 5 days ago
view post
Post
2542
What happened yesterday in the Chinese AI community? ๐Ÿš€

T2A-01-HD ๐Ÿ‘‰ https://hailuo.ai/audio
MiniMax's Text-to-Audio model, now in Hailuo AI, offers 300+ voices in 17+ languages and instant emotional voice cloning.

Tare ๐Ÿ‘‰ https://www.trae.ai/
A new coding tool by Bytedance for professional developers, supporting English & Chinese with free access to Claude 3.5 and GPT-4 for a limited time.

DeepSeek-R1 Series ๐Ÿ‘‰ deepseek-ai/deepseek-r1-678e1e131c0169c0bc89728d
Open-source reasoning models with MIT license by DeepSeek.

Kimi K 1.5 ๐Ÿ‘‰ https://github.com/MoonshotAI/Kimi-k1.5 | https://kimi.ai/
An O1-level multi-modal model by MoonShot AI, utilizing reinforcement learning with long and short-chain-of-thought and supporting up to 128k tokens.

And todayโ€ฆ

Hunyuan 3D-2.0 ๐Ÿ‘‰ tencent/Hunyuan3D-2
A SoTA 3D synthesis system for high-res textured assets by Tencent Hunyuan , with open weights and code!

Stay tuned for more updates ๐Ÿ‘‰ https://huggingface.co/zh-ai-community
AdinaYย 
posted an update 5 days ago
view post
Post
718
Hunyuan 3D 2.0๐Ÿ”ฅ a synthesis system for high-res textured 3D assets released by Tencent Hunyuan

2 key components: Hunyuan3D-DiT (geometry) and Hunyuan3D-Paint (textures) work together, achieving highly realistic 3D results.

Model: tencent/Hunyuan3D-2
Demo coming soon!
AdinaYย 
posted an update 6 days ago
view post
Post
2744
BIG release by DeepSeek AI๐Ÿ”ฅ๐Ÿ”ฅ๐Ÿ”ฅ

DeepSeek-R1 & DeepSeek-R1-Zero: two 660B reasoning models are here, alongside 6 distilled dense models (based on Llama & Qwen) for the community!
https://huggingface.co/deepseek-ai
deepseek-ai/DeepSeek-R1

โœจ MIT License : enabling distillation for custom models
โœจ 32B & 70B models match OpenAI o1-mini in multiple capabilities
โœจ API live now! Access Chain of Thought reasoning with model='deepseek-reasoner'
AdinaYย 
posted an update 9 days ago
AdinaYย 
posted an update 11 days ago
AdinaYย 
posted an update 11 days ago
view post
Post
3074
MiniMax, the company behind Hailuo_AI, has joined the open source community by releasing both models and demos of MiniMax-Text-01 & MiniMax-VL-01๐Ÿ”ฅ
- Model
MiniMaxAI/MiniMax-VL-01
MiniMaxAI/MiniMax-Text-01
- Demo
MiniMaxAI/MiniMax-VL-01
MiniMaxAI/MiniMax-Text-01

โœจ MiniMax-text-01:
- 456B with 45.9B activated per token
- Combines Lightning Attention, Softmax Attention, and MoE for optimal performance
- Training context up to 1M tokens, inference handles 4M tokens

โœจ MiniMax-VL-01:
- ViT-MLP-LLM framework ( non-transformer๐Ÿ‘€)
- Handles image inputs from 336ร—336 to 2016ร—2016
- 694M image-caption pairs + 512B tokens processed across 4 stages
  • 1 reply
ยท
AdinaYย 
posted an update 12 days ago
view post
Post
3164
MiniCPM-o2.6 ๐Ÿ”ฅ an end-side multimodal LLMs released by OpenBMB from the Chinese community
Model: openbmb/MiniCPM-o-2_6
โœจ Real-time English/Chinese conversation, emotion control and ASR/STT
โœจ Real-time video/audio understanding
โœจ Processes up to 1.8M pixels, leads OCRBench & supports 30+ languages
megย 
posted an update 12 days ago
view post
Post
2923
๐Ÿ’ซ...And we're live!๐Ÿ’ซ Seasonal newsletter from ethicsy folks at Hugging Face, exploring the ethics of "AI Agents"
https://huggingface.co/blog/ethics-soc-7
Our analyses found:
- There's a spectrum of "agent"-ness
- *Safety* is a key issue, leading to many other value-based concerns
Read for details & what to do next!
With @evijit , @giadap , and @sasha
AdinaYย 
posted an update 16 days ago
albertvillanovaย 
posted an update 19 days ago