Mishig Davaadorj

mishig

AI & ML interests

NP-completeness, grammars, universality

Recent Activity

upvoted an article 3 days ago

Train 400x faster Static Embedding Models with Sentence Transformers

upvoted an article 3 days ago

❤️ a love letter to the Open AI inference client

updated a Space 4 days ago

nanotron/ultrascale-playbook

mishig's activity

upvoted 2 articles 3 days ago

Article

Train 400x faster Static Embedding Models with Sentence Transformers

Jan 15

• 153

Article

❤️ a love letter to the Open AI inference client

3 days ago

• 8

updated a Space 4 days ago

The Ultra-Scale Playbook

The ultimate guide to training LLM on large GPU Clusters

New activity in nanotron/ultrascale-playbook 4 days ago

Make hash section working

#89 opened 4 days ago

updated a Space 4 days ago

Visualize Dataset (v2.0+ latest dataset format)

Explore robot datasets by entering a dataset ID

upvoted an article 7 days ago

Article

Remote VAEs for decoding with HF endpoints 🤗

8 days ago

• 30

upvoted a paper 11 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 12 days ago • 153

updated a Space 11 days ago

Inference Playground

Generate responses to chat messages

upvoted a paper 12 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 15 days ago • 138

liked a Space 13 days ago

Kokoro Podcast Generator

AI-generated podcast!

liked a model 17 days ago

zed-industries/zeta

Updated 4 days ago • 1.98k • 212

upvoted a collection 19 days ago

SYNTHETIC-1

A collection of tasks & verifiers for reasoning datasets • 9 items • Updated 11 days ago • 49

upvoted an article 19 days ago

Article

State of open video generation models in Diffusers

Jan 27

• 50

upvoted a paper 24 days ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published 26 days ago • 28

upvoted a collection 25 days ago

Hibiki fr-en

Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated 25 days ago • 50

upvoted a paper 25 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 27 days ago • 196

updated a model 28 days ago

simplescaling/s1-32B

Text Generation • Updated 5 days ago • 13.5k • 282

New activity in simplescaling/s1-32B 28 days ago

Update README.md

#1 opened 28 days ago

upvoted a paper 28 days ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published about 1 month ago • 108

upvoted an article about 1 month ago

Article

Replicating DeepSeek R1 for Information Extraction

about 1 month ago

• 36