Vaibhav Srivastav PRO

reach-vb

https://vaibhavs10.github.io

AI & ML interests

TTS + LM performance prediction

Recent Activity

liked a Space 1 day ago

webml-community/kokoro-webgpu

upvoted a paper 2 days ago

High-Fidelity Simultaneous Speech-To-Speech Translation

updated a model 2 days ago

reach-vb/Llama3.2-1B-whisper-turbo-uvx-14000

reach-vb's activity

upvoted a paper 2 days ago

High-Fidelity Simultaneous Speech-To-Speech Translation

Paper • 2502.03382 • Published 4 days ago • 8

upvoted a collection 3 days ago

Hibiki fr-en

Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated 2 days ago • 38

upvoted 2 papers 3 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 4 days ago • 137

Fully Autonomous AI Agents Should Not be Developed

Paper • 2502.02649 • Published 4 days ago • 19

upvoted 2 articles 4 days ago

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

5 days ago

• 80

Article

Open-source DeepResearch – Freeing our search agents

5 days ago

• 792

upvoted an article 6 days ago

Article

🚀 Deploying OLMo-7B with Text Generation Inference (TGI) on Hugging Face Spaces

7 days ago

• 5

upvoted an article 10 days ago

Article

How biased is Whisper? Evaluating Whisper Models for Robustness to Diverse English Accents

10 days ago

• 15

upvoted an article 11 days ago

Article

🚀 Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker!

11 days ago

• 13

upvoted an article 12 days ago

Article

Welcome to Inference Providers on the Hub 🔥

12 days ago

• 290

upvoted a collection 12 days ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 7 items • Updated 2 days ago • 41

upvoted an article 12 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

12 days ago

• 675

upvoted a collection 13 days ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 13 days ago • 330

upvoted a collection 14 days ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 14 days ago • 99

upvoted an article 15 days ago

Article

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

19 days ago

• 60

upvoted an article 16 days ago

Article

Failure Modes of OpenAI Operator

16 days ago

• 3

upvoted an article 17 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

17 days ago

• 121

upvoted a collection 17 days ago

SmolVLM 256M & 500M

Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 17 days ago • 67

upvoted an article 17 days ago

Article

Yay! Organizations can now publish blog Articles

By

and 3 others •

19 days ago

• 32

upvoted a collection 18 days ago

Eagle 2

Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding. • 9 items • Updated 17 days ago • 31