Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models with similar GPU RAM usage and token throughput.
- SmolVLM generates tokens 7.5 to 16 times faster than Qwen2-VL!
- Other models at this size crash a laptop, but SmolVLM comfortably generates 17 tokens/sec on a MacBook!
- SmolVLM can be fine-tuned on Google Colab, or process millions of documents with a consumer GPU!
- SmolVLM even outperforms larger models on video benchmarks, despite not even being trained on videos!
The latest o1 model from OpenAI is still unable to answer whether 9.11 > 9.9 correctly.
A possible explanation? Tokenization - and our latest work investigates how it affects a model's ability to do math!
In this blog post, we discuss:
- The different ways numbers are tokenized in modern LLMs
- Our detailed approach to comparing these various methods
- How we got a free boost in arithmetic performance by adding a few lines of code to the base Llama 3 tokenizer
- And a definitive, best tokenization method for math in LLMs!
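To make the comparison concrete, here is a minimal sketch (plain Python regexes, not the actual tokenizer code) of two common number pre-tokenization schemes: giving every digit its own token versus grouping digit runs left-to-right into chunks of up to three digits, which is roughly what the Llama 3 tokenizer does. The function name and scheme labels are illustrative, not from the blog post.

```python
import re

def pretokenize_numbers(text, scheme):
    """Split text according to a number pre-tokenization scheme
    (illustrative sketch, not a real tokenizer implementation).
    'single': one token per digit.
    'chunk3': digit runs grouped left-to-right into chunks of up to
              three digits (roughly the Llama 3 behavior)."""
    pattern = {"single": r"\d|\D+", "chunk3": r"\d{1,3}|\D+"}[scheme]
    return re.findall(pattern, text)

print(pretokenize_numbers("1234567", "single"))  # ['1','2','3','4','5','6','7']
print(pretokenize_numbers("1234567", "chunk3"))  # ['123','456','7']
print(pretokenize_numbers("9.11 vs 9.9", "chunk3"))
```

Under the chunked scheme, 9.11 splits into the tokens `9`, `.`, `11` while 9.9 splits into `9`, `.`, `9`, so the model has to compare the multi-digit token `11` against `9` rather than reasoning digit by digit — one hypothesis for why such comparisons trip models up.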
PawMatchAI: Making Breed Selection More Intuitive! Excited to share the latest update to this AI-powered companion for finding your perfect furry friend! The breed recommendation system just got a visual upgrade to help you make better decisions.
What's New? Enhanced breed recognition accuracy through strategic model improvements:
- Upgraded to a fine-tuned ConvNeXt architecture for superior feature extraction
- Implemented progressive layer unfreezing during training
- Optimized data augmentation pipeline for better generalization
- Achieved an 8% improvement in breed classification accuracy
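As a framework-agnostic illustration of the progressive-unfreezing idea (the function name and schedule below are hypothetical, not the project's actual code): the classifier head is trained first, and earlier backbone stages become trainable one at a time as training proceeds.

```python
def trainable_stages(num_stages, epoch, epochs_per_stage=2):
    """Progressive layer unfreezing schedule (illustrative sketch).
    Stage 0 is the classifier head; higher indices are progressively
    earlier backbone stages. One extra stage becomes trainable every
    `epochs_per_stage` epochs, so early epochs fine-tune only the head
    and later epochs fine-tune more of the network."""
    n = min(num_stages, 1 + epoch // epochs_per_stage)
    return set(range(n))

for epoch in range(6):
    print(epoch, sorted(trainable_stages(4, epoch)))
```

In a PyTorch training loop, this schedule would translate to toggling `requires_grad` on the corresponding backbone stages at the start of each epoch.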
Key Features:
- Smart breed recognition powered by AI
- Visual matching scores with intuitive color indicators
- Detailed breed comparisons with interactive tooltips
- Lifestyle-based recommendations tailored to your needs
Project Vision: Combining my passion for AI and pets, this project represents another step toward my goal of creating meaningful AI applications. Each update aims to make the breed selection process more accessible while improving the underlying technology.
We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute!
How? By combining step-wise reward models with tree search algorithms :)
We show that smol models can match or exceed the performance of their much larger siblings when given enough "time to think"
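The simplest form of this "time to think" idea is best-of-N selection: spend inference compute on many candidate solutions and let a reward model pick the winner. The sketch below is a toy stand-in — the mock generator and scorer are hypothetical placeholders for an LLM and a process reward model, not the recipe's actual code.

```python
def best_of_n(problem, generate, score, n=16):
    """Toy best-of-N test-time scaling sketch: sample n candidate
    solutions, score each with a (mock) reward model, and keep the
    best one. More samples means more test-time compute and better
    odds that at least one candidate scores highly."""
    candidates = [generate(problem, seed=i) for i in range(n)]
    return max(candidates, key=score)

# Mock generator and reward model, for illustration only.
generate = lambda problem, seed: f"answer-{(seed * 7) % 10}"
score = lambda candidate: int(candidate.rsplit("-", 1)[1])

print(best_of_n("2+2?", generate, score, n=5))  # picks the highest-scoring sample
```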
We're open-sourcing the full recipe and sharing a detailed blog post.
In our blog post we cover:
Compute-optimal scaling: How we implemented DeepMind's recipe to boost the mathematical capabilities of open models at test time.
Diverse Verifier Tree Search (DVTS): An unpublished extension we developed to the verifier-guided tree search technique. This simple yet effective method improves diversity and delivers better performance, particularly at large test-time compute budgets.
Search and Learn: A lightweight toolkit for implementing search strategies with LLMs, built for speed with vLLM.