It's called ๐๐ข๐๐๐๐ง๐จ๐ฌ๐ข๐ฌ and is a lightweight framework that helps you ๐ฑ๐ถ๐ฎ๐ด๐ป๐ผ๐๐ฒ ๐๐ต๐ฒ ๐ฝ๐ฒ๐ฟ๐ณ๐ผ๐ฟ๐บ๐ฎ๐ป๐ฐ๐ฒ ๐ผ๐ณ ๐๐๐ ๐ ๐ฎ๐ป๐ฑ ๐ฟ๐ฒ๐๐ฟ๐ถ๐ฒ๐๐ฎ๐น ๐บ๐ผ๐ฑ๐ฒ๐น๐ ๐ถ๐ป ๐ฅ๐๐ ๐ฎ๐ฝ๐ฝ๐น๐ถ๐ฐ๐ฎ๐๐ถ๐ผ๐ป๐.
You can launch it as an application locally (it's Docker-ready!๐) or, if you want more flexibility, you can integrate it in your code as a python package๐ฆ
The workflow is simple: ๐ง You choose your favorite LLM provider and model (supported, for now, are Mistral AI, Groq, Anthropic, OpenAI and Cohere) ๐ง You pick the embedding models provider and the embedding model you prefer (supported, for now, are Mistral AI, Hugging Face, Cohere and OpenAI) ๐ You prepare and provide your documents โ๏ธ Documents are ingested into a Qdrant vector database and transformed into a synthetic question dataset with the help of LlamaIndex ๐ The LLM is evaluated for the faithfulness and relevancy of its retrieval-augmented answer to the questions ๐ The embedding model is evaluated for hit rate and mean reciprocal ranking (MRR) of the retrieved documents
And the cool thing is that all of this is ๐ถ๐ป๐๐๐ถ๐๐ถ๐๐ฒ ๐ฎ๐ป๐ฑ ๐ฐ๐ผ๐บ๐ฝ๐น๐ฒ๐๐ฒ๐น๐ ๐ฎ๐๐๐ผ๐บ๐ฎ๐๐ฒ๐ฑ: you plug it in, and it works!๐โก
Even cooler? This is all built on top of LlamaIndex and its integrations: no need for tons of dependencies or fancy workarounds๐ฆ And if you're a UI lover, Gradio and FastAPI are there to provide you a seamless backend-to-frontend experience๐ถ๏ธ
โญ๏ธ The AI Energy Score project just launched - this is a game-changer for making informed decisions about AI deployment.
You can now see exactly how much energy your chosen model will consume, with a simple 5-star rating system. Think appliance energy labels, but for AI.
Looking at transcription models on the leaderboard is fascinating: choosing between whisper-tiny or whisper-large-v3 can make a 7x difference. Real-time data on these tradeoffs changes everything.
166 models already evaluated across 10 different tasks, from text generation to image classification. The whole thing is public and you can submit your own models to test.
Why this matters: - Teams can pick efficient models that still get the job done - Developers can optimize for energy use from day one - Organizations can finally predict their AI environmental impact
If you're building with AI at any scale, definitely worth checking out.
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:
- Original release: 8 models, 540K downloads. Just the beginning...
- The community turned those open-weight models into +550 NEW models on Hugging Face. Total downloads? 2.5Mโnearly 5X the originals.
The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.
When you empower builders, innovation explodes. For everyone. ๐
The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version โ 1M downloads alone.