victor (Victor Mustar)

reacted to MrOvkill's post with ❤️ 6 days ago

Post

2258

Hello!

I was just playing around with Python's MIDI library and Colab's code generation, accidentally cooked up a quick n' dirty audio synthesis template.
Have fun!

https://colab.research.google.com/drive/1d-AF6jygCwmoJvAa9nnEMe5ROidnMJNY?usp=sharing

-<3

3 replies

·

reacted to sometimesanotion's post with 👍 6 days ago

Post

4557

I'd like to draw your attention to a Lamarck-based experiment which uses Arcee AI's newly published arcee_fusion merge method for three out of its four merges. Yes, just four. This is a simple one, and its recipe is fully open:

sometimesanotion/Lamarck-14B-v0.7-Fusion

It unifies three branches, all of which feature models which bring Lamarck-14B-v0.7 and Qwenvergence-14B-v12-Prose together. One side features @jpacifico 's jpacifico/Chocolatine-2-14B-Instruct-v2.0.3 and the other features @suayptalha 's suayptalha/Lamarckvergence-14B paired with my models which were their merge ancestors.

A fusion merge - of a fusion merge and a SLERP of a fusion and older merge - should demonstrate the new merge method's behavior in interesting ways, especially in the first 1/4th of the model where the SLERP has less impact.

I welcome you to kick the tires and learn from it. It has prose quality near Qwenvergence v12's - as you'd expect.

Thank you, @mradermacher and @MaziyarPanahi , for the first-day quantizations! Your work helped get me started. https://huggingface.co/models?other=base_model:quantized:sometimesanotion/Lamarck-14B-v0.7-Fusion

5 replies

·

reacted to m-ric's post with 🚀 6 days ago

Post

4294

We now have a Deep Research for academia: SurveyX automatically writes academic surveys nearly indistinguishable from human-written ones 🔥

Researchers from Beijing and Shanghai just published the first application of a deep research system to academia: their algorithm, given a question, can give you a survey of all papers on the subject.

To make a research survey, you generally follow two steps, preparation (collect and organize papers) and writing (outline creation, writing, polishing). Researchers followed the same two steps and automated them.

🎯 For the preparation part, a key part is find all the important references on the given subject.
Researchers first cast a wide net of all relevant papers. But then finding the really important ones is like distilling knowledge from a haystack of information. To solve this challenge, they built an “AttributeTree” object that structures key information from citations. Ablating these AttributeTrees significantly decreased structure and synthesis scores, so they were really useful!

📝 For the writing part, key was to get a synthesis that's both short and true. This is not easy to get with LLMs! So they used methods like LLM-based deduplication to shorten the too verbose listings made by LLMs, and RAG to grab original quotes instead of made-up ones.

As a result, their system outperforms previous approaches by far!

As assessed by LLM-judges, the quality score os SurveyX even approaches this of human experts, with 4.59/5 vs 4.75/5 🏆

I advise you to read the paper, it's a great overview of the kind of assistants that we'll get in the short future! 👉 SurveyX: Academic Survey Automation via Large Language Models (2502.14776)
Their website shows examples of generated surveys 👉 http://www.surveyx.cn/

reacted to stefan-it's post with 👍 6 days ago

Post

5042

She arrived 😍

[Expect more models soon...]

2 replies

·

reacted to openfree's post with 🚀 6 days ago

Post

7523

Datasets Convertor 🚀

openfree/Datasets-Convertor

Welcome to Datasets Convertor, the cutting-edge solution engineered for seamless and efficient data format conversion. Designed with both data professionals and enthusiasts in mind, our tool simplifies the transformation process between CSV, Parquet, and JSONL, XLS file formats, ensuring that your data is always in the right shape for your next analytical or development challenge. 💻✨

Why Choose Datasets Convertor?
In today’s data-driven world, managing and converting large datasets can be a daunting task. Our converter is built on top of robust technologies like Pandas and Gradio, delivering reliable performance with a modern, intuitive interface. Whether you’re a data scientist, analyst, or developer, Datasets Convertor empowers you to effortlessly switch between formats while maintaining data integrity and optimizing storage.

Key Features and Capabilities:
CSV ⇆ Parquet Conversion:
Easily transform your CSV files into the highly efficient Parquet format and vice versa. Parquet’s columnar storage not only reduces file size but also accelerates query performance—a critical advantage for big data analytics. 🔄📂

CSV to JSONL Conversion:
Convert CSV files to JSONL (newline-delimited JSON) to facilitate efficient, line-by-line data processing. This format is particularly useful for streaming data applications, logging systems, and scenarios where incremental data processing is required. Each CSV row is meticulously converted into an individual JSON record, preserving all the metadata and ensuring compatibility with modern data pipelines. 📄➡️📝

Parquet to JSONL Conversion:
For those working with Parquet files, our tool offers a streamlined conversion to JSONL.

Parquet to XLS Conversion.

reacted to Kseniase's post with 👍 6 days ago

Post

9382

8 Free Sources about AI Agents:

Agents seem to be everywhere and this collection is for a deep dive into the theory and practice:

1. "Agents" Google's whitepaper by Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic -> https://www.kaggle.com/whitepaper-agents
Covers agents, their functions, tool use and how they differ from models

2. "Agents in the Long Game of AI. Computational Cognitive Modeling for Trustworthy, Hybrid AI" book by Marjorie McShane, Sergei Nirenburg, and Jesse English -> https://direct.mit.edu/books/oa-monograph/5833/Agents-in-the-Long-Game-of-AIComputational
Explores building AI agents, using Hybrid AI, that combines ML with knowledge-based reasoning

3. "AI Engineer Summit 2025: Agent Engineering" 8-hour video -> https://www.youtube.com/watch?v=D7BzTxVVMuw
Experts' talks that share insights on the freshest Agent Engineering advancements, such as Google Deep Research, scaling tips and more

4. AI Agents Course from Hugging Face -> https://huggingface.co/learn/agents-course/en/unit0/introduction
Agents' theory and practice to learn how to build them using top libraries and tools

5. "Artificial Intelligence: Foundations of Computational Agents", 3rd Edition, book by David L. Poole and Alan K. Mackworth -> https://artint.info/3e/html/ArtInt3e.html
Agents' architectures, how they learn, reason, plan and act with certainty and uncertainty

6. "Intelligent Agents: Theory and Practice" book by Michael Wooldridge -> https://www.cs.ox.ac.uk/people/michael.wooldridge/pubs/ker95/ker95-html.html
A fascinating option to dive into how agents were seen in 1995 and explore their theory, architectures and agent languages

7. The Turing Post articles "AI Agents and Agentic Workflows" on Hugging Face -> https://huggingface.co/Kseniase
We explore agentic workflows in detail and agents' building blocks, such as memory and knowledge

8. Our collection "8 Free Sources to Master Building AI Agents" -> https://www.turingpost.com/p/building-ai-agents-sources

4 replies

·

reacted to freddyaboulton's post with 🚀 6 days ago

Post

2969

Getting WebRTC and Websockets right in python is very tricky. If you've tried to wrap an LLM in a real-time audio layer then you know what I'm talking about.

That's where FastRTC comes in! It makes WebRTC and Websocket streams super easy with minimal code and overhead.

Check out our org: hf.co/fastrtc

reacted to KonradSzafer's post with 👀 6 days ago

Post

1807

I’ve been experimenting with a “Tech Tree” to make ML research more systematic and transparent—turns out it helped me spot hidden interactions between experiments and share progress more easily. I wrote a short blog post with examples and insights! KonradSzafer/tech_tree_blog

reacted to singhsidhukuldeep's post with 👀 6 days ago

Post

1634

I just came across a groundbreaking paper titled "Hypencoder: Hypernetworks for Information Retrieval" by researchers from the University of Massachusetts Amherst that introduces a fundamentally new paradigm for search technology.

Most current retrieval models rely on simple inner product calculations between query and document vectors, which severely limits their expressiveness. The authors prove theoretically that inner product similarity functions fundamentally constrain what types of relevance relationships can be captured.

Hypencoder takes a radically different approach: instead of encoding a query as a vector, it generates a small neural network (called a "q-net") that acts as a learned relevance function. This neural network takes document representations as input and produces relevance scores.

Under the hood, Hypencoder uses:
- Attention-based hypernetwork layers (hyperhead layers) that transform contextualized query embeddings into weights and biases for the q-net
- A document encoder that produces vector representations similar to existing models
- A graph-based greedy search algorithm for efficient retrieval that can search 8.8M documents in under 60ms

The results are impressive - Hypencoder significantly outperforms strong dense retrieval models on standard benchmarks like MS MARCO and TREC Deep Learning Track. The performance gap widens even further on complex retrieval tasks like tip-of-the-tongue queries and instruction-following retrieval.

What makes this approach particularly powerful is that neural networks are universal approximators, allowing Hypencoder to express far more complex relevance relationships than inner product similarity functions. The framework is also flexible enough to replicate any existing neural retrieval method while adding the ability to learn query-dependent weights.

reacted to alvarobartt's post with 🔥 6 days ago

Post

2685

🔥 Agents can do anything! @microsoft Research just announced the release of Magma 8B!

Magma is a new Visual Language Model (VLM) with 8B parameters for multi-modal agents designed to handle complex interactions across virtual and real environments; and it's MIT licensed!

Magma comes with exciting new features such as:
- Introduces the Set-of-Mark and Trace-of-Mark techniques for fine-tuning
- Leverages a large amount of unlabeled video data to learn the spatial-temporal grounding and planning
- A strong generalization and ability to be fine-tuned for other agentic tasks
- SOTA in different multi-modal benchmarks spanning across UI navigation, robotics manipulation, image / video understanding and spatial understanding and reasoning
- Generates goal-driven visual plans and actions for agentic use cases

Model: microsoft/Magma-8B
Technical Report: Magma: A Foundation Model for Multimodal AI Agents (2502.13130)

reacted to prithivMLmods's post with 🔥 6 days ago

Post

5403

Dropping some of the custom fine-tunes based on SigLIP2,
with a single-label classification problem type! 🌀🧤

- AI vs Deepfake vs Real : prithivMLmods/AI-vs-Deepfake-vs-Real-Siglip2
- Deepfake Detect : prithivMLmods/Deepfake-Detect-Siglip2
- Fire Detection : prithivMLmods/Fire-Detection-Siglip2
- Deepfake Quality Assess : prithivMLmods/Deepfake-Quality-Assess-Siglip2
- Guard Against Unsafe Content : prithivMLmods/Guard-Against-Unsafe-Content-Siglip2

🌠Collection : prithivMLmods/siglip2-custom-67bcdb2de8fe96b99fb4e19e

reacted to AdinaY's post with ➕ 6 days ago

Post

2957

Try QwQ-Max-Preview, Qwen's reasoning model here👉 https://chat.qwen.ai
Can't wait for the model weights to drop on the Hugging Face Hub 🔥

4 replies

·

reacted to THUdyh's post with 👀 10 days ago

Post

1970

🔥🔥Introducing Ola! State-of-the-art omni-modal understanding model with advanced progressive modality alignment strategy!
Ola ranks #1 on OpenCompass Leaderboard (<10B)
.
📜Paper: https://arxiv.org/abs/2502.04328
🛠️Code: https://github.com/Ola-Omni/Ola

🛠️We have fully released our video&audio training data, intermediate image&video model at THUdyh/ola-67b8220eb93406ec87aeec37. Try to build your own powerful omni-modal model with our data and models!

reacted to JingzeShi's post with 🚀 10 days ago

Post

2892

🤗Welcome to the Doge Edge Device Small language Model.

SmallDoge/Doge-160M-Instruct

reacted to ychen's post with 👍 10 days ago

Post

2451

Here's some annoying keywords that 4o tends to use when responding to personal experiences with negative sentiments. Will be updated over time.

rough, tough, sound like, sounds like, frustrating, overwhelming

9 replies

·

reacted to lysandre's post with ❤️ 10 days ago

Post

5335

SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.

1 reply

·

reacted to burtenshaw's post with 🚀 11 days ago

Post

6937

AGENTS + FINETUNING! This week Hugging Face learn has a whole pathway on finetuning for agentic applications. You can follow these two courses to get knowledge on levelling up your agent game beyond prompts:

1️⃣ New Supervised Fine-tuning unit in the NLP Course https://huggingface.co/learn/nlp-course/en/chapter11/1
2️⃣New Finetuning for agents bonus module in the Agents Course https://huggingface.co/learn/agents-course/bonus-unit1/introduction

Fine-tuning will squeeze everything out of your model for how you’re using it, more than any prompt.

2 replies

·

reacted to fdaudens's post with ❤️ 11 days ago

Post

5778

🎯 Perplexity drops their FIRST open-weight model on Hugging Face: A decensored DeepSeek-R1 with full reasoning capabilities. Tested on 1000+ examples for unbiased responses.

Check it out: perplexity-ai/r1-1776
Blog post: https://perplexity.ai/hub/blog/open-sourcing-r1-1776

1 reply

·

reacted to clem's post with 👍 11 days ago

Post

2724

What are the best organizations to follow on @huggingface ?

On top of my head:
- Deepseek (35,000 followers): https://huggingface.co/deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co/meta-llama
- Black Forrest Labs (11,000 followers): https://huggingface.co/black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co/openai
- Nvidia (16,000 followers): https://huggingface.co/nvidia
- MIcrosoft (9,000 followers): https://huggingface.co/microsoft
- AllenAI (2,000 followers): https://huggingface.co/allenai
- Mistral (5,000 followers): https://huggingface.co/mistralai
- XAI (600 followers): https://huggingface.co/xai-org
- Stability AI (16,000 followers): https://huggingface.co/stabilityai
- Qwen (16,000 followers): https://huggingface.co/Qwen
- GoogleAI (8,000 followers): https://huggingface.co/google
- Unsloth (3,000 followers): https://huggingface.co/unsloth
- Bria AI (4,000 followers): https://huggingface.co/briaai
- NousResearch (1,300 followers): https://huggingface.co/NousResearch

Bonus, the agent course org with 17,000 followers: https://huggingface.co/agents-course

1 reply

·

replied to AdinaY's post 12 days ago

awesome!

Victor Mustar PRO

AI & ML interests

Recent Activity

Organizations

victor's activity