🔥 Google releases Gemini 2.0, starting with a Flash model that steamrolls GPT-4o and Claude-3.6 Sonnet! And they start a huge effort on agentic capabilities.
The performance improvements are crazy for such a fast model:
‣ Gemini 2.0 Flash outperforms the previous 1.5 Pro model at twice the speed
‣ Now supports both input AND output of images, video, audio and text
‣ Can natively use tools like Google Search and execute code
➡️ If the price is on par with the previous Flash iteration ($0.30 / M tokens, compared with GPT-4o's $1.25), the competition will have a big problem with this 4x cheaper model that gets better benchmarks 🤯
🤔 What about the agentic capabilities?
‣ Project Astra: a universal AI assistant that can use Google Search, Lens and Maps
‣ Project Mariner: a Chrome extension that can complete complex web tasks (83.5% success rate on the WebVoyager benchmark, which is really impressive!)
‣ Jules: an AI coding agent that integrates with GitHub workflows
I'll be eagerly awaiting further news from Google!
Multimodal 🖼️
> Google shipped PaliGemma 2, a new iteration of PaliGemma in more sizes: 3B, 10B and 28B, with pre-trained and captioning variants
> OpenGVLab released InternVL2, seven new vision LMs in different sizes, with a SOTA checkpoint under MIT license ✨
> The Qwen team at Alibaba released the base models of Qwen2-VL in 2B, 7B and 72B checkpoints
LLMs 💬
> Meta released a new iteration of Llama 70B, Llama-3.3-70B, trained further
> EuroLLM-9B-Instruct is a new multilingual LLM for European languages with Apache 2.0 license 🔥
> Dataset: CohereForAI released GlobalMMLU, a multilingual version of MMLU covering 42 languages, with Apache 2.0 license
> Dataset: QwQ-LongCoT-130K is a new dataset to train reasoning models
> Dataset: FineWeb2 just landed with a multilinguality update! 🔥 Nearly 8TB of pretraining data in many languages!
Image/Video Generation 🖼️
> Tencent released HunyuanVideo, a new photorealistic video generation model
> OminiControl is a new editing/control framework for image generation models like Flux
Audio 🔊
> Indic-Parler-TTS is a new text-to-speech model made by the community
Introducing TTS WebGPU: the first-ever text-to-speech web app built with WebGPU acceleration! 🔥 High-quality and natural speech generation that runs 100% locally in your browser, powered by OuteTTS and Transformers.js. Try it out yourself!
A team from NUS and Microsoft just released an agent that can act on any UI (Desktop, Android, Web) without needing additional text information. It works extremely well: they applied their method to a tiny Qwen2-VL-2B and managed to beat methods that use much more powerful vision models (like GPT-4V), without using any additional info (e.g. leveraging the DOM of a webpage) like previous methods did!
They started from the idea that most existing methods rely heavily on text, which makes them less generalizable, while setting aside the rich UI structure that users actually rely on when navigating these interfaces.
⚙️ They put several good ideas to work:
💡 Simplify screenshots to the max: they heavily prune the visual content of UI screenshots by removing cloned image patches (for instance, any vast patch of the same color is reduced to a small patch, while positional embeddings are maintained), then group patches from the same GUI element together to simplify even further (see the sketch after this post).
💡 Build a truly generalist dataset: to train a general UI agent, you need trajectories from every possible UI, expressed in a common language. The authors merge datasets like OmniAct for Desktop, Mind2Web for websites and AMEX for Android trajectories to create a high-quality and diverse dataset.
➡️ Nice results ensued: they fine-tuned a tiny Qwen2-VL-2B with their method, and it reaches SOTA on several tasks (element identification, web navigation), even beating methods that either use additional info from the DOM or use much bigger VLMs like GPT-4V!
And performance could certainly jump with a slightly bigger vision model. Let's hope the community builds this soon!
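To make the patch-pruning idea above more concrete, here is a minimal illustrative sketch (my own toy version, not the authors' implementation): split a screenshot into fixed-size tiles and drop tiles that are near-duplicates of their neighbour, while keeping each tile's position. The function name, patch size and similarity threshold are all assumptions.

```python
# Illustrative sketch of UI-screenshot patch pruning (not the paper's exact method).
# Idea: many UI patches are visually identical (large flat-color areas), so we keep
# one representative patch per run of near-duplicates and remember its position.
import numpy as np
from PIL import Image

def prune_uniform_patches(image_path: str, patch: int = 28, tol: float = 1.0):
    """Split the screenshot into `patch`x`patch` tiles and drop tiles that are
    near-duplicates of their left neighbour. Returns kept tiles and their
    (row, col) positions, so positional information is preserved."""
    img = np.asarray(Image.open(image_path).convert("RGB"), dtype=np.float32)
    h, w, _ = img.shape
    kept_tiles, kept_positions = [], []
    for r in range(h // patch):
        prev = None
        for c in range(w // patch):
            tile = img[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
            # mean absolute pixel difference with the previous tile in the row
            if prev is not None and np.abs(tile - prev).mean() < tol:
                prev = tile
                continue  # near-duplicate of its neighbour -> pruned
            kept_tiles.append(tile)
            kept_positions.append((r, c))
            prev = tile
    return kept_tiles, kept_positions

# Example: on a typical UI screenshot, most flat background tiles get pruned away.
# tiles, positions = prune_uniform_patches("screenshot.png")
# print(f"kept {len(tiles)} patches")
```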
We just released Transformers.js v3.1 and you're not going to believe what's now possible in the browser w/ WebGPU! 🤯 Let's take a look:
‣ Janus from DeepSeek for unified multimodal understanding and generation (Text-to-Image and Image-Text-to-Text)
‣ Qwen2-VL from Qwen for dynamic-resolution image understanding
‣ JinaCLIP from Jina AI for general-purpose multilingual multimodal embeddings
‣ LLaVA-OneVision from ByteDance for Image-Text-to-Text generation
‣ ViTPose for pose estimation
‣ MGP-STR for optical character recognition (OCR)
‣ PatchTST & PatchTSMixer for time series forecasting
That's right, everything running 100% locally in your browser (no data sent to a server)! 🔥 Huge for privacy!
How does it work?
- You give a URL
- The AI assistant crawls the website content and embeds it
- You add it to your frontend in one line of code
- People on your website can ask the assistant questions
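As a rough sketch of what happens under the hood (my own illustrative pipeline, not the product's actual code; the embedding model and chunking scheme are assumptions): crawl the page, chunk and embed the text, then retrieve the closest chunks for each question.

```python
# Illustrative sketch of a "chat with this website" assistant (not the actual product code).
import requests
from bs4 import BeautifulSoup
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def index_website(url: str, chunk_size: int = 500):
    """Fetch the page, strip HTML, split the text into chunks and embed them."""
    html = requests.get(url, timeout=10).text
    text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return chunks, embedder.encode(chunks, convert_to_tensor=True)

def ask(question: str, chunks, chunk_embeddings, top_k: int = 3):
    """Retrieve the chunks most similar to the question (the LLM answering step is omitted)."""
    q_emb = embedder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, chunk_embeddings, top_k=top_k)[0]
    return [chunks[h["corpus_id"]] for h in hits]

# chunks, embs = index_website("https://example.com")
# print(ask("What does this site offer?", chunks, embs))
```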
Been reading about the "bigger models = better AI" narrative getting pushed back today.
@thomwolf tackled this head-on at Web Summit and highlighted how important small models are (and why closed-source companies haven't pushed for this 😬). They're crushing it: today's 1B parameter models outperform last year's 10B models.
Fascinating to hear him talk about the secret sauce behind this approach.
NYT leveraged AI to investigate election interference by analyzing 400+ hours of recorded meetings - that's 5M words of data!
AI spotted patterns, humans verified facts. Every AI-flagged quote was manually verified against source recordings. Really appreciate that they published their full methodology - transparency matters when using AI in journalism.
A perfect blend of tech & journalism.
The future of journalism isn't robots replacing reporters - it's AI helping humans process massive datasets more efficiently. Sometimes the most powerful tech solutions are the least flashy ones.
Let's say you're doing RAG, and in an effort to improve performance, you try to rerank a few possible source snippets by their relevancy to a query.
How can you score similarity between your query and any source document? 🤔
The simplest option is no interaction (bi-encoders): you encode each token from both the query and the doc as separate vectors, then average the tokens of each text separately to get 2 vectors in total, then you compute similarity via cosine or something similar. ➡️ Notable examples: check the top of the MTEB leaderboard!
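A minimal sketch of this no-interaction scoring, averaging token embeddings into one vector per text and comparing with cosine similarity (the model name is just an example):

```python
# Minimal sketch of "no interaction" (bi-encoder) similarity: one mean-pooled
# vector per text, compared with cosine similarity. Model choice is illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        token_vectors = model(**inputs).last_hidden_state[0]  # one vector per token
    return token_vectors.mean(dim=0)  # average the tokens -> a single vector

query_vec = embed("How do I reset my password?")
doc_vec = embed("To reset your password, go to Settings > Security.")
score = torch.nn.functional.cosine_similarity(query_vec, doc_vec, dim=0)
print(f"similarity: {score.item():.3f}")
```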
Late-interaction models, like ColBERT, encode each token from both query and doc as separate vectors as before, but compare them all together without averaging them first and losing information.
This is more accurate than no interaction but also slower, because you have to compare n*m vectors instead of 2. At least you can still precompute and store the document vectors. And ColBERT has some optimisations, like pooling, to be faster.
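And here is a toy sketch of the late-interaction idea, a ColBERT-style MaxSim over the same kind of token embeddings (a simplified illustration, not the actual ColBERT implementation, which also trains a projection on top of the token vectors):

```python
# Toy sketch of ColBERT-style late interaction ("MaxSim"): keep one vector per token,
# and for each query token take its best match among the document tokens.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

def token_embeddings(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        vectors = model(**inputs).last_hidden_state[0]   # (n_tokens, hidden)
    return F.normalize(vectors, dim=-1)                  # unit vectors, so dot = cosine

def maxsim_score(query: str, doc: str) -> float:
    q, d = token_embeddings(query), token_embeddings(doc)   # (n, h), (m, h)
    sim = q @ d.T                                            # compare all n*m token pairs
    return sim.max(dim=1).values.sum().item()                # best doc token per query token

print(maxsim_score("reset password", "To reset your password, go to Settings."))
```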
Anthropic just released Contextual Retrieval, a chunk-contextualization technique that vastly improves RAG performance! 🔥
Crash reminder: Retrieval Augmented Generation (RAG) is a widely-used technique for improving your LLM chatbot's answers to user questions.
It goes like this: instead of generating an LLM answer straight away, you add a preliminary step called Retrieval, which retrieves relevant documents from your knowledge base through semantic search and appends the top K documents to the prompt. ➡️ As a result, the LLM answer is grounded in context.
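As a tiny illustration of that flow (retrieval details omitted; the prompt wording is my own):

```python
# Minimal sketch of the RAG flow described above: retrieve top-K chunks, then
# prepend them to the prompt so the LLM's answer is grounded in that context.
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    context = "\n\n".join(retrieved_chunks)  # top-K documents from semantic search
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "When was the company founded?",
    ["Acme Corp was founded in 1999 in Berlin.", "Acme sells rocket skates."],
)
# `prompt` is then sent to your LLM of choice
```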
The difficulty with this retrieval step is that when you split your documents into chunks that will be retrieved, you lose context. So important chunks could be missed.
💡 Anthropic's freshly released blog post shows that you can add some context to each chunk with one LLM call. Then you embed the original chunk plus that bit of added context, so that the embedding is much more representative of the document in its context!
🤔 Isn't that crazy expensive? Well, it would have been before, but not so much anymore with their new prompt caching feature, which makes duplicating thousands of requests with the same prompt much less expensive. They give an indicative price tag of only $1.02 per million document tokens processed!
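Here is a hedged sketch of what that could look like with the Anthropic Python SDK; the prompt wording and model name are my assumptions, not Anthropic's exact recipe from the blog post. The full document goes in a cached system block, so the thousands of per-chunk calls can reuse it cheaply:

```python
# Illustrative sketch of contextual chunk augmentation with prompt caching
# (Anthropic Python SDK; prompt wording and model name are assumptions).
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def contextualize_chunk(full_document: str, chunk: str) -> str:
    """Ask the model for a short context blurb situating `chunk` in the document.
    The full document is marked as cacheable, so repeated calls over the same
    document mostly hit the prompt cache instead of paying for it every time."""
    response = client.messages.create(
        model="claude-3-5-haiku-latest",
        max_tokens=150,
        system=[
            {
                "type": "text",
                "text": f"<document>\n{full_document}\n</document>",
                "cache_control": {"type": "ephemeral"},  # cache the big, repeated part
            }
        ],
        messages=[
            {
                "role": "user",
                "content": f"Here is a chunk from the document above:\n<chunk>\n{chunk}\n</chunk>\n"
                           "Give a short context (1-2 sentences) situating this chunk within "
                           "the document, to improve search retrieval. Answer with the context only.",
            }
        ],
    )
    return response.content[0].text

# The string you then embed is: added_context + "\n" + original_chunk
```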
And this vastly improves performance on their benchmark!