Hugging Face TB Research

Enterprise

community

AI & ML interests

Exploring smol models and high-quality web and synthetic datasets generated by LLMs (TB is for Textbook, as inspired by the "Textbooks Are All You Need" paper)

Recent Activity

pcuenq new activity about 16 hours ago

HuggingFaceTB/SmolVLM2-500M-Video-Instruct: torch import missed in example

orrzohar new activity 1 day ago

HuggingFaceTB/SmolVLM2-2.2B-Instruct: checkpoint you are trying to load has model type `smolvlm` but Transformers does not recognize this

frimelle authored a paper 2 days ago

Why Should This Article Be Deleted? Transparent Stance Detection in Multilingual Wikipedia Editor Discussions

HuggingFaceTB's activity

pcuenq

in HuggingFaceTB/SmolVLM2-500M-Video-Instruct about 16 hours ago

torch import missed in example

#9 opened about 21 hours ago by

amrs-tech

orrzohar

in HuggingFaceTB/SmolVLM2-2.2B-Instruct 1 day ago

checkpoint you are trying to load has model type `smolvlm` but Transformers does not recognize this

#7 opened 1 day ago by

JLouisBiz

fdaudens

posted an update 1 day ago

Post

1450

Trying something new to keep you ahead of the curve: The 5 AI stories of the week - a weekly curation of the most important AI news you need to know. Do you like it?

For more AI stories and deeper analysis, check out my newsletter: https://open.substack.com/pub/fdaudens/p/ai-competition-heats-up-grok-3-iphone

1 reply

mfarre

in HuggingFaceTB/SmolVLM2-2.2B-Instruct 1 day ago

Input Video length constraints

#6 opened 1 day ago by

NikhilJoson

Fix dtype processing in README example

#5 opened 1 day ago by

orrzohar

mfarre

in HuggingFaceTB/SmolVLM2-500M-Video-Instruct 1 day ago

update readme dtype

#8 opened 1 day ago by

orrzohar

mfarre

in HuggingFaceTB/SmolVLM2-256M-Video-Instruct 1 day ago

update readme dtype

#5 opened 1 day ago by

orrzohar

updated a model 1 day ago

HuggingFaceTB/SmolVLM2-500M-Video-Instruct

Video-Text-to-Text • Updated about 16 hours ago • 716 • 18

orrzohar

in HuggingFaceTB/SmolVLM2-500M-Video-Instruct 1 day ago

update readme dtype

#8 opened 1 day ago by

orrzohar

updated a model 1 day ago

HuggingFaceTB/SmolVLM2-256M-Video-Instruct

Video-Text-to-Text • Updated 1 day ago • 786 • 20

orrzohar

in HuggingFaceTB/SmolVLM2-256M-Video-Instruct 1 day ago

update readme dtype

#5 opened 1 day ago by

orrzohar

updated a model 1 day ago

HuggingFaceTB/SmolVLM2-2.2B-Instruct

Video-Text-to-Text • Updated 1 day ago • 1.9k • 49

orrzohar

in HuggingFaceTB/SmolVLM2-2.2B-Instruct 1 day ago

Fix dtype processing in README example

#5 opened 1 day ago by

orrzohar

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (CUDABFloat16Type) should be the same

#4 opened 1 day ago by

Neiko2002

Using pre-computed embeddings for images/frames and using as input

#2 opened 2 days ago by

maxlun

davanstrien

posted an update 2 days ago

Post

2310

Hacked together a way to log trl GRPO training completions to a 🤗 dataset repo. This allows you to:

- Track rewards from multiple reward functions
- Treat the completion and rewards from training as a "proper" dataset and do EDA
- Share results for open science

The implementation is super hacky, but I'm curious if people would find this useful.

To push completions to the Hub, you just need two extra parameters:

log_completions=True
log_completions_hub_repo='your-username/repo-name'

Example dataset: davanstrien/test-logs
Colab: https://colab.research.google.com/drive/1wzBFPVthRYYTp-mEYlznLg_e_0Za1M3g
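A minimal sketch of the idea in the post above, not the author's implementation: it shows one way completions and the scores from several reward functions could be flattened into rows that are easy to treat as a dataset. It uses only the standard library and does not call trl or the Hub. The names `CompletionLog`, `length_reward`, and `has_answer_tag` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical reward functions: each maps a completion string to a float.
def length_reward(completion: str) -> float:
    """Give shorter completions a higher score (illustrative only)."""
    return 1.0 / (1 + len(completion.split()))


def has_answer_tag(completion: str) -> float:
    """Score 1.0 if the completion contains an <answer> tag (illustrative only)."""
    return 1.0 if "<answer>" in completion else 0.0


@dataclass
class CompletionLog:
    """Collects one flat row per completion, with one column per reward function."""

    reward_fns: Dict[str, Callable[[str], float]]
    rows: List[dict] = field(default_factory=list)

    def log(self, step: int, prompt: str, completions: List[str]) -> None:
        # Score every completion with every reward function and store a flat row.
        for completion in completions:
            row = {"step": step, "prompt": prompt, "completion": completion}
            for name, fn in self.reward_fns.items():
                row[f"reward_{name}"] = fn(completion)
            self.rows.append(row)

    def mean_reward(self, name: str) -> float:
        # Average of one reward column over all logged rows.
        values = [row[f"reward_{name}"] for row in self.rows]
        return sum(values) / len(values)


log = CompletionLog({"length": length_reward, "answer_tag": has_answer_tag})
log.log(
    step=0,
    prompt="What is 2 + 2?",
    completions=["<answer>4</answer>", "I think it is four"],
)
```

Because every row has the same keys, `log.rows` could be handed to a dataframe or dataset library for EDA; how the rows are then pushed to a Hub repo is left out here, since that part depends on the author's setup.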

frimelle

authored 2 papers 2 days ago

Why Should This Article Be Deleted? Transparent Stance Detection in Multilingual Wikipedia Editor Discussions

Paper • 2310.05779 • Published Oct 9, 2023 • 1

Presumed Cultural Identity: How Names Shape LLM Responses

Paper • 2502.11995 • Published 5 days ago • 9

frimelle

posted an update 3 days ago

Post

2252

What’s in a name? More than you might think, especially for AI.
Whenever I introduce myself, people often start speaking French to me, even though my French is très basic. It turns out that AI systems do something similar:
Large language models infer cultural identity from names, shaping their responses based on presumed backgrounds. But is this helpful personalization or a reinforcement of stereotypes?
In our latest paper, we explored this question by testing DeepSeek, Llama, Aya, Mistral-Nemo, and GPT-4o-mini on how they associate names with cultural identities. We analysed 900 names from 30 cultures and found strong assumptions baked into AI responses: some cultures were overrepresented, while others barely registered.
For example, a name like "Jun" often triggered Japan-related responses, while "Carlos" was linked primarily to Mexico, even though these names exist in multiple countries. Meanwhile, names from places like Ireland led to more generic answers, suggesting weaker associations in the training data.
This has real implications for AI fairness: How should AI systems personalize without stereotyping? Should they adapt at all based on a name?
Work with some of my favourite researchers: @sidicity Arnav Arora and @IAugenstein
Read the full paper here: Presumed Cultural Identity: How Names Shape LLM Responses (2502.11995)

merve

posted an update 3 days ago

Post

4707

Google just released PaliGemma 2 Mix: new versatile instruction vision language models 🔥

> Three new models: 3B, 10B, 28B with res 224, 448 💙
> Can do vision language tasks with open-ended prompts, understand documents, and segment or detect anything 🤯

Read more https://huggingface.co/blog/paligemma2mix
Try the demo google/paligemma2-10b-mix
All models are here google/paligemma-2-mix-67ac6a251aaf3ee73679dcc4

Team members 41
