Ryan Foster's picture

Ryan Foster

f0ster

·

f0ster

AI & ML interests

📝🖼️🗣️🐙

Recent Activity

upvoted a paper about 22 hours ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

upvoted a paper about 23 hours ago

Improving Vision-Language-Action Model with Online Reinforcement Learning

upvoted a paper 3 days ago

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

View all activity

Organizations

f0ster's activity

upvoted a paper about 22 hours ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 84

upvoted a paper about 23 hours ago

Improving Vision-Language-Action Model with Online Reinforcement Learning

Paper • 2501.16664 • Published Jan 28 • 1

upvoted a paper 3 days ago

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

Paper • 2503.16365 • Published 4 days ago • 33

upvoted a paper 5 days ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published 8 days ago • 28

upvoted a paper about 2 months ago

DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

Paper • 2501.16764 • Published Jan 28 • 22

upvoted 2 papers 2 months ago

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published Jan 10 • 69

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published Jan 10 • 62

commented a paper 2 months ago

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

Paper • 2501.04689 • Published Jan 8 • 17 •

liked a model 2 months ago

microsoft/phi-4

Text Generation • Updated 28 days ago • 624k • • 1.93k

commented a paper 2 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 265 •

upvoted a paper 2 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 265

commented a paper 2 months ago

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

Paper • 2501.04689 • Published Jan 8 • 17 •

upvoted 2 papers 2 months ago

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

Paper • 2501.04689 • Published Jan 8 • 17

DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization

Paper • 2501.03271 • Published Jan 5 • 11

upvoted 2 papers 3 months ago

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 44

Molar: Multimodal LLMs with Collaborative Filtering Alignment for Enhanced Sequential Recommendation

Paper • 2412.18176 • Published Dec 24, 2024 • 15

upvoted a collection 3 months ago

Health AI Developer Foundations (HAI-DEF)

Groups models released for use in health AI by Google. Read more about HAI-DEF at https://developers.google.com/health-ai-developer-foundations • 4 items • Updated 7 days ago • 29

upvoted a paper 3 months ago

GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration

Paper • 2412.04440 • Published Dec 5, 2024 • 20

reacted to fdaudens's post with 👍 4 months ago

Post

1339

🚀 Your AI toolkit just got a major upgrade! I updated the Journalists on Hugging Face community's collection with tools for investigative work, content creation, and data analysis.

Sharing these new additions with the links in case it’s helpful:
- @wendys-llc 's excellent 6-part video series on AI for investigative journalism https://www.youtube.com/playlist?list=PLewNEVDy7gq1_GPUaL0OQ31QsiHP5ncAQ
- @jeremycaplan 's curated AI Spaces on HF https://wondertools.substack.com/p/huggingface
- @Xenova 's Whisper Timestamped (with diarization!) for private, on-device transcription Xenova/whisper-speaker-diarization & Xenova/whisper-word-level-timestamps
- Flux models for image gen & LoRAs autotrain-projects/train-flux-lora-ease
- FineGrain's object cutter finegrain/finegrain-object-cutter and object eraser (this one's cool) finegrain/finegrain-object-eraser
- FineVideo: massive open-source annotated dataset + explorer HuggingFaceFV/FineVideo-Explorer
- Qwen2 chat demos, including 2.5 & multimodal versions (crushing it on handwriting recognition) Qwen/Qwen2.5 & Qwen/Qwen2-VL
- GOT-OCR integration stepfun-ai/GOT_official_online_demo
- HTML to Markdown converter maxiw/HTML-to-Markdown
- Text-to-SQL query tool by @davidberenstein1957 for HF datasets davidberenstein1957/text-to-sql-hub-datasets

There's a lot of potential here for journalism and beyond. Give these a try and let me know what you build!

You can also add your favorite ones if you're part of the community!

Check it out: https://huggingface.co/JournalistsonHF

#AIforJournalism #HuggingFace #OpenSourceAI

upvoted a paper 5 months ago

ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning

Paper • 2407.20806 • Published Jul 30, 2024 • 1