All HF Hub posts

clem posted an update 1 day ago
Today, we're unveiling two new open-source AI robots! HopeJR for $3,000 & Reachy Mini for $300 🤖🤖🤖

Let's go open-source AI robotics!
ginipick posted an update about 2 hours ago
🎨 AI Hairstyle Changer - Transform with 93 Styles! 💇‍♀️✨

🚀 Introduction
Experience 93 different hairstyles and 29 hair colors in real time with your uploaded photo!
Transform your look instantly with this AI-powered Gradio web app.


✨ Key Features

📸 3 Simple Steps
Upload Photo - Upload a front-facing photo
Select Style - Choose from 93 hairstyles
Pick Color - Click your desired color from the 29-color palette (the same flow can be scripted, as sketched below)
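For anyone who prefers scripting, the Space can presumably also be driven from Python with gradio_client; the sketch below is hypothetical, and the endpoint name and argument order are assumptions (the Space's "Use via API" page documents the real signature):

```python
# Hypothetical sketch -- api_name and argument order are assumptions.
from gradio_client import Client, handle_file

client = Client("ginipick/Change-Hair")
result = client.predict(
    handle_file("my_photo.jpg"),  # front-facing photo
    "Pixie Cut",                  # one of the 93 hairstyles
    "Rose Gold",                  # one of the 29 palette colors
    api_name="/predict",          # assumed endpoint name
)
print(result)  # typically a path to the generated image
```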


💫 Diverse Hairstyles (93 types)

🎯 Short Cuts: Pixie Cut, Bob, Lob, Crew Cut, Undercut
🌊 Waves: Soft Waves, Hollywood Waves, Finger Waves
🎀 Braids: French Braid, Box Braids, Fishtail Braid, Cornrows
👑 Updos: Chignon, Messy Bun, Top Knot, French Twist
🌈 Special Styles: Space Buns, Dreadlocks, Mohawk, Beehive

🎨 Hair Color Palette (29 colors)

🤎 Natural Colors: Black, Browns, Blonde variations
❤️ Red Tones: Red, Auburn, Copper, Burgundy
💜 Fashion Colors: Blue, Purple, Pink, Green, Rose Gold
⚪ Cool Tones: Silver, Ash Blonde, Titanium

🌟 Key Advantages

⚡ Fast Processing: Get results in just 10-30 seconds
🎯 High Accuracy: Natural-looking transformations with AI technology
💎 Professional Quality: High-resolution output suitable for social media
🔄 Unlimited Trials: Try as many combinations as you want
📱 User-Friendly: Intuitive interface with visual color palette


💡 Perfect For

💈 Salon Consultations: Show clients potential new looks before cutting
🛍️ Personal Styling: Experiment before making a big change
🎭 Entertainment: Fun transformations for social media content
🎬 Creative Projects: Character design and visualization
👗 Fashion Industry: Match hairstyles with outfits and makeup
📸 Photography: Pre-visualization for photoshoots

LINK: ginipick/Change-Hair
MonsterMMORPG posted an update 1 day ago
VEO 3 FLOW Full Tutorial - How To Use VEO 3 in FLOW Guide: https://youtu.be/AoEmQPU2gtg


VEO 3 AI is rocking the generative AI field right now. FLOW is the platform that lets you use VEO 3 with many cool features. This is an official tutorial and guide made by the Google team; I edited it slightly. I hope you find it helpful.

FLOW : https://labs.google/flow/about

Veo 3 is Google DeepMind’s most advanced video generation model to date. It allows users to create high-quality, cinematic video clips from simple text prompts, making it one of the most powerful AI tools for video creation. What sets Veo 3 apart is its ability to generate videos with native audio. This means that along with stunning visuals, Veo 3 can produce synchronized dialogue, ambient sounds, and background musicβ€”all from a single prompt. For filmmakers, this is a significant leap forward, as it eliminates the need for separate audio generation or complex syncing processes. Veo 3 also excels in realism, accurately simulating real-world physics and ensuring precise lip-syncing for characters, making the generated content feel remarkably lifelike.
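Flow itself is point-and-click, but for a rough idea of what programmatic Veo access looks like, here is a minimal sketch using the google-genai Python SDK; the model id and response handling are assumptions (Veo 3 access is gated, so check the Gemini API docs for current model names):

```python
# Hedged sketch: text-to-video with the google-genai SDK.
import time

from google import genai

client = genai.Client()  # reads the API key from the environment

operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",  # assumed Veo 3 model id
    prompt="A chef plating a dessert, close-up, soft kitchen ambience",
)

# Video generation is long-running; poll until the operation completes.
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

video = operation.response.generated_videos[0]
client.files.download(file=video.video)
video.video.save("veo_clip.mp4")
```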

Introducing Flow: AI Filmmaking Made Seamless

While Veo 3 handles the heavy lifting of video and audio generation, Flow is the creative interface that brings it all together. Flow is Google’s new AI filmmaking tool, custom-designed to work with Veo 3, as well as Google’s other advanced models like Gemini (for natural language processing) and Imagen (for text-to-image generation). Flow is built to be intuitive, allowing filmmakers to describe their ideas in everyday language and see them transformed into cinematic scenes. It offers a suite of features that give creators unprecedented control over their projects, from camera movements to scene transitions, all while maintaining consistency across clips.
AtAndDev posted an update 2 days ago
deepseek-ai/DeepSeek-R1-0528

This is the end
darkc0de posted an update 1 day ago
🤗👨🏻‍🎓
merve posted an update 1 day ago
introducing: VLM vibe eval 🪭 visionLMsftw/VLMVibeEval

vision LMs are saturated over benchmarks, so we built vibe eval 💬

> compare different models with refreshed in-the-wild examples in different categories 🤠
> submit your favorite model for eval
no numbers -- just vibes!
BFFree posted an update about 14 hours ago
I am a shy artist, primarily because I don't get motivation from sharing art publicly. I see so much new art online every day that once I begin thinking about where I fit in, the mental fatigue becomes counterproductive for me.

Recently I shared an album of hundreds of creations with a friend (and singular art fan), and he asked some questions that I felt were interesting enough to prompt this post on my process, what it teaches me, and what I am seeking.

Specifically, I have learned to take ink drawings and create renderings that reveal my actual intention. My digital art goal is to recreate natural details in imagined characters and landscapes that deal with my affection for abstraction, deconstruction, and humor.

My drawing goals are to be humorous and crafty about rendering things just slightly incorrectly, so that the viewer sees something familiar and recognizable even when it's nonsense.

My process uses hysts/ControlNet-v1-1 with Lineart, 50 steps, and a guidance scale of 14, and I give minimal descriptions that are often plain. Example: "Really real old dog, plant, and another old dog, with an alligator turtle, posing for a photography portrait".
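For anyone who wants to reproduce this step outside the Space, here is a minimal diffusers sketch; the checkpoints are assumptions about what the hysts/ControlNet-v1-1 Space wraps (ControlNet v1.1 lineart on an SD 1.5 base):

```python
# Minimal sketch: ControlNet v1.1 lineart conditioning with diffusers.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_lineart", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The ink drawing, already preprocessed into clean line art.
sketch = load_image("ink_drawing.png")

image = pipe(
    "Really real old dog, plant, and another old dog, with an alligator "
    "turtle, posing for a photography portrait",
    image=sketch,
    num_inference_steps=50,  # the 50 steps mentioned above
    guidance_scale=14.0,     # the guidance scale of 14 mentioned above
).images[0]
image.save("render.png")
```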

In the past few months I started taking the ControlNet render to multimodalart/flux-style-shaping and mashing up styles. Here I used a portrait of a tortoise and a dog lying next to each other on a reflective tile floor.

Last night, I took the Flux output and had it described using WillemVH/Image_To_Text_Description, which was very accurate given the image.

I then fed the prompt back into Alpha-VLLM/Lumina-Image-2.0.
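The whole chain could in principle be scripted with gradio_client; the sketch below is hypothetical, and every api_name and argument is an assumption (each Space's "Use via API" page documents the real signature):

```python
# Hypothetical sketch of the Space-to-Space chain described above.
from gradio_client import Client, handle_file

render = "render.png"  # the ControlNet lineart render from the first step

# Mash up styles with flux-style-shaping (assumed parameters).
styled = Client("multimodalart/flux-style-shaping").predict(
    handle_file(render), handle_file("tortoise_and_dog.jpg"), api_name="/predict"
)

# Describe the Flux output (assumed parameters).
caption = Client("WillemVH/Image_To_Text_Description").predict(
    handle_file(styled), api_name="/predict"
)

# Feed the caption back into Lumina-Image-2.0 (assumed parameters).
final = Client("Alpha-VLLM/Lumina-Image-2.0").predict(caption, api_name="/predict")
print(final)
```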

The last step confirmed why I prefer using sketches to language. One, I am a visual artist, so I have much better nuance with drawings than with words. Two, my mind's eye looks for the distorted. Three, MORE FUN.



openfree posted an update about 23 hours ago
πŸŽ™οΈ Voice Clone AI Podcast Generator: Create Emotionally Rich Podcasts with Your Own Voice!

πŸš€ Project Introduction
Hello! Today we're excited to introduce an AI-powered solo podcast generator that creates high-quality voice cloning with authentic emotional expression.
Transform any PDF document, web URL, or keyword into a professional podcast with just a few clicks! πŸ“šβž‘οΈπŸŽ§

VIDraft/Voice-Clone-Podcast

✨ Key Features
1. 🎯 Multiple Input Methods

URL: Simply paste any blog or article link
PDF: Upload research papers or documents directly
Keyword: Enter a topic and the AI searches for the latest information to create content

2. 🎭 Emotionally Expressive Voice Cloning
Powered by Chatterbox TTS:

🎤 Voice Cloning: Learns and replicates your unique voice from a short sample
📢 Natural intonation and emotional expression
🌊 Customizable emotion intensity via the Exaggeration control
⚡ Seamless handling of long texts with automatic chunking (sketched below)
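As a sketch of what the cloning-plus-chunking loop might look like with the chatterbox package (the generate calls follow the package README; the chunk size and naive sentence splitter are assumptions, not this app's actual logic):

```python
# Sketch: chunked long-form generation with Chatterbox voice cloning.
import re

import torch
import torchaudio as ta
from chatterbox.tts import ChatterboxTTS

def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Greedily pack whole sentences into chunks under max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks

model = ChatterboxTTS.from_pretrained(device="cuda")
script = "Hello and welcome to the show! ..."  # the generated podcast script

pieces = [
    model.generate(
        chunk,
        audio_prompt_path="my_voice_sample.wav",  # short reference clip
        exaggeration=0.7,                         # emotion-intensity control
    )
    for chunk in chunk_text(script)
]
ta.save("podcast.wav", torch.cat(pieces, dim=-1), model.sr)
```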

3. 🤖 State-of-the-Art LLM Script Generation

Professional-grade English dialogue using Private-BitSix-Mistral
12 natural conversational exchanges
Real-time web search integration for up-to-date information
Fully editable generated scripts! ✏️

💡 Use Cases
📖 Educational Content

Transform complex research papers into easy-to-understand podcasts
Create English learning materials in your own voice

📰 News & Information

Convert international articles into engaging audio content
Produce global trend analysis podcasts

🎨 Creative Content

Tell stories in English with your own voice
Build your global personal brand with custom audio content

πŸ› οΈ Tech Stack
🧠 LLM: Llama CPP + Private-BitSix-Mistral
πŸ—£οΈ TTS: Chatterbox (Voice Cloning & Emotional Expression)
πŸ” Search: Brave Search API
πŸ“„ Document Processing: LangChain + PyPDF
πŸ–₯️ Interface: Gradio
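As a sketch of the document-processing leg of this stack (standard LangChain APIs; the loader choice and chunking parameters are assumptions about the app's internals):

```python
# Sketch: turning a PDF into LLM-ready text with LangChain + PyPDF.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = PyPDFLoader("paper.pdf")  # uses pypdf under the hood
pages = loader.load()              # one Document per page

splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
chunks = splitter.split_documents(pages)

# Concatenate into a single context string for script generation.
context = "\n\n".join(doc.page_content for doc in chunks)
```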
🎉 What Makes Us Special

🎤 Voice Cloning: High-fidelity voice replication from just a short audio sample
😊 Emotion Control
📝 Unlimited Length
🔄 Real-time Updates
prithivMLmods posted an update 1 day ago
Just made a demo for Cosmos-Reason1, a physical AI model that understands physical common sense and generates appropriate embodied decisions in natural language through long chain-of-thought reasoning. I also added video understanding support to it. 🤗🚀

✦ Try the demo here: prithivMLmods/Cosmos-x-DocScope

‹ Cosmos-Reason1-7B: nvidia/Cosmos-Reason1-7B
‹ docscopeOCR-7B-050425-exp: prithivMLmods/docscopeOCR-7B-050425-exp
‹ Captioner-Relaxed: Ertugrul/Qwen2.5-VL-7B-Captioner-Relaxed

‹ Multimodal Implementations: prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0

‹ GitHub: https://github.com/PRITHIVSAKTHIUR/Nvidia-Cosmos-Reason1-Demo

To learn more, visit the respective model cards.
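For a rough idea of how a demo like this can drive the model, here is a hedged transformers sketch; it assumes Cosmos-Reason1-7B loads through the Qwen2.5-VL classes its architecture builds on, which may not match the demo's actual code:

```python
# Hedged sketch: querying Cosmos-Reason1-7B via the Qwen2.5-VL classes.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "nvidia/Cosmos-Reason1-7B"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Is it physically plausible to stack these objects? Reason step by step."},
    ],
}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(
    text=[prompt], images=[Image.open("scene.jpg")], return_tensors="pt"
).to(model.device)

out = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0])
```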
dhruv3006 posted an update about 5 hours ago
C/ua Cloud Containers: Computer-Use Agents in the Cloud

The first cloud platform built for computer-use agents. Open-source backbone. Linux, Windows, and macOS desktops in your browser. Works with OpenAI, Anthropic, or any LLM. Pay only for compute time.

Our beta users have deployed thousands of agents over the past month. Available now in three tiers: Small (1 vCPU/4 GB), Medium (2 vCPU/8 GB), and Large (8 vCPU/32 GB). Windows and macOS support is coming soon.

GitHub: https://github.com/trycua/cua (we are open source!)

Cloud Platform: https://www.trycua.com/blog/introducing-cua-cloud-containers