All HF Hub posts
🚨 AI Hairstyle Changer - Transform with 93 Styles! 💇‍♀️✨
Introduction
Experience 93 different hairstyles and 29 hair colors in real-time with your uploaded photo!
Transform your look instantly with this AI-powered Gradio web app.
✨ Key Features
📸 Simple 3 Steps
Upload Photo - Upload a front-facing photo
Select Style - Choose from 93 hairstyles
Pick Color - Click your desired color from 29 color palette options
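If you'd rather script the Space than click through the UI, a minimal gradio_client sketch might look like the following; the endpoint name and parameter order are assumptions, so check the Space's "Use via API" panel for the real signature.

```python
# Hypothetical sketch: drive the Space with gradio_client.
# The api_name and parameter order are assumptions, not the Space's
# documented API; see its "Use via API" panel for the real signature.
from gradio_client import Client, handle_file

client = Client("ginipick/Change-Hair")
result = client.predict(
    handle_file("my_photo.jpg"),  # front-facing photo
    "Pixie Cut",                  # one of the 93 hairstyles (assumed parameter)
    "Ash Blonde",                 # one of the 29 palette colors (assumed parameter)
    api_name="/predict",          # assumed endpoint name
)
print(result)  # typically a path to the generated image
```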
💫 Diverse Hairstyles (93 types)
🎯 Short Cuts: Pixie Cut, Bob, Lob, Crew Cut, Undercut
Waves: Soft Waves, Hollywood Waves, Finger Waves
Braids: French Braid, Box Braids, Fishtail Braid, Cornrows
Updos: Chignon, Messy Bun, Top Knot, French Twist
Special Styles: Space Buns, Dreadlocks, Mohawk, Beehive
🎨 Hair Color Palette (29 colors)
🤎 Natural Colors: Black, Browns, Blonde variations
❤️ Red Tones: Red, Auburn, Copper, Burgundy
Fashion Colors: Blue, Purple, Pink, Green, Rose Gold
⚪ Cool Tones: Silver, Ash Blonde, Titanium
Key Advantages
⚡ Fast Processing: Get results in just 10-30 seconds
🎯 High Accuracy: Natural-looking transformations with AI technology
Professional Quality: High-resolution output suitable for social media
Unlimited Trials: Try as many combinations as you want
📱 User-Friendly: Intuitive interface with visual color palette
💡 Perfect For
Salon Consultations: Show clients potential new looks before cutting
Personal Styling: Experiment before making a big change
Entertainment: Fun transformations for social media content
🎬 Creative Projects: Character design and visualization
Fashion Industry: Match hairstyles with outfits and makeup
📸 Photography: Pre-visualization for photoshoots
LINK: ginipick/Change-Hair

MonsterMMORPG posted an update · 1 day ago
VEO 3 FLOW Full Tutorial - How To Use VEO3 in FLOW Guide : https://youtu.be/AoEmQPU2gtg
VEO 3 AI is rocking the generative AI field right now. FLOW is the platform that lets you use VEO 3 with so many cool features. This is an official tutorial and guide made by the Google team; I edited it slightly. I hope it's helpful.
FLOW : https://labs.google/flow/about
Veo 3 is Google DeepMindβs most advanced video generation model to date. It allows users to create high-quality, cinematic video clips from simple text prompts, making it one of the most powerful AI tools for video creation. What sets Veo 3 apart is its ability to generate videos with native audio. This means that along with stunning visuals, Veo 3 can produce synchronized dialogue, ambient sounds, and background musicβall from a single prompt. For filmmakers, this is a significant leap forward, as it eliminates the need for separate audio generation or complex syncing processes. Veo 3 also excels in realism, accurately simulating real-world physics and ensuring precise lip-syncing for characters, making the generated content feel remarkably lifelike.
Introducing Flow: AI Filmmaking Made Seamless
While Veo 3 handles the heavy lifting of video and audio generation, Flow is the creative interface that brings it all together. Flow is Googleβs new AI filmmaking tool, custom-designed to work with Veo 3, as well as Googleβs other advanced models like Gemini (for natural language processing) and Imagen (for text-to-image generation). Flow is built to be intuitive, allowing filmmakers to describe their ideas in everyday language and see them transformed into cinematic scenes. It offers a suite of features that give creators unprecedented control over their projects, from camera movements to scene transitions, all while maintaining consistency across clips.
introducing: VLM vibe eval 💪
visionLMsftw/VLMVibeEval
vision LMs are saturating benchmarks, so we built vibe eval 💬
> compare different models with refreshed in-the-wild examples in different categories 🤗
> submit your favorite model for eval
no numbers -- just vibes!
I am a shy artist, primarily because I don't get motivation from sharing art publicly. I see so much new art online every day that once I begin thinking about where I fit in, the mental fatigue becomes counterproductive for me.
Recently I shared an album of hundreds of creations with a friend (and singular art fan), and he asked some questions that I felt were interesting enough to prompt this post on my process, what it teaches me, and what I am seeking.
Specifically, I have learned to take ink drawings and create renderings that reveal my actual intention. My digital art goal is to work natural details into imagined characters and landscapes that indulge my affection for abstraction, deconstruction, and humor.
My drawing goal is to be humorous and crafty about rendering things just slightly incorrectly, so the viewer sees something familiar and recognizable even when it's nonsense.
My process uses hysts/ControlNet-v1-1 with Lineart, 50 steps, and a guidance scale of 14, and I give minimal descriptions that are often plain. Example: "Really real old dog, plant, and another old dog, with an alligator turtle, posing for a photography portrait".
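To reproduce that step outside the hosted Space, a minimal diffusers sketch of a ControlNet v1.1 lineart pass with the same settings might look like this (the model IDs and preprocessor are my assumptions, not the Space's exact code):

```python
# Sketch of a ControlNet v1.1 lineart pass with diffusers, approximating
# the hysts/ControlNet-v1-1 settings above (50 steps, guidance scale 14).
# Model IDs are assumptions; the Space's internals may differ.
import torch
from controlnet_aux import LineartDetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Extract a lineart map from the scanned ink drawing
detector = LineartDetector.from_pretrained("lllyasviel/Annotators")
lineart = detector(load_image("ink_drawing.jpg"))

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_lineart", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "Really real old dog, plant, and another old dog, "
    "with an alligator turtle, posing for a photography portrait",
    image=lineart,
    num_inference_steps=50,
    guidance_scale=14.0,
).images[0]
image.save("render.png")
```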
In the past few months I started taking the ControlNet render to multimodalart/flux-style-shaping and mashing up styles. Here I used a portrait of a tortoise and a dog lying next to each other on a reflective tile floor.
Last night, I took the Flux output and had it described using WillemVH/Image_To_Text_Description, which was very accurate given the image.
I then fed the prompt back into Alpha-VLLM/Lumina-Image-2.0
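The whole hand-off between Spaces can be scripted the same way with gradio_client; every api_name and argument below is an illustrative assumption, since each Space defines its own API:

```python
# Illustrative sketch of chaining the three Spaces with gradio_client.
# All api_name values and parameter orders are assumptions; inspect each
# Space's "Use via API" panel for the real endpoints.
from gradio_client import Client, handle_file

# 1) Mash up the ControlNet render with a style reference image
styled = Client("multimodalart/flux-style-shaping").predict(
    handle_file("render.png"),       # structure/content image (assumed param)
    handle_file("style_photo.jpg"),  # style reference (assumed param)
    api_name="/predict",
)

# 2) Describe the styled output with the captioning Space
caption = Client("WillemVH/Image_To_Text_Description").predict(
    handle_file(styled), api_name="/predict"
)

# 3) Feed the caption back into Lumina as a text prompt
final = Client("Alpha-VLLM/Lumina-Image-2.0").predict(
    caption, api_name="/predict"
)
print(final)
```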
The last step confirmed why I prefer sketches to language. One, I am a visual artist, so I have much better nuance with drawings than with words. Two, my mind's eye looks for the distorted. Three, MORE FUN.
🎙️ Voice Clone AI Podcast Generator: Create Emotionally Rich Podcasts with Your Own Voice!
Project Introduction
Hello! Today we're excited to introduce an AI-powered solo podcast generator that produces high-quality voice cloning with authentic emotional expression.
Transform any PDF document, web URL, or keyword into a professional podcast with just a few clicks! 📄➡️🎧
VIDraft/Voice-Clone-Podcast
✨ Key Features
1. 🎯 Multiple Input Methods
URL: Simply paste any blog or article link
PDF: Upload research papers or documents directly
Keyword: Enter a topic and AI searches for the latest information to create content
2. Emotionally Expressive Voice Cloning
Powered by Chatterbox TTS:
🎤 Voice Cloning: Learn and replicate your unique voice perfectly
Natural intonation and emotional expression
Customizable emotion intensity with Exaggeration control
⚡ Seamless handling of long texts with automatic chunking
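As a rough sketch of how the Chatterbox piece fits together, here is a minimal example with the open-source chatterbox-tts package (file paths are placeholders, and the app's actual chunking and wiring are not reproduced here):

```python
# Minimal sketch of voice-cloned synthesis with the open-source
# chatterbox-tts package; paths are placeholders, and the app's own
# chunking logic is not shown.
import torchaudio
from chatterbox.tts import ChatterboxTTS

model = ChatterboxTTS.from_pretrained(device="cuda")
wav = model.generate(
    "Welcome back to the show! Today we dig into the paper...",
    audio_prompt_path="my_voice_sample.wav",  # short clip of your own voice
    exaggeration=0.7,                         # emotion-intensity control
)
torchaudio.save("podcast_chunk.wav", wav, model.sr)
```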
3. 🤖 State-of-the-Art LLM Script Generation
Professional-grade English dialogue using Private-BitSix-Mistral
12 natural conversational exchanges
Real-time web search integration for up-to-date information
Fully editable generated scripts! ✏️
💡 Use Cases
Educational Content
Transform complex research papers into easy-to-understand podcasts
Create English learning materials in your own voice
📰 News & Information
Convert international articles into engaging audio content
Produce global trend analysis podcasts
🎨 Creative Content
Tell stories in English with your own voice
Build your global personal brand with custom audio content
🛠️ Tech Stack
🧠 LLM: llama.cpp + Private-BitSix-Mistral
🗣️ TTS: Chatterbox (Voice Cloning & Emotional Expression)
Search: Brave Search API
Document Processing: LangChain + PyPDF
🖥️ Interface: Gradio
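For the document side, the PDF ingestion step plausibly looks something like this LangChain sketch (an illustration under assumed chunk sizes, not the app's actual code):

```python
# Sketch of PDF ingestion with LangChain's PyPDF loader; chunk sizes
# are assumptions, and this is an illustration, not the app's code.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

pages = PyPDFLoader("paper.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=2000, chunk_overlap=200
).split_documents(pages)
source_text = "\n".join(c.page_content for c in chunks)
# source_text then goes to the LLM to draft the 12-exchange script
```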
What Makes Us Special
🎤 Voice Cloning: Perfect voice replication from just a short audio sample
Emotion Control · Unlimited Length · Real-time Updates

prithivMLmods posted an update · 1 day ago
Just made a demo for Cosmos-Reason1, a physical AI model that understands physical common sense and generates appropriate embodied decisions in natural language through long chain-of-thought reasoning. Also added video understanding support to it. 🤗
✦ Try the demo here : prithivMLmods/Cosmos-x-DocScope
⤹ Cosmos-Reason1-7B : nvidia/Cosmos-Reason1-7B
⤹ docscopeOCR-7B-050425-exp : prithivMLmods/docscopeOCR-7B-050425-exp
⤹ Captioner-Relaxed : Ertugrul/Qwen2.5-VL-7B-Captioner-Relaxed
⤹ Multimodal Implementations : prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
⤹ GitHub : https://github.com/PRITHIVSAKTHIUR/Nvidia-Cosmos-Reason1-Demo
To know more, visit the model card of the respective model.
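For local use, Cosmos-Reason1-7B is built on a Qwen2.5-VL-style architecture, so a transformers sketch along these lines should be close (the loading class, prompt format, and file name are my assumptions; defer to the model card):

```python
# Hedged sketch: assumes Cosmos-Reason1-7B loads via the standard
# Qwen2.5-VL path in transformers; check the model card for the
# officially supported recipe.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "nvidia/Cosmos-Reason1-7B"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("scene.jpg")  # placeholder input frame
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Is it safe for the robot arm to grasp the cup? Reason step by step."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```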
C/ua Cloud Containers: Computer-Use Agents in the Cloud
First cloud platform built for Computer-Use Agents. Open-source backbone. Linux/Windows/macOS desktops in your browser. Works with OpenAI, Anthropic, or any LLM. Pay only for compute time.
Our beta users have deployed 1000s of agents over the past month. Available now in 3 tiers: Small (1 vCPU/4GB), Medium (2 vCPU/8GB), Large (8 vCPU/32GB). Windows & macOS coming soon.
GitHub: https://github.com/trycua/cua (we are open source!)
Cloud Platform: https://www.trycua.com/blog/introducing-cua-cloud-containers