🔧 HuggingFace Spaces Configuration Guide
Essential configuration options for your AI Dataset Studio Space
Required README.md Header
Every HuggingFace Space must have this YAML frontmatter at the very beginning of README.md:
Basic Configuration (Recommended)
---
title: AI Dataset Studio
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
---
Alternative Configurations
Professional/Business Version
---
title: Enterprise Dataset Studio
emoji: 🏢
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: true
license: mit
tags:
- machine-learning
- datasets
- nlp
- data-science
- perplexity-ai
---
Research/Academic Version
---
title: Research Dataset Creator
emoji: 📚
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
license: apache-2.0
tags:
- research
- academic
- datasets
- nlp
- ai
---
Creative/Colorful Version
---
title: AI Dataset Magic ✨
emoji: 🎨
colorFrom: pink
colorTo: purple
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
tags:
- datasets
- creative
- ai-tools
- machine-learning
---
🎨 Configuration Options Explained
Required Fields
Field | Description | Example Values |
---|---|---|
`title` | Space name displayed in UI | `AI Dataset Studio` |
`emoji` | Icon shown next to title | 🚀, 🤖, 📊, 🎯 |
`colorFrom` | Gradient start color | `blue`, `red`, `green`, `purple` |
`colorTo` | Gradient end color | `purple`, `pink`, `yellow`, `blue` |
`sdk` | Framework used | `gradio` (for our app) |
`sdk_version` | SDK version | `"4.44.0"` |
`app_file` | Main application file | `app.py` |
Optional Fields
Field | Description | Example Values |
---|---|---|
`pinned` | Pin to your profile | `true`, `false` |
`license` | Software license | `mit`, `apache-2.0`, `gpl-3.0` |
`tags` | Searchable keywords | `machine-learning`, `nlp`, `datasets` |
`models` | Referenced models | `facebook/bart-large-cnn` |
`datasets` | Referenced datasets | `imdb`, `sentiment140` |
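The required fields listed above can be checked mechanically before a commit. The following is a minimal sketch, assuming the frontmatter has already been loaded into a plain dictionary; `REQUIRED_FIELDS` and `missing_required_fields` are hypothetical names introduced for illustration and are not part of any Hugging Face API.

```python
# Required keys, copied from the "Required Fields" table in this guide.
REQUIRED_FIELDS = (
    "title", "emoji", "colorFrom", "colorTo", "sdk", "sdk_version", "app_file",
)


def missing_required_fields(config: dict) -> list:
    """Return the required keys that are absent or blank in `config`."""
    missing = []
    for key in REQUIRED_FIELDS:
        value = config.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


# Example: a frontmatter dictionary that forgot the visual fields.
example = {
    "title": "AI Dataset Studio",
    "sdk": "gradio",
    "sdk_version": "4.44.0",
    "app_file": "app.py",
}
print(missing_required_fields(example))  # ['emoji', 'colorFrom', 'colorTo']
```

An empty list means every field from the table is present; it says nothing about whether the values themselves are valid.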
🎯 Popular Color Combinations
Hugging Face accepts only red, yellow, green, blue, indigo, purple, pink, and gray for colorFrom and colorTo, so every theme below stays within that palette.
Professional Themes
# Corporate Blue
colorFrom: blue
colorTo: indigo
# Business Gray
colorFrom: gray
colorTo: blue
# Tech Green
colorFrom: green
colorTo: blue
Creative Themes
# Sunset
colorFrom: yellow
colorTo: red
# Ocean
colorFrom: indigo
colorTo: blue
# Forest
colorFrom: green
colorTo: yellow
# Galaxy
colorFrom: purple
colorTo: pink
AI/Tech Themes
# Matrix
colorFrom: green
colorTo: gray
# Cyberpunk
colorFrom: purple
colorTo: blue
# Neural
colorFrom: blue
colorTo: purple
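A color pair can be sanity-checked before it is committed. This sketch assumes the palette that the Hugging Face Spaces configuration reference lists for the thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray); `ALLOWED_COLORS` and `invalid_colors` are hypothetical helper names, so confirm the list against the current reference before relying on it.

```python
# Assumption: the gradient colors documented by Hugging Face for Spaces.
ALLOWED_COLORS = frozenset(
    {"red", "yellow", "green", "blue", "indigo", "purple", "pink", "gray"}
)


def invalid_colors(color_from: str, color_to: str) -> list:
    """Return whichever of the two gradient colors is not in the palette."""
    return [c for c in (color_from, color_to) if c not in ALLOWED_COLORS]


print(invalid_colors("blue", "purple"))  # []
print(invalid_colors("green", "black"))  # ['black']
```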
🏷️ Recommended Tags
For AI Dataset Studio
tags:
- machine-learning
- datasets
- nlp
- data-science
- perplexity-ai
- web-scraping
- sentiment-analysis
- text-classification
- ai-tools
- data-collection
By Use Case
Business/Enterprise
tags:
- business-intelligence
- enterprise
- data-analytics
- market-research
- customer-insights
Research/Academic
tags:
- research
- academic
- scientific
- literature-review
- research-tools
Developer Tools
tags:
- developer-tools
- api
- automation
- productivity
- data-engineering
Hardware Configuration
Hardware is part of a Space's configuration too, but it is chosen in the Space settings rather than in the README.md frontmatter:
Hardware Options
# In Space settings (not README.md):
# - CPU Basic (free)
# - CPU Upgrade ($0.03/hour)
# - T4 Small ($0.60/hour) ← Recommended
# - T4 Medium ($1.20/hour)
# - A10G Small ($1.05/hour)
# - A10G Large ($3.15/hour)
Memory Requirements
# Our application needs:
# - Base app: ~200MB
# - AI models: ~2-4GB
# - Processing: ~1-2GB
# Total: ~4-6GB recommended (T4 Small = 16GB)
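The memory budget above is simple addition, which the sketch below reproduces. It assumes the per-component figures from the list, treats 1 GB as 1000 MB, and adds a 25% safety margin; the margin and the names `MEMORY_ESTIMATES_MB`, `memory_range_mb`, and `fits` are illustrative assumptions, not measured values.

```python
# Rough per-component estimates from the list above, in MB: (low, high).
MEMORY_ESTIMATES_MB = {
    "base app": (200, 200),
    "AI models": (2000, 4000),
    "processing": (1000, 2000),
}


def memory_range_mb(estimates=MEMORY_ESTIMATES_MB):
    """Sum the low and high estimates across all components."""
    low = sum(lo for lo, _ in estimates.values())
    high = sum(hi for _, hi in estimates.values())
    return low, high


def fits(available_gb: float, headroom: float = 0.25) -> bool:
    """True if the high estimate plus a safety margin fits in `available_gb`."""
    _, high = memory_range_mb()
    return high * (1 + headroom) <= available_gb * 1000


print(memory_range_mb())  # (3200, 6200)
print(fits(16))           # True
```

The summed range of roughly 3.2 to 6.2 GB is in line with the 4-6GB recommendation and leaves ample room on a 16GB tier.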
Environment Variables
Set these in Space Settings → Repository secrets:
Required
PERPLEXITY_API_KEY = "your_perplexity_api_key_here"
Optional
# HuggingFace integration
HF_TOKEN = "your_huggingface_token"
# Performance tuning
MAX_SOURCES_PER_SEARCH = "50"
REQUEST_TIMEOUT = "30"
LOG_LEVEL = "INFO"
# Feature flags
ENABLE_DEBUG_MODE = "false"
ENABLE_CACHING = "true"
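Secrets configured for a Space are exposed to the running app as environment variables, so `app.py` can read them with `os.environ`. The sketch below is one possible way to load the variables listed above; `load_settings` and `_as_bool` are hypothetical helpers, and using the example values as defaults is an assumption rather than documented behavior of the app.

```python
import os


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings such as 'true', '1', 'yes', 'on'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> dict:
    """Read the variables from this guide, failing fast if the API key is missing."""
    env = os.environ if env is None else env

    api_key = env.get("PERPLEXITY_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("PERPLEXITY_API_KEY is not set; add it to the Space secrets.")

    return {
        "perplexity_api_key": api_key,
        "hf_token": env.get("HF_TOKEN") or None,
        # Defaults below mirror the example values shown above (an assumption).
        "max_sources_per_search": int(env.get("MAX_SOURCES_PER_SEARCH", "50")),
        "request_timeout": int(env.get("REQUEST_TIMEOUT", "30")),
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
        "enable_debug_mode": _as_bool(env.get("ENABLE_DEBUG_MODE", "false")),
        "enable_caching": _as_bool(env.get("ENABLE_CACHING", "true")),
    }
```

Calling `load_settings()` once at startup surfaces a missing key in the build logs instead of at the first API request.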
✅ Validation Checklist
Before deploying, ensure:
- ✅ YAML frontmatter is at the very beginning of README.md
- ✅ No spaces before the opening `---`
- ✅ Proper YAML syntax (quotes around version numbers)
- ✅ `app_file: app.py` matches your main file name
- ✅ SDK version matches your requirements.txt
- ✅ Title and emoji are appropriate for your audience
- ✅ Tags are relevant and searchable
- ✅ PERPLEXITY_API_KEY is set in Space secrets
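Several of the structural checks above can be automated against a local clone of the Space. The sketch below uses only the standard library and a deliberately small parser that understands just the flat `key: value` and `- item` shapes used in this guide; it is not a general YAML parser, and `parse_front_matter` and `check_space` are hypothetical names.

```python
from pathlib import Path


def parse_front_matter(readme_text: str) -> dict:
    """Parse the flat `key: value` / `- item` frontmatter shown in this guide."""
    lines = readme_text.splitlines()
    # The opening delimiter must be the very first line, with nothing before it.
    if not lines or lines[0].rstrip() != "---":
        raise ValueError("README.md must start with '---' on its very first line")
    end = next((i for i in range(1, len(lines)) if lines[i].rstrip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter has no closing '---'")

    config: dict = {}
    current_list = None  # list currently being filled by '- item' lines
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith("- ") and current_list is not None:
            current_list.append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"cannot parse line: {line!r}")
        value = value.strip().strip("\"'")
        if value:
            config[key.strip()] = value
            current_list = None
        else:
            # A key with no inline value starts a list (e.g. `tags:`).
            current_list = config.setdefault(key.strip(), [])
    return config


def check_space(repo_dir: str) -> list:
    """Return a list of problems found in a local copy of the Space repo."""
    repo = Path(repo_dir)
    readme = repo / "README.md"
    if not readme.is_file():
        return ["README.md not found"]
    try:
        config = parse_front_matter(readme.read_text(encoding="utf-8"))
    except ValueError as exc:
        return [str(exc)]

    problems = []
    app_file = config.get("app_file")
    if not app_file:
        problems.append("app_file is not set")
    elif not (repo / app_file).is_file():
        problems.append(f"app_file points to a missing file: {app_file}")
    return problems
```

Running `check_space(".")` from the repository root returns an empty list when the frontmatter is well placed and `app_file` exists. The remaining checklist items, such as the secret being set, still need a manual look.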
🚨 Common Configuration Errors
❌ Missing Frontmatter
# 🚀 AI Dataset Studio ← ERROR: No YAML header
✅ Correct Format
---
title: AI Dataset Studio
emoji: 🚀
sdk: gradio
---
# 🚀 AI Dataset Studio ← Correct: Content after YAML
❌ Wrong SDK Version Format
sdk_version: 4.44.0 ← RISKY: Unquoted; a two-part version such as 4.44 would be read as a number
✅ Correct Format
sdk_version: "4.44.0" ← Correct: Quoted string
❌ Invalid App File
app_file: main.py ← ERROR: File doesn't exist
✅ Correct Format
app_file: app.py ← Correct: Matches actual filename
Updating Configuration
To change your Space configuration:
1. Edit README.md
   - Update the YAML frontmatter
   - Commit changes to git
2. Space will automatically rebuild
   - Changes take effect once the rebuild finishes
   - Monitor build logs for errors
3. Hardware changes
   - Go to Space Settings
   - Change hardware tier
   - Restart Space
Example Complete README.md Start
Here's how your README.md should begin:
---
title: AI Dataset Studio
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
license: mit
tags:
- machine-learning
- datasets
- nlp
- perplexity-ai
- data-science
---
# 🚀 AI Dataset Studio
**Create high-quality training datasets with AI-powered source discovery**
A comprehensive platform for building ML datasets that combines web scraping, AI processing, and smart source discovery using Perplexity AI...
💡 Pro Tips
- Choose memorable titles - They appear in search results
- Use relevant emojis - They make your Space stand out
- Pick good color combinations - They create visual appeal
- Add comprehensive tags - They improve discoverability
- Pin important Spaces - They appear prominently on your profile
- Use appropriate licenses - MIT or Apache-2.0 for open source
Your Space configuration is now properly set up for deployment! 🚀